10:00 AM "Xgl" - David Reveman, Novell

Server work since last year: RANDR, GLX, Xv, EGL
- GLX compositing manager: server reqs, wm integration, modularity, todo
- Was using pbuffers for offscreen memory management; moving to FBOs
- Leaving as much as possible to the GL implementation
RANDR: can do resize and refresh rate now, but not rotate
- no plans for rotation; should probably be done in the compmgr
Xv scaling with bilinear interpolation
- Xv colorspace conversion with GL_ARB_fragment_program, YV12 only
- Adds YUV pictures to Render extension
- Slightly slow on first-gen ARB_fp cards; PBOs would help performance
Need software GL implementation for indirect GLX, else no Composite integration
- Clipping using glScissor, have to render multiple times
- Have to wrap GL objects to get context sharing and texture object hiding
Damage integration: SwapBuffers when double buffered, bounding box when single
Shared indirect context means state tracking issues when switching between Begin/End
Want all the useful "* buffer object" extensions for better performance
Texture from pixmap extension: lets you bind pixmaps to textures
- http://people.freedesktop.org/~davidr/GLX_EXT_texture_from_pixmap.txt
- Means GL compositing managers can Go Really Fast
Synchronized screen updates: GLX_MESA_copy_sub_buffer?
- Sync to vertical retrace: SGI_video_sync or OML_sync_control?
- Problem is syncing swap across all clients, without blocking server
- One thread per GLX client? One buffer swap thread?
- Using a swap thread currently; can still block sometimes though
EGL: needs a few extensions yet, screen_surface and opengl_api
GLX compmgr: needs accelerated indirect, EXT_tfp, and pref. MESA_copy_sub_buffer
- current design draws to root window
- has clipping issues, correct way to do it is still unknown
Why integrate wm with compmgr?
- Effects are limited without the integration
- Don't need to reparent: less memory usage, nicer window decorations
- synchronized resize
- flips instead of swaps for GLX windows
- can move multihead infrastructure to the client side
compmgr architecture: plain C, plugin architecture for effects
- decorations drawn in separate process and then composited
todo:
- multiple screen support
- better window management; new wm with lots of copied code, so kinda rough
- window decorator program: theme system, other toolkits
needs from open source drivers:
- either fast glCopyPixels or GLX_MESA_copy_sub_buffer
- GL_EXT_frame_buffer_object eventually

11:00 AM "Avalon" - Matthias Hopf

Users will have raised expectations
If we need high-level changes, start talking and doing now
Don't want to clone everything necessarily
How does backwards compatibility affect these decisions?
Quartz Extreme
- first widely accepted compositing window system
- many output drivers - pdf, gl, etc.
- application space libs for vectors, image manip, video, color management
Avalon
- will be bigger impact just because it's MS
- resolution and device independent graphics
- deeper color channels, better color spaces, retained mode and scene graph
- better separation of code and UI - XAML
- similar library layer to OSX
(X server assumes monitor is sRGB and does linear color space transforms)
(but doesn't export color profiles)
Composite: pretty equivalent for features and memory usage
Render: has potential for 2D widgets
EXA: midterm solution for XAA
Xgl: moves the acceleration architecture to match competition
Xgl mostly up to par or even superior, architecturally; has a few issues
- experimental
- drivers need enhancements
- binary drivers vs. gpl is icky
- vista vs. opengl
So what do we need above that?
Start with retained mode
- Retained composition and refresh; needs scene graph
- On server side? Complex, difficult; don't want server side fonts
- On client side? Similar to Avalon; not X's problem then, but isn't written
Accessibility
- can render upscaled window contents; want support for vector content?
- could do scene graph on server (again)
- Or advanced expose event; ask app to redraw at upscaled resolution - better!
Antialiasing:
- Prerendering: done for render; wrong when transformed, but does it matter?
- Minification during composite: only way to do it correctly
- Goal is to make minified windows look blurry, not sparkly
3D widgets:
- Easily usable substitute for GL context
- Similar to Avalon viewport. Do it in retained mode? Do it client or server?
- Create new extension for integration?
- Just use GL in client library? needs FBOs to go fast
Color calibration and correction
- Correction embedded in Quartz and Avalon, as app-space library
- Have to export color profile, needs to be end-to-end
Geometry correction
- view plane can be distorted (display walls, etc)
- job of compositing manager. is this doable in current plugin system?
Dynamic reconfiguration
- needs to happen, somehow. no real ideas yet.

1:00 PM "Status Update on Project Looking Glass" - Deron Johnson, Sun

New mode: LG in a box (runs in a subwindow, Xnest-y)
New apps: background manager, 3d file manager, Knowledge Web calendar/scheduler with solar-system-ish visualization
Community continues to grow (Google SoC helped)
Simplified installation - includes all the Java bits now; working on bundling into Ubuntu Multiverse
LG LiveCD - based on SLAX, Knoppix version in progress
- http://lg3d-livecd.dev.java.net/
Useful 3D methods:
- Bookshelving
- Slanted parked windows for monitoring
- Radial presentation - exploits spatial memory
- Natural metaphors like the scheduler
- Window flipping - preserves context
- Brand enhancement and brand look
Near term: Panorama release, transition from research to product
- feature completion, stability, distro bundling. Planned for end 2006.
Xorg integration
- Input redirection, Hide/ShowCursor, Composite overlay window
- Composite on by default
Composite overlay window
- give the compmgr a surface to draw on without interference
- always above normal windows, below screen saver window
- created lazily, automatically mapped, override-redirect, borderwidth = 0
- immune from Composite redirection, invisible to the wm and XQueryTree
OpenGL integration: need composite redirection and the usual extensions
Future: Architecture improvements, a11y, i18n, 3d widget library, XINPUT
Gnome / general toolkit integration:
- shared config files / utilities, mixed gtk/lg apps

Keith Packard: Coordinate Redirection

Have lots of coordinate spaces
- parent, child, widget...
- difference currently just translations. imagine arbitrary mappings.
Requirements
- Arbitrary parent/child mappings, under app control
- have to transform both pointer coords and window coords
Possible architectures
- Define it in server; needs full programming language, performance issues
- RPC architecture; server blocks waiting for replies
- Regular X extension; X keeps running, must respect atomicity requirements
Pointer event processing: Coord transforms, hit testing, grabs, freezes.
The rules: requests must be atomic, including event and error generation
First attempt: capture mouse events, send to redirect client, for one window
- miscomputes grabbed pointer position
- ignores freezes
- doesn't redirect non-pointer related coordinates
Second try: transform for all client windows, let the server pick
Frozen events? Scan queued events, transform all of them; gives atomic thaw
Atomic requests
- Evaluate every required transform
- Do not execute the request, suspend that client
- Ask redirection client for transforms
- Save redirect client responses
- Restart requesting client and resume the request
For which requests?
- {Grab,Ungrab}{Pointer,Keyboard}
- ChangeActivePointerGrab, AllowEvents
- QueryPointer, GetMotionEvents
- TranslateCoordinates
- WarpPointer
First eight need parent->child transform; last two need the other direction
(lots of protocol details)

2:00 PM "NVIDIA Driver Internals" - Andy Ritger, Nvidia

Overview:
- Unified Driver Architecture
- Driver Components
- Features
- Direct Rendering Client Interaction with X
- Rendering and Scanout Interaction
- Video Memory
- ABI and API Compatibility
- Direct Rendered OpenGL + Damage/Composite
Unified Driver Architecture
- Majority of the code is shared among OSes; and one code base for all GPUs
Driver Components
- kernel module, 2D X driver, client side library, GLX extension, OpenGL core
- nvidia.ko, nvidia_drv.so, libGL.so, libglx.so, libGLcore.so
(Diagram that doesn't translate to ascii)
Additional utilities: installer, settings, config file generator
Feature set:
- TwinView, multiscreen, OpenGL+Xinerama, Configurability, Quad-buffered stereo
- RGB/CI workstation color overlays, Framelock, SDI (Serial Digital Interface)
- SLI (alternate frames, split frames, or antialiasing)
Direct rendering:
- avoid IPC and copy overhead, noticeable performance win
- difference between hardware acceleration and direct rendering
- Server side must coordinate with OpenGL to sync and propagate data
- Data to propagate: geometry, clip list, swap interval, antialiasing...
Control flow of direct rendering
- X driver pushes current drawable state into shared memory segment
- OpenGL runs async with the server
- when it must sync, checks for current drawable data, updates if stale
- sync is to ensure integrity of data and guarantee ordering by each driver
- traditional GPUs have one command buffer that's shared
- nvidia GPUs have one command buffer per client, have to manage sequencing
- Why does this matter? imagine moving a window while it's clipped; client's rendering must match server's clip list
- ordering is a problem when you have to use output as input
Rendering/Scanout interactions:
- pageflipping versus blitting
- pageflipping with windowed GL means syncing X rendering between buffers
- with QBS, you also flip between left and right eye on vblank
- GL needs to control when to flip, vblank sync interaction
- memory allocation can depend on scanout properties
- AA and SLI have effects on how you swap
- video has tight constraints on frame delivery
Video memory
- Have lots, but it's not all CPU-mappable
- Some GPUs can render to system memory over PCIE; CPU mappable but slower
- Video memory layout may not be linear
- Organization of bits may be optimized for rendering and texturing
- linear CPU mapping may require sacrifices
- many attributes, placement is non-trivial, works best with advance knowledge
API/ABI compatibility
- One driver for all X servers from XFree86 4.0 to now
- ABI compatibility is painful for open drivers too
- Suggestions: Break infrequently and only when really necessary
- Add new entrypoint and deprecate old to allow transitions
- Update ABI version number appropriately, make it queryable
- minimize number of ABI versions and driver versions
- Do many at once to get them out of the way, update APIs when appropriate
GL/Damage/Composite integration
- Clients have to know their drawable is redirected
- Clients notify X when drawable is damaged
- Option for sync: wait for hardware to complete rendering before readout
- Option for sync: wait for hardware to complete rendering before damaging
- compositing overhead will be substantial, especially for high framerate app
- want the ability to get full GL performance
- some features like overlays and QBS might not work in Composite
http://developer.nvidia.com/object/xdevconf_2006_presentation.html
Phase 2 presentation: Why the Xorg DDX model is the right thing to do
Existing framework is high-level interface between DDX and driver
X on OpenGL model
- initially motivated by lack of Render acceleration
- replaces Xorg DDX with DDX that uses OpenGL for rendering
- no hardware specific X driver, layered wholly on GL
Nine goals:
1. Bring a compelling composited X system to the unix desktop
2. Power to explore new UIs
3. Maintain app backward compat
4. Improve interaction between X and kernels
5. Make relevant X rendering perform optimally
6. Continue to support existing advanced functionality
7. Give vendors the flexibility to expose new features
8. Give users the flexibility to choose their feature set
9. Bring the functionality of Damage/Composite to market in near future
1, 2, and 3 are implicitly accomplished with either Xgl or Xorg
4 is independent of driver model
Both models have same hardware capabilities, loadable X driver has more context
- Render performance improving within loadable framework
- Current driver model at least as capable as Xgl to make use of GPU
6 and 7: Existing driver model allows these features (QBS, etc.)
- much harder in X on OpenGL
- how do you handle the sync for direct rendering in Xgl?
- features like framelock, etc would need complex backdoors
8: choice contingent on having those features at all, so due to 6 and 7...
9: current framework requires incremental changes, versus large work for Xgl
Arguments and rebuttals:
- "GL will yield performance improvements" - no, it's the same hardware
- GL can't work with Composite unless the server is also using GL - just false
- Using GL for compositing for the desktop requires that the server render in GL - also just false
- 3D hardware is simply faster than 2D hardware.
  - Not always, depends on the op. Mostly a function of the driver, not h/w.
- IHVs are going to remove 2D functionality
  - Well, no, not if the customer wants it
- X on GL will be easier for IHVs, it's one driver
  - If we're just talking rendering maybe. But makes features harder.
Future directions:
- GL composite managers, need output window and EXT_tfp
- Bring Composite to the mainstream
  - rendering to redirected windows
  - EXT_tfp
  - continue to improve Render accel
  - address Xv + Composite
  - fix remaining Composite bugs
- Enable Composite by default
- Establish industry standard benchmarks
  - construct them, encourage competition, make sure IHVs take it seriously
- Establish industry standard conformance tests
  - rendercheck? xts? correctness for EXT_tfp?
From nVidia's standpoint, current model is just better

3:00 PM "Development Challenges for X and GL" - Kevin Martin, Red Hat

Building for the future
- Focusing on server configuration, Cairo, and GL-based Compositing
- Other areas: Power management, interface cleanup, GLX1.3/GL2.0, color mgmt
- Want to work with the community to bring this about
Configuration
- Been ignored for far too long
- Major support burden
- current tools help but don't completely solve the problem
- problem space is huge, touches many parts of the OS
- Problem is along two axes: devices, and automatic vs. dynamic
- Should be able to start the server with no configuration file (automatic)
- Should be able to connect a projector and have it work (dynamic)
- Fallbacks are needed because devices are produced faster than we can handle
- Want to work with IHVs to help the driver process along
Cairo
- Many toolkits and apps are switching
- Majority of work has been on definition, correctness, and output drivers
- Currently finishing printing support
- Can start looking at performance issues now
- Want to make whole cairo stack fast
- For X11 output, Render composition, trapezoids, new tessellation algorithm
- General 2D acceleration: XAA is a bad match, EXA is better, needs more work
GL-based Composited desktop
- Goals: artifact free desktop, enable developers to use full power of h/w
- Foundation: Composite, Damage, Fixes, GLX/DRI/Mesa
- What we're adding: accelerated indirect GLX, EXT_tfp, Metacity scene graph
- Still much work to do: memory management, fbos, input, redirection, plugins...
- Want to work with the community to create the short-term roadmap
Long term vision
- X built entirely on top of GL/DRI
- Fully 3D desktop experience
- Full dynamic configuration, app migration... etc
http://people.freedesktop.org/~kem/kem-xdevconf2006.pdf

4:00 PM "Future Driver Development" - Dave Airlie

X Momentum
- Good: XFree86 split, people getting hired, more CVS access, more communication
- Bad: Still too few maintainers, still very much in code-drop mode
2D DDX features that need work
- Memory management blocking a lot of work now
- Better framework for external i2c chips
- Dynamic driver configuration (again)
- Framework for driver control (SISCTRL, driconf), need to be made generic
GL features that need work
- Memory management, memory management, memory management
- enables FBOs, pbuffers, PBOs, etc
- Engage Khronos on the topic of EGL
What is EGL?
- Standalone binding to OpenGL|ES
- Needs enhancements for mode setting and cursor handling
- Useful for embedded work and as the limit case of X on GL
(picture of the long term model)
Closed Future Development
- Was "fine" when X was a big code drop model
- Allows work that can improve X to be hidden away (Xsun, Xsgi, etc.)
- Parallel duplicate development is a waste of time
- Needs better process: avoid code drops, get more feedback from maintainers
- Would benefit from seeing the complete development history
Reverse Engineering
- Huge waste of time
- Can't be funded
- Very difficult to make a stable driver
- Makes the maintainers legally liable
Suggestions
- Get into the community, get on IRC, talk on the mailing lists, get more people
- Need to improve the dialogue with IHVs to get useful open source support
Wishlist
- intel: i810 non-VBE modesetting
- radeon: xpress 200 chipset, x1xxx 2d support, XvMC
- nv: dual head, exa

5:00 PM "X on Small Devices, Nokia 770" - Tapani Palli, Nokia

Overview
- Hardware specification
- Device constraints
- Memory usage
- Rendering pipeline
- What can be done with hardware
- X related framework software
Specs:
- 800x480x16, 220 MHz ARM9, 64M DDR RAM, 128M Flash
Constraints:
- Low memory, no dedicated VRAM
- No hardware acceleration
- Full screen update is ~750KB
- Power consumption issues
Memory usage
- Xomap - customized kdrive server, ~1M
- X libraries ~1M
- ~1.4M of system memory for runtime data
- 1M of mapped framebuffer memory
Rendering Pipeline (w/ diagram)
- Has a DSP for video, LCD controller handles rotation/scaling/color conversion
- Exploits Damage extension to do updates
What can be done with the hardware?
- Hardware RANDR support
- 2x upscaling
  - for video and games
  - cuts screen update memory by 4x!
- YUV->RGB colorspace conversion
New Extensions
- Xsp, misc stuff.
  - Touchscreen calibration, video setup, scaling, pressure detection
- Xck, simple colorkeying support
770 X software
- Sapwood, shared pixmap cache and theming engine
- Misc small hacks, white background and blank cursor theme
Other Nokia X involvement
- Xephyr - Xnest++, all the new extensions
- Xoo - Xephyr skin
- xrestop - resource usage tracking tool
- xresponse - tool for emitting events, DND, etc
- tool (in development) for debugging window properties, stacking
Future of Xomap
- randr, does rotation but not resize
- Xinput, Xv, Composite
- Accessibility feature using hardware scaler?
Science fiction?
- X using OpenGLES
- transfer input/windows/apps to other devices

10:00 AM "What Accessibility Needs from X" - Peter Korn, Sun

Intro:
- Working on a11y since 1992, did a lot of stuff
- Know a lot about what users need
- Don't know a lot about X
Why do we care?
- Many users have disabilities
- These users have money
- Many laws mandate accessibility
- It's not hard to do
- It's the right thing to do
Important statistics
- Disabled people are the largest minority group in the US
- Have about $175e9 in disposable income
- 15.4 million working Americans with disabilities
- 8% of web users
- 53% of Americans over 65
- 15% to 20% of the world population
Laws that mandate A11Y
- US legislation: Section 508, ADA, IDEA, Section 255 of the Telco Act
- International:
  - Australian Disability Discrimination Act of 1992
  - Canadian Human Rights Act of 1977
  - Portugal Internet Accessibility
  - German employment laws
- http://developer.gnome.org/projects/gap/laws.html
What is computer accessibility?
- Direct access: modifications so system is usable directly, e.g. large print
- Mediated access: Use of additional code for re-presentation and interaction
  - e.g. magnifier, braille readout, alternative input
What do we need for direct access?
- 100% mouseless access; keyboard shortcuts, sticky keys, etc.
- Theming support; high contrast and large print themes, well integrated
- Audio alternatives - flashing or captioning
What do we need for mediated access?
(demo of really slick screen magnifier for win32)
- Damage/Composite for repainting - final decision for painting
- Complete ownership of the keyboard - final decision for input
- Hooks into GUI to track focus, caret, acquire text...
- Network audio
- Currently using gnopernicus and orca apps
When driven by voice:
- Speech recognition
- Hook into all apps so they can be driven remotely
- Nothing like this in unix yet
When driven physically:
- XInput, XKB, maybe libusb
- Same hooks as above
- Gnome Onscreen Keyboard and Dasher are our tools for this
For cognitive impairments:
- General simplification, possible window manager assistance
- Re-rendering content: sync text display with speech, color/font change
- Text composition support: context sensitive dictionary/thesaurus
- Pronunciation assistance/prompting
- Homophone color coding
- Nothing like this in unix yet
Complementary technologies
- Using the computer to enhance your interaction with the world
- Way cool but no real constraints on X itself
What we need from X
- Various extensions: AccessX, XKB, XInput
- Disambiguation of HID devices
- Composite, Damage, Fixes, Xcursor
- Fast access to video card - implies good drivers
- Network audio
- Need to own the keyboard - XEvIE
- What else?

Ad hoc presentation: keithp on XCB

Overview: Rework of the protocol layer for size, thread safety, etc.
Accomplishments
- Xlib ABI compatibility, can be intermixed
- Passes X Test Suite, evidence of correctness
- Usable for daily use, some performance tuning
Areas for Investigation
- Within 5% for 99% of tests, some outliers
- Should be within 2% for everything
- Code cleanup
- Plan for poly-primitive merging like Xlib
Known Issues
- Move Xlib changes to Xorg tree
- Make XCB zero-fill the pad bytes
- Very minor API tweaks
- Documentation and testing
Final Words
- XCB really is done
- Xlib/XCB is very close, needs polishing

GLX_EXT_texture_from_pixmap discussion - led by Adam Jackson

Three main threads of discussion.

The first centered on exactly what the semantics of glBindTexImageEXT would be. The spec as currently written doesn't define whether it acts like an alias to the pixmap storage (a la APPLE_client_storage) or like a copy, as glTexImage2D does. The consensus seems to be that it has to act like a copy. This doesn't preclude optimizations where the copy is deferred, copy-on-write style, or even elided if the pixmap isn't modified between Binds, but that's an implementation detail. However, to ensure that the bound texture image is stable, you still need to grab the server; doing any better will require some sort of protocol between producer and consumer. The spec will be updated to reflect this.

Tangentially, the usage model is expected to be BindTexImage, render, unbind. The implementation may optimize by doing unbinds lazily; we sort of do this anyway.

The second discussion focused on how to do subimage updates. As BindTexImage is currently specified, the entire texture image is updated. But damage events typically cover some small subregion of the texture; for an 80x24 terminal you're refreshing the whole thing to update about one two-thousandth of it (one character cell out of 1920). There was some question of how best to update the minimal set of regions. We weren't able to come up with a pleasant API or conceptual model for how that sort of operation should look. Some of this could probably be accomplished "for free" by tracking damage on the server side. We will probably defer this until we have a better idea of how much of a win it is.

The final thread considered how best to handle pixmaps that exceed the maximum texture dimensions. This got conflated with the previous thread of discussion, so we got a little bogged down there. It turns out there's a much simpler idea: when the pixmap is too big, call glXMakeContextCurrent to make it your current read buffer, create as many texture objects as you need to get below the texture image size limits, and glCopyTexImage out of your read buffer into your new texture images. This is existing functionality, doesn't require large image pushes over the wire, and works with all DDXes.

1:00 PM "Accelerated Indirect GLX" - Kristian Høgsberg, Red Hat

Current GLX stack: (diagram)
Client apps link to libGL
Direct rendering
- Using XF86DRI extension, client determines that server supports direct
- libGL loader asks the X server which DRI driver to load and some h/w details
- libGL loader loads DRI driver using dlopen, creates a DRI screen
- all rendering handled entirely in the client
- server only involved when windows move or clip list changes
- currently the only way to get hardware accelerated GLX with open drivers
Indirect rendering
- If server doesn't support DRI, or setup fails, fall back to indirect rendering
- OpenGL requests are marshalled to the server
- server maintains an OpenGL context on behalf of the client and renders
- currently done with software Mesa
Motivation
- Making X pixmaps available as OpenGL textures is central to GLX compmgrs
- When accelerated indirect GLX is available, can do other paths through it too
- Evolutionary step towards the Xegl model
How to do it
- Move the loader from libGL into the server
- Redirect DRI protocol requests to call server implementation directly
- Setup process looks exactly the same
- Have to maintain slightly more context information and switch to proxy ctxs
Visual setup process
- At visual init time, GLX module adds GL properties to X visuals
- When DRI driver is loaded, ask it what visuals it supports
- The intersection of the two sets is exported to the client
- No fbconfigs for the ARGB visual because they're added too late
The DRI lock
- Uses a global lock to serialize card access
- DRI driver takes it when it needs to touch the card
- XAA knows nothing about it
- Server DRI support always takes the lock on behalf of XAA
Deadlock
- Now that DRI driver tries to take lock when server already has it, deadlock
- So, drop the server's lock before calling into GLX dispatch
- Can reduce the locking done by server by pushing it into XAA/EXA
Texture from drawable
- Can now dump pixmap contents into the DRI driver with TexImage2D
- Currently just copies the whole thing
- Can track damage on the pixmap to only update the bits that need it
- Would like to be able to texture directly from the pixmap data
Software rendering
- Still need a software GL module for systems with no DRI driver
- Current implementation makes protocol talk directly to DRI driver API
- Idea: DRI software driver
  - decouples Mesa and server build process
  - simplifies GLX module
  - slight conflict with how Xgl does GLX
- How to support non-double-buffered rendering?
- Fake the SAREA to emulate the kernel management - DRI software driver wraps rendering commands, polls SAREA before rendering - Alternatively: new abstraction layer between GLX and renderer - could use DRI driver, current GLcore, or libGL (for Xgl) - Probably mergable to head once the software path is worked out Future work - More debugging and testing - Few extensions/changes to the DRI interface - createNewDrawable shouldn't add drawable to DRM hash table - bindContext that takes __DRIdrawable pointer instead of drawable ID Demos: glxgears, quake, metacity with composite manager 2:00 PM "A preview of the GNOME compositing manager" - Søren Sandmann Pedersen Red Hat Outline - Introduction - Goals - Scene graph - Metacity support - Demo Goal - Better looking graphics - Make desktop objects more physical - Make compositing available to the whole desktop - Buiding on existing infrastructure Better looks - Flicker-free repaint - Transition effects - More 3d environment in the long term Compositing for the whole desktop - Window management effects - Non-wm effects: magnification, native 3d apps Build on existing infrastructure - No flag days - Getting all the details right in the WM is hard - Works with existing Xorg DDX - Simple switch to turn it on Scene graph (diagram) - Tons of extensions involved - Screen sized windows - Many Damage and Composite bugs uncovered - Either need GL IncludeInferiors or don't clip by children, or magic window Metacity changes - Uses compositor library - Builds scene graph containing all toplevels - Has various effects already, more or less useful - Uses scene graph - just a library, has own Display connection Performance - Radeon 7500 - One glClear, one nautilus, 1.5 screens of apps, 1600x1200x32 @ 40fps = 330MP/s - Close to fillrate limited - Software rendering is extremely slow, XaaNoOffscreenPixmaps for workaround - NameWindowPixmap on unmapped window returns bogus stuff atm - xmatrix screensaver stresses region code, which is 
surprisingly slow Demos - All but latest stuff is in libcm and metacity head, and Xorg branch 3:00 PM - Deconstructing X Server Configuration - Adam Jackson, Red Hat. hmm, lets do some writeup Aitomatic config: input - single seat vs multiseat - evdev vs xinput - evdev hotplut working in patches - not feature complete yet: xkb, absolute coords...13:14 < zrusin> - have nothing like xkb geometry for pointers - still want device knowledge Automatic config: cards - autoconfig works sort of by luck - expose device ID typles to loader - - radeon vs fglrx - probe for dumb bus cards? - try-harder mode for smart drivers (discussion about political/marketing/technical issues of letting hardware advertise pcids and using them to automatically configure x) Automatic config: monitors - radeon does best, but _ugly_ code - DDC info cant always be trusted - - at best you can trust the monitor name - - might need override files - can do a decent heuristic - need to pick sensible defaults - - MergedFB/TwinView - - default sync ranges Bridge: core and userproofing - parser hacks - partials, includes - safe mode for last-known-good boot - ignore bogus paths, etc. - expose configuration to external apps - - protocol or window properties - - serializable config - - Xorg -devicelist, Xorg -driveropts Dynamic config: extensions - why should you restart? - suggestion: server "embryonic" mode - needs significant api rework - dependency tracking - wrapping any layer Dynamic config: input - terrible ui for configuration - no good way to disambiguate - reassignable axes - server selection for multiseat - usb makes this really hard Dynamic config: output - can be pluggable! sususb - screenplug is hard - significant memory management work needed - lots of intermediate steps with RANDR - - simple resize - - resize from 0x0 for output enable - limited cardplug through HAL - - - start new server on plug - - - gtk migration? hide behind Xdmx? Design: - totally an open question - HAL vs. 
extensions - need some input from the community 9:00 AM "Power Management for Graphics Subsystems" - Jay Cotton, Sun two types of PM for X - Display (DPMS) & Frame Buffer (FBPM) two primary modes - Suspend/Resume and Reduced Power ACPI S3 - suspend to RAM, shut whole graphics system off reduced power - adjust performance of graphics device Xorg interfaces with kernel PM subsystem [insert complicated diagram here] (explaining the diagrams showing how suspend & resume flow through the kernel into the X server and back. the basic overview is that when the kernel decides it's time to shut down power, it stops all clients to make sure they're not writing via DRI/etc. any more, writes a message to an fd the Xserver poll()'s in the main WaitFor loop (like input devices) , waits for the Xserver to save all state and report back that it's okay to suspend, or that it failed (or didn't respond) and then the kernel cancels the suspend) Reduced power consumption - Provides ability to run at lower speed or disable portions of the GPU - PCI bus spec provides the mechanism, D0 through D3 power states - Interface is via kernel PM subsystem (/dev/pm in solaris) FBPM extension - Xorg tells DDX to change power state - FBPMChangeStateNotice - DDX tells /dev/pm to change power state - ioctl - PM subsystem calls kernel graphics driver - gfx_power X Driver Writers - Make sure the driver supports suspend/resume - Make sure it supports framebuffer power management Kernel Driver Writers - Make sure you have an interface to support this Graphics Vendors - Design the hardware with this stuff in mind Contacts: djb@sun.com and jay.cotton@sun.com, http://opensolaris.org http://blogs.sun.com/roller/page/jaycotton/ http://mediacast.sun.com/share/alanc/xdevconf06-fbpm.pdf 10:00 AM Automated display configuration in X.org - Stuart Kreitman Scope of problems - Many display/card combos aren't working - Configuration is hard, the tools suck - Dependence on BIOS and vendor tools - Laptop and modern users 
want to hotplug screens Elements of the solution - Good EDID handling - Root window property for R&R - hotkeyed output port switching - Xorg -preferredres - No or minimal xorg.conf - Accumulated database of fixup info - VESA and display vendor engagement Good EDID handling - Standard, Detailed, Preferred timing specs - Coordinated Video Timing algorithm - Minimized blanking calculation - Centralized service for all DDXes - EDID-driven mode pool - Informed EDID interpretation Priority of EDID components: preferred, detailed, standard, cvt, extended Blocks: video timing, display information, digital protocol, localized string Dynamic configuration for hotplug - Laptops, datacenter/kvm - gain independence from bios, xorg.conf, ddx - ddc pin interrupt when available - probe, attach, clone to all ports by default - R&R for port rescan External interfaces - ctrl-alt-foo is a big win - root window property populated with EDID info - Better command line stuff, like -preferredres die xorg.conf die - clean up mode pool computation - fix hsync and vrefresh defaults - etc. 
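The EDID priority order noted above (preferred, detailed, standard, cvt, extended) could drive mode selection from an EDID-driven mode pool roughly as follows. This is only a sketch: the `Mode` type, the `pick_mode` helper, and the tie-breakers (larger area, then higher refresh) are hypothetical and were not part of the talk.

```python
from dataclasses import dataclass

# Priority order as given in the notes: earlier entries win.
SOURCE_PRIORITY = ["preferred", "detailed", "standard", "cvt", "extended"]


@dataclass(frozen=True)
class Mode:
    """Hypothetical mode record; 'source' names the EDID component it came from."""
    width: int
    height: int
    refresh_hz: float
    source: str  # must be one of SOURCE_PRIORITY


def pick_mode(pool):
    """Pick the mode from the highest-priority EDID source.

    Ties within a source are broken by larger pixel area, then higher
    refresh rate. The tie-breakers are an assumption, not from the talk.
    Returns None for an empty pool.
    """
    if not pool:
        return None
    return min(
        pool,
        key=lambda m: (
            SOURCE_PRIORITY.index(m.source),
            -(m.width * m.height),
            -m.refresh_hz,
        ),
    )
```

With a pool holding one "standard", one "detailed" and one "cvt" mode, `pick_mode` returns the "detailed" one regardless of resolution, because source priority is compared first.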
Corner cases - if architecture is good, reduce corner cases to bugs - EDID faults are known to exist - standalone edid reader, encourage field reports - specialized knowledge of CRT/panel behaviour beyond specs - vesa representative and vendor engagement Need to make the crash case not crash to unusable state 11:00 AM "Challenges of OLPC Graphics" - Jim Gettys, OLPC Gross breakdown of laptop costs: 25% display, 25% microsoft, 50% marketing OLPC breakdown: no sales, no marketing, minimal distribution Using linux and cheap displays First generation hardware specs Initial rollout of 5 to 10 million laptops next year Other principles: scale, floating price, openness, textbook, universal For flat panels there's really only TFT LCDs Improvements needed: cost, power consumption, sunlight readability, form factor New display type: dual mode - b/w, reflective, ~1130x830 - color, 640x480 Power assumption of conventional LCD - 6-8W for backlight - grayscale reflective requires no backlight - color backlight is LED, less than 1W - processor dominates the rest of the power budget HW specs - AMD Geode GX2/5536, 128M RAM, 512M flash - 7" 4:3 dual mode LCD - 3 or 4 USB, one powered - 802.11, diversity antennae, mesh networking - ac97 audio, line/mike in/out, stereo speakers - rubber keyboard, touchpad - nimh batteries, optional generator, banana plugs - LinuxBIOS (we hope) Software challenges - All the usual stuff, plus - can run LCD slowly, turn off flashy bits, tickless operation - Keep processor in low-power mode as much as possible - Memory consumption Graphics specs - GX2 built-in framebuffer - has alpha blending but no 3d support - cell phone gl chips can't address enough memory - probably run in 16bpp; which format? 
- mode control, yuv overlay, hardware cursor, refresh compression LCD trivia - Don't have to run at 60Hz, but can't go to DC - 10 to 20Hz is possible - Reduces power consumption of the LCD and of the memory controller Holistic view - Stop blinking cursors - wm will need to be aware of memory and power consumption - probably no Composite - memory consumption is the 2nd big concern after power 1:00 PM "New DRI memory manager and i915 driver update" - Keith Whitwell, Tungsten Graphics. Overview - Full strength memory manager wasn't necessary for traditional usage - perceived to be difficult, talked about for years, can't be put off any more - fundamental for modern desktops due to offscreen rendering Current behaviour - Clients cooperate to avoid stepping on each other's textures - if one client needs more, can eject without saving - some global info to help with decisions - Easy to implement, better than nothing Downsides - Can't trust data to remain in texture memory - Have to keep two copies of texture image data - Slow texture uploads, both copies have to be touched - can't blit for CopyTexSubImage - no EXT_fbo, pbuffers, private back/z buffers - No fast VBOs or PBOs - Nasty hackarounds - GLX_MESA_allocate_memory What to do? 
- Generalize textures to buffers - Guarantee buffer preservation - Still need to evict other clients' buffers - Mechanism to force buffers to AGP or VRAM - Mechanism for buffer offset discovery - Map/unmap into client memory Semantics: Stolen from ARB_vertex_buffer_object - Generalized interface to multiple vendors' memory management - Just implement ARB semantics and add the missing driver-facing interfaces Buffer concept - Identified by opaque integer handle - Sharing buffers is allowed and straightforward - No mechanism yet for resize or destroy notifications - key new call: ValidateBufferList - specifies acceptable memory pools per buffer - triggers the upload Fences - Encapsulates flushing and low-level IRQ mechanisms - Need some smarts about when to emit flushes, what kind, when to IRQ... - Buffer management code handles this behind the scenes - Fallbacks, image uploads, map/unmap get fenced for you Current status - Userspace prototype in i915, memcpy based - Fast TexSubImage, CopyTexSubImage, CopyPixels - Fast TexImage, optimized by not waiting for idle before replacing - EXT_fbo soon, other paths as time allows - TTM code for dynamic AGP manipulation - Clear path to VRAM implementation Current issues - ValidateBuffer offsets only valid while lock held - Problems stuffing DMA buffers - need to fire before releasing DRI lock - Solution: Fixup/relocation lists for DMA buffers - emit DMA without lock, grab lock, fixup, fire, unlock - Will prototype soon - Thomas' and Keith's code not yet integrated Next steps - DMA fixup lists, same concept for cliprects - Treat command buffers as another first-class memory object - Submit multiple DMA+patch buffers to the mm, let it schedule them What about the DDX? 
- Phase one, nothing happens - VideoRAM specifies a fixed size AGP pool, we manage the rest - This pool is an excellent place for pinned buffers - no fragmentation - Phase two, move stuff from fixed pool to managed pool - Only pinned buffers remain: scanout, cursor image, ARB_vbo map/unmap, YUV - Eventually teach the memory manager about pinned buffers too What about VRAM? - AGP dynamic mapping is cool, but does it work for VRAM? - Need two things: transfer path to/from VRAM, eviction buffer allocation - Allocation may be challenging but solvable - Also new semantics for deciding when to copy between AGP/VRAM Optimizations and more fun stuff - Various replacement algorithms - Speculative upload/download/duplication - DMA prioritization - Multiple HW command queues - Multiple client sync-to-vblank - Allocate backbuffer on the fly - triple-buffering for free! Thomas Hellstrom - Dynamic AGP mapping Overview - Background - Current situation - What we want to do - What we can do - Solution Background - AGP space limited by aperture size and by allocation at DRI init time - Need to evict data when out of memory - readback is slow, mapping uncached (diagram) What we want: AGP space management - Avoid unused allocated AGP space - Manage the size of the aperture - Fast bind/unbind/evict - Fast read from AGP Problems - All mappings need to be uncached - Want them cached for read speed - Some translation tables allow cacheable pages (PCIE, Intel GTT) - Alternatively, let the DRM manage the mapping, change policy on bind/unbind Translation Table Maps - Userspace part of the memory manager creates TTMs - Memory pages are automatically allocated when used or bound - User can request bindings of any range of pages from ttm to aperture - Manager evicts when out of space Implications and limitations - TTMs aren't resizable - One TTM per buffer? - If it can bind cached pages, TTMs aren't really needed when not sharing - Anonymous TTM regions - Shared TTM memory - access rights? 
API - goes in libdrm - can validate, unbind/evict, map/unmap, destroy, fence - bind/unbind are for card visibility, map/unmap are for cpu visibility
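The "bind to aperture, manager evicts when out of space" behaviour described for TTMs can be illustrated with a toy model. This is not the libdrm API: the `ToyAperture` class, its method names, and the least-recently-bound eviction policy are all assumptions made for illustration (the talk only mentions "various replacement algorithms"). It models only card visibility (bind/unbind), not CPU mapping.

```python
from collections import OrderedDict


class ToyAperture:
    """Toy fixed-size aperture with least-recently-bound eviction.

    Hypothetical model only; real TTM binding happens in the DRM.
    """

    def __init__(self, total_pages):
        self.total_pages = total_pages
        # buffer name -> page count, ordered oldest binding first
        self.bound = OrderedDict()

    def used_pages(self):
        return sum(self.bound.values())

    def bind(self, name, pages):
        """Make a buffer card-visible; return names evicted to make room.

        Re-binding an already bound buffer only refreshes its position
        in the eviction order.
        """
        if pages > self.total_pages:
            raise ValueError("buffer larger than aperture")
        if name in self.bound:
            self.bound.move_to_end(name)
            return []
        evicted = []
        while self.used_pages() + pages > self.total_pages:
            victim, _ = self.bound.popitem(last=False)  # oldest binding
            evicted.append(victim)
        self.bound[name] = pages
        return evicted

    def unbind(self, name):
        """Remove a buffer from the aperture; unknown names are ignored."""
        self.bound.pop(name, None)
```

For example, with an 8-page aperture holding two 4-page buffers, binding a third 4-page buffer evicts whichever of the first two was bound or touched least recently.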