Freedreno status report: FOSS graphics on Adreno/Snapdragon
Rob Clark
24 March 2015
First, an overview of Adreno
- Adreno 2xx
- OpenGLES 2.0
- Unified Shader ISA
- VLIW vec4 + scalar co-dispatch
- Adreno 3xx
- OpenGLES 3.0
- OpenCL 1.1 (embedded profile)
- Unified Shader ISA
- explicitly pipelined scalar
- Adreno 4xx
- OpenGLES 3.1 + AEP
- OpenCL 1.2 (full profile)
First, an overview of Adreno (continued)
- Tile Based Renderer
- GMEM: large “macro-tile”, 256KiB to 1.5MiB
- Driver explicitly manages tile buffer
- driver partitions between all MRTs and depth/stencil
- driver handles restore/resolve
- driver handles partitioning render target into tiles
- GMEM bypass (at least a3xx and later)
- in certain scenarios driver can decide to bypass tile buffer and do immediate rendering
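As a rough illustration of the "partitioning render target into tiles" step, the sketch below picks a tile grid so that one tile's color and depth/stencil data fit in GMEM. The function name, the bytes-per-pixel parameters, and the "split the larger dimension" strategy are assumptions made for illustration. They are not the actual freedreno logic, which also has to respect hardware alignment rules.

```python
import math


def pick_tile_grid(width, height, gmem_bytes, color_cpp=4, depth_cpp=4, num_mrts=1):
    """Return (tiles_x, tiles_y, tile_w, tile_h) such that one tile fits in GMEM.

    Hypothetical helper: color_cpp/depth_cpp are bytes per pixel for each
    color render target and for the depth/stencil buffer.
    """
    bytes_per_pixel = color_cpp * num_mrts + depth_cpp
    if gmem_bytes < bytes_per_pixel:
        raise ValueError("GMEM cannot hold even a single pixel")

    tiles_x = tiles_y = 1
    while True:
        tile_w = math.ceil(width / tiles_x)
        tile_h = math.ceil(height / tiles_y)
        if tile_w * tile_h * bytes_per_pixel <= gmem_bytes:
            return tiles_x, tiles_y, tile_w, tile_h
        # Tile still too big: split whichever tile dimension is larger.
        if tile_w >= tile_h:
            tiles_x += 1
        else:
            tiles_y += 1


# Example: a 1080p target with one RGBA8 color buffer plus 32-bit
# depth/stencil, on a part with 512 KiB of GMEM (an assumed size within
# the 256KiB to 1.5MiB range quoted above).
grid = pick_tile_grid(1920, 1080, 512 * 1024)
```

Adding MRTs raises the per-pixel cost, so the same GMEM yields smaller tiles and more passes over the draw commands.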
Motivation: Lack of FOSS gfx on ARM
- Open Source is about freedom
- If you have the src and the will, you have a way
- New widget, new feature, new distro, etc.
- For modern UI the GPU is becoming more important
- If you don’t have the src, you are limited by the blob
- Oftentimes, only an Android blob is available
- If you don’t have the src, you can’t recompile it!
- clever hacks: libhybris
- Fortunately now you don’t need to put up with that!
Some History… the early days
- So in mid-2012 I decided to do something about it!
- Found some hardware with a220 and started r/e (~May 2012)
- (before this, spent some time on the 2d engine)
- Started Gallium driver (Nov 2012)
- Things basically working (Mar 2013)
- minimal feature set, GL 1.4 and GLES 2.0
- worked well enough for xonotic, gnome-shell, etc
- Most of this time, it was evening/weekend mode
Some History… Adreno 3xx
- Joined RH graphics team (Feb 2013)
- Ordered a nexus4, started r/e (Mar 2013)
- omg, everything changed (shader ISA, registers..)!
- Mostly left a2xx behind at this point
- it worked well enough, and hw was not capable of much fancy stuff
- plus little desire to deal with even more ancient kernels
- Things basically working (Summer 2013)
- still just GL 1.4 / GLES 2.0
Some History… Adreno 3xx (continued)
- Upstream drm/msm driver (Aug 2013, 3.12)
- HW binning support (Jan 2014)
- Start on new-compiler (Feb 2014)
- proper instruction scheduling fixed many things
- not to mention, big perf boost
- but fall back to the old compiler for complex stuff (array RA!)
- OpenGL 2.0/2.1 (May 2014)
- First drm/msm patches from QCOM (Jun 2014)
Some History… Adreno 3xx/4xx
- Received ifc6540, my first a4xx device (Jul 2014)
- nearly all registers changed again :-/
- same basic shader ISA (with some tweaks).. phew!
- >90% piglit pass on a3xx (Oct 2014)
- drm/msm patches to enable a4xx from QCOM! (Nov 2014, 3.19)
- Initial a4xx gallium support (Nov 2014)
- still slightly behind a3xx but by Feb 2015 almost all games, etc, are working
- New compiler finally supports register assignment for arrays! (Mar 2015)
- removed the old compiler, and enabled glsl130 and integer support by default!
Freedreno and the Linux Graphics Stack (Xorg)
Freedreno and the Linux Graphics Stack (Wayland)
Freedreno and the Android Graphics Stack
Coming Soon, hopefully?
Freedreno Gallium Internals
Freedreno Tiling
- Clear/draw cmds built up normally
- in parallel, binning pass cmds built up in separate cmdstream buffer
- On flush / render target change / etc:
- mark start point of tiling commands
- emit IB to binning pass (optional)
- binning pass runs simplified vtx shader to determine which primitives are visible in which tiles
- for each tile, emit tile setup and IB to clear/draw cmds
see https://github.com/freedreno/freedreno/wiki/Adreno-tiling
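The flush sequence above can be sketched symbolically: the clear/draw commands are recorded once, and the tiling commands reference them through an IB (indirect buffer) once per tile. Everything here is a toy model. The tuple-based "commands" and the function name are hypothetical and do not correspond to real Adreno packets or freedreno functions.

```python
def build_gmem_cmdstream(draw_cmds, binning_cmds, tiles, use_hw_binning=True):
    """Assemble a symbolic command list for one flush (toy model)."""
    stream = []
    if use_hw_binning:
        # Optional binning pass: works out which primitives touch which tiles.
        stream.append(("IB", "binning", tuple(binning_cmds)))
    for tile in tiles:
        # Per-tile setup, then replay the same recorded clear/draw commands.
        stream.append(("TILE_SETUP", tile))
        stream.append(("IB", "draw", tuple(draw_cmds)))
    return stream


stream = build_gmem_cmdstream(
    draw_cmds=["clear", "draw0", "draw1"],
    binning_cmds=["bin_draw0", "bin_draw1"],
    tiles=[(0, 0), (1, 0)],
)
```

The point of the structure is that the draw commands are never copied per tile; only the small setup block and the IB reference are repeated.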
Freedreno Queries (Basic Concept)
- At start/stop point, driver emits commands to snapshot relevant counter(s)
- When the query result is requested, the driver sums the differences between each pair of start and stop points
Q1 := (S5 - S4) + (S3 - S1);
Q2 := (S6 - S4) + (S3 - S2);
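A minimal sketch of that accumulation, using the same sample names as the formulas above. The counter values are made up for illustration, and `query_result` is a hypothetical helper, not a freedreno function.

```python
def query_result(samples, periods):
    """Sum (stop - start) over every period during which the query was active."""
    return sum(samples[stop] - samples[start] for start, stop in periods)


# Made-up counter snapshots taken at sample points S1..S6.
samples = {"S1": 100, "S2": 130, "S3": 180, "S4": 400, "S5": 450, "S6": 470}

# Q1 was active from S1 to S3, then again from S4 to S5.
q1 = query_result(samples, [("S1", "S3"), ("S4", "S5")])
# Q2 was active from S2 to S3, then again from S4 to S6.
q2 = query_result(samples, [("S2", "S3"), ("S4", "S6")])
```

Counter activity between S3 and S4 falls outside both queries, so it contributes to neither result.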
Debugging… $FD_MESA_DEBUG
[robclark@reptile:~]$ FD_MESA_DEBUG=help glxinfo | head
debug_get_flags_option: help for FD_MESA_DEBUG:
| msgs [0x00000001] Print debug messages
| disasm [0x00000002] Dump TGSI and adreno shader disassembly
| dclear [0x00000004] Mark all state dirty after clear
| flush [0x00000008] Force flush after every draw
| noscis [0x00000010] Disable scissor optimization
| direct [0x00000020] Force inline (SS_DIRECT) state loads
| nobypass [0x00000040] Disable GMEM bypass
| fraghalf [0x00000080] Use half-precision in fragment shader
| nobin [0x00000100] Disable hw binning
| noopt [0x00000200] Disable optimization passes in compiler
| optmsgs [0x00000400] Enable optimizer debug messages
| optdump [0x00000800] Dump shader DAG to .dot files
| glsl120 [0x00001000] Temporary flag to force GLSL 120 (rather than 130) on a3xx+
| nocp [0x00002000] Disable copy-propagation
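The hex values in the help output are bit flags, so several options combine into one mask. The sketch below reproduces that mapping from the listing above. It assumes FD_MESA_DEBUG takes a comma-separated list of names (e.g. FD_MESA_DEBUG=disasm,nobin); `debug_mask` is a hypothetical helper for illustration, not Mesa's parser.

```python
# Flag names and bit values copied from the FD_MESA_DEBUG=help listing.
FD_DEBUG_FLAGS = {
    "msgs": 0x0001, "disasm": 0x0002, "dclear": 0x0004, "flush": 0x0008,
    "noscis": 0x0010, "direct": 0x0020, "nobypass": 0x0040, "fraghalf": 0x0080,
    "nobin": 0x0100, "noopt": 0x0200, "optmsgs": 0x0400, "optdump": 0x0800,
    "glsl120": 0x1000, "nocp": 0x2000,
}


def debug_mask(value):
    """OR together the bits for a comma-separated list of flag names."""
    mask = 0
    for name in value.split(","):
        name = name.strip()
        if name:
            mask |= FD_DEBUG_FLAGS[name]  # KeyError on an unknown name
    return mask


# Shader disassembly plus hw binning disabled.
mask = debug_mask("disasm,nobin")
```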
Debugging… additional tips
- apitrace!
- It’s a great way to send me something so I can reproduce
- apitrace trace [--api=egl] my-app
- Getting command-stream traces:
- drm/msm:
- cat /sys/kernel/debug/dri/0/rd > mytrace.rd
- recommended: FD_MESA_DEBUG=direct
- kgsl:
- LD_PRELOAD=/path/to/libwrap.so
On the road to GL 3.0 / GLES 3.0
- Much work going on behind the scenes to enable gl3 features
- much thanks to Ilia Mirkin who has been doing most of this work!
- Compiler/glsl130 support:
- many new texture sample instructions, sample from integer textures, etc
- integer support
- Many new texture/vbo/etc formats
- Many new extensions
- GL_ARB_framebuffer_sRGB, GL_ARB_texture_rg, GL_EXT_packed_float, GL_EXT_texture_shared_exponent, GL_EXT_texture_snorm, GL_ARB_draw_instanced, GL_ARB_instanced_arrays
GL 3.0 / GLES 3.0 TODO List
- GLES 3.0
- Transform Feedback and UBOs
- we know how these work.. just need to write code!
- MRT
- partially working, on a branch
- advanced flow control (i.e. loops that cannot be unrolled, etc)
- if you avoid shadertoy.com it isn’t too common
- mostly needs work on the ir3 compiler backend
- GL 3.0
- RGTC (internally convert to uncompressed?)
- MSAA
- NV_conditional_render
- nvidia’s “we hate tilers” extension
- lying may be the best perf option for a tiler
see https://github.com/freedreno/freedreno/wiki/TODO and https://trello.com/b/VC0IXzrq/freedreno
Compiler - ir3
- The bigger part of the a3xx/a4xx driver
- but fortunately shared between the two generations
- another big chunk of work to support flow control
- Started documenting
- Very preliminary work on NIR backend
How to get it
- Simple, the userspace parts are all upstream!
- Mesa
- recommended: >=10.4.x for a3xx and >=10.5.1 for a4xx
- add freedreno to --with-gallium-drivers=
- Xorg
- xf86-video-freedreno
- or xf86-video-modesetting with glamor
- libdrm
- enabled by default after 2.4.59
- --enable-freedreno-experimental-api for earlier
- Linaro also provides builds for some boards
Help wanted
- Use it:
- Make sure it is enabled in your favorite “distro”
- Report bugs
- Make something cool!
- Driver work:
- Everything from kernel to compilers
- Or, if you know GL, we need more tests for GLES3.1+AEP features to trace the blob
Devices
- Inforce 6540 SBC (ifc6540): Snapdragon 805, APQ8084, a420
- Inforce 6410 SBC (ifc6410): Snapdragon 600, APQ8064, a320
- DragonBoard 410c: Snapdragon 410, APQ8016, a306
- Compulab utilite2: Snapdragon 600, APQ8064, a320