The irregular Nouveau-Development companion
Issue for April 8th
1. Intro
Last weekend I didn't publish an update, as I was under the impression that each and every topic mentioned might not have been taken seriously. So I waited another week. And as March was slow going, this was fitting too.
The weekend of March 24th / 25th was the end of Google's SoC application period. We had about 4-5 applications, which are currently being evaluated. As we don't know how many slots we will get in the Xorg meta project for SoC, we currently can't say which projects will get realised. We have applications for:
- nouveaufb: framebuffer driver compatible with accelerated X driver for nvidia cards (nouveau)
- Xv support for Nouveau driver
- OpenGL texturing support for the 'Nouveau' driver
- Add Exa support to more operations
- Patent-free S3TC compression implementation for Mesa (this one is for Mesa and was due to a suggestion from marcheu)
So we have 4 applications out of a total of 12 which came in for Xorg (according to daniels).
Regarding our idea of the driver crash course: We will probably do it, however we are not sure how to do it. Either we will write a wiki page and set a certain time on IRC where people can drop in and ask questions, or we will talk about a topic on IRC, allow for questions later and write it up on the Wiki. Currently we tend towards option #1, but please watch this space.
The TiNDC is now translated at least into Spanish (see #16), perhaps into other languages too. German, Russian and Polish translations of the Wiki pages have been started. Thanks a lot for all your hard work, Translation Team!
2. Current status
Thunderbird and Skinkie did some work on reverse engineering the TV-Out registers. As neither has much time, progress is slow. Currently Thunderbird is able to do basic things like overscan and moving the image, which are pretty easy, but setting up TV-Out itself is pretty complicated and he has no idea how that fully works.
We already covered this in a previous TiNDC issue, but more info surfaced during the last weeks: A new memory manager called TTM is in the works. NVidia hardware is somewhat different from, or more advanced than, what the current design assumes, and we need to make sure that our needs are met too.
Just a very brief recap of the NVidia hardware:
- Up to 32 hardware contexts on each card (seems as if NV50 has probably 128)
- Context switching is done via interrupt handler (up to NV2x) or by the card itself (from NV3x on)
- Each context consists of a FIFO and the needed objects. The FIFO holds all commands, while the objects hold all information necessary to process the commands and hold possible results.
- Objects are created depending on the task to be processed (e.g. 2D, 3D, Video processing)
- DMA Objects describe a memory region and its properties (type, address, allowed limits, allowed access types)
- Contexts are totally separated from each other.
- FIFOs are managed from user space
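The recap above can be pictured as a small data model. What follows is only a rough sketch in Python, not driver code: all class and field names (`DmaObject`, `Context`, `Card`, `allows`) are made up for illustration, and the limit of 32 contexts is simply the number quoted in the list.

```python
from dataclasses import dataclass, field

MAX_CONTEXTS = 32  # "up to 32 hardware contexts" as quoted in the recap


@dataclass
class DmaObject:
    """A memory region and its properties (field names are illustrative)."""
    mem_type: str      # e.g. "VRAM" or "AGP"
    base: int          # start address of the region
    limit: int         # size of the region in bytes
    access: frozenset  # allowed access types, e.g. {"read", "write"}

    def allows(self, offset, length, mode):
        """True if [offset, offset + length) is inside the region and mode is permitted."""
        return mode in self.access and offset >= 0 and offset + length <= self.limit


@dataclass
class Context:
    """One hardware context: a FIFO of commands plus the objects they refer to."""
    fifo: list = field(default_factory=list)
    objects: dict = field(default_factory=dict)


class Card:
    """Hands out contexts; each one gets its own FIFO and object table."""

    def __init__(self):
        self.contexts = []

    def create_context(self):
        if len(self.contexts) >= MAX_CONTEXTS:
            raise RuntimeError("no free hardware context")
        ctx = Context()
        self.contexts.append(ctx)
        return ctx
```

Because every `Context` owns its own FIFO and object table, an object created in one context is simply not visible in another, which mirrors the "totally separated" point in the list.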
This leaves the nouveau driver with some requirements regarding the TTM:
- How are we supposed to implement fencing of buffers that can be used on multiple separate command streams?
- It's preferable not to have the kernel (DRM) messing with a FIFO.
- Fencing / pinning multiple buffers at once.
- As hardware is already doing some validation work, we only need to make sure that buffers and their addresses are valid.
- The TTM is able to move buffers around, even evict them from AGP/VRAM completely.
Well, totte (Thomas Hellström, a drm hacker) dropped by in #nouveau and discussed exactly those issues with wallbraker, marcheu and darktama. The result was that we would need a new ioctl() call in the drm which allows for validating / fencing buffers in the TTM. This ioctl() needs only to be called if a new texture is introduced by the OpenGL application, although this might not be absolutely true, as darktama is not quite sure if this would really work out.
So currently we believe that we need to use the TTM like this: (1) pin all the buffers you're reading from / writing to (called "validating" by the TTM), (2) submit the commands that are using the buffers, (3) emit fences so the TTM knows when it's safe to evict.
"Pinning" means telling the TTM that it can't move or evict the buffers because they're in use. A "fence" is basically just a counter. After the commands are submitted you tell the TTM: "After this fence reaches this value, the buffers aren't being used anymore." So when you tell the TTM to validate buffers, it marks them as being in use. Later on you associate a "fence object" with them. Once the fence object is signalled (the fence counter reaches the desired value), the TTM marks the buffer(s) as being unused.
Stillunknown was fed up that his NV43 (which is the same type as KoalaBR's) still didn't work while KoalaBR's did, so over the last 3 weeks he pestered darktama again and again to help him get his card working. Finally, Darktama gave in and requested kernel logs with DRM debug output, X logs and nouveau FIFO tracing output. Those helped Darktama to create a fitting patch: Stillunknown had hit a bug in the swtcl code, while the rest of the NV4x cards weren't hitting that code path. One more success report came in from mg, who got an NV44 running; only the context voodoo was needed.
PQ did more work on his register description database in XML. He refined the design with z3ro, so that it will be useful for Radeon and Nouveau. However, he nearly went insane when darktama casually mentioned that there are registers which are auto-incremented when used to write to other registers (like when uploading the context voodoo). We nearly lost PQ there, but he quickly recovered when genius struck him and he found an easy way to encode just that information into his XML schema. While at first only aimed at enhancing the mmio-parse output, it now seems as if nvclock (long term) and radeon may use this tool for their needs too.
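Why auto-incrementing registers are a headache for a trace decoder can be shown with a few lines of Python. This is only a guess at the general pattern (an index register plus a data register whose writes bump the index), not a description of PQ's schema or of any real register: the function name, the register offsets and the step size of 1 are all assumptions made for this sketch.

```python
def expand_indexed_writes(trace, index_reg, data_reg, step=1):
    """Turn raw (register, value) writes into (target index, value) pairs.

    Assumes that a write to `index_reg` sets the target index and that every
    write to `data_reg` stores a value there and then advances the index by `step`.
    """
    index = None
    expanded = []
    for reg, value in trace:
        if reg == index_reg:
            index = value
        elif reg == data_reg and index is not None:
            expanded.append((index, value))
            index += step  # the auto-increment the decoder has to know about
    return expanded


# Hypothetical offsets, chosen only for this example.
INDEX_REG = 0x10
DATA_REG = 0x14

trace = [
    (INDEX_REG, 0),    # start uploading at index 0
    (DATA_REG, 0xAA),  # lands at index 0
    (DATA_REG, 0xBB),  # lands at index 1
    (0x20, 1),         # unrelated register, ignored here
    (DATA_REG, 0xCC),  # lands at index 2
]
```

Without knowing about the increment, the three data writes look like repeated writes to the same register; with it, `expand_indexed_writes(trace, INDEX_REG, DATA_REG)` spreads them over three consecutive target indices.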
abotezatu (who applied for SoC) researched the still remaining NV1x issue regarding the non-working context switch.
Airlied continues his work on randr12 slowly due to other obligations. His branch is probably mostly broken for everything except his setup. But as he is concentrating on both PPC and x86, both architectures should be equally broken. It seems that it will be necessary to parse some BIOS tables to gain all the data needed to make this work. For now, parsing the table on NV4x works (earlier cards probably don't have this table), but darktama broke the drm binary compatibility, thus a new merge with the master branch is needed. Currently, however, Airlied is working to get TTM and the Intel drivers into shape.
jb17some continues his search to find out why the objects in instance RAM are sometimes not found on G70 and above. He now has a much better understanding of why the crashes occur and where the context instance memory is stored. It looks like G70 and later cards store the instance memory only in the frame buffer and not in the 16 MB PCI region 0. This theory was verified on his 7900GS (by dumping PCI region 0 and the frame buffer) and by comparing these results against some dumps from the dump repository. Further tests on a 6150 seem to support the theory that this is true for all cards >= NV4x which use shared memory.
KoalaBR published the script to automatically download, compile and run renouveau including creating a fitting archive from the output. You can find a link to it on the REnouveau Wiki page and currently here: http://www.ping.de/sites/koala/script/createdump.sh
And before we ask you again for your help, here we have a basket of Easter eggs (well, no real eggs, but we hope you like them nevertheless):
- Marcheu fixed context switching on NV04. Glxgears still doesn't work as some bugfixes for the 3d setup are still missing.
- Julienr and Marcheu fixed the crashes on Quadro FX 350M (PCIE) (see bug #10404 on Bugzilla). If you see crashes with Go7300/7400 or similar chips, you should retest and give us feedback.
- Context switching seems to work on all cards now, but not all cards can run glxgears yet. On some cards, certain 3d commands in DMA or FIFO hang. The reason for that is currently under investigation.
3. Help needed
We are interested in SLI dumps and dumps for Go7300/7400 (and please set the subject line of your email to hint at the fact that this is an SLI dump). G80 owners willing to test patches would be very welcome too.
And yes, MmioTrace dumps would be very welcome too.
Go7300/7400 owners should test whether nouveau (2d) works for them.
So, that was the Easter edition of the TiNDC. About 3 days of work, covering 3 weeks of development, read in about (a guess) 3 minutes? Never mind, I hope you enjoyed reading it.