I am currently trying to update this page, make it more readable and then move it to NouveauBits (I guess no one updates other people's personal pages).

TODO: are channels completely independent? Each channel has its own objects and so on.

Intro

This page sums up some information about the nouveau driver. Some of it is obsolete, and some of it is probably wrong. Before you read this, you should probably familiarize yourself with parts of IntroductoryCourse such as NouveauTerms or riva128.txt (outdated information).
Like most open source projects, nouveau has a certain lack of documentation. The knowledge is stored in the source and in the developers, and it is good that they code instead of writing documentation. Don't expect a fancy explanation: like everyone who seriously considers joining the project, you will have to read heaps of code and spend hours trying to make sense of MMIO and renouveau dumps. You will read dumps and hope for the best.
Brave developer, I wish you good luck.

Nvidia overview

NVidia uses a few major components for programming its hardware:
MMIO registers - used for modesetting (1024x768, text mode...), clocks, TV-out, interrupt control...
Gr objects - objects that actually do something, like drawing on screen, transferring memory or combining two textures (raster operation object, solid line object, swizzled surface object...)
DMA objects - memory areas: places for color buffers, depth buffers, notifiers...
FIFO - an area of memory where you write RINGS (method calls on objects) to modify gr objects
To show something on screen, the driver performs some basic operations:
set the graphics mode using MMIO
allocate a FIFO (= allocate a channel)
create some objects
- DMA object for the screen (dS),
- Gr object for the screen surface (gSS)
- Gr object for a rectangle (gR)
assign the gr objects to subchannels
set up the objects (connect dS to gSS, gSS to gR, set color and size of gR).
call methods of gR to draw
watch the nice rectangle
That is the theory anyway.

Gr objects

NVidia uses context objects to drive drawing operations. An object can define a surface, draw a triangle or transfer an image.
List of objects:
NV01_CONTEXT_CLIP_RECTANGLE
NV_MEMORY_TO_MEMORY_FORMAT
NV03_PRIMITIVE_RASTER_OP
NV04_GDI_RECTANGLE_TEXT
NV04_SWIZZLED_SURFACE
NV04_CONTEXT_SURFACES_3D
NV04_DX5_TEXTURED_TRIANGLE
NV04_DX6_MULTITEX_TRIANGLE
NV04_COLOR_KEY
NV04_SOLID_LINE
NV04_UNK005E
NV05_SCALED_IMAGE_FROM_MEMORY
NV04_SCALED_IMAGE_FROM_MEMORY
NV_IMAGE_FROM_CPU
NV05_IMAGE_FROM_CPU
NV_IMAGE_BLIT
NV11_IMAGE_BLIT
NV30_IMAGE_BLIT
NV10_TCL_PRIMITIVE_3D
NV11_TCL_PRIMITIVE_3D
NV17_TCL_PRIMITIVE_3D
NV10_IMAGE_FROM_CPU
NV10_PRIMITIVE_2D
NV10_VIDEO_DISPLAY
NV10_UNK0072
NV10_SCALED_IMAGE_FROM_MEMORY
NV10_CONTEXT_SURFACES_2D
NV04_CONTEXT_SURFACES_2D
NV04_IMAGE_PATTERN
NV20_SWIZZLED_SURFACE
NV20_TCL_PRIMITIVE_3D
NV30_TCL_PRIMITIVE_3D
NV40_TCL_PRIMITIVE_3D
NV30_CLEAR_BUFFER
NV50_TCL_PRIMITIVE_3D
NV_DMA_FROM_MEMORY
NV_DMA_TO_MEMORY
NV_DMA_IN_MEMORY
A gr object is created in a few steps:
- check that an object with the same handle does not exist (nouveau does this through the list of objects that were created in the fifo, but it should be done through HT... probably)
- allocate some space in the RAMIN heap and store the gr object there
- insert the handle into RAMHT
- you are done!
A context object is referenced by a user-defined 4-byte handle (like a DMA object) and is stored inside the RAMIN heap.
The complete structure of objects is in nouveau_object.c; on NV40 they are 32 bytes long, on NV30 and earlier 16 bytes.
Every object has methods. Most of them just store values in the object, some perform an action (draw, transfer). The list of methods can be found under the definition of the object in nouveau_reg.h. The methods are just defines for numbers; the reason for that is explained in the section on subchannels.
Gr objects are stored in the RAMIN heap. Each object is identified by a user-defined handle, which is a 4-byte integer. It is the handle that is passed as a parameter to methods, not an offset.
The list of all known objects and their methods can be found in nouveau_reg.h in xf86-video-nouveau.
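The creation steps above can be sketched as a small self-contained model. Everything in it is hypothetical: the names (ramin_heap, ramht, gr_object_create), the table sizes and the trivial modulo hash are stand-ins for illustration, not the real nouveau code and not the hash the hardware uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RAMIN_HEAP_SIZE 4096u   /* hypothetical heap size in bytes */
#define RAMHT_ENTRIES   64u     /* hypothetical number of hash slots */
#define OBJ_SIZE        16u     /* 16 bytes per object on NV30 and earlier */

struct ramht_entry { uint32_t handle; uint32_t offset; int used; };

static uint8_t            ramin_heap[RAMIN_HEAP_SIZE];
static uint32_t           heap_top;            /* simple bump allocator */
static struct ramht_entry ramht[RAMHT_ENTRIES];

/* Look a handle up; returns the heap offset, or -1 if it is unknown. */
static long ramht_lookup(uint32_t handle)
{
    uint32_t slot = handle % RAMHT_ENTRIES;    /* placeholder hash */
    for (uint32_t i = 0; i < RAMHT_ENTRIES; i++) {
        const struct ramht_entry *e = &ramht[(slot + i) % RAMHT_ENTRIES];
        if (!e->used)
            return -1;
        if (e->handle == handle)
            return (long)e->offset;
    }
    return -1;
}

/* Reject duplicates, allocate in the heap, then insert the handle. */
static long gr_object_create(uint32_t handle, const uint32_t words[4])
{
    assert(words != NULL);
    if (ramht_lookup(handle) >= 0)
        return -1;                              /* handle already exists */
    if (heap_top + OBJ_SIZE > RAMIN_HEAP_SIZE)
        return -1;                              /* heap exhausted */

    uint32_t slot = handle % RAMHT_ENTRIES;
    for (uint32_t i = 0; i < RAMHT_ENTRIES; i++) {
        struct ramht_entry *e = &ramht[(slot + i) % RAMHT_ENTRIES];
        if (e->used)
            continue;
        /* Store the object in the heap, then publish it in the table. */
        memcpy(&ramin_heap[heap_top], words, OBJ_SIZE);
        e->handle = handle;
        e->offset = heap_top;
        e->used = 1;
        heap_top += OBJ_SIZE;
        return (long)e->offset;
    }
    return -1;                                  /* hash table full */
}
```

The duplicate check here goes through the hash table itself, which is what the "it should be done through HT... probably" remark suggests; nouveau's own list-based check would behave the same from the caller's point of view.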

FIFO

DMA objects

There are also DMA objects, but they are more or less chunks of memory.

subchannels and channels

This section is based mostly on the structure Nv006aChannel (from _Nv04Device). As you can see, the structure is volatile, so the way it is laid out is the way it looks in memory.
NVidia HW can easily support multiple 3D processes at once thanks to channels. Each 3D process has its own channel and uses it for all operations. Each channel can be thought of as an individual processing unit (TRUE? context switching is necessary for this illusion). The number of channels ranges from 16 on nv4 to 128 on nv50 (same as the number of FIFOs). Every channel has 8 subchannels.
A subchannel is more or less a slot to which the driver can assign an object; you can then call methods of that object. This section partly explains an alternative to the FIFO for programming the nvidia hardware, which is however much slower and less secure.
Every object has methods that do something with it, and since methods
- are actually offsets into the subchannel (note that the first 0x100 bytes of the subchannel are filled with something common = no method is lower than 0x100) and
- objects can have different methods at the same offsets
you first assign an object to a subchannel, and all memory registers in the subchannel change to work as methods of the given object type.
For example, you create an object NV04_GDI_RECTANGLE_TEXT (type 0x4A) and assign it to subchannel 2. Subchannel 2 will then change registers 0x100-0x.. so that they work as the methods of NV04_GDI_RECTANGLE_TEXT (the offset NV04_GDI_RECTANGLE_TEXT_SURFACE 0x198 sets the surface to write to; its parameter is a handle). Actually, most "methods" only store a value that is later used in some operation. Very few of them actually do something (like drawing a triangle); first you have to call many methods to set up the correct state of the objects.
If you change the object in a subchannel, the setup of the original object is lost (TRUE).
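To make the "slot" idea concrete, here is a toy model of subchannels, written under assumptions: the struct layout, the function names, the size of the method window and the handles are all made up for illustration. Only the 0x100 lower bound, the count of 8 subchannels, class 0x4A and offset 0x198 come from the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_SUBCHANNELS 8u
#define METHOD_SPACE    0x2000u  /* hypothetical size of one subchannel window */

/* A subchannel is a slot: it remembers which object is assigned and
 * holds the values that the object's methods have stored so far. */
struct subchannel {
    uint32_t bound_handle;
    uint32_t bound_class;                 /* 0 = nothing assigned yet */
    uint32_t regs[METHOD_SPACE / 4];
};

static struct subchannel subch[NUM_SUBCHANNELS];

/* Assigning a new object throws away the setup of the previous one. */
static void subchannel_assign(unsigned subc, uint32_t handle, uint32_t class_id)
{
    assert(subc < NUM_SUBCHANNELS);
    memset(subch[subc].regs, 0, sizeof subch[subc].regs);
    subch[subc].bound_handle = handle;
    subch[subc].bound_class = class_id;
}

/* A "method call" is a write at an offset; most methods just store the value. */
static int subchannel_call(unsigned subc, uint32_t method, uint32_t value)
{
    assert(subc < NUM_SUBCHANNELS);
    if (method < 0x100 || method >= METHOD_SPACE || (method & 3))
        return -1;                        /* not a valid method offset */
    if (subch[subc].bound_class == 0)
        return -1;                        /* no object in this slot */
    subch[subc].regs[method / 4] = value;
    return 0;
}
```

What a given offset means depends only on bound_class, which is why the same number can be a different method for a different object type.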

nv_demo

The nv_demos are small programs used for RE of nvidia commands and registers. They are not drivers and they do not use DRI/Gallium3D. They manually set up objects and write some commands to the FIFO. The developer writes some code in an nv_demo and hopes to achieve some result (triangle, repeated texture, scissors...). She will (mostly) achieve the expected result after a few (painful) hours (days). The general principle is to look at the MMIO and renouveau dumps and try to repeat what the nvidia blob did. Sometimes it is a single bit, sometimes the whole infrastructure.
- nv04_demo http://nouveau.cvs.sourceforge.net/nouveau/nv04_demo/
- nv10_demo http://nouveau.cvs.sourceforge.net/nouveau/nv10_demo/
- nv20_demo http://cgit.freedesktop.org/~pq/nv20_demo/
- nv30_demo http://cgit.freedesktop.org/~jkolb/nv30_demo/
- nv40_demo http://nouveau.cvs.sourceforge.net/nouveau/nv40_demo/

MMIO registers

RAMIN, RAMFC, RAMHT, RAMRO

At first you may be confused about what these abbreviations mean. They are sequences of MMIO registers (nv50 may be different). As you may know, MMIO registers are mapped as a memory area of the Nvidia card (mostly the 16MiB 32-bit area you can see with lspci -vv -n).

RAMIN

RAMIN is instance memory, a part of the MMIO memory of the graphics card. It is not yet clear how big RAMIN is on each card; nouveau uses 1MiB for >= NV10 and 512KiB for < NV10. At least on NV40, RAMIN is at the end of VRAM. If you want to write to RAMIN, use NV_WRITE(NV_RAMIN + offset). NV_RAMIN is 0x00700000 (= start of the RAMIN area in the MMIO area). The memory area of RAMIN excluding RAMFC, RAMHT and RAMRO is called the RAMIN heap and is used for storing DMA and context objects.

RAMFC

RAMFC is the FIFO context table. RAMFC is stored inside RAMIN, at an offset from the start of RAMIN.
On NV40 or newer the offset is 0x20000 (RAMIN + 0x20000),
otherwise 0x11400 (RAMIN + 0x11400).
The size of RAMFC is number_of_fifos (NV3=8; NV4+=16; NV10+=32) * size_of_fifo_ctx (NV4+=32; NV10+=64; NV40+=128). After you decide on the position, you should write it to the register NV_PFIFO_RAMFC. What exactly you write differs from model to model, but most of the time you write RAMFC_offset >> 8 | something (for more info see nouveau_fifo_instmem_configure).
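As a worked example of the arithmetic above, this sketch computes the RAMFC offset and size for three generations. The enum and function names are mine, not nouveau's, and the model-specific "something" that gets OR-ed into the register value is deliberately left out because I don't know it. NV3 is skipped because its context size is not listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Card generations as used in the text; the enum itself is hypothetical. */
enum card_gen { GEN_NV04, GEN_NV10, GEN_NV40 };

/* Number of FIFOs: NV4+ = 16, NV10+ = 32. */
static uint32_t ramfc_num_fifos(enum card_gen gen)
{
    return gen == GEN_NV04 ? 16u : 32u;
}

/* Size of one FIFO context in bytes: NV4+ = 32, NV10+ = 64, NV40+ = 128. */
static uint32_t ramfc_ctx_size(enum card_gen gen)
{
    switch (gen) {
    case GEN_NV04: return 32u;
    case GEN_NV10: return 64u;
    case GEN_NV40: return 128u;
    }
    assert(!"unknown card generation");
    return 0;
}

static uint32_t ramfc_size(enum card_gen gen)
{
    return ramfc_num_fifos(gen) * ramfc_ctx_size(gen);
}

/* Offset of RAMFC from the start of RAMIN. */
static uint32_t ramfc_offset(enum card_gen gen)
{
    return gen == GEN_NV40 ? 0x20000u : 0x11400u;
}

/* Only the "offset >> 8" part of the NV_PFIFO_RAMFC value; the extra
 * model-specific bits are not modelled here. */
static uint32_t ramfc_reg_base(enum card_gen gen)
{
    return ramfc_offset(gen) >> 8;
}
```

So RAMFC takes 512 bytes on NV04, 2048 bytes on NV10 and 4096 bytes on NV40 with these numbers.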

RAMHT

RAMHT is the FIFO hash table. It is used for storing and looking up handles of DMA and context objects. The key of the hash table is the 4-byte handle and maybe some other 4-byte value. The first is user-defined, the other can be deduced (see nouveau_object.c). RAMHT is positioned at offset 0x10000 + NV_RAMIN. I don't know whether this position is fixed or predefined, but it is used on every card. You should write it to the register NV_PFIFO_RAMHT (not the offset itself, but a mangled form of it, see nouveau_fifo_instmem_configure).

RAMRO

RAMRO is an abbreviation of FIFO runout table. It is stored at an offset 0x11200 + NV_RAMIN and is 512 bytes long... probably. Purpose and use is unknown.
After you decide on position, you should write it to register NV_PFIFO_RAMRO (offset>>8). For more info see nouveau_fifo_instmem_configure.

DMA object

DMA objects are used to reference a piece of memory in the framebuffer, PCI or AGP address space. Each object is 16 bytes big and looks as follows:

  address - here it means the 32-bit offset of the memory area the DMA object references. Example: a DMA notifier is a small memory area (32-256 bytes) somewhere in AGP or VRAM. The offset in this case is the offset from the start of AGP/VRAM.
  entry[0]
  11:0  class (seems like I can always use 0 here)
  12    page table present?
  13    page entry linear?
  15:14 access: 0 rw, 1 ro, 2 wo
  17:16 target: 0 NV memory, 1 NV memory tiled, 2 PCI, 3 AGP
  31:20 dma adjust (bits 0-11 of the address)
  entry[1]
  dma limit
  entry[2]
  1     0 readonly, 1 readwrite
  31:12 dma frame address (bits 12-31 of the address)
  Non-linear page tables seem to need a list of frame addresses afterwards,
  the rivatv project has some info on this.
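The bit layout above can be turned into a small packing helper. This is only a sketch of the layout as written here: the function and enum names are hypothetical, entry[3] is left at zero because its meaning is unknown, and bit 0 of entry[2] (which shows up set in the startup dump in the NV04 initialization section) is not modelled.

```c
#include <assert.h>
#include <stdint.h>

enum dma_access { DMA_ACCESS_RW = 0, DMA_ACCESS_RO = 1, DMA_ACCESS_WO = 2 };
enum dma_target {
    DMA_TARGET_NVMEM = 0, DMA_TARGET_NVMEM_TILED = 1,
    DMA_TARGET_PCI = 2, DMA_TARGET_AGP = 3
};

struct dma_object { uint32_t entry[4]; };

/* Pack the fields into the three documented words of a DMA object. */
static struct dma_object dma_object_pack(uint32_t class_id,
                                         int pt_present, int pt_linear,
                                         enum dma_access access,
                                         enum dma_target target,
                                         uint32_t address, uint32_t limit,
                                         int readwrite)
{
    struct dma_object o;

    assert(class_id <= 0xfffu);
    o.entry[0] = class_id                          /* 11:0  class           */
               | (pt_present ? 1u << 12 : 0u)      /* 12    page table      */
               | (pt_linear ? 1u << 13 : 0u)       /* 13    linear entries  */
               | ((uint32_t)access << 14)          /* 15:14 access          */
               | ((uint32_t)target << 16)          /* 17:16 target          */
               | ((address & 0xfffu) << 20);       /* 31:20 address bits 0-11 */
    o.entry[1] = limit;
    o.entry[2] = (address & 0xfffff000u)           /* 31:12 frame address   */
               | (readwrite ? 1u << 1 : 0u);       /* 1     read/write      */
    o.entry[3] = 0;                                /* meaning unknown       */
    return o;
}
```

Packing class 0x3d with both page table bits set, rw access, NV memory and address 0 gives entry[0] = 0x0000303d, which matches the ENGINE_SW[0] value in the startup dump further down this page.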

DMA objects are stored in the RAMIN heap. Each object is identified by a handle, which is a 4-byte integer. Creating an object consists of a few steps:

check that an object with the same handle does not exist (nouveau does this through the list of objects that were created in the fifo, but it should be done through HT... probably)
allocate some space in the RAMIN heap and store the DMA object there
insert the handle into RAMHT

Context object (Graphic object)

NouveauBits

While exploring NVidia hardware and Nouveau, there are many questions you will ask and some questions or problems you would not think of. Some of them are here.

* The null object (NvNull in the source) is an object passed as an argument to object methods when you don't yet know what you will fill in later. It is more or less a default object. Its handle is 0x00000000 and it has to be created before it can be used (so you pass handle 0x0000 to methods of other objects). The class of the null object is 0x30.

* There are no "implicit" objects (= objects that don't have to be created and can be used). All objects have to be created. During creation, a user-defined handle is assigned to the object and the handle is inserted into RAMHT along with some other info (like where the object is stored). Some methods require the handle of some object as a parameter (NOT an address; if the handle is not found in RAMHT, a PFIFO error will occur).

* Nv20 either doesn't implement scissors or we don't know how to init them

* The INSTANCE_WR writes in *_graph_context_init are setting up the default graphics context.

* The initialized objects and MMIO setup survive an X restart. They survive a lot: object setup will survive a 20-second shutdown. Beware of this when trying to initialize drm using nouveau's own code instead of nvidia's.

* 19:09 < marcheu> but PGRAPH can be either put there through the graphics context, or through the object (or through direct writes, but nvidia doesn't do much of those)

* NV_PFB are registers that control frame buffer ram timings, fb layout, tiling...

* Offsets of the depth and color buffers must be a multiple of 64 (otherwise the card will choke). If depth is disabled, you can set the depth offset to the same value as the color buffer offset.

* In EXA, A8+A8 is a very important common operation. EXA uses it to compute font/icon masks.

* When DRM log says "invalid channel" whatever, you must reboot.

* The PGRAPH context shouldn't be modified by the user/driver. It is modified by the card itself as you write to the FIFO (Nvidia writes to it in nvsdk, but a completely different approach is used there).

21:48 < matc> pq: AFAIK PIPE_DATA stuff is a mirror to 3D state. And on nv10 we need to init them with certain value, otherwise 3D command produce errors IRRC.
21:49 < matc> yes may be if the 3D pipe is in a random state

Software methods (GPU system call handled in DPC)

Some old cards (like nv04) do not implement all methods of all objects in hardware. SET methods are a frequent example. Since the card does not implement them, nouveau (or some other driver) has to do it itself. How does that happen?
- The card is processing the FIFO and encounters an unimplemented method.
- The card raises an interrupt and the kernel calls the irq handling function with an INVALID_METHOD interrupt.
- The driver determines the method and object that caused the interrupt and implements the method (or ignores it; implement only what you need).
How to get the object and method? No idea... do an MMIO trace and look for something in the FIFO context that looks like an object handle.
- 11.605111 read32 #96 +0x0040016c -> 0x00001476
- 11.605300 read32 #96 +0x00714760 -> 0x0101905f
This looks like an object handle, see? Easy ^___
Just kidding, get the address from 40016c, and the value from 400160. It could work.

* How are interrupts handled?

All interrupts end up in drm/shared-core/nouveau_irq.c. Once you are in the irq handler, it's simple. There is a major interrupt reg and a minor reg. You read the major reg (PMC_INTR_0), which tells you which engine it's about (PFIFO, PGRAPH, CRTC), and according to that we route to different functions (nouveau_*_irq_handler). Each bit set in the minor reg means an event... you process it and write 1 to that bit to tell the card it's taken care of (even if you didn't do anything).
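The major/minor scheme can be sketched with simulated registers. Nothing below touches real hardware: struct fake_card stands in for MMIO, the engine bit positions are invented (the real PMC_INTR_0 layout may differ), and the handler names are mine rather than those in nouveau_irq.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical engine bits in the major register. */
#define INTR_PFIFO  (1u << 0)
#define INTR_PGRAPH (1u << 1)

/* Simulated registers standing in for MMIO reads and writes. */
struct fake_card {
    uint32_t major;            /* plays the role of PMC_INTR_0 */
    uint32_t pfifo_minor;
    uint32_t pgraph_minor;
    unsigned events_handled;
};

/* Models "write 1 to the bit": the event is cleared, and the engine's
 * bit in the major register drops once no events are left. */
static void ack_write(struct fake_card *c, uint32_t *minor,
                      uint32_t engine_bit, uint32_t bits)
{
    *minor &= ~bits;
    if (*minor == 0)
        c->major &= ~engine_bit;
}

/* Walk every pending bit of one engine, process it, acknowledge it. */
static void engine_irq_handler(struct fake_card *c, uint32_t *minor,
                               uint32_t engine_bit)
{
    uint32_t pending = *minor;
    for (unsigned bit = 0; bit < 32; bit++) {
        if (pending & (1u << bit)) {
            c->events_handled++;   /* real code would handle the event here */
            ack_write(c, minor, engine_bit, 1u << bit);
        }
    }
}

/* Top level: read the major register and route per engine. */
static void irq_handler(struct fake_card *c)
{
    assert(c != NULL);
    uint32_t status = c->major;
    if (status & INTR_PFIFO)
        engine_irq_handler(c, &c->pfifo_minor, INTR_PFIFO);
    if (status & INTR_PGRAPH)
        engine_irq_handler(c, &c->pgraph_minor, INTR_PGRAPH);
}
```

Note that every pending bit is acknowledged even when nothing useful was done for it, as described above; otherwise the interrupt would stay asserted.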

From now on is Old Content(TM): update, spellcheck, move.

Open questions:

Is the FIFO the only way to modify objects? Nope, it is not. But it is a convenient, fast and safe way. You can also change objects by writing directly to the memory area where the objects are stored. When you fire the FIFO, it takes all the commands you wrote and modifies the object. I don't remember the address where objects are stored, but if you want to see what the PIO structures look like, they are in nvsdk, file nvhw32.h, in structures like Nv04Dx5TexturedTriangle or Nv03GdiRectangleText.
If there is a sequence of commands on a FIFO object where the offsets of the actions differ by 4, is it possible to send more than one command in one packet? I have seen similar things in some renouveau dumps.

Will all these FIFO actions produce same result?
1.
   BEGIN_RING_SIZE(7, 0x00000304, 3)
   OUT_RING(0x11111111)
   OUT_RING(0x22222222)
   OUT_RING(0x33333333)
2.
   BEGIN_RING_SIZE(7, 0x00000304, 1)
   OUT_RING(0x11111111)
   BEGIN_RING_SIZE(7, 0x00000308, 1)
   OUT_RING(0x22222222)
   BEGIN_RING_SIZE(7, 0x0000030c, 1)
   OUT_RING(0x33333333)
3.
   BEGIN_RING_SIZE(7, 0x00000304, 2)
   OUT_RING(0x11111111)
   OUT_RING(0x22222222)
   BEGIN_RING_SIZE(7, 0x0000030c, 1)
   OUT_RING(0x33333333)
4.
   BEGIN_RING_SIZE(7, 0x00000304, 1)
   OUT_RING(0x11111111)
   BEGIN_RING_SIZE(7, 0x00000308, 2)
   OUT_RING(0x22222222)
   OUT_RING(0x33333333)

ANSWERED: Yup, in most cases (it may cause trouble when you write a value that itself causes trouble). You have assigned an object to the subchannel, and every packet writes data to the object at the specified offset. At first I thought that every packet was a command with proper syntax; it is not. A packet just writes data into the object, so that when the object should do something, it has all the necessary data in itself.
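The answer can be checked on paper with a toy model in which a packet is nothing more than "write count words starting at this offset, stepping by 4". The struct, the function names and the size of the object image are assumptions for illustration; side effects that real methods may have are not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OBJ_WORDS 0x800u   /* hypothetical size of the method space in words */

struct packet {
    uint32_t method;        /* offset of the first method */
    uint32_t count;         /* number of data words that follow */
    const uint32_t *data;
};

/* Word j of a packet lands at offset method + 4*j in the object image. */
static void apply_packets(uint32_t *obj, const struct packet *p, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        for (uint32_t j = 0; j < p[i].count; j++) {
            uint32_t m = p[i].method + 4 * j;
            assert(m / 4 < OBJ_WORDS);
            obj[m / 4] = p[i].data[j];
        }
    }
}

/* Build the four sequences from the question and compare the results. */
static int sequences_equivalent(void)
{
    static const uint32_t d[3] = { 0x11111111u, 0x22222222u, 0x33333333u };
    const struct packet s1[] = { { 0x304, 3, &d[0] } };
    const struct packet s2[] = { { 0x304, 1, &d[0] }, { 0x308, 1, &d[1] },
                                 { 0x30c, 1, &d[2] } };
    const struct packet s3[] = { { 0x304, 2, &d[0] }, { 0x30c, 1, &d[2] } };
    const struct packet s4[] = { { 0x304, 1, &d[0] }, { 0x308, 2, &d[1] } };
    static uint32_t o1[OBJ_WORDS], o2[OBJ_WORDS], o3[OBJ_WORDS], o4[OBJ_WORDS];

    memset(o1, 0, sizeof o1);
    memset(o2, 0, sizeof o2);
    memset(o3, 0, sizeof o3);
    memset(o4, 0, sizeof o4);
    apply_packets(o1, s1, 1);
    apply_packets(o2, s2, 3);
    apply_packets(o3, s3, 2);
    apply_packets(o4, s4, 2);

    return memcmp(o1, o2, sizeof o1) == 0 &&
           memcmp(o1, o3, sizeof o1) == 0 &&
           memcmp(o1, o4, sizeof o1) == 0;
}
```

In this model all four sequences leave the same three values at 0x304, 0x308 and 0x30c, which is the "in most cases" part of the answer.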
How are DMA0, DMA1 and DMAnotify used in a graphics context object?
- DMAnotify is the address of a notify object; the card writes to the notify object after the end of a DMA transfer. For more information see nv_dma.c, "How do we wait for DMA completion (by notifiers)"
- The rest is unknown for now.
ANSWERED Explain some DMA command, maybe NV04_DX6_MULTITEX_TRIANGLE_DMA_NOTIFY. It has one parameter - a DMA notify object. For more information, see the chapter DMA notify.
What exactly is Entry[3] in a DMA object? It is sometimes 0, sometimes a copy of entry[2] and sometimes something completely weird. Source: dmesg. DMA objects have a defined structure only for Entry[0-2].
ANSWERED Where is the FIFO allocated? Or mapped.
- The user space FIFO is mapped to the address dev_priv->fifos[channel].cmd_mem inside FB (or possibly AGP). The process of creating a FIFO is as follows:
  - Allocate space inside FB (AGP) and create a DMA object for the memory area.
  - Get the address of the FIFO context for the channel and write the DMA object and some other information into the context. The precise FIFO context format is card class specific.
Is anything other than the handle used when hashing an object address into RAMHT? More info about the second CARD32 is needed.
More info about DMA objects, usage, see above
- NV_DMA_IN_MEMORY - what does this class mean? It is used for AGP and FB DMA object. Sometimes size is size of AGP aperture, sometimes roughly half of video ram.
- NV_DMA_TO_MEMORY - what does this class mean? Size is mostly 10f
- NV_DMA_FROM_MEMORY - what does this class mean? Size 1ff. Sometimes same offset as TO_MEMORY, probably copy
Are DMA handles user defined?
I don't know how multiple subchannels on different channels can use the same context object. Ask marcheu.
Well, what should you write to NV_PFIFO_RAM[HT|FC|RO]? Can I place RAM[HT|FC|RO] at any part of RAMIN and then write it to the register, or will it break?
Do commands like
- NV04_IMAGE_PATTERN_MONO_FORMAT = lowEndian
- NV04_IMAGE_PATTERN_COLOR_FORMAT = A8R8G8B8
- change context objects? Probably. Do they do something else?
Is the channels and subchannels section correct? If it is, then what is the purpose of a context object?

Channels and subchannels

NVidia HW can easily support multiple OpenGL processes at once thanks to channels. Each OpenGL process has its own channel and uses it for all operations. Each channel can be thought of as an individual processing unit (context switching is necessary for this illusion).

Each channel has 8 subchannels. A subchannel is something like a pointer to an "object", which can be things like NV04_CONTEXT_SURFACES_3D or NV04_DX5_TEXTURED_TRIANGLE or NV04_SCALED_IMAGE_FROM_MEMORY... Each "object" can do a given kind of action, as the names imply. So a subchannel with NV04_SCALED_IMAGE_FROM_MEMORY can't draw a triangle on screen, but can put an image on screen. You can see the objects in nouveau_regs.h together with the possible commands for the fifo (let's call them rings).

Probably wrong!!! So you have
subchannels - a subchannel is represented by a number (0-7) and is associated with a context object. Every FIFO command XYZ at subchannel S performs an action on the context object associated with subchannel S.
context objects - a context object structure can be seen as an instance of a class. It stores properties and information provided by commands performed on subchannel S

How does it work together

During DRI initialization, context objects are created and subchannels are associated with them.

Object	Object type	info/subchannel
NvDmaFB	DMA object	VRAM of card
NvDmaAGP	DMA object	AGP bus
Nv3D	class_3d	NvSub3D, default card primitive for drawing 3D at the card
NvCtxSurf2D	NV04_CONTEXT_SURFACES_2D	NvSubCtxSurf2D
NvCtxSurf3D	NV04_CONTEXT_SURFACES_3D	NvSubCtxSurf3D, only on NV04, used for setting up buffer context
NvImageBlit	NV_IMAGE_BLIT	NvSubImageBlit
NvMemFormat	NV_MEMORY_TO_MEMORY_FORMAT	NvSubMemFormat, used for copy of buffers

After that various OpenGL functions are called and they use FIFO for drawing and other operations.

static void nv20ClipPlane(GLcontext *ctx, GLenum plane, const GLfloat *equation)
{
        nouveauContextPtr nmesa = NOUVEAU_CONTEXT(ctx);
        BEGIN_RING_CACHE(NvSub3D, NV20_TCL_PRIMITIVE_3D_CLIP_PLANE_A(plane), 4); // We are using previously created subchannel NvSub3D for definition of clip plane
        OUT_RING_CACHEf(equation[0]);
        OUT_RING_CACHEf(equation[1]);
        OUT_RING_CACHEf(equation[2]);
        OUT_RING_CACHEf(equation[3]);
}

How to start to do some work with FIFO

While working with the FIFO, you put actions into it. Each action is a packet-like structure that starts with size, command and subchannel.

  BEGIN_RING(NvSub3D_0, NV30_TCL_PRIMITIVE_3D_CLEAR_VALUE_ARGB, 1);
  OUT_RING  (0xffffffff);

Before you start putting actions into the FIFO, you have to assign a context object to a subchannel (the context object of course has to be created before that). The driver should take care of this for you, but during development you need to do it yourself.

        // how to assign context object to subchannel
        BEGIN_RING(number_of_subchannel, 0, 1);  // assign object with handle_of_object
        OUT_RING  (handle_of_object);            // to subchannel number_of_subchannel

Now an object is assigned to subchannel number_of_subchannel (NoS). You can assign other context objects to other subchannels at any time. As soon as at least one subchannel has an object assigned, you can put actions into the FIFO (actions other than assigning an object to a subchannel).

Real actions that *do* *something* are specified in packets (in most cases one action per packet). A packet/action starts by writing info about the action to the FIFO through BEGIN_RING(subchannel_with_assigned_object_OBJ, command, size). You can use only commands that are valid for object OBJ; the list can be found in nouveau_reg.h, under the class of OBJ.

Object NV30_CLEAR_BUFFER (this is class of OBJ)
#define                 NV30_CLEAR_BUFFER                        0x00000066  // this is class
#       define          NV30_CLEAR_BUFFER_SET_DMA_NOTIFY         0x00000180  // this command for class NV30_CLEAR_BUFFER
#       define          NV30_CLEAR_BUFFER_SET_IMAGE_PATTERN      0x00000188  // this command for class NV30_CLEAR_BUFFER
#       define          NV30_CLEAR_BUFFER_SET_RASTER_OP          0x0000018c  // this command for class NV30_CLEAR_BUFFER
#       define          NV30_CLEAR_BUFFER_SET_CONTEXT_SURFACE_2D 0x00000198  // this command for class NV30_CLEAR_BUFFER
#       define          NV30_CLEAR_BUFFER_UNK002fc               0x000002fc  // this command for class NV30_CLEAR_BUFFER
!!! you can't BEGIN_RING(subchannel_with_assigned_object_NV30_CLEAR_BUFFER, NV30_TCL_PRIMITIVE_3D_BEGIN_END, 0x1)

BEGIN_RING marks the start of a packet with a command. It also carries a size; the size is always at least 0x1 and is the number of arguments of the command. The size is command specific, and you need renouveau dumps or other sources for the precise syntax. Each argument is 4 bytes. The header of the packet is also 4 bytes long (see the macro).
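For illustration, here is how a 4-byte header and its arguments could be laid out in a ring buffer. The bit positions (size from bit 18, subchannel from bit 13, method in the low bits) are how I remember the BEGIN_RING macro and should be treated as an assumption to verify against the macro itself; struct ring, begin_ring and out_ring are hypothetical stand-ins, not the nouveau macros.

```c
#include <assert.h>
#include <stdint.h>

#define RING_WORDS 64u   /* hypothetical ring size for this demo */

struct ring {
    uint32_t buf[RING_WORDS];
    unsigned cur;          /* index of the next free word */
};

/* Assumed header layout: size in bits 28:18, subchannel in bits 15:13,
 * method offset in the low 13 bits. */
static uint32_t packet_header(unsigned subc, uint32_t method, unsigned size)
{
    assert(subc < 8);
    assert(size >= 1 && size < 2048);
    assert(method < 0x2000 && (method & 3) == 0);
    return ((uint32_t)size << 18) | ((uint32_t)subc << 13) | method;
}

/* Append one 4-byte word (header or argument) to the ring. */
static void out_ring(struct ring *r, uint32_t word)
{
    assert(r->cur < RING_WORDS);
    r->buf[r->cur++] = word;
}

/* Start a packet: one header word, to be followed by `size` arguments. */
static void begin_ring(struct ring *r, unsigned subc, uint32_t method,
                       unsigned size)
{
    out_ring(r, packet_header(subc, method, size));
}
```

Under that assumed layout, a packet on subchannel 7 at method 0x304 with 3 arguments gets the header 0x000ce304 and occupies 4 words (16 bytes) in the ring.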

  // example of an action; it won't do anything useful, or it may crash.
  NvSub3D = 7;
  Nv3D_handle = 0xbeef1234;
  // create a context object
  _context_obj_create(Nv3D_handle, NV30_TCL_PRIMITIVE_3D, init.channel);
  // set Nv3D_handle context object to subchannel 7
  BEGIN_RING(NvSub3D, 0, 1);
  OUT_RING  (Nv3D_handle);
  // set clear color
  BEGIN_RING(NvSub3D, NV30_TCL_PRIMITIVE_3D_CLEAR_VALUE_ARGB, 1);
  OUT_RING  (0xff0000ff);
  // clear color buffer
  BEGIN_RING(NvSub3D, NV30_TCL_PRIMITIVE_3D_CLEAR_WHICH_BUFFERS, 1);
  OUT_RING  (0x000000f0);

This may not be the only way of specifying actions. See the open questions about packed actions.

FIXME: is it simple or too complicated or elementary? Well, confusing packets / actions

DMA notify - example

In this part I attempt to describe some things about DMA, like commands ending with DMA_NOTIFY. Nearly every object has a command ...DMA_NOTIFY; the parameter of the command is a DMA notifier.

A DMA notifier is a DMA object that references a small (32 byte it seems, we use 256 for safety) memory area that will be used by the HW to give feedback about a DMA operation.

// This is not exactly what you write to FIFO, but it gives an idea.
set subchannel 0 to context object NV_IMAGE_FROM_CPU (beef6101)
  // now at each line is one action. First is COMMAND - parameters (here mostly objects)
  NV_IMAGE_FROM_CPU_DMA_NOTIFY       - DMA obj NV_DMA_TO_MEMORY (beef0301)
  NV_IMAGE_FROM_CPU_CLIP_RECTANGLE   - context obj NV01_CONTEXT_CLIP_RECTANGLE) | patch = SRCCOPY_AND (beef1901)
  NV_IMAGE_FROM_CPU_PATTERN          - context obj NV04_IMAGE_PATTERN (***beef4401)
  NV_IMAGE_FROM_CPU_ROP              - context obj NV03_PRIMITIVE_RASTER_OP (beef4301)
  NV_IMAGE_FROM_CPU_SURFACE          - context obj NV04_CONTEXT_SURFACES_2D (beef4201)
  NV_IMAGE_FROM_CPU_OPERATION        - SRCCOPY // It tells card copy image.

Well, what exactly is NV_IMAGE_FROM_CPU doing? It probably copies an image from the CPU to the screen. First you map the subchannel and issue some commands on the subchannel/object (0/beef6101). The last action (NV_IMAGE_FROM_CPU_OPERATION - SRCCOPY) tells the GPU to copy the image from source to destination. This can take a while, and during that time the GPU processes other FIFO actions (if possible; the GPU can tell whether the next action interferes with the DMA transfer, and if it does, the GPU waits for the end of the transfer). So the FIFO processes a few (4-5) actions after SRCCOPY, and only then does the action (NV_IMAGE_FROM_CPU_OPERATION - SRCCOPY) end.

Since NV_IMAGE_FROM_CPU_DMA_NOTIFY was specified, there is a DMA notifier and we can be notified about the end of the transfer. There are two ways to check:

Either repeatedly read the notifier address and wait until it changes,
or enable a 'wakeup' interrupt by writing NOTIFY_WRITE_LE_AWAKEN into the 'notify' field of the object in the channel. My guess is that this causes an interrupt in PGRAPH/NOTIFY as soon as the transfer is completed.

It is unknown how the 'nvdriver' reacts if it gets notify events that are not registered.

Writing NV_NOTIFY_WRITE_LE_AWAKEN into the 'Notify' field of an object in a channel really causes an interrupt in the PGRAPH engine. Thus we can determine whether a DMA transfer has finished in the interrupt handler.

We can't use interrupts in user land, so we do the simple polling approach.
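A minimal sketch of the polling approach, under assumptions: the notifier is treated as a single 32-bit word that the hardware overwrites when the transfer ends, and the "pending" marker value and spin limit are chosen by the caller. Which word of the real notifier changes, and to what, is not covered here; the function names are mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mark the notifier as "transfer pending" before submitting the work. */
static void notifier_arm(volatile uint32_t *notifier, uint32_t pending)
{
    assert(notifier != NULL);
    *notifier = pending;
}

/* Re-read the notifier until it no longer holds the pending marker.
 * Returns 0 when the value changed, -1 if max_spins reads went by first.
 * The volatile qualifier forces a real memory read on every iteration. */
static int notifier_wait(const volatile uint32_t *notifier, uint32_t pending,
                         unsigned long max_spins)
{
    assert(notifier != NULL);
    for (unsigned long i = 0; i < max_spins; i++) {
        if (*notifier != pending)
            return 0;
    }
    return -1;
}
```

The spin limit is there so that a lost notification ends in an error instead of a hung process; a real driver would probably sleep or yield between reads rather than spin.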

For further info look into nv_dma.c.

What is action - command

I should probably standardize terminology...
action is one packet that does something, like

  BEGIN_RING(NvSub3D_0, NV30_TCL_PRIMITIVE_3D_CLEAR_VALUE_ARGB, 1);
  OUT_RING  (0xffffffff);

command is in this case NV30_TCL_PRIMITIVE_3D_CLEAR_VALUE_ARGB

NV04 initialization

The purpose of this document is to record, for other developers as well as for myself, some information and principles of NVidia cards that I have discovered. But the original goal of my journey is to make glxgears work on the NV04 class of cards.

From the test_startup dump of REnouveau it seems that I should do the following:

 1. Create DMA objects for FB and AGP.
 1. Create a DMA memory area for the surface ('''NvDma3Dsurf''') where the triangle will be drawn. In the startup dump it looks like this:
          Searching for object beef0201
          Context is 82001495
          Software object
          ENGINE_SW[0] = 0000303d = class = 003d (NV_DMA_IN_MEMORY) | page table | page entry linear | dma_access = rw | dma_target = NV mem | dma adjust = 000
          ENGINE_SW[1] = 00fd00ff = dma limit = 00fd00ff
          ENGINE_SW[2] = 00000003 = dma page address = 00000 | r/w = TRUE | UNKNOWN = 00000001
          ENGINE_SW[3] = 00000003 = dma page address = 00000 | r/w = TRUE | UNKNOWN = 00000001
 1. Create context object NV04_CONTEXT_SURFACES_3D (NvCtxSurf3D)
   1. NV04_CONTEXT_SURFACES_3D_DMA_COLOR = NvDma3Dsurf
   1. NV04_CONTEXT_SURFACES_3D_DMA_ZETA  = NvDma3Dsurf
   1. NV04_CONTEXT_SURFACES_3D_FORMAT = color = A8R8G8B8 | type = pitch | width = 0 | height = 0
   1. NV04_CONTEXT_SURFACES_3D_CLIP_HORIZONTAL = x = 0 | width = 512
   1. NV04_CONTEXT_SURFACES_3D_CLIP_VERTICAL = y = 0 | height = 512
   1. NV04_CONTEXT_SURFACES_3D_PITCH = color = 5120 | zeta = 2112
   1. NV04_CONTEXT_SURFACES_3D_OFFSET_COLOR = 0x0001f400
   1. NV04_CONTEXT_SURFACES_3D_OFFSET_ZETA = 0x00ec8000
 1. Create context object NV04_DX5_TEXTURED_TRIANGLE.
   1. NV04_DX5_TEXTURED_TRIANGLE_DMA_1 = NvDma3Dsurf
   1. NV04_DX5_TEXTURED_TRIANGLE_DMA_2 = DMA object at AGP aperture, with dma page address = 48000
   1. NV04_DX5_TEXTURED_TRIANGLE_SURFACE = NvCtxSurf3D
 1. Draw triangle... somehow. Copy code from default test.

The trouble is, it doesn't work. I made a mistake somewhere.