If you do not understand the terms mentioned here, head to IntroductoryCourse.

The card generations

First of all, there are a lot of different card models, grouped into 8 card generations for our purposes. The cards are identified by a few different names: the marketing name ("GeForce 8800 GTX"), the code name that nvidia uses in its documentation ("G80"), and the real code number ("NV50"). The code number is embedded directly in the hardware, and it is what matters for identifying your card for programming purposes. For the first cards, roughly up to NV3x, the nvidia code name was equal to the real code number of the card; later the two diverged. Note that the same real code number can correspond to several marketing names: this means that these cards use the same GPU core, but differ in some other characteristic, like bus size, memory size, number of working shaders/ROPs, etc.

Generation name | Model code numbers | Commercial names | Notes
NV01 | | Diamond Edge 3D | The first nv card ever. Had 3d drawing engine using quadratic curves instead of polygons. Nobody used it, nobody bought it... don't really bother. Not supported by nouveau.
NV02 | | ? | Quadratic curves taken even further, never really completed. A bad joke.
NV03 | | RIVA 128 | First serious, polygon-based 3d card. Can do DX5. Very rare now. Not supported by nouveau.
NV04 | NV04, NV05 | RIVA TNT, TNT2 | Introduced DMA FIFOs, enough hardware 3d support to do DX6 (multitex and stencil), and objects as we know them. First card supported by nouveau.
NV10 | NV10, NV11, NV15, NV17, NV18 | GeForce 256, GeForce 2, GeForce 4 MX | Introduced hardware TCL
NV20 | NV20, NV25, NV28, NV2A | GeForce 3, GeForce 4 Ti | First shaders
NV30 | NV30, NV31, NV34, NV35, NV36 | GeForce FX | First serious shaders
NV40 | NV4x, NV67 | GeForce 6xxx, GeForce 7xxx | Only shaders now, fixed-function removed
NV50 | NV50, NV8x, NV9x, NVAx | GeForce 8xxx, GeForce 9xxx, GeForce 1xx, GeForce 2xx | Unified shader architecture, CUDA support. Major pieces of card architecture redone, a lot of things changed. Only one of the old objects survived.

PCI BARs

NV cards have 2 or 3 BARs, all of them memory spaces:

  1. BAR 0: control registers. 16MB in size. It is divided into several areas, one for each of the functional blocks of the card.
  2. BAR 1: VRAM. On pre-NV50, corresponds directly to the available VRAM on the card. On NV50, gets remapped through the VM engine.
  3. BAR 2: PRAMIN. This BAR exists only on NV40 and newer and gives you access to the whole of PRAMIN. On NV50, gets remapped through the VM engine.

The functional blocks

The control BAR contains several regions of MMIO registers, corresponding roughly to functional units of the card:

Address | Name | Description
0x000000-0x000fff (NV04-NV40), 0x000000-0x001fff (NV50) | PMC | Master Control. Contains registers that apply to the card as a whole, like master interrupt enable/status, card version, etc.
0x001000-0x001fff (NV04-NV40), 0x088000-0x088fff (NV50) | PBUS | Bus-related configuration. Simply an alias for PCI configuration space.
0x002000-0x0037ff | PFIFO | Responsible for queueing of commands to PGRAPH and possibly other execution engines.
0x003800-0x003fff (NV04-NV30), 0x090000-0x090fff (NV40-NV50) | PFIFO_CACHE1 | Another part of PFIFO.
0x008000-0x008fff (NV10-original NV40) | PVIDEO | Video overlay.
0x009000-0x009fff | PTIMER | A timer. Can count time and raise timer interrupts.
0x00d000-0x00dfff | PTV | FILLME
0x00e000-0x00efff (NV50 only) | PCONNECTOR | FILLME
0x0c0000-0x0c0fff | PRMVIO | Mapped to VGA registers: 0x3c2, 0x3c3, 0x3c4, 0x3c5, 0x3ce, 0x3cf
0x100000-0x100fff | PFB | FrameBuffer control. Controls the VRAM configuration.
0x101000-0x101fff | PEXTDEV | Interfacing with external devices, reading straps
0x300000-0x30ffff | PROM | Contains a copy of Video BIOS
0x400000-0x40ffff | PGRAPH | The graphics engine. Does all acceleration, including 3d, 2d, CUDA, and even memory blits.
0x600000-0x600fff | PCRTC0 | CRTC setup
0x601000-0x601fff | PRMCIO | CRTC setup. Mapped to VGA registers: 0x3c0, 0x3c1, 0x3c2, 0x3d4, 0x3d5, 0x3da
0x610000-0x61ffff (NV50 only) | PDISPLAY | NV50 modesetting FIFO setup
0x640000-0x64ffff (NV50 only) | PDISPLAY_USER | NV50 modesetting FIFO submission
0x680000-0x680fff | PRAMDAC | RAMDAC setup. On NV04, also contains overlay control.
0x681000-0x681fff | PRMDIO | RAMDAC setup. Mapped to VGA registers: 0x3c6, 0x3c7, 0x3c8, 0x3c9.
0x700000-0x7fffff (NV04 and up), 0xc00000-0xcfffff (NV03) | PRAMIN | Instance memory, see below
0x800000-0x8fffff (NV04), 0x800000-0x9fffff (NV10-NV30), 0x800000-0x81ffff (NV40), 0xc00000-0xcfffff (NV50) | FIFO | PFIFO user submission interface
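
As a rough illustration of how the address list above can be used, here is a minimal sketch that maps a BAR 0 offset to the name of its functional block. It only includes ranges that are listed without a per-generation qualifier; the names BLOCKS and block_for_offset are made up for this example and do not come from any driver.

```python
# Ranges copied from the address list above. Blocks whose location depends
# on the card generation (PMC, PBUS, PFIFO_CACHE1, PRAMIN, FIFO, ...) are
# deliberately left out of this sketch.
BLOCKS = [
    (0x002000, 0x0037ff, "PFIFO"),
    (0x009000, 0x009fff, "PTIMER"),
    (0x100000, 0x100fff, "PFB"),
    (0x101000, 0x101fff, "PEXTDEV"),
    (0x300000, 0x30ffff, "PROM"),
    (0x400000, 0x40ffff, "PGRAPH"),
    (0x680000, 0x680fff, "PRAMDAC"),
]


def block_for_offset(offset):
    """Return the name of the block containing a BAR 0 offset, or None."""
    for start, end, name in BLOCKS:
        if start <= offset <= end:
            return name
    return None


print(block_for_offset(0x400100))  # PGRAPH
```

A complete version would need a separate table per generation, since several blocks move between NV04, NV40 and NV50.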

PMC

PMC is Master Control, a block of registers used for stuff that doesn't fit anywhere else, or that applies to the whole card.

Address | Name | Description
0x0000 | PMC_BOOT_0 | Reports card chipset and stepping
0x0004 (NV10+) | PMC_BOOT_1 | Selects big/little endian mode for the card
0x0100 | PMC_INTR_0 | Shows which functional units have pending IRQ
0x0140 | PMC_INTR_EN_0 | Selects which functional units can cause IRQs
0x0160 | PMC_INTR_READ_0 | ???
0x0200 | PMC_ENABLE | Enables other functional units
0x1540 (NV40) | (not named yet) | Enables/disables individual vertex/pixel shader units
0x1540 (NV50) | (not named yet) | Enables/disables MPs, TPs, and ROPs
0x15f0 (NV40 and up) | PMC_BACKLIGHT | Controls backlight on laptops
0x1700 (NV40) | ??? | ???
0x1704 (NV40) | ??? | ???
0x1708 (NV40) | ??? | ???
0x170c (NV40) | ??? | ???
0x1700 (NV50) | PMC_BAR0_PRAMIN | Physical VRAM address of window that PRAMIN points to, shifted right by 16 bits.
0x1704 (NV50) | ??? | ???
0x1708 (NV50) | ??? | ???
0x170c (NV50) | ??? | ???
0x1710 (NV50) | ??? | ???
0x1900-0x191c (NV50) | ??? | ???

PMC_BOOT_0

This tells you what GPU you have. For maximum enjoyment, it has a different format on NV01-NV03, NV04-NV05 and NV10+. Yay.

If bits 24-27 are non-0, you have NV10 or better, with the following format:

If bits 12-15 are non-0, you have one of NV04 family cards:

In other cases, you have NV01-NV03.
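
The three checks above can be sketched as a small decision function. This only decides which of the three formats applies, using the bit ranges stated in the text; it does not decode the chipset or stepping fields. The function name boot0_family and the sample values are made up for illustration and are not real register dumps.

```python
def boot0_family(boot0):
    """Decide which PMC_BOOT_0 format a raw 32-bit value uses."""
    if (boot0 >> 24) & 0xf:      # bits 24-27 non-zero
        return "NV10+"
    if (boot0 >> 12) & 0xf:      # bits 12-15 non-zero
        return "NV04"
    return "NV01-NV03"


# Synthetic bit patterns, chosen only to exercise each branch.
print(boot0_family(0x01000000))  # NV10+
print(boot0_family(0x00004000))  # NV04
print(boot0_family(0x00000000))  # NV01-NV03
```

The order of the checks matters: the NV10+ test has to come first, as in the text.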

PMC_BOOT_1

Available on NV10 and later. Endian switch. Write 0x00000001 in your favorite endianness to switch the card to that endianness. When set to big-endian mode, this switch causes all accesses to the MMIO BAR to be byte-swapped (all byte addresses are XORed with 3), except for areas containing 8-bit legacy VGA registers. This includes PRAMIN accesses. Accesses to the framebuffer BAR aren't affected.

When read, returns 0 if in little-endian mode, 0x01000001 if in big-endian mode.
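
To make the byte-level picture concrete, here is a small sketch of the three facts above: the bytes you put on the bus when writing 0x00000001 in a given endianness, how to interpret the value read back, and the XOR-with-3 byte address swap. Nothing here touches hardware, and all function names are made up for this example.

```python
import struct


def boot1_write_bytes(big_endian):
    """Bytes of 0x00000001 laid out in the endianness you want to select."""
    return struct.pack(">I" if big_endian else "<I", 1)


def boot1_is_big_endian(value):
    """Interpret a PMC_BOOT_1 read: 0 is little-endian, 0x01000001 is big-endian."""
    if value == 0:
        return False
    if value == 0x01000001:
        return True
    raise ValueError("unexpected PMC_BOOT_1 value: %#x" % value)


def swapped_byte_address(addr):
    """Byte address as seen through the big-endian swap (XOR with 3)."""
    return addr ^ 3


print(boot1_write_bytes(True).hex())    # 00000001
print(boot1_write_bytes(False).hex())   # 01000000
print(hex(swapped_byte_address(0x100))) # 0x103
```

Note that 0x01000001 has the same byte pattern read forwards or backwards, so the big-endian readback looks identical whichever endianness the CPU uses to interpret it.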

TODO: Check NV40 PRAMIN BAR and NV50. Check how to byte-swap PFIFO pushbuffers.

PMC_INTR and PMC_INTR_EN

The functional blocks of the card can report interrupts for various reasons. PMC_INTR reports which of them are currently reporting interrupts. After you service an interrupt, you have to write 1 to the corresponding PMC_INTR bit to clear its status.

The bit assignments for PMC_INTR:

The PMC_INTR_EN register enables/disables reporting of interrupts through the PCI IRQ line. Write 1 to enable interrupts, 0 to disable.
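
The write-1-to-clear behaviour described above can be sketched with a toy model. This is not hardware access: the class name, the handler table and the bit numbers are all hypothetical, and the sketch assumes that writing 0 to a bit leaves its status untouched.

```python
class ToyPmcIntr:
    """Toy model of a write-1-to-clear interrupt status register."""

    def __init__(self):
        self.status = 0

    def raise_bit(self, bit):
        # A functional unit reports an interrupt.
        self.status |= 1 << bit

    def read(self):
        return self.status

    def write(self, value):
        # Bits written as 1 are cleared; bits written as 0 are assumed
        # to be left alone.
        self.status &= ~value & 0xffffffff


def service_all(reg, handlers):
    """Call the handler for each pending bit, then acknowledge that bit."""
    serviced = []
    pending = reg.read()
    for bit, handler in sorted(handlers.items()):
        if pending & (1 << bit):
            handler()
            reg.write(1 << bit)
            serviced.append(bit)
    return serviced


reg = ToyPmcIntr()
reg.raise_bit(3)   # hypothetical bit numbers
reg.raise_bit(7)
print(service_all(reg, {3: lambda: None, 7: lambda: None}))  # [3, 7]
print(reg.read())  # 0
```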

PMC_ENABLE

Used to turn the functional units on/off.

The VRAM and the RAMIN

Video memory is used to store all kinds of data used by the card: scanout buffers, video input buffers, render targets, textures, pixmaps, vertex buffers, shader code, command buffers, as well as some management objects. Video memory is split into normal memory and RAMIN, also known as instance memory. RAMIN contains the card management objects (usually accessible only to the kernel), while normal memory is for objects that normal applications access.

Stuff stored in RAMIN: RAMHT, RAMRO, RAMFC, DMA/graph objects (see NvObjectTypes), PGRAPH context (NV20+), channel setup and page tables (NV50+). Also, the VBIOS copies its own image to the first 64kiB of PRAMIN when it boots (or to the area initially pointed to by PMC+0x1700, in the case of NV50).

NV03 - NV40

On NV03 through NV40, RAMIN starts from the end of VRAM and is simply the last 1MB (or 32MB on NV40) of memory, except that it is addressed in reverse. The unit of reversal is 16 bytes on NV04 and NV10, 512kiB on NV40. No idea on other cards. This means that address 0x12345 in RAMIN on an NV04 card with 32MB of memory is actually address 0x1fedcb5 in VRAM. Or, if you prefer a formula:

real VRAM address = VRAM_size - (ramin_address - (ramin_address % reversal_unit_size)) - reversal_unit_size + (ramin_address % reversal_unit_size)
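
The formula translates directly into code. The sketch below is a straight transcription, with the function name ramin_to_vram made up for this example; sizes are in bytes.

```python
def ramin_to_vram(ramin_address, vram_size, reversal_unit_size):
    """Translate a RAMIN address to a real VRAM address (NV03-NV40 scheme)."""
    offset = ramin_address % reversal_unit_size   # position inside the unit
    unit_start = ramin_address - offset           # start of the unit in RAMIN
    return vram_size - unit_start - reversal_unit_size + offset


VRAM_32MB = 32 * 1024 * 1024

# NV04 with 32MB of VRAM, 16-byte reversal unit.
print(hex(ramin_to_vram(0x0, VRAM_32MB, 16)))      # 0x1fffff0
print(hex(ramin_to_vram(0x12345, VRAM_32MB, 16)))  # 0x1fedcb5
```

The units are laid out back to front, but bytes inside a unit keep their order, which is why RAMIN address 0 lands on the start of the last 16-byte unit of VRAM rather than on the very last byte.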

The PRAMIN window at 0x700000-0x7fffff (or 0xc00000-0xcfffff on NV03, presumably) in BAR 0 is mapped to first 1MB of RAMIN, and uses RAMIN addresses (that is, reversed wrt real VRAM addresses).

On NV40, BAR 2 is also mapped to the RAMIN using RAMIN addresses, and is 32MB long. Before NV40, RAMIN addresses used by other pieces of card max out at 1MB. No idea if you're allowed to go past BAR 2 size on NV40.

BAR 1 is simply mapped to the whole VRAM. If VRAM is >256MB, only first 256MB are accessible directly by the CPU, and you're basically out of luck.

NV50

On NV50, on the other hand, things are completely different. RAMIN is no longer a separate area of memory, and the management objects previously in RAMIN can now be placed anywhere in VRAM. Address reversal is gone too.

NV50 introduces virtual memory: the addresses that the GPU uses are virtual, and the memory management unit maps them to actual physical addresses in VRAM by page-based address translation. This remapping applies both to GPU accesses from the graphics engine (with per-channel page tables) and to PCI BARs 1 and 2 (with another set of page tables). The only difference between RAMIN objects and normal objects is that RAMIN objects are specified by their physical addresses directly.

However, the PRAMIN mapping in BAR0 can now be used to access any part of video memory by its physical address -- it's used as a 1MB window to physical memory, and its base can be set to any 64kB-aligned address using PMC register 0x1700 (see above).
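
As a sketch of the arithmetic involved, the function below computes what to write to PMC register 0x1700 and which BAR 0 offset to access in order to reach a given physical VRAM address. It assumes the window base is simply the 64kB-aligned address at or below the target, which is one possible policy; the names are made up for this example and nothing here touches hardware.

```python
PRAMIN_BAR0_BASE = 0x700000   # start of the PRAMIN window in BAR 0
WINDOW_SIZE = 1 << 20         # 1MB window
WINDOW_ALIGN = 1 << 16        # base must be 64kB-aligned


def pramin_window(phys_addr):
    """Return (value for PMC 0x1700, BAR 0 offset) to reach phys_addr."""
    window_base = phys_addr & ~(WINDOW_ALIGN - 1)
    reg_value = window_base >> 16                # register holds base >> 16
    offset_in_window = phys_addr - window_base   # always below 64kB here
    assert offset_in_window < WINDOW_SIZE
    return reg_value, PRAMIN_BAR0_BASE + offset_in_window


reg_value, bar0_offset = pramin_window(0x12345678)
print(hex(reg_value), hex(bar0_offset))  # 0x1234 0x705678
```

A caller accessing a larger object would typically keep the base fixed and move through the rest of the 1MB window instead of reprogramming the register for every access.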


2013-03-24 13:16