04:18anholt: which NVISA_*_CHIPSET is nvc0?
04:18imirkin: or whichever one is #define'd to 0xc0 :)
04:19anholt: ha, obvious now
04:23imirkin: anholt: https://nouveau.freedesktop.org/CodeNames.html#NVC0
04:24imirkin: the chip has a register at address 0 which has the chip id in it, that's the nvxx thing (in hex)
04:24imirkin: but then there is also G(family)(iteration) naming too
04:24imirkin: e.g. GF100
04:24imirkin: usually xx7/8 are the smaller chips
04:24imirkin: and xx0/4 are the bigger ones
04:25imirkin: but within a family, we never differentiate anything in nouveau
04:25imirkin: (except tesla, which has DX10.0 and DX10.1 chips - the latter gained some functionality)
04:27imirkin: (specifically texgather, cube arrays, texquerylod, and some non-shader functionality around xfb, seamless cubemaps)
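
To make the naming scheme above concrete, here is a minimal sketch of turning the value of the register at address 0 into an "nvxx" name. The function names are hypothetical. The mask and shift are an assumption (the field layout commonly used for NV10 and later chips), and the sample value in the usage note is illustrative, not read from real hardware.

```python
# Assumption: the chipset id occupies bits 20..28 of the register at
# address 0 on NV10-and-later chips. Treat this layout as illustrative.
CHIPSET_MASK = 0x1FF00000
CHIPSET_SHIFT = 20


def chipset_from_boot0(boot0: int) -> int:
    """Extract the chipset id field from the raw register value."""
    return (boot0 & CHIPSET_MASK) >> CHIPSET_SHIFT


def chip_name(chipset: int) -> str:
    """Format a chipset id the way the log does: 'nv' plus the id in hex."""
    return f"nv{chipset:02x}"
```

For example, `chip_name(chipset_from_boot0(0x0C0000A1))` gives `"nvc0"`, which is the id that the matching NVISA_*_CHIPSET define would be set to (0xc0).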
04:28anholt: ooh, interesting. nir fuse-ffma ends up hurting overall.
04:28imirkin: yeah, definitely
04:28imirkin: we try to be somewhat careful about fusing, i think?
04:28anholt: are there some restrictions on modifiers or something?
04:29anholt: ok, makes sense then.
04:29imirkin: depending on the ISA you're targeting
04:29imirkin: like the nv50 isa, src3 == dst
04:29anholt: so extra movs if you need src3 again
04:30imirkin: yeah, or if there's any funny business
04:30imirkin: also limited modifiers (like neg, abs, etc)
04:31imirkin: and also no constbufs i think?
04:31imirkin: or less?
04:31imirkin: (not to mention immediates)
04:31imirkin: in general const/imm can only go into the second source
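
The "src3 == dst" constraint and the extra movs it causes can be sketched with a toy lowering function. This is only an illustration over made-up virtual register names, not nouveau's actual codegen: the tuple-based instruction format and `lower_mad` are hypothetical, and `dst` is assumed to be a fresh register that does not alias `a` or `b`.

```python
def lower_mad(dst, a, b, c, c_live_after):
    """Lower dst = a * b + c for a target where the addend must share
    a register with the destination (the nv50-style tied operand).

    Returns (instructions, register_holding_the_result).
    """
    if c_live_after:
        # c is still needed later, so it cannot be overwritten:
        # copy it into dst first, then accumulate into dst.
        return [("mov", dst, c), ("mad", dst, a, b, dst)], dst
    # c dies here, so its register can double as the destination
    # and no copy is needed.
    return [("mad", c, a, b, c)], c
```

With `c_live_after=True` the fused form costs a mov plus a mad, which is no shorter than a separate mul and add. That is one way fusing can fail to pay off on such a target.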
04:31imirkin: nv50 has 4- and 8-byte instruction encodings
04:32imirkin: the 4-byte ops are paired, pretty sure there's dual-dispatch
04:32imirkin: but you can't encode as much stuff into 4 bytes, so more restrictions
04:32imirkin: also if you go over 64 regs, you have to use the long encoding
04:32imirkin: (out of 128 on nv50)
04:32imirkin: (it's 256 as far as RA is concerned - everything is allocated as half-regs, multiplies want half-regs for the 16-bit variants)
04:38mhenning: is the half-reg thing only for nv50? I thought we were doing RA in terms of 32-bit regs on nvc0 but I might be confused
04:39imirkin: yes, on nvc0, RA is in 32-bit units
04:40imirkin: on nv50, RA is in 16-bit units
04:40imirkin: the target has a "getFileUnit()" which indicates the unit of RA
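
The relationship between the register counts and the RA unit can be written out as a small sketch. The helper names and the "l"/"h" labels for the two halves are hypothetical; only the 128 and 256 figures and the 16-bit vs 32-bit units come from the discussion above.

```python
def ra_units(num_regs32: int, unit_bytes: int) -> int:
    """Number of allocatable units for a file of 32-bit registers,
    given the allocation unit size in bytes."""
    return num_regs32 * 4 // unit_bytes


def half_reg(unit: int):
    """Map a 16-bit allocation unit to (32-bit register index, half).
    The 'l'/'h' labels are illustrative, not a real assembler syntax."""
    return unit // 2, "l" if unit % 2 == 0 else "h"
```

With a 2-byte unit, 128 full registers become `ra_units(128, 2) == 256` allocatable units, and unit 5 is the high half of register 2. With a 4-byte unit, each register is a single unit.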
04:40anholt: was there 16-bit support that was dropped?
04:40imirkin: anholt: no. integer mul takes half-regs on nv50
04:40imirkin: there are 24- and 16-bit integer mul modes
04:40imirkin: the 16-bit ones take half-regs
04:41imirkin: (and are combined to build up a 32-bit mul)
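
The arithmetic behind building a 32-bit multiply out of 16-bit ones can be sketched as follows. This shows only the math, under the assumption of an unsigned 16x16 -> 32 multiply primitive. It is not the instruction sequence nouveau emits, and the function names are hypothetical.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mul16(a: int, b: int) -> int:
    """Stand-in for a 16-bit multiply on half-registers: 16x16 -> 32 bits."""
    assert 0 <= a <= MASK16 and 0 <= b <= MASK16
    return a * b


def mul32_from_halves(a: int, b: int) -> int:
    """Low 32 bits of a * b, using only 16-bit multiplies on the halves."""
    a_lo, a_hi = a & MASK16, (a >> 16) & MASK16
    b_lo, b_hi = b & MASK16, (b >> 16) & MASK16
    low = mul16(a_lo, b_lo)
    # Only the low 16 bits of the cross terms survive the shift by 16.
    cross = (mul16(a_hi, b_lo) + mul16(a_lo, b_hi)) & MASK16
    # a_hi * b_hi would be shifted by 32 bits, so it never affects the result.
    return (low + (cross << 16)) & MASK32
```

The result matches `(a * b) & 0xFFFFFFFF` for any 32-bit inputs, using three 16-bit multiplies.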
04:41anholt: yeah, ok
04:42imirkin: there might be other situations where those are used, but that's the only one i can think of
04:42imirkin: oh, also i made extensive use of 16-bit regs for image stuff
04:42imirkin: for the address calculations and format conversions