00:56ReinUsesLisp: imirkin: SHR supports one extra xmode called XHI
00:56ReinUsesLisp: the encoding is {"INVALIDSHRXMODE1", "X", "XHI", NULL}
00:56ReinUsesLisp: at offset 43
00:57ReinUsesLisp: er... {"", "INVALIDSHRXMODE1", "X", "XHI", NULL}
01:16imirkin: yes
01:16imirkin: extra-high :)
01:17imirkin: basically that means that the high bits get filled in from the "X" aka carry bit
01:17imirkin: at least that's my recollection
01:17imirkin: ReinUsesLisp: --^
01:17imirkin: we use this for 64-bit shifts on pre-SM35
01:17imirkin: i think.
01:19imirkin: hm, no.
01:19imirkin: we use SHR.HI for SM35+ though
01:19imirkin: but i think that the "X" comes in useful for 128-bit shifts
01:19imirkin: which we don't support in nouveau
01:22imirkin: hm. so the "X" is bit 44
01:23imirkin: and we always use SHF for 64-bit things
01:23imirkin: but i guess SHR has a high mode as well
01:25ReinUsesLisp: it changes the written flags, right?
01:25ReinUsesLisp: I haven't investigated it yet
01:25imirkin: .X consumes
01:25imirkin: .CC sets the carry flag
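[Editor's note: the xmode field described above (a 2-bit field at offset 43, with the table {"", "INVALIDSHRXMODE1", "X", "XHI"}) can be decoded as in the sketch below. The function and table names are hypothetical; only the offset and the four entries come from the log. With this layout, value 2 ("X") sets bit 44 alone, which agrees with the "X is bit 44" remark.]

```python
# Table order and bit offset are taken from the log; names are illustrative.
SHR_XMODES = ["", "INVALIDSHRXMODE1", "X", "XHI"]
SHR_XMODE_OFFSET = 43  # lowest bit of the 2-bit field


def decode_shr_xmode(insn: int) -> str:
    """Return the xmode suffix encoded in a 64-bit SHR instruction word."""
    return SHR_XMODES[(insn >> SHR_XMODE_OFFSET) & 0x3]
```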
01:26ReinUsesLisp: I didn't know there were 128 bit operations in GL through extensions :P
01:26imirkin: there aren't
01:26imirkin: but there are in cuda
01:26ReinUsesLisp: I'm interested in seeing what it emits for u128 / u128
01:26ReinUsesLisp: u64 / u64 is already horrible
01:27imirkin: i think it uses SHF though
01:27imirkin: there's a SHF.L and SHF.R which is really good for 64-bit
01:27imirkin: and iirc can be extended to 128-bit, although potentially with the help of some of those SHR.X/HI things
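[Editor's note: a rough model of why a funnel shift suits 64-bit shifts on 32-bit registers. This is a sketch only: it assumes a clamped funnel shift that returns the low 32 bits of the concatenation hi:lo shifted right, and it does not claim to match the exact SHF.R semantics or what nouveau emits. All names are hypothetical.]

```python
MASK32 = 0xFFFFFFFF


def funnel_shift_right(lo: int, hi: int, n: int) -> int:
    """Model of a clamped funnel shift: low 32 bits of (hi:lo) >> min(n, 32)."""
    n = min(n, 32)
    return (((hi << 32) | lo) >> n) & MASK32


def shr64(lo: int, hi: int, n: int) -> tuple:
    """Logical 64-bit right shift on two 32-bit halves, for 0 <= n < 64."""
    if n < 32:
        # Low half pulls bits in from the high half; high half shifts alone.
        return funnel_shift_right(lo, hi, n), hi >> n
    # Shifting by 32 or more: the low half comes entirely from the high half.
    return hi >> (n - 32), 0
```

The same pattern chains across more words, which is presumably how it would extend to 128-bit values.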
01:27ReinUsesLisp: we've seen games using it
01:27ReinUsesLisp: we haven't implemented them yet
01:27imirkin: probably as bindless handles
01:28imirkin: although why they'd be doing shifts on those...
01:28ReinUsesLisp: likely, it uses bindless
01:28ReinUsesLisp: maybe it's using HLSL samplers
01:29ReinUsesLisp: I don't know if you recall my question about viewports being flipped; gdkchan (from Ryujinx) found what it was
01:29ReinUsesLisp: it's GL_NV_viewport_swizzle
01:30imirkin: that's delightful
01:30imirkin: what is it exactly?
01:30imirkin: and what are the methods for it?
01:30imirkin: would be nice to document/implement
01:30ReinUsesLisp: it's pretty simple
01:31imirkin: no, i think i get it
01:31imirkin: lets you mess with x/y/z/w coords
01:31ReinUsesLisp: it's a bitfield encoded here https://github.com/envytools/envytools/blob/305c1dddf579da68a58b211ac80687b4fd67b476/rnndb/graph/gf100_3d.xml#L302-L303
01:31ReinUsesLisp: (between those two)
01:31ReinUsesLisp: one per viewport
01:31imirkin: i.e. 0x0a18?
01:31ReinUsesLisp: yes
01:32imirkin: and 2 bits per viewport
01:32imirkin: 0 = x, 3 = w?
01:32ReinUsesLisp: iirc yes, I haven't implemented it in yuzu yet
01:32imirkin: er, that's not enough
01:32ReinUsesLisp: hmm
01:32imirkin: 16 viewports, 4 components per viewport, 2 bits per component
01:32imirkin: that means 4x32bit
01:32ReinUsesLisp: it's different
01:32ReinUsesLisp: check the stride
01:33imirkin: oh
01:33ReinUsesLisp: it has 32 bits per viewport, it can do whatever it wants
01:33imirkin: it's already inside a viewport thing
01:33imirkin: right. i see.
01:33imirkin: everything's already a viewport array
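[Editor's note: since the swizzle word lives inside the per-viewport array, its address for a given viewport is just base plus index times stride. The sketch below assumes the base 0x0a18 mentioned above and a stride of 32 bytes with 16 entries, as in the neighbouring SUBPIXEL_PRECISION definition quoted later in this log; the swizzle register's own stride and length are not stated explicitly, so treat them as assumptions. Names are hypothetical.]

```python
VIEWPORT_SWIZZLE_BASE = 0x0A18  # from the log
VIEWPORT_STRIDE = 32            # assumed: same stride as the neighbouring register
VIEWPORT_COUNT = 16             # assumed: same length as the neighbouring register


def viewport_swizzle_offset(index: int) -> int:
    """Method offset of the swizzle word for viewport `index`."""
    if not 0 <= index < VIEWPORT_COUNT:
        raise ValueError("viewport index out of range")
    return VIEWPORT_SWIZZLE_BASE + index * VIEWPORT_STRIDE
```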
01:34imirkin: so it's 8 bits per viewport, but how are they laid out?
01:34imirkin: oh, 12 bits. coz of the negation.
01:34imirkin: one per nibble, one might hope? or tightly packed?
01:34ReinUsesLisp: https://github.com/Ryujinx/Ryujinx/blob/angel/Ryujinx.Graphics.Gpu/State/ViewportTransform.cs
01:35ReinUsesLisp: those methods unpack it
01:35imirkin: cool
01:35imirkin: so one per nibble
01:35imirkin: and which bit is the sign bit?
01:36ReinUsesLisp: the enum is somewhere else, github is trash searching branches
01:36ReinUsesLisp: it matches the GL enum order iirc
01:36imirkin: s/searching branches//
01:36imirkin: ah ok
01:36imirkin: so low bit is sign bit, essentially
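[Editor's note: a minimal sketch of unpacking the layout worked out above: one 3-bit selector per nibble, with the low bit acting as the sign and the values following the GL_NV_viewport_swizzle enum order (POSITIVE_X, NEGATIVE_X, POSITIVE_Y, ...). That X sits in the lowest nibble, followed by Y, Z and W, is an assumption here; the log only establishes "one per nibble". Names are hypothetical.]

```python
# Selector values in GL enum order; bit 0 of each selector is the sign.
SWIZZLE_NAMES = [
    "POSITIVE_X", "NEGATIVE_X",
    "POSITIVE_Y", "NEGATIVE_Y",
    "POSITIVE_Z", "NEGATIVE_Z",
    "POSITIVE_W", "NEGATIVE_W",
]


def unpack_viewport_swizzle(reg: int) -> tuple:
    """Split a 32-bit swizzle word into (x, y, z, w) selector names."""
    # Assumed order: X in bits 0-2, Y in bits 4-6, Z in bits 8-10, W in bits 12-14.
    return tuple(SWIZZLE_NAMES[(reg >> (4 * i)) & 0x7] for i in range(4))
```

Under these assumptions the identity swizzle is 0x6420, and a Y flip is 0x6430.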
01:36ReinUsesLisp: btw someone wrote a wrapper to mod commercial games using the shipped libraries (this includes Nvidia's driver); we can write arbitrary OpenGL code on it
01:37ReinUsesLisp: if you want to know what registers some weird extension something uses, let me know
01:37ReinUsesLisp: -something*
01:37imirkin: well, normally we test that out directly too
01:37imirkin: but i've gotten immensely lazy
01:37imirkin: and not highly-caring
01:37imirkin: but you should submit a patch to envytools/rnndb for that register
01:37ReinUsesLisp: well, I can dump them from an emulator :P
01:38ReinUsesLisp: brb 30 mins
01:38ReinUsesLisp: ok, I'll try adding them
02:27ReinUsesLisp: imirkin: would it be something like this https://pastebin.com/raw/EzZ5JfWR ?
02:27ReinUsesLisp: how should "gfNN_viewport_swizzle" be called and where should it be defined?
02:30imirkin: come up with a reasonable name based on the other names
02:30imirkin: how about VIEWPORT_SWIZZLE
02:30imirkin: and put it sequentially in the right place
02:30imirkin: oh, the type itself - i see
02:30imirkin: um
02:30imirkin: (gimme a sec)
02:31ReinUsesLisp: oh, I uploaded the old pastebin :P
02:32ReinUsesLisp: https://pastebin.com/raw/FAFJg10Y
02:32imirkin: how about gm200_viewport_swizzle
02:33imirkin: and add a GM204- mask on it
02:33imirkin: iirc GM204 < GM200
02:33imirkin: <reg32 offset="0x0a1c" name="SUBPIXEL_PRECISION" length="16" stride="32" variants="GM204_3D-">
02:33imirkin: so like that
02:33imirkin: (i checked, the ext is available on GTX 960 and up)
02:33imirkin: (based on gpuinfo.org)
02:34ReinUsesLisp: ok, where should the enum be defined?
02:38imirkin: up top
22:27TimurTabi: I have a question about nouveau_dmem.c. What is meant by "DMEM" in this file? I ask because the local memory of the various microcodes on a GPU is called "dmem", but when I look at the driver code, it doesn't look like it's allocating that kind of memory.