00:51 imirkin: ReinUsesLisp: it's used in geometry/tess shaders, to help figure out where to fetch a particular vertex's attribute data
00:52 imirkin: looks like we use the bytes at position 0 and 2 of the word ... (call them b0 and b2), and then we multiply them together, and add the vertex number we're trying to grab
00:52 imirkin: i'm guessing one is a base offset (perhaps per lane), and the other is a stride
01:03 ReinUsesLisp: One more question. Looking at the generated shader, writable abuf memory seems to be a set of vertices, unlike GLSL. OUT.EMIT will emit a vertex with its abuf vertex index and an offset:
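The lookup imirkin describes at 00:52 can be sketched as below. This is only an illustration under stated assumptions: "position 0 and 2" is read as bits 0-7 and bits 16-23 of a 32-bit word, the function name `attribute_slot` is hypothetical, and which byte is the base and which is the stride is still a guess, as in the log.

```python
def attribute_slot(word: int, vertex: int) -> int:
    """Toy version of the lookup: b0 * b2 + vertex.

    Assumption: byte 0 is bits 0-7 and byte 2 is bits 16-23 of the word.
    The log does not say which of the two is the base and which the stride.
    """
    b0 = word & 0xFF          # byte at position 0
    b2 = (word >> 16) & 0xFF  # byte at position 2
    return b0 * b2 + vertex   # multiply them, then add the vertex number


# Example with made-up values: b0 = 4, b2 = 3, fetching vertex 2.
slot = attribute_slot(0x00030004, 2)
```

The result is only a slot number in this toy model; how it maps to an actual address is not covered in the log.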
01:03 ReinUsesLisp: OUT.EMIT R2, R2, RZ ;
01:03 ReinUsesLisp: which one is the vertex number here?
01:04 ReinUsesLisp: (one of them being an offset is just a guess)
01:05 imirkin: none of them...
01:05 imirkin: RZ is the vertex stream
01:05 imirkin: i.e. zero
01:05 imirkin: but it can also be an immediate (or even a register? dunno, never tried)
01:05 imirkin: the first argument is basically an offset
01:06 imirkin: it starts out as zero
01:06 imirkin: and then the result of the OUT.EMIT gets fed into the next one
01:09 ReinUsesLisp_: ok, with that I should get something working, thanks!
01:10 imirkin: basically it's an offset of where to store it in memory... it stores a bunch of stuff, returns another offset, etc.
01:10 imirkin: it's one and the same offset, the streams aren't separate
01:10 imirkin: i think the values just get tagged or something
01:10 imirkin: now if you're emulating this with GLSL, you have no real choice but to ignore it entirely
01:11 imirkin: and hope that the authors of the shader aren't doing something sneaky
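The OUT.EMIT chaining described between 01:05 and 01:10 can be modelled roughly as follows. This is a toy sketch, not how the hardware or nouveau implements it: the class `EmitModel`, the `vertex_size` step, and the way entries are tagged with a stream are all assumptions made for illustration. The only parts taken from the log are that the offset starts at zero, each emit returns the offset fed into the next one, and all streams share the same offset.

```python
class EmitModel:
    """Toy model of chained OUT.EMIT calls (hypothetical, for illustration)."""

    def __init__(self, vertex_size: int = 1):
        self.buffer = []                # shared storage; streams are not separate
        self.vertex_size = vertex_size  # assumed step; the real increment is unknown

    def out_emit(self, offset: int, stream: int = 0) -> int:
        # Store "a bunch of stuff" at the given offset, tagged with its stream.
        self.buffer.append((offset, stream))
        # Return the offset that the next OUT.EMIT should receive.
        return offset + self.vertex_size


model = EmitModel()
offset = 0  # the first argument starts out as zero
for _ in range(3):
    # Mirrors `OUT.EMIT R2, R2, RZ`: the result feeds the next emit, stream is zero.
    offset = model.out_emit(offset, stream=0)
```

In this model the same register holds both the input and the output offset, which matches R2 appearing twice in the instruction quoted at 01:03.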
01:11 ReinUsesLisp_: I'm using nouveau for now
01:11 ReinUsesLisp_: otherwise I'd have to implement VMAD and friends
01:12 imirkin: not sure what you're doing then ... i thought switch emu?
01:12 ReinUsesLisp_: yes
01:12 imirkin: but you're just taking the shaders and running them as-is through nouveau with some ext?
01:13 ReinUsesLisp_: I'm writing a standard OpenGL program and building it native + for Switch, I check that everything works fine on host and real hardware and then start writing an implementation for the emulator
01:14 imirkin: ah
01:15 imirkin: well i think newer blob versions manage some of the memory differently than nouveau does
01:15 imirkin: esp in tess shaders
01:15 imirkin: there's some sexy new way of doing it on maxwell+ that we never cared to figure out
01:16 ReinUsesLisp_: it's just geometry shaders for now, I just found one game using tess and none using the compute engine (sadly)
01:16 ReinUsesLisp_: Maxwell+ includes maxwell? :P
01:17 ReinUsesLisp_: does*
01:23 imirkin: yes
02:17 robert_: does nouveau support GeForce Go 7600?
02:53 imirkin: define 'support'
02:54 imirkin: (but yes - nouveau supports all nvidia gpu's, TNT2 and up)
02:54 imirkin: er, TNT and up, of course
06:39 HdkR: imirkin: Are you around? I've got an interesting problem/question
06:40 HdkR: Karol may also be able to help here but they aren't here?
06:41 HdkR: (middle of flying so I may be in and out)
06:44 HdkR: imirkin: In Nouveau's quest to get scoreboarding optimal, you must have documented the latencies of various instructions across families somewhere. Is there a location where this is documented?
06:45 imirkin: emit_gm107.cpp
06:46 imirkin: and emit_nvc0.cpp
06:46 imirkin: it's not extremely accurate
06:46 imirkin: there was a paper that came out with precise latencies for maxwell and pascal iirc
06:48 HdkR: imirkin: Can you get me a link to that paper? I didn't realize that existed
06:48 HdkR: something in Maxwell/Pascal timeframe would be great
06:49 HdkR: Since obviously Volta+ hasn't been done yet
06:50 HdkR: I can't look at those files directly right now, but is it something like instruction tables that store information like latency?
06:51 imirkin: hrmph, i don't have a link here
06:51 HdkR: Something that makes the information very easy to get will make my arguments more compelling
06:52 imirkin: https://arxiv.org/pdf/1804.06826.pdf
06:52 imirkin: check chapter 4
06:52 imirkin: was actually pascal and volta
06:52 HdkR: Alright, I'll take a look when I have a break and internet
06:52 imirkin: you must have at least a little internet now...
06:53 HdkR: cell phone though
07:07 HdkR: oh
07:07 HdkR: that is volta and pascal
07:07 HdkR: even better
07:15 HdkR: imirkin: Do you have a direct link for the cuda throughput documentation on hand?
07:17 imirkin: not sure what you're talking about
07:17 imirkin: so i'm going to go with "no"
07:18 HdkR: https://docs.nvidia.com/cuda/cuda-c-programming-guide/#arithmetic-instructions
07:18 HdkR: "maximum instruction throughput" on the left side
07:19 HdkR: because direct links are impossible on this dumb site
07:29 imirkin: ah
18:44 rhyskidd: looks like there's a new falcon on Turing, referenced as 'GSP'
18:44 rhyskidd: also a couple of new DevInit script opcodes
18:45 rhyskidd: ^ just some random notes for those also looking