00:51imirkin: ReinUsesLisp: it's used in geometry/tess shaders, to help figure out where to fetch a particular vertex's attribute data
00:52imirkin: looks like we use the bytes at position 0 and 2 of the word ... (call them b0 and b2), and then we multiply them together, and add the vertex number we're trying to grab
00:52imirkin: i'm guessing one is a base offset (perhaps per lane), and the other is a stride
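[editor's note] The computation imirkin describes can be written out as a minimal sketch. The function name `vertex_slot` is hypothetical, and treating "position 0" as the least significant byte is an assumption; the log does not say which byte order is meant, nor which of the two bytes is the base and which is the stride.

```python
def vertex_slot(word: int, vertex: int) -> int:
    """Combine bytes 0 and 2 of a 32-bit word with a vertex number.

    Assumption: byte 0 is the least significant byte of the word.
    """
    b0 = word & 0xFF          # byte at position 0
    b2 = (word >> 16) & 0xFF  # byte at position 2
    # As described in the chat: multiply the two bytes, then add the vertex number.
    return b0 * b2 + vertex


if __name__ == "__main__":
    # Made-up word with b0 = 4 and b2 = 3, fetching vertex 2.
    print(vertex_slot(0x00030004, 2))
```

The input word here is made up for illustration; it is not taken from real hardware state.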
01:03ReinUsesLisp: One more question: looking at the generated shader, writable abuf memory seems to be a set of vertices, unlike GLSL. OUT.EMIT will emit a vertex with its abuf vertex index and an offset:
01:03ReinUsesLisp: OUT.EMIT R2, R2, RZ ;
01:03ReinUsesLisp: which one is the vertex number here?
01:04ReinUsesLisp: (one of them being an offset is just a guess)
01:05imirkin: none of them...
01:05imirkin: RZ is the vertex stream
01:05imirkin: i.e. zero
01:05imirkin: but it can also be an immediate (or even a register? dunno, never tried)
01:05imirkin: the first argument is basically an offset
01:06imirkin: it starts out as zero
01:06imirkin: and then the result of the OUT.EMIT gets fed into the next one
01:09ReinUsesLisp_: ok, with that I should get something working, thanks!
01:10imirkin: basically it's an offset of where to store it in memory... it stores a bunch of stuff, returns another offset, etc.
01:10imirkin: it's one and the same offset, the streams aren't separate
01:10imirkin: i think the values just get tagged or something
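[editor's note] The chaining described above (offset starts at zero, each OUT.EMIT returns a new offset that feeds the next one, and all streams share that one offset) can be modelled roughly as follows. This is a toy model only: `out_emit` and `emit_all` are hypothetical names, and advancing the offset by 1 per emit is a placeholder assumption, since the log does not say how the hardware computes the returned value. In the listing quoted earlier, R2 is both destination and first source, which fits this feed-forward pattern.

```python
def out_emit(memory, offset, vertex, stream=0):
    """Toy model of one OUT.EMIT: store a vertex, return the next offset.

    Assumption: the returned offset is simply offset + 1. The real
    encoding of this value is not described in the chat.
    """
    # The stream is recorded as a tag; there is no per-stream offset.
    memory.append((offset, stream, vertex))
    return offset + 1


def emit_all(vertices, stream=0):
    """Emit vertices in order, feeding each result into the next emit."""
    memory = []
    offset = 0  # the offset starts out as zero
    for vertex in vertices:
        offset = out_emit(memory, offset, vertex, stream)
    return memory, offset


if __name__ == "__main__":
    memory, final_offset = emit_all(["v0", "v1", "v2"])
    print(memory, final_offset)
```

Passing a different `stream` value changes only the tag stored with each entry, mirroring the remark that the streams are not separate.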
01:10imirkin: now if you're emulating this with GLSL, you have no real choice but to ignore it entirely
01:11imirkin: and hope that the authors of the shader aren't doing something sneaky
01:11ReinUsesLisp_: I'm using nouveau for now
01:11ReinUsesLisp_: otherwise I'd have to implement VMAD and friends
01:12imirkin: not sure what you're doing then ... i thought switch emu?
01:12imirkin: but you're just taking the shaders and running them as-is through nouveau with some ext?
01:13ReinUsesLisp_: I'm writing a standard OpenGL program and building it native + for Switch, I check that everything works fine on host and real hardware and then start writing an implementation for the emulator
01:15imirkin: well i think newer blob versions manage some of the memory differently than nouveau does
01:15imirkin: esp in tess shaders
01:15imirkin: there's some sexy new way of doing it on maxwell+ that we never cared to figure out
01:16ReinUsesLisp_: it's just geometry shaders for now, I just found one game using tess and none using the compute engine (sadly)
01:16ReinUsesLisp_: Maxwell+ includes maxwell? :P
02:17robert_: does nouveau support GeForce Go 7600?
02:53imirkin: define 'support'
02:54imirkin: (but yes - nouveau supports all nvidia GPUs, TNT2 and up)
02:54imirkin: er, TNT and up, of course
06:39HdkR: imirkin: Are you around? I've got an interesting problem/question
06:40HdkR: Karol may also be able to help here but they aren't here?
06:41HdkR: (middle of flying so I may be in and out)
06:44HdkR: imirkin: In Nouveau's quest of getting scoreboarding optimal, you must have documented instruction latencies of various instructions in families somewhere. Is there a location where this is documented?
06:46imirkin: and emit_nvc0.cpp
06:46imirkin: it's not extremely accurate
06:46imirkin: there was a paper that came out with precise latencies for maxwell and pascal iirc
06:48HdkR: imirkin: Can you get me a link to that paper? I didn't realize that existed
06:48HdkR: something in Maxwell/Pascal timeframe would be great
06:49HdkR: Since obviously Volta+ hasn't been done yet
06:50HdkR: I can't look at those files directly right now, but is it something like instruction tables that store information like latency?
06:51imirkin: hrmph, i don't have a link here
06:51HdkR: Something that makes the information very easy to get will make my arguments more compelling
06:52imirkin: check chapter 4
06:52imirkin: was actually pascal and volta
06:52HdkR: Alright, I'll take a look when i have a break and internet
06:52imirkin: you must have at least a little internet now...
06:53HdkR: cell phone though
07:07HdkR: that is volta and pascal
07:07HdkR: even better
07:15HdkR: imirkin: Do you have a direct link for the cuda throughput documentation on hand?
07:17imirkin: not sure what you're talking about
07:17imirkin: so i'm going to go with "no"
07:18HdkR: "maximum instruction throughput" on the left side
07:19HdkR: because direct links are impossible on this dumb site
18:44rhyskidd: looks like there's a new falcon on Turing, referenced as 'GSP'
18:44rhyskidd: also a couple of new DevInit script opcodes
18:45rhyskidd: ^ just some random notes for those also looking