00:28 pabs: IIRC power is dual-endian, you can choose your endianness on a per-process basis even :)
00:28 pabs: IIRC the RaptorCS Talos workstations can run big-endian Debian ppc64 for eg
02:02 HdkR: ARMv8 in the spec lets you do the same thing. You can even have your exception handlers switch endian on the fly
02:05 HdkR: Of course no one uses it
03:01 airlied: TimurTabi: https://people.freedesktop.org/~airlied/scratch/535.104.05.logs.tgz but I don't think it helps
03:02 airlied: I've found the base gsp msg struct was changed and checksum, so the gsp might never get a proper msg
04:06 airlied: TimurTabi: https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/nouveau-gsp-535.104.05-wip?ref_type=heads has my wip patch for 535.104.05, still not going though
15:57 fdobridge: <b​enjaminl> what's the newest version of the proprietary drivers that people have successfully used with `demmt`?
15:58 fdobridge: <k​arolherbst🐧🦀> uhhh
15:58 fdobridge: <b​enjaminl> I'm trying with `470.199.02`, but am seeing a lot of unrecognized ioctls, including a `size=40` variant of `NVRM_IOCTL_CREATE`...
15:58 fdobridge: <k​arolherbst🐧🦀> very old
15:58 fdobridge: <k​arolherbst🐧🦀> yeah...
15:58 fdobridge: <b​enjaminl> oof
15:58 fdobridge: <k​arolherbst🐧🦀> so nobody has been keeping it updated
15:58 fdobridge: <k​arolherbst🐧🦀> and nvidia is now also doing userspace command submission
15:58 fdobridge: <k​arolherbst🐧🦀> so... we kinda need an entirely new approach
15:58 fdobridge: <k​arolherbst🐧🦀> the entire compute side of things has also been different for a while
15:59 fdobridge: <k​arolherbst🐧🦀> also
15:59 fdobridge: <k​arolherbst🐧🦀> valgrind: https://github.com/karolherbst/valgrind/commit/e29d6ef2b3de297f20c7f52756f3de50ad9461ba
15:59 fdobridge: <k​arolherbst🐧🦀> envytools: https://github.com/karolherbst/envytools/commit/24467a5a8948797c911fdc45f22d7faed1fb8b8c
16:00 fdobridge: <k​arolherbst🐧🦀> looks like I used `390.77` for it
16:02 fdobridge: <b​enjaminl> thanks!
16:02 fdobridge: <b​enjaminl> if it's not too much work to explain, what's uvm?
16:02 fdobridge: <b​enjaminl> I haven't seen that one before
16:02 fdobridge: <k​arolherbst🐧🦀> unified virtual memory
16:02 fdobridge: <k​arolherbst🐧🦀> it's a new set of UAPI nvidia uses in their driver for compute
16:03 fdobridge: <k​arolherbst🐧🦀> and they started to use it for compute shaders as well in OpenGL
16:03 fdobridge: <k​arolherbst🐧🦀> it was mostly a CUDA thing before
16:03 fdobridge: <k​arolherbst🐧🦀> anyway.. adding support for all of this is just major pain
16:04 fdobridge: <k​arolherbst🐧🦀> and I think we kinda need a better approach here
16:04 fdobridge: <k​arolherbst🐧🦀> and hey.. nvidia's driver is now open source anyway, so we might as well just hack it up there and find a solution for userspace command submission
16:05 fdobridge: <b​enjaminl> haha yeah, unfortunately my card is Maxwell, so not supported by nvidia-open
16:05 fdobridge: <k​arolherbst🐧🦀> pain
16:07 fdobridge: <b​enjaminl> I got into this in the first place because I was looking at the issue for NVK conservative rasterization in mesa, and was like "hmm, maybe I can try this and learn how RE for the proprietary driver works"
16:08 fdobridge: <b​enjaminl> it's turning out this one is probably beyond my capabilities with my current knowledge 🙂
20:28 fdobridge: <g​fxstrand> What do you mean by "pixels on X" and "pixels on Y"?
20:29 fdobridge: <k​arolherbst🐧🦀> raster pixel per thread
20:30 fdobridge: <g​fxstrand> What do you mean by "per thread"?
20:31 fdobridge: <k​arolherbst🐧🦀> I have no idea 😄
20:31 fdobridge: <g​fxstrand> 😂
20:31 fdobridge: <g​fxstrand> Awesome!
20:32 fdobridge: <k​arolherbst🐧🦀> though I think in nvidia speak a thread is a single execution unit
20:32 fdobridge: <k​arolherbst🐧🦀> but they also use the term "thread lanes" sometimes
20:32 fdobridge: <k​arolherbst🐧🦀> but a subgroup is called "warp"
20:32 fdobridge: <k​arolherbst🐧🦀> so I guess a thread is a single thread
20:33 fdobridge: <k​arolherbst🐧🦀> maybe just check what the value is and then we'll see what they mean here
21:05 fdobridge: <g​fxstrand> So it turns out NVIDIA doesn't have any special tricks for `gl_SampleMask` with per-sample shading. 🙄
21:05 fdobridge: <k​arolherbst🐧🦀> sad
21:06 fdobridge: <k​arolherbst🐧🦀> so what are they doing?
21:06 fdobridge: <g​fxstrand> `COVMASK & (1 << gl_SampleId)` just like codegen (edited)
21:06 fdobridge: <k​arolherbst🐧🦀> ahh yeah
21:07 fdobridge: <g​fxstrand> I mean, I could make it conditional on `passcount > 1` or something maybe (edited)
21:07 fdobridge: <k​arolherbst🐧🦀> mhhh...
21:07 fdobridge: <k​arolherbst🐧🦀> why tho
21:07 fdobridge: <g​fxstrand> But the driver still has to know to shove passcount to max in that case
21:07 fdobridge: <g​fxstrand> So there's still API going on
21:07 fdobridge: <g​fxstrand> Shader key time!
21:07 fdobridge: <k​arolherbst🐧🦀> 😄
21:09 fdobridge: <g​fxstrand> The million dollar question, though, is what 2 bits do I want in that key?
21:09 fdobridge: <g​fxstrand> *sigh*
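For reference, the `COVMASK & (1 << gl_SampleId)` expression quoted at 21:06 can be sketched as below. This is only an illustration of the bit arithmetic, not the blob's or codegen's actual code. The function names are hypothetical, and the `passcount > 1` variant assumes that `passcount` is the number of shading passes per pixel, which is how the chat appears to use it.

```python
def per_sample_mask(covmask: int, sample_id: int) -> int:
    """Keep only the coverage bit that belongs to the sample being shaded."""
    return covmask & (1 << sample_id)


def sample_mask_in(covmask: int, sample_id: int, passcount: int) -> int:
    """Hypothetical conditional variant: narrow the mask only when the
    shader runs more than one pass per pixel (assumed meaning of passcount)."""
    if passcount > 1:
        return per_sample_mask(covmask, sample_id)
    # Single pass: the invocation covers every sample in the pixel.
    return covmask


if __name__ == "__main__":
    coverage = 0b1011  # samples 0, 1 and 3 are covered
    for sample in range(4):
        print(sample, bin(per_sample_mask(coverage, sample)))
```

The conditional form still depends on the driver telling the shader compiler which case applies, which is why a shader key comes up in the discussion.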
22:01 fdobridge: <g​fxstrand> Ugh... Looks like the blob doesn't use SamplerPos
22:01 fdobridge: <g​fxstrand> I'm gonna assume it doesn't do what we think it does...
22:03 fdobridge: <g​fxstrand> Back to a cbuf it is...
22:03 fdobridge: <g​fxstrand> 🙄
22:04 fdobridge: <k​arolherbst🐧🦀> ohh... I think I know what it does
22:04 fdobridge: <k​arolherbst🐧🦀> it returns the position of a sample within a MS sample pattern, but I kinda thought that's what we need?
22:05 fdobridge: <k​arolherbst🐧🦀> or uhm...
22:05 fdobridge: <k​arolherbst🐧🦀> maybe not
22:08 fdobridge: <k​arolherbst🐧🦀> sounds like it's for `GL_NV_sample_locations`, specifically `FRAMEBUFFER_PROGRAMMABLE_SAMPLE_LOCATIONS_NV`
22:09 fdobridge: <g​fxstrand> It looks like it does something real with the texture index
22:09 fdobridge: <g​fxstrand> So maybe it looks up the number of samples?
22:09 fdobridge: <k​arolherbst🐧🦀> I mean.. the docs say sample index within MS pattern
22:10 fdobridge: <k​arolherbst🐧🦀> as the input
22:10 fdobridge: <k​arolherbst🐧🦀> of the selected texture
22:11 fdobridge: <k​arolherbst🐧🦀> those sample locations are fully programmable, so it kinda makes sense
22:11 fdobridge: <g​fxstrand> Yeah, but I can't use that on render targets without texture headers for them.
22:11 fdobridge: <k​arolherbst🐧🦀> right..
22:12 fdobridge: <k​arolherbst🐧🦀> the only MS related thing `PIXLD` has is `MY_INDEX`
22:12 fdobridge: <k​arolherbst🐧🦀> and uhm.. `INNER_COVERAGE`
22:12 fdobridge: <k​arolherbst🐧🦀> and others 😄
22:12 fdobridge: <g​fxstrand> Yeah, inner coverage isn't it
22:12 fdobridge: <g​fxstrand> What others?
22:13 fdobridge: <k​arolherbst🐧🦀> ohh wait... INNER_COVERAGE says it's independent of MS, so that makes sense
22:13 fdobridge: <k​arolherbst🐧🦀> `MSCOUNT`, `COVMASK`, `MY_INDEX`
22:13 fdobridge: <k​arolherbst🐧🦀> `MY_INDEX` is the current sample index
22:13 fdobridge: <k​arolherbst🐧🦀> heh
22:14 fdobridge: <k​arolherbst🐧🦀> `MY_INDEX` has a fun feature.. it sets the output predicate to true if in MS mode
22:14 fdobridge: <g​fxstrand> Oh, that is fun...
22:14 fdobridge: <g​fxstrand> What does "MS mode" mean?
22:15 fdobridge: <k​arolherbst🐧🦀> ehhhh
22:15 fdobridge: <g​fxstrand> passes > 1
22:15 fdobridge: <k​arolherbst🐧🦀> I misread, it's `SSAA mode`
22:15 fdobridge: <g​fxstrand> Also, what does PIXLD `78..81` set to 0 do?
22:16 fdobridge: <k​arolherbst🐧🦀> is 78..81 the mode?
22:16 fdobridge: <g​fxstrand> Yeah
22:16 fdobridge: <k​arolherbst🐧🦀> probably doing `MSCOUNT`
22:16 fdobridge: <g​fxstrand> It's a valid encoding according to the disassembler
22:16 fdobridge: <k​arolherbst🐧🦀> it drops the mode in the output?
22:16 fdobridge: <g​fxstrand> Ah, MSCOUNT could be easy
22:16 fdobridge: <g​fxstrand> For 0
22:16 fdobridge: <g​fxstrand> The others decode to things
22:17 fdobridge: <k​arolherbst🐧🦀> okay, does anything decode to `MSCOUNT`?
22:17 fdobridge: <g​fxstrand> No
22:17 fdobridge: <k​arolherbst🐧🦀> okay..
22:17 fdobridge: <k​arolherbst🐧🦀> then 0 is `MSCOUNT` 😄
22:17 fdobridge: <k​arolherbst🐧🦀> it has a star on it meaning default
22:17 fdobridge: <g​fxstrand> 5, 6, and 7 decode to `???6` etc.
22:17 fdobridge: <k​arolherbst🐧🦀> ahh yeah
22:17 fdobridge: <k​arolherbst🐧🦀> so it's not implemented
22:17 fdobridge: <k​arolherbst🐧🦀> or not existing
22:18 fdobridge: <g​fxstrand> I figured
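For reference, reading and writing a mode field such as PIXLD `78..81` can be sketched as below. This assumes the range is half-open (bits 78, 79 and 80, so values 0 to 7), which would match 5, 6 and 7 being the highest values tried above, and it treats the instruction as a single Python integer. The helper names are hypothetical, and no mode names other than the chat's guess that 0 is `MSCOUNT` are assumed.

```python
def extract_field(instr: int, start: int, end: int) -> int:
    """Return bits [start, end) of an instruction word held as an int."""
    width = end - start
    return (instr >> start) & ((1 << width) - 1)


def set_field(instr: int, start: int, end: int, value: int) -> int:
    """Return a copy of instr with bits [start, end) replaced by value."""
    width = end - start
    mask = (1 << width) - 1
    if value < 0 or value > mask:
        raise ValueError(f"value {value} does not fit in {width} bits")
    return (instr & ~(mask << start)) | (value << start)


if __name__ == "__main__":
    # Try every value of the assumed 3-bit mode field on an otherwise empty word.
    for mode in range(8):
        word = set_field(0, 78, 81, mode)
        print(mode, hex(word), extract_field(word, 78, 81))
```

Feeding each of the eight encodings to a disassembler in this way is one method of finding which values decode to a named mode and which do not.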
22:41 fdobridge: <g​fxstrand> Okay, now to figure out why alpha-to-coverage doesn't work...
23:09 fdobridge: <g​fxstrand> Seems to work when there are attachments.
23:10 fdobridge: <g​fxstrand> Do I need a dummy attachment or something? That would be frustrating.
23:52 fdobridge: <g​fxstrand> There's `RASTER_SAMPLES` but it doesn't seem to do anything and it complains when I set it to 0 (1x1)
23:52 fdobridge: <g​fxstrand> I may need to try dumping command buffers from the blob
23:54 fdobridge: <g​fxstrand> It's just such a giant PITA