00:00 skeggsb: what hw?
00:00 Lyude: skeggsb: gp107
00:01 Lyude: also https://lyude.net/~lyudess/tmp/ovly-exception-b-kms-plane.log
00:02 skeggsb: what makes you think it's ovim? 7 would be overlay (dma) 2 right?
00:57 imirkin: pmoreau: pushed an update, with a handful more minor commits
00:57 imirkin: pmoreau: i still have yet to run all this for like ... regular graphics
00:57 imirkin: to make sure i didn't accidentally the whole thing :)
01:02 Lyude: mupuf: mhh-might actually need a little longer before sending these patches out (I completely forgot to make sure this code doesn't make libdrm_nouveau mandatory for building correctly), but i'm hoping to send this out asap
01:05 imirkin: pmoreau: going to run some CTS things on it
05:27 mupuf: Lyude: \o/, Cc me and I will review them
08:17 pmoreau: imirkin: If it compiles, then it surely works, right? No need to run those annoying CTS! 🤣
08:17 pmoreau: And awesome, I’ll have a look at the update tonight.
15:41 imirkin: pmoreau: well, CTS/dEQP seem happy with the GL portion at least
15:41 imirkin: except for this friendly fellow: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4380
15:42 pmoreau: Cool! I managed to get things set up again on my other computer, but the CTS crashes due to some OpenGL 4 function gated behind a check for OpenGL 2. 🙂
15:42 imirkin: pmoreau: i'm using master iirc ... let me check which commit i'm on
15:42 pmoreau: Same here
15:43 imirkin: i'm on commit b29bf0434c16796dc48a17a52c7fe219d558af31
15:43 imirkin: which was master some (long) time ago
15:43 pmoreau: This is the faulty bit: https://github.com/KhronosGroup/VK-GL-CTS/blob/master/framework/opengl/gluStateReset.cpp#L1179-L1185
15:43 imirkin: and then i run cts with
15:43 imirkin: ./glcts --deqp-visibility=hidden --deqp-caselist-file=gl_cts/data/mustpass/gl/khronos_mustpass/4.6.1.x/gl33-master.txt
15:44 imirkin: lol, oops
15:44 imirkin: those are just random versions on there
15:44 imirkin: impressive
15:44 imirkin: the images thing is weird too
15:45 pmoreau: Yup :-D
15:45 imirkin: images were in 4.2
15:45 imirkin: but really it should be based on ext
15:45 imirkin: anyways, they took my patches for stuff like this. you should send yours too. takes a while, but gets in eventually
15:45 pmoreau: And in other places, they gated that glMinSampleShading behind the proper extension or ES 3.2 or OpenGL 4.5!! 🙃
15:46 pmoreau: I will send something, but I have work to finish first
15:46 imirkin: pmoreau: but the compute tests i've been running were in deqp
15:46 imirkin: which i run with
15:47 pmoreau: Okay
15:47 pmoreau: I have been running “./cts-runner --type=gl33” on my end.
15:47 imirkin: ooof no
15:47 imirkin: don't do that
15:47 imirkin: use ./glcts
15:47 imirkin: cts-runner is going to fail.
15:48 pmoreau: Ok
15:48 imirkin: use cts-runner to generate actual CTS submissions
15:48 imirkin: but that's not really what we're doing here
18:20 imirkin: damo22: btw, i have shared atomics now. if you feel like testing, there's a branch at https://gitlab.freedesktop.org/imirkin/mesa/-/commits/nv50_compute
18:20 imirkin: but i think pmoreau will be able to test himself, so you definitely don't have to do it
18:21 imirkin: damo22: you might run e.g. MESA_EXTENSION_OVERRIDE=GL_OES_texture_buffer MESA_GLES_VERSION_OVERRIDE=3.1 LD_LIBRARY_PATH=/path/to/mesa/lib64 ./deqp-gles31 --deqp-visibility=hidden -n dEQP-GLES31.functional.compute.basic.shared_atomic_op_
18:21 imirkin: damo22: you might run e.g. MESA_EXTENSION_OVERRIDE=GL_OES_texture_buffer MESA_GLES_VERSION_OVERRIDE=3.1 LD_LIBRARY_PATH=/path/to/mesa/lib64 ./deqp-gles31 --deqp-visibility=hidden -n dEQP-GLES31.functional.compute.basic.shared_atomic_op_*
18:21 imirkin: [note the fixed second version]
18:22 pmoreau: I could run that on my G94, once it is done with the current rendering.
18:22 imirkin: pmoreau: won't help
18:22 pmoreau: GT200+ needed?
18:22 imirkin: shared atomics is nva0+
18:22 imirkin: i have a fallback for earlier GPUs
18:22 pmoreau: Ah k
18:22 imirkin: which is basically "YOLO" wrt the locking ;)
18:23 pmoreau: Hahahaha!
18:23 pmoreau: Sounds awesome!
18:23 imirkin: works fine for the single invocation variants of the deqp's
18:23 karolherbst: imirkin: for fun reasons... intel doesn't have 64 bit shared atomics.. like not at all
18:23 imirkin: not so much when you add more
18:23 karolherbst: not even a tiny bit
18:23 karolherbst: so I wrote nir based lowering with an actual lock in shared memory at some point :(
18:23 imirkin: karolherbst: not even the teeeensiest little bit? :)
18:23 karolherbst: imirkin: nope
18:24 imirkin: how do you do an actual lock?
18:24 imirkin: you gotta have SOME locking
18:24 karolherbst: well, you can do atomics in 32 bit
18:24 imirkin: you can do it with a CAS for example
18:24 imirkin: but pre-nva0 ... there is zero locking in shared
18:24 karolherbst: ohhhh
18:24 karolherbst: that's even worse
18:24 karolherbst: allocate locks in global mem :p
18:24 imirkin: not even the teeeeeensiest little bit :)
18:24 imirkin: lol
18:24 imirkin: for max perf
18:24 pmoreau: I was going to suggest that, Karol 🤣
18:24 karolherbst: imirkin: as if it matters
18:25 karolherbst: :p
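[editor's note: the "actual lock" idea discussed above (emulating a 64-bit atomic with a 32-bit CAS) could look roughly like the sketch below. It is a host-side C++ illustration, not the NIR lowering karolherbst mentions and not nouveau code. The names Locked64 and locked_add64 are made up for this example, and std::atomic stands in for a 32-bit shared-memory CAS. As imirkin points out, this only works where such a CAS exists at all, which is not the case for shared memory on pre-nva0 hardware.]

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout: one 32-bit lock word next to the 64-bit value it guards.
struct Locked64 {
    std::atomic<uint32_t> lock{0};  // 0 = free, 1 = held
    uint64_t value = 0;             // plain 64-bit slot, no native atomics assumed
};

// Emulated 64-bit atomic add: take the lock with a 32-bit CAS, do the
// read-modify-write on the plain value, then release. Returns the old value.
uint64_t locked_add64(Locked64 &s, uint64_t operand) {
    uint32_t expected = 0;
    // Spin until the CAS swaps 0 -> 1. On failure the CAS overwrites
    // `expected` with the current lock word, so reset it before retrying.
    while (!s.lock.compare_exchange_weak(expected, 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed)) {
        expected = 0;
    }

    assert(s.lock.load(std::memory_order_relaxed) == 1);  // we hold the lock
    uint64_t old = s.value;
    s.value = old + operand;

    // Release the lock so other invocations can proceed.
    s.lock.store(0, std::memory_order_release);
    return old;
}
```

[without any CAS primitive the acquire step has nothing to build on, hence the "YOLO" fallback for the earlier GPUs.]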
18:25 imirkin: atomic gmem is g84+
18:25 imirkin: so the g80 is left out in the cold a bit
18:25 karolherbst: does cuda 6.5 even expose shared atomics on those gpus?
18:25 imirkin: much sadness.
18:25 karolherbst: or is there no cuda on those
18:25 imirkin: there definitely is
18:25 imirkin: sm_12 has shared atomics
18:25 karolherbst: I guess simply no atomics
18:25 karolherbst: mhhh
18:25 imirkin: sm_11 has global atomics
18:25 imirkin: sm_10 is much sad.
18:25 karolherbst: sure
18:26 karolherbst: so I guess on sm_10 you can't use those features
18:26 imirkin: right
18:26 imirkin: nouveau barely works on those G80's anyways
18:26 imirkin: they also don't support sampler LOD clamps
18:26 imirkin: and lots of various things
18:26 imirkin: it's like the original R600, but not nearly as bad
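[editor's note: the sm_10 / sm_11 / sm_12 breakdown above can be summarized as a small lookup. This is only a restatement of what was said in the chat. The struct and function names are invented for the example, and treating anything at or above a given level as also having the feature is an assumption of the sketch.]

```cpp
#include <cassert>

// Hypothetical summary type: which atomic flavours a compute level offers.
struct AtomicSupport {
    bool global_mem;  // atomics on global memory
    bool shared_mem;  // atomics on shared memory
};

// `sm` is the compute level written as in the chat: 10, 11, 12.
// Per the chat: sm_10 has neither, sm_11 adds global atomics,
// sm_12 adds shared atomics. The ">=" comparison is an assumption.
constexpr AtomicSupport atomic_support(int sm) {
    return AtomicSupport{sm >= 11, sm >= 12};
}

// Compile-time checks that the table matches the three statements in the log.
static_assert(!atomic_support(10).global_mem && !atomic_support(10).shared_mem,
              "sm_10: no atomics");
static_assert(atomic_support(11).global_mem && !atomic_support(11).shared_mem,
              "sm_11: global only");
static_assert(atomic_support(12).shared_mem, "sm_12: shared atomics");
```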
21:29 imirkin: pmoreau: btw, i sorta plan folding all the images stuff into one or two commits. let me know if you think that's a bad plan
21:31 pmoreau: I would leave the one that introduces the defines and similar as a separate one, but if you want to merge all the new stuff into one commit, that sounds okay though maybe a bit heavy. :-)
21:31 imirkin: i'd split the driver and compiler portions
21:32 pmoreau: 👍️
21:32 imirkin: but like ... what defines do you mean?
21:32 imirkin: just like the cb layout?
21:33 pmoreau: “nv50: Replace hardcoded texture/constbuf count with define”
21:33 imirkin: i'm not touching that
21:33 imirkin: i'm talking about images
21:34 imirkin: which are all in my "WIP" portion
23:48 imirkin: pmoreau: did you get a chance to try atomic shared on nva0+?