00:00skeggsb: what hw?
00:00Lyude: skeggsb: gp107
00:01Lyude: also https://lyude.net/~lyudess/tmp/ovly-exception-b-kms-plane.log
00:02skeggsb: what makes you think it's ovim? 7 would be overlay (dma) 2 right?
00:57imirkin: pmoreau: pushed an update, with a handful more minor commits
00:57imirkin: pmoreau: i still have yet to run all this for like ... regular graphics
00:57imirkin: to make sure i didn't accidentally the whole thing :)
01:02Lyude: mupuf: mhh-might actually need a little longer before sending these patches out (I completely forgot to make sure this code doesn't make libdrm_nouveau mandatory for building correctly), but i'm hoping to send this out asap
01:05imirkin: pmoreau: going to run some CTS things on it
05:27mupuf: Lyude: \o/, Cc me and I will review them
08:17pmoreau: imirkin: If it compiles, then it surely works, right? No need to run those annoying CTS! 🤣
08:17pmoreau: And awesome, I’ll have a look at the update tonight.
15:41imirkin: pmoreau: well, CTS/dEQP seem happy with the GL portion at least
15:41imirkin: except for this friendly fellow: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4380
15:42pmoreau: Cool! I managed to get things set up again on my other computer, but the CTS crashes due to some OpenGL 4 function gated behind a check for OpenGL 2. 🙂
15:42imirkin: pmoreau: i'm using master iirc ... let me check which commit i'm on
15:42pmoreau: Same here
15:43imirkin: i'm on commit b29bf0434c16796dc48a17a52c7fe219d558af31
15:43imirkin: which was master some (long) time ago
15:43pmoreau: This is the faulty bit: https://github.com/KhronosGroup/VK-GL-CTS/blob/master/framework/opengl/gluStateReset.cpp#L1179-L1185
15:43imirkin: and then i run cts with
15:43imirkin: ./glcts --deqp-visibility=hidden --deqp-caselist-file=gl_cts/data/mustpass/gl/khronos_mustpass/4.6.1.x/gl33-master.txt
15:44imirkin: lol, oops
15:44imirkin: those are just random versions on there
15:44imirkin: impressive
15:44imirkin: the images thing is weird too
15:45pmoreau: Yup :-D
15:45imirkin: images were in 4.2
15:45imirkin: but really it should be based on ext
15:45imirkin: anyways, they took my patches for stuff like this. you should send yours too. takes a while, but gets in eventually
15:45pmoreau: And in other places, they gated that glMinSampleShading behind the proper extension or ES 3.2 or OpenGL 4.5!! 🙃
15:46pmoreau: I will send something, but I have work to finish first
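A minimal sketch of the "version or extension" gating discussed above, assuming a simplified context description. `struct gl_ctx_info`, `has_extension`, `version_at_least` and `supports_images` are hypothetical names for illustration, not CTS or Mesa code, and a real context would be queried through the GL API rather than a prebuilt string. It uses image load/store as the example: core in desktop GL 4.2, otherwise available through GL_ARB_shader_image_load_store.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical context description; not a CTS or Mesa type. */
struct gl_ctx_info {
    int major;
    int minor;
    const char *extensions; /* space-separated extension names (assumption) */
};

/* True if `name` appears as a whole token in the space-separated list. */
static bool has_extension(const char *list, const char *name)
{
    assert(list != NULL && name != NULL);
    size_t len = strlen(name);
    if (len == 0)
        return false;
    for (const char *p = strstr(list, name); p != NULL; p = strstr(p + 1, name)) {
        bool starts = (p == list) || (p[-1] == ' ');
        bool ends = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return true;
    }
    return false;
}

/* True if the context version is at least major.minor. */
static bool version_at_least(const struct gl_ctx_info *ctx, int major, int minor)
{
    return ctx->major > major || (ctx->major == major && ctx->minor >= minor);
}

/* Image load/store: core in desktop GL 4.2, else gated on the ARB extension. */
static bool supports_images(const struct gl_ctx_info *ctx)
{
    return version_at_least(ctx, 4, 2) ||
           has_extension(ctx->extensions, "GL_ARB_shader_image_load_store");
}
```

With this shape, a 3.3 context that advertises the extension still gets its image state reset, while a bare 3.3 context never touches the entry points.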
15:46imirkin: pmoreau: but the compute tests i've been running were in deqp
15:46imirkin: which i run with
15:47imirkin: MESA_EXTENSION_OVERRIDE=GL_OES_texture_buffer MESA_GLES_VERSION_OVERRIDE=3.1
15:47pmoreau: Okay
15:47pmoreau: I have been running “./cts-runner --type=gl33” on my end.
15:47imirkin: ooof no
15:47imirkin: don't do that
15:47imirkin: use ./glcts
15:47imirkin: cts-runner is going to fail.
15:48pmoreau: Ok
15:48imirkin: use cts-runner to generate actual CTS submissions
15:48imirkin: but that's not really what we're doing here
18:20imirkin: damo22: btw, i have shared atomics now. if you feel like testing, there's a branch at https://gitlab.freedesktop.org/imirkin/mesa/-/commits/nv50_compute
18:20imirkin: but i think pmoreau will be able to test himself, so you definitely don't have to do it
18:21imirkin: damo22: you might run e.g. MESA_EXTENSION_OVERRIDE=GL_OES_texture_buffer MESA_GLES_VERSION_OVERRIDE=3.1 LD_LIBRARY_PATH=/path/to/mesa/lib64 ./deqp-gles31 --deqp-visibility=hidden -n dEQP-GLES31.functional.compute.basic.shared_atomic_op_
18:21imirkin: damo22: you might run e.g. MESA_EXTENSION_OVERRIDE=GL_OES_texture_buffer MESA_GLES_VERSION_OVERRIDE=3.1 LD_LIBRARY_PATH=/path/to/mesa/lib64 ./deqp-gles31 --deqp-visibility=hidden -n dEQP-GLES31.functional.compute.basic.shared_atomic_op_*
18:21imirkin: [note the fixed second version]
18:22pmoreau: I could run that on my G94, once it is done with the current rendering.
18:22imirkin: pmoreau: won't help
18:22pmoreau: GT200+ needed?
18:22imirkin: shared atomics is nva0+
18:22imirkin: i have a fallback for earlier GPUs
18:22pmoreau: Ah k
18:22imirkin: which is basically "YOLO" wrt the locking ;)
18:23pmoreau: Hahahaha!
18:23pmoreau: Sounds awesome!
18:23imirkin: works fine for the single invocation variants of the deqp's
18:23karolherbst: imirkin: for fun reasons... intel doesn't have 64 bit shared atomics.. like not at all
18:23imirkin: not so much when you add more
18:23karolherbst: not even a tiny bit
18:23karolherbst: so I wrote nir based lowering with an actual lock in shared memory at some point :(
18:23imirkin: karolherbst: not even the teeeensiest little bit? :)
18:23karolherbst: imirkin: nope
18:24imirkin: how do you do an actual lock?
18:24imirkin: you gotta have SOME locking
18:24karolherbst: well, you can do atomics in 32 bit
18:24imirkin: you can do it with a CAS for example
18:24imirkin: but pre-nva0 ... there is zero locking in shared
18:24karolherbst: ohhhh
18:24karolherbst: that's even worse
18:24karolherbst: allocate locks in global mem :p
18:24imirkin: not even the teeeeeensiest little bit :)
18:24imirkin: lol
18:24imirkin: for max perf
18:24pmoreau: I was going to suggest that, Karol 🤣
18:24karolherbst: imirkin: as if it matters
18:25karolherbst: :p
18:25imirkin: atomic gmem is g84+
18:25imirkin: so the g80 is left out in the cold a bit
18:25karolherbst: does cuda 6.5 even expose shared atomics on those gpus?
18:25imirkin: much sadness.
18:25karolherbst: or is there no cuda on those
18:25imirkin: there definitely is
18:25imirkin: sm_12 has shared atomics
18:25karolherbst: I guess simply no atomics
18:25karolherbst: mhhh
18:25imirkin: sm_11 has global atomics
18:25imirkin: sm_10 is much sad.
18:25karolherbst: sure
18:26karolherbst: so I guess on sm_10 you can't use those features
18:26imirkin: right
18:26imirkin: nouveau barely works on those G80's anyways
18:26imirkin: they also don't support sampler LOD clamps
18:26imirkin: and lots of various things
18:26imirkin: it's like the original R600, but not nearly as bad
21:29imirkin: pmoreau: btw, i sorta plan folding all the images stuff into one or two commits. let me know if you think that's a bad plan
21:31pmoreau: I would leave the one that introduces the defines and similar as a separate one, but if you want to merge all the new stuff into one commit, that sounds okay though maybe a bit heavy. :-)
21:31imirkin: i'd split the driver and compiler portions
21:32pmoreau: 👍️
21:32imirkin: but like ... what defines do you mean?
21:32imirkin: just like the cb layout?
21:33pmoreau: “nv50: Replace hardcoded texture/constbuf count with define”
21:33imirkin: i'm not touching that
21:33imirkin: i'm talking about images
21:34imirkin: which are all in my "WIP" portion
23:48imirkin: pmoreau: did you get a chance to try atomic shared on nva0+?