01:40airlied: dakr: new patch on the list, I think that's the proper solution for all arches
03:17fdobridge: <gfxstrand> Hrm... dEQP-GL45-ES31.functional.texture.border_clamp.per_axis_wrap_mode.texture_2d.int_color.nearest.s_mirrored_repeat_t_clamp_to_border_npot is failing
05:27fdobridge: <airlied> @gfxstrand just was profiling some shader builds from llama.cpp, #3 0x00007f487d00e942 in nak_rs::bitset::BitSet::union_with (self=0x34f1560, other=0x7ffc2be6bd10)
05:27fdobridge: <airlied> at ../src/nouveau/compiler/nak/bitset.rs:152
05:27fdobridge: <airlied> 152 let uw = self.words[w] | other.words[w];
05:27fdobridge: <airlied> the uw allocation is quite high on the perf charts
05:34fdobridge: <airlied> not even sure why I'm seeing an alloc path getting hit there
05:55fdobridge: <airlied> actually I'm probably misreading it, not an allocation, just entering the vec code
06:06fdobridge: <airlied> ah yeah debugopt build seems a lot happier
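For readers following the backtrace above, here is a minimal sketch of what a word-based `union_with` can look like. The `BitSet` type, its `u32` word size, and the `insert`/`contains` helpers are assumptions made for illustration, not NAK's actual code; only the `let uw = self.words[w] | other.words[w];` line comes from the log. In an unoptimized build each `self.words[w]` is a real call into `Vec`'s indexing code, which would fit the "just entering the vec code" reading.

```rust
/// Hypothetical stand-in for a word-based bit set; not NAK's actual type.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct BitSet {
    words: Vec<u32>,
}

impl BitSet {
    pub fn new() -> Self {
        BitSet { words: Vec::new() }
    }

    /// Sets `bit`, growing the word vector if needed.
    pub fn insert(&mut self, bit: usize) {
        let w = bit / 32;
        if w >= self.words.len() {
            self.words.resize(w + 1, 0);
        }
        self.words[w] |= 1u32 << (bit % 32);
    }

    /// Returns true if `bit` is set; bits past the end count as unset.
    pub fn contains(&self, bit: usize) -> bool {
        let w = bit / 32;
        w < self.words.len() && (self.words[w] & (1u32 << (bit % 32))) != 0
    }

    /// ORs `other` into `self`; returns true if any new bit was set.
    pub fn union_with(&mut self, other: &BitSet) -> bool {
        if self.words.len() < other.words.len() {
            self.words.resize(other.words.len(), 0);
        }
        let mut changed = false;
        for w in 0..other.words.len() {
            // Same shape as the line in the backtrace: two indexed loads.
            let uw = self.words[w] | other.words[w];
            if uw != self.words[w] {
                self.words[w] = uw;
                changed = true;
            }
        }
        changed
    }
}
```

No allocation happens inside the loop itself; the only possible allocation in this sketch is the `resize` before it.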
12:05fdobridge: <rinlovesyou> I wonder just how much overhead something like that actually causes, rust bound checks slice/vector access
12:05fdobridge: <rinlovesyou> Maybe not much on its own but i could picture this adding up
12:13fdobridge: <moonykay> It's a never taken branch, and hints as such to modern CPUs
12:13fdobridge: <moonykay> so on average it actually doesn't add up to anything due to OoO occupancy issues (i.e. "those instruction slots wouldn't have been doing anything anyway")
12:14fdobridge: <moonykay> It can add up in hot loops under specific circumstances, but if you manage to actually measure that you've earned the right to pull out unsafe.
12:14fdobridge: <nanokatze> while the cost of the branch itself is easily masked, the cost of the check is still there
12:14fdobridge: <moonykay> the check is usually just a `cmp`, which fuses with the following branch
12:15fdobridge: <moonykay> this is more of a concern on lower performance or not-OoO systems
12:16fdobridge: <moonykay> (which don't average having plenty of free execution units)
12:16fdobridge: <nanokatze> yes that's the bit I mean
12:16fdobridge: <moonykay> even there it's pretty cheap, rust/LLVM usually pulls the check upward as high as possible
12:16fdobridge: <nanokatze> cmp costs kinda remotely the same way add or sub or whatever else costs
12:16fdobridge: <moonykay> `add` is usually the cheapest possible operation
12:16fdobridge: <nanokatze> yes, unless it's in a loop with tiny body like you said
12:17fdobridge: <nanokatze> oops we're in nouveau
12:17fdobridge: <nanokatze> sorry
12:17fdobridge: <moonykay> #tech-talk
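To make the bounds-check exchange above concrete, here is a small sketch comparing an indexed loop with an iterator-based one. Both function names are hypothetical. Whether the optimizer already removes the checks in the indexed form depends on the compiler version and the surrounding code, so treat this as an illustration of the two shapes, not as a performance claim.

```rust
/// Indexed form: each `dst[i]` / `src[i]` carries a bounds check unless
/// the optimizer can prove `i` is always in range.
pub fn union_indexed(dst: &mut [u32], src: &[u32]) -> bool {
    let n = dst.len().min(src.len());
    let mut changed = false;
    for i in 0..n {
        let uw = dst[i] | src[i];
        changed |= uw != dst[i];
        dst[i] = uw;
    }
    changed
}

/// Iterator form: `zip` stops at the shorter slice, so the loop body
/// never indexes and has no per-element check to perform.
pub fn union_zipped(dst: &mut [u32], src: &[u32]) -> bool {
    let mut changed = false;
    for (d, s) in dst.iter_mut().zip(src.iter()) {
        let uw = *d | *s;
        changed |= uw != *d;
        *d = uw;
    }
    changed
}
```

The two functions compute the same result, so the iterator form is a safe-Rust option to try before reaching for `unsafe` in a hot loop with a tiny body.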
12:28fdobridge: <gfxstrand> But also debug rust code is sloooow.
12:41fdobridge: <moonykay> rust definitely relies on modern compiler optimization
12:58fdobridge: <ahuillet> exciting sentence!
13:15fdobridge: <!DodoNVK (she) 🇱🇹> ~~C supremacy~~
15:12fdobridge: <gfxstrand> Not only does Rust rely on compiler optimization but it relies on inlining in particular. Everything in Rust has a stack depth of at least 12.
15:13fdobridge: <karolherbst🐧🦀> It also enables a few things only in debug builds, like overflow detection
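As a short illustration of the overflow point above: assuming default compiler settings, a plain `a + b` on integers panics on overflow when debug assertions are enabled and wraps otherwise. The explicit methods below behave the same in every build profile. The wrapper function names are hypothetical; `checked_add`, `wrapping_add`, and `overflowing_add` are standard library methods.

```rust
/// Returns `None` on overflow, in debug and release builds alike.
pub fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Always wraps modulo 256, regardless of build profile.
pub fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Returns the wrapped result plus a flag saying whether it overflowed.
pub fn add_flagged(a: u8, b: u8) -> (u8, bool) {
    a.overflowing_add(b)
}
```

Choosing one of these explicitly removes the behavioral difference between debug and optimized builds for that operation.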
15:47fdobridge: <ahuillet> hence the stack overflow we discussed some time ago with a debug build?
16:38fdobridge: <karolherbst🐧🦀> yeah, possibly
17:09fdobridge: <gfxstrand> Yeah, it's super fun when you get a panic. In C, `up 5` in GDB will get you out of it. In Rust, it's somewhere between `up 12` and `up 14`, typically.
17:16fdobridge: <karolherbst🐧🦀> ~~compared to the 500 I usually had with Java it's basically nothing still~~
18:14ad__: hi
18:15ad__: working on the backlight ada issue now
18:44Lyude: airlied: think I wrote up a patch for the suspend/resume issue :) https://paste.centos.org/view/41c6a068
18:45Lyude: ad__: hey! I think hans got in touch with you iirc? were they able to help at all?
18:45Lyude: (about to try that s/r patch on my own machine after a kernel build)
18:59Lyude: enormous nouveau bug i just found: no nvk mascot
21:31airlied: ahuillet: actually you might be able to help nvk on stencil8 support, if you know any of the secrets behind it
21:32airlied: ahuillet: a straight implementation didn't just work
21:32airlied: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24596 was one attempt
22:01fdobridge: <gfxstrand> Oh, good. This only fails when run in windowed mode. It passes with `--deqp-surface-type=fbo`
22:01fdobridge: <gfxstrand> @zmike. ^^
22:01fdobridge: <gfxstrand> I suspect sRGB
22:02fdobridge: <zmike.> :stressheadache:
22:03fdobridge: <zmike.> file ticket I guess, but I gotta figure out this tomb raider regression first
22:04fdobridge: <zmike.> does it fail on lavapipe too
22:04fdobridge: <gfxstrand> Should I try with X11 instead of Wayland?
22:04fdobridge: <zmike.> yeah try on a few different setups just for data points
22:04fdobridge: <zmike.> would be useful
22:06fdobridge: <gfxstrand> lavapipe fails too
22:07fdobridge: <zmike.> great
22:07fdobridge: <zmike.> does llvmpipe pass?
22:09fdobridge: <gfxstrand> Looks like iris fails
22:10fdobridge: <gfxstrand> Though it's hard to tell. GL is weird and I have too many GPUs. 🙃
22:10fdobridge: <zmike.> if iris fails too then probably it's a core mesa bug
22:10fdobridge: <gfxstrand> Zink+ANV fails
22:10fdobridge: <gfxstrand> Let me try iris
22:11fdobridge: <gfxstrand> I'm going to try with X11 next
22:12fdobridge: <gfxstrand> Yeah, iris fails, too
22:12fdobridge: <gfxstrand> Fun
22:15fdobridge: <gfxstrand> Passes on X11
22:15fdobridge: <gfxstrand> Fun
22:16fdobridge: <gfxstrand> I'm going to file it as a mesa/ bug
22:17fdobridge: <gfxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/issues/10993
22:18fdobridge: <gfxstrand> I guess I'm going to re-run the CTS on X11. Woo...
22:22fdobridge: <zmike.> it should be relatively trivial to resolve if it can be A/B compared with a passing fbo config
22:22fdobridge: <zmike.> maybe tpalli will tackle it
22:22fdobridge: <gfxstrand> Yeah
22:23fdobridge: <gfxstrand> In any case, it passes on X11 so I'm going to assume all of my other failures are that too and run the CTS again
22:23fdobridge: <gfxstrand> Maybe it'll even be faster on X11
22:23fdobridge: <zmike.> it won't, but I never test on wayland so I'm surprised it works at all
22:29fdobridge: <gfxstrand> x11_egl_glx doesn't like me either
22:43fdobridge: <gfxstrand> And `dEQP-GL45-ES3.info.vendor` is throwing `EGL_BAD_NATIVE_WINDOW`
22:52fdobridge: <zmike.> that's new
22:55fdobridge: <gfxstrand> It's fine on Wayland. I can't get cts-runner to get past that on any of the X11 configs
22:56fdobridge: <zmike.> other drivers?
22:57fdobridge: <gfxstrand> iris is fine
22:59fdobridge: <gfxstrand> `./glcts --deqp-gl-context-type=egl -n dEQP-GL45-ES3.info.vendor`
23:01fdobridge: <gfxstrand> On any X11 config
23:03fdobridge: <gfxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/issues/10994
23:30fdobridge: <zmike.> on x11_egl_glx this passes for me on radv and lavapipe
23:30fdobridge: <zmike.> testing anv now
23:30fdobridge: <zmike.> once my build finishes...
23:30fdobridge: <gfxstrand> It's probably the SW WSI path
23:30fdobridge: <gfxstrand> Because no modifiers
23:30fdobridge: <zmike.> 🤔
23:31fdobridge: <zmike.> nope, still passes with LIBGL_KOPPER_DRI2=1
23:33fdobridge: <gfxstrand> ANV is fine
23:34fdobridge: <gfxstrand> IDK where to even set a breakpoint for this
23:35fdobridge: <zmike.> that's always the most exciting part of wsi debugging
23:35fdobridge: <zmike.> I've got about 6 years left on this egl cts build on my nvk machine
23:40fdobridge: <gfxstrand> `dri2_x11_swap_buffers_msc()` fails
23:43fdobridge: <zmike.> yes
23:52fdobridge: <zmike.> yea so I think it's hitting the dri2 swapbuffers path when it needs to be hitting the drisw path
23:53fdobridge: <zmike.> but I'm not sure exactly why and I'm using a potato on my couch to try and ssh this
23:58fdobridge: <zmike.> @gfxstrand great news
23:58fdobridge: <zmike.> I have a solution for you
23:58fdobridge: <gfxstrand> Oh?
23:58fdobridge: <zmike.> `LIBGL_DRI3_DISABLE=1 LIBGL_KOPPER_DISABLE=1 ./glcts --deqp-gl-context-type=egl -n dEQP-GL45-ES3.info.vendor`