01:40 airlied: dakr: new patch on the list, I think that's the proper solution for all arches
03:17 fdobridge: <g​fxstrand> Hrm... dEQP-GL45-ES31.functional.texture.border_clamp.per_axis_wrap_mode.texture_2d.int_color.nearest.s_mirrored_repeat_t_clamp_to_border_npot is failing
05:27 fdobridge: <a​irlied> @gfxstrand just was profiling some shader builds from llama.cpp, #3 0x00007f487d00e942 in nak_rs::bitset::BitSet::union_with (self=0x34f1560, other=0x7ffc2be6bd10)
05:27 fdobridge: <a​irlied> at ../src/nouveau/compiler/nak/bitset.rs:152
05:27 fdobridge: <a​irlied> 152 let uw = self.words[w] | other.words[w];
05:27 fdobridge: <a​irlied> the uw allocation is quite high on the perf charts
05:34 fdobridge: <a​irlied> not even sure why I'm seeing an alloc path getting hit there
05:55 fdobridge: <a​irlied> actually I'm probably misreading it, not an allocation, just entering the vec code
06:06 fdobridge: <a​irlied> ah yeah debugopt build seems a lot happier
12:05 fdobridge: <r​inlovesyou> I wonder just how much overhead something like that actually causes, rust bound checks slice/vector access
12:05 fdobridge: <r​inlovesyou> Maybe not much on its own but i could picture this adding up
12:13 fdobridge: <m​oonykay> It's a never taken branch, and hints as such to modern CPUs
12:13 fdobridge: <m​oonykay> so on average it actually doesn't add up to anything due to OoO occupancy issues (i.e. "those instruction slots wouldn't have been doing anything anyway")
12:14 fdobridge: <m​oonykay> It can add up in hot loops under specific circumstances, but if you manage to actually measure that you've earned the right to pull out unsafe.
12:14 fdobridge: <n​anokatze> while the cost of the branch itself is easily masked, the cost of the check is still there
12:14 fdobridge: <m​oonykay> the check is usually just a `cmp`, which fuses with the following branch
12:15 fdobridge: <m​oonykay> this is more of a concern on lower performance or not-OoO systems
12:16 fdobridge: <m​oonykay> (which don't average having plenty of free execution units)
12:16 fdobridge: <n​anokatze> yes that's the bit I mean
12:16 fdobridge: <m​oonykay> even there it's pretty cheap, rust/LLVM usually pulls the check upward as high as possible
12:16 fdobridge: <n​anokatze> cmp costs kinda remotely the same way add or sub or whatever else costs
12:16 fdobridge: <m​oonykay> `add` is usually the cheapest possible operation
12:17 fdobridge: <n​anokatze> yes, unless it's in a loop with a tiny body like you said
12:17 fdobridge: <n​anokatze> oops we're in nouveau
12:17 fdobridge: <n​anokatze> sorry
12:17 fdobridge: <m​oonykay> #tech-talk
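
For readers following the bounds-check discussion above, here is a rough sketch of the two loop shapes being compared. `WordSet` is a hypothetical stand-in, not NAK's actual `BitSet`; only the `let uw = self.words[w] | other.words[w];` line comes from the backtrace quoted earlier, and the resize and return value are assumptions made for illustration.

```rust
/// Hypothetical word-based bitset used only for this sketch.
pub struct WordSet {
    pub words: Vec<u32>,
}

impl WordSet {
    /// Indexed loop: every `words[w]` access carries a bounds check unless
    /// the optimizer can prove the index is in range and drop or hoist it.
    pub fn union_indexed(&mut self, other: &WordSet) -> bool {
        if self.words.len() < other.words.len() {
            self.words.resize(other.words.len(), 0);
        }
        let mut changed = false;
        for w in 0..other.words.len() {
            let uw = self.words[w] | other.words[w];
            if uw != self.words[w] {
                changed = true;
                self.words[w] = uw;
            }
        }
        changed
    }

    /// Iterator loop: `zip` walks both slices in lockstep and stops at the
    /// shorter one, so there is no per-element index to check.
    pub fn union_zipped(&mut self, other: &WordSet) -> bool {
        if self.words.len() < other.words.len() {
            self.words.resize(other.words.len(), 0);
        }
        let mut changed = false;
        for (s, o) in self.words.iter_mut().zip(other.words.iter()) {
            let uw = *s | *o;
            changed |= uw != *s;
            *s = uw;
        }
        changed
    }
}
```

Both functions compute the same result. Whether the second is measurably faster depends on the build profile and on what the optimizer already does with the first, so it is worth profiling an optimized build before changing anything.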
12:28 fdobridge: <g​fxstrand> But also debug rust code is sloooow.
12:41 fdobridge: <m​oonykay> rust definitely relies on modern compiler optimization
12:58 fdobridge: <a​huillet> exciting sentence!
13:15 fdobridge: <!​DodoNVK (she) 🇱🇹> ~~C supremacy~~
15:12 fdobridge: <g​fxstrand> Not only does Rust rely on compiler optimization but it relies on inlining in particular. Everything in Rust has a stack depth of at least 12.
15:13 fdobridge: <k​arolherbst🐧🦀> It also enables a few things only in debug builds, like overflow detection
15:47 fdobridge: <a​huillet> hence the stack overflow we discussed some time ago with a debug build?
16:38 fdobridge: <k​arolherbst🐧🦀> yeah, possibly
17:09 fdobridge: <g​fxstrand> Yeah, it's super fun when you get a panic. In C, `up 5` in GDB will get you out of it. In Rust, it's somewhere between `up 12` and `up 14`, typically.
17:16 fdobridge: <k​arolherbst🐧🦀> ~~compared to the 500 I usually had with Java it's basically nothing still~~
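
As a small illustration of the debug-only overflow detection mentioned above: with Cargo's default profiles, plain integer arithmetic panics on overflow in debug builds and wraps in release builds. The helper names below are hypothetical; `checked_add` and `wrapping_add` are standard library methods that behave the same in every profile.

```rust
/// Plain addition: with default Cargo profiles this panics on overflow in
/// a debug build and wraps silently in a release build.
pub fn add_plain(a: u8, b: u8) -> u8 {
    a + b
}

/// Explicit check: returns `None` on overflow regardless of build profile.
pub fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Explicit wrap: always wraps modulo 256, regardless of build profile.
pub fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}
```

Code that relies on wrapping should therefore say so with `wrapping_add` instead of depending on release-build behavior.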
18:14 ad__: hi
18:15 ad__: working on the backlight ada issue now
18:44 Lyude: airlied: think I wrote up a patch for the suspend/resume issue :) https://paste.centos.org/view/41c6a068
18:45 Lyude: ad__: hey! I think hans got in touch with you iirc? were they able to help at all?
18:45 Lyude: (about to try that s/r patch on my own machine after a kernel build)
18:59 Lyude: enormous nouveau bug i just found: no nvk mascot
21:31 airlied: ahuillet: actually you might be able to help nvk on stencil8 support, if you know any of the secrets behind it
21:32 airlied: ahuillet: a straight implementation didn't just work
21:32 airlied: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24596 was one attempt
22:01 fdobridge: <g​fxstrand> Oh, good. This only fails when run in windowed mode. It passes with `--deqp-surface-type=fbo`
22:01 fdobridge: <g​fxstrand> @zmike. ^^
22:01 fdobridge: <g​fxstrand> I suspect sRGB
22:02 fdobridge: <z​mike.> :stressheadache:
22:03 fdobridge: <z​mike.> file ticket I guess, but I gotta figure out this tomb raider regression first
22:04 fdobridge: <z​mike.> does it fail on lavapipe too
22:04 fdobridge: <g​fxstrand> Should I try with X11 instead of Wayland?
22:04 fdobridge: <z​mike.> yeah try on a few different setups just for data points
22:04 fdobridge: <z​mike.> would be useful
22:06 fdobridge: <g​fxstrand> lavapipe fails too
22:07 fdobridge: <z​mike.> great
22:07 fdobridge: <z​mike.> does llvmpipe pass?
22:09 fdobridge: <g​fxstrand> Looks like iris fails
22:10 fdobridge: <g​fxstrand> Though it's hard to tell. GL is weird and I have too many GPUs. 🙃
22:10 fdobridge: <z​mike.> if iris fails too then probably it's a core mesa bug
22:10 fdobridge: <g​fxstrand> Zink+ANV fails
22:10 fdobridge: <g​fxstrand> Let me try iris
22:11 fdobridge: <g​fxstrand> I'm going to try with X11 next
22:12 fdobridge: <g​fxstrand> Yeah, iris fails, too
22:12 fdobridge: <g​fxstrand> Fun
22:15 fdobridge: <g​fxstrand> Passes on X11
22:15 fdobridge: <g​fxstrand> Fun
22:16 fdobridge: <g​fxstrand> I'm going to file it as a mesa/ bug
22:17 fdobridge: <g​fxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/issues/10993
22:18 fdobridge: <g​fxstrand> I guess I'm going to re-run the CTS on X11. Woo...
22:22 fdobridge: <z​mike.> it should be relatively trivial to resolve if it can be A/B compared with a passing fbo config
22:22 fdobridge: <z​mike.> maybe tpalli will tackle it
22:22 fdobridge: <g​fxstrand> Yeah
22:23 fdobridge: <g​fxstrand> In any case, it passes on X11 so I'm going to assume all of my other failures are that too and run the CTS again
22:23 fdobridge: <g​fxstrand> Maybe it'll even be faster on X11
22:23 fdobridge: <z​mike.> it won't, but I never test on wayland so I'm surprised it works at all
22:29 fdobridge: <g​fxstrand> x11_egl_glx doesn't like me either
22:43 fdobridge: <g​fxstrand> And `dEQP-GL45-ES3.info.vendor` is throwing `EGL_BAD_NATIVE_WINDOW`
22:52 fdobridge: <z​mike.> that's new
22:55 fdobridge: <g​fxstrand> It's fine on Wayland. I can't get cts-runner to get past that on any of the X11 configs
22:56 fdobridge: <z​mike.> other drivers?
22:57 fdobridge: <g​fxstrand> iris is fine
22:59 fdobridge: <g​fxstrand> `./glcts --deqp-gl-context-type=egl -n dEQP-GL45-ES3.info.vendor`
23:01 fdobridge: <g​fxstrand> On any X11 config
23:03 fdobridge: <g​fxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/issues/10994
23:30 fdobridge: <z​mike.> on x11_egl_glx this passes for me on radv and lavapipe
23:30 fdobridge: <z​mike.> testing anv now
23:30 fdobridge: <z​mike.> once my build finishes...
23:30 fdobridge: <g​fxstrand> It's probably the SW WSI path
23:30 fdobridge: <g​fxstrand> Because no modifiers
23:30 fdobridge: <z​mike.> 🤔
23:31 fdobridge: <z​mike.> nope, still passes with LIBGL_KOPPER_DRI2=1
23:33 fdobridge: <g​fxstrand> ANV is fine
23:34 fdobridge: <g​fxstrand> IDK where to even set a breakpoint for this
23:35 fdobridge: <z​mike.> that's always the most exciting part of wsi debugging
23:35 fdobridge: <z​mike.> I've got about 6 years left on this egl cts build on my nvk machine
23:40 fdobridge: <g​fxstrand> `dri2_x11_swap_buffers_msc()` fails
23:43 fdobridge: <z​mike.> yes
23:52 fdobridge: <z​mike.> yea so I think it's hitting the dri2 swapbuffers path when it needs to be hitting the drisw path
23:53 fdobridge: <z​mike.> but I'm not sure exactly why and I'm using a potato on my couch to try and ssh this
23:58 fdobridge: <z​mike.> @gfxstrand great news
23:58 fdobridge: <z​mike.> I have a solution for you
23:58 fdobridge: <g​fxstrand> Oh?
23:58 fdobridge: <z​mike.> `LIBGL_DRI3_DISABLE=1 LIBGL_KOPPER_DISABLE=1 ./glcts --deqp-gl-context-type=egl -n dEQP-GL45-ES3.info.vendor`