00:51KungFuJesus: So, if we're getting "SWIZZLE_NONE" values here for any of the first 4 indices...something may already be off
00:51KungFuJesus: I finally found the code for what this is actually doing for the index switching
00:53karolherbst: I'm sure after enough digging you'll come to the same conclusion as I did: burn it to the ground and start big endian support from scratch :P
00:58KungFuJesus: lol, well, I want things to work first. If -O0 is eliminating this bug, we're probably close to working
00:58KungFuJesus: I'm starting to think the "NONE" index coming into there is a symptom of something unsavory coming out the glsl compiler
00:59KungFuJesus: that code looks pretty unseemly and it seems to be doing a whole bunch of stuff that isn't technically legal in C++
00:59karolherbst: nah, the issue has something to do with reswizzling for big endian and those special swizzle accessors just being weird
01:00KungFuJesus: I'll print the in and out from format with and without optimizations to see if the output differs but I really think the input is differing
01:00karolherbst: I suspect that without optimizations it just doesn't segfault because of memory layout differences or something
01:06KungFuJesus: well, it's not a segfault, it's just presenting a black framebuffer
01:07KungFuJesus: given that this bit is only called like one or two times, it should be pretty easy to see the differences in a debugger for what's happening
01:15KungFuJesus: hmm, seems like maybe multiple threads are calling this? https://pastebin.com/m8e2PZKz
01:38KungFuJesus: also find it weird I somehow ended up at builtin_unreachable, seems like something may have somehow corrupted the stack? I'll print around the area to stderr
02:16KungFuJesus: lol, well that's rich. Working versus not working seem to have no discernible differences
02:17KungFuJesus: it's gotta be the code coming out of the glsl compiler
02:18KungFuJesus: some pointer being deref'd somewhere that is sometimes harmless and sometimes not or something
02:22KungFuJesus: ahhh, valgrind did find something when I launched mythtv outside of the wrapper
06:43OftenTimeConsuming: I'm getting an annoying hanging problem with my 780 Ti. It seems to happen in the default power state with both minimal 3D load and too much 3D load for the default power state. glxgears mitigates the low load problem, but can cause the higher load problem. I'm on Linux 6.1.0-rc8, are there any non-proprietary patches that improve things?
06:58karolherbst: KungFuJesus: uhh yeah.. those valgrind errors do not sound great :)
06:59karolherbst: OftenTimeConsuming: maybe.. what's the mesa version you are on?
07:00karolherbst: and what kind of hanging problems are you seeing? Any errors in the logs?
07:12OftenTimeConsuming: I'm on mesa-22.2.3 and can upgrade to 22.3.1.
07:14OftenTimeConsuming: I don't really have logging setup, aside from what's on Xorg.log.old, but I don't see anything there.
07:14karolherbst: that might actually help because the fixes for using multiple GL contexts within the same application only got applied to 22.3
07:15karolherbst: soo.. e.g. if you use electron apps and see hangs, 22.3 should be able to fix that
07:15OftenTimeConsuming: I don't use electron cr..apps.
07:15karolherbst: well other applications could trigger that too, it's just what users usually end up using
07:16karolherbst: but hard to tell if we don't know what's happening on your system
07:16OftenTimeConsuming: Typically what happens is that every window and the keyboard (libinput) freezes, but sometimes the mouse still works and moves, except it's not useful, as everything is frozen.
07:16OftenTimeConsuming: glxgears stops for example
07:16karolherbst: mhh.. anything in dmesg?
07:17OftenTimeConsuming: I can't really reliably reproduce the crash, as it seems to happen at random. I've had 2 back to back hangs happen viewing images in Tor Browser.
07:17OftenTimeConsuming: dmesg is cleared at reboot, but I'll look at my old kern.logs
07:18OftenTimeConsuming: Oh of course, no old kern.logs, just the current
07:20karolherbst: might make it easier to debug those things with a proper logging daemon storing old logs (or log rotating them or whatever)
07:27karolherbst: anyway.. mesa 22.3 _might_ help, but without kernel logs or any logs or knowing what happens I guess there isn't much to help here. It might also be that the Tor Browser, which I think is Firefox, is triggering those threading issues fixed in 22.3, but it could also be anything else.
07:31OftenTimeConsuming: Hmm, syslogd is running, but it's not rotating? I guess I'll leave dmesg open so hopefully I see logs just before another hang
07:38karolherbst: it might not fetch kernel logs, dunno
13:11OftenTimeConsuming: >nouveau 0000:01:00.0: fifo: FB_FLUSH_TIMEOUT Huh, but no hang.
13:19karolherbst: I think this happens if the GPU is too slow for things
13:19karolherbst: could bump up the clocks
13:20karolherbst: but yeah... it's a silly issue
13:54iska: did kepler improve since 2019? I'm thinking of buying a used laptop with a gt740
14:23OftenTimeConsuming: karolherbst: I see, thanks.
14:32karolherbst: iska: yeah, we use nir now and reclocking should more or less work, it's just still manual. But I'd not recommend buying a hybrid GPU laptop _unless_ you really need the perf
14:42tagr: skeggsb: I'm running into some issues with Nouveau on gm20b after the recent falcon code rework
14:43tagr: I was able to fix a few of the more obvious issues, but I'm now running into this error message and I'm not sure how to fix it:
14:43tagr: [ 15.490032] nouveau 57000000.gpu: pmu: unexpected init message size 255 vs 42
14:43tagr: any thoughts?
14:46tagr: here are some of the things I had to do in order to get things working, though I'm not sure that last hunk is correct: http://ix.io/4kkp
14:48tagr: sorry for being very late on this, looks like this started happening in linux-next around early/mid December when we had various other breakages in linux-next, so this went unnoticed for a while, but now it's showing up in mainline
14:54KungFuJesus: karolherbst: fascinatingly, none of those uninitialized accesses occur with -O0. Begs the question if optimizations being disabled are initializing more memory or if the optimizer is taking a liberty that is technically legal but messes with the code generation out of the fragment shaders
14:55KungFuJesus: I probably don't have much time to mess with this today but I do wonder if I can progressively enable optimizations manually until the bug(s?) appears
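One way to bisect by optimization level, sketched below, is to pin individual suspect functions at -O0 while the rest of the build stays optimized. This relies on GCC's `optimize` function attribute, which GCC documents as a debugging aid only. `BISECT_NO_OPT` and `pack_swizzle()` are hypothetical names made up for illustration; they are not Mesa code, and the 3-bit packing is an assumption, not Mesa's actual swizzle layout.

```c
#include <stdint.h>

/* GCC-only: the optimize attribute overrides the command-line -O level
 * for a single function. Other compilers get an empty macro. */
#if defined(__GNUC__) && !defined(__clang__)
#define BISECT_NO_OPT __attribute__((optimize("O0")))
#else
#define BISECT_NO_OPT
#endif

/* Hypothetical stand-in for a suspect function: packs four swizzle
 * indices into 3 bits each. Tag one function at a time with
 * BISECT_NO_OPT and rebuild to see whether the bug follows it. */
BISECT_NO_OPT static uint32_t
pack_swizzle(const uint8_t swz[4])
{
   uint32_t packed = 0;
   for (int i = 0; i < 4; i++)
      packed |= (uint32_t)(swz[i] & 0x7) << (3 * i);
   return packed;
}
```

If the black framebuffer disappears once a particular function is tagged, that function (or something it inlines) is a good place to look for the undefined behaviour.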
14:57iska: ty :)
14:59KungFuJesus: it definitely doesn't eliminate every bug, that's for sure. The GLSL based deinterlacer is a flickery mess where the mythtv UI pops up constantly underneath the video stream. That goes away when you disable the deinterlacing
15:04karolherbst: KungFuJesus: well.. an uninitialized access is just that. The compiler _might_ optimize certain bytes away if it can figure out they're never legally accessed or something
17:01fdobridge: <marysaka> About this, I would love to have it to look at Turing macros a bit (and check if there is any macro that may use opcode 6 on Fermi)
17:04fdobridge: <marysaka> Seems that the bot doesn't support quotes so it doesn't show on IRC correctly oops
23:16fdobridge: <gfxstrand> @marysaka Ok, I've got the two builders combined. I didn't try to make your series clean on top of my Turing cleanups. It's a bunch of reasonably clean Turing cleanups with one big STUFF patch at the end
23:16fdobridge: <gfxstrand> https://gitlab.freedesktop.org/jekstrand/mesa/-/commits/nvk/mme-fermi-v2
23:17fdobridge: <gfxstrand> With that, NVK attempts to compile macros on turing \o/
23:17fdobridge: <gfxstrand> With that, NVK attempts to compile macros on my Maxwell \o/ (edited)
23:17fdobridge: <gfxstrand> Of course, it blows up because there aren't enough registers. 😢
23:18fdobridge: <gfxstrand> IDK what the best thing to do about that is going to be.
23:18fdobridge: <gfxstrand> I need to look a bit harder at your control-flow stuff and see if we can make it a bit more register-efficient.
23:19fdobridge: <gfxstrand> Or maybe we should make `mme_loop()` take a bool for "I'm done with the reg now" or something like that.
23:19fdobridge: <gfxstrand> If we had that, we could use the input register as a counter and save one
23:20fdobridge: <gfxstrand> Or we can stop using mme_loop
23:20fdobridge: <gfxstrand> It's a nice shortcut but IDK what it really saves us in the end.
23:23fdobridge: <gfxstrand> @marysaka Is the 17-bit immediate for adds sign-extended?
23:23fdobridge: <gfxstrand> We should have unit tests which very carefully poke at that corner case.
23:24fdobridge: <gfxstrand> The tests I had for Turing are designed for 16-bit immediates so they're not quite the tests we want for Fermi
23:24fdobridge: <marysaka> yeah
23:24fdobridge: <marysaka> totally forgot to make those :AkkoDerp:
23:25fdobridge: <gfxstrand> Yeah, we should have tests for that. We should probably also have an optimization for ADD and SUB with an immediate which uses the ADD_IMM version whenever possible.
23:26fdobridge: <gfxstrand> That'll help with register pressure.
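The corner case discussed above (a sign-extended 17-bit immediate, and picking the ADD_IMM form only when the constant fits) can be sketched in plain C. The helper names `sign_extend_17()` and `fits_in_simm17()` are hypothetical and are not part of the MME builder; the sketch only assumes that the immediate occupies the low 17 bits and is two's complement.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: interpret the low 17 bits of `raw` as a signed
 * two's-complement value (bit 16 is the sign bit). */
static inline int32_t
sign_extend_17(uint32_t raw)
{
   const uint32_t sign_bit = 1u << 16;
   raw &= 0x1ffffu;                       /* keep only the 17-bit field */
   return (int32_t)(raw ^ sign_bit) - (int32_t)sign_bit;
}

/* Hypothetical helper: true if `value` is representable as a signed
 * 17-bit immediate, i.e. lies in [-65536, 65535]. */
static inline bool
fits_in_simm17(int32_t value)
{
   return value >= -(1 << 16) && value <= (1 << 16) - 1;
}
```

Unit tests for this would want to poke at exactly the boundaries: 0xffff (largest positive), 0x10000 (most negative), 0x1ffff (minus one), and the first values on either side that no longer fit.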
23:26fdobridge: <gfxstrand> I'm also contemplating whether or not we want real RA for Fermi
23:27fdobridge: <gfxstrand> We've gotten by without it for Turing by calling `mme_free_reg()` here and there but Fermi is constrained enough it's getting painful.
23:28fdobridge: <gfxstrand> really doesn't want to suggest anyone write an optimizer. 😅
23:29fdobridge: <gfxstrand> I think I'm going to sign off for the night. I've got an existence proof that Fermi and Turing can be smashed together well enough.
23:30fdobridge: <gfxstrand> Now need to think about the implications this all has on the driver...
23:48fdobridge: <marysaka> :sweating: