00:33imirkin: pmoreau: btw, probably just have to purge on store. for a load, just skip it (without purging)
01:40DMJC: I have a 460gtx SLI setup I don't use
01:41DMJC: are there any nouveau developers in Australia that would be interested in hacking SLI with these?
06:15ask6155: hello
06:16ask6155: when running minecraft with mods I get this error: java: ../libdrm-2.4.105/nouveau/pushbuf.c:723: nouveau_pushbuf_data: Assertion `kref' failed.
06:16ask6155: this crashes minecraft instantly
06:16ask6155: help?
06:16imirkin: ask6155: iirc minecraft is multithreaded, which upsets some things.
06:17ask6155: is there a workaround or a fix?
06:18imirkin: i believe karolherbst has been working on a "fix" for multithreading issues. not 100% sure that this would help your particular problem
06:18imirkin: you'd need to build mesa from source to try it
06:19ask6155: so the 'fix' hasn't been released but has been committed to mesa?
06:19ask6155: how hard is it to build mesa?
06:19imirkin: ask6155: no, the fix is being worked on
06:20imirkin: but you can grab the patches and build your own copy
06:20ask6155: aah
06:20imirkin: (this is an issue which has existed since the dawn of time, so it's not exactly a new thing, nor particularly easy to fix)
06:20ask6155: how hard is it to build mesa? I can compile programs from source but I don't quite understand what mesa is so...
06:21imirkin: i mean ... pretty easy - run a few commands, have a good time.
06:21imirkin: not sure what your level is
06:21imirkin: so hard to judge
06:21ask6155: i mean I can compile a binary in a folder and run it but is mesa like that too?
06:21imirkin: the fact that you refer to the word 'folder' makes me more worried.
06:22imirkin: mesa is a software package
06:22ask6155: I'm talking about static binaries
06:22imirkin: in this case you'd have to clone a git tree
06:22ask6155: I build them sometimes for fun
06:23ask6155: anyways is there a website with build instructions?
06:23imirkin: and then you run a few commands to build the libs which will land them in some location-of-choice, and then when running minecraft you'd point at those libs instead
06:23imirkin: probably
06:23imirkin: i hear google is good at finding things on the internet
06:23ask6155: ah ok
06:24imirkin: afaik this is the branch you want: https://gitlab.freedesktop.org/karolherbst/mesa/-/commits/nvc0_threading
06:25imirkin: (actually ... stupid question ... what GPU do you have?)
06:25ask6155: GT 730
06:25imirkin: yeah ok, that should be fine
06:26imirkin: man, what is it with everyone getting GT 730's?
06:26imirkin: you're like the third person with a GT 730 in the past few days
06:26ask6155: I dunno why others are getting it but I got it because I'm stupid
06:27ask6155: should've spent my money on an amd cpu with the same amount of gpu performance...
06:29imirkin: :)
06:30imirkin: certainly what i would have recommended, yes
06:30imirkin: is it at least the GK208 GT 730? or is it a GF108?
06:30ask6155: GK208
06:32imirkin: ok, that's good at least
06:32ask6155: okay got the source, switched to nvc0_threading
06:32imirkin: you get reclocking. so you can crash faster :)
06:36ask6155: i did meson ..
06:36ask6155: how do I compile them to a different folder and not overwrite my own libs?
06:36imirkin: --prefix=/home/ask6155/install
06:39imirkin: and then run minecraft with LD_LIBRARY_PATH=/home/ask6155/install/lib64
06:40ccr: you may also want to set LIBGL_DRIVERS_PATH
06:42imirkin: nope
06:42imirkin: don't set that
06:42imirkin: ever
06:43ccr: hmm
06:43ccr: I've had to set it
06:43ask6155: this might take some time
06:44imirkin: ccr: nope
06:44imirkin: setting it just confuses things greatly
06:44imirkin: don't set it
06:44imirkin: ever.
06:45ccr: okay. perhaps it's an artifact from Mesa's past, but I'm 100% sure that I've had to set it for test builds (that I install under /opt)
06:46ccr: because things didn't work without it
06:47ccr: imirkin, you are correct that I don't seem to need it with current Mesa at least. hrm.
06:47imirkin: ccr: that's because you didn't install libGL
06:47imirkin: and were trying to point the dir to a non-install directory
06:47imirkin: don't do that.
06:47imirkin: that's the only case where that thing is useful
06:47imirkin: but don't do that
06:49ccr: heh. well, I wish I knew what I did wrong back then .. how could I manage to not install libGL?
06:49ccr: anyway, sorry for confusing things.
06:50ccr: oh, wait, I think I know what happened. it was probably some lib/ vs lib64/ thing.
06:50imirkin: the LIBGL_* thing is only needed when you're doing "weird shit". if you're just installing + using that installed thing, that's not weird :)
06:56ccr: Weird Science
07:10ask6155: it is... done
07:11ask6155: inside install there is lib share and include
07:12ask6155: there is no lib64
07:14ccr: just use LD_LIBRARY_PATH=/home/ask6155/install/lib then. lib/ should contain libGL etc
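Whether an install prefix ends up with lib/ or lib64/ varies between distros and build setups, so it can help to check which one actually holds libGL before setting LD_LIBRARY_PATH. A minimal sketch follows; the helper name `find_gl_libdir` and the list of candidate subdirectories are assumptions made for illustration, not anything Mesa provides.

```python
import glob
import os

# Hypothetical helper: the candidate list is an assumption, extend it as needed.
def find_gl_libdir(prefix, candidates=("lib64", "lib", os.path.join("lib", "x86_64-linux-gnu"))):
    """Return the first directory under prefix containing a libGL.so*, or None."""
    for sub in candidates:
        libdir = os.path.join(prefix, sub)
        # Matches libGL.so, libGL.so.1, libGL.so.1.2.0, ...
        if glob.glob(os.path.join(libdir, "libGL.so*")):
            return libdir
    return None
```

Calling `find_gl_libdir("/home/ask6155/install")` would then return the directory to put in LD_LIBRARY_PATH, or None if libGL was not installed under that prefix at all.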
07:14ask6155: I'm using multimc and as a wrapper command I'm adding `LD_LIBRARY_PATH=/home/ask6155/install/lib`
07:15ask6155: nope that failed
07:15ccr: failed how?
07:16ask6155: well the wrapper thing failed
07:16ask6155: maybe I have to pass it as a java arg?
07:16ccr: no
07:16ccr: just "LD_LIBRARY_PATH=/home/ask6155/install/lib multimc"
07:16imirkin: lib4
07:16imirkin: lib64
07:16ask6155: yeah but that is for multimc
07:17ccr: [10:12] <ask6155> there is no lib64
07:17imirkin: weird
07:17ask6155: multimc launches minecraft from within
07:17ccr: well, usually the environment variables get inherited
07:18ccr: you could try (assuming bash or similar): export LD_LIBRARY_PATH=/home/ask6155/install/lib
07:18ask6155: ok I'll try that then
07:18ccr: and after that run whatever
07:23imirkin: if multimc launches minecraft, you may have to teach it about the extra env var
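The point about inheritance can be checked with a small experiment: set LD_LIBRARY_PATH for a child process and see whether a grandchild still sees it. This is only a sketch; `env_with_libdir` and `grandchild_ld_path` are hypothetical names, and a launcher that deliberately builds its own environment for the game would not behave like the plain `subprocess` call used here.

```python
import os
import subprocess
import sys

def env_with_libdir(libdir, base_env=None):
    """Copy an environment and prepend libdir to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    old = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = libdir if not old else libdir + os.pathsep + old
    return env

def grandchild_ld_path(libdir):
    """Spawn a child that spawns a grandchild; return the grandchild's LD_LIBRARY_PATH."""
    inner = "import os; print(os.environ.get('LD_LIBRARY_PATH', ''))"
    # The child passes no env= argument, so the grandchild inherits its environment.
    outer = ("import subprocess, sys; "
             "subprocess.run([sys.executable, '-c', sys.argv[1]], check=True)")
    result = subprocess.run([sys.executable, "-c", outer, inner],
                            env=env_with_libdir(libdir),
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

If the grandchild reports the directory as the first entry, plain inheritance works and any failure is more likely down to how the launcher starts the game.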
07:30ask6155: well I had to export LD_LIBRARY_PATH in my profile and something different has happened
07:30ask6155: it had a different crash
07:30imirkin: yay!
07:30ask6155: it crashed with sigsegv
07:31ccr: it's progress, at least :P
07:31ask6155: problematic frame: nouveau_dri.so+0x981b2d
07:32ask6155: is the coredump of any use?
07:35imirkin: ideally you could explain your exact software to karolherbst and he can take a look when he gets up
07:35imirkin: speaking of
07:35imirkin: it's time i went to sleep
07:36imirkin: good luck
07:37ask6155: well I'm out of time now... I'll come back tomorrow when he'll be awake I guess
10:51karolherbst: imirkin: minecraft does use multithreading
10:52karolherbst: but the code isn't complete so asserts might be hit
10:57karolherbst: imirkin: btw, still want to land those patches already: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8765
12:39RSpliet: What in the hell... nouveau just went from rock solid to crash-within-minutes, without any updates today except usbredir.
12:39RSpliet: I'll post some logs later tonight.
12:52karolherbst: RSpliet: sometimes the position of the moon affects things
12:55RSpliet: Ah, in that case let me just nuke the moon. That'll solve things!
12:56RSpliet: Nobody'll miss it, right?
13:03karolherbst: I am sure it would be fine
13:07ccr: oh noes, my secret evil moonbase!
15:09RSpliet: https://paste.centos.org/view/2a1ea77d
15:10RSpliet: I think web_content is firefox... but with everything being an electron app these days I'm not holding my hand in the fire for that one
15:12ccr: yeah, 'Web Content' is a FF thread typically
15:12ccr: Electron apps are based on Chrome/Chromium
15:13RSpliet: https://paste.centos.org/view/cd6a3f81
15:13RSpliet: Then it was my entire Wayland session disappearing, nothing happening but a scroll-up in Pidgin due to a new msg
16:22karolherbst: RSpliet: not sure if FF is threaded these days
16:22karolherbst: but yeah...
16:23karolherbst: our recovery from broken channels is... bad
16:23karolherbst: I want to fix it after I fixed multithreading
16:24pmoreau: How hard could it be to store a constant at a constant address in shared, read from that fixed location, and store it in global (for char3, given that char2 works in more complex scenarios)? Apparently a lot harder than I would think… 😅
16:59pmoreau: Sneaky ConstantFolding for SPLIT transforming `mov u32 %r40 0x00000001; split u32 { %r41s %r42s } %r40` into `mov u32 %r41s 0x00000001` because it had TYPE_U32 hardcoded in it; the storing/loading/storing of constants now works.
17:00imirkin: pmoreau: what's the problem with that?
17:00imirkin: oh, type = u32
17:00pmoreau: :-)
17:00imirkin: i made that work a while back
17:00imirkin: mostly to make mul's work better
17:00imirkin: s/better/with immediates/
17:01imirkin: i wasn't always super-careful with that stuff
17:01imirkin: "sorry"
17:01pmoreau: No worries
17:02imirkin: on the bright side, half-reg emission should be moderately solid now
17:02pmoreau: I realised today that char2 was working purely by mistake yesterday, since instead of giving a byte size to mkSplit, I was giving it TYPE_U16… 🤦
17:03karolherbst: imirkin: btw, I have a "fun" solution for this recursive locking thing... just not sure how "safe" it is
17:03karolherbst: https://gist.github.com/karolherbst/6a1325b3d33e3e26ae6894f7ddca883b
17:03karolherbst: skeggsb mentioned that mesa shouldn't use kick_notify for that kind of stuff
17:04karolherbst: and that change _seems_ to work just fine
17:04karolherbst: just unsure if there are weird side effects
17:04imirkin: is it "yolo"? :)
17:04imirkin: coz that's my favorite solution to any locking problem
17:05karolherbst: or if there are more places we have to
17:05karolherbst: imirkin: not quite
17:06imirkin: mmmmmmmmmmmmmmmmm
17:06karolherbst: yep...
17:06imirkin: it miiiight
17:06imirkin: it's a free country!
17:06karolherbst: I think we have to do that in a couple more cases _or_ we move that into places where we kick it explicitly
17:06karolherbst: and just ignore implicit flushes
17:07karolherbst: uhm..
17:07karolherbst: submissions
17:07imirkin: bah, can't find the simpsons clip
17:11RSpliet: karolherbst: With a bit of luck I don't need it after we get MT rolling proper :-)
17:20karolherbst: ahh heh.. yeah, that doesn't work if there is no flush
20:01pmoreau: Note to self: after two tests triggering “gr: TRAP_MP - TP0: 00000001 [LOCAL_LIMIT_READ]”, it is time to reset the GPU cause it will refuse to do anything more.
20:01karolherbst: :D
20:01karolherbst: happens
20:02imirkin: pmoreau: eh, i haven't really seen that
20:02pmoreau: What would be the best way to reset it without restarting (and hopefully without rmmod’ing nouveau, cause I need it for my iGPU to drive the screen)?
20:03imirkin: you can unbind + bind nouveau to the device
20:03imirkin: or rather, other way around
20:04pmoreau: I get a bunch of “gr: TRAP_MP_EXEC - TP 0 MP 0: 00000008 [TIMEOUT] at 0003d0 warp 5, opcode 303f01fd 6c0087c8” for any additional test I try to run, some WARN in nouveau_bo_move, etc.
20:04pmoreau: How do I bind/unbind a device to a driver? That is something new to me
20:05imirkin: cd /sys/somewhere
20:05imirkin: echo 0.1.23 > unbind
20:05imirkin: etc
20:08pmoreau: Thanks, found it: /sys/bus/pci/drivers/nouveau
20:08imirkin: now you can just ls in that dir
20:08imirkin: and the symlink names are what you can echo into unbind
20:09imirkin: and then the same value back into bind
20:09imirkin: (the pci address iirc)
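The unbind/bind sequence described above can be sketched as a small script. This is an illustration under assumptions: the helper names are made up, the address `0000:02:00.0` in the usage note is hypothetical, and writing to the real sysfs files needs root. Entries are filtered by PCI address shape because the driver directory also holds other entries (such as a `module` link) that are not devices.

```python
import os
import re

# Full PCI address: domain:bus:device.function, e.g. 0000:02:00.0
PCI_ADDR = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")

def bound_devices(driver_dir):
    """List the PCI addresses that appear as symlinks in the driver directory."""
    return sorted(
        name for name in os.listdir(driver_dir)
        if PCI_ADDR.match(name) and os.path.islink(os.path.join(driver_dir, name))
    )

def rebind(driver_dir, pci_addr):
    """Write the address to 'unbind', then the same value to 'bind'."""
    for node in ("unbind", "bind"):
        with open(os.path.join(driver_dir, node), "w") as f:
            f.write(pci_addr)
```

On a real system this would be something like `rebind("/sys/bus/pci/drivers/nouveau", "0000:02:00.0")`, after confirming with `bound_devices` that the address belongs to the GPU you mean to reset.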
20:11pmoreau: Welp, reboot it is cause looks like the laptop froze.
20:11imirkin: wait 30s or so
20:12imirkin: sometimes it times out the channels
20:12pmoreau: But should it impact the screen when it is being driven by my other GPU?
20:13imirkin: if nouveau itself locks up, yes
20:13imirkin: locking should be per-device though
20:13pmoreau: Okay, sure
20:16pmoreau: Good to know about that unbind/bind dance, assuming it doesn’t lock up every time I try to use it.
20:19pmoreau: Mmh, so I got a couple of WARN in nouveau_display_destroy, and some nullptr dereference in the following cleanup.
20:19imirkin: you sure you unbound the right one?
20:19imirkin: they all look alike ;)
20:20pmoreau: Yeah, 02 is the dGPU and 03 is the integrated.
20:20pmoreau: But I think the display sharing via the apple gmux might make things a bit funky.
20:21imirkin: ah maybe
20:21imirkin: but still unbind from nouveau shouldn't affect that? maybe?
20:21imirkin: this isn't a super-tested path
20:21imirkin: esp with the gmux thing
20:21imirkin: just be glad it loads :p
20:22pmoreau: :-D
20:23pmoreau: I did spend like 2-3 years getting that laptop to boot without locking up, soooo
20:23pmoreau: That’s how I started with Nouveau
20:26pmoreau: After the device gets removed, vgaswitcheroo gets disabled, and that is followed by some refcount underflow when trying to get the EDID; I guess vgaswitcheroo getting disabled triggers a connection probe that does not go well.
20:27pmoreau: Let’s see if I can find a way to detect when we might run into those LOCAL_LIMIT_{READ|WRITE} and disallow launching the grid.
20:28imirkin: how are you even getting those in the first place
20:28imirkin: oh... local limit
20:29imirkin: yeah, that could be worse, since it could lead to a shader infinite loop or something
20:29karolherbst: pmoreau: ahh yeah.. mmhhh
20:29imirkin: if the wrong thing gets messed up
20:29karolherbst: imirkin: well.. it's CL so those things can happen sadly :/
20:30karolherbst: imirkin: do we have a nice way of stopping all work on the GPU?
20:30karolherbst: or well.. a channel?
20:30karolherbst: I am thinking of ctrl+c where we end up waiting on something to happen
20:31imirkin: karolherbst: dunno sorry
20:31karolherbst: couldn't we just require that the kernel automatically kills the channel once we destroy the userspace side of it?
20:51pmoreau: < https://matrix.org/_matrix/media/r0/download/matrix.org/yPkdXOwpTlcTvVqLaDqURXol/message.txt >
20:51pmoreau: Why am I not getting an error that I am getting over max tls? 🤔
20:55imirkin: is there such an error?
20:55imirkin: oh, you mean a LAUNCH error?
20:56pmoreau: When doing a nv50_tls_realloc it does check against max_tls_size
20:56imirkin: iirc that never gets called
20:56pmoreau: But I think I might not be computing it properly
20:56imirkin: maybe i'm thinking of a different function
20:56pmoreau: It is, from nv50_program_upload_code
21:23karolherbst: uhh
21:24karolherbst: I might even fix that fix problem we had with the shader-db runner shim
21:24karolherbst: imirkin: soooo... imagine a fence with ref==1
21:24karolherbst: we never emit such a fence
21:36pmoreau: Oh
21:37pmoreau: Just noticed one thing: the tls buffer info is set up once in nv50_screen_compute_setup and that’s it, but when realloc is called it updates using the NV50_3D macros…
21:38imirkin: pmoreau: yeah. i didn't think it could ever change
21:38imirkin: maybe i'm thinking of something else
21:38imirkin: iirc it doesn't update the bufctx either
21:39pmoreau: It doesn’t look like it.
21:40pmoreau: Unless it’s managed by nouveau_bo_new?
21:41imirkin: maybe i was thinking of something else
21:41imirkin: there's definitely something which looked like it could get reallocated but never actually was
21:57pmoreau: Looks like this was the cause for my LOCAL_LIMIT_{READ|WRITE}; bumping the default allocation allowed me to run more tests before hitting the issue (if I didn’t make any mistake). Unfortunately, the test whether to allocate or not is `tls_space < screen->cur_tls_space`, so I was a bit surprised to see it reallocate for “16 temps” when that’s the value I used for the default…
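The surprise in the last message comes down to a strict comparison. Below is a toy model of the quoted check, not the actual nv50 code: the function names are hypothetical, and it assumes that a request for “16 temps” maps to exactly the same value as the default allocation.

```python
def needs_realloc(requested_tls_space, cur_tls_space):
    """Model of the quoted test: allocation is skipped only when requested < current."""
    skip = requested_tls_space < cur_tls_space
    return not skip

def needs_realloc_inclusive(requested_tls_space, cur_tls_space):
    """Hypothetical variant that also skips when the request equals the current size."""
    skip = requested_tls_space <= cur_tls_space
    return not skip
```

With the strict form, a request equal to the current size does not take the early-out, which would match a reallocation being seen for a request that is the same as the default.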