08:13 pmoreau: imirkin: I did not, but I will look at that as soon as I've had some breakfast. :-)
10:10 pmoreau: imirkin: “nv50/ir: emit store unlock separately from regular store” seems to be the first commit that makes it work. It was a bit tricky to figure it out, as even with previous commits it sometimes worked.
10:14 pmoreau: Like initially when checking out earlier commits, everything worked up to and including “WIP nv50: expose compute”.
20:19 imirkin: pmoreau: oh, that's probably because the memory was already "right"
20:19 imirkin: dunno
20:19 imirkin: i've seen that sort of thing happen
20:19 imirkin: that's very surprising though
20:19 imirkin: i assumed that one would NOT matter
20:20 imirkin: pmoreau: i'm going to give you some other stuff to test
20:51 pmoreau: imirkin: Sure, I’ll try those tomorrow.
20:51 imirkin: pmoreau: in general when that happens, i find it helpful to run a bunch of other unrelated tests
20:52 pmoreau: Yeah, I tried running some glxgears or waiting some time in between, and that helped.
20:52 imirkin: it's hard to know WHEN it's happening though
20:52 imirkin: esp when you're doing a bisect :)
20:53 imirkin: in this case it's only 4 things to test so not the end of the world
20:54 imirkin: oh wait
20:54 imirkin: that commit does more than i thought
20:54 imirkin: i'm going to split it
20:54 imirkin: there's a part that i think matters and a part that doesn't
20:56 imirkin: that commit does "don't use predication for st" and "do st twice"
20:57 imirkin: pretty sure the first bit matters, second bit doesn't
20:57 imirkin: mwk: hey, what's your impression of predication + stores to g[] and s[]? should it work? or not?
21:00 mwk: ... don't know
21:00 mwk: I mean there's no obvious reason not to
21:01 mwk: if anything, I'd be more confident in s[] stores working than g[] though
21:12 imirkin: mwk: hm ok. would that opinion change for "st unlock"?
21:12 mwk: well that's an interesting one
21:12 mwk: I'm... not sure it's even possible to have more than one lock at a time?
21:12 imirkin: well
21:12 imirkin: it's for the loop
21:12 mwk: so only one thread should be active when doing the unlock in the first place
21:13 imirkin: $c = ld lock
21:13 mwk: hmmm
21:13 imirkin: (lt $c) st unlock
21:13 mwk: did you see nv using it or not?
21:13 imirkin: nv did not
21:13 mwk: if not... I'd be wary, write some testcases
21:13 imirkin: nv also did some weird shit
21:13 imirkin: like they re-loaded the value after the ld lock
21:13 imirkin: and then stored the value separately, and then st unlock
21:14 mwk: that'd be a good argument for replicating that weird shit unless you're *really* certain you know exactly how the hardware works
21:14 mwk: preferably by having single-stepped some code samples and looking at the exact register state at every instruction
21:14 imirkin: well, evidence seems to suggest that the double load is unnecessary
21:14 imirkin: mwk: do you have a single-stepper in that hwtest thing somewhere?
21:14 mwk: no
21:15 mwk: I never got to make any proper infrastructure for that, unfortunately
21:20 imirkin: hm too bad
22:44 unlord: I've got a RIVA TNT2, can someone tell me what I'm doing wrong from this Xorg.0.log? https://paste.debian.net/hidden/3dc7cbe2/
22:47 imirkin: odd
22:47 imirkin: everything seems fine
22:47 imirkin: and i just tested on a TNT2 literally yesterday
22:47 imirkin: can you pastebin dmesg as well?
22:47 RSpliet: what's -19 again?
22:48 imirkin: feels like a permissions issue actually
22:48 RSpliet: ENODEV
22:48 damo22: youre not in video grouo?
22:48 damo22: group?
22:49 imirkin: mwk: would you expect that to be difficult? iirc it's moderately straightforward on nv50 to step through a shader
22:49 imirkin: (in the hwtest infra, not in general where there are additional concerns, like identifying the shader invocation you care about/etc)
22:49 unlord: imirkin: I can
22:49 RSpliet: damo22, unlord, imirkin: well, it's "No such file or directory". Perhaps the nouveau kernel module isn't loaded?
22:50 unlord: # modprobe nouveau
22:50 unlord: modprobe: FATAL: Module nouveau not found in directory /lib/modules/5.4.80-gentoo-r1
22:50 RSpliet: that's a problem
22:50 imirkin: not if it's built in
22:50 RSpliet: true
22:50 imirkin: but i'm assuming it's not :)
22:50 RSpliet: it _is_ gentoo!
22:51 imirkin: unlord: with TNT2 you can also use the xf86-video-nv driver, which will drive the GPU directly and you don't need nouveau for that
22:51 imirkin: (and you don't lose features like dual-head/etc ... because TNT2 didn't have 'em)
22:51 RSpliet: isn't that deprecated?
22:51 imirkin: RSpliet: depends who you ask
22:51 RSpliet: as in, does it still exist?
22:51 imirkin: if you ask the right person, everything's deprecated
22:51 imirkin: RSpliet: sure
22:52 unlord: it exists on my system
22:52 RSpliet: also, doesn't vieux do like GL 1.2 for that or sth? :-P
22:52 imirkin: https://cgit.freedesktop.org/xorg/driver/xf86-video-nv/
22:52 imirkin: yeah, vieux exposes GL. but quite frankly it sucks on nv4/nv5
22:52 imirkin: they really don't support a lot of features
22:53 imirkin: it's all swtnl, but on top of that, the "frag" pipeline is insufficient to cover a lot of GL 1.x stuff
22:53 imirkin: it basically just allows you to render textured triangles
22:53 imirkin: with very basic blending features
22:54 imirkin: for some reason even the "lodbias" demo in mesa-demos fails miserably on it. i didn't figure out why, but it's been that way since at least mesa 20.3
22:54 imirkin: so i didn't go back further in time :)
22:54 imirkin: (or maybe mesa 20.0, i forget which one i tried)
22:55 mwk: imirkin: there are still unidentified pieces of state
22:55 mwk: and I never quite figured out how the single-step state machine works
22:56 imirkin: hrmph ok
22:56 imirkin: that means i don't have a chance.
22:59 mwk: I mean, it should be possible to get something up
22:59 mwk: just... be aware the state is not *quite* understood
23:11 RSpliet: I'm always amazed by how mwk remembers details that even the NVIDIA engineers who built and brought up the thing can't recall :-P
23:21 imirkin: mwk: there are docs on it, right? i even saw fermi docs at some point, but that's obv quite different
23:43 mwk: no
23:43 mwk: and it's completely different
23:44 mwk: like, g80 has no concept of a trap handler
23:44 mwk: if you hit a breakpoint or single-step, the MP just halts, you get a trap [I think?] and you get to directly poke the state
23:46 mwk: first order of business is actually enabling debugging (if you don't, debug trap is just a normal "kill the shader" event), which IIRC goes through the same weird indirect register space as actually examining the state
23:47 mwk: then you either enable single-step mode before starting the shader (again, by poking that space), or stuff breakpoint insns into your code
23:47 mwk: but I can't recommend stuffing breakpoints, because that also involves removing them later when you actually hit them, and the process is ridiculously messy from what I recall
23:49 unlord: So I installed the xf86-video-nv but modprobe gives the same error
23:49 unlord: # modprobe nv
23:49 unlord: modprobe: FATAL: Module nv not found in directory /lib/modules/5.4.80-gentoo-r1
23:50 unlord: the file is here: /usr/lib/xorg/modules/drivers/nv_drv.so
23:56 RSpliet: unlord: the nv driver is 100% userspace. You don't have to modprobe it, just tell X.org to use "nv"
23:56 RSpliet: This may take some xorg.conf hackery
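
[Editor's note: a minimal sketch of the xorg.conf change RSpliet describes, assuming the xf86-video-nv driver is installed as /usr/lib/xorg/modules/drivers/nv_drv.so as shown above. The Identifier string and the file name /etc/X11/xorg.conf.d/20-nv.conf are illustrative choices, not taken from the conversation; the same section can also go directly into /etc/X11/xorg.conf.]

```
# Hypothetical file: /etc/X11/xorg.conf.d/20-nv.conf
# Selects the userspace "nv" DDX; no kernel module is loaded for it.
Section "Device"
    Identifier "TNT2"
    Driver     "nv"
EndSection
```

[After restarting X, Xorg.0.log should show which driver was loaded for the card.]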