08:13pmoreau: imirkin: I did not, but I will look at that as soon as I've had some breakfast. :-)
10:10pmoreau: imirkin: “nv50/ir: emit store unlock separately from regular store” seems to be the first commit that makes it work. It was a bit tricky to figure out, as even with previous commits it sometimes worked.
10:14pmoreau: Like initially when checking out earlier commits, everything worked up to and including “WIP nv50: expose compute”.
20:19imirkin: pmoreau: oh, that's probably because the memory was already "right"
20:19imirkin: i've seen that sort of thing happen
20:19imirkin: that's very surprising though
20:19imirkin: i assumed that one would NOT matter
20:20imirkin: pmoreau: i'm going to give you some other stuff to test
20:51pmoreau: imirkin: Sure, I’ll try those tomorrow.
20:51imirkin: pmoreau: in general when that happens, i find it helpful to run a bunch of other unrelated tests
20:52pmoreau: Yeah, I tried running some glxgears or waiting some time in between, and that helped.
20:52imirkin: it's hard to know WHEN it's happening though
20:52imirkin: esp when you're doing a bisect :)
20:53imirkin: in this case it's only 4 things to test so not the end of the world
20:54imirkin: oh wait
20:54imirkin: that commit does more than i thought
20:54imirkin: i'm going to split it
20:54imirkin: there's a part that i think matters and a part that doesn't
20:56imirkin: that commit does "don't use predication for st" and "do st twice"
20:57imirkin: pretty sure the first bit matters, second bit doesn't
20:57imirkin: mwk: hey, what's your impression of predication + stores to g and s? should it work? or not?
21:00mwk: ... don't know
21:00mwk: I mean there's no obvious reason not to
21:01mwk: if anything, I'd be more confident in s stores working than g though
21:12imirkin: mwk: hm ok. would that opinion change for "st unlock"?
21:12mwk: well that's an interesting one
21:12mwk: I'm... not sure it's even possible to have more than one lock at a time?
21:12imirkin: it's for the loop
21:12mwk: so only one thread should be active when doing the unlock in the first place
21:13imirkin: $c = ld lock
21:13imirkin: (lt $c) st unlock
21:13mwk: did you see nv using it or not?
21:13imirkin: nv did not
21:13mwk: if not... I'd be wary, write some testcases
21:13imirkin: nv also did some weird shit
21:13imirkin: like they re-loaded the value after the ld lock
21:13imirkin: and then stored the value separately, and then st unlock
21:14mwk: that'd be a good argument for replicating that weird shit unless you're *really* certain you know exactly how the hardware works
21:14mwk: preferably by having single-stepped some code samples and looking at the exact register state at every instruction
21:14imirkin: well, evidence seems to suggest that the double load is unnecessary
21:14imirkin: mwk: do you have a single-stepper in that hwtest thing somewhere?
21:15mwk: I never got to make any proper infrastructure for that, unfortunately
21:20imirkin: hm too bad
22:44unlord: I've got a RIVA TNT2, can someone tell me what I'm doing wrong from this Xorg.0.log? https://paste.debian.net/hidden/3dc7cbe2/
22:47imirkin: everything seems fine
22:47imirkin: and i just tested on a TNT2 literally yesterday
22:47imirkin: can you pastebin dmesg as well?
22:47RSpliet: what's -19 again?
22:48imirkin: feels like a permissions issue actually
22:48damo22: you're not in the video group?
22:49imirkin: mwk: would you expect that to be difficult? iirc it's moderately straightforward on nv50 to step through a shader
22:49imirkin: (in the hwtest infra, not in general where there are additional concerns, like identifying the shader invocation you care about/etc)
22:49unlord: imirkin: I can
22:49RSpliet: damo22, unlord, imirkin: well, it's "No such device" (ENODEV). Perhaps the nouveau kernel module isn't loaded?
22:50unlord: # modprobe nouveau
22:50unlord: modprobe: FATAL: Module nouveau not found in directory /lib/modules/5.4.80-gentoo-r1
22:50RSpliet: that's a problem
22:50imirkin: not if it's built in
22:50imirkin: but i'm assuming it's not :)
22:50RSpliet: it _is_ gentoo!
22:51imirkin: unlord: with TNT2 you can also use the xf86-video-nv driver, which will drive the GPU directly and you don't need nouveau for that
22:51imirkin: (and you don't lose features like dual-head/etc ... because TNT2 didn't have 'em)
22:51RSpliet: isn't that deprecated?
22:51imirkin: RSpliet: depends who you ask
22:51RSpliet: as in, does it still exist?
22:51imirkin: if you ask the right person, everything's deprecated
22:51imirkin: RSpliet: sure
22:52unlord: it exists on my system
22:52RSpliet: also, doesn't vieux do like GL 1.2 for that or sth? :-P
22:52imirkin: yeah, vieux exposes GL. but quite frankly it sucks on nv4/nv5
22:52imirkin: they really don't support a lot of features
22:53imirkin: it's all swtnl, but on top of that, the "frag" pipeline is insufficient to cover a lot of GL 1.x stuff
22:53imirkin: it basically just allows you to render textured triangles
22:53imirkin: with very basic blending features
22:54imirkin: for some reason even the "lodbias" demo in mesa-demos fails miserably on it. i didn't figure out why, but it's been that way since at least mesa 20.3
22:54imirkin: so i didn't go back further in time :)
22:54imirkin: (or maybe mesa 20.0, i forget which one i tried)
22:55mwk: imirkin: there are still unidentified pieces of state
22:55mwk: and I never quite figured out how the single-step state machine works
22:56imirkin: hrmph ok
22:56imirkin: that means i don't have a chance.
22:59mwk: I mean, it should be possible to get something up
22:59mwk: just... be aware the state is not *quite* understood
23:11RSpliet: I'm always amazed by how mwk remembers details that even the NVIDIA engineers who built and brought up the thing can't recall :-P
23:21imirkin: mwk: there are docs on it, right? i even saw fermi docs at some point, but that's obv quite different
23:43mwk: and it's completely different
23:44mwk: like, g80 has no concept of a trap handler
23:44mwk: if you hit a breakpoint or single-step, the MP just halts, you get a trap [I think?] and you get to directly poke the state
23:46mwk: first order of business is actually enabling debugging (if you don't, debug trap is just a normal "kill the shader" event), which IIRC goes through the same weird indirect register space as actually examining the state
23:47mwk: then you either enable single-step mode before starting the shader (again, by poking that space), or stuff breakpoint insns into your code
23:47mwk: but I can't recommend stuffing breakpoints, because that also involves removing them later when you actually hit them, and the process is ridiculously messy from what I recall
23:49unlord: So I installed the xf86-video-nv but modprobe gives the same error
23:49unlord: # modprobe nv
23:49unlord: modprobe: FATAL: Module nv not found in directory /lib/modules/5.4.80-gentoo-r1
23:50unlord: the file is here: /usr/lib/xorg/modules/drivers/nv_drv.so
23:56RSpliet: unlord: the nv driver is 100% userspace. You don't have to modprobe it, just tell X.org to use "nv"
23:56RSpliet: This may take some xorg.conf hackery
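[Note: a minimal sketch of the xorg.conf change RSpliet is describing, assuming the stock xf86-video-nv driver is installed as nv_drv.so (as unlord reports above). The Identifier string "TNT2" is an arbitrary label chosen for illustration, and whether any further options are needed on this particular system is not known from the log.]

```
# Place in /etc/X11/xorg.conf, or in a file under /etc/X11/xorg.conf.d/
Section "Device"
    # Identifier is a free-form label; "TNT2" is only an example name
    Identifier "TNT2"
    # Selects the userspace nv driver (nv_drv.so); no kernel module is involved
    Driver     "nv"
EndSection
```

[With only a Device section present, the X server normally autodetects the remaining screen and monitor settings; restarting X and checking Xorg.0.log should show whether the nv driver was picked up.]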