17:36anholt: karolherbst: there, I put up the NIR backend MR for you.
17:37karolherbst: the stats are massive
17:37karolherbst: but I think reduction in local might be just a side effect
17:39karolherbst: anholt: it seems like we do something wrong for GS shaders on nv50
17:40karolherbst: huh edgeflag ...
17:44anholt: I think I'm making some sense of the primid fail
17:45karolherbst: I can take a look at all those nvc0+ regressions, as they should be fixed asap (because volta+)
17:49karolherbst: anholt: I am a bit unhappy about 15597 as we are doing the wrong thing with tgsi as well
17:49karolherbst: but I guess we can do the same wrong thing with nir
18:00anholt: nv50 alphatest fixed.
18:01karolherbst: anholt: more nir lowering is the correct answer anyway :P
18:15anholt: nv50 primid sorted. now to see if this nvc0 fix worked.
18:17karolherbst: anholt: what was wrong with primid?
18:17anholt: it's an sv now, gotta look in the sv list for it.
18:18karolherbst: makes sense
18:20karolherbst: last time I checked using nir by default, the GPR usage increased by 40% for pixmark_piano, but speed improved by 10%
18:24anholt: cuts instructions by 10%, so yeah.
18:24karolherbst: well more gprs means less threads
18:25karolherbst: but all those loop based opts are a huge win
18:25karolherbst: so I wouldn't be surprised that now it's even more perf
18:25anholt: sure, but if you're basically not doing memory access then your threadcount doesn't matter much.
18:26karolherbst: I meant active threads
18:27karolherbst: I would be surprised if running more threads at once doesn't make a difference
18:27anholt: I'm assuming your threadcount is the "number of shaders logically active on the shader core where you thread switch between them on stalls"
18:27anholt: is it not that?
18:27karolherbst: I meant like real threads
18:28karolherbst: it's... a bit complicated, but we have a "logical threads being there" and a "threads actually running at the same time" thing
18:28karolherbst: and used GPRs have an impact on the latter
18:30agneli: guys, maybe one of you would have some time and will to look into my issue, please? here is a copy of what I previously pasted in the channel: https://paste.debian.net/plain/1237981
18:31karolherbst: anholt: but there is a little bit more to it.. so if we fall below a certain threshold (I think 32 regs) it doesn't matter for real
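
The point about GPR usage limiting the "threads actually running at the same time" can be illustrated with a rough occupancy calculation. This is only a sketch: the function name `resident_warps` and the default hardware numbers (a 65536-entry register file, 32 threads per warp, a cap of 64 resident warps) are illustrative assumptions, not values stated in the conversation, and real hardware also applies allocation granularity and other limits that are ignored here.

```python
def resident_warps(regs_per_thread, regfile_regs=65536, warp_size=32, max_warps=64):
    """Estimate how many warps can be resident at once, limited by registers.

    All defaults are assumed, illustrative hardware parameters.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    # Each warp needs regs_per_thread registers for every one of its threads.
    limit_from_registers = regfile_regs // (regs_per_thread * warp_size)
    # The scheduler cap applies even when registers are plentiful.
    return min(max_warps, limit_from_registers)


if __name__ == "__main__":
    for regs in (16, 32, 45, 64):
        print(regs, "regs/thread ->", resident_warps(regs), "warps")
```

Under these assumed numbers, anything at or below 32 registers per thread already reaches the scheduler cap, so reducing register use further changes nothing, while going above 32 starts to cut the number of resident warps.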
21:16karolherbst: anholt: the diff in fails looks very good now :)