09:50 pmoreau: imirkin: I have sadly not; I ended up working longer than expected and got distracted by another.
09:56 karolherbst: imirkin: does this ring any bell? https://bugzilla.redhat.com/show_bug.cgi?id=1930977
10:42 RSpliet: karolherbst: this is aarch64?
10:42 RSpliet: Yeah, Jetson Nano. Surprised they declare a blocker for such a niche piece of HW
10:43 RSpliet: "niche" - perhaps a slightly premature judgement :-P
11:24 karolherbst: RSpliet: yeah well :p
12:41 imirkin: karolherbst: doesn't ring a bell, sorry. how does it die? not clear from the report
13:00 karolherbst: apparently booting
13:00 karolherbst: maybe the modifier stuff broke shit? dunno
13:00 karolherbst: I am setting up my jetson nano right now
13:01 imirkin: karolherbst: that's not what i mean
13:01 imirkin: what causes the process to terminate
13:01 imirkin: i see a backtrace
13:01 imirkin: but not a reason for it dying
13:02 imirkin: like did someone attach gdb to it and type "bt"? etc
13:02 imirkin: (likely causes might be SIGSEGV, SIGBUS, etc)
13:02 imirkin: pmoreau: no rush!
13:02 imirkin: just my curiosity ;)
13:03 karolherbst: imirkin: SIGSEGV, written in the bug title
13:03 imirkin: karolherbst: lol, go me for missing that
13:04 imirkin: in fairness to me, i'm still sipping my first cup of coffee
16:56 pmoreau: imirkin: https://gitlab.freedesktop.org/pmoreau/mesa/-/snippets/1664
16:56 imirkin: boo!
16:56 imirkin: pmoreau: can you run just dEQP-GLES31.functional.compute.basic.shared_atomic_op_single_group
16:56 pmoreau: Halfway there, though! 😉
16:56 imirkin: with NV50_PROG_DEBUG=255
16:57 pmoreau: Sure!
16:57 imirkin: and also include the contents of TestResults.qpa
16:58 bencoh: 59
16:58 bencoh: woops :)
16:58 pmoreau: 42
16:59 pmoreau: I added the results to the same snippet.
16:59 imirkin: thanks
17:00 imirkin: oh. nir.
17:00 imirkin: that's untested.
17:00 imirkin: can you double-check that it fails with tgsi?
17:01 pmoreau: I trigger an assert with TGSI, in nv50_program.c
17:02 imirkin: where
17:02 pmoreau: Let me retry with commenting that code out
17:02 imirkin: what's the assert?
17:03 imirkin: oh, one you added?
17:04 pmoreau: Yeah due to relying on some NIR properties to get the combined size of all inputs, to know by how much to offset the accesses to shared.
17:04 imirkin: EMIT: shl (SUBOP:1) s32 $r1 $r0 4 (8)
17:04 imirkin: hmmm
17:04 imirkin: that seems weird. wtf is subop 1?
17:05 pmoreau: Added results for TGSI
17:05 imirkin: thanks
17:05 imirkin: also fail i assume?
17:05 pmoreau: Yes
17:08 imirkin: weird.
17:08 imirkin: i don't see anything obviously wrong =/
17:10 imirkin: pmoreau: mind trying to write a similar test in opencl
17:10 imirkin: and seeing what that compiles to?
17:10 pmoreau: I was going to try that
17:10 imirkin: could be missing something obvious
17:10 imirkin: or could be something subtle
17:10 pmoreau: Just stuck looking on the sources for that test 🙃
17:10 imirkin: like $r63 being a lot more busted than i thought
17:10 imirkin: it's in the TestResults.qpa file
17:11 pmoreau: Oh
17:11 imirkin: along with the very helpful comment of "Comparison failed for Output.values[2]"
17:11 imirkin: definitely didn't want to know what value you *did* find there ... sigh
17:14 pmoreau: Hehe
17:14 pmoreau: You just get a bad grade, and no comments as to why. 🤷
17:15 RSpliet: Lyude: silly suggestion, but are we sure that nouveau uses the 917d display class (codepaths) on kepler?
17:15 RSpliet: sorry, on *those* keplers even
17:15 Lyude: RSpliet: I was wondering the same thing actually
17:15 Lyude: it should say in the open-gpu-docs
17:16 Lyude: RSpliet: ok yeah-gk104+ is DISP021X, core channel cl917d and cursor cl917a
17:16 imirkin: RSpliet: pretty sure.
17:16 imirkin: RSpliet: otherwise they wouldn't get 128x128 either
17:16 imirkin: and 256x256 at least sorta-works
17:17 imirkin: just has rendering artifacts
17:17 Lyude: i'm a bit surprised by james' suggestion, but it definitely sounds like it's worth trying
17:17 imirkin: 4k-aligned vs 256-aligned?
17:17 imirkin: we definitely wouldn't use a large page for those
17:17 karolherbst: imirkin: https://gist.github.com/karolherbst/a8486ea170221987a1588da07b06e1ea :)
17:18 Lyude: imirkin: yes. which is weird because the cursor doesn't follow the same alignment as the rest, but if nvidia suggested it I think it's definitely worth trying. especially since the issues with the cursor as described sound like what would happen when I was doing igt stuff and set the wrong alignment for surfaces
17:18 imirkin: Lyude: could be that it's due to crossing pages
17:18 Lyude: oooooh, good point
17:19 imirkin: Lyude: although ... even 64x64(*4) = 16k
17:19 imirkin: so it'd already be crossing pages
17:19 imirkin: ooooh
17:19 imirkin: 256*256*4 = 256k
17:19 imirkin: so it COULD be getting a large page (64/128k)
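[Aside: the size arithmetic above as a minimal C sketch. It assumes 4 bytes per pixel, as in the 64x64(*4) and 256*256*4 figures, and takes the 4k and 64k page sizes from the discussion. The helper names are made up for illustration and are not nouveau code.]

```c
#include <stddef.h>

/* Bytes needed for a square cursor at 4 bytes per pixel. */
static size_t cursor_bytes(size_t side)
{
    return side * side * 4;
}

/* Number of pages of the given size needed to hold `bytes`, rounded up. */
static size_t pages_needed(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}
```

[With these, cursor_bytes(64) is 16k, which already spans four 4k pages, while cursor_bytes(256) is 256k, large enough to fill several 64k pages.]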
17:19 imirkin: hm
17:20 Lyude: tbh - I wonder if this might also explain the issues we've been seeing with small ovlys on kepler
17:20 imirkin: could be
17:21 imirkin: anyways ... priority should be to getting users going again
17:21 imirkin: and then worrying about fixing the universe
17:21 imirkin: nobody actually needs 256x256 cursors
17:21 Lyude: imirkin: yeah-that's why i'm going to try this today :)
17:21 imirkin: cool
17:21 Lyude: otherwise we'll just limit it for now and figure it out later
17:22 imirkin: sounds good
17:23 imirkin: pmoreau: karolherbst: should separately figure out where that SHL + subop 1 was coming from in the nir path
17:23 karolherbst: wrap
17:24 imirkin: why would that be getting set?
17:24 karolherbst: nir shifts do warp or do not... dunno which way it was
17:24 karolherbst: *wrap
17:24 karolherbst: see commit 59b44f90aa4d
17:25 imirkin: hm
17:25 imirkin: well the nv50 hw just does one thing
17:25 karolherbst: :/
17:25 imirkin: i don't know offhand which thing it does, should be obvious from mwk's excellent hwtests
17:25 karolherbst: so only clamped shifts?
17:25 karolherbst: because that's kind of the default
17:26 imirkin: doesn't look like it wraps
17:26 karolherbst: *sigh*
17:26 imirkin: i dunno what clamped is
17:26 karolherbst: okay, jason will love this
17:26 karolherbst: imirkin: !wrap
17:26 imirkin: is that where shifting by 100 == 0?
17:26 karolherbst: yes
17:27 imirkin: admittedly it does wrap
17:27 imirkin: just not to the argument bit width
17:27 karolherbst: ohh
17:27 karolherbst: I see
17:27 imirkin: /* const shift count */
17:27 imirkin: s2 = op1 >> 16 & 0x7f;
17:27 imirkin: \
17:27 imirkin: so there's that
17:27 imirkin: the immediate is max 0x7f
17:27 imirkin: er wait
17:27 imirkin: hold on
17:28 imirkin: exp = s1 << s2;
17:28 imirkin: but ... this is C
17:28 imirkin: do shifts wrap in C?
17:28 karolherbst: nfi
17:28 imirkin: ok. we should test it out.
17:28 imirkin: iirc they do
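[Aside: in standard C, shifting by a count greater than or equal to the width of the promoted left operand is undefined behavior, so a bare `s1 << s2` cannot be relied on to either wrap or clamp. A minimal sketch that spells out the two behaviors discussed above for a 32-bit value follows. The function names are illustrative only and do not come from Mesa or the hwtests.]

```c
#include <stdint.h>

/* "Wrap": only the low 5 bits of the count are used,
 * so a count of 100 behaves like 100 & 31 = 4. */
static uint32_t shl_wrap(uint32_t value, uint32_t count)
{
    return value << (count & 31u);
}

/* "Clamp" (!wrap): any count >= 32 shifts every bit out,
 * so shifting by 100 yields 0. */
static uint32_t shl_clamp(uint32_t value, uint32_t count)
{
    return count >= 32u ? 0u : value << count;
}
```

[Both helpers keep the actual shift count below 32, so neither relies on undefined behavior.]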
17:30 pmoreau: OpenCL compiler is unhappy about that atomic add on shared 🙃
17:30 karolherbst: pmoreau: ugh
17:30 imirkin: pmoreau: did you target to sm_12?
17:30 imirkin: (i dunno how one does that in CL)
17:30 pmoreau: Using a global pointer did not help
17:30 imirkin: ok. just use ptx i guess
17:31 imirkin: might not be hooked up in CL
17:31 karolherbst: pmoreau: using this secret API to compile from clc to PTX?
17:31 pmoreau: Yup
17:31 karolherbst: mhhh
17:31 pmoreau: Compiler is like: “(0) Error: unsupported operation”
17:31 karolherbst: forcing sm12 didn't help?
17:31 karolherbst: ahh
17:31 karolherbst: that's probably the frontend
17:31 karolherbst: yeah...
17:31 pmoreau: I need to double check what the compiler options are for it
17:31 karolherbst: set CL to 1.1?
17:32 karolherbst: -cl-nv-arch sm_12 -cl-nv-cstd=CL1.1
17:32 pmoreau: Just found it
17:35 pmoreau: Okay, that helped!
17:35 karolherbst: :)
17:39 imirkin: i was just guessing about how ld lock / st unlock worked
17:40 imirkin: but it seemed VERY similar to the nvc0 variant
17:40 imirkin: the kepler variant is slightly evolved relative to the nvc0 variant
17:40 imirkin: (on kepler st unlock can also fail)
17:40 karolherbst: the heck...
17:40 karolherbst: and how do you figure out it failed?
17:41 imirkin: predicate
17:41 pmoreau: imirkin: Added to the snippet
17:41 karolherbst: ahh
17:41 imirkin: am i missing something, or are those gitlab snippets just really hard to navigate?
17:41 imirkin: there's no way to collapse a file etc?
17:41 pmoreau: https://gitlab.freedesktop.org/pmoreau/mesa/-/snippets/1664/raw/master/SM%201.2%20dEQP-GLES31.functional.compute.basic.shared_atomic_op_single_group%20as%20OpenCL
17:42 pmoreau: Here is a direct link
17:42 karolherbst: uhhh
17:42 karolherbst: BAR.ARV.WAIT
17:42 karolherbst: :/
17:42 karolherbst: run away
17:42 imirkin: uh wut
17:42 pmoreau: Yeah, I just saw there was no way to collapse the different files… 🤦
17:42 imirkin: /*02b8*/ R2G.U32.U32 g[A1+0x0], R2; /* 0xe420878004000001 */
17:42 imirkin: /*02c0*/ R2G.U32.U32.UNL g[A1+0x0], R2; /* 0xe4a0878004000001 */
17:43 imirkin: so ... i guess you gotta write it and then write it again with the unlock?
17:43 imirkin: that ... was not immediately apparent to me
17:43 karolherbst: ohhhh
17:43 pmoreau: If at first you do not success, try again! 😉 🤣
17:43 karolherbst: interesting
17:43 pmoreau: *succeed, would be better
17:44 imirkin: and it loads it a second time too
17:44 imirkin: that's just weird.
17:44 imirkin: /*0290*/ G2R.U32.UNL.C0 o[0x7f], g [A1+0x0].U32; /* 0x4480c7c8140001fd */
17:44 imirkin: /*02a8*/ MOV R1, g [A1+0x0]; /* 0x0423c7801400c005 */
17:44 imirkin: oh, interesting
17:44 imirkin: the first time it doesn't actually have a destination
17:44 imirkin: ok yeah, so this is slightly diff than how it all works on nvc0
17:45 imirkin: pmoreau: want to fix it up yourself, or want me to patch it up? i can't right now, but can do tonight
17:45 pmoreau: I won’t have the time, other things to take care of, so go for it.
17:45 imirkin: ok. tonight then.
17:46 imirkin: thanks for testing
17:46 imirkin: pmoreau: btw, if it's easy, can you output the same thing for nvc0 (sm_20)?
17:46 imirkin: i want to see if they produce similarly weird code, or if there's more
17:46 pmoreau: Sure, one sec
17:46 imirkin: coz i have another theory - that some of those ops don't respect predicates
17:47 imirkin: which would also explain some of these problems
17:48 pmoreau: https://gitlab.freedesktop.org/pmoreau/mesa/-/snippets/1666
17:49 imirkin: yeah ok, so for nvc0 they do it the much more reasonable way
17:49 imirkin: IME when nvidia does something weird like this it isn't completely for no reason at all
17:49 imirkin: generally they know something we don't :)
17:50 pmoreau: :-)
17:50 imirkin: i'll have a couple of commits to test out various theories
17:50 imirkin: thanks!
17:50 pmoreau: Thanks for your work!
17:53 imirkin: yw. been fun to do some bringup for a change.
17:54 imirkin: and looks like we'll be able to finally fix 3d images on fermi
17:54 imirkin: not like the most useful feature in the world
17:54 imirkin: but ... still nice
17:58 karolherbst: imirkin: that jetson bug is an evil memory corruption :/
17:58 imirkin: karolherbst: i hate evil memory corruption. friendly memory corruption is so much nicer
17:58 karolherbst: imirkin: wasn't there some fix for indirect draws?
17:58 imirkin: there have been many fixes
17:58 imirkin: since the dawn of time
17:58 karolherbst: I meant like recently
17:59 imirkin: user draws got broken for a period of time
17:59 karolherbst: yeah
17:59 karolherbst: it's probably that
17:59 imirkin: but you'd have to be using a git checkout for that iirc
17:59 karolherbst: let's see
17:59 imirkin: an old one at that
17:59 karolherbst: mhh
17:59 karolherbst: I am at 21
17:59 imirkin: rc-what?
17:59 karolherbst: 5
18:00 imirkin: my fixes landed in 21.0.0-rc2
18:00 imirkin: https://cgit.freedesktop.org/mesa/mesa/commit/?h=staging/21.0&id=9fc330f27ba9a3c906f1e1342ed2ca03ef08ba23
18:00 karolherbst: ahhhh
18:00 imirkin: and also https://cgit.freedesktop.org/mesa/mesa/commit/?h=staging/21.0&id=4dee39f04d025d6ae1ad631caef35b0254516bd7
18:01 karolherbst: no
18:01 karolherbst: there is new stuff :/
18:01 imirkin: and also this guy i guess: https://cgit.freedesktop.org/mesa/mesa/commit/?h=staging/21.0&id=f763d0f1952151e0fcae596e85600e7f391ea442
18:01 karolherbst: the bug report is for rc4
18:01 imirkin: but that was in the branchpoint
18:01 karolherbst: and there are more commits from mareko on that file
18:01 karolherbst: since your changes
18:01 karolherbst: seriously...
18:01 karolherbst: I bet one of them broke it
18:02 imirkin: i don't see them...
18:02 imirkin: other than obviously trivial / irrelevant ones
18:03 imirkin: (in the -rc branch)
18:03 karolherbst: imirkin: https://gitlab.freedesktop.org/mesa/mesa/-/commits/21.0/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
18:03 karolherbst: ohh wait
18:03 karolherbst: wrong... branch/tag?
18:03 karolherbst: ahh, I messed up
18:03 karolherbst: k...
18:04 imirkin: even if i pick staging/21.0 it doesn't seem to show my commits
18:04 imirkin: which is weird
18:04 imirkin: karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/commits/21.0/src/gallium/drivers/nouveau/nvc0
18:05 imirkin: and my testing included running all of deqp / cts
18:05 karolherbst: ahh yeah, there is your fix
18:05 karolherbst: mhhh
18:05 karolherbst: but the code is clearly doing garbage things
18:05 karolherbst: p indirect->buffer
18:06 karolherbst: (struct pipe_resource *) 0xffffa5b02504 <_mesa_validated_drawrangeelements+260>
18:06 karolherbst: and calls nvc0_draw_indirect with that
18:06 karolherbst: ohhhh
18:06 karolherbst: imirkin: I guess tegra_draw_vbo needs fixing too
18:06 karolherbst: :/
18:06 imirkin: there's a tegra_draw_vbo now?
18:06 karolherbst: yep
18:07 imirkin: lol yeah
18:07 karolherbst: ehhh...
18:07 karolherbst: marek touched that as well
18:07 karolherbst: uhhh
18:07 karolherbst: let's see
18:07 imirkin: doesn't seem obviously wrong
18:07 imirkin: but also doesn't seem obviously right
18:11 karolherbst: huh...
18:12 karolherbst: HUH :O
18:12 karolherbst: imirkin: 4. ctx->Driver.DrawGallium(ctx, &info, &draw, 1);
18:12 imirkin: right ...
18:12 imirkin: that's the new super-multi-draw callback that mareko added
18:12 karolherbst: yeah
18:12 karolherbst: but it has 4 args
18:12 karolherbst: tegra_draw_vbo has 5
18:12 imirkin: it's not 1:1
18:13 imirkin: that calls DrawGallium in st/mesa
18:13 imirkin: st_vbo.c
18:13 karolherbst: ahh.. optimized builds
18:20 karolherbst: ahhh
18:20 imirkin: oh. and we're also looking at the wrong flag. super.
18:20 imirkin: pmoreau: --^
18:20 karolherbst: imirkin: this pindirect = &indirect; line looks like it breaks
18:20 imirkin: they branch on LT, we branch on equality
18:20 karolherbst: so indirect is on the stack
18:21 karolherbst: and we never write to it
18:21 karolherbst: then we pass garbage into nvc0
18:21 karolherbst: inside tegra_draw_vbo
18:21 imirkin: memcpy(&indirect, pindirect, sizeof(indirect));
18:21 karolherbst: only if the clause is true
18:21 imirkin: oh, but we should pass in null
18:21 imirkin: if it's not
18:21 karolherbst: yep
18:21 imirkin: oops
18:22 imirkin: i think you can just move the pindirect = &indirect into the if ()
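[Aside: a minimal sketch of the bug pattern and the suggested fix. Every type and function name here is a hypothetical stand-in, not the actual tegra or nvc0 code, and the `if` condition is assumed to be simply "an indirect descriptor was supplied". The point is that the pointer is only redirected to the stack copy once that copy has been filled in, so a NULL descriptor is passed through as NULL.]

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper resource that points at the wrapped driver's resource. */
struct resource { struct resource *inner; };

/* Hypothetical stand-in for the indirect-draw descriptor. */
struct indirect_info { struct resource *buffer; };

/* Hypothetical inner driver entry point; records what it was handed. */
static int inner_got_null;
static struct resource *inner_buffer;

static void inner_draw(const struct indirect_info *pindirect)
{
    inner_got_null = (pindirect == NULL);
    inner_buffer = pindirect ? pindirect->buffer : NULL;
}

static void wrapper_draw(const struct indirect_info *pindirect)
{
    struct indirect_info indirect; /* on the stack, uninitialized */

    if (pindirect) {
        /* Copy so the caller's descriptor is left untouched, then unwrap. */
        memcpy(&indirect, pindirect, sizeof(indirect));
        indirect.buffer = indirect.buffer->inner;
        /* Redirect only here, after the copy holds valid data. */
        pindirect = &indirect;
    }

    /* Without an indirect descriptor this passes NULL, not stack garbage. */
    inner_draw(pindirect);
}
```

[Doing `pindirect = &indirect;` outside the `if` would hand the inner driver a pointer to an uninitialized struct whenever no descriptor was supplied, which matches the garbage `indirect->buffer` value seen in gdb above.]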
18:22 imirkin: also
18:22 imirkin: there's a second thing
18:22 imirkin: which should be unwrapped
18:22 imirkin: while you're at it
18:22 karolherbst: yeah
18:22 imirkin: indirect->other_buffer
18:22 imirkin: i forget what it's called
18:22 imirkin: for the "count" to also be indirect
18:22 imirkin: AZDO! yeah!
18:23 karolherbst: :D
18:23 karolherbst: indirect_draw_count
18:25 imirkin: karolherbst: also you can get rid of the if (num_draws) thing at the front
18:25 imirkin: since it just passes through to another driver
18:25 karolherbst: yep
18:26 karolherbst: ohh wait
18:26 karolherbst: no, we can't
18:26 imirkin: why not
18:26 karolherbst: well... we could probably rewrite the entire function to just loop anyway
18:26 karolherbst: because it calls itself to unwrap all the stuff? no?
18:26 imirkin: but nouveau takes care of that too
18:27 karolherbst: ohhh
18:27 imirkin: and there's nothing to unwrap in the draws array
18:27 karolherbst: yeah, you are right
18:27 imirkin: really i should hook that up better in nouveau
18:27 imirkin: since we can skip on some of the validation
18:27 imirkin: something for the future.
18:28 karolherbst: for the times where perf actually matters
18:28 imirkin: AZDO!
18:28 imirkin: YEAH!
18:28 karolherbst: what do you have against AZDO? :D
18:28 imirkin: nothing
18:28 imirkin: it's good
18:28 imirkin: you're the one against it!
18:28 imirkin: some of those things are taking it a wee bit too far imho
18:29 imirkin: like the indirect count
18:29 imirkin: but hey, it was a fun macro to write
18:29 karolherbst: oh well
18:29 karolherbst: could be worse
18:31 imirkin: but reducing validation is also part of AZDO
18:31 imirkin: which is what mareko has been driving towards
18:31 karolherbst: well, reducing CPU overhead is generally a good idea anyway
18:32 karolherbst: just not a big fan of moving more load towards the GPU
18:32 imirkin: this is purely CPU overhead though
18:32 karolherbst: yeah, probably
18:32 imirkin: a bunch of things that can't change don't need to be checked over and over
18:32 karolherbst: yep
18:32 imirkin: it's not a LOT of overhead
18:32 karolherbst: I should CPU profile mesa on the jetson at some point
18:32 karolherbst: because there it actually matters
18:33 imirkin: but mareko has gotten it all so low that even little bits matter
18:33 imirkin: if one is to improve
22:40 Lyude: imirkin: so 256x256 cursors definitely do work on kepler (tested with a GK104, and I double checked to verify igt wasn't lying to me). I think the issue does have something to do with the alignment that X is using
22:41 imirkin: Lyude: modetest
22:41 imirkin: there's a patch on list to make modetest use 256x256 cursors
22:41 imirkin: (in that thread, further up)
22:46 pmoreau:sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/ZfzjJAeCDaswAUIzrGINyoCp/message.txt >
22:46 pmoreau: How is this using 4 GPRs? 🤔
22:46 imirkin: there's a min iirc
22:46 imirkin: and/or alignment
22:47 pmoreau: Oh, could be
22:47 pmoreau: And same for the local then?
22:47 imirkin: well, min for local is 0
22:48 imirkin: that means that there's some l[] usage further up
22:48 pmoreau: That’s the whole emitted program
22:48 imirkin: what gpu? that doesn't decode at all
22:48 pmoreau: SM1.1
22:49 pmoreau: It decodes with nvdisasm, but couldn’t get it to work with envydis for whatever reason.
22:49 imirkin: yeah, doesn't decode in envydis at all
22:49 pmoreau: (Also, it does run fine on actual hardware :-D)
22:49 imirkin: weird
22:49 imirkin: yeah, seems fine
22:50 imirkin: i dunno what causes the local usage
22:50 imirkin: maybe the thing being compiled has some l[] which gets optimized out
22:50 imirkin: dunno
22:51 imirkin: oh lol
22:51 imirkin: wait
22:51 imirkin: it decodes fine in envydis
22:51 imirkin: i still had -W from when i was pasting 64-bit things into it
22:51 imirkin: works much better with -w :)
22:52 imirkin: apparently we both made the exact same mistake
22:52 imirkin: good times
22:52 pmoreau: Oh!
22:52 pmoreau: Yes indeed! 🤣
22:52 imirkin: anyways, i'm in the middle of coming up with some candidate fixes for the atomic thing
22:52 imirkin: should have something in the next hour or so
22:52 imirkin: basically a series of commits
22:52 pmoreau: Cool! I am looking at those offsets for shared/inputs and cleaning/fixing that up.
22:53 pmoreau: bin.tlsSpace is what gets reported as local afterwards, right?
22:53 imirkin: i would think so
22:55 pmoreau: Okay, that is set to the value reported by NIR as scratch, and it does report 4 bytes of scratch size.
22:56 imirkin: uhm
22:56 imirkin: that should be set by codegen
22:56 imirkin: not an input
23:13 Lyude: there's our buggy cursor
23:18 Lyude: and, it's definitely not alignment related (unless I keep getting very lucky when running igt)
23:18 imirkin: got it with modetest?
23:19 imirkin: modetest sticks it into a dumb bo
23:19 Lyude: that could definitely explain it then
23:19 Lyude: imirkin: yeah I got it with modetest
23:19 imirkin: also does igt place the cursor at random places
23:19 Lyude: alignment/stride seems to be the same between the two
23:19 imirkin: or only fixed offsets?
23:19 Lyude: imirkin: it does have random testing (also like I said I double checked, e.g. I actually did look at the visual output :)
23:20 imirkin: i just mean the cursor position
23:20 Lyude: imirkin: yeah-that's what I meant, there's a test for placing the cursor in random spots
23:20 Lyude: and that one does pass
23:21 imirkin: ah ok
23:21 Lyude: I'm pretty sure we're seeing the cursor scanout from somewhere other than the fb, because when I tried modetest I was able to see the remnants of one of the cursor patterns igt drew
23:21 Lyude: so it is something related to the bo allocation
23:22 imirkin: well it's pretty easy to see what modetest does
23:22 imirkin: i.e. nothing complicated :)
23:22 Lyude: yeah, which is why I'm thinking it might be something with how nouveau handles dumb bos
23:22 imirkin: and it does work at 128x128
23:22 imirkin: allegedly.
23:22 Lyude: skeggsb: any idea how I can get nouveau to spit out as much info as possible about bo/fb allocations?
23:33 Lyude: "Use VRAM if there is any ; otherwise fallback to system memory" that's definitely not the problem but I can't help but to think that's probably wrong in nouveau_display_dumb_create() :)
23:36 imirkin: Lyude: when that system memory gets pinned, it'll end up in vram
23:36 Lyude: ah right, forgot about that
23:39 imirkin: (but maybe not, who knows)
23:39 imirkin: but hopefully you're not running out of VRAM to make a cursor bo
23:39 imirkin: that'd be a tight fit ;)
23:39 Lyude: yeah lol, that's definitely not our issue
23:39 Lyude: was just surprised to see that
23:39 imirkin: "if only i had more vram, i could fit the cursor in there"
23:39 imirkin: you know the next GPUs will have like a 4096x4096 cursor
23:40 imirkin: with FP16 alpha blend
23:40 imirkin: but packed in a surface that the GPU can't render to
23:41 Lyude: hehe, we've already got 32 planes on newer gpus :P
23:42 imirkin: maor!
23:43 imirkin: Lyude: btw, one additional diff is that modetest (by default) doesn't use atomic
23:43 imirkin: or universal planes
23:43 Lyude: imirkin: it's reproducible with atomic
23:43 imirkin: dunno what igt does
23:43 Lyude: (I almost always use -a with modetest)
23:43 imirkin: [of course neither does xf86-video-modesetting]
23:44 Lyude: I'm convinced it has something to do with how we're allocating the bo in nouveau tbh
23:44 imirkin: you could be right
23:44 imirkin: what does igt do?
23:44 Lyude: imirkin: we call down to libdrm's nouveau functions directly, which causes us to use the NOUVEAU_GEM_* ioctls
23:45 Lyude: so my assumption is there's some requirement for scanout surfaces we're not enforcing in the dumb path, but i'm not sure what
23:46 Lyude: oh-i have an idea that might tell me what
23:49 pmoreau: imirkin: I reworked the offsets for shared memory in the NIR frontend: https://gitlab.freedesktop.org/pmoreau/mesa/-/commit/a3004a82158d89c6a7ee2f50e1099f7645c06400
23:57 imirkin: pmoreau: just pushed an update
23:58 imirkin: with a bunch of commits at the end
23:58 imirkin: which try various things
23:58 imirkin: try the full branch first
23:58 imirkin: and if that works, peel it back commit by commit
23:58 imirkin: and if that doesn't work, would appreciate NV50_PROG_DEBUG=1 output (tgsi please)