19:12anholt: skeggsb: 5.17-rc6 seems to be hitting a NULL deref at boot from the falcon reset sequence. I think gm20b_pmu is missing the .reset field. Should that be gf100_pmu_reset like gm200.c's?
19:13anholt: did manage to probe nouveau on my jetson once with that, but not twice.
19:16anholt: (3/5 reboots got "nouveau 57000000.gpu: bus: MMIO read of 00000000 FAULT at 137000", then everything falls apart)
23:14anholt: https://gitlab.freedesktop.org/anholt/mesa/-/jobs/19299576 progress!
23:43imirkin: wow, that is mucho flake
23:50anholt: yeah, https://gitlab.freedesktop.org/anholt/mesa/-/blob/cf6a308f4659f9943b1c70b043fbfdcd3ea78298/src/gallium/drivers/nouveau/ci/nouveau-nv12b-flakes.txt is what I came up with, but that run says there's even more
23:50imirkin: anholt: i think this can be shortened to "*"
23:51anholt: I think it's "tf, atomics, and tess"
23:51imirkin: there's no great reason for any of those to be flaky btw
23:51imirkin: *esp* not tess
23:51imirkin: tess flaky suggests that we failed at something rather basic on maxwell
23:51imirkin: which wouldn't be ENTIRELY surprising
23:51imirkin: but still slightly
23:51imirkin: we do things in a different way than blob
23:52imirkin: but it did seem like our way worked fine
23:52imirkin: at least in some tests
23:52anholt: it's possible that something in gles31 is hanging the gpu and innocent tess tests lose out
23:52imirkin: there could also be any amount of incorrect init on GM20B
23:52imirkin: since only the nvidia guys cared about that one to any extent
23:52anholt: but... nope. https://gitlab.freedesktop.org/anholt/mesa/-/jobs/19299576#L9015 right as we start doing gles31 tests, with no dmesg complaints.
23:53imirkin: 22-02-28 23:52:34 ERROR - dEQP error: nouveau: kernel rejected pushbuf: Device or resource busy
23:53imirkin: 22-02-28 23:52:34 ERROR - dEQP error: nouveau: ch5: krec 0 pushes 511 bufs 29 relocs 0
23:53imirkin: that's not completely great
23:54imirkin: good, just not great.
23:54anholt: ah! there. I missed it in scrolling
23:54imirkin: fwiw there's a limit of 512 pushes per submit
23:54imirkin: iirc completely artificial
23:54imirkin: but it does force a submit at that time
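(Editor's sketch, not from the log: a toy model of the behaviour imirkin describes, where a cap on pushes per submit forces a flush once the batch is full. Only the number 512 comes from the chat. `PushBatch`, `submit_sizes`, and the callback are hypothetical names for illustration and are not the libdrm or kernel API.)

```python
MAX_PUSHES = 512  # cap quoted in the chat; everything else here is hypothetical


class PushBatch:
    """Toy model of a submit that holds at most MAX_PUSHES entries."""

    def __init__(self, submit):
        self._submit = submit  # callback standing in for the real submit path
        self._pending = []

    def add(self, push):
        # If the batch is already full, flush it before queuing the new entry.
        if len(self._pending) == MAX_PUSHES:
            self.flush()
        self._pending.append(push)

    def flush(self):
        # Hand the queued entries to the callback, then start a fresh batch.
        if self._pending:
            self._submit(list(self._pending))
            self._pending.clear()


def submit_sizes(n_pushes):
    """Return the size of each submit produced by queuing n_pushes entries."""
    sizes = []
    batch = PushBatch(lambda pushes: sizes.append(len(pushes)))
    for i in range(n_pushes):
        batch.add(i)
    batch.flush()  # final explicit flush for whatever is left over
    return sizes
```

(In this model, queuing 513 entries yields two submits, one of 512 and one of 1, which is the "forces a submit" effect mentioned above.)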
23:54anholt: oh, that's much after line 9015, though.
23:55imirkin: i think you're still doing multi-threading somewhere?
23:55anholt: multi process, not multi thread
23:55imirkin: i'd advise against that
23:56anholt: single process didn't help my flakiness on other nv, but we could try it here too
23:56imirkin: hmmm ok
23:56imirkin: "other" = ?
23:56anholt: nv30 and 50
23:56imirkin: if so that's a whole level of different
23:56imirkin: nv50 - that's surprising that it was _flaky_
23:56imirkin: faily - sure.
23:56imirkin: (esp with single-process)
23:56anholt: well. I mean that I never managed to get 30 or 50 to reliably run even single tests with the boards I have.
23:57anholt: 30 I spent the most time trying on
23:57imirkin: hm. well, your experience is much different than mine
23:57imirkin: that said, i have a slightly older kernel
23:57imirkin: so perhaps that contributes
23:57imirkin: i'm on 5.6
23:57imirkin: nv30 is definitely crashy
23:57imirkin: iirc i had to add a lot of -x's to piglit for that one to not die on a run
23:58anholt: I'm sure, especially without https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12241
23:58imirkin: that's for nv4x
23:58imirkin: nv30 doesn't suffer from such deficiencies
23:58imirkin: (due to not supporting loops ... heh)
23:59imirkin: (but yeah, same driver. not sure which gpu you were targeting)
23:59anholt: nv47 was what I was on
23:59imirkin: oh ok
23:59anholt: my other was nva8, which I think you said was cursed