04:15imirkin: skeggsb: so should TU10x be able to use the copy engine logic for screen offload?
04:16skeggsb: yes, the copy engines work, ttm uses them already
04:16imirkin: right, but now they should function, unlike earlier
04:17skeggsb: they would have already
04:17imirkin: mmmmm ... dunno
04:17imirkin: i don't think they did.
04:17imirkin: or _something_ didn't
04:17skeggsb: ttm has always used it since i pushed the code
04:17imirkin: for e.g. output slaving
04:17imirkin: i thought it was coz GR hadn't been brought up on turing
04:17skeggsb: it's a completely different part of the GPU
04:18imirkin: ok
04:18skeggsb: oh, nevermind, we block userspace from creating *any* channel if we couldn't get GR for the DRM for some reason
04:19imirkin: heh ok
04:19skeggsb: somewhat unnecessary, but, anyway
04:19imirkin: so it could have worked
04:19imirkin: just didn't
04:19skeggsb: yep
04:19imirkin: looks like you had oodles of fun with this
04:20imirkin: hopefully still works on tegra :)
04:20skeggsb: not entirely the word i'd use
04:20imirkin: lol
04:20skeggsb: i've messaged tagr asking to pretty please test it for me :P
04:24skeggsb: i have tu11x sitting there waiting too, tested with RM's firmware... but better not push it just in case whatever we get given is different for some reason
04:24imirkin: ah
04:25imirkin: and i assume there's been little to no effort to create a 2d or 3d backend for these?
04:25skeggsb: i have most of volta done, which is largely compiler.. hopefully turing isn't too much of a leap from there
04:25skeggsb: finishing that is next on the list
04:26imirkin: ah nice
10:24tagr: skeggsb: from a very quick check it looks like at least gp10b is no longer working with your linux-5.6 branch
10:24tagr: sorry for not getting around to testing it earlier
10:26tagr: skeggsb: it probes fine, but when I test using kmscube, I get this: https://pastebin.com/PmpLGqwx
10:26tagr: I'll try to isolate a little more
11:21tagr: skeggsb: looks like the first bad commit is this: 3c47e381d651 ("drm/nouveau/gr/gv100-: modify gr init to match newer version of RM")
11:22tagr: but it doesn't look obviously broken or anything, the only real difference seems to be the usage of gr->ppc_tpc_max, but I'd have to look into it a little more
11:23tagr: note that on that commit, there's no crash, but instead I get these:
11:23tagr: [ 89.024086] nouveau 17000000.gpu: secboot: error during falcon reset: -110
11:23tagr: [ 89.031005] nouveau 17000000.gpu: gr: init failed, -110
11:24tagr: so I think a subsequent patch probably has a minor bug in the cleanup paths or something that causes the crash
11:25tagr: oh, and by the way, that MMIO read fault is harmless and I've had a patch to work around it in my tree for a while now, not sure why I never sent it out
11:53karolherbst: tagr: you probably need my hacky workaround
11:54karolherbst: tagr: check if this one makes a difference: https://github.com/karolherbst/nouveau/commit/8549d6983e3316e276062234684e4f8122e36c6c
11:54karolherbst: skeggsb ran into a similar issue
11:58tagr: karolherbst: that doesn't seem to change anything for me
11:59tagr: karolherbst: but it looks like something stranger is going on here
11:59karolherbst: you are testing on current master, right?
11:59karolherbst: I didn't check yet if my workaround still works there... wanted to do that today
11:59tagr: karolherbst: testing skeggsb's linux-5.6 branch
11:59karolherbst: yeah..
11:59karolherbst: he might have used a changed workaround, so I don't know
11:59karolherbst: I talked with him today/yesterday about it
12:01tagr: after I get the "gr: init failed, 12" error and it crashes with a NULL pointer dereference, the reason for the NULL pointer dereference seems to be oclass->engine == 0xc in nvkm_fifo_chan_child_new()
12:02tagr: that 12 is ENOMEM, which is very unusual
12:03tagr: hmm... 12 also happens to be 0xc, which is that value of oclass->engine
12:07karolherbst: uhhh
12:07karolherbst: maybe some weird memory corruption?
12:07karolherbst: you test that on tegra, right?
12:08tagr: yes, gp10b
12:08karolherbst: maybe enable kasan and see if anything gets reported
12:29tagr: heh... kernel doesn't even boot with kasan enabled...
12:38karolherbst: :(
12:38karolherbst: like it's hitting errors or it's broken or it's slow?
12:48tagr: looks more like it's broken, but to be honest, I've never attempted to enable it before, so this might be just standard behavior on Tegra
12:48tagr: I'll have to go check this on vanilla
12:51tagr: hm... actually it looks more like some function somewhere is returning ENOMEM instead of -ENOMEM
12:52tagr: nvkm_engine_ref() returns 0xc when given a valid pointer, and the "gr: init failed, 12" is from the nvkm_subdev_init() called from nvkm_engine_ref()
12:59karolherbst: mhh, odd
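The sign mix-up tagr suspects can be sketched in a few lines. This is only an illustration under assumptions: `err_ptr` and `is_err` are simplified stand-ins modelled on the Linux kernel's `ERR_PTR`/`IS_ERR` idea (errors encoded in the top 4095 pointer values), not the nvkm code itself.

```cpp
#include <cerrno>
#include <cstdint>

// Hypothetical stand-in for ERR_PTR(): encode an error code in a pointer value.
void *err_ptr(long err) {
    return reinterpret_cast<void *>(static_cast<std::intptr_t>(err));
}

// Hypothetical stand-in for IS_ERR(): only the top 4095 addresses,
// i.e. small *negative* codes, are recognised as errors.
bool is_err(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) >=
           static_cast<std::uintptr_t>(-4095);
}

// Integer flavour of the same convention: 0 is success, negative is failure.
bool caller_sees_error(int ret) { return ret < 0; }
```

With this convention, `err_ptr(-ENOMEM)` is caught by `is_err()`, while `err_ptr(ENOMEM)` looks like an ordinary (bogus) pointer with the small value of `ENOMEM`, which would fit a later NULL-ish pointer dereference like the one described above.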
14:14karolherbst: imirkin: what ya think? https://github.com/karolherbst/mesa/commit/5aac462274528d52ba1d482bbef34a2335db746a
14:15karolherbst: minor adjustment: https://github.com/karolherbst/mesa/commit/5d6bf4bf70421e908918d29cd891d7e627e341ad
14:18karolherbst: mhh.. I could probably also maintain the full list and return a reference instead of constructing a new list each call
14:23karolherbst: what I don't have a good understanding of is the lval->join->defs.remove code inside GCRA::cleanup
14:29karolherbst: but I think it's doing what I implemented properly, just different...
14:36MagnusOl: Hi
18:39imirkin_: karolherbst: that .remove() is just trying to undo the joins from earlier
18:40imirkin_: oh dear. getting fancy with operator()
18:58imirkin_: karolherbst: make it operator()(const Value *val) -- then you don't have to remove the const's elsewhere
19:09karolherbst: well.. you know, STL is stupid, I can't remove the const
19:10imirkin_: wha?
19:10karolherbst: defs[val] <- compile error because const Value*
19:10imirkin_: oh, right
19:10imirkin_: make that const too :)
19:11imirkin_: (it is, right?)
19:11imirkin_: moar const!
19:11karolherbst: then the merge stuff doesn't work anymore
19:11karolherbst: but.. maybe it doesn't matter anymore at that point?
19:11karolherbst: because if RA is done, we don't need the value information anymore I think
19:11imirkin_: ah yes.
19:11imirkin_: p.first->defs.insert
19:11karolherbst: yeah
19:12imirkin_: sigh.
19:12karolherbst: but I think we can remove that bit completely
19:13karolherbst: uhm... no
19:13karolherbst: we can't..
19:13karolherbst: we still run passes after it :D
19:14karolherbst: that DefIterator class is super annoying :/
19:14imirkin_: feel free to stl-ify things
19:14imirkin_: (in a separate change)
19:14karolherbst: ohh, that's a typedef
19:14imirkin_: yes
19:14imirkin_: just to type fewer things
19:14imirkin_: no auto keyword, you know :)
19:16karolherbst: for (Value::DefIterator def = defs.begin(); def != defs.end(); ++def) (*def)->get()->join = rep; -> for (ValueDef *def : defs) def->get()->join = rep;
19:16karolherbst: but yeah.. maybe in its own change or so
19:16karolherbst: ohh, first that, then my patch.. should lead to fewer changes
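The const discussion above can be illustrated with a minimal sketch. Everything here is hypothetical: `Value` is a two-field stand-in, and `HasEvenId`, `DefCount`, `countFor` and `joinAll` are made-up names, not the codegen classes.

```cpp
#include <map>
#include <vector>

// Hypothetical stand-in for the real Value class.
struct Value {
    int id;
    Value *join;
};

// Functor taking pointer-to-const: callers holding either a Value* or a
// const Value* can invoke it without casting constness away.
struct HasEvenId {
    bool operator()(const Value *val) const { return val->id % 2 == 0; }
};

// Keying the map on const Value* lets a const pointer be used for lookups;
// a plain Value* converts to const Value* implicitly.
using DefCount = std::map<const Value *, int>;

int countFor(const DefCount &defs, const Value *val) {
    auto it = defs.find(val);
    return it == defs.end() ? 0 : it->second;
}

// C++11 range-based for in place of an explicit iterator loop.
void joinAll(std::vector<Value *> &defs, Value *rep) {
    for (Value *def : defs)
        def->join = rep;
}
```

The trade-off mentioned in the chat still applies: once the key type is `const Value *`, code that mutates through the key (something like `p.first->defs.insert`) no longer compiles.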
19:18pedahzur: Lyude: Anything you need from me today? :)
19:23Lyude: pedahzur: not yet, I've just got a couple of different things on my todo list right now
19:23pedahzur: Sounds good
22:30karolherbst: soo.. now to the next RA bug...
22:32imirkin_: there's more than 1?!
22:33karolherbst: big surprise, I know
22:33karolherbst: the next one I want to look into is the one where we get the "ERROR: no viable spill candidates left" error, although nothing needs to be spilled
22:34imirkin_: oh, i hate that
22:34karolherbst: yeah...
22:34imirkin_: that's coz we merge
22:34karolherbst: no
22:34imirkin_: but then something's messed up
22:34imirkin_: no?
22:34karolherbst: nope
22:34imirkin_: huh
22:34karolherbst: it's really stupid actually
22:34karolherbst: so we have vec4 slots
22:34imirkin_: aka merge.
22:34karolherbst: and if we allocate one 32 bit register, we assume the worst case and mark the entire thing as "gone"
22:35imirkin_: for a vec4 - yes
22:35karolherbst: so if we need to RA multiple vec4s, we run out of space
22:35imirkin_: it has to be aligned
22:35imirkin_: it's not a bug
22:35imirkin_: it's a feature ;)
22:35karolherbst: I know
22:35karolherbst: that's why it's annoying
22:35karolherbst: we essentially always assume the worst possible allocation of registers
22:35karolherbst: and see if the next vec4 fits
22:35karolherbst: *check
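The pessimism karolherbst describes can be sketched roughly as follows. This is a toy model, not the codegen RA: it assumes widths are powers of two counted in 32-bit registers, that every value is aligned to its own width, and the function names are invented for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Pessimistic count: each already-allocated neighbour is assumed to land in
// a different aligned slot of `width` registers, so even a single 32-bit
// register makes a whole vec4 slot unusable.
int worstCaseBlockedSlots(int width, const std::vector<int> &neighbours) {
    int blocked = 0;
    for (int nw : neighbours)
        blocked += std::max(1, nw / width);
    return blocked;
}

// Optimistic count: neighbours are packed as tightly as alignment allows,
// so narrow values share a slot.
int bestCaseBlockedSlots(int width, const std::vector<int> &neighbours) {
    int total = std::accumulate(neighbours.begin(), neighbours.end(), 0);
    return (total + width - 1) / width;
}

// A value of `width` registers fits if at least one aligned slot stays free.
bool fits(int fileSize, int width, int blockedSlots) {
    return blockedSlots < fileSize / width;
}
```

In an 8-register file, a vec4 with two scalar neighbours is rejected by the worst-case test (two of two slots blocked), although both scalars could share the first slot and leave the second one free.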
22:36imirkin_: i wonder if we could order the allocations to succeed more often
22:36imirkin_: i.e. instead of walking RIG nodes in order
22:36imirkin_: sort them by size
22:36imirkin_: (width, that is)
22:36imirkin_: obviously not a guarantee of success...
22:37karolherbst: util/ra already does that as well... so we essentially have those choices: 1. fix codegen/ra 2. use the common graph coloring implementation aka util/ra
22:38karolherbst: uhm... s/graph/node/...
22:38imirkin_: nah, "graph coloring" is the right term
22:38karolherbst: ahh, okay
22:38karolherbst: weird, but okay
22:38imirkin_: obv you're coloring nodes in a graph
22:38imirkin_: but ... wtvr
22:39karolherbst: anyway, util/ra has this concept of register classes and you can handle overlapping classes just fine and stuff
22:39karolherbst: but..
22:39karolherbst: uff
22:39karolherbst: regressions and a lot of work
22:39imirkin_: =]
22:39imirkin_: i need to remember to try to fix the last failing GTF test tonight
22:39imirkin_: i had some idea that i wrote down for what could be going wrong
22:40imirkin_: hopefully i wrote enough down to re-figure out the idea
22:40imirkin_: and even more hopefully that idea makes things work
22:40imirkin_: after that, we "just" need to resolve the hangs that happen when it's all run in sequence in cts-runner
22:41imirkin_: perhaps we can get skeggsb interested :)
22:41karolherbst: :D
22:41karolherbst: those are the hangs you get if you run the CTS for 10 hours, right?
22:41skeggsb: in what?
22:41imirkin_: no
22:41imirkin_: it's the hang that happens reliably (for my configuration anyways) on the first run through the CTS tests
22:41imirkin_: about 10-20 mins in
22:42karolherbst: ohhh
22:42karolherbst: lucky you
22:42karolherbst: I used the cts-runner once, the first hangs/random fails happened after 80% of the runs or so
22:42imirkin_: skeggsb: cts-runner hangs with a read failure-type-thing for me
22:42imirkin_: some sort of memory management issue
22:42imirkin_: for me it hangs in a reliable spot
22:43imirkin_: but running the tests directly naturally works
22:43karolherbst: imirkin_: how much VRAM?
22:43imirkin_: dunno, maybe 1GB?
22:43karolherbst: mhhhh
22:43imirkin_: GT 1030 -- top of the line
22:43imirkin_: nothing but the best for me! :)
22:43karolherbst: skeggsb: mind adding a nouveau option to limit the amount of used VRAM?
22:43imirkin_: it might even have a fan - that's how you know quality
22:43karolherbst: I think this would help us figuring out some issues (maybe this one as well)
22:44karolherbst: my GT 1030 doesn't have a fan :)
22:44imirkin_: skeggsb: also if you could add an option that *increases* vram ;)
22:44karolherbst: :D
22:45imirkin_: and adds MP's
22:45karolherbst: obviously
22:46karolherbst: but I guess even if a GPU has more MP's than reported through... something, they are probably fused off or so
22:46imirkin_: yeah :)
22:47skeggsb: turn those TNT2s into RTX2080Ti? ;)
22:49karolherbst: mhhh.. actually, I think using util/ra isn't that much work.. we essentially just start by replacing GCRA::selectRegisters and move our way up
22:49karolherbst: throw in 600 loc of glue code and it's done or so
22:53pedahzur: imirkin_ Lyude: are there directions somewhere for easily pulling the latest master, building the module and trying it? Sorry...been several years since I've fiddled with kernel/module dev. I'm rusty. :)
23:07imirkin_: skeggsb: a *lot* of TNT2's :)
23:07imirkin_: pedahzur: well, once you have a .config, make oldconfig; make -j8
23:07imirkin_: pedahzur: i just have a git clone
23:09loonycyborg: do you mean nouveau from the kernel tree?
23:09imirkin_: you can clone e.g. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
23:10loonycyborg: like I know that nouveau is in kernel, so any branches you're working on will be in torvalds' repo now?
23:10imirkin_: loonycyborg: pedahzur is having trouble with some modesetting stuff
23:10imirkin_: a fix for which is in 5.5-rc6
23:11imirkin_: more generally, nouveau is in several places
23:11pedahzur: imirkin_: But it doesn't for my issue. :)
23:11imirkin_: pedahzur: you tried?
23:11imirkin_: then why do you need directions for pulling an arbitrary tree?
23:11pedahzur: imirkin_: And posted another kernel log to the mailing list yesterday. :(
23:11imirkin_: pedahzur: o
23:11imirkin_: so you did.
23:12pedahzur: imirkin_: I was wondering if there was a place where the nouveau code would be more up to date than what was in kernel, and I'd just give it a try. But if the latest kernel rc is it, then I'm good. :)
23:12imirkin_: pedahzur: https://github.com/skeggsb/linux/commits/linux-5.6
23:12imirkin_: that's going into the next kernel
23:14loonycyborg: I'm just curious where do you have feature branches now, that is to which repo you push to collaborate?
23:14pedahzur: imirkin_: Yeah, what loonycyborg said. :)
23:27loonycyborg: ah I see, a fork on github
23:29imirkin_: loonycyborg: on our personal private desktops
23:30imirkin_: skeggsb is the maintainer. we send him patches. he has this bizarro "nouveau" tree where he tends to push patches first
23:31imirkin_: there's not a ton of collaboration on features that i've seen
23:31imirkin_: skeggsb does most of the kernel work, anyways
23:32loonycyborg: how are you sending them? e-mail and git am stuff?
23:37loonycyborg: at least if you just send patches then it probably loses authorship information and commit message
23:39loonycyborg: that is if you just send output of git diff
23:58imirkin: the normal way that people send patches
23:58imirkin: which preserves all this stuff
23:59imirkin: sending the output of "git diff" is silly