04:15 imirkin: skeggsb: so should TU10x be able to use the copy engine logic for screen offload?
04:16 skeggsb: yes, the copy engines work, ttm uses them already
04:16 imirkin: right, but now they should function, unlike earlier
04:17 skeggsb: they would have already
04:17 imirkin: mmmmm ... dunno
04:17 imirkin: i don't think they did.
04:17 imirkin: or _something_ didn't
04:17 skeggsb: ttm has always used it since i pushed the code
04:17 imirkin: for e.g. output slaving
04:17 imirkin: i thought it was coz GR hadn't been brought up on turing
04:17 skeggsb: it's a completely different part of the GPU
04:18 imirkin: ok
04:18 skeggsb: oh, nevermind, we block userspace from creating *any* channel if we couldn't get GR for the DRM for some reason
04:19 imirkin: heh ok
04:19 skeggsb: somewhat unnecessary, but, anyway
04:19 imirkin: so it could have worked
04:19 imirkin: just didn't
04:19 skeggsb: yep
04:19 imirkin: looks like you had oodles of fun with this
04:20 imirkin: hopefully still works on tegra :)
04:20 skeggsb: not entirely the word i'd use
04:20 imirkin: lol
04:20 skeggsb: i've messaged tagr asking to pretty please test it for me :P
04:24 skeggsb: i have tu11x sitting there waiting too, tested with RM's firmware... but better not push it just in case whatever we get given is different for some reason
04:24 imirkin: ah
04:25 imirkin: and i assume there's been little to no effort to create a 2d or 3d backend for these?
04:25 skeggsb: i have most of volta done, which is largely compiler.. hopefully turing isn't too much of a leap from there
04:25 skeggsb: finishing that is next on the list
04:26 imirkin: ah nice
10:24 tagr: skeggsb: from a very quick check it looks like at least gp10b is no longer working with your linux-5.6 branch
10:24 tagr: sorry for not getting around to testing it earlier
10:26 tagr: skeggsb: it probes fine, but when I test using kmscube, I get this: https://pastebin.com/PmpLGqwx
10:26 tagr: I'll try to isolate a little more
11:21 tagr: skeggsb: looks like the first bad commit is this: 3c47e381d651 ("drm/nouveau/gr/gv100-: modify gr init to match newer version of RM")
11:22 tagr: but it doesn't look obviously broken or anything, the only real difference seems to be the usage of gr->ppc_tpc_max, but I'd have to look into it a little more
11:23 tagr: note that on that commit, there's no crash, but instead I get these:
11:23 tagr: [ 89.024086] nouveau 17000000.gpu: secboot: error during falcon reset: -110
11:23 tagr: [ 89.031005] nouveau 17000000.gpu: gr: init failed, -110
11:24 tagr: so I think a subsequent patch probably has a minor bug in the cleanup paths or something that causes the crash
11:25 tagr: oh, and by the way, that MMIO read fault is harmless and I've had a patch to work around it in my tree for a while now, not sure why I never sent it out
11:53 karolherbst: tagr: you probably need my hacky workaround
11:54 karolherbst: tagr: check if this one makes a difference: https://github.com/karolherbst/nouveau/commit/8549d6983e3316e276062234684e4f8122e36c6c
11:54 karolherbst: skeggsb ran into a similar issue
11:58 tagr: karolherbst: that doesn't seem to change anything for me
11:59 tagr: karolherbst: but it looks like something stranger is going on here
11:59 karolherbst: you are testing on current master, right?
11:59 karolherbst: I didn't check yet if my workaround still works there... wanted to do that today
11:59 tagr: karolherbst: testing skeggsb's linux-5.6 branch
11:59 karolherbst: yeah..
11:59 karolherbst: he might have used a changed workaround, so I don't know
11:59 karolherbst: I talked with him today/yesterday about it
12:01 tagr: after I get the "gr: init failed, 12" error and it crashes with a NULL pointer dereference, the reason for the NULL pointer dereference seems to be oclass->engine == 0xc in nvkm_fifo_chan_child_new()
12:02 tagr: that 12 is ENOMEM, which is very unusual
12:03 tagr: hmm... 12 also happens to be 0xc, which is that value of oclass->engine
12:07 karolherbst: uhhh
12:07 karolherbst: maybe some weird memory corruption?
12:07 karolherbst: you test that on tegra, right?
12:08 tagr: yes, gp10b
12:08 karolherbst: maybe enable kasan and see if anything gets reported
12:29 tagr: heh... kernel doesn't even boot with kasan enabled...
12:38 karolherbst: :(
12:38 karolherbst: like it's hitting errors or it's broken or it's slow?
12:48 tagr: looks more like it's broken, but to be honest, I've never attempted to enable it before, so this might be just standard behavior on Tegra
12:48 tagr: I'll have to go check this on vanilla
12:51 tagr: hm... actually it looks more like some function somewhere is returning ENOMEM instead of -ENOMEM
12:52 tagr: nvkm_engine_ref() returns 0xc when given a valid pointer, and the "gr: init failed, 12" is from the nvkm_subdev_init() called from nvkm_engine_ref()
12:59 karolherbst: mhh, odd
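The sign mix-up tagr describes can be sketched in a few lines. This is only an illustration, not the nouveau code: `init_buggy`, `init_fixed`, `caller_sees_error` and `demo` are hypothetical names, and the caller is assumed to follow the usual kernel convention of treating only negative return values as errors.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-ins, not the real nvkm functions.

// Buggy variant: returns the positive errno value on failure.
int init_buggy(bool fail) { return fail ? ENOMEM : 0; }

// Conventional variant: returns the negated errno value on failure.
int init_fixed(bool fail) { return fail ? -ENOMEM : 0; }

// Assumed caller convention: only negative values count as errors.
bool caller_sees_error(int ret) { return ret < 0; }

// Small self-check of the behaviour described above.
void demo()
{
   // The positive ENOMEM slips past the "ret < 0" check, so the failure
   // goes unnoticed and the value can leak into later code.
   assert(!caller_sees_error(init_buggy(true)));

   // The negated value is caught as expected.
   assert(caller_sees_error(init_fixed(true)));
}
```

A positive return like this is easy to miss because the success path still returns 0; only the failure path behaves differently.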
14:14 karolherbst: imirkin: what ya think? https://github.com/karolherbst/mesa/commit/5aac462274528d52ba1d482bbef34a2335db746a
14:15 karolherbst: minor adjustment: https://github.com/karolherbst/mesa/commit/5d6bf4bf70421e908918d29cd891d7e627e341ad
14:18 karolherbst: mhh.. I could probably also maintain the full list and return a reference instead of constructing a new list on each call
14:23 karolherbst: what I don't have a good understanding of is the lval->join->defs.remove code inside GCRA::cleanup
14:29 karolherbst: but I think it's doing what I implemented properly, just different...
14:36 MagnusOl: Hi
18:39 imirkin_: karolherbst: that .remove() is just trying to undo the joins from earlier
18:40 imirkin_: oh dear. getting fancy with operator()
18:58 imirkin_: karolherbst: make it operator()(const Value *val) -- then you don't have to remove the const's elsewhere
19:09 karolherbst: well.. you know, STL is stupid, I can't remove the const
19:10 imirkin_: wha?
19:10 karolherbst: defs[val] <- compile error because const Value*
19:10 imirkin_: oh, right
19:10 imirkin_: make that const too :)
19:11 imirkin_: (it is, right?)
19:11 imirkin_: moar const!
19:11 karolherbst: then the merge stuff doesn't work anymore
19:11 karolherbst: but.. maybe it doesn't matter anymore at that point?
19:11 karolherbst: because if RA is done, we don't need the value information anymore I think
19:11 imirkin_: ah yes.
19:11 imirkin_: p.first->defs.insert
19:11 karolherbst: yeah
19:12 imirkin_: sigh.
19:12 karolherbst: but I think we can remove that bit completely
19:13 karolherbst: uhm... no
19:13 karolherbst: we can't..
19:13 karolherbst: we still run passes after it :D
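The const tension in this exchange can be reproduced with a small sketch. The `Value` struct, `DefCounts`, `DefCountOf` and `demo` below are hypothetical stand-ins, not the Mesa codegen types; the only assumption is a `std::map` keyed by a pointer to `Value`.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the codegen Value class.
struct Value {
   int id;
   int mergedDefs; // placeholder for the real def list
};

// Keyed by pointer-to-const, so lookups accept a const Value * directly.
// With std::map<Value *, int> instead, counts[val] would not compile for a
// const Value *val, because const Value * does not convert to Value *.
typedef std::map<const Value *, int> DefCounts;

// Functor taking const Value *, in the spirit of the suggestion above.
struct DefCountOf {
   const DefCounts &counts;
   explicit DefCountOf(const DefCounts &c) : counts(c) {}

   int operator()(const Value *val) const
   {
      DefCounts::const_iterator it = counts.find(val);
      return it == counts.end() ? 0 : it->second;
   }
};

void demo()
{
   Value a = {1, 0};
   DefCounts counts;
   counts[&a] = 3; // Value * converts to const Value * implicitly

   const Value *c = &a;
   assert(DefCountOf(counts)(c) == 3);

   // The flip side: the keys are now const, so mutating through them fails.
   // counts.begin()->first->mergedDefs++; // would not compile
}
```

Making the key type const fixes the lookup, but any later code that modifies values through the map key (like the `p.first->defs.insert` mentioned above) stops compiling, which is the trade-off being discussed.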
19:14 karolherbst: that DefIterator class is super annoying :/
19:14 imirkin_: feel free to stl-ify things
19:14 imirkin_: (in a separate change)
19:14 karolherbst: ohh, that's a typedef
19:14 imirkin_: yes
19:14 imirkin_: just to type fewer things
19:14 imirkin_: no auto keyword, you know :)
19:16 karolherbst: for (Value::DefIterator def = defs.begin(); def != defs.end(); ++def) (*def)->get()->join = rep; -> for (ValueDef *def : defs) def->get()->join = rep;
19:16 karolherbst: but yeah.. maybe in its own change or so
19:16 karolherbst: ohh, first that, then my patch.. should lead to fewer changes
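The two loop forms karolherbst pasted can be compared side by side in a self-contained sketch. `Value` and `ValueDef` here are simplified, hypothetical stand-ins (the def list is assumed to be a `std::list<ValueDef *>`, and `DefIterator` a typedef for its iterator); `setJoinOld`, `setJoinNew` and `demo` are illustrative names.

```cpp
#include <cassert>
#include <list>

struct Value;

// Simplified stand-in: a def just points at the value it defines.
struct ValueDef {
   Value *value;
   Value *get() const { return value; }
};

struct Value {
   Value *join;
   std::list<ValueDef *> defs;
   typedef std::list<ValueDef *>::iterator DefIterator;

   Value() : join(0) {}
};

// Pre-C++11 style with the iterator typedef.
void setJoinOld(Value &v, Value *rep)
{
   for (Value::DefIterator def = v.defs.begin(); def != v.defs.end(); ++def)
      (*def)->get()->join = rep;
}

// C++11 range-based for; same behaviour, one less dereference to read.
void setJoinNew(Value &v, Value *rep)
{
   for (ValueDef *def : v.defs)
      def->get()->join = rep;
}

void demo()
{
   Value a, b, rep, holder;
   ValueDef da = {&a};
   ValueDef db = {&b};
   holder.defs.push_back(&da);
   holder.defs.push_back(&db);

   setJoinOld(holder, &rep);
   assert(a.join == &rep && b.join == &rep);

   setJoinNew(holder, &holder);
   assert(a.join == &holder && b.join == &holder);
}
```

Both functions visit the same elements in the same order, so a conversion like this should be a mechanical change that can go in as its own commit.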
19:18 pedahzur: Lyude: Anything you need from me today? :)
19:23 Lyude: pedahzur: not yet, I've just got a couple of different things on my todo list right now
19:23 pedahzur: Sounds good
22:30 karolherbst: soo.. now to the next RA bug...
22:32 imirkin_: there's more than 1?!
22:33 karolherbst: big surprise, I know
22:33 karolherbst: the next one I want to look into is the one where we get the "ERROR: no viable spill candidates left" error, although nothing needs to be spilled
22:34 imirkin_: oh, i hate that
22:34 karolherbst: yeah...
22:34 imirkin_: that's coz we merge
22:34 karolherbst: no
22:34 imirkin_: but then something's messed up
22:34 imirkin_: no?
22:34 karolherbst: nope
22:34 imirkin_: huh
22:34 karolherbst: it's really stupid actually
22:34 karolherbst: so we have vec4 slots
22:34 imirkin_: aka merge.
22:34 karolherbst: and if we allocate one 32 bit register, we assume the worst case and mark the entire thing as "gone"
22:35 imirkin_: for a vec4 - yes
22:35 karolherbst: so if we need to RA multiple vec4s, we run out of space
22:35 imirkin_: it has to be aligned
22:35 imirkin_: it's not a bug
22:35 imirkin_: it's a feature ;)
22:35 karolherbst: I know
22:35 karolherbst: that's why it's annoying
22:35 karolherbst: we essentially always assume the worst possible allocation of registers
22:35 karolherbst: and see if the next vec4 fits
22:35 karolherbst: *check
22:36 imirkin_: i wonder if we could order the allocations to succeed more often
22:36 imirkin_: i.e. instead of walking RIG nodes in order
22:36 imirkin_: sort them by size
22:36 imirkin_: (width, that is)
22:36 imirkin_: obviously not a guarantee of success...
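The worst-case assumption and the effect of placement can be illustrated with a toy model. This is a simplified sketch, not the actual GCRA code: widths are assumed to be powers of two, a node of width w must start at a multiple of w, and `worstCaseBlocked`, `triviallyColorable` and `firstFit` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

struct Node {
   int width; // number of 32-bit registers, a power of two
   int reg;   // first assigned register
};

// Worst-case number of width-aligned slots one neighbour can block:
// a narrower neighbour can still make one whole slot unusable.
int worstCaseBlocked(int width, int neighbourWidth)
{
   return neighbourWidth <= width ? 1 : neighbourWidth / width;
}

// Conservative test: only "safe" if the neighbours cannot block every
// slot even when each of them lands in a different slot.
bool triviallyColorable(int width, const std::vector<int> &neighbourWidths,
                        int numRegs)
{
   int blocked = 0;
   for (int w : neighbourWidths)
      blocked += worstCaseBlocked(width, w);
   return blocked < numRegs / width;
}

// Actual placement: lowest aligned start whose registers are all free,
// or -1 if there is none. Assigned nodes are assumed to lie inside the file.
int firstFit(int width, const std::vector<Node> &assigned, int numRegs)
{
   assert(width > 0 && (width & (width - 1)) == 0);

   std::vector<bool> used(numRegs, false);
   for (const Node &n : assigned)
      for (int r = n.reg; r < n.reg + n.width; ++r)
         used[r] = true;

   for (int base = 0; base + width <= numRegs; base += width) {
      bool isFree = true;
      for (int r = base; r < base + width; ++r)
         if (used[r])
            isFree = false;
      if (isFree)
         return base;
   }
   return -1;
}
```

With a 16-register file (four vec4 slots), a vec4 with four scalar neighbours fails `triviallyColorable` because the scalars could sit at r0, r4, r8 and r12, where `firstFit` really does return -1. If the same four scalars are packed at r0..r3, `firstFit` places the vec4 at r4 even though the conservative test still rejects it, which matches the "no viable spill candidates" case where nothing actually needs to be spilled.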
22:37 karolherbst: util/ra already does that as well... so we essentially have those choices: 1. fix codegen/ra 2. use the common graph coloring implementation aka util/ra
22:38 karolherbst: uhm... s/graph/node/...
22:38 imirkin_: nah, "graph coloring" is the right term
22:38 karolherbst: ahh, okay
22:38 karolherbst: weird, but okay
22:38 imirkin_: obv you're coloring nodes in a graph
22:38 imirkin_: but ... wtvr
22:39 karolherbst: anyway, util/ra has this concept of register classes and you can handle overlapping classes just fine and stuff
22:39 karolherbst: but..
22:39 karolherbst: uff
22:39 karolherbst: regressions and a lot of work
22:39 imirkin_: =]
22:39 imirkin_: i need to remember to try to fix the last failing GTF test tonight
22:39 imirkin_: i had some idea that i wrote down for what could be going wrong
22:40 imirkin_: hopefully i wrote enough down to re-figure out the idea
22:40 imirkin_: and even more hopefully that idea makes things work
22:40 imirkin_: after that, we "just" need to resolve the hangs that happen when it's all run in sequence in cts-runner
22:41 imirkin_: perhaps we can get skeggsb interested :)
22:41 karolherbst: :D
22:41 karolherbst: those are the hangs you get if you run the CTS for 10 hours, right?
22:41 skeggsb: in what?
22:41 imirkin_: no
22:41 imirkin_: it's the hang that happens reliably (for my configuration anyways) on the first run through the CTS tests
22:41 imirkin_: about 10-20 mins in
22:42 karolherbst: ohhh
22:42 karolherbst: lucky you
22:42 karolherbst: I used the cts-runner once, the first hangs/random fails happened after 80% of the runs or so
22:42 imirkin_: skeggsb: cts-runner hangs with a read failure-type-thing for me
22:42 imirkin_: some sort of memory management issue
22:42 imirkin_: for me it hangs in a reliable spot
22:43 imirkin_: but running the tests directly naturally works
22:43 karolherbst: imirkin_: how much VRAM?
22:43 imirkin_: dunno, maybe 1GB?
22:43 karolherbst: mhhhh
22:43 imirkin_: GT 1030 -- top of the line
22:43 imirkin_: nothing but the best for me! :)
22:43 karolherbst: skeggsb: mind adding a nouveau option to limit the amount of used VRAM?
22:43 imirkin_: it might even have a fan - that's how you know quality
22:43 karolherbst: I think this would help us figuring out some issues (maybe this one as well)
22:44 karolherbst: my GT 1030 doesn't have a fan :)
22:44 imirkin_: skeggsb: also if you could add an option that *increases* vram ;)
22:44 karolherbst: :D
22:45 imirkin_: and adds MP's
22:45 karolherbst: obviously
22:46 karolherbst: but I guess even if a GPU has more MP's than reported through... something, they are probably fused off or so
22:46 imirkin_: yeah :)
22:47 skeggsb: turn those TNT2s into RTX2080Ti? ;)
22:49 karolherbst: mhhh.. actually, I think using util/ra isn't that much work.. we essentially just start by replacing GCRA::selectRegisters and move our way up
22:49 karolherbst: throw in 600 loc of glue code and it's done or so
22:53 pedahzur: imirkin_ Lyude: are there directions somewhere for easily pulling the latest master, building the module and trying it? Sorry...been several years since I've fiddled with kernel/module dev. I'm rusty. :)
23:07 imirkin_: skeggsb: a *lot* of TNT2's :)
23:07 imirkin_: pedahzur: well, once you have a .config, make oldconfig; make -j8
23:07 imirkin_: pedahzur: i just have a git clone
23:09 loonycyborg: is that meant to be nouveau from the kernel tree?
23:09 imirkin_: you can clone e.g. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
23:10 loonycyborg: like I know that nouveau is in kernel, so any branches you're working on will be in torvalds' repo now?
23:10 imirkin_: loonycyborg: pedahzur is having trouble with some modesetting stuff
23:10 imirkin_: a fix for which is in 5.5-rc6
23:11 imirkin_: more generally, nouveau is in several places
23:11 pedahzur: imirkin_: But it doesn't for my issue. :)
23:11 imirkin_: pedahzur: you tried?
23:11 imirkin_: then why do you need directions for pulling an arbitrary tree?
23:11 pedahzur: imirkin_: And posted another kernel log to the mailing list yesterday. :(
23:11 imirkin_: pedahzur: o
23:11 imirkin_: so you did.
23:12 pedahzur: imirkin_: I was wondering if there was a place where the nouveau code would be more up to date than what was in kernel, and I'd just give it a try. But if the latest kernel rc is it, then I'm good. :)
23:12 imirkin_: pedahzur: https://github.com/skeggsb/linux/commits/linux-5.6
23:12 imirkin_: that's going into the next kernel
23:14 loonycyborg: I'm just curious where do you have feature branches now, that is to which repo you push to collaborate?
23:14 pedahzur: imirkin_: Yeah, what loonycyborg said. :)
23:27 loonycyborg: ah I see, a fork on github
23:29 imirkin_: loonycyborg: on our personal private desktops
23:30 imirkin_: skeggsb is the maintainer. we send him patches. he has this bizarro "nouveau" tree where he tends to push patches first
23:31 imirkin_: there's not a ton of collaboration on features that i've seen
23:31 imirkin_: skeggsb does most of the kernel work, anyways
23:32 loonycyborg: how are you sending them? e-mail and git am stuff?
23:37 loonycyborg: at least if you just send patches then it probably loses authorship information and commit message
23:39 loonycyborg: that is if you just send output of git diff
23:58 imirkin: the normal way that people send patches
23:58 imirkin: which preserves all this stuff
23:59 imirkin: sending the output of "git diff" is silly