01:25rhyskidd: ccaione: no comments as yet, haven't had a chance to test a build including that patch
01:31imirkin: karolherbst: down to 6 varying locations failures... of which 5 are known test fail.
01:33imirkin: make that all 6
02:20mytbk: Will nouveau be broken after a kexec reboot?
02:23imirkin: you can add nouveau.config=NvForcePost=1 to attempt to fix it
02:23mytbk: Let me try it.
02:23imirkin: which will reset the card into a good state
02:24mytbk: I'm using coreboot with Linux payload with nouveau, then use kexec to boot my kernel on disk. The two kernels have many differences, but I'll first figure out if kexec will break nouveau.
02:25imirkin: well, if the first kernel doesn't touch the nvidia gpu, then you're fine
02:25imirkin: iirc kexec doesn't properly shut anything down
02:26imirkin: which means devices may be DMA'ing to wherever...
02:26skeggsb: NvForcePost=1 is generally a bad idea actually, a ton of boards don't like devinit being done twice
02:26imirkin: a bit unfortunate =/
02:26imirkin: skeggsb: i'm guessing esp recent boards?
02:27mytbk: Wow, NvForcePost seems good to me. I only got a `nouveau 0000:05:00.0: bus: MMIO read of 00000000 FAULT at 009410 [ IBUS ]`, but I can see tty1 and startx later.
02:27skeggsb: i haven't found a particular pattern
02:27imirkin: skeggsb: well, i figured secure boot might not take kindly to it :)
02:28skeggsb: kexec() just booting into a new kernel without at *least* calling suspend funcs on devices is complete bong-hits if you ask me
02:28skeggsb: makes drivers have to deal with initial state that would *never* happen otherwise
02:32airlied: well the idea is usually that you kexec in crash dump handling
02:32airlied: which is when you definitely don't want to call into anything
02:32airlied: afaik if you do a kexec in normal circumstances it shuts stuff down
02:33airlied: there is kexec and kexec -f, just like reboot and reboot -f :)
02:46skeggsb: airlied: aahh, i was not aware of that, that's somewhat better
05:31mlankhorst: airlied: hm quick look at the offending oops, likely not related to the collisions, perhaps related to fbdev->flags removal?
07:25mlankhorst: if it was related to the merge fixup I made then you would expect some vblank accounting errors, not an oops :)
08:19karolherbst: imirkin: nice :)
08:21karolherbst: skeggsb: how hard would a full init of the GPU be if nouveau detects that the GPU isn't in a state which might be the result of runtime resuming or POST?
08:24mwk: karolherbst: a full init to bring a GPU "up" is quite different from shutting down running engines to bring a GPU "down"
08:26karolherbst: well I can imagine, I was just wondering if there is a "nice" way to fully reset the GPU into a proper state, but I guess this should be devinit, which doesn't really work?
08:27mwk: I'd guess no
08:27mwk: consider this: if no one even took care to idle the GPU, it just might be in the middle of drawing a complicated frame
08:28karolherbst: ohh no, I meant at nouveau loading time
08:28mwk: with DMA operations in flight, shaders running, and pixels being written to the framebuffer
08:28karolherbst: like nouveau loads and detects that the GPU is in a funny state which it shouldn't be
08:28mwk: karolherbst: but that's exactly what can happen
08:28mwk: with kexec
08:30karolherbst: mhh, I didn't play with kexec in a long time, but afaik i915 was able to handle this situation quite well, no idea if I did kexec -f
08:30mwk: and resetting things while they have outstanding transactions to other things... that never ends well
08:30karolherbst: but I would also say, if somebody does a kexec -f on a normally working system it's the user's fault and we might simply ignore such situations
08:31karolherbst: ignoring as in: sure the system will be messed up and there is nothing we can do about it
08:33karolherbst: mwk: assume we may try to properly support a "kexec + shutdown" on a running system and "kexec -f" on a totally crashed system, maybe we could try to at least work something out for those scenarios
08:34karolherbst: there is kexec -p which will execute the loaded kernel on a kernel panic
09:50mwk: karolherbst: I'd say rendering a lot of triangles can be a contributing factor to a kernel panic :p
09:50karolherbst: mhh, true
09:51karolherbst: but in an ideal world it shouldn't cause a kernel panic
09:52karolherbst: so in this case, why is there a kernel panic to begin with ;)
09:52karolherbst: but I think it would be nice to at least support the "normal" use cases
09:53karolherbst: if somebody wants to work on the exotic ones, fine, but this might be a waste of time
19:23karolherbst: mhh, this doesn't sound too good "PHYSICAL_STACK_OVERFLOW"
19:25tobijk: karolherbst: yay, how did you manage that?
19:27karolherbst: playing some game with wine
19:28imirkin_: stop drinking!
19:28imirkin_: perhaps the computer didn't take kindly to being mixed with alcohol
19:28karolherbst: most likely
19:29karolherbst: seems to be hard to debug as well, maybe apitrace may help
19:29karolherbst: but apitrace + wine is always pain
20:22karolherbst: ohh no, a game where everything is tainted in green
20:23tobijk: karolherbst: parties for the Bundestag election :>
20:24karolherbst: "nvc0_program_translate:610 - shader translation failed: -4" nice
20:24tobijk: "no viable spill candidates left"?
20:31karolherbst: those are dx9 shaders though
20:31tobijk: well it's the most common, where does it get stuck then? :o
20:32karolherbst: let's see if we can dump those dx9 shaders with mesa
20:39karolherbst: tobijk: most likely something like unsupported OP or so
20:40karolherbst: NV50_PROG_DEBUG doesn't print anything prior to those errors
20:41tobijk: are you using gallium9?
20:41tobijk: so yeah most likely a gallium op we do not support
20:41karolherbst: mhh, but apparently RA fails
20:42karolherbst: cause the ret code is -4
20:42tobijk: and we are back at spilling ;-)
20:42tobijk: just kidding
20:42karolherbst: I let it print the program when returning -4
20:42karolherbst: otherwise it will be too painful
20:43karolherbst: or it is something stupid
20:45imirkin_: crysis by any chance?
20:45imirkin_: yeah, known issue
20:45imirkin_: too much texture
20:45imirkin_: in too much the wrong order
20:45karolherbst: it's fixable, isn't it?
20:46imirkin_: there's two things to fix
20:46imirkin_: (a) it should be scheduling things better so as not to spill
20:46imirkin_: (b) spilling should be fixed
20:47imirkin_: easy right? :)
20:47karolherbst: I am actually surprised why nothing gets printed to the console
20:47imirkin_: spilling is broken for reasons i only partly understand. it ends up as a refcounting issue, but that's certainly not the core of the problem
20:47karolherbst: this is annoying me a little right now
20:48imirkin_: i tried to address some of this stuff
20:48imirkin_: but that fixed some cases and broke others
20:49tobijk: karolherbst: i have a second try spill thing, which may help
20:49imirkin_: the problem is that the IR gets totally messed up
20:49karolherbst: it does
20:49karolherbst: prog->print() doesn't even print anything
20:50imirkin_: the merging and "undo" logic aren't quite right
20:50imirkin_: but even without that, the spill code inserter messes some things up as well iirc
20:50tobijk: karolherbst: you need to init it :>
20:51tobijk: karolherbst: https://hastebin.com/toguzayuru.cpp
20:51tobijk: that may help :)
20:51karolherbst: I doubt that
20:52imirkin_: this is a thing to avoid trying to spill things that will cause issues down the line
20:52karolherbst: wine is super shitty to debug... can somebody make wine debuggable?
20:52tobijk: imirkin_: well i didn't repropose to upstream it :P
20:53imirkin_: karolherbst: i have the shader if you want
20:54karolherbst: imirkin_: this would help, yes
20:54imirkin_: can't find it. hold on
20:54karolherbst: ... this makes no frigging sense now
20:55karolherbst: the tgsi should be enough I assume
20:56imirkin_: might have to touch up nouveau_compiler to accept larger shaders
20:56imirkin_: (and/or make it deal with dynamic sizes)
20:56karolherbst: have a patch for it
20:56karolherbst: nice, things are sane again
20:58tobijk: uhm, that shader compiles here
20:59karolherbst: "texbar (SUBOP:14)" sure
20:59tobijk: which card do you use?
20:59karolherbst: tobijk: it compiles on gpus with more than 63 regs
20:59karolherbst: I have a gk106
20:59karolherbst: it even compiles on tesla
21:00tobijk: so e6?
21:00karolherbst: maybe your patch might help indeed
21:00tobijk: karolherbst: yeah it actually might (yet it will not be upstreamed :>)
21:02karolherbst: it doesn't help
21:03tobijk: karolherbst: yeah it frees up some regs, maybe not enough
21:03tobijk: so not enough glue for the spilling algo ;-)
21:03tobijk: but i have no clue how to start with the spilling issue
21:06karolherbst: I think this could be some kind of alignment issue
21:07tobijk: you mean the fact that we try to unspill again as soon as possible?
21:08karolherbst: at least the shader compiles with 73 gprs
21:09karolherbst: mov u32 $r33 $r71 (8) and the next tex reads and writes to $r32d
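The overlap karolherbst points at can be sketched in a few lines of Python. This is only an illustration, assuming the register naming used in the printed IR: `$rN` is a single 32-bit GPR, `$rNd` a 64-bit pair covering N and N+1, and `$rNq` a 128-bit quad. The helper names `reg_span` and `overlaps` are hypothetical and not part of Mesa. Whether this overlap is the actual cause of the RA failure is not established in the log.

```python
import re

def reg_span(name):
    """Return the set of 32-bit GPR indices covered by a register name.

    Assumed naming (not taken from the log): "$rN" is one 32-bit register,
    "$rNd" a 64-bit pair (N, N+1), "$rNq" a 128-bit quad (N..N+3).
    """
    m = re.fullmatch(r"\$r(\d+)([dq]?)", name)
    if m is None:
        raise ValueError(f"unrecognised register name: {name!r}")
    base = int(m.group(1))
    width = {"": 1, "d": 2, "q": 4}[m.group(2)]
    return set(range(base, base + width))

def overlaps(a, b):
    """True if the two register names share at least one 32-bit GPR."""
    return bool(reg_span(a) & reg_span(b))

if __name__ == "__main__":
    # The mov writes $r33; the following tex reads and writes $r32d.
    print(overlaps("$r33", "$r32d"))
    # The mov source $r71 is far away from the pair.
    print(overlaps("$r71", "$r32d"))
```

Under these assumptions the mov destination `$r33` is the upper half of `$r32d`, so the mov fills part of the pair that the next tex both consumes and overwrites.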
21:14tobijk: karolherbst: mind to paste the shader with the half done ra you have?
21:14tobijk: i'm too lazy right now to get my hands on that :)
21:16tobijk: karolherbst: never mind got a ra completed version instead
21:24karolherbst: mhh interesting
21:24karolherbst: RA doesn't seem to be _that_ broken
21:26karolherbst: imirkin_: I think RA just fails to detect _what_ needs to be spilled and ends up spilling not enough values
21:26tobijk: karolherbst: well we fail to spill enough values for some reasons where we clearly ought to spill
21:31tobijk: oh my propagate values for builtin calls was not enough
21:31tobijk: that shader has a vfetch which could go directly to a call input reg *meh*
21:44tobijk: we could save some regs if we would use some more movs for immediates
21:45tobijk: (only where needed of course)