00:06 Lyude: OH, now that is seriously interesting
00:06 Lyude: blacklisting nouveau and rebooting enough times eventually gets me a spurious interrupt before nouveau's loaded
00:06 Lyude: and it seems to fail disp if I try loading nouveau after that
00:07 Lyude: there's definitely something evil here
00:07 Lyude: the owls are not what they seem
00:09 karolherbst: Lyude: mhhhh, maybe we get an interrupt which triggers something odd?
00:09 Lyude: karolherbst: well, an unclaimed interrupt is almost always a sign something very bad is going on to begin with
00:10 Lyude: most likely to do with the fw, I guess then the question is what
00:10 karolherbst: Lyude: ohhhhhhhhhhhh
00:10 karolherbst: it is so obvious now
00:10 Lyude: hm?
00:10 karolherbst: we get an interrupt and we _think_ it was the evo failing executing the method
00:10 karolherbst: but
00:10 karolherbst: it wasn't the evo
00:11 Lyude: there's a catch
00:11 Lyude: so
00:11 karolherbst: it is just that crappy GPU state hammering interrupts like crazy
00:11 karolherbst: or well, more or less
00:11 karolherbst: soooo
00:11 Lyude: the thing is when I loaded nouveau after I saw the spurious interrupt, display still failed in response to an interrupt
00:11 karolherbst: yeah
00:12 karolherbst: it makes sense
00:12 Lyude: unless the interrupt would remain unclaimed until we loaded nouveau, but it said it was disabling the interrupt entirely
00:12 karolherbst: let me read some code
00:12 Lyude: k
00:12 karolherbst: Lyude: no, I think it is something like that: we enable the display engine, so we also start handling those interrupts
00:12 karolherbst: but
00:12 karolherbst: the engine is in a weirdo state hammering interrupts until we finish init
00:13 karolherbst: soooo
00:13 karolherbst: Lyude: mind hacking something up to ignore interrupts until we are done calling core507d_init?
00:14 karolherbst: uhm
00:14 karolherbst: until we call evo_wait the first time actually
00:15 Lyude: karolherbst: maybe (btw, I am very surprised we aren't already disabling interrupts before doing anything else since I think AMD does that...), I thought I tried that already but I might not have put the write to 0x6100b0 early enough
00:15 nyef: Can you "just" mask the interrupts from PDISPLAY at the PBUS level until things are set up?
00:16 karolherbst: yeah, it is a guess from my side, but it _might_ be the cause
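[Note: the idea floated above (keep display interrupts masked until init has finished, then unmask) can be sketched as a toy model. This is only an illustration under stated assumptions: `toy_disp`, `toy_raise`, `toy_init_begin` and `toy_init_end` are hypothetical names, not nouveau code, and the enable word merely stands in for whatever register actually gates PDISPLAY interrupts.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of a display engine; not a nouveau structure. */
struct toy_disp {
    uint32_t intr_enable;   /* stand-in for an interrupt-enable register */
    uint32_t intr_pending;  /* bits the simulated hardware has raised */
    unsigned handled;       /* how many times the driver reacted */
};

/* Simulated hardware raising interrupt bits. The driver only reacts
 * when at least one pending bit is also enabled. */
void toy_raise(struct toy_disp *d, uint32_t bits)
{
    assert(d != NULL);
    d->intr_pending |= bits;
    if (d->intr_pending & d->intr_enable) {
        d->handled++;                        /* driver would act here */
        d->intr_pending &= ~d->intr_enable;  /* acknowledge handled bits */
    }
}

/* Mask everything before touching the engine. */
void toy_init_begin(struct toy_disp *d)
{
    assert(d != NULL);
    d->intr_enable = 0;
}

/* Drop whatever was raised while masked, then unmask. */
void toy_init_end(struct toy_disp *d)
{
    assert(d != NULL);
    d->intr_pending = 0;
    d->intr_enable = 0xffffffffu;
}
```

[In this model, anything raised between `toy_init_begin()` and `toy_init_end()` is never handled; whether real hardware behaves this way is exactly what is being tested in the conversation.]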
00:16 Lyude: is there anything earlier than disp->init?
00:17 karolherbst: Lyude: nouveau_display_create
00:17 karolherbst: but... I hope nouveau_display_create isn't doing anything stupid
00:17 karolherbst: but you never know
00:17 Lyude: it might not need to, the bios could have done something stupid well before it gave us a chance to
00:18 karolherbst: maybe
00:26 karolherbst: Lyude: mhh, do you think something _could_ happen here? "drm_kms_helper_poll_init(dev); drm_kms_helper_poll_disable(dev);" ?
00:26 Lyude: yes
00:26 Lyude: but let me take a closer look
00:26 karolherbst: I mean, that looks super suspicious
00:28 karolherbst: Lyude: I guess we might want to move the init up to where we first call _enable?
00:28 Lyude: karolherbst: tbh, that is really supposed to only do something if we had polled connectors (and we don't on nouveau) so I'm not terribly suspicious of that
00:28 Lyude: but i'm verifying that right now
00:28 karolherbst: ahh
00:28 Lyude: also: interrupt trick didn't work
00:28 karolherbst: mhhh
00:32 karolherbst: Lyude: see that "ret = nouveau_bo_new(&drm->client, 4096, " inside nv50_display_create?
00:32 Lyude: karolherbst: yeah; that's definitely not doing anything (the polling I mean)
00:32 Lyude: karolherbst: yep
00:32 karolherbst: that should be the memory we use for the evo stuff
00:34 karolherbst: mhh
00:35 karolherbst: and then we call into nv50_core_new
00:36 Lyude: karolherbst: btw, I can still give you access to this machine if you want to take a closer look
00:37 karolherbst: yeah, but not today, you can pm me the stuff and then I could check tomorrow morning
00:37 Lyude: sgtm
00:37 Lyude: i want to go home anyway :)
00:37 karolherbst: :)
00:38 Lyude: heading off now, have a good night!
00:40 HdkR: Does Nouveau support Turing yet?
00:40 HdkR: </s>
00:40 karolherbst: the heck
00:41 karolherbst: HdkR: maybe :p
00:41 HdkR: lol
00:42 karolherbst: what kind of question is that anyway :D
00:42 HdkR: It was just announced, so obviously I had to do the sarcastic question immediately
00:42 karolherbst: I mean, either there are patches somewhere, something is merged or people working on it wouldn't be allowed to tell you anyway :p
00:42 karolherbst: really?
00:42 HdkR: https://www.anandtech.com/show/13214/nvidia-reveals-next-gen-turing-gpu-architecture
00:43 karolherbst: skeggsb: ^^
00:43 HdkR: Siggraph stream just happened
00:47 karolherbst: yeah... dunno
00:48 karolherbst: I hope Turing doesn't have a new ISA...
00:48 karolherbst: or other things
00:48 HdkR: It's all about the gigglerays
00:50 karolherbst: yeah well...
00:50 karolherbst: apparently they pumped up int16 and int8 perf
00:50 karolherbst: which _might_ be interesting for dolphin
00:50 HdkR: Dolphin needs int24 mostly
00:51 karolherbst: soooo int16 + int8 + carry :p
00:51 HdkR: hah
00:51 HdkR: or int32 + clamp or mask
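[Note: the two options just mentioned for 24-bit integer math can be sketched in plain C. This is a minimal illustration, assuming an unsigned wrapping add is the operation of interest; `add24_mask` and `add24_split` are hypothetical names, and nothing here says how any GPU or Dolphin actually does it.]

```c
#include <assert.h>
#include <stdint.h>

/* Option 1: one 32-bit add, then mask the result back down to 24 bits. */
uint32_t add24_mask(uint32_t a, uint32_t b)
{
    assert((a >> 24) == 0 && (b >> 24) == 0);  /* inputs must fit in 24 bits */
    return (a + b) & 0xffffffu;
}

/* Option 2: a 16-bit add for the low half, an 8-bit add for the high
 * byte, and a carry passed from the low half into the high byte. */
uint32_t add24_split(uint32_t a, uint32_t b)
{
    assert((a >> 24) == 0 && (b >> 24) == 0);  /* inputs must fit in 24 bits */
    uint32_t lo_sum = (a & 0xffffu) + (b & 0xffffu);
    uint16_t lo     = (uint16_t)lo_sum;        /* low 16 bits of the result */
    uint32_t carry  = lo_sum >> 16;            /* 0 or 1 */
    uint8_t  hi     = (uint8_t)((a >> 16) + (b >> 16) + carry);
    return ((uint32_t)hi << 16) | lo;
}
```

[Both functions return the same value for any pair of 24-bit inputs; which one is cheaper depends entirely on the relative int8/int16/int32 throughput of the hardware.]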
00:51 karolherbst: well
00:51 karolherbst: that is slower ;)
00:52 karolherbst: if int16 is 2x int32 in terms of speed
00:52 karolherbst: and int8 4x int32
00:52 karolherbst: that might introduce some weirdly optimized shaders
00:52 karolherbst: if only because glsl doesn't know about int16 or int8
00:52 karolherbst: _but_
00:52 karolherbst: a compiler could be outsmarted
00:53 karolherbst: int temp = input & 0xffff;
00:53 karolherbst: which could be optimized to a single b16 load
00:54 karolherbst: maybe clamping would give you the same thing
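[Note: the mask-to-narrow-load idea above can be illustrated in C. This is a sketch, assuming a little-endian machine (so the low 16 bits sit at the lowest address); `load_then_mask` and `load_b16` are hypothetical names, not compiler passes that exist anywhere.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* What the source says: load all 32 bits, then keep the low 16. */
uint32_t load_then_mask(const uint32_t *p)
{
    assert(p != NULL);
    return *p & 0xffffu;
}

/* What an optimizer could emit instead: read only 16 bits from memory.
 * Little-endian layout is assumed here. */
uint32_t load_b16(const uint32_t *p)
{
    assert(p != NULL);
    uint16_t lo;
    memcpy(&lo, p, sizeof lo);  /* copies the first two bytes of *p */
    return lo;
}
```

[On a little-endian target both functions return the same value, which is the equivalence such a pass would rely on.]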
00:54 karolherbst: but...
00:54 karolherbst: I don't look forward in writing those passes :D
00:54 HdkR: haha
00:54 karolherbst: *to
00:55 karolherbst: but uhm....
00:55 karolherbst: the ISA has to change for that
00:55 karolherbst: meh....
00:55 karolherbst: another ISA
00:55 * HdkR stacks ISAs higher
00:56 karolherbst: I mean, with the Volta one they have enough space for all that stuff...
00:56 karolherbst: maybe they just add a variant
00:56 karolherbst: or something
00:56 karolherbst: dunno how much space there actually is
17:16 Lyude: karolherbst: poke, I've got the machine here whenever you want to try taking a look at it today
20:48 Lyude: karolherbst: I just noticed this: https://paste.fedoraproject.org/paste/UtZP52I0pRyfjWz9DYpgKQ
20:49 Lyude: before we create the core channel, the registers the interrupt is complaining about actually contain the core507d_init method
20:49 Lyude: then we try kicking it after the disp error happens
20:49 Lyude: i'm going to see if I can figure out if anything in nouveau happens to be the one writing those registers before evo gets set up
20:49 karolherbst: yeah, makes sense
20:50 Lyude: at the very least, someone messing with our push buffer seems slightly less likely
22:09 Lyude: karolherbst: you had mentioned that you were worried this weird disp bug might be the cause of some sort of caching issues, correct?
22:10 karolherbst: yes
22:10 karolherbst: why? :D
22:10 Lyude: OH
22:10 Lyude: nevermind
22:10 Lyude: but
22:10 Lyude: i just got a lead!
22:11 Lyude: happens between lines 2225 and 2235 in drivers/gpu/drm/nouveau/dispnv50/disp.c
22:12 Lyude: oh, and a second time between 2235-2241
22:12 Lyude: (those line numbers are probably a little off for you, as that's after adding a lot of various printks)
22:58 Lyude: well, still not much more informed of what's going on here than before, the line numbers don't match any place we're writing, although it seems that the bug might be able to happen anywhere after we call nv50_core_new(), which isn't terribly useful...
22:58 Lyude: karolherbst: do you have any idea how we might be able to tell if this is some sort of caching issue or not?
22:59 karolherbst: no idea
23:01 JayFoxRox: are the NV20 / NV2A fogtables dumped anywhere?
23:52 Lyude: skeggsb: what is the bios setup on your p50 like, do you have uefi enabled/disabled? or anything else changed from the defaults?