00:07 karolherbst: was able to bring it back...
00:08 karolherbst: just removed the bridge as well
00:08 karolherbst: and rescanned
00:08 karolherbst: Lekensteyn: anyway, if I drop the speed to 5.0 before suspending, it doesn't break either
00:08 karolherbst: so it is really only 2.5
00:09 karolherbst: I was wondering if the 8b/10b vs 128b/130b encoding causes it, but apparently not
00:09 karolherbst: this is just stupid
00:11 karolherbst: Lekensteyn: ahhh, the PCI slot status reg is telling us that the slot is empty
00:11 karolherbst: mhhhh
00:11 karolherbst: 0xba bit 6
00:12 karolherbst: after system suspend/resume that bit flips to on
00:12 karolherbst: and rescanning the bus brings the gpu back
00:13 karolherbst: fun
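(Editor's note: a minimal sketch of the check described above. It assumes 0xba is the Slot Status register of the bridge's PCIe capability and that bit 6 is Presence Detect State, as defined in the PCIe spec. The helper name and the sample values are hypothetical and not taken from the gist.)

```python
# Slot Status bit 6: Presence Detect State (1 = card present, 0 = slot empty).
PCI_EXP_SLTSTA_PDS = 1 << 6


def slot_has_device(slot_status: int) -> bool:
    """Return True if the 16-bit Slot Status value reports a card in the slot."""
    return bool(slot_status & PCI_EXP_SLTSTA_PDS)


# Hypothetical register values, for illustration only.
print(slot_has_device(0x0040))  # bit 6 set
print(slot_has_device(0x0000))  # bit 6 clear
```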
00:15 karolherbst: Lekensteyn: https://gist.github.com/karolherbst/72861fe3b7833cabc0525b909083a940
00:15 karolherbst: check the PCI R 00:01.0 0xb8 lines
00:15 karolherbst: the first puts the GPU into 5.0
00:15 karolherbst: the second into 2.5
00:15 karolherbst: so after invoking _OFF the bridge thinks the device is gone from the slot?
00:15 karolherbst: that would explain a lot?
00:15 Lekensteyn: uh wut
00:18 Lekensteyn: could it be a side-effect? Can you also dump the link control and link status registers?
00:18 karolherbst: b2 and b0?
00:19 karolherbst: k
00:20 karolherbst: Lekensteyn: mhh, nothing
00:20 karolherbst: https://gist.github.com/karolherbst/72861fe3b7833cabc0525b909083a940
00:20 karolherbst: the changing bits just indicate the current speed
00:20 karolherbst: and on OFF -> ON transition the devices successfully agreed on 8.0
00:20 karolherbst: because that's what the GPU comes up with
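(Editor's note: a rough sketch of how the "current speed" bits can be read, assuming 0xb2 is the PCIe Link Status register, where bits 3:0 hold the current link speed and bits 9:4 the negotiated width. The function name and the sample value are made up for illustration.)

```python
# Link speed encodings for PCIe generations 1 to 3.
LINK_SPEEDS = {1: "2.5 GT/s", 2: "5.0 GT/s", 3: "8.0 GT/s"}


def decode_link_status(lnksta: int):
    """Split a 16-bit Link Status value into (speed string, lane count)."""
    speed = lnksta & 0xF          # bits 3:0, current link speed
    width = (lnksta >> 4) & 0x3F  # bits 9:4, negotiated link width
    return LINK_SPEEDS.get(speed, "unknown"), width


# Hypothetical value: speed code 3, width 16.
print(decode_link_status(0x0103))
```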
00:20 Lekensteyn: eh oops I meant link control 2 and slot control 2
00:21 karolherbst: ahh
00:21 Lekensteyn: *link control/status 2
00:21 Lekensteyn: 0xd0?
00:22 karolherbst: yeah
00:23 karolherbst: Lekensteyn: heh... more fun: https://gist.github.com/karolherbst/72861fe3b7833cabc0525b909083a940
00:23 karolherbst: link status 0s out
00:24 karolherbst: mhh 0x1f means eq phase 3 successful
00:24 karolherbst: 0 means everything is shit
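(Editor's note: a small sketch of what 0x1f versus 0 would decode to, assuming the value is the low byte of the PCIe Link Status 2 register, which would sit at 0xd2 if Link Control 2 is at 0xd0. The bit names follow the PCIe spec; the function name is hypothetical.)

```python
# Low bits of Link Status 2, per the PCIe spec.
LNKSTA2_BITS = [
    (0x01, "current de-emphasis level"),
    (0x02, "equalization complete"),
    (0x04, "equalization phase 1 successful"),
    (0x08, "equalization phase 2 successful"),
    (0x10, "equalization phase 3 successful"),
    (0x20, "link equalization request"),
]


def decode_link_status2(value: int):
    """Return the names of the Link Status 2 bits that are set."""
    return [name for mask, name in LNKSTA2_BITS if value & mask]


for flag in decode_link_status2(0x1F):
    print(flag)
print(decode_link_status2(0x00))  # no equalization state reported at all
```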
02:39 imirkin: skeggsb: just checking if you need anything more from me on the LUT patch
02:39 imirkin: (i mailed a proper version last week)
03:29 skeggsb: imirkin: i don't believe so, i'll have a look over again shortly and merge if not
03:29 imirkin: excellent thanks
03:35 imirkin: i think i mentioned this, but all my gv100+ changes are entirely speculative (although based on the docs available)
03:45 HdkR: Oh? What are you doing on GV100+? :)
03:45 imirkin: speculatively guessing how to flip between 256- and 1024-sized luts
03:46 HdkR: Ah, okay
03:46 imirkin: hard to beat actual testing
03:47 imirkin: (e.g. i discovered that the hw gets confused when you mix 1024 and 256-sized luts for input vs output
03:47 imirkin: semi-reasonable in hindsight, but it could have also worked ok, if they did things differently
03:48 skeggsb: what actually happens in that case btw
03:48 skeggsb: that surprised me a bit
03:48 imirkin: well, it "works"
03:48 imirkin: but the LUT values are all div-by-4 effectively :)
03:48 imirkin: i.e. it wants values 0..1023 but we give it 0..255 or whatever
03:49 imirkin: we could offset it with the GAIN_OFS thing, probably, but ... hard to care about that case
03:50 imirkin: feel free to try it out :)
03:51 imirkin: (the modetest tool from the "hdr" branch i've pointed a few times can do this)
03:52 imirkin: just run --degamma linear:1024 --gamma linear:256
03:52 imirkin: or give-versa on the sizes
03:52 imirkin: vice-versa*
11:33 pmoreau: imirkin: re “weren't you work on gmux stuff? and macs?” a tiny bit; I can have a look, but I don’t think I’ll be of much help.
12:32 karolherbst: Lekensteyn: I am wondering if I should execute the _OFF method manually until I hit the line which causes the port to get messed up :/
12:32 karolherbst: that's probably going to be fun
12:32 karolherbst: or is that something the ACPI debugger can do?
12:39 Tom^: karolherbst: you know what, i dont know if you remember the issue where i thought my 780ti was dying and broke cs:go while nouveau wasnt. but im starting to suspect its cs:go itself that did this "flickering/gray screen". because the nvidia blob zeroes vram allocations. and once radeonsi added it to drirc i get the exact same flicker/gray screen at random only it deadlocks. turning it off it doesnt happen.
12:39 Tom^: only the scope artifacts is back and i bet it didnt deadlock the nvidia blob because it dealt with the "situation" better
12:40 karolherbst: heh, fun
12:40 Tom^: karolherbst: :D ugh sometimes i hate software bugs.
12:49 karolherbst: Tom^: but that would kind of mean that everybody is hitting this issue now?
12:49 karolherbst: might be worth pinging #radeon and see what they say there
12:49 karolherbst: oh, you already did
12:50 Tom^: karolherbst: yeah, i did. but i also suspect im the 1% of these 1% running linux hitting it. the rest is on ubuntu and other distros where this hasnt landed yet
12:50 Tom^: and i suspect im even further deep down in the numbers because im playing it a bit eh nerdish once i do. couple hours straight! :D
12:52 Tom^: really need to setup ssh, might be able to get something out of it that way, just odd that it deadlocks with absolutely zero info in the journals
13:07 karolherbst: Lekensteyn: after the \_SB.PCI0.LKDS 0 the slot reports having no device
13:10 karolherbst: mhh, it essentially only sets Q0L2 = One
13:10 karolherbst: and waits for this value to become 0 again
13:57 karolherbst: Lekensteyn: progress!
13:57 karolherbst: I am now able to make the GPU disappear without ACPI code :)
15:19 karolherbst: Lekensteyn: do you know to which device \CPEX (0x5FF9BD98+0xE7) belongs
15:19 karolherbst: ?
15:22 karolherbst: mhh, for me that's BIOS-e820: [mem 0x00000000782b7000-0x0000000078997fff] ACPI NVS
15:31 karolherbst: Lekensteyn: I see you were already investigating this code, but I guess you never came to the conclusion that the write to Q0L2 kills it
15:41 karolherbst: ufff
15:41 karolherbst: that is it...
15:41 karolherbst: the fuck
15:46 karolherbst: Lekensteyn: okay... if I clear the lower bits of CPEX, meaning I make ACPI set P0L2 instead of Q0L2, runpm just works ....
15:46 karolherbst: no matter the link speed
19:09 imirkin_: Lyude: i guess the person with the fan issues has a GTX 260 (nva0)
19:10 imirkin_: nva0 usually had fancier things than most boards
19:11 Lyude: imirkin_: alright, I'll check in a little bit to see if I've got one of those around here somewhere