21:10 karolherbst: should maybe write a script to do this just every day
21:10 bencoh: asking people for lspci?
21:10 karolherbst: no, inviting dri-logger
21:11 bencoh: oh :]
21:11 karolherbst: seems like we don't have logs for the entire month
21:12 karolherbst: bencoh: lspci -tvvnn as well please
21:12 bencoh: to be honest I should have printed bridge->device in quirk_broken_nv_runpm(), but looks like I got lazy back then
21:12 karolherbst: well.. lspci can tell us :p
21:12 bencoh: yeah, I was actually wondering which one was connected to the nvidia card as well
21:13 bencoh: http://pastebin.notk.org/pastebin.php?show=f7918092f
21:14 karolherbst: so it's PCI bridge [0604]: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port [8086:0151] (rev 09)
21:14 karolherbst: which I guess kind of makes sense
21:15 karolherbst: I guess it's the same one
21:15 karolherbst: just for apple
21:15 karolherbst: "00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 05)" is the one blacklisted
21:15 karolherbst: ohh.. newer gen
21:15 karolherbst: mhhhh
21:15 karolherbst: bencoh: and the patch helps indeed I guess?
21:15 bencoh: it does :)
21:15 karolherbst: annoying
21:16 karolherbst: I hoped we wouldn't need to add another one, but maybe I just called for it by adding the switch :p
21:16 bencoh: I suppose there are more bridges out there that show the same behavior
21:16 karolherbst: maybe
21:16 bencoh: you definitely did :D
21:16 karolherbst: but we tested others, and they were fine afaik
21:16 karolherbst: it seems to be pretty much just the "main" bridge controller
21:17 karolherbst: sometimes you have the GPU on one of the lower bridges
21:17 bencoh: when I started investigating it I first went through git log thinking "maybe someone fixed that but not for my hw", and ... it certainly called for me
21:17 karolherbst: those 1c.x ones
21:17 karolherbst: and they are fine afaik
21:18 karolherbst: bencoh: well.. at this point I think it would be safe to just include your id and send a patch out
21:18 karolherbst: and cc stable
21:18 bencoh: 0x0151 then?
21:18 karolherbst: yeah
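The ID comparison above (0x1901 already handled, 0x0151 proposed) can be sketched outside the kernel as a small Python check on `lspci -nn` output. This is only an illustration under assumptions: the names AFFECTED_INTEL_BRIDGES, parse_pci_id and bridge_needs_quirk are hypothetical and are not part of nouveau or pciutils, and the real fix is the one-line addition to the kernel quirk discussed in the log.

```python
import re

# Intel's PCI vendor ID, as shown in the lspci lines quoted in the log.
INTEL_VENDOR_ID = 0x8086

# Hypothetical set for illustration: 0x1901 is the bridge already handled,
# 0x0151 is the Ivy Bridge root port proposed in this conversation.
AFFECTED_INTEL_BRIDGES = {0x1901, 0x0151}

# `lspci -nn` prints the class as [0604] and the IDs as [vendor:device];
# only the second form contains a colon, so this pattern skips the class.
_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def parse_pci_id(lspci_line):
    """Return (vendor, device) as integers, or None if no ID pair is found."""
    match = _ID_RE.search(lspci_line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def bridge_needs_quirk(lspci_line):
    """True if the line describes an Intel bridge in the affected set."""
    ids = parse_pci_id(lspci_line)
    if ids is None:
        return False
    vendor, device = ids
    return vendor == INTEL_VENDOR_ID and device in AFFECTED_INTEL_BRIDGES
```

Feeding it the two bridge lines quoted earlier in the log would flag both; a bridge with any other device ID would not be flagged.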
21:19 bencoh: alright, I'll need to test it just to make sure and send a patch
21:19 karolherbst: I am still not sure if it's some bug in the kernel or some hw bug :/
21:19 karolherbst: but keeping track of the devices where it does seem to help might help with something
21:19 karolherbst: _but_
21:19 karolherbst: it does not surprise me that it happens on an apple device
21:19 bencoh: don't nouveau patches hit some staging tree before reaching stable@ btw?
21:20 karolherbst: bencoh: normally yes, but for this it's fine to send to dri-devel and stable directly
21:20 bencoh: huhu, to be honest I'm surprised I got this laptop to work nicely at all
21:20 karolherbst: and add my review-by
21:20 karolherbst: "Reviewed-by: Karol Herbst <kherbst@redhat.com>"
21:21 bencoh: noted :)
21:21 karolherbst: (well.. as long as the patch really only adds the id :p)
21:21 bencoh: (haha, yeah)
21:21 karolherbst: bencoh: the older gens are manageable
21:21 karolherbst: I had a macbook pro from the.. sandy bridge era
21:21 karolherbst: which I think is slightly older than yours
21:22 bencoh: mine is ivy bridge, apparently
21:22 karolherbst: yeah
21:22 karolherbst: and it was totally not working 5-6 years ago :p
21:22 karolherbst: bencoh: ohh, I guess you are legacy booting, right?
21:22 karolherbst: ohh no
21:23 karolherbst: wait...
21:23 bencoh: by the way, since we're at it ... I need to bring the nvidia card on before suspending (mem state), otherwise it just oops (at resume iirc)
21:23 karolherbst: of course not :p
21:23 bencoh: no, I'm using rEFIt
21:23 karolherbst: yeah..
21:23 karolherbst: earlier you had to add grub hacks to even make it display on the intel gpu
21:23 bencoh: (I bring it on using vgaswitcheroo)
21:23 karolherbst: as it defaults to the discrete GPU when booting
21:23 bencoh: indeed
21:23 bencoh: well, I didn't had to, but I remember reading about that
21:24 bencoh: have* to
21:24 karolherbst: mine was a MacBookPro8,2
21:24 bencoh: so err, is there any proper reason that the card needs to be enabled before suspend, or should that be taken care of by the driver?
21:25 bencoh: (I'd tend to say the driver should attend to it, or at least handle resume properly nonetheless)
21:27 karolherbst: bencoh: state needs to be saved
21:27 karolherbst: you don't want to keep VRAM powered on while suspending, do you?
21:28 bencoh: right, but I'd say the driver should still do it even if the card is "off"
21:28 karolherbst: or is there an issue with resuming?
21:28 bencoh: (not that I really know what it means for it to be off)
21:28 bencoh: I think there is, I'll paste some logs
21:28 karolherbst: well, no power -> bits become random over time
21:29 karolherbst: I guess we could skip it if nothing uses the GPU...
21:29 karolherbst: skeggsb: could we skip doing the full resume path if nothing ended up using the GPU on suspend?
21:29 karolherbst: not that it matters...
21:29 karolherbst: bencoh: apparently there are some (or all) firmwares which "lose" the GPU when it was powered off when suspending
21:30 karolherbst: so we can't skip waking it up, but we could make suspend faster by skipping some work
21:30 bencoh: http://pastebin.notk.org/pastebin.php?show=f41cc7bf7
21:31 karolherbst: bencoh: that's without the patch, right?
21:31 bencoh: for now it just crashes unless I power it on manually before entering suspend
21:31 bencoh: karolherbst: that happens even with the patch
21:31 karolherbst: huh.. strange
21:31 bencoh: the patch prevents the d3hot/d3cold issue
21:31 karolherbst: that's... weird
21:32 bencoh: well, at least on this laptop that's how it is
21:32 karolherbst: yeah... odd
21:33 bencoh: I added echo ON > vgaswitcheroo to pm-suspend/systemd hooks, and it has been suspending fine since then
21:33 bencoh: I still can't get hibernate to play nice as well, but ... I guess that'd be more tedious
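The workaround bencoh describes (writing ON to vgaswitcheroo from a suspend hook) could look roughly like the Python sketch below. Assumptions are labeled: the control file is taken to be at the usual debugfs location /sys/kernel/debug/vgaswitcheroo/switch, the hook is assumed to be called the way systemd calls system-sleep scripts ("pre" or "post" followed by the sleep type), and the function names are hypothetical. bencoh's actual hook is not shown in the log.

```python
from pathlib import Path

# Usual debugfs location of the vga_switcheroo control file (assumption:
# debugfs is mounted and the script runs as root).
SWITCH_PATH = Path("/sys/kernel/debug/vgaswitcheroo/switch")


def should_run_hook(argv):
    """True only for the 'pre suspend' invocation of a systemd sleep hook."""
    return len(argv) >= 3 and argv[1] == "pre" and argv[2] == "suspend"


def power_on_discrete_gpu(switch_path=SWITCH_PATH):
    """Write ON to the switcheroo file; return False if the file is absent."""
    path = Path(switch_path)
    if not path.exists():
        return False
    path.write_text("ON\n")
    return True
```

A hook script would call power_on_discrete_gpu() only when should_run_hook(sys.argv) is true, so the card is powered on right before entering suspend and nothing is touched on resume.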
21:34 karolherbst: ehhh.. something is weird
21:35 bencoh: karolherbst: you're referring to that part I suppose
21:35 bencoh: Jul 2 05:00:46 alkaid kernel: [ 1153.126263] nouveau 0000:01:00.0: Refused to change power state, currently in D3
21:35 karolherbst: mind printing the kernel stack when that happens?
21:35 karolherbst: the message is somewhere in the pci code
21:35 bencoh: I'd need to build it with kgdb on then
21:36 karolherbst: no
21:36 karolherbst: there is a function.. wait
21:36 bencoh: well, we already have a stack trace
21:36 karolherbst: yeah.. but that's somewhere else :p
21:36 bencoh: oh, you mean adding dump_stack() to that code
21:36 karolherbst: yes
21:36 karolherbst: just right after the message
21:36 robi: who would be the best person to send a patch?
21:36 bencoh: the Refused to change power state, currently in D3 one, you mean
21:36 robi: (for review)
21:37 karolherbst: robi: the nouveau mailing list
21:37 robi: got it
21:37 karolherbst: bencoh: yes
21:37 bencoh: I just realized this doesn't happen *during* resume, but during ON > vgaswitcheroo *after* resume, by the way
21:37 bencoh: as if suspending it while it's off broke something
21:37 karolherbst: bencoh: yeah... I was already wondering :p
21:38 karolherbst: bencoh: ahh, yeah
21:38 karolherbst: can't do that :p
21:38 karolherbst: the GPU needs to be on while suspending
21:38 bencoh: yeah, the question is still why doesn't the driver take care of the needed part
21:38 bencoh: (during suspend/resume)
21:38 karolherbst: it does normally
21:38 karolherbst: I think...
21:38 karolherbst: maybe somebody messed with it
21:38 bencoh: hmm ... looks like it doesn't do everything that's needed on this laptop at least :D
21:39 bencoh: ah, could be as well :-)
21:39 karolherbst: uhm.. wait
21:40 bencoh: I'll be afk for a moment, coming back in 20~30mn :)
21:58 karolherbst: HdkR: https://gitlab.freedesktop.org/karolherbst/nouveau_ci/-/tree/master/
21:59 HdkR: 404'd, I guess private project?
21:59 karolherbst: ehhh
21:59 karolherbst: :D
21:59 karolherbst: are you logged in?
22:00 HdkR: I am
22:00 karolherbst: nickname?
22:00 HdkR: Sonicadvance1
22:01 karolherbst: GitLab's interface is just terrible when it comes to adding names
22:01 karolherbst: "ryan" just lists all bryans as well...
22:01 karolherbst: and if you put a space, it's like an or
22:01 karolherbst: terrible
22:02 karolherbst: ahh "ryan hou" is the magic string for you :p
22:02 HdkR: I've been mistaken for bryan many times by humans, I think it's matching behaviour :P
22:02 karolherbst: done, now you can see it :p
22:03 karolherbst: the python script is terrible though
22:03 karolherbst: too much os.system and sudo :D
22:03 karolherbst: _but_ it does set everything up
22:04 karolherbst: dnsmasq as DHCP and TFTP server, laptop finds it and does load the netboot.xyz binaries, which then loads the menu via https :)
22:04 karolherbst: now I just need to provide my own menu + ipxe files
22:05 karolherbst: still having some hardcoded interfaces in the dnsmasq.conf .. oh well
22:05 karolherbst: that's for later
22:06 HdkR: Everything is rough on the first iteration
22:06 HdkR: w/e :D
22:06 karolherbst: heh.. apparently my internet doesn't work anymore
22:06 HdkR: weird, I have "access" but it isn't showing me a file structure? wtf are you doing gitlab
22:06 karolherbst: ahh seems like only ipv4 is bonkers
22:07 karolherbst: ohh wait.. this happened earlier
22:07 karolherbst: something screws up my network (it's me probably?)
22:07 karolherbst: needed to reboot but any network access essentially blocks forever
22:07 karolherbst: it's weird
22:08 karolherbst: and some stuff still works
22:09 HdkR: The boot starts taking down your entire network, or the device's networking is broken?
22:09 karolherbst: dunno
22:09 karolherbst: it just appears to start randomly
22:09 karolherbst: I think my second network interface dies in a weird way
22:10 karolherbst: ahh yeah.. lsusb hangs as well
22:11 karolherbst: HdkR: device path of hell: ../../devices/pci0000:00/0000:00:1d.6/0000:06:00.0/0000:07:02.0/0000:3e:00.0/usb4/4-1/4-1.4/4-1.4:1.0/net/enp62s0u1u4
22:12 karolherbst: ahh yeah..
22:12 karolherbst: it's definitely the device
22:12 karolherbst: or kernel
22:12 HdkR: Something not interacting nicely with that USB device at the very least
22:13 karolherbst: I blame TB actually...
22:13 karolherbst: I just removed the device, but it's still there
22:13 karolherbst: "[18231.513344] pcieport 0000:07:00.0: can't change power state from D3cold to D0 (config space inaccessible)" :D
22:13 karolherbst: yeah.. lol
22:14 robi: karolherbst: that path looks like something you might get if you daisychain 5 usb hubs together
22:14 karolherbst: I wouldn't be surprised if the commit nouveau broke with is involved here as well
22:14 karolherbst: robi: nope
22:14 karolherbst: USB-C hub on my TB port :p
22:14 robi: lol
22:15 karolherbst: it's... weird
22:15 karolherbst: so...
22:15 karolherbst: TB isn't USB-C directly.. so in order to accept USB-C devices, it has to spawn a fake USB controller
22:15 karolherbst: 00:1d.6 is a PCIe port, a normal one
22:15 karolherbst: 0000:06:00.0 is a spawned TB bridge
22:16 karolherbst: 0000:07:02.0 is .. another spawned bridge? (now, that's ridiculous)
22:16 karolherbst: 0000:3e:00.0 is the USB 3.1 controller from the TB controller
22:16 karolherbst: and then you get the USB stuff
22:16 karolherbst: :D
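The hop-by-hop reading of that sysfs path can be reproduced with a short Python sketch. It only splits the string quoted in the log and picks out the components that look like PCI addresses (domain:bus:device.function); the helper names are hypothetical, and what each hop actually is (root port, Thunderbolt bridge, USB controller) still comes from karolherbst's explanation, not from the code.

```python
import re

# A PCI address in sysfs looks like 0000:3e:00.0 (domain:bus:device.function).
PCI_ADDR_RE = re.compile(
    r"^([0-9a-f]{4}):([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])$"
)


def pci_hops(sysfs_path):
    """Return the PCI addresses found in a sysfs device path, in order."""
    return [part for part in sysfs_path.split("/") if PCI_ADDR_RE.match(part)]


def split_pci_address(address):
    """Return (domain, bus, device, function) as integers."""
    match = PCI_ADDR_RE.match(address)
    if match is None:
        raise ValueError(f"not a PCI address: {address!r}")
    domain, bus, device, function = match.groups()
    return int(domain, 16), int(bus, 16), int(device, 16), int(function)


path = ("../../devices/pci0000:00/0000:00:1d.6/0000:06:00.0/0000:07:02.0/"
        "0000:3e:00.0/usb4/4-1/4-1.4/4-1.4:1.0/net/enp62s0u1u4")
for hop in pci_hops(path):
    print(hop, split_pci_address(hop))
```

The last hop is on bus 0x3e, which is 62 in decimal and matches the "p62" in the interface name enp62s0u1u4.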
22:17 HdkR: Thunderbolt is amazing in so many ways...
22:17 karolherbst: ahhh ACPI throws some errors in dmesg as well
22:17 karolherbst: _fun_
22:17 karolherbst: and here I thought runpm enabled all the way is a good idea :D
22:26 karolherbst: *sigh*
22:26 karolherbst: reboot it is then
22:35 karolherbst: HdkR: can you see the code now?
22:36 HdkR: ah, now I can
22:36 karolherbst: weird.. seems like being a guest isn't enough :p
22:36 HdkR: Wacky
22:39 HdkR: ah okay, I see how this work
22:39 HdkR: s
22:39 karolherbst: there are a few things I don't like :p
22:40 karolherbst: but in the end I want it to auto install :)
22:40 karolherbst: but that's for next week
22:41 HdkR: How does Netboot know what OSes are available?
22:41 karolherbst: it has a full https stack...
22:41 karolherbst: so.. it downloads the menu and everything :D
22:41 karolherbst: it's quite insane
22:42 HdkR: Oh, so its "Installers->Operating Systems" menu actually just downloads the media from the web without configuration?
22:42 karolherbst: yep
22:42 HdkR: Whoa
22:42 karolherbst: you can provide your own files though
22:42 karolherbst: but then it still downloads the images
22:43 karolherbst: well
22:43 karolherbst: depending on the ipxe file
22:43 karolherbst: it's quite cool actually
22:44 HdkR: Yea, that's really neat. I'm definitely going to toy with this once I get my hands on a new ARM device :D
22:44 karolherbst: :D
22:45 karolherbst: it doesn't do arm though
22:45 HdkR: womp womp
22:45 karolherbst: well.. it's firmware code :p
22:46 karolherbst: but that stuff even comes with a node web frontend if you want to, and you can deploy new ipxe files while the service is running
22:46 karolherbst: and if an arm device supports any of that I am sure you could also support it there
22:46 karolherbst: just... the ipxe files are quite x86 centric
22:47 karolherbst: (I skipped the node stuff obviously)
22:49 HdkR: Hm, I'll have to check how this ProX supports ipxe boot then
22:49 HdkR: Maybe the network bits just don't work past that point
22:50 karolherbst: it probably just downloads the kernel and initrd file directly
22:50 karolherbst: normally the bootloader supports BOOTC and/or TFTP
22:50 karolherbst: uhm
22:50 karolherbst: BOOTP
22:51 karolherbst: PXE is weird...
22:51 karolherbst: but I am sure arm devices don't do PXE
22:52 HdkR: I'll check its boot options in a bit
22:53 HdkR: Wacky microsoft product
22:53 karolherbst: ohh wait.. I do the same, just instead of using closed source PXE I use netboot.xyz :D
22:54 karolherbst: still needs to understand how all of that works
22:55 karolherbst: ahh.. there is "Syslinux PXELINUX" and gPXE/iPXE as open source alternatives
23:00 karolherbst: oh well
23:00 karolherbst: let's see what works reliably enough :D