20:28 fltrz: karolherbst: I am totally confused: passing DRI_PRIME=1 or 2 with glxinfo results in either intel HD 4000 or nouveau nvidia results; but both selections combined with glxgears always use card1 when I check with lsof. so I still don't know what card1 refers to: intel HD 4000 or nvidia gpu
20:30 karolherbst: fltrz: you could check what outputs are on each
20:31 karolherbst: ohh
20:31 karolherbst: I think there is a simpler solution
20:31 karolherbst: "ls /dev/dri/by-path/ -lah"
20:52 fltrz: aha! card1 is intel HD 4000 after all, and card0 is nvidia!
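The by-path listing works because each symlink name in /dev/dri/by-path embeds the PCI address of the device, while the link target is the cardN or renderDN node. A minimal sketch of the same lookup as a shell function follows; the function name `list_dri_nodes` is made up for illustration, and the optional directory argument exists only so it can be pointed at a test directory.

```shell
# list_dri_nodes [DIR]: print "<by-path name> -> <device node>" for every
# symlink in DIR (default: /dev/dri/by-path).
# Hypothetical helper, not part of any existing tool.
list_dri_nodes() {
    dir=${1:-/dev/dri/by-path}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (this also covers an empty or
        # missing directory, where the glob stays unexpanded).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(basename "$(readlink "$link")")"
    done
}
```

Calling `list_dri_nodes` with no argument reads the real /dev/dri/by-path; the PCI address in each name can then be matched against `lspci` output to see which GPU a node belongs to.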
20:56 fltrz: weird: the lsof /dev/dri/{card,render}* has electron use renderD128 and renderD129, but renderD128 is associated with card0 according to your last "ls /dev/dri/by-path/ -lah" command, yet card0 is not in lsof output?
20:57 fltrz: it would be nice if I could get electron to use intel HD 4000 if I risk crashing the nvidia gpu occasionally
20:57 fltrz: thanks a lot for the last command
21:00 karolherbst: fltrz: I think it's a bug in electron actually...
21:00 karolherbst: not sure if one can force to use a specific GPU
21:02 karolherbst: fltrz: https://wiki.archlinux.org/title/chromium#Forcing_specific_GPU that might work
21:03 karolherbst: just instead of chromium use the electron app you want to start
21:03 karolherbst: they should accept the same args
21:32 phomes: last weekend I updated nvk to use the common physical device enumeration code. I am looking for another small thing to do next
21:37 karolherbst: phomes: yeah, thanks for that. I don't have a good list of small things to work on though. Maybe jekstrand has some ideas?
21:38 karolherbst: what GPU do you have btw?
21:40 phomes: it is a TU104
21:41 phomes: RTX 2080
21:42 karolherbst: could probably look into wiring up smaller extensions
21:45 phomes: I can do that. Do I just pick one and then try to find some isolated tests that can run even at the current state of the driver?
21:45 karolherbst: basically that
21:45 karolherbst: if you run all the current vk tests we have a pass rate of around 90%
21:45 karolherbst: but I'd suggest to only run the relevant ones unless you can do that on a spare machine
21:47 phomes: any suggestion for a small extension to start with? I do not know which ones are small
21:48 karolherbst: I don't know :) jekstrand is the vulkan expert here
21:48 karolherbst: jekstrand might even know good small tasks one could work on in parallel
21:50 karolherbst: phomes: mhh.. could also try to make zink run on top and start with simple things like glxinfo maybe
21:50 karolherbst: I have no idea how well it would work, but could be something to work on
21:50 karolherbst: I also don't think we have scanout wired up... could also be something useful
21:52 karolherbst: anyway... running zink on top of nvk could be a good reason to ditch our gallium driver, or at least having it as an option for things running faster
21:55 phomes: thank you for the suggestions. I will see if I can get zink running or if jekstrand has other ideas
22:06 jekstrand: phomes: We're getting really close to where people can start going nuts wiring up features. There's 3-4 items left which I consider to be "basics".
22:06 jekstrand: phomes: But someone could probably start already.
22:06 jekstrand: Getting WSI in shape and working towards Zink seems like a good early goal.
22:07 karolherbst: that reminds me.. I think my modifier patch is basically ready, even if it doesn't wire anything up
22:09 karolherbst: could probably merge this already: https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/78
22:11 karolherbst: needs some work parsing out the modifier, but didn't get to it yet
22:11 karolherbst: might be something for phomes to work on, but not sure if one could easily test this stuff
22:13 fltrz: hmm the min browser (or electron) keeps using renderD128, I tried your suggestion and also --force_low_power_gpu
22:14 karolherbst: :(
22:14 phomes: I may not have the experience to fix complicated things. A small extension with good test cases seems more manageable :)
22:14 karolherbst: fltrz: I suspect their code assumes "low power gpu" to be card0
22:14 karolherbst: phomes: you are in luck there as vulkan usually has a lot of tests for all extensions
22:15 karolherbst: but I'd see what zink is doing... it will probably crash on some functions or so
22:15 karolherbst: and then you could check which vulkan CTS tests are testing those
22:15 karolherbst: making glxinfo run shouldn't be terribly difficult... I hope
22:16 phomes: I will try that. Thanks again for the help
22:27 fltrz: the chrome://gpu page lists intel hd 4000 as active but also lists the existence of the other GPU. is it possible it just requests a render context on all GPU's? in case some PRIME switch or whatever happens in the unexpected direction so it could migrate?
22:27 karolherbst: no idea
22:27 karolherbst: it doesn't happen for me
22:30 fltrz: so I'm convinced that it's not really using the renderD128 (nvidia on card0), only the renderD129 on card1 (intel hd 4000), but really annoying if the GPU is active because min refuses to let go of renderD128
22:30 karolherbst: :/
22:30 karolherbst: so GL_VENDOR shows intel?
22:31 fltrz: I also seem to remember actually that the laptop was much colder until one day it got hotter (perhaps installation or update of min/electron?)
22:31 fltrz: karolherbst: yes
22:31 karolherbst: annoying
22:31 fltrz: unless I use DRI_PRIME=... with glxinfo so that it shows nouveau
22:32 fltrz: you told me yesterday there was a person with a reversed card0 and card1, so it looks like I have the same
22:32 karolherbst: yeah.. looks like it
22:32 karolherbst: applications shouldn't really care about any of this, but chromium is special and they apparently have their own handling
22:33 karolherbst: could file a bug there or something
22:33 fltrz: I wonder if perhaps chromium is annoying because it expects card0 to be integrated and card1 to be the discrete?
22:34 karolherbst: maybe
22:34 fltrz: I think you mentioned a way to reverse the order of card0,1
22:35 fltrz: perhaps I should just dump min/chromium
22:37 fltrz: sorry for my interrupting your conversation, I'll continue whatever I was doing
22:40 karolherbst: don't worry about that, just distracted with other things. I just don't know _why_ chromium behaves like that, but I think it would be best to file a bug and if nothing happens let us ping on that more firmly
22:48 fltrz: ooh, can I force card0 to disable? I wonder if it crashes min
22:49 fltrz: if it doesn't crash then its really not using card0 at all, and I can proceed with recompiling nouveau with OpenCL enabled
22:49 karolherbst: don't think so
22:50 karolherbst: though if it's not doing anything with the GPU it should still turn off
22:50 karolherbst: like even debugging OpenGL apps can cause the GPU to be turned off if nothing happens for a while
22:50 fltrz: even if min has an open render context without using it?
22:50 karolherbst: yep
22:50 fltrz: hm, then its really weird that card0 stays active
22:50 karolherbst: if you don't do any ioctls, the GPU will be powered off after some time of idling
22:51 karolherbst: does the GPU get powered off once chrome is closed?
22:51 fltrz: good question
22:53 fltrz: weird, yesterday cat /sys/bus/pci/devices/0000\:00\:02.0/power/autosuspend_delay_ms returned 5000, now it gives: cat: '/sys/bus/pci/devices/0000:00:02.0/power/autosuspend_delay_ms': Input/output error
22:53 fltrz: ../control is still auto
22:53 karolherbst: heh...
22:54 karolherbst: maybe use a different device?
22:54 karolherbst: 00:02.0 is probably the intel GPU
22:55 fltrz: power state D0
22:55 fltrz: oh
22:58 fltrz: so 01:00.0/control ../status and ../autosuspend_delay_ms return auto, active and 5000 just like yesterday
22:58 karolherbst: and nothing uses card0?
22:58 fltrz: nothing
22:58 karolherbst: or renderD128?
22:59 fltrz: when min is shut down nothing uses card0 nor renderD128
22:59 karolherbst: what about 00:01.0?
23:00 fltrz: 00:01.0 exists but don't know what that is?
23:01 karolherbst: at least that's what I assume the nvidia GPU is connected to
23:01 karolherbst: should be the PCIe bridge controller or root port
23:01 fltrz: theres more files in the power folder of that device
23:01 fltrz: ah
23:01 karolherbst: lspci -t should show it
23:01 karolherbst: though it's a bit hard to parse :D
23:01 karolherbst: on my system:
23:01 karolherbst: -[0000:00]-+-00.0
23:01 karolherbst:            +-01.0-[01]--+-00.0
23:01 karolherbst:            |            \-00.1
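An alternative to reading the `lspci -t` tree by eye: the entries in /sys/bus/pci/devices are symlinks into /sys/devices, and the resolved path nests each device inside the bridge it sits behind. The sketch below relies on that layout and on GNU `readlink -f`; the function name `pci_parent` and its optional second argument are made up for illustration.

```shell
# pci_parent DEV [ROOT]: print the name of the directory directly above DEV in
# the resolved sysfs path, i.e. the bridge or root port DEV hangs off.
# ROOT defaults to /sys/bus/pci/devices. Hypothetical helper.
pci_parent() {
    dev=$1
    root=${2:-/sys/bus/pci/devices}
    [ -e "$root/$dev" ] || return 1
    # Resolve the symlink, then take the name of the parent directory.
    real=$(readlink -f "$root/$dev")
    basename "$(dirname "$real")"
}
```

On a system laid out like the tree above, `pci_parent 0000:01:00.0` would be expected to print `0000:00:01.0`. For a device that sits directly on the root bus, the parent is the host bridge directory (a name like `pci0000:00`) rather than a PCI address.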
23:02 fltrz: did you want me to check a specific file of this PCIe bridge controller?
23:02 karolherbst: just the same ones
23:02 karolherbst: control should be auto
23:02 karolherbst: (I hope)
23:02 karolherbst: though I kind of expect the audio device to mess up
23:02 fltrz: control is actually 'on'; runtime_status = active
23:03 karolherbst: ahh...
23:03 karolherbst: should be auto
23:03 karolherbst: could just echo auto into it
23:03 fltrz: lets try :)
23:03 karolherbst: though your GPU _could_ be connected to a different port, but it doesn't hurt to set "control" to "auto" for all PCI devices
23:03 karolherbst: normally that is
23:04 fltrz: probably wrong command: sudo echo 'auto' > /sys/bus/pci/devices/0000\:00\:01.0/power/control
23:04 fltrz: bash: /sys/bus/pci/devices/0000:00:01.0/power/control: Permission denied
23:04 karolherbst: yeah, need a root shell
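The "Permission denied" comes from bash itself: with `sudo echo ... > file`, the redirection is opened by the calling, unprivileged shell before sudo ever runs. Besides a root shell, a common way around it is to let `tee` do the write. A small sketch, where the function name `set_control` and the `SUDO` variable are made up for illustration:

```shell
# set_control FILE VALUE: write VALUE into a sysfs-style control file.
# The write is done by tee, so prefixing tee with sudo (via the SUDO variable)
# gives the write itself root privileges.
# Hypothetical helper; SUDO is an assumption of this sketch, not a standard
# variable.
set_control() {
    file=$1
    value=$2
    # ${SUDO:-} is deliberately unquoted: when empty it expands to nothing.
    printf '%s\n' "$value" | ${SUDO:-} tee "$file" >/dev/null
}
```

For the case in the log this would be `SUDO=sudo set_control /sys/bus/pci/devices/0000:00:01.0/power/control auto`.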
23:05 fltrz: ok np
23:06 fltrz: its auto now
23:06 fltrz: so in 5 seconds the GPU might actually shut down?
23:07 karolherbst: hopefully
23:07 karolherbst: but others could prevent it
23:07 karolherbst: like 01:00.1
23:07 karolherbst: it's just sad that there are distributions out there, where the defaults are just wrong :(
23:08 fltrz: so anything on PCI bus might be keeping it awake?
23:08 karolherbst: nope
23:08 karolherbst: the audio device is just part of the GPU
23:08 fltrz: specifically the nvidia audio device
23:08 karolherbst: and you only get proper power savings if the root port can be disabled as well
23:08 fltrz: ok so I guess I should check what its control settings are?
23:08 karolherbst: it could be even a driver thing.. let me check
23:10 karolherbst: I think you want "/sys/module/snd_hda_intel/parameters/power_save" to be 1
23:10 fltrz: nvidia audio device snd_hda_intel ?? is auto & active
23:10 karolherbst: yeah
23:10 fltrz: that one actually is 1
23:11 karolherbst: CONFIG_SND_HDA_POWER_SAVE_DEFAULT=1 set on the kernel should also help
23:11 karolherbst: ahh...
23:11 karolherbst: annoying
23:11 karolherbst: so something keeps the gpu awake...
23:11 karolherbst: ohh... wait...
23:11 karolherbst: the audio device is active?
23:11 fltrz: who initiates the 5000 ms countdown?
23:11 karolherbst: huh...
23:11 karolherbst: the kernel
23:11 fltrz: yes active
23:12 karolherbst: anything coming up with "lsof /dev/snd/by-path/pci-0000\:01\:00.1"?
23:13 fltrz: weird lsof: status error on /dev/snd/by-path/pci-0000:01:00.1: No such file or directory
23:13 karolherbst: huh...
23:13 karolherbst: maybe the path is different for you
23:13 fltrz: but there is lsof /dev/snd/by-path/pci-0000\:00\:1b.0
23:14 karolherbst: that's your normal sound card I guess
23:14 fltrz: its pulseaudio
23:14 karolherbst: ohhhhhhhhh
23:14 karolherbst: I think I know what you hit...
23:14 fltrz: I dunno, let me check in lspci what 1b corresponds to
23:14 karolherbst: there is this weirdo bug
23:14 karolherbst: where if the audio driver fails to init on a device, it doesn't init runpm
23:14 karolherbst: mind pastebining your "dmesg"?
23:14 fltrz: 00:1b.0 Audio device: Intel Corporation 7 Series/C216 Chipset Family High Definition Audio Controller (rev 04)
23:15 fltrz: sure
23:15 fltrz: I keep forgetting the handy pastebin like service where I can use curl or wget or so to upload txt file
23:19 karolherbst: I usually just copy paste into gist
23:26 fltrz: I PMed it
23:26 fltrz: if u didn't get it let me know
23:26 karolherbst: heh...
23:26 karolherbst: "snd_hda_intel 0000:01:00.1: Unable to change power state from unknown to D0, device inaccessible"
23:27 karolherbst: "snd_hda_intel 0000:01:00.1: no codecs initialized" might also be the bug I was referring to...
23:27 karolherbst: normally you'd see a line like "snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops nv50_audio_component_bind_ops [nouveau])"
23:28 karolherbst: but yeah.. the audio device really doesn't have anything
23:28 fltrz: hmm, so I probably screwed that up too during arch install
23:28 karolherbst: nah
23:28 karolherbst: I think the audio driver is just broken
23:28 fltrz: I did use the audio device with voice chat, so I know it works, but it was very noisy
23:28 fltrz: on the microphone
23:28 karolherbst: might want to file a bug against linux and report your audio device on the nvidia GPU stays active
23:29 karolherbst: that's the other one
23:29 fltrz: I may have fooled around trying to get the mic working properly
23:29 karolherbst: 00:1b.0 is what your normal audio device is
23:29 karolherbst: 01:00.1 is the one on the GPU for HDMI and DP audio
23:29 fltrz: oh
23:29 karolherbst: there is no way you can mess this up anyway
23:29 karolherbst: there are some phantom audio "sub devices" on the GPU
23:30 karolherbst: RSpliet was also hitting that at some point
23:30 karolherbst: the gist is: the GPU reports having that audio sub device, but there actually isn't anything
23:30 karolherbst: fltrz: do you have any HDMI or DP outputs inside "/sys/class/drm/" on card0?
23:31 karolherbst: or is it all on card1?
23:31 fltrz: card0/ card1/ card1-DP-1/ card1-HDMI-A-1/ card1-LVDS-1/ card1-VGA-1/ renderD128/ renderD129/ version
23:31 karolherbst: yeah..
23:31 karolherbst: so your GPU doesn't even have any ports
23:31 karolherbst: well.. the nvidia one
23:31 fltrz: card0 is nvidia and card1 is intel hd 4000
23:31 karolherbst: yep
23:31 karolherbst: and LVDS-1 is your internal display
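The check karolherbst is doing here relies on the naming in /sys/class/drm: every connector shows up as an entry named `<card>-<connector>`, so a card with no such entries has no outputs wired to it. A minimal sketch of that check, with the function name `list_connectors` and the optional root argument made up for illustration:

```shell
# list_connectors CARD [ROOT]: print the connector names DRM exposes for CARD,
# e.g. "HDMI-A-1". ROOT defaults to /sys/class/drm. Hypothetical helper.
list_connectors() {
    card=$1
    root=${2:-/sys/class/drm}
    for entry in "$root/$card"-*; do
        [ -e "$entry" ] || continue   # glob did not match: no connectors
        name=$(basename "$entry")
        # Strip the leading "<card>-" so only the connector name remains.
        printf '%s\n' "${name#"$card"-}"
    done
}
```

With the directory listing fltrz pasted, `list_connectors card0` would print nothing, while `list_connectors card1` would list DP-1, HDMI-A-1, LVDS-1 and VGA-1.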
23:32 karolherbst: yeah... so you run into this weirdo bug :/
23:33 karolherbst: echo "0000:01:00.1" > "/sys/bus/pci/devices/0000:01:00.1/driver/unbind"
23:33 karolherbst: that should do the trick I think
23:33 fltrz: so the card doesnt have ports, that was by hardware design? anything it renders etc goes back to intel HD 4000?
23:33 karolherbst: might have to toggle the control file to on and then auto again on 01:00.1 after doing that
23:33 karolherbst: yeah
23:33 karolherbst: some laptops are like this
23:33 karolherbst: others have ports wired to the nvidia GPU
23:33 karolherbst: others have a mix
23:34 karolherbst: every laptop is different :)
23:34 fltrz: ok so I have to write it down for the root shell
23:34 karolherbst: mine has its port on the nvidia GPU, unless I use DP-MST then it goes through intel
23:34 karolherbst: DP-MST meaning USB-C or Thunderbolt
23:36 karolherbst:takes a note: "it's always the audio device"
23:37 fltrz: echo-ing on took a few seconds
23:37 karolherbst: mhhh
23:37 karolherbst: might have errors in dmesg now
23:37 karolherbst: ohh "on"
23:37 karolherbst: then it was turned off
23:38 karolherbst: lspci should also take a noticeable time now once the GPU is turned off
23:38 karolherbst: lspci is one of the few things which still forces the GPU to be powered on for silly reasons :)
23:39 karolherbst: uhhhh
23:39 fltrz: how do I check if it is actually turned off?
23:40 karolherbst: uhm... could do some ACPI calls :D
23:40 karolherbst: though you'll notice
23:40 karolherbst: "sensors" is one way
23:40 karolherbst: but that's just the driver part
23:40 karolherbst: _but_
23:40 karolherbst: you should see the CPU temperature dropping
23:40 karolherbst: over time.. slowly
23:41 fltrz: I mean cat /sys/bus/pci/devices/0000\:01\:00.0/power/runtime_status still gives active, but runtime_suspended_time is nonzero for the first time to my recollection
23:41 fltrz: but its not increasing
23:41 karolherbst: heh
23:41 karolherbst: maybe it bugged out now :/
23:41 karolherbst: did you echo auto into control again for the audio device?
23:42 karolherbst: but that "nouveau 0000:01:00.0: DRM: failed to idle channel 1 [DRM]" one is bad
23:42 karolherbst: huh...
23:43 karolherbst: I think somebody needs to take a good look at this issue and figure something out, but I don't have such a system and can't do any testing or something.
23:43 fltrz: yes I echo'd auto back into 01:00.1's power/control file
23:44 karolherbst: yeah... I think nouveau bugged out now for whatever reason :/
23:44 karolherbst: at least that's one mystery solved
23:44 fltrz: :) can I verify its actually bugged out? if so its good news min is still running
23:44 karolherbst: yeah
23:45 karolherbst: I suspect it's just the audio driver being a little buggy
23:45 karolherbst: you might want to file a bug on the kernel's bugzilla and see if anybody reacts to it
23:45 fltrz: I don't think I filed a bug with the kernel before, and wouldn't know how to phrase it
23:45 karolherbst: I think there was a mailing list thread at some point, but RSpliet should know
23:46 fltrz: Im still convinced I must have messed up the snd_hda_intel or whatever
23:46 fltrz: this can't not be my own fault
23:46 karolherbst: nah, it's the driver being broken
23:48 fltrz: the GPU audio device: cat: '/sys/bus/pci/devices/0000:01:00.1/power/autosuspend_delay_ms': Input/output error
23:48 fltrz: auto
23:48 fltrz: 414906664
23:48 fltrz: suspended
23:48 fltrz: 701083
23:48 karolherbst: yeah, that looks fine now
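The files fltrz keeps reading by hand all live in the `power/` directory of the PCI device in sysfs. A minimal sketch that dumps them in one go follows; the function name `show_runtime_pm` is made up for illustration, and which files are present or readable depends on the kernel and device.

```shell
# show_runtime_pm DEVDIR: print the runtime power-management files of one
# PCI device directory, e.g. /sys/bus/pci/devices/0000:01:00.1.
# Hypothetical helper.
show_runtime_pm() {
    devdir=$1
    for f in control runtime_status runtime_active_time \
             runtime_suspended_time autosuspend_delay_ms; do
        # Some files can fail to read (the log shows an Input/output error
        # for autosuspend_delay_ms), so report that instead of aborting.
        if value=$(cat "$devdir/power/$f" 2>/dev/null); then
            printf '%s: %s\n' "$f" "$value"
        else
            printf '%s: <unreadable>\n' "$f"
        fi
    done
}
```

Running it twice a few seconds apart on `/sys/bus/pci/devices/0000:01:00.0` would show whether `runtime_suspended_time` is still increasing.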
23:49 fltrz: is there a way to talk to the nouveau driver? like what's up?
23:49 karolherbst: sadly not really
23:49 fltrz: or this file interface is that
23:50 karolherbst: if something is wrong you'd usually see this coming up in dmesg anyway
23:51 fltrz: perhaps I need proper BIOS/UEFI settings?
23:51 karolherbst: doubtful
23:52 fltrz: so assume I restart the laptop, and compile and install nouveau with OpenCL, it should work, just not be able to autosuspend?
23:53 fltrz: or this audio driver thing means nouveau no go?
23:53 karolherbst: correct
23:53 karolherbst: no, it shouldn't impact anything
23:53 fltrz: ok
23:53 fltrz: so thats good news, together with min/electron not being taken down
23:55 fltrz: could the remaining min renderD128 be complicit in the failure? if I lose min, and the renderD128 has then always disappeared from lsof output, and then retry unbinding and restarting the audio node; the result should be the same or we might progress?
23:55 fltrz: *close* min
23:56 fltrz: i.e. min was running when I tried
23:56 karolherbst: uhm.. no idea
23:56 fltrz: then ill try