20:14 dcomp: I've found something which stops fifo: SCHED_ERROR 20 on my GM108 840M
20:18 dcomp: When I boot with acpi_osi="Windows 2013", I still get faults but not SCHED_ERROR 20
20:40 karolherbst: dcomp: would be strange. I guess you just need to wait a little longer
20:41 dcomp: Does nouveau call into ACPI to boot up the card? There seems to be a PowerResource NVP3 which has _ON and _OFF gated by the _OSI call
20:48 dcomp: https://paste.centos.org/view/ea2f4752
20:55 imirkin: dcomp: yes, acpi is heavily involved on semi-modern laptops
21:02 dcomp: Although a reboot after using acpi_osi did result in a non-functional laptop which finally decided to do a BIOS Recovery.
21:43 karolherbst: dcomp: sure, but that SCHED_ERROR is unrelated to that
21:43 karolherbst: except, the GPU got suspended and the driver is doing stupid things
21:45 karolherbst: it could also be that going into d3cold puts the GPU into a weird state that we don't bootstrap it correctly from
21:45 karolherbst: dcomp: so a nouveau.runpm=0 should also get rid of the error
21:45 karolherbst: but we would need to know more about when this error happens and what happened before
21:46 dcomp: This is from boot. The first thing that has been done is modprobe nouveau
21:48 karolherbst: a lot of things are happening on boot with the GPU
21:48 karolherbst: mind sharing dmesg from one showing that error?
21:52 dcomp:uploaded an image: sticker740727000960132526.png (33KiB) < https://matrix.org/_matrix/media/r0/download/jupiterbroadcasting.com/UEmRRQUpICrfMPdOOIzyHpSH/sticker740727000960132526.png >
21:53 dcomp: All runpm=0 has done is stop it from recovering
21:53 dcomp: Sorry butterfingers
21:58 karolherbst: dcomp: what do you mean by recovering?
21:59 karolherbst: anyway, without dmesg we won't know what's up
22:01 dcomp: Sorry, the sched_error spam causes the system to lock up. Usually after 90ish seconds it recovers
22:02 dcomp: https://paste.centos.org/view/6f2bf3ea
22:02 dcomp: https://paste.centos.org/view/fca59a7f boot with acpi_osi
22:03 dcomp: the first one is modprobe nouveau runpm=0
22:04 karolherbst: dcomp: do you know if that's a regression?
22:04 karolherbst: "nouveau 0000:07:00.0: fifo: fault 01 [WRITE] at 0000000000150000 engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel -1 [007fd38000 unknown]" looks like there is some bigger issue
22:05 dcomp: i remember your maxwell reclocking branch helping it boot.
22:05 karolherbst: but not with this error
22:05 karolherbst: but if it used to work, then that means we have a kernel regression
22:05 karolherbst: if it was always broken, then not
22:05 karolherbst: but I assume it wasn't always broken
22:06 karolherbst: dcomp: mind trying older kernel major releases to see if 5.7 or 5.6 work?
22:07 karolherbst: just annoying to guess.. but I'd assume that you might be able to figure out with what kernel it used to work
22:10 karolherbst: but anyway, looks like something with the device initialization is broken
22:14 dcomp: is it not the same error 4 years ago: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/175
22:14 dcomp: last comment
22:15 karolherbst: it is exactly the same error
22:15 karolherbst: okay, so it was always broken
22:17 karolherbst: something is just busted on that GPU
22:17 dcomp: I remember you had a maxwell reclocking branch that would fix it if I forced a higher perf mode on boot
22:17 karolherbst: mhhhh
22:17 karolherbst: I think the clocks aren't the problem here
22:17 karolherbst: setting the same clocks would probably just work as well
22:18 karolherbst: dcomp: ohh, but it is a 1st gen maxwell...
22:18 karolherbst: we do support reclocking on those out of the box
22:18 karolherbst: dcomp: so mind booting with NvClkMode=0x7 or something?
22:18 karolherbst: or maybe try 0xf
22:18 karolherbst: but I suspect that any will do
22:19 karolherbst: uhn.. nouveau.NvClkMode=0x7 when you boot, NvClkMode with modprobe nouveau
22:19 karolherbst: ehh wait
22:19 karolherbst: no.. should be fine?
22:19 karolherbst: I never used that so I don't know if hex strings can be used
22:20 karolherbst: ahh.. 0x should work
22:23 dcomp: https://paste.centos.org/view/0a61aa89
22:23 dcomp: I was sure that worked last time. I remember running glxgears
22:23 karolherbst: well.. "nouveau: unknown parameter 'NvClkMode' ignored"
22:23 karolherbst: because I messed up
22:24 karolherbst: needs to be config=NvClkMode=0x7
22:24 karolherbst: sorry for that
22:24 karolherbst: no idea why we do this config thing though ...
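For reference, the corrected spelling from the exchange above can be written out as a small sketch. The helper name `nouveau_config_args` is hypothetical and only prints strings; it does not load or configure anything. The `nouveau.` prefix for the kernel command line is assumed to follow the same pattern as the `nouveau.runpm=0` form mentioned earlier in the log, and whether `NvClkMode=0x7` actually helps on this GPU is not confirmed here.

```shell
# Hypothetical helper: print both spellings of a nouveau "config" option.
# $1 is the option string, e.g. NvClkMode=0x7 (value taken from the log).
nouveau_config_args() {
    opt="$1"
    # Form for the kernel command line (assumed: module name as prefix).
    printf 'nouveau.config=%s\n' "$opt"
    # Form when loading the module by hand.
    printf 'modprobe nouveau config=%s\n' "$opt"
}

nouveau_config_args "NvClkMode=0x7"
```

The point of the sketch is only that the option goes inside `config=`, not directly on the module, which is what produced the "unknown parameter 'NvClkMode' ignored" message.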
22:29 karolherbst: skeggsb: I think I remember an issue like that and I concluded that doing memory reclocking once fixes stuff... but that would also mean that the vbios either does not contain something we need or does contain it, but it's broken...
22:29 karolherbst: any idea?
22:30 karolherbst: dcomp: in any case.. mind creating a new bug report and attaching the vbios? You can find the vbios under /sys/kernel/debug/dri/1/vbios.rom
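A minimal sketch of saving the vbios for the bug report, assuming debugfs is mounted and the command is run as root. The function name `save_vbios` and the output filename are hypothetical; the default source path is the one given in the log, and the `dri/1` index may differ on other machines.

```shell
# Hypothetical helper: copy the vbios image to a regular file.
# $1: source path (default is the path mentioned in the log)
# $2: destination file (default name is an arbitrary choice)
save_vbios() {
    src="${1:-/sys/kernel/debug/dri/1/vbios.rom}"
    dst="${2:-vbios.rom}"
    if [ ! -r "$src" ]; then
        # Usually means: not root, debugfs not mounted, or a different dri index.
        echo "cannot read $src" >&2
        return 1
    fi
    cat "$src" > "$dst"
}
```

Calling `save_vbios` with no arguments would write `vbios.rom` into the current directory, ready to attach to the report.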