10:58 kwizart: Hello, I have the following error on jetson-tk1 with fedora 5.9.0-rc5 kernel (error exists since few kernel already):
10:58 kwizart: [ 238.003771] nouveau 57000000.gpu: gr: DATA_ERROR 0000009c [] ch 4 [04001eb000 xfwm4[1125]] subc 0 class a297 mthd 0d78 data 00000014
11:00 kwizart: is there any information to understand the error so I can look at ?
11:17 karolherbst: kwizart: yeah.. you need to use kms modifiers
11:17 karolherbst: newer Xorg should help or wayland compositors which use kms modifiers
11:18 karolherbst: the issue is already understood, fixing it for non modifier aware userspace is just annoying as X is just fundamentally broken and for wayland it will kill performance
11:18 kwizart: karolherbst, does xfwm4 need to be modified ?
11:18 karolherbst: that's X, right?
11:18 kwizart: or to rely on a newer mesa ?
11:18 kwizart: yes
11:18 karolherbst: then you need X from master
11:19 karolherbst: but I also got reports where this wasn't enough... but I couldn't reproduce that
11:19 kwizart: ouch..., okay, I expect there is mesa counterpart
11:19 karolherbst: there isn't really much mesa can do except to make it work by killing performance
11:21 kwizart: okay, so KMS is only libdrm/Xorg level ? no application modification needed ?
11:21 karolherbst: KMS is a kernel thing
11:21 karolherbst: and no.. applications should be fine
11:22 karolherbst: just wayland compositors and X need to deal with it
11:23 kwizart: okay, so let's move to GNOME/wayland in f33 for this... looks more reliable
11:25 karolherbst: well.. with mutter you need to enable an experimental feature for it to work :/
11:25 karolherbst: but I worked on a patch to enable it by default, just need to get it merged
11:26 kwizart: can you point it to me ? kernel patch or mutter one ?
11:27 karolherbst: https://gitlab.gnome.org/karolherbst/mutter/-/commits/tegra-3-36/
11:40 kwizart: thankx - /me reads https://01.org/linuxgraphics/Linux-Window-Systems-with-DRM
13:18 PaulePanter: With Linux 5.4.57, running PTS benchmark pts/desktop-graphics, when running ParaView, the X session (Xfce) freezes: https://paste.debian.net/1163898/
13:18 PaulePanter: [10673.667144] nouveau 0000:01:00.0: fifo: read fault at 0001fff000 engine 00 [PGRAPH] client 02 [GPC0/] reason 02 [PAGE_NOT_PRESENT] on channel 6 [003faa8000 pvpython[20713]]
13:18 PaulePanter: Is somebody interested in this, or is it expected?
13:22 PaulePanter: Killing ParaView over SSH makes the X session work again.
15:19 RSpliet: PaulePanter: the nouveau devs tend to be not too interested in bugs in older kernels (even LTS). The risk of chasing bugs that have already been fixed is too high, and the team is too small to have the luxury of spending that time
15:56 dcomp: Not sure what else I can do for my card. So I've uploaded the vbios to https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/265. And I'll be available for any tests.
16:08 kwizart: I confirm https://gitlab.gnome.org/karolherbst/mutter/-/commit/4fb9103b48765017fcb9e8cdda25ed4abbccc64f works on jetson-tk1, but I wonder about older tegra (I might give a try in the next days)
16:08 kwizart: /me is out for today
17:53 kherbst: kwizart: did you notice any kind of flickering? I had some issues with the jetson nano, but that's also probably because of the memory being super slow..
17:53 kherbst: but I think there are some syncing issues we still need to figure out
17:53 kherbst: just wondering if you hit anything
19:37 austriancoder: Is there any kind of a per engine utilization for Nvidia GPUs?
19:38 kherbst: austriancoder: yes
19:38 kherbst: multiple ones even
19:38 kherbst: why?
19:39 kherbst: I even have patches to wire them up, just never got to finish them as they are not that useful with unreliable reclocking anyway
19:39 kherbst: but if somebody would plan to add some interfaces for userspace to fetch those values, I'd be happy to revive them
19:39 austriancoder: kherbst: nice.. I am looking at different kernel/user space implementations to get some inspiration on how to do it for etnaviv
19:40 kherbst: ahh..
19:40 kherbst: for us it's just some counters you need to reset regularly
19:40 kherbst: and you can configure masks
19:40 kherbst: let's see..
19:40 austriancoder: okay
19:40 kherbst: https://github.com/karolherbst/nouveau/commits/pmu_counters_v4
19:40 kherbst: I implemented that in firmware though
19:41 kherbst: but maybe that gives you some ideas already
19:41 austriancoder: I am not yet sure if the perf_event route is a good one to go to get these values into userspace
19:42 kherbst: well
19:42 kherbst: the issue is that those are global
19:42 kherbst: and you can't restrict them to a process or whatever
19:42 kherbst: but yeah.. for knowing what's up on the system in general it might be useful
19:43 kherbst: austriancoder: what I think makes more sense to have some drm interfaces actually..
19:43 kherbst: or something else to tweak system performance
19:43 kherbst: usually those counters are used for DVFS
19:43 austriancoder: I have a global idle cycle counter and some idle bits like described in https://envytools.readthedocs.io/en/latest/hw/pm/pdaemon/counter.html
19:43 kherbst: and we could add an interface to read those out _and_ to adjust thresholds or something
19:43 kherbst: like, clock up when you reach 50% or so
19:44 kherbst: and define domains: VRAM, cores, video, etc...
19:44 kherbst: austriancoder: yeah, that's the stuff I implemented
19:45 kherbst: of course you can implement it inside the driver, but that's quite a waste of power
19:45 austriancoder: kherbst: there is already a perfcounter infrastructure in etnaviv
19:45 kherbst: for shaders or for engines?
19:46 kherbst: keep in mind, that for nvidia those are completely different
19:46 kherbst: perf counters are per context/shader invocation
19:46 kherbst: engine counters are global
19:47 kherbst: not sure if we actually have some SM counters which are not per context.. mhh
19:47 austriancoder: there are no hw contexts on etnaviv
19:47 austriancoder: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/etnaviv/etnaviv_perfmon.c?h=v5.9-rc5#n101
19:47 kherbst: right, but I assume you still have per invocation counters?
19:48 kherbst: or rather per executed shader or something
19:48 kherbst: mhhh
19:49 kherbst: austriancoder: those we have all implemented in userspace
19:49 kherbst: we even need compute shaders for some of them to read them out
19:50 austriancoder: uff
19:50 kherbst: anyway, for those APIs exist in OpenGL
19:51 kherbst: and I think it's fine to rely on those
19:51 kherbst: what I think makes more sense is to actually focus on engine counters and have new interfaces for those
19:51 kherbst: as this really makes sense
19:51 kherbst: especially if you think about performance tweaking
19:52 kherbst: so, some users might want the clock to go up at 90% load, others at 50%
19:52 kherbst: and some want the clock to not lower for minutes, others only for seconds etc...
19:52 kherbst: we could be more aggressive with power savings if userspace could adjust that
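The policy described above (a user-tunable load threshold for clocking up, plus a user-tunable delay before clocking down) can be sketched roughly as follows. This is only an illustration and not nouveau code: the `Governor` class, its field names, and the three abstract clock levels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Governor:
    """Hypothetical userspace-tunable reclocking policy (not nouveau code)."""
    up_threshold: float        # load fraction that triggers clocking up, e.g. 0.9 or 0.5
    hold_s: float              # how long load must stay low before clocking down
    levels: int = 3            # number of abstract clock levels (assumption)
    level: int = 0             # current clock level, 0 is the lowest
    _last_busy_s: float = 0.0  # time of the last high-load sample or last step down

    def update(self, now_s: float, load: float) -> int:
        """Feed one engine-load sample and return the resulting clock level."""
        if load >= self.up_threshold:
            # Load is high: remember when, and step up one level if possible.
            self._last_busy_s = now_s
            self.level = min(self.level + 1, self.levels - 1)
        elif now_s - self._last_busy_s >= self.hold_s:
            # Load has been low for the whole hold period: step down one level
            # and restart the hold window so we drop one level per period.
            self.level = max(self.level - 1, 0)
            self._last_busy_s = now_s
        return self.level
```

With `up_threshold=0.5` the same load samples raise the clocks sooner than with `up_threshold=0.9`, and a larger `hold_s` keeps them high for longer, which is the trade-off between responsiveness and power savings mentioned above.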
19:53 austriancoder: That was my last try https://patchwork.kernel.org/project/dri-devel/list/?series=316185 .. switched to hrtimers but still not sure if I want a global per gpu average utilization over 1s in the kernel or if it is better to do it in the user space
19:54 kherbst: austriancoder: 1s is too long
19:54 kherbst: e.g. when an application requests a new context you want it to be 0.1s or lower
19:54 kherbst: and you could wait for a minute to see how the load develops
19:54 kherbst: and go from there
19:54 kherbst: and adjust clocks accordingly
19:54 kherbst: but for normal polling? yes 1s is probably enough
19:55 kherbst: and I think just knowing _what_ the GPU spends time on (VRAM, shaders, video accel, etc..) is more valuable than the values itself
19:55 austriancoder: we only clock down the gpu if thermals are bad..
19:55 austriancoder: Okay
19:55 kherbst: I did some experiments with that to implement dynamic reclocking in nouveau
19:56 kherbst: and 0.1s was quite nice
19:56 kherbst: but I also tried 0.01s eg..
19:56 kherbst: the goal is just to keep the clocks high for a long enough time
19:56 austriancoder: yeah
19:57 kherbst: anyway, if we get some kernel interfaces for that, I'd be happy to add support for that inside nouveau
19:57 austriancoder: perf would do it
19:58 kherbst: the issue is just, we don't have that many "slots" :/
19:58 kherbst: so we can just monitor 7/3 engines depending on the gen
19:58 kherbst: and we have like... 20?
19:59 austriancoder: If i implement it via perf/pmu then userspace does two reads, with any time between them, where each read returns a timestamp and number of nanoseconds the engine was busy between those two timestamps
20:00 austriancoder: all I have to do from my driver side is to keep a sum of all active 10ms periods while the pmu counter is open by userspace
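The two-read scheme described above can be sketched like this: the driver side accumulates busy time in 10ms periods, and userspace divides the busy-time delta by the timestamp delta. This is a minimal model of the idea only, not the etnaviv or perf API; `Sample`, `BusyAccumulator`, and `utilization` are hypothetical names.

```python
from dataclasses import dataclass

PERIOD_NS = 10_000_000  # one 10ms sampling period, in nanoseconds


@dataclass(frozen=True)
class Sample:
    """What one read returns: a timestamp and the accumulated busy time."""
    timestamp_ns: int
    busy_ns: int


class BusyAccumulator:
    """Hypothetical driver-side state: the sum of all active 10ms periods."""

    def __init__(self) -> None:
        self.busy_ns = 0

    def tick(self, engine_active: bool) -> None:
        # Called once per sampling period while the counter is open.
        if engine_active:
            self.busy_ns += PERIOD_NS

    def read(self, now_ns: int) -> Sample:
        return Sample(timestamp_ns=now_ns, busy_ns=self.busy_ns)


def utilization(first: Sample, second: Sample) -> float:
    """Fraction of time the engine was busy between two reads."""
    elapsed_ns = second.timestamp_ns - first.timestamp_ns
    if elapsed_ns <= 0:
        raise ValueError("second sample must be taken after the first")
    return (second.busy_ns - first.busy_ns) / elapsed_ns
```

Because the kernel only keeps a running sum, userspace picks the averaging window simply by choosing how long to wait between the two reads.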
20:02 kherbst: yeah..
20:02 kherbst: maybe I'll look into it as well. Would be interesting
20:02 austriancoder: but yeah.. the end goal would be something like this: https://lists.freedesktop.org/archives/intel-gfx/2020-September/248062.html
20:02 kherbst: yeah
20:04 austriancoder: need to study those patches too
20:04 austriancoder: kherbst: thanks a lot for your input!
20:04 kherbst: np