06:40PaulePanter: RSpliet: I figured, and tried with Linux 5.9-rc4, and report it to the bug tracker.
06:44PaulePanter: What is the preferred bug tracker?
06:44PaulePanter: does not seem too active.
06:45PaulePanter: ParaView from the PTS hangs, and I believe the corresponding Linux 5.9-rc4 messages are:
06:46PaulePanter: BUG: kernel NULL pointer dereference, address: 0000000000000000
08:22RSpliet: PaulePanter: yeah... hmm, well, that bugtracker thing is a problem in itself.
08:23RSpliet: kherbst: has that been resolved yet?
08:25RSpliet: PaulePanter: thanks for running it through a new kernel though, that's very useful!
08:27RSpliet: And you're lucky, nullptr dereferences tend to be relatively easy to debug in the grand scheme of things. I hope the active devs can identify the problem quickly
08:27RSpliet: skeggsb: ^
08:29skeggsb: RSpliet: https://gitlab.freedesktop.org/drm/nouveau
08:30skeggsb: i got airlied to create it a while back in response to the fd.o one being unavailable, but then kinda forgot about it
08:31RSpliet: skeggsb: I eavesdropped on the conversations here in the past few months, and picked up some contradicting messages. Wasn't sure whether it was decided that gitlab would be the "official" bugtracker... but sounds like it is. Thanks!
08:32skeggsb: well, i don't much care where, so if people want to fight about it, then they can :P
08:32skeggsb: i'll use whatever, because, really, it doesn't matter that much
08:33RSpliet: skeggsb: nah I'd be the same, but it's good to know where to point users in a way that you and all the others actually pick up on the messages. Or having it scattered over 7 platforms
08:34RSpliet: *not scattered
08:34RSpliet: PaulePanter: you heard it! If you could plop all your details on that gitlab bugtracker you found please :-)
08:46kherbst: skeggsb: saw my comments about suspend/resume paths breaking runpm?
08:47skeggsb: kherbst: i've been on PTO a couple days, i'll catch up in the morning :P
08:47kherbst: ahh, okay
08:47rowbee: how can i check whether my gpu clock speeds are lower than they should be? i don't know whether this article is up-to-date https://www.phoronix.com/scan.php?page=article&item=nvidia-nouveau-2019&num=1 but it mentions that the GPU was in the "much lower boot clock speeds" and that they had to manually increase the clock speed.
08:48kherbst: rowbee: depending on the GPU there isn't anything you can do about it anyway
08:48rowbee: it's a GT540m
08:48kherbst: that's fermi, so there is nothing you can do about it
08:48rowbee: dang really
08:48kherbst: sadly yes
08:48kherbst: we never got to get reclocking working on fermi
08:48RSpliet: skeggsb: in that case my apologies for bothering you :-D
08:48kherbst: a bunch of people have a bunch of code though
08:48rowbee: well that certainly makes me sad
08:49rowbee: not even manually?
08:49kherbst: yeah.. sadly, there are usually more important things to work on and we don't have enough people to work on those extra features :/
08:49RSpliet: rowbee: soz, I worked on it in the past but got sidetracked by a PhD. I now work for a competitor of NVIDIA, so very hesitant to dive into it now.
08:49kherbst: rowbee: well.. there is _some_ code but the biggest issue is memory
08:49kherbst: you can probably reclock the core clocks and it might work on some perf levels
08:49rowbee: RSpliet: no worries, would be nice to have but it can't be helped.
08:49kherbst: but you can't change memory
08:49RSpliet: Actually, I think the core clock changing code is the biggest problem right now
08:50rowbee: i guess i'll dual boot windows for my high performance gpu needs
08:50RSpliet: The actual configuration of the PLLs afaik
08:50kherbst: RSpliet: on some GPUs that actually works
08:50kherbst: but yeah
08:50kherbst: on some GPUs there were also bugs
08:50RSpliet: kherbst: higher clocks were an issue iirc
08:50rowbee: currently trying to free up some hard disk space on this laptop so i can resize partitions for a windows
08:50kherbst: but mid range ones were okay
08:50kherbst: and on fermi that still was a huge win given that some GPUs just booted at like 50 mhz
08:51RSpliet: while Ben had done some pretty solid work on the memory parameter side of things
08:51rowbee: is there a way i can check lol
08:51RSpliet: I had it done for one GPU only, he had a much more generic solution
08:51kherbst: rowbee: we do report the current clocks with the pstate file though
08:51rowbee: how can i access it?
08:51RSpliet: /sys/kernel/debug/dri/<card no>/pstate
08:52RSpliet: if I recall correctly
08:52rowbee: low or normal?
08:52rowbee: oh dear
08:53RSpliet: "AC" is your current clock
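A minimal sketch of how the check described above could be scripted, assuming the pstate file lists one performance level per line as "<id>: <domain> <n> MHz ..." and that the "AC" line carries the current clocks. The sample text is made up for illustration (only the 672 MHz core figure comes from this conversation), and the helper names and the card number 0 are assumptions, not part of nouveau.

```python
import re
from pathlib import Path

# Matches one "<domain> <value> MHz" pair, e.g. "core 202 MHz".
CLOCK_RE = re.compile(r"(\w+)\s+(\d+)\s+MHz")

# Illustrative sample only; real output differs per card.
SAMPLE = """\
03: core 50 MHz shader 101 MHz memory 135 MHz
07: core 202 MHz shader 405 MHz memory 324 MHz
0f: core 672 MHz shader 1344 MHz memory 900 MHz
AC: core 202 MHz shader 405 MHz memory 324 MHz
"""


def parse_pstate(text):
    """Return {level_id: {domain: MHz}} for every line that lists clocks."""
    levels = {}
    for line in text.splitlines():
        ident, sep, rest = line.partition(":")
        if not sep:
            continue  # no "<id>:" prefix on this line
        clocks = {name: int(mhz) for name, mhz in CLOCK_RE.findall(rest)}
        if clocks:
            levels[ident.strip()] = clocks
    return levels


def core_clock_report(levels):
    """Return (current core MHz, highest core MHz among the listed levels)."""
    current = levels["AC"]["core"]
    highest = max(
        clocks["core"]
        for ident, clocks in levels.items()
        if ident != "AC" and "core" in clocks
    )
    return current, highest


def read_pstate(card=0):
    """Read the debugfs file; usually needs root and a mounted debugfs."""
    return Path(f"/sys/kernel/debug/dri/{card}/pstate").read_text()
```

On a real system, core_clock_report(parse_pstate(read_pstate())) would give the pair to compare; a current value well below the highest listed level suggests the card is still at its boot clocks.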
08:53kherbst: not tooo bad then
08:53rowbee: ah i see
08:53kherbst: could have been worse
08:54rowbee: GeForce RTX 2080 Ti is about 3181% faster than this gpu
08:54rowbee: time for upgrade sometime maybe ^^;;
08:54rowbee: gpu clock supposed to be at 672mhz apparently... so that's why i'm getting like 30fps on low settings on a video game from 2007
08:55kherbst: rowbee: well, the GPU isn't fast to begin with
08:55RSpliet: It's honestly by far the biggest performance problem with nouveau, and it's across the board
08:55rowbee: anyway, i'll just dualboot for whatever i need
08:55rowbee: i guess i wouldn't understand it if you explained the technical problems with reclocking on nvidia gpus
08:56rowbee: so i'll assume it's nvidia being a dick
08:56kherbst: I guess it's just a lot of work
08:59RSpliet: Well, it's really not as much work as it seems. The code is finicky, not complex.
09:00RSpliet: To get from where nouveau (and the different branches) stands to having clock changing support for Fermi is, I guess, 1-2 months full-time.
09:01rowbee: holy crap
09:03RSpliet: But it's a small team, 3-4 active contributors. And there's other fires to put out - like support for Ampere, SPIR-V, Vulkan, OpenCL, constant bug-fixes...
09:03rowbee: i guess that's if you're an existing nouveau dev. you have to learn all the stuff if not...
09:03kherbst: fixing multithreading *cough*
09:03RSpliet: kherbst: indeed :-D
09:03kherbst: although this time I actually have a branch fixing it :D
09:03kherbst: I just don't like the code as is
09:04RSpliet: kherbst: does it compile? If so: ship it! :-P
09:04kherbst: it passes all the multithreading tests in deqp at least :D
09:04RSpliet: rowbee: yes, for novice devs the problem can be pretty overwhelming.
09:04kherbst: rowbee: well.. when I started I jumped right into those annoying issues, it just takes time
09:05kherbst: and you spend a lot of time doing the wrong thing
09:05rowbee: what *is* the problem, anyway? can you not just bitbang a value in memory to reclock? how much work is it to change the clock in the gpu?
09:05RSpliet: rowbee: it's hundreds of values, derived from values in the video bios, that need to be set perfectly
09:05RSpliet: Screw up one bit, and there's a 70% chance your card hangs
09:05rowbee: oh yeah, that's annoying
09:06rowbee: and i guess the values that need to be derived are different for each generation
09:06rowbee: why couldn't the gpu just derive the values itself?
09:06RSpliet: Different for every card
09:06rowbee: that's annoying. and you need the hardware to test it as well.
09:06rowbee: plus it has to not be your daily driver ._.
09:07RSpliet: Yep. OEMs choose these values (as they choose the memory chips their card ships with)
09:07rowbee: i'll be without a sane desktop for the next 3 months or so, but maybe after christmas i can work on getting reclocking working on gt540m
09:07RSpliet: \o/ That's the spirit
09:08rowbee: i mean i did get backlight working
09:08rowbee: might as well
09:08RSpliet: Hehe, that was probably an easier issue (not easy, just easier :-P)
09:08rowbee: oh yeah definitely
09:08rowbee: just force a GPIO reset due to braindead OEM VBIOS
09:09RSpliet: Finding that issue is the real PITA though
09:10rowbee: i don't remember the names but i remember messing with nvapeek/poke for a couple of hours before the culprit was found
09:10RSpliet: Anyway, if you're back around Christmas I'll probs still be here and can maybe answer some questions. I wrote the code for the second gen Tesla reclocking (NVA3/5/8) and some MCPs (NVAA/NVAC)
09:10rowbee: i guess my questions would be to get the general gist of what i need to do
09:10rowbee: besides reading previous reclocking code
09:10rowbee: i'm a total GPU noob so
09:11RSpliet: It's pretty straightforward linear code
09:11rowbee: ah that's nice at least
09:11rowbee: i would try hacking on it right now but as i said this laptop is currently my breadwinner and i don't want to risk a hanging debian ^^;;
09:12rowbee: esp. when i don't have windows to fallback to
09:12RSpliet: TL;DR: When a user requests a clock change, there's a big old function preparing all the values to set (DRAM timings mainly, but also other related parameters). It just compiles down to a sequence of register writes and pauses.
09:12rowbee: but wrong values and you bomb
09:12RSpliet: These writes and pauses are not performed by the driver itself, but rather composed into a list that is sent off to the "PMU" core, a processor on the GPU that will run the script for you
09:13rowbee: i see, so you just prepare a list and then shoot it off via DMA or PCI or whatever
09:13RSpliet: The PMU firmware bits are there already, you don't have to worry much about that either.
09:13RSpliet: That's it
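The flow RSpliet outlines (prepare the values, compile them into an ordered list of register writes and pauses, hand the list to the PMU to execute) can be sketched roughly as below. This is a toy model under stated assumptions, not nouveau's actual code: the register addresses, delays, and function names are all hypothetical, and run_script merely stands in for the PMU.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Write:
    reg: int    # register address (hypothetical)
    value: int  # value to store there


@dataclass(frozen=True)
class Wait:
    nsec: int   # pause before the next operation


Op = Union[Write, Wait]

# Made-up addresses, for illustration only.
HYPOTHETICAL_PLL_REG = 0x1000
HYPOTHETICAL_TIMING_REG = 0x2000


def build_script(core_mhz: int, timing_word: int) -> List[Op]:
    """Compose the whole sequence up front; nothing touches hardware here."""
    return [
        Write(HYPOTHETICAL_PLL_REG, core_mhz),
        Wait(20_000),  # made-up settle time after the clock write
        Write(HYPOTHETICAL_TIMING_REG, timing_word),
        Wait(1_000),
    ]


def run_script(script: List[Op], regs: Dict[int, int]) -> int:
    """Stand-in for the PMU: apply ops in order, return total wait in ns."""
    waited = 0
    for op in script:
        if isinstance(op, Write):
            regs[op.reg] = op.value
        else:
            waited += op.nsec
    return waited
```

Keeping the script as plain data separates computing the values (where, per the discussion, the hard part lies) from executing them, so a prepared list can be inspected or logged before anything is sent to the hardware.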
09:14rowbee: maybe i'll try out a bit after i get latest kernel sources
09:15rowbee: seems like i opened up enough space for a 125GB windows partition
09:16rowbee: i'm off to hack on that, cheers
09:16rowbee: windows and having fun are like nvidia and linux
09:17RSpliet: Windows is a great pleasure to masochists
09:20kherbst: only admins born in the 60s like to use windows :p
09:20kherbst: that also excludes all MS employees :P
09:22kherbst: anyway, today is XDC :D
09:33dcomp: No nouveau talk this year?
09:48kherbst: well, we had one at Fosdem and I am not too fond of doing remote talks
09:48kherbst: and I guess everybody else is too busy or doesn't want to give talks
10:02RSpliet: Yeah, plus IMHO they're a nice moment for exposure, but not the most informative. A summary of achievements (accel for Turing/Volta, some display work, plus small changes across the board) and the usual "NVIDIA makes big promises but doesn't deliver on the most critical part - the firmwares". There, just did the presentation for you :-P
14:06PaulePanter: RSpliet: Thank you. https://gitlab.freedesktop.org/drm/nouveau/-/issues/8
14:07RSpliet: PaulePanter: Thank you very much! And thanks for including the whole dmesg rather than a grep'd snippet, usually I have to follow up and ask for that :-D
14:10PaulePanter: RSpliet: I'll do you one better, and just uploaded the run through `scripts/decode_stacktrace.sh`. No idea if it’ll be helpful.
14:24RSpliet: PaulePanter: Legend!
14:26RSpliet: Mind also plopping the version of other components on your system in the bug report? Defo mesa, x.org, and if you're using the nouveau ddx rather than glamor also the version of the nouveau-ddx package (whatever that may be called in your distro)?
14:54jcline: Might be worth making an issue template with prompts, but I dunno if that's possible with the repository feature off
15:38RSpliet: jcline: well a good start would be for someone to update https://nouveau.freedesktop.org/wiki/Bugs/
15:40RSpliet: Not sure I still know my credentials to do that
15:51jcline: Yeah... I wasn't aware of the gitlab repo until just now
15:57RSpliet: No worries. I just realised my message looked a lot more passive-aggressive than I meant it :-P
15:58karolherbst: skeggsb, daniels, skeggsb: mind adding more people to the project? :D
15:58karolherbst: ohh wait
15:58karolherbst: I actually can do stuff there
15:58karolherbst: oh well
15:59daniels: your fault then, sorry :P
18:35karolherbst: skeggsb: I kind of think we also want to regenerate all of _emit and _target inside codegen
20:07kwizart: karolherbst, 19:53 < kherbst> kwizart: did you notice any kind of flickering? I had some issues with the jetson nano, but that's also probably because of the memory being super slow..
20:08kwizart: yes, I've experienced that on jetson-tk1 with a 4k display, especially when entering screen saver (on gnome)
20:09karolherbst: tagr: any ideas? it sounds like that's something we actually want to solve for real
20:09karolherbst: and I suspect there is just some syncing issue somewhere
20:10kwizart: I wonder if that can be solved by the related mesa patch ? (that was on wayland, not Xorg)
20:10karolherbst: not sure
20:10karolherbst: I think there was some issue tagr found, but I can't remember the details and if there was a patch for it
20:10kwizart:tries to build it on armhfp, as I've seen you have an aarch64 f32 copr builder
20:11karolherbst: I can also enable armhfp :D
20:11kwizart: yep, it doesn't seem that easy (as it's emulated on copr)
20:11kwizart: doing a koji scratch build might be easier (building natively)
20:12karolherbst: well.. I don't build it myself anyway
20:13karolherbst: kwizart: https://copr.fedorainfracloud.org/coprs/karolherbst/tegra_testing/build/1663968/
20:13kwizart: karolherbst, yep, I've rebased the patch on top of f33 mesa
20:14karolherbst: huh, which patch btw?
20:14kwizart: or even https://koji.fedoraproject.org/koji/taskinfo?taskID=51610949
20:14kwizart: (only arm,aarch64 build)
20:15kwizart: the one from tagr [PATCH] tegra: Support framebuffer modifier unaware application
20:16karolherbst: right.. I think I'd just burn down any support for non modifier aware userspace
20:16karolherbst: it's fundamentally broken with X anyway
20:16karolherbst: and no solution to actually fix it for good
20:17karolherbst: kwizart: I think enabling modifier support in compositors is really the way to go
20:17karolherbst: all that stuff was added for those use cases anyway
20:18karolherbst: I mean.. we can probably make it work for wayland as you page flip, but it's still a perf nightmare unless you spend tons of time to be smart
20:26karolherbst: kwizart: but I think those patches also suffer from the syncing problems, no?
20:28kwizart: karolherbst, I haven't tried the mesa patch yet (still building) I will report once tested on the device
20:28karolherbst: ahh, okay