03:51imirkin: karolherbst: i'm seeing https://paste.debian.net/1190225/ -- looks like the serialize stuff is off a bit?
03:51imirkin: i guess the addReloc logic misses some initialization sometimes? dunno
05:59imirkin: interesting. glxspheres + HUD = major fail on nv50: nouveau 0000:04:00.0: glxspheres: pushbuf push count exceeds limit: 161228 max 512
09:28Wolf480pl: Checked on Windows again and under load GPU-Z is reporting mem clock 1248 MHz (meanwhile OpenHardwareMonitor reports 2x that) and 1.025V. On Linux with nouveau, in pstate != 07, sensors(1) is reporting only 1.01V.
09:29Wolf480pl: Could it be that nouveau is setting voltage too low and/or mem clock 2x too high?
09:31Wolf480pl: (pstate 0f says 5000MHz, which is 4x higher than what GPU-Z was reporting on Windows, or 2x higher if we account for DDR AFAIU)
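The clock figures quoted above can be compared with a little arithmetic. This sketch uses only the numbers reported in the conversation; reading the factors of 2 and 4 as data-rate multipliers (DDR-style counting) is an assumption, not something any of the tools confirm.

```python
# Figures quoted in the conversation, in MHz.
gpu_z_mhz = 1248          # GPU-Z under load on Windows
ohm_mhz = 2 * gpu_z_mhz   # OpenHardwareMonitor reports twice the GPU-Z value
nouveau_0f_mhz = 5000     # memory clock listed by nouveau for pstate 0f


def ratio(reported_mhz, base_mhz):
    """Return how many times larger reported_mhz is than base_mhz."""
    return reported_mhz / base_mhz


print(f"OHM / GPU-Z:     {ratio(ohm_mhz, gpu_z_mhz):.2f}x")
print(f"nouveau / GPU-Z: {ratio(nouveau_0f_mhz, gpu_z_mhz):.2f}x")
print(f"nouveau / OHM:   {ratio(nouveau_0f_mhz, ohm_mhz):.2f}x")
```

If the three tools merely count different edges of the same signal, the ratios of roughly 2 and 4 would be consistent with each other, and the clock would not actually be 2x too high. That would leave the voltage difference as the open question.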
10:13karolherbst: imirkin: could be. yes
10:13karolherbst: how did you hit it?
10:16karolherbst: (or maybe I should just run with libasan through the entire shader-db with cache enabled)
11:14Wolf480pl: RSpliet, either way the fps count was way off (showed 100fps, felt like 5fps) and so was the score
11:46RSpliet: lyude: I just updated my Fedora machines to kernel 5.11.7. And behold, on my Kepler GPU the cursor is wrecked. Can you chase up backporting your fix to Fedora?
14:09karolherbst: Wolf480pl: that's on a laptop, right?
14:10karolherbst: not sure if that's an intel bug then
14:10karolherbst: I noticed that if you don't vsync and don't get 60 fps straight, the intel driver misbehaves terribly
14:10karolherbst: but could also be a stupid nouveau bug in the prime offloading path
14:10Wolf480pl: I reran the benchmark with nouveau as primary gpu and got a saner result
14:11Wolf480pl: ~500 points with pstate 0a
14:11karolherbst: what about 0xf?
14:11karolherbst: Wolf480pl: well, if you don't care about battery lifetime or general instability, you can use nouveau as the main, yes :D
14:11karolherbst: ahh :/
14:11karolherbst: ohh that's GK104...
14:11Wolf480pl: I can give you dmesg, but that was with an older kernel
14:11karolherbst: Wolf480pl: do you have a pastebin of the pstate file somewhere?
14:12karolherbst: I think that's the high clock issue
14:12Wolf480pl: pstate https://gist.github.com/Wolf480pl/4c506a4eb6637cea5d524d875ed77198
14:12Wolf480pl: dmesg https://gist.github.com/Wolf480pl/cab91b80d0ca0aab5889581e45a011f8
14:12karolherbst: does 0xe work?
14:12Wolf480pl: nope, similar hang to 0xf
14:13Wolf480pl: tbh I don't remember if the dmesg is from 0xe or 0xf
14:13karolherbst: the error kind of indicates something odd with memory
14:13karolherbst: could also be related to unknown bits
14:13Wolf480pl: so, more traces needed?
14:14karolherbst: did you file a bug already? I think it would make sense to 1. attach a vbios 2. attach a mmiotrace with nvidia when clocking with the nvidia-settings GUI
14:14Wolf480pl: if I did it was years ago
14:14karolherbst: ahh yeah...
14:14Wolf480pl: but I'm not sure if I did
14:14karolherbst: sadly I wouldn't have time to work on those issues :/ best we can do is to document it
14:15RSpliet: This stuff doesn't change much. If you did, the old trace should be usable
14:15karolherbst: trace with nouveau would also help
14:16karolherbst: but yeah.. the memory is getting bonkers :/
14:16Wolf480pl: also, it doesn't hang immediately, I can run glxspheres for a while and it works, but when I run heaven it hangs sooner or later
14:16karolherbst: yeah... just some instabilities
14:17karolherbst: could also be something trivial like we don't increase the mem voltage for whatever reasons
14:17Wolf480pl: could this be related to voltage?
14:17karolherbst: might be
14:17karolherbst: might also be some misconfiguration
14:17Wolf480pl: oh, but which voltage does sensors(1) report?
14:17RSpliet: really high (memory) clocks sometimes require increasing the PLL in two stages, or even using two PLLs to get up there. Not sure that logic is perfect
14:18karolherbst: yeah... memory voltage is... annoying
14:18karolherbst: RSpliet: we do that
14:18karolherbst: the threshold is a memory clock
14:18karolherbst: but yeah...
14:18karolherbst: maybe that threshold is bonkers
14:18karolherbst: could be anything really
14:18Wolf480pl: btw. do I remember correctly nouveau bugs used to be on freedesktop bugzilla?
14:19karolherbst: we are kind of moving to gitlab
14:19karolherbst: plans to work on issue templates on monday
14:19Wolf480pl: ok, but are old bugs still on bugzilla or on gitlab already?
14:22karolherbst: I think they are mostly migrated
14:35karolherbst: imirkin: ohh, actually.. I think that's just padding
14:36karolherbst: we should calloc the entry
14:37karolherbst: either we memset or do some fancy realloc thing?
14:50Wolf480pl: btw. I looked at my old (2015) notes from past driver debugging and it says "The blob sets the register 20344 to 44 which increases the voltage to 1.02V", is this useful or should I try to make new traces with the blob?
14:56Wolf480pl: btw. btw. I still have `options nouveau config=War00C800_0=1` in my modprobe.d, does this option still exist?
14:58Wolf480pl: oh, it's on by default now?
15:04Wolf480pl: nvm that was in skeggsb's fork
15:12Wolf480pl: karolherbst, so I open nvidia-settings, start mmiotrace, switch "preferred mode" to "maximum performance", stop mmiotrace?
15:12karolherbst: you have to enable mmiotrace before loading the nvidia driver sadly :/
15:12karolherbst: but with nvidia-settings you have nice controls over when to reclock
15:12karolherbst: and can add markers into the trace
15:13Wolf480pl: in nvidia-settings I only see "powermizer settings: preferred mode: [auto|adaptive|maximum performance]"
15:14Wolf480pl: unless application profiles can be used for reclocking too
15:16karolherbst: the powermizer setting is good enough
15:16karolherbst: maximum perf can be set to force the highest perf mode
15:50Wolf480pl: I have an mmiotrace from `modprobe nvidia*` to changing powermizer to highest perf and back. I didn't run any 3d load in that trace, but nvidia-settings did show pstates changing anyway. Is this good enough?
15:52Wolf480pl: Should I also make an mmiotrace of what nouveau does when told to change pstate?
15:52Wolf480pl: Should I also include dmesg or sth?
17:01imirkin: karolherbst: yeah, i figured it was something like that
17:03imirkin: karolherbst: i hit it by running deqp, while trying to debug an unrelated issue
17:04imirkin: (running valgrind on it, that is)
17:04imirkin: i think it's much more common on nv50, since nv50 has to do CALL relocations
17:04imirkin: (there are no relative calls)
17:19karolherbst: mind putting a memset or something to see if that fixes it?
17:19imirkin: any particular place?
17:19karolherbst: otherwise maybe we want to declare everything packed we serialize
17:20karolherbst: imirkin: when the reloc gets allocated
17:20karolherbst: I guess directly after the realloc?
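The padding theory can be illustrated with a small sketch. The struct layout here is hypothetical (it is not the real Mesa reloc entry), and a pre-filled `bytearray` stands in for memory returned by realloc. The point is only that padding bytes keep whatever the allocation held before, so serializing the raw struct is not deterministic unless the memory is cleared first.

```python
import ctypes


class RelocEntry(ctypes.Structure):
    # Hypothetical layout: a 1-byte field followed by a 4-byte field, which
    # leaves padding bytes between the two on common platforms.
    _fields_ = [("type", ctypes.c_uint8), ("offset", ctypes.c_uint32)]


ENTRY_SIZE = ctypes.sizeof(RelocEntry)


def fresh_storage(fill):
    """Stand-in for freshly (re)allocated memory holding arbitrary old bytes."""
    return bytearray([fill] * ENTRY_SIZE)


def cleared(storage):
    """Equivalent of a memset(ptr, 0, size) directly after the allocation."""
    storage[:] = bytes(len(storage))
    return storage


def serialize_entry(storage, type_, offset):
    """Fill an entry in place inside storage and return its raw bytes."""
    entry = RelocEntry.from_buffer(storage)
    entry.type = type_
    entry.offset = offset
    return bytes(storage[:ENTRY_SIZE])


# Same field values, different leftover memory: the serialized blobs differ.
a = serialize_entry(fresh_storage(0xAA), 1, 0x100)
b = serialize_entry(fresh_storage(0x55), 1, 0x100)

# Clearing first makes the padding, and therefore the blob, deterministic.
c = serialize_entry(cleared(fresh_storage(0xAA)), 1, 0x100)
d = serialize_entry(cleared(fresh_storage(0x55)), 1, 0x100)

print(ENTRY_SIZE, a != b, c == d)
```

Declaring the struct packed, as suggested above, would remove the padding instead of zeroing it; either approach makes the serialized bytes depend only on the fields.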
17:20imirkin: ok. i'll look at that code a bit closer later on i guess
17:21imirkin: trying to put a series of misc fixes together
17:22karolherbst: did we even enable the cache for nv50 yet? I was under the impression that I didn't merge the nv50 enablement, because the rework was bigger and I was too lazy to review it in depth at that point? Or maybe I went through it and checked it.. can't remember :D
17:22imirkin: dunno. but the bug i was debugging was on nv50.
17:22imirkin: (found it -- apparently bufctx_cp != bufctx_3d ... who knew)
17:25imirkin: my nv50 enablement of compute/images/etc is nearly complete
17:25imirkin: just a few things left to address, which aren't huge deals
17:27karolherbst: and then I break it with the CL CTS :p
17:27imirkin: well, someone will have to do the nir hookup as well
17:27imirkin: i had to do a bunch of dodgy stuff
17:28imirkin: well - not THAT dodgy. just slightly.
17:28imirkin: like doing the buffer/image file remapping to gmem slots
17:28karolherbst: I am aware of some issues I have to look into
17:28imirkin: and i sorta assume that you didn't take all the precautions of from_tgsi in generating the IR
17:29imirkin: so it might rely on things which don't work on nv50
17:29karolherbst: I have to rework some offset stuff and addresses
17:29imirkin: but as far as ES 3.1 compute support, it's ~done
17:29imirkin: i have some very weird format reinterpret issue going on with rgba8_snorm
17:30imirkin: which makes no sense, so i sorta assume it's an entirely unrelated issue
17:30imirkin: and if i were nice, i'd hook up indirect dispatch
17:31imirkin: and there are a handful of "weird" fails, at least one of them is because the generated program code is too large, i think
17:31karolherbst: don't remind me of too large program codes :/
17:31karolherbst: but I guess the ones you are talking about are still small
17:32karolherbst: how many instructions can nv50 hw execute? 64k?
17:32karolherbst: 2 million actually
17:33imirkin: the issue is actually with the *upload* of the program
17:33imirkin: we use SIFC, and i think we overflow it
17:33karolherbst: yeah well.. at least for CL we have to fix it
17:33imirkin: right, but it's not like a core problem
17:33karolherbst: with compute, applications are just super crazy
17:34imirkin: it's a simple fix, assuming that is it.
17:38karolherbst: wait.. I actually have nv30 hardware..
17:39imirkin: that won't help testing nv50...
17:39karolherbst: no, unrelated
17:39imirkin: CL on nv30? :)
17:40karolherbst: I thought my pre nv50 gpus are AGP or something...
17:40imirkin: as long as you don't try to use control flow, should be fine =]
17:40imirkin: or, you know, memory loads/stores
17:43imirkin: supports SLI
17:43imirkin: you can finally get some fast rendering done!
17:43karolherbst: the other is a Quadro FX 3450
17:43imirkin: i have that one
17:43imirkin: not a bad board
17:43imirkin: note that there are 2 nv4x 3d classes
17:44imirkin: the "nv40" and "nv44" ones
17:44imirkin: one of them supports index buffers (in a buffer), the other doesn't and you have to stream them in
17:44karolherbst: wondering if I should look into issues like this: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4397
17:44karolherbst: but probably super painful to figure out
17:44RSpliet: I have an nv4b.. here in the UK even
17:44karolherbst: it's not like there are even tests I could validate against
17:45imirkin: yeah, i mean obviously the geometry got messed up somehow
17:45imirkin: but it's not like it's 100% messed up
17:45imirkin: so you're looking for something slightly subtle
17:45imirkin: you can run deqp-gles2 and see if that triggers anything
17:45karolherbst: we support gles2 on those?
17:45imirkin: nv4x, sure
17:45imirkin: nv3x just gets you GL 1.5
17:45imirkin: but nv4x gets you GL 2.1 and GLES2
17:46karolherbst: I thought nv4x is also only 1.5
17:46karolherbst: do we do tricks or did nvidia not care?
17:46imirkin: nvidia's driver is also GL 2.1
17:46imirkin: (at least 2.0)
17:46karolherbst: ahh then the website is wrong
17:46karolherbst: wiki indeed lists 2.1
17:46imirkin: it's been known to happen
17:46karolherbst: okay, for gl 2.1 we also have a few CTS tests
17:46imirkin: i wouldn't be surprised if they even did 2.0 for the nv3x hw
17:47imirkin: but that'd be a LOT of fallbacks
17:47karolherbst: I think they reported 2.1
17:47karolherbst: wiki says this: "All models support Direct3D 9.0a and OpenGL 1.5 (2.1 (software) with latest drivers)"
17:47imirkin: but back in those days, applications knew what they could and could not use
17:47imirkin: so it didn't really matter
17:48karolherbst: actually surprised nvidia or AMD didn't release a GL_EXT_is_feature_slow extension :D
17:48imirkin: bad marketing
17:50RSpliet: well, they did have the GeForce slowdown commands
17:50imirkin: not in a public ext :)
17:51RSpliet: shame we never got an extension for that! Sounds like a killer feature for applications!
18:00agneli: guys I spent the last few weeks researching my question, no luck, maybe wrong places
18:01agneli: can I get a multihead card, let us say nVidia Quadro 160M
18:01agneli: to display different vts on different displays?
18:01imirkin: not really. what you're looking for is "multiseat"
18:01agneli: like i could with fbcon=map:
18:01agneli: even with a 40 yrs old hgc...
18:02agneli: not quite
18:02imirkin: although even that might be one GPU per seat
18:02agneli: I like text terminals I just want multihead
18:02imirkin: tbh i'm not 100% sure how vt's work
18:02agneli: but yes multiseat I wanted to check as well but later
18:03agneli: mapping usually happens per /dev/fb*
18:03imirkin: so the thing is... let's say "it works" and you can do this
18:03imirkin: how do you direct input to a vt?
18:03imirkin: and how do you tell it which vt goes to which screen
18:03imirkin: (there are more than 2 vt's)
18:03agneli: so let us say I have an old i486 with VGA and HGC (even dos supports that)
18:03agneli: I can map one vt to /dev/fb1 and another to /dev/fb0
18:04imirkin: there are 12 vt's
18:04imirkin: ok, so you just pre-declare that vt's 1, 3, 7 go to connector A
18:04imirkin: and 2, 4, 5, 6 go to connector B?
18:04imirkin: i'm unfamiliar with that
18:04imirkin: anyways, like i said, there's no HARD reason this can't work
18:05karolherbst: is fbcon even a kernel command line thing?
18:05imirkin: the GPU is there, fbcon has hooks to draw to stuff
18:05agneli: fbcon is a kernel module
18:05imirkin: karolherbst: fbcon is the software which displays text
18:05imirkin: karolherbst: it acts as a fbdev client, which in turn acts as a kms client
18:06karolherbst: but I don't see any mention of a fbcon parameter
18:06imirkin: karolherbst: it's there.
18:06imirkin: just perhaps not where you're looking
18:07karolherbst: ahh https://www.kernel.org/doc/html/latest/fb/fbcon.html?highlight=console
18:07karolherbst: so map maps drivers
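As a rough sketch of what the `map` option does, going by my reading of the linked fbcon documentation: the digit string is repeated across the consoles, and each digit picks the framebuffer driver for that tty. The function name and the default console count of 64 are assumptions for illustration only.

```python
def expand_fbcon_map(pattern, num_consoles=64):
    """Return {tty_number: framebuffer_index} for an fbcon=map:<pattern> value.

    The pattern is repeated until every console has an entry; tty numbering
    starts at 1 and framebuffer numbering at 0.
    """
    if not pattern or any(ch not in "0123456789" for ch in pattern):
        raise ValueError("pattern must be a non-empty string of digits")
    return {
        tty: int(pattern[(tty - 1) % len(pattern)])
        for tty in range(1, num_consoles + 1)
    }


# fbcon=map:01 alternates the consoles between fb0 and fb1.
mapping = expand_fbcon_map("01")
print({tty: mapping[tty] for tty in range(1, 7)})
```

Note that this maps consoles to framebuffer devices, not to connectors, so it only helps when each display is exposed as its own /dev/fb*.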
18:07imirkin: ideally you could feed it connectors. but i don't think fbcon interacts with kms at that level
18:07imirkin: i'm not sure how fbdev deals with multihead either, tbh
18:08Wolf480pl: so you'd need to create more than one fbdev per gpu?
18:08karolherbst: I guess fbcon should be able to not always mirror things
18:08karolherbst: because atm I think that's what it does
18:08imirkin: at least in my limited experience with it, fbcon output is mirrored across displays
18:08karolherbst: mirror the current one to all displays
18:09imirkin: but i dunno what it does if you have diff size monitors
18:09karolherbst: being shitty
18:09imirkin: which don't have a single compatible mode across them
18:09karolherbst: literally tried that
18:09karolherbst: it's annoying
18:09agneli: imirkin: that I can answer
18:09karolherbst: either you have cut outs
18:09imirkin: ka-boom? :)
18:09karolherbst: well, not that bad
18:09agneli: it lowers the resolution to the smallest monitor
18:09karolherbst: but you get like a FHD worth of space on a 4K display
18:10karolherbst: or you get 4k worth of space on the fhd one by things getting cut off
18:10agneli: karolherbst: but I am then able to set it back to 4k using fbset
18:10karolherbst: I think it kind of depends when you add displays
18:10karolherbst: agneli: ohh sure
18:11agneli: but then it cuts things out
18:11karolherbst: I am just the "it has to work by default, otherwise it's stupid" guy
18:11karolherbst: best case would be that terminals are getting scaled accordingly
18:11karolherbst: oh well..
18:11agneli: there is a vresx vresy parameter in the /etc/fb.modes though...
18:11agneli: which stands for virtual
18:12agneli: man fb.modes
18:12agneli: but I have no idea what it means...
18:12karolherbst: yeah well...
18:12karolherbst: I fixed the console on my 4k screens by setting a custom font
18:12imirkin: agneli: virtual resolution is usually something which enables panning
18:13imirkin: e.g. you might have a 1024x768 monitor but you enable a 2000x2000 virtual resolution
18:13karolherbst: thing is... I don't know how I did it :D
18:13imirkin: and then the monitor is a 1024x768 viewport into that "global" display
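The panning idea can be sketched as a viewport whose top-left corner is clamped so that the visible window never leaves the virtual area. The function and parameter names are made up for this illustration; the sizes are the ones from the example above.

```python
def clamp_viewport(pan_x, pan_y, visible=(1024, 768), virtual=(2000, 2000)):
    """Clamp the top-left corner of the visible window inside the virtual area.

    visible is the monitor resolution and virtual is the larger "global"
    display; both are (width, height) tuples in pixels.
    """
    max_x = virtual[0] - visible[0]
    max_y = virtual[1] - visible[1]
    if max_x < 0 or max_y < 0:
        raise ValueError("virtual resolution must be at least the visible one")
    x = min(max(pan_x, 0), max_x)
    y = min(max(pan_y, 0), max_y)
    return x, y


# Panning past the right edge and above the top edge gets clamped.
print(clamp_viewport(5000, -10))
```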
18:13karolherbst: I think it's some grub magic in my case
18:13imirkin: it was more of a thing back in the 90's
18:14agneli: I am fixing this issue by forcing all displays to the highest bidder via video= option
18:14agneli: the smaller displays just complain they cannot cope
18:14imirkin: karolherbst: btw, when you get a chance: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9740
18:14agneli: but well they just mirror anyway...
18:14karolherbst: already saw the MR
18:15karolherbst: yeah.. if I don't forget, I will take a deeper look on monday
18:15imirkin: karolherbst: none of those are big fixes - the viewport thing makes it look bigger than it is, but that's a separate MR
18:16RSpliet: karolherbst: btw, I didn't CC you on this because I think it's Lyude's bug/patch, but https://bugzilla.redhat.com/show_bug.cgi?id=1941291
18:17karolherbst: RSpliet: I think somebody already fixed it?
18:17karolherbst: there were some cursor related patches on the ML
18:17RSpliet: karolherbst: yes, Lyude. But Fedora is broken
18:17karolherbst: specifically for kepler
18:17RSpliet: They just pushed 5.11 to F33 stable.
18:17karolherbst: request a backport of the patch :p
18:17RSpliet: Without that fix in it
18:17RSpliet: I did
18:18RSpliet: Just not sure who's going to pick that up, so making sure I bug all of y'all RH people :-P
18:18karolherbst: RSpliet: https://src.fedoraproject.org/rpms/kernel/pull-requests :p
18:18karolherbst: or ping... ehhh....
18:19karolherbst: RSpliet: jforbes
18:19karolherbst: but it would be easier to already prepare the spec patch and everything
18:19karolherbst: with URL to upstream commit, etc...
18:20karolherbst: it's the "drm/nouveau/kms/nve4-nv108: Limit cursors to 128x128" patch, isn't it?
18:20RSpliet: Believe so yes
18:20karolherbst: did you try it?
18:21RSpliet: I don't have a kernel tree handy
18:21karolherbst: you don't need to
18:21karolherbst: .. wait
18:21karolherbst: clone the spec repo
18:21karolherbst: add the patch
18:21karolherbst: get fedpkg
18:21karolherbst: and do "fedpkg mockbuild"
18:21karolherbst: that gives you rpms
18:21karolherbst: and those you can install via "dnf update *.rpm"
18:22RSpliet: That almost sounds easy. I actually have (... had) an alias called rpmbuild.kernel. It just takes several hours on my machine
18:22karolherbst: let it run over night
18:22RSpliet: Because it's a puny FX 6300 with cooling issues (the cooler is loud and ineffective)
18:22RSpliet: Hah, not sure I can sleep with my machine on :')
18:23RSpliet: AMD stock coolers really are a work of art
18:24RSpliet: It's in need of replacement. It's embarrassing how theoretically the fastest computer in my house is an SBM. In practice though, I suspect its cooling isn't up to the job either and it'll throttle if I use it as a build farm
18:25RSpliet: Think you can get COPR to build that kernel easily? Not even sure if this fix made it upstream yet :-C
18:25RSpliet: (I'm also still cooking)
18:26karolherbst: ehh.. copr
18:26karolherbst: it will probably time out
18:26karolherbst: yeah... if it's not upstream yet.. mhh
18:27karolherbst: RSpliet: upstream in drm is enough though
18:28RSpliet: Well, not really. Nouveau is broken on kepler for upstream 5.11 and 5.12rc3. Upstream shouldn't be broken. The patch was on the ML two weeks ago, that really should have fast-tracked upwards.
18:29karolherbst: ping airlied or danvet then :p
18:29RSpliet: danvet hereby pinged :-P
18:29imirkin: RSpliet: works fine, just don't use stupid xf86-video-modesetting :p
18:30karolherbst: I think the patch still mostly misses reviews
18:30RSpliet: imirkin: wayland
18:30imirkin: RSpliet: ah, sad
18:30karolherbst: imirkin: [PATCH] drm/nouveau/kms/nve4-nv108: Limit cursors to 128x128 :p
18:30imirkin: karolherbst: yeah, but it's only a problem if someone requests the large cursors to begin with
18:30karolherbst: RSpliet: but yeah, I think it would make sense to try it out
18:30danvet: I thought Lyude was going back on that to respin?
18:30karolherbst: or maybe I could
18:30karolherbst: danvet: no idea...
18:31danvet: but also not going to look anywhere near a git tree today :-)
18:31karolherbst: imirkin: what? we don't request larger cursors in the nouveau ddx?
18:31RSpliet: danvet: it sounds like somewhere a ball was dropped. Not too fussed about the "who" question, but the result is that currently upstream is broken
18:31karolherbst: danvet: fair
18:31imirkin: karolherbst: no, always 64x64
18:31karolherbst: imirkin: :/ sad
18:31RSpliet: danvet: that's fine. There's a bug on RHBZ. I'll see if I can quickly get an RPM build tree to do a review if that's what it takes.
18:31imirkin: danvet: afaik there's some "better" solution that's possible, but it's more involved. this is a good backport candidate.
18:31danvet: but airlied should wake up any moment
18:31danvet: but he's not here
18:31danvet: poke him on #dri-devel
18:31Wolf480pl: agneli, looks like this is the code that makes fbdevs https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fb_helper.c#L1867
18:32karolherbst: imirkin: well.. I guess we will have those "cursor looks crappy on 8k" bugs soon then :p
18:32imirkin: karolherbst: yes ... i expect those to start flowing in RSN
18:36agneli: Wolf480pl: let me digest it ;)
18:37imirkin: should be possible to introduce a "zaphodheads" setting for generating the fbdevs
18:47RSpliet: karolherbst: if it makes you feel any better, looks like Renoir also soiled itself on 5.11
18:48karolherbst: ohh, I am happy that only cursors are broken, based on our testing, everything is broken in each release until proven otherwise :p
18:48RSpliet: Oh no I mean, Renoir as in amdgpu
18:51HerrSpliet: wouldn't wake up no matter how many buttons I bashed on my keyboard. restarting gdm didn't do the trick. Reboot and suddenly plymouth played ball on the shutdown process.
19:29imirkin: karolherbst: ok, the viewport MR is in, i updated the misc fixes MR so that it's a bit easier to look at diffs hopefully
19:30imirkin: all pretty straightforward stuff.
19:36RSpliet: 17 iterations later, I'm actually building a kernel RPM now. I forgot that my PC is more stable since I've introduced thermal paste
19:38RSpliet: ever tried a peanut-butter-thermalpaste sandwich?
20:03imirkin: RSpliet: is it like tomato paste?
20:29RSpliet: Ohh it's a lot more moreish, it's like tomacco paste!
20:44imirkin: mmm... tomacco...