00:00imirkin: these messages about nvrm every like second are pretty annoying
00:00imirkin: i've never seen anything like that
00:00Celmor[m]: wasn't that the first thing I came here to say?
00:00imirkin: maybe some preference needs to be adjusted with glvnd
00:00imirkin: yeah, but i thought you had like 2 of those messages
00:00imirkin: not 2 billion
00:01Celmor[m]: + a crash of my browser for good measure
00:05Celmor[m]: removing nvidia-utils packages (which also required me to remove nvidia modules) stopped the NVRM spam
00:05Celmor[m]: and broke my i3bar :/
00:06Celmor[m]: perhaps i3bar was the culprit since it tries to show my GPU's temp
00:06imirkin: so that's probably what was calling some nv util
00:09Celmor[m]: removing the "block" for showing nvidia temp and reinstalling the nvidia stuff works too
00:23Celmor[m]: could there be a firmware package where installing it helps with nouveau? I saw one from AUR but don't remember the exact name so I can't find it anymore
00:24Celmor[m]: ##linux talked about a released signed firmware package
00:25imirkin: you appear to have accel just fine
00:27Celmor[m]: accel for what?
00:27imirkin: (and compute)
00:27Celmor[m]: getting ~60fps in teeworlds where I used to get 1000+
00:28Celmor[m]: not even reaching my monitor's refresh rate
00:28imirkin: nouveau can't reclock
00:28imirkin: so you're running at the lowest perf level of the board
00:28imirkin: change memory/core clocks
00:28Celmor[m]: a p-state
00:29Celmor[m]: this? https://wiki.archlinux.org/index.php/Nouveau#Power_management
00:29imirkin: but you can't change pstate on your board
00:29Celmor[m]: which board?
00:29imirkin: (nor 980ti)
00:29Celmor[m]: how do I check the current pstate?
00:30imirkin: cat that file
00:30Celmor[m]: tried that
00:30Celmor[m]: cat: /sys/kernel/debug/dri/0/pstate: No such device
00:30Celmor[m]: I do also have /sys/kernel/debug/dri/128
00:30imirkin: ENODEV? that's weird.
00:30imirkin: that's an alias
00:30Celmor[m]: not sure if it's confused as I have 2 GPUs installed
00:30imirkin: no, 0 = 128
00:30imirkin: long story
00:31imirkin: do you have a 1?
00:31imirkin: what's the other GPU?
00:31Celmor[m]: I have 0 and 128
00:31Celmor[m]: 6900 xt
00:31imirkin: is a driver for it loaded?
00:31Celmor[m]: vfio stub driver
00:31imirkin: i'm surprised it doesn't have a directory in debug/dri
00:31imirkin: oh ok
00:31imirkin: that explains it
00:32imirkin: i guess we don't even know how to get current clocks on pascal
00:32imirkin: anyways, it's going to be pretty low
00:32HdkR: ~100MHz core clock isn't good to play games at :)
00:32Celmor[m]: I would think doing the rendering in CPU would be even faster than the GPU
00:33imirkin: you can try it
00:33imirkin: i doubt it
00:33imirkin: unless you have a VERY beefy cpu
00:33imirkin: export LIBGL_ALWAYS_SOFTWARE=1
00:33imirkin: and then run the application
00:33Celmor[m]: I've tried `LIBGL_ALWAYS_SOFTWARE=1 teeworlds` and get the exact same performance as without
00:34imirkin: pastebin "glxinfo"?
00:34Celmor[m]: glxinfo -B ?
00:34imirkin: dunno what -B is
00:34Celmor[m]: with -B is a lot of output
00:34imirkin: you mean without -B? I guess -B is fine
00:34imirkin: that gets me the info
00:35imirkin: ok, so that means you're getting accel
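One way to confirm whether `LIBGL_ALWAYS_SOFTWARE=1` actually took effect is to look at the renderer string that `glxinfo -B` reports. The Python sketch below only parses text that has already been captured. The helper names and the sample renderer strings are illustrative assumptions, not output from this session.

```python
import re

# Substrings assumed to identify Mesa's software rasterizers in the renderer string.
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast")


def renderer_string(glxinfo_output):
    """Return the 'OpenGL renderer string' value from glxinfo output, or None."""
    match = re.search(
        r"^\s*OpenGL renderer string:\s*(.+)$", glxinfo_output, re.MULTILINE
    )
    return match.group(1).strip() if match else None


def is_software_rendering(glxinfo_output):
    """True if the reported renderer looks like a CPU rasterizer."""
    renderer = renderer_string(glxinfo_output)
    if renderer is None:
        return False
    return any(name in renderer.lower() for name in SOFTWARE_RENDERERS)


# Hypothetical sample strings, for illustration only.
hw_sample = "OpenGL vendor string: nouveau\nOpenGL renderer string: NV137\n"
sw_sample = "OpenGL renderer string: llvmpipe (LLVM 11.0.1, 256 bits)\n"
print(is_software_rendering(hw_sample), is_software_rendering(sw_sample))
```

With real data, the text would come from running `glxinfo -B` once with the variable set and once without, then comparing the two results.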
00:35imirkin: are you sure it's not limited to refresh rate?
00:35Celmor[m]: picom doesn't work with vsync
00:35Celmor[m]: and monitor's refresh rate is at 165
00:36imirkin: i dunno what teeworlds is, but unless it's very simple, it'd be quite surprising that you could match even lowest perf level on 1050ti on cpu
00:36Celmor[m]: it's a 2d game
00:36imirkin: is it GL?
00:36imirkin: depends how it's done
00:37imirkin: but it might be simple enough in that case
00:37Celmor[m]: any idea here? https://termbin.com/zha0
00:38imirkin: i forget
00:38Celmor[m]: command from https://wiki.archlinux.org/index.php/Nouveau#Vertical_Sync
00:38imirkin: the swap control thing
00:38imirkin: i also dunno what picom is
00:38Celmor[m]: a compositor, responsible for vsync
00:38imirkin: anyways, i'd recommend against using a compositor
00:38HdkR: imirkin: Teeworlds is a /very/ simple 2D online platformer/shooter game. I used to run it on my GMA 945 at full speed :P
00:39Celmor[m]: so basically no vsync. just like on nvidia, lol
00:39imirkin: no, you still have vsync
00:42imirkin: by default, if you do glXSwapBuffers() it'll be vsync'd
00:42Celmor[m]: I'm merely running a window manager otherwise (i3-wm)
00:42Celmor[m]: isn't picom saying that the swap buffer thing doesn't work?
00:42Celmor[m]: even while using nvidia I never had working vsync
00:42imirkin: it's talking about like GLX_OML_sync_control or some such shit
00:46Celmor[m]: I still don't see how I'll get vsync without a compositor
00:46imirkin: individual applications will be vsync'd
00:46imirkin: e.g. run "glxgears"
00:46imirkin: note how it runs at 60fps
00:46imirkin: or whatever refresh rate you have
00:47Celmor[m]: if I move that window it isn't
00:48Celmor[m]: mpv also doesn't vsync
00:48Celmor[m]: it's running at ~160.547 fps
00:48imirkin: and what's your refresh rate?
00:48imirkin: i'm too tired to think about this
00:49imirkin: normally it should run at your refresh rate
00:49imirkin: if you run with e.g. vblank_mode=0 in the env
00:49imirkin: it'll run at an unrestricted rate
00:49imirkin: (which for glxgears is largely limited by fill rate or pcie transfer rate, depending on the precise setup)
00:51imirkin: (the very first run will be off coz when it's starting up it won't be quite right)
00:51imirkin: 300 frames in 5.0 seconds = 59.950 FPS
00:51imirkin: that's what i get
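The glxgears check can be made mechanical: parse the FPS line it prints and compare the result with the monitor's refresh rate. A minimal Python sketch follows. The function names and the 1% tolerance are assumptions chosen for illustration.

```python
import re

GEARS_LINE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_glxgears_line(line):
    """Return (frames, seconds, fps) from a glxgears status line, or None."""
    match = GEARS_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


def looks_vsynced(fps, refresh_hz, tolerance=0.01):
    """True if fps is within `tolerance` (relative) of the refresh rate."""
    return abs(fps - refresh_hz) <= tolerance * refresh_hz


frames, seconds, fps = parse_glxgears_line("300 frames in 5.0 seconds = 59.950 FPS")
print(looks_vsynced(fps, 60.0))
```

The reported FPS figure is used as printed rather than recomputed from frames and seconds, since the seconds value in the line is rounded.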
00:51Celmor[m]: xorg froze again :/
00:51imirkin: sounds like nouveau isn't going to work out well for you, given the set of applications you run, or something else
00:53Celmor[m]: I just wonder if installing that firmware package would help
00:53imirkin: you already have it
00:54Celmor[m]: there's one from AUR that I don't
00:54imirkin: but you have the firmware, otherwise you wouldn't get accel
00:55Celmor[m]: so opengl accel apparently is working, albeit at the lowest p state, but video accel isn't?
00:56imirkin: you're not gonna get video accel ever
00:57Celmor[m]: is it so hardware to interface with?
00:57imirkin: please rephrase
00:58Celmor[m]: interfacing with the gpus nvdec
00:58imirkin: what's the question though?
00:59Celmor[m]: is it that hard to interface with nvdec so there's no work being made in that regard?
00:59imirkin: video decoding is soul-sucking work
00:59imirkin: i know because my soul has been sucked :)
01:00imirkin: you'd have to (a) figure out how to load the firmware and interact with it at the engine level and (b) work out how to get it to decode stuff
01:00imirkin: the latter has been made slightly easier by them actually releasing some nvdec docs
01:02Celmor[m]: it'd be nice though to change the p state
01:02imirkin: it would be.
01:05Celmor[m]: Guess I'll undo the changes then. thanks for the help though
02:45CheetahPixie: Morning folks.
02:45CheetahPixie: I have but a simple question.
02:45CheetahPixie: What's the news on Wayland?
02:45CheetahPixie: I have a 520 that acts pretty drunk on it.
03:59imirkin: CheetahPixie: should generally speaking work
04:01CheetahPixie: Any gotchas or issues I should know of?
04:01imirkin: avoid software which starts with the letters 'g' or 'k'
04:10CheetahPixie: Okay, so no KDE?
04:10CheetahPixie: But I'm running on KDE.
04:12HdkR: There's the first problem :)
04:13imirkin: CDE would be fine - doesn't start with g or k
04:13imirkin: (if you know what that is, i feel sorry for you)
04:51CheetahPixie: I do.
04:52CheetahPixie: Think I tried it once?
04:52CheetahPixie: I installed every DE my repos had.
04:58HdkR: Did you try sway? :)
07:32CheetahPixie: Nope. I want to stick to KDE more than anything, really.
08:08linkmauve: imirkin, re video decoding, I thought the Ryujinx people did RE the hardware interface so we would “just” have to reimplement the software interface.
12:56pmoreau: imirkin: Sounds good with pushing the patches! 👍️ I’ll try to prep a series with some tweaks to the NIR frontend; atm testing your latest “nv50_compute” branch and fixing some issues with 8-bit values.
15:03pmoreau: This seems weird, is there a 96-bit load? Or should it be a 128-bit load (aligned to a 128-bit boundary)? https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp#L2778
15:23imirkin: pmoreau: there's a 96-bit load ... i suspect it should be aligned to 128-bits though
15:24imirkin: linkmauve: nvidia pushed docs on NVDEC (and NVENC)
15:24pmoreau: Oh there is? Never mind then.
15:24imirkin: linkmauve: but even figuring out where the firmware lives and how to load it can be quite annoying
15:48imirkin: pmoreau: it's just loading 3 regs' worth of stuff
15:48imirkin: i dunno. definitely exists on some gens, perhaps not all.
15:54pmoreau: It makes sense since loading float3 would be a relatively common case.
16:03tertl3: 10-4 I have been added to mailing list
17:03imirkin: pmoreau: going to try to land some of the rest of the less dubious things, including images even if there are some TODOs left
17:03imirkin: i don't think it's doing anyone any good sitting in my branch
17:03imirkin: later today though
17:57pmoreau: imirkin: I’m back to trying to solve the issue with `MemoryOpt::combineSt()` (similar to what we discussed a month ago, see at 22:02 https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=nouveau&date=2021-03-01).
18:00pmoreau: I tried your suggestion of setting dType to U32 and sType to U8 but that makes the CTS fail (whereas it works if I disable the optimisations).
18:02pmoreau: (I think it fails because the program now stores the whole 32-bit value and, as a result, overwrites the nearby 8-bit values.)
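The suspected failure mode can be reproduced in miniature: replacing an 8-bit store with a 32-bit store at the same address overwrites the three neighbouring bytes. The Python sketch below models memory as a `bytearray`. It illustrates the hypothesis only and is not nouveau codegen code; little-endian byte order and the helper names are assumptions.

```python
import struct


def store_u8(mem, addr, value):
    """Write only the low 8 bits of value at addr."""
    mem[addr] = value & 0xFF


def store_u32(mem, addr, value):
    """Write all 32 bits of value at addr (little-endian assumed)."""
    struct.pack_into("<I", mem, addr, value & 0xFFFFFFFF)


narrow = bytearray([0x11, 0x22, 0x33, 0x44])
wide = bytearray(narrow)

store_u8(narrow, 0, 0xAB)   # neighbours at offsets 1..3 are untouched
store_u32(wide, 0, 0xAB)    # upper 24 bits (zero here) land on offsets 1..3

print(narrow.hex(), wide.hex())
```

In the narrow case only offset 0 changes, while the wide store replaces the bytes at offsets 1 to 3 as well, which matches the kind of corruption a widened store would cause for adjacent 8-bit values.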
18:50pmoreau: Okay, the success seems to be almost random though it seems that if I step through some of the `MemoryOpt` code (or maybe any code) and spend a bit of time there, the whole thing will fail but not for the reasons one would expect? Here is the excerpt from dmesg regarding one run: https://gitlab.freedesktop.org/-/snippets/1858.
18:52pmoreau: Those `TRAP_MP_EXEC - TP 0 MP 0: 00000010 [INVALID_OPCODE] at 000000 warp 1, opcode 00000000 00000000` look really weird as the emitter still generates the same binary in both instances but somehow the upload of the binary fails when stepping through for too long or something like that?
18:54pmoreau: Maybe the card is in a weird state, but I just ran the same program again without any changes except that I did not put any breakpoints nor stepped through the code and it ran just fine.
20:24imirkin: pmoreau: ok, so there are some practical realities
20:24imirkin: which is that while you can have a load of 8-bit data
20:24imirkin: that always has to end up in a 32-bit reg on tesla
20:24imirkin: ld g only ever returns 32-bit (or higher) regs
20:24imirkin: (maybe it can do a 16-bit reg, but dealing with half-regs is variously annoying)
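The constraint described here, that a narrow load still fills a whole 32-bit register, means an 8-bit value has to be widened on the way in. A Python sketch of the two usual widenings follows. Which one the hardware applies for a given type is an assumption here, as are the helper names.

```python
def load_u8_to_reg32(mem, addr):
    """Zero-extend the byte at addr into a 32-bit register value."""
    return mem[addr] & 0xFF


def load_s8_to_reg32(mem, addr):
    """Sign-extend the byte at addr into a 32-bit two's-complement value."""
    byte = mem[addr] & 0xFF
    if byte & 0x80:
        byte -= 0x100  # interpret as a negative 8-bit value
    return byte & 0xFFFFFFFF


mem = bytes([0x7F, 0x80, 0xFF])
print([hex(load_u8_to_reg32(mem, i)) for i in range(3)])
print([hex(load_s8_to_reg32(mem, i)) for i in range(3)])
```

Either way the register ends up holding 32 bits, so a later store has to be narrowed back to 8 bits explicitly if only one byte is meant to reach memory.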
20:25imirkin: pmoreau: i'd be curious to see the input code
20:25imirkin: maybe we need a "lowering" pass which normalizes some of this
21:56imirkin: pmoreau: i pushed another version of the nv50_compute branch ... i think it's basically ready, but i want to flush out some of the other changes i have first
22:15imirkin: pmoreau: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10164 (and the tangentially related https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10162)