00:00gfxstrand[d]: Yeah, it's basically push descriptors
00:01gfxstrand[d]: karolherbst[d]: I don't see the problem. We support both of those. Not on Kepler because Kepler sucks, but everywhere else.
00:01karolherbst[d]: gfxstrand[d]: yeah, fair
00:02karolherbst[d]: I meant just in general, because intel doesn't support shaderStorageImageReadWithoutFormat on most gens :ferrisUpsideDown:
00:02karolherbst[d]: but anyway, for nvk it's fine
00:02gfxstrand[d]: Yeah, that's an Intel hardware problem
00:02karolherbst[d]: though CL is pretty restricted in what implicit format conversions are allowed anyway
00:02karolherbst[d]: it has to be of the matching type at least, just bit width and components are dynamic
00:03gfxstrand[d]: I think Microsoft finally told them to fix it but IDK what happened since I left, not that I could talk about it anyway. 🙃
00:03karolherbst[d]: `.shaderStorageImageReadWithoutFormat = pdevice->info.verx10 >= 125,`
00:03karolherbst[d]: is what anv is doing
00:04gfxstrand[d]: Okay, so DG2, then. That makes sense. I had forgotten it worked on DG2.
00:04gfxstrand[d]: Everything from those last 2 years or so is a blur
00:11redsheep[d]: gfxstrand[d]: Does zink rely on VK_KHR_vulkan_memory_model? If not then yeah kepler getting nak support up to the level of making zink work well would be great for those who can kind of reclock
00:12karolherbst[d]: would be an interesting experiment anyway, I've seen perf difference of around 10% with radv vs radeonsi
00:12redsheep[d]: They'll never get good dxvk but 🤷
00:13gfxstrand[d]: If someone wants to write sm20 back-end support they're more than welcome to
00:14karolherbst[d]: though I'd definitely go with SM35 first, because that would cover the most performance kepler cards
00:16redsheep[d]: It seems nice that with nvk codegen support dropped the only supported hardware would be cards that can in principle pass 1.3 cts, but maybe that isn't all that important
00:16karolherbst[d]: it depends on what you want to do with it
00:17gfxstrand[d]: Meh, it's more about being able to assume NAK than dropping hardware. As long as it doesn't make hash of anything, someone could implement Fermi support for all I care.
00:18karolherbst[d]: Ben said something that the copy engine stuff could probably be used on fermi, if one writes the proper firmware for it :ferrisUpsideDown:
00:18karolherbst[d]: though I totally don't know the details, but anyway
00:18karolherbst[d]: fermi is pain
00:18karolherbst[d]: a lot of pain
00:19gfxstrand[d]: I didn't say I was going to. 😝
00:19karolherbst[d]: I think it doesn't have bindless images at all
00:19karolherbst[d]: and only split texture + samplers
00:20redsheep[d]: gfxstrand[d]: Yeah I am surprised you went as far as doing the maxwell work to begin with
00:21karolherbst[d]: anyway.. fermi would be _fun_
00:21redsheep[d]: Speaking of, I didn't manage to get any pascal users to poke at the new branch
00:21redsheep[d]: I know they're so similar that it's hardly worth mentioning but there might be something weird
00:23redsheep[d]: At the very least raster pattern is really different with gp104 vs gm204, so they did change a few things
00:29gfxstrand[d]: redsheep[d]: Just tell them to poke at main. 😅
00:29redsheep[d]: Oh right it merged, forgot 😛
00:31gfxstrand[d]: redsheep[d]: It was a fun diversion
06:00asdqueerfromeu[d]: gfxstrand[d]: How about Tesla too? 🔺
10:16butterflies[d]: is Fermi even remotely worth implementing support for tho, lol
10:20triang3l[d]: butterflies[d]: There's no official driver 🙃
10:20butterflies[d]: triang3l[d]: I have a cursed idea: VkOnGL
10:21butterflies[d]: (even NV themselves didn't ship Vulkan on Fermi)
10:43dadschoorse[d]: supporting kepler sounds as pointless as anv supporting haswell
10:59redsheep[d]: Yeah, the latest anything kepler should have shipped new would be a full decade ago at this point
11:20esdrastarsis[d][d]: dadschoorse[d]: Yeah, just create kevk (like hasvk) :happy_gears:
11:26asdqueerfromeu[d]: dadschoorse[d]: The most popular Kepler GPU is the GT 730 (according to Steam's hardware survey)
11:29redsheep[d]: Seems like more trouble than it's worth for a total of .4% of the market, and two of the worst cards on the list at that
11:31redsheep[d]: Even what remains of maxwell from the list appears to total just under 1%, though if we throw in pascal it's several times more
11:35redsheep[d]: Hmm wow pascal seems to account for about 10% of the total steam hardware survey at the moment, that's crazy high for 7-8 years later. Shame those will probably never be fast on an open driver.
11:36mohamexiety[d]: yeah the 1060, 1070, and 1080 are very popular
11:36esdrastarsis[d][d]: Unfortunately
11:38redsheep[d]: That generation was just too good, like actually. Killed years of sales figures, I'm sure. I loved my 1080ti
11:39RSpliet: also the last generation where you didn't need to sell both kidneys for high end, one kidney was enough
11:44redsheep[d]: It also provided a really compelling upgrade for everyone who had not bought the 980ti, no other card came close to even the 1070, at least when it launched (vega was a bit later)
11:46redsheep[d]: Since it was almost entirely a frequency upgrade it even made poorly written games with low utilization look fast
11:46karolherbst[d]: dadschoorse[d]: nouveau actually supports reclocking kepler GPUs, like the 780 Ti, so some nouveau users were getting those GPUs for the performance they get
11:46tiredchiku[d]: 4090 is the new 1080Ti
11:47karolherbst[d]: and the GL driver was able to reach around 70%/80% of the blob's perf in a few games
12:14karolherbst[d]: gfxstrand[d]: I'm considering bumping our rustc version in CI to 1.78, while bumping the req to it as well, or at least to 1.76 (as that aligns us with Firefox). 1.78 added checking for safety pre-conditions in core unsafe functions, e.g. `slice::from_raw_parts`, and it's surfacing bugs for me, e.g. ones triggered in piglit
12:21dadschoorse[d]: karolherbst[d]: haswell vulkan support was also useful, but a maintenance nightmare in the long term afaiu
12:22karolherbst[d]: 1.77 also gives us c-string literals and offset_of!
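(Editor's note: a minimal sketch of the two Rust 1.77 features mentioned above. The `Header` struct and the `driver_name` helper are hypothetical names made up for illustration and are not taken from Mesa. The C-string literal form `c"..."` additionally needs the 2021 edition, so the code spells out the older equivalent and only mentions the literal in a comment.)

```rust
use std::ffi::CStr;
use std::mem::offset_of;

/// Hypothetical C-layout struct, used only to illustrate `offset_of!`.
#[repr(C)]
pub struct Header {
    pub tag: u8,
    pub len: u32,
    pub payload: [u8; 8],
}

/// Byte offsets of fields, computed at compile time (needs Rust 1.77+).
pub const LEN_OFFSET: usize = offset_of!(Header, len);
pub const PAYLOAD_OFFSET: usize = offset_of!(Header, payload);

/// Pre-1.77 way to get a `&CStr` from a literal. With Rust 1.77+ on the
/// 2021 edition this could be written as the C-string literal `c"nvk"`.
pub fn driver_name() -> &'static CStr {
    CStr::from_bytes_with_nul(b"nvk\0").expect("literal is NUL-terminated")
}
```

(With `#[repr(C)]`, `len` is aligned to 4 bytes, so it lands after 3 bytes of padding following `tag`. A hand-copied `offset_of!` macro would have to reproduce exactly this kind of layout computation.)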
12:22karolherbst[d]: dadschoorse[d]: yeah, but nvk uses the same code on each gen, as something like genxml is pretty much overkill, so things align way better between archs. Most of the changes are between pascal/volta/turing afaik
12:23karolherbst[d]: though one argument against kepler is the old texture header format
12:29RSpliet: redsheep[d]: "poorly written games" are unfortunately a fact of life :-( Bit of a sidestep, but I'm genuinely impressed with the youtuber who optimised the original Super Mario 64 to run solidly at 60fps on original HW
13:01linkmauve: RSpliet, that one was a bit of an outlier, it was compiled at -O0 back then because their compiler introduced bugs they couldn’t fix before the release.
13:02RSpliet: linkmauve: sure :-) but also working for a company that designs GPUs I can definitely confirm that poorly written/optimised games are a fact of life
13:03linkmauve: Of course. ^^
13:03linkmauve: Hopefully they don’t ship at -O0 any more nowadays!
13:04RSpliet: Well of course not! MSVC would never accept parameters starting with a minus :-P
13:06linkmauve: :D
13:12HdkR: I have definitely hit recent games without optimizations enabled. Most of the time an indie game on a small engine doesn't need to worry much :)
13:23gfxstrand[d]: karolherbst[d]: CI catching those would be nice. I am a little worried that we'll accidentally use features if we bump the CI version but not the requirements.
13:24gfxstrand[d]: And I'm not sure if we should bump the requirement yet
13:24karolherbst[d]: yeah...
13:24karolherbst[d]: well
13:24gfxstrand[d]: But also I've copied+pasted `offset_of!` like 3 times now
13:24karolherbst[d]: Firefox ESR needs 1.76 anyway
13:24karolherbst[d]: so bumping to 1.76 at least is totally fine
13:24karolherbst[d]: it's just annoying, that the goodies came in with 1.77 and 1.78
13:25karolherbst[d]: but I want to bump to 1.76 anyway, like for `OnceCell`, because I want to use it
13:25karolherbst[d]: and I think a couple of other things...
13:25karolherbst[d]: though I could also globalize the rustc version checking and move everything to 1.76
13:25karolherbst[d]: currently all the rust things are spread around, and I'd also like to move the rust update policy I have in rusticl to global mesa...
13:26karolherbst[d]: maybe I indeed should do all of those things
13:26karolherbst[d]: oh well... I'm bumping to 1.76 for now, because that should be fine for everybody. If we feel the strong urge to update even further we can do that afterwards
13:26gfxstrand[d]: I'm fine bumping to 1.76, especially now that we've branched. I'm less sure about 1.78
13:27karolherbst[d]: yeah...
13:27karolherbst[d]: 1.78 will probably get distros to complain at us :ferrisUpsideDown:
13:27karolherbst[d]: I kinda like the idea using the same as firefox, because others need to package rust for that one anyway
13:27gfxstrand[d]: Is there any way to turn off the asserts in from_raw_parts? NAK uses it heavily and can statically guarantee those invariants
13:28karolherbst[d]: it's a runtime check, so as long as you never pass e.g. 0 as the length at runtime you are fine
13:28karolherbst[d]: but it's disabled in release builds afaik
13:28karolherbst[d]: https://blog.rust-lang.org/2024/05/02/Rust-1.78.0.html#asserting-unsafe-preconditions
13:29karolherbst[d]: the issue was that only debug builds of rust triggered those before, now it's triggered based on your project setting
14:52gfxstrand[d]: Okay, that's fine then
14:52gfxstrand[d]: And if Rust constant-folds some parameters, it'll probably elide the check on a lot of debug builds, too. I just want to be sure for release
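(Editor's note: a hedged sketch of the precondition being discussed. My reading of the linked 1.78 release notes is that the debug check on `slice::from_raw_parts` is about the pointer being non-null and aligned, which a C-style `(NULL, 0)` pair violates even though the length is zero. `slice_from_c` is a hypothetical helper name, not something from NAK or rusticl.)

```rust
use std::slice;

/// Hypothetical helper: turn a C-style (pointer, length) pair into a slice.
///
/// # Safety
/// If `ptr` is non-null and `len` is non-zero, `ptr` must be properly
/// aligned and point to `len` initialized values of `T` that stay valid
/// and unmodified for the lifetime `'a`.
pub unsafe fn slice_from_c<'a, T>(ptr: *const T, len: usize) -> &'a [T] {
    if ptr.is_null() || len == 0 {
        // `from_raw_parts` wants a non-null, aligned pointer even for an
        // empty slice, so never forward a possibly-null pointer to it.
        return &[];
    }
    // The caller upholds the remaining requirements (see `# Safety`).
    unsafe { slice::from_raw_parts(ptr, len) }
}
```

(Code that can statically guarantee a valid pointer does not need the early return. It is only one way to keep the C idiom of an empty `(NULL, 0)` array from tripping the assert in builds where the checks are enabled.)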
16:32karolherbst[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30414
21:02anon: i'm having some trouble getting nouveau installed as the dri provider. it's falling back to llvmpipe
21:02anon: this is in kernels 6.9.9 through 6.9.11
21:03anon: the boot log doesn't say why sw is being used, at least not that i can see
21:03anon: but nouveau is being loaded and used by other modules as expected
21:04anon: is there a log i can check that might shed some light on why this is happening?
21:11gfxstrand[d]: Ugh... Looks like my faults on Maxwell A are kernel bugs. 😢
21:11gfxstrand[d]: Which is a pity because those GPUs might actually be useful.
21:12skeggsb9778[d]: What does the bug seem to be?
21:12gfxstrand[d]: Probably something with migration?
21:12karolherbst: anon: "dmesg" and what "glxgears" or something else prints might help to debug this
21:12gfxstrand[d]: I get faults when I'm running stuff in parallel but if I run the exact same caselist on its own, everything is fine.
21:14gfxstrand[d]: It's also possible it's some sort of fencing issue
21:17gfxstrand[d]: Yeah, it's feeling like a fencing issue. All the tests which are faulting are doing some semi-heavy compute work.
21:17gfxstrand[d]: Maybe we're not waiting on compute shaders properly?
21:18skeggsb9778[d]: Yeah I'm not sure. It sounds like the same mystery issue that's been discussed a few times forever now, that I suspect is fencing related (though, I have no solid evidence for that either)
21:19karolherbst[d]: or maybe it's just plain boring data race somewhere in the module
21:21anon: here is dmesg: https://paste.centos.org/view/ef720224
21:21redsheep[d]: anon: Are you trying to use zink+NVK? If so then it sounds like you are having the same issue as me
21:22karolherbst: anon: you are missing firmware
21:22karolherbst: but nouveau is also not handling the missing firmware well
21:22karolherbst: anyway, you are missing firmware
21:23anon: ok
21:23karolherbst: no idea why it's not there, but you'll need the nvidia-gpu-firmware package
21:23anon: so the firmware is in the 6.9.8 kernel, the last one that worked i guess?
21:23karolherbst: and regenerate your initramfs via "dracut -f --regenerate-all"
21:24anon: you're right that the package is not installed at the moment
21:24karolherbst: the firmware is shipped in a different package than the kernel, so I would be surprised if the kernel version matters at all here
21:24karolherbst: :)
21:24gfxstrand[d]: karolherbst[d], skeggsb9778[d]: I wonder if we need to be using the release ops in the QMD and manually waiting on the command stream.
21:25karolherbst[d]: gfxstrand[d]: mhhhhhhhhh
21:25karolherbst[d]: maybe?
21:25gfxstrand[d]: Like maybe WFI just doesn't do the trick
21:25skeggsb9778[d]: yeah maybe, i suggested to airlied[d] to try the graphics version of that too
21:26karolherbst[d]: it might only be relevant if you do simultaneous compute
21:26skeggsb9778[d]: ie. stick a report semaphore release (from the end of the pipe) at the end of the pushbuf before sending it to the kernel
21:26anon: which part of the log says i am missing firmware?
21:26karolherbst: anon: "[ 9.852922] nouveau 0000:01:00.0: acr: firmware unavailable"
21:28anon: ok, after reboot, llvmpipe still used instead of nouveau
21:28karolherbst: could run into other issues now, what's your dmesg now?
21:29anon: hang on
21:31anon: here is new dmesg: https://paste.centos.org/view/cafaa64e
21:32karolherbst: anon: it's missing quite a lot
21:35anon: here is new dmesg: https://paste.centos.org/view/eeb4f6f2
21:35anon: still says no firmware
21:35anon: i did install the package
21:35karolherbst: anon: it's still not finding the firmware. Have you run the dracut command I mentioned above?
21:36anon: no
21:36anon: ok, stand by
21:44gfxstrand[d]: karolherbst[d]: I really do need to play with this stuff. I'd also like to stop making all our DMAs synchronous. That can't be doing good things for perf. In theory, it should be roughly the same infrastructure for both. Some sort of counter in memory that's attached to the context or to the command buffer somehow.
21:45karolherbst[d]: well.. it's just a semaphore you'd need to signal/wait on, no?
21:45anon: that apparently worked. the opengl renderer string is now NV134 instead of llvmpipe
21:45karolherbst: cool
21:46karolherbst[d]: like.. `dma-copy` has the `SEMAPHORE` at the top, and I think it's really just managing this
21:46karolherbst[d]: and the `PAYLOAD` is probably the value being set
21:47karolherbst[d]: and you could probably just have a timeline and kinda deal with wrap arounds
21:47karolherbst[d]: (by creating a new semaphore at some point)
21:47karolherbst[d]: probably want multiple ones anyway as some GPUs have like up to 10? or so copy engines
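(Editor's note: a small sketch of the timeline idea, under the assumption that the semaphore payload is a 32-bit counter, which the log does not confirm. The suggestion above is to handle wrap-around by creating a new semaphore at some point. The code below shows a different, commonly used alternative, a wrap-safe sequence number comparison. `Timeline` and `seqno_passed` are hypothetical names, not nouveau or NVK APIs.)

```rust
/// Returns true once `current` has reached or passed `target`, treating
/// both as points on a wrapping 32-bit timeline. Only meaningful while the
/// two values are less than 2^31 apart.
pub fn seqno_passed(current: u32, target: u32) -> bool {
    // Wrapping subtraction, reinterpreted as signed: a non-negative
    // distance means `current` is at or after `target`.
    (current.wrapping_sub(target) as i32) >= 0
}

/// Hypothetical per-engine timeline that hands out payload values.
pub struct Timeline {
    next: u32,
}

impl Timeline {
    pub fn new(start: u32) -> Self {
        Timeline { next: start }
    }

    /// Allocate the payload the next submission will write on completion.
    pub fn alloc(&mut self) -> u32 {
        let value = self.next;
        self.next = self.next.wrapping_add(1);
        value
    }
}
```

(One `Timeline` per copy engine would match the "multiple ones" remark. Whether the hardware semaphore compare can be used this way is not something the log establishes.)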
21:47anon: yes, wayland works now
21:48karolherbst[d]: mhhh..
21:48karolherbst[d]: `#define NVC7B5_LAUNCH_DMA_SEMAPHORE_TYPE_RELEASE_SEMAPHORE_WITH_TIMESTAMP (0x00000002)`
21:48karolherbst[d]: `#define NVC7B5_LAUNCH_DMA_SEMAPHORE_TYPE_RELEASE_FOUR_WORD_SEMAPHORE (0x00000002)`
21:48karolherbst[d]: interesting....
21:49anon: karolherbst[d]: thanks. you're a steely-eyed missile man
21:49karolherbst: :) np
21:49karolherbst: after a while you know what to look for first :D
21:49anon: no doubt
21:58airlied[d]: gfxstrand[d]: should probably work out transfer queues first
22:02gfxstrand[d]: Yeah, need to figure that out, too
22:03gfxstrand[d]: And skeggsb9778[d] and I need to have a good chat about how we want to advertise the GPU topology to userspace so we can maybe get an API put together for that.
22:03gfxstrand[d]: Lots of things to do