00:22 fdobridge: <S​id> trying to write a patch to expose more queues, but I'm unsure how to disable presentation on the other queue families
00:24 fdobridge: <S​id> right now I've got a fair few queues exposed but present support is enabled on all queue families, leading to both dxvk and vkd3d failing to create their respective devices
00:24 fdobridge: <S​id> though vkcube seems to run fine
00:26 fdobridge: <S​id> GZDoom's vulkan renderer also seems to be running fine
05:07 OftenTimeConsuming: How do you get the backtrace dumped out to fbcon? That just gets scrolled off the screen and fbcon doesn't support scroll anymore. All I got is; https://termbin.com/agvw
05:51 fdobridge: <g​fxstrand> Uh... Not sure. I'll have to look at it. I'll be at my computer in 45 min or so
05:54 fdobridge: <S​id> I tried to look at how radv and anv do it but couldn't figure it out, though I didn't dig too deep either
05:58 fdobridge: <g​fxstrand> Do they only present on some queues?
05:59 fdobridge: <g​fxstrand> I'm sure there's a hook somewhere. I just don't remember where.
05:59 fdobridge: <n​anokatze> there's no hook, wsi checks if queue supports graphics compute or transfer (any of these)
05:59 fdobridge: <S​id> yeah, nv only presents on some queues
06:00 fdobridge: <n​anokatze> does it matter what nv prop does?
06:00 fdobridge: <S​id> or rather, on some queue families
06:00 fdobridge: <n​anokatze> mesa vulkan wsi is happy with transfer flag being there
06:00 fdobridge: <S​id> I set the queue families matching nv prop, fwiw
06:00 fdobridge: <g​fxstrand> That's probably fine for us. I don't think we fundamentally really care as long as we can do a copy
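The check nanokatze describes (present is allowed on any family that has graphics, compute, or transfer set) can be sketched roughly as below. This is only an illustration, not the Mesa WSI code: `family_can_present` is a hypothetical name, and the bit values are copied from `VkQueueFlagBits` so the snippet builds without `vulkan.h`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as VkQueueFlagBits in the Vulkan headers, redefined here
 * so the sketch builds without vulkan.h. */
enum {
   QUEUE_GRAPHICS_BIT       = 0x1,
   QUEUE_COMPUTE_BIT        = 0x2,
   QUEUE_TRANSFER_BIT       = 0x4,
   QUEUE_SPARSE_BINDING_BIT = 0x8,
};

/* Hypothetical helper: a family is considered presentable as soon as it
 * advertises any one of graphics, compute, or transfer. */
static bool
family_can_present(uint32_t queue_flags)
{
   const uint32_t any_of = QUEUE_GRAPHICS_BIT |
                           QUEUE_COMPUTE_BIT |
                           QUEUE_TRANSFER_BIT;

   return (queue_flags & any_of) != 0;
}
```

Under this rule a transfer-only family still reports present support, which would be consistent with what Sid sees on all three families.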
06:03 fdobridge: <S​id> btw I'm exposing all the queues that are on the different GSP generations, except the queue families with encode/decode bits set
06:09 fdobridge: <S​id> also, fwiw, with all queues having present enabled, only dxvk and vkd3d are failing
06:09 fdobridge: <S​id> apps that use vulkan natively run just fine
06:11 fdobridge: <n​anokatze> radv advertises present support on transfer_bit-only queue fwiw
06:11 fdobridge: <n​anokatze> well, radv itself doesn't, it's the common wsi code that's responsible for that
06:12 fdobridge: <S​id> here's the queue families on nv prop: https://paste.sidonthe.net/pasta/otter-turtle-mole
06:13 fdobridge: <S​id> I enabled index 0, 1, and 2 on nvk
06:13 fdobridge: <S​id> with the same bits set and the same queue counts
06:14 fdobridge: <S​id> only difference between the two is stuff tied to the extensions missing on nvk, and that present is enabled on all three
06:15 fdobridge: <n​anokatze> how is dxvk failing
06:16 fdobridge: <S​id> Direct3D 11 device creation failed
06:16 fdobridge: <n​anokatze> present support on transfer-only should have zero impact on dxvk because it pokes present on graphics queue, strictly
06:16 fdobridge: <S​id> I'm not at my machine for another 5h sadly (classes)
06:17 fdobridge: <S​id> but I do remember the more important stuff
06:17 fdobridge: <S​id> 5? 4h
06:18 fdobridge: <n​anokatze> checking logs could be helpful so I guess we'll have to wait those 4 hours
06:18 fdobridge: <S​id> well
06:18 fdobridge: <S​id> if I pain myself by having to walk back and forth half an hour in this weather
06:18 fdobridge: <S​id> I can grab logs in ~1.5h
06:19 fdobridge: <n​anokatze> should you? probably no
06:20 fdobridge: <S​id> I was planning on skipping lunch, but if I do the walk I'll be able to have lunch too
06:20 fdobridge: <S​id> downside being, 30 minutes of walking in 39C, 36% humidity
06:20 fdobridge: <S​id> 15 mins back, 15 forth
06:21 fdobridge: <S​id> ..probably best to not skip lunch
07:59 fdobridge: <S​id> https://cdn.discordapp.com/attachments/1034184951790305330/1232964107301425162/KILLERINSTINCTX64_R_d3d11.log?ex=662b5ece&is=662a0d4e&hm=ff3b1765de52ca90293bd07fb7584aebfd197778e890c3e32cc6febda54271d0&
07:59 fdobridge: <S​id> https://cdn.discordapp.com/attachments/1034184951790305330/1232964107645489152/KILLERINSTINCTX64_R_dxgi.log?ex=662b5ece&is=662a0d4e&hm=5c6c942086336a7388b175554fbd5836a289d446776912be570c4ba976ad4617&
07:59 fdobridge: <S​id> dxvk logs with log_level debug
08:00 fdobridge: <S​id> ```
08:00 fdobridge: <S​id> info: Queue families:
08:00 fdobridge: <S​id> info: Graphics : 0
08:00 fdobridge: <S​id> info: Transfer : 2
08:00 fdobridge: <S​id> info: Sparse : 0
08:00 fdobridge: <S​id> ```
08:02 fdobridge: <S​id> here's the queues I set up: https://paste.sidonthe.net/pasta/bear-bee-eel
08:03 fdobridge: <S​id> :thonk:
08:58 fdobridge: <n​anokatze> info: Queue families:
08:58 fdobridge: <n​anokatze> info: Graphics : 0
08:58 fdobridge: <n​anokatze> info: Transfer : 2
09:00 fdobridge: <S​id> yes but look
09:00 fdobridge: <S​id> I 100% have graphics queues
09:00 fdobridge: <n​anokatze> yes I'm trying to think why it finds things that way
09:00 fdobridge: <S​id> also all apps that use vulkan natively run fine
09:01 fdobridge: <S​id> vkcube, vkgears, gzdoom, raze, vkquake tested so far
09:02 fdobridge: <n​anokatze> it feels like potentially a winevulkan issue
09:02 fdobridge: <n​anokatze> though idk why it would be
09:03 fdobridge: <n​anokatze> oh wait
09:03 fdobridge: <n​anokatze> ignore this
09:03 fdobridge: <n​anokatze> it finds the queues correctly
09:05 fdobridge: <S​id> oh?
09:11 fdobridge: <n​anokatze> where are your nvk changes
09:12 fdobridge: <n​anokatze> idk what raze is but I suspect that neither of those use anything other than 1 queue in 0th queue family
09:12 fdobridge: <n​anokatze> it might be that your changes are incomplete and CreateDevice fails when the app tries to create queues in any other queue families
09:12 fdobridge: <n​anokatze> or more than 1 queue in a queue family
09:12 fdobridge: <S​id> in a patch file on my drive :3
09:12 fdobridge: <n​anokatze> ok then post it
09:13 fdobridge: <S​id> build engine source port (?) that uses gzdoom tech
09:13 fdobridge: <S​id> can't post it right now
09:38 jfalempe: Hey, I'm trying to make drm_panic work on nouveau. I've the simple case (when running fbcon) working. But when I'm under the gnome desktop, tiling is activated, it makes the output a bit messy.
09:39 jfalempe: the plane->state->fb->format->block_h / block_w are both 0, so I don't know where to find the tiling info.
09:40 jfalempe: or if there is a way to disable tiling, that would be even easier.
10:33 fdobridge: <S​id> okay, time to do this one step at a time
10:57 fdobridge: <S​id> right, bumping up the existing queue count doesn't cause anything funny to happen
10:57 fdobridge: <S​id> time to enable transfer queues
10:58 fdobridge: <n​anokatze> dxvk isn't going to use more than one queue of a queue family
11:01 fdobridge: <S​id> yeah I'm just going one step at a time to see what happens
11:01 fdobridge: <S​id> and to identify what I'm doing wrong
11:01 fdobridge: <S​id> because I'm not very experienced with writing driver code c:
11:05 fdobridge: <S​id> turned on transfer queues, boom
11:07 fdobridge: <a​huillet> boom good, or boom not good?
11:08 fdobridge: <S​id> not good 😅
11:08 fdobridge: <S​id> game refuses to launch now
11:08 fdobridge: <S​id> also the same if I disable the transfer queues and enable compute queues, hmm
11:08 fdobridge: <S​id> meaning I'm likely missing something in nvk_queue.c
11:17 fdobridge: <S​id> hmm, I see
11:22 fdobridge: <S​id> actually, let's start with compute queue...
11:22 fdobridge: <S​id> since it's only one bit less than the graphics queue
11:26 fdobridge: <n​anokatze> dxvk doesn't use any compute-only queues
11:26 fdobridge: <n​anokatze> but it does use transfer-only queue if that's available
11:26 fdobridge: <n​anokatze> so I'm fairly confident vkCreateDevice just fails when the app actually requests the queues
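A rough sketch of why a translation layer can trip over families that native apps never touch: it looks for a dedicated transfer family and then asks vkCreateDevice for a queue from it. The code below is not dxvk's implementation. `struct family`, `find_family` and `pick_transfer_family` are hypothetical names, and the mask/match selection is only one plausible way to do it.

```c
#include <stdint.h>

/* Same values as VkQueueFlagBits, redefined to avoid needing vulkan.h. */
enum {
   QUEUE_GRAPHICS_BIT = 0x1,
   QUEUE_COMPUTE_BIT  = 0x2,
   QUEUE_TRANSFER_BIT = 0x4,
};

/* Hypothetical stand-in for VkQueueFamilyProperties. */
struct family {
   uint32_t flags;
   uint32_t queue_count;
};

/* Return the first family whose flags, restricted to `mask`, equal `want`,
 * or -1 if there is none. */
static int
find_family(const struct family *families, int count,
            uint32_t mask, uint32_t want)
{
   for (int i = 0; i < count; i++) {
      if (families[i].queue_count > 0 &&
          (families[i].flags & mask) == want)
         return i;
   }
   return -1;
}

/* Prefer a transfer-only family; fall back to the graphics family. */
static int
pick_transfer_family(const struct family *families, int count,
                     int graphics_family)
{
   const uint32_t mask = QUEUE_GRAPHICS_BIT | QUEUE_COMPUTE_BIT |
                         QUEUE_TRANSFER_BIT;
   int idx = find_family(families, count, mask, QUEUE_TRANSFER_BIT);

   return idx >= 0 ? idx : graphics_family;
}
```

With only family 0 exposed the fallback hides the problem. Once a transfer-only family shows up, the app requests a queue from it, so the driver has to be able to create one.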
11:26 fdobridge: <S​id> yeah
11:27 fdobridge: <S​id> goal is to figure out *why*
11:27 fdobridge: <S​id> so I can give it what it wants
11:39 fdobridge: <n​anokatze> https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/nouveau/vulkan/nvk_device.c#L126
11:39 fdobridge: <n​anokatze> did you make any changes here
11:47 fdobridge: <k​arolherbst🐧🦀> @gfxstrand apparently we'll get bitten by meson's lack of support for running the build scripts of rust crates, and we'll probably have to make sure that `--cfg=no_literal_byte_character` is passed for anybody using newer proc-macro2
11:47 fdobridge: <k​arolherbst🐧🦀> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28923/
11:47 fdobridge: <k​arolherbst🐧🦀> ahh
11:47 fdobridge: <k​arolherbst🐧🦀> somebody filed an MR already
12:02 fdobridge: <g​fxstrand> cool
12:17 fdobridge: <k​arolherbst🐧🦀> I still kinda hoped we'd have a better solution here :/
14:11 fdobridge: <r​inlovesyou> Can cargo not be integrated here so these dependency issues don't happen?
14:12 fdobridge: <k​arolherbst🐧🦀> in theory it can, it's just a different kind of mess
14:13 fdobridge: <r​inlovesyou> I think the one time i tried integrating rust with another build system i just had that build system invoke cargo directly haha
14:14 fdobridge: <r​inlovesyou> A lot less pain with dependencies, and if we turn mesa into a cargo workspace i think rust analyzer will also be happy
14:14 fdobridge: <k​arolherbst🐧🦀> yeah.. but we need to do something with the built library on the meson side
14:14 fdobridge: <k​arolherbst🐧🦀> and rust-analyzer is also happy with meson
14:14 fdobridge: <r​inlovesyou> Oh, i haven't gotten it to work in Mesa at all haha
14:15 fdobridge: <m​ohamexiety> rust-analyzer works fine on my end minus a little quirk iirc
14:15 fdobridge: <r​inlovesyou> Anything I need to do there?
14:15 fdobridge: <k​arolherbst🐧🦀> yeah, point it to the right file
14:15 fdobridge: <m​ohamexiety> you have to point it to a `.toml` file in the build directory
14:15 fdobridge: <k​arolherbst🐧🦀> `builddir/rust-project.json`
14:15 fdobridge: <r​inlovesyou> Ahh
14:15 fdobridge: <r​inlovesyou> Thank you
14:15 fdobridge: <k​arolherbst🐧🦀> `rust-analyzer.linkedProjects`
14:15 fdobridge: <k​arolherbst🐧🦀> needs to point to that file
14:16 fdobridge: <m​ohamexiety> oh `.json`, not `.toml`. but yeah that's the file
14:16 fdobridge: <r​inlovesyou> I think we can specify this in another file that it will look for so this doesn't have to be manually set
14:17 fdobridge: <k​arolherbst🐧🦀> yeah.. I think the meson extension can do this sort of thing..
14:17 fdobridge: <k​arolherbst🐧🦀> not quite sure
14:17 fdobridge: <r​inlovesyou> I'm not all that familiar with meson, i would've just created that file myself lol
14:21 fdobridge: <k​arolherbst🐧🦀> I meant setting up the vscode settings so that file is detected automatically
14:21 fdobridge: <k​arolherbst🐧🦀> meson already generates the file
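For reference, the manual setup discussed above would look roughly like this in a VS Code workspace `.vscode/settings.json`. It assumes the meson build directory is literally named `builddir` and sits at the workspace root; adjust the path to match the actual build directory.

```json
{
    "rust-analyzer.linkedProjects": [
        "builddir/rust-project.json"
    ]
}
```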
14:30 fdobridge: <r​inlovesyou> Ohh
14:33 fdobridge: <k​arolherbst🐧🦀> yeah... I asked Dylan to add support for it to meson when I started rusticl like 2 years ago and it works pretty nicely since then. Just sometimes rust-analyzer doesn't rescan the project or something, but uhh.. overall it's really useful. I just wished the meson extension would register that file automatically
15:30 fdobridge: <S​id> I think I see something funny in our queue init code
15:31 fdobridge: <S​id> ```
15:31 fdobridge: <S​id> /* We rely on compute shaders for queries */
15:31 fdobridge: <S​id> if (queue_family->queue_flags & VK_QUEUE_GRAPHICS_BIT)
15:31 fdobridge: <S​id> queue_flags |= VK_QUEUE_COMPUTE_BIT;
15:31 fdobridge: <S​id>
15:31 fdobridge: <S​id> /* We currently rely on 3D engine MMEs for indirect dispatch */
15:31 fdobridge: <S​id> if (queue_family->queue_flags & VK_QUEUE_COMPUTE_BIT)
15:31 fdobridge: <S​id> queue_flags |= VK_QUEUE_GRAPHICS_BIT;
15:31 fdobridge: <S​id> ```
15:31 fdobridge: <S​id> if graphics bit is set, we're adding compute bit as well
15:31 fdobridge: <S​id> which is fine
15:31 fdobridge: <S​id> but if compute bit is set, we're adding graphics bit?
15:31 fdobridge: <S​id> that doesn't seem right, it implies compute-only queues are not a thing
15:32 fdobridge: <S​id> can someone confirm I'm understanding this correctly?
15:37 fdobridge: <a​huillet> I assume the implication is that 3D engine (=graphics!) MME is needed for compute indirect dispatch
15:37 fdobridge: <a​huillet> and that... therefore... neither gfx-only nor compute-only queues are a thing on nvk
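The effect of the two checks Sid pasted can be seen by pulling them into a small standalone function. `required_engines` is a hypothetical name used only for this illustration, and the bit values are copied from `VkQueueFlagBits` so it builds without `vulkan.h`.

```c
#include <stdint.h>

/* Same values as VkQueueFlagBits, redefined to avoid needing vulkan.h. */
enum {
   QUEUE_GRAPHICS_BIT = 0x1,
   QUEUE_COMPUTE_BIT  = 0x2,
   QUEUE_TRANSFER_BIT = 0x4,
};

/* Mirrors the two checks from the pasted snippet: both tests look at the
 * flags the family advertises, and each one pulls in the other engine. */
static uint32_t
required_engines(uint32_t family_flags)
{
   uint32_t queue_flags = family_flags;

   /* Queries rely on compute shaders. */
   if (family_flags & QUEUE_GRAPHICS_BIT)
      queue_flags |= QUEUE_COMPUTE_BIT;

   /* Indirect dispatch relies on the 3D engine's MME. */
   if (family_flags & QUEUE_COMPUTE_BIT)
      queue_flags |= QUEUE_GRAPHICS_BIT;

   return queue_flags;
}
```

So a family advertised as compute-only still ends up initializing both engines, while a transfer-only family is left untouched by these two checks.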
15:40 fdobridge: <S​id> hmm, sounds like this got a whole lot more complicated :P
15:42 fdobridge: <S​id> I *could* try removing this to see what happens
15:42 fdobridge: <a​huillet> boom! dead GPU
15:42 fdobridge: <a​huillet> (that's the bad kind of boom)
15:43 fdobridge: <S​id> no gfx-only queues makes sense, since just about every report on gpuinfo says gfx-only queues don't exist
15:43 fdobridge: <S​id> but compute-only, hm
16:12 Lyude: Taking another look at the low memory runtime suspend/resume error. Going to try to understand the radix3 table a bit better
16:51 Lyude: feels like the more I understand the stranger this becomes. airlied - so I realized the only thing we're using nvif_gsp_mem_ctor for is actually storing the bus addresses of each page of the firmware
16:51 Lyude: [ 8.949288] Lyude:nvkm_gsp_radix3_sg:1973: (bufsize) == 77824 ← so I checked the size of that and it really doesn't seem that large
16:53 Lyude: so it might not be that bad of an idea to leave around those allocations maybe, so long as we just don't keep the actual sg allocation holding the suspend/resume data. But then that doesn't really explain how it's failing in the first place :s
16:54 Lyude: unless bufsize is pages but I'm fairly sure it's just a page-aligned size in bytes
17:03 fdobridge: <r​inlovesyou> how does one have the balls to even work on this when some mistakes may lead to a bricked gpu :Sweats:
17:03 fdobridge: <S​id> (he's joking, mostly)
17:05 fdobridge: <m​tijanic> It is really, really hard to brick a GPU like this, even if you try.
17:06 fdobridge: <S​id> yeah
17:07 fdobridge: <m​tijanic> And if you manage to do it by accident with userspace changes, send us a security bug report and we can just give you a new one as bounty :)
17:07 fdobridge: <S​id> because all the dangerous things you could tinker with are locked behind the firmware
17:07 fdobridge: <S​id> new objective acquired
17:07 fdobridge: <S​id> :myy_TinyGiggle:
17:17 fdobridge: <r​edsheep> Yeah if I thought I was playing Russian roulette with my 4090 I wouldn't be testing nouveau. It's not entirely impossible to kill hardware but I don't expect userspace driver bugs to almost ever manage that.
17:19 fdobridge: <s​nektron> If you flash a faulty firmware
17:19 fdobridge: <S​id> that's not a userspace change though
17:21 fdobridge: <r​edsheep> Hardware is way more resilient than what the average joe would expect
18:22 fdobridge: <a​irlied> Mostly brick hw from excessive card swaps and unintended hot unplugs, and usually the motherboard goes first
18:22 fdobridge: <a​huillet> I break the retention clip on PCIe slot on purpose right when I get the mobo now
18:23 fdobridge: <a​huillet> the one motherboard I killed swapping GPUs was when I dropped the super-heavy board and it destroyed an inductor tied to the GbE controller
18:23 fdobridge: <a​irlied> Good idea, I should do that actually
18:23 fdobridge: <a​huillet> motherboard still works, but no Ethernet.
18:23 fdobridge: <m​ohamexiety> that clip is really bad these days with how big GPUs are. I unironically can't actually use the clip the "intended way". always need to get some exterior utensil to plop it out
18:24 fdobridge: <a​irlied> Lols I have a mb in same condition, USB ethernet ftw
18:24 fdobridge: <m​ohamexiety> (usually a wooden spoon in my case 🐸)
18:24 fdobridge: <a​irlied> I only bricked one laptop to the state where HP had to visit my house
18:25 fdobridge: <a​irlied> And that was a kernel bug which bricked the ethernet permanently
18:26 fdobridge: <m​ohamexiety> oof...
18:30 fdobridge: <r​edsheep> One of my favorite features of my high end board is that there's a button that's more accessible that depresses the clip
18:30 fdobridge: <r​edsheep> No need to mess with shoving a ruler between my cooler and card anymore
18:30 fdobridge: <m​ohamexiety> yeah that one is becoming more common thankfully. mine doesn't but it's promising to see at least
18:31 fdobridge: <m​ohamexiety> toolless M.2 too ❤️
18:45 Lyude: airlied, skeggsb: jfyi thanks to sima I think I confirmed that 77kb is definitely enough to fail in lowmem situations so the solution for this I think is going to be just keeping around allocations for the radix3 table, so I think I can write a patch once I get back from the pharmacy :)
18:58 airlied: Lyude: I'd like to see us just use the other allocator for the 77kb level, since the reason there is a page table there is for exactly this reason
19:10 Lyude: airlied: hm? I thought the page table was for mapping the sg allocation that we do for the suspend/resume data
19:10 Lyude: erm, mapping it on the GPU's end i mean
19:11 airlied: Lyude: the reason it's a multi-level page table is so you can avoid needing contiguous mem allocations for the levels
19:12 Lyude: Interesting
19:13 Lyude: ooook, I understand now
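airlied's point about the multi-level table can be made concrete with a small size calculation. This is a sketch under assumptions not stated in the log: 4 KiB pages, 8-byte entries, three levels, and each level holding one entry per page of the level below it. `radix3_level_sizes` and `align_up` are hypothetical names, not the nouveau functions.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ull   /* assumed page size */

/* Round v up to the next multiple of a. */
static uint64_t
align_up(uint64_t v, uint64_t a)
{
   return (v + a - 1) / a * a;
}

/* Compute the page-aligned size in bytes of each of the three levels.
 * sizes[2] is the leaf level, with one 8-byte entry per page of data;
 * sizes[1] describes the pages of sizes[2]; sizes[0] is the root. */
static void
radix3_level_sizes(uint64_t data_size, uint64_t sizes[3])
{
   uint64_t size = data_size;

   for (int i = 2; i >= 0; i--) {
      uint64_t entries = align_up(size, SKETCH_PAGE_SIZE) / SKETCH_PAGE_SIZE;

      sizes[i] = align_up(entries * sizeof(uint64_t), SKETCH_PAGE_SIZE);
      size = sizes[i];
   }
}
```

Under these assumptions a leaf level of 77824 bytes is 19 pages, and the level above only needs 19 entries to point at them one page at a time. Because every page of a level is referenced individually by the level above, those 19 pages do not have to be physically contiguous, which is what makes a non-contiguous allocator usable for it.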
19:38 Lyude: ok - I think I might be able to dig further into this then when i get back. I think I've got a much better idea of what to do now
21:22 fdobridge: <g​fxstrand> Yes, that
21:23 fdobridge: <S​id> how would I go about changing that?
21:24 fdobridge: <S​id> or will that be a fairly involved process
21:31 fdobridge: <g​fxstrand> https://www.collabora.com/news-and-blog/blog/2024/04/25/re-converging-control-flow-on-nvidia-gpus/
21:32 fdobridge: <g​fxstrand> We don't. Not easily. I think there's probably a way but I need to do more research. We can set the dispatch size with a DMA but we'll need a different strategy for queries. IDK how the blob does them.
21:33 fdobridge: <S​id> hmmm, so it did get a whole lot more complicated 😅
21:34 fdobridge: <m​ohamexiety> saw it a short while ago. very nicely written and interesting!
21:46 fdobridge: <S​id> well, guess I'll have to look at something else to do then
21:46 fdobridge: <S​id> unless @ahuillet can help with that
22:00 fdobridge: <s​nektron> `We get a third, different behavior if we move the subgroupMin()</code< into the loop: ` there's a typo here (edited)
23:21 Lyude: hooray i actually think i wrote code using sgt that i fully understand now :)
23:34 Lyude: uhoh. so I'm definitely making progress but something weird is happening
23:36 Lyude: I -think- I managed to get the GSP booted with the non-contiguous page table allocation finally, but it seems like now we're failing on nvkm_gsp_fwsec_frts(). And if i understand correctly it's failing because nvkm_gsp_fwsec is no longer reading the vbios correctly - which I assume probably means something is still mapped wrong somewhere?
23:41 Lyude: https://paste.centos.org/view/9a65eac7 with the patch being https://gitlab.freedesktop.org/lyudess/linux/-/commit/0d9be02ebc6372252b0d69db57836209f8a8260f (airlied, skeggsb)
23:45 Lyude: ...OH
23:45 Lyude: ok it's fixed and it works :D
23:45 Lyude: I did not realize for_each_sgtable_dma_page() did not increment the page offset
23:46 Lyude: amazing ♥, time to run this on my laptop then and see if the problem goes away