00:22fdobridge: <Sid> trying to write a patch to expose more queues, but I'm unsure how to disable presentation on the other queue families
00:24fdobridge: <Sid> right now I've got a fair few queues exposed but present support is enabled on all queue families, leading to both dxvk and vkd3d failing to create their respective devices
00:24fdobridge: <Sid> though vkcube seems to run fine
00:26fdobridge: <Sid> GZDoom's vulkan renderer also seems to be running fine
01:46fastrabbit: The thing is being as in range, that's very important constraint or property of function, it's actually a dimension, the supercompute engine can be expressed as derivative, that has three variables hence terms, those are offset , index, value and maximum value, first three are in relational dependence they have algebraic relation, they can be replaced with values and with any of the terms fixed or replaced with numbers you
01:46fastrabbit: can derive others, so if the base operation that forms an outcome is addition you have some terms combined or subtracted inside a dimension which is indicated as time. So after the assignment you can subtract or add those elements depending what is your function involving those terms. There's only one rule for that function as term value is smaller than offset and index added together with each other, so what you fill into
01:46fastrabbit: couple of selection items as possible answers which are arranged as sums of operands where the index is referred passes through between dependencies, and you use linear algebra and relations to derive a transition from sum or twice the sum and alike.
05:07OftenTimeConsuming: How do you get the backtrace dumped out to fbcon? That just gets scrolled off the screen and fbcon doesn't support scroll anymore. All I got is; https://termbin.com/agvw
05:51fdobridge: <gfxstrand> Uh... Not sure. I'll have to look at it. I'll be at my computer in 45 min or so
05:54fdobridge: <Sid> I tried to look at how radv and anv do it but couldn't figure it out, though I didn't dig too deep either
05:58fdobridge: <gfxstrand> Do they only present on some queues?
05:59fdobridge: <gfxstrand> I'm sure there's a hook somewhere. I just don't remember where.
05:59fdobridge: <nanokatze> there's no hook, wsi checks if queue supports graphics compute or transfer (any of these)
05:59fdobridge: <Sid> yeah, nv only presents on some queues
06:00fdobridge: <nanokatze> does it matter what nv prop does?
06:00fdobridge: <Sid> or rather, on some queue families
06:00fdobridge: <nanokatze> mesa vulkan wsi is happy with transfer flag being there
06:00fdobridge: <Sid> I set the queue families matching nv prop, fwiw
06:00fdobridge: <gfxstrand> That's probably fine for us. I don't think we fundamentally really care as long as we can do a copy
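A minimal sketch of the check nanokatze describes, where a queue family counts as presentable if it has any of the graphics, compute, or transfer bits (that is, it can do some kind of copy). The helper names are hypothetical and this is not the common WSI code itself; the flag constants are local stand-ins whose values match `VkQueueFlagBits`, so the block builds without Vulkan headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for VkQueueFlagBits (same numeric values). */
enum {
    QUEUE_GRAPHICS_BIT     = 0x1,
    QUEUE_COMPUTE_BIT      = 0x2,
    QUEUE_TRANSFER_BIT     = 0x4,
    QUEUE_VIDEO_DECODE_BIT = 0x20,
};

/* Hypothetical helper: presentable if the family has any copy-capable bit. */
static bool family_can_present(uint32_t queue_flags)
{
    const uint32_t copy_capable =
        QUEUE_GRAPHICS_BIT | QUEUE_COMPUTE_BIT | QUEUE_TRANSFER_BIT;
    return (queue_flags & copy_capable) != 0;
}

/* Hypothetical helper: bit i of the result is set if family i can present. */
static uint64_t presentable_family_mask(const uint32_t *family_flags,
                                        uint32_t count)
{
    assert(count <= 64); /* the mask only has room for 64 families */
    uint64_t mask = 0;
    for (uint32_t i = 0; i < count; i++) {
        if (family_can_present(family_flags[i]))
            mask |= UINT64_C(1) << i;
    }
    return mask;
}
```

Under this rule a transfer-only family is presentable, while a family carrying only a video decode bit is not.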
06:03fdobridge: <Sid> btw I'm exposing all the queues that are on the different GSP generations, except the queue families with encode/decode bits set
06:09fdobridge: <Sid> also, fwiw, with all queues having present enabled, only dxvk and vkd3d are failing
06:09fdobridge: <Sid> apps that use vulkan natively run just fine
06:11fdobridge: <nanokatze> radv advertises present support on transfer_bit-only queue fwiw
06:11fdobridge: <nanokatze> well, radv itself doesn't, it's the common wsi code that's responsible for that
06:12fdobridge: <Sid> here's the queue families on nv prop: https://paste.sidonthe.net/pasta/otter-turtle-mole
06:13fdobridge: <Sid> I enabled index 0, 1, and 2 on nvk
06:13fdobridge: <Sid> with the same bits set and the same queue counts
06:14fdobridge: <Sid> only difference between the two is stuff tied to the extensions missing on nvk, and that present is enabled on all three
06:15fdobridge: <nanokatze> how is dxvk failing
06:16fdobridge: <Sid> Direct3D 11 device creation failed
06:16fdobridge: <nanokatze> present support on transfer-only should have zero impact on dxvk because it pokes present on graphics queue, strictly
06:16fdobridge: <Sid> I'm not at my machine for another 5h sadly (classes)
06:17fdobridge: <Sid> but I do remember the more important stuff
06:17fdobridge: <Sid> 5? 4h
06:18fdobridge: <nanokatze> checking logs could be helpful so I guess we'll have to wait those 4 hours
06:18fdobridge: <Sid> well
06:18fdobridge: <Sid> if I pain myself by having to walk back and forth half an hour in this weather
06:18fdobridge: <Sid> I can grab logs in ~1.5h
06:19fdobridge: <nanokatze> should you? probably no
06:20fdobridge: <Sid> I was planning on skipping lunch, but if I do the walk I'll be able to have lunch too
06:20fdobridge: <Sid> downside being, 30 minutes of walking in 39C, 36% humidity
06:20fdobridge: <Sid> 15 mins back, 15 forth
06:21fdobridge: <Sid> ..probably best to not skip lunch
07:59fdobridge: <Sid> https://cdn.discordapp.com/attachments/1034184951790305330/1232964107301425162/KILLERINSTINCTX64_R_d3d11.log?ex=662b5ece&is=662a0d4e&hm=ff3b1765de52ca90293bd07fb7584aebfd197778e890c3e32cc6febda54271d0&
07:59fdobridge: <Sid> https://cdn.discordapp.com/attachments/1034184951790305330/1232964107645489152/KILLERINSTINCTX64_R_dxgi.log?ex=662b5ece&is=662a0d4e&hm=5c6c942086336a7388b175554fbd5836a289d446776912be570c4ba976ad4617&
07:59fdobridge: <Sid> dxvk logs with log_level debug
08:00fdobridge: <Sid> ```
08:00fdobridge: <Sid> info: Queue families:
08:00fdobridge: <Sid> info: Graphics : 0
08:00fdobridge: <Sid> info: Transfer : 2
08:00fdobridge: <Sid> info: Sparse : 0
08:00fdobridge: <Sid> ```
08:02fdobridge: <Sid> here's the queues I set up: https://paste.sidonthe.net/pasta/bear-bee-eel
08:03fdobridge: <Sid> :thonk:
08:58fdobridge: <nanokatze> info: Queue families:
08:58fdobridge: <nanokatze> info: Graphics : 0
08:58fdobridge: <nanokatze> info: Transfer : 2
09:00fdobridge: <Sid> yes but look
09:00fdobridge: <Sid> I 100% have graphics queues
09:00fdobridge: <nanokatze> yes I'm trying to think why it finds things that way
09:00fdobridge: <Sid> also all apps that use vulkan natively run fine
09:01fdobridge: <Sid> vkcube, vkgears, gzdoom, raze, vkquake tested so far
09:02fdobridge: <nanokatze> it feels like potentially a winevulkan issue
09:02fdobridge: <nanokatze> though idk why it would be
09:03fdobridge: <nanokatze> oh wait
09:03fdobridge: <nanokatze> ignore this
09:03fdobridge: <nanokatze> it finds the queues correctly
09:05fdobridge: <Sid> oh?
09:11fdobridge: <nanokatze> where are your nvk changes
09:12fdobridge: <nanokatze> idk what raze is but I suspect that neither of those use anything other than 1 queue in 0th queue family
09:12fdobridge: <nanokatze> it might be that your changes are incomplete and CreateDevice fails when the app tries to create queues in any other queue families
09:12fdobridge: <nanokatze> or more than 1 queue in a queue family
09:12fdobridge: <Sid> in a patch file on my drive :3
09:12fdobridge: <nanokatze> ok then post it
09:13fdobridge: <Sid> build engine source port (?) that uses gzdoom tech
09:13fdobridge: <Sid> can't post it right now
09:38jfalempe: Hey, I'm trying to make drm_panic work on nouveau. I've the simple case (when running fbcon) working. But when I'm under the gnome desktop, tiling is activated, it makes the output a bit messy.
09:39jfalempe: the plane->state->fb->format->block_h / block_w are both 0, so I don't know where to find the tiling info.
09:40jfalempe: or if there is a way to disable tiling, that would be even easier.
10:33fdobridge: <Sid> okay, time to do this one step at a time
10:57fdobridge: <Sid> right, bumping up the existing queue count doesn't cause anything funny to happen
10:57fdobridge: <Sid> time to enable transfer queues
10:58fdobridge: <nanokatze> dxvk isn't going to use more than one queue of a queue family
11:01fdobridge: <Sid> yeah I'm just going one step at a time to see what happens
11:01fdobridge: <Sid> and to identify what I'm doing wrong
11:01fdobridge: <Sid> because I'm not very experienced with writing driver code c:
11:05fdobridge: <Sid> turned on transfer queues, boom
11:07fdobridge: <ahuillet> boom good, or boom not good?
11:08fdobridge: <Sid> not good 😅
11:08fdobridge: <Sid> game refuses to launch now
11:08fdobridge: <Sid> also the same if I disable the transfer queues and enable compute queues, hmm
11:08fdobridge: <Sid> meaning I'm likely missing something in nvk_queue.c
11:17fdobridge: <Sid> hmm, I see
11:22fdobridge: <Sid> actually, let's start with compute queue...
11:22fdobridge: <Sid> since it's only one bit less than the graphics queue
11:26fdobridge: <nanokatze> dxvk doesn't use any compute-only queues
11:26fdobridge: <nanokatze> but it does use transfer-only queue if that's available
11:26fdobridge: <nanokatze> so I'm fairly confident vkCreateDevice just fails when the app actually requests the queues
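To illustrate why apps that only touch one queue in family 0 keep working while one that also wants a transfer-only queue can fail at device creation, here is a sketch of a queue family picker. It is a hypothetical heuristic, not dxvk's actual code: it takes the first family with the graphics bit, and prefers a family that has the transfer bit but neither graphics nor compute for uploads. An app using something like this would then request queues from both families in `vkCreateDevice`, so the driver has to handle both.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for VkQueueFlagBits (same numeric values). */
enum { Q_GRAPHICS = 0x1, Q_COMPUTE = 0x2, Q_TRANSFER = 0x4 };

#define NO_FAMILY UINT32_MAX

struct picked_families {
    uint32_t graphics; /* index of the family used for rendering */
    uint32_t transfer; /* index of the family used for uploads */
};

/* Hypothetical heuristic, for illustration only. */
static struct picked_families pick_families(const uint32_t *flags,
                                            uint32_t count)
{
    assert(flags != NULL || count == 0);
    struct picked_families p = { NO_FAMILY, NO_FAMILY };

    for (uint32_t i = 0; i < count; i++) {
        if (p.graphics == NO_FAMILY && (flags[i] & Q_GRAPHICS))
            p.graphics = i;

        /* Transfer-only: transfer bit set, graphics and compute clear. */
        if (p.transfer == NO_FAMILY && (flags[i] & Q_TRANSFER) &&
            !(flags[i] & (Q_GRAPHICS | Q_COMPUTE)))
            p.transfer = i;
    }

    /* No dedicated transfer family: fall back to the graphics family. */
    if (p.transfer == NO_FAMILY)
        p.transfer = p.graphics;

    return p;
}
```

With a single graphics family both indices are the same and only one family is ever requested, which is the case the native Vulkan apps tested so far appear to exercise.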
11:26fdobridge: <Sid> yeah
11:27fdobridge: <Sid> goal is to figure out *why*
11:27fdobridge: <Sid> so I can give it what it wants
11:39fdobridge: <nanokatze> https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/nouveau/vulkan/nvk_device.c#L126
11:39fdobridge: <nanokatze> did you make any changes here
11:47fdobridge: <karolherbst🐧🦀> @gfxstrand apparently we'll get bitten by meson's lack of support for rust crates (it doesn't run build scripts) and we'll probably have to make sure that `--cfg=no_literal_byte_character` is passed for anybody using newer proc-macro2
11:47fdobridge: <karolherbst🐧🦀> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28923/
11:47fdobridge: <karolherbst🐧🦀> ahh
11:47fdobridge: <karolherbst🐧🦀> somebody filed an MR already
12:02fdobridge: <gfxstrand> cool
12:17fdobridge: <karolherbst🐧🦀> I still kinda hoped we'd have a better solution here :/
14:11fdobridge: <rinlovesyou> Can cargo not be integrated here so these dependency issues don't happen?
14:12fdobridge: <karolherbst🐧🦀> in theory it can, it's just a different kind of mess
14:13fdobridge: <rinlovesyou> I think the one time i tried integrating rust with another build system i just had that build system invoke cargo directly haha
14:14fdobridge: <rinlovesyou> A lot less pain with dependencies, and if we turn mesa into a cargo workspace i think rust analyzer will also be happy
14:14fdobridge: <karolherbst🐧🦀> yeah.. but we need to do something with the built library on the meson side
14:14fdobridge: <karolherbst🐧🦀> and rust-analyzer is also happy with meson
14:14fdobridge: <rinlovesyou> Oh, i haven't gotten it to work in Mesa at all haha
14:15fdobridge: <mohamexiety> rust-analyzer works fine on my end minus a little quirk iirc
14:15fdobridge: <rinlovesyou> Anything I need to do there?
14:15fdobridge: <karolherbst🐧🦀> yeah, point it to the right file
14:15fdobridge: <mohamexiety> you have to point it to a `.toml` file in the build directory
14:15fdobridge: <karolherbst🐧🦀> `builddir/rust-project.json`
14:15fdobridge: <rinlovesyou> Ahh
14:15fdobridge: <rinlovesyou> Thank you
14:15fdobridge: <karolherbst🐧🦀> `rust-analyzer.linkedProjects`
14:15fdobridge: <karolherbst🐧🦀> needs to point to that file
14:16fdobridge: <mohamexiety> oh `.json`, not `.toml`. but yeah that's the file
14:16fdobridge: <rinlovesyou> I think we can specify this in another file that it will look for so this doesn't have to be manually set
14:17fdobridge: <karolherbst🐧🦀> yeah.. I think the meson extension can do this sort of thing..
14:17fdobridge: <karolherbst🐧🦀> not quite sure
14:17fdobridge: <rinlovesyou> I'm not all that familiar with meson, i would've just created that file myself lol
14:21fdobridge: <karolherbst🐧🦀> I meant setting up the vscode settings so that file is detected automatically
14:21fdobridge: <karolherbst🐧🦀> meson already generates the file
14:30fdobridge: <rinlovesyou> Ohh
14:33fdobridge: <karolherbst🐧🦀> yeah... I asked Dylan to add support for it to meson when I started rusticl like 2 years ago and it has worked pretty nicely since then. Just sometimes rust-analyzer doesn't rescan the project or something, but uhh.. overall it's really useful. I just wish the meson extension would register that file automatically
15:30fdobridge: <Sid> I think I see something funny in our queue init code
15:31fdobridge: <Sid> ```
15:31fdobridge: <Sid> /* We rely on compute shaders for queries */
15:31fdobridge: <Sid> if (queue_family->queue_flags & VK_QUEUE_GRAPHICS_BIT)
15:31fdobridge: <Sid>    queue_flags |= VK_QUEUE_COMPUTE_BIT;
15:31fdobridge: <Sid>
15:31fdobridge: <Sid> /* We currently rely on 3D engine MMEs for indirect dispatch */
15:31fdobridge: <Sid> if (queue_family->queue_flags & VK_QUEUE_COMPUTE_BIT)
15:31fdobridge: <Sid>    queue_flags |= VK_QUEUE_GRAPHICS_BIT;
15:31fdobridge: <Sid> ```
15:31fdobridge: <Sid> if graphics bit is set, we're adding compute bit as well
15:31fdobridge: <Sid> which is fine
15:31fdobridge: <Sid> but if compute bit is set, we're adding graphics bit?
15:31fdobridge: <Sid> that doesn't seem right, it implies compute-only queues are not a thing
15:32fdobridge: <Sid> can someone confirm I'm understanding this correctly?
15:37fdobridge: <ahuillet> I assume the implication is that 3D engine (=graphics!) MME is needed for compute indirect dispatch
15:37fdobridge: <ahuillet> and that... therefore... neither gfx-only nor compute-only queues are a thing on nvk
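The effect of the two rules Sid quoted can be sketched as a small function. This assumes `queue_flags` starts out as a copy of the family's flags, which the quoted snippet does not show, and the function name is hypothetical. The constants are local stand-ins with the same values as `VkQueueFlagBits`.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for VkQueueFlagBits (same numeric values). */
enum { Q_GRAPHICS = 0x1, Q_COMPUTE = 0x2, Q_TRANSFER = 0x4 };

/* Hypothetical helper mirroring the two quoted rules. */
static uint32_t effective_queue_flags(uint32_t family_flags)
{
    /* Assumption: start from the flags the family advertises. */
    uint32_t queue_flags = family_flags;

    /* Queries rely on compute shaders, so graphics pulls in compute. */
    if (family_flags & Q_GRAPHICS)
        queue_flags |= Q_COMPUTE;

    /* Indirect dispatch relies on 3D engine MMEs, so compute pulls in
     * graphics. */
    if (family_flags & Q_COMPUTE)
        queue_flags |= Q_GRAPHICS;

    return queue_flags;
}
```

Any family that advertises graphics or compute ends up initialized with both, which matches ahuillet's reading; only a transfer-only family passes through unchanged.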
15:40fdobridge: <Sid> hmm, sounds like this got a whole lot more complicated :P
15:42fdobridge: <Sid> I *could* try removing this to see what happens
15:42fdobridge: <ahuillet> boom! dead GPU
15:42fdobridge: <ahuillet> (that's the bad kind of boom)
15:43fdobridge: <Sid> no gfx-only queues makes sense, since just about every report on gpuinfo says gfx-only queues don't exist
15:43fdobridge: <Sid> but compute-only, hm
16:12Lyude: Taking another look at the low memory runtime suspend/resume error. Going to try to understand the radix3 table a bit better
16:51Lyude: feels like the more I understand the stranger this becomes. airlied - so I realized the only thing we're using nvif_gsp_mem_ctor is for actually storing the bus addresses of each page of the firmware
16:51Lyude: [ 8.949288] Lyude:nvkm_gsp_radix3_sg:1973: (bufsize) == 77824 ← so I checked the size of that and it really doesn't seem that large
16:53Lyude: so it might not be that bad of an idea to leave around those allocations maybe, so long as we just don't keep the actual sg allocation holding the suspend/resume data. But then that doesn't really explain how it's failing in the first place :s
16:54Lyude: unless bufsize is pages but I'm fairly sure it's just a page-aligned size in bytes
17:03fdobridge: <rinlovesyou> how does one have the balls to even work on this when some mistakes may lead to a bricked gpu :Sweats:
17:03fdobridge: <Sid> (he's joking, mostly)
17:05fdobridge: <mtijanic> It is really, really hard to brick a GPU like this, even if you try.
17:06fdobridge: <Sid> yeah
17:07fdobridge: <mtijanic> And if you manage to do it by accident with userspace changes, send us a security bug report and we can just give you a new one as bounty :)
17:07fdobridge: <Sid> because all the dangerous things you could tinker with are locked behind the firmware
17:07fdobridge: <Sid> new objective acquired
17:07fdobridge: <Sid> :myy_TinyGiggle:
17:17fdobridge: <redsheep> Yeah if I thought I was playing Russian roulette with my 4090 I wouldn't be testing nouveau. It's not entirely impossible to kill hardware but I don't expect userspace driver bugs to almost ever manage that.
17:19fdobridge: <snektron> If you flash a faulty firmware
17:19fdobridge: <Sid> that's not a userspace change though
17:21fdobridge: <redsheep> Hardware is way more resilient than what the average joe would expect
18:22fdobridge: <airlied> Mostly brick hw from excessive card swaps and unintended hot unplugs, and usually the motherboard goes first
18:22fdobridge: <ahuillet> I break the retention clip on PCIe slot on purpose right when I get the mobo now
18:23fdobridge: <ahuillet> the one motherboard I killed swapping GPUs was when I dropped the super-heavy board and it destroyed an inductor tied to the GbE controller
18:23fdobridge: <airlied> Good idea, I should do that actually
18:23fdobridge: <ahuillet> motherboard still works, but no Ethernet.
18:23fdobridge: <mohamexiety> that clip is really bad these days with how big GPUs are. I unironically can't actually use the clip the "intended way". always need to get some exterior utensil to plop it out
18:24fdobridge: <airlied> Lols I have a mb in same condition, USB ethernet ftw
18:24fdobridge: <mohamexiety> (usually a wooden spoon in my case 🐸)
18:24fdobridge: <airlied> I only bricked one laptop to the state where HP had to visit my house
18:25fdobridge: <airlied> And that was a kernel bug which bricked the ethernet permanently
18:26fdobridge: <mohamexiety> oof...
18:30fdobridge: <redsheep> One of my favorite features of my high end board is that there's a button that's more accessible that depresses the clip
18:30fdobridge: <redsheep> No need to mess with shoving a ruler between my cooler and card anymore
18:30fdobridge: <mohamexiety> yeah that one is becoming more common thankfully. mine doesn't but it's promising to see at least
18:31fdobridge: <mohamexiety> toolless M.2 too ❤️
18:45Lyude: airlied, skeggsb: jfyi thanks to sima I think I confirmed that 77kb is definitely enough to fail in lowmem situations so the solution for this I think is going to be just keeping around allocations for the radix3 table, so I think I can write a patch once I get back from the pharmacy :)
18:58airlied: Lyude: I'd like to see us just use the other allocator for the 77kb level, since the reason there is a page table there is for exactly this reason
19:10Lyude: airlied: hm? I thought the page table was for mapping the sg allocation that we do for the suspend/resume data
19:10Lyude: erm, mapping it on the GPU's end i mean
19:11airlied: Lyude: the reason it's a multi-level page table is so you can avoid needing contiguous mem allocations for the levels
19:12Lyude: Interesting
19:13Lyude: ooook, I understand now
19:38Lyude: ok - I think I might be able to dig further into this then when i get back. I think I've got a much better idea of what to do now
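airlied's point about the multi-level table can be made concrete with a little arithmetic. The sketch below assumes 4 KiB pages, 8-byte entries (one bus address per page), and three levels where each level lists the pages of the level below it. Those parameters and the function names are assumptions for illustration, not taken from the nouveau source.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ  4096u /* assumed GSP page size */
#define ENTRY_SZ 8u    /* assumed size of one page address entry */
#define LEVELS   3

static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

/*
 * Fill level_bytes[] with the page-aligned size of each table level.
 * level_bytes[LEVELS - 1] lists the data pages, level_bytes[0] is the root.
 */
static void table_level_sizes(uint64_t data_bytes,
                              uint64_t level_bytes[LEVELS])
{
    assert(data_bytes > 0);
    uint64_t size = align_up(data_bytes, PAGE_SZ);

    for (int i = LEVELS - 1; i >= 0; i--) {
        uint64_t entries = size / PAGE_SZ; /* one entry per page below */
        level_bytes[i] = align_up(entries * ENTRY_SZ, PAGE_SZ);
        size = level_bytes[i];
    }
}
```

Under these assumptions a lowest level of 77824 bytes is 19 pages and would describe about 38 MiB of data, while the two levels above it fit in a single page each. Since the level above holds the address of every page in the level below, those 19 pages would not need to be physically contiguous.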
21:22fdobridge: <gfxstrand> Yes, that
21:23fdobridge: <Sid> how would I go about changing that?
21:24fdobridge: <Sid> or will that be a fairly involved process
21:31fdobridge: <gfxstrand> https://www.collabora.com/news-and-blog/blog/2024/04/25/re-converging-control-flow-on-nvidia-gpus/
21:32fdobridge: <gfxstrand> We don't. Not easily. I think there's probably a way but I need to do more research. We can set the dispatch size with a DMA but we'll need a different strategy for queries. IDK how the blob does them.
21:33fdobridge: <Sid> hmmm, so it did get a whole lot more complicated 😅
21:34fdobridge: <mohamexiety> saw it a short while ago. very nicely written and interesting!
21:46fdobridge: <Sid> well, guess I'll have to look at something else to do then
21:46fdobridge: <Sid> unless @ahuillet can help with that
22:00fdobridge: <snektron> `We get a third, different behavior if we move the subgroupMin()</code< into the loop: ` there's a typo here (edited)
23:21Lyude: hooray i actually think i wrote code using sgt that i fully understand now :)
23:34Lyude: uhoh. so I'm definitely making progress but something weird is happening
23:36Lyude: I -think- I managed to get the GSP booted with the non-contiguous page table allocation finally, but it seems like now we're failing on nvkm_gsp_fwsec_frts(). And if i understand correctly it's failing because nvkm_gsp_fwsec is no longer reading the vbios correctly - which I assume probably means something is still mapped wrong somewhere?
23:41Lyude: https://paste.centos.org/view/9a65eac7 with the patch being https://gitlab.freedesktop.org/lyudess/linux/-/commit/0d9be02ebc6372252b0d69db57836209f8a8260f (airlied, skeggsb)
23:45Lyude: ...OH
23:45Lyude: ok it's fixed and it works :D
23:45Lyude: I did not realize for_each_sgtable_dma_page() did not increment the page offset
23:46Lyude: amazing ♥, time to run this on my laptop then and see if the problem goes away