00:52airlied: TimurTabi: yes that does seem to get leaked alright
00:52airlied: and the wpr one
02:18fdobridge_: <airlied> uggh did a bios update on my turing laptop and now the gpu won't come out of runpm either
02:24fdobridge_: <gfxstrand> 😭
02:24fdobridge_: <gfxstrand> :xenia_sob:
02:36fdobridge_: <airlied> I'm not sure my current low alcohol intake lifestyle is compatible with looking at ACPI tables
03:14fdobridge_: <gfxstrand> Trying to trim the repro case now. This one doesn't repro 100%
03:26fdobridge_: <gfxstrand> Down to 100k tests...
03:26airlied: TimurTabi: btw sent out a fix finally for that registry rpc size being wrong
03:31fdobridge_: <Sid> we don't have DSC yet, right?
04:01fdobridge_: <airlied> https://patchwork.freedesktop.org/patch/576335/
04:01fdobridge_: <airlied> @gfxstrand care to check if that changes the runpm behaviour for you?
04:05TimurTabi: airlied: have you taken a look at my command-line registry patch? I sent a v2 today.
04:39fdobridge_: <gfxstrand> Down to two tests that appear to have nothing whatsoever to do with each other. 😭
04:45fdobridge_: <airlied> that patch actually probably won't be any use, reproduced the runpm again on turing
04:46airlied: TimurTabi: looks good, I just wanted to land the fix for the current code before adding more on top
04:47TimurTabi: airlied: my patch supersedes yours and will cause a merge conflict, just FYI.
04:49airlied: TimurTabi: yes yours will need to be rebased onto that once it is landing
04:49airlied: or it might end up just as a fixes/next merge
04:52TimurTabi: airlied: so you want my patch to go into drm-next? Np
05:21fdobridge_: <gfxstrand> I think my workgroup barriers are hosed.
05:21fdobridge_: <gfxstrand> I'll look more tomorrow
09:47fdobridge_: <rinlovesyou> looks like it has not made it into 6.7.2, i'll keep an eye out for it though
09:47fdobridge_: <rinlovesyou> Also, what's going on with the IRC..?
09:48fdobridge_: <!DodoNVK (she) 🇱🇹> There's some Estonian person that's doing weird stuff and constantly joining the #ID:1034184951790305330 channel
09:49fdobridge_: <rinlovesyou> what a weirdly specific channel to attack
09:50fdobridge_: <rinlovesyou> Ah yes let me go harass the open source nvidia driver
09:50fdobridge_: <!DodoNVK (she) 🇱🇹> I think that person also joins some other related channels (but I can't confirm that)
09:50fdobridge_: <!DodoNVK (she) 🇱🇹> Like #dri-devel for example
09:52fdobridge_: <rinlovesyou> Either way sad times for me right now, can't test anything when 6.8-rc1 keeps freezing my graphical interface >:V
09:53fdobridge_: <rinlovesyou> That or there's a nasty memory leak somewhere
09:57fdobridge_: <rinlovesyou> https://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg478682.html
09:57fdobridge_: <rinlovesyou> Looks like I'm not alone though :)
10:04fdobridge_: <airlied> Try rc2 already 🙂
10:59fdobridge_: <rinlovesyou> Sadly still present :(
11:00fdobridge_: <rinlovesyou> Too bad the gsp fix didn't make it into 6.7.2, perhaps 6.7.3
12:51fdobridge_: <Sid> @rinlovesyou does the whole system freeze? can you switch to another tty? do you have an iGPU as well?
13:29fdobridge_: <rinlovesyou> whole system seems to freeze. can not switch to a tty. sadly no IGPU, it's a ryzen 9 3900x
13:29fdobridge_: <rinlovesyou> notable that firefox audio continues to play
13:29fdobridge_: <Sid> ...hmm, no igpu
13:29fdobridge_: <rinlovesyou> until it crashes, as described in the mail above
13:29fdobridge_: <rinlovesyou> so the system is at least still going on that level
13:30fdobridge_: <Sid> oh wait you're having it pre 6.8rc2 as well
13:30fdobridge_: <rinlovesyou> yes
13:30fdobridge_: <rinlovesyou> it has been present ever since 6.8rc1
13:30fdobridge_: <rinlovesyou> linus *said* system freeze is supposed to be gone in rc2 but it doesn't seem to be
13:31fdobridge_: <Sid> I thought maybe this patch would help but since it's not iGPU related and has been happening prior as well...
13:31fdobridge_: <Sid> oh so not on 6.7?
13:31fdobridge_: <rinlovesyou> no, it's fine on 6.7, even 6.7.2
13:32fdobridge_: <rinlovesyou> inb4 it's actually a nouveau thing and i'm only catching it in 6.8 because gsp works :clueless:
13:32fdobridge_: <Sid> I see
13:32fdobridge_: <Sid> gsp works on 6.7 as well
13:32fdobridge_: <rinlovesyou> not for turing
13:32fdobridge_: <Sid> it does
13:32fdobridge_: <rinlovesyou> brokey.
13:32fdobridge_: <Sid> you just have to enable it
13:32fdobridge_: <Sid> oh, right, you had those RIPs
13:33fdobridge_: <Sid> WorksOnMyMachine 😅
13:33fdobridge_: <rinlovesyou> lol
13:33fdobridge_: <Sid> has worked since 6.6 (since before gsp support was mainline)
13:33fdobridge_: <rinlovesyou> at the very least not for my card
13:33fdobridge_: <Sid> granted, I had iGPU related issues that only got fixed in 6.7-rc5? 4? but yeah
13:34fdobridge_: <rinlovesyou> gsp only functions correctly for the 2070 super in 6.8
13:34fdobridge_: <Sid> ..yeah, it looks like even within the same generation the hardware is wildly different
13:34fdobridge_: <Sid> 1660Ti here
13:34fdobridge_: <rinlovesyou> typical nvidia moment
13:34fdobridge_: <Sid> I'm actually on nouveau+gsp right now too
13:35fdobridge_: <rinlovesyou> i would be too if 6.8 didn't yeet my system at random
13:35fdobridge_: <rinlovesyou> hopefully soon, following along NVK's development is exciting
13:35fdobridge_: <Sid> you'll get there :>
13:37fdobridge_: <!DodoNVK (she) 🇱🇹> I might have to open a bug report for the emulator repo (because I don't really know what's happening here)
14:33fdobridge_: <!DodoNVK (she) 🇱🇹> `vita3k: ../mesa/src/nouveau/vulkan/nvk_device_memory.c:57: zero_vram: Assertion '(bo->size / 4096) < (1 << 15)' failed.` :cursedgears:
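For reference, the condition in that assertion works out to a size cap of 4096 * 32768 bytes, so any BO of 128 MiB or more trips it. The sketch below only restates the arithmetic of the quoted expression; the macro and function names are hypothetical and this is not the actual Mesa code, and it says nothing about why the limit exists.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants taken from the quoted assertion text; names are hypothetical. */
#define ZERO_PAGE_SIZE 4096u
#define ZERO_MAX_PAGES (1u << 15)

/* Returns true when a BO of this size would satisfy the quoted condition
 * '(bo->size / 4096) < (1 << 15)'. */
static inline bool passes_zero_vram_assert(uint64_t bo_size)
{
    return (bo_size / ZERO_PAGE_SIZE) < ZERO_MAX_PAGES;
}
```

With these constants, a BO of 134217727 bytes passes and one of 134217728 bytes (exactly 128 MiB) does not.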
15:03fdobridge_: <rinlovesyou> It would be nice if i could just use the gsp from 6.8 but I'm not about to Frankenstein myself a kernel
15:26fdobridge_: <!DodoNVK (she) 🇱🇹> Bug report made: https://github.com/Vita3K/Vita3K/issues/3199 (feel free to comment on it)
15:37fdobridge_: <karolherbst🐧🦀> let's see how much ping-pong we are going to play with that one
18:02fdobridge_: <gfxstrand> Well, that turned out to be extremely uninteresting: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27365
18:02fdobridge_: <gfxstrand> I guess it's not going to help cts + wsi
18:02fdobridge_: <gfxstrand> Maybe it'll at least make it so I can run 8 shards? 🤷🏻‍♀️
18:04fdobridge_: <gfxstrand> The fact that this test failed when run after certain 3D tests implies that shared memory lives in the same internal memories as some other sort of cache like index or vertex buffers. It looked very much like index buffer data.
18:04fdobridge_: <gfxstrand> So that's fun.
18:09fdobridge_: <karolherbst🐧🦀> btw, nvidia prefers to use `SV_COMBINED_TID` because it's faster to do the alu in extracting the components than to do three pulls 🥲
18:13fdobridge_: <gfxstrand> Yeah
18:13fdobridge_: <gfxstrand> I'll probably switch to it at some point
18:13fdobridge_: <gfxstrand> Do you have the bit pattern in your magic docs? It looks like x is at bit 0 and y is at bit 16 from what I saw but I'd like to be sure.
18:13fdobridge_: <gfxstrand> And, yeah, PRMT is cheap
18:15fdobridge_: <karolherbst🐧🦀> yeah.. it's x = 10:0, y = 25:16, z = 31:26
18:15fdobridge_: <karolherbst🐧🦀> same sizes as the separate ones btw
18:17fdobridge_: <gfxstrand> Ugh.. That 26 sucks...
18:17fdobridge_: <gfxstrand> I mean, it's not too bad because it's just a shift but still...
18:17fdobridge_: <gfxstrand> Oh, the 25:16 sucks worse
18:17fdobridge_: <karolherbst🐧🦀> why?
18:18fdobridge_: <gfxstrand> If they were aligned to bytes, I could use PRMT
18:18fdobridge_: <gfxstrand> Z is just a shift so that's fine. X is just an AND so that's fine, too.
18:18fdobridge_: <gfxstrand> But Y is a full bitfield extract. 😭
18:18fdobridge_: <karolherbst🐧🦀> just use `BMSK`?
18:19fdobridge_: <gfxstrand> Turing doesn't have the full extract version
18:19fdobridge_: <karolherbst🐧🦀> ehh wait
18:19fdobridge_: <karolherbst🐧🦀> that's just a mask 😄
18:19fdobridge_: <gfxstrand> On Maxwell, I can use BFE
18:19fdobridge_: <karolherbst🐧🦀> right..
18:20fdobridge_: <gfxstrand> But, like, it's 4 ALU vs. 3 so still kinda meh
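The unpacking discussed above can be sketched in plain C, assuming the bit layout quoted in the chat (x = 10:0, y = 25:16, z = 31:26). The struct and function names are hypothetical, and this is host-side illustration rather than what NAK or codegen actually emit.

```c
#include <stdint.h>

/* Hypothetical container for the three unpacked thread-ID components. */
struct tid3 {
    uint32_t x, y, z;
};

/* Unpack a combined thread ID using the layout quoted in the chat:
 * x = bits 10:0, y = bits 25:16, z = bits 31:26. */
static inline struct tid3 unpack_combined_tid(uint32_t v)
{
    struct tid3 t;
    t.x = v & 0x7ffu;          /* bits 10:0: a single AND */
    t.y = (v >> 16) & 0x3ffu;  /* bits 25:16: shift plus AND, i.e. a bitfield extract */
    t.z = v >> 26;             /* bits 31:26: a single shift */
    return t;
}
```

Written this way, x and z cost one operation each and y costs two, which lines up with the "4 ALU vs. 3" remark. Because y starts at bit 16 but is 10 bits wide, it does not fit a byte-granular permute.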
18:21fdobridge_: <karolherbst🐧🦀> maybe they stopped using it on Turing? Dunno.. I know we've added it to codegen, because they used that one on maxwell or so
18:22fdobridge_: <gfxstrand> Yeah, it's easy enough to switch how we implement `gl_LocalInvocationIndex`.
18:22fdobridge_: <gfxstrand> But also it's a compute shader so guaranteed to do memory things and you probably do it once at the top so... meh. 🤷🏻‍♀️
18:24fdobridge_: <gfxstrand> Unfortunately, fixing this bug (as entertaining as it was to chase down) sheds no light whatsoever on my CTS+WSI woes. :xenia_sob:
18:24fdobridge_: <gfxstrand> I'm also very close to just officially not caring.
18:24fdobridge_: <gfxstrand> I submitted last night without WSI. There's nothing in the spec that says a driver has to have WSI
18:25fdobridge_: <karolherbst🐧🦀> that's what we settled with for GL :ferrisUpsideDown:
18:25fdobridge_: <gfxstrand> And no one's going to sue us over IP in our WSI implementation.
18:26fdobridge_: <gfxstrand> I do want to find/fix the bug if we can because, until we understand and fix it, there's a chance someone will hit it in the wild. Most apps do use WSI, after all.
18:26fdobridge_: <gfxstrand> But for the purposes of checking off 1.3 and moving on... I'm kinda meh at this point.
18:57fdobridge_: <!DodoNVK (she) 🇱🇹> I wonder what's the difference between Maxwell 1 and 2 (I know Maxwell 2 introduced the high-secure firmware stuff)
19:00fdobridge_: <gfxstrand> Maxwell 2 also added the new texture headers
19:03fdobridge_: <gfxstrand> Okay, let's try 8 shards with VK_KHR_zero_initialize_workgroup_memory fixed
19:12fdobridge_: <!DodoNVK (she) 🇱🇹> I see multiple SM versions for Maxwell (is there any difference between them?)
19:17fdobridge_: <!DodoNVK (she) 🇱🇹> Also could codegen be disabled for NVK once SM20 or SM30 support lands? codegen can be useful as a reference for example in that GTA fog bug (but its reliability and performance can be questionable so 🤷♀️)
19:24fdobridge_: <gfxstrand> Yeah, I'm planning to eventually totally remove it but it'll stay as long as there's hardware we want to enable which NAK doesn't support yet.
19:24fdobridge_: <gfxstrand> I've got it pretty well walled off in the compile pipeline so it's not hurting anything.
19:25fdobridge_: <!DodoNVK (she) 🇱🇹> Nuking codegen will break the nouveau OpenGL driver though (which will still be needed for old hardware)
19:26fdobridge_: <gfxstrand> Well, totally remove it from NVK
19:26fdobridge_: <gfxstrand> It'll stick around for nouveau GL for as long as that's useful
20:35fdobridge_: <azkali> Huh it should be working 🤔
20:35fdobridge_: <azkali> Let me test that then but my rootfs is working
20:54fdobridge_: <gfxstrand> @airlied What's the page alignment for VRAM? Is it really 64K?
20:54fdobridge_: <gfxstrand> What does nouveau.ko do if we bind with a smaller alignment?
20:55fdobridge_: <airlied> probably lets you do it, not sure it can stop you
20:56fdobridge_: <gfxstrand> So can we bind VRAM at a 4k granularity?
20:56fdobridge_: <gfxstrand> Is 64K just more efficient then?
20:56fdobridge_: <airlied> it only matters for images
20:56fdobridge_: <gfxstrand> Well, images are what we're working with here....
20:56fdobridge_: <airlied> at least I think it does, don't really know what the rules are in the hw
20:57fdobridge_: <gfxstrand> @mohamexiety is trying to get sparse residency working and I'm trying to figure out the rules.
20:57fdobridge_: <airlied> start by assuming the kernel isn't enforcing the rules, since we don't know the rules
20:58fdobridge_: <gfxstrand> womp womp
20:58fdobridge_: <airlied> you can have 64k pages with a kind attached
20:58fdobridge_: <airlied> now whether you can suballocate multiple images inside a 64k kind block, I'm not sure
20:58fdobridge_: <gfxstrand> Okay, that's not the question being asked.
20:58fdobridge_: <gfxstrand> We can. It's fine.
20:59fdobridge_: <gfxstrand> The bigger question is if the page tables allow us to bind multiple discontiguous VRAM segments in the same 64K
20:59fdobridge_: <gfxstrand> Like, what is the required alignment of the VM_BIND offset/size parameters
20:59fdobridge_: <airlied> no I think you have one page table entry for the 64k
20:59fdobridge_: <gfxstrand> Okay
20:59fdobridge_: <gfxstrand> @mohamexiety There's your answer^^
21:00fdobridge_: <gfxstrand> Assert everything is 64k aligned in the image bind handling path and we'll figure out what granularity and limits to advertise to make that happen
21:01fdobridge_: <mohamexiety> okay, got it
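A minimal sketch of the kind of check suggested above, assuming a bind is described by a byte offset and a byte size. The helper names and the `BIND_ALIGN_B` constant are hypothetical and not taken from NVK or the nouveau uAPI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the 64 KiB granularity discussed in the chat. */
#define BIND_ALIGN_B (UINT64_C(1) << 16)

/* True when v is a multiple of 64 KiB (power-of-two mask test). */
static inline bool is_64k_aligned(uint64_t v)
{
    return (v & (BIND_ALIGN_B - 1)) == 0;
}

/* True when both the offset and the size of a bind are 64 KiB aligned. */
static inline bool image_bind_is_aligned(uint64_t offset, uint64_t size)
{
    return is_64k_aligned(offset) && is_64k_aligned(size);
}
```

In a bind path this would typically be wrapped in an `assert(image_bind_is_aligned(offset, size))`, so misaligned binds fail loudly in debug builds while the right granularity to advertise is still being worked out.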
21:01fdobridge_: <airlied> { 47, &gp100_vmm_desc_16[4], NVKM_VMM_PAGE_Sxxx },
21:01fdobridge_: <airlied> { 38, &gp100_vmm_desc_16[3], NVKM_VMM_PAGE_Sxxx },
21:01fdobridge_: <airlied> { 29, &gp100_vmm_desc_16[2], NVKM_VMM_PAGE_Sxxx },
21:01fdobridge_: <airlied> { 21, &gp100_vmm_desc_16[1], NVKM_VMM_PAGE_SVxC },
21:01fdobridge_: <airlied> { 16, &gp100_vmm_desc_16[0], NVKM_VMM_PAGE_SVxC },
21:01fdobridge_: <airlied> { 12, &gp100_vmm_desc_12[0], NVKM_VMM_PAGE_SVHx },
21:01fdobridge_: <airlied> are the turing options in the kernel
21:01fdobridge_: <airlied> S is sparse, V is VRAM, H is host, and C is comp
21:02fdobridge_: <airlied> so you can have empty sparse maps 47,38,29 bits, then 21 or 16 bits VRAM, and 12 bits vram/host,
21:02fdobridge_: <airlied> so in theory we might be allowed 4k alignment for non-comp images, which is all we support right now
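Reading the first column of that table as a log2 page size (airlied refers to the entries as "bits"), the shifts translate to byte sizes as sketched below. The array and function names are hypothetical; only the shift values come from the pasted table.

```c
#include <stdint.h>

/* Shift values copied from the table pasted above, largest first. */
static const uint8_t turing_page_shifts[] = { 47, 38, 29, 21, 16, 12 };

/* Convert a page shift (log2 of the page size) to a size in bytes. */
static inline uint64_t page_size_from_shift(unsigned shift)
{
    return UINT64_C(1) << shift;
}
```

Under that reading, shift 12 is 4 KiB, shift 16 is 64 KiB and shift 21 is 2 MiB, which matches the 4k and 64k figures used in the discussion.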
21:09fdobridge_: <!DodoNVK (she) 🇱🇹> Any progress for https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26622 ? :nouveau:
22:28fdobridge_: <gfxstrand> @airlied typed me a kernel patch for a REBAR query so I should be able to do that and hide it behind REBAR
22:29fdobridge_: <gfxstrand> I also need to work up the courage to land !27205
22:32fdobridge_: <gfxstrand> He also typed a kernel patch for getting VRAM size which we should be able to use for "real" memory_budget
22:36fdobridge_: <!DodoNVK (she) 🇱🇹> What if I don't have ReBAR?
22:36fdobridge_: <gfxstrand> Then you either won't get it or you'll get a very small region
22:36fdobridge_: <gfxstrand> Like 128M or something
22:36fdobridge_: <gfxstrand> Maybe 64M
22:36fdobridge_: <!DodoNVK (she) 🇱🇹> Is a small region good enough for gamescope?
22:37fdobridge_: <gfxstrand> If you have REBAR, you get the whole thing.
22:37fdobridge_: <gfxstrand> Should be
22:37fdobridge_: <gfxstrand> 64M is quite a bit for just upload memory
22:37fdobridge_: <gfxstrand> You can fit a lot of vertices in 64M
22:43fdobridge_: <Sid> if you don't have ReBAR, 256mb is the bar size
22:43fdobridge_: <Sid> on turing
22:43dj-death: gfxstrand: are you going to upload descriptors like that as well?
22:46fdobridge_: <gfxstrand> Yeah, but we also use some of that for the driver
22:46fdobridge_: <gfxstrand> I'd rather not but IDK.
22:46fdobridge_: <Sid> ah, fair
22:46fdobridge_: <gfxstrand> dj-death: We can upload descriptors with the DMA engine but ugh...
22:46fdobridge_: <gfxstrand> I'd rather just map, I think.
22:47fdobridge_: <gfxstrand> Even if I have to keep descriptor buffers in system ram
22:49fdobridge_: <!DodoNVK (she) 🇱🇹> What BAR do you like the most? Resizable Snickers BAR? Non-resizable Bounty BAR? 1 GB Mars BAR? 😅
22:50dj-death: gfxstrand: right
22:50fdobridge_: <gfxstrand> I'm a Snickers girl, myself. Though I don't mind an Almond Joy..
22:53fdobridge_: <!DodoNVK (she) 🇱🇹> I never heard of Almond Joy
22:58fdobridge_: <airlied> For egpu you really want DMA uploads
22:58fdobridge_: <gfxstrand> Yeah...