00:18fdobridge_: <gfxstrand> @karolherbst Is there a form of `membar` for attribute memory?
00:20fdobridge_: <karolherbst🐧🦀> for `AST` and `ALD` operations, right?
00:23fdobridge_: <karolherbst🐧🦀> @gfxstrand no, but writes of `AST` are only visible in the same invocation or by the next tess program
00:23fdobridge_: <karolherbst🐧🦀> running into weird issues or something?
00:23fdobridge_: <gfxstrand> Yeah
00:24fdobridge_: <karolherbst🐧🦀> are you using `ALD.O`?
00:24fdobridge_: <gfxstrand> Yeah
00:24fdobridge_: <karolherbst🐧🦀> mhhh
00:25fdobridge_: <karolherbst🐧🦀> sooo
00:25fdobridge_: <karolherbst🐧🦀> there is something funky, and I don't know if that matters
00:25fdobridge_: <karolherbst🐧🦀> but
00:25fdobridge_: <karolherbst🐧🦀> apparently you should `ALD` the location before you overwrite it via `AST`
00:26fdobridge_: <karolherbst🐧🦀> I think...
00:26fdobridge_: <karolherbst🐧🦀> I don't know much about tess so I don't actually know what is meant there
00:27fdobridge_: <karolherbst🐧🦀> ehh wait...
00:27fdobridge_: <gfxstrand> So the problem is that we have these write-masked tess output writes which spirv_to_nir turns into load/vec/store
00:27fdobridge_: <gfxstrand> And somewhere along the line that's going sideways (edited)
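A rough model of the load/vec/store lowering gfxstrand describes, assuming a vec4 output slot and a 4-bit write mask. `masked_store_lowered`, the dict-backed `outputs`, and the slot number are hypothetical illustration only, not spirv_to_nir or NAK code. It just shows that the pattern is only correct if the load returns whatever was last stored:

```python
def masked_store_lowered(outputs, slot, value, write_mask):
    """Emulate a write-masked vec4 store as load / vec / store.

    outputs: dict mapping slot -> list of 4 components (a hypothetical
    model of attribute memory, not real hardware state).
    write_mask: bit i set means component i is overwritten.
    """
    old = list(outputs[slot])                        # load (the ALD.O step)
    merged = [value[i] if (write_mask >> i) & 1 else old[i]
              for i in range(4)]                     # vec: new or old per lane
    outputs[slot] = merged                           # store (the AST step)
    return merged


outputs = {0x80: [0.0, 0.0, 1.0, 0.0]}
# Write only .x, .y and .w (mask 0b1011); .z has to survive via the load.
masked_store_lowered(outputs, 0x80, [5.0, 6.0, 9.0, 7.0], 0b1011)
print(outputs[0x80])
```

If the load returned stale data instead, the untouched component would be overwritten with the stale value on the store, which is the kind of stomping being debugged here.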
00:27fdobridge_: <karolherbst🐧🦀> I think it's meant that special data is stored in the output and needs to be fetched before you overwrite it with the actual output or something...
00:27fdobridge_: <karolherbst🐧🦀> it's a bit confusing
00:27fdobridge_: <karolherbst🐧🦀> mhh
00:28fdobridge_: <karolherbst🐧🦀> anyway.. I don't see any barrier instruction being relevant here
00:29fdobridge_: <karolherbst🐧🦀> or anything else
00:29fdobridge_: <karolherbst🐧🦀> might also be some funky shader header stuff.. dunno
00:34fdobridge_: <airlied> for me running wsi the sync tests doesn't seem to hang, doing a full serial run
00:36fdobridge_: <airlied> @dwlsalmeida I've at least reproduced the fail on turing
00:37fdobridge_: <dwlsalmeida> \o/
00:37fdobridge_: <dwlsalmeida> I was worried it was something on my setup
00:43fdobridge_: <gfxstrand> Okay, I'll come back and look at TCS output stores tomorrow
00:43fdobridge_: <gfxstrand> Something fishy is going on but I need to eat supper
00:56fdobridge_: <gfxstrand> When I passed the CTS before, it was with https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25916
00:56fdobridge_: <gfxstrand> I need to rebase that and try to find a reviewer
00:59fdobridge_: <redsheep> Update: Yeah plasma wayland and x11 are both definitely not working on zink on nvk just yet. X11 session goes to black screen with just a cursor and logs complaining about lack of dri modifiers, and wayland locked my system up so bad I couldn't use a tty or anything and had to hold the power button. I don't seem to have gotten anything about it in any logs either.
01:02fdobridge_: <redsheep> It would be nice to be able to cut the nouveau gl driver out of the equation entirely but seems something there is still broken pretty bad with running the whole session. That works on AMD, right?
01:53fdobridge_: <Sid> can someone with working rebar on linux help me out a bit?
01:57fdobridge_: <Sid> I just wanna know what the `memory` section of lspci -v says for your gpu
02:06fdobridge_: <Sid> I also want `dmesg | grep -iE bar`
03:19fdobridge_: <airlied> @dwlsalmeida okay found it, turing had a workaround getting in the way
03:20fdobridge_: <airlied> https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/nvdec-expose for the kernel patch
03:20fdobridge_: <airlied> also one hack on top of the vid branch gets me to where I was failing before
03:26fdobridge_: <gfxstrand> If I do with -vv, I see three regions. One of them is 256M without REBAR and 12GB with REBAR
03:29fdobridge_: <Sid> awesome
03:29fdobridge_: <Sid> ```
03:29fdobridge_: <Sid> 01:00.0 VGA compatible controller: NVIDIA Corporation TU116M [GeForce GTX 1660 Ti Mobile] (rev a1) (prog-if 00 [VGA controller])
03:29fdobridge_: <Sid> Subsystem: Acer Incorporated [ALI] TU116M [GeForce GTX 1660 Ti Mobile]
03:29fdobridge_: <Sid> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
03:29fdobridge_: <Sid> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
03:29fdobridge_: <Sid> Latency: 0
03:29fdobridge_: <Sid> Interrupt: pin A routed to IRQ 162
03:29fdobridge_: <Sid> Region 0: Memory at 8e000000 (32-bit, non-prefetchable) [size=16M]
03:29fdobridge_: <Sid> Region 1: Memory at 800000000 (64-bit, prefetchable) [size=8G]
03:29fdobridge_: <Sid> Region 3: Memory at 700000000 (64-bit, prefetchable) [size=32M]
03:29fdobridge_: <Sid> ```
03:29fdobridge_: <Sid> getting places
03:29fdobridge_: <Sid> need to get the driver to behave now
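For comparing BAR sizes across machines, a small sketch that pulls the `Region` lines out of `lspci -v` text like the paste above. The regex only covers the exact line shape shown in that paste (real lspci output can carry extra flags such as `[disabled]`, which this does not handle), and `parse_regions` is a made-up helper name:

```python
import re

# Matches lines like:
#   Region 1: Memory at 800000000 (64-bit, prefetchable) [size=8G]
REGION_RE = re.compile(
    r"Region (\d+): Memory at ([0-9a-f]+) \((\d+)-bit, (non-)?prefetchable\)"
    r" \[size=(\d+)([KMGT]?)\]"
)
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_regions(lspci_text):
    """Return a list of dicts describing each memory region found."""
    regions = []
    for m in REGION_RE.finditer(lspci_text):
        index, addr, width, nonpref, size, unit = m.groups()
        regions.append({
            "index": int(index),
            "address": int(addr, 16),
            "bits": int(width),
            "prefetchable": nonpref is None,
            "size": int(size) * UNITS[unit],
        })
    return regions


sample = """\
Region 0: Memory at 8e000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at 800000000 (64-bit, prefetchable) [size=8G]
Region 3: Memory at 700000000 (64-bit, prefetchable) [size=32M]
"""
for r in parse_regions(sample):
    print(r["index"], hex(r["address"]), r["size"] // (1 << 20), "MiB")
```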
07:07fdobridge_: <marysaka> That would match what I saw on mesh with ISEBERD/ISBEWR
07:09fdobridge_: <marysaka> I had that exact situation with the only input of mesh (workgroup id -> VERTEX_ID) but I ended up just moving the load to the start of the entrypoint and calling it a day tho...
09:22fdobridge_: <karolherbst🐧🦀> huh...
09:22fdobridge_: <karolherbst🐧🦀> yeah....
09:22fdobridge_: <karolherbst🐧🦀> sooo
09:22fdobridge_: <karolherbst🐧🦀> I guess that's it then?
09:22fdobridge_: <karolherbst🐧🦀> 😄
09:22fdobridge_: <karolherbst🐧🦀> do all `ALD`s before doing any `AST`
09:22fdobridge_: <karolherbst🐧🦀> I'm sure it's more detailed than that...
12:50fdobridge_: <tom3026> @gfxstrand just checking coverity a bit, https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/nouveau/vulkan/nvk_image.c#L301 is this WIP because external_info is always NULL
12:53fdobridge_: <tom3026> radv seems to loop over pImageFormatInfo and incase ->sType is VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO it sets external_info to (const void *)pImageFormatInfo, https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/amd/vulkan/radv_formats.c#L1626
12:53fdobridge_: <tom3026> but yeah just being bored checking coverity xD
12:55fdobridge_: <tom3026> or perhaps that vk_find_struct_const was supposed to set external_info just below it heh
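What tom3026 describes (walk the `pNext` chain and pick out the struct whose `sType` matches) can be sketched like this. The `Struct` class and `find_struct` are hypothetical stand-ins, and the sType values are plain strings rather than the real enum values; this is not Mesa's `vk_find_struct_const` implementation, only the general idea behind it:

```python
class Struct:
    """Hypothetical stand-in for a Vulkan struct header (sType + pNext)."""

    def __init__(self, s_type, p_next=None):
        self.s_type = s_type
        self.p_next = p_next


def find_struct(start, s_type):
    """Return the first struct in the chain with the given sType, else None."""
    node = start
    while node is not None:
        if node.s_type == s_type:
            return node
        node = node.p_next
    return None


external = Struct("PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO")
info = Struct("PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2", p_next=external)

# If this lookup is never done, external_info stays None (NULL in C),
# which is the situation coverity appears to be flagging.
external_info = find_struct(info.p_next, "PHYSICAL_DEVICE_EXTERNAL_IMAGE_FORMAT_INFO")
print(external_info is external)
```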
13:06fdobridge_: <karolherbst🐧🦀> I couldn't stand those 500 clippy warnings: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27234
13:06fdobridge_: <karolherbst🐧🦀> :ferrisUpsideDown:
13:11fdobridge_: <tom3026> cool applying that, and testing properly setting that external_info. lets see if my gpu gives a smoke and poff
13:11fdobridge_: <karolherbst🐧🦀> 😄
13:11fdobridge_: <karolherbst🐧🦀> probs
13:12fdobridge_: <Sid> still gotta make the driver behave, but I'm glad the hardware and firmware are in harmony
13:12fdobridge_: <Sid> https://cdn.discordapp.com/attachments/1034184951790305330/1199703150144266290/299321745-23d7ee8f-9379-4441-bccc-63f54f4a2814.png?ex=65c381a0&is=65b10ca0&hm=b67c705f216c411a8e75cd8d09469791e4e5d2394a0173c94bfea1d21ac00aae&
13:12fdobridge_: <tom3026> explorer.exe? of my eyes
13:12fdobridge_: <Sid> Hiren's BootCD
13:13fdobridge_: <Sid> a live environment, because I'm not installing windows 🙃
13:18fdobridge_: <tom3026> @karolherbst is nak inside libvulkan_nouveau.so ? since im building that separately from mesa otherwise i gotta rebuild it all
13:18fdobridge_: <karolherbst🐧🦀> yes
13:18fdobridge_: <tom3026> okay
13:30fdobridge_: <tom3026> seems to work here, il run some cts but its probably gonna stop at wsi as usual heh
13:31fdobridge_: <tom3026> cs2 ran and the other vkmark/games that always has so
13:31fdobridge_: <karolherbst🐧🦀> yeah.. nothing there should break anything 😄
13:31fdobridge_: <karolherbst🐧🦀> but who knows
13:52fdobridge_: <tom3026> cool, either clippy or my change to properly set external_info has cts running until it stops after wsi
13:52fdobridge_: <tom3026> ```
13:52fdobridge_: <tom3026> [ 276.600926] nouveau 0000:01:00.0: gsp: intr 00008000
13:52fdobridge_: <tom3026> [ 321.261970] nouveau 0000:01:00.0: gsp: intr 00008000
13:52fdobridge_: <tom3026> [ 1355.701510] nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:24 type:13 scope:1 part:233
13:52fdobridge_: <tom3026> [ 1355.701519] nouveau 0000:01:00.0: fifo:c00000:0003:0018:[deqp-vk[8443]] errored - disabling channel
13:52fdobridge_: <tom3026> [ 1355.701524] nouveau 0000:01:00.0: deqp-vk[8443]: channel 24 killed!
13:52fdobridge_: <tom3026> ```
13:52fdobridge_: <tom3026> but before those changes?
13:52fdobridge_: <tom3026> ```
13:52fdobridge_: <tom3026> acer kernel: __vm_enough_memory: pid: 16729, comm: deqp-vk, not enough memory for the allocation
13:52fdobridge_: <tom3026> acer kernel: __vm_enough_memory: pid: 16729, comm: deqp-vk, not enough memory for the allocation
13:52fdobridge_: <tom3026> acer kernel: __vm_enough_memory: pid: 16729, comm: deqp-vk, not enough memory for the allocation
13:52fdobridge_: <tom3026> acer kernel: __vm_enough_memory: pid: 16729, comm: deqp-vk, not enough memory for the allocation
13:52fdobridge_: <tom3026> acer kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:24 type:13 scope:1 part:233
13:52fdobridge_: <tom3026> acer kernel: nouveau 0000:01:00.0: fifo:c00000:0003:0018:[deqp-vk[16729]] errored - disabling channel
13:52fdobridge_: <tom3026> acer kernel: nouveau 0000:01:00.0: deqp-vk[16729]: channel 24 killed!
13:52fdobridge_: <tom3026> acer kernel: nouveau 0000:01:00.0: gsp: mmu fault queued
13:52fdobridge_: <tom3026> acer kernel: nouveau 0000:01:00.0: gsp: rc engn:00000001 chid:24 type:31 scope:1 part:233
13:52fdobridge_: <tom3026> ```
13:52fdobridge_: <tom3026> a better result! no idea how many passes/fails changed but 😄
13:55fdobridge_: <karolherbst🐧🦀> pain
13:55fdobridge_: <karolherbst🐧🦀> 😄
13:56fdobridge_: <karolherbst🐧🦀> anyway, I think the changes are fine, just needs somebody to review it
14:23fdobridge_: <tom3026> no not pain, i mean after the changes i didnt get __vm_enough_memory errors nor mmu fault heh
14:24fdobridge_: <karolherbst🐧🦀> heh
14:24fdobridge_: <karolherbst🐧🦀> sounds unrelated
14:24fdobridge_: <karolherbst🐧🦀> might have been fixed by something else in the meantime
15:27fdobridge_: <gfxstrand> This doesn't seem to help. 😢
15:33fdobridge_: <karolherbst🐧🦀> pain
15:33fdobridge_: <karolherbst🐧🦀> tried doing _all_ `ALD`s before any `AST`?
15:34fdobridge_: <karolherbst🐧🦀> see ^^
15:36fdobridge_: <gfxstrand> IDK if we can always do that...
15:36fdobridge_: <gfxstrand> Not for tess
15:36fdobridge_: <gfxstrand> I mean, maybe if we do crazy nonsense
15:40fdobridge_: <karolherbst🐧🦀> yeah.... would need to copy to a temporary and stuff, but I'm not quite sure what the restrictions here are
15:53fdobridge_: <karolherbst🐧🦀> @gfxstrand sooo... it sounds like the input coordinates of the tessellation eval are stored in the output ISBE of the program. But I don't really have any more details on that, just that it sounds like some region `ALD` accesses in tess eval aliases with locations `AST` would write to
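A toy model of that hypothesis, and of why "all `ALD`s before any `AST`" would help if it holds. Everything here is an assumption for illustration: the flat list standing in for ISBE memory, the aliasing of an input with output slot 0, and both function names. It says nothing about what the hardware actually does:

```python
def run_interleaved(mem):
    """Store to slot 0 first, then load the input assumed to alias slot 0."""
    mem = list(mem)
    mem[0] = 42.0          # AST: write an output
    tess_coord = mem[0]    # ALD: the input read now sees the output value
    return tess_coord


def run_loads_first(mem):
    """Copy the aliased input to a temporary before any store happens."""
    mem = list(mem)
    tess_coord = mem[0]    # ALD: read the input while it is still intact
    mem[0] = 42.0          # AST: only now overwrite the shared location
    return tess_coord


initial = [0.25, 0.0, 0.0, 0.0]
print(run_interleaved(initial), run_loads_first(initial))
```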
15:55fdobridge_: <marysaka> I think it might be worth trying to enable ISBE accesses on the shader header and dump it to a buffer
15:55fdobridge_: <marysaka> as I'm pretty sure ALD/AST just manipulate the ATTR.SKEW (edited)
16:00fdobridge_: <gfxstrand> Yeah, that's a bit whacky
16:05fdobridge_: <karolherbst🐧🦀> I'm sure it all makes sense somehow 😄
16:30fdobridge_: <gfxstrand> I think I found it: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27242
16:31fdobridge_: <gfxstrand> I need to wait for Karol's clippy fixes to finish running before I can CTS it, though.
16:31fdobridge_: <gfxstrand> It fixes the one test
16:31fdobridge_: <gfxstrand> And makes us closer match the blob behavior
16:31fdobridge_: <gfxstrand> That frickin' vertex source is magic, I swear...
16:32fdobridge_: <!DodoNVK (she) 🇱🇹> What test?
16:33fdobridge_: <gfxstrand> dEQP-VK.subgroups.basic.framebuffer.subgroupmemorybarrierimage_tess_control
16:33fdobridge_: <gfxstrand> I'm genuinely unsure how tessellation has been working at all this whole time if that's really the bug, though.
16:33fdobridge_: <gfxstrand> But, again, that instruction source is magic.
16:34fdobridge_: <gfxstrand> It's also entirely possible that tessellation has very much not been working and we just never noticed. 😂
16:35fdobridge_: <tom3026> oh speaking of nir_nak and lowering, I tried finding the function nir_u2u32 but didn't have much luck for some reason. Is this also a null ptr deref? https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/nouveau/compiler/nak_nir_lower_tex.c?ref_type=heads#L53 Depending on the switch case, tex_h is null and it's passed as a parameter to it as well
16:35fdobridge_: <tom3026> otherwise it's a false positive from coverity
16:41fdobridge_: <gfxstrand> Nope, that one's fine. `tex_h` will always be non-null. (edited)
16:41fdobridge_: <tom3026> okay
16:59fdobridge_: <Sid> ReBAR on Turing
16:59fdobridge_: <Sid> https://cdn.discordapp.com/attachments/1034184951790305330/1199760429107388546/299394501-983dc620-6435-4ee3-a114-e1d7da310731.png?ex=65c3b6f8&is=65b141f8&hm=b50cacd6f5b3f4509f623fa1eae18e76525c3348244efc5515239d0fb266707f&
16:59fdobridge_: <Sid> time to figure out why it's not working on linux
17:00fdobridge_: <Sid> yes, I bit the bullet and am dual booting for a few days
17:00fdobridge_: <gfxstrand> Nice!
17:02fdobridge_: <karolherbst🐧🦀> funky.. I never read `src[1]` in the GL driver :ferrisUpsideDown:
17:03fdobridge_: <karolherbst🐧🦀> maybe I should
17:03fdobridge_: <karolherbst🐧🦀> in some cases
17:04fdobridge_: <Sid> it was actually fairly simple to get that to happen
17:04fdobridge_: <gfxstrand> It's needed for GS and TCS inputs but not TCS outputs, apparently.
17:05fdobridge_: <Sid> and I don't even have a documented bios
17:05fdobridge_: <gfxstrand> Though the CTS may disagree. 😅
17:06fdobridge_: <karolherbst🐧🦀> we still have some fails in regards to tess and gs in GL, so that might explain those 😄
17:08fdobridge_: <Sid> 💢
17:08fdobridge_: <Sid> https://cdn.discordapp.com/attachments/1034184951790305330/1199762605150052462/image.png?ex=65c3b8ff&is=65b143ff&hm=b89236d6b43d129241945ded5cf5d34b93288909743a90986c0ec8ff75a1fe59&
18:16fdobridge_: <Sid> ...I think I figured it out
18:16fdobridge_: <Sid> I *think* it's because the OS is unable to tell if I've booted off of a GPT layout drive, since my root partition is a bcachefs pool
18:18fdobridge_: <tom3026> aw cts was running fine with wsi disabled until i reached dEQP-VK.compute.pipeline.zero_initialize_workgroup_memory
18:21fdobridge_: <gfxstrand> 😢
18:22fdobridge_: <gfxstrand> Hrm... Looks like this patch isn't so good. It fixed the one test but now others are failing.
18:22fdobridge_: <gfxstrand> I'm very confused by what the blob generates, though.
19:43fdobridge_: <redsheep> How in the world did you get your hands on cannon lake?
19:43fdobridge_: <Sid> it's not a cannon lake
19:43fdobridge_: <Sid> acer just labeled things wrong/reused code without changing much
19:44fdobridge_: <Sid> my cpu is a coffeelake quad core i5
19:44fdobridge_: <Sid> which is similar to both skylake *and* kabylake
19:45fdobridge_: <Sid> also I still don't know why my GPU capabilities aren't being reported the same on linux as on windows
19:45fdobridge_: <Sid> i.e. max bar size
19:47fdobridge_: <Sid> 8 gig rebar on turing on windows is nice but it is worthless to me if I can't replicate it on linux 😅
19:48fdobridge_: <karolherbst🐧🦀> just burn windows down, then your only choices are to not have rebar
19:48fdobridge_: <Sid> oh I already formatted the partition I had installed windows on
19:49fdobridge_: <Sid> but knowing my hardware will happily do rebar, and knowing it's the linux stack blocking me from having it is a kicker I can't get over
19:53fdobridge_: <Sid> *especially* considering the system is assigning 8gb memory to the gpu even on linux, but *something* is preventing the gpu from stretching its legs
19:54fdobridge_: <redsheep> Can the BAR only ever be powers of 2? If not, isn't 8GB wrong?
19:55fdobridge_: <Sid> only powers of 3, yes
19:55fdobridge_: <Sid> only powers of 2, yes (edited)
19:55fdobridge_: <redsheep> I was scared when you had typed 3 there for a moment lol
19:55fdobridge_: <Sid> and BAR has to, ideally, be larger than VRAM
19:55fdobridge_: <redsheep> Trinary computers would be horrific
19:55fdobridge_: <Sid> yeah
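The power-of-two point can be checked with a couple of lines. The 6 GiB VRAM figure below is an assumed example value, not something stated in this log, and both helper names are made up:

```python
def is_power_of_two(n):
    """True if n is a positive power of two (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


def smallest_bar_covering(vram_bytes):
    """Smallest power-of-two size that is >= vram_bytes."""
    size = 1
    while size < vram_bytes:
        size <<= 1
    return size


GIB = 1 << 30
print(is_power_of_two(8 * GIB))                   # 8 GiB is 2**33
print(smallest_bar_covering(6 * GIB) // GIB)      # assumed 6 GiB of VRAM
```

So an 8G BAR is a valid size, and it is what you would expect for any VRAM amount above 4 GiB and up to 8 GiB.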
19:56fdobridge_: <Sid> I can confirm it's not the nvidia driver preventing the BAR from going up, because even nouveau doesn't push it higher than 256mb
19:56fdobridge_: <Sid> I'm still bothered by my lspci -vv output
19:57fdobridge_: <Sid> https://paste.sidonthe.net/raw/mole-emu-raven
19:57fdobridge_: <Sid> caps says 256 is largest, why
19:58fdobridge_: <airlied> if you are booting with CSM or whatever it's probably blocking it, make sure you have UEFI (maybe try a live linux usb image or something)
19:58fdobridge_: <Sid> CSM is off
19:58fdobridge_: <Sid> I had it working on a test installation of windows
19:58fdobridge_: <Sid> rebooted to linux, no resized bar
19:59fdobridge_: <Sid> ..hang on
19:59fdobridge_: <Sid> https://github.com/xCuri0/ReBarUEFI/issues/119
19:59fdobridge_: <Sid> more info about the whole thing ^
20:02fdobridge_: <dwlsalmeida> I confirm that this is now working \o/
20:03fdobridge_: <airlied> yay, hopefully you can figure out how to make forward progress or make it do something useful!
20:11fdobridge_: <Sid> :o
20:11fdobridge_: <Sid> pop os live iso shows the caps correctly
20:26fdobridge_: <Sid> I will riot if archiso also shows it correctly
20:26fdobridge_: <Sid> by riot I mean angrily shut down my laptop, go to bed, and slowly come to the realization that I'm probably gonna have to reinstall
20:30fdobridge_: <Sid> ok even archiso doesn't show it
20:40fdobridge_: <dwlsalmeida> you may already know this, but you made great progress already, tried testing this on a 64x64 I frame and it decodes correctly, only tiling is broken
20:40fdobridge_: <dwlsalmeida> I'll be playing with it for a while lol
20:59fdobridge_: <dwlsalmeida> @airlied one more thing, how did you dump and decode the command streams in the file you shared? Is the tool by @marysaka involved in any way?
21:00Lyude: oh!
21:02Lyude: tried one of my old kepler cards again to look at a bug to see if I could get it working (this card has not worked for a while in my test machine, but does work in porygon's test bench) and I think I just discovered exactly why that is. at least two or three resistors near the PCIe contacts have broken off
21:02Lyude: ...i think they are resistors. i will admit I am no electronics expert. but hey guess I've got an excuse to try my pinecil out finally :)
21:16karolherbst: Lyude: I had a GPU like that, but it kinda still worked... though I also taped those back to the GPU, so maybe that's why
21:16karolherbst: 🙃
21:16Lyude: Hah
21:16Lyude: karolherbst: yeah this one is exactly the same - works in some machines and not others
21:17karolherbst: but the tape got a bit melted over the months, just not enough to actually get black or anything, but... it got hot enough to do _something_ :D
21:17Lyude: honestly I've been looking for an excuse to learn how to solder this kind of stuff though as a weekend project so this is perfect :)
21:17karolherbst: yeah...
21:17karolherbst: have fun :D
21:17Lyude: ...unfortunately I do very much hope this is not my only fermi with displayport haha
21:17Lyude: erm, kepler
21:17karolherbst: the GPU also had one of those soldered "wrongly", like uhhh... it was weird
21:18karolherbst: like the angle was all weird
21:18karolherbst: maybe I still have the GPU somewhere
21:18karolherbst: Lyude: worst case just ping mupuf to send some GPUs over to you or something
21:18karolherbst: I think mupuf still has a few from that time
21:19Lyude: ooo
21:19karolherbst: though I'm not sure.. the last time I asked, I got 3 fermis or so
21:19karolherbst: but I think there were some left :D
21:19karolherbst: I already had enough keplers
21:19karolherbst: *always
21:19karolherbst: so you should be more lucky there I hope
21:20Lyude: karolherbst: if you've got a kepler with displayport it might be good to see if you could reverse-bisect the 22122 issue that neils mentioned
21:20karolherbst: I should have
21:20Lyude: still might keep this around because tbh I'd love to learn how to do this, especially so I can keep the funkier cards in my collection working
21:21karolherbst: I have a gk106 and gk110 with DP
21:22karolherbst: Lyude: but wait... uhm...
21:22karolherbst: I think that one is actually fixed...
21:22Lyude: karolherbst: so do I but I don't know which commit did it unfortunately
21:22Lyude: was what I was hoping to figure out
21:22karolherbst: ohh..
21:22karolherbst: yeah of course, :D
21:22karolherbst: I even mentioned that on the bug 🙃
21:23karolherbst: Lyude: you know which issue it was on gitlab?
21:23Lyude: yeah one sec
21:23Lyude: I -think- it's this one https://gitlab.freedesktop.org/drm/nouveau/-/issues/188#note_2229276
21:23fdobridge_: <airlied> yes I had to hack Mary's tool, and then a bunch of manual steps, it was very messy
21:24karolherbst: Lyude: yep
21:27fdobridge_: <airlied> https://gitlab.freedesktop.org/airlied/mesa/-/commits/nv-pushbuf-dump-tool-hacks-video has the hacks
21:27fdobridge_: <airlied> but I think I had to manually figure out some stuff, as there was no way to track what was a video subchannel across submits
21:31fdobridge_: <airlied> got a full single thread conformance on turing, but 239 fails, no hangs
21:34fdobridge_: <gfxstrand> Maybe we just can't read from per-vertex tessellation outputs?
21:35fdobridge_: <gfxstrand> I mean the hardware isn't erroring but maybe that just doesn't work?
21:35fdobridge_: <karolherbst🐧🦀> they can
21:35fdobridge_: <karolherbst🐧🦀> tess shaders I mean
21:36fdobridge_: <gfxstrand> Yeah, they ought to be able to
21:37fdobridge_: <karolherbst🐧🦀> did you move all `ALD` before the `AST`s yet?
21:37fdobridge_: <karolherbst🐧🦀> though mhh...
21:37fdobridge_: <karolherbst🐧🦀> though
21:37fdobridge_: <karolherbst🐧🦀> you might see a different problem
21:40fdobridge_: <karolherbst🐧🦀> maybe it's something else going on...
21:41fdobridge_: <karolherbst🐧🦀> do you have a shader dump somewhere?
21:43fdobridge_: <gfxstrand> https://cdn.discordapp.com/attachments/1034184951790305330/1199831836604506142/message.txt?ex=65c3f979&is=65b18479&hm=2d77edbd1a41794dcfe3dbb005807bfb330cb40092c707af5d5f976de48bbdba&
21:44fdobridge_: <karolherbst🐧🦀> ohh.. it's patch stuff?
21:45fdobridge_: <karolherbst🐧🦀> though I guess the patch ones do work?
21:48fdobridge_: <karolherbst🐧🦀> @gfxstrand mhh.. you are reading the outputs before anything wrote to it?
21:50fdobridge_: <gfxstrand> No, it's not patch
21:50fdobridge_: <karolherbst🐧🦀> that `ald.o a[%r6][0x80]` confuses me as this reads the output of the current invocation, but that one hasn't written to it yet, has it?
21:51fdobridge_: <karolherbst🐧🦀> well.. `ast.p a[0x0] %r11` is patch
21:51fdobridge_: <gfxstrand> The patch bits are fine
21:52fdobridge_: <karolherbst🐧🦀> right.. I'm just confused by that `ald.o` + `ast` thing
21:52fdobridge_: <gfxstrand> The problem appears to be that the `ald.o` isn't picking up the `ast` inside the if
21:52fdobridge_: <karolherbst🐧🦀> or is that just to write the second component?
21:52fdobridge_: <karolherbst🐧🦀> the ast in `block 4`?
21:52fdobridge_: <gfxstrand> yeah
21:52fdobridge_: <karolherbst🐧🦀> I see...
21:53fdobridge_: <gfxstrand> the ald/ast pattern in block 6 is attempting to do a write-masked store but I think it's stomping things
21:53fdobridge_: <karolherbst🐧🦀> might be
21:53fdobridge_: <karolherbst🐧🦀> though I doubt it...
21:54fdobridge_: <gfxstrand> The problem is that 0x88 is wrong
21:54fdobridge_: <karolherbst🐧🦀> what's this one supposed to do? `{%r60 %r61 %r62 %r63} = ald a[%r59][0x70]`
21:54fdobridge_: <karolherbst🐧🦀> mhhh.. I see
21:54fdobridge_: <gfxstrand> That's just copying gl_Position
21:55fdobridge_: <gfxstrand> Everything is fine except 0x88
21:55fdobridge_: <karolherbst🐧🦀> mhhh
21:55fdobridge_: <airlied> https://paste.centos.org/view/raw/9dbee9a3 is my fail list from cts master, with full wsi x11+wayland included
21:55fdobridge_: <gfxstrand> There's a load/store in block 3 that writes 0x80.z and then a store in block 4 that writes 0x80.z and then a load/store in block 6 that's supposed to write 0x80.xyw
21:56fdobridge_: <karolherbst🐧🦀> so if I read it correctly, `0x88` is supposed to be either `0x0` or `0x3f800000`, right?
21:56fdobridge_: <gfxstrand> yes
21:56fdobridge_: <gfxstrand> And we're getting 0x1 when we should be getting 0x0
21:56fdobridge_: <gfxstrand> I think
21:57fdobridge_: <karolherbst🐧🦀> 0x1 or 1.0?
21:57fdobridge_: <Sid> 🥳
21:57fdobridge_: <Sid> https://cdn.discordapp.com/attachments/1034184951790305330/1199835407609766009/image.png?ex=65c3fccd&is=65b187cd&hm=0b2bc14bc121f81883c5e856e6e3aa5a715efa181b7ac8906fc13d68e5fb8a32&
21:58fdobridge_: <gfxstrand> `$1 = {3.70168781e+09, 3.70168781e+09, 1, 0, 0, 3.70168781e+09, 1, 0}`
21:58fdobridge_: <gfxstrand> I think the expected result is
21:58fdobridge_: <gfxstrand> `$1 = {3.70168781e+09, 3.70168781e+09, 1, 0, 0, 3.70168781e+09, 0, 0}`
21:58fdobridge_: <karolherbst🐧🦀> that's 0x80 and 0x90, right?
21:58fdobridge_: <karolherbst🐧🦀> ehh
21:58fdobridge_: <gfxstrand> It's 0x80 on two different tests
21:58fdobridge_: <karolherbst🐧🦀> okay
21:59fdobridge_: <gfxstrand> 0x80 twice as it were
21:59fdobridge_: <gfxstrand> But the first vec4 is fine
21:59fdobridge_: <karolherbst🐧🦀> maybe `%p35 = isetp.eq.i32 %r17 %r34` is just never true?
21:59fdobridge_: <gfxstrand> They take different sides of the if
21:59fdobridge_: <karolherbst🐧🦀> for whatever reason?
21:59fdobridge_: <karolherbst🐧🦀> ehh or false
21:59fdobridge_: <karolherbst🐧🦀> whatever makes it skip block 4
21:59fdobridge_: <karolherbst🐧🦀> or doesn't make it skip
21:59fdobridge_: <karolherbst🐧🦀> you know what I mean
21:59fdobridge_: <Sid> either hugepages or pci pm was breaking rebar
22:00fdobridge_: <gfxstrand> Nah, it's executing the if properly
22:00fdobridge_: <Sid> nuked my configs, worked
22:00fdobridge_: <karolherbst🐧🦀> mhhh
22:00fdobridge_: <karolherbst🐧🦀> I mean..., are you sure?
22:00fdobridge_: <gfxstrand> but now I'm really confused.
22:00fdobridge_: <gfxstrand> How are we getting 1?!?
22:00fdobridge_: <karolherbst🐧🦀> it's float 1, no?
22:01fdobridge_: <karolherbst🐧🦀> so.. `0x3f800000`
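The "0x1 or 1.0?" question is about raw bits versus the float they encode. A quick way to convert both directions with the standard `struct` module (the helper names are made up):

```python
import struct


def float_bits(x):
    """Raw 32-bit pattern of x stored as an IEEE 754 single-precision float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_float(u):
    """Float value encoded by the 32-bit pattern u."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


print(hex(float_bits(1.0)))      # float 1.0 as raw bits
print(bits_float(0x3f800000))    # and back again
print(bits_float(0x1))           # integer 1 reinterpreted: a tiny denormal
```

A debugger printing the value as a float shows `1` for the pattern `0x3f800000`, while the integer pattern `0x1` would show up as a denormal very close to zero.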
22:02fdobridge_: <karolherbst🐧🦀> so yeah.. my working theory would be that block 4 always executes, that's why you see the 1 there
22:02fdobridge_: <karolherbst🐧🦀> it would be the only sane explanation
22:03fdobridge_: <gfxstrand> If block 4 always executes, I'd have `{3.70168781e+09, 3.70168781e+09, 1, 0}` for both vec4s
22:04fdobridge_: <karolherbst🐧🦀> why?
22:04fdobridge_: <karolherbst🐧🦀> block 4 writes the same value to the other components
22:05fdobridge_: <gfxstrand> Because .x also changes based on whether or not block4 is executed
22:05fdobridge_: <karolherbst🐧🦀> ohh right...
22:05fdobridge_: <karolherbst🐧🦀> in block 6...
22:06fdobridge_: <gfxstrand> FWIW, this shader is fine
22:06fdobridge_: <gfxstrand> https://cdn.discordapp.com/attachments/1034184951790305330/1199837563062599710/message.txt?ex=65c3fece&is=65b189ce&hm=02918689e08ebeac0f32f3c058815aea2470b3c0f0e7889ee4d9210cae8ba02b&
22:06fdobridge_: <gfxstrand> That's when I fix NIR to not do the read/vec/write pattern
22:06fdobridge_: <karolherbst🐧🦀> mhhh...
22:06fdobridge_: <karolherbst🐧🦀> mhhhhhhhhhhhhhh
22:06fdobridge_: <karolherbst🐧🦀> interesting
22:07fdobridge_: <gfxstrand> And, like, that should pass the CTS
22:07fdobridge_: <gfxstrand> But also so should the read/write pattern
22:07fdobridge_: <karolherbst🐧🦀> yeah...
22:07fdobridge_: <karolherbst🐧🦀> should
22:07fdobridge_: <karolherbst🐧🦀> maybe something funky on the encoding level?
22:08fdobridge_: <karolherbst🐧🦀> mhhh
22:09fdobridge_: <Sid> anyways, I've been up for 24 hours straight (don't ask), and I'm going to go hit the sack peacefully
22:09fdobridge_: <karolherbst🐧🦀> @gfxstrand fyi, `AST` can do 96 bit writes
22:09fdobridge_: <karolherbst🐧🦀> ehh wait
22:10fdobridge_: <karolherbst🐧🦀> nevermind 😄
22:10fdobridge_: <gfxstrand> Sleep well!
22:10fdobridge_: <karolherbst🐧🦀> mhhhh
22:11fdobridge_: <karolherbst🐧🦀> @gfxstrand what happens if you initialize `0x80` fully before doing any loads on it?
22:12fdobridge_: <gfxstrand> IDK
22:12fdobridge_: <karolherbst🐧🦀> at this point I'd say it's worth a try...
22:13fdobridge_: <karolherbst🐧🦀> but yeah.. the main difference is, that the fixed version doesn't do any `ald.o`
22:13fdobridge_: <karolherbst🐧🦀> but it's also not like it's completely broken
22:15fdobridge_: <karolherbst🐧🦀> wait actually...
22:15fdobridge_: <karolherbst🐧🦀> only 0x88 is actually visible in the result
22:16fdobridge_: <karolherbst🐧🦀> @gfxstrand okay... what if you can't read the output of the current invoc?
22:17fdobridge_: <karolherbst🐧🦀> because
22:17fdobridge_: <karolherbst🐧🦀> _why_ would you even
22:23Lyude: dakr: found the missing commits I -think- I need for getting Data<T, U, V>::registrations() from rust/kernel/device.rs uncommented - specifically 6c9714fd5e666ad3a218279cbe93fcee82917c17 from the rust for linux tree which adds the RevocableMutex type. But I noticed that you seem to have pulled in 3b83b0d2887fcb - which seems to also add a RevocableMutex type - but your version is
22:23Lyude: missing all of the functions required to actually access the underlying data. do you have any idea why this is?
22:23Lyude: and whether I can just replace the code from that commit, or if we should do something else to get this working?
22:25Lyude: FWIW: the reason I need it is https://gitlab.freedesktop.org/lyudess/linux/-/blob/rvkms/drivers/gpu/drm/rvkms/rvkms.rs?ref_type=heads#L82 I need a pinned version of reg similar to how asahi registers their drm driver https://github.com/AsahiLinux/linux/blob/asahi/drivers/gpu/drm/asahi/driver.rs#L165
22:26Lyude: unless you happen to have a better idea of how I could get a pinned reference to the driver registration object there
22:40karolherbst: Lyude: why not just use `pin!` or do the same thing asahi is doing?
22:40karolherbst: Pin doesn't really do much in itself
22:42karolherbst: anyway.. pinning local variables is why `pin!` exists
22:43Lyude: karolherbst: I don't totally understand what's going on is why :P, but also I'd have to use pin_unchecked anyway
22:43karolherbst: API constraints I figure...
22:43fdobridge_: <gfxstrand> I wonder if my `ald` is actually loading from vtx0 or something
22:44Lyude: so it's just fine if I just do pin_unchecked if I make sure that things don't go out of scope or something like that?
22:44Lyude: sorry, Pin::new_unchecked
22:44fdobridge_: <karolherbst🐧🦀> maybe? could be that the encoding is messed up? did you check with nvdisasm?
22:44karolherbst: Lyude: they can't, because the borrow checker would complain :P
22:44karolherbst: this ain't C :D
22:45karolherbst: and if you want to store a reference somewhere, you need to attach a lifetime and everything has to ensure those lifetime constraints are fulfilled, so yeah
22:46karolherbst: Pin only guarantees, that nothing will move the value somewhere else, e.g. through things like mem::swap
22:46Lyude: OH
22:46Lyude: wow ok
22:46Lyude: i understand now
22:47Lyude: I'll try using Pin::new_unchecked then (since that type doesn't appear to implement Unpin here)
22:47fdobridge_: <gfxstrand> Looks like the blob is doing some lea thing to get the index for the output load
22:47Lyude: I just wanted to either match what asahi's doing or understand what's happening if I do my own thing
22:47Lyude: I think I understand now though
22:48karolherbst: Lyude: and even Unpin isn't an issue
22:48karolherbst: as e.g. things like u32 are Unpin, because it's kinda pointless to prevent them from moving
22:48karolherbst: as you can just copy it anyway
22:48karolherbst: (the reason is something else tho)
22:49Lyude: yeah unpin I still don't fully wrap my head around. i'll have to try reading the docs for that again
22:49Lyude: thank you for explaining :)
22:49karolherbst: yeah.. Pin is a weird concept
22:50Lyude: i wonder then why asahi accesses the registration object they have in such a weird way then
22:50karolherbst: dunno tbh
22:51Lyude: i'll give this a shot and keep writing up trait impls until I've got something compiling and see what happens
22:51karolherbst: I don't really see why anything would use Pin there anyway
22:51Lyude: I guess if it blows up I can poke them
22:52karolherbst: but anyway.. pinning helps if you have fields referencing other fields within a struct
22:52karolherbst: e.g. if you have a parsed result and a data buffer
22:52karolherbst: and you can't have anything move around the value of that buffer (like replacing it with a different thing) as it would invalidate the parsed info
22:53karolherbst: like e.g. if part of that parsed info is a string
22:53Lyude: mhm - I mostly grokked, I guess I was mostly confused on things like - how long something stays Pin, if there's something special that actually needs to be done for unsafe impls to ensure that something really is Pin, etc. but it sounds like it really is just a compiler hint to say don't let things move until the Pin is dropped?
22:54karolherbst: Pin doesn't do anything on its own
22:54karolherbst: it just prevents certain operations from being done on pinned values whose type is not Unpin
22:55Lyude: gotcha gotcha
22:55karolherbst: I think this is a good quick example: https://doc.rust-lang.org/std/pin/#example-self-referential-struct
22:55karolherbst: the last line is an example of what you can't do
22:56fdobridge_: <Sid> BAR caps aren't exposed on nouveau+GSP btw:
22:56fdobridge_: <Sid> https://paste.sidonthe.net/raw/otter-goat-cat
22:56Lyude: hooray, now I can finish this skeleton hopefully
22:57fdobridge_: <karolherbst🐧🦀> lea is just shift and add
22:58fdobridge_: <karolherbst🐧🦀> but yeah...
22:58fdobridge_: <karolherbst🐧🦀> maybe something needs to be done on the output index
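Taking "lea is just shift and add" at face value, the plain form can be modelled as below. This is only that plain form: it does not attempt the `.HI` or `.SX32` modifiers seen in the blob output, whose exact semantics are not established in this log, and the operand order and the name `lea` are assumptions:

```python
MASK32 = 0xFFFFFFFF


def lea(index, base, shift):
    """Plain shift-and-add: base + (index << shift), wrapped to 32 bits."""
    return (base + (index << shift)) & MASK32


# e.g. the address of element 3 in an array of 16-byte vec4s starting at 0x80
print(hex(lea(3, 0x80, 4)))
```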
23:01fdobridge_: <Sid> same without gsp
23:03fdobridge_: <Sid> same thing on the proprietary driver (open kernel module) https://paste.sidonthe.net/raw/snail-toad-monkey, caps are exposed correctly
23:03fdobridge_: <Sid> goob night
23:04fdobridge_: <karolherbst🐧🦀> same
23:12fdobridge_: <gfxstrand> `LEA.HI.SX32 R4, R0, R5, 0x1e`
23:12fdobridge_: <gfxstrand> With some IMAD in there, too
23:12fdobridge_: <gfxstrand> Ugh...
23:14fdobridge_: <gfxstrand> ```
23:14fdobridge_: <gfxstrand> S2R R0, SR_LANEID
23:14fdobridge_: <gfxstrand> S2R R6, SR_INVOCATION_ID
23:14fdobridge_: <gfxstrand> IMAD.IADD R5, R0, 0x1, -R6
23:14fdobridge_: <gfxstrand> IMAD.SHL.U32 R0, R6, 0x4, RZ
23:14fdobridge_: <gfxstrand> LEA.HI.SX32 R4, R0, R5, 0x1e
23:14fdobridge_: <gfxstrand> ```
23:22HdkR: 🤌
23:38Lyude: ...oh right. i see why asahi accesses the registration in such a weird way now :(
23:49Lyude: reg gets moved into the device data, so they then extract it again from data
23:52Lyude: dakr: ^ I think I was right, we do need the rest of RevocableMutex so we can actually access the data inside it
23:54fdobridge_: <gfxstrand> I literally don't know what half those instructions mean and none of them are documented publicly. 😭
23:55fdobridge_: <gfxstrand> I mean, I can translate to fuzzy English but not C.
23:56fdobridge_: <gfxstrand> I think the first one is just `R0 - R6` (edited)