00:03 fdobridge: <e​sdrastarsis> maybe NV_ERR_NOT_READY?
00:30 fdobridge: <a​irlied> seems like it
00:31 fdobridge: <a​irlied> I wonder should we retry in that instance
01:54 Liver_K: Where does fdobridge bridge to?
01:55 fdobridge: <e​sdrastarsis> Discord
06:13 fdobridge: <airlied> okay I've written the code to in theory decode an h264 I-frame, in practice the gpu just kills my channel 😛
06:20 fdobridge: <o​rowith2os> `dbg!()` time? :v
06:54 udoprog: So IIUC, GSP support using nvidia firmware is landing in 6.7, so I decided to take one of the release candidates and nouveau for a spin. Once booted into the system, how do I check if it's working?
06:58 airlied: you have to install the firmware from linux-firmware and then, if you have a pre-Ada card, add nouveau.config=NvGspRm=1 to the kernel command line
07:00 udoprog: And after that? How can I check if the GSP is working? GPU clock speeds? I looked through the patch and I didn't immediately spot any kernel diagnostics being emitted once it boots I could look for.
07:02 airlied: yeah, there isn't anything too obvious; some BIOS parsing goes away
07:05 udoprog: all right, thanks
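A condensed sketch of the setup airlied describes above; the firmware path follows the usual linux-firmware layout, and the grep is only a heuristic since, as noted, there is no explicit "GSP enabled" message:

```shell
# Enable GSP-RM on a pre-Ada card (per the discussion above).
# Firmware blobs must be installed from linux-firmware first:
ls /lib/firmware/nvidia/*/gsp/

# Either boot with this on the kernel command line:
#   nouveau.config=NvGspRm=1
# or pass the same option when loading the module:
modprobe nouveau config=NvGspRm=1

# There is no single "GSP is on" message; one hint is that the usual
# VBIOS-parsing chatter disappears from the nouveau boot log:
dmesg | grep -iE 'gsp|nouveau' | head
```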
08:06 fdobridge: <a​irlied> okay, not hanging the decode engine, and I can see some red in the output so something is decoding
15:13 fdobridge: <g​fxstrand> Doing this with NVK or VAAPI?
15:27 fdobridge: <!​DodoNVK (she) 🇱🇹> Doesn't matter if zink supports VA-API :frog_pregnant:
17:02 fdobridge: <!​DodoNVK (she) 🇱🇹> Does anyone have a TU116 GPU? I want you to do some nouveau GSP testing 🐸
17:16 udoprog: Oh, is there a Discord server I could join instead? Just saw fdobridge.
17:27 DodoGTA: udoprog: https://discord.gg/cfByQFvg2q
17:28 udoprog: <3
17:40 fdobridge: <u​doprog> So let's say I want to start working on some problems I'm having right now with nouveau. The first thing I was pondering was how to set up a development environment. My idea so far has been to do PCI passthrough and work with a virtual machine so that the host environment can stay up while the driver is being reloaded. Does this work or are there other suggestions for how to do this? Like a devbox with a kvm switch?
17:45 fdobridge: <g​fxstrand> PSA: I just rebased the sm50 branch and ran rustfmt on the whole thing. Outstanding MRs may need to be rebased. I pulled in @dwlsalmeida's and @marysaka's first, though.
17:46 fdobridge: <g​fxstrand> Are you wanting to hack on userspace or kernel? If userspace, the kernel is fairly stable on new GPUs and you shouldn't need the VM.
17:46 fdobridge: <g​fxstrand> As long as you're running a recent kernel
17:52 fdobridge: <u​doprog> Both probably, right now I want to investigate a warn I get in the kernel and the broader usecase of dmabuf passing in pipewire. But I also just want to conveniently boot up a bleeding edge / patched kernel on a separate environment. (edited)
17:56 fdobridge: <g​fxstrand> Yeah, a VM with passthrough should work for that, I think.
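For reference, a minimal VFIO sketch of the passthrough approach discussed above; the PCI address `0000:01:00.0` is a placeholder, and this assumes the card sits alone in its IOMMU group:

```shell
# Find the GPU's PCI address and [vendor:device] ID:
lspci -nn | grep -i nvidia

# Detach it from its current driver and hand it to vfio-pci
# (substitute the real address for the placeholder):
modprobe vfio-pci
echo "0000:01:00.0" > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
echo "vfio-pci"     > /sys/bus/pci/devices/0000:01:00.0/driver_override
echo "0000:01:00.0" > /sys/bus/pci/drivers/vfio-pci/bind

# The device can now be given to the VM,
# e.g. qemu -device vfio-pci,host=01:00.0
```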
17:56 fdobridge: <o​rowith2os> for userspace, containers? :)
17:57 fdobridge: <o​rowith2os> lets you mess around a *lot* without worry of messing anything up
17:57 fdobridge: <o​rowith2os> distrobox would let you get up and running fairly quickly, feels almost like a bare metal distro
17:57 fdobridge: <udoprog> Yeah, sounds like an idea. Is there a way to conveniently build the relevant userspace parts from source into a container?
17:58 fdobridge: <g​fxstrand> That won't work if you want to monkey with kernels.
17:58 fdobridge: <o​rowith2os> hence why I said, for userspace
17:59 fdobridge: <g​fxstrand> For userspace, just build mesa in a directory and use `meson devenv`
17:59 fdobridge: <o​rowith2os> would also be helpful, since you can move the tarball of the container into a VM if you need kernel shenanigans
17:59 fdobridge: <g​fxstrand> No need to deal with whole containers unless you're trying to test the whole x/wayland/whatever stack with a hacked up driver.
18:00 fdobridge: <u​doprog> Neat, so that just sets environment variables in a shell to point towards build-from-source mesa?
18:00 fdobridge: <o​rowith2os> bingo
18:02 fdobridge: <o​rowith2os> I find it helps when I need to dev *anything*, especially considering I like to run immutable systems and having the build deps of.... everything on any host system feels icky
18:03 fdobridge: <u​doprog> All right, that should work for userspace, cheers!
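The `meson devenv` flow mentioned above, roughly; the `-D` option values are from memory, so check Mesa's meson_options.txt for the current names:

```shell
# Build Mesa in a local directory, nothing installed system-wide:
git clone https://gitlab.freedesktop.org/mesa/mesa.git
cd mesa
meson setup build/ -Dvulkan-drivers=nouveau -Dgallium-drivers=nouveau
ninja -C build/

# Spawn a shell whose environment (VK_ICD_FILENAMES, LD_LIBRARY_PATH, ...)
# points at the build tree, so programs run from it pick up the fresh driver:
meson devenv -C build/
```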
18:03 fdobridge: <o​rowith2os> and I can change envs on a dime, so if something doesn't like Arch, Fedora, or Ubuntu, I can move to what I need
18:03 fdobridge: <g​fxstrand> That's fine if that's your jam. I avoid containers like the plague unless they solve a very real problem for me like crazy ruby deps or something.
18:03 fdobridge: <clangcat> Yup, you can also do it for Vulkan if you want to work on Vulkan drivers too.
18:03 fdobridge: <!​DodoNVK (she) 🇱🇹> Crazy Ruby deps?
18:04 fdobridge: <o​rowith2os> *looks at ten container envs currently running*
18:04 fdobridge: <g​fxstrand> I have a container just for building the Vulkan spec because it uses recent asciidoctor + extensions which no one actually has in their distro.
18:04 fdobridge: <g​fxstrand> But if I'm just hacking on Mesa? Nah, stock Linux (any distro) is fine.
18:10 fdobridge: <clangcat> Mesa's build system is very nice; for the most part it just works.
18:11 fdobridge: <u​doprog> So I just want to check, recently (last 5 years or so maybe?) I've not been able to unload nvidia and load the nouveau kernel modules cleanly on a running system. It seems to lead to stuff like the gpu hanging. Is that supposed to work?
18:14 fdobridge: <clangcat> It probably should, as I know Optimus laptops can do it (not sure how well it works). I think they use a tool called bbswitch; no idea if that'll work in your situation.
18:15 fdobridge: <clangcat> Though can't say for sure as I just use Nouveau because I don't care about performance.
18:18 fdobridge: <udoprog> Well, if I can get it to work I was thinking of just doing a multi-seat setup, with the environment I'm working on running on my integrated gpu.
18:19 fdobridge: <udoprog> I've had difficulties getting PCI passthrough to work on my current mobo, and there's no space in my chassis to move the current card around much for trial and error.
18:20 fdobridge: <clangcat> Yeah, I can't do it with my motherboard, at least not the initial way I planned, as my GPUs are all in one IOMMU group. There is probably still a way to do it but I don't care too much tbh.
18:23 fdobridge: <u​doprog> Similar issue. It's a pain.
18:24 fdobridge: <clangcat> Depends on use case I guess. I don't care a lot, I just do it for OS-dev sometimes.
18:34 fdobridge: <p​ac85> Uh yeah, iirc they provide one for building it.
18:34 fdobridge: <pac85> Without it, it needs tons of Ruby stuff and some Java thing as well IIRC
18:34 fdobridge: <p​ac85> <https://github.com/KhronosGroup/Vulkan-Docs/blob/main/scripts/runDocker> (edited)
18:49 fdobridge: <a​irlied> NVK so far, though ffmpeg has some ext reqs we aren't providing yet, so just using the simple CTS test
18:50 fdobridge: <g​fxstrand> What exts?
18:52 fdobridge: <!​DodoNVK (she) 🇱🇹> What's the current status of VA-API on zink? :frog_gears:
18:53 fdobridge: <a​irlied> Don't know, it could also be a vulkan version problem
18:54 fdobridge: <!​DodoNVK (she) 🇱🇹> Feel free to use my hacks then :triangle_nvk:
18:55 fdobridge: <a​irlied> It failed to start saying timeline semaphores weren't supported, and when I started debugging it I got lost in some getphysicaldevicefeatures mess
18:56 fdobridge: <a​irlied> Like the loader was calling gpdf1 instead of gpdf2
18:56 fdobridge: <a​irlied> Vaapi on zink needs Vulkan exts to do it properly
18:57 fdobridge: <!​DodoNVK (she) 🇱🇹> Which is obvious (how well does it work with the current Vulkan video extensions though?)
19:01 fdobridge: <!​DodoNVK (she) 🇱🇹> Timeline semaphores are a hard requirement for FFmpeg Vulkan (it's checking for the Vulkan 1.2 feature though so you should use my Vulkan 1.3 exposure hacks)
19:04 fdobridge: <!​DodoNVK (she) 🇱🇹> I just saw your Vulkan video testing branch for NVK so I guess I can try hanging the GPU myself :cursedgears:
19:09 fdobridge: <a​irlied> Yeah I hacked NVK to 1.3. But it probably needs a bit more than that, maybe today, after I write P frames
19:11 fdobridge: <!​DodoNVK (she) 🇱🇹> I wonder where I can get the experimental kernel changes though (one commit mentions kernel changes as a requirement)
19:16 fdobridge: <airlied> Yeah, I haven't pushed it anywhere yet since it's just a hack to get me going; it needs actual thought applied. I'll post it in a bit
19:19 fdobridge: <g​fxstrand> I'm doing a Vulkan 1.1 run now. It might be time to turn it on.
19:19 fdobridge: <g​fxstrand> At least for Turing+
19:21 fdobridge: <g​fxstrand> Unfortunately, I can't do an actual CTS submission because I can't get through a run with GSP. 😭
19:22 fdobridge: <!​DodoNVK (she) 🇱🇹> Is it because of some race conditions?
19:23 fdobridge: <g​fxstrand> No, something appears messed up with sync objects.
19:23 fdobridge: <g​fxstrand> IDK what
19:24 fdobridge: <gfxstrand> I get to the synchronization semaphore import/export tests and it just locks up.
19:24 fdobridge: <gfxstrand> It's totally reproducible, too.... with a full CTS run. 😭
19:24 fdobridge: <g​fxstrand> I haven't narrowed it down at all.
19:26 fdobridge: <a​irlied> Nothing in dmesg?
19:26 fdobridge: <a​irlied> Like the interaction of gsp and syncobjs is quite minimal
19:27 fdobridge: <g​fxstrand> Nothing instructive
19:27 fdobridge: <g​fxstrand> It's been a bit since I've tried to do a full run, though.
19:27 fdobridge: <a​irlied> Have you got a branch? I can give it a run on one of my machines
19:28 fdobridge: <!​DodoNVK (she) 🇱🇹> GSP is a mess (TU117 is suffering from GR class errors)
19:29 fdobridge: <g​fxstrand> nvk/conformance
19:31 fdobridge: <a​irlied> Is there an issue filed?
19:32 fdobridge: <!​DodoNVK (she) 🇱🇹> Now that I have another person reproducing the issue with their TU117 on a 6.7 rc kernel I can definitely open a bug for that
19:33 fdobridge: <!​DodoNVK (she) 🇱🇹> Would some kernel parameters help give extra information?
19:33 fdobridge: <a​irlied> Yes but file the issue first then can work out what is best
19:34 fdobridge: <a​irlied> I'm sure I have tu117 somewhere
19:39 fdobridge: <g​fxstrand> What does the LDC subop do?
19:41 fdobridge: <g​fxstrand> Hrm... Maybe it affects how the immediate (if any) is interpreted?
19:43 fdobridge: <k​arolherbst🐧🦀> it does
19:43 fdobridge: <k​arolherbst🐧🦀> it controls how the overflow into the next cb works
19:44 fdobridge: <k​arolherbst🐧🦀> more even
19:44 fdobridge: <k​arolherbst🐧🦀> only the default one works with bindless ubos btw
19:45 fdobridge: <g​fxstrand> Ah
19:46 fdobridge: <g​fxstrand> I don't think I care about one CB flowing into another
19:46 fdobridge: <g​fxstrand> At least not right now
19:46 fdobridge: <k​arolherbst🐧🦀> `.IL` basically treats overflows as the next cb
19:46 fdobridge: <k​arolherbst🐧🦀> `.IS` splits the input into two 16 bit values and the sum of the bottom can't overflow the index
19:47 fdobridge: <karolherbst🐧🦀> `.ISL` same as `.IS`, except the cb index is also checked against a limit of 14?
19:48 fdobridge: <k​arolherbst🐧🦀> mhhh
19:50 fdobridge: <k​arolherbst🐧🦀> yeah.. the overflow really only matters for real edge case indirects where it's a bit faster
19:50 fdobridge: <a​irlied> that branch has a revert 1.1 on top, should I drop that, or will it die regardless?
19:50 fdobridge: <k​arolherbst🐧🦀> I think it matters more for robustness and what value needs to be returned for an OOB access
19:51 fdobridge: <k​arolherbst🐧🦀> OOB on constant buffers generally return 0, unless they overflow into the next index
19:51 fdobridge: <g​fxstrand> It'll probably work with 11
19:51 fdobridge: <g​fxstrand> But go ahead and keep it at 1.0 because that's what I think I did for my last run
19:51 fdobridge: <g​fxstrand> And it'll be faster
19:52 fdobridge: <a​irlied> so it hangs with just run-deqp or do I need to do a proper submission run?
19:52 fdobridge: <!​DodoNVK (she) 🇱🇹> The entire NVK trio is active right now :triangle_nvk:
19:53 fdobridge: <a​irlied> https://paste.centos.org/view/raw/ac20f769 is the kernel hack
19:53 fdobridge: <g​fxstrand> Proper submission run
19:54 fdobridge: <g​fxstrand> It takes like 2-3 hours before it dies. 😭
19:54 fdobridge: <airlied> cries into single thread
19:55 fdobridge: <!​DodoNVK (she) 🇱🇹> Literally a one-line change (totally upstreamable /s)
19:56 fdobridge: <o​rowith2os> is it not possible to skip ahead to the bits you want?
19:56 fdobridge: <o​rowith2os> then when it doesn't die from there, restart from the beginning
19:56 fdobridge: <gfxstrand> The group of tests it dies on works fine by itself. I haven't spent the time to bisect a minimal precursor set.
19:57 fdobridge: <o​rowith2os> I see
19:57 fdobridge: <k​arolherbst🐧🦀> tried increasing the test group size with deqp-runner?
19:59 fdobridge: <g​fxstrand> All of dEQP-VK.synchronization.* passes
19:59 fdobridge: <g​fxstrand> So it needs to be bigger groups than that
20:02 fdobridge: <k​arolherbst🐧🦀> I mean.. deqp-runner kinda runs tests in random order, no? So my hope is that with bigger sizes (maybe 10k?) it makes it more likely to hit the issue through it
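Something like the following, assuming a reasonably recent deqp-runner; the caselist path and binary location are placeholders:

```shell
# Run the CTS with much larger per-process groups (default is 500), so more
# state accumulates in each dEQP instance and an ordering-dependent hang
# becomes likelier to reproduce:
deqp-runner run \
    --deqp ./deqp-vk \
    --caselist vk-default.txt \
    --output results/ \
    --tests-per-group 10000
```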
20:04 fdobridge: <g​fxstrand> Unrelated: Why does `ldc.u16` throw a misaligned addr error for things aligned to 2B? 🤡
20:05 fdobridge: <k​arolherbst🐧🦀> sure the offset is aligned to 2b?
20:05 fdobridge: <g​fxstrand> fairly
20:06 fdobridge: <k​arolherbst🐧🦀> is it a negative number?
20:07 fdobridge: <k​arolherbst🐧🦀> mhh though I think the base address from the reg is treated as unsigned...
20:12 fdobridge: <g​fxstrand> Not as far as I know
20:13 fdobridge: <g​fxstrand> My 8/16-bit branch is failing exactly one test. 🤡
20:13 fdobridge: <g​fxstrand> Oh, it's in a loop... I wonder if this is somehow helper-related. 🤔
20:24 fdobridge: <a​irlied> Test case 'dEQP-VK.memory.allocation.basic.size_4KiB.forward.count_4000'..
20:24 fdobridge: <a​irlied> MESA: error: ../src/nouveau/vulkan/nvk_device_memory.c:212: VK_ERROR_OUT_OF_DEVICE_MEMORY
20:24 fdobridge: <a​irlied> ResourceError (res: VK_ERROR_OUT_OF_DEVICE_MEMORY at vktMemoryAllocationTests.cpp:505)
20:24 fdobridge: <a​irlied> bleh
20:32 fdobridge: <g​fxstrand> `ldc.u8` seems to handle unaligned things fine. 🤡
20:32 fdobridge: <g​fxstrand> Does `ldc.u16` secretly require 4B alignment?
20:32 fdobridge: <g​fxstrand> That would be silly
20:32 fdobridge: <g​fxstrand> But also entirely believable.
20:43 fdobridge: <a​irlied> I plugged in a tu117, seems as fine as any other card with GSP here
20:44 fdobridge: <a​irlied> display works, parallel deqp run isn't dying in flames
20:52 fdobridge: <k​arolherbst🐧🦀> not afaik
20:53 fdobridge: <k​arolherbst🐧🦀> have you double checked with the nvidia disassembler?
20:55 fdobridge: <k​arolherbst🐧🦀> mhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
20:55 fdobridge: <k​arolherbst🐧🦀> @gfxstrand it sounds like that the offset needs to be 4 byte aligned
20:56 fdobridge: <k​arolherbst🐧🦀> but I think that's implied by the encoding?
20:56 fdobridge: <k​arolherbst🐧🦀> like the two lower bits just don't exist in the offset or something?
20:56 fdobridge: <k​arolherbst🐧🦀> maybe I misremember that
20:57 fdobridge: <k​arolherbst🐧🦀> though I think `LDC` has the full 16 bits?
20:57 fdobridge: <k​arolherbst🐧🦀> yeah...
20:57 fdobridge: <k​arolherbst🐧🦀> `LDC` doesn't have that restriction
20:59 fdobridge: <!​DodoNVK (she) 🇱🇹> How many rays could NVK trace before causing a GPU hang once RT support gets implemented? 🔦
21:36 fdobridge: <g​fxstrand> Well, it doesn't have an encoding restriction but it sure seems to require 4B alignment internally.
21:36 fdobridge: <g​fxstrand> If I split all the way to `ldc.u8` it handles unaligned things just fine.
21:41 fdobridge: <k​arolherbst🐧🦀> weird...
21:42 fdobridge: <k​arolherbst🐧🦀> are you sure your encoding is correct?
21:42 HdkR: Sounds about right. LDC restrictions suck :)
21:43 fdobridge: <g​fxstrand> Pretty sure. The disassembler gives me the right thing back, anyway.
21:43 fdobridge: <k​arolherbst🐧🦀> annoying...
21:43 fdobridge: <k​arolherbst🐧🦀> let me craft some CL kernel and see what happens
21:44 fdobridge: <g​fxstrand> Yeah... I'm about to give up on `ldc.u16`. like, who needs it anyway? I can AND and shift. (edited)
21:44 fdobridge: <g​fxstrand> It only matters for constant offsets anyway.
21:47 fdobridge: <k​arolherbst🐧🦀> ohh.. so indirects are fine?
21:47 fdobridge: <g​fxstrand> It's one of those things that's super annoying but ultimately doesn't matter once you've decided how to paint the shed.
21:48 fdobridge: <gfxstrand> Nope. Indirects are still busted. I've tried with unaligned+indirect, aligned+indirect, and 0+indirect.
21:48 fdobridge: <!​DodoNVK (she) 🇱🇹> An OpenCL kernel of Linux would be pretty ironic
21:49 fdobridge: <g​fxstrand> UBO indirects just aren't that common and adding 2-3 ALU ops isn't going to kill performance.
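The fallback gfxstrand describes, doing an aligned 32-bit load and extracting the 16-bit half with shift and AND, can be sketched numerically (`word` and `offset` are made-up values):

```shell
# Emulate an unaligned ldc.u16 using the result of an aligned 32-bit load:
word=0xdeadbeef   # value an aligned ldc.u32 would return
offset=2          # byte offset of the u16 within the word (0 or 2)

# Shift the wanted half down and mask to 16 bits:
u16=$(( (word >> (offset * 8)) & 0xFFFF ))
printf '0x%04x\n' "$u16"   # prints 0xdead
```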
22:03 fdobridge: <k​arolherbst🐧🦀> ```
22:03 fdobridge: <k​arolherbst🐧🦀> test:
22:03 fdobridge: <k​arolherbst🐧🦀> /*0000*/ IMAD.MOV.U32 R1, RZ, RZ, c[0x0][0x28] ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0010*/ IMAD.MOV.U32 R6, RZ, RZ, c[0x0][0x178] ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0020*/ ULDC.64 UR4, c[0x0][0x170] ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0030*/ IMAD.MOV.U32 R7, RZ, RZ, 0x1 ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0040*/ SHF.L.U32 R2, R6, 0x1, RZ ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0050*/ IADD3 R0, R2.reuse, c[0x0][0x168], RZ ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0060*/ IADD3 R4, R2, c[0x0][0x160], RZ ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0070*/ LDC.U16.IL R0, c[0x0][R0] ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0080*/ LDC.U16.IL R3, c[0x0][R4] ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*0090*/ IADD3 R5, R0, R3, RZ ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*00a0*/ SHF.L.U64.HI R3, R6, R7, c[0x0][0x17c] ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*00b0*/ STG.E.U16.SYS [R2.64+UR4], R5 ;
22:03 fdobridge: <k​arolherbst🐧🦀> /*00c0*/ EXIT ;
22:03 fdobridge: <k​arolherbst🐧🦀> .L_x_0:
22:03 fdobridge: <k​arolherbst🐧🦀> /*00d0*/ BRA `(.L_x_0);
22:03 fdobridge: <k​arolherbst🐧🦀> /*00e0*/ NOP;
22:03 fdobridge: <k​arolherbst🐧🦀> /*00f0*/ NOP
22:03 fdobridge: <k​arolherbst🐧🦀> ```
22:03 fdobridge: <k​arolherbst🐧🦀> mhh
22:03 fdobridge: <k​arolherbst🐧🦀> let me add an offset and see what changes
22:03 fdobridge: <k​arolherbst🐧🦀> ` /*0080*/ LDC.U16.IL R3, c[0x0][R4+0x2] ;`
22:03 fdobridge: <k​arolherbst🐧🦀> 🤷
22:04 fdobridge: <k​arolherbst🐧🦀> ohhh
22:04 fdobridge: <k​arolherbst🐧🦀> @gfxstrand maybe it works with `.IL`?
22:07 fdobridge: <g​fxstrand> That could be. I can try that after a bit.
22:08 fdobridge: <k​arolherbst🐧🦀> they probably use `.IL` in CL for other reasons though 😄
22:08 fdobridge: <k​arolherbst🐧🦀> actually...
22:08 fdobridge: <k​arolherbst🐧🦀> kinda smart
22:08 fdobridge: <karolherbst🐧🦀> doesn't need to deal with `vec2` pain
22:08 fdobridge: <g​fxstrand> IDK
22:08 fdobridge: <g​fxstrand> Yeah
22:09 fdobridge: <g​fxstrand> And if you don't care about bounds checking...
22:09 fdobridge: <k​arolherbst🐧🦀> well.. the hardware bound checks anyway
22:09 fdobridge: <g​fxstrand> I can poke at the Vulkan blob, too
22:09 fdobridge: <k​arolherbst🐧🦀> hardware returns 0 for any OOB cb access
22:09 fdobridge: <g​fxstrand> Not with .IL, not if you actually have separate bindings.
22:10 fdobridge: <k​arolherbst🐧🦀> does the hardware complain?
22:10 fdobridge: <g​fxstrand> I mean, it'll give you 0 if you run past the end of cb14 or whatever
22:10 fdobridge: <k​arolherbst🐧🦀> yeah, sure
22:10 fdobridge: <k​arolherbst🐧🦀> that's what I meant 😄
22:10 fdobridge: <k​arolherbst🐧🦀> it won't trap I mean
22:14 fdobridge: <a​irlied> okay tu117 made it through a parallel deqp run with about the same amount of damage as any other card, so I don't think there is anything gsp specific wrong with tu117
22:15 fdobridge: <!​DodoNVK (she) 🇱🇹> How about testing SuperTuxKart with zink?
22:16 fdobridge: <!​DodoNVK (she) 🇱🇹> Or DXVK v1.10.0+ (both of these cases trigger an error for me)?
22:19 fdobridge: <!​DodoNVK (she) 🇱🇹> Actually I meant the OpenGL driver (maybe I should get some rest)
22:43 fdobridge: <airlied> Yeah, I just doubt it's a GSP-specific problem; it might just be reporting a problem we miss