12:57avhe[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1325447999899439188/amp_a_samsung.patch?ex=677bd324&is=677a81a4&hm=45ef04a0758451805bfd3b4fca9562be9cf7effc74d493f4f8ac8e66223c8f07&
12:57avhe[d]: dwlsalmeida[d]: this fixes amp_a_samsung for me
12:57avhe[d]: not sure what this is about yet
13:00avhe[d]: it still works when only `log2MaxTransformSkipSize` is set of these
13:12dwlsalmeida[d]: I had that too
13:12dwlsalmeida[d]: Except for log2MaxTransform
13:13dwlsalmeida[d]: I guess I didn’t trace this
13:13avhe[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1325452140331798630/image.png?ex=677bd6ff&is=677a857f&hm=df93a90465b90d5d965bf9702b5c26ed9ab1bf878de609952d09ef6862029bd0&
13:13avhe[d]: i'm just reproducing what i dumped from an nvcuvid run
13:13dwlsalmeida[d]: This is in a struct named 444ext, a bit weird how this is relevant
13:14avhe[d]: yeah
13:14dwlsalmeida[d]: Can you run fluster?
13:14avhe[d]: i can give it a try yeah
13:15dwlsalmeida[d]: I guess I have the formulas for the other two numbers
13:15dwlsalmeida[d]: Let me power on my stuff
13:21avhe[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1325454029005914213/image.png?ex=677bd8c2&is=677a8742&hm=0befc37aef3ef1828a3f359b88bc558b6c37790ee1e1906e3c35e1eeeb0053c1&
13:21avhe[d]: it times out on those two
13:21dwlsalmeida[d]: // nvh265.v3.HevcSliceEdgeOffset =
13:21dwlsalmeida[d]: // ((608 * aligned_h) + (5016 * aligned_h) + (aligned_w * aligned_h))
13:21dwlsalmeida[d]: // >> 8;
13:21avhe[d]: i admit i haven't looked past the first frame
13:21dwlsalmeida[d]: you can increase the timeout with -t
13:22dwlsalmeida[d]: like -t 240
13:22avhe[d]: dwlsalmeida[d]: oh nice, where is that from?
13:22dwlsalmeida[d]: avhe[d]: trial and error basically
13:24dwlsalmeida[d]: The 608 comes from these macros:
13:24dwlsalmeida[d]: // HEVC Filter FG buffer
13:24dwlsalmeida[d]: #define HEVC_DBLK_TOP_SIZE_IN_SB16 ALIGN_UP(632, 128) // ctb16 + 444
13:24dwlsalmeida[d]: #define HEVC_DBLK_TOP_BUF_SIZE(w) NVDEC_ALIGN( (ALIGN_UP(w,16)/16 + 2) * HEVC_DBLK_TOP_SIZE_IN_SB16) // 8K: 1285*256
13:24dwlsalmeida[d]: #define HEVC_DBLK_LEFT_SIZE_IN_SB16 ALIGN_UP(506, 128) // ctb16 + 444
13:24dwlsalmeida[d]: #define HEVC_DBLK_LEFT_BUF_SIZE(h) NVDEC_ALIGN( (ALIGN_UP(h,16)/16 + 2) * HEVC_DBLK_LEFT_SIZE_IN_SB16) // 8K: 1028*256
13:24dwlsalmeida[d]: if you pass `1600`, which is the height for AMP_A, into them, and then sum the two values, you get that
13:26dwlsalmeida[d]: I had the other two values in a python session somewhere, which I seem to have killed unfortunately
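A rough Python transcription of the macros quoted above, for anyone who wants to plug numbers in. `NVDEC_ALIGN` is not defined anywhere in this log, so the 256-byte alignment used here is an assumption, and the function names are made up for illustration. The sketch only checks the "8K" figures from the macro comments (taking 8K to mean 8192 for both width and height).

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


# ASSUMPTION: NVDEC_ALIGN is not shown in the log; 256 bytes is a guess.
NVDEC_ALIGNMENT = 256


def nvdec_align(value: int) -> int:
    return align_up(value, NVDEC_ALIGNMENT)


# Per-SB16 sizes, exactly as in the quoted macros.
HEVC_DBLK_TOP_SIZE_IN_SB16 = align_up(632, 128)   # 640
HEVC_DBLK_LEFT_SIZE_IN_SB16 = align_up(506, 128)  # 512


def hevc_dblk_top_buf_size(w: int) -> int:
    """Mirror of HEVC_DBLK_TOP_BUF_SIZE(w)."""
    return nvdec_align((align_up(w, 16) // 16 + 2) * HEVC_DBLK_TOP_SIZE_IN_SB16)


def hevc_dblk_left_buf_size(h: int) -> int:
    """Mirror of HEVC_DBLK_LEFT_BUF_SIZE(h)."""
    return nvdec_align((align_up(h, 16) // 16 + 2) * HEVC_DBLK_LEFT_SIZE_IN_SB16)


if __name__ == "__main__":
    # The macro comments say "8K: 1285*256" and "8K: 1028*256".
    print(hevc_dblk_top_buf_size(8192) == 1285 * 256)
    print(hevc_dblk_left_buf_size(8192) == 1028 * 256)
    # Values for the 1600 mentioned in the chat.
    print(hevc_dblk_top_buf_size(1600), hevc_dblk_left_buf_size(1600))
```

With 8192 as input, both results are already multiples of 256, so the assumed `NVDEC_ALIGN` value does not affect the 8K check.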
13:28avhe[d]: i'll try to find how they are calculated on tegra
13:43__ad: could the "job timeout, channel 24 killed", on ada lovelace, be an issue of nouveau ?
13:44__ad: i am getting this on wayland, with whatever compositor
13:45__ad: i suspect a hw issue with my laptop
13:57dwlsalmeida[d]: __ad: Probably userspace is to blame somehow
14:00avhe[d]: if i'm reading this right, on tegra they do something like this
14:00avhe[d]: ```width_aligned = align(width, 64), height_aligned = align(height, 64)
14:00avhe[d]: num_px = width_aligned * height_aligned
14:00avhe[d]: if (num_px > 0x220000)
14:00avhe[d]: a = width_aligned >> 5, b = height_aligned >> 5;
14:00avhe[d]: else
14:00avhe[d]: a = width_aligned >> 4, b = height_aligned >> 4;
14:00avhe[d]: HevcFltAboveOffset = (height_aligned * 0x260 >> 8) + (height_aligned * 0x1300 >> 8) + (height_aligned * 0x4c >> 8)
14:00avhe[d]: HevcSaoAboveOffset = HevcFltAboveOffset + ((b + a * b) * 0xb00 + 0xff >> 8)
14:00avhe[d]: HevcSliceEdgeOffset = HevcSaoAboveOffset + (a * b * 3)```
14:02dwlsalmeida[d]: does this work out to the same values?
14:02avhe[d]: didn't try but for HevcSliceEdgeOffset not
14:02avhe[d]: since on cuvid we had the same value as HevcSaoAboveOffset
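The Tegra pseudocode above can be transcribed to Python as a minimal sketch. The function name is hypothetical, and the 2560x1600 input is only an illustrative choice. Parentheses are added where the pseudocode relies on `>>` binding more loosely than `+` and `*`. As noted in the chat, these results are not expected to match the values dumped from cuvid.

```python
def align(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def tegra_hevc_offsets(width: int, height: int) -> tuple[int, int, int]:
    """Return (HevcFltAboveOffset, HevcSaoAboveOffset, HevcSliceEdgeOffset)
    following the pseudocode from the chat (hypothetical helper)."""
    width_aligned = align(width, 64)
    height_aligned = align(height, 64)
    num_px = width_aligned * height_aligned

    # Larger frames use 32-pixel units, smaller ones 16-pixel units.
    shift = 5 if num_px > 0x220000 else 4
    a = width_aligned >> shift
    b = height_aligned >> shift

    flt_above = (
        ((height_aligned * 0x260) >> 8)
        + ((height_aligned * 0x1300) >> 8)
        + ((height_aligned * 0x4C) >> 8)
    )
    # "+ 0xff >> 8" rounds the byte count up to a whole 256-byte unit.
    sao_above = flt_above + (((b + a * b) * 0xB00 + 0xFF) >> 8)
    slice_edge = sao_above + a * b * 3
    return flt_above, sao_above, slice_edge


if __name__ == "__main__":
    print(tegra_hevc_offsets(2560, 1600))  # (34675, 79225, 91225)
```

Note that `0x260` is 608 in decimal, the same constant that appears in the `HevcSliceEdgeOffset` formula earlier in the log.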
14:03dwlsalmeida[d]: let's try this
14:03dwlsalmeida[d]: ((624 * 1600) + (5016 * 1600) + (2560 * 1600)) >> 8;
14:03dwlsalmeida[d]: 51250
14:03dwlsalmeida[d]: so 624 instead of 608
14:05__ad: dwlsalmeida[d]: thanks a lot
14:07avhe[d]: yeah none of those match but they are in the same ballpark
14:09dwlsalmeida[d]: 35150 would be `((608*1600) + (5016*1600)) >>8`
14:10dwlsalmeida[d]: well, the numbers I gave match exactly, at least for this particular video, not sure how they will work in general
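The two results quoted above can be reproduced with a short Python sketch. The function names are made up for illustration, and the log does not say which register the 35150 value belongs to, so it is left unnamed here. As dwlsalmeida says, the constants were found by trial and error for this one video and may not hold in general.

```python
def three_term_offset(aligned_w: int, aligned_h: int, first_coeff: int = 624) -> int:
    """((c * h) + (5016 * h) + (w * h)) >> 8, the shape of the
    HevcSliceEdgeOffset formula in the chat, with c = 624 or 608."""
    return ((first_coeff * aligned_h) + (5016 * aligned_h) + (aligned_w * aligned_h)) >> 8


def two_term_offset(aligned_h: int, first_coeff: int = 608) -> int:
    """((c * h) + (5016 * h)) >> 8, the expression quoted for 35150."""
    return ((first_coeff * aligned_h) + (5016 * aligned_h)) >> 8


if __name__ == "__main__":
    print(three_term_offset(2560, 1600, 624))  # 51250
    print(three_term_offset(2560, 1600, 608))  # 51150
    print(two_term_offset(1600, 608))          # 35150
```

Swapping 608 for 624 in the three-term expression changes the result by exactly 100 at this height, since `(16 * 1600) >> 8` is 100.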
14:22dwlsalmeida[d]: btw, I noticed that if you pass an out of bounds value in either of `HevcFltAboveOffset` and `HevcSaoAboveOffset` you get an mmu fault, so these values are indeed being used to index into the filter buffer despite the `main10_444` nomenclature
14:23dwlsalmeida[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1325469832736669798/Screenshot_2025-01-05_at_11.23.28.png?ex=677be77a&is=677a95fa&hm=ab3dd41f5d687891947827d764cbf464e32cd70f92a9dbe75dac144857415cd0&
14:23dwlsalmeida[d]: hooray!!!!!!!! this **finally** decodes!!!!!!!!!!!!!!!!!!!!!!!!!!!!!1
14:40dwlsalmeida[d]: avhe[d]: hey thanks for figuring this out
14:41avhe[d]: no problem
14:44dwlsalmeida[d]: ok, 100/147 with the fixes, so this is almost working
14:45dwlsalmeida[d]: considering that we don't have 10bit support because, among other things, format negotiation is broken on all vulkan decoders in gstreamer
19:12airlied[d]: __ad: No most likely not a hw issue, just nouveau
22:51gfxstrand[d]: dwlsalmeida[d]: Congrats! 🎉
22:53Jasper[m]: Would it in theory be possible to run nvk on older Tegra devices? (X1 and X2) I'm wondering if this could circumvent some issues I'm having with Nouveau proper
23:14airlied: nouveau proper? do you mean the kernel module?
23:14airlied: Jasper[m]: what issues are you having with nouveau proper? nvk is just a vulkan driver
23:14airlied: and no nvk doesn't work on tegra at all yet
23:15Jasper[m]: I was mostly talking about the opengl driver, but honestly I don't know where the issue I'm having lies.
23:15Jasper[m]: Bummer about it not working though, but I understand.
23:17Jasper[m]: I talked about fetching logs before, but for whatever reason I cannot even get to the point of loading a UI on pmOS anymore. I'm currently trying to get that fixed before I try to get logs
23:44magic_rb[d]: Jasper[m]: my recommendation is to screw the UI and go over serial/usb gadget. The UI might not be coming up due to the exact driver problem you're trying to debug
23:45magic_rb[d]: (Side note, it's hilarious how messy this room is from a bridging perspective. It's a discord room bridged to IRC which is bridged to Matrix i assume. Me, I'm accessing it through a discord bridge via matrix)
23:46magic_rb[d]: (So in the worst(best?) case a message can go through 3 different bridges before it gets to me, which is truly amazing)
23:52Jasper[m]: Hahaha interesting, yes I'm accessing from a matrix->irc bridge
23:52Jasper[m]: I have UART, I can access the device over a usb keyboard/eth connection even, but I don't think there were any issues related to nouveau
23:53Jasper[m]: Just an fb_set_par error, gonna grab it, one sec
23:54Jasper[m]: `detected fb_set_par error, error code: -16`, basically this spammed a billion times, then I get dropped into a tty
23:55Jasper[m]: sddm complains that it can't take over tty1 and keeps retrying until failure state, lightdm used to work but now doesn't anymore
23:55Jasper[m]: It's interesting
23:56airlied: definitely a kernel bug then