21:57dwlsalmeida[d]: avhe[d]: if you feel like it, would you be able to comment a bit on my HEVC implementation?
21:57dwlsalmeida[d]: https://gitlab.freedesktop.org/dwlsalmeida/mesa/-/commits/nvk-vulkan-video-h265
21:57dwlsalmeida[d]: This actually decodes some files perfectly
21:58dwlsalmeida[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1324497178667909254/Screenshot_2025-01-02_at_18.58.32.png?ex=67785d9f&is=67770c1f&hm=e2d5762e24f552dfc273ff3be64571220d51cad0d557a842d763b1d7f2c94236&
21:58dwlsalmeida[d]: Others (the majority) are just completely broken
21:59dwlsalmeida[d]: I ran out of ideas after fiddling with everything I could think of, naturally the picture parameters are identical between us and the blob
22:00dwlsalmeida[d]: I noticed that the SAO and BSD buffers have to be co-located with the filter buffer, but that doesn't fix it either. Tried hardcoding the values for the offsets, but no dice still
22:01dwlsalmeida[d]: It seems like the card doesn't process the bitstream at all, because `nvdec_error_s` reports 0 macroblocks currently decoded, but also 0 failed
22:02dwlsalmeida[d]: I dumped the bitstream we're submitting, and it's the same thing the blob is submitting
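(Editor's aside: the dump comparison described above can be sketched as a byte-level diff. This is only a minimal illustration and not the tooling used in the conversation; the function name `diff_dumps` and its `limit` parameter are hypothetical, and it assumes both dumps are already available as raw bytes.)

```python
from itertools import zip_longest


def diff_dumps(ours: bytes, theirs: bytes, limit: int = 16):
    """Return up to `limit` mismatches as (offset, our_byte, their_byte).

    Hypothetical helper: a byte missing from the shorter dump is
    reported as None, so a length mismatch also shows up as a difference.
    """
    mismatches = []
    for offset, (a, b) in enumerate(zip_longest(ours, theirs)):
        if a != b:
            mismatches.append((offset, a, b))
            if len(mismatches) >= limit:
                break  # stop early; the first few offsets are usually enough
    return mismatches
```

(An empty result means the two dumps are byte-identical; the same check could be applied to other submitted buffers, not only the bitstream.)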
22:03avhe[d]: ooh your method of computing the sw_hdr_skip_len is pretty clever
22:03dwlsalmeida[d]: oh thanks 😄
22:03avhe[d]: anyway i'll give it a look tomorrow
22:04avhe[d]: if you post a problematic bitstream i can confirm whether or not it decodes correctly on my tegra impl
22:06dwlsalmeida[d]: e.g.:
22:06dwlsalmeida[d]: https://www.itu.int/wftp3/av-arch/jctvc-site/bitstream_exchange/draft_conformance/HEVC_v1/AMP_A_Samsung_7.zip
22:06dwlsalmeida[d]: its "sibling" decodes just fine, i.e.:
22:06dwlsalmeida[d]: https://www.itu.int/wftp3/av-arch/jctvc-site/bitstream_exchange/draft_conformance/HEVC_v1/AMP_B_Samsung_7.zip
22:09dwlsalmeida[d]: or https://www.itu.int/wftp3/av-arch/jctvc-site/bitstream_exchange/draft_conformance/HEVC_v1/ipcm_D_NEC_3.zip
22:09dwlsalmeida[d]: where it is clear that there's some bad address being computed somewhere, because you can clearly see the right thing in the frame, with some extra garbage
22:10dwlsalmeida[d]: This intratop buffer is still a mystery. I tried mmapping it with the tracer, but that segfaults, so I am just assuming this is some opaque buffer passed to the gpu
22:15avhe[d]: yeah it decodes fine on my switch
22:15avhe[d]: have you checked that the auxiliary metadata buffers (eg. tile sizes etc) match with the blob?
22:16dwlsalmeida[d]: tiles match, yes
22:16dwlsalmeida[d]: besides, AMP_A and AMP_B have the same tile sizes, so that can't be the issue
22:16dwlsalmeida[d]: scaling lists are disabled, but anyways the blob has that zeroed out
22:22dwlsalmeida[d]: ahuillet: `nvdec_error_s` reports `error_code: 5`, I wonder if you know what the two bits mean this time?
22:24airlied[d]: any vm faults?
22:32dwlsalmeida[d]: Nope
22:32dwlsalmeida[d]: Dmesg is clear
22:52airlied[d]: just fyi addreses is spelt addresses 🙂
23:25dwlsalmeida[d]: I noticed this typo too, I managed to write the thing right the second time around and then it wouldn’t compile
23:26dwlsalmeida[d]: I was too lazy to change the other five occurrences to the right spelling so I just kept going with the wrong one lol