00:01 dwlsalmeida[d]: `ffmpeg[3104]: channel 40 killed!` This channel number, does that mean anything?
00:10 karolherbst[d]: not really
00:33 dwlsalmeida[d]: yeah tried debugging this the whole day:
00:33 dwlsalmeida[d]: [ 1062.724405] nouveau 0000:01:00.0: gsp: rc engn:00000013 chid:32 type:68 scope:1 part:233
00:33 dwlsalmeida[d]: [ 1062.724411] nouveau 0000:01:00.0: fifo:6606c307:0004:0020:[ffmpeg[5990]] errored - disabling channel
00:33 dwlsalmeida[d]: [ 1062.724414] nouveau 0000:01:00.0: ffmpeg[5990]: channel 32 killed!
00:33 dwlsalmeida[d]: Coming up short, really
00:33 dwlsalmeida[d]: I wish I knew what `type:68` means
00:36 airlied[d]: still just submitting the same thing as my branch?
00:38 dwlsalmeida[d]: ^ yours used to work, now I get all zeroes and this error, mine should be almost identical, yet that returns -ENODEV in queue submit
00:38 dwlsalmeida[d]: (used to work when I tested that 1yr ago, but tiling was broken)
00:53 dwlsalmeida[d]: avhe[d]: are you still around? I see you managed to dump the picture parameters from the blob once
01:03 dwlsalmeida[d]: can these be read somehow?
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_DBF_MC_ERRINT (0xDEC3001A)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_DBF_MC_IQT_ERRINT (0xDEC3001B)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_DBF_REC_ERRINT (0xDEC3001C)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_DBF_REC_IQT_ERRINT (0xDEC3001D)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_DBF_REC_MC_ERRINT (0xDEC3001E)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_DBF_REC_MC_IQT_ERRINT (0xDEC3001F)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_PICTURE_INIT (0xDEC30100)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_STATEMACHINE_FAILURE (0xDEC30101)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_INVALID_CTXID_PIC (0xDEC30901)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_INVALID_CTXID_UCODE (0xDEC30902)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_INVALID_CTXID_FC (0xDEC30903)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_INVALID_CTXID_SLH (0xDEC30904)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_INVALID_UCODE_SIZE (0xDEC30905)
01:03 dwlsalmeida[d]: +#define NVC5B0_DEC_ERROR_H264_INVALID_SLICE_COUNT
01:05 airlied[d]: do you mean the h264 struct?
01:09 dwlsalmeida[d]: yeah, I have 0 idea whether that is the problem, just trying everything
01:09 dwlsalmeida[d]: hey, how about this?
01:09 dwlsalmeida[d]: +typedef struct _nvdec_status_s
01:09 dwlsalmeida[d]: +{
01:09 dwlsalmeida[d]: + NvU32 mbs_correctly_decoded; // total numers of correctly decoded macroblocks
01:09 dwlsalmeida[d]: + NvU32 mbs_in_error; // number of error macroblocks.
01:09 dwlsalmeida[d]: + NvU32 reserved;
01:09 dwlsalmeida[d]: + NvU32 error_status; // report error if any
01:09 dwlsalmeida[d]: + union
01:09 dwlsalmeida[d]: + {
01:09 dwlsalmeida[d]: + nvdec_status_hevc_s hevc;
01:09 dwlsalmeida[d]: + nvdec_status_vp9_s vp9;
01:09 dwlsalmeida[d]: + };
01:09 dwlsalmeida[d]: + NvU32 slice_header_error_code; // report error in slice header
01:09 dwlsalmeida[d]: +
01:09 dwlsalmeida[d]: +} nvdec_status_s;
01:09 dwlsalmeida[d]: +
01:09 dwlsalmeida[d]: i.e.: https://github.com/NVIDIA/open-gpu-doc/commit/97c510e793804d116b6bc110e1b3473565fc1440.patch
01:10 dwlsalmeida[d]: This will probably return one of `NVC5B0_DEC_ERROR_H264_*`
01:11 dwlsalmeida[d]: maybe here?
01:11 dwlsalmeida[d]: P_MTHD(p, NVC5B0, SET_NVDEC_STATUS_OFFSET);
01:11 dwlsalmeida[d]: P_NVC5B0_SET_NVDEC_STATUS_OFFSET(p, mbstatus_address >> 8);
01:14 airlied[d]: yes that is where that should end up
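A minimal sketch of reading back the start of that status buffer, assuming it is mapped on the CPU, that the words are little-endian, and that the four leading `NvU32` fields sit at offset 0 as in the quoted `nvdec_status_s`. The sizes of the `hevc`/`vp9` union are not given above, so the sketch stops before it. `decode_status_prefix` and the abbreviated `H264_ERRORS` table are illustrative names, not part of any driver.

```python
import struct

# A few of the error codes quoted above; the table is deliberately incomplete.
H264_ERRORS = {
    0xDEC3001A: "NVC5B0_DEC_ERROR_H264_DBF_MC_ERRINT",
    0xDEC30100: "NVC5B0_DEC_ERROR_H264_PICTURE_INIT",
    0xDEC30101: "NVC5B0_DEC_ERROR_H264_STATEMACHINE_FAILURE",
    0xDEC30905: "NVC5B0_DEC_ERROR_H264_INVALID_UCODE_SIZE",
}

def decode_status_prefix(buf):
    """Decode the four leading NvU32 fields of nvdec_status_s (assumed little-endian)."""
    mbs_ok, mbs_err, _reserved, error_status = struct.unpack_from("<4I", buf, 0)
    if error_status == 0:
        name = "none"
    else:
        # Codes missing from the table are reported as unknown rather than guessed.
        name = H264_ERRORS.get(error_status, "unknown")
    return {
        "mbs_correctly_decoded": mbs_ok,
        "mbs_in_error": mbs_err,
        "error_status": error_status,
        "error_name": name,
    }

# A 64x64 frame holds 4 * 4 = 16 macroblocks of 16x16 pixels.
clean = struct.pack("<4I", 16, 0, 0, 0)
failed = struct.pack("<4I", 0, 16, 0, 0xDEC30101)
print(decode_status_prefix(clean))
print(decode_status_prefix(failed))
```

The address written through `SET_NVDEC_STATUS_OFFSET` is shifted right by 8, so the buffer backing it would need 0x100 alignment.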
01:49 awilcox: hello all, I think I have found an endianness bug in the firmware loader, but I wanted to make sure before I started adding le32_to_cpu()s everywhere. this binHdr doesn't look right to me: https://paste.gentoo.zip/0hTF9AHw and the Oops adds to my suspicions, but I don't have an LE machine with a Pascal to test with.
01:51 airlied: that paste site gives security errors, probably use a better one
01:52 awilcox: https://bpa.st/IHUJA
01:53 airlied: I'd start throwing around le32_to_cpu
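To illustrate why a little-endian firmware header looks wrong on a big-endian host, here is a small sketch. The three-word header and its values are hypothetical (the real binHdr contents are only in the paste); the point is just what happens when on-disk little-endian words are read with the wrong byte order, which is the case `le32_to_cpu()` handles in the kernel.

```python
import struct

# Hypothetical header as stored on disk: three little-endian 32-bit words.
raw = struct.pack("<3I", 0x000010DE, 1, 0x2000)

def parse_header(data, byteorder):
    """Read three 32-bit words with an explicit byte order."""
    fmt = {"little": "<3I", "big": ">3I"}[byteorder]
    return struct.unpack_from(fmt, data, 0)

correct = parse_header(raw, "little")  # what le32_to_cpu() would yield on any host
swapped = parse_header(raw, "big")     # what a naive native read yields on big-endian

print([hex(v) for v in correct])
print([hex(v) for v in swapped])
```

Sizes and offsets that come out byte-swapped like this would point far outside the firmware image, which fits an Oops in the loader.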
01:55 awilcox: okay. also, when it does this, it becomes impossible to rmmod nouveau, or even reboot the machine; is there a fancy way to 'recover' from this, or is hard power off the only solution? (rebooting spins forever trying to umount / because 'Resource busy', and that's the same error from rmmod)
01:56 awilcox: I ask only because waiting for the BMC to boot up takes this from a 30 second edit-compile-reboot cycle to a 7 minute edit-compile-reboot cycle :)
01:57 airlied: echo b > /proc/sysrq-trigger might reboot without spinning, but with that sort of crash there isn't much else
07:39 karolherbst: maybe we should just refuse to probe on big endian 🙃
07:40 karolherbst: it's not like this is tested in any capacity
07:40 karolherbst: and even if nouveau loads, userspace is going to be broken on top...
08:15 avhe[d]: dwlsalmeida[d]: yes but no clue here, my preload stub worked with nvidia.ko and i haven't really touched nouveau yet
12:41 dwlsalmeida[d]: dwlsalmeida[d]: The most interesting part is that there is no output produced, yet this reports all macroblocks as decoded and error==0
12:42 dwlsalmeida[d]: If I try to force an error, e.g. by changing the size of the input buffer randomly or something like that, then it reports an error number that is not in the header file
12:44 avhe[d]: dwlsalmeida[d]: can you post your sample file? and i'll run my preload thing so you can compare against the blob
12:52 dwlsalmeida[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301529141845295114/64x64-I.h264?ex=6724cef8&is=67237d78&hm=783b3668de4dd9b2565cf9bb43b1048b01b446dc415987cb487bcf26cdfd66ae&
12:52 dwlsalmeida[d]: ^just a 64x64 I frame, can't get simpler than that
12:52 dwlsalmeida[d]: thank you very much by the way!!!
13:13 avhe[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301534531677061121/ffmpeg.log?ex=6724d3fd&is=6723827d&hm=1cb6d594e70196d641d6bb66915f9b7c5736ef980ab035189786cfe2e7aa66d7&
13:13 avhe[d]: here you go
13:13 avhe[d]: command was `ffmpeg -hide_banner -loglevel error -hwaccel nvdec -i ~/Vidéos/share/64x64-I.h264 -f null -`
13:14 avhe[d]: i'll try to clean up the code and post it this evening
13:20 dwlsalmeida[d]: Maps:
13:20 dwlsalmeida[d]: Luma 0: 0x01204000, Chroma 0: 0x01204010
13:20 dwlsalmeida[d]: Luma 1: 0x01204000, Chroma 1: 0x01204010
13:20 dwlsalmeida[d]: Luma 2: 0x01204000, Chroma 2: 0x01204010
13:20 dwlsalmeida[d]: Coloc: 0x01200600, History: 0x01200702
13:20 dwlsalmeida[d]: Do you know what these coloc and history buffers stand for?
13:21 avhe[d]: coloc i believe means colocated, it probably contains metadata for each dpb frame
13:21 avhe[d]: history i'm not certain
13:22 avhe[d]: as far as a userland driver is concerned you can just treat them as opaque buffers
13:23 avhe[d]: hm i realize i should've enabled more logs if you want to correctly size those buffers
13:28 avhe[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301538175327539222/ffmpeg.log?ex=6724d762&is=672385e2&hm=4aeeb4da0816f8209b5902ece7e156598b37c11f0fb4675eb66880c1233ca0c4&
13:28 avhe[d]: this gets pretty verbose but should contain logs regarding buffer allocation
13:30 avhe[d]: so for instance, NVC7B0_SET_COLOC_DATA_OFFSET is set to 0x01200600, grepping for 1200600 you find the following (mind that vaddrs in the gpfifo buffer are right-shifted by 8)
13:30 avhe[d]: Ioctl on 40 (/dev/nvidiactl): req 0xc0384657 (type 70 'F', dir 3, nr 0x57, size 0x38)
13:30 avhe[d]: Dma mem map: root 0xc1d00022, device 0x80000000, dma 0x80000003, mem 0x80000044, offset 0, length 0x10000, flags 0, dma offset 0x120060000
13:31 avhe[d]: and sometimes buffers are merged together (for instance NVC7B0_H264_SET_MBHIST_BUF_OFFSET and NVC7B0_SET_HISTORY_OFFSET here)
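The `req 0xc0384657` line in the trace can be taken apart by hand. The sketch below assumes the generic Linux ioctl number layout (2-bit direction, 14-bit size, 8-bit type, 8-bit number), which is what x86 and arm64 use; a few architectures split the direction and size bits differently. `decode_ioctl` is an illustrative helper name.

```python
def decode_ioctl(req):
    """Split a Linux ioctl request number using the asm-generic bit layout."""
    return {
        "dir": (req >> 30) & 0x3,      # 1 = write, 2 = read, 3 = read and write
        "size": (req >> 16) & 0x3FFF,  # size of the argument struct in bytes
        "type": (req >> 8) & 0xFF,     # driver "magic" character
        "nr": req & 0xFF,              # command number within that driver
    }

fields = decode_ioctl(0xC0384657)
print(fields, chr(fields["type"]))
```

This reproduces the values printed by the tracer: type 70 ('F'), dir 3, nr 0x57, size 0x38.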
13:33 dwlsalmeida[d]: Method 0x005e (0x20018140): type 1, size 1, subchannel 4, reg 0x00000500 (NVC7B0_H264_SET_MBHIST_BUF_OFFSET)
13:33 dwlsalmeida[d]: 0x01200700
13:33 dwlsalmeida[d]: Method 0x0060 (0x20018106): type 1, size 1, subchannel 4, reg 0x00000418 (NVC7B0_SET_HISTORY_OFFSET)
13:33 dwlsalmeida[d]: 0x01200702
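The two method headers in this trace can be decoded with a few shifts. The field layout below (3-bit opcode, 13-bit count, 3-bit subchannel, 13-bit method address in units of 4 bytes) is an assumption; it reproduces the type, size, subchannel and reg values the tracer printed for both headers. `decode_method_header` is an illustrative name.

```python
def decode_method_header(hdr):
    """Decode a pushbuffer method header under the assumed field layout."""
    return {
        "type": (hdr >> 29) & 0x7,        # opcode; 1 here, as printed by the tracer
        "size": (hdr >> 16) & 0x1FFF,     # number of data words that follow
        "subchannel": (hdr >> 13) & 0x7,
        "reg": (hdr & 0x1FFF) << 2,       # method address is stored divided by 4
    }

mbhist = decode_method_header(0x20018140)   # NVC7B0_H264_SET_MBHIST_BUF_OFFSET
history = decode_method_header(0x20018106)  # NVC7B0_SET_HISTORY_OFFSET
print(mbhist)
print(history)
```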
13:33 dwlsalmeida[d]: There seems to be only a 2 byte offset between the two here
13:34 avhe[d]: 0x200 because of the right-shift by 8
13:34 dwlsalmeida[d]: ahh
13:34 dwlsalmeida[d]: true
13:34 avhe[d]: this also means buffers need to be allocated to a 0x100 alignment, otherwise the low bits are lost when passing them in the cmdbuf
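The shift-by-8 arithmetic from the exchange above, as a short sketch. `to_cmdbuf` and `from_cmdbuf` are illustrative helper names; the addresses are the ones from the trace.

```python
ALIGN = 0x100  # addresses are passed as vaddr >> 8, so the low 8 bits must be zero

def to_cmdbuf(vaddr):
    """Convert a GPU virtual address to the value written in the command buffer."""
    if vaddr % ALIGN:
        raise ValueError(f"{vaddr:#x} is not {ALIGN:#x}-aligned; low bits would be lost")
    return vaddr >> 8

def from_cmdbuf(value):
    """Recover the GPU virtual address from a command buffer value."""
    return value << 8

# The "dma offset 0x120060000" mapping is what SET_COLOC_DATA_OFFSET points at.
coloc = to_cmdbuf(0x120060000)

# MBHIST (0x01200700) and HISTORY (0x01200702) differ by 2 in the command buffer,
# which is 0x200 bytes of address space.
gap = from_cmdbuf(0x01200702) - from_cmdbuf(0x01200700)
print(hex(coloc), hex(gap))
```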
13:49 dwlsalmeida[d]: will work on that later, but thank you that's extremely helpful
13:50 avhe[d]: no problem
13:54 jfalempe: Lyude: I've refactored the nouveau tiling code, so that both the display checks and panic code use the same functions to get the tiling parameters: https://patchwork.freedesktop.org/series/133963/
13:55 jfalempe: Can you please take a look ?
16:55 dwlsalmeida[d]: I noticed that the traces (both from envyhooks and from averne's tracer) all include `NVC5B0_SEMAPHORE_*`, where ours do not
16:55 dwlsalmeida[d]: do you guys happen to know why?
16:57 dwlsalmeida[d]: There is only one frame, so I assume there are no pipeline barriers inserted (since these are inserted between `vkCmdDecodeVideoKHR` commands). I see ffmpeg is using timeline semaphores for synchronization, but these apparently boil down to `VKSync`, which is implemented by DRM `syncobj`s.
17:48 Lyude: jfalempe: no problem, will do that today
18:33 airlied[d]: I doubt the semaphores matter, we just deal with it differently
18:56 avhe[d]: dwlsalmeida[d]: i put together a small repo here <https://github.com/averne/NvdecTrace>
18:56 avhe[d]: it's not super pretty code but it does the job
18:58 dwlsalmeida[d]: you said you haven't tested that on nouveau, or you're sure it doesn't work? just out of curiosity
18:58 avhe[d]: no this is very specific to nvidia.ko
18:58 avhe[d]: it works pretty much the same way as envyhooks, by injecting a fake mmio page then intercepting kickoffs
19:01 airlied[d]: that should make working out the sizing of the opaque buffers a bit easier
19:01 avhe[d]: i have it worked out already, but for tegra
19:02 avhe[d]: though i don't expect significant differences on discrete cards
19:02 avhe[d]: <https://github.com/averne/FFmpeg/blob/nvtegra/libavcodec/nvtegra_h264.c#L124-L132> here is the code for h264
19:04 dwlsalmeida[d]: ok let's immediately try this out ^
19:13 airlied[d]: oh nice yes that won't change on discrete
20:34 dwlsalmeida[d]: ok I *finally* got back to where I started
20:35 dwlsalmeida[d]: I found in my notes from one year ago that ffmpeg *never* actually worked with Dave's branch
20:35 dwlsalmeida[d]: and that it would output a blank file, like it does today too
20:35 dwlsalmeida[d]: using an older version of GStreamer though, does seem to produce a *somewhat* correct result
20:36 dwlsalmeida[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301645959931105370/out.yuv?ex=67253bc4&is=6723ea44&hm=93c17714581c02eda5455de2ed97160d44332abe5b20c7e455b7295b33e18fce&
20:36 dwlsalmeida[d]: This is the 64x64 decoded file, for those interested
20:37 dwlsalmeida[d]: This error is now gone if GStreamer is used:
20:37 dwlsalmeida[d]: [ 6069.895411] nouveau 0000:01:00.0: ffmpeg[33856]: channel 32 killed!
20:37 dwlsalmeida[d]: [ 6390.138340] nouveau 0000:01:00.0: gsp: rc engn:00000013 chid:32 type:68 scope:1 part:233
20:37 dwlsalmeida[d]: [ 6390.138347] nouveau 0000:01:00.0: fifo:6606c307:0004:0020:[ffmpeg[34551]] errored - disabling channel
20:37 dwlsalmeida[d]: so, something's not quite right when ffmpeg tries to download the image from the GPU
20:58 asdqueerfromeu[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301651606399357041/Screenshot_20241031_225810.png?ex=67254106&is=6723ef86&hm=c28f05887c30792670e24cb042c3b01ad16d27781006e8458fe6b21149f64147&
20:58 asdqueerfromeu[d]: dwlsalmeida[d]: This is what it looks like, right?
21:08 dwlsalmeida[d]: hm no
21:09 dwlsalmeida[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301654221216546866/Screenshot_2024-10-31_at_18.08.42.png?ex=67254375&is=6723f1f5&hm=035e557d13aa9d0919ce1dd765feb521287b5a319b2e4f10323fcf39a429842d&
21:09 dwlsalmeida[d]: The luma plane is actually right, i.e.:
21:09 dwlsalmeida[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301654383708213309/Screenshot_2024-10-31_at_18.09.35.png?ex=6725439c&is=6723f21c&hm=810549bf81dcfbcc04ff30dd9fea718613c2a16232646c45060d83daf95b76e7&
21:10 dwlsalmeida[d]: ^ this is identical to the reference
21:10 dwlsalmeida[d]: now, the chroma plane is fucked
21:11 dwlsalmeida[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301654821132046476/Screenshot_2024-10-31_at_18.10.31.png?ex=67254404&is=6723f284&hm=21a7bc5a09721bed4a63a1ab7934371129e0c8cd8c921a6b6236008a43a3cd2d&
21:11 dwlsalmeida[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1301654821400608918/Screenshot_2024-10-31_at_18.10.43.png?ex=67254404&is=6723f284&hm=37c5b04997926d59b0584360191c6f03d63a10a2e1688a916e40531462a63f65&
21:11 dwlsalmeida[d]: first one is decoded by the card, the second one is the reference
21:23 airlied[d]: looks tiling
21:24 airlied[d]: also weird on the ffmpeg thing, I'm pretty sure I used ffmpeg to decode it when I wrote it
21:24 airlied[d]: but maybe I was using CTS
21:32 asdqueerfromeu[d]: dwlsalmeida[d]: I guess I used the wrong setting in ImageMagick then
22:12 dwlsalmeida[d]: yeah this is indeed some tiling issue
22:12 dwlsalmeida[d]: if you look at the chroma plane in blocks of 16x16 they
23:08 mrmx450: hi