02:17fdobridge_: <rinlovesyou> What does profiling this look like if i wanted to take a crack at pain areas for performance?
02:37fdobridge_: <redsheep> I've asked along the same lines and it sounds like the best option right now is to use renderdoc. I was going to do the same thing but I haven't had a chance yet.
02:45fdobridge_: <rinlovesyou> gotcha, i'll look into that
11:39gaussian6: Those sw chiplet procedures indeed kinda work here, so I set up the development process , since none is interested, you keep dealing with such trolls as karolherbst, I have enough of this freak show, such people are guaranteed to fail as they always did anyway.
14:04fdobridge_: <dwlsalmeida> no SET_OBJECT here for the video class though :/
14:21fdobridge_: <dwlsalmeida> @marysaka actually, I also get c4b0 (not c5b0), @airlied I am only able to trace this `NVC5B0_SEMAPHORE_A` `NVC5B0_SET_WATCHDOG_TIMER`
14:21fdobridge_: <dwlsalmeida> tips?
14:41fdobridge_: <gfxstrand> I'm going to try a CTS run with WSI again. I removed my Maxwell from the box in case that's somehow messing it up.
14:42fdobridge_: <gfxstrand> (I could totally believe GNOME is trying to run on both cards, one of which has GSP and one doesn't, and that somehow causes my problems.)
14:42fdobridge_: <tom3026> well it does fail on my ampere on kde tho heh
14:42fdobridge_: <tom3026> technically two gpus too but one is amd
14:43fdobridge_: <tom3026> something odd is going on tho, sync CTS fails in a full run, running them separately they pass. same with wsi
14:45fdobridge_: <gfxstrand> Yeah, if this fails I'm going to shut off WSI and run that way. I think I can get it to pass now
14:46fdobridge_: <tom3026> yeah that's what i did and hit the synchro* tests failing in a similar manner, but they pass if i manually run just them. currently building 6.7.2 since it has a single nouveau commit and a bunch of other null deref and what not things. xD but i suspect it's gonna be the same
14:49fdobridge_: <gfxstrand> I will find a way to submit 1.3 next week somehow
14:50fdobridge_: <tom3026> run each section manually combine the results
14:50fdobridge_: <tom3026> dont tell anyone. 🫢
14:51fdobridge_: <tom3026> was anyhow just thinking about trying to build libvulkan_nouveau with fsanitize i wonder if that could show something fun
15:36fdobridge_: <phomes_> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27297
15:41fdobridge_: <rhed0x> would be neat if mesamatrix could group extensions by roadmap
15:49fdobridge_: <!DodoNVK (she) 🇱🇹> :triangle_nvk: is now REALLY becoming the RADV of NVIDIA hardware
15:50fdobridge_: <Sid> 🤔
15:53fdobridge_: <rhed0x> well yeah its the Mesa driver for Nvidia hardware 🙃
16:19fdobridge_: <tom3026> are you building with clang or gcc?
16:28fdobridge_: <gfxstrand> GCC
16:33fdobridge_: <tom3026> because im soon through an entire CTS suddenly, 6.7.2 with airlied's patch, nvk with 26990, 27159, 172. not a single error or fault in dmesg
16:34fdobridge_: <tom3026> either that or because i used gcc this time on libvulkan_nouveau O_o
16:34fdobridge_: <Sid> :o
16:35fdobridge_: <tom3026> its going so far it feels like dri_prime failed and its running on the internal gpu xD
16:35fdobridge_: <Sid> is that 6.7.2 with an additional patch?
16:35fdobridge_: <tom3026> yeah
16:37fdobridge_: <tom3026> https://lore.kernel.org/dri-devel/20240123072538.1290035-1-airlied@gmail.com/T/#u
16:37fdobridge_: <tom3026> ```
16:37fdobridge_: <tom3026> Test run totals:
16:37fdobridge_: <tom3026> Passed: 233060/897727 (26.0%)
16:37fdobridge_: <tom3026> Failed: 67/897727 (0.0%)
16:37fdobridge_: <tom3026> Not supported: 664596/897727 (74.0%)
16:37fdobridge_: <tom3026> Warnings: 4/897727 (0.0%)
16:37fdobridge_: <tom3026> Waived: 0/897727 (0.0%)
16:37fdobridge_: <tom3026> ```
16:38fdobridge_: <tom3026> i hope this didnt run on radv
16:39fdobridge_: <!DodoNVK (she) 🇱🇹> That's why you set `VK_ICD_FILENAMES`
16:39fdobridge_: <tom3026> yeah i do in my "prime_run" script
16:39fdobridge_: <tom3026> or well actually only export VK_LOADER_DRIVERS_SELECT="*nouveau*" and DRI_PRIME=1
16:40fdobridge_: <tom3026> there are stars there or "wildcards" but discord seems to make the word uh idk the word.. tilting? xD
16:42fdobridge_: <tom3026> ```
16:42fdobridge_: <tom3026> Test case 'dEQP-VK.api.info.get_physical_device_properties2.format_properties.basic'..
16:42fdobridge_: <tom3026> WARNING: NVK is not a conformant Vulkan implementation, testing use only.
16:42fdobridge_: <tom3026> Pass (Querying device format properties succeeded)
16:42fdobridge_: <tom3026> ```
16:42fdobridge_: <tom3026> so yeah unless half of them swapped gpu in the mid it ran on nvk
16:47fdobridge_: <Sid> italics
16:54fdobridge_: <gfxstrand> `VK_ICD_FILENAMES` is more robust
17:01fdobridge_: <tom3026> k setting both. rerunning!
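The driver-selection advice above (set `VK_ICD_FILENAMES`, optionally alongside `VK_LOADER_DRIVERS_SELECT` and `DRI_PRIME`) can be sketched as a small launcher helper. This is only a rough sketch: the ICD JSON path and the `nvk_env` helper name are assumptions for illustration and are not taken from the log, so check where your build actually installs the nouveau ICD manifest.

```python
import os

# Assumed install location of the nouveau ICD manifest; adjust for your system.
ASSUMED_NVK_ICD = "/usr/share/vulkan/icd.d/nouveau_icd.x86_64.json"

def nvk_env(icd_json=ASSUMED_NVK_ICD, base=None):
    """Return a copy of the environment that pins the Vulkan loader to one ICD.

    `nvk_env` is a hypothetical helper name, not part of any tool.
    """
    env = dict(os.environ if base is None else base)
    env["VK_ICD_FILENAMES"] = icd_json              # the more robust selector
    env["VK_LOADER_DRIVERS_SELECT"] = "*nouveau*"   # glob filter, as in the log
    env["DRI_PRIME"] = "1"
    return env

env = nvk_env(base={})
for key in sorted(env):
    print(f"{key}={env[key]}")
```

The resulting dict can be passed as `env=` to `subprocess.run` when launching `deqp-vk`, so the selection does not depend on whatever the calling shell happened to export.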
17:32fdobridge_: <benjaminl> @gfxstrand which version of the nvidia driver have you been using nvdump with?
17:32fdobridge_: <benjaminl> I tried it out and got a binary that's definitely compressed, but not zstd
17:35f_: karolherbst: it happened again, this time on both monitors and trying to change the mode makes things go worse
17:36f_: trying to exit from my wayland session now...
17:36fdobridge_: <marysaka> 535.x and before
17:36fdobridge_: <marysaka> latest release changed the format
17:37fdobridge_: <marysaka> The format didn't change for pipeline cache tho
17:37f_: I can probably try getting dmesg logs as I should still sorta have access to the laptop via ssh
17:38f_: system didn't hang
17:38f_: (yet)
17:40fdobridge_: <tom3026> ```
17:40fdobridge_: <tom3026> Test run totals:
17:40fdobridge_: <tom3026> Passed: 233043/897727 (26.0%)
17:40fdobridge_: <tom3026> Failed: 84/897727 (0.0%)
17:40fdobridge_: <tom3026> Not supported: 664596/897727 (74.0%)
17:40fdobridge_: <tom3026> Warnings: 4/897727 (0.0%)
17:40fdobridge_: <tom3026> Waived: 0/897727 (0.0%)
17:40fdobridge_: <tom3026> ```
17:41fdobridge_: <tom3026> VK_ICD_FILENAMES set, deqp-vk didnt show up in amdgpu_top as it does when i actually run it internally
17:41fdobridge_: <tom3026> so either 6.7.2, or not applying https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024 or some other recent change ❤️
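Comparing two "Test run totals" blocks by eye is error-prone, so here is a minimal sketch that parses the two summaries pasted above and prints the per-category difference. The function names are made up for illustration; the numbers are copied from the two runs in the log.

```python
import re

# Matches lines such as "Passed:        233060/897727 (26.0%)".
LINE_RE = re.compile(
    r"^\s*(Passed|Failed|Not supported|Warnings|Waived):\s+(\d+)/(\d+)",
    re.MULTILINE,
)

def parse_totals(text):
    """Map each result category to its count."""
    return {m.group(1): int(m.group(2)) for m in LINE_RE.finditer(text)}

def delta(before, after):
    """Per-category change between two parsed summaries."""
    return {name: after[name] - before[name] for name in before}

RUN_1637 = """Test run totals:
  Passed:        233060/897727 (26.0%)
  Failed:        67/897727 (0.0%)
  Not supported: 664596/897727 (74.0%)
  Warnings:      4/897727 (0.0%)
  Waived:        0/897727 (0.0%)
"""

RUN_1740 = """Test run totals:
  Passed:        233043/897727 (26.0%)
  Failed:        84/897727 (0.0%)
  Not supported: 664596/897727 (74.0%)
  Warnings:      4/897727 (0.0%)
  Waived:        0/897727 (0.0%)
"""

change = delta(parse_totals(RUN_1637), parse_totals(RUN_1740))
print(change)
```

For these two runs the only movement is 17 tests going from Passed to Failed; the other categories are unchanged.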
17:48fdobridge_: <Sid> woah
17:52f_: Ok got dmesg (blindly typed 'sudo dmesg > nvdmesg')
17:52f_: and I see many kernel oopses...
17:53f_: sending shortly..
17:58f_: https://bin.vitali64.duckdns.org/65b3f2a2
17:58f_: Sorry that was all I could get
18:04fdobridge_: <Sid> :O
18:06fdobridge_: <tom3026> just installed the blob was about to compare
18:06fdobridge_: <airlied> I'm going to have to revert the prime sync fix unfortunately, I'll come up with a better plan next week
18:06fdobridge_: <tom3026> "Segmentation fault (core dumped)" two seconds in
18:06fdobridge_: <tom3026> lol
18:07fdobridge_: <tom3026> "deqp-vk[4491]: segfault at 555720df3778 ip 00007ffff61ed73e sp 00007fffffffc3a0 error 4 in libnvidia-glcore.so.550.40.07[7ffff5800000+c00000] likely on CPU 2 (core 1, socket 0)"
18:07fdobridge_: <tom3026> 😄
18:07fdobridge_: <airlied> I also got a completed turing run with wsi + sync all fine a few days ago
18:08fdobridge_: <tom3026> yeah maybe it's https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27024 that's done it, gotta try applying it again and see what happens
18:17fdobridge_: <Sid> is it causing problems?
18:19fdobridge_: <gfxstrand> How?!? Maybe some weird test order thing makes it go away?
18:23fdobridge_: <tom3026> do you also have 27024 applied before i test it aswell yet again?
18:23fdobridge_: <tom3026> otherwise im clueless because i cant get it to fail after 6.7.2
18:25fdobridge_: <gfxstrand> No, I don't have 27024 applied
18:25fdobridge_: <gfxstrand> I'm running main + a couple 1.3 enablement patches.
18:25fdobridge_: <gfxstrand> Honestly, it might be that I'm running 1.3....
18:26fdobridge_: <gfxstrand> That could make the test list shift subtly and mask the problem.
18:27fdobridge_: <tom3026> isnt that the same as @asdqueerfromeu doing in the pkgbuild?
18:27fdobridge_: <tom3026> if so im enabling 1.3 aswell
18:27fdobridge_: <gfxstrand> hrm...
18:27fdobridge_: <gfxstrand> Oh, I just had a thought...
18:28fdobridge_: <gfxstrand> I'll have to explore that after my current run completes.
18:29fdobridge_: <gfxstrand> My current run is attempting a full CTS run (64 and 32-bit) with all the latest fixes (no extra branches) and 1.3 enabled but no WSI.
18:30fdobridge_: <gfxstrand> Specifically, I'm wondering if threaded submit is maybe somehow screwed up. That's one of the few things that WSI has a chance of affecting and which is sticky (i.e. doesn't get disabled once it's on).
18:58fourierfast: Writing proper code is in fact simpler than what you are doing, however at overseas you "mixed" things up, anal was delivered to mindill African cranks by Finnish bisexual that has nothing more to do with me other than she was harassing all our businesses to extort money from us, there was no relationship between us, she fucked and tongued random African bums in front of me, trying to build an extortion case, and lied that I raped her
18:58fourierfast: which ultimately lead to assaults of me, and later her brutal treatment, they took her front row from teeth out, and if that did not cause any state change in her sick behavior as gangster extortion type towards better, they get more brutally handled where bullets come to play etc. All this clown crew cranks and sluts who bothered me and my doings.
19:05fdobridge_: <Sid> ...what the fuck?
19:05Sid127: can we get an op in here e-e
19:06fdobridge_: <!DodoNVK (she) 🇱🇹> Estonia is attacking us again
19:17fdobridge_: <dwlsalmeida> @airlied fyi I ran out of ideas, so I just changed this to 1: `nvh264->tileFormat = 1;`, which results in a perfectly decoded I frame, nice
19:44Lyude: I've never worked with rcu, but I assume that one is supposed to be able to modify data behind an rcu pointer by updating it somehow?
20:07fdobridge_: <gfxstrand> @dwlsalmeida Yeah, so `tileFormat` appears to correspond to `nil_tiling::is_tiled` and it looks like `gob_height` is probably `nil_tiling::y_log2 - 1` or something like that.
20:07fdobridge_: <gfxstrand> However, the way they're documenting the GOBs there, it's unclear to me what they actually mean (edited)
20:08fdobridge_: <dwlsalmeida> does anybody know what "GOB" stands for?..
20:08fdobridge_: <avhe> <https://github.com/NVIDIA/open-gpu-doc/blob/master/classes/video/nvdec_drv.h#L634>
20:08fdobridge_: <avhe> 0 is for tegra (TBL: Tegra Block Linear)
20:09fdobridge_: <avhe> group of bytes, it should be 1 typically, which results in a block height of 8<<1 = 16 (size of a macroblock)
20:10fdobridge_: <gfxstrand> Except the comment in the header makes no sense.
20:11fdobridge_: <gfxstrand> Even on the tiny bit of hardware where the gob height is configurable, it's 8 or 4 for 3D. Why would video have a bunch of different sizes?
20:11fdobridge_: <gfxstrand> And the height in GOBs is nowhere to be found if `gob_height` isn't that.
20:12fdobridge_: <dwlsalmeida> well, for one we can have variable macroblock sizes for HEVC and AV1
20:12fdobridge_: <averne> gob_height here is the number of GOBs in a block, the GOB height is fixed at 8
20:12fdobridge_: <gfxstrand> Hrm... there's a `log2_tile_rows/columns`, maybe it's that?
20:12fdobridge_: <averne> at least that's how it is on tegra
20:12fdobridge_: <gfxstrand> Okay, that makes more sense.
20:12fdobridge_: <averne> gob_height here is the number of GOBs in a block, the GOB height is fixed at 8 (edited)
20:12fdobridge_: <karolherbst🐧🦀> is it this fermi/non-fermi switch again?
20:13fdobridge_: <gfxstrand> So, yeah, that corresponds to `nil_tiling::y_log2`
20:13fdobridge_: <gfxstrand> I don't think so
20:13fdobridge_: <gfxstrand> I think it's badly named
20:13fdobridge_: <gfxstrand> At least I hope that's all it is. 😅
20:13fdobridge_: <karolherbst🐧🦀> yeah.. I mean yes...
20:13fdobridge_: <karolherbst🐧🦀> ohh wait
20:13fdobridge_: <karolherbst🐧🦀> height..
20:13fdobridge_: <karolherbst🐧🦀> nvm
20:14fdobridge_: <gfxstrand> Yeah, so `gob_height` being "height in GOBs" makes sense.
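The arithmetic behind the `gob_height` discussion can be written down as a small sketch. It assumes what the thread converges on: a GOB is a fixed 8 rows tall, and `gob_height` is a log2 count of GOBs per block (the `8<<1 = 16` example), which is what would line up with `nil_tiling::y_log2`. The function names are made up for illustration and are not from NIL or the nvdec headers.

```python
GOB_HEIGHT_ROWS = 8  # fixed GOB height, as stated in the discussion (for Tegra)

def gobs_per_block(gob_height):
    """Block height in GOBs, assuming `gob_height` is a log2 value."""
    if gob_height < 0:
        raise ValueError("gob_height must be non-negative")
    return 1 << gob_height

def block_height_rows(gob_height):
    """Block height in rows: 8 rows per GOB times the GOB count."""
    return GOB_HEIGHT_ROWS << gob_height

for value in range(4):
    print(f"gob_height={value}: {gobs_per_block(value)} GOB(s), "
          f"{block_height_rows(value)} rows")
```

With `gob_height = 1` this gives 2 GOBs and 16 rows, matching the macroblock-sized block mentioned earlier.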
20:14fdobridge_: <averne> can't speak for av1, but for hevc they also use 1
20:14fdobridge_: <averne> <https://github.com/averne/FFmpeg/blob/nvtegra/libavcodec/nvtegra_hevc.c#L246> <-- driver for tegra I've been developing
20:15fdobridge_: <karolherbst🐧🦀> probably makes sense to use a different tiling when encoding/decoding videos than what you'd use in 3D stuff
20:15fdobridge_: <dwlsalmeida> @avhe hey I really liked the dump from your tool that you sent earlier, I wonder if we can print the picture parameter and the bitstream using envyhooks? you seem to have figured it out. Just envyhooks by itself won't take us very far in this particular case
20:15fdobridge_: <karolherbst🐧🦀> and use something matching the block format of the video
20:15fdobridge_: <dwlsalmeida> that was just a wild guess from me 😄
20:16fdobridge_: <averne> in my preload stub I maintained a list of all buffers, then looked up the corresponding using the iova written in the pushbuffer
20:16fdobridge_: <averne> mind that the iova is right-shifted by 8
20:18fdobridge_: <averne> it's more convenient to do on tegra, because they submit jobs using an ioctl, and iovas aren't written directly but go through a relocation step in the kernel, so you get the buffer handle directly
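The lookup described above (keep a list of all buffers, then resolve the value written in the pushbuffer, remembering that the iova is right-shifted by 8) could look roughly like this. It is a standalone sketch: `BufferTable`, its methods, and the example addresses are hypothetical and are not taken from envyhooks or the preload stub.

```python
from bisect import bisect_right

class BufferTable:
    """Tracks known buffers and resolves shifted addresses back to them."""

    def __init__(self):
        self._starts = []  # sorted base addresses
        self._bufs = []    # (base, size, name), same order as _starts

    def add(self, base, size, name):
        i = bisect_right(self._starts, base)
        self._starts.insert(i, base)
        self._bufs.insert(i, (base, size, name))

    def lookup_pushbuf_value(self, value, shift=8):
        """Undo the right shift, then find the buffer containing the address.

        Returns (name, offset) or None if no known buffer covers it.
        """
        iova = value << shift
        i = bisect_right(self._starts, iova) - 1
        if i < 0:
            return None
        base, size, name = self._bufs[i]
        if iova < base + size:
            return name, iova - base
        return None

table = BufferTable()
table.add(0x1_0000_0000, 0x10000, "bitstream")     # made-up addresses
table.add(0x1_0010_0000, 0x1000, "picture_params")

print(table.lookup_pushbuf_value(0x1_0000_00))  # start of "bitstream"
print(table.lookup_pushbuf_value(0x1_0010_00))  # start of "picture_params"
print(table.lookup_pushbuf_value(0x2_0000_00))  # not a tracked buffer
```

Once a value resolves to a tracked buffer, the stub can dump that buffer's contents, which is how the picture parameters and bitstream could be printed alongside the method trace.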
20:19fdobridge_: <gfxstrand> Yeah, could be. But we need to know how to map it all to NIL so we don't screw up image tiling when you share between the two.
20:20fdobridge_: <dwlsalmeida> @marysaka your opinion?
20:22fdobridge_: <!DodoNVK (she) 🇱🇹> How many frames of 2160p H.264 video can Novideo (NVK Video) driver render right now before hanging? :triangle_nvk:
20:41fdobridge_: <gfxstrand> 64-bit run completed successfully!
20:44fdobridge_: <!DodoNVK (she) 🇱🇹> Now it's time for 32-bit to make sure there aren't weird issues there (padding flashbacks)
20:51fdobridge_: <tom3026> yay now why did both yours and mine fail earlier? xD
20:51fdobridge_: <gfxstrand> WSI
20:51fdobridge_: <gfxstrand> What does WSI have to do with it? Hell if I know!
20:51fdobridge_: <tom3026> but but im running wsi now :<
20:52fdobridge_: <gfxstrand> And it passes?!?
20:52fdobridge_: <!DodoNVK (she) 🇱🇹> "No failures, no flakes, it just passe-"
20:55fdobridge_: <tom3026> yeah
20:59fdobridge_: <tom3026> oh no.. nvm seems both my vk-default.txt and vk-tom.txt have it disabled... i blame vim and typos
20:59fdobridge_: <tom3026> and too little coffee
21:03fdobridge_: <tom3026> here we go running it again!
21:08fdobridge_: <gfxstrand> Okay, good, I'm not crazy. 😂
21:14fdobridge_: <!DodoNVK (she) 🇱🇹> I definitely am (I panicked NAK with vkcube back when it was WIP)
21:28fdobridge_: <airlied> @dwlsalmeida nice
21:42fdobridge_: <tom3026> ```
21:42fdobridge_: <tom3026> Test run totals:
21:42fdobridge_: <tom3026> Passed: 171019/681329 (25.1%)
21:42fdobridge_: <tom3026> Failed: 53/681329 (0.0%)
21:42fdobridge_: <tom3026> Not supported: 510253/681329 (74.9%)
21:42fdobridge_: <tom3026> Warnings: 4/681329 (0.0%)
21:42fdobridge_: <tom3026> Waived: 0/681329 (0.0%)
21:42fdobridge_: <tom3026> ```
21:42fdobridge_: <tom3026> 4 shards
21:43fdobridge_: <Sid> total tests are less here than before 🤔
21:45fdobridge_: <benjaminl> so I'm trying to figure out how nvidia is doing texture lod clamping on sm50, and getting a... very implausible looking instruction in the shader dump
21:45fdobridge_: <benjaminl> `80270404 c53ab007`
21:45fdobridge_: <tom3026> yeah wut why is changing --deqp-fraction reducing the total tests
21:46fdobridge_: <benjaminl> this looks like a predicated `ld.s8`, but from context it must be loading to r4..r8, and the rest of the shader doesn't use predicates at all
21:47fdobridge_: <benjaminl> I don't think I've ever seen an instruction where bits 16..19 are something other than the predicate
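To make the "bits 16..19" check reproducible, here is a small sketch that pulls that field out of the dumped instruction. Which of the two 32-bit words in the dump is the low half of the 64-bit instruction is an assumption that the log does not settle, so the sketch prints the field for both orderings instead of picking one.

```python
def bits(value, lo, hi):
    """Return bits lo..hi (inclusive) of `value`."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

# The two words exactly as they appear in the shader dump.
W0, W1 = 0x80270404, 0xC53AB007

orderings = {
    "first word is low half": (W1 << 32) | W0,
    "second word is low half": (W0 << 32) | W1,
}

for label, insn in orderings.items():
    print(f"{label}: insn={insn:#018x} bits[16..19]={bits(insn, 16, 19):#x}")
```

The two orderings give different values for the field (0x7 and 0xa), so it is worth confirming the dump's word order before reading the field as a predicate.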
22:01fdobridge_: <tom3026> @gfxstrand was your script you linked earlier perhaps supposed to have & at the end of deqp-vk to actually make it parallel?
22:02fdobridge_: <tom3026> im only running one fourth of the tests heh
22:19fdobridge_: <gfxstrand> Oh, well I am crazy. But my perception of reality seems intact and that's at least grounding.
22:19fdobridge_: <gfxstrand> I mean, you could...
22:19fdobridge_: <gfxstrand> I'm doing it all serial because I'm scared
22:20fdobridge_: <gfxstrand> If you do it in parallel, you need more flags
22:20fdobridge_: <gfxstrand> There's some shader caching stuff that parallel runs will screw up if you're not careful.
22:20fdobridge_: <tom3026> yeah but for some reasons it stops at 1-4 i never checked the testresult output until now
22:21fdobridge_: <tom3026> or logfile that is.
22:21fdobridge_: <gfxstrand> If there are any fails, it stops
22:23fdobridge_: <tom3026> so those 53 fails is the reason it never begun with the other fractions then?
22:37Lyude: dakr: good news: I...... think I might have hacked around enough to come up with a workaround to the Revocable issue :). I think
22:39Lyude: had to make quite a few changes to kernel::drm_device_register! (mainly: having things in a Registration object is painful because that object gets moved away, so I just made it so we prepare the Registration object well before registration - and then added an accessor that gives us the two variables we need to pass to register things. it's pretty dirty but I know of no better way to
22:39Lyude: do this :P
22:40Lyude: now I've got to go and fix some other compilation errors that weren't showing up before - but hopefully this should work
22:42fdobridge_: <tom3026> yeah idk, im not that good with bash but after adding & with a "wait" directly after it, it properly runs the shards one by one; without it im only getting one shard/fraction
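The sharding problem discussed above (a serial script that stops after the first fraction reports failures) can be sketched as a runner that builds one command per shard and keeps going regardless of the exit status. The exact argument format of `--deqp-fraction`, the 0-based index, the log file names, and both helper names are assumptions to verify against `deqp-vk --help`. The sketch only prints the commands; nothing is executed unless `run_serially` is called.

```python
import subprocess

def shard_commands(n, deqp="./deqp-vk", first_index=0, extra=()):
    """Build one command line per shard.

    Assumption: the fraction is given as "<index>,<count>"; whether the
    index starts at 0 or 1 is not confirmed by the log.
    """
    return [
        [deqp, f"--deqp-fraction={i},{n}",
         f"--deqp-log-filename=shard-{i}.qpa", *extra]
        for i in range(first_index, first_index + n)
    ]

def run_serially(commands, run=subprocess.call):
    """Run every command in order and keep going after a non-zero exit."""
    return [(cmd, run(cmd)) for cmd in commands]

commands = shard_commands(4)
for cmd in commands:
    print(" ".join(cmd))
```

Running the shards one after another avoids the shader-cache concerns raised about parallel runs, and collecting every exit code means the 53 failures in one fraction no longer prevent the other fractions from starting.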
23:01Lyude: and it compiles!!! i am going to handle some other stuff I need to do today before trying to load this because I would be kind of shocked if it just worked and didn't lead into more time spent troubleshooting
23:12karolherbst: you'd be surprised
23:19f_: too much "random" things going on lol
23:19f_: I'll upgrade and report back..