01:13noocsharp: what makes the firmware more difficult to get to for maxwell and onward, is it just hard to find them in the binary?
01:35fdobridge_: <benjaminl> are there preferences on whether to "ununify" ops on SM50 at this point? There used to be several instructions that don't exist on SM50 but were implemented as something else in the encoder. Most of these were moved to dedicated instructions at some point, with the split between SM50 and SM70 happening in the builder
01:35fdobridge_: <benjaminl> I'm currently looking at `iabs`, which is implemented in the encoder as `i2i`
01:37fdobridge_: <benjaminl> it's missing the abs src modifier bit, which is a trivial fix, but I'm also considering splitting out a separate `OpI2I` and emitting that in the builder `iabs` method
01:37fdobridge_: <benjaminl> I guess I mostly just get confused when the instructions in the IR don't match up to the ones we're actually using
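A toy model of the split benjaminl describes, where the builder picks the instruction per SM generation instead of the encoder silently rewriting `iabs` as `i2i`. This is only a sketch: the class names `Src`, `OpIAbs`, and `OpI2I`, the `src_abs` field, and the `build_iabs` helper are hypothetical stand-ins written in Python for illustration, not the actual NAK types or builder API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Src:
    """A source operand with an optional absolute-value modifier (hypothetical)."""
    reg: str
    src_abs: bool = False


@dataclass(frozen=True)
class OpIAbs:
    """Dedicated integer-abs op, assumed here to exist only on SM70+."""
    dst: str
    src: Src


@dataclass(frozen=True)
class OpI2I:
    """Integer-to-integer conversion; with src_abs set it acts as an abs."""
    dst: str
    src: Src


def build_iabs(sm: int, dst: str, src: str):
    """Emit the op the hardware really has, so the IR matches what is encoded."""
    if sm >= 70:
        return OpIAbs(dst, Src(src))
    # On SM50 the abs is expressed through the source modifier on i2i.
    # Forgetting src_abs=True here is the kind of bug discussed above.
    return OpI2I(dst, Src(src, src_abs=True))
```

With this shape, the encoder for each generation only ever sees ops that exist on that generation, and the missing modifier bit becomes visible in the IR rather than hidden in the encoder.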
03:16fdobridge_: <gfxstrand> @airlied you familiar with this one?
03:17fdobridge_: <gfxstrand> https://cdn.discordapp.com/attachments/1034184951790305330/1197016897108054147/message.txt?ex=65b9bbdb&is=65a746db&hm=87f8140ad7e8734056f914f87adf257fc3c542a5f179be94c859f4770def2c15&
03:17fdobridge_: <gfxstrand> dakr: ^^
03:18fdobridge_: <gfxstrand> Admittedly, it's been a minute since I rev'd my kernel on that box.
03:22fdobridge_: <gfxstrand> For those wondering, ^^ is what happened to my CTS run on ampere.
03:39fdobridge_: <gfxstrand> I wonder if these magic fencing patches will fix it.
04:09fdobridge_: <airlied> Any sign of a problem before that? Like channel hangs or VM alloc fails?
06:42fdobridge_: <!DodoNVK (she) 🇱🇹> Normal 6.7 should work now (because it has the initial GSP support with various fixes)
14:42fdobridge_: <marysaka> That should hopefully be okay to get mixins on headers around https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27115
14:43fdobridge_: <gfxstrand> Nope, not for about 5k seconds.
14:44fdobridge_: <!DodoNVK (she) 🇱🇹> I'm getting Minecraft modding flashbacks from that word
14:46fdobridge_: <marysaka> Same but it's not as fancy as Sponge's mixins. (edited)
14:53fdobridge_: <gfxstrand> We've got a different version in the conservative rasterization MR. Mind giving that a look?
14:54fdobridge_: <marysaka> I missed that one 😓
14:54fdobridge_: <gfxstrand> Yeah, I did some fixing to make it so we always give a semi-full path: "nvidia/classes/foo.h" vs. "mixin/classes/foo.h"
14:55fdobridge_: <gfxstrand> Seems a little dangerous but I think I like that over prefixing everything.
14:56fdobridge_: <marysaka> yeah so it looks mostly the same I guess I will close my MR and rebase on those patches
14:56fdobridge_: <gfxstrand> Okay
14:57fdobridge_: <gfxstrand> I'm going to try and merge that today. I'm not entirely happy with the "if file exists" checks in the meson but IDK that there's a better option.
14:57fdobridge_: <marysaka> what is the min version needed to compile mesa? because I *think* the |= syntax on dict is since ~3.9 (edited)
14:58fdobridge_: <!DodoNVK (she) 🇱🇹> Of Meson? I guess it's 1.3 with NAK enabled
14:58fdobridge_: <marysaka> of python in general
14:58fdobridge_: <marysaka> sorry I forgot words here I need another cup of tea 😅
14:59fdobridge_: <marysaka> what is the min version of python needed to compile mesa? because I *think* the |= syntax on dict is since 3.9 (<https://peps.python.org/pep-0584/>) (edited)
15:00fdobridge_: <gfxstrand> I think 3.6 or 3.7 (edited)
15:01fdobridge_: <!DodoNVK (she) 🇱🇹> Meson requires Python 3.7 by itself
15:01fdobridge_: <marysaka> Will add a comment then
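For reference, a small sketch of the compatibility issue being discussed. In-place dict merging with `|=` was added by PEP 584 in Python 3.9; on older interpreters the statement still parses but raises `TypeError` at runtime, so `dict.update` is the portable spelling. The helper name `merge_in_place` is made up for this example and is not taken from the Mesa scripts.

```python
import sys


def merge_in_place(target: dict, extra: dict) -> dict:
    """Merge extra into target, mutating target, on any Python 3 version."""
    if sys.version_info >= (3, 9):
        # PEP 584: dict gained the | and |= operators in 3.9.
        target |= extra
    else:
        # Equivalent behaviour that also works on 3.7 and 3.8.
        target.update(extra)
    return target


methods = {"SET_OBJECT": 0x0000}
merge_in_place(methods, {"NO_OPERATION": 0x0100})
print(methods)
```

In practice a script that has to run on 3.7 can simply call `update` unconditionally; the version check is only there to show both spellings side by side.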
17:13fdobridge_: <gfxstrand> With the latest fence patches, I didn't get this fail. Instead, I just got what looked like a standard channel timeout.
17:40fdobridge_: <benjaminl> does CI check python 3.7? I ask because there's a comment at the top of `class_parser` that says it probably needs 3.9 and I'd like to remove it if it's been tested with 3.7
17:47fdobridge_: <gfxstrand> I don't think so
17:48fdobridge_: <gfxstrand> I think that requirement comes from ChromeOS and they don't care about NVK
17:48fdobridge_: <gfxstrand> If we require 3.9, that's really not going to be the thing distros struggle with. 😅
17:49fdobridge_: <benjaminl> alright I'll leave the comment then
17:50fdobridge_: <benjaminl> I tried to test it myself but decided that getting a compatible mako version set up was too much of a pain lol
17:53fdobridge_: <gfxstrand> Yeah
18:10fdobridge_: <gfxstrand> Just got it again. Same CTS test. This time on the `nouvellis/alirlied-gsp-fixes`
18:11fdobridge_: <gfxstrand> @airlied ^^
18:16fdobridge_: <gfxstrand> `dEQP-VK.synchronization.timeline_semaphore.device_host.write_copy_buffer_read_copy_buffer_to_image.buffer_16384` is the test that hits it
18:16fdobridge_: <gfxstrand> Test passes when run by itself.
18:17fdobridge_: <gfxstrand> It only seems to fail as part of a full CTS run. 🤦🏻‍♀️
18:17fdobridge_: <gfxstrand> And it runs fast so I don't think it's timing out
18:43fdobridge_: <gfxstrand> @marysaka does your dumper know how to dump QMDs?
18:43fdobridge_: <marysaka> not currently
18:43fdobridge_: <gfxstrand> I'm very unconvinced by our `*_SM_CONFIG_SHARED_MEM_SIZE` configuration
18:44fdobridge_: <marysaka> how does NVIDIA upload those atm on dGPU?
18:44fdobridge_: <gfxstrand> IDK
18:44fdobridge_: <marysaka> I know that on tegra it's an inline upload from the command buffer
18:44fdobridge_: <gfxstrand> Hrm... Let me do a dump and see what's there
18:44fdobridge_: <gfxstrand> Maybe I can find it in a DMA
18:53fdobridge_: <gfxstrand> Looks like they're doing a DMA. Now let me see if I can decode it.
18:53fdobridge_: <gfxstrand> @marysaka Did you ever upstream `nv_push_dump`?
18:54fdobridge_: <marysaka> It's on this branch https://gitlab.freedesktop.org/marysaka/mesa/-/tree/nouveau/pushbuf-dump-tool?ref_type=heads
18:56fdobridge_: <marysaka> and I have this script https://gist.github.com/marysaka/3b203164793cb6d61834debc3dfda04e/revisions
18:57fdobridge_: <airlied> @gfxstrand what ampere gpu is that btw? I'll plug one back in today and give it a run
19:04fdobridge_: <gfxstrand> 3060
19:04fdobridge_: <Sid> I could try running it on my 1660Ti too, if needed
19:04fdobridge_: <gfxstrand> Oh, right. You had to hack up the isaspec stuff
19:05fdobridge_: <marysaka> yeah :nya_flop:
19:05fdobridge_: <gfxstrand> How about we just get rid of isaspec?
19:05fdobridge_: <gfxstrand> This seems like a solvable problem. 😂
19:05fdobridge_: <marysaka> that would be awesome 😄
19:06fdobridge_: <marysaka> I wanted to fix isaspec but I forgot to look into it again
19:06fdobridge_: <gfxstrand> Like, we're already hand-typing the decode
19:06fdobridge_: <marysaka> not for mme macro tho?
19:07fdobridge_: <marysaka> I remember having to fix some stuffs in the Fermi isaspec definition to get better disassembly (that I probably forgot to push somewhere)
19:07fdobridge_: <gfxstrand> @karolherbst Does Volta use the Turing MME or the Fermi one?
19:08fdobridge_: <karolherbst🐧🦀> fermi
19:08fdobridge_: <gfxstrand> Okay
19:08fdobridge_: <gfxstrand> Yet more weirdness
19:08fdobridge_: <karolherbst🐧🦀> yeah...
19:08fdobridge_: <gfxstrand> It really is a Maxwell with Turing SMs
19:08fdobridge_: <karolherbst🐧🦀> indirect draws are the missing thing, right?
19:08fdobridge_: <karolherbst🐧🦀> yeah.. more or less
19:08fdobridge_: <karolherbst🐧🦀> well.. more like pascal
19:09fdobridge_: <karolherbst🐧🦀> pascal changed compute a little
19:09fdobridge_: <marysaka> (Volta is just Maxwell Gen 4 right)
19:09fdobridge_: <karolherbst🐧🦀> pain
19:11fdobridge_: <gfxstrand> Really, it's a "Turing isn't ready but AMD might do something cool so we need to tape out some hardware. What do ya got, guys?"
19:16fdobridge_: <gfxstrand> Either that or they fab'd a few so the compiler folks could get real-world testing with the new ISA while the other bits got finalized and someone in marketing decided to sell it.
19:19fdobridge_: <gfxstrand> No, I will not RiiR the MME stuff while I'm at it...
19:24fdobridge_: <gfxstrand> Maybe I'll genxml them.
19:24fdobridge_: <gfxstrand> nah
19:27fdobridge_: <mohamexiety> tbf it may not be an entirely bad idea to lock volta support behind experimental like pascal/maxwell/etc because it's genuinely rare HW
19:28fdobridge_: <gfxstrand> It already is
19:41Lyude: sheesh, there are a lot of commits from the asahi tree that don't compile and need another commit to actually build properly
19:43airlied: yeah I except some of it is rust compiler changes over time
19:45airlied: expect
19:52fdobridge_: <!DodoNVK (she) 🇱🇹> Rarer than Tempest 3000?
20:03Lyude: currently trying to figure out where kernel::device::Device::raw() comes from, since I see it used in some asahi commits but can't seem to find any implementation for it
20:21Lyude: airlied: btw - any idea if lina actually hangs out on the #asahi-dev channel or elsewhere? wondering if she might know the answer to this
20:21fdobridge_: <airlied> @gfxstrand https://paste.centos.org/view/a3c8b804 will fix the warning, but not sure if it will fix the causes of the warning
20:22Lyude: (also it appears I meant drm::device::Device )
20:29Lyude: actually now looking at it more closely I'm not actually sure how this was ever meant to work. maybe raw() just got renamed at some point
20:30Lyude: since the next patch moves over to using raw_mut(), and the original patch seems to expect raw() to provide a mutable reference
21:01Lyude: btw airlied, dakr: https://gitlab.freedesktop.org/lyudess/linux/-/commits/0ad7088f current status on things I've pulled in, it appears I will likely have to go back and pull a number of other things in so that I can get access to the rust drm gem abstractions fwiw so I expect a lot more stuff to appear soon
21:01Lyude: some of those commits will definitely be squashed as well eventually to make sure everything builds on each commit
21:05fdobridge_: <gfxstrand> There's a push command type that we can't decode. Type 6. NVIDIA seems to like using it
21:06airlied: Lyude: I thought lina was on discord or zulip, but not seeing them around either currently
21:07fdobridge_: <airlied> @gfxstrand so that warn shouldn't be fatal either
21:21fdobridge_: <gfxstrand> @airlied All I know is that I see a device lost and that warning
21:21fdobridge_: <gfxstrand> I've not dug any deeper to figure out why it's killing my channel
21:22fdobridge_: <gfxstrand> `LOAD_INLINE_QMD_DATA`
21:22fdobridge_: <gfxstrand> Which we should probably use...
21:22fdobridge_: <gfxstrand> It'd save us a lot of headaches
21:22fdobridge_: <marysaka> huh
21:22fdobridge_: <marysaka> Oh that's Pascal+
21:25fdobridge_: <marysaka> On T210 (Maxwell) they were uploading it with that one inline way
21:25fdobridge_: <airlied> okay I suspect now you'll see just device lost 🙂
21:25fdobridge_: <airlied> with that patch
21:25fdobridge_: <gfxstrand> @marysaka https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27126
21:25fdobridge_: <marysaka> I think I reproduced that on Pascalinette some years ago let me find it
21:25fdobridge_: <marysaka> https://github.com/Pascalinette/gpu_playground/blob/master/maxwell/dma_copy_engine.py#L33
21:25fdobridge_: <marysaka> That one thinggy
21:26fdobridge_: <gfxstrand> Okay, now I can actually look at this QMD. 😂
21:26fdobridge_: <airlied> https://paste.centos.org/view/raw/80b22bfc result of a cts run on my ampere just now
21:26fdobridge_: <marysaka> no more isaspec, thank you :SoniiPray:
21:26fdobridge_: <gfxstrand> Yeah, look at the LOC count.
21:27fdobridge_: <marysaka> missing a 6 for my liking /s
21:27fdobridge_: <marysaka> but yeah we should probably do inline upload that way for gen before Pascal (might be useful for Kepler? idk)
21:28fdobridge_: <gfxstrand> What do you mean?
21:28fdobridge_: <marysaka> We should also add some code in the parser to make it understand LAUNCH_DMA inline data
21:28fdobridge_: <gfxstrand> Yeah...
21:29fdobridge_: <marysaka> (-606)
21:29fdobridge_: <gfxstrand> And maybe decode QMDs
21:29fdobridge_: <gfxstrand> hehe
21:39fdobridge_: <airlied> those cond render/xfb fails are new, I wonder are they new tests
21:40fdobridge_: <gfxstrand> I'm not seeing them and I thought I was running on a pretty new CTS
21:42fdobridge_: <airlied> I just rebased to main 5m ago
21:43fdobridge_: <gfxstrand> Okay, they may be new
21:43fdobridge_: <gfxstrand> I'm on the 1.7.3 branch
21:44fdobridge_: <airlied> anyways I've sent kernel warning fix to the mailing list
21:49fdobridge_: <gfxstrand> _is very confused by the blob's configuration of_ `MIN_SM_CONFIG_SHARED_MEM_SIZE` (edited)
21:50fdobridge_: <gfxstrand> It sets the min to 32k
21:50fdobridge_: <gfxstrand> which is a lot
21:51fdobridge_: <karolherbst🐧🦀> well.. we don't know what those values do
21:51fdobridge_: <karolherbst🐧🦀> so there is that
21:51fdobridge_: <gfxstrand> Yeah
21:51fdobridge_: <gfxstrand> I think it has to do with how shared memory is partitioned among workgroups
21:51fdobridge_: <karolherbst🐧🦀> yeah...
21:52fdobridge_: <gfxstrand> Where you can have multiple workgroups in-flight with different shared memory sizes and different workgroup sizes and the HW has to be able to ringbuffer it all.
21:52fdobridge_: <gfxstrand> But how we're supposed to calculate that is a mystery
21:55fdobridge_: <gfxstrand> It seems setting 64k as a maximum is okay even with large workgroup sizes.
21:56fdobridge_: <gfxstrand> Poking at the blob, it seems they set max=64k, min=32k
21:56fdobridge_: <gfxstrand> IDK why the min is so large
21:56fdobridge_: <gfxstrand> With min=64k whenever the shader wants more than 32k
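The behaviour gfxstrand reports from poking at the blob can be written down as a small sketch. It only mirrors the three observations above (max=64k, min=32k, and min=64k once the shader wants more than 32k); what the hardware actually does with these values is unknown, as noted in the discussion. The function name and the byte-based return values are assumptions for illustration, and the real `*_SM_CONFIG_SHARED_MEM_SIZE` field encodings are not modeled.

```python
KIB = 1024


def observed_shared_mem_config(shader_shared_bytes: int):
    """Return (min_bytes, max_bytes) as observed from the blob.

    This is a guess at the pattern, not a documented rule.
    """
    max_size = 64 * KIB
    if shader_shared_bytes > 32 * KIB:
        # The blob appears to raise the minimum to match the maximum.
        min_size = 64 * KIB
    else:
        min_size = 32 * KIB
    return min_size, max_size


for wanted in (0, 16 * KIB, 32 * KIB, 48 * KIB):
    print(wanted, observed_shared_mem_config(wanted))
```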
23:06fdobridge_: <airlied> is nvk_cmd_copy_query_pool_results_mme dead code?