05:55 fdobridge_: <g​fxstrand> @karolherbst Have you run rusticl on nvk recently?
08:45 fdobridge_: <g​fxstrand> @karolherbst @airlied The modifiers situation is worse than I thought. See point 2 in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27672
09:00 fdobridge_: <a​irlied> yup that agrees with what I worked out last week
12:21 fdobridge_: <k​arolherbst🐧🦀> not yet, but I should 😄
12:45 fdobridge_: <!​DodoNVK (she) 🇱🇹> Didn't you try getting rusticl working back in 2022?
13:13 fdobridge_: <k​arolherbst🐧🦀> it kinda worked on nouveau, but the problem was the broken compiler
13:13 fdobridge_: <k​arolherbst🐧🦀> so before NAK NVK had the same problem
13:13 fdobridge_: <k​arolherbst🐧🦀> it's really pointless to run rusticl on any vulkan impl not supporting `int16` or `int8` properly
15:28 fdobridge_: <r​hed0x> @gfxstrand can i bother you for a few minutes?
15:29 fdobridge_: <r​hed0x> I'm trying to get started working on NVK
15:30 fdobridge_: <r​hed0x> and I picked conservative rasterization as something to potentially look into because it's
15:30 fdobridge_: <r​hed0x> A: not yet implemented
15:30 fdobridge_: <r​hed0x> B: hopefully just a bunch of registers
15:31 fdobridge_: <r​hed0x> conservative rasterization happens to be missing from the nvidia headers, so I dumped the pushbufs from the proprietary driver
15:31 fdobridge_: <r​hed0x> ```
15:31 fdobridge_: <r​hed0x> [0x0000001f] HDR 2001008c subch 0 NINC | [0x0000001f] HDR 80000452 subch 0 IMMD
15:31 fdobridge_: <r​hed0x> mthd 0230 unknown method <
15:31 fdobridge_: <r​hed0x> .VALUE = 0x31103 <
15:31 fdobridge_: <r​hed0x> <
15:31 fdobridge_: <r​hed0x> [0x00000021] HDR 80010452 subch 0 IMMD <
15:31 fdobridge_: <r​hed0x> mthd 1148 unknown method mthd 1148 unknown method
15:31 fdobridge_: <r​hed0x> .VALUE = 0x1 | .VALUE = 0x0
15:31 fdobridge_: <r​hed0x>
15:31 fdobridge_: <r​hed0x>
15:31 fdobridge_: <r​hed0x> 0x31113 for underestimate
15:31 fdobridge_: <r​hed0x> 0x31100 for overestimate with 0.0
15:31 fdobridge_: <r​hed0x> 0x31102 for overestimate with 0.5
15:31 fdobridge_: <r​hed0x> 0x31103 for overestimate with 0.75 (max)
15:31 fdobridge_: <r​hed0x> ```
15:31 fdobridge_: <r​hed0x> 0x1148 also shows up in the GL driver for conservative rasterization
15:32 fdobridge_: <r​hed0x> one thing that irritates me a bit is that first call to method 0x0230
15:33 fdobridge_: <r​hed0x> pretty much everything NVK does goes through `P_IMMD => NVC0_FIFO_PKHDR_IL(int subc, int mthd, uint16_t data)`
15:33 fdobridge_: <r​hed0x> which results in the HDRs starting with 0x80000000
15:33 fdobridge_: <r​hed0x> this one starts with 0x20000000
15:34 fdobridge_: <m​arysaka> There is an MR for conservative rasterization btw https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25668
15:34 fdobridge_: <r​hed0x> damnit :c
15:36 fdobridge_: <m​arysaka> for 0x20000000 I'm pretty sure you want P_MTHD @rhed0x
15:37 fdobridge_: <m​arysaka> so for example
15:37 fdobridge_: <m​arysaka>
15:37 fdobridge_: <m​arysaka> ```c
15:37 fdobridge_: <m​arysaka> P_MTHD(p, NVC597, SET_MME_MEM_ADDRESS_A);
15:37 fdobridge_: <m​arysaka> P_NVC597_SET_MME_MEM_ADDRESS_A(p, high32(data_addr));
15:37 fdobridge_: <m​arysaka> P_NVC597_SET_MME_MEM_ADDRESS_B(p, low32(data_addr));
15:37 fdobridge_: <m​arysaka> /* Start 3 dwords into MME RAM */
15:37 fdobridge_: <m​arysaka> P_NVC597_SET_MME_DATA_RAM_ADDRESS(p, 3);
15:37 fdobridge_: <m​arysaka> P_IMMD(p, NVC597, MME_DMA_WRITE, 20);
15:37 fdobridge_: <m​arysaka> ```
15:38 fdobridge_: <m​arysaka> the 3 `P_NVC597_` calls will be in increment mode (so 0x20000000 if I'm not mistaken)
15:38 fdobridge_: <m​arysaka> and the P_IMMD will cause the sequence to end
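The two header forms discussed here can be decoded by hand. A minimal sketch, assuming the usual layout for these push buffers (type in bits 31:29, dword count or inline value in bits 28:16, subchannel in bits 15:13, method byte address divided by 4 in bits 11:0). The names `pkhdr`, `pkhdr_decode` and `pkhdr_encode` are hypothetical helpers for illustration, not the NVK or nouveau macros:

```c
#include <assert.h>
#include <stdint.h>

/* Header type lives in the top three bits of the header dword. */
enum pkhdr_type {
   PKHDR_TYPE_INC  = 1, /* incrementing methods, printed as NINC in the dump */
   PKHDR_TYPE_IMMD = 4, /* value is carried inline in the header */
};

struct pkhdr {
   uint32_t type;  /* bits 31:29 */
   uint32_t count; /* bits 28:16: dword count, or the inline value for IMMD */
   uint32_t subc;  /* bits 15:13 */
   uint32_t mthd;  /* bits 11:0 hold (byte address / 4); stored here as bytes */
};

/* Split a header dword into its fields (assumed layout, see above). */
static inline struct pkhdr
pkhdr_decode(uint32_t hdr)
{
   struct pkhdr h = {
      .type  = hdr >> 29,
      .count = (hdr >> 16) & 0x1fff,
      .subc  = (hdr >> 13) & 0x7,
      .mthd  = (hdr & 0xfff) << 2,
   };
   return h;
}

/* Build a header dword from its fields; the inverse of pkhdr_decode(). */
static inline uint32_t
pkhdr_encode(uint32_t type, uint32_t subc, uint32_t mthd, uint32_t count)
{
   assert(type < 8 && subc < 8 && count < (1u << 13));
   assert((mthd & 3) == 0 && (mthd >> 2) < (1u << 12));
   return (type << 29) | (count << 16) | (subc << 13) | (mthd >> 2);
}
```

Under that layout, 0x2001008c decodes to an incrementing header for method 0x0230 with one data dword following, and 0x80010452 decodes to an immediate write of 1 to method 0x1148, which matches the dump.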
15:38 fdobridge_: <r​hed0x> okay different question then: any idea why that MR sets up a macro just to set the extra overestimate?
15:39 fdobridge_: <k​arolherbst🐧🦀> `P_IMMD` should be immediate or hdr+ value afaik
15:39 fdobridge_: <k​arolherbst🐧🦀> not sure we have any smartness in place to append it to the last one
15:39 fdobridge_: <m​arysaka> yes but here it will end the last sequence right?
15:39 fdobridge_: <r​hed0x> what does HDR mean here?
15:39 fdobridge_: <k​arolherbst🐧🦀> ohh sure
15:39 fdobridge_: <k​arolherbst🐧🦀> header
15:39 fdobridge_: <r​hed0x> right, that makes sense
15:39 fdobridge_: <m​arysaka> see here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25668
15:40 fdobridge_: <k​arolherbst🐧🦀> maybe we should write header instead of HDR because that means something else these days 😄
15:40 fdobridge_: <r​hed0x> yeah I'm already staring at that
15:40 fdobridge_: <r​hed0x> ```c
15:40 fdobridge_: <r​hed0x> void
15:40 fdobridge_: <r​hed0x> nvk_mme_set_conservative_raster_state(struct mme_builder *b) {
15:40 fdobridge_: <r​hed0x>    struct mme_value new_state = mme_load(b);
15:40 fdobridge_: <r​hed0x>    struct mme_value old_state = nvk_mme_load_scratch(b, CONSERVATIVE_RASTER_STATE);
15:40 fdobridge_: <r​hed0x>
15:40 fdobridge_: <r​hed0x>    mme_if(b, ine, new_state, old_state) {
15:41 fdobridge_: <r​hed0x>       nvk_mme_store_scratch(b, CONSERVATIVE_RASTER_STATE, new_state);
15:41 fdobridge_: <r​hed0x>       mme_set_priv_reg(b, new_state, mme_imm(BITFIELD_RANGE(3, 23)), mme_imm(0x418800));
15:41 fdobridge_: <r​hed0x>    }
15:41 fdobridge_: <r​hed0x> }
15:41 fdobridge_: <r​hed0x> ```
15:41 fdobridge_: <r​hed0x> I just dont really understand why that macro exists
15:41 fdobridge_: <m​arysaka> that reminds me we should also make use of the "prefetch" bit someday if we start writing commands on the GPU directly...
15:42 fdobridge_: <m​arysaka> (my knowledge of this is quite rusty and based on the original research we did for the Switch, I should update the names in my brain someday 😅 )
15:43 fdobridge_: <m​arysaka> to only set the private reg if the state changed
15:43 fdobridge_: <m​arysaka> because it is costly as it needs to wait on the firmware side to reply
15:43 fdobridge_: <r​hed0x> is changing that particularly expensive?
15:43 fdobridge_: <r​hed0x> ah
15:43 fdobridge_: <m​arysaka> yeah it's the "falcon" methods
15:44 fdobridge_: <m​arysaka> not sure how they route that with GSP
15:44 fdobridge_: <r​hed0x> prop doesnt seem to call a macro or am I overlooking that?
15:46 fdobridge_: <m​arysaka> prop? `mme_set_priv_reg` here is the implementation of the `NVK_MME_SET_PRIV_REG` macro if that's what you are talking about :aki_thonk:
15:46 fdobridge_: <r​hed0x> ```
15:46 fdobridge_: <r​hed0x> [0x0000001f] HDR 2001008c subch 0 NINC
15:46 fdobridge_: <r​hed0x> mthd 0230 unknown method
15:46 fdobridge_: <r​hed0x> .VALUE = 0x31103
15:46 fdobridge_: <r​hed0x> ```
15:47 fdobridge_: <r​hed0x> thats what i got out of the prop driver
15:47 fdobridge_: <r​hed0x> with the value differing depending on over/underestimation and the extra size
15:50 fdobridge_: <m​arysaka> hmm maybe @vdpafaor knows?
15:57 fdobridge_: <m​arysaka> @rhed0x on what gen are you testing?
15:57 fdobridge_: <r​hed0x> ampere
15:57 fdobridge_: <m​arysaka> hmm
15:57 fdobridge_: <m​arysaka> I know that on Maxwell/Pascal it goes via 0x418800 priv reg but maybe they changed that on Turing/Ampere?
15:59 fdobridge_: <k​arolherbst🐧🦀> lemme check something...
15:59 fdobridge_: <m​arysaka> 0x418800 being `gr_pri_gpcs_setup_debug` as per nvgpu (<https://github.com/alliedvision/linux_nvidia_jetson/blob/4609206e6594f1eb21e43e69afa8974cf20cc096/kernel/nvgpu/drivers/gpu/nvgpu/hal/gr/init/gr_init_gv11b.c#L61>)
15:59 fdobridge_: <k​arolherbst🐧🦀> we actually poke that reg from GL :ferrisUpsideDown:
15:59 fdobridge_: <k​arolherbst🐧🦀> and we never knew or something
15:59 fdobridge_: <k​arolherbst🐧🦀> check the `_conservative_raster_state` mme things in the gl driver
16:00 fdobridge_: <k​arolherbst🐧🦀> pre volta we have a `send (extrinsrt 0x0 $r2 0 12 11) /* sends 0x418800 */`
16:00 fdobridge_: <k​arolherbst🐧🦀> ehh pre turing
16:00 fdobridge_: <k​arolherbst🐧🦀> but in turing we also write to the same: `send(0x00418800);`
16:07 fdobridge_: <m​arysaka> it's possible that moved on ampere :aki_thonk:
16:16 fdobridge_: <r​hed0x> thanks btw :ferris_happy:
17:13 fdobridge_: <b​enjaminl> I only have maxwell to test with, and got an MME call when I dumped the proprietary driver that is pretty similar to `nvk_mme_set_conservative_raster_state` from that MR
17:13 fdobridge_: <b​enjaminl> the 0230 method you're getting is definitely new
17:15 fdobridge_: <b​enjaminl> that 0x31103 value also looks like it's packed differently, so you'll probably have to test it with a bunch of different parameters and figure out what the bit ranges are
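One way to start on the bit ranges is to XOR the captured values against each other and see which bits ever change. A minimal sketch, using only the four values from the dump earlier in the log; `varying_bits` is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a mask of every bit that differs between at least two samples.
 * Comparing each sample against the first one is enough: a bit that differs
 * between any pair must differ from samples[0] in one of the two. */
static inline uint32_t
varying_bits(const uint32_t *samples, size_t n)
{
   assert(n > 0);
   uint32_t mask = 0;
   for (size_t i = 1; i < n; i++)
      mask |= samples[0] ^ samples[i];
   return mask;
}
```

For 0x31113, 0x31100, 0x31102 and 0x31103 the mask is 0x13, so only bits 1:0 and bit 4 move. That is consistent with bits 1:0 holding the extra overestimate in steps of 0.25 (0 for 0.0, 2 for 0.5, 3 for 0.75) and bit 4 selecting underestimate, but that reading is a guess from four samples and would need more captures to confirm.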
17:16 fdobridge_: <g​fxstrand> They may have fixed it on Turing
17:17 fdobridge_: <g​fxstrand> That wouldn't surprise me at all
17:17 fdobridge_: <g​fxstrand> going through FALCON is perf death so they'd want to fix that up
17:17 fdobridge_: <b​enjaminl> a bit more context on why the macro is checking the previous value is that mesa's dynamic state tracking puts over/underestimation and enable/disable in the same field, but in the hardware toggling enable/disable is much cheaper than toggling over/under
17:18 fdobridge_: <b​enjaminl> so without the check, we would be pessimistically assuming that `mode = OVER` means that the previous value was `enabled = false; mode = UNDER;`, and setting both
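The check described here amounts to caching the last value written and skipping the expensive write when nothing changed. A minimal CPU-side sketch of that idea, not the actual MME macro: `raster_state_cache` and the write counter are hypothetical stand-ins for the scratch register and the priv register write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct raster_state_cache {
   uint32_t last_state;  /* plays the role of the MME scratch register */
   unsigned priv_writes; /* counts the expensive writes, for illustration */
};

/* Returns true if the state changed and the expensive write was issued. */
static inline bool
set_conservative_raster_state(struct raster_state_cache *c, uint32_t new_state)
{
   assert(c != NULL);
   if (new_state == c->last_state)
      return false; /* unchanged: skip the firmware round trip */

   c->last_state = new_state;
   c->priv_writes++; /* stand-in for the firmware-routed register write */
   return true;
}
```

Setting the same state twice in a row only pays for one write; the second call returns early.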
17:35 fdobridge_: <r​hed0x> maybe i should try that MR later
18:58 fdobridge_: <p​homes_> I just tested on turing (tu104):
18:58 fdobridge_: <p​homes_> Test run totals:
18:58 fdobridge_: <p​homes_> Passed: 48/343 (14.0%)
18:58 fdobridge_: <p​homes_> Failed: 100/343 (29.2%)
18:58 fdobridge_: <p​homes_> Not supported: 195/343 (56.9%)
18:58 fdobridge_: <p​homes_> Warnings: 0/343 (0.0%)
18:58 fdobridge_: <p​homes_> Waived: 0/343 (0.0%)
20:00 phodius: hi when will EGL_EXT_image_dma_buf_import be functional in nvk?
22:04 phodius: can i get an invite link to the discord #nouveau?
22:23 fdobridge_: <!​DodoNVK (she) 🇱🇹> phodius: https://discord.gg/ZAzuXNZw4k
22:23 fdobridge_: <e​nergetic_parrot_03598> thanks
22:48 fdobridge_: <g​fxstrand> @vdpafaor Where are we at with NAK on Maxwell? Do you have a good sense for how much is left?
23:05 fdobridge_: <b​enjaminl> haven't had time to work on it in a while, but the big missing pieces currently are:
23:05 fdobridge_: <b​enjaminl>
23:05 fdobridge_: <b​enjaminl> - scheduling (I've been testing with `NAK_DEBUG=serial`, I suspect some of the instruction latencies are different, but haven't looked into it)
23:05 fdobridge_: <b​enjaminl> - atomics _mostly_ don't pass the CTS yet
23:05 fdobridge_: <b​enjaminl> - there are a fair number of 3d-related instructions that haven't been implemented yet
23:06 fdobridge_: <b​enjaminl> this reminds me... I have an MR from a month ago that's like 90% done to fix a bunch of texture op test failures
23:10 fdobridge_: <g​fxstrand> Yeah, we need to get scheduling figured out
23:15 fdobridge_: <b​enjaminl> does it work on turing?
23:17 fdobridge_: <e​nergetic_parrot_03598> what would it take to get wayland compositors running on nvk? it looks like it just needs EGL_EXT_image_dma_buf_import. where would the code be located if it was implemented?
23:19 fdobridge_: <g​fxstrand> Yeah, scheduling on Turing is fine
23:20 fdobridge_: <g​fxstrand> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795
23:21 soreau: gfxstrand: does that need extra kernel bits?
23:21 fdobridge_: <g​fxstrand> Nope
23:21 fdobridge_: <g​fxstrand> I think it's mostly working (modulo nouveau GL's modifiers being bullshit)
23:22 soreau: gfxstrand: does it work on all hw?
23:22 fdobridge_: <g​fxstrand> Should
23:22 soreau: well, for nvk supported hw
23:23 fdobridge_: <g​fxstrand> Oh, and someone needs to implement linear render hacks.
23:24 soreau: for wsi?
23:25 fdobridge_: <g​fxstrand> For WSI, we can use the PRIME blit path.
23:26 fdobridge_: <g​fxstrand> But in order to support modifiers we need to be able to render to things with `DRM_FORMAT_MOD_LINEAR` and that's not something NVIDIA hardware wants to do.
23:28 soreau: hm, needs a bigger hammer?
23:30 fdobridge_: <g​fxstrand> Yeah, we need something. There's a few options.
23:57 fdobridge_: <e​nergetic_parrot_03598> i get WARNING: NVK is not a conformant Vulkan implementation, testing use only.
23:57 fdobridge_: <e​nergetic_parrot_03598> Selected GPU 0: TU106, type: DiscreteGpu
23:57 fdobridge_: <e​nergetic_parrot_03598> vkcube: ../src/vulkan/wsi/wsi_common_drm.c:441: wsi_configure_native_image: Assertion `!"Failed to find a supported modifier! This should never " "happen because LINEAR should always be available"' failed.