00:01 mhenning[d]: Okay. My notes put it at a consistent 2% slowdown in horizon zero dawn for 2 queues (7% slowdown for 4 queues), but those results might be out of date now that we've landed some of the synchronization improvements
00:06 gfxstrand[d]: `Pass: 59903, Fail: 11, Crash: 2, Skip: 72080, Timeout: 4, Duration: 1:13:19, Remaining: 1:24:23`
00:31 mhenning[d]: I filed https://gitlab.freedesktop.org/mesa/mesa/-/issues/14211 for the multi-queue thing
01:01 gfxstrand[d]: I'm starting to wonder if the kernel isn't maybe giving us maps with dirty caches sometimes. 😕
01:02 gfxstrand[d]: Also, I kinda wonder if sparse residency is hooked up properly on Tegra
01:03 gfxstrand[d]: In theory, it's Maxwell B but it likely has different page table code
01:23 mhenning[d]: Sparse isn't too difficult to turn off, is it?
01:24 mhenning[d]: clear the flag on the queue or whatever
01:34 gfxstrand[d]: Yeah. It's easy enough to disable
01:34 gfxstrand[d]: But weirdly it seems to be only MSAA sparse
01:34 gfxstrand[d]: So maybe that's just funky on Maxwell and we never saw it before
01:34 gfxstrand[d]: I did a big MSAA rework recently but IDK if I did a full CTS run pre-Turing
01:37 gfxstrand[d]: Here's my current list of fails:
01:37 gfxstrand[d]: dEQP-VK.dgc.ext.compute.smoke.1024_sequences_device_local_from_compute_preprocess_state_same_universal_queue,Fail
01:37 gfxstrand[d]: dEQP-VK.dgc.ext.compute.smoke.1024_sequences_device_local_from_host_preprocess_state_same_universal_queue,Fail
01:37 gfxstrand[d]: dEQP-VK.dgc.ext.graphics.draw.token_draw.gpl_fast_with_geom_with_execution_set_check_draw_params_preprocess_separate_state_cmd_buffer,Crash
01:37 gfxstrand[d]: dEQP-VK.dgc.ext.graphics.draw.token_draw.gpl_mix_base_fast_with_execution_set_check_draw_params_preprocess_separate_state_cmd_buffer_unordered,Crash
01:37 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.coherent.atomic_atomic.atomicrmw.device.payload_local.buffer.guard_local.buffer.frag,Fail
01:37 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.coherent.fence_atomic.atomicwrite.device.payload_local.buffer.guard_local.buffer.frag,Fail
01:38 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.coherent.fence_atomic.atomicwrite.queuefamily.payload_local.buffer.guard_local.image.vert,Fail
01:38 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.noncoherent.atomic_atomic.atomicrmw.queuefamily.payload_local.buffer.guard_local.buffer.frag,Fail
01:38 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.noncoherent.atomic_atomic.atomicrmw.queuefamily.payload_local.buffer.guard_local.image.frag,Fail
01:38 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.noncoherent.atomic_fence.atomicwrite.queuefamily.payload_local.buffer.guard_local.image.frag,Fail
01:38 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.noncoherent.fence_atomic.atomicwrite.device.payload_local.buffer.guard_local.image.vert,Fail
01:38 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.noncoherent.fence_fence.atomicwrite.device.payload_local.buffer.guard_local.image.frag,Fail
01:38 gfxstrand[d]: dEQP-VK.memory_model.message_passing.ext.u32.noncoherent.fence_fence.atomicwrite.device.payload_local.buffer.guard_local.image.vert,Fail
01:38 gfxstrand[d]: dEQP-VK.query_pool.statistics_query.tes_evaluation_shader_invocations.32bits_tes_evaluation_shader_invocations_triangles,Fail
01:38 gfxstrand[d]: dEQP-VK.sparse_resources.multisampled_image_sparse_residency.r32f.samples_2,Fail
01:38 gfxstrand[d]: dEQP-VK.sparse_resources.multisampled_image_sparse_residency.r32i.samples_2,Fail
01:38 gfxstrand[d]: dEQP-VK.sparse_resources.multisampled_image_sparse_residency.rgba16f.samples_4,Fail
01:38 gfxstrand[d]: dEQP-VK.sparse_resources.multisampled_image_sparse_residency.rgba16f.samples_8,Fail
01:38 gfxstrand[d]: dEQP-VK.sparse_resources.multisampled_image_sparse_residency.rgba16i.samples_4,Fail
01:38 gfxstrand[d]: dEQP-VK.sparse_resources.multisampled_image_sparse_residency.rgba32i.samples_4,Fail
01:49 steel01[d]: 10-30 00:48:35.259 489 527 E [minigbm:nouveau_bo_create_for_modifier(666)]: Allocating new BO: 1920x1080, fourcc = AB24, size = 9830400 domain = 0x4, pte_kind = 0xfe, tile_mode = 0x50, modifier = 0x300000000000015
01:49 steel01[d]: 10-30 00:48:35.323 2329 2346 E Tellusim: VKTexture::create(): ASTC44RGBAu8n format is not supported
01:49 steel01[d]: 10-30 00:48:35.323 2329 2346 E Tellusim: VKTexture::create(): can't create 2D ASTC44RGBAu8n 2560x1440 texture
01:49 steel01[d]: Sad. Still can't run gravitymark.
01:53 steel01[d]: Oh hey. The cursor is correct with vulkan rendering. Huh, so it may not be an issue with the tegra-drm cursor plane like I originally thought. Something in nouveau gallium is broke, I guess. Which I'm sure people here will say is no surprise at all.
02:00 gfxstrand[d]: Oh, I should hook up ASTC support
02:00 gfxstrand[d]: Should be easy enough
02:01 steel01[d]: 10-30 01:58:02.336 574 2458 E [minigbm:tegra_bo_map(921)]: Mapping BO: 1, map_offset=100733000, size = 152000
02:01 steel01[d]: 10-30 01:58:02.337 2244 2347 I AudioTrack: Setting StartMediaTimeUs = 0
02:01 steel01[d]: 10-30 01:58:02.348 2244 2347 I CodecNameUnknown-MediaCodecVideoRenderer: onOutputFormatChanged: outputFormat:{crop-right=1279, max-height=720, sar-width=1, color-format=2130708361, mime=video/raw, color-standard=1, color-transfer=3, sar-height=1, crop-bottom=719, max-width=1280, crop-left=0, width=1280, color-range=2, crop-top=0, rotation-degrees=0, height=720},
02:01 steel01[d]: codec:android.media.MediaCodec@4354cac
02:01 steel01[d]: 10-30 01:58:02.351 503 548 F RenderEngine: Failed to create a valid texture. [0xb400002fd8026190]:[1280,720] isProtected:0 isWriteable:0 format:842094169
02:01 steel01[d]: 10-30 01:58:02.352 503 548 F libc : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 548 (RenderEngine), pid 503 (surfaceflinger)
02:01 steel01[d]: Tried to play a video in smarttube and it didn't like that either. It says it's trying to map 1280x720, but I don't see any matching alloc. All the allocs shortly before that were for 1920x1080.
02:11 gfxstrand[d]: steel01[d]: https://gitlab.freedesktop.org/gfxstrand/mesa/-/commits/nvk/tegra-fixes
02:11 gfxstrand[d]: Top patch is ASTC support
02:25 steel01[d]: 10-30 02:22:19.455 4393 4410 I Kodi : 2025-10-30 02:22:19.455 T:4410 info <general>: Instancing CRendererMediaCodecSurface
02:25 steel01[d]: 10-30 02:22:19.455 4393 4410 I Kodi : 2025-10-30 02:22:19.455 T:4410 info <general>: CRendererMediaCodecSurface::Configure
02:25 steel01[d]: 10-30 02:22:19.472 574 4740 E [minigbm:tegra_bo_map(921)]: Mapping BO: 1, map_offset=11aa60000, size = 309000
02:25 steel01[d]: 10-30 02:22:19.567 4393 4410 I Kodi : 2025-10-30 02:22:19.567 T:4410 info <general>: Loading skin file: VideoFullScreen.xml, load type: KEEP_IN_MEMORY
02:25 steel01[d]: 10-30 02:22:19.582 574 4740 E [minigbm:tegra_bo_map(921)]: Mapping BO: 1, map_offset=11a388000, size = 309000
02:25 steel01[d]: 10-30 02:22:19.588 2492 2510 F RenderEngine: Failed to create a valid texture. [0xb400002e31e28030]:[1920,1080] isProtected:0 isWriteable:0 format:842094169
02:25 steel01[d]: 10-30 02:22:19.589 2492 2510 F libc : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 2510 (RenderEngine), pid 2492 (surfaceflinger)
02:25 steel01[d]: 10-30 02:22:19.645 574 4740 E [minigbm:tegra_bo_map(921)]: Mapping BO: 1, map_offset=122ea8000, size = 309000
02:25 steel01[d]: Hmm. Throwing a 1920x1080 video at kodi did the same thing. The common issue here is the aosp software media codecs. I had similar issues on nouveau gl as well. If I tell kodi to use its internal ffmpeg, it works fine.
02:25 gfxstrand[d]: Mauro sent me a minigbm patch for video. No idea if it's correct
02:26 gfxstrand[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1433280863499452496/linear_combinations.diff?ex=69041e52&is=6902ccd2&hm=5d5aef7daa5b80cc21e59050861a8fae4f759fcbf72b85e38d01d4cabf21a194&
02:27 steel01[d]: Oh, this rings bells. I've seen stuff like that in other impls that didn't make much sense to me. I'll tack it in and restart my build.
02:33 mangodev[d]: i wonder what the planned target is for the tegra driver stuff
02:33 mangodev[d]: i heard some stuff in this chat talking about old nvidia shields?
02:33 mangodev[d]: or is there some other target more in the sights (jetson? nintendo switch?)
02:34 gfxstrand[d]: I'm testing on TX1. But really, I just want it to work and I want to gain experience with it.
02:34 gfxstrand[d]: And I want to use it as a platform to learn about Android and improve Mesa Android support.
02:35 gfxstrand[d]: But beyond that, I don't care too much what others do with it. I don't plan on hacking on shields, personally.
02:35 mangodev[d]: are nouveau and nvk going to be both a desktop *and* mobile drivers? or are the codebases going to be (mostly) separate?
02:35 gfxstrand[d]: Oh, it's all together
02:35 gfxstrand[d]: There's very little different on TK1+
02:35 gfxstrand[d]: It's mostly that we have to do cache flushing or mobile is hosed
02:36 mangodev[d]: :|
02:36 mangodev[d]: gfxstrand[d]: has nvidia gotten over a lot of mobile hardware restrictions with tegra?
02:37 mangodev[d]: i've heard with more recent generations that they got past the texture size and aspect ratio limitations, but i honestly don't keep up a ton on tegra
02:37 gfxstrand[d]: IDK what you mean. It's just an NVIDIA GPU with ASTC support. It's otherwise no different from any other Maxwell.
02:38 mangodev[d]: interesting, never knew they were *that* close to the desktop stuff (even if a generation or two behind)
02:43 steel01[d]: mangodev[d]: For me, I want to run android on everything from shield to switch to jetsons. My context is the lineage os project.
02:44 mangodev[d]: is there no discernible difference between running off system ram instead of dedicated vram?
02:44 gfxstrand[d]: Display is different. They're still using something derived from the older Tegras for that. And they have different firmware and command submission hardware. But from a userspace 3D perspective, they're virtually identical to the desktop GPUs.
02:44 steel01[d]: The downstream kernel is just about out of steam, so if I want to continue support for these devices, I need to pivot to mainline and mesa.
03:32 mhenning[d]: mangodev[d]: some of the really old tegra parts were completely different from desktop, but more modern stuff is the same
03:32 steel01[d]: Poor tegra 4.
03:34 steel01[d]: gfxstrand[d]: This did not help the swcodec crash. Same as what I posted above still.
03:36 steel01[d]: gfxstrand[d]: Gravitymark now runs. Let's see if it'll do the actual benchmark.
03:39 steel01[d]: Meh, it's oom'ing on tx1. I had it running on kernel 4.9 with nvgpu. It would oom on l4t, though. Time to run a tx2 build.
03:45 gfxstrand[d]: Oof
03:45 gfxstrand[d]: But cool that it works
06:19 steel01[d]: Oooowwwww. Well, gravitymark runs on tx2. To the tune of 0.8 fps and a score of 144.
06:19 steel01[d]: Kernel 4.9 and nvgpu got 10.1 fps and 1691, for comparison.
06:21 steel01[d]: Devfreq reports that the clock is supposed to be running at max. As does the bpmp sysfs interface. So presumably that's telling the truth.
07:21 mangodev[d]: steel01[d]: can't wait for the phoronix article on this tmr 🫠
07:31 steel01[d]: steel01[d]: The recent changes did not fix angle, fwiw. Still getting this.
07:57 jja2000[d]: mangodev[d]: There's also the Pixel C
09:22 notthatclippy[d]: gfxstrand[d]: Aktshually, the display stuff should be unified starting with Orin. This is why Orin _also_ needs OpenRM. The display driver code for dGPUs is in nvidia.ko/gsp, and now that Tegras use the same IP it meant either porting all that over to nvgpu or just pulling in a (very trimmed down) RM copy.
09:25 notthatclippy[d]: It also had some nice side benefits, like letting us ship a smaller OpenRM first.. And also when we shipped the real one, the initial target was datacenter only, but it turned out that we could reuse the Orin display work and it was brought up in a few days completely unexpectedly.
09:40 mohamexiety[d]: so that’s why openrm worked fine for display capable dGPUs in the initial launch!
09:41 mohamexiety[d]: I was always curious about that because the repo stated the target was headless compute datacenter stuff but for the most part it just worked™ on GeForce
10:19 snowycoder[d]: Interlock passes every related dEQP test!
10:19 snowycoder[d]: Test run totals:
10:19 snowycoder[d]: Passed: 3602/7226 (49.8%)
10:19 snowycoder[d]: Failed: 0/7226 (0.0%)
10:19 snowycoder[d]: Not supported: 3624/7226 (50.2%)
10:19 snowycoder[d]: Buut it spinlocks forever if we kill every thread in a warp,
10:19 snowycoder[d]: any way to resurrect a thread?
10:26 marysaka[d]: snowycoder[d]: I think they bail out if all threads are killed/discarded
10:26 marysaka[d]: would need to recheck that but yeah
10:31 triang3l[d]: snowycoder[d]: You need to unlock the critical section while bailing out in this case
10:32 triang3l[d]: or is there some hardware control flow logic that makes running even scalar code impossible if all lanes are inactive already?
10:36 snowycoder[d]: triang3l[d]: That's a great question, I have no idea.
10:36 snowycoder[d]: Time to re-check nv-prop
13:24 gfxstrand[d]: steel01[d]: Woof. That's a mess. Maybe there's caching bits we're missing? That sounds very much like a lack of memory bandwidth. Could also be compression, I suppose.
13:46 marysaka[d]: compression and tiled cache too
14:03 gfxstrand[d]: Yeah, that stuff is going to really matter on Tegra. We have zero memory bandwidth
14:34 gfxstrand[d]: NGL, it's kind of amazing that ASTC works 100% on the first try. That was hell to implement on Intel.
14:51 snowycoder[d]: `OpVote` has the same semantics as in PTX for divergent execution, right?
15:07 snowycoder[d]: marysaka[d]: alright, nv-prop does bail out if all threads are killed (they add the check before each discard).
15:07 snowycoder[d]: To work around divergent execution they store the alive mask in memory, maybe that helps with register pressure?
15:07 snowycoder[d]: Should we do the same or just use a local variable?
15:09 marysaka[d]: snowycoder[d]: probably local variable? unless they access the mask they store at the start on another thread possibly
15:44 snowycoder[d]: marysaka[d]: Right, they probably store it in memory to support demotions inside function calls(?)
15:45 marysaka[d]: maybe? I haven't tested anything complex for FSI sadly 😄
18:03 leftmostcat[d]: Can I assume `iadd_sat` will need support for 64-bit values?
18:34 mhenning[d]: leftmostcat[d]: It looks like there's a nir_lower_iadd_sat64, so you could probably implement it just for 32-bit to begin with and use the nir lowering
19:06 leftmostcat[d]: Phew. Thanks.
19:12 leftmostcat[d]: Ahh, well, `lower_iadd_sat` applies to all bit sizes of both `iadd_sat` and `isub_sat`, so that'll be a bit of work.
19:54 karolherbst[d]: nah, there is a `nir_lower_iadd_sat64` for int64 lowering
20:02 snowycoder[d]: FSI works even with all-warp discards 🎉
20:07 leftmostcat[d]: Oh no. There's documentation that's fibbing, then.
21:05 mhenning[d]: marysaka[d]: Interestingly there's some kernel-side TSG work in 06db7fded6dec88772a65c5a39af12ba4dc2ad38 but it doesn't seem to be wired up to the uapi
21:12 gfxstrand[d]: snowycoder[d]: Good work!
21:14 gfxstrand[d]: leftmostcat[d]: Not strictly, no. But it's easy enough to support if you've got 32-bit working. For the 64-bit version, you only need to look at the top halves of the 64-bit values for your `plop3` ops (because the sign bit for the whole thing is the top bit of the 64-bit value) and then do the select on both halves.
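For reference, a rough CPU-side sketch of the idea described above, assuming the overflow test is plain sign-bit logic (the kind of three-input boolean a `plop3` could evaluate). This is not NAK code: `iadd_sat32` and `iadd_sat64` are hypothetical names, and the 64-bit value is modelled as a `[lo, hi]` pair of `u32`s. It shows that only the top halves feed the overflow check, while the select picks both halves.

```rust
/// 32-bit signed saturating add using wrapping adds and sign-bit logic.
/// Values are raw u32 bit patterns, as they would sit in a register.
fn iadd_sat32(x: u32, y: u32) -> u32 {
    let sum = x.wrapping_add(y);
    // Overflow iff x and y have the same sign and the sum's sign differs.
    let overflow = ((!(x ^ y) & (x ^ sum)) >> 31) != 0;
    if !overflow {
        sum
    } else if (x >> 31) != 0 {
        0x8000_0000 // both negative: clamp to i32::MIN
    } else {
        0x7fff_ffff // both positive: clamp to i32::MAX
    }
}

/// 64-bit version on [lo, hi] pairs. The overflow test only reads the
/// high halves, because the sign bit of the 64-bit value lives there.
fn iadd_sat64(x: [u32; 2], y: [u32; 2]) -> [u32; 2] {
    let (lo, carry) = x[0].overflowing_add(y[0]);
    let hi = x[1].wrapping_add(y[1]).wrapping_add(carry as u32);
    let overflow = ((!(x[1] ^ y[1]) & (x[1] ^ hi)) >> 31) != 0;
    if !overflow {
        [lo, hi]
    } else if (x[1] >> 31) != 0 {
        [0x0000_0000, 0x8000_0000] // i64::MIN, select on both halves
    } else {
        [0xffff_ffff, 0x7fff_ffff] // i64::MAX, select on both halves
    }
}
```

Checking these against `i32::saturating_add` and `i64::saturating_add` on a handful of edge cases is a cheap sanity test before writing the real builder helper.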
21:18 leftmostcat[d]: That makes sense, thanks.
21:19 gfxstrand[d]: Oh, and we should probably support `[ui]sub_sat` while you're at it. They're basically the same, just with the conditions tweaked a bit.
21:20 gfxstrand[d]: Also, we should make builder helpers for these with unit tests
21:21 gfxstrand[d]: See, for example, iadd64: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/nouveau/compiler/nak/hw_tests.rs?ref_type=heads#L1151
21:22 mhenning[d]: gfxstrand[d]: We already have usub_sat
21:28 gfxstrand[d]: So we do. 🙃
21:28 gfxstrand[d]: Still, it should be a builder helper and it should be unit tested.
21:29 mhenning[d]: Yeah, I'm wondering if the usub_sat stuff predates hw_tests
21:29 gfxstrand[d]: it does
21:29 gfxstrand[d]: Or at least it predates doing HW tests for builder helpers.
21:31 gfxstrand[d]: Also, the current implementation only works on Volta+ because it uses `iadd3`. But we could totally implement it pre-Volta with a second `iadd2` and an and.
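One possible reading of "a second `iadd2` and an and", sketched on the CPU. This is a guess at the shape of such a sequence, not the existing NAK implementation: `add_carry` is a hypothetical stand-in for a two-input add with carry-in and carry-out, and `usub_sat32` is a hypothetical name.

```rust
/// Hypothetical model of a two-input 32-bit add with carry-in/carry-out.
fn add_carry(a: u32, b: u32, carry_in: bool) -> (u32, bool) {
    let wide = a as u64 + b as u64 + carry_in as u64;
    (wide as u32, (wide >> 32) != 0)
}

/// Unsigned saturating subtract using two carry adds and one AND.
fn usub_sat32(x: u32, y: u32) -> u32 {
    // First add: x - y computed as x + !y + 1.
    // Carry-out set means no borrow occurred, i.e. x >= y.
    let (diff, no_borrow) = add_carry(x, !y, true);
    // Second add consumes that carry: 0 + 0xffff_ffff + carry is
    // 0 when there was no borrow and all ones when there was one.
    let (borrow_mask, _) = add_carry(0, 0xffff_ffff, no_borrow);
    // AND with the inverted mask: keep diff, or clamp to 0 on underflow.
    diff & !borrow_mask
}
```

The result should match `u32::saturating_sub` for all inputs; whether the real hardware ops can express the inverted source on the AND directly is not something this sketch settles.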
21:32 gfxstrand[d]: `iadd_sat` would be harder because the `plop3` trick probably doesn't work pre-Turing.
21:54 mhenning[d]: sure. Let's not get too far ahead of ourselves though. It's fine for leftmostcat to implement one thing at a time
21:57 gfxstrand[d]: Yup
21:58 gfxstrand[d]: Mostly suggesting the unit tests because that's an easier way to make sure that it works than chucking the whole CTS at it.
23:07 mohamexiety[d]: https://gitlab.freedesktop.org/mohamexiety/nouveau/-/commits/compv3 updated compression kernel here, also rebased on latest 6.18rc
23:07 mohamexiety[d]: rebased nvk comp MR too
23:07 mohamexiety[d]: this should fix the issue that phomes_[d] ran into. also patches sent to the ML
23:09 mohamexiety[d]: this should hopefully be final. things still aren't clear but I'll likely be unavailable in november completely
23:15 mohamexiety[d]: (if all goes well should be available for review/etc work though. just won't have HW on hand to actually do work on for a bit)
23:28 gfxstrand[d]: I should probably review again