01:25bailsman: I think I found a race condition and I want to make sure I'm not imagining it.
01:27bailsman: In src/vulkan/wsi/wsi_common_x11.c:wsi_CreateSwapchainKHR we call iface->create_swapchain which I believe goes to src/vulkan/wsi/wsi_common_x11.c:x11_surface_create_swapchain on my system
01:28bailsman: directly after that function returns we assign swapchain->fences = vk_zalloc(...)
01:30bailsman: however, *before* that function returns, it does a pthread_create on x11_manage_fifo_queues, and that thread uses the chain->fences value
01:31bailsman: I think on my system this is crashing because I'm getting unlucky. What's the right way to fix this? Add a null check before using the fences pointer?
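For reference, a minimal sketch of the ordering being described (hypothetical, simplified names — not the actual mesa code): the worker thread is started inside swapchain creation, while the caller only assigns the fences array after creation returns, so the thread can observe a NULL pointer. The usual fix for this pattern is to make the data the thread needs available before pthread_create runs (or to start the thread last), rather than papering over the race with a NULL check.

```c
/* Simplified illustration of the reported ordering (hypothetical names,
 * not the actual mesa code). */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct swapchain {
    void **fences;          /* assigned by the caller after creation */
    pthread_t queue_thread;
};

static void *manage_fifo_queues(void *arg)
{
    struct swapchain *chain = arg;
    /* Races with the caller's assignment in main(): chain->fences may
     * still be NULL here. A NULL check only hides the race; allocating
     * fences before the thread starts removes it. */
    if (chain->fences == NULL)
        fprintf(stderr, "worker thread saw fences == NULL\n");
    return NULL;
}

static struct swapchain *create_swapchain(void)
{
    struct swapchain *chain = calloc(1, sizeof(*chain));
    /* The worker thread starts using chain->fences immediately... */
    pthread_create(&chain->queue_thread, NULL, manage_fifo_queues, chain);
    return chain;
}

int main(void)
{
    struct swapchain *chain = create_swapchain();
    /* ...but the caller only allocates the array here, after creation. */
    chain->fences = calloc(4, sizeof(void *));
    pthread_join(chain->queue_thread, NULL);
    free(chain->fences);
    free(chain);
    return 0;
}
```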
09:43MrCooper: Company: pretty rare IME
12:13dottedmag: Does drm_connector.possible_encoders mean "all possible encoders for this connector"? I had a look at drmdb, and in most submissions I see a bunch of encoders never mentioned in possible_encoders
12:15dottedmag: E.g. here https://drmdb.emersion.fr/snapshots/019698068fa5 encoders 48, 56, 64, 72 aren't mentioned
12:16dottedmag: Or here https://drmdb.emersion.fr/snapshots/45cf1dbd0ec0 encoders 97, 98, 99, 100 aren't mentioned
12:18zamundaaa[m]: dottedmag: have a look at the encoder type. The ones that aren't mentioned are for DP MST, so the connectors that would use them only appear if you use DP MST
12:23dottedmag: zamundaaa[m]: I see, that's the only case in drmdb. Do these encoders ever end up in possible_encoders, and what triggers them?
12:25zamundaaa[m]: They should be in the encoders array when the connector is DP MST, so if it's from a USB-C dock for example
12:26emersion: new connectors appear when DP-MST is used
12:27dottedmag: Aha, so typically it may happen on hotplug? And an encoder represents a chunk of silicon on the display controller, so it's static for the whole lifetime of the card?
12:28zamundaaa[m]: Yes
12:28zamundaaa[m]: But why are you asking these questions? As a user of the drm API you should ignore encoders, except for determining the crtcs you can use for a connector
12:29dottedmag: Curiosity. I like understanding how things actually work.
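A minimal libdrm sketch of the one legitimate use of encoders mentioned above: ORing each encoder's possible_crtcs bitmask to find which CRTCs can drive a connector. It assumes a /dev/dri/card0 node and is purely illustrative, not tied to any particular driver.

```c
/* For each connector, collect the CRTCs it can be driven by. Bit n of
 * the resulting mask corresponds to res->crtcs[n]. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

int main(void)
{
    int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
    if (fd < 0)
        return 1;

    drmModeRes *res = drmModeGetResources(fd);
    if (!res)
        return 1;

    for (int i = 0; i < res->count_connectors; i++) {
        drmModeConnector *conn = drmModeGetConnector(fd, res->connectors[i]);
        if (!conn)
            continue;

        uint32_t crtc_mask = 0;
        for (int j = 0; j < conn->count_encoders; j++) {
            drmModeEncoder *enc = drmModeGetEncoder(fd, conn->encoders[j]);
            if (!enc)
                continue;
            crtc_mask |= enc->possible_crtcs;
            drmModeFreeEncoder(enc);
        }

        printf("connector %u: usable CRTC mask 0x%x\n",
               conn->connector_id, crtc_mask);
        drmModeFreeConnector(conn);
    }

    drmModeFreeResources(res);
    return 0;
}
```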
15:00neobrain: Is it desirable that lavapipe force-enables threaded queue submits even when LP_NUM_THREADS=0? I'm using that envvar to get a specific backtrace from a crash in my Vulkan app, but due to the submit thread it won't point me to the specific line in my Vulkan code that's crashing (i.e. it only shows the lavapipe-internal stack frames)
15:01neobrain: (I did manage to disable threaded queue submits locally, just curious if this should be the default behavior for num_threads==0)
15:14zmike: yes that's the default behavior intentionally
15:14zmike: it's otherwise impossible to simultaneously submit work and wait on semaphores/fences
15:14zmike: if you need a breakpoint set it on the enqueue functions
15:33neobrain: Breakpoints are useful in other cases, but not if you're trying to find out which line of code triggered a crash in the first place :)
15:38neobrain: I saw some references to some "on demand" mode for threaded submit, but I'm guessing that doesn't do what I think. Could you determine that submit work can safely be done synchronously without risking deadlocks based on feature flags, or is what you're describing a problem even for Vulkan-1.0-style semaphores/fences?
15:48zmike: I'm not sure what issue you're trying to debug where you can't catch the crash?
15:48zmike: if it's memory access use ASAN or valgrind, otherwise gdb will catch asserts normally in threads
15:49zmike: if you're having trouble understanding the command flow, you can use LVP_CMD_DEBUG=1
15:51neobrain: zmike: If the crash happens in lavapipe/llvmpipe, I'll get a backtrace of the queue submit thread even if LP_NUM_THREADS==0. I won't be able to (easily) learn which submit in *my application code* triggered the crash
15:51zmike: LVP_CMD_DEBUG will probably be enough for that
15:51zmike: or at least it has been for me
15:52zmike: also LP_XYZ env vars are for llvmpipe
15:52zmike: they don't affect frontends
15:52neobrain: is that still the current name of the variable? Can't find any references to it in the mesa source
15:52zmike: maybe LVP_DEBUG_CMD ? I'm not at a pc now
15:53zmike: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/frontends/lavapipe/lvp_device.c#L1591
15:53zmike: maybe update your mesa source if it's too old
15:54zmike: but it's been in there for a long time now
15:55neobrain: Ah thanks! Don't tell anyone but my checkout is from January 🙈
15:56zmike: 😬
16:03neobrain: And I get what I deserve for that, because the latest version fixes the crash... (not that this makes my Vulkan code correct, but now I can finally debug it in renderdoc!)
16:05HdkR: \o/
16:06HdkR: zmike: One does not simply enable ASAN and expect it to work in this world :P
16:13zmike: it works in my world
16:13zmike: get on my level
16:15HdkR: I want to be where the ASAN is
19:29anarsoul|2: out of curiosity, why is there no nir_op_atan or nir_op_atan2? Are these not commonly implemented in hw?
19:50alyssa: correct
19:50alyssa: and insofar as they aren't perf critical
20:02HdkR: Alternatively, if it was perf critical, the developer already screamed into the void and found an approximation that was faster :P
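As an illustration of the kind of substitution being joked about: one well-known cheap arctangent approximation (valid for |x| <= 1, with error on the order of 0.005 rad). This is just a sketch of the technique, not what any particular application or driver actually does.

```c
/* A well-known polynomial approximation of atan(x) for |x| <= 1
 * (max error roughly 0.005 rad); purely illustrative. */
#include <math.h>
#include <stdio.h>

static float fast_atan(float x)
{
    /* atan(x) ~= x * (pi/4 + 0.273 * (1 - |x|)) for |x| <= 1; use
     * atan(x) = pi/2 - atan(1/x) to extend the range if needed. */
    return x * (0.78539816f + 0.273f * (1.0f - fabsf(x)));
}

int main(void)
{
    for (float x = 0.0f; x <= 1.0f; x += 0.25f)
        printf("x=%.2f  atanf=%.5f  fast=%.5f\n", x, atanf(x), fast_atan(x));
    return 0;
}
```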
22:15anarsoul|2: alyssa: see https://gitlab.freedesktop.org/mesa/mesa/-/issues/9323 :)
22:15alyssa: anarsoul|2: and that program is... atan-bound?
22:15anarsoul|2: nope, but atan adds extra 31 instructions
22:16anarsoul|2: 67 with atan, 36 with atan replaced by multiplication
22:16alyssa: that's not necessarily the problem
22:16alyssa: though I don't know utgard's performance characteristics
22:17anarsoul|2: it's 1 instruction per clock for PP and in the best case 1 pixel per clock
22:18anarsoul|2: so 7.4 MPix/s max with 67 instructions and 13.8 MPix/s max with 36 instructions for a Mali-4xx with a single core clocked at 500 MHz
22:30alyssa: oof.
22:31alyssa: I'm not sure I believe those numbers though?
22:31alyssa: Don't you have multiple pixels being shaded in parallel?
22:32alyssa: or is it a *throughput* of 1 instruction per clock when considering all of the parallelism, with some (presumably higher) latency?
22:35anarsoul|2: alyssa: that's with all of the parallelism, 1 instr per clock per core
23:59alyssa: ouch
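A back-of-the-envelope check of the figures above, under the stated assumptions (a single Mali-4xx pixel processor at 500 MHz retiring one instruction per clock, one pixel per clock at best), so peak fill rate is simply clock / instructions-per-pixel:

```c
/* Peak fill rate = clock / instructions per pixel, assuming one
 * instruction retired per clock and one pixel per clock at best. */
#include <stdio.h>

int main(void)
{
    const double clock_hz = 500e6;        /* single core at 500 MHz */
    const int with_atan = 67, without_atan = 36;

    /* Prints ~7.46 and ~13.89 MPix/s, matching the quoted 7.4/13.8. */
    printf("with atan:    %.2f MPix/s\n", clock_hz / with_atan / 1e6);
    printf("without atan: %.2f MPix/s\n", clock_hz / without_atan / 1e6);
    return 0;
}
```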