IRC Logs of #dri-devel on irc.freenode.net for 2023-05-16

00:26 zmike: karolherbst: vvl already handles validation for spirv and spirv-val does the same
00:28 karolherbst: zmike: I meant like checking if the SPIR-V declares any extensions a driver doesn't support
00:28 zmike: yep that's part of it
00:29 karolherbst: how would that work through spirv-val then?
00:30 karolherbst: I know it can print if one forgets to specify an extension
00:30 zmike: ah spirv-val just does the base validation
00:30 zmike: vvl handles device caps
00:31 karolherbst: right.. but I can't use vvl from inside CL, so that would be an issue. Guess I'll just trust spirv_to_nir to fail
00:31 zmike: why can't you
00:31 karolherbst: how could I
00:31 zmike: zink ?
00:31 karolherbst: gallium is a dead end
00:31 zmike: 🤔
00:32 karolherbst: CL did get command buffer extensions, e.g.
00:32 karolherbst: and mapping that to gallium or reworking gallium to support that would be pure pain
00:32 zmike: I'm not sure what you mean
00:32 karolherbst: and it sounds like implementors want to focus more on that
00:32 zmike: like the GL cmdbuf ext?
00:32 karolherbst: no, like vulkan command buffers
00:33 zmike: yea there's that nv gl ext that's kinda similar
00:33 karolherbst: funky
00:33 karolherbst: but still sounds like pain to do in gallium
00:33 zmike: why?
00:33 karolherbst: I mean.. you could record it and call gallium apis
00:33 karolherbst: but that defeats the purpose
00:34 zmike: I'd expect you could do it with gallium by just adding begincmdbuf / endcmdbuf / exec interfaces
00:34 karolherbst: uhhh
00:34 karolherbst: yeah well...
00:34 zmike: then the driver would create and manage an IB with the commands that could be reused
00:34 karolherbst: yeah.. that basically means to rewrite most/all drivers
00:34 zmike: not sure about that one
00:35 zmike: would be pretty trivial in zink at least
00:35 karolherbst: yeah.. in zink
00:35 zmike: isn't that where you wanted validation anyway?
00:35 karolherbst: could be a zink only interface, but then again
00:35 karolherbst: not really
00:35 zmike: I mean in the cmdbufs
00:35 alyssa: openzl
00:35 alyssa: zusticl?
00:36 karolherbst: I'm also not sure how much value zink adds in terms of CL
00:36 zmike: stop couple naming my driver
00:36 karolherbst: ntv is super useful
00:36 karolherbst: but besides that? mhh not sure
00:36 karolherbst: what would you say how much of zink deals with messy vk compute stuff?
00:36 karolherbst: 5%?
00:37 zmike: I don't know what you're asking
00:37 alyssa: writing a compute-only gallium driver for rusticl (if you already have a vulkan driver in-tree) doesn't seem like a big deal anyhow
00:37 karolherbst: yeah...
00:37 alyssa: but i'll zip my mouth with my gallium propaganda
00:37 karolherbst: my point is rather, the compute side of things is so trivial, that targeting vulkan directly or using zink wouldn't make much of a difference
00:38 karolherbst: I'm just wondering how one could deal with extensions like command buffers through gallium
00:38 karolherbst: for now it doesn't matter anyway
00:38 karolherbst: but yeah.. no idea
00:38 karolherbst: maybe I prototype it when I find some time
00:38 karolherbst: anyway
02:36 kode54: does anyone know why i915 is this bad? https://youtu.be/Y8Z4zft10ik
03:20 kode54: cool
03:20 kode54: I can't use kmscube either
03:21 kode54: it fails to create the framebuffer due to EREMOTE error which would seem to indicate that the BO is in system memory
03:23 kode54: oh
03:23 kode54: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/271 known issue
06:11 airlied: 26.4g 13.9g virt/res for one mesh test, might need to fix that
07:10 pq: zamundaaa[m], sure, but is all of that 400 nits available through simple brightness adjustment a.k.a. backlight, or is there a hidden switch somewhere to go to "HDR mode"?
07:11 pq: zamundaaa[m], btw. which HDR modes does EDID tell you to be available?
07:12 pq: zamundaaa[m], vsyrjala, what does the Colorspace property set to BT2020 actually do in the driver and hardware, given the panel itself can only take panel-native pixel values I presume?
07:21 pq: If the panel takes any data given to it in its native pixel values, then sending it BT2020 encoded content whose actual used color gamut is smaller would indeed result in a desaturated image.
07:24 pq: or does an eDP panel actually have similar pixel processors like standard HDMI/DP interface monitors do?
07:25 pq: zamundaaa[m], seeing what edid-decode says about the EDID would be interesting.
07:26 jani: pq: I didn't read the backlog, but some panels actually do have a hidden switch. e.g. you have to write the source OUI to specific DPCD registers, and the panel behaves differently. (isn't that lovely?)
07:27 pq: jani, yay.
07:28 pq: jani, is that hooked up to KMS UAPI somehow?
07:29 pq: e.g. would setting HDR_OUTPUT_METADATA property make the driver automatically program that?
07:35 jani: pq: it's done unconditionally by e.g. i915
07:36 jani: intel_edp_init_source_oui() in drivers/gpu/drm/i915/display/intel_dp.c
07:36 pq: oh, so what behavior do we unconditionally get?
07:36 jani: HDR backlight, for example
07:37 pq: what about the color encoding of pixel values?
07:38 pq: does it make pixel values to be expected in panel-native color gamut, or?
07:39 jani: I have no idea
07:40 pq: heh, so that's a question we need an answer for, and the effect of Colorspace property on eDP as well :-)
07:42 pq: Ideally, if a driver exposes Colorspace and HDR_OUTPUT_METADATA properties for an eDP panel, and if that eDP panel does not handle those *at all*, then the driver would need to do something about it. Or expose something completely different so that userspace knows it needs to handle all that itself.
07:42 pq: hwentlan_, ^
08:46 jani: pq: if I got a euro for every display that has bogus info in EDID or DPCD, I'd be a wealthy man :p
08:46 jani: btw this might be related https://gitlab.freedesktop.org/drm/intel/-/issues/8425
09:37 pq: jani, I did not yet even consider EDID to be incorrect, that's just another thing on top of all this. :-)
09:42 pq: well, the EDID in that issue seems to be WCG SDR by the impression of the numbers, even though it claims ST2084 TF too.
09:43 pq: the primaries and white point are off enough from any standard that they might even be true, who knows
10:12 pq: oops, I read the wrong column. The luminance values are HDR indeed in that EDID.
10:16 zamundaaa[m]: It's not my laptop but I can ask. IIRC Nate doesn't have Windows on it though
10:31 pq: zamundaaa[m], why Windows?
10:35 pq: The thing with Colorspace and HDR_OUTPUT_METADATA is that they are only relayed to the "monitor" and then it's the monitor that should make use of them. I wonder why would an internal panel have the necessary circuitry to make use of them.
10:41 pq: In the end, HDR is "just" about preparing the content for the display differently, bumping the maximum possible luminance way up, and using greater pixel bit depth to adequately cover the new dynamic range. One can compromise on everything except "preparing content differently" while still getting an HDR'ish look but just lower quality.
10:45 zamundaaa[m]: pq: because swick asked
10:46 zamundaaa[m]: pq: here's the edid: https://pastebin.com/E8EdA7Lt
10:48 pq: zamundaaa[m], is that gitlab issue linked earlier exactly the case you are looking at?
10:50 zamundaaa[m]: No, but it is pretty similar
11:21 pq: zamundaaa[m], that's a quite basic looking HDR EDID. I'm getting the feeling that the primaries are somewhat meaningful, the luminances are filled in in a very simple manner, and the TFs might be a copy&paste...
11:23 pq: zamundaaa[m], have you tried setting HDR_OUTPUT_METADATA max luminance to 400 cd/m²? I wouldn't expect it to make a difference for an internal panel, but who knows.
11:24 vsyrjala: did we try i915.enable_psr=0 already?
11:24 zamundaaa[m]: vsyrjala: no, I didn't get a response yet
11:25 zamundaaa[m]: pq: there is no HDR_OUTPUT_METADATA, the APU doesn't support it
11:25 pq: oh right, then you have no switch to turn HDR on, except ramping backlight up to max.
11:26 zamundaaa[m]: vsyrjala: how is HDR metadata & Colorspace *supposed* to work with internal displays btw?
11:26 pq: which for an internal panel might as well be the right thing to do, dunno
11:28 pq: wonder how wide bits per pixel are actually flowing through the eDP...
11:28 zamundaaa[m]: Like, does the driver do automatic color conversions behind the users back, or is there an additional controller doing this, or is userspace expected to send the raw rgb values for the display, or...?
11:31 pq: that question boils down to what internal panels actually do, do they use the infoframe data for anything
11:33 pq: from KMS UAPI perspective, there should not be a difference between external monitors and internal panels, but the panel hardware may be different and if so, I'd personally expect the driver to either compensate or not expose the properties.
11:37 pq: Unfortunately when no such properties exist, the expectation is a color encoding close to sRGB SDR. But it might as well be panel-native pixel encoding. A "monitor driver" file would probably contain the information for Windows, I guess. Or maybe Windows special-cases eDP + certain EDID values in order to produce panel-native pixels.
11:40 pq: I would expect to find a switch between "sRGB emulation" and "panel native" pixel encodings, which might be what jani talked about.
11:49 zamundaaa[m]: pq: same, but we don't have such a thing in KMS, right? When I set Colorspace to BT2020_RGB, something still needs to convert that to the panel native pixel encoding
11:50 pq: Yes, which is why I think the Colorspace property might not apply for internal panels if they don't have the processor for that.
11:53 zamundaaa[m]: swick: but the EDID does say that in this case
11:53 pq: this needs either specs or reverse-engineering what happens on the sink side of eDP, and once we know that, we can think how and if the KMS properties should be handled in a driver.
11:54 pq: if swick is talking, he's not coming through to irc
11:55 pq: zamundaaa[m], I think EDID is only trustworthy up to what bits of it is used by Windows.
11:56 pq: and if Windows special-cases eDP, or punts everything to a "panel driver", we'd have to do the same
13:32 alyssa: https://rosenzweig.io/7lytwl.jpg
13:36 alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23036/commits has been reviewed in full by jenatali but I would like to make affected driver maintainers aware in case they have further concerns
13:38 alyssa: (turnip, v3dv, lavapipe, r600, intel, ac)
13:38 zmike: I forgot to click submit
13:42 alyssa: zmike: 💪
13:42 zmike: if there was a weak arm emoji this would've been the time to use it
13:43 alyssa: Cortex-A.jpg
13:43 cmichael:thinks zmike needs more "leg day" and less "arm day"
13:44 zmike: the leg day emoji is kinda meh though
13:44 Ristovski: huh, `-Dxlib-lease=disabled` -> 'undefined reference to xcb_randr_get_crtc_info_unchecked' and others
13:45 Ristovski:adds to list of meson projects where -Dauto_features=disabled is unusable OOTB
13:49 alyssa: :(
15:03 alyssa: danylo: anholt_: cwabbott: Could someone take a look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23003 before it gets conflicty? thank you :)
16:38 zmike: jenatali: fwiw you're not alone, I never know anything exists in nir until someone points it out
16:40 Ristovski: and then everyone else finds out from your blog :P
16:47 alyssa: zmike: just this week I discovered nir_intrinsic_copy_const_indices
16:47 zmike: literally what
16:49 jenatali: I still need to port to load_ubo_vec4 and lower_mem_bit_sizes
17:42 alyssa: zmike: same
17:58 Kayden: so I was working on some nir_opt_load_store_vectorize related improvements for load/store globals
17:58 Kayden: and I noticed that fossil-db was showing no improvements
17:58 Kayden: I discovered that in the one fossil I was looking at, it was enabling robustBufferAccess, which disabled vectorization for globals
17:59 alyssa: !
17:59 Kayden: it's a DX12 app from vkd3d-proton
17:59 Kayden: oddly... fossilize commit 945aacaa9b3d5cc1ac75b54d68c94c6520df9745 (Capture mesh shader feature PDF2s as well.) is what's causing it to suddenly enable robustBufferAccess
18:00 Kayden: it -sounds- like fossilize is intending to enable robustness if the captured pipeline requested it, but otherwise it's disabling it. and, for older fossils, it's filling in "please disable robustness". which sounds like a fine plan but I'm not sure if it's working right
18:00 Kayden: with that commit I get an opts.features != NULL and prior to that I get opts.features == NULL causing it to fill in default PDFs
18:03 pendingchaos: RADV used to add global to the robust_modes too, but I decided that it wasn't necessary
18:03 pendingchaos: since robust_modes is supposed to avoid address overflow within the load like "load32(-4); load32(0)" -> "load_2x32(-4)", but the original IR wouldn't have worked because address 0 is invalid for globals
18:06 zmike: robustBufferAccess disables vectorization?
18:06 pendingchaos: in some situations
18:07 zmike: huh
18:07 Kayden: pendingchaos: interesting, I was thinking it was something to do with bounds checking of components
18:07 pendingchaos: it shouldn't if the vectorized load is aligned to the size of the entire load
18:07 zmike: ah ok
18:07 zmike: so like a uint load with offset aligned to uint would vectorize
18:08 pendingchaos: no, it's to avoid loads/stores crossing negative->positive/zero offsets
18:08 Kayden: hmm, yeah, that doesn't seem necessary then
18:09 pendingchaos: zmike: aligned to the size of the entire vectorized load
18:09 pendingchaos: so 2x32 requires 8 byte alignment and 4x32 requires 16 byte alignment
18:09 zmike: right
18:10 pendingchaos: I believe the vectorization is fine without robust buffer access because there's a line in the vulkan spec saying a load might be considered out of bounds if a load with a similar offset also is out of bounds
18:11 pendingchaos: so that load32(-4) can mess up a load32(0)
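
The alignment rule pendingchaos describes above can be sketched in a few lines. This is only an illustration of the arithmetic, not the actual nir_opt_load_store_vectorize logic; the function name and parameters are hypothetical. It assumes the combined load size is a power of two, in which case a load aligned to its own total size can never straddle the negative-to-zero boundary.

```python
def can_vectorize_robust(offset: int, num_components: int, bit_size: int = 32) -> bool:
    """Hypothetical helper: True if a combined load starting at byte `offset`
    is aligned to the size of the entire vectorized load."""
    total_bytes = num_components * bit_size // 8
    # Python's % always returns a non-negative result for a positive divisor,
    # so negative offsets are handled the same way as positive ones.
    return offset % total_bytes == 0


# load32(-4); load32(0) would merge into a 2x32 load at offset -4.
# That load is only 4-byte aligned, so it is rejected: it would cross zero.
print(can_vectorize_robust(-4, 2))
# A 2x32 load at offset -8 covers bytes -8..-1 and stays on one side of zero.
print(can_vectorize_robust(-8, 2))
# 4x32 needs 16-byte alignment, so offset 8 is rejected.
print(can_vectorize_robust(8, 4))
```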
19:03 alyssa: phase 1 of my plan to kill nir_register underway
19:03 alyssa: (-:
19:03 alyssa: TBD if I can finish this one off though. Optimistic but atomics was just a warmup
19:08 airlied: alyssa: whats wrong with nir register?
19:09 alyssa: airlied: Extremely invasive in the IR
19:10 alyssa: and doing nothing for drivers that ingest SSA other than bloat memory footprint and slow things down
19:10 airlied: what about drivers that ingest registers? :-)
19:10 alyssa: Connor, Faith, and I have been talking about replacing nir_reg_src/nir_reg_dest with special load/store_reg intrinsics for years now
19:10 airlied: ah makes sense
19:10 alyssa: This week's project is evaluating the feasibility of that
19:11 alyssa: I mean obviously it can be done
19:11 alyssa: the question is can it be done without regressing perf on the non-SSA drivers
19:11 alyssa: so I'm writing fresh Midgard compiler code for the first time in ... IDK how long, it's been a while
19:11 alyssa: my vec4 baby :~)
19:11 jenatali: Could they just be nir_variable with nir_var_register as the memory mode?
19:12 jenatali: Or is that still too much bloat?
19:12 jenatali: Eh I guess you'd lower_io and get load/store register intrinsics anyway, nvm
19:13 alyssa: jenatali: considered that but registers are sufficiently special that I don't want to do multiple cross-tree reworks at once
19:13 alyssa: so what I have WIP'd are a set of intrinsics that map 1:1 to nir_register/nir_reg_src/nir_reg_dest just in instruction form
19:13 alyssa: and a pass to translate from old to new
19:14 alyssa: next up is teaching midgard to ingest those intrinsics, and adding some more passes/helpers so we're not relying on midgard's backend copyprop
19:14 alyssa: (which is... not great, and probably better than some of the vec4 backends we have in tree relying on nir_register for perf)
19:23 alyssa: (The copyprop is the hard part of the problem here. Luckily I've got plans for that too.)
19:36 karolherbst: alyssa: what are the use cases for nir_register atm? phi resolving and what else?
19:37 anholt_: karolherbst: function local variables with array indexing not lowered to scratch
19:38 karolherbst: that's just for indirects if the driver supports it, right?
19:38 karolherbst: could be made a driver's problem to convert scratch to.. indirect register accesses if they feel like it
19:40 anholt_: karolherbst: it is already the driver's problem.
19:40 anholt_: the goal is to reduce the impact on the rest of nir from the existence of regs, while not regressing codegen across drivers.
19:40 karolherbst: fair enough
19:42 alyssa: karolherbst: yep -- phi resolving, some indirect arrays, and also partial writes to vectors on vec4 backends
19:42 karolherbst: uhhh partial vector writes..
19:42 alyssa: That's why I'm doing bring up against my vec4 compiler :)
19:42 alyssa: I figure if I can implement something that's good enough for midgard, it'll be good enough for everyone
19:43 karolherbst: that's probably a fair assumption
19:43 alyssa: I'll write the patches for midgard and ntt regardless
19:43 karolherbst: I'm still playing with the idea of merging load_uniform with load_kernel_input by just forcing every driver to do byte addressing instead of silly 32bit or vec4 games
19:43 alyssa: unsure what my plan is for all the random backends
19:43 karolherbst: but not sure how many drivers will hate me for it
19:43 alyssa: TBD how invasive the change is
19:48 airlied: karolherbst: not all hw can do byte addressing though
19:48 karolherbst: mhh...
19:48 karolherbst: annoying
19:49 alyssa: not all hw deserves opencl support :p
19:49 airlied: pretty sure a lot of older hw can only do vec4
19:49 airlied: yes totally dont expect CL
19:49 airlied: just that fixing load uniform isnt that easy
19:50 alyssa: my point is, can't you just do load_ubo instead of load_kernel_input and not touch load_uniform?
19:51 alyssa: airlied: TBF, agx can't do byte addressing strictly speaking :
19:51 alyssa: :p
19:51 alyssa: but i Just Deal With It in the backend
19:51 karolherbst: alyssa: yeah... so... drivers lowering uniforms to ubos don't see load_kernel_input anymore
19:51 Kayden:is hoping load_uniform goes away
19:51 alyssa: Kayden: in favour of load_ubo?
19:52 Kayden: yeah
19:52 Kayden: anv still uses it for...something
19:52 alyssa: uniforms go through a goofy lowering series on asahi these days
19:52 alyssa: load_deref -> load_uniform -> load_ubo -> load_global_constant -> load_constant_agx -> load_preamble
19:52 alyssa: (I think that's the whole chain)
19:54 karolherbst: cursed
19:54 karolherbst: I wonder how good agx is for CL these days
19:54 alyssa: still need to do images
19:54 karolherbst: sure, but I could ignore that for now
19:55 alyssa: but if you really wanted to spend your PTO on asahicl I could whip up image support this week I guess
19:55 alyssa: also wait why are you even here go back to playing video games
19:55 alyssa: :p
19:55 karolherbst: no need for that
19:55 karolherbst: :D
19:55 karolherbst: my switch is on tho :D
19:55 karolherbst: just taking a break from playing games or something
19:55 zmike: smh not being able to play games and do driver development at the same time
19:56 karolherbst: the world just isn't fair to me
19:57 karolherbst: deleting clover will be epic
19:57 karolherbst: and deleting all the gallium nonsense coming with that
19:57 karolherbst: IR_NATIVE will allow so much code to be deleted from radeonsi
19:58 karolherbst: alyssa: if you want to make the removal of clover happen soon, you'll have to buy a r600 GPU
20:03 alyssa: karolherbst: tempting
20:24 alyssa:doesn't understand this cf_node stuff
20:25 jenatali: alyssa: What about it?
20:27 Kayden: back before we had CI labs, I used to run piglit and then play batman while waiting for results
20:31 alyssa: Kayden: Lol
20:31 alyssa: developers developers developers developers developers developers
20:31 alyssa: jenatali: I want to iterate the shader in source order. How do I do that?
20:31 alyssa: Bit of an X/Y problem
20:32 alyssa: Uh
20:32 jenatali: nir_foreach_block?
20:32 alyssa: That doesn't get me the if conditions
20:32 jenatali: Or do you need to iterate the CF nodes in source order?
20:32 alyssa: I want to check for each use of a load_reg intrinsic that there is no intervening store_reg intrinsic
20:33 alyssa: (what would become write-after-read hazards if not dealt with)
20:33 alyssa: Conceptually there's a simple algorithm
20:33 alyssa: foreach block {
20:34 jenatali: Why do you need if sources?
20:34 alyssa: foreach instr in block {
20:34 alyssa: if instr == load_reg {
20:34 alyssa: valid_set |= instr
20:34 alyssa: } else if instr == store_reg {
20:34 alyssa: valid_set -= instr
20:34 alyssa: } else {
20:35 alyssa: foreach src in instr {
20:35 alyssa: if src is load_reg, assert(src in valid_set)
20:35 alyssa: ..
20:35 alyssa: }
20:35 alyssa: Problem is that misses the if's
20:35 alyssa: since if-sources don't happen in a block in NIR
20:35 jenatali: I see
20:35 alyssa: and I need to be able to guarantee there's nothing dumb like
20:36 alyssa: x = load_reg r
20:36 alyssa: store_reg r, foo
20:36 alyssa: if x {
20:36 alyssa: ..
20:36 alyssa: }
20:36 alyssa: (well, either guarantee it's not there, or insert a mov after the load_reg and read from the mov instead)
20:37 jenatali: alyssa: Look at the nir_foreach_block implementation, note it uses nir_block_cf_tree_next, use nir_cf_node_cf_tree_next to write an equivalent foreach loop, then check the node type and cast to block or check if source
20:37 alyssa: Will give that a try
20:37 alyssa: thanks :)
20:37 alyssa: (Note I am trying to do this locally, i.e. valid_set is per block only. I just.. want to think of if conditions as being read at the end of the preceding block)
20:42 jenatali: alyssa: You might want to just check the current block's next cf node then (if there is one) to see if it's an if
20:43 jenatali: alyssa: nir_block_get_following_if
20:43 alyssa: yeah, I was looking at that just now
20:43 alyssa: shouldn't be able to have more than one if, either, since there'd be an intervening then block.. I think
20:43 jenatali: Correct
20:44 Kayden: yeah, I think walking blocks and using nir_block_get_following_if is probably the easiest plan
20:44 jenatali: IIRC nir requires that after the if/else block there's also a block before any additional control flow
20:44 Kayden: the if condition is basically executed in that block anyway
20:45 alyssa: +1
20:45 alyssa: thank you both :)
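
The per-block check discussed above, with the if condition treated as a read at the end of the preceding block, might look roughly like the following. This is a toy model in Python, not NIR code: `Instr`, `Block`, `following_if` and `find_stale_reads` are hypothetical names, and `following_if` merely stands in for what nir_block_get_following_if would give you. One deliberate difference from the pseudocode in the log: a store_reg invalidates earlier loads of the same register (rather than removing the store itself from the set), and sources of every instruction are checked, including those of store_reg.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    op: str                     # "load_reg", "store_reg", or any other opcode
    dest: Optional[str] = None  # SSA name defined by this instruction
    srcs: tuple = ()            # SSA names read by this instruction
    reg: Optional[str] = None   # register touched by load_reg / store_reg


@dataclass
class Block:
    instrs: list
    following_if: Optional[str] = None  # condition of the if after this block


def find_stale_reads(blocks):
    """Return (block index, reader, SSA name) for each read of a load_reg
    result that is not covered by a still-valid load in the same block."""
    load_dests = {i.dest for b in blocks for i in b.instrs if i.op == "load_reg"}
    stale = []
    for index, block in enumerate(blocks):
        valid = {}  # load_reg result -> register, for loads still safe to read
        for instr in block.instrs:
            for src in instr.srcs:
                if src in load_dests and src not in valid:
                    stale.append((index, instr.op, src))
            if instr.op == "load_reg":
                valid[instr.dest] = instr.reg
            elif instr.op == "store_reg":
                # A store makes every earlier load of that register stale.
                valid = {n: r for n, r in valid.items() if r != instr.reg}
        cond = block.following_if
        if cond is not None and cond in load_dests and cond not in valid:
            stale.append((index, "if", cond))
    return stale


# The problematic shape from the log: x = load_reg r; store_reg r, foo; if x
bad = Block([Instr("load_reg", dest="x", reg="r"),
             Instr("store_reg", srcs=("foo",), reg="r")],
            following_if="x")
print(find_stale_reads([bad]))
```

A lowering pass could use the same walk to insert a mov after the offending load_reg and rewrite the stale reader to use the mov, instead of just reporting the hazard.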
20:57 alyssa: airlied: lavapipe getting working mesh shaders before anv? =P
21:00 airlied: alyssa: depends on how fast I can clean up the code :-P
21:01 airlied: current branch passes cts at least
21:01 jenatali: Nice :)
21:03 alyssa: =D
21:03 alyssa: how bad is software-only mesh?
21:04 alyssa: relative to e.g. geom and tess?
21:04 alyssa: s/and/or/
21:05 jenatali: When I did it in WARP it was... not great
21:06 jenatali: My problem was the ordering requirements. I might've just not been familiar enough with the internals but I couldn't figure out a good way to sort the geometry without just serializing things
21:07 alyssa: uff
21:08 airlied: yeah I do lack for any overlap
21:08 airlied: but I think it's probably faster than vs/gs/tess in llvmpipe
21:08 airlied: because I can at least thread the task and mesh shader execution itself
21:09 alyssa: I'm dragging my feet on GS on asahi
21:09 alyssa: because, yeah
21:09 airlied: one good thing is there's no xfb
21:09 alyssa: oh thank god
21:10 alyssa: well. GS, I have a plan but I hate it. TS, I have no plan other than *handwave* compile the warp/llvmpipe tessellator as a CL kernel
21:10 airlied: just lower it all to task/mesh :-P
21:11 airlied: though I suppose you then lower task/mesh to cs anyways :-P
21:13 alyssa: Yep :P
21:13 alyssa: the really icky case is primitive restart + { XFB, GS, TS }
21:13 alyssa: in gl I just draw_vbo_without_primitive_restart
21:13 alyssa: not really an option in vk.
21:14 alyssa: (unless I rewrite things with a cs.. still needs some funny memory allocation)
21:26 alyssa: ok, load_reg coalescing works now
21:26 alyssa: with almost no changes for the backend
21:26 alyssa: store_reg I need to give more thought
21:26 alyssa: and probably do a similar validation/lowering as above but in reverse
21:31 karolherbst: sooo.. is nir_lower_mem_access_bit_sizes the fancy pass now I can use to lower all vec8/16 to vec4?
21:31 karolherbst: just uhm... I know what to do for like next week
21:32 jenatali: Yes
21:32 karolherbst: nice
21:33 karolherbst: I kinda want to ditch pointless vec8/16 code from backends again
21:33 jenatali: I'm going to be adding masked store support to it soon (for storing an 8bit var on hardware that doesn't support it, by doing 2 atomics)
21:33 jenatali: And then I can delete a whole crapload of custom code and a few custom intrinsics
21:33 karolherbst: fancy
21:34 alyssa: "storing an 8bit var on hardware that doesn't support it, by doing 2 atomics"
21:34 alyssa: thanks I hate it
21:35 jenatali: Yeah
21:35 jenatali: DXIL doesn't have 8bit support
21:35 jenatali: CL requires it :(
21:35 alyssa: atomic_xor(0xFF) + atomic_or?
21:35 jenatali: So does DOOM Eternal in Vk
21:35 alyssa: er wait
21:35 alyssa: atomic_and(0xFF00) + atomic_or?
21:35 jenatali: Yep
21:36 alyssa: still seems vaguely racy
21:36 karolherbst: why doesn't it have 8 bit support...
21:36 jenatali: Yeah, if you have a race on the same byte you can get garbage. The alternative is a cmpxchg loop for every store
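
As a rough illustration of the two-atomic byte store being described, here is a single-threaded Python model. The function name is hypothetical, the plain `&=` and `|=` only stand in for atomic_and and atomic_or on a 32-bit word, and little-endian byte lanes are an assumption made for the example.

```python
def store_u8_via_two_atomics(memory, byte_addr, value):
    """Hypothetical sketch: write one byte into a list of 32-bit words
    using an AND to clear the lane followed by an OR to fill it."""
    word_index = byte_addr // 4
    shift = (byte_addr % 4) * 8                 # assumes little-endian lanes
    clear_mask = ~(0xFF << shift) & 0xFFFFFFFF
    memory[word_index] &= clear_mask            # stands in for atomic_and
    memory[word_index] |= (value & 0xFF) << shift  # stands in for atomic_or


memory = [0x11223344]
store_u8_via_two_atomics(memory, 1, 0xAB)
print(hex(memory[0]))
```

Each step is atomic on its own, so neighbouring bytes in the word are never corrupted. The pair is not atomic as a whole, though: if two writers to the same byte both clear before either one fills, the final byte is the OR of both values, which is the garbage case mentioned above.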
21:36 alyssa: yeah..
21:36 jenatali: karolherbst: Good question
21:36 alyssa: any good answer? :P
21:37 karolherbst: I kinda understand if the alu can't do it, but for memory load/stores? mhh
21:37 jenatali: No, there's no good answer
21:38 jenatali: Nobody asked for it really until we did CL, and then our DXC team has been busy ever since
21:38 karolherbst: is it worse than some company saying "no int8 or no dxil with us"
21:38 jenatali: We still don't have scalar UBO layouts, which people are actually asking for...
21:39 anholt_: man, it's really easy to pass ARB_fp fog tests when we don't actually have any.
21:39 karolherbst: I still don't understand why modern APIs just don't see memory as blob of bytes
21:39 Company: no int8 or no dxil with us
21:41 alyssa: lol
21:45 airlied: jenatali: should just move to vulkan :-P
21:47 airlied:wonders how many lawyers saying that could summon
21:55 alyssa: :~)
23:26 jenatali: alyssa: Congrats!