00:54 airlied: mattst88: yes ignore lto for now totally
00:55 mattst88: dcbaker, dcbaker[m]: ^
01:06 hanetzer: imirkin: get a kilofoot cable ;P
01:17 haasn: 21:57 <swick> Lyude: some video player people said they don't trust the hardware to get dithering right and they just do it themselves on 8bpc framebuffers <- if you're talking about mpv it requests the highest bit depth framebuffer it can
03:43 thaytan: Is there a write up of what's needed/ongoing for nouveau / Turing reclocking?
03:44 imirkin: thaytan: yeah, just need a PMU which implements it, signed by the nvidia key so that the hardware will accept it
03:44 imirkin: piece o' cake
03:44 airlied: a large piece with sprinkles
03:45 thaytan: ah - I thought I'd be showing how little I know
03:45 thaytan: I didn't realise it was a signed block - I thought it was about RE :)
03:46 airlied: nope it needs magic pixie dust
03:47 imirkin: well
03:47 imirkin: there's also next-to-impossible RE that'd have to be done
03:47 imirkin: since normally it's doen by fuzzing the vbios
03:48 imirkin: to see what the blob does
03:48 imirkin: but now the vbios is also signed
03:48 imirkin: but once you did all that RE, you'd still be SOL
03:58 thaytan: signing :(
03:58 thaytan: we're just gonna have to break into that secure vault
03:58 thaytan: anyone got Tom Cruise' number?
04:16 imirkin: gonna have to swim underwater to access the vault...
08:10 damo22: can someone please take a look at this issue on libpciaccess? https://gitlab.freedesktop.org/xorg/lib/libpciaccess/-/issues/13
08:44 pq: bernie_, hahaha, wow.
08:45 hat|: hey, the commit 64662dd5baeec19a618156b52df7a7e7adba94cf has been merged to master but I don't see it in any release/staging for now, how safe is it to compile from master?
08:55 bbrezillon: jekstrand: on a second thought (after reading the memory model spec for the hundredth time), I wonder if it's not NIR_MEMORY_RELEASE we should have here https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/compiler/nir/nir_opt_copy_prop_vars.c#L837
08:56 pq: bernie_, Lyude, Re: restoring KMS state after someone else changed it while you weren't looking; this is what the "blind save/restore" is supposed to solve in display servers. I've been talking about that on dri-devel@. I forget what thread, though.
08:58 pq: Lyude, "the driver is supposed to be able to scale down bpps as needed"?!
08:59 pq: Lyude, I as a Weston developer would be very upset if some DRM driver decided to use less bits than I specify and not tell me about it.
08:59 bbrezillon: jekstrand: IIUC, we should flush all memory operations that are before instructions (barriers or an atomics) that have the Release semantic
09:02 bbrezillon: Acquire only guarantees that every mem op that appear after an instruction are executed after this instruction, but that doesn't prevent memory ops before the atomic/barrier to cross this atomic/barrier
09:53 bbrezillon: jekstrand: hm, maybe it's just me not understanding what copy_prop_vars does
10:02 daniels: pq: you'd be surprised how many devices have 6bpc panels
10:04 pq: daniels, that doesn't matter, but I bet HDR would fail to be detected if the wire carried 8 bpc or 6 bpc isntead of 10 bpc.
10:05 daniels: that's scaling down, no?
10:05 pq: if the driver magically sends out less bits than a display server told it to, then the driver should tell about it to the display server, so that when HDR magically stops working, there is one more case covered by an explicit log message about what happened.
10:13 emersion: i agree this would be important to know, but maybe the EDID is enough to figure this out?
10:13 emersion: if not, some kind of "current_bpc" prop would be nice
10:14 emersion: i guess the question is: will the kernel always pick min("max bpc" prop, max supported bpc in EDID)
10:15 emersion: or if the logic is more involved than that
10:17 pq: emersion, EDID is about monitor capabilities. What I'm talking about is the DRM driver deciding to send less bits because e.g. total memory or wire bandwidth limitation does not allow more.
10:18 pq: like when the use of multiple CRTCs pushes you over the limit
10:19 pq: or maybe due to link negotiation for a poor quality cable
10:20 emersion: ah, that's unexpected for sure
10:20 emersion: i'd expect atomic commits to fail and user-space to need to decreease "max bpc" instead?
10:23 pq: haa haa...
10:23 daniels: yes, I'd hope/expect that as well
10:24 daniels: though it's yet another axis for userspace to try dropping down: first you have to cull modifiers, then you have to cull formats, then you have to cull resolutions; all of these require allocating as many new gbm_bos as you have CRTCs you want to light up
10:24 pq: Lyude said drivers are in the process of implementing automatic bpc lowering, if I understood right.
10:26 pq: daniels, yeah, if you aim for a totally automatic best mode discovery. OTOH, I'm not sure "best mode" even can be discovered algorithmically because "best" depends on the user.
10:45 emersion: the kernel can't know whether the user absolutely wants 10bpc, even if the resolution is lower than the max
10:45 emersion: more magic is more pain :P
10:50 HdkR: I have a television that advertises some high bpc amount (12bpc? 16bpc?) and I never want it to be running above 8bpc :P
10:51 pq: I don't mind kernel magic as long as it's overridable. { AUTO, 10BPC, 8BPC, ... }, default to AUTO, fail if cannot do exactly what was chosen. Also need another property to tell userspace what actually happened if AUTO, so that userspace can start with AUTO and then move on with more specific requirements.
10:54 pq: this can quickly evolve into "if I picked AUTO, what would I get? But I don't want to actually program it yet, I'm just pondering" kind of API.
10:55 pq: TEST_ONLY commit, the next level
11:03 emersion: yeah, hm :S
11:36 kusma: dcbaker[m]: Is there a reason why the mesa 20.0.8 release-notes aren't in master yet?
11:50 bernie_: emersion: thank you for the quick review!
11:51 bernie_: anyone knows a quick way to _clear_ HDR_OUTPUT_METADATA so I don't have to write a program to do it?
11:51 bernie_: it's annoying that it messes up the colors in my x11 session
12:02 emersion: modetest has something to do this iirc?
12:02 emersion: (set arbitrary props)
12:04 emersion: is it normal that hdr_metadata_infoframe isn't prefixed with "drm_" in drm_mode.h?
12:04 emersion: sounds like it could cause conflicts
12:05 vsyrjala: sigh
12:08 emersion: but it's released uapi now, so probably too late
12:21 bernie_: emersion: it seems modetest -w cannot set (nor clear) blobs
12:22 bernie_: emersion: maybe i could add a new flag to clear blobs...
13:47 MrCooper: pq: so you're saying XRGB 8888 should fail if the monitor only has a 6 bpc panel? ;)
13:48 imirkin: only RGB565 for you!
13:49 imirkin: pq: fwiw the nouveau property is exactly as you say
13:49 imirkin: and if you set dither to 10bpc but you actually have an 8bpc screen (and a 10bpc visual) then you effectively just don't get dithering.
13:50 vsyrjala: i guess what people want is a 'min bpc' prop on the connector
13:51 emersion: vsyrjala: yeah, maybe that would make more sense
13:51 vsyrjala:totally in the meh camp on this
13:51 emersion: a "current bpc"/"effective bpc" would be handy too
13:52 emersion: how so?
13:52 vsyrjala: a bit of banding never hurt anyone
13:52 linkmauve: Can you also configure the parameters of the dithering pattern you get in various scanout devices?
13:52 linkmauve: vsyrjala, ;_;
13:53 pq: MrCooper, no. It should fail if userspace demands 8 bpc on the wire and it can't have it.
13:53 vsyrjala: userspace can't demand anything on the wire
13:53 vsyrjala: 'min bpc' would add that
13:53 pq: vsyrjala, that's the first problem
13:54 MrCooper: meanwhile, 10 bpc getting dithered down transparently is no different from the same happening with 8 bpc
13:54 vsyrjala: no normal user is ever going to set any min bpc stuff. so still meh imo
13:55 pq: even with HDR?
13:55 emersion: when you have an "enable hdr" button in some configuration UI, and then use 8bpc, users get angry
13:55 emersion: i've tried hdr metadata with 8bpc, it's not great
13:57 emersion: at first didn't understood why all black areas had severe quality issues
13:57 vsyrjala: is telling the user "no hdr for you on your hdr capable screen" much better really?
13:57 vsyrjala: still angry
13:58 emersion: user-sapce _can_ do something about it
13:58 emersion: e.g. lower the resolution
13:58 emersion: or the refresh rate
13:58 emersion: also, silently failing is never a good idea imho
13:58 pq: vsyrjala, that's better than a bad picture. Userspace can e.g. suggest ways to improve to the user: try a better cable, etc.
13:59 imirkin: or try a better computer ;)
13:59 vsyrjala: picture on the screen -> no fail ;)
13:59 vsyrjala:ducks
13:59 ivyl: mangled picture on the screen -> a great success
13:59 pq: "let's pay him only 80% of the salary and see if he notices" ;-)
14:00 vsyrjala:would honestly not notice
14:00 imirkin: fix the glitch, as it were?
14:00 vsyrjala: i hope i'm still getting paid anything actually
14:02 imirkin: https://www.youtube.com/watch?v=zqjQDP9KX6E
14:05 MrCooper: bsat: yes, dithering means temporal in this context
14:13 danvet: emersion, a lot of the hdmi infoframe stuff is internally shared with drivers/media so they don't feel left out so much
14:13 danvet: that's maybe why the prefix was forgotten in the uapi
14:16 emersion: rip
14:20 vsyrjala: hmm. we don't even use that struct definition anywhere
14:21 pq: maybe it's passed straight through as a blob into HDMI infoframe? :-p
14:21 emersion: lol
14:22 vsyrjala: it is. so i guess we can argue it's there just to document the cta-861.3 infoframe layout
14:23 vsyrjala: hmm. acutally no. we do use it
14:23 vsyrjala: no that's hdr_output_metadata
14:23 vsyrjala: what was the other one agian/
14:23 vsyrjala: ?
14:23 vsyrjala: hdr_metadata_infoframe
14:23 vsyrjala: wtf is this stuff?
14:23 emersion: hdr_metadata_infoframe
14:24 emersion: hdr_output_metadata contains a hdr_metadata_infoframe field
14:24 emersion: i assume there's also an enum or defines for metadata_type
14:24 emersion: hdr_output_metadata sounds DRM-specific
14:25 emersion: hdr_metadata_infoframe should have "hdmi" in its name
14:25 vsyrjala: cta-something-something actually
14:25 vsyrjala: until they rename that organization a fourth (?) time
14:26 emersion: then maybe hdmi_metadata_type1 is misnamed?
14:26 emersion: well, w/e, it's uapi now, nothing we can do about it
14:28 vsyrjala: no idea. i expect half of userspace to interpret this mess one way, and the rest the some other way
14:29 vsyrjala: and when we get other metadata layouts it probably gets even more "fun"
14:32 vsyrjala: i guess hdr_output_metadata is a drm specific wrapper around the spec stuff to slap on the extra id
14:32 vsyrjala: which i guess still takes the values defined in the cta spec. confusing
14:34 vsyrjala: the type1 label is a bit wrong. only the stuff starting from the display primaries is actually the type1 static metadata. the two bytes before that are just the other part of the infoframe
15:04 jekstrand: bbrezillon: Right. You just reminded me why it's acquire. :-D Copy propagation eliminates reads by replacing them with the source of a write. This effectively moves reads up. Therefore, when we see a release, we need to flush everything so no reads get moved above that point.
15:12 bbrezillon: jekstrand: you mean "when we see an *acquire*" (nothing is flushed on release AFAICT)
15:13 jekstrand: Yes, when we see an acquire
15:13 jekstrand: wait...
15:13 jekstrand: no
15:13 jekstrand: bah
15:13 jekstrand: let me go read code
15:13 jekstrand: Right. Acquire because no reads are allowed to be moved above an acquire.
15:14 bbrezillon: yep
15:14 dschuermann: jekstrand: can you (or someone with SPIRV expertise) have a quick look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207/diffs?commit_id=6b184a1b06364871895f9135075244998b6a4658 ?
15:15 bbrezillon: jekstrand: ok, that part makes sense
15:16 bbrezillon: there's still the part updating written->modes I don't quite understand
15:26 jekstrand: dschuermann: Uh... I don't know.
15:26 bbrezillon: jekstrand: if you have a barrier with a Release, all writes before that point should be considered invalid (in the sense the written value can't be used as a replacement for a load from the same var)
15:26 bbrezillon: I'm sure this case is handled, but I fail to see where
15:26 jekstrand: dschuermann: It's not that I think your SPIR-V code is wrong. I just don't know that I trust spirv_to_nir or even NIR to propagate things properly.
15:27 dschuermann: it's just default-applying the restrict decoration according to the spec
15:28 jekstrand: bbrezillon: Not quite, I don't think. If you see a release, all writes before that point have to happen. That's handled by the dead_var_writes pass which is separate from copy-prop. The copy-prop pass never removes any writes, it just propagates them to replace reads.
15:28 dschuermann: jekstrand: if something gets lost on the transition through NIR, I'd say that's a different issue
15:29 jekstrand: dschuermann: The problem is that NIR isn't aware of restrict at the deref level and so it isn't doing the propagation. Instead, what's likely happening is that spirv_to_nir is taking the restrict from some point in the access chain and propagating it all the way to the tail.
15:31 jekstrand: Say you have a pointer `restrict struct { uint x } *p;` and two OpAccessChains which do uint `uint *a = p->x, *b = p->y;` and then OpStore which does `*a = 1; *b = 2`. If we let spirv_to_nir do the propagation in it's lazy "just OR everythign" way, it'll end up with restrict on both `a` and `b` and it will look like the two stores can be re-ordered even though they can't.
15:33 jekstrand: Instead, we really want restrict to be a thing that probably comes in through a `deref_type_cast` and then NIR uses that as part of it's non-aliasing proving logic.
15:33 bbrezillon: jekstrand: I'm not concerned about writes not happening, just concerned about reads using a value that might not be up-to-date because it comes from a a store that happened before the 'released' barrier, meaning that another could have changed the value in-between
15:34 bbrezillon: *another thread
15:34 jekstrand: I'm not really sure how you'd want to turn that into SMEM opcodes at the ACO level. :-/
15:35 jekstrand: bbrezillon: I think that's fine. The Vulkan memory model makes no guarantees about reads and writes unless you have write -> release -> acquire -> read.
15:36 alyssa: nroberts: In case I'm barking up the wrong tree - what was the original motivation for storing mediump temps/locals as float instead of f16? Just simpler or are there bad edge cases for the latter approach?
15:36 dschuermann: jekstrand: we have separate caches for SMEM/VMEM, so we need to be careful if the same buffer is written by a different memory operation
15:36 jekstrand: dschuermann: Oh, that's just ugly. :-(
15:37 dschuermann: currently, aco just goes full-ignore as usually no-one uses divergent/uniform accesses mixed on the same buffer, but it's a quite bold assumption
15:37 jekstrand: It should be fine as long as no one writes to it
15:37 jekstrand: But, yeah, for an SSBO, things could get "fun"
15:38 jekstrand: But are there interesting cases where someone is doing entirely scalar access *and* writing to it? That seems extremely odd.
15:39 dschuermann: what we do now, is just check all accesses to the same buffer, unless aliased
15:39 bbrezillon: jekstrand: ok, then we'd have an acquire to invalidate the copies before we read, makes sense. Thanks for bearing with me :)
15:39 dschuermann: jekstrand: that is really common. typically you have either divergent load -> divergent store, or uniform load-store
15:40 jekstrand: bbrezillon: No worries. This stuff is insanely hard to get into your head.
15:40 jekstrand: bbrezillon: Not your head in particular. In general, it's difficult. It took me like a week of study to actually grok it.
15:40 jekstrand: And I'm still not sure I have.
15:41 jekstrand: dschuermann: I'm a bit concerned because "restrict" has multiple meanings depending on what way you look at it.
15:41 dschuermann: jekstrand: we found only 22/111493 shaders to do mixed uniform/divergent load/store at all
15:42 jekstrand: And putting "restrict" on the load/store opcode doesn't seem to make any sense.
15:42 jekstrand: restrict on the descriptor is, I think, what you want.
15:42 dschuermann: it's a variable decoration which means that the pointer doesn't alias anything °°
15:42 jekstrand: So you basically want to tag descriptors as being scalar or vector
15:43 jekstrand: dschuermann: Yeah, I know. It's weird
15:43 jekstrand: So do you have a pass for handling this today?
15:43 jekstrand: If so, how does it work?
15:43 dschuermann: same MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207/diffs?commit_id=bdcdd1ad1e6b40dc54afdbc33860911ea74283f8
15:44 dschuermann: or better here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207/diffs?commit_id=bdcdd1ad1e6b40dc54afdbc33860911ea74283f8#d0488ddd1da2786fc08fb3c5c5c0b5b0329dc0cd_226_243
15:45 dschuermann: pendingchaos: ^
16:03 nroberts: alyssa, I’m not sure if I fully understand the question, but the idea in general is to not change any types of variables. instead the mediumpness is treated as a way to specify the minimum precision of operations using the variables
16:04 nroberts: I guess if you forcibly make a temporary variable f16, but then it happens to only be used in operations that can’t be demoted to mediump then it would end up adding an unneccesary conversion to/from f16 to truncate the precision
16:05 pendingchaos: jekstrand: I don't see what's wrong with your "`uint *a = p->x, *b = p->y;" example? both stores are different locations in the same resource, so they can be re-ordered
16:05 alyssa: nroberts: Hm, okay.
16:07 alyssa: The issue is with conditionals like https://people.collabora.com/~alyssa/0001-WIP-glsl-Test-GLSL-ternary-statement-in-precision-lo.patch
16:07 alyssa: which end up totally 32-bit at the GLSL level and then totally 32-bit thru NIR and in the backend
16:08 alyssa: (Noticed in glmark2-es2 -bshading:shaing=cel ... probably affects anything that does conditionals though)
16:10 nroberts: hm, the compiler-generated temporary variables should have their precision marked as mediump, so in theory it should still be able to lower that without changing the type of the temporary variables
16:10 nroberts: but I guess something is going wrong
16:10 alyssa: right, it's marked mediump but it's still stored as float
16:11 alyssa: Since the operation in question is just an `assign` (nested within an `if`), there's nothing at the GLSL to do as 16-bit, the pass doesn't kick in
16:12 alyssa: then in NIR that mediump temp becomes a real 32-bit SSA value, and then after a bunch of lowering passes and opt_peephole_select we end up with a 32-bit bcsel instead of expected 16-bit
16:12 alyssa: using an f16vec4 temp instead at the GLSL level fixes that (no NIR changes needed) but may be too invasive.
16:13 alyssa: (If peephole select is disabled, so we have an actual if, the nir_register is 32-bit, so same issue, now it's a 32-bit mov in the backend and twice the reg pressure)
16:15 nroberts: right, I guess I would have to investigate more to understand it fully, but at least in theory the design of the lowering pass is that it should be able to lower that without changing the type of the temporary
16:17 alyssa: I'm trying to understand how that would (in theory / design) be facilitated so I don't go tearing up perfectly good code.
16:34 jekstrand: pendingchaos: I meant `*b = p->x`
16:57 dschuermann: jekstrand: that would be 2 pointers aliasing the same variable...
16:58 jekstrand: dschuermann: Yes, but if we let spirv_to_nir do the propagation, it would flag both of them restrict
16:58 jekstrand: Which is clearly wrong
16:58 dcbaker: kusma: because I made the release late last night and didn't get to it yet :)
16:58 dschuermann: jekstrand: no, the glsl is already wrong. this is not allowed unless decorated as aliased
16:59 dschuermann: (or the spirv)
17:01 bnieuwenhuizen: dschuermann: I thought this was only about whether buffers overlapped each other, not variables?
17:01 jekstrand: dschuermann: The problem is that spirv_to_nir just ORs all access flags as it walks the deref
17:01 jekstrand: dschuermann: So, currently, RESTRICT on a variable will end you up with RESTRICT on all pointers that point to that variable.
17:04 bnieuwenhuizen: on the other hand, restrict is the default for Vulkan, and ALIASED would be perfectly reasonable when or-ing, no?
17:04 jekstrand: Yes, it would
17:04 dschuermann: so, restrict/aliased refers to "Memory Object Declaration: An OpVariable, or an OpFunctionParameter of pointer type, or the contents of an OpVariable that holds either a pointer to the PhysicalStorageBuffer storage class or an array of such pointers." and if I'm not mistaken this means that any two pointers must not alias unless decorated as such
17:04 jekstrand: dschuermann: Sadly, no. :-(
17:05 dschuermann: maybe we should do the opposite and introduce an aliased flag which we can OR?
17:05 jekstrand: dschuermann: Pointers which are generally by OpAccessChain might alias
17:05 bnieuwenhuizen: dschuermann: as far as I understand for the memory object declarations it means that different descriptors should point to physically separate regions of memory
17:06 jekstrand: It's only at the points where we initially get a pointer such as OpLoad from a variable, as a function parameter, or from a descriptor that it's decorated.
17:06 jekstrand: dschuermann: Long-term, we need to make NIR capable of reasoning about restrict
17:06 jekstrand: Short-term, we just need some way to get data to your pass
17:06 dschuermann: ok, so our idea is correct but the patch might overrule aliased variables by applying restrict, right?
17:06 jekstrand: Actually.... I don't know that you can get correct without NIR having a deep understanding.
17:07 dschuermann: wouldn't be enough to check that an aliased flag is not present anywhere on the access chain?
17:07 jekstrand: You need a few things
17:08 jekstrand: 1. If the base pointer isn't restrict (is alias), you know nothing and have to assume worst-case.
17:08 jekstrand: (where base pointer might come from a descriptor or it might be a cast from some random integer value of unknown origin)
17:08 dschuermann: I think restrict is the default?
17:09 bnieuwenhuizen: jekstrand: "The Simple, GLSL, and Vulkan memory models can assume that aliasing is generally not present between the memory object declarations. "
17:10 jekstrand: 2. If a pointer starting from the descriptor is ever passed into something that isn't a deref chain, you have no idea where it will come out or who will use it so you have to assume that anything which accesses that descriptor might alias and assume worst case.
17:11 jekstrand: bnieuwenhuizen: I'm well aware of what the SPIR-V spec says and it's handling of restrict/alias is utterly rubbish. I tried to get it specified sanely but failed.
17:13 jekstrand: bnieuwenhuizen: The correct way to read it is that OpVariable, OpConvertUToPointer, OpFunctionParameter, OpLoad, and anything else which might "generate" a pointer default to are restrict if they aren't decorated as alias. Everything else is "don't know" so it's effectively "alias until proven otherwise".
17:13 jekstrand: The problem is that "alias" is a useless decoration when trying to actually reason about thing.
17:13 jekstrand: When reasoning about things it's always "they alias unless I can prove otherwise"
17:14 jekstrand: So when you counter a pointer that's flagged restrict you know "this one, and anything dervived from it, doesn't alias anything not derived from this pointer"
17:15 dschuermann: which is... well enough for our purpose I think °°
17:15 jekstrand: for your purposes, all you care about are "is the descriptor flagged alias"
17:16 dschuermann: yep
17:16 bnieuwenhuizen: jekstrand: 'Everything else is "don't know" ' How do you get a pointer without something that might generate a pointer?
17:16 jekstrand: Well, actually I don't think you can possibly do anything properly correct as long as you're still using the offset/index model and not the deref model in spirv_to_nir. Because there's no way you can handle 2 above.
17:17 jekstrand: bnieuwenhuizen: What I mean by that is that the compiler has to figure it out.
17:17 jekstrand: bnieuwenhuizen: Let me back up for a second. Forget about restrict/alias applying to buffers
17:17 bnieuwenhuizen: but that is simply dataflow, no?
17:17 jekstrand: Assume we only care about pointers
17:17 jekstrand: All restrict/alias on a buffer means is "the base pointer is flagged restrict/alias"
17:19 jekstrand: For any given two pointers, there are generally three questions I would want to answer: 1) Do a and b point at the same memory for sure, 2) Does a completely contain b (so a write to a destroys a write from b) and 3) Do a and b not point to overlapping memory for sure.
17:19 jekstrand: Generally, we start with the assumption that any two pointers might alias and prove otherwise.
17:20 dschuermann: jekstrand: wouldn't 3) be (partially) answered by "they use different descriptors and not aliased"
17:20 jekstrand: The typical method for proving otherwise is to chase the two pointers and find one of two things: a) They point to completely different variables or b) They share a parent deref but take provably different paths from there. (Different struct members or different constant array indices.)
17:21 jekstrand: The alias decoration on descriptors plays into a
17:21 dschuermann: b) needs proper pointer analysis
17:21 jekstrand: Yes, b needs pointer analysis
17:22 bnieuwenhuizen: right, but we're not trying to solve the full aliasing problem here right?
17:22 jekstrand: Where things get "fun" is when you start getting pointer casts or storing pointers in variables or passing them to selects or phis.
17:22 bernie_: how do I delete a blob property?
17:22 bernie_: I tried setting it to 0 or -1, but I get EACCES
17:23 jekstrand: The moment when either of your pointer can't be chased back to a variable, you have to assume they alias.
17:23 jekstrand: Unless you can walk the other direction and prove that there's no way they could have possibly gotten the address to the variable to which you do have a complete deref chain.
17:25 dschuermann: jekstrand: "fun" part aside ;) can we solve our easy case with an explicit aliased flag? (only ssbo accesses)
17:25 bnieuwenhuizen: right, but isn't this pretty much "track all users of the variable and throw away restrict if you get lost"?
17:25 jekstrand: bnieuwenhuizen: Yes, but you can't do that so long as RADV is still using the index+offset addressing scheme. :-)
17:25 jekstrand: dschuermann: Vulkan allows the fun stuff. :P
17:25 bnieuwenhuizen: jekstrand: at the buffer level, why not?
17:26 jekstrand: bnieuwenhuizen: Because you can't really chase and see if you get lost
17:26 jekstrand: I mean, maybe you can but I don't think so
17:27 jekstrand: What you could do (and this might work) is say "if all our SSBO access uses constant buffer indices, we can do the analysis" and that probably gets you 90% of what you want.
17:27 bnieuwenhuizen: jekstrand: just iterate over the program tracking looking for all vulkan_resource_index, and giving up if those are used by vulkan_resource_reindex ?
17:27 jekstrand: bnieuwenhuizen: What about variable pointers?
17:28 emersion: bernie_: you need to be root to set KMS props
17:28 jekstrand: So I think you could do it if the index field for every SSBO access came directly from a vulkan_resource_index
17:28 jekstrand: The moment one of them ends up in ALU, you're toast
17:28 bnieuwenhuizen: jekstrand: that was what vulkan_resource_reindex was for no? (In case phi, select were also allowed, you can also give up on them)
17:29 jekstrand: yes, reindex is roughly for that
17:30 dschuermann: has yet to be seen in the wild... °°
17:30 jekstrand: variable pointers? Yes, they're a real thing. Not common but there are apps.
17:31 jekstrand: I don't think any games do
17:31 jekstrand: Because D3D doesn't have such a thing
17:31 jekstrand: In any case....
17:31 jekstrand: I think the conclusion I'm coming to is that making `restrict` handled properly in NIR is actually a different problem from the one you're trying to solve.
17:32 jekstrand: I think you can solve your problem with a fairly straightforward thing.
17:33 jekstrand: 1. spirv_to_nir needs to flag buffers as restrict somehow (maybe a flag on the descriptor_load?)
17:33 danvet: hwentlan, the fun starts with the first patch in the series ...
17:33 jekstrand: s/restrict/aliased
17:33 danvet: if you want to test something
17:33 jekstrand: 2. You need to somehow record the set of buffers flagged aliased for use later
17:34 danvet: hwentlan, agd5f_ aside any preferred tree for this, it only interacts as in both together fix a bug, but can be merged independently
17:35 jekstrand: 3. In your analysis pass, store a "vector" flag per-SSBO and anything which is accessed with a divergent instruction gets "vector" set. If you see any SSBO access where you can't immediately figure out which SSBO it belongs to, flag all of them as "vector"
17:35 jekstrand: Then you use a scalar access if and only if you can immediately figure out which SSBO it belongs to and if that SSBO is not flagged "vector"
17:38 hwentlan: No preference from me. agd5f_ might have a preference though
17:41 dschuermann: jekstrand: thanks alot! ok if I copy-paste this as MR comment?
17:42 jekstrand: dschuermann: Feel free!
17:42 jekstrand: dschuermann: I think my #1 concern with the approach you're taking now is mostly that it's putting a bunch of information in NIR which, in it's current form, is full of lies.
17:42 jekstrand: It's definitely useful for your back-end to have something.
17:44 dschuermann: yeah, I thought the whole aliasing thing might be easier in Vulkan :/
17:44 dschuermann: I feel they only did half of what would be needed to make it really useful on compiler side
17:47 bernie_: pq: In case you were unaware of it: https://chromium-review.googlesource.com/c/chromium/src/+/2091274
17:47 jekstrand: dschuermann: I think the information is there. They just did it in the most awkward way possible, of course.
17:47 jekstrand: Welcome to SPIR-V!
17:49 bernie_: emersion: i got it to work
17:49 bernie_: emersion: I didn't need to be root
17:51 bernie_: emersion: oh wait... i had write access to /dev/dri/card0 due to being in group video
17:51 bernie_: emersion: but anyway... this worked from the text console: modetest -w 59:HDR_OUTPUT_METADATA:0
17:52 bernie_: emersion: it didn't work from inside kwinft
17:52 bernie_: emersion: and from a vt controlled by the amdgpu DDX driver, I can't even *read* the props
17:54 bernie_: emersion: the DDX driver calls drmDropMaster() when it loses the VT
17:56 bernie_: A friend who works on Chromium pointed me at this code they use to relax the single-master problem:
17:56 bernie_: https://chromium.googlesource.com/chromiumos/platform/frecon/+/refs/heads/master/main.c#245
17:56 bernie_: it requires an out-of-tree kernel patch
18:00 bernie_: this is the patch: https://codereview.chromium.org/138033024/diff/1/drivers/gpu/drm/drm_drv.c
18:05 kusma: dcbaker: OK, fair. Thanks :)
18:20 pendingchaos: jekstrand: in your example, the stores to both "a" and "b" are to a variable marked as restrict
18:20 pendingchaos: so I don't see how adding ACCESS_RESTRICT to the stores is wrong?
18:23 imirkin: restrict means "will not alias", i think
18:23 imirkin: default is "might alias"
18:24 pendingchaos: but afaiu ACCESS_RESTRICT refers to the variable, not the access
18:24 imirkin: yes
18:24 jekstrand: pendingchaos: It depends on how you specify it
18:24 imirkin: i don't think jekstrand's example is quite right...
18:24 jekstrand: With pointers, it's not nearly as clear as that. :-(
18:25 jekstrand: pendingchaos: Let me start over. You have `restrict S *p;` where S is a struct type with a member `x`. Then you have `uint *a = p->x; *b = p->x; *a = 10; *b = 20;`, they both write to the same value so the pointers `a` and `b` do alias.
18:25 imirkin: if p is restrict, and you do *a = p->x, *b = p->x, then i think it's guaranteed that a and b won't alias p
18:26 imirkin: oh - you're saying that p->x is a pointer itself?
18:26 imirkin: i think i missed that.
18:26 jekstrand: If, on the other hand you had two random pointers `restrict uint *q = something(); restrict uint *r = something_else();` and you do `*q = 10; *r = 12;` the compiler can assume they don't alias and re-order.
18:27 jekstrand: Oh, I should have said &p->x
18:27 jekstrand: Sorry
18:27 jekstrand: IRC is kind-of terrible because you can't correct anything. :-(
18:27 jekstrand: Let me type into a pastebin
18:27 imirkin: all your comments make a lot more sense with the & :)
18:34 jekstrand: https://paste.centos.org/view/bf1ed7b9
18:34 jekstrand: Ugh.. I said swapchains. :-( I meant access chains
18:34 jekstrand: Too many terms
18:35 jekstrand: Typo fixed: https://paste.centos.org/view/7244ee45
18:35 agd5f_: danvet, no preference
18:35 imirkin: jekstrand: what's the scope of "restrict" for variables?
18:36 jekstrand: imirkin: What do you mean?
18:36 imirkin: well
18:36 jekstrand: imirkin: Normally, all variables are restrict
18:36 jekstrand: SSBOs are weird
18:36 imirkin: restrict means that it doesn't alias
18:36 imirkin: but doesn't alias with what?
18:36 imirkin: anything ever, past or future?
18:37 imirkin: restrict makes sense on ssbo's
18:37 jekstrand: "alias" on an SSBO means that that binding may overlap memory of other bindings.
18:37 imirkin: since those are actual chunks of vram
18:37 imirkin: right
18:37 imirkin: but on a naked pointer in the code, what does it mean?
18:37 jekstrand: For everything else, the "base pointer" of each variable is restrict.
18:37 jekstrand: because access to one variable shouldn't affect another
18:37 imirkin: right....
18:38 imirkin: but you could just as well have
18:38 imirkin: unsigned restrict *a = &foo;
18:38 imirkin: unsigned restrict *b = &foo;
18:38 imirkin: and you have the same problems again
18:38 imirkin: (where foo is a local variable, for example)
18:38 jekstrand: If you have a function parameter that's a restrict pointer, it's a contract which says that nothing accessed within the function starting from that pointer will alias anything starting from any other pointer passed in.
18:39 jekstrand: I'm not exactly sure what the semantics would be if you declare a temporary pointer and flag it restrict
18:39 jekstrand: I think it would be restrict with respect to all other restrict pointers in that scope
18:40 jekstrand: Or, more specifically, if you have `struct S restrict *p`, then any access through p can be re-ordered with respect to access not through p
18:40 jekstrand: It gets tricky
18:40 jekstrand: I'd need to go very seriously read some language rules to figure it out.
18:41 imirkin: compilers are hard.
18:41 jekstrand: Yup
18:41 jekstrand: But the point is that just propagating restrict to all child pointers is definitely bogus.
18:42 danvet: agd5f_, I think marginally better if it's all in -misc, otherwise we might have a regression in the merge commit if we're really unlucky
18:44 pendingchaos: why would NIR think those two stores can be reordered? they obviously point to the same location
18:45 jekstrand: pendingchaos: Because they're marked restrict
18:49 pendingchaos: but restrict is just a flag applied to a variable to indicate that it doesn't alias with other variables?
18:49 pendingchaos: and both "c" and "e" point to the same variable
18:52 agd5f_: danvet, works for me
21:39 newuser444467899: Suddenly changing mouse sensitivity after launching dota 2 with vulkan api on Arch Linux /Debian Linux or launching Dota 2 with opengl and streaming/recording via ffmpeg and obs with vaapi encoder
21:39 newuser444467899: how i can fix this