11:51mripard: sima: following up on here and mastodon, I guess I'll just drop the patch
11:53mripard: it's still an implicit panic, which is inconsistent over similar calls, but that's really not the point of the series
11:59javierm: mripard: I still don't understand why the check is controversial... is not that drm_connector_init() is in a hot path
12:02javierm: but yeah, not the point of the series so you could drop it if that would help focusing on the actual changes rather than bikeshedding how paranoid helpers functions should be
12:10pq: Pre-conditions vs. legit argument values.
12:12emersion: normal runtime errors vs. programmer mistakes
12:13javierm: pq, emersion: the problem is that C doesn't have a way to check that
12:13emersion: a way to check what?
12:13javierm: so whether pre-conditions / programmer mistakes or not, it will lead to a panic
12:14emersion: no matter how complicated the type system may be, there will always be cases where the type system won't be enough
12:14pq: C does have assert(), and the kernel has BUG() and friend.
12:14emersion: there will always be cases where a panic/abort is the only way to indicate a programmer mistake
12:14emersion: maybe my function has two arguments, and one can only be supplied when the other not, etc
12:15emersion: or a string needs to have a specific format
12:15emersion: or a number greater/smaller than 42
12:15emersion: Rust cannot encode these things
12:17emersion: (and IMHO trying to encode complicated constraints in type systems results in developers writing types, instead of writing actual useful code :P )
12:17javierm: emersion: no, but we are talking about preventing a NULL pointer dereference here
12:17pq: It's also a question of definition: who is required to validate the arguments or pre-conditions. The one who does not bear that responsibility can explode when pre-conditions are broken.
12:17emersion: javierm: to me, they're the same kinds of errors: programmer mistake
12:17pq: javierm, yes, we are talking about exactly that.
12:19javierm: emersion, pq: Ok. So is more controversial than I thought then
12:20pq: If the pre-condition is that this function must be called with a non-NULL argument, and someone calls it with NULL, then it should explode. But if that pre-condition does not exist, then the function may as well return a graceful error if it cannot work.
12:20emersion: my take is that it's a good thing to panic/WARN/abort on programmer error
12:20emersion: way easier to spot the mistake and debug
12:20javierm: pq, emersion: my take is that core kernel code should be paranoid and try very hard to not lead to a panic
12:21pq: Indeed, unexpected failures should explode hard and on the spot (in debug builds).
12:21emersion: a panic isn't such a big deal
12:21emersion: sorry, i mean a WARN
12:22emersion: it's not like it takes down the whole kernel or anything
12:23pq: javierm, that's true depending on what we're talking about. If arguments come from userspace, then totally yeah.
12:23emersion: invalid args coming from user-space isn't a kernel programming mistake
12:24mripard: emersion: Rust can do that if you use the newtype pattern
12:24javierm: emersion: it could with CONFIG_PANIC_ON_OOPS=y but yeah, probably not enabled in production builds
12:24mripard: but back to the topic
12:24emersion: mripard: do what?
12:25mripard: all of that discussion about panic vs error check is also missing an important point
12:25emersion: ah, use a newtype to enforce that the arg has a given property?
12:25mripard: a) we never document that *anywhere* b) it's implicit c) it's not consistent across similar calls
12:25emersion: that makes it the caller's responsibility to check, which may not be better
12:26mripard: and that cannot be explained by "well panic is just better"
12:26mripard: those three things are bad, no matter which side of the fence you're on
12:26javierm: mripard: agreed
12:26emersion: documenting pre-conditions is indeed important
12:28pq: indeed, and if you have a if (!arg) return -EFAIL;, then it documents that there is no pre-condition saying arg must not be NULL, because NULL is gracefully handled.
12:28mripard: so I'm sending a separate series adding a bunch of BUG_ON, improving the documentation and making that consistent across calls
12:28pq: so you're making the policy by adding the check
12:28mripard: would that satisfy everyone?
12:28emersion: i think that's the right thing to do
12:28pq: mripard, that sounds awesome to me.
12:28mripard: s/so I'm sending/if I'm sending/
12:29mripard: great
12:29mripard: I'll do that then :)
12:29mripard: topic closed, can we move to the rest of the series ? :)
12:29pq: essentially the debate is about wether "arg is not NULL" is a pre-condition or not.
12:30pq: If you add BUG_ON(!foo), that documents it's a pre-condition.
12:30pq: If you add if (!foo) return -EINVAL; that documents it is not a pre-condition.
12:31emersion: yea
12:32pq: For a casual code reader, not having a pre-condition documented means that the opposite is probably normal operation and needs to be handled as well as possible, which might prompt them to write a lot of code to deal with it. Sometimes that's good, sometimes bad, depending.
12:35pq: Simply dereferencing a pointer can be seen as a pre-condition "is not NULL'ish", because NULL pointer dereference is expected to crash immediately. Problem are cases where it does not crash, or somehow allows a compiler to generate nonsense code.
12:36javierm: just wanted to mention that the BUG_ON() kernel-doc says that should be used only as a last resort and if there's really no way out
12:36javierm: https://elixir.bootlin.com/linux/latest/source/include/asm-generic/bug.h#L52
12:37javierm: I understand your point from a theoretical PoV but in practice I believe that can cause a lot of harm to not be paranoid about check if the pre-conditions are meet
12:38pq: depends on the probabilities
12:40javierm: regardless I agree with mripard that making the pre-condition explicit with a BUG_ON() rather than implicit is progress
12:41pq: Pre-conditions should definitely be checked at least in debug builds / CI for sure. What to do when it fails is another question. But this is also orthogonal to defining or not defining something as a pre-condition.
12:42mripard: could we just stop discussion a one-liner and focus on the rest of the series that actually provide some notable features?
12:42pq: and whether something is a pre-condition or not is totally arbitrary
12:42mripard: pleaaaase :)
12:42pq: sorry, philosophy is much more interesting than work :-p
12:42javierm: pq :D
13:31sima: mripard, I think you get the crown for "caused the best bikeshed of the year"
13:31mripard: if that means it's an automatic ack for the rest of the series, then I'm glad I did :-D
14:44karolherbst: gfxstrand: we have a bug in regards to alligned attributes in vtn: https://gist.github.com/karolherbst/f62d2c874e3d9cd6d839c68bbd9fdf9b
14:44karolherbst: "64 %64 = deref_cast (uvec2 *)%63 (shared uvec2) (ptr_stride=0, align_mul=1, align_offset=0) #!!! alignment is wrong"
14:45karolherbst: so the shared memory is forced to be aligned, but then we drop it back to 1...
14:45karolherbst: but I guess we might just not honor the alignment information in vtn there?
14:46karolherbst: maybe I should check if the spir-v is even fine...
15:05karolherbst: yeah.. the spirv has the alignment.. let's see...
15:07karolherbst: ohh.. it applies Alignment to the variable, not the pointer type...
15:09karolherbst: guess we could work around that.. *sigh*
15:31gfxstrand: Oh...
15:31gfxstrand: What do you mean?
15:33karolherbst: gfxstrand: like the spirv-llvm-translators puts `Alignment` on the `OpVariable` not on its pointer type
15:34karolherbst: should have been `OpDecorate %_ptr_Workgroup__arr_uchar_ulong_1024 Alignment 8` instead I think
15:41gfxstrand: Weird
15:41gfxstrand: I thought we handled that.
15:44karolherbst: well.. doesn't look like it
15:44karolherbst: well.. it's also invalid to put it on the variable
15:44karolherbst: "Apply only to a pointer. Alignment is an unsigned 32-bit integer declaring a known minimum alignment the pointer has."
15:48wv: are there known issues regarding colors on freedreno/dmabuf? It appears that when I use glimagesink or appsink using dmabufs, I have color issues. Or video is green, or there's like a shift of 180 pixels in colors (like a mask is played next to the grey image)
15:58gfxstrand: karolherbst: Well, an OpVariable is a pointer...
15:59gfxstrand: karolherbst: Or is it a variable that contains a pointer?
15:59gfxstrand: I honestly don't know what we're supposed to do with an alignment on a variable.
16:00karolherbst: the variable is just an array
16:00gfxstrand: ugh...
16:00karolherbst: in OpenCL C: __local uchar data[512/4*sizeof(ALIGN_TYPE)] __attribute__((aligned(sizeof(ALIGN_TYPE))));
16:01gfxstrand: Right
16:01gfxstrand: Yeah, we don't handle that sort of alignment yet.
16:01karolherbst: mhh.. we'd also have to place the variable aligned I guess...
16:01karolherbst: yeah.. it kinda makes sense to be able to align the type actually...
16:02karolherbst: the variable I mean
16:03gfxstrand: And... The spec says nothing. Because of course it does.
16:03gfxstrand: Let me type something quick.
16:18karolherbst: gfxstrand: where does the spec state that a variable is a pointer actually?
16:19gfxstrand: OpVariable returns a pointer
16:19gfxstrand: Just look at the OpVariable spec
16:19karolherbst: ohhh...
16:20gfxstrand: But, as with most SPIR-V things, it's legal to write it but no spec actually says what it means so we kinda have to make it up.
16:20karolherbst: let's see...
16:20karolherbst: "The compiler is responsible for aligning objects allocated by OpVariable to the appropriate alignment as required by the Result Type."
16:21karolherbst: I guess that's everything we get here
16:21gfxstrand: Yeah, but what does that mean?!?
16:21karolherbst: well.. if the result type would be decorated with "alignment"
16:22karolherbst: but I think this just means we have to align per CL rules (type alignment) or according to the alignment attribute (Alignment decoration)
16:22gfxstrand: Yeah, but alignment is a hint, not a command.
16:22gfxstrand: At least on all other pointer types
16:22karolherbst: I don't think it's a hint in CL
16:22gfxstrand: In any other environment it is.
16:23karolherbst: like if you align a shared memory block you can hope that it's indeed aligned that way
16:23karolherbst: or if you align struct members
16:23gfxstrand: Well, it's a hint in the sense that if the client sets an alignment then it's up to the client to make sure it's true.
16:23karolherbst: yeah
16:23gfxstrand: But I mean that it's data provided to the compiler. The compiler is free to ignore it.
16:23karolherbst: but the client also expects this alignment to be honored by thec ompiler
16:23karolherbst: mhhh
16:23karolherbst: I don't think it is
16:23karolherbst: not in CL
16:23gfxstrand: Sure you can. You can use unaligned loads for everything. There's nothing stopping you.
16:24karolherbst: like again, if you annotate a shared memory block with "this is aligned with 8 bytes" then you kinda want that block to be 8 byte aligned
16:24karolherbst: well...
16:24karolherbst: right
16:24karolherbst: but...
16:24karolherbst: the thing is
16:24karolherbst: what if you cast that array to an array of longs
16:25karolherbst: and then use load/stores on it in functions
16:25karolherbst: you can't assume everywhere that things are unaligned if you care about perf
16:26karolherbst: but yes.. you can just ignore it and do unaligned load/stores everywhere :)
16:32gfxstrand: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26522
16:32karolherbst: cool thanks, will try it out later
16:34gfxstrand: I've compile tested and I think it ought to do the trick but I've not tried out your CL test case.
17:26jenatali: karolherbst: Seems like the problem is in the SPIR-V, no?
17:26jenatali: %84 = OpBitcast %_ptr_Workgroup_v2uint %arrayidx6
17:27jenatali: OpStore %84 %81 Aligned 1
17:28karolherbst: yeah.. sounds like there is also a bug in the translator/llvm then
17:28karolherbst: we need to honor the alignmend on the variable for placement inside the shared memory region anyway
17:28jenatali: If you look at the LLVM IR I'm pretty sure you'll see the 1-byte alignment come in there too
17:29jenatali: Right, I agree, there's still something that should be fixed here, but I don't think that's the cause of your unaligned loads/stores
17:29karolherbst: yeah..
17:29jenatali:fought a lot with alignment since DXIL has no 1-byte aligned load/store capabilities...
17:29karolherbst: I still have a cast to align_mul=1 sadly :)
17:29karolherbst: yeah..
17:36karolherbst: zink might also want to align variables...
17:36karolherbst: but I'm sure it's fine ...
17:37karolherbst: jenatali: we should add a CLC_DEBUG=dump_llvm option... :D
17:37jenatali: Yeah, sure
17:38karolherbst: let me add one actually..
17:42karolherbst: and now I know why I haven't already
17:57karolherbst: jenatali: you are right.. "store <2 x i32> %6, <2 x i32> addrspace(3)* %9, align 1" pain
17:58jenatali: Yep
18:00karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26524
18:01karolherbst: ehh...
18:02karolherbst: I wanted to move the function
18:03DavidHeidelberg: karolherbst: where is the docs for it? :P
18:04karolherbst: there is
18:04karolherbst: uhm.. for what anyway?
18:05karolherbst: anyway.. done with the MR update
18:06DavidHeidelberg: karolherbst: weird, I see it with changes on top of both commits, but didn't show up when browsing one-by-one
18:06DavidHeidelberg: so good :)
18:06karolherbst: yeah.. gitlab also screwed up for me a few minutes ago
18:10kisak: milestone, 20k merge requests merged into mesa since migrating to gitlab.
18:10karolherbst: I hope it was mine
18:11karolherbst: ohh.. it was already a while ago
18:11hch12907: karolherbst: for a moment I thought it was "dump" in the "to throw away" sense
18:11karolherbst: not yet
18:12hch12907: I was like, are we adding a C parser to mesa now?
18:12hch12907: 😄
18:12karolherbst: like
18:12karolherbst: if that's the fun you are looking for, sure
18:13kisak: oh, I got the milestone wrong ... 20k merge requests merged by Marge Bot just now
18:14karolherbst: ahh
18:45jenatali: karolherbst: Yeah zink's callback is very wrong
18:46karolherbst: figures
18:47jenatali: It's saying that it requires component-aligned accesses, which defeats the whole purpose of the pass
18:47jenatali: Which is to change the component size
18:47karolherbst: right
18:48karolherbst: so if it sees a 32 bit load with an alignment of 1, it should return 8x4 instead I guess
18:48jenatali: Right
18:49jenatali: Unless it can actually do a 32bit load with an alignment of 1, in which case it can return 32x1, it just needs to set the out alignment to 1
18:49karolherbst: but will the pass use atomics for that already (if allowed)?
18:49karolherbst: I think vulkan spir-v requires type size aligned load/stores for everything, no?
18:50jenatali: Probably
18:50karolherbst: yeah.. the main intention of adding that pass/code was to deal with vec8/vec16
18:50karolherbst: but I'll have to figure out the unaligned bits as well
18:50jenatali: The pass will do atomics for stores where the returned alignment is greater than the input alignment, yeah
18:51karolherbst: mhhh
18:51jenatali: I.e. if the input alignment is 1, but the driver can only do 2- or 4-byte-aligned stores, then the pass will promote the store to atomics with 4-byte alignment
18:51karolherbst: okay, so zink should just enable that atomic option and it should be good?
18:52jenatali: It already is setting that flag, which just guards an assert. It needs to return correct data from the callback
18:52karolherbst: well.. I guess for private memory it shouldn't rely on that option
18:53jenatali: karolherbst: You might want to reference what I have here: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/microsoft/compiler/nir_to_dxil.c#L6184
18:53karolherbst: jenatali: but how would that then work.. if you e.g. have a 32 unaligned store, but you don't want to split that up into 8x4 because it would violate API rules
18:54jenatali: It has to get split up into 4 8x1 stores
18:54karolherbst: okay, so the pass won't do that automatically, but it will use atomics for it?
18:55jenatali: The callback has to return that you want 8x1 with 4-byte alignment, and then the pass will turn that into atomics, yeah
18:55karolherbst: 8x1 or 8x4?
18:56jenatali: Well, if you can do 8x4 with 1-byte alignment, then you can just return that. Zink can do that with the 8bit load/store extension
18:56jenatali: If that extension isn't present, then zink needs to return a higher alignment, and in that case you'd want to do 1 byte at a time I think
18:56karolherbst: I tihnk the problem is that a vulkan compiler would be free to split those stores
18:57jenatali: Sure, then the vulkan compiler can do that
18:57karolherbst: but wouldn't that be illegal from a CL perspective?
18:57jenatali: Why?
18:58karolherbst: mhhh, not quite sure actually...
18:58jenatali: The only requirement is that the writes don't modify other data, and that they don't race with other writes to nearby data
18:59karolherbst: ohh right..
18:59jenatali: If the vulkan driver (or zink) splits an unaligned store into 4x byte stores, that's fine as long as each store actually only writes the relevant byte
18:59karolherbst: like if you do a 64x4 write but the pointer is aligned with 2
19:00karolherbst: wait.. bad example
19:00karolherbst: 32x2 but aligned with +0x3 and you want to use a 32x3 write for the whole area, but the edges need to update in a way it doesn't interfer with anything
19:01jenatali: Then if you can do 2-byte-aligned stores, you'd return that, otherwise 4. If you return 4, you'll need to write one 2-byte chunk at a time via atomics. If you return 2, then you can return 16x4 and the pass will chunk it up appropriately
19:01karolherbst: right...
19:01karolherbst: yeah, makes sense
19:01jenatali: 32x2 with +0x3 offset, you'd use a 1-byte store for the first bit, then the pass would call you back again with a 4-byte-aligned store
19:02jenatali: It's really quite a clever pass as long as the callback returns good data. Took me a while / a few tries to get the callback right for our limitations though
20:35DemiMarie: @_oftc_CADIndie:matrix.org: could you push Qualcomm to do what Intel and AMD do and have launch-day upstreamed drivers?
20:39gfxstrand: That would require Qualcomm to actually contribute to the upstream drivers.
20:39gfxstrand: Or any open-source upstream for that matter.
20:41DemiMarie: They absolutely should
20:41DemiMarie: Google should just ban out of tree drivers from Android
20:42DemiMarie: Make people upstream their drivers or no Google Play Services
20:43DemiMarie: Heck, I suspect that Android is a bigger market for Qualcomm than desktop Linux is for Intel.
20:48pac85: Demi: that would ruin the fun for those that do the reverse engineering
20:49DavidHeidelberg: DemiMarie: Google should ban proprietary rootkit (GMS) which runs on 99% of Androids :) but it's not going to happen
20:50karolherbst: jenatali: yeah.. I kinda wished this pass would have proper documentation on the callback :)
20:50jenatali: karolherbst: Be the change you want to see in the world :P
20:51karolherbst: I'm already am, hence me not having time to also document code anymore 🙃
20:51karolherbst: oh hey, now I even have a good excuse for real
20:52robclark: DemiMarie: qcom did actually have patches on list for sm8650 and sc8380xp basically the day they were announced.. I don't believe either are shipping yet
20:53robclark: (also no idea who you were replying to, I guess yet another matrix ghost)
20:53pac85: Including GPU driver patches?
20:54robclark: I don't think they have posted kernel parts yet, but I expect they will before those devices ship
20:54robclark: (which is sometime next year for sc8380 ... not sure about the mobile part, but tbh the laptop part is more interesting)
20:54pac85: join #freedreno
20:54pac85: whoops
20:59gfxstrand: DemiMarie: Oh, if only it were that simple...
21:35mattst88: magical thinking
22:39DemiMarie: mattst88 gfxstrand robclark DavidHeidelberg: I’m still very much an idealist.
22:42robclark: the situation has improved somewhat dramatically from 5 or so years ago, now we are getting SoC support the day it was announced rather than a few years after it shipped.. qcom contributing to the UMD is still a bit of a red line for them, but they do contribute to KMD
22:46mattst88: DemiMarie: that's great, but you're kind of doing it again...