00:00 anholt: x86 has llvm's repo as source of llvm packages
00:01 airlied: okay might make it easier to just compile spirv-llvm-translator then
00:01 jenatali: airlied: How were you thinking you'd get a libclc spv binary for CI, considering you need LLVM master for that?
00:05 airlied: jenatali: that was why I was waiting for clc changes to land first :P
00:05 airlied: didn't want to expend much effort until it was time to face the messy bits
00:05 jenatali: Heh, got it, figure it out later :P
00:05 airlied: yeah I didn't want to enable CI now and then block libclc landing on that
00:06 airlied: in theory I think you can build libclc against system llvm/clang/translator
00:06 airlied: it's just the large git clone
00:06 jenatali: Right
00:06 airlied: to get a few files
00:10 anholt: for the next docker novice to try to do a big development of ci containers, https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700/diffs?commit_id=a4754055338fa0a3fb1f7772520960d3104e8680 may help test some of your build scripts
00:10 anholt:was really happy to finally sort that out
00:19 airlied: jenatali: one thing: the patch that adds support for async opcodes seems to change the overall opcode parsing
00:19 airlied: maybe that could be separated out
00:19 airlied: granted it seems pretty trivial, i was just surprised to see it in there
00:19 jenatali: airlied: Sure, we could separate it out
00:20 jenatali: airlied: karolherbst was trying to run some kernels that used it, so I added it in
00:20 airlied: also the w + 5, count - 5 changes
00:20 jenatali: But yeah I'm happy to split up things however
00:20 airlied: yeah seems like we could move that change earlier in the series
00:20 airlied: so adding the async copy/wait events just adds those
00:21 jenatali: Ah, you mean split it up, so the infrastructure for a core opcode => libclc mapping is one patch, and then specifically handling the copy/wait opcodes is separate?
00:21 jenatali: I want to say I had it like that at one point and someone told me to put them together :P
00:23 jenatali: If you want to put that comment in the MR so I don't forget to do it, that'd be great :)
00:23 airlied: jenatali: yeah added to it now
00:23 jenatali: Thanks
00:24 airlied: jenatali: just seems it's breaking the "do one thing per patch" rule
00:24 airlied: the commit msg doesn't even mention the opcode changes
00:24 jenatali: Fair
00:25 jenatali: Still learning the ropes, having a sane commit history is super different from the source control background I'm used to :P
04:54 jekstrand: anholt: I've got the jump_if instruction passing crucible. Kicking to Jenkins now.
06:56 mareko: Vercas: NGG_DISABLE_PROVOK_REUSE is for VS and TES, it means that primitives won't share a provoking vertex
07:20 pq: Lyude, oh the backlight stuff. That's why I pinged you about HDR. :-)
07:26 pq: Lyude, I'm not sure this is that relevant to you, because I assume software to have no idea what a backlight control actually does wrt. SDR/HDR luminance, so any effects are left for the user to manually compensate for. But, if it was possible to know programmatically in userspace, users might be happier.
07:28 pq: Lyude, by manual compensation I mean a "monitor EDR value" slider in desktop settings.
07:28 krh: argh, gitlab is all 503 again
07:28 pq: or similar
07:28 pq: krh, yup, on-going maintenance as seen on #freedesktop
07:59 danvet_: sravn, I think CREDITS predates git, now we have the git log for that stuff
11:15 pcercuei: about 24-bit pixel modes... (DRM_FORMAT_RGB888)
11:15 pcercuei: Does that mean there is no dummy byte?
11:15 pcercuei: and 4 pixels can be crammed into 3 dwords?
11:25 vsyrjala: yes
11:28 pcercuei: awesome! Thanks
11:28 pq: that's an unexpected reaction :-)
11:28 pcercuei: :)
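A minimal sketch of the packing pcercuei asks about: with no dummy byte, four 24-bit pixels occupy exactly 12 bytes, i.e. three dwords. The byte order (B, G, R in increasing addresses) follows the "[23:0] R:G:B little endian" comment for DRM_FORMAT_RGB888 in drm_fourcc.h; the helper name `pack_rgb888` and the sample pixel values are made up for illustration.

```python
import struct


def pack_rgb888(pixels):
    """Pack (r, g, b) tuples into tightly packed 24-bit pixels.

    Assumes B, G, R byte order in memory (24-bit little-endian R:G:B).
    """
    out = bytearray()
    for r, g, b in pixels:
        out += bytes((b, g, r))  # 3 bytes per pixel, no padding byte
    return bytes(out)


four_pixels = [(0x11, 0x22, 0x33), (0x44, 0x55, 0x66),
               (0x77, 0x88, 0x99), (0xAA, 0xBB, 0xCC)]
packed = pack_rgb888(four_pixels)

# 4 pixels * 3 bytes = 12 bytes, which reads back as exactly 3 dwords.
dwords = struct.unpack("<3I", packed)
print(len(packed), [hex(d) for d in dwords])
```

Note that pixel boundaries no longer line up with dword boundaries: the second pixel straddles the first and second dword.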
12:24 danvet_: pq, if all you care about is memcpy throughput for your fb :-)
12:25 pq: then use RGB232 :-P
12:37 randymago: I have finished all the security and performance related investigations on all hardware, i won't be participating on that channel anymore, i have a large pile of info on how to run code in secure and performant ways, however i am not a blogger myself.
12:37 randymago: so i take off now, so you can clean up the banlist if you want.
13:18 danvet_: mlankhorst_, drm-misc-fixes is stuck on -rc2 :-/
13:38 seanpaul: danvet_: a little room-reading on my part, would opening up connector->atomic_commit() for more than writeback jobs be an automatic nack? i'd like to use it for non-modeset hdcp transitions on qc which doesn't have seamless modeset like i915
13:38 danvet_: seanpaul, hm feels a bit hacky
13:39 danvet_: imo write up a seamless modeset infra
13:39 danvet_: either in msm, or as helpers
13:39 danvet_: essentially what we do is if (!crtc_needs_modeset()) fastset_fixup()
13:40 seanpaul: imo, needing modeset for hdcp is the hack
13:40 danvet_: yeah agreed on that
13:40 danvet_: but you don't
13:40 danvet_: no one is forcing you to do that
13:41 seanpaul: the problem is that i don't have a hook at the right level to enable/disable hdcp
13:41 danvet_: atomic helper isn't a midlayer
13:41 danvet_: you can overwrite/hack up/add hooks
13:41 seanpaul: so i need to go through modeset to get a connector->atomic_enable call
13:41 danvet_: like i915 does
13:41 danvet_: nope
13:42 seanpaul: well, going through modeset on i915 is also bogus
13:43 seanpaul: ideally i'd like to pull out all the hdcp auth logic into a helper
13:43 seanpaul: and it'd be nice if the driver didn't have to subclass atomic to use it
13:44 danvet_: why subclass atomic?
13:44 danvet_: https://paste.debian.net/1164647/ this is what I recommend
13:44 danvet_: ofc you then need to stop setting crtc_state->connector_changed from your atomic_check for the hdcp stuff
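A rough sketch of the control flow danvet_ describes, written in Python rather than kernel C. It does not reproduce the linked paste. Every name here (`CrtcState`, `hdcp_changed`, `commit_tail`, the log strings) is a hypothetical stand-in, not a DRM API; the only idea taken from the chat is "if no modeset is needed, run a fixup instead", with the HDCP change no longer flagged as a connector change.

```python
from dataclasses import dataclass


@dataclass
class CrtcState:
    # Stand-ins for the flags a real atomic CRTC state would carry.
    mode_changed: bool = False
    active_changed: bool = False
    connectors_changed: bool = False
    hdcp_changed: bool = False  # hypothetical driver-private flag


def needs_modeset(state):
    """True if any reason for a full modeset is set."""
    return (state.mode_changed or state.active_changed
            or state.connectors_changed)


def commit_tail(state, log):
    """Driver-rolled commit tail: full modeset, or a fastset fixup."""
    if needs_modeset(state):
        log.append("full modeset")   # disable/enable path handles everything
    elif state.hdcp_changed:
        log.append("fastset fixup")  # toggle HDCP without touching the mode
    log.append("flip planes")
    return log


print(commit_tail(CrtcState(hdcp_changed=True), []))
```

The point of the sketch is that the HDCP transition only reaches the fixup path when atomic_check has not marked the state as needing a modeset.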
13:45 seanpaul: that is what i had in i915 originally, it was nacked to go through modeset
13:46 danvet_: i915 has a massive machinery for fast modesets
13:46 danvet_: it's kinda a different beast
13:46 seanpaul: i still don't understand why this is better than opening up atomic_commit
13:47 danvet_: it's different things
13:47 danvet_: the writeback atomic_commit is kinda like a plane
13:47 danvet_: called a connector for hysterical reasons
13:48 danvet_: but really not much to do with any other connector
13:48 danvet_: so if you insist this must be solved in helpers, then I guess we can do a connector->atomic_fixup
13:48 danvet_: which is called for !modeset stuff
13:48 danvet_: and fairly useful for all kinds of things
13:48 seanpaul: so the commit_tail hack works ok for driver connectors, but what's the guidance for bridge connectors?
13:48 danvet_: so kinda like i915 fastset
13:48 seanpaul: not insisting, just want to get the right solution
13:49 danvet_: we can also do a bridge->atomic_fixup
13:49 danvet_: the thing is, if you put it into helpers it needs to compose with everything else somewhat reasonably
13:50 seanpaul: yeah, understood. so you're thinking atomic_fixup is like atomic_commit_v2
13:50 danvet_: seanpaul, btw for i915 I didn't nack avoiding the modeset
13:50 danvet_: iirc I nacked where you put that code
13:50 danvet_: because it's a general problem, and we have that fastset stuff in i915
13:50 danvet_: seanpaul, nah
13:51 danvet_: it's like the alternative path if commit_modeset_enables didn't do anything
13:51 danvet_: atomic_commit_nonmodeset_fixups
13:51 vsyrjala: the i915 encoder fastset stuff is a total hack atm. it's not even atomic with the rest of the frame :(
13:51 danvet_: yeah that's I guess another problem
13:52 danvet_: otoh hdcp is a mess anyway and you get pixel garbage pretty much as a feature if you're unlucky, so meh :-)
13:52 vsyrjala: yeah, hdcp is meh anyway. infoframes and whatnot would be more important imo
13:52 seanpaul: ok, i'm not totally thrilled with atomic_fixup, so i'll leave that until i have to deal with a bridge
13:54 danvet_: seanpaul, I guess we should rename atomic_commit to atomic_writeback_commit
13:54 danvet_: to avoid confusion
13:54 seanpaul: yeah, atomic_writeback_commit_only--srsly() ;-)
13:57 danvet_: seanpaul, I think the full helper solution probably needs a crtc_state->need_fixup or so
13:57 danvet_: and then a pile of atomic_mode_fixup callbacks
13:57 danvet_: which is a terrible name because we already have a mode_fixup hook which is called in the atomic_check phase
13:57 danvet_: so maybe atomic_fastset_fixup
13:58 danvet_: but just for msm I'd really just hand-roll this and be done
13:58 seanpaul: atomic_call_me_maybe()
13:58 danvet_: +1
13:59 seanpaul: ok, i'll hand roll it for now and see how it feels, thanks for your input!
14:05 danvet_: seanpaul, the real nasty with generic fastset is undoing a crtc_state->mode_changed = true
14:05 danvet_: we'd probably need to refcount that
14:05 danvet_: so that if a bridge can do a fastset, it can roll that back
14:06 danvet_: without rolling back other reasons for full modeset
14:06 danvet_: or we need to split it all up more
14:06 danvet_: iow, full generic fixup after fastset is nasty, which is why I'm not super enthusiastic about rolling it out to helpers ad-hoc
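To illustrate the refcounting idea danvet_ floats: a plain boolean cannot tell "the bridge wanted a modeset" apart from "something else wanted one too", so one component rolling back would clobber the others. A minimal sketch, assuming a counter replaces the flag; the class `ModesetReasons` and its methods are hypothetical and do not exist in the DRM helpers.

```python
class ModesetReasons:
    """Hypothetical refcount standing in for a plain mode_changed boolean."""

    def __init__(self):
        self.count = 0

    def request(self):
        """A component asks for a full modeset."""
        self.count += 1

    def withdraw(self):
        """A component that can fastset takes back only its own request."""
        assert self.count > 0, "withdraw without a matching request"
        self.count -= 1

    @property
    def mode_changed(self):
        return self.count > 0


reasons = ModesetReasons()
reasons.request()    # e.g. the bridge asked for a modeset
reasons.request()    # e.g. the CRTC timings really changed
reasons.withdraw()   # the bridge discovers it can do a fastset
print(reasons.mode_changed)  # the other request still stands
```

A real design would also have to decide who is allowed to withdraw and when, which is part of why danvet_ calls the generic version nasty.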
14:08 seanpaul: yeah, fair points, i might have a use for it in a bridge for psr sometime soon, so i'll let it rattle around until then
14:22 mlankhorst_: danvet_: oh, do you have a good reason for a newer drm-misc-fixes?
14:22 mlankhorst_: I'll update on monday
14:42 mripard: danvet_: how are we supposed to add a driver to the drm-misc defconfig if it's missing? is there a process in place for that?
14:44 mripard: it looks like we're missing drivers/gpu/drm/imx/dcss
18:22 jekstrand:hates LLVM
18:24 karolherbst: jekstrand: what is it this time? :D
18:24 jekstrand: karolherbst: It's casting and memcpying all over
18:24 karolherbst: ahh
18:24 jekstrand: And I'm trying to help NIR chew through it
18:30 jenatali: jekstrand: Unoptimized LLVM?
18:31 jekstrand: jenatali: Somewhat "optimized", unfortunately. :(
18:31 jenatali: Ah, and I'm guessing you're not starting from source then
18:31 jekstrand: I have source and I can shut off some optimizations but it's still pretty bad
18:31 jekstrand: Lots of struct copies
18:32 jenatali: :(
18:32 karolherbst: ehh..
18:32 karolherbst: and I guess llvm casts them to char* copies?
18:32 jekstrand: Always!
18:32 karolherbst: uhh.. pointers, not copies
18:32 karolherbst: right..
18:32 karolherbst: well.. char* is kind of the natural type when copying bytes though :/
18:33 jekstrand: Getting rid of the char* on the memcpy isn't hard
19:04 HdkR: jekstrand: This is why their SROA pass is a beefcake, since it expects nearly everything to be in "memory" :P
19:17 airlied: jekstrand: I think I landed all the memcpy deref stuff I could easily do way back, but there might have been some hacks outstanding
19:17 jekstrand: airlied: Yeah, it's pretty bad....
19:17 airlied: but yeah there were still some messy ones I didn't get too far into
19:21 airlied: jekstrand: is it copying structs out of structs or arrays of structs?
19:21 jekstrand: just copying structs
19:22 jenatali: airlied: I split the async/wait patch, hopefully along the divide that you were looking for :)
19:29 jekstrand: Wow, a few optimizations and my compile times are already improved!
19:30 airlied: jenatali: thanks, much clearer, r-bs for the two
19:30 jenatali: airlied: Thanks :)
19:31 jekstrand: ssa_17005 is not a good number....
19:31 jenatali: jekstrand: That's all? :P
19:31 airlied: karolherbst: you might be best person to ack or rb "clover: handle libclc shader (v3)"
19:32 karolherbst: airlied: the only patch left?
19:32 airlied: karolherbst: pretty much
19:32 jenatali: karolherbst: There's a few more -- unless jekstrand's relatively broad ack was supposed to cover them
19:33 airlied: karolherbst: pretty much, the others are just opcode moving mostly
19:33 airlied: jenatali: I'm just looking over the last few now
19:33 jekstrand: I didn't read any of the opcode moving
19:33 jekstrand: so someone should
19:33 jenatali: airlied: Cool, thanks
19:35 jekstrand: I just love how LLVM always likes to write vec4s by casting them to vec3s and writing that....
19:35 airlied: jenatali: okay all the opcode moving patches have my rb
19:35 jekstrand: *sigh*
19:35 jekstrand: This shader has a lot of vec3s
19:36 jenatali: jekstrand: Is it a vec3 variable that's read/written as a vec4? Or a vec4 that's read/written as a vec3?
19:36 jekstrand: jenatali: The former
19:36 jekstrand: So it's always OOB writing which is just awesome
19:37 jenatali: jekstrand: I have a special-case pass I wrote to clean those up, though really we should genericize it to handle out-of-bounds reads/writes of a variable
19:37 jenatali: jekstrand: https://gitlab.freedesktop.org/kusma/mesa/-/merge_requests/304/diffs?commit_id=3284e04a44bba5ce0dc13a0cf8ad3c446a196dc5 if you want to take inspiration from it
19:37 jekstrand: Hrm... It uses a write-mask of xyz
19:39 jekstrand: I've got about 1KB of scratch in this shader and it's all vec3s :/
19:40 jenatali: jekstrand: Yeah, sounds about right
19:40 jenatali: That's why I wrote that pass, so copy prop could remove it all
19:40 jenatali: My problem was all in vec3 function parameters
19:40 jekstrand: Yeah
19:40 jekstrand: This kernel is storing a lot of 3D vectors
19:52 jenatali: Alright, all the libclc patches are actually reviewed now :D
19:52 jenatali: airlied, jekstrand, karolherbst: Any reason to wait longer before merging?
19:52 jekstrand: Not so far as I know
19:53 airlied: jenatali: no we've waited long enough :-P
19:53 jenatali: :) alright then, here goes nothing
19:53 karolherbst: I just left some comments :p
19:53 jenatali: karolherbst: Oh ok, let me look
19:54 airlied: karolherbst: I think jenatali just did all of them
19:54 jenatali: karolherbst: Yeah, I thought "just" meant after I pushed the fixes for all of your comments
19:54 karolherbst: :D
19:54 karolherbst: ohh, I see
19:55 karolherbst: yeah, I have nothing against merging it. It worked fine locally :)
19:57 jenatali: karolherbst: I'm just double-checking outstanding comments... you had one about moving LIBCLC_INCLUDEDIR in the Meson build files?
19:58 jenatali: karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6035#note_611863
20:01 karolherbst: yeah...
20:01 karolherbst: I don't like those things to be defined all that often
20:02 jenatali: Alright, let me fix that...
20:07 jenatali: Ok, fixed
20:08 jenatali: Here goes for real now
20:12 airlied: jenatali: yay
20:12 jenatali: Once I fix the CI failures from things I apparently broke :)
20:43 jenatali: It's in \o/
20:48 airlied: jenatali: \o/
20:50 jenatali: airlied: *now* you get to figure out how to get it in CI ;)
21:17 jekstrand: jenatali: \o/
21:23 jekstrand: Wow... I'm compiling like crazy now.
21:23 jekstrand: Here's hoping my patches are correct. 😂
21:24 jekstrand: spills/fills cut massively. \o/
21:26 jenatali: Woo
21:27 jekstrand: It no longer takes 4 minutes to compile a single shader. :)
21:29 jenatali: jekstrand: Was it just the vec3/vec4 and memcpy problems or was there more to it than that?
21:29 ccr:starts a wave for jekstrand
21:29 ccr: ~\o/~
21:31 jekstrand: jenatali: Just vec4/vec3 and memcpy. Well, my optimization does more than just vec4/vec3 so it might be helping more than that.
21:31 jenatali: jekstrand: Great :) Looking forward to being able to benefit from your improvements
21:31 jekstrand: jenatali: Good. I'm planning on you reviewing the patches. :P
21:31 jenatali: jekstrand: Sure, sounds good
21:35 jekstrand: jenatali: Went from maximum spill count of about 6k to about 300 :)
21:35 jekstrand: It's still spilling but that's massively better
21:36 jenatali: Yeah that's what I'd expect
21:37 jekstrand: Ugh... Looks like for u8vec3, LLVM doesn't use a write-mask. :-(
21:38 imirkin_: for all the rgb888 aficionados...
21:38 jekstrand: There's a part of me which wants to write a pass which just runs through the whole kernel and replaces all vec3s with vec4s
21:38 jekstrand: Add my own casts and hope they cancel
21:39 imirkin_: so wait, what are they doing?
21:39 jenatali: jekstrand: I think we should add an optimization pass which finds known OOB accesses of variables and drops the writes and replaces reads with undefs
21:39 jekstrand: imirkin_: Declaring a variable in SPIR-V that's a u8vec3 and then always accessing it by first casting to a u8vec4
21:39 imirkin_: oh lol
21:39 jekstrand: jenatali: I'm not sure if we can legally do that. :-/
21:40 jenatali: jekstrand: Why not?
21:40 imirkin_: sounds like all the dx11->glsl translators using int <-> float casts all over
21:40 jekstrand: I guess I don't really know what the rules are
21:40 jekstrand: imirkin_: Yup, only way way worse
21:40 jekstrand: jenatali: I'm not sure what the rules are. If the padding bits "don't exist" then I think LLVM is generating invalid SPIR-V.
21:40 jekstrand: If the padding bits do exist, then don't we have to preserve them?
21:41 jenatali: jekstrand: Isn't it undefined behavior in C to access memory outside of a variable?
21:41 jenatali: Though, I guess the C code isn't doing that...
21:41 jekstrand: Probably? But LLVM is doing just that for these u8vec3s
21:42 imirkin_: at least in glsl, in order to go from vec3 -> vec4, you have to specify the value of the final component...
21:47 jenatali: jekstrand: Yeah, I can't find anything indicating out-of-bounds accesses are invalid... I wonder if the right thing to do is to replace vec3 variables with vec4 and cast to vec3
21:48 jenatali: Since it is spec'd that vec3s take up the same amount of space as a vec4
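A small sketch of the size rule jenatali refers to: in OpenCL C, a 3-component vector has the size and alignment of the 4-component vector with the same element type. The helper `cl_vector_size` is made up for illustration. It only shows that a vec4-sized access to a vec3 variable stays inside storage the variable already occupies; whether the fourth lane's contents must be preserved is the open question in the chat.

```python
def cl_vector_size(components, elem_size):
    """Size (and alignment) in bytes of an OpenCL C vector type.

    3-component vectors are sized and aligned like 4-component ones.
    """
    padded = 4 if components == 3 else components
    return padded * elem_size


uchar3 = cl_vector_size(3, 1)
uchar4 = cl_vector_size(4, 1)
float3 = cl_vector_size(3, 4)
float4 = cl_vector_size(4, 4)
print(uchar3, uchar4, float3, float4)
```

Under that rule, declaring the variable as vec4 and casting to vec3 at the use sites does not change its footprint.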
21:48 jekstrand: jenatali: I'm kind-of thinking about that
21:49 jekstrand: jenatali: Unfortunately, that's going to be an annoying pass to write.
21:49 jenatali: jekstrand: Eh, doesn't sound too bad
21:50 jekstrand: I'll give it a try and we'll see how it goes.
22:36 jekstrand: jenatali: Got a pass. Time to see if it works. :)
22:36 jenatali: Good luck
22:38 jekstrand: jenatali: Does LLVM place dummy members after vec3s?
22:39 jenatali: jekstrand: Uh... I don't think so?
22:39 jenatali: jekstrand: It can't, since vec3 + vec1 doesn't pack in CL like it does in GL
22:40 jekstrand: jenatali: Hah! It doesn't like it when I smash all my system values to vec4 :)
22:41 jenatali: jekstrand: Oh, that'll do it
22:41 jenatali: Those probably should stay as vec3...
22:43 jekstrand: Seems to work. Helps spilling a bit more
22:44 jekstrand: Woo! I'm down to a single function_temp cast!
22:45 jekstrand:does the happy dance
22:45 jenatali: jekstrand: Nice!
22:47 jekstrand: The remaining struct is struct foo { half a[3]; half b[3]; ivec2 c; }
22:47 jekstrand: Sadly, the alignment of c is burning me. :-(
22:48 jekstrand: That stupid little hole is preventing me from lowering the memcpy_deref to copy_deref
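A sketch of where the hole in `struct foo` plausibly comes from, assuming `half` is 2 bytes and `ivec2` is laid out like an OpenCL `int2` (8 bytes, 8-byte aligned). The `layout` helper is hypothetical and applies ordinary C-style natural alignment; the actual layout in jekstrand's kernel is not shown in the chat.

```python
def layout(members):
    """Return [(name, offset, size)] using C-style natural alignment."""
    offset, out = 0, []
    for name, size, align in members:
        offset = (offset + align - 1) // align * align  # round up to alignment
        out.append((name, offset, size))
        offset += size
    return out


# (name, size, alignment); sizes and alignments are the assumptions above.
foo = [("a", 3 * 2, 2),   # half a[3]
       ("b", 3 * 2, 2),   # half b[3]
       ("c", 8, 8)]       # ivec2 c
fields = layout(foo)

# Bytes between the end of b and the start of c that no member covers.
hole = fields[2][1] - (fields[1][1] + fields[1][2])
print(fields, hole)
```

With these assumptions `a` and `b` end at byte 12 while `c` starts at byte 16, so a memcpy of the struct touches 4 bytes that a member-by-member copy would skip.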
22:48 jenatali: :(
22:48 jenatali: I'm very interested to see what your memcpy elision looks like
22:49 jekstrand: I've got one more trick in my bag.....
22:49 jenatali: jekstrand: I'm pretty sure you've got way more than one, if I've learned anything about you these last few months
22:49 jekstrand: lol
22:50 jekstrand:hides his hat full of rabbits under his desk
22:50 karolherbst: jekstrand: don't tell me you add explicit padding members :p
22:50 jekstrand: karolherbst: Nope
23:39 jekstrand: Ugh... Foiled by my own helper functions.
23:43 agrisis: is it possible for a separate app to draw to dri as an overlay?
23:43 agrisis: i.e, if another app is already using drm/kms, I haven't found a way for another app to write to it
23:44 jekstrand: No, only one app can control the display at a time.
23:44 jekstrand: If you want multiple, you need a compositor to sit in the middle.
23:44 agrisis: jekstrand: thanks, that makes sense
23:45 imirkin: and/or delegate stuff with the kms delegation logic
23:45 agrisis: one other thing, it seems in Archlinux using systemd, if you have an app using drm/kms on tty1, you can switch to tty2 and see agetty, but on void linux, even though agetty is running on tty1+2, I can't seem to switch to tty2 or at least I still see the image being drawn
23:45 agrisis: any idea what that could be?
23:46 agrisis: imirkin: do you have a reference for that?
23:46 agrisis: imirkin: or do you mean the other app would need to communicate to the owning app for it to draw?
23:47 imirkin: well ... depends what you want to do
23:47 imirkin: e.g. if you have 2 screens, you can "give" one to the other app
23:47 imirkin: if you want to draw to a shared buffer, then something has to mediate that sharing...
23:47 imirkin: unfortunately i'm blanking on both the term that was used for the kms delegation thing, as well as the person who worked on it
23:48 imirkin: keith packard is the one who worked on it...
23:48 imirkin: and it was called "leasing"
23:48 imirkin: e.g. https://keithp.com/blogs/DRM-lease/
23:48 jekstrand: drm lease
23:49 imirkin: the blog talks about him hacking it up, but afaik it has landed
23:49 jekstrand: jenatali: Marge is working on the main memcpy MR. I'll try to pluck my new patches out into an MR for you once that's landed.
23:49 jekstrand: jenatali: I've now gotten rid of 100% of my local variables from some rather complex kernels. :D
23:49 imirkin: agrisis: more on drm leases: https://www.x.org/wiki/Events/XDC2017/packard_drm_lease.pdf
23:51 keithp: imirkin: it's been in the kernel for quite a while now
23:51 jekstrand: jenatali: That said, it's late. It may not be until Monday.
23:52 imirkin: keithp: i should hope so, given that your talk was in 2017. (wow, time sure flies...)
23:52 keithp: yes it does :-)
23:52 keithp: although some kernel patches take longer than that. the RT series appears nearly merged, after something like 20 years?
23:52 imirkin: hehe
23:55 agrisis: nice thank you
23:58 anholt: krh: there's a driconf-structs branch in my mesa tree now. Getting the { } in these macros to balance is miserable, but the commit looks promising now.
23:58 anholt: (also, the xml generation is obviously pretty stubby)