00:00anholt: x86 has llvm's repo as source of llvm packages
00:01airlied: okay might make it easier to just compile spirv-llvm-translator then
00:01jenatali: airlied: How were you thinking you'd get a libclc spv binary for CI, considering you need LLVM master for that?
00:05airlied: jenatali: that was why I was waiting for clc changes to land first :P
00:05airlied: didn't want to expend much effort until it was time to face the messy bits
00:05jenatali: Heh, got it, figure it out later :P
00:05airlied: yeah I didn't want to enable CI now and then block libclc landing on that
00:06airlied: in theory I think you can build libclc against system llvm/clang/translator
00:06airlied: it's just the large git clone
00:06jenatali: Right
00:06airlied: to get a few files
00:10anholt: for the next docker novice to try to do a big development of ci containers, https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700/diffs?commit_id=a4754055338fa0a3fb1f7772520960d3104e8680 may help test some of your build scripts
00:10anholt:was really happy to finally sort that out
00:19airlied: jenatali: one thing: the patch that adds support for async opcodes seems to change the overall opcode parsing
00:19airlied: maybe that could be separated out
00:19airlied: granted it seems pretty trivial, i was just surprised to see it in there
00:19jenatali: airlied: Sure, we could separate it out
00:20jenatali: airlied: karolherbst was trying to run some kernels that used it, so I added it in
00:20airlied: also the w+ 5, count -5 changes
00:20jenatali: But yeah I'm happy to split up things however
00:20airlied: yeah seems like we could move that change earlier in the series
00:20airlied: so adding the async copy/wait events just adds those
00:21jenatali: Ah, you mean split it up, so the infrastructure for a core opcode => libclc mapping is one patch, and then specifically handling the copy/wait opcodes is separate?
00:21jenatali: I want to say I had it like that at one point and someone told me to put them together :P
00:23jenatali: If you want to put that comment in the MR so I don't forget to do it, that'd be great :)
00:23airlied: jenatali: yeah added to it now
00:23jenatali: Thanks
00:24airlied: jenatali: it just seems to break the "do one thing per patch" rule
00:24airlied: the commit msg doesn't even mention the opcode changes
00:24jenatali: Fair
00:25jenatali: Still learning the ropes, having a sane commit history is super different from the source control background I'm used to :P
04:54jekstrand: anholt: I've got the jump_if instruction passing crucible. Kicking to Jenkins now.
06:56mareko: Vercas: NGG_DISABLE_PROVOK_REUSE is for VS and TES, it means that primitives won't share a provoking vertex
07:20pq: Lyude, oh the backlight stuff. That's why I pinged you about HDR. :-)
07:26pq: Lyude, I'm not sure this is that relevant to you, because I assume software to have no idea what a backlight control actually does wrt. SDR/HDR luminance, so any effects are left for the user to manually compensate for. But, if it was possible to know programmatically in userspace, users might be happier.
07:28pq: Lyude, by manual compensation I mean a "monitor EDR value" slider in desktop settings.
07:28krh: argh, gitlab is all 503 again
07:28pq: or similar
07:28pq: krh, yup, on-going maintenance as seen on #freedesktop
07:59danvet_: sravn, I think CREDITS predates git, now we have the git log for that stuff
11:15pcercuei: about 24-bit pixel modes... (DRM_FORMAT_RGB888)
11:15pcercuei: Does that mean there is no dummy byte?
11:15pcercuei: and 4 pixels can be crammed into 3 dwords?
11:25vsyrjala: yes
11:28pcercuei: awesome! Thanks
11:28pq: that's an unexpected reaction :-)
11:28pcercuei: :)
12:24danvet_: pq, if all you care about is memcpy throughput for your fb :-)
12:25pq: then use RGB232 :-P
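[Editor's note] The packing confirmed above (no dummy byte, so 4 pixels fit in 3 dwords) can be sketched in C. This is only an illustration: the helper names are hypothetical, and the byte order assumes the little-endian layout with blue at the lowest address, which is how `drm_fourcc.h` describes DRM_FORMAT_RGB888.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes per pixel for a packed 24-bit format: there is no padding byte. */
enum { RGB888_BPP = 3 };

/* Minimum number of bytes for one row of `width` packed pixels. */
static size_t rgb888_row_bytes(size_t width)
{
    return width * RGB888_BPP;
}

/* Store one pixel given as 0xRRGGBB (assumed order: B, G, R in memory). */
static void rgb888_store(uint8_t *row, size_t x, uint32_t rgb)
{
    uint8_t *p = row + x * RGB888_BPP;

    p[0] = (uint8_t)(rgb & 0xff);         /* blue  */
    p[1] = (uint8_t)((rgb >> 8) & 0xff);  /* green */
    p[2] = (uint8_t)((rgb >> 16) & 0xff); /* red   */
}

/* Load one pixel back as 0xRRGGBB. */
static uint32_t rgb888_load(const uint8_t *row, size_t x)
{
    const uint8_t *p = row + x * RGB888_BPP;

    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}
```

With these helpers, `rgb888_row_bytes(4)` is 12, i.e. exactly three 32-bit words, and pixel 1 occupies bytes 3..5 of the row, so pixels straddle dword boundaries.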
12:37randymago: I have finished all the security- and performance-related investigations on all hardware, so I won't be participating on that channel anymore. I have a large pile of info on how to run code in secure and performant ways, but I am not a blogger myself.
12:37randymago: so i take off now, so you can clean up the banlist if you want.
13:18danvet_: mlankhorst_, drm-misc-fixes is stuck on -rc2 :-/
13:38seanpaul: danvet_: a little room-reading on my part, would opening up connector->atomic_commit() for more than writeback jobs be an automatic nack? i'd like to use it for non-modeset hdcp transitions on qc which doesn't have seamless modeset like i915
13:38danvet_: seanpaul, hm feels a bit hacky
13:39danvet_: imo write up a seamless modeset infra
13:39danvet_: either in msm, or as helpers
13:39danvet_: essentially what we do is if (!crtc_needs_modeset()) fastset_fixup()
13:40seanpaul: imo, needing modeset for hdcp is the hack
13:40danvet_: yeah agreed on that
13:40danvet_: but you don't
13:40danvet_: no one is forcing you to do that
13:41seanpaul: the problem is that i don't have a hook at the right level to enable/disable hdcp
13:41danvet_: atomic helper isn't a midlayer
13:41danvet_: you can overwrite/hack up/add hooks
13:41seanpaul: so i need to go through modeset to get a connector->atomic_enable call
13:41danvet_: like i915 does
13:41danvet_: nope
13:42seanpaul: well, going through modeset on i915 is also bogus
13:43seanpaul: ideally i'd like to pull out all the hdcp auth logic into a helper
13:43seanpaul: and it'd be nice if the driver didn't have to subclass atomic to use it
13:44danvet_: why subclass atomic?
13:44danvet_: https://paste.debian.net/1164647/ this is what I recommend
13:44danvet_: ofc you then need to stop setting crtc_state->connector_changed from your atomic_check for the hdcp stuff
13:45seanpaul: that is what i had in i915 originally, it was nacked to go through modeset
13:46danvet_: i915 has a massive machinery for fast modesets
13:46danvet_: it's kinda a different beast
13:46seanpaul: i still don't understand why this is better than opening up atomic_commit
13:47danvet_: it's different things
13:47danvet_: the writeback atomic_commit is kinda like a plane
13:47danvet_: called a connector for hysterical reasons
13:48danvet_: but really not much to do with any other connector
13:48danvet_: so if you insist this must be solved in helpers, then I guess we can do a connector->atomic_fixup
13:48danvet_: which is called for !modeset stuff
13:48danvet_: and fairly useful for all kinds of things
13:48seanpaul: so the commit_tail hack works ok for driver connectors, but what's the guidance for bridge connectors?
13:48danvet_: so kinda like i915 fastset
13:48seanpaul: not insisting, just want to get the right solution
13:49danvet_: we can also do a bridge->atomic_fixup
13:49danvet_: the thing is, if you put it into helpers it needs to compose with everything else somewhat reasonably
13:50seanpaul: yeah, understood. so you're thinking atomic_fixup is like atomic_commit_v2
13:50danvet_: seanpaul, btw for i915 I didn't nack avoiding the modeset
13:50danvet_: iirc I nacked where you put that code
13:50danvet_: because it's a general problem, and we have that fastset stuff in i915
13:50danvet_: seanpaul, nah
13:51danvet_: it's like the alternative path if commit_modeset_enables didn't do anything
13:51danvet_: atomic_commit_nonmodeset_fixups
13:51vsyrjala: the i915 encoder fastset stuff is a total hack atm. it's not even atomic with the rest of the frame :(
13:51danvet_: yeah that's I guess another problem
13:52danvet_: otoh hdcp is a mess anyway and you get pixel garbage pretty much as a feature if you're unlucky, so meh :-)
13:52vsyrjala: yeah, hdcp is meh anyway. infoframes and whatnot would be more important imo
13:52seanpaul: ok, i'm not totally thrilled with atomic_fixup, so i'll leave that until i have to deal with a bridge
13:54danvet_: seanpaul, I guess we should rename atomic_commit to atomic_writeback_commit
13:54danvet_: to avoid confusion
13:54seanpaul: yeah, atomic_writeback_commit_only--srsly() ;-)
13:57danvet_: seanpaul, I think the full helper solution probably needs a crtc_state->need_fixup or so
13:57danvet_: and then a pile of atomic_mode_fixup callbacks
13:57danvet_: which is a terrible name because we already have a mode_fixup hook which is called in the atomic_check phase
13:57danvet_: so maybe atomic_fastset_fixup
13:58danvet_: but just for msm I'd really just hand-roll this and done
13:58seanpaul: atomic_call_me_maybe()
13:58danvet_: +1
13:59seanpaul: ok, i'll hand roll it for now and see how it feels, thanks for your input!
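[Editor's note] The hand-rolled approach discussed above ("the alternative path if commit_modeset_enables didn't do anything") might look roughly like the toy model below. It is not kernel code: every type and function here is hypothetical, and only the `need_fixup` flag name is borrowed from the chat.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-CRTC state; not the real drm_crtc_state. */
struct toy_crtc_state {
    bool mode_changed; /* a full modeset was requested */
    bool need_fixup;   /* hypothetical flag: something changed that a fastset can handle */
};

/* Counts which path the commit took, purely for illustration. */
struct toy_log {
    int modeset_enables;
    int fastset_fixups;
};

/* Toy commit tail: run the fixup path only when no full modeset happens. */
static void toy_commit_tail(const struct toy_crtc_state *state, struct toy_log *log)
{
    if (state->mode_changed) {
        /* Full modeset path: the normal enable hooks reprogram everything. */
        log->modeset_enables++;
        return;
    }

    if (state->need_fixup) {
        /* Non-modeset path: e.g. toggle HDCP without touching the mode. */
        log->fastset_fixups++;
    }
}
```

The point of the split is that an HDCP-only change never has to pretend to be a modeset just to reach an enable hook.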
14:05danvet_: seanpaul, the real nasty with generic fastset is undoing a crtc_state->mode_changed = true
14:05danvet_: we'd probably need to refcount that
14:05danvet_: so that if a bridge can do a fastset, it can roll that back
14:06danvet_: without rolling back other reasons for full modeset
14:06danvet_: or we need to split it all up more
14:06danvet_: iow, full generic fixup after fastset is nasty, which is why I'm not super enthusiastic about rolling it out to helpers ad-hoc
14:08seanpaul: yeah, fair points, i might have a use for it in a bridge for psr sometime soon, so i'll let it rattle around until then
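[Editor's note] The refcounting idea floated above (let a bridge roll back its own request for a full modeset without cancelling anyone else's) can be sketched as a counter of outstanding reasons. This is a hypothetical illustration only; nothing like it is confirmed to exist in the helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical replacement for a single mode_changed boolean. */
struct toy_modeset_reasons {
    unsigned int count; /* number of parties still requiring a full modeset */
};

/* A party (CRTC, connector, bridge, ...) asks for a full modeset. */
static void toy_request_modeset(struct toy_modeset_reasons *r)
{
    r->count++;
}

/* The same party later finds it can do a fastset and withdraws its request. */
static void toy_withdraw_modeset(struct toy_modeset_reasons *r)
{
    if (r->count > 0)
        r->count--;
}

/* A full modeset is needed while at least one reason remains. */
static bool toy_needs_modeset(const struct toy_modeset_reasons *r)
{
    return r->count > 0;
}
```

With a plain boolean, the bridge clearing the flag would also discard a genuine mode change requested elsewhere; with a count, only its own contribution goes away.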
14:22mlankhorst_: danvet_: oh, have a good reason for a newer drm-misc-fixes?
14:22mlankhorst_: I'll update on monday
14:42mripard: danvet_: how are we supposed to add a driver to the drm-misc defconfig if it's missing? is there a process in place for that?
14:44mripard: it looks like we're missing drivers/gpu/drm/imx/dcss
18:22jekstrand:hates LLVM
18:24karolherbst: jekstrand: what is it this time? :D
18:24jekstrand: karolherbst: It's casting and memcpying all over
18:24karolherbst: ahh
18:24jekstrand: And I'm trying to help NIR chew through it
18:30jenatali: jekstrand: Unoptimized LLVM?
18:31jekstrand: jenatali: Somewhat "optimized", unfortunately. :(
18:31jenatali: Ah, and I'm guessing you're not starting from source then
18:31jekstrand: I have source and I can shut off some optimizations but it's still pretty bad
18:31jekstrand: Lots of struct copies
18:32jenatali: :(
18:32karolherbst: ehh..
18:32karolherbst: and I guess llvm casts them to char* copies?
18:32jekstrand: Always!
18:32karolherbst: uhh.. pointers, not copies
18:32karolherbst: right..
18:32karolherbst: well.. char* is kind of the natural type when copying bytes though :/
18:33jekstrand: Getting rid of the char* on the memcpy isn't hard
19:04HdkR: jekstrand: This is why their SROA pass is a beefcake, since it expects nearly everything to be in "memory" :P
19:17airlied: jekstrand: I think I landed all the memcpy deref stuff I could easily do way back, but there might have been some hacks outstanding
19:17jekstrand: airlied: Yeah, it's pretty bad....
19:17airlied: but yeah there were still some messy ones I didn't get too far into
19:21airlied: jekstrand: is it copying structs out of structs or arrays of structs?
19:21jekstrand: just copying structs
19:22jenatali: airlied: I split the async/wait patch, hopefully along the divide that you were looking for :)
19:29jekstrand: Wow, a few optimizations and my compile times are already improved!
19:30airlied: jenatali: thanks, much clearer, r-bs for the two
19:30jenatali: airlied: Thanks :)
19:31jekstrand: ssa_17005 is not a good number....
19:31jenatali: jekstrand: That's all? :P
19:31airlied: karolherbst: you might be best person to ack or rb "clover: handle libclc shader (v3)"
19:32karolherbst: airlied: the only patch left?
19:32airlied: karolherbst: pretty much
19:32jenatali: karolherbst: There's a few more -- unless jekstrand's relatively broad ack was supposed to cover them
19:33airlied: karolherbst: pretty much, the others are just opcode moving mostly
19:33airlied: jenatali: I'm just looking over the last few now
19:33jekstrand: I didn't read any of the opcode moving
19:33jekstrand: so someone should
19:33jenatali: airlied: Cool, thanks
19:35jekstrand: I just love how LLVM always likes to write vec4s by casting them to vec3s and writing that....
19:35airlied: jenatali: okay all the opcode moving patches have my rb
19:35jekstrand: *sigh
19:35jekstrand: This shader has a lot of vec3s
19:36jenatali: jekstrand: Is it a vec3 variable that's read/written as a vec4? Or a vec4 that's read/written as a vec3?
19:36jekstrand: jenatali: The former
19:36jekstrand: So it's always OOB writing which is just awesome
19:37jenatali: jekstrand: I have a special-case pass I wrote to clean those up, though really we should genericize it to handle out-of-bounds reads/writes of a variable
19:37jenatali: jekstrand: https://gitlab.freedesktop.org/kusma/mesa/-/merge_requests/304/diffs?commit_id=3284e04a44bba5ce0dc13a0cf8ad3c446a196dc5 if you want to take inspiration from it
19:37jekstrand: Hrm... It uses a write-mask of xyz
19:39jekstrand: I've got about 1KB of scratch in this shader and it's all vec3s :/
19:40jenatali: jekstrand: Yeah, sounds about right
19:40jenatali: That's why I wrote that pass, so copy prop could remove it all
19:40jenatali: My problem was all in vec3 function parameters
19:40jekstrand: Yeah
19:40jekstrand: This kernel is storing a lot of 3D vectors
19:52jenatali: Alright, all the libclc patches are actually reviewed now :D
19:52jenatali: airlied, jekstrand, karolherbst: Any reason to wait longer before merging?
19:52jekstrand: Not so far as I know
19:53airlied: jenatali: no we've waited long enough :-P
19:53jenatali: :) alright then, here goes nothing
19:53karolherbst: I just left some comments :p
19:53jenatali: karolherbst: Oh ok, let me look
19:54airlied: karolherbst: I think jenatali just did all of them
19:54jenatali: karolherbst: Yeah, I thought "just" meant after I pushed the fixes for all of your comments
19:54karolherbst: :D
19:54karolherbst: ohh, I see
19:55karolherbst: yeah, I have nothing against merging it. It worked fine locally :)
19:57jenatali: karolherbst: I'm just double-checking outstanding comments... you had one about moving LIBCLC_INCLUDEDIR in the Meson build files?
19:58jenatali: karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6035#note_611863
20:01karolherbst: yeah...
20:01karolherbst: I don't like those things to be defined all that often
20:02jenatali: Alright, let me fix that...
20:07jenatali: Ok, fixed
20:08jenatali: Here goes for real now
20:12airlied: jenatali: yay
20:12jenatali: Once I fix the CI failures from things I apparently broke :)
20:43jenatali: It's in \o/
20:48airlied: jenatali: \o/
20:50jenatali: airlied: *now* you get to figure out how to get it in CI ;)
21:17jekstrand: jenatali: \o/
21:23jekstrand: Wow... I'm compiling like crazy now.
21:23jekstrand: Here's hoping my patches are correct. 😂
21:24jekstrand: spills/fills cut massively. \o/
21:26jenatali: Woo
21:27jekstrand: It no longer takes 4 minutes to compile a single shader. :)
21:29jenatali: jekstrand: Was it just the vec3/vec4 and memcpy problems or was there more to it than that?
21:29ccr:starts a wave for jekstrand
21:29ccr: ~\o/~
21:31jekstrand: jenatali: Just vec4/vec3 and memcpy. Well, my optimization does more than just vec4/vec3 so it might be helping more than that.
21:31jenatali: jekstrand: Great :) Looking forward to being able to benefit from your improvements
21:31jekstrand: jenatali: Good. I'm planning on you reviewing the patches. :P
21:31jenatali: jekstrand: Sure, sounds good
21:35jekstrand: jenatali: Went from maximum spill count of about 6k to about 300 :)
21:35jekstrand: It's still spilling but that's massively better
21:36jenatali: Yeah that's what I'd expect
21:37jekstrand: Ugh... Looks like for u8vec3, LLVM doesn't use a write-mask. :-(
21:38imirkin_: for all the rgb888 aficionados...
21:38jekstrand: There's a part of me which wants to write a pass which just runs through the whole kernel and replaces all vec3s with vec4s
21:38jekstrand: Add my own casts and hope they cancel
21:39imirkin_: so wait, what are they doing?
21:39jenatali: jekstrand: I think we should add an optimization pass which finds known OOB accesses of variables and drops the writes and replaces reads with undefs
21:39jekstrand: imirkin_: Declaring a variable in SPIR-V that's a u8vec3 and then always accessing it by first casting to a u8vec4
21:39imirkin_: oh lol
21:39jekstrand: jenatali: I'm not sure if we can legally do that. :-/
21:40jenatali: jekstrand: Why not?
21:40imirkin_: sounds like all the dx11->glsl translators using int <-> float casts all over
21:40jekstrand: I guess I don't really know what the rules are
21:40jekstrand: imirkin_: Yup, only way way worse
21:40jekstrand: jenatali: I'm not sure what the rules are. If the padding bits "don't exist" then I think LLVM is generating invalid SPIR-V.
21:40jekstrand: If the padding bits do exist, then don't we have to preserve them?
21:41jenatali: jekstrand: Isn't it undefined behavior in C to access memory outside of a variable?
21:41jenatali: Though, I guess the C code isn't doing that...
21:41jekstrand: Probably? But LLVM is doing just that for these u8vec3s
21:42imirkin_: at least in glsl, in order to go from vec3 -> vec4, you have to specify the value of the final component...
21:47jenatali: jekstrand: Yeah, I can't find anything indicating out-of-bounds accesses are invalid... I wonder if the right thing to do is to replace vec3 variables with vec4 and cast to vec3
21:48jenatali: Since it is spec'd that vec3s take up the same amount of space as a vec4
21:48jekstrand: jenatali: I'm kind-of thinking about that
21:49jekstrand: jenatali: Unfortunately, that's going to be an annoying pass to write.
21:49jenatali: jekstrand: Eh, doesn't sound too bad
21:50jekstrand: I'll give it a try and we'll see how it goes.
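[Editor's note] The storage rule mentioned above (in OpenCL C, a 3-component vector occupies the same space as a 4-component one) can be mimicked in plain C11 with an alignment specifier. The types below are stand-ins, not the real OpenCL types, and the sketch assumes a 4-byte `float`.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for float3: three components, but aligned like a float4. */
typedef struct { alignas(16) float s[3]; } toy_float3;

/* Stand-in for float4. */
typedef struct { alignas(16) float s[4]; } toy_float4;

/* The tail padding makes both types the same size (assuming 4-byte float). */
static_assert(sizeof(toy_float3) == sizeof(toy_float4),
              "vec3 storage is as large as vec4 storage");

/*
 * Write a full vec4 over vec3 storage. The fourth component lands in the
 * padding of the destination, so the copy stays inside the object.
 */
static void toy_store_as_vec4(toy_float3 *dst, const toy_float4 *src)
{
    memcpy(dst, src, sizeof(*src));
}
```

This is why widening vec3 variables to vec4 and casting back is plausible: the extra component only ever touches bytes that the vec3 already owns as padding.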
22:36jekstrand: jenatali: Got a pass. Time to see if it works. :)
22:36jenatali: Good luck
22:38jekstrand: jenatali: Does LLVM place dummy members after vec3s?
22:39jenatali: jekstrand: Uh... I don't think so?
22:39jenatali: jekstrand: It can't, since vec3 + vec1 doesn't pack in CL like it does in GL
22:40jekstrand: jenatali: Hah! It doesn't like it when I smash all my system values to vec4 :)
22:41jenatali: jekstrand: Oh, that'll do it
22:41jenatali: Those probably should stay as vec3...
22:43jekstrand: Seems to work. Helps spilling a bit more
22:44jekstrand: Woo! I'm down to a single function_temp cast!
22:45jekstrand:does the happy dance
22:45jenatali: jekstrand: Nice!
22:47jekstrand: The remaining struct is struct foo { half a[3]; half b[3]; ivec2 c; }
22:47jekstrand: Sadly, the alignment of c is burning me. :-(
22:48jekstrand: That stupid little hole is preventing me from lowering the memcpy_deref to copy_deref
22:48jenatali: :(
22:48jenatali: I'm very interested to see what your memcpy elision looks like
22:49jekstrand: I've got one more trick in my bag.....
22:49jenatali: jekstrand: I'm pretty sure you've got way more than one, if I've learned anything about you these last few months
22:49jekstrand: lol
22:50jekstrand:hides his hat full of rabbits under his desk
22:50karolherbst: jekstrand: don't tell me you add explicit padding members :p
22:50jekstrand: karolherbst: Nope
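[Editor's note] The "stupid little hole" in `struct foo` mentioned above can be reproduced in plain C. The sketch assumes `half` is a 2-byte type and that `ivec2` is 8-byte aligned (as OpenCL's `int2` is); the `toy_` types are hypothetical stand-ins for the real ones.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t toy_half;                             /* 16-bit storage stand-in for half */
typedef struct { alignas(8) int32_t s[2]; } toy_ivec2; /* assumed 8-byte aligned */

struct foo {
    toy_half a[3]; /* bytes 0..5   */
    toy_half b[3]; /* bytes 6..11  */
    toy_ivec2 c;   /* bytes 16..23, after a 4-byte hole at 12..15 */
};

/* Number of padding bytes between the end of b and the start of c. */
static size_t foo_hole_bytes(void)
{
    size_t end_of_b = offsetof(struct foo, b) + sizeof(((struct foo *)0)->b);

    return offsetof(struct foo, c) - end_of_b;
}
```

Under these assumptions the hole is 4 bytes, so a byte-wise copy of the whole struct covers memory that no member owns, which is presumably what gets in the way of turning the memcpy into a member-wise copy.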
23:39jekstrand: Ugh... Foiled by my own helper functions.
23:43agrisis: is it possible for a separate app to draw to dri as an overlay?
23:43agrisis: i.e, if another app is already using drm/kms, I haven't found a way for another app to write to it
23:44jekstrand: No, only one app can control the display at a time.
23:44jekstrand: If you want multiple, you need a compositor to sit in the middle.
23:44agrisis: jekstrand: thanks, that makes sense
23:45imirkin: and/or delegate stuff with the kms delegation logic
23:45agrisis: one other thing, it seems in Archlinux using systemd, if you have an app using drm/kms on tty1, you can switch to tty2 and see agetty, but on void linux, even though agetty is running on tty1+2, I can't seem to switch to tty2 or at least I still see the image being drawn
23:45agrisis: any idea what that could be?
23:46agrisis: imirkin: do you have a reference for that?
23:46agrisis: imirkin: or do you mean the other app would need to communicate to the owning app for it to draw?
23:47imirkin: well ... depends what you want to do
23:47imirkin: e.g. if you have 2 screens, you can "give" one to the other app
23:47imirkin: if you want to draw to a shared buffer, then something has to mediate that sharing...
23:47imirkin: unfortunately i'm blanking on both the term that was used for the kms delegation thing, as well as the person who worked on it
23:48imirkin: keith packard is the one who worked on it...
23:48imirkin: and it was called "leasing"
23:48imirkin: e.g. https://keithp.com/blogs/DRM-lease/
23:48jekstrand: drm lease
23:49imirkin: the blog talks about him hacking it up, but afaik it has landed
23:49jekstrand: jenatali: Marge is working on the main memcpy MR. I'll try to pluck my new patches out into an MR for you once that's landed.
23:49jekstrand: jenatali: I've now gotten rid of 100% of my local variables from some rather complex kernels. :D
23:49imirkin: agrisis: more on drm leases: https://www.x.org/wiki/Events/XDC2017/packard_drm_lease.pdf
23:51keithp: imirkin: it's been in the kernel for quite a while now
23:51jekstrand: jenatali: That said, it's late. It may not be until Monday.
23:52imirkin: keithp: i should hope so, given that your talk was in 2017. (wow, time sure flies...)
23:52keithp: yes it does :-)
23:52keithp: although some kernel patches take longer than that. the RT series appears nearly merged, after something like 20 years?
23:52imirkin: hehe
23:55agrisis: nice thank you
23:58anholt: krh: there's a driconf-structs branch in my mesa tree now. Getting the { } in these macros to balance is miserable, but the commit looks promising now.
23:58anholt: (also, the xml generation is obviously pretty stubby)