IRC Logs of #dri-devel on irc.freenode.net for 2024-03-18

02:03 airlied: oops sorry got tab completed badly
02:03 HdkR: Oh no, Jeremy Rand!
02:16 Lynne: airlied: I'm looking into porting av1's decoding to use reference frame names instead of frame_ids that we generate ourselves
02:16 Lynne: and I don't get it
02:17 Lynne: the spec says that every frame's slotIndex must be unique and must be found in the list of active references (referenceNameSlotIndices)
02:17 Lynne: *every reference
02:17 Lynne: but at the same time, our current code assumes that the slotIndex will never change, while a reference's slot index will
02:18 Lynne: how does CTS even work?
02:23 airlied: why would a reference's slot index change?
02:23 airlied: it has to stay where it's put until it's not used anymore
02:24 airlied: I don't think you can use av1 reference names here
02:24 airlied: assuming you mean LAST/ALTREF etc
02:27 airlied: ref_frame_idx maps from the 7 names to a list of references for that frame (up to 8)
02:28 airlied: but you can have more than 8 slots because a frame might not actively use a reference, but the next one might
02:39 Lynne: airlied: the spec does say that referenceNameSlotIndices must be LAST/ALTREF/GOLDEN, etc
02:40 Lynne: since there's just 8 types (counting 0, the current decoded frame's name), I don't think that should work at all with more than 8 slots
02:40 Lynne: the slots do change frame to frame too, the 0th frame (current) becomes LAST on the next frame
02:42 airlied: Lynne: yeah so that maps the 0..7 through ref_frame_idx into slotIndexes
02:46 airlied: so you should end up with a slot index for each reference frame name
02:52 Lynne: does the order of references matter in the way they're given to the decoder?
02:53 Lynne: currently I get crashes on all vendors
02:54 airlied: Lynne: in the map?
02:54 airlied: yes
02:54 Lynne: no, in the pointers themselves
02:54 Lynne: for reference frames that don't exist, I still provide a reference that increments the number of references (so I always give 7 refs in total to the decoder, but any disabled ones, I use slotIndex = -1 with a null resource as the spec requires)
02:55 airlied: you shouldn't do that
02:55 airlied: the -1 should go in the map
02:56 airlied: you just provide reference slots for the actual things in the DPB
02:58 airlied: so if you have say just an INTRA_FRAME and LAST_FRAME and no LAST2_FRAME LAST3_FRAME etc, then the map would be [slot_index_a, slot_index_b, -1, -1, -1, -1, -1]
02:58 airlied: then provide 2 reference slots if a/b are different
02:58 airlied: with the slots into the DPB that are per frame
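A minimal sketch of the mapping airlied describes, assuming the decoder tracks which DPB slot holds each AV1 reference buffer. The function and argument names (`build_reference_info`, `ref_frame_idx`, `dpb_slot_of`) are hypothetical and are not FFmpeg, Mesa, or Vulkan API; the sketch only illustrates that -1 goes in the per-name map while the reference slot list holds just the distinct frames actually present.

```python
# Hypothetical helper; not taken from FFmpeg, Mesa, or the Vulkan headers.
NUM_REF_NAMES = 7  # LAST, LAST2, LAST3, GOLDEN, BWDREF, ALTREF2, ALTREF


def build_reference_info(ref_frame_idx, dpb_slot_of):
    """Return (name_slot_indices, reference_slots).

    ref_frame_idx: 7 AV1 reference buffer indices, one per reference name.
    dpb_slot_of:   dict from buffer index to the DPB slot holding that frame;
                   buffers without a decoded frame are simply absent.
    """
    assert len(ref_frame_idx) == NUM_REF_NAMES
    name_slot_indices = []  # one entry per reference name, -1 if unused
    reference_slots = []    # deduplicated list of slots actually referenced
    for buf in ref_frame_idx:
        slot = dpb_slot_of.get(buf, -1)
        name_slot_indices.append(slot)
        if slot >= 0 and slot not in reference_slots:
            reference_slots.append(slot)
    return name_slot_indices, reference_slots
```

With only two distinct frames available, the map carries two real slots and five -1 entries, and only two reference slots are handed to the decoder. A map whose seven entries all name the same slot yields a single reference slot.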
03:04 Lynne: if I don't, radv asserts
03:05 Lynne: btw, does a map of all zeroes make sense? it should, it just means the frame gets splat'd across all refs, right?
03:08 airlied: Lynne: yes all 7 refs would point to slot 0, happens a lot in 2nd/3rd frames
03:17 Lynne: so does the assert make sense or?
03:18 Lynne: as far as I read it, it requires that all references are always present
03:18 Lynne: and my reading of the spec does also follow that
03:20 zzoon[m]: that's my understanding too.
03:20 airlied: present where?
03:20 airlied: referenceNameSlotIndices is an array of seven (VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR, which is equal to the Video Std definition STD_VIDEO_AV1_REFS_PER_FRAME) signed integer values specifying the index of the DPB slot or a negative integer value for each AV1 reference name used for inter coding.
03:21 airlied: though maybe that means they all have to point at 0
03:21 airlied: though I think they come from the av1 file, we don't control that
03:22 airlied: I think it turns out all 7 are always there after the first frame anyways
03:23 Lynne: yes, but I'm talking about the list of references, not the map
03:24 airlied: so you think we should provide all active references whether referenced or not for each frame?
03:24 airlied: tchar: would have to clarify, but I think he said previously here that was rejected
03:26 airlied: "tchar: airlied: isn't that currently the case? the complete dpb state being the current frame and all its unique dependent frames. The idea of putting the whole "VBI" in the API was rejected "
03:26 airlied: Lynne: so the current working code doesn't hit the assert? so what do you do different (or are you hitting it now?)
03:37 dwfreed: ugh
03:38 dwfreed: banning him isn't helpful; if you see me say something after him, I've "handled" it
03:44 Lynne: airlied: I'm not hitting it
03:45 Lynne: with the current working code, but if I skip -1 refs, I do
03:48 airlied: dwfreed: can you actually block him effectively, his persistence is admirable
03:49 airlied: dwfreed: add a ? in there :-)
03:49 airlied: Lynne: what does skipping -1 refs look like in code?
03:50 airlied: like I'm filling in -1 refs now?
03:50 dwfreed: airlied: I've done a few things that should help
03:52 Lynne: airlied: if (ref is missing) { continue } else { fill_pict, nb_refs++ }
03:55 dwfreed: ugh
03:58 HdkR: :)
04:00 airlied: Lynne: isn't that what it does now? if the ref frame is none, it continues, otherwise fills in the info
04:00 airlied: after deduplicating
04:04 Lynne: actually yeah...
04:04 Lynne: but what *does* that assert do?
04:04 airlied: the assert is because the amd fw is messed up
04:04 airlied: so we have to lie to it
04:05 airlied: it can either mean the API usage is wrong or the internal attempt to fix the fw is wrong
04:11 Lynne: lisa will surely save us
04:55 Lynne: airlied: https://github.com/cyanreg/FFmpeg/tree/av1dec_spec
04:55 Lynne: to my understanding, this is what the spec expects
04:55 Lynne: I don't trigger the assert, but I do crash on both implementations
04:56 Lynne: with or without a continue in if (s->ref[i].f->pict_type == AV_PICTURE_TYPE_NONE)
04:56 Lynne: could you look at it and see if it makes sense?
05:07 airlied: Lynne: no that doesn't make sense
05:07 airlied: the current code makes sense afaik
05:08 airlied: you have to manage the DPB slots not the reference indexes
05:08 airlied: so if you decode a reference frame, you have to use that slot index in all future decodes referencing that slot
05:08 airlied: not the AV1 references
05:09 airlied: so you can't just use slot 0 for the current frame decode
05:10 Lynne: but the vulkan spec says referenceNameSlotIndices must contain LAST/GOLD, etc, and also says that each frame's slotIndex must be contained in referenceNameSlotIndices
05:10 airlied: no referenceNameSlotIndices is a mapping *to* LAST/GOLD etc
05:10 Lynne: did we mess up bigtime?
05:10 airlied: from slot
05:10 Lynne: err, really?
05:11 airlied: so referenceNameSlotIndices[LAST] = slotIdx
05:11 airlied: (or -1) if not used
05:12 Lynne: so by index to a reference frame they really mean an index in the last... N decoded frames I guess, not a named reference?
05:16 Lynne: so *that's* why the decoder checks 32 indices for the id_alloc_mask
05:16 Lynne: still, 32 is rather arbitrary
05:16 Lynne: why not 64 or 16?
05:16 airlied: 32 is probably a bit large
05:16 airlied: that was just an easy uint32_t choice
05:17 airlied: it's just the DPB index
05:17 airlied: I think we advertise max 16 dpb slots there
05:17 airlied: you have to allocate those slots to images for as long as they live
05:17 airlied:isn't sure I've seen it go over 9 or 10
05:18 airlied: if the ffmpeg code was like the h264 code we'd just use the index of the DPB reference
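A minimal sketch of the slot bookkeeping discussed above, assuming a bitmask of in-use DPB slots similar in spirit to the `id_alloc_mask` mentioned by Lynne. The class name, method names, and the default of 16 slots are illustrative assumptions, not the actual FFmpeg or radv code.

```python
class DpbSlotAllocator:
    """Track which DPB slots are in use with a bitmask (hypothetical helper)."""

    def __init__(self, max_slots=16):
        self.max_slots = max_slots
        self.mask = 0  # bit i set means slot i is currently owned by a frame

    def acquire(self):
        """Return the lowest free slot and mark it as used."""
        for slot in range(self.max_slots):
            if not self.mask & (1 << slot):
                self.mask |= 1 << slot
                return slot
        raise RuntimeError("no free DPB slot")

    def release(self, slot):
        """Free a slot once no future frame can reference its image."""
        self.mask &= ~(1 << slot)
```

A frame keeps the slot it acquired for as long as it lives, so every later decode that references the frame uses that same slot index regardless of which AV1 reference name currently points at it.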
05:27 Lynne: would you mind replying to jkqxz on the ffmpeg ml? maybe we could change the decoder while it still hasn't grown large
05:27 Lynne: https://ffmpeg.org//pipermail/ffmpeg-devel/2024-March/323596.html
09:37 Company: MrCooper: while you're looking at https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/7042 - do you have any idea why the Vulkan numbers are higher than the GL numbers? They're essentially using the same flags now
09:38 MrCooper: not offhand
09:38 Company: the only flag I don't have in GL is DEVICE_LOCAL, but I don't even understand what happens if I go for DEVICE_LOCAL | HOST_CACHED
09:38 Company: also, I tested it a bit and it seemed to make Vulkan slower
09:38 MrCooper: that combination isn't available with AMD GPUs, is it?
09:39 Company: good question - I don't have my AMD booted so can't check
09:39 MrCooper: it's more of a rhetorical question :)
09:40 MrCooper: also, if there are CPU reads, DEVICE_LOCAL is bad
09:41 Company: there usually aren't - but the benchmark does indeed use them
09:42 Company: I guess I should check with DEVICE_LOCAL and a different benchmark
09:42 Company: still, the GL usage is the same, and GL is faster
09:43 Company: and apparently DEVICE_LOCAL | HOST_CACHED is indeed not a thing
09:43 Company: that's good for my sanity
09:44 Company: i guess for vertex buffers, we should ultimately be using DEVICE_LOCAL and write-only
09:45 Company: or is that bad when we write the data in chunks?
09:46 MrCooper: should be fine as long as the writes fill the write-combine buffers
09:47 Company: I don't know where this stuff should go, because it's written once in 20-50 byte chunks when creating the instructions and then it's read in the same chunks I guess when the GPU processes the instructions
09:48 MrCooper: though note that the gain from DEVICE_LOCAL vs. uncacheable system memory is minor, doing something sub-optimal with DEVICE_LOCAL will likely hurt more than that
09:49 MrCooper: well, that's for streaming data, anyway; if the GPU uses the data multiple times, the balance shifts in favour of DEVICE_LOCAL
09:49 Company: right
09:51 Company: at some point I need to learn how memory actually works instead of just having the mental model I made up via experience
09:52 Company: and all this work just so that people can smoothly scroll a fullscreen gnome-terminal with a tiny font size at 144Hz...
09:54 MrCooper: sounds pretty important to me :)
09:58 emersion: related: https://basnieuwenhuizen.nl/making-reading-from-vram-less-catastrophic/
10:10 MrCooper: Company: someone who's more of a Vulkan expert than me might notice things GTK could do better
10:25 Company: MrCooper: (and everyone else) that reminds me: We plan to switch GTK development to Vulkan by default after the Gnome/Fedora releases are out and people settle back into development
10:25 Company: which should lead to people finding lots of Vulkan bugs
10:26 MrCooper: cool
10:28 Company: I hope that works fine but I know Vulkan is less vetted than GL
10:29 Company: like Intel's YUV
12:08 mripard: pq, emersion: for display interfaces that can support multiple formats (like HDMI with RGB and its YUV variants), do you know if it's preferable to lower bpc and try to use RGB always, or if it's best to try to match the bpc as much as possible at the expense of selecting another format?
12:09 emersion: i have no idea
12:09 emersion: maybe vsyrjala knows
12:09 emersion: ideally user-space would be able to select…
12:09 pq: match bpc to what? you mean use highest possible?
12:09 swick[m]: we don't have any other bpc
12:09 swick[m]: (from user space)
12:10 emersion: i understood the question as "is it better to lower the bpc or switch to YUV?"
12:11 swick[m]: the answer is that user space should be in control, but isn't, so do whatever you want
12:11 swick[m]: someone will not be happy
12:11 pq: so it's a trade-off between per-sample color precision and per-pixel sampling (spatial precision). I suspect which one is preferable depends on the use case.
12:11 mripard: emersion: yeah, that's what I meant, and it started from a discussion with vsyrjala indeed :)
12:12 pq: I'm assuming YUV also implies chroma sub-sampling.
12:12 mripard: yep, we're discussing YUV422
12:12 pq: I really have no experience to give any generic rule of thumb.
12:13 emersion: right, so RGB would be better for text content, and YUV for video content?
12:13 emersion: i suppose it's not possible to switch between RGB and YUV without a modeset?
12:13 swick[m]: no
12:13 pq: If your content is video or photos, I think high bpc YUV would be better than lower bpc RGB. If your content is GUI graphics and text, then the opposite.
12:13 zamundaaa[m]: mripard: I would expect most desktop PC users to rather have lower bpc instead of chroma subsampling
12:14 swick[m]: except if you have enough resolution, then chroma subsampling doesn't matter
12:14 swick[m]: GUIs often have gradients where the bitdepth can be relevant
12:15 emersion: you mean HiDPI?
12:15 swick[m]: but yes, I also tend to prefer RGB, but that's my personal preference
12:15 mripard: zamundaaa[m]: so, generally speaking, we could consider YUV as a fallback-only solution, but try to use RGB at all (bpc) cost prior, right?
12:15 emersion: maybe the screen is high res (4k), but maybe the screen is very large
12:15 mripard: (until we have a proper uAPI to expose the formats that is)
12:15 pq: would you agree that whatever is picked, should not be made conditional based on e.g. CRTC size (resolution)?
12:15 swick[m]: yes
12:15 zamundaaa[m]: mripard: like swick said, only down to some bpc
12:16 pq: niiice
12:16 swick[m]: really, this is all a bit pointless because it should be in control of user space
12:16 pq: right
12:16 emersion: if RGB/YUV could be switched without a modeset, would be interesting to decide based on "content type" prop
12:16 swick[m]: please don't
12:17 emersion: i mean, does it matter?
12:17 mripard: emersion: I'm pretty sure you'd need a modeset for most if not all drivers
12:17 pq: so what's the choice of least surprise? I would claim "RGB even if it means lower bpc"
12:17 swick[m]: same
12:17 emersion: if a modeset is needed, i think generic compositors will almost always want to unconditionally pick RGB
12:17 pq: and once userspace actually can choose between RGB and YUV, that will fix those use cases that are unhappy with this default.
12:17 mripard: ack, so it looks like we have a consensus here :)
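The policy agreed on above ("RGB even if it means lower bpc", with YUV only as a fallback) can be sketched roughly as follows. Everything here is an assumption made for illustration: the function name, the candidate format and bpc lists, and the `fits` callback standing in for whatever bandwidth and sink-capability checks a real driver would perform. It is not the logic of any existing kernel driver.

```python
def pick_output_format(max_bpc, fits,
                       formats=("RGB", "YUV422", "YUV420"),
                       bpcs=(16, 12, 10, 8)):
    """Pick (format, bpc), preferring RGB at any bpc over subsampled YUV.

    fits(fmt, bpc) is a hypothetical predicate: True if the link and the
    sink can carry that combination.
    """
    candidates = [b for b in bpcs if b <= max_bpc]
    for fmt in formats:          # RGB is tried first, at every bpc
        for bpc in candidates:   # highest bpc first within a format
            if fits(fmt, bpc):
                return fmt, bpc
    return None  # nothing works; the caller would reject the mode
```

Because the outer loop is over formats, RGB at 8 bpc wins over YUV422 at 12 bpc; swapping the two loops would give the "match bpc first" policy instead.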
12:18 zamundaaa[m]: I would maybe change that decision when it comes to HDR though
12:19 swick[m]: please for the love of god, no
12:19 zamundaaa[m]: Going below 8bpc with PQ would not be good
12:19 swick[m]: let's not make all the legacy stuff even more complicated
12:19 mripard: yeah, I agree, we should stick to something simple that can be easily documented
12:19 swick[m]: if user space needs a specific bit depth it should be able to tell the kernel
12:20 zamundaaa[m]: swick: that "legacy" stuff is what people are using right now. It must at the very least not be made worse vs the status quo.
12:20 zamundaaa[m]: Idk what that status quo is though
12:20 emersion: driver-specific
12:21 mripard: yep, that's what I'm trying to improve
12:21 emersion: AMD will pick YUV in some cases, pretty sure
12:21 emersion: (cc hwentlan_ ^)
12:21 mripard: i915 uses YUV420 when required, RGB otherwise
12:21 swick[m]: the legacy stuff is also completely broken
12:21 zamundaaa[m]: I know that amdgpu picks subsampling over reducing the bpc with my TV, and changing that would almost certainly cause banding on it with HDR
12:22 swick[m]: if it works, it's by accident, not because the documented behavior can give you working HDR
12:22 mripard: dw-hdmi uses YUV420 when required, and then will favor YUV422 and YUV444 before RGB for each bpc, but I'm pretty sure that the logic is broken and they will always end up using RGB 8bpc
12:22 swick[m]: zamundaaa: too bad, fix up the uapi
12:22 swick[m]: this mindset of "the kernel must magically do all the things I want" is really not helpful
12:22 emersion: swick[m]: sounds like you volunteer? :D
12:22 swick[m]: I am trying to do this
12:23 zamundaaa[m]: You shouldn't break the old behavior before uAPI is fixed... The order of things is important there
12:23 emersion: please also avoid asking people to "just" "fix up the uapi"
12:23 emersion: it's a huge chunk of work
12:23 swick[m]: that's part of what the color pipeline API allows us to do
12:23 swick[m]: I'm well aware
12:24 swick[m]: I'm just saying people shouldn't make the current situation worse to fix
12:24 pq: mripard, what are you doing that made you ask this question in the first place?
12:24 emersion: wording such as "we should fix the uapi" is better than "you, fix the uapi"
12:24 pq: is this still about sharing the HDMI setup code between all drivers?
12:24 mripard: pq: yes
12:24 pq: hmm
12:25 pq: and drivers naturally disagree
12:25 mripard: like emersion was saying, it's all a big mess right now and I'm trying to put that logic in a common place to avoid drivers screwing up and coming with yet another variant
12:25 mripard: there's not much of a disagreement, just that it's not clear what the best behaviour is
12:26 mripard: but it looks like there's a consensus, which is also aligned with i915 so it looks like it's also the most conservative option
12:26 pq: but then, *our opinions* are kinda irrelevant, too, because kernel regressions are the thing you need to avoid as much as possible. So I think you need to choose a consensus algorithm from the existing driver behavior, and *if* there are ties, then use our opinions to solve them. And hope no-one screams "regression".
12:32 mripard: I think it's very much relevant, but also because it kind of depends on what regression means here. Given that it's all a big mess right now, we won't be able to come up with something that doesn't change the current format selection for anyone. So we should still strive to get something decent.
12:33 pq: right
12:35 mripard: so if regression means "my screen is black now" then yeah, it's bad. if it means "I used to have YUV but now I get RGB with a lower bpc", I'm not sure we should consider it one
12:35 mripard: and since it's opt-in, we can still revert the commit for that one particular driver if someone complains
12:35 vsyrjala: in i915 we go as far as offloading the rgb->ycbcr 4:2:0 conversion to an external dp dongle if possible. outputting rgb avoids having to play annoying tricks with gamma/csc/scalers/etc. on certain classes of hardware
12:37 pq: mripard, yeah, I never know where the line is for what's a kernel regression and what's not. Happy to leave that to sub-sys maintainers.
12:37 pq: I just presume the worst from kernel developer perspective.
12:51 vsyrjala: someone broke DRM_KMS_HELPER dependencies?
12:51 vsyrjala: some bridge thing wants it as =y but i want it as =m
12:52 vsyrjala: commit e3f18b0dd1db ("drm/bridge: Select DRM_KMS_HELPER for DRM_PANEL_BRIDGE")
12:53 vsyrjala: ah, already being discussed on ml
13:10 grillo_0: melissawen, mairacanal: About the bg color TODO, so maybe would be good to remove it?
13:11 emersion: grillo_0, melissawen, mairacanal: i wanted to type uAPI but then realized it was more tricky than it seems
13:12 emersion: namely, if i want to draw a solid color, i would need to try first the solid color prop, and then fallback to buffer if the driver rejects that
13:12 emersion: that doesn't really work like the rest of the props (if i can't use a plane, i will composite it with GL/Vulkan)
13:18 zamundaaa[m]: What would the use case for the crtc background color be btw? I can't think of anything that isn't a really specific edge case
13:21 emersion: some people configure swaybg with a solid color, for instance
13:22 emersion: a terminal emulator or text editor or web browser could take advantage of it, maybe
13:29 Company: MrCooper: fun fact: zink is faster than my Vulkan renderer, so I must be doing *something* wrong :)
13:30 ccr: ~ zink magic ~
13:31 Company: I like zink, because whenever it's faster than either my GL or my Vulkan code, I know something is going wrong somewhere
13:31 Company: I much prefer it to be faster than the GL code, because that's usually a driver bug and not my fault
13:33 pq: zamundaaa[m], the preferred background for watching scale-preserved fullscreen photos is probably not black but dim. Perhaps for movies as well.
13:38 zamundaaa[m]: thanks, those make sense
13:58 MrCooper: Company: not sure what "zink is faster than the GL code" means, faster than a non-zink GL driver?
14:00 Company: I've had zink be faster than the regular driver on my benchmarks once or twice
14:01 Company: usually that means that Mesa's vulkan driver is better than the GL driver at something
14:02 pq: hwentlan_, swick[m], pondering if a 3x 1D LUT + 3D LUT could be enough for everything: https://sourceforge.net/p/lcms/mailman/lcms-user/thread/20240318155851.3750e566%40eldfell/#msg58750235
14:02 Company: for now I'm just confused, I even made sure they use the same memory type - maybe it's something else that's breaking here
14:03 MrCooper: Company: then it's still using your GL code though, so not "faster than it"
14:03 MrCooper: just faster than the other GL driver
14:04 Company: yeah
14:05 Company: though technically it could hit different codepaths in my GL code, too
14:05 mripard: sima: assuming we would access drm_connector.state outside of the modesetting path (debugfs), which locks does one need to take? drm_mode_config.connection_mutex only or is there something else?
14:07 sima: mripard, if you call the drm_get*_state functions they'll take the right locks
14:07 sima: but yeah for just reading that one is the right one, and the various GET* ioctl we also open code that so should be fine for debugfs
14:14 Company: pq: do you want 1 1D LUT or 2 1D LUTs (one at the start, one at the end)?
14:14 Company: because that paper kinda feels like the 1D LUT is just to (de)gamma the pixels
14:15 pq: Company, it's three 1D LUT and then one 3D LUT.
14:15 pinchartl: narmstrong: you've sent tags for "[PATCH] drm/panel: ilitek-ili9881c: Fix warning with GPIO controllers that sleep" and "[PATCH 2/2] drm/panel: ilitek-ili9881c: Add Startek KD050HDFIA020-C020A support". do you plan to merge them at some point ?
14:15 Company: pq: 3 LUTs because one per channel, or?
14:16 pq: the point of the 1D LUTs is to reshape the space into another one that is better suited for the uniform sampling of the 3D LUT, so that the 3D LUT can achieve the required precision with a feasible size.
14:16 pq: yes, per channel
14:16 narmstrong: pinchartl: yes
14:17 pinchartl: thanks. no urgency, I know you're busy with a revert :-)
14:17 pq: Company, note that "linearization" is not going to optical space in this case. Instead, "linear" is defined as "good for the uniform sampling in the 3D LUT".
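A minimal sketch of the "three 1D LUTs, then one 3D LUT" arrangement pq describes, assuming uniformly sampled 1D LUTs with linear interpolation on inputs in [0, 1]. The names `apply_1d` and `shaped_lookup` are hypothetical, and the 3D LUT stage is passed in as a plain callable so that only the shaping step is shown.

```python
def apply_1d(lut, x):
    """Linearly interpolate a uniformly sampled 1D LUT (len >= 2) at x in [0, 1]."""
    pos = min(max(x, 0.0), 1.0) * (len(lut) - 1)
    i = min(int(pos), len(lut) - 2)  # index of the lower tap
    frac = pos - i
    return lut[i] * (1 - frac) + lut[i + 1] * frac


def shaped_lookup(shapers, lookup_3d, rgb):
    """Apply one 1D shaper LUT per channel, then hand the result to the 3D LUT.

    shapers:   three 1D LUTs, one each for R, G and B.
    lookup_3d: callable (r, g, b) -> output triple, standing in for the 3D LUT.
    """
    shaped = [apply_1d(lut, c) for lut, c in zip(shapers, rgb)]
    return lookup_3d(*shaped)
```

The shaper curves do not need to be optical linearization; they only need to redistribute input values so that the uniformly spaced taps of the 3D LUT land where precision is needed.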
14:18 Company: pq: yeah, I got that part - I was just wondering how much that is about linearization only, in which case it's not the 1D LUT that matters but the (de)gammaing, and as long as the 3D LUT lives in a linear space, it'll work out
14:18 Company: pq: because the paper doesn't talk about that, it just compares with a 3D LUT from linear to sRGB
14:19 pq: the point is not "linear to sRGB" but from one space to a very differently curved another space
14:19 Company: yeah, but the paper doesn't do that - the paper just does linear to sRGB
14:19 pq: that's irrelevant
14:19 mripard: sima: great, thanks :)
14:20 pq: an interesting case to evaluate would be sRGB <-> BT.2020/PQ RGB transformation
14:22 pq: particularly interesting would be to see how much tone and gamut mapping will cause problems there
14:22 Company: dunno how irrelevant that is - I'd just have liked a comparison with 3D LUT into linear sRGB + gamma
14:22 Company: that a 17 point 3D LUT can't deal with gamma isn't too surprising to me
14:23 pq: by linear sRGB + gamma you mean a matrix and gamma?
14:23 Company: yeah
14:23 Company: 3D LUT and gamma
14:24 Company: the paper goes from camera raw, which is linear
14:24 Company: otherwise I'd have expected it wants a degamma step before the 3D LUT, too
14:25 pq: is it?
14:25 Company: it says so
14:25 Company: "the apparent gamma of this space is about 1.0"
14:25 pq: it says "apparent gamma is about 1.0", it's "apparent gamma" and approximate
14:26 pq: you're right, that it may be a deliberately "easy" case, though
14:26 Company: I read that as "good enough"
14:26 mairacanal: grillo_0, i don't think we should remove it, as it is something that we want, but we are still discussing the implementation and use-case
14:26 pq: AFAIU, a matrix can be represented as a 3D LUT of two taps per dimension with zero error, as long as you stay inside the "cube".
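A small sketch of pq's claim, assuming trilinear interpolation and inputs inside the unit cube: a 2x2x2 LUT that samples a 3x3 matrix at the cube corners reproduces the matrix, up to floating-point rounding, because trilinear interpolation is exact for functions that are linear in each input. The helper names are hypothetical.

```python
def lut_from_matrix(m):
    """Sample out = m * (r, g, b) at the 8 corners of the unit cube."""
    lut = {}
    for r in (0, 1):
        for g in (0, 1):
            for b in (0, 1):
                lut[(r, g, b)] = tuple(
                    row[0] * r + row[1] * g + row[2] * b for row in m
                )
    return lut


def trilinear(lut, r, g, b):
    """Trilinearly interpolate a 2x2x2 LUT at (r, g, b) in [0, 1]^3."""
    out = [0.0, 0.0, 0.0]
    for cr in (0, 1):
        for cg in (0, 1):
            for cb in (0, 1):
                # Weight of this corner: product of per-axis distances.
                w = ((r if cr else 1 - r)
                     * (g if cg else 1 - g)
                     * (b if cb else 1 - b))
                for i in range(3):
                    out[i] += w * lut[(cr, cg, cb)][i]
    return out
```

Outside the cube the LUT would clamp or extrapolate depending on the implementation, which is why the equivalence only holds while inputs stay inside it.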
14:26 Company: I also have no idea what can go wrong once you start going to/from oklab or such
14:27 pq: Company, you wouldn't go to oklab. It's intended for RGB to RGB.
14:28 Company: but for the RGB spaces, I would have expected degamma => 3D LUT in linear space => gamma to be good enough
14:28 pq: it's camera RAW though
14:29 pq: camera RAW (RGB) does not have a neat analytical representation I think, it's a sample cloud if you actually profile it.
14:31 tleydxdy: well, raw don't even have "RGB" pixel I think
14:31 pq: depends on what you mean by "RGB"
14:32 tleydxdy: there's probably a lot more G pixels than R and B in a raw
14:32 pq: libcamera people could probably help here
14:32 pq: that's Bayer
14:33 tleydxdy: yes
14:33 demarchi: sima: https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/44 contains the dim_pull_request part of splitting the tag/pr we talked about some time ago. There's a pending question there about creating the tag. Is the "tag something != tip of the branch" something desirable in general? Or is my workflow too different?
14:33 pq: that's a spatial arrangement, while here we're more interested in the colorimetry
14:34 tleydxdy: so processing it in rgb space seem completely wrong
14:34 pq: why would it be?
14:34 tleydxdy: well, you only know one color at any point, so how can you do colorimetry?
14:34 pq: obviously de-Bayering happens at some point, giving you a three-channel sample at each pixel
14:35 tleydxdy: you need to at least debayer it and that bakes in a lot of assumptions there
14:35 pq: we're not interested in the spatial properties here, just the (spatially interpolated) colorimetry
14:35 tleydxdy: that's not operating on raw anymore
14:36 pq: ok
14:37 pq: The same with displays, if you profile it, you get a sample cloud, that is, a 3D LUT essentially from device RGB -> optical measurement. This one is probably uniformly sampled in device RGB.
14:38 tleydxdy: yeah, there's a billion ways to debayer, most likely you want to do chromatic aberration correction as well, the spatial aspect of "where the pixel is" is gonna affect your color result a lot
14:38 pq: If you profile a camera, I don't think either side will be uniformly sampled, or you have an impressive test target to take photos of.
14:39 Calandracas: not just profiling a camera, but profiling each lens :P
14:40 tleydxdy: the pixel value you get after you process a raw is not a simple function of the raw pixels at that point
14:40 tleydxdy: it's a whole mess
14:41 Calandracas: if you need to debayer, this library works fantastically: https://github.com/CarVac/librtprocess
14:42 Calandracas: and will give an idea of the level of complexity involved in demosaicing
14:42 Calandracas: its also important to do black level subtraction before debayering
14:42 Calandracas:learned that the hard way
14:44 melissawen: grillo_0, emersion I think we should keep the bkg TODO there, in our radar. But yes, we can describe the current task status better.
14:46 melissawen: BTW, I've seen a proposal for plane solid_fill prop (msm IIRC). Is there any relationship between these two props?
14:46 emersion: i don't think it makes sense to have both
14:46 emersion: to me, they describe the same feature
14:55 melissawen: hmmm... it looks like we are discussing towards the plane solid_fill property, right? In this case, do we need a CRTC bkg prop too?
14:57 melissawen: if so, should these possible plane and CRTC `solid_fill` props have a similar behavior?
14:59 melissawen: BTW, sima, mripard, tzimmermann, we have a fix for a drm helper that is so far only used by VKMS and I'm in doubt about where should I apply it (-misc-next or misc-fixes).
14:59 melissawen: https://lore.kernel.org/dri-devel/20240316-drm_fixed-v2-1-c1bc2665b5ed@riseup.net/
14:59 melissawen: could you guide me?
15:00 tzimmermann: melissawen, drm-misc-fixes
15:00 tzimmermann: 'git tag --contains 8b25320887d7' says that v6.5 already has the fixed patch
15:01 tzimmermann: so the fastest way to get it into upstream is drm-misc-fixes
15:01 tzimmermann: melissawen, you can use 'dim fixes 8b25320887d7' to create the Fixes tag
15:02 sima: tzimmermann, atm we're in the merge window so it's -next-fixes
15:02 tzimmermann: it will tell you about stable as well
15:02 sima: or dim gets confused
15:02 sima: or patches get lost
15:03 sima: https://drm.pages.freedesktop.org/maintainer-tools/committer-drm-misc.html#where-do-i-apply-my-patch
15:03 tzimmermann: sima, what's the difference? drm-misc-fixes goes into drm-fixes, right? shouldn't that land at the end of the merge window at the latest?
15:03 sima: it's a bit subtle in the flow chart, but since there's no -rc only a release the 2nd question is "no"
15:03 sima: tzimmermann, only if you don't forget to do a pr for both drm-misc-next-fixes and drm-misc-fixes
15:04 tzimmermann: oh well
15:04 tzimmermann: melissawen, drm-misc-next-fixes then. sorry
15:04 tzimmermann: about the misinformation
15:05 tzimmermann: i'll prepare the next PR on Thursday
15:05 sima: tzimmermann, tbh not sure we should change the docs and just send pr for both during the merge window
15:05 sima: it's a bit too reliably a confusion :-/
15:05 tzimmermann: :)
15:05 sima: otoh as long as it's applied somewhere and not outright lost in a developer branch it's still a win
15:05 tzimmermann: i got it wrong for years, lol
15:06 melissawen: tzimmermann, sima thanks! I usually follow this part of the documentation "If in doubt, apply to drm-misc-next or ask your favorite maintainer on IRC"
15:06 tzimmermann: that's true, better than a lost patch
15:07 sima: tzimmermann, dim status says that currently there's unmerged patches in both drm-misc-fixes and -next-fixes anyway, so I guess 2 pr it is this week
15:08 melissawen: I'm also not clear about this "is the bug in the current -rc?". Does it mean if the bug exists in the current mainline -rc or if the bug **only** exists in the current -rc?
15:09 mripard: tzimmermann: tbf, I was just as wrong :)
15:09 melissawen: (maybe it is my low english skills)
15:09 mairacanal: tzimmermann, sima, is it possible to add an "Is it the merge window?" question to the diagram to make it clearer?
15:10 mripard: melissawen: I'm not sure what is the difference between your two options?
15:10 sima: melissawen, it's just confusing
15:10 emersion: melissawen: the CRTC background prop would be equivalent to exposing a plane underlay which only supports fullscreen color
15:13 melissawen: mripard, I mean, we have a bug since v6.5, so we can say the bug exists in the current -rc (and all past -rc). But if we are in the v6.9-rc2 and the bug was introduced in this -rc, so the bug **only** exists in the current -rc.
15:13 gfxstrand: Ugh... I need to make some decisions about unstructured NIR
15:14 gfxstrand: In particular, I need reg_intrinsics_to_ssa to work on unstructured control-flow.
15:14 gfxstrand: This means we need to be able to do a reverse post-DFS walk of the CFG.
15:15 gfxstrand: Do we want to require that the list of blocks in nir_function_impl is always a reverse post-dfs, add a helper to re-sort it, and enforce that with nir_validate?
15:15 mripard: melissawen: the "age" of the bug doesn't matter, the only thing that does is the version affected. The question is: is it in Linus' tree (in last -rc) or in drm-misc-next ?
15:15 gfxstrand: Do we want to DFS and sort the blocks on the spot every time we want to iterate that way?
15:15 gfxstrand: Do we want a half-way solution where dominance records the CFG post-indices and we sort right before we walk?
15:16 gfxstrand: cwabbott, jenatali, karolherbst: ^^
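[editor's note] The "DFS and sort the blocks on the spot" option discussed above can be sketched as follows. This is a minimal Python illustration, not NIR code: the function name `reverse_postorder` and the `successors` dictionary are assumptions made for the example, and NIR's real block and CFG types are not modeled. The useful property is that, ignoring back edges, every block is visited before its successors.

```python
def reverse_postorder(entry, successors):
    """Return the blocks reachable from `entry` in reverse post-order.

    `successors` is a hypothetical mapping: block -> list of successor blocks.
    """
    visited = {entry}
    postorder = []
    # Iterative DFS: each stack entry holds a block and an iterator over
    # the successors of that block that have not been examined yet.
    stack = [(entry, iter(successors.get(entry, ())))]
    while stack:
        block, pending = stack[-1]
        for succ in pending:
            if succ not in visited:
                visited.add(succ)
                stack.append((succ, iter(successors.get(succ, ()))))
                break
        else:
            # All successors are done, so the block is finished (post-order).
            postorder.append(block)
            stack.pop()
    postorder.reverse()
    return postorder


# A diamond-shaped CFG: entry branches to a and b, which both reach merge.
cfg = {"entry": ["a", "b"], "a": ["merge"], "b": ["merge"], "merge": []}
order = reverse_postorder("entry", cfg)
```

In this sketch the result is a freshly allocated list on every call, which corresponds to the "someone has to free the array" cost of the on-the-spot approach.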
15:16 melissawen: mripard, oh, I got it! thanks for explaining!
15:16 mripard: melissawen: so if the bug is 10y old or just got merged in last rc, it should be merged in drm-misc-fixes either way
15:18 mripard: if it's not in Linus' tree but drm-misc-next, then it should either be merged in drm-misc-next before -rc6, or drm-misc-next-fixes if it's between X-rc6 and X+1-rc1
15:20 javierm: drm-misc-next-fixes always confuses me as well. A couple of times I got it wrong and a cherry-pick was needed, which duplicated the commits and polluted the git log :(
15:22 melissawen: mripard, okay... but then I'm confused again. Because the bug fix I've mentioned (drm+vkms) is present since v6.5, but we are now in a merge window.
15:24 mripard: melissawen: then "if it's not in Linus' tree but drm-misc-next" is false :)
15:24 javierm: melissawen: the question is, do you want your fix to land in v6.9 or v6.10 ?
15:24 melissawen: so, sima mentioned drm-misc-next-fixes because we are in a merge window... what should be evaluated first: merge window or Linus' tree?
15:25 mripard: I think that's the part where tzimmermann and I were confused about too
15:25 melissawen: x_X
15:26 mripard: like javierm was saying, it basically boils down to what version you want to target
15:26 mripard: after -rc6, you still have two releases (-rc7 and the final one) for the current version
15:27 mripard: and after the final release, the merge window opens and the next version is going to be (X+1)-rc1
15:28 mripard: so what sima said was that drm-misc-next should always be open, drm-misc-next-fixes between -rc6 and (X+1)-rc1, and drm-misc-fixes between (X+1)-rc1 and X+1-final
15:28 javierm: melissawen: the problem is that there's a gap between when the DRM maintainers send the drm-next PR to Linus and when Linus releases the -rc1 that is the target for drm-misc-fixes
15:28 javierm: and that gap is what drm-misc-next-fixes tries to cover
15:29 sima: javierm, the gap is when the last drm-misc-next goes into drm-next
15:29 mripard: which seems a bit weird to me, I would have expected drm-misc-fixes to be always open, just like drm-misc-next, with drm-misc-next-fixes being just the stop-gap measure
15:29 sima: at around -rc6
15:29 sima: mripard, yeah maybe we should switch to that model, defacto it is already
15:30 sima: essentially if it's a bugfix, try drm-misc-fixes first if it contains the bug, then drm-misc-next-fixes, then drm-misc-next
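[editor's note] sima's rule of thumb for bugfixes can be written down as a small decision function. This is only a sketch of that informal summary; it is not how dim works, and the function name `pick_fix_branch` and the `contains_bug` mapping are assumptions made for the example. Falling through to drm-misc-next when neither fixes branch contains the bug is one reading of "then drm-misc-next".

```python
def pick_fix_branch(contains_bug):
    """Pick the branch for a bugfix, following the order given in the chat.

    `contains_bug` is a hypothetical mapping: branch name -> True if that
    branch already contains the buggy commit.
    """
    # Prefer the branch that reaches a release soonest, as long as the bug
    # is actually present there.
    for branch in ("drm-misc-fixes", "drm-misc-next-fixes"):
        if contains_bug.get(branch, False):
            return branch
    # Otherwise the bug only exists in the development branch.
    return "drm-misc-next"


# A bug present since v6.5 is in every branch, so it goes to drm-misc-fixes.
old_bug = pick_fix_branch({
    "drm-misc-fixes": True,
    "drm-misc-next-fixes": True,
    "drm-misc-next": True,
})
```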
15:30 karolherbst: gfxstrand: why do you need reg_intrinsics_to_ssa to run unstructured?
15:30 javierm: sima, mripard: but that needs drm-misc-fixes to get a backmerge of drm-misc-next right ?
15:30 javierm: err, of drm-next I meant
15:30 mripard: javierm: which will happen at next -rc1
15:30 sima: javierm, if all goes well you can just fast-forward to -rc1
15:30 sima: sometimes you have a few lost commits, and then yes you need a backmerge
15:31 javierm: mripard, sima: exactly, but the problem is that then you have the time gap I mentioned
15:31 sima: javierm, the time gap is much bigger
15:31 gfxstrand: karolherbst: Because to make my crazy but correct new warp barriers pass, NAK wants unstructured CF.
15:31 karolherbst: scary
15:31 gfxstrand: :)
15:31 javierm: sima: how so? Isn't it between when the drm-misc maintainers send their last drm-misc-next PR and when Linus releases -rc1?
15:31 sima: yeah
15:31 gfxstrand: karolherbst: That or I need to define a structured goto in NIR
15:31 sima: which is about a month, which is a bit much
15:32 karolherbst: I mean.. I totally see why you'd want that to run on unstructured CF, but... have you considered doing it in SSA regardless? :D
15:32 gfxstrand: Or add a whole new nir_scope CFG node type and a scope_break jump type which breaks out of N scopes
15:32 gfxstrand: Those are all options
15:32 karolherbst: ohh right...
15:32 karolherbst: uhhh
15:32 javierm: sima: yeah, I understand the rationale of drm-misc-next-fixes, it's just that I never track that part of the process closely
15:32 sima: javierm, the other thing is that you can't really merge drm-misc-fixes into drm-next, because that would replicate linus' merge commit
15:32 sima: which he really doesn't like
15:32 karolherbst: the barrier aspect isn't the issue, but rather doing all of this convergency stuff
15:32 gfxstrand: Yeah, roughly.
15:33 sima: and if you'd just merge that merge commit, you get a random point in the middle of the merge window, which isn't a good idea either
15:33 gfxstrand: There's a lot of different ways to slice this pie
15:33 javierm: sima: I wonder if dim could automatically close drm-misc-fixes during that period and warn users
15:33 sima: so you can't pick up all fixes during the merge window with drm-misc-fixes, you need -next-fixes even during that time since that's based on drm-next
15:33 karolherbst: gfxstrand: mhh. why not attach the convergency info to the CFG? Like... inner threads need to converge after leaving this block, and it's up to the backend to implement it correctly?
15:33 sima: javierm, I tried, I failed
15:33 gfxstrand: karolherbst: Oh, that's all implicit in NIR already
15:34 sima: javierm, like even the logic to push the right branch to the linux-next branches is mostly busted
15:34 mripard: javierm: the point is that you can't close drm-misc-fixes after -rc6, because there's still -rc7 and the final release that you still might want to target
15:34 javierm: sima: I see
15:34 karolherbst: mhh, fair
15:34 sima: since after the freeze we want to push drm-misc-fixes + drm-misc-next-fixes and _not_ drm-misc-next, since that's not for the current merge window
15:34 javierm: mripard: but -rc7+ are not merged from drm-misc-fixes but drm-misc-next-fixes
15:34 karolherbst: but I guess you want to insert the barriers when in nir, not when in nak
15:34 sima: and if there's anything big in drm-misc-next it really annoys people
15:35 gfxstrand: It's not really the barriers that are the problem. It's all the extra CF required to do the break cascade
15:35 sima: javierm, drm-misc-next-fixes only heads into drm-next (assuming no misplaced/lost commits outside of when it's needed)
15:35 karolherbst: mhh, I see
15:35 gfxstrand: And I'll feel way more comfortable about it if I can just leverage NIR's SSA re-construction.
15:35 mripard: javierm: honestly, it's from both. Each time I start handling drm-misc-fixes for a given release, I basically merge -rc1 and drm-misc-next-fixes if there's any commit
15:35 javierm: sima: I see
15:35 gfxstrand: NAK also has SSA reconstruction so I can also do it there in theory
15:36 sima: javierm, and drm-misc-fixes never goes into drm-next (well aside from misplaced patches during the merge window that would be stuck otherwise until after -rc1)
15:36 mripard: javierm: and from what tzimmermann, I'm pretty sure he does too :)
15:36 sima: javierm, aside from the cherry-picking to manage the *fixes branches, this applies to drm-misc too https://drm.pages.freedesktop.org/maintainer-tools/drm-intel.html#patch-and-merge-flow
15:38 tzimmermann: right, when i take over -misc-fixes, i bring the branch up to the current -rc1 and check -misc-next-fixes for any leftovers
15:40 karolherbst: gfxstrand: but I think I lean towards the "on-the-fly" solution, because in codegen we had implicit ordering and I still have nightmares from doing anything CFG related in codegen from that
15:40 gfxstrand: heh
15:40 karolherbst: ehh "on the spot" you said
15:41 gfxstrand: Yeah, it's just annoying because you have to DFS and return an array and then someone has to free the array.
15:41 gfxstrand: Not really the end of the world, just annoying.
15:41 karolherbst: in codegen it was even worse, because branches didn't point to blocks, but rather the first in the next list was branch taken or not taken, and uhh... it was all terrible. But they were also implicitly sorted and it messed with RA or something, it was horrible
15:42 gfxstrand: NAK is a little that way but probably not as bad
15:42 karolherbst: gfxstrand: what if you just have an iterator helper, which takes a callback with the block as an input and you shouldn't mess with the CFG while iterating?
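[editor's note] A callback-based iterator like the one karolherbst suggests could be combined with the earlier "dominance records the CFG post-indices" idea. The sketch below assumes the post-indices were already computed by some earlier analysis; `for_each_block_rpo`, `blocks`, and `post_index` are hypothetical names, not NIR API.

```python
def for_each_block_rpo(blocks, post_index, callback):
    """Call `callback(block)` for every block in reverse post-order.

    `post_index` is a hypothetical mapping: block -> DFS post-order index.
    The order is computed once up front, so the callback must not change
    the CFG while the walk is in progress.
    """
    # Sorting by descending post-index yields reverse post-order.
    order = sorted(blocks, key=lambda b: post_index[b], reverse=True)
    for block in order:
        callback(block)


# The block list may be stored in any order; the helper re-sorts a snapshot.
blocks = ["merge", "a", "entry", "b"]
post_index = {"merge": 0, "a": 1, "b": 2, "entry": 3}
seen = []
for_each_block_rpo(blocks, post_index, seen.append)
```

With this shape the caller never owns a temporary array, at the price of having to express the loop body as a callback.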
15:43 clever: emersion: of note, the HVS on the rpi also supports a background fill color
15:44 emersion: good to know
15:44 emersion: is it per-plane, or per-CRTC?
15:44 gfxstrand: karolherbst: I think I have a clever plan
15:44 clever: emersion: per crtc
15:45 clever: let me find the exact details...
15:45 karolherbst: gfxstrand: is it the good kind or the bad kind of a clever plan?
15:45 clever: emersion: here is my open implementation of things: https://github.com/librerpi/lk-overlay/blob/master/platform/bcm28xx/include/platform/bcm28xx/hvs.h#L259-L263
15:45 clever: from the open rpi firmware stuff
15:46 clever: and if i cross-reference back to linux...
15:46 clever: emersion: https://github.com/raspberrypi/linux/blob/rpi-6.6.y/drivers/gpu/drm/vc4/vc4_regs.h#L392-L403
15:47 clever: all 3 outputs on the HVS, have a dedicated color and fill-enable flag
15:47 gfxstrand: karolherbst: IDK
15:47 emersion: cool
15:47 karolherbst: sounds promising then
15:49 clever: emersion: i think vc4_hvs.c is what enables it, via enable_bg_fill, but it never sets a color, so it's just 000000, black
15:49 clever: emersion: basically, there is a chunk of ram being used as a fifo, the read side is treating it as a fifo, and sending pixels to the encoder (such as hdmi)
15:49 clever: emersion: but the write side is treating it as ram, and for each plane, just copies pixels from host ram to the fifo ram, doing scaling and blending as it goes
15:50 clever: if no plane covers a pixel, then that slot never gets written, and you get undefined (previous scanlines) as the result
15:50 clever: the bg fill prevents that, by just setting the entire scanline to a known state, but that also costs clock cycles
15:51 clever: linux uses enable_bg_fill, to decide if there is a full-screen plane or not, and turn that on/off
15:52 clever: but it could definitely be exposed as a drm prop
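[editor's note] clever's description of the background fill can be modeled with a toy scanline compositor. This is a deliberately simplified sketch, not the HVS hardware or the vc4 driver: pixels are plain values, planes overwrite instead of blending or scaling, and `compose_scanline` and its plane tuples are assumptions made for the example.

```python
def compose_scanline(fifo, planes, y, bg_fill=None):
    """Compose scanline `y` into `fifo`, a buffer reused between scanlines.

    `planes` is a list of hypothetical (x0, x1, y0, y1, color) rectangles,
    with exclusive end coordinates. Returns a copy of the composed line.
    """
    if bg_fill is not None:
        # Background fill: put the whole line into a known state first.
        for x in range(len(fifo)):
            fifo[x] = bg_fill
    for x0, x1, y0, y1, color in planes:
        if y0 <= y < y1:
            # Only pixels covered by a plane are written.
            for x in range(max(x0, 0), min(x1, len(fifo))):
                fifo[x] = color
    return list(fifo)


# Plane A covers all of line 0, plane B covers only the middle of line 1.
planes = [(0, 4, 0, 1, "A"), (1, 3, 1, 2, "B")]

fifo = [None] * 4
compose_scanline(fifo, planes, 0)
stale = compose_scanline(fifo, planes, 1)  # edges keep line 0's pixels

fifo = [None] * 4
compose_scanline(fifo, planes, 0, bg_fill="K")
filled = compose_scanline(fifo, planes, 1, bg_fill="K")
```

Without the fill, the uncovered edge pixels of line 1 still hold whatever the previous scanline left behind; with the fill they hold the background color, at the cost of an extra pass over the line.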
16:02 mripard: emersion: iirc sun4i's crtc allows a BG color too
16:04 mripard: so it looks like it's common enough
16:05 clever: https://github.com/librerpi/lk-overlay/blob/master/lib/fasterconsole/fasterconsole.c#L214 https://ext.earthtools.ca/private/rpi/faster-console-1.mp4
16:06 clever: a random baremetal demo, where i used the bg fill
16:18 pq: emersion, melissawen, I think from a purely UAPI perspective, I might prefer a bg color prop on CRTC *and* a solid_fill prop on planes that can also use real FBs. I think creating another (fake) plane that can only ever use solid_fill is more complicated for no benefit I can see.
16:19 emersion: so you prefer having two codepaths?
16:19 emersion: and try these when you need to paint bg?
16:19 pq: at first thought, yeah
16:19 emersion: that's surprising
16:20 pq: I guess I don't see myself using a plane with solid_fill
16:21 pq: but you're right, if one is aiming to use planes with solid_fill opportunistically, then CRTC bgcolor prop has no use.
16:21 pq: how do you tell userspace that this "bg plane" cannot ever take any FB?
16:21 emersion: you can just reject commits
16:22 pq: empty format list?
16:22 emersion: i mean, on Arm it's all virtualized planes anyways
16:22 pq: alright
16:28 pq: "bg plane" would be restricted to covering the whole CRTC. No problem either, I guess.
16:30 vsyrjala: yuck on 'bg plane'
16:30 vsyrjala: that would cause a lot of grief to the driver all over
16:32 emersion: only because the DRM internals are designed this way
16:33 vsyrjala: yes. they are supposed to more or less reflect the hw
16:34 vsyrjala: i guess one could in theory intercept this more or less in the ioctl handle itself, and thus hide the mess from the lower level code
16:35 vsyrjala: but dunno if i really like that either. might as well emulate it in userspace imo ;)
16:35 emersion: everything emulated in user-space is replicated as many times as there are compositors
16:35 emersion: and having 3 ways to do the same thing is not a great uAPI
16:58 zamundaaa[m]: pq: you could have no FB_ID property on the plane... But then again, some userspace trusts the kernel to never change anything
16:59 zamundaaa[m]: If the choice is between a plane that has the "normal" properties, and a CRTC prop for the background color, I'd pick the CRTC prop though. Otherwise userspace will quite often waste atomic tests on the color-only plane
17:03 emersion: zamundaaa[m]: but some hw supports things that cannot be described via a CRTC prop
17:04 zamundaaa[m]: What kind of things?
17:06 zmike: mareko: I am convinced that the number of mesa/st nir passes broken with lowered io is infinite
17:07 emersion: zamundaaa[m]: a plane background color, or a square in the middle of the screen filled with a color
17:08 zamundaaa[m]: That sounds more like a new drm object would be needed to describe that
17:08 zamundaaa[m]: with some description of what the background color applies to
17:09 DemiMarie: emersion: which hardware is this?
17:41 mareko: zmike: yes the optional passes likely don't work
17:41 zmike: they not only don't work, the lowered io paths show a pretty fundamental misunderstanding of how lowered io works :/
17:42 mareko: it works here
17:42 zmike: sure
17:42 zmike: I am fixing them as I go
17:43 mareko: how is lowered IO misunderstood?
17:43 zmike: still searching for variables in e.g., nir_lower_clamp_color_outputs
17:43 mareko: yes, that's an optional pass
17:43 zmike: yes
17:43 zmike: I was agreeing with you
17:44 zmike: I have a bunch of fixes so far, but the goalpost seems far
17:44 clever: emersion: i can also picture how you could do a solid_fill plane on the rpi, just make a 1x1 or 2x2 pixel plane, and let the hw scaler do the rest
17:45 mareko: in the future I'm hoping we could lower IO for all drivers, and unlower it for drivers that don't accept it, so that we have only 1 code path and 1 style of NIR in st/mesa
17:46 mareko: that one is also far
17:46 zmike: very
17:47 gfxstrand: That would be nice
17:48 zmike: year 2050 goals
17:58 mareko: it's doable in 2024
18:02 sima: demarchi, dropped a comment
18:03 sima: demarchi, and I don't think your use-case for dim tag-branch is that unusual, makes sense to me
18:03 sima: like if you try to match the commit that CI has done overnight runs on or something like that, but only tag it when it's good
18:11 jenatali: mareko: I've considered doing exactly that. My backend wants the load/store intrinsics lowered, but still wants variables to iterate over. I'm perpetually tempted to add something late to reconstruct the variables from the intrinsic data
18:13 zmike: do I have great news for you
18:14 jenatali: :O
18:40 demarchi: sima: thanks... I will take a look at getting the missing part
23:30 mareko: jenatali: that's what driver-specific shader info gathering is for... you just gather IO descriptions from the intrinsics and store it somewhere
23:30 jenatali: Yeah. I just haven't done that work yet :)
23:30 mareko: radeonsi has this large structure called si_shader_info
23:31 mareko: it's like shader_info, but more
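[editor's note] The "gather IO descriptions from the intrinsics and store it somewhere" approach mareko describes might look roughly like the sketch below. It does not reflect si_shader_info or any real Mesa structure: intrinsics are modeled as plain dictionaries, and `gather_io_info` and the field names `location`, `component`, and `num_components` are assumptions made for the example.

```python
def gather_io_info(intrinsics):
    """Collect a per-location component mask for shader inputs and outputs.

    `intrinsics` is a hypothetical list of dicts describing lowered IO
    intrinsics; anything that is not an IO intrinsic is ignored.
    """
    info = {"inputs": {}, "outputs": {}}
    for intr in intrinsics:
        if intr["op"] == "load_input":
            table = info["inputs"]
        elif intr["op"] == "store_output":
            table = info["outputs"]
        else:
            continue
        # Bits [component, component + num_components) are accessed.
        mask = ((1 << intr["num_components"]) - 1) << intr["component"]
        table[intr["location"]] = table.get(intr["location"], 0) | mask
    return info


shader = [
    {"op": "load_input", "location": 0, "component": 0, "num_components": 3},
    {"op": "store_output", "location": 1, "component": 0, "num_components": 2},
    {"op": "store_output", "location": 1, "component": 2, "num_components": 2},
    {"op": "iadd"},
]
io_info = gather_io_info(shader)
```

A backend that still wants something variable-like to iterate over, as jenatali mentions, could walk a table of this kind instead of the original variables.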