02:17airlied: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23066 mesh shader for the lvp
02:23HdkR: :O
02:34kisak: airlied: Dare I ask how painful it was? Is it more Creature from the Black Lagoon or Frankenstein?
02:36airlied: kisak: initially I expected it to be really horrible, but it's ended up frankensteining the llvmpipe compute shaders and the draw pipelines together and it mostly seemed to work
07:36pq: airlied, will you be coming back to the new KMS color pipeline UAPI discussion? It seems to have died off. Were you convinced in the end at all?
07:39airlied: pq: I was waiting until I'd see everyone chime in :-), I'm still on the fence, but probably going to have to let people create a noose for themselves
07:40airlied: if sima is okay with the direction I'll trust it'll all work out :-)
07:40pq: alright
07:40pq: airlied, I'm not sure if your intention was to explicitly enable proprietary drivers and algorithms, or the opposite, with your question about nvidia.
07:42airlied: the problem isn't what I explicitly want to enable, it's what happens when their solution to enabling them is to replace a bunch of distro libraries
07:42airlied: we've been through this pain before with libGL and libgbm etc
07:43pq: if you are sure it will happen, I guess it means a loader with the userspace lib - if that ever gains widespread use in compositors.
07:44airlied: yes, and then creating a loader starts leading down to creating conformance tests etc
07:44pq: sure
07:44sima: pq, yeah I got really badly sidelined with typing up the uapi header + docs last week
07:44sima: but also, I think it's largely orthogonal and not consequential for the actual color pipeline uapi semantics
07:44airlied: like we can probably ignore it in the short-term, but I'd like to make sure people aren't forgetting the pain of userspace vendoring
07:45airlied: like it's probably not egl streams painful, unless nvidia decide to expose a completely different uapi from their color library
07:45sima: airlied, for what you brought up, I'm not sure there's really a good solution, I think the best we can do is be keenly aware of the tradeoff we're picking
07:46pq: and that's probably not where anyone would be starting. The KMS UAPI starts with fairly generic elements that will get directly used by compositors, and whether a userspace library actually emerges remains to be seen. So proprietary vendors may not even have a place to hook into / replace, unless someone is serious about driving a userspace lib with the essential aim of enabling proprietary algorithms.
07:46airlied: sima: put the prescriptive api in the kernel "fixes it" :-)
07:46sima: yeah
07:46airlied: pq: that's worse though
07:46airlied: I don't want compositors directly using the kms uapi in case hardware changes direction
07:46sima: pq, was anyone from asahi involved too? apparently they have a maxxed out prescriptive api due to half the macos driver running in the fw ...
07:46airlied: then the kernel has to fake up some generic elements to keep things going on newer hw
07:47airlied: yeah we also have to deal with mappings to the nvidia/asahi fw apis which might be a nightmare
07:47airlied: or we end up having per-driver uapi for those, which is also a nightmare
07:48airlied: if we can define a minimal viable set of hardware that will always be there, then maybe compositors using it is okay
07:48pq: sima, are you sure you have prescriptive vs. descriptive the right way around? I have no idea of any asahi stuff.
07:48airlied: but if we start having non-niche compositors with baked in pipeline knowledge it'll be horrible
07:49airlied: like I don't really care what gamescope does, but mutter/kwin/weston shouldn't get stuck in corners needing rewrites on new hardware
07:49pq: My opinion is that if the KMS color operations are not mathematically defined and explicit, then "my" compositors won't be using them.
07:50pq: I'm not sure what kind of "pipeline knowledge" you are imagining.
07:53sima: pq, maybe not, it's early and coffee not yet working :-)
07:54airlied: pq: like the existence of certain LUTs with certain abilities
07:54airlied: then a new hw design removes those in favour of something completely different
07:54sima: pq, yup I got that wrong, I meant asahi seems very descriptive
07:55airlied: and now the compositor can't do a bunch of things on that hardware
07:56sima: yeah that's the awkward part of a prescriptive uapi
07:56sima: like if a new gpu would suddenly break gl1 or so
07:56sima: and gears stops working :-)
07:57airlied: but at least there we fix one driver in userspace, not 500 GL1 apps
07:57pq: airlied, well, I can't talk for anyone else, but Weston will always be doing an optimization to map its internal pipeline to KMS, and sometimes it works and sometimes it won't. There will be no assumptions, just a mapping algorithm attempting its best.
07:58airlied: my worry is compositor proliferation is going to cause all sorts of I plugged in new hardware and now my desktop looks like 8-bit solaris firefox
07:58airlied: or my laptop now consumes all the Ws
07:58sima: airlied, I guess the fallback should always be gl, ie. crappy battery times but otherwise looking good?
07:58airlied: yea I guess it will have to be that, and then we have to rewrite all the compositors
07:59pq: why is plugging in new hardware suddenly a kernel-regression-severity problem?
07:59pq: yes, shader fallbacks are the answer when KMS does not fit
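The "mapping algorithm attempting its best" with a shader fallback that pq describes can be sketched roughly as below. This is only an illustration, not Weston's actual code: the function name `map_to_kms`, the string op kinds, and the assumption that every unused hardware slot can be set to bypass are all hypothetical.

```python
from typing import List, Optional


def map_to_kms(wanted: List[str], hw_slots: List[str]) -> Optional[List[int]]:
    """Try to place the compositor's ops, in order, onto the fixed-order
    hardware slots. Returns the chosen slot index for each wanted op, or
    None when the pipeline does not fit and shaders must be used instead.

    Assumption (hypothetical): any skipped hardware slot can be bypassed.
    """
    assignment = []
    slot = 0
    for op in wanted:
        # Skip over hardware slots of the wrong kind (they stay in bypass).
        while slot < len(hw_slots) and hw_slots[slot] != op:
            slot += 1
        if slot == len(hw_slots):
            return None  # no remaining slot can do this op: shader fallback
        assignment.append(slot)
        slot += 1
    return assignment


# Hypothetical hardware exposing three colorops in a fixed order.
hw = ["lut1d", "matrix", "lut1d"]

for pipeline in (["lut1d", "matrix"], ["matrix", "lut1d"], ["matrix", "lut3d"]):
    result = map_to_kms(pipeline, hw)
    print(pipeline, "->", "shader fallback" if result is None else result)
```

The point of the sketch is that a failed mapping changes only where the work is done (display hardware versus a GPU pass), not what is computed, which is why the cost of new hardware would be battery life rather than image quality.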
07:59airlied: pq: distros like to provide as uniform a set of features across vendors as possible
07:59airlied: it's called an abstraction layer for a reason
08:00airlied: if I have to update all the compositors in my distros every 6 months to keep up with new hardware, that's going to be a substantial increase in testing costs
08:00sima: pq, it's not a kernel regression issue of "stop the press, need a fix asap"
08:00sima: but a "users are going to be disappointed and that's not very good" issue
08:01pq: airlied, but the only difference with plugging in new hardware is that battery drains faster - not that image becomes crappy.
08:01airlied: pq: we do care about batteries on laptops though
08:01sima: my take is that if we really can't do better then oh well, but we should at least be clear on this
08:01pq: sure, but it's like asking for all the goodies without requiring any work - impossible.
08:02sima: and we might inflict retrofitting a descriptive retrofit underneath a common prescriptive uapi for more compat
08:02airlied: pq: hence why I asked if you put it all in the kernel
08:02airlied: so we don't have to keep every compositor up to date for every hw release
08:02pq: you want to generate shader sources in kernel?
08:02airlied: if we had one userspace library hiding it all, that would also be preferable
08:03airlied: but getting userspace devs to agree on that would probably be difficult
08:03pq: very
08:03sima: pq, essentially if you go with pure prescriptive then kms is no longer the abstraction layer
08:03airlied: pq: at this stage it's probably saner to add spirv generators to the kernel than update numerous compositors on each new hw release :-P
08:03sima: ebpf
08:03sima:had to, resistance was futile
08:04pq: sima, it still is an abstraction layer, just not as high-level, but also not down to hardware-level.
08:04pq: airlied, my compositor has no vulkan
08:04sima: pq, I guess it'd be like vk, but with zero required features
08:04airlied: spirv would just be the transport language :-)
08:05sima: everybody gets a compiler!
08:05airlied: or we return an ebpf program :-P
08:05pq: airlied, ok, then what about the policy question?
08:05airlied: but yeah it's going to be a horrible mess, please make a userspace library at least, or attempt to
08:05sima: I mean the trouble with that is if your fallback isn't full gl/vk but just a fancy 2d blend engine
08:05sima: you're still screwed
08:06pq: airlied, who do you put the policy into userspace if the kernel UAPI is descriptive?
08:06pq: *how
08:06sima: I think that was prescriptive uapi + mandated userspace lib that provides some descriptive uapi
08:06airlied: I think that's the closest we'll get to sanity
08:06sima: we could ship them in the kernel even, perf does that too iirc
08:06pq: sima, that's the different design.
08:07sima: not that kernel build system is great
08:07airlied: I think starting with compositors all going on their own is going to end badly
08:07airlied: pq: not sure where you'd put policy, ebpf? :-)
08:07sima: pq, mandated as in lowest common denominator fallback, compositors could still do their own thing that integrates better
08:07airlied:has to dinner
08:07pq: it really makes no difference if the policy is in a userspace lib maintained in the kernel tree, it's still in the kernel
08:08pq: if userspace is not allowed to bypass that lib
08:08sima: well it makes a technical difference of what the kernel does
08:08sima: the spirv generator would then at least run without kernel privs
08:08pq: that technical difference is irrelevant, userspace i.e. compositors are no longer in control of the policy.
08:09sima: pq, see my clarification, I don't mean mandated as "the only uapi"
08:09sima: but as in "we'll patch this to offer a lowest common denominator thing as fallback"
08:09sima: so more a mandate for people who add new gpu generations to their driver, when the color pipeline changes completely
08:09sima: kinda like mesa3d is the mandated 3d library
08:10sima: you can still do whatever you want, and plenty vendors do
08:10sima: and if it's open source we even promise to never break it
08:11pq: but mesa3d does not decide what or how stuff gets shaded, the apps still define that policy.
08:12pq: but if you have a descriptive API, then the policy is necessarily behind that API
08:12sima: there's a lot of fast&loose in the spec especially around texture sampling
08:12sima: or at least historically there was
08:12pq: sure, little details - but here we're talking about major details
08:12pq: obviously visible details
08:12sima: but yeah if there's just no reasonable descriptive api for this then trying to have it in userspace as a fallback is also not going to work
08:13sima: but given that every other os does have some descriptive api somewhere for this I'm not sure that's a solid claim
08:13pq: well, a descriptive API that everyone agrees to might emerge in a decade I guess, assuming that hardware and color research do not advance much.
08:14pq: a decade of active in-production use, I mean
08:14sima: yeah if that's the consensus across all stakeholders
08:14sima: then I think we just have to bite the cost of the prescriptive approach
08:15sima: i.e. document it clearly that this is what we're doing and why, get a pile of acks as sign-off, ship it
08:15sima: stock up on good beverages too :-)
08:16jadahl: si
08:16jadahl: sima: there were discussions on mastodon
08:17pq: Wayland will be a descriptive API. Are you comparing other OSs on equal terms? Apps will be using Wayland, not KMS.
08:17jadahl: on about asahi + kms color pipelines
08:17jadahl: apparently apple hardware is very much implemented to match the apple compositor, with fixed color spaces/formats/... in various steps
08:17sima: pq, afaik the driver interface is generally descriptive too
08:17sima: not just the app/compositor interface
08:18sima: jadahl, yeah that one might be big time lolz :-/
08:18jadahl: not super different from what it seems nvidia hw does
08:18pq: sima, are "KMS" and GPU stuff split into separate APIs, or are both collected behind the same API?
08:19pq: as in, does Windows compositor have the policy, or does the driver have it?
08:20pq: The crucial point for fallbacks when KMS does not work is that the policy is in a single place. Windows could as well put the policy in the gfx drivers, but for Linux I'd like to allow alternatives that are not tied to your hardware driver.
08:21pq: jadahl, no, it does sound super different from what you said.
08:21pq: jadahl, unless the policy behind that firmware API is strictly fixed-function?
08:21sima: pq, I think the driver has it, to make it all work
08:21jadahl: pq: wasn't the nvidia hardware designed to do blending in (put long acronym here)?
08:21sima: kinda like hwc2 in android
08:22pq: but if macOS changes its policies, wouldn't the firmware behavior change without warning too?
08:22sima: driver gets the surface stack, makes it into something the gpu+display can consume and shows it on the display
08:22sima: which is a pretty fundamental break of how kms works
08:23pq: jadahl, I haven't heard anything like that.
08:23jannau: apple's firmware API has at least pixel format, color space and eotf per plane (with some weird limitations)
08:24pq: sima, right. And our hwc2 is a Wayland compositor, many of them.
08:24sima: pq, you specify the exact fw you get in your boot entry, kinda like loading a specific fw version in linux instead of just any
08:24jannau: the whole pipeline has at least a gamma LUT (exposed by macOS) and a CTM (which appears to have some weirdness) too
08:24pq: since there is no "The Linux OS", there are multiple distributions which even allow multiple choices
08:25sima: pq, yeah so I guess just document all these constraint in the uapi doc as the reason for why we have this is probably best
08:25sima: constraints/tradeoffs
08:25jadahl: pq: I think nvidia has fixed function blocks that do things like converting LMS to ICtCp
08:25jannau: I believe it has degamma LUT too but afaik not exposed by macOS
08:26pq: jadahl, sure, but that's ok, and nothing to do with blending.
08:26jadahl: pq: sorry, shouldn't have assumed it was blending, I meant that it isn't exposing arbitrary math ops, but rather specific transformations, which seems similar to what apple hw does
08:27jannau: I haven't yet looked at what macOS does for HDR but it looks like the firmware handles that based on described input and selected output color format
08:27pq: jadahl, from the hackfest, I understood that the NVIDIA blocks can be described with mathematics and they are deterministic. Apple fw API I don't know if it can be described like that, or is it non-deterministic.
08:28jannau: the firmware presents a long list of detailed possible output color formats
08:29jannau: I don't think marcan has looked into HDR in detail either
08:29pq: even if Apple fw API is purely descriptive, then if a set of input parameters always leads to the same mathematical processing, it could theoretically be exposed as a prescriptive API, but it's going to be very fixed-function.
08:31pq: which means special snowflake KMS elements - however, the intel/nvidia/amd KMS drivers could likely expose the same elements as an alternative pipeline.
08:31jadahl: pq: maybe it has clear mathematics, but it adds restrictions
08:31pq: all elements always have restrictions
08:32jadahl: sure, but a set of luts and matrixes is a different story from "this can do LMS->ICtCp"
08:33pq: yes
08:33jannau: regarding the firmware interface: we will only support selected firmware versions (based on the macOS version they ship with). other versions will simply not work since the interface is not stable
08:34pq: jannau, sounds fine - just add the color operations as another thing to keep matching the KMS UAPI you expose.
08:35jannau: so if Apple changes how the color pipeline works we can in principle change annotations of the color operations, the main issue is probably if we are able to detect changes
08:35pq: so it kind of sounds like asahi should be the first one to design new KMS colorop elements, because they have the most special-cased ones.
08:36pq: However, since you say the output description affects the operation, and the output cannot be described per plane (can it?), then that looks like the big problem.
08:37pq: is there a split between before-blending and after-blending color management?
08:38pq: or do you just describe output per CRTC, input per plane, and then magic happens in between to come up with a single image from multiple planes?
08:39jannau: timing and work load wise it's probably a bad idea to have to wait for asahi
08:39pq: jannau, well, if DRM maintainers say that any UAPI design must also fit asahi...
08:39pq: and even without needing any changes in userspace
08:39pq: I mean in compositors
08:40pq: then we really need to start from the most special, most restricted, hardware/firmware there is that we need to support
08:40jannau: I've seen no explicit color management on a per plane basis except the colorspace and eotf description but I haven't looked at HDR yet
08:41pq: jannau, those parameters are exactly what we're talking about.
08:41pq: they are the descriptive API
08:42pq: you set a color space and eotf per plane, and then trust that something chooses a good color conversion (policy) to go from that to what you want out
08:43jadahl: pq: so yea seems the apple firmware provides a fairly opaque system: https://fosstodon.org/@mupuf/110373792727582495
08:43pq: what we need is a prescriptive API: with these parameters values, the mathematical operation is that.
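The descriptive versus prescriptive distinction pq draws here can be sketched in a few lines. This is a toy model under stated assumptions, not the proposed KMS UAPI: the helper names (`power_curve`, `matrix3x3`, `run_pipeline`, `descriptive_policy`) and the policy table are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Pixel = Tuple[float, float, float]
ColorOp = Callable[[Pixel], Pixel]


def power_curve(gamma: float) -> ColorOp:
    """A mathematically defined per-channel curve: out = in ** gamma."""
    def op(px: Pixel) -> Pixel:
        return tuple(max(v, 0.0) ** gamma for v in px)
    return op


def matrix3x3(m: Sequence[Sequence[float]]) -> ColorOp:
    """A mathematically defined 3x3 matrix multiply on an RGB triple."""
    def op(px: Pixel) -> Pixel:
        return tuple(sum(m[r][c] * px[c] for c in range(3)) for r in range(3))
    return op


def run_pipeline(ops: List[ColorOp], px: Pixel) -> Pixel:
    """Apply the ops in order; the result is fully determined by the ops."""
    for op in ops:
        px = op(px)
    return tuple(px)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Prescriptive: the caller states the exact math, so the policy is the caller's.
prescriptive_ops = [power_curve(2.0), matrix3x3(IDENTITY)]


def descriptive_policy(src: str, dst: str) -> List[ColorOp]:
    """Descriptive: the caller only names input and output; some policy
    hidden behind the API picks the math. This table is a hypothetical
    stand-in for driver or firmware behaviour."""
    table = {("gamma2", "linear"): [power_curve(2.0)]}
    return table[(src, dst)]


print(run_pipeline(prescriptive_ops, (0.5, 0.25, 1.0)))
print(run_pipeline(descriptive_policy("gamma2", "linear"), (0.5, 0.25, 1.0)))
```

Both calls happen to produce the same pixel here, but only in the prescriptive case can the caller know that in advance, which is what a compositor needs in order to produce identical output from its shader fallback.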
08:46pq: deja vu EGL Streams...
08:46javierm: I don't see it being that difficult to get devs to agree on using a single library to hide all the details and provide a more abstract interface
08:46javierm: libinput and libcamera are too good examples
08:46javierm: *two
08:46jannau: I would expect that blending respects the per plane description, blends into a internal format and does further conversion for the output format (if necessary)
08:48pq: jannau, the problem is that the KMS color pipeline is (so far thought to be) strictly split between before-blending and after-blending color operations which can be defined independently. That creates the question: in which color space and encoding does apple fw blend in? Because that will necessarily need to be defined in the UAPI.
08:48jannau: one potential problem is that changing the output color format could require a modeset
08:50pq: other drivers already require that sometimes if not always
08:51jannau: pq: I don't know what format the hw uses for blending internally
08:52pq: yes, which is a problem - but it doesn't need to be known for real, we "just" need a mathematical model that matches the results.
08:52pq: but in the end, it's another policy decision robbed from compositors
08:56pq: it's a really difficult impedance mismatch - more like a whole different design paradigm, which is why I'm really scared of the demands to make that apple fw be on equal footing with "real" FOSS drivers.
08:57pq: I'm not sure if Weston would ever be able to make use of the apple fw API really, the design fundamentals are so different.
08:58pq: emersion, are reading all this today? ^
08:58pq: *are you
08:59pq: javierm, I've mentioned libcamera to people, but I don't really know about it, and no-one else seems interested to look at its design...
09:00pq: I guess the closest equivalent in libinput are the pointer acceleration curves, and I think it just recently grew a prescriptive API for it (let compositors define an arbitrary LUT curve).
09:02pq: AFAIU, originally libinput intended to have just whatever that works for everyone, then it needed more and more choices, and now it has a fully prescriptive option, so I would claim that it failed in having a descriptive API.
09:04javierm: pq: I can't say that I understand that much about color management but I see the parallel with what libcamera tries to achieve
09:05pq: javierm, I imagine libcamera would be excellent inspiration indeed.
09:06javierm: basically in modern systems you don't just have a USB Video Class camera but video pipelines consisting of dumb camera sensors + ISP + other IP blocks for video scaling, format conversion, etc
09:06javierm: and the kernel exposes a graph with all these IP blocks (called Media Entities) and user-space need to setup a pipeline
09:07javierm: this means that you can't have a generic user-space and need platform specific logic to set these media pipelines
09:07pq: Color management has color gamut and tone mapping operations, that are totally subjective and there are multiple conflicting goals that one could have with them, which means there cannot be a single algorithm to fit all.
09:08javierm: so what libcamera does is to provide an abstract camera API for user-space to use and the pipeline configuration is done by the library, that contains platform specific logic for that (called pipeline handlers)
09:08pq: moreover, those algorithms are still actively researched, too, with an unlimited complexity range
09:09javierm: pq: yeah, that's another thing that libcamera does. Because besides configuring the pipeline, the camera sensors are dumb and you need post-processing of the frames using 3A algorithms
09:09javierm: 3A = auto exposure, auto white balance and autofocus
09:10pq: javierm, yes, that sounds like a very good fit here too. The public library API might have different design principles, maybe.
09:10javierm: so libcamera allows to plug these algorithms too
09:11javierm: pq: I guess it boils down to attempt to describe this in a generic way using KMS properties and have as a goal to have generic user-space or make the KMS API more low level and have a library on top
09:11pq: javierm, what's the kernel UAPI like? Is it full of hardware specifics or does it pretend to abstract something?
09:12pq: javierm, and why is libcamera not completely in the kernel side of UAPI? - airlied and sima asking ;-)
09:12sima: pq, needs more pinchartl but I think 2 weeks of vacation
09:13sima: afaik the kernel api is still a bit up in the air, and rn just reusing some v4l drivers until the userspace lib interface is clearer
09:13javierm: pq: https://www.kernel.org/doc/html/latest/userspace-api/media/mediactl/media-ioc-device-info.html
09:13javierm: sima: no, the kernel API is very old actually but is just that every product had platform specific code and there wasn't a generic library for it
09:14pq: emersion, hwentlan_, jadahl, are you getting all this? ^
09:14sima: but there's been all kinds of proposal floating around from further extending mediactl to new subsystem to somehow bashing this into drm
09:14javierm: AFAIK it was even used in the Nokia N phones like a decade ago
09:14sima: javierm, afaik there's not a lot of actual recent products fully using mediactl api
09:15sima: but I'm also rather far away from all that discussion
09:15javierm: sima: wasn't it used in Chromebooks for example?
09:16javierm: but yeah, we need pinchartl to make sure that I'm not saying anything silly here :)
09:16sima: cros team is working on a new kamera api
09:16sima: kcam or something, there was an lpc talk
09:16sima: android afaik is just yolo vendor stuff behind the android interfaces in userspace
09:17sima: I've also heard a few of the details on the intel side with the ipu6 driver
09:17pq: javierm, is that one get-info ioctl all what's common there?
09:17sima: the one for dell where it's just a driver in userspace really and very blobby
09:18sima: javierm, I also thought there's a bunch of important things missing like sync_file support or atomic commit
09:18javierm: pq: yeah, just exposing the media graph is what's common. You can read about the device model here https://www.kernel.org/doc/html/latest/userspace-api/media/mediactl/media-controller-model.html
09:19sima: for media overall I mean
09:19javierm: sima: yeah, for x86 is complicated because the ACPI tables shipped are mostly for Windows and so not compatible with the IPU driver
09:20javierm: sima: I believe only the Chromebooks ship a DSDT table that is compatible and describes the media graph using a _DSD extension
09:20sima: oh I didn't even know about any acpi issues
09:21javierm: sima: yeah, that's why you couldn't have cameras working on the Window Surface tablets
09:21javierm: but that's a different conversation than libcamera :)
09:22pq: javierm, that sounds like what would be the best in the long run, but also that design would have practically zero element abstraction at KMS UAPI.
09:22pq: as in what an element does and how it's confisugre
09:22pq: *configured
09:22sima: pq, yup, for that design the prescriptive api is probably the right one
09:22sima: or at least a good stepping stone
09:22javierm: pq: yup, that's why I said that the trade off is basically an abstraction in KMS vs not hiding the color pipelines and have platform specific user-space
09:23pq: sima, the kernel UAPI would not even be prescriptive or descriptive, it would be fully hardware and driver specific.
09:23javierm: which I believe is the descriptive vs prescriptive discussion ?
09:23pq: not really
09:23sima: pq, I'm not sure we'll get there right away, but yeah that would be possible
09:24sima: entirely hw specific api is a bit a too big step from where kms is currently
09:24sima: unless you want a multi-year experimental phase to figure out the lib api, before anything lands
09:25javierm: sima: yeah, it is diverging a lot from the existing KMS design that attempts to provide a more abstract interface to user-space
09:25javierm: but I can't fail to see the similarities between what you are trying to achieve for color pipelines and what the media folks solved with the media controller API
09:26pq: sima, that multi-year effort is exactly what I've been saying would need to happen if we did what airlied and you asked first.
09:27pq: Isn't the proposed prescriptive KMS color UAPI design a natural step in a path to establish a single common userspace library, after which the UAPI can fall into total hardware-specificness naturally over time, making less and less userspace bypass the library?
09:28pq: That won't happen if you demand the UAPI to maintain its abstraction. It must be allowed to fall into total hardware-specificness, or there won't be an incentive to have that single library.
09:29sima: pq, yeah I think that's probably the right pragmatic tradeoff
09:30sima: so for me as long as everyone's on board with the direction, we're good
09:30pq: I might even say that the existing KMS UAPI for planes, CRTCs and connectors is a little too easy to use and abstracted for everyone to start caring seriously about libliftoff.
09:30sima: we're _really_ good in drm at keeping somewhat regretful uapi alive for a long time :-)
09:31javierm: pq: indeed. I agree with you that's either an abstract KMS API or a low-level hardware API + library to abstract those details away
09:31HdkR: The AGP API is gonna live forever
09:31sima: so I'm not super worried about all the concerns, as long as we don't ignore them and don't try to get to something better longer-term
09:31sima: and some pragmatic stepping stones meanwhile are needed to get this off
09:31sima: and the proposed prescript color pipeline looks like a good one
09:32emersion: pq: I'm on leave atm
09:32sima: HdkR, why did you just drop that reminder :-P
09:32javierm: sima, pq: all I'm trying to say is that before committing to an API, talk with pinchartl, because IIRC the media folks attempted to abstract the pipelines in the kernel using the v4l2 API and that failed
09:32HdkR: sima: Need to take the good with the ugly :P
09:32pq: emersion, enjoy, but please bookmark today's chat to revisit when you're back. :-)
09:33emersion: will try to remember
09:33javierm: sima, pq: and in the end decided that the only way to implement it properly was to expose the whole media graph using a different API and let user-space set up the entities and connections
09:33pq: https://oftc.irclog.whitequark.org/dri-devel/2023-05-17
09:34sima: HdkR, it should just be an iommufd but alas 20 years too late for that
09:34HdkR: Indeed
09:35pq: javierm, good points, thanks!
09:37sima: pq, I do think that we might end up in a future where the prescriptive pipeline is fairly limited for just backwards compat and the real interface is a per-crtc blob a la ADF (android display framework)
09:37sima: but that really then means the libkmscolor is fully established and everyone is ok with using that
09:37sima: and that's a very long road to get there I think
09:38pq: yup
09:38sima: plus display color pipelines might be limited enough that we never need to get there
09:38sima: unlike cameras, where the post-processing and live-capture controls really are innovating a _lot_
09:39sima: because it's such a competitive advantage in the mobile space to have slightly better camera
09:39pq: well, gamut and tone mapping operators are innovating a lot too
09:39pq: SBTM signalling is coming, which means "do it all yourself in the source"
09:39sima: yeah it's definitely not settled, but with cameras not even the physical approach to autofocus is settled
09:40sima: so you have widely different hw inputs with widely different algorithms even before you get into anything specific
09:40sima: sbtm?
09:40pq: Source Based Tone Mapping
09:41sima: ah so sink goes completely dumb for hdr, like with old panels where real color correction was all done in the source too?
09:41pq: instead of sending BT.2020/PQ or HLG to a sink, you send sink-specific native data
09:41sima: yeah that's going to be fun
09:41pq: yes
09:41pq: well, it will make my work *easier* when there is no need to guess what a monitor may or may not do
09:42pq: and it also requires hw vendors to correctly describe their monitor behavior, which means a "copy&paste EDID" no longer flies
09:43pq: but it also means there will be more demand from display controller color pipelines, because taking the hit of a shader pass might be upsetting to gamer performance
09:44sima: well on mobile you often flat out don't have the memory bw for a full post-process step
09:44pq: right
09:44sima: but yeah this sounds like pain until both sink and source support is widely adopted
09:44pq: and you don't have the circuitry of an external monitor either, I presume
09:44sima: like the first few sinks/source drivers will just screw this up
09:45sima: like with adaptive sync, and you'll need to buy the right combo
09:45sima: I guess self-refresh panels should have enough circuitry to not make the sink-side mapping too costly
09:45sima: but it's still duplicated silicon for no gain
09:46sima: ok gtg, ttyl
10:04kbingham: Hola! I got pinged to look at the backlog here relating to libcamera ... I saw a few questions / topics above I could answer/comment on ... But I think the conversation is over. I'll add them here in case it helps clarify anything but I'm always around to talk about libcamera too, (or we're over in #libcamera too)
10:06kbingham: > the problem is that the KMS color pipeline is (so far thought to be)
10:06kbingham: libcamera also has to deal with colorspace issues and configuration too all the way through the pipeline. We are trying to take a lot of care to ensure we can expose CCM's from the hardware, but also ensure we can track how the colorspace is conveyed from the hardware and through all the transforms it can have through the pipeline. Generally colors are always hard ;-)
10:06kbingham: > <pq> javierm, and why is libcamera not completely in the kernel side of UAPI? - airlied and sima asking ;-)
10:07kbingham: libcamera handles the userspace side and doesn't fit in kernel space. In particular there are lots of parts that need algorithms to run that are not suited to kernel side, and then management of sending buffers from one device to another. That all needs to be configurable as well.
10:07kbingham: > <sima> javierm, afaik there's not a lot of actual recent products fully using mediactl api
10:07kbingham:
10:07kbingham: There's quite a few from my perspective ... I guess it depends on what you use.
10:07kbingham:
10:07kbingham: > <sima> cros team is working on a new kamera api
10:07kbingham: Cros team are working on a new api called kcam ... it has not been well received.
10:07kbingham: Khronos are also working on a new Kamaros API. We're working with them and we would expect the linux implementation of that to be ... (I'm sure you can guess) ... libcamera.
10:07kbingham: > Intel IPU6 ..
10:08kbingham: Yes, work in progress.. There are some key technical challenges there. (Well maybe they're political challenges as if Intel were happy to open things up the technical difficulties would go away :D)
10:08kbingham: > <sima> unless you want a multi-year experimental phase to figure out the lib api, before anything lands
10:08kbingham: libcamera is about 5 years into this ... Second best time to start is now ? ;-)
10:08kbingham: <end of me commenting on backlog>
10:42pq: kbingham, thanks!
10:53sima: kbingham, thx
11:31pendingchaos: is there any reason why NIR doesn't allow vec6/7 and vec9-15?
11:31pendingchaos: I expect it would be simpler to support every size below 17
12:05jenatali: pendingchaos: Is there a reason to do so? There's some alu opcodes that need to be replicated by vector size and it'd be a lot of bloat
12:46pendingchaos: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636#note_1911767 was brought up for vec6
12:46pendingchaos: all of the alu opcodes are or can be generated, so it would just be some switch cases like nir_fdot() helper (and that can be fixed by generating count->op helpers for reduction opcodes)
12:50pendingchaos: it would also make the rules for valid vector sizes simpler (<=16 instead of the weird <=5 || ==8 || ==16 thing)
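The two rules pendingchaos contrasts can be written out as a minimal C sketch. The function names `current_rule_is_valid` and `proposed_rule_is_valid` are hypothetical, not NIR's actual helpers, and the lower bound of 1 component is an assumption the chat does not state:

```c
#include <stdbool.h>

/* Rule as quoted in the chat: <=5, or exactly 8, or exactly 16.
 * The lower bound of 1 is an assumption made for this sketch. */
bool current_rule_is_valid(unsigned num_components)
{
    return (num_components >= 1 && num_components <= 5) ||
           num_components == 8 ||
           num_components == 16;
}

/* Proposed rule: every size up to and including 16 (same assumed lower bound). */
bool proposed_rule_is_valid(unsigned num_components)
{
    return num_components >= 1 && num_components <= 16;
}
```

Under this sketch a vec6 is rejected by the first rule and accepted by the second, while the maximum of 16 is unchanged.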
12:56karolherbst: with stuff like that I always feel like there is no actual reason to do it besides "I want a vec src"
12:56karolherbst: if the hardware requires a vec6 src, sure, but I highly doubt that
12:56karolherbst: and we kinda should stop adding arbitrary vec sizes
12:56alyssa: 08:01 < pq> airlied, but the only difference with plugging in new hardware is that battery drains faster - not that image becomes crappy.
12:57karolherbst: I could argue the same for nvidia, but we have two vec4 sources, that's it
12:57alyssa: counterpoint, Night Light in both KDE and GNOME is KMS only and just refuses to work if the display controller doesn't support it, neither implements the gl fallback
12:57karolherbst: sure, we could _push_ those tex arg inside a vec6 for certain tex operations, but.... why
12:57alyssa: so IDK if compositors will bother with gl fallbacks if It Works On My Machine
12:58alyssa: 08:05 < sima> I mean the trouble with that is if your fallback isn't full gl/vk but just a fancy 2d blend engine
12:58alyssa: wasn't this the libliftoff promise?
12:59pq: alyssa, that's a compositor's design choice. You can totally implement night light without KMS too. Well, any desktop compositor must.
13:00sima: alyssa, yeah this was in reply to airlied's suggestion that the kernel return a spirv shader that implements what the hw does
13:02pendingchaos: the hardware does use a vec6 src, it puts all texture sources into a single vector
13:02pendingchaos: 3 components for cube/2darray coordinates; then offset, bias and compare
13:03pendingchaos: this is a somewhat extreme edge case, so maybe we could just round the NIR vec to vec8, but why not add more vec sizes if we aren't increasing the max?
13:03alyssa: pq: CAN, yes. But if the two big desktop compositors don't.. well..
13:03pq: alyssa, fallbacks are already implemented for KMS overlay and cursor planes, too. No desktop compositor assumes they always work for everything they throw at it, so they already have a GPU or software renderer to do the same job.
13:04alyssa: I just don't have a lot of faith in mutter implementing standards at this point.
13:04pq: alyssa, file bugs to them.
13:04pq: ok
13:04alyssa: gfxstrand: clearly we need you to bring back my Faith
13:05alyssa: :p
13:05MrCooper: alyssa: Night Light is a compositor internal feature, what does it have to do with any standards?
13:06alyssa:is just grumpy and will show herself out
13:06karolherbst: pendingchaos: nir_serialize does some weird magic on the vec size for now and it's kinda packed, but yeah, could be changed
13:08pendingchaos: nir_serialize should already work for all values below 256
13:08pendingchaos: below 2**32 if you change some local variables from uint8_t to uint32_t
13:10karolherbst: pendingchaos: `encode_num_components_in_3bits`
13:11karolherbst: but yeah.. seems like it handles non-special ones just fine
13:13karolherbst: we do support vec5 already, so.. maybe it would be fine then
13:13glehmann: pendingchaos: maybe we don't need a vec6 if you let it create a vec3 and override the regClass according to base in aco_isel_setup?
13:15pendingchaos: I'm not sure if I like that kind of mismatch between NIR and the IR
13:15pendingchaos: might be fine since we're directly using the intrinsic destination by the tex instruction though
13:20pendingchaos: if vec6 becomes a thing, I'd like to keep the current approach
13:43alyssa: 117 files changed, 289 insertions(+), 334 deletions(-)
13:44alyssa: barely worth it, boo
14:00alyssa: --
14:01alyssa: Is the BITFIELD_BIT/BITFIELD_BIT64 distinction actually load bearing?
14:01alyssa: I've seen it cause bugs and it's getting in the way of unifying to a single BIT macro
14:01alyssa: if we just define as BITFIELD_BIT64 always.. is that going to tank perf somewhere? is C really that stupid?
14:53karolherbst: alyssa: __typeof__ exists
14:54alyssa: karolherbst: not sure that helps in this case
14:54alyssa: it's often used with constants
14:54karolherbst: mhhhh, but it operates on the bitfield array, no?
14:54alyssa: no
14:54karolherbst: ohh wait
14:54alyssa: it's literally just (1 << x) and (1ull << x) respectively
14:54karolherbst: it's the other thing
14:55karolherbst: yeah.. feels stupid to have both kinda...
14:55karolherbst: make BITFIELD_BIT do 64 bit internally and remove BITFIELD_BIT64?
14:56karolherbst: if it overflows the destination I'm sure the compiler complains
14:58alyssa: right, that's the thing I'm wondering if it'll be a perf issue
14:58robmur01: other way round - assigning "uint32_t y = (1ull << 32)" should silently truncate to 0, where "(1 << 32)" would be able to warn
14:59robmur01: about shifting beyond the range of int
14:59alyssa: robmur01: https://www.youtube.com/watch?v=tas0O586t80
15:05jenatali: IIRC MSVC actually has an additional warning here compared to clang/gcc
15:05jenatali: But no __typeof__
15:29karolherbst: robmur01: uhh.. why ....
15:30karolherbst: this is so stupid...
15:31robmur01: because narrowing of unsigned types is well-defined, unfortunately :)
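robmur01's point about silent truncation can be illustrated with a minimal C sketch. The macro `BIT64` and the helper `narrow_bit64` are hypothetical stand-ins written for this example, not Mesa's definitions; the macro simply mirrors the `(1ull << x)` form alyssa quotes:

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit-only bit macro. */
#define BIT64(b) (1ull << (b))

/* Store a 64-bit bit mask into a 32-bit destination.
 * Converting to an unsigned type is well defined in C: the value is
 * reduced modulo 2^32, so bits 32..63 are dropped without undefined
 * behaviour. The cast is explicit here; a plain assignment such as
 * "uint32_t y = BIT64(32);" performs the same conversion. */
uint32_t narrow_bit64(unsigned b)
{
    return (uint32_t)BIT64(b);
}
```

With this helper, `narrow_bit64(3)` keeps the bit, while `narrow_bit64(32)` yields 0 because the only set bit lies above the 32-bit range.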
15:37pepp: is it expected that vk_common_ResetCommandBuffer doesn't check if the command buffer state is "pending"? The doc says "commandBuffer must not be in the pending state"
15:41zmike: could be an oversight
15:45pendingchaos: I don't think vulkan drivers are expected to do these kinds of checks
15:46HdkR: That's what the validation layers are for
15:48pepp: oh ok
16:00zmike: there's a lot of validation in common code though
17:22alyssa: test regressing from a compiler change despite identical assembly, hm yes makes sense (-:
17:26karolherbst: alyssa: the shader_info stuff matters :P
17:26alyssa: karolherbst: same bits sent to the hardware, up to address randomness
17:27alyssa: Technically I *do* have this hardware with me... haven't run CTS on my personal laptop in years though, lol
17:27karolherbst: heh
17:27karolherbst: well if all bits are equal, then the hardware still wins :P
17:27alyssa: fr
17:29HdkR: Alignments and padding coming back with a vengeance?
17:29alyssa: Maybe? I'm thinking more likely C undefined behaviour somewhere
17:29alyssa: but really hard to figure that out with just CI + drm-shim
17:29alyssa: V
17:31alyssa: ok finally found a test with a difference
17:32alyssa: does not appear especially pleasant to debug
17:32alyssa: but, whatever dropping 32 bytes off every fadd instruction on aco is worth something, eye on the prize..
17:34HdkR: "X.Org Foundation Prize Compensation Official"
17:36alyssa: I see the new code is coalescing a move
17:36alyssa: which causes the whole program to be rescheduled
17:36alyssa: lovely compiler, yes.
17:37alyssa: Midgard, don't you see, I'm trying to help you
17:39alyssa:tries to figure out how to run CTS on this thing
17:39alyssa: copious amounts of rsync, maybe
17:40alyssa: ahahaha incompatible glibc versions, of course.
17:40alyssa: it would be too easy
17:40alyssa:bumps to testing
17:44alyssa: oh, I know what I should do. convert ntt and see how softpipe fares on the m1
17:45alyssa: softpipe on m1 vs panfrost on t860
17:45alyssa: *rk3399
17:45alyssa: who wins? :P
17:49robmur01: T860 wins our hearts, every time :)
17:51alyssa: robmur01: <3
17:54alyssa: De-registerizing backends definitely requires a lot more thought and backend domain knowledge than I would like, even with my magic helpers.
17:58Company: oh nice, vkcube doesn't work either
17:59Company: so it's not my code, it's TigerLake
17:59Company: wonder why it works with xcb but not with Wayland
18:01Company: I wonder if it's worth investigating or if I should wait for https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20418 to settle
18:01Company: (I'm on stock F38 atm)
18:05Company: zehortigoza, dj-death: ^^^ ?
18:08zehortigoza: Company: do you have logs? it should not matter if is X or Wayland
18:13Company: zehortigoza: vkcube shows up and then immediately exits because vkAcquireNextImage() fails with VK_NOT_READY
18:13zehortigoza: anything on journalctl?
18:14Company: zehortigoza: I haven't looked into it too much because it's a corner case, but I get it reliably when all swapchain images are in use
18:14Company: there's an error, let me quickly rebuild my test
18:17Company: zehortigoza: https://gitlab.freedesktop.org/mesa/mesa/-/blob/23.0/src/intel/vulkan/anv_device.c#L3868 triggers
18:18Company: via WaitForFences() in my app
18:21Company: and my app works fine on X11, too
18:21Company: my app = GTK, I've been hacking on the Vulkan renderer there
18:28zehortigoza: Company: huum! this could be several things... vkcube works fine on X on my Tigerlake. will give it a try later with Wayland. file a bug so we can better track this.
18:30Company: yeah, XWayland works fine - both with my GTK and with vkcube
18:38Company: zehortigoza: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9044
20:34alyssa: I may regret asking, but why does llvmpipe call nir_convert_from_ssa?
20:34alyssa: I mean, LLVM itself is SSA so why not just translate phis?
20:34alyssa: airlied: ^^
20:36DavidHeidelberg[m]: does this spam-storm happen time to time: ERROR - dEQP error: MESA: warning: vk_log*() called with client-invisible object 0x55c31bd83d60 of type VK_OBJECT_TYPE_DEVICE? https://gitlab.freedesktop.org/mesa/mesa/-/jobs/41927818
20:37alyssa: DavidHeidelberg[m]: there's an MR to fix that
20:37alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22986
20:37alyssa: I guess we should just assign to marge
20:37DavidHeidelberg[m]: alyssa: nice. not my fault. Much happy. Debian 12 CI almost ready. (except zink+anv traces fail)
20:38alyssa: Lol
20:38alyssa: DavidHeidelberg[m]: please assign to marge if the patch is sane
20:39DavidHeidelberg[m]: I would say it looks OK. acked, assigning (at least CI will not get spammed)
20:39anholt_: alyssa: you can't just map the phis to llvm, because you're doing manual exec mask management instead of control flow.
20:39DavidHeidelberg[m]: thanks alyssa
20:41alyssa: anholt_: right. I regret asking now ;P
20:44airlied: alyssa: yeah we don't direct convert from single lane NIR to single lane LLVM
20:44airlied: that wouldn't be very fast
20:54Kayden: vec1 64 ssa_41 = i2i64 ssa_39
20:54Kayden: vec1 64 ssa_42 = deref_array &(*ssa_40)[ssa_41] (global uint) /* &((Indirect *)ssa_34)->values[ssa_41] */
20:54Kayden: we have.....64-bit array indexes?
20:55Kayden: aren't variables limited to 4GB typically?
20:57Kayden: only case I can think of where that would be needed is the last unsized array in an SSBO or something, but even then I'm not sure
21:05alyssa: airlied: Well, I have helpers to have our nir reg cake and eat it too, I can translate llvmpipe
21:05alyssa: *gallivm
21:05alyssa: and zink if we really insist on not passing phis through to spir-v.
21:07alyssa: more worried about all the random little backends (lima, etnaviv, r600, ..)
21:16idr: Kayden: Is the i2i64 something the shader is explicitly doing?
21:19Kayden: no, it's coming from vtn_access_link_as_ssa
21:24Kayden: I guess the deref chain as a whole is 64-bit involving pointers...
21:24alyssa: OK, finally managed to get a T860 testing environment locally
21:24Kayden: but array indexes really shouldn't be 64-bit IMO
21:24alyssa: and yep, can reproduce the regression. despite no change in the assembly. Uh.
21:57DemiMarie: As a security researcher, please run as little as possible with kernel privileges. It’s okay for there to be a userspace lib that is maintained as part of the kernel, but if something can run without kernel privileges, it should unless there is an extremely good reason not to.
22:09alyssa: SCREAMING
22:11alyssa: karolherbst: Identical disassembly, different bytes (-:
22:14alyssa: yes, this is probably a midgard backend bug, or two
22:14alyssa: no, I'm not about to go fixing deep midgard backend bugs
22:14alyssa: test passing now anyway. yoof.
22:16alyssa: deqp is a lot happier now, lol
22:16alyssa: down to 2 fails in deqp-gles2, ok
22:37cmarcelo: is @nora from gitlab on IRC?
22:38zmike: cmarcelo: are you planning to review my spirv debug MR
22:40cmarcelo: zmike: missed that, will look at it now
22:40zmike: thx
22:58karolherbst: alyssa: oof
22:59karolherbst: cmarcelo: doubt it, but what's the matter?
22:59gfxstrand: alyssa: What do you need me for?
23:00gfxstrand: Oh, mutter. 🙄
23:00gfxstrand: It's possible I have a commit in mutter. I don't think so but I know I filed some bugs back in the day.
23:02cmarcelo: karolherbst: rusticl registers a callback with spirv_to_nir to get all the error messages and apparently is printing all of them regardless of the level, apparently this generates noise for conformance.
23:03karolherbst: uhh.. I think I did that
23:03cmarcelo: karolherbst: spirv will pass the level, but I wonder if the rusticl side has/should-have an env var to toggle off (or on) the debug output.
23:03cmarcelo: (i.e. is probably fine ignoring the level, as long as you give rusticl users a way out of the prints)
23:03karolherbst: let me see the code
23:03karolherbst: *check
23:03cmarcelo: karolherbst: thanks!
23:05karolherbst: huh..
23:05karolherbst: it shouldn't...
23:05karolherbst: cmarcelo: I've added some RUSTICL_DEBUG=spirv thing to make it do that
23:05karolherbst: but if that's not used it shouldn't print any errors by itself
23:05karolherbst: unless spirv_to_nir is doing this
23:06cmarcelo: karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9050
23:06karolherbst: and it usually does so in debug builds
23:06cmarcelo: karolherbst: comment there so @nora can see it. :-)
23:07cmarcelo: I don't think spirv code is printing those, as it follows the log level env var we have.
23:07karolherbst: ehh
23:07karolherbst: that's expected behavior actually
23:07karolherbst: on debug builds, the vtn log level can't be changed
23:08karolherbst: or the other way around? uhm...
23:08cmarcelo: vtn side seems solid here. the thing is we ALWAYS give everything to the debug callback (msg, level, etc)
23:09karolherbst: yeah...
23:09karolherbst: but there is no debug callback installed by default
23:10karolherbst: I'll ask nora what the initial case was
23:14karolherbst: Okay, seems like nora did set RUSTICL_DEBUG=program
23:16cmarcelo: karolherbst: cool thanks!
23:29karolherbst: alyssa: if I don't forget to check next week, please remind me to verify that all `nir_register` handling inside codegen is dead code
23:30karolherbst: actually...
23:30karolherbst: uhhh
23:31karolherbst: forget what I said, we still call nir_convert_from_ssa(true)
23:31karolherbst: uhhh
23:32alyssa: karolherbst: boop
23:32karolherbst: it's super stupid in codegen
23:32karolherbst: so you know what codegen does first coming out of nir?
23:32alyssa: convert to ssa
23:32karolherbst: not quite, do some pre SSA lowering, _then_ convert to SSA
23:32alyssa: heh
23:33alyssa: karolherbst: Good news for you, you can just translate the new intrinsics to moves totally naively and then the nouveau optimizer should crunch through them
23:33karolherbst: I'd rather consume SSA directly than make use of the new reg things :D
23:33alyssa: and you don't need to bother with all my fancy new reg stuff
23:33alyssa: or that yes
23:33alyssa: or you could just switch to NAK
23:33alyssa:devil emoji
23:33karolherbst: yeah.. the only reason nir_register stuff gets handled is because of resolved phi nodes
23:34karolherbst: besides that it's all clean SSA
23:34karolherbst: the TGSI code path isn't though.. but who cares anyway
23:34karolherbst: anyway, it's something I can probably look into next week