00:15 airlied: jenatali: does the libclc mr still rely on libclc patches that haven't landed? or can it get split up?
00:16 jenatali: airlied: Yeah, there's a few patches at the end that rely on https://reviews.llvm.org/D85911
00:17 jenatali: airlied: We can stop at a631d531 (inclusive) and save the last few for after that lands
00:20 airlied: jekstrand is probably good for acks/rb for the internal flag + extern C patches at least
00:27 emersion: MrCooper: you said we could POLLOUT on dmabufs to know when they are ready instead of having to resort to explicit sync
00:28 emersion: MrCooper: however it seems like we have some dmabufs that sometimes POLLOUT, sometimes don't -- a blocking poll() will wait forever
00:28 emersion: MrCooper: any idea what this could come from? could this be because the dmabuf is already ready?
00:45 airlied: jekstrand: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6552
02:10 Plagman: seems like it ends up waiting for whatever is the latest commit in some cases, not the commit we queued the wait on
02:23 airlied: anholt: btw you seen that request for dev access to piglit?
02:25 airlied: anholt: on the mailing list, I'm happy to ack it
02:29 bnieuwenhuizen: airlied: I've also had somebody complain that if you explicitly select a device the device-select layer still exposes all devices
02:29 bnieuwenhuizen: any issues with only showing matching devices if there is an explicit selection?
02:35 airlied: bnieuwenhuizen: seems wierd why you'd care in that case
02:36 airlied: it's only meant to control the ordering
02:36 airlied: bnieuwenhuizen: had they something broken by it?
02:38 bnieuwenhuizen: airlied: not quite sure
02:39 bnieuwenhuizen: on the other hand I couldn't thing of a reason why we keep exposing all devices
02:40 jenatali: bnieuwenhuizen: Not sure who's doing the selecting here, but if the app ever needs to deal in shared/external resources, it might need the ability to enumerate all devices to be able to use the right one
02:40 jenatali: At least, that's one of the main reasons we only adjust enumeration order instead of hiding
02:41 bnieuwenhuizen: jenatali: this is about a vulkan layer where the user explicitly selects the device
02:41 bnieuwenhuizen: the app selecting another device afterwards seems like a surprising behavior to me
02:43 airlied: bnieuwenhuizen: I just can't see the advantage to hiding the other devices, unless the app is doing something tricky
02:44 bnieuwenhuizen: airlied: tricky apps exist: example is renderdoc
02:45 airlied: my intent was only to reorder, never remove, esp when device groups exist as well (and we don't do those right either)
02:46 airlied: I suppose we could another env var, which might be overkill
02:46 bnieuwenhuizen: yeah
02:46 bnieuwenhuizen: (on the overkill)
05:57 Plagman: MrCooper: related to the problem above, the compositor's async compute operations are still getting implicit synced to the commit that's not yet done, even though we're specifically waiting for a done commit and using that buffer - it only happens when xwayland is blitting
05:58 Plagman: when xwayland is flipping we properly run when we want
05:58 Plagman: is it blitting to a single xwayland surface or something?
07:03 danvet: airlied, mail from sfr ... I guess time for a backmerge into drm-next?
07:17 pq: danvet, "reset modifier list...", wow, that's a lot required functionality for atomic userspace I've never even heard of, I think.
07:18 pq: Lyude, does that HDR backlight thing need anything from userspace?
07:18 danvet: pq, you can also do it full generic (which I thought is what weston does)
07:19 danvet: what I described is just the minimal hacked up thing to not fail at modifiers
07:19 danvet: and you don't really need to keep around the list of modifiers either, you can just do this every time you alloc
07:26 pq: danvet, um, what's the full generic thing? Get IN_FORMATS, pass it in to GBM, and that's all?
07:27 pq: all the alloc'ing is delegated to GBM surfaces
07:28 pq: but GBM can't know or test if the resulting FB would work
07:28 pq: e.g. against bandwidth limits of the current set of active CRTCs
07:28 danvet: yeah so if the resulting fb doesn't work
07:28 danvet: you throw it away, remove that modifier, and try the next one
07:29 pq: right... I don't think weston implements that yet
07:30 pq: because it requires you to get the buffer out of a GBM surface first for testing, doesn't it?
07:30 pq: also getting the buffer out is the only to know which modifier GBM chose
07:32 MrCooper: Plagman: yeah, Xwayland 1.20.y still keeps re-using a single buffer without Present flipping, fixed in xserver Git master
07:32 Plagman: ok that explains it
07:32 Plagman: probably explains weird things with the poll() when we get behind as well, it's essentially losing a race over and over right?
07:32 pq: 1. get IN_FORMATS as the modifier list. 2. create GBM surface with the mod list. 3. glClear. 4. get buffer out. 5. atomic test. 6. if fail, remove the mod from list and go to 2. 7. put back the buffer to GBM. 8. go on with your actual rendering.
07:32 MrCooper: and yeah, polling the dma-buf FD only returns when there are no unsignalled fences anymore, which could never happen if the buffer keeps getting rendered to
07:32 Plagman: ok
07:32 pq: emersion, ^
07:33 Plagman: i think it's abnormal/not desirable that we're blitting in the first place, but going multi-buffer wuold be quite a bit better while we fix that
07:34 Plagman: what's the behaviour in xserver master exactly, does it make as many buffers as the app's?
07:34 MrCooper: it should only blit when there's no choice
07:34 Plagman: (does it make them on demand or upfront?)
07:34 MrCooper: on demand
07:35 Plagman: cool
07:35 Plagman: i'll give it a shot thanks, confirming my suspicion helps a lot
07:35 MrCooper: glad to be of help
07:36 Plagman: i don't know all the nuances, but i'm not sure there's ever 'no choice' around flipping in a scenario like gamescope
07:36 Plagman: where there's not really an xwayland desktop, just getting the app's buffers forwarded to us should be acceptable in all cases pretty sure
07:36 MrCooper: it depends on the X11 window hierarchy
07:38 MrCooper: the dimensions of the window presented to must exactly match those of the toplevel (root window child) window, and must not be clipped by any children
07:40 Plagman: i think even with mismatching dimensions (which is probably all the cases we run into for games) we could still do something useful with the buffer
07:42 MrCooper: if the buffer is bigger, possibly; not if it's smaller though
07:43 airlied: danvet: yeah was considering a backmerge, mught wait for rc4
07:43 MrCooper: Plagman: if you can file a GitLab issue about specific cases where it's not flipping but you think it could, I can look into it
07:44 Plagman: i'm sure we can do more or less the exact same thing we're doing today with a smaller buffer
07:44 Plagman: just without the xwayland copy
07:44 Plagman: our whole thing is to stretch the buffer and lie to the app about it so i can't imagine it'd be a problem
07:45 Plagman: yeah definitely will file issues once we get the low hanging fruit from our side of things
07:45 MrCooper: a flipped buffer becomes the window pixmap, so the pixmap has to be at least as large as the toplevel window
07:45 Plagman: there was already a tweak in proton for d3d9 games that made a bunch of them flip
07:46 Plagman: ok i see, it's more about not breaking the x11 contract with clients
07:46 Plagman: than what the compositor can and cannot do
07:46 Plagman: sorry i'd misunderstood
07:46 MrCooper: right, it's mainly X11 constraints, Wayland is a lot more flexible
07:50 Plagman: by the time the compositor gets it, it doesn't know it was for a subwindow, presumably?
07:51 Plagman: trying to think if i could detect it and try to forcefully resize some windows to make it work out
07:53 MrCooper: which compositor gets what exactly? :)
07:54 Plagman: a wayland compositor getting a dma-buf from a buffer commit (here gamescope through wlroots)
07:55 pq: Plagman, your Wayland compositor is also the X11 WM. Maybe you could use the force of X11 and go change application Windows behind its back, even though it's not what X11 WMs are supposed to do? :-P
07:55 Plagman: yep that's essentially what i'm wondering now
07:56 Plagman: i was thinking about detecting that the commit was originating from a subwindow before trying for remediation, as i'd hate to mess with the game windows if i don't have to
07:57 MrCooper: there's only one Wayland surface for the X11 toplevel window, though I guess the Wayland compositor could try matching the damage region to the X11 window hierarchy
07:57 pq: no, I think you have to do it before any commits you might be interested in, otherwise the buffer is already not what you wanted
07:58 pq: or like MrCooper says, you only get commits from the top-level by definition
07:59 pq: so you have to first make sure the child window you want matches the top-level before you can get the commits from the child
08:01 pq: all you need for that is the X11 window hierarchy, right? Plus magic knowledge of which Window is actually the one you want.
09:42 danvet: bnieuwenhuizen, no new version of amdgpu modifiers, or did I miss it when catching up with mails?
10:01 bnieuwenhuizen: danvet: was hoping for some more substantive feedback from the AMD folks. At this point I only have nits on the last patch from Marek ...
10:04 danvet: bnieuwenhuizen, yeah I was hoping we could reach some conclusion on the entire "when should we sample legacy bo flags" thing I brought up
10:05 danvet: imo sounds like we could do that at addfb time like for i915, which would be nice for consistency I think
10:05 bnieuwenhuizen: danvet: one complication there I noticed is that the legacy flags don't contain enough information to determine a modifier
10:05 danvet: but then we don't have modifiers for all generations in amdgpu
10:05 danvet: uh
10:05 danvet: what's missing?
10:05 bnieuwenhuizen: probably enough to determine the display part of the modifier, but in cases where we have a render and a display compression surface they don't tell anything about the render one
10:06 danvet: bnieuwenhuizen, well for addfb we only care about the display side
10:06 bnieuwenhuizen: and it is hard to determine the offset of a plane that isn't specified
10:06 danvet: so I guess that should be enough?
10:06 bnieuwenhuizen: danvet: I think the modifier/offsets of the fb can be read back for e.g. screenshot tools?
10:06 danvet: hm
10:06 danvet:checks
10:07 bnieuwenhuizen: we could do it by keeping a shadow copy of the modifier & offsets, but at some point one wonders if it isn't cleaner by just putting the DC structs in the framebuffer and populate them at addfb time
10:07 danvet: bnieuwenhuizen, indeed
10:08 danvet: bnieuwenhuizen, also I just realized that when you set dev->mode_config.allow_fb_modifiers
10:08 danvet: you have to fill this out at addfb time
10:08 danvet: or getfb2 breaks
10:08 danvet: we should probably spec that somewhere
10:08 danvet: bnieuwenhuizen, yeah I think that would be even cleaner
10:09 danvet: bnieuwenhuizen, so there's not a display-only modifier which avoids the render compression plane side?
10:10 bnieuwenhuizen: there is one that we can use but it is not usable by mesa on most GPUs
10:11 danvet: *shrug*
10:11 danvet: better correct than corrupt
10:11 danvet: I mean all this breaks is smooth transitions between legacy compositor not using modifiers
10:11 danvet: and a new compositor using modifiers and getfb2
10:11 danvet: I'm also typing up a doc patch to clarify this
10:16 bnieuwenhuizen: oh hmm, the modifier I had in mind doesn't even work for larger chips ...
10:18 danvet: bnieuwenhuizen, https://paste.debian.net/1162276/ thoughts?
10:18 danvet: before I send it out for wider discussion I mean
10:19 bnieuwenhuizen: doesn't DRM_FORMAT_MODIFIER_INVALID technically also work?
10:20 danvet: getfb2 unconditionally sets DRM_MODE_FB_MODIFIERS if allow_fb_modifiers is set
10:21 bnieuwenhuizen: right, but that is kernel code so can be changed right?
10:22 danvet: maybe it's uapi
10:22 danvet: I'd say it should be, since if userspace runs in modifier mode
10:22 danvet: it probably wants getfb2 to give it modifiers
10:22 bnieuwenhuizen: hmm :|
10:22 bnieuwenhuizen:adds another modifier flag ...
10:24 bnieuwenhuizen: danvet: otherwise seems reasonable
10:25 danvet: bnieuwenhuizen, maybe best if you reply on-list with the issue for amdgpu modifiers
10:25 danvet: so we can discuss this a bit wider
10:25 danvet: hm maybe should have cc'ed emersion and pq too
10:25 emersion:reads
10:25 danvet: https://patchwork.freedesktop.org/patch/388742/?series=81262&rev=1
10:30 DPA: danvet: Regarding this: https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/497#note_613712
10:30 DPA: What would that change be? Is it replacing the 1 in this line with a 2? https://gitlab.freedesktop.org/xorg/xserver/-/blob/master/hw/xfree86/drivers/modesetting/driver.c#L1076
10:30 DPA: In that case, I can make an MR if this is needed.
10:30 DPA: Also, as far as I know, only you guys know how the whole drm/dri stuff works, and you guys maintain the xserver, right? Doing such changes is part of maintance work that
10:30 DPA: only you guys know that it needs to be done and how to do it. That doesn't mean that there aren't other people who care about Xorg, even if it is only in the sense that they rely on
10:30 DPA: and want it to work so they can keep using it.
10:38 danvet: DPA, xserver is unmaintained
10:38 danvet: hasn't seen a release since over 2 years
10:38 danvet: *main release
10:38 karolherbst: fix for anything X related: don't use X :p
10:38 danvet: and yes it would be just changing that one line, but that change also doesn't give you any benefit
10:38 danvet: yeah
10:38 danvet: X is dead
10:38 danvet: more or less
10:39 karolherbst: time is better spent on figuring out issues with wayland
10:39 karolherbst: but yeah...
10:39 karolherbst: we still have X around which is annoying
10:40 karolherbst: maybe we should figure out if there will be a 1.21 release or not...
10:40 karolherbst: I know that _some_ are working on getting things worked out
10:41 karolherbst: personally I also want tegra devices to just work, but we are in this in between state where doing the right thing breaks other drivers
10:41 karolherbst: like intel
10:41 danvet: imo 1.21 Xwayland-only release would make sense
10:41 pq: You still haven't seen the full power of Wayland, which is dozens of KMS clients making different assumptions on what works. :-p
10:41 karolherbst: yeah, just rip out the entire X server
10:41 karolherbst: pq: lol.. :D
10:42 karolherbst: I mean.. fact is for a normal user deskop wayland is already the way to go
10:42 karolherbst: it's not waylands fault that some desktops are just broken with wayland
10:42 karolherbst: and I wished we would just say: X is dead, no fixes, no releases, deal with it
10:42 karolherbst: and no support whatsoever
10:42 pq: haha, touché, but try to explain that to an average user
10:43 karolherbst: the thing is, people depend on X getting support
10:43 karolherbst: deskktops mainly
10:43 karolherbst: so they are in this "X works, and wayland needs more work"
10:43 karolherbst: and that will continue forever
10:43 karolherbst: I think we just need to do it the python2 way: state a date after which it is dead
10:43 karolherbst: and dead for real
10:44 karolherbst: (but they still did a bug fix release... for whatever reason)
10:45 DPA: I see...
10:45 karolherbst: maybe we should just announce at XDC that X will be dead in 2021 Lo
10:45 karolherbst: :p
10:45 pq: anyway, what I was referring to is that instead of Xorg drivers and maybe Kodi, you now have a lot more KMS clients that eat away kernel devs' room to maneuver around undocumented UAPI.
10:46 karolherbst: undocumented UAPI == documented UAPI :p
10:46 danvet: pq, I think it's a lot better than "every kms driver has it's own separate ddx making incompatible assumptions about how kms works"
10:46 karolherbst: undocumented just gets documented on how people use it
10:46 danvet: pq, thus far the only real screw-up was the atomic in -modesetting I think
10:47 danvet: and we've clarified a lot of the corners one way or another
10:47 pq: danvet, I agree, but I also recognize the possibility of going the totally opposite direction that depends on Xorg forevermore.
10:47 danvet: pq, I don't think so
10:48 danvet: there's no one willing to pay the 6-7 figure to make Xorg-modesetting atomic and modifier aware
10:48 pq: I'm glad the opposite direction didn't stick.
10:48 danvet: and just too many features that we're building on top of that stuff
10:48 danvet: and Xorg consulting is expensive mostly because the 5 or so worldwide experts just don't want to do it, so it's a premium :-)
10:48 pq: no, I mean to keep every driver seeming using the same KMS UAPI but with totally different behavior and the making up for the different in Xorg driver API.
10:49 danvet: pq, I think nvidia's eglstreams is bad enough that everyone tries to avoid that mess
10:49 pq: like a world where Wayland didn't happen
10:49 danvet: yeah
10:49 danvet: I think cros has a bunch of vendor stuff all over
10:50 danvet: and android also seems very fond of very stuff all over
10:50 pq: yeah, just saying I can imagine it, nothing more :-)
10:51 pq: Wayland is young, there is still time to accidentally pull the kernel into knot. ;-)
10:55 pq: emersion, for C8, gamma LUT is the color map.
10:55 pq: AFAIU
11:15 emersion: oh
11:16 emersion: good point
11:16 danvet: yeah C8 is special
11:16 danvet: R8 is the greyscale thing
11:17 danvet: and addfb1 gives you C8 if you ask for bpc == 8
11:17 danvet: argh I mean bpp = 8
11:19 emersion: makes sense
11:19 emersion: i guess skip gamma if format is RGBA8888 or any permutation fo that
11:19 emersion: of*
11:20 emersion: that's just being nice to gamma-less drivers like vkms
11:20 emersion: (may also be useful for new drivers that don't implement everything yet)
11:30 danvet: cubanismo or someone else from nvidia around here?
12:44 daniels: danvet: not until west coast time I'd imagine
12:45 danvet: daniels, some nicknames I could put on my alert list?
12:47 daniels: cubanismo is usually jamesjones on here IIRC, and usually more in #xorg-devel than #dri-devel
12:47 daniels: aplattner's the other one, but he's more libEGL.so than actual winsys
12:48 Lyude: pq: I believe so, but hopefully it shouldn't be too complicated
13:03 pq: Lyude, how would userspace be involved? Adjusting the SDR brightness level?
13:04 pq: Lyude, I haven't actually heard of HDR max brightness being adjustable yet.
13:04 Lyude: pq: requesting hdr backlight mode on and off, and potentially also controlling it in response to changes from battery to AC
13:22 LaserEyess: /41/
13:22 LaserEyess: oops
13:56 pq: Lyude, wouldn't HDR backlight mode be dependent on putting the monitor into HDR mode via wire format? Or is the backlight the only major difference between SDR and HDR?
13:57 Lyude: pq: I think so, that and the battery vs. ac optimization
13:57 Lyude: And I think the two might be separate?
13:57 pq: well, that sounds very different from external monitors
13:58 Lyude: pq: it is yeah
13:59 pq: IOW, no magic gamut or tone mapping in the monitor, but the wire pixel values go straigh to the panel?
14:00 pq: sounds like an actual reason to implement the tone mapping according to HDR metadata in display servers rather than relying on monitors to do it
14:08 Lyude: pq: there is some tone mapping and color gamut as well oops, nearly forgot about that
14:08 Lyude: But I think that works the same way as normal monitors for the most part, albeit it some of it needs to be programmed in the kernel through special Intel specific dpcd interfaces for certain lcd panels
14:09 Lyude: erm, *certain chipsets
14:09 Lyude: But userspace doesn't care about that
14:09 pq: oh, disappoint
14:09 Lyude: (haven't had my coffee yet :)
14:10 pq: but that again raises the question, why would the backlight have a separate userspace-toggleable switch for HDR/SDR?
14:10 pq: or is it "full HDR" vs. "battery saving HDR"?
15:29 Lyude: pq: yeah-full hdr vs. battery saving hdr
15:30 Lyude: pq: there's also some different content modes ijrc
15:30 Lyude: *iirc
16:19 mslusarz: my Mesa MR with trivial changes (6126) timed out on 2 CI jobs, should I reassign to Marge or someone wants to take a look?
16:21 daniels: mslusarz: thanks for asking! yep, the freedreno one is being actively worked on (fix should hopefully land in the next day or two), and the glsl test is something I thought had been fixed, but it turns out it only made it less likely :(
16:27 pepp: btw this MR shouldn't run most tests but test-source-dep.yml declares all src/mesa/ subfolders as dependencies for all drivers (so your changes to src/mesa/drivers/dri/i965/brw_blorp.c creates all test jobs)
16:28 FLHerne: mslusarz: Also for future reference: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3437 "Tracker for CI failures"
16:38 mslusarz: thanks guys
17:24 ajax: anyone know of like a cli tool that takes an integer as an argument and prints the GL enum name for it?
17:29 anholt:has something rigged up using nickle to do int to hex for me
17:29 ajax: there's a name i haven't heard in a long time
17:30 anholt: yep. my hands are still bound to it.
17:30 karolherbst: ajax: you can probably just git grep the hex against mesa/include
17:31 ajax: karolherbst: if i didn't find that tedious i wouldn't be asking
17:32 ajax: it's really liberating to just be able to say 'errno 5' and have it tell me that means EIO instead of needing my hands to memorize grep [arbitrary] /usr/include/*GL*/*.h
17:33 ajax: also not all of the values in those headers are in hex (argh)
17:33 ajax: so, cool, need to know the special cases for that too.
17:34 ajax: i should shut up and write the thing i suppose
17:34 anholt: ajax: we could have a little src/tools/ that used _mesa_enum_to_string()
17:36 ajax: ooh, that'd be clever
17:37 anholt: I mean, my first thought was to use the gl_generator crate in rust.
18:49 jekstrand: karolherbst: Well, after close to a whole day staring at it, I've got the global_work_offsets test passing
18:50 karolherbst: nice
18:50 jekstrand: Turns out it was an iris state tracking bug. :-(
18:50 karolherbst: :/
18:50 karolherbst: kernel input buffer size needs to be aligned or something similiar annoying?
18:50 jekstrand: I've found a few iris bugs in this process....
18:50 jekstrand: No, we have to re-emit push constants if the local group size changes
18:50 jekstrand: We weren't
18:50 karolherbst: ...
18:50 karolherbst: that's even more annoying
18:50 jekstrand: So it was always the second one that failed
18:51 jekstrand: Because the first one was fine and then the second one used a bigger group size and boom
18:51 jekstrand: More patches for Kayden to review, I guess. :)
18:51 karolherbst: what a coincidence, I have some for you as well :p
18:51 jekstrand: Oh, nice!
18:52 karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6569
18:52 jekstrand: karolherbst: I'll trade you for deref alignments. :)
18:52 karolherbst: somehow we have to fix passes :/
18:52 jekstrand: karolherbst: Ooh! Does that one give us shader constants?
18:53 karolherbst: https://gist.github.com/karolherbst/ad7eecbadb8f888cd86202178d9cbadd
18:53 karolherbst: the else block
18:53 karolherbst: you have a quite direct access
18:53 karolherbst: but...
18:53 karolherbst: with my code it still ends up getting pushed into the buffer
18:54 karolherbst: probably just wrong order or we need to do some trivial opts
18:54 jekstrand: karolherbst: Just need more optimizations
18:54 karolherbst: which one specifically?
18:54 jekstrand: karolherbst: I think running constant folding should clean it up
18:54 karolherbst: mhhh
18:54 karolherbst: on derefs?
18:54 karolherbst: interesting
18:55 jekstrand: karolherbst: Hrm....
18:55 jekstrand: karolherbst: Actually, I don't think nir_opt_constant_folding knows how to lower load_constant_global based on the base address
18:55 jekstrand: But that should be fixable
18:55 jekstrand: karolherbst: We've already got a thing for load_constant
18:55 karolherbst: ohh, I see
18:55 karolherbst: but mhh
18:55 karolherbst: I don't think that's what we want
18:55 karolherbst: that would be too late
18:55 jekstrand: We just need to detect load_constant_global(iadd(base_ptr, imm))
18:56 karolherbst: we kind of have to do that before pushing it into the buffer?
18:56 karolherbst: or do we just not care?
18:56 jekstrand: We need to do it before we get rid of nir_shader::constant_data
18:56 jekstrand: If extra stuff lands in the buffer, I don't care
18:56 karolherbst: given that clc puts a lot of stuff into it, it might actually make sense
18:56 jekstrand: dead_variables should sort the CLC stuff out
18:57 karolherbst: I mean.. outVal and outIndex won't be dead in our case, no?
18:57 karolherbst: they still end up in the buffer
18:57 jekstrand: karolherbst: They're used in the else
18:58 jekstrand: I guess it kind-of depends on what order we want to do things in
18:58 karolherbst: well.. but directly
18:58 karolherbst: not indirectly
18:58 karolherbst: we kind of only need to push out indirects into its own buffer
18:58 karolherbst: all the direct accesses we can leave alone
18:59 karolherbst: in this case we just have load_deref(deref_var) things
19:00 jekstrand: One thing I've been thinking about is a general vars-based constant-folding pass.
19:01 jekstrand: We can do it right now sort-of
19:01 jekstrand: opt_large_constants is capable of moveing stuff into nir_shader::constant_data
19:01 jekstrand: But there are other cases where we might be able to do that
19:01 karolherbst: I mean.. we just need a nir_deref_is_direct and nir_deref_get_constant or so :p
19:01 jekstrand: Maybe?
19:01 jekstrand: I just don't like crawling derefs in any more passes than we absolutely have to
19:01 karolherbst: telling if a deref is direct or indirect is something we kind of need anyway I think
19:02 karolherbst: yeah...
19:02 karolherbst: maybe we can do that in opt_deref?
19:02 karolherbst: eliminating derefs completly is somehow also an optimization of derefs :p
19:03 karolherbst: and if we run constant folding before opt_deref the indicies might actually be all plain constants
19:03 jekstrand: I'm not sure we want it in opt_deref either TBH
19:03 karolherbst: but we could start with the simple case: load(var)
19:04 jekstrand: If we weren't using pointers for constants, lower_io + constant folding woudl be enough
19:04 karolherbst: yeah...
19:04 jekstrand: We'd still end up with them in the constant buffer but that wouldn't hurt anyone
19:04 jekstrand: And dead variables would take care of the CLC case
19:04 jekstrand: But with pointers, it gets trickier to fold
19:04 karolherbst: mhhh
19:04 karolherbst: I think the buffer could grow quite a bit with clc
19:04 karolherbst: but maybe it won't matter
19:04 karolherbst: and if it becomes a problem we can change things later
19:05 jekstrand: dead code will delete any CLC stuff we don't use. Any that we do use will likely be used in full
19:05 karolherbst: not always
19:05 karolherbst: think about builtins with constant arguments
19:05 karolherbst: maybe somebody puts some sin(whatever_constant) in the kernel
19:06 karolherbst: or some other ultra complex builting needing lookup tables
19:07 jekstrand: Yeah....
19:07 jekstrand: I think we probably just want a pass for this
19:08 jekstrand: Shouldn't be hard to write
19:08 jekstrand: foreach load_deref with mode == constant, get a path. If the path starts with a variable, walk it and the initializer and try to find the tail.
19:08 jekstrand: You can probably do it in 50 lines
19:09 karolherbst: I guess
19:09 karolherbst: will try to write it and see how it goes
19:11 karolherbst: oh well... reviewing the alignment stuff first though :p
19:11 jekstrand: We probably also want a pass which "promotes" global/shared with an initializer which is never written (other than the initializer) to __constant
19:11 jekstrand: But that can come later
19:12 karolherbst: mhhh.. that's also CL 2.0 stuff, no?
19:12 jekstrand: karolherbst: I don't know
19:12 karolherbst: but... in CL to verify something is never writen is kind of... annoying
19:12 karolherbst: depends on how much we ignore "crappy code"
19:13 jekstrand: karolherbst: Yeah.....
19:13 karolherbst: I mean.. some could just do offsetof stuff and do magic offsets which just look like indirects to us
19:13 jekstrand: karolherbst: Especially for global/shared
19:13 karolherbst: but I guess if there are no indirects we are save :)
19:13 karolherbst: *safe
19:22 jekstrand: karolherbst: My fix for global size fixed 4 tests :D
19:22 jekstrand: Pass 88 Fails 3 Crashes 11 Timeouts 4
19:30 Plagman: anyone know if amdgpu has a backdoor of some sort to force dp retraining / hotplug?
19:31 Plagman: screen on my laptop is staying black after resume on 5.8, but resume seems to otherwise work better than i've ever seen it do, so i'm trying to see i can nudge it out of this state with a workaround
20:01 Plagman: oh, wrong channel probbably
20:01 Plagman: sorry
20:02 jekstrand: Plagman: There are people here who know about dp retraining
20:02 jekstrand: I'm just not one of them. :P
20:03 Plagman: yeah, - figured any amd-specific way of messing with it ight be best asked in #radeon
20:03 Plagman: although i don't have high hopes what i'm after can be done
20:09 danvet: Plagman, dpms on/off helps sometimes
20:09 danvet: if it's just timing after resume
20:09 danvet: that retrains
20:10 Plagman: i think i gave that a shot, didn't seem to help but might have messed it up
20:10 Plagman: just did echo on / off to the dpms file in sys/class/drm for that display
20:11 danvet: hm not sure that works
20:12 danvet: it's read-only, so definitely doesn't help :-)
20:12 danvet: you need to whack your compositor, since just stopping the display generally breaks it
20:26 DPA: I need some directions regarding the problem that ro->kms_fd isn't taken into account here:
20:26 DPA: https://gitlab.freedesktop.org/DPA/mesa/-/blob/master/src/gallium/winsys/etnaviv/drm/etnaviv_drm_winsys.c#L107
20:26 DPA: What is the prefered way to solve this? I could just remove the hash_table and always make it call screen_create.
20:26 DPA: Or I could make it also check kms_fd. Or I could restructure some things to keep the pipe_screens as consolidated as possible like so:
20:26 DPA: https://gitlab.freedesktop.org/DPA/mesa/-/commit/cf923e7ebccf5857b5f8804dd78378f583bfba18
20:26 DPA: There are probably still a lot of other possible solutions, but what is the right thing to do?
20:35 jekstrand: Ugh... Images suck....
20:36 airlied: I expect that will be oft repeated
20:37 jekstrand: Kayden: Are we using the prog_data binding tables in iris at all?
20:37 Kayden: don't believe so
20:38 Kayden: Caio did a bunch of nice stuff there
20:40 jekstrand: So, CL requires all images be either read_only or write_only
20:40 jekstrand: That's nice at least
20:40 jekstrand: We can always use the dataport for read-only and always use the sampler for read-only
20:46 jekstrand: It also looks like indirects on images are simply not a thing in CL
20:46 jekstrand: But it's hard to tell because the rules for SPIR-V are... loose.
20:46 karolherbst: jekstrand: they are, but only via ?:
20:46 karolherbst: I think
20:47 jekstrand: Ugh...
20:47 karolherbst: not sure if it's legal in Cl to use it though
20:47 jekstrand: So much for my grand plan. :(
20:47 karolherbst: and what LLVM does with it
20:47 jekstrand: I think I need to handle some things differently in iris for kernels
20:47 jekstrand: It's not bad, it's just not at all what GL does
20:48 karolherbst: jekstrand: when I looked into it I just load a constant inside the wrapper iterating through the args
20:48 karolherbst: and then things just got constant folded
20:48 jekstrand: karolherbst: That's roughly what I've done
20:48 karolherbst: at least for the image
20:48 karolherbst: sampler... is weird
20:48 jekstrand: karolherbst: The problem is that iris divides things into images and textures and image instructions only work on images and texture instructions only work on textures
20:48 jekstrand: That's not very CLish
20:48 karolherbst: ehhh :/
20:49 karolherbst: no idea if it matters for us
20:49 jenatali: jekstrand: There's read-write images in CL2.0 or an extension in 1.2 IIRC
20:50 jenatali: FWIW, we want to map textures -> "shader resource views" and images -> "unordered access views", so we remap any image intrinsics on read-only images to tex ops in a lowering pass for CLOn12
20:50 jenatali: Write-only/read-write images we leave as images
20:51 jekstrand: jenatali: Yeah, that's roughly what I want to do too
20:51 jekstrand: jenatali: But If ? : is allowed, we're screwed
20:52 jenatali: Hm... I'm not sure about that
20:52 jekstrand: Hrm.. No, I think we're ok because it has to be one of the two types.
20:52 jenatali: jekstrand: What's the problem with ?: ?
20:52 jekstrand: Which is to say that it has to be the same type on both sides
20:52 jenatali: Yeah, I don't think you could select between read-only and write-only or read-write
20:53 karolherbst: yeah... I think C enforced that it has the same result on either side
20:53 jekstrand: So we just need a image_is_read_only helper which does a little magic
20:53 karolherbst: and CLC should do as well
20:53 jenatali: jekstrand: I embed that info in the access of the variable
20:53 jenatali: jekstrand: Here's our lowering pass if you want to reference: https://gitlab.freedesktop.org/kusma/mesa/-/blob/msclc-d3d12/src/microsoft/clc/clc_compiler.c#L251
20:54 jekstrand: jenatali: Yeah, the problem is fishing from the variable
20:54 jekstrand: jenatali: Looks like we have that information on the image intrinsic
20:55 jekstrand: Or we could at if we wanted
20:55 jenatali: Which info?
20:55 jekstrand: access
20:55 jekstrand: We've got a field for it and read-only is part of the type
20:55 jenatali: jekstrand: Oh hey, look at that
20:56 jekstrand: Here's hoping LLVM decorates things properly. :D
20:58 jekstrand: Ugh... Except for that stupid ImageTexelPointer in the way. :(
20:59 jenatali: jekstrand: Where are you getting an ImageTexelPointer?
20:59 jekstrand: jenatali: We get it on most of the image ops
20:59 jekstrand: Actually, just atomics. Never mind.
21:00 jenatali: Yeah was going to say, it should just be a read
21:02 jenatali: jekstrand: Looks like that access field is just for non-uniform. I don't think you'll get the ops decorated with whether the image was read-only vs read-write
21:02 jenatali: Though you could also just assume a read is on a read-only image until you plan to support read-write images
21:03 jekstrand: jenatali: I can add that. :)
21:03 jekstrand: jenatali: Just takes a little SPIR-V plumbing
21:04 jenatali: jekstrand: What if you get SPIR-V handed to you from the app (CL2.x)?
21:04 jekstrand: jenatali: It has to use the right types
21:05 jenatali: jekstrand: Ah right I forgot there's an access_qualifier in the vtn type too
21:05 jekstrand: Yup. :)
21:05 jekstrand: I'm hoping we can rely on that
21:05 jenatali: jekstrand: Yeah, you should be able to, based on my experience
21:05 jenatali: As long as the default is right for when it's not present -- which it is because I made it read-only for kernels :)
21:08 jekstrand: The SPIR-V environment spec for CL says the access qualifier must be present.
21:08 jekstrand: I'm not sure if 0 is allowed
21:08 jekstrand: Doesn't look like it. It has three values: read-only, write-only, and read-write
21:09 jenatali: I definitely saw images with no 9th word...
21:09 jekstrand: That's illegal....
21:09 jekstrand: Oh, we have a thing for this in spirv_to_nir already
21:10 jekstrand: The SPIR-V envionment spec says:
21:10 jekstrand: he optional image Access Qualifier must be present.
21:10 jekstrand: *The
21:10 jenatali: Huh
21:10 jekstrand: I suspect it's the CLC spec that has a default
21:10 karolherbst: jenatali: the translator might be wrong :p
21:10 jenatali: karolherbst: I'm skimming its code right now :)
21:11 jekstrand: We should figure that out and fix the translator if needed
21:12 karolherbst: jenatali: probably easier to write a test and verify it
21:15 jenatali: Huh, I guess it does the right thing, as far as I can tell
21:15 jenatali: Maybe I just added that missing logic for defense in-depth, but I could've sworn I started with an assert and it tripped
21:15 jenatali:shrugs
21:16 jekstrand: We should throw in a vtn_fail_if and see if it blows up for you
21:19 jenatali: Yep, one more thing I need to try out
21:20 jenatali: Stupid other responsibilities keeping me from it though
21:23 karolherbst: jekstrand: btw.. testing your align MR now..
21:23 karolherbst: I have a feeling it blows stuff up
21:25 karolherbst: yep...
21:26 jenatali: :D
21:27 anholt: mareko, robclark: have you looked into having FS interpolation instructions be fp16 in NIR?
21:30 karolherbst: ehhh... I think some shaders are looping forever now? .. mhh
21:30 karolherbst: or.. they just run a long time
21:33 robclark: anholt: varying interpolation?
21:33 anholt: yeah
21:41 karolherbst: jekstrand: I think you need to add support for mem_constant in your alignment MR
21:42 karolherbst: mhh.. I think I can do it as well...
21:42 karolherbst: will have a patch in a minute
21:43 robclark: anholt: I hadn't really looked into that, since we can `bary.f` and let the hw do it for us..
21:44 robclark: (at one point I thought we'd have to interpolate manually for centroid/etc.. but that was just blob being silly)
21:45 jekstrand: karolherbst: I thought I did....
21:45 jekstrand: karolherbst: Did you have rebase trouble?
21:45 karolherbst: no
21:45 karolherbst: you'll when you see the patches
21:45 karolherbst: it's all trivial
21:45 jekstrand: ok
21:45 karolherbst: huh... guess something is missing
21:46 anholt: robclark: right, I'm trying to make the NIR match what the HW can do (interp to 16 bits), so we can drop the backend opt that breaks VK
21:46 karolherbst: heh... something is wrong
21:50 karolherbst: okay.. it's not what I thought it was.. I'll dig deeper
21:53 karolherbst: jekstrand: yeah.. offsets are always 0 with mem_constant
21:54 jekstrand: karolherbst: Weird...
21:54 karolherbst: ohhh.. wait...
21:54 jekstrand: karolherbst: Offsets?
21:54 karolherbst: base_ptr + offset
21:54 jekstrand: That sounds wrong
21:54 karolherbst: ahh.. I think I know.. mhhh
21:57 karolherbst: okay...
21:57 karolherbst: now it works
21:58 karolherbst: partly my and partly your fault :p
21:58 jekstrand: narually
21:58 jekstrand: *naturally
21:59 karolherbst: mhhhh
21:59 karolherbst: there is just one stupid thing...
21:59 karolherbst: it would fall apart if we have indirects _and_ offsets
22:00 jekstrand: ?
22:00 jekstrand: What's the issue?
22:01 karolherbst: ehh.. no, it works because of ordering, but it's fragile
22:01 karolherbst: jekstrand: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/9c5c61ea734604def88b2752590b48dd853ce6f8
22:02 karolherbst: that I had to use to fix it locally
22:03 jekstrand: karolherbst: Ah, right. the lower_io assert probably moved and I failed at rebasing
22:03 karolherbst: let's see if that resolves all fails
22:03 karolherbst: nope...
22:04 karolherbst: I ran more than just basic so maybe that's why you didn't really hit the others.. but some tests are missbehaving quite a lot
22:04 jekstrand: Are they misbehaving because they now have alignment information? :P
22:04 karolherbst: not sure yet
22:04 karolherbst: feels like some are looping forever
22:05 jekstrand: That's possible
22:05 jekstrand: I might have messed up progress somewhere
22:08 karolherbst: or maybe those select tests just take a bit of time :D...
22:09 karolherbst: anyway.. there are crashes
22:10 karolherbst: ohhh... right.. they fail compiling, so they recompiling on each kernel invocation..
22:10 karolherbst: yeah.. that makes stuff take a long time
22:10 jekstrand: Why do they fail compiling?
22:12 karolherbst: one validation error I hit btw: https://gist.github.com/karolherbst/3f5a2140507f1218b514ed5e6cd01e3f
22:13 jekstrand: karolherbst: Hrm....
22:13 jekstrand: karolherbst: Looks like one of them is getting an explicit type and the other not or something like that
22:13 karolherbst: probably
22:14 jekstrand: karolherbst: You shouldn't be running lower_vars_to_explicit_types on constant
22:14 jekstrand: karolherbst: That's the problem there
22:15 karolherbst: well.. then the offsets are all 0 :)
22:15 jekstrand: karolherbst: nir_lower_mem_constant_vars is what we should be using
22:15 karolherbst: ahh yeah.. I am using that
22:15 jekstrand: So lower_mem_constant_vars is doing something wrong
22:15 karolherbst: right
22:16 karolherbst: I also figured out why the select tests are now taking ages
22:16 karolherbst: https://gist.github.com/karolherbst/5284cff757bee410e68a034d77e9285f
22:16 karolherbst: ....
22:16 karolherbst: "quality software" :p
22:17 jekstrand: jenatali: I just pushed a wip/clover-images branch. There's a lowering pass in there you might like. :)
22:17 jekstrand: jenatali: I didn't end up pulling anything from your code. It was easier to just type it.
22:17 jenatali: :)
22:18 jekstrand: With that, it looks like images should work.... except they don't. :(
22:18 jekstrand: karolherbst: What test is blowing up with offset=0?
22:18 karolherbst: constant and constant_sources
22:21 jekstrand:runs
22:21 karolherbst: you need my constant MR for those btw
22:21 jekstrand: k
22:25 karolherbst: anyway.. going bed early for a change :D
22:25 jekstrand: ok
22:25 jekstrand: I'll fix it and push again
22:25 karolherbst: cool.. will retest tomorrow then
22:30 jekstrand:bisects
22:49 jekstrand: karolherbst: Fixed. See the first two new commits in the MR
23:06 jenatali: jekstrand: In your image lowering pass, you end up with tex ops with texture_deref srcs that point to images rather than samplers - is that valid in nir? Just curious
23:09 jenatali: I guess I still don't entirely understand nir's distinction between what's an image and what's a sampler
23:12 jekstrand: jenatali: Yeah, it's fine
23:12 jekstrand: jenatali: Which is to say that we don't have validation rules forbiding it
23:12 jenatali: :)
23:13 jekstrand:has an image test passing \o/
23:14 jenatali: Nice!
23:15 airlied: jenatali: win!
23:15 airlied: jekstrand: win!
23:15 airlied: :-P
23:15 jenatali: Hey I'll take it as a win for me too, I helped :P
23:15 airlied: \o\ /o/
23:20 jekstrand: Just needed to make a new clover_arg_size_align helper that handles images and samplers the same way clover does
23:20 jekstrand:pushes his branch
23:20 jekstrand: I'll clean it up a bit tomorrow once deref-align lands
23:21 jekstrand: It's a bit of a mess right now
23:21 jekstrand: But it's a mess with a working image test. :)
23:21 jenatali: :)
23:21 jenatali: Out of curiosity... which one? :P
23:21 jekstrand: readimage
23:22 jenatali: That's with a sampler, right?
23:22 jekstrand: no
23:22 jenatali: Oh, cool
23:22 jekstrand: read-only no sampler
23:23 jekstrand: Oh, jolly.... clover seems to not be setting the right usage bits when creating images....
23:24 jekstrand: Oh, of course clover would have its own PIPE_BIND flag
23:24 jenatali: Obviously
23:24 jekstrand: Clovers image handling needs a serious overhaul
23:25 jenatali: Had it even been used before?
23:26 jekstrand: Clover? Yeah, it works on some of the AMd drivers
23:27 jenatali: Ah, didn't realize there was already image support in Clover for that
23:27 jekstrand: Yeah, but it works completely differently from GL image support for no reason
23:28 jekstrand: Well, maybe there's a reason
23:28 jekstrand: But it's probably far more historical than technical
23:30 jekstrand: And... this is going to be work. I'm done for now.
23:30 jekstrand:peaces out