01:54 Lynne: is a pipeline barrier with no image or buffer data guaranteed to be a noop?
02:04 ishitatsuyuki: Lynne: with no pMemoryBarriers as well (assuming PipelineBarrier2)?
02:06 Lynne: yes, vkCmdPipelineBarrier2, zero barrier structs in total
02:06 Lynne: (I forgot VkBufferMemoryBarrier2 was a thing, I only use VkMemoryBarrier2)
02:10 ishitatsuyuki: In terms of Vulkan execution model it should be a noop, but whether the call is actually noop depends on the driver
02:10 ishitatsuyuki: briefly looking at code it should be a noop on radv
02:58 Lynne: cool
02:59 Lynne: do you know if radv happens to be more barrier-happy than hip?
03:01 Lynne: 2 years ago I ported some simple opencl compute code to vulkan, copying everything opencl did, and the result was that vulkan was massively slower due to the barrier between each shader dispatch
03:01 Lynne: I had to parallelize it at the cost of gigabytes of memory plus atomics when the original used tens of megabytes
03:01 airlied: probably a bit too vague a question to answer
03:03 airlied: like what were you barriering and why? if something takes a lot longer it could be stalling somewhere unexpected
03:04 Lynne: just an nlmeans shader, what you do is you run the same shader about 800 times with different parameters, and each shader simply reads from an image, does some math, and adds the result to a buffer
03:05 Lynne: opencl did it the naive way, it bound one invocation, executed it, barrier, bind, execute, barrier and so on
03:05 Lynne: copying this in vulkan was massively slower
03:05 airlied: so you are barriering on the buffer?
03:06 Lynne: yes, just the buffer
03:07 airlied: so did you just write all the outputs to a buffer and sum at the end?
03:08 airlied: surprised CL was doing it faster, maybe it knew something so decided not to barrier
03:08 Lynne: here's the original code - https://github.com/FFmpeg/FFmpeg/blob/master/libavfilter/vf_nlmeans_opencl.c#L223-L265
03:09 Lynne: it does not do any sync between each invocation... not sure I know enough about opencl to say whether that's legal or not
03:10 Lynne: each "pass" has 2 shaders - one to integrate the image with a given parameter, and one to compute the weights and then add the result to an image
03:10 airlied: so why did vulkan need a barrier?
03:10 Lynne: does it not?
03:10 airlied: seems like if you had a single ssbo and were just writing to it from subsequent shader invocations there is no need
03:11 airlied: esp if it's all on the compute queue
03:12 airlied: then barrier at the end to read back
03:12 Lynne: forget about what I wrote previously, focus on the 2-shader layout I mentioned
03:12 Lynne: so you call this sequence of shaders 800 times
03:13 Lynne: you need a buffer to carry the results between the 2 shaders, first one just writes to it, second one just reads (and writes the result atomically to another buffer, but that needs no barrier so it's not a problem)
03:14 Lynne: do you need a barrier each time you reuse a temporary vkbuffer?
03:14 airlied: where is that temporary buffer in the cl code?
03:15 Lynne: integral_img
03:17 airlied: I'd guess your cl code is buggy and might just work by luck, but I'm not fully across opencl cmd submits
03:18 airlied: and you should use cl events between submissions
03:20 Lynne: okay, what about vulkan, when would vulkan need a barrier for such a code?
03:20 Lynne: between each dispatch, or between each reuse of the buffer?
03:20 airlied: yeah you probably do need a barrier between each dispatch if horiz and vert are reading/writing the same parts of integral_img
03:23 Lynne: I've merged horiz/vert in my code
03:23 Lynne: would you need a barrier between horiz+vert and weights?
03:24 airlied: yes if weights is going to read back and you have to make sure the other shader has finished
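The dependency being discussed can be shown with a small CPU-only sketch. This is not the FFmpeg OpenCL or Vulkan code; the function names are hypothetical, and it assumes the consumer pass reads the integral image through corner lookups. The point is only that each stage reads values written by many different invocations of the previous stage, so on a GPU the earlier dispatch has to be complete and visible before the later one starts.

```python
from itertools import accumulate


def integrate_rows(img):
    # Pass 1 ("horiz"): an independent prefix sum along each row.
    return [list(accumulate(row)) for row in img]


def integrate_cols(rows):
    # Pass 2 ("vert"): a prefix sum down each column. Every output element
    # depends on rows written by pass 1, so pass 1 must be finished first.
    cols = [list(accumulate(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def box_sum(integral, x0, y0, x1, y1):
    # Consumer (standing in for "weights"): sum over the inclusive rectangle
    # using four corner reads of the finished integral image.
    def at(x, y):
        return integral[y][x] if x >= 0 and y >= 0 else 0

    return at(x1, y1) - at(x0 - 1, y1) - at(x1, y0 - 1) + at(x0 - 1, y0 - 1)


if __name__ == "__main__":
    img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    integral = integrate_cols(integrate_rows(img))
    print(integral)
    print(box_sum(integral, 1, 1, 2, 2))
```

In this sketch the four corner reads in `box_sum` land on elements produced by different rows and columns, which is why a consumer running before the producer has finished would see partial sums.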
03:27 airlied: at a guess you'd be better off throwing them all into one shader with a barrier in the shader :-), but I could be wrong
03:32 Lynne: the integration shader runs width-wise, so it does height number of workgroups
03:32 Lynne: the weights shader is per pixel
03:33 Lynne: I did try combining them, but it wasn't worth it
03:34 Lynne: if only you could dynamically stop or resume workgroups, I see no reason why it can't be done these days, as far as I understand the hardware schedules each warp/group independently
03:36 airlied: oh so like launch a max shader and drop some threads from executing for parts of it?
03:36 airlied:is still too much of a cpu programmer :-P
03:38 Lynne: yeah, that would be cool, then you could call a function that increases the amount of workgroups
03:38 Lynne: the former one is already supported... in vertex shaders
03:39 Lynne: via the terminate_invocation call, I did ask why it isn't allowed here and no one could answer other than "because spirv does not allow it in compute shaders"
03:41 Lynne: it's obvious there's no reason why it shouldn't be supported, other than vulkan is restrictive by default to make sure everything breaks the same way on all platforms and also because no one cared enough during submission
03:42 airlied: I think you can just return in some threads
03:43 airlied: but if you are using a full subgroup it won't do much
03:44 airlied: my guess is there is probably some amazing subgroup algorithm you could use, but I've no idea what it would be :-P
03:45 Lynne: there's quite a few parallel prefix sum algorithms out there
03:45 Lynne: ...they were all slower than just doing it naively
03:45 Lynne: at least for sub 8k images
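For reference, one of the simplest parallel prefix sum schemes is the Hillis-Steele scan. The sketch below simulates it sequentially on the CPU; it is an illustration only, not what Lynne benchmarked, and the function name is hypothetical. Each loop iteration stands for one step in which all lanes run at once, and on a GPU those steps would be separated by a barrier.

```python
def hillis_steele_scan(values):
    """Inclusive prefix sum in about log2(n) steps of n independent additions."""
    data = list(values)
    offset = 1
    while offset < len(data):
        # Every lane i reads the previous step's array, never the one being
        # written, which is what a barrier between steps guarantees on a GPU.
        data = [
            data[i] + data[i - offset] if i >= offset else data[i]
            for i in range(len(data))
        ]
        offset *= 2
    return data


if __name__ == "__main__":
    print(hillis_steele_scan([1, 2, 3, 4, 5]))
```

The scan does more total additions than a plain sequential loop (roughly n log n instead of n), which is one plausible reason a naive per-row loop can win on small inputs.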
03:55 Lynne: speaking of barriers, I wish the validation layers checked for those
03:57 Lynne: by the way, are there implicit barriers between each submission, or do you need a barrier at the start before reading from a buffer written to by a previous submission?
04:51 soreau: does i915 have anything like amdgpu_gpu_reset in /sys/?
04:56 soreau: I found i915_wedged but it puts a line in dmesg and not much else (the compositor seems unaffected)
08:04 jfalempe: if someone wants to review a small rust patch, it's been waiting for some time now: https://patchwork.freedesktop.org/series/142175/, maybe javierm?
12:28 glehmann: Do u2u and i2i nir opcodes support 1bit sources?
12:29 glehmann: would be kind of weird, because there are also b2i opcodes which is the same as u2u with 1bit source
12:49 javierm: jfalempe: sure, I can take a look at it on Monday
12:50 pq: emersion, https://drmdb.emersion.fr/properties/4008636142/COLOR_SPACE internal server error
13:04 emersion: hm
13:05 emersion: "inconsistent property types"
13:05 emersion: "inconsistent property types: range and signed range"
13:06 pq: you're welcome :-D
13:06 emersion: seems like some out of tree driver is using a different type
13:07 pq: yes, I doubt that property even exists upstream
13:09 emersion: opened https://gitlab.freedesktop.org/emersion/drmdb/-/issues/1
13:10 pq: oh, that's where it's hosted - me lazy
13:11 emersion: i moved it recently-ish
13:20 sima: agd5f, I guess looks like greg will finally script cherry pick handling, might be good if you can adopt dim cherry-pick/cherry-pick-branch so we do this consistently?
13:20 sima: should be doable to have a flavor of this that doesn't need the entire dim setup
13:21 sima: demarchi might be able to help with that
13:42 Ermine: what is DRM minor? is it render node?
13:44 pq: device numbers are (major, minor), DRM has a single major IIRC, and the minor is dynamically allocated to identify a device node that is either primary node, render node, or control node, per device driver instance.
13:45 pq: 'ls -l' shows the major and minor numbers of a device node
13:46 pq: control nodes do not exist anymore
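The same numbers that 'ls -l' prints can be read programmatically. A minimal sketch, assuming a Unix-like system where Python's `os.major`/`os.minor` are available; the helper names and the `/dev/dri/card*` and `/dev/dri/renderD*` paths are illustrative and Linux-specific:

```python
import glob
import os
import stat


def split_rdev(rdev):
    # Split a raw device number into its (major, minor) pair.
    return os.major(rdev), os.minor(rdev)


def node_numbers(path):
    # Return (major, minor) for a character device node such as a DRM node.
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError(f"{path} is not a character device")
    return split_rdev(st.st_rdev)


if __name__ == "__main__":
    # Assumed Linux layout; prints nothing if no DRM nodes are present.
    for node in sorted(glob.glob("/dev/dri/card*") + glob.glob("/dev/dri/renderD*")):
        print(node, node_numbers(node))
```

This only reports the numbers; it deliberately does not infer the node type from the minor value.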
13:47 Ermine: ah, those minors
13:47 Ermine: thank you
13:51 sima: pq, with accel there's now a second major number range I thought, but yes
13:52 pq: ah
14:20 agd5f: sima, sounds good. stable kind of drives me crazy. Besides all of the cherry-pick stuff, the entire process is arbitrary. Half the time I have to argue with greg and sasha about getting a patch in and other times they pull stuff into kernels that I never asked them for and then when it causes regressions, I have to argue with them to revert it.
14:24 sima: agd5f, yeah it's pain
14:25 sima: but I think we have a good chance to stop at least some of the pain
14:25 ukleinek:bets it's also a pain on their side
14:25 sima: just no more rebases of -next trees (but I think you've stopped doing that a while ago) so there's no sha1 references into the void
14:25 sima: and then very consistently cherry-pick stuff to -fixes with scripts
14:26 sima: maybe in another 8 years we get some of the other endless discussions sorted :-/
14:26 ukleinek: if I understood right that's only a part of their pain
14:26 agd5f: sima, yeah, I've already stopped rebasing and restarted doing cherry-pick -x
14:27 sima: agd5f, I guess even if greg scripts it all now we'll probably have a bit more shouting on this until there's enough cherry-pick -x annotated history in upstream
14:27 demarchi: sima: but dim cherry-pick/cherry-pick-branch is only done by maintainers. Why would that need to be done without setup?
14:27 sima: so I'm bracing for that already
14:27 sima: demarchi, because agd5f doesn't use dim for amdgpu trees
14:28 sima: and might be worth to share the actual script for corner case bugfixes
14:28 sima: or maybe agd5f switches to dim entirely
14:29 vsyrjala: some time ago i proposed that we'd ask the stable folks to stop the auto backport nonsense completely for drm (or at least i915). but that request would need to come from maintainers i guess
14:30 demarchi: vsyrjala: but that thread was about a very wild backport, unrelated to Fixes: or Cc: stable
14:31 demarchi: having a commit with "drm/....: Fix ..." shouldn't really mean the commit needs to be backported
14:32 demarchi: vsyrjala: and if they stop doing, someone would need to handle the drm (or i915) side, which is not a small task
14:32 vsyrjala: i just want them to backport stuff actually marked with cc:stable. nothing else
14:33 ukleinek: vsyrjala: I would expect that is something they can handle.
14:34 ukleinek: I know a guy who requested that for patches with author=him, surely also works for patches touching drm
14:36 emersion: Ermine, pq: OpenBSD works differently in this regard IIRC
14:36 emersion: or FreeBSD?
14:36 emersion: one of these
14:36 emersion: so better not assume anything special about major/minor
14:36 vsyrjala: ukleinek: i also asked for that, but iirc greg said they only do exceptions based on code not people
14:43 Ermine: emersion: was asking because of gputop
14:48 ukleinek: vsyrjala: https://lore.kernel.org/stable/CAMj1kXGzMCT4KQcrnD80p6ZA=-j+aAPuPbKRuYQiRjof-+dTUg@mail.gmail.com/
14:49 sima: vsyrjala, I guess if intel/amd/msm all agree we can ask for that, maybe after the dust has settled with the cherry-pick tracing
14:49 sima: agd5f, robclark ^^ thoughts?
14:51 vsyrjala: ukleinek: no reply to that, so how do you know they agreed to it?
14:51 sima: I do agree that AUTOSEL is a nuisance at best
14:52 agd5f: sima, probably, but there are cases where it's worked out in our favor and it's a happy surprise when I see a patch already in stable that I had planned to send out. OTOH, the opposite is also true. Not sure where things shake out on the whole.
14:52 ukleinek: vsyrjala: Ard asked to expand from author=him, so I read that as the status quo?!
14:53 sima: agd5f, yeah I think we should perhaps wait a few releases until the cherry-pick dust has settled and then see
14:53 vsyrjala: it does feel like they've toned down the AUTOSEL a bit at least. so perhaps it's not such a huge pain anymore
14:54 sima: since that cherry pick shouting has been such a huge distraction thus far
14:54 sima: yeah for a while it was picking up typo fixes just because of Fixes: tags iirc
14:55 alyssa: jenatali: to follow up on the ctor discussion - I *think* I'd like to proceed with what I have (bindgen C++ global ctor + the singleton you reviewed), since I have it working and all the alternatives have pitfalls
14:56 agd5f: I think it still does. Just this week I had to reject patches going back to stable which were picked up as dependencies so that their subsequent reverts could be applied to the stable tree.
14:56 alyssa: I'm very open to a rework later if we can figure out an alternative that's actually better.. I don't think that needs to block merge now, since some of the alternatives would still use all the singleton infra anyway.
14:56 jenatali: Sounds reasonable to me
14:56 alyssa: (The possibility of statically baking all format strings across the tree... exists, I guess? but has a lot of open questions.)
14:57 alyssa: (and all the other options would still internally use the singleton, just not with dllmain)
14:57 alyssa: (and the singleton is the complicated part, the actual bindgen code here is really short)
14:57 jenatali: Right
14:58 sima: agd5f, ok that's a bit too hilarious, but also sounds like a bug in the cherry-pick logic?
14:59 vsyrjala: ukleinek: https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/ignore_list
15:00 sima: agd5f, do you have an archive link for that one?
15:01 ukleinek: vsyrjala: Ard is on irc as ardb in case you want to ask for details.
15:02 ukleinek:has the impression that rebase vs cherry-pick -x is a red herring and not what this discussion is about.
15:05 agd5f: sima, https://www.spinics.net/lists/stable-commits/msg391119.html
15:11 sima: agd5f, replied asking how this happened
15:11 sima: figured while we debug stable process bugs, might as well get them all
15:13 MrCooper: I recently pointed out a case like that to Sasha, no response...
15:16 sima: MrCooper, if you want you could pile your case onto the same thread, I cc'ed dri-devel
15:17 MrCooper: don't really care that much I'm afraid :)
15:19 sima: MrCooper, hm I didn't find anything in archives from you, at least nothing recent
15:20 MrCooper: might have been on amd-gfx
15:22 sima: ah found it, searched for the wrong mail address
15:24 MrCooper: b3a64de3-353c-4214-a876-f44d3f1de07b@mailbox.org on 2024-11-25
15:24 sima: yeah found 2
15:24 sima: https://lore.kernel.org/stable/a31d3d49-1861-19a2-2bb4-8793c8eabee9@mailbox.org/ older one
15:24 MrCooper: IIRC there were earlier cases where Sasha drooped them in response
15:24 MrCooper: *dropped
15:28 alyssa: > The conversion specifiers f, F, e, E, g, G, a, A convert a float argument to a double only if the double data type is supported
15:28 alyssa: thanks i hate it
15:28 alyssa: jenatali: (((:
15:29 jenatali: Yeah isn't that awful?
15:31 alyssa: for real
15:31 alyssa: i almost want to go out-of-spec here for driver CL because that's sufficiently dumb
15:31 alyssa: or call fp64 lowering in vtn_bindgen i guess
15:51 robclark: sima, I didn't manage to read entire scrollback, but is the proposal to request stable tree folks to only cherry-pick things that have Cc: stable? If so, I think we can do that.. (fyi lumag, abhinav__)
15:57 sima: robclark, it's more about whether we want to because autosel causes too much pain
15:57 sima: but maybe we can reduce the autosel pain first a bit, seems to have some obvious bugs
16:01 robclark: I generally have an idea of what I want backported, and make sure they get Fixes tag.. but I can add Cc stable as well... I think I'd prefer not having random things cherrypicked
16:02 bl4ckb0ne: is there a way to query the GL/GLES versions offered by EGL? Or at least the maximum version?
16:04 alyssa: dcbaker: any reason not to have `fs = import('fs')` in our root meson.build?
16:14 karenw: bl4ckb0ne: Yes. eglGetConfigs/eglChooseConfig and parse through the result
16:16 karenw: Technically this is an extension, but ubiquitous: EGL_KHR_create_context
16:16 karenw: (EGL_CONTEXT_MAJOR_VERSION_KHR EGL_CONTEXT_MINOR_VERSION_KHR)
16:17 bl4ckb0ne: completely forgot about that one, thanks
16:36 dcbaker: alyssa: no, historically it just didn’t exist when we switched to Meson, lol
16:36 alyssa: dcbaker: cool. commit on my branch then
16:36 alyssa: also, I couldn't find docs on this-
16:37 alyssa: does custom_target implicitly depend on the program you run
16:37 alyssa: I assume so but this is definitely dragons territory
16:38 alyssa: https://gitlab.freedesktop.org/alyssa/mesa/-/commit/b0894eafbf9ba9a32a726e2d659593a107c2cb80
16:38 alyssa: this seems to work
16:38 alyssa: (also generally, if you see any way to further reduce https://gitlab.freedesktop.org/alyssa/mesa/-/commit/5d5097f54a375767db11ace906675f94761d77ab please let me know since this is what everyone will be copypasting from :p)
16:39 alyssa: though i'd personally say 26 lines of copy/paste and then you can be immediately productive with CL in your driver is pretty good
16:39 alyssa: (that commit is immediately followed by https://gitlab.freedesktop.org/alyssa/mesa/-/commit/b39f72225e515af5ac289ca88698fadd7d15e539 :P)
16:40 alyssa: (i'm not even touching most of the call sites -- the generated code literally matches the function signature we had before in 2/3 of the cases there =D)
16:57 dcbaker: alyssa: yes, the program is a dependency of the custom target
16:57 alyssa: awesome thx
16:59 dcbaker: I’m eating breakfast, but I’ll come have a better look in a couple at my computer. Phone is not a good review tool :D
17:07 alyssa: real
17:31 dcbaker: alyssa: you have my r-b for the "drop trivial depends"
17:32 karolherbst: that reminds me.. I need to look into this scratch issue...
17:37 dcbaker: alyssa: I left you a few comments on the other one. Once the Meson 1.8 window opens I have some work to make copying custom_targets less awful that needs to get done...
17:43 alyssa: dcbaker: thanks! squashed into !33099
17:59 jenatali: alyssa: Going to trigger MSVC builds on your MR just to confirm it plays nicely
18:00 jenatali: Overall looks pretty good. Didn't review the bindgen commit itself (yet) though
18:00 alyssa: thanks!
18:00 alyssa: bindgen commit should probably still be wip
18:00 alyssa: in particular it runs a completely random set of nir passes copypasted from the intel and asahi compilers
18:01 alyssa: need to figure out what we actually need
18:01 alyssa: next week problem
18:02 jenatali: This does raise the bar for building lavapipe on Windows since setting up mesa-clc is a pain right now, but that's probably fine
18:04 dcbaker: jenatali: is it just the well known set of issues, or does Windows have issues that are separate from other OSes?
18:05 jenatali: "Installing" LLVM isn't really a thing on Windows
18:05 jenatali: It's either manually downloading it and configuring a bunch of paths, or else it's building it locally. Both of those are pretty miserable
18:06 jenatali: And I guess for mesa-clc it's actually linking against LLVM rather than calling out to Clang which rules out the downloading it aspect, it needs to be built, since it produces static libs and those aren't portable from one build environment to the next
18:06 alyssa: hopefully that one is solved this year
18:07 jenatali: That one?
18:07 alyssa: not using clang
18:07 alyssa: the binary one
18:08 jenatali: Ah right once the translator's obsolete we can just request a SPIR-V from binary clang, that'd be nice
18:08 jenatali: There's still the libclc aspect of it but that's more tractable I think
18:08 jenatali: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/69548369 - looks like you've raised the meson bar ;)
18:09 alyssa: dcbaker: ^ is that ok?
18:10 alyssa: meson 1.5 is in debian stable backports ..
18:10 dcbaker: I think 1.3 is required if you want any of the Rust drivers, so I think that would be fine? That's probably more of a distro maintainer question though
18:10 dcbaker: 1.3 was November 2023
18:12 alyssa: i coulda sworn we just discussed this..
18:14 dcbaker:realized today his first Meson commits went into Meson 0.40 way back in 2017...
18:17 alyssa:just learned that jan 2024 was a year ago
18:17 alyssa: weird
18:17 alyssa: wonder when that happened
18:17 dcbaker: On April 1st
18:18 alyssa: ah
22:10 ccr: cine