04:00 krh: jekstrand: 6 new stages!?
04:00 krh: six!
04:02 HdkR: Inject that RT right in
04:08 airlied:wonders if RT will ever justify itself :-P
04:11 airlied: like surely we could find another user for all the diespace :-P
04:12 krh: with unified shaders, a bunch of stages doesn't have to take a lot of die space
04:15 airlied:can't wait for mesa support (I can wait)
04:16 krh: airlied: I expect you're going to add support to glxgears?
04:16 airlied: krh: llvmpipe ray tracing with gears ftw
04:17 airlied: it still feels like nvidia had some bored engineers who wanted to do something cool
04:18 krh: they had to find a way to keep adding transistors and fans
04:19 airlied: I think they have to keep finding ways for gamers to buy new cards, to fund CUDA dev
04:26 airlied: would be nice if t he XDC schedule had a mention of timezone, even if just to say UTC
04:26 airlied: is it utc?
04:59 jekstrand: airlied: UTC+2, I think?
04:59 jekstrand: krh: Yes, 6 stages. :)
05:03 airlied:guess I'll just lvie stream the next day anyways :-P
05:04 jekstrand: airlied: If it's the next day, it's not live. :-P
05:05 airlied: good point, I should just wait for the talks to be published :-P
05:05 jekstrand: It's going to be a very strange XDC. I'm not really sure how talks are going to work.
05:05 jekstrand: I also don't know how interactive things will be.
05:06 jekstrand: I hope it's still fairly interactive and not just a bunch of videos.
05:06 airlied: probably have to ask some people post plumbers
05:06 airlied: since I expect it'll be a very similiar setup
05:06 jekstrand: Probably
05:07 jekstrand: At least, I'd hope the XDC and plumbers people are passing notes
05:07 airlied: plumbers is totally the middle of the night for me, XDC just starts when I cook dinner :-P
05:07 jekstrand: Oh, then XDC shouldn't be bad.
05:07 jekstrand: XDC starts for me shortly after I usually go to bed. :-/
05:08 airlied: yeah I could do a morning session or two if pushed
05:09 jekstrand: Hey, what do you know? My ray-tracing MR passes CI finally.
05:09 jekstrand: People seem to have added some -Werror to CI and I was missing switch cases.
05:09 airlied: now we just need some hw to run it on
05:10 HdkR: Just run it in llvmpipe ;)
05:10 jekstrand: airlied: Where's that nouveau Vulkan driver when we need one?
05:11 jekstrand: nouvk
05:11 airlied: jekstrand: should just hack up a kernel mm api already
05:12 airlied: I don't think anyone has a rdna2 card yet
05:16 airlied: there is quite the lack of compute talks :-P
05:18 jekstrand: airlied: I don't know. jenatali has 3 talks scheduled.
05:19 jekstrand: But likely only 1 of those will be compute-focused.
05:19 airlied: yeah only the lats one seems CL related
05:19 jekstrand: (or maybe it's only 2? I don't remember)
05:19 airlied: it would be strange the year we finally do get CL working and nobody talks about it :-P
05:21 jekstrand: heh
05:24 krh: also, no Xaw talks
05:25 airlied: actually 0 wayland talks as well
05:25 airlied: apart from the mention in the microsoft talk
05:25 krh: unless you count the weston on windows one
05:25 krh: yeah
05:26 krh: well, 0 talks about wayland must mean it's finally done
05:26 krh: just like Xaw
05:26 jekstrand: heh
05:30 airlied: krh: Xaw3d killed it
05:33 krh: airlied: who can resist Xaw3d though
05:33 jekstrand: o/
05:34 HdkR: Everyone has moved on from attempting to run CL on x86 devices, the new hotness is OpenCL on ARM. How soon until Panfrost runs OpenCL? ;)
05:38 krh: pinchartl: does this look reasonable to you? https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2379110
06:07 wm4: apparently EGL_MESA_config_select_group is a thing now, where is it specified?
06:25 krh: pinchartl: ah, I see, that's not what I want
06:46 bbrezillon: jekstrand: I think I'm convinced now. The only thing bothering me is this hardcoded alignment when you reach a var deref
06:48 jekstrand: bbrezillon: Would you rather it be bigger?
07:42 tjaalton: dcbaker: heya, mesa 20.2 needs a new rc? :)
07:43 bbrezillon: jekstrand: I'd rather it be specified by the backend on a per-var-mode basis, rather than letting the backend clamp the value returned by nir_get_explicit_deref_align()
07:44 bbrezillon: and I'm actually worried about actual memory alignment being smaller than the hardcoded value, so I'm not sure having a bigger value helps
07:48 bbrezillon: if it's smaller, the worse that can happen is sub-optimal accesses, which is not great, but still better than having optimizations using an alignment that's bigger than the reality
07:48 bbrezillon: but maybe it's a non-issue in practice, so feel free to ignore my concerns :)
07:55 MrCooper: can I get a review on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6481 (fixing up CI rules so pre-merge jobs don't run by default again)? If not, I'm planning to fast-track merge it within ~3 hours
08:24 wm4: ah EGL_MESA_config_select_group got reverted
08:24 wm4: is there a new way to get an alpha framebuffer (so the compositor blends it)?
08:28 pq: wm4, isn't that done by looking at the NativeVisualID of the EGLConfig and pick one that matches your desire?
08:29 wm4: pq: I think I'm already trying that
08:30 wm4: what attributes does the native visual have to have for transparency to work?
08:30 pq: what the NativeVisualID is depends on what EGL platform you use
08:30 wm4: (this is in the X11 case)
08:31 pq: ok, so it's a X11 Visual ID... I'm quite hazy on that...
08:33 pq: wm4, umm, maybe depth 24 (opqaue) vs. 32 (alpha channel)?
08:35 wm4: I think that's what I'm doing, works with GLX
08:36 pq: huh, then I don't know
08:36 wm4: https://bugs.freedesktop.org/show_bug.cgi?id=67676
08:36 pq: obviously you have a compositing manager running?
08:36 wm4: seems like EGL doesn't include them in the first place
08:36 wm4: yes, and it works with GLX
08:37 wm4: there was a patch to fix it, which was merged, and then reverted for unknown reasons, then silence
08:37 pq: did you look at the list returned by eglGetConfigs or what was it that returns the full list, not what eglChooseConfig returns?
08:38 wm4: let me try to add some more manual dumping of deb ug infos
08:41 wm4: so I think the "vis\nid" column in eglinfo shows the X visual ID
08:41 wm4: no *headscratch*
08:41 MrCooper: wm4: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1938
08:42 wm4: MrCooper: I see
08:46 wm4: pq: it's possible that eglinfo lists an alpha X visual, but eglChooseConfig() doesn't list it with fairly standard/trivial request parameters
08:47 pq: I believe that eglChooseConfig is traditionally hacked to keep old application assumptions working, so I wouldn't bet on its behavior.
08:48 pq: or hacked the database it searches in
08:51 wm4: pq: ok, this is dumb... eglChooseConfig doesn't include the required config/visual if EGL_ALPHA_SIZE is set to 1
08:53 wm4: this seems rather unintuitive, though I don't remember what alpha in an EGL config is even supposed to mean (since it's _not_ about window/compositor alpha)
08:53 pq: somehow I'm not surprised, no matter how dumb :-p
08:54 wm4: so, how would this work on wayland? (no idea if wayland/EGL has a concept of visuals)
08:54 pq: well, I guess it's about whether the rendering API (e.g. GL) sees an alpha channel in the render target or assumes it's all 1.0
08:54 pq: Wayland does, and IIRC it's the DRM pixel format as the native visual ID
08:54 wm4: the chosen config now has EGL_ALPHA_SIZE=0, and compositor transparency works
08:55 pq: so pick a config where the DRM pixel format has an alpha channel, and the compositor will use it
08:55 pq: ehheh >_<
08:55 wm4: one could argue that there should be a way to do this portably in EGL, but I guess nobody is going to want to touch EGL configs with a 10 foot pole
10:46 daniels: see https://bugs.freedesktop.org/show_bug.cgi?id=67676
10:58 wm4: yeah I posted that link myself
11:04 daniels: :)
11:04 daniels: Wayland doesn't have _any_ native visual IDs however
11:05 daniels: GBM unsurprisingly uses GBM formats, but Wayland just internally translates the config params; if you ask for EGL_ALPHA_SIZE=8, you'll get 8-bit alpha blending
11:05 wm4: so what does it do to associate EGLConfigs with DRM (or GBM or whatever) formats?
11:05 wm4: and that sounds like the opposite of X11
11:06 wm4: which will exclude alpha formats with EGL_ALPHA_SIZE!=0
11:19 pq: daniels, oh, so I misremembered. Sad.
11:21 pq: I think we should make EGL Wayland use DRM formats like GBM. EGL does not really define what a "visual" means, does it?
11:21 pq: like we would never want a visual to include premult/not or sRGB vs. linear or any color space stuff or...
11:23 pq: wm4, I bet the X11 behavior is simply to let old apps work: they don't expect to get blending, but they use alpha channel and pick the first EGLConfig returned by eglChooseConfig.
11:23 pinchartl: krh: indeed I think that will cause issues
11:33 hakzsam: dj-death: https://github.com/KhronosGroup/VK-GL-CTS/issues/141 you solved that problem just by building mesa with glvnd? I have the issue, but glvnd doesn't help...
11:45 hakzsam: nvm
11:46 dj-death: hakzsam: fixed?
11:46 hakzsam: yep
11:50 daniels: pq: mm, but what would that gain us, given that as you say visuals don't encode srgb/unorm, premult, etc etc?
11:51 pq: daniels, solution wm4's problem? Like ensuring there is a 8-bit alpha channel, ot RGB are 10 bits, or RGB 16F?
11:51 pq: 10bit and 16F formats are going to be useful with HDR
11:52 wm4: EGL certainly seems outdated and unflexible here
11:53 pq: accidentally choosing BGR instead of RGB might rule out composite-bypass, but was that supposed to be handled transparently inside EGL?
11:54 pq: or is all that already expressable in EGLConfigs?
11:55 daniels: if you want an 8-bit channel, select for EGL_ALPHA_SIZE==8
11:55 daniels: if you want 10, select for 10
11:55 daniels: all of this is queryable
11:56 daniels: atm float/HDR/sRGB/etc are not actually selectable, but that can be solved with more attribs
11:56 pq: what about queryable?
11:56 glennk: should always inspect the full list returned by eglChooseConfig and score them according to app preferences
11:56 daniels: yes
11:56 daniels: the core issue is that eglChooseConfig is specified to be not good
11:56 pq: eglGetConfigs
11:57 glennk: i think mentally renaming eglChooseConfig as eglFilterConfigs might convey its intention better :-)
12:01 wm4: enumerating all possible combinations as an explicit list seems like a bad idea
12:04 daniels: wm4: why?
12:05 wm4: combinatorial explosion
12:05 glennk: most eglChooseConfig internal implementations i've seen do pretty much that
12:05 daniels: ^
12:05 wm4: of course EGL's concept requires that
12:19 yshui`: i figured out issue 3199
12:25 yshui`: explanation: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3199#note_610291
13:08 renator: hi!
13:09 renator: just one quick question...
13:09 renator: I have an Intel HD 500 onboard GPU (my CPU is an Atom E3940 - Apollo Lake). I need to enable Spread Spectrum Clock on both eDP and HDMI outputs that I have. I'm passing i915.lvds_use_ssc=1 as kernel argument (btw, using kernel 4.9.1). My question is: Is there any way to check if SSC is really enabled on both screens?
13:09 vsyrjala: iirc we never enable ssc on edp/dp atm
13:09 vsyrjala: or hdmi. didn't think it was legal for hdmi even
13:10 renator: ssc is enabled only for lvds then?
13:11 vsyrjala: yes iirc. i've been occasionally pondering about flipping it on for edp as well. but haven't actually looked at the details too much
13:13 renator: But is it possible to enable on the i915 driver side? I mean... It's not dependent from the firmware to enable it right?
13:14 vsyrjala: it would be possible. the details vary quite a bit between different platforms though
13:20 karolherbst: uffff
13:20 renator: Thanks for the info! I'll play a little with the code. One last question: if I manage to enable it on eDP, is there any easy way to know if it's really working (besides taking the PC to lab and analyse the radio frequencies) ?
13:21 karolherbst: yshui`: your finding causes a lot of headaches already :D maybe somebody knowing things better can come up with a fix
13:21 karolherbst: but anyway.. that somehow sounds like libstd++ bug to me
13:22 yshui`: karolherbst: upstream says it's "unsupported": https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96817
13:23 karolherbst: doesn't make it less of a "bug" :p
13:23 karolherbst: they just don't want to make things more expensive for single threaded applications
13:23 karolherbst: which imho are irrelevant these days
13:24 vsyrjala: renator: i think something like https://paste.debian.net/1161454/ should be all that's required for an apl
13:25 karolherbst: yshui`: I guess the only think mesa could do is to not link against pthread and check at runtime if it can do threading or not
13:26 karolherbst: but that hurts performance more than just going with the multithreaded locking code
13:26 karolherbst: ....
13:26 karolherbst: assuming things are single threaded is a falacy these days and everybody refusing to see it I just ignore :p
13:27 yshui`: fair
13:29 yshui`: yeah i don't see a very clean solution
13:29 yshui`: maybe link libGL with pthread?
13:30 karolherbst: I wouldn't be surprised if always doing the multithreaded locking will be even faster because you get better cache hits and have to deal with less indirections and stuff...
13:30 karolherbst: yshui`: already is
13:30 karolherbst: or not?
13:30 yshui`: libglvnd's isn't
13:30 karolherbst: the issue is rather if libstdc++ gets loaded before or after pthread, no?
13:30 karolherbst: ahh
13:30 karolherbst: mhhh
13:30 jekstrand: bbrezillon: I think my inclination is to use the big alignment and see how it plays out.
13:30 karolherbst: but glvnd also loads mesa sooner or later anyway
13:30 karolherbst: mhhh
13:30 jekstrand: bbrezillon: I do think we definitely need more comments
13:30 jekstrand: bbrezillon: But, until we start seeing more shaders going through the optimizer, it's hard to know exactly what we'll see.
13:31 karolherbst: yshui`: mind checking in LD_DEBUG=libs in what order those libs are getting loaded?
13:31 yshui`: the goal is to get pthread pulled in as the program starts
13:31 karolherbst: yshui`: some applications also dlopen libGL though
13:31 yshui`: ah, right
13:32 karolherbst: it's just libstdc++ being stubborn here :p
13:32 karolherbst: I mean.. I see the point, but applications being single threaded usually don't care about performance anyway
13:32 yshui`: libglvnd doesn't load mesa until some function is called, i think
13:33 karolherbst: could be
13:33 karolherbst: I think the biggest application being singlethreaded is gcc at this point.. mhhh oh well
13:33 yshui`: zero-overhead abstraction and such
13:36 dcbaker[m]: tjaalton: yes it does. I'm way behind on that, hopefully today
13:39 tjaalton: great, thanks
13:48 MrCooper: yshui`: I've also hit (something like) that with just "gamemoderun glxgears" though, which doesn't pull in libstdc++ outside of Mesa AFAICT
13:50 yshui`: maybe gamemode pulls in libstdc++?
13:50 yshui`: could also be a different problem
13:56 MrCooper: it doesn't
14:05 bbrezillon: jekstrand: fair enough
14:19 yshui`: MrCooper: maybe be a different problem?
14:19 MrCooper: maybe, or maybe your theory is wrong :)
15:16 karolherbst: jenatali, jekstrand: mhh.. so yeah.. running clinfo is super slow with clover at the moment :/
15:16 jenatali: Because of libclc?
15:17 daniels: you need a clcache
15:17 jenatali: And what's "super slow"? We talking the order of seconds or more?
15:20 karolherbst: so other caps vtn complains about are SpvCapabilityLinkage and SpvCapabilityFloat16Buffer
15:20 karolherbst: jenatali: seconds, yes
15:20 jekstrand: daniels: You joke but..... That's not a bad idea, actually.
15:21 jenatali: Oh right. Linkage we should actually just support. Float16Buffer I added as supported when I implemented the vload/vstore_half stuff
15:21 jekstrand: We really should compile CLC, optimize it a bit, and then serialize the NIR to the disk cache.
15:21 jenatali: Which I also still have to port over...
15:22 alyssa: ooi does cl need shader variants for anything?
15:22 jekstrand: alyssa: It shouldn't
15:24 daniels: jekstrand: it really isn't a bad idea
15:24 karolherbst: jenatali: https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/720
15:25 jekstrand: daniels: I think we do that for the fp64 shader in iris
15:25 jekstrand: daniels: IIRC, it sped up CTS runs a bit
15:25 karolherbst: jekstrand: well.. you could implement local size hints as variants
15:25 daniels: alyssa: the one place we needed a variant for, we ended up replacing that with a function call to a magic runtime-provided function which tells you how tragic your native fma is
15:25 karolherbst: or offsets
15:25 jekstrand: karolherbst: *could* and *must* are different, I think.
15:26 karolherbst: yeah
15:26 jekstrand: I think alyssa is asking because she's very tired of shader variants. :)
15:26 karolherbst: but need is also not require ;)
15:26 karolherbst: ahh, yeah
15:26 karolherbst: understandable
15:26 jenatali: Yeah, we have to implement changing the local size as a DXIL variant
15:26 jenatali: We end up implementing them as NIR variants just because it was easier
15:27 karolherbst: jenatali: soo, clinfo takes like 3 seconds :)
15:27 jenatali: That's way too long
15:27 karolherbst: clc gets linked like 5 times?
15:27 jenatali: karolherbst: Is that an optimized build?
15:27 karolherbst: no
15:27 karolherbst: but still
15:27 karolherbst: clc is the problem here
15:27 alyssa: daniels: makes sense
15:27 jekstrand: karolherbst: Are you compiling CLC at driver init or just fetching the file from disk?
15:28 alyssa: [lol no generics.jpg]
15:28 jenatali: karolherbst: Sounds like libclc loading/parsing should be moved from context creation to somewhere more global so we don't do it for every context...
15:28 karolherbst: yeah...
15:28 jenatali: FWIW it's completely global in CLOn12
15:28 karolherbst: I guess we should do it once for each device
15:28 karolherbst: but...
15:28 jekstrand: We need to compile per-device but we can do that on-demand
15:28 karolherbst: nir_options is something we only get with a context
15:29 karolherbst: ohh yeah...
15:29 jekstrand: We can load the CLC file once per 32/64-bit
15:29 karolherbst: mhh maybe
15:29 karolherbst: jekstrand: we really need just one version
15:29 karolherbst: even though the CL spec leaves it open to support both address bits, there is no way to actually flip it over
15:30 jenatali: karolherbst: Each device has an innate bitness. You need both versions if you want to support devices with both bitnesses
15:30 jenatali: You can't request a device to use shaders from a different bitness though
15:31 karolherbst: we need to compile per device anyway
15:31 jenatali: I'm open to Clover patches which cache things or do things more globally :)
15:31 jenatali: It's a bit hard for me to author them since I don't really have a setup to try things at the moment
15:32 karolherbst: ahhh.. the CI pipeline for the translator takes so long ;D
15:33 karolherbst: mhhh...
15:33 karolherbst: running the CTS I only see the warning printed once...
15:33 karolherbst: I think clinfo creates multiple queues
15:33 karolherbst: ahh yeah.. it happens on each subtest
15:33 karolherbst: okay...
15:34 karolherbst: jenatali: not even with llvmpipe?
15:36 pcercuei: I want to add a function do_stuff(struct drm_device *drm), that can only run on a modeset. It must not start before the modeset, and the modeset cannot continue until the function ran
15:37 pcercuei: What's the best way of doing this?
15:37 alyssa: ...and yes, I *am* tired of shader variants :p
15:37 pcercuei: (that function is called in a different thread, otherwise it'd be easy)
15:39 vsyrjala: if you want to block on it immediately why would you run it in a different thread?
15:39 karolherbst: jekstrand, jenatali: did any of you wanted to look into the NIR_PRINT stuff? I could work on the caching compiled clc
15:39 jenatali: karolherbst: I already handled it
15:40 pcercuei: vsyrjala: I don't want to, I just don't have a choice. It's a callback from the clk framework (clk_notifier_register)
15:40 karolherbst: jenatali: it's not wired up yet I guess, but yeah.. looks fine I think
15:40 karolherbst: I left a comment though :p
15:40 karolherbst: ohh wait
15:41 karolherbst: mhhhhhh
15:42 karolherbst: mhhhh
15:42 karolherbst: clc doesn't actually run any kernels
15:42 karolherbst: uhm
15:42 karolherbst: clinfo
15:44 karolherbst: mhhhh
15:44 jenatali: karolherbst: We defer libclc loading until the first program compile, and then just cache it globally
15:45 karolherbst: yeah...
15:45 karolherbst: let's see
15:45 karolherbst: heh, fun. file.open throws an exception when the file doesn't exist
15:46 jenatali: Oh, is that the crash? Unhandled exception?
15:47 karolherbst: yeah
15:47 karolherbst: clCreateContext doesn't catch it as it seems
15:48 jenatali: That seems... wrong
15:48 karolherbst: I think we can remove the failbit thing
15:48 jenatali: https://en.cppreference.com/w/cpp/io/basic_ifstream/open says it should just set the failbit on failure
15:48 karolherbst: we enabled exceptions on failbit
15:48 jenatali: Ohh
15:49 karolherbst: let's see
15:49 jenatali: Seems like that should just be removed
15:50 karolherbst: yep
15:50 karolherbst: removing it makes it work
15:50 jenatali: Great, thanks
15:50 karolherbst: I'll leave a comment
15:56 karolherbst: soo.. now how to deal witht he clc binary... mhhh
16:12 vsyrjala: pcercuei: so you somehow trigger some other thread to call some clk framework thing that calls your notifier and want to wait until it was called?
16:16 pcercuei: vsyrjala: yes. The callback is called before and after the clock is modified (with a different parameter: PRE_RATE_CHANGE then POST_RATE_CHANGE). For the first one, I need to wait until I'm in vblank, so I do drm_crtc_wait_one_vblank(). But then I need to wait until I get called again with POST_RATE_CHANGE to continue with the modeset
16:17 mripard: pcercuei: can you do it the other way around, ie have your modeset code call the clk framework to change the rate?
16:18 pcercuei: mripard, the modeset code changes the pixclock to whatever the panel needs in the modeset. But the pixclock is derived from the PLL, and the PLL can change over time because of cpufreq
16:19 pcercuei: so I register a clock notifier on the PLL, and need to recompute the pixclock rate when the PLL rate changes
16:19 mripard: i'm not sure what is the relation with the modeset there then
16:20 mripard: the cpufreq rate change can happen at any point in time?
16:20 pcercuei: yes. But I get a callback before/after it happens
16:20 mripard: (or at least, there's no correlation between a cpufreq change rate and a modeset)
16:21 mripard: right, but I don't get the part where you said that it had to happen during a modeset
16:22 pcercuei: mripard: if your PLL go from 200 MHz to 400 MHz, your pixel clock may have an invalid frequency for a short time while the panel is displaying
16:22 mripard: yes
16:23 mripard: but there's no modeset involved in this case
16:23 mripard: as far as DRM is concerned, it just happened
16:23 pcercuei: there will be one next time the application flips its buffers :)
16:23 mripard: yeah, but you have no idea when that will happen
16:24 mripard: or even if it happens
16:24 mripard: so you can't really rely on that
16:24 pcercuei: I know
16:24 pcercuei: How is it handled elsewhere?
16:25 vsyrjala: sane hardware has dedicated clocks for display :P
16:25 pcercuei: vsyrjala: I didn't know "sane hardware" was a thing :)
16:26 vsyrjala: true enough. let's go with "slightly less insanse"
16:27 mripard: pcercuei: you can either lock the clock frequency using clk_set_rate_exclusive, but you would effectively lose the cpu clock rate change I guess?
16:27 lynxeye: pcercuei: Why is your cpufreq changing a shared PLL? No divider to downclock the CPU?
16:27 pcercuei: mripard: correct
16:27 mripard: or you can use a notifier a pass the drm_device as an argument (but I'm still unsure why you mentionned the synchronisation with the modeset)
16:28 pcercuei: lynxeye, yes, a proper cpufreq would downclock the CPU and not touch the PLL. But we mainly overclock, so the PLL is modified
16:29 mripard: pcercuei: is there RGB LEDs on your hardware too ? :)
16:30 pcercuei: mripard: https://zcrc.me/assets/images/rs90-tux.jpg
16:30 pcercuei: showing ingenic-drm in its glory
16:31 vsyrjala: this all sounds pretty fragile since i guess there are no guarantees the clock change will finish during the vblank even if you wait for vblank in the pre notifier
16:31 pcercuei: vsyrjala, true.
16:32 mripard: ah, I missed the part about the vblank, yeah, it makes sense then
16:37 pcercuei: I have the same problem in the MMC driver, with more damaging consequences, since it disrupts communication with the SD card
16:38 mripard: it sounds pretty bad :)
16:39 mripard: if the CPU clock has a separate divider, maybe you should just raise the PLL in the bootloader / when the clock driver registers, and then you would just change the CPU clock divider?
16:40 pcercuei: can't do that :( Because not all devices can overclock the same
16:50 jenatali: jekstrand: Think !6313 is good to Marge or should I be waiting for someone else to take a look?
16:50 jekstrand: jenatali: Marge it
16:56 pcercuei: mripard, this somewhat works: https://gist.github.com/pcercuei/a077a61d20e6a8d5ee0b2cbdb6164b78
16:56 pcercuei: but it expects ingenic_drm_crtc_atomic_flush() to be called periodically
17:02 mripard: I'm not sure it's something we can expect
17:02 karolherbst: jekstrand: I think we might want to write up shader caching in clover the proper way. So we can cache spirvs and nirs
17:02 karolherbst: and then we can just throw libclc into this as well
17:02 jekstrand: karolherbst: Yeah, probably
17:02 karolherbst: does our sahder cache track last use or something?
17:03 karolherbst: I really don't want to risk clc to get evicted, but if we always fetch it then it would stay in that case I guess
17:04 karolherbst: last time I talked about it curro_ didn't like the idea and said applications are responsible for cachining, but I still think we should still just do it
17:10 jenatali: That reminds me... I should really hook up caching too...
17:10 karolherbst: yeah..
17:11 karolherbst: airlied: any other idea on how to properly cache clc?
17:11 jekstrand: karolherbst, jenatali, bbrezillon: I'm rebasing and reworking a few things in my deref alignment MR right now. I'll try to pull in all of the patches from bbrezillon's that we still need with the deref alignment stuff.
17:11 karolherbst: wiring up a shader cache is kind of annoying in clover
17:12 jekstrand: Once that's done, what's between us and landing that and the __constant stuff?
17:12 karolherbst: is the constant stuff reviewed? I mean.. it does work perfectly fine, I just don't know if it requires some patches from me or not?
17:13 jekstrand: karolherbst: I think there were some comments I may still need to deal with
17:13 jekstrand: And I think I'd like to add a load_global_constant intrinsic before we merge it.
17:13 karolherbst: mhhh
17:13 jekstrand: Int the back-end, you should be able to treat it identically to load_global but NIR can do better stuff with it.
17:13 karolherbst: what is that for?
17:13 karolherbst: ahhh
17:13 karolherbst: I see
17:13 jekstrand: In particular, it's trivially CSEable
17:13 karolherbst: right
17:14 jekstrand: Because you know the memory can't change and so there are no inter-instruction dependencies.
17:14 jekstrand: It doesn't really have to be different
17:14 karolherbst: jekstrand: I have this patch for clover: 7fd79919321e0e192ca004612ad99198af824f86
17:14 jekstrand: It's just convenient if it is
17:14 karolherbst: ..
17:14 karolherbst: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/7fd79919321e0e192ca004612ad99198af824f86
17:14 jekstrand: karolherbst: Yeah, I should learn enough about clover to review patches :)
17:14 karolherbst: :D
17:15 karolherbst: but I'd like to include the clover integration into the MRs as well, so we don't drift apart that much
17:16 karolherbst: I think my patch is fine, at least I don't know any reason it's still WIP
17:18 Vanfanel: Hi. I am trying to use a PRIMARY plane (video mode is 1360x768) to scale a 640x480 buffer fullscreen. For that, I use these PRIMAR PLANE props: SRC_W=640; SRC_H=480; SRC_X=0; SRC_Y=0; CRTC_W=1360; CRTC_H=768; CRTC_X=0; CRTC_Y=0; But only a part of the 640x480 image is shown, starting on the lower part of the screen. So is it possible to do it at all using a PRIMARY plane or I need an overlay for this?
17:18 jenatali: jekstrand: Constant memory looks fine to me, I don't think I had any concerns with it
17:19 karolherbst: jenatali: already using it?
17:20 jenatali: karolherbst: I've got a series that cherry-picks it to our fork, and it doesn't break us at least
17:20 jenatali: I don't think it really improves things though
17:20 pcercuei: Vanfanel: SRC_W/H are in 16.16 format, so you should use (640<<16) and (480<<16)
17:20 jenatali: At least, not on its own
17:21 Vanfanel: pcercuei: yes. I do that with SRC_W/H.
17:22 pcercuei: Vanfanel, then it all depends on whether your primary plane can scale
17:23 Vanfanel: pcercuei: it seems to be able: horizontally, image seems correcty scaled
17:25 Vanfanel: pcercuei: the problem is that, vertically, image starts on the lower part of the screen, with a great black border on top
17:25 Vanfanel: *big black border
17:26 Vanfanel: but it's stretched horizontally
17:29 pcercuei: what driver?
17:30 Vanfanel: pcercuei: AMDGPU
17:33 Lyude: Hm, does anyone know off the top of their head if it's possible for an atomic commit in DRM to change the mode on a given CRTC without including its connector in the commit?
17:34 airlied: karolherbst: what takes the time btw? loading, validating?
17:34 karolherbst: airlied: clc
17:35 airlied: but what part of clc loading
17:37 airlied: radv caches its meta shaders in a private cache from memory
17:38 vsyrjala: Lyude: one is supposed to call drm_atomic_add_affected_connectors()/etc. when flagging a modeset
17:40 Lyude: vsyrjala: even when just performing a modeset that doesn't change the connectors/encoder in use on a CRTC?
17:52 Lyude: actually - I thought that might matter for this https://gitlab.freedesktop.org/lyudess/linux/-/tree/wip/dp-mst-edid-reprobe-v1 but looking at things more closely I don't think it will (I was thinking I might need to grab connection_mutex in drm_dp_mst_handle_conn_stat() in order to update the connector epoch, but I realized the only thing that can really change as a result of a CSN is ldps or
17:52 Lyude: ddps (the pbn can technically change there, but we only reprobe it on ddps/ldps change and since those imply connection status changes already, userspace should still be able to pick up on that change without us needing to increment the epoch counter)
17:54 Lyude: (it would have mattered otherwise since then there'd basically be no need for us to grab &mgr->base.lock in the hotplugging paths for mst at all)
18:01 airlied: jenatali: I just merged the libclc MR into master here and got a glsl_get_bit_size(dest_type) where dest_type is a vtn type
18:02 jenatali: airlied: Hm, probably merge conflict, one sec
18:03 jenatali: Yeah, merge conflict, my bad
18:08 airlied:still gets a crash loading libclc here in var->var constant init land
18:08 jenatali: airlied: Pretty sure you still need jekstrand's constant series + Clover patches from karolherbst
18:10 jekstrand: Yeah, we need the constant stuff for CLC
18:11 jekstrand: Otherwise, we have to do a bunch of juggling about constant formats
18:11 airlied:goes to make its how pile of MR tree :-:
18:11 jenatali: Yet another reason not to jump the gun on merging the libclc series
18:12 jenatali: I think it's in pretty good shape on its own, but yeah the dependencies are still all coming together
18:17 airlied: Can't find clc function _Z3fmafff
18:17 airlied: btw likely because I'm missing clc patches
18:18 jenatali: airlied: That's https://reviews.llvm.org/D85911
18:20 curro:'s breakers just went off for no reason... too many GPUs running at full TDP on the same circuit? or more likely a shortcircuit somewhere :/
18:20 curro: karolherbst: in what context did i say that? i'm not opposed to shader caching a priori
18:27 airlied: karolherbst: NIR_VALIDATE=9 make any difference btw
18:30 airlied: ah yeah a lot of time is in the spirv parsing alright
18:31 jekstrand: I guess we now have a SPIR-V parsing benchmark. :-/
18:31 jekstrand: There shouldn't be any O(n^2) algorithms in there...
18:33 airlied: https://paste.centos.org/view/raw/ff753ed9
18:35 airlied: https://paste.centos.org/view/raw/d9673275
18:35 airlied: bit more in that second one
18:43 jenatali: Huh, interesting
18:44 karolherbst: curro: caching inside clover for clc, spirvs and nirs
18:46 curro: karolherbst: not opposed to that a priori, maybe i just didn't like your specific proposal ;P
18:46 karolherbst: could be
18:50 airlied: jenatali: nothing really stands out, its just processing a lot of spir-v takes a a lot of time :-P
18:50 jenatali: Yep, pretty much
18:50 jenatali: Solution isn't to make it faster, it's to not do it / do it less often
18:51 jenatali: I've noticed on Windows, debug builds are suuuuper slow at SPIR-V parsing, but I haven't noticed any perf problems with release builds
18:56 jekstrand: NIR is very assert-happy
18:56 jekstrand: Especially nir_builder
19:48 airlied: https://paste.centos.org/view/21929235
19:48 airlied: karolherbst: uggh clover linking against mesa utils seems broken
19:48 karolherbst: airlied: why
19:48 airlied: at the how does any of this work stage of the day
19:49 karolherbst: it has its own copy of everything
19:49 airlied: the link line puts the llvmclnir.a after the libmesautil.a
19:50 airlied: when doing the final link
19:51 airlied: it's a meson thing, but I'm not sure what particular meson thing I have tochange
19:52 jekstrand: karolherbst: I'm about to construct a better "packed" MR. I intend to delete all of the spirv_to_nir code for OpenCL type layout as part of it.
19:52 jekstrand: karolherbst: Do you mind?
19:52 jekstrand: karolherbst: And make us use vars_to_explicit_types for everything
19:52 karolherbst: that's fine
19:53 karolherbst: I can do some testing after you are done with it and verity it doesn't break anything
19:53 jekstrand: karolherbst: Cool.
19:53 jekstrand: karolherbst: I think packed is the only reason to still be using the layout code in spirv_to_nir
19:56 airlied: karolherbst: oops missing extern "C" :-P
19:58 jenatali: jekstrand: Once you've got that up I'll try to set aside some time to replace our current packed handling with what you've got, at least to try it out
19:58 jekstrand: jenatali: Ok, cool.
19:59 jekstrand: jenatali: Do you think we're getting pretty close to time for you to do your mega-rebase from hell?
19:59 jekstrand: s/rebase/rewrite/ :-P
20:00 jenatali: Pretty close, yeah
20:00 jekstrand: It's not really my problem in some sense, but it makes me nervous how much we're changing out from under you given that CLOn12 is the best testing platform right now in terms of completeness.
20:01 jekstrand: And every time you test anything, it's a disaster of a cherry-pick
20:01 jenatali: Yeah, which is why I'm doing my best to shovel things upstream so we can get things more in-synch rather than less
20:01 jekstrand: Yup
20:02 jenatali: If only I didn't have all these other things as part of my day job that're distracting me this week...
20:02 jekstrand: Yeah, I hear you
20:03 jenatali: But yeah, I still need to prep MRs for vload_half/vstore_half and the conversions with rounding/saturation
20:04 jenatali: I think those are the last things that still need to have MRs created, everything else just needs to settle, and then we'll need to deal with all the churn that's happened either as part of reviewing code upstream, or just unrelated upstream churn (like the variable list change)
20:09 karolherbst: airlied: you work on the caching clc thing?
20:09 airlied: I think my first disk cache effort only took 1s of the 6s
20:09 karolherbst: :D
20:09 karolherbst: nice
20:09 airlied: I suspect sha1 for the whole spirv is eatint time
20:09 karolherbst: you are only caching clc at this point, right?
20:10 karolherbst: mhhh
20:10 karolherbst: airlied: I have a stupid idea
20:10 airlied: yeah https://paste.centos.org/view/raw/4276d05d
20:10 airlied: simple to start woth
20:10 karolherbst: use "mesa-$version-clover-spirv-libclc-dhasjkdhaksdhakjsdhasd" as the input string for the hash :p
20:10 karolherbst: but I guess we want to support updated clc...
20:10 karolherbst: and maybe we don't want to mess with creation/change dates?
20:11 karolherbst: I think it's just pointless to hash the entire thing
20:11 airlied: so it works
20:11 karolherbst: right
20:11 karolherbst: but I think we can be smarter... mhhh
20:11 airlied: most of my time is now spent in clang
20:11 karolherbst: ohhhh
20:11 karolherbst: can we support compression?
20:11 karolherbst: and tell the clc build system that it compresses it with xz or something
20:12 karolherbst: and we just hash the archive
20:12 karolherbst: airlied: ahh, cool
20:12 karolherbst: but I guess we can live with cold cache taking a lot of time
20:12 airlied: need to see where time is psent now
20:12 airlied: looks like deserializing nir
20:12 karolherbst: ahh
20:12 karolherbst: how long does clinfo take with a warm cache?
20:13 airlied: perf takes about 1 minute to load the clinfo trace
20:13 airlied: karolherbst: it's gone from 6 seconds to 5 seconds
20:13 karolherbst: ehhh
20:13 karolherbst: that's not much...
20:13 karolherbst: airlied: I guess the issue is that clinfo creates multiple contexts or whatever
20:13 karolherbst: and we do the deserialization 5 times
20:14 jekstrand: Can we load the SPIR-V file at context init and delay parsing the actual SPIR-V until our first compile?
20:14 airlied: so mnuch of my startup time is in clang
20:14 karolherbst: jekstrand: we don't have a gallium context at that point
20:15 karolherbst: clover context == gallium screen
20:15 airlied: which is likely due to debug builds or the like
20:15 karolherbst: ahh
20:15 karolherbst: probably
20:15 karolherbst: but I guess we could create a throw away context to compile clc
20:16 karolherbst: and store it on the clover context
20:16 airlied: https://paste.centos.org/view/1b27d406
20:16 karolherbst: mhhh
20:16 karolherbst: oh well
20:17 airlied: so clc loading went from > 20% of time to 13%
20:17 karolherbst: but that's still a lot :D
20:17 karolherbst: airlied: mind trying with clang release build?
20:17 karolherbst: or wait... I could
20:18 karolherbst: compiling mesa as release as well :)
20:20 airlied: karolherbst: the reason I load clc in the context I think was clover's constructors are a mess
20:20 karolherbst: yeah
20:20 karolherbst: but that's the C++ way
20:20 karolherbst: normally
20:20 airlied: without rewriting it with explicit ordering I don't think I could make it work
20:21 karolherbst: what do you mean with explicit ordering?
20:21 airlied: like avoiding ctors
20:21 airlied: and just calling init functions
20:21 karolherbst: ctors are ordered
20:21 airlied: for a shared library, they aren't orderable though
20:21 airlied: you can't change the order
20:21 karolherbst: the thing about ctors is that you do all resource allocation in them so they get released whenever you call the dtors
20:22 karolherbst: init function destroy this idiomatic C++ behaviour
20:22 airlied: yes, but the thing about ctors if they are a bad idea for a shared library :-P
20:22 curro: why would they be a bad idea in a shared library?
20:22 airlied: because you've no way to order them
20:23 karolherbst: curro: on shared object loading time
20:23 karolherbst: right..
20:23 airlied: on library load the linker just runs them al
20:23 karolherbst: so the order is random
20:23 airlied: even if it's a defined order
20:23 curro: airlied: is that because you are defining global variables?
20:23 karolherbst: of the static init functions.. mhh
20:23 airlied: it's not something you control
20:23 airlied: so if you for instance need something in device before context or vice versa
20:23 karolherbst: airlied: 1.3s -> 0.7s
20:23 karolherbst: but that's still slow :/
20:23 karolherbst: I see it stuttering for each clc load
20:24 karolherbst: curro: I think it has to do how clover gets loaded
20:24 karolherbst: like when all the devices are iterated
20:24 karolherbst: that kind of happens.... no idea when
20:25 karolherbst: ohh when the platform gets loaded.. let's see
20:26 karolherbst: airlied: mind checking when platform::platform gets called?
20:26 curro: uhm, so your data will get constructed whenever the object it's part of is constructed. if you declare it in the device it will be constructed when the device is created, if you declare it in the context it will be constructed when the context is created. isn't that precisely what you want even in a shared library?
20:26 karolherbst: curro: I think clover has global state already
20:27 karolherbst: and that's what airlied is hittingh
20:27 karolherbst: so you can't rely on other stuff being there really
20:27 curro: it only has one global: the one Clover platform object
20:27 karolherbst: yes
20:27 karolherbst: and that's when the devices are created as well
20:28 karolherbst: so we can't really do compilation of anything
20:28 karolherbst: in device::device
20:29 karolherbst: airlied: src/gallium/frontends/clover/api/platform.cpp is where the global platform object is
20:29 karolherbst: mhhh
20:29 airlied: ideally I'd called load_clc from the device constructor
20:29 karolherbst: yeah.. that makes a lot of sense
20:29 karolherbst: but we can't really rely on gallium being fully loaded at that point, right?
20:29 airlied: linfo: ../src/compiler/glsl_types.cpp:1043: static const glsl_type* glsl_type::get_array_instance(const glsl_type*, unsigned int, unsigned int): Assertion `glsl_type_users > 0' failed.
20:29 airlied: is what happens when I do
20:29 karolherbst: like, we couldn't just create a gallium context doing compilation from inside device::device
20:29 karolherbst: uhhh
20:29 airlied: then I chase down where we init glsl types
20:30 airlied: and it hasn't happend yet
20:30 karolherbst: yeah.. that's on me :D
20:30 airlied: because some constructor isn't called
20:30 karolherbst: src/gallium/frontends/clover/nir/invocation.cpp .. mhhh
20:30 karolherbst: airlied: mind reworking the glsl_type ref thing?
20:30 karolherbst: and call it inside device::device or so?
20:31 karolherbst:somehow knew this would backfire at some point
20:31 airlied: karolherbst: yes so that's global
20:31 karolherbst: yep
20:31 karolherbst: but we can actually move it into the device::device constructor
20:31 karolherbst: I just did it there so it happens only once
20:31 karolherbst: not for each device
20:31 karolherbst: but it also doesn't harm doing couple of times I think
20:31 karolherbst: as long as we unref in the dtor
20:32 curro: oh, yikes, didn't remember that one
20:34 airlied: now I get a segfault in a glsl type access later
20:34 karolherbst: :/
20:34 airlied: so I find out there's a reason you put it in a global :-P
20:35 airlied: I've vague memories of this rabbit hole
20:35 karolherbst: yeah...
20:35 curro: airlied: how are you trying to compile stuff from device::device? maybe you could just declare a local glsl_type_ref in your code to have the guarantee that the glsl type stuff is initialized at the point you try to compile that code
20:35 airlied: curro: I want to load the clc library from the device constructor
20:36 airlied: since that is where it makes the most sense
20:36 airlied: at the moment I have the context constructor load it after the facts as a workaround
20:37 karolherbst: airlied: wait a second.. I am not even sure that's the problem of the longer loading time at this point.. let me check something
20:37 curro: airlied: so you are calling glsl_type_singleton_decref() in the device constructor now in addition to the init function?
20:38 karolherbst: ahh no...
20:38 karolherbst: with clinfo we end up getting context::context called a couple of times
20:38 karolherbst: so yeah. this is a problem
20:38 airlied: okau I think the problem is glsl_type has had it's constructor called
20:38 karolherbst: ufff
20:47 airlied: karolherbst, curro : yeah so all the glsl builtin types are constructed
20:47 airlied: taking the ref doesn't make that happen
20:48 airlied: it happens in order according to the linker
20:48 airlied: and the order is after the platform is constructed
20:48 airlied: and this is why I dislike C++ ctors in shared libs :-p
20:49 jenatali: Can't you just delay-init platforms/devices until the API's called to enumerate them?
20:49 airlied: jenatali: that's what I suggested above :-P
20:49 airlied: just have a nice init path
20:50 curro: airlied: ahh. well, they are kind of compelling in shared libraries because they save you from dealing with the concurrency issues you'd have with on-demand init
20:50 airlied: jenatali: though I'd still like to load libclc early and once
20:50 jenatali: Sure
20:51 jenatali: FWIW here's what I do: https://github.com/microsoft/OpenCLOn12/blob/master/src/main.cpp#L232, which is a delay-init
20:51 curro: airlied: you could also load libclc lazily when needed, but make sure it's only done once
20:51 jenatali: I load libclc during the first compile/link/build call
20:51 curro: right
20:51 airlied: we'd like to fail device init though if we don't have it
20:51 jenatali: Ah right
20:52 airlied: or I suppose if it's buggy, like I could stat the file and say that's good enough
20:52 jekstrand: So load it from the file and store it as a std::vector<uint32_t> and don't parse until later.
20:52 jekstrand: No NIR or glsl_type loading required
20:52 jenatali: Yep
20:54 curro: airlied: i'd be fine with delaying the initialization of the _clover_platform singleton till the first use if that's the easier thing to do
20:54 curro: airlied: anyway can't help wondering, how are you trying to compile stuff in the device constructor if there is no pipe_context created at that point?
20:55 airlied: curro: it's just spirv->nir
20:56 airlied: there's no backend needed
20:56 jenatali: I thought you needed nir options from the pipe_context?
20:58 airlied: yeah it does I wonder why it doesn't fail
20:58 airlied: oh we just need the ipe screen
20:58 airlied: not the context
20:58 jenatali: Ah
20:58 airlied: to get the options
20:58 airlied: curro: my memory of doing this later, is that everywhere I could do it I have a const device reference
20:59 airlied: so there was no way to put the nir into the device
21:00 airlied: at tthat point clover defeated my C++ love and I just made it work
21:00 karolherbst: airlied: ohh, that makes it easier indeed
21:00 airlied: karolherbst: I can just rmove all the consts? :-)
21:01 karolherbst: airlied: mhhh...
21:01 airlied: jekstrand: the codei s designed like that it loads the file, and does the spirv->nir separate
21:02 airlied: the problem was clover didn't let me put the nir anywhere that wasn't const at the point I wanted to
21:02 karolherbst: I mean.. doing it inside device::device sounds like the best idea
21:02 karolherbst: mhhh
21:02 airlied: karolherbst: which means delaying platform init
21:02 karolherbst: heh.. wait
21:03 karolherbst: everything is statically linked in
21:03 karolherbst: mhhhh
21:03 airlied: my kid has a peppa pig book and the last line is the great big train went in a great bit circle
21:03 airlied: this reminds me of that :-P
21:03 karolherbst: order of initialization of global objects is pretty much undefined, no?
21:03 airlied: it's linker defined
21:04 karolherbst: I mean.. is there some order we can _rely_ on?
21:04 airlied: you can call that undefined for any real world use :-P
21:04 karolherbst: right..
21:04 karolherbst: so we globally construct the clover devices and glsl_types
21:04 karolherbst: airlied: I have a stupid idea...
21:05 karolherbst: huh.. wait
21:05 karolherbst: do we actually have global glsl_type types?
21:05 airlied: gcc has init_prioruty
21:05 karolherbst: I thought they are just pointers?
21:05 airlied: karolherbst: glsl_type::uvec2_type
21:05 airlied: is the one that explodes for me
21:06 karolherbst: after glsl_type_singleton_init_or_ref?
21:06 airlied: yes because it's a global ctor
21:06 karolherbst: mhhh
21:07 airlied: that function doesn't actually init anythiing
21:07 karolherbst: right.. the ref doesn't do that much
21:07 airlied: it just stops you from cleaning up
21:07 karolherbst: ehhh
21:07 karolherbst: glsl_type_users should be atomic, no?
21:07 airlied: it assumes the global ctor has been called before it
21:07 airlied: and I don't think there is anything you can do to fix it
21:07 karolherbst: or is ++/-- considered atomic?
21:07 airlied: it doesn't matter really
21:07 curro: airlied: ahm... then honestly it seems more compelling to simply stat() libclc from the device constructor and do the actual libclc loading lazily at compilation time
21:07 karolherbst: I know
21:07 airlied: curro: that hits the dev is const problem
21:07 karolherbst: airlied: but maybe we want to construct the glsl stuff inside the ref function?
21:07 karolherbst: dunno...
21:08 airlied: karolherbst: you can't
21:08 karolherbst: would get rid of global stuff
21:08 karolherbst: why not?
21:08 airlied: karolherbst: ah mybe
21:08 airlied: if you get get rid of global stuff
21:08 karolherbst: right
21:08 karolherbst: can be just pointers
21:08 airlied: I'm not sure how to do that though
21:08 karolherbst: and we just call new inside reg
21:08 karolherbst: *ref
21:09 curro: airlied: const references shouldn't be a problem, define the data you want as a lazy<T> in the object, passing an init as argument on device construction time, then your init function will be called the first time you use the object, doesn't matter whether it's const
21:09 airlied: karolherbst: all those are statics insdie glsl_type
21:09 curro: <3 C++
21:09 karolherbst: airlied: sure.. but those are all accessed via functions anyway
21:09 karolherbst: or...
21:09 karolherbst: mhhh
21:09 airlied: karolherbst: nope, direct refs
21:09 karolherbst: where?
21:09 airlied: glsl_type::uvec2_type
21:09 airlied: and vec4_type
21:09 airlied: those are used everwhere
21:10 karolherbst: ahh.. inside glsl/
21:10 karolherbst: mhhh
21:10 karolherbst: curro: we still have to assign something somewhere
21:11 karolherbst: and change an object
21:11 karolherbst: but maybe that would be a bit fine? dunno how that works
21:11 curro: karolherbst: that's done transparently by clover::lazy, it will call an arbitrary function to initialize the object the first time you use it, then save the result
21:11 karolherbst: well.. right, but that kind of sounds non const to me
21:12 karolherbst: but maybe there is some magic happening I don't see
21:12 karolherbst: ehh "mutable"
21:12 curro: karolherbst: it has const semantics. accessing the object doesn't change it's behavior in any way. it's just its performance behavior which is non-const
21:13 maelcum: it's conceptually const, just make sure that the init function doesn't rely on state that is not guaranteed to be a certain way when it's called ;)
21:13 karolherbst: airlied: just use mutable on the cached data :p
21:13 maelcum: you can certainly mess it up, just like with mutable. so... don't :>
21:13 karolherbst: clover::lazy uses mutable
21:14 curro: well, i added that lazy helper in order to guarantee that mutable isn't used in a way that changes the semantics of a const object, please use it instead of a bare mutable object
21:14 maelcum: (<- c++ dev just overheard the conversation)
21:18 airlied: curro: lazy is certainly under doucmented, and under used, I'm not sure even looking at the one use of it, that it makes enough sense to me, to writea second users
21:18 airlied: but I think I kinda get it enough to hack something up
21:20 karolherbst: what we really need is more operator-> :D
21:20 curro: airlied: i'm sorry for that it certainly is underdocumented
21:20 curro: basically you just init a lazy object like: lazy<your_type> x(expensive_function_that_inits_the_object); then just x() in order to access it
21:20 karolherbst: I don't feel like operator() is used correctly overall
21:21 karolherbst: x-> feels more natural
21:21 karolherbst: we should probably add some operator-> overloads where we use operator() as "dereferencing"
21:22 curro: karolherbst: *shrug*, -> doesn't work if you aren't trying to access a method of the contents e.g. if you're just caching an int
21:22 karolherbst: right.. but I meant lazy and what we do with the state objects
21:23 airlied:isn't seeing where the expensice function is for the lazy ulong in event
21:23 karolherbst: "kern.program().devices()" should just be d_kern->program->devices
21:23 curro: karolherbst: but assume that you're caching a lazy<int>, then you try to print it, how are you going to use -> for that if int doesn't have any methods?
21:24 karolherbst: operator int()
21:24 airlied: this much C++ requires caffination I expect
21:24 karolherbst: uhm
21:24 karolherbst: operator T()
21:24 karolherbst: uhm
21:24 karolherbst: what was the cast operator? :D
21:24 karolherbst: T operator()?
21:24 jenatali: operator T()
21:25 karolherbst: jenatali: no.. that's var() thing, no?
21:25 karolherbst: I meant int t = var;
21:25 curro: karolherbst: which defines an implicit conversion which may lead to subtle issues... the reason () is used instead is because it requires you to request the conversion explicitly
21:25 karolherbst: right....
21:25 jenatali: You could always add "explicit"
21:25 karolherbst: yeah.. I guess
21:25 karolherbst: but that also makes it annoying to use
21:25 karolherbst: anyway, those () felt unnatural in a couple of places
21:26 jenatali: karolherbst: Casting to the thing from a T is just a constructor
21:26 karolherbst: jenatali: there are cast operators though
21:27 karolherbst: you could also add T& operator() to satisfy function taking references
21:27 karolherbst: or maybe only the reference one?
21:27 karolherbst: dunno...
21:27 karolherbst: it's a bit annoying
21:27 karolherbst: using operator() feels too java like
21:27 karolherbst: and I don't like java anymore
21:27 karolherbst: reminds me of that
21:28 airlied: curro: I'm not following why the event code uses lazy btw
21:28 curro: airlied: i don't like java either, but find function call operators kind of handy :P
21:28 curro: karolherbst: ^
21:28 jenatali: Yeah, I'm generally not a fan of using operator(). There's instances where it makes sense but sometimes it's abused, like hash<T>
21:28 karolherbst: ohh, I am not arguing on a technical level
21:28 karolherbst: it just feels unnatural
21:29 karolherbst: "kern.program().context().devices()" *shouders*
21:29 karolherbst: that really should be kern->program->context->devices :D
21:29 curro: airlied: the "expensive" function used for lazy in event.cpp is clover::timestamp::query IIRC
21:29 curro: airlied: it's a function object since it needs more state than just a function pointer
21:30 karolherbst: but I guess we shouldn't start doing clover::core::kernel* operator->(cl_kernel kern) things :D
21:30 karolherbst: would be fun though
21:32 curro: karolherbst: *shrug*, coming from a functional programming background, i feel like a function call with empty braces should be fully equivalent to writing the name of a naked value containing its result, /me can barely see the redundant braces :P
21:34 airlied:thinks I'd write a C version of clover quicker than work out lazy today
21:35 curro: airlied: what's the problem..? just declare an init function and pass it to it on construction
21:35 curro: should even work with a lambda expression if you don't feel like declaring an init function
21:36 airlied: curerntly I have a struct nir_shader *
21:36 airlied: this clearly needs an object wrapped around it
21:36 karolherbst: airlied: lazy<nir_shader *> clc([&] () { ... return nir; };
21:36 karolherbst: uhm...
21:36 karolherbst: );
21:36 karolherbst: or something like that
21:36 curro: airlied: then just declare it like lazy<nir_shader *> in the object, and on construction pass it a function that returns your nir_shader *
21:42 airlied: karolherbst: did a cat walk over your keyboard? :-)
21:43 airlied:isn't sure how to get the log parameter in there even if I do make it work
21:44 airlied: ../src/gallium/frontends/clover/core/device.hpp:112:42: error: expected identifier before ‘[’ token
21:44 SolarAquarion: getting a build failure on mesa-git /mesa/src/gallium/auxiliary/gallivm/lp_bld_init.c: In function ‘create_pass_manager’:
21:44 SolarAquarion: ../mesa/src/gallium/auxiliary/gallivm/lp_bld_init.c:172:7: error: implicit declaration of function ‘LLVMAddConstantPropagationPass’; did you mean ‘LLVMAddCorrelatedValuePropagationPass’? [-Werror=implicit-function-declaration]
21:44 SolarAquarion: 172 | LLVMAddConstantPropagationPass(gallivm->passmgr);
21:44 SolarAquarion: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21:44 airlied: 112 | lazy<struct nir_shader *> clc_nir( [&]() { std::string log; return nir::libclc_spirv_to_nir(mod, this, log); }
21:44 SolarAquarion: | LLVMAddCorrelatedValuePropagationPass
21:44 SolarAquarion: cc1: some warnings being treated as errors
21:44 curro: whoa, guys, one at a time ;)
21:44 airlied: karolherbst: is my results of what you said there
21:45 airlied: SolarAquarion: is that against llvm git?
21:45 curro: airlied: you need to pass it inn the constructor of device::device
21:45 SolarAquarion: airlied: yes
21:47 curro: airlied: like: ", ldev(ldev), clc_nir(your function here)", *or* initialize it with an assignment from the constructor body itself "clc_nir = lazy<nir_shader *>([&]() {...});"
21:50 curro: airlied: once you've gotten that to work, i suggest changing it to an std::shared_ptr<nir_shader> instead though so you don't need to manually free the memory pointed by it
21:50 airlied: ../src/gallium/frontends/clover/util/lazy.hpp: In instantiation of ‘clover::detail::deferred_lazy<T, F>::operator T() const [with T = nir_shader*; F = long int]’:
21:50 airlied: ../src/gallium/frontends/clover/util/lazy.hpp:58:10: required from here
21:51 airlied: ../src/gallium/frontends/clover/util/lazy.hpp:60:24: error: expression cannot be used as a function
21:51 jenatali: Ahhh good ol' C++
21:51 airlied: lazy<struct nir_shader *> clc_nir; in the device object
21:52 curro: airlied: that's because you're somehow passing a long int as init function
21:52 airlied: I suspect C++ is doing something here that I'm not
21:53 airlied: oh no it was me I think
21:53 airlied: I had an old clc_nir(NULL)
21:55 airlied: okay stuff runs, just need to make sure it works and clean it up a bit
21:58 airlied: curro: I have to ralloc_free the nir_shader anyways I expect
21:59 curro: airlied: you can pass ralloc_free to the shared_ptr constructor as deleter function, so it will be called for you once the device gets destructed
22:12 karolherbst: uhhh.. can we just convince libstdc++ people to always link against pthread? :D
22:13 karolherbst:feels that caring about single threaded applciations perf is so 1990
22:16 dcbaker[m]: karolherbst: I've heard the future if glibc is that pthreads, m, rt, and dl are all getting merged into the core. Presumably that affects c++ as well
22:16 karolherbst: I guess...
22:17 karolherbst: but that wouldn't prevent people being smart and say they use lockfree code for single threaded applications running into the same issue
22:17 karolherbst: or they accept single threaded doesn't matter
22:17 karolherbst: I just see those as two different issues
22:18 karolherbst: but maybe under the hood it would just work like that
22:18 karolherbst: like one could exchange func pointers once new threads are created or something?
22:18 karolherbst: oh well...
22:18 karolherbst: they'll figure it out I guess
22:22 karolherbst: airlied: do you have a patch already I could try as well?
22:22 karolherbst: or will it still take a bit of time?
22:23 jenatali: karolherbst, airlied: If you have updates to Clover for libclc, feel free to just push those straight into my MR
22:24 jenatali: karolherbst: Did you see your translator patch failed clang-tidy? https://travis-ci.org/github/KhronosGroup/SPIRV-LLVM-Translator/jobs/721736313
22:24 jenatali: Hooray 80-character lines...
22:25 karolherbst: yeah...
22:25 karolherbst: I noticed, but I ignored it and forgot :D
22:25 karolherbst:feels personally offended if somebody points out "only 80 characters" :p
22:26 karolherbst: but now that even the kernel moved to "please use 100" maybe everybody else should at least do 100 as well :p
22:31 airlied: karolherbst: just dropping kids off so got distrcted
22:31 dcbaker[m]: Just don't end up like the GL CTS where you need 2 ultra wide monitors to read one line...
22:32 airlied: will send a patch out once I'm back
22:34 karolherbst: dcbaker[m]: yeah.. I personally think around 120/140 is the sweet spot
22:35 karolherbst: some go instane and do happy 200 lines :D
22:35 karolherbst:had to work with somebody on a team in the past who thought early returns are the worst thing
22:35 karolherbst: imagine how deep stuff went from time to time
22:36 jenatali: D:
22:36 karolherbst:has many "fun" stories back when working on java stuff
22:38 karolherbst: it was a nice surprise when stacktraces contained less than 100 entries :D
22:39 bnieuwenhuizen: karolherbst: the amdvlk code is all without early returns. result in a lot of if (success) { if (sucess) { if (success) {}}}
22:39 karolherbst: uhhhhh
22:39 karolherbst: I call that "serious enterprise software"
22:39 bnieuwenhuizen: I guess we even have a bunch of that in mesa via addrlib
22:39 karolherbst: where developers have to show of by the amount of "anti-patterns" they tell others about
22:39 karolherbst: :p
22:40 karolherbst: :D
22:40 karolherbst: oh well...
22:40 karolherbst: I mean.. I kind of understand the reasons but... ufff
22:41 bnieuwenhuizen: for how often we modify the library ourselves it is not worth fighting it
22:42 karolherbst: yeah...
22:51 jenatali: jekstrand: Am I missing something, or does https://gitlab.freedesktop.org/mesa/mesa/-/blob/c77a5fcc40334918b5689e0c4f523b544c74d838/src/compiler/nir/nir_lower_io.c#L1704 always assert if you pass global to this pass?
22:52 jekstrand: jenatali: I have a patch which adds global to that list
22:52 jenatali: Not in that series :)
22:53 jekstrand: jenatali: There is now. :)
22:53 jenatali: jekstrand: Ah, there it is :)
23:06 mareko: karolherbst: only 80 characters :)
23:22 karolherbst: hey... :D
23:37 airlied: karolherbst, curro : https://paste.centos.org/view/raw/8ed21dc8 need some more c++ help there :-P
23:37 airlied: error at the bottom, trying to use shared_ptr with lazy
23:37 karolherbst: why?
23:37 karolherbst: why would that be even necessary?
23:38 airlied: cause curro said I should up there :)
23:38 airlied: can avoid the explicit free as well
23:39 karolherbst: mhhhh
23:39 airlied: just can't get the nir_shader * back out of the resulting const clover::lazy<std::shared_ptr<nir_shader> >
23:39 karolherbst: well..
23:39 karolherbst: lazy calls delete on the internal object
23:40 karolherbst: mhhh
23:41 karolherbst: let me try,,,
23:41 karolherbst: airlied: is that on top of the last patch?
23:41 airlied: I though () would at least get me the shared ptr
23:42 airlied: karolherbst: yes on top of the disk cache
23:42 airlied: https://gitlab.freedesktop.org/airlied/mesa/-/commits/clc-wip is my base
23:45 karolherbst: ehh.. mhhh
23:45 karolherbst: annoying..
23:45 karolherbst: let's see
23:46 jenatali: airlied: Are you just missing a .get()?
23:46 karolherbst: jenatali: implicit casting doesn't work
23:46 jenatali: Hm?
23:46 airlied: jenatali: nope tried that :-P
23:46 karolherbst: airlied: spirv_options.clc_shader = static_cast<std::shared_ptr<nir_shader> >(dev.clc_nir).get();
23:46 airlied: the () on the lazy object doesn't seem to work
23:47 karolherbst: airlied: because it's a cast operator
23:47 karolherbst: () only works if you supply the layz function
23:47 curro: airlied: er, you get that because it has no () operator
23:47 karolherbst: ehh
23:47 curro: it's a conversion operator
23:47 karolherbst: there is no operator()
23:47 airlied: doh
23:47 karolherbst: right :D
23:47 airlied: could there be a ()?
23:48 karolherbst: airlied: add an operator->
23:48 karolherbst: then you could do clc_nir->get()
23:48 airlied: that static cast looks might ugly
23:48 curro: airlied: you can use it like std::shared_ptr<nir_shader> nir = dev.clc_nir;
23:48 airlied: oh maybe I can just do that
23:49 airlied: okay that works for me
23:49 karolherbst: ehh...
23:49 karolherbst: that makes it also annoying somehow :D
23:49 karolherbst: but yeah...
23:50 curro: karolherbst: weren't you advocating for conversion operators instead of function call operators just a minute ago?! ;)
23:50 karolherbst: depends on the use case :p
23:50 karolherbst: but ultimately in this case you have a pointer
23:50 karolherbst: so you use operator->
23:50 curro: we could add that yes
23:51 karolherbst: but.. that doesn't seem to work as I expect it to...
23:51 karolherbst: let's see
23:51 jenatali: Personal opinion: smart pointers (like lazy/optional) should behave like pointers, with -> and * operators
23:51 karolherbst: yes
23:51 curro: jenatali: lazy isn't intended to behave like a pointer, it's intended to have value semantics
23:52 jenatali: I feel like trying to wrap something that's not a raw value with value semantics doesn't generally work in practice
23:52 jenatali: I wish it did, but it always has rough edges
23:52 jenatali: E.g. double conversions not being considered
23:52 curro: well it works in this case, since it's simply wrapping an object which needs to have value semantics itself
23:52 karolherbst: mhhh
23:52 curro: so i think if we add a -> operator to lazy, it should call the -> operator of the underlying object if it exists, otherwise it should fail to instantiate
23:52 karolherbst: that double operator-> messes tings up
23:53 karolherbst: curro: clc_nir->get() fails because I already have the nir_shader thing...
23:53 karolherbst: even though I return the same as in the conversion operator
23:54 karolherbst: mhhh...
23:54 curro: karolherbst: what do you mean by the nir_shader thing?
23:54 karolherbst: ../src/gallium/frontends/clover/nir/invocation.cpp:254:44: error: ‘using element_type = struct nir_shader’ {aka ‘struct nir_shader’} has no member named ‘get’
23:54 karolherbst: spirv_options.clc_shader = dev.clc_nir->get();
23:54 karolherbst: just added operator-> just like operator T()
23:55 curro: karolherbst: isn't that because you got a lazy<nir_shader *> instead of a shared_ptr like airlied?
23:55 karolherbst: I have the shared_ptr
23:55 karolherbst: but yeah..
23:55 karolherbst: seems like it?
23:55 karolherbst: but T is the shared_ptr<nir_shader> thing or not?
23:56 curro: karolherbst: can you show us the code?
23:56 karolherbst: https://gist.github.com/karolherbst/6a7b01f4305c0b9ee92cf287d7ebccb3
23:56 karolherbst: on top of airlieds patch
23:56 karolherbst: I am sure there is a magic rule that C++ follows all operator-> until the end...
23:57 karolherbst: "lazy<std::shared_ptr<struct nir_shader>> clc_nir;" is the declaration for me
23:58 karolherbst: and
23:58 karolherbst: clc_nir->info returns the shader_info thing from nir_shader
23:58 karolherbst: so...
23:58 karolherbst: there is some magic happening
23:58 jenatali: karolherbst: operator-> is required to return a pointer
23:58 karolherbst: well...
23:58 jenatali: It should return the thing that would be dereferenced by operator* (if there was an operator*)
23:59 karolherbst: right...
23:59 karolherbst: but in this case it returns the shared_ptr
23:59 karolherbst: which.. behaves like a pointer
23:59 karolherbst: so it's fine
23:59 jenatali: Right, which means ->get() doesn't make sense
23:59 karolherbst: tight
23:59 karolherbst: *right
23:59 jenatali: If you could overload operator.() to make a type fully transparent, then you could do .get()