00:00 alyssa:notices shifts have to be 32bit
00:00 alyssa: Is there a spec reason for that..?
00:00 imirkin: most major hardware either exclusively works like that, or is able to work like that
00:01 imirkin: on nvidia it's called "wrap" mode
00:01 alyssa: seems... backwards?
00:02 alyssa: (mali has 16-bit shifts, fwiw)
00:03 Kayden: ah, it might make sense to have 16-bit shifts ... I think 64-bit shifts were uncommon to non-existent
00:04 imirkin: nvidia supports 64-bit shift sources, but you have to do it in multiple steps
00:05 imirkin: (it actually supports arbitrary-width shift sources, in a way)
00:08 HdkR: imirkin: Funnel Shift :)
00:09 imirkin: that's the one
00:13 imirkin: SHF.L and SHF.R. i guess SHF = shift funnel
00:13 imirkin: those only appeared on kepler2 though -- for kepler and earlier, it's a bit painful
00:17 alyssa: Kayden: sure, but 8/16 is plenty useful :)
00:29 krh: anholt: with !5033 are we building deqp and driver without qemu, ie natively with a cross-compiler?
01:32 airlied: dolphin: doesn't seem to have appeared in patchwork, which is why I missed it
01:32 airlied: I just got it from lore.kernel.org onw
04:14 jekstrand: alyssa: Mostly because NIR doesn't have a concept of ALU ops which have two different variable-bit-width sources.
04:14 jekstrand: alyssa: So we had three options:
04:15 jekstrand: 1. Have four shift opcodes named ishrN which shift N bits of data but have a floating shift amount
04:15 jekstrand: 2. Require the amount of data being shifted and the shift amount to have the same bit-width
04:15 jekstrand: 3. Fix the bit-width of the shift amount and let the amount of data being shifted float.
04:16 jekstrand: Option 3 seemed like the least obnoxious at the time.
04:16 jekstrand: Of course, we could have defined the shift to be 8-bit for everything but not everyone does 8-bit
04:18 jekstrand: We could have tried to make both float (that would require core NIR changes) but then you have to deal with the question of whether or not driver X can handle combination Y
04:19 jekstrand: Since everyone handles 32-bit we said, "Forget it, we'll just do that"
04:45 anholt: krh: right. we were doing so before as well, on the shared arm runners, and now we do on x86 runners.
04:46 anholt: the qemu avoidance is just on fd-farm
04:46 anholt: at runtime
04:46 anholt: (driver is still being built on the arm runners, which have the sporadic failure to upload their artifacts)
04:48 anholt: this image gets us close to "we could cross-build the driver on fd-farm with just the freedreno driver, and see if that's cheaper than downloading artifacts" but i think without also solving gitlab-runner being bad at git clones that would be worse.
04:49 anholt: krh: so, uh, to do baremetal cheza I need to touch nm config on fd-farm. if I were to break things, is there any way of getting into the building?
05:18 krh: anholt: it doesn't sound like it's easy, but I'll check
07:14 airlied: MrCooper: looks like with a disk cache the llvmpipe jobs run in under 6 minutes
07:14 airlied: just have to make it less crashy :-P
07:14 MrCooper: nice!
07:15 airlied: should make up for when I enable GL 4.5 :-P
07:20 MrCooper: airlied: BTW, any comments on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4986 ?
07:22 airlied: MrCooper: I'm good with it, I think anholt objected to it in the past
07:23 airlied: on the grounds that he dislikes llvmpipe having optimisations the break confromance
07:23 airlied: and he dislikes setting the flag for testing
07:24 airlied: MrCooper: I've tagged him, and added a comment
07:25 MrCooper: ah right, https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1496#note_206318
10:07 j4ni: pinchartl: may I have your ack for merging http://lore.kernel.org/r/20200514060732.3378396-2-gwan-gyeong.mun@intel.com via drm-intel please?
10:08 j4ni: danvet: in the same series as ^ there's http://lore.kernel.org/r/20200514060732.3378396-4-gwan-gyeong.mun@intel.com - ack for merging that via drm-intel too?
10:08 j4ni: danvet: I may have asked before, but it's been a while...
10:10 danvet: ack on both
10:10 danvet: the hdmi helpers are kinda in maintainer no-mans land
10:11 danvet: both and neither of v4l and drm feel responsible :-)
10:15 bbrezillon: jekstrand: do you think we could remove this assert https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/compiler/nir/nir_deref.c#L311 ?
10:16 bbrezillon: I'm trying to use nir_build_deref_offset() on a deref chain whose root element is a deref_cast
10:16 bbrezillon: but I still need to get the offset from this base pointer
10:17 bbrezillon: if removing the assert is not acceptable, I'll have to duplicate the function :-/
10:18 bbrezillon: daniels:^
10:24 bbrezillon: actually, it's not quite true, there's a deref_ptr_as_array in the middle
10:24 bbrezillon: that makes build_deref_offset() fail
10:30 bbrezillon: https://gitlab.freedesktop.org/snippets/1013
10:31 bbrezillon: jekstrand: the above diff does what I want ^. Let me know if that's something you'd accept or if I should have my own helper to do that
10:46 karolherbst: bbrezillon: I am actually if I have it removed as well..
10:49 karolherbst: bbrezillon: honestly.. I think this change should be fine. There doesn't seem a proper reason why that assert is there besides "all derefs start with a var", which isn't true anymore
11:13 j4ni: danvet: thanks
11:24 bbrezillon: karolherbst: ok, thanks for the feedback
11:37 hakzsam: daniels: https://gitlab.freedesktop.org/hakzsam/mesa/pipelines/146663 windows_build seem stucked
11:38 daniels: hakzsam: it is - it's rebuilding LLVM on Mesa master, so I blocked other jobs to make sure no one did a duplicate build
11:38 daniels: it should be unblocked in ~10-15min
11:39 hakzsam: okay, thanks
11:42 pinchartl: j4ni: you don't need it :-) I don't like having an HDMI prefix for functions that will be used for DP too, but I don't mind that much, so feel free to merge the patch
11:45 karolherbst: jekstrand: would you mind if I essentially copy what I need from the old vtn_cfg approach for the unstructred case? I knid of prefer doing it the old way as this fits better
11:45 xexaxo1: danvet: what additional documentation do you envision for https://lists.freedesktop.org/archives/dri-devel/2020-May/266047.html
12:08 daniels: hakzsam: fwiw we're chasing up sponsorship for more Windows execution time so a) the existing builds can run faster and b) we can run more of them
12:09 hakzsam: yeah, would be nice :)
12:19 TheRealJohnGalt: I just made an issue (with a reproducible apitrace) for an amdgpu kernel/firmware crash that occurs on various mesa and kernel versions. I just found another one that only occurs on >=mesa 20.0.6 (incl git). Should I be adding this to the same issue, making a new one on drm/amd, or making a new issue on mesa since it only happens on newer mesa (despite being kernel-side crash)?
12:24 danvet: xexaxo1, go ahead I gues
12:24 danvet: x
12:24 danvet: s
12:24 danvet:cant type
12:26 imirkin: danvet: gotta key ctrl pressed down if you want to save :)
12:51 xexaxo1: imirkin: did someone say ctrl-s http://www.commitstrip.com/en/2013/06/05/ctrl-s/
12:51 imirkin: i was going for ^X ^S
12:52 imirkin: since those were the two letters he typed
12:53 freyr69: I'm trying to make EGL work on GBM on Centos 8, but I can't. eglinfo just says that `eglinfo: eglInitialize failed` for GBM platform. Is there any way to obtain some meaningful info what's wrong?
12:55 imirkin: freyr69: LIBGL_DEBUG=verbose
12:55 imirkin: i also like "strace -f -e file" to tell me wtf it's trying to load
12:56 freyr69: Tried it, says nothing safe for "libGL: Can't open configuration file /etc/drirc: No such file or directory.", which is fine, I guess.
12:56 imirkin: yeah, that's fine
12:56 imirkin: try the strace
13:00 freyr69: imirkin: well, it's last syscall is `stat("/dev/dri/card0", {st_mode=S_IFCHR|0660, st_rdev=makedev(226, 0), ...}) = 0`, so it tries stat on dri device and exits
13:01 imirkin: mind pastebinning the whole thing?
13:01 freyr69: Yeah, sure
13:03 imirkin: also stupid question, but do you have a /dev/dri/card0? and are you running as root if you're trying to do modesetting stuff?
13:06 freyr69: imirkin: https://pastebin.com/C7P3An8Z
13:06 freyr69: Yes, I have it, and dri drivers installed
13:06 freyr69: `crw-rw----. 1 root video 226, 0 May 14 15:00 /dev/dri/card0`
13:07 freyr69: and my user is in video
13:08 freyr69: Also I'm trying through ssh without any physical display attached
13:08 imirkin: hm, weird, yeah, it hates you for some reason
13:08 imirkin: can you just try it as root?
13:08 freyr69: Yes, same thing
13:09 freyr69: The log is from root
13:09 imirkin: what hardware is it?
13:09 imirkin: lspci -nn -d 10de:
13:09 imirkin: er
13:09 imirkin: sorry
13:09 imirkin: lspci -nn -d ::300
13:11 freyr69: 00:02.0 VGA compatible controller [0300]: Intel Corporation UHD Graphics 630 (Desktop) [8086:3e92]
13:12 freyr69: It worked for forever, the only thing I've changed in my Centos Kickstart file is that I add user to `video` group during the installation, but that can't be the reason, right?
13:12 freyr69: May it be due to selinux or something?
13:12 imirkin: could be
13:12 imirkin: check audit log?
13:13 imirkin: also confirm that you have a iris_dri.so or i965_dri.so
13:13 freyr69: Yeah, I have drivers
13:13 freyr69: -rwxr-xr-x. 1 root root 12865568 Jan 6 04:39 /usr/lib64/dri/i965_dri.so
13:15 imirkin: this is when i start putting debug statements all over the loader :)
13:15 imirkin: oh
13:15 imirkin: i had a weird issue once
13:15 imirkin: related to the video group
13:15 imirkin: yeah
13:15 imirkin: can you ls -l /dev/dri/renderD128 ?
13:15 imirkin: some stupid thing was checking that they had the same group or something
13:15 imirkin: iirc it got removed though
13:17 freyr69: imirkin: renderD128 is in `render` group
13:18 imirkin: yeah, that matches my current setup
13:19 freyr69: Ok, I'll try to reinstall it and add my user to video group after installation
13:19 imirkin: selinux might be a problem too, dunno. sorry
13:20 freyr69: I've disabled it, but still doesn't work, no meaningful message either
13:20 freyr69: Maybe some very subtle installation bug
13:43 MrCooper: imirkin: there's also EGL_LOG_LEVEL=debug for EGL
13:48 TheRealJohnGalt: Because the crashes could be linked, I added the second apitrace to the same issue.
13:49 imirkin: MrCooper: ah right, yeah. forgot about that one.
13:51 Zeising: The scripts here https://gitlab.freedesktop.org/mesa/mesa/-/tree/master/src/gallium/tools/trace that still use python2, are they needed for the mesa distribution, or can I ignore the python2 requirement for mesa? I'm unsure if these scripts are even installed by default, or if they are more like debug and development tools.
13:52 imirkin: Zeising: they're used for development
13:52 imirkin: to help make sense of the gallium trace plugin
13:52 Zeising: Ok
13:52 imirkin: they would generally not be installed in a user install
13:52 Zeising: So, as a package maintainer and so on of mesa, I can safely ignore them?
13:52 imirkin: i believe so, yes
13:52 Zeising: Ok
13:53 jekstrand: bbrezillon, karolherbst: The reason is that right now NIR derefs fall into two categories: Explicit and implicit. Explicit ones have strided and offsets on the types and are allowed to have casts and things. Implicit ones don't have strides and offsets and are required to be complete.
13:53 jekstrand: bbrezillon: There's no strict reason why we couldn't come up with a path for implicit but incomplete.
13:54 jekstrand: bbrezillon: However, I'd like to know more context before I form an opinion on whether or not that's a good idea for your case.
13:54 karolherbst: ohh actually
13:54 karolherbst: I think we just forget to set the stride in a few places
13:54 karolherbst: I have some patches for that
13:55 karolherbst: jekstrand: eg those: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068
13:55 jekstrand: karolherbst: With respect to CFG, I don't know. I don't see why one would fit better than the other. I thought you basically rewrote it all for unstructured.
13:55 karolherbst: jekstrand: yeah.. but I was making reuse of some of the bits from structured, like the switch case handling
13:56 karolherbst: but I think it makes sense to ensure that all derefs starting from a case should have a stride
13:57 karolherbst: no idea why it could be anything else
13:58 jekstrand: karolherbst: Ah. Re-using switch handling makes sense
13:58 bbrezillon: jekstrand: that's about CL constant kernel args being passed as ptrs, those are then turned into load_deref() with var_mem_ubo mode
13:58 bbrezillon: and I need to extract the offset relative to the base pointer
13:59 bbrezillon: based on the deref chain
13:59 karolherbst: bbrezillon: but you should have a stride on those, right?
14:00 bbrezillon: yep, I do
14:00 bbrezillon: (let me double check)
14:00 bbrezillon: yep => /* ptr_stride=4 */
14:00 xexaxo1: imirkin: <freyr69> seems to be picking the i965_dri.so. chances are something within the driver itself is unhappy
14:01 xexaxo1: since eglinfo initially attempts GBM, which gets to the driver. the follow-up eglInitialize, which fails is for Wayland.
14:01 imirkin: hah
14:01 imirkin: really?
14:01 imirkin: i thought each one failed
14:01 imirkin: maybe only wayland/x11 failed
14:02 imirkin: xexaxo1: https://pastebin.com/C7P3An8Z
14:02 xexaxo1: there's one "failed" message, which happens on the second iteration of the drmDevices
14:02 imirkin: eglInitialize failed before "Wayland platform"
14:02 imirkin: (and also after)
14:03 xexaxo1: actually, nope I'm blind.
14:18 karolherbst: jekstrand: maybe we could get the vtn bits ready for inclusion and deal with the structurizer later? dunno if anybody else plans on changing stuff in vtn_cfg :D
14:19 jekstrand: karolherbst: That's the only major change to vtn_cfg I'm aware of
14:19 karolherbst: okay
14:19 jekstrand: It's not something I like to mess with often
14:19 karolherbst: would be still cool to get it reviewed :)
14:19 karolherbst: especially because it's less painful than the structurizer bits
14:19 karolherbst: I hope
14:20 karolherbst: anyway, now we have a DFS and a breadth first algorithem.. but with unstructured CFG you can really only do a DFS
14:20 karolherbst: except I am missing something
14:21 karolherbst: nice.. it seems to still work
14:21 karolherbst: I hope
14:23 karolherbst: jekstrand: this commit especially: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401/diffs?commit_id=712c4517c6fc4a9a9d55b22b81ff0f11c8254222
14:23 karolherbst: I went the lazy way and brought vtn_add_case back
14:24 karolherbst: uhmm.. I could get rid of this nir_push_if thing btw
14:25 karolherbst: wondering if I should stress thest the structurizer by letting anv go through that path :O
14:26 alyssa: karolherbst: happy to throw review bw as needed :)
14:27 karolherbst: alyssa: that one commit is probably the critical one right now...
14:27 karolherbst: and we still need to find a solution for my licensing mess up :/
14:27 karolherbst: dunno if it's safe to assume the license as the entire project is MIT
14:30 daniels: has julianwi dropped off the face of the earth?
14:30 karolherbst: no clue.. at least he didn't respond for a while now
14:31 daniels: anyway, the license document is clear - the default Mesa license is MIT, and it's explicitly spelled out
14:31 daniels: your work in core Mesa code is thus derived from MIT, and his work is derived from yours thus also MIT
14:31 daniels: you'd both be copyright holders
14:31 daniels: obviously a S-o-b would be nice ...
14:31 karolherbst: right
14:31 daniels: have you tried pinging via email?
14:32 karolherbst: not yet
14:32 daniels: fwiw I have on my list to try the new structuriser + rebased libclc with llvmpipe again
14:32 karolherbst: daniels: I just pushed onto the branch rebased the structurizer on top of master
14:33 karolherbst: so you can probably get rid of some patches which landed already in the meantime :)
14:39 alyssa: karolherbst: `nir_push_if_src(b, nir_src_for_ssa(
14:39 alyssa: ` .... I think you just want push_if ;)
14:40 alyssa: Actually I don't see why the push_if thing is there at all, should probably be dropped
14:41 alyssa: nir_goto_if always has an SSA source..
14:42 alyssa: and in a world of so little structure, cling onto that ;p
14:42 alyssa: but dropping that, the spirv parse commit is r-b :)
14:47 daniels: karolherbst: i saw, thanks :) you beat me to it as we've just gone through the process of rebasing ourselves over the last few days - hence the todo note for me to make sure the upstream notes are up to date
14:58 karolherbst: :)
15:07 karolherbst: alyssa: actually.. the structurizer needs it
15:07 karolherbst: but I'll move the change
15:31 alyssa: karolherbst: How so?
15:31 karolherbst: alyssa: nir_jump_instr.condition
15:31 alyssa: isn't it always SSA, per the previous commit?
15:32 alyssa: assert(nir_jump_instr.condition.is_ssa); nir_jump_instr.condition.ssa
15:32 karolherbst: mhhh
15:32 alyssa: don't you know if you assert something it has to be true?
15:32 alyssa: be assertive already!
15:32 alyssa: ;P
15:32 karolherbst: alyssa: more talking about route_to_cond though
15:33 karolherbst: I am quite sure the condition could be a non ssa value
15:33 karolherbst: or.. maybe it won't be
15:33 karolherbst: dunno
15:34 alyssa: karolherbst: I don't see how
15:34 alyssa: It only gets jumps from goto_if's, right?
15:34 alyssa: and the goto_if's only come from the spirv patch
15:34 alyssa: and the only two goto_if's in that patch have src_as_ssa wrapping the condition
15:34 karolherbst: mhh.. true
15:34 alyssa: proofs :rainbow:
15:49 pcercuei: I get a lockup if I try to call for_each_new_plane_in_state(crtc->state->state, ...) inside a crtc's .atomic_begin()
15:49 pcercuei: even if there's nothing inside
15:50 pcercuei: the boot just stops there
15:50 vsyrjala: null ptr deref
15:51 vsyrjala: don't use {plane,crtc,etc}_state->state if you can avoid it. it be full of dragons
15:51 pcercuei: I'd get some debug if that was the case, no?
15:52 pcercuei: a kernel panic
15:52 vsyrjala: i never load gpu drivers at boot. if they explode there often the machine is hosed
15:53 vsyrjala: also never use crtc->state & co.
15:55 vsyrjala: there are cases where you can do that, but ususally just causes headaches
15:55 pcercuei: well, I don't see a way around it
15:56 pcercuei: the atomic callbacks don't give you a pointer to the state
15:56 vsyrjala: pass in the drm_atomic_state all the way from the top is the usual answer
15:56 vsyrjala: any vfuncs which don't have that are imo in need of repair
15:58 pcercuei: e.g. in a CRTC's .atomic_flush(), the state is in crtc->state, no way around it
15:58 pcercuei: Is there a better way to retrieve the planes associated with a CRTC?
15:59 vsyrjala: the one atomic_flush i see takes crtc+old_crtc_state
15:59 pcercuei: yes
16:00 vsyrjala: you can dig out old_crtc_state->state, but imo that whole thing was a bit of design mistake and we should change everything to plumb the drm_atomic_state through explicitly
16:01 pcercuei: heh
16:01 pcercuei: so it works with oldstate->state, fails with crtc->state->state
16:02 pcercuei: the thing is, I don't care about the old state, I need to know about the new state :(
16:02 vsyrjala: drm_atomic_get_new_crtc_state(old_crtc_state->state, crtc)
16:03 vsyrjala: whether you can get at at via new_crtc_state->state or old_crtc_state->depends on which phase of the commit you're in
16:03 vsyrjala: it's quite horrible really. hence my suggestion to just plumb the state in from the top
16:04 vsyrjala:been converting a lot of i915 to do just that in recent times
16:05 jekstrand: karolherbst: I'm trying to get NIR stuff reviewed. I only have so much bandwidth for it at the moment, though. :-(
16:06 pcercuei: vsyrjala: same lockup
16:06 pcercuei: I think it's just an endless loop
16:06 vsyrjala: modprobe.blacklist + ssh + manual modprobe is how i debug driver load issues
16:17 alyssa: jekstrand: lmk if I can help with anything :)
16:28 alyssa: uhm.. any reason CLAMP/MIN2/etc is defined in both u_math and macro?
16:28 alyssa: (Identically, too)
16:31 yshui: hmm, vkcube with aco used to have a slightly different background color from amdvlk. now they are consistent. was it fixed recently?
16:33 alyssa: airlied: I see 78bce52f9a57ea02 put them in util/macros.h but except for *3 they were there since 4f25420bdd834e81a3?
16:37 agd5f: daniels, not trying to be a jerk about the TMZ stuff. I honestly didn't know anyone was working on the EGL protected content stuff. It wasn't a use case we had as of yet. The whole protected content ecosystem on Linux is pretty ill-defined at the moment, as far as I understand it.
16:40 alyssa: jekstrand: Ah, gotcha (re shift sizes). "Require the amount of data being shifted and the shift amount to have the same bit-width" seemed more NIR-like, which is why I was surprised, that's all :)
16:44 alyssa: (It's also closest to our hw. 8-bit data can only be shifted by an 8-bit (or 16-bit shift), 16-bit by 16 (...or 8 or 32 technically), 32 by 32 (or 16 or 64) where parantheses all have caveats, but... meh
16:44 alyssa: )
16:59 jekstrand: alyssa: Yeah, all those parentheticals are going to mean that basically everyone needs a convert somewhere for something.
17:22 karolherbst: jekstrand: this sounds like a vtn/nir bug, right? https://gitlab.freedesktop.org/mesa/mesa/-/issues/2963
17:22 karolherbst: ec1 32 ssa_2 = deref_cast (uint *)ssa_1 (shared uint) /* ptr_stride=0 */
17:23 karolherbst: wondering why there is no stride on the param though
17:23 pcercuei: How can I express the constraint that two planes cannot be used at the same time?
17:25 karolherbst: ahh, an that's right after spirv_to_nir, nice
17:28 vsyrjala: pcercuei: in i915 we track active planes as a bitmask in the crtc state. easy to check that for various things
17:29 vsyrjala: iirc drm_crtc_state already has a plane mask of some sort, but it's not quite what we want for i915 so we have our own
17:31 alyssa: jekstrand: Alas.
17:32 pcercuei: vsyrjala: I'll take a look, thanks
17:32 jekstrand: karolherbst: Uh... We haven't assigned type strides yet. :-(
17:33 jekstrand: karolherbst: This is one of those places where NIR is weirdly modal. :-(
17:33 karolherbst: jekstrand: it's a bit nasty actually
17:33 karolherbst: but yeah..
17:33 karolherbst: I see the issue
17:33 jekstrand: So, unfortunately, I think that assert needs to be reverted
17:33 karolherbst: yeah.. fixing it sounds too messy right now
17:34 karolherbst: it doesn't matter at all after function inlining, but mhhh
17:38 karolherbst: jekstrand, hakzsam, craftyguy: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5042
17:40 karolherbst: I think this is a bug also affecting CL so I will need to fix it for real sooner or later anywaz
17:40 karolherbst: *anyway
17:40 jekstrand: karolherbst: I don't see what inlining has to do with anything
17:41 karolherbst: jekstrand: the func param type has no stride
17:41 karolherbst: if you get rid of the func call, you start the chain with a variable instead of the cast
17:41 karolherbst: so it's all good
17:43 karolherbst: or would something later on assign the strides instead? In this case it sounds like something we might move up to inside vtn instead then
17:46 jenatali: Hey, I think there's a bit of a mistake in nir_opt_alebraic with the flrp lowering. Seems like anything that's conditioned on options->lower_flrp[16|32|64] probably shouldn't be marked with ~ (inexact)... Wanted to see if I was missing something, considering there's also an explicit nir_lower_flrp pass...
18:02 alyssa: jenatali: I assume those opts *are* inexact, though.
18:03 anholt: alyssa: if you're setting lower_, it's probably not about optimization but about "I literally don't have a flrp instruction"
18:03 alyssa: anholt: right, but there are also opts here
18:04 alyssa: first one I see, "flrp(a, a + b, c) => a + (b * c)" -- whether the hw has native flrp ops or not, I assume that's not numerically precise for OpenCL requirements, hence the inexact qualifier?
18:04 jenatali: alyssa: Then why is it also gated by the lower_ option?
18:04 alyssa: That's a better question :-)
18:05 karolherbst: ohhh, I think I ran into this as well..
18:05 jenatali: I was just a bit surprised when setting the lower_ option and still ended up getting flrps
18:05 alyssa: Right, that's a bug then
18:05 karolherbst: jenatali: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/03012894cabd02980b56480d0d99c929e1782e0e
18:06 alyssa: jenatali: When lower_flrp stuff was last refactored, explicit calls to nir_lower_flrp were added into all the drivers
18:06 alyssa: The band-aid fix is to add that to nir->dxil, although arguably it should be done automatically regardless
18:07 karolherbst: ohh, we have a nir_lower_flrp?
18:07 alyssa: It looks like the lowering is done automatically for GL, but not CL, if I understand
18:07 jenatali: Yeah, that makes sense
18:07 jenatali: I was also confused why there were algebraic lowerings as well as an explicit pass for it
18:08 alyssa: If I understand right, I guess the algebraic opts predicated on lowering are faster than the explicit lowering on hw that doesn't have native flrp, but slower if there is native flrp
18:09 jenatali: Ohh, so the algebraic opts are intended to be inexact, while the explicit lowering pass can produce exact results?
18:10 alyssa: I *think* so, but I try to stay in GLES land where everything is inexact ;)
18:10 jenatali: Heh, that makes sense
18:10 jenatali: Thanks!
18:10 karolherbst: alyssa: there is no precise modifier for GLES?
18:10 alyssa: np
18:10 alyssa: karolherbst: shh ;P
18:10 karolherbst: :p
18:10 imirkin: how does that work with tess?
18:11 imirkin: the tess ext might introduce something in that domain...
18:11 alyssa: jenatali: There is a question of whether NIR should have native flrp ops, afaik it's lowered everywhere except some Intel GPUs
18:11 kusma: karolherbst: well, there's invariant, which is almost the same
18:12 karolherbst: I know :p
18:12 Venemo: hey guys
18:12 jenatali: alyssa: Good to know. I'll try not to rock the boat on this
18:12 kusma: Oh, I see. alyssa was just pretending :)
18:13 Venemo: what would be a good way to add support for zero_wins to NIR? we (aco devs) discussed that we'd like to support this for DXVK's sake
18:13 karolherbst: Venemo: wasn't the idea to have a shader option for that?
18:13 karolherbst: although that probably affects optimiyations
18:14 Venemo: the two possibilities that come to mind is to either have a new opcode, or a bool in shader_info which changes the semantics of the current opcode
18:14 Venemo: the problem with a new opcode is that then a great number of algebraic optimizations need to be duplicated and separately mainatined.
18:14 karolherbst: Venemo: zero wins is for multiplicationgs with inf, right? Or was there something else?
18:15 mannerov: NaN too
18:15 karolherbst: ehh, NaN
18:15 karolherbst: annoying
18:15 mannerov: It also affects mad
18:15 karolherbst: right
18:15 karolherbst: that NaN part was the reasons it's pain
18:16 karolherbst: Venemo: I'd check how many of the opts it will affect
18:16 karolherbst: and if we know that, we can start discussing
18:16 Venemo: the problem with the bool in shader info is that there might be corner cases in existing optimizations which break it subtly, and that a new set of optimizations need to be written that are not valid otherwise
18:16 karolherbst: if it only affects like 5.. it's easy to fix that
18:16 Venemo: either way, it's a headache
18:16 daniels: agd5f: me neither, sorry if it came off that way. I just don't want AMD to implement things which leaves me on the hook for behaviour I can't reason about as a compositor, because it's all hidden driver magic not exposed to me
18:16 karolherbst: not really
18:16 karolherbst: we could do python magic probably
18:16 karolherbst: or something
18:16 karolherbst: anyway, knowing which opts are affected should help
18:17 mannerov: just disable some optimizations
18:17 daniels: agd5f: I'm currently enjoying a beer in the sunshine, but I'll send you a proper reply tomorrow with some hopefully helpful references
18:17 Venemo: karolherbst, mannerov do you think there is anything outside of nir_opt_algebraic which is affected?
18:18 karolherbst: Venemo: probably inside drivers
18:18 Venemo: I meant anything in NIR
18:18 mannerov: I don't know nir enough to say
18:19 Venemo: I guess beyond NIR, it's the drivers' decision to either support it or not
18:19 karolherbst: Venemo: I mean in the compiler of drivers
18:19 mannerov: however I can say if you support 0*inf = 0 it is very unlikely to have NaN generated, but it can happen (inf as vertex shader output -> NaN for pixel shader)
18:20 karolherbst: Venemo: I am right now checking where we check that for nouveau
18:20 Venemo: karolherbst: you mean backend compiler?
18:20 karolherbst: but it's not much
18:20 karolherbst: yeah
18:20 karolherbst: constant folding is one case
18:20 karolherbst: add+mul = mad is also disabled as it seems
18:21 Venemo: well, like I said, I would like to support it in RADV/ACO
18:21 karolherbst: sure
18:21 karolherbst: but I think a nir_shader_option flag + disabling some opts is probably the best way
18:21 karolherbst: I don't see the issue here to be honest
18:21 Venemo: AMD has v_mul_legacy and v_mad_legacy as well as v_mac_legacy which can be used to this purpose
18:22 karolherbst: we might need a lowering pass for drivers not having native support for that though
18:22 karolherbst: but.. besides that?
18:23 Venemo: I think it's perfectly okay if a driver doesn't wanna support this. it's probably a minimal gain
18:23 alyssa: jenatali: Looking into it a bit more, it looks like the slightly longer answer is that at this point getting rid of flrp in NIR would likely generate worse code for everyone (since lower_flrp / the opts can do more aggressive codegen than you could do just from the definition) and there does exist still-supported hw in-tree that wants the native op anyway.
18:23 jenatali: alyssa: Thanks for following up, good to know. I'll just add a lower step to DXIL
18:23 alyssa: If NIR were designed from scratch today, I don't know if flrp would make the cut, but it's there now and the reasons to remove seem more aesthetic than technical.
18:24 alyssa: Not that I'm against aesthetic cleanups, of course :-)
18:24 Venemo: ok karolherbst I think that makes sense. if we can be sure that only algebraic opts are affected
18:25 karolherbst: Venemo: that's the only stuff where it matters for nouveau at least
18:25 karolherbst: if there is more
18:25 karolherbst: we can always fix it later
18:25 alyssa: Maybe lower_flrp call should be happening at the frontend level; it's already that way for OpenGL, but not for clover. And that doesn't help at all for Vulkan really.
18:26 Venemo: karolherbst: yeah, we can always fix it later, hehe :)
18:26 pendingchaos: karolherbst: "nir_shader_option flag" you mean nir_shader_compiler_options?
18:26 Venemo: karolherbst: in fact, are there really any opts which would break with zero_wins? or is that only my imagination?
18:27 karolherbst: Venemo: it seems like it's already handled
18:27 alyssa: Venemo: My imagination agrees but I can't think of any either ..
18:27 karolherbst: Venemo: check the spirv code for SpvExecutionModeSignedZeroInfNanPreserve
18:28 karolherbst: FLOAT_CONTROLS_SIGNED_ZERO_INF_NAN_PRESERVE_*
18:28 Venemo: in SPIR-V?
18:28 Venemo: whoah
18:29 karolherbst: yeah
18:29 Venemo: does that mean there is already an extension for this?
18:29 karolherbst: ohhh
18:29 karolherbst: Venemo: it seems like we treat it as exact or something
18:29 karolherbst: yeah
18:29 Venemo: what do you mean, treat it as exact?
18:29 pendingchaos: Venemo: signed_zero_inf_nan_preserve is different than 0-wins
18:29 karolherbst: Venemo: SPV_KHR_float_controls
18:29 karolherbst: pendingchaos: ohh, really?
18:29 karolherbst: annoying
18:30 karolherbst: sounds like it's very similiar
18:30 pendingchaos: IIRC it means the compiler cannot assume there isn't any infinity/nan
18:31 karolherbst: ohhh
18:31 karolherbst: you are right
18:31 karolherbst: "The implementation must not perform optimizations on floating-point instructions that do not preserve sign of a zero, or assume that operands and results are not NaNs or infinities."
18:31 karolherbst: okay, then it's something different indeed
18:33 Venemo: sounds like that's almost the opposite, but not quite
18:34 karolherbst: Venemo: but maybe a similiar approach could be used for zero_wins
18:36 karolherbst: Venemo: we could probably do a check like "info->mul_zero_wins" or something in the opts
18:36 karolherbst: no idea what's accessible there
18:36 karolherbst: ahh, info is accessible
18:36 Venemo: info is available always, yes
18:36 karolherbst: okay, cool
18:36 karolherbst: that should help then
18:37 Venemo: however, if it turns out that there is any NIR pass out there which adds a mul and expects it to produce nan or inf... like there is in alyssa 's and my imagination, then it turns into a headache quickly
18:39 karolherbst: Venemo: we have the same issue with the precise modifiers already and it took a while to get rid of most of the bugs
18:39 karolherbst: and I am sure there are still some left
18:39 Venemo: heh, that's not very convincing
18:40 karolherbst: your problem is way smaller :p
18:40 Venemo: on the other hand, Josh already made some patches which add this to ACO (and don't yet deal with MAD), but without any changes in NIR. and that works fine for him, at least
18:49 dschuermann: 'works fine' sounds like a decent proof :P
18:54 pepp: daniels: re TMZ, did you see my comment https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4049#note_496083 ?
19:11 airlied: daniels: I think if we get structurizer + libclc bits landed, running piglit CL test in CI might not be unreasonable on llvmpipe
19:13 sravn:ended up writing 36 backlight patches. Posted only 18 in the hope this is easier to get reviewed.
19:19 karolherbst: airlied: yeah.. just getting it in will be annoying :p
19:19 karolherbst: unless people magically review the structurizer
19:22 karolherbst: anyway, alyssa review already helps, going through that right now
19:22 Venemo: dschuermann: it's not 'proof' but at least it didn't fail horribly
19:23 airlied: at some point it's pointless keeping it in a branch, and if only affects CL, we can just ack it in
19:23 karolherbst: airlied: jekstrand was against that last time we talked about it
19:25 airlied: karolherbst: well he's likely the only person who's review he will accept as valid :-P
19:25 karolherbst: :p
19:25 airlied: and you can reivew stuff in the tree
19:26 airlied: I suppose I did read the paper it's based on, maybe I can review it :-P
19:26 airlied: but I wouldn't trust my review either
19:26 karolherbst: yeah.. I wouldn't trust mine either
19:28 jekstrand: airlied: I'm not worried about the structurizer
19:29 jekstrand: I'm more worried about all the NIR passes that will break
19:29 jekstrand: But, also, that's not really my problem
19:29 airlied: jekstrand: isn't the idea that the NIR passes will only run after the structurizer?
19:29 airlied: I don't think anyone is suggesting we do unstructrured NIR here
19:29 jekstrand: That depends on who you ask. :)
19:30 airlied: if someone is suggesting that, then they can fix them
19:30 jekstrand: Fair enough
19:30 karolherbst: if anybody wants to run passes on unstructured CF, they are free to fix the passes :p
19:30 airlied: but for CL I've no interest in that, CL is quite gclear the irreducible flow control is optional
19:30 jekstrand: Where's the branch? I'll give it a quick read and ack
19:30 karolherbst: jekstrand: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401
19:30 karolherbst: the vtn changes are probably more problematic though
19:33 sravn: danvet: https://lore.kernel.org/dri-devel/20200514191001.457441-1-sam@ravnborg.org/T/#t is one step towards fixing the todo entry concerning backlight.
19:33 danvet: sravn, just read the cover letter, this sounds awesome
19:34 danvet: I guess the grand plan is that once bl_is_blank is rolled out, we fuse the triplet-state in-between out?
19:34 danvet: or do we first need to roll out bl_enable/disable() everywhere?
19:35 sravn: danvet: I have mostly looked at fb_blank users for now, and it is almost gone in my tree.
19:35 danvet: nice
19:35 sravn: But the idea is to make this simpler both for the backlight drivers and thier users.
19:36 sravn: So a natural next step would be backlight_enable()/disable().
19:36 danvet: sravn, hm your bl_is_blank does not seem to match bl_enable/disable
19:37 sravn: But I wanted to make sure I understood what I was dealing with - and adding or updatign the documentation helped a lot
19:38 danvet: s/bl/backlight/ in all my text
19:38 sravn: enable/disable should not touch fb_blank. Thats on of the pending cleanups that awaits
19:39 sravn: One reason why I have attacked fb_blank first. When it is only used by the core I can drop it from backlight_properties and likewise fron enable/disable
19:41 danvet: I'd still include fb_blank in your bl_is_blank function
19:41 danvet: since atm you remove that check from some bl drivers
19:42 danvet: which is confusing
19:42 danvet: and I think more risky wrt introducing regressions
19:43 sravn: OK, will do in v2. Will reply on the list too.
19:43 danvet: well backlight it outside of drm anyway, so need to wait for the maintainer
19:51 sravn: yep, but in most cases the backlight people responds rather quick, so no worries there. There is a few drm patches in the beginning that I hope to get ack on from drm folks. Buts it driver specific code so lets see
20:08 danvet: robher, I think you might care about "[PATCH 7/9] drm/shmem-helpers: Redirect mmap for imported dma-buf"
20:08 danvet: can you perhaps take a look, maybe even test?
20:09 robher: danvet: thanks, will go look at it. Buried in DT...
20:10 danvet: yeah seen your storm of mails fly by recently
20:16 jekstrand: karolherbst: Added some "please put this in the right patch" comments and a couple general questions
20:34 agd5f: daniels, nice. almost beer-o-clock here. actually sunny for a change too
20:38 imirkin: mmmmmm beer. the cause of, and solution to, all of life's problems.
20:40 kisak: imirkin: I watched a Simpsons episode with that line last night.
20:41 imirkin: we're out of beer, beer baron!
20:41 kisak:looks around paranoidly for imirkin's camera feed
20:42 imirkin: it's a great episode
20:42 imirkin: at first prohibition was great... people were drinking more and having more fun, but without beer, prohibition just doesn't work!
21:07 alyssa:wonders what kind of crazy packing they're using for RGB10_A2
21:07 alyssa: internally I mean
21:09 alyssa: It's... not what's in the spec
21:11 alyssa: (I think, anyway)
21:14 alyssa: Oh *no*
21:15 alyssa: Hnn
21:15 alyssa: Top 8-bits of R, G, B are bytes 0, 1, 2
21:15 alyssa: Bottom 2-bits of R, G, B, A form byte 3
21:15 imirkin: logical.
21:16 alyssa: that's not normal rgb10_a2, right?
21:16 imirkin: lol no ;)
21:16 alyssa: okay good
21:16 imirkin: but it has an interesting advantage
21:16 imirkin: which is that if you feed that to a display expecting rgbx8888, it'll Just Work (tm)
21:16 alyssa: Oooo, right.
21:16 imirkin: obviously you lose the bottom bits of precision, but meh
21:17 alyssa: AFAICT this is the internal encoding in the tilebuffer
21:17 imirkin: yeah, only use-case i can think of is if you want to auto-convert it to rgbx8888 for free
21:18 alyssa:shrugs
21:18 imirkin: doesn't mean there aren't others!
21:18 alyssa: RGBA4 is internally [R][0][G][0][B][0][A][0]
21:18 alyssa: i.e. padding to bytes
21:18 alyssa: RGB565 is similar
21:18 imirkin: so ... same thing then
21:18 imirkin: i.e. free conversion to rgbx8888 :)
21:18 alyssa: Yeah.. guess so ;_0
21:18 alyssa: :-)
21:19 imirkin: i won't ask about rgba16
21:19 imirkin: or rgb9e5
21:19 alyssa: it's all 128-bit, actually
21:19 alyssa: so these are the 32-bit described above repeated 4 times
21:19 alyssa: rgab16 is identical to spec but repeated once
21:20 alyssa: twice
21:25 alyssa: --Yup, that was it. Now to write an unpack routine in addition to pack.
21:37 alyssa: Cool~ Down to 4 formats then.
21:53 alyssa: imirkin: Oh, but RGB10_A2_UINT is unaligned (i.e. equal to the spec). Because it'd be too easy to reuse the routine I literally just wrote
21:57 airlied: ints kinda mean ints, had to just drop smoe bits off the end
21:57 alyssa: airlied: bwah?
22:08 airlied: ^had^bad
22:09 alyssa: airlied: The bits for rgb10_a2_unorm are still there, just moved around
22:09 airlied: it's not like you can take a 10-bit and drop the low bits and having be useful :-P
22:09 airlied: alyssa: yup but that is for the convert to 8-bit useful case
22:09 airlied: ints don't really have that use
22:09 alyssa: True, true