01:29 daniels: restore is at >70% btw; more repos are active now, and they'll all be active by the morning
03:52 jekstrand: Ugh... structurizer....
03:53 HdkR: ugh, x86
03:53 jekstrand: :P
03:53 HdkR: It's bring your complaint to work day right? :P
03:54 jekstrand: I guess
03:57 airlied: jekstrand: did you find another structurizer problem or the same one still messy?
03:58 jekstrand: airlied: I think it's new? But it may also be just a new iteration on an old one. It's hard to tell, TBH.
04:05 jekstrand: Ok, another one fixed. Now time for the next structurizing bug. :-/
04:05 jekstrand: This is getting old....
04:48 jekstrand: Woo! All my kernels now SPIR-V -> NIR and structurize. \o/
04:49 airlied: jekstrand: nice!
04:50 jekstrand: There are still some holes. I'm commenting out rounding mode handling. Need to work w/ jenatali on that.
04:51 jekstrand: Also, I probably want CLC
04:51 jekstrand: I had to add nir_builder impls of some builtins
04:53 airlied: jekstrand: yeah landing clc shouldn't be that big a problem
04:54 jekstrand: I'm a bit more worried about conversions
04:54 jekstrand: I've got hacks for the 2-3 OpenCL builtins I'm missing so CLC isn't urgent.
04:54 airlied: let me know when you want a lvl0 impl :-P
04:55 jekstrand: heh
04:55 jekstrand: I assume you've got one kicking around? :P
04:56 airlied: going to kick it some more once all the cl bits lands
04:56 airlied: sycl would like generic pointers and memcpy :-p
04:56 jekstrand: I've got both of those in MRs. :)
04:57 jekstrand: Generic pointers make me sad.....
04:57 jekstrand: They're not too bad until someone casts to/from uint64
04:57 jekstrand: Because, whenever we can chase the pointer back to an argument or something, we can figure out what it is.
04:57 jekstrand: But if they store it as generic or cast to/from uint64, we're hosed.
04:58 airlied: then it's switch statement itme?
04:58 airlied: time
04:58 jekstrand: yup
04:58 jekstrand: I guess Nvidia has a thing where they can map all the different things to one address space. We don't.
04:59 jekstrand: So it's a short if-ladder
05:00 jekstrand: We could probably do some whole-shader analysis to see if any pointers to shared or function memory are ever cast to generic in any way and, if not, assume it's all global.
05:00 jekstrand: I think that should be safe.
05:00 jekstrand: And it'd probably let us get rid of all generics in quite a few cases.
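A minimal sketch of the two ideas jekstrand describes above: the run-time if-ladder for generic pointers, and the whole-shader check that would let generic be treated as global. The mode names, the `casts` list format, and both function names are hypothetical and not Mesa's API. Casts through uint64 are assumed to be recorded as casts to generic.

```python
# Hypothetical address-space tags, for illustration only.
GLOBAL, SHARED, FUNCTION, GENERIC = "global", "shared", "function", "generic"

def generic_can_be_global(casts):
    """casts: iterable of (src_mode, dst_mode) pairs seen anywhere in the shader.

    True when no shared or function pointer is ever cast to generic, so any
    generic pointer can only have come from global memory.
    """
    return not any(dst == GENERIC and src in (SHARED, FUNCTION)
                   for src, dst in casts)

def load_via_ladder(mode, load_global, load_shared, load_function):
    """Fallback when the analysis fails: pick the address space at run time."""
    if mode == GLOBAL:
        return load_global()
    if mode == SHARED:
        return load_shared()
    return load_function()
```

When `generic_can_be_global` holds, the ladder is unnecessary and every generic access could be emitted as a plain global access.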
05:19 airlied:launches yet another ttm refactor
06:41 daniels: everything should be fully restored now btw
06:46 rcdrone: hmm, i'm trying to merge into my fork, and i'm getting errors about corrupt/empty files when i try to push
06:56 rcdrone: i can git clone mesa/mesa to a fresh folder, but jpark37/mesa fails now
07:02 rcdrone: I hope my gitlab fork isn't irreparably damaged because of whatever happened today? It feels that way, but I don't know Git that well.
07:04 daniels: rcdrone: looking into it, thanks
07:04 rcdrone: thanks
07:46 rcdrone: daniels: headed to bed. i'll check back in the morning.
07:46 daniels: rcdrone: thanks
08:00 pq: Lyude, I don't read intel-gfx in any form. If you don't cc dri-devel, there is no way I could see it. That said, I'm not quite sure what patches you refer to.
08:04 pq: daniels, did you get any sleep? Going on fumes now? :-o
08:11 daniels: pq: just enough
08:29 hakzsam: is Marge off?
08:35 MrCooper: daniels: ^ maybe Marge was also affected by the Mesa repo issues? Looks like she's not processing MRs assigned to her
08:45 daniels: thanks, brought back up
08:47 hakzsam: daniels: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6838#note_633493
08:47 daniels: hakzsam: yep, fixing it as we speak
08:47 hakzsam: thanks!
09:02 daniels: ok, you can't push to Mesa forks for now for infuriating reasons
11:19 danvet_: tagr, e3eb3250d84ef if you want an example
11:20 danvet_: only thing to triple check if you outright extend the struct is that everyone currently using it memsets it
11:20 danvet_: so that when they get the new struct, the new fields are cleared
11:20 danvet_: if that's not the case, create a new struct with the old one at the beginning
11:20 danvet_: and a new ioctl #define, but with the same number
11:21 danvet_: and a single handler which uses the new struct, always
11:28 pcercuei: Reading https://www.collabora.com/news-and-blog/blog/2020/08/31/pushing-pixels-to-your-chromebook/
11:29 pcercuei: "Each application using DRM is called a client. There can be multiple clients using a DRM driver at any given time."
11:29 pcercuei: I thought there could only be one client using a DRM driver at the same time?
11:29 pcercuei: Does that mean that one app could render to one plane, and another app render to another plane?
11:32 daniels: no
11:33 daniels: there can be and often are multiple clients, but only one can hold 'master' at a time, and that's the only one which can operate KMS
11:33 pcercuei: alright
11:33 daniels: you can use DRM leases to distribute outputs between multiple clients (e.g. HDMI-1 to one client, HDMI-2 to another) so they operate independently, but that's a hard split of CRTC/connector/plane
11:33 pcercuei: thanks for clearing that up
11:33 daniels: you can't have multiple processes driving planes on the same CRTC
11:33 daniels: no
11:33 daniels: *np
11:34 pcercuei: I have a main application running on the primary plane, I wanted to have an OSD of some sort on the overlay plane
11:34 pcercuei: I don't really see how I can do that cleanly, without resorting to using a VM...
11:34 pcercuei: a WM*
12:04 tagr: danvet_: not sure I understand how this is supposed to work because the IOCTL code encodes the size of the struct, doesn't it?
12:04 tagr: danvet_: oh... does DRM mask that out, perhaps?
12:09 daniels: pcercuei: yeah, you need that in one
12:09 daniels: I can recommend Weston :)
12:12 danvet_: tagr, yup
12:12 danvet_: and then zero extend
12:12 danvet_: see drm_ioctl()
12:13 danvet_: or the recent lwn articles about extensible syscalls, it's the same
12:13 danvet_: we've been doing this since forever
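To make tagr's question concrete, here is a small sketch of how the generic Linux `_IOC()` encoding packs the struct size next to the ioctl number. The bit widths are the asm-generic ones (some architectures differ). The ioctl number `0x42` and the two struct sizes are made up for illustration.

```python
# Field layout of the generic Linux _IOC() encoding (asm-generic/ioctl.h).
NRBITS, TYPEBITS, SIZEBITS = 8, 8, 14
NRSHIFT = 0
TYPESHIFT = NRSHIFT + NRBITS
SIZESHIFT = TYPESHIFT + TYPEBITS
DIRSHIFT = SIZESHIFT + SIZEBITS
IOC_WRITE, IOC_READ = 1, 2

def ioc(direction, type_char, nr, size):
    """Build an ioctl command code from its four fields."""
    return ((direction << DIRSHIFT) | (size << SIZESHIFT) |
            (ord(type_char) << TYPESHIFT) | (nr << NRSHIFT))

def ioc_nr(cmd):
    return (cmd >> NRSHIFT) & ((1 << NRBITS) - 1)

def ioc_size(cmd):
    return (cmd >> SIZESHIFT) & ((1 << SIZEBITS) - 1)

# Hypothetical ioctl 0x42: old struct is 8 bytes, extended struct is 12.
OLD_CMD = ioc(IOC_READ | IOC_WRITE, "d", 0x42, 8)
NEW_CMD = ioc(IOC_READ | IOC_WRITE, "d", 0x42, 12)
```

The two command codes differ only in the size field, so a dispatcher that looks up the handler by `ioc_nr()` alone reaches the same handler for both.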
12:13 pcercuei: daniels, this is for a device with 32 MiB of RAM, using weston may be overkill (although I have no idea how much it typically uses)
12:13 daniels: ouch
12:13 danvet_: https://lwn.net/Articles/830666/
12:13 daniels: not too much, but that's super limited given that you presumably need some of that RAM for decoded video as well as UI :P
12:14 pcercuei: indeed
12:17 pq: pcercuei, FWIW, https://gitlab.freedesktop.org/wayland/weston/-/issues/244 might be an idea to pursue if you also add never allocating buffers for the renderer at all.
12:43 krh: daniels: 'What happened to "everything is a file"'
12:44 daniels: krh: ?
12:45 krh: daniels: first comment on the lwn article you linked
12:45 daniels: wrong dan.* :P
12:46 krh: dan.*: oops
12:48 pcercuei: pq: I guess I will experiment with it at some point, but not anytime soon
13:16 tagr: danvet_: yeah, I just came across that when inspecting the code
13:16 tagr: danvet_: this is really neat
13:16 alyssa:tries to get surfaceless deqp running
13:16 alyssa: I've been using wayland for years for deqp oops :P
13:17 tagr: danvet_: I was aware of the zero-clearing, but I didn't realize that it would also work for variable-sized structures
13:18 tagr: danvet_: so I had assumed that it would only work if the structure had reserved fields at the end, but now that I fully read this, it absolutely makes sense
13:19 tagr: I'm going to have to check the various callers of the PUSHBUF IOCTL to see if they do zero-clear the structure, but this should allow me to clean up the code quite nicely
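A simplified model of the zero-extend step discussed above, assuming a hypothetical struct whose extended version appends one field. The layouts and the names `copy_in_zero_extended` and `handler` are illustrative. The real copy-in logic lives in `drm_ioctl()` and handles more cases than this.

```python
import struct

# Hypothetical layouts: the old struct has two u32 fields, the new one appends a third.
OLD_FMT = "<II"    # (handle, flags)
NEW_FMT = "<III"   # (handle, flags, new_field)

def copy_in_zero_extended(user_bytes, kernel_size):
    """Keep what userspace passed and zero-fill up to the kernel's struct size."""
    n = min(len(user_bytes), kernel_size)
    return user_bytes[:n] + bytes(kernel_size - n)

def handler(user_bytes):
    """Single handler that always operates on the new struct."""
    kdata = copy_in_zero_extended(user_bytes, struct.calcsize(NEW_FMT))
    return struct.unpack(NEW_FMT, kdata)
```

Old userspace that passes the short struct sees `new_field` read as zero in the handler, which is why the new field's zero value has to mean "old behaviour".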
13:19 alyssa: austriancoder: https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2019-02-26
13:19 alyssa: ^ Did you ever figure that out?
13:19 tagr: the IOCTL handler was already basically superclassing the existing PUSHBUF IOCTL handler, but if we can just make this an extensible struct it'll be much neater
13:21 italove: quit
13:22 italove: oops, sorry, message in wrong chat
13:55 alyssa: austriancoder: For future denizens, EGL_PLATFORM=surfaceless has to be set
14:27 jekstrand: jenatali, karolherbst: Thoughts on clz/ctz opcodes? They exist in OpenCL and we have them basically verbatim in HW. We have to do gymnastics to implement the GL findMSB and findLSB in terms of them.
14:29 HdkR: jekstrand: With well defined on zero and on MSB/LSB bit set semantics? :)
14:30 jekstrand: working on those details
14:30 jenatali: jekstrand: DXIL has intrinsics for them as well, though yeah the devil's in the details
14:30 jekstrand: The lzd instruction counts component-wise the leading zeros from src0 and stores the resulting counts in dst. If
14:30 jekstrand: src0 is zero, store 32 in dst.
14:31 jekstrand: From our HW docs ^^
14:31 jekstrand: That's exactly opencl CLZ
14:31 jekstrand: For 32-bit, anyway
14:32 jekstrand: We also have FBL/FBH which have the GL semantics
14:32 jenatali: Ah yeah that's the DXIL intrinsic, there's no direct clz/ctz
14:33 jekstrand: The FBL/FBH semantics (count bits and set -1 if 0) seem like the more generic because you can &(bit_size-1) to get the OpenCL behavior.
14:33 karolherbst: yeah, we also have such a hw instruction
14:34 karolherbst: clz and ctz
14:34 jekstrand: karolherbst: What do they do when the input is 0?
14:34 jekstrand: bit_size or -1?
14:34 karolherbst: ~0 I think
14:34 karolherbst: but the instruction is not 0 based, but 1 based
14:35 karolherbst: so if it's 0, it returns ~0
14:35 jekstrand: what?
14:35 karolherbst: our instruction is called find leading one
14:35 karolherbst: and it can do a forward and backwards search
14:36 karolherbst: so we need to do some adjustments anyway
14:36 jekstrand: Bah... You can't &(bit_size-1) but you can do umin(fbh(x), bit_size)
14:37 jekstrand: So that should be two instructions for everyone
14:37 jekstrand: Except maybe NV
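A short sketch of why the mask does not work but `umin` does, assuming 32-bit values and an FBH-like op that counts from the left and returns all-ones (-1) for a zero input, as described in the log. `fbh` here is a Python model of that description, not a hardware definition.

```python
BIT_SIZE = 32
MASK = (1 << BIT_SIZE) - 1

def fbh(x):
    """Model: leading-zero count from the left, all-ones (-1) for zero input."""
    x &= MASK
    if x == 0:
        return MASK                 # -1 viewed as an unsigned 32-bit value
    return BIT_SIZE - x.bit_length()

def clz_from_fbh(x):
    """OpenCL-style clz: umin clamps the all-ones result to bit_size."""
    return min(fbh(x), BIT_SIZE)

def clz_masked(x):
    """The rejected variant: masking turns -1 into 31 instead of 32."""
    return fbh(x) & (BIT_SIZE - 1)
```

For a zero input `clz_from_fbh` gives 32 while `clz_masked` gives 31, which is the off-by-one jekstrand is pointing at.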
14:37 jekstrand: Kayden, mattst88: Do either of you remember why we don't use FBH for find_msb?
14:38 karolherbst: ehh wait...
14:38 jekstrand: Oh, right. It's because FBH counts from the left and GLSL wants a count from the right.
14:38 karolherbst: we had that in hw in the past...
14:41 karolherbst: mhh
14:41 karolherbst: nvidia just uses FLO for clz
14:42 karolherbst: ohh wait..
14:42 karolherbst: they do an add
14:42 karolherbst: smart
14:42 karolherbst: FLO.U32 R0, R0
14:42 karolherbst: IADD32I R2, -R0, 0x1f
14:42 karolherbst: ohh ehh.. and ISETP...
14:42 karolherbst: yeah well
14:42 karolherbst: annoying
14:43 karolherbst: guess I was wrong and it was always painful to implement
14:47 imirkin_: karolherbst: one of those things was implemented by reversing the bits
14:47 imirkin_: (i.e. bitfieldReverse equivalent)
14:48 imirkin_: karolherbst: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp#n3818
14:48 karolherbst: sure
14:49 karolherbst: but you can't implement clz like this
14:49 imirkin_: which one's clz?
14:49 imirkin_: count leading zeros?
14:49 karolherbst: count leading zeros
14:49 karolherbst: yeah
14:49 imirkin_: isn't that just BFIND?
14:50 imirkin_: i.e. UMSB in tgsi?
14:50 karolherbst: nvidia doesn't think so
14:50 karolherbst: they do have a clz in PTX
14:51 imirkin_: here's how i defined it in tgsi: "Computes the 0-based index of the highest set bit of the argument. Returns -1 if none are set."
14:51 imirkin_: so right. they want the inverse of that.
14:51 karolherbst: nope
14:52 imirkin_: i.e. count, but from the front
14:52 imirkin_: i.e. from the msb rather than lsb
14:52 karolherbst: that won't give you clz
14:52 karolherbst: "Returns the count of trailing 0-bits in x. If x is 0, returns the size in bits of the type of x or component type of x, if x is a vector."
14:52 imirkin_: i mean, 31 - that = clz
14:52 imirkin_: that makes sense.
14:52 karolherbst: right
14:52 karolherbst: and nvidia adds another ISETP
14:52 imirkin_: for the all-zero case
14:52 karolherbst: yep
14:52 imirkin_: again, makes sense.
14:53 karolherbst: yeah.. just three instructions is a bit annoying for such a simple thing :p
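The three-step sequence above can be sketched as follows, assuming 32-bit inputs and the TGSI wording imirkin_ quotes for the highest-set-bit op (0-based index, -1 if none are set). The function names are illustrative.

```python
BIT_SIZE = 32
MASK = (1 << BIT_SIZE) - 1

def find_msb(x):
    """0-based index of the highest set bit, or -1 if no bit is set."""
    return (x & MASK).bit_length() - 1

def clz(x):
    """clz built from find_msb: a subtract plus a special case for zero."""
    if (x & MASK) == 0:              # the extra compare for the all-zero case
        return BIT_SIZE
    return (BIT_SIZE - 1) - find_msb(x)   # "31 - that = clz"
```

Without the zero check, `31 - find_msb(0)` would already evaluate to 32 in plain integer arithmetic. The explicit branch mirrors the compare-and-select that the hardware sequence in the log uses.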
14:53 daniels: MrCooper, hakzsam: hmm, ever since the RadeonSI rules tweak, most MR pipelines seem to be running all of the jobs
14:53 daniels: I wonder if that's just because the branch got rebased across the .gitlab-ci.yml change in between submission & merge, or if the rules are broken?
14:53 karolherbst: imirkin_: nvidia doesn't predicate the FLO+IADD though
14:54 karolherbst: so you have ISETP, pred BRA, FLO, IADD end:
14:54 imirkin_: karolherbst: yeah, it behooves one to use instructions more directly supported by the underlying arch :)
14:55 hakzsam: daniels: like which?
15:02 daniels: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049
15:03 daniels: hmm maybe I got the sequence wrong
15:03 daniels: and that's just triggering because of the uapi change
15:03 daniels: there were a couple before the RadeonSI change got merged, but maybe that was separate
15:08 hakzsam: daniels: it makes sense to me that it's triggered for !6049 because of the amd/common changes
15:08 hakzsam: nvm, this MR contains radeonsi changes anyways
15:08 MrCooper: daniels: the radeonsi rule changes cannot affect jobs which don't use those rules, so I don't think it can be directly related
15:09 MrCooper: (it being all jobs being created in post-merge pipelines presumably?)
15:28 MrCooper: daniels: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6683 didn't have all jobs pre- nor post-merge, so looks fine I think?
15:28 daniels: nice :)
15:51 jekstrand: karolherbst: wip/nir-lower-goto-if-fixes
15:51 Kayden: hm, still can't make a new merge request in kwg/mesa - that expected given the current outages?
15:51 jekstrand: karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6750
15:51 jekstrand: karolherbst: Mind reviewing the core NIR bits and acking the nir-lower-goto-if bits
15:52 jekstrand: karolherbst: I found the "real" structurizer bug. \o/
15:52 karolherbst: cool
15:52 jekstrand: With that, all my internal kernels structurize properly
15:52 jekstrand: And it passes vulkan CTS with the hack patch
15:53 Kayden: nvm, worked this time, thanks!
15:53 jekstrand: karolherbst: I intend to leave the DCE and empty-blocks opt patches to a different MR which we can leave WIP until such a time as we actually want to start doing unstructured opts.
15:55 dcbaker[m]: jekstrand: I'm catching up on the backlog, did anything happen to get that radeonsi thing fixed?
15:55 dcbaker[m]: for 20.2
15:56 jekstrand: dcbaker[m]: mareko made a MR from my patch and added his rb tag
15:57 dcbaker[m]: bnieuwenhuizen: I've got three patches from you that I can't get to apply, the first two are reverts, the "set BIG_PAGE" and the "move L2_CACHE_CONTROL" ones. Do you actually want those on 20.2?
15:59 dcbaker[m]: the other one is the "radv,radeonsi: Disable compression on interop depth images", it looks like there's some base patches missing
16:18 agd5f_: what is the RHEL component name for udev?
16:18 agd5f_: ajax, ^
16:24 bnieuwenhuizen: dcbaker[m]: yes I want them, I'll provide backports. They're not super urgent though so don't let them block the final release
16:25 dcbaker[m]: cool, I'm just waiting for CI to get back to me and I'm going to try to make the release today if I can get it all green today
16:25 dcbaker[m]: although, mareko: do you want to get https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6841 in before the 20.2 release?
16:34 MrCooper: agd5f_: systemd-udev
16:35 agd5f_: MrCooper, thanks!
16:36 MrCooper: np
16:36 MrCooper: well I checked on Fedora, but presumably it should be the same :)
16:51 jekstrand: anholt: So, fixing that from_ssa bug got me thinking about the liveness stuff you're doing in NIR->TGSI
16:51 jekstrand: anholt: We have a perf problem in nir_ssa_defs_interfere which actually shows up in perf traces which is related
16:51 jekstrand: Namely, liveness doesn't generate a live range end which is usable
16:51 jekstrand: So we try very hard to avoid it and then fall back to that nasty list walk.
16:52 anholt: jekstrand: got to run in a minute, will be back. have thoughts about liveness.
16:53 jekstrand: Ok. I'll be here
17:17 Venemo: jenatali: ping, do you have a moment?
17:18 jenatali: Venemo: Sure, what's up?
17:18 Venemo: jenatali: do I remember correctly that you are in the D3D12 team?
17:18 jenatali: Venemo: Yep
17:19 Venemo: okay, I'm reading about mesh shaders here, specifically about primitive attributes: https://microsoft.github.io/DirectX-Specs/d3d/MeshShader.html#primitive-attributes
17:19 Venemo: this sentence has me confused: If you write to the same attribute at the same vertex index, the last value written will be the value exported from the Mesh shader.
17:19 Venemo: what does the vertex index have to do with a per-primitive attribute?
17:19 Venemo: is this a typo and is it meant to say primitive index?
17:20 Venemo: jenatali: ^^
17:21 jenatali: Venemo: I think you're probably right, let me double-check with the team
17:24 Venemo: jenatali: I'm specifically wondering, what happens if you have multiple primitives which use the exact same vertices in the exact same order. I thought this wouldn't make any sense, but learned today that this is actually valid. do you maybe know something about this?
17:28 mareko: dcbaker[m]: yes please
17:39 Venemo: mareko: on Navi 10, is it possible for the PS to know the primitive ID, if multiple primitives might have the same provoking vertex?
17:44 jenatali: Venemo: I don't see anything special about that for mesh shaders... it's just like having indexed drawing with the same vertices in the same order, isn't it?
17:45 Venemo: jenatali: so you're saying that it's a valid use case
17:45 Lyude: pq: I'm talking about https://patchwork.freedesktop.org/patch/390621/?series=81702&rev=1
17:45 jenatali: Venemo: I don't see why not
17:45 Venemo: yeah, okay
17:47 Lyude: since I think some of the registers added there are related to that, and what I'm planning on adding support for to i915
17:47 Venemo: jenatali: I'm just thinking about how to implement per-primitive attributes. before I learned about this use case, I thought they can be just turned into per-vertex attributes, but apparently that's not the case if the same vertices can belong to multiple primitives.
17:47 jenatali: Venemo: Ah, yeah I understand
17:47 Venemo: jenatali: thanks for confirming my suspicion :)
17:48 Venemo: jenatali: in essence, I don't see how to do this unless the hw keeps track of it for me
17:49 jekstrand: karolherbst: Do we have lowering somewhere for fisfinite?
17:49 karolherbst: I don't think so
17:49 karolherbst: and I think we should remove the op anyway
17:50 karolherbst: isfinite a is really just neo a a
17:50 karolherbst: but for that we would need to add it first :)
17:51 jenatali: I only added it because DXIL had one
17:51 karolherbst: I think _all_ hardware implements it as neo a a though
17:51 karolherbst: uhm
17:51 karolherbst: yeah, neo is right :D
17:51 mareko: Venemo: NGG_DISABLE_PROVOK_REUSE takes care of that
17:51 karolherbst: ehh. wait
17:51 karolherbst: actually.. not
17:52 karolherbst: I confused myself
17:53 karolherbst: (isfinite a) == (and (neo a a) (neq (fabs a) INFINITY))
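A minimal sketch of that lowering, assuming the first comparison is meant as a NaN check. The log spells it `neo`; the sketch uses an ordered self-equality (`a == a`), which is false only for NaN. The second function is one possible shorter form, not something stated in the log.

```python
import math

def isfinite_lowered(a):
    """isfinite as two comparisons: not NaN, and magnitude is not infinity."""
    not_nan = (a == a)               # ordered self-compare fails only for NaN
    not_inf = abs(a) != math.inf
    return not_nan and not_inf

def isfinite_single_compare(a):
    """Alternative: fabs plus one ordered less-than, which is also false for NaN."""
    return abs(a) < math.inf
```

Both forms agree with `math.isfinite` on ordinary values, on both infinities, and on NaN.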
17:53 imirkin_: karolherbst: can FCHK be used to determine this stuff?
17:53 karolherbst: let's see
17:53 Venemo: mareko: can you elaborate a bit on that, please? if I have a mesh shader that exports 2 triangles with the exact same vertices in the exact same order, how can the PS differentiate between the two primitives?
17:54 karolherbst: imirkin_: nope
17:54 karolherbst: FCHK is just a range check
17:56 karolherbst: mhh, but maybe it could help with some funky stuff
17:56 karolherbst: the op is a bit odd
17:57 jekstrand: I wonder if we have a magic bit for it...
17:57 jekstrand: We might
17:57 daniels: Kayden: not expected, looking into it - if I haven't updated you by tomorrow, please remind me
17:58 jekstrand: Yeah, we might be able to implement it with MOV.o
17:58 imirkin_: karolherbst: yes, it's very odd. and has flags.
18:00 karolherbst: well.. if one flag counts as "flags"
18:01 jekstrand: Hrm... It's not clear from the docs if the overflow conditional is triggered by a simple MOV
18:01 karolherbst: afaik it just does some divide related range checking
18:01 karolherbst: jekstrand: well, right, but you still need to deal with the infinity value
18:02 karolherbst: so you are already at two instructions at best
18:02 karolherbst: and we could actually do it in two instructions
18:02 jekstrand: We can definitely do it in two
18:02 jekstrand: Trying to do it in one might not be possible
18:03 karolherbst: mhhhh
18:04 jekstrand: We have an overflow flag:
18:04 jekstrand: OF. This conditional modifier tests whether the computed result
18:04 jekstrand: causes overflow – the computed result is outside the range of the
18:04 jekstrand: destination data type.
18:05 jenatali: bnieuwenhuizen: Thanks for the spec fix :)
18:05 jekstrand: But it's not clear that it does something useful for infinite values
18:06 jekstrand:adds nir_opt_algebraic lowering
18:10 airlied: agd5f_: systemd if ajax hasn't answered
18:11 airlied: ah MrCooper answered
18:31 rcdrone: daniels: i can clone my fork with errors now, which is an improvement, but isn't confidence inspiring. should i just wait a week? i'm not in a super rush.
18:48 daniels: rcdrone: you're one of three people with problems right now - all being well we'll sort it tomorrow, but worst case we'll have you back by Monday, restored from a backup from the 22nd
18:49 daniels: the activity log shows you haven't pushed anything after the 21st, so there'd be no data loss
18:50 rcdrone: daniels: sounds good
18:56 anholt: jekstrand: so, I'm using the liveness analysis I wrote right now, but it's not that great because it doesn't do regs, and it's less useful in the future because I want to use noltis, which means that the last ssa use may be later (because a larger tile consumes that instr as an edge, i.e. "value I need stored") or earlier (because the last use was an internal node: we didn't need that value stored, like a uniform's offset that gets baked into the instruction).
18:56 anholt: I would love to land it, even as is, because I think ntt is a way forward for paying down a bunch of debt we've got. but I'm not convinced that NIR helper is long-lived.
18:57 anholt: semi-related to liveness: in a branch I've got for noltis that wants to do "can I use this nir_src with a reg at this later point in the instruction sequence?", I ended up making instr->index a metadata, so I can walk reg defs to see if any are between two instructions.
18:59 jekstrand: anholt: The problem we need to solve with from_ssa is similar(ish)
19:00 jekstrand: We need to say "Is this SSA def live at that instruction"
19:00 jekstrand: Which means we need an end IP for the live range and something on the instruction to compare it to.
19:00 jekstrand: Which, I think, is also what you need for ntt
19:01 jekstrand: But we want to do it random-access rather than walking the instruction list and maintaining an IP
19:03 anholt: if you've got instr index as metadata, I wonder if it would make sense to use that and walk the ssa uses to see if any are in between, instead of walking the instructions in the block to see if any of them are uses
19:04 anholt: but not sure how maintaining that metadata would work while you're inside of a pass that's adding/removing instrs.
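A rough sketch of anholt's suggestion, assuming a single straight-line block where every instruction carries a valid index and each SSA def keeps a list of the indices of its uses. The function names and the data layout are hypothetical. Control flow, if-uses, and keeping indices valid while instructions are added or removed are ignored here.

```python
def live_end(use_indices):
    """Index of the last use, or None if the value is never used."""
    return max(use_indices, default=None)

def is_live_at(def_index, use_indices, instr_index):
    """True if the value is defined before instr_index and used after it.

    Walks the def's uses instead of walking the instructions in the block.
    """
    if def_index >= instr_index:
        return False
    return any(use > instr_index for use in use_indices)
```

The cost is proportional to the number of uses of one def rather than to the length of the block, which is the trade anholt is proposing.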
19:06 zmike: alyssa: ping re: !6563
19:12 jekstrand: anholt: Yeah.....
19:12 jekstrand: anholt: Fortunately, for the from_ssa case, it's doing 100% analysis at the time we're analyzing liveness information.
19:13 jekstrand: Instructions are only added later
19:25 anholt: oh, good
19:25 anholt: so, it could potentially use the liveness helper, even without a fancier ad-hoc ssa live check?
19:25 anholt: anyone with a clover setup able to test out https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6751 ?
19:44 jekstrand: anholt: wip/shamrock in my tree and you can test it. :)
19:45 jekstrand: anholt: As far as the liveness helper goes, only if I can trust the instruction indices
19:45 jekstrand: anholt: Also, we need to solve if uses somehow
20:03 sravn: danvet: I just read that Bartlomiej steps down as fbdev maintainer - he has done a good job and I was not happy to read that he stopped
20:04 sravn: But keeping it in drm-misc seems like the right choice so as time permits I will also look a little after the fbdev patches - as long as they hit dri-devel
21:29 anholt: jekstrand: presumably I also need some llvm, and some cts, and...
21:30 jekstrand: anholt: System LLVM and clinfo should be enough
21:30 jekstrand: anholt: Or you can just poke karolherbst for a while
21:30 anholt: oh? I thought there was a ton of other stuff in flight to make it happen
21:31 jekstrand: or me if I get bored
21:31 jekstrand: anholt: There's tons of stuff in flight to make it all conformant but basic "can you boot up a driver?" is all there.
21:31 karolherbst: what is anholt up to? clover on vc? :p
21:31 jekstrand: anholt: Apart from the bits in my wip/shamrock branch which I have no intention of landing. :P
21:31 jekstrand: karolherbst: Reworking pipe-loader stuff
21:31 karolherbst: ahh
21:31 karolherbst: +1
21:31 karolherbst: also, +1 if you make it so that the dynamic pipe loader has to go sadly... :p
21:32 jekstrand: anholt: Give me a minute. I can test it for you.
21:32 anholt: nah, I just delete most of it
21:32 anholt: not all of it
21:39 jekstrand: anholt: Building now
21:39 anholt: thanks!
21:42 jekstrand: anholt: Now I have to figure out how to make iris work with your patches. :P
21:43 airlied: jekstrand: is it the last 3 patches the ones you won't land?
21:43 anholt: jekstrand: should be "delete everything but the header includes from your pipe loader"
21:43 jekstrand: airlied: Specifically the first two
21:45 kisak: anyone happen to know if idr is around? I'd like to check if it's alright to fast track https://gitlab.freedesktop.org/mesa/mesa/-/commit/caf4a533ef977b29d24adbbfc253cc2fb2f43f85 into my PPA even though !6358 hasn't landed yet.
21:46 jekstrand: anholt: Is there some bit of codegen I need to modify somewhere
21:46 anholt: jekstrand: shouldn't be. did I break it?
21:46 jekstrand: anholt: Maybe?
21:46 jekstrand: anholt: Let me rebase yours on top of mine and squash in bits
21:55 jekstrand: anholt: It looks like "Define the DRM entrypoints in drm_helper.h" is breaking clover for me
21:55 anholt: thanks, I'll have to give it a shot with some debugging. I really thought that getting things going was a lot harder
21:56 jekstrand: anholt: If it makes your life easier, landing the pipe-loader patch for iris would probably be ok as long as we don't advertise the caps
21:56 jekstrand: There's just not much point with just GL
21:57 anholt: with your branch I'm unblocked, I can debug on top of it
21:57 jekstrand: cool
21:57 anholt: (because what I really want is to be unblocked to rewrite the xml stuff, it just sounds like sooooo much fun)
22:16 jekstrand: anholt: I wrote that wip/nir-if-instr branch in 2015. It's not gonna rebase. :)
22:17 anholt: oof
22:17 jekstrand: Wow, that was back when nir lived in src/glsl!
22:17 jekstrand: I think we should probably do it.
22:17 jekstrand: It'd let us drop nir_ssa_def::if_uses
22:18 jekstrand: That sounds like a feature
22:18 imirkin_: jekstrand: i'm sure it won't rebase for other reasons, but git is actually really good at rebasing across renames/etc
22:19 jekstrand: imirkin_: Yeah, it is. Just giving an idea of timeline. It was pre-Vulkan. :)
22:21 jekstrand:tries rebasing it
22:21 jekstrand: Yeah... Not gonna happen
22:21 imirkin_: hehe
22:21 jekstrand: It'll have to be rewritten from scratch
22:21 anholt: jekstrand: did we have a conclusion about liveness? next steps?
22:21 jekstrand: anholt: Not sure
22:21 jekstrand: anholt: So is your thing now using instruction indices rather than a rolling IP tracked as you walk?
22:22 anholt: instr->index
22:23 jekstrand: right
22:23 jekstrand: I forgot that even existed
22:23 anholt: ugliest part is that I mark ifs live to last-instr-plus-1
22:25 jekstrand: anholt: Literally the only thing which uses nir_instr::index is nir_move_vec_src_uses_to_dest and it uses it for its implementation of ssa_def_dominates_instr.
22:26 jekstrand: anholt: I say we replace it with uint32_t live_index and make it part of liveness.
22:26 jekstrand: Then I have no qualms about relying on it
22:26 anholt: jekstrand: hmm. I had another use I was introducing for "see if a reg is written between the nir_src using it and a later instruction you want to pull that nir_src to"
22:26 anholt: (not in the current ntt series)
22:26 jekstrand: Then replace our usage of def->live_index with def->parent_instr->live_index
22:26 anholt: that helper would really help with killing src modifiers
22:27 jekstrand: And then we can add a nir_ssa_def::live_end_index
22:27 jekstrand: And just track real liveness information
22:27 anholt: interesting!
22:27 jekstrand: Ugh....
22:27 jekstrand: Maybe that won't work.
22:27 jekstrand: Yeah, it won't
22:28 jekstrand: Not as well as we'd like anyway
22:28 jekstrand: live_end really needs to be per-block.
22:28 anholt: you want a set of live ranges instead of a single interval?
22:28 anholt: (ntt wants the single interval)
22:29 jekstrand: for out-of-ssa, definitely
22:29 anholt: seems plausible. hey, do you have a handy testcase for your "ssa live at perf sucks"?
22:29 jekstrand: No, I don't
22:30 jekstrand: I think I noticed back when I was hacking on fp64 tests which had on the order of 100k SSA defs and 10k blocks
22:30 jekstrand: I just know I've seen it show up in traces
22:32 anholt: hmm. sysprof of 10s of shader-db has is_live_at at .04% CPU
22:32 jekstrand: Yeah, I don't think it's huge unless you get into really bad corners
22:34 imirkin_: dmat4 * dmat4 with the fp64 -> not fp64 lowering is probably brutal
22:35 imirkin_: esp if everything gets unrolled
22:39 anholt: jekstrand: -Dgallium-opencl=icd for clover?
22:39 jekstrand: anholt: -Dgallium-opencl=icd -Dopencl-spirv=true
22:40 jekstrand: anholt: The one extra dep you'll need is spirv-llvm-translator
22:40 jekstrand: anholt: They have one branch per LLVM version going all the way back to 7 or so
22:40 jekstrand: I've been using 10 on my system
22:51 anholt: jekstrand: got clinfo up, thanks
22:51 jekstrand: Woo!
22:53 airlied: jekstrand: btw fedora now packages the translator
22:54 jekstrand: \o/
23:01 anholt: jekstrand: err, "clCreateContext(NULL, ...) [default] No devices found in platform"
23:01 anholt: (stock branch of yours)
23:02 anholt: LD_LIBRARY_PATH=`pwd`/build/lib OCL_ICD_VENDORS=`pwd`/build/src/gallium/targets/opencl/ clinfo
23:03 airlied: I think you have to install
23:03 airlied: so it picks up the pipe drivers
23:03 jekstrand: Yeah, I wouldn't even try without an install
23:03 jekstrand: And then it should be export OCL_ICD_VENDORS="${MESA_PATH}/_install/etc/OpenCL/vendors"
23:03 jekstrand: Where ${MESA_PATH}/_install is your install prefix
23:04 anholt: installed, with that vendors, still nothing.
23:05 anholt: strace shows opening pipe_iris for a moment, then moving on to pipe_kmsro
23:05 airlied: sounds like screen create failing maybe?
23:05 airlied: make sure it picked up the translator on build
23:06 anholt: when I didn't have the translator, meson just failed
23:11 anholt: there we go. some stray junk in my tree, and then needed ld_args_build_id
23:12 anholt: have a new commit in your tree, enjoy.
23:12 jekstrand: anholt: Rebasing the if instruction. So far, here's the stats:
23:12 jekstrand: 14 files changed, 69 insertions(+), 163 deletions(-)
23:12 anholt: that strikes me as not enough files changed :)
23:13 jekstrand: Oh, it's nowhere close. :)
23:14 jekstrand: However, when most of the modifications are "delete the code to handle if_uses", it's pretty good evidence this is on the right track. :)
23:33 anholt: jekstrand: thanks for the help getting cl up. turns out it helps to make my entrypoint struct public.
23:33 jekstrand: :)
23:34 jekstrand: anholt: Now you can join in the OpenCL fun! :)
23:34 anholt:places this branch carefully back on the shelf
23:34 anholt: but, tbh, I'm tempted to throw together CL in CI on msm just so I can refactor without breaking stuff.
23:35 jekstrand: Yeah, we should wire up llvmpipe clover or something just for CL CI
23:35 airlied:fails to understand the wiring up
23:35 airlied: it's already wired up, just needs to be CI
23:36 jekstrand: Oh, right.
23:36 jekstrand: I forgot you'd wired it
23:36 airlied: I might move it out from LP_DEBUG=cl
23:36 airlied: to a separate env var
23:36 airlied: since LP_DEBUG is debug builds only
23:40 airlied: I was kinda waiting for clc to land before engaging CI
23:40 airlied: and I guess there's a few low hanging fruit in llvmpipe
23:42 anholt: jekstrand: btw, when you do end up rebasing on my cleanup, it's not just delete the code from pipe_iris, it's also include drm_helper.h
23:42 jekstrand: anholt: Yeah, I think I figured that out
23:59 airlied: ah the other thing stopping me touch CL CI was likely fear of debian llvm packages :-P