00:00 airlied: karolherbst: and those values seem to be the index into the sampler views
00:00 airlied: maybe I thought (wrongly) that they could be repeated or come out of order
00:01 karolherbst: imirkin: well, you can have 128 textures inside one kernel
00:01 karolherbst: so clover would bind 128 sampler views
00:02 airlied: and that's fine if you fix the driver
00:03 airlied: and don't expose 128 to mesa ever
00:03 karolherbst: right
00:03 karolherbst: as I said: worst case only gallium needs fixing, and not st/mesa :)
00:03 * airlied would be happy to see 1 image :-P
00:03 karolherbst: :p
00:03 karolherbst: I am quite close actually
00:04 karolherbst: I think
00:04 karolherbst: airlied: I thought ~90 passes and ~10 fails in the basic tests is way too high, so I enabled images :p
00:05 karolherbst: the remaining fails are super annoying bits
00:05 karolherbst: like enabled memcpy
00:05 karolherbst: and other random stuff
00:10 karolherbst: mhh sadly vtn just assumes you have pointers... mhhh
00:45 karolherbst: "vec4 32 ssa_21 = txl ssa_0 (texture_offset), ssa_3 (sampler_offset), ssa_19 (coord), ssa_20 (lod), 0 (texture), 0 (sampler)" :)
01:06 jekstrand: karolherbst: I think we'll likely need a sampler intrinsic for immutable samplers
01:07 karolherbst: jekstrand: why? It's not the samplers being immutable
01:08 karolherbst: but yeah.. we might need it for the image/texture part.. but right now, by treating it like ssa values, I just ignore that bit
01:09 jekstrand: I thought OpenCL had inline samplers...
01:09 jekstrand: Maybe I'm wrong?
01:10 karolherbst: there was something, but I don't know if that actually matters... let me check
01:11 karolherbst: jekstrand: ahh, right.. sampler_t is essentially an enum of stuff
01:12 karolherbst: https://www.khronos.org/registry/OpenCL/sdk/2.1/docs/man/xhtml/sampler_t.html
01:14 karolherbst: but clover still creates a normal pipe_sampler_state if it's a kernel arg
01:17 karolherbst: and I think in-kernel samplers should just get extracted as compilation output and bound via the API as well
01:17 karolherbst: at least that would be the most straightforward way
01:18 jekstrand: Yeah, that's what I was thinking as well.
01:18 karolherbst: "A variable of sampler_t type declared in the program source must be initialized with a 32-bit unsigned integer constant"
01:18 karolherbst: so it has to be constant anyway
01:18 jekstrand: So what's the other issue with samplers/images?
01:18 karolherbst: jekstrand: vtn just deals with that stuff completely differently
01:18 karolherbst: that's probably the main problem
01:19 karolherbst: it assumes vtn_pointers for sampler/texture, but with CL we kind of have offsets
01:19 karolherbst: not a big deal, just a bit annoying
01:19 jekstrand: Yeah...
01:19 jekstrand: It assumes a pointer back to a descriptor
01:19 jekstrand: Because that's the way it works in GLSL (sort-of) and is required to work in SPIR-V for Vulkan.
01:20 jekstrand: What are these offsets that SPIR-V does?
01:20 karolherbst: https://gist.githubusercontent.com/karolherbst/f2a783f1da9da4887b2aa17c015da099/raw/aff56922600efc3fcadac93a154b409b93a3dc45/gistfile1.txt
01:20 karolherbst: jekstrand: ABI of clover
01:20 karolherbst: implicit numbering of args
01:20 jekstrand: Can we just add nir_intrinsic_spirv_image, nir_intrinsic_spirv_sampler, and nir_intrinsic_spirv_imm_sampler, all of which return pointers?
01:20 karolherbst: and as the args are opaque you shouldn't take any input space actually
01:20 karolherbst: jekstrand: why not just load_const?
01:20 karolherbst: seems simpler
01:21 karolherbst: I just deal with it inside the kernel wrapper
01:21 jekstrand: We could but it requires re-plumbing spirv_to_nir
01:21 karolherbst: sure... but I already did the work
01:21 jekstrand: Hrm... Maybe not...
01:21 jekstrand: How many special cases did that take?
01:21 karolherbst: other annoying bit is that it requires new glsl_types as in spirv the OpTypeImage gets created with void :)
01:22 karolherbst: jekstrand: not much
01:22 jekstrand: OpTypeImage gets created with Void?
01:22 jekstrand: What do you mean
01:22 jekstrand: ?
01:23 karolherbst: base type
01:23 karolherbst: it's void :)
01:23 karolherbst: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/2f2530b39c27d978c6df061097c0be9f640bc054
01:23 karolherbst: ehh, "Sampled Type"
01:23 karolherbst: the type is set by the actual operation
01:24 karolherbst: the patch isn't that ugly surprisinly
01:24 karolherbst: *surprisingly
01:24 jekstrand: Yeah, that's a new thing in SPIR-V
01:24 jekstrand: Vulkan is going to start doing that too
01:24 karolherbst: ohh, okay
01:25 jekstrand: I think we should already handle that
01:25 jekstrand: Well, Vulkan sort-of does
01:25 karolherbst: well.. apparently not
01:25 anarsoul|2: austriancoder: you may want to fix this warning: https://gitlab.freedesktop.org/anarsoul/mesa/-/jobs/1878830#L1947
01:25 karolherbst: maybe it looks different in vulkan
01:25 jekstrand: Vulkan gets its signedness from the operation but its int/floatness from the sampler type
01:25 karolherbst: yeah well
01:25 karolherbst: in CL you get everything from the operation
01:25 jekstrand: That's fine. For texturing, everything's on the operation in NIR anyway
01:26 karolherbst: for some things, yes.. I saw that
01:26 jekstrand: Not so much for images but that shouldn't be too terrible to handle.
01:26 karolherbst: but I still hit some errors,
01:26 karolherbst: anyway, the patch shows what I needed to change
01:26 jekstrand: Sure
01:26 karolherbst: and it isn't all that much actually
01:27 karolherbst: that "bool sampled;" thing is dead code btw
01:27 karolherbst: maybe I should send a clean up patch for that :)
01:28 jekstrand: I'm not sure I want it gone.... I kind of want to use it for ImageQuerySize and ImageLoad
01:29 jekstrand: I guess it's only needed for QuerySize
01:29 karolherbst: well.. right now it's not used at all.. so
01:29 jekstrand: Yeah, it isn't
01:29 jekstrand: That's because of history, mostly.
01:30 jekstrand: Initially, I was trying to use glsl_type for everything and so I made the distinction between sampled images and storage images using the image2D vs. sampler2D types but that doesn't actually match the GLSL -> SPIR-V translation.
01:31 jekstrand: The GLSL -> SPIR-V translation only uses sampler2D for combined image samplers
01:33 karolherbst: airlied: ehh.. running into issues at runtime now :(
01:33 jekstrand: FYI: I'm pretty sure your function parameter handling is going to break Vulkan quite badly
01:33 karolherbst: probably
01:35 karolherbst: jekstrand: fyi, the spirv looks like this: https://gist.githubusercontent.com/karolherbst/0f0202e00a1c91c9f233309fa8135d58/raw/2ded3970e906a3768224cb595538554803da9094/a.spv
01:36 karolherbst: but I guess called functions would look similar in regards to image/sampler args? dunno if there is a way to actually tell those apart
01:36 jekstrand: karolherbst: Uh, yeah, that's very GLSL bindless isn't it?
01:37 karolherbst: yeah... but at the same time it isn't
01:37 karolherbst: :( it just sucks
01:37 jekstrand: Yeah, best option may be to do something similar to what you did.
01:37 karolherbst: jekstrand: the part which kills bindless is, that sampler and image args are opaque and shouldn't take away from the kernel input storage
01:37 jekstrand: Oh, that sucks
01:38 karolherbst: yep
01:38 jekstrand: I mean, it could be worse, but it really kind-of sucks.
01:38 jekstrand: I guess that implies a binding model where your binding table is basically indexed by argument position.
01:38 karolherbst: well.. it's just 128+16 args in the worst case and you might have enough storage to have some hidden storage for stuff like this
01:38 karolherbst: still...
01:39 jekstrand: Which is fine, TBH, but it does mean thinking about things differently in spirv_to_nir
01:39 karolherbst: yep
01:39 karolherbst: the painful part is just how things get into the kernel
01:39 karolherbst: which is already annoying
01:39 jekstrand: One option which is kind-of terrible but kind-of not would be to just make a uniform variable per input and just pass pointers around.
01:39 jekstrand: That's the nicest CL -> GLSL type mapping
01:40 karolherbst: yeah... probably
01:40 karolherbst: but you still need to set the constant value somewhere
01:40 karolherbst: mhh
01:40 jekstrand: But allowing images and samplers to just be SSA values is also advantageous
01:40 karolherbst: maybe nir opts could actually optimize all of that away.. dunno
01:41 jekstrand: I have a very old branch somewhere which reworks NIR to work in a more bindless way where you have a load_deref on the sampler pointer and the result of that load_deref is passed into the nir_tex_instr.
01:41 karolherbst: I already see that a offsets -> index opt could be useful
01:42 jekstrand: With the assumption that in a non-bindless shader, 100% of nir_tex_instrs would be such that you could chase the SSA value to the deref.
01:42 karolherbst: jekstrand: the problem is, what if your hw doesn't support bindless and you really need to end with offset or index
01:42 karolherbst: yeah...
01:42 jekstrand: The bindless handle can, in theory, be anything.
01:42 jekstrand: In particular, it can be a 32-bit index
01:42 karolherbst: mhhh
01:42 karolherbst: true
01:43 karolherbst: just at some point we still need to know if it's a bindless handle or just an indirect texture/sampler reference
01:43 jekstrand: I'm not sure it's worth it though
01:43 jekstrand: For this, likely something akin to what you've done is the most pragmatic choice.
01:44 karolherbst: probably
01:44 jekstrand: OpenCL actually seems pretty simple. Just count the images/samplers and do one index each.
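[Editor's sketch] The "count the images/samplers and do one index each" idea can be illustrated roughly as follows. This is a minimal sketch, not clover's or spirv_to_nir's actual code: the types `arg_kind`, `kernel_arg`, `arg_layout` and the function `assign_arg_slots` are hypothetical names invented for illustration, and alignment of value arguments is deliberately ignored. It assumes, as discussed above, that image and sampler arguments are opaque and consume no kernel input storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument kinds; not clover's or NIR's real types. */
enum arg_kind { ARG_VALUE, ARG_IMAGE, ARG_SAMPLER };

struct kernel_arg {
    enum arg_kind kind;
    uint32_t size; /* byte size; ignored for images and samplers */
    uint32_t slot; /* out: input-buffer offset, or image/sampler index */
};

struct arg_layout {
    uint32_t input_bytes;  /* bytes of kernel input storage used */
    uint32_t num_images;   /* image indices handed out */
    uint32_t num_samplers; /* sampler indices handed out */
};

/* Walk the arguments in declaration order. Images and samplers each get
 * the next index from their own counter and take no input storage; every
 * other argument is packed into the input buffer (no alignment handling,
 * which a real implementation would need). */
static struct arg_layout
assign_arg_slots(struct kernel_arg *args, size_t count)
{
    struct arg_layout layout = {0, 0, 0};

    assert(args != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        switch (args[i].kind) {
        case ARG_IMAGE:
            args[i].slot = layout.num_images++;
            break;
        case ARG_SAMPLER:
            args[i].slot = layout.num_samplers++;
            break;
        case ARG_VALUE:
            args[i].slot = layout.input_bytes;
            layout.input_bytes += args[i].size;
            break;
        }
    }
    return layout;
}
```

With a signature like (int, image, sampler, image, long), the two images would get indices 0 and 1, the sampler index 0, and only the int and the long would occupy input storage.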
01:44 karolherbst: maybe I give it some thoughts and find a better solution... I mean, in the end it shouldn't matter if the deref points to an uniform or an actual constant
01:44 karolherbst: but...
01:45 jekstrand: I actually kind-of like the idea of creating one uniform for eac
01:45 jekstrand: *each
01:45 jekstrand: However, I suspect we'd actually need an array of images or something like that
01:45 jekstrand: And that could get messy
01:45 karolherbst: mhh, I could just create a function_temp var and start from there...
01:45 karolherbst: or something
01:45 jekstrand: Just doing it with straight indices isn't terrible though
01:45 jekstrand: Given what the OpenCL SPIR-V looks like
01:45 karolherbst: if I have the var, set it to a constant and pass the deref into the function, nir should be able to optimize it to a simple constant offset
01:46 karolherbst: or not?
01:46 jekstrand: After function inlining, yes.
01:46 karolherbst: mhhh
01:47 karolherbst: that might be a solid alternative then
01:47 karolherbst: I don't know how much indirection CL allows here
01:47 karolherbst: but I guess you could select a sampler from an array.. maybe
01:47 karolherbst: and then we would need something like that anyway
01:48 jekstrand: Yeah, I don't know what CL's rules are either
01:48 jekstrand: I suspect they basically don't exist. :(
01:49 karolherbst: let's see what nvidia allows :)
01:50 imirkin: sounds like it's probably DX's rules if they allow 128 textures and 16 samplers
01:50 imirkin: (DX10)
01:50 karolherbst: heh.. error: array with image or sampler element type is not allowed
01:50 imirkin: nvidia hw allows indirect on both texture and sampler binding unit
01:51 imirkin: basically you pack a 32-bit int with the sampler id and texture id in some bit layout
01:51 karolherbst: not even casting is allowed
01:51 karolherbst: uff
01:52 karolherbst: jekstrand: I guess nvidia doesn't allow indirect samplers...
01:52 imirkin: (that's def the kepler way, but i'm 99% sure that works on fermi too. no indirection on DX10 of course.)
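[Editor's sketch] Packing a texture id and a sampler id into one 32-bit value, as imirkin describes, could look like the following. The log only says "some bit layout", so the 20/12 bit split here is an assumption chosen purely for illustration, and the helper names are hypothetical; this is not taken from nouveau or any hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split for illustration only: low bits hold the texture id,
 * high bits hold the sampler id. */
#define TEX_ID_BITS  20u
#define SAMP_ID_BITS 12u
#define TEX_ID_MASK  ((1u << TEX_ID_BITS) - 1u)
#define SAMP_ID_MASK ((1u << SAMP_ID_BITS) - 1u)

/* Combine both ids into a single 32-bit handle. */
static uint32_t pack_tex_handle(uint32_t tex_id, uint32_t samp_id)
{
    assert(tex_id <= TEX_ID_MASK);
    assert(samp_id <= SAMP_ID_MASK);
    return tex_id | (samp_id << TEX_ID_BITS);
}

/* Recover the texture id from a packed handle. */
static uint32_t handle_tex_id(uint32_t handle)
{
    return handle & TEX_ID_MASK;
}

/* Recover the sampler id from a packed handle. */
static uint32_t handle_samp_id(uint32_t handle)
{
    return handle >> TEX_ID_BITS;
}
```

Because the result is an ordinary integer, it can be computed at runtime, which is what makes indirect selection of both units possible.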
01:52 karolherbst: same for images
01:52 jekstrand: karolherbst: Heh...
01:52 karolherbst: so...
01:52 karolherbst: maybe CL really doesn't know indirection here
01:52 karolherbst: which.. is good for us
01:52 imirkin: or maybe it was added in later CL?
01:52 karolherbst: dunno
01:52 karolherbst: maybe
01:53 imirkin: i suspect even CL 1.2 is targeted at DX10-era hw
01:53 karolherbst: even in CL2.0 mode it doesn't work
01:53 imirkin: i.e. tesla / pre-evergreen r600
01:53 imirkin: you could make images work on tesla with much blood sweat and tears
01:53 karolherbst: but nvidias CL2.0 implementation isn't complete.. so
01:54 karolherbst: imirkin: why?
01:54 imirkin: coz there's no support for images
01:54 karolherbst: I thought in compute shaders there are
01:54 imirkin: but there is support for gmem
01:54 imirkin: and the images *are* stored in gmem *somewhere*...
01:54 karolherbst: mhhhhhh.
01:54 imirkin: so ... like i said, blood sweat and tears :)
01:54 karolherbst: I could check what they do in CUDA on tesla...
01:54 imirkin: i think curro did a good chunk of the work to make it happen though
01:55 imirkin: i forget if it was merged ... probably not
01:55 imirkin: https://github.com/curro/mesa/commits/nv50-compute
01:56 jekstrand: karolherbst: The more I think about it, the more it looks like your approach fits current NIR the best.
01:56 jekstrand: We could possibly do something different going forward but I'm not sure if it's worth it
01:56 karolherbst: well I kind of still like the function_temp idea
01:57 jekstrand: Feel free to give it a try
01:57 imirkin: https://github.com/curro/mesa/commit/9981ef41f03144aab04b92897b46242d7f7de180
01:57 jekstrand: I'm just saying that you've convinced me that my gut reaction of "OMG, we must have pointers!" was likely an overreaction. :-)
01:58 karolherbst: :p
01:58 karolherbst: imirkin: ufff
01:58 imirkin: having ops which do this for you is a lot more convenient :)
01:59 karolherbst: yes
01:59 jekstrand: That code looks annoyingly familiar....
02:00 imirkin: written *by* someone annoyingly familiar? :)
02:03 jekstrand: Well, that too. :-)
02:03 jekstrand: But the code we carry today, I've rewritten at least twice.
02:04 jekstrand: So we're not using curro's original code anymore. We're using basically his code only translated to NIR.
02:04 karolherbst: well.. we still have enough fun with images even on latest gen :/
02:04 karolherbst: really.. what's the point of having hw ops if you still end up doing a lot of the stuff yourself :(
02:05 * jekstrand always wonders that
02:05 jekstrand: I thought images "just work" on recent nvidia HW
02:05 karolherbst: heh.. apparently the CPU doesn't like 0 % 0
02:05 karolherbst: big surprise
02:05 karolherbst: jekstrand: ... well... ufff
02:05 karolherbst: no
02:07 jekstrand: Oh? What's busted about them?
02:08 karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#L2640
02:08 karolherbst: mostly the coordinate stuff I guess
02:08 karolherbst: and we do some format conversion
02:09 karolherbst: oh yeah.. and 2d layers of 3d image bound are pain as well
02:09 karolherbst: so you might think the hardware would know how to handle that, especially with all the tiling around
02:09 karolherbst: but no
02:09 karolherbst: nvidia just disables tiling on the z axis and passes in the 2d layer as an offset into the 3d image :)
02:09 karolherbst: but that comes with runtime costs of.. you know, detecting the application doing that
02:11 jekstrand: Yeah
02:11 jekstrand: Having the hardware be able to do format conversion would be nice...
02:13 karolherbst: I guess you can spend the transistors on more useful things.. and kind of optimizing the conversions with the code around might even give you more benefits..
02:13 karolherbst: dunno
02:23 airlied: karolherbst: ah yes I vaguely remember hitting the void thing as well when I gave up last time :-P
02:24 karolherbst: airlied: seems like clover passes in a reference for the image, but not for the sampler... ehh
02:24 karolherbst: anyway.. I crashed my GPU context often enough so nouveau got stuck :(
06:05 lrusak: any chance to import P010 in mesa like it does with YUV420 by using shaders?
06:05 lrusak: or YUV420p10
08:21 pq: airlied, emersion, jadahl, fstat... so what does fstat on a dmabuf fd return today for the device major/minor?
08:22 pq: hm, not easy to check with the shell, since /proc/<pid>/fd/## is just a broken symlink for dmabuf
08:24 jadahl: pq: the check would happen in some gstreamer or pipewire pipeline rather than the shell in this case
08:26 pq: jadahl, yes
08:27 pq: I mean, what else could the major/minor returned by fstatting a dmabuf fd return than the device that allocated it - seems like a good fit, unless it already has a different meaning
08:30 ascent12: pq: st_rdev is 0.
08:30 ascent12: So major/minor is 0.
08:31 pq: ok, sounds like N/A
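[Editor's sketch] The check discussed here, reading device major/minor numbers from an fd with fstat, can be sketched as below. This is Linux/glibc-specific (`major()` and `minor()` come from `<sys/sysmacros.h>`), and `fd_dev_info`, `query_fd_devices` and `query_path_devices` are hypothetical helper names. It does not allocate a dmabuf; it only shows which fields a pipeline would look at. `st_dev` describes the filesystem holding the inode, while `st_rdev` is only meaningful for device nodes, which is consistent with ascent12 seeing 0 for a dmabuf fd.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>
#include <unistd.h>

struct fd_dev_info {
    unsigned dev_major, dev_minor;   /* from st_dev: containing filesystem */
    unsigned rdev_major, rdev_minor; /* from st_rdev: device node identity */
};

/* Fill 'out' from fstat() on an already-open fd. Returns 0 on success,
 * -1 if fstat() fails (errno is left as set by fstat). */
static int query_fd_devices(int fd, struct fd_dev_info *out)
{
    struct stat st;

    assert(out != NULL);
    if (fstat(fd, &st) != 0)
        return -1;

    out->dev_major = major(st.st_dev);
    out->dev_minor = minor(st.st_dev);
    out->rdev_major = major(st.st_rdev);
    out->rdev_minor = minor(st.st_rdev);
    return 0;
}

/* Convenience wrapper: open a path read-only, query it, close it. */
static int query_path_devices(const char *path, struct fd_dev_info *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = query_fd_devices(fd, out);
    close(fd);
    return ret;
}
```

On a character device such as /dev/null the `rdev_*` fields identify the device; for a dmabuf fd, per the log, they would both be 0.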
09:23 anarsoul|2: daniels: looks like fdo-packet-m1xl-1 is out of disk space
09:23 anarsoul|2: https://gitlab.freedesktop.org/anarsoul/mesa/-/jobs/1884402
09:37 hakzsam: MrCooper: are you fine with https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4044 ?
09:52 MrCooper: noticed a couple more things I'm afraid
09:52 hakzsam: thanks, I will update
10:08 MrCooper: hakzsam: please cancel the pipeline (still no idea why others can't do so for your project?)
10:09 hakzsam: why should I cancel it?
10:09 hakzsam: I don't know either why you can't
10:10 hakzsam: ok, it wasn't rebased on recent master, should be ok now
10:19 MrCooper: in your project settings under "General" => "Visibility, project features, permissions", is the "Pipelines" switch enabled? And what's selected in the combo box to the right of it?
10:19 linkmauve: pq, fd symlinks in /proc aren’t actually broken, you can still open them and then do stuff with them.
10:20 pq: linkmauve, ohh
10:23 pq: looks like the stat command is not to be trusted on them though, for a dmabuf it says 'Device: 5h/5d' which is the same it says for the fd directory too.
10:24 MrCooper: hakzsam: did you see my question above?
10:26 hakzsam: MrCooper: it's enabled and it's "Everyone with access"
10:26 linkmauve: pq, it’s especially useful when you’ve inadvertently deleted a file and it’s still open by another process; you can then recreate a hardlink and it won’t be deleted anymore.
10:32 MrCooper: hakzsam: hrm, and what's selected for "Project visibility" at the top of that section?
10:36 hakzsam: MrCooper: "Public" and the checkbox below isn't checked
10:36 MrCooper: k, then I'm out of ideas again
10:37 hakzsam: I think we already checked last time :/
10:41 MrCooper: ah, I think it's because you restricted access to the Git branch
10:43 MrCooper: "Repository" in that section maybe?
10:44 hakzsam: same as the Pipelines section
10:51 MrCooper: did you define any protected branches on the Repository settings page?
10:52 hakzsam: no
11:07 hakzsam: MrCooper: https://gitlab.freedesktop.org/hakzsam/mesa/-/jobs/1885960 crashed again maybe we need to rm ppc64el as well?
11:08 MrCooper: sounds like a shot in the dark; I'd try it with a local podman/docker first
11:59 daniels: anarsoul|2: fixed thanks
12:20 tpalli: MrCooper seems like something wrong with panfrost CI jobs
12:21 tpalli: MrCooper getting stuck/timeout failure
12:32 daniels: tpalli: yeah, that's my fault and not his :P
12:32 daniels: the fd.o runner which executes those jobs was dead because its disk was full (an ongoing saga we're fixing for good this week but still not done yet)
12:32 daniels: if you retry them the runner will pick them up, but they'll take longer than normal to complete because there's now a backlog
12:35 tpalli: daniels ok thanks!
12:41 kazlaus: mdnavare: it's not part of the spec, but realistically you aren't going to get a good experience on panels that have a range of that or less
13:37 MrCooper: tomeu: is there an MR yet for making the meson-arm* artifacts smaller? They currently generate an order of magnitude more egress traffic than other artifacts
13:43 daniels: MrCooper: i guess the underlying issue is that they also contain the kernel + modules, which is great for traceability but less great for egress ... we almost certainly want to implement a transparent cache on the LAVA side, which anholt already did for the db410c before they abandoned that iirc
13:45 MrCooper: that would still mean up to 200 MB * number of caches of egress per pipeline
13:49 daniels: there'd be 2 caches, and presumably we could cache the rootfs separately, but yeah, this would point to us needing to do something smarter
14:22 mripard: airlied: yeah, sorry, I meant rc5
14:51 imirkin_: tpalli: dunno if you want to do an audit, but basically every driver should have PIPE_CAP_MIXED_COLOR_DEPTH_BITS enabled. nv30 is what i added that cap for, to try to avoid applications from picking obviously-bad things
14:52 tpalli: imirkin_ ok cool, I know we have some exceptions with old gens but those are not supported in iris anyway, was that r-b?
14:53 imirkin_: tpalli: sure, r-b (although you may want someone from intel to sign off?)
14:53 imirkin_: tpalli: i'm pretty sure even gen4 supports this. maybe i915 would be equally unhappy? dunno.
14:54 imirkin_: basically on nv30, the hw must have these things match. a competent driver would do fallbacks + blits to support the API, but even in such a case, why invite the badness :)
14:57 tpalli: imirkin_ right I think we have some issues with 565, there at least i965 has some gen specific things where 16bit depth is also expected
14:58 imirkin_: ha! that's surprising
14:58 imirkin_: but nice to know i'm not the only one who suffers
14:58 tpalli: :)
14:59 imirkin_: on nv30, the solution to that problem is "don't bind depth buffer"
14:59 imirkin_: which isn't ... perfect
14:59 imirkin_: better than gpu hangs, but worse than correct rendering
15:02 tpalli: right, 565 is sort of rare but there still seems at least android games that want it
15:02 imirkin_: well - a lot more common in GeForce FX times
15:02 imirkin_: (which is the nv30 generation)
15:24 hakzsam: MrCooper: btw, I can't debug this locally because I don't have access to any s390x arch
15:24 MrCooper: that's irrelevant, all of this runs on x86
15:24 imirkin_: few people do :)
15:25 imirkin_: can't just go to best buy to pick up one of those...
15:25 MrCooper: code for cross-built architectures is executed by qemu
15:25 MrCooper: and it's qemu which crashes
15:25 hakzsam: wow, will be a pain to debug :P
15:27 MrCooper: yeah, but you can certainly try other workarounds
15:27 hakzsam: before installing more i386 packages it worked, that's a start
17:40 pinchartl: sravn: thanks
17:56 hakzsam: https://gitlab.freedesktop.org/hakzsam/mesa/-/jobs/1891806 --> no space left on device
18:02 daniels: i do wonder why the disk is filling so quickly :\
18:02 daniels: obviously it's not fully properly fixed to have full capacity and a real storage runner yet
18:02 daniels: but it's filling >400GB in a few hours
18:02 daniels: which is surprising
18:02 imirkin_: wow
18:03 MrCooper: baobab/du/... to the rescue?
18:04 hakzsam: that's huge
18:06 linkmauve: ncdu is nice, when you’re on a server without GTK+ installed.
18:08 daniels: ah yes, it's because that one host is pulling all the ARM builds, as well as doing x86 builds + tests, and has also recently put through some Android builds
18:08 daniels: makes sense
18:08 daniels: hakzsam: they have 2.8TB in total but the RAID is not yet provisioned and enabled
18:08 hakzsam: ok
18:09 daniels: anyway, cleared out for now
18:09 daniels: RAID will be provisioned as soon as we can
18:13 anarsoul|2: austriancoder: jekstrand: any opinion on moving https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4126/diffs?commit_id=3e21b2554b86a7039f7b55914fca0b744444208f from !4126 into nir_opt_algebraic?
18:16 anarsoul|2: also kusma ^^
18:16 jekstrand: anarsoul|2: Is that actually cheaper?
18:16 anarsoul|2: jekstrand: yes for lima
18:17 jekstrand: The one in nir_opt_algebraic is only about 5 instructions
18:17 jekstrand: None of which are multiplication
18:17 jekstrand: Unless sel is stupid expensive
18:18 anarsoul|2: sel takes 2 slots
18:18 anarsoul|2: on Utgard GP
18:19 jekstrand: Given that etnaviv and lima are the only drivers setting lower_ftrunc, I don't personally care about the perf trade-offs.
18:19 jekstrand: I'm fine with it being in nir_opt_algebraic
18:19 jekstrand: It's also fine for it to be driver-specific
18:19 anarsoul|2: etnaviv probably won't like it since they need to lower fsign
18:20 jekstrand: Sure, but you can do 'options->lower_ftrunc && options->lower_fsign'
18:20 anarsoul|2: right
18:20 anarsoul|2: good idea
18:20 anarsoul|2: let's do that
18:21 * jekstrand is still skeptical that it's actually faster
18:21 * jekstrand also doesn't care
18:21 anarsoul|2: jekstrand: we get less instructions with this lowering on lima
18:21 jekstrand: Ok
18:21 jekstrand: I guess that is the metric
18:23 anarsoul|2: jekstrand: sel takes both mul slots on GP, so we end up with one less op available and it makes scheduler job harder
18:24 jekstrand: Yeah, that makes sense, I suppose
18:25 jekstrand: anarsoul|2: Why does the new lowering use ffloor(fneg(x)) rather than ffloor(fabs(x))?
18:25 anarsoul|2: jekstrand: because we don't have fabs :)
18:26 anarsoul|2: it's lowered to fmax(a, -a)
18:26 anarsoul|2: while fneg is essentially free, it's just a modifier on an operand
18:26 kusma: anarsoul|2: if it's not universally faster, I think it's fine to have it in the driver.
18:27 kusma: anarsoul|2: I just wondered if there was another reason
18:27 jekstrand: anarsoul|2: Ah...
18:30 anarsoul|2: guess I should move fabs lowering to nir_opt_algebraic as well. It makes no sense to keep it in backend
18:34 flto: anarsoul|2: for etnaviv the lowering in nir_opt_algebraic optimizes to 2 instructions (floor + sel, since sel can do compare with 0 for free, and abs/neg are free modifiers), yours would be 3 instructions
18:35 karolherbst: fabs as 3 instructions?
18:36 anarsoul|2: flto: I see. So I guess I'm keeping it in lima.
18:37 anarsoul|2: karolherbst: 1 for lima, but with 2 operands and taking one of acc slots.
18:37 karolherbst: uhh
18:38 karolherbst: sounds weird enough
18:38 anarsoul|2: basically ffloor(fneg(x)) is 1 op (not to be confused with instruction)
18:38 karolherbst: yeah right.. but source modifiers are quite common and this kind of looks like it
18:38 anarsoul|2: ffloor(fabs(x)) is 2 ops. Could be 2 instructions if there's no 2 free acc slots in current instruction
18:38 karolherbst: ohhh
18:39 karolherbst: I see
18:39 anarsoul|2: https://gitlab.freedesktop.org/anarsoul/mali-isa-docs/-/blob/master/Utgard-GP.md if you're curious about vertex shader ISA on Utgard
18:39 karolherbst: no clue why fabs would be more expensive than fneg.. but okay
18:40 anarsoul|2: karolherbst: that's easy, likely no bits were left for fabs modifier
18:40 karolherbst: maybe
18:40 flto: anarsoul|2: lower_ftrunc was added specifically for etnaviv.. if there was a 3rd user of lower_ftrunc and yours was better for that hardware I think it would make sense to change nir_opt_algebraic
18:41 anarsoul|2: instruction is 128 bits and every bit is used
18:41 karolherbst: you have very long instructions
18:42 karolherbst: but I guess being a vectorized ISA means you need more bits for general stuff
18:42 anarsoul|2: flto: I'm not sure how to add it into nir_opt_algebraic to avoid regression on etnaviv.
18:42 karolherbst: probably
18:42 anarsoul|2: karolherbst: it's scalar ISA
18:43 karolherbst: huh
18:43 flto: anarsoul|2: I'm saying if it was more than just lima that wants this change, then etnaviv could have its own custom lower_ftrunc instead
18:43 anarsoul|2: flto: ah, right.
18:44 karolherbst: anarsoul|2: I got confused by the xyzw values, but I guess an instruction can only operate on one channel at the time?
18:44 anarsoul|2: karolherbst: operation in instruction can operate on one channel, that's correct
18:44 karolherbst: okay
19:02 melissawen: Hi, my name is Melissa. I am a master's student at the University of São Paulo (Brazil), and I am interested in participating in gsoc this year. So, I am currently trying to understand a little more about the drm and work with some igt tests and I recently sent some simple patches to the drm.
19:02 melissawen: siqueira, I saw a project proposal for vkms that also includes working with igt, and I would like to apply for it :)
19:04 melissawen: I would like to discuss a little more to understand well the proposal and also have some guidance
19:09 Lyude: mupuf: ^ maybe you can help point them in the right direction?
19:11 Venemo: daniels: I'm having issues with the panfrost ci again. I retried it already and it failed again.
19:11 Venemo: daniels: https://gitlab.freedesktop.org/Venemo/mesa/-/jobs/1892716
19:32 siqueira: melissawen Hi! First of all, thanks for your interest in VKMS project, I have seen some of your patches and emails in the mailing list. If you take a look at https://www.x.org/wiki/SummerOfCodeIdeas/ you will see that I added some basic info and also some directions.
19:33 siqueira: From your emails, I know that you are trying to reproduce a bug and also fix the alpha issue. In my opinion, you are going in the right direction, and if you need any help during your work feel free to ask me here or via email.
19:34 siqueira: Btw, if you have time, could you change the behavior of `enable_cursor` so that the cursor is always enabled and only gets disabled if we use `disable_cursor`? These days, I think there is no reason to keep the cursor disabled by default on VKMS.
19:55 daniels: Venemo: I think this is a general gitlab-runner issue which can affect all jobs, I'm trying to understand what it is but I may not be able to fix it for you tonight
19:56 pepp: Venemo: you should rebase your MR on master. That way the panfrost CI wouldn't run for your MR
19:56 Venemo: oh, I see. thanks pepp
19:56 Venemo: will try
19:57 daniels: that would also help tbf
20:36 airlied: mripard: sorry, fell off my list yesterday, will backmerge today
20:39 melissawen: siqueira, thanks! Yes, I can work on this issue about cursor. I think it would also help my tasks as I am working on some igt tests related to it.
20:41 siqueira: melissawen thanks, don't forget to CC me
20:41 melissawen: ok!
20:42 Venemo: pepp, daniels I now rebased on latest master and assigned it to marge and the ci failed again
20:47 pepp: Venemo: looks like https://gitlab.freedesktop.org/mesa/mesa/issues/2505
20:47 gitbot: Mesa issue 2505 in mesa "cache_test intermittently fails" [Ci, Glsl, Opened]
20:49 Venemo: omfg
21:20 mripard: airlied: thanks
21:31 imirkin_: Venemo: you're like the lightning rod for this stuff
21:41 jcdutton: Hi. Are there any open source opencl test suites? I have an opencl application that is not working right, and I'm hoping that I can get it to fail a particular standard opencl test, to help with debugging it.
21:42 robclark: jcdutton: https://github.com/KhronosGroup/OpenCL-CTS
21:44 agd5f: jcdutton, someone had started an opencl piglit as a GSOC project one year. Not sure it ever really took off though
21:45 karolherbst: jcdutton: mind sharing which CL application that is? I might look into it and fix it
21:45 karolherbst: and mind telling what driver?
21:45 karolherbst: But I'd assume r600
21:47 airlied: karolherbst: get any further with images?
21:47 karolherbst: not yet
21:48 karolherbst: I mean.. it should work, there is just some random stuff failing on a runtime level
21:49 jcdutton: karolherbst, the problem app is opencv
21:50 karolherbst: ehh, I see
21:50 karolherbst: jcdutton: any way to launch it from a default installation to make it crash?
21:51 jcdutton: karolherbst, I will try the KhronosGroup tests, if they all pass, then I will just need to debug my app. I think the problem might be trying to multithread calls to opencv
21:52 karolherbst: they won't pass all
21:52 airlied: or even close :-P
21:52 karolherbst: and you will get random fails so any report will be useless in regards to opencv
21:52 jcdutton: Is opencv just too buggy ?
21:52 karolherbst: no, mesa is
21:52 karolherbst: well, clover
21:53 jcdutton: karolherbst, I am using rocm with a vega 56.
21:53 karolherbst: ohh, then you are in the wrong channel to report this issue :p
21:53 karolherbst: we don't care about proprietary software here :p
21:53 karolherbst: well.. actually we do, like games, but I mean proprietary implementations
21:54 jcdutton: karolherbst, I know, the AMD binary blob inside ROCM is buggy as hell.
21:54 karolherbst: right, but not our problem :)
21:56 HdkR: Does this just mean that everyone's OpenCL implementation is buggy? Take your pick of which bugs you want to deal with? :D
21:57 karolherbst: HdkR: well, if mesa were involved when this bug occurs, I would kind of care
21:57 karolherbst: but as it seems it's not
21:58 karolherbst: but... it also never sounded like jcdutton was actually asking us to help out :) but yeah, running the OpenCL CTS _might_ help
21:58 karolherbst: but implementations are also supposed to pass the test
21:58 karolherbst: so....
21:58 HdkR: ah right. Rocm Sock'm opencl
21:58 karolherbst: the tests isn't great though
21:58 karolherbst: *aren't
21:58 karolherbst: and I know a lot of bits they are not testing
22:05 Venemo: imirkin_ :D
22:05 Venemo: imirkin: I had a good laugh at that comment :)
22:06 imirkin_: happy to help any way i can!
22:15 jcdutton: Can clover run opencl on a vega ?
22:18 ajax: i believe so. it works on polaris, at least, i don't imagine it fails to support vega too.