07:27 danvet: mlankhorst, last drm-misc-next?
07:27 danvet: for 5.9 I mean
07:29 airlied: oops forgot to apply last one
07:29 airlied:does so now
07:29 danvet: well it's -next, and it's only a few days old
07:38 airlied: actually it'll be tomorrow :-P
07:45 danvet: sravn, btw for small issues in patches from new or drive-through contributors, imo best to either just bikeshed yourself or apply as-is with the suggestion just in your reply for next time around
07:45 danvet: the idea is to foster this ownership feeling, and that works best if the new contributor's bikeshed color choice gets in rather than yours
07:45 danvet: within reason ofc
09:14 xexaxo1: Putti: I don't see a way to handle that in kernel w/o breaking the world, so I'd suggest sticking with an android userspace hack.
09:46 mlankhorst: danvet: ok might as well send one now
09:46 danvet: mlankhorst, not really a need imo, was just asking
09:47 danvet: so I know what to type into my patch applied messages :-)
09:47 mlankhorst: ah k
09:48 danvet: hm might be good actually since Lyude's vblank work series is now in there
09:48 mlankhorst: yeah figured as much
09:48 danvet: but then also no one who immediately needs that I think
09:50 mlankhorst: will it be pulled in from nouveau anyway?
09:51 mlankhorst: https://paste.debian.net/1157125/ shortlog from last pull
09:58 danvet: mlankhorst, I thought that was the plan, I guess not
09:58 danvet: both places would be bad
09:58 danvet: but not the worst thing we've done :-)
09:59 danvet: skeggsb, ^^ fyi make sure you don't have the crc/vblank work stuff
09:59 mlankhorst: or do. :P
12:54 tanty: tomeu: thanks for merging the MR to the traces-db repo and granting me developer rights! :)
13:03 mripard: anholt__: ack, enjoy your holidays :)
13:33 tomeu: tanty: thanks for the traces!
16:30 jekstrand: bnieuwenhuizen, hakzsam: I've got a MR ready for VK_EXT_float_atomics as well, including the SPIR-V bits, but the public SPIR-V headers haven't been updated yet.
16:31 hakzsam: jekstrand: VK_EXT_float_atomics? or VK_EXT_shader_atomic_float?
16:31 jekstrand: hakzsam: VK_EXT_shader_atomic_float
16:31 jekstrand: Sorry
16:32 jekstrand: hakzsam: I don't know what HW AMD has for it. You should be able to at least wire up the basics.
16:33 hakzsam: no native exchange atomic apparently
16:34 jekstrand: hakzsam: For atomic exchange, you should be able to use the integer version.
16:34 jekstrand: hakzsam: It also adds a float add atomic
16:34 jekstrand: My ANV patches only enable exchange
16:34 jekstrand: But I've got SPIR-V patches for fadd which I've tested at least generate valid NIR.
16:35 hakzsam: no float add atomic either :)
16:35 hakzsam: do you have a link to your branch?
16:35 jekstrand: Ok then. :)
16:35 pendingchaos: we have float atomic add for shared memory
16:35 pendingchaos: ds_add_f32/ds_add_rtn_f32
16:35 karolherbst: jekstrand: we support atomic fadd as well
16:36 hakzsam: pendingchaos: yeah, for shared
16:36 karolherbst: jekstrand: ohh btw.. I have lowering code for shared
16:37 hakzsam: jekstrand: is the khronos mesa branch up-to-date for atomics?
16:37 jekstrand: hakzsam: I just pushed it to khronos gitlab. I'll push it public once we have a header update. I just don't know what all else is in there.
16:38 jekstrand: hakzsam: It's in the wip/VK_EXT_shader_atomic_float branch
16:38 hakzsam: ok, thanks
16:38 jekstrand: Shared is a separate feature bit if you want to wire up just that.
16:39 hakzsam: yeah
16:39 jekstrand: Gerrig 5722 has the CTS tests
16:39 jekstrand: *Gerrit
16:39 hakzsam: will do :)
16:52 hakzsam: jekstrand: shouldn't you enable float32_atomic_add and float64_atomic_add in ANV?
16:53 hakzsam: nvm, you don't enable the VK feature
16:54 jekstrand: Yeah, current hardware can only do exchange
17:01 imirkin: dschuermann: any reason to be so restrictive with https://cgit.freedesktop.org/mesa/mesa/commit/?id=9d22c5ed718abcf98444f9654b912ccb42b2ccdd -- any alu op commutes with bcsel, no? why specifically with -1, 1 arguments?
17:01 imirkin: or is it just because neg is ~free, while alu ops aren't?
17:05 Kayden: hm, yeah, exchange would be the same for both integer and float. compare-exchange isn't, because 0x80000000 == 0 for the float version (among other reasons)
17:06 jekstrand: Yup
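A minimal sketch of the point Kayden makes above, assuming 32-bit IEEE 754 floats: an exchange only moves bits, so the integer version works for floats, but a compare step gives different answers depending on whether it compares bit patterns or float values. The helper names are hypothetical and only for illustration; NaN is shown as one of the "other reasons".

```python
import struct


def float_to_bits(value: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of value as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned int as an IEEE 754 binary32 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def integer_compare_matches(expected: float, stored: float) -> bool:
    """What an integer compare-exchange sees: raw bit equality."""
    return float_to_bits(expected) == float_to_bits(stored)


def float_compare_matches(expected: float, stored: float) -> bool:
    """What a float compare-exchange would see: IEEE value equality."""
    return expected == stored


# -0.0 has bit pattern 0x80000000 and +0.0 has 0x00000000:
# equal as floats, different as integers.
zero_case = (float_compare_matches(0.0, -0.0), integer_compare_matches(0.0, -0.0))

# NaN goes the other way: identical bits, but never equal as a float.
nan = float("nan")
nan_case = (float_compare_matches(nan, nan), integer_compare_matches(nan, nan))
```

Here `zero_case` and `nan_case` each hold the (float compare, integer compare) pair for the two cases where the results disagree.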
17:07 jekstrand: You could implement float atomics on top of integer compare/exchange with a loop but it's not ideal.
17:07 imirkin: can implement any atomics with a loop...
17:07 imirkin: unfortunately we have to fall back to that in nouveau, since not all variants are supported on all memory types
17:09 jekstrand: In shared, you can also implement them with a lock which may or may not be better.
17:09 jekstrand: Our CL driver does that for 64-bit atomics
17:10 jekstrand: Which you can't implement with a 32-bit exchange and a loop
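A rough sketch of the loop approach discussed above: a float add built on a 32-bit integer compare-exchange. This is a Python simulation, not Mesa or NIR code. `AtomicWord32` is a hypothetical stand-in for a memory word, and its lock only models the atomicity the hardware would provide for the integer compare-exchange.

```python
import struct
import threading


def float_to_bits(value: float) -> int:
    """Return the binary32 bit pattern of value as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned int as a binary32 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


class AtomicWord32:
    """Hypothetical 32-bit memory word with an integer compare-exchange."""

    def __init__(self, bits: int = 0) -> None:
        self._bits = bits
        self._lock = threading.Lock()  # models hardware atomicity only

    def load(self) -> int:
        with self._lock:
            return self._bits

    def compare_exchange(self, expected: int, desired: int) -> int:
        """Store desired if the word equals expected; return the old bits."""
        with self._lock:
            old = self._bits
            if old == expected:
                self._bits = desired
            return old


def atomic_fadd(word: AtomicWord32, operand: float) -> float:
    """Float add emulated with an integer compare-exchange loop.

    Returns the float value the word held before the add.
    """
    expected = word.load()
    while True:
        desired = float_to_bits(bits_to_float(expected) + operand)
        observed = word.compare_exchange(expected, desired)
        if observed == expected:
            return bits_to_float(observed)
        # Another thread changed the word; retry with the value we saw.
        expected = observed
```

The comparison inside the loop is on bit patterns, so the -0.0 versus 0.0 difference does not cause a spurious retry loop: the loop always retries with exactly the bits it last observed.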
17:11 karolherbst: jekstrand: we don't have anything in nir doing the loop stuff right now, do we? I could try to get it merged in a more generic way in case anybody would like to use it
17:12 karolherbst: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/e596528dee70efca42a883987ef25bfa61134602
17:13 karolherbst: should probably add a struct per memory so it can be fully configured?
17:15 karolherbst: jekstrand: do you need lowering for cerating atomic ops?
17:15 karolherbst: s/cerating/certain/
17:16 karolherbst: mhh.. or should the interface rather be var mode + ops selection?
17:16 karolherbst: but it also differs between 32 and 64 bit :/
17:16 karolherbst: annoying
17:16 karolherbst: 3 dimensional matrix
17:20 jekstrand: karolherbst: Callback?
17:21 imirkin: just do 11 dimensions. that should cover it.
17:21 jekstrand: karolherbst: I have no problem if someone wants to merge generic NIR code for it.
17:21 jekstrand: The generic code isn't that hard. It's mostly a matter of figuring out how to expose what things to lower.
17:21 karolherbst: yeah...
17:22 karolherbst: callback with bit_size, nir_alu_op + mem_type vs atomic_op
17:22 karolherbst: and return value would be the lowering kind
17:22 karolherbst: cas loop vs locked op or something?
17:23 jekstrand: For now, I'd assume loop unless you really want a locking implementation.
17:23 jekstrand: And I'd just make the callback take the nir_intrinsic_instr
17:23 karolherbst: yeah.. I'd only implement the loop, but want to consider something just in case that won't work
17:23 karolherbst: okay
17:24 jekstrand: the other thing you likely want to do is to also have a lowering pass which tries to use subgroup ops to accelerate it whenever possible.
17:24 karolherbst: I was wondering if it would be easier to have a pair in case somebody doesn't support anything besides 64 bit cas or so
17:24 karolherbst: but yeah..
17:25 jekstrand: For instance, any time you do an add with a constant, you want to use subgroup ops to only do that loop in one lane and use subgroup masks and normal arithmetic to get the other values.
17:25 karolherbst: mhhh... good idea actually
17:25 karolherbst: so we already would have two variants: cas and cas+subgroup
17:25 jekstrand: That pass might be useful even on hardware that has the atomics
17:26 karolherbst: right
17:26 karolherbst: I see
17:26 jekstrand: I'd make it separate passes. One which does the subgroup ops stuff (maybe using the divergence analysis pass from dschuermann instead of looking only for constants) and a separate one which lowers.
17:26 karolherbst: okay, sounds like a plan
17:27 karolherbst: will do the lowering first as this is something we already need today anyway.. we just have our own lowering, so it's easy to verify if I mess it up or not :)
17:28 jekstrand: If the back-end supports the full subgroup arithmetic ops, I think you can do a single lane even in the divergent non-constant case and even for crazy things like umax.
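The constant-add case jekstrand describes can be sketched as a small simulation, assuming a subgroup is modelled as a list of active-lane flags and memory as a one-element list. Both function names are hypothetical. One lane performs a single add of constant times the number of active lanes, and every active lane derives its own return value from the base and the count of active lanes below it.

```python
def per_lane_atomic_adds(memory, active, constant):
    """Reference behaviour: each active lane does its own atomic add.

    Lanes are processed in lane order here; real hardware may use any order.
    Returns the pre-add value seen by each lane (None for inactive lanes).
    """
    results = [None] * len(active)
    for lane, is_active in enumerate(active):
        if is_active:
            results[lane] = memory[0]
            memory[0] += constant
    return results


def subgroup_atomic_add(memory, active, constant):
    """One atomic for the whole subgroup, plus per-lane arithmetic."""
    results = [None] * len(active)
    active_count = sum(active)  # what a ballot plus bit count would give
    if active_count == 0:
        return results

    # The single atomic, done by one lane: add the combined amount
    # and keep the old value as the base for everyone.
    base = memory[0]
    memory[0] += constant * active_count

    # Each lane counts the active lanes below it (an exclusive prefix
    # count over the ballot mask) and reconstructs its own result.
    lanes_below = 0
    for lane, is_active in enumerate(active):
        if is_active:
            results[lane] = base + constant * lanes_below
            lanes_below += 1
    return results
```

With `active = [True, False, True, True]`, `constant = 3` and a starting value of 10, both functions leave 19 in memory and return `[10, None, 13, 16]`, but the second one touches memory atomically only once.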
17:47 dschuermann: jekstrand: if we have a way to keep NIR in lcssa, we might also be able to make divergence a metadata
17:47 dschuermann: imirkin: what exactly do you mean? that optimization removes the fmul...
17:48 imirkin: dschuermann: yeah, i picked up on that later ...
18:04 jekstrand: hakzsam: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992
18:11 austriancoder: anholt__: maybe it is ready now? !5661
18:12 austriancoder:is not the best tech doc writer
18:14 anholt__: austriancoder: r-b applied. thanks!
18:16 austriancoder: anholt__: thanks
18:29 tanty: tomeu: could you take a look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5890 ?
19:17 mareko: robclark: radeonsi hangs with int16 (infinite loop), so I can't do any big testing
19:18 robclark: mareko: ok.. well there were some big problems about lower_precision trying to inline no-op intrinsic bodies, and things like that.. I've got fixes for some of that (some more hacky than others, I'm still working through the rest of the deqp problems)
19:50 TheRealJohnGalt: Is it okay to request that old issues I didn't create, which are the same as newer (now fixed) issues, be closed?
19:56 EdB: hehehe, i'm starting a printf implementation for clover "............p.l.o.p.........k................�........P.j.�......" <-- I'm already happy with that :p
19:57 airlied: EdB: nice
19:57 jenatali: EdB: Did you see https://gitlab.freedesktop.org/kusma/mesa/-/merge_requests/154 ?
19:57 jenatali: Feel free to pick whatever patches would make sense for you
19:58 EdB: nop, didn't see that
20:02 EdB: jenatali: mine is not using nir, cl compiled by llvm for amd target
20:04 robclark: hmm, we don't have an fround_az (away from zero) in nir?
20:07 imirkin: i don't think i've seen that in hw either
20:07 robclark: that seems to be what rndaz.f does..
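For reference, one possible reading of "away from zero" is round-to-nearest with ties going away from zero, as opposed to the ties-to-even rule that Python's built-in `round` uses. The sketch below shows only that reading. It does not claim to describe what `rndaz.f` does in hardware, and the function name is hypothetical.

```python
import math


def round_half_away_from_zero(value: float) -> float:
    """Round to the nearest integer, with exact halves going away from zero."""
    if math.isnan(value) or math.isinf(value):
        return value
    magnitude = abs(value)
    whole = math.floor(magnitude)
    # The fractional part is computed exactly, so values just below .5
    # are not pushed up by an intermediate rounding step.
    if magnitude - whole >= 0.5:
        whole += 1
    return math.copysign(float(whole), value)


# Ties differ from Python's round(), which rounds halves to even.
ties_away = [round_half_away_from_zero(x) for x in (0.5, 1.5, 2.5, -2.5)]
ties_even = [float(round(x)) for x in (0.5, 1.5, 2.5, -2.5)]
```

Comparing `ties_away` with `ties_even` shows where the two rules disagree: only on exact halves.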
20:08 airlied: EdB: I do intend to switch mesa clover on amdgpu to use nir at some point
20:08 imirkin: airlied: what's the path there? llvm -> spirv -> nir?
20:10 airlied: imirkin: yeah pretty much
20:11 airlied: then you could in theory once you typed a lot use aco as the backend
20:16 EdB: for amd target, llvm gives me a list of patterns used in the program, and what is to be printed is stored in a global buffer with an id matching the desired pattern
20:17 EdB: a pattern looks like !{!"1:1:8:%s\\n"}
20:18 karolherbst: oh, interesting
20:18 EdB: well it's 1:1:8:%s\n
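A small sketch of how such a pattern string could be split. The field layout is an assumption, since the log only shows the one example: the fields are read as id, argument count, one byte size per argument, and then the format string. `parse_printf_pattern` is a hypothetical helper, not part of clover or LLVM.

```python
def parse_printf_pattern(pattern: str):
    """Split 'id:nargs:size0:...:format' into (id, [sizes], format).

    The layout is assumed from the single example '1:1:8:%s\\n'.
    The format string itself may contain ':' characters, so only the
    leading fields are split off.
    """
    ident, arg_count, rest = pattern.split(":", 2)
    count = int(arg_count)
    # Split off exactly `count` size fields; whatever remains is the format.
    parts = rest.split(":", count)
    sizes = [int(size) for size in parts[:count]]
    fmt = parts[count]
    return int(ident), sizes, fmt


# The pattern quoted above: id 1, one argument of 8 bytes, format "%s\n".
example = parse_printf_pattern("1:1:8:%s\n")
```

An 8-byte size for `%s` would be consistent with the argument being passed as a 64-bit pointer, though the log does not say so.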
20:21 sravn: danvet: I guess the comment was triggered on the response to the flink ioctl. Will fix when I commit next time. My lame excuse was that my drm-misc tree is occupied and dim does not support two drm-misc trees :-(
20:22 danvet: sravn, yeah it's not the most awesome workflow :-/
20:23 danvet: sravn, I think it was actually the dev_* -> drm_* debug output replacement one
20:23 danvet: but yeah that one was broken in v1
20:23 karolherbst: EdB: feel free to reach out whenever you have it properly wired up in clover, I might want to test it with the spir-v path as well to check how well that would work out
20:24 danvet: for these simple patches of people I don't recognize yet as regulars I just squeeze my eyes and then merge
20:24 danvet: honey trap style :-)
20:27 EdB: karolherbst: It's really dirty for the moment :) But I will let you know
20:34 sravn: danvet: for the logging it would be nice to have all logging converted and I already applied some patches. But if this scares them away then better apply a few more patches.
20:35 sravn: I just cannot see us coping with ~100 individually submitted patches just to convert drm/*.c to new style logging..
20:36 sravn: There is already too much on the backlog. I have enough at the moment on top of trying to finish my own patches...
20:37 danvet: sravn, yeah that's why I try to trick people in, throw commit rights at them
20:37 danvet: and then let them fend for themselves as much as possible :-)
20:37 danvet: otherwise you just sink
20:53 mattst88: bbrezillon: nice work! :D
22:09 anholt__: huh, weird virgl fail on mesa/mesa https://gitlab.freedesktop.org/mesa/mesa/-/jobs/3704030
22:09 anholt__: restarting the job went fine
22:09 jekstrand: daniels, anholt__, robclark: What is this arm64_test CI job? It has a 2 hour timeout!!!!!!
22:10 jekstrand: It's been sitting here running for 45 minutes on a RADV MR
22:10 anholt__: https://lists.freedesktop.org/archives/mesa-dev/2020-July/224523.html
22:10 anholt__: I got one of the fixes done today
22:12 airlied: anholt__: can't say I've seen that in any of my test runs
22:13 jekstrand: anholt__: Is it just re-building a container?
22:13 anholt__: airlied: long arm_test container build? you'll only see it if you win the race between someone merging an MR bumping the container and mesa/mesa finishing the rebuild
22:13 anholt__: jekstrand: yes
22:13 jekstrand: Ok, that makes sense. Still, a 45 minute build seems insane.
22:14 jekstrand: I guess it's building a whole kernel....
22:14 airlied: anholt__: no the virgl crash
22:14 anholt__: deqp, kernel, renderdoc, on 8 cores.
22:14 anholt__: and apitrace
22:14 anholt__: and some misc
22:14 anholt__: airlied: ah, sorry
22:14 airlied: and I'm triggering a lot of virgl CI via llvmpipe
22:15 anholt__: airlied: it's been fun to watch both the virgl and llvmpipe test fixes rolling in from you
22:17 airlied:is getting down to very pointy end, line and point rendering failures
22:18 anholt__:remembers trying to debug those on vc4. noticed recently deqp had changes from broadcom to the reference renderer for gles2-only contexts.
22:18 imirkin: btw, what's the resolution to the clipping discussion?
22:18 imirkin: wrt point (and line) clipping
22:18 imirkin: is the behavior just expected to be different between GL and GLES?
22:19 jekstrand: imirkin: Yup
22:19 jekstrand: imirkin: Sadly
22:19 imirkin: or was there some sort of resolution
22:19 imirkin: wow
22:19 jekstrand: imirkin: Which is to say that GLES requires pop-free clipping and GL I think doesn't specify.
22:19 jekstrand: And certain companies want to keep it that way. :-(
22:19 imirkin: oh - GL doesn't specify?
22:19 jekstrand: I think so
22:19 imirkin: i thought it did specify (just the other way)
22:19 jekstrand: Or maybe it specs popping
22:19 imirkin: at least that's nice
22:19 jekstrand: But GLES is definitely pop-free.
22:20 imirkin: right
22:20 jekstrand: I think, anyway
22:20 jekstrand: I could have it backwards
22:20 imirkin: GLES is the one where you clip it as if it's a real primitive
22:20 jekstrand: In Vulkan, you query from the implementation which it does
22:20 airlied:I expect gets to find out next
22:20 jekstrand: And I think there's an extension which lets you ask for one
22:20 imirkin: whereas GL, i thought, is the one where it's either all-in or all-out
22:20 airlied: though annoying if I have to make GL and GLES behave differently
22:20 imirkin: but could be that GL is just unspecified.
22:20 anholt__: austriancoder: uh oh https://gitlab.freedesktop.org/anholt/mesa/-/jobs/3705927
22:23 airlied: is "If the primitive under consideration is a point, then clipping passes it unchanged if it lies within the clip volume; otherwise, it is discarded."
22:23 airlied: the spec line I'd be looking for?
22:23 jekstrand: airlied: Uh... maybe?
22:24 jekstrand:doesn't want to go digging through specs for point clipping behavior today
22:25 imirkin: airlied: sounds like you found it. hopefully in the GL spec?
22:25 imirkin: airlied: same applies to lines too btw - although obviously there's clipping going on there
22:26 imirkin: like - primitive clipping
22:26 airlied: https://github.com/KhronosGroup/WebGL/issues/2888
22:26 gitbot: KhronosGroup issue 2888 in WebGL "POINTS clipping not working correctly on several drivers" [Open]
22:26 airlied: has a pretty good summary
22:26 airlied: imirkin: yeah that's the GL spec
22:26 airlied: "The bottom line is that OpenGL ES has pop-free point clipping while in OpenGL they pop in and out."
22:26 imirkin: ok, so the behaviors are just plain different. yay!
22:26 imirkin: i don't think the gallium rast settings account for this
22:27 airlied: yeah not sure how we'd do it, have a pop my points bit?
22:27 imirkin: clip_wide_points? something
22:27 jekstrand: and caps for what the HW supports
22:28 airlied: I doubt we can "lower" the behaviour though
22:28 imirkin: can the st work around it by messing with clipping? probably
22:28 airlied: or would want to
22:28 airlied: oh I suppose you could extend the clipping, but that would affect other prims
22:28 imirkin: oh, but only if it's viewport
22:28 imirkin: what if the clipping is to an area inside the viewport
22:29 imirkin: you could use scissors, but there's a limited number of those
22:29 imirkin: on nvidia hw, there's the concept of a "window scissor" which is different than a viewport scissor
22:29 airlied: gallium drivers I assume can't know if they are GLES or GL states
22:29 imirkin: "can't" is strong, but definitely don't
22:29 imirkin: at least not directly
22:29 airlied: I suppose we at least should signal to the driver which clip behaviour is required
22:29 jekstrand: Make it a dynamic state. :P
22:30 airlied: even if drivers ignore it
22:30 imirkin: yeah, there are a few of those bits already
22:30 imirkin: like the bottom something rule
22:30 imirkin: bottom edge rule?
22:30 airlied: I wonder what d3d does
22:30 imirkin: and a few others
22:30 imirkin: we've (i believe with good reason) resisted adding a raw "API" field anywhere
22:30 imirkin: so i think another rast bit makes sense
22:31 airlied: "If the point, taking into account the point size, is completely outside the viewport in X and Y, then the point is not rendered; "
22:31 airlied: seems d3d is not-pop
22:31 airlied: "It is possible for the point position to be outside the viewport in X or Y and still be partially visible."
22:31 airlied: at least in d3d9
22:31 imirkin: check d3d10 - that had a lot of departures from d3d9
22:31 imirkin: i think d3d11+ are pretty similar to d3d10 in those regards
22:34 airlied: not finding d3d11 spec for it
22:36 airlied: imirkin: is point_tri_clip in rast state already the bit we need?
22:37 imirkin: airlied: could be. does it get set to anything?
22:39 jenatali: airlied: What exactly are you looking for in the D3D11 spec?
22:39 jekstrand: jenatali: line/point clipping behavior
22:40 imirkin: (specifically of *wide* lines/points)
22:40 jenatali: D3D11 doesn't have wide lines/points
22:40 airlied: imirkin: never set
22:40 imirkin: airlied: i like it!
22:40 airlied: probably the sanest answer :-P
22:40 imirkin: jenatali: wow - no point sprites at all?
22:40 jenatali: Nope
22:40 airlied: imirkin: I assume the vmware one d3d9 layer does it
22:41 imirkin: jenatali: what about d3d10?
22:41 jenatali: imirkin: No, D3D10 doesn't have it either
22:41 jenatali: Full rasterization rules for D3D11 are here: https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#3.4%20Rasterization%20Rules
22:41 imirkin: gah! gallium.rtfd.org seems to have managed to become the mesa rtfd
22:42 robclark: too much docs to read?
22:42 imirkin: robclark: took me a while to find where the gallium ones were :)
22:42 imirkin: got it now
22:42 HdkR: Nobody needs wide lines when you can just emulate it in shaders ;)
22:42 imirkin: airlied: so point_tri_clip is documented as "Determines if clipping of points should happen after they are converted to “rectangles” (required by d3d) or before (required by OpenGL, though this rule is ignored by some IHVs)."
22:43 airlied: yeah which sounds like gles is d3d9 behaviour
22:43 imirkin: so sounds like gles + d3d = true, and gl = false
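The two behaviours being compared can be sketched as follows, under simplifying assumptions: the clip volume is reduced to a 2D rectangle and a wide point to an axis-aligned square. The function names and the API strings are hypothetical. The API mapping just restates the tentative conclusion above (gles + d3d = true, gl = false) and is not taken from Mesa code.

```python
def point_is_drawn(center, size, clip_min, clip_max, clip_as_rectangle):
    """Decide whether any part of a wide point survives clipping.

    clip_as_rectangle=False: the point is clipped as a point, so it is
    kept or discarded based on its centre alone (points "pop" in and out).
    clip_as_rectangle=True: the point is expanded to a square first and
    kept if that square overlaps the clip rectangle (pop-free).
    """
    cx, cy = center
    if not clip_as_rectangle:
        return clip_min[0] <= cx <= clip_max[0] and clip_min[1] <= cy <= clip_max[1]
    half = size / 2.0
    return (
        cx + half > clip_min[0]
        and cx - half < clip_max[0]
        and cy + half > clip_min[1]
        and cy - half < clip_max[1]
    )


def point_tri_clip_for_api(api: str) -> bool:
    """Tentative mapping from the discussion: gles and d3d true, gl false."""
    return {"gl": False, "gles": True, "d3d": True}[api]


# A 10-pixel point whose centre sits 2 pixels left of a 100x100 viewport.
center, size = (-2.0, 50.0), 10.0
gl_drawn = point_is_drawn(center, size, (0, 0), (100, 100), point_tri_clip_for_api("gl"))
gles_drawn = point_is_drawn(center, size, (0, 0), (100, 100), point_tri_clip_for_api("gles"))
```

In this example the centre is outside the viewport but 3 pixels of the square still overlap it, so `gl_drawn` and `gles_drawn` differ.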
23:28 airlied: imirkin: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6003
23:30 eric_engestrom: imirkin, robclark: the gallium docs have been merged into the mesa website: https://docs.mesa3d.org/gallium/
23:31 eric_engestrom: looks like that process affected the old gallium docs website as well
23:31 eric_engestrom: we should probably replace it with a redirect to the new docs location?
23:32 eric_engestrom: kusma: do you know who managed the old gallium docs website?
23:33 imirkin: <---
23:35 eric_engestrom: oh, imirkin you do?
23:35 imirkin: yes. and i think i made robclark an admin too? i forget.
23:35 imirkin: i set it up ages ago
23:35 eric_engestrom: then yeah, what I said: could you redirect it to docs.mesa? :)
23:36 imirkin: not sure that's an option
23:36 imirkin: but i'll have a look
23:36 eric_engestrom: otherwise, it's not that bad, nothing's lost it's just drowned a bit ^^