IRC Logs of #dri-devel on irc.freenode.net for 2024-01-08

07:12 mareko: the limitation of vulkan is that DMABUF requires modifiers
07:15 mareko: X.Org won't give you modifiers with the DDX that supports FreeSync and TearFree
07:24 Company: isn't that a limitation of X.Org more than Vulkan?
07:49 mareko: it's only a limitation of the DDX
07:49 mareko: some other DDXs support modifiers
08:38 MrCooper: Company: "a few 100k of vram" with at least 2 surface buffers, that's a small window :)
08:39 Company: well, that's managed by libvulkan - I was only thinking about VkDeviceMemory I handle myself
08:41 Company: I also have no idea how much memory the driver allocates for the command buffers
08:41 Company: I only know how many images I allocate for icons, the vertex buffer and things like that
08:43 Company: but in general I want to be able to delete everything on the GPU and recreate it from scratch
08:43 Company: because then I can deal with GPU resets
08:43 Company: but then I can also delete everything if the app has been unused in the background for long enough
08:44 MrCooper: makes sense, there's now a suspended state Wayland protocol for the latter
08:46 Company: right, it might make sense to listen to that, too
09:05 sima: airlied, imre drm-tip is fixed
09:31 MrCooper: agd5f: not sure it's really a good idea to ask Markus Elfring to re-send a patch series from 2016 :/
09:36 ccr: :D
09:53 CounterPillow: rofl
09:53 CounterPillow: Too bad you can only ban him from the mailing list and not from individual developers' inboxes
09:54 Company: amdgpu: The CS has been rejected (-22).
09:55 Company: okay, now I broke it
10:10 karolherbst: jenatali, airlied: I'm looking into the llvm spirv backend now, because llvm-17 forcing opaque pointers breaks most of the compiler tests as function forward declarations with the translator are just broken :(
10:11 karolherbst: and I want to know if it's better with that anyway
12:35 alyssa: good luck ^^
13:10 jenatali: karolherbst: cool, let me know how it goes
13:11 karolherbst: I tried to figure out why my translator didn't compile for an hour just to notice that my ~/local prefix had stale files in it :')
13:16 karolherbst: forgot to build the spirv target :')
13:16 jenatali: Sounds like me in December, realizing that I've been installing libclc to the wrong drive for literal years which was causing some math failures
13:17 karolherbst: ... pain
13:17 jenatali: But only in my local dev builds at least
13:21 karolherbst: now I have to figure out how to get the spirv...
13:27 alyssa: oof
13:44 karolherbst: I should check what llvmpipe is doing, not what radeonsi is doing :D
14:55 imre: sima, thx
14:58 DavidHeidelberg: hakzsam: do you have any idea what could be causing the breakage in your fixes? https://gitlab.freedesktop.org/mesa/mesa/-/jobs/53463012
14:59 hakzsam: DavidHeidelberg: what fixes?
15:00 DavidHeidelberg: oh, it's not yours, but Konstantin's https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26812#note_2229907
15:01 hakzsam: hmm, interesting
15:14 pinchartl: aarggghhhhh who had the stupid idea of naming a cryptocurrency XDC ? they've hijacked search engines
15:14 javierm: :(
15:16 karolherbst: jenatali: I managed to do it :3 though it's cursed
15:17 sima: what does it even stand for? or did they just pick 3 letters for which domains were available?
15:17 jenatali: karolherbst: great
15:18 karolherbst: though I hate everything about it
15:18 karolherbst: why is the LLVM API such a mess
15:29 MrCooper: mupuf: the UMDs in AMD driver releases use libdrm_amdgpu, it's basically the user-space abstraction layer for amdgpu
15:30 MrCooper: simply merging all of libdrm into Mesa seems feasible though
15:30 * alyssa would be in favour of that
15:32 karolherbst: do we have any users of libdrm outside of mesa still?
15:32 karolherbst: (except DDX never seeing updates)
15:33 MrCooper: see 3 lines above your question
15:33 pcercuei: I have a "kmsgrab" program that uses libdrm :)
15:37 karolherbst: jenatali: the llvm stuff crashes :')
15:37 karolherbst: test_compiler: /home/kherbst/git/llvm-project/llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.h:153: llvm::Register llvm::SPIRV::ModuleAnalysisInfo::getFuncReg(const llvm::Function*): Assertion `FuncPtrRegPair != FuncMap.end() && "Cannot find function ID"' failed.
15:38 karolherbst: that's when forward declaring functions ...
15:38 karolherbst: so it's not fixing the issue I hoped it would fix :D
15:38 jenatali: Oof
15:38 karolherbst: well.. it's llvm-17
15:39 karolherbst: I shall try llvm-18, llvm-main and then complain to them they should fix it and see them falling into eternal despair as opaque pointers make it a pain to support
15:40 karolherbst: yo....
15:40 karolherbst: it's broken
15:40 karolherbst: Source and destination types of SpvOpStore do not match: uint8_t (%8) vs. uint64_t (%9)
15:41 karolherbst: just trying to run math_brute_force
15:41 karolherbst: the translator is already a nightmare in terms of producing valid spir-v
15:41 karolherbst: it seems that llvm target is worse
15:47 karolherbst: pain... fedora LLVM doesn't build the spirv backend :')
15:49 alyssa: D:
15:50 karolherbst: let's see if patching the spec file is trivial here :D
15:52 jenatali: Guess I'm just going to be stuck on LLVM 16 or older forever. The beauty of statically linking is I get to pick my version and don't have to care about what else is on the system
15:52 karolherbst: :D
15:53 karolherbst: well.. requiring the SPIRV target is easy to do and it seems like you have to do 0 changes to the packaging
15:54 mupuf: MrCooper: hmm, that's indeed a good point!
16:02 karolherbst: distributions will hate me when I say: <llvm-17 :D
16:04 alyssa: is that an option
16:04 karolherbst: jenatali: somebody on the spirv-llvm-translator bug tracker suggested that we should just convert the spirvs to llvm and link on the llvm level
16:05 karolherbst: seriously... writing an OpenCL C to SPIR-V compiler becomes a better idea every llvm release
16:09 jenatali: Ugh. I guess we could do SPIR linking instead of SPIR-V linking but oof that sounds like a terrible idea for app-provided SPIR-V
16:09 karolherbst: well
16:10 karolherbst: on an API level you can't link SPIR-Vs I think
16:10 karolherbst: or well...
16:10 karolherbst: it's not defined I think
16:10 karolherbst: but yeah..
16:11 karolherbst: I think technically people could use it for this
16:11 alyssa: OpenCL C to NIR compiler :frog:
16:11 karolherbst: don't tempt me
16:12 alyssa: ~~how hard can it be~~ Delet
16:12 karolherbst: I mean... at least the problem of figuring out the function prototype is simpler in a spirv llvm target as in theory you can kinda fetch the information from the clang context or something?
16:12 karolherbst: I dunno
16:12 karolherbst: I should talk with them about it...
16:13 karolherbst: file a bug and complain that it totally can't compile opencl c with external functions
16:13 karolherbst: but maybe it works better on main...
16:15 daniels: karolherbst, MrCooper: every single KMS user is a consumer of libdrm
16:16 karolherbst: jenatali: I think I'm just disappointed that they don't really care about validating their spirvs....
16:16 karolherbst: it was already an issue with the translator
16:16 karolherbst: but uhhh.. *sigh*
16:16 karolherbst: even the one in clinfo is invalid :')
16:18 karolherbst: `%__spirv_BuiltInGlobalInvocationId = OpVariable %_ptr_Input_uchar Input` :D
16:18 karolherbst: pain
16:18 karolherbst: pain
16:18 karolherbst: why
16:29 karolherbst: I kinda like my idea of just throwing in some casts in the translator to deal with it....
16:29 karolherbst: like
16:29 karolherbst: all pointer arguments are the same type
16:29 karolherbst: and then just cast to the real one in the function body
16:30 karolherbst: it's a pain, but at least it's valid pain
16:44 mattst88: yay, some patches landed in vulkan-cts that stop running a few hundred thousand duplicate tests
16:49 bnieuwenhuizen: uh, which ones did they dedupe?
17:19 mattst88: bnieuwenhuizen: a bunch of VK_EXT_pipeline_protected_access tests -- see https://github.com/khronosGroup/vk-GL-CTS/commit/2ecdbe0665b6aa805b7457b800bb0e200310e54f and https://github.com/khronosGroup/vk-GL-CTS/commit/ea83dc0a128dfec2815c8d9e28d8039f4f43ca8d
17:20 mattst88: https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/4859
17:20 mattst88: looks like I was wrong about the number -- that link says 'about 60,000'
17:27 thellstrom: Hi! I'm about to add xe to nightly.conf early tomorrow CET. https://patchwork.kernel.org/project/intel-gfx/patch/20231222113640.14038-1-thomas.hellstrom@linux.intel.com/ Will require configured ssh for gitlab.freedesktop.org for auto-adding the xe remote, or doing it manually using https.
18:07 tango_: so hm apparently kernel 6.6.9 broke rusticl radeonsi on my setup, but I think the issue may be deeper since I'm getting this on dmesg: amdgpu 0000:66:00.0: amdgpu: bo 000000008744cead va 0x0800000000-0x08000001ff conflict with 0x0800000000-0x0800000002
18:08 tango_: question is: where should I submit the bug?
18:09 tango_: oh the mystery thickens. it's a conflict between rocm and rusticl. damnit
18:12 tango_: and initializing rusticl before rocm leads to a deadlock in rocm
18:13 karolherbst: funky
18:13 tango_: so hm what's the policy here when the proprietary driver is involved? I assume it's «we don't care, you're on your own», which is fair
18:13 tango_: (actually not proprietary driver, proprietary opencl platform)
18:13 karolherbst: well.. doesn't really matter if it's proprietary or not
18:13 karolherbst: sounds like a regression in the kernel
18:13 tango_: I guess I could spin it that way 8-D
18:13 karolherbst: could `git bisect` it and figure out what broke it
18:13 karolherbst: or just file a bug
18:15 tango_: let me reboot in 6.6.8 first to make sure it's the kernel, I think the system upgrade did both kernel and mesa, which is going to make it much more … interesting to debug
18:15 tango_: brb
18:15 zmike: mareko: any other brainbusters I'm supposed to review or was that the last of them
18:39 tango_: karolherbst: not a kernel issue, apparently, with the new mesa it breaks the same way on both 6.6.8 and 6.6.9
18:40 tango_: so, should I submit the issue anyway?
19:45 gfxstrand: Ugh... Looks like I missed a ping and it's not on the scrollback. :-(
19:47 zmike: jenatali: it's been too long since we both were at work
19:47 zmike: I missed this
19:47 jenatali: Agreed
19:50 jenatali: zmike: Seems like st_QueryInternalFormat probably should handle GL_INTERNALFORMAT_SUPPORTED, check if the target is an MSAA target, and if so, check if it can be multisampled before just returning the default true
19:50 zmike: jenatali: I don't suppose you wanna yolo a quick hack up there for ci to run
19:51 jenatali: Lemme give it a shot
19:51 zmike: hero
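
The check jenatali describes at 19:50 could be sketched roughly as below. This is only an illustration under stated assumptions: `tex_target`, `format_caps`, `target_is_multisample` and `internalformat_supported` are hypothetical names invented here, not Mesa's actual types or the code that was pushed.

```c
#include <stdbool.h>

/* Hypothetical texture targets; a stand-in for the GL target enums. */
enum tex_target {
    TARGET_2D,
    TARGET_2D_MULTISAMPLE,
    TARGET_2D_MULTISAMPLE_ARRAY
};

/* Hypothetical per-format capability record. */
struct format_caps {
    unsigned max_samples; /* 0 or 1 means the format cannot be multisampled */
};

static bool target_is_multisample(enum tex_target target)
{
    return target == TARGET_2D_MULTISAMPLE ||
           target == TARGET_2D_MULTISAMPLE_ARRAY;
}

/* Answer an "is this internal format supported?" query for a target.
 * Non-MSAA targets keep the default answer of true; MSAA targets are
 * only reported as supported when the format can be multisampled. */
static bool internalformat_supported(enum tex_target target,
                                     const struct format_caps *caps)
{
    if (target_is_multisample(target))
        return caps->max_samples > 1;

    return true;
}
```

With this shape, a format that cannot be multisampled is still reported as supported for ordinary 2D textures and only rejected for the multisample targets.
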
19:53 mal: gfxstrand: you mean this https://oftc.irclog.whitequark.org/dri-devel/2024-01-06#32803383; ?
19:56 jenatali: zmike: Oh cool, that also removes a ton of debug spew from this test for me
19:56 jenatali: Pushed a patch, time to kick CI
19:56 zmike: oh nice
19:58 mareko: zmike: thanks, I think that's all
19:58 zmike: cool
20:01 gfxstrand: Ah, karolherbst grumbling about scratch...
20:01 karolherbst: scratch?
20:02 karolherbst: ohh that one...
20:02 karolherbst: gfxstrand: yeah so the tldr is, that if we are unlucky, and the CL C code does something silly (like e.g. casting random u8 arrays to higher things) we could end up with unaligned loads, even though the layout of that memory area is up to us
20:03 karolherbst: but that could also prevent vectorization, e.g. loading u8[8], which we could turn into a u64 load if it would be aligned properly
20:03 gfxstrand: Yeah
20:03 gfxstrand: I'm fine with aligning things higher as long as we're not breaking CL rules by doing so.
20:04 karolherbst: yeah... I was wondering if we want to check in lower_explicit_io what the biggest access is
20:04 gfxstrand: IDK if that's practical
20:04 gfxstrand: That's an annoying check
20:04 karolherbst: for scratch and shared we could just doing, but yeah...
20:04 karolherbst: *do it
20:05 karolherbst: gfxstrand: any other ideas on what to check for?
20:05 karolherbst: mhhh...
20:05 karolherbst: guess we might need to take offsets into account as well..
20:06 karolherbst: like what if the access is [0:1] + 2, but also 0 + [1:2] a bit later for $reasons, and we'd have to choose, but whatever we pick, it's still better than both unaligned
20:06 karolherbst: but anyway.. atm I'm only thinking of checking for explicit casts
20:06 gfxstrand: I think if we do anything, we just align the base up.
20:06 karolherbst: and not to consider potential vectorization later on
20:06 gfxstrand: There's no point in optimizing for accessing [3:7] of an array.
20:07 karolherbst: mhhh
20:07 jenatali: Yeah, each scratch var can be aligned to 8 bytes or 16 bytes or whatever the driver's preferred alignment is instead of just aligning it based on its type
20:09 karolherbst: yeah.. maybe that would be easier...
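
The "just align the base up" idea from gfxstrand and jenatali could look something like the sketch below. It assumes a hypothetical `scratch_var` record and a `layout_scratch` helper made up for illustration; it is not the NIR pass itself, and the preferred alignment (8, 16, or whatever the driver likes) is passed in as a parameter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one scratch variable; not a Mesa structure. */
struct scratch_var {
    uint32_t size;       /* size in bytes */
    uint32_t type_align; /* alignment required by the declared type (power of two) */
    uint32_t offset;     /* filled in by layout_scratch() */
};

/* Round value up to the next multiple of align (align must be a power of two). */
static uint32_t align_up(uint32_t value, uint32_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Assign an offset to every variable, aligning each base to the larger of
 * its type alignment and the driver's preferred alignment. Returns the
 * total number of scratch bytes used. */
static uint32_t layout_scratch(struct scratch_var *vars, size_t count,
                               uint32_t preferred_align)
{
    uint32_t offset = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t align = vars[i].type_align > preferred_align
                             ? vars[i].type_align
                             : preferred_align;

        offset = align_up(offset, align);
        vars[i].offset = offset;
        offset += vars[i].size;
    }

    return offset;
}
```

A u8[8] array laid out this way with a preferred alignment of 8 always starts on an 8-byte boundary, so a later pass could in principle turn eight byte loads from its start into one 64-bit load. The cost is some padding between small variables.
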
20:21 jenatali: zmike: Nope apparently my missing 'break' that non-MSVC build tools told me about just broke the test... let's try again...
20:21 zmike: it's okay, first days back from vacation
20:21 zmike: shake it off
20:27 jenatali: O.o why does st_QuerySamplesForFormat always return one of the max-samples values...? That seems wrong
20:28 jenatali: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11372
20:28 jenatali: Wtf?
20:31 jenatali: zmike: Does that MR make any sense to you?
20:32 zmike: 🤕
20:32 CounterPillow: People are still using NVIDIA Tesla generation boards? As in 8800GT and friends? :D
20:32 zmike: I think this is because in GL there's that weird thing where samplecounts aren't format specific...if you advertise a sample count you're advertising it for ALL formats
20:33 zmike: I remember running into something stupid like this a few years back
20:33 jenatali: Ok, but what if some formats can't be MSAA?
20:33 zmike: I think that was the stupid part where like 9E5 was still required to return support or something
20:34 karolherbst: CounterPillow: even the gen before tesla
20:34 jenatali: The more I learn about GL the more I think it was a mistake
20:34 zmike: hey
20:34 zmike: you might be right but I can't say that
20:35 gfxstrand: I mean, so was D3D9 and D3D8 and 11 and Vulkan and...
20:35 gfxstrand: Really, computers were a mistake
20:35 jenatali: Heh
20:35 karolherbst: don't worry, your secrets are safe with us
20:35 zmike: that we can all agree with
20:35 zmike: sand was not meant to think.
20:35 CounterPillow: OpenGL had the unpleasant role of developing in parallel with GPUs becoming generally programmable parallel processors, so I do not fault it for getting it wrong occasionally.
20:36 jenatali: Ok so if you ask for sample counts for 999e5 we need to return whatever the max sample count is... but I can still claim it's unsupported for a multisample target, right?
20:36 zmike: 🤕
20:36 zmike: maybe
20:36 jenatali: sigh
20:37 karolherbst: well.. gl doesn't have such queries anyway, no?
20:37 zmike: ackchuahlly
20:37 karolherbst: just report more and don't do MSAA, what's the user going to do? file a bug?
20:37 zmike: ohhhh it was that easy all along?
20:38 jenatali: D3D doesn't support 999e5 as render target at all
20:38 jenatali: Well, not yet anyway. It's actually being added coincidentally
20:38 jenatali: Maybe the answer is just add the xfails and they'll go away when that lights up and I don't care about this weird corner case on drivers that don't support it
20:38 gfxstrand: Why are we doing MSAA on 999e5?!?
20:39 zmike: that's the fun part
20:39 zmike: we're not
20:40 karolherbst: wasn't that also this part of time where people were like "look, I know our GPU doesn't support this feature, but we can emulate it really fast using LLVM"
20:42 CounterPillow: that sounds like Intel GMA950
20:51 alyssa: CounterPillow: "occasionally"
20:52 alyssa: karolherbst: I've spent the past 3 years seeing if it's possible to do gl4.6 on a broken gles3.1 part (~:
20:54 CounterPillow: no bully the bifrost
20:54 alyssa: CounterPillow: actually I was talking about agx (-:
20:54 CounterPillow: oh, oof
21:02 jenatali: Alright let's see if piglit complains about this
21:06 karolherbst: alyssa: I mean... the alternative would be to invent your own API everybody has to use to use modern but supported features of your GPU :P
21:09 jenatali: zmike: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/53492970 - that looks better, if you're good with the patch I pushed, go ahead and update the fail list and add your r-b
21:18 mareko: jenatali: 999e5 wasn't renderable until RDNA 2
21:19 jenatali: Then I guess I don't understand why radeonsi isn't impacted by this change 🤷
21:21 mareko: also image stores didn't work with it either
21:21 jenatali: Right
21:22 jenatali: mareko: What happens if you ask to allocate a 999e5 multisampled texture in radeonsi prior to RDNA2? Do you still allocate it as multisampled even though you can't render to it?
21:22 jenatali: If so, how do you put contents in it? Blit?
21:23 mareko: not blittable either
21:23 mareko: need to reinterpret the format to uint for blits
21:23 mareko: I have no clue, maybe it's just untested?
21:25 mareko: for FBOs, we definitely report unsupported for the FBO status
21:25 jenatali: arb_internalformat_query2-image-format-compatibility-type tries to create multisampled 999e5
21:26 jenatali: It just creates it so that it can query properties against it though
21:26 mareko: it looks like we allocate it
21:27 jenatali: And I guess it's just zeroed or whatever since there's no way to put contents in it
21:27 mareko: but there is no way to set any of the pixels
21:27 jenatali: What a mess
21:32 gfxstrand: Whatever happened to nir_strip?
21:33 jenatali: mareko: In that case you might be interested in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25931/diffs?commit_id=894c9e2740431b1028010fa3fd03f7255f6e4213
21:36 jenatali: gfxstrand: What is that?
21:41 gfxstrand: It was a pass that stripped "unnecessary" stuff from the NIR so you could use it as a cache key
21:43 jenatali: Oh cool
21:48 alyssa: gfxstrand: argument to nir_serialize
21:49 alyssa: gfxstrand: ff71fae4403 ("nir: strip as we serialize to remove the nir_shader_clone call")
21:52 gfxstrand: Neat
21:53 eric_engestrom: PSA: reminder that the 24.0 branchpoint is in ~44h; get your MRs reviewed now ;)
22:13 mareko: I just ran into this Gitlab bug: https://gitlab.com/gitlab-org/gitlab/-/issues/421630
22:25 mareko: robclark: what is happening with these jobs? https://gitlab.freedesktop.org/mesa/mesa/-/jobs/53494421
22:39 robclark: mareko: something got slower and now it's hitting 20m timeout, I guess?
22:41 robclark: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/53493765 looks like it was all but finished before timeout
22:44 robclark: looking at the previous pipeline that ran freedreno jobs (https://gitlab.freedesktop.org/mesa/mesa/-/pipelines/1073401) these tests aren't close to the timeout limit
22:45 mareko: robclark: isn't this affecting it? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26584/diffs?commit_id=1e17469e81ccdd92db23f446514fbf83eb576a00
22:46 robclark: shouldn't be.. nr_cpus should be 8 for those devices
22:46 robclark: also.. I'm not quite sure why that would make _vk_ tests slower
22:46 mareko: indeed
22:47 alyssa: good reminder I need to wire up tc.
22:47 alyssa: or just go vk already......
22:49 robclark: mareko: the other a630_vk job looks like it is trying to report flakes, so maybe some transient network issue?
23:00 mareko: robclark: reassigned to marge
23:01 mareko: has anybody considered emulating index buffer translation and primitive restart with compute?
23:02 mareko: so that we can stop using u_primconvert
23:07 jenatali: mareko: I've considered it. Haven't considered it hard enough to actually do it though
23:09 zmike: same
23:18 alyssa: same
23:18 alyssa: mareko: prim restart emulation is stupid hard to do in parallel
23:19 * alyssa gave up trying and hopes nothing other than the CTS ever hits that compute kernel
23:21 alyssa: it's in src/asahi/lib/shaders/geometry.cl if you want to be disgusted
23:25 alyssa: if you do embark on this quest, I highly recommend the cl magic stuff
23:43 jenatali: As a first step, it doesn't need to be parallel, just having it async to avoid the CPU round trip would still be useful
23:52 alyssa: jenatali: yep, that's why I wrote that doom macro
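
For readers unfamiliar with the problem, a serial version of primitive restart emulation might be sketched as follows. It is written as plain C rather than a compute kernel, handles only triangle strips with 32-bit indices, and the function name `unroll_tri_strip_restart` is made up here; it is not the code in src/asahi/lib/shaders/geometry.cl or u_primconvert.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand an indexed triangle strip that uses a restart index into a plain
 * triangle list. `out` must have room for 3 * (count - 2) indices when
 * count >= 3. Returns the number of indices written. */
static size_t unroll_tri_strip_restart(const uint32_t *in, size_t count,
                                       uint32_t restart_index, uint32_t *out)
{
    size_t written = 0;
    size_t run = 0; /* vertices seen since the last restart */

    for (size_t i = 0; i < count; i++) {
        if (in[i] == restart_index) {
            run = 0; /* start a new strip */
            continue;
        }

        run++;
        if (run < 3)
            continue; /* not enough vertices for a triangle yet */

        /* Every second triangle of a strip swaps its first two vertices
         * so that all emitted triangles keep the same winding. */
        if ((run - 3) % 2 == 0) {
            out[written++] = in[i - 2];
            out[written++] = in[i - 1];
        } else {
            out[written++] = in[i - 1];
            out[written++] = in[i - 2];
        }
        out[written++] = in[i];
    }

    return written;
}
```

The `run` counter depends on every index that came before it, which is one way to see why a parallel version is harder: each thread would first need to know where the most recent restart was.
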