01:04AlexShi: Hey all, I'm on the Microsoft D3D team and trying to verify that some of our changes in mesa master can pass CTS verification. I've been seeing this error in my test runs, has anyone else noticed this?
01:04AlexShi: Debug\glcts.exe --deqp-case=KHR-GL33.shaders30.declarations.declarations.redeclare_gl_FragColor
01:04AlexShi: Test case 'KHR-GL33.shaders30.declarations.declarations.redeclare_gl_FragColor'..
01:04AlexShi: Fail (expected compilation to fail, but both shaders compiled correctly.)
01:06anholt: AlexShi: interesting. we have KHR-GL33 running on other drivers in CI, so I would be surprised to see a regression on that in core mesa.
01:07anholt: are you running the same cts version we are in CI? (184.108.40.206)
01:07AlexShi: Is there a quick way for me to check on that version number?
01:07anholt: I just git log my tree, I don't think the binary can tell you
01:08jenatali: IIRC we're using the latest released GL (desktop, not GLES) tag
01:08jenatali: We had it working before we merged our code upstream
01:10anholt: hmm. wonder if you're running into something with allow_glsl_builtin_variable_redeclaration driconf?
01:11anholt: is this the only regressed test?
01:12AlexShi: I've got a bunch of very noisy test failures, but there seems to be a good number of compilation-should-have-failed errors
01:13jenatali: I wouldn't expect that driconf option to be on, considering that driconf isn't hooked up at all on Windows (I think?)
01:19AlexShi: I've got another pattern of failures across many tests that looks something like this:
01:20AlexShi: <Text>A constant named [false], defined in fragment shader, was accepted by the compiler, which is prohibited by the spec. Offending source code:
01:20AlexShi: <Text>A constant named [mat2], defined in fragment shader, was accepted by the compiler, which is prohibited by the spec. Offending source code:
01:20AlexShi: <Text>A constant named [mat3], defined in fragment shader, was accepted by the compiler, which is prohibited by the spec. Offending source code:
01:22AlexShi: That repeats ad nauseam for plenty of CommonBugs.___ tests
01:30anholt: jenatali: yeah, just wondering if we broke the !driconf path somehow.
01:31anholt: I wrote lots of unit tests to try to verify that, but...
01:31anholt:done for the day
01:38jenatali: anholt: You might be on to something with driconf...
01:38jenatali: Looks like the driconf options might be uninitialized on Windows...
01:39jenatali: Oh nvm, there's the memset
06:29jstultz: kherbst: hey, so ran into a build issue with the following line where 256 is being cast to a uint8_t and is thus zero: https://gitlab.freedesktop.org/mesa/mesa/-/blob/a7a7d25e5b909711e3649eba2f24cc04dca8ab20/src/gallium/drivers/nouveau/nvc0/nvc0_program.c#L689
06:30jstultz: kherbst: the "// XXX: why?" comment makes it seem like it's not clear what's supposed to happen.
06:31jstultz: kherbst: was curious if you had any thoughts on what might be a good way to fix that so -Werror,-Wconstant-conversion doesn't fail?
06:36jstultz: kherbst: does a more explicit "= (info_out.bin.maxGPR + 5 > 255) ? 0 : info_out.bin.maxGPR + 5;" make sense?
06:41HdkR: More likely they were just maxing out the defined registers, so it should be 255?
06:56jstultz: HdkR: yea, that's what i was guessing but figured it may be trying to be clever if it overflows.. didn't want to introduce logic change
07:22mareko: imirkin: all resources have packed slots except streamout buffers
07:42HdkR: jstultz: I believe in this case the code is just wrong, num_regs = 0 doesn't really work there
08:02kherbst: jstultz: huh... mhh, yeah, I'd make it a 255 I think...
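The truncation discussed above can be sketched as follows. This is a Python model of the C conversion, not the nouveau code: `to_uint8` mimics storing an int into a `uint8_t`, and all three function names are hypothetical. It compares the wrapped value, the explicit overflow-to-zero variant jstultz proposed, and the clamp to 255 that HdkR and kherbst lean towards.

```python
def to_uint8(value: int) -> int:
    """Model C's conversion of a non-negative int to uint8_t (keeps the low 8 bits)."""
    return value & 0xFF


def overflow_to_zero(max_gpr: int) -> int:
    """The explicit variant proposed in the chat: anything above 255 becomes 0."""
    total = max_gpr + 5
    return 0 if total > 255 else total


def clamp_to_255(max_gpr: int) -> int:
    """The alternative suggested in the replies: saturate at 255 instead."""
    return min(max_gpr + 5, 255)
```

The first function shows why the compiler warns: 256 does not fit in eight bits, so the stored value is 0. The other two differ only in what they do once `max_gpr + 5` exceeds 255.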
08:44MrCooper: daniels: AFAICT https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7508 left the kernel+rootfs_armhf artifacts unused, so either that job should be disabled, or maybe some jobs using them re-enabled
08:47pq: emersion, did you forget all the discussions that resulted in the conclusion "a fence is not a command in the GL stream"? :-P
08:48pq: there is no "fence command", the EGL spec is plain misleading
08:50pq: emersion, so you need to do exactly what daniels said: CreateSync, glFlush, DupNativeFence. And this is only valid for knowing when that work completes.
08:51pq: imirkin, the android in eglDupNativeFenceFDANDROID doesn't mean anything, it's just that android needed it first, but we all followed needing it too.
08:55ascent12: Yeah, once the fence is signalled, you know all prior work to that point is complete, but if you didn't actually do any new work, it may just return a fence from the last submission.
08:56pq: emersion, https://github.com/KhronosGroup/EGL-Registry/issues/94
08:56ascent12: Which is how I found out that you can't use fences to time how long something takes.
08:57pq: I would expect the fence/fd thing to work fine for PBO operations, but...
08:58pq: not sure anyone ever tried?
08:59emersion: ah, yeah :P
08:59ascent12: I haven't looked back to the start of this discussion, but as long as there was some actual work given to the GPU, there shouldn't be any issues with getting a fence.
08:59emersion: it _seems_ to work looking at the timings, unfortunately most of the time is still spent in the final memcpy
09:01emersion: so i was hoping for a "async readpixels that doesn't block the compositor", but it seems not really possible
09:04pq: ascent12, does glReadPixels with a PBO count as "actual work"?
09:05ascent12: I would imagine so.
09:05pq: emersion, hmm. So maybe the fence is good for the GPU ops, but leaves out the CPU work needed.
09:05emersion: yeah, that's my understanding
09:05pq: ascent12, this ^ would suggest perhaps otherwise.
09:05emersion: i've seen other people use CreateSync with ReadPixels
09:06pq: would need a small test program that doesn't do any GPU work other than glReadPixels with a PBO in a loop, with fence fd waiting.
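A minimal sketch of the "fence fd waiting" half of such a test, assuming the fd behaves like a Linux sync file, which reports readable (POLLIN) once the fence has signalled. The GL side (creating the sync, glFlush, duplicating the fd) is not shown; `wait_fence_fd` is a hypothetical helper, and the pipe in the demo lines only stands in for a real fence fd.

```python
import os
import select


def wait_fence_fd(fd: int, timeout_ms: int) -> bool:
    """Return True if fd became readable within timeout_ms, False on timeout."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(event & select.POLLIN for _, event in events)


# Stand-in demo: a pipe's read end is "unsignalled" until something is written.
read_end, write_end = os.pipe()
pending = wait_fence_fd(read_end, 0)       # nothing written yet
os.write(write_end, b"x")
signalled = wait_fence_fd(read_end, 1000)  # readable now
os.close(read_end)
os.close(write_end)
```

Waiting this way only says when the GPU work behind the fence has completed; any CPU-side copy done afterwards is not covered by it.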
09:07emersion: pq: to use PBOs, you need to use ReadPixels to initiate the FBO→PBO copy, then when this is complete map the PBO and manually use memcpy
09:07pq: emersion, oh, you have a manual memcpy
09:08ascent12: opengl-managed memory to your own managed memory?
09:08emersion: … because that's how the PBO API works?
09:08emersion: i have no idea honestly
09:08emersion: see last example in http://www.songho.ca/opengl/gl_pbo.html
09:09pq: you said you can map the PBO to CPU, right? So why memcpy instead of using the data directly from the map?
09:09pq: is the PBO mem slow for random reads?
09:10emersion: pq: hm, yeah, that's a good question. my use-case is screen capture, so i need to fill a client buffer with the downloaded data
09:10emersion: i can't share a random mapped memory region
09:10pq: ok, that's a good reason to memcpy
09:10emersion: i don't know about the PBO mapped region properties
09:11ascent12: If only there was the equivalent of VK_EXT_external_memory_host
09:11emersion: why don't we just have a vulkan renderer :P
09:12pq: weston is still lacking even PBOs :-)
09:12emersion: well, it's not clear PBOs are useful for downloads
09:13pq: even for uploads, might be useful with wl_shm
09:13emersion: i'd need to do some measures to figure out if it would be useful for uploads
09:13pq: oh wait, you'd still need to memcpy manually there too?
09:14emersion: *disappointment in the crowd*
09:14pq: bummer - somehow I expected I could give GL a pointer to arbitrary memory and it would off-load the CPU reads to a thread under the hood when necessary
09:14emersion: yeah, me too
09:14pq: does Vulkan have this?
09:15emersion: maybe they were scared about users needing to free the pointer or something
09:15ascent12: VK_EXT_external_memory_host if I'm not mistaken
09:15emersion: "here's my pointer, please import it"
09:16ascent12: I wonder if there would be any resistance if someone tried to make a new OpenGL extension for that :P
09:19lynxeye: The horrors of AMD_pinned_memory?
09:20glennk: tip: create a buffer large enough for 3 frames worth of data, map that coherent on init, leave it mapped
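glennk's tip amounts to a three-slot ring inside one long-lived allocation. The sketch below models only the offset bookkeeping in plain Python: a `bytearray` stands in for the mapped buffer, and `FrameRing` and its method are hypothetical names, not GL API.

```python
from typing import Tuple


class FrameRing:
    """Rotate through fixed-size frame slots inside one long-lived buffer."""

    def __init__(self, frame_size: int, slots: int = 3):
        self.frame_size = frame_size
        self.slots = slots
        # Allocated once and kept for the lifetime of the ring ("leave it mapped").
        self.storage = memoryview(bytearray(frame_size * slots))
        self.next_slot = 0

    def acquire(self) -> Tuple[int, memoryview]:
        """Return (byte offset, writable view) of the next slot and advance."""
        offset = self.next_slot * self.frame_size
        view = self.storage[offset:offset + self.frame_size]
        self.next_slot = (self.next_slot + 1) % self.slots
        return offset, view
```

In a real renderer a slot would presumably have to stay untouched until the GPU work that uses it has finished, for example by checking a fence before reuse; that part is left out here.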
09:20ascent12: Oh, so there already is an extension.
09:20HdkR: buffer_storage and pinned_memory <3
09:22emersion: lynxeye: can you expand on the horror side?
09:23lynxeye: emersion: kernel guys will hate you for using it, as we try all day to pretend that we never introduced the mistake of pinning userspace memory in our UAPI. ;)
09:24emersion: hm, why is there a need for pinning?
09:24HdkR: Same thing should be able to be pulled off with buffer_storage :)
09:24ascent12: Yeah, we just want async reads/writes to our memory. Pinning isn't important.
09:25ascent12: To avoid a memcpy
09:25pq: not even to avoid memcpy, but to put it into another thread while not caring about threads ourselves
09:26lynxeye: emersion: If you want to reap the full benefits, you want the GPU to do the copy into/out of userspace memory. Which requires pinning, as most GPUs don't do proper page faults (and app developers would not like the latency of a page fault).
09:26pq: IOW, let the driver do a memcpy in a thread if it can't use the memory directly
09:29ascent12: True, but at least it'll give the driver the opportunity to.
09:29emersion: hm, i still see memcpys in the EXT_buffer_storage examples…
09:30pq: HdkR, how do you avoid manual memcpy's with buffer_storage? use something else on the side?
09:30HdkR: Depends on what you're trying to do
09:30pq: HdkR, to copy from a wl_shm wl_buffer into a texture
09:30ascent12: ^ i.e. normal shared memory object, for those not familiar with wayland
09:30pq: IOW, I have a fd from another process, and I can mmap that fd to get CPU access to the memory.
09:31emersion: basically, you have an existing pointer, and want to upload it
09:31HdkR: oh, existing pointer would need pinned_memory most likely
09:31emersion: the AMD_pinned_memory looks exactly like what i'm interested in
09:31emersion: however it's not gles2 :(
09:32HdkR: buffer_storage would be where you have the GL driver generate the memory backing for you and then everything uses that instead
09:32pq: HdkR, right, that's not enough for the usual Wayland compositor requirements.
09:32ascent12: Neither is buffer_storage. Probably need to keep the slow path for GLES2
09:32emersion: yeah, we don't want the GL driver to generate any memory backing
09:32HdkR: buffer_storage isn't gles2 either though, requires ES3.1 :)
09:32emersion: ascent12: https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_buffer_storage.txt
09:32emersion: i guess "OpenGL ES 3.1 is required", but at least it's GLES
09:33HdkR: Technically the GL desktop 4.2 or something, but mesa is silly and exposes it everywhere if the driver enables the cap :P
09:33HdkR: er, the GL desktop extension requires 4.2*
09:34emersion: i don't see it in my GLES context at least
09:34ascent12: There is an EXT and ARB version of the extension, written for each API separately
09:34HdkR: mesa follows the minimum ES version, not the GL version
09:34pq: HdkR, given this "I already have the memory" requirement, would using PBO with manual memcpy give any advantage over the simple glTexImage2D call?
09:34emersion: would be nice to have AMD_not_pinned_memory
09:35emersion: ie. exactly what vulkan has
09:35emersion: hm, i may be misunderstanding what pinned memory is here
09:36pq: emersion, pinned means "not swappable" more or less, i.e. locked into physical ram into specific phys addresses. AFAIU
09:36HdkR: pq: Doubt
09:37pq: IOW, it's the way to run out of memory with the kernel unable to do anything about it other than give up
09:37emersion: but why is this required for async uploads?
09:38pq: emersion, because you don't want the kernel to move be buffer while the GPU is reading it.
09:38pq: *the buffer
09:38ascent12: Some hardware (e.g. nice AMD GPUs) do have some separate DMA engine thing for uploading data to the GPU. Probably has some extra requirements on memory to use it.
09:38ascent12: You can see it in vulkan, with the separate transfer queue.
09:39pq: HdkR, thanks - it's both comforting that Weston cannot really do any better, and sad that Weston cannot really do any better.
09:39emersion: ascent12: makes sense
09:40HdkR: pq: Your best bet is to avoid client<->device copies as much as physically possible :)
09:40pq: HdkR, I think that is already the case.
09:40HdkR: In some cases having the GPU read memory directly from host can just be faster though
09:40ascent12: Yeah, damage tracking solves most of that, but the wayland compositor just wants to make sure it's not blocking or being slow in any situation
09:40lynxeye: pq: I guess the canonical way of doing what you want to achieve is to use the standard glTexImage and offload that to an application thread.
09:41pq: lynxeye, right... have managed to avoid threads so far though
09:42glennk: friends don't let friends do GL from multiple threads...
09:42pq: doing the manual memcpy to a PBO from a thread might be... wiser?
09:43ascent12: May as well just use glTexImage or whatever it was if there is another thread.
09:43pq: ascent12, but that needs a GL context current in the extra thread. memcpy doesn't.
09:43ascent12: You'd need to create its own (shared) EGL context for it
09:43HdkR: Yea, just use shared contexts
09:44HdkR: You'll make nouveau a bit sad but everyone else happy
09:44lynxeye: pq: I'm not sure. With glTexImage the driver already knows what you are going to do with the data (texture from it) and can get it into the GPU preferred layout while slurping in your user-memory.
09:44pq: lynxeye, true
09:48lynxeye: ascent12: I chuckle a little that people in this conversation at the same time wish for a Vulkan renderer, while complaining that the GL driver doesn't do enough threading behind their back. ;)
09:50pq: Vulkan doesn't do threading behind my back?
09:50emersion: pq: correct
09:50emersion: i think
09:50HdkR: It tries to avoid it anyway :)
09:50ascent12: I wonder if the shader compiler is multithreaded in vulkan
09:50lynxeye: pq: At least that's what the advertising says. I don't think it's true anymore with timeline semaphores.
09:51pq: right - well, when working with OpenGL, we have the OpenGL mindset. When working with Vulkan, the mindset is completely different.
09:52pq: and I've never worked with Vulkan yet
09:52emersion: i wonder how VK_EXT_external_memory_host can work without a thread
09:52emersion: maybe it also uses pinned memory and everything is done in-kernel
09:52ascent12: I drew a triangle and some textured squares once in Vulkan. That makes me an expert in it.
09:53emersion: pretty much
09:53ascent12: But even then, the way the textures were uploaded was pretty sloppy :P
10:43tzimmermann: airlied, danvet, JFYI there's nothing in drm-misc-fixes this week
11:35tzimmermann: sailus, hi! is the print patchset ready for merging?
11:43tagr_: Lyude: sorry for the delay, I had missed that message
11:44tagr_: Lyude: drm_dp_aux_attach() is pretty much the first time the driver learns about DRM, so I don't think we could do it any earlier
11:46tagr_: one thing that we might be able to do is to turn the DPAUX devices into "host1x clients", which would cause their ->init() functions to be called once the DRM device has been created
11:46tagr_: however, drm_dp_aux_attach() is called from tegra_sor_init(), so the result would be basically the same
11:47tagr_: Lyude: so I think setting .drm_dev in drm_dp_aux_attach() is actually the right thing to do
11:50sailus: tzimmermann: I believe so.
15:56nroberts: does drm_vma_node_unmap not work with the GEM shmem helpers? it looks like the helpers remove the pgoff from the vma and then drm_vma_node_unmap can’t find the mappings and it doesn’t do anything
15:56nroberts: the GEM shmem helpers themselves try to use drm_vma_node_unmap when purging a buffer. surely this will not work?
15:56nroberts: it looks like panfrost is using this, so maybe there is a bug there?
15:56nroberts: I made this patch and it seems to help in my case where I want to use drm_vma_node_unmap. https://github.com/bpeel/linux/commit/e3c368e9fbbef4eaf18fcf1a58a89c22e15d7d6d
15:59danvet: nroberts, yeah this looks broken
16:00nroberts: cool, maybe I should send that patch then
16:00nroberts: it’d be interesting if someone could replicate the potential bug in panfrost
16:02danvet: nroberts, looks like this broke v3d with 40609d4820b21
16:02danvet: anholt, ^^
16:02danvet: and panfrost/lima look like broken from the start
16:03danvet: well v3d doesn't seem to have any shrinker or anything like that, so not broken
16:03danvet: but panfrost looks broken
16:03nroberts: I think v3d is not using the purging?
16:03nroberts: right, yeah
16:46daniels: tomeu, bbrezillon: ^
16:58ym: Hi. I'm thinking about making a compiler to GLSL code and wondering if there is some kind of formalized knowledge base and inference engine of GLSL?
16:59imirkin: ym: not sure what you're looking for - perhaps elaborate a bit further?
17:05bbrezillon: nroberts, daniels: wow, that would explain the WARN_ON()s we were having when purging BOs
17:05nroberts: oh fun
17:06nroberts: I went ahead and sent the patch to dri-devel and CC’d the panfrost maintainers, I hope that’s ok
17:06nroberts: (I’m not sure if it worked though, it doesn’t seem to have shown up yet…)
17:06imirkin: if you're not subscribed, it has to get approved
17:06bbrezillon: nroberts: thanks
17:06nroberts: I am subscribed, yeah
17:07imirkin: and you sent from the email you're subscribed with? :)
17:07nroberts: yes :)
17:07nroberts: I hope
17:07ym: imirkin, I need GLSL pattern generator. For example, if the source program has only draw-circle function, then all the context to implement this minimum functionality should be generated automatically. Some sort of other side of Yacc.
17:07imirkin: ym: there was a team that was doing glsl fuzzing a while back
17:08imirkin: ym: https://github.com/google/graphicsfuzz
17:11ym: imirkin, interesting, thank you.
17:35Lyude: tagr_: yeah that's what I ended up doing, although there seems to be some bridge drivers that kinda complicate the drm_dev backpointer stuff
17:36Lyude: i'm wondering if we might just need to make the DRM helpers ok with not having a drm_dev backpointer the entire time, since some drivers seem to do DPCD transactions well before they know anything about DRM (this is going on the assumption we'll eventually be moving all of the debugging output for the DP helpers over to drm_dbg_*())
18:25zmike: imirkin: you were right, trace driver is awesome
18:26zmike: did you say there was some kind of visualizer?
18:26imirkin: there's a script to parse the xml
18:26imirkin: into something not quite as horrid
18:26imirkin: it's ... somewhere
18:27imirkin: probably dump.py
18:27imirkin: 100 years ago there was also a rbug driver and associated gui
18:27imirkin: but i believe it's all gone now
18:27zmike: rbug still exists
18:27zmike: I just copied code from it yesterday
18:28imirkin: oh hm, yeah, i guess it's still there
18:28imirkin: there used to be a rbug-gui too
18:29imirkin: i guess here: https://cgit.freedesktop.org/mesa/rbug-gui
18:29imirkin: updated as recently as 6 years ago
18:31zmike: the amount of not-workingness in these trace python scripts is roughly 100%
18:31imirkin: hm ok. they used to work for me
18:31zmike: feels right
18:31zmike: I think it's python syntax changes
18:31imirkin: oh, coz you have py3?
18:32imirkin: quite annoying that distros started shipping py3 as "python"
18:32imirkin: breaks everything
18:32imirkin: since it's not compatible
18:32imirkin: o well
18:33zmike: hmm the env in the python files is using python2
18:33anholt: distros dragged their feet for as long as they could, but my understanding is it's python upstream that basically forced their hands.
18:33imirkin: my point is you don't make "bash" into a thing that doesn't interpret shell scripts
18:34imirkin: a ton of scripts use "python", so python should have remained py2
18:34anholt: yep! python upstream made a very bad call.
18:34imirkin: you don't want to ship py2 anymore? fine, don't have a "python" binary, no problem
18:34imirkin: the distros could have done what they wanted, but instead they chose to make python be py3
18:35anholt: imirkin: the issue is that distro installs are not the only way people get python, and so you have in-the-wild python scripts that shebang to python that are either 2 or 3.
18:35ccr: not sure how this works on debian, but I've no py2 packages installed anymore and there is no "python" binary, only python3. of course it's possible that some package provides a symlink or so.
18:37pepp: ccr: Debian has python-is-python2 / python-is-python3 packages to provide the python -> pythonX symlink
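A small sketch of why the bare `python` name is the ambiguous case in this discussion: a shebang that names `python2` or `python3` says which interpreter it wants, while plain `python` depends on whatever the system (or a package such as the ones pepp mentions) points it at. `classify_shebang` is a hypothetical helper written for illustration.

```python
def classify_shebang(first_line: str) -> str:
    """Classify a script's first line as 'python2', 'python3', 'python', or 'other'."""
    if not first_line.startswith("#!"):
        return "other"
    parts = first_line[2:].split()
    if not parts:
        return "other"
    # For "#!/usr/bin/env python2" the interpreter is the argument to env.
    name = parts[0].rsplit("/", 1)[-1]
    if name == "env" and len(parts) > 1:
        name = parts[1]
    if name.startswith("python3"):
        return "python3"
    if name.startswith("python2"):
        return "python2"
    if name == "python":
        return "python"  # ambiguous: resolved by the system, not by the script
    return "other"
```

A script whose first line is `#!/usr/bin/env python2`, as zmike reports for the trace scripts, is classified as `python2` and does not depend on where `python` points.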
18:37zmike: I made a ticket in case someone is bored and has free time
18:38imirkin: i suspect there was a similar issue with py1 -> py2, but python was nowhere near as popular back then
18:38ccr: pepp, ah. I seemed to remember something like that.
18:38imirkin: i know i only started playing with python around py2.2 or so
18:39tagr_: Lyude: any idea why they do DPCD transactions before they know about DRM? I can't think of any reason why they would want to do that
18:57Lyude: tagr_: it seems like they might be trying to do connector probing before the hardware is even ready, I still need to look a bit more closely at the code but I -think- that's what's happening
19:02alyssa: Is there any diff between SHADER_CAP_INT64_ATOMICS and CAP_INT64_ATOMICS?
19:02alyssa: looks like clover vs mesa which is a ridiculous distinction..
19:03imirkin: usually the shader caps are for things where you want a diff answer per shader stage
19:08jekstrand: Are int64 atomics going to be per-stage?
19:08imirkin: i hate to bring it up
19:08jekstrand: I guess it depends on how you look at it. On some tilers, you may not have atomics in vertex stages
19:08jekstrand: Or older NV or something
19:08imirkin: but nv50 only has this stuff on compute
19:09imirkin: not on frag or other stages
19:09anholt: jekstrand: we concluded that the intention of specs was that tilers can do vertex atomics
19:09anholt: we've got some bogus piglit tests that assume single execution of vertex atomics
19:09jekstrand: I guess the question is if "can do atomics?" is CAP_INT64_ATOMICS && SHADER_CAP_ATOMICS or if CAP_INT64_ATOMICS is supposed to give you both.
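jekstrand's two readings can be written out as below. This is a toy model with plain dictionaries, not the gallium interface: the dictionary layout and both function names are assumptions made for illustration, and only the cap names come from the discussion.

```python
def int64_atomics_combined(screen_caps: dict, shader_caps: dict, stage: str) -> bool:
    """Reading 1: the screen-wide cap AND the per-stage atomics cap must both hold."""
    return bool(screen_caps.get("CAP_INT64_ATOMICS")) and bool(
        shader_caps.get(stage, {}).get("SHADER_CAP_ATOMICS")
    )


def int64_atomics_screen_only(screen_caps: dict) -> bool:
    """Reading 2: the screen-wide cap alone answers the question for every stage."""
    return bool(screen_caps.get("CAP_INT64_ATOMICS"))


# Hypothetical driver that only supports atomics in compute.
screen = {"CAP_INT64_ATOMICS": True}
per_stage = {
    "compute": {"SHADER_CAP_ATOMICS": True},
    "fragment": {"SHADER_CAP_ATOMICS": False},
}
```

For a compute-only setup like this one, the two readings agree for the compute stage and disagree for fragment, which is the ambiguity being asked about.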
19:10imirkin: however the other question is whether any of this stuff matters
19:10jekstrand: anholt: Yes, multiple executions of vertex shaders are allowed.
19:10imirkin: e.g. can you expose NV_shader_atomic_int64 or whatever it's called
19:10imirkin: where you only have ssbo's available on compute? i think so.
19:10jekstrand: anholt: But some hardware doesn't have atomics at all in those stages due to not entirely unified architectures, I think.
19:10imirkin: so in practice, it doesn't really matter.
19:10anholt: jekstrand: oh? is there one?
19:10imirkin: on nv50, any global memory access is compute-only
19:11imirkin: it's not about atomics, it's about global memory access
19:11imirkin: [at least on nv50]
19:12jekstrand: anholt: I'm not sure on the details but I do know that Vulkan and GL both allow you to not expose atomics at all in vertex stages.
19:12jekstrand: They also allow fragment to be optional
19:12jekstrand: So nv50 is good
19:12imirkin: on fermi, we only expose images in frag and compute (due to laziness, mainly)
19:12imirkin: jekstrand: well actually the desktop GL requires frag too
19:13imirkin: but ES is only in compute
19:13imirkin: (but AEP makes it required in frag too iirc)
19:13jekstrand: imirkin: Oh? Ok. Vulkan tends to have ES 3.1 limits.
19:13jekstrand:doesn't GL anymore
19:13imirkin: i'm going through and hooking up nv50 compute support right now
19:13imirkin: but i don't know if we'll actually be able to expose it "properly"
19:14alyssa: es3.2 is getting on my last nerve
19:14imirkin: alyssa: once you have GL 4.5, it's pretty easy ;)
19:15jekstrand: alyssa: Does that mean you're getting annoyed back-to-front or that you're running out of nerves?
19:16alyssa: jekstrand: Yes.
19:16alyssa: imirkin: GS/TS is the showstopper for us, and that's already in GL 3.3
19:17imirkin: alyssa: tess is GL 4.0 actually
19:18alyssa: well yes.
19:18alyssa: Either way
19:18zmike:pushes up glasses
19:18jekstrand: alyssa: Just give up, write a Vulkan driver, and make zmike figure out how to do TS/GS in compute shaders for you.
19:19zmike: jekstrand: https://assets.change.org/photos/5/oc/du/RIocDuTwtSZPDrW-800x450-noPad.jpg?1525434484
19:20alyssa: jekstrand: lol
19:20alyssa: we'll want to set geometryShader in vulkan too :(
19:20kherbst: jekstrand: nah.. compute is too fancy, just use a good ol VS
19:21alyssa: and tessellationShader apparently
19:22jekstrand: alyssa: Does ARM support either of those in Vulkan?
19:22alyssa: if vk.gpuinfo.org is trustworthy
19:26Lyude: does drm_bridge_attach() rely on any kind of locking?
19:38Lyude: tagr_: hm-looks like I might have actually just misread some code yesterday, the cadence bridge I thought was doing that was not actually doing that
19:41Lyude: oh, wow, looks like a lot of folks do not call drm_dp_aux_register() correctly though
19:42zmike: anholt: if you're interested I can check for more formats, I figured that was a decent selection though
19:42Lyude: including me, just now *goes to fix that*
19:44Lyude: actually, hm. no, I think docs are just wrong
19:55zmike: what are the gl_CurrentAttribFrag0MESA variables supposed to correspond to?
19:55zmike: just generic inputs?
19:57jekstrand32: Where are those coming from?
19:57zmike: I hacked up the trace driver to dump shaders as it goes
19:58imirkin: zmike: they're internal things
19:58imirkin: used to implement certain ff glsl features
19:58zmike: like what
19:58imirkin: like gl_CurrentAttrib :)
19:59imirkin: not sure - would have to rtfs
19:59zmike: I'm trying to rtfs now and it's predictably impenetrable like most of glsl
19:59imirkin: k, gimme a min
19:59jekstrand32: Give src/mesa/main/ff_fragment_shader.cpp a read
20:00zmike: ah yes, a light afternoon teatime read
20:02zmike: this is some rough indentation
20:02imirkin: looks basically like gl_TexCoord but potentially for other things
20:03zmike: I do have texcoord in this shader
20:03imirkin: to access those in frag shader
20:03imirkin: but in a more generic manner
20:03zmike: okay, so this looks like a generic sampling shader of some sort
20:03imirkin: which is convenient internally
20:03zmike: probably not a problem for me...
20:03imirkin: it's basically an array and you index in
20:04imirkin: so that you can do e.g.
20:04imirkin: and not worry about the proper way of accessing such a value in frag
20:04imirkin: (where foo = gl_CurrentAttribFragMESA)
20:05imirkin: looks like it's used in vertex shaders
20:06imirkin: hm. or not
20:06zmike: makes sense
20:06imirkin: but it's definitely based on vert attrib values
20:06imirkin: (which must be getting passed through in vertex shader)
20:07imirkin: with GL 1.x fixed function, you designated certain vertex attribs as "color" or "texture" or whatever
20:07imirkin: and this just provides a way of accessing that
20:07imirkin: i guess.
20:08imirkin: another way to look at it is that this code works and best not to touch it.
20:08zmike: not trying to touch it, just trying to see if I'm doing it right
20:17imirkin: anholt: in the future, please don't remove caps from nouveau caps lists
20:17alyssa: the defaults are there for good reason...
20:17imirkin: no, they're there for a bad reason. but i don't want to have this discussion again.
20:18imirkin: please just don't remove stuff from nouveau.
20:18anholt: imirkin: ok, can do
20:28jenatali: jekstrand: That null impl change apparently did end up biting us, our build automation for the compat pack we ship apparently had an old version of libclc that didn't have an implementation of that __builtin_has_hw_fma32 helper :P
20:29jekstrand: jenatali: :-P
20:29jekstrand: jenatali: Those patches make it super-slick to implement stuff like __builtin_has_hw_fma32, though.
20:29jenatali: How so?
20:30jenatali: Oh, I guess if you did have a truly extern function, yeah that makes sense
20:30jekstrand: jenatali: https://paste.centos.org/view/7c40285d
20:31jekstrand: jenatali: That's an implementation of some INTEL subgroup extensions and float atomics.
20:31jenatali: jekstrand: Nice, that is pretty slick
20:32jekstrand: It also makes it really easy to detect the difference between an empty function and a prototype so you don't accidentally stomp something app-defined.
20:33jenatali: Yeah, I think it was a good idea
20:40anholt: mareko: should having a screen ->finalize_nir be compatible with having variants in mesa/st?
20:42anholt: I'm getting a UAF during glUniform into uniform storage that was freed at create_vp_variant -> st_finalize_nir -> st_nir_assign_uniform_locations -> _mesa_add_sized_state_reference.
20:43anholt: original storage being accessed was from deserialize_glsl_program -> create_linked_shader_and_program -> ... _mesa_reserve_parameter_storage. The bug doesn't seem to happen without a disk cache hit.
21:25ajax: why is all the sw code so bad
21:32anholt: because if you're running it, something has gone horribly wrong already? (or you're CI, which, well, I'm the one who took a job eating bees)
21:39ajax: it seems like neither xmesa nor drisw actually make an st attachment for the front buffer
21:39ajax: which sorta ruins any test that thinks reading from GL_FRONT is a thing you can do
21:41airlied: oh I think I wrote code to readback from front once for glBlitFramebuffer
21:41airlied: but it all went horribly wrong
21:41ajax: at least with mitshm you could somewhat reasonably flip between two buffers?
21:42ajax: or, change it so the shm seg is the front buffer attachment, and just accept the cost of another alloc for the back
21:43airlied: the problems I ran into were accessing stuff from threads and deadlocks, I wonder if I still have those hacks anywhere
21:43ajax: or, dynamically allocate a front buffer attachment if you ever find yourself reading from it and GetImage the bits back on the spot
21:43ajax: https://paste.centos.org/view/raw/9252a12a <- what led to this line of questioning
21:45airlied: 2b676570960277d47477822ffeccc672613f9142 was part of it
21:46airlied: 0f5b1409fd2f9b26c45e750a37947d27c892ee60 was the other
21:46airlied: because we ended up doing GetImage from the llvmpipe rast threads and deadlocking
21:47ajax: ew, yeah
22:05zmike: anholt: did you want the entire conditional removed or just first part?
22:05anholt: The conditional key size thing
22:05anholt: just compare the whole key
22:23kiryl: Hi folks. I have a 4K monitor (Dell UP2414Q) that represents itself over DP MST as two 1920x2160 panels. It works mostly fine nowadays, except for vertical tearing on the border between the panels. Is there a way to get rid of it?
22:24kiryl: I use awesomewm without any compositors. I've tried to use picom with various options, but it doesn't help.
22:24seubz: Hi all, posting a super random question in the hopes someone has an idea. I'd like to display a Vulkan application running in a Docker container from Windows. I found a patched mesa release which allows me to do so using vcXsrv on Windows, and from what I can tell, it just modifies the Vulkan backend in mesa to detect vcXsrv and fake DRI3 support.
22:24seubz: Now, ideally, I would just like to use bleeding-edge mesa for this, but any Vulkan application complains about the lack of DRI3 support without the patches I mentioned. Is there a way to add a piece of configuration anywhere to advertise DRI3 support in this case (even if faked, as it seems to work by doing so)? Here is the patch in question:
22:26anholt: zmike: kicked off the arm build of your latest clamplers pipeline, thinking we could hit play on the various piglit runs (except freedreno) and see if the txd/txl ended up fixing anything.
22:26zmike: anholt: sweatytowelguy.jpg
22:27zmike: hopefully didn't break anything
22:28jenatali: seubz: Are you trying to run a hardware Vulkan driver from Mesa? Or just lavapipe? Are you trying to run it in a Windows container, or a Linux container (hosted on Windows)?
22:30seubz: jenatali: sorry I thought I mentioned that - I am using lavapipe, and this is a Linux container hosted on Windows WSL2. It does work when patching mesa based on changes I linked to.
22:32seubz: A better overview is here: https://github.com/gnsmrky/wsl-vulkan-mesa (he's not using Docker, that part doesn't matter)
22:34jenatali: Looks like it's just the 2 patches that need to be applied, should be easy enough to put them on top of latest
22:34seubz: I also don't think he has any desire to PR these changes, which makes sense as they're just a workaround, but I'm wondering if there is a way to properly support this today without source changes basically. I didn't realize lavapipe was so advanced already (I managed to test more advanced content than vkcube on it and was pleasantly surprised)
22:34seubz: Yes - my goal isn't to "make it work", it is more to make it work in a standardized way without carrying those patches around
22:35jenatali: Makes sense. Wish I could help but I know very little about X or how lavapipe interacts with it
22:36seubz: No worries, thanks! Maybe someone will see this and have an idea
22:42ajax: seubz: hm
22:44alyssa: kherbst: Is https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6064 NAK'd by you or...?
22:45alyssa: I realize it's not perfect but I -think- it's a clear improvement over master and unblocks OpenCL on Panfrost
22:45kherbst: alyssa: check what iris and nvc0 are doing
22:45kherbst: it's quite stupid but also very clever
22:45ajax: seubz: i could be convinced to just rip that warning out tbh
22:46alyssa: kherbst: ...Is that a yes or a no? :P
22:46kherbst: alyssa: I have both versions, the old and the new one
22:46kherbst: the new one has higher CPU overhead soo.. ugh
22:46ajax: i'm having difficulty thinking of a problem that it would diagnose, that isn't better diagnosed elsewhere
22:46kherbst: if somebody has a very nice and clever idea.. please
22:47alyssa: higher how?
22:47kherbst: because you loop over the inputs in a weird way
22:48ajax: hey awesome
22:48kherbst: alyssa: this is my alternative fix: https://gitlab.freedesktop.org/mesa/mesa/-/commit/6e52c6dfcc93aa801d1ac86796492fc325d997d2
22:48kherbst: for 32 bit GPUs that has to be a memcpy of 4
22:48kherbst: and you are done
22:48ajax: jekstrand: your fingerprints are on 3d8feb38 and i'm wondering whether that warning is worth it anymore
22:49alyssa: kherbst: I don't see how that works
22:49jekstrand: ajax ¯\_(ツ)_/¯
22:49kherbst: alyssa: how so?
22:49ajax: jekstrand: especially since lavapipe doesn't need dri3 at all
22:49kherbst: you get a pointer into the input buffer
22:49kherbst: the API is stupid that it is a uint32_t pointer
22:50kherbst: it should be void* instead
22:50kherbst: so clover gives you the pointer directly without having to copy values around or anything
22:50kherbst: it sucks, but it works
22:52seubz: ajax: sorry I was afk - I don't mind the warning itself, but would it make sense to warn and just continue without erroring out in that case?
22:53ajax: seubz: i think soldiering on would make sense, yes. i also think not bothering to check at all makes even more sense ;)
22:53alyssa: kherbst: Hm.
22:54kherbst: alyssa: yeah.. hence why I'm not caring about my MR
22:55seubz: ajax: well that would certainly work - I don't know enough about the other layers (or anything mesa-related really) to give a sensible opinion ;)
22:56ajax: put it this way
22:56kherbst: alyssa: I think the most straightforward way is to redeclare the current API to void* and specify that the size to write should be sizeof(ptr) of the current execution mode
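A minimal sketch of the idea kherbst describes, writing a device address into the kernel input buffer at the pointer width of the current execution mode (4 bytes on a 32-bit GPU, 8 on a 64-bit one). The function name `write_pointer`, the byte-buffer model of the input buffer, and the little-endian layout are all assumptions for illustration, not the clover or gallium API.

```python
import struct

def write_pointer(input_buf, offset, address, pointer_bits):
    """Store `address` into `input_buf` at `offset`, using the GPU's pointer width.

    Hypothetical helper: little-endian layout is assumed here.
    Returns the number of bytes written (4 or 8).
    """
    fmt = {32: "<I", 64: "<Q"}[pointer_bits]  # 4-byte or 8-byte unsigned
    struct.pack_into(fmt, input_buf, offset, address)
    return struct.calcsize(fmt)

buf = bytearray(8)
written = write_pointer(buf, 0, 0x12345678, 32)  # the "memcpy of 4" case
```

With an untyped buffer the caller only has to agree on the width, which is the point of moving from a uint32_t pointer to void*.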
22:57ajax: you get to that code by, if a particular extension is present, asking part of that extension whether displaying will work
22:57ajax: but if it's not going to work, you would just not advertise that you support that extension
22:58ajax: so it's only ever failing in a case that would otherwise work... so why warn
22:59seubz: Yep it makes complete sense
23:02seubz: Should I create a bug in https://gitlab.freedesktop.org/mesa/mesa/-/issues?
23:02ajax: yes please!
23:02seubz: Cool cool, thank you!
23:10kiryl: I've filed an issue about the tearing issue I described above. Any help will be appreciated: https://gitlab.freedesktop.org/drm/intel/-/issues/3093
23:14imirkin: kiryl: it's a tricky issue generically
23:15imirkin: the two screens are independent
23:15imirkin: and i don't know if we allow frame-locking on most hw (or if that's even a thing)
23:16ajax: i've wanted drm to do soft genlock for approximately forever
23:16kiryl: it's solved somehow under Windows
23:16imirkin: ajax: is it a thing in e.g. intel hw?
23:17ajax: imirkin: i think it's possible to get two dp streams on a single connector to be line-aligned when transmitted, yeah
23:19ajax: what i meant by "soft genlock" there was just stretching the vblank interval gently on the two connectors until their vsync interrupts are within like 0.1ms of each other
23:19ajax: more of a vga idea, but
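The "soft genlock" ajax describes can be sketched as a small simulation: each frame, the connector that is ahead has its vblank stretched slightly until the two vsync times are within the tolerance. This is only a toy model under stated assumptions; the function name, the 50 µs per-frame stretch, and the 60 Hz period are made up for illustration, and only the roughly 0.1 ms target comes from the discussion.

```python
def soft_genlock_frames(offset_us, period_us=16667, max_stretch_us=50, tolerance_us=100):
    """Return how many frames it takes to bring two vsyncs within tolerance.

    offset_us: how far connector B's vsync lags connector A's.
    Works in integer microseconds to avoid floating-point drift.
    """
    # vsync is periodic, so take the shorter way around the cycle:
    # stretch whichever connector is ahead by the smaller amount.
    error = offset_us % period_us
    if error > period_us / 2:
        error = period_us - error

    frames = 0
    while error > tolerance_us:
        # Stretch the leading connector's vblank a little this frame.
        error -= min(max_stretch_us, error)
        frames += 1
    return frames

frames_needed = soft_genlock_frames(1000)  # vsyncs start 1 ms apart
```

Keeping the per-frame stretch small is what makes this "gentle": the monitor only ever sees a slightly longer blanking interval rather than a mode change.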
23:19imirkin: ajax: i thought that DP streams were completely independent
23:19imirkin: the DP "cable" is split into buckets, and buckets are assigned to streams
23:20imirkin: sort of like token ring (iirc)
23:20imirkin: so each picture stream is completely separate
23:20ajax: but if that's true, then the first 16 pixels (or whatever) in the top-left corner of stream one happen at some time A
23:20imirkin: and it should be no easier/harder to do it over dp-mst vs 2 hdmi connectors
23:20ajax: and in stream 2 at time B
23:21imirkin: sure, but the buckets are interleaved
23:21ajax: and while you _could_ pack those totally independently into the bitstream, you could also choose not to
23:21imirkin: so you can never get true frame lock, but ... at those latencies you wouldn't care
23:22imirkin: actually i guess sonet is the better analogy
23:22ajax: yeah. if their interleaving happens to be on corresponding lines then good enough. if one is on line 0 while the other is on line 400 that's choosing to do poorly, so i have to believe it's possible to choose to do well
23:23imirkin: iirc there are 64 "buckets", and the bits just go in bucket order
23:23imirkin: but they're independent
23:23imirkin: so each crtc decides to send bits whenever it feels like it
23:24imirkin: the buckets are dedicated to it whenever it wants to do it.
23:24imirkin: somehow you have to tell the crtc's to actually start at the same time (and not have drifting clocks)
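A simplified model of the "buckets" picture imirkin gives above: a fixed table of 64 time slots, some assigned to each stream, walked in slot order on every link cycle, with each stream filling its own slots from wherever it currently is in its frame. The names, the slot counts, and the "link cycle" term are illustrative assumptions; real DP MST details (headers, payload allocation) are left out.

```python
SLOTS_PER_CYCLE = 64  # the "64 buckets" recalled in the chat

def build_slot_table(slots_per_stream):
    """Assign consecutive slots to each stream; leftover slots stay idle (None)."""
    if sum(slots_per_stream.values()) > SLOTS_PER_CYCLE:
        raise ValueError("streams need more slots than the link provides")
    table = []
    for stream, count in slots_per_stream.items():
        table.extend([stream] * count)
    table.extend([None] * (SLOTS_PER_CYCLE - len(table)))
    return table

def transmit(table, queues, cycles):
    """Walk the slot table in order; each slot pulls from its own stream's queue."""
    wire = []
    for _ in range(cycles):
        for owner in table:
            if owner is not None and queues[owner]:
                wire.append((owner, queues[owner].pop(0)))
    return wire

table = build_slot_table({"A": 2, "B": 2})
# Stream A starts at line 0 while stream B is already at line 400:
wire = transmit(table, {"A": [0, 1, 2, 3], "B": [400, 401, 402, 403]}, cycles=2)
```

In this model the slot table itself does nothing to align the two streams; whether they end up on corresponding lines depends entirely on when each source starts, which is the synchronization problem being discussed.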
23:24vsyrjala: it should be possible on intel hw, just no one has seriously tried to implement it
23:25vsyrjala: we just have the hw port sync mode for now. and that only works with sst
23:25imirkin: vsyrjala: cool. i'm sure it's possible on nvidia hw, i just have no idea how. there were cables you could use in the bad old days, but it's internal in the modern hw somehow
23:25ajax: it's still external sometimes
23:27airlied: amd have genlock, not sure it's supported and intel from broadwell maybe?
23:27imirkin: well, DP-MST is probably bdw+ anyways
23:27imirkin: (actually hm, i might have had it on hsw)
23:27airlied: maybe haswell got the sync stuff
23:28airlied: and ivb didn't
23:31kiryl: I run the monitor from hsw
23:31imirkin: ok, so hsw clearly had MST :)
23:32airlied: yeah I developed support for those monitors on hsw
23:32airlied: and/or maybe ivb
23:34vsyrjala: bdw has port sync. hsw does not. but we only implement it for skl+
23:34vsyrjala: but it's sst only. so won't help with any mst stuff
23:34imirkin: i was so excited when it gained dp audio over mst. only to find out that it was totally useless on the dell monitors i had.
23:35imirkin: kiryl: btw, which nvidia chip do you have?
23:37imirkin: yeah, GP104 is what i was looking for
23:37imirkin: just wanted to see if there was any locking that was obvious in the nvidia-published headers
23:37imirkin: but it may not be in those methods at all
23:37imirkin: and instead configured in a different way
23:41imirkin: nothing extremely obvious.
23:42vsyrjala: of course no amount of genlocking will fix the fact that xorg isn't using the atomic uapi. so page flips can still tear
23:44imirkin: shhhh, don't inject reality into the dream!
23:44vsyrjala: occasionally i ponder about hiding the tiles from userspace entirely. wouldn't actually be much different from the i915 bigjoiner stuff, except we would actually have multiple encoders and connectors to deal with
23:45imirkin: well, the TILE stuff is used by the generic dp stuff, while bigjoiner is an i915 thing i think?
23:45imirkin: although i think there's some precedent for this in other places too, e.g. dual-LVDS or whatever
23:48vsyrjala: at least for us dual link lvds/dsi stuff is handled by the hw to some degree. and now there's this edp mso thing too which is a bit similar