IRC Logs of #dri-devel on irc.freenode.net for 2023-11-03

08:30 MrCooper: Company: the stride alignment requirement of current AMD GPUs applies to any format
08:30 MrCooper: argh, he just left
15:20 karolherbst: robclark: any idea what's wrong with this? hardware fault? https://gitlab.freedesktop.org/mesa/mesa/-/jobs/51155473
15:22 daniels: yeah, it looks like the hardware just died
15:36 robclark: karolherbst: looks like a630-traces died 3 times in a row the same way, so I guess there is something wrong with the MR?
15:37 robclark: different runners each time too
15:37 robclark: so looking like a legit problem
15:41 robclark: karolherbst, daniels: err, hmm.. actually looking at the raw results.. piglit still claims "pass" despite freedoom gpu hang.. so maybe infra problem?
15:42 robclark: friday evening in europe, I guess that is the normal time for infra problems
15:43 daniels: oh, I see what's happened
15:44 daniels: 23-11-03 14:37:10 R SERIAL-CPU> hwc: mesa: pass
15:44 daniels: that's the last thing in the init script before it sleeps
15:44 daniels: bm looks for 'hwci: mesa: pass' to terminate
15:45 daniels: maybe we should just echo that out like 3x to make sure it gets through
15:45 robclark: heh, the earlier ones: hwci:��sa: pass
15:45 robclark: and: hwc: mesa: pass
15:45 daniels: uart \o/
15:46 robclark: how does it manage to drop/garble a character three times in a row, exactly there!
15:47 robclark: but yeah, maybe just echo it a few times at the end
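A minimal sketch of the failure discussed above, assuming a simplified watcher that scans serial lines for the exact marker text. `MARKER` and `find_result` are hypothetical names for illustration, not bm's actual code. It shows why a garbled copy is missed and why echoing the line several times helps.

```python
# Hypothetical stand-in for the serial watcher; this is not the real bm code.
MARKER = "hwci: mesa: "

def find_result(serial_lines):
    """Return the text after the first intact marker, or None if none is found."""
    for line in serial_lines:
        pos = line.find(MARKER)
        if pos != -1:
            return line[pos + len(MARKER):].strip()
    return None

# The garbled variants quoted in the log never match the exact marker.
garbled = [
    "23-11-03 14:37:10 R SERIAL-CPU> hwc: mesa: pass",
    "R SERIAL-CPU> hwci:\ufffd\ufffdsa: pass",
]

# Echoing the line several times means one clean copy is enough.
repeated = garbled + ["R SERIAL-CPU> hwci: mesa: pass"]

print(find_result(garbled))   # None
print(find_result(repeated))  # pass
```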
15:53 daniels: I wonder if we should make bm use SSH for console like LAVA does now
15:56 daniels: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26032
15:56 gfxstrand: daniels: FYI: The NAK MR is literally just waiting for meson 1.3.0rc2 to be tagged. As soon as that tag exists and propagates to pip, I can make a 2-line change and it should go through fine.
15:56 gfxstrand: But I don't control that. I'm hoping they tag today.
15:57 daniels: gfxstrand: sure thing, do you need anything from me?
15:59 gfxstrand: Nope
15:59 daniels: coolio
15:59 gfxstrand: If another container build fails, I'll poke you. Otherwise, I'm just waiting for the tagl.
15:59 daniels: bonne chance
15:59 gfxstrand: *tag
17:04 jenatali: gfxstrand: Any more comments on !25998?
17:12 robclark: gfxstrand: is the meson dependency bump _only_ if you are building nvk? Or across the board?
17:12 gfxstrand: robclark: Only for NVK
17:12 gfxstrand: robclark: But it means we need to bump the version in CI
17:13 robclark: coolio.. was just wondering if I'd need to uprev meson in CrOS but sounds like I dodge that bullet :-P
17:13 gfxstrand: Not unless you're shipping NVK on an NVIDIA chromebook you haven't told me about. 😅
17:14 robclark: well, at least for mesa-freedreno uprev, I don't need to enable nvk :-P
17:15 gfxstrand: So you're saying there is an NVK chromebook? :P
17:16 gfxstrand: If so, don't you think that's a tad presumptuous? I mean, Fedora did it so why not google. ¯\_(ツ)_/¯
17:17 robclark: well, the thing with the nv dGPU is no more... but if there ever is something w/ nv in the future, I think most folks here would strongly prefer to ship nvk ;-)
17:41 Calandracas: Trying to compile mesa with llvm17, build is failing on gallium/auxiliary/gallivm/lp_bld_init.c because llvm-c/Transform/Scalar.h does not exist. It appears that the C bindings for some of the "Transforms" headers have been removed
17:55 DavidHeidelberg: Calandracas: I have a feeling I saw an MR which addresses this issue
17:56 Calandracas: Hum I'll take a look
17:58 Calandracas: Probably this one https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23827
18:15 Company: is there a recommended way to measure libGL overhead in a profiler so I can optimize the right stuff? When I run stuff - even with mesa_glthread=false exported - most of the work ends up in a thread
18:15 Company: and then the flamegraph isn't useful
18:19 robclark: if the driver supports threaded ctx (which iris and radeonsi do) then add `GALLIUM_THREAD=0`
18:26 Company: that *and* mesa_glthread=false seems to do the trick
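A small sketch of launching a program with both settings mentioned above, so that driver work stays on the calling thread while profiling. The helper names `unthreaded_mesa_env` and `run_unthreaded` are hypothetical; only the two environment variables come from the conversation.

```python
import os
import subprocess

def unthreaded_mesa_env(base_env=None):
    """Copy an environment and add the two settings from the discussion."""
    env = dict(os.environ if base_env is None else base_env)
    env["GALLIUM_THREAD"] = "0"      # disable the gallium threaded context
    env["mesa_glthread"] = "false"   # disable glthread
    return env

def run_unthreaded(command):
    """Run `command` (a list of arguments) with both settings applied."""
    return subprocess.run(command, env=unthreaded_mesa_env(), check=False)

# Example (not executed here; "./my-gl-app" is a placeholder):
# run_unthreaded(["perf", "record", "-g", "./my-gl-app"])
```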
18:30 Company: https://i.imgur.com/cGUK7NC.png
18:30 Company: seems the magic is batching which will require tracking texture usage across commands, and that's all about using an atlas for all the icons and glyphs so it's the same texture everywhere
18:31 Company: interesting because Vulkan doesn't seem to need that, vkCmdDraw has a way smaller overhead
18:45 robclark: binding new texture state between every draw is probably not great.. radeonsi does support bindless textures which would make things more like vk (but iris and others do not)
18:52 Company: yeah, I'm using a glSampler2D textures[N_TEXTURES]; array and now basically doing 3 "levels" - one doing constant expression texture accesses (read: large if statement), one doing dynamically uniform texture accesses and one doing bindless/nonuniform
18:52 Company: with the nonuniform stuff I can do 1 draw call for everything
18:53 Company: with the dynamically uniform I can do batch draw calls as long as they use the same texture
18:53 Company: and with constants I need to setup texture units before each draw call with different textures
18:54 Company: just need to get it right...
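A rough sketch of the "dynamically uniform" level described above, assuming draws are recorded as (texture, item) pairs in submission order: consecutive draws that share a texture collapse into one batch. `batch_by_texture` and the sample data are hypothetical. With a single atlas every draw shares one texture, so the whole list becomes one batch.

```python
from itertools import groupby

def batch_by_texture(draws):
    """Merge consecutive draws that use the same texture into one batch.

    `draws` is a list of (texture_id, item) pairs in submission order.
    Returns a list of (texture_id, [items]) batches; order is preserved.
    """
    return [
        (texture, [item for _, item in group])
        for texture, group in groupby(draws, key=lambda d: d[0])
    ]

draws = [
    ("atlas", "glyph a"),
    ("atlas", "glyph b"),
    ("icon", "close"),
    ("atlas", "glyph c"),
]
batches = batch_by_texture(draws)
print(len(draws), "draws ->", len(batches), "batches")  # 4 draws -> 3 batches
```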
18:57 robclark: I'd dynamically index the texture array over if/else ladder
19:01 Company: yeah, that's what I'm doing
19:01 Company: but it's a bit of a question how many entries I want to put there
19:04 robclark: I think max is 32.. put as many as you can if it avoids state changes
19:12 Company: I think I read the spec and it says "has to be at least 32", but may be as much as GL_MAX_SOME_CONSTANT
19:15 Company: it's https://opengl.gpuinfo.org/displaycapability.php?name=GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS
19:17 Company: unsurprising to no one, the largest number in there is reported by zink
19:41 robclark: I'm not sure zink should be reporting more.. or at least: #define PIPE_MAX_SAMPLERS 32
19:42 robclark: oh, hmm.. but per-stage vs cross-shader-stage
19:43 robclark: Company: you probably want to look at GL_MAX_TEXTURE_IMAGE_UNITS or the various per-stage ones
19:45 Company: ah yes
19:45 Company: wrong variable again
19:46 karolherbst: Kayden: are there any good reasons why the timestamp query is restricted to 36 bits in iris?
19:52 karolherbst: it causes tests to flake randomly so I was wondering if we could just unrestrict it to 64 bit
19:52 karolherbst: or some other value
20:36 jenatali: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/51171600 :(
20:38 mriesch: CounterPillow: just read the conversation you had a while ago about LED strips as panels... what approach did you pick in the end?
20:39 CounterPillow: none, didn't continue the project as I ran out of energy for it for now
20:46 mriesch: CounterPillow: oh i see...
20:55 mriesch: would it be reasonable to conclude that a framebuffer/panel approach would be the way to go?
21:11 CounterPillow: yeah, panel is the way to go.
21:12 CounterPillow: Though, refresh rate is a bit fiddly to figure out since that depends on your SPI controller speed and the maximum speed the SPI chain LEDs can work at.
21:37 mriesch: CounterPillow: afaik there is fbdev panel and DRM panel, right? any thoughts on this choice?
23:04 cwabbott: gfxstrand: ok, it turns out that I was lucky, the NaN canonicalization while copying was entirely due to sample() and if I use texelFetch() instead and be careful about using a half-reg for f16 formats (to avoid the f16->f32->f16 roundtrip also canonicalizing) then I can avoid any of it
23:05 cwabbott: it'll mean we can't use the special 2d path that always does sample(), but that's probably fine
23:05 cwabbott: crisis averted
23:09 cwabbott: ughhhh some r16_unorm tests are still failing
23:09 cwabbott: snorm works but not unorm?!?
23:17 cwabbott: nvm, my bad
23:22 cwabbott: dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.color.2d_to_2d.* all pass now \o/
23:24 Kayden: karolherbst: the 36-bit thing has been there since 2012...but I'm not sure how much sense it actually makes these days. f0159018d7709b57d9916575512d75cb3f2fb395 ... the thinking was that opengl programs used GL_QUERY_COUNTER_BITS to assume things rolled over at a power of two, and so when we scaled the 32-bit timestamp value by the tick frequency to get actual nanoseconds, it'd roll over at...some other time. so we tried to find something workable
23:24 Kayden: but nowadays the time scaling is done by various fractional things and so I'm not sure it works out anyway
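A worked sketch of the rollover concern described above. The 80 ns tick period is an illustrative assumption, not a value stated in the log, and the helper names are hypothetical. It shows that a scaled 32-bit tick count does not wrap at a power of two, and that a client must use the reported counter width when computing elapsed time across a wrap.

```python
COUNTER_BITS = 36   # reported counter width discussed above
RAW_BITS = 32       # width of the raw timestamp value mentioned above
TICK_NS = 80        # ASSUMPTION: illustrative tick period, not from the log

def mask(bits):
    return (1 << bits) - 1

def raw_to_ns(raw_ticks):
    """Scale a raw tick count to nanoseconds, truncated to COUNTER_BITS."""
    return ((raw_ticks & mask(RAW_BITS)) * TICK_NS) & mask(COUNTER_BITS)

def elapsed_ns(start_ns, end_ns, bits=COUNTER_BITS):
    """Difference of two timestamps, assuming they wrap at 2**bits."""
    return (end_ns - start_ns) & mask(bits)

# The scaled raw value wraps after 2**32 * 80 ns, which is not a power of two.
raw_period_ns = (1 << RAW_BITS) * TICK_NS
print(raw_period_ns & (raw_period_ns - 1) == 0)   # False

# Under the assumed tick it is a whole multiple of 2**36, so truncating to
# 36 bits yields a value that wraps cleanly at a power of two.
print(raw_period_ns % (1 << COUNTER_BITS))        # 0

# Elapsed time across a wrap is only right with the matching bit width.
start = mask(COUNTER_BITS) - 99   # 100 ns before the 36-bit wrap
end = 50                          # 50 ns after the wrap
print(elapsed_ns(start, end))             # 150
print(elapsed_ns(start, end, 64) == 150)  # False
```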
23:25 karolherbst: yeah...
23:25 karolherbst: makes sense
23:25 karolherbst: all I can say is, that I have CL CTS flakes caused by that
23:25 Kayden: hmm, actually it looks like it may be a 36-bit signal nowadays...
23:31 karolherbst: Kayden: okay, so inside a shader there is just 36 bit of precision anyway?
23:32 Kayden: oh, inside a shader........as I understand it our timestamp register used for ARB_shader_clock is pretty much garbage and doesn't work most of the time anyway
23:32 karolherbst: pain
23:33 karolherbst: In CL there is no way to fetch the timestamp in a kernel anyway, I was just curious because I thought `signal` is a shader thing on intel
23:33 Kayden: I don't remember the details - curro did a lot of work to measure instruction timings using it. he would probably remember the details of how it's broken, but it was a pretty huge list
23:34 Kayden: I think it might've reset on thread switches or power management events or something absurd like that
23:34 karolherbst: uhh
23:34 karolherbst: annoying...
23:34 karolherbst: well
23:34 Kayden: I think the one you get via PIPE_CONTROL or the Timestamp register (read from the command stream) is a lot more reliable
23:35 karolherbst: all I use are `get_timestamp` and PIPE_QUERY_TIMESTAMP
23:35 karolherbst: PIPE_CONTROL? that's something I've never heard of
23:36 karolherbst: ohh that's crocus/iris internal naming
23:50 Kayden: yeah, it's the GPU command that gives you a pipelined timestamp read
23:51 Kayden: hmm, looks like we get devinfo->timestamp_freq from i915 these days...which reads it from a deprecated register
23:51 Kayden: lovely
23:53 Kayden: oh, nvm, it does read the right one...eventually