06:51mlankhorst: bv /12
12:15tzimmermann: jani, vsyrjala, if there are no further comments from your side, i'd like to merge https://lore.kernel.org/dri-devel/20251003032303.16518-1-chintanlike@gmail.com/ soonish. I'm getting CI reports about this problem, which i'd like to get resolved
12:15tzimmermann: i'll remove that final sentence
12:18sima: demarchi, done a bit of PR processing, no maintainer-ack checks fired
12:18sima: so looking much better thus far
12:18sima: mripard, bbrezillon ^^
12:19sima: oops, actually don't yet have your revert and new implementation, so yeah that silence was expected :-/
12:25mlankhorst: mripard: Perhaps we should change the template to remove the /assign so we get cc'd at least?
12:51bbrezillon: sima: not too sure what I should be looking at :D
12:56vsyrjala: tzimmermann: i guess no one came up with a real fix? eg. bump the timeout a bit and/or increase the priority of whatever thread does the vblank signalling?
12:57tzimmermann: vsyrjala, not yet. i've been thinking about these things, but the core problem would remain
12:57tzimmermann: i've never seen these errors on local machines. i assume that the CI systems are just really busy
12:59vsyrjala: we have quite a few of those calls in i915 so the backtrace can be really useful to find out where it's coming from
13:00vsyrjala: maybe it should return the error to the caller, and then we could keep the warn in i915 code?
13:00tzimmermann: ok
13:01tzimmermann: there's presumably an irq-mode for the hrtimer. that might make it look more like a real vblank irq. https://elixir.bootlin.com/linux/v6.17.4/source/include/linux/hrtimer.h#L32
13:02tzimmermann: maybe we could try this
13:18tzimmermann: i'll give this some testing
13:36sima: bbrezillon, false alarm :-D
13:52lileo: mupuf, following up from xdc: did disabling psr help with the underflow at all?
13:56mupuf: lileo: I had forgotten what I was supposed to test! I'll try tonight and let you know in a week :)
14:42alyssa: ishitatsuyuki: clipdist is a mess and there's like 4 forms of it and idk just copy honeykrisp tbh.
14:43alyssa: nir_intrinsic_component() & nir_intrinsic_io_semantics()::location have what you need
14:43alyssa: and possibly also the offset[] argument which is like the first source I think
14:43alyssa: grep for "compact_arrays" also
14:43alyssa: (with compact arrays, the indexing for clipdist is in components and not vec4s which is what you want but means special casing.)
14:45Lynne: "A single structure or statically-sized array must still be less than 4GB, in part to avoid needing 64-bit Offset/ArrayStride decorations", useful
14:46Lynne: once again proving everyone should use BDAs
14:56alyssa: Lynne: you're not wrong.
14:58Lynne: I do wish 64-bit arithmetic was faster on modern GPUs, it's a real overhead
14:58karolherbst: ohh nvidia solved this problem, it's great
14:59karolherbst: load/store instructions can do a 64 + 32 or 64 add as part of it, so you can load a 64 bit base, compute a 32 bit offset and off you go (and most other vendors have similar things)
15:00karolherbst: but yeah.. that relies on the fact that you can prove you do have offsets fitting within 32 bits
15:05Lynne: I think AMD solved that too on 9700 cards
15:05Lynne: by having 64-bit ops rather than needing 2 32-bit ops+carry
15:08Lynne: karolherbst: static offset or offset from a register?
15:08karolherbst: both
15:08karolherbst: it's an iadd3, but one register needs to be uniform and the other not
15:08karolherbst: and then you have a constant offset on top
15:09karolherbst: and on shared memory you also get a shift (1, 3 or 4) for free
15:09karolherbst: ehh (0, 2 or 3)
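The overhead discussed above can be sketched in plain C++. This is a minimal model, not any vendor's ISA: `add64_via_32` shows a 64-bit add built from two 32-bit adds plus a carry, and `effective_address` models the "64-bit base + 32-bit offset + constant" form as ordinary integer arithmetic. Both function names are hypothetical, and treating the 32-bit offset as zero-extended is an assumption the chat does not confirm.

```cpp
#include <cassert>
#include <cstdint>

// Emulate a 64-bit add using only 32-bit adds: two adds plus carry propagation.
constexpr std::uint64_t add64_via_32(std::uint64_t a, std::uint64_t b) {
    const std::uint32_t a_lo = static_cast<std::uint32_t>(a);
    const std::uint32_t a_hi = static_cast<std::uint32_t>(a >> 32);
    const std::uint32_t b_lo = static_cast<std::uint32_t>(b);
    const std::uint32_t b_hi = static_cast<std::uint32_t>(b >> 32);

    const std::uint32_t lo = a_lo + b_lo;             // wraps modulo 2^32
    const std::uint32_t carry = lo < a_lo ? 1u : 0u;  // a wrap means a carry out
    const std::uint32_t hi = a_hi + b_hi + carry;

    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Model of the addressing form from the chat: 64-bit base + 32-bit offset +
// constant. Assumption: the offset is zero-extended before the add.
constexpr std::uint64_t effective_address(std::uint64_t base,
                                          std::uint32_t offset,
                                          std::uint32_t imm) {
    return base + offset + imm;
}

// Small self-check showing how the two helpers are meant to be used.
inline void check_address_math() {
    assert(add64_via_32(0xFFFFFFFFull, 1ull) == 0x100000000ull);
    assert(effective_address(0x100000000ull, 16u, 4u) == 0x100000014ull);
}
```

If the offset is provably below 2^32, only the final add needs 64-bit width, which is the condition karolherbst mentions.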
15:09lileo: mupuf, adding `amdgpu.dcdebugmask=0x810` to kernel cmdline will disable psr and some additional idle power saving features. Curious if that helps with the underflow
15:09lileo: There's also a gitlab issue tracking this https://gitlab.freedesktop.org/drm/amd/-/issues/4463#note_3162682
15:09tursulin: tzimmermann: is drm-misc-fixes open for fixes at the moment? such as https://lore.kernel.org/dri-devel/20251021160951.1415603-1-akash.goel@arm.com/
15:09Lynne: karolherbst: oh, it's exactly x86's addressing, nice
15:09lileo: mupuf, if you could also put your memory info there, that'll help as well
15:12mupuf: lileo: Want me to open an issue in drm/amdgpu? Want me to tag you there?
15:12karolherbst: yeah.. I have patches to wire it up
15:14lileo: mupuf: no need, just add your response to the form that Mario created: https://gitlab.freedesktop.org/drm/amd/-/issues/4463#note_3162681 (i linked the wrong comment originally, oops!)
15:16lileo: (IOW an issue already exists, no need to create another :) just add your info there)
15:16alyssa: Lynne: my favourite unhinged x86 addressing trick is that `*((void*)(uintptr_t)(x << 2))` is a single instruction
15:16alyssa: with the 64-bit shift "for free"
15:17alyssa: which gives a limited form of tagged pointers with 0 overhead on x86
15:17alyssa: :clown:
15:17Lynne: wait, what? you're dereffing a void *?
15:18alyssa: s/void/int/ or something idk
15:19karolherbst: sadly nvidia doesn't have a 64bit shift, because the shift only exists on shared 🙃 but yeah... `*(ubase << 2 + offset + constant)` is a single thing
15:19Lynne: not sure where that's useful outside of a kernel or direct physical memory ops
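A minimal sketch of the tagged-pointer idea from the exchange above, assuming 64-bit pointers and 4-byte-aligned `int` objects. The helper names (`pack_tagged`, `tag_of`, `load_tagged`) are hypothetical. The pointer is stored as `address >> 2`, which frees the top two bits for a tag; the `x << 2` in the load then discards the tag and restores the address. Whether the shift really folds into a single scaled-index load depends on the compiler and target.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(std::uintptr_t) == 8, "sketch assumes 64-bit pointers");

// Store a 4-byte-aligned pointer as (address >> 2) with a 2-bit tag in the
// top two bits. The later left shift by 2 pushes the tag out of the word.
inline std::uint64_t pack_tagged(const int* p, unsigned tag) {
    const std::uint64_t addr = reinterpret_cast<std::uintptr_t>(p);
    assert((addr & 3u) == 0);  // the low two bits must be zero to survive >> 2
    return (addr >> 2) | (static_cast<std::uint64_t>(tag & 3u) << 62);
}

// Read the tag back without touching memory.
inline unsigned tag_of(std::uint64_t x) {
    return static_cast<unsigned>(x >> 62);
}

// The load from the chat: shifting left by 2 drops the tag and rescales the
// stored value back into the original address.
inline int load_tagged(std::uint64_t x) {
    return *reinterpret_cast<const int*>(static_cast<std::uintptr_t>(x << 2));
}
```

Usage would look like `int v = 42; auto x = pack_tagged(&v, 3);` followed by `load_tagged(x)` and `tag_of(x)`.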
15:20tzimmermann: tursulin, of course. drm-misc-fixes is always open and goes towards upstream each week. that specific patch should either have a fixes tag or explain why it doesn't.
15:20tursulin: tzimmermann: See my reply, I pasted a Fixes: tag there.
15:20tursulin: Okay if I push it with that added or you want to?
15:21tzimmermann: ok, all good then
15:21tzimmermann: please do
15:21tursulin: ty!
15:51demarchi: sima: instead of marge, what about enabling a ff-with-merge in gitlab for this simple repo?
15:51demarchi: (don't really remember the name of that setting, but used that a long time ago in another project...)
15:53demarchi: sima: this one... https://docs.gitlab.com/user/project/merge_requests/methods/#merge-commit-with-semi-linear-history
15:54demarchi: (I can't change the settings in maintainer-tools repo though)
17:13alyssa: ../src/gallium/drivers/d3d12/d3d12_context_graphics.cpp(745): error C3861: '__typeof__': identifier not found
17:13alyssa: MSVC nooo
17:14alyssa: is typeof in C but not C++? t.t
17:14alyssa: I guess.. decltype? is the standard C++ version?
17:18alyssa: Can someone who knows about MSVC and/or C++ sanity check https://rosenzweig.io/0001-fixup.patch ? thanks
17:38HdkR: Seems reasonable enough :D
17:39alyssa: thx
18:29DPA: alyssa: __typeof__ is a compiler extension. typeof exists since C23.
18:30alyssa: DPA: ...lovely.
18:30alyssa: patches welcome once this lands, I don't feel like experimenting with every compiler version in CI any more right now
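For the `__typeof__` question above, a minimal sketch of the standard C++ spelling. This is not the content of the linked fixup patch; `scale` is a hypothetical function used only to show `decltype` standing in for the GNU extension. One point to watch when porting: `decltype` of a plain variable name gives its declared type, while `decltype` of a parenthesised lvalue gives a reference.

```cpp
#include <cassert>
#include <type_traits>

inline int scale(int value) {
    // GNU extension, rejected by MSVC in C++ mode:
    //   __typeof__(value) copy = value;
    // Standard C++11 spelling:
    decltype(value) copy = value;

    // decltype of an unparenthesised name yields the declared type...
    static_assert(std::is_same<decltype(copy), int>::value,
                  "decltype(name) is the declared type");
    // ...while a parenthesised lvalue expression yields a reference.
    static_assert(std::is_same<decltype((copy)), int&>::value,
                  "decltype((name)) is an lvalue reference");

    return copy * 2;
}
```

When the operand is an expression rather than a name, wrapping it in `std::remove_reference` is one way to get a plain value type.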
19:21mupuf: lileo: thanks. I've added a comment and answered on the form. I'll add the dcdebugmask parameter now
19:24lileo: awesome, thx!
20:10alyssa: compiler that's not O(N^2) challenge level impossible
20:33alyssa: dschuermann: pixelcluster: any thoughts on "Fast liveness checking for SSA-form programs" by Boissinot et al?
20:34alyssa: We can't/shouldn't? use it in backends since it requires def-use chains and the cure there is probably worse than the disease
20:34alyssa: but it should be implementable in NIR
20:34alyssa: whether it's *useful* in NIR I'm less sure