09:59fomys: (moved from #freedesktop)
09:59fomys: Hi,
09:59fomys: I am the new VKMS maintainer, and I want to commit this: https://lore.kernel.org/all/20241014145009.3360515-1-mairacanal@riseup.net/
09:59fomys: I just added my Acked-by yesterday.
09:59fomys: How long should I wait before committing?
09:59fomys: How should I commit? From the documentation I found this process, am I missing something?
09:59fomys: - dim checkout drm-misc-next
09:59fomys: - dim b4-shazam <mail-id>
09:59fomys: - dim checkpatch <series>
09:59fomys: - dim sparse <series>
09:59fomys: - dim push-branch drm-misc-next
09:59fomys: Thanks,
09:59fomys: Louis Chauvet
10:43mupuf: fomys: 24h is definitely enough
10:43mupuf: can't comment on how to merge... it's been too long since I last pushed anything with dim
10:44fomys: Thanks for confirming the delay; I will wait for someone else to confirm the push process. Thanks!
12:36mripard: fomys: dim ub, dim b4-shazam, dim push. And if you think the doc could be better, please send a doc update after it :)
14:29fomys: I just pushed it, I hope I did things correctly!
15:15fomys: On my VKMS work, I just ran the CI to see how it works / fix the broken tests, and I have a few questions:
15:15fomys: - Is there an easy way to say "please run vkms:none and all its dependencies"? I don't need the whole pipeline to run, only the vkms dependencies.
15:15fomys: - On the results (https://gitlab.freedesktop.org/louischauvet/kernel/-/jobs/65883946), if I understand correctly I need to mark some tests as Pass in the corresponding vkms-none-fails.txt (UnexpectedImprovement on kms_cursor_crc)
15:15fomys: - On the same results, I have timeouts on kms_plane@pixel-format; is there a way to increase the timeout? As vkms will support many new formats, I don't think 60s will be sufficient.
15:15fomys: - Do I have to do anything with the flaky tests? I don't understand whether kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic is passing or not in this run
15:15fomys: - Is there a way to run the tests on a different architecture? As VKMS does a lot of buffer manipulation, I would like to test at least on arm64.
15:53mripard: fomys: afaics, you did. and fyi, b4 ty is super helpful if you use b4-shazam
15:54fomys: Yes, I know, I already use b4, very nice tool to not be lost in my series :)
17:37dbrouwer: fomys: about the CI: 1) to run vkms:none and all its dependencies you can use the ci_run_n_monitor.sh tool in Mesa; it works for other GitLab pipelines too. Run it like this:
17:37dbrouwer: ./bin/ci/ci_run_n_monitor.sh --pipeline-url https://gitlab.freedesktop.org/louischauvet/kernel/-/pipelines/1300934 --target vkms:none
17:37dbrouwer: 2) Fix the UnexpectedImprovements, e.g. kms_cursor_crc@cursor-rapid-movement-128x128, by just removing them entirely from the *fails.txt (example below); you don't have to mark them as Pass
17:37dbrouwer: 3) The timeouts are not coming from the CI but from somewhere in IGT, I think
17:37dbrouwer: 4) For KnownFlakes, e.g. kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic, you don't have to do anything; just leave them, they count as a pass for the CI
17:37dbrouwer: 5) To run the tests on arm64 you would need to write a new ci job
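For dbrouwer's point 2: clearing an UnexpectedImprovement just means deleting its line from the expectations file (here presumably drivers/gpu/drm/ci/xfails/vkms-none-fails.txt). Assuming the usual one-test-per-line `name,Status` format of the drm-ci xfails files, the entry to drop would look like:

    kms_cursor_crc@cursor-rapid-movement-128x128,Fail

With that line gone, the test is simply expected to pass and the run no longer reports an UnexpectedImprovement.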
17:38fomys: Thanks for all of this! I will use this tool!
17:41fomys: For now I will not update the *fails.txt, as I may have found a flakiness in vkms (https://gitlab.freedesktop.org/louischauvet/kernel/-/jobs/65944002). I am currently investigating whether this is an existing issue (my current guess, because I don't yet see how my series could cause this)
17:41fomys: If this is an existing issue (kernel crash == pipeline fail), what should I do? Can I commit other series before fixing it? Should I fix it first?
17:47fomys: Ok, it is not my fault, the issue was there before: https://gitlab.freedesktop.org/jim.cromie/kernel-drm-next-dd/-/jobs/61803193, https://gitlab.freedesktop.org/jim.cromie/kernel-drm-next-dd/-/jobs/61484201
17:47fomys: It is at the same time good news (my series does not break things) and bad news (I have to investigate a race-condition issue that I never hit in 8 months of developing VKMS and running igt tests)
17:48fomys: (thanks jim.cromie for running this CI often :-))
18:32Company: jenatali: do you have some more of that magic format mapping code? Apparently glTexStorageMem2DEXT() thinks I need to know how to map dxgi formats to gl formats
18:32Company: and gallium's d3d12_format.c works with pipe formats, not gl formats
18:35jenatali: So you want the GL to pipe mapping?
18:35Company: GL to DXGI
18:36jenatali: You won't find a one-stop path in Mesa, it goes through pipe formats
18:36Company: yeah, that's why I'm asking
18:36Company: I was hoping you knew some place
18:36jenatali: Not offhand but pipe formats map to DXGI pretty cleanly so with the GL to pipe mapping you should be good
18:37jenatali: ANGLE might have it?
18:38Company: that's a good idea
18:41Company: https://github.com/google/angle/blob/main/src/libANGLE/renderer/d3d/d3d11/texture_format_map.json
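For illustration, the kind of per-format entry that ANGLE table encodes, written as a hand-rolled C switch; a minimal sketch, not a Mesa or ANGLE API, assuming the usual GL and DXGI headers and covering only a few common sized internalformats:

    #include <GL/gl.h>
    #include <GL/glext.h>       /* sized internalformat enums */
    #include <dxgiformat.h>

    /* Hypothetical helper: map a GL sized internalformat straight to DXGI,
     * skipping the GL -> pipe_format -> DXGI detour Mesa takes internally. */
    static DXGI_FORMAT gl_internalformat_to_dxgi(GLenum internalformat)
    {
        switch (internalformat) {
        case GL_R8:             return DXGI_FORMAT_R8_UNORM;
        case GL_RG8:            return DXGI_FORMAT_R8G8_UNORM;
        case GL_RGBA8:          return DXGI_FORMAT_R8G8B8A8_UNORM;
        case GL_SRGB8_ALPHA8:   return DXGI_FORMAT_R8G8B8A8_UNORM_SRGB;
        case GL_RGB10_A2:       return DXGI_FORMAT_R10G10B10A2_UNORM;
        case GL_RGBA16F:        return DXGI_FORMAT_R16G16B16A16_FLOAT;
        case GL_RGBA32F:        return DXGI_FORMAT_R32G32B32A32_FLOAT;
        default:                return DXGI_FORMAT_UNKNOWN;
        }
    }

As jenatali suggests, the pipe_format-to-DXGI half of this already lives in gallium's d3d12_format.c, so chaining a GL-to-pipe lookup with that table avoids maintaining such a switch by hand.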
19:49mairacanal: fomys, i remember having issues with kms_cursor_crc@cursor-rapid-movement-512x512
19:52mairacanal: seeing the CI log reminded me of https://lore.kernel.org/dri-devel/24864732-dd34-a391-f5c3-27783765794d@riseup.net/T/#mf1186769f3d505b17f2da8c140d7cd20a407e145
20:23alyssa: oof. nir_opt_algebraic eats nuw bits
20:24alyssa: maybe this is the wrong way to go about this..
20:24alyssa: I'm trying to plumb through nuw on the ishl's generated by nir_lower_io and nir_lower_uniforms_to_ubo
20:25alyssa: so that I can promote the 32-bit ishl to a 64-bit ishl that I can fold into my hardware op (64 + zext(32) << shift)
20:26alyssa: (which is faster than `64 + zext(32 << shift)`, which requires an extra instruction for the shift. The hw semantic is perfect for C-based languages, including OpenCL, but nir_lower_io doesn't play nice)
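For reference, the two address formulas written out in C; a minimal sketch where `a` is the 64-bit base and `b`/`c` the 32-bit index and shift. They agree exactly when `b << c` does not overflow 32 bits, which is what the nuw bit would guarantee:

    #include <stdint.h>
    #include <stdio.h>

    /* What nir_lower_io effectively produces: shift in 32 bits, then widen. */
    static uint64_t addr_zext_of_shift(uint64_t a, uint32_t b, uint32_t c)
    {
        return a + (uint64_t)(b << c);      /* 64 + zext(32 << shift) */
    }

    /* What the hardware op computes: widen first, then shift. */
    static uint64_t addr_shift_of_zext(uint64_t a, uint32_t b, uint32_t c)
    {
        return a + ((uint64_t)b << c);      /* 64 + zext(32) << shift */
    }

    int main(void)
    {
        uint64_t a = 0x1000;
        uint32_t b = 1u << 24, c = 8;       /* b << c wraps in 32 bits */

        /* Prints 0x1000 vs 0x100001000: the two forms diverge once the
         * 32-bit shift overflows, hence the need for the nuw guarantee. */
        printf("0x%llx vs 0x%llx\n",
               (unsigned long long)addr_zext_of_shift(a, b, c),
               (unsigned long long)addr_shift_of_zext(a, b, c));
        return 0;
    }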
20:31alyssa: wondering if other hw has this problem
20:57karolherbst: I think nvidia has a native instruction for those things
20:58karolherbst: it's kinda like LEA, no?
20:58karolherbst: though I think x86 doesn't have the shift
21:02alyssa: karolherbst: I *need* to plumb through the shift for correctness ..
21:02karolherbst: yeah...
21:03karolherbst: but I mean nvidia has an instruction for (64 + zext(32) << shift) and it's called LEA there
21:03alyssa: sure
21:03alyssa: so IDK. nir_lower_io can set nuw, but nir_opt_algebraic needs to not destroy it
21:03alyssa: and we don't have infra to preserve nuw/nsw bits in algebraic..
21:04karolherbst: mhhh
21:04karolherbst: yeah...
21:04karolherbst: or add a lea instruction and drivers lower it
21:05karolherbst: but might make sense to figure out all the semantics first...
21:05karolherbst: the shift for nvidia is a 5 bit constant
21:06alyssa: right so if we have a dedicated instruction we lose all the existing ishl patterns
21:06alyssa: it's very chicken and egg
21:06alyssa: though maybe that's still.. better overall
21:07karolherbst: I mean.. if lower_io emits the lea
21:07karolherbst: and then we either keep it or lower it?
21:07karolherbst: could also make lea have those semantics
21:08alyssa: Hmm
21:08karolherbst: it's a single native instruction, so not really much harm you can do here
21:08alyssa: Part of the problem is that I'm doing the AGX lowering late, because I kind of have to do some optimization to remove the crap that comes out of i/o lowering first
21:08karolherbst: ehh well.. 2 for nvidia if you want 64 bit result
21:08alyssa: but could probably do it earlierish
21:09karolherbst: mhh I see
21:09karolherbst: but I'd love to see a native lea instruction, because there is hardware that could make use of it
21:18alyssa: I don't see how that would work usefully cross-vendor
22:12alyssa: hmm. I definitely need to restructure some things and probably need to duplicate a few ops but
22:12alyssa: there are worse things
22:12alyssa: ops and opts
22:13alyssa: current idea is to introduce an aadd similar to the amul we have now
22:13alyssa: so then we can do the fast and loose optimizations for addressing
22:14alyssa: and then fuse into a lea32 which has the semantic of "a + zext(b << c)" where the shift is guaranteed not to overflow s.t. we can impl in hw as "a + zext(b) << c" if profitable
22:14alyssa: but don't have to if not
22:15alyssa: just really unclear *when* that opt should happen
22:15alyssa: do it too early, you impede other opts
22:15alyssa: do it too late, you lose the nuw info you needed
22:15alyssa: plumbing amul/aadd through works, but it impedes other opts, so then I need to copy-paste things like `aadd(x, 0) -> x` rules
22:16alyssa: and then you get funny mismatches like `aadd(iadd(x, #y), #z)`...
22:16alyssa: that's the same as iadd(x, #(y+z))
22:17alyssa: ...but is the same as aadd(x, #(y+z))?
22:17alyssa: ^is it
22:17alyssa: strictly no.. the inner iadd could overflow..
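One concrete way to see why not, as a sketch: model aadd as "addition where the compiler may assume no overflow" and pick one legal lowering of that assumption (widening to 64 bits). The helper names and the lowering choice here are illustrative, not NIR's actual definition:

    #include <stdint.h>
    #include <stdio.h>

    /* iadd: plain 32-bit wrapping add. */
    static uint32_t iadd32(uint32_t a, uint32_t b) { return a + b; }

    /* One legal lowering of an "assume no overflow" address add: since
     * overflow is assumed impossible, widen and add in 64 bits. */
    static uint64_t aadd_lowered(uint32_t a, uint32_t b) { return (uint64_t)a + b; }

    int main(void)
    {
        uint32_t x = 0xFFFFFFFEu, y = 3, z = 4;

        /* aadd(iadd(x, #y), #z): the inner iadd wraps to 1, result is 5. */
        uint64_t before = aadd_lowered(iadd32(x, y), z);

        /* aadd(x, #(y+z)): the no-overflow assumption is now violated and
         * this lowering sees the carry: 0x100000005. */
        uint64_t after = aadd_lowered(x, y + z);

        printf("before fold: 0x%llx, after fold: 0x%llx\n",
               (unsigned long long)before, (unsigned long long)after);
        return 0;
    }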
22:17alyssa: in some ways opencl addressing is so much easier
22:18alyssa: it's just always 64-bit, no shenanigans