13:47 DavidHeidelberg[m]: any fans of VAAPI testing? I need review of libva-utils uprev :) https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22888
14:40 daniels: mupuf: not sure if you/chturner have changed the radv kernel recently, but https://gitlab.freedesktop.org/mesa/mesa/-/jobs/41876790
14:41 daniels: (relatedly, it would be great if that could share a kernel tree with the rest of everyone else, and also if b2c had the same container that executes in gitlab-ci rather than something external?)
14:42 mupuf: We did not. It's not a new change in the kernel; it's a zink change that exposed a kernel bug
14:43 mupuf: We are using the same container as gitlab-ci, unless you are talking about the trigger container?
14:43 mupuf: As for sharing the kernel, would it be OK if others stopped using modules?
14:44 mupuf: Or shipped them as cpio images?
14:44 mupuf: daniels: ^
14:45 mupuf: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22855#note_1911510 is the regression
14:51 DavidHeidelberg[m]: mupuf: about sharing modules outside of the rootfs.. we thought about it
14:51 DavidHeidelberg[m]: I personally think it's a good idea, if we consider integration with KernelCI, so we could "replace kernel on demand"
14:52 mupuf: 100%
14:53 mupuf: Yeah, I like having the kernel and user space separated
14:53 mupuf: I added support for modules in b2c last week
14:54 mupuf: I ended up with the following interface: a kernel file (vmlinuz), a modules.cpio.xz, and a firmware.cpio.xz
14:55 mupuf: This way, bootloaders can download all the initrds and the kernel will extract them all
14:55 mupuf: (In EFI mode)
14:56 mupuf: For legacy mode, the bootloader usually decompresses them, concatenates them, then starts the kernel
14:57 mupuf: And for rootfs operations, extracting a tarball or a cpio archive is similar-enough
14:59 mupuf: B2C's makefile creates the cpio archives for us (it finds the firmware needed for all the modules you compiled)
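As a rough illustration of the legacy-mode path described above (decompress each archive, then concatenate), here is a minimal Python sketch. The function name `concat_initrds` is hypothetical and not part of b2c. It assumes the archives are xz-compressed cpio images already held in memory, and it pads each one to a 4-byte boundary with NUL bytes, which the kernel's initramfs buffer format allows between archives.

```python
import lzma


def concat_initrds(compressed_archives):
    """Decompress each xz-compressed cpio archive and join them into one initrd.

    Hypothetical helper for illustration only; not b2c's implementation.
    """
    out = bytearray()
    for blob in compressed_archives:
        out += lzma.decompress(blob)
        # Pad with NUL bytes so the next archive starts on a 4-byte boundary.
        out += b"\0" * (-len(out) % 4)
    return bytes(out)


if __name__ == "__main__":
    # Stand-in payloads; real inputs would be modules.cpio.xz and firmware.cpio.xz.
    modules = lzma.compress(b"AAAAA")
    firmware = lzma.compress(b"BB")
    print(len(concat_initrds([modules, firmware])))
```

In EFI mode, per the discussion, this step is unnecessary because the kernel extracts each initrd itself.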
15:01 mupuf: DavidHeidelberg[m]: how does that sound?
15:07 alyssa: Lovely, nir_reg rework is crashing on armhf but not arm64..
15:09 DavidHeidelberg[m]: mupuf: "me likey". I guess the first step is to distribute modules separately, for a start.
15:21 daniels: mupuf: btw I forgot to reply the other week when you were talking about b2c doing NFS, but it still really hurts ...
15:22 mupuf: Loool, isn't that what you use?
15:22 daniels: I mean
15:22 daniels: in this scheme, you've got the device pulling the container image over the network, writing it to the nfsroot over the network, reading it back over the network to decompress it, writing the uncompressed image back over the network, then executing from there
15:23 daniels: in our scheme, the controller (which has substantially more I/O capacity than a low-end Arm board) decompresses the rootfs to local storage, then the device just executes straight out of that nfsroot
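The difference between the two schemes can be put as simple arithmetic over the DUT's network link. The sketch below is only an illustration: the function name and the image sizes are hypothetical, and it counts just the four transfers daniels lists (pull, write to nfsroot, read back, write uncompressed), ignoring protocol overhead and the later execution reads that both schemes share.

```python
def dut_side_expansion_bytes(compressed, uncompressed):
    """Bytes crossing the DUT's link before execution when the DUT expands the image."""
    pull_image = compressed         # container image pulled over the network
    write_image = compressed        # image written to the nfsroot
    read_back = compressed          # image read back for decompression
    write_expanded = uncompressed   # uncompressed rootfs written to the nfsroot
    return pull_image + write_image + read_back + write_expanded


MIB = 1024 * 1024
# Hypothetical sizes, chosen only to show the shape of the calculation.
total = dut_side_expansion_bytes(500 * MIB, 1500 * MIB)
print(total // MIB, "MiB over the DUT's link before the job starts")
```

When the controller decompresses to local storage instead, none of these four transfers touch the DUT's link.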
15:23 mupuf: Yeah, but you know the extraction would happen only once, right?
15:24 mupuf: The first time you boot this container, following times would be cached
15:24 daniels: assuming that the storage is completely persistent, which for us it isn't because we have KCI and CrOS testing and $customer_product testing, none of which use b2c
15:24 daniels: so we'd have to go semi-rewrite LAVA to understand different job classes and allocate persistent storage for them
15:25 mupuf: Good to know
15:26 mupuf: So, the NFS shares are created per job, ack
15:26 daniels: even so there would still be a moderate amount of pain because we typically have 15 DUTs per device type, so you get a lower hit rate, and afaict you'd still have the issue that you'd have the DUT as the one expanding the image (NFS read -> CPU decompress -> NFS write); not only does the CPU time hurt because their CPUs are fairly weak, but even when you've got 15 completely partitioned networks, the network load of doing so is
15:26 daniels: ... not low
15:26 Company: question: is there a good way to emulate the capabilities of old/embedded hardware with Mesa? Stuff like max texture size and shader limits and whatnot?
15:27 daniels: yeah, in the LAVA model we download (with two-tier local caching; one per-dispatcher i.e. shared between roughly 15 DUTs and one LAVA-global) the tarball and decompress it on the x86 dispatcher, then the LAVA device just has that as nfsroot=
15:28 mupuf: We pull containers through a caching proxy too
15:28 mupuf: But yeah, to support b2c, you would need to add per-machine NFS shares not managed by lava
15:29 daniels: or add support to LAVA for some kind of semi-persistent 'use this rootfs for everything Mesa on this device, and here's how to clean up after the jobs too' storage mode - but yeah, my concern is mainly the expansion being done on the DUT requiring too much load on an already-slammed network + low-end CPU
15:30 daniels: it's not impossible of course, just trying to share why it's seriously non-trivial
15:30 mupuf: And adding local storage is also not doable? Would solve your network issue
15:32 daniels: ehh
15:32 mupuf: Anyway, moving mesa to containers-only isn't blocked by using b2c in lava
15:32 daniels: it's doable, but the economics are poor
15:32 daniels: if your DUT costs $1k for the whole rig, then dropping $200 on decent NVMe isn't terrible
15:32 mupuf: Just means that you'll be stuck with an indirection
15:32 daniels: if your DUT costs $60, then pairing it with $200 NVMe is ... less obviously good
15:32 mupuf: (Trigger container)
15:32 mupuf: Lol, yeah
15:34 daniels: so yeah, if there's an obvious need for it then it's doable, but if the only cost is 'wtf is this rootfs upload step' then we'll just spend the $60k elsewhere
15:36 mupuf: Well, the cost is the gitlab-ci.yml complexity being so high that other projects need to import it
15:36 mupuf: That's just gonna work
15:36 mupuf: But, I guess there could be a gitlab to LAVA converter to hide everything
15:37 daniels: yeah, that's something we are working on in the background
15:37 daniels: we've been treating it as a nice-to-have bit of tech-debt cleanup but less urgent than ... *gestures at everything*
15:37 mupuf: Good :)
15:38 mupuf: Well, at least we agree on the goal!
15:39 mupuf: I guess I'll need to look at the bare metal infra next, to see how it could be simplified
15:40 mupuf: (To be shareable between projects)
15:41 mupuf:'s goal is to bring testing to more projects
15:42 mupuf: eric_engestrom: You may be interested in the above discussion
17:04 mattst88: tursulin: thanks a bunch for https://patchwork.freedesktop.org/series/102175/ -- we're building some stuff on top of it now
17:05 mattst88: one thing that might be useful is a little program that just wraps igt_drm_clients_scan and prints the current DRM clients
17:27 DavidHeidelberg[m]: mupuf: we're thinking about allowing KernelCI to share pipeline (limited to LAVA, since other jobs don't have priorities and we don't want to block pre-merge)
17:30 DavidHeidelberg[m]: daniels: nicely said. I think we can prepare stuff in that direction if it doesn't involve too much work?
17:40 daniels: DavidHeidelberg[m]: yep, good thing to think about and work towards
17:51 mupuf: DavidHeidelberg[m]: share pipeline?
17:52 DavidHeidelberg[m]: *share the jobs
17:59 daniels: mupuf: you've reminded me that we need to extend the LAVA structured-logging MR I pointed you to to both bm and b2c; do you/Charlie have some spare cycles for that atm?
17:59 daniels: (we're elbow-deep in a bunch of stuff including bifurcating LAVA to pivot from UART to SSH for control and feedback because it turns out it is possible to build a UART so unreliable it's unusable)
18:00 mupuf: Charlie is working on vkcts, but Eric just joined the team. Maybe he'll be up for it!
18:01 mupuf: Good luck with the pivot!
18:02 mupuf:'s plan for consoles is to extend netconsole to become a TTY over IP
18:06 mupuf: DavidHeidelberg[m]: how are you planning to share the jobs?
18:07 DavidHeidelberg[m]: Trigger pipeline with injected kernel and modules
18:08 mupuf: Oh, sounds like a great idea!
18:16 mupuf: DavidHeidelberg[m]: how is kernel ci packaging its kernel and modules?
18:16 DemiMarie: mupuf: please ensure that there is a layer of encryption and authentication (perhaps WireGuard?) over that!
18:17 mupuf: DemiMarie: over what?
18:17 DemiMarie: mupuf: the TTY over IP
18:18 DemiMarie: Because people will do things like expose a TTY on the public Internet
18:18 mupuf: TTL=1 to the rescue?
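For context, a TTL of 1 means the first router that receives the packet discards it instead of forwarding it, so the traffic does not leave the local network segment. The following is a minimal userspace sketch in Python of setting that option on a UDP socket. It only illustrates the socket option; the function name is hypothetical and this is not how an in-kernel driver would be written.

```python
import socket


def make_ttl1_udp_socket():
    """Create a UDP socket whose outgoing packets carry an IP TTL of 1."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, 1)
    return sock


if __name__ == "__main__":
    s = make_ttl1_udp_socket()
    # Read the option back to confirm what the kernel will use.
    print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))
    s.close()
```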
18:19 mupuf:fails to see how one could get early boot messages if TLS or WG was mandated
18:20 mupuf: But people can be free to set things up then use the right source IP to go through a VPN
18:20 Sachiel: I don't know what it would be protecting either, since anyone can just go look at the jobs output on gitlab anyway
18:21 mupuf: I guess DemiMarie is worried about users shooting themselves in the foot by choosing to use this driver rather than SSH
18:24 DemiMarie: mupuf: yup
18:24 DemiMarie: And my thought was to require that WG is built-in (not a module) and that the secrets are available during early boot.
18:25 mupuf: I guess with ipv6 we could force a local-only network
18:25 DemiMarie: mupuf: is this read-write or output-only?
18:26 DemiMarie: > And my thought was to require that WG is built-in (not a module) and that the secrets are available during early boot.
18:26 DemiMarie: Could this work?
18:26 mupuf: I want it read/write
18:26 mupuf: In theory, yeah. In practice, good luck upstreaming that
18:27 DemiMarie: Why “good luck upstreaming that”?
18:27 mupuf: Because the TTY subsystem is already messy enough :D
18:28 mupuf: And people will say: this is policy, it belongs to the userspace
18:29 DemiMarie: To which my response would be: we need to get logs from before userspace starts, so it cannot run in userspace, and there is going to be a login prompt on it, so it must be encrypted.
18:32 mupuf: To which they will say: Use SSH. TTY is for debug only :D
18:32 mupuf: Or local-only
18:33 DemiMarie: And my response is that people need to be able to debug stuff they may not have a physically secure connection to.
18:33 mupuf: And then silence will fall :D
18:33 DemiMarie: The authorized public key could be provided on the kernel command line and the secret key could be specified as a TPM NVRAM index or a UEFI variable GUID.
18:35 mupuf: Good ideas, but why enforce that the driver uses this interface? Why not have: setup wireguard through the kernel cmdline, and setup this driver through the cmdline?
18:36 mupuf: Then users may shoot themselves in the foot... but don't have to
18:37 DemiMarie: mupuf: users _will_ shoot themselves in the foot. That is why secure by default is so important.
18:37 mupuf: And the wg work can be reused for other services like samba, NFS, or whatever
18:38 DemiMarie: Maybe guard the insecure feature by unsafe_allow_insecure_unencrypted_tty_write or similar?
18:38 mupuf: You are so damn right... and making a lot of sense...
18:38 DemiMarie: Thank you
18:38 mupuf:used to be a security researcher, in a past life
18:39 mupuf: Even chatted with your CEO :d
18:39 DemiMarie: You are correct that having the secure connection be configured separately is a much cleaner design.
18:39 DemiMarie: 🤣
18:40 DemiMarie: If you mean Marek, I think he is CTO, but that is a trivial nit.
18:40 mupuf: I meant Johanna
18:40 mupuf: Wasn't she the CEO?
18:41 mupuf: Anyway, the best would likely be to hide this driver behind CONFIG_EXPERT
18:42 mupuf: And add that this is not for end users, but meant for automated farms
18:42 DemiMarie: Maybe? Joanna left ITL before I joined.
18:42 mupuf:is just showing his age at this point
18:43 DemiMarie: As far as CONFIG_EXPERT, I don’t know if that is enough. From the standard Qubes kernel config:... (full message at <https://matrix.org/_matrix/media/v3/download/matrix.org/zjgGdWEhmmEYwJLGAYBbZNEY>)
18:43 mupuf: It was 13 years ago
18:43 DemiMarie: Fair
18:44 DemiMarie: Maybe hide it behind BROKEN, so that anyone who wants to use it must apply a trivial kernel patch? Or hide it behind something like INSECURE_NETWORK_TTY?
18:46 mupuf: Well, as long as it isn't accidentally shipped to users, and the doc makes it clear what the security level is (none), not sure what adding more hoops will do
18:47 mupuf:always have to read netconsoles.txt
18:47 mupuf: I doubt people would remember the cmdline by heart :D
18:49 mupuf: TLS may be supported though, not sure how this works without an RTC
18:51 mupuf: Bed time! Thanks for the chat DemiMarie, it was a pleasure :)
18:51 DemiMarie: mupuf: Good night! Thanks for the chat, it was a pleasure here too.
19:18 mattst88: anyone have any clues about what fixed the VA-API segfault reported in https://bugs.gentoo.org/906222 ?
19:18 mattst88: it happens on 23.0.3 but not in 23.1.0
19:18 mattst88: so we just need a backport of the fix
19:21 mattst88: DavidHeidelberg[m]: you've committed some radeon+VA-API+CI stuff today, so maybe you know :)
19:59 daniels: mupuf: don't ever try to do anything with netconsole - it's a dead end
19:59 daniels: it relies on being able to push data whilst holding all the locks, which works very well for proper low-level UART, and very poorly for usbeth
20:14 mareko:isn't a CTO yet
20:33 DemiMarie: mupuf: your best option for console output is USB debug capability, which is explicitly designed for this purpose.
21:13 DavidHeidelberg[m]: mattst88: Haha, I knew I should touch it. I'll try to look in an hour (away from the PC) ;)
21:13 DavidHeidelberg[m]: *shouldn't
21:17 mattst88: lol
21:25 DavidHeidelberg[m]: looked at the backtrace on the phone; it seems likely to be a nir issue, otherwise it should have been caught earlier
21:25 DavidHeidelberg[m]: we had the vaapi testing in-place for a while I think
21:28 daniels: DemiMarie: eh? no-one is exposing TTYs over public IP
21:28 daniels: DemiMarie: please give us the respect of assuming that we've thought for more than two seconds about what we're doing
21:29 airlied: the fact someone can create a UART that doesn't work is still impressing me
21:29 daniels: airlied: me too! which is why we spent a few weeks properly bottoming it out before concluding that the hardware was actually, somehow, terminally broken
21:30 airlied: daniels: I'd recommend a serial over uart device, but I think those are all terminally broken as well :-P
21:31 daniels: ... serial over uart?
21:32 DemiMarie: daniels: I trust that nobody here would do that. I was assuming the feature got into the mainline kernel and included in distros.
21:32 DemiMarie: In which case someone else might well make this mistake.
21:33 daniels: DemiMarie: we're not making a mainline distribution, we're making a dedicated CI system
21:34 daniels: there would need to be a lot of terrible decisions in the chain to Ubuntu shipping an SSH server which accepted known predefined SSH keys to log in by default; no-one here would propose them, and no-one in the chain to shipping would accept them either
21:35 DemiMarie: daniels: I was referring to the kernel TTY idea proposed earlier. That seemed to be essentially kernel-mode telnet to me.
21:39 daniels: DemiMarie: sure, but the idea that anyone who had spent multiple years making a CI system would just expose the lot over public IPs and forget that they were public, is pretty insulting tbh
21:39 DemiMarie: daniels: that was not my intent and I am sorry it came across that way.
21:40 daniels: DemiMarie: no problem; it's useful to provide insight and perspective, but just please try to remember that the people you are talking to are not idiots, and have probably thought about what they're doing for more than a few seconds, so don't need to have blisteringly obvious things explained at them
21:42 DemiMarie: daniels: Thank you. Would better phrasing have been something like, “If this gets upstreamed and included in distros, it is likely that an inexperienced person will expose it to the public Internet, either accidentally or without understanding the consequences.”?
21:43 daniels: I honestly can't imagine a distro that would suddenly start exposing TTYs over IP without considering what might happen
21:44 daniels: (and for this specific application, none of the Mesa CI DUTs have publicly-routable IPs)
21:44 daniels: it's not novel either. Alpine could start shipping passwordless root@ SSH access by default tomorrow, but I'd not expect them to. it's probably not worth a security advisory to them to let them know they shouldn't do it tho.
21:47 bl4ckb0ne: ill watch psykose to make sure it never happens
21:49 DemiMarie: daniels: Oh of course no distro would do it by default. “Inexperienced person” refers to the _end user_, not a distro maintainer.
21:49 psykose: i've seen worse security advisories
21:49 psykose: like '3rd party repo has insecure custom installer script on github'
21:49 psykose: just the other day
21:50 psykose:grumbles in distro maintainer
21:50 DemiMarie: Was that where the confusion was?
21:51 psykose: was that to me? just an offhand comment, i'm not confused by anything you've said :)
22:08 DavidHeidelberg[m]: mattst88: I don't see anything obvious, try to create mesa/mesa issue and cc someone from amd working on vaapi impl.
22:12 mattst88: DavidHeidelberg[m]: thanks
22:39 karolherbst: Kayden: 78a195f252d558c828c20bebda4bd9252534f53d regressed OpenCL conversion tests for long -> float conversions :(
22:40 Kayden: that's...odd
22:40 Kayden: aren't you hitting nir_lower_int64 in brw_postprocess_nir?
22:40 Kayden: it seemed like it was getting called twice
22:41 karolherbst: ohh, it continues to compile, the result is just wrong
22:41 karolherbst: for input == 0xfffffffeffffffff
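To make the expected value concrete: assuming `float_rtn_long` exercises `convert_float_rtn` on a signed 64-bit input (an assumption based on the test name), the bit pattern 0xfffffffeffffffff is -4294967297, and rounding toward negative infinity to a 32-bit float should give -4294967808.0. The helper names below are hypothetical; this is a reference sketch in Python using exact integer arithmetic, not the CTS's own code.

```python
def to_signed64(bits):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def float32_rtn(value):
    """Round a 64-bit integer to float32, rounding toward negative infinity."""
    mag = abs(value)
    # float32 keeps 24 significant bits; anything below that is rounded away.
    shift = max(mag.bit_length() - 24, 0)
    if value >= 0:
        kept = mag >> shift                        # floor the magnitude
    else:
        kept = (mag + (1 << shift) - 1) >> shift   # ceil the magnitude
    rounded = kept << shift
    return float(-rounded if value < 0 else rounded)


value = to_signed64(0xFFFFFFFEFFFFFFFF)
print(value)               # -4294967297
print(float32_rtn(value))  # -4294967808.0
```

Near 2^32 adjacent float32 values are 512 apart, so the input sits one above -2^32 and has to round down a full step.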
22:41 Kayden: ._.
22:41 Kayden: probably some optimization happening now that's broken
22:42 Kayden: where do I get those tests?
22:42 alyssa: NIR_DEBUG=print all the things! (:
22:42 Kayden: or we can just revert it for now
22:42 alyssa: Kayden: https://github.com/KhronosGroup/OpenCL-CTS
22:42 alyssa: described to me once as "the worst CTS in history"
22:42 karolherbst: RUSTICL_ENABLE=iris run_local_mesa ./build/test_conformance/conversions/test_conversions float_rtn_long -w
22:42 alyssa: glhf
22:42 karolherbst: but yeah...
22:43 karolherbst: sadly.. NIR_PRINT ain't wired up yet, because.... meson
22:43 jenatali: > described to me once as "the worst CTS in history"
22:43 jenatali: That tracks
22:44 karolherbst: at least it builds quickly
22:44 Kayden: fantastic, it says "Install OpenCL"
22:44 Kayden: cmake -GNinja -DCL_INCLUDE_DIR=/usr/include/CL -DCL_LIB_DIR=/usr/lib does find it
22:44 alyssa: clown.jpg
22:44 karolherbst: ehhh
22:45 karolherbst: cmake -G "Ninja" ../ -DCL_LIBCLCXX_DIR=/usr/lib64/ -DCL_INCLUDE_DIR=/usr/include -DCL_LIB_DIR=/usr/lib64/ -DCMAKE_RUNTIME_OUTPUT_DIRECTORY=. -DOPENCL_LIBRARIES=OpenCL -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS='-Wno-stringop-overflow -Wno-alloc-size-larger-than'
22:45 karolherbst: or something
22:45 Kayden: because apparently looking in /usr/include is hard :)
22:45 karolherbst: at least on fedora37 I'm using that
22:45 karolherbst: $legacy_code
22:47 karolherbst: I'm sure there is some subtle thing going wrong or something...
22:47 karolherbst: wouldn't be the first time
22:54 karolherbst: ehh I hope I did find the right commit, because reverting it didn't fix it?
22:54 karolherbst: hold on a sec
22:56 karolherbst: yeah.. seems like I messed up...
22:57 karolherbst: Kayden: ahhh..
22:57 karolherbst: Kayden: it only regressed on gen12
22:57 Kayden: ah
22:58 Kayden: that sounds believable
22:58 karolherbst: but yeah, with that commit gen12 fails the same way as gen9.5? something something
22:58 karolherbst: CML-H
23:00 karolherbst: https://gist.github.com/karolherbst/3afe79544a73a7a5a4567a4fba47e0cd
23:01 karolherbst: an `uclz` is missing
23:01 karolherbst: and some more stuff, mhh
23:14 karolherbst: mhhh, looks like some clamping got removed.. but that's kinda hard to track down
23:17 Kayden: is there something I have to do to get iris to work with rusticl other than IRIS_ENABLE_CLOVER=1? clinfo is showing rusticl but it's not loading iris_dri
23:18 Kayden: had it working before but I lost my history
23:18 karolherbst: rusticl only respects RUSTICL_ENABLE
23:18 karolherbst: RUSTICL_ENABLE=iris
23:18 karolherbst: anyway, you want to run this test: ./build/test_conformance/conversions/test_conversions float_rtn_long -w -1
23:18 Kayden: ah thanks
23:18 karolherbst: this kinda smells like an opt being slightly broken tbh
23:19 Kayden: yep.
23:19 Kayden: for 64-bit
23:19 karolherbst: sooo
23:20 karolherbst: there is an imax uadd_sat 0x20, iadd 0x1f, ineg uclz a, b sequence missing
23:21 karolherbst: and that's now imax, a, b
23:21 karolherbst: I think
23:21 karolherbst: or something along those lines
23:22 karolherbst: updated the good/bad here with "serialize" https://gist.github.com/karolherbst/3afe79544a73a7a5a4567a4fba47e0cd
23:22 karolherbst: diffing it makes it kinda obvious what's gone
23:23 karolherbst: oh well.. it's getting late here, will maybe check back tomorrow in case you got something or look into it next week or so
23:25 Kayden: have a good night
23:26 Kayden: it fails on icelake too FWIW
23:26 karolherbst: yeah, I mean pre gen12 it was always broken, I just never got to figuring out the remaining issues
23:27 Kayden: oh :(
23:27 Kayden: no, it passes on ICL without that commit
23:27 karolherbst: I wouldn't be surprised if you just made gen12 hit the same bug now :)
23:27 karolherbst: ahh,
23:27 karolherbst: guess ICL is fine then
23:27 Kayden: both use int64 lowering
23:27 karolherbst: my desktop is ADL
23:28 karolherbst: anyway, good night
23:28 Kayden: 'night!