08:41 bentiss: ouch, now m3l-12 is acting weird as well: https://gitlab.freedesktop.org/freedesktop/ci-templates/-/jobs/36180774
08:43 bentiss: ohoh: https://github.com/containers/podman/issues/5136#issuecomment-583884319 -> maybe we can edit that soft cap
09:14 bentiss: daniels: ^^ I managed to raise the num_locks to 8192 on that runner, but we probably need to edit podman-free-space to only keep a decent amount of volumes
10:01 mupuf: bentiss: OMG :o
10:04 bentiss: actually, given the number of kept volumes, we might want to not raise that limit...
10:04 * bentiss is gathering some numbers on m3l-12, to see what is the weight of the volumes on the disks
10:07 bentiss: only 2.9 TB...
10:07 mupuf: bentiss: what's there?
10:07 mupuf: is it the caches for the containers?
10:08 bentiss: mupuf: yeah, gitlab-runner keeps a cache of all containers so that they spin up faster without having to pull the git tree and such
10:08 mupuf: Makes sense!
10:09 bentiss: well, the disk is 7TB, and used at 3.9TB, so that's ~1 TB of images
10:09 * mupuf should also increase this value in valve-infra too
10:13 bentiss: mupuf: https://gitlab.freedesktop.org/freedesktop/helm-gitlab-config/-/commit/b11cddfd9180ed0ae8ec025bb9b3ffbb95bf247d
10:15 bentiss: mupuf: to be able to bump it: pause the runner, no pods should be running, then drop the file, rm /dev/shm/libpod_lock, then podman system renumber, then systemctl restart podman
10:16 mupuf: bentiss: rebooting wouldn't be enough? We would need to also call podman system renumber?
10:17 bentiss: mupuf: yeah, that's what I read in the bug reports. rebooting might work, but this allows bumping the limit without a reboot ;)
10:17 mupuf: :)
10:17 mupuf: thanks for the info! I'll make an issue in valve-infra
10:18 bentiss: no worries. this way we'll have a trace on how to bump the limit on the fly :)
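[editor's note] The on-the-fly procedure bentiss describes at 10:15 can be written down as an ordered plan. This is only a sketch: the function names `bump_num_locks_plan` and `render` are made up for illustration, the chat gives no command for pausing the runner or for "dropping the file" (presumably the config carrying the new `num_locks` value, which is an assumption), and nothing here executes anything; it only prints the steps.

```python
import shlex


def bump_num_locks_plan(lock_file="/dev/shm/libpod_lock"):
    """Return the steps from the chat as (description, command) pairs.

    command is None where the chat describes a step without giving a command.
    """
    return [
        ("pause the runner and wait until no pods are running", None),
        # Assumption: "the file" is the config that carries num_locks.
        ("drop the file with the new num_locks value", None),
        ("remove the shared-memory lock file", ["rm", lock_file]),
        ("reallocate lock numbers", ["podman", "system", "renumber"]),
        ("restart podman", ["systemctl", "restart", "podman"]),
    ]


def render(plan):
    """Format the plan as a numbered checklist (dry run, nothing is executed)."""
    lines = []
    for index, (description, command) in enumerate(plan, 1):
        suffix = shlex.join(command) if command else "(manual step)"
        lines.append(f"{index}. {description}: {suffix}")
    return "\n".join(lines)


print(render(bump_num_locks_plan()))
```

Keeping the steps as data rather than running them directly makes the order easy to review before anyone touches a live runner.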
10:25 mupuf: bentiss: https://github.com/containers/common/blob/main/docs/containers.conf.5.md says the default is 2048 locks
10:26 mupuf: so, a 4x increase should help :)
10:26 bentiss: mupuf: yep. And we get those 2048 in 3 weeks
10:27 mupuf: indeed, but I thought it was 4096, so rather than going from 3 to 6 weeks, you are going from 3 to 12 weeks :)
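[editor's note] The "3 to 6" versus "3 to 12 weeks" arithmetic follows from the figures quoted above (2048 locks consumed in about 3 weeks, limit raised to 8192). A minimal sketch, assuming locks keep being consumed at a constant rate, which the chat does not guarantee; the function name is hypothetical.

```python
def weeks_until_exhausted(num_locks, observed_locks=2048, observed_weeks=3):
    """Estimate how long num_locks lasts, given the observed consumption.

    Assumes a constant rate of observed_locks per observed_weeks.
    """
    return num_locks * observed_weeks / observed_locks


for limit in (2048, 4096, 8192):
    print(limit, weeks_until_exhausted(limit))
```

With the default of 2048 this gives 3 weeks, 4096 would give 6, and the 8192 actually configured gives 12.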
10:27 bentiss: the problem is that this will buy some time, but I also need to do some janitoring because there are quite some volumes that will never be used again
10:27 bentiss: so I'm working on flushing them regularly atm
10:27 mupuf: yeah, it would be great if podman could keep usage statistics
10:27 mupuf: I need them for b2c, so I may implement that
10:28 mupuf: (I would like to implement an LRU policy on volumes and container layers)
10:28 mupuf: and I should definitely implement that directly in podman
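[editor's note] A rough sketch of what an LRU policy on volumes could look like. It assumes a per-volume "last used" timestamp is available, which, as mupuf points out, podman does not track today; the `last_used` mapping and the function name are hypothetical and this is not podman code.

```python
def volumes_to_evict(last_used, keep):
    """Pick volumes to remove, least recently used first.

    last_used maps a volume name to its last-use timestamp (any sortable
    value); the `keep` most recently used volumes are retained.
    """
    oldest_first = sorted(last_used, key=last_used.get)
    excess = max(len(oldest_first) - keep, 0)
    return oldest_first[:excess]


# Hypothetical timestamps (seconds since some epoch).
usage = {"vol-a": 30, "vol-b": 10, "vol-c": 20}
print(volumes_to_evict(usage, keep=1))
```

Capping the number of retained volumes, rather than only their size, is what matters for the lock limit, since each kept volume holds a lock.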
10:29 mupuf: BTW, podman 4.4 is out, with my first commits there. The upstreaming process seems a bit fast and loose, but they have nice tooling
10:29 mupuf: unreliable, but nice :)
10:36 bentiss: heh, congrats :)
11:01 mupuf: bentiss: thanks, but the point was that working with upstream isn't a hassle
11:02 bentiss: mupuf: I get that, but it's still always fun to have code accepted in a new project :)
11:03 mupuf: indeed!
11:08 MrCooper: I just reported https://gitlab.freedesktop.org/mesa/mesa/-/issues/8268 as spam
11:14 daniels: that's awesome, thank you
11:15 daniels: bentiss: ^
11:15 bentiss: I'm glad I managed to figure this one out
11:18 daniels: I would've never thought to correlate locks and volumes ...
11:21 bentiss: well, the bugs on github for podman helped a lot :)
12:24 eric_engestrom: I was asked if we should put tests that fail and we don't think we'll ever be able to fix in `-fails` or `-skips`
12:24 eric_engestrom: I answered that I think we should put them in `-fails` to know if they ever get fixed, but now I'm wondering what others think?
12:25 eric_engestrom: I get wanting to save resources by not running them if we don't believe they'll ever pass
12:27 daniels: right, if they're just long-term broken but fail quickly and not noisily then -fails is fine, but if they're either never going to work (for architectural reasons), or fail slowly/loudly (e.g. take out other tests with GPU resets), then -skips makes more sense
12:35 eric_engestrom: daniels: thx, I guess my view was too simplified/restricted :)
12:42 mupuf: agreed with what daniels said
13:56 MrCooper: +1
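[editor's note] The rule of thumb daniels gives at 12:27 can be summarized as a small decision function. This is only an illustration of the reasoning in the chat: the `FailingTest` fields and the `expectation_file` name are invented here and do not correspond to any Mesa CI tooling.

```python
from dataclasses import dataclass


@dataclass
class FailingTest:
    name: str
    can_ever_pass: bool         # False if broken for architectural reasons
    fails_quickly: bool         # False if the failure is slow
    disturbs_other_tests: bool  # e.g. takes out other tests with GPU resets


def expectation_file(test):
    """Return "-skips" or "-fails" following the guidance in the chat."""
    if not test.can_ever_pass:
        return "-skips"
    if not test.fails_quickly or test.disturbs_other_tests:
        return "-skips"
    # Long-term broken, but quick and quiet: keep running it so a fix is noticed.
    return "-fails"


print(expectation_file(FailingTest("quiet-failure", True, True, False)))
print(expectation_file(FailingTest("gpu-reset", True, True, True)))
```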
15:07 pixelcluster: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8269#note_1764504 more spam, reported on gitlab as well