07:23 tagr: the Tegra video decode support is pretty much done: https://patchwork.freedesktop.org/series/94766/ and https://github.com/cyndis/vaapi-tegra-driver
07:24 tagr: just waiting on a couple of Acked-by's, primarily on the DT bindings, but the userspace requirements should all be met now, so the plan is to merge the kernel bits for v5.16 and the vaapi driver is there for anyone who wants to try
12:32 karolherbst: tagr: ohh, do you know how well that works with the UEFI firmware?
12:34 tagr: it should work much the same irrespective of the firmware
12:34 tagr: last time I tried it was able to decode MPEG2 and MPEG4, although it was a bit slow
12:34 tagr: that has now apparently been addressed, though I haven't specifically tested since
12:35 karolherbst: okay
12:39 tagr: I'm afraid not much of that would be usable for discrete GPUs, though
12:40 tagr: the submission interface is all different, though I suspect that the accompanying documentation of the command stream might be helpful
12:44 fudgespinner: is it common for older NV40/50 GPUs to only have one power state?
12:44 fudgespinner: that's what I saw on my 9500 GT
12:46 karolherbst: tagr: yeah.. I already expect that much
12:46 karolherbst: tagr: btw.. I tried to fix the regression we talked about with callbacks and either I messed up or the issue wasn't fixed by that :/
12:47 karolherbst: honestly no clue what's going on there
12:47 karolherbst: fudgespinner: yes
12:47 karolherbst: especially desktop ones
12:50 fudgespinner: Based on TechPowerUp's VBIOS collection, I can only find two models of 9500 GT with more than one power state, and they're mostly exotic cards with 128 or 256 MB VRAM...
12:52 fudgespinner: Also, about the boot clock, there's no way it can be changed back to boot clock once the driver has loaded, and reclocked?
12:54 karolherbst: yeah
12:54 karolherbst: the vbios usually doesn't contain the info on what needs to be done to get to the boot clock
12:55 karolherbst: all of that isn't that trivial sadly
12:57 fudgespinner: and it seems multiple power/performance states for desktop GPUs were introduced with the GeForce 200 series. Though I'm not sure if that applies to G9x-based cards.
12:58 karolherbst: yeah... it's.. tricky
12:58 karolherbst: usually laptops with those cards already had multiple power levels
12:58 karolherbst: the need for that on desktops wasn't big enough back then
12:58 karolherbst: with fermi/kepler they started to make more use of it
12:59 tagr: karolherbst: I'm pulling together the last pieces of a test setup, so I should be ready later today or early tomorrow to take a closer look, see if I have some luck
12:59 karolherbst: okay, cool
12:59 karolherbst: tagr: btw.. the blocker will probably be waived, in the hope that it gets fixed by the release
13:01 tagr: okay, let's hope so
13:02 tagr: karolherbst: I've looked at the testing bits a little and there's a couple of options that I can pursue, but none of them are trivial, so it'll take a bit
13:02 tagr: hopefully not too long, though
13:03 karolherbst: yeah... I think we should discuss the testing bits at the meeting next week though and see if we can get Nvidia to officially invest in it, as this would help both sides
13:05 tagr: one thing that would be easy for us to do is seed some hardware if that helps, finding the time to work on CI will be... more difficult
13:07 fudgespinner: If you don't mind the odd question about the older cards like GeForce FX, I know it has separate 2D/3D clock, and I wonder if nouveau will handle power management the same way as newer cards.
13:09 karolherbst: tagr: yeah.. hw isn't the big issue though
13:09 karolherbst: and Nvidia is open to send some out, so..
13:09 karolherbst: I think it's really the problem to find dedicated resources to care about such a system
13:10 karolherbst: and not limit it to last minute package testing
13:10 karolherbst: fudgespinner: well... need somebody to look into it and we are far away from being able to spend much time on those "nice to haves" sadly
13:11 fudgespinner: I see...
13:13 fudgespinner: I understand the reason for it. most people with old cards won't be using them for Linux unless it's a display adapter for an old system.
13:13 karolherbst: well.. we don't even have time to work on it for newer cards :/
13:16 fudgespinner: At least compared to using Windows or Linux with the binary driver, this is far from the end of the world for somebody who still has their Tesla-based cards lying around. And I think a lot of people have them; basic home office builds here are still sold with a GT 210 as the dedicated card.
13:16 karolherbst: yeah
13:16 karolherbst: We try to keep them working, people should just not expect performance out of it :D
13:17 fudgespinner: Well, I did run some Windows games on it and it neither caught fire nor ran at single-digit frame rates.
13:19 fudgespinner: Also another issue is there's an almost 100% chance that logging out will completely hang my system with the binary driver
13:19 karolherbst: uff
13:19 fudgespinner: huh?
13:19 karolherbst: I mean.. logging out shouldn't hang the system :D
13:20 karolherbst: there I'd expect more from nvidia
13:21 fudgespinner: yeah, somehow that happened.
13:37 tagr: karolherbst: the good thing is that such a system, if fully automated, will be light on maintenance
13:37 tagr: on the downside, it's quite a bit of work to get it going
13:37 karolherbst: yeah...
13:37 karolherbst: well
13:38 karolherbst: the problem is rather random machines dying
13:38 karolherbst: danials were saying that you need to replace each node after a year roughly
13:38 karolherbst: due to the load mesa puts on the CI system if doing pre merge
13:39 karolherbst: but there are also random fails.. network breaking.. some FS corruption.. SD card bricks... you name it
13:39 karolherbst: but as long as it's not pre merge.. it shouldn't be painful
13:39 tagr: odd... doesn't it "just run tests"? that shouldn't be all that much load
13:39 karolherbst: although I'd network boot those devices :D
13:39 karolherbst: the new firmware seems to be quite good with it
13:39 karolherbst: or enables it by default
13:39 karolherbst: tagr: well....
13:39 karolherbst: we already have like 12k MRs
13:40 karolherbst: 10.8k of them merged
13:40 karolherbst: and most of them go through the full pipeline of all ci nodes
13:40 karolherbst: actually.. 13k MRs
13:41 karolherbst: ~25 MRs per day as it seems
13:41 tagr: I could imagine that if you actually were to try and build Mesa on the test system that might take a toll on it, but merely running it? I suppose if you have to reflash the entire system every time, perhaps that's also not quite what these devices are designed for
13:41 karolherbst: you get the tests as docker images
13:41 karolherbst: and your gitlab runner does something with it
13:41 karolherbst: so we build it on the fdo infra
13:41 karolherbst: and ship the containers around
13:42 tagr: definitely sounds like network booting would be advantageous in that kind of setup
13:42 karolherbst: yep
13:42 karolherbst: I actually have most of the hardware here already
13:43 karolherbst: switch with PoE, a machine to function as a runner/controller, jetson nano, PoE to eth+5V splitter...
13:43 karolherbst: just didn't get around to figuring out what the sw stack should look like
13:43 tagr: I'd have to check if the internal test farm allows network booting, I think most of the time it'll reflash completely so that everything gets tested
13:44 karolherbst: ahh
13:44 karolherbst: well from a mesa perspective you only get mesa as a container
13:44 karolherbst: so you can keep the base OS as is
13:44 tagr: hm... how do you make sure the dependencies are all correct?
13:44 karolherbst: shipping kernel gets only interesting once we enable CI on the kernel side
13:44 karolherbst: tagr: docker?
13:45 tagr: oh... so basically you pull in the base OS that you built against?
13:45 karolherbst: yes
13:45 karolherbst: the image contains the whole OS + mesa
13:46 karolherbst: you still have to store it on the test nodes though to actually run it
13:46 karolherbst: so it's still heavy on the storage
13:47 karolherbst: tagr: also.. I am sure you don't want to run it on your test farm :D
13:47 tagr: couldn't the test runner just extract the whole image and export it via NFS or something?
13:47 karolherbst: it could
13:48 tagr: that way it would be the test runner that gets most of the load, and disks are usually easy to replace on those
13:48 karolherbst: some are using LAVA to set up the entire system, some.. use something else? dunno
13:48 karolherbst: tagr: you can ask what the others are doing
13:49 tagr: I'm pretty much tied to the system that we have in the farm, I don't think I could pull this off locally
13:49 karolherbst: tagr: the issue is, that your CI maintainers will say "running containers from the internet?!? no way"
13:49 tagr: that's for the freedesktop CI integration, right?
13:50 tagr: in a first step we'd generate all the containers (or whatever) internally
13:50 karolherbst: sure, but people can submit MRs and I think everybody can trigger the pipeline
13:50 karolherbst: so one could hide a bitcoin miner or something.. dunno :D
13:50 karolherbst: I mean.. we'd notice, but
13:50 tagr: but yeah, accepting external requests is definitely something that has people worried =)
13:51 karolherbst: I'd just treat that whole thing as insecure and not expose it to anything secret or so
13:51 karolherbst: just treat it as untrusted
13:52 tagr: couldn't this just trigger something like a notification so that a trusted system could just go and grab the latest version when it gets notified?
13:52 tagr: and then it could just construct everything internally
13:52 karolherbst: tagr: you could
13:53 karolherbst: but you still pull from git repos from random users
13:53 karolherbst: but that's for pre merge
13:53 karolherbst: but if we ignore that
13:53 karolherbst: and say you have a fork, doing a daily pull from mesa/mesa
13:53 karolherbst: and push into that fork (which you can do within gitlab CI)
13:54 karolherbst: you can then notify an internal system and pull from that
13:54 karolherbst: and just roll out your own stuff for now
13:54 tagr: yeah, I think that's about as good as it's going to get for starters
13:54 karolherbst: yep
13:54 tagr: it's also infinitely better than the status quo
13:55 karolherbst: yeah
13:55 karolherbst: you could also like.. let's say.. run 1/10 of the CTS each run and randomize tests or so
13:56 karolherbst: depending on how long it takes to execute whatever tests you choose to test
13:56 karolherbst: or well.. wire up some devices multiple times and just shard it
13:57 karolherbst: tagr: https://gitlab.freedesktop.org/mesa/parallel-deqp-runner
13:57 karolherbst: that's what we use
13:58 karolherbst: you hand in a list of tests to run, what is allowed to fail etc... and it runs the deqp/CTS tests in parallel on the test machine
13:58 karolherbst: and you can hand in 1/10 of the list to each of your 10 machine test farm or something :D
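The two ideas above (handing 1/10 of the test list to each of 10 machines, or running a random 1/10 of the CTS per run) can be sketched roughly as follows. This is only an illustration, not how parallel-deqp-runner works internally; the function names `shard_tests` and `pick_fraction` and the test names are hypothetical.

```python
import random


def shard_tests(tests, num_shards):
    """Split a test list into num_shards disjoint, roughly equal lists."""
    ordered = list(tests)
    # Striding (i, i + n, i + 2n, ...) spreads neighbouring tests across
    # shards, so one slow test group does not land on a single machine.
    return [ordered[i::num_shards] for i in range(num_shards)]


def pick_fraction(tests, fraction, seed):
    """Pick a random subset of the tests, reproducible for a given seed."""
    ordered = list(tests)
    count = min(len(ordered), max(1, round(len(ordered) * fraction)))
    return random.Random(seed).sample(ordered, count)


# Hypothetical test names, used only for illustration.
all_tests = [f"example.group.test_{i}" for i in range(100)]

shards = shard_tests(all_tests, 10)             # one list per machine
nightly = pick_fraction(all_tests, 0.1, seed=1)  # 1/10 of the list per run
```

Each entry of `shards` could then be written to a file and handed to the runner on one machine. Changing the seed per run (for example, deriving it from the pipeline number) would cover different tests over time while keeping each individual run reproducible.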
14:42 karolherbst: tagr: anyway.. I will run my full CTS pipeline with libasan enabled.. maybe my workaround or the callback stuff was good enough to fix the tegra driver, but we still had some remaining issues in nouveau, as with my fixes the session did run for quite some time. It might not matter in practice, hence me never having experienced issues with only nouveau before
14:43 karolherbst: could be that both sides are kind of borked..
14:50 karolherbst: but it doesn't look broken
14:50 karolherbst: I'll.. see in 9 hours once all tests are executed
17:17 karolherbst: ehh.. all mesa core issues :(
17:18 karolherbst: wait.. CTS actually
17:34 karolherbst: yeah...
17:34 karolherbst: added a suppression entry for that CTS internal bug and now most of the crashes are gone
17:34 karolherbst: annoying
18:57 karolherbst: okay nice.. crash rate was around 5%, now it's at 0.5%
18:58 fudgespinner: nice decrease!
18:59 karolherbst: well.. I didn't fix anything, I just told libasan to ignore invalid memory copies inside the CTS
18:59 karolherbst: :D
19:00 fudgespinner: also, did you guys take a GPU and look at it under a microscope for reverse engineering? :3c
19:00 karolherbst: not yet
19:01 fudgespinner: it was supposed to be food for thought :)
19:01 fudgespinner: AKA, I'm just wondering.
19:02 karolherbst: yeah.. but usually you don't really understand anything from looking at it
19:02 karolherbst: it's just way too complex
19:04 fudgespinner: oh okay...
19:05 karolherbst: like what would be interesting to extract is the signing key used to sign firmware, in case it's symmetrical and everything, but then the question is whether we could use it or not or something...
19:05 karolherbst: but for understanding how the GPU works.. well
19:06 karolherbst: there are millions of transistors
19:06 karolherbst: ehh
19:06 karolherbst: billions
19:06 karolherbst: latest GPUs are close to 30 billion
19:08 fudgespinner: sweats profusely after seeing the number
23:34 karolherbst: tagr: yeah.. so with only nouveau gnome-shell just starts even with libasan compiled into mesa.. so no memory corruptions :(
23:35 karolherbst: so... no clue