08:52damo22: is anyone poking around in ucode for nv cards? i heard there was a limitation of nouveau versus the binary driver due to some crypto lock?
09:09pabs: IIRC cards that don't have signed firmware already have open firmware (and IIRC that is generated at runtime)
09:43damo22: if i remember correctly, the clock speed is running at like 1% of its full speed on some cards
09:44damo22: or is that when you omit the ucode?
10:03damo22: i remember now, adequate clock control is mostly missing from nouveau
10:06hell__: I remember talking about this some time ago
10:07damo22: hi hell__
10:07hell__: IIRC, one piece of the puzzle is missing to be able to change memory clocks properly
10:08hell__: something to temporarily store data while changing clocks
10:08hell__: but don't quote me with my colander memory on that
10:09damo22: iiuc we dont need to change any firmware to have clock control right?
10:10damo22: its probably some undocumented reg though
10:10hell__: yes, I vaguely recall the reason being lack of documentation
10:24karolherbst: damo22: you can't change some regs value without signed firmware
10:24karolherbst: that's why the firmware is _required_
10:25karolherbst: but there are sometimes other reasons. on 2nd gen maxwell the regs for reclocking aren't locked down yet, but fan control was
10:25karolherbst: so if anybody wants their 2nd gen maxwell to get crispy hot, they could actually just enable the kepler/maxwell reclocking code and go ahead
10:29damo22: is that on gm200?
10:30hell__: or bypass fan control using hardware methods, e.g. wire the fans elsewhere
10:31hell__: I wonder why some regs are locked without signed firmware
10:32hell__: I mean, if the regs are a mailbox (or similar) interface to talk with the thing running the firmware, it's no surprise that it won't work without firmware
10:41damo22: hang on, but if the binary driver can change the clocking and fan control, it must be using a signed firmware that allows that
10:41RSpliet: a non-distributable signed firmware even!
10:42damo22: i see
10:42karolherbst: hell__: "security reasons"
10:42karolherbst: and those are no mailbox things
10:42karolherbst: they control the hardware directly
10:43karolherbst: well.. more or less
10:43damo22: what if it was possible to sign any community firmware?
10:43karolherbst: there is a layer between the hardware and the iomem registers, but...
10:43RSpliet: Tl;dr: they repeatedly promised redistributable firmwares, but so far those were just castles in the air
10:43karolherbst: damo22: maybe, but then we also have to reverse engineer stuff, which means signing a lot of firmware
10:44karolherbst: and "security reasons" can include a lot of things 1. prevent modding 2. prevent malware from damaging hardware ...
10:44karolherbst: so if they just sign any random firmware from us, it could contradict some of their reasons they implemented all of that in the first place
10:44RSpliet: prevent crypto miners from circumventing limitations?
10:44karolherbst: RSpliet: well.. at least we get the firmware for acceleration
10:44karolherbst: at some point
10:45RSpliet: it's easy to get salty about their motivation, but it won't change much
10:45karolherbst: all we can do is to collaborate and make them want to release firmware
10:46karolherbst: time will tell if we succeed or not
10:47RSpliet: I think though that since last Friday we can officially say that NVIDIA has the worst OSS driver attitude in the industry? ;-P
10:51karolherbst: what happened on Friday?
10:51karolherbst: I was on PTO :)
10:54pabs: ImgTec released their libre Vulkan driver for PowerVR?
10:54pabs: yep, that was Friday https://lists.freedesktop.org/archives/mesa-dev/2022-March/225699.html
10:55karolherbst: ohh, right
10:55karolherbst: I thought RSpliet meant something Nvidia did
10:55pabs: more something they didn't do :)
11:03damo22: so the only reason we dont have a foss driver possibly on par with the binary driver is because nvidia hasn't released a redistributable firmware with unlocked features?
11:04karolherbst: damo22: I'd say having the firmware is a pre req for getting there
11:04karolherbst: the driver isn't just what's happening on the kernel side
11:04karolherbst: and there is enough stuff you have to write code for
11:05karolherbst: also.. bugs
11:08RSpliet: pabs: bingo
11:10damo22: maybe they will release some more firmware eventually
11:11karolherbst: we are still waiting for ampere acceleration firmware, so hopefully that will come soon
11:13damo22: "security reasons" so foss support can lag behind proprietary software
11:14karolherbst: I think they have valid reasons though
11:15RSpliet: damo22: in all sincerity, I think their security reasons make some sense in the context of datacentre/cloud GPUs. Not entirely convinced, but... I didn't see their studies towards it
11:17damo22: whats the difference between a low end datacentre gpu and a high end consumer gaming gpu?
11:17karolherbst: anyway, I never got the feeling they are actively fighting against FOSS
11:17karolherbst: damo22: support
11:17damo22: and $
11:17karolherbst: well support costs $
11:18damo22: why does a datacentre need support for a card that they run forever until it breaks?
11:18karolherbst: but yeah, as far as we can tell, the GPUs are more or less identical
11:18karolherbst: damo22: ohh there are some software features only available to those
11:19karolherbst: like that GRID stuff
11:19karolherbst: GPU virtualization
11:19damo22: right so the driver is locked for that reason
11:19karolherbst: it doesn't even matter if the driver prevents that stuff
11:19karolherbst: management wants check marks, and if the consumer driver doesn't say "supports GRID" they can't check that mark
11:21karolherbst: sure they make it harder, so that some companies can't ignore that, but usually it's all about official support
11:21karolherbst: support is more than just fixing bugs
11:22damo22: if the card doesnt support GRID because the driver has it disabled in firmware, then there is a reason to keep the driver crippled in some tiers/models
11:22karolherbst: I am not even sure it's disabled in firmware
11:22karolherbst: it's most likely in the driver
11:23karolherbst: but again.. that doesn't really matter. business with big bucks really love their check marks
11:23karolherbst: or like think about companies with weirdo policies and they need to run RHEL 7 for 15 years
11:24karolherbst: but also with those beefy GPUs
11:24karolherbst: and now they need nvidia to support this kind of stuff
11:24karolherbst: that's also part of "support"
11:24damo22: fun times
11:39RSpliet: damo22: the difference is time-sharing your GPU with multiple users that require strict isolation from each other. Comes with a new set of security challenges you didn't have to worry about as much on desktop.
11:40RSpliet: Oh and QOS/scheduling challenges too. That an individual user shouldn't be able to circumvent... which I guess is a new angle to "security"
11:49karolherbst: in the end we can only guess
11:51karolherbst: but you already hear about malware inside the firmware signed with the leaked keys... so I guess that kind of proves the point, some kind of sec is needed?
11:52RSpliet: We can only guess... or leak the internal documents that describe how they came to the decision of signed firmware :-PPP
11:53karolherbst: RSpliet: I don't want to give those coin bros any ideas, but I could think about hacked context switching firmware to schedule coin mining crap :D
11:54karolherbst: anyway, I won't look at this stuff
11:54RSpliet: Nah, I was obvs joking
11:55karolherbst: RSpliet: and here I thought you'd sacrifice yourself to do it
11:56RSpliet: hahaha. Not sure what's worse... looking at it as an OSS developer, or looking at it while working for a competitor. Let's not cluck around and find out on this one
11:59damo22: RSpliet: i know you guys are doing the best you can, and it helps free software users, thats all i care about
12:01RSpliet: damo22: don't look at me. I joined ImgTec a while ago, and not even the OSS driver team. All I did for their OSS driver RFC last Friday was a minimal amount of cheerleading :-P Tapped out of nouveau ages ago, just don't have the free time to spend on it.
12:01karolherbst: RSpliet: shame on you
12:04damo22: one day i will attempt to port the nouveau kernel component to GNU/Hurd, but we're not ready for that yet
12:05karolherbst: damo22: I think the only sustainable way of doing it is to either port all of drm+mm or include a linux compat layer for all the used stuff
12:05damo22: yes drm
12:05karolherbst: well.. it doesn't stop at drm
12:05karolherbst: I said mm for a reason
12:05damo22: what is mm?
12:06karolherbst: memory management
12:06karolherbst: drm and mm have deep ties for a lot of things
12:06karolherbst: ttm, but also mmu notifiers and similar things
12:07damo22: but hurd drivers are in userspace
12:07damo22: we can just call malloc?
12:07karolherbst: no they can't be anymore
12:07karolherbst: damo22: haha.. no
12:08damo22: or vm_allocate_contiguous
12:08karolherbst: so the thing isn't allocation or something
12:08karolherbst: but vm mirroring
12:08karolherbst: so GPUs and CPUs can share the same VM (more or less)
12:08karolherbst: so if the GPU page faults, you have to migrate stuff from the CPU into the GPU
12:08karolherbst: and vice versa
12:08damo22: oh crap
12:08karolherbst: yes :)
12:09karolherbst: luckily nobody really uses it for graphics
12:09karolherbst: you might not have to care?
12:09karolherbst: but compute workloads are interested in this kind of stuff
12:10damo22: i want a display in hurd too
12:10karolherbst: but you also have things like userptrs, which should be fairly trivial to implement in userspace, as you just map CPU memory into the GPUs VM
12:10karolherbst: without the mirroring stuff
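To make the "vm mirroring" idea above concrete, here is a minimal toy sketch in Python. It is not how nouveau, drm, or mm implement it; the class `MirroredVM` and its methods are hypothetical names made up for illustration. It only shows the bookkeeping: one address space, pages that live on either the CPU or the GPU, and a migration whenever the other side touches a page.

```python
PAGE_SIZE = 4096


class MirroredVM:
    """Toy model (hypothetical) of one address space shared by a CPU and a GPU."""

    def __init__(self):
        self.location = {}   # virtual page number -> "cpu" or "gpu"
        self.migrations = 0  # how many pages had to be moved so far

    def allocate(self, vaddr, owner="cpu"):
        # Record which side currently holds the page backing vaddr.
        self.location[vaddr // PAGE_SIZE] = owner

    def access(self, device, vaddr):
        """Touch vaddr from `device`; migrate the page if it lives elsewhere."""
        page = vaddr // PAGE_SIZE
        if page not in self.location:
            raise MemoryError(f"unmapped address {vaddr:#x}")
        if self.location[page] != device:
            # "Page fault": move the page to the side that faulted.
            self.location[page] = device
            self.migrations += 1
        return self.location[page]


vm = MirroredVM()
vm.allocate(0x1000)        # page starts out in CPU memory
vm.access("gpu", 0x1000)   # GPU faults, page migrates to the GPU
vm.access("gpu", 0x1008)   # same page, no migration needed
vm.access("cpu", 0x1000)   # CPU faults, page migrates back
print(vm.migrations)
```

A userptr-style mapping, as mentioned above, would skip the migration step entirely: the GPU page table would simply point at the CPU page.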
12:11karolherbst: but atm there are such deep ties between drm and mm
12:11karolherbst: I know that nouveau and amdgpu/amdkfd make use of that, maybe other drivers
12:11karolherbst: but anyway.. thinking about porting drm also includes thinking about what to do with the mm bits
12:12damo22: currently the hurd's kernel manages all memory
12:12karolherbst: it's a tough call.. dealing with all of this requires a lot of time and work, but not doing it makes GNU hurd pretty irrelevant
12:13karolherbst: ahh, right
12:13damo22: we could add an interface to map gpu memory
12:13damo22: or something
12:13karolherbst: something something :)
12:14RSpliet: Defo sounds like a team effort to me
12:14karolherbst: mapping GPU memory is required one way or the other
12:14karolherbst: into the CPUs VM I mean
12:14karolherbst: everything else is more or less optional
12:14damo22: yeah ok
12:14damo22: thanks for the info
12:15karolherbst: optional in the sense of you only need it for GL/VK/CL extensions
12:15karolherbst: or optional features
12:17damo22: the main memory is virtualised to a slab allocator in mach afaik, but gpu memory is accessed via PCI space isnt it?
12:17karolherbst: RSpliet: yeah, I am sure that without companies investing workforce it won't fly
12:17karolherbst: damo22: yeah, dma stuff
12:18karolherbst: some GPUs even sit behind an iommu
12:18damo22: yeah we havent figured out iommu yet
12:18karolherbst: nvidia jetsons e.g., but might be true for other platforms
12:18karolherbst: at least I think that's where it's used
12:19karolherbst: we do have iommu support, but...
12:19damo22: but userspace accessing pci over iommu would be bulletproof
12:19karolherbst: sure, but you can't rely on that for desktops
12:19karolherbst: it's an optional feature on Intel CPUs e.g. and there are many without support for that at all
12:20karolherbst: anyway.. from a gut feeling I'd say device memory has to be managed inside the kernel regardless
12:20karolherbst: can't treat it differently than normal RAM
12:20karolherbst: sensitive information might be in VRAM, like rendered browser windows showing recovered passwords :)
12:22damo22: we have a layer that arbitrates access to pci via a userspace server, you can access the pci regions as files on a virtual set of nodes, and mmap them directly
12:23damo22: they have permissions just like normal files
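The "PCI regions as files that you mmap" model described above can be sketched roughly as follows. This is an assumption-laden illustration, not the Hurd's actual interface: a temporary file stands in for the region node, and `read_reg32` / `write_reg32` are hypothetical helper names. Only the standard `mmap`, `os`, `struct` and `tempfile` modules are used.

```python
import mmap
import os
import struct
import tempfile


def read_reg32(region, offset):
    """Read a little-endian 32-bit value at `offset` in a mapped region."""
    return struct.unpack_from("<I", region, offset)[0]


def write_reg32(region, offset, value):
    """Write a little-endian 32-bit value at `offset` in a mapped region."""
    struct.pack_into("<I", region, offset, value)


# Stand-in for a PCI region node. A real node would be provided by the
# PCI server; here it is just a temporary file (created owner-only by mkstemp).
fd, path = tempfile.mkstemp()
os.ftruncate(fd, mmap.PAGESIZE)

region = mmap.mmap(fd, mmap.PAGESIZE)  # shared read/write mapping by default
write_reg32(region, 0x10, 0xCAFEF00D)  # 0x10 is an arbitrary example offset
value = read_reg32(region, 0x10)

region.close()
os.close(fd)
os.remove(path)
```

The file permissions decide who may open and map the node at all; they say nothing about what the mapped registers let that process reach afterwards, which is the concern raised in the lines that follow.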
12:25karolherbst: okay sure, and the server is responsible for clearing returned memory and making sure the same bits are not mapped twice? _and_ is also aware of context switched memory regions? like "that region looks different for each GPU context"
12:26damo22: why cant you map the same bits twice?
12:26karolherbst: because it might point into random VRAM
12:26karolherbst: and processes could read out each others allocated VRAM
12:27karolherbst: which sounds to me like a huge security problem
12:28karolherbst: but that context switched stuff is also annoying, because it depends on the _active_ context which the firmware switches to, not the kernel or userspace
12:28karolherbst: the region we are talking about here for GPUs is _huge_
12:28damo22: we can have a separate server that manages the gpu memory
12:29karolherbst: and what would that change?
12:29karolherbst: you can literally write a GPU physical address into an iomem reg and read it out
12:29karolherbst: the memory in VRAM I mean
12:30damo22: well thats only problematic if you put things in VRAM that are sensitive
12:30karolherbst: via iomem reg
12:30karolherbst: damo22: well.. your browser content?
12:30karolherbst: stuff has to be rendered somewhere
12:30karolherbst: everything you see on a display lives in VRAM
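A toy model of the concern above: if a process can write a VRAM physical address into a register and read the data back through a window, ownership of the underlying buffer no longer matters. The register layout here is invented for illustration (`ToyGpu`, `write_window_base` and `read_window` are hypothetical and do not describe any real NVIDIA register).

```python
class ToyGpu:
    """Hypothetical GPU with a small register window onto all of VRAM."""

    WINDOW_SIZE = 0x1000

    def __init__(self, vram_size):
        self.vram = bytearray(vram_size)
        self.window_base = 0  # made-up "window base" register

    def write_window_base(self, phys_addr):
        # Point the window at an arbitrary VRAM physical address.
        self.window_base = phys_addr

    def read_window(self, offset, length):
        # Read `length` bytes through the window, starting at `offset`.
        if offset + length > self.WINDOW_SIZE:
            raise ValueError("read exceeds window size")
        start = self.window_base + offset
        return bytes(self.vram[start:start + length])


gpu = ToyGpu(vram_size=1 << 20)

# Process A renders something sensitive into its part of VRAM.
secret = b"rendered password"
gpu.vram[0x42000:0x42000 + len(secret)] = secret

# Process B only has raw register access, yet can read A's data.
gpu.write_window_base(0x42000)
leaked = gpu.read_window(0, len(secret))
```

In this model nothing checks who owns the memory behind the window, which is why such registers are touched only by the kernel driver on Linux.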
12:31damo22: i see
12:31damo22: so the kernel hides all that
12:32damo22: we will need to provide an interface that also hides the VRAM but lets you use the card
12:32karolherbst: the kernel driver touches the iomem regs, userspace only uses stuff exposed via ioctls
12:32damo22: we can reuse that
12:32karolherbst: damo22: yeah... so.. there is the concept of command buffers. It's memory where you write GPU specific commands and let the kernel schedule it on the GPU
12:32karolherbst: for rendering and stuff
12:33karolherbst: but memory management is its own thing
12:33karolherbst: allocating buffers is something you need to do on top
12:33karolherbst: and copying into those
12:33karolherbst: or read
12:33damo22: ok sounds like we need to have VRAM managed by mach
12:33karolherbst: something also need to allocate a GPU context and a GPU VM and attach that stuff
12:33karolherbst: and write the GPUs page tables :)
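The pieces listed above (a GPU context, a per-context GPU VM with page tables, buffer allocation, and command buffer submission) can be sketched as a toy model. All names here (`ToyKernelDriver`, `GpuContext`, `alloc_buffer`, `submit`) are hypothetical and do not correspond to the nouveau ioctl interface. The address check in `submit` stands in for what the GPU's own MMU would enforce through the page tables on real hardware.

```python
PAGE = 4096


class GpuContext:
    """Hypothetical per-process GPU context with its own GPU VM."""

    def __init__(self, ctx_id):
        self.ctx_id = ctx_id
        self.page_table = {}  # GPU virtual page -> VRAM physical page


class ToyKernelDriver:
    """Toy stand-in for the privileged side that owns VRAM and page tables."""

    def __init__(self, vram_pages):
        self.free_pages = list(range(vram_pages))
        self.contexts = {}
        self.submitted = []

    def create_context(self):
        ctx = GpuContext(len(self.contexts))
        self.contexts[ctx.ctx_id] = ctx
        return ctx.ctx_id

    def alloc_buffer(self, ctx_id, gpu_vaddr, num_pages):
        # Back a GPU virtual range with free VRAM pages and write the mapping.
        ctx = self.contexts[ctx_id]
        if num_pages > len(self.free_pages):
            raise MemoryError("out of VRAM")
        first_page = gpu_vaddr // PAGE
        for i in range(num_pages):
            ctx.page_table[first_page + i] = self.free_pages.pop()

    def submit(self, ctx_id, commands):
        # Each command is (name, gpu_vaddr); reject addresses not mapped
        # in the submitting context's VM.
        ctx = self.contexts[ctx_id]
        for name, gpu_vaddr in commands:
            if gpu_vaddr // PAGE not in ctx.page_table:
                raise PermissionError(f"{name}: {gpu_vaddr:#x} not mapped")
        self.submitted.append((ctx_id, list(commands)))


driver = ToyKernelDriver(vram_pages=16)
ctx_a = driver.create_context()
ctx_b = driver.create_context()
driver.alloc_buffer(ctx_a, 0x10000, 2)
driver.submit(ctx_a, [("draw", 0x10000)])
```

Because each context has its own page table, the same GPU virtual address in two contexts can point at different VRAM pages, and userspace never handles a VRAM physical address directly.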
12:34damo22: ok good to know, is that like in GART thing?
12:34karolherbst: damo22: yeah.. I'd say that's the only path forward which wouldn't be super annoying to deal with
12:35karolherbst: yeah.. GART is kind of a different term for dma
12:36karolherbst: although I am not even sure everybody has the same meaning for that term
12:36karolherbst: I think it was mainly used when the GPUs didn't have contexts and something had to isolate stuff
12:37karolherbst: reading up on TTM should clarify a lot of things... hopefully
12:39damo22: hmm maybe we can have a userspace server that connects to existing pci server, to probe for the device, and it provides the VRAM isolation
12:39damo22: and context etc
12:40damo22: then we can replace the existing pci server with one that supports iommu
12:42damo22: if anyone wants to hack the card and read other people's VRAM they will just be prevented by the permissions on the pci region
12:43damo22: but it could allow a user to fire up a graphics card without root at all being involved :)
12:44karolherbst: I'd just lock down everything unless there is a reason userspace should be able to access it
12:44karolherbst: some of those interfaces are also only really useful for debugging or other things
12:45karolherbst: normally all memory access should go through dma
12:46karolherbst: but something has to configure that as well
12:46damo22: there will be no drivers in hurd's kernel i just spent ages putting disk driver into userspace
12:46karolherbst: sounds like a mistake to me, but...
12:47karolherbst: not saying a driver has to be as huge as in linux, but I am convinced some bits have to live inside the kernel
12:48karolherbst: and the main one being memory management
12:48damo22: yes we have memory management
12:48karolherbst: right, but devices also might have memory and that needs also to be managed
12:49damo22: interesting point
12:50damo22: why cant the driver have that built in per device
12:50damo22: (in userspace)
12:50karolherbst: damo22: well.. for one reason that dma needs physical addresses to actually work and the other point is, that stuff goes both ways
12:51karolherbst: so if you know the physical address of memory in RAM you can map it into the GPUs vm and read/write stuff
12:52karolherbst: even memory used by the kernel
12:52damo22: we handled disk buffers with dma to use random addresses in RAM iirc
12:52karolherbst: damo22: well.. how random are _physical_ addresses
12:53karolherbst: but it's not even a data leak issue, the GPU can trash kernel memory
12:53karolherbst: so you could dos the machine
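The DMA problem above can be illustrated with a toy model: a device that is handed a physical address can write anywhere in RAM unless something in between restricts it. This sketch is a simplification under stated assumptions: `ToyMachine`, `allow_dma` and `device_dma_write` are hypothetical names, and a real IOMMU translates device addresses through its own page tables rather than filtering physical addresses as done here.

```python
PAGE = 4096


class ToyMachine:
    """Hypothetical machine: RAM plus an optional IOMMU-like DMA filter."""

    def __init__(self, ram_pages, iommu_enabled):
        self.ram = bytearray(ram_pages * PAGE)
        self.iommu_enabled = iommu_enabled
        self.dma_allowed = set()  # physical pages the device may touch

    def allow_dma(self, phys_page):
        self.dma_allowed.add(phys_page)

    def device_dma_write(self, phys_addr, data):
        # Assumes non-empty data. Check every page the write touches.
        first = phys_addr // PAGE
        last = (phys_addr + len(data) - 1) // PAGE
        if self.iommu_enabled:
            for page in range(first, last + 1):
                if page not in self.dma_allowed:
                    raise PermissionError(f"DMA to page {page} blocked")
        self.ram[phys_addr:phys_addr + len(data)] = data


KERNEL_PAGE = 0  # pretend page 0 holds kernel data
BUFFER_PAGE = 3  # page handed to the device as a DMA buffer

# Without an IOMMU the device can overwrite kernel memory.
unprotected = ToyMachine(ram_pages=4, iommu_enabled=False)
unprotected.device_dma_write(KERNEL_PAGE * PAGE, b"\xff" * 8)

# With the filter, only the granted buffer page is reachable.
protected = ToyMachine(ram_pages=4, iommu_enabled=True)
protected.allow_dma(BUFFER_PAGE)
protected.device_dma_write(BUFFER_PAGE * PAGE, b"ok")
```

On a machine without an IOMMU, whoever programs the device's DMA addresses has to be trusted, which is the argument made above for keeping that part in the kernel.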
12:53damo22: yeah we need iommu
12:53karolherbst: you don't have iommu
12:54karolherbst: or sure, you can say we support GPUs on 15% of all machines or something.. dunno
12:55damo22: VRAM is located in PCI space though, gpu can write to anything outside that region?
12:56karolherbst: no, it's not
12:56karolherbst: I think through the normal PCI ways you can access like 256MB
12:56karolherbst: but... given the amount of VRAM modern GPUs have
12:57karolherbst: there are some ways of extending VRAM accessible through PCI, but...
12:57karolherbst: normally you do everything through dma
12:57karolherbst: VRAM lives at some physical address and it gets mapped in
12:58karolherbst: although at this point I am not sure if the PCI BARs are simply reconfigured or not.. mhhh
12:59karolherbst: yeah not sure how all of that works in the deep detail
12:59damo22: is it possible to map the entire VRAM to a single mapping?
12:59karolherbst: I don't know for sure
13:00damo22: i'll read TTM
13:01karolherbst: there are some people who really know this stuff in and out, so maybe it makes sense to ask on the dri-devel mailing list
13:01damo22: im not ready for that yet, i need to do some more homework
18:40agd5f: karolherbst, damo22 you can do stuff like resize PCI BARs if you have enough MMIO space on your platform. Also platforms with newer bioses will automatically resize BARs if the right conditions are met so the OS doesn't have to.
18:42karolherbst: ahh yeah, that rings a bell
18:44karolherbst: but I don't think we do it for nouveau atm