05:47pmoreau: @imirkin: Consistency, so that `constbufs` and `textures` both refer to the same stage, rather than the former being about fragment shaders and the second geometry shaders.
05:48imirkin: pmoreau: OCD? :p
09:49tagr: karolherbst: so, turns out X doesn't actually do framebuffer modifiers by default either because the kernel refuses to enable it for X specifically =\
09:49tagr: refuses to enable atomic modesetting, that is
09:50tagr: framebuffer modifier support in X is currently tied to atomic modesetting, but I have a fairly trivial patch to fix that
10:08karolherbst: I thought I didn't see the message for newer X, no?
10:13tagr: karolherbst: I think this is basically just the "implicit" framebuffer modifiers at work again
10:43ProN00b: is there a writeup explaining the technical/political challenges behind the pmu issue anywhere?
10:46RSpliet: ProN00b: it's not that long a story. Technical: hardware rejects full access to all hardware registers from firmware that isn't signed by NVIDIA. That access is required by PMU firmware to configure important stuff like fan speed.
10:46RSpliet: political: NVIDIA has not released a redistributable PMU firmware that is capable of all the things that the official driver's PMU firmware can do. Not sure they released one at all.
10:47RSpliet: Equally there are no facilities for signing firmwares developed by the community.
10:48ProN00b: so you can not ship the signed nvidia pmu firmware from the official drivers i take it? was nvidia asked for that?
10:49ProN00b: did you write pmu firmware for the older cards that are supporting reclocking or are there releases or is the on-card firmware good enough there?
10:49RSpliet: So far nobody's gone the mile of extracting them from the official driver. They're sort of hidden and compressed. And no, that would not be redistributable.
10:49karolherbst: some parts of the reclocking are implemented in the pmu for older gens
10:50karolherbst: but on newer gens that's not feasible
10:50RSpliet: Yes, GPUs prior to second-gen Maxwell run nouveau-developed firmware. They facilitate "reclocking" to a certain degree.
10:50karolherbst: newer gens lock down fan controls and voltage regulation
10:50karolherbst: so there isn't much we can do anyway
10:51ProN00b: RSpliet, technically speaking whats the architecture called that firmware is compiled as? can like gcc or llvm compile to nvidia pmu chip arch?
10:51RSpliet: ProN00b: the architecture is called falcon
10:51RSpliet: It's developed by NVIDIA
10:51RSpliet: envytools contains an assembler and disassembler for it
10:57ProN00b: is there a common protocol that is used to communicate with the falcon pmu? lets say nvidia just pushed their signed pmu blobs to the kernel, would nouveau use them? is there a common protocol to talk to the firmware that would just have to be adapted or would it still require extensive work on a per card basis on the nouveau side of things?
10:58karolherbst: there is no "protocol" as in "hw"
10:58karolherbst: you write software and that does things
10:58karolherbst: of course nvidia has their IPC stuff and an API/ABI to pass things
10:58karolherbst: so they can always change stuff inside their driver and we would have to adjust
10:59karolherbst: or we stick to one version and go with that
10:59karolherbst: but then we would have to support newer driver versions API/ABI
10:59karolherbst: for newer gens
10:59ProN00b: so how common between the archs you do support is the pmu communication?
10:59karolherbst: it has nothing to do with archs
11:00ProN00b: the families then
11:00karolherbst: it has nothing to do with families
11:00karolherbst: it's all software
11:01ProN00b: i mean one card model has one pmu, another card model has another pmu
11:01karolherbst: the only thing the hw defines is how interrupts work
11:01karolherbst: doesn't matter
11:01karolherbst: and also the hw defines the ISA used
11:01karolherbst: but nothing else
11:01karolherbst: you can do whatever you want
11:02karolherbst: you can have the same API on all gens, you can change it every driver version
11:02karolherbst: doesn't matter
11:02karolherbst: the PMU is really just a processor running code and doing interrupts
11:03karolherbst: your question is like asking if TCP depends on the CPU
11:03ProN00b: and the answer to that would be clear
11:04tagr: karolherbst: okay, so it looks like for X the modifier stuff only gets enabled for dri clients
11:04tagr: so glxgears for example will end up allocating with framebuffer modifiers
11:04tagr: but the modesetting driver itself doesn't currently work that way, so we need that fix to actually make it do so
11:05tagr: karolherbst: but I also found a nifty workaround for the legacy X case and it involves drirc
11:05karolherbst: that might actually work then
11:05tagr: karolherbst: basically the idea is to set "allow_implicit_framebuffer_modifiers" to true for Xorg
11:06tagr: then this needs a bit of parsing code and we can store that value in the Tegra driver and simply skip the detiling blit and everything seems to work fine with that
11:06ProN00b: karolherbst, so how is the workflow in nouveau lets say for tegra, you make a pmu firmware using that envytools falcon assembler and then communicate with that from the driver using your own "protocol" ?
11:06karolherbst: ProN00b: pretty much yes
11:07karolherbst: you essentially just share memory between the kernel driver and the falcon
11:07karolherbst: and you can send an interrupt to the falcon to do stuff
11:07karolherbst: or the falcon sends an interrupt to the CPU so it does stuff
11:08karolherbst: and you can do whatever you want inside that shared memory
11:08karolherbst: I think right now we just use some scratch registers to exchange data
11:10ProN00b: so the interrupt contains an instruction pointer you point to code placed in the shared memory or how does it work?
11:10karolherbst: just a plain interrupt
11:10karolherbst: after that the CPU or falcon just reads out stuff from the shared memory
11:10karolherbst: and in there you encode whatever needs to be done
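A minimal sketch of the pattern karolherbst describes here: a plain interrupt with no payload, plus a shared region that both sides read and write. Everything in it is an assumption made for illustration (the command IDs, the three-word layout, the `Falcon` and `Host` class names); it does not reflect the actual nouveau firmware interface or NVIDIA's ABI, and the "interrupt" is just a method call.

```python
import struct

# Hypothetical command IDs; nothing here matches real nouveau or NVIDIA firmware.
CMD_SET_VALUE = 1
CMD_GET_VALUE = 2

# Assumed layout of the shared region (little-endian u32): command, argument, reply.
LAYOUT = struct.Struct("<III")


class Falcon:
    """Stand-in for the firmware side: it only reacts to interrupts."""

    def __init__(self, shared):
        self.shared = shared
        self.value = 0

    def interrupt(self):
        # The interrupt carries no payload; the handler reads the request
        # from shared memory and writes the reply back into it.
        cmd, arg, _ = LAYOUT.unpack_from(self.shared, 0)
        if cmd == CMD_SET_VALUE:
            self.value = arg
            reply = 0
        elif cmd == CMD_GET_VALUE:
            reply = self.value
        else:
            reply = 0xFFFFFFFF  # unknown command
        LAYOUT.pack_into(self.shared, 0, cmd, arg, reply)


class Host:
    """Stand-in for the kernel driver side."""

    def __init__(self, shared, falcon):
        self.shared = shared
        self.falcon = falcon

    def call(self, cmd, arg=0):
        LAYOUT.pack_into(self.shared, 0, cmd, arg, 0)  # encode the request
        self.falcon.interrupt()                        # ring the doorbell
        return LAYOUT.unpack_from(self.shared, 0)[2]   # read the reply


shared = bytearray(LAYOUT.size)
host = Host(shared, Falcon(shared))
host.call(CMD_SET_VALUE, 42)
print(host.call(CMD_GET_VALUE))  # prints 42
```

The point of the sketch is that the encoding inside the shared region is pure software convention: changing `LAYOUT` or the command IDs changes the "protocol" without touching anything the hardware defines.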
11:11ProN00b: so you upload firmware that includes an interrupt handler that does the work?
11:11karolherbst: pretty much yes
11:11karolherbst: you can of course also have internal interrupts like timers
11:11karolherbst: so you need to be careful on how to do all of that
11:11ProN00b: another thing, can you actually run firmware on newer card PMUs, but since its not signed that firmware can not access hardware registers or did i misunderstand?
11:11karolherbst: but essentially that's all you have
11:12karolherbst: it can access _some_ registers
11:12karolherbst: just not the ones we care about
11:12karolherbst: tagr: yeah.. that might actually work
11:18tagr: karolherbst: I slightly worry that this might break again if the modesetting driver is ever "fixed" to no longer query modifiers from the GBM buffer objects that it allocated (without requesting modifiers)
11:18tagr: but perhaps I should add a comment to that, so that it doesn't get changed
11:19ProN00b: is there an overview of the architecture and security mechanisms? like what processors there are and what code they are running and what they are bootstrapped by?
11:19tagr: in either case, I suspect that if it does get changed at some point we may already be far enough ahead with modifiers support that we don't have to worry
11:24karolherbst: tagr: yeah..
11:25karolherbst: I think assuming modifiers will be the norm is probably safe
11:25karolherbst: I mean.. on 1.21 all of that just works anyway or not?
11:25karolherbst: or is there some stuff also broken there?
11:25tagr: karolherbst: but thinking about it, if the modesetting driver fix I have can be backported to 1.20 (it's something on the order of 5 or so lines, and a fairly obvious fix), we may not actually need that drirc quirk
11:26karolherbst: I see
11:26karolherbst: do you know what others think about that patch?
11:26tagr: karolherbst: modesetting on 1.21 doesn't allocate with modifiers because Linux denies atomic modesetting to X
11:26karolherbst: if it's really easy to get merged, then that could be fine
11:26karolherbst: why does it work with 1.21 then?
11:26tagr: I'll send out the modesetting patch shortly, then we'll see how it's received
11:28tagr: karolherbst: well, modesetting in 1.21 still relies on the "implicit modifiers" behaviour that we were discussing a while ago (i.e. modesetting may allocate without framebuffer modifiers, but then queries the modifier of the buffer anyway, so if we don't enforce pitch-linear in the DRI driver, then we may end up with a modifier != DRM_FORMAT_MOD_LINEAR and that happens to work fine on Tegra)
11:29karolherbst: I see
11:29tagr: the difference between 1.20.8 and 1.21 is that glxgears doesn't actually allocate with modifiers in the former case, and forcing pitch-linear there causes the depth buffer format mismatch that was causing the crash
11:30tagr: in 1.21, glxgears will end up allocating with framebuffer modifiers and then the pitch-linear problem goes away
11:31tagr: 1.21 with my modesetting fix won't rely on the implicit modifier bug/feature, so as long as universal planes are supported by the kernel and as long as the kernel advertises framebuffer modifiers for any of the primary planes, it should work
11:32tagr: for X 1.20.8 and glxgears we'd need the de-tiling blit workaround in Mesa, I think
11:34tagr: I'd have to check what exactly happens when we backport the modesetting fix to 1.20.8
11:34tagr: that should fix the problem that I was seeing with X not flushing the front buffer and therefore the de-tiling blit not happening (because the front buffer is now allocated with modifiers and we don't need the de-tiling blit)
11:35tagr: not sure about what would happen to glxgears in that case... I think that would still do the de-tiling blit, even if it isn't necessary (but only because it's being allocated with PIPE_BIND_SCANOUT)
11:36tagr: ugh... this is horribly complicated, but at least my head is no longer spinning, so I'm fairly confident I now know what I'm talking about =)
11:56karolherbst: :D nice
12:04tagr: karolherbst: so pull requests are the new standard way of merging changes to Mesa? or can I still send patches to the mailing list and apply directly to git after they've been reviewed?
12:04tagr: s/pull requests/merge requests/
12:05karolherbst: tagr: we do mainly MR now. Usually if you merge/push directly somebody will complain about it :p
12:05tagr: karolherbst: alright then, I'll have to hunt down somebody from the fdo team because I can't seem to remember my gitlab password
12:06tagr: also, it doesn't send password reset emails to any of my mail addresses if I try that
12:06tagr: but I'm not even sure if I explicitly signed-up, so there may not be an email associated with my account there
12:09tagr: daniels, mupuf: is this anything you guys could help out with? I don't recall signing up to gitlab, but for some reason a "tagr" account does exist
12:09tagr: daniels, mupuf: I can't recall the password and password recovery to any of my known email addresses doesn't work
12:11karolherbst: tagr: try using github?
12:11daniels: hang on
12:12daniels: heh, I assume your @avionic-design.de address no longer works :P
12:12tagr: daniels: ha... indeed it doesn't =)
12:12daniels: should I change it to your gmail address?
12:13tagr: wow... throwback Friday =)
12:13tagr: daniels: yes, email@example.com is the one
12:14daniels: can I delete your thierry.reding account?
12:15tagr: hm... I didn't know there was one
12:15daniels: you created it about 7 minutes ago
12:16tagr: huh... oh wait... that was perhaps as I was trying to log in with Google
12:16tagr: yeah, that doesn't make any sense, I'd prefer to stick with the "tagr" one
12:16daniels: cool, I'll just delete that then
12:17tagr: easier to match up with the fdo username and everything
12:17daniels: ok, tagr has firstname.lastname@example.org now, so you can do a password reset with that, and then you can associate the account with your google one
12:18daniels: (account settings -> account -> connected accounts)
12:18tagr: oohh... nifty
12:38tagr: karolherbst: do I need to let anyone know specifically about a merge request? and, do I also send out patches to the mailing list for review?
12:39karolherbst: set labels
12:39karolherbst: if not, I can set them for you
12:39karolherbst: or is that against xorg-server?
12:40tagr: karolherbst: there's a couple, actually,
12:40tagr: here's the first one: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6425
12:40tagr: which is just something that I've been carrying around for much too long
12:42karolherbst: tagr: ohh, I just saw that you are located in Hamburg? nice
12:42tagr: where are you located?
12:43karolherbst: brno, but I am trying to relocate back to hamburg
12:43karolherbst: so maybe starting next year I might be in hamburg as well
12:46tagr: oh, are you originally from Hamburg?
12:46karolherbst: more or less yes
12:46karolherbst: live like 40km north of it
12:46karolherbst: then moved to hamburg and then to brno
12:48tagr: karolherbst: cool, perhaps we can meet up when you've moved back
12:48karolherbst: yeah, would be cool
12:48tagr: if circumstances allow =P
12:49tagr: karolherbst: here's the detiling blit patch that should get modifiers-unaware Wayland compositors working: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6426
12:49tagr: I've labelled that with "tegra"
12:49tagr: do the labels mean that somebody "responsible" for that label will get notified?
12:51karolherbst: you can subscribe to labels and then you get mails
12:53karolherbst: anyway, I will give the patch a go and see how that works out :)
12:54tagr: karolherbst: looks like it works with Weston, kmscube (modifier-unaware) and X 1.20.8 with fullscreen glxgears
12:54tagr: karolherbst: windowed glxgears still broken, but I'll check if I can make it work if I cherry-pick my modesetting fix onto v1.20.8
13:11tagr: karolherbst: here's the fix for the modesetting driver: https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/497
13:11tagr: karolherbst: I can cherry-pick this cleanly onto v1.20.8 and then everything works as expected in X (both glxgears windowed or fullscreen)
13:12karolherbst: yeah.. I know nothing about X internals, so I can't comment on that one :)
13:13karolherbst: but would be cool if this fixes it
13:13karolherbst: let me patch xorg-server as well
13:18tagr: karolherbst: the combination fixes the X issues for me
13:19karolherbst: ehh.. that patch doesn't apply on top of all the patches fedora has :D
13:19tagr: it's also small enough that I think it might be a candidate for a 1.20.9
13:19karolherbst: "error: hw/xfree86/drivers/modesetting/driver.h: patch does not apply" ehhh
13:19tagr: can you apply it manually? it's something like 5 lines
13:19karolherbst: yeah.. trying to fix it up
13:19tagr: or if you have a git branch with the fedora patch stack I can rebase on top of that
13:20karolherbst: ohh.. that's just 1.20.8.. mhh
13:21tagr: ah... the merge request is for git master
13:21tagr: git cherry-pick was happy to apply it onto 1.20.8
13:22karolherbst: yeah.. git is fine with the patch
13:22tagr: karolherbst: for glxgears you'll still need the detile patch as well
13:22karolherbst: "} modesettingRec, *modesettingPtr" vs "/* shadow API */" in the last line :D
13:23karolherbst: tagr: yeah.. I am building mesa and xorg-server in my copr and then just install that and see how that goes
13:23karolherbst: just doing it like that so I can ask others to test as well :)
13:24tagr: okay, great
13:24tagr: karolherbst: I did notice some inconsistent behaviour with modifier-unaware kmscube... looks like the rotation isn't entirely smooth
13:25tagr: and it's not just the low framerate that's causing it
13:26tagr: looks like it's "bobbing" around a bit, as if every other frame was temporally reversed or something
13:26tagr: it doesn't seem to happen all the time, though
13:27tagr: which might explain why I didn't notice it earlier
13:27tagr: or maybe I just wasn't looking closely enough
13:27imirkin: ProN00b: https://nvidia.github.io/open-gpu-doc/Falcon-Security/Falcon-Security.html -- this is all we got.
13:28tagr: hmm... I think I didn't notice before because it's almost not noticeable when I run at pstate = 5 and higher EMC frequency (which gives me about 60 fps for kmscube)
13:29imirkin: tagr: re your much earlier comment, atomic in modesetting is kinda broken, so got killed off at the kernel level :)
13:29karolherbst: tagr: yeah...
13:29karolherbst: I think stuff isn't that great if you drop below 60 fps
13:30tagr: I wonder if somehow a ->flush_resource() is not enough for Nouveau and we end up partially scanning out a frame and then the detiling blit ends up overwriting the next frame?
13:30karolherbst: tagr: ohh.. one thing... a patch isn't allowed to break the ABI
13:31karolherbst: but I think that's modesetting internal only, right?
13:31tagr: karolherbst: I think if I look very closely I also see this happening at 60 fps, it's just very hard to notice because the time difference and therefore the rotation is very small between frames
13:31tagr: karolherbst: yes, that modesetting patch is all just internal to the modesetting driver
13:32tagr: which reminds me that I do have a patchset somewhere that uses proper fences for synchronization between Tegra DRM and Nouveau
13:33tagr: not sure if that would fix this, though
13:34tagr: ha... nvc0_flush_resource() is a nop!
13:35karolherbst: imirkin probably knows why
13:35karolherbst: maybe not
13:35karolherbst: it got added with 419cd5f2a24b
13:35imirkin: not offhand, nor do i know what that callback would do
13:36karolherbst: "Flush the resource cache"
13:36imirkin: right. i have no idea what that is.
13:36karolherbst: prime stuff
13:36imirkin: what cache
13:37imirkin: anyways, yeah. clearly i know nothing about it.
13:37karolherbst: it's so that you flush driver internal caches so external clients can use that stuff
13:37karolherbst: or something
13:37tagr: I guess this works fine on Nouveau because it does the synchronization elsewhere
13:37karolherbst: seems only r600 needs it
13:37tagr: or perhaps because there's no need for any synchronization in the first place as long as everything remains in VRAM
13:38tagr: for Tegra it'll be relevant, though
14:28tagr: karolherbst: http://ix.io/2uON
14:28tagr: that seems to get rid of the bobbing
14:28tagr: though it also cuts the performance roughly in half
14:28tagr: not sure why exactly that is
14:30imirkin: nouveau_bo_wait is a pretty heavy op
14:31karolherbst: tagr: mhh..
14:31imirkin: tagr: i wonder if it'd be enough to just flush the local command buffer
14:31imirkin: or maybe throw in a synchronize
14:32karolherbst: yeah.. I see the issue as well locally
14:44tagr: imirkin: how do I flush the local command buffer?
14:45tagr: or throw in a synchronize?
14:45tagr: that's what I was looking for when I found nouveau_bo_wait() =)
14:45imirkin: tagr: well
14:45imirkin: synchronize as in the synchronize command
14:45imirkin: to the gpu
14:45imirkin: which will get it to not process additional draws until it's done with the current ones
14:46imirkin: it's like 0x110? something around there.
14:46tagr: imirkin: it actually sounds to me more like what we really need is a fence that we can wait for
14:46imirkin: anyways, to flush the local command buffer...
14:46imirkin: well, that's what nouveau_bo_wait does, sorta
14:47tagr: imirkin: double buffering should already ensure that we don't overwrite anything, but what we do need is to wait for the rendering to finish before we can display it
14:50tagr: I'm going to pull in my sync fd changes from a long time ago on top of this, because they were meant to address exactly that
14:50imirkin: so what nouveau_bo_wait does is ... wait on a fence.
14:51imirkin: (in the kernel)
14:51tagr: the sync fd changes were similar to that, but they were meant to also implement the cross-driver userspace parts, so that the KMS driver can take a fence emitted by the GPU as an input
14:51imirkin: yeah. i think sync fd is the new way to do this
14:52imirkin: nouveau_bo_wait has been around for quite a while
14:52tagr: so I think that would also solve this problem, except that it would, again, rely on newer infrastructure for this
14:52tagr: so something like non-atomic kmscube won't work, and neither would X, presumably
14:52karolherbst: tagr: relying on new infrastructure is probably fine, just depends on how new
14:53tagr: but since X is pretty much front-buffer-only rendering, there really isn't very much we can do anyway
14:53tagr: karolherbst: sync FD support requires new kernel ABI, so there's the downside of that taking a long while to make it in
14:53karolherbst: what I am kind of wondering is, how do all the other SoCs fix this or do all of those have different constraints or do they also require modifier support?
14:54imirkin: tagr: maybe the solution is ... don't use modifiers when you're front-buffer rendering and the display thing doesn't support those modifiers?
14:55tagr: imirkin: yes, that's basically what the fixed modesetting driver will now do
14:55imirkin: so then you're down to some "core" problems of front-buffer rendering
14:56imirkin: which is that you don't know when it's OK to look at the front buffer
14:57tagr: yeah, which basically just means you get flicker and potentially sometimes this "bobbing"
14:57tagr: or tearing rather
14:58karolherbst: I guess when you do front buffer rendering you don't care about performance anyway :p
14:59karolherbst: so if this path just ends up oversynchronizing I really don't care that much
15:00imirkin: tagr: yeah ... would a redirecting compositor "fix" this?
15:07tagr: imirkin: possibly, I haven't looked into that
15:08imirkin: i'd be comfortable saying that if you want tear-free, you need a redirecting compositor
15:08tagr: anyway, I've got to run now, but I'll look more into this next week
15:08imirkin: otherwise you get tears (on your screen *and* in your eyes...)
15:08tagr: thanks for all the help so far
15:08tagr: hehe =)
15:29ProN00b: imirkin, thanks for the link on falcon security
17:18karolherbst: tagr: ahh, I see, bikeshedding started :p
17:19karolherbst: anyway, it seems like wayland (including xwayland) works quite fine with the mesa patch
17:19karolherbst: I will give it a bit more testing though
17:20karolherbst: tagr: also, maybe it makes sense to include some other people working on SoCs, so we get away from this "but Intel and AMD" push
17:20karolherbst: I don't know if that's an issue on the others anyway
17:21karolherbst: but maybe they all also use modifier aware X and moved on?
17:21karolherbst: I just don't know
17:56karolherbst: tagr: I talked with alyssa and it seems like all of that just works on panfrost on 1.20.8 and wayland apparently
17:56karolherbst: but that was just copied over from v3d
17:57karolherbst: so.. maybe there is a different solution we are not seeing right now
19:17karolherbst: imirkin: sooo.. ldc but for uint8 and byte aligned address :)
19:17karolherbst: I mean.. I think the bug is that it gets load propagated
19:17imirkin: no clue if that's a thing.
19:17karolherbst: I don't think it's a thing on all gens...
19:17karolherbst: something was funky and it got me to ignore it for now :D
19:18karolherbst: but that's the last thing to pass all test_basic tests in the CL CTS
19:30karolherbst: imirkin: ohh.. it wasn't ldc, it was cvt
19:30karolherbst: "cvt f32 $r3 u8 c0[0x1] "
19:30imirkin: right, that obviously won't work
19:30imirkin: the load prop needs to learn about sizes i guess
19:30imirkin: note that you can't just force 32-bit
19:31karolherbst: only allow it for ldc :p
19:31imirkin: coz e.g. with mul on nv50, it can actually load 16-bit consts
19:31imirkin: (it's a 16-bit integer mul)
19:31karolherbst: at this point I might just fix it and optimize later :p
19:31karolherbst: but mhh
19:31karolherbst: I fix it for nvc0 and let pmoreau sort it out for nv50 :p
19:32karolherbst: both probably need fixing anyway
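A rough sketch of what "the load prop needs to learn about sizes" could look like, on a toy IR rather than the real nouveau codegen. The `Load` and `Op` types, the `propagate_loads` function, and especially the `DIRECT_CONST_SIZES` table are assumptions for illustration; the table only mirrors the two remarks above (a `cvt` cannot take a `u8` constbuf source directly, while a 16-bit mul can take 16-bit consts) and is not a description of any real ISA.

```python
from dataclasses import dataclass

# Illustrative capability table: which constbuf load sizes (in bytes) an op
# can take directly as a source. These entries are assumptions for the sketch.
DIRECT_CONST_SIZES = {
    "mov": {4},
    "cvt": {4},
    "mul16": {2, 4},
}


@dataclass
class Load:
    dest: str
    offset: int
    size: int  # bytes


@dataclass
class Op:
    name: str
    dest: str
    src: str  # register name, or "c0[...]" once a load has been folded in


def propagate_loads(loads, ops):
    """Fold constbuf loads into their users only when the size is supported."""
    by_dest = {ld.dest: ld for ld in loads}
    out = []
    for op in ops:
        ld = by_dest.get(op.src)
        if ld and ld.size in DIRECT_CONST_SIZES.get(op.name, set()):
            out.append(Op(op.name, op.dest, f"c0[{ld.offset:#x}]"))
        else:
            out.append(op)  # keep the separate load and the register source
    return out


loads = [Load("$r0", 0x1, 1), Load("$r1", 0x4, 4)]
ops = [Op("cvt", "$r3", "$r0"), Op("cvt", "$r4", "$r1")]
result = propagate_loads(loads, ops)
print([op.src for op in result])  # prints ['$r0', 'c0[0x4]']
```

Keeping the check per op, instead of forcing every folded source to 32 bits, leaves room for cases like the 16-bit mul that imirkin mentions.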
21:14pmoreau: karolherbst: I had a patch around for trying to handle all those cvt to/from u8/s8/u16/s16, with and without saturate. Will need to dig it up.
21:16karolherbst: but we are not there yet anyway
21:17karolherbst: let me try to fix test_basic first :D