04:33fdobridge_: <enigma9o7> https://cdn.discordapp.com/attachments/1034184951790305330/1186889125312340129/shot-2023-12-19_20-33-06.png?ex=6594e3a2&is=65826ea2&hm=69c5bc2537f4ec717c3d8bd8c750f813f98ee3996db615335a94a464be8a7f02&
04:34fdobridge_: <enigma9o7> So it happened. I was using thunar file manager to copy a file when it just stopped responding.
04:37fdobridge_: <enigma9o7> Please let me know if this is useful or what kind of logs I should be getting and if I should be reporting this elsewhere, etc. I want to resolve this, I lock up every day or two when using nouveau.
06:30fdobridge_: <Sid> I was able to get more logs about the gpu hangs on prime
06:31fdobridge_: <Sid> https://paste.sidonthe.net/raw/snail-falcon-dog
06:34fdobridge_: <Sid> https://paste.sidonthe.net/raw/spider-hawk-wasp (edited)
06:38fdobridge_: <Sid> `[Wed Dec 20 11:56:55 2023] ga100_fifo_nonstall_intr+0x24/0x30 [nouveau b7885176833803f935f2c6c6db46f2cb7b7d0e7c]` stands out to me the most, since I have a TU116M
06:46fdobridge_: <Sid> that's where ssh connection stopped responding 😔
08:33fdobridge_: <Sid> quake 2's tutorial runs at 72fps with GSP, and 25-35 without (edited)
08:46fdobridge_: <Sid> managed to sort out the async waits on fence timing out with `intel_iommu=igfx_off`
08:47fdobridge_: <Sid> however the rcu stalls still happen and take the whole system with it
08:47fdobridge_: <Sid> gsp enabled, yes
08:51fdobridge_: <Sid> here's the dmesg until when the ssh connection died
08:51fdobridge_: <Sid> https://paste.sidonthe.net/raw/mouse-parrot-fox
08:56fdobridge_: <Sid> @airlied hopefully this helps a bit?
08:57fdobridge_: <airlied> Oh iommu off suggests something isn't getting translated somewhere
08:58fdobridge_: <karolherbst🐧🦀> ben always assumed the fencing is broken, but could never find out where
08:59fdobridge_: <Sid> using iommu igfx off didn't solve the system freezes for me, but it did clear up some of the logs
09:00fdobridge_: <airlied> I wonder what is holding the spinlock there
09:00fdobridge_: <karolherbst🐧🦀> I've added a workaround where I kick the fence somewhere else
09:00fdobridge_: <airlied> Maybe that path isn't the best place to take a spinlock
09:00fdobridge_: <karolherbst🐧🦀> might just spin forever
09:00fdobridge_: <karolherbst🐧🦀> and waits on a fence never signalled
09:01fdobridge_: <airlied> Oh we should maybe yield and timeout there
09:01fdobridge_: <Sid> if you want me to test a kernel patch I can absolutely build a custom kernel to test things out :)
09:01fdobridge_: <airlied> Since if we hold the spinlock we might have an issue
09:01fdobridge_: <karolherbst🐧🦀> I'm on my phone, so I don't know the code there 😄
09:02fdobridge_: <Sid> would like to see this issue sorted before 6.7 drops to make nvk testing a bit easier for myself, and while code goes over my head most of the time I'm always open to testing
09:02fdobridge_: <karolherbst🐧🦀> but nouveau suffers from "just fix the bug instead" in a couple of places, so we do a few silly things
09:03fdobridge_: <karolherbst🐧🦀> like using the GPU timer for time outs, which also leads to infinite loops once the gpu dies
09:03fdobridge_: <karolherbst🐧🦀> kinda wished we'd fix all those things
09:05fdobridge_: <airlied> Is there a gitlab issue for this?
09:05fdobridge_: <Sid> I did pipe in a little bit in https://gitlab.freedesktop.org/mesa/mesa/-/issues/10225
09:05fdobridge_: <karolherbst🐧🦀> heh.. I've sent patches years ago, but they got nacked
09:05fdobridge_: <airlied> Would be good to get those logs and system configuration in one place if not
09:07fdobridge_: <airlied> If it isn't exact same issue would be good to file
09:08fdobridge_: <airlied> Like I have ideas if NVIDIA CTX dies and Intel is waiting on a fence that might explode, but not sure you are seeing any dying ctx
09:09fdobridge_: <karolherbst🐧🦀> mhhh
09:10fdobridge_: <Sid> I can't really tell, best I can do is grab dmesg and try to generate an accompanying proton log
09:10fdobridge_: <karolherbst🐧🦀> yeah sooo
09:11fdobridge_: <karolherbst🐧🦀> the code is uhm.. deadlocking for sure
09:11fdobridge_: <karolherbst🐧🦀> or rather.. spinning infinitely
09:11fdobridge_: <karolherbst🐧🦀> @airlied see `nouveau_fence_update`
09:11fdobridge_: <karolherbst🐧🦀> if the GPU never progresses, the lock will never be unlocked
09:11fdobridge_: <airlied> Yeah we should probably not do that 🙂
09:11fdobridge_: <karolherbst🐧🦀> I think kicking the fence might even help
09:12fdobridge_: <karolherbst🐧🦀> let me find my older patch
09:13fdobridge_: <karolherbst🐧🦀> 6b04ce966a738ecdd9294c9593e48513c0dc90aa
09:13fdobridge_: <karolherbst🐧🦀> https://gitlab.freedesktop.org/drm/nouveau/-/commit/6b04ce966a738ecdd9294c9593e48513c0dc90aa
09:13fdobridge_: <karolherbst🐧🦀> that helped with some issues with never signaled fence
09:13fdobridge_: <karolherbst🐧🦀> but in a different path
09:15fdobridge_: <karolherbst🐧🦀> mhhh
09:15fdobridge_: <karolherbst🐧🦀> though the code is a bit messy to understand...
09:16fdobridge_: <airlied> I really dislike that patch 🙂
09:16fdobridge_: <karolherbst🐧🦀> same
09:16fdobridge_: <karolherbst🐧🦀> but it fixes a bug
09:17fdobridge_: <karolherbst🐧🦀> and quite reliably so
09:17shulai: Hi everyone, I am new here and I'm totally a greenhand in nouveau and oftc. Can anyone in this channel see my message?
09:17fdobridge_: <airlied> Worst kinda papering over 🙂 I did spend a few days in this area 6-9 months ago with Ben but never worked it out
09:17fdobridge_: <airlied> But mostly I never had a decent reproducer
09:17fdobridge_: <karolherbst🐧🦀> see, I was smart and saved me the troubles of debugging it for 6-9 months
09:18fdobridge_: <airlied> And the thing I thought was this was fixed by something else
09:18fdobridge_: <karolherbst🐧🦀> incredible
09:19karolherbst: shulai: sure
09:20fdobridge_: <karolherbst🐧🦀> ohh actually.. it's code somewhere else which never unlocks
09:20fdobridge_: <karolherbst🐧🦀> `nouveau_fence_wait_uevent_handler` just never locks
09:21fdobridge_: <Sid> *sweat*
09:21fdobridge_: <karolherbst🐧🦀> and `nvkm_inth_allow` also never locks
09:21fdobridge_: <karolherbst🐧🦀> so yeah...
09:21fdobridge_: <karolherbst🐧🦀> the kernel is kinda uhm...
09:21fdobridge_: <karolherbst🐧🦀> doesn't like that
09:21fdobridge_: <Sid> makes sense, would explain why it takes the whole system with it
09:22fdobridge_: <karolherbst🐧🦀> sadly we can't run `lockdep` as it complains about other things
09:22fdobridge_: <karolherbst🐧🦀> yeah...
09:22fdobridge_: <karolherbst🐧🦀> network is one of the first things to go down with stuff like that
09:22fdobridge_: <Sid> the system also stops responding to keyboard input
09:22fdobridge_: <Sid> can't alt-tab out or move the mouse and have the screen update 5 minutes later
09:23fdobridge_: <karolherbst🐧🦀> yeah...
09:23fdobridge_: <karolherbst🐧🦀> we should never spinlock without a timeout
09:23fdobridge_: <karolherbst🐧🦀> it's just killing systems too easily
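The point about never spinning without a timeout, and about not using the GPU's own timer for it, can be illustrated with a small userspace sketch. This is not nouveau code: `wait_for_fence` and its parameters are hypothetical names, and the only idea shown is that the deadline comes from a host clock, so the wait still ends if the device being polled never makes progress.

```python
import time


def wait_for_fence(is_signalled, timeout_s, poll_interval_s=0.001,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll is_signalled() until it returns True or timeout_s elapses.

    The deadline is taken from the host's monotonic clock (an assumption
    of this sketch), so it expires even when the polled device is dead.
    Returns True if the fence signalled, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if is_signalled():
            return True
        if clock() >= deadline:
            return False
        # Yield between polls instead of busy-spinning.
        sleep(poll_interval_s)
```

A caller would treat a `False` return as "the GPU is stuck" and start recovery instead of waiting forever.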
09:23fdobridge_: <Sid> yeah
09:23fdobridge_: <karolherbst🐧🦀> the big question is just, what never unlocks
09:24fdobridge_: <Sid> that is a very good question, and a difficult one to answer since I'm not able to get logs beyond when the system explodes
09:24fdobridge_: <Sid> and a hard reset means no journal is written to file
09:24fdobridge_: <Sid> meaning no journalctl -b -1 shenanigans to identify what's going wrong
09:27fdobridge_: <Sid> kinda wish there was a way to use AOSP debugging methods here, ramoops is really helpful when debugging kernel crashes
09:27fdobridge_: <karolherbst🐧🦀> yeah...
09:27fdobridge_: <karolherbst🐧🦀> you can kinda set it up
09:28fdobridge_: <karolherbst🐧🦀> wait...
09:28fdobridge_: <karolherbst🐧🦀> let me check
09:28fdobridge_: <karolherbst🐧🦀> you can kinda make the machine reboot on stalls
09:29fdobridge_: <Sid> huh
09:29fdobridge_: <Sid> hm https://www.kernel.org/doc/html/v4.15/admin-guide/ramoops.html
09:30fdobridge_: <karolherbst🐧🦀> `mem=128M ramoops.mem_address=0x8000000` (just adjust to your RAM)
09:30fdobridge_: <karolherbst🐧🦀> and then...
09:31fdobridge_: <karolherbst🐧🦀> `CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC`
09:32fdobridge_: <karolherbst🐧🦀> and `panic_timeout` (edited)
09:32fdobridge_: <karolherbst🐧🦀> and `CONFIG_BOOTPARAM_HARDLOCKUP_PANIC`
09:33fdobridge_: <karolherbst🐧🦀> I think that's all you need here?
09:33fdobridge_: <Sid> seems like it
09:33fdobridge_: <Sid> setting the mem_address right might take some trial and error, but bleh
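The `mem=128M ramoops.mem_address=0x8000000` example works because 128 MiB is exactly 0x8000000 bytes, so the ramoops region starts right where the kernel stops using RAM. A small sketch of that arithmetic follows. The helper names are hypothetical, the 1 MiB `ramoops.mem_size` is an arbitrary choice for illustration, and the sketch assumes the address just above the `mem=` limit is usable RAM, which real memory maps do not guarantee.

```python
def parse_size(text):
    """Parse a size such as '128M' or '4G' into bytes (K/M/G suffixes only)."""
    units = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


def ramoops_cmdline(mem_limit, ramoops_size="1M"):
    """Build boot parameters placing ramoops directly above the mem= limit."""
    base = parse_size(mem_limit)      # first byte the kernel will not use
    size = parse_size(ramoops_size)   # size of the reserved log region
    return (f"mem={mem_limit} "
            f"ramoops.mem_address={base:#x} "
            f"ramoops.mem_size={size:#x}")
```

For a machine with more RAM, the same calculation is repeated with a `mem=` value slightly below the top of usable memory.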
09:33fdobridge_: <karolherbst🐧🦀> yeah...
09:34fdobridge_: <karolherbst🐧🦀> but there is a handy way of oopsing your kernel randomly 😄
09:34fdobridge_: <Sid> heh
09:34fdobridge_: <karolherbst🐧🦀> `echo c > /proc/sysrq-trigger` I think
09:35fdobridge_: <Sid> fun stuff
09:35fdobridge_: <karolherbst🐧🦀> https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
09:35fdobridge_: <Sid> I'll build the kernel with ramoops configured in a few hours, fairly tired, been testing various things since I woke up 4 hours ago
09:36fdobridge_: <karolherbst🐧🦀> I couldn't really sleep last night, so I'm kinda all mushy in my brain
09:36fdobridge_: <Sid> hope you feel better soon :>
09:36fdobridge_: <karolherbst🐧🦀> yeah.. tomorrow after I get some sleep 🥲
09:37fdobridge_: <karolherbst🐧🦀> I had ramoops set up on my desktop, and I thought I still had, but apparently once I bought my serial console for the desktop I removed it...
09:46shulai: Nice to be here. I find some information on the website quite out of date.
09:46karolherbst: yeah.... that's sadly the case
09:47shulai: I am trying to trace the MMIO of CUDA kernel launching
09:47karolherbst: oh yeah.... you won't find much there
09:47shulai: but with mmiotracing tools as introduced in https://wiki.ubuntu.com/X/MMIOTracing, there are no logs corresponding to kernel launches.
09:47karolherbst: all that stuff happens entirely in userspace
09:47karolherbst: so you won't see anything in the mmio log there
09:48shulai: I think those are happening within the low ISA range as introduced. Am I wrong?
09:48fdobridge_: <karolherbst🐧🦀> @marysaka do you think your stuff could be used to trace CUDA?
09:48karolherbst: shulai: what do you mean by "ISA range"?
09:48fdobridge_: <marysaka> I haven't tried but maybe?
09:49fdobridge_: <karolherbst🐧🦀> I mean.. it's just the compute channel...
09:49fdobridge_: <marysaka> it should use the same interface for submitting pushbuf right so it should work
09:49fdobridge_: <karolherbst🐧🦀> yeah..
09:49shulai: It is said that "the legacy ISA address range 0xa0000 - 0x100000 cannot be traced this way"
09:49fdobridge_: <marysaka> just hope nothing break apart in the mitm paths I guess
09:49fdobridge_: <karolherbst🐧🦀> @marysaka you know what I want to RE with it?
09:50shulai: So I think the cuda engine entry may exist in that region
09:50fdobridge_: <marysaka> you want to debug trap handlers? :aki_thonk:
09:50karolherbst: shulai: mmiotrace is just the wrong tool for what you want to do
09:50fdobridge_: <karolherbst🐧🦀> @marysaka how did you know 😛
09:50fdobridge_: <marysaka> I don't have anything for dumping shader from it tho btw
09:50fdobridge_: <karolherbst🐧🦀> but `cuda-gdb` more specifically
09:50fdobridge_: <karolherbst🐧🦀> nah..
09:50fdobridge_: <marysaka> still not sure how they upload them, it takes a different path
09:50fdobridge_: <karolherbst🐧🦀> the trap handler I can write myself
09:50fdobridge_: <karolherbst🐧🦀> I need to know how to set it up 😄
09:50fdobridge_: <marysaka> okay
09:51fdobridge_: <karolherbst🐧🦀> just get it invoked and dump some information into a buffer and read it back
09:51fdobridge_: <karolherbst🐧🦀> the trap handler shader is the easy part
09:51fdobridge_: <karolherbst🐧🦀> we have sys values like "max reg usage" and other funky things
09:51fdobridge_: <karolherbst🐧🦀> I know all those bits
09:52shulai: Then how can I trace and trap the MMIO of cuda APIs? With what tools?
09:52fdobridge_: <karolherbst🐧🦀> and the trap handler is literally just "write last $pc, all regs/predicates" to a buffer
09:52fdobridge_: <karolherbst🐧🦀> and that's already much better than what we have now
09:52karolherbst: shulai: you can't
09:52karolherbst: as there is no MMIO
09:54karolherbst: the only thing MMIO is used for here is setting up the context, and that's mostly done by firmware these days
09:54karolherbst: so you won't get that anyway
09:54karolherbst: nvidia manages most those things in userspace, and allocation of the context/VM/memory happens through firmware and page table writes in VRAM
09:54shulai: If the cuda apis does not communicate with the GPU through MMIO, then how?
09:54karolherbst: command buffers
09:55karolherbst: there is a ring buffer on the GPU, but adding commands is also done in userspace
09:55karolherbst: they use `userfd` for it
09:55karolherbst: but getting the ringbuffer thing also gives you nearly nothing on its own
09:58karolherbst: shulai: anyway.. we have new tools for dumping those command buffers to reverse engineer their vulkan driver
09:58karolherbst: but it's unknown if it works with cuda
09:59shulai: So there is no way to hook the "adding commands" operations in the kernel space?
09:59karolherbst: it's the wrong place to do it
10:00karolherbst: mmiotrace is helpful to reverse engineer a kernel driver, but not necessarily what userspace is doing. There is a looooooot of data coming through and you'd simply overload your kernel with how it's done through mmiotrace
10:03shulai: Can I know the address of command buffers and write to it directly?
10:04karolherbst: shulai: you kinda don't want to write to them unless you want to crash the GPU
10:11karolherbst: shulai: anyway.. if you want to check out how it all works and want to give the tool a try: https://gitlab.freedesktop.org/marysaka/envyhooks
10:13shulai: Thanks a lot. I'll take a look.
10:31fdobridge_: <airlied> @tiredchiku https://paste.centos.org/view/faa9bf17 might alleviate some of it
10:31fdobridge_: <Sid> I'll test it in a while, thanks
11:11fdobridge_: <Sid> tried it, still took the system down
11:12fdobridge_: <Sid> forgot to open an ssh connection first because I'm smart like that
11:22fdobridge_: <Sid> @airlied https://paste.sidonthe.net/raw/monkey-horse-whale
17:07fdobridge_: <dwlsalmeida> anyone knows the difference between SUATOM.P vs SUATOM.D ?
17:08fdobridge_: <dwlsalmeida> I'm assuming they map to SURED.P vs SURED.B (edited)
17:18RSpliet: AFAIK "RED" is NVIDIA terminology for reductions, i.e. atomic operations without a return value
17:20fdobridge_: <dwlsalmeida> ah yes, asking about .P and .D
17:25fdobridge_: <karolherbst🐧🦀> formatted and unformatted
17:25fdobridge_: <karolherbst🐧🦀> `.D` is raw data
17:26fdobridge_: <karolherbst🐧🦀> `SURED.B` doesn't exist on all hardware
17:26fdobridge_: <karolherbst🐧🦀> or maybe on none
17:27fdobridge_: <karolherbst🐧🦀> `.P` is doing implicit format conversion
17:27fdobridge_: <dwlsalmeida> I thought P stood for pointer
17:28fdobridge_: <karolherbst🐧🦀> I think it's data vs pixel
17:29fdobridge_: <karolherbst🐧🦀> anyway.. if `.P` doesn't exist, the shader needs to do the format conversion
17:30fdobridge_: <karolherbst🐧🦀> it doesn't exist for all ops
17:32fdobridge_: <dwlsalmeida> I assume this is why this function exists -> NVC0LoweringPass::handleSurfaceOpGM107(TexInstruction *su)
17:33fdobridge_: <karolherbst🐧🦀> yes
17:34fdobridge_: <dwlsalmeida> @marysaka do you happen to know whether we have something similar in nak?
17:34fdobridge_: <dwlsalmeida> my guess is not really, but checking anyways
17:35fdobridge_: <marysaka> I don't think we have anything in NAK so far :aki_thonk:
17:36fdobridge_: <karolherbst🐧🦀> isn't there nir lowering for it?
17:37fdobridge_: <karolherbst🐧🦀> also vulkan has options to disable unformatted ops
17:38fdobridge_: <karolherbst🐧🦀> though the lowering is mostly whacking the coord and doing a conversion
17:38fdobridge_: <!DodoNVK (she) 🇱🇹> Then how does vkcube work? 🐸
17:39fdobridge_: <karolherbst🐧🦀> does vkcube write to images?
17:40fdobridge_: <dwlsalmeida> @karolherbst I did not know you could lower this, I am new heheh, I am trying to hook up nir_intrinsic_bindless_image_atomic_swap for maxwell in nak, and this .P vs .D is the only difference I can spot when disassembling and comparing against codegen
17:41fdobridge_: <karolherbst🐧🦀> `.D` is giving you the raw data, so if it's an integer image, but you want a float result, you need to convert the result yourself. However you can't do this if the image format isn't known at compile time
17:41fdobridge_: <karolherbst🐧🦀> (I think)
17:43fdobridge_: <karolherbst🐧🦀> or it was in regards to bit sizes?
17:43fdobridge_: <karolherbst🐧🦀> dunno the details
17:44fdobridge_: <karolherbst🐧🦀> anyway.. the tldr is that `.P` checks the image format from the descriptor, where `.D` just gives you the raw data plainly
17:45fdobridge_: <karolherbst🐧🦀> with `.D` you also need to convert the coordinates yourself as it's purely byteaddressing
17:45fdobridge_: <karolherbst🐧🦀> where `.P` is pixel based addressing
17:46fdobridge_: <karolherbst🐧🦀> using `.P` allows you to get rid of some of that handling, but only if you know the format at compile time
17:46fdobridge_: <karolherbst🐧🦀> ehh even when not
17:46fdobridge_: <karolherbst🐧🦀> can't lower unformatted ops if you don't know the format or only with huge pain
17:48fdobridge_: <dwlsalmeida> eh..wdym by "know the format at compile time"? I am running `dEQP-VK.image.atomic_operations.exchange.1d.notransfer.normal_read.normal_img.r32i_intermediate_values` and I am not really sure where I can read that in the shader
17:49fdobridge_: <karolherbst🐧🦀> in the image var
17:49fdobridge_: <dwlsalmeida> `layout (r32i, binding=0) coherent uniform iimage1D u_resultImage;`
17:49fdobridge_: <karolherbst🐧🦀> and I think the ops as well?
17:49fdobridge_: <karolherbst🐧🦀> ahh yeah
17:49fdobridge_: <karolherbst🐧🦀> r32i format
17:49fdobridge_: <dwlsalmeida> ahh
17:49fdobridge_: <karolherbst🐧🦀> so single channel signed integer 32 bit
17:50fdobridge_: <dwlsalmeida> so let me recap, `coord` is the pixel coordinates, but that has to be converted manually to the actual memory address and the IR rewritten to match that
17:50fdobridge_: <karolherbst🐧🦀> so you could load it with the actual cords via `suld.p.r.32`
17:51fdobridge_: <karolherbst🐧🦀> ehh..
17:51fdobridge_: <karolherbst🐧🦀> `suld.p.1D.r`
17:51fdobridge_: <karolherbst🐧🦀> like that 😄
17:52fdobridge_: <karolherbst🐧🦀> and the raw data would be `suld.d.1D.32` but the `coords *4` as you need to do adjustments of the coords
17:52fdobridge_: <karolherbst🐧🦀> and will have to convert to your operation dest format based on the source format
17:52fdobridge_: <karolherbst🐧🦀> but I think you need to do that anyway, as you can't really state if you want float or int results
17:53fdobridge_: <karolherbst🐧🦀> using `.p` simply saves you the trouble of fixing up the coordinates
17:53fdobridge_: <karolherbst🐧🦀> and also allows you to specify the components
17:55fdobridge_: <karolherbst🐧🦀> mhhhh
17:55fdobridge_: <karolherbst🐧🦀> actually..
17:55fdobridge_: <karolherbst🐧🦀> it's byte addressing only when `.BA` is also set
17:55fdobridge_: <karolherbst🐧🦀> otherwise it's scaled by the given size
17:56fdobridge_: <karolherbst🐧🦀> anyway, I don't really know the actual details here so you kinda need to trust the compiler know what they are doing. All I know is that `.P` allows you to skip some format conversion stuff
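The `coords *4` remark for the `r32i` image can be made concrete with a small sketch of what a shader would have to do by hand in the raw, byte-addressed case. This models a plain linear 1D image only: the function names and the format table are hypothetical, tiling and descriptor base addresses are ignored, and little-endian storage is an assumption.

```python
import struct

# Bytes per pixel for a few image formats (illustrative subset).
BYTES_PER_PIXEL = {"r8": 1, "r32i": 4, "rgba8": 4, "rgba32f": 16}


def byte_offset_1d(x, fmt):
    """Convert a 1D pixel coordinate to a byte offset in a linear image."""
    return x * BYTES_PER_PIXEL[fmt]


def decode_r32i(raw):
    """Interpret 4 raw bytes as one signed 32-bit integer channel.

    Assumes little-endian storage. With a formatted access this
    interpretation would come from the image format instead.
    """
    return struct.unpack("<i", raw)[0]
```

For the `r32i` test image, pixel 3 lives at byte offset 12, which matches the `coords *4` adjustment mentioned above.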
18:01fdobridge_: <dwlsalmeida> @karolherbst ty for helping 🙂
19:33fdobridge_: <enigma9o7> https://cdn.discordapp.com/attachments/1034184951790305330/1187115473054203984/shot-2023-12-19_20-33-06.png?ex=6595b670&is=65834170&hm=4b1999ae811585dd8112b4f177f3dafbb76c65d4769f0f23668ebeffc30fc531&
19:33fdobridge_: <enigma9o7> I posted this already last night, just didn't get any response, so asking again. I have nouveau lockup my machine every couple days. Doesn't happen with nvidia-340. Currently on kernel 5.4, but tested 6.6 and can't even go a few hours without lockup.
19:34fdobridge_: <enigma9o7> The above happened when thunar was in focus and I was just dragging a file from one dir to another. But had plenty of other stuff running, as usual, stuff like firefox and plexmediaserver and virtualbox.
19:35fdobridge_: <enigma9o7> Is there somewhere better I should be reporting this? I'd like to resolve this as it's not usable this way. Are these the right logs? What info is useful? I would appreciate some advice here what to do....
19:38fdobridge_: <Sid> what gpu, and can you post the dmesg as a file/in a pastebin service, is what'd be a good start
19:42fdobridge_: <enigma9o7> Sure.
19:42fdobridge_: <enigma9o7> GPU is GT330M
19:47fdobridge_: <enigma9o7> The above screenshot is from journalctl. I'm new at taking logs so not sure how to get it to text but will figure it out. Is dmesg the same log?
19:48fdobridge_: <airlied> @enigma9o7 nobody is going to be rushing to fix this unfortunately
19:48fdobridge_: <airlied> though that log has at least got some interesting details
19:50fdobridge_: <airlied> I'd suggest opening an issue on gitlab, and attaching the excerpts from the log files instead of whole screenshots (edited)
19:53fdobridge_: <enigma9o7> Ok thanks. I'm working on figuring out how to get logfiles now.
21:07fdobridge_: <karolherbst🐧🦀> it's kinda a bug a few users hit, but I wasn't able to reproduce locally with any of my cards...
21:30fdobridge_: <karolherbst🐧🦀> @gfxstrand how are OOB image accesses handled atm?
21:31fdobridge_: <karolherbst🐧🦀> mhh.. or rather how is it defined in vulkan?
21:35Lyude: hm. wondering how to get started with this display driver, and whether I should be trying to write something like vkms in rust first before trying nova
21:36Lyude: since I assume it's probably going to be easiest to start the nova driver once we've actually got some sort of driver in the tree that can load GSP?
21:37Lyude: (any thoughts btw airlied, karolherbst ?)
21:37karolherbst: just go straight for it :D
21:37karolherbst: I think it's probably easier to do a render only driver for now
21:37karolherbst: like make sure it does prime offloading and works on laptops
21:37karolherbst: and then add display on top
21:38karolherbst: or work on it in the meantime
21:38karolherbst: the abstractions for those basic things should already exist within the asahi tree, so that _might_ be quite easy from a drm integration pov
21:39Lyude: I mean - jfyi, I did actually just check about asahi's KMS driver and it's still written in C, so I assume I'm still going to need to come up with interfaces for DRM
21:40karolherbst: I meant the accelerated driver
21:40karolherbst: it still interfaces with drm_buf and all that
21:40Lyude: ooooooooh ok
21:40karolherbst: and uses the fencing interfaces
21:41karolherbst: at least I think landing those abstractions first is easier as they are already kinda tested in the wild
21:41Lyude: well yeah, I assume it'll be a good while before we can get the display bits landed in the kernel - especially since we'll have to land new KMS abstractions
21:42Lyude: would like to get started on that in some form in the mean time though
21:43karolherbst: yeah
21:43karolherbst: in any case, you'd need to write the code for GSP bringup first
21:43Lyude: yeah, that's sort of why I'm wondering if we should start with a different display driver first
21:44airlied: I'd just start with vkms for display abstractions
21:45karolherbst: why though?
21:45airlied: nova needs a fair bit of groundwork to get display
21:45karolherbst: I mean.. vkms in rust sounds like a good idea, but writing two drivers is also kinda a lot of work
21:46karolherbst: and I don't think that starting with vkms actually will give us any benefits over just starting with nova
21:46airlied: but we can't start with nova for display
21:46Lyude: karolherbst: the main thing we can work on without GSP code is the actual KMS abstractions - which imo, is probably going to be the more challenging bit anyway (or not)
21:46Lyude: I think it likely will since that's going to be bikeshed city
21:46karolherbst: yeah, so we do it a few weeks/months later
21:46airlied: because we'd have to develop at least a falcon + gsp, memory management and page table before display
21:47airlied: so if we can parallel do the abstractions work in vkms then that would have some value
21:47airlied: since we suspect the hard work will be in building the kms abstractions
21:47karolherbst: sure, but who is going to be the second person doing the work in parallel?
21:48Lyude: i think that's me :P
21:48airlied: danilo is going to start on nova bringup, lyude does display stuff
21:48karolherbst: yeah, which brings me back to my initial point, that it would just delay writing the code for nova
21:48airlied: not really, we don't have to finish vkms
21:48karolherbst: mhh
21:48karolherbst: I see
21:48Lyude: yeah - I don't think I'd really want to try upstreaming it
21:49karolherbst: if it's just prototyping then it would probably also just be copied over into nova I guess (more or less)
21:49Lyude: just the minimal needed to actually get some sort of abstraction that we can use for kms. I'm hoping the actual making of the abstraction won't be too bad, because atomic seems kind of perfect for rust w/r/t states being (mostly) immutable
21:49Lyude: but upstreaming might take time
21:49karolherbst: yeah... well
21:49karolherbst: "depends"
21:50karolherbst: mutable state just means you need to wrap it in a `Mutex` to keep using non mut references
21:50karolherbst: and I suspect all of the atomic API in rust will not take mut refs
21:51karolherbst: maybe.. dunno
21:51Lyude: I'm talking more about the fact that there's like, very clearly defined points at which you modify a state or have a state that's effectively read-only.
21:52Lyude: i think i'm just gonna dive in and give it a shot, don't know that there's much way of us knowing if it's a good idea or not otherwise
21:52karolherbst: yeah
21:53Lyude: also - any of y'all have rust-analyzer working in asahi's tree/whatever tree we'd be using for this?
21:53karolherbst: this needs support from the build system
21:54karolherbst: it needs to generate a `rust-project.json` file which rust-analyzer loads
21:54karolherbst: dunno if the rust 4 linux stuff does it
21:54karolherbst: if not, it should
21:54Lyude: Oh. that's, not the worst. not great either
21:54karolherbst: yeah.. I dived into all this fun for mesa and meson back then :D
21:55karolherbst: but as long as you have that rust-project.json file it just works
21:55Lyude: karolherbst: yeah I'll look into it a bit, figured I probably would anyway
21:55Lyude: worst case I have done some funny hacks in the past to get this working in the kernel for clangd before kbuild actually had real support for it
21:55Lyude: (bear)
21:55karolherbst: Lyude: if you feel like hand crafting that file, it kinda looks like this: https://gist.github.com/karolherbst/90c024fc5546ddde0ddc2fdbd27c0d0f
21:56Lyude: I'll -probably- see if the kernel can make it and if not, figure out a way to trace build commands from kbuild in a similar manner to how bear does it to start off
21:56karolherbst: yeah.. should be trivial to generate
21:56karolherbst: it's just a list of crates and a deps tree
21:57karolherbst: the indexing is cursed, but whatever
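The "list of crates and a deps tree" shape, including the index-based dependency references, can be sketched as follows. This is a minimal illustration rather than what the kernel's `rust-analyzer` make target generates: the helper name and the example source paths are hypothetical, and only a small subset of the `rust-project.json` fields is filled in.

```python
import json


def build_rust_project(crates, sysroot_src=None):
    """Build a minimal rust-project.json structure.

    crates is a list of (name, root_module, [dependency names]).
    Dependencies are emitted as indices into the "crates" array,
    which is the index-based referencing the format uses.
    """
    index = {name: i for i, (name, _, _) in enumerate(crates)}
    entries = []
    for name, root_module, deps in crates:
        entries.append({
            "display_name": name,
            "root_module": root_module,
            "edition": "2021",
            "deps": [{"crate": index[d], "name": d} for d in deps],
        })
    project = {"crates": entries}
    if sysroot_src is not None:
        project["sysroot_src"] = sysroot_src
    return project


# Hypothetical two-crate layout: a driver crate depending on a kernel crate.
project = build_rust_project([
    ("kernel", "rust/kernel/lib.rs", []),
    ("example_driver", "drivers/example/example.rs", ["kernel"]),
])
rust_project_json = json.dumps(project, indent=2)
```

Because dependencies are positions in the array, reordering the crate list changes every `crate` index, which is why generating the file is more practical than editing it by hand.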
21:57Lyude: probably less cursed than some of the setups I've had in the past :3
21:57karolherbst: probably
21:58karolherbst: Lyude: seems like the kernel supports it
21:58karolherbst: check Documentation/rust/quick-start.rst
21:58karolherbst: `make LLVM=1 rust-analyzer`
21:58karolherbst: cursed that it's not done automatically
21:59karolherbst: and then uhmm...
21:59Lyude: wooooo
21:59karolherbst: and then for e.g. VScode you'd need to add this to the config: "rust-analyzer.linkedProjects": [ "builddir/rust-project.json" ],
21:59Lyude: i feel empty without my code completion :3
22:00Lyude: karolherbst: gotcha, I'll figure out the nvim-cmp equiv for that (I'm pretty sure I already know what it is)
22:00karolherbst: yeah.. hopefully
22:00fdobridge_: <redsheep> I assume this new driver was the one y'all were referring to when I was asking about modesetting madness? So is this where the eventual work will be towards newer features like enabling HDMI FRL, 10 bit color, DSC and reduced modes?
22:00Lyude: oh no, we're only aiming for 640x400 vga support. (yes - that's the correct driver :)
22:00karolherbst: :D
22:00Lyude: tbh HDMI FRL is going to be tough unless other drivers have it already in the kernel
22:00karolherbst: 640x400 but HDR
22:00Lyude: hah
22:01fdobridge_: <dadschoorse> for reads: undefined result without VK_EXT_robustness2/VK_EXT_image_robustness, 0, 1 for alpha otherwise
22:01fdobridge_: <dadschoorse> for writes: always discarded
22:01Lyude: but, basically everything else is an eventual yes
22:01fdobridge_: <karolherbst🐧🦀> mhhhh
22:01fdobridge_: <karolherbst🐧🦀> so I think NVK gets that wrong then
22:01fdobridge_: <redsheep> The Nvidia one does with gsp, nobody else does though, seems Nvidia probably has the sensitive code inside the gsp if I had to guess
22:01Lyude: I mean yeah but nvidia also had a closed driver until now, which.
22:01fdobridge_: <karolherbst🐧🦀> there is a flag to flip between "ignore", "nearest" and "trap"
22:02fdobridge_: <karolherbst🐧🦀> and nearest is the default
22:02Lyude: you know i've thought about if they actually like. asked hdmi, if they could release those bits, but maybe it's not an issue since it's handled on the gsp side in their driver
22:02karolherbst: heh
22:02karolherbst: that HDMI story....
22:02Lyude: JFYI: there are legal issues with HDMI, is the short version of the story
22:02karolherbst: it's a real pita
22:02Lyude: yeah
22:02Lyude: just get displayport
22:02Lyude: all hail vesa, bill is very cool, thanks bill, etc.
22:03karolherbst: I totally don't get the HDMI folks
22:03Lyude: I just think they're way too close to media IP to really have a realistic view of things
22:04Lyude: I hope hdmi goes away someday. it's not even a very good connector
22:04fdobridge_: <redsheep> Unfortunately they still have the higher speed display out on the latest Nvidia cards, so I don't have much choice
22:04airlied: I also don't think there is a real legal problem with anyone not in the hdmi consortium from working it out and implementing it
22:04Lyude: ^ this is also possible yes
22:05Lyude: could document it for them
22:07fdobridge_: <redsheep> What would that even take? Is there any way to probe what calls Nvidia makes to gsp? Sorry if thats a noob question, I'd love to help tease it apart with my hardware if I knew how
22:09fdobridge_: <redsheep> Or maybe I'd need AMD hardware where the implementation isn't already in the firmware?
22:09Lyude: all of us were at that level once in our lives it's all good :). so - openrm (the name for nvidia's GPU driver) is open source and has the code for using the FRL stuff in GSP I believe
22:09Lyude: I have a feeling that might not give us all the implementation details other drivers would need but we could certainly use that and tracing GSP calls to figure out how it works for nvidia hardware
22:11fdobridge_: <airlied> are there monitors that need FRL, or just TVs?
22:11Lyude: I think some monitors can use it but I don't know how many need it
22:11Lyude: no idea though
22:12fdobridge_: <redsheep> I'm fairly sure there are yes. Samsung has one or two I believe. But in my case it is a TV.
22:12karolherbst: Lyude: looks like you really need it for anything over 4K@60
22:13Lyude: i never got stuff like that
22:13karolherbst: https://en.wikipedia.org/wiki/HDMI#Refresh_frequency_limits_for_common_resolutions
22:13Lyude: like it would be one thing if hdmi was cheaper but
22:13Lyude: it's, not?
22:13Lyude: one has royalties, the other doesn't
22:13karolherbst: a 4K@60 HDR would already trigger it :D
22:13fdobridge_: <redsheep> Even displayport needs DSC just to achieve 4k120 if you want HDR.
22:13karolherbst: but I do have a 4K@120 TV
22:13karolherbst: HDR
22:13karolherbst: not sure if real or fake HDR
22:13fdobridge_: <airlied> isn't FRL just DSC in disguise?
22:14karolherbst: they have DSC on top
22:14karolherbst: it's needed for like 8K@120
22:15Lyude: FRL is what happened when they finally couldn't turn the clocks higher :3
22:15fdobridge_: <redsheep> No, my understanding is FRL is the faster HDMI 2.1 link
22:15karolherbst: there are multiple FRL rates
22:15karolherbst: like 6?
22:15karolherbst: and the first one is slower than 600MHz TMDS
22:15fdobridge_: <redsheep> 6-12 ghz I want to say
22:15fdobridge_: <airlied> yeah it's a new signalling mechanism to replace TMDS, but also has some compression on top
22:16karolherbst: yeah, but then they have DSC on top of that
22:16fdobridge_: <airlied> my serious lack of understanding is the problematic bits are around the compression not the signalling
22:16fdobridge_: <airlied> but maybe there are some issues around link training
22:16karolherbst: it's probably just poking the hardware correctly
22:16karolherbst: or GSP just doing it for us anyway
22:17karolherbst: do we have to select the DP lanes with GSP? or is GSP handling that on its own?
22:17karolherbst: (and also.. how would it even)
22:17Lyude: you just come up with a link configuration for GSP and set it with the DP_MAIN_CTRL_CMD or something like that
22:17karolherbst: like on those Turing GPUs with USB-C ports
22:17karolherbst: I see
22:17Lyude: like use 1-4 links, some bandwidth, post-lt, fec, et .
22:17karolherbst: so the OS decides the link config?
22:17Lyude: *etc.
22:17karolherbst: sane
22:18Lyude: karolherbst: yes
22:18Lyude: I think you can have gsp decide it as well
22:18karolherbst: well.. maybe we can do that for HDMI
22:18Lyude: but both nouveau and openrm don't seem to
22:18karolherbst: as there it's pointless for the OS to really decide it
22:18Lyude: karolherbst: yeah - that's kinda why I think that might be the easiest implementation of hdmi frl
22:18Lyude: because I assume a lot of the training stuff is just doing a gsp callback
22:18karolherbst: yeah...
22:18Lyude: *call
22:19fdobridge_: <airlied> one does not simply do a gsp callback 😛
22:19fdobridge_: <airlied> I need a meme generator for that
22:20RSpliet: Having trouble finding sources on FRL. Is this not just a massive simplification of the display logic in that it only has to support like 6 well-defined clock frequencies rather than arbitrary clocks?
22:20fdobridge_: <airlied> I'd planned to do some nova work this week also, but currently I'm sitting working out why crocus is broken on gen5 intel
22:20karolherbst: RSpliet: it's probably abstracted away on hardware so that we don't really bother besides selecting the bandwidth...
22:20karolherbst: probably
22:20karolherbst: heh
22:20karolherbst: bingo
22:21karolherbst: wait..
22:21karolherbst: is that a different FLR?
22:21fdobridge_: <!DodoNVK (she) 🇱🇹> #zmikes-meme-factory
22:21karolherbst: nvidia.. why
22:21RSpliet: I think it always was... didn't we just give display HW a desired frequency and it would come up with some PLL to achieve that? Guess that's now just moved to firmware, which makes sense
22:22karolherbst: `const FlipLockRequestedGroup *pFLRG`
22:22fdobridge_: <airlied> I suppose a 4k@144 monitor would be hdmi2.1 at least
22:22RSpliet: hahaha
22:22RSpliet: FRL != FLR
22:22karolherbst: ahh shit :D
22:22RSpliet: TLAs are a PITA
22:22karolherbst: we have the docs for it tho
22:22karolherbst: yeah...
22:23karolherbst: it's trivial :D
22:23karolherbst: Lyude: https://github.com/NVIDIA/open-gpu-doc/blob/master/manuals/ampere/ga102/dev_display_withoffset.ref.txt#L6558
22:23RSpliet: I'm sure many firmware and HW devs have cried many tears to make it trivial ;x
22:23karolherbst: that's on the hardware
22:23karolherbst: yeah..
22:23karolherbst: you just set the bandwidth apparently
22:23Lyude: mhm, I don't think we touch a lot of those bits anymore though
22:24karolherbst: I'm sure there is an RPC for that on gsp
22:24Lyude: but regardless I feel like I've looked at the frl stuff at least once - yeah
22:24Lyude: not very closely but I definitely scrolled past it while going through their DP code
22:25karolherbst: so we just calc the needed bandwidth, set the FRL mode correctly and move on :D
22:29fdobridge_: <redsheep> That doesn't sound terrible, I assume that won't get patched into the existing kmd though right?
22:30karolherbst: I don't think there is anything to patch through
22:30karolherbst: besides maybe "sink supports those FRL modes"
22:30karolherbst: Lyude: do you know if it's part of the EDID or something cursed?
22:31Lyude: oh i have no idea, I know next to nothing about FRL
22:31karolherbst: mhh
22:31karolherbst: maybe I should check on my TV tomorrow
22:31Lyude: I would assume so though
22:31Lyude: since HDMI doesn't really have DP aux
22:31fdobridge_: <redsheep> Need me to dump my edid? I've passed it under windows and I think I saw it mentioned
22:31karolherbst: yeah.. I'm wondering if it's a max rate or a list of rates
22:31karolherbst: on a 4K120 display or so?
22:31Lyude: oh - that also may or may not even matter
22:31karolherbst: yeah, would help
22:32Lyude: Since I wouldn't be surprised if the GSP has calls to actually get the supported rates, which we probably want to use over reading the edid so we get free quirks
22:32karolherbst: Lyude: I'm thinking of cursed hardware only doing FRL2 and FRL4 or something silly :D
22:32karolherbst: if it's legal by the spec you will find hardware doing it, and if it would be illegal there would be hardware like that anyway
22:32Lyude: i've learned specs are best read with the widest and vaguest possible interpretations
22:33karolherbst: doesn't matter for HDMI 2.1 as we won't ever see the spec anyway
22:33karolherbst: but I've also seen displays where changing adapter versions didn't update the EDID checksum and other funny things
22:35karolherbst: yo pain.. those HDMI 2.1 related EDID bits are of course part of the HDMI spec
22:35karolherbst: wait.....
22:36karolherbst: nvm
22:39fdobridge_: <gfxstrand> Depends on how many robustness features you advertise. 😂 For full robustness 2, they have to return 0
22:39karolherbst: mhhh...
22:39karolherbst: `Maximum TMDS clock: 300 MHz`
22:40Lyude: oh wow vkms is small
22:40fdobridge_: <gfxstrand> There's a fun comment about this in nak_nir_lower_tex.c
22:40Lyude: I guess I'm not surprised but that's certainly welcome :)
22:40karolherbst: that 300 MHz makes no sense
22:40karolherbst: ohh..
22:40karolherbst: there are more blocks
22:40karolherbst: "Maximum TMDS Character Rate: 600 MHz"
22:41karolherbst: I really should check on my TV :D
22:41fdobridge_: <karolherbst🐧🦀> I see..
22:42fdobridge_: <karolherbst🐧🦀> @gfxstrand `.IGN` on the surface ops to get 0 😛
22:43fdobridge_: <karolherbst🐧🦀> mhhh
22:43fdobridge_: <gfxstrand> What's .IGN?
22:43fdobridge_: <karolherbst🐧🦀> return 0 on OOB
22:43fdobridge_: <gfxstrand> Ah
22:43fdobridge_: <karolherbst🐧🦀> the default is `.NEAR`
22:43fdobridge_: <karolherbst🐧🦀> however
22:43fdobridge_: <karolherbst🐧🦀> `.D` returns 0, `.P` returns a null texel
22:44fdobridge_: <karolherbst🐧🦀> which can mean alpha is 1
22:44fdobridge_: <karolherbst🐧🦀> 1 for int, 1.0 for norm float
22:44fdobridge_: <gfxstrand> Surface ops are weird. It's 0 for null but I think it's either 0 or 0001 for OOB. There's lots of details in the spec that I don't have memorized.
22:44fdobridge_: <karolherbst🐧🦀> I'm sure the hw matches the spec here 😄
22:44fdobridge_: <gfxstrand> Probably
22:45fdobridge_: <karolherbst🐧🦀> `.NEAR` simply clamps
22:45fdobridge_: <karolherbst🐧🦀> and the third and last option is `.TRAP`
22:45fdobridge_: <gfxstrand> Yeah, we don't want that
22:45fdobridge_: <karolherbst🐧🦀> let's see if the tex ops have something as well
22:45fdobridge_: <karolherbst🐧🦀> mhhh
22:45fdobridge_: <karolherbst🐧🦀> prolly part of the texture desc
22:46fdobridge_: <karolherbst🐧🦀> or uhm.. sampler :ferrisUpsideDown:
22:47fdobridge_: <dadschoorse> sounds like you want .P
22:47fdobridge_: <karolherbst🐧🦀> well
22:47fdobridge_: <karolherbst🐧🦀> if it's there
22:47fdobridge_: <karolherbst🐧🦀> but yeah
22:47fdobridge_: <karolherbst🐧🦀> `.P` is kinda weird
22:47fdobridge_: <redsheep> Does this work for you to see the EDID on my monitor? Looks like it just specifies a maximum link rate of 12x4 for 48gbps
22:47fdobridge_: <redsheep> https://cdn.discordapp.com/attachments/1034184951790305330/1187164449216413748/LGC2EDID.bin?ex=6595e40d&is=65836f0d&hm=685f96c6df6af5ac870b69fe1ddd049de7d324f91f3764464b184954b972242a&
22:48fdobridge_: <karolherbst🐧🦀> mhhh
22:48fdobridge_: <karolherbst🐧🦀> `Maximum TMDS Character Rate: 600 MHz`
22:48fdobridge_: <karolherbst🐧🦀> OHHH
22:48fdobridge_: <karolherbst🐧🦀> it's there
22:48fdobridge_: <karolherbst🐧🦀> noooo
22:48karolherbst: Lyude: edid-decode be like: "Max Fixed Rate Link: 3 and 6 Gbps per lane on 3 lanes, 6, 8, 10 and 12 Gbps on 4 lanes"
22:49karolherbst: "DSC Max Fixed Rate Link: 3 and 6 Gbps per lane on 3 lanes, 6 Gbps on 4 lanes"
22:49karolherbst: "DSC Max Slices: up to 4 slices and up to (340 MHz/Ksliceadjust) pixel clock per slice"
22:49karolherbst: so yeah..
22:49karolherbst: it's entirely cursed
22:50Lyude: oh wow they finally have multiple lanes
22:50karolherbst: https://git.linuxtv.org/edid-decode.git/tree/parse-cta-block.cpp#n1276
22:51karolherbst: https://git.linuxtv.org/edid-decode.git/tree/parse-cta-block.cpp#n1219
22:51karolherbst: ohh
22:51karolherbst: it's less cursed
22:51karolherbst: there are apparently a set of legal options
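(editor's sketch) The edid-decode strings quoted above suggest that the sink advertises a single maximum FRL index and that each index maps to one fixed lanes-by-rate combination. A minimal Python sketch of the "calc the needed bandwidth, set the FRL mode" idea under that assumption follows. The table is read off the edid-decode text, the helper names are hypothetical, and the numbers are raw link rates that ignore coding and DSC overhead. This is not how nouveau or GSP do it.

```python
# FRL index -> (lanes, Gbps per lane), read off the edid-decode text quoted
# in the log. Raw link rates only; coding overhead is NOT accounted for.
FRL_MODES = {
    1: (3, 3),
    2: (3, 6),
    3: (4, 6),
    4: (4, 8),
    5: (4, 10),
    6: (4, 12),
}


def raw_gbps(index):
    """Raw link bandwidth in Gbps for a given FRL index."""
    lanes, rate = FRL_MODES[index]
    return lanes * rate


def lowest_frl_for(required_gbps, sink_max_index):
    """Hypothetical helper: lowest FRL index that carries required_gbps,
    limited to what the sink advertises. Returns None if nothing fits."""
    for index in sorted(FRL_MODES):
        if index > sink_max_index:
            break
        if raw_gbps(index) >= required_gbps:
            return index
    return None


if __name__ == "__main__":
    # A sink advertising index 6 (12 Gbps on 4 lanes, i.e. 48 Gbps raw).
    print(raw_gbps(6))
    print(lowest_frl_for(20, 6))
```

Picking the lowest index that fits matches the later remark in this log that high FRL rates are hard on cables.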
22:52fdobridge_: <redsheep> I mean, it makes sense for the DSC modes to be different. If you don't have a bad cable and source you don't want to use it, and most non-DSC focused displays don't have hardcore enough DSC hardware to drive lots of bits through it.
23:09fdobridge_: <redsheep> It would be really cool to have a way to manually define whether I want DSC and turn all the little knobs with how my display gets driven, but I understand that's almost an automatic footgun in the hands of most users. That would be more work that only the handful of display overclockers would ever make good use of.
23:09fdobridge_: <karolherbst🐧🦀> the main benefit for using DSC in DP is to free up DP lanes for other devices
23:09fdobridge_: <karolherbst🐧🦀> but...
23:09fdobridge_: <karolherbst🐧🦀> the linux story on this is poor
23:10fdobridge_: <redsheep> Ohhhh thunderbolt 5 will probably be a nightmare if that ever comes up, it makes that part of all this WAY more complicated.
23:11fdobridge_: <karolherbst🐧🦀> lucky us that we don't handle it in any good way in the first place
23:11fdobridge_: <karolherbst🐧🦀> or I missed it and we did land dynamic reconfiguration of lanes
23:11fdobridge_: <karolherbst🐧🦀> which I doubt we do 😄
23:11fdobridge_: <karolherbst🐧🦀> as there is also like no userspace feedback for that ?
23:12fdobridge_: <karolherbst🐧🦀> I think atm we just yolo it all
23:12fdobridge_: <karolherbst🐧🦀> and if you end up with bad lane assignment, you just can't use your devices 😄
23:12fdobridge_: <karolherbst🐧🦀> maybe that got fixed tho
23:13karolherbst: Lyude: do you know the status on managing DP lane assignments properly?
23:13karolherbst: or should drivers just use as few lanes as possible
23:14Lyude: What exactly do you mean? FWIW: usually the strategy you want is: lowest possible for SST (some panels need quirks to start at highest possible), highest possible for MST to reduce the need for full modesets when a connector is added/removed
23:14Lyude: "lowest possible" meaning the slowest configuration that can fit the highest res/refresh rate/etc. on the display
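(editor's sketch) The SST strategy Lyude describes, taking the slowest configuration that still fits the mode, can be sketched in Python as below. Assumptions: the standard DP per-lane rates (1.62, 2.7, 5.4, 8.1 Gbps), 8b/10b coding, lane counts of 1, 2 or 4, and no DSC or MST. The function name and the tie-break flag are hypothetical and do not come from nouveau or the DRM helpers.

```python
# Per-lane DP link rates in kbps (RBR, HBR, HBR2, HBR3) and legal lane counts.
LINK_RATES_KBPS = (1_620_000, 2_700_000, 5_400_000, 8_100_000)
LANE_COUNTS = (1, 2, 4)


def pick_link(pixel_clock_khz, bpp, prefer_fewer_lanes=False):
    """Return (rate_kbps, lanes) for the slowest link that fits the mode,
    or None if no configuration has enough bandwidth."""
    required = pixel_clock_khz * bpp  # kbps of pixel data
    candidates = []
    for rate in LINK_RATES_KBPS:
        for lanes in LANE_COUNTS:
            payload = rate * lanes * 8 // 10  # 8b/10b: 80% of the raw rate
            if payload >= required:
                # Tie-break among equal-bandwidth configs is a policy choice.
                tie = (lanes, rate) if prefer_fewer_lanes else (rate, lanes)
                candidates.append((payload, tie, (rate, lanes)))
    if not candidates:
        return None
    return min(candidates)[2]


if __name__ == "__main__":
    # 1080p60 (148.5 MHz pixel clock) at 24 bpp.
    print(pick_link(148_500, 24))
    print(pick_link(148_500, 24, prefer_fewer_lanes=True))
```

The flag only decides between configurations with equal bandwidth, which is the "lower rates vs. fewer lanes" question raised in the lines that follow.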
23:15karolherbst: yeah.. I think nouveau prefers lower rates over fewer lanes atm
23:15karolherbst: but I also kinda assumed the idea was to manage this more explicitly
23:15Lyude: I don't know that lane count matters that often
23:15karolherbst: well
23:15karolherbst: they do if you want to use USB3 devices on the same thing
23:16karolherbst: so you kinda want to prefer fewer lanes, so you can have some leftovers for high-speed USB devices
23:16Lyude: karolherbst: I think that's more of a bandwidth allocation issue, since I think the interfaces I've seen work in terms of allocating so much bandwidth to one thing
23:16Lyude: i think amd may have added some support for this. of course, they did not add any helpers i'm aware of
23:16karolherbst: yeah.. I see
23:16Lyude: thanks amd
23:17karolherbst: pain
23:19fdobridge_: <redsheep> I would think that at least for a single panel over HDMI FRL you would want to prefer as many lanes as possible at the lowest link rate, not the fewest lanes. It's actually pretty hard to get a good enough hdmi cable for high link rates, as I learned the hard way.
23:19fdobridge_: <redsheep> FRL is extremely sensitive
23:19fdobridge_: <karolherbst🐧🦀> yeah, for HDMI that's probably true
23:19fdobridge_: <karolherbst🐧🦀> for DP you run into USB lane assignment problems on a USB-C thing
23:20fdobridge_: <karolherbst🐧🦀> though maybe nouveau tries with the fastest rate first?
23:20fdobridge_: <karolherbst🐧🦀> mhh. I'd have to check the code
23:21fdobridge_: <karolherbst🐧🦀> but there are more fun bits
23:21fdobridge_: <karolherbst🐧🦀> normally drivers drive displays at higher bit per pixel values
23:21fdobridge_: <karolherbst🐧🦀> and dropping the display to 8 or 6 even, could allow you to use more displays/USB devices
23:22fdobridge_: <karolherbst🐧🦀> like some displays just advertise 16 bpc and drivers happily use that
23:22fdobridge_: <karolherbst🐧🦀> even if only 8 really make sense
23:22fdobridge_: <redsheep> Would that be 6 bit with some kind of temporal dithering or any madness like that? I imagine just straight 6 bit signal would look really gross
23:23fdobridge_: <karolherbst🐧🦀> I think at some point in the future we'd extend kms/drm to also specify the maximum number of lanes the driver is allowed to use for the display
23:23fdobridge_: <karolherbst🐧🦀> yeah dunno...
23:23fdobridge_: <karolherbst🐧🦀> sometimes it means getting 4K@60 instead of just 4K@30
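(editor's sketch) The trade-off between bits per component and lanes can be made concrete with a rough Python calculation. Assumptions: RGB output (3 components per pixel), 8b/10b DP coding, no DSC, and the 594 MHz pixel clock of the common 4K@60 timing. The function name and the list of bpc values are illustrative only.

```python
def max_bpc_that_fits(pixel_clock_khz, rate_kbps, lanes,
                      bpc_options=(16, 12, 10, 8, 6)):
    """Highest bits-per-component whose RGB pixel stream fits the link,
    or None if even the lowest option does not fit."""
    payload = rate_kbps * lanes * 8 // 10  # usable kbps after 8b/10b coding
    for bpc in sorted(bpc_options, reverse=True):
        if pixel_clock_khz * 3 * bpc <= payload:
            return bpc
    return None


if __name__ == "__main__":
    HBR3 = 8_100_000  # kbps per lane
    # 4K@60 with all four lanes vs. with two lanes left for USB.
    print(max_bpc_that_fits(594_000, HBR3, 4))
    print(max_bpc_that_fits(594_000, HBR3, 2))
```

Under these assumptions, four HBR3 lanes carry 4K@60 at up to 12 bpc, while two lanes only fit it at 6 bpc, which is the kind of choice being discussed here.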
23:24fdobridge_: <karolherbst🐧🦀> I actually fixed that nonsense for nouveau as with my HDR display nouveau ended up allocating bandwidth for 16 bpc 😄
23:24fdobridge_: <airlied> at least on i915 for older eDP we had to go slow and widw
23:24fdobridge_: <airlied> wide
23:25fdobridge_: <karolherbst🐧🦀> yeah.. for eDP it makes perfect sense
23:25fdobridge_: <karolherbst🐧🦀> only displays on USB-C hubs are the special case here
23:25fdobridge_: <airlied> so current code for non-DSC it goes slow/wide
23:26fdobridge_: <airlied> in i915 that is
23:26fdobridge_: <karolherbst🐧🦀> anyway.. I think my point is, that at some point it might make sense to add management code for this into usbcore and have some UAPI so userspace can change policies or something
23:26fdobridge_: <karolherbst🐧🦀> and drm will have to driver displays at one lane if requested
23:26fdobridge_: <karolherbst🐧🦀> *drive
23:26fdobridge_: <airlied> just add usb configuration to atomic modesetting 😛
23:26fdobridge_: <karolherbst🐧🦀> and just make it work
23:26fdobridge_: <karolherbst🐧🦀> 😄
23:26fdobridge_: <karolherbst🐧🦀> yes.. atomic usb
23:28fdobridge_: <karolherbst🐧🦀> but I think giving users the choice to choose between "drive external drive at USB2 speeds and crisp display" and "potentially worsen disp output quality, but get USB3 speeds"
23:28fdobridge_: <karolherbst🐧🦀> is kinda where we will end up
23:28fdobridge_: <airlied> most users don't want that choice at all, and there is no UI you could give it to them that would make sense
23:28fdobridge_: <karolherbst🐧🦀> yeah, so external storage speed > disp quali 😛
23:29fdobridge_: <redsheep> You could hand them a device list on the type c ports and have them order their priorities
23:29fdobridge_: <airlied> like nobody wants their 4k@30 because they plugged in a usb disk somewhere they shouldn't
23:29fdobridge_: <karolherbst🐧🦀> and then they are surprised if their display modesets after plugging in their USB storage 😄
23:29fdobridge_: <karolherbst🐧🦀> but that's often not the choice
23:29fdobridge_: <airlied> "there is no UI you could give it to them that would make sense"
23:29fdobridge_: <karolherbst🐧🦀> you can e.g. limit your 10 bpc display to 6 bpc
23:29fdobridge_: <karolherbst🐧🦀> or 16 to 8
23:30fdobridge_: <karolherbst🐧🦀> well.. nouveau clamps to 12 now since I've fixed it
23:30fdobridge_: <airlied> you just give them slow USB and they learn to plug it in somewhere else :-
23:30fdobridge_: <airlied> 😛
23:30fdobridge_: <karolherbst🐧🦀> so 12 -> 8 would be a viable option to get one more lane for USB
23:30fdobridge_: <karolherbst🐧🦀> yeah well..
23:31fdobridge_: <karolherbst🐧🦀> or we are smarter in drm and not waste lanes 😛
23:31fdobridge_: <karolherbst🐧🦀> but the policy users could set could be "prefer fast USB storage over display output quality"
23:32fdobridge_: <karolherbst🐧🦀> I guess USB networking could also cause issues here
23:32Lyude: karolherbst: yeah I've wanted a uapi for this at some point as well
23:32fdobridge_: <karolherbst🐧🦀> because I sure do prefer 1G over 100M 😄
23:33karolherbst: Lyude: yeah.. it makes perfect sense to have _something_ in the future
23:33karolherbst: we just need to figure out what it should look like
23:34karolherbst: perfect EVoC project probably :D
23:34fdobridge_: <redsheep> Honestly when available DP over type c should probably have DSC ratcheted up as far as it will go to use fewer lanes on displays that can take it. The visual loss of even the 3x compressed mode is nothing next to what you would end up with at 6 bpc...
23:34fdobridge_: <karolherbst🐧🦀> yeah...
23:35fdobridge_: <karolherbst🐧🦀> that's why I had the idea of drm telling the driver how many lanes it's allowed to use
23:35fdobridge_: <airlied> does OSX or Windows have anything for this?
23:35fdobridge_: <karolherbst🐧🦀> and usb reserving lanes from drm
23:35fdobridge_: <karolherbst🐧🦀> they are smarter than linux with this afaik
23:35fdobridge_: <karolherbst🐧🦀> not sure they have UIs
23:35fdobridge_: <karolherbst🐧🦀> but like reducing used lanes without loss of image quality would already be better than what we currently have
23:36fdobridge_: <karolherbst🐧🦀> and we don't do that
23:36fdobridge_: <karolherbst🐧🦀> atm
23:36fdobridge_: <karolherbst🐧🦀> like if dropping from 14 bpc to 10bpc gives you one more free lane, you should probably free that lane
23:37fdobridge_: <karolherbst🐧🦀> but we don't really care about that
23:37fdobridge_: <redsheep> Windows lets you manage your bit depth and chrome subsampling, which often results in getting the result you want, but nothing direct as far as I am aware.
23:38fdobridge_: <redsheep> *chroma
23:38fdobridge_: <karolherbst🐧🦀> I know I've run into issues with this on i915 in the past
23:38fdobridge_: <karolherbst🐧🦀> but I think it was kinda addressed in a good enough way
23:38fdobridge_: <karolherbst🐧🦀> would have to check again at some point
23:39fdobridge_: <redsheep> Does anybody aside from medical imaging and dolby vision even use anything above 10bpc anyway? I feel like outside of special cases it should cap at 10