00:00pcercuei: With GEM buffers it seems to be more tricky... If I read correctly, the vma->vm_pgoff contains a fake offset
00:00pcercuei: so dma_mmap_attrs() gives me -ENXIO since it tries to use it as the real offset
00:59pcercuei: ok, I think I got it to work
04:11scx: Is it possible to disable specific driver via driconf?
04:12scx: If yes, how to do this?
04:31dcbaker[m]: Not via driconf. There is an environment variable, but frankly talk to the maintainers of the 19.08 runtime, they're using an old version of mesa 20.0.x (20.0.5 when 20.0.8 is the last in the series), and updating that might solve your problem
05:28scx: dcbaker[m]: I've tried Mesa 19.1.5 (although i965 is the preferred driver here, so we have to force iris via the MESA_LOADER_DRIVER_OVERRIDE variable), 20.0.5 and 20.1.1.
05:28scx: The problem still persists.
05:43soreau: it might be easier to notify users of potential problems and solutions than to try and make it work in every possible case
11:15Vanfanel: imirkin_: since drmGetCap(m_fd, DRM_CAP_CURSOR_WIDTH, &capability) would give a cursor width, is that the recommended one? ie: one that is guaranteed to work.
11:18LiquidAcid: Vanfanel, if you use the atomic interface you can always do a test-only commit and check if it fails
11:18LiquidAcid: i think this is what imirkin meant with fallback logic
11:23Vanfanel: LiquidAcid: That's a good idea. However, not all drivers support the atomic interface yet, do they?
11:24emersion: Vanfanel: yes, DRM_CAP_CURSOR_WIDTH is a width that will work for the cursor plane
11:24emersion: yes, some drivers still miss the atomic interface (drmdb can give you an idea of which drivers support it)
11:25emersion: it's generally older hw, e.g. old AMD cards
11:25Vanfanel: emersion: do you mean it's going to work for sure? Like, amdgpu reports that 16x16 works and then it shows a garbled cursor, while only 128x128 and up seems to work
11:25emersion: amdgpu reports 256 for me
11:26Vanfanel: emersion: yes, I was manually trying some sizes: 128 works, and 256 is reported and also works
11:27emersion: if you use DRM_CAP_CURSOR_WIDTH, it will work, yes
11:27Vanfanel: emersion: that's what I needed to know :)
11:27Vanfanel: Many thanks!
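A minimal sketch of the cursor-size query discussed above, in C. It assumes the caller passes a function with the same signature as libdrm's drmGetCap() (the call Vanfanel quotes), so the logic can be exercised without a DRM device. The names pick_cursor_size, get_cap_fn and the two fake_cap_* stand-ins are hypothetical, and the 64x64 fallback for drivers that do not report the capability is an assumption, not something stated in the discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* These values mirror the DRM uapi header; they are defined here only so
 * the sketch builds without libdrm installed. */
#ifndef DRM_CAP_CURSOR_WIDTH
#define DRM_CAP_CURSOR_WIDTH  0x8
#define DRM_CAP_CURSOR_HEIGHT 0x9
#endif

/* Assumption: size to use when the driver does not expose the caps. */
#define FALLBACK_CURSOR_SIZE 64

/* Same signature as libdrm's drmGetCap(), so the real function can be passed. */
typedef int (*get_cap_fn)(int fd, uint64_t capability, uint64_t *value);

struct cursor_size {
    uint32_t width;
    uint32_t height;
};

/* Query the cursor size the driver reports, falling back per dimension
 * when the capability is missing or reports zero. */
struct cursor_size pick_cursor_size(int fd, get_cap_fn get_cap)
{
    struct cursor_size size = { FALLBACK_CURSOR_SIZE, FALLBACK_CURSOR_SIZE };
    uint64_t value = 0;

    if (get_cap(fd, DRM_CAP_CURSOR_WIDTH, &value) == 0 && value > 0)
        size.width = (uint32_t)value;
    if (get_cap(fd, DRM_CAP_CURSOR_HEIGHT, &value) == 0 && value > 0)
        size.height = (uint32_t)value;

    return size;
}

/* Hypothetical stand-in for drmGetCap(): a driver reporting 256x256. */
int fake_cap_256(int fd, uint64_t capability, uint64_t *value)
{
    (void)fd;
    if (capability != DRM_CAP_CURSOR_WIDTH &&
        capability != DRM_CAP_CURSOR_HEIGHT)
        return -EINVAL;
    *value = 256;
    return 0;
}

/* Hypothetical stand-in for drmGetCap(): a driver without the caps. */
int fake_cap_unsupported(int fd, uint64_t capability, uint64_t *value)
{
    (void)fd;
    (void)capability;
    (void)value;
    return -EINVAL;
}
```

In a real program the call would be pick_cursor_size(fd, drmGetCap) on an opened DRM device. Where the atomic interface is available, a test-only commit as LiquidAcid suggests can still be used to validate other sizes.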
13:15pcercuei: Working on supporting fully cached GEM buffers in my driver (ingenic-drm). I implemented cache invalidation using the damage helper. Now, I would like the cacheability to be a property of the drm_framebuffer, so that it's disabled by default and can be enabled by the application.
13:17pcercuei: I don't quite understand the relation between a drm_framebuffer and dumb buffers, though. In the fops' .mmap() callback, how can I retrieve the drm_framebuffer from the GEM buffer?
13:18pcercuei: If the cacheability is a property of the drm_framebuffer, I need to know what it's set to in that function, in order to mmap with different attributes
13:21pcercuei: The alternative is to have a module parameter for the cacheability, but I'd prefer to have a finer granularity ...
14:56pcercuei: Heh... It's not possible to attach properties to drm_framebuffer objects?
14:57pcercuei: They do have a drm_mode_object field, but I see no set_property/get_property callbacks anywhere
14:58danvet: yeah no properties on framebuffers
14:59danvet: also framebuffers are invariant objects by design, we'd need to add the prop list to addfb somehow
14:59danvet: thus far we've just added more props to the plane
14:59danvet: pcercuei, ^^
15:00pcercuei: Alright, thanks
15:01pcercuei: That makes things even more complex for me, since a GEM buffer is mmap'd before the atomic commit, so I can't just check whether the property has been set on the plane
15:02pcercuei: Unless I change the mapping when the property is changed (can I do that?)
15:05pcercuei: Or maybe commit just the property change, then mmap, then do the modeset...
15:05dcbaker[m]: scx: if you can replicate the bug outside of flatpak, then open an issue about the app, especially if it works with i965 but not iris
15:05danvet: changing the coherency mode of an allocated buffer generally doesn't work
15:05danvet: pcercuei, if you want that, this would need to be an allocation time thing
15:06danvet: but I'm not exactly sure why you want that
15:06danvet: all the use-cases I can think of are solved with the prefer_shadow flag already
15:06danvet: anytime you have an uncached framebuffer set that flag
15:07danvet: and userspace better heed it
15:07pcercuei: using a shadow buffer sinks the performance
15:08pcercuei: < danvet> changing the coherency mode of an allocated buffer generally doesn't work
15:08pcercuei: I got it to work with a custom .mmap(), but so far the toggle switch is a module parameter, which is not ideal
15:09danvet: dma-api doesn't allow you to change that at runtime, you need to reallocate
15:09danvet: pcercuei, and why does shadow kill performance?
15:09danvet: at least on x86 copying the relevant things is as fast as cache flushing
15:10pcercuei: we're talking about how the data is mapped to userspace, how is that an allocate-only thing?
15:10pcercuei: < danvet> pcercuei, and why does shadow kill performance?
15:10pcercuei: shadow means copying a full frame
15:11pcercuei: cache flushing is one bit to set per page
15:11danvet: if you can supply a dirty rect list to flush just what you need to flush
15:11danvet: you can also do that when copying
15:11pcercuei: in an ideal world, yes, but in our case the dirty area is the full frame
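A minimal sketch of danvet's point that a dirty rect list can bound the copy out of a shadow buffer just as it bounds a flush. It is plain userspace C, not driver code; struct rect and copy_damage are hypothetical names, and the sketch assumes both buffers share one pitch and that every rectangle lies inside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical damage rectangle in pixels; x2 and y2 are exclusive. */
struct rect {
    uint32_t x1, y1, x2, y2;
};

/*
 * Copy only the damaged rectangles from the shadow buffer to the scanout
 * buffer. pitch is bytes per row, cpp is bytes per pixel. Assumes each
 * rectangle is well-formed (x1 <= x2, y1 <= y2) and within both buffers.
 * Returns the number of bytes copied.
 */
size_t copy_damage(uint8_t *scanout, const uint8_t *shadow,
                   size_t pitch, size_t cpp,
                   const struct rect *rects, size_t nrects)
{
    size_t copied = 0;

    for (size_t i = 0; i < nrects; i++) {
        const struct rect *r = &rects[i];
        size_t row_bytes = (size_t)(r->x2 - r->x1) * cpp;

        /* One memcpy per damaged row, starting at the rect's left edge. */
        for (uint32_t y = r->y1; y < r->y2; y++) {
            size_t off = (size_t)y * pitch + (size_t)r->x1 * cpp;

            memcpy(scanout + off, shadow + off, row_bytes);
            copied += row_bytes;
        }
    }

    return copied;
}
```

When the only rectangle covers the whole frame, as in pcercuei's case, this degenerates into a full-frame copy and the damage list saves nothing.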
15:12danvet: so I'm assuming this is all for some fairly simple display driver
15:12danvet: and either you allocate pages, and then dma_map_sg + flushing
15:12danvet: or you do dma_alloc_coherent
15:12danvet: you can't make up your mind afterwards
15:12danvet: without breaking the dma-api abstraction, which I think for display drivers is a step too far
15:13danvet: hch already doesn't like us for the other places we break it
15:13pcercuei: I'm using CMA helpers, so I assume it does dma_alloc_coherent
15:14danvet: yeah, and you should use the dma_mmap helper then
15:14danvet: mixing dma_alloc_coherent and explicit flushing is undefined
15:14danvet: might or might not work
15:14pcercuei: I do. The regular drm_gem_cma_mmap() function calls dma_mmap_wc, my function differs only in that it calls dma_mmap with the non-coherent flag
15:15danvet: uh I think that would need review from someone who understands dma api first
15:16danvet: I really don't want to start even more fights in this area
15:16danvet: (with dma-api maintainers)
15:17danvet: pcercuei, btw for "shadow is slower", have you tried having 2 shadow buffers, and comparing them before you flip?
15:17pcercuei: It's pretty standard code, really
15:17danvet: doing lots of minor faults really isn't very fast generally
15:18pcercuei: comparing them how?
15:18danvet: or userfaultfd
15:18danvet: it sounds like you're using userspace that doesn't know what it's rendering
15:18danvet: we generally assume that userspace knows what it's rendering
15:18danvet: for kms at least
15:19pcercuei: userspace is mostly games and emulators, all software-rendered
15:20pcercuei: and double-buffered, so every frame is painted anew
15:20danvet: how does catching pagefaults help then?
15:21danvet: if the entire frame is painted I mean
15:21pcercuei: it's not about pagefaults
15:21pcercuei: it's about cacheability
15:21pcercuei: write-combine means that performance tanks when doing alpha-blending
15:22pcercuei: because reads are uncached
15:23danvet: yeah, but I thought your problem is that copying the full frame is too much
15:23danvet: with pagefaults, you end up flushing the full frame
15:23danvet: at least on x86, that's about the same speed
15:24danvet: well with pagefaults you end up flushing the full frame, plus a pile of pagefaults each frame
15:24danvet: pagefaults only win when you're painting very little of the overall frame
15:24danvet: blinking cursor or so
15:27pcercuei: pagefaults is when you try to access a "cold" page, right? How is that related to the discussion?
15:43danvet: I thought you want to track dirty pages with minor faults
15:44danvet: if not, I have no idea what exactly you want to do, because:
15:44danvet: a) prefer_shadow is too expensive
15:44danvet: b) dirty rectangle stuff not possible
15:44danvet: there's no c) if you don't track this with page faults
15:44danvet: I guess I go back to w/e
15:45pcercuei: c) is to have fully cacheable buffers, and force writeback on plane updates
19:26nanu: Hi :) how can i disable tiling in freedreno, i found FD_DBG_TTILE but i want it the other way round because my compositor doesn't support tiling
19:46lenarhoyt: Hi. I would like to smooth mouse movements so as to better control the mouse with a strong hand tremor. How can I achieve this?
19:47lenarhoyt: I was thinking about modifying usbhid.
22:24Lightsword: is there currently work being done for a mesa DirectML or D3D12 frontend as part of https://devblogs.microsoft.com/directx/in-the-works-opencl-and-opengl-mapping-layers-to-directx/ ?
22:28airlied: Lightsword: no
22:28airlied: that's purely about backends
22:28Lightsword: I see "Make it easier for developers to port their apps to D3D12" as a reason for this work but I don't see how this would be viable without a mesa3d D3D12 frontend.
22:28Lightsword: airlied, yeah, that's where I'm confused, because to port apps away from opencl to D3D12 you need a frontend not a backend right?
22:29HdkR: That talks about it a bit
22:30airlied: Lightsword: the idea I think is they get apps running under Vulkan etc then can port to the MS native D3D12 drivers using WSL2
22:30bnieuwenhuizen: Lightsword: I thought the idea was that they also provide D3D12 directly on WSL2, just not with mesa in between
22:30airlied: they don't care about Linux not on WSL2
22:32Lightsword: airlied, but D3D12 only works on WSL2 right? if someone ports an app from opengl/opencl to D3D12/DirectML(as microsoft is suggesting they do) how would it run on non-WSL2 linux systems?
22:32bnieuwenhuizen: Lightsword: they wouldn't?
22:33Lightsword: so this is classic microsoft Embrace, extend, and extinguish then?
22:33Lightsword: I mean, if that's the case are they not literally advocating in their blog post to break compatibility with all non-WSL2 Linux systems?
22:33airlied: not sure it's even that much of a decent 3E strategy
22:34bnieuwenhuizen: I think vkd3d also provides d3d12 on top of vulkan?
22:36danvet: someone should write a vk on top of gl and we could make every combo loop
22:36Lightsword: airlied, maybe what they are planning is to force everyone to use Azure(via a cloud version of WSL 2) for machine learning on Linux by crippling app compatibility with native Linux graphics/compute API's?
22:36airlied: danvet: I have vk on top of gallium nearly :-P
22:37Lightsword: I mean, why else would they be advocating developers move away from native graphics/compute API's to their weird bridged API's instead of telling developers to use the opencl/opengl mapping layers?
22:38airlied: Lightsword: because devs already have the insane thing running on d3d12 or half running
22:38airlied: Lightsword: I expect cloud gaming workloads and ML workloads the primary focus for azure
22:38Lightsword: airlied, don't they already have the mapping layer running as well?
22:38airlied: and they want to expose CUDA at near native speeds
22:39airlied: and ML etc, which when all you have is D3D12 host drivers, you get D3D12 guest drivers
22:39airlied: then people still want to run their Linux desktops, so need GL etc
22:40Lightsword: isn't the purpose of https://gitlab.freedesktop.org/kusma/mesa/-/tree/msclc-d3d12 to translate d3d12 API to opencl/opengl so normal linux apps work with their d3d12 guest drivers?
22:40airlied: yup pretty much
22:41Lightsword: so they could do all the machine learning against that opencl mapping layer and then you wouldn't have the native Linux compatibility issues right?
22:41airlied:isn't sure why there's a big effort there, we don't exactly have a lot of winning native Linux apps :-P
22:41Lightsword: airlied, tensorflow works natively on Linux right?
22:41airlied: Lightsword: with CUDA
22:42airlied: so you swap one proprietary stack for another
22:42bnieuwenhuizen: yeah I think on the compute side the competition is much more CUDA vs. Direct.. than OpenCL vs. Direct...
22:42airlied: ain't no competition with CL vs anything :-P
22:42Lightsword: well one proprietary stack at least is portable(CUDA) while DirectML is not(at least not to native Linux)
22:43bnieuwenhuizen: Lightsword: AMD would probably like a word with you on CUDA being portable :P
22:43Lightsword: so them getting everyone to move from CUDA to DirectML for applications would break the ability to run the app on any non-WSL 2 systems
22:43Lightsword: bnieuwenhuizen, well portable as in not tied to WSL2, only to the hardware
22:45airlied: Lightsword: but yeah business will do whatever they decide to do, they hate CUDA, I'm guessing they hate DirectML, someday maybe there will exist a useful open alternative
22:47danvet: Lightsword, from what I've heard the gl/cl-on-dx12 in mesa is a completely different effort from the wsl2 effort
22:47danvet: big company and all that
22:47danvet: the mesa thing was to replace certain very, very shoddy gl drivers on windows native for some specific apps
22:48jenatali: Lightsword: Yeah, danvet's right, the mapping layers are primarily for use on Windows. Though with WSL2 coming into the picture they do also have some value to provide there
22:48danvet: yeah I figured some parts of microsoft suddenly realized microsoft is big and has other parts :-)
22:49Lightsword: danvet, oh, I thought gl/cl-on-dx12 in mesa was mostly for WSL 2 since I thought most windows drivers supported opengl/opencl already
22:49jenatali: Surprisingly those two efforts actually appeared both within the same groups, but the overlap wasn't immediately obvious
22:49jenatali: Lightsword: Most, yes. All, no
22:49danvet: Lightsword, in general if it's not exactly the same group within a big company, it's safe to assume they only heard about the other once both went public ...
22:50Lightsword: danvet, yeah...I've worked for a company like that before
22:50danvet: ok, maybe I need to update that rule to "same team" :-)
22:50Lightsword: danvet, current stance for the kernel bridge is still that they need to provide an open source userspace right?
22:51danvet: but yeah one tremendous advantage of an open driver is that you hear much more about what all the other parts of your company are trying to do
22:51danvet: Lightsword, nah, it's more "this doesn't fit into drivers/gpu, not our problem"
22:51danvet: that bridge is to run a windows graphics stack on a windows hypervisor
22:51danvet: which happens to be repackaged into an ELF library instead of PE and with some linux kernel sandwiched in between, but aside from that it's a windows stack
22:55Lightsword: danvet, I thought they were doing a bunch of GPU stuff along with it like https://lists.freedesktop.org/archives/dri-devel/2020-June/270028.html
22:56danvet: I think that's a completely different thing
22:56jenatali: Yeah, I think so too
22:57danvet: I don't think you can (at least not easily) glue the dx12-stack-on-wsl2 into a linux graphics stack around drivers/gpu together
22:57danvet: also not sure you should even try :-)
22:57danvet: so that drm driver is probably again a totally different effort
23:00Lightsword: danvet, I thought the open source userspace policy was now being applied to machine learning accelerator drivers as well per https://lwn.net/ml/linux-kernel/20200519174120.GC1158284@kroah.com/
23:02danvet: Lightsword, yeah, but drivers/hyperv might have different stance
23:02danvet: also, nvidia won't open their stuff except over their dead bodies I guess
23:02danvet: and you need their blob driver from windows, since this is a windows gfx stack
23:03Lightsword: danvet, yeah, sure, but nvidia doesn't care about upstreaming their kernel stuff in general :P