07:13ThomasD13: Hello, just a short question: I'm trying to develop a kernel module which reads/writes data to a PCI endpoint device (which has its own RAM). I've thought about designing the kernel module in such a way that a block of memory can be remapped into userspace, which is used to write/read data to/from the PCIe device.
07:13ThomasD13: So, when userspace application mmaps this memory, and read/writes on it, the data is actually moved by the CPU right? There is no DMA involved (?)
07:22airlied: ThomasD13: yes by the cpu if you map pages in the aperture
07:25ThomasD13: airlied, okay. So to move the data actually by DMA, I need to implement for example an ioctl() which starts DMA transfer after userspace app has written its data into the remapped region?
07:32amonakov: Hi! Are kernel DRM components responsible for clearing allocated video memory before handing it to userspace? It seems at the moment they aren't?
07:32airlied: ThomasD13: you have to provide main memory mappings and have an ioctl to DMA to/from those
07:32airlied: amonakov: no currently they aren't
07:33ThomasD13: thank you very much airlied
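A minimal userspace-side sketch of the pattern airlied describes: the application fills a buffer in main memory (mapped by the driver) with ordinary CPU stores, then asks the driver to start the DMA through an ioctl. Everything here is an assumption for illustration: the request struct, the ioctl number, and the name `demo_start_dma` are hypothetical and do not come from any real driver.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical request layout; a real driver defines its own ABI. */
struct demo_dma_req {
    uint64_t buf_offset; /* offset into the mmap'ed main-memory buffer */
    uint64_t dev_addr;   /* destination/source address in device RAM */
    uint32_t len;        /* number of bytes to transfer */
    uint32_t to_device;  /* 1 = host to device, 0 = device to host */
};

/* Hypothetical ioctl number built with the standard _IOW helper. */
#define DEMO_IOC_MAGIC 'D'
#define DEMO_IOC_DMA _IOW(DEMO_IOC_MAGIC, 1, struct demo_dma_req)

/*
 * Ask the (hypothetical) driver behind fd to move len bytes by DMA.
 * The CPU only fills in the small request; the bulk data is moved
 * by the device. Returns the ioctl() result: 0 on success, -1 with
 * errno set on failure.
 */
static int demo_start_dma(int fd, uint64_t buf_offset, uint64_t dev_addr,
                          uint32_t len, int to_device)
{
    struct demo_dma_req req = {
        .buf_offset = buf_offset,
        .dev_addr = dev_addr,
        .len = len,
        .to_device = to_device ? 1u : 0u,
    };
    return ioctl(fd, DEMO_IOC_DMA, &req);
}
```

In such a design the application would typically mmap() the buffer from the device node, write its data there, and then call `demo_start_dma(fd, 0, dev_addr, len, 1)`; the kernel side (allocating DMA-capable memory and programming the engine) is not shown.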
07:33amonakov: airlied: how come? for common RAM it's a matter of security, is that not the same for video RAM?
07:33amonakov: (I feel it should have been a FAQ, but I couldn't find an explanation anywhere)
07:39airlied: amonakov: it's just never been done, for a long time video RAM wasn't really user allocated, and it was faster to not clear it
07:50amonakov: and Wayland folks not asking for something to finally be done about it? (also matters for GPU computing, where I'm coming from)
07:52PenguinOfDoom: I'd like to capture video output from a game. Is there an easy way to produce a libGL.so that does software rendering and calls my custom code for each frame?
07:52PenguinOfDoom: getting lost in all the mesa layers...
07:53airlied: amonakov: I suppose wayland folks are expected to clear their buffers anyways
07:54airlied: since you can't tell that the buffer you got didn't just come from the local process cache
07:54amonakov: airlied: I mean from the security perspective: a rogue client could deliberately not clear the buffer and obtain a snapshot of a recently closed window
07:55amonakov: airlied: same for unified memory systems: if I terminate an ssh client, allocate a big enough buffer without clearing it, can I expect to see my private key in that buffer?
07:56airlied: amonakov: uma systems clear it
07:56airlied: there is an option with amdgpu at least to clear on free for some allocations
07:56airlied: not sure how that is hooked up
07:57tzimmermann: danvet, gem shmem releases the backing pages if no mapping when no mapping is established?
07:58danvet: tzimmermann, I'm not parsing ...
07:59danvet: airlied, amonakov that discussion just came up recently on dri-devel
07:59danvet: imo not clearing vram was ok for legacy nodes, but not really for render nodes
07:59danvet: and MrCooper_ concurred
07:59danvet: but nothing really conclusive
08:00tzimmermann: danvet: oh, ok. so i looked at the code of gem shmem for adding vmap caching. and apparently backing storage is only allocated on the first vmap. but then it is also released on the vunmap.
08:00tzimmermann: what's happening to the stored content?
08:00danvet: shmemfs keeps it
08:01danvet: might get swapped out
08:01danvet: so if we cache the vmap, we probably need a shrinker
08:03tzimmermann: danvet, so that's what the mapping stuff in drm_gem_get_pages is for?
08:04tzimmermann: danvet, thanks. that explains it
08:12airlied: karolherbst: got one piglit image read test to pass on radeonsi, but explodes due to lack of blitter context soon after :-P
08:16airlied: disabling dcc gets it through the whole test
08:21amonakov: danvet: thanks, found the thread. if you need my +1: this is bad, and in general I think people would appreciate an overview of GPU-side memory protection landscape
08:22danvet: amonakov, feel free to chime in
08:22mareko: airlied: what are you talking about?
08:22amonakov: (e.g. I've found that it's too easy to hang the system via OOB accesses from compute shaders, something I'd hope the IOMMU would prevent, but I guess that's a topic for #radeon)
08:23danvet: amonakov, probably only going to move with an amdgpu patch to clear by default (with no override) and a doc patch to make that clear for render nodes
08:23danvet: i915 discrete will clear vram
08:34amonakov: PenguinOfDoom: not sure where software rendering is coming from in your question; typically this is done by performing dynamic linking interposition on libGL.so and hooking glXSwapBuffers
08:41PenguinOfDoom: amonakov: I'm not sure where it lives either! MESA loaded a dri/swrast_dri.so, maybe that's it?
08:41PenguinOfDoom: but I'm not sure how bitmaps get into the X window from there
08:43amonakov: PenguinOfDoom: no, meant why are you asking about software rendering in the first place; surely if you want to capture a game, you'd like to use hardware rendering?
08:45PenguinOfDoom: it's old enough to work well with software rendering, I don't want to tie up physical hardware
09:05amonakov: PenguinOfDoom: if you're not striving for high efficiency, I think dynamic interposition I mentioned is the way to go. at least it's the well-trodden path :)
09:24airlied: mareko: got one CL image test running with nir backend, but due to dcc it tries to use a u_blitter path that isn't initialised for a resource copy region
09:24airlied: disabling dcc makes it hit the compute paths
09:28PenguinOfDoom: amonakov: ahh I think I get it, wine calls glXSwapBuffers on every uhh swapchain present, and I can make a copy of the front buffer with glReadPixels after forwarding the call
09:30amonakov: PenguinOfDoom: yeah, just remember you need to either have a separate GL context for readback, or carefully save/restore GL state around the readback
09:30amonakov: I'd recommend a separate GL context of course
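A rough sketch of the interposition amonakov describes, assuming the game calls glXSwapBuffers through normal dynamic linking so that a preloaded library can define the symbol first. The hook `capture_frame` and the counter `frames_forwarded` are hypothetical names; the real function is looked up here in `libGL.so.1` via dlopen/dlsym, which is one option among several (dlsym with RTLD_NEXT is another common one). The X11/GLX types are redeclared locally only to keep the sketch free of external headers.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Local stand-ins for the Xlib/GLX types, to avoid needing their headers. */
typedef struct _XDisplay Display;  /* opaque, as in Xlib */
typedef unsigned long GLXDrawable; /* an XID */
typedef void (*swap_fn)(Display *, GLXDrawable);

static unsigned long frames_forwarded;

/*
 * Hypothetical per-frame hook. A real one would read the frame back
 * (e.g. with glReadPixels), preferably from a separate GL context or
 * with GL state saved and restored around the readback.
 */
static void capture_frame(Display *dpy, GLXDrawable drawable)
{
    (void)dpy;
    (void)drawable;
}

/* Look up the real glXSwapBuffers once; NULL if libGL is unavailable. */
static swap_fn real_swap(void)
{
    static swap_fn fn;
    static int tried;
    if (!tried) {
        tried = 1;
        void *lib = dlopen("libGL.so.1", RTLD_LAZY | RTLD_LOCAL);
        if (lib)
            *(void **)(&fn) = dlsym(lib, "glXSwapBuffers");
    }
    return fn;
}

/* Interposed entry point: forward the call, then run the hook. */
void glXSwapBuffers(Display *dpy, GLXDrawable drawable)
{
    swap_fn fn = real_swap();
    if (!fn)
        return; /* nothing to forward to */
    fn(dpy, drawable);
    frames_forwarded++;
    capture_frame(dpy, drawable);
}
```

Built as a shared object (for example `gcc -shared -fPIC -o libswaphook.so swaphook.c -ldl`, file names hypothetical) it would be loaded with `LD_PRELOAD=./libswaphook.so`. An application that obtains the function pointer through dlsym or glXGetProcAddress would bypass a hook like this.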
09:39daniels: or just use a tool like renderdoc/apitrace/frameretrace
09:41amonakov: yeah, I also meant to ask, is that all in context of learning how to do that, or achieving some particular end goal?
09:45PenguinOfDoom: just a hobby art project, producing a video stream of a game playing itself
09:46PenguinOfDoom: decided to wrap it in a docker container for convenience
10:38dj-death: does mesa 18 still receive bug fixes?
10:39airlied: nope pretty sure not
10:48dj-death: airlied: thanks
10:51MrCooper: karolherbst: FWIW, every Gallium driver effectively needs LLVM when using the OpenGL frontend at least, due to the draw module
11:32mareko: airlied: ah yes, COMPUTE_ONLY expects info->has_graphics == false
11:33mareko: airlied: I don't have a solution other than adding a COMPUTE_ONLY resource flag
13:44imirkin: MrCooper: you're probably aware, but draw can function without llvm
13:44MrCooper: I'm indeed aware :)
13:44imirkin: (and the only time it's used is for the stupid select/feedback stuff, so not an incredible loss)
13:45MrCooper: a) it's currently a build time switch, so doesn't really help for loading LLVM at runtime b) there is OpenGL functionality where it's unusually slow without LLVM
13:45imirkin: MrCooper: DRAW_USE_LLVM=0 at runtime
13:45MrCooper: therefore, effectively needed
13:46imirkin: and it'll be unusually slow either way, just even slower without llvm
13:46MrCooper: there are cases where it makes the difference between at least somewhat usable and not at all
13:47imirkin: for the select/feedback fallback with a hw driver?
13:47MrCooper: anyway, currently it's enabled by default, so it would still result in loading LLVM
13:48imirkin: i missed the start of that conversation, so probably missing some context. just pointing out the options.
14:00karolherbst: airlied: ahh yeah.. someone of the original patch authors ran into that as well and just created a normal context or something
14:00karolherbst: and I guess with images a "compute only" context doesn't make much sense, or we also do the blitter stuff
14:00karolherbst: MrCooper: right.... always forget about that part as well :/
14:59tango_: karolherbst: in what sense with images a compute only context doesn't make much sense?
15:00tango_: at least for opencl you should be able to access the TMUs even in compute-only contexts
15:01karolherbst: tango_: because you might also want to do gl_sharing and stuff, but I also don't know enough about the hw and drivers, so maybe even that would be fine
15:02tango_: you can use the tmus in opencl even without doing gl_sharing though
15:02tango_: or hardware equivalent
15:02karolherbst: sure, but what if the application uses gl_sharing
15:02karolherbst: but I think it would lead to two screens anyway
15:03karolherbst: I just don't know how pleased the hardware would be if resources get shared
15:03tango_: oh I don't know enough about mesa internals for that 8-P
15:03tango_: hm doesn't gl sharing require the gl context to be active before the cl context is initialized?
15:03karolherbst: no clue
15:04tango_: let me check what the spec says
15:04karolherbst: I just expect issues, that's all :D
15:08karolherbst: ahh yeah
15:08karolherbst: you create a context with either CL_GL_CONTEXT_KHR set to the context handle or CL_EGL_DISPLAY_KHR to the EGLDisplay one
15:08karolherbst: or CL_GLX_DISPLAY_KHR to the GLXContext
15:08tango_: karolherbst: ok so apparently the OpenGL (or ES or whatever) context must be created before, because it must be passed as a property to the clCreateContext
15:09tango_: so at context creation the platform can tell if it'll be a compute-only or shared context
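To illustrate tango_'s point, here is a small sketch of how an implementation could inspect the property list passed to clCreateContext and decide whether GL sharing was requested. The helper name `wants_gl_sharing` is hypothetical; the typedef and the numeric values are restated from the Khronos headers (CL/cl.h, CL/cl_gl.h) as an assumption so the sketch builds without them, and a non-zero CL_GL_CONTEXT_KHR value is taken to mean "share with this GL context".

```c
#include <stdint.h>
#include <stddef.h>

/* Restated here so the sketch does not need the OpenCL headers. */
typedef intptr_t cl_context_properties;

#define CL_CONTEXT_PLATFORM 0x1084
#define CL_GL_CONTEXT_KHR   0x2008 /* GL context handle (e.g. GLXContext) */
#define CL_EGL_DISPLAY_KHR  0x2009 /* EGLDisplay */
#define CL_GLX_DISPLAY_KHR  0x200A /* X11 Display */

/*
 * Walk a zero-terminated list of (name, value) pairs and report
 * whether a GL context was supplied. Returns 1 for a shared context,
 * 0 for a compute-only one (including a NULL property list).
 */
static int wants_gl_sharing(const cl_context_properties *props)
{
    if (!props)
        return 0;
    for (size_t i = 0; props[i] != 0; i += 2) {
        if (props[i] == CL_GL_CONTEXT_KHR && props[i + 1] != 0)
            return 1;
    }
    return 0;
}
```

A list such as `{ CL_CONTEXT_PLATFORM, platform, CL_GL_CONTEXT_KHR, glctx, CL_GLX_DISPLAY_KHR, dpy, 0 }` would be classified as shared, while one containing only CL_CONTEXT_PLATFORM would be compute-only.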
15:09karolherbst: that will be fun to support in mesa though :D
15:09karolherbst: because atm we create a context per screen
15:10karolherbst: but it sounds like we want to share one between clover and st/mesa
15:10karolherbst: could be fun
15:10tango_: I'm not entirely sure I see where the problem is
15:11tango_: you would know which context to use from the clCreateContext call, right?
15:11tango_: (I mean, alternatively mesa could just not support the cl_khr_gl_sharing extension ;-))
15:11karolherbst: yeah. I don't think it's a huge problem, I am just wondering where we break stuff
15:12karolherbst: but one thing which is annoying is that clover uses the dynamic pipeloader and st/mesa the static one
15:12karolherbst: and I expect this to just fall apart
15:13jenatali: karolherbst: I doubt you'll want to use this approach, but my plan for when I get around to adding it for CLOn12 is to add an extension to GL to allow import/export of the underlying D3D objects, and use that from CL
15:13jenatali: Since they're from separate codebases completely
15:13jenatali: Just throwing it out in case it sparks ideas :)
15:13karolherbst: mhh, yeah...
15:14karolherbst: I think intel requires mesa sources to be available.. or at least beignet did so
15:17chills340: anyone here built mesa on the rpi4?
15:17chills340: i have a question
15:28FLHerne: chills340: Just ask the question :p
15:29chills340: -Dplatforms=x11 -Dvulkan-drivers=broadcom -Ddri-drivers= -Dgallium-drivers=v3d,kmsro,vc4,virgl is the string i use but i saw the vulkan driver is named v3dv so wondering if i should change broadcom or add v3dv to gallium-drivers
15:31kisak: in general, vulkan drivers are not gallium drivers (ignoring Zink)
15:32chills340: yea not sure about any of it just trying to get it right so i dont have to rebuild it over and over on this sdcard
15:33chills340: figured my best bet would be to ask the devs here what string is good for the pi4 since they are working on it
15:35nroberts: that looks like the right string, and it matches this blog post https://blogs.igalia.com/apinheiro/2020/06/v3dv-quick-guide-to-build-and-run-some-demos/ (the author of that is one of the devs)
15:37chills340: alright thx
15:40kisak: my line of thought would be to go take a peek at what raspbian is doing, but mesa 19.3.2 in http://archive.raspberrypi.org/debian/pool/main/m/mesa/ feels dreadfully old for anything v3dv
15:43chills340: i got 21 dev installed now on TwisterOS, it's based on raspbian. i was just curious if the methods changed recently since vulkan is now receiving so much more attention
15:47nroberts: if it builds then that is a pretty good sign that nothing has changed :)
15:50chills340: true but not always the case could have added something that i wasn't aware of. thanks again tho i think i'm good now
16:21danvet: tzimmermann, lol that timing with syzbot giving you the udl unplug splat right on the day it came up on dri-devel as an issue
19:12airlied: karolherbst: yay CTS basic read/write image tests passing
19:15karolherbst: airlied: also with my 1.2 MR? :D
19:15karolherbst: airlied: last time I tried I think some sampling modes were busted or something odd?
19:15airlied: karolherbst: nah this is just 1.1 stuff
19:15karolherbst: 1.2 doesn't really add anything besides array images :D but I assume this will also just work?
19:16karolherbst: no idea what you fixed though
19:16airlied: how much does blender need :-P
19:16airlied: karolherbst: this is on radeonsi, so just making it work at all
19:17karolherbst: airlied: :D
19:18karolherbst: the 1.2 stuff is also quite done. Just need to address that inline sampler stuff comment
19:18karolherbst: but I don't think there is a nice solution with the array image validation stuff
19:18karolherbst: it just sucks
19:18airlied:wonders what I need to use blender
19:18karolherbst: no clue
19:18karolherbst: how can I check if it works for me? :D
19:18karolherbst: I know that luxmark uses inline samplers
19:19airlied:also doesn't know how to use blender to trigger a CL kernel :-P
19:19karolherbst: probably need to compile it
19:19karolherbst: and enable CL
19:19karolherbst: but I think there was a CL option somewhere
19:19karolherbst: maybe check with PTS?
19:19karolherbst: there are some blender CL tests
19:19airlied: yeah you also have to get it to do something once it's running
19:19airlied: by clicking on things I've no idea about :-P
19:19karolherbst: but also no idea if you see anything and if the result is verified
19:20karolherbst: I just target luxmark for now as it does validate the outcome
19:20karolherbst: and once that works wait for bugs?
19:20airlied: meanwhile llvmpipe is on conversion 460
19:21karolherbst: btw.. relocating is just painfully tiring and time consuming :)
19:21karolherbst: airlied: nice
19:21jenatali: airlied: How long has llvmpipe been running?
19:22airlied: jenatali: about 2 days
19:22jenatali: Oh wow that's faster than expected
19:22airlied:isn't sure how many conversion runs in full
19:23airlied: feels about 100 tests per type,
19:23jenatali: ~900 something
19:24airlied: yeah seems about right, so 1/2 way there :-P
19:26airlied: karolherbst: yeah I've moved around the world twice now, not much fun!
19:27karolherbst: airlied: for me it's roughly 1000km, but after a certain threshold it doesn't matter anymore I guess
19:33Venemo: kusma: out of curiosity, why does zink have the bit that disables VK optimizations?
23:03jenatali: ajax: Doing a bit more of a deep dive into what you did with Penny/Copper, I'm curious why you chose to keep it as a DRI-like driver instead of completely side-by-side
23:06Plagman: Venemo: it's not the disable optimizations then swap in pipeline that has all optimizations later thing?
23:06Plagman: iirc some gallium drivers already do that
23:09airlied: Plagman: nope zink just does disable for some reason
23:09Plagman: huh weird
23:09Plagman: maybe one of these "XXX remove before ship" things :P
23:10zmike: it's just been like that since the original merge commit
23:10imirkin: presumably the downstream driver can optimize, no point in doing 2x the opts?
23:10jenatali: Aren't we talking about the flag that tells the downstream driver not to optimize?
23:10Plagman: it's setting the vulkan bit asking the backend driver to not optimize, i think is what people are saying
23:11Plagman: yea that
23:11imirkin: nir opts are perfect in every way, therefore no need for downstream driver to optimize? :)
23:11zmike: no, we do rely on the vk driver optimizing some stuff
23:12Plagman: that bit is a good way to get llvm to crash, from my experience
23:12imirkin: that last comment was ever so slightly tongue-in-cheek
23:12chrisf: Plagman: well, cts never passes it, right? ;)
23:13Plagman: probably should if it doesn't yet, at least as a spot check for some stuff
23:13Plagman: although now that i remember, it was attempts to make that bit do something with radv+llvm that crashed
23:14Plagman: so radv didn't end up doing much, or anything, with it at the time, since some llvm amdgpu opt passes were load-bearing
23:14jenatali: Reasons like that is why D3D hasn't added that bit yet :) we were close but couldn't really define what it meant and chickened out
23:15Plagman: iirc at least some drivers do the no-opt-then-swap dance by themselves behind the scenes
23:16airlied:throws an mr at it
23:16Plagman: at least in 9 and 11 land
23:17airlied: kusma: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7840
23:17jenatali: Yeah, 9on12 and 11on12 implementing that pattern is why we were talking about adding the bit
23:17Plagman: ah makes sense
23:18Plagman: thumbs up just on account of the brit spelling for optimize