IRC Logs of #dri-devel on irc.freenode.net for 2023-01-24

00:31 Lynne: airlied: could you add support for r10x6 and other single-plane rep formats in radv?
00:31 Lynne: it should be enough to just alias them to r16, because amd gpus don't leave junk in LSBs
00:33 Lynne: officially, r16 isn't a compatible rep for 10-bit planar formats
00:52 airlied: dj-death: okay I've got dg2 working locally, was just genxml change propagation and mocs, will cleanup and push soon
02:05 airlied: dj-death: updated the MR
02:05 airlied: dj-death: might just enable it by default, since it passes all the CTS tests now
02:07 airlied: Lynne: doesn't radv report R10X6 now?
02:13 Lynne: it does, actually, nevermind
02:14 robclark: jekstrand, danvet: adreno doesn't allocate mid batch (fwiw) and there is an igt test that should ensure that all the possible reclaim fail scenarios for submit ioctl are tested (well, at least all the non drm_syncobj ones, that still needs to be added to the igt test)... *but* demand-paging is a thing we might want to do some day and for any driver that means mid-batch allocations.. which might be fun
02:16 robclark: I guess it helps that the output of the binning pass isn't varyings but viz stream, and we can pre-calculate the worst case size (and if needed deal with case where viz stream bo is too small in cmdstream+hw)
02:17 Lynne: airlied: btw multi-slice h264 files still have minor corruption on radv
02:17 Lynne: I'll test on anv in a bit
02:17 Lynne: (use that m2ts sample I sent you)
02:20 airlied: Lynne: I wonder if it's something other than the fact that they are multistream
02:34 Lynne: output on anv: all gray, with a few blocks of green
02:35 airlied: Lynne: that doesn't seem like a win :-P
02:36 Lynne: at least it's not crashing the GPU, so it's probably not short mode's fault
02:42 Lynne: I should do some basic fuzzing, but I'd rather wait until I get an rdna3, gpu crashes are a bother on my desktop
02:53 airlied: Lynne: ah yeah anv I think needs a bit more for slices
03:34 alyssa: zmike: stride motion? brave.
03:34 zmike: just some ordinary code cardio
03:44 dcbaker: zmike: no worries, just wanted to double check I did the right thing
03:58 airlied: Lynne: did we ever work out if the h264 spec required the 3-byte header on each slice?
03:58 airlied: (the vulkan h264 spec)
04:07 Lynne: nope, but all hwaccels do, so probably safe to assume it's required
04:08 Lynne: only av1 does not require emulation prevention bytes
04:09 Lynne: (some fought for it due to "it's always been done this way", and "think about mpegts!", but somehow saner thoughts prevailed)
04:10 airlied: fighting the anv multislice a bit here but not sure if it's this or not
04:14 Lynne: stripping the bytes for all but the first slice doesn't fix it
04:16 airlied: yeah I also hacked it off on both sides but still get garbage
04:17 airlied: Lynne: oh I have the code to do multislice written, but it's not working :-\
04:19 airlied: Lynne: one minor bug in ffmpeg if you don't add the startcode it is missing the offset in the memcpy
04:19 airlied: ff_vk_decode_add_slice
04:21 airlied: also a couple of lines below it adds startcode when it shouldn't
04:21 airlied: not that it matters, still get the same busted rendering
04:22 airlied: oh no that bit is fine
04:23 Lynne: never liked that code
04:24 airlied: you could move the second memcpy outside the if statement
04:24 airlied: and drop the else
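[editor's note] The restructuring suggested above (one copy of the slice data outside the if, no else branch) can be sketched as follows. This is a minimal illustration, not the FFmpeg code: append_slice and its parameters are hypothetical names, and the 3-byte 00 00 01 start code is the Annex B prefix discussed earlier in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not ff_vk_decode_add_slice: appends one slice to a
 * bitstream buffer that already holds `used` bytes and returns the new size. */
size_t append_slice(uint8_t *buf, size_t cap, size_t used,
                    const uint8_t *data, size_t size, int add_startcode)
{
    static const uint8_t startcode[3] = { 0x00, 0x00, 0x01 };
    size_t prefix = add_startcode ? sizeof(startcode) : 0;

    assert(used + prefix + size <= cap);

    /* Only the optional prefix is conditional. */
    if (add_startcode)
        memcpy(buf + used, startcode, sizeof(startcode));

    /* One copy for both cases, always at the current write offset, so the
     * no-startcode path cannot forget the offset. */
    memcpy(buf + used + prefix, data, size);

    return used + prefix + size;
}
```

Calling it twice in a row, first with add_startcode set and then without, places the second slice directly after the first instead of overwriting the start of the buffer.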
04:33 airlied: Lynne: ah this video hit a scaling matrix path I haven't worked out
04:36 Lynne: oh, that explains the quantization artifacts on radv
04:36 Lynne: I should add a host-mapped path
04:36 Lynne: would save a pointless memcpy 'upload' on intel
04:42 airlied: Lynne: so does the pps scaling matrix override the sps one and that overrides the default?
04:42 airlied: Lynne: also setting the use default might have some value to save the vulkan driver uploading it
04:44 Lynne: the pps should override the sps matrix
04:46 Lynne: getting the use_default flag is a giant pain, so it's probably cheaper to just always upload the few hundred bytes at most vs parsing it
04:56 airlied: Lynne: okay, new anv branch pushed should do the multi slice and matrix
04:56 airlied: let me go look at radv
05:01 airlied: Lynne: so the spec defines 8x8 scaling matrix at 6 x 64, but the hw seems to only have 2 x 64
05:01 airlied: and I have to copy the 0 and 3 lists to the hw
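[editor's note] A minimal sketch of the repacking described above: six 8x8 lists of 64 entries on the API side, two on the hardware side, filled from lists 0 and 3. The function name and the flat array layout are assumptions for illustration; the log does not say which intra/inter or plane each index corresponds to, so the sketch does not either.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SPEC_8X8_LISTS 6   /* lists on the API side, per the log */
#define HW_8X8_LISTS   2   /* lists the hardware takes, per the log */
#define LIST_8X8_SIZE  64  /* entries per 8x8 list */

/* Hypothetical helper: `spec` holds SPEC_8X8_LISTS * 64 bytes, `hw` receives
 * HW_8X8_LISTS * 64 bytes taken from source lists 0 and 3. */
void pack_8x8_lists(uint8_t *hw, const uint8_t *spec)
{
    static const int src_index[HW_8X8_LISTS] = { 0, 3 };

    for (int i = 0; i < HW_8X8_LISTS; i++) {
        assert(src_index[i] < SPEC_8X8_LISTS);
        memcpy(hw + i * LIST_8X8_SIZE,
               spec + src_index[i] * LIST_8X8_SIZE,
               LIST_8X8_SIZE);
    }
}
```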
05:07 airlied: Lynne: radv should be fixed now as well
05:10 Lynne: fast
05:13 airlied: Lynne: btw is https://paste.centos.org/view/f226e3f1 enough to make 422 work?
05:14 airlied:will go make some 422 content now
05:16 Lynne: radv works, anv looks better, but broken
05:16 Lynne: somehow the hue is shifted and red->blue
05:16 Lynne: *green
05:19 alyssa: jekstrand: danvet: Mali absolutely has to allocate mid batch for the tiler hea
05:19 alyssa: heap
05:19 alyssa: (Valhall allocates varyings dynamically out of the tiler heap, fwiw)
05:19 airlied: Lynne: oops I did notice that and messed up my last test of it
05:20 alyssa: this works by reserving a large heap but not committing memory
05:20 airlied: Lynne: so I'm getting a profileIdc of 122 which isn't in the vulkan enum
05:20 alyssa: the kernel pagefault handler will then commit 2MB chunks at a time as needed by the hw
05:20 alyssa: "growable" memory
05:21 Lynne: airlied: 122 is profile_high_422
05:21 airlied: ah, probably need to get that defined by the std header
05:22 Lynne: also, the radv 422 patch is fine, but you should probably only advertise 422 if the input is 422, and only 422, unless the driver and/or hardware can convert 422 to 420 and vice-versa
05:23 airlied: Lynne: what format= in the command line do I need to test 422 content?
05:23 Lynne: just disregard the header profile values, profileIdc is a bitstream value, the spec shouldn't define these, hevc come up with a new profile every few months, no way they can keep up
05:25 Lynne: err, nv16 for 8-bit
06:07 airlied:can't spot why 422 isn't working on radv for me, will go fix anv scaling first :-P
06:09 Lynne: I think because it may want single-plane 422...
06:11 Lynne: maybe, for some reason vadumpcaps doesn't list "pixel_formats" for 422
06:13 Lynne: it does list a 422 surface format, as YUY2, which is ffmpegese for YUYV422 (format=yuyv422), which is vulkanese for VK_FORMAT_G8B8G8R8_422_UNORM
06:19 airlied: Lynne: okay fixed anv I think
06:22 Lynne: btw is srcBufferOffset wired up on the decode-side?
06:23 airlied: should be
06:28 airlied: Lynne: the amd hw should accept a 2-plane 422, at least the hw surface programming shouldn't be same as 2-plane 420 I think
06:31 airlied: but maybe it just doesn't support it
06:32 Lynne: why does vaapi signal it? I did think it was strange, but 422 is widely used in broadcasting
06:33 Lynne: anv is still not fixed, I think
06:33 airlied: vaapi driver seems to have it hardcoded
06:34 airlied: Lynne: weird, I'm seeing the right colors at least now
06:41 Lynne: I can't pull from your repo?
06:41 Lynne: I get 7e80807a070c as the last commit, not 7dcf00ba
06:42 Lynne: did you change the branch or something?
06:42 airlied: anv-vulkan-video-decode, I also pushed the prelim one
06:42 airlied: anv-vulkan-video-prelim-decode just pushed that as well
06:46 Lynne: yup, works now
06:52 Lynne: radv didn't like host mapped at all
06:52 Lynne: though I remember it used to work...
06:54 Lynne: pushed to my repo, going to test on anv
07:06 Lynne: broken there too, I'm not sure srcBufferOffset is respected
07:06 airlied: Lynne: aligned correctly?
07:06 airlied: both drivers add it in at the right place
07:09 Lynne: yeah, I call GetMemoryHostPointerPropertiesEXT to check, and buffer creation doesn't fail either
07:09 Lynne: (anv is nasty, GetMemoryHostPointerPropertiesEXT passes but buffer creation doesn't if the ptr is unaligned!)
07:09 airlied: but you respect VkVideoCapabilitiesKHR::minBitstreamBufferOffsetAlignment
07:09 airlied: ?
07:10 airlied: ah the host ptr one might be closer
07:13 airlied: Lynne: I'm seeing values not aligned to that
07:14 airlied: though I'm not sure at least the radv value for it is correct, the anv one could be reduced with some work
07:15 Lynne: fixed
07:17 Lynne: still doesn't work on anv
07:19 airlied: Lynne: still doesn't seem to take minBitstreamBufferOffsetAlignment into account (not minBitstreamBufferSizeAlignment)
07:20 Lynne: oh, forgot about that one
07:22 Lynne: still, minImportedHostPointerAlignment is 4096, it's probably way above any other alignment
07:24 airlied: yeah but the offset isn't aligned correctly then
07:24 airlied: you get the ptr aligned to 4096, but you can end up with an offset of 16
07:24 airlied:has no idea what the actual reqs on amd hw are here
07:25 airlied: the register is written a 64-bit addr, but no info on alignment for it
07:30 airlied: not sure how tricky it would be to make that all work, at least on anv it'll need some hacks to get it down to byte
07:35 Lynne: as long as the first slice doesn't have to be at offset 0, I can align to both
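[editor's note] The two alignment constraints in this exchange can be sketched in isolation. Importing a host pointer means rounding the address down to the import alignment (4096 in the log), which leaves a residual offset into the buffer; that offset must separately satisfy the bitstream offset alignment. The helper names and the struct are hypothetical, and both alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds v down to a power-of-two alignment. */
uint64_t align_down(uint64_t v, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return v & ~(a - 1);
}

/* Hypothetical result type: aligned base to import, plus the leftover offset
 * at which the bitstream data actually starts. */
struct host_import {
    uint64_t base;
    uint64_t offset;
};

struct host_import split_host_ptr(uint64_t addr, uint64_t import_align)
{
    struct host_import r;
    r.base = align_down(addr, import_align);
    r.offset = addr - r.base;
    return r;
}

/* Returns 1 if the leftover offset also meets the bitstream offset alignment. */
int offset_is_aligned(uint64_t offset, uint64_t offset_align)
{
    assert(offset_align != 0 && (offset_align & (offset_align - 1)) == 0);
    return (offset & (offset_align - 1)) == 0;
}
```

With an address that is only 16-byte aligned, the base comes out 4096-aligned but the offset is 16, which fails any offset alignment larger than 16. That is the case described above where the pointer is fine and the offset is not.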
07:45 Lynne: for some reason, I can't do that?
07:45 Lynne: doesn't work on radv
07:45 Lynne: I'm just setting the first slice at an offset
07:48 Lynne: it works if the offset for the first slice is below or equal to 32
07:55 airlied: Lynne: I suspect 32 is the radv min alignment there
07:57 Lynne: not max? I haven't found a value which works above 32
08:00 airlied: Lynne: I see frame corruption on some frames that are getting 16 aligned vals for buffer offset
08:01 airlied: you kinda need an aligned malloc
08:06 airlied: Lynne: not really seeing how to make that work, realloc and alignment are kinda not friends
08:17 Lynne: we align our allocs to the CPU simd size, which is 32 bytes in our case
08:18 Lynne: but regardless, disable host mapping and increment slices_size after allocating it and you'll see corruption
08:20 Lynne: pushed to my repo
08:20 Lynne: just change vkpic->slices_size += 0; to 64 to see corruption
08:22 Lynne: for normal frames, we host-map the pointers to vkbuffers, which lack an offset alignment requirement, so we're free to
08:23 Lynne: but for bitstream buffers, I need to be able to offset the first slice
08:23 Lynne: to satisfy the secondary offset requirement
08:44 airlied: well vk buffers have an offset alignment requirement if you suballocate them from a memory allocation
09:04 Lynne: it's memory importing in this case
09:05 Lynne: the black magic method we're using by specifying memory which may be or may not even be part of our address space and offsetting it has worked 100% of the time so far
09:49 vliaskov: Does this look sane for fb damage clips on vkms? https://github.com/vliaskov/linux/commit/e088df37c73eb6ba5b664c7af2def517fe13d2f0 kms_atomic@atomic_plane_damage passes. I am not sure the destination clipped area is correctly calculated
09:49 jani: tzimmermann: are you going to send another (final?) drm-misc-next pull request this week?
09:52 jani: tzimmermann: I'd really like to get b494d6283deb ("drm/edid: remove redundant _drm_connector_update_edid_property()") into drm-next, so I could backmerge, apply a few dependent patches on drm-intel-next, and send the final drm-intel-next pull request for this cycle
10:36 danvet: vliaskov, strictly speaking all you have to do is wire the property up, but do nothing
10:37 danvet: because the damage rect stuff is strictly an optimisation, userspace still must render the full fb
10:37 danvet: so vkms can continue to "scan out" the entire thing
10:41 danvet: vliaskov, so the fb_blit you're doing just copies part of the fb over itself, so that's not optimizing anything and looks a bit strange (or I'm misreading)
10:41 danvet: if you want to optimize, you would need to change the blend() function
10:41 danvet: but that's a pretty substantial redesign, because we'd need to fix it to be more incremental
10:42 danvet: what could work is 1. refactor blend to only render a sub-rectangle of the crtc
10:42 danvet: 2. compute the overall crtc damage (currently no helper for those I think, but manual upload drivers need that so maybe good to share across drivers)
10:42 danvet: 3. then switch blend() to only recompute the damaged (in crtc coordinates) area
10:43 danvet: we must still compute the crc32 over the entire area
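[editor's note] Step 2 above (computing the overall crtc damage) can be sketched as a bounding-box union of the damage clips, clamped to the crtc. This is an illustration with hypothetical names, not an existing DRM helper; struct rect uses exclusive x2/y2, and treating "no clips" as full damage is an assumption of the sketch.

```c
#include <assert.h>

/* Rectangle with exclusive lower-right corner. */
struct rect {
    int x1, y1, x2, y2;
};

/* Hypothetical helper: merges `n` damage clips (already in crtc coordinates)
 * into one bounding rectangle, clamped to the crtc area. */
struct rect damage_union(const struct rect *clips, int n, struct rect crtc)
{
    struct rect out;

    assert(n >= 0);
    if (n == 0)
        return crtc; /* assumption: no clips means the whole crtc is damaged */

    out = clips[0];
    for (int i = 1; i < n; i++) {
        if (clips[i].x1 < out.x1) out.x1 = clips[i].x1;
        if (clips[i].y1 < out.y1) out.y1 = clips[i].y1;
        if (clips[i].x2 > out.x2) out.x2 = clips[i].x2;
        if (clips[i].y2 > out.y2) out.y2 = clips[i].y2;
    }

    /* Clamp so a partial blend never touches pixels outside the crtc. */
    if (out.x1 < crtc.x1) out.x1 = crtc.x1;
    if (out.y1 < crtc.y1) out.y1 = crtc.y1;
    if (out.x2 > crtc.x2) out.x2 = crtc.x2;
    if (out.y2 > crtc.y2) out.y2 = crtc.y2;
    return out;
}
```

A partial blend would then recompute only the returned rectangle, while the crc is still computed over the full crtc area as noted above. A single bounding box is the simplest choice; it over-covers when clips are far apart.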
10:44 danvet: oh I just noticed: another optimization would be to make the crc computation optional
10:44 danvet: and only do it when we need it
10:44 danvet: that should help with using vkms for compositor testing ...
10:46 danvet: vliaskov, I thought there was some discussion about crtc damage rects in the past, but I can't find it anymore
10:46 danvet: oh also we could eventually extend this to handle multiple crtc damage rects
10:47 danvet: e.g. if you have a moving cursor in one area and a blinking cursor of an input field somewhere else
10:47 danvet: i915 psr code does have some of the crtc damage calcs already iirc
10:50 javierm: danvet: I don't think that copies part of the fb over itself but parts of the shadow buffer ?
10:51 danvet: javierm, yeah, but neither did the previous code
10:51 javierm: danvet: so I think that changing to iterate over damage clips rather than copy the full rect is still an optimization
10:51 danvet: so we really don't need a copy there, not even a partial one
10:51 danvet: the vkms "copy" for "scan out" happens in blend()
10:51 javierm: danvet: ah, I see
10:51 vliaskov: thanks danvet. So is anything accomplished with just enabling the property for vkms?
10:51 danvet: this just copies the structs around so we have all the required information
10:52 danvet: vliaskov, you can run tests
10:52 danvet: like compositors
10:52 danvet: so I think yes
10:52 danvet: also we could use vkms to test the damage clip helpers a bit (although those should have full coverage with unit tests)
10:52 danvet: especially with crtc coordinate damage the helpers become crucial
10:53 danvet: since anything that impacts the resulting blended colors must be a full damage clip for that fb
10:53 danvet: e.g. lut/color stuff, blending modes, position changes, ...
10:54 danvet: vliaskov, what I'm saying is that the minimal damage implementation is actually less than what you do
10:54 danvet: any driver could set this actually
10:55 danvet: but the full vkms damage clip support is quite a bit more (i.e. the steps I laid out)
10:55 danvet: it should also help with performance in compositor testing when you have blinking cursors and stuff and don't capture crc (if we make the crc stuff optional too)
11:02 vliaskov: Yes, I did see that the vkms atomic_plane_update updates frame metadata, and actually performs no copy. So it is odd to try to do a partial update (i saw similar code in mga, simpledrm or i915 iirc).
11:02 vliaskov: I also see vkms_compose_planes currently keeps allocating output and intermediate line buffers... with an actual optimization, I assume we would keep re-using buffers in a refactored/partial blend() ?
11:03 danvet: vliaskov, yeah we'd need to reuse the old one
11:03 danvet: at least if the resulting crtc damage rects is less than the overall crtc
11:03 danvet: otherwise we can just allocate everything anew
11:05 vliaskov: understood. I guess I am not clear what enabling the property as is actually means in terms of gains (if any gains for vkms). But that seems independent of the proposed optimizations. I 'll get more familiar with the generic drm/fb helpers.
11:07 danvet: vliaskov, for vkms itself not really anything
11:08 danvet: but vkms is for testing, so enabling it allows you to test that compositor code a bit
11:08 danvet: with pure sw
11:08 danvet: so there's a benefit in enabling that one-liner (maybe even behind a config option again, so that compositors can be tested both with and without damage support in the driver)
11:08 danvet: so maybe a bit more than a one-liner :-)
11:09 vliaskov: i see, thanks
11:14 vliaskov: btw, I assume the vkms TODO is up-to-date. There is an old item "Add background color KMS property[Good to get started]." but with no userspace using that prop, not sure if it still makes sense (patches were rejected twice in the past).
11:14 vliaskov: There was also an interesting vkms configfs attempt in July/August, but I am not sure when/if a v2 is coming out.
11:22 tzimmermann: jani, i would send that PR on early Thursday CET. would this work?
13:36 jani: tzimmermann: yeah, if airlied also merges it in a timely fashion *wink wink*
13:37 tzimmermann: jani, if it's just one patch, maybe cherry-pick?
13:42 jani: tzimmermann: no, it's a bunch, I just pointed at the most recent one
13:42 tzimmermann: danvet, ^ can we have two drm-misc-next PRs this week?
13:43 tzimmermann: i'd prepare one today and another one on thursday
13:44 jani: oh, that would work nicely too
13:44 danvet: tzimmermann, yeah can do and also fast-track if needed
13:44 danvet: just ping me or something like that
13:44 tzimmermann: danvet, thank you
13:46 jani: +1
13:56 tzimmermann: danvet, last week's PR for drm-misc-next has not been merged yet
13:57 tzimmermann: can you take care of it?
14:00 danvet: in some meetings, but will do in a bit
14:01 tzimmermann: thank you
14:13 X512: Is it possible to upstream alternative non-DRM interface for Vulkan drivers (RADV, ANV)?
14:24 karolherbst: no
14:24 karolherbst: two reasons: 1. we probably won't maintain them 2. whoever wants them to be added has to give a _hard_ promise to maintain them for at least 10 years
14:25 karolherbst: also.. having alternative interfaces for the same thing is also very frowned upon anyway
14:26 karolherbst: however, if you can show in a downstream project that moving to a different kind of API is totally worth it, be our guest, but don't complain if the work is wasted
14:27 karolherbst: X512: anyway... _why_ would you see alternative interface added in the first place?
14:29 X512: Because Haiku GPU drivers are userland based and FD can't be used for accessing device. Some data structure on client side is needed.
14:29 karolherbst: ohh you meant in the userspace driver
14:30 karolherbst: ehh I think we already support haiku for some stuff in mesa, no?
14:30 X512: There are GPU server that directly control hardware from userspace and client processes access server via IPC.
14:30 karolherbst: right...
14:30 X512: Only software rendering is currently upstreamed for Haiku.
14:31 karolherbst: I mean.. I'm convinced that the 100% userland approach is neither sustainable nor secure, but I think as long as it's not a pita to maintain it (and the CI is in place) it shouldn't be an issue
14:32 karolherbst: it's just easier to just support whatever linux provides for everybody
14:32 X512: I don't understand how userland approach can be less secure than kernel.
14:33 X512: Every small memory buffer overrun etc in kernel module can be used to hijack kernel and gain root access.
14:34 karolherbst: sure, but the userland approach means, that GPU server has access to all kernel memory
14:35 karolherbst: so you can own the system this way
14:35 karolherbst: well.. maybe haiku has a better approach on how to configure DMA
14:35 karolherbst: dunno, but doing this in userspace makes the entire isolation thing pointless
14:36 X512: Kernel GPU module also has access to whole physical memory. A userland GPU server has a much lower probability of having its vulnerabilities exploited.
14:36 karolherbst: the point is: userspace doesn't
14:37 karolherbst: with userspace GPU drivers, userspace has that access
14:37 karolherbst: and not even as an exploit, just by definition
14:37 X512: And? The question is probability of hacking, not possibility.
14:38 karolherbst: which is the same
14:38 X512: GPU server is a privileged process with root access. No unauthorized code can be loaded into it. So I see no problem.
14:38 karolherbst: one bug in the kernel == owned, one bug in the userspace GPU driver == owned
14:39 karolherbst: it's not about loading anything into it
14:39 karolherbst: GPU drivers are complex
14:39 karolherbst: you have to constantly configure DMA stuff
14:39 karolherbst: there can be a bug making the driver access all kernel memory and do random things
14:40 X512: Kernel code has a much larger attack surface. In addition to GPU DMA any kernel structures can be also attacked.
14:40 karolherbst: anyway... having userspace access to all kernel memory by design sounds like a bad thing, tbh
14:41 karolherbst: in linux that would immediately be a CVE
14:41 X512: One big problem with Linux DRM kernel drivers is that they are not designed to be portable and are hard-coded to the unstable internal Linux API.
14:42 karolherbst: sure
14:42 karolherbst: that's how linux drivers are developed
14:42 X512: Bringing large Linux compat layer to Haiku kernel is a bad idea I think.
14:42 karolherbst: but quite frankly: why should we adjust for 0.1% of users if that means taking 10+ other kernels into account?
14:43 karolherbst: that's a lot of work with no benefit
14:43 karolherbst: right...
14:43 X512: Vendor lock-in is a bad idea in general for healthy open source community.
14:43 karolherbst: but if one or two starts to reimplement GPU drivers for all the hw we support it's a doomed project from the start anyway
14:44 karolherbst: heck... even maintaining a compat layer is too much work for BSD folks generally
14:45 X512: For example NVidia has an OS abstraction layer in its kernel drivers so it can be easily ported.
14:45 karolherbst: and maintaining those layers is probably 5% the work you need for a full driver
14:45 karolherbst: right
14:45 karolherbst: but the thing is: they get money out of it
14:45 karolherbst: and have enough to actually let people work on this
14:45 turol: gitlab is refusing to let me log in on a browser profile which had previously accessed the old bugzilla
14:45 X512: I made functional driver for Radeon SI GPUs that works with Mesa RADV driver. So it is not impossible task.
14:45 turol: what do i need to clear to make it work?
14:46 karolherbst: X512: I'm not saying it's impossible, I'm just saying it's a lot of work doing it properly
14:46 karolherbst: how much of the available hardware do you support?
14:46 karolherbst: do you also support the GL driver?
14:46 karolherbst: how many bugs does it have?
14:46 karolherbst: does it support all the display stuff correctly?
14:46 daniels: turol: probably need to unblock reCAPTCHA
14:47 karolherbst: did you already implement DP-MST?
14:47 X512: Linux is not an option for me so I am ready to spend resources to bringing GPU acceleration stack for Haiku.
14:47 karolherbst: yeah, fair
14:47 karolherbst: but don't say we didn't warn you
14:48 karolherbst: I mean.. it is a cool project and everything, but it's also a lot of work
14:48 X512: Supporting all hardware is not needed and it is very hard task. Support can be limited by some subset of popular GPUs.
14:49 X512: On Haiku following interface is currently used instead of DRM device FD:
14:50 turol: daniels: i've already allowed javascript on freedesktop.org and there's no other sites loaded into the login form
14:50 X512: typedef struct accelerant_base { struct accelerant_base_vtable *vt; } accelerant_base; typedef struct accelerant_base_vtable { int32 (*AcquireReference)(accelerant_base *acc); int32 (*ReleaseReference)(accelerant_base *acc); void *(*QueryInterface)(accelerant_base *acc, const char *iface, uint32 version); } accelerant_base_vtable;
14:50 karolherbst: X512: yeah.. no
14:50 karolherbst: we won't support something if the goal isn't to do it properly
14:50 karolherbst: yes, supporting all the hardware is needed
14:51 X512: - int err = drmSyncobjCreate(device->drm_fd, flags, &sobj->syncobj);
14:51 X512: + int err = device->acc_drm->vt->DrmSyncobjCreate(device->acc_drm, flags, &sobj->syncobj);
14:52 X512: Accelerant object holds GPU server IPC connection context and some private state.
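[editor's note] To make the interface quoted above concrete, here is a minimal sketch of an object implementing that vtable. The accelerant_base and accelerant_base_vtable definitions follow the quoted ones, with int32/uint32 spelled as the stdint types. Everything prefixed demo_ is hypothetical, as are the "demo/base" interface name and the assumption that Acquire/Release return the new reference count; the log does not specify these semantics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct accelerant_base accelerant_base;

typedef struct accelerant_base_vtable {
    int32_t (*AcquireReference)(accelerant_base *acc);
    int32_t (*ReleaseReference)(accelerant_base *acc);
    void *(*QueryInterface)(accelerant_base *acc, const char *iface,
                            uint32_t version);
} accelerant_base_vtable;

struct accelerant_base {
    accelerant_base_vtable *vt;
};

/* Hypothetical implementation: the base must be the first member so the
 * accelerant_base pointer can be cast back to the concrete type. */
typedef struct demo_accelerant {
    accelerant_base base;
    int32_t refs;
} demo_accelerant;

static int32_t demo_acquire(accelerant_base *acc)
{
    return ++((demo_accelerant *)acc)->refs;
}

static int32_t demo_release(accelerant_base *acc)
{
    demo_accelerant *d = (demo_accelerant *)acc;
    assert(d->refs > 0);
    return --d->refs;
}

/* Returns a new reference to the requested interface, or NULL if unknown. */
static void *demo_query(accelerant_base *acc, const char *iface,
                        uint32_t version)
{
    assert(iface != NULL);
    if (strcmp(iface, "demo/base") == 0 && version == 1) {
        demo_acquire(acc);
        return acc;
    }
    return NULL;
}

static accelerant_base_vtable demo_vtable = {
    demo_acquire, demo_release, demo_query
};

void demo_accelerant_init(demo_accelerant *d)
{
    d->base.vt = &demo_vtable;
    d->refs = 1;
}
```

Client code only ever holds an accelerant_base pointer and calls through vt, which is the pattern the DrmSyncobjCreate diff above relies on instead of passing a DRM file descriptor.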
14:53 karolherbst: sure
14:53 karolherbst: but why should _we_ support/maintain it?
14:53 karolherbst: unless there is an active and big community around whatever is done for haiku, it all means, that sooner or later we will be the ones having to support/maintain it
14:53 daniels: (and, realistically, it gets broken all the time)
14:54 karolherbst: on average every 1 or 2 months a new person comes in and asks for that kind of stuff, if we had said yes to everybody, we'd have ~25 of those interfaces to maintain
14:54 karolherbst: because big surprise, those people move on and give up a few months later
14:55 daniels: it's not 'just' a five-entry vtable either; it means that every single platform is held to that interface as the lowest common denominator of what they can support, so if you suddenly go on holiday or lose interest, then no-one else can move forward until you've found some time to catch up
14:55 X512: 25 interfaces is good idea. Just make proper winsys abstraction like it is done in RADV or libdrm_amdgpu.
14:56 X512: And yet another OS driver interface will be an implementation of winsys function table.
14:56 daniels: have you looked at how rapidly libdrm-amdgpu changes?
14:56 X512: Not so rapid really.
14:56 karolherbst: uhhh
14:57 karolherbst: you underestimate how much work that stuff is
14:57 karolherbst: I can guarantee you, that the driver you wrote is less than 1% of what you actually need for a working driver
14:57 karolherbst: at least for AMD hardware
14:58 karolherbst: if the goal isn't to support all the hw from at least 10 years ago, it's a doomed project
14:58 X512: I already adapted libdrm_amdgpu to Haiku new accelerant interface. It do not need libdrm.so at all now. Thanks to amdgpu_device_handle opaque pointer instead of direct FD use.
15:00 X512: Why? It can be a user base with small set of working GPUs.
15:00 X512: Why libdrm do not use opaque device pointer...
15:01 karolherbst: X512: because if you only support 1 or 2 GPUs it's irrelevant for us
15:01 karolherbst: then you can maintain it downstream
15:01 karolherbst: if you only support your hardware, there is no reason for us to accept your code
15:07 jenatali: I remember jekstrand talking about considering WDDM support for at least one of the drivers he was working on. Would be interesting to see if there's an abstraction that could support that, DRM, and Haiku
15:08 X512: That will be nice.
15:08 karolherbst: jenatali: right.. but WDDM is something we'd know wouldn't just vanish
15:08 X512: In theory Mesa Vulkan drivers can be used with Windows kernel GPU drivers.
15:08 karolherbst: the code isn't the issue here
15:09 karolherbst: so there are two paths
15:09 X512: Haiku have quite long history. It will not vanish in near future.
15:10 karolherbst: 1. build a strong and sustainable community first before even considering upstreaming, once it took off and people actually use it and random contributors make sure it works on all kind of hardware, then you can come back and ask us
15:10 karolherbst: or
15:10 karolherbst: 2. we accept it, and pull it out in half a year, because you lost interest/moved on and we get the hate, because we are the baddies here
15:10 karolherbst: we kind of prefer 1.
15:10 X512: Haiku origin BeOS is older than Linux.
15:10 karolherbst: that's not the point
15:10 karolherbst: what makes _your_ work not vanishing
15:11 karolherbst: what if there are two other haiku devs having the same idea but different APIs
15:11 karolherbst: how should we know what's the "correct/main" one
15:12 X512: Removing hardcoding from DRM FD would be beneficial in general, not only for Haiku. Mentioned WDDM is a good example.
15:12 karolherbst: or maybe there will be 5 different APIs for different hardware, because it's just random people doing stuff
15:12 karolherbst: again
15:12 karolherbst: this isn't about code
15:12 karolherbst: I honestly don't care what potential or "maybe" benefits it has to the code
15:13 karolherbst: what matter is the sustainability of this project or if we just end up removing it at the end of this year
15:13 karolherbst: or rather _why_ does this have to go upstream and can't be maintained downstream for now
15:14 X512: I do not plan to give up this work for this year :)
15:15 karolherbst: yeah.. everybody said that before giving up a few months later
15:16 karolherbst: also
15:16 karolherbst: it doesn't even matter if you keep working on it, what matters is, if you are able to build a community around it
15:16 karolherbst: or if this will be the official haiku way of doing GPU drivers
15:17 karolherbst: or could there be like 5 alternative projects doing all the same but with different APIs
15:17 karolherbst: stuff like that
15:17 karolherbst: what if next week another person chimes in and said "hey, can I upstream my non DRM API for radeon GPUs on haiku?"
15:22 X512: First news about some kind of working RADV on Haiku is Nov 2021: https://discuss.haiku-os.org/t/vulkan-lavapipe-software-rendering-is-working-on-haiku/11363/157
15:24 X512: The problem is not potential multiple non-DRM APIs. The problem is a lack of device opaque pointer support.
15:25 X512: It will be easier to manage out of tree code if it will not patch a lot of places.
15:26 karolherbst: I mean.. sure I get why you want to upstream it
15:26 karolherbst: and ultimately it will be up to the radv/radeonsi devs to either accept it or not
15:27 karolherbst: and yeah, maybe you are the one of the few taking this very serious. If it already is shipped in official haiku images, that would be a good indicator of "yep, that project is legit"
15:28 karolherbst: or well.. included in official repositories or however that stuff works in haiku
15:30 X512: What about upstreaming software Haiku support (EGL support, Zink, lavapipe)?
15:31 danvet: maybe I missed something, but my take is ... why can't that gpu process emulate the linux ioctl interface over an fd?
15:31 X512: It is planned to switch to glvnd and abandon Mesa Haiku libGL implementation and use EGL glvnd driver instead.
15:31 danvet: linux has cuse (chardev in userspace) for this
15:32 karolherbst: X512: we already have that kind of stuff, yeah. but it also doesn't seem to be a burden so far
15:32 karolherbst: but it's software rendering
15:32 karolherbst: that's the easy part
15:32 X512: Haiku doesn't. Haiku also have its own IPC kernel objects (port_id) that are not FD based.
15:34 danvet: tursulin, since the duplicated function is still there in the drm-tip merge I guess some backmerge would be good to nuke it for real
15:34 danvet: it's very confusing ...
15:36 tursulin: danvet, sorry does not ring a bell - what duplicated function?
15:49 tursulin: danvet, okay rodrigovivi got me up to speed on that one
15:58 jekstrand: We decided to have a WDDM2 fight without me? *sad face*
15:59 zmike: "decided" is a strong word to use there
15:59 karolherbst: jekstrand: you wanna fight?
16:00 karolherbst: but yes, from the last time we decided to exclude jekstrand from any fights, because of how OP he is
16:02 gawin: do wide points and lines have formal definition? I'm reading specs and cannot spot anything so far. (or can I know from state when it's used?)
16:02 jekstrand: Nah, I don't wanna fight.
16:03 jekstrand: I'm just amused.
16:04 karolherbst: :P
16:04 ajax: gawin: glspec2.1.pdf section 3.4.2 "Other Line Segment Features"
16:04 ajax: for wide lines anyway
16:05 gawin: thanks
16:05 X512: Is it possible to contact amdgpu/RADV developers? #radeon-dev channel seems have too low activity.
16:05 ajax: often easier to find the definition for old gl features in old gl specs
16:05 emersion: X512: #radeon is the channel
16:05 karolherbst: open a MR 🙃
16:06 X512: Mistake, yes #radeon. Activity is low.
16:08 X512: I want to ask some questions about Radeon GPU operation. It is hard to understand some things from amdgpu kernel code and it seems there is no public documentation.
16:09 MrCooper: no better place for those questions than #radeon
16:10 ajax: but also, https://developer.amd.com/resources/developer-guides-manuals/ towards the bottom has some resources
16:12 ajax: many of them are for older gpus but much of it still applies
16:15 jenatali: jekstrand: I also wouldn't call that a WDDM2 fight. But I tried to summon you, you just didn't show up :P
16:15 zmike: you'd know if he did by the sound of an approaching keyboard
16:17 jekstrand: jenatali: hehe
16:17 jekstrand: zmike: Not nice, but totally fair. :P
16:18 X512: Is somebody planning to add WDDM2 winsys support?
16:19 jekstrand: We've had some Windows present for a while and we just landed more.
16:19 jekstrand: Or are you talking about the radv winsys abstraction?
16:24 X512: Yes. Make RADV work with Windows kernel drivers.
16:28 jekstrand: I did start typing on that ~1yr ago.
16:29 jekstrand: I stopped short of actually submitting GPU work.
16:35 danvet: tursulin, yeah sorry I split the reply across irc and m-l :-)
16:39 MrCooper: X512: Windows kernel drivers have no stable UAPI AFAIK
16:39 jenatali: That's right
16:40 jenatali: But also there's only ~7 entrypoints where there's driver-private data that can be affected. The rest is stable
16:40 tursulin: danvet, it's okay, backmerge done and fixup removed
16:40 danvet: tursulin, thx!
16:41 X512: Haiku also has no stable UAPI or syscalls. Applications must use the API provided by shared libraries.
16:42 X512: The GPU server is also planned to have no stable IPC protocol. The server-side driver and the client interface library always come as a pair.
16:45 karolherbst: well.. nobody will use the low level APIs directly anyway
16:48 MrCooper: X512: it means a RADV Windows winsys could break with any new AMD Windows driver release
16:49 airlied: danvet: doh, had all the next merges locally, just didn't push it out for some reason
16:50 danvet: well now it won't work so well :-)
16:50 danvet: airlied, I'm building the drm-misc-next one rn, so if you only have two I guess toss it?
16:50 danvet: I pushed the -gt-next one already
16:51 danvet: ok -misc-next too because the script just finished :-)
16:51 danvet: tzimmermann, ^^
16:51 jekstrand: MrCooper: Yes, there are ways around that but they require AMD's involvement.
16:52 airlied: danvet: cool, I'll rework whatever was left, i think it was just one amd one
16:52 X512: It is good that RADV has a winsys concept. The ANV driver just uses ioctl directly.
16:54 X512: RADV has 2 levels of abstraction: the RADV/Gallium winsys and libdrm_amdgpu.so. Honestly I don't understand why the second level of abstraction is needed. ANV has zero layers.
16:55 X512: For Haiku's needs, a slightly adapted libdrm_amdgpu.so is enough.
16:55 danvet: airlied, oh right agd5f does lowercase pull so I always miss them on first pass :-)
16:56 MrCooper: jekstrand: seeing as AMD's Linux teams so far are barely acknowledging RADV's existence, I wouldn't bet on the Windows teams caring :)
16:56 agd5f: danvet, would you prefer upper case?
16:56 danvet: agd5f, I think you're not the only one, so meh
16:57 danvet: it's more that I never know how case-insensitive search here in mutt works :-)
16:57 MrCooper: X512: libdrm_amdgpu is used by AMD's Linux UMDs as well
17:00 jekstrand: MrCooper: Well, yeah...
17:00 jekstrand: X512: RADV's winsys concept is very not great. I'm pretty sure bnieuwenhuizen wants to get rid of it. It also wasn't as helpful as you'd expect for my WDDM2 port.
17:01 jekstrand: Not that some abstraction isn't helpful. It was just the wrong abstraction.
17:01 airlied: the radv winsys abstraction was useful to plug in the null winsys later, but it might still have been a bad idea, and libdrm_amdgpu deps are still a bit of a mess
17:02 X512: Two new functions are added to libdrm_amdgpu.so for Haiku support: amdgpu_device_initialize_haiku, amdgpu_device_get_accelerant.
17:02 X512: The rest is fine.
17:02 X512: https://github.com/X547/libdrm2/blob/master/amdgpu/amdgpu.h#L531
17:02 X512: https://github.com/X547/libdrm2/blob/master/amdgpu/amdgpu.h#L568
17:02 anholt: tu_drm.c vs tu_kgsl.c feels like the best I've seen for porting so far. but also we break kgsl on a regular basis.
17:03 bnieuwenhuizen: jekstrand: I want to rework it but would like to keep it
17:03 bnieuwenhuizen: it is hampered by not having a second real backend though
17:04 X512: Haiku one is the candidate.
17:12 X512: The GitHub Mesa repo mirror seems to have stopped updating.
17:16 jekstrand: X512: Yeah, we're sunsetting the GitHub mirrors, probably. There was an e-mail about that. GitHub made a change which broke them.
17:18 jekstrand: bnieuwenhuizen, airlied: That's fair. ANV has anv_gem.c which was fun but ultimately a bad idea. I'm hoping it gets deleted as part of Xe.
17:18 X512: jekstrand: What should replace anv_gem.c?
17:19 jekstrand: José has been working on an abstraction to support both i915 and Xe.
17:19 jekstrand: It's still pretty FD-heavy but probably ok for now.
17:19 jekstrand: We need something a little thicker if we wanted to support WDDM2 there.
17:26 ajax: isn't there an mr for nvk to add a winsys for the nvidia open driver?
17:27 airlied: an MR exists, not sure the code in the MR does what it says
17:35 bnieuwenhuizen: jekstrand: I'm considering setting up the virtio native context stuff as a second winsys depending on how different it is
17:35 bnieuwenhuizen: but definitely want to move a lot of cmdbuffer stuff out
17:35 jekstrand: bnieuwenhuizen: Yeah, virtio makes sense
17:41 anholt: anyone want to ack https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17445 ?
18:43 X512: It seems I found a bug in libglvnd: https://github.com/X547/libglvnd/commit/ee58c6201b90e5e7994f889da76a71ac11e2700e.
19:40 karolherbst: dcbaker: if I bump the meson requirement for rusticl, is there anything specific I need to do? Like adding a notice to release notes or something?
19:52 dcbaker: karolherbst: I don't think so... I don't think any distros are building rusticl that aren't also using bleeding edge meson
19:53 karolherbst: okay, fair enough
20:03 tzimmermann: jani, danvet, i've sent out the drm-misc-next PR
20:03 danvet: airlied, ^^
20:04 airlied: okay I'll process that today, just have to reset my tree
20:16 DemiMarie: karolherbst: what about the IOMMU? Without an IOMMU your system is not secure, period.
20:18 DemiMarie: Unless you trust every DMA-capable device and the drivers for all of them.
20:18 DemiMarie: X512: another option might be to use a global hash table with FDs as keys.
20:27 airlied: Lynne: so you call av_fast_realloc which degrades eventually into realloc which does no alignment on Linux at least
22:04 jenatali: Heh, I should've skipped VK1.1 and gone straight for 1.2, looks like we basically already support it...
22:07 HdkR: Number goes up! \o/
22:08 jenatali: But 1.3 requires BDA and that's... going to take a while
22:14 Lynne: airlied: that's irrelevant, I just need to add a pre-padding to the first slice, then via both that and the bitstream offset, I can align to whatever
22:15 Lynne: as long as the bitstream offset align is less than or equal to the maximum padding I can have before the first slice
22:15 Lynne: rewriting all slice offsets is still cheaper than memcpy
22:19 airlied: Lynne: so the buffer offsets I was getting last I tested still had % 32 leftovers when I printed them out
22:28 Lynne: hmm, codec-dependent?
23:09 airlied: Lynne: was just radv with https://paste.centos.org/view/5d99d29e
23:34 Lynne: why do 64, 128, etc not work then? they're mod32