00:00 airlied: no Red Hat used to ship that crap as our hypervisor :-P
00:01 DemiMarie: crap?
00:02 airlied: yeah at least from our paid customers pov, I'm sure it has use cases, but kvm just made life a lot simpler
00:02 DemiMarie: not surprised
00:02 DemiMarie: My understanding is that most uses of Xen nowadays are for things KVM just can’t do, at least not without a ton of additional work.
00:03 airlied: or because someone still uses citrix?
00:03 DemiMarie: not my use-case
00:04 DemiMarie: In Qubes OS we use PCI passthrough so that the NICs and USB controllers are handled by less-privileged VMs, thus protecting the host.
00:05 karolherbst: but couldn't that be done with KVM as well?
00:11 Ermine: doesn't xen utilize kvm?
00:13 DemiMarie: karolherbst: not easily at least, because KVM doesn’t support one VM providing networking to another directly
00:14 airlied: nope
00:14 karolherbst: DemiMarie: that kinda sounds like a configuration problem?
00:14 DemiMarie: karolherbst: nope, it’s way more fundamental.
00:15 DemiMarie: In Xen, two VMs can communicate directly
00:15 karolherbst: mhh
00:15 DemiMarie: In KVM, you have to write a bunch of userspace stuff and try to make it work
00:16 DemiMarie: The closest you can get is virtio-vhost-user but that is still experimental and also assumes that the backend is trusted. In Qubes OS, the backend is not trusted.
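[To make the contrast concrete, here is a minimal sketch of the Xen side: a backend domain mapping a page that a frontend guest granted it, via libxengnttab. The domain id and grant reference values are assumptions (they would normally be negotiated over XenStore); the point is that the two guests share the page directly, with no trusted host process in the data path.]

```c
#include <sys/mman.h>
#include <xengnttab.h>

/* Hypothetical values, assumed known from XenStore negotiation. */
uint32_t frontend_domid = 5;
uint32_t gref = 42;

xengnttab_handle *xgt = xengnttab_open(NULL, 0);
/* Map the page the frontend granted us into this (backend) domain. */
void *shared = xengnttab_map_grant_ref(xgt, frontend_domid, gref,
                                       PROT_READ | PROT_WRITE);
```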
00:16 mareko: zmike: any further comments on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27736 ?
00:17 DemiMarie: Xen is also working on being able to deprivilege dom0 and on safety certification, so that one can have e.g. a safety-critical QNX guest alongside Linux guests doing infotainment stuff.
00:17 DemiMarie: Trying to do that under KVM would be a nightmare, if it can even be done at all.
00:18 DemiMarie: Also Xen’s security process is vastly better than Linux’s, and therefore KVM’s.
00:22 zzoon[m]: airlied: when you have time, could you review https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28063 ?
00:28 airlied: zzoon[m]: left a comment
00:29 zzoon[m]: thanks!
00:44 zmike: mareko: I've still had the tab open, but I've been too busy to get back to reviewing
00:44 zmike: I didn't intend for that to be a blocking comment if it's been stalling pepp's review
00:44 zmike: hoping to get to it in the next couple days if someone doesn't beat me to it
00:47 HdkR: `amdgpu 0005:03:00.0: [drm] *ERROR* Error waiting for INBOX0 HW Lock Ack` Anyone ever see this error spamming in dmesg, or should I try updating my kernel?
00:56 DemiMarie: airlied: do `FOLL_LONGTERM` pins of VRAM work?
01:03 airlied: DemiMarie: don't think it makes any sense
01:04 DemiMarie: airlied: Ouch. Why?
01:05 airlied: VRAM isn't like RAM
01:05 airlied: the PCIE bar can act like a remapping table on some gpus
01:05 airlied: probably easier to not expose mappable VRAM to guests
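[For context, a `FOLL_LONGTERM` pin is what subsystems like RDMA take on user memory. A rough sketch under the in-kernel GUP API follows, assuming a kernel new enough that pin_user_pages() takes four arguments. On a mapping backed purely by PCI BAR space (VM_IO | VM_PFNMAP) there are no struct pages to pin and GUP refuses outright; whether a given driver's VRAM mapping is pinnable depends on how it exposes the pages.]

```c
#include <linux/mm.h>

/* Hypothetical helper: take a long-term pin on a user buffer, RDMA-style. */
static long pin_user_buffer(unsigned long uaddr, unsigned long nr_pages,
                            struct page **pages)
{
	long pinned;

	mmap_read_lock(current->mm);
	pinned = pin_user_pages(uaddr, nr_pages,
	                        FOLL_WRITE | FOLL_LONGTERM, pages);
	mmap_read_unlock(current->mm);

	/* Fails (-EFAULT) on plain VM_IO/VM_PFNMAP mappings. */
	return pinned;
}
```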
01:06 DemiMarie: airlied: what will that break?
01:06 DemiMarie: Does it mean no Vulkan and no OpenGL4.6+?
01:07 DemiMarie: If so, it’s probably better to make whatever Xen-side changes are needed to make it work.
01:07 airlied: probably hurts performance on those
01:07 DemiMarie: how much?
01:07 airlied: but I think you can get away with just doing everything in RAM instead of VRAM where you want mappings
01:07 DemiMarie: what about vkMapMemory?
01:07 airlied: though not sure if some apps always assume you can map vram
01:08 airlied: they shouldn't but who knows what ppl do
01:08 DemiMarie: yeah
01:09 DemiMarie: I suspect it will really hurt compute, though.
01:09 DemiMarie: Compute seems to care much more about shared mappings
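[The mapping path in question looks like this in Vulkan terms. On a dGPU without ReBAR, a DEVICE_LOCAL | HOST_VISIBLE memory type corresponds to the small (typically 256 MB) BAR window; this is the allocation that stops working if mappable VRAM isn't exposed to guests. A minimal sketch, with the memory type index assumed to have been found via vkGetPhysicalDeviceMemoryProperties:]

```c
VkMemoryAllocateInfo info = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
    .allocationSize  = 65536,
    .memoryTypeIndex = device_local_host_visible_index, /* assumption */
};
VkDeviceMemory mem;
void *ptr;

vkAllocateMemory(device, &info, NULL, &mem);
/* The CPU pointer comes straight out of the BAR aperture (or system RAM
 * on an iGPU); this is the call that needs mappable VRAM to exist. */
vkMapMemory(device, mem, 0, VK_WHOLE_SIZE, 0, &ptr);
```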
01:14 jenatali: D3D has gotten by without mapping VRAM until *very* recently
01:15 DemiMarie: How recently?
01:15 jenatali: Today
01:15 DemiMarie: Literally March 11 2024?
01:16 jenatali: Yes
01:16 DemiMarie: What was announced?
01:16 DemiMarie: If I can avoid mapping VRAM that makes things vastly simpler.
01:17 jenatali: Demi: https://devblogs.microsoft.com/directx/agility-sdk-1-613-0/
01:18 DemiMarie: jenatali: how long before applications will require upload heaps?
01:18 * DemiMarie wishes that upload heaps had never been created
01:19 jenatali: Dunno. Probably a while
01:20 karolherbst: not that it changes much because drivers need it anyway
01:20 kode54: Have fun teaching everyone how to turn on rebar, if they can
01:21 DemiMarie: karolherbst: why is that?
01:21 karolherbst: because they wanna upload stuff to the GPU in a non painful way
01:22 karolherbst: like NVK relied on being able to map VRAM since forever
01:22 DemiMarie: what about Intel and AMD?
01:23 kode54: Arc already requires it outright or else it runs like crap
01:23 karolherbst: dunno, but probably the same
01:23 DemiMarie: kode54: crap?
01:23 kode54: Like worse performance than really old hardware
01:24 DemiMarie: because of faulting on each access?
01:24 DemiMarie: Anyway, so this will need to be dealt with in Xen or we will need to switch to KVM or not support GPU acceleration.
01:24 DemiMarie: The old version that runs the userspace driver on the host is not even being considered.
01:25 jenatali: Out of curiosity, is anybody working on an implementation of the AMD work graph extension (for RADV or otherwise)?
01:26 kode54: I don’t know if faulting is why the windows drivers are slow since I don’t have that information
01:26 kode54: I just know they outright tell people to have rebar support, and reviewers have found the card performance to be a stuttery mess without it
01:34 DemiMarie: How common is rebar support nowadays?
01:34 DemiMarie: airlied: on which GPUs can the BAR act as a translation table?
01:35 airlied: amd and nvidia do it
01:35 airlied: though I don't think amdgpu takes too much advantage of it
01:35 airlied: but I haven't looked in a while
01:38 agd5f: we've supported it in amdgpu for maybe 8-10 years? Christian added the BAR resizing to the kernel PCI code.
01:38 agd5f: As long as you have enough MMIO space
01:41 agd5f: Simplifies the kernel side since you never have to worry about faulting BOs in and out of the BAR window
01:46 airlied: agd5f: I don't think you do translations in the BAR though
01:46 airlied: like you resize it
01:46 airlied: but you map BAR address 0x100 to VRAM address 0x100 always
01:47 airlied: nvidia has a page table between VRAM and the BAR
01:47 airlied: so even with a 256MB BAR you can map any part of the 8GB VRAM at page granularity, it's just a pain because you have to do evictions
01:51 DemiMarie: agd5f: why does amdgpu use non-refcounted pages?
04:14 marex: airlied: I wouldn't mind that, but is there something convenient like drivers/gpu/drm/drm_gem_dma_helper.c with iommu support ? Or how do I even ... what do I even grep for ?
04:17 marex: iommu_iova_to_phys maybe ?
04:18 marex: nope
04:20 airlied: you enable the iommu and the dma layer should do the magic
04:20 marex: airlied: hmmm, so ... I would use the SHMEM allocator, then use -something- from probably include/linux/iommu.h to turn buffer I get from that shmem allocator into ... uh ... IOVA ? ... and pass that to the device ?
04:21 airlied: dma_map_* should do it
04:22 airlied: https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt
04:25 marex: airlied: lemme read that, thanks
04:25 airlied: now I'm not 100% sure how to make it a linear mapping from the device side, but there should be info somewhere
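[The dma_map_* route from the HOWTO is roughly the following sketch, not driver-specific. With IOMMU-backed DMA ops, the IOVA allocation and IOMMU programming happen behind this one call, and (as robmur01 notes later in the day) the layer tries, without guaranteeing, to hand the device a single linear range.]

```c
#include <linux/dma-mapping.h>

static int map_buffer_for_dma(struct device *dev, struct sg_table *sgt)
{
	int ret;

	/* With an IOMMU enabled, this allocates IOVA space and programs
	 * the IOMMU; sgt->nents afterwards is the number of DMA segments
	 * the device actually sees (ideally 1). */
	ret = dma_map_sgtable(dev, sgt, DMA_BIDIRECTIONAL, 0);
	if (ret)
		return ret;

	return 0;
}
```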
04:25 marex: airlied: I was always under the impression that iommu was used mostly to protect the system from devices accessing memory they shouldn't
04:27 airlied: btw cma should work on x86 as well, just not all distros enable it
04:27 airlied: cma=<size> on command line might be needed
04:28 marex: airlied: right (I'm mostly familiar with the arm side)
04:31 marex: airlied: drivers/gpu/drm/rockchip/rockchip_drm_gem.c: ret = iommu_map_sgtable(private->domain, rk_obj->dma_addr, rk_obj->sgt,
04:31 marex: airlied: I think I have a winner, even with an example
04:31 marex: one I can even digest easily, this is awesome
04:54 marex: airlied: thanks !
05:08 Calandracas: Are there any plans for rusticl on panfrost at some point in the future?
05:20 CounterPillow: The future is now
05:20 CounterPillow: panfrost is one of the supported drivers for rusticl: https://docs.mesa3d.org/envvars.html#envvar-RUSTICL_ENABLE
06:16 airlied: bcheng, Lynne : I've pushed https://github.com/airlied/FFmpeg/tree/av1-decode-wip as the code I think should work on the ffmpeg side (it doesn't work yet though on amd at least)
06:56 airlied: bcheng, Lynne : okay I'm pretty unsure how this is meant to work :-P
06:58 airlied: tchar__: I don't trust that code in radv for filling out the ref frame map
06:58 airlied: but I'm unsure how the API is meant to be used here
09:07 robmur01: marex: you only need to bother with the IOMMU API if you care about managing the address space and exactly *where* buffers are mapped in the device view
09:08 robmur01: otherwise just use drm_gem_dma_helpers and it all simply happens by magic
09:09 robmur01: (the iommu-dma layer can't strictly *guarantee* to linearise any given scatterlist, but it does try its best)
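[For the other case, where the driver does want to choose the device address itself, the rockchip example marex found boils down to something like this sketch; the domain and IOVA bookkeeping are the driver's responsibility.]

```c
#include <linux/iommu.h>

static int map_at_fixed_iova(struct iommu_domain *domain, unsigned long iova,
                             struct sg_table *sgt)
{
	/* Maps the scatterlist at exactly 'iova' in the device's view.
	 * Returns bytes mapped, or a negative errno on recent kernels. */
	ssize_t mapped = iommu_map_sgtable(domain, iova, sgt,
	                                   IOMMU_READ | IOMMU_WRITE);

	return mapped < 0 ? (int)mapped : 0;
}
```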
09:17 pq: DemiMarie, gfx card hot-unplug, perhaps?
09:22 pq: DemiMarie, see https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#device-hot-unplug
09:24 pq: SIGBUS would be bad, if you expect userspace to not crash on the spot.
09:27 HdkR: Oh, it looks like Anv and Iris compile on AArch64 now?
09:27 HdkR: Might need to get a Battlemage GPU for testing
09:29 airlied: Arc you mean
09:30 HdkR: Well, I'm in no rush so I can wait for Battlemage :P
09:34 psykose: still surprised they actually named it that because it's a cool name
09:41 HdkR: Using D&D class names sorted alphabetically is cute
09:48 pepp: DemiMarie: my coworkers are aware of how Xen works and figured out a way to expose GPU memory to the guest correctly
09:49 pepp: DemiMarie: but I don't know much about the details of the implementation
09:51 tchar: airlied: can you elaborate on the issue you are seeing? That code is working around some firmware bugs, so it's a bit cursed.
09:52 tchar: Oh, I see there's a new FFmpeg branch, I'll give it a spin
09:56 airlied: tchar: I'm not sure the new ffmpeg is right either
09:56 airlied: the main problem is around how many reference frames/slots we need to send
09:57 airlied: I'm not sure how to fill out ref_frame_map properly
09:58 airlied: like if we have 7 frame refs pointing at 2 references, I'm not sure what ref_frame_map needs to contain
09:59 airlied: if we only send two dpb references vs the old code which sends 8 slots with the same image in different indices
09:59 airlied: so the old code and my hacks that work send referenceSlotCount = 8 pretty much always
09:59 airlied: and in those 8 there are repeated slot indices
10:00 airlied: and that fills out ref_frame_map all properly
10:00 airlied: but if we don't send 8, but instead only send say 2, I'm having trouble working out the ref_frame_map contents that will work
10:01 airlied: I'll try and attack it again tomorrow when I've had a coffee to see if I can at least work out a good description :-P
10:14 tchar: airlied: yeah, you're right that the intention is only to send 2 dpb references in pReferenceSlots in that case, and you use the referenceNameSlotIndices to convey the "duplicates"
10:15 tchar: i'll see if I can spot any issue in the meantime
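[In code, the scheme tchar describes maps the seven AV1 reference names onto however few unique DPB slots actually exist, using the names from VK_KHR_video_decode_av1. The lookup table here is a hypothetical application-side helper, not part of the API.]

```c
VkVideoDecodeAV1PictureInfoKHR pic = {
    .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_PICTURE_INFO_KHR,
};

/* slot_for_ref_name[]: hypothetical table, -1 for an unused reference
 * name, else the DPB slot index it resolves to. Several of the 7 names
 * (LAST..ALTREF) may share one slot, so pReferenceSlots can stay at 2
 * entries while the duplicates are expressed here. */
for (int i = 0; i < VK_MAX_VIDEO_AV1_REFERENCES_PER_FRAME_KHR; i++)
    pic.referenceNameSlotIndices[i] = slot_for_ref_name[i];
```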
12:06 Calandracas: CounterPillow, that's awesome that panfrost is supported. I thought it wasn't because of this page: https://docs.mesa3d.org/drivers/panfrost.html
12:06 Calandracas: "Other graphics APIs (Vulkan, OpenCL) are not supported at this time."
12:09 Calandracas: Neat, i have hardware for all supported drivers
12:23 Calandracas: rusticl is super awesome and a complete game changer. Now I can do OpenCL development on my pinebookpro
12:52 randevouz: Yeah, the basic arithmetic with all the list/seq/predicate functionality is in the core of the W3C libraries; that confused me, so disambiguation there. It has an unlicensed GitHub project; Turtle is a subset of N3: https://ruby-rdf.github.io/rdf-n3/RDF/N3/Algebra/Math/Negation.html. But I am lost, as I got tired. Not sure if I have to write my own stack; needs testing. And they have no logarithms. Fell asleep yesterday before inspecting the compression or doing any testing.
12:54 aleasto: is there an environment variable to disable the egl zink fallback if it breaks an app?
12:54 zmike: no
12:55 aleasto: sad
13:02 agd5f: airlied, we have a page table too, but we don't use it.
13:07 bcheng: airlied: filling out referenceSlotCount = 8, with repeated slotIndex is illegal I believe.
13:10 MrCooper: aleasto: LIBGL_ALWAYS_SOFTWARE=1 ?
13:11 bcheng: airlied: the problem that the ref_frame_map filling code is working around is that the vulkan api only provides the references used by the frame, not all 8 codec-defined slots. But the FW was designed for other APIs which are always given the state of the codec DPB, in which case if a frame stops being seen in ref_frame_map, the FW will drop the metadata for that slot
13:12 bcheng: the code tries to fill ref_frame_map with the real references first, then, as a workaround, fills in the slots that were not specified by the app, in order for the FW to keep the metadata alive
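[The workaround bcheng describes amounts to the following two passes (illustrative names, not the actual radv code): write the app-provided references into their slots first, then pad every slot the app left unspecified with a still-live surface so the firmware keeps its per-slot metadata.]

```c
/* Pass 1: the real references, at the slot indices the app specified. */
for (int i = 0; i < num_real_refs; i++)
	ref_frame_map[real_slot[i]] = real_ref_addr[i];

/* Pass 2 (FW workaround): fill the remaining slots with any live
 * reference so the firmware does not drop that slot's metadata. */
for (int i = 0; i < AV1_NUM_REF_FRAMES; i++)
	if (!ref_frame_map[i])
		ref_frame_map[i] = any_live_ref_addr;
```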
13:16 mripard: we're moving the drm-misc repo to gitlab, expect disruption for some time
13:21 randevouz: https://www.w3.org/TR/xpath-functions/ they have natural and base10 logarithms, but somehow the ruby project lacks them.
13:52 mripard: the migration is done now
14:32 karolherbst: the hell.. marge nuked my pipeline which succeeded in ` 59 minutes 31 seconds, queued for 4 seconds` :')
14:59 Lynne: airlied: added your dedup changes to my branch - https://github.com/cyanreg/FFmpeg/tree/av1dec
14:59 Lynne: also rebased it to git master
15:16 Lynne: on nvidia, it crashes in vkCreateVideoSessionParametersKHR, which is very weird
15:31 Lynne: right, fixed, I forgot that the spec forbids empty av1 session params, which we use for flushing the decoder on startup
15:32 Lynne: now both drivers crash during queue submission (though with radv, the kernel outright rejects the CS)
15:50 MrCooper: jfalempe: AFAIK a cache flush doesn't flush WC buffers
16:06 marex: robmur01: ack, thanks for the input. I need to dig into this first, then I'll come back (or probably won't, because this seems like a perfect fit for my purposes)
16:06 mareko: DemiMarie: one of the Xen architects now works for AMD
16:09 DemiMarie: pepp: does it support dGPUs or only iGPUs?
16:09 Lynne: airlied: with f3ab454f0 reverted in mesa, it sorta works, I get an image!
16:10 Lynne: it's not correct, but it's a start
16:11 pepp: DemiMarie: both
16:11 DemiMarie: pepp: does it handle paging of GPU memory and resizeable BAR?
16:12 pepp: DemiMarie: I don't *think* ReBAR is supported. If "paging" means "moving memory around", then yes
16:14 DemiMarie: No ReBAR could be a significant problem for broader use.
16:16 pepp: DemiMarie: I don't see why?
16:16 DemiMarie: pepp: because some GPU drivers simply require it
16:17 DemiMarie: Intel Arc and NVK for example
16:26 mareko: it's open source, people can change it
16:30 DemiMarie: mareko: my concern is that Xen-specific code in Intel and Nvidia drivers will receive no testing in upstream CI, and users will be left with bugs that are nigh-impossible to track down, much less fix.
16:44 DemiMarie: pepp: do AMD GPUs have ReBAR?
16:51 pepp: DemiMarie: yes.
17:03 mareko: DemiMarie: that's what SW development is for
17:13 tchar: Lynne: I made a couple of commits; it sounds like you already fixed one of them: https://github.com/charlie-ht/FFmpeg/tree/av1-decode-wip
17:14 tchar: I'm looking now at the rvp / rav mgmt, to try and better match what the spec ended up on
17:16 Lynne: tchar: yes, all are fixed
18:37 DemiMarie: pepp: would it be reasonable to write an email to both xen-devel and dri-devel to try to get this all sorted out?
18:40 airlied: bcheng: I do wonder, shouldn't we, according to the spec, put the complete dpb state in begin video coding?
18:42 DemiMarie: mareko: Neither I nor anyone else at Invisible Things Lab is a full-time GPU driver developer, and even if I was, I don’t have the resources to do the kind of testing that Intel and AMD do. If the same APIs that work outside of Xen also work under Xen, then Xen users benefit from the non-Xen testing for free.
18:44 DemiMarie: mareko: Also, there are quite a few additional hypervisors, mostly in the embedded space. Having GPU drivers need to know about a hypervisor is a giant layering violation, IMO.
18:45 DemiMarie: The GPU driver should just call the appropriate kernel APIs, and it should be the responsibility of the hypervisor interface to ensure that everything just works.
18:48 alyssa: mareko: i'm planning to rereview opt_varyings today or tomorrow
18:48 alyssa: fwiw
18:56 eric_engestrom: karolherbst: marge's timeout starts when it finishes pushing the branch, and then gitlab starts processing, figures out it needs to create a pipeline and how, and then the pipeline is created. Depending on the load on gitlab, I've seen this take up to a couple of minutes, so I'm guessing this is where the extra 25+ seconds you are missing went
18:58 eric_engestrom: it's really unfortunate though, and we've been discussing how to address these "the pipeline is almost finished, please give me X extra time instead of the normal timeout" cases, but we don't have a solution yet
18:59 eric_engestrom: (also, there's the obvious problem of abuse of extra time, but I believe this is a social problem that will require a social solution)
19:03 tchar: airlied: isn't that currently the case? the complete dpb state being the current frame and all its unique dependent frames. The idea of putting the whole "VBI" in the API was rejected
19:04 alyssa: eric_engestrom: I mean... given the 60m timeout and the 25m expected worst time, if we're hitting timeouts stuff's already on fire, so engineering a "please give me extra time" mechanism doesn't seem too necessary?
19:05 eric_engestrom: agreed, but until those who take too long stop taking too long, the users are left with nothing but the "womp womp, try again later" reassign button
19:06 tchar: airlied: I had a look and started trying to move FFmpeg in the direction of the latest spec in https://github.com/charlie-ht/FFmpeg/commit/e426de63843123ec7cbd2bbb575f2a8901132bce
19:07 jenatali: A "please give me 5 more minutes" seems better than "oh well, guess I'll have to take another ~hour by starting from scratch later"
19:07 tchar: just in case it is of any use... I will take another look tomorrow, I didn't figure out how to handle the frame_id wrapping yet
19:08 alyssa: jenatali: what I mean is, we're defaulting to "please give me 35 more minutes" on every pipeline
19:08 alyssa: and if that's not enough... things are really on fire!
19:09 jenatali: Oh I agree, but borrowing 5 minutes now can help get things back in shape sooner
19:16 DemiMarie: airlied (and others): if a `FOLL_LONGTERM` pin of VRAM succeeds, does that mean there is a kernel bug?
19:17 airlied: DemiMarie: just means the bar pages are pinned, doesn't mean what is behind the pages is
19:17 airlied: agd5f: yeah that was my understanding, using that page table would definitely relieve having to move BOs, but with rebar there's probably not much point
19:18 DemiMarie: airlied: what does that mean in practice? My understanding is that other subsystems (such as RDMA) also use `FOLL_LONGTERM` pins, and I don’t know if RDMA can handle faults.
19:21 airlied: DemiMarie: they don't usually have stuff in device memory space
19:21 DemiMarie: airlied: what happens if userspace tries?
19:22 DemiMarie: Suppose userspace maps some VRAM with vkMapMemory and passes the resulting address to an RDMA verb.
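[Spelled out, that scenario is just a handful of calls; ibv_reg_mr() is what takes the FOLL_LONGTERM pin under the hood, so whatever the kernel does here (a clean EFAULT, silent success against BAR pages, or worse) is exactly the behaviour in question. A sketch, with device/memory/pd setup elided:]

```c
#include <infiniband/verbs.h>

void *vram_ptr;
vkMapMemory(device, vram_mem, 0, size, 0, &vram_ptr); /* BAR-backed VRAM */

/* Registering that pointer asks the RDMA core to long-term pin it. */
struct ibv_mr *mr = ibv_reg_mr(pd, vram_ptr, size,
                               IBV_ACCESS_LOCAL_WRITE |
                               IBV_ACCESS_REMOTE_WRITE);
if (!mr)
	perror("ibv_reg_mr"); /* the "refuse cleanly" outcome */
```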
19:23 airlied: no idea
19:23 airlied: tchar: interesting, definitely looks bigger than I'd considered
19:23 airlied: Lynne: ^ you might want to start taking a look
19:23 DemiMarie: would this be considered a userspace bug?
19:26 airlied: probably, unless the kernel oopses
19:27 DemiMarie: Ouch
19:28 DemiMarie: Does this mean that GPU acceleration under Xen will require either per-driver patches or recoverable page fault support in Xen?
19:31 airlied: don't know maybe ask Xen developers
19:34 Lynne: tchar: would you mind using my code
19:34 Lynne: there is a lot of incomplete stuff in both of your branches
19:35 DemiMarie: airlied: I am going to ask them, but first I need to know what the GPU drivers require.
19:35 DemiMarie: From the kernel perspective, not Xen's.
19:36 zamundaaa[m]: I've recently hit a GPU reset, which happened while a pageflip from the compositor was pending - because of the reset, that pageflip never happened and timed out.
19:36 zamundaaa[m]: When I tried to work around that timeout in KWin, the result of the next commit was EBUSY because a pageflip was still pending on the kernel side...
19:37 zamundaaa[m]: Afaict, there's no way for the kernel to signal to userspace that a commit failed. So when this happens, could / should the kernel signal the pageflip as completed, and allow commits to happen again?
19:37 zamundaaa[m]: Because the GPU reset itself seemed successful, after restarting the compositor everything worked fine again
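[In libdrm terms, the sequence zamundaaa hit looks roughly like this sketch: the compositor commits with a flip event requested, the event is lost to the GPU reset, and the next commit returns EBUSY because the kernel still considers the flip pending.]

```c
#include <poll.h>
#include <xf86drmMode.h>

drmModeAtomicCommit(fd, req,
                    DRM_MODE_ATOMIC_NONBLOCK | DRM_MODE_PAGE_FLIP_EVENT,
                    user_data);

struct pollfd pfd = { .fd = fd, .events = POLLIN };
if (poll(&pfd, 1, timeout_ms) == 0) {
    /* Timeout: the flip event never arrived (eaten by the reset).
     * Retrying the commit here fails with EBUSY, since the kernel-side
     * flip is still considered pending. */
}
```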
19:41 DemiMarie: airlied: I think one might be able to trigger an oops with vmsplice.
20:01 airlied: Lynne: I've got it to decode properly by using the API illegally with my code
20:01 airlied: but I'll rebase on yours to see if I can hack it
20:19 airlied: Lynne: your branch has inconsistent loop restoration
20:19 airlied: at least with radv, but I'm not sure where the spec ended up
20:23 airlied: removing dedup fixes rendering for me on radv
20:23 airlied: if I comment out tchar's assert
20:31 DemiMarie: airlied: I see. Does that mean that P2PDMA between an RNIC and a GPU isn’t currently supported upstream?
20:34 airlied: DemiMarie: probably depends on the gpu driver cooperating
20:36 Lynne: airlied: the loop restoration unit sizes?
20:36 Lynne: they're supposed to be log2, but the spec misnamed them, there's a spec fix that should've been merged already
20:37 DemiMarie: airlied: Same cooperation I need, as it turns out.
20:40 airlied: Lynne: yeah the unit sizes
20:40 airlied: Lynne: - .LoopRestorationSize[0] = frame_header->lr_unit_shift,
20:40 airlied: - .LoopRestorationSize[1] = frame_header->lr_uv_shift,
20:40 airlied: - .LoopRestorationSize[2] = frame_header->lr_uv_shift,
20:40 airlied: + .LoopRestorationSize[0] = 1 + frame_header->lr_unit_shift,
20:40 airlied: + .LoopRestorationSize[1] = 1 + frame_header->lr_unit_shift - frame_header->lr_uv_shift,
20:41 airlied: + .LoopRestorationSize[2] = 1 + frame_header->lr_unit_shift - frame_header->lr_uv_shift,
20:42 airlied: that at least gets things to render on current radv
20:42 airlied: (the first frame)
20:51 airlied: not sure even with tchar's fix the spec is all that clear :-P
21:00 Lynne: airlied: the version in my branch should be the correct one
21:00 Lynne: wait, what?
21:01 Lynne: you have to add 1 and do the diff
21:01 bcheng: When LoopRestorationSize (codec-defined) = 256, log2(256) - 5 = 3, but lr_unit_shift gives 2, hence the +1
21:03 Lynne: you're right, even the sample program does that
21:03 Lynne: updated my branch with that
21:04 airlied: so with that, and the dedup reverted and the assert commented out, I get a proper video decode
21:05 airlied: so now we just need to work out how to make things work like the spec intends
21:08 bcheng: Seems like there's some issues with the dummy dpb addrs: https://gitlab.freedesktop.org/bcheng/mesa/-/commit/14eb7a417eee6bbd99b11cf7bce6e8fdf7b864c4
21:10 bcheng: dpbArraySize needs to be 8 if the ref_frame_map is filled out like it is
21:11 bcheng: but with dedup code, I get referenceSlotCount=0 even for the inter frame?
21:33 Lynne: airlied: reverted the dedup
21:33 Lynne: sadly that puts us back to where we started
21:34 bcheng: good news is I got the dedup working :)
21:34 bcheng: - for (int j = 0; j < AV1_NUM_REF_FRAMES; j++) {
21:34 bcheng: + for (int j = 0; j < ref_count; j++) {
21:42 airlied: bcheng: doh!
21:42 airlied: Lynne: dedup with bcheng change seems to work for me
21:51 Lynne: updated
21:52 Lynne: and it works a little bit more than before - it works fine on nvidia too
22:04 DemiMarie: Will pinning pages returned by e.g. vkMapMemory work as expected for iGPUs?
22:10 Lynne: airlied: passes all my standard tests so far, no crashes
22:16 DavidHeidelberg: mareko: gfxstrand robclark jljusten airlied would you be ok with creating a nine-tests namespace for testing gallium-nine? Code from Xnine (https://github.com/axeldavy/Xnine) would be hosted there and used in Mesa3D CI
22:18 DavidHeidelberg: details: in general it's wine tests adapted to run directly on Linux + gallium-nine with GTest suite (for deqp-runner integration). Very small, much fast, much wow. Ofc MIT/LGPL licensed.
22:32 Lynne: tchar: do you think frame_id is still currently incorrect?
22:42 alyssa: DavidHeidelberg: been a while since i saw a doge meme. nice :3
23:00 tchar: Lynne: it's possible I was mishandling them in my exploratory patch, it looks like the av1dec layer was taking care of the values not going over the maxDpbSlots value (9)
23:01 tchar: where's the latest working stuff in terms of branches atm? I will test here in the morning too
23:04 DavidHeidelberg: alyssa: I seriously miss them
23:09 Lynne: tchar: https://github.com/cyanreg/FFmpeg/tree/av1dec
23:10 Lynne: passes superres, film grain (not on intel), and the weird invalid files we'd found crashes on with the old extension
23:10 Lynne: it should be on par with the old extension
23:12 Lynne: airlied: it still has the same 8/10bit flickering issues as before (and 10bit hevc), you should talk to jkqxz again
23:16 DemiMarie: robclark: are there any Chromebooks with discrete GPUs?
23:25 airlied: Lynne: cool, I really have to figure out how to reproduce that here
23:26 airlied: do you see it on both navi2x and navi3x?
23:26 Lynne: only 3x
23:27 Lynne: not swapchain related, I've seen it in ffmpeg
23:28 Lynne: latest status was that jkqxz thought that the 10-8 conversion was activated by uninitialized structs which radv wasn't using or filling in (but you said they are)
23:29 Lynne: we did try zeroing every single piece of memory manually, but that didn't work, only RADV_DEBUG=zerovram helps
23:50 bcheng: Lynne: is there a sample clip?
23:50 bcheng: I can check on my end
23:51 Lynne: no sample clip, because I've been able to replicate it with everything, with the same consistency
23:52 bcheng: just any random av1/hevc 10 bit clip?
23:53 bcheng: do you do -vf "format=nv12" or something to get ffmpeg to do a 10-8 conversion?
23:53 Lynne: no no, any bit depth for av1, but for hevc, only 10bit
23:53 Lynne: no conversion either, just output whatever the file's pixel format is
23:54 Lynne: happens randomly, more frequently than rarely, less frequently than often, and if a single process does RADV_DEBUG=zerovram, globally it stops happening for a random amount of time
23:55 robclark: DemiMarie: currently no
23:55 DemiMarie: robclark: so right now a major decision to be made is whether to pin all VRAM shared with guests.
23:56 Lynne: bcheng: I recommend using my branch of ffmpeg, patching mpv with https://0x0.st/HhD5.diff to use the new extension, and opening and closing the same clip until it happens
23:56 DemiMarie: With iGPUs this is a non-issue, with dGPUs it is a significant concern.
23:57 bcheng: Lynne: thanks, will see if I can replicate
23:57 bcheng: only on navi3x?