00:32ity: Testing whether our connection to IRC is okay as we were unable to connect to oftc.net the past few weeks
00:32tiredchiku[d]: works
00:32ity: Eyy, thanks !
00:32chiku: you're welcome
03:15dwlsalmeida[d]: Btw, what is this? I assume this is created when the command buffer is submitted to the kernel driver? I assume `wait` are the things the command buffer has to wait on, and `signal` are the things it will signal when the job finishes
03:15dwlsalmeida[d]: struct nvkmd_nouveau_exec_ctx {
03:15dwlsalmeida[d]: struct nvkmd_ctx base;
03:15dwlsalmeida[d]: struct nouveau_ws_device *ws_dev;
03:15dwlsalmeida[d]: struct nouveau_ws_context *ws_ctx;
03:15dwlsalmeida[d]: uint32_t syncobj;
03:15dwlsalmeida[d]: uint32_t max_push;
03:15dwlsalmeida[d]: struct drm_nouveau_sync req_wait[NVKMD_NOUVEAU_MAX_SYNCS];
03:15dwlsalmeida[d]: struct drm_nouveau_sync req_sig[NVKMD_NOUVEAU_MAX_SYNCS];
03:15dwlsalmeida[d]: struct drm_nouveau_exec_push req_push[NVKMD_NOUVEAU_MAX_PUSH];
03:15dwlsalmeida[d]: struct drm_nouveau_exec req;
03:15dwlsalmeida[d]: };
03:15dwlsalmeida[d]: NVKMD_DECL_SUBCLASS(ctx, nouveau_exec);
03:15dwlsalmeida[d]: also, why is there both nouveau_exec and nouveau_bind?
03:20airlied[d]: exec is for cmd buffer submits, bind is for sparse bindings
03:24airlied[d]: we would build one of those for every submit, wait syncobjs, push buffers, and signal syncobjs
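As a rough illustration of the above, here is a minimal sketch of turning those wait/signal/push arrays into an EXEC submission. The field and macro names are assumed to follow the upstream nouveau uAPI (include/uapi/drm/nouveau_drm.h); this is a sketch, not verbatim NVK code:
```c
/* Hedged sketch, not actual NVK code: build a drm_nouveau_exec request from
 * the wait/signal/push arrays and hand it to the kernel.  Field and macro
 * names are assumed to match include/uapi/drm/nouveau_drm.h. */
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/nouveau_drm.h>

static int
submit_sketch(int drm_fd, uint32_t channel,
              struct drm_nouveau_sync *waits, uint32_t wait_count,
              struct drm_nouveau_sync *sigs, uint32_t sig_count,
              struct drm_nouveau_exec_push *pushes, uint32_t push_count)
{
   struct drm_nouveau_exec req = {
      .channel = channel,
      /* syncobjs the job waits on before any push buffer executes */
      .wait_count = wait_count,
      .wait_ptr = (uintptr_t)waits,
      /* syncobjs signaled once every push buffer has executed */
      .sig_count = sig_count,
      .sig_ptr = (uintptr_t)sigs,
      /* GPU VAs and lengths of the push buffers to run, in order */
      .push_count = push_count,
      .push_ptr = (uintptr_t)pushes,
   };

   return ioctl(drm_fd, DRM_IOCTL_NOUVEAU_EXEC, &req);
}
```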
13:45marysaka[d]: dwlsalmeida[d]: Reproduced and found a fix, going to push soonish
13:46dwlsalmeida[d]: marysaka[d]: 🙂
13:46dwlsalmeida[d]: On a second note though
13:46dwlsalmeida[d]: As Dave asked yesterday
13:46dwlsalmeida[d]: Can we use this to dump the video class stuff?
13:47karolherbst[d]: should be possible, just needs to be wired up correctly
13:48marysaka[d]: dwlsalmeida[d]: Yes, you can totally do that, the only thing that needs wiring is nv_push_dump on the mesa side
13:48marysaka[d]: As right now it hardcodes what class is used per subchannel instead of handling it dynamically based on the SET_OBJECT method
13:48karolherbst[d]: ohh right, the video stuff is more dynamic, no?
13:48karolherbst[d]: at least it was on older gens
13:49gfxstrand[d]: dwlsalmeida[d]: Mostly. It still goes through the WS stuff but that's just a detail. You shouldn't care about the WS stuff these days.
13:50marysaka[d]: dwlsalmeida[d]: Pushed the fix
13:50marysaka[d]: karolherbst[d]: not sure, but we should handle subchannels dynamically in the push dumper tbh
13:50marysaka[d]: even for debugging video it's going to be required
13:50karolherbst[d]: yeah...
13:51gfxstrand[d]: Yeah but that requires the push dumper to be stateful which it isn't currently.
13:52marysaka[d]: yeah... and it would need to carry over the state of the previous dump too
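For reference, a dynamic dumper would roughly need a per-subchannel class map keyed off SET_OBJECT (method 0); a hypothetical sketch follows (names invented, not the actual nv_push_dump code):
```c
/* Hypothetical sketch of a stateful subchannel -> class map for a push-buffer
 * dumper; not the actual nv_push_dump implementation. */
#include <stdint.h>

#define NV_NUM_SUBCHANNELS 8
#define NV_SET_OBJECT_MTHD 0x0000 /* method 0 on a subchannel binds a class */

struct push_dump_state {
   /* class currently bound to each subchannel (0 = unknown) */
   uint16_t subc_class[NV_NUM_SUBCHANNELS];
};

/* Called for every decoded (subchannel, method, data) triple; returns the
 * class to decode this method against. */
static uint16_t
push_dump_class(struct push_dump_state *s, uint8_t subc,
                uint16_t mthd, uint32_t data)
{
   if (mthd == NV_SET_OBJECT_MTHD)
      s->subc_class[subc] = data & 0xffff; /* assuming the class ID sits in the low bits */

   return s->subc_class[subc];
}
```
Carrying state between dumps, as mentioned above, would then just mean persisting subc_class across dump files.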
13:52zmike[d]: "PTO"
13:53zmike[d]: :lul:
14:28gfxstrand[d]: zmike[d]: lmao, you think this time off is paid.
14:29zmike[d]: gfxstrand[d]: lmao you think the P means Paid
14:30zmike[d]: the year is 2024 and PTO is now Personal Time Off
14:31gfxstrand[d]: 😂
14:32gfxstrand[d]: What can I say? Mesa hacking is fun and Discord is on my phone.
14:32gfxstrand[d]: Don't expect me to write code or debug shit, though.
14:33zmike[d]: so just shitposting
14:33zmike[d]: truly one of us.
14:33gfxstrand[d]: I shut my laptop off and put it in my office. If I can't do it from my chromebook, it's too much like work.
14:33zmike[d]: does your phone run a mesa driver yet?
14:33gfxstrand[d]: Sadly, no. 😢
14:33zmike[d]: well you've got...
14:33zmike[d]: checks watch
14:33gfxstrand[d]: lmao
14:33zmike[d]: about 4 months
14:34gfxstrand[d]: I mean, it's a qualcomm so if I download the right emulator app, I'll probably get Turnip.
14:34karolherbst[d]: mhhhh
14:34zmike[d]: legend has it if you pass the right environment settings you might even get zink
14:34gfxstrand[d]: But I don't feel like building my own Android image just so I can swap out the graphics driver.
14:34zmike[d]: but if you think about it
14:34zmike[d]: what else are you doing
14:34zmike[d]: you have so much time
14:34karolherbst[d]: I might have to check something on turnip, not sure I have a compatible device
14:35gfxstrand[d]: I did that in my younger days. I'm too old for that shit now.
14:35zmike[d]: from shitposting to oldposting in 30 seconds
14:35gfxstrand[d]: https://tenor.com/view/why-not-both-why-not-take-both-gif-11478682
14:36zmike[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1297931608350920787/97fa4g.png?ex=6717b881&is=67166701&hm=4367932a2f3226935c67e96d7d556ca54d3bc24d094be832a175b1b95d48d364&
14:37gfxstrand[d]: It's not a real vacation if you can't shitpost with your friends.
14:38zmike[d]: idk my best vacations are the ones where I come back with all-new shitposts my friends have never imagined could exist
14:39zmike[d]: see also https://www.supergoodcode.com/manifesto/
14:42asdqueerfromeu[d]: gfxstrand[d]: I built AOSP Marshmallow from source a long time ago
14:42gfxstrand[d]: I think the last Android version I built was Ice Cream Sandwich
14:43tiredchiku[d]: ...android 12.1 for me
14:43tiredchiku[d]: and I plan to get back into the scene soon
14:44zmike[d]: not before you finish debugging that kde deadlock
14:45tiredchiku[d]: I would, but I'm ~620km from that PC this week
14:46zmike[d]: gfxstrand[d]: is on PTO and that isn't proving a barrier to ecosystem contributions
14:46zmike[d]: where is your passion for graphics
14:46tiredchiku[d]: currently my exams are more important :D
14:46tiredchiku[d]: I do have the portable installation with me but getting plasma to run on my laptop's dgpu without an external display is... painful
14:47gfxstrand[d]: zmike[d]: What are you talking about? I plan to put about as much effort into all of Mesa as I usually put into Kepler.
14:47zmike[d]: invaluable contributions.
14:50gfxstrand[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1297934949718163476/rn_image_picker_lib_temp_081e0f6a-0740-4399-a0fd-5a6a8f50dc9b.jpg?ex=6717bb9e&is=67166a1e&hm=130819a490eeb8b6d2b1ed19d9a074a7e96295989625cc700968b904d2fa696f&
14:50gfxstrand[d]: At the moment, I'm going to contribute this picture of my breakfast.
14:50tiredchiku[d]: ..I guess iGPU+zink should be enough
14:50tiredchiku[d]: will try tomorrow
14:50Ermine: > Linux game ports shall no longer link to LLVM! --- do they link ??? O_O
14:50zmike[d]: gfxstrand[d]: is that a croissant, a chocolate filled thing, and a dishrag?
14:51tiredchiku[d]: looks like a scrunched up omelette to me
14:51gfxstrand[d]: It's scrambled eggs
14:51tiredchiku[d]: :wahoo:
14:54tiredchiku[d]: I have achieved correcter-than-zmike status, I can finally go live in the mountains /j
14:56zmike[d]: I'll need a notarized confirmation to believe it
15:02tiredchiku[d]: :ohno:
17:24gfxstrand[d]: marysaka[d]: should we move envyhooks to nouveau/?
17:25marysaka[d]: I feel we probably should yeah
17:25marysaka[d]: I don't know what access I have on nouveau tho... can I do the move myself :aki_thonk:
17:26gfxstrand[d]: IDK. I can clone it, though, and then you can delete yours and make it a clone
17:26karolherbst[d]: I could give you access 😄
17:26gfxstrand[d]: That works too
17:27gfxstrand[d]: I'll just let you do that
17:27karolherbst[d]: done
17:29marysaka[d]: And moved https://gitlab.freedesktop.org/nouveau/envyhooks
17:50airlied[d]: Yeah, adding state to the push dumper was what got me in the end, and I just hacked it
17:50airlied[d]: I did add the subchannel bits but not carrying state between dumps
17:54mohamexiety[d]: how do I use Timur's patches? I applied them on top of the latest release, but how do I actually extract the error logs to send them to someone?
18:05mohamexiety[d]: alright, figured it out
18:09mohamexiety[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1297985153770979328/debug-logs.tar.xz?ex=6717ea60&is=671698e0&hm=a478e29853db1a02b89274131ed0d7288d0927788b5f70d4735b0d92707f1781&
18:09mohamexiety[d]: skeggsb9778[d]: sorry about this, but could you please take a look at this when you have time? this is the debug output with Timur's patches from me enabling DCC compression on nouveau, and it now immediately faults on any image (with the help of marysaka[d] I got the fault address from a non-GSP system, and it's right at the start of the image). I am hoping maybe the GSP provides some extra info on
18:09mohamexiety[d]: what is missing/what we need to do, as I guess there's something missing on the kernel side (and probably the userspace side too), but I don't really have any clue what
18:16dwlsalmeida[d]: is there a way to run vkcube headless? vkcube --help doesn't really mention anything
18:21dwlsalmeida[d]: marysaka[d]: can you explain again how envyhooks is related to nv_push_dump, which seems to be its own binary?
18:22marysaka[d]: dwlsalmeida[d]: envyhooks will output dumps of command buffers to `EHKS_PUSHBUF_DUMP_DIR`; nv_push_dump allows you to decode them
18:23dwlsalmeida[d]: ah, ok
18:23dwlsalmeida[d]: your fix actually worked 😄
18:23dwlsalmeida[d]: although for some reason I get nothing at `EHKS_PUSHBUF_DUMP_DIR`
18:27marysaka[d]: It does work here
18:27marysaka[d]: ➜ envyhooks git:(main) ✗ LD_PRELOAD=$PWD/target/debug/libenvyhooks.so EHKS_PUSHBUF_DUMP_DIR=$PWD/dump_output gst-launch-1.0 filesrc location=64x64-I.h264 ! h264parse ! vulkanh264dec ! vulkandownload ! filesink location=/tmp/output_normal.yuv
18:27marysaka[d]: Setting pipeline to PAUSED ...
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 67108864, size: 564 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 70320128, size: 64 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 67111320, size: 436 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 67113084, size: 17041 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 70320400, size: 3616 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 69304340, size: 1086 }
18:27marysaka[d]: MESA-INTEL: warning: Haswell Vulkan support is incomplete
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 72822784, size: 355 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 72825232, size: 355 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 78970880, size: 355 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 83886080, size: 355 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 83888528, size: 355 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 87425024, size: 351 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 73216336, size: 355 }
18:27marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 87032072, size: 355 }
18:27marysaka[d]: Pipeline is PREROLLING ...
18:28marysaka[d]: Got context from element 'vulkanh264decoder0': gst.vulkan.instance=context, gst.vulkan.instance=(GstVulkanInstance)"\(GstVulkanInstance\)\ vulkaninstance0";
18:28marysaka[d]: Redistribute latency...
18:28marysaka[d]: ERROR: from element /GstPipeline:pipeline0/GstVulkanH264Decoder:vulkanh264decoder0: No valid frames decoded before end of stream
18:28marysaka[d]: Additional debug info:
18:28marysaka[d]: ../gst-libs/gst/video/gstvideodecoder.c(1416): gst_video_decoder_sink_event_default (): /GstPipeline:pipeline0/GstVulkanH264Decoder:vulkanh264decoder0:
18:28marysaka[d]: no valid frames found
18:28marysaka[d]: ERROR: pipeline doesn't want to preroll.
18:28marysaka[d]: Setting pipeline to NULL ...
18:28marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 87033684, size: 355 }
18:28marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 73218996, size: 355 }
18:28marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 73220624, size: 355 }
18:28marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 73222228, size: 355 }
18:28marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 73223832, size: 355 }
18:28marysaka[d]: WARNING: GPFIFO data seems invalid, skipping GpFifoRawEntry { opcode: Nop, sync_wait: false, gpu_address: 73225436, size: 355 }
18:28marysaka[d]: Freeing pipeline ...
18:28marysaka[d]: ➜ envyhooks git:(main) ✗ ls -l dump_output | wc -l
18:28marysaka[d]: 248
18:28dwlsalmeida[d]: ok, pardon me, I was running this under GDB
18:29dwlsalmeida[d]: running without the debugger works normally here too
18:29dwlsalmeida[d]: thank you!
21:11skeggsb9778[d]: mohamexiety[d]: it basically just says "read fault at 0x3ffffb4000" due to missing PTE
21:12skeggsb9778[d]: i mentioned somewhat recently - compression etc requires big pages
21:12skeggsb9778[d]: i have nfi what HW does if you try on small pages, but i could believe it ignores the small page table and faults because the big one is missing PTEs
21:13mohamexiety[d]: I see, thanks. that's the same error non-GSP gives. 0x3ffffb4000 is also the base address of the image. so I guess the first order of business is to look into plumbing big pages in the kernel?
21:14skeggsb9778[d]: https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/nouveau/nouveau_uvmm.c#L114
21:14skeggsb9778[d]: all of those PAGE_SHIFTs passed into the vmm funcs are forcing small pages - though, i'm not sure how simple it is to fix, or whether the "gpuvm" stuff can split operations properly with multiple GPU page sizes
21:15skeggsb9778[d]: the non-VM_BIND paths handle it for GPUs prior to pascal (since gp100 the GPU has blocked configuration of the compression tags, so nouveau couldn't support it)
21:16skeggsb9778[d]: you'll also probably have to poke around OpenRM to figure out how compression tags are handled on GSP / since Turing in general
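A hedged sketch of the direction this points at: instead of hard-coding PAGE_SHIFT in those vmm calls, the bind path would pick a per-mapping page shift, roughly like the hypothetical helper below (not the real nouveau_uvmm.c code):
```c
/* Hypothetical helper, not actual nouveau_uvmm.c code: pick a GPU page shift
 * for a mapping instead of unconditionally passing PAGE_SHIFT. */
#include <stdint.h>

#ifndef PAGE_SHIFT
#define PAGE_SHIFT 12 /* host small-page shift, 4 KiB */
#endif

static uint8_t
map_page_shift(uint64_t addr, uint64_t range, uint64_t bo_offset)
{
   const uint8_t big_shift = 16; /* 64 KiB big pages, supported since gp100 */
   const uint64_t big_mask = (1ull << big_shift) - 1;

   /* Big pages (and therefore compression) need the VA, the mapping size
    * and the offset into the BO to all be big-page aligned. */
   if (!((addr | range | bo_offset) & big_mask))
      return big_shift;

   return PAGE_SHIFT; /* fall back to small (4 KiB) pages */
}
```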
21:23mohamexiety[d]: skeggsb9778[d]: hmm, got it. I have not touched the kernel part of things beyond a trivial patch a while ago, so do you have any pointers for a good starting point? I'd guess I need to change the `PAGE_SHIFT`s based on whether we are working with small or big pages, but i am not sure that's the best starting point
21:24mohamexiety[d]: (also for kernel dev is there a better way to debug than constantly compiling and installing?)
21:31skeggsb9778[d]: it's easier prior to VM_BIND, because the kernel is in control of the VM mappings - basically what it does is:
21:31skeggsb9778[d]: - when the BO is allocated, it determines a page size for the BO up-front and stores that in nvbo->page (https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/nouveau/nouveau_bo.c#L266)
21:31skeggsb9778[d]: - when allocating the VMM space, it uses that to ensure the virtual address etc is also aligned to the selected page size
21:31skeggsb9778[d]: - when mapping into VMM space, compression tags will be allocated for the buffer, and filled into each PTE's NV_MMU_PTE_COMPTAGLINE field
21:31skeggsb9778[d]: that's the general idea, but it'd have to work quite a bit differently with VM_BIND, because you can't assume a 1-1 mapping between BO and VMM map
21:32skeggsb9778[d]: ie. userspace would have to ensure a lot of those conditions
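The userspace analogue of those conditions under VM_BIND might look roughly like this hypothetical sketch (not actual NVK code), where the driver places compressible BOs at big-page-aligned VAs with big-page-aligned sizes:
```c
/* Hypothetical userspace-side sketch, not actual NVK code: with VM_BIND the
 * driver owns the address space, so it has to reproduce what the old kernel
 * path guaranteed, e.g. mapping a compressible BO entirely with big pages. */
#include <stdbool.h>
#include <stdint.h>

#define NV_BIG_PAGE_SIZE (64ull * 1024) /* 64 KiB big pages since gp100 */

struct va_alloc {
   uint64_t addr;
   uint64_t size;
};

static struct va_alloc
reserve_va_for_bo(uint64_t heap_next_free, uint64_t bo_size, bool compressible)
{
   /* Compression tags live in big-page PTEs, so compressible BOs must be
    * big-page aligned in both address and size; everything else can use
    * 4 KiB pages. */
   uint64_t align = compressible ? NV_BIG_PAGE_SIZE : 4096;

   struct va_alloc va;
   va.addr = (heap_next_free + align - 1) & ~(align - 1);
   va.size = (bo_size + align - 1) & ~(align - 1);
   return va;
}
```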
21:33skeggsb9778[d]: mohamexiety[d]: i usually just "make drivers/gpu/drm/nouveau modules && make <...> modules_install"
21:33skeggsb9778[d]: and reload the module
21:35airlied[d]: I wonder is there more info we might need to communicate to userspace here
21:35skeggsb9778[d]: GPU page sizes mostly
21:36skeggsb9778[d]: I'm also not sure about UAPI around compression tags
21:36mohamexiety[d]: skeggsb9778[d]: this is a bit of a dumb question but.. why not? what could make the mapping different?
21:37mohamexiety[d]: skeggsb9778[d]: got you, thanks!
21:38skeggsb9778[d]: mohamexiety[d]: userspace is in control of the address-space now, so it has to be the one to handle correct alignment etc
21:38airlied[d]: I suppose we could try and have the kernel always allocate as large a page size as possible, but we'd still need info on whether it needs compression tags
21:38skeggsb9778[d]: that comes from "kind"
21:38airlied[d]: there is at least one 32-bit pad in the bind call
21:40airlied[d]: I'd probably have to look at how other gpu drivers handled large pages
21:41mohamexiety[d]: skeggsb9778[d]: ah, that makes sense. so if the userspace side of things handles all this, it should work out mostly similar
21:42mohamexiety[d]: I'll look around and get hacking then, thanks a lot!
21:46airlied[d]: I suspect we need to tell userspace about the page sizes and maybe what kinds can use them
21:50airlied[d]: NV2080_CTRL_GPU_GET_MAX_SUPPORTED_PAGE_SIZE_PARAMS
21:50skeggsb9778[d]: nvkm already knows that
21:51airlied[d]: yeah, just whether we need to know it in userspace or not, I suppose, or just have userspace try for a big enough page size
21:51mohamexiety[d]: oh speaking about kinds, a bit unrelated but does anyone know what the difference between `NV_MMU_PTE_KIND_GENERIC_MEMORY_COMPRESSIBLE` and `NV_MMU_PTE_KIND_GENERIC_MEMORY_COMPRESSIBLE_DISABLE_PLC` is? like what does "disable PLC" mean? the other kinds for other compressible types (the depth/depth stencil types) only have `COMPRESSIBLE_DISABLE_PLC` variants too.
21:53skeggsb9778[d]: i'm not too familiar with the kind changes on Turing+ yet, but PLC is post-L2 compression iirc
21:53airlied[d]: NV90F1_CTRL_CMD_VASPACE_GET_PAGE_LEVEL_INFO might also be interesting, but not sure
21:54mohamexiety[d]: skeggsb9778[d]: hm, I see. thanks
21:55skeggsb9778[d]: airlied[d]: https://gitlab.freedesktop.org/bskeggs/nouveau/-/blob/00.01-remove-ioctl/drivers/gpu/drm/nouveau/include/nvif/driverif.h?ref_type=heads#L128
21:55skeggsb9778[d]: i still need to push an updated branch here, but this is basically what nvkm returns to the drm driver
21:55skeggsb9778[d]: (a list of page sizes, and flags of what they support)
22:00skeggsb9778[d]: in practice, since gp100, it's 64KiB + 2MiB pages that support compression
22:02skeggsb9778[d]: i'm not sure if the higher levels support non-sparse mapping on newer GPUs or not, but as the next page size up is 512MiB, i'm not sure it's so relevant 😛