00:19fdobridge: <marysaka> Will take a look tomorrow, have a nice rest :ferris_hi:
00:56fdobridge: <airlied> also if you want a leg up on some fragment shader bits once you've gotten the shader io tracker done, https://gitlab.freedesktop.org/airlied/mesa/-/tree/nvk-nak-hacks2?ref_type=heads
00:57fdobridge: <airlied> feel free to rip off as much as needed from that list of patches, some of them are definitely good, some probably need more consideration
05:51fdobridge: <dadschoorse> my stupid brain thought realz meant something depth related 🐸
08:34cwabbott: gfxstrand: have you seen https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/freedreno/computerator/?ref_type=heads ? it's the thing we use for ISA reverse engineering experiments in freedreno
08:38cwabbott: it's doing raw command submission, which helped for teasing out what shader-related registers do and worked out better than vulkan back-door or something
08:39cwabbott: when bringing up a new generation, it was also the first thing brought up before vulkan
08:55HdkR: If only NVIDIA released their internal ISA docs :>
08:58karolherbst: yeah.....
08:59karolherbst: HdkR: though we have something good enough I think for the moment :D
09:00HdkR: and then I see you fussing around with the .X modifiers and not understanding why it is changed. Would be so easy to have the documentation for that :|
09:00karolherbst: is that even documented in great detail?
09:01HdkR: In the ISA docs yes
09:01karolherbst: I have ISA docs, I'm just wondering if there are other ISA docs than I've gotten :D
09:01HdkR: Well, usually if someone actually wrote the pseudocode representation
09:01karolherbst: but yeah.. I suspect there is a version with more details
09:02HdkR: Indeed, if you don't have the pseudocode then you're missing details
09:02karolherbst: ohh yeah, I don't have the pseudocode
09:03fdobridge: <marysaka> Did NVIDIA ever publish an updated version of the SPH doc btw?
09:03karolherbst: what I have looks more like documentation for an assembler :D
09:03karolherbst: they haven't
09:03karolherbst: and I've asked
09:03karolherbst: I shall ask again
09:04HdkR: karolherbst: I guess it has the bitfield encodings and options for the instructions?
09:04karolherbst: it doesn't
09:04HdkR: welp
09:04karolherbst: it's listing all forms though
09:04HdkR: Smells like you got some second/third order documentation
09:04karolherbst: it feels like it's written in a way to help people using the ISA for other things
09:05karolherbst: :D
09:05karolherbst: no, really
09:05karolherbst: but yeah.. not having the encoding kinda sucks
09:05karolherbst: but, I'm free to use it for nouveau stuff and it does explain the flags and what instructions and forms there are
09:05karolherbst: so it's better than nothing
09:05HdkR: A small step is better than no step at all
09:05karolherbst: it also lists all system values and stuff
09:06karolherbst: and for IADD3 the only thing not explained there was how the neg modifier works
09:06HdkR: That's nice at least, so you can see the system registers you're not using :D
09:06karolherbst: but maybe there are a few pages in the front I haven't gotten
09:06karolherbst: :D
09:06karolherbst: yes
09:08HdkR: You would sort of expect that stuff to be documented right next to the instruction...like it does in the better docs
09:09karolherbst: yeah...
16:07benjaminl: karolherbst: the docs you're talking about aren't public, are they?
16:07karolherbst: correct, they aren't, and I've gotten them under NDA
16:08benjaminl: oof
16:08karolherbst: I'm just allowed to use them for nouveau stuff
16:08benjaminl: nice that you got them at all and can use them though :)
16:08karolherbst: but obviously can't share or quote from them
16:22fdobridge: <gfxstrand> cwabbott: I've not looked at it but I'm thinking of doing that. It's not much code to fill out a QMD and do a compute launch. Doing it through Vulkan isn't bad, either, but there are things I need and plumbing through magic queries looks painful.
16:27fdobridge: <gfxstrand> And it's probably better for a test framework than what I'm doing right now
16:30fdobridge: <gfxstrand> Or I could pull QMD setup into NAK...
16:30fdobridge: <gfxstrand> 🤔
16:31fdobridge: <karolherbst🐧🦀> only part of it
16:31fdobridge: <karolherbst🐧🦀> but NAK could have an API to generate the full one I guess 🙃
16:31fdobridge: <karolherbst🐧🦀> but it's kinda annoying that QMD contains static and dynamic information
16:32fdobridge: <gfxstrand> I could generate the template
16:32fdobridge: <karolherbst🐧🦀> yeah.. that's what I meant with having an API for it
16:33fdobridge: <karolherbst🐧🦀> I think it makes sense to have the SPH/QMD generation as part of NAK, as even GL would just write the exact same code anyway. And once we get the sph docs (hopefully) we also don't have to bother having two generators for sph and qmd
16:34fdobridge: <gfxstrand> Well, they're still going to be different
16:34fdobridge: <karolherbst🐧🦀> I meant, two for each
16:34fdobridge: <gfxstrand> Yeah
16:34fdobridge: <karolherbst🐧🦀> but yeah.. I think it feels natural to have sph/qmd generation APIs as part of the compiler and not the driver
16:34fdobridge: <karolherbst🐧🦀> always felt kinda odd
16:35fdobridge: <karolherbst🐧🦀> especially this slotting madness we've been doing in gl
16:35fdobridge: <gfxstrand> I considered doing that on Intel but on NVIDIA it actually makes sense.
16:36fdobridge: <karolherbst🐧🦀> we can always make drivers responsible for lowering certain system values (and pull them from a custom driver ubo or whatever) though
16:36fdobridge: <gfxstrand> And we can make the compiler output offsets to key things like the dispatch size.
16:36fdobridge: <karolherbst🐧🦀> yeah, or that
16:37fdobridge: <gfxstrand> I think that's better than an API, especially since we need to set some of them from MME anyway
16:37fdobridge: <karolherbst🐧🦀> right..
16:37fdobridge: <gfxstrand> Hrm... Actually, that means we probably do want an API
16:37fdobridge: <karolherbst🐧🦀> 😄
16:38fdobridge: <gfxstrand> Since it doesn't change per-shader.
16:38fdobridge: <karolherbst🐧🦀> yep
16:39fdobridge: <karolherbst🐧🦀> I think nvidia is doing something similar with CUDA even. The binary just tells where it wants special things to be and the runtime has to provide them
16:39fdobridge: <gfxstrand> Annoyingly, bindgen will do nothing with those headers...
16:40fdobridge: <karolherbst🐧🦀> mhhh
16:40fdobridge: <karolherbst🐧🦀> no?
16:40fdobridge: <karolherbst🐧🦀> ehh wait.. qmd uses that `MW` stuff..
16:40fdobridge: <karolherbst🐧🦀> uhh
16:41fdobridge: <karolherbst🐧🦀> we could generate rust code 🙃
16:41fdobridge: <karolherbst🐧🦀> write a macro and `include!` the file 🙃
16:42fdobridge: <gfxstrand> Yeah....
16:43fdobridge: <karolherbst🐧🦀> write a proc macro which parses those headers 😄
16:43fdobridge: <marysaka> I had some custom parser for that on my playground stuffs back then :vReiAgony: <https://github.com/Pascalinette/gpu_playground/blob/master/misc/header_converter/__init__.py#L787C5-L787C39>
16:43fdobridge: <gfxstrand> Also, setting CBs in the QMD is annoying... It's not just an offset. 😭
16:43fdobridge: <marysaka> (with libclang)
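A minimal sketch of the "parse the headers and generate accessors" idea from the messages above, assuming the `MW` defines take the form `#define NAME MW(high:low)` with bit positions counted across an array of 32-bit words. The header excerpt, field names, and bit ranges below are hypothetical and only illustrate the mechanics; this is not the parser linked above, which uses libclang rather than a regex.

```python
import re

# Matches lines such as "#define NAME MW(415:384)" (assumed format).
MW_DEFINE = re.compile(r"#define\s+(\w+)\s+MW\((\d+):(\d+)\)")


def parse_mw_fields(header_text):
    """Return {name: (high_bit, low_bit)} for every MW(high:low) define."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in MW_DEFINE.finditer(header_text)
    }


def set_field(words, high, low, value):
    """Write value into bits [high:low] of a list of 32-bit words."""
    width = high - low + 1
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in field")
    for bit in range(width):
        word, shift = divmod(low + bit, 32)
        words[word] &= ~(1 << shift) & 0xFFFFFFFF   # clear the target bit
        words[word] |= ((value >> bit) & 1) << shift  # then set it if needed


# Hypothetical header excerpt; names and bit ranges are made up.
EXAMPLE_HEADER = """
#define EXAMPLE_QMD_FIELD_A MW(415:384)
#define EXAMPLE_QMD_FIELD_B MW(47:24)
"""

fields = parse_mw_fields(EXAMPLE_HEADER)
qmd = [0] * 64  # 64 dwords, an arbitrary size for the example
set_field(qmd, *fields["EXAMPLE_QMD_FIELD_A"], 0x1234)
set_field(qmd, *fields["EXAMPLE_QMD_FIELD_B"], 0xABCDEF)
```

Field B straddles a dword boundary on purpose, which is the case a plain C struct (and hence bindgen) cannot express and a generated setter has to handle.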
16:43fdobridge: <karolherbst🐧🦀> yeah...
16:43fdobridge: <karolherbst🐧🦀> CBs are kinda flexible
16:44fdobridge: <karolherbst🐧🦀> anyway.. I don't think doing it the template + specialization way is necessarily bad
16:44fdobridge: <karolherbst🐧🦀> I personally like the idea of making it the compiler's problem to know and generate the headers
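The "template + specialization" idea discussed above could look roughly like the following sketch: the compiler hands back the static part of the descriptor plus a table of offsets, and the driver patches the per-dispatch values in. Everything here is a hypothetical illustration: `QmdTemplate`, `compile_template`, `specialize`, the 256-byte size, and the offsets are made-up names and numbers, not the real QMD layout or any NAK API.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class QmdTemplate:
    data: bytes          # static, per-shader part filled in by the compiler
    patch_offsets: dict  # field name -> byte offset of a 32-bit slot


def compile_template():
    """Stand-in for the compiler: returns a template plus patch offsets."""
    data = bytearray(256)                    # hypothetical descriptor size
    struct.pack_into("<I", data, 0, 0xC0DE)  # pretend static shader info
    offsets = {"grid_x": 16, "grid_y": 20, "grid_z": 24}  # made-up layout
    return QmdTemplate(bytes(data), offsets)


def specialize(template, **values):
    """Driver side: copy the template and patch the dynamic fields."""
    out = bytearray(template.data)
    for name, value in values.items():
        struct.pack_into("<I", out, template.patch_offsets[name], value)
    return bytes(out)


template = compile_template()
qmd = specialize(template, grid_x=4, grid_y=2, grid_z=1)
```

Exposing plain offsets lets something other than the CPU (the MME, in the discussion above) write the dynamic fields. It does not cover fields that are more than a single value at a fixed offset, which is the complaint about constant buffers above.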
17:06fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> https://gitlab.freedesktop.org/gfxstrand/mesa/-/merge_requests/40 🚀
18:00rovo: Hello, I was following the instructions for getting nouveau firmware from the NVIDIA driver here: https://nouveau.freedesktop.org/VideoAcceleration.html . I'm stuck at the 3rd wget, trying to determine whether I should download that driver or adjust that line according to my video card's model number.
18:01karolherbst: rovo: what's your GPU?
18:01rovo: "wget http://us.download.nvidia.com/XFree86/Linux-x86/325.15/NVIDIA-Linux-x86-325.15.run" this is the line I'm referring to, but I have a GeForce 320M card. I'm generally used to trying to install the 340.108 driver instead of the 325.15 listed there.
18:01karolherbst: nah, the version has to match exactly
18:06rovo: karolherbst: thank you. I went back and forth a few times trying both. Now I'm unsure which one I left off on.
18:17rovo: I'm not really sure I have it working... I've tried running this command on an mkv file and an mp4 file, but neither works.
18:17rovo: mplayer -vo vdpau -vc ffmpeg12vdpau,ffvc1vdpau,ffh264vdpau,ffodivxvdpau, <file>
18:17karolherbst: what's the problem?
18:20rovo: after I hit enter, it says MPlayer 1.4... then it says, do_connect: could not connect to socket, connect: No such file or directory, Failed to open LIRC support. Then it says libavformat file format detected, but follows with Protocol name not provided cannot determine if input is local or network protocols.
18:20karolherbst: yeah, doesn't sound like a nouveau issue to me at least
18:22rovo: looking more closely, that was a bad example, as that file was h265
18:26rovo: so the do_connect: could not connect to socket is not related to nouveau?
18:26karolherbst: it's not
18:28rovo: so now I am playing an h264 in a mkv container... it's starting with no other error, I can hear the audio of the movie, but the screen is black.
18:31karolherbst: do you see any errors coming up in dmesg?
18:37rovo: I just rebooted, and briefly saw something about nouveau is primary device... but I couldn't read it all
18:37karolherbst: I meant when playing the video
18:46rovo: karolherbst: I am doing this on ubuntu server... terminal only.
18:49rovo: I just tested `mpv --hwdec=vdpau yourvideofile` and that seems to work well
18:51rovo: But then I pulled up my journalctl of the last boot, and I see some errors related to
18:51rovo: ACPI Warning: \_SB.PCI0.IXVE.IGPU._DSM: Argument #4 type mismatch - Found [Buffer]
18:51rovo: nouveau 0000:04:00.0: NVIDIA MCP89 (0af000a2)
18:51rovo: nouveau 0000:04:00.0: fb: 256 MiB stolen system memory
18:52rovo: these ones are RED:
18:52rovo: nouveau 0000:04:00.0: bus: MMIO write of 0000807f FAULT at 100c18
18:52rovo: nouveau 0000:04:00.0: bus: MMIO write of 0000807e FAULT at 100c1c
19:05fdobridge: <gfxstrand> Okay, who's setting P0 without telling me?!?
19:06fdobridge: <gfxstrand> Probably IMAD.WIDE
19:16karolherbst: rovo: mhh, those are kinda irrelevant
19:16fdobridge: <karolherbst🐧🦀> @gfxstrand correct, `IMAD.WIDE` has to output predicates 😄
19:17fdobridge: <karolherbst🐧🦀> `IMAD.HI` as well btw
19:18fdobridge: <gfxstrand> One or two?
19:18fdobridge: <karolherbst🐧🦀> ehh two
19:18fdobridge: <gfxstrand> For both of them?
19:18fdobridge: <karolherbst🐧🦀> ehh wait
19:18fdobridge: <karolherbst🐧🦀> one
19:18fdobridge: <karolherbst🐧🦀> 🙃
19:19fdobridge: <karolherbst🐧🦀> I mistook one R for a P
19:19fdobridge: <gfxstrand> At bit 81 and 84?
19:19fdobridge: <gfxstrand> Okay, so just the one at bit 81?
19:19fdobridge: <karolherbst🐧🦀> probably? The docs don't contain the encoding
19:19fdobridge: <gfxstrand> Okay
19:20fdobridge: <gfxstrand> the disassembler doesn't do anything with it but at least it's not stomping P0 to garbage now
19:20fdobridge: <karolherbst🐧🦀> heh.. odd
19:20fdobridge: <karolherbst🐧🦀> sounds like a bug in the disassembler then
19:21fdobridge: <karolherbst🐧🦀> `IMAD.LO` has an input predicate btw
19:21fdobridge: <karolherbst🐧🦀> and `IMAD.WIDE` as well
19:21fdobridge: <karolherbst🐧🦀> mhh
19:21fdobridge: <karolherbst🐧🦀> no
19:21fdobridge: <karolherbst🐧🦀> only `IMAD.WIDE.X`
19:21fdobridge: <karolherbst🐧🦀> actually
19:21fdobridge: <karolherbst🐧🦀> it's only `.X` 😄
19:22fdobridge: <karolherbst🐧🦀> and all the variants
19:22fdobridge: <gfxstrand> Looks like we need to set PT in the non-.X variants or else it stomps things
19:22fdobridge: <karolherbst🐧🦀> funky
19:22fdobridge: <gfxstrand> Yeah
19:22fdobridge: <gfxstrand> It's not great but kinda makes sense
19:22fdobridge: <karolherbst🐧🦀> but yeah...
19:22fdobridge: <karolherbst🐧🦀> I wonder if it's doing something undocumented
19:23fdobridge: <karolherbst🐧🦀> or just nothing at all
19:23fdobridge: <gfxstrand> I suspect destinations have to always be encoded
19:23fdobridge: <gfxstrand> Even if they're not really used
19:23fdobridge: <karolherbst🐧🦀> yeah.. might be
19:23fdobridge: <karolherbst🐧🦀> let's see...
19:23fdobridge: <gfxstrand> But yeah... Upshot is that idiv now works. 😄
19:24fdobridge: <karolherbst🐧🦀> nice
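A minimal sketch of the fix described above: when an instruction has a predicate destination slot, always encode it, and use PT when nothing should be written, since an all-zero field would name P0. The bit position 81 comes from the conversation; the 3-bit field width and PT being index 7 are assumptions for this sketch, and `encode_pred_dst` is a hypothetical helper, not NAK code.

```python
PT = 7              # assumption: index 7 is the always-true predicate (PT)
PRED_DST_BIT = 81   # bit position mentioned in the chat
PRED_DST_WIDTH = 3  # assumption: 3-bit predicate index field


def encode_pred_dst(inst, pred=None):
    """Return the 128-bit instruction word with the predicate destination set.

    pred=None means "this variant writes no predicate"; PT is encoded so the
    hardware does not clobber P0.
    """
    idx = PT if pred is None else pred
    if not 0 <= idx <= PT:
        raise ValueError("predicate index out of range")
    mask = ((1 << PRED_DST_WIDTH) - 1) << PRED_DST_BIT
    return (inst & ~mask) | (idx << PRED_DST_BIT)


# Leaving the field untouched (all zeros) is the same as naming P0.
unused = encode_pred_dst(0)         # non-.X style variant: encode PT
explicit_p2 = encode_pred_dst(0, 2)  # variant that really writes P2
```

Under these assumptions, `encode_pred_dst(0, 0)` is indistinguishable from an instruction word where the field was never filled in, which matches the "stomping P0" symptom reported above.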
19:24fdobridge: <gfxstrand> The only remaining CS delta with codegen is image atomic cmpswap but I think Daniel is working on that.
19:24fdobridge: <karolherbst🐧🦀> I suspect in theory you also have to encode all input predicates, but they might just get ignored in hardware
19:24fdobridge: <gfxstrand> And I'm not really worried. If he doesn't figure it out, I can wire it up.
19:24fdobridge: <karolherbst🐧🦀> I wonder if that matters more with scheduling...
19:25fdobridge: <gfxstrand> If it reads P0 and ignores the value, that's no different from PT
19:25fdobridge: <karolherbst🐧🦀> well...
19:25fdobridge: <karolherbst🐧🦀> yes, but...
19:25fdobridge: <gfxstrand> If it generates garbage and writes to P0 that's very different from PT
19:25fdobridge: <karolherbst🐧🦀> right
19:25fdobridge: <karolherbst🐧🦀> I'm sure it's fine
19:25fdobridge: <gfxstrand> For hardware that auto-schedules, there may be a difference on sources
19:25fdobridge: <gfxstrand> It'd be correct, it just might stall
19:26fdobridge: <karolherbst🐧🦀> yeah...
19:26fdobridge: <karolherbst🐧🦀> would be the case pre kepler, though kepler actually verifies this
19:26fdobridge: <karolherbst🐧🦀> and the hardware falls back to the worst case if anything is wrongly set
19:43fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Why could this error happen?: https://media.discordapp.net/attachments/1087385827354611772/1149791643986301009/image.png
19:46fdobridge: <gfxstrand> ENODEV probably means you got a hang
19:49fdobridge: <marysaka> or channel crash I think :aki_thonk: