00:46TimurTabi: airlied: which one?
00:51airlied: TimurTabi: I just moved to 535.113.01
00:53TimurTabi: airlied: are you still hacking up our header files?
00:54airlied: TimurTabi: still doing whatever Ben did
00:54TimurTabi: airlied: oof, you should fix that
00:54airlied: fix it how?
00:54airlied: I just renamed all the 535.54.03 -> 535.113.01 for now
00:55airlied: any designs for how to do it properly would be gladly accepted, I think dakr is going to start looking at it soon
00:55TimurTabi: Make it so you can copy the files verbatim.
00:57airlied: I don't think that will be useful long term unfortunately
00:57dakr: TimurTabi, I think this won't work supporting multiple firmware versions.
00:57airlied: it would be a struct naming/versioning nightmare
00:58TimurTabi: Why not? It would greatly simplify moving to new versions
00:58TimurTabi: No more three way diffs
00:58airlied: not really
00:58airlied: since we have to keep maintaining the older versions
00:59airlied: it would have made this move easier, since I don't want to keep backwards compat for it
00:59airlied: but after this we have to keep backwards compat
01:00TimurTabi: I still think it would be a good idea
01:02airlied: just not sure how it would work in practice
01:02airlied: we currently import pieces from 58 headers
01:03airlied: I think if we took the complete header files we'd likely add a lot more dependencies
01:03airlied: and since we can't just copy them over moving forward, it doesn't really avoid the 3 way diffs problem
01:03TimurTabi: Sounds like a good job for an intern
01:04airlied: no, redesigning it completely to not include the headers might be a more sensible idea
01:04airlied: just generate an interface description from the nvidia driver, and output nouveau headers for it
01:05airlied: it might turn out to be a difficult challenge, but I think it's worth a try
01:06TimurTabi: A script to extract the macros and structs might work
01:15airlied: TimurTabi: we do something in mesa for the class headers but this seems like it would be more complex since the headers are less defined
01:19TimurTabi: Maybe use cscope to enumerate the identifiers that Nouveau uses?
02:22airlied: yeah something like cscope or a python c parser would be needed
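The "enumerate the identifiers that Nouveau uses" idea can be sketched roughly as below. This is only an illustration under loud assumptions: a regex scan stands in for cscope or a real C parser, the helper names `defined_names` and `used_names` are hypothetical, and it will also pick up identifiers that only appear in comments or strings.

```python
import re

# Assumption: plain regexes are good enough for a first pass. A real tool
# would use cscope or a proper C parser, as suggested in the discussion.
IDENT_RE = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")
DEFINE_RE = re.compile(r"^\s*#\s*define\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)
TAG_RE = re.compile(r"\b(?:struct|union|enum)\s+([A-Za-z_][A-Za-z0-9_]*)\s*\{")
TYPEDEF_RE = re.compile(r"\}\s*([A-Za-z_][A-Za-z0-9_]*)\s*;")


def defined_names(header_text):
    """Collect macro names, struct/union/enum tags and typedef names from a header."""
    names = set(DEFINE_RE.findall(header_text))
    names |= set(TAG_RE.findall(header_text))
    names |= set(TYPEDEF_RE.findall(header_text))
    return names


def used_names(driver_text, header_text):
    """Return the header-defined names that also occur in the driver source."""
    used = set(IDENT_RE.findall(driver_text))
    return sorted(defined_names(header_text) & used)
```

Running `used_names` over each driver source file against each imported header would give a first list of what actually needs extracting; resolving the dependencies between those definitions is the harder part and is not attempted here.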
03:56fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> What GSP version will be used upstream?
04:02fdobridge: <airlied> probably try and use 535.113.01 now
04:22fdobridge: <airlied> also looks like setting the registry entries might paper over the bar2 fault crash
04:25fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I can see you merged a GSP branch that still uses 535.54.03 into drm-tip
04:27fdobridge: <airlied> yeah that was yesterday, who knows what tomorrow will bring
04:32fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Also is it possible to get KMS on Ada without GSP? 🐸
04:36fdobridge: <airlied> no I don't think it is
05:18fdobridge: <lingm> @marysaka yeah, rustup 1.23+ should read a `rust-toolchain.toml` and download the specified version automatically when attempting a build. on a glance your file looks valid...
05:18fdobridge: <lingm> since nightly features can be added and removed at any time it would be a good idea to pin a specific nightly to use. e.g. `channel = "nightly-2023-10-30"`, but that shouldn't have blocked it from working for Andrew.
05:20airlied: dakr: 3 patches on the list for getting to latest fw + fixes, I'd like to get those looked at and then consider sending the topic branch to Linus
05:34fdobridge: <lingm> no idea what went wrong there. `cargo build`ing envyhooks immediately downloads the latest nightly and then builds fine 🤷
07:14fdobridge: <karolherbst🐧🦀> in theory yes, we kinda had it supported on Ampere. The issue is just that you can only have a single context and it's all pain
07:15fdobridge: <karolherbst🐧🦀> and you'd need code for the display engine
07:15fdobridge: <karolherbst🐧🦀> or is Ada locked down so much?
07:26AndrewR: may be my problem was that I first installed rust from Slackware repo, only after first failed build I added rustup (separate package) and only after that I started to manually call it for downloading toolchains ...
07:38fdobridge: <lingm> AndrewR: apparently you need to manually setup your system to call the rustup shims on slackware: <https://slackbuilds.org/slackbuilds/15.0/development/rustup/README>
07:38fdobridge: <lingm> if you didn't do that, no wonder it didn't work. on arch the `rustup` package conflicts with the `rust` and `cargo` packages but then it works as expected out of the box. i'd classify that as a packaging bug.
07:42AndrewR: anyway, it was built, but when I tried to run it via script: ./run_preload ~/ffmpeg -i /mnt/hd/guest/Sea_of_life_plus_Mikura_dolphins_test.mp4 -c:v nvenc_h264 -f null /dev/null
07:42AndrewR: I got my cuda error :}
07:43karolherbst: yeah.. I guess that's expected because version mismatch and stuff
07:49AndrewR: karolherbst, I can try to build your valgrind stuff if it does not require kernel modification (running from live dvd rn)
07:50karolherbst: uhhhhh.... it's kinda a pain, and you will have to update it for your version, but: https://github.com/karolherbst/valgrind/tree/uvm and https://github.com/karolherbst/envytools/tree/UVM
07:50karolherbst: and it also uses headers from the blob driver
07:51AndrewR: oh, moment, trying to convince nvidia driver to drive my monitor at 1440x900 (bad vga cable is bad - so it all blurred at 1024x768)
07:51karolherbst: and it's very incomplete
07:51karolherbst: it was more of a prototype to parse those UVM ioctls without having to like reverse engineer the layout and just use nvidias headers there
07:52karolherbst: but not sure how much of that is useful for the video encoding/decoding stuff
08:04AndrewR: karolherbst, anything special for building or autogen.sh/configure/make will make it?
08:06karolherbst: uhhh.. you might have to change some paths
08:11AndrewR: I just cp any .h file it was complaining about, thankfully for this tool my driver provides all files ...
08:12karolherbst: yeah.. though you might have to update mmt quite a bit...
08:12karolherbst: ioctl changes and all that
08:12karolherbst: _however_ maybe these days the driver has all the info anyway
08:13karolherbst: but I think there is also a lot of secret stuff in the driver
08:13AndrewR: karolherbst, well, it compiled :} therefore I can use it ..how?
08:14karolherbst: but anyway.. valgrind-mmt is practically annoying to use, so I kinda wish we'd focus more on Marys tool and make that run better. Valgrinds overhead is a giant pita
08:16AndrewR: yeah, just 2 fps instead of 65. but for encoder this doesn't matter? anyway, few seconds of trace resulted in 8 mb log. upload it to where?
08:17karolherbst: parse it with `demmt`
08:17karolherbst: which will fail to parse it
08:17karolherbst: but whatever
08:17karolherbst: the fun part is to make it work :P
08:18AndrewR: well, I can't find demmt tool?
08:18karolherbst: we also have this: https://github.com/NVIDIA/open-gpu-doc/tree/master/classes/video but I doubt it's too much useful on old GPUs like Kepler
08:18karolherbst: AndrewR: second repo
08:22AndrewR: multiple definitions ..
08:22karolherbst: yeah well.. none of this is end user friendly and also needs fixing
08:27AndrewR: first err silenced by adding static keyword in include ..
08:27AndrewR: root/envytools/demmt/drm.c:574:38: error: ‘NOUVEAU_GETPARAM_FB_PHYSICAL’ undeclared (first use in this function); - second one is more serious?
08:28karolherbst: mhh, we removed that ioctl from nouveau
08:28karolherbst: you probably can just remove the code
08:30AndrewR: it also complains about NOUVEAU_GETPARAM_AGP_PHYSICAL and NOUVEAU_GETPARAM_PCI_PHYSICAL
08:31AndrewR: because I am not interested in nouveau tracing I can also comment them out?
08:31karolherbst: just remove it
08:31karolherbst: nouveau userspace also won't ever use them
08:31karolherbst: so it's kinda pointless to keep it around anyway
08:33AndrewR: https://pastebin.com/NuKY0iD6 - more errs :}
08:33AndrewR: just comment THEM out too?
08:34karolherbst: those are actual problems
08:34karolherbst: due to UVM having changed and stuff :)
08:35karolherbst: just check what your version looks like and adjust the code to that
08:35karolherbst: I never got to the point where I made it work for multiple versions of the driver
08:36karolherbst: I was mostly just interested in figuring out how painful it would be to just use nvidia headers instead of what we've done before
08:41AndrewR: karolherbst, https://pastebin.com/64m1qCF0 -structure for my version of driver ... does not look like it itself was versioned, may be version lives up there somewhere ..
08:41karolherbst: yeah... no idea how UVM is versioned at all
08:41karolherbst: there might be some define _somewhere_
08:52AndrewR: https://pastebin.com/5Tynjgqj hacks
08:54karolherbst: why do they change field names like that :D
08:55AndrewR: ./demmt/mmt_bin2dedma < ../valgrind/file-bin.log > file-txt.log
08:58AndrewR: too big for pastebin ...
08:59karolherbst: ahh, so it's recording stuff?
09:00AndrewR: it tries? https://pastebin.com/kwjdCWyZ
09:00karolherbst: but yeah...
09:00karolherbst: nvidias ioctl aren't stable at all
09:00karolherbst: so if you use a newer version you kinda have to figure out all the new bits
09:00AndrewR: feels the pain :}
09:01karolherbst: that's why I kinda hoped you could use Marys stuff, because that just uses headers from the driver
09:01AndrewR: well, from too new driver :}
09:15AndrewR: demmt still does not work: unknown type: 0x1b
09:17karolherbst: but yeah...
16:06AndrewR: karolherbst, any idea how hard fixing demmt for this specific use case can be?
16:09karolherbst: between 5 minutes and a month of work
16:09karolherbst: or more
16:09karolherbst: reverse engineering ioctls where you have no idea how the data layout is can be a lot of work
16:09karolherbst: most of the stuff is actually just command buffers.. soo yeah.. dunno
16:15AndrewR: if I look with less at the binary log I see "Warning: noted but unhandled ioctl 0x49" and same for 0x21 ...
16:18karolherbst: yeah.. soo...
16:19karolherbst: you can always check out the earliest open source release and see how much of that makes sense
16:19AndrewR: https://pastebin.com/vTmRKigy - full (hopefully) list. Didn't the open-source releases of the nvidia kernel driver start from 500.xx ?
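Collecting the list of unhandled ioctls from the binary trace could look roughly like the sketch below. It assumes the warning text is exactly the "Warning: noted but unhandled ioctl 0x.." string quoted above (the real message may carry extra fields after the number), and the function name `unhandled_ioctls` is made up for illustration.

```python
import re
from collections import Counter

# Assumption: the message format matches the one quoted from the log.
UNHANDLED_RE = re.compile(r"Warning: noted but unhandled ioctl (0x[0-9a-fA-F]+)")


def unhandled_ioctls(log_bytes):
    """Count how often each unhandled ioctl number shows up in a raw trace."""
    # latin-1 maps every byte to a character, so binary data never fails to decode.
    text = log_bytes.decode("latin-1")
    return Counter(int(number, 16) for number in UNHANDLED_RE.findall(text))
```

Reading the trace with `open(path, "rb").read()` and printing `unhandled_ioctls(...).most_common()` would show which ioctl numbers are worth looking at first. It only tells you what is missing, not how the ioctl data is laid out.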
16:20AndrewR: can demmt just skip stuff it does not understand?
16:39karolherbst: yes, but it shouldn't
16:39karolherbst: it could map memory
16:39karolherbst: and that memory could contain command buffers
16:43AndrewR: well, guess then I am stuck :) but bin2dedma still gives something ... any hint on how to fish useful info from that?
17:24karolherbst: hard to say.. you could try to find command buffers on the subchannels related to video encoding/decoding and see what those are doing
17:46AndrewR: karolherbst, thanks. I'll read more on it after some more sleep ...
18:22AndrewR: karolherbst, cool, nvidia demands login for NEARLY all video sdk versions ....
18:23AndrewR: karolherbst, also, according to some pdf async mode only exists on Windows drivers ...
18:40airlied: is there any kepler vulkan video driver?
18:40karolherbst: kepler was ditched in 470 or so?
18:41airlied: i suspect re on a maxwell might be similar
18:41karolherbst: at least it's closer
18:41airlied: we have some tegra video class headers also
18:42karolherbst: but video stuff is apparently different
18:42karolherbst: (on tegra)
18:42airlied: its not though
18:43airlied: its not perfectly same but dont think it was too different
18:44karolherbst: I think the biggest difference is, that the tegra side consumes the command buffers instead or something?
18:44airlied: i have a request to get geforce video class headers winding its way
18:44karolherbst: yeah.. nice
18:44airlied: through nvidia
18:44karolherbst: maybe we get them next year
18:44karolherbst: I should ping again on the SPH and other request I made :)
18:44karolherbst: poor Andy
18:44airlied: i was going to dump some vulkan video pushbufs at some point
18:46airlied: i wonder should i bring an nvidia laptop on my trip, they are all pretty heavy
18:46karolherbst: what trip
18:46karolherbst: some of the lenovo ones aren't too bad there
18:47karolherbst: though I think a Dell XPS one is lighter
18:47airlied: yeah the selection i have ranges from super heavy to merely heavy :-p
18:52AndrewR: Selected GPU 0: NVIDIA GeForce GT 710, type: 2
18:52AndrewR: proprietary of course
18:52karolherbst: AndrewR: the question was about vulkan video
18:52karolherbst: like playing videos through vulkan
18:53AndrewR: ah, sorry. vulkaninfo will tell?
18:53karolherbst: dunno :) ask airlied
18:55AndrewR: karolherbst, UVM_API_LATEST_REVISION 7 in my uvm.h header at /usr/src/nvidia-470.199.02/nvidia-uvm/uvm.h
18:57airlied: yeah vulkaninfo see if you have VK_KHR_video*
19:01AndrewR: airlied, nope ...
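The VK_KHR_video* check can be scripted along these lines. This is a minimal sketch: it just scans whatever text `vulkaninfo` printed for extension names with that prefix, and the helper name `video_extensions` is an illustration, not part of any tool.

```python
import re

# Matches VK_KHR_video_queue, VK_KHR_video_decode_queue, and so on.
VIDEO_EXT_RE = re.compile(r"\bVK_KHR_video\w*")


def video_extensions(vulkaninfo_text):
    """Return the distinct VK_KHR_video* extension names found in the text."""
    return sorted(set(VIDEO_EXT_RE.findall(vulkaninfo_text)))
```

Feeding it the output of `subprocess.run(["vulkaninfo"], capture_output=True, text=True).stdout` and getting an empty list back corresponds to the "nope" above.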
19:03airlied: okay then that won't be a good way forward
19:33karolherbst: AndrewR: I assume they don't bump that on each rename though :'(
19:34karolherbst: anyway... that video reverse engineering is a big task, definitely something you want to do a GSoC/EVoC for and even that wouldn't be enough
19:37AndrewR: karolherbst, yeah, just hoped to collect some info while it works
19:37AndrewR: karolherbst, https://docs.nvidia.com/drive/drive_os_220.127.116.11L/nvvib_docs/index.html#page/DRIVE_OS_Linux_SDK_Development_Guide/NvMedia/nvmedia_nvmvid_enc.html
19:38karolherbst: soo.. what would help is to write very simple and targeted application and trace what nvidia is doing
19:38karolherbst: but we need a better mmiotracer, because it's all cursed
19:38karolherbst: I wouldn't trust valgrind-mmt to even catch most of the relevant things compute related
19:39karolherbst: I've written those UVM patches for a reason, because that tool just became quite useless
19:39airlied: where did ilia ever get with video?
19:39karolherbst: it works on some GPUs
19:39karolherbst: it's broken
19:39karolherbst: but also kinda works
19:39airlied: is it just gpus where we can extract the fw?
19:40karolherbst: I think so
19:40karolherbst: maybe less even
19:40karolherbst: but I'm also hesitant to write new code not using nvidia headers
19:42airlied: I'd be writing code not using them until we get them :-)
19:43karolherbst: fair enough
19:44airlied: I think for gsp there are still some kernel bits to hook up to expose the video dec
19:45karolherbst: probably yes
19:45karolherbst: all that video stuff is completely cursed
19:45karolherbst: I hope it's better with GSP
19:46karolherbst: we even have specialized instructions for video stuff in the shader ISA
19:47airlied: I wonder what for
19:47airlied: in modern ISAs?
19:47karolherbst: let me check..
19:48karolherbst: mhh I actually think PTX has those as well
19:48karolherbst: airlied: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#video-instructions
19:48karolherbst: maxwell has some of them I think
19:49karolherbst: but the nvdisasm docs are not mentioning them
19:49karolherbst: but they do exist
19:49karolherbst: envydis lists them e.g.: https://github.com/envytools/envytools/blob/master/envydis/gm107.c
19:50karolherbst: but I can't find any of those in the ISA docs I have.. sooo dunno
19:50karolherbst: however I think they exist...
19:50airlied: I doubt we care about any of those for basic decode though
19:51karolherbst: yeah.. we don't
19:51karolherbst: some of the encoding/decoding is shader assisted
19:51airlied: okay the thinkpad p1gen4 is probably the lightest, might be manageable
19:51airlied: karolherbst: I'd be interested if we know how to quantify "some" there
19:51karolherbst: yeah.. good question :)
19:52karolherbst: check what nvidia is doing
19:52karolherbst: but what I remember it kinda depends on the codec and profile and/or settings
19:52airlied: my guess is for most things using shaders is undesirable due to power consumption
19:52airlied: but I know for instance AV1 filmgrain is done in shaders on intel
19:53karolherbst: it also depends on the generation
19:53karolherbst: I think the VP docs _might_ give some details
19:53karolherbst: or was it PV?
19:53karolherbst: yeah.. PV
19:54karolherbst: wikipedia says this for Feature Set E: "Cards with this feature set use a combination of the PureVideo hardware and software running on the shader array to decode HEVC (H.265) as partial/hybrid hardware video decoding."
19:56airlied: ah so maxwell and below for h265
19:57karolherbst: yeah.. but I think there are more instances of that
20:36AndrewR: karolherbst, https://forum.doom9.org/showthread.php?t=164495
20:37AndrewR: karolherbst, https://forum.cyberlink.com/forum/posts/list/42357.page
20:37AndrewR: apparently on pre-Kepler there was some encoding feature on cuda cores on fermi ...
20:46AndrewR: karolherbst, https://cseweb.ucsd.edu/classes/wi15/cse262-a/static/cuda-5.5-doc/html/video-encoder/index.html
21:30AndrewR: it seems this library (nvcuenc.dll) was only in specific windows drivers, but then 181 series was xp compatible, so may be it will work in Reactos ... :)
21:32AndrewR: karolherbst, https://on-demand.gputechconf.com/gtc/2010/presentations/S12075-GPU-Accelerated-Video-Encoding.pdf - but link at the end does not work and not archived ..
21:43RSpliet: you may be able to ask the author? Guessing it's this chap: https://github.com/toshas?tab=repositories