09:31 jb0IEUQ5aTx: whether booting from live media or from a freshly installed Ubuntu disk, is the default video/graphics driver loaded on machines with NVIDIA graphics cards the free-software nouveau driver or the proprietary nvidia driver?
11:40 karolherbst: jb0IEUQ5aTx: no clue, that's not up to us to decide
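Since the default depends on the distribution, one way to answer the question on a given machine is to check which kernel module is actually loaded. A minimal Python sketch, assuming a Linux system where /proc/modules is readable; the function names `loaded_gpu_driver` and `detect` are hypothetical, while `nouveau` and `nvidia` are the kernel module names of the two drivers:

```python
from pathlib import Path
from typing import Optional


def loaded_gpu_driver(modules_text: str) -> Optional[str]:
    """Return "nvidia", "nouveau", or None based on /proc/modules content."""
    # The first whitespace-separated field of each line is the module name.
    names = {line.split()[0] for line in modules_text.splitlines() if line.strip()}
    for candidate in ("nvidia", "nouveau"):
        if candidate in names:
            return candidate
    return None


def detect() -> Optional[str]:
    """Read /proc/modules on the running system (Linux only)."""
    return loaded_gpu_driver(Path("/proc/modules").read_text())
```

Calling `detect()` on the booted system returns the name of whichever of the two modules is loaded, or `None` if neither is.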
11:42 jb0IEUQ5aTx: ok, if we needed to use ffmpeg for hardware-accelerated encoding with the free-software nouveau driver rather than proprietary nvidia/nvenc, which hardware-accelerated encoder should we use (instead of *_nvenc)?
11:47 karolherbst: nouveau doesn't support hw accelerated video encoding
11:50 jb0IEUQ5aTx: is that because Nvidia does not provide documentation about their GPUs, making it much harder for the nouveau developers?
11:50 RSpliet: Plus bigger fish to fry, plus a tiny team
11:51 RSpliet: Oh and lack of redistributable firmware for the hw encoder
11:52 karolherbst: and more important things to work on than reverse engineering the hardware encoder sadly
11:56 jb0IEUQ5aTx: unbelievable that Nvidia has not changed their posture on this issue for over a decade now...
11:56 karolherbst: they have, but just a little
11:56 karolherbst: video encoding is just not important enough so there were other things to focus on
11:57 karolherbst: they do release firmware for enabling hardware acceleration in general (for OpenGL, CL, etc...) or do provide basic documentation now: https://github.com/NVIDIA/open-gpu-doc
11:58 karolherbst: so.. maybe in the future it will be better? who knows
11:59 karolherbst: ohh actually, there is some nvdec docs: https://github.com/NVIDIA/open-gpu-doc/blob/master/classes/video/nvdec_drv.h
11:59 karolherbst: but without the firmware....
12:06 jb0IEUQ5aTx: right, but they even shortchange their proprietary driver users by allowing only up to 2 (or 3) simultaneous hardware-accelerated encodings (apparently, unless you spend thousands on their higher-end cards): https://github.com/keylase/nvidia-patch
12:27 karolherbst: jb0IEUQ5aTx: it's actually a hw limitation
12:28 karolherbst: so the only GPUs with 7 instead of 3 or 1 nvenc engines are gp100 and gv100
12:28 HdkR: that github repo is fun since it removes some of the limitations. Depending on the hardware's capabilities of course
12:28 karolherbst: not sure if the engines can do multiple jobs in parallel though
12:29 karolherbst: but as far as we know only gv100 and gp100 have more than 3 nvenc engines
12:30 HdkR: That's the issue yea. Parallel encodes are software limited for whatever reason :|
12:32 HdkR: Limit on number of encodes doesn't make much sense since it's just a product of how hard you're pushing the encoders. I can only assume it is to reduce consumer confusion
12:32 HdkR: But 3 encodes at 8k != 3 encodes at 240p :P
12:32 karolherbst: right sure..
12:33 karolherbst: but encoding 240p is also super fast anyway
12:33 karolherbst: it might be that you will be able to fill in idle time with multiple encodings though
12:33 karolherbst: like e.g. streaming
12:34 HdkR: Which is likely why Quadro line gives you an unrestricted amount of encodes. Up to the customer to find the upper limit
12:34 jb0IEUQ5aTx: right, all encodes are not the same (especially given the resolution, bitrate, etc.)... so, it does not make sense to artificially-limit the number...
12:34 jb0IEUQ5aTx: According to https://github.com/Livepool-io/transcoder/issues/11 and https://www.youtube.com/watch?v=0fxu7zbhmrs , applying the said patch supposedly allows for at least a slightly increased number of hardware-accelerated encoding sessions. (The same issue applies to nvenc on both GNU/Linux and Windows)... can't be sure, as we haven't tried it out...
12:34 karolherbst: the question is rather, if you do multiple encodings, is it actually faster overall or not.. but again.. for real time streaming it's a different problem in the first place
12:36 karolherbst: the issue is mainly that only those expensive GPUs have more than 3 engines, so hard to tell if it's a limitation on GeForce GPUs or if they decided to limit sessions to the number of available engines
12:37 karolherbst: but it seems like that you are really only able to fill in idle time with this
12:38 HdkR: Quadro parts will end up just evenly scaling the encode sessions over the hardware blocks as far as I'm aware
12:38 karolherbst: yeah.. I'd assume the same
12:38 karolherbst: still better for some use cases though
12:38 karolherbst: the most silly limitation we found was power consumption reporting only for quadro cards :D
12:38 karolherbst: that has like no hw reason
12:38 karolherbst: *had
12:38 HdkR: Definitely. If you actually need a wackload of encoders then Quadro still makes sense
12:39 jb0IEUQ5aTx: well, we did use 2 simultaneous 8192x4320 5fps encodings using hevc_nvenc, and it had a significant performance improvement taking up a total of only 10% of the CPU - even one 8192x4320 5fps software encoding using libx265 would take up 80% of the CPU (and all of the CPU threads).
12:39 jb0IEUQ5aTx: the problem is that adding a third simultaneous 1280x720 encoding from a webcam fails with this error: "OpenEncodeSessionEx failed: out of memory", even though the "nvidia-smi" command-line utility showed that GPU memory was not really the issue...
12:39 karolherbst: jb0IEUQ5aTx: yeah.. well... 8k encoding needs some memory
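To put the "3 encodes at 8k != 3 encodes at 240p" point in numbers, here is a rough Python sketch comparing raw pixel throughput for the two stream sizes mentioned above. It assumes 5 fps for the 720p stream as well, and it treats pixels per second as a proxy for encoder load, which ignores bitrate, preset, and codec settings; `pixel_rate` is a hypothetical helper name:

```python
def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels the encoder has to process per second for one stream."""
    return width * height * fps


big = pixel_rate(8192, 4320, 5)    # one of the two 8K streams
small = pixel_rate(1280, 720, 5)   # the 720p webcam stream (5 fps assumed)
ratio = big / small

print(big)             # 176947200
print(small)           # 4608000
print(f"{ratio:.1f}")  # 38.4
```

Under these assumptions a single 8K stream carries about 38 times the pixel load of the 720p one, so a fixed session count says little about how busy the encoder actually is.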
12:39 karolherbst: jb0IEUQ5aTx: on wha GPU btw?
12:39 karolherbst: *what
12:40 HdkR: Hopefully Turing or better now. NVENC on the newest chips is great :D
12:40 karolherbst: it's very fast.. yes
12:40 karolherbst: I noticed it myself
12:40 jb0IEUQ5aTx: Quadro P1000 - it's one of the lower-end workstation cards...
12:40 karolherbst: jb0IEUQ5aTx: yeah soo.. the p1000 is a gp107 which has 3 nvenc engines
12:41 HdkR: GP100 also only has 3 nvenc engines, but has unlimited concurrent sessions ;)
12:41 karolherbst: so doing two rather than just one encode, should give you a significant improvement yes
12:41 karolherbst: HdkR: nope, it has 7
12:42 HdkR: Does it? Nvidia's official documentation only claims 3
12:42 karolherbst: at least from a driver pov
12:42 karolherbst: HdkR: skeggsb_ set the limit to 7
12:42 HdkR: huh
12:42 karolherbst: maybe it's 3 in like real hw and 7 from a driver pob
12:43 karolherbst: *pov
12:43 karolherbst: who knows
12:43 HdkR: Could be
12:43 HdkR: https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new Is supposed to be the documentation on these
12:44 karolherbst: HdkR: there is no gp100 on that list?
12:44 karolherbst: ohhh
12:44 karolherbst: tabs...
12:44 karolherbst: HdkR: weird.. gv100 should also have 7...
12:44 HdkR: I guess it doesn't take in to account that the newer engines just are more capable
12:45 karolherbst: HdkR: odd is the Tesla M10 with 4
12:45 karolherbst: as gm107 usually only has 1....
12:45 karolherbst: ohhh
12:45 karolherbst: 4 chips...
12:45 karolherbst: maybe nvidia knows something we don't...
12:46 karolherbst: we also claim 3 for gm200 where nvidia claims 2
12:46 HdkR: Was M10 the GPU that was abused for GRID? Could explain why it had more independent encoders
12:48 HdkR: Old GPUs, hard to remember details about :P
12:48 karolherbst: HdkR: no, the M10 is just 4 GPUs
12:48 HdkR: oh hah, right.
12:53 HdkR: Nvidia's market segmentation strategy is just a bit grating for consumers that know the hardware is the same. I don't think that will ever change :)
13:21 jb0IEUQ5aTx: According to https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new#collapseOne , P1000 is supposed to support 3 concurrent sessions, but can not seem to achieve that... only 2 8K 5fps using "-c:v hevc_nvenc -preset fast -tune ull -zerolatency 1" for now... even if the third encoding is as low as 720P 5fps...
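The concurrent-session test described above could be reproduced with something like the following Python sketch. The encoder options are the ones quoted in the message; everything else is an assumption for illustration: the synthetic `testsrc` lavfi input stands in for the real sources, the output is discarded with the null muxer, and `nvenc_cmd` and `run_concurrently` are hypothetical helper names. It requires an ffmpeg build with NVENC support.

```python
import subprocess
from typing import List, Tuple


def nvenc_cmd(size: str, fps: int, seconds: int = 10) -> List[str]:
    """Build an ffmpeg command that encodes a synthetic test pattern with hevc_nvenc."""
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        # Synthetic input (assumption): replaces the real capture sources.
        "-f", "lavfi", "-i", f"testsrc=size={size}:rate={fps}",
        "-t", str(seconds),
        # Encoder options as quoted in the chat.
        "-c:v", "hevc_nvenc", "-preset", "fast", "-tune", "ull", "-zerolatency", "1",
        # Discard the encoded output.
        "-f", "null", "-",
    ]


def run_concurrently(cmds: List[List[str]]) -> List[Tuple[int, str]]:
    """Start all commands at once, then collect (exit code, stderr) for each."""
    procs = [subprocess.Popen(c, stderr=subprocess.PIPE, text=True) for c in cmds]
    results = []
    for proc in procs:
        _, err = proc.communicate()
        results.append((proc.returncode, err.strip()))
    return results
```

For example, `run_concurrently([nvenc_cmd("8192x4320", 5), nvenc_cmd("8192x4320", 5), nvenc_cmd("1280x720", 5)])` starts the two 8K sessions and the 720p session together; a non-zero exit code with an error such as the "OpenEncodeSessionEx failed" message would indicate which session was refused.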
13:22 jb0IEUQ5aTx: does anyone have any experience using Radeon/AMDGPU drivers for hardware-accelerated video encoding - if so, how does it compare to using nvenc (both in terms of quality and also concurrent sessions)?
13:25 HdkR: Internet claims latest generation of nvenc wipes the floor with AMD encoder
13:25 HdkR: in terms of quality
13:25 HdkR: no idea about concurrent sessions. I don't do video encoding
13:26 jb0IEUQ5aTx: where did u see the comparison?
13:30 HdkR: Bunch of random news posts and reviewer videos