00:05 karolherbst: mhh
00:07 docmax: sorry, but i cant believe its not possible to set all up with xorg.conf
00:08 karolherbst: I am not saying it is not possible
00:08 karolherbst: I am just saying it is less painful with xrandr
00:08 docmax: xrandr should have an export to xorg.conf option
00:09 karolherbst: xorg.conf is kind of deprecated
00:10 docmax: but the only way to start X with the right setup from the START (not afterwards with xrandr)
00:12 docmax: i just want DP-1 left, DP-2 middle (main screen), DP-3 right ... thats all
00:12 docmax: all 1080p
00:13 imirkin: docmax: erm ... this works just fine...
00:13 imirkin: docmax: just add the relevant monitor sections
00:13 imirkin: specify what's left-of what
00:13 imirkin: and you're done...
00:13 docmax: which sections do i need minimum?
00:13 imirkin: this also has nothing to do with nouveau btw
00:14 imirkin: it's how every driver works
00:14 karolherbst: docmax: you have 4k displays, right?
00:14 imirkin: i don't remember... you need a monitor section for each monitor
00:14 docmax: no 1920x1080
00:14 karolherbst: ohh, okay
00:14 imirkin: and then you need a ServerLayout where you specify how they're placed
00:14 imirkin: which in turn will probably mean that you need a Device section
00:14 karolherbst: ohh, the one before you joined was asking about 4k stuff...
00:14 imirkin: i'm going from memory here, so ... imprecise.
00:15 docmax: do i need screen sections?
00:16 imirkin: yyyyeamaybe?
00:16 imirkin: hold on
00:16 imirkin: let me find a working config for you :)
00:16 docmax: screens are for more than 1 gpu?
00:16 docmax: imirkin: thanks
00:16 imirkin: https://hastebin.com/odomurayir.nginx
00:17 imirkin: obviously replace the connector names as appropriate, and s/modesetting/nouveau/
00:17 * imirkin hopes that works with nouveau... 99% sure it does
00:17 imirkin: oh, and ignore the fact that there are 2 different Mon0/1 entries in the device
00:18 imirkin: that's to account for differences in naming between modesetting and intel ddx's
00:18 docmax: this has no server layout
00:18 karolherbst: and ignore the rotate stuff ;)
00:18 docmax: nor screens
00:18 imirkin: correct
00:18 imirkin: but this works for me.
00:18 docmax: ok let me try
00:19 docmax: https://hastebin.com/ememoqisoc.m
00:20 imirkin: you're just missing the bits in Device
00:20 docmax: bits?
00:21 imirkin: Option "Monitor-DP-1" "DP-1"
00:21 imirkin: etc
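The layout imirkin describes (one Monitor section per output, relative placement, and a `Monitor-<output>` option in the Device section) can be sketched roughly as below. This is not the config from the hastebin links, which are not reproduced in the log: the helper names and the `gpu0` identifier are made up for illustration, the 1920x1080 mode and the DP-1/DP-2/DP-3 ordering come from docmax's description, and the log shows that this approach did not cure docmax's mirroring.

```python
def monitor_section(name, mode, left_of=None, right_of=None, primary=False):
    """Build one xorg.conf Monitor section for a single output."""
    lines = [
        'Section "Monitor"',
        f'    Identifier "{name}"',
        f'    Option "PreferredMode" "{mode}"',
    ]
    if left_of:
        lines.append(f'    Option "LeftOf" "{left_of}"')
    if right_of:
        lines.append(f'    Option "RightOf" "{right_of}"')
    if primary:
        lines.append('    Option "Primary" "true"')
    lines.append("EndSection")
    return "\n".join(lines)


def device_section(driver, outputs):
    """Build the Device section that ties each output to its Monitor section."""
    # "gpu0" is a hypothetical identifier chosen for this sketch.
    lines = ['Section "Device"', '    Identifier "gpu0"', f'    Driver "{driver}"']
    for output in outputs:
        lines.append(f'    Option "Monitor-{output}" "{output}"')
    lines.append("EndSection")
    return "\n".join(lines)


def xorg_layout(driver="nouveau", mode="1920x1080"):
    """DP-1 on the left, DP-2 in the middle as primary, DP-3 on the right."""
    sections = [
        monitor_section("DP-1", mode, left_of="DP-2"),
        monitor_section("DP-2", mode, primary=True),
        monitor_section("DP-3", mode, right_of="DP-2"),
        device_section(driver, ["DP-1", "DP-2", "DP-3"]),
    ]
    return "\n\n".join(sections) + "\n"


if __name__ == "__main__":
    print(xorg_layout())
```

The printed text would go into a file under the X server's config directory; swapping `driver="modesetting"` gives the variant imirkin suggests trying later.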
00:21 docmax: oh ok, maybe this makes the difference
00:22 imirkin: i gtg, good luck
00:22 docmax: https://hastebin.com/kocuyepejo.m
00:24 docmax: that doesnt work
00:24 docmax: still DP-1 ok, DP-2 DP-3 mirrored
00:24 imirkin: try modesetting? perhaps it has some support that's missing in the nouveau ddx.
00:26 docmax: no still the same
00:26 docmax: current config:
00:27 docmax: https://hastebin.com/newunemiso.m
00:28 docmax: i try to enable xinerama
00:29 docmax: still the same
00:29 docmax: damnit
00:30 karolherbst: docmax: I don't think you should have two Driver entries
00:30 docmax: there is just one?
00:31 karolherbst: Driver "nouveau" BusID "PCI:1@0:0:0" Driver "modesetting"
00:32 docmax: ok even removed doesnt make a difference
00:37 docmax: maybe i need screen and metamode?
00:47 docmax: i give up
00:48 docmax: the damn thing
01:17 rhyskidd: just doing some conforming of envytools' pciid against shipped products
01:17 rhyskidd: given it looks like there might be some new hw next month... :)
02:28 imirkin: docmax: well the config i pasted definitely works for me
11:54 BootI386: "Works for me"
12:17 ullbeking: Hello
12:18 ullbeking: Does anybody have any experience or reports on running the Sparkle Nvidia GT 520 with Nouveau?
12:19 ullbeking: It’s my first time experimenting with Nouveau, and searching found some vague references to it working, but idk whether that’s for both 2d and 3d, or what capabilities are supported.
12:32 karolherbst: ullbeking: well if you don't mind performance, acceleration and such should just work
12:33 karolherbst: Lyude: I have something nice to test regarding runpm: application offloaded + system suspend.
12:34 pendingchaos: I think video decoding acceleration requires one to extract the firmware from the blob though?
12:34 karolherbst: right, but that's the case for all GPUs in the end
12:34 karolherbst: I mean, when running nouveau
12:46 ullbeking: karolherbst: does this mean that even with nouveau, there’s no way to get a fully libre driver that supports all capabilities?
12:47 ullbeking: And if decoding acceleration doesn’t work without a blob, what’s the alternative blobless solution? Software acceleration?
12:49 karolherbst: ullbeking: well, usually your CPU is kind of fast enough
12:50 karolherbst: but somebody could try to write open firmware for video decoding
12:50 karolherbst: it simply wasn't important enough
12:50 karolherbst: (as we also would have to figure out what all those opcodes mean)
12:54 ullbeking: karolherbst: not important enough? Ok. Either I’m misunderstanding common use cases or I’m overestimating how much CPU decoding takes. I’m pairing this up with a fairly old board.
13:08 ullbeking: FYI, I’m planning on using this with a Gigabyte GA-G41M-ES2L, on a libre stack as much as possible.
13:18 karolherbst: ullbeking: well, there is the workaround in using the firmware from the prop driver. Sure it would be good to replace them with free ones, but we also have other issues we need to take care of
13:23 imirkin: ullbeking: there's always a way. sometimes that way involves you doing a ton of work :)
13:27 sigod: karolherbst do you think that the free firmware for gtx660s will ever be fixed?
13:31 karolherbst: sigod: if somebody figures out what the issue is, yes
13:54 pendingchaos: imirkin: do I have your reviewed-by for https://patchwork.freedesktop.org/patch/241141/ ?
14:00 imirkin: pendingchaos: R-b me
14:00 imirkin: you have push now right?
14:00 pendingchaos: yes
14:00 imirkin: excellent.
14:07 pendingchaos: and it's been pushed
14:08 pendingchaos: (with minor changes to the description)
14:16 imirkin: awesome - great job on tracking that down btw!
14:17 imirkin: i still wonder whether that's truly what's going on, but ... can't argue with results
14:31 karolherbst: imirkin, skeggsb_: can we use nouveau_device.vram_size for igpus as well? After reading the code it seems to be the case, just making sure I didn't miss anything
14:31 karolherbst: context: I need it for that patch: https://github.com/karolherbst/mesa/commit/bec714e273036fac89a58041b95057cc03cdd12c
14:32 karolherbst: and I guess I might add it to nv50 with capping to 1 << 32 as well
14:37 ullbeking: karolherbst: imirkin: I have no doubt about these things, and please, make no mistake, I wasn’t making a demand, nor did I have any expectations of you. I’ve got a lot of experience with audio signal processing but none with video. I’m simply trying to understand this project better, its capabilities, and I am a complete noob when it comes to GPUs.
14:38 karolherbst: ullbeking: well, the thing about those firmwares is that you just need to hack in some assembly code which works, but without knowing what the instructions are doing, what the input data is and what the output data should be, it is quite challenging ;)
14:38 ullbeking: Also, I’ve run open source projects in the past and I understand that it can sometimes be a thankless task, or that users can be unreasonably demanding.
14:38 karolherbst: well, every help is always welcomed :p
14:42 ullbeking: karolherbst: I’ll add it to my list of side projects ;-) First thing first, is to wait for this mainboard plus GPU to arrive and then do some experiments to get a working baseline. I’m new to firmware programming and RE, but it’s an important skillset in my long view :-)
14:46 ullbeking: The wiki page is nice, looks like a good intro to RE, that should be generalizable to other domains, no?
14:50 imirkin: ullbeking: no worries. no one felt it was interesting to invest the time to build firmware for this.
14:50 imirkin: because... what's wrong with the closed one
14:50 imirkin: not like it runs on the cpu
14:50 imirkin: and not like you can rid your life of firmware
14:50 imirkin: so ... why worry about it
14:52 * cliff-hm enjoys Intel CPU firmware updates :)
15:01 imirkin: ullbeking: note that the firmware doesn't actually decode the video ... it drives an internal engine that decodes the video
15:03 nyef: Do we know to what degree that firmware changes for different card versions?
15:04 karolherbst: nyef: the ISA changes
15:04 karolherbst: but maybe the engines also quite significantly
15:04 karolherbst: maybe not
15:05 nyef: And it's primarily various versions of falcon?
15:05 karolherbst: yeah
15:05 nyef: Thanks.
15:06 nyef: And, yeah, that goes on my list, but fairly well down.
15:09 karolherbst: ;)
15:10 nyef: Possibly even below the 3D glasses.
15:12 nyef: (Okay, to be fair, the 3D glasses are rising in priority due to the experience I'm getting in trying to figure out my sound card, since they both use 8051-type processors.)
15:20 imirkin: nyef: each vdec gen has different firmware, but that's about it
15:21 imirkin: nyef: i.e. all vp2 have same fw, vp3, etc
15:21 imirkin: for vp2/vp3/vp4 there's actually also "user" firmware which is supplied by the userspace application
15:21 imirkin: in vp5, that's baked in somewhere
16:19 ullbeking: imirkin: it’s interesting to me to know what IS interesting to the nouveau community then
16:20 ullbeking: I would have (naively) presumed that an open source or libre video decoding module would be considered extremely interesting.
16:21 ullbeking: But you’re saying that a lot of what happens inside the Nvidia GPUs is still essentially a black box..?
16:23 imirkin: not sure how to answer your question... nvidia gpu's are complicated systems-on-pci-board
16:23 imirkin: a lot is understood, a lot is not
16:23 nyef: ullbeking: Higher priority items include adding basic support for new hardware, fixing opengl bugs, improving the shader compiler, enabling reclocking on what hardware we can, fixing stability issues...
16:23 imirkin: unrelated to that, everything runs on firmware
16:23 imirkin: cpus, hard drives, everything
16:24 imirkin: whether that firmware sits on a rom on the board or is uploaded from a user's hard drive seems largely irrelevant to me
16:24 imirkin: others may have other priorities
16:26 nyef: For me, the hardware that I'd be interested in having video acceleration for isn't currently stable. There's no point in having video acceleration if dragging the playback window while a video isn't even playing can crash the card.
16:26 nyef: Oh, hell. I should've thought of this before.
16:26 nyef: imirkin: Is there some easy way to detect if an application is using multiple contexts?
16:27 imirkin: "stop dragging the window around!"? :)
16:27 nyef: But it's in the wrong place!
16:27 imirkin: use -geometry
16:27 HdkR: start it up as a fullscreen window? :P
16:27 HdkR: Can't be in the wrong place when it is covering the screen
16:28 imirkin: anyways... outside of a bug, there's no way for an application to use multiple contexts
16:28 nyef: HdkR: Sure it can. If it's fullscreen, I can't see the controls!
16:28 imirkin: vdpau and gl are shared
16:28 imirkin: 1 hw context == 1 nouveau_screen
16:28 imirkin: and we ensure that there's only one of those per process
16:28 nyef: No, no... The multiple GL context thing that screws up.
16:28 imirkin: oh that.
16:28 imirkin: are you using mpv?
16:28 imirkin: GL + vdpau = mega-fail
16:29 nyef: I'm using gxine. I have no idea if anything particular is going on beyond that.
16:29 imirkin: ok
16:29 nyef: And this is before I start the video playing.
16:29 imirkin: use mplayer.
16:30 nyef: Hrm. And it doesn't fail on pre-G200 Tesla, and it doesn't fail on Fermi. That's not going to be the multiple GL context thing, is it?
16:30 nyef: Damn.
16:31 imirkin: as i recall
16:31 imirkin: some nvaf's are just super flakey
16:31 imirkin: probably we don't init something
16:31 imirkin: there was a situation like that with nvaa/nvac
16:31 imirkin: except for the vast majority, they came up properly initialized
16:31 imirkin: so it took a long time to notice
16:31 imirkin: of course the ones that *weren't* properly initialized tended to be on macs...
16:32 nyef: Naturally.
17:01 karolherbst: imirkin: anyway, I have super simple test case to trigger our multithreading issues
17:03 ullbeking: Ok cool, thanks for the context and information. My questions or opinions may seem weird or out of place with the community, but I genuinely need to feel my way around a bit until I’m able to orientate myself. If I’m fortunate it might even lead me to being able to contribute one day.
17:06 ullbeking: imirkin: I realise that it’s impossible to run a completely blobless computer, but nevertheless operating within a libre context or mindset is important to me. Just because it’s practically impossible to run a 100% libre system doesn’t mean I should throw my hands up in the air and say “why bother”, and then quit doing anything to try to do my bit to shift computing in the direction I’d like to see it go.
17:12 nyef: "All progress depends on the unreasonable man", huh?
17:13 nyef: It's possible to run a completely blobless computer. But building one takes some doing.
17:13 RSpliet: nyef: Is it a blob if we know how to decode it?
17:14 nyef: Since you have to either run a CPU old enough and of a design such that it doesn't require a firmware blob, and then all of your system peripherals need similar...
17:14 HdkR: Is the source of a blob worthwhile when you still need it to be signed by a private key regardless? ;)
17:14 ullbeking: nyef: I get the impression that you are being a bit sarcastic, but I’m not sure. I’m only trying to explain my position and understand your project. Geez.
17:15 RSpliet: HdkR: depends on what you want. If it's about inspection, of course it is. If you want to run your own code... it gets a bit more complicated
17:16 HdkR: Inspection is important, documentation is equally important
17:16 RSpliet: HdkR, ullbeking, nyef: https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-source-nvidia-linux-nouveau/998310-nouveau-persevered-in-2017-for-open-source-nvidia-but-2018-could-be-much-better?p=998427#post998427
17:16 RSpliet: Mandatory reading
17:17 nyef: Will read.
17:20 RSpliet: on the more general level, my TL;DR is: a blob-free computer is impossible. Not everything in a computer is "auto-detect", OEMs make decisions in hardware design that need to be described somewhere (both in data format and bring-up code snippets) for correct operation. On the motherboard level this is described in the ACPI, for the GPU this info lives in the VBIOS. ARM sticks this in a combination of board-specific boot-loader, device tree and board-specific kernel builds (an unholy mess).
17:24 RSpliet: There's so many things you can simply label "blob" that it's become a useless term. There's data tables vs. code, initialisation code routines vs. persistent code routines, stuff that runs on your regular CPU cores vs. dedicated microcontrollers. Living in ROM vs. EEPROM vs your filesystem. Updatable or not...
17:24 karolherbst: in the end it makes no difference if you have hardware or software doing the exact same thing (ignoring some fundamental details, like hardware not patchable)
17:25 RSpliet: ^ that: some "blobs" aren't even visible, they might as well have been synthesised to hardware
17:25 karolherbst: the only difference which matter is really the patchable thing though
17:26 karolherbst: as you can say: if there is a way to use something open, I want to prefer that over non open stuff
17:26 karolherbst: and software can be kind of monitored when it runs at low enough privileges
17:28 karolherbst: also there are mainly two reasons to do things in hardware: 1. to hide stuff (like TPMs are doing) 2. to be significantly faster than software
17:28 karolherbst: and doing it in software is mainly done because it is easily replaceable
17:28 RSpliet: Inspectability can be important. Even if you can't upload your own firmwares, knowing what's running beats blind trust.
17:28 karolherbst: sure, but firmware == software
17:29 karolherbst: and you always know what software does, even if you don't know it on a high level
17:29 karolherbst: sometimes it is just quite hard to figure that out
17:29 karolherbst: but fundamentally you can always get to it
17:30 karolherbst: really interesting is runtime behaviour
17:30 karolherbst: and this is sometimes impossible to trace
17:31 RSpliet: Trace is not the best method in all cases... a "back door" is uncommon runtime behaviour, you'll expect never to be used except if you piss off the wrong people (... ;-) ). You might never encounter its existence using trace alone.
17:33 karolherbst: right
18:57 RSpliet: I'm just not too convinced about the existence of these alleged back doors in random hardware. Why hide one when you can market it as a "Management Engine"?
19:23 karolherbst: RSpliet: well, you usually hide those inside software
19:23 karolherbst: it's kind of proven, that cisco hardware was backdoored that way
19:23 karolherbst: sometimes they even intercepted shipments to upload modded firmwares
19:24 karolherbst: hardware backdoors need cooperation from the manufacturer, which is sometimes quite painful to deal with
19:25 karolherbst: and more expensive in general
19:25 karolherbst: but yeah, random hardware is kind of far fetched, although if you backdoor openssl (and heartbleed seems to be quite "proven" to be a placed backdoor), then all hardware which ships that is also backdoored in some sense
19:26 karolherbst: really hard to tell what "random" really means here
21:02 kernel-3xp: hi
21:07 kernel-3xp: i am trying to clock a quadro 4000 using nouveau but it says "not implemented", am i doing sth wrong?
21:09 kernel-3xp: https://www.irccloud.com/pastebin/WutqJvoF/
21:12 nyef: ... "angeforderte"?
21:13 kernel-3xp: requested
21:13 nyef: Ah.
21:13 nyef: Ah, a Fermi card?
21:14 kernel-3xp: quadro 4000
21:15 kernel-3xp: yeah fermi
21:16 nyef: The chart on the wiki CodeNames page says that that's an NVC0 / GF100.
21:16 kernel-3xp: yeah
21:17 nyef: Nouveau... might not support reclocking on Fermi yet. I'm not sure what the current reclocking situation is.
21:19 kernel-3xp: oh no...
21:20 kernel-3xp: thanks anyway
21:20 RSpliet: kernel-3xp, nyef: stalled. We can't change the memory clock on a Fermi
21:20 kernel-3xp: ok thanks :/
21:20 RSpliet: There's a whole load of code that implements the 90%, but until someone jumps in and takes it to 100%, it's unusable
21:20 RSpliet: It's the kind of thing that you either get right, or it doesn't work.
21:21 nyef: RSpliet: Is it just the memory clocks that are an issue, and is this something that one could do with one card, or does it need to be coordinated across multiple cards?
21:24 nyef: ... I see Trello cards for "Fixing Fermi engine reclocking" and "Fermi memory reclocking"...
21:27 RSpliet: nyef: I suspect that the patches Ben wrote successfully take care of other clocks, but don't take my word on it
21:27 RSpliet: As for memory reclocking, that's definitely a task that requires a *lot* of cards to test with. But one at a time of course
21:28 nyef: Hmm.
21:28 nyef: ... probably not this quarter for me, then.
21:28 kernel-3xp: pls do something :)
21:29 RSpliet: kernel-3xp: I'd really really love to, but I really have no time or energy left to spend on nouveau at this point
21:31 kernel-3xp: ok ok maybe later on
21:31 kernel-3xp: thx anyway
21:31 nyef: I've got a sound card that I'm trying to figure out, and then I want to look into some NVAF issues, and then... yeah. I'm thinking "not this quarter for me", but possibly by the end of the year, not that I'll necessarily get very far.
21:32 nyef: ... yeah, I have two Fermi cards, both the same type.
21:33 kernel-3xp: i have 2 quadro 4000 and 2 quadro 2000 which i need to replace now since i wanted to drop the proprietary one
21:33 RSpliet: nyef: by that time make sure to hit me up. I've done this thing before for second gen Tesla, I might be able to give you some pointers
21:34 RSpliet: Get you going a bit quicker
21:34 RSpliet: It's not that much different in the end really, just... a lot of tedious details to get right
21:35 nyef: ... So, speaking of gen2 Tesla, can we engine-reclock those yet?
21:35 RSpliet: memory reclock even
21:35 RSpliet: But only manually, not based on card load or other params
21:35 nyef: I'm on an MCP89. Nice try. d-:
21:35 RSpliet: I implemented that for MCP89 myself...
21:36 RSpliet: Wouldn't call that a second gen Tesla.. just... a quick hack-job of an IGP :-P
21:36 nyef: Oh, lovely.
21:36 RSpliet: Tested it on NVAA and NVAC, not necessarily NVAF
21:36 nyef: NVAF here.
21:37 nyef: I was given to understand that we don't reclock NVAF memory because it's actually system memory?
21:37 RSpliet: Yeah, if it's sysmem there's no clock to control from the GPU pov
21:37 RSpliet: But the other clocks still are
21:37 nyef: Right, hence my question.
21:38 RSpliet: skeggsb knows more about NVAF than I do, I think it wasn't quite like NVAA and NVAC. The latter two were really dodgy, didn't even support voltage changes :-P
21:38 nyef: Where are we in terms of automatic reclocking?
21:38 RSpliet: karolherbst played with it a bit, but it's hard to invest time in it if the "manual mode" is still only 99,97% reliable
21:39 karolherbst: I guess on laptops that's fine
21:39 karolherbst: but on desktop it could be quite annoying
21:39 RSpliet: it isn't flicker-free at the moment. Up to NVD9 we can make it flicker-free easily for single-monitor set-ups by synchronising it with VBLANK.
21:39 karolherbst: we kind of have to fix the flickering stuff on memory reclocking first
21:39 RSpliet: ^ that
21:40 RSpliet: For multi-monitor set-ups we can solve that by correctly configuring the linebuffer (essentially a cache for the pixel scan-out logic)
21:40 nyef: The flickering is the... linebuffer / ISO hub thing?
21:40 RSpliet: yep
21:40 nyef: Is that available back to Tesla?
21:40 RSpliet: It is
21:41 RSpliet: Even further back I suspect... like GeForce FX
21:43 HdkR: Is kepler reclocking fully supported?
21:43 RSpliet: I always thought it had to be kind of trivial. You have a full cache, and you set how to divide it over the two monitors. If one is disabled, you allocate all of it for the other monitor. Otherwise, you can simply go for a ratio equal to the difference in pixel clocks
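RSpliet's idea of dividing the linebuffer between two heads can be sketched as below. This is only an illustration of the arithmetic, reading "ratio equal to the difference in pixel clocks" as a split proportional to each head's pixel clock; the function name, the head names and the 1024-entry buffer size are hypothetical and do not describe actual nouveau code or hardware registers.

```python
def split_linebuffer(total_entries, pixel_clocks_khz):
    """Divide a linebuffer between heads in proportion to their pixel clocks.

    pixel_clocks_khz maps a head name to its pixel clock in kHz; a clock of
    0 (or None) means the head is disabled and receives nothing.
    """
    shares = {head: 0 for head in pixel_clocks_khz}
    active = {head: clk for head, clk in pixel_clocks_khz.items() if clk}
    if not active:
        return shares

    total_clock = sum(active.values())
    heads = sorted(active)
    remaining = total_entries
    # Give every head but the last its proportional share (rounded down),
    # then hand the remainder to the last head so nothing is left unused.
    for head in heads[:-1]:
        shares[head] = total_entries * active[head] // total_clock
        remaining -= shares[head]
    shares[heads[-1]] = remaining
    return shares


if __name__ == "__main__":
    # 148500 kHz is the standard pixel clock of a 1080p60 mode.
    print(split_linebuffer(1024, {"head0": 148500, "head1": 0}))
    print(split_linebuffer(1024, {"head0": 148500, "head1": 148500}))
```

With one head disabled the other gets the whole buffer, which matches the single-monitor case described above; two identical 1080p heads split it evenly.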
21:43 RSpliet: HdkR: Should work, but only statically
21:43 kernel-3xp: ok thx guys, have a nice day/night :)
21:43 RSpliet: (And first gen Maxwell too)
21:43 HdkR: ah
21:45 karolherbst: and in theory 2nd gen as well
21:46 HdkR: As long as you strip the pmu firmware from the proprietary blob :P
21:46 nyef: Yeah, I was about to ask about the PMU firmware. I guess that answers my question.
21:46 karolherbst: no
21:46 karolherbst: you don't need the pmu firmware
21:46 karolherbst: not technically
21:47 karolherbst: I mean, you kind of want to control your Fans somehow, but reclocking itself works without it
21:47 HdkR: ah
21:47 karolherbst: on laptops this doesn't matter as the fans aren't controlled by the GPU
21:48 RSpliet: Well... if you want flicker-free, I suspect you kind of need PMU to do the config of DRAM params.
21:48 karolherbst: right, but laptops have the advantage that it doesn't matter ;)
21:48 karolherbst: and for laptops that dynamic stuff matters kind of more anyway
21:49 RSpliet: Only laptops where no display is attached to the discrete card. And they do exist
21:49 karolherbst: okay sure, but on those the perf is miserable anyway
21:49 karolherbst: reverse prime really sucks perf wise
21:50 nyef: At which point, you may as well just use it as a compute engine? (-:
21:50 nyef: Okay, if I've got a possible missed initialization issue with my NVAF how would I go about tracking it down?
21:52 RSpliet: karolherbst: reverse prime? Why would you let intel render for the NVIDIA display?
21:52 karolherbst: RSpliet: ;) think about it
21:52 nyef: Hrm. Simple verification comes to mind: Bring the system up on the blob, take the blob down, bring nouveau up, and see if the problem still happens...
21:52 HdkR: Assume Intel doing the composition while nvidia is running for only a single application?
21:53 RSpliet: why composition if you're running a full-screen application...
21:53 HdkR: Since you don't need a dGPU in most instances
21:53 karolherbst: because the stack ain't that great?
21:53 RSpliet: Oh...
21:53 RSpliet: I hope AMD runs into the same problems then. They might fix it for us :-P
21:53 karolherbst: I tried it with some 4k monitors and glxgears, perf was... like a slideshow
21:54 karolherbst: and this was a desktop system with quite the capable CPU, not that it matters
21:54 RSpliet: Even still, that's all just broken stack stuff. Even if Intel does compositing, that shouldn't be a fraction of the compute time you need, so there must simply be a sync issue
21:55 karolherbst: PCIe is quite a bottleneck here
21:55 karolherbst: you get around +100% perf just by increasing the link speed from 2.5 to 8.0
21:55 karolherbst: but the perf increases slower than the pcie bandwidth
21:56 HdkR: Obviously we need PCIe 5.0 ;)
21:56 karolherbst: I assume it is a mix of PCIe bandwidth and oversyncing
21:56 RSpliet: Is nouveau rendering using all sorts of buffers in shared mem rather than VRAM?
21:56 karolherbst: no, things are simply getting copied to the intel GPU and back to the nvidia one
21:57 karolherbst: normally PCIe bandwidth doesn't matter all that much
21:57 RSpliet: final frames, but all the other stuff exists in VRAM? Then I don't see why PCIe should be a bottleneck
21:57 karolherbst: uhm
21:57 karolherbst: 60 fps and 4k images?
21:57 karolherbst: this is quite a lot
21:57 karolherbst: and with reverse prime + crappy stack you get 120 frames per second as the target
21:57 karolherbst: as you need to copy twice
21:58 karolherbst: normally the PCIe accounts for like 10% of perf on a normal laptop doing trivial prime offloading stuff
21:59 karolherbst: I mean, the PCIe bandwidth should be still enough for that
21:59 RSpliet: Surely there's like 30GiB/s of bandwidth, while you'd need only 2GiB/s to do 4K streaming (per monitor)
21:59 karolherbst: right
22:00 RSpliet: So... for dual monitor you'll be using 25% bandwidth. You'd have 75% of compute time left.
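The back-of-the-envelope numbers in this exchange can be checked with a few lines of arithmetic. The sketch below assumes a 3840x2160 frame at 4 bytes per pixel, which the log does not state, and uses karolherbst's point that reverse prime copies each frame twice; the function name is made up for illustration.

```python
GIB = 1024 ** 3


def scanout_bytes_per_second(width, height, bytes_per_pixel, fps, copies=1):
    """Bytes per second needed to move finished frames across the bus."""
    return width * height * bytes_per_pixel * fps * copies


# Assumed frame format: 4K UHD, 32 bits per pixel, 60 frames per second.
one_copy = scanout_bytes_per_second(3840, 2160, 4, 60)
# Reverse prime as described above: every frame crosses the bus twice.
two_copies = scanout_bytes_per_second(3840, 2160, 4, 60, copies=2)

if __name__ == "__main__":
    print(f"one copy per frame:   {one_copy / GIB:.2f} GiB/s")
    print(f"two copies per frame: {two_copies / GIB:.2f} GiB/s")
```

Under these assumptions a single 4K stream comes out just under 2 GiB/s, in line with RSpliet's estimate, and the double copy doubles that figure per monitor.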
22:00 karolherbst: yeah well
22:00 karolherbst: but that's not how reality is
22:00 karolherbst: something is completely screwed here
22:01 RSpliet: Not to mention that if your frame is finished in time, compute of the next frame overlaps with PCIe transfers. You'll just lose DRAM bandwidth on transfers slowing down rendering a bit (but not by 25%)...
22:02 karolherbst: right, but the fact remains, increasing PCIe bandwidth increases perf significantly
22:02 karolherbst: and that's not just going from 2.5 to 5.0
22:02 karolherbst: 5.0 to 8.0 also increases perf with the same ratio
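For reference, the 2.5, 5.0 and 8.0 figures are PCIe per-lane transfer rates in GT/s, and the raw bandwidth behind them can be worked out as below. The sketch assumes a 16-lane link, which the log does not specify, and ignores packet and protocol overhead, so real throughput is lower; the names are illustrative only.

```python
# Per-lane transfer rate in GT/s plus line encoding (payload bits, total bits).
# 2.5 and 5.0 GT/s links use 8b/10b encoding, 8.0 GT/s links use 128b/130b.
PCIE_LINK_SPEEDS = {
    "2.5 GT/s": (2.5, 8, 10),
    "5.0 GT/s": (5.0, 8, 10),
    "8.0 GT/s": (8.0, 128, 130),
}


def link_bandwidth_gb_per_s(gt_per_s, payload_bits, total_bits, lanes=16):
    """Raw one-direction bandwidth in GB/s (10^9 bytes), before protocol overhead."""
    bits_per_second = gt_per_s * 1e9 * payload_bits / total_bits * lanes
    return bits_per_second / 8 / 1e9


if __name__ == "__main__":
    for name, (rate, payload, total) in PCIE_LINK_SPEEDS.items():
        print(f"{name}: {link_bandwidth_gb_per_s(rate, payload, total):.2f} GB/s")
```

Each step roughly doubles the raw per-direction bandwidth, so a performance gain that tracks link speed without quite matching it fits karolherbst's description.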
22:02 karolherbst: I am not saying that the PCIe bandwidth isn't enough
22:03 karolherbst: I am just blaming the architecture generally and maybe oversyncing or something
22:03 RSpliet: Sure, but that doesn't mean we can shrug and lean back saying PCIe is the bottleneck, because there's a million things "we" are doing that we shouldn't be doing that causes it to tank on PCIe bw ;-)
22:03 karolherbst: might be
22:03 karolherbst: anyway
22:03 RSpliet: If the blob does better, then definitely
22:03 karolherbst: if somebody got time to investigate this, it would be helpful
22:03 RSpliet: Yeah... about that time thing :-C
22:03 karolherbst: :p
22:04 karolherbst: I mean, we have some games which are clearly PCIe bottlenecked
22:04 karolherbst: which is a bit weird, but I guess that's basically our fault
22:15 Lyude: karolherbst: oh?
22:38 karolherbst: Lyude: yeah, dunno if that's always the case, but I had a gdb session open and suspended my laptop
22:38 karolherbst: on resume the GPU was basically screwed
22:38 Lyude: i mean
22:38 karolherbst: _might_ be related to secboot though
22:38 Lyude: RPM means nothing in the face of a real suspend
22:38 Lyude: the suspend core grabs a RPM ref for each device before suspending it, and then suspends regardless of what the usage count is
22:39 Lyude: so keeping a device alive through RPM isn't going to stop your laptop from suspending-which is the only way you'd be able to keep the GPU alive
22:41 karolherbst: yeah, probably
22:41 karolherbst: I am not quite sure what the real issue here is anyway
22:41 karolherbst: I think skeggsb_ knows more?
22:42 karolherbst: might be simply secboot screwing up, dunno