00:00r3m: Hi, I would like to use Reverse PRIME. Will the pixels on screens connected to the integrated graphics card be rendered by the NVIDIA GPU or the Intel one? I found contradictory explanations on the web: some say the integrated graphics card does the rendering, while others say everything (all screens, on both GPUs) is rendered by the discrete GPU (NVIDIA).
00:01imirkin: r3m: there's no single correct answer to that question
00:02imirkin: it can be configured either way
00:02imirkin: within the current infrastructure, all rendering always happens on a single gpu
00:02imirkin: but it can be either one.
00:03imirkin: prime = using secondary gpu to render to primary gpu's outputs
00:03r3m: imirkin: how do I configure this so that the screens connected to the integrated graphics card are rendered by that card?
00:03imirkin: reverse prime = using primary gpu to render to secondary gpu's outputs
00:03imirkin: can you tell me more about your setup?
00:03r3m: is the primary gpu always the integrated card?
00:04imirkin: it can be either one, depending on configuration
00:04imirkin: but in the majority of cases, igp is primary
00:04imirkin: and screens are connected to igp
00:05r3m: imirkin: I have a GTX 970 with 4 HD monitors, and I want to add two more monitors (using the integrated card, because I have no more ports on the discrete GPU). I would prefer the rendering for those ports to be done by the integrated graphics card, because my GTX already drives four HD monitors and I think 6 screens may be too much for it to handle.
00:05imirkin: for a particular application, rendering is done by a single gpu
00:05imirkin: irrespective of what screen it ends up on
00:05r3m: oh ok
00:05imirkin: you can pick which one
00:06imirkin: but it can only be one.
00:06joepublic: does that card support more than 4 monitors despite only 4 connectors?
00:06r3m: joepublic: no, 4 heads
00:06imirkin: nvidia GPUs only support 4 crtc's on kepler+
00:06imirkin: (2 on earlier gpu's)
00:06imirkin: however you can have an infinite number of remote slaved outputs
00:07imirkin: (subject to various pci bandwidth constraints)
00:07imirkin: you're not scanning out the fb, so you don't need the crtc's
00:07imirkin: [well, there's also a max fb size which can limit these things too]
00:08r3m: imirkin: maybe a stupid question, if reverse prime is the opposite of prime, why not just change the primary card instead
00:09imirkin: well, primary card determines where the fb lives
00:09imirkin: you really want the fb to live close to where the scanout is happening
00:09imirkin: reverse prime has various tearing implications as well
00:09imirkin: since you end up having to make async copies
00:09HdkR: (The 4 CRTC limit destroys me)
00:10imirkin: HdkR: just get an AMD board :)
00:10HdkR: Let me know when AMD creates a card that can compete with a Titan RTX :P
00:10imirkin: why not get 2
00:10r3m: imirkin: thanks for all this info. Just one last question before I let you go: will 6 full HD monitors (4 on NVIDIA, 2 on integrated), all rendered by my 970, be too much? I don't plan on gaming (or at least, when I play, it is just on one screen).
00:11imirkin: r3m: one way to find out =]
00:11imirkin: depends on screen sizes, right?
00:11imirkin: 6 1x1 pixel screens? sure!
00:11HdkR: imirkin: My gaming/VR rig already has two 2080ti D:
00:11imirkin: 6x 4k? might be a bit much.
00:11r3m: 6 1920x1080
00:11imirkin: i know a user who uses 3x 4k screens on a GTX 1060
00:11imirkin: and he seems happy enough. single GPU though
00:12imirkin: the remote aspect makes things more annoying
00:12imirkin: 6x 1080p *seems* like it ought to be fine
00:12imirkin: but ... don't quote me on that
00:12HdkR: 6x1080p would probably be pretty easy
00:12imirkin: HdkR: with reclocking sure
00:12r3m: thanks guys!
00:12imirkin: but without...
00:12HdkR: only 1.5 of a 4k panel
00:12imirkin: could be a stretch
00:12HdkR: ah, without reclocking, hmm
00:12imirkin: the 1060 is also without reclocking :)
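The "1.5 of a 4k panel" figure above can be checked with plain pixel arithmetic. The sketch below assumes "4k" means 3840x2160 and "1080p" means 1920x1080; the helper name `pixels` is made up for illustration. It only compares raw pixel counts and says nothing about PCI bandwidth, clocks, or the remote (slaved) outputs discussed above.

```cpp
#include <cstdint>

// Total pixel count for `count` identical screens of width w and height h.
constexpr std::uint64_t pixels(std::uint64_t w, std::uint64_t h, std::uint64_t count) {
    return w * h * count;
}

// Assumed resolutions: 1080p = 1920x1080, 4k = 3840x2160.
constexpr std::uint64_t six_1080p = pixels(1920, 1080, 6);  // 12,441,600
constexpr std::uint64_t one_4k    = pixels(3840, 2160, 1);  //  8,294,400
constexpr std::uint64_t three_4k  = pixels(3840, 2160, 3);  // 24,883,200

// Six 1080p screens carry exactly 1.5x the pixels of one 4k panel...
static_assert(six_1080p * 2 == one_4k * 3, "6x 1080p == 1.5x 4k");
// ...and half the pixels of the 3x 4k setup mentioned for the GTX 1060.
static_assert(three_4k == six_1080p * 2, "3x 4k == 2x (6x 1080p)");
```

By pixel count alone, 6x 1080p is a lighter load than the 3x 4k single-GPU setup, which is consistent with the cautious "seems like it ought to be fine" above.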
00:14HdkR: Sad time
00:14imirkin: yes. nvidia is sad.
00:15imirkin: amd lets you have 6 CRTC's
00:15imirkin: and open-source, supported drivers
00:15r3m: so you both think 6 full hd is fine without reclocking?
00:15HdkR: Without reclocking is dicey, someone would need to try it
00:16HdkR: There's a reason why the proprietary blob starts upclocking with multi-monitor arrangements, without ever going to "idle"
00:16imirkin: the reason is you can't do flicker-free clock changes
00:16imirkin: when you have multi-monitor
00:18HdkR: Maybe they should fix that ;)
00:19imirkin: used to do phase-locked crtc's
00:19imirkin: back in the bad old days
00:21r3m: thanks again imirkin HdkR I'll try as you both said this is the only way to find out! ;) bye
09:42RSpliet: imirkin: you can do flicker-free clock changes, as long as the GPU has its line buffers filled with pixels for ~150μs of scan-out.
09:42RSpliet: _We_ just can't do it :-)
10:53imirkin: RSpliet: oh right...
13:13imirkin_: skeggsb: so ... any better ideas? i don't see the address referenced in the last batch
13:13imirkin_: (hm, there's a small chance that there's buffering going on... i'll double-check that)
13:14imirkin_: err uses fprintf(stderr) -- does that buffer when redirected to a file? it might...
13:14imirkin_: i know with tty it'll be line-buffered
13:14imirkin_: but for !tty, it might just do regular buffering. i forget =/
13:21imirkin_: looks like it's unbuffered by default. so yeah. no reference in the last batch
13:21imirkin_: or any batch from the test that hangs
13:22imirkin_: (and the test before passes)
13:22imirkin_: and the only reference to that address that i can see is the launch descriptor. setting it to 0 right after compute launch doesn't change the final address that there's an error on.
13:23imirkin_: skeggsb: also i wonder if there's something instructive from the fact that it's a PDE and not PTE error? or does that just mean that the PDE isn't there for that range?
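On the stderr buffering question above: when debugging a hang, one way to avoid depending on the default buffering mode is to set it explicitly and flush after each message. This is a minimal sketch using standard `<cstdio>` calls; the helper names `make_stderr_unbuffered` and `log_line` are hypothetical and not taken from the code being debugged.

```cpp
#include <cassert>
#include <cstdio>

// Explicitly request unbuffered stderr instead of relying on the default.
// Should be called before any other output on the stream.
// Returns true if the C library accepted the request.
bool make_stderr_unbuffered() {
    return std::setvbuf(stderr, nullptr, _IONBF, 0) == 0;
}

// Write one line and flush immediately, so the text has been handed to the
// OS even if the process hangs right afterwards.
void log_line(std::FILE* out, const char* msg) {
    assert(out != nullptr);
    std::fprintf(out, "%s\n", msg);
    std::fflush(out);
}
```

With either measure in place, a missing line in a redirected log is not explained by stdio buffering, which matches the conclusion reached above.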
15:45karolherbst: imirkin_: mind reviewing the RA patches or shall I just run some games affected by spilling on kepler and see if everything works out and just push it?
15:45karolherbst: mhh, although potentially it can break everything
15:48imirkin_: karolherbst: certainly try to run some games
15:49karolherbst: yeah.. well at least a full shader-db run didn't hit any memory corruptions anymore. I am mainly concerned about wrongly compiled shaders (even those which don't spill)
15:49imirkin_: iirc crysis hit some RA issues
15:49imirkin_: at 64 regs, at least
15:49karolherbst: that was RA being dumb, no?
15:49imirkin_: certainly not being smart
15:49karolherbst: I guess that's the other issue I was talking about
15:49imirkin_: iirc it was a ton of textures
15:49karolherbst: or maybe a third one
15:50imirkin_: so it actually did burn a lot of registers
15:50karolherbst: but it failed to spill...
15:50karolherbst: or... it tried
15:50karolherbst: the other issue is, where RA thinks it has to spill, but actually doesn't have to
15:50imirkin_: don't think it was that
15:50karolherbst: and some values are marked as unspillable
15:51imirkin_: could have been the unspillable thing.
15:51karolherbst: would be the same issue then
15:51karolherbst: it happens when the weight is inf
15:52imirkin_: tbh i don't remember
15:52karolherbst: fun fact, the TGSI I have to reproduce this error, doesn't even hit 20 regs
15:53karolherbst: but it has like 15 single component values
15:53karolherbst: and then RA can't allocate a vec4 anymore
15:54karolherbst: it happens when you have nodes like that: weight inf, deg 68/60
15:54karolherbst: weight inf -> can't be spilled
15:55karolherbst: I am actually wondering if we should ignore that part
15:55karolherbst: what's the worst that could happen, right?
15:56karolherbst: this is the result if I remove that isinf check in the spiller: https://gist.githubusercontent.com/karolherbst/ce1d5a7b7a1416e59dbb04df7127ac9d/raw/17fae47c5e650d5ba76dfcbc4a435a81e8ddddb8/gistfile1.txt
15:56karolherbst: ohh well, it uses 20 regs
15:58karolherbst: mhh, maybe I will check if the crysis bug disappears as well when I do that
15:58karolherbst: imirkin_: do you know if that was with d3d9 or wined3d?
16:00imirkin_: i have a crysis.tgsi at home
16:00karolherbst: ahh, that would be helpful
16:00karolherbst: I am sure I have one _somewhere_ as well
16:01imirkin_: i'll try to remember to send it tonight
16:04karolherbst: imirkin_: do you know where the bug is?
16:04imirkin_: what bug?
16:04karolherbst: the crysis one? or was there never a bug created for it?
16:04imirkin_: oh, no idea
16:04imirkin_: try searching for "crysis" :)
16:04imirkin_: i've had it for a _very_ long time
16:04imirkin_: like ... probably 5y
16:05karolherbst: well.. I didn't find anything :D
16:05karolherbst: we have one for civ 4 though
16:10karolherbst: imirkin_: https://github.com/iXit/Mesa-3D/issues/232
16:12karolherbst: _fun_.. it doesn't fail to compile on my RA branch
16:12karolherbst: and it fails on master
16:13karolherbst: and no memory corruption
16:20karolherbst: imirkin_: guess my RA fix does fix the cleanup and the crysis shader compiles on the second try
16:21karolherbst: uhm.. RA succeeds on the second try
16:22imirkin_: so the system works :)
16:22karolherbst: I guess so
16:22karolherbst: the diff looks reasonable
16:23karolherbst: but.. oh well
16:23karolherbst: will test it on monday then
16:27karolherbst: imirkin_: the main thing I don't like about my newest patch is that I am creating that many std::list objects :/ but I think I can create a full copy, manage the list in the wrapper class and replace the full list in the Value after RA succeeded... that should work.. unless you have a better idea?
16:27imirkin_: i saw you were doing that ... and also returning them by value
16:28imirkin_: instead of having a std::list<>&& thing
16:28imirkin_: probably doesn't immensely matter ... anyways, will look.
16:28imirkin_: and think.
16:30karolherbst: "return std::move(obj)" is a bad thing to do :p
16:30karolherbst: prevents copy elision
16:30imirkin_: i haven't kept up on all the latest c++ things
16:31karolherbst: when optimized, the std::list object gets constructed directly in the callers stack
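The point about `return std::move(obj)` can be made visible by counting constructor calls. This is a minimal sketch; `Tracked`, `Counters`, and the factory functions are hypothetical names, not code from the patch. The exact number of moves for the plain `return` depends on the compiler and language version, because eliding the copy of a named local is permitted but not required. Some compilers also warn about the `std::move` form, which is the behaviour being illustrated.

```cpp
#include <cassert>
#include <list>
#include <utility>

struct Counters { int copies = 0; int moves = 0; };

// Wraps a std::list and records how often it is copied or moved.
struct Tracked {
    Counters* counters;
    std::list<int> data;
    explicit Tracked(Counters* c) : counters(c) {}
    Tracked(const Tracked& o) : counters(o.counters), data(o.data) { ++counters->copies; }
    Tracked(Tracked&& o) : counters(o.counters), data(std::move(o.data)) { ++counters->moves; }
};

// Plain return by value: the compiler may construct `t` directly in the
// caller's storage, and otherwise falls back to a move, never a copy.
Tracked make_plain(Counters* c) {
    Tracked t(c);
    t.data.push_back(1);
    return t;
}

// return std::move(t): forces a move construction and blocks that elision.
Tracked make_moved(Counters* c) {
    Tracked t(c);
    t.data.push_back(1);
    return std::move(t);
}

// Usage: compare the two return styles.
void check_return_styles() {
    Counters plain, moved;
    { Tracked a = make_plain(&plain); (void)a; }
    { Tracked b = make_moved(&moved); (void)b; }
    assert(plain.copies == 0 && moved.copies == 0);  // neither deep-copies the list
    assert(moved.moves >= 1);                        // std::move always pays a move
    assert(plain.moves <= moved.moves);              // plain return is never worse
}
```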
16:31imirkin_: there's a new "final" value type of thing though
16:31imirkin_: which is foo&&
16:31imirkin_: which is used for return values
16:31karolherbst: foo&& only makes sense as function args
16:31imirkin_: which enables the opt you're talking about
16:31imirkin_: i thought it only made sense for return types
16:31karolherbst: the other way around
16:31imirkin_: i haven't looked at (recent) c++ closely though
16:32karolherbst: if you pass something inside a function with a && arg, your original value is considered "broken"
16:32karolherbst: because the && allows the callee to move values in
16:32karolherbst: instead of copying
16:32karolherbst: like pointers
16:32karolherbst: so you don't have to deep copy anymore
16:32karolherbst: you just copy the pointer once
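A small sketch of what a `&&` parameter means for the caller, as described above. `Holder` and `adopt_three` are made-up names for illustration. After the call, the source list is in a valid but unspecified state, so the example does not inspect it.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>

class Holder {
public:
    // The && parameter signals that the callee may steal the argument's
    // contents instead of deep-copying them.
    void adopt(std::list<int>&& incoming) { items_ = std::move(incoming); }
    std::size_t size() const { return items_.size(); }
private:
    std::list<int> items_;
};

std::size_t adopt_three() {
    std::list<int> src{1, 2, 3};
    Holder h;
    // h.adopt(src);           // would not compile: an lvalue does not bind to &&
    h.adopt(std::move(src));   // caller explicitly gives up src's contents
    // src is now valid but unspecified; treat its old contents as gone.
    assert(h.size() == 3);
    return h.size();
}
```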
16:34karolherbst: returning && causes crashes in the best case. In the worst case, hard-to-debug problems
16:34imirkin_: prvalue vs xvalue
16:35imirkin_: i need to do more reading.
16:35imirkin_: i think a && on a return type makes it into an xvalue
16:35imirkin_: which is actually what you want
16:35karolherbst: nope, it's broken code
16:35karolherbst: the stack is already gone
16:36karolherbst: so the object is already in freed memory
16:36imirkin_: that's not what the docs i'm reading suggest
16:36karolherbst: && is a hint for the compiler to be able to use move constructors or function overloads with && parameters
16:36karolherbst: that's essentially it
16:36imirkin_: i see && only used for return types
16:36imirkin_: e.g. https://stackoverflow.com/questions/3601602/what-are-rvalues-lvalues-xvalues-glvalues-and-prvalues
16:36imirkin_: so i think there's a bit more going on here
16:37karolherbst: it's wrong
16:37karolherbst: it's really that simple
16:38karolherbst: those examples only work by luck
16:38imirkin_: i'm really pretty sure that foo&& as a return type is highly defined and used in a lot of places (with c++11 or 14 or whatever)
16:39karolherbst: it works with temporaries only essentially
16:39karolherbst: return std::list(...) // yes
16:39karolherbst: std::list tmp; return tmp (as &&) <-> broken
16:40karolherbst: the problem starts when you start to return values on the stack as &&
16:40karolherbst: and then it doesn't matter if you use && or return by value
16:40imirkin_: the latter example won't work.
16:40karolherbst: as the compiler produces the same code (more or less)
16:40imirkin_: returning a stack value as && = fail
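The two cases both sides agree on here can be sketched as follows. Returning a local by value is fine; returning a local as `&&` dangles (left commented out below). A `&&` return type is only safe when it refers to an object that outlives the call, for example a member of an rvalue object. `Box` and the function names are hypothetical and are not taken from the linked Stack Overflow answers.

```cpp
#include <cassert>
#include <list>
#include <utility>

// OK: return the local by value; it is moved or elided into the caller.
std::list<int> make_list_ok() {
    std::list<int> tmp{1, 2, 3};
    return tmp;
}

// BROKEN (kept as a comment on purpose): the reference refers to `tmp`,
// whose storage is gone once the function returns.
// std::list<int>&& make_list_broken() {
//     std::list<int> tmp{1, 2, 3};
//     return std::move(tmp);
// }

// One of the few places a && return type can make sense: a member function
// callable only on rvalues, returning a reference to a member that lives as
// long as the object itself.
struct Box {
    std::list<int> items;
    std::list<int>&& take() && { return std::move(items); }
};

std::list<int> take_from_temporary() {
    // The temporary Box lives until the returned list has been
    // move-constructed, so nothing dangles here.
    return Box{std::list<int>{4, 5}}.take();
}

void check_rvalue_returns() {
    assert(make_list_ok().size() == 3);
    assert(take_from_temporary().size() == 2);
}
```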
16:41karolherbst: thing is, using && as the return value only makes sense in so few cases that if somebody uses it, it's probably wrong or not needed
16:43karolherbst: imirkin_: https://stackoverflow.com/a/5770888 is one of them
16:44karolherbst: there it's actually useful.. but.. uff
16:44karolherbst: you can imagine the headache
21:02karolherbst: imirkin_: https://github.com/karolherbst/mesa/commit/1a1fcf111509e79c0cfedb61e61faab1a82e23f4
21:03karolherbst: mhh, I should update some variables to also have the &
21:03imirkin_: i'll look in detail over the weekend
21:04imirkin_: if i have time =/
21:05karolherbst: okay :) I think I am finally acceptably happy about the patch
21:06imirkin_: and a lot less horrid than the original attempt, hopefully
21:06karolherbst: well.. at least the shader is always in a "non broken" state
21:07imirkin_: and is therefore working :)
21:08karolherbst: well.. the original idea is still the same. If we delete an instruction (when spilling) we have to clean up all references to its defs
21:08karolherbst: that part hasn't changed
21:11karolherbst: uff.. I found a bug