02:20ripdog: does anyone here have a gtx 970? I've never had nouveau work for me once, I either end up with a flashing black screen, a plain black screen or some stale boot messages on one screen and nothing else. apparently nouveau supports the 970. any ideas? proprietary drivers work fine. I've tested fedora and arch
03:57Hurderion: So... hi? (:
08:49karolherbst: mhhh, apitraceing wine again, this is gonna be fun
09:00karolherbst: sigh... I guess I need to apitrace it in wine, and then apitrace retracing that
10:14karolherbst: wine uses glXSwapBuffersMscOML...
10:30mooch2: mwk, how do you change the size of ramin again on nv20?
10:38mwk: uh? you don't
10:39mwk: it's always 1MiB
10:49mooch2: oh okay
10:50mooch2: i fixed the bug anyway tho, thanks
12:36mooch2: god, i wish somebody would help me with my quest to emulate nv3
13:16karolherbst: nice a broken opt
13:17karolherbst: makes it easy to debug
13:21karolherbst: and it isn't algebraicOpt, weird
14:31karolherbst: okay, the ConstFolding::opnd OP_MUL case for "if (!i->postFactor && (imm0.isInteger(1) || imm0.isInteger(-1)))" has a bug
14:43karolherbst: a mul with saturate got into this
14:48mooch2: mwk: how am i supposed to figure out where nt4 loads the damned nv3 drivers?
14:51mwk: mooch2: at this point you know more about nt4 drivers than me, I'm afraid
14:52karolherbst: mwk, imirkin_: any idea if saturate is supported on OP_MOV? I could imagine there are several passes which just convert X -> OP_MOV and don't care about saturation...
14:52mwk: karolherbst: what ISA?
14:52mwk: mov cannot saturate, you need one of the cvts
14:52karolherbst: well yeah, I wrote a patch doing a OP_SAT
14:53karolherbst: I am more wondering if more passes are broken due to this
14:53mooch2: because i'm dealing with a situation where all i have is the EIP where things go wrong, and no disassembly
14:54karolherbst: mwk: especially, because we have tons like "i->op = i->src(t).mod.getOp()"
14:57karolherbst: mwk: things like OP_ABS and so on are able to saturate?
14:58karolherbst: or is it a simple cvt?
15:13karolherbst: do you think it might make sense to add an assert on emitMOV, so that we don't end up emitting movs with saturation flags?
15:24mwk: karolherbst: I've never touched the nv50 compiler
15:24mwk: I only know the hw instructions
15:27karolherbst: I see
15:37RSpliet: karolherbst: well, as long as you keep in mind that assert()s are for developers only, that's absolutely a good idea!
15:37karolherbst: currently checking on which stable branches to push this on
15:37karolherbst: because this issue is quite old
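A minimal sketch of the folding bug discussed above, using a toy IR rather than the real nv50_ir classes. The names `Instr`, `Op`, `foldMulByOne` and `evaluate` are hypothetical and exist only for illustration. The idea: when `mul x, 1` carries a saturate flag, rewriting it to a plain mov drops the clamp (mov cannot saturate), so the saturating case is turned into an explicit sat instead, and the evaluator asserts that no mov ever carries the flag, in the spirit of the emitMOV assert suggested in the log.

```cpp
#include <algorithm>
#include <cassert>

// Toy stand-ins for illustration; these are not the real nv50_ir classes.
enum class Op { Mul, Mov, Sat };

struct Instr {
    Op op;
    float src;      // the non-immediate source value
    float imm;      // the immediate operand (only meaningful for Mul)
    bool saturate;  // clamp the result to [0, 1]
};

static float clamp01(float v) { return std::min(1.0f, std::max(0.0f, v)); }

// Fold "mul x, 1". Rewriting to a plain Mov while leaving the saturate flag
// set would yield an instruction that cannot saturate, so the saturating
// case becomes an explicit Sat instead.
Instr foldMulByOne(Instr i) {
    if (i.op != Op::Mul || i.imm != 1.0f)
        return i;                // not the pattern handled here
    i.op = i.saturate ? Op::Sat : Op::Mov;
    i.saturate = false;          // Sat clamps by itself; Mov never clamps
    return i;
}

// Reference evaluator; the assert plays the role of a check in emitMOV.
float evaluate(const Instr &i) {
    switch (i.op) {
    case Op::Mul:
        return i.saturate ? clamp01(i.src * i.imm) : i.src * i.imm;
    case Op::Sat:
        return clamp01(i.src);
    case Op::Mov:
        assert(!i.saturate && "mov cannot saturate");
        return i.src;
    }
    return 0.0f;
}
```

With this sketch, folding a saturating `mul 2.5, 1` yields a sat that evaluates to 1.0, while the non-saturating variant becomes a mov that evaluates to 2.5. As RSpliet notes, the assert only helps in developer builds.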
19:12karolherbst: crazy, ubershaders in dolphin make games run like super bad under nouveau
19:12tobijk: big shader, spilling everywhere: nouveau is not good at scheduling: there you go?!
19:14karolherbst: I am sure I get more perf out of them quite easily
19:16tobijk: mh i havent seen those shaders, so dunno...
19:17karolherbst: should run on nvidia to compare things
19:18karolherbst: and they messed with the libGL loading, cause bumblebee doesn't work anymore
19:18tobijk: karolherbst: depends, maybe your distro uses glvnd now?
19:19tobijk: not sure what bumblebee would work in that case
19:19karolherbst: no I don't have glvnd yet
19:20karolherbst: I am sure they just tried to be smart and messed up
19:20karolherbst: like using hard coded paths or something
19:20tobijk: heh :)
19:21karolherbst: git bisect it is then
19:30tobijk: mh i should finally start debugging civ6 and nouveau
19:31karolherbst: turns out, dolphin defaults to EGL
19:32tobijk: mh egl always defaults to intel for me
19:32tobijk: but i have never cared actually
19:32karolherbst: bumblebee can't do EGL
19:32karolherbst: what's wrong with civ6?
19:33karolherbst: kind of feeling like fixing bugs today
19:33tobijk: dunno yet, it just segfaults somewhere in nouveau
19:33tobijk: the mesa part
19:33karolherbst: try to run it with that offloaded gl thread thingy
19:34tobijk: with? :O
19:37karolherbst: tobijk: mesa_glthread=true
19:38tobijk: karolherbst: yeah will try soon, but first i'm getting apitrace compiled again :)
19:39karolherbst: okay, ubershaders with nvidia are pretty fast
19:41tobijk: karolherbst: mh make sure you are really hitting nouveau
19:41karolherbst: I do
19:50tobijk: karolherbst: fyi: Civ6: segfault at 54 ip 00007f8296367d20 sp 00007f828d7909f8 error 4 in nouveau_dri.so[7f8295e84000+9f8000]
19:50tobijk: with glthread
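The kernel segfault line above already narrows down where the crash is: subtracting the mapping base from the instruction pointer gives an offset into nouveau_dri.so. A small sketch of that arithmetic follows; `offsetInMapping` is a hypothetical helper, and treating the printed base as the library's load base is an assumption that may not hold for every mapping layout.

```cpp
#include <cstdint>

// Returns true and writes the offset if ip lies inside [base, base + size).
// Assumption: base is the start of the mapping printed by the kernel,
// e.g. "nouveau_dri.so[7f8295e84000+9f8000]".
bool offsetInMapping(std::uint64_t ip, std::uint64_t base,
                     std::uint64_t size, std::uint64_t &offset) {
    if (ip < base || ip - base >= size)
        return false;            // ip is outside this mapping
    offset = ip - base;
    return true;
}
```

For the values in the log (ip 0x7f8296367d20, base 0x7f8295e84000, size 0x9f8000) the offset comes out as 0x4e3d20, which could then be looked up in a build of the library that has debug symbols.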
19:51karolherbst: meh :(
19:52karolherbst: stacktrace pls
19:53tobijk: karolherbst: yeah i'm going to capture a apitrace
19:53karolherbst: won't help
19:53karolherbst: maybe it's my fault
19:53karolherbst: but capturing crashes never worked out for me
19:54karolherbst: maybe because they can't record the last call?
19:54karolherbst: especially if it's crashing
19:54tobijk: karolherbst: i will capture it with intel, where it semi works
19:54karolherbst: why don't you simply run it within gdb?
19:55karolherbst: but yeah, recording with intel might help
19:55tobijk: karolherbst: so i can simply run it later on easily
19:55karolherbst: nouveau does some heavy spilling in most ubershaders
19:56tobijk: karolherbst: so my wild guess was actually right ;-)
19:56karolherbst: no clue
19:56karolherbst: I don't see why it should result in such terrible perf
19:57karolherbst: one BB literally starts with nearly a hundred of phis
19:58karolherbst: I like this part especially: https://gist.github.com/karolherbst/f50ce874f51b64b37abb55ba60ec14f4
19:59karolherbst: 4345 and 4346
19:59karolherbst: makes sense
20:00karolherbst: do you still think spilling is the issue? :D
20:01karolherbst: I mean it is somehow caused by spilling
20:01karolherbst: but the heck
20:02karolherbst: "1968: phi u32 %r9918 %r9905 %r15607 %r15525 %r15407 %r15393 %r15348 %r15055 %r14795 %r14784 %r14773 %r14762 %r14751 %r14740 %r14729 %r14718 %r14707 %r9916 (0)" :D
20:02tobijk: mh would be nice to see those shaders in a precompiled state as well
20:03tobijk: plain gl shaders i mean
20:03karolherbst: they emulate hardware
20:03karolherbst: what do you expect?
20:03karolherbst: tons of switches
20:03karolherbst: big uniforms
20:04karolherbst: I think I will write a pass which at least opts those silly load/store pairs away
20:04karolherbst: this should give us a decent improvement
20:05tobijk: yeah, how many more of them are there btw?
20:05karolherbst: too many
20:06karolherbst: "RASpillingUnfuck" sounds like a good name to me
20:07karolherbst: but maybe I should just fix the spill code ...
20:07karolherbst: no clue where to actually look for this
20:08tobijk: karolherbst: there should be a spillcodeinserter method somewhere
20:08tobijk: ah it actually is a whole class
20:19tobijk: meh, why does apitrace not trace :/
20:43karolherbst: tobijk: total instructions in shared programs : 161535 -> 139257 (-13.79%)
20:44tobijk: care to show the changes?
20:44karolherbst: it's a hack
20:44karolherbst: we should fix the spilling code
20:45karolherbst: but I wanted to have a dirty pass to at least show how much it messes up
20:45karolherbst: no checking if it runs better
20:45tobijk: how much is it with a shaderdb run without dolphin included?
20:45karolherbst: no clue
20:45karolherbst: not much I figure
20:46tobijk: mhm somehow that spilling thing makes me wonder if we regressed that at some point
20:47karolherbst: anyhow, I messed up
20:47karolherbst: too many things I didn't respect
20:48tobijk: karolherbst: yeah thats always the problem with the spilling, its hard to get right :/
20:55tobijk: ERROR: no viable spill candidates left
20:55tobijk: ERROR: no viable spill candidates left
20:56tobijk: ERROR: no viable spill candidates left
20:56tobijk: nvc0_program_translate:610 - shader translation failed: -4
20:56tobijk: karolherbst: --^ Civ6 :D
20:57karolherbst: I guess I can't really remove that load for now
20:58tobijk: how were you removing the LD?
20:59tobijk: check if ->prev is a ST?
20:59tobijk: (for the same value)
21:01karolherbst: ohh wait
21:01karolherbst: that's not the issue
21:01karolherbst: I can only do it for local memory
21:02karolherbst: anyway, we should rather fix the spilling code to not do such stupid things
21:03tobijk: karolherbst: yeah that would fix a bunch of problems with many apps/games
21:04karolherbst: we could improve RA or scheduling
21:04karolherbst: but that's a different issue
21:04karolherbst: okay, there are some bugs within RA doing _really_ silly things
21:04karolherbst: but that's kind of an exception
21:04tobijk: well i mean apps crashing because we cant spill correctly :D
21:04karolherbst: I don't get how you can't spill anyhow
21:05karolherbst: ohh mhh crap
21:05karolherbst: I can't remove those stores as well, because the same location might be read again
21:05karolherbst: no, that makes no sense
21:06karolherbst: sure I can remove the store
21:06tobijk: but then u read in shit from that location
21:06karolherbst: think again
21:06karolherbst: look at the instructions again
21:07tobijk: oh its actually ld st and not st ld *dang*
21:08karolherbst: well doing st ld would be pretty pointless as well, well the ld part would be
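The ld/st pattern being discussed can be sketched as a tiny peephole pass over a toy instruction list. This is not karolherbst's actual hack (which was never shown in the log); `Instr`, `Kind`, `Space` and `dropRedundantStores` are hypothetical names. The sketch assumes the simplest form of the idea: a store to local memory is redundant when the instruction directly before it loaded the same register from the same slot, because the slot already holds that value.

```cpp
#include <vector>

// Toy IR for illustration only; not the nv50_ir representation.
enum class Kind { Load, Store, Other };
enum class Space { Local, Global };

struct Instr {
    Kind kind;
    Space space;  // memory space (ignored for Other)
    int slot;     // memory slot accessed by a Load/Store
    int reg;      // register written by a Load or read by a Store
};

// Drops a Store that immediately follows a Load of the same register from
// the same local slot. Only local memory is handled, since other threads
// cannot observe or modify it between the two instructions.
std::vector<Instr> dropRedundantStores(const std::vector<Instr> &in) {
    std::vector<Instr> out;
    for (const Instr &i : in) {
        if (i.kind == Kind::Store && i.space == Space::Local && !out.empty()) {
            const Instr &prev = out.back();
            if (prev.kind == Kind::Load && prev.space == Space::Local &&
                prev.slot == i.slot && prev.reg == i.reg)
                continue;        // slot already contains this value
        }
        out.push_back(i);
    }
    return out;
}
```

Checking only the directly preceding instruction mirrors the "check if ->prev is a ST" style of test suggested in the log; a real pass would also have to look across intervening instructions and basic blocks, and as the chat concludes, fixing the spill code itself is the better long-term answer.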
21:22karolherbst: paperwork stinks
21:58tobijk: karolherbst: oh wow, it looks like when we try to spill we reuse the value for the last spilled one and try to find a better one (and scoring them)
21:58tobijk: even if there are vars to spill we may end up in refusing to spill if the score is not better than the last one
22:12Azur_Lighting: how is the performance of nouveau with a GT 730, i know this card is not even powerful of new but, at this point is the same as the nvidia closed source or better?
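A hedged sketch of the failure mode tobijk describes, not of the actual nouveau register allocator: if the "score to beat" is carried over from the previously spilled value instead of being reset, a candidate search can come back empty even though spillable values exist. `Candidate`, `pickSpill` and `pickSpillFresh` are hypothetical names used only to illustrate that reading of the code.

```cpp
#include <limits>
#include <vector>

// Hypothetical spill candidate: an id and a heuristic score (higher = better).
struct Candidate {
    int id;
    float score;
};

// Returns the id of the best candidate whose score is strictly greater than
// scoreToBeat, or -1 if none qualifies.
int pickSpill(const std::vector<Candidate> &cands, float scoreToBeat) {
    int best = -1;
    for (const Candidate &c : cands) {
        if (c.score > scoreToBeat) {
            best = c.id;
            scoreToBeat = c.score;
        }
    }
    return best;
}

// Same search, but starting from the lowest possible score so that any
// existing candidate can be chosen.
int pickSpillFresh(const std::vector<Candidate> &cands) {
    return pickSpill(cands, std::numeric_limits<float>::lowest());
}
```

With candidates scoring 2.0 and 3.0, a stale threshold of 5.0 makes `pickSpill` return -1, which would correspond to a "no viable spill candidates left" style failure, while `pickSpillFresh` picks the 3.0 candidate.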
22:18tobijk: Azur_Lighting: i'm not exactly sure which generation a 730 is, if its a kepler, then the performance is (with reclocking) like 80% of the nvidia driver, if its a newer generation you are stuck with the boot clocks which are often the lowest ones, which gives you bad performance
22:18Azur_Lighting: i have a Gt 730 but its fermi
22:20Azur_Lighting: i mean it has 96 CUDA cores
22:21Azur_Lighting: i think fermi is more old right?
22:21tobijk: Azur_Lighting: yep
22:21tobijk: care to run: "glxinfo | grep Device"
22:22tobijk: should give you the exact chip
22:23tobijk: (with nouveau running, not nvidia)
22:49nyef`: Would using a compositing WM affect OpenGL VBlank synchronization?
23:45nyef`: Hrm. Some quick searching suggests that "it depends on the compositor", which isn't quite useful.
23:45nyef`: ... Also might not sync up correctly for what I'm doing *anyway*. /-: