00:00karolherbst: there is still something in the CSTEP table which bothers me
00:00karolherbst: 0: freq 810 MHz unkn 1 unkn 2 voltage 18
00:00karolherbst: 1: freq 540 MHz unkn 1 unkn 2 voltage 18
00:01karolherbst: 2: freq 628 MHz unkn 1a unkn 1 voltage 18
00:01karolherbst: and so on
00:01karolherbst: it looks like the 0th entry could be an entry, not being used at all
00:01karolherbst: but maybe those unkn will tell us more
00:05karolherbst: jayhost_: check that you have a new boost file in debugfs alongside the pstate file
00:05jayhost_: karolherbst it says task kworker blocked for more than 120 seconds
00:05jayhost_: When I echo 0f pstate
00:07karolherbst: yeah I have an idea
00:07karolherbst: jayhost_: but is there a boost file?
00:08jayhost_: I had to restart cause it hung. Doesn't look like boost is there
00:08jayhost_: nvm boost is there
00:09jayhost_: 0/boost 128/boost 64/boost
00:10karolherbst: I fear there is some locking issue there
00:10karolherbst: but I couldn't trigger it on my system
00:10karolherbst: the boost file works exactly like the pstate ones
00:10karolherbst: and if you just poke 0f in pstate it should usually work
00:11jayhost_: Okay it worked this time. This time I didn't boot into xfce
00:12karolherbst: shouldn't matter
00:12karolherbst: the therm subsystem asks the clk subsystem if it has to change the clocks due to temperature changes
00:12karolherbst: I fear there might be some race condition while the user manually reclocks
00:16karolherbst: your max voltage curves would only get interesting if the gpu reached 160+ °C
00:18jayhost_: Can you say in Layman?
00:18karolherbst: well for your gpu there are 3 entries which specify the global maximum voltage the driver is allowed to set
00:19karolherbst: and at 95°C or a bit higher the driver or the gpu will downclock by a lot anyway
00:19karolherbst: and up until then only one of those entries is important
00:19karolherbst: I want to find a vbios with nice entries
00:29karolherbst: jayhost_: can you cat the pstate and boost file?
00:37karolherbst: well I guess I will spend some time in making this thing a bit dead lock free...
00:40jayhost_: yes I can
00:42karolherbst: I just want to verify it
00:44karolherbst: jayhost_: well I also want to have the output... maybe I wasn't clear enough
00:48jayhost_: Hahaha gotcha. It's 0: 1176 MHZ *1: 1254 2: 1398
00:48karolherbst: jayhost_: do you know what was the highest clock nvidia _ever_ used?
00:49karolherbst: though it should be around 1300
00:49karolherbst: jayhost_: if you poke 2 into boost, you increase the clock in the last pstate line
00:52jayhost_: I don't know highest and do you want me to try it
00:52karolherbst: well the worst that could happen is that the poke blocks for ever
00:52karolherbst: but there is no other issue with it
00:52karolherbst: the driver should run just fine
00:54jayhost_: More POWERR. It goes from 1265 to 1316 according to pstate
00:55karolherbst: now do this:
00:56karolherbst: nvaforcetemp 90; sleep 1; cat pstate; nvaforcetemp 1; sleep 1; cat pstate; nvaforcetemp 0
00:56karolherbst: the clock should change depending on the temperature
00:58jayhost_: okay fan went real high for sec
01:03karolherbst: because we faked the temperature to 90°C
01:05karolherbst: jayhost_: did the AC line differ ?
01:06karolherbst: I would expect the first one to be lower and the second one to be higher than 1316
01:06karolherbst: or the same
01:06jayhost_: 1265 both times
01:07karolherbst: and if you cat it now?
01:07karolherbst: maybe bad timing or something..
01:07jayhost_: 1265 now
01:08jayhost_: Oh I imagine you wanted me to run test after boost change
01:08karolherbst: boost should be set to 2
01:08karolherbst: boost is a clock limiter
01:08karolherbst: but 1265 is still odd
01:08jayhost_: 1316 and 1316 this time
01:09karolherbst: mhh okay
01:09karolherbst: maybe the effect isn't as big on yours
01:09karolherbst: k, something else
01:10karolherbst: jayhost_: nvaforcetemp 90; sleep 1; cat pstate | grep AC;sensors | grep "GPU core"; nvaforcetemp 1; sleep 1; cat pstate| grep AC;sensors | grep "GPU core"; nvaforcetemp 0
01:10karolherbst: the voltage should at least change a little
01:16jayhost_: 1.06 1.07
01:22jayhost_: I was trying out this game brothers a tale of two sons. It seems some kind of blur,bloom, or lighting effect creates black spot
01:22jayhost_: I wasn't sure if that's nouveau,mesa, or wine
01:36imirkin: jayhost_: make an apitrace
01:36imirkin: jayhost_: maxwell is pretty untested... might have any number of issues still
01:42jayhost_: imirkin okay cool. I've figuring out through trial and error how to build mesa 11.2 for steam wine 32 bit
02:49jayhost_: mmm imirkin can only get skyrim to use 11.2 mesa with 32 bit swrast and then it's not playable [slow]
03:09jayhost_: I probably should just stay away from Steam altogether given its closed nature
04:34orbea: jayhost_: the malware part is probably more convincing
09:29szt: got 4 freezes already today
10:03Misanthropos: hi there! some time ago i had trouble gaming with pstate 0f on a geforce gtx 650 ti and got a hint to change info.min to info.max in volt/base.c, which worked perfectly, and i have been a happy gamer ever since.
10:04Misanthropos: i wonder though why i have to patch every new kernel up to 4.6.0-rc2 ... i guess lots of people have issues with their card too
10:07Misanthropos: or not?
10:34pmoreau: Misanthropos: I think, it's more because it's a hack, and not the proper way to do it.
10:35pmoreau: karolherbst1: I don't remember: did you find a proper fix for the info.min -> info.max hack?
10:35karolherbst: pmoreau: yeah?
10:35karolherbst: pmoreau: guess what the big reclocking series is for :D
10:36pmoreau: I didn't really look at it, sorry :-/
10:36karolherbst: pmoreau: https://lists.freedesktop.org/archives/nouveau/2016-April/024619.html
10:36karolherbst: this is the bit though
10:36pmoreau: Only the cover letter, and I forgot the content… --"
10:36karolherbst: all the other patches are more boost related
10:36karolherbst: and respecting voltage limits
10:36karolherbst: and all that stuff
10:37pmoreau: Nice! :-)
10:37pmoreau: I quickly saw in the logs that you found more volt entries in the VBIOS table?
10:38karolherbst: not really more entries
10:39karolherbst: but in the voltage map table header are references to "voltage max" entries
10:39karolherbst: which are basically entries like all the others
10:39karolherbst: but they return a voltage which adds a software cap to the max voltage
10:39karolherbst: for example
10:39karolherbst: your gpu can do 1.2125V
10:40karolherbst: now we could just clock to a cstate with a voltage below that and it would be fine
10:40karolherbst: but those entries could say: at 5°C max voltage is 1.22V
10:40karolherbst: but for 90°C it is only 1.15V
10:40karolherbst: so some cstates can't be reached anymore and nouveau has to clock down a little
10:41pmoreau: Ah, ok
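A minimal sketch of the capping idea described above, under loud assumptions: the linear interpolation between the two example points (5°C and 90°C) and the `CSTATES` list are made up for illustration and are not taken from nouveau or any VBIOS. It only shows how a temperature-dependent cap can make the highest cstates unreachable.

```python
# Sketch only: the linear interpolation and the cstate list are assumptions
# made for illustration; real VBIOS entries use a mode plus coefficients.

def max_voltage_uv(temp_c: int) -> int:
    """Software voltage cap in µV, interpolated between two example points."""
    t_lo, v_lo = 5, 1_220_000    # "at 5°C max voltage is 1.22V"
    t_hi, v_hi = 90, 1_150_000   # "for 90°C it is only 1.15V"
    t = min(max(temp_c, t_lo), t_hi)  # clamp outside the example range
    return v_lo + (v_hi - v_lo) * (t - t_lo) // (t_hi - t_lo)

# Hypothetical cstates as (clock in MHz, required voltage in µV), ascending.
CSTATES = [
    (900, 1_050_000),
    (1000, 1_100_000),
    (1100, 1_160_000),
    (1200, 1_212_500),
]

def highest_reachable_cstate(temp_c: int):
    """Return the fastest cstate whose voltage fits under the cap, or None."""
    cap = max_voltage_uv(temp_c)
    reachable = [cs for cs in CSTATES if cs[1] <= cap]
    return reachable[-1] if reachable else None
```

With these made-up numbers, the 1200 MHz cstate fits under the cap at 5°C, while at 90°C the driver would have to fall back to the 1000 MHz one.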
10:41karolherbst: the last commit was just that I found the byte for a third of such entries
10:41karolherbst: on all keplers there seem to be at least 2
10:41karolherbst: but some have 3
10:42pmoreau: And on pre-Kepler, only 2 of them I guess?
10:42karolherbst: and those entries can generate curves like this: https://i.imgur.com/zjiMPV9.png
10:42karolherbst: pmoreau: well this also kind of affects fermi, because some fermis have the table version 0x20 as well
10:42karolherbst: but I didn't check the 0x10 table yet
10:43karolherbst: which is pointless, because there is no kepler with 0x10 afaik
10:43karolherbst: and tesla doesn't have it at all
10:44pmoreau: (I guess x-axis is the temp, y-axis in V, but what about the red, green, blue curves?)
10:44karolherbst: these are the max entries
10:44karolherbst: red is a static one with no temperature parameter
10:44karolherbst: basically in the entries you get a mode and 6 coefficients
10:45karolherbst: and depending on the mode, these coefficients go into different formulas
10:45karolherbst: and return a voltage
10:46karolherbst: it is all a bit painful, because you always need the temperature for everything and always have to calculate on the fly...
10:47karolherbst: well I could precalculate stuff and save a temp->voltage map with 120 entries for each entry...
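The precalculation idea could look roughly like the sketch below. Everything here is hypothetical: `example_entry` is a placeholder formula rather than a real VBIOS mode, and the table is assumed to hold one slot per °C from 0 to 119.

```python
TABLE_SIZE = 120  # one slot per °C, matching the "120 entries" idea

def build_table(voltage_at):
    """Evaluate the (possibly expensive) formula once per temperature step."""
    return [voltage_at(t) for t in range(TABLE_SIZE)]

def lookup(table, temp_c):
    """Constant-time lookup; temperatures outside the table are clamped."""
    idx = min(max(temp_c, 0), len(table) - 1)
    return table[idx]

def example_entry(temp_c):
    """Placeholder max-voltage formula in µV, NOT taken from any VBIOS."""
    return 1_220_000 - 800 * temp_c

table = build_table(example_entry)
```

The trade-off is memory per entry against recomputing the formula on every temperature change.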
10:57Misanthropos: thanks for clearing that out :D i can live with patching the kernel until you find a feasible way.
10:58karolherbst: Misanthropos: kepler or maxwell?
10:58Misanthropos: i think its kepler - gtx 650 ti
10:58karolherbst: Misanthropos: you can always use my kernel tree here: https://github.com/karolherbst/linux/commits/nouveau_4.5_reclocking
10:59karolherbst: just make sure to use the right branch: nouveau_4.5_reclocking
10:59karolherbst: I am not updating this branch though, because it is kind of hard to automate
10:59karolherbst: compiling nouveau out of tree should be the better solution in the long term
10:59karolherbst: but the kernel tree is easier to deploy
11:00Misanthropos: does that branch do anything different other than using info.max?
11:01imirkin: hakzsam: when you feel bored, try to figure out how to support 32 textures on fermi. apparently blob drivers do it, although tbh i don't see how it's possible.
11:01imirkin: hakzsam: specifically some Feral-ported games apparently need at least 18. should be easy on kepler+.
11:02karolherbst: Misanthropos: yeah, it's better basically
11:03karolherbst: Misanthropos: maybe not as stable as the info.max hack, but it is the right way. I don't think I missed anything, but there can always be some minor stuff I also have to respect...
11:03Misanthropos: ok - will try that
11:26Yoshimo: imirkin: do you have an example for "feral-ported"?
11:26imirkin: company of heroes 2
11:26imirkin: see https://bugs.freedesktop.org/show_bug.cgi?id=94705
11:32karolherbst: does anybody know if there are min/max instructions on the falcons?
11:46Misanthropos: karolherbst, i am unable to branch your repository - only master is available
11:47karolherbst: did you clone it right?
11:47Misanthropos: ups sorry
11:47Misanthropos: it doesnt list as branch
11:47Misanthropos: but i can checkout nouveau_4.5_reclocking
11:48karolherbst: it never does after a pull
11:48karolherbst: git remote show origin
11:48Misanthropos: found it
11:48Misanthropos: thank you
12:44karolherbst: Misanthropos: any luck building the tree yet?
12:46Misanthropos: karolherbst, not yet - going to build it later
12:52hakzsam: imirkin, too many things on my todolist :)
12:54imirkin: procrastinate on some things by doing other things :)
12:56karolherbst: imirkin: by the way, with all my mesa patches I get above 80% nvidia performance in pixmark piano now
12:56imirkin: karolherbst: cool
12:56imirkin: any change in a real application?
12:56karolherbst: stock mesa is around 5% slower
12:57karolherbst: imirkin: didn't test yet, but pixmark_piano is shader only with no significant memory operations
12:57karolherbst: at least in unigine I got 0% difference
12:59karolherbst: but this commit is especially interesting: https://github.com/karolherbst/mesa/commit/5e682fa5015fe38b464cfd8446892e4fb02c2804
12:59karolherbst: I really don't know why this works
13:06imirkin: karolherbst: it'd be a TON more effective to have this pass in SSA
13:06imirkin: and would lead to much better RA
13:06karolherbst: this is in SSA
13:07imirkin: oh heh
13:07imirkin: so it is.
13:07karolherbst: that's why it reduces GPR usage by 0.72%
13:07imirkin: then wtf is up with the permuteAdjacent thing??
13:07karolherbst: permuteAdjacent shouldn't be called while in SSA?
13:07imirkin: just stick it up at the instruction definition site if the instr is in the same bb, or at the top of the bb if it isn't
13:08imirkin: i mean... it's just hugely inefficient the way you're doing it
13:08karolherbst: they still have a source, so I can't just move them wherever I want
13:08imirkin: and why are you running this backwards?
13:08imirkin: yeah you can
13:09imirkin: you can move them to right after that source is defined
13:09imirkin: assuming the def is in the same bb
13:09karolherbst: yeah, but
13:09karolherbst: I move them down
13:09karolherbst: that's why this bothers me so much
13:09karolherbst: why does moving them down help?
13:10imirkin: ah hm
13:10imirkin: you're potentially reducing overlapping uses
13:10imirkin: so like
13:10imirkin: x = foo(...)
13:10imirkin: y = -x
13:11imirkin: if you move the y down closer to the use
13:11imirkin: you can end up with better RA
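A toy illustration of the "overlapping uses" point, assuming a straight-line block where each instruction is a `(dest, sources)` pair. The instruction lists and the `max_live` helper are invented for this sketch and have nothing to do with the actual nv50_ir data structures; they only count how many values are live at once.

```python
def max_live(instrs):
    """Peak number of values live between instructions of a linear block."""
    last_use = {}
    for i, (_dest, srcs) in enumerate(instrs):
        for s in srcs:
            last_use[s] = i
    defined_at = {dest: i for i, (dest, _srcs) in enumerate(instrs)}
    peak = 0
    for i in range(len(instrs)):
        # A value is live after instruction i if it is already defined
        # and still has a use further down.
        live = sum(1 for v, d in defined_at.items()
                   if d <= i < last_use.get(v, d))
        peak = max(peak, live)
    return peak

# y = -x placed right after x is defined: y stays live across bar() and the add.
EARLY = [("x", []), ("y", ["x"]), ("a", []), ("b", ["a", "x"]), ("c", ["b", "y"])]

# y = -x moved down to just before its only use.
LATE = [("x", []), ("a", []), ("b", ["a", "x"]), ("y", ["x"]), ("c", ["b", "y"])]
```

In this made-up block `max_live(EARLY)` is 3 and `max_live(LATE)` is 2, because `x` has to stay alive for the add either way, so defining `y` early only adds a second overlapping live range.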
13:11karolherbst: now I remember why I wrote this pass
13:11karolherbst: the thing was something else
13:11karolherbst: but yeah
13:11karolherbst: this was to optimize negs/abs and stuff like that away
13:12karolherbst: and move it into the use instruction
13:12karolherbst: as mods
13:12karolherbst: I think
13:12imirkin: so you should then just be able to move the instruction to right before the first use or to the end of the bb
13:12imirkin: either way, your permute thing is silly
13:12karolherbst: that makes sense then
13:12karolherbst: yeah I know, I was just trying it out
13:13karolherbst: I am still not convinced that doing this in general is a good idea, because it might introduce more stalls
13:15imirkin: yes... it's an eternal fight between good and evil that will never be resolved
13:16karolherbst: but then again, those instructions will be close to the source anyway, so the stall is just moved
14:01karolherbst: imirkin: was there something you wanted me to check regarding the PostRA DCE pass?
14:01imirkin: i don't remember tbh
14:01imirkin: at the very least, mention why it's useful
14:01karolherbst: mhh maybe something like in which situation this helps?
14:02imirkin: is it the stupid legalizessa junk that needs a DCE pass after it? or was it a handful of extra constraint mov's that ended up being unnecessary after ra?
14:03karolherbst: BBs being unused
14:03karolherbst: so those instructions were all cleared out
14:06imirkin: then we should have a step that nukes unnecessary BB's
14:06karolherbst: I will investigate then
14:15karolherbst: well at least this pass gives me "total instructions in shared programs : 2340985 -> 2334212 (-0.29%)" in my shaders
14:15karolherbst: 99% in eon based games
14:19karolherbst: imirkin: https://gist.github.com/karolherbst/20a9e90d76b9858a5663e1cf552de64e#file-gistfile1-txt-L11-L31
14:19karolherbst: the marked lines are optimized away by this pass
14:21imirkin: what does it look like pre-RA?
14:23karolherbst: I see, the set is optimized away in RA
14:23imirkin: it's for the set which determines which branch to take
14:23imirkin: but both branches are empty
14:23imirkin: so the proper thing to do is when deleting the last thing in a BB
14:23imirkin: to remove the whole BB
14:24imirkin: by *very* carefully replacing edges
14:24karolherbst: and then the set has no dest and gets optimised away in preRA?
14:25karolherbst: which is not as trivial as my pass I guess
14:25imirkin: basically phi node order is linked to inbound edge order
14:25imirkin: so as long as you only modify edges and never explicitly remove them, you should be good
14:25imirkin: there might even be a function to do this already, dunno
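To make the edge-order constraint concrete, here is a small sketch with a hypothetical `Block` class (not the Mesa `BasicBlock` API). It assumes phi operand `i` belongs to inbound edge `i`, and bypasses an empty block by overwriting edge slots in place instead of removing and re-adding edges.

```python
class Block:
    def __init__(self, name):
        self.name = name
        self.preds = []  # inbound edges; order matters for phis
        self.succs = []  # outbound edges
        self.phis = {}   # phi name -> operands, operand i comes from preds[i]

def add_edge(src, dst):
    src.succs.append(dst)
    dst.preds.append(src)

def bypass_empty_block(empty):
    """Redirect pred -> empty -> succ to pred -> succ without reordering edges."""
    (pred,), (succ,) = empty.preds, empty.succs  # sketch: exactly one of each
    pred.succs[pred.succs.index(empty)] = succ
    # Reuse the same slot in succ.preds so phi operands stay aligned.
    succ.preds[succ.preds.index(empty)] = pred
    empty.preds.clear()
    empty.succs.clear()

entry, then_bb, else_bb, join = (Block(n) for n in ("entry", "then", "else", "join"))
add_edge(entry, then_bb)
add_edge(entry, else_bb)
add_edge(then_bb, join)
add_edge(else_bb, join)
join.phis["v"] = ["v_then", "v_else"]

bypass_empty_block(then_bb)
```

After the call, `join.preds` is `[entry, else_bb]`, so slot 0 of the phi still corresponds to the path that used to go through `then`. Removing the edge and appending a new one would have put `entry` in slot 1 and silently swapped the phi operands.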
14:25imirkin: can you make the glsl shader available somewhere?
14:26imirkin:is curious wtf they're doing
14:26karolherbst: it is in the repository
14:26imirkin: one of many i'm sure
14:27karolherbst: they do stupid stuff, but they always do it
14:29imirkin: ah probably something dumb like this
14:29imirkin: Temp.w = (floatBitsToInt(Temp).z != 0) ? Temp.w : (-Temp.w);
14:29karolherbst: does the shader look smart to you in general? :D
14:29imirkin: seems fine
14:29imirkin: no control flow, which is good
14:30imirkin: it's clearly the output of some converter, hence all the uintBitsToFloat bs
14:30karolherbst: and we have to unmess the mess the converter did
14:30imirkin: but that's fairly common
14:30imirkin: stuff like this is nice: Temp.z = uintBitsToFloat((Temp.w>=(-Temp.w)) ? 0xFFFFFFFFu : 0u);
14:32imirkin: they could just do uintBitsToFloat(-((Temp.w>=(-Temp.w)))
14:32imirkin: of course that eq is only true for Temp.w == 0... oh well.
14:32imirkin: but it's converting some SM3 opcode that's a lot like the TGSI CMP
14:33imirkin: so like this sequence is esp sad: http://hastebin.com/ebemaqoler.coffee
14:34imirkin: since (a) the 0xffffffff/0x0 conditional assign is the *output* of a set op
14:34imirkin: and (b) the set op could just compare %r519 to 0
14:34imirkin: [which iirc enables more optimizations]
14:34karolherbst: how can it compare to 0?
14:35imirkin: well, a >= -a isn't true a whole lot :)
14:35imirkin: and any a that's <= 0
14:35karolherbst: ahh right
14:35karolherbst: yeah, of course
14:36imirkin: er, >= 0
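A quick check of the corrected claim, that `a >= -a` holds exactly when `a >= 0`. This uses Python floats (IEEE 754 doubles) as a stand-in for the shader's 32-bit floats, which is an assumption; the comparison semantics for signed zero, infinities and NaN are the same.

```python
import math

def same_result(a: float) -> bool:
    """True if (a >= -a) and (a >= 0) agree for this input."""
    return (a >= -a) == (a >= 0.0)

SAMPLES = [-2.5, -0.0, 0.0, 1e-30, 3.0, math.inf, -math.inf, math.nan]
ALL_AGREE = all(same_result(a) for a in SAMPLES)
```

NaN compares false on both sides, and `-0.0 >= 0.0` is true, so the two forms agree on those edge cases as well.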
14:39karolherbst: so there should be a check in delete_instruction which checks if the BB is completely empty and then it would just delete the BB and modify the edges
14:44imirkin: a little extreme to do it in ~Instruction...
14:46karolherbst: delete_instruction calls Program::releaseInstruction
14:47imirkin: sure, but it's still the wrong place to do it
14:47imirkin: either way, the bb will still have a branch
14:47imirkin: and you need to touch up a bunch of stuff
14:48imirkin: makes sense to do after DCE'ing a block though
14:49Misanthropos: karolherbst, i compiled your branch with the same config i used to compile my other kernels. the framerate is as bad as it would be if software rendering were used...
14:49karolherbst: Misanthropos: you have to reclock yourself though
14:49Misanthropos: i did
14:49karolherbst: then something went wrong
14:49Misanthropos: yeah .. i did that
14:49karolherbst: output of glxinfo
14:50Misanthropos: need to reboot
14:58Misanthropos: that glxinfo: https://bpaste.net/show/74abf72f7cde
14:58Misanthropos: this time i have heavy artefacts too if pstate 0f
14:58Misanthropos: on desktop
14:58Misanthropos: at least the first time.. cant reproduce
15:00karolherbst: yeah, nouveau isn't used in mesa
15:00karolherbst: Misanthropos: how did you install mesa?
15:00karolherbst: and also your full dmesg please
15:01Misanthropos: emerged through gentoo
15:01karolherbst: is VIDEO_CARDS="nouveau" set?
15:01Misanthropos: Installed versions: 11.0.6^d(01:15:07 PM 02/19/2016)(classic d3d9 dri3 egl gallium gbm gles2 llvm nptl opencl udev vdpau -bindist -debug -gles1 -openmax -osmesa -pax_kernel -pic -selinux -vaapi -wayland -xa -xvmc ABI_MIPS="-n32 -n64 -o32" ABI_PPC="-32 -64" ABI_S390="-32 -64" ABI_X86="32 64 -x32" KERNEL="-FreeBSD" VIDEO_CARDS="nouveau -freedreno -i915 -i965 -ilo -intel -r100 -r200 -r300 -r600 -radeon -radeonsi -vmware")
15:01karolherbst: then your full dmesg please
15:01Misanthropos: dmesg: https://bpaste.net/show/622d3d18dba9
15:02karolherbst: gr: init failed, -16
15:02karolherbst: yeah, that is bad
15:03imirkin: our favorite error is back... well done Karol :p
15:03karolherbst: mhh and I didn't even touch gr
15:03imirkin: well, the way we "fixed" it is by resetting the gpu
15:04karolherbst: Misanthropos: does it also happen with a stock kernel?
15:04imirkin: if you mess with voltage enough, i guess it'll mess things up =]
15:04karolherbst: mhh yeah, maybe the state change on boot messes it up
15:04Misanthropos: i explicitly checked and rebooted to 4.6.0-rc2 to make sure it is not something else
15:04karolherbst: Misanthropos: could you drop nouveau.config=NvClkMode=10 and reboot?
15:04Misanthropos: hmm... ok?
15:05Misanthropos: just a sec
15:05karolherbst: imirkin: but mhh, if the engines are messed up due to voltage, why does the gpu still display stuff?
15:06karolherbst: uhh I have an idea
15:06Misanthropos: so no reboot?
15:06imirkin: the machines affected by that other issue were universally laptops
15:07karolherbst: Misanthropos: no, still reboot
15:07karolherbst: but I still check something here
15:07imirkin: where the bios didn't sufficiently initialize them
15:07imirkin: this wouldn't happen on a desktop chip though
15:08karolherbst: maybe something else messed it up
15:08karolherbst: should be easy to bisect though if 4.6-rc2 works
15:11Misanthropos: wow.. this time the system froze on pstate 0f
15:12imirkin: progress! :)
15:12karolherbst: Misanthropos: full dmesg again please
15:12karolherbst: looks better now
15:12karolherbst: okay mhh
15:13karolherbst: I would need your vbios
15:13karolherbst: it is in /sys/kernel/debug/dri/0/vbios.rom
15:13karolherbst: Misanthropos: do you usually use the binary driver?
15:13Misanthropos: not anymore
15:14Misanthropos: i have been using the nouveau driver for 2-3 months now
15:14karolherbst: well. Maybe the vbios helps enough, but while running the nvidia driver we could check which voltage nvidia sets, and which one nouveau would set
15:14karolherbst: and if the difference is too big, I would have to investigate what is different
15:17karolherbst: k thanks, I have to buy stuff now and will look into it later then
16:26fergal: hi guys. does nouveau support glsl 330?
16:26imirkin: fergal: on G80+ GPU's, yes.
16:26imirkin: fergal: (for core profile GL contexts only, of course, same as every other mesa driver)
16:27fergal: imirkin: ummm i have a gtx 760, is that g80+?
16:28imirkin: G80 = GeForce 8800 GTS, released in late 2006 iirc
16:28imirkin: fergal: are you having trouble, or just asking?
16:28fergal: imirkin: ah okay, sooo, can i ask glsl compiler questions in here? or should i find a more glsl specific channel?
16:29fergal: vertex shader isn’t compiling with the nouveau driver i have installed
16:29imirkin: fergal: pastebin glxinfo, as well as the error you're seeing
16:51fergal: imirkin: this is my glxinfo: http://sprunge.us/eYPF and this is the error i get at compile/validation: http://pastie.org/private/rf9egnnbcrmgyrxyud8pha and this is what my vertex shader looks like: http://pastie.org/private/8ulbgtbhpp1c2bwxytk6a
16:53imirkin: glxinfo looks good
16:54imirkin: and if you were accidentally using a compat context, it would have crapped out at the #version 330
16:54imirkin: (since you would only have up to version 130 there)
16:55imirkin: i think that's a reserved keyword
16:55imirkin: let me check
16:55imirkin: well, it's definitely *treated* like a keyword
16:56imirkin: but perhaps it shouldn't be... spec-digging time
16:56fergal: i’ll change it to buf to see if it helps?
16:56fergal: ummmm i’m actually requesting a gl context of 3.2 - should i be requesting 3.3?
16:57imirkin: gr, would it kill those jokers to actually *sort* the list of reserved keywords rather than have it in random order
16:57junkula: is this still needed nowadays? https://wiki.archlinux.org/index.php/Nouveau#Fan_Control
16:57imirkin: fergal: nah, requested context version shouldn't matter
16:58imirkin: fergal: core is the important bit, and you have that right
16:58imirkin: fergal: looks like it's being incorrectly reserved
16:58fergal: so just change the name buffer to something else?
16:59imirkin: fergal: let me figure out when it became a thing (430 or earlier), and then send a patch that fixes that
16:59imirkin: yeah, that's the simplest quickest fix
17:09fergal: imirkin: thanks very much for your help, that’s got me past the error! i really appreciate it! :)
17:13imirkin: fergal: just sent a patch which should fix it for real btw
17:13imirkin: fergal: https://patchwork.freedesktop.org/patch/80180/
17:15imirkin: junkula: i believe the default is auto fan control now
17:16junkula: imirkin: thanks good to know
18:39KungFuJesus: Hello, is it possible to use nouveau in conjunction with any old PCI based nvidia card on PPC?
18:40KungFuJesus: I've been getting a stream of EDID errors when trying to launch X with nouveau on Debian, resulting in a no displays detected error
18:41KungFuJesus: I can't find any specific evidence that I need a supported Apple / Openfirmware nvidia card / BIOS in order to use it post framebuffer
18:51imirkin: KungFuJesus: for some definition of use, yes...
18:53imirkin: KungFuJesus: i guess i've never tested... i was happy enough that the card that it came with turned on :)
18:53imirkin: KungFuJesus: you can file a bug and include all the relevant details, but... no guarantees. not exactly a highly-common environment
18:54KungFuJesus: I feel like this was all a lot easier before the KMS changes
18:57KungFuJesus: in theory this shouldn't be so hard, I have a PCI bus, Nouveau is an open implementation that is designed to talk to the BIOS on the card, and I'm not trying to use it as a framebuffer device - the one thing that would have required an openfirmware driver to be written in Forth
18:59imirkin: KungFuJesus: are you trying to logic the situation into working? i doubt that'll work
19:00imirkin: i'm not saying it *shouldn't* work, just that it's an uncommon environment that gets no testing at all, and i can totally see it being fubar
19:03hakzsam: imirkin, one more extension :)
19:41hussam: Hi. I have a question. If I am using nouveau and run a sdl game, gnome-shell uses a constant 30% CPU. If I run the same game whilst using the proprietary nvidia driver, gnome-shell only uses up to 15% CPU. Could it be the CPU is used for rendering instead of GPU under nouveau?
19:41hussam: Xorg process tends to match gnome-shell CPU usage.
19:42hussam: or the opposite.
19:42hussam: This is a fermi card.
20:19karolherbst: hussam: I would say this is to be expected, nouveau tends to have a higher cpu load than nvidia
20:21hussam: karolherbst: Ok, but 2 to 15 is a lot better than a constant 30%. Is this something that is on the radar for improvement?
20:22cousin_luigi: Hello, again.
20:23cousin_luigi: Is there any GPU supported by nouveau that has free microcode?
20:23hussam: basically under nvidia, it is ~2% idle and goes up to 15% when there is action. under nouveau, it is a constant 30% even at idle times.
20:24hussam: meaning irrelevant of how much CPU the game is using at that point.
20:24karolherbst: cousin_luigi: everything except 2nd gen maxwell
20:24karolherbst: hussam: yeah, nouveau does some busy waiting in some places
20:24karolherbst: hussam: imirkin knows more I think
20:26karolherbst: cousin_luigi: or why are you asking?
20:27cousin_luigi: karolherbst: I'm interested in having a system as free as possible, but I saw "extfw required" for something in the feature matrix.
20:27karolherbst: Misanthropos: okay, I think you should boot with the nvidia driver and then we compare the voltage
20:27karolherbst: cousin_luigi: video decoding
20:28cousin_luigi: karolherbst: Since I watch videos all the time, I'd say it's important.
20:28karolherbst: cousin_luigi: but not because it is required, just because nobody had the motivation to write a free firmware for this
20:28karolherbst: cousin_luigi: your CPU is usually fast enough
20:28cousin_luigi: karolherbst: Is it conceivably a security risk?
20:28karolherbst: cousin_luigi: no
20:28karolherbst: cousin_luigi: this is firmware running on a micro processor on the GPU
20:28karolherbst: cousin_luigi: well maybe it can read system memory, but I highly doubt that
20:28cousin_luigi: exactly my doubt
20:29karolherbst: cousin_luigi: well you don't need it and if your CPU is fast enough, then you can always do the encoding on the CPU
20:29cousin_luigi: karolherbst: I care about decoding.
20:29karolherbst: cousin_luigi: yeah, but a CPU is usually fast enough
20:29cousin_luigi: karolherbst: And it's not fast enough from what I can see.
20:30karolherbst: cousin_luigi: ohh what CPU do you have?
20:30cousin_luigi: I mean, I can see a difference.
20:30karolherbst: 4k videos?
20:30karolherbst: then something is odd
20:30cousin_luigi: 720p at the most
20:30karolherbst: an i5 should be able to handle a full hd video in real time
20:31karolherbst: only with 4k it is getting interesting nowadays
20:31cousin_luigi: seeking back and forth and playing at different speeds feels less responsive than when using vdpau
20:31karolherbst: cousin_luigi: ohh your CPU has a gpu ?
20:31cousin_luigi: karolherbst: I don't have a 4k monitor, so the point is moot at the present.
20:31cousin_luigi: karolherbst: Yes, but it sucks.
20:31karolherbst: cousin_luigi: well but you could use vaapi
20:31karolherbst: and decode it on the intel gpu
20:32karolherbst: never done it this way though
20:32cousin_luigi: I'm not sure it's available when the discrete gpu is installed.
20:32cousin_luigi: Interesting idea though.
20:32karolherbst: cousin_luigi: does lspci show the intel gpu?
20:32cousin_luigi: karolherbst: no
20:33cousin_luigi: nor inxi
20:33karolherbst: sad :/
20:33karolherbst: for me it is "VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)"
20:33cousin_luigi: perhaps there's some switch in the bios, will look better tomorrow
20:33karolherbst: yeah, most likely
20:33cousin_luigi: unless..is yours a laptop?
20:33karolherbst: yeah..., but it shouldn't matter on a desktop
20:33karolherbst: if your CPU has a GPU, it should be available or maybe hardware disabled
20:34cousin_luigi: it makes sense to have both on a battery-operated device
20:34cousin_luigi: not so much on one connected to mains
20:36karolherbst: mhhh depends
20:37karolherbst: usually disabling the GPU on the CPU is a stupid idea, because if your dedicated GPU breaks you could still use the GPU on the CPU ;)
20:48Misanthropos: karolherbst, do you know the current voltage?
20:49karolherbst: Misanthropos: there is a tool on my nouveau branch you could compile for that
20:49karolherbst: Misanthropos: it will show us everything we need to know
20:50Misanthropos: anyhow - i stopped using the binary blob because the performance was bad and the video signal stopped sometimes so i had a black screen for some seconds... but in one game only
20:50Misanthropos: ok - i reluctantly will go for the blob
20:51karolherbst: Misanthropos: it is just to get the proper data
20:51karolherbst: you can switch back after that if you want
20:52Misanthropos: whats the name of the tool?
20:52karolherbst: it is on my nouveau branch
20:52karolherbst: you have to compile it first
20:52Misanthropos: the one i checked out, right?
20:52karolherbst: https://github.com/karolherbst/nouveau.git branch: stable_reclocking_kepler_v2
20:53karolherbst: Misanthropos: no, the thing you checked out was just the kernel tree
20:53karolherbst: usually I develop in an out of tree repository which contains just nouveau
20:53karolherbst: but you don't need to install anything from it
20:53karolherbst: just run make in the top level of that repository on the right branch
20:54karolherbst: and then as root: LD_LIBRARY_PATH=lib bin/nv_cmp_volt
20:54karolherbst: you can even run it on nouveau, but this is kind of pointless
20:54Misanthropos: ok. building it now
20:54karolherbst: Misanthropos: did you switch the branch?
20:55Misanthropos: which kernel should i use for nvidia blob?
20:55Misanthropos: the one you suggested?
20:55karolherbst: doesn't matter
20:55karolherbst: what gpu did you have again?
20:55karolherbst: ohh wait, you gave me your dmesg
20:56Misanthropos: gtx 650 ti
20:56Misanthropos: envyas: command not found
20:56Misanthropos: whats that?
20:56karolherbst: well your vbios is one of the simpler ones
20:57karolherbst: Misanthropos: doesn't really matter
20:57karolherbst: or does it abort?
20:57Misanthropos: yeah it failed
20:57karolherbst: install envytools then
20:57karolherbst: it is in the x11 overlay though
20:57karolherbst: I thought envyas is optional
20:57Misanthropos: its in layman :D
20:58karolherbst: yeah. x11 overlay
20:58karolherbst: you want to enable the nva USE flag
20:58karolherbst: the other ones aren't as important
20:58Misanthropos: for envy?
20:58karolherbst: for envy it doesn't matter
20:59karolherbst: but nva compiles useful tools
20:59karolherbst: after you installed envytools, can you run nvapeek 101000
20:59karolherbst: it helps a little to know the output of this
20:59Misanthropos: Installed versions: 9999(10:59:30 PM 04/09/2016)(hwtest nva vdpau)
21:00karolherbst: yeah, it is fine
21:00karolherbst: there won't ever be releases of this I guess
21:00Misanthropos: 00101000: 8040989a
21:00karolherbst: now you should be able to compile the nouveau tools
21:01Misanthropos: just waiting for the compile to finish
21:03karolherbst: mhhh 1.092V is the result for me at 40°C
21:03karolherbst: maybe nvidia volts much higher
21:03Misanthropos: LD_LIBRARY_PATH=lib bin/nv_cmp_volt
21:03Misanthropos: Segmentation fault
21:04Misanthropos: as root
21:04Misanthropos: but with nouveau
21:04karolherbst: wait a second
21:05karolherbst: you are right
21:05karolherbst: recent changes
21:06karolherbst: Misanthropos: can you git pull?
21:07Misanthropos: just compiling the new source
21:07karolherbst: yeah it should only relink all files in bin/ and recompile mc/base.c
21:07karolherbst: should work then
21:07Misanthropos: current voltage (µV), expected voltage (µV), abs diff (µV),rel diff nouveau/nvidia (%), pstate, cstate, temperature(°C)
21:07Misanthropos: 1125000, 1091166, -33834, 96.992533, 15, 2, 45
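The two derived columns in the nv_cmp_volt output can be recomputed from the first two. The sketch below is inferred only from the rows pasted in this log, where abs diff equals expected minus current and rel diff equals expected divided by current, in percent. The names `VoltRow` and `parse_row` are hypothetical and are not part of the tool. The pstate column appears to be printed in decimal, so 15 corresponds to pstate 0f and 7 to 07.

```python
from dataclasses import dataclass


@dataclass
class VoltRow:
    """One data row of the pasted nv_cmp_volt output (hypothetical helper type)."""
    current_uv: int
    expected_uv: int
    pstate: int
    cstate: int
    temp_c: int

    @property
    def abs_diff_uv(self) -> int:
        # Matches the pasted rows: expected minus current, in microvolts.
        return self.expected_uv - self.current_uv

    @property
    def rel_diff_pct(self) -> float:
        # Matches the pasted rows: expected as a percentage of current.
        return 100.0 * self.expected_uv / self.current_uv


def parse_row(line: str) -> VoltRow:
    fields = [f.strip() for f in line.split(",")]
    # Fields 2 and 3 (the printed diffs) are skipped and recomputed instead.
    return VoltRow(int(fields[0]), int(fields[1]),
                   int(fields[4]), int(fields[5]), int(fields[6]))


row = parse_row("1125000, 1091166, -33834, 96.992533, 15, 2, 45")
print(row.abs_diff_uv, round(row.rel_diff_pct, 6), hex(row.pstate))
```

A negative abs diff, as in this row, means the voltage currently set is above the value the tool expects.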
21:08karolherbst: this is with nouveau, right?
21:08Misanthropos: now i am going install nvidia
21:08karolherbst: this is a bit odd
21:08Misanthropos: why is that?
21:09karolherbst: because 1100000 would be nearer to 1091166
21:09karolherbst: but lets wait what nvidia is doing
21:10karolherbst: this was on the 0f pstate?
21:11karolherbst: I thought it crashed your gpu?
21:11Misanthropos: only on your kernel
21:11karolherbst: or was it just bad luck the last time?
21:11karolherbst: ahh okay
21:11karolherbst: ohh yeah, then I understand why the voltage is so high
21:11Misanthropos: right. i am using the info.max patched one
21:12karolherbst: anyway, my branch shouldn't change much on your gpu in the end anyway, because your vbios doesn't have the needed bits for that
21:12karolherbst: makes sense
21:12Misanthropos: which runs stable and makes me happy :D
21:12karolherbst: yeah, but it is stable by accident
21:12karolherbst: it sets a different voltage than it thinks it sets
21:12karolherbst: 1.125V is set
21:13karolherbst: but nouveau wanted to set 1.175V
21:13karolherbst: just by pure luck it is stable
21:13karolherbst: Misanthropos: do you have lm_sensors installed?
21:13karolherbst: sensors should print 1.175V I think
21:14karolherbst: or rather 1.18 or 1.17
21:14Misanthropos: GPU core: +1.18 V
21:14karolherbst: thought as much
21:14karolherbst: so in the end your GPU will run hotter than with nvidia
21:14karolherbst: much hotter
21:14Misanthropos: temp1: +45.0°C
21:15Misanthropos: fan1: 1080 RPM
21:15Misanthropos: its not that bad
21:15karolherbst: yeah, no load
21:15karolherbst: run gputest furmark
21:15karolherbst: then your temperature will jump
21:15Misanthropos: even with load it gets not over 55
21:15karolherbst: gputest furmark
21:15karolherbst: this produces the highest power consumption on mine
21:15karolherbst: temperature above 80°C
21:16karolherbst: where unigine heaven hardly reaches 80°C after 30 minutes
21:16karolherbst: and furmark has it after 5
21:16Misanthropos: where can i find that tool?
21:16karolherbst: well it is a closed source benchmarking thing
21:16karolherbst: it has useful microbenchmarks though
21:17Misanthropos: ok... i am going to boot the kernel with nvidia now
21:21karolherbst: 65W.... and my gpu has a budget of like 80W
21:21karolherbst: sometimes even 68W
21:22karolherbst: Misanthropos: so the deal is now: put heavy load on the gpu ( run glxgears without vsync) and then run the tool again
21:22Misanthropos: current voltage (µV), expected voltage (µV), abs diff (µV),rel diff nouveau/nvidia (%), pstate, cstate, temperature(°C)
21:22Misanthropos: 862500, 861876, -624, 99.927652, 7, 0, 34
21:22karolherbst: by the way. my gpu reaches even 85°C with furmark....
21:22karolherbst: Misanthropos: yeah, this is on the lowest performance state
21:22karolherbst: we need some load
21:22karolherbst: ohhh wait
21:23karolherbst: Misanthropos: open nvidia-settings
21:23karolherbst: then set preferred mode to max performance
21:23karolherbst: it should clock to above 1GHz
21:24karolherbst: mupuf: shit.... furmark reaches like 90°C on my gpu....
21:24Misanthropos: current voltage (µV), expected voltage (µV), abs diff (µV),rel diff nouveau/nvidia (%), pstate, cstate, temperature(°C)
21:24Misanthropos: 1087500, 1094343, 6843, 100.629241, 15, 2, 38
21:24Misanthropos: glx gears running
21:24Misanthropos: 10979.216 FPS
21:24karolherbst: "1.092V is the result for me at 40°C" this was what I calculated on my end
21:24karolherbst: this is pretty close to what nvidia does
21:25karolherbst: I miss something
21:25karolherbst: but I have no clue what...
21:25karolherbst: even on my tree nouveau sets a higher voltage than nvidia
21:25karolherbst: but it works with an even higher one
21:25karolherbst: which is just weird
21:26karolherbst: 94°C by the way with furmark now....
21:26Misanthropos: you maybe should stop the test :D
21:26karolherbst: there is enough room
21:26karolherbst: 97°C is sw downclock
21:26karolherbst: and somewhere around 102°C the hardware clocks down
21:26karolherbst: and on 105°C the gpu just shuts down
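The three thresholds quoted above can be written down as a small lookup. This is only a sketch using the numbers karolherbst gives for his own GPU; the 102°C value is approximate ("somewhere around"), other boards will have different limits, and `thermal_action` is a hypothetical name, not a nouveau function.

```python
# Thresholds as quoted in the chat for one specific GPU; not universal values.
SW_DOWNCLOCK_C = 97    # driver (software) downclock
HW_DOWNCLOCK_C = 102   # approximate: "somewhere around 102°C"
SHUTDOWN_C = 105       # GPU shuts down


def thermal_action(temp_c: float) -> str:
    """Return which protection stage would apply at the given temperature."""
    if temp_c >= SHUTDOWN_C:
        return "shutdown"
    if temp_c >= HW_DOWNCLOCK_C:
        return "hw downclock"
    if temp_c >= SW_DOWNCLOCK_C:
        return "sw downclock"
    return "none"


# 94°C under furmark is still below the first stage.
print(thermal_action(94))
```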
21:27karolherbst: and this is with nouveau
21:27karolherbst: not nvidia
21:28karolherbst: this furmark benchmark is just crazy
21:28karolherbst: heaven doesn't even come close
21:28Misanthropos: just doing the same
21:28karolherbst: my GPU even cools down while running heaven now
21:29Misanthropos: how do i know it is actually running? i just see static picture
21:29karolherbst: I have no idea what is wrong currently. I know that some GPUs have still troubles, but this is like 1 in 10
21:29karolherbst: it shouldn't be static
21:30karolherbst: which test did you start?
21:30karolherbst: odd, it should move
21:31Misanthropos: its not
21:31karolherbst: maybe it is too much for nvidia already :D
21:31karolherbst: run pixmark_piano
21:31Misanthropos: same result
21:31Misanthropos: something is wrong
21:32karolherbst: did you eselect opengl?
21:32Misanthropos: its set to nvidia
21:32karolherbst: then it is very odd
21:32Misanthropos: but maybe i have to compile mesa with nvidia?
21:33karolherbst: try glxinfo
21:33karolherbst: does it show nvidia?
21:33Misanthropos: server glx vendor string: NVIDIA Corporation
21:33karolherbst: and does glxgears work?
21:33Misanthropos: and fast too
21:33karolherbst: no idea then
21:34karolherbst: well your vbios isn't good for dynamic power management anyway :/
21:34Misanthropos: dont insult my vbios :D
21:34karolherbst: well there are a few which aren't good for this
21:34karolherbst: no idea why
21:35karolherbst: you have only three clocks in the vbios
21:35Misanthropos: thats right
21:35karolherbst: normal keplers have like 40 or more
21:35karolherbst: there is also no boosting
21:35Misanthropos: yeah. it was a cheap one
21:35karolherbst: isn't the problem
21:36karolherbst: I should investigate what nvidia does if it encounters such a vbios, but well :/
21:36karolherbst: it just sucks if some resellers do crap like that
21:37Misanthropos: the reason btw i put a lot of effort into switching to nouveau was i read about pstate and i got black screens frequently in an opengl game
21:37Misanthropos: which doesnt happen with nouveau AND the performance is better than the nvidia blob..
21:38karolherbst: it shouldn't be better
21:38rardiol: My pc froze and I got this on the logs : http://nixpaste.lbr.uno/t6pqxrQJ?text . Does it look like a nouveau problem?
21:38Misanthropos: which degraded for some reason
21:38karolherbst: rardiol: no clue
21:38karolherbst: rardiol: try to find out what triggers it
21:40karolherbst: Misanthropos: could you run nvaforcetemp 90
21:40karolherbst: Misanthropos: and then my tool again?
21:40karolherbst: the fans will spin up
21:40rardiol: karolherbst: I was playing a game. but I had already played most of the campaign without it happening. If it has something to do with it it's very uncommon
21:40karolherbst: but this is just due to a faked temperature
21:40karolherbst: and with faked I mean, we force the GPU to think it is much hotter
21:41karolherbst: rardiol: yeah, such issues usually are uncommon
21:41karolherbst: rardiol: do you have by any chance xorg-server 1.18.2 installed?
21:41Misanthropos: current voltage (µV), expected voltage (µV), abs diff (µV),rel diff nouveau/nvidia (%), pstate, cstate, temperature(°C)
21:41Misanthropos: 1062500, 1071036, 8536, 100.803388, 15, 2, 90
21:41karolherbst: k, still higher
21:41karolherbst: Misanthropos: nvaforcetemp 1
21:42rardiol: karolherbst: 1.17.4
21:42karolherbst: rardiol: uhh, no idea then, I just know that 1.18.2 had problems
21:42karolherbst: Misanthropos: and again the tool
21:42Misanthropos: current voltage (µV), expected voltage (µV), abs diff (µV),rel diff nouveau/nvidia (%), pstate, cstate, temperature(°C)
21:42Misanthropos: 1087500, 1102231, 14731, 101.354575, 15, 2, 1
21:42karolherbst: Misanthropos: nvaforcetemp 0
21:43karolherbst: Misanthropos: which resets the temperature reading
21:43karolherbst: Misanthropos: so yeah, nouveau sets the right voltage, but something is odd
21:43rardiol: karolherbst: :( . ok, thanks.
21:43Misanthropos: 1087500, 1095152, 7652, 100.703632, 15, 2, 36
21:43karolherbst: Misanthropos: as long as the second number is higher, everything is fine usually
21:44karolherbst: Misanthropos: so something else is missing, and I have no clue what it is
21:44karolherbst: Misanthropos: did it _always_ freeze with my kernel or just sometimes?
21:44Misanthropos: well - froze once
21:44Misanthropos: another time i used pstate 0f i got artefacts instantly
21:45karolherbst: yeah, that is no problem though
21:45karolherbst: you can just reclock back to 07 and to 0f again
21:45karolherbst: and usually it works
21:45karolherbst: we miss some bits for memory reclocking
21:45karolherbst: so sometimes the display can get messed up
21:45karolherbst: maybe it is this
21:45Misanthropos: i c
21:45karolherbst: Misanthropos: do you have by any chance a second machine and could ssh into your main one?
21:45Misanthropos: i do
21:45karolherbst: what you could do is this
21:46karolherbst: switch pstates as many times as it takes to freeze the machine
21:46karolherbst: then ssh into it and run dmesg
21:46karolherbst: with my kernel tree
21:48Misanthropos: [ 1174.307434] x86/PAT: nvaforcetemp:3937 conflicting memory types d0000000-e0000000 write-combining<->uncached-minus
21:48Misanthropos: [ 1174.307439] x86/PAT: reserve_memtype failed [mem 0xd0000000-0xdfffffff], track uncached-minus, req uncached-minus
21:48Misanthropos: not sure if that gives you a hint
21:48Misanthropos: just saw it in dmesg
21:48karolherbst: doesn't matter
21:48karolherbst: PAT is just annoyed that we write into it
21:56karolherbst: how often did you reclock?
21:56Misanthropos: about 3 times
21:56karolherbst: because this doesn't really look like a voltage issue though...
21:56karolherbst: or maybe some engine is just messed up
21:57karolherbst: Misanthropos: could you boot into the kernel with the info.max hack and see how often you can reclock without messing up the system?
21:57karolherbst: maybe by luck you didn't hit the issue
21:57Misanthropos: i dont think so
21:58Misanthropos: it has been working now for months...
21:58Misanthropos: but i can try
22:00karolherbst: Misanthropos: in debugfs run this: i=0; while true; do echo $((i=$i+1)); echo 07 > pstate ; echo 0f > pstate ; done
22:00karolherbst: if you get more than 1million reclocks it should be stable, if not, well it isn't
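The same stress loop can be sketched as a bounded function that stops on its own at the one-million mark instead of running forever. This is an illustration, not a nouveau tool: `stress_reclock` and its parameters are hypothetical names, and the path of the pstate file has to be supplied by the caller (it lives wherever the debugfs pstate file is on the machine).

```python
from pathlib import Path


def stress_reclock(pstate_file, states=("07", "0f"), max_cycles=1_000_000):
    """Alternate between the given pstates; return the number of completed cycles.

    pstate_file is the debugfs pstate file (location is an assumption of the
    caller). Each cycle writes every entry of `states` once, like the shell loop.
    """
    path = Path(pstate_file)
    done = 0
    for done in range(1, max_cycles + 1):
        for state in states:
            path.write_text(state + "\n")
        # Print every cycle so the last number on screen survives a freeze.
        print(done, flush=True)
    return done


# Usage (as root, with the real pstate path substituted):
#   stress_reclock("<path to debugfs>/pstate")
```

If the machine freezes, the last number printed is the count of reclock cycles it survived, which is the figure compared in this conversation.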
22:00Misanthropos: ok.. i just did about 10 manually between 0a-0f
22:00Misanthropos: in debugfs?
22:01karolherbst: well where the pstate file is
22:01Misanthropos: oh.. ok
22:01karolherbst: 10 times isn't much though
22:01karolherbst: I could expect that we don't suspend some engines while reclocking and a lower voltage could mess it up somewhat
22:01karolherbst: and if we would reclock at a higher voltage and then drop it again, it could be okay
22:04karolherbst: Misanthropos: and how many reclocks did it survive? :D
22:04Misanthropos: ok - it froze after 2394 switches
22:04karolherbst: this isn't much
22:04karolherbst: mine can survive over 500.000
22:05karolherbst: at full load
22:05karolherbst: so we still miss something somewhere
22:05karolherbst: I doubt it is the voltage though
22:05karolherbst: thanks for trying all the things out
22:05Misanthropos: sure! thanks for trying to figure out the problem
22:05karolherbst: but I doubt there is much we can do for now. Somebody just has to investigate more deeply what nvidia is doing and see what we missed
22:06karolherbst: having a display connected on the GPU changes a lot though
22:07karolherbst: well I also see some artefacts with prime while reclocking
22:07Misanthropos: maybe the card is faulty
22:07karolherbst: but more like if parts of an old frame is shown and parts of a new one
22:07karolherbst: Misanthropos: I doubt that
22:07karolherbst: it is more likely that nouveau is doing something wrong
22:08karolherbst: Misanthropos: anyway, you could check if on my branch the card runs stable when on 0f
22:08karolherbst: Misanthropos: if it does, then the missing thing is really just reclocking related
22:08Misanthropos: hmm ok
22:14Misanthropos: karolherbst, impossible.
22:14Misanthropos: karolherbst, i tried 2 times. the first time the system froze while starting up steam
22:14Misanthropos: the second time on pstate switch
22:15karolherbst: ohh I have an idea
22:15Misanthropos: delete the games?;p
22:16karolherbst: trying out something locally
22:16karolherbst: Misanthropos: could you do nvapeek 20200
22:16Misanthropos: 00020200: 27722444
22:17karolherbst: Misanthropos: could you run "nvapoke 0x20200 0x60 27722455" before trying to reclock and see if this changes anything?
22:18karolherbst: it is a bit far fetched
22:18karolherbst: but this somewhat auto deactivates some engines on the gpu
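To see what the suggested poke actually changes, the value read from register 0x20200 can be compared with the value to be written. The sketch assumes both numbers are hexadecimal, as nvapeek prints them; `changed_bits` is a hypothetical helper, and the meaning of the individual bits is not stated in this log.

```python
def changed_bits(old: int, new: int) -> list:
    """Return the positions of the bits that differ between two 32-bit values."""
    diff = old ^ new
    return [bit for bit in range(32) if (diff >> bit) & 1]


old = 0x27722444  # value read back by nvapeek 20200
new = 0x27722455  # value given in the nvapoke suggestion

print(hex(old ^ new), changed_bits(old, new))
```

Under that assumption the two values differ by 0x11, so the poke sets bits 0 and 4 and leaves the rest of the register as it was.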
22:18Misanthropos: it was clocked on 0a
22:18karolherbst: and maybe this also helps with reclocking
22:18Misanthropos: you mean in your kernel?
22:19karolherbst: I am sure the chances of success are like 5%, but maybe...
22:23Misanthropos: karolherbst, no luck - on starting up steam the system froze again...
22:23karolherbst: yeah then I have to find out what it is ...
22:23karolherbst: Misanthropos: can you check the dmesg from the previous boots and see what the errors were?
22:27karolherbst: Misanthropos: sadly I don't have any desktop GPUs here so I can't really investigate the problem :/ I am sure it is something trivial in the end, but well :/
22:27Misanthropos: i just have the current dmesg
22:27karolherbst: Misanthropos: there should be something in /var/log though
22:27Misanthropos: and the system froze so hard before.. i couldn't log in
22:27karolherbst: if you run a system logger
22:28karolherbst: Misanthropos: you use openrc right?
22:28karolherbst: ohh wait I saw a journal issue
22:28Misanthropos: i use systemd
22:28karolherbst: so systemd
22:28karolherbst: you could try to check journalctl --boot -1
22:28karolherbst: and so on
22:28karolherbst: and see if you find something useful
22:28karolherbst: but chances are that the journals are just bricked
22:29Misanthropos: Apr 10 00:21:03 bug kernel: nouveau 0000:04:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
22:29Misanthropos: Apr 10 00:21:07 bug kernel: nouveau 0000:04:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
22:29Misanthropos: Apr 10 00:21:11 bug kernel: nouveau 0000:04:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
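When comparing journals from several boots, it helps to reduce lines like these to a count per distinct error. The sketch below is a rough filter whose regular expression is modelled only on the three lines pasted here; other nouveau messages may be formatted differently, and `count_errors` is a hypothetical name.

```python
import re
from collections import Counter

# Pattern modelled on "nouveau 0000:04:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]".
PATTERN = re.compile(
    r"nouveau (?P<dev>[0-9a-f:.]+): (?P<engine>\w+): "
    r"(?P<error>\w+) (?P<code>[0-9a-f]+) \[(?P<reason>\w+)\]"
)


def count_errors(lines):
    """Count (engine, error, reason) triples found in kernel log lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match["engine"], match["error"], match["reason"])] += 1
    return counts


log = [
    "Apr 10 00:21:03 bug kernel: nouveau 0000:04:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]",
    "Apr 10 00:21:07 bug kernel: nouveau 0000:04:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]",
    "Apr 10 00:21:11 bug kernel: nouveau 0000:04:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]",
]
for key, n in count_errors(log).items():
    print(n, *key)
```

Running the same filter over the output of each earlier boot makes it easy to spot whether a different error shows up.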
22:29karolherbst: Misanthropos: yeah okay, but maybe the others have different errors
22:31Misanthropos: i will try one more time
22:31Misanthropos: with that poke, right?
22:31karolherbst: no, I meant
22:31karolherbst: just give lower numbers to the --boot parameter
22:31karolherbst: -1 means last boot
22:31karolherbst: -3 means the boot before the boot before the last one ;(
22:32Misanthropos: i did but i could find anything
22:32karolherbst: ahh well
22:32Misanthropos: could not
22:32karolherbst: don't bother then
22:33Misanthropos: but i think i couldnt log in because i pressed the power switch.. so i can check once more
22:35Misanthropos: karolherbst, thats the last dmesg https://bpaste.net/show/1afc485660d9
22:36Misanthropos: it freezes on starting up steam
22:36karolherbst: Misanthropos: thanks for trying again. I really have no idea, what could cause this.
22:36karolherbst: It seems to be voltage related, but nouveau already sets a higher one than nvidia
22:36karolherbst: so this is a bit weird
22:36Misanthropos: the info.max thing works for me
22:37karolherbst: yeah, because it sets an even higher one
22:37karolherbst: most likely
22:38Misanthropos: 1125000, 1092593, -32407, 97.119378, 15, 2, 42
22:38Misanthropos: on linux-4.6.0-rc2 with pstate 0f
22:39karolherbst: 0.033V isn't _that_ much though
22:39karolherbst: I can undervolt my GPU by 0.1V without issues
22:39karolherbst: well without many issues that is
22:40Misanthropos: hmmm.. could it be related to a slightly overclocked pci bus?
22:40karolherbst: overclocked pci bus?
22:40karolherbst: the hell?
22:41Misanthropos: just thinking.... i might have done that
22:41karolherbst: well there are many things to overclock, but overclocking the pcie bus has the smallest impact on the gpu
22:41karolherbst: yeah, but it shouldn't
22:41Misanthropos: thats what i thought too
22:41karolherbst: maybe I get one of our GPUs to be that unstable and see what's up
22:42karolherbst: Misanthropos: if you are up to it, you could try to mmiotrace nvidia
22:42karolherbst: and just reclock a little
22:42karolherbst: this might help us figuring it out
22:43Misanthropos: whats the name of the package?
22:43karolherbst: it is in the kernel
22:43karolherbst: Misanthropos: https://wiki.ubuntu.com/X/MMIOTracing
22:45karolherbst: but instead of xinit, you need to start a real X server and get the driver to clock down fully
22:45karolherbst: and clock up again
22:45karolherbst: well you could just disable the display manager service and blacklist nouveau and nvidia
22:45karolherbst: or stuff like that
22:45Misanthropos: by real X server you mean start by hand?
22:46karolherbst: yeah well
22:46karolherbst: xinit nvidia-settings
22:46karolherbst: looks like a good idea too though
22:46Misanthropos: will do that tomorrow
22:46karolherbst: but I think you get the idea after reading the page
22:49Misanthropos: ok - i'll get back to you with a mmiotrace log tomorrow
22:51karolherbst: thanks a lot