04:44 mlankhorst: tagr: I've sent a fix for the tegra124 dt to linux-tegra, who will pick it up?
05:00 karolherbst: meh, we also lose some pmu IRQs even when we don't spam requests :/
05:24 tagr: mlankhorst: that'd be me
05:28 mlankhorst: it's probably for -stable too to make the wdt work
05:32 mlankhorst: was hoping a watchdog would fix my issues, but looks more like I need to force wifi to only use 1 antenna
05:43 karolherbst: skeggsb: could you have a look at this? It is a workaround, but it seems to work for me pretty well. Still it may cause a wrong message to be delivered when we wait on a reply, maybe we want to handle replies a little bit differently so we can map them 1 to 1, or maybe we should find the root cause of the missing IRQs http://lists.freedesktop.org/archives/nouveau/2015-November/023202.html
05:54 karolherbst: mupuf: there? I need someone with a bit of pmu knowledge to verify that my ideas of how to handle those lost IRQs in fallbacks are valid enough
05:55 mupuf: sorry, can't now... working on my Finnish right before the class...
05:55 mupuf: I should be free at 8pm for you
05:55 mupuf: for you == your tw
05:55 mupuf: tz
05:55 karolherbst: no problem
05:56 karolherbst: what would (addr = nvkm_rd32(device, 0x10a4cc) == nvkm_rd32(device, 0x10a4c8)) do? :/
05:56 karolherbst: ohh wait, I can't handle that differently
05:57 mupuf: check that both regs contain the same value and store this value in addr
05:57 karolherbst: yeah, but what if the values aren't the same
05:57 karolherbst: ((addr = nvkm_rd32(device, 0x10a4cc)) == nvkm_rd32(device, 0x10a4c8))
05:57 karolherbst: maybe this is better?
05:57 karolherbst: or more explicit
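A note on the two forms quoted above: in C, `==` binds tighter than `=`, so the first form stores the result of the comparison (0 or 1) in `addr`, while the second stores the register value and then compares it. The sketch below shows the difference; `read_a` and `read_b` are hypothetical stand-ins for the two `nvkm_rd32` calls, and the value they return is made up.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two register reads; the value is made up. */
static uint32_t read_a(void) { return 0x1234; }
static uint32_t read_b(void) { return 0x1234; }

/* No inner parentheses: parsed as addr = (read_a() == read_b()),
 * so addr ends up holding 0 or 1, not the register value. */
uint32_t addr_unparenthesized(void)
{
    uint32_t addr;

    addr = read_a() == read_b();
    return addr;
}

/* Inner parentheses: addr receives the register value first,
 * then that value is compared against the second read. */
uint32_t addr_parenthesized(int *equal)
{
    uint32_t addr;

    *equal = ((addr = read_a()) == read_b());
    return addr;
}
```

With both reads returning the same value, the first function yields 1 and the second yields 0x1234 with `*equal` set to 1.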
06:17 karolherbst: vita_cell: and did you run any benchmarks or get the feeling it is a _bit_ faster?
07:06 karolherbst: was there a way to add a sleep for every frame in mesa?
07:08 imirkin_: what are you attempting to achieve?
07:08 karolherbst: I want to reduce the gpu load a bit
07:09 imirkin_: ah. sorry, dunno
07:09 imirkin_: you could stick a sleep into draw_vbo :)
07:09 karolherbst: or glxSwapBuffers, yes
07:09 imirkin_: that's harder
07:09 karolherbst: why?
07:09 imirkin_: you'd have to find it first
07:10 imirkin_: :)
07:10 karolherbst: I just do stuff like sleep(atoi(getenv("stuff"))); or something like that
07:10 karolherbst: :D
07:10 karolherbst: ohh I would use usleep
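A minimal sketch of the per-frame throttle idea, assuming a hook that runs once per frame (where exactly it would go in mesa is not settled in the discussion). The function names and the environment variable passed in are hypothetical. Unlike a bare `atoi(getenv(...))`, it tolerates the variable being unset.

```c
#define _DEFAULT_SOURCE /* for usleep() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Parse a per-frame delay in microseconds from the environment variable
 * named by var. Returns 0 when it is unset or not a positive number.
 * The result is capped below one second, since usleep() may reject
 * larger values on some systems. */
unsigned int frame_delay_us(const char *var)
{
    const char *val;
    long us;

    assert(var != NULL);
    val = getenv(var);
    if (!val)
        return 0;

    us = strtol(val, NULL, 10);
    if (us <= 0)
        return 0;
    return us > 999999 ? 999999u : (unsigned int)us;
}

/* Call once per frame, e.g. from a draw or swap-buffers hook. */
void throttle_frame(const char *var)
{
    unsigned int us = frame_delay_us(var);

    if (us)
        usleep(us);
}
```

Re-reading the variable every frame is wasteful; a real hook would probably cache the parsed value on first use.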
07:11 karolherbst: imirkin_: any idea how we want to handle the situation where a pstate has a clock range not covered by cstates?
07:12 karolherbst: just add fake cstates?
07:12 karolherbst: with the voltage of the lowest _real_ one?
07:12 imirkin_: huh?
07:12 karolherbst: for example
07:12 karolherbst: nouveau only detects 405MHz as the lowest cstate for my card
07:12 imirkin_: ok
07:12 karolherbst: but the boost table says 135-405MHz for 07 and 135-862MHz for 0a and 0f
07:13 karolherbst: and the blob also clocks to 135 at idle
07:13 imirkin_: ah interesting
07:13 karolherbst: so my idea was to fill up fake cstates between 135 and 405MHz
07:13 imirkin_: sounds like you know way more about this than i do ;)
07:13 karolherbst: and just copy the 405MHz one
07:13 karolherbst: and lower the clocks
07:13 imirkin_: sure
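The fake-cstate idea can be sketched as follows. This is only an illustration of "copy the lowest real entry and lower the clock"; `struct cstate`, its fields, and `fill_fake_cstates` are hypothetical and do not mirror nouveau's actual data structures. The step size is left as a parameter because the discussion has not settled on one.

```c
#include <stddef.h>

/* Hypothetical, simplified cstate: a clock plus an opaque voltage index. */
struct cstate {
    unsigned int freq_mhz;
    unsigned int voltage;
};

/* Write synthetic entries starting at min_mhz, step_mhz apart, stopping
 * before the lowest real cstate's frequency. Each entry copies the
 * voltage of the lowest real cstate. Returns the number of entries
 * written (at most max_out). */
size_t fill_fake_cstates(const struct cstate *lowest_real,
                         unsigned int min_mhz, unsigned int step_mhz,
                         struct cstate *out, size_t max_out)
{
    size_t n = 0;
    unsigned int f;

    if (step_mhz == 0)
        return 0;

    for (f = min_mhz; f < lowest_real->freq_mhz && n < max_out; f += step_mhz) {
        out[n].freq_mhz = f;
        out[n].voltage = lowest_real->voltage;
        n++;
    }
    return n;
}
```

For the 135-405MHz range mentioned above, a 135MHz step would produce two entries (135 and 270); a smaller step trades table size for finer granularity.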
07:14 karolherbst: mhhh
07:14 karolherbst: when I think about this
07:14 karolherbst: 0: freq 810 MHz unkn[0] 1 unkn[1] 2 voltage 10
07:14 karolherbst: maybe one of the unknown means: use for lower clocks too
07:19 karolherbst: imirkin_: the question is now how many I should insert :/
07:19 karolherbst: maybe I just start with 1 for now
07:19 imirkin_: 100000000000
07:19 imirkin_: ;)
07:20 karolherbst: or I do something like at least for every 14MHz one
07:21 imirkin_: see what the power savings are and decide?
07:34 vita_cell: I did previous benchmarks
07:35 vita_cell: but not with that fix
07:46 vita_cell: karolherbst
07:46 vita_cell: I benchmarked, and same performance
08:48 karolherbst: vita_cell: okay, not even like 1 fps more?
09:39 karolherbst: niva
09:39 karolherbst: nice
09:39 karolherbst: imirkin_: okay, seems to work, but it seems to save like no power
09:43 karolherbst: also, my kernel just crashed
09:49 karolherbst: ohh, that was my mistake and totally unrelated to the lower clock thing
09:54 karolherbst: now I need volunteers to test dynamic reclocking, who wants?
09:55 prg_: hm, gk106?
09:56 karolherbst: yeah
09:56 karolherbst: I have one myself
09:57 prg_: want me to test on mine?
09:57 karolherbst: imirkin_: maybe it makes a difference when we do power gating the right way
09:57 karolherbst: prg_: what gpu do you have exactly?
09:57 karolherbst: also your vbios would be nice to have
09:58 imirkin_: maybe
09:58 karolherbst: /sys/kernel/debug/0/vbios.rom
09:58 karolherbst: imirkin_: with constant 60fps load, it didn't make any noticeable difference
09:58 prg_: nouveau [ DEVICE][0000:01:00.0] BOOT0 : 0x0e6000a1, GTX660
09:58 karolherbst: imirkin_: I think the main thing is, that the voltage is still the same
09:58 karolherbst: prg_: okay
09:58 karolherbst: prg_: ever compiled nouveau yourself?
09:58 prg_: as in, a kernel?
09:58 karolherbst: as in a module
09:59 prg_: yes, that too
09:59 prg_: erm, give me half an hour, need to get something done first
10:00 karolherbst: no worries
10:12 karolherbst: imirkin_: should gr: GPC2/TPC1/MP trap: global 00000000 [] warp 3e000e [MEM_OUT_OF_BOUNDS] still happen with piglit?
10:13 karolherbst: or gr: GPC1/TPC0/MP trap: global 00000004 [MULTIPLE_WARP_ERRORS] warp 3c000f [UNALIGNED_MEM_ACCESS]
10:13 imirkin_: the first one, yes... there are like 2-3 tests that do out-of-bounds accesses on purpose
10:13 imirkin_: the second one, definitely not
10:13 imirkin_: if you can tell me what test does that, i'll go whack it
10:13 karolherbst: ohh, currently I do a concurrent run of piglit
10:14 karolherbst: I get tons of errors like that
10:14 imirkin_: ah, it survives with ben's changes?
10:14 karolherbst: maybe, maybe not
10:14 karolherbst: currently it hangs
10:14 imirkin_: ah, so i'm gonna go with 'no' :)
10:14 karolherbst: dmesg is spammed with MEM_OUT_OF_BOUNDS
10:15 imirkin_: could be something messed up in our ctxsw
10:15 karolherbst: 5 hung shader_runner processes
10:15 karolherbst: or my mesa is too old
10:15 imirkin_: i don't think i've ever seen UNALIGNED_MEM_ACCESS
10:16 imirkin_: maybe with hakzsam's compute changes? dunno
10:19 karolherbst: yeah. lol
10:19 karolherbst: X hung until all the processes were dead
10:20 karolherbst: ohhhh crap, runtime s
10:21 hakzsam: on which card?
10:22 hakzsam: this is probably not related to my compute changes though
10:24 karolherbst: mhh, that didn't go too well
10:24 hakzsam: karolherbst, on which card did you run piglit?
10:25 karolherbst: hakzsam: gk106
10:25 karolherbst: but I was also testing my dyn reclocking stuff at the same time
10:25 karolherbst: maybe something accessed gpu memory while it was reclocked?
10:25 hakzsam: so, definitely not related to my compute changes because I didn't do any changes on kepler ;)
10:26 karolherbst: I will do a non concurrent run after upgrading mesa
10:27 karolherbst: hakzsam: I may have asked you this already, but with the counters you are working on, are you able to detect stalls with the gpu "core" waiting on memory operations to finish?
10:31 hakzsam: karolherbst, there are some XXX_waits_for_fb and some counters related to memory loads/stores too
10:31 hakzsam: but you will have to wait ... because nvif is still not in libdrm :)
10:31 karolherbst: hakzsam: you don't have the game antichamber, do you? :/
10:31 hakzsam: nope
10:32 karolherbst: this is the only thing I found which runs at 60fps stable with lowest clocks, but highest memory clock
10:32 karolherbst: so I wanted to check the pdaemon counters against your counters and see if I can detect this there too
10:32 hakzsam: cool
10:32 karolherbst: I think I found something, but I wanted to be sure
10:32 hakzsam: unfortunately we can't do that right now
10:32 karolherbst: hakzsam: np
10:33 karolherbst: hakzsam: this is critical though, because the counters show high core usage while it "waits" for memory, so the dyn reclocking code may think you need a higher core clock and not a higher memory clock
10:33 hakzsam: karolherbst, but once libdrm is upstream, I'll rebase all my work related to those perf counters and go ahead
10:33 karolherbst: and then you clock the core up, but nothing changes
10:34 hakzsam: I see
10:34 karolherbst: it's a big pain
10:34 karolherbst: and if you leave the core clock, and upclock memory
10:34 karolherbst: then the core load drops a lot
10:34 karolherbst: totally counter intuitive :D
10:35 hakzsam: yeah :)
10:35 karolherbst: but besides that, everything works pretty nicely already
10:36 karolherbst: only pcie load is a pain, because you get much lower values than from the other stuff
10:36 karolherbst: so you have to reclock earlier, but this can be dealt with
10:37 prg_: karolherbst, okay, what do you want me to do?
10:37 prg_: karolherbst, vbios: http://expirebox.com/download/82de6a02916fe8fbc396c148684edbfd.html
10:37 karolherbst: prg_: clone my repository: https://github.com/karolherbst/nouveau.git
10:38 karolherbst: and be sure that you are on my master_karol_no_touchy branch
10:38 imirkin_: haha
10:38 karolherbst: there is a lot of crap in it, so your card may just crash randomly ;)
10:38 karolherbst: it doesn't for me
10:38 imirkin_: i thought that was supposed to be your private one :p
10:38 karolherbst: yeah
10:38 karolherbst: but the dangerous stuff is moved out now
10:39 karolherbst: I had the pmu hack enabled by default there earlier
10:39 karolherbst: that's why nobody should have used it
10:39 karolherbst: but now it is fine
10:39 karolherbst: normally it just has code in there I don't feel good about putting on a production system
10:39 karolherbst: now
10:39 prg_: On branch master_karol_no_touchy
10:39 karolherbst: nice
10:39 karolherbst: then compile it and install it
10:40 karolherbst: and then initramfs stuff and everything and reboot :D
10:40 karolherbst: prg_: do you have envytools installed?
10:41 prg_: yes
10:41 karolherbst: nvapeek 101000
10:41 prg_: 00101000: 80404096
10:41 karolherbst: thanks
10:42 prg_: (currently still on the blob if that matters)
10:42 karolherbst: no
10:43 karolherbst: looks nice though
10:44 karolherbst: prg_: the worst that could happen with my branch is, that nouveau upclocks your card a bit, but we will see that then
10:44 prg_: nouveau is very good at locking up my card already anyway
10:45 prg_: erm
10:45 karolherbst: prg_: ohh you need a 4.3 kernel by the way :D
10:45 prg_: you mean clocking
10:45 prg_: that wouldn't be that bad now
10:45 karolherbst: prg_: when does nouveau lock up your gpu?
10:47 karolherbst: prg_: could you give me the output of nvapeek 0x10a500 0x80 ?
10:47 karolherbst: then we could check before if nouveau will upclock with no load or not
10:49 prg_: just running a game for a while tends to be enough, or running a game while moving a video from one monitor to the other
10:49 karolherbst: prg_: at default clocks?
10:49 prg_: yes
10:49 karolherbst: mhhh
10:49 prg_: https://paste.debian.net/333597/
10:50 karolherbst: prg_: then could you do this? nvapoke 0x10a524 0x1000 && nvapoke 0x10a52c 0x2 && nvapoke 0x10a528 0x80000000 && sleep 0.1 && nvapeek 0x10a528
10:51 prg_: 0010a528: 002d7965
10:51 karolherbst: mhhhh
10:52 karolherbst: once more nvapeek 0x10a500 0x80
10:52 prg_: https://paste.debian.net/333599/
10:53 karolherbst: ohh no, it's fine, I was being stupid a bit
10:53 karolherbst: still mhh
10:53 karolherbst: prg_: do you have currently any gpu load?
10:54 prg_: just doing stuff in a terminal in a non-composited kde desktop
10:54 prg_: (two monitors)
10:55 karolherbst: mhhhh
10:56 karolherbst: the value isn't high enough to cause any troubles, but it does something besides that thing I think it does
10:56 karolherbst: shouldn't cause any troubles though
11:04 prg_: brb
11:06 karolherbst: imirkin_: segfault in gl-3.1-vao-brok
11:06 karolherbst: ohhh
11:06 karolherbst: I am running on intel
11:06 karolherbst: crap :/
11:06 imirkin_: :)
11:06 imirkin_: yeah, they refuse to fix the error in their driver for some reason
11:06 imirkin_: it should be an easy fix... dunno
11:10 prg: karolherbst, ok, what now?
11:10 karolherbst: imirkin_: got UNALIGNED_MEM_ACCESS again and I passed --dmesg to piglit
11:10 karolherbst: prg: mhh go into /sys/kernel/debug/dri/1 as root
11:10 imirkin_: karolherbst: cool... figure out whodunit
11:11 karolherbst: imirkin_: I am at 9% of all tests :D
11:11 karolherbst: already 3 dmesg-fails
11:11 prg: karolherbst, there's only 0 128 64
11:12 karolherbst: prg: right, I meant 0
11:12 karolherbst: sorry
11:12 karolherbst: I have a laptop, so it is 1 for me
11:12 prg: ok, np
11:13 karolherbst: prg: there cat the pstate file
11:13 karolherbst: would like to take a look at that
11:13 imirkin_: karolherbst: grep the partial results
11:13 prg: http://paste.debian.net/333617/
11:14 karolherbst: imirkin_: 775
11:14 karolherbst: spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec3-index-wr-before-barrier
11:14 imirkin_: karolherbst: :p the command please
11:14 imirkin_: BLAST
11:14 imirkin_: karolherbst: i'm going to need your help to debug it... i suspect there's something wrong in the kepler emitter
11:14 karolherbst: prg: and current_load
11:14 karolherbst: imirkin_: k
11:14 imirkin_: karolherbst: and i only have a GK208, which uses SM35
11:14 karolherbst: ohhh okay
11:15 imirkin_: karolherbst: grab the command
11:15 imirkin_: (and paste it here... i want to double-check something)
11:15 imirkin_: should be like shader_runner tests/spec/bstuff
11:15 karolherbst: piglit/tests/spec/arb_tessellation_shader/execution/variable-indexing/tcs-output-array-vec3-index-wr-before-barrier.shader_test -auto
11:16 karolherbst: ahh yes, shader_runner before that
11:16 prg: karolherbst, core: 0 mem: 87 video: 0 pcie: 0
11:16 karolherbst: prg: mhhh okay, seems like we can't help that now, but it is fine
11:16 prg: help what?
11:16 imirkin_: yeah ok cool. now do NV50_PROG_DEBUG=1 and run that command
11:16 karolherbst: these loads are not percent but per0xff
11:16 imirkin_: and pastebin the output
11:16 karolherbst: prg: the memory load
11:16 karolherbst: it's fine though
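Since the load values are on a 0..0xff scale rather than percent, reading them as percentages is misleading. A small conversion helper, as a sketch (the function name is hypothetical):

```c
/* Convert a load value on a 0..0xff scale to a percentage,
 * rounded to the nearest whole percent. Out-of-range input is clamped. */
unsigned int load_to_percent(unsigned int load)
{
    if (load > 0xff)
        load = 0xff;
    return (load * 100 + 0x7f) / 0xff;
}
```

By this conversion the `mem: 87` reading above is about 34%, not 87%.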
11:17 karolherbst: prg: try to put some load on the gpu
11:17 karolherbst: and check the pstate file if it reclocks
11:17 karolherbst: and keep an eye on the current load
11:17 karolherbst: imirkin_: how can I pause a piglit run?
11:18 imirkin_: you can't... just run it
11:18 imirkin_: probably won't kill your box ;)
11:18 imirkin_: some random piglit will get a dmesg failure
11:18 karolherbst: imirkin_: I have to compile my dev mesa anyway
11:18 imirkin_: ohh ok
11:18 imirkin_: aaaactually
11:18 imirkin_: i should be able to check this out locally
11:19 imirkin_: hold on
11:19 karolherbst: imirkin_: there seems to be other tests now
11:19 imirkin_: yeah, expected
11:20 imirkin_: all tcs though right?
11:20 imirkin_: or maybe gs-something
11:20 karolherbst: spec@arb_tessellation_shader@execution@variable-indexing@vs-output-array-vec2-index-wr-before-tcs too
11:20 imirkin_: ERROR: unknow op
11:20 imirkin_: uh oh!
11:20 imirkin_: did i forget to do something?
11:20 imirkin_: this one might be my bad.
11:20 imirkin_: [well, it clearly is...]
11:21 imirkin_: grrrr
11:21 imirkin_: i wrote the emit function, just forgot to call it
11:21 karolherbst: imirkin_: okay, and I found the hanging test
11:21 karolherbst: it spams dmesg full with MEM_OUT_OF_BOUNDS :D
11:21 vita_cell: some 1-5fps more, I think, benchmarks are like same, some fps more
11:22 karolherbst: vita_cell: well, some more is always good :)
11:22 karolherbst: the nearer we come to the blob the better
11:23 vita_cell: just very nice work
11:23 imirkin_: karolherbst: this should fix you right up: http://hastebin.com/comoxuzowu.avrasm
11:25 karolherbst: crap, pci_pm_runtime_suspend(): nouveau_pmops_runtime_suspend+0x0/0xd0 [nouveau] returns -16 again :/
11:30 imirkin_: karolherbst: already pushed fyi
11:32 prg: karolherbst, screens were flickering to black briefly, http://paste.debian.net/333623/ then http://paste.debian.net/333624/
11:43 karolherbst: mhhh
11:45 karolherbst: lol, what's happening :/
11:45 karolherbst: my laptop is going a bit crazy
11:45 imirkin_: jumping up and down?
11:45 imirkin_: possessed by the devil?
11:46 karolherbst: mhhh
11:47 karolherbst: I think I have to do a cold reboot
11:47 karolherbst: the intel gpu is pretty much messed up
11:50 karolherbst: ohh no, EGL is just bricked a bit on intel
11:51 prg: karolherbst, got my message earlier?
11:51 prg: karolherbst, screens were flickering to black briefly, http://paste.debian.net/333623/ then http://paste.debian.net/333624/
11:51 karolherbst: yeah, I look into that now
11:51 karolherbst: just nouveau messed up my kernel
11:51 karolherbst: I had like 18 dead kworker threads
11:51 karolherbst: :D
11:51 karolherbst: and my internet connection was gone
11:51 imirkin_: prg: don't you need that c800 workaround enabled? or was that someone else?
11:52 karolherbst: imirkin_: I thought this was only for laptops?
11:52 karolherbst: prg: the flicker comes from pstate changes
11:52 karolherbst: this might happen
11:52 prg: the what workaround?
11:52 imirkin_: prg: nevermind. confused you with someone else.
11:53 karolherbst: prg: but the desktop and everything still works?
11:53 karolherbst: mhh
11:53 imirkin_: prg: if you can capture an apitrace that leads up to the [ 1203.025496] nouveau 0000:01:00.0: gr: GPC1/TPC0/MP trap: global 00000004 [MULTIPLE_WARP_ERRORS] warp 3c0009 [INVALID_OPCODE]
11:53 karolherbst: reclocking memory isn't rock solid currently, so strange things might happen
11:53 imirkin_: that'd be interesting
11:53 prg: display was frozen, had to reboot
11:53 imirkin_: ah, maybe it's a result of some sort of memory fail
11:53 karolherbst: prg: could you try out if this happens again?
11:55 prg: imirkin_, it took a few minutes of standing around, so the trace might get quite big i guess
11:55 prg: karolherbst, will do
11:56 imirkin_: prg: ah, well first see if it repros... probably just a "reclocking messed everything up" type of issue
11:59 karolherbst: imirkin_: I think nouveau shouldn't do the while loop there: https://github.com/karolherbst/nouveau/blob/master_karol_no_touchy/drm/nouveau/nvkm/subdev/timer/nv04.c#L43
11:59 karolherbst: when the gpu gets messed up, both regs read always bad0011f or something
11:59 karolherbst: and the kernel gets stuck in there forever
11:59 karolherbst: messing up the entire system
11:59 karolherbst: even network goes down and stuff
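One common way to keep a retry loop from spinning forever on a wedged device is a retry limit. The sketch below is not the code at the linked line; it is a generic bounded read of a 64-bit counter split across two 32-bit registers, with register access replaced by a callback so it can run without hardware. The register offsets, function names, and both mock readers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register offsets for the low and high counter words. */
#define REG_TIME_LO 0x0
#define REG_TIME_HI 0x4

typedef uint32_t (*read_reg_fn)(void *ctx, uint32_t reg);

/* Read hi, lo, then hi again; accept the sample only if hi did not change
 * in between. Gives up after max_tries instead of looping forever.
 * Returns true and fills *out on success, false on failure. */
bool read_time_bounded(read_reg_fn rd, void *ctx,
                       unsigned int max_tries, uint64_t *out)
{
    unsigned int i;

    for (i = 0; i < max_tries; i++) {
        uint32_t hi = rd(ctx, REG_TIME_HI);
        uint32_t lo = rd(ctx, REG_TIME_LO);

        if (hi == rd(ctx, REG_TIME_HI)) {
            *out = ((uint64_t)hi << 32) | lo;
            return true;
        }
    }
    return false;
}

/* Mock reader: returns fixed values, so the first attempt succeeds. */
uint32_t steady_reader(void *ctx, uint32_t reg)
{
    (void)ctx;
    return reg == REG_TIME_HI ? 0x1 : 0x2;
}

/* Mock reader: the high word changes on every read, so no attempt
 * ever sees a stable value. ctx points to an unsigned int counter. */
uint32_t flapping_reader(void *ctx, uint32_t reg)
{
    unsigned int *n = ctx;

    return reg == REG_TIME_HI ? (*n)++ : 0;
}
```

The caller then has to decide what a failed read means, which is the harder design question; the limit only guarantees the loop itself terminates.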
12:04 prg: karolherbst, http://paste.debian.net/333633/
12:05 karolherbst: basically the card messed up
12:05 karolherbst: prg: how long does stuff usually run?
12:06 prg: without reclocking? maybe minutes, maybe all day
12:07 prg: have been mostly using the blob recently
12:07 karolherbst: mhh
12:07 karolherbst: sad :/
12:08 karolherbst: seems like something upsets your card
12:10 karolherbst: prg: thanks for trying it out though.
12:10 karolherbst: It seems to have worked somehow, but well, there are other issues :(
12:11 imirkin_: karolherbst: can you check that my fix resolved your unaligned_mem thing?
12:11 karolherbst: yeah, checking now
12:12 imirkin_: thankfully nothing outside piglit really does variable indexing like that
12:15 karolherbst: yep
12:15 karolherbst: works now
12:16 imirkin_: yay!
12:18 karolherbst: I think both errors are gone now
12:18 prg: right... then i should first try to get things more stable without reclocking
12:19 karolherbst: prg: yeah, well
12:19 prg: (i.e. post logs and hope someone else figures out wtf is going on)
12:19 karolherbst: prg: you could use my master branch
12:19 karolherbst: it's the normal upstream nouveau branch
12:19 karolherbst: just compatible with older kernels
12:19 prg: well i'm on 4.3 now
12:19 karolherbst: but then you could boot with nouveau.pstate=1
12:19 karolherbst: prg: yes and upstream needs 4.4
12:19 prg: ah
12:20 prg: k, will try later
12:20 karolherbst: prg: you could boot with nouveau.pstate=1 and then check if clocking to 0a yourself works
12:20 karolherbst: and if clocking to 0d/0f works
12:20 karolherbst: it could be that the card messed up, because there were too many reclock requests or something
12:20 karolherbst: worst case is, that the pmu sends a reclock request every 0.1 seconds
12:21 karolherbst: lol :O
12:21 karolherbst: visual studio is open source now?
12:21 karolherbst: :O
12:22 karolherbst: ohh only vs code
12:22 karolherbst: whatever
12:24 karolherbst: imirkin_: got mem out of bounds in the shaders@glsl-array-bounds-10 test
12:24 imirkin_: that's expected :)
12:24 karolherbst: k
12:24 imirkin_: and another one of those too iirc
12:24 karolherbst: mhh
12:24 karolherbst: piglit detects the gpu going to sleep as dmesg/warn :&
12:25 karolherbst: imirkin_: what about this? glx-multithread[3345]: segfault at 8 ip 00007f13bdbc8abe sp 00007f13bfa82950 error 4 in libdrm_nouveau.so.2.0.0[7f13bdbc5000+7000]
12:25 karolherbst: :D
12:25 imirkin_: karolherbst: -x glx
12:25 imirkin_: that'll fix you right up :)
12:25 karolherbst: ohh
12:25 karolherbst: mhh
12:25 karolherbst: still it shouldn't crash, right?
12:25 imirkin_: i never run the glx tests
12:26 imirkin_: they tend to kill my box
12:26 imirkin_: which makes it no fun to debug
12:26 imirkin_: probably not.
12:26 karolherbst: aha
12:26 karolherbst: well
12:26 karolherbst: well, they shouldn't crash my box
12:26 karolherbst: ...
12:26 karolherbst: shouldn't have said that
12:26 karolherbst: :D
12:27 imirkin_: i think the glx-multithread test might do precisely what nouveau doesn't support btw
12:30 karolherbst: the other one is shaders@glsl-array-bounds-12 as it seems
12:30 karolherbst: mhh
12:30 karolherbst: imirkin_: arb_uniform_buf[3899]: segfault at 8 ip 00007f6f5a9d0abe sp 00007fffa9b93870 error 4 in libdrm_nouveau.so.2.0.0[7f6f5a9cd000+7000]
12:31 imirkin_: that's the one i added recently i think
12:31 imirkin_: the dsa one right?
12:31 karolherbst: no idea
12:31 imirkin_: is it called arb_uniform_buffers-rendering-dsa or something?
12:32 karolherbst: piglit doesn't collect dmesg for crashed ones, so I have to check a bit deeper
12:32 imirkin_: look for "crash"
12:32 karolherbst: ahh
12:32 karolherbst: spec@arb_uniform_buffer_object@rendering-dsa
12:33 imirkin_: yep. expected.
12:33 karolherbst: spec@arb_tessellation_shader@execution@barrier also got an out of bound
12:34 imirkin_: unexpected.
12:34 imirkin_: oh wait
12:34 imirkin_: i get that too
12:34 imirkin_: heh
12:34 imirkin_: but... it passes
12:34 imirkin_: what could possibly be going wrong
12:34 imirkin_: gr
12:34 imirkin_: oh wait
12:35 imirkin_: i know that issue
12:35 imirkin_: expected ;)
12:35 karolherbst: :D
12:35 imirkin_: http://patchwork.freedesktop.org/patch/55770/
12:35 karolherbst: another one in spec@arb_tessellation_shader@execution@barrier-patch but I guess this is expected too
12:35 imirkin_: yup
12:36 karolherbst: nearly at half of all tests
12:36 karolherbst: looks much better now
12:36 karolherbst: also nouveau doesn't mess up anymore
12:36 karolherbst: what's DRM: skipped size 0?
12:37 imirkin_: that's some dumb 0-sized allocation making it through somehow
12:37 karolherbst: in spec@!opengl 1.1@max-texture-size
12:37 karolherbst: mhhh
12:37 karolherbst: crash in spec@arb_tessellation_shader@execution@variable-indexing@tcs-input-array-float-index-rd
12:38 karolherbst: ohh tcs
13:11 prg: okay, now using master, didn't reclock. no lockups yet, but a bunch of nouveau 0000:01:00.0: FEAR[2764]: nv50cal_space: -16
13:12 imirkin_: what comes before that?
13:12 imirkin_: coz that's basically "gpu's stuck"
13:12 prg: that's the first thing in dmesg
13:13 imirkin_: then i have nfc
13:17 prg: in userspace, lots of http://paste.debian.net/333673/ when that happens
13:19 imirkin_: right
13:19 imirkin_: the thing in the kernel rejects the batch
13:19 imirkin_: and the userspace lib prints rejected things
13:19 imirkin_: but nv50cal_space = no more ib space left
13:20 imirkin_: which means that pushbufs aren't being processed
13:29 karolherbst: prg: FEAR as in the win game?
13:31 prg: karolherbst, yes
13:31 karolherbst: mhh maybe, wait a second
13:31 karolherbst: I know I have a fear game
13:32 karolherbst: but don't know which
13:32 karolherbst: yay, all of them up to 3
13:32 karolherbst: prg: the normal fear one or one of the other two?
13:33 prg: the first one
13:33 prg: was doing the benchmark thing in the options menu
13:34 karolherbst: prg: I meant I have three of the first one at steam, the normal and "Extraction Point" and "Perseus Mandate"
13:34 karolherbst: mhhh
13:34 karolherbst: well
13:34 prg: ah, normal one then
13:34 karolherbst: benchmarking a game through wine might be misleading
13:34 karolherbst: 17GB :/
13:34 karolherbst: meh
13:34 karolherbst: why is the first one the biggest one?
13:34 karolherbst: :D
13:35 prg: wasn't hoping for great fps, just wanted to see if it was stable
13:35 karolherbst: prg: do you have wine with the nine patches?
13:35 prg: no
13:35 karolherbst: mhh okay
13:35 karolherbst: will download fear anyway and test it here
13:36 karolherbst: prg: are you using newest mesa by the way?
13:38 prg: 11.0.5
13:39 prg: and if i just run fear normally, not using the benchmark, i'm not getting that message
13:39 imirkin_: oh interesting
13:39 prg: also nothing bad happens when i get it, so i guess it's not that important
13:39 imirkin_: so there's another possibility for getting that error
13:39 imirkin_: which is "you're submitting commands faster than the gpu can process them"
13:40 karolherbst: :D
13:40 imirkin_: basically that ring is finitely-sized
13:40 karolherbst: imirkin_: what would happen when the pcie bus is too slow?
13:40 imirkin_: and processes at a finite speed
13:41 imirkin_: karolherbst: dunno, this isn't about bus speeds though
13:41 karolherbst: yeah but command submissions go through the pcie bus, right?
13:41 karolherbst: I was just thinking if there are too many, so that pcie is the bottleneck and not the gpu itself
13:41 imirkin_: sure, but it's not a question of sending the commands over the pci bus
13:42 imirkin_: it's a question of commands taking a certain amount of time to execute
13:42 karolherbst: ohh okay
13:42 prg: didn't reclock, so gpu should be quite slow
13:42 imirkin_: and if you can generate them faster than they execute, then you get a backlog
13:42 karolherbst: prg: might want to try out 0a pstate
13:42 imirkin_: prg: ah, try with reclock?
13:42 karolherbst: it should be stable enough
13:42 prg: will do
13:43 imirkin_: normally there are various waits involved
13:43 imirkin_: where the gpu results are used by the cpu for something
13:43 imirkin_: which naturally limits backlog growth
13:44 imirkin_: but you could easily have rendering that doesn't ever look at results of anything ever
13:44 imirkin_: which would cause this wait-less situation
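The backlog argument can be illustrated with a toy model: a fixed-size ring, a producer that submits a number of commands per tick, and a consumer that retires a number per tick. This is a sketch of the general mechanism only, not of nouveau's or libdrm's implementation; the ring size and all names are hypothetical. On Linux, `-EBUSY` is -16, the value in the message above.

```c
#include <errno.h>

#define RING_SLOTS 8u /* hypothetical ring size */

/* Free-running put/get counters; used slots = put - get. */
struct ring {
    unsigned int put;
    unsigned int get;
};

static unsigned int ring_used(const struct ring *r)
{
    return r->put - r->get;
}

/* Producer side: fails with -EBUSY when no slot is free. */
int ring_submit(struct ring *r)
{
    if (ring_used(r) == RING_SLOTS)
        return -EBUSY;
    r->put++;
    return 0;
}

/* Consumer side: retire one command if any is pending. */
void ring_retire(struct ring *r)
{
    if (ring_used(r) > 0)
        r->get++;
}

/* Each tick the producer submits `produce` commands, then the consumer
 * retires `consume`. Returns the tick of the first failed submission,
 * or -1 if none fails within `ticks`. */
int first_stall_tick(unsigned int produce, unsigned int consume, int ticks)
{
    struct ring r = { 0, 0 };
    unsigned int i;
    int t;

    for (t = 0; t < ticks; t++) {
        for (i = 0; i < produce; i++)
            if (ring_submit(&r) < 0)
                return t;
        for (i = 0; i < consume; i++)
            ring_retire(&r);
    }
    return -1;
}
```

With 3 submissions and 1 retirement per tick the 8-slot ring fills on tick 3; with matched rates it never stalls. A producer that waits on results, as described above, effectively lowers its own submission rate.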
13:44 karolherbst: prg: my branch just lets a microprocessor on the gpu check the load every 0.1 seconds and send a reclock request to the cpu, when the load is either too high or too low
13:44 prg: um, where was that file again? cat /proc/cmdline: ro root=/dev/sda1 rootflags=discard nouveau.pstate=1 net.ifnames=0, find /sys/kernel/debug/ -name pstate: nothing
13:45 prg: (yes, debugfs is mounted)
13:45 imirkin_: karol moved it
13:45 karolherbst: prg: it looks good already
13:45 karolherbst: :D
13:45 karolherbst: ohh right
13:45 karolherbst: my mistake
13:45 imirkin_: it's in sysfs without his patches
13:46 karolherbst: imirkin_: https://bugs.freedesktop.org/show_bug.cgi?id=92991 c800 hack?
13:46 prg: oh, there's a /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/pstate now
13:46 karolherbst: totally looks like it
13:47 karolherbst: imirkin_: nouveau.config=War00C800_0=1 ?
13:47 imirkin_: karolherbst: yes
13:48 imirkin_: want me to handle it, or you got it?
13:48 karolherbst: I just suggest to add that to the kernel cmd line
13:48 karolherbst: :D
13:48 karolherbst: mhh, I got it I think
13:49 karolherbst: I already fixed it for my laptop, so
13:49 karolherbst: mhh I think I have to wait for skeggsb today, have a lot of stuff to discuss with him :D
13:49 imirkin_: skeggsb: any luck with stability?
13:50 prg: getting a few more fps with 0a, but also still that nv50cal_space: -16 message
13:50 imirkin_: pedal to the metal?
13:50 karolherbst: :D
13:50 karolherbst: prg: 0f!
13:51 karolherbst: do it
13:51 prg: that used to kill the system quite reliably, let's see...
13:51 karolherbst: ha! yeah used to
13:51 imirkin_: you have karol's patches right?
13:51 imirkin_: or at least 4.4-rc1?
13:52 karolherbst: yes he has it
13:52 prg: HEAD detached at karolherbst/master
13:52 karolherbst: when he uses my master branch that is
13:52 imirkin_: ok cool
13:52 prg: it survived the echo 0f, now for the game
13:54 prg: lots more fps, still the message
13:55 prg: maybe it's just doing something weird when doing some tests for figuring out performance
13:55 imirkin_: maybe
13:56 imirkin_: i wish i could take like... a month... to concentrate on nouveau
14:01 karolherbst: imirkin_: okay, I think I somehow know what zculling is
14:03 karolherbst: it basically reduces the number of pixels for which pixel shaders are run, by determining stuff you won't see anyway on the gpu instead of finding those with the cpu
14:03 karolherbst: and z culling seems to be shader aware too
14:03 karolherbst: what are occluded pixels?
14:03 imirkin_: pixels that are behind something
14:03 imirkin_: i.e. fail the depth test
14:04 karolherbst: ohh okay
14:05 karolherbst: but sounds promising though
14:05 glennk: z test may be run before or after the shader, depending on state
14:05 imirkin_: glennk: zcull is a whole separate thing though
14:05 imirkin_: glennk: some sort of additional buffer
14:06 glennk: you mean hier-z?
14:06 imirkin_: i have nfc how it works though
14:06 imirkin_: if you say so
14:06 imirkin_: like i said, i have no clue what it is or how it works
14:06 glennk: conceptually it's simple
14:06 imirkin_: it _appears_ to just be an auxiliary buffer for a particular render
14:06 imirkin_: but what do i know
14:06 glennk: for each tile you store min/max z
14:07 imirkin_: hm ok
14:07 glennk: and when rasterizing a tile you check the z bounds against that, if it fails you can reject the entire tile in one op
14:07 imirkin_: so that means that this buffer needs to be a child of a depth texture?
14:07 glennk: then apply this recursively to larger tiles in turn
14:07 imirkin_: [or rather, of a depth rb]
14:07 glennk: it'd be associated with it yes
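glennk's description of per-tile min/max z can be sketched like this. It assumes a "less" depth test (smaller z is nearer) and is a conceptual model only, not how zcull or hier-z is implemented on any particular hardware; all names are hypothetical.

```c
/* Depth bounds stored for one tile of the depth buffer. */
struct tile_z {
    float min_z;
    float max_z;
};

enum tile_result {
    TILE_REJECT,    /* every incoming fragment fails the depth test */
    TILE_ACCEPT,    /* every incoming fragment passes the depth test */
    TILE_PER_PIXEL  /* bounds overlap, fall back to per-pixel testing */
};

/* Classify incoming geometry covering a tile, given the depth range
 * [in_min, in_max] it spans there. With a "less" test a fragment passes
 * only if its z is smaller than the stored z. */
enum tile_result classify_tile(struct tile_z stored, float in_min, float in_max)
{
    if (in_min >= stored.max_z)
        return TILE_REJECT;   /* nearest incoming z is behind everything stored */
    if (in_max < stored.min_z)
        return TILE_ACCEPT;   /* farthest incoming z is in front of everything stored */
    return TILE_PER_PIXEL;
}

/* Bounds of a larger tile that covers two smaller ones; applying this
 * repeatedly builds the coarser levels of the hierarchy. */
struct tile_z merge_tiles(struct tile_z a, struct tile_z b)
{
    struct tile_z m;

    m.min_z = a.min_z < b.min_z ? a.min_z : b.min_z;
    m.max_z = a.max_z > b.max_z ? a.max_z : b.max_z;
    return m;
}
```

The reject case is the "one op" win: a single comparison against the tile's stored maximum skips every pixel in the tile.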
14:12 glennk: on r300 era radeons it was a special on-chip memory, and could only store info for one surface at a time, not sure what < nv50 did
14:13 imirkin_: there is hierz support for nv17
14:13 imirkin_: nv17+ that is
14:13 imirkin_: in nouveau_vieux
14:13 imirkin_: but the later stuff is all called "zcull"
14:13 imirkin_: rather than "hierz"
14:14 glennk: may be a similar deal there then
14:14 imirkin_: r300 = nv30?
14:14 imirkin_: as far as eras are concerned?
14:15 glennk: yeah
14:15 imirkin_: glennk: https://github.com/envytools/envytools/blob/master/rnndb/graph/g80_3d.xml#L231
14:15 glennk: r100 had hyperz
14:16 imirkin_: from the looks of it, nv50 has some sort of on-chip thing
14:16 imirkin_: so it's actually per-render
14:17 imirkin_: whereas nvc0 has a separate zcull memory area
14:17 imirkin_: which i guess should be a framebuffer property or... something
14:18 glennk: separate is much nicer with FBOs and binding back and forth
14:18 imirkin_: well i don't make the rules :)
14:18 imirkin_: or the hw, for that matter
14:18 glennk: can probably ignore the older hw, they had awkward rules when it could be enabled or not
14:19 imirkin_: well, calim couldn't figure it out on nvc0
14:19 imirkin_: which leads me to believe i won't either
14:29 karolherbst: http://developer.download.nvidia.com/GPU_Programming_Guide/GPU_Programming_Guide_G80.pdf
14:29 karolherbst: pages 43 and 44
14:29 karolherbst: there is some stuff about z culling
14:30 karolherbst: they also state when z cull can't do anything
14:30 karolherbst: or is hardly an improvement
14:32 karolherbst: and page 31
14:35 karolherbst: ohhh
14:35 karolherbst: Zcull and Hi-Z are coarse stuff
14:36 karolherbst: where earlyZ is fine-grained stuff
14:37 karolherbst: maybe not
14:37 karolherbst: mhh
14:39 karolherbst: "Before a fragment reaches the fragment processor, the z-cull unit compares the pixel's depth with the values that already exist in the depth buffer. If the pixel's depth is greater, the pixel will not be visible, and there is no point shading that fragment, so the fragment processor isn't even executed."
14:39 karolherbst: sounds like fun somehow
14:40 karolherbst: okay and ZCull != Early Culling
14:42 imirkin_: :)
14:42 RSpliet: Isn't that part of your standard rasterisation operation?
14:42 RSpliet: (and, wouldn't that fail for translucent fragments? :-P)
14:43 imirkin_: early z is when you do the depth test *before* frag shader invocation
14:43 imirkin_: which you can't do if the frag shader e.g. writes depth
14:43 karolherbst: http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter30.html
14:43 karolherbst: that's the best I found about zcull
14:43 karolherbst: well "about zcull"
14:43 glennk: translucent stuff you need to sort and draw in back to front order (aka painter's algorithm), or use various fancy shader tricks to get correct results
14:43 karolherbst: it is about a lot of other stuff too
14:45 RSpliet: glennk: I figured there must be ways... didn't read the restrictions so well but I guess it's in there :-P
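The back-to-front ordering glennk mentions for translucent surfaces can be illustrated with a small sketch. Everything here is hypothetical: surfaces are plain dicts with a single scalar "color", `depth` is the distance from the viewer, and standard "over" alpha blending is assumed.

```python
def painter_order(surfaces):
    """Sort surfaces so the farthest one is drawn first (painter's algorithm)."""
    return sorted(surfaces, key=lambda s: s["depth"], reverse=True)


def blend_over(dst, src_color, src_alpha):
    """Standard 'over' blend of a source on top of what is already drawn."""
    return src_alpha * src_color + (1.0 - src_alpha) * dst


def composite(surfaces, background):
    """Draw all surfaces back to front onto a scalar background value."""
    result = background
    for s in painter_order(surfaces):
        result = blend_over(result, s["color"], s["alpha"])
    return result
```

Because each blend depends on what is already in the framebuffer, drawing the same surfaces in a different order generally gives a different result, which is why a depth test alone is not enough for translucency.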
14:45 karolherbst: a GeForce 6800 Ultra can discard 64 pixels per clock cycle
14:45 karolherbst: mhh sounds impressive
14:46 imirkin_: i can discard infinity pixels per second... in my head!
14:46 imirkin_: *not* discarding... heh
14:46 karolherbst: :D
14:46 imirkin_: that's trickier
14:46 glennk: it's more like discard entire triangles on more modern cards
14:46 karolherbst: yeah, I figured that already
14:47 karolherbst: I think zcull is only a coarse thing
14:47 imirkin_: i bet zcull is good for a few % perf
14:47 karolherbst: yeah
14:47 karolherbst: I bet that too
14:47 karolherbst: a few like in 15 or more
14:47 karolherbst: imirkin_: I bet there is a way to disable zcull on keplers while nvidia is loaded
14:48 karolherbst: and then there might be a big fps drop
14:48 glennk: more so on IGPs and low end cards than on faster ones
14:48 imirkin_: coz fill rate is lower?
14:48 glennk: memory bandwidth too
14:48 karolherbst: glennk: don't think so
14:48 imirkin_: so skipping filling is more important :)
14:48 karolherbst: glennk: zculling reduces pixel shader invocations
14:48 glennk: karolherbst, i don't have to think, long since measured :-p
14:48 karolherbst: :D
14:48 karolherbst: ahh okay
14:49 karolherbst: glennk: any idea how to disable zculling on the blob?
14:49 karolherbst: :D
14:49 glennk: on really old chips flip the z test direction
14:49 glennk: midframe
14:49 karolherbst: I meant by messing with gpu regs
14:55 karolherbst: ohh
14:56 karolherbst: I got the fps in Heaven to drop by like 8%
14:56 imirkin_: congrats? :)
14:56 karolherbst: nvapoke 0x225a8 0x20 0xffffffff
14:56 karolherbst: this does impact perf
15:01 RSpliet: apparently crashing is quite bad for performance indeed...
15:13 karlmag: RSpliet: a performance limiter at the very least :-P
17:50 pspace: Hey everyone, does anyone have personal experience with Kepler chipsets? I am considering ditching the proprietary driver in favor of nouveau. I'm using dual GTX 780's.
17:51 imirkin: nouveau doesn't support any sort of workload splitting
17:51 imirkin: otherwise GK110's should... generally... work. not a lot of people have them (or don't use nouveau with it), so issues do pop up
17:52 pspace: I'm not using SLI, do you mean workload splitting on a single card?
17:52 imirkin: i mean SLI-style workload splitting
17:52 imirkin: why dual GTX 780's then?
17:52 imirkin: really need 8 monitors? :)
17:53 pspace: Just windows gaming, I don't really need it for anything else. I do have a 4k monitor.
17:53 imirkin: ah ok
17:53 pspace: Even proprietary driver doesn't handle SLI on 780
17:54 imirkin: hm, surprising.
17:54 pspace: when it's enabled there's latency, basically unusable
17:54 imirkin: this is currently the only issue i'm aware of connected to GTX 780: https://bugs.freedesktop.org/show_bug.cgi?id=92761
17:54 imirkin: however it has worked for other people with GTX 780's
17:57 pspace: Looks promising from the feature matrix, will give it a shot.
17:58 pspace: Any chance you know how to re-enable nouveau on Fedora, or is that more of a distro question
17:58 imirkin: reclocking will require a very up-to-date kernel, and is mostly untested on that card
17:58 imirkin: (definition of "very up to date" is 4.4-rc1)
17:58 imirkin: distro question... i dunno how you disabled it
17:58 imirkin: look for nomodeset or modeset=0 in various files
17:59 imirkin: like your boot config, modprobe.conf, etc
17:59 pspace: ah ok
17:59 pspace: hmmmm that is a very updated kernel
17:59 pspace: What's the advantage of reclocking
17:59 pspace: power savings?
17:59 imirkin: moar fps
18:00 imirkin: i.e. none :)
18:00 imirkin: unless you play games
18:00 imirkin: keplers tend to boot into the lowest power state
18:00 imirkin: reclocking allows you to switch between power states
18:00 pspace: I guess that could be useful for crunching numbers on the GPU for scientific applications
18:01 imirkin: no opencl support with nouveau, so not a concern
18:03 pspace: good to know, thanks :)
18:05 imirkin: things are being worked on, but compared to blob driver, nouveau supports but a fraction of the features
18:05 imirkin: so if you're looking to use nouveau, figure out what's important to you, test it out on nouveau, and if it all works, enjoy :)
18:07 pspace: I really just want performant 3d desktop effects from KDE's window manager and good sleep and restore
18:08 imirkin: well, lots of people are reporting issues with KDE5 and nouveau
18:08 imirkin: i was able to track down one source of fail, patch in 4.3 and cc'd to various stable kernels
18:09 imirkin: but there's remaining fail yet to be worked out
18:09 pspace: :'-(
18:09 pspace: eh, I'll give it a try anyways
18:09 pspace: feeling adventurous
18:09 imirkin: good luck!
18:09 pspace: thanks for your help
18:10 imirkin: oh, hope your 4k monitor isn't mst..
18:10 imirkin: no dp-mst with nouveau yet
19:01 simonpatapon: hello
19:01 simonpatapon: I have set up X with legacy nvidia driver
19:02 simonpatapon: when i change nvidia to nouveau in the X config i get "kms not enabled"
19:02 imirkin: simonpatapon: pastebin dmesg
19:03 simonpatapon: I have rechanged the /etc/X11/xorg back to nvidia, do you want dmesg from the bogus config?
19:03 imirkin: simonpatapon: i want dmesg from when you were having issues
19:03 simonpatapon: ok i'll be back
19:07 imirkin: simonpatapon: and also xorg log while you're at it :)
19:09 simonpatapon: imirkin: http://pastebin.com/jneF1nhu http://pastebin.com/qKm8wG5q
19:09 simonpatapon: thank you for your time
19:09 simonpatapon: i see the glx again from nvidia
19:09 simonpatapon: or can i set the one for nouveau?
19:09 imirkin: you have deeper issues
19:10 simonpatapon: :(
19:10 imirkin: there is no reference to nouveau in your dmesg
19:10 imirkin: which means that you have it somehow blacklisted
19:10 imirkin: grep -r nouveau /etc/modprobe*
19:11 simonpatapon: http://pastebin.com/wVy8F1XH
19:11 simonpatapon: lots of blacklists
19:11 imirkin: heheh
19:11 imirkin: modeset=0 also disables nouveau, fyi
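The two things imirkin points at (a `blacklist nouveau` line, and `modeset=0` / `nomodeset`) can be checked for with a short script. This is only a sketch under assumptions: `find_nouveau_blockers` and `cmdline_blocks_nouveau` are made-up helper names, the input is the text of a modprobe config file or of the kernel command line, and only the patterns mentioned in this conversation are looked for, not every way a module can be disabled.

```python
import re


def find_nouveau_blockers(config_text):
    """Return active (non-comment) modprobe lines that blacklist nouveau
    or pass modeset=0 to it."""
    blockers = []
    for raw in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if re.fullmatch(r"blacklist\s+nouveau", line):
            blockers.append(line)
        elif re.match(r"options\s+nouveau\b", line) and re.search(r"\bmodeset=0\b", line):
            blockers.append(line)
    return blockers


def cmdline_blocks_nouveau(cmdline):
    """Check a kernel command line for nomodeset or nouveau.modeset=0."""
    tokens = cmdline.split()
    return "nomodeset" in tokens or "nouveau.modeset=0" in tokens
```

Feeding it the contents of each file matched by `/etc/modprobe*` and the contents of `/proc/cmdline` would list the lines to comment out, which is what simonpatapon ends up doing by hand a few lines further down.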
19:12 imirkin: ok, well nouveau needs to be loaded if you want nouveau to work
19:12 imirkin: the nvidia glx thing is also a problem that you'll need to fix
19:15 simonpatapon: http://pastebin.com/Pjv6Ydwg this way?
19:16 simonpatapon: i commented all lines
19:16 imirkin: that works
19:16 simonpatapon: see you in boot...or two :)
19:34 simonpatapon: 5 reboots
19:34 simonpatapon: lol
19:34 simonpatapon: http://pastebin.com/FhCRNxUz
19:34 simonpatapon: had to rerun ./NVIDIA... to get the modules back and X working with legacy
19:35 simonpatapon: i guess I have to blacklist legacy nvidia?
19:37 imirkin: optionally
19:37 imirkin: nouveau wins, so wtvr
19:38 simonpatapon: what is the module to blacklist? nv?
19:38 imirkin: nvidia
19:39 simonpatapon: blacklist nvidia-current to /etc/modprobe.d/blacklist-nvidia.conf?
19:40 simonpatapon: or nvidia alone?
19:42 simonpatapon: i'll try alone
19:42 simonpatapon: see you in a few
20:28 simonpatapon: ok had a lot of errors with libGLX
20:28 simonpatapon: but now it works!
20:28 simonpatapon: i'm on nouveau
20:29 simonpatapon: but one screen is low resolution
20:30 simonpatapon: http://pastebin.com/BGH7fLHg
20:31 simonpatapon: http://pastebin.com/dbRddLy0
20:33 imirkin: [ 3.871] (II) NOUVEAU(0): Output DVI-I-1 using monitor section Monitor0
20:33 imirkin: you appear to have a retardo in your xorg.conf
20:33 imirkin: i'd just nuke that thing
20:33 simonpatapon: ok
20:33 simonpatapon: ill re
20:33 simonpatapon: thx again very very much
20:35 simonpatapon: it works well now
20:36 simonpatapon: good rez on both
20:37 imirkin: great
20:37 imirkin: note that your clock is at 50mhz
20:37 imirkin: so don't expect miracles in terms of speed :)
20:38 simonpatapon: i can now try the xinerama thing
20:38 simonpatapon: with the other i915 onboard chip
20:38 simonpatapon: i'm not a speed guy
20:40 simonpatapon: is there a way that i could save the current x config to an xconfig file to start with?
20:45 imirkin: dunno
20:59 simonpatapon: http://pastebin.com/DQN07yxW i know that card1-DVI-I- and 2 are my nouveau ports
20:59 simonpatapon: how can i guess the i915 ports? one DVI and one VGA used
21:00 imirkin: the zaphodhead config might be different for intel
21:01 simonpatapon: ah grep . /sys/class/drm/card0*/status
21:02 simonpatapon: i'll try this http://pastebin.com/Q3c8sHsV
21:02 simonpatapon: i'll re
21:11 simonpatapon: it doesn't work http://pastebin.com/VmdB7P1Y
23:27 simonpatapon: imirkin: just want to let you know it works!!!!
23:27 simonpatapon: only thing I had to do, remove xorg.conf then xrandr --setprovideroutputsource 1 0
23:30 simonpatapon: thank you so much for your help getting nouveau back