03:14 orbea: is there any specific known reason why the corrupted pixel bugs associated with hwdec / vdpau are far more prevalent with anime than live action? Could it just be that the fansubbers are sloppier than release groups for live action tv?
03:16 orbea: i can usually watch live action with hwdec with little to no corrupted pixels, but anime is more often than not unwatchable with hwdec
03:16 imirkin: there's *something* we don't decode properly in h264
03:16 imirkin: i dunno what it is, but some videos appear to be *way* more likely to exhibit than others
03:16 imirkin: a few years ago, it was video trailers
03:17 imirkin: now... it's basically everything
03:17 orbea: interesting
03:17 imirkin: maybe it's a NAL type the hw doesn't handle
03:17 imirkin: maybe it's some weird "rare" parameter we don't populate properly
03:17 imirkin: maybe it's something else entirely
03:17 imirkin: my suspicion is that it will take someone who knows h264 very well to work it out
03:18 orbea: It would be nice to see it solved one day, but I'm not very helpful...
03:18 imirkin: i tried for a while, but it's just one of those things i gave up on
03:19 imirkin: i spent many many days on it, poring over traces/etc. couldn't find a thing.
03:19 orbea: i also noticed some fansub groups are far more affected by it than others (Exiled-destiny is one)
03:20 imirkin: maybe with the new demmt it would be easier, dunno
03:20 imirkin: if you can find a video with which the errors happen from the *VERY BEGINNING*
03:20 imirkin: that would be very helpful
03:20 imirkin: and i don't mean "a few seconds in"
03:20 imirkin: i mean "frame 2 and on"
03:20 orbea: I might have some, would a trace maybe be helpful?
03:21 imirkin: however i suspect it's an issue in reference frame management
03:21 imirkin: either an error in our understanding of what needs to be fed to the hw
03:21 imirkin: or an error in our code in how we feed stuff in
03:21 imirkin: but as a result of these minor differences, this stuff doesn't match up 1:1
03:21 imirkin: so i can't just diff the traces =/
03:23 orbea: yea, I have one where the corruption starts from the very beginning and continues on and off from then on...
03:23 imirkin: i don't care how it continues... i just want the corruption to start immediately
03:24 imirkin: because then instead of looking for a needle in a tonne of hay, i'm only looking for it in a single bale
03:24 imirkin: still a lot, but much more manageable
04:13 orbea: is there anything special I need to trace mplayer? I cant get it to make a trace, not even with the manual method.... "apitrace trace --api gl mplayer videofile.mkv" should work, no?
04:14 imirkin: it's not apitrace
04:14 imirkin: you want to do a mmt trace
04:14 imirkin: also... don't use mplayer
04:14 orbea: oh
04:14 imirkin: mplayer is too complex
04:14 imirkin: and does too good a job at playing back the videos
04:14 orbea: what would be better to test with?
04:15 imirkin: which is good for... you know... watching
04:15 imirkin: but not as good for tracing
04:15 orbea: ffmpeg?
04:15 imirkin: have a look at https://github.com/imirkin/re-vp2
04:15 imirkin: there's an application called h264_player.c
04:15 imirkin: now, it's super-tuned towards this one video i was originally testing with
04:15 imirkin: but it could be adapted
04:16 imirkin: if you're not up to it, you could make the video file available somewhere, and i could cut it up so that it worked
04:23 orbea: yea, doesn't work at all ootb for this video, I'll make a sample of the video and post it somewhere in a bit. I dont really know enough c to modify it anytime soon
04:23 imirkin: yeah, it's not really designed for ease-of-use
04:24 imirkin: more for ease-of-tracing
12:52 dcomp: skeggsb: attached mmiotrace and dmesg log to https://bugs.freedesktop.org/show_bug.cgi?id=95188
13:01 karolherbst: imirkin: I think for the h.264 issue it might be useful to encode a video with multiple different configs and figure out what the faulty ones have in common :/
13:06 hakzsam_: karolherbst, has the second Tomb Raider issue been fixed with Ken's branch?
13:36 karolherbst: hakzsam_: the white fog issue, yes
13:39 hakzsam_: karolherbst, cool, yeah and the first one is something else
13:40 karolherbst: hakzsam_: yeah, but there is a third issue actually
13:40 hakzsam_: which one?
13:40 karolherbst: hakzsam_: even with tressfx enabled in game (the one align patch of yours helps there) the hair is not rendered
13:41 karolherbst: or in an odd way
13:41 hakzsam_: but it doesn't crash?
13:41 karolherbst: well I am not sure because the intro scene still crashes
13:42 hakzsam_: well actually this is unrelated to the align patch
13:42 hakzsam_: please file a new bug :)
13:42 karolherbst: well your align patch only fixes crashes mid game while enabling tressfx :)
13:43 karolherbst: but yeah, will create a new one after I got time to do a trace of the misrendering
13:43 hakzsam_: sure, it fixes a potential crash when an app uses a ton of temporaries
13:53 karolherbst: hakzsam_: the align patch is upstreamed?
13:53 hakzsam_: yeah
13:53 urmet: what kernel version do i need to try out gm108 with skeggsb/nouveau tree?
13:59 karolherbst: mhh is the dual issue perf counter gone? :/
13:59 karolherbst: though I have "metric-inst_issued" twice
14:00 karolherbst: ohh inst_issued1 and inst_issued2
14:05 imirkin: urmet: 4.6-rc is probably safest
14:06 urmet: ok. got some errors when trying to compile with 4.5.2
14:07 imirkin: urmet: look for a change that's like "drm-next foo" and revert it, and it should build against 4.5
14:07 karolherbst: or just use my master_4.5 branch
14:08 imirkin: urmet: e.g. try reverting 3b3ec4e10
14:08 karolherbst: imirkin: I think that will fail though :/
14:08 urmet: i don
14:08 karolherbst: imirkin: maybe it was for the 4.5 drm-next, maybe the 4.6 one, not sure anymore
14:08 imirkin: karolherbst: it was for 4.6
14:09 karolherbst: no, I meant reverting one of them should fail
14:11 urmet: i'll just clone karolherbst repo. can't have too many copies of kernel sources
14:12 karolherbst: ohh
14:12 karolherbst: wait
14:12 karolherbst: I don't have a kernel tree for that
14:12 karolherbst: it is just nouveau
14:12 karolherbst: hakzsam_: fun times: https://i.imgur.com/1GJ3he7.png
14:12 imirkin: urmet: or take the tree you have and revert the one change i said to revert.
14:13 urmet: don't have any trees in this machine yet :)
14:13 imirkin: i meant in the nouveau tree
14:13 imirkin: so that it builds against 4.5.x
14:15 urmet: it worked, thanks
14:15 imirkin: karolherbst: image removed
14:16 karolherbst: ahh right: https://i.imgur.com/AlSNPmG.png
14:17 karolherbst: I think there is something wrong with the graphs
14:17 imirkin: heh
14:17 imirkin: it's a cumulative count
14:17 imirkin: that's being displayed directly
14:17 imirkin: instead of as a derivative
14:18 karolherbst: glxspheres has terrible dual issuing by the way
14:18 karolherbst: below 30%
14:18 karolherbst: imirkin: nope
14:18 karolherbst: in glxspheres it works
14:18 karolherbst: it displays data per frame
14:18 karolherbst: I think
14:18 karolherbst: or per some time
14:19 karolherbst: glxgears even worse :/
14:20 karolherbst: anyway, in tomb raider those counters seem to be messed up
14:21 karolherbst: or maybe because it is 32bit
14:22 hakzsam_: karolherbst, your image has been removed
14:22 imirkin: hakzsam_: keep reading
14:22 karolherbst: nice
14:22 karolherbst: same effect in the apitraces
14:24 hakzsam_: yeah the counters are 32-bits
14:24 hakzsam_: that might explain the issue
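A minimal sketch of why a 32-bit cumulative counter could produce odd graphs, and how per-interval values might be recovered from it. It assumes the HUD samples a free-running counter that wraps at 2^32 and wraps at most once between two samples; `COUNTER_BITS` and `per_interval_deltas` are hypothetical names for illustration, not anything taken from mesa.

```python
COUNTER_BITS = 32                 # assumption: counter width as stated in the chat
MASK = (1 << COUNTER_BITS) - 1    # 0xFFFFFFFF


def per_interval_deltas(samples):
    """Convert cumulative counter readings into per-interval increments.

    Masking the difference keeps the result correct across a single
    wraparound between two consecutive samples.
    """
    return [(cur - prev) & MASK for prev, cur in zip(samples, samples[1:])]


if __name__ == "__main__":
    # Plain cumulative readings: the graph should show 100 and 150, not 100 and 250.
    print(per_interval_deltas([0, 100, 250]))
    # The counter wrapped between the two samples.
    print(per_interval_deltas([MASK - 9, 5]))
```

Plotting the raw readings instead of these differences gives a line that only ever climbs and then drops sharply at each wrap.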
14:24 urmet: got the kernel module loaded. i presume that is not all
14:25 karolherbst: hakzsam_: well it worked in saints row IV
14:25 imirkin: urmet: it kind of is... did it load properly?
14:25 imirkin: urmet: pastebin dmesg?
14:25 karolherbst: and there I got over 600G
14:25 karolherbst: let me check again
14:26 karolherbst: or was it 600m per frame?
14:28 karolherbst: hakzsam_: well yeah, with saints row IV it is stable at 150m with my current settings
14:28 karolherbst: but I am sure maxed out it was around 600m and it worked
14:28 urmet: imirkin: that looks like the relevant part http://pastebin.com/uQAV5YPv
14:28 hakzsam_: karolherbst, weird
14:29 karolherbst: hakzsam_: well you can check with my trace from the white fog issue if you want. You should see it there too
14:30 hakzsam_: yep, but I can't right now :)
14:30 karolherbst: ohhh
14:30 karolherbst: "nve4_hw_sm_begin_query:1421 - Not enough free MP counter slots !"
14:30 karolherbst: ...
14:30 karolherbst: that might explain this
14:30 imirkin: urmet: seems mostly happy
14:30 hakzsam_: karolherbst, most likely
14:30 imirkin: urmet: DRI_PRIME=1 glxinfo should now show you that it's using nouveau
14:30 karolherbst: but I only did "GALLIUM_HUD=inst_issued1,inst_issued2" :/
14:30 imirkin: urmet: you have to have dri3 though
14:31 hakzsam_: karolherbst, there are "only" 8 counters divided in two groups on kepler...
14:31 karolherbst: hakzsam_: well it works everywhere else :/
14:32 hakzsam_: karolherbst, are you sure? because the number of MP counter slots is definitely not related to the application :)
14:32 urmet: there's one line in Xorg log about dri3: [ 3537.567] (II) intel(0): direct rendering: DRI2 DRI3 enabled
14:32 karolherbst: hakzsam_: yes
14:32 karolherbst: hakzsam_: I know I had troubles using 4 or 5 counters usually
14:32 karolherbst: hakzsam_: bot those both always worked together
14:32 karolherbst: *but
14:33 hakzsam_: karolherbst, that's not a problem, that's a limitation, but if "GALLIUM_HUD=inst_issued1,inst_issued2" works with saints row and not with tomb raider, that's an issue
14:33 hakzsam_: I could have a look later today
14:36 karolherbst: thanks :)
14:36 karolherbst: the inst_issued1 counter is a bit odd though
14:37 karolherbst: because it decreases when inst_issued2 increases
14:37 karolherbst: 607/217 stock nouveau
14:37 karolherbst: 578/232 with my dual issue pass
14:37 hakzsam_: how about the number of FPS with your dual issue pass?
14:38 karolherbst: frame time decreases in pixmark_piano by a lot
14:38 karolherbst: 60.6-57.6 with my pass
14:38 karolherbst: *without
14:39 karolherbst: 60.3-57.4 with my pass
14:39 karolherbst: -57.3 actually
14:39 karolherbst: in ms
14:40 karolherbst: I think inst_issued1 only tells us how many pairs of instructions were dual issued + instruction which were not
14:40 karolherbst: so total instruction counts is then inst_issued1+inst_issued2
14:41 karolherbst: ahh "inst_executed" then I guess
14:41 imirkin: urmet: should be fine
14:41 karolherbst: ohh no, I am stupid
14:41 karolherbst: inst_issued1 means not dual issued
14:41 karolherbst: ...
14:41 urmet: imirkin: yes, glxinfo confirms :)
14:42 imirkin: urmet: and hopefully your power usage went down since nouveau should auto-suspend the gpu
14:43 urmet: without the driver the gpu keeps running doing nothing?
14:43 karolherbst: hakzsam_: well with pixmark_piano even "GALLIUM_HUD=inst_issued1,inst_issued2,inst_executed" works :)
14:43 imirkin: urmet: yep :)
14:43 hakzsam_: karolherbst okay, thanks for reporting
14:44 urmet: awesome :)
14:53 hakzsam_: karolherbst, which gpu are you using?
14:54 hakzsam_: I guess it's gk104?
14:55 imirkin: GK106 i think
14:55 hakzsam_: yeah, same :)
14:55 hakzsam_: SM30
14:56 karolherbst: gk106
14:57 hakzsam_: karolherbst, so this "GALLIUM_HUD=inst_issued1,inst_issued2" doesn't work with Tomb Raider?
14:57 karolherbst: hakzsam_: right
14:57 hakzsam_: you get the "Not enough free MP counter slot" msg?
14:57 karolherbst: yes
14:57 hakzsam_: mmh
14:58 karolherbst: but I encountered this issue earlier already, but not that severe
14:58 karolherbst: it seems like on heavier workloads you can only use fewer counters
14:58 hakzsam_: karolherbst, can you check if compute shaders are used at the same time?
14:59 hakzsam_: yep, I know why and I should fix this anyway :)
14:59 karolherbst: hakzsam_: mhh how do I check?
15:00 hakzsam_: ST_DUMP_SHADERS for example?
15:00 hakzsam_: and look for COMP shader
15:00 karolherbst: never got any compute shaders with ST_DUMP_SHADERS
15:00 imirkin: ah, can't do MP counters + compute? sad.
15:01 hakzsam_: karolherbst, I did
15:01 karolherbst: hakzsam_: k, let me try again then
15:02 hakzsam_: imirkin, not exactly, but using compute shaders for reading MP counters is not *really* a good way in my opinion ;)
15:02 hakzsam_: anyway, that should be a real pain to change that
15:02 karolherbst: hakzsam_: nope, I glretraced my white fog trace and got nothing with "grep -i comp"
15:03 imirkin: hakzsam_: what's the good way?
15:03 hakzsam_: karolherbst, okay, can you try with only inst_issued1?
15:03 karolherbst: works
15:04 hakzsam_: imirkin, read from the kernel as you do for global perf counters, but there are some other issues to deal with because those MP counters are context switched, so
15:04 hakzsam_: *as we
15:04 imirkin: ah
15:04 imirkin: right.
15:04 karolherbst: hakzsam_: mhh but with the first "stuck" rendered frame, both counters get the same value
15:04 karolherbst: so it happens kind of immediately
15:05 karolherbst: both jump from 0 to 45k, then they always have the same values
15:05 hakzsam_: karolherbst, but you no longer have the "Not enough blabla" msg?
15:05 karolherbst: well when I use one, I don't get the message
15:07 hakzsam_: karolherbst, okay, well I think the issue is because we launch too many queries and because we use too many compute kernels to read out those MP counters, or something along these lines
15:07 hakzsam_: I started to solve this a few weeks ago, but it's not ready yet
15:08 karolherbst: okay
15:16 karolherbst: mhh
15:16 karolherbst: maybe we dual issue terribly though
15:16 karolherbst: and inst_issued2 does tell us everything
15:18 hakzsam: karolherbst, sorry, my bouncer is crap :)
15:18 karolherbst: hakzsam: :D
15:18 hakzsam: <hakzsam_> because I was working on compute+images, but I'll do asap :)
15:18 hakzsam: <hakzsam_> karolherbst, maybe you can try with GALLIUM_HUD_PERIOD, this should reduce the time between two queries
15:18 hakzsam: <hakzsam_> this might be a workaround for now
15:18 karolherbst: I have trouble understanding what inst_issued1/2 really tell me
15:19 karolherbst: mhh okay, I think I got it now
15:20 karolherbst: inst_executed == inst_issued2 * 2 + inst_issued1
15:20 hakzsam: it's the number of single/dual instructions issued
15:20 hakzsam: yeah, should be
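The relation above can be sketched as a small calculation. This is only an illustration under the reading given in the chat (inst_issued1 counts single issues, inst_issued2 counts dual issues, each dual issue covering two instructions); the function names are hypothetical and not part of mesa or the HUD.

```python
def inst_executed(inst_issued1, inst_issued2):
    """Total instructions: one per single issue, two per dual issue."""
    return inst_issued1 + 2 * inst_issued2


def dual_issue_fraction(inst_issued1, inst_issued2):
    """Fraction of executed instructions that went out as part of a dual issue."""
    total = inst_executed(inst_issued1, inst_issued2)
    return 2 * inst_issued2 / total if total else 0.0


if __name__ == "__main__":
    # Pairs quoted earlier in the log; treating them as
    # inst_issued1/inst_issued2 readings is an assumption.
    for label, issued1, issued2 in (("stock", 607, 217), ("with pass", 578, 232)):
        total = inst_executed(issued1, issued2)
        share = dual_issue_fraction(issued1, issued2)
        print(f"{label}: executed={total} dual-issued share={share:.1%}")
```

With the 607/217 and 578/232 pairs from earlier in the log, this gives 1041 and 1042 executed instructions, with roughly 41.7% and 44.5% of them dual issued. That would explain why inst_issued1 goes down when inst_issued2 goes up while the total stays about the same.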
15:20 karolherbst: okay
15:20 hakzsam: you have some metrics too
15:20 karolherbst: ohh right
15:21 karolherbst: what is "metric-issue_slot_utilization"?
15:22 hakzsam: it's a percentage for single/dual instructions issue regarding the number of cycles
15:22 hakzsam:should really document those somewhere
15:22 karolherbst: ahh okay
15:22 karolherbst: 160
15:22 karolherbst: :D
15:23 hakzsam: karolherbst, for documentation: src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c :)
15:23 hakzsam: karolherbst, and you can also have a look at cupti_query from the cuda package
15:23 karolherbst: hakzsam: odd, the utilization dropped while inst_issued2 got bigger
15:23 hakzsam: it will give you a long description
15:24 karolherbst: hakzsam: anyway, with my pass I am at 44.6% in pixmark_piano
15:24 hakzsam: and without?
15:24 karolherbst: although only 42.9% should be possible
15:24 karolherbst: without 41.73%
15:25 karolherbst: the pass relies on what can_dual_issue returns though
15:25 hakzsam: Longdesc = Percentage of issue slots that issued at least one instruction, averaged across all cycles
15:25 hakzsam: btw
15:25 hakzsam: let me give you a link
15:26 karolherbst: *canDualIssue
15:26 karolherbst: ohh
15:26 hakzsam: https://github.com/hakzsam/re-pcounter-tools/blob/master/traces/cupti/kepler/gk106/metrics/.list_metrics.txt
15:26 karolherbst: okay, so if this value drops, we dual issue more
15:27 hakzsam: karolherbst, only some metrics are currently implemented, because the other ones need global perf counters
15:27 hakzsam: and they are still not upstream :/
15:28 hakzsam: karolherbst, you will have to wait a little more, but I promise you I will work on that just after images :)
15:29 karolherbst: issue_slots sounds interesting
15:29 karolherbst: I think this would tell us about stalls
15:29 karolherbst: maybe
15:29 karolherbst: depends on what it really tells us
15:29 karolherbst: but less slots used should mean it can't do more at the same time or something
15:29 hakzsam: yeah
15:30 hakzsam: and you are lucky because this one is exposed
15:30 karolherbst: let me check
15:30 hakzsam: metric-issue_slots
15:30 karolherbst: don't have it
15:31 hakzsam: oh right, my bad
15:31 karolherbst: that's the one I've got: https://gist.github.com/karolherbst/febaa4d074357ceef8f57c0af340f8b6
15:31 hakzsam: it is there but not exposed
15:31 karolherbst: ahh okay
15:31 hakzsam: yeah I'm looking
15:32 karolherbst: anyway, I am fine with those dual issue thingies, because they enable us to write a simple rescheduling pass which just reorders a bit to optimize dual issuing which should enable us to be a bit faster overall
15:34 karolherbst: although stuff like heaven already has nearly perfect dual issuing
15:38 hakzsam: karolherbst, issue_slots should be inst_issued1 + inst_issued2 (according to the report I have)
15:38 karolherbst: ahh okay
15:38 karolherbst: yeah makes sense somewhat
15:38 hakzsam: the code is there, but the metric is not exposed, I don't why
15:39 hakzsam: +know
15:39 karolherbst: but I would need something like warps doing nothing because they have to wait on results or something
15:40 karolherbst: but that's for later
15:40 hakzsam: achieved_occupancy is maybe what you want
15:41 karolherbst: right, that metric was useless...
15:41 karolherbst: it is 0 in pixmark_piano
15:41 karolherbst: always
15:41 hakzsam: oh I remember now, we need percentage for that one
15:41 karolherbst: well everywhere else it is also always 0
15:41 karolherbst: hakzsam: yeah, set max to 100
15:41 karolherbst: :D
15:42 karolherbst: then you get percentage
15:42 karolherbst: it is pretty stupid, but it is done that implicitly
15:42 karolherbst: and then feed values from 0 to 100
15:42 hakzsam: nvidia likes to return float numbers between 0..1 for metrics
15:42 hakzsam: I was just doing the same thing
15:42 hakzsam: but using percentages is better
15:42 karolherbst: right
15:43 karolherbst: but it should work, if you tell me where to change that stuff I could try it out
15:43 hakzsam: I'll fix that later today when I will be at home
15:43 hakzsam: it's not a simple line to change actually
15:43 hakzsam: and I was a bit lazy to do it the first time
15:43 hakzsam: but it seems stupid to use 0..1
15:45 karolherbst: I see
15:50 hakzsam: karolherbst, this should expose metric-issue_slots http://hastebin.com/educigewil
15:50 hakzsam: (it was probably a copy-paste error when I reworked that are)
15:50 hakzsam: *area
15:51 karolherbst: ahh
15:51 karolherbst: that explains the doubled inst_issued metric
15:51 hakzsam: yep
15:56 hakzsam: same problem with SM21 btw
16:12 hakzsam: karolherbst, maybe this should help you http://hastebin.com/obociqituh :)
16:12 hakzsam: not completely done though
16:13 hakzsam: got ~20% with glxgears, but it's just for testing
16:31 urmet: cool. just tried binding of isaac. no glitches but slow motion :P
16:32 imirkin_: urmet: expected. should be slow as molasses at the lowest perf level.
16:33 urmet: i was expecting slow. i was not expecting decent picture and no crashes :P
16:34 imirkin_: oh well. maybe next time.
16:34 urmet: at least my desktop gpu crashes
16:34 imirkin_: which one?
16:34 imirkin_: [which gpu]
16:34 urmet: gtx960. i think that means gm206
16:35 imirkin_: yeah, sounds right
16:35 imirkin_: it crashes?
16:36 urmet: total graphics hang. have to recover via ssh
16:36 imirkin_: =/
16:39 karolherbst: hakzsam: nice
16:40 hakzsam: karolherbst, I'll probably send a little series tonight
16:47 karolherbst: hakzsam_: good will wait until then
16:53 karolherbst: hakzsam_: 10% in glxspheres :D
16:54 karolherbst: 53% in pixmark_piano
16:55 karolherbst: 85% in furmark
16:55 karolherbst: mhh
16:55 karolherbst: in furmark we are really close to nvidia by the way, but pixmark_piano too, although furmark is better
16:56 karolherbst: hakzsam_: is there an easy way to get the same data while running something with nvidia?
17:11 karolherbst: yay, my pass improves dual issuing by 10% in saints_row 3 :)
18:05 hakzsam_: karolherbst, not easy, you can try to use LGD but it only exposes graphics-related performance counters IIRC
18:05 hakzsam_: for compute-related ones you can use nvperf, but that one won't help for 3D :)
18:05 hakzsam_: it's cuda only
18:18 karolherbst: :/
18:18 karolherbst: well
18:18 karolherbst: occupancy related things should be exposed in LGD I assume
18:22 Horrorcat: are there any known bugs which stop 04:00.0 VGA compatible controller: NVIDIA Corporation G94 [GeForce 9600 GT] (rev a1)
18:22 Horrorcat: from using two monitors with nouveau?
18:22 karolherbst: hakzsam_: I am also interested in instruction count per frame (nvidia vs nouveau), this might tell us something
18:22 karolherbst: Horrorcat: nothing afaik, check dmesg
18:23 karolherbst: Horrorcat: sometimes booting with both displays works where hotplugging them doesn't
18:23 Horrorcat: xrandr shows the output connected and mode set, but the screen stays dark. it’s not my machine, so I won't bother you too much. just wanted to know if there’s anything known to be broken
18:24 karolherbst: Horrorcat: ohh, maybe it is just xrandr doing something silly, I would usually configure the screens through your desktop environment settings stuff
18:24 Horrorcat: that didn’t work either
18:24 Horrorcat: which is why I suggested using xrandr
18:24 karolherbst: I see
18:24 Horrorcat: (i never had issues with xrandr though)
18:24 karolherbst: well maybe dmesg shows something
18:27 Akien: karolherbst: I'm about to remove bumblebee/nvidia to test nouveau/PRIME. Any idea for a couple quick and dirty tests I could do to benchmark the performance before and after?
18:37 karolherbst: Akien: well bumblebee/nvidia will usually be faster where nvidia is faster, but pcie throughput is much better
18:37 karolherbst: Akien: glxgears and glxspheres64 should show you higher fps
18:42 Akien: Thanks, I'll check that. I'm curious to see what performance I'll get with nouveau generally nowadays; out of laziness I haven't really tested it in a long time
18:43 karolherbst: Akien: well you need full reclocking somewhat :D
18:43 imirkin_: Akien: make sure you're reclocking... otherwise perf will be shit on any semi-modern gpu.
18:43 Akien: What's reclocking? :D
18:44 imirkin_: Horrorcat: how is the screen connected?
18:44 imirkin_: Horrorcat: if it's DP, I can certainly imagine some issues on those earlyish GPUs (as far as DP was concerned at least)
18:45 Horrorcat: DVI
18:45 Horrorcat: (@ imirkin_ )
18:46 imirkin_: Horrorcat: hmmmm ok. i'm not aware of any issues. should work fine, esp if it thinks it set the mode...
18:46 hakzsam_: karolherbst, for the number of instructions per frame, you have to use GALLIUM_HUD="inst_executed" and GALLIUM_HUD_PERIOD=0
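A small sketch of how those two variables could be set when launching a program from a script. Only the variable names GALLIUM_HUD and GALLIUM_HUD_PERIOD and the counter names come from the chat; `hud_env` is a hypothetical helper, and `glxgears` in the usage line is just a stand-in for whatever application is being measured.

```python
import os
import subprocess


def hud_env(counters, period=None, base=None):
    """Return a copy of the environment with the Gallium HUD variables set.

    counters: iterable of counter names, joined with commas.
    period:   value for GALLIUM_HUD_PERIOD, left unset when None.
    base:     environment to start from, defaults to os.environ.
    """
    env = dict(os.environ if base is None else base)
    env["GALLIUM_HUD"] = ",".join(counters)
    if period is not None:
        env["GALLIUM_HUD_PERIOD"] = str(period)
    return env


if __name__ == "__main__":
    # Stand-in application; replace with the program to be profiled.
    subprocess.run(["glxgears"], env=hud_env(["inst_executed"], period=0))
```

Passing a copy of the environment keeps the HUD settings scoped to the one launched process.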
18:46 Horrorcat: imirkin_: we’re currently trying to get the proprietary driver working to see whether it’s a problem with something nouveau-ish, but that doesn’t seem to be trivial with debian for cards that old.
18:47 karolherbst: hakzsam_: ohh the LGD was this remote thingy, right? :/
18:47 hakzsam_: it is
18:47 karolherbst: meh
18:47 karolherbst: I totally failed to set it up
18:48 karolherbst: for whatever reason it won't connect through ssh
18:49 hakzsam_: weird, I used it a bunch of times
18:50 imirkin_: Horrorcat: are you on an ancient kernel perchance?
18:50 Horrorcat: is 3.16 ancient?
18:52 Akien: About reclocking, I read this on nouveau's front page: "Starting Linux 4.5, available in /sys/kernel/debug/dri/0/pstate, previously boot with nouveau.pstate=1 and use /sys/class/drm/card0/device/pstate." I understand the section about adding options to the boot line, but what does "use /sys/class/drm/card0/device/pstate" mean?
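One way to read the quoted sentence is that the pstate file lives in one of two places depending on the kernel version. The sketch below only picks the path; both paths and the 4.5 cutoff are taken from the quote, card index 0 is kept as written there, and `pstate_path` is a hypothetical helper. Reading the file usually needs root, and the idea that "use" means reading or writing that file is an assumption here.

```python
import re

DEBUGFS_PSTATE = "/sys/kernel/debug/dri/0/pstate"     # Linux 4.5 and later, per the quote
SYSFS_PSTATE = "/sys/class/drm/card0/device/pstate"   # earlier kernels, with nouveau.pstate=1


def pstate_path(kernel_release):
    """Pick the pstate file location for a kernel release string such as '4.5.2'."""
    match = re.match(r"(\d+)\.(\d+)", kernel_release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {kernel_release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    return DEBUGFS_PSTATE if version >= (4, 5) else SYSFS_PSTATE


if __name__ == "__main__":
    import platform

    path = pstate_path(platform.release())
    print("pstate file for this kernel:", path)
    try:
        with open(path) as handle:   # listing the performance levels needs root
            print(handle.read())
    except OSError as error:
        print("could not read it:", error)
```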
18:52 urmet: almost 2 years old. sounds ancient
18:52 imirkin_: Horrorcat: sufficiently old that i don't remember what bugs were fixed :)
18:52 Horrorcat: :D
18:53 Horrorcat: though, the GPU is quite old, too, so...
18:53 imirkin_: i def remember there were issues with G96 + HDMI, potentially fixed around then
18:53 imirkin_: but G94 doesn't ring a bell
18:56 karolherbst: urmet: what is 2 years old? Kepler?
18:56 karolherbst: ohh
18:56 urmet: 3.16 kernel
18:56 karolherbst: 3.16
18:56 karolherbst: yeah
18:56 karolherbst: that's ancient :D
18:57 imirkin_: well, 3.2 is ancient
18:57 imirkin_: debian was using that for quite a while
18:57 urmet: 3.2 is about the same age as dinosaurs
18:59 karolherbst: everything unsupported is ancient :)
18:59 urmet: debian 7 had 3.2 kernel and 8 has 3.16
19:00 urmet: and i see no sane reason to run debian stable on a desktop(especially if you care about video drivers)
19:01 karolherbst: I don't get the point of using debian for security relevant systems too
19:06 bitlord: I have NV84/G84 card, what's the state of power management on these older generation cards? (with mostly up-to-date kernel, ...)
19:06 RSpliet: bitlord: fan management I think should work, performance level selection doesn't
19:07 bitlord: are those stuck at some lower performance level?
19:08 bitlord:not interested in fan management (has fanless/silent card) ;-)
19:09 bitlord: I'm just checking whether it's worth testing the card, I have an AMD/ATI now, power management works fine, but I guess nvidia is a little more powerful than this (hd4350)
19:16 karolherbst: bitlord: I doubt nouveau will be faster on any g8x card
19:17 karolherbst: maybe not so much on high-end gpus
19:18 karolherbst: yeah, no g84 should be faster than radeon using nouveau
19:19 imirkin_: bitlord: most such older GPUs didn't actually have multiple perf levels
19:19 imirkin_: some mobile ones did i guess
19:20 bitlord: interesting, it's a 8600gts (no extra power connector, some weird configuration)
19:21 Horrorcat: if anyone would like to look at it, https://paste.debian.net/678982/ this is the dmesg of the boot with nouveau (and an nvidia which fails to load), where the screen shows as mode set, but black. maybe someone can spot an issue
19:21 hakzsam_: karolherbst, btw, same problem with metric-ipc on fermi, this one returns 0 too but using a percentage makes no sense...
19:22 Horrorcat: (the physical screen btw claims "no signal", while xrandr sees its modes and claims its mode to be set correctly)
19:22 Horrorcat: (actually, I have my money on user error, but they claim to have checked the cables)
19:23 imirkin_: Horrorcat: pastebin xrandr output too?
19:24 Horrorcat: imirkin_: https://paste.debian.net/678983/
19:25 karolherbst: hakzsam_: :) why not?
19:25 hakzsam_: karolherbst, https://cgit.freedesktop.org/~hakzsam/mesa/log/?h=metrics
19:25 hakzsam_: well, let me check :)
19:26 hakzsam_: karolherbst, this branch should some issues already
19:26 hakzsam_: +fix
19:27 karolherbst: yeah, saw it
19:27 karolherbst: already compiling
19:27 Horrorcat: updates: switching cables at the graphics card indicates that it is in fact the output; i.e. when cables are switched, the other screen has an image
19:27 karolherbst: :D
19:28 karolherbst: maybe the cable is indeed broken
19:29 hakzsam_: karolherbst, please tell me where you want to see % :)
19:30 hakzsam_: maybe, it makes sense to use a percentage for -ipc and -issued_ipc too
19:30 Horrorcat: no, karolherbst, I mean, just plugging the same cable with the same screen into a different output
19:30 Horrorcat: so, both cables and screens are fine, one of the GPU outputs is not giving an image.
19:30 Horrorcat: allegedly, the GPU runs fine under windows.
19:31 karolherbst: Horrorcat: okay
19:32 karolherbst: Horrorcat: try to boot with one display connected on the broken port (and none on the working one)
19:32 karolherbst: maybe this works
19:32 karolherbst: it is a stupid thing to test though
19:32 karolherbst: but we had somebody here where hotplugging 3x 4K screens didn't work
19:32 karolherbst: but booting with all three worked
19:35 karolherbst: hakzsam_: well the not enough free slots issue is still there
19:35 hakzsam_: karolherbst, sure, this branch doesn't fix that
19:36 karolherbst: and achieved_occupancy is still 0%
19:36 karolherbst: maybe nouveau just sucked there
19:36 karolherbst: ahh yeah
19:36 hakzsam_: really? 0%?
19:36 karolherbst: nouveau sucked
19:36 karolherbst: 10% with glxspheres
19:36 karolherbst: :D
19:36 hakzsam_: nice, it works :)
19:36 karolherbst: 20% glxgears
19:37 hakzsam_: actually, I'm testing on fermi but it's not exactly the same stuff on kepler
19:37 karolherbst: 81% furmark :)
19:37 karolherbst: k, so yeah, seems to work
19:37 karolherbst: now the other things
19:37 karolherbst: branch_efficiency
19:37 karolherbst: nice one
19:38 karolherbst: 90% in pixmark_piano
19:38 karolherbst: mhh
19:38 karolherbst: and 100% in furmark
19:38 karolherbst: seems like nouveau does a lot of good stuff in furmark, because the perf is like 90%+ close to nvidia
19:38 karolherbst: and those metrics also tell this
19:38 hakzsam_: maybe :)
19:39 karolherbst: issue slot utilization is weird though
19:39 karolherbst: 160% in piano
19:39 hakzsam_: maybe you can update the branch? I have just updated some minor things
19:39 hakzsam_: now, -ipc and issued-ipc return a % too
19:39 hakzsam_: mmh
19:40 karolherbst: mhh
19:40 karolherbst: maybe it tells us that instruction issued is 160% of available slots
19:40 hakzsam_: " Percentage of issue slots that issued at least one instruction, averaged across all cycles"
19:41 karolherbst: value dropped to 158% after I added my dual issue pass
19:42 hakzsam_: I'm going to check if the computation is correct
19:42 karolherbst: it is only at 33% with furmark though
19:42 Horrorcat: karolherbst: with only that single screen, grub works, and as soon as the initramfs is loaded it goes black and stays black
19:42 pmoreau: Horrorcat: I have issues with my laptop showing an image on external screen (but flickers with OpenGL), or black screen, depending on the adapter used (though this is with an MCP79/NVAC chipset). Hans has some issues with another Tesla card where the second screen stays black as well.
19:43 Horrorcat: oh and apparently the cooling of the card goes from very high to off to very high to off and then stays off
19:43 imirkin_: Horrorcat: that may have gotten fixed at some point in the past 2 years :p
19:43 Horrorcat: plugging a screen to the other (known-to-work) output does not work in that case either
19:44 Horrorcat: imirkin_: I’m suggesting an upgrade to debian-testing
19:44 karolherbst: hakzsam_: issue_slot_util increased on upclocking
19:45 karolherbst: hakzsam_: it's something odd
19:45 hakzsam_: karolherbst, this one seems to be wrong, I'm not 100% sure
19:45 karolherbst: issued_ipc is 400%
19:46 hakzsam_: it's * 100
19:46 hakzsam_: and it's correct
19:46 hakzsam_: have a look at ipc
19:47 karolherbst: :O
19:47 karolherbst: 4 instructions per cycle
19:47 karolherbst: well
19:47 karolherbst: percentage doesn't make much sense here
19:47 dcomp: urmet: did you get GM108 working? ...
19:47 karolherbst: but we want floating point :/
19:47 hakzsam_: karolherbst, yeah
19:49 hakzsam_: karolherbst, anyway, having % for achieved_occupancy, branch_efficiency and issue_slot_utilization makes sense
19:49 hakzsam_: I'll at least do that for now
19:50 urmet: dcomp: yes
19:50 karolherbst: hakzsam_: yeah
19:50 karolherbst: hakzsam_: although leave it also for ipc
19:50 karolherbst: getting 150% in the saints row 3 intro
19:50 dcomp: urmet: is it a mobile gpu or desktop?
19:50 hakzsam_: karolherbst, okay, if you want to, I'm not against :)
19:51 karolherbst: 70% ingame
19:51 urmet: secondary gpu on a laptop, dcomp
19:51 Horrorcat: karolherbst: dreamingL is my "patient"
19:51 dcomp: urmet: 840M
19:51 karolherbst: hakzsam_: wow, it even dropped down to 50%
19:51 urmet: yes
19:51 dreamingL: karolherbst: Hi there o/
19:51 imirkin_: dcomp: yours is just special :)
19:52 imirkin_: dcomp: it looked like your vbios didn't run btw... could you try it with nouveau.config=NvForcePost=1 ?
19:52 karolherbst: hakzsam_: so yeah, we need this to be something floating point based :)
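The readings quoted above (400%, 150%, 70%, 50%) are IPC values multiplied by 100, as hakzsam_ notes. A minimal sketch of the conversion back to a floating-point instructions-per-cycle figure follows; the function name is hypothetical and this is not the Mesa HUD code.

```python
def ipc_from_percent(percent: int) -> float:
    """Convert an IPC reading reported as (value * 100) back to instructions per cycle."""
    return percent / 100.0


# Readings mentioned in the conversation, shown as floats instead of percentages.
for reading in (400, 150, 70, 50):
    print(f"{reading}% -> {ipc_from_percent(reading):.2f} instructions per cycle")
```

Shown this way, "400%" reads as 4.00 instructions per cycle, which is why a float display fits these metrics better than a percentage.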
19:52 urmet: dcomp: dell latitude e7540 with i5-5300U + 840M
19:52 hakzsam_: karolherbst, yeah, but this needs a ton of changes in the HUD
19:53 karolherbst: mhh
19:53 karolherbst: can't be too hard though
19:54 hakzsam_: yeah, but not too easy :)
19:57 urmet: dcomp: don't worry. at least my desktop is not working at all :)
20:06 hakzsam_: karolherbst, well, actually I won't change -ipc and -issued_ipc for now, because we should have floats instead
20:07 hakzsam_: so, only achieved_occupancy, slot_utilization and branch_efficiency will be updated
20:07 hakzsam_: karolherbst, and if you want to add support for floats in the HUD, go ahead :)
20:18 dcomp: imirkin_: no change
20:18 imirkin_: o well
20:18 dcomp: btw should GART: 1048576MiB be so large?
20:18 imirkin_: 1TB
20:18 imirkin_: that sounds right
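The unit conversion behind that answer, as a quick check: 1048576 MiB is 2^20 MiB, which is exactly 1 TiB. The helper name below is only for illustration.

```python
MIB = 1024 ** 2  # bytes in one MiB
TIB = 1024 ** 4  # bytes in one TiB


def mib_to_tib(mib: int) -> float:
    """Convert a size in MiB to TiB."""
    return mib * MIB / TIB


gart_mib = 1048576  # value reported in the log line "GART: 1048576MiB"
print(f"{gart_mib} MiB = {mib_to_tib(gart_mib)} TiB")
```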
20:20 hakzsam_: imirkin_, btw, do you remember this one https://cgit.freedesktop.org/~hakzsam/mesa/commit/?h=various_fixes&id=88b6b62171b96a6cfb3bbf7a2d74cb00374ad032 ? I think it might help if MP perf counters are used while compute shaders are used too :)
20:21 imirkin_: hakzsam_: sure, why not
20:21 imirkin_: hakzsam_: btw, feel free to push whatever in the query code... you own all that as far as i'm concerned :)
20:21 hakzsam_: okay, let me rebase it then because it will conflict for sure
20:22 hakzsam_: imirkin_, yeah, but I prefer to not push directly :)
20:22 imirkin_: seems likely.
20:22 Akien: So, I guess I'm set up to use DRI_PRIME
20:22 Akien: Let's try to mess with reclocking :)
20:22 imirkin_: Akien: DRI_PRIME=1 glxinfo | grep nouveau
20:22 Akien: Vendor: nouveau (0x10de)
20:22 Akien: OpenGL vendor string: nouveau
20:22 imirkin_: yay
20:23 Akien: Running kernel 4.6 RC5
20:23 imirkin_: you can just futz with /sys/kernel/debug/dri/1/pstate then
20:23 Akien: I'm not familiar with handling those files, should I just cat some value in it?
20:24 Akien: Currently I have: http://hastebin.com/fedoxopoji.mel
20:27 Akien: Well, looking for infos about it I see that I'm trying cutting edge developments :D
20:32 Akien: Hm... Tried `# echo '0a' > /sys/kernel/debug/dri/1/pstate` but it seems never to return the prompt...
20:32 Akien: And `DRI_PRIME=1 glxspheres64` won't start either, looks like it's stuck
20:33 hakzsam_: imirkin_, btw, I played a few minutes with The Talos Principle yesterday, and this game has a bunch of rendering issues :/
20:34 imirkin_: hakzsam_: it sure does
20:34 imirkin_: curiously, it works fine on nv50
20:34 hakzsam_: yeah, I read the bug report already
20:37 Akien: Any idea what my issue might be? Looking at /sys/kernel/debug/dri/1/pstate it seems that the '0a' pstate was properly selected, but I'm left in a broken state it seems
20:37 imirkin_: Akien: pastebin the pstate file?
20:38 Akien: imirkin_: initial state: http://hastebin.com/fedoxopoji.mel. Then I ran `# echo '0a' > /sys/kernel/debug/dri/1/pstate` but the prompt never came back. From another terminal, the new pstate is: http://hastebin.com/voyiyocofa.mel
20:39 hakzsam_: karolherbst, well, I won't try to fix the not free enough slots error today because I would like to make some progress with images
20:39 imirkin_: Akien: that's a very bad sign
20:39 imirkin_: Akien: at the very least, you're missing the AC line
20:39 imirkin_: Akien: can you run this while running glxgears?
20:39 imirkin_: it can't read stuff out when the thing's off
20:39 karolherbst: Akien: yeah, changing while the gpu is suspended causes a lock :/
20:40 karolherbst: Akien: I really should upstream my suspend fixes too
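The advice above amounts to: check that the pstate dump has a populated AC line before writing a new pstate, since an empty or missing AC line suggests the GPU is suspended. A rough sketch of such a check follows. The sample dumps are invented for illustration (the real hastebin contents are not in the log), and the "key: value" layout is an assumption about how the file looks.

```python
def parse_pstate(text: str) -> dict:
    """Split a pstate dump into {label: rest-of-line}, assuming 'label: value' lines."""
    entries = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep:
            entries[key.strip()] = rest.strip()
    return entries


def looks_awake(text: str) -> bool:
    """Treat a missing or empty 'AC' entry as a sign that the GPU is powered down."""
    return bool(parse_pstate(text).get("AC"))


# Hypothetical dumps, NOT the real contents of the pastes linked in the log.
SUSPENDED = "07: core 405 MHz memory 648 MHz\n0f: core 810 MHz memory 1800 MHz\nAC:"
AWAKE = SUSPENDED + " core 405 MHz memory 648 MHz"

print(looks_awake(SUSPENDED))
print(looks_awake(AWAKE))
```

In practice the text would come from reading /sys/kernel/debug/dri/1/pstate while something like glxgears keeps the GPU awake, as imirkin_ suggests.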
20:40 Akien: karolherbst: Oh ok
20:40 Akien: Guess I should reboot, or can I unlock it manually?
20:40 karolherbst: you could use my branch
20:40 karolherbst: there it is fixed by accident too
20:40 Akien: I'm going one step at a time :p never compiled a kernel myself yet ;)
20:40 karolherbst: hakzsam_: no worries
20:41 karolherbst: hakzsam_: I need to get the LGD working anyway
20:41 hakzsam_: karolherbst, okay
20:41 Akien: brb
20:43 hakzsam_: karolherbst, my first attempt is here https://cgit.freedesktop.org/~hakzsam/mesa/commit/?h=nvc0_batch_query&id=d790b92b16dd995a3722b01d777f9be49b0e7bd1 but it's really experimental :)
20:44 hakzsam_: the idea is to use *only* one compute shader for reading all queries
20:44 hakzsam_: and this can be done with this batch of counters thing
20:45 hakzsam_: but this requires a ton of changes in the actual code
20:47 Akien: Alright, starting with a non-empty AC line now, that looks better
20:48 Akien: Works fine :)
20:48 Akien: And can I stop the application using the discrete GPU while it's reclocked, or will that cause a lock too?
20:48 imirkin_: hakzsam_: i've kinda given up on figuring out what's wrong with talos
20:49 imirkin_: hakzsam_: similarly there's a tomb raider: u(.... something) bug which i've been unable to work out
20:49 hakzsam_: imirkin_, oh right, the rendering issue with tomb raider (with the "old" one) ?
20:50 hakzsam_: bbl
20:50 imirkin_: it's "tru". i forget what the u stands for
20:50 imirkin_: seemed like it might be some sort of faceness-related issue
20:57 Akien: Went from ~19 FPS to 36 FPS in a resource heavy level of SuperTuxKart, pretty cool :)
20:58 Akien: (in 1920x1080)
20:59 imirkin_: Akien: try 0f? :)
21:00 Akien: That's 0f already ;)
21:00 imirkin_: (i guess GDDR3 vram... not gddr5?)
21:00 imirkin_: oh
21:01 Akien: I think it's GDDR3 yeah
21:41 karolherbst: hakzsam_: good idea though
21:41 karolherbst: hakzsam_: should be more future proof in the end?
21:49 karolherbst: imirkin_: what rendering issue in tomb raider? I didn't see any except the tressfx thing
21:50 imirkin_: karolherbst: tomb raider: universe or whatever
21:50 imirkin_: no tressfx in there
21:50 karolherbst: ahh
21:52 karolherbst: imirkin_: wine?
21:52 hakzsam_: imirkin_, yeah, we should try to fix it
21:53 hakzsam_: karolherbst, it will yes
21:53 hakzsam_: and it's the way to go for global perf counters
21:53 karolherbst: what are the global perf counters?
21:53 karolherbst: just an example so that I know what they are used for
21:53 imirkin_: karolherbst: i assume... i only have a trace of it
21:54 karolherbst: imirkin_: okay
21:54 imirkin_: i never figured out wtf was going on
21:54 imirkin_: but it works fine on i965 and llvmpipe
21:54 imirkin_: karolherbst: https://bugs.freedesktop.org/show_bug.cgi?id=91247
21:55 karolherbst: this is like 2008 :D
21:55 karolherbst: yeah, and wine
21:56 hakzsam_: karolherbst, they are not context-switched, while MP counters are
21:57 karolherbst: yeah I assumed as much, but I have no idea what is context-switched or not :) or are they the same things, just for every "application" at once?
21:57 hakzsam_: they count regardless of the context, so if you launch two instances of glxgears, results might be "wrong"
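A toy model of that distinction, assuming nothing about the real hardware: a global counter accumulates events from every context, while a context-switched counter keeps a separate total per context. The class and context names are made up for illustration.

```python
class ToyGpu:
    """Toy model: one global counter plus per-context (context-switched) counters."""

    def __init__(self):
        self.global_counter = 0
        self.per_context = {}

    def run(self, context: str, events: int) -> None:
        # The global counter ticks no matter which context is running.
        self.global_counter += events
        # A context-switched counter only accumulates its own context's events.
        self.per_context[context] = self.per_context.get(context, 0) + events


gpu = ToyGpu()
gpu.run("glxgears-1", 100)
gpu.run("glxgears-2", 250)
gpu.run("glxgears-1", 50)

print("global:", gpu.global_counter)
print("glxgears-1 only:", gpu.per_context["glxgears-1"])
```

With two glxgears instances the global total mixes both workloads, which is the sense in which hakzsam_ calls the results "wrong" for any single application.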
21:57 karolherbst: hakzsam_: and I assume it will be exposed through kernel interfaces in the end?
21:58 karolherbst: still thinking about how to expose those PMU counters to userspace
21:58 hakzsam_: sure, the kernel interface is already there
21:58 hakzsam_: through nvif
21:58 karolherbst: ohh but you still configure and push the stuff from userspace?
21:59 hakzsam_: yeah, we configure them from userspace
21:59 hakzsam_: the kernel only knows the low-level signals and multiplexers
21:59 karolherbst: okay mhh
22:00 karolherbst: well for the PMU counters I just need 5 slots on the PMU for dyn reclocking, and 3 are free for whatever and I was thinking about having a userspace API for them
22:00 hakzsam_: and the configuration of "high-level" events is stored in userspace
22:01 karolherbst: mhh okay
22:01 karolherbst: which nvif interface header?
22:01 hakzsam_: I don't remember, because a lot of things have changed, but look for "perfdom"
22:01 hakzsam_: and/or "perfmon"
22:02 karolherbst: if0002 has perfmon and if0003 has perfdom
22:02 karolherbst: what's the difference?
22:02 hakzsam_: mmh
22:02 hakzsam_: ben has changed this
22:03 hakzsam_: I have never asked why actually
22:03 hakzsam_: pretty sure there is a reason though :)
22:04 karolherbst: okay
22:05 hakzsam_: I think it's because it's a different nvif interface
22:05 karolherbst: well I am sure nobody wants to know the load of a specific engine anyway, but it seems right to add support for it
22:05 karolherbst: although we have only 3 slots left :/
22:05 imirkin_: hakzsam_: btw, if you're looking for a *hard* problem to debug, try fixing either talos or that TR:U thing. i've sunk many hours into each one, and have come up empty.
22:06 karolherbst: ohh right, that talos bug :/
22:06 hakzsam_: imirkin_, if you didn't figure out, I wonder if I will be able to, but I could try :)
22:06 imirkin_: hakzsam_: well, you don't have to. just pointing them out.
22:06 hakzsam_: imirkin_, sure, this has to be fixed anyway
22:07 imirkin_: another potential source of fail for the TR:U one is that they use glDepthFunc(GL_EQUAL)
22:07 imirkin_: and so i was suspecting a precision-type issue
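Why GL_EQUAL is fragile can be sketched in a few lines. The model below assumes a 24-bit normalized depth buffer and a made-up depth value; it illustrates the general precision concern only and is not a diagnosis of this bug.

```python
DEPTH_BITS = 24                      # assumption: 24-bit unorm depth buffer
DEPTH_MAX = (1 << DEPTH_BITS) - 1


def quantize_depth(d: float) -> int:
    """Map a depth in [0, 1] to the integer value a 24-bit depth buffer would store."""
    return round(d * DEPTH_MAX)


def passes_gl_equal(incoming: float, stored: float) -> bool:
    """Model of GL_EQUAL: the fragment passes only if the quantized depths match exactly."""
    return quantize_depth(incoming) == quantize_depth(stored)


stored = 0.968                       # hypothetical depth written by an earlier pass
same_pass = 0.968                    # later pass computes the identical depth
drifted = 0.968 + 1e-7               # later pass differs by a tiny rounding error

print(passes_gl_equal(same_pass, stored))
print(passes_gl_equal(drifted, stored))
```

An error of 1e-7 is already more than one step of a 24-bit buffer (about 6e-8), so any difference in how two passes compute depth can make an exact-equality test fail.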
22:07 hakzsam_: which doesn't happen on nv50, right?
22:07 imirkin_: talos is nvc0 only
22:08 hakzsam_: okay
22:08 imirkin_: TR:U is everywhere
22:08 hakzsam_: yeah, I remember now
22:08 imirkin_: i tried futzing with the depth clipping settings, but to no avail
22:09 hakzsam_: the trace replays fine with llvmpipe I guess?
22:09 hakzsam_: maybe I could have a look at it right now and work on images tomorrow (no rush) :)
22:09 imirkin_: hakzsam_: yep
22:10 hakzsam_: imirkin_, uhu http://hastebin.com/uvugoligiw.hs ? with LIBGL_ALWAYS_SOFTWARE=1 glretrace the_trace
22:11 imirkin_: for which one?
22:11 hakzsam_: https://bugs.freedesktop.org/show_bug.cgi?id=91247
22:11 hakzsam_: with your trimmed trace
22:12 hakzsam_: doesn't happen with mesa 11.2.1
22:12 hakzsam_: only with master
22:14 hakzsam_: imirkin_, mmh, it's probably on my side
22:14 hakzsam_: yeah, sorry for the noise
22:19 hakzsam_: okay, the problem first appears at frame 410
22:26 karolherbst: the depth mask looks fine though
22:26 karolherbst: values around 0.968 and no obvious wrong one
22:38 imirkin_: except the rendered picture, which is obviously wrong ;)
22:38 karolherbst: yeah
22:38 imirkin_: anyways, it builds up the picture across a boatload of draws
22:38 imirkin_: and it starts going wrong pretty early on
22:39 imirkin_: i was hoping that my basevertex-in-blit thing was going to fix it
22:39 imirkin_: but no such luck
22:47 karolherbst: maybe call 390986 is easier to debug?
23:00 hakzsam_: imirkin, if it's a precision issue, maybe we should try to resolve all the piglit/deqp precision fails first?
23:02 karolherbst: mhh well the "lowest" depth value I found was like 0.966, is this normal?
23:02 karolherbst: or are the ranges bigger in general?
23:21 imirkin_: hakzsam_: what precision fails?