13:45 haagch: so uhm... nouveau on tegra k1 doesn't like it when the xfce menu closes https://www.youtube.com/watch?v=KaOkcxhchL8
13:51 haagch: looks like a higher pstate fixes that
13:51 haagch: no awesome glitch art anymore :/
14:00 haagch: but git clone from github still runs at 5kb/s while a speedtest in a browser shows 16+mbit...
14:20 orbea: haagch: you know about speedtest-cli? no need to use the browser version :)
14:21 haagch: yes, but i didn't want to install it... from github
14:21 haagch: oh it's in [community], never mind
14:21 orbea: heh
17:33 MarcinWieczorek: Hello. I need help with debugging glitches on CS:GO
17:33 karolherbst: MarcinWieczorek: apitrace
17:33 orbea: apitrace ftw :)
17:33 MarcinWieczorek: oh cool
17:41 MarcinWieczorek: So I just play a little to collect some data and then analyze?
17:43 karolherbst: yes
17:43 karolherbst: you can debug every gl* call with it basically
17:43 karolherbst: qapitrace is a nice frontend
17:51 gisellenymes: hi
17:54 MarcinWieczorek: I have no idea what to look for
17:56 MarcinWieczorek: orbea: what can I do with my trace?
18:11 orbea: MarcinWieczorek: apitrace trace program (reproduce the bug and close it, they get big fast)
18:11 orbea: then apitrace replay program.trace
18:11 orbea: does it reproduce?
18:11 orbea: if so then xz -9 program.trace and upload it somewhere to show the devs here
18:13 orbea: i dont have much experience with qapitrace personally
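[The three steps orbea lists above can be sketched as a small script. This is only an illustration: the helper name `apitrace_workflow` is made up here, and it assumes apitrace writes the trace to `<program>.trace` in the current directory, as in orbea's example.]

```python
import shlex


def apitrace_workflow(program, trace_file=None):
    """Return the three commands from the steps above, in order."""
    # Assumption: the trace is named after the program, as in
    # "apitrace replay program.trace" above.
    trace_file = trace_file or f"{program}.trace"
    return [
        ["apitrace", "trace", program],      # record; reproduce the bug, then quit
        ["apitrace", "replay", trace_file],  # check whether the glitch reproduces
        ["xz", "-9", trace_file],            # compress before uploading
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try.
    for cmd in apitrace_workflow("program"):
        print(shlex.join(cmd))
```

[Printing rather than executing keeps the sketch harmless; each list could be passed to `subprocess.run` once the commands look right.]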
19:17 haagch: does this look like a nouveau bug? https://gist.github.com/ChristophHaag/8e4ab39e5dc879d0b7052dd5e5d02bc1
19:18 imirkin_: core mesa
19:18 imirkin_: or st/mesa
19:18 imirkin_: definitely not nouveau.
19:18 imirkin_: (doesn't mean i wouldn't investigate such a bug, of course... provided an apitrace)
19:20 haagch: https://haagch.frickel.club/files/OSVRTrackerView.trace
19:21 haagch: I compiled openscenegraph git master for my chromebook because the version from archlinuxarm only has gles enabled and that doesn't work
19:22 imirkin_: i get the same crash on i965
19:22 imirkin_: i think it's safe to say this is not a nouveau-only bug ;)
19:23 haagch: on radeonsi the trace segfaults too
19:23 MarcinWieczorek: orbea: I'll work on it, it goes black when I try to replay it
19:24 imirkin_: haagch: i'd recommend filing a bug. not sure when i'd be able to look at it in depth. that vbo stuff tends to make my head spin.
19:26 haagch: yes, just compiling openscenegraph-git on my pc to see whether it happens there too
19:27 imirkin_: doesn't matter - mesa shouldn't crash.
19:27 imirkin_: with very few exceptions, like when the application does something nasty
19:28 imirkin_: but glretrace doesn't do nasty things, even if the original application does.
19:28 haagch: it's openscenegraph git master, who knows
19:29 imirkin_: wow. not a lot of calls in this trace. i like it.
19:30 imirkin_: and this happens while building a display list. i like it even more.
19:30 haagch: 60 kilobyte
19:31 imirkin_: so this is hitting the "save the vbo data into the display list" logic. excellent.
19:31 imirkin_: this is the corner case of a corner case of a corner case.
19:31 imirkin_: well done, sir
19:34 MarcinWieczorek: imirkin_: So my issue doesn't seem to occur on a replay, but it looks way worse :p
19:41 imirkin_: MarcinWieczorek: so ... you probably don't want to hear this, but there are yet-to-be-diagnosed issues on maxwell.
19:41 imirkin_: separately, i'd recommend updating mesa, if you haven't already
19:47 MarcinWieczorek: Well I'll try with mesa from git, I don't remember if I have tried already
19:57 haagch: https://bugs.freedesktop.org/show_bug.cgi?id=99631
21:37 gregory38: anyone know this extension https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_shader_stencil_export.txt
21:37 gregory38: I'm trying to use it on the softpipe but it seems to be wrong
21:37 imirkin_: it's my best friend.
21:37 imirkin_: me and it, we're like ><
21:39 gregory38: I set GL_STENCIL_PASS_DEPTH_PASS to GL_REPLACE (and the 2 fails to GL_KEEP)
21:39 imirkin_: note that nvidia hw has no support for this btw
21:39 gregory38: yes, hence the softpipe
21:40 imirkin_: llvmpipe might support it too...
21:40 gregory38: I didn't manage to start PCSX2 with it.
21:40 imirkin_: ah =/
21:40 gregory38: Could be that my build doesn't get the right options
21:41 imirkin_: either way, i think the way the ext works is that you write to gl_FragStencilRefARB, and that is the value that replaces the stencil ref value in the stencil processing logic
21:42 gregory38: well my question is when the replacement is done
21:42 imirkin_: i.e. the same thing as having the ref parameter of glStencilFunc be dynamic
21:43 gregory38: I inited my stencil buffer to 1
21:44 gregory38: ok
21:45 gregory38: so basically the stencil value from the shader will be compared to the stencil value of the stencil buffer
21:45 gregory38: thanks
21:45 gregory38: I need to change my code ^^
21:50 glennk: maybe confusing the functionality with AMD_shader_stencil_value_export?
21:51 gregory38: wait, isn't it the same ?
21:51 imirkin_: are you kidding me? those are different?!
21:51 glennk: one is the stencil source value, the other is the ref value
21:51 imirkin_: oh right. AMD_shader_stencil_value_export vs AMD_shader_stencil_export
21:52 imirkin_: not confusing at all.
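[The distinction glennk draws can be illustrated with a toy model of the stencil stage for a single fragment. This is a simplified sketch, not driver or spec code: the function `stencil_fragment` and its parameters are hypothetical, the depth test is assumed to pass, the pass op is assumed to be GL_REPLACE with both fail ops GL_KEEP (gregory38's setup), and the write mask is assumed to be 0xFF. `shader_ref` stands for a shader-exported ref value, `shader_value` for a shader-exported source value.]

```python
import operator

# glStencilFunc compares (ref & mask) OP (stored & mask).
FUNCS = {
    "GL_NEVER": lambda a, b: False,
    "GL_ALWAYS": lambda a, b: True,
    "GL_LESS": operator.lt,
    "GL_LEQUAL": operator.le,
    "GL_EQUAL": operator.eq,
    "GL_GEQUAL": operator.ge,
    "GL_GREATER": operator.gt,
    "GL_NOTEQUAL": operator.ne,
}


def stencil_fragment(stored, func, api_ref, mask=0xFF,
                     shader_ref=None, shader_value=None):
    """Return (passed, new_stored) for one fragment in the toy model."""
    # Ref export: the shader output replaces the glStencilFunc ref,
    # so it takes part in the comparison.
    ref = api_ref if shader_ref is None else shader_ref
    passed = FUNCS[func](ref & mask, stored & mask)
    if not passed:
        return False, stored  # GL_KEEP on failure
    # Value export (assumed reading of "stencil source value"): the shader
    # output is what GL_REPLACE writes, while the test still uses the API ref.
    written = ref if shader_value is None else shader_value
    return True, written & 0xFF


if __name__ == "__main__":
    print(stencil_fragment(1, "GL_EQUAL", api_ref=0))                  # no export
    print(stencil_fragment(1, "GL_EQUAL", api_ref=0, shader_ref=1))    # ref export
    print(stencil_fragment(1, "GL_EQUAL", api_ref=1, shader_value=5))  # value export
```

[In this model the ref export changes which fragments pass, while the value export only changes what gets stored for fragments that already pass.]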
21:53 gregory38: OMG
21:56 gregory38: I guess the value extension is only supported on AMD HW.
21:56 imirkin_: hmmm... perhaps intel can do it too, just not implemented. not sure... let's check.
21:56 gregory38: tbh, I'm not sure which one I need
21:56 gregory38: I want to implement a destination alpha test with it
21:59 glennk: thats under the assumption that you've previously filled the stencil buffer with your "destination alpha" channel?
21:59 gregory38: yes I will do it with 2 passes
22:00 gregory38: first pass will populate the stencil with either a value based on current framebuffer or a constant value
22:00 gregory38: then 2nd pass will do the rendering and update the stencil accordingly
22:00 imirkin_: looks like it can only produce one type of stencil output
22:00 glennk: you can bind the framebuffer as a source otherwise with a barrier in between and just sample alpha
22:01 gregory38: because I can write the alpha channel and therefore impact the test
22:01 gregory38: glennk: by barrier you mean texture barrier ?
22:02 gregory38: I potentially have tons of triangles that are rendered
22:02 glennk: i forget which function it is
22:02 gregory38: glTextureBarrier(); or glMemoryBarrier
22:03 glennk: similar thing with the advanced blend stuff
22:03 glennk: hardware that can do the coherent one you can handle it in a straightforward manner
22:03 gregory38: texture barrier is nice, but it requires a barrier for every primitive (computing bounding boxes could improve the situation, but often the triangles are in the same area)
22:04 glennk: well, only if you need coherent results in the same batch
22:04 gregory38: yes I would love that everyone got coherent HW (myself included)
22:04 glennk: some gles hardware has various funky extensions for custom blending too
22:05 imirkin_: gregory38: SKL exposes coherent fb ops
22:05 imirkin_: [in mesa]
22:06 gregory38: imirkin_: is it GL_ARB_fragment_shader_interlock ?
22:06 imirkin_: no
22:06 imirkin_: EXT_shader_framebuffer_fetch
22:06 imirkin_: oh gr. it's in ES only =/
22:07 glennk: MESA_shader_framebuffer_fetch ?
22:07 imirkin_: yeah, but it's not published.
22:08 imirkin_: [and also not exposed]
22:09 gregory38: If I understand correctly, this extension "forces" the hardware to run in an in-order fashion
22:10 gregory38: whereas interlock only forces some part of the shader to run in-order
22:10 imirkin_: mmmmmm .. i don't think so
22:10 imirkin_: i think order might still be unspecified. not sure.
22:10 imirkin_: i haven't read the ext very carefully.
22:10 glennk: well, an ordering for primitives overlapping the same fragment
22:11 gregory38: but is there any guarantee
22:11 imirkin_: you'd have to check with curro.
22:11 glennk: the guarantee is primitive order as seen by that fragment position afaik
22:12 glennk: i think amd has an extension to relax that ordering too
22:12 imirkin_: late-model nvidia hw implements something too, but i dunno the details
22:12 gregory38: anyway, I don't have a SKL gpu either (no Intel GPU at all on my HSWE)
22:12 gregory38: nvidia is the interlock extension
22:12 gregory38: It used to be an NV_ one
22:12 imirkin_: on gm20x+ right?
22:13 gregory38: yes I think so
22:13 gregory38: latest maxwell
22:13 gregory38: amd/intel got another one
22:14 gregory38: Contributors to INTEL_fragment_shader_ordering
22:15 gregory38: for me those extensions will be ultra nice
22:15 imirkin_: well, i don't have any such late-model hardware, so ... unlikely to happen based on my efforts.
22:17 gregory38: me neither, I won't upgrade to an Nvidia GPU if there is no free driver available ;)
22:17 imirkin_: weren't you pretty anti-nouveau until semi-recently? or were you just annoyed that various features weren't available?
22:17 gregory38: me anti-nouveau
22:18 gregory38: you're kidding
22:18 gregory38: I'm pro free driver
22:18 imirkin_: well, just an impression i have. i guess a mistaken one. will update my records.
22:18 gregory38: however, let's be honest, for speed Nvidia is top notch
22:18 imirkin_: their driver does tend to be quite good.
22:18 glennk: imirkin_, EXT_stencil_clear_tag
22:19 imirkin_: and nouveau does tend to have lots of suckage
22:19 imirkin_: so yeah, radeonsi is in a much better spot - there's a full time team with internal doc access working on it, AND the blob driver stinks.
22:19 gregory38: imirkin_: however I'm anti-AMD ;) I suffered a lot with the proprietary driver.
22:20 gregory38: actually my (windows) users still have issues with the AMD proprietary driver
22:20 imirkin_: btw, nouveau does support gm20x. just not their various fancy features.
22:20 imirkin_: (and no reclocking, so perhaps in your eyes, no support)
22:21 gregory38: Technically I don't need a powerful card, but it is easier to profile the CPU side when the GPU isn't sleeping
22:23 gregory38: by the way, something is not clear on the secure firmware topic
22:24 gregory38: they bundle them in the proprietary driver ?
22:24 gregory38: technically it could be extracted but not redistributed
22:25 imirkin_: correct
22:27 gregory38: yeah. They aren't nice guys
22:28 imirkin_: unfortunately the firmware has gotten harder to extract. both from a running driver/trace, as well as from their shipped blob files
22:30 gregory38: I don't understand why they are afraid to release them
22:31 gregory38: They are signed so nobody can modify them
22:31 gregory38: and if they are afraid of a bad guy, he can still extract them
22:32 gregory38: maybe if AMD wins market share, the situation might change
22:32 imirkin_: you're preaching to the choir, i'm afraid
22:33 haagch: osvrtrackerview works with the patch by the way, that was the only problem
22:33 airlied: gregory38: they do release some of them
22:33 airlied: they are just a bit slow in releasing others
22:33 imirkin_: yeah, only like 2 years behind
22:34 airlied: is there still some maxwell we don't have support for? /me hasn't looked close
22:34 imirkin_: yeah. the PMU blob that lets us adjust the fan.
22:34 imirkin_: not released for any maxwells
22:34 airlied: ah but the graphics fw is out
22:34 imirkin_: [or do reclocking, for that matter]
22:34 imirkin_: yeah, so?
22:34 imirkin_: you can accel, but at 50mhz shader cores? who cares?
22:35 gregory38: yeah likely faster to use your CPU ;)
22:35 airlied: well we can't reclock any gpus usefully for users yet
22:35 glennk: hey thats better than the 1Mhz equivalent i got out of a fpga hw sim of a mali once upon a time
22:35 airlied: so really we might as well just give up on nouveau completely
22:35 imirkin_: airlied: i dunno - kepler and gm10x are fairly reliable.
22:35 airlied: imirkin_: for users who just use
22:36 airlied: not users who want to type incantations into sysfs
22:36 imirkin_: for now.
22:36 imirkin_: but not forever.
22:36 airlied: and maybe pmu is for now and not forever
22:36 imirkin_: maybe. but it's been 2 years.
22:37 airlied: how many years can we not autoreclock for?
22:37 imirkin_: not sure what the connection is.
22:37 orbea: reclocking with kepler makes dolphin-emu full speed, that is something
22:37 haagch: can nouveau report gpu load?
22:37 airlied: imirkin_: just saying your attitude on maxwell is misplaced
22:37 imirkin_: dynamic reclocking is a feature of nouveau code, which has little to do with nvidia. the blobs that nvidia needs to release are on nvidia's "head"
22:37 airlied: considering for most users we don't reclock any generation of card
22:38 airlied: from a user pov whether they are waiting for nouveau or nvidia it doesn't really matter
22:38 gregory38: we don't wait, we pray ;)
22:38 airlied: when nouveau can autoreclock and kepler for all users is oh wow it's faster out of the box, we have more leverage on nvidia
22:39 imirkin_: airlied: any user who notices that nouveau is slow and wants faster, they will quickly track down the sysfs thing.
22:39 airlied: imirkin_: users don't do that though, they just install the binary driver
22:39 imirkin_: my guess is most users of nouveau use it for one thing - runtime pm shutdown of the gpu :)
22:39 orbea: its not hard at all with 4.10 :)
22:39 airlied: imirkin_: pretty much the main use case I have for it :)
22:40 orbea: i use nouveau because nvidia blobs ruined my day one too many times with questionable decisions such as removing system libraries that led to a broken system, granted I learned a lot fixing that...
22:41 gregory38: multi threading gl might help to have some speed improvement
22:42 orbea: gregory38: you ever seen what that does with mpv + hwdec on nouveau? its not pretty :)
22:42 imirkin_: my guess is that the main perf improvement left in nouveau is zcull
22:42 imirkin_: [in addition to compiler improvements and whatnot]
22:42 imirkin_: [of which there's also a major one - proper instruction scheduling]
22:43 imirkin_: my guess is that between those two, most of the delta to the blob will dissipate.
22:43 gregory38: orbea: which MT? I tried Marek's branch yesterday and it was fine on PCSX2 (well, only tested 5 minutes)
22:43 orbea: with mpv, the mplayer/mplayer2 fork
22:44 orbea: crashed my entire system within seconds last time I tested it
22:44 gregory38: I got a 30% perf boost :p
22:44 hakzsam: imirkin_: stability and errors recovery would be good too :)
22:44 orbea: i should try Marek's branch I guess :)
22:45 orbea: i wonder if it helps xenosaga :D
22:45 imirkin_: hakzsam: i meant perf delta
22:45 imirkin_: not feature delta
22:45 hakzsam: sure, I know
22:45 gregory38: it needs some hacks to be efficient and I might need to update gsdx to avoid some clear functions (I will see when the code lands in mesa)
22:53 glennk: imirkin_, what was the stencil equivalent of zcull again?
22:54 imirkin_: not aware that there is one
23:32 MarcinWieczorek: Thanks for help today guys
23:32 MarcinWieczorek: Night