02:16 pecisk: karolherbst1: hey :)
02:16 karolherbst: hi
02:17 pecisk: karlmag: mmiotrace and bios turned up to be useful?
02:18 karolherbst: don't know, didn't look into that this deep
02:21 pecisk: ahh ok :)
02:34 karolherbst: pecisk: maybe you should also trace nouveau and we might find an important difference
02:35 karolherbst: but actually you could also do that then ;) just open the trace with demmt and you can play around with it :p
02:37 pecisk: what should I look for? :)
02:38 karolherbst: don't know, your dmesg output should give you hints, but this is a part where I never looked into
02:56 pecisk: bb
03:09 pecisk: karolherbst: might be nothing, but I also reproduced it today - if I boot the machine, launch optirun and get nvidia initialized, then reboot the machine and modprobe nouveau, sched_err is gone. It's very consistent behavior.
03:23 karolherbst: mhhh there are many reasons for that
03:24 karolherbst: nvidia might do something nouveau doesn't
03:24 karolherbst: and as long as the gpu isn't powered off, chances are that some regs aren't reset or the gpu isn't in a "fresh" state
03:24 karolherbst: this should also work without reboot then
03:24 karolherbst: mhhh, the stupid thing is that your X server claims the nvidia card
03:24 karolherbst: which is painful
03:25 pecisk: karolherbst: it doesn't
03:25 karolherbst: it is, because you can't load nouveau
03:25 pecisk: I can reboot laptop, lsmod doesn't show nvidia
03:25 karolherbst: or unload nvidia
03:25 pecisk: I can now
03:25 pecisk: if I do fresh reboot
03:25 pecisk: and log in
03:25 karolherbst: yeah, but this should work without rebooting
03:25 pecisk: and do lsmod, I don't see nvidia
03:26 karolherbst: usually bumblebee works like this:
03:26 pecisk: karolherbst: isn't it bumblebee that screws things up?
03:26 karolherbst: you want to run something through bumblebee, bumblebee does this:
03:26 karolherbst: 1. load kernel module, usually nvidia
03:26 karolherbst: 2. starts second X server on your nvidia card
03:26 karolherbst: 3. does some nasty bridging things to display stuff on your intel card
03:26 karolherbst: 4. when application done => stop second X server
03:27 karolherbst: 5. unload kernel module
03:27 karolherbst: 6. power off gpu
03:27 karolherbst: but, your main X server prevents the last two steps
03:27 karolherbst: because it claims the nvidia card with the modesetting driver
03:27 karolherbst: if you had a DRI3-enabled intel Xorg driver, this would be easily solvable
03:28 karolherbst: you could add this to your xorg.conf: https://gist.github.com/karolherbst/f6918733d3456133d433
03:28 karolherbst: but then DRI2 offloading doesn't work anymore
03:28 karolherbst: ohhh wait
03:28 karolherbst: give me your X config
03:29 pecisk: I have no X config
03:29 karolherbst: then make one like this: https://gist.github.com/karolherbst/f6918733d3456133d433
03:30 karolherbst: we only need a way to enable dri3 then
03:30 pecisk: AutoAddGPU isn't recognized as configuration parameter for some reason
03:30 karolherbst: it should be
03:32 pecisk: yesterday that prevented me from logging in, Xorg failed saying it doesn't recognize such a parameter for the ServerFlags section
03:35 karolherbst: weird
03:35 karolherbst: which X server version do you have?
03:36 pecisk:
03:36 karolherbst: I get something like this: https://gist.github.com/karolherbst/16e481e54ae4cd004f71
03:37 karolherbst: with 1.17.2
03:40 pecisk: I am stupid :/
03:41 pecisk: karolherbst: I saw what I did wrong
03:41 pecisk: will try it
03:41 pecisk: ok, bb
03:43 pecisk: karolherbst: ok, I am in
04:10 pecisk: karolherbst: ok, so I have that xorg config loaded. What now? :)
04:39 karolherbst: pecisk: now you can check if the nvidia module is unloaded after running something through optirun
04:41 pecisk: it's not
04:41 pecisk: karolherbst: however I just unloaded it manually
04:41 karolherbst: ohh okay, so there wasn't any use left?
04:41 pecisk: with rmmod
04:41 pecisk: no
04:41 karolherbst: okay, that's better then
04:42 karolherbst: usually the module should be unloaded, but that might be because of some config
04:42 karolherbst: doesn't matter that much then
04:42 karolherbst: okay
04:42 karolherbst: then try to load nouveau
04:47 pecisk: karolherbst: success
04:50 karolherbst: okay mhh
04:51 karolherbst: can you run anything on it with DRI_PRIME=1 ?
04:51 karolherbst: check glxinfo if nouveau is shown
04:58 pecisk: karolherbst: both DRI_PRIME= 0 and 1 shows it's Intel
04:58 karolherbst: okay, then DRI3 is missing sadly :/
04:58 karolherbst: mhhh
04:59 karolherbst: did you upgrade your packages already?
04:59 pecisk: karolherbst: in what way? For Fedora 23? As far as I remember...not yet
05:00 karolherbst: you should check if there is an update for your intel xorg driver
05:01 karolherbst: but if you use "2.99.917-15.20150729" you should be able to enable dri3
05:08 pecisk: yes, I use this one
05:09 pecisk: it seems nouveau text at dmesg says it uses DRI3
05:09 pecisk: karolherbst: damn, one sec
05:09 pecisk: reboot
05:10 pecisk: hmmm, nevermind, that's not the reason
05:14 pecisk: karolherbst: bumblebee seems to have written nouveau.modeset=0 and blacklisting it at boot line
05:14 pecisk: karolherbst: however I removed it when I booted this time and I still don't get Nvidia card when initializing nouveau
05:17 pecisk: karolherbst: there's some strange invalid tables in dmesg when I modprobe nouveau
05:17 pecisk: http://ur1.ca/ntek4
05:22 pmoreau: pecisk: It should not matter
05:23 pmoreau: Not sure what you mean by "I still don't get Nvidia card when initializing nouveau", you still don't see it in xrandr or using DRI_PRIME?
05:24 pecisk: yes
05:24 pmoreau: Could you paste your whole dmesg and Xorg.log please?
05:27 pecisk: pmoreau: dmesg here http://fpaste.org/268860/57912814/
05:28 pecisk: pmoreau: Xorg.log here http://fpaste.org/268865/25792901/
05:29 pmoreau: Check that the Nvidia card didn't go to sleep
05:30 pecisk: pmoreau: how to do that?
05:31 pmoreau: `cat /sys/kernel/debug/vga_switcheroo/switch` iirc
05:31 pmoreau: http://nouveau.freedesktop.org/wiki/Optimus/#checkingthecurrentpowerstate
05:33 pecisk: pmoreau: no such file or directory
05:34 pmoreau: You have to run it as root
05:34 pecisk: yes, I did it
05:34 pmoreau: Oh, there is no _ in vga_switcheroo: it's vgaswitcheroo
05:35 karolherbst: pecisk: yeah your module load seems fine that was
05:35 pecisk: pmoreau: yeah, just found out :)
05:35 karolherbst: 'way
05:35 pecisk: switch shows me this
05:35 pecisk: 0:IGD:+:Pwr:0000:00:02.0
05:35 pecisk: 1:DIS: :DynOff:0000:01:00.0
05:35 pecisk: yeahh, it's off :/
05:36 karolherbst: pecisk: do LIBGL_DEBUG=verbose glxinfo >/dev/null
05:36 pmoreau: `echo ON > /sys/kernel/debug/vgaswitcheroo/switch` to turn it on
05:36 karolherbst: it should tell you whether you use DRI2 or DRI3
05:36 karolherbst: pmoreau: DynOff should be fine though
05:36 karolherbst: because that means it will turn on when needed
05:36 karolherbst: if not, then DynOff is broken ;)
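The two `switch` lines pasted above can be read mechanically. The following is a minimal Python sketch, assuming the colon-separated layout seen in this log (index, GPU kind, active marker, power state, PCI address); the `SwitcherooEntry` and `parse_switch` names are made up for illustration and are not part of any tool.

```python
from typing import List, NamedTuple


class SwitcherooEntry(NamedTuple):
    index: int
    kind: str         # "IGD" (integrated) or "DIS" (discrete)
    active: bool      # "+" marks the GPU that currently drives the display
    power: str        # e.g. "Pwr" or "DynOff", as seen in the paste above
    pci_address: str


def parse_switch(text: str) -> List[SwitcherooEntry]:
    """Parse the contents of the vgaswitcheroo `switch` debugfs file."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The PCI address itself contains colons, so split at most 4 times.
        index, kind, active, power, pci = line.split(":", 4)
        entries.append(
            SwitcherooEntry(int(index), kind, active == "+", power, pci)
        )
    return entries


# The exact output pecisk pasted.
SAMPLE = "0:IGD:+:Pwr:0000:00:02.0\n1:DIS: :DynOff:0000:01:00.0\n"

for entry in parse_switch(SAMPLE):
    print(entry.kind, entry.power, entry.pci_address)
```

Here the discrete GPU reports `DynOff`, which, as karolherbst notes, only means it is powered down until something needs it.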
05:36 pecisk: http://fpaste.org/268866/42579796/
05:36 pmoreau: Right
05:36 pmoreau: :)
05:36 pecisk: it says DRI2
05:36 pecisk: on screen 0
05:37 karolherbst: okay, then that's the problem
05:37 karolherbst: mhhhh
05:37 karolherbst: it would be easier if it would work without loading the blob :D
05:38 pmoreau: pecisk: You still only get intel in `xrandr --listproviders`?
05:38 pecisk: karolherbst: well, it is working without loading blob, it just spews sched_err all over the place
05:38 karolherbst: then we try it by hand
05:38 karolherbst: pecisk: that means it doesn't work
05:38 pecisk: karolherbst: yes, I get Intel
05:38 karolherbst: pecisk: https://gist.github.com/karolherbst/f6918733d3456133d433
05:38 karolherbst: see that dri3 part?
05:38 karolherbst: try to add it
05:38 karolherbst: and restart X
05:39 pecisk: roger
05:39 pecisk: bb
05:47 pecisk: karolherbst: success regarding DRI3 http://fpaste.org/268874/25804231/
05:47 karolherbst: nice
05:47 karolherbst: okay, is nouveau still loaded?
05:48 karolherbst: or did you reboot
05:48 pecisk: karolherbst: I rebooted
05:48 karolherbst: ohh okay
05:48 pecisk: there's no nvidia or nouveau
05:48 karolherbst: you could try to load nouveau now without nvidia
05:48 pecisk: at this point
05:48 karolherbst: and check if it works
05:48 pecisk: ok
05:48 karolherbst: but I highly doubt that
05:48 pecisk: modprobe nouveau worked, no sched_err
05:49 karolherbst: although if you rebooted, maybe the card is in a good state
05:49 karolherbst: ahh well
05:49 karolherbst: then check with DRI_PRIME=1
05:53 karolherbst: pecisk_: ohh something happened :D
05:53 pecisk_: karolherbst: well, DRI_PRIME=1 glxinfo froze my laptop this time...I guess it tried to access Nvidia through Nouveau
05:53 pecisk_: for first time for real
05:55 karolherbst: yeah
05:55 karolherbst: most likely
05:55 pecisk_: karolherbst: aaaaaand after this freeze now I modprobe nouveau and I get sched_err
05:56 pecisk_: it seems nvidia, when it initializes, gets something right or to the card's liking, and when nouveau launches it screws something up
05:56 pecisk_: so next time nouveau launches it gets sched_err
05:56 pecisk_: interesting enough i survives power off/on
05:57 pecisk_: i/it/s
06:05 pecisk_: karolherbst: and when nouveau seems to load cleanly, it's just using some leftovers from the nvidia driver, it seems
06:07 karolherbst: something like that
06:07 karolherbst: that's why you should maybe trace the nouveau module
06:07 karolherbst: and check what the blob does differently
06:08 pecisk_: but it seems I can't even get nouveau working (on clear power reset basis) as it just goes down with sched_err
06:18 pecisk_: karolherbst: should I try to trace nouveau even if I get sched_err?
06:28 karolherbst: yes
06:29 karolherbst: when this happens is the best time
06:31 pecisk_: ok, that's what I will try next, will finish one report for job first
06:43 virtTimH: could someone look at a stack trace and offer thoughts?
06:43 virtTimH: http://pastebin.com/xCw7awtL
06:55 karolherbst: virtTimH: without debug symbols it is pretty useless
06:56 virtTimH: karolherbst: ok, fair enough
06:56 virtTimH: i don't _think_ it's nouveau related
07:18 virtTimH: karolherbst: would this be a cause for concern?
07:18 virtTimH: nvc0_program_translate:573 - shader translation failed: -5
07:18 virtTimH: nvc0_program_translate:573 - shader translation failed: -5
07:18 pecisk_: karolherbst: when doing this nouveau trace, I just modprobe nouveau, launch tracer and exit after 30 - 40 secs? Or have to launch something?
07:18 virtTimH: via valgrind, same run
07:20 karolherbst: pecisk_: should be fine
07:20 pecisk_: cool
07:20 karolherbst: virtTimH: it depends a lot on what is actually going wrong
07:21 virtTimH: karolherbst: can you offer any suggestion to drill down on it?
07:23 virtTimH: http://hastebin.com/exocisahil.vhdl
07:23 virtTimH: full output from valgrind of that nouveau section
07:25 virtTimH: setting up the debug symbols too
07:25 virtTimH: this is 4.3-rc1 too
07:26 pecisk_: karolherbst: ran trace capture, for 40 secs, got this http://fpaste.org/268921/14425863/
07:37 karolherbst: pecisk_: you need to enable the mmiotracer before loading nouveau
07:50 pecisk_: karolherbst: so steps would be -> enable mmiotrace tracer -> modprobe nouveau -> enable dump -> exit dump
07:53 karolherbst: like you traced the blob
07:53 karolherbst: but you don't need to run anything on the card
07:54 pecisk_: I know
07:54 pecisk_: karolherbst: this time I got nouveau modprobed already, so I just fired up mmiotrace tracer and created dump
07:54 pecisk_: but I will try with fresh restart
07:55 pecisk_: bb
08:15 pecisk_: karolherbst: it seems ok this time, mmiotrace with broken nouveau https://drive.google.com/file/d/0B31QWsnd2TuWX3RzTVBiV0VtZXc/view?usp=sharing
08:17 karolherbst: that's a little short, but maybe it helps
08:17 karolherbst: mhhh "CPU:0 [LOST 992220 EVENTS]"
08:18 karolherbst: you really should follow those instructions on the ubuntu mmiotrace wiki page
08:18 pecisk_: but I followed them
08:19 pecisk_: karolherbst: which instructions you mean?
08:20 pecisk_: I rebooted laptop -> set tracer to mmiotrace -> modprobe nouveau -> turned on dump -> turned off after 1 min
08:24 pecisk_: karolherbst: lol it crashes while doing mmiotrace
08:24 pecisk_: http://ur1.ca/ntfm2
08:24 pecisk_: ohh dear, this laptop is just broken
08:29 pecisk_: karolherbst: ok, I will try bigger buffer size
08:37 pecisk_: karolherbst: ok, no lost events here, used Ubuntu mmiotrace instructions properly now https://drive.google.com/file/d/0B31QWsnd2TuWYmY3elE3S2NoV1E/view?usp=sharing
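A trace with dropped events is easy to miss by eye. The following is a small Python sketch for checking a captured mmiotrace log before sharing it; it assumes only the `CPU:<n> [LOST <count> EVENTS]` marker format quoted earlier in this log, and `count_lost_events` is a made-up helper name.

```python
import re

# Matches markers such as "CPU:0 [LOST 992220 EVENTS]" quoted above.
LOST_RE = re.compile(r"CPU:(\d+) \[LOST (\d+) EVENTS\]")


def count_lost_events(lines):
    """Return the total number of events reported as lost in a trace."""
    total = 0
    for line in lines:
        match = LOST_RE.search(line)
        if match:
            total += int(match.group(2))
    return total


# One marker line from the chat plus one ordinary write line.
sample = [
    "CPU:0 [LOST 992220 EVENTS]",
    "W 4 115.456096 264 0xf017f230 0x0 0x0 0",
]
print(count_lost_events(sample))
```

To check a real capture, pass an open file instead of the list, for example `count_lost_events(open("mmiotrace.log"))`. A non-zero result suggests the trace buffer was too small, which is what the bigger buffer size fixed here.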
09:06 imirkin: virtTimH: "nvc0_program_translate:573 - shader translation failed: -5" -- that means the emitter failed. this is very unusual
09:07 imirkin: virtTimH: if you make a debug mesa build and run with NV50_PROG_DEBUG=1 that should print out a lot more shader compiler info, including what shader it's failing with
09:09 virtTimH: imirkin: i can certainly give that a try in a second, that could be part of my issue I suppose
09:09 virtTimH: too old Mesa
09:10 imirkin: virtTimH: oh yeah, you want at least Mesa 11.0 :)
09:10 virtTimH: heh
09:10 virtTimH: k, on it
09:10 imirkin: which GPU are you using?
09:10 virtTimH: this is the 740 now
09:10 virtTimH: had the 630 in earlier
09:10 imirkin: both keplers right?
09:10 imirkin: is the 740 a GK208?
09:10 virtTimH: i believe so, yes
09:11 imirkin: if so, i could definitely imagine some emission failures in esp very old versions of mesa
09:13 pecisk: karolherbst: this trace is ok?
09:13 karolherbst: pecisk: seems that way
09:14 virtTimH: imirkin: having a hard time pinning that spec down
09:14 virtTimH: doesn't show in lshw
09:14 imirkin: virtTimH: lspci -nn -d 10de:
09:14 virtTimH: didn't show in lspci
09:14 virtTimH: ah, didn't use args
09:14 imirkin: humor me
09:15 virtTimH: no model for the VGA controller, but GK107 for the HDMI Audio controller :)
09:15 imirkin: can you paste the VGA controller line?
09:15 virtTimH: 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:0fc8] (rev a1)
09:15 imirkin: 0fc8 GK107 [GeForce GT 740]
09:15 karolherbst: I bet your hwdb may be old, too
09:16 imirkin: (from my pci.ids file)
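The lookup imirkin did by hand can be sketched in Python: pull the `[vendor:device]` pair out of an `lspci -nn` line and match it against a table. The table below holds only the single entry quoted in this log; a real lookup would read a full `pci.ids` file. `extract_pci_id` and `KNOWN_DEVICES` are illustrative names.

```python
import re

# [vendor:device] as printed by `lspci -nn`, e.g. "[10de:0fc8]".
PCI_ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")

# Tiny excerpt: only the entry imirkin quoted from his pci.ids file.
KNOWN_DEVICES = {("10de", "0fc8"): "GK107 [GeForce GT 740]"}


def extract_pci_id(lspci_line):
    """Return (vendor, device) from an `lspci -nn` line, or None."""
    match = PCI_ID_RE.search(lspci_line)
    return (match.group(1), match.group(2)) if match else None


line = ("01:00.0 VGA compatible controller [0300]: "
        "NVIDIA Corporation Device [10de:0fc8] (rev a1)")
pci_id = extract_pci_id(line)
print(pci_id, KNOWN_DEVICES.get(pci_id, "unknown device"))
```

The class code `[0300]` is skipped because it has no colon, so only the vendor and device pair matches.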
09:16 imirkin: anyways... that's surprising then. i don't remember any emission bugs in a while. oh, maybe the spilling thing
09:16 imirkin: gr
09:16 karolherbst: ohh hwid is the project called :/
09:16 karolherbst: imirkin: spilling is -4
09:16 imirkin: but iirc that just asserts, doesn't end up returning -5
09:16 imirkin: karolherbst: no, the b96 loadstore thing
09:16 karolherbst: ohh
09:17 karolherbst: that should be an assert
09:17 imirkin: well presumably he has assertions disabled
09:17 karolherbst: right
09:17 imirkin: but iirc it just assumes it's a 32-bit load/store and moves on
09:17 imirkin: which makes for incorrect code, but not an error return code
09:17 karolherbst: virtTimH: what distribution are you using?
09:18 virtTimH: ubuntu gnome 14.04
09:18 imirkin: anyways, not much point speculating... see if mesa 11 helps
09:18 imirkin: i certainly don't have time to debug things on old and unsupported mesa versions
09:18 virtTimH: imirkin: ok, thanks, doing that now
09:18 virtTimH: fair enough
09:19 karolherbst: mhhh
09:19 karolherbst: why don't they update hwdata on an LTS distribution
09:20 virtTimH: picked 14.04 because I wanted the host to be a bit more stable, but I may have to move to fedora rawhide or something if i can't right this ship
09:20 karolherbst: you have mesa 10.1 installed, right?
09:20 virtTimH: yes
09:20 karolherbst: you could try out the xorg-edgers ppa
09:20 karolherbst: it should have all the new stuff
09:20 karolherbst: and usually doesn't break your system
09:20 virtTimH: ok, i'll give that a try first
09:21 karolherbst: 10.1 is old :/
09:21 imirkin: oh wow. 10.1 is ancient history. that was the first release with geometry shader support and OpenGL 3.3
09:21 karolherbst: early 2014
09:21 virtTimH: client glx vendor string: Mesa Project and SGI
09:21 virtTimH: OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.1.3
09:21 virtTimH: OpenGL version string: 3.0 Mesa 10.1.3
09:21 virtTimH: yeah
09:21 karolherbst: ohh 10.3
09:22 karolherbst: *10.1.3
09:22 virtTimH: no, 10.1.3
09:22 virtTimH: yeah
09:22 karolherbst: silly me
09:22 imirkin: 10.1 was *many* bug fixes ago :) i can't even remember all of them, but ...
09:22 imirkin: git log mesa-10.1.3..origin/master -- src/gallium/drivers/nouveau | grep '^commit' | wc -l
09:22 imirkin: 511
09:23 karolherbst: :)
09:23 virtTimH: my apologies, feel free to flog my avatar
09:24 karolherbst: the first thing I do on new ubuntu install is to install the xorg-edgers ppa :D
09:24 virtTimH: hehe
09:24 virtTimH: well this isn't my "primary" it's a couple day old setup for all of this
09:24 imirkin: the first thing i do on any new ubuntu install is install gentoo.
09:24 karolherbst: it is a mess to upgrade then though, so please keep the notice on the website in mind
09:24 karolherbst: :D
09:25 virtTimH: the worst that can happen is this system crashes and burns
09:25 karolherbst: yeah, somehow I always used ubuntu for my gentoo
09:25 virtTimH: :)
09:25 karolherbst: mhh
09:25 karolherbst: ohh a lot worse can happen
09:25 karolherbst: like battery is starting to burn
09:26 karolherbst: but this is most likely not caused by OS
09:26 karolherbst: imirkin: what I like most about gentoo though is that packages don't screw stuff up automatically, they always instruct you to do so
09:26 virtTimH: reboot, brb
09:27 karolherbst: imirkin: I hope for him that the new stack plays well with 3.13 :D
09:27 imirkin: it should
09:28 imirkin: he needs a fresher libdrm though
09:28 karolherbst: it is in the ppa
09:28 karolherbst: most of the stuff there is weekly git binaries
09:28 karolherbst: something like that
09:34 virtTimH: geniuses!
09:34 virtTimH: my bad :)
09:35 karolherbst: so it works now?
09:35 virtTimH: that part isn't crashing
09:35 karolherbst: nice
09:35 virtTimH: yes, very nice.. thanks!
09:35 virtTimH: on a previous setup I had built Mesa from source
09:35 virtTimH: on this particular box I did not
09:35 virtTimH: *hangs head*
09:35 imirkin: plenty of fixes happen in 2 years
09:36 virtTimH: that's a two year span?
09:36 virtTimH: holy hell
09:36 imirkin: Jan 2014
09:36 imirkin: not quite 2 years
09:36 virtTimH: close enough to call it that, yeah
09:36 imirkin: er oops. Mar 2014
09:36 virtTimH: alright now
09:36 imirkin: so like 1.5 years
09:36 virtTimH: now you're just playing games
09:36 virtTimH: :)
09:39 virtTimH: ok, no 3d/opengl acceleration in the guest
09:39 virtTimH: but not crashing at tip either
09:39 virtTimH: so small win
09:42 imirkin: are you passing the pci devices through to a guest?
09:44 virtTimH: no, virglrenderer
09:44 virtTimH: virtio
09:45 virtTimH: imirkin: ^^
10:00 imirkin: oh. that should work too.
10:10 imirkin: iirc it just uses opengl on the host
10:10 virtTimH: should just, yes via virglrenderer
10:10 virtTimH: probably guest is b0rked
10:11 imirkin: well, it might also try to use multiple GL contexts
10:11 imirkin: which would be sad, since nouveau doesn't handle that very well
10:11 imirkin: at least if those contexts are used simultaneously
10:11 virtTimH: i know I've read it's working on nouveau at least in some incarnation
10:12 imirkin: perhaps airlied blacklisted nouveau? dunno
10:12 virtTimH: well, now that the tip of that wip qemu branch isn't crashing on my hardware
10:12 virtTimH: I'm going to dig into the source more this afternoon
10:13 virtTimH: in the guest, direct rendering support is listed as "Yes"
10:13 virtTimH: that seems like a positive sign, right?
10:14 virtTimH: or would that be the case for any guest?
10:15 virtTimH: alright, but now, I must attend to other things, thanks again imirkin and karolherbst
10:16 imirkin: def a positive sign
10:16 imirkin: but if it says llvmpipe, then that means you don't have a 3d driver for the virtio "hardware"
10:17 imirkin: good luck
10:18 joi: imirkin: witcher2[13093]: Unknown handle 0x0000001f
10:18 joi: just after updating to mesa-git
10:18 imirkin: joi: there are a few reports of that happening in chrome
10:18 imirkin: joi: however i have no idea why it'd happen
10:18 imirkin: are you saying a recent change to mesa is triggering it?
10:19 imirkin: i haven't done anything related to that recently...
10:19 joi: well, it didn't happen on git from ~week ago
10:19 imirkin: does it happen reliably?
10:20 imirkin: if so, bisect ;)
10:20 imirkin: coz i have no idea what would cause it
10:20 joi: are you interested in "nouveau: kernel rejected pushbuf" log?
10:20 imirkin: not really...
10:20 imirkin: a lot more interested in where the unknown handle comes from
10:21 joi: well, the pushbuf probably references it somewhere...
10:21 imirkin: right, in the header
10:21 imirkin: but there's no way for an unknown handle to make it in there
10:21 imirkin: so... how'd that happen
10:24 joi: http://people.freedesktop.org/~mslusarz/1f.txt
10:26 imirkin: right, that tells me that buffer handle 1f is being used. but where'd it come from?
10:40 joi: heh, it happened once and I can't reproduce it
10:43 joi: wow, and now witcher2[8087]: multiple instances of buffer 1145 on validation list
10:45 joi: http://people.freedesktop.org/~mslusarz/mult.txt
10:46 imirkin: so something dodgy's going on, but i have no idea what
10:47 imirkin: that's not supposed to happen either
11:09 karolherbst: imirkin: does this message help in any way? "nouveau 0000:01:00.0: fifo: read fault at 00225ec000 engine 1b [CE2] client 18 [GR_CE] reason 02 [PTE] on channel 2 [00bf890000 witcher2[5926]]"
11:12 imirkin_: not without a mmt trace to go with it
11:16 joi: any idea why glretrace opens so many windows?
11:17 imirkin_: try #apitrace?
11:17 imirkin_: i've found it to be a little annoying too sometimes
11:17 imirkin_: i suppose "the more the merrier" :)
11:19 imirkin_: skeggsb: btw, just wanted to make sure you didn't miss http://lists.freedesktop.org/archives/nouveau/2015-September/022279.html
11:31 karolherbst: joi: I know the reasons
11:31 karolherbst: apitrace can't trace witcher2 yet
11:31 imirkin_: that's... a pretty abstract reason
11:31 karolherbst: joi: but I have something for ya
11:32 karolherbst: imirkin_: you know these unsupported GL extensions
11:32 karolherbst: GL_ARB_buffer_storage
11:32 imirkin_: you're talking about GL_ARB_buffer_storage right?
11:32 karolherbst: yeah
11:32 imirkin_: so it has little to do with witcher2
11:32 karolherbst: but I got witcher 2 binaries without it
11:32 karolherbst: joi: http://developer.vpltd.com/public/witcher2-gog-no_buffer_storage.tgz and http://developer.vpltd.com/public/witcher2-gog-vpfs.tgz
11:33 karolherbst: with these you should be able to create a useful trace
11:33 joi: well, I can replay witcher2 traces...
11:33 karolherbst: does anything useful show up?
11:34 joi: they don't reproduce errors I see natively/under apitrace
11:34 karolherbst: whenever I tried to trace it, I just got these errors about missing support for this extension and no useful trace at all
11:35 joi: yes, the game renders perfectly
11:35 karolherbst: ohh strange
11:35 karolherbst: then you seem to have different binaries indeed than I have
11:35 imirkin_: polish vs german versions? :)
11:35 karolherbst: maybe because you use the beta branch?
11:36 karolherbst: the game seems to run really badly anyway :/
11:36 pecisk: can try to test Witcher 2 beta branch on Steam on nouveau if needed
11:36 joi: nope, I just bought the game
11:37 joi: I don't even know how to obtain beta binaries
11:38 karolherbst: ohh strange
11:38 pecisk: joi, one sec
11:39 pecisk: joi, click on the game with alt button, and get drop down menu and go to properties
11:39 pecisk: joi, then see tab 'BETAS'
11:39 pecisk: joi, and select the beta from the drop-down menu
11:39 pecisk: click on and it will update it for you
11:46 pecisk: karolherbst, I am checking nouveau mmiotrace and it does something very strange there
11:47 karolherbst: pecisk: what do you mean by "strange"?
11:48 pecisk: W 4 115.456096 264 0xf017f230 0x0 0x0 0 - fifth column is memory address right?
11:49 karolherbst: use demmt
11:49 pmoreau: s/demmt/demmio
11:49 karolherbst: ohh right, demmio
11:49 pmoreau: Or at least for that line ;)
11:49 pecisk: karolherbst, it just goes through that fifth column, increasing it by 4, with similar W commands
11:50 karolherbst: or lookup 0xf017f230 0x0
11:50 pecisk: and that's for few seconds
11:50 karolherbst: and?
11:50 pecisk: example http://fpaste.org/269041/26022361/
11:50 pecisk: I wonder why it does so and what it means
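The raw line quoted above can be split into named fields. The following Python sketch assumes the field order given in the kernel's mmiotrace documentation for `R` and `W` events (width, timestamp, map id, physical address, value, PC, PID), which would make the fifth column the physical address. `MmioAccess`, `parse_access` and `is_linear_walk` are made-up names, and the two extra sample lines are synthetic, written only to mimic the "address increases by 4" pattern pecisk describes.

```python
from typing import NamedTuple, Optional


class MmioAccess(NamedTuple):
    op: str           # "R" (read) or "W" (write)
    width: int        # access width in bytes
    timestamp: float  # seconds
    map_id: int
    phys: int         # physical address that was accessed
    value: int
    pc: int
    pid: int


def parse_access(line: str) -> Optional[MmioAccess]:
    """Parse one raw R/W line; return None for any other line type."""
    parts = line.split()
    if len(parts) != 8 or parts[0] not in ("R", "W"):
        return None
    return MmioAccess(
        parts[0], int(parts[1]), float(parts[2]), int(parts[3]),
        int(parts[4], 16), int(parts[5], 16), int(parts[6], 16),
        int(parts[7]),
    )


def is_linear_walk(accesses) -> bool:
    """True if each access starts right after the previous one ends."""
    return all(
        nxt.phys - cur.phys == cur.width
        for cur, nxt in zip(accesses, accesses[1:])
    )


lines = [
    "W 4 115.456096 264 0xf017f230 0x0 0x0 0",  # quoted in the chat
    "W 4 115.456101 264 0xf017f234 0x0 0x0 0",  # synthetic
    "W 4 115.456106 264 0xf017f238 0x0 0x0 0",  # synthetic
]
accesses = [a for a in map(parse_access, lines) if a is not None]
print(hex(accesses[0].phys), is_linear_walk(accesses))
```

A long run of such writes with value `0x0` looks like a memory area being cleared rather than individual register writes, though the raw trace alone cannot confirm that.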
11:51 pmoreau: You should pass the whole mmiotrace through demmio, it will help you understand it
11:51 pecisk: ok
11:52 pecisk: what does demmio do?
11:52 pmoreau: I guess those are not "regs", but more some memory area where you can store various objects
11:52 karolherbst: pecisk: it parses the file for you
11:52 karolherbst: demmio -f mmiotrace.log
11:53 pecisk: ahh
11:53 pecisk: nice
11:53 pmoreau: By using rnndb, so you get a name for the address, and a split of the value in different fields etc.
11:53 pmoreau: (If that reg was RE'ed)
11:53 pecisk: ouch
11:53 pecisk: MAP -> SLEEP -> UNMAP
11:53 pecisk: lots of them
11:53 karolherbst: and?
11:53 pecisk: ahh
11:53 pecisk: nevermind
11:53 karolherbst: use demmio
11:54 karolherbst: and try to figure out what nouveau does wrong
12:02 pecisk: what does RAMIN8 mean?
12:02 imirkin_: 8-byte read from RAMIN
12:03 imirkin_: err
12:03 imirkin_: 8-bit
12:03 imirkin_: and operation, could be read or write
12:03 imirkin_: RAMIN is a window into vram btw... i think.
12:04 pecisk: I see
12:04 imirkin_: but i'm pretty weak on those low-level details
12:05 imirkin_: they tend not to matter for the vast majority of things
12:05 pecisk: imirkin_, I am pretty....total beginner :)
12:05 imirkin_: http://envytools.readthedocs.org/en/latest/hw/index.html
12:05 pecisk: imirkin_, well, it seems just a huge list of commands blanking video memory or something
12:05 pecisk: imirkin_, ohh nice
12:05 pecisk: thank you!
12:06 imirkin_: e.g. http://envytools.readthedocs.org/en/latest/hw/memory/g80-vram.html
12:06 imirkin_: (the gf100+ one has yet to be written, it seems)
12:07 imirkin_: hmmm... doesn't go over RAMIN though
12:07 imirkin_: i guess it was more of a pre-G80 concept
12:07 pecisk: trying to decipher these two lines, as those seem to appear around the same time the driver hangs
12:07 pecisk: [0] 114.895284 MMIO32 R 0x00d054 0x00000007 PGPIO.I2C[0x2].BITBANG => { SCL_OUT | SDA_OUT | MODE = BITBANG }
12:07 pecisk: [0] 114.895304 MMIO32 W 0x00d054 0x00000005 PGPIO.I2C[0x2].BITBANG <= { SCL_OUT | MODE = BITBANG }
12:08 imirkin_: that's a read followed by a write to mmio register 0xd054
12:08 pecisk: yeah, it tries to write.............a lot, and then it tries to read...........a lot
12:09 imirkin_: dunno what's attached to i2c 2... could be something like DDC
12:09 pecisk: then it tries to set timers and temperature params
12:09 imirkin_: or hwmon or who knows
12:09 pecisk: right
12:09 imirkin_: vbios should tell you
12:09 pecisk: ahhhhh
12:10 pecisk: bingo
12:10 pecisk: karolherbst, maybe nothing but hear this
12:10 pecisk: this is from nvidia
12:10 pecisk: [0] 1753.204673 MMIO32 R 0x00d034 0x00000007 PGPIO.I2C[0x1].BITBANG => { SCL_OUT | SDA_OUT | MODE = BITBANG }
12:10 pecisk: this is from nouveau
12:10 pecisk: [0] 114.895284 MMIO32 R 0x00d054 0x00000007 PGPIO.I2C[0x2].BITBANG => { SCL_OUT | SDA_OUT | MODE = BITBANG }
12:12 pecisk: imirkin_, just grepping or have to use some envytool for that?
12:12 imirkin_: skeggsb: btw, was this on purpose? http://cgit.freedesktop.org/~darktama/nouveau/diff/drm/nouveau/nvkm/subdev/bios/i2c.c?id=27ea56048a610fff746afd278e0ed5c016ff56db
12:12 imirkin_: pecisk: 'nvbios' in envytools
12:13 pecisk: judging by BITBANG responses in nvidia mmiocapture nouveau never gets answer from that subdevice
12:13 pecisk: I am kinda ready to bet it's what hangs it
12:22 pmoreau: imirkin_: Looking at the doc, the change was on purpose I would say: http://http.download.nvidia.com/open-gpu-doc/DCB/2/DCB-4.x-Specification.html#_communications_control_block_0x41
12:24 imirkin_: hm ok
12:27 karolherbst: pecisk: you need a clear goal what you are searching for
12:27 karolherbst: there are tons of lines
12:27 karolherbst: and mostly it is different than the blob
12:27 karolherbst: but something the nouveau driver is complaining about is more different than the other stuff
12:28 karolherbst: sometimes nouveau even uses values not "optimized" for your chipset, but the blob does and both ways are usually fine
12:28 karolherbst: so it is more tricky than just finding the difference
12:29 karolherbst: this was your error in nouveau: https://gist.github.com/karolherbst/7732b1c3afd3eb959e54
12:29 karolherbst: try to search this in the traces
12:29 karolherbst: nouveau and nvidia
12:29 karolherbst: everything with "bad" in the result is bad
12:29 karolherbst: this means either wrong reg was used
12:30 karolherbst: or the gpu is in a state where some engines are off? or parts are messed up or whatever
12:30 pecisk: karolherbst, well, if you see there's different addresses for I2C connection - 0x1 for nvidia and 0x2 for nouveau
12:31 karolherbst: this doesn't matter
12:31 karolherbst: you have tons of GPIOs
12:31 karolherbst: and both will be accessed in both traces
12:32 karolherbst: as I said: don't just try to find the difference, but try to find the part where nouveau fails
12:32 karolherbst: and compare this part with the blob
12:32 pecisk: karolherbst, not exactly, nouveau just massages 0x2, nvidia goes through all GPIOs from 0x1 up
12:32 pecisk: karolherbst, those commands are around where it fails, that's why I asked :)
12:33 pecisk: it just tries same BITBANG on same GPIO over and over again
12:33 karolherbst: yeah, the blob does it too
12:33 pecisk: while nvidia goes through all GPIOs
12:33 pecisk: nope
12:33 pecisk: it doesn't
12:33 karolherbst: it does
12:33 karolherbst: demmio -f nvidia.log.xz 2>/dev/null | grep PGPIO | grep 0x2
12:34 karolherbst: you have to know that the blob doesn't have to run all init code at load time, it can do so later
12:34 karolherbst: as far as I know, nouveau does everything at load time so that the gpu is fully ready for everything
12:34 karolherbst: but the blob is more lazy than that
12:35 karolherbst: but you are right with that the nouveau does more BITBANGs than the blob does
12:35 karolherbst: the question is, is this a problem or is this no problem?
12:35 pecisk: karolherbst, 64 vs 29k
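A comparison like this can be reproduced by tallying `BITBANG` accesses per I2C port in the demmio output of each trace. The Python sketch below assumes only the decoded line format pasted earlier in this log (`PGPIO.I2C[0x2].BITBANG`); `bitbang_counts` is a hypothetical helper, not part of envytools.

```python
import re
from collections import Counter

# Register names as decoded by demmio in the lines pasted above.
BITBANG_RE = re.compile(r"PGPIO\.I2C\[(0x[0-9a-f]+)\]\.BITBANG")


def bitbang_counts(lines):
    """Count BITBANG register accesses per I2C port index."""
    counts = Counter()
    for line in lines:
        match = BITBANG_RE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


# The three decoded lines quoted in this log (two nouveau, one nvidia).
sample = [
    "[0] 114.895284 MMIO32 R 0x00d054 0x00000007 PGPIO.I2C[0x2].BITBANG"
    " => { SCL_OUT | SDA_OUT | MODE = BITBANG }",
    "[0] 114.895304 MMIO32 W 0x00d054 0x00000005 PGPIO.I2C[0x2].BITBANG"
    " <= { SCL_OUT | MODE = BITBANG }",
    "[0] 1753.204673 MMIO32 R 0x00d034 0x00000007 PGPIO.I2C[0x1].BITBANG"
    " => { SCL_OUT | SDA_OUT | MODE = BITBANG }",
]
print(dict(bitbang_counts(sample)))
```

Running it on the full demmio output of each trace gives per-port totals for the blob and for nouveau, which helps separate "nouveau retries one port far more often" from "the blob never touches that port".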
12:35 pmoreau: :D
12:36 karolherbst: I wouldn't bother much about those GPIOs, because I don't think they can cause such messed up IBUS stuff
12:36 karolherbst: most likely nouveau uses wrong reads for something
12:36 karolherbst: and that may cause big loops for the gpios?
12:36 pecisk: it seems so, at least to me
12:37 pecisk: 29k sounds way too much
12:37 karolherbst: it is most likely too much, yes, but is this the cause of the problem?
12:37 karolherbst: or only a symptom?
12:37 joi: switching witcher2 to beta didn't help with anything; when it doesn't hang, the game renders correctly, but I still can't reproduce those hangs under glretrace
12:37 pecisk: good question :)
12:37 karolherbst: we already know, that the PIBUS parts are really messed up
12:37 karolherbst: next question: why is it that messed up?
12:38 pecisk: karolherbst, btw, demmio gave something like this "I don't know which chipset variant to use!" - ignore?
12:38 karolherbst: ignore
12:38 karolherbst: most likely won't change much
12:39 karolherbst: pecisk: ohh did you run the trace while nouveau printed those errors?
12:40 pecisk: karolherbst, yes, I followed those instructions, first launched capture, then modprobed nouveau
12:40 karolherbst: pecisk: I meant, was the gpu in a clean state?
12:40 karolherbst: cold boot, no blob loaded before, etc...
12:42 pecisk: karolherbst, cold boot
12:42 pecisk: certainly no blob
12:42 karolherbst: I wonder, because I can't find those "0xbad00100" values :/
12:43 pecisk: karolherbst, I will redo it
12:43 karolherbst: you don't need to
12:43 pecisk: karolherbst, to be sure :) Problem is when I modprobed nouveau, I didn't get sched_err
12:43 pecisk: I did get hang a bit
12:43 pecisk: as usual
12:43 pecisk: but no messages themselves
12:44 karolherbst: mhhh
12:44 karolherbst: why does nouveau write into PBFB_BROADCAST+0x550
12:44 karolherbst: but not the blob
12:50 karolherbst: pmoreau: shouldn't nouveau first PMC.ENABLE PIBUS and then use the PIBUS regs?
12:52 pmoreau: not sure
12:53 karolherbst: the blob does it though
12:54 karolherbst: mhhh
12:54 karolherbst: don't think it is the issue though
12:56 pecisk: karolherbst, ok, if I have capture enabled before modprobe, sched_err error doesn't appear, there is some interesting stuff in dmesg, but no errors
12:57 pecisk: also I try to figure out how nouveau can have three different states between cold boots
12:57 pecisk: first, after laptop has been long off, module loads, then nouveau hangs if I touch it with glxinfo
12:58 karolherbst: pecisk: randomness
12:58 pecisk: then second time it either spews sched_err or gives me this trace
12:58 pecisk: karolherbst, it is consistent
12:58 karolherbst: nobody says that the hardware is in the same state after powering on
12:58 pmoreau: pecisk: Do you still have the errors if you use nouveau.config=NvForcePost=1 or pci-reset the card (echo 1 > /sys/bus/pci/devices/<pci_id>/reset, and the same with rescan) before loading Nouveau?
12:58 karolherbst: ohh okay
13:00 pecisk: ok, will try that
13:06 pecisk: karolherbst, any idea why mmiotracer tracing makes sched_err disappear?
13:09 karolherbst: I think this has more to do with your current X setup
13:10 karolherbst: because nothing tries to use the card
13:10 pecisk: possibly
13:13 pecisk: pmoreau, pci-reset the card didn't help, modprobe nouveau spewed sched_err
13:13 pecisk: karolherbst, curious - there are some IBUS read timeouts/faults in dmesg shown before it goes on sched_err
13:14 pecisk: ahhh
13:14 pecisk: bunch of bads too
13:14 pecisk: it's same old
13:14 pecisk: but if I turn on mmiotrace it will disappear :/
13:18 pmoreau: Well, too bad
13:24 pmoreau: imirkin: Is there a way to prevent TGSI optimisation?
13:24 imirkin_: yeah, you can comment it out ;)
13:24 pmoreau: :D
13:24 imirkin_: there's no debug flag if that's what you mean
13:24 pmoreau: :(
13:25 pmoreau: Well, I'll try to comment it out then
13:25 imirkin_: look for it in st_glsl_to_tgsi.cpp
13:29 pmoreau: Found at least one optimisation call. :) Thanks!
13:30 imirkin_: doh
13:30 imirkin_: i hate those optimizations
13:30 imirkin_: they just cause trouble
13:43 pmoreau: Is there a specific reason why to use "mov u32" when loading a float immediate?
13:43 pecisk: karolherbst, just for curiosity, this is what I get with mmiotrace on instead of sched_err https://drive.google.com/file/d/0B31QWsnd2TuWRzRFYjR2Q3M5U1k/view?usp=sharing - I don't see any "bad" values there too
13:48 imirkin_: pmoreau: why not?
13:48 imirkin_: it's all the same...
13:49 rigid: karolherbst: remember me from yesterday? it WAS clang. I recompiled the whole system with gcc and now everything works. No segfaults.
13:50 rigid: i'd love to report that bug, but I'm not too sure how to find out something specific
13:50 pmoreau: imirkin_: Ok, so "mov f32" and "mov u32" are just visual things, but there is only one mov instruction on the hardware?
13:51 imirkin_: pmoreau: yeah, it's just bits
13:51 imirkin_: as long as the bits are right, it doesn't matter
13:52 pmoreau: :-)
13:53 imirkin_: rigid: bug: clang allowed to compile software. fix: remove clang
13:56 karolherbst: rigid: :)
13:58 rigid: it's a clang bug for sure
13:58 rigid: and imo clang is a marvellous compiler and llvm just rocks
13:58 rigid: gcc must die, the sooner clang is bug free and supports all platforms that gcc supports, the better :D
13:58 imirkin_: yeah, it's a marvelous compiler. it just doesn't compile code.
13:59 pecisk: pmoreau, well, nouveau.config=NvForcePost=1 helps to get rid of sched_err
13:59 rigid: imirkin_: there were worse bugs in gcc in the past
13:59 rigid: much worse
14:24 krofna_: does nouveau support 950m?
14:24 imirkin_: krofna_: lspci -nn -d 10de:
14:25 imirkin_: or put another way, if it's a GM20x, not usefully. if it's a GM10x, then slightly. if it's a GKxxx, then reasonably.
14:26 krofna_: gm107m
14:26 krofna_: I can't boot any distro installation without nomodeset
14:27 imirkin_: weird
14:27 imirkin_: i assume this is optimus?
14:27 krofna_: Yeah, according to the laptop spec it also has an intel card
14:28 imirkin_: so it should be using the intel card as the main thing
14:28 pecisk: krofna_, I have problems with 117M too, Optimus here
14:28 pecisk: krofna_, what happens when you boot without nomodeset?
14:28 krofna_: manjaro reboots, fedora spits lots of nouveau errors for a while and then reboots
14:29 pecisk: krofna_, sounds like same issue
14:29 imirkin_: krofna_: you can boot with nouveau.modeset=0 to still get the i915 modesetting
14:29 pecisk: krofna_, SCHED_ERR?
14:30 krofna_: pecisk: it starts spitting that after about a minute of "vm timeout"
14:31 pecisk: krofna_, welcome to the club
14:32 pecisk: krofna_, trying to debug it for second day now
14:32 krofna_: sounds bad
14:33 pecisk: krofna_, well, it is a bug, it must be fixed :)
14:33 krofna_: ok so how do I write = on american keyboard layout?
14:34 karolherbst: not again...
14:34 karolherbst: krofna_: blacklist nouveau and you are good to go
14:35 pecisk: good as not using nouveau :)
14:35 karolherbst: yeah well, you had the same issue
14:35 imirkin_: pecisk: press the key just left of the backspace
14:36 krofna_: imirkin_: thanks :D
14:37 karolherbst: I just hope that in a year, distributions will handle such cases the right way
14:37 karolherbst: dummy xorg driver for dedicated gpu, intel with DRI3
14:38 krofna_: I just hope nouveau will support my card :/
14:38 karolherbst: it should
14:38 karolherbst: krofna_: add a xorg.conf like this: https://gist.github.com/karolherbst/f6918733d3456133d433
14:38 imirkin_: krofna_: in optimus, the only reason to use the secondary gpu is perf
14:38 imirkin_: krofna_: you're likely to get better perf with the intel igpu than nouveau for quite a while
14:39 karolherbst: imirkin_: mhhh depends on the intel gpu, but 950 is quite capable
14:39 karolherbst: ohh maxwell
14:39 karolherbst: okay, intel is faster :D
14:39 karolherbst: for now
14:40 karolherbst: 950M seems like 750 performance wise, so it isn't that bad
14:43 krofna_: btw, how does one get into driver programming? I've got 4 years of experience with C/C++ 2 yrs of reverse engineering but I have no clue where to start with drivers ._.
14:44 imirkin_: krofna_: you just do it
14:44 imirkin_: ;)
14:44 krofna_: easier said than done. code makes no sense..
14:44 imirkin_: open your window, and scream out "I AM A DRIVER PROGRAMMER"
14:44 pecisk: karolherbst, I don't know why but mmiotrace allows to modprobe and glxinfo nouveau without crash, while simply doing it sometimes crashes it, sometimes not
14:44 karolherbst: krofna_: believe imirkin_ :D
14:44 karolherbst: krofna_: how doesn't it make sense?
14:45 karolherbst: I thought it makes a lot of sense from the beginning somehow :/
14:45 karolherbst: to be honest, the nouveau code seems to be pretty clean compared to what you usually see
14:45 karolherbst: which really surprised me, because I always assumed nouveau is hacky as hell :D
14:45 imirkin_: krofna_: kinda like this: https://www.youtube.com/watch?v=6mRHToDSfSU
14:46 pecisk: krofna_, needs a bit background about how hardware is built in general, but it's certainly approachable, especially if you have some good experience in reverse engineering
14:46 karolherbst: pecisk: ohhh no, I have no clue how gpus are built :D
14:46 pecisk: have some measly introduction about hardware design but at least understand GPIO etc. terms
14:47 krofna_: lol
14:47 karolherbst: but somehow I still manage to do something useful
14:47 pecisk: karolherbst, not that deep of course :)
14:47 pecisk: yeah
14:47 karolherbst: no, really
14:47 karolherbst: I have no clue
14:47 pecisk: you have some clue about how that card works
14:47 pecisk: :)
14:47 karolherbst: not to speak about OpenGL or the dri stack, where I have less than no clue
14:49 pecisk: karolherbst, anyway, I can't get mmiotrace when it spews those sched_err, because it never does it when mmio trace is turned on
14:49 krofna_: also this laptop seems pretty hot. sensors reports 60 degrees and kb is warm on touch. is that normal?
14:49 pecisk: karolherbst, also it never crashes when tracking is turned on, when simply modprobing and glxinfo crash it now and then
14:50 karolherbst: krofna_: when the gpu is on, yes
14:50 pecisk: krofna_, only intel loaded?
14:50 krofna_: pecisk: yep. nvidia doesn't load anyway
14:52 karolherbst: only intel loaded means: nvidia gpu is on
14:52 karolherbst: you need to have either nouveau loaded, so that vgaswitcheroo can turn the card off
14:52 karolherbst: or install bbswitch (from bumblebee) and set parameters so, that the card is turned off on bbswitch load
14:53 karolherbst: the latter you would use when you use the blob driver with your card
14:53 pecisk: karolherbst, also I don't get why it is not consistent...after longer time, I can load nouveau module without errors and show glxinfo, and it freezes a bit after that...and after that cold reboot but then I continue to get errors and freezes
14:54 karolherbst: don't know
14:54 karolherbst: as I said: nobody said that the state of the card is the same after powering on, or should it be the same always imirkin_?
15:06 krofna_: pecisk: linux 4.1.7 works without nomodeset and doesn't overheat
15:07 pecisk: krofna_, I believe you
15:07 krofna_: yay for alpha images
15:07 pecisk: krofna_, at some point they broke it
15:07 pecisk: krofna_, well, f23 is 4.2 :)
15:07 krofna_: ah great
15:07 pecisk: and this is beta stage
15:07 krofna_: so I must not upgrade either
15:08 pecisk: so it is a bug
15:08 krofna_: I thought it was an issue only with older kernels because I tried to run fedora 22 and manjaro stable unsuccessfully
15:09 pecisk: krofna_, well, it was somehow similar to me
15:09 pecisk: F21 worked
15:09 pecisk: F22 did not
15:09 pecisk: now I am running F23 beta
15:09 pecisk: in fact it is already RC
15:10 pecisk: when I installed it was put on nomodeset... I will guess because card was blacklisted in some F23 installation process
15:11 karolherbst: pecisk: was nomodeset set generally?
15:11 karolherbst: like it was added directly to the kernel cmd line?
15:14 karolherbst: ohhhh
15:14 karolherbst: "nomodeset is added to the boot parameters if you boot the installation media with the "basic graphics mode" boot entry; consequently nomodeset is added to the kernel command line in the installed system."
15:14 karolherbst: that makes sense
15:20 pecisk: karolherbst, going to sleep, but before I go, wild guess - I just checked vbios, it seems sched_err references some unknown power table. My guess...Nvidia card doesn't get properly turned on as it is Optimus
15:21 karolherbst: I doubt that optimus is the cause and sched_err isn't the issue either
15:21 karolherbst: sched_err just says, that something is wrong
15:22 pecisk: karolherbst, yeah, but it might be consistent with what is happening...as card being turned on/off dynamically
15:22 pecisk: anyway
15:22 pecisk: time for a pause
15:22 pecisk: good luck and see ya
15:25 karolherbst: imirkin_: for your kepler card 0x8c1c0 is 30036, right?
15:29 imirkin_: 0008c1c0: 00030036
15:34 karolherbst: mhhh
15:36 karolherbst: uhhh
15:36 karolherbst: 0x8c080
15:37 karolherbst: nvscan result: 08c080: 00001010 00001010 00001010
15:37 karolherbst: sadly not used by the blob, but somehow I think this reg tells us something about the card
15:38 karolherbst: imirkin_: you have a x8 card right? could you check this reg too?
15:39 karolherbst: maybe just give me nvapeek 0x8c000 0x1000 :D
15:40 imirkin_: karolherbst: iirc i did
15:41 karolherbst: mhh, then I need to find the link
15:41 karolherbst: will try
15:41 imirkin_: karolherbst: http://hastebin.com/zidikanile.sm
15:41 karolherbst: thanks
15:41 imirkin_: i *believe* it's x8 in an x16 slo
15:41 imirkin_: slot*
15:42 karolherbst: mhhh
15:42 karolherbst: then 0008c080 doesn't make that much sense
15:42 karolherbst: I thought it would tell us card max width and slot max width
15:42 karolherbst: I am 100% sure that the width isn't in 0008c040
15:44 imirkin_: well, dmidecode says it's x16
15:44 imirkin_: but perhaps the card doesn't know any better
15:44 karolherbst: mhhh
15:45 karolherbst: really sad, that this reg doesn't show up in the nvidia trace
15:46 karolherbst: but nvascan should print the same just with your value
15:47 karolherbst: ohhh it is 404 on one of the cards in reator
15:47 karolherbst: now it is getting interesting
15:48 karolherbst: ha
15:48 karolherbst: one card 1010, one 404
15:48 karolherbst: one card x16 width, the other x4
15:48 karolherbst: coincidence?
15:49 imirkin_: must be!
15:49 karolherbst: so the one kepler card is in a x4 slot
15:49 karolherbst: or
15:49 karolherbst: there aren't enough lanes
15:49 imirkin_: well, i don't think there are any x4 cards :) [not gpu's at least]
15:49 karolherbst: nope
15:49 karolherbst: there aren't
15:49 karolherbst: so this reg doesn't tell us about what the card can do
15:49 karolherbst: but more likely what the slot can do
15:50 karolherbst: it could even be something like slot max, slot current
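
A minimal sketch of the guess above, assuming (unconfirmed by anything in this log) that the low 16 bits of register 0x8c080 hold two one-byte lane counts. The helper name `split_width_fields` is hypothetical, and which byte would be "slot max" versus "slot current" is not known; the code only shows that 1010 and 404 split into 16/16 and 4/4.

```python
def split_width_fields(value):
    """Split the low 16 bits of a register value into (high byte, low byte)."""
    high = (value >> 8) & 0xFF
    low = value & 0xFF
    return high, low


# The two values reported in the log for register 0x8c080.
for raw in (0x1010, 0x0404):
    high, low = split_width_fields(raw)
    print(f"{raw:#06x}: high byte = {high}, low byte = {low}")
```

Both bytes are equal in the two observed values, so these samples alone cannot tell a "max" field from a "current" field.
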
15:51 karolherbst: ohh wow, the maxwell card is really pissed by this nvapeek call :D
15:51 karolherbst: instead of 0 it returns badf5040
15:53 karolherbst: ohh no, now that I think of this
15:53 karolherbst: maybe pecisk has an issue like that?
15:53 karolherbst: nouveau expects a 0x0 read for something unsupported, but that stupid maxwell card returns bad...
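
The suspicion above can be sketched as a check that treats a 0xbad... read as "nothing there" instead of only comparing against zero. This is a heuristic built from the two values quoted in this log (0xbad00100 and badf5040), not something taken from nouveau's source; both helper names are hypothetical.

```python
def looks_like_bad_read(value):
    """Heuristic: True if the top 12 bits of a 32-bit read are 0xbad."""
    return ((value & 0xFFFFFFFF) >> 20) == 0xBAD


def register_is_unsupported(value):
    """Treat both a plain zero and a 0xbad... pattern as 'no valid data'."""
    return value == 0 or looks_like_bad_read(value)


# Values seen in the log: an error pattern, another error pattern, a normal read.
for raw in (0xBADF5040, 0xBAD00100, 0x00001010):
    print(f"{raw:#010x}: unsupported = {register_is_unsupported(raw)}")
```

A driver path that only tests `value == 0` would treat the first two values as valid data, which is the kind of mismatch being guessed at here.
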
15:54 karolherbst: imirkin_: could you give me the lspci -vv output of your card?
15:54 karolherbst: I plan to RE this entire PPCI_2 range :D
15:56 imirkin_: karolherbst: http://hastebin.com/imuqijivad.sm
15:56 karolherbst: thanks
15:56 karolherbst: luckily aspm is enabled here :)
15:57 karolherbst: ohh for you too
15:57 karolherbst: on reator it is disabled
16:05 karolherbst: funny that my card is closer to reator than yours
16:05 karolherbst: maybe because reator is also GK106
16:06 imirkin_: mine is a GK208
16:06 karolherbst: ohhh
16:06 karolherbst: that explains it
16:06 imirkin_: mupuf also has one
16:06 imirkin_: mupuf: btw, GK110/GK208 was supported in mesa 10.2 while GM107 was supported in 10.3
16:06 imirkin_: your presentation was a tad off regarding dates of availability
16:08 karolherbst: damn, another read only reg
16:08 karolherbst: those are annoying to RE
16:09 karolherbst: and another two
16:11 karolherbst: the hell :/ don't tell me that every reg with a different value except two is read only?
16:27 karolherbst: imirkin_: by the way tizbac has a funny problem: with nouveau, half perf (at 0f), but louder fans and higher temps, though the last isn't confirmed yet
16:27 karolherbst: or is it normal on desktop gpus?
16:30 karolherbst: although 85°C is a bit too much
16:34 Karlton: but it's enough to cook an egg :D
16:34 karolherbst: ohh, for that I use my CPU
16:47 karolherbst: what could be a reason, that a gpu has a higher temp although it consumes same power?
16:47 karolherbst: *same amount
16:48 karolherbst: wait, 135°C is emergency for my gpu? :D
16:48 karolherbst: emergency means power off?
16:49 karolherbst: at least for critical nvidia clocks down to lowest possible clocks
16:50 karolherbst: which is at 105
16:50 imirkin: seems like the cooling situation would be relevant...
16:50 imirkin: perhaps one of the GPUs is inside a freezer?
16:50 karolherbst: :D
16:51 karolherbst: tizbac has the problem that with blob, gpu at 81°C and fan is "normal", but with nouveau he gets 85°C, fan is louder, but power consumption is the same
16:51 karolherbst: though I question how the power consumption was measured
16:51 imirkin: i'm gonna go with "unscientifically"
16:52 karolherbst: he says through his ups
16:52 karolherbst: mhhh
16:56 karolherbst: so I would say, it is fine, although his gpu is running really hot
16:56 karolherbst: 65°C idle at 0f
16:57 karolherbst: this is too hot I would say
17:00 karolherbst: imirkin: it would be really nice to see power consumption in sensors
17:01 imirkin: most GPUs don't come with a power sensor
17:01 imirkin: we expose it for the ones that do
17:01 karolherbst: with nvidia I can read my power consumption out with mupuf pwr_read tool
17:02 karolherbst: ohh the value is garbage mostly :D
17:03 karolherbst: ohh sensors is telling me this: "Adapter: PCI adapter"
17:04 karolherbst: nouveau-pci-0100
17:04 karolherbst: so the gpu temperature is exposed through the pci bus directly?
17:04 imirkin: through nouveau
17:04 imirkin: nouveau creates the devices
17:04 imirkin: and calls them nouveau-pci-12345
17:04 karolherbst: ohhh :/
17:04 imirkin: (or, presumably, nouveau-agp-foo)
17:04 imirkin: and registers them with hwmon
17:05 karolherbst: I still want to figure out, why my EC knows the gpu temp
17:05 karolherbst: or anything related to it
17:08 imirkin: well, in a laptop these things tend to be all hooked up to the EC
17:08 imirkin: since the fans are all shared
17:08 karolherbst: not with this one
17:08 karolherbst: one cpu fan, one gpu fan
17:09 karolherbst: and dedicated heat pipes for both
17:09 karolherbst: but yes, the EC manages the fans
17:09 karolherbst: but it reacts to CPU and GPU temp
17:09 karolherbst: high CPU temp => CPU fan spins up
17:09 karolherbst: when GPU off => gpu fan ogg
17:09 karolherbst: *off
17:10 imirkin: so clearly it must have sensors in there somewhere
17:10 imirkin: and/or it can read it off the GPU
17:10 karolherbst: yeah, the EC either gets the power usage or gpu temp
17:10 karolherbst: wait, I can do a test
17:11 karolherbst: I could pump my gpu up to 80°C and then suddenly turn it off and see how fast the gpu fan is going off
17:12 karolherbst: imirkin: funny though: the fan reacts directly to nvaforcetemp
17:12 imirkin: so it probably just reads the temp off the gpu
17:12 imirkin: (it = EC)
17:13 karolherbst: nvaforcetemp 85 => 1 seconds later fan full speed
17:13 karolherbst: yeah most likely
17:14 karolherbst: it also immediately goes off, when the gpu is turned off
17:14 karolherbst: even when the temp was like 85
17:15 karolherbst: I also have some power budget entries on my vbios
17:15 karolherbst: and three entries in the SENSE table
17:16 karolherbst: https://gist.github.com/karolherbst/a9cc5930205a8993ce34
17:16 karolherbst: mupuf already complained about those "high" Watts, but they are all right
17:22 karolherbst: yep
17:22 karolherbst: EXTDEV 1: type 0x4e [INA3221] at 0x80 defbus 0 unk02_5 2 unk03 0x02
17:22 karolherbst: same as on mupufs card :D
17:26 karolherbst: hah, now I have the power usage
17:30 karolherbst: ohh no, my gpu crashed because of overclocking with the blob
17:30 karolherbst: so I don't think the blob sets the right voltage after all
17:34 karolherbst: ha :D
17:34 karolherbst: we found the difference
17:34 karolherbst: the blob noticed, uhh it is getting hot in here, better lower voltage and clock
17:34 karolherbst: :D
17:51 karolherbst: okay, the blob exports an i2c device for the power consumption sensor on my gpu
17:52 karolherbst: nouveau should do the same :D
17:55 karolherbst: mupuf: https://github.com/sgp-blackphone/Blackphone-BP1-Kernel/blob/master/drivers/hwmon/ina3221.c :)
17:58 karolherbst: mupuf: just that you know: I also have a ina3221 on my gpu
18:37 karolherbst: okay, how do I get my power sensor displayed in "sensors" now? :D
23:50 imirkin: mupuf: your video cuts over to keith's at 26:53
23:50 imirkin: mupuf: https://www.youtube.com/watch?v=dEU9LJ1Li1g