01:54yoshimo: piglit results from the gm204 run that finally finished: http://filebin.ca/2cLxjnx6LBzA/results.tar.gz
01:59hakzsam_: it's not that bad
01:59hakzsam_: pretty nice actually :)
02:03yoshimo: how bad is "not that bad"?
03:50gnurou: imirkin: yeah sorry, I need to do some catchup on these github questions
03:51gnurou: imirkin: main reason for slowness is that this takes time for me to investigate/find the right person/get ack to release information, and it doesn't count as my job sadly
03:52gnurou: imirkin: but thanks for reminding me - I will try to move them forward on Monday
07:11Weaselweb: is there anything I can do here? https://nopaste.me/view/3bdda6ec this showed up when running piglit
10:06ouned__: thanks so much guys. Tested my GTX 870m with 4.6rc1 and everything just works. optimus, reclocking, speed is awesome, turning on/off etc.
10:09karolherbst: ouned__: speed is awesome? but there is no reclocking :O
10:09karolherbst: ohh wait
10:09karolherbst: 870m is kepler
10:09ouned__: it's a kepler card
10:09karolherbst: 860m and below is maxwell though
10:09karolherbst: ahh yeah, then it is fine :)
10:09karolherbst: if you encounter any issues, please report it to me :D
10:10karolherbst: reclocking related
10:10karolherbst: ouned__: does sensors also show your gpus power consumption?
10:11ouned__: it should automatically reclock sometime in the future :P
10:11imirkin: gnurou: well, the main thing is transparency... perhaps add labels like "investigating" or "waiting for approval" or whatever is appropriate
10:12ouned__: karolherbst: how?
10:12imirkin: ouned__: run "sensors"
10:14ouned__: it shows temperatures only
10:14karolherbst: I would expect a thermal sensor on this chip
10:15karolherbst: ouned__: dmesg | grep nouveau
10:17karolherbst: I would expect that something unexpected happened
10:23ouned__: ohh.. just noticed that im actually running 4.1 right now. need to reboot
10:42karolherbst: ouned__: no problem. The power consumption reading was added in 4.6 and I would like to know how well it works in the wild
11:50ouned__: ok ok.. now that i was so excited it doesnt work at all anymore :D
11:50ouned__: what i changed is switching to uefi mode
11:51karolherbst: https://lkml.org/lkml/2016/3/31/1108 :D
11:51karolherbst: why didn't I see this yesterday
11:51karolherbst: ouned__: uhh
11:51karolherbst: ouned__: like it doesn't boot anymore?
11:51imirkin: karolherbst: there were a few of them.
11:52ouned__: it should work though. it correctly turns the card on/off when running glxinfo but it just doesnt use it
11:52karolherbst: I missed them all...
11:52karolherbst: ouned__: uhh
11:52ouned__: it does boot but DRI_PRIME doesnt do anything
11:52karolherbst: ouned__: well turning off/on is done through ACPI
11:53karolherbst: and switching to uefi mode may have a big impact on this
11:53karolherbst: ouned__: dmesg please
11:53karolherbst: seems to work though
11:54karolherbst: except that your DSM is messed up
11:54karolherbst: could be this issue: https://bugs.freedesktop.org/show_bug.cgi?id=89558
11:55karolherbst: ohh wrong
11:56karolherbst: ouned__: boot with nouveau.runpm=0 and see if that helps
11:57karolherbst: or maybe just run DRI_PRIME=1 glxinfo
11:58karolherbst: it should show nouveau because I don't see anything wrong going on on your end
11:59ouned__: DRI_PRIME=1 glxinfo turns the card on but still uses the intel card
11:59karolherbst: mupuf: will there be a situation where we want a different pstate than requested?
11:59karolherbst: mupuf: except high temperature?
11:59mupuf: karolherbst: temperature is one good reason
11:59ouned__: i know it was working in bios mode for sure
11:59mupuf: and over current?
11:59karolherbst: mupuf: yeah, but I have to modify the nvkm_clk struct a bit and I have to think this through
12:00karolherbst: mupuf: if temperature is the only change, I don't need expected_pstate, set_pstate fields
12:01mupuf: thinking shit through is indeed important :p
12:01ouned__: looks like im going to boot in bios again :D
12:01karolherbst: mupuf: I think in the end we will handle over current/temperature the same way at the same time. just remember a scaled value for how far we are into the "safe critical" range and scale the clocks according to this (if nvidia does the same)
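A minimal C sketch of the scaling idea in the message above, assuming a linear mapping between a "safe" and a "critical" threshold. The function names, the per-mille unit, and the linear curve are assumptions for illustration; none of them come from nouveau, and the log does not say how nvidia handles this.

```c
#include <assert.h>

/* Hypothetical helper: how far a reading (temperature or current) sits
 * inside the [safe, critical] range, in per mille (0..1000). */
static unsigned over_permille(int reading, int safe, int critical)
{
    assert(critical > safe);
    if (reading <= safe)
        return 0;
    if (reading >= critical)
        return 1000;
    return (unsigned)((reading - safe) * 1000 / (critical - safe));
}

/* Hypothetical helper: scale a clock linearly from max_khz (over == 0)
 * down to min_khz (over == 1000). */
static unsigned scaled_clock_khz(unsigned max_khz, unsigned min_khz, unsigned over)
{
    assert(max_khz >= min_khz && over <= 1000);
    return max_khz - (unsigned)((unsigned long long)(max_khz - min_khz) * over / 1000);
}
```

To treat over current and temperature "the same way at the same time", one option would be to compute over_permille for each reading and feed the larger of the two into scaled_clock_khz.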
12:01karolherbst: ouned__: yeah, but there might be issues
12:01karolherbst: ouned__: like gpu doesn't power off or something
12:02mupuf: karolherbst: you know what, in manual mode, just change the pstate if we cannot go lower on the cstate
12:02karolherbst: mupuf: but currently we will have a simple flag in nvkm_clk: force_downclock
12:02karolherbst: which will be set by the therm daemon
12:03karolherbst: mupuf: thing is, there is no strong relation between cstates and pstates
12:03karolherbst: we can have the lowest cstate and the highest pstate
12:03karolherbst: even nvidia does it
12:03karolherbst: for me changing the pstates increases power consumption by around 7W
12:03karolherbst: which is like nothing
12:06mupuf: only 7W?
12:06karolherbst: with the engines clocked
12:07karolherbst: 10W idle at 07, 17W idle at 0f
12:07karolherbst: what is the point of nvkm_therm_cstate by the way?
12:08karolherbst: I thought nouveau updates the fans every second anyway
12:08karolherbst: ohh wait
12:08karolherbst: okay, I know I think
12:08ouned__: karolherbst: it worked for sure in bios mode. there is a LED which shows whether the gpu is on or off and the fan also instantly stops on powering it off
12:09karolherbst: ouned__: ahh okay
12:10ouned__: by the way the power monitor works ^^
12:10ouned__: even now in uefi mode
12:10ouned__: power1: 13.84 W
12:11ouned__: it must be a really minor thing
12:11ouned__: as it turns on/off correctly but just isnt used
12:11karolherbst: maybe something is odd in your X server?
12:11karolherbst: ouned__: are you using dri3?
12:16ouned__: no idea
12:16karolherbst: I don't get why it should work in bios mode, but not in uefi mode
12:16karolherbst: maybe the index of the gpu changed and nouveau is DRI_PRIME=0 or something stupid
12:17karolherbst: mupuf: https://github.com/karolherbst/nouveau/commit/00b3e7497d4ac66ca2a1c037538ffe22468a5cfe
12:18ouned__: i can test wayland
12:18karolherbst: mupuf: and astate will be what the code/user wants
12:18ouned__: in theory
12:18karolherbst: astate is currently used by gk20a to set the cstates
12:18ouned__: DRI_PRIME=0 does not do it
12:18karolherbst: ouned__: X log please then
12:19mupuf: karolherbst: that explains why I have never seen anything related to astate
12:19karolherbst: dstate and pstate were unused too
12:19karolherbst: but I threw them out
12:19mupuf: I am trying to upgrade my jetson's kernel and github is being SUPER slow
12:19karolherbst: not pstate
12:19karolherbst: then we have a simple thing in nvkm_clk: expected and set pstate/cstate
12:20karolherbst: + some flags to manipulate this
12:20mupuf:is trying to get the jetson tk1 as a way to run piglit tests
12:20karolherbst: like force_downclock if we hit thresholds
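A minimal C sketch of the bookkeeping described above: a requested state, the state actually set, and a force_downclock flag that overrides the request. The struct clk_sketch, its field names, and update_pstate are hypothetical and only illustrate the idea; they are not the real nvkm_clk layout or the code in the linked commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields discussed above, not nvkm_clk. */
struct clk_sketch {
    int requested_pstate;  /* what the user or the code asked for */
    int applied_pstate;    /* what is actually programmed right now */
    bool force_downclock;  /* set when a threshold is hit */
};

/* Pick the pstate to program: the lowest one while force_downclock is set,
 * otherwise the requested one. The request itself is kept, so it takes
 * effect again once the flag is cleared. */
static int update_pstate(struct clk_sketch *clk, int lowest_pstate)
{
    assert(clk != NULL);
    clk->applied_pstate = clk->force_downclock ? lowest_pstate
                                               : clk->requested_pstate;
    return clk->applied_pstate;
}
```

With the pstates mentioned earlier in the log, a request for 0f would be applied as 07 while the flag is set and as 0f again after it is cleared.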
12:20karolherbst: ouned__: yeah no dri3
12:20karolherbst: that means
12:20karolherbst: you have to set your xrandr stuff
12:20karolherbst: DRI2 section
12:21karolherbst: is dri3 really that bad that nobody wants to enable it?
12:22sarnex: i think packagers think it is still buggy
12:22karolherbst: it has never been buggy for me
12:22ouned__: not like i ever heard of it
12:22karolherbst: ouned__: DRI3 is the new shit, basically it just works :D
12:23sarnex: it optimizes out a framebuffer copy too i think
12:23sarnex: so its a bit faster
12:23karolherbst: and vsync
12:23karolherbst: video acceleration doesn't work over dri3 yet in nouveau
12:23sarnex: isn't that because of gallium though
12:23sarnex: it has no dri3 backend?
12:23karolherbst: but to be honest, who cares?
12:24karolherbst: maybe it would speed things up for desktop users
12:24sarnex: It broke VDPAU<->GL interop with DRI3 enabled, because the Gallium VDPAU code doesn't support DRI3 yet. We can consider re-enabling this once there is a Mesa release where the Gallium VDPAU code supports DRI3.
12:24sarnex: is that something else?
12:25karolherbst: no clue
12:25sarnex: ok :PO
12:25MickFromAC: Good evening. I have a problem with the Linux kernel 4.4.6 (also tried 4.4.3 before) and the compiled in nouveau driver. Whenever i reboot my pc (warm boot) the kernel crashes at or right after mode setting. I can see a lot of debug messages in the console, some of them showing "nv50_....". I dont reach a login. I have to reboot the pc by pressing the reset button. The error does not occur...
12:25MickFromAC: ...when I do a cold start or after i press the reset button. Only warm boot hangs the kernel. Any ideas?
12:25karolherbst: I just know you need to set up dri2 offloading if you want to do video acceleration on nouveau
12:25karolherbst: even if you are using dri3
12:26karolherbst: MickFromAC: so only on a warm reboot?
12:26MickFromAC: karolherbst: yes!
12:26karolherbst: MickFromAC: try to warm-reboot with nouveau.config=NvForcePost=1
12:26karolherbst: it is a wild guess
12:27karolherbst: I have no clue, but this would be something I would try out
12:27karolherbst: we would need the error messages to debug this
12:27karolherbst: but if that works, we have a solid clue
12:28MickFromAC: karolherbst: okay, i will give it a try. i will report later. because i am at the machine in question. ;)
12:29karolherbst: MickFromAC: what gpu is it by the way?
12:33MickFromAC: karolherbst: its a PNY Quadro FX 580. NV50 family (Tesla), NV96 (G96)
12:36MickFromAC: okay ... reboot.
13:23MickFromAC: back. no changes with "nouveau.NvForcePost=1". the kernel hangs after warm start but comes up after pressing the reset button.
13:24imirkin: MickFromAC: did this work well on older kernels, or are you trying it out for the first time?
13:28MickFromAC: i am running kernel 3.7.1 now with nouveau enabled. works without any problems. recently i set up new system (LFS 7.9) and with the new kernel i always have the problem that the kernel hangs after warm start. the kernel comes up after reset or cold start. only warm start makes the kernel hang, lots of trace messages in the framebuffer, some of them point to nouveau (nv50_...). A kernel...
13:28MickFromAC: ...without nouveau driver always comes up, no matter if warm start or cold start.
13:29imirkin: ouch.... that's quite a range of kernels
13:29imirkin: nouveau has been rewritten about 4 different times since then
13:29imirkin: btw, that should have been nouveau.config=NvForcePost=1
13:30imirkin: not nouveau.NvForcePost=1
13:30MickFromAC: yes. but something in the current code seems to cause this (minor?) problem.
13:31MickFromAC: imirkin: yes. was a typo. is correct in the grub.cfg
13:31imirkin: perhaps you can try to narrow down the kernel range?
13:32imirkin: good kernels to try would be 3.12, 3.16, and 3.19
13:34MickFromAC: imirkin: hmmm. already thought about that. but what if i continue with blfs? maybe i will get in trouble because some of the packages need a recent kernel.
13:35imirkin: anyways, at the very least, we'll need a log from when it runs into issues
13:35MickFromAC: Beyond Linux From Scratch
13:35imirkin: perhaps you can ssh in
13:35imirkin: or get the messages over netconsole
13:37MickFromAC: imirkin: i dont see a chance to get a log bcs the kernel does not come up completely. system does not reach a point where a logon is possible.
13:37imirkin: you've verified this, yes?
13:38imirkin: what could be happening is that everything is totally fine with the minor exception of the screen not updating, for example.
13:39MickFromAC: imirkin: i can send you some pictures made with camera. you can see the trace messages there. or i can write them into a file and send them somewhere.
13:40MickFromAC: imirkin: the kernel is not fully dead, after some minutes more trace messages are sent to the screen. its repeating in intervals of about 60 to 120 seconds.
13:41imirkin: take a photo and upload it on imgur
13:41imirkin: [or other site of choice]
13:57MickFromAC: imirkin: okay. pictures are there ---> http://imgur.com/a/9Y6lI
13:58imirkin: that should be an easy fix
13:58imirkin: give me a minute
14:01imirkin: MickFromAC: can you find your nouveau.ko module, and do gdb /path/to/nouveau.ko
14:01imirkin: and then "disassemble nv50_disp_intr_supervisor"
14:01imirkin: which should yield a few screens of code... pastebin that
14:02imirkin: MickFromAC: how is your screen connected btw?
14:02MickFromAC: both screens are connected on display port
14:02imirkin: ah that explains it
14:03imirkin: there are some divisions in there
14:04imirkin: that disassembly should help narrow down what's 0
14:04MickFromAC: hmpf. nouveau is compiled into the kernel.
14:05imirkin: MickFromAC: that's fine... gdb vmlinux then
14:06imirkin: skeggsb: btw --^ looks like DP is (sometimes?) busted on G96
14:06imirkin: probably an external encoder?
14:07imirkin: skeggsb: something doesn't get properly reinitialized on a warm boot
14:14MickFromAC: hmpf. gdb says "...(no debugging symbols found) ...done."
14:17imirkin: can you run disassemble anyways?
14:17imirkin: i think debugging symbols is something else
14:19MickFromAC: disassemble nv50_disp_intr_supervisor --> no symbol table is loaded. Use the "file" command.
14:20imirkin: well, clearly the symbols are there or else the kernel wouldn't be able to print a stacktrace :)
14:20imirkin: you *are* loading vmlinux right, not vmlinuz?
14:21MickFromAC: oops ...
14:24MickFromAC: damn ... okay. vmlinuX has symbols ;)
14:25MickFromAC: you need the full dump?
14:34MickFromAC: you can find it there --> http://pastebin.com/myEE5QzG
15:30imirkin: skeggsb: any advice on what all this means? http://hastebin.com/numukokume.css
15:31imirkin: skeggsb: specifically what the "magic sets" are? it happened when i ran glretrace on a random trace and it went apeshit. re-running it again, everything was fine. this has happened to me a few times.
15:31imirkin: MickFromAC: thanks
15:34imirkin: MickFromAC: ok, looks like it's dividing 12 by something on the stack.
15:34imirkin: the fixed 12 should make this easier to find
15:36imirkin: value = value - (3 * !!(dpctrl & 0x00004000)) - (12 / link_nr);
15:36imirkin: i guess link_nr == 0
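A minimal C sketch of why link_nr == 0 is fatal in the expression quoted above (an integer division by zero) and how a guard could avoid it. The helper name adjust_value, its return convention, and the message text are assumptions for illustration; the suggestion imirkin actually posted in the bug is not reproduced in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not nouveau code: applies the adjustment quoted
 * above, but refuses to divide when link_nr is 0.
 * Returns 0 on success, -1 if link_nr is 0 (*out is then left untouched). */
static int adjust_value(uint32_t value, uint32_t dpctrl, uint32_t link_nr,
                        uint32_t *out)
{
    assert(out != NULL);
    if (link_nr == 0) {
        /* Log the register value so the bad state can be reported. */
        fprintf(stderr, "link_nr == 0 (dpctrl 0x%08x), skipping\n",
                (unsigned)dpctrl);
        return -1;
    }
    *out = value - (3 * !!(dpctrl & 0x00004000)) - (12 / link_nr);
    return 0;
}
```

Returning an error instead of substituting a default keeps the sketch neutral on what the right fallback is, which the log leaves open.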
15:37imirkin: MickFromAC: mind filing a bug, including that photo as well as the disassembly as attachments (not links) at bugs.freedesktop.org?
15:38MickFromAC: imirkin: yes, i can do that.
15:41imirkin: xorg -> Driver/nouveau
15:41imirkin: [even though it has nothing to do with X]
15:42imirkin: MickFromAC: once you do, i'll fill it in with a few additional details... but this will have to wait for skeggsb to take a proper look at
15:44MickFromAC: need to create account on freedesktop bugzilla first.
16:11MickFromAC: imirkin: can i add two attachments or do i have to zip the pic and the dump into one archive?
16:13imirkin: MickFromAC: add 2 attachments. you can only add one at a time... i'd file the initial bug without either attachment
16:13imirkin: and then add them once the bug is created
16:17MickFromAC: okay ... is created so far.
16:17MickFromAC: bug 94803
16:19MickFromAC: i forgot to set correct type for the picture .... is marked as "text/plain" ... shit
16:31imirkin: MickFromAC: that's why i said not to attach anything on the original comment :)
16:31imirkin: bugzilla is buggy and always sets the type of that attachment to text/plain
16:31imirkin: i'll fix it
16:32MickFromAC: bugzilla is buggy? ouch
16:45imirkin: MickFromAC: let me know if you need an actual patch... hopefully you can just take my suggestion directly
16:47MickFromAC: a patch would be fine, but i can also use newer kernel 4.4.7 if the corrections will be applied there.
16:48MickFromAC: but i will try the solution you posted first. ;-)
16:49imirkin: MickFromAC: i just meant like... do i need to write a full patch or can you just take my comment and figure out how to apply it :)
16:49MickFromAC: yep ... i can paste it ... :-)
17:04MickFromAC: okay then ... i will try the corrections and do a reboot now. i will report in few minutes.
17:16MickFromAC: imirkin: WORKS!!!! i am impressed! Thanks for your help!
17:17imirkin: MickFromAC: can you report in the bug what error it prints? like the dpctrl value?
17:20MickFromAC: okay, need to do another reboot then ... should have saved the dmesg .... brb
18:13MickFromAC: back ... needed some reboots to get an error logged.
18:13imirkin: MickFromAC: thanks
18:13MickFromAC: have posted the comment to bugzilla.
18:13imirkin: when one of the monitors doesn't come up
18:14imirkin: is it "permanently" dead, or are you able to get it to come up once X starts?
18:15MickFromAC: i dont have X in that other system. still in early stages.
18:15imirkin: oh ok
18:16MickFromAC: and it ALWAYS failed when i rebooted. its strange that it happens now only on some occasions. most of the time both monitors come up now.
18:18MickFromAC: is that a timing problem?
18:19imirkin: well, it's reading a register that's supposed to tell it some info about the DP links/etc
18:19imirkin: i dunno if waiting longer would cause it to contain data
18:19imirkin: as soon as we get a hpd (hot-plug-detect) interrupt, we try to configure it
18:29MickFromAC: kernel 3.7.1 and its version of nouveau does not suffer from that problem. a reliable solution of that problem should be possible in kernel 4.4.6 too.
18:29imirkin: kernel 3.7.1 suffers from 1000 other problems though
18:30imirkin: among other things, no dpms for DP
18:30imirkin: also the DP link won't come up on half the monitors out there
18:30imirkin: anyways... i'm not personally too familiar with all the display stuff. skeggsb is... hopefully he'll be able to take a look next week
18:33MickFromAC: sounds like i have been lucky with the combination of monitors/gpu. until now. :-)
18:39imirkin: anyways, i bet we could assume that link_nr == 1 or something and continue on
18:39imirkin: but skeggsb will know better what to do
18:40MickFromAC: imirkin: thanks for your help again. i will check next week if there are some additional changes to the nouveau driver. if corrections are in 4.4.7 only i will use the newer kernel then. have a good night!