00:31karolherbst: mupuf: yesterday there was somebody with an odd pmu problem: every time he tried to reclock, nouveau hung inside nvkm_pstate_calc/nvkm_pmu_send
00:31karolherbst: but PUT/GET had the same value
00:48pmoreau: gnurou: \o/
00:48pmoreau: gnurou: That's nice to hear!
00:49gnurou: pmoreau: yeah well, I would not rejoice until it becomes nice to *see*
00:50pmoreau: gnurou: Every step in the right direction is good. :-) Even if you never reach the destination… :-/
00:51gnurou: oh we will reach it, it's just that we should be there already...
00:53pmoreau: Yeah, but let's see the positive side: it's still moving, maybe extremely slowly, but it's not dead yet
01:06karolherbst: gnurou: I don't know if you are able to answer that question, but how important is it to handle situations where interrupts to/from a falcon get lost? (especially the pmu)
01:09gnurou: karolherbst: that would depend on the falcon and the firmware that runs on it I'd say, but why would falcon interrupts get lost in the first place?
01:09karolherbst: gnurou: no idea, I only know that this patch helps me: https://github.com/karolherbst/nouveau/commit/f159e2910b44c52b89b6512d1213be5d02bdafd9
01:10karolherbst: basically in this patch I wait with a timeout for a reply to be handled, and if there is no handling at all, I just check if a message is queued
01:10karolherbst: could be of course that something with the pmu code isn't right
01:12gnurou: mmm, and the message you get from the PMU is what you expect in that case?
01:14gnurou: is the PMU firmware running at that time Nouveau's or NVIDIA's?
01:15karolherbst: but I have no idea how that can happen, maybe the assembler messes up, but why does it work most of the time then :/
01:15gnurou: could be an issue with it, but that seems strange still
01:15karolherbst: gnurou: this is the fuc code: https://github.com/karolherbst/nouveau/blob/master/drm/nouveau/nvkm/subdev/pmu/fuc/host.fuc#L99-L131
01:16karolherbst: so it has to be doing something strange in the last instructions
01:16karolherbst: the two above the ret especially
01:16karolherbst: I have no clue though what happens after that
01:18gnurou: mmm I'm no fuc expert sadly, but I imagine that other falcons also signal interrupts without any issue... is this PMU specific?
01:18karolherbst: no idea
01:19karolherbst: I just encountered that while stress testing reclocking
01:19gnurou: is there a way for you to substitute Nouveau's PMU fuc with NVIDIA's and see if you can repro? that would help narrow the problem to the falcon or the host
01:20karolherbst: I don't know
01:20gnurou: not sure whether your host code would still work in that case though :/
01:20karolherbst: I think it is not possible, but what do I know :D
01:21gnurou: mupuf might know
01:29RSpliet: imirkin: hahaha, well, glad I could give you loads of fun with it
01:31RSpliet: gnurou: well, the nouveau memory scripting language is highly simplified from what NVIDIA implements
01:32RSpliet: which means mem reclocking would no longer work after swapping out the firmwares
01:33karolherbst: RSpliet: weren't you a fuc expert :p
01:33RSpliet: (most importantly, ours doesn't do branching or conditionals, we rather implemented the GT21x DDR3 memory training routine as one opcode)
01:34RSpliet: karolherbst: I wouldn't say "expert", I am however the man with an infinite amount of lack of time
01:34RSpliet: so I mostly half-read IRC, and try to add valuable information (or stupid jokes) where I can
01:34karolherbst: yeah well, yesterday I had someone where between memory reclocking there was a message lost from the host to the pmu or something like that
01:35karolherbst: I mean, I know something is odd there, but I don't know what exactly
03:07CapsAdmin: i was googling some error I was getting installing the proprietary nvidia drivers on fedora rawhide and someone here had mentioned it
03:07CapsAdmin: 13:59 < xexaxo> that one was in piglit, and iirc Chad fixed it.
03:07CapsAdmin: 14:00 < bentiss> imirkin_: FATAL: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol 'lock_release' -> that's fun to work with blobs!
03:08CapsAdmin: was there a fix for that?
03:14RSpliet: CapsAdmin: yes, don't use debugging kernels
03:14RSpliet: they're horrendously slow anyway
03:15CapsAdmin: ah, so fedora rawhide uses debug kernel?
03:16CapsAdmin: that makes sense i guess. thanks i'll look into that
03:16RSpliet: but in general the kernels on koji with rcX.0 are not, whereas rcX.n for n>0 are
03:17RSpliet: pardon: rcX.git0 is usually not debug, rcX.gitN for n>0 is debugging
03:18Antwan: Hi all, afaik, the Fedora kernel is not patched for memory-mapped I/O. While I'm willing to help Nouveau dev with my GeForce Titan Black, I currently cannot help on that aspect, any thoughts?
03:25RSpliet: Antwan: you mean MMIOTrace? I usually just roll my own kernel with MMIOTrace enabled
03:25RSpliet: debugging kernels do have it on, but have other issues with installing the official driver
03:25Antwan: RSpliet: Yes, I mean MMIOTrace.
03:25RSpliet: I think I can send you a kernel tonight, I probably have a set of 4.3 RPMs sitting around on my harddrive with MMIOTrace
03:25RSpliet: okay, maybe not tonight, bridgemas dinner
03:25Antwan: That's fine, no rush...I also have Tesla C2075 running on that beast that can be used as well if there is any interest?
03:25RSpliet: yes please!
03:25Antwan: By the way, for those who are in academia like myself and who are not aware of this, there is an academic support program from Nvidia that can provide GPU for free (you need to submit a proposal).
03:25Antwan: This is how I got these GPUs I'm talking about.
03:25RSpliet: surely my proposal can't just state "I want to reverse engineer them"... :-(
03:25funfunctor: oh interesting
03:25funfunctor: RSpliet: try, at least you'd have a story to tell :)
03:25Antwan: RSpliet: yes who knows?
03:25Antwan: I'm a geophysicist so mine was about doing computing on the GPU.
03:25RSpliet: we have good contact with NVIDIA ourselves nowadays, no need to try and abuse their well intended programs
03:28Antwan: By the way, I saw on the mailinglist archive that a nbody.c has been written for Computing tasks with Nouveau, but I did not find it, is this something we can get access to?
03:28RSpliet: pmoreau: any idea?
03:29Antwan: RSpliet: I was just informing people, but you're right!
03:41RSpliet: Antwan: no worries :-)
04:08funfunctor: imirkin: around?
04:09funfunctor: imirkin: can you test the following branch if it creates any regressions for you? https://github.com/victoredwardocallaghan/mesa-GLwork/tree/ARB_clear_texture-final2
04:25pmoreau: Antwan: The Titan X I have at work comes from that academic support program.
04:26pmoreau: For the nbody.c, I have no idea. I'll have a look, but hakzsam might know?
05:12Antwan: pmoreau: Ok thanks.
05:19Antwan: Regarding nbody.c, I'm referring to http://lists.freedesktop.org/archives/nouveau/2015-November/023327.html
05:26RSpliet: oh, contact Hans de Goede
05:26RSpliet: per e-mail
05:27RSpliet: he generally does not do IRC
05:31hakzsam: Antwan, it's an experimental nbody simulation in TGSI/OpenCL used to test compute support with nouveau
05:33Antwan: RSpliet: great, thanks!
05:34Antwan: hakzsam: is the code accessible? I'm wondering how to do such simulations on GPU.
05:35hakzsam: Antwan, I can send you a copy if you want, but as I said it's experimental and so doesn't really work for now :)
05:39hakzsam: Antwan, have fun http://hastebin.com/elabisacuq
05:42Antwan: hakzsam: That is really great and nice. Thanks a lot!
05:42Agiofws: image link: https://i.imgur.com/8CfblFL.png
10:29karolherbst: mupuf: seems like Tom^s card doesn't run stable with the min voltage from the voltage map table, even if nouveau runs at base clock :/ do you think there might be another table with some more information or is it just part of the volting algorithm used by the blob?
10:38Tom^: karolherbst: what volt am i on base clock?
11:01karolherbst: Tom^: 925000uv with stock nouveau
11:01karolherbst: I guess the blob is a lot nearer to 1V
11:01karolherbst: but a bit below
11:01Tom^: at 0f?
11:02Tom^: it should be like 1.01 or so
11:02Tom^: at idle
11:02karolherbst: the blob is that high?
11:02Tom^: on max perf yes
11:03karolherbst: max voltage is 1112500uv :/
11:03karolherbst: would fit somehow then
11:04karolherbst: I thought it was lower
12:54imirkin_: skeggsb: did you see https://bugs.freedesktop.org/show_bug.cgi?id=92892#c7 ?
13:01aboll: I have a user with a crashing supertuxkart on NVE4, crashes in libdrm_nouveau -- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=789395
13:02aboll: some known bug?
13:03imirkin_: aboll: does the crash go away with MESA_EXTENSION_OVERRIDE=-GL_ARB_draw_indirect ?
13:03imirkin_: aboll: if so, updating mesa/libdrm should help
13:04imirkin_: yeah... indirect = 0x23fd5b90 and crashing in nouveau_pushbuf_data... definitely familiar
13:04imirkin_: aboll: updating should fix it
13:04aboll: imirkin_: which version?
13:05imirkin_: mesa 11.0.6 should be fine
13:05imirkin_: let me dig to see what the actual fix was
13:05aboll: so 11.0.5 doesn't have the fix?
13:05imirkin_: no it should
13:06imirkin_: commit 78d58e642549fbf340fdb4fca06720d2891216a8, cc'd to 10.5 and 10.6 stable
13:06imirkin_: which implicitly means that 11.0 should already have it
13:06imirkin_: hrmph... mesa 11.0.5 in that bug... odd
13:07imirkin_: will have a look
13:09imirkin_: in the meanwhile that override should disable indirect draw which i think is what's causing the problem
13:10aboll: I'll ask the bug reporter to try the override
13:10imirkin_: i'm installing stk now to see what's going on... didn't have it on this comp apparently
13:12imirkin_: oh hm.... i wonder if this is the issue -- libdrm is 2.4.60
13:12imirkin_: iirc *some* version in there was just totally fubar'd for nouveau
13:13imirkin_: aboll: not sure if this is part of the issue, but 2.4.60 was fubar'd for nouveau. see https://bugs.freedesktop.org/show_bug.cgi?id=89842
13:16orbea: I keep experiencing full system freezes (xorg, mouse, keyboard) while playing any games with nouveau which requires me to ssh in and reboot from my laptop, any ideas? http://dpaste.com/0H3PANE
13:17orbea: xorg.0.log http://dpaste.com/18NVXEE
13:18imirkin_: aboll: fwiw i just installed stk 0.9.1 and ran with mesa 11.0.6 on a GK208 (NV108) and it worked fine. it should be quite similar to the nve4 though.
13:19aboll: imirkin_: apparently the user is on Debian testing which should have libdrm 2.4.65 if the system is up-to-date
13:19imirkin_: orbea: sounds familiar... might have been fixed in a more recent kernel... mupuf ?
13:19imirkin_: aboll: just going off what's written in the bug report
13:20orbea: uhh, idk how to use mupuf, will look into after class and get back to you later
13:20aboll: yeah, I'll ask him about the libdrm version
13:20imirkin_: orbea: mupuf is someone here who knows about all that i2c stuff :)
13:21imirkin_: orbea: you may also want to try kernel 4.4-rc3 which should potentially let you reclock your gpu
13:21orbea: googled fast and found other things called that.... :P
13:21imirkin_: but... perhaps not. but if not, there are additional patches we can have you try if you're interested
13:22orbea: i am, but I might have to put a rain check on trying anything that is not a quick fix, school priorities :\
13:22imirkin_: karolherbst: another GTX 780 Ti owner --^
13:23aboll: imirkin_: thanks for your help
13:23imirkin_: aboll: np. let me know if updated libdrm doesn't fix it
13:25aboll: imirkin_: yep
13:26imirkin_: aboll: although hrm... you may be right that the user has a more recent libdrm -- the assert being hit is one i added post 2.4.61
13:28imirkin_: aboll: in which case i'd want more info on how to repro
13:28karolherbst: imirkin_: who?
13:29imirkin_: karolherbst: orbea
13:29karolherbst: orbea: any issues so far?
13:29karolherbst: see it now
13:29imirkin_: scroll up.
13:31karolherbst: orbea: do you reclock to highest pstate?
13:31imirkin_: doubtful since he's on kernel 4.1.x
13:31karolherbst: who knows
13:31karolherbst: with the right clock you can also get a stable pll config
13:31karolherbst: or semi stable
13:32karolherbst: actually there were also some kepler gddr5 owners you said it was somehow stable even without my patch
13:32imirkin_: what i said was that even with identical values as blob, it was still unstable :)
13:33imirkin_: really just repeating what ben said
13:33karolherbst: mhhh yeah, I remember somebody saying that
13:34karolherbst: imirkin_: though Tom^s 780 ti isn't stable either on highest pstate, but that's because nouveau uses too low voltage :/
13:35karolherbst: orbea: sadly without compiling nouveau yourself you won't get anywhere most likely
13:36imirkin_: skeggsb: [ 30.594299] resource sanity check: requesting [mem 0xddf9c000-0xde09bfff], which spans more than 0000:01:00.0 [mem 0xdc000000-0xddffffff 64bit pref]
13:36imirkin_: skeggsb: are we ignoring the BAR size somewhere?
13:36aboll: imirkin_: hmm, could that override still be useful as a workaround?
13:36imirkin_: aboll: well, it'll disable indirect draws, which appears to be what's dying in there.
13:38aboll: imirkin_: ok, I'll keep you posted
13:41phomes: imirkin: I just tried reclocking seems to work. No crash or artifacts
13:42imirkin_: phomes: awesome :)
13:43phomes: I can write 03, 07, 0f. It also lists AC but will not accept that as a write
13:43imirkin_: that's the current line :)
13:43imirkin_: if you unplug it might even say DC
13:43imirkin_: and you can set diff pstates for AC and DC
13:44imirkin_: although imho such things are better left to a userspace policy manager
13:47phomes_: imirkin: spoke too soon. Going directly from 03 to 0f caused it to hang
13:48imirkin_: doh. what kernel are you on?
13:48imirkin_: hm ok. RSpliet -- does this sound familiar? or was it just chance?
13:49imirkin_: phomes_: does it actually reclock btw? i.e. when on 0f does the AC line match up to the 0f line?
13:49imirkin_: the *'s are just there to mislead you :)
13:50phomes_: heh, the * changed. I actually wanted to check AC line when it hung
13:50phomes_: will try again in a minute. I have some errors in the log from the hang
13:55imirkin_: hm, that sounds mildly familiar... RSpliet you saw stuff like that on occasion right?
13:57phomes_: AC does change. At least for 03 and 07. Will try to go from 03 to 0f again now
13:57imirkin_: this seems like the generic fail scenario... something died, nouveau didn't recover
13:58phomes_: AC line is close but not identical for 0f
13:58phomes_: 0f: core 550 MHz shader 1210 MHz memory 790 MHz AC DC *
13:58phomes_: AC: core 549 MHz shader 1215 MHz memory 789 MHz
13:59phomes_: the temp I get from lm_sensors is just 49 for 03, 50 for 07, and 51 for 0f
14:01phomes_: hang indeed seems unrelated. I have switched back and forth many times without problems now
14:02imirkin_: ok cool
14:02imirkin_: the fact that it's not identical is largely expected
14:02imirkin_: as long as it's close, it's fine
14:31imirkin_: aboll: it's all very weird... it looks like it's hitting the assert that i put in precisely to prevent the kinds of errors i was originally getting with stk, which i fixed with that mesa commit. i guess not hard enough? can't imagine why it would fail now... esp in that particular way =/
14:41karolherbst: why does nvidia have to mess up so badly, only because the gpu got a little hot :O
14:43mupuf: imirkin_: I don't think anyone worked on i2c in a while :s Would be nice to get debug=i2c=debug
14:43karolherbst: gnurou: could you convince somebody that the nvidia module should stay stable even if we put 100+ °C into the gpu through nvaforcetemp :D
14:43mupuf: karolherbst: we definitely need to do exactly what the blob does, and the blob does not use the min voltage
14:43karolherbst: mupuf: I know
14:44karolherbst: guess what I try to do on reator :p
14:44karolherbst: get voltage for each clock of your gk106 gpu
14:44mupuf: that's why my apartment sounds like a jet engine
14:44karolherbst: but the patches are working really well on the 780 ti
14:44mupuf: not sure what you did to the fan :D
14:44karolherbst: nvidia messed up
14:44karolherbst: I forced 102°C
14:45karolherbst: I wanted to check when the blob drops the clock
14:45karolherbst: max: 1123 70 °C: 1110 80 °C: 1097 102°C: messed up
14:45karlmag: "hey, I can make coffee/tea in my gpu water-cooler"
14:45karolherbst: karlmag: well
14:46karolherbst: the actual temp was still around 50°C
14:46karolherbst: but we can force the gpu to report something else to the blob driver
14:46karlmag: ah, right... doesn't qualify for anything more than lukewarm (as it comes to coffee/tea) then..
14:48karlmag: out of bounds, or out of bounds handled badly? Sounds like the blob believes what it's fed and kind of dies or something?
14:57karolherbst: mupuf: shall I check all clocks?
14:57mupuf: karolherbst: I wrote the tool for it, I sure know how to fake temperature :D
14:58karolherbst: yeah, but I want the voltage for each clock, currently I never saw the blob using different voltages for the same clock except for pwm based gpus :/
14:59karolherbst: ohh wait
14:59karolherbst: mupuf: any idea how I can scrape all those values with a script? :D
15:00karolherbst: yeah well, I don't want to modify the vbios, fake it, restart X, add load and mess with the temp by hand :/
15:01mupuf: a script would work, but if you want to see all the voltages and clocks, fake a small load
15:01mupuf: it will go through all the clocks
15:01mupuf: and guess what, I already have everything :D
15:01karolherbst: makes sense somehow
15:01karolherbst: so I would do that for every temp too
15:02mupuf: I guess yes
15:02karolherbst: is there an envytools thingy for the load?
15:02karolherbst: or do I have to mess the counters
15:02mupuf: but I am sure ramping up the frequency at a fixed temperature would be easier :D
15:03karolherbst: I meant that
15:03karolherbst: ramping up at 1°c
15:03karolherbst: ramping up at 2°C
15:03karolherbst: and so on
15:04mupuf: yeah, set the clock to the base clock
15:04mupuf: and change the temperature
15:04mupuf: that would be enough
15:05karolherbst: mupuf: where is your script for the load clock thingy?
15:06Arbition: karolherbst: I decided digest mode was too much of a pain, I still got loads of them, so I'm just going to filter it in my email instead
15:06karolherbst: Arbition: :D
15:06Arbition: karolherbst: Had any more thoughts on what to try on my funny mobile chipset?
15:06mupuf: karolherbst: https://github.com/mupuf/pdaemon_trace
15:06karolherbst: mupuf: right he has a funny issue
15:06karolherbst: mupuf: yeah I know that tool
15:06karolherbst: mupuf: I meant a script for setting the load and starting this one ;)
15:07karolherbst: ahh there it is
15:08karolherbst: mupuf: Arbition has a funny issue
15:09karolherbst: mupuf: while reclocking memory, he gets stuck with the pmu in some state I don't know
15:09karolherbst: there is no message queued, but nouveau still waits somewhere in nvkm_pmu_send
15:10mupuf: Arbition: what's your funny issue?
15:11mupuf: karolherbst: hmm, looks like the same issue you have, right?
15:11karolherbst: for me PUT != GET
15:11karolherbst: but for him PUT == GET
15:11Arbition: umm, as karolherbst described. Less in depth (as I don't really know it) When I go to change the pstate of the chip, it seems to lock up
15:11Arbition: not as in the whole computer
15:11karolherbst: mupuf: the echo call into pstate just gets stuck forever
15:11Arbition: but the command changing the pstate never returns
15:11karolherbst: we have the stacks though
15:12karolherbst: https://drive.google.com/file/d/0B2jHgkGt14XJTW5PbV9LQkFKV2M/view and https://drive.google.com/file/d/0B2jHgkGt14XJZWtuVW10ZmpxNjA/view
15:13karolherbst: mupuf: I ran out of ideas after I saw there is no message queued, so
15:13karolherbst: mupuf: and it happens every time
15:13mupuf: I see
15:13mupuf: the reclocking script is likely stuck
15:14karolherbst: anyway, 61% isn't enough :D
15:14karolherbst: 63% is good though
15:15karolherbst: mupuf: should I put all in one big csv file or each temp one file?
15:15mupuf: 63% of what?
15:15karolherbst: core load
15:15mupuf: yes, collect everything
15:15mupuf: data collection is fucking annoying
15:16mupuf: oh, and try to collect also the low -> high voltage
15:16mupuf: and high to low
15:17karolherbst: you mean once starting from 1°C and once starting from 100°C?
15:18mupuf: nope, I meant, when voltage is ramping up
15:18mupuf: and then when ramping down
15:18karolherbst: ohh okay
15:19karolherbst: but I have to wait until it goes down anyway, or do you mean with a lower load than 50% then, so it reclocks slowly down
15:19mupuf: yes, that would be perfect
15:20karolherbst: I think I messed up the driver
15:20karolherbst: now it's stuck at the highest clock :D
15:20mupuf: there may be some hysteresis factor
15:20karolherbst: ahh no, it took some time
15:20mupuf: no, it takes a lot of time the second time
15:21karolherbst: ohh oka
15:21karolherbst: okay, in 4 hours I have it ready then :p
15:23karolherbst: it jumps while downclocking :/
15:24mupuf: but you still have sample points
15:24karolherbst: but I try to find a load low enough so that it actually clocks down to the lowest clock :/
15:25karolherbst: it always gets stuck at 731 MHz :/
15:25karolherbst: and I was using 27% load
15:26mupuf: well, that's one sample point, better than nothing
15:26karolherbst: yeah, no idea, with such a low load, it doesn't make a difference if I have no load or a bit load
15:26karolherbst: it jumps a few times
15:26mupuf: if the voltage when increasing is the same as when decreasing (which should be true)
15:26mupuf: then it is alllllllllll gggoooooddddd!
15:26karolherbst: it is
15:27karolherbst: with a sample of two pairs :D
15:27karolherbst: mupuf: how big should I choose the temp steps?
15:27karolherbst: +1 or something bigger?
15:28karolherbst: I really don't want to do that like 100 times, so I would do +5 steps if thats okay
15:28karolherbst: otherwise I should really write a script :D
15:28karolherbst: yeah why not actually
15:29mupuf: if you automate it, then time does not matter!
15:30mupuf: thanks a lot for doing this
15:30mupuf: it will be a good first sample point :D
15:31mupuf: and then we will need to modify the bios and see what it changes :D
15:32karolherbst: ohh there is kill %1
15:32karolherbst: this makes stuff damn easy
15:32mupuf: kill % 1?
15:32karolherbst: process & # background process
15:32karolherbst: sleep 4
15:32karolherbst: kill %1
15:32karolherbst: background process killed
15:32karolherbst: no stupid pid handling
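The job-control trick karolherbst describes can be sketched as a small script. This is only an illustration under assumptions: the background loop is a stand-in for the real tracing tool, and the log path, the `sample` lines, and the two-second window are made up for the example.

```shell
#!/usr/bin/env bash
log=$(mktemp)

# Stand-in for the tracing daemon: append one line per second until stopped.
while :; do
    echo "sample"
    sleep 1
done > "$log" &          # started in the background, becomes job %1

sleep 2                  # sampling window
kill %1                  # stop job 1 by job number, no PID bookkeeping
wait                     # reap the stopped job before reading the log

lines=$(wc -l < "$log" | tr -d ' ')
echo "collected $lines lines in $log"
```

`%1` refers to the first background job of the current shell, so this only stays simple while the script starts a single background process at a time.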
15:33karolherbst: mupuf: how long should I wait for downclocking, 15 seconds?
15:34mupuf: show me your script
15:34karolherbst: first I need to test it
15:36karolherbst: mupuf: there you go: https://gist.github.com/karolherbst/0b8d4580a42bd66838bc
15:37karolherbst: meh, even 25 seconds are sometimes not enough :/
15:38karlmag: karolherbst: can you test for it?
15:39karolherbst: test for what?
15:39mupuf: karlmag: voltage could have been a good way to test
15:39karlmag: whatever takes too long
15:39mupuf: but ... that's what we are trying to test :D
15:39karolherbst: I think he means that I test whether the lowest clock is already reached
15:39karolherbst: well I could just tail the log file
15:39karlmag: never mind, carry on.. *steps aside again*
15:40mupuf: karlmag: change the code that dumps the clocks to not dump when there are no changes
15:40mupuf: then just increase the timer to 35 s
15:40mupuf: that should allow you to plot the values without having millions of rows
15:41karolherbst: we have cut for that :p
15:41karolherbst: and sort -u
15:41mupuf: hmm, you may still exceed the storage capacity
15:42karolherbst: I am 20% done and still <200kb
15:42mupuf: half a meg for 100 seconds
15:43karolherbst: at highest clock, right?
15:43mupuf: yes, should be fine
15:43mupuf: highest clock?
15:43karolherbst: the daemon prints more at higher clocks
15:43karolherbst: and at lowest clock less
15:43mupuf: you know, I think you should just fix the clock to the base clock then change the voltage
15:43mupuf: that would be a first check
15:44karolherbst: I am already at 30°C, I will finish that now :p
15:44mupuf: and I will go to bed!
15:44karolherbst: but I use +5 °C steps, just to get a first data set
15:44karolherbst: we can improve accuracy later on if we want
15:45mupuf: before going to bed, start the one with step 1°C :p
15:45mupuf: not that I need to warm up my apartment
15:46mupuf: karlmag: that was a nice suggestion btw, thx
15:46karolherbst: maybe I can time it in a way
15:46karolherbst: that it gets loudest when you have to wake up!
15:46mupuf: ah ah, I selected an apartment where the bedroom would be far from the living room :D
15:46mupuf: guess where the machine is!
15:47mupuf: I need to setup piglit runs on my old laptops
15:47mupuf: I saw them yesterday and it is stupid not to make them available
15:47karolherbst: ancient gpus?
15:47karolherbst: like pre tesla?
15:48mupuf: nv86 and nvd9
15:48karlmag: mupuf: did you mean to reference me there, or that other k<tab> guy? ;-)
15:48mupuf: karlmag: nope, I meant you
15:48karolherbst: yeah, it was a good idea
15:48karlmag: oki.. you're welcome then :-)
15:48karolherbst: but I am too lazy to change the stuff
15:49mupuf: anyway, mister sandman is calling me!
15:49karolherbst: cut and sort -u are powerful enough anyway
15:49karolherbst: why bother, it is night the time you are outside anyway :p
15:49karolherbst: so you could theoretically sleep whenever
15:49karlmag: mr sandman should be calling me too *looks at half drunk can of cola*
15:50mupuf: Arbition: I will check the reclocking script that hangs on your card
15:50mupuf: but it just means the machine is beyond fucked
15:50mupuf: we probably program the plls in a wrong way and it never settles
15:50Arbition: Ah thats right, I was supposed to kit up for a kernel build env, I didn't have time
15:50karolherbst: mupuf: I had a talk with gnurou today
15:51karolherbst: and he said that usually no IRQ should ever go missing, but he doesn't know much about the fuc code
15:51mupuf: you found a ginger gnu? (he will get the joke :D)
15:51mupuf: as I said, I think it is a coalescing on the cpu/kernel side
15:51karolherbst: yeah, maybe
15:52karolherbst: but why
15:52karolherbst: there aren't any parallel requests
15:52mupuf: then maybe the line is shared with something else and it gets the signal but not you?
15:52karolherbst: could be
15:52mupuf: wait, we live in the MSI time now
15:52karolherbst: ohh right
15:52mupuf: no idea what the heck is going on
15:53karolherbst: well it is easy to reproduce
15:53karolherbst: just while true; do echo 0a > pstate; echo 0f > pstate; done
15:53karolherbst: at some point this will get stuck
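A bounded variant of that reproduction loop might look like the sketch below. The names `PSTATE_FILE`, `ITERATIONS`, and `TIMEOUT_S` are assumptions introduced for the example, and the file defaults to a scratch file so the loop can be tried without touching a GPU. It relies on `timeout` from GNU coreutils. If the write blocks uninterruptibly inside the kernel, the signal sent by `timeout` cannot end it, so this only helps for waits that can be interrupted.

```shell
#!/usr/bin/env bash
# Assumed knobs: point PSTATE_FILE at the real pstate file to use this for real.
PSTATE_FILE=${PSTATE_FILE:-$(mktemp)}
ITERATIONS=${ITERATIONS:-20}
TIMEOUT_S=${TIMEOUT_S:-10}

set_pstate() {
    # Write one pstate id, giving up after TIMEOUT_S seconds.
    timeout "$TIMEOUT_S" sh -c 'echo "$1" > "$2"' sh "$1" "$PSTATE_FILE"
}

completed=0
for i in $(seq 1 "$ITERATIONS"); do
    # Alternate 0a -> 0f as in the one-liner; stop at the first failed write.
    if ! set_pstate 0a || ! set_pstate 0f; then
        echo "pstate write did not complete (iteration $i)" >&2
        break
    fi
    completed=$i
done
echo "completed $completed of $ITERATIONS iterations"
```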
15:53mupuf: I would say it would be good to put the timeout
15:53karolherbst: but it doesn't help in Arbition's case
15:53mupuf: and complain loudly in the logs
15:53karolherbst: that's what is bothering me most
15:53karolherbst: I already do
15:53RSpliet: imirkin_: regarding phomes_ issue, I can't quite tell which card this is... Clocks seem to correspond with a DDR3, which makes me believe it's GT21x...
15:54RSpliet: in which case, yes, I don't think the card is paused perfectly when reclocking, so under load you could experience a hang
15:54mupuf: nope, because as I said, pmu must hang in the script
15:54karolherbst: mupuf: loud enough? nvkm_error(subdev, "wait on reply timed out\n");
15:54RSpliet: once the clock is set, the card should be rock-solid
15:54mupuf: karolherbst: the right kind of loud :p
15:55karolherbst: ohh nearly done with the script
15:55mupuf: karolherbst: we'll need to work on the commit message
15:55karolherbst: yeah, no problem
15:55imirkin_: RSpliet: he has a GT216, laptop edition
15:55mupuf: there are valid cases you want to handle, aside from missing irqs
15:55RSpliet: I figured, those I have seen hang at random
15:55RSpliet: even without clock alterations
15:56mupuf: oh no, not those...
15:56RSpliet: well, I wouldn't say super, it's pretty messed up really
15:56karolherbst: mupuf: in any case, waiting with wait_event on IO operations sounds like a bad idea anyway
15:56mupuf: it is asking for troubles, I guess
15:57karolherbst: I know a lot of tricks to mess up nouveau
15:57karolherbst: and most of them are always of that kind of stuff
15:57RSpliet: like run a multithreaded GL program?
15:57mupuf: imirkin_: yeah, the nvaX are known to be messed up
15:57RSpliet: mupuf: NVA8 has been pretty stable
15:57karolherbst: mupuf: like while (nv_rd() = nv_rd()) kind of stuff
15:57imirkin_: mupuf: really? my gddr5 nva3 has surprisingly few issues
15:58mupuf: for the story, mwk reverse engineered pdaemon to understand why it works with the blob
15:58mupuf: imirkin_: they used to be a pain
15:58mupuf: 216 is nva5?
15:58RSpliet: imirkin_: for that nva3 you might actually want to see what bits you can borrow from my reclocking tree later on
15:58mupuf: I hate my nva5, it is super loud
15:58imirkin_: mupuf: i think it's supposed to have issues for display due to gddr5 fail, but i only use it for offloading
15:58karolherbst: mupuf: wanna see the data? it is finished
15:58RSpliet: mupuf: it's very metal \m/
15:58imirkin_: RSpliet: i think you and i wrote ~identical patches ;)
15:58mupuf: RSpliet: hehe
15:59mupuf: karolherbst: what the hell, go for it
15:59imirkin_: RSpliet: except yours are tested and more complete
15:59mupuf:will go to the office at noon again tomorrow at this rate
15:59imirkin_: RSpliet: https://github.com/imirkin/nouveau/commit/a41f62639b14bab1f306241445398f525f8e47b7
15:59RSpliet: I wouldn't say tested, but I definitely went a leap from there
15:59imirkin_: [hmmm for some reason i thought i had more]
16:01mupuf: wait, there must be a ton of data :D Not sure I want to see :D
16:01karolherbst: mupuf: ? <1MB
16:02mupuf: still :D
16:02mupuf: I will wait for the fancy graphs
16:02karolherbst: yeah okay, I write up a cut | sort command first
16:02mupuf: showing frequency vs voltage
16:03mupuf: and then having different lines for different temperatures
16:03mupuf: that is what we need :p
16:03karolherbst: mupuf: cat log_5.csv | cut -d, -f3-9,16 | head
16:03karolherbst: is that enough for columns?
16:03karolherbst: or do you want more?
16:03karolherbst: ohh temp
16:04karolherbst: mupuf: cat log_5.csv | cut -d, -f3-9,16-17 | head
16:04karolherbst: mupuf: cat log_5.csv | cut -d, -f3-9,16-17 | uniq
16:04karolherbst: there you go
16:04karolherbst: 425 lines
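One detail of the `cut | uniq` pipeline is worth a small sketch: `uniq` only drops adjacent duplicates, while the `sort -u` mentioned earlier drops every repeat but loses the time order. The sample rows and column layout below (timestamp, label, core MHz, voltage, temperature) are made up for illustration and are not the real log format.

```shell
#!/usr/bin/env bash
csv=$(mktemp)
# Hypothetical rows: timestamp,label,core,volt,temp
cat > "$csv" <<'EOF'
t0,a,324,900,30
t1,b,324,900,30
t2,c,705,950,30
t3,d,324,900,30
EOF

# Keep only the interesting columns, then collapse repeats two ways.
adjacent=$(cut -d, -f3-5 "$csv" | uniq | wc -l | tr -d ' ')      # order kept
global=$(cut -d, -f3-5 "$csv" | sort -u | wc -l | tr -d ' ')     # order lost

echo "uniq keeps $adjacent rows, sort -u keeps $global rows"
```

With `uniq` the ramp-up and ramp-down visits to the same clock stay visible as separate rows, which matters if the two directions are to be compared.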
16:05mupuf: still, I want to see the graph :D
16:05mupuf: have fun with gnuplot
16:05karolherbst: yeah meh :D
16:05karolherbst: how can I do those nice 3d graphs?
16:05mupuf: just cut the file into multiple files at different temperatures
16:06mupuf: with gnuplot? super simple
16:06mupuf: but 3d graphs are annoying to read
16:06karolherbst: I think I will do a 3d graph: tmp+core => volt
16:07mupuf: ack, but while you are at it, make a 2d graph with core => volt
16:07mupuf: and one line per temperature
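Splitting the data into one file per temperature, as mupuf suggests for the 2d graph, could be sketched like this. The three-column layout (core MHz, voltage, temperature) and all values are assumptions for the example and are not taken from the real log.

```shell
#!/usr/bin/env bash
data=$(mktemp)
outdir=$(mktemp -d)
# Hypothetical rows: core,volt,temp
cat > "$data" <<'EOF'
324,900,30
705,950,30
324,912,35
705,962,35
EOF

# Write "core volt" pairs into one file per temperature value (column 3).
awk -F, -v dir="$outdir" '{ print $1, $2 > (dir "/temp_" $3 ".dat") }' "$data"

ls "$outdir"
```

Each resulting file can then be drawn as its own line in gnuplot, for example `plot "temp_30.dat" using 1:2 with lines, "temp_35.dat" using 1:2 with lines`.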
16:14karolherbst: mupuf: don't you have a gnuplot example file for those 3d stuff I want?
16:14karolherbst: like you did for the pwm
16:14mupuf: http://lowrank.net/gnuplot/plot3d2-e.html ?
16:14mupuf: pwm? did I do that?
16:15karolherbst: voltage pwm
16:15mupuf: http://www.nigelpond.com/2009/10/06/using-gnuplot-to-plot-3d-graphs/ <-- this is probably what you want
16:15karolherbst: yeah, something like that
16:15karolherbst: or bar charts
16:15mupuf: maybe it was just to show off :D
16:15karolherbst: but maybe this is nice
16:15mupuf: I don't like 3d graphs
16:15mupuf: it is not readable
16:16orbea: karolherbst: I can compile nouveau myself (Im on slackware) and I haven't done anything to the kernel, stock slackwaregeneric kernel 4.1.13
16:17orbea: but its not a major issue, only with games which I should not play until I finish school projects, so I will come back to this later
16:18karlmag: orbea: which hardware?
16:19orbea: at first I thought it was a steam issue, but other non-steam games like pioneer space-sim do it too.
16:21karolherbst: mupuf: mhh: http://plotshare.com/sessions/510349027/Plot1.png :/
16:21karolherbst: maybe you are right :D
16:22mupuf: told you, sherlock!
16:22karolherbst: but the data isn't as smooth as I expected :/
16:22karlmag: orbea: ok.. Just got a bit curious since you're running slack-current
16:23mupuf: ok, this time, let's sleep :D
16:23karlmag: mupuf: g'nini
16:23karolherbst: I have to clean up the data
16:24karolherbst: mupuf: wait a second :O
16:24karolherbst: does your application support VID >15 by the way?
16:24karolherbst: now that I look at this data
16:24karolherbst: yeah well meh
16:25orbea: karolherbst: yea, slackware current has had major changes with eudev and new kernels recently...
16:26orbea: Ideally if I can find out what's causing it, I can mention it on LQ so Pat can fix the kernel or w/e it is, assuming it's not on my end which I don't think it is
16:26mupuf: karolherbst: you should check using the maxwell, not the kepler
16:26mupuf: the pwm voltage would help getting more accuracy
16:26karolherbst: yeah I know, but there was the kepler
16:26mupuf: will plug it over the week end
16:27karolherbst: I think your daemon is a bit odd :/
16:27karolherbst: there is never a VID above 15
16:27mupuf: I call it a big HACK
16:27karolherbst: meh :D
16:27karolherbst: this time I repair the data by hand
16:27mupuf: are there higher voltages than 15?
16:27karolherbst: check this out: https://gist.github.com/karolherbst/cf02a71331321a0f008a
16:27karolherbst: how should that otherwise make any sense
16:27karolherbst: last is temp
16:27karolherbst: before temp is vid
16:29karolherbst: you only check for 4 vid gpios
16:30karolherbst: well your gpu has 5 :/
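The arithmetic behind the "never a VID above 15" observation can be sketched as follows. It assumes the VID is a plain binary number built from the GPIO line states, least significant line first. That encoding and the helper name `vid_from_bits` are assumptions for illustration, not taken from the tool.

```shell
#!/usr/bin/env bash
# vid_from_bits: combine GPIO line states (least significant first) into a VID.
vid_from_bits() {
    vid=0
    i=0
    for bit in "$@"; do
        vid=$(( vid | (bit << i) ))
        i=$(( i + 1 ))
    done
    echo "$vid"
}

max4=$(vid_from_bits 1 1 1 1)        # highest value readable with 4 lines
max5=$(vid_from_bits 1 1 1 1 1)      # highest value readable with 5 lines
full=$(vid_from_bits 0 0 1 0 1)      # a VID that needs the 5th line
seen=$(vid_from_bits 0 0 1 0)        # the same state read with only 4 lines

echo "4 lines max: $max4, 5 lines max: $max5, VID $full read as $seen"
```

Under these assumptions, reading only four lines silently maps every VID of 16 or more onto a lower value, which would explain samples that otherwise make no sense.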
16:30karolherbst: much better now
16:31karlmag: orbea: sounds like an idea, yeah. Afaik Pat either tracks or get told about suff on LQ.
16:31karlmag: stuff even
18:57imirkin: gr. how does const-folding mad when all 3 args are const cause some programs to get hurt?!
19:34karolherbst: http://plotshare.com/sessions/510349027/Plot1.png mhhh
19:55karolherbst: mupuf: ^^
19:56karolherbst: there are a lot of garbage entries though in the data :/
20:05imirkin: aaaand i killed heaven on nv50 with all my fixes