02:01 hakzsam: skeggsb, well, the blob seems to disable ctxsw, read MP counters , enable ctxsw (http://hastebin.com/kibozehoxo.coffee). According to android gk20a driver's headers, 0x38 is "stop_ctxsw" and 0x39 is "start_ctxsw" ...
02:01 hakzsam: and it probably does the same thing when it configures MP
02:03 hakzsam: so... I think we definitely need to get rid of that compute kernel and to move MP counters inside nouveau (as for PCOUNTER ones)
02:04 hakzsam: but well, my series already fixes a ton of stuff related to those MP counters on Fermi, but I'll add a note to explain this :)
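The sequence described above (stop ctxsw, read the MP counters, start ctxsw) can be sketched as a small bracket around the read. This is only an illustration: the two method IDs are the ones quoted from the gk20a headers, while `send_method` and `read_counter` are hypothetical callables that stand in for whatever actually talks to the hardware. Neither is a real nouveau or blob API.

```python
from contextlib import contextmanager

# Method IDs as quoted in the discussion (Android gk20a driver headers).
STOP_CTXSW = 0x38
START_CTXSW = 0x39


@contextmanager
def ctxsw_disabled(send_method):
    """Keep context switching disabled for the duration of the block.

    `send_method` is a hypothetical callable that submits one method ID.
    """
    send_method(STOP_CTXSW)
    try:
        yield
    finally:
        # Always re-enable context switching, even if the read fails.
        send_method(START_CTXSW)


def read_mp_counters(send_method, read_counter, count):
    """Read `count` MP counters while ctxsw is disabled (hypothetical helpers)."""
    with ctxsw_disabled(send_method):
        return [read_counter(i) for i in range(count)]
```

The `finally` clause is the design point here: whatever happens during the read, the start method is still sent, so a failed read cannot leave context switching off.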
04:33 karlmag: fun.. had nouveau crash on me.. or at least freeze..
04:34 karlmag: hmm.. I guess karolherbst isn't here right now.
04:34 karlmag: oh well.. I'll paste the logs if someone else wants to have a peek
04:35 karlmag: dmesg; http://pastebin.com/PwZzRimG
04:35 karlmag: xorg log; http://pastebin.com/FUx7TSRh
04:35 karlmag: setup: 01:00.0 VGA compatible controller: NVIDIA Corporation GK106 [GeForce GTX 650 Ti Boost] (rev a1)
04:35 karlmag: and a 08:00.0 VGA compatible controller: AMD/ATI [Advanced Micro Devices, Inc.] Hawaii PRO [Radeon R9 290] (rev 80)
04:36 karlmag: in same machine, nvidia one set as primary
04:36 karlmag: one monitor connected to each card
04:37 karlmag: only one connected to the nvidia card active when X started
04:37 karlmag: essentially started kde, opened system settings, and there it froze
04:56 pmoreau: karlmag: Is it something new? What did you change to get that crash?
04:59 karlmag: pmoreau: only thing I really did was to put one monitor on the other card
04:59 karlmag: (adding a dp-adapter to the dvi cable)
05:00 karlmag: no, nothing else
05:01 karlmag: pmoreau: and yes, it's new
05:01 karlmag: had things freeze up before, but not seen those errors in the logs before
05:08 karlmag: System is still up'n'running in case there is anything I can try to probe for.
05:08 karlmag: (X is frozen, obviously)
05:29 pmoreau: karlmag: Sorry, got absorbed by e-mail handling and completely forgot to check back here.
05:30 pmoreau: So, unplugging the monitor from the AMD card would *solve* the issue?
05:30 pmoreau: Kinda reminds me (somewhat) of https://bugs.freedesktop.org/show_bug.cgi?id=82714
05:31 karlmag: pmoreau: no idea... btw, I did the plugging before booting
05:31 pmoreau: Well, different GPUs, slightly different setup, but still having a config with one AMD and one NVIDIA, and a screen hooked up to the AMD one
05:34 pmoreau: Something goes wrong at some point with the GR engine, and then Nouveau runs oom I think
05:35 pmoreau: You'll get better support from imirkin than you will from me
05:36 karlmag: ok..
05:36 karlmag: I could try to make it happen again
05:36 pmoreau: It should come around in a few hours
05:36 pmoreau: That would be great!
05:36 karlmag: right now it's just in the frozen X state in case I should try to peek for stuff
05:37 karlmag: ok.. I'll reboot and see what happens.
05:40 karlmag: strange.. something must be hanging really hard
05:41 pmoreau: And if you do reproduce, might be interesting to see what happens if you unplug the monitor from the AMD card.
05:41 pmoreau: :/
05:41 karlmag: seems like the OS initiated a boot
05:41 karlmag: eventually the screen went black, and nothing has happened after that
05:42 pmoreau: Can you interact with it?
05:42 pmoreau: If you have another computer, can you SSH in?
05:42 karlmag: it didn't answer to ping, so it didn't boot.
05:43 karlmag: I gave it ample time for the os to get up
05:43 karlmag: well, a hard boot helped
05:43 pmoreau: Ok
05:53 karlmag: first test it didn't hang... but I did some stuff slightly different
05:53 karlmag: retrying
05:54 karlmag: hmm... nope
05:54 pmoreau: meh
05:55 karlmag: only *real* difference I can think of is that first time the machine was starting off cold after being switched off for hours
05:56 pmoreau: Leave the computer powered off for like 10secs or so before booting again
05:56 pmoreau: But maybe you did that already
05:58 karlmag: Not sure how long I kept it off.. retrying now at least
05:59 karlmag: one other thing I saw on the last couple of reboots; nothing displayed on the amd monitor at all. Earlier it did print some stuff from the booting process. Right after the mode switch till nouveau took over as primary display I believe.. so a couple dozen lines of log
06:01 karlmag: now the text is back. Left machine off for a couple of minutes.
06:02 karlmag: nope.. no hang..
06:03 karlmag: I'll leave it for a bit and check again in a couple minutes I think
06:09 pmoreau: It's so annoying when bugs go away like that
06:10 karlmag: yeah..
06:10 karlmag: true that
06:10 karlmag: and nothing yet
06:10 karlmag: could retry of course..
06:10 karlmag: can switch that machine off for a while for it to cool down and see if that changes anything
06:12 karlmag: it's quite typical though
06:12 karlmag: I guess the logs don't say *that* much either.
06:12 karlmag: though I did think far enough to capture them at least.
06:13 pmoreau: I don't know how to interpret GR error messages
06:13 pmoreau: That's good! :-)
06:18 karlmag: I know what I forgot.. I should have checked cstate, pstate, etc.. :-/
06:32 Psy-Q: how can i find out if power management is working on my card? i'm running on linux 4.3.0-rc5-686-pae from debian testing, /sys/class/drm/card0/power exists and runtime_status shows "unsupported", does that mean it can't be enabled on this card?
06:36 karolherbst: mupuf: guess what, you forgot the subdev/bios/iccsense.h file
06:37 Psy-Q: it's a GK106 (0e6060a1) and i have nouveau.pstate=1 as kernel boot parameter
06:37 pmoreau: Psy-Q: The only power management supported by Nouveau is powering down the card automatically if it's not used (not connected to a screen / being used by PRIME)
06:37 Psy-Q: pmoreau: frequency scaling does not fall under the power management term as well?
06:37 pmoreau: No power/clock gating, nor automatic reclocking so far
06:37 Psy-Q: oooh
06:38 pmoreau: karolherbst is working on it, but it's a WIP and not merged yet
06:38 Psy-Q: ok, thanks. i saw some article on phoronix where he apparently benchmarked some cards with reclocking enabled on nouveau
06:38 pmoreau: It's manual reclocking, not automatic
06:39 karolherbst: Psy-Q: some of the stuff will/may land for 4.4
06:39 pmoreau: Having pstate=1 gives you a pstate file, which you can cat to get the available perflvl, and you can echo a perflvl to reclock to it
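The manual reclocking flow described above (read the pstate file to list the perflvls, write one back to switch) could look roughly like the sketch below. It rests on assumptions: the file path is passed in by the caller (something like `/sys/class/drm/card0/device/pstate` is a guess and depends on the kernel), and each line is assumed to have the form `<id>: <description>`. The real layout may differ.

```python
def parse_pstate(text):
    """Map each performance-level id to the rest of its line.

    Assumes lines of the form "<id>: <description>".
    """
    levels = {}
    for line in text.splitlines():
        ident, sep, desc = line.partition(":")
        if sep and ident.strip():
            levels[ident.strip()] = desc.strip()
    return levels


def read_pstate(path):
    """Equivalent of `cat <path>`, returned as a dict."""
    with open(path) as f:
        return parse_pstate(f.read())


def set_pstate(path, level):
    """Equivalent of `echo <level> > <path>`; needs write permission."""
    with open(path, "w") as f:
        f.write(level)
```

A caller would typically check that the requested id is a key of `read_pstate(path)` before passing it to `set_pstate`.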
06:39 karolherbst: the more advanced stuff aka dynamic reclocking has to wait though
06:40 pmoreau: Psy-Q: If you have a GDDR5 Kepler card, you could try Karol's branch to get reclocking fixes and PCIe speed changes
06:40 Psy-Q: ah hah! thanks :)
06:40 pmoreau: (Well, you can get the PCIe speed changes even with a non-GDDR5 Kepler, and reclock it)
06:41 Psy-Q: i'll try to get my hands on that branch
06:41 karlmag: karolherbst: I do have a couple of logs for you today.
06:41 Psy-Q: i made the mistake of putting a 32-bit kernel on this machine but now noticed many commercial games are 64-bit only now, so i'll have to set it up fresh anyway
06:41 karolherbst: karlmag: saw them
06:41 karolherbst: Psy-Q: shouldn't that work nevertheless?
06:42 karlmag: karolherbst: you read the channel log?
06:42 karolherbst: or can't 32bit linux handle 64bit binaries?
06:42 karolherbst: karlmag: sometimes
06:42 Psy-Q: i did see some message listing all the memory and cpu clocks this card support somewhere, but now i can't find it..
06:42 Psy-Q: karolherbst: no, only the other way round i guess
06:42 karolherbst: :/
06:42 karlmag: ah... right.. then I don't have to repeat myself then
06:42 Psy-Q: at home i have a multiarch machine, that works
06:42 karolherbst: well switching to 64bit kernel is not that hard though
06:42 Psy-Q: but this will be my nouveau experimentation machine
06:42 karolherbst: you just have to install one and boot it
06:42 pmoreau: karolherbst: Which branch of yours should he test?
06:42 karolherbst: pmoreau: master_karol_stable
06:43 Psy-Q: karolherbst: what about libs and stuff, i'd have to pull in 64-bit libs as well, no? i'll investigate
06:43 pmoreau: Psy-Q: https://github.com/karolherbst/nouveau/tree/master_karol_stable
06:43 karolherbst: Psy-Q: I am not sure about init, but a 64bit kernel can handle 32bit processes just fine, so in theory (never tried it myself) it should just work
06:43 Psy-Q: pmoreau: ah, thanks, so it'd be for an out-of-tree build? i guess i can do that somehow, i think i saw a guide
06:43 pmoreau: Psy-Q: You'll probably need 64bits lib
06:44 Psy-Q: https://wiki.debian.org/Migrate32To64Bit <-- they have a really long guide
06:44 Psy-Q: i think i'm faster with reinstalling, there's only openarena, steam, xfce and nouveau (from experimental's 4.3) on this machine anyway :)
06:44 pmoreau: But if your computer supports 64bits, you should go for that and forget 32-bit
06:44 pmoreau: :D
06:44 pmoreau: Yeah, should be faster!
06:45 karolherbst: :D
06:45 kubast2: fast/10
06:45 karolherbst: mwk: what should I do about those two counters? GR_ROP and GR_GPC, these bits are used on gt215+ tesla, though they may mean something else
06:46 mwk: they *do* mean something else
06:46 mwk: duplicate them
06:46 mwk: GR_UNK1 for tesla, GR_HUB for Fermi+
06:46 karlmag: meh... not able to repeat that one it seems.. Now I tried from cold boot (machine switched off for half an hour)
06:48 Psy-Q: alright, gonna reinstall in 64-bit and prepare an environment for an out-of-tree build. thanks!
06:51 karlmag: karolherbst: did (if you have looked at them) the logs give anything meaningful?
06:51 karlmag: useful I mean
06:51 pmoreau: mwk: Just a reminder that there are some envytools PRs waiting for you ;-)
06:52 karolherbst: mwk: okay, updated
06:53 karolherbst: karlmag: these were at default clocks?
06:53 karlmag: yeah
06:54 karolherbst: never saw such an error
06:54 karlmag: of course I had to be the first :-P
06:54 karolherbst: well, I don't know any error
06:54 karolherbst: I bet imirkin saw something like that :D
06:56 mwk: karolherbst: ok, approved
06:56 mwk: merge it :)
07:48 kdub: if "get-edid" can get the edid from my monitor, why would xrandr --props not show the edid for the monitor?
07:48 karolherbst: kdub: sometimes the driver/card fails to read the edid
07:49 kdub: karolherbst, in dmesg, I see something like: [ 323.702214] nouveau E[ DRM] DDC responded, but no EDID for VGA-
07:54 karolherbst: kdub: you could either hack up a fake edid, or try other means to check if the display does have an EDID
07:54 kdub: get-edid seemed to parse it, and so did the nvidia driver, and nouveau has in the past
07:56 karolherbst: kdub: ohh so it worked in the past, but not now?
07:56 karolherbst: then you could bisect the kernel and figure out which commit bricked it
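One of the "other means" of checking whether the display hands back a usable EDID is to validate the raw bytes directly. The sketch below checks only the 128-byte base block: the fixed 8-byte header and the rule that all 128 bytes sum to 0 modulo 256. The sysfs path mentioned in `check_connector` is an assumption; the connector name varies per machine.

```python
# Fixed 8-byte pattern that starts every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def edid_base_block_ok(blob):
    """Return True if the first 128 bytes look like a valid EDID base block."""
    if len(blob) < 128:
        return False
    block = blob[:128]
    # Header must match, and the last byte makes the block sum to 0 mod 256.
    return block[:8] == EDID_HEADER and sum(block) % 256 == 0


def check_connector(path):
    """`path` is an assumption, e.g. /sys/class/drm/card0-VGA-1/edid."""
    with open(path, "rb") as f:
        return edid_base_block_ok(f.read())
```

An empty file or a block with a bad checksum both return False, which helps tell "no EDID was read" apart from "an EDID was read but is corrupt" when combined with a length check.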
08:22 RSpliet: karolherbst: next time, first recommend him to upgrade to a 4.2 kernel or sth
08:23 karolherbst: ohh, right, totally forgot about this :/
08:23 RSpliet: no worries
08:23 RSpliet: he didn't read the wiki
08:23 RSpliet: (esp. the bit on trying an up to date software stack :-P)
08:23 karolherbst: added to dodo: create list of stuff people should try out, before I mention my stupid ideas
08:23 karolherbst: *todo
08:24 karolherbst: right
08:24 karolherbst: but maybe he is bleeding edge already :D
08:24 RSpliet: you mean this list: http://nouveau.freedesktop.org/wiki/TroubleShooting/#index1h2 ?
08:24 karolherbst: yes :D
08:25 RSpliet: I think logging format changed in the 4.2 rewrite to include the PCI bus ID
08:25 karolherbst: but I already violate several points myself :D
08:25 karolherbst: yes
08:25 pmoreau: Oh! Never thought of "troubleshooting" as two separate words!! Interesting! :D
08:27 karolherbst: :D
08:35 karolherbst: if somebody really wants to dig into nvenv, what would be the best approach to do so, just try to mmt the prop driver?
08:35 karolherbst: nvenv is done through the cuda libs though
08:36 karolherbst: *nvenc
08:37 ystreet00: nvenc is inside libnvidia-encode which is provided by the driver
08:37 karolherbst: ystreet00: and I need some cuda libraries to actually compile the ffmpeg port
08:38 ystreet00: technically i don't believe you need cuda to use nvenc other than maybe the device selection which might do weird things behind the scenes
08:39 karolherbst: I think that only a header file was used though
08:39 karolherbst: there is also this nvenc sdk
08:40 ystreet00: yea, that only contains sample programs plus a header
08:41 imirkin: karolherbst: there's a GL_NVX_something that does it too iirc
08:41 karolherbst: this is the ffmpeg code commit: https://github.com/Brainiarc7/ffmpeg_libnvenc/commit/3b5a7bdccd5ed6b4189f596549fb300e3d3fd6b1
08:42 imirkin: ... or not
08:42 karolherbst: it checks for nvEncodeAPI.h and cuda.h
08:42 imirkin: there's a NVX_nvenc_interop but no interfaces for driving it directly
08:43 ystreet00: there's also http://cgit.freedesktop.org/gstreamer/gst-plugins-bad/commit/?id=b1d13e10af26ee8a062d3a333e9a694444e804ee which essentially does a similar thing
08:43 karolherbst: ystreet00: the ffmpeg code dlopens libcuda.so
08:43 imirkin: karolherbst: chances are there's a kernel component as well btw, you'll need to both mmiotrace and mmt trace
08:44 karolherbst: yes
08:44 karolherbst: nvidia-uvm gets loaded
08:45 karolherbst: vdpau is decoding only, right?
08:45 imirkin: yes.
08:45 karolherbst: is there any universal encoding api out there yet?
08:45 imirkin: vaapi and omx
08:46 kubast2: vainfo libva info: va_openDriver() returns -1
08:46 kubast2: libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nouveau_drv_video.so
08:46 kubast2: http://dpaste.com/2ABGC9A
08:46 karolherbst: imirkin: which one is better supported by nouveau?
08:47 karolherbst: ystreet00: seems like the gst code also depends on cuda
08:47 karolherbst: and actually links against it
08:51 karlmag: imirkin: did you see my logs from this morning btw?
08:54 karlmag: imirkin: dmesg; http://pastebin.com/PwZzRimG xorg log; http://pastebin.com/FUx7TSRh
08:54 karlmag: imirkin: it was suggested you might have seen that error before
08:56 imirkin: karolherbst: neither is supported by nouveau
08:56 karolherbst: imirkin: any idea what would be easier? vaapi or openmax?
08:57 karolherbst: or what was planned to be supported anyway
08:57 imirkin: karlmag: very odd, never seen that before
08:57 imirkin: karolherbst: step 1 -- i'd make a standalone app that encodes one frame. then move up from there.
08:57 karlmag: must have been kind of a fluke too. I haven't been able to reproduce it.
08:57 imirkin: karolherbst: sorta like what i did for vp2
08:59 karolherbst: imirkin: against nvenc and check what it does?
08:59 karolherbst: I mean against the blob
08:59 imirkin: karolherbst: yeah, i ended up making my own vdpau player as well, which was a lot easier to trace
09:00 imirkin: karolherbst: but you def don't need to be worrying about omx or vaapi for now... that's like... way out
09:00 karolherbst: imirkin: well I could just hack the ffmpeg thingy which stops after one frame
09:00 karolherbst: *to let it stop
09:00 imirkin: karolherbst: that's all well and good, but eventually you'll want to do 2 frames, etc
09:00 imirkin: karolherbst: anyways, do what works for you
09:00 karolherbst: k
09:00 imirkin: i needed super-fine control because i was trying to track down exactly what frame decoding was going wrong on
09:01 imirkin: and i wanted tight control of the parameters i was feeding in
09:39 karolherbst: mupuf: iccsense->driver->pwr_get(iccsense) gives me always 0, any idea?
09:40 karolherbst: but I think I missed something in the h files you didn't add
10:51 karolherbst: mupuf: okay, fixed it
11:00 karolherbst: mupuf: I need this: https://github.com/karolherbst/nouveau/commit/0ed7a9d69643a72c7d9dd3b7178ed2d3f48acd38 I guess there is something not right with selecting the right lane :/ for me the 0 lane was the good one
11:40 m3n3chm0:nasZ
13:43 karolherbst: any other information important for benchmarking dynamic reclocking? https://gist.github.com/karolherbst/47aef5ef08d5d2c72575
13:43 imirkin_: that's a lot of frames
13:43 imirkin_: is this glxgears or something?
13:43 karolherbst: glxgears apitrace
13:44 karolherbst: I am coding the benchmark thingy in c++14 now
13:44 karolherbst: I need a bunch of crap which would be just too painful in plain C
13:44 karolherbst: like threading and regex
13:45 karolherbst: https://gist.github.com/karolherbst/00de3413c5f6953706af
13:45 karolherbst: mhh I can improve the atomic situation though
13:46 karolherbst: although it is not needed
14:13 mupuf: mwk: thanks for the review :)
14:14 skeggsb: mupuf: k1 has bar1 still, not technically a bar i guess, but still the same deal
14:14 mupuf: skeggsb: oh, cool! I need to check how to find its address then!
14:15 mupuf: must be somewhere in the DT
14:15 mwk: yeah, I also recall seeing it somewhere
14:15 mwk: IIRC hardcoded in some header file in kernel...
14:15 skeggsb: heh
14:15 skeggsb:agrees with linus
14:16 mupuf: skeggsb: ah ah, well, that's a bit nasty to tell them to just die though :D
14:17 mupuf: mwk: ack, I guess I could map bar1 based on the chipset id
14:17 skeggsb: meh, that's the sense of humour my siblings and i have with each other, it doesn't seem at all bad to me :P
14:17 mupuf: it is not very beautiful, but better than nothing
14:17 mupuf: right, we have the same kind of super dark humour
14:18 mupuf: but, when a stranger and super-known figure tells that about you, that may make you feel pretty uncomfortable
14:20 karolherbst: ohh what happened to linus again :D
14:22 karolherbst: mupuf: I started to rewrite the benchmark script in c++14 (though c++11 would be also okay, but I use regex and newer gcc is kind of required there)
14:25 pmoreau: imirkin_: Setting fixed=1 for an Instruction, means that instruction should stay no matter what, right?
14:26 imirkin_: it means a bunch of things
14:26 imirkin_: but among others, it shouldn't get DCE'd
14:26 imirkin_: i THINK it might also end up acting as a barrier, but not sure
14:26 pmoreau: Cause this one disappears in the final code: "up->mkStore(OP_STORE, stTy, static_cast<Symbol *>(sym), ptr, value)->fixed = 1u;"
14:27 mupuf: karolherbst: stop doing this and use ezbench
14:27 mupuf: if you start spending time on it, use something that is meant for this kind of work
14:27 imirkin_: pmoreau: hmmmm dunno. can i see the before/after?
14:27 mupuf: and it is going to be used for the QA on nouveau which I volunteer to do
14:29 pmoreau: imirkin_: This? https://phabricator.pmoreau.org/P47
14:30 karolherbst: mupuf: what is ezbench?
14:30 pmoreau: karolherbst: Look at his presentation at XDC
14:30 mupuf: http://cgit.freedesktop.org/~mperes/ezbench/log/
14:30 mupuf: the doc is outdated as I am hacking as fast as I can on it
14:30 pmoreau: https://www.youtube.com/watch?v=jVvvpZYemhc
14:31 mupuf: but that should provide what you need minus the continuous monitoring of the power consumption
14:31 mupuf: I have not worked on that yet
14:31 mupuf: because we have internal tools we can use for it already
14:31 imirkin_: pmoreau: yes, this...
14:31 karolherbst: ohh okay
14:31 mupuf: but I made it possible to do that semi-easily
14:32 mupuf: you will have to create a profile and change the run command to insert any kind of monitoring
14:32 imirkin_: pmoreau: there's probably a pass that notices that nothing ever reads from l[0x50] so it removes the store
14:32 imirkin_: pmoreau: lmem is not observable outside of the invocation of that shader
14:32 mupuf: sorry, xorg board of directors meeting, will tell you more when I can
14:32 imirkin_: pmoreau: you can remove MemoryOpt and that will probably make it better
14:32 pmoreau: imirkin_: Oh, oops!
14:32 imirkin_: pmoreau: but really you should be storing stuff to s[] or g[]
14:33 imirkin_: (s = shared by invocations, g = global memory)
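The reasoning above (a store to l[] that nothing reads back can be dropped, because lmem is not observable outside the invocation) can be illustrated with a toy dead-store pass. This is not nouveau's MemoryOpt or any real codegen pass: the `Instr` type and the straight-line instruction list are made up for the example, and control flow, access sizes and indirect addressing are ignored.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str     # "store" or "load"
    space: str  # "l" (local), "s" (shared) or "g" (global)
    addr: int


def remove_dead_local_stores(instrs):
    """Drop stores to local memory that no later load reads.

    Stores to s[] and g[] are always kept, since other invocations or the
    host can observe them.
    """
    kept = []
    read_later = set()  # local addresses loaded by a later instruction
    for ins in reversed(instrs):
        if ins.op == "load" and ins.space == "l":
            read_later.add(ins.addr)
        elif ins.op == "store" and ins.space == "l":
            if ins.addr not in read_later:
                continue  # nothing observes this store, so drop it
            # This store satisfies the pending load; earlier stores to the
            # same address are overwritten before anyone reads them.
            read_later.discard(ins.addr)
        kept.append(ins)
    kept.reverse()
    return kept
```

In this model a lone `store l[0x50]` vanishes, while the same store redirected to `s[]` or `g[]` survives, which matches the suggestion to store to shared or global memory instead.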
14:33 pmoreau: was going to ask for s
14:33 pmoreau: but figured it out
14:33 pmoreau: Ok
14:33 pmoreau: Yeah, need to change that
14:35 pmoreau: But, it would be nice if the fixed was respected :-)
14:35 imirkin_: it usually is
14:35 imirkin_: it'd be nice if all these properties had clearly defined semantics
14:35 imirkin_: but... they don't
14:35 imirkin_: s/defined/documented/
14:35 pmoreau: :-D
14:36 pmoreau: Is there any property for providing an alignment? https://www.khronos.org/registry/spir-v/specs/1.0/SPIRV.html#Memory%20Access
14:37 imirkin_: don't think so
14:37 imirkin_: that could be useful for combining memory accesses
14:38 pmoreau: Yeah, or just if I want to support the related SPIR-V flag :D
14:39 imirkin_: you can just ignore it
14:39 pmoreau: Which I guess is there to combine memory accesses :-)
14:39 pmoreau: True
14:39 pmoreau: And I will do that for now
14:39 imirkin_: but it's useful for e.g. intel cpu's which have aligned and unaligned sse ops
14:39 imirkin_: where presumably there's some advantage to using the aligned ones
15:16 mupuf: mwk: will have a look again at the patches for the jetson during the week end
15:17 karolherbst: mupuf: does ezbench also manage those LIBGL_DRIVERS_PATH thingies or do I have to set them in the config files?
15:18 mupuf: why do you use this and not LD_LIBRARY_PATH?
15:18 mupuf: you need to set it in the profile's run function
15:18 karolherbst: k
15:18 mupuf: actually, no
15:18 mupuf: you need to set it in the ezbench_pre_hook
15:18 karolherbst: I was thinking that when you point to a mesa git repository, you could actually manage an installation also in ezbench or would it be too messy?
15:19 mupuf: nope, this is in the todo list
15:19 mupuf: having a script that can set up the graphics stack from git ... entirely
15:19 karolherbst: okay
15:20 mupuf: actually, there is already something called ghbuild or something like that, that we can use
15:21 mupuf: but I just need to set sane default values :D
15:21 karolherbst: why not adding a drivers list field
15:21 mupuf: and then open source the file that we have been using
15:21 karolherbst: and manage options depending on this
15:21 mupuf: what kind of options?
15:21 mupuf: do you mean the build?
15:22 karolherbst: for mesa
15:22 karolherbst: like if there is no gpu with gallium support selected, don't build gallium
15:22 mupuf: the build system of mesa already handles this
15:22 karolherbst: okay
15:24 karolherbst: mupuf: did you get my comment about your iccsense patch?
15:24 mupuf: I guess I need to scroll up!
15:25 mupuf: ah
15:25 mupuf: is it your nve6?
15:26 karolherbst: yes
15:26 mupuf: power rail 0: extdev_id/power_rail = 1, shunt resistor = 5 mOhm
15:26 mupuf: power rail 1: extdev_id/power_rail = 2, shunt resistor = 0 mOhm
15:26 mupuf: power rail 2: extdev_id/power_rail = 0, shunt resistor = 0 mOhm
15:27 karolherbst: https://gist.github.com/karolherbst/e451f50cdf51691d6ac3 :)
15:27 mupuf: well, looks good to me
15:27 karolherbst: yeah
15:27 karolherbst: but I think your patch selected the second rail
15:27 karolherbst: not the first one
15:27 karolherbst: don't know why though
15:27 mupuf: I do not select a rail, I use all of them
15:27 karolherbst: weird
15:27 karolherbst: you skip those with mohm == 0 though
15:27 mupuf: can you boot nouveau with debug="iccsense=debug"?
15:28 mupuf: yes, indeed
15:28 karolherbst: wait, something is using my nouveau card :O
15:29 mupuf: yes, you crashed nouveau :D mouahahahah
15:29 karolherbst: nope
15:29 karolherbst: stupid glretrace
15:29 karolherbst: I messed this up somehow
15:29 mupuf: lol
15:30 karolherbst: [47795.183093] nouveau 0000:01:00.0: iccsense: rail[1] = 5 mOhm
15:32 karolherbst: I will add some printk where I set shunt = 5
15:34 mupuf: rail 1, that's your problem sir!
15:34 mupuf: :D
15:34 karolherbst: id: 0 shunt: 0 vshunt: 616 vbus: 18952
15:34 karolherbst: id: 1 shunt: 5 vshunt: 0 vbus: 0
15:34 karolherbst: :D
15:34 karolherbst: mupuf: I think you don't explicitly order them
15:34 karolherbst: but just say: top is 0, second is 1...
15:35 karolherbst: or something like that
15:35 mupuf: no, I am reading what the bios tells me
15:35 karolherbst: for completeness: id: 2 shunt: 0 vshunt: 0 vbus: 0
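The three readings above show why the reported power stays at 0 when the shunt resistor is attached to the wrong rail id. A minimal sketch of the arithmetic follows, with assumed units (shunt in mOhm, shunt voltage in uV, bus voltage in mV); the log does not state the units and the real driver code may compute this differently. Rails with a 0 mOhm shunt are skipped, as described in the discussion.

```python
def rail_power_mw(shunt_mohm, vshunt_uv, vbus_mv):
    """Power on one rail in mW, or None if the rail has no shunt resistor.

    Units are assumptions: mOhm, uV and mV respectively.
    """
    if shunt_mohm == 0:
        return None  # rail skipped: no shunt resistor configured
    current_ma = vshunt_uv / shunt_mohm   # uV / mOhm = mA
    return current_ma * vbus_mv / 1000.0  # mA * mV = uW, converted to mW


def total_power_mw(rails):
    """Sum the power of all rails that have a shunt resistor."""
    powers = [rail_power_mw(*rail) for rail in rails]
    return sum(p for p in powers if p is not None)


# (shunt, vshunt, vbus) per rail id, exactly as printed in the log:
as_reported = [(0, 616, 18952), (5, 0, 0), (0, 0, 0)]
# Hypothetical fix: the 5 mOhm shunt belongs to the rail that has readings.
shunt_on_rail0 = [(5, 616, 18952), (0, 0, 0), (0, 0, 0)]
```

With the values as reported, the only rail that has a shunt reads zero, so `total_power_mw(as_reported)` is 0. Moving the shunt to rail 0 gives a non-zero result under the assumed units.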
15:37 mupuf: so, why the heck is the vbios telling me this is rail 1 then 2
15:37 mupuf: maybe it is not the rail number
15:37 mupuf: and when using the ina3221 mode, we should just read them in order
15:38 karolherbst: maybe it is the extdev_id?
15:38 karolherbst: because that would fit in my case
15:38 karolherbst: extdev_id = 1, and the ina also has id: 1
15:40 mupuf: yeah but the second is 2 :D
15:41 mupuf: and it has no shunt resistor
15:41 RSpliet: can the INA3221 not measure multiple rails?
15:41 mupuf: it can
15:41 mupuf: up to 3
15:41 mupuf: the more expensive cards use 3 INA219
15:41 karolherbst: maybe there is only one on my card
15:42 mupuf: and in this case, the extdev_id actually tells which INA219 is connected to what shunt resistor
15:42 mupuf: because yes, some cards have different values for different rails
15:42 imirkin_: presumably all 3 have the same shunt resistor...
15:43 karolherbst: anyway, the other rails don't give any data, at least on idle, and I doubt they give any on high load
15:43 imirkin_: (for ina3221)
15:43 karolherbst: the reads are all 0
15:43 mupuf: imirkin_: nope, not as far as I remember
15:43 mupuf: let me check again
15:44 imirkin_: or perhaps 0 == "nothing's hooked up, go away"
15:45 mupuf: pecisk's nve4 has different values
15:45 mupuf: but it is likely INA219
15:45 karolherbst: mupuf: highest measured value is 58W
15:46 mupuf: mlankhorst's nvc8 too, but this is again likely an INA219
15:46 mupuf: WTH! Tobias's nve7 has 7 rails :o
15:46 mupuf: 5
15:47 karolherbst: :O
15:48 mupuf: pecisk's nv117 is even funnier
15:48 karolherbst: :D
15:48 karolherbst: fun
15:48 mupuf: 128, 16, 23, 10, 5 mOhm
15:49 karolherbst: and no extdev
15:49 mupuf: yeah, I could have guessed that
15:49 mupuf: karolherbst: by the way
15:49 mupuf: find . -name vbios.rom -exec scripts/nvbios_and_grep.sh {} "power rail " \; 2> /dev/null | less -r
15:49 mupuf: that is quite handy to have a look at vbioses :D
15:50 mupuf:is lazy like you have no idea
15:50 karolherbst: you have no idea
15:51 karolherbst: :p
15:51 karolherbst: wait, there is a scripts/nvbios_and_grep.sh script?
15:51 mupuf: yes
15:51 mupuf: in the vbios repo
15:52 mupuf: so as it is super easy to check for instance what cards have an INA3221
15:52 mupuf: and it then prints the address of the vbios for it
15:54 karolherbst: there are really only a handful of cards with those sensors
15:54 karolherbst: ohh 15 in total
15:54 karolherbst: calim has three extdev entries for the ina219
15:54 karolherbst: mupuf: check pecisk nve4
15:55 karolherbst: 1 ina3221 and 2 ina219
15:55 karolherbst: :D
15:55 mupuf: Oh my!
15:55 mupuf: WHY!?
15:56 karolherbst: there are also some with a MAX6649 and INA3221
15:57 mupuf: those are unrelated
15:57 karolherbst: k
15:58 karolherbst: mupuf: ohhh, your patch only supports one sensor :/
15:58 mupuf: yes, one thing at a time
15:58 karolherbst: at the same time I mean
15:58 karolherbst: mhhh
15:58 mupuf: but yes
15:58 mupuf: I do not know how I will handle the two of them at the same time
15:59 mupuf: that indeed changes the design
15:59 mupuf: wtf were the people thinking...
15:59 mupuf: INA3221 can do 3 lanes already, why buy 2 more INA219?
15:59 karolherbst: for fun
16:03 mupuf: well, I guess I will need to change the interface a little then!
16:03 mupuf: so as a driver would be queried for one lane at a time
16:03 mupuf: and move the general polling code to base.c
16:03 mupuf: Which I would likely have done anyway
16:04 mupuf: oh dear, it is going to be hilarious!
16:05 karolherbst: :)
16:05 karolherbst: be lucky, that we know this now
16:05 karolherbst: and not later
16:09 mupuf: yes, I guess :D
16:09 mupuf: the problem is the mapping between the driver and the lanes of the INA3221
16:10 mupuf: err, not the driver, I meant the rails table
16:20 mupuf: karolherbst: so, the weird card is my nve6
16:20 karolherbst: mupuf: I don't know
16:20 mupuf: all the others have 1,2,3 instead of 0,1,2
16:20 karolherbst: check pecisks nve4
16:21 mupuf: yeah, I checked it
16:21 mupuf: I guess I need to reverse the other fields!
16:22 mupuf: it is a tad hard since the blob does not really act upon the power usage except for power capping
16:25 karolherbst: mupuf: quadro cards maybe?
16:26 karolherbst: nvidia-smi should display the power usage there
16:26 mupuf: as if I had those
16:26 mupuf: :D
16:26 mupuf: I do have 2 though :D
16:26 karolherbst: :D
16:26 mupuf: but they are old
16:26 karolherbst: ohh
16:26 karolherbst: maybe there is a way to trick nvidia-smi
16:27 mupuf: enough people have been doing it :D
16:27 karolherbst: mhh how? :D
16:27 mupuf: maybe there is a trick to read it back from pdaemon
16:27 karolherbst: I really would like to know
16:27 mupuf: oh, it does not work anymore
16:27 mupuf: but it used to
16:27 mupuf: first, they started flashing different vbioses
16:28 karolherbst: yeah well, I would rather not
16:28 mupuf: then they were also changing the value of some strap register
16:28 mupuf: but since then, they introduced the fuses inside the chips
16:28 mupuf: they write some values there and then blow the write fuse
16:29 mupuf: you can only read stuff from there ;p
16:29 mupuf: and it contains important stuff
16:29 mupuf: and the card can react differently depending on it
16:29 mupuf: for instance, there is a method that the blob inserts in the command stream that is a nop on quadros
16:29 mupuf: and a sleep on others
16:30 karolherbst: :/
16:30 karolherbst: lol
16:30 mupuf: mwk had been wondering about the role of this instruction until nvidia released a version of the blob with the debug symbols on
16:30 karolherbst: and there I only want a useful nvidia-smi for my card :D
16:30 mupuf: and the name was explicit-enough :D
16:30 mupuf: right
16:31 mupuf: short of changing the PCIID, I think you are screwed for this sw feature
16:32 karolherbst: and there every windows tool can like show you the voltage
16:32 karolherbst: this is just stupid from nvidia
16:35 karolherbst: mupuf: but hey, maybe nvidia changes their mind, when nouveau actually starts to be superior in some parts :D
16:36 mupuf: they will not change their mind
16:36 mupuf: they will contribute where their customers want them to
16:37 karolherbst: I see
16:37 mupuf: and help us as time permits
16:37 karolherbst: I am a customer and I want a working nvidia-smi :p
16:37 mupuf: which is much much much better than it used to be :D
16:37 mupuf: show them the money :D
16:38 karolherbst: :D
16:38 mupuf: you have got to be pretty big to be able to ask for changes in the driver :D
16:38 karolherbst: my gpu isn't exactly one of the cheap ones though :p
16:38 karolherbst: :D
16:38 imirkin_: yeah it is
16:38 imirkin_: you probably paid less than $2k for it
16:38 karolherbst: yeah well, not exactly that much
16:39 imirkin_: example of a non-cheap gpu: http://www.amazon.com/NVIDIA-Tesla-Graphic-Card-900-22081-2250-000/dp/B00KDRRTB8
16:39 imirkin_: which are bought by the rack-full
16:39 karolherbst: wow
16:39 karolherbst: mine is around 400€
16:40 karolherbst: but I heard like you can do some mods to a high end kepler and get a quadro :p
16:40 imirkin_: when you buy 10k of those, nvidia will listen to you
16:40 karolherbst: :(
16:40 imirkin_: (maybe)
16:41 mupuf: yeah, maybe
16:41 mupuf: if it requires a week of work for a few engineers, then they may do it :p
16:42 karolherbst: I bet it cost nvidia more money to actually let nvidia-smi not work for all gpus than to just support it :p
16:42 karlmag: hmm.. if I ever want one of those I think I'll actually order it from amazon... it cost about twice that locally
16:43 imirkin_: karlmag: well, you probably won't show up to your local store and buy 10k units
16:43 karlmag: uhm... if I were to buy that many I would ask nvidia directly, for sure
16:43 karolherbst: I will be off now anyway :p
16:44 karolherbst: need some sleep and stuff
16:44 karlmag: sleep?
16:44 imirkin_: and i guess this is the pricier one -- http://www.newegg.com/Product/Product.aspx?Item=N82E16814132041
16:44 karlmag: hmm.. I should try that thing too :-P