03:15 mupuf: ok, I confirm that karol really found the PWM voltage regulator
03:15 mupuf: now let's try to RE the vbios
04:53 mupuf: So, by faking the value of performance counters on my maxwell, I managed to generate those PWM values for the voltage: http://pastebin.com/q4D4ABqG
04:54 mupuf: since my voltage table contains only 8 entries and forcing it to 1 does not change the behaviour of the blob (I made sure the blob sees my changes by forcing the fan speed to a higher-than-normal value)
04:55 mupuf: then, it means that either there is a new table for voltage or the voltage map table contains the formula or PWM value for this
04:55 mupuf: given that maxwell introduced a separate table for the fan management
04:55 mupuf: I would say they did the same thing for the PWM, so let's look for it!
07:25 RSpliet: there are still approximately 5000 tables to choose from ;-)
08:19 karolherbst: mupuf: thanks for the confirmation :)
08:19 karolherbst: for your interest: blob sets it to 3d on 0f, 2d on 0a and 26 on 07
08:20 karolherbst: "since my voltage table contains only 8 entries" wait
08:20 karolherbst: try this patch: https://github.com/karolherbst/envytools/commit/f684eadaa33eb8d4677f5900b80db0e1e7788aec
08:22 karolherbst: I get like 55 entries for your nv177
08:22 karolherbst: *117
08:23 karolherbst: actually 56
08:23 karolherbst: 8 + 16*3 (3 is in the high bits)
08:25 karolherbst: maybe I just do a PR for this and we discuss the changes there
08:46 gnurou: imirkin: how long have you been waiting?
08:51 imirkin: gnurou: i dunno, a month or two. i haven't really been keeping track
08:52 karolherbst: waiting for what?
08:52 imirkin: karolherbst: answers
08:52 karolherbst: I see
08:52 imirkin: they have a list where you can ask questions
08:52 imirkin: and they don't provide answers
08:52 imirkin: it's great :)
08:52 karolherbst: makes sense
08:53 karolherbst: maybe you should ask different questions
08:53 imirkin: it's largely /dev/null, but every so often, something pops out
08:53 karolherbst: like how is the Weather
08:53 imirkin: like ftp://download.nvidia.com/open-gpu-doc/Shader-Program-Header/1/Shader-Program-Header.html
08:53 imirkin: which actually included a lot more information than i had asked for. but then i sent follow-up questions that were never answered.
08:55 imirkin: [and there have been a ton of other unanswered questions]
08:56 karolherbst: mhh
08:56 karolherbst: I think I can understand whats going on, but....
08:57 karolherbst: they should be more clear about the procedure
08:57 karolherbst: they have to ask first what they can tell us and most likely it gets stuck somewhere if its not clear
08:57 karolherbst: or they just ignore question which they are not allowed to answer and discourage you to ask different one
08:57 karolherbst: s
09:01 imirkin: the procedure stinks, if that's what you're saying
09:01 karolherbst: :D
09:01 imirkin: but it's not like there are people dedicated to dealing with my idiot questions
09:01 karolherbst: maybe we should just tell them that denying an answer is better than no reply
09:01 imirkin: it would all work much better as a bugtracker
09:12 karolherbst: imirkin: btw on too many chipsets the pci link width isn't parsed at all
09:13 karolherbst: so I need some kind of value to ignore that in my interface
09:13 karolherbst: 0xff or 0x0 ?
09:13 imirkin: just say if it's == 0 or > 16, then leave alone?
09:13 imirkin: or set to 16
09:13 imirkin: i dunno
09:17 karolherbst: mhh
09:17 karolherbst: setting to 16 isn't right
09:17 karolherbst: think about those x8 cards
09:18 karolherbst: I think I will just leave the value if its not 1,2,4,8 or 16
09:18 karolherbst: and print an error
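The rule karolherbst settles on above (accept only 1, 2, 4, 8 or 16, otherwise leave the value alone and print an error) can be sketched as follows. This is a minimal illustration, not nouveau code; the function name and the constant are hypothetical.

```python
import sys

# Lane counts accepted as plausible, as listed in the discussion above.
VALID_LINK_WIDTHS = (1, 2, 4, 8, 16)

def sanitize_link_width(parsed, current):
    """Return `parsed` if it is a plausible lane count, else keep `current`.

    `parsed` stands for whatever came out of the vbios (e.g. 0 or 0xff when
    the field is not parsed for that chipset); `current` is the width the
    link is already running at.
    """
    if parsed in VALID_LINK_WIDTHS:
        return parsed
    print(f"error: ignoring implausible PCIe link width {parsed}", file=sys.stderr)
    return current
```

With this shape no sentinel such as 0xff or 0x0 is needed: anything outside the list is treated as "not parsed".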
10:49 karolherbst: nvbios_perfEp is the right function for parsing the pstates right?
10:51 imirkin_: it's def not the *wrong* function
10:51 karolherbst: ohhh
10:51 imirkin_: if it's not right, it's something very much like it
10:51 karolherbst: right for ver 0x40 the clocks are in the cstates right?
10:51 imirkin_: i'm always confused by ben's naming convention for those
10:52 imirkin_: i think that means perf entry pointer
10:52 karolherbst: I was worried, because there was only "info->voltage = nvbios_rd08(bios, perf + 0x02);" for 0x40, so for my card
10:52 karolherbst: but there are now like dozens cstates with all the clock info
10:53 imirkin_: i really dunno, sorry
10:53 karolherbst: its fine, I am pretty sure its that way, I was just worried
10:56 karolherbst: mhhh
10:56 karolherbst: this switch: https://github.com/envytools/envytools/blob/master/nvbios/nvbios.c#L1529-L1540
10:56 karolherbst: case 3 and 1 are 5.0?
10:56 karolherbst: I mean 2.5
10:57 karolherbst: this is just odd
10:57 imirkin_: oh, that's mupuf being a jerk i think
10:57 imirkin_: he added that recently i dunno why
10:57 karolherbst: it seems to be right for me
10:57 Tanji: Hi I'm looking for companies that sell data in India , can anyone help ?
10:58 imirkin_: karolherbst: https://github.com/envytools/envytools/commit/46c11dace668d150bc839be481b10f52047125fa
10:58 imirkin_: Tanji: wrong channel.
10:58 Tanji: ups
10:58 Tanji: sorry
10:58 Tanji: bye
10:58 RSpliet: awwh... I would've sold him some data if he wanted it
10:58 RSpliet: mountains of MMIOTraces to sell
10:59 karolherbst: :)
10:59 imirkin_: RSpliet: and dealt with export regulations?
10:59 karolherbst: imirkin_: it doesn't make sense before and doesn't make sense after
10:59 karolherbst: 0: 5.0, 1: 2.5 2: 8.0 3: 2.5 what the hack
11:00 imirkin_: 3: unknown makes a lot more sense :)
11:00 RSpliet: or rather "3: rfu"
11:00 karolherbst: no clue
11:00 karolherbst: this sounds like a wtf moment for me if I put it like that into nouveau without a comment
11:01 karolherbst: comment: "mupuf said it was right that way" seems to be enough I guess
11:01 RSpliet: trust him, he's an engineer
11:01 karolherbst: :)
11:02 imirkin_: just ask this scientician
11:02 RSpliet: it's all scientology!
11:03 imirkin_: https://vid.me/goSn
11:08 karolherbst: nice, parsing seems to work nicely here
11:09 karolherbst: but
11:09 karolherbst: well
11:09 karolherbst: no its fine, nice
11:11 karolherbst: any idea on which card the pstate table has version 0x35 and 0x30?
11:11 RSpliet: I think that's pre-Tesla
11:11 imirkin_: try nv40?
11:12 imirkin_: nv4x rather
11:12 karolherbst: ohh okay
11:12 karolherbst: so pciev1
11:12 imirkin_: or not pcie at all :)
11:12 RSpliet: AGP with PCIe bridge I think
11:12 imirkin_: nv4x was native pcie
11:12 imirkin_: but nv3x was agp or bridge
11:12 karolherbst: mhh
11:12 RSpliet: was it? ah right, explains why it was so hard to get an AGP NV4x
11:12 imirkin_: yeah, just the nv4a (aka nv44a)
11:13 RSpliet: I have an AGP NV4B
11:13 imirkin_: i guess they made more
11:13 imirkin_: but nv4a was agp-only
11:13 karolherbst: will ignore the width for now until somebody manages to parse it right out of the bios
11:13 RSpliet: this one probably has a bridge
11:14 karolherbst: I am not that thrilled about that part: https://github.com/envytools/envytools/blob/master/nvbios/nvbios.c#L1451-L1456
11:14 RSpliet: then fix it
11:15 karolherbst: will implement link speed first, because this is somehow kind of important, width change, meh
11:15 karolherbst: toyed around with tobijk card and the card really didn't like to change width
11:17 RSpliet: well, yes, what good does it do doubling the width of your bus... it's as if that magically doubles your bandwidth
11:18 RSpliet: mind you, you might not reach 5GT/s without changing your link width to x16
11:19 karolherbst: there are x8 only cards
11:20 imirkin_: RSpliet: the two are not connected
11:20 imirkin_: at least afaik
11:20 karolherbst: also this is per lane
11:21 karolherbst: its like the commands per second
11:21 karolherbst: its more complicated
11:22 imirkin_: i don't think you can have diff lanes with diff transfer rates... maybe the standard allows it, but the hw probably doesn't
11:25 karolherbst: ??
11:25 karolherbst: one lane has a speed, which can be changed
11:25 karolherbst: like v1: 2.5GT/s
11:25 karolherbst: v2: 5.0GT/s
11:25 karolherbst: v3: 8.0GT/s
11:26 imirkin_: but say you have 16 lanes
11:26 karolherbst: yeah
11:26 imirkin_: i doubt you can configure one to have one speed, and one to have another
11:26 karolherbst: then the slot has 2.5 GT/s * 16
11:26 imirkin_: also note that v2 also supports 2.5
11:26 imirkin_: as does v3
11:26 imirkin_: they just enable higher speeds
11:26 karolherbst: yeah right
11:27 karolherbst: I meant the speed is per lane
11:27 karolherbst: not in total
11:27 karolherbst: " mind you, you might not reach 5GT/s without changing your link width to x16"
11:27 karolherbst: this makes only sense if this speed is per slot
11:27 karolherbst: but its not
11:29 karolherbst: RSpliet: with x16 you have then like 128 GT/s in total
11:29 karolherbst: 126.032 Gbit/s data transfer
11:30 imirkin_: with perfect timing :)
11:30 karolherbst: yeah well
11:30 karolherbst: its worse with v1 and v2
11:30 imirkin_: 100Gbit Ethernet should be making an appearance anytime now
11:30 karolherbst: they have 20% protocol overhead
11:30 imirkin_: not on pcie v3
11:30 karolherbst: no
11:30 karolherbst: there its like below 2%
11:31 imirkin_: 128/130 encoding, as opposed to 8/10
11:31 karolherbst: yeah
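The arithmetic in this exchange can be written out as below: a per-lane rate, multiplied by the lane count, scaled by the line-encoding efficiency (8/10 for v1 and v2, 128/130 for v3). It is a sketch of the numbers quoted above only; the table and function names are made up for illustration, and packet-level protocol overhead is ignored.

```python
# Per-lane raw rate in GT/s and line-encoding efficiency per PCIe generation,
# using the figures quoted in the conversation.
PCIE_GEN = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
}

def link_bandwidth_gbit(gen, lanes):
    """Usable data rate in Gbit/s after line encoding, for a whole link."""
    rate_gt, efficiency = PCIE_GEN[gen]
    return rate_gt * lanes * efficiency

# v3 x16: 8.0 * 16 = 128 GT/s raw, roughly 126.03 Gbit/s after 128/130 encoding
print(round(link_bandwidth_gbit(3, 16), 2))
```

The 8/10 encoding is where the 20% overhead for v1 and v2 comes from, while 128/130 costs about 1.5%.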
11:55 karolherbst: does anybody of you have a x8 kepler?
11:55 imirkin_: right here
11:55 imirkin_: GK208 x8
11:55 karolherbst: need some reg values
11:55 imirkin_: ask and ye shall receive
11:55 karolherbst: 0x088610 and 0x088150
11:56 imirkin_: karolherbst: http://hastebin.com/izinakegej.md -- enjoy
11:56 karolherbst: 0x08c040 was like 40489000 for you?
11:57 karolherbst: lol thanks
11:57 imirkin_: 0008c040: 40489000
11:57 karolherbst: would be nice to have the 0x08c000 range too
11:57 imirkin_: http://hastebin.com/mipohukiyo.sm
11:57 karolherbst: thanks a lot
11:58 karolherbst: I want to check where kepler reports the system limits nicely
11:58 imirkin_: 8c080 probably?
11:58 imirkin_: tbh i'm not sure if it's an x8 slot or just an x8 card...
11:58 imirkin_: let's see if dmi will tell me...
11:58 karolherbst: maybe 0x02241c ?
11:58 karolherbst: ohh no
11:58 karolherbst: this is the width and pci version
11:59 imirkin_: dmi seems to think it's an x16 slot
11:59 imirkin_: which means it's an x8 card
11:59 karolherbst: k
11:59 karolherbst: sadly 0x088460 is gone on kepler
12:00 imirkin_: hmmm
12:00 imirkin_: 8c460 is there though
12:00 imirkin_: and i note that it has a 10 08 in there
12:00 imirkin_: what does 8c460 say for you?
12:02 karolherbst: don't have it
12:02 imirkin_: you mean it says 0?
12:02 karolherbst: yeah
12:02 imirkin_: really? that's surprising =/
12:02 karolherbst: why?
12:02 imirkin_: because it's non-0 for me
12:02 karolherbst: you chipset is newer, isn't it?
12:02 karolherbst: *your
12:02 imirkin_: ya
12:03 imirkin_: but they don't really change this stuff around unless they have to
12:03 karolherbst: I have a bunch of possible candidates
12:03 imirkin_: for kepler they had to (for v3 support)
12:03 imirkin_: but changing your pcie ip isn't exactly a high-priority item... it's the sort of thing that just sorta works. until it doesn't.
12:04 karolherbst: mhh
12:04 karolherbst: who knows
12:05 karolherbst: mhh 0x08841c
12:05 karolherbst: that's an odd one
12:06 karolherbst: I don't think it's useful for us, but: "GEN2_DIS = 0"
12:06 imirkin_: just incrementing numbers
12:06 imirkin_: er, 88, not 8c
12:08 karolherbst: maybe 0x0880a8?
12:08 karolherbst: its 0x1f0003 for me
12:09 imirkin_: search me :)
12:09 karolherbst: ohh this reg changes later
12:09 karolherbst: not useful then
12:09 karolherbst: I think we won't find it inside 0x88xxx
12:10 karolherbst: yeah 0x8cxxx looks better
12:10 imirkin_: 8c080 shows the current state
12:10 imirkin_: and perhaps the card width
12:10 imirkin_: but try as i might i couldn't get it to switch lane widths
12:10 imirkin_: (by writing stuff to it)
12:10 karolherbst: ohh right there is also 0x08c2c0
12:11 imirkin_: that has more bits
12:11 imirkin_: ;)
12:11 karolherbst: 060001b2 for me
12:11 imirkin_: 060b31b2 for me
12:11 imirkin_: and b3 == 10110011
12:12 karolherbst: mhhh
12:12 karolherbst: still
12:12 imirkin_: i don't see anything too useful in that bitpattern
12:13 karolherbst: maybe 0x08814c?
12:13 karolherbst: 0xb00001b for me
12:13 imirkin_: hmmm i have an extra 8
12:14 karolherbst: yeah
12:14 karolherbst: it could be something that is 0 for me
12:14 karolherbst: like a limiter
12:14 karolherbst: 0 = unlimited
12:14 karolherbst: 8 = limited to x8?
12:15 imirkin_: heh
12:15 imirkin_: nice theory :)
12:17 karolherbst: 0x088830
12:17 karolherbst: 00000021 for me
12:17 imirkin_: and i have an extra pair of bits...
12:17 imirkin_: could be 2 5-bit values adjacent
12:17 imirkin_: which for you are 1/1 and for me are 3/3
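imirkin_'s guess about 0x088830 (two adjacent 5-bit fields) can be checked with a small decoder. This is only a sketch of that hypothesis; the field layout is a guess from the conversation, not a documented register format, and 0x63 below is simply what "3/3" would encode to under that guess (imirkin_'s raw value is not quoted in the log).

```python
def split_5bit_pair(value):
    """Split a register value into two adjacent 5-bit fields (high, low).

    Assumes the layout guessed above: bits 0-4 hold one field and
    bits 5-9 hold the other.
    """
    low = value & 0x1F
    high = (value >> 5) & 0x1F
    return high, low

print(split_5bit_pair(0x21))  # karolherbst's value: (1, 1)
print(split_5bit_pair(0x63))  # hypothetical value that would decode to (3, 3)
```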
12:18 karolherbst: why I don't just diff the outputs
12:18 karolherbst: ...
12:19 imirkin_:assumed that's what you were doing
12:19 karolherbst: I was going through my trace and looking for something the blob reads
12:19 karolherbst: with a value that may tell us something
12:20 karolherbst: ohhhh
12:20 karolherbst: 0x88088 ?
12:20 imirkin_: you mean 8808c?
12:20 karolherbst: 11010140 for me
12:20 imirkin_: er
12:20 imirkin_: no, 88088
12:20 imirkin_: 10810143
12:21 karolherbst: this looks somehow like the tesla/fermi reg
12:21 imirkin_: yea
12:21 karolherbst: maybe its there too?
12:22 imirkin_: don't we already know how to do it on tesla/fermi?
12:22 karolherbst: yeah, oh well
12:22 karolherbst: about 0x88088 its already in rnndb
12:23 karolherbst: it just tells the current width
12:23 karolherbst: link status
12:23 karolherbst: and some other stuff
12:25 karolherbst: ohh right, 0008c080
12:26 karolherbst: 00001010 for me
12:26 karolherbst: that would be the most obvious one
12:27 karolherbst: but can't see the blob using it
12:27 imirkin_: and i had no luck trying to change it
12:29 karolherbst: mhh
12:29 karolherbst: one thing I would like to try out
12:30 karolherbst: put a kepler card into main slot, read values, put into another slot or add other cards so it drops to x8
12:30 karolherbst: and read regs again
12:32 karolherbst: anyway, I think I will implement the reads through the linux API first anyway, because that's like usable on all cards
12:32 karolherbst: if it works
13:07 tobijk: RSpliet: my card actually boots into x16, but changing it to x8 or anything else lets it drop to x1, and if you want to go back to x16/x8 it just hangs the system
13:35 karolherbst: imirkin_: if I want to change the pstate nvif interface, do I have to add a v1 version of that and keep the old?
13:36 imirkin_: check with skeggsb
13:36 imirkin_: he just changed the whole thing around without bumping anything
13:42 karolherbst: mhh, I think the interfaces stayed the same though (for the pstate sysfs file)
13:42 karolherbst: I would like to add the pcie information there as well
13:42 imirkin_: no, he also changed the actual interfaces too
13:42 imirkin_: see the last cmomit
13:42 imirkin_: commit*
14:12 mupuf: Ah ah
14:14 karolherbst: mupuf: tried the patch?
14:15 mupuf: nope, been trying to understand the voltage part for the bios
14:15 mupuf: when using PWM-based voltage management
14:15 karolherbst: imirkin_: yeah okay, he changed internal stuff, but I think the pstate sysfs file still works the same and the used interface, too
14:16 karolherbst: yeah, but you said you had only a handful of voltage entries
14:16 karolherbst: you should have much more
14:17 mupuf: ah, right
14:17 mupuf: this patch
14:17 mupuf: let's try this one
14:17 karolherbst: with that I got like 55
14:18 karolherbst: but I think there are still more
14:19 mupuf: Yeah, this definitely seems to be true that there are more
14:20 karolherbst: also "-- Maximum voltage 600000 µV, voltage step 10176 µV, Maximum voltage to be used 1200000 µV --"
14:20 karolherbst: but last one is like "-- Vid 55, voltage 1159680 µV/0 µV --"
14:22 karolherbst: but the values themselves are really strange anyway, can't make anything out of them
14:23 karolherbst: but check that: every cstep has a voltage entry, which maps to one in the "Voltage map table"
14:27 stole: I don't really *get* tearing, or why it happens on my computer. If tearing "comes and goes", what usually is the problem? Just now I can scroll down a heavy page in my browser and get zero tearing, but look away and come back- and there it is.
14:27 stole: Tearing that "comes and goes", is that symptomatic of something in particular?
14:28 karolherbst: stole: what window manager are you using?
14:28 karolherbst: some window managers try to be smart and disable tearing prevention under "heavy" load
14:29 karolherbst: like for kwin I explicitly enable "Full scene repaints" and the problem is gone
14:29 stole: aha
14:29 karolherbst: so are you on kde or using kwin?
14:30 stole: stole: compiz, and xfce4 . I haven't tried kwin, but I've tried no window manager, I've tried xfce4's wm, also compiz as window manager+emerald and also xfce4+compton
14:31 karolherbst: mhhh
14:31 karolherbst: xfwm also has such options somewhere
14:31 imirkin_: stole: are you using a recent kernel and xf86-video-nouveau?
14:31 imirkin_: there were a bunch of vsync fixes around 3.16 or 3.17 or so
14:32 imirkin_: as well as in xf86-video-nouveau 1.0.10 or 11
14:33 stole: I'm using 3.18.12, and xf86-video-nouveau 1.0.11
14:33 imirkin_: that should have all the stuff i'm thinking of
14:33 karolherbst: stole: so currently you use compiz?
14:33 karolherbst: mhhh
14:34 karolherbst: compiz I would say is in a "hanging" state somehow, but there should be options for that somewhere
14:34 stole: Right now, yes. Other compositing managers didn't work with available tweaking, so I enabled compiz again.
14:34 imirkin_: stole: afaik kepler has some undiagnosed tearing issues... apparently the issue also plagues nvidia's proprietary drivers
14:34 imirkin_: so i suspect the hardware does something odd... what gpu do you have?
14:34 stole: One thing I noticed with compiz was that if I enabled some sort of plugin overlay, say the "FPS" plugin, vsync would work very well suddenly
14:35 stole: A Quadro 1000M, I might be mistaken but that's probably Fermi?
14:35 imirkin_: stole: dunno... lspci -nn -d 10de: should provide the code
14:35 imirkin_: GFxxx or GKxxx
14:36 karolherbst: mupuf: I think the pwm value is somehow calculated
14:36 stole: GF108GLM
14:36 mupuf: karolherbst: not entirely sure
14:37 imirkin_: ok. well i also have a GF108 and i've never had tearing issues. i use WindowMaker and no compositor.
14:37 imirkin_: single screen though, not sure if that affects things
14:37 mupuf: the GPIO state could also be computed but it is not
14:37 karolherbst: mhh
14:37 stole: Oh, and in xorg.conf I've set GLXVBLANK to "True" and SwapLimit to "2"
14:37 imirkin_: glxvblank should default to on now
14:37 stole: That'd enable triple buffering and vsync from what I've read
14:37 imirkin_: the triple-buffering could be causing issues
14:38 imirkin_: in that almost nobody enables it, so who knows what bugs lie there
14:38 karolherbst: mupuf: at least I didn't find anything about that in my bios
14:38 stole: Hm, I enabled it and kept it on since it usually gave me some better performance, i.e. enabling vsync in compiz reduced my framerate to a flat 30fps
14:39 stole: setting swaplimit to "2" made it run at 60
14:39 mupuf: karolherbst: let me be stupid and entirely hide the voltage table
14:39 mupuf: we'll see if we still get voltage changes!
14:39 karolherbst: yeah
14:39 mupuf: If we do (which I think), it will really point out that there is another table doing that
14:40 karolherbst: mhh
14:40 stole: imirkin_: I'm also only using a single screen. (laptop monitor). What does your xorg.conf look like? Unmodified?
14:40 karolherbst: or maybe its in the voltage map somehow
14:41 imirkin_: stole: pretty sure i don't have anything in there of note. not in front of the box to check.
14:41 mupuf: karolherbst: there are plenty of unknown tables
14:41 karolherbst: okay
14:41 mupuf: like UNK40 in the P table
14:43 karolherbst: ohh see it
14:45 karolherbst: mhh
14:45 karolherbst: just searched for the three values I am sure the blob uses
14:45 karolherbst: but mhh
14:51 mupuf: So, by faking the value of performance counters on my maxwell, I managed to generate those PWM values for the voltage: http://pastebin.com/q4D4ABqG
14:51 mupuf: since my voltage table contains only 8 entries and forcing it to 1 does not change the behaviour of the blob (I made sure the blob sees my changes by forcing the fan speed to a higher-than-normal value)
14:51 mupuf: then, it means that either there is a new table for voltage or the voltage map table contains the formula or PWM value for this
14:51 mupuf: given that maxwell introduced a separate table for the fan management
14:51 mupuf: I would say they did the same thing for the PWM, so let's look for it!
14:51 mupuf: I sent that earlier today
14:51 mupuf: when you weren't there
14:51 mupuf: there are more than 3 values :p
14:52 karolherbst: mupuf: did you try my patch for nvbios?
14:52 mupuf: yes
14:52 karolherbst: still 8?
14:52 mupuf: I see the same thing as you said
14:52 mupuf: nope
14:52 karolherbst: ahh okay
14:52 mupuf: -- Vid 55, voltage 1159680 µV/698880 µV --
14:52 mupuf: what does nouveau do?
14:53 karolherbst: the same
14:53 karolherbst: [53445.112939] nouveau 0000:01:00.0: volt: VID 00: 600000uv
14:53 karolherbst: stuff
14:53 karolherbst: [53445.112992] nouveau 0000:01:00.0: volt: VID 3a: 1190208uv
14:53 karolherbst: ohh
14:53 karolherbst: looks like generated
14:55 mupuf: also, the values you see in each entry, they seem to be 2 16bit values
14:55 karolherbst: the voltage is just increased by what is there in the head of the table anyway
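The "looks like generated" observation matches a simple linear rule: base voltage plus VID times the step from the table header. A minimal sketch using only the numbers quoted in this log (600000 µV, which behaves as the base here, and the 10176 µV step); the function and constant names are illustrative, not nouveau's.

```python
BASE_UV = 600000   # voltage printed for VID 00 in the nouveau log above
STEP_UV = 10176    # "voltage step" from the nvbios table header

def vid_to_microvolts(vid, base_uv=BASE_UV, step_uv=STEP_UV):
    """Voltage for a VID, assuming a purely linear base + vid * step table."""
    return base_uv + vid * step_uv

print(vid_to_microvolts(55))    # 1159680, the nvbios "Vid 55" line
print(vid_to_microvolts(0x3a))  # 1190208, the nouveau "VID 3a" line
```

Both quoted entries fit the formula, which is why the table itself carries little information beyond its header.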
14:55 karolherbst: I really don't see what this table is trying to tell me
14:56 karolherbst: ohhhh wait
14:57 karolherbst: nope, nothing
14:58 mupuf: AHAH, the blob does not like not having a voltage table :D
14:58 karolherbst: :D
14:59 karolherbst: the stuff after that
14:59 karolherbst: in the bios file
14:59 karolherbst: this seems like something
15:00 stole: Well, it's strange. I get perfect vsync, or as good as I've been able to get it _if_ I enable some sort of overlay, in my current case, the benchmark overlay.
15:01 imirkin_: stole: sounds like some problem with your compositor then
15:02 karolherbst: yeah
15:02 karolherbst: you could give kwin a try with "full scene repaints"; if that performs well and has no issues I would check why the others have that
15:02 stole: could be, although it's the same with or without a compositor.
15:02 stole: I'll try kwin.
15:02 stole: at some point.
15:05 karolherbst: mhhh
15:05 karolherbst: mupuf: my card is at 0x30 by default
15:05 karolherbst: without any driver loaded
15:06 imirkin_: stole: perhaps we use diff applications. i've never seen any issues on my fermi =/
15:08 karolherbst: ohhhh shit
15:08 karolherbst: mupuf: voltage depends on AC status here
15:08 mupuf: well, it means the governor is taking the fact that the laptop is on battery or AC into account
15:09 mupuf: the voltage is only tied to the frequency
15:09 mupuf: no need to have a look at what you are looking at right now
15:10 karolherbst: now this is something
15:10 karolherbst: also same clock can have different voltage depending on pstate
15:10 karolherbst: okay, let me be more explicit
15:10 karolherbst: same clock can have different pwm values depending on AC status
15:11 karolherbst: mhh
15:11 karolherbst: its getting stranger and stranger
15:11 karolherbst: 705 MHz: 0x2f
15:11 karolherbst: seconds later
15:11 karolherbst: 705MHz: 0x2e
15:12 mupuf: ok, that is an interesting result
15:13 mupuf: maybe the voltage is increased when there is a lot of variance in the power usage
15:13 karolherbst: oh wow
15:14 karolherbst: this is nice
15:14 karolherbst: clock at load 862
15:14 karolherbst: 0x3d voltage
15:14 karolherbst: increasing and decreasing clock
15:14 karolherbst: between 700-1000 (round about) always results in 0x3d voltage
15:14 karolherbst: but
15:15 karolherbst: while upclocking this happens:
15:15 karolherbst: 00020344: 0000003d
15:15 karolherbst: 00020344: 00000032
15:15 karolherbst: 00020344: 00000035
15:15 karolherbst: 00020344: 00000036
15:15 karolherbst: 00020344: 00000036
15:15 karolherbst: 00020344: 0000003a
15:15 karolherbst: 00020344: 0000003d
15:15 karolherbst: read in 0.2 s intervals
15:17 karolherbst: okay mhh
15:17 karolherbst: okay
15:18 karolherbst: conclusion: while upclocking the blob sets the voltage to a value somehow related to the clock, but it also makes a last upvolt
15:19 karolherbst: mhh
15:19 karolherbst: values are somehow random though
15:22 mupuf: karolherbst: you really need a script that I wrote that dumps both the voltage and clocks in a single report
15:22 mupuf: what you are doing is useless because you are not checking if the frequency changes
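mupuf's point is that a PWM reading only means something next to the clock that was active at the same moment. A rough sketch of that idea follows; it is not mupuf's tool. The two reader callables are hypothetical placeholders for whatever actually reads the clock and register 0x20344, and are injected so the logic can be tried without hardware.

```python
import time
from collections import defaultdict

def sample(read_clock_mhz, read_pwm, count, interval_s=0.2, sleep=time.sleep):
    """Take `count` paired (clock, pwm) samples, `interval_s` apart."""
    rows = []
    for _ in range(count):
        rows.append((read_clock_mhz(), read_pwm()))
        sleep(interval_s)
    return rows

def pwm_values_by_clock(rows):
    """Group the PWM values seen for each clock, to spot one clock with many values."""
    seen = defaultdict(set)
    for clock_mhz, pwm in rows:
        seen[clock_mhz].add(pwm)
    return {clock: sorted(values) for clock, values in seen.items()}

# Replaying the readings quoted above instead of touching hardware:
clocks = iter([705, 705, 862])
pwms = iter([0x2F, 0x2E, 0x3D])
rows = sample(lambda: next(clocks), lambda: next(pwms), 3, sleep=lambda s: None)
print(pwm_values_by_clock(rows))
```

Grouping by clock makes the "same clock, different PWM value" cases stand out instead of being buried in a stream of register reads.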
15:22 karolherbst: ohhh right
15:22 karolherbst: where is your script? :D
15:22 mupuf: https://github.com/mupuf/pdaemon_trace
15:23 mupuf: it is mostly a modified nouveau :P
15:24 karolherbst: but as a cli tool for reading stuff?
15:25 mupuf: https://github.com/mupuf/pdaemon_trace/blob/master/pwr_read/pwr_read.c
15:25 mupuf: you need to compile that, and the modified version of nouveau
15:25 mupuf: which you may want to update
15:27 mupuf: it generates a very accurate report that allows to generate this kind of graphs: http://fs.mupuf.org/mupuf/nvidia/graphs/thresholds_graph.svg
15:27 mupuf: you will need to change it a little to also get the voltage
15:28 mupuf: http://fs.mupuf.org/mupuf/nvidia/graphs/ contains the gnuplot files to generate those lovely graphs
15:28 karolherbst: what's lnva?
15:28 karolherbst: nva
15:29 mupuf: libnva is part of envytools
15:29 karolherbst: because I get like undefined references
15:29 karolherbst: nva_cards and nva_init
15:30 mupuf: sure, you need to set the right -I when compiling
15:30 karolherbst: I do
15:31 mupuf: I would suggest checking again :D
15:31 karolherbst: -lnva ?
15:31 mupuf: unless the libnva changed a lot since the last time I checked
15:31 mupuf: -lnva -L/path/to/envytools/build/nva
15:31 mupuf: or the -L before -lnva
15:32 karolherbst: okay, got other errors now
15:33 karolherbst: parse_pmc_id
15:33 mupuf: -L/path/to/envytools/build/nvhw -lnvhw
15:34 karolherbst: okay, nice
15:35 karolherbst: whats that "/root/src/nouveau/bin/nv_init:" ?
15:36 karolherbst: ohhh
15:36 karolherbst: I know
15:37 karolherbst: better now
15:37 karolherbst: ohh wow
15:37 karolherbst: power consumption of the gpu
15:38 mupuf: yeah, I have been meaning to add support for it in nouveau for a while
15:38 mupuf: I even wrote the assembly code for pdaemon
15:38 mupuf: but it never got in because ... nvidia started signing pdaemon
15:39 karolherbst: okay, messed up nvidia card now
15:39 mupuf: hmm
15:39 karolherbst: yeah, my fault most likely
15:40 mupuf: did you run nv_init from my modified version of nouveau or the upstream one?
15:40 mupuf: because using the latter will result in a crash pretty quickly
15:40 karolherbst: used the one inside your repository
15:40 karolherbst: it worked
15:40 karolherbst: I just played around too much
15:40 mupuf: lol
15:41 mupuf: well, glad to see the tool work on another machine than mine
15:41 mupuf: it is a bit rough on the edges though :s
15:41 karolherbst: got lines like " 76590, 21.14, 836, 1755, 1672, 2004, 1873, 1080, 540, 0.00, 58.01, 3.22, 54.56, 7500, 16, 2, 56, 14, 0"
15:44 mupuf:does not remember the ordering :D
15:44 mupuf: 21.14 is the power consumption in W though
15:44 karolherbst: mupuf: " Perf_UNK (%)" pcie load
15:45 mupuf: yeah, I did not want to label it PCIE load, but I was pretty sure of it
15:45 mupuf: I confirmed it later
15:45 karolherbst: k
15:45 mupuf: i wrote this tool for my PhD
15:45 karolherbst: k
15:45 karolherbst: this is an awesome tool by the way
15:45 karolherbst: would have needed it earlier
15:45 karolherbst: :D
15:45 karolherbst: does it run with nouveau, too?
15:46 karolherbst: the load values are nice to find bottlenecks in nouveau
15:47 mupuf: for the load,. there is an nva tool for it
15:47 karolherbst: ohhh
15:48 mupuf: and just wait for hakzsam's work to land to get a shit ton of performance counters
15:48 mupuf: and the work of trtt to land in apitrace to get good reports!
15:48 karolherbst: okay, nice
15:48 karolherbst: do you remember the one who wrote the RA stuff for nouveau?
15:49 karolherbst: in 2012?
15:49 karolherbst: or at least changed a lot there
15:49 mupuf: probably calim
15:49 mupuf: he is the original author of most of nv50/c0 code!
15:50 karolherbst: ohhh
15:50 karolherbst: that makes sense now
15:50 karolherbst: yeah I need him
15:50 mupuf: Christoph Bumiller
15:50 karolherbst: yeah
15:50 mupuf: calim: I hope I did not butcher your name!
15:50 karolherbst: seems right
15:51 karolherbst: I need him to help me with my spilling issues I've got :/
15:51 mupuf: so, what makes you think that you are right with your patch?
15:51 karolherbst: which patch?
15:52 mupuf: the one for the voltage table
15:52 karolherbst: mhh
15:52 karolherbst: there are bios with 0 entries otherwise?
15:52 mupuf: that does not surprise me :D
15:52 karolherbst: lol :D
15:52 mupuf: PWM-based?
15:52 karolherbst: don't know
15:52 karolherbst: saw them
15:52 karolherbst: could try to find them again
15:52 mupuf: well, check it out next time you find one
15:53 mupuf: I do not agree with you when it comes to the size being wrong
15:53 mupuf: the value I get is giberish otherwise
15:53 RSpliet: +b
15:53 karolherbst: yeah it was just a guess with the size
15:54 mupuf: two b? I really am not a spelling b!
15:54 karolherbst: never looked deeper into that
15:54 karolherbst: okay, driver is dead
15:54 karolherbst: gpu not
15:54 karolherbst: mhhh
15:55 karolherbst: just go away nvidia module
15:56 karolherbst: have to reboot :/
16:06 karolherbst: mupuf: what's the easiest way to just read the value for the reg in C?
16:06 karolherbst: nva_rd32 ?
16:07 mupuf: yes
16:07 mupuf: nva_rd32(cnum, 0x20344);
16:07 calim: my name ?
16:07 imirkin: calim: you're alive :)
16:07 calim: oh ... I hit my head today, might have a concussion, don't expect me to make much sense
16:07 karolherbst: mhhh
16:07 karolherbst: oh well
16:07 karolherbst: I need your brain
16:08 imirkin: calim: might want to get that checked out
16:08 mupuf: karolherbst: pick it up from the ground then!
16:08 mupuf: imirkin: definitely!
16:08 calim: who lets a building protrude over the pavement at the height of my head ? none of the 3 people with me warned me !
16:08 imirkin: ah, that height of things above your eyes but below the top of your head?
16:08 imirkin: i hate that :)
16:09 mupuf: calim: I don't recall you being super tall, right?
16:09 karolherbst: mhh
16:09 mupuf: Seems like a very bad design in any case!
16:09 calim: nah, I'll just rest, mostly ... if I get dizzy / nauseous / severe headache, then I'll ... make it for the hospital
16:09 mupuf: calim: an aneurysm does not have to feel weird
16:10 imirkin: calim: well, when you feel better, i sure could use some help from you
16:10 mupuf: and when you do, you may not even be able to call for help
16:10 calim: then I'll just die I guess, saves me a whole lot of trouble :P (and also fun but meh)
16:10 mupuf: but I don't know how hard you hit your head!
16:10 karolherbst: mhh
16:10 karolherbst: I always wanted to ask a person who goes mad whether he notices it
16:10 mupuf: karolherbst: probably not, it is just the world that gets mad for him/her
16:10 calim: harder than ever before, but that's not a good measure for comparison xD
16:11 karolherbst: :D
16:11 karolherbst: anyway
16:11 karolherbst: can you still tell me what's "ERROR: failed to coalesce phi operands" about?
16:11 karolherbst: I do some "reordering" with the instructions before RA
16:12 karolherbst: but as far as I know I don't do anything wrong anymore
16:12 karolherbst: maybe some corner cases are strange though
16:12 mupuf: <cue the super deeply technical explanation>
16:12 calim: look at the code and see what happened ... usually you can't coalesce them when their live ranges overlap
16:14 karolherbst: ohhh
16:14 karolherbst: okay
16:14 karolherbst: yeah, I think this was the actual issue
16:14 karolherbst: but well
16:14 karolherbst: what does it tell me to do or not to do
16:14 karolherbst: like if I reorder some instructions inside some basic blocks
16:15 karolherbst: how would that affect phi instructions in others?
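A minimal sketch of calim's point about overlapping live ranges, and of how reordering inside one block can break a phi elsewhere. This is not the nouveau codegen implementation: the interval representation and the names `ranges_overlap` and `can_coalesce` are assumptions made for illustration only.

```python
# Hypothetical model: each value's live range is a list of half-open
# (start, end) intervals over a global instruction numbering.

def ranges_overlap(a, b):
    """Return True if any interval in a intersects any interval in b."""
    return any(s1 < e2 and s2 < e1 for (s1, e1) in a for (s2, e2) in b)

def can_coalesce(live_ranges):
    """Phi operands can share one register only if no two ranges overlap."""
    values = list(live_ranges)
    for i, x in enumerate(values):
        for y in values[i + 1:]:
            if ranges_overlap(live_ranges[x], live_ranges[y]):
                return False
    return True

# Before reordering: the last use of "a" is at 3, "b" is defined at 4.
before = {"a": [(0, 4)], "b": [(4, 8)]}
# After moving a use of "a" later in its block, "a" is still live
# when "b" is defined, so the two ranges now interfere.
after = {"a": [(0, 6)], "b": [(4, 8)]}

print(can_coalesce(before), can_coalesce(after))
```

In this toy model, sinking a use past the definition of another phi operand is enough to make coalescing fail, even though the phi itself sits in a different block.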
16:15 karolherbst: I just crashed my gpu now
16:16 imirkin: calim: do you remember wtf needNewElseBlock is supposed to do? are you looking for critical edges there?
16:16 imirkin: calim: and separately, is it expected that all the edge types are wrong?
16:16 imirkin: calim: or at least not at all based on a spanning tree concept
16:17 karolherbst: mupuf: got some useful data maybe
16:18 karolherbst: note to myself: never upclock memory +50% anymore
16:18 mupuf: ....
16:18 karolherbst: +25% worked though
16:18 mupuf: what did you try to get data for?
16:18 karolherbst: mupuf: https://gist.github.com/karolherbst/b5e4e8f8a0b877bc27ca
16:19 mupuf: make a gnuplot script that will plot core clock vs voltage :p
16:20 karolherbst: well
16:20 karolherbst: I can just open libreoffice and move the columns near each other
16:20 karolherbst: mhhh
16:20 karolherbst: this doesn't make sense
16:20 karolherbst: this is so random somehow
16:21 mupuf: you can, but it will get old very fast!
16:22 karolherbst: I need to remove some lines
16:23 mupuf: I love how the blob expects some values in the vbios
16:23 mupuf: and if it does not have the right value, it just crashes
16:23 karolherbst: and if it does, it does random things anyway
16:24 karolherbst: mupuf: https://gist.github.com/karolherbst/b5e4e8f8a0b877bc27ca
16:25 karolherbst: will make another one
16:32 karolherbst: mupuf: this one is better: https://gist.github.com/karolherbst/13c454de721097b49614
16:32 karolherbst: the pattern I meant is stronger there
16:32 karolherbst: line 21 to 22
16:32 mupuf: Like michael jackson said: PLOT IT!
16:33 karolherbst: this was me just downclocking to minimum
16:33 karolherbst: voltage stayed the same
16:33 karolherbst: never plotted with gnuplot :D
16:33 calim: imirkin_: we may require an extra incoming block for phi moves
16:33 karolherbst: calim: wanna look over my patch and check if I do something really stupid?
16:35 mupuf: karolherbst: just read the .plot files in http://fs.mupuf.org/mupuf/nvidia/graphs/
16:35 karolherbst: k
16:37 calim: it should probably be return (n > 1) and not (n == 2)
16:38 calim: karolherbst: not today :/
16:40 karolherbst: k
16:41 karolherbst: mupuf: http://www.plotshare.com/sessions/209198379/Plot1.png
16:42 mupuf: ??? are you sure you plotted the right thing?
16:43 karolherbst: ohhhh
16:43 karolherbst: hex values
16:50 karolherbst: mupuf: http://www.plotshare.com/sessions/209198379/Plot1.png
16:51 imirkin: calim: yeah. that's a critical edge. > 1 makes sense.
16:51 imirkin: calim: also why is it checking edge types?
16:51 imirkin: calim: i was just thinking if incoming > 1 && outgoing > 1
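A small sketch of the criterion imirkin describes: an edge is critical when its source has more than one outgoing edge and its target has more than one incoming edge. The dict-based CFG and the function name `critical_edges` are hypothetical and do not mirror the actual `needNewElseBlock` code.

```python
def critical_edges(cfg):
    """cfg maps each block name to its list of successor block names.

    Every block must appear as a key, even if it has no successors.
    """
    incoming = {node: 0 for node in cfg}
    for succs in cfg.values():
        for s in succs:
            incoming[s] += 1
    return [
        (u, v)
        for u, succs in cfg.items()
        for v in succs
        if len(succs) > 1 and incoming[v] > 1
    ]

# "if" without "else": the edge that skips the body is critical.
if_only = {"cond": ["then", "endif"], "then": ["endif"], "endif": []}
# Full if/else diamond: no edge is critical.
diamond = {
    "cond": ["then", "else"],
    "then": ["endif"],
    "else": ["endif"],
    "endif": [],
}

print(critical_edges(if_only))
print(critical_edges(diamond))
```

Splitting such an edge with a new block gives phi moves a place to live that is executed only along that one transition.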
16:52 imirkin: calim: separately, the edge types are wrong... e.g. in an if/else you create 2 tree edges, and 2 forward edges. should be 3 tree edges and 1 cross edge
16:52 imirkin: calim: unless you have some alternate interpretation of what these edge types mean...
16:57 calim: back edges shouldn't count, don't know about cross edges
16:58 calim: cross edges are things like break from loop
16:58 calim: if else endif is 3 tree and 1 forward
17:01 karolherbst: mupuf: ohhhhh wait
17:01 karolherbst: I think
17:01 karolherbst: well, that could work
17:04 calim: (cross edges should be edges between nodes where neither is the ancestor of the other)
17:05 karolherbst: mhhh
17:06 karolherbst: okay, I don't see anything
17:06 karolherbst: do you?
17:06 imirkin: calim: hmmmm... based on the definitions i've seen, forward edges are where you go to a child
17:07 imirkin: and cross is where you go to something that's not along your branch
17:07 imirkin: so if/else should be 3 tree + 1 cross
17:07 calim: hm, although that might make endif edges cross, too
17:07 imirkin: from the else to the endif should be a cross, right
17:08 imirkin: why shouldn't back edges count btw? i thought this was about doing work along the transition...
17:08 imirkin: and the type of the edge shouldn't matter for that...
17:10 imirkin: calim: this is what i'm going off of btw: https://en.wikipedia.org/wiki/Control_flow_graph
17:10 calim: I don't know, it worked :P
17:10 calim: perhaps it doesn't though
17:11 calim: that piece of code has been there since the very beginning
17:11 imirkin: errr wait, that doesn't have the definitions
17:11 calim: guess we need to check all the places where we discriminate edge types
17:11 imirkin: yeah, i know... that's why i was hoping you could explain
17:11 imirkin: we actually don't care about edge types in _too_ many places
17:12 calim: except the CFG traversal
17:12 calim: where we want to traverse break edges last
17:12 calim: need to find a different kind of classification to distinguish endif-cross-edges from break-cross-edges
17:12 imirkin: hrmmmm
17:12 imirkin: in what way are they different?
17:13 calim: break usually comes later in structured control flow :P
17:13 calim: maybe I should have labeled the edges BREAK and ENDIF
17:14 calim: I don't remember if it was important to put break blocks after endif blocks, but it's usually preferable
17:15 imirkin: this has the edge definitions: https://en.wikipedia.org/wiki/Depth-first_search#Output_of_a_depth-first_search
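For reference, a sketch of the textbook depth-first-search edge classification that the linked definitions describe, applied to an if/else diamond. It assumes successors are visited in list order; the function name `classify_edges` and the block names are made up for illustration and are not taken from the nouveau sources.

```python
def classify_edges(cfg, root):
    """Label each reachable edge as tree, back, forward, or cross."""
    discovered, finished, kinds = {}, {}, {}
    clock = 0

    def visit(u):
        nonlocal clock
        discovered[u] = clock
        clock += 1
        for v in cfg[u]:
            if v not in discovered:
                kinds[(u, v)] = "tree"
                visit(v)
            elif v not in finished:
                kinds[(u, v)] = "back"      # v is an ancestor still on the stack
            elif discovered[u] < discovered[v]:
                kinds[(u, v)] = "forward"   # v is an already finished descendant
            else:
                kinds[(u, v)] = "cross"     # neither is an ancestor of the other
        finished[u] = clock
        clock += 1

    visit(root)
    return kinds

diamond = {
    "cond": ["then", "else"],
    "then": ["endif"],
    "else": ["endif"],
    "endif": [],
}
kinds = classify_edges(diamond, "cond")
for edge, kind in kinds.items():
    print(edge, kind)
```

Under these definitions the diamond yields three tree edges plus one cross edge from the else block to the endif block. A forward edge only appears when the target is a finished descendant of the source, for example the skip edge of an if without an else when the body is visited first.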
17:15 calim: so ... some edge that leads to a node from where you can *not* come back to the current node
17:15 calim: with the current loop level ... bleh
17:16 calim: and then with arbitrary control flow everything can blow up
17:16 imirkin: why do we care about any of that?
17:16 imirkin: is this for the joinat situation?
17:17 calim: well yes with arbitrary control flow joinat will probably stop working
17:17 imirkin: but we can do the joinat stuff manually
17:17 imirkin: from the tgsi
17:17 imirkin: (perhaps we even do, i haven't looked carefully)
17:18 calim: yes, we only have structured control flow now anyway
17:18 imirkin: right
17:18 calim: unless you write a pass that creates unstructured flow or you have a different source language that contains it
17:18 imirkin: btw, do you have any opinion on this bug: https://bugs.freedesktop.org/show_bug.cgi?id=90887
17:19 imirkin: this is what has triggered this whole discussion
17:19 calim: we just shouldn't *assume* that it's structured in too many places ...
17:19 imirkin: yeah, i agree
17:19 calim: not now, I'm headed towards bed
17:19 imirkin: ok
17:20 imirkin: there are several bugs by now that i need your help on, so if you could make some time in the semi-near future, that'd be awesome
17:21 calim: I guess it's okay to add a new BREAK edge type which is a special CROSS edge
17:21 calim: if you want to fix edge labeling without changing things
17:21 imirkin: k
17:21 calim: or, yeah, redefine CROSS ;)
17:22 calim: if that's possible
17:22 imirkin: well, we can obviously name things however we like
17:22 imirkin: but it's nice to stay with standards
17:23 imirkin: we could also do it based on the idom tree
17:23 calim: I was just wondering if there'd be some criterion to distinguish BREAK from ENDIF edges
17:23 imirkin: er hm, or not
17:23 imirkin: i'm still unsure why you care
17:24 calim: because I like classification, labeling and control :P
17:24 imirkin: hehe
17:24 imirkin: i mean more like... why the code cares
17:24 imirkin: if it's just for labelling, i'd be happy calling it a CALIM edge ;)
17:28 calim: CALIM edges would be annoyingly hard to deal with
17:29 calim: the code cares because I wrote it to visit endif before break
17:30 calim: so that we might save a branch instruction... maybe
17:31 imirkin: hmmmm ok
17:32 imirkin: what's the situation? while { if { } else { break; } }
17:32 imirkin: don't we use the whole prebreak thing anyways?
17:33 imirkin: anyways, there are large swaths of the compiler that i don't understand... i try not to touch them mostly
17:34 imirkin: but that doesn't always pan out
17:36 calim: yes ... so the else block has a "brk" at the end ... if you schedule the brk block after the else block instead of the endif block you need a bra after the brk
17:36 calim: but ... hm
17:36 imirkin: anyways, i gtg
17:36 calim: it probably doesn't matter since there's still a TREE edge connecting the ENDIF
17:37 imirkin: and you should sleep
17:37 calim: so it will be put in the right place and the else can fallthrough
17:38 calim: err, the if block ... whatever ... stuff
17:38 calim: good night
21:43 Halfwit: I wonder when they're going to give out those signed images.
22:35 gnurou: imirkin: if you still have that email, can you send it to me? I will re-post it internally and push to get answers