00:22 imirkin: grrr
00:22 imirkin: i can't get both texture and textureLod to work at the same time
00:22 imirkin: it's one or the other!
00:38 imirkin: mupuf: i'm done for now, but if it's not too much trouble, could you keep the g80 in? i'm going to need to do some more tracing and testing
00:39 mupuf: I can do that
00:39 mupuf: won't do anything on the machine tonight
00:39 mupuf: And this week, I promised I would review hakzsam's patches
00:40 mupuf: I also said that for this week end, but it did not happen because I started having fun with the pcie speed instead...
00:40 imirkin: ;)
00:41 mupuf: it was good though :)
01:25 karolherbst: imirkin: what card do you have?
01:26 karolherbst: imirkin: although there is no GF108 without 5_0 support
01:26 karolherbst: imirkin: what did you try out?
01:27 karolherbst: mupuf did this on the fermi card and it worked: nvapoke 00088460: b06c2221 nvapoke 00088460: b06c2211
01:27 karolherbst: ...
01:27 karolherbst: no wrong
01:28 karolherbst: nvapoke 0x02241c 81 then 00088460: b06c2221
02:07 karolherbst: okay, nice
02:09 karolherbst: mupuf: are you there? I found out, why fermi requires two pokes
02:11 karolherbst: anyway if anybody is interested in trying pcie upclocking out, I created a nice overview of what to do (it's not well tested, but traces indicate it may be right): https://gist.github.com/karolherbst/5fdd4a543d20916bc362
02:19 karolherbst: the gk208 is strange though. It's always x8 lanes and uses a different value for pcie reclocking
02:58 mupuf: karolherbst: you should use a mask instead of writing random values like that
02:58 karolherbst: yeah, probably
02:59 mupuf: karolherbst: how about you write a tool called nvamask?
02:59 mupuf: nvamask reg mask value
02:59 karolherbst: but did you notice that on fermi, the first command sets the pcie link cap?
02:59 mupuf: nope, I did not see that, that is good to know
02:59 karolherbst: I checked with my fermi today
02:59 karolherbst: and it was already set to 5_0 GT/s
03:00 karolherbst: so only one command was needed actually
03:00 mupuf: fun that it is not needed anymore on kepler+
03:00 karolherbst: yeah
03:00 karolherbst: it's always on the highest I assume
03:00 mupuf: yeah, or I just missed it
03:00 karolherbst: strange though, the linkctrl2 stayed at 2_5 on fermi
03:01 karolherbst: on Kepler I don't really understand the first three bits though
03:02 mupuf: first three bits?
03:02 mupuf: I explained bit 0
03:02 karolherbst: I meant the "800"
03:02 mupuf: and I am not aware of bit 1 and 2
03:02 karolherbst: or "804"
03:02 karolherbst: or "404"
03:02 mupuf: digits then
03:02 karolherbst: ahh yeah
03:03 karolherbst: digits
03:03 mupuf: it is something else, we do not know yet
03:03 karolherbst: ohh now I see, on fermi and tesla the values can be different even across the same model
03:03 mupuf: and as long as we have no issue, we can ignore them
03:03 karolherbst: the first may indicate lanes though
03:04 karolherbst: it's always 8 on 16x cards
03:04 karolherbst: and 4 on 8x cards
03:04 mupuf: could be
03:04 karolherbst: regarding nvamask: it should be a tool which just updates part of a reg?
03:05 karolherbst: like it would do nvapeek, write value with mask, nvapoke
03:13 mupuf: yes
03:13 mupuf: exactly that
03:13 mupuf: and potentially write the old value and the new one in verbose mode
03:13 karolherbst: yeah
03:14 karolherbst: okay
03:14 karolherbst: will write a bash script first, so that I know what has to be done in "general"
03:14 mupuf: copy/paste the code of nvapoke
03:14 mupuf: no, write it in c
03:14 mupuf: this is ridiculous to write a shell script for this
03:15 mupuf: and the shell script will never be used anyway
03:17 karolherbst: this is just a draft before writing it in c
03:19 karolherbst: okay, done :=
03:19 mupuf: done in bash or in c?
03:19 mwk: mupuf: is the mask for which bits to write or which to preserve?
03:19 karolherbst: bash at first
03:19 mwk: to write I suppose
03:19 mupuf: mwk: what would you prefer?
03:19 mupuf: write is what I had in mind
03:20 mupuf: or the same as nva_mask(), if we have it
03:21 karolherbst: mupuf: okay, so the tool should be used like this:
03:21 karolherbst: ./nvamask 0x08c040 0xfff0fff0 0x40001
03:21 karolherbst: old value: 0x80089000 new value: 0x80049001
03:21 karolherbst: nvpoke 0x8C040 0x80049001
03:21 mupuf: karolherbst: please do the opposite for the mask
03:22 karolherbst: okay
03:22 mupuf: 0xc0001
03:22 mupuf: because your mask was overly broad and I do not like that
03:23 karolherbst: k
03:24 mupuf: mwk: did you see that changing the pcie speed was trivial?
03:25 karolherbst: mupuf: mask should indicate bits which are changed by the write, right?
03:25 mupuf: karolherbst: yes
03:25 mwk: mupuf: yeah
03:25 mupuf: so, you need to use the ~ operator to reverse it
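A minimal sketch of the masked update discussed above, assuming the convention agreed on here (the mask selects the bits that the write changes). `apply_mask` is a hypothetical helper name, not envytools' nva_mask(), and Python is used only to illustrate the arithmetic; the real tool would do a register read, this computation, then a register write.

```python
def apply_mask(old: int, mask: int, value: int) -> int:
    """Return old with the bits selected by mask replaced by those of value.

    Assumed convention (from the chat): mask marks the bits to change,
    so the untouched bits are kept with ~mask.
    """
    return ((old & ~mask) | (value & mask)) & 0xFFFFFFFF


old = 0x80089000  # value read back from 0x08c040 in the example above
new = apply_mask(old, 0xC0001, 0x40001)
print(f"old value: {old:#010x} new value: {new:#010x}")
```

With the preserve-style mask from the first attempt, the equivalent expression is `(old & 0xfff0fff0) | 0x40001`, which gives the same new value for this input.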
03:25 karolherbst: I usually don't code with masks, so I need some time to figure it out to have it fail proof
03:25 mupuf: in c, just use nvamask, IIRC we have it
03:26 karolherbst: okay
03:26 karolherbst: will check
03:26 mwk: mupuf: does it work for weirdo regspaces?
03:26 mupuf: karolherbst: nva_mask, defined in nva.h
03:27 mupuf: mwk: what do you mean?
03:27 karolherbst: ahhh
03:27 mwk: mupuf: regspace.h
03:28 karolherbst: then it's trivial
03:28 mupuf: it was trivial even before that :p
03:28 mupuf: but yeah, hence why I am asking you to write it! That will make you contribute to envytools
03:28 mupuf: anyway, I'm at work
03:28 mupuf: ttyl
03:28 mupuf: probably tomorrow
03:29 mupuf: oh, you want nvamask to work in weird regspaces or the pcie speed?
03:30 mwk: nvamask
03:30 mwk: eg. to aim it at a CRTC register
03:30 karolherbst: mhhh
03:30 karolherbst: first I have to understand the code :D
03:34 RSpliet: karolherbst: on tesla, the link cap values are the other way round
03:34 karolherbst: RSpliet: okay
03:34 RSpliet: eg 0x154c & 0x80 == 0x80 means 5.0GT/s
03:37 karolherbst: okay
03:37 karolherbst: then it's pretty much the same
03:38 RSpliet: as was documented in rnndb btw
03:45 karolherbst: RSpliet: does it look right? https://gist.github.com/karolherbst/5fdd4a543d20916bc362
03:45 karolherbst: I think the commit bit is still there on tesla and fermi
03:46 RSpliet: that can't be right
03:47 karolherbst: now it should be better
03:50 RSpliet: your mask is still wrong
03:52 karolherbst: ohh now I see it
04:02 RSpliet: also note that this has no effect when the link width is set to 8x
04:02 RSpliet: still wrong btw... how do you determine your mask values?
04:04 mupuf: mwk: what would be the problem with those?
04:04 mupuf: as long as we can read and write, it should be fine, right?
04:05 RSpliet: mupuf: shoo, back to work you, Intel isn't paying you to help the competition :-P
04:05 karolherbst: :D
04:05 * mupuf is compiling
04:05 karolherbst: https://xkcd.com/303/
04:05 karolherbst: :)
04:05 mupuf: and nouveau is competition?
04:16 karolherbst: RSpliet: better? https://gist.github.com/karolherbst/5fdd4a543d20916bc362
04:22 specing: What is a good PCI-e GPU card these days for running all open source games (I believe Unvanquished is the most demanding) on max?
04:28 karolherbst: mupuf: there is also 0x02241c on my kepler card
04:28 karolherbst: with the value 0x81
04:31 karolherbst: nice
04:31 karolherbst: I can change linkcap on my kepler card with it
04:32 karolherbst: who also has a kepler card?
04:51 RSpliet: karolherbst: yes... although I'm not sure if that trigger bit exists on 0x154c as well
04:53 karolherbst: RSpliet: on tesla?
04:55 karolherbst: seems to work
04:58 RSpliet: doesn't mean it's correct... did you check its meaning in rnndb?
05:03 karolherbst: I checked with traced
05:03 karolherbst: *traces
05:03 karolherbst: like reading all traces and grepping for what was read from and written into it
05:05 karolherbst: RSpliet: for f in $(find nva* -type f -iname "*.xz"); do echo $f; demmio -f $f 2>&1 | grep 0x00154c ; done
05:05 karolherbst: inside the database
05:05 karolherbst: but maybe I should take a deeper look and verify more
05:05 RSpliet: it's always a good idea to check what information is available in rnndb
05:06 karolherbst: there are pairs: (0x0000023d, 0x000002bd), (0x0000017d, 0x000001fd)
05:06 karolherbst: yeah
05:06 karolherbst: demmio will show good stuff if it's in rnndb, right?
05:07 RSpliet: yep
05:07 karolherbst: but "PCIE_SPEED = 2P5GT" seems wrong :D
05:07 karolherbst: should be 2_5GT
05:07 RSpliet: but it pays off verifying individual bits too; spot the difference:
05:07 karolherbst: on tesla and fermi I get nice output about the cap
05:07 RSpliet: [roy@Tuvok envytools]$ rnn/lookup -a a8 0x154c 0x0
05:07 RSpliet: PBUS.HWUNITS_1 => { PCIE_VERSION = 1 | PCI_CLASS = DISPLAY | UNK3 = 0 | PCIE_SPEED = 2P5GT }
05:07 RSpliet: [roy@Tuvok envytools]$ rnn/lookup -a a8 0x154c 0x1
05:07 RSpliet: PBUS.HWUNITS_1 => { PCIE_VERSION = 2 | PCI_CLASS = DISPLAY | UNK3 = 0 | PCIE_SPEED = 2P5GT }
05:08 karolherbst: yeah
05:08 karolherbst: it's the last bit
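A rough sketch of the two bits discussed here, assuming only what the chat states for Tesla register 0x154c: bit 0 selects PCIE_VERSION 1 or 2 (per the rnn/lookup output above), and bit 7 (0x80) set means a 5.0GT/s link cap (per RSpliet). The function name and dictionary keys are made up for illustration, and every other bit is ignored.

```python
def decode_154c(value: int) -> dict:
    """Decode only the two fields mentioned in the chat; other bits are ignored."""
    return {
        "pcie_version": 2 if value & 0x1 else 1,  # bit 0, per the rnn/lookup output
        "cap_5_0": (value & 0x80) == 0x80,        # bit 7, per RSpliet
    }


# The value pairs found in the traces differ only in bit 7.
for low, high in [(0x23D, 0x2BD), (0x17D, 0x1FD)]:
    assert low | 0x80 == high
    print(hex(low), decode_154c(low), hex(high), decode_154c(high))
```

Note the explicit parentheses around `value & 0x80`: written as in the chat shorthand, `x & 0x80 == 0x80` would compare first in both C and Python.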
05:08 karolherbst: mhhh
05:08 karolherbst: this might be actually important, but every card seems to have set it to version 2 at some point
05:08 karolherbst: and never switches back to 1
05:09 RSpliet: ;-)
05:09 karolherbst: like yours nva0
05:09 karolherbst: I think it checks if pci 2.0 is supported and then immediately switches
05:09 karolherbst: at least the driver does it
05:11 karolherbst: don't want to add documentation for it though, because I don't know what happens when speed is 5_0 but version is 1.0
05:11 karolherbst: so it should be always set to 2.0 for now
05:12 karolherbst: but thanks for the hint
05:12 RSpliet: I don't think that combination exists
05:12 RSpliet: negotiation will likely just fail
05:12 karolherbst: yeah
05:13 karolherbst: but who knows
05:13 karolherbst: do you want to test on your card?
05:15 RSpliet: doesn't work
05:15 RSpliet: unless you've figured out a way to increase the link width to 16 first
05:16 RSpliet: (that is, the NVA8... I don't have any other card at hand right now, all safely stowed in boxes)
05:16 karolherbst: GT218?
05:16 RSpliet: yes
05:22 karolherbst: ohhh
05:22 karolherbst: there is something missing
05:22 karolherbst: we overlooked something on kepler
05:23 karolherbst: LnkCtl2 needs to be set to 8_0 before LnkSta, otherwise it doesn't work
05:24 karolherbst: ohhhh
05:24 karolherbst: ohhhhhhh
05:24 karolherbst: there is a third reg we need on kepler
05:25 karolherbst: not one as we assumed, but actually three
05:25 karolherbst: mupuf: if you have a little bit of time
05:28 mupuf: probably won't until tomorrow
05:28 mupuf: at best, since my main machine died during the night after 6 years of good service
05:28 mupuf: it may just be the PSU
05:28 mupuf: but who knows!
05:28 mupuf: I found the machine hung with slight corruption on the screen
05:29 mupuf: and when I tried to reboot it ... the pwr LED goes on for 100ms and off again
05:29 mupuf: I tried getting rid of the gpu, the hdd, checking if any cable was loose
05:29 mupuf: to no avail
05:31 karolherbst: no, I just wanted to tell you about the stuff I found out now
05:31 karolherbst: there are three regs needed for actual stable upclocking
05:31 karolherbst: even on kepler
05:31 karolherbst: ohh
05:31 karolherbst: mhhh
05:50 karolherbst: I should be more careful
07:15 imirkin: karolherbst: my fermi appears to be pcie v1, so no 5GT/s
07:15 imirkin: perhaps there's a way to flip it to pcie v2
07:43 tobijk: imirkin: does lspci say it's a pcie v2?
07:43 imirkin: tobijk: no, it says v1
07:44 imirkin: tobijk: karolherbst: http://hastebin.com/sadebuqehe.xml
07:46 RSpliet: "Surprise-"
07:46 RSpliet: hmm...
07:46 imirkin: RSpliet: yeah, this card is *not* surprised :)
07:46 imirkin: in the same motherboard, i have another card, which looks like
07:47 imirkin: http://hastebin.com/itixagagod.xml
07:47 imirkin: so there it's v2. now the claim is that i have 2 PCIe 2.0 slots
07:49 tobijk: is that a GF108 maybe, something low-end?
07:49 imirkin: super low-end
07:49 imirkin: it's the definition of low-end
07:50 tobijk: "it's a low-end card, let's take a cheap pcie1 connector chip" ;-)
07:50 imirkin: maybe. or maybe it needs to be flipped into v2 mode
08:01 karolherbst: imirkin: what card was it again?
08:01 karolherbst: you have to set the LnkCap up before setting LnkSta
08:02 karolherbst: imirkin: the latter one is problematic
08:02 karolherbst: kepler?
08:02 karolherbst: ahh GT215
08:02 karolherbst: okay
08:02 karolherbst: I see the problem
08:02 tobijk: he said fermi :>
08:02 imirkin: the GT215 probably just doesn't support it
08:02 karolherbst: yeah, I see the problem
08:02 karolherbst: no
08:02 karolherbst: LnkCtl2 is the problem
08:03 imirkin: but the GF108 should be able to have pcie v2
08:03 karolherbst: on my kepler card if LnkCtl2 is at 2_5 I can't upclock anymore
08:03 karolherbst: there is some unknown logic which is pretty important
08:03 imirkin: note how the GF108 says "v1"
08:03 karolherbst: and I didn't find how to set the LnkCtl2
08:03 karolherbst: no problem
08:03 karolherbst: you have to cap to v2 first
08:03 karolherbst: it's easy
08:04 karolherbst: I need 0x02241c
08:04 karolherbst: nvapeek 0x02241c
08:04 imirkin: 0
08:04 karolherbst: on some cards the blob is switching from v1 to v2 at the very beginning
08:04 karolherbst: yeah
08:04 karolherbst: 0 is v1
08:04 karolherbst: poke 1 into it
08:05 imirkin: if i time out, you'll know why :)
08:05 karolherbst: yeah
08:05 karolherbst: this is the reg for the pci caps
08:05 mupuf: boom boom boom boom, he is going to reboot!
08:05 karolherbst: :)
08:05 karolherbst: maybe not
08:05 imirkin: ka-BOOOOM!
08:05 karolherbst: we will see
08:05 karolherbst: :)
08:05 imirkin: not really
08:05 imirkin: it's v2 now, yay :)
08:05 karolherbst: :)
08:05 mupuf: imirkin: try v4!
08:05 karolherbst: now poke 0x81 into it
08:06 mupuf: hmm hmm :D
08:06 karolherbst: this will set the link to max speed
08:06 karolherbst: mhh
08:06 karolherbst: no
08:06 karolherbst: the cap
08:06 karolherbst: then you have cap: 5_0 but link speed still 2_5
08:06 imirkin: still 2.5 :(
08:06 karolherbst: also the cap?
08:07 imirkin: oh wait i lied
08:07 karolherbst: yeah
08:07 imirkin: looked at the wrong gpu
08:07 karolherbst: LnkCap is 5_0 now
08:07 karolherbst: LnkSta should be 2_5
08:07 imirkin: ya
08:07 karolherbst: okay next step
08:07 karolherbst: peek 0x088460
08:07 imirkin: 00088460: b06c2220
08:07 karolherbst: poke b06c2221 into it
08:08 karolherbst: should do the trick
08:08 imirkin: so just flip the low bit on?
08:08 karolherbst: yeah
08:08 karolherbst: then lnkSta should be 5_0
08:08 imirkin: LnkSta: Speed 5GT/s
08:08 karolherbst: :)
08:08 karolherbst: there you go
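The steps just walked through on imirkin's GF108 could be summarised roughly as below. This is only a sketch, assuming the register addresses and values reported in this conversation for Fermi (0x02241c for the PCIe version and link cap, bit 0 of 0x088460 for the link speed). `fermi_upclock_steps` is a hypothetical helper that merely builds the ordered list of pokes; it touches no hardware.

```python
def fermi_upclock_steps(reg_088460: int):
    """Return (register, value) pokes in the order used in the chat.

    reg_088460 is the value previously peeked from 0x088460.
    """
    return [
        (0x02241C, 0x01),              # switch from PCIe v1 to v2
        (0x02241C, 0x81),              # raise the link cap to 5_0
        (0x088460, reg_088460 | 0x1),  # flip the low bit on to change LnkSta
    ]


# Print the pokes for the value imirkin peeked (b06c2220).
for reg, val in fermi_upclock_steps(0xB06C2220):
    print(f"nvapoke {reg:#08x} {val:#x}")
```

The order matters according to the chat: the cap has to be raised before LnkSta is changed. Nothing here covers Kepler, which is reported to need additional registers.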
08:08 imirkin: but i still see LnkCtl2: Target Link Speed: 2.5GT/s
08:08 imirkin: which is weird
08:08 karolherbst: yeah I have these to on my fermi
08:09 karolherbst: on kepler though this is a big problem
08:09 karolherbst: *too
08:09 karolherbst: there is another reg we need
08:09 karolherbst: for this
08:09 imirkin: would writing b06c2220 back into 88460 flip it back to 2.5?
08:09 karolherbst: try this out: https://gist.github.com/karolherbst/5fdd4a543d20916bc362
08:09 karolherbst: no
08:09 karolherbst: b06c2211
08:09 karolherbst: there are some cards which start in pcie v1.0 mode
08:09 karolherbst: and the blob just pushes them to 2.0
08:10 karolherbst: and never goes back though
08:10 imirkin: nope, that's still in 5.0 mode
08:10 karolherbst: mhhh
08:10 karolherbst: did you check lnkSta?
08:10 imirkin: ya
08:10 karolherbst: okay, strange
08:11 karolherbst: you could poke 0x1 into 0x02241c though
08:11 karolherbst: this should also drop lnkSta
08:11 karolherbst: but this is very unstable on my kepler
08:11 karolherbst: and should never be done as it seems
08:11 karolherbst: messed my entire 0x08c040 reg up
08:12 imirkin: hehe, i won't worry about that for now :)
08:12 karolherbst: :D
08:12 karolherbst: so did you drop to 2_5?
08:12 imirkin: no
08:12 karolherbst: mhhh
08:12 imirkin: but... wtvr
08:12 imirkin: you should write patches to do some of this stuff automatically
08:12 mupuf: imirkin: sure you're checking the right gpu?
08:12 karolherbst: yeah
08:12 karolherbst: already done
08:12 imirkin: mupuf: yes :p the GT215 can only do 2.5
08:13 mupuf: imirkin: it requires understanding the vbios tables
08:13 imirkin: karolherbst: the nice thing about fermi is that they tend to come with middle clocks by default, so even without reclocking it should give a boost
08:13 mupuf: seems like nvaX only has one bit for it, but I cannot test
08:13 glennk: imirkin, isn't gt215 a pcie 2.0 device?
08:13 imirkin: glennk: it is
08:13 karolherbst: mupuf: https://github.com/karolherbst/envytools/commit/092f409139f4b36c2fff397f7fd8043ed4cb6051
08:14 karolherbst: imirkin: you are lying ;)
08:14 karolherbst: the GT 240 is able to do 5_0
08:14 imirkin: perhaps it can... but def not based on these pci caps
08:14 imirkin: i have a mmiotrace of it if you guys want to look though
08:14 karolherbst: mhh
08:14 glennk: mb might not for both slots
08:14 karolherbst: did you mess with your kepler now?
08:15 karolherbst: ahhh
08:15 karolherbst: ohh
08:15 imirkin: i think it's at http://people.freedesktop.org/~imirkin/traces/nva3/nva3-gddr5.log.xz
08:15 karolherbst: mhhh
08:15 imirkin: glennk: well, the v2 is showing, just not the 5GT/s
08:15 karolherbst: I think we may have messed up the kepler card now
08:15 imirkin: glennk: but then it's not showing for the fermi either, so... who knows
08:15 mupuf: karolherbst: where did you get this usleep(1) from?
08:15 karolherbst: strange values without it
08:15 imirkin: i was looking at Target Link Speed: 2.5GT/s as an indicator
08:15 karolherbst: the read may produce a shifted 1
08:16 karolherbst: was too confusing
08:16 mupuf: really? WTF
08:16 karolherbst: yeah
08:16 mupuf: probably linked to changing the pcie speed, but this is worrisome
08:16 karolherbst: I got something like 0x80049010
08:16 mupuf: that means we need extra locking :o
08:17 karolherbst: I see
08:17 mupuf: or change the link speed from pdaemon
08:17 mupuf: ...
08:17 glennk: probably not a great idea to attempt pcie transfers while renegotiating the link
08:17 karolherbst: :D
08:17 imirkin: karolherbst: http://hastebin.com/idihozelaw.coffee
08:17 mupuf: maybe the surrounding writes are not useless after all!
08:17 karolherbst: thats fermi right?
08:17 mupuf: but I definitely cannot reproduce this issue
08:18 mupuf: I would like you to seriously check your assertion
08:18 imirkin: karolherbst: no, GT215
08:18 imirkin: looks like it's link-training
08:18 karolherbst: crazy
08:18 karolherbst: it uses the fermi reg for lnkSta
08:18 imirkin: could be the other way 'round :p
08:18 imirkin: esp given order of development...
08:18 karolherbst: ahh its tesla
08:19 karolherbst: yeah, there is the 0x00154c reg
08:19 karolherbst: imirkin: 0x00154c is for LnkCap and PCIe v
08:19 karolherbst: 0x088080 for LnkSta
08:20 imirkin: karolherbst: i wouldn't ignore the link training-looking things
08:21 RSpliet: they seem simple enough not to ignore them
08:21 karolherbst: we still miss something
08:21 karolherbst: I got my kepler into 8_0 mode all the way
08:21 karolherbst: and couldn't downclock anymore
08:22 karolherbst: and like shifted garbage inside 0x08c040
08:22 karolherbst: mupuf
08:22 RSpliet: it might not be appropriate to talk about "clocking" here, besides, yes, check VBIOS initialisation scripts to see if they have hints
08:22 mupuf: RSpliet: already did on one card
08:22 mupuf: one reg is used as a condition
08:23 mupuf: that was on kepler though
08:23 mupuf: no init/training needed there
08:23 karolherbst: kepler seems to be easier at first, but I think we miss something
08:23 karolherbst: it's easy to mess it up
08:23 karolherbst: mupuf: did you ever peek/poke with 0x02241c ?
08:23 karolherbst: on kepler
08:24 karolherbst: imirkin: okay on which card to we got the pcie link change working now?
08:24 karolherbst: *did
08:25 imirkin: it worked on the GF108 and GK208
08:25 karolherbst: okay
08:25 karolherbst: GF108 started with PCIe v1?
08:25 karolherbst: okay, yeah see it in the hastebin
08:26 karolherbst: checking
08:26 karolherbst: mlankhorst: ping
08:27 karolherbst: mlankhorst: your fermi card should be at v1 too?
08:28 imirkin: karolherbst: btw, 88078 should tell you what PCIe version is supported
08:29 karolherbst: is this okay with you? https://github.com/karolherbst/envytools/commit/c483531f9eb796aa507f8c960470a03cebe9f261
08:30 karolherbst: imirkin: on which models?
08:31 imirkin: karolherbst: dunno, it's in rnndb...
08:31 imirkin: PPCI.EXP_HEAD
08:31 imirkin: i guess all models, in theory
08:31 karolherbst: okay, I think most of the fermi cards start at v1 mode
08:32 imirkin: i dunno, i wouldn't go around renaming things at random
08:33 karolherbst: seems like a typo to me
08:33 imirkin: 2 point 5
08:33 karolherbst: P and _ aren't that far away on us layout
08:33 karolherbst: ahhh
08:33 karolherbst: point
08:33 imirkin: i've seen the 2P5 stuff before
08:33 imirkin: as well as 5P0
08:33 karolherbst: shouldn't the value be consistent then?
08:33 tnt: I'm wondering: Is anyone investigating the seemingly "new" method of turning off the discrete card used by Windows 8.1+ ? (i.e. the need for acpi_osi="!Windows 2013" and soon acpi_osi="!Windows 2015")
08:34 karolherbst: using 2_5 in one place and 2P5 in another
08:34 imirkin: tnt: not to my knowledge
08:34 imirkin: tnt: file a bug, provide an acpidump
08:34 imirkin: karolherbst: yeah, that's not super-great
08:34 imirkin: mwk: any preference?
08:35 karolherbst: there is no 5P0 in rnndb
08:36 tnt: imirkin: I think several have been opened, but then closed because acpi_osi="!Windows 2013" fixes the immediate issue. I'm just not sure if anyone dug deeper. Because all those people that had issues with this will get it again with 4.1 since the "Windows 2015" string just got added in the kernel.
08:36 imirkin: tnt: i have seen no such bug in the freedesktop.org bug tracker
08:36 karolherbst: tnt: you could try with the bumblebee guys though. They deal a lot with the ACPI stuff on nvidia cards
08:36 tnt: karolherbst: yeah, same issue there, no more answers though.
08:37 karolherbst: :/
08:38 tnt: In the acpi dsdt I can clearly see some stuff is skipped for 8.1, just no idea what's supposed to happen ... and the "confusing" part is that just putting the card in D3 _does_ save power. Just not as much as the old method. ( 1W vs 2.5W in my case )
08:39 karolherbst: tnt: which d3?
08:39 tnt: I mean PCI power state
08:39 karolherbst: yeah
08:39 karolherbst: which one
08:39 karolherbst: there is d3warm and d3cold
08:40 imirkin: i thought it was d3hot :)
08:40 karolherbst: :D
08:40 karolherbst: mhh
08:40 imirkin: tnt: well, without seeing the acpi tables, it's hard to tell
08:40 karolherbst: seems like both terms are used
08:40 tnt: karolherbst: sorry, Cold.
08:40 karolherbst: mhhh
08:41 karolherbst: tnt: then it's a nice issue for the bumblebee guys :D
08:41 karolherbst: just tell them d3cold wastes more energy than some new stuff
08:41 tnt: imirkin: yeah, working on uploading them.
08:42 imirkin: tnt: but the deal is that things still work fine, but just the gpu sucks more power?
08:42 tnt: karolherbst: yeah, I'll report it there too. mostly just wanted to check nobody here had investigated this.
08:42 tnt: imirkin: yeah
08:42 imirkin: that's highly annoying
08:42 imirkin: since no one is ever going to notice
08:42 tnt: which I guess is why people might not notice ...
08:43 karolherbst: i think there are more important things going on here usually ;) not that power consumption isn't important, but performance is more important I assume
08:43 karolherbst: would be great though to solve it
08:44 karolherbst: I have win8.1 installed though
08:44 karolherbst: so maybe I can somehow investigate that
08:44 karolherbst: but really don't know how to start there
08:44 tnt: I have win 8.1 installed too (well, uh ... I think it's 8.1. tbh I'm not sure), but I also have no idea how to look in there.
08:45 karolherbst: imirkin: do we want to mess with your GT 240 and get 5_0 working?
08:45 imirkin: karolherbst: yes, "we" do :)
08:45 karolherbst: nice
08:45 imirkin: i pasted the mmiotrace bits that flip it
08:46 karolherbst: yeah there isn't that much new sadly
08:46 karolherbst: peek 0x00154c
08:46 imirkin: i guess just a write to punits, and then a write to 88460
08:47 karolherbst: the poke 0x00154c 0x000002bd _should_ set your lnkcap up to 5_0
08:48 imirkin: ya
08:48 karolherbst: then reg 0x088460
08:48 karolherbst: poke 0xb06c2221
08:48 imirkin: yep
08:48 karolherbst: it may be, that for stability 0xb06c2220 should be poked in before
08:48 karolherbst: don't know
08:48 imirkin: claims to be at 5GT/s
08:49 karolherbst: fine :)
08:49 karolherbst: then this cart might be enough usually: https://gist.github.com/karolherbst/5fdd4a543d20916bc362#file-all-in-one
08:49 karolherbst: *chart
08:49 karolherbst: _chart_
08:49 imirkin: well, i think the link training stuff in there might be good to do
08:49 karolherbst: I could mention pcie v1 though
08:49 imirkin: and i suspect some of the later reads are error counts (which come out at 0 on success)
08:50 karolherbst: added pcie version stuff: done
08:52 karolherbst: have to check maxwell for the cap reg though
08:52 karolherbst: imirkin: the 0x088078 thing is strange
08:52 karolherbst: only saw it on two cards
08:52 karolherbst: from 12
08:52 karolherbst: all fermis
08:52 imirkin: mine's a tesla...
08:53 karolherbst: okay, will check tesla traces then
08:56 tnt: imirkin: jfyi: http://pastebin.com/raw.php?i=C6Q3A8aa but yeah, I understand PM isn't on the top prio :p
08:56 karolherbst: mhhh
08:57 imirkin: tnt: file a bug, this will require real investigation, and it's good to keep track of that in a bug tracker
08:57 karolherbst: maxwell is strange now :/
08:57 karolherbst: seems like its always at pcie full speed
08:57 karolherbst: at least the cap
08:57 tnt: imirkin: will do
08:58 karolherbst: doesn't matter though
08:58 karolherbst: okay, demmio looks nice on all now
08:58 karolherbst: I think rnndb has the stuff now
09:00 karolherbst: ahh no, tesla/fermi is missing the lnksta part
09:03 imirkin: tnt: so it looks like a difference between the NVOP and NBCI methods
09:03 imirkin: we currently call the NVOP one
09:03 imirkin: the NBCI one is hidden behind \WIN8 (and also a different command id than what we use)
09:06 karolherbst: imirkin: sounds good? https://github.com/karolherbst/envytools/commit/cc6c814deebf7292ab15c0fd1c29e15c5b61e9e0
09:07 imirkin: value value="0" name="2.5 GT/s"
09:07 imirkin: please don't do stuff like that
09:07 imirkin: these things are supposed to be usable to generate headers
09:07 karolherbst: I just did what mupuf did :O
09:07 imirkin: yeah, and i was going to yell at him too
09:08 karolherbst: okay
09:08 imirkin: but then i forgot
09:08 mupuf: imirkin: since when do we generate headers with rnndb?
09:08 mupuf: we stopped doing it like 6 years ago
09:08 imirkin: mupuf: we don't, but the idea is that we can.
09:08 imirkin: and every single symbol in there is header-friendly
09:08 imirkin: except the ones you just added
09:08 karolherbst: so 2_5GT should be used
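To illustrate imirkin's point: a name like "2.5 GT/s" cannot become part of a C macro name, while 2_5GT or 2P5GT can. Below is a minimal check, assuming that "header-friendly" simply means the name contains only characters valid inside a C identifier. A leading digit is tolerated on the further assumption that generated names carry a prefix. The function name is invented for this sketch and is not part of envytools.

```python
import re


def is_header_friendly(name: str) -> bool:
    """True if name uses only letters, digits and underscores (assumed rule)."""
    return re.fullmatch(r"[A-Za-z0-9_]+", name) is not None


# Compare the rejected name with the two spellings mentioned in the chat.
for name in ["2.5 GT/s", "2_5GT", "2P5GT"]:
    print(name, is_header_friendly(name))
```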
09:09 karolherbst: okay, will change it
09:09 mupuf: sounds good
09:09 mupuf: karolherbst: please change my shit then, sorry
09:09 karolherbst: yeah
09:09 imirkin: can you look at the various traces and see when it's first used?
09:10 karolherbst: imirkin: what exactly?
09:10 imirkin: 88460
09:10 karolherbst: tesla has it
09:10 karolherbst: should I look earlier?
09:11 imirkin: did you check every tesla?
09:11 imirkin: does G80 have it?
09:12 karolherbst: yeah, I have a nice bash script :)
09:12 karolherbst: nva4* doesn't seem to have it
09:12 karolherbst: ..
09:12 karolherbst: nv4*
09:12 imirkin: what about G80?
09:12 imirkin: aka nv50
09:12 karolherbst: checking
09:12 karolherbst: doesn't seem like that
09:12 karolherbst: nv8* have it
09:12 imirkin: so not *every* tesla :p
09:13 imirkin: G84-
09:13 imirkin: (add that as a variant)
09:13 karolherbst: this could be because of the traces though
09:13 karolherbst: but I doubt that
09:13 karolherbst: k
09:13 imirkin: i'd be surprised if G80 had pcie v2 support
09:13 imirkin: i don't think v2 was around back when it came out
09:14 karolherbst: checking
09:14 karolherbst: Quadro FX 5600 has v2
09:15 karolherbst: :/
09:15 imirkin: from the looks of it G92 was the first to support pcie v2
09:16 imirkin: yeah, G92 was the first
09:17 karolherbst: imirkin: what about the Quadro FX 5600?
09:17 karolherbst: but I need to find a good source first
09:18 karolherbst: maybe the wikipedia page is wrong
09:18 imirkin: https://en.wikipedia.org/wiki/GeForce_8_series -- look at the technical summary
09:19 karolherbst: imirkin: then this is lying: https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#Quadro_FX_.28x600.29_Series
09:20 imirkin: hm odd
09:20 karolherbst: the bad thing is: I really don't find anything on the web
09:21 imirkin: that page lies at least a little bit
09:21 imirkin: Quadro FX 370 LP -- claims it's a G86 but it's actually a G98
09:21 imirkin: (and i have one... heh)
09:21 karolherbst: yeah, I think the page is wrong
09:22 karolherbst: imirkin: ohh that's not that untypical, there are a lot of cards with different chips
09:23 imirkin: yeah, but all the FX 370 LP's are G98's
09:23 karolherbst: okay
09:23 karolherbst: okay patches: https://github.com/karolherbst/envytools/commit/c483531f9eb796aa507f8c960470a03cebe9f261
09:23 karolherbst: https://github.com/karolherbst/envytools/commit/1e1bd0d1d2227e9aaf86b8748534bf11e9ccc989
09:24 karolherbst: and https://github.com/karolherbst/envytools/commit/9545e50f509b3d25401a74fbf26148f00b50a011
09:33 imirkin: i do wonder about LnkCtl2: Target Link Speed: 2.5GT/s -- should check if the blob "fixes" that somehow
09:34 karolherbst: mhh
09:34 karolherbst: right
09:34 karolherbst: can't do that, because it's always 8_0 for me
09:34 karolherbst: but there is a strict order
09:34 karolherbst: 1. cap 2. ctrl2 3. sta
09:39 karolherbst: imirkin: sadly blod doesn't touch cap or lnkctl2 for me
09:39 karolherbst: *blob
09:39 karolherbst: it's always 8_0
09:40 tnt: imirkin: what makes you think NBCI is even related to PM ?
09:41 imirkin: it's part of the dsm call
09:41 karolherbst: imirkin: checked the blob on fermi: lnkctl is always at 2_5; cap and sta always match, both either 2_5 or 5_0
09:43 tnt: imirkin: yeah, but then so is NVPS ... and \WIN8 is 1 for both "Windows 2012" and "Windows 2013" OSI strings.
09:43 imirkin: karolherbst: yeah, and ajax seems to think it's fine too
09:43 tnt: LGreaterEqual (OSYS, 0x07DD) seem to be the changes between "Windows 2012" and "Windows 2013"
09:43 imirkin: make sure to note that in the bug... i gtg
09:44 karolherbst: yeah, then we are fine indeed
09:44 karolherbst: but
09:44 karolherbst: I managed to get it set to 2_5 on my kepler
09:44 imirkin: probably by doing something horrible
09:44 imirkin: i wouldn't worry about it
09:44 karolherbst: setting lnkcap to 2_5
09:45 karolherbst: I think nouveau shouldn't touch lnkcap on kepler
09:45 karolherbst: and maxwell too
09:45 imirkin: bbl
10:20 hakzsam: imirkin, I'm running the full piglit run for the boolean patch
10:20 tnt: This is going to sound dumb, but what does the audio codec have to do with nouveau and optimus ?!?
10:20 karolherbst: tnt: hdmi
10:21 imirkin_: all GT215+ gpu's have an audio subfunction pci device
10:21 tnt: karolherbst: oh, ok, didn't think of that. But the video output is connected to the intel, why would the nvidia be involved ?
10:21 imirkin_: you can't power them off independently of one another though
10:21 karolherbst: intel usually has its own audio part too
10:22 karolherbst: mhhh
10:22 karolherbst: imirkin_: what about dedicated mobile chips without any connectors?
10:22 karolherbst: didn't see anything from my card
10:22 tnt: or maybe that code is in the acpi table "just because" and doesn't actually serve any purpose ...
10:23 imirkin_: karolherbst: some of them have the audio subfunction disabled yeah
10:24 imirkin_: and others (this is genius) manage to *enable* the pci device when an hdmi cable is plugged in
10:24 imirkin_: wreaks havoc on just about everything
10:24 imirkin_: probably some bit of clever SMM
10:25 karolherbst: yeah, really clever indeed
10:25 karolherbst: but makes sense
10:25 karolherbst: my macbook used to do that
10:25 karolherbst: but it was miniDP back then
10:25 imirkin_: er yeah, well same deal for DP -- both hdmi and dp audio use the same subfunction
10:26 tnt: From looking at the ACPI table what I figured is that for < Win 8.1, it called GPON/GPOF from the _PS0 and _PS3 methods when switching power state. For more recent windows versions, it relies on the "PowerResource" object mechanism.
10:27 imirkin_: tnt: you might want to pull some linux-acpi folks into the discussion
10:27 imirkin_: my only experience with acpi is reading the decoded tables and making wild assumptions about what the various things mean
10:27 imirkin_: has worked out well for me so far, but this might be past that
10:27 tnt: mine too :) and dates from about 4h ago :p
10:27 tnt: is there #linux-acpi ?
10:27 imirkin_: dunno
10:27 imirkin_: there's definitely a linux-acpi@vger.kernel.org
10:27 tnt: nope.
10:28 imirkin_: i think there's a general linux dev channel too
10:28 tnt: yeah, might be a bit crowded though :p
10:28 imirkin_: but you should file a bug... somewhere... with all the various info you've collected thus far
10:28 imirkin_: and then perhaps start an email discussion around it
10:29 tnt: I'll continue collecting info for a couple hours, then file a bug with all I found.
10:29 imirkin_: pulling in people like ... gr, i forget. the acpi guys. :)
10:32 imirkin_: karolherbst: chances are the cleanest way to add the pci speed stuff to nouveau is to add a "pci" subdev -- it's a little annoying, but that's how everything is broken up. double-check with skeggsb_ before doing all the work though
10:32 imirkin_: karolherbst: let me know if you're up for it, if not i can probably work it out. but it'd be nice to get you writing kernel patches :)
10:33 karolherbst: imirkin_: mupuf said he wanted to do that
10:33 karolherbst: but it wouldn't be that hard actually, I'm pretty sure I know what the cards need
10:34 karolherbst: and what to do in a few special cases
10:34 imirkin_: mupuf's too busy and has plenty of contributions already; i'm sure he wouldn't mind.
10:34 karolherbst: k
10:34 imirkin_: karolherbst: but you need to do more than what you've done -- at least on my GT215 i saw the blob do what looked like a link-training sequence.
10:34 karolherbst: imirkin_: does this look complete? https://gist.github.com/karolherbst/5fdd4a543d20916bc362#file-all-in-one
10:34 imirkin_: you should do the same thing.
10:34 karolherbst: mhhh
10:34 karolherbst: yeah
10:34 imirkin_: which is a giant pain to do by hand
10:34 karolherbst: I saw that some stuff is going on in a lot of traces
10:35 karolherbst: brb
10:35 imirkin_: and probably not worth it, but the actual driver should do it
10:38 karolherbst: back
10:38 imirkin_: so basically you'd add a nvkm/subdev/pci and add things to it. i think.
10:38 imirkin_: there's no great place to put it otherwise
10:39 karolherbst: I was thinking a little about this v1 => v2 transition
10:39 karolherbst: I think I would start with this first
10:39 karolherbst: because it seems trivially enough
10:39 karolherbst: *trivial
10:39 imirkin_: you can probably just do that in devinit or something
10:39 karolherbst: should be done inside the subdev/pci ctor right?
10:39 karolherbst: mhh
10:40 imirkin_: no, you shouldn't touch the card in any ctor
10:40 karolherbst: okay
10:40 imirkin_: only touch the card in init
10:40 karolherbst: why devinit for v1=>v2?
10:40 imirkin_: [there are a few ctor's that don't respect that rule, but they know what they're doing]
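The ctor/init rule stated above can be sketched in plain C as follows. Everything here is hypothetical: `fake_hw`, `pci_subdev`, `pci_subdev_ctor` and `pci_subdev_init` are illustrative stand-ins, not nouveau's real subdev API. The point is only that the constructor records software state, and the first register write happens in init.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the card's register space; counts writes. */
struct fake_hw {
    uint32_t regs[4];
    unsigned writes;
};

static void hw_wr32(struct fake_hw *hw, unsigned idx, uint32_t val)
{
    hw->regs[idx] = val;
    hw->writes++;
}

/* Hypothetical subdev object (names are illustrative only). */
struct pci_subdev {
    struct fake_hw *hw;
    uint32_t wanted_speed;
};

/* ctor: allocate and fill in software state, no register access. */
static struct pci_subdev *pci_subdev_ctor(struct fake_hw *hw, uint32_t speed)
{
    struct pci_subdev *pci = calloc(1, sizeof(*pci));

    if (!pci)
        return NULL;
    pci->hw = hw;
    pci->wanted_speed = speed;
    return pci;
}

/* init: the first place the hardware is touched. */
static void pci_subdev_init(struct pci_subdev *pci)
{
    hw_wr32(pci->hw, 0, pci->wanted_speed);
}
```

A design like this also lets init be run again later (for example after resume) without rebuilding the object, which fits the idea of changing link speed back and forth as part of pm.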
10:41 imirkin_: well, i suspect that with the link speed we might want to change back and forth
10:41 imirkin_: as part of pm/etc
10:41 imirkin_: so it seems like there should be a separate subdev for it
10:41 imirkin_: while for v1 -> v2, you always want to do it, so might as well just do it
10:41 imirkin_: dunno
10:41 imirkin_: maybe if we have a pci subdev, then it makes sense to just stick it in there too
10:42 imirkin_: def get skeggsb_'s perspective on this... he should be waking up in 5h or so
10:45 tnt: I'm a bit confused why PCI_D3cold is used for the runtime suspend (which is the 'optimus disable discrete card' right ?) and D3hot is used for the _suspend (which is sleep ?)
10:47 karolherbst: imirkin_: the blob never changes back to v1 as far as I can tell
10:47 karolherbst: and it always switches up if it sees it is at v1
10:47 glennk: i think there are some power settings in its control panel
10:47 karolherbst: 5h ...
10:47 karolherbst: yeah well
10:47 karolherbst: okay
10:48 karolherbst: .d
10:48 karolherbst: :D
10:48 karolherbst: tnt: I can tell you why
10:48 karolherbst: tnt: for example if you kexec in d3cold the card is simply gone
10:48 karolherbst: you can't rescan it
10:48 karolherbst: you can't find it in lspci
10:48 karolherbst: it's simply gone
10:49 karolherbst: I think it has something to do with the pcie stuff or firmware
10:49 karolherbst: also windows enables the card before suspend
10:49 karolherbst: I think
10:49 karolherbst: not sure though
10:50 tnt: karolherbst: ok thanks.
10:51 Wolf480pl: Hello. I've been here 2 days ago, had "HUB_INIT timed out" errors and freezes, and I was told to try the kernel module from the hack-gk106m branch.
10:52 Wolf480pl: With runpm=0 it worked initially, but now it sometimes errors on load time, which leads to freezes. The in-tree nouveau.ko also sometimes happens to load correctly (and then works ok), and sometimes with errors (and then freezes on xorg exit)
10:54 imirkin_: what error? HUB_INIT still, or something else?
10:54 Wolf480pl: HUB_INIT
10:54 karolherbst: actually I had this issue without the hack
10:54 imirkin_: Wolf480pl: very sad
10:54 karolherbst: but it was like it worked for 10% of times or less
10:55 imirkin_: Wolf480pl: you have a GK104 iirc?
10:55 Wolf480pl: yeah
10:55 imirkin_: perhaps gk104 needs some slightly different magic
10:55 Wolf480pl: btw. without runpm=0 the freezes when starting xorg were worse (no SysRq or ping), and today I managed to get dmesg and Xorg logs for such a freeze
10:55 imirkin_: or perhaps both gk104 and gk106 need the diff magic.
10:56 karolherbst: Wolf480pl: if the driver tries to turn off the card it totally breaks for me here
10:56 karolherbst: even with the hack
10:57 imirkin_: hence runpm=0 :)
10:58 karolherbst: yeah
10:58 Wolf480pl: do you want the logs of the freeze w/o runpm=0 ?
10:59 imirkin_: not if it's some HUB_INIT-inpsired error
10:59 imirkin_: inspired*
11:02 Wolf480pl: it's a couple ACPI warnings first, then a HUB_INIT error, and later some CPU lockups, so I guess that counts as HUB_INIT=inspired
11:02 Wolf480pl: s/=/-/
11:04 karolherbst: there should be a kernel null pointer dereference
11:08 Wolf480pl: I remember seeing a null pointer dereference before, but today's logs don't seem to contain one
11:10 karolherbst: I check my pstore, maybe I have one
11:12 karolherbst: yeah, have it
11:13 karolherbst: Wolf480pl: something like that? https://gist.github.com/karolherbst/5e604f21f7cd72efaf41
11:17 tnt: I'm starting to think that the whole PCIE root port needs to be put in D3hot now ...
11:17 Wolf480pl: karolherbst: the ACPI warnings are very similar, but I get a block of "nouveau E[ PGRAPH] ... HUB_INIT timed out", and then hex digits (like all the other HUB_INITs I've seen), and I don't get the null pointer dereference or kernel oops
11:18 Wolf480pl: but I get watchdog_overflow_callback ... hard LOCKUP on cpu 1
11:24 Wolf480pl: karolherbst, https://gist.github.com/anonymous/e1c12a262abf1400b2d0 (dmesg -n info)
11:37 tnt: imirkin_: seems in win 8.1 the D3hot is set on the PCI root port rather than the device (I guess to shut down more stuff), and this is what triggers the power off in ACPI.
12:21 tnt: \o/ Indeed looks like you need to set D3hot on the hub port and then acpi GPOF gets called and power decreases by the expected amount.
12:24 karolherbst: tnt: so GPOF is the new awesome stuff?
12:25 tnt: karolherbst: well, not really. I mean, the end result is pretty much the same (although might be a slight advantage power wise for the new). But it's just that with newer kernels that advertise themselves as Win 8.1 or Win 10, the previous method just doesn't work at all anymore.
12:26 tnt: karolherbst: so you need to manually tweak acpi_osi if you want to still use the previous way.
12:27 karolherbst: I see
12:27 karolherbst: but if it needs less power, it's better then isn't it?
12:28 tnt: I would say it's better just because it's at least future proof and won't need changing the kernel cmd line each time a new acpi string is added to the kernel.
12:28 tnt: of course this is all based on a sample of exactly 1 laptop ...
12:29 tnt: tbh, I'm not even sure the DSM is required at all. Looking at the acpi table, it shouldn't have any effect now.
12:30 karolherbst: my laptop currently uses the _SB_.PCI0.PEG0.PEGP handle
12:30 tnt: karolherbst: can you pastebin your whole table ?
12:30 karolherbst: should be on the web somewhere
12:30 tnt: which laptop ?
12:31 hakzsam: imirkin_, clang is happy with the boolean patch, now I just have to check for regressions
12:31 karolherbst: tnt: https://github.com/Bumblebee-Project/bbswitch/issues/65
12:41 tnt: karolherbst: definitely different ... there isn't even any mention of "Windows 2013" in there. And nothing regarding power management at the _SB_.PCI0.PEG0 level.
12:42 tnt: _OFF gets called when the card itself goes to D3 if the flag has been set by the DSM.
13:16 karolherbst: imirkin_: around 14fps in bioshock infinite
13:16 karolherbst: pcie doesn't make a difference though
13:17 karolherbst: guess I am highly gpc bottlenecked
13:17 imirkin_: increase your cstate then? :)
13:17 imirkin_: or reduce your resolution
13:17 imirkin_: heh
13:21 karolherbst: 07 pstate: 5fps
13:21 karolherbst: 0a gave me 14, this is awesome
13:21 karolherbst: more than doubled
13:21 karolherbst: will try 0f in a moment to check if memory changes anything
13:26 karolherbst: sad that 0f is a bit unstable on gddr5 :/
13:27 imirkin_: quite.
13:27 karolherbst: I really would help with that if possible
13:27 karolherbst: although for me it really doesn't matter as much
13:27 imirkin_: i assume you've also maxed your pcie speed already
13:27 karolherbst: yeah
13:27 karolherbst: doesn't make a big difference
13:27 karolherbst: it was stable at 11 before in the main menu
13:28 karolherbst: and began to be unstable between 9 and 13 after pcie change
13:28 imirkin_: i guess all that's left is reducing resolution :)
13:28 karolherbst: will try 0f again
13:29 karolherbst: but I really don't like changing resolution
13:29 karolherbst: its like the worst
13:30 imirkin_: but faster!
13:30 karolherbst: okay
13:30 karolherbst: pcie does change something
13:31 karolherbst: yeah a little
13:31 karolherbst: with 2.5 it runs either on 10 or 11 fps stable
13:32 imirkin_: do like 1200x800 -- way faster than 1920x1080
13:32 karolherbst: sometimes it hits 9-12
13:32 karolherbst: on 8.0 its like 9-13, but more in the 11-12 area
13:33 karolherbst: what is this ...
13:33 karolherbst: can't change it if I want to keep 16:9
13:34 karolherbst: 60 fps at 640x480 :)
13:34 karolherbst: 800x600 also at 60
13:35 karolherbst: with 1024x768 it drops to 40
13:35 karolherbst: to 33 with half the gpu clock
13:35 hakzsam: btw, did someone already see this error with git "git-send-email died of signal 11". I asked on #git but no one can tell me why this happens :)
13:36 imirkin_: consistently?
13:36 hakzsam: yes, and since a long time
13:36 hakzsam: only happens on my reator machine
13:36 imirkin_: probably because you're using arch :p
13:36 hakzsam: (not my main machine)
13:36 hakzsam: ahah no :)
13:37 karolherbst: imirkin_: game gets memory bottlenecked at around 650MHz core
13:37 imirkin_: pastebin the full command line along with all the output it produces as well as your .gitconfig
13:37 karolherbst: ramping up to 862 doesn't change anything
13:38 hakzsam: imirkin_, with GIT_TRACE=1 http://hastebin.com/tedazidosa and strace logs here http://hastebin.com/havegezaba
13:38 hakzsam: imirkin_, my .gitconfig is exactly the same between my two machines
13:39 karolherbst: okay, now I do care about 0f and gddr5 :D
13:39 hakzsam: imirkin_, https://github.com/hakzsam/dotfiles/blob/master/git/.gitconfig
13:41 imirkin_: hakzsam: and did you futz with .git/config ?
13:41 hakzsam: futz?
13:41 imirkin_: edit
13:42 hakzsam: what do you mean by that?
13:42 hakzsam: ah okay
13:42 hakzsam: no I don't think but I'm going to check
13:42 imirkin_: futz == "fiddle with"
13:42 karolherbst: imirkin_: with 0f I get 19-22
13:42 karolherbst: where I got 9-13 before
13:42 imirkin_: seems like 0f is faster
13:42 imirkin_: who knew, right
13:43 karolherbst: the thing is, the core clock doesn't change here
13:43 karolherbst: only memory
13:43 karolherbst: but from like 1.6GHz to 4GHz
13:43 hakzsam: imirkin_, I don't see any differences
13:43 hakzsam: really strange issue
13:43 karolherbst: and pcie3 2.5 to 5.0 did a "clear" difference of 5%
13:43 karolherbst: *8.0
13:44 hakzsam: imirkin_, I mean I have the same config between my two machines
13:44 karolherbst: so we can expect something between 5% and 25% more performance in games with high requirements
13:44 imirkin_: hakzsam: hm ok
13:44 imirkin_: hakzsam: diff perl versions? git versions?
13:45 karolherbst: fun fact is that starting with 0f worked, but switching while the game was running did not
13:45 hakzsam: imirkin_, nope, same packages :/
13:46 karolherbst: hakzsam: try LD_PRELOAD=/lib/libSegfault.so
13:47 karolherbst: maybe this may give us a hint
13:47 karolherbst: ...
13:47 karolherbst: LD_PRELOAD=/lib/libSegFault.so
13:47 karolherbst: big F, always forget this
13:47 imirkin_: tab knows :)
13:47 karolherbst: :D
13:47 karolherbst: right
13:48 hakzsam: interesting http://hastebin.com/osopizilun
13:48 hakzsam: I didn't know that hint, thanks karolherbst
13:48 imirkin_: oh, that sounds familiar
13:48 imirkin_: i remember i had to rebuild a ton of stuff at some point
13:48 imirkin_: including SSLeay
13:48 karolherbst: after perl update: rebuild all perl modules
13:48 karolherbst: always
13:49 imirkin_: yeah
13:49 imirkin_: perl-cleaner on gentoo
13:49 karolherbst: yeah
13:49 karolherbst: :)
13:49 imirkin_: i try to skip it, but sometimes it bites me in the ass
13:49 hakzsam: mmh
13:49 hakzsam: I'll try
13:49 karolherbst: but on gentoo the module will simply not load, because its not in the path
13:49 karolherbst: :D
13:52 hakzsam: I'm upgrading all perfl modules
13:52 hakzsam: *perl
13:53 tnt: imirkin_: https://bugs.freedesktop.org/show_bug.cgi?id=91408 there you go, it's tracked with the info I have.
13:54 imirkin_: tnt: please attach dsdt in the bug rather than pastebin, which expires
13:55 tnt: ok, done
13:55 imirkin_: thanks
14:01 karolherbst: imirkin_: I get the feeling that changing to 0f while something is running is a bad idea, but doing it before starting anything works more often
14:02 imirkin_: heh
14:04 mupuf: karolherbst: I definitely don't mind if you do stuff for me! I will review it, I am really good at that! Ask hakzsam!
14:04 mupuf: </sarcasm>
14:04 karolherbst: I already noticed :)
14:05 karolherbst: mupuf: we got copy_image working though
14:05 karolherbst: "working"
14:05 karolherbst: at least bioshock infinite was happy
14:06 mupuf: what is that?
14:06 mupuf: I mean, copy image
14:06 mupuf: not bioshock
14:06 mupuf: I know the name at least and roughly the story :D
14:06 karolherbst: some random GL 4.3? extension
14:07 imirkin_: mupuf: ARB_copy_image
14:07 karolherbst: but the biggest perf boost with pcie 3.0 I got with glxspheres...
14:07 imirkin_: lets you copy from 1 texture to another
14:07 karolherbst: 314 fps on 0f 2.5 and 690fps on 0f 8.0
14:07 imirkin_: instead of things like glCopyTexImage, which copies from the framebuffer into a texture
14:09 mupuf: imirkin_: I see! it is used by glxsphere?
14:10 karolherbst: no
14:10 karolherbst: by bioshock infinite
14:11 karolherbst: it needs a lot of fancy gl4+ extensions
14:11 karolherbst: tessellation too
14:11 imirkin_: do you see tons of errors in dmesg?
14:11 imirkin_: if not, it's not really using tess :)
14:12 karolherbst: why not?
14:12 karolherbst: and no I don't see many relating to it, but it complains on intel and crashes
14:12 imirkin_: coz there's some form of fail in my tess impl (i think)
14:12 karolherbst: mhh
14:12 imirkin_: which causes infinity errors about out of bounds issues in shaders
14:13 karolherbst: mhh
14:13 imirkin_: at least in heaven 4
14:13 imirkin_: as well as some piglits
14:14 karolherbst: mhh
14:14 karolherbst: it seems like it really doesn't use it
14:14 karolherbst: maybe just eon checks for that
14:14 karolherbst: or eon does strange stuff
14:14 karolherbst: who knows
14:15 imirkin_: is it eon? i thought that was witcher 2
14:15 karolherbst: both are eon
14:15 tobijk: karolherbst: changing to another perf lvl should work just fine while running some apps
14:15 imirkin_: ah
14:15 tobijk: infact i have to run something or my card will go to sleep
14:15 tobijk: which hangs my system when i change perf lvls :>
14:15 karolherbst: at least there is a "libopenal-eon.so.1" in the bioshock folder
14:16 karolherbst: tobijk: to 0a no problem
14:16 karolherbst: but 0f....
14:16 karolherbst: not so good
14:16 karolherbst: gddr5
14:16 tobijk: i'm a lucky ddr3 user ;-)
14:16 mupuf: so, it is official ... my main pc is dead
14:16 karolherbst: :/
14:16 tobijk: dum dum dum
14:16 mupuf: rest in piece, i7-920!
14:17 mupuf: (pun intended)
14:17 karolherbst: yeah well
14:18 imirkin_: hmmm... a sign of things to come for my i7-920?
14:18 karolherbst: do you have a pcie 3.0 board mupuf?
14:18 hakzsam: mupuf, are you going to buy another one?
14:18 mupuf: karolherbst: yes
14:18 mupuf: hakzsam: well, I wanted to buy a shiny skylake or at least broadwell
14:18 karolherbst: okay
14:19 karolherbst: mupuf: you have to "buy" it?
14:19 hakzsam: mupuf, so, you will have VT-d this time :)
14:19 mupuf: and there is the internal purchase program from intel that allows saving quite a lot of bucks
14:19 mupuf: but not sure I can wait for either of those
14:20 imirkin_: of course you have to buy a pallet at a time
14:20 mupuf: I may go to verkkokauppa tomorrow, the amazon of finland which has stock accessible in 15 minutes and is open 24/7
14:20 mupuf: imirkin_: hehe
14:20 imirkin_: i guess you mean more like microcenter... amazon doesn't have physical locations
14:21 mupuf: I just need to decide if I want the 8 core extreme edition CPU or something more ... modest
14:21 tobijk: nice working laws there? :O
14:21 karolherbst: pushing subroutines a bit :)
14:21 mupuf: tobijk: they have a front desk, they go pack your stuff when you come
14:21 mupuf: but yeah, that's interesting that they are open so late
14:21 mupuf: would be illegal in france
14:22 tobijk: yeah here as well
14:22 tobijk: lets see how long it stays that way
14:22 mupuf: well, people doing it in a hotel or at an online store (direct translation from verkkokauppa)
14:22 mupuf: what's the difference?
14:23 tobijk: not enough context? huh?
14:29 mupuf: tobijk: I mean, people working in hotels the entire night, at the reception
14:29 mupuf: or someone working at the reception of an online shop, what's the difference?
14:29 mupuf: :p
14:30 mupuf: just kidding, of course
14:30 imirkin_: as long as it's not the same person for all 24 hours
14:30 karolherbst: mupuf: hotels have money and can buy laws ;)
14:30 karolherbst: at least the big ones
14:31 tobijk: mupuf: yeah because of that it will soon change to everything open 24/7, nobody really needs it but thats the way it goes ^^
14:31 hakzsam: well, SSLeay is definitely outdated on my machine
14:31 karolherbst: tobijk: if nobody needs it, it will disappear sometime, because this cost too much money then (in theory)
14:32 karolherbst: but being paid for doing nothing is also kind of nice, but boring
14:32 tobijk: thats money for bore you out ;-) (makes people sick as well)
14:33 karolherbst: yeah I know
14:34 RSpliet: I plea for shops to close between 9 and 17
14:34 karolherbst: wow
14:34 RSpliet: everyone's at work then anyway
14:34 karolherbst: did you hear about this linux kills samsung ssds issue?
14:34 tobijk: huh? *in fear*
14:35 tobijk: i have one :>
14:35 karolherbst: mhh
14:35 karolherbst: short version: updated firmware may report that the drive can be used with TRIM and NCQ
14:35 karolherbst: but if the kernel uses both, the drive bricks
14:35 karolherbst: :D
14:36 tobijk: mhm
14:36 karolherbst: https://bugs.launchpad.net/ubuntu/+source/fstrim/+bug/1449005
14:36 karolherbst: last comments
14:36 specing: always issues with samsung ssd firmware
14:37 karolherbst: https://bugs.launchpad.net/ubuntu/+source/fstrim/+bug/1449005/comments/59 :D
14:37 karolherbst: best response ever
14:38 RSpliet: "Linux is open source and can be modified by anyone, as such we do not support the OS."
14:38 RSpliet: a statement that makes faces meet palms
14:38 karolherbst: :D
14:38 tobijk: lets tell that the business folks :>
14:40 specing:recommends not supporting samsung
14:41 karolherbst: but wasn't it samsung laptops with their faulty uEFI implementation, too?
14:41 karolherbst: who got bricked
14:42 tobijk: they still make notebooks? i thought they stopped a while ago
14:43 karolherbst: was some years ago
14:43 karolherbst: maybe 2?
14:43 karolherbst: it seems like samsung really doesn't care about the serious business area
14:45 tobijk: i'd be surprised by that, but who knows :>
14:45 mupuf: karolherbst: their QA = "Does it run on windows X?"
14:45 mupuf: if so, done!
14:46 mupuf: making assumptions about the sw that is going to use the fw is dumb
14:46 mupuf: and they have an open source driver they can use to fuzz their code, why not use it?
14:46 karolherbst: who knows
14:47 tobijk: tech support aka call center without knowledge?!
14:47 karolherbst: I bet they are dealing with linux based hardware a lot these days
14:52 RSpliet: they still do chromebooks
14:53 karolherbst: yeah well
14:53 tobijk: thats not a notebook :P
14:53 hakzsam: karolherbst, upgrading perl modules fixes my git-send-email issue, thks :)
14:53 karolherbst: np
14:53 RSpliet: tobijk: it's worse, it's a tin and light notebook running Linux*
14:54 RSpliet: * for certain definitions of Linux
14:54 RSpliet: s/tin/thin/
14:54 karolherbst: I thought that was already cleared up in the linux docs
14:55 tobijk: RSpliet: i just make things up right now (tired of learning)
14:56 karolherbst: :O
14:57 karolherbst: old and busted: "video surveillance" new hotness: "Video protected"!
15:16 hakzsam: imirkin_, I updated the boolean patch, you can find the latest version here http://cgit.freedesktop.org/~hakzsam/mesa/commit/?h=nouveau_bool&id=245d145a256ed34a61d01c639462bca5fddf699d
15:23 imirkin_: hakzsam: hmmm... i feel a bit weird about bool *res8 = (bool *)result
15:23 imirkin_: hakzsam: that's just asking for trouble... could you make it a uint8_t ?
15:25 hakzsam: I bet this is located in nv50_query_result()
15:25 imirkin_: yea
15:25 imirkin_: and nvc0_query_result
15:25 hakzsam: yes
15:26 imirkin_: also please undo your change to "non-boolean caps"
15:26 imirkin_: (also in multiple places)
15:27 imirkin_: otherwise this looks fine
15:27 hakzsam: forgot to undo those changes
15:30 karolherbst: are booleans ABI defined?
15:30 hakzsam: imirkin_, I would prefer to avoid to re-submit the patch because it needs moderator approval each time... can I get your R-b or do you want to see the updated version?
15:30 karolherbst: so are bools required to be either 1 or 0?
15:31 imirkin_: i'd like to see it, but not necessarily on ml
15:31 hakzsam: http://cgit.freedesktop.org/~hakzsam/mesa/commit/?h=nouveau_bool&id=750ab62b5013e02ba53aeb37f861616a2c03c402
15:31 imirkin_: karolherbst: boolean is a stupid typedef. bool is required to be 0 or 1
15:31 karolherbst: okay
15:31 imirkin_: and if you manage to scribble over its memory with a non-0/1, then all hell breaks loose
15:31 karolherbst: because casting to bool* sounds like trouble indeed then
15:32 karolherbst: I always though false should be 0 and true should be !false, but maybe there are other issues with that
15:32 karolherbst: *thought
15:32 imirkin_: but tbh i don't know how defined it is that sizeof(bool) == 1
15:33 karolherbst: mhh
15:33 imirkin_: in practice, it is... at least on reasonable platforms
15:33 imirkin_: but... it's just a class of issues i want nothing to do with
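A minimal sketch of the uint8_t suggestion above, assuming a hypothetical helper (`store_bool_result` is not the actual code in nv50_query_result or nvc0_query_result). Writing through a byte pointer and normalising to 0/1 avoids ever leaving a value other than 0 or 1 in memory that is later read as a bool.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: store a boolean query result through a byte
 * pointer rather than a bool pointer. The comparison normalises any
 * non-zero raw value to exactly 1.
 */
static void store_bool_result(void *result, uint64_t raw)
{
    uint8_t *res8 = (uint8_t *)result;

    res8[0] = (raw != 0);
}
```

Note that a plain truncating store of a raw value such as 0x100 would leave 0 in the byte, which is why the sketch compares against zero instead of casting.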
15:33 karolherbst: in c++ it is even more annoying
15:33 karolherbst: sizeof(bool)=1
15:33 karolherbst: sizeof(class A{ bool a;})=1
15:34 karolherbst: sizeof(class A{ bool a; int b;}) != sizeof(bool) + sizeof(int)
15:34 imirkin_: that's fairly normal
15:34 imirkin_: unless you force it to be packed
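The padding effect described above can be checked with a few lines of C. The struct names are made up for the example; exact sizes are platform-dependent, so the sketch only computes the padding rather than assuming a number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct only_bool { bool a; };        /* no padding needed            */
struct bool_int  { bool a; int b; }; /* b must be aligned for an int */

/* Bytes of padding the compiler inserted between a and b. */
static size_t bool_int_padding(void)
{
    return offsetof(struct bool_int, b) - sizeof(bool);
}
```

On common platforms with a 1-byte bool and a 4-byte int this gives 3 padding bytes and a struct size of 8; a compiler-specific packed attribute is the usual way to force the members together, at the cost of unaligned access.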
15:34 karolherbst: mhh right
15:34 karolherbst: completely forgot about this
15:34 karolherbst: like always
15:34 imirkin_: would be the same with char
15:34 karolherbst: yes
15:35 imirkin_: or short, for that matter
15:35 karolherbst: but char is worse than bool
15:35 imirkin_: in terms of packing? should be the same
15:35 karolherbst: its 32bit in C right?
15:35 imirkin_: char is 1 byte
15:35 imirkin_: in Java it's an unsigned short
15:35 imirkin_: while 'byte' is a signed char
15:36 imirkin_: which causes no end of frustration
15:36 imirkin_: if you e.g. have byte x; if (x == 0xff) will never be true.
15:36 karolherbst: mhh wasn't there a time, where char was required to be 32bit? I read it somewhere
15:36 imirkin_: (i spent *hours* figuring this out the first time 'round)
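The same trap can be reproduced in C, with int8_t standing in for Java's signed byte (an analogy, not Java code). In both languages the 8-bit value is widened to int before the comparison, so a stored 0xff becomes -1 and never equals 255. The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* x is promoted to int first, so -1 is compared with 255: always false. */
static int equals_ff_naive(int8_t x)
{
    return x == 0xff;
}

/* Masking to the low 8 bits after promotion gives the intended test. */
static int equals_ff_masked(int8_t x)
{
    return (x & 0xff) == 0xff;
}
```

Compilers may warn that the naive comparison is always false, which is exactly the hint that was missing during the hours of debugging mentioned above.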
15:37 imirkin_: perhaps now, with UCS-4
15:37 imirkin_: but char has always been a single byte in C
15:37 imirkin_: since the beginning of time
15:37 imirkin_: and until the end of time
15:37 imirkin_: ;)
15:37 karolherbst: mhh
15:38 karolherbst: seems I am wrong
15:38 karolherbst: I heard about something which is 32 bit in C, but only 8 in C++
15:39 imirkin_: perhaps something like
15:39 imirkin_: struct { char x; } foo[5]
15:39 imirkin_: dunno
15:41 imirkin_: hakzsam: Acked-by: me. you can push with that.
15:41 imirkin_: thanks for taking care of that... the stupid bool situation has been an annoyance for a while
15:42 hakzsam: okay, I'll do
15:47 hakzsam: pused
15:47 hakzsam: +h
15:48 imirkin_: awesome
15:48 imirkin_: now if you want to do s/INLINE/inline/...
15:49 imirkin_: actually dunno, maybe we should keep that around
15:49 hakzsam: mmh not yet :) I would prefer to fix the vertexid stuff
15:51 hakzsam: imirkin_, btw, what about the cache flush for constbufs patch?
15:52 imirkin_: hakzsam: i need to understand what's going on there
15:52 imirkin_: i'm fairly sure it has to be wrong. at the very least i think you flush under too many circumstances
15:52 imirkin_: perhaps it's necessary on nv50
15:53 imirkin_: coz i disabled the "regular" cb upload logic
15:53 hakzsam: okay, no rush for this one anyways
16:58 Lazik: imirkin_ : byte is an unsigned char
16:59 Lazik: it will solve all your frustration ;)
17:02 imirkin_: Lazik: unfortunately in java, that is just plain not the case.
17:02 imirkin_: it is, in fact, a signed char.
17:03 imirkin_: and 'char', is actually an unsigned short.
17:03 imirkin_: it's a great language.
17:04 Lazik: ahhh Java...
17:04 Lazik: lol
17:04 imirkin_: many hours of debugging to figure out why x == 0xff didn't work.
17:05 Lazik: ahahah Java is like the wrong choice between python and c/c++
17:06 imirkin_: that's a naive thing to say... there are often various constraints
17:07 Lazik: I disagree, Java is one of the worst language out there
17:10 Karlton: COBOL is the best language
17:14 karolherbst: char is just strange
17:14 karolherbst: in every regard
17:15 karolherbst: imirkin_: in C?
17:15 imirkin_: in java.
17:15 imirkin_: as i've mentioned like 10x already
17:15 karolherbst: char is for characters, not numbers :p
17:16 imirkin_: yes. and that's why it's an unsigned short.
17:16 karolherbst: you can actually have a lot of fun with char in c++ too
17:16 imirkin_: however that goes counter to the standard set by C of char being a signed int8_t
17:16 karolherbst: with char it is maybe more obvious but a lot of fun is (u)int8_t
17:16 imirkin_: and byte, unofficially, being always typedef'd to unsigned char
17:16 imirkin_: of course java doesn't even have usual unsigned types
17:17 karolherbst: char isn't signed
17:17 karolherbst: its not even unsigned :D
17:18 karolherbst: and signed char is not a char
17:20 imirkin_: heh
17:21 karolherbst: char is implementation defined
17:21 imirkin_: in theory
17:21 karolherbst: its in the standard
17:21 karolherbst: yeah
17:21 imirkin_: in practice it's always an int8_t, otherwise too many things would break
17:22 marcheu: imirkin_: *ahem* powerpc *ahem*
17:22 karolherbst: I think in C it may be, but I think in c++ its more strict than that, but the standard messed up (u)int8_t anyway
17:22 imirkin_: marcheu: and nothing works on ppc :p
17:23 karolherbst: marcheu: at least char is defined by the compiler ;)
17:23 karolherbst: but what is char on ppc usually?
17:23 marcheu: anyway you will make benh sad if you assume that char is signed
17:23 marcheu: karolherbst: unsigned
17:23 karolherbst: I see
17:23 marcheu: on linux ppc that is
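Since the signedness of plain char is implementation-defined, code can ask the implementation instead of assuming. A small C sketch (function names are illustrative): CHAR_MIN from <limits.h> is negative exactly when plain char is signed, and converting through unsigned char gives the same byte value on either kind of platform.

```c
#include <assert.h>
#include <limits.h>

/* 1 where plain char is signed (typical x86), 0 where it is unsigned. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Byte value in 0..UCHAR_MAX, independent of plain char's signedness. */
static unsigned byte_value(char c)
{
    return (unsigned char)c;
}
```

Going through unsigned char (or uint8_t) whenever a char is used as a number keeps the result the same on x86 and on the unsigned-char platforms mentioned above.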
17:24 Lazik: unsigned char array in c/c++ is as far as you can get to a pure byte buffer
17:24 marcheu: not sure about AIX, OS X and other oddities
17:24 Lazik: which is lovely
17:24 karolherbst: I would use uint8_t
17:24 Lazik: or uint8_t w/e
17:24 karolherbst: I really don't like the "old" types
17:24 karolherbst: like int or long or char
17:24 karolherbst: you never can be sure whats behind that
17:24 benh: marcheu / karolherbst: powerpc isn't the only arch for which char is unsigned afaik
17:24 marcheu: see he is sad now
17:25 benh: :-)
17:25 benh: I think on ARM too no ?
17:25 marcheu: yeah I think so
17:25 karolherbst: to be honest: the (u)intxx_t types are the best thing that could happen to platform independent development
17:26 karolherbst: it would be valid to let char range from -63 to 192 :)
17:26 karolherbst: :D
17:27 karolherbst: wow
17:27 karolherbst: its even worse
17:27 karolherbst: on EBCDIC platforms there is a gap between 'i' and 'j'
17:27 karolherbst: ...
17:27 karolherbst: how bad is that
17:30 benh: well, it's not like anybody actually cares about ebcdic :-)
17:30 karolherbst: :D
18:20 imirkin_: skeggsb_: any idea why the memory frequency on my GK208 doesn't end up getting changed, despite no errors from nouveau
18:21 karolherbst: imirkin_: let me check something for ya
18:22 karolherbst: okay
18:22 karolherbst: my cstate changes make it possible to dump all cstates (with core and memory clocks), so we could try to check if the cstate table is just messed up?
18:22 imirkin_: skeggsb_: and separately, comments would be appreciated on the proper location for doing the pcie link speed adjustments
18:23 imirkin_: karolherbst: what does cstate have to do with it?
18:23 karolherbst: pstate always sets the highest cstate
18:23 imirkin_: right. and that works properly
18:23 imirkin_: the cstate is set to the high value.
18:24 karolherbst: but the memory clocks remains the same?
18:24 imirkin_: yes
18:24 karolherbst: but cstate also have information about the memory clock
18:24 imirkin_: http://hastebin.com/uyujorilon.css
18:25 karolherbst: that doesn't help
18:25 imirkin_: ok
18:25 karolherbst: imirkin_: https://gist.github.com/karolherbst/d4e7a2c1819aad601ffa
18:25 karolherbst: this is my cstate table on 0a
18:26 karolherbst: list_for_each_entry(cstate, &pstate->list, head) { cstate->domain[nv_clk_src_mem]; if (args->v0.states == 64) break }
18:26 karolherbst: this is the code I use for it now
18:26 karolherbst: I stripped the unimportant parts
18:27 karolherbst: it's messy though, because every pstate has its own list of cstates
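A userspace sketch of the loop described above, for readers without the kernel tree at hand. Everything here is an assumption made for illustration: `struct cstate`, `struct pstate`, `MAX_STATES` and `collect_mem_clocks` are hypothetical, simplified stand-ins, not nouveau's real structures or API. It walks one pstate's own cstate list, records each entry's memory clock, and stops at a fixed cap, mirroring the `== 64` check in the snippet.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not nouveau's real structures. */
struct cstate {
    unsigned int core_khz;
    unsigned int mem_khz;
    struct cstate *next;    /* next cstate of the same pstate, or NULL */
};

struct pstate {
    unsigned char id;       /* e.g. 0x0a or 0x0f */
    struct cstate *cstates; /* head of this pstate's own cstate list */
};

#define MAX_STATES 64       /* cap on reported entries, as in the snippet above */

/* Copies the memory clock of each cstate into out[], stopping after max
 * entries. Returns the number of entries written. */
size_t collect_mem_clocks(const struct pstate *ps, unsigned int *out, size_t max)
{
    size_t n = 0;
    const struct cstate *cs;

    for (cs = ps->cstates; cs != NULL && n < max; cs = cs->next)
        out[n++] = cs->mem_khz;

    return n;
}
```

Dumping the result per pstate makes it easy to see whether every cstate of a pstate carries the same memory clock, which is what the conversation goes on to check.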
18:29 karolherbst: mhh wait
18:29 karolherbst: I may know where you could fail
18:29 karolherbst: https://github.com/karolherbst/nouveau/blob/master/drm/nouveau/nvkm/subdev/clk/base.c#L187-195
18:29 karolherbst: seems like there is only one memory clock per pstate indeed
18:30 karolherbst: so you get 810 memory clock on 0f state?
18:30 imirkin_: yep
18:31 karolherbst: https://github.com/karolherbst/nouveau/blob/master/drm/nouveau/nvkm/subdev/clk/base.c#L190 this line
18:31 karolherbst: I would assume it never returns 0
18:31 karolherbst: or it returns 0 only with the 810 clock
18:32 imirkin_: well, if gk104_ram_calc_data failed, then it'd print
18:33 imirkin_: i suppose either nvkm_sddr3_calc or gk104_ram_calc_sddr3 could fail though?
18:33 imirkin_: nope, the latter can't fail
18:33 karolherbst: what about gk104_ram_calc?
18:33 karolherbst: this should be called there, right?
18:33 imirkin_: aha, there could be an unrecognized timing_ver?
18:34 imirkin_: Timing mapping table at 0x7229. Version 17.
18:34 imirkin_: errr
18:34 imirkin_: Timing table at 0x73fc. Version 32.
18:34 imirkin_: aka 0x20
18:35 imirkin_: i guess i should throw some prints in all over the place
18:35 karolherbst: BUG_ON?
18:35 imirkin_: haha
18:35 imirkin_: nv_debug more like it
18:35 karolherbst: no
18:35 karolherbst: there is a BUG_ON
18:35 imirkin_: yeah but that clearly doesn't get hit
18:36 karolherbst: in gk104_ram_calc
18:36 karolherbst: I am just thinking whether that's too harsh, but maybe it's fine
18:40 imirkin_: grrr
18:40 karolherbst: imirkin_: do you know if gk104_ram_prog_0 is called with the right frequency?
18:40 imirkin_: i need to reboot
18:40 karolherbst: :/
18:40 imirkin_: i'll worry about this later.
18:40 karolherbst: okay
18:41 imirkin_: i must have changed something dumb in my config
18:41 imirkin_: and now it hates me
18:41 imirkin_: nouveau: disagrees about version of symbol drm_get_edid
18:41 imirkin_: and so on.
18:41 karolherbst: :/
18:41 imirkin_: oh, i should just force it
18:41 karolherbst: :D
18:41 karolherbst: sometimes I also get strange issues
18:41 karolherbst: like ttm symbols not found
18:41 karolherbst: while compiling
18:42 karolherbst: but the module is there
18:42 imirkin_: gah! where did -f go?!
18:42 karolherbst: imirkin_: kernel option ;)
18:42 imirkin_: stupid modversions, why did i ever turn that dumb thing on
18:42 karolherbst: mhhh
18:42 karolherbst: sounds nice though
18:43 imirkin_: until you try to do something that you know will work fine
18:43 karolherbst: :D
18:43 imirkin_: but it prevents you from doing
18:44 karolherbst: there has to be a reason for CONFIG_MODULE_FORCE_LOAD
18:47 imirkin_: # CONFIG_MODULE_FORCE_LOAD is not set
18:47 imirkin_: blast :(
18:48 karolherbst: for a reason
18:48 imirkin_: what's that?
18:48 imirkin_: i don't use a distro config... i think i normally turn that on, must have forgotten
18:48 imirkin_: i normally port my config across boxes, might have done this one from scratch though
18:48 karolherbst: mhh, I can imagine that sometimes something can go pretty wrong
18:49 imirkin_: sure. but what reason is it not enabled in my build?
18:49 imirkin_: i can't imagine anything good
18:49 imirkin_: i'm not the overly cautious type.
18:49 karolherbst: :D
18:50 imirkin_: i think we're just left with incompetence. oh well.
18:50 karolherbst: maybe you force-loaded once and your fs got messed up with random data
18:50 karolherbst: who knows
18:51 karolherbst: but usually I trust the descriptions of configs, and if they say it's "usually" a really bad idea, I think it is
18:54 karolherbst: wow, gk104_ram_calc_gddr5 is a monster :/
19:04 imirkin_: you can imagine how much fun ben had working it out
19:04 karolherbst: yeah :/
19:04 imirkin_: and (according to him) it's quite correct, on at least some boards
19:04 imirkin_: however the reclock still fails for mysterious reasons
19:04 karolherbst: mhh
19:20 karolherbst: okay, the gddr5 timings are the same as the blob for me
19:30 karolherbst: too tired now
19:31 karolherbst: maybe if I have a LOT of time, I'll try to find a difference between the nouveau gddr5 stuff and the blob gddr5 stuff