00:31 karolherbst: RSpliet: gt215_pll_calc does calculate the values for the pll for a specific frequency?
00:31 karolherbst: so my assumption is, if I got the additional constraints into that function it will work then
00:55 karolherbst: okay
00:56 karolherbst: M stays 1 all the time
00:56 karolherbst: only P and N are changing for the blob
00:58 karolherbst: although P is mostly 1, sometimes 2
01:13 karolherbst: oh well
01:13 karolherbst: gt215_pll_calc behaves differently depending on the previous pstate (or mem clock)
01:17 karolherbst: oh well, doesn't seem to matter
01:29 karolherbst: okay, slowly I understand which values are kind of bad
01:34 karolherbst: okay
01:34 karolherbst: I think I got it
01:35 karolherbst: the refclock used in the gt215_pll_calc function has to be much lower, so that the N value can increase much more
01:35 karolherbst: which is done by the "main" Pll I assume
01:36 karolherbst: so instead of using the old value from the old mem clock, the clock has to be lowered to something really low
01:36 karolherbst: and then just multiply this value with a high multiplier
01:36 karolherbst: something above 10
01:37 karolherbst: M at 1 seems to be a good and stable value
01:42 karolherbst: that also explains the 4050 clock for me
01:44 karolherbst: for the highest pstate blob uses a base clock around 182 MHz
01:54 karolherbst: does that make sense somehow?
01:54 karolherbst: I think the second PLL is only stable with multiplications and nothing else
01:55 karolherbst: P set to 2 seems also to be nice
01:58 karolherbst: mupuf_: you there?
02:32 karolherbst: with M=1 it seems to be a lot more stable
02:32 karolherbst: but I think to be stable for sure, the other PLL has to generate a lower clock
02:32 karolherbst: got like 200 pstate changes working without issues
02:33 karolherbst: and even after that the card recovers when set to 07 pstate
02:34 mlankhorst: that's good :)
02:34 karolherbst: yeah
02:34 karolherbst: running at 0f again without nouveau unload or card reset
02:34 karolherbst: even switching pstates while messed up worked
02:35 karolherbst: application was still messed up
02:35 karolherbst: so my conclusion is: 1. main PLL has to generate much lower clocks than usually used for the other pstates 2. second PLL should only change the N value
02:36 karolherbst: M should stay 1 and P may be 2 if nothing else is possible
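The conclusion above can be sketched as a small calculation. This is only an illustration, assuming the common PLL relation out = refclk * N / (M * P). The function name `calc_n_only` and the N limits are hypothetical and are not taken from nouveau's gt215_pll_calc.

```python
def calc_n_only(target_khz, refclk_khz, n_min=8, n_max=255):
    """Pick (N, M, P, out) with M fixed at 1.

    Assumes out = refclk * N / (M * P). P=1 is tried first; P=2 is only
    used when no N in [n_min, n_max] fits with P=1. The limits are
    illustrative, not real hardware limits.
    """
    m = 1
    for p in (1, 2):
        n = round(target_khz * p / refclk_khz)
        if n_min <= n <= n_max:
            return n, m, p, refclk_khz * n // p
    raise ValueError("no N in range for this refclk")


# 182252 kHz reference, ~2004 MHz PLL target (half of the 4008 MHz data rate)
print(calc_n_only(2004000, 182252))
```

With a low reference clock, N alone gives fine-grained steps, which is why M never needs to move in this sketch.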
02:37 karolherbst: mlankhorst: funny though, switching from 0a to 0f doesn't work now, because that would calc N=2 => requested clock 4008
02:37 mlankhorst: ah
02:37 karolherbst: =set clock 3240
02:37 karolherbst: but
02:37 karolherbst: this is a driver issue
02:37 karolherbst: not card
02:38 karolherbst: the echo 0f just hangs and the driver gets some issues
02:38 karolherbst: maybe the diff between the clocks is too big and nouveau can't handle that
02:39 karolherbst: this is my current hack now: https://github.com/karolherbst/nouveau/commit/dd2312916f4422426dc771236801569f891a580c
02:40 karolherbst: I just removed the loop which increases M
02:40 karolherbst: for the PLL used at higher clocks
02:41 karolherbst: would like to test other cards first though before doing anything serious
02:48 karolherbst: now 400 pstate changes were good
02:48 karolherbst: while running borderlands
02:50 karolherbst: and recovered again
02:50 karolherbst: at 0f again and no problem
02:51 karolherbst: but application has to be quit I think
02:51 karolherbst: don't think it may recover while running?
03:21 Hauke: will these patches make it into 4.3: http://lists.freedesktop.org/archives/dri-devel/2015-August/087955.html ?
03:22 Hauke: or should I change something?
03:39 karolherbst: Hauke: I think it depends if it was tested on other cards as well
03:40 Hauke: I only have one nvidia card
03:41 karolherbst: mlankhorst: 1500 reclocks without issues
04:13 pmoreau_: Hauke: What setup would be needed to test your patches?
04:19 Hauke: a test with a monitor which uses dual link dvi would be nice
04:19 pmoreau_: How do you know whether it's using dual link or not?
04:20 Hauke: when the pixel clock is higher than 165 MHz
04:20 pmoreau_: Ok
04:20 Hauke: that is needed for monitors with more than 1920x1200
04:21 Hauke: the additional test would be to use an HDMI-connected monitor with more than 1920x1200, so for example 2560x1440
04:21 pmoreau_: I don't have any monitor with more than 1920x1200. Could using multiple ones also work?
04:22 Hauke: basically this patch increases the maximum allowed pixel clock for HDMI to 225 MHz and changes some parts of the driver where it activates dual link mode when the pixel clock is higher than 165 MHz to also check if this connection is dual link capable
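A rough sketch of the decision Hauke describes, for illustration only. The function name `link_mode` and the choice to reject modes that do not fit are assumptions; the actual patch on dri-devel may structure this differently.

```python
SINGLE_LINK_MAX_KHZ = 165_000  # single-link limit mentioned in the discussion
HDMI_MAX_KHZ = 225_000         # raised HDMI limit described for the patch


def link_mode(pixel_clock_khz, is_hdmi, dual_link_capable):
    """Return 'single', 'dual', or None when the mode does not fit.

    Hypothetical helper: HDMI stays single link up to 225 MHz, while
    other connectors above 165 MHz only go dual link if the connection
    is actually dual-link capable.
    """
    if is_hdmi:
        return "single" if pixel_clock_khz <= HDMI_MAX_KHZ else None
    if pixel_clock_khz <= SINGLE_LINK_MAX_KHZ:
        return "single"
    return "dual" if dual_link_capable else None


print(link_mode(220_000, is_hdmi=True, dual_link_capable=False))
```

The point of the extra capability check is that a clock above 165 MHz alone no longer implies dual link.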
04:23 Hauke: I need this to operate my new monitor at 2560x1440@56
04:23 Hauke: over hdmi
04:23 Hauke: the blob can also do this
04:24 Hauke: I tested this with my hdmi monitor and my dvi (single link) monitor
04:25 pmoreau_: I'll give it a try before going to sleep
04:26 Hauke: I still do not know if this patch causes any problems with dual link DVI connections or with graphics cards other than GF114, which could have problems with HDMI at a 225 MHz pixel clock.
04:26 Hauke: pmoreau_: thanks
04:27 Hauke: what was the first nvidia graphics card with hdmi support?
04:27 pmoreau_: I can test on some Tesla, and a GF100
04:27 Hauke: that would be nice
04:28 pmoreau_: Hum... I have a few Tesla with HDMI outputs
04:31 Hauke: a tesla with hdmi?? that looks strange
04:32 pmoreau_: Well, they do :-) And envytools report an NV_HDMI block starting from G84.
07:20 marcosps: imirkin_: around?
07:41 mwk: pmoreau_: if we keep adding stuff to this PR like that, it's never going to get merged :p
07:41 mwk: anyhow, good effort
07:44 mwk: the major problems left are PHEAD (should definitely stay as PCRTC due to history) and use of FORMAT for pitch/blocklinear (will conflict with surface format, LAYOUT would be better)
07:44 mlankhorst: morning
07:47 pmoreau_: mwk: :D Hopefully, I think CRTC -> HEAD was the last one.
07:47 pmoreau_: I'll address your comments tonight
07:47 pmoreau_: Good job for reviewing it!
07:48 mwk: mupuf_: I think PR 17 is good to merge, do it?
07:48 pmoreau_: mwk: So, keep every PCRTC as PCRTC, or only for pre-G80?
07:49 mwk: PCRTC only exists on pre-G80 in the first place :)
07:50 pmoreau_: Right :D
07:50 pmoreau_: So use PCRTC and CRTC on pre-G80, and HEAD on post?
07:51 mwk: yeah
07:51 mwk: although using HEAD on pre-G80 is not an error
07:51 mwk: only PHEAD is wrong
07:51 pmoreau_: Ok
07:51 mwk: so eg. the HWSQ events can stay as HEAD0_VBLANK even for NV4x
07:51 mwk: would be a good thing, too
08:26 karolherbst: okay, gddr5 0f pstate stable for me now with a solution which may already work for multiple cards
08:26 karolherbst: now the real part begins
08:27 karolherbst: RSpliet: do you think it's hard to let nouveau clock the memory down before actually enabling the second PLL while switching to a high clock?
08:28 karolherbst: the thing is I don't really know to which clock I should clock
08:37 karolherbst: ohh it's all inside the gk104_ram_calc_gddr5 function
08:37 karolherbst: should be fine then
08:53 karolherbst: well
08:56 karolherbst: now the final test
09:04 karolherbst: I think the card doesn't like it, if the state is changed like every 0.05 seconds ? :/
09:07 karolherbst: but actually no application failed to start anymore at 0f pstate, so I think that may be good enough
09:32 mlankhorst: now try max-texture-size parallel and watch your card die >:D
09:32 imirkin: that dies without any reclocking anyways
09:36 sobukus: Hi, I'd like to use onboard radeon + quadro 290 for running 3 screens with xrandr 1.4. I got both gpus listed in xrandr --listproviders, but I fail to find some documentation that is _not_ about offloading rendering but about simply running 3 screens in a post-xinerama world.
09:36 sobukus: Any hints?
09:37 sobukus: The manpage of xrandr is not that helpful
09:37 marcosps: imirkin: all patches sent to mesa ML, I hope I addressed all issues you reported ;)
09:38 imirkin: sobukus: well, if you want a single logical screen, your only options are xinerama and xrandr. for xrandr, if you want multiple gpu's, you have to use provider offloading
09:38 imirkin: sobukus: your other option is to have them as *separate* screens, but a single display, in which case that should be fairly easy to set up
09:38 sobukus: imirkin: you mean things have to be rendered on one single GPU?
09:38 imirkin: but a particular X client will be attached to the screen it starts on
09:39 imirkin: sobukus: yes
09:39 sobukus: I had a Zaphod-style setup running with a quadro 280 before, but this stopped working now ... I cannot move the cursor to the third screen. So I figured I try the proper modern way.
09:40 imirkin: sadly there is no proper modern way
09:40 imirkin: the zaphod thing should work
09:40 sobukus: It did before, but it's been acting weirdly and I wondered if anyone is testing this nowadays.
09:41 imirkin: you, and like 1 other guy :)
09:42 sobukus: Of course. So I see that the nouveau gpu supports "source offload" only. Does that mean I should try to enslave the HD3200 onboard to show pictures rendered by the quadro?
09:43 sobukus:needs to look up if the Quadro NVS 290 actually can drive that
09:45 imirkin: that sounds old, aka 2 CRTCs
09:46 imirkin: you basically have to use the primary gpu to do all your rendering
09:46 sobukus: It has this one big connector that splits into two DVI-I
09:46 imirkin: and there's no good way to pick which gpu is your primary one
09:46 imirkin: DMS-59, everyone's favourite
09:48 sobukus: I'm not yet in the clear about the terminology, source/sink output vs. source/sink offload.
09:48 sobukus: IF offload is the only way, what is output for?
09:54 marcosps: imirkin: I'm now looking at this task: https://trello.com/c/DhzCNweo/69-nv50-propagate-constants-into-16-bit-mul-fma
09:55 marcosps: is there a doc that describes these cvt, shl instructions?
09:55 imirkin: marcosps: do you have a nv50-era gpu? i thought you only had a fermi or something
09:55 imirkin: marcosps: ssssorta. take a look at envytools.rtfd.org -- look for tesla isa
09:55 imirkin: marcosps: cvt = convert, shl = shift left
09:56 imirkin: rzi = round towards zero, integer
09:56 marcosps: imirkin: oh, true... I forgot this detail...
09:56 sobukus:killed the X session by trying to set nouveau as provider for radeon, or similar
10:00 marcosps: imirkin: I'm now looking at the tesla isa... thanks for the link! Also, can you point me to a new task, anything other than that tessellation one, just to get more used to mesa code?
10:01 imirkin: marcosps: which gpu do you have again?
10:04 marcosps1: I have a GF117M
10:05 marcosps: imirkin: to be more specific: NVIDIA Corporation GF117M [GeForce 610M/710M/810M/820M / GT 620M/625M/630M/720M] (rev a1)
10:05 sobukus: imirkin: When I set radeon as source for nouveau, I get three cloned screens; the other way round I get a server crash, at least effectively.
10:05 imirkin: right. i'll try to come up with a good task later on... don't have anything off-hand
10:05 imirkin: sobukus: probably an older X server? there were def issues with that
10:06 sobukus: imirkin: 1.17.2
10:06 marcosps: imirkin: is there some way to add a test to piglit or mesa after my patches get merged?
10:07 sobukus: I'm not sure which way is best, but since the radeon only drives one of the smaller displays, I guess it would be best if the main grunt is pulled by the nouveau.
10:07 imirkin: sobukus: hmmm
10:07 sobukus: I managed to crash the "good" setup, too, while trying to rotate one of the screens on the enslaved nouveau card.
10:07 sobukus: rotating the radeon screen was ok
10:08 imirkin: yeah, xrandr reverse prime rotation is only supported on xorg-git afaik
10:08 karolherbst: mlankhorst: so what should I execute? :D
10:09 karolherbst: I am serious by the way :p
10:09 sobukus: imirkin: Well, I could try that. Or try to get zaphod work again without it preventing the cursor from entering the third screen.
10:11 karolherbst: imirkin: maybe you know what mlankhorst meant by max-texture-size parallel ? I guess this is a piglit test, but I don't know which one ;)
10:11 imirkin: karolherbst: 'max-texture-size' surprisingly enough :p
10:12 karolherbst: just assume I know nothing about piglit
10:12 karolherbst: ohh in bin
10:13 karolherbst: mhh
10:13 karolherbst: card isn't dead
10:13 karolherbst: or do I have to execute the test more times
10:13 imirkin: in parallel
10:13 karolherbst: so like 100 executions in parallel
10:14 imirkin: ahahaha
10:14 imirkin: 2 should be more than sufficient.
10:14 karolherbst: okay
10:14 karolherbst: DRI_PRIME=1 ./bin/max-texture-size parallel & DRI_PRIME=1 ./bin/max-texture-size parallel ?
10:14 sobukus: https://bugs.freedesktop.org/show_bug.cgi?id=62773 ← not that much activity:-/
10:14 karolherbst: well
10:14 imirkin: sobukus: rotation is supported in xserver-git
10:15 karolherbst: "nouveau: kernel rejected pushbuf: Cannot allocate memory" but that has nothing to do with mem clock, has it?
10:15 sobukus: imirkin: OK, I'll then try to install that. It would be nice if unsupported features in Xorg wouldn't so often result in a crash instead of error messages (not bitching about nouveau in particular here).
10:17 karolherbst: imirkin: ohh saw your comment now
10:17 imirkin: sobukus: i agree, that would be preferable.
10:17 karolherbst: I ran bioshock infinite for an hour at 0f now
10:17 imirkin: sobukus: bugs are indeed annoying.
10:17 karolherbst: and no issue at 0f
10:17 karolherbst: switching pstates also kind of works
10:18 karolherbst: but after like 2000 switches in 0.05 seconds interval the card isn't that happy
10:18 imirkin: that's still pretty good
10:18 karolherbst: yeah
10:18 karolherbst: I figured out what the problems are
10:18 karolherbst: I have a "hack" showing them: https://github.com/karolherbst/nouveau/commit/7588f95307ffec8c3d81c05e672edaaf356151dd
10:19 imirkin: that means a 0.05% failure rate
10:19 sobukus: imirkin: Well OK, that's a rather general statement. It's just that bugs that kill your desktop session are annoying++;-)
10:19 karolherbst: first one is gt215_pll_calc without increasing M
10:19 karolherbst: M=1 => stable gddr5 0f pstate
10:19 karolherbst: second: reduce refclk for second PLL
10:19 karolherbst: the second PLL is only used as a multiplier in the blob
10:19 karolherbst: nothing more
10:20 karolherbst: so it just takes the clock from the first PLL and multiplies it
10:20 karolherbst: that's why the first PLL produces like 1620MHz clocks at 0a pstate
10:20 karolherbst: but only 182MHz at 0f on the blob
10:20 karolherbst: imirkin: yeah, this rate seems about right
10:20 RSpliet: karolherbst: nice conclusions
10:21 karolherbst: yep
10:21 karolherbst: and it works with my hack
10:21 karolherbst: only have to figure out a nice way to do this: refclk = 182252;//fuc->mempll.refclk;
10:21 karolherbst: the blob doesn't use many values there
10:21 karolherbst: I found like 4 now?
10:21 karolherbst: https://gist.github.com/karolherbst/a5dd956189a533ff3e6d#file-blob-values-for-pll
10:21 karolherbst: but I will test more later
10:22 RSpliet: could you translate those refclk values to actual clock speeds?
10:24 karolherbst: the first one is the constant in my hack
10:24 karolherbst: but I have a script
10:24 karolherbst: will do all
10:24 karolherbst: I actually coded the read_mem function in bash :D
10:24 karolherbst: with static reg reads
10:27 karolherbst: RSpliet: done: https://gist.github.com/karolherbst/a5dd956189a533ff3e6d#file-blob-values-for-pll
10:27 karolherbst: I think they are in a specific range
10:27 karolherbst: and the nearest one with a * clock = target_clock is used
10:27 sobukus:notices that git xserver wants new versions of other stuff …
10:28 karolherbst: and now I get also the 4008MHz clock in nouveau at 0f
10:28 karolherbst: nouveau clocked at 4050 before
10:29 karolherbst: actually it is a bit above 4009, but it doesn't matter there
10:29 karolherbst: RSpliet: but I think the low refclock isn't actually required
10:30 RSpliet: what would happen if you were to use the PLL calculation algorithm with two multipliers, one divider and one shift-divider. And use the result to configure both PLLs?
10:30 karolherbst: RSpliet: it is only nice to use them, because otherwise nouveau may be unhappy while clocking from a specific pstate
10:30 karolherbst: 0a => 0f crashed nouveau with 0a refclock
10:30 karolherbst: RSpliet: M=2 also hangs the gpu
10:30 karolherbst: or what do you mean?
10:31 karolherbst: ahh now I know what you mean
10:31 karolherbst: mhh
10:31 karolherbst: the blob also sets P sometimes to 2 for the second PLL
10:31 karolherbst: but this rarely happens
10:31 karolherbst: not sure when exactly, have to find the correct clock again
10:33 karolherbst: 00132004: 10011501
10:34 karolherbst: at 3496MHz
10:34 karolherbst: 00072a01 second PLL
10:35 karolherbst: I mean, the main one
10:35 karolherbst: 00132030: 15551009 this value is strange
10:35 karolherbst: it is usually 0x4 or something
10:35 RSpliet: are you sure about your refclk calculation btw?
10:36 karolherbst: yes
10:36 karolherbst: the constant works in the code
10:36 karolherbst: RSpliet: https://github.com/karolherbst/nouveau/commit/7588f95307ffec8c3d81c05e672edaaf356151dd#diff-c55b594c4bf4b9ffe42d3a48aca8f90a
10:37 karolherbst: N set to 11
10:37 karolherbst: clock: 4009
10:37 karolherbst: reported by nouveau
10:38 karolherbst: and 182252 * 11 = 2004772 (*2 for double rate stuff)
10:38 karolherbst: the blob uses these values and I don't see why this should be wrong then
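The arithmetic above can be checked in a few lines. This only restates the numbers from the log: the constant from the hack, N = 11, and the doubling for the data rate.

```python
refclk_khz = 182252           # constant from the hack
n = 11                        # N value reported above
pll_khz = refclk_khz * n      # output of the multiplying PLL
data_rate_khz = pll_khz * 2   # "*2 for double rate stuff"

print(pll_khz, data_rate_khz / 1000)
```

The result lands a little above 4009 MHz, which matches what nouveau reports.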
10:39 karolherbst: I think the refclock is low to have a wide range of possible N
10:39 karolherbst: I can clock my memory with the blob in like 6MHz steps on average
10:39 karolherbst: so there are tons of possible mem clocks I can tell the blob to use
10:40 karolherbst: actually the regs are the same when I used clocks from other cards
10:40 karolherbst: compared to reg values for the other card
10:42 karolherbst: also gt215_pll_calc2 is the same as gt215_pll_calc, but without the for loop
10:42 karolherbst: so that M stays 1
10:42 sobukus: imirkin: Do you have a hint what updated package I'm missing when encountering this:
10:42 sobukus: /usr/src/xorg-server-20150816/dix/pixmap.c:194: undefined reference to `RRTransformCompute'
10:45 RSpliet: karolherbst: I think it should have been rounded slightly differently
10:45 karolherbst: wait
10:45 karolherbst: mhh
10:45 karolherbst: could be
10:45 karolherbst: why?
10:46 karolherbst: I used 27000 for crystal
10:46 RSpliet: (13500+(27000/40)) / 6 = 182250
10:46 karolherbst: mhhh
10:46 RSpliet: that is
10:46 imirkin: sobukus: sorry, i'm not really an xorg expert
10:46 RSpliet: 27000*40
10:46 karolherbst: ohh
10:46 karolherbst: okay
10:46 imirkin: sobukus: my extent of xorg knowledge is "don't do that" and "upgrade xorg"
10:46 RSpliet: (sorry for that typo)
10:47 RSpliet: so it's minor, but... well, the 2 had me suspicious for a little ;-)
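RSpliet's corrected expression, written out as a quick check. The constants 13500, 40 and 6 are taken as-is from his lines; the log does not say which register fields they come from, so the function and parameter names below are placeholders.

```python
def refclk_from_crystal(crystal_khz=27000, offset_khz=13500, mult=40, div=6):
    """Evaluate (offset + crystal * mult) / div with the values from the log."""
    return (offset_khz + crystal_khz * mult) / div


exact = refclk_from_crystal()
print(exact, 182252 - exact)
```

The difference to the 182252 constant used in the hack is the small rounding offset RSpliet points out.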
10:47 sobukus: imirkin: I guess I need to disable dmx.
10:47 karolherbst: I see
10:47 karolherbst: brb (eating)
10:47 imirkin: sobukus: hahaha, had that enabled for the video wall? :)
10:47 sobukus: imirkin: not intentionally
10:49 imirkin: sobukus: it was a cool idea back in the day... multiple machines acting as a single logical X server
10:51 RSpliet: karolherbst: what the blob might be doing is this: for clocks over 2404MHz (see PLL limits), try to find a multiplier for MPLL0 that requires an input clock below 200 MHz, then reconfigure MPLL1 to be the frequency required to magically end up with the right clock
10:52 sobukus: I wonder if the crash when trying to set the nouveau card as source is related to it not being the secondary card.
10:52 sobukus: Eh, not being the primary.
10:53 sobukus: Damn, xorg-server update went through, but now nouveau driver doesn't want to build anymore.
10:53 imirkin: sobukus: there's a fix in git for that
10:53 imirkin: and radeon too
10:53 sobukus: I'm already using nouveau git.
10:53 RSpliet: on failing to find a stable PLL setting, increase the multiplier by 1 and try again
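RSpliet's guess can be sketched roughly like this. It is not the blob's algorithm, only the heuristic as described: raise the multiplier until the required input clock drops below about 200 MHz, and move on to the next multiplier when a setting is rejected. The `is_usable` callback is a hypothetical stand-in for "the other PLL can produce this clock stably", and halving the target assumes the double data rate relation from the 4008 MHz example.

```python
def guess_refclk(target_khz, max_input_khz=200_000, n_max=255,
                 is_usable=lambda input_khz: True):
    """Return (N, input_khz) for the first multiplier that fits the guess."""
    pll_khz = target_khz / 2  # assumption: target is the doubled data rate
    for n in range(1, n_max + 1):
        input_khz = pll_khz / n
        if input_khz < max_input_khz and is_usable(input_khz):
            return n, input_khz
    raise ValueError("ran out of multipliers")


n, input_khz = guess_refclk(4_008_000)
print(n, round(input_khz))
```

For the 4008 MHz case this gives N = 11 with an input clock near 182 MHz, in line with the values seen from the blob.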
10:55 sobukus: imirkin: seems like I'm not using that current git after all (git pull
10:55 sobukus: fatal: No remote repository specified. Please, specify either a URL or a
10:55 sobukus: )
10:59 sobukus: Better now after fixing up the git pull.
11:08 SolarAquarion: hi guys
11:08 SolarAquarion: how is the nvidia 870m working on nouveau?
11:09 RSpliet: if it's a kepler, then absolutely fine
11:09 RSpliet: (and it seems to be)
11:09 RSpliet: you might not get max perf out of it
11:10 RSpliet: but with latest Mesa you get OpenGL 4.1
11:10 SolarAquarion: RSpliet, i use the git
11:10 RSpliet: may the git be with you
11:10 RSpliet: but yes, mesa git is latest enough
11:11 SolarAquarion: RSpliet, ok and I guess it works fine with bumblebee?
11:11 RSpliet: never tried it myself, but I don't see why not
11:12 SolarAquarion: ok
11:13 imirkin: SolarAquarion: avoid bumblebee if possible
11:13 imirkin: SolarAquarion: it'll mess with nouveau and doesn't add anything useful
11:13 SolarAquarion: imirkin, ok. So how should i be doing the off loading
11:14 imirkin: SolarAquarion: http://nouveau.freedesktop.org/wiki/Optimus/
11:14 imirkin: i guess that's what you're looking for?
11:14 SolarAquarion: yeah
11:14 imirkin: there's a translation widget at the top that runs it through google translate if you have trouble with english
11:16 SolarAquarion: I'm An American. lol
11:17 imirkin: ah ok. not sure why i thought that english was a problem for you.
11:18 SolarAquarion: well, for power management, my laptop is always plugged in
11:18 SolarAquarion: hmm
11:19 sobukus:wants to test the updated xorg but now has no input devices anymore, flashback from the past when AutoAddDevices first entered
11:23 karolherbst: RSpliet: yeah, figured as much already
11:23 karolherbst: but nearest below 200?
11:23 RSpliet: that's a kind of arbitrary guess
11:23 karolherbst: would make sense, but what is the lower bound
11:23 karolherbst: I mean it does do that
11:23 karolherbst: the question is simply which range
11:23 karolherbst: it would be insane to be a table lookup
11:23 RSpliet: idk... an obvious lower bound would be 27MHz
11:24 RSpliet: or when you run out of multiplier (according to the PLL limits table)
11:24 karolherbst: yeah
11:24 karolherbst: okay
11:24 karolherbst: would make sense
11:24 karolherbst: I try to find higher refclocks though
11:25 karolherbst: but I think this is the way the blob does it. Only minor stuff should be different like the one value I found
11:25 karolherbst: even nouveau couldn't handle these reg values
11:25 RSpliet: "even nouveau"
11:25 karolherbst: 00132030: 15551009
11:25 karolherbst: this is a totally strange value
11:26 karolherbst: I mean the nouveau code is pretty good with the reg values the blob uses
11:26 RSpliet: I would so like to have a kepler optimus laptop right now to see what those silly bits do to the clock output...
11:26 karolherbst: I have one
11:26 karolherbst: :p
11:27 RSpliet: then why not toy around with them? who cares if the kepler crashes, Intel is driving your monitor
11:27 RSpliet: just don't have a driver loaded
11:27 karolherbst: ....
11:27 karolherbst: nice in theory
11:27 karolherbst: but no
11:27 karolherbst: intel messes up as well
11:27 RSpliet: I've done this on my laptop
11:27 RSpliet: for GK218
11:27 karolherbst: I had to reboot my laptop like 10 times today already
11:28 karolherbst: I mean, it works sometimes
11:28 karolherbst: and sometimes intel can't recover
11:28 karolherbst: I get stuff like "[drm:intel_finish_crtc_commit] *ERROR* Atomic update failure on pipe A (start=13629 end=13632)"
11:29 RSpliet: that might mean intel is still expecting an optimus setup
11:29 RSpliet: but only 10 reboots isn't too bad
11:29 sobukus:sighs
11:29 RSpliet: during my summer project last year I did hundreds of those
11:29 karolherbst: :D
11:29 sobukus: now it doesn't crash on rotating the left screen, but it also fails to rotate it.
11:29 karolherbst: but not in one day
11:30 RSpliet: definitely
11:30 RSpliet: in one day
11:30 karolherbst: :D
11:30 sobukus:tries to go back to zaphod
11:30 RSpliet: every time I faked a VBIOS I had to reboot, the blob didn't like being loaded multiple times in a single boot cycle
11:30 karolherbst: what bits do you know exactly?
11:30 karolherbst: really?
11:30 karolherbst: that's really strange
11:30 sobukus: Cannot do multiple crtcs without X server dirty tracking 2 interface
11:30 sobukus: randr: failed to set shadow slave pixmap
11:31 sobukus: ^^ That's known to anyone?
11:31 karolherbst: I have bumblebee installed and the blob is loaded and unloaded all the time
11:31 RSpliet: it was a fairly old version of the blob, given how they made Tesla "legacy"
11:31 SolarAquarion: imirkin, do i need llvm SNV
11:31 karolherbst: as far as I know it helps if the gpu goes into S3cold
11:31 imirkin: sobukus: if you're going to be around in ~4h or so, ping airlied.
11:31 karolherbst: I usually turn the card off when switching drivers
11:31 imirkin: SolarAquarion: not sure what SNV is. but you don't need llvm for nouveau.
11:32 SolarAquarion: imirkin, i mean for mesa
11:32 SolarAquarion: SVN
11:32 RSpliet: oh yes, no... rebooting was quicker than figuring out what goes wrong
11:32 SolarAquarion: there's something about llvmpipes?
11:32 imirkin: SolarAquarion: not unless you're using radeonsi and want all the latest stuff.
11:32 SolarAquarion: ok
11:32 SolarAquarion: ah
11:32 sobukus: imirkin: Hm, bad timing, but thanks.
11:33 imirkin: sobukus: he's in australia. you can also try asking on #dri-devel, probably others who are more knowledgeable
11:34 sobukus: https://bugs.freedesktop.org/show_bug.cgi?format=multiple&id=86436
11:34 karolherbst: RSpliet: what would be the best way to implement the fix then?
11:34 sobukus: looks similar, but is old and radeon-only
11:34 karolherbst: I was thinking about adding a flag to the gt215_pll_calc function to indicate stable M
11:34 SolarAquarion: imirkin, would those kind of things be a good thing to add to the driver stuff or is it only useful within Radeon?
11:34 karolherbst: or move the M calculation out into a separate function
11:35 imirkin: SolarAquarion: sorry, not sure what you're talking about
11:35 SolarAquarion: imirkin, the llvm stuff?
11:35 karolherbst: RSpliet: second part would be https://github.com/karolherbst/nouveau/commit/7588f95307ffec8c3d81c05e672edaaf356151dd#diff-c55b594c4bf4b9ffe42d3a48aca8f90aL973 a function to calculate a nice refclock instead of using a hardcoded value
11:35 karolherbst: any other idea?
11:36 karolherbst: would like to test the idea on other cards first though :/
11:37 SolarAquarion: karolherbst, what's the story of your git repository?
11:38 karolherbst: a lot of stuff
11:38 karolherbst: branch names should speak for themselves
11:39 SolarAquarion: karolherbst, i guess buggy stuff?
11:39 karolherbst: not tested
11:39 karolherbst: except on my system
11:39 karolherbst: also not reviewed or anything
11:40 sobukus: imirkin: I now shuffled around the order of devices in xorg.conf and seem to have a combination where my cursor is allowed to cross the zaphod screen boundaries.
11:40 karolherbst: nothing is critical
11:40 karolherbst: and only helps debugging a bit
11:40 imirkin: sobukus: awesome :)
11:40 SolarAquarion: ok
11:40 sobukus: Perhaps this reverse optimus prime ultra rotation is too advanced for me right now;-)
11:40 karolherbst: :D
11:40 imirkin: sobukus: it's too advanced for everyone
11:41 karolherbst: I would blame the manufacturers anyway
11:41 karolherbst: alone for that kind of stupid idea
11:42 karolherbst: sadly mupuf didn't show up :/
11:42 karolherbst: would like to ask him to put two kepler gddr5 cards into reator
11:42 karolherbst: and test my stuff on them
12:08 karolherbst: RSpliet: got clocks with 0x52601 and 0x00052501
12:08 karolherbst: which are slightly above 200
12:09 karolherbst: RSpliet: idea: the blob tries to get N as high as possible
12:10 karolherbst: if not, increase or decrease P
12:10 karolherbst: try to have M always set to 1
12:10 karolherbst: I think this is the way the blob does that
12:11 karolherbst: at least for the refclock PLL
12:11 karolherbst: then for the second PLL maybe it does the same
12:11 karolherbst: N as high as possible and get P higher only if really required
12:11 karolherbst: M stays 1
12:12 karolherbst: my calculation is still a bit off: https://gist.github.com/karolherbst/a5dd956189a533ff3e6d#file-blob-values-for-pll
12:29 SolarAquarion: imirkin, hmm. I'm doing the DRI_PRIME stuff but it's currently only on the nvidia chip?
12:29 SolarAquarion: oh uninstall nvidia?
12:35 SolarAquarion: DRI_PRIME=0 isn't working?
12:35 SolarAquarion: on my system
12:41 karolherbst: if you want the nvidia card it has to be set to 1
12:41 karolherbst: and nouveau loaded
12:41 karolherbst: and, depending on the DRI version, other stuff
12:41 karolherbst: SolarAquarion: http://nouveau.freedesktop.org/wiki/Optimus/
12:41 SolarAquarion: karolherbst: I uninstalled nvidia
12:41 karolherbst: it only works with nouveau
12:42 SolarAquarion: karolherbst: I know
12:42 SolarAquarion: I added nouveau as a module
12:45 karolherbst: I think after I am done with that, I will add a debugfs interface for setting mem clock on highest pstate
12:49 karolherbst: RSpliet: 6508MHz: 00052a01 (229502)
12:50 karolherbst: got a 234902 refclock now :/
12:53 SolarAquarion: Works now
12:53 SolarAquarion: Deleted all nvidia stuff
13:28 karolherbst: which is usually not required except your distribution does stuff messy
14:05 RSpliet: karolherbst: interesting, so either a heuristic like proposed is not adequate after all, or "many roads lead to rome"
14:07 karolherbst: I can't see why a specific refclock is chosen right now
14:07 RSpliet: sounds unlikely btw... as you'd need a multiplier of 13,85 on top of that
14:07 karolherbst: mhh
14:07 karolherbst: don't think so
14:07 karolherbst: multiplier varies too much
14:08 RSpliet: 6508/(234,902 * 2) ~= 13,85
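RSpliet's division, spelled out with a decimal point instead of a comma. It only restates the numbers from the log.

```python
target_mhz = 6508
refclk_mhz = 234.902
multiplier = target_mhz / (refclk_mhz * 2)

print(round(multiplier, 2))
```

The result is not a whole number, which is why a plain integer N on top of that refclock looks unlikely.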
14:08 karolherbst: wait
14:08 karolherbst: 3556MHz: 00052601 (207902)
14:08 karolherbst: I was very lucky finding both extremes
14:09 RSpliet: you consistently seem to overshoot by 2KHz
14:09 karolherbst: yeah
14:09 karolherbst: maybe python is strange
14:10 karolherbst: RSpliet: that's the script I use: https://gist.github.com/karolherbst/a5dd956189a533ff3e6d#file-test-sh
14:11 karolherbst: helps a lot :D
14:12 karolherbst: ohh I didn't implement read_div right, so that's why these weird values are too far off, but this doesn't matter for the ref clock calculation
14:12 RSpliet: how is that python?
14:12 karolherbst: the calculations are done in python
14:12 karolherbst: hex_bc
14:12 karolherbst: I tried to use bc before
14:13 karolherbst: but it was too messy
14:13 RSpliet: right
14:14 RSpliet: *shivers*
14:14 karolherbst: :D
14:14 RSpliet: well, irrelevant
14:14 karolherbst: yes
14:14 karolherbst: but 3556MHz should be inspected deeper
14:15 RSpliet: yes, as it seems to have no accuracy at all
14:18 karolherbst: RSpliet: https://gist.github.com/karolherbst/a5dd956189a533ff3e6d#file-strange-values
14:18 karolherbst: bit 28
14:18 karolherbst: in 0x132004
14:18 karolherbst: also 0x132030
14:19 karolherbst: added "normal" values there
14:21 karolherbst: seems I was wrong about 0x132030. This reg changes a lot anyway
14:21 karolherbst: ...
14:21 RSpliet: you might want to peek 0x137320 and 0x137330 too
14:21 karolherbst: yeah, I know about them
14:22 karolherbst: 0x137320 is ... :/
14:22 karolherbst: this is a bit painful
14:22 karolherbst: 0x137330 is constant though
14:23 RSpliet: ok
14:23 karolherbst: good thing is. I can create as many combination as I want with the blob
14:23 karolherbst: I actually can go above the +4008MHz limitation
14:23 karolherbst: and set my clock to 8200MHz
14:23 karolherbst: ....
14:24 karolherbst: the limitation is only shown inside nvidia-settings and I think this might be doubled clock and unrelated to the actual driver
14:24 RSpliet: is fN a 12-bit signed integer?
14:25 karolherbst: it defaults to 0xf000 in the module I think
14:25 RSpliet: that sounds wrong
14:26 karolherbst: mhh
14:26 karolherbst: I think it was in the read part
14:26 RSpliet: what I mean is
14:26 RSpliet: if you look at the clock for 3552MHz
14:26 karolherbst: u16 fN = 0xf000;
14:27 karolherbst: okay
14:27 RSpliet: yes. but fN = nv_rd32(priv, pll + 0x10) >> 16;
14:27 RSpliet: which is probably -2
14:27 RSpliet: since the value there is "0ffe0004"
14:27 karolherbst: ahh
14:27 karolherbst: checking
14:28 karolherbst: my script is wrong for that anyway
14:28 karolherbst: 0x789f45d result clock
14:29 karolherbst: ohh
14:29 karolherbst: forgot the shift
14:30 karolherbst: fN = 0x13 ?
14:30 RSpliet: que?
14:30 RSpliet: no, fN = nv_rd32(priv, pll + 0x10) >> 16;
14:33 karolherbst: silly me :/
14:34 karolherbst: yeah fN is 0xffe
14:36 RSpliet: which is probably meant as -2 rather than 4094
14:38 karolherbst: there is a fN + 4096 in the code
14:38 karolherbst: cast to u16
14:39 RSpliet: yes
14:39 RSpliet: and I'm saying that's probably wrong
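[Editor's note] The two readings of fN discussed above can be compared with a small sketch. It takes the register value "0ffe0004" quoted in the conversation, applies the ">> 16" shift, and prints the result as an unsigned value, as a 12-bit two's-complement value (RSpliet's hypothesis, not confirmed here), and after the "fN + 4096" step with a u16 cast. The helper names u16 and sign_extend are hypothetical and are not taken from the nouveau source.

```python
def u16(value):
    """Simulate a C cast to u16 by keeping only the low 16 bits."""
    return value & 0xffff


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


raw = 0x0ffe0004   # register contents quoted in the log
fn = raw >> 16     # same shift as "nv_rd32(priv, pll + 0x10) >> 16"

if __name__ == "__main__":
    print("fN raw field      :", hex(fn))
    print("as unsigned       :", fn)
    print("as signed 12-bit  :", sign_extend(fn, 12))
    print("u16(fN + 4096)    :", u16(fn + 4096))
    # The default "u16 fN = 0xf000" wraps around to zero in the same step.
    print("u16(0xf000 + 4096):", u16(0xf000 + 4096))
```

Masking with 0xffff (not 0xfff) is what reproduces the u16 cast, which is the correction karolherbst arrives at a little later.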
14:39 karolherbst: okay
14:40 karolherbst: got now 1980000 for the 4008MHz values
14:40 karolherbst: 180000 for the first PLL
14:40 karolherbst: I think I fixed the fN part now in my script
14:40 karolherbst: still something seems odd
14:41 kb9vqf: I have a GTX 670 here that I have been trying to configure a DVI monitor @1920x1200 via an active DisplayPort adapter
14:41 kb9vqf: nVidia proprietary works, but is too slow in 2D
14:41 kb9vqf: nouveau works as long as the resolution is 1440x900 or below
14:42 RSpliet: kb9vqf: that probably means the memory bandwidth is insufficient for that resolution in the current "performance mode"
14:42 karolherbst: 0xfff should simulate a cast to u16?
14:42 imirkin: kb9vqf: not that it wouldn't work, but why an active adapter?
14:42 kb9vqf: imirkin: because there are two other DVI panels on the same card :-
14:42 imirkin: kb9vqf: should be able to use a passive single-link dvi adapter, which is enough for 1920x1200 with reduced blanking
14:42 RSpliet: karolherbst: no, why would you think so?
14:43 karolherbst: ohhh right, silly me
14:43 imirkin: kb9vqf: do you have an xorg log?
14:43 kb9vqf: I'm going for triple head (and it works with two 1920x1200 DVI panels + the 1440x900 adapted panel)
14:43 kb9vqf: yes
14:43 kb9vqf: let me upload
14:43 karolherbst: 0xffff :/
14:43 karolherbst: now the values are good
14:43 RSpliet: kb9vqf: do you have a dmesg as well?
14:44 kb9vqf: RSpliet: I'm having some problems with another device spamming dmesg at the moment so no
14:44 kb9vqf: (unrelated issue)
14:44 RSpliet: kb9vqf: it would be greatly helpful though, since it would help me judge whether we can scale up your memory bandwidth by (experimental) reclocking
14:45 karolherbst: RSpliet: I get 1774266 for the 3552MHz values
14:45 RSpliet: that's assuming it's an unsigned int
14:45 karolherbst: which is okay, because the blob isn't accurate with the values either
14:45 kb9vqf: RSpliet: Tried reclocking to 0xa and no change
14:45 karolherbst: its only for the script anyway
14:45 kb9vqf: 0xd and above fail
14:46 karolherbst: ohh gddr5?
14:46 RSpliet: kb9vqf: yeah I didn't expect anything above 0xa to work, but having extra bandwidth is useful to diagnose this :-)
14:46 RSpliet: karolherbst: yes
14:46 kb9vqf: RSpliet, imirkin: http://paste.ee/p/EVl90
14:46 karolherbst: :) working on the fail
14:46 karolherbst: mhhh
14:47 kb9vqf: RSpliet: From what I understand 0xa doesn't really touch the memory bandwidth, which probably explains why it had no effect?
14:47 karolherbst: does the other clock change above 0xa? ...
14:47 karolherbst: kb9vqf: not entirely true
14:47 kb9vqf: RSpliet: Also, is it normal for a memory starved GPU to just lock up with garbage on the screen?
14:47 karolherbst: 07 => 0a doubled memory clock for me
14:47 kb9vqf: hmm
14:48 karolherbst: but 07 => 0f does increase it by *5 so
14:48 imirkin: kb9vqf: Output DP-1 enabled but has no modes
14:48 karolherbst: RSpliet: the calculation seems to fit what the blob does
14:48 imirkin: kb9vqf: that's probably bad... drm_kms_helper.edid_firmware=edid/dell.bin video=DVI-I-1:D video=DVI-D-1:e video=DP-1:e -- is any of that really helping? it shouldn't be necessary
14:48 RSpliet: kb9vqf: can't verify until I've seen a dmesg :-P or... well, you can verify from /sys/class/drm/card0/device/pstate
14:49 kb9vqf: imirkin: There are edid read issues on this setup
14:49 kb9vqf: 07: core 324 MHz memory 648 MHz
14:49 kb9vqf: 0a: core 324-862 MHz memory 1620 MHz AC DC *
14:49 kb9vqf: 0d: core 540-1215 MHz memory 6008 MHz
14:49 kb9vqf: 0f: core 540-1215 MHz memory 6008 MHz
14:49 kb9vqf: AC: core 862 MHz memory 1620 MHz
14:49 kb9vqf: RSpliet: my bad, 0xa doubles the memory bandwidth but doesn't help the GPU lockup
14:50 imirkin: kb9vqf: what if you don't force-enable the DP ports?
14:50 RSpliet: okay ;-) but the "no modes" error pointed at by imirkin is more likely to be the culprit
14:50 kb9vqf: imirkin: the monitor is not detected
14:50 imirkin: ok, well that's a good sign that something's wrong
14:50 kb9vqf: imirkin: ...but that is because the DP adapter doesn't pass through edid
14:50 imirkin: i don't suppose that you have a passive adapter handy
14:50 imirkin: it should be sufficient for your needs
14:50 kb9vqf: imirkin: tried it, nothing worked at all, even under the proprietary driver
14:51 imirkin: hmmmm don't see why not
14:51 imirkin: i'd recommend removing all that drm_kms_helper* video=* junk and using a passive adapter and seeing what happens
14:51 kb9vqf: imirkin: so on the Kepler cards I should be able to drive two DVI panels via the two DVI ports, and a third DVI panel via a passive adapter?
14:52 imirkin: yeah
14:52 imirkin: up to 4 at a time
14:52 imirkin: the connector type doesn't matter (beyond the practical of what's available)
14:52 RSpliet: kb9vqf: you didn't explain though: what exactly goes wrong?
14:52 kb9vqf: imirkin: I thought there were only two DVI clock sources on the Kepler card?
14:52 RSpliet: are you unable to set the desired resolution, or do you get a more nasty outcome?
14:52 imirkin: kb9vqf: afaik there's no such thing as a separate "dvi clock source"
14:52 kb9vqf: RSpliet: Oh, sorry. As you can see nouveau "knows" it can't drive anything above 1440x900 or so and refuses to validate the modes
14:52 imirkin: kb9vqf: however i could just not know some bit of it
14:53 kb9vqf: RSpliet: if I force the 1920x1200 mode in Xorg then the GPU hard locks
14:53 karolherbst: nvidia isn't very creative with the pstate clock, is it?
14:53 RSpliet: kb9vqf: is that when you try to manually add the mode to the panel using xrandr?
14:53 karolherbst: also what is the difference between 0d and 0f?
14:53 kb9vqf: RSpliet: Yes, and also tried it in xorg.conf, same lockup
14:54 kb9vqf: RSpliet: On the other two (DVI port driven) panels I get static white snow
14:54 imirkin: kb9vqf: i'd be very interested to see what happens with the passive adapter and none of that custom config stuff
14:54 kb9vqf: imirkin: With the passive adapter I get nothing at all on the DP screen
14:54 kb9vqf: even under proprietary
14:54 imirkin: kb9vqf: can you set that up? i'd like to see what nouveau produces (esp when you don't do that video= junk)
14:55 kb9vqf: imirkin: The cabling here doesn't support reading the EDIDs reliably
14:56 kb9vqf: imirkin: the only thing I can do is tack a different screen onto the passive adapter (one that is more reliable as far as EDID reads are concerned)
14:56 imirkin: so keep the edid thing but get rid of the video= stuff
14:56 kb9vqf: imirkin: with the passive adapter, correct?
14:56 kb9vqf: it'll take me a bit to set that up
14:57 imirkin: then start with the active one
14:57 imirkin: but various people have had issues with active adapters
14:57 imirkin: in general the nouveau dp code is less-than-perfect
14:57 imirkin: so avoiding dp (e.g. by using a passive adapter) is preferable
14:58 imirkin: that said, dp works fine for lots of people too, so... yeah. ymmv
15:01 kb9vqf: imirkin: trying it now. I have read conflicting reports regarding the DP --> DVI stuff
15:01 imirkin: well, pre-kepler only have 2 crtc's
15:01 imirkin: i.e. can drive 2 screens
15:01 imirkin: kepler+ have 4 crtc's
15:01 kb9vqf: yes, that I understand
15:01 imirkin: i'm not aware of any special limitation on DVI
15:01 imirkin: doesn't mean it doesn't exist... just that i don't know about it :)
15:02 kb9vqf: well, I'll give this direct connection one more shot under proprietary if nouveau doesn't detect it properly
15:02 imirkin: i def know that there are quad-dvi-port kepler boards
15:02 kb9vqf: if proprietary fails again then I'm guessing it's a board limitation
15:03 kb9vqf: huh
15:03 kb9vqf: direct connection worked
15:03 kb9vqf: now to figure out why
15:04 imirkin: lol
15:04 kb9vqf: (granted, direct to a 1920x180 screen)
15:04 kb9vqf: 1920x1080 rather
15:04 imirkin: that doesn't need reduced blanking to fit into a single-link connector
15:05 kb9vqf: all right, I'm wondering if the VBIOS needed to "see" the DVI screen to configure things properly?
15:06 imirkin: VBIOS definitely does stuff
15:06 kb9vqf: hmmm
15:06 imirkin: but nouveau should be able to configure things on its own as well
15:06 imirkin: [among other things, it runs bits of the vbios as necessary]
15:06 kb9vqf: well, I'm thinking nouveau doesn't know how to switch the DP link from DP to DVI mode
15:06 imirkin: that's def not the case.
15:06 kb9vqf: ok
15:07 imirkin: any such simplistic conclusion is not going to work out... nouveau def works with all this stuff some of the time under some conditions
15:07 imirkin: just not all the time under all conditions ;)
15:07 kb9vqf: well, I don't know. can you think of any reason why forcing the output on (when the card doesn't electrically know the screen is attached) wouldn't work?
15:07 imirkin: in order to nail it all down you need a lab-full of screens and gpu's and test all the combinations
15:08 imirkin: yeah, coz forcing the output on is an unexpected condition for the driver that gets even less testing
15:08 imirkin: and it's meaningless for DP where you can't just "force the output on"
15:08 imirkin: but instead there's like a 20-step process for enabling a DP output
15:08 imirkin: which involves link training, getting information from the remote end, etc
15:09 imirkin: it's more of a thing that you can do with DVI since you just flip a few bits and move on
15:09 kb9vqf: but using the passive adapter on a DVI screen should mean the DP port "sees" DVI, yes?
15:09 kb9vqf: and therefore should be able to force DVI on?
15:10 imirkin: ssssort of
15:10 imirkin: there's still some DP going on there
15:10 imirkin: and using an adapter (of any kind) is a less-tested scenario
15:10 imirkin: so again, you get into the "you are the first and last person ever trying this" issue
15:11 kb9vqf: yeah, that I figured :-)
15:11 kb9vqf: just glad I was able to get any response on this channel, so far you guys have been great!
15:12 kb9vqf: been at this for a day already and never thought to try the other screen like an idiot
15:12 imirkin: it's not like any of this stuff *shouldn't* work, just realize that there are a limited number of people working on this, who have limited hardware, and limited time for odd configurations
15:12 imirkin: [well, limited time in general, but odd configurations getting a smaller slice of that time]
15:17 kb9vqf: imirkin: since I have a working configuration I figured out the problem: the hardware in the DP only switches to DVI when it electrically detects a DVI monitor connected to that adapter (probably via the +5VSB)
15:18 imirkin: kb9vqf: yeah that seems likely
15:18 imirkin: note that you can also do DP -> HDMI
15:18 imirkin: [both of which are TMDS]
15:19 kb9vqf: imirkin: in hindsight that was rather obvious. sorry for wasting time
15:19 kb9vqf: the 2D performance of nouveau, even without reclocking, is an order of magnitude better than the proprietary driver
15:19 imirkin: that's very surprising
15:19 kb9vqf: I'm sure they have some kind of bug, but I can't wait for them to fix it
15:20 imirkin: well, with the latest mesa you should get a semi-decent GL 4.1 driver
15:20 kb9vqf: yeah, I tracked it down to some kind of issue in their nvidia_drv.so module (called by XYToscreen), but with no symbols I couldn't trace further
15:20 imirkin: (i.e. from the git tree, or the upcoming mesa 11.0 release)
15:21 kb9vqf: I have a GTX 970 coming (Maxwell); could you use any debug information from triple head on such a setup?
15:22 kb9vqf: (plan B in case I couldn't get the 2D performance fixed on the 670)
15:22 imirkin: the GM204's won't get any hw accel with nouveau i'm afraid
15:22 imirkin: and some people have reported modesetting issues... some things were fixed, but don't know about all
15:22 imirkin: it's definitely the early days for that family
15:23 imirkin: they've started requiring signed firmware, which kinda sucks
15:23 kb9vqf: yes, if the proprietary driver has the same 2D issues on Maxwell I'm returning the card
15:26 imirkin: and they've started retrieving that firmware in a way that makes it impossible to trace, and so we've just kinda been dragging our feet on enabling it
15:26 imirkin: the gpu's are still super-expensive, no one has them, it doesn't super-matter
15:27 imirkin: [ok, not impossible, just requires active effort, as opposed to the same old way they used to do it]
15:27 kb9vqf: imirkin: ATI any better in this regard?
15:27 imirkin: ati is long gone
15:27 imirkin: but amd has an actual paid open-source team
15:28 imirkin: and their (higher-end) gpu's support up to 6 crtc's
15:28 kb9vqf: yeah, I realize, showing my age here
15:28 kb9vqf: ATI was around forever
15:28 imirkin: yep
15:28 imirkin: at least you didn't bring up CL :)
15:28 kb9vqf: heh
15:29 kb9vqf: well, I wouldn't have even tried nouveau except for the horrid 2D performance
15:29 kb9vqf: (on the proprietary drivers)
15:29 imirkin: in general they do a pretty good job, so i'm surprised
15:29 kb9vqf: and now looking to switch manufacturers entirely. Wonder how nVidia's new direction will work out for them in the long run?
15:29 imirkin: there was a time when they didn't, but i thought that time was long gone
15:29 imirkin: that said, it's not like i've really checked
15:30 kb9vqf: well, tried everything from 304.x to 352.x
15:30 kb9vqf: all had the same regression
15:30 kb9vqf: it's probably something to do with the AMD chipset
15:30 imirkin: doubtful
15:30 imirkin: the AGP days are long behind us
15:30 imirkin: PCIe tends to work pretty well nowadays
15:31 kb9vqf: it only affects this one system; the same disks in an Intel system and the same nVidia card work with no issues
15:31 imirkin: that's really weird.
15:31 kb9vqf: to say the least
15:31 kb9vqf: oh, and the same monitors too
15:31 imirkin: well, nouveau doesn't currently futz with the GT/s setting on the link
15:31 kb9vqf: traced the nvidia driver, it doesn't either
15:31 imirkin: so it stays in whatever the vbios leaves it at
15:31 imirkin: which is usually 2.5GT/s
15:31 kb9vqf: checked the MTRRs, PAT entries, etc.
15:32 kb9vqf: the BIOS here set it up to 5GT/s x16
15:32 imirkin: well, the BIOS can't do anything about it... the VBIOS can though
15:32 imirkin: bbl
15:32 kb9vqf: oh wait
15:32 kb9vqf: you're right
15:32 kb9vqf: that's intereting
15:32 kb9vqf: *interesting
15:33 kb9vqf: didn't see that before
15:33 kb9vqf: only had nouveau working for a short time here so haven't looked at all the settings
15:34 kb9vqf: so I wonder if the driver switching the device into 5GT/s mode messes something up
15:35 kb9vqf: according to the nVidia utility link usage never went over 13%
15:48 imirkin_: yeah no idea, just pointing that out as a potential difference
15:49 kb9vqf: well, I'm a hardware engineer by trade, just haven't had much opportunity to tear down the nVidia cards and/or driver until now
15:50 kb9vqf: thing is, it probably isn't worth it if the ATI cards are better supported and less locked down
15:50 imirkin_: well, all semi-modern AMD gpu's require a command processor blob (and a few others)
15:51 imirkin_: up to the GM20x maxwells, nouveau provides all-open firmware, except video decoding
15:51 imirkin_: coz no one is masochistic enough to redo that (and reverse-engineer the vector processors that accelerate it in the first place)
15:52 imirkin_: except mwk who has done all of vp1 and a good bit of vp2 (both are only on fairly old gpu's)
15:52 kb9vqf: well, I'll probably reevaluate down the line a bit. was very unhappy about the performance regression, and hearing about the lockdown is concerning as well
15:52 karolherbst: imirkin_: up for some unigine benchmarks ;)
15:52 imirkin_: but the big thing about amd is that you can have up to 6 displays on a single gpu
15:52 imirkin_: rather than just 4
15:53 imirkin_: pretty sure even GM20x keeps the limit of 4 crtc's
15:54 imirkin_: karolherbst: sadly heaven dies on the GK208 with prime :( no idea why
15:54 karolherbst: :/
15:54 karolherbst: will try that with gk106
15:54 imirkin_: it gets an assert failure that indicates we basically have too many pushbufs outstanding
15:54 kb9vqf: imirkin: Well if you need any output from a Maxwell card just poke me
15:55 imirkin_: kb9vqf: well, i tend not to deal with that stuff too much
15:55 kb9vqf: will probably have it for a couple weeks for evaluation, and if proprietary works then may hang on to it for now
15:55 kb9vqf: ok
15:55 imirkin_: kb9vqf: i mostly just do 3d driver stuff... ben (skeggsb, currently on vacation) does all the hw stuff
15:59 kb9vqf: imirkin_: Just fired up Steam and notice odd texture issues, almost like they are blurred out
15:59 karolherbst: impressive
16:00 imirkin_: kb9vqf: probably you don't have libtxc_dxtn installed
16:00 karolherbst: imirkin_: this will cheer you up :D
16:00 kb9vqf: will look at it
16:00 karolherbst: blob 76.6fps nouveau 49.8fps
16:00 imirkin_: kb9vqf: also make sure that your mesa is relatively up to date
16:00 karolherbst: lowest settings though
16:01 karolherbst: will add some stuff
16:01 imirkin_: karolherbst: for what? heaven?
16:01 karolherbst: yeah
16:01 imirkin_: with or without tess?
16:01 karolherbst: without
16:01 karolherbst: will do ultra quality and with tess now
16:01 imirkin_: ok. coz i was gonna say... a bit odd that you get such high fps (with either)
16:01 kb9vqf: imirkin_: libtxc-dxtn-s2tc0 was already installed
16:02 imirkin_: kb9vqf: hmmm.... i think that's the crap one
16:02 imirkin_: kb9vqf: i think you need the s3tc one
16:02 imirkin_: tbh not sure what the diff is
16:02 imirkin_: kb9vqf: perhaps you can make a screenshot, i might be able to diagnose
16:02 karolherbst: wow
16:02 karolherbst: these artifacts
16:03 imirkin_: karolherbst: there are artifacts? :(
16:03 kb9vqf: imirkin_: sure, give me a sec
16:03 karolherbst: extreme tess + 8x msaa
16:04 imirkin_: karolherbst: can you reduce some stuff? pretty sure that with low/medium + tess + 4xmsaa it looked fine
16:04 karolherbst: 4xmsaa seems to work
16:04 imirkin_: oh hm
16:04 imirkin_: i never tried 8x
16:04 karolherbst: you should
16:04 imirkin_: the gpu was already getting like 0.000001fps
16:04 karolherbst: looks funny?
16:04 karolherbst: wut :O
16:04 imirkin_: GF108 ain't fast
16:05 imirkin_: it's like 1/5th the speed of the GT215 with gddr5 vram
16:05 kb9vqf: imirkin_: oh that's nice, didn't do it the second time. Guess it was just a glitch in the game (GoIO is pretty buggy to begin with)
16:05 karolherbst: imirkin_: still :O
16:05 karolherbst: I am getting like 10 fps with ultra quality, extreme tess and 4x msaa
16:06 imirkin_: what rez?
16:06 karolherbst: full hd
16:06 kb9vqf: imirkin_: on the plus side the underclocked GTX 670 under Nouveau is performing better than under the proprietary drivers
16:06 karolherbst: I ain't do window stuff :p
16:06 imirkin_: i think i did the min (like 640x480?), 4x msaa, tess, and low or medium settings, and that was in the 1-4fps range
16:06 karolherbst: will benchmark now
16:06 karolherbst: ...
16:07 imirkin_: kb9vqf: sounds like there's just something horridly wrong
16:07 kb9vqf: no kidding
16:07 imirkin_: kb9vqf: with the proprietary driver, dunno what, but some sort of mtrr-style disagreement with the cpu
16:07 imirkin_: perhaps bus snooping? dunno
16:07 kb9vqf: triple checked the MTRRs (actually PATs)
16:07 imirkin_: although nouveau would be doing the exact same thing
16:07 kb9vqf: the aperture is only 128M on this card
16:08 kb9vqf: not sure why that would make the proprietary driver work so badly though
16:08 imirkin_: that's mostly irrelevant
16:08 imirkin_: you never write to vram directly
16:08 kb9vqf: BTW is there anything I can do to help with the reclocking on the Kepler cards?
16:08 kb9vqf: yeah, I know
16:08 imirkin_: that's very slow... you get the gpu to dma stuff itself
16:08 kb9vqf: I'm thinking it's more of a problem inside Xorg and nvidia's Xorg driver
16:09 imirkin_: karolherbst has been looking into it...
16:09 imirkin_: we do know how to get the gpu into 5 or 8GT/s mode
16:09 imirkin_: he can help you with that if you want to try it
16:09 imirkin_: just a few well-placed commands :)
16:09 kb9vqf: sure, why not
16:09 kb9vqf: let's at least rule out that problem with the nvidia driver
16:09 imirkin_: start by grabbing envytools (https://github.com/envytools/envytools/)
16:10 imirkin_: where is that reg on kepler again? gr, i forget
16:10 imirkin_: kb9vqf: oh also, lspci -vvv -s 01:00.0 (or whatever the gpu is at)
16:10 imirkin_: (to see if it needs to be flipped into v2 mode as well)
16:11 kb9vqf: you want the output, or if I'm familiar with PCI do I need to keep it?
16:11 kb9vqf: ok
16:11 imirkin_: i want the output
16:11 kb9vqf: http://paste.ee/p/jm5Td
16:12 imirkin_: cool it's already in v2 mode
16:12 imirkin_: and i guess it's a pcie v3 slot so it can do 8GT/s?
16:12 karolherbst: wait
16:12 imirkin_: aka "Target Link Speed: Unknown" on older lspci versions
16:12 karolherbst: kb9vqf: https://gist.github.com/karolherbst/5fdd4a543d20916bc362
16:12 kb9vqf: imirkin_: No, it's PCI 2.0
16:12 kb9vqf: *PCIe
16:13 karolherbst: ...
16:13 imirkin_: kb9vqf: hmmm. ok
16:13 karolherbst: imirkin_: this sounds strange
16:13 karolherbst: "Port #0, Speed unknown, Width x16, ASPM L0s L1, Latency L0 <512ns, L1 <4us"
16:13 imirkin_: need to carefully only set it to 5GT/s then
16:13 karolherbst: the cap
16:13 imirkin_: it might get unhappy if you push it to 8GT/s
16:14 kb9vqf: karolherbst: that's the card capability
16:14 imirkin_: kb9vqf: ok, let me know when you've built envytools
16:14 kb9vqf: look at the board capability
16:14 karolherbst: kb9vqf: you think it is
16:14 karolherbst: but it's only its "current" capability
16:14 karolherbst: you can change the cap if you want
16:14 karolherbst: but I don't think we know how exactly on fermi or later
16:15 kb9vqf: karolherbst: ah, ok. going from the specs here (still new to nvidia internals)
16:15 imirkin_: the cap is the board... the port's caps are defined elsewhere
16:15 kb9vqf: BTW Portal 2 is playing very nicely under nouveau
16:15 kb9vqf: very impressed so far
16:15 imirkin_: it does generally work
16:15 imirkin_: there are a few bugs that have been highly resistant to analysis
16:15 karolherbst: imirkin_: by the way min: 4.9 avg: 11.6 max: 26.4
16:16 karolherbst: ultra qual, extreme tess 4xmsaa @ 1920x1080
16:16 imirkin_: but if you run into misrendering, the general thing to do is make an apitrace, and file a bug with a link to it
16:16 imirkin_: i'll either solve it in under a day
16:16 imirkin_: or over a year ;)
16:16 imirkin_: very little in between
16:17 kb9vqf: imirkin_: envytools built
16:17 imirkin_: ok cool. now run 'nvapeek 8c040'
16:17 imirkin_: [and paste the output]
16:18 imirkin_: oh, and also 'nvapeek 2241c'
16:18 kb9vqf: hmmm, nvapeek wasn't built. let me check a couple things
16:19 imirkin_: it's in the 'nva' dir
16:19 imirkin_: it might require python for idiotic reasons
16:19 imirkin_: feel free to hack that out
16:19 imirkin_:hates cmake
16:19 kb9vqf: 0008c040: 80089000
16:20 kb9vqf: 0002241c: 00000081
16:20 imirkin_: and the other one
16:20 imirkin_: ok cool. so the link cap is good
16:20 imirkin_: now do
16:20 imirkin_: nvapoke 8c040 80049000; nvapoke 8c040 80049001
16:21 imirkin_: and then check lspci again
16:22 kb9vqf: yep, worked!
16:22 imirkin_: nice
16:22 karolherbst: nice
16:22 karolherbst: imirkin_: min 5.8 avg: 18.5 max: 46.9 with blob
16:22 kb9vqf: no 2d degradation either
16:22 imirkin_: and if you want 8GT/s you can do 80009000; 80009001; but only do this if the port actually supports it. otherwise the gpu may get unhappy
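[Editor's note] The nvapoke values above follow a visible pattern, sketched below in Python. Starting from the readback 80089000, the 5GT/s and 8GT/s writes differ only in bits 19:18, and each speed is written twice, the second time with bit 0 set. The field mask, the meaning of bit 0, and the helper name link_speed_pokes are assumptions inferred from those values alone, not from hardware documentation. The sketch only computes numbers and does not touch the GPU.

```python
# Assumption: bits 19:18 of register 0x8c040 select the link speed.
SPEED_FIELD_MASK = 0x000c0000

# Field values inferred from the nvapoke commands quoted in the log.
SPEED_FIELD = {
    "5GT/s": 0x00040000,
    "8GT/s": 0x00000000,
}

# Assumption: bit 0 triggers the change once the field has been written.
TRIGGER_BIT = 0x00000001


def link_speed_pokes(current, speed):
    """Return the two values to write: field staged, then field plus trigger."""
    staged = (current & ~SPEED_FIELD_MASK & ~TRIGGER_BIT) | SPEED_FIELD[speed]
    return staged, staged | TRIGGER_BIT


if __name__ == "__main__":
    readback = 0x80089000  # output of 'nvapeek 8c040' in the log
    for speed in ("5GT/s", "8GT/s"):
        first, second = link_speed_pokes(readback, speed)
        print(f"{speed}: nvapoke 8c040 {first:08x}; nvapoke 8c040 {second:08x}")
```

As imirkin_ notes, the 8GT/s pair should only be used when the port actually supports that speed.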
16:22 kb9vqf: the port does not support PCIe 3.0
16:23 karolherbst: wait
16:23 imirkin_: karolherbst: 62% =/ not *super* great.
16:23 kb9vqf: and besides, I really don't need that kind of bandwidth since nouveau does not support OpenCL
16:23 kb9vqf: ;-)
16:23 imirkin_: well, presumably this isn't a laptop, so the extra 1W it uses up is of little concern
16:23 kb9vqf: but it does confirm at least that the other 2D bug is squarely in nVidia's realm
16:24 karolherbst: kb9vqf: there are games which are happy about higher PCIe speed
16:24 karolherbst: imirkin_: :/
16:24 kb9vqf: ...and without full reclocking that probably doesn't matter too much right now ;-)
16:24 karolherbst: may work in a few days :p
16:24 kb9vqf: BTW while I'm here is there any way to hack the card to full power without reclocking?
16:25 karolherbst: there is a way to reclock to full power without crashes
16:25 karolherbst: at least I found it for my card
16:25 kb9vqf: oh cool
16:25 kb9vqf: Kepler?
16:25 imirkin_: kb9vqf: you can load nvidia blob, force it to the highest power level, then kexec a kernel that will use nouveau
16:25 karolherbst: yeah
16:25 karolherbst: :D
16:25 kb9vqf: oh, so nasty hack then
16:25 kb9vqf: ;-)
16:25 karolherbst: imirkin_: I know pretty well now what nouveau does wrong
16:26 karolherbst: and what nouveau should kind of do
16:26 kb9vqf: karolherbst: I'll be hanging out on IRC...if you want me to test anything....
16:26 kb9vqf: I'm actually pretty good at low level hardware (day job), just unfamiliar with nVidia cards
16:26 karolherbst: I don't need more information, because I can generate mem clock values as I like
16:26 karolherbst: and just read the regs back
16:27 karolherbst: the main problem is now to find the "right" algorithm
16:27 kb9vqf: I meant once you have a reclocking patch I can test it on this hardware if you like
16:27 karolherbst: yeah, okay
16:27 imirkin_: kb9vqf: you should send karolherbst your vbios
16:27 karolherbst: :O
16:27 karolherbst: why?
16:27 imirkin_: so you can see his cstate/etc list
16:27 karolherbst: the blob generates same reg values
16:27 karolherbst: doesn't matter
16:27 imirkin_: and see how it's totally different from yours :)
16:27 karolherbst: doesn't matter for mem clock
16:28 karolherbst: I already check on another kepler
16:28 kb9vqf: so these cards are DDR3 essentially?
16:28 karolherbst: same reg values for same clocks
16:28 imirkin_: kb9vqf: GDDR5 is *extremely* finicky
16:28 karolherbst: what does that mean? :D
16:28 kb9vqf: ah, DDR5
16:29 karolherbst: imirkin_: do you know what is wrong in nouveau?
16:29 imirkin_: also none of us really understand RAM, so we're mostly guessing from general principles and the occasional bit of documentation
16:29 imirkin_: karolherbst: yeah, it doesn't determine parameters properly :p
16:29 karolherbst: :D
16:29 karolherbst: and I know which ones are good
16:29 karolherbst: :p
16:29 kb9vqf: I assume GDDR5 still requires DQS training, etc.?
16:29 imirkin_: karolherbst: yeah, but those parameters are dynamic
16:29 karolherbst: still
16:29 karolherbst: I know
16:29 imirkin_: kb9vqf: it requires training. dunno what DQS is.
16:30 karolherbst: I was serious with that the right algorithm is missing
16:30 karolherbst: the values aren't the problem anymore
16:30 kb9vqf: On DDR3 (which I just got done doing for something else) you have to train receiver enable, then center the data eye
16:30 karolherbst: basically you have to let the "main" Pll generate low clocks between 150-250 MHz and multiply them up with the second PLL
16:30 karolherbst: no dividers
16:30 karolherbst: no anything else
16:30 karolherbst: then the card is happy
16:31 kb9vqf: do you know if the cards use star or fly-by topology?
16:31 karolherbst: and that's it basically
16:31 imirkin_: kb9vqf: you clearly know ram. like i said, we do not ;)
16:32 kb9vqf: which is why I'm trying to help if I can :-)
16:32 karolherbst: so what nouveau has to do is set the first PLL to a clock for which there is a constant a that fulfills a*pll_clk = target_clk, and use a in the second PLL as the multiplier
16:32 karolherbst: but how to choose the refclock I am not really sure
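[Editor's note] A minimal sketch of the search karolherbst describes: keep the first PLL's output between 150 and 250 MHz and let the second PLL apply a pure integer multiplier. The function name refclk_candidates is hypothetical, clocks are assumed to be in kHz, and this is not nouveau's or the blob's actual algorithm. It only enumerates the options and leaves open the question of which one to pick.

```python
def refclk_candidates(target_khz, lo_khz=150_000, hi_khz=250_000):
    """List (multiplier, refclk_khz) pairs with refclk = target / multiplier
    and lo_khz <= refclk <= hi_khz."""
    candidates = []
    multiplier = 1
    # The reference clock shrinks as the multiplier grows, so stop once it
    # drops below the lower bound.
    while target_khz / multiplier >= lo_khz:
        refclk = target_khz / multiplier
        if refclk <= hi_khz:
            candidates.append((multiplier, refclk))
        multiplier += 1
    return candidates


if __name__ == "__main__":
    # 1980000 is the figure reported earlier in the log for the 4008MHz setting.
    for multiplier, refclk in refclk_candidates(1_980_000):
        print(f"N={multiplier:2d}  refclk={refclk:10.1f} kHz")
```

For 1980000 the list includes multiplier 11 with a 180000 reference clock, which matches the pair of values karolherbst reported for the first PLL earlier in the log, assuming both figures are in kHz.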
16:33 karolherbst: kb9vqf: do you have coolbits set to 8 for the blob?
16:33 kb9vqf: no, I didn't
16:33 kb9vqf: what does that do?
16:33 imirkin_: kb9vqf: i don't just not know the answer to your questions, i don't even know what you're asking :)
16:33 karolherbst: enable clocking stuff
16:33 karolherbst: I have no clue about ram :p
16:33 karolherbst: really
16:33 kb9vqf: ah, and is there anyone on the project that does?
16:33 karolherbst: I am just lucky with guessing
16:34 imirkin_: ben is the closest to it
16:34 imirkin_: RSpliet probably comes second. not sure where mwk fits in.
16:34 kb9vqf: I don't want to reinvent the wheel if you guys already know enough, but if I can help... :-)
16:34 karolherbst: kb9vqf: you could verify my values
16:34 imirkin_: i'm at spot #100, even if there's no one in between me and roy.
16:34 karolherbst: https://gist.github.com/karolherbst/a5dd956189a533ff3e6d#file-blob-values-for-pll
16:34 karolherbst: first one is mem clock set with the blob
16:35 kb9vqf: ah, so you want me to boot the blob back up and dump registers?
16:35 karolherbst: second is the reg of the PLL at 0x132024
16:35 karolherbst: basically
16:35 kb9vqf: ok
16:35 karolherbst: and reclock card with nvidia-settings
16:35 karolherbst: that's why you need coolbits = 8
16:36 kb9vqf: I can do that, just give me 5 minutes to reboot back into the blob and make a cup of coffee ;-)
16:36 karolherbst: I have Option "Coolbits" "8" inside my Device section
16:37 karolherbst: imirkin_: can you think of any "sane" algorithm to set the clocks, or by how much nouveau may be off the requested clock?
16:37 karolherbst: I could just hardcode 150MHz and 250MHz as bounds for the refclock
16:37 karolherbst: :/
16:37 imirkin_: karolherbst: not sure what you mean... but ben RE'd a *ton* of that stuff
16:37 karolherbst: and see what we can do best
16:37 karolherbst: yeah
16:37 karolherbst: and there was a note from him
16:37 imirkin_: i'd very carefully study the stuff in ramgk104.c
16:38 karolherbst: basically saying he has no clue
16:38 imirkin_: hehehehe
16:38 karolherbst: "so far, i've seen very weird values being chosen by nvidia on kepler boards, no idea how/why they're chosen."
16:38 karolherbst: I think I figured out why
16:38 imirkin_: awesome :)
16:38 karolherbst: there is a second PLL
16:38 karolherbst: which gets activated for clocks higher than 2400MHz mem
16:39 karolherbst: and this PLL is unhappy if it has to do something else beyond multiplying
16:39 imirkin_: kb9vqf: so basically if you look at http://cgit.freedesktop.org/~darktama/nouveau/tree/drm/nouveau/nvkm/subdev/fb/ramgk104.c that is basically the state of the art of our knowledge of how things work
16:39 karolherbst: so it shall only multiply what the first PLL does
16:40 karolherbst: currently nouveau picks up the current mem clock and tries to push that to the requested clock with the second PLL
16:42 karolherbst: I mean I could benchmark unigine with that without issues now
16:42 karolherbst: so I think it works
16:43 karolherbst: imirkin_: any idea how I can test why the performance isn't that good in the benchmark?
16:44 karolherbst: I already ignore prime, because the blob runs through bumblebee and there I also lose performance
16:45 imirkin_: karolherbst: not easily. i suspect strongly it's because (a) our compiler doesn't optimize as well and (b) they do instruction scheduling and we don't
16:45 karolherbst: mhhh
16:45 karolherbst: is the latter hard to do?
16:45 imirkin_: yes and no
16:45 karolherbst: and with hard I mean more than a year?
16:45 imirkin_: it's easy to do something, but it's difficult to know if you're doing something better than nothing.
16:45 karolherbst: :D
16:45 karolherbst: I see
16:46 karolherbst: but (a) could be compared somehow
16:46 karolherbst: or is there no way to dump what the blob does
16:47 imirkin_: sure, you can dump it
16:47 imirkin_: mmt lets you do that
16:48 imirkin_: (and demmt does a pretty good job decoding it all)
16:48 karolherbst: okay
16:48 karolherbst: so I could dump the gpu code with gallium and compare it with what the blob does
16:48 kb9vqf: karolherbst: Well, after a quick glance at that file I think you guys lucked out and static training values are set by the manufacturer then used
16:48 karolherbst: so you got the same values?
16:49 karolherbst: ohh
16:49 karolherbst: you mean the source
16:49 kb9vqf: no, I was looking at the control file in cgit
16:49 kb9vqf: yes
16:49 karolherbst: about that... everything could be wrong ;)
16:49 kb9vqf: the blob is still installing
16:49 karolherbst: or at least different
16:49 kb9vqf: well, it's not out of the realm of possibilities that things were done that way
16:49 imirkin_: kb9vqf: errr... huh? http://cgit.freedesktop.org/~darktama/nouveau/tree/drm/nouveau/nvkm/subdev/fb/ramgk104.c#n1247
16:50 imirkin_: and lots of other stuff
16:50 imirkin_: the thing in that data are just funny bit patterns
16:50 imirkin_: i.e. 11111111, 9999999 etc
16:50 karolherbst: ahh
16:50 karolherbst: so that's what these values are for
16:50 karolherbst: I already wondered why there are such funny values in the mmiotrace
16:51 imirkin_: and then i think we read some value from somewhere as a result of that
16:51 imirkin_: and let the good times roll
16:51 kb9vqf: well, I haven't analyzed beyond a quick glance, but I was looking at the table parsing code @1322
16:51 imirkin_: kb9vqf: ah yeah, those are vbios scripts
16:51 kb9vqf: anyway if training is needed and you don't even know what registers control the DDR lane delays this just got infinitely harder
16:52 imirkin_: i think we know how to program all that stuff
16:52 imirkin_: http://cgit.freedesktop.org/~darktama/nouveau/tree/drm/nouveau/nvkm/subdev/fb/gddr5.c
16:52 imirkin_: calculates all the EMRS/etc values
16:52 imirkin_: which are then just stuck into various timing registers
16:53 karolherbst: I like this optimism
16:53 kb9vqf: imirkin_: yes, static tables as I mentioned earlier
16:53 kb9vqf: that is rather common in this type of hardware
16:53 imirkin_: right, those are indeed largely static
16:53 imirkin_: but not *entirely* unfortunately
16:53 imirkin_: right, well the ram chips are soldered on there
16:53 imirkin_: so there's only so much variability to be had :)
16:53 kb9vqf: basically the average user gets impatient waiting another 10s for the video card to complete training and POST, on top of the 20s for DDR3 training of the main system ;-)
16:54 kb9vqf: more important is that training can fail
16:54 kb9vqf: so if you know the DDR layout, you can train once and save the values
16:54 imirkin_: yeah, whose genius idea was that?!
16:54 kb9vqf: karolherbst: OK, I'm in. What do you want me to dump?
16:55 karolherbst: can you open nvidia-settings?
16:55 kb9vqf: yep
16:55 karolherbst: there should be a section "PowerMizer"
16:56 kb9vqf: yep, familiar with it
16:56 karolherbst: and in there "Editable Performance Levels"
16:56 kb9vqf: yep
16:56 karolherbst: what is min max for memory?
16:56 kb9vqf: -4388MHz, 6008MHz
16:56 karolherbst: also preferred mode has to be set to maximum
16:56 karolherbst: just as I assumed
16:56 kb9vqf: hang on
16:56 karolherbst: these values are completely bullshit :D
16:56 kb9vqf: max performance, -4388, 6008, so same
16:56 kb9vqf: yep, garbage :-)
16:57 karolherbst: nono
16:57 karolherbst: no garbage
16:57 karolherbst: but bullshit
16:57 karolherbst: 4388: difference between 0f max clock and 0a max clock
16:57 karolherbst: 6008 : mem max clock
16:57 karolherbst: it's the same for me
16:57 karolherbst: just with different values
16:57 kb9vqf: so they oopsed and passed through a low level value to the UI
16:57 karolherbst: but hey, let's clock your memory to 12016MHz and be happy....
16:57 kb9vqf: heh
16:57 karolherbst: actually the blob will do that if you try that
16:58 karolherbst: and beyond
16:58 karolherbst: I have like +4008
16:58 karolherbst: but I can also say 4500
16:58 karolherbst: no problem
16:58 karolherbst: okay
16:58 kb9vqf: so this helps you put together that clocking patch?
16:58 karolherbst: nope
16:58 kb9vqf: didn't think it would
16:58 karolherbst: this was just something I was wondering about
16:58 karolherbst: okay
16:58 karolherbst: now the serious part
16:58 kb9vqf: anything else you need from the blob?
16:59 kb9vqf: ok
16:59 karolherbst: can you do "nvapeek 0x132024"
16:59 karolherbst: is preferred mode set to maximum?
16:59 karolherbst: we need the blob to use the max clock all the time
17:00 kb9vqf: yes, max
17:00 kb9vqf: 00132024: 00072801
17:00 karolherbst: somehow I removed the value on my end, wait
17:01 karolherbst: okay
17:01 karolherbst: fine
17:01 karolherbst: now I have to think
17:01 karolherbst: clock the memory +500
17:01 karolherbst: and do the nvapeek again
17:01 kb9vqf: changing the offset will do that, yes?
17:01 karolherbst: yes,
17:01 karolherbst: nvapeek should print 00052a01
17:02 kb9vqf: wait, how do I apply the offset?
17:02 karolherbst: press enter
17:02 kb9vqf: gotcha
17:02 kb9vqf: 00132024: 00052a01
17:02 karolherbst: okay
17:02 karolherbst: clock again to 0
17:02 karolherbst: because you are actually using it and the card may become unhappy
17:02 kb9vqf: 00132024: 00072801
17:02 kb9vqf: ok
17:03 karolherbst: no, this was for safety ;)
17:03 karolherbst: -2000
17:03 karolherbst: then value
17:03 karolherbst: should be 00062801
17:03 kb9vqf: pretty colors :-)
17:03 kb9vqf: 00132024: 00062801
17:04 karolherbst: -1800
17:04 kb9vqf: GPU locked up
17:04 karolherbst: should be 00072601
17:04 karolherbst: ...
17:04 karolherbst: :/
17:04 karolherbst: sad
17:04 kb9vqf: I don't think there was enough bandwidth at -2000
17:04 karolherbst: at least the values seem to be the same
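The 0x132024 readings above can be split into their P, N and M fields. This is only a minimal sketch, assuming the register packs the coefficients as (P << 16) | (N << 8) | M; that layout fits M staying at 1 in every dump, but it is not confirmed anywhere in this log. The function names and the 27 MHz crystal default are hypothetical and only there for illustration.

```python
def decode_pll_coef(value):
    """Split a raw coefficient word into (P, N, M).

    Assumed layout (not confirmed in the log): P in bits 16..23,
    N in bits 8..15, M in bits 0..7.
    """
    m = value & 0xFF
    n = (value >> 8) & 0xFF
    p = (value >> 16) & 0xFF
    return p, n, m


def pll_output_khz(value, crystal_khz=27000):
    """Estimate the PLL output as crystal * N / (M * P), in kHz.

    crystal_khz=27000 is an assumption, not a value taken from the log.
    Returns None when M or P is zero, since the formula is undefined then.
    """
    p, n, m = decode_pll_coef(value)
    if m == 0 or p == 0:
        return None
    return crystal_khz * n // (m * p)


if __name__ == "__main__":
    # The four values read back with nvapeek in the log.
    for raw in (0x00072801, 0x00052A01, 0x00062801, 0x00072601):
        p, n, m = decode_pll_coef(raw)
        print(f"{raw:08x}: P={p} N={n} M={m} -> {pll_output_khz(raw)} kHz (assumed crystal)")
```

Under these assumptions every dump has M=1 and only N and P move between memory offsets, which is the pattern described for the blob.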
17:04 karolherbst: mhh
17:04 karolherbst: don't think so
17:05 kb9vqf: hmm, well, ok, bear in mind there were 3 1920x1200 screens active
17:05 karolherbst: I can play borderlands pretty nicely with 362MHz memory clock
17:05 karolherbst: although only 10fps, but more than expected
17:05 karolherbst: at full hd
17:05 imirkin_: Hauke: btw, basically you need to get ben to look at your patches. he's not around right now though.
17:05 kb9vqf: so did you need more data?
17:06 karolherbst: some values may be nice
17:06 imirkin_: Hauke: at this point, less likely that they'll make 4.3
17:06 karolherbst: maybe one or two more?
17:06 karolherbst: will use higher clocks then
17:06 kb9vqf: yeah, waiting for the machine to come back up
17:06 kb9vqf: will be a few minutes
17:06 karolherbst: but nice that the values are the same so far
17:07 kb9vqf: yes, I assume that means your reclocking patch might work?
17:07 karolherbst: no
17:07 karolherbst: it only means the blob uses the same values across cards
17:07 kb9vqf: ah, pl
17:07 kb9vqf: ok
17:07 karolherbst: which means the vbios really doesn't matter here
17:08 karolherbst: the pstate reports the clock the driver shall use
17:08 karolherbst: but this really doesn't matter
17:08 karolherbst: you can set a higher clock and the gpu won't complain
17:08 kb9vqf: ready
17:08 karolherbst: +500
17:08 karolherbst: ohh wait
17:08 karolherbst: we did this already
17:09 karolherbst: then +250
17:09 kb9vqf: yep
17:09 karolherbst: 00062a01
17:09 kb9vqf: 00132024: 00072801
17:09 karolherbst: max perf
17:09 kb9vqf: oops
17:09 karolherbst: :p
17:09 kb9vqf: 00132024: 00072801
17:09 kb9vqf: same
17:09 karolherbst: did you hit enter?
17:10 kb9vqf: yes
17:10 karolherbst: I mean is the value reported above
17:10 karolherbst: 6258MHz
17:10 karolherbst: ohhhh
17:10 karolherbst: my fault
17:10 karolherbst: I used the wrong offset
17:10 karolherbst: 00072801 is good
17:10 kb9vqf: ok
17:10 imirkin_: sounds like you got it ;)
17:11 kb9vqf: another value?
17:11 karolherbst: shit, blob messed up
17:11 karolherbst: refuses to go into 0f now
17:11 karolherbst: nah, it's okay
17:11 karolherbst: stupid blob
17:11 karolherbst: okay, it unloaded the module
17:11 karolherbst: kb9vqf: I think I am good
17:12 kb9vqf: ok
17:12 kb9vqf: poke me when you've got something to test :-)
17:12 karolherbst: just wanted to check if that's independent of the card
17:12 karolherbst: which card do you have exactly?
17:12 kb9vqf: GTX 670
17:12 karolherbst: I mean like chipset
17:12 kb9vqf: NV104 IIRC
17:12 imirkin_: GK104 = NVE4
17:12 karolherbst: okay
17:12 karolherbst: I've got gk106
17:12 kb9vqf: NVIDIA Corporation GK104 [GeForce GTX 670] [10de:1189]
17:12 karolherbst: so gk104 == gk106 here
17:12 karolherbst: at least something
17:13 kb9vqf: looks like
17:13 karolherbst: nvidia got a bit lazy hiding stuff :p
17:13 kb9vqf: fine by me :-)
17:13 imirkin_: yeah, same probably goes for gk107. likely gk110 and gk208 as well
17:13 karolherbst: yeah
17:13 karolherbst: I need mupuf :D
17:13 karolherbst: I don't look forward to scraping values out of traces
17:13 karolherbst: so
17:14 karolherbst: kb9vqf: if you want you could verify this entire table, but you don't have to https://gist.github.com/karolherbst/a5dd956189a533ff3e6d#file-blob-values-for-pll
17:14 karolherbst: these clocks may mess up your system because they are pretty low
17:15 karolherbst: in brackets is the actual clock value in kHz
17:15 kb9vqf: Yeah, that probably won't work, but from what I saw they track exactly
17:15 karolherbst: yeah
17:15 kb9vqf: well, I'm switching this machine back over to nouveau
17:15 karolherbst: yep
17:15 karolherbst: imirkin_: the bad thing is, the values are pretty jumpy
17:16 kb9vqf: looking forward to getting the rest of the performance, but we're better than the blob already, so that's good :-)
17:16 karolherbst: so a difference of 2MHz can mean a completely different clock from the one PLL
17:16 karolherbst: we are better?
17:16 imirkin_: on his box at least :)
17:17 karolherbst: :)
17:17 imirkin_: sounds like nvidia manages to trip up the amd cpu somehow
17:17 karolherbst: how much performance is lost through DRI_PRIME?
17:28 karolherbst: ohh wait
17:30 karolherbst: wait a sec
17:30 karolherbst: ....
17:30 imirkin_: waiting with bated breath
17:30 karolherbst: ........
17:30 karolherbst: oh man
17:30 karolherbst: this was too obvious
17:31 karolherbst: imirkin_: https://github.com/karolherbst/nouveau/blob/master/drm/nouveau/nvkm/subdev/clk/pllgt215.c#L30 you see that function?
17:31 imirkin_: yes...
17:31 karolherbst: this was used for PLL1 and PLL2
17:31 karolherbst: and M was faulty on 0f gddr5
17:31 karolherbst: find the error
17:32 karolherbst: I can give you a hint, it's in the lines with the M stuff ;)
17:33 karolherbst: daamn :/
17:33 imirkin_: yeah, i got that... but... having trouble seeing it
17:33 karolherbst: info->vco1
17:33 karolherbst: for PLL2 ?
17:34 karolherbst: so nouveau used the same bios information for PLL1 and PLL2
17:34 karolherbst: ....
17:34 karolherbst: there is an info->vco2 field, too
17:34 imirkin_: oh, didn't realize that :)
17:34 karolherbst: yeah
17:34 karolherbst: I didn't either
17:34 imirkin_: so you need to pass in a pointer to the vco
17:34 karolherbst: checking the values
17:34 imirkin_: or... something
17:34 karolherbst: and I bet max_m is 1
17:34 karolherbst: ...
17:34 imirkin_: note that there may not be a vco2 on gt215
17:35 imirkin_: so be careful how you do it.
17:35 karolherbst: https://github.com/karolherbst/nouveau/blob/master/drm%2Fnouveau%2Finclude%2Fnvkm%2Fsubdev%2Fbios%2Fpll.h
17:35 karolherbst: values may be 0 at most
17:35 karolherbst: also PLL2 is only used for special cases anyway
17:35 imirkin_: ah ok
17:38 karolherbst: mhh
17:38 karolherbst: values seem to be 0 for me
17:38 karolherbst: max_inputfreq also 0
17:38 karolherbst: so I guess this has to be read from the bios
17:39 karolherbst: and nouveau doesn't do that yet
17:40 karolherbst: ...
17:40 karolherbst: yeah
17:40 karolherbst: nvbios_pll_parse
17:40 karolherbst: nothing about vco2 for v 0x40
17:41 karolherbst: this may explain everything now
17:42 karolherbst: if we don't check against what the bios says, we shouldn't wonder if the gpu crashes :D
17:55 karolherbst: okay
17:55 karolherbst: the bios tells us shit
17:58 imirkin_: ;)
17:58 imirkin_: perhaps fixed or derived info for bios 0x40?
17:58 karolherbst: maybe wrong address
17:58 karolherbst: who knows
17:58 imirkin_: oh, well yeah, it won't be in the same place as in 0x50
17:58 karolherbst: I don't understand the PLL stuff inside nvbios anyway
17:59 karolherbst: and binary seems odd
18:01 karolherbst: imirkin_: could you imagine a PLL having troubles with input clock 810MHz + dividing and multiplying a bit
18:02 imirkin_: well, the thing is that you can't go too crazy with the multipliers and dividers
18:02 karolherbst: I mean, for me it sounds strange to put such a high clock into a PLL already
18:02 imirkin_: e.g. you can't have a multiplier of like 1000000 and a divider of 999999
18:02 karolherbst: yeah, obviously
18:02 imirkin_: having a 1ghz input into a pll isn't completely crazy
18:02 imirkin_: but def on the higher end
18:03 karolherbst: "VCO1 - freq [2000-4000]MHz, inputfreq [150-250]MHz, M [1-255], N [8-255], P [1-63] --"
18:03 karolherbst: mhhh
18:03 karolherbst: inputfreq seems like the values the second PLL gets
18:03 karolherbst: freq seems to be about right too somehow
18:04 karolherbst: but this doesn't help us anyway
18:05 karolherbst: M=1 N<0x30 P={1,2}
18:05 karolherbst: these seem to be the values the blob is using
18:05 karolherbst: but I can't find anything near it in the table
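The VCO1 line quoted above can be turned into a limits check that takes the limit set as an argument instead of hardcoding one VCO. This is a rough sketch under assumptions: "freq" is read as the VCO frequency before the P divider (refclk * N / M), "inputfreq" as refclk / M, and the class and function names are hypothetical, not nouveau's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VcoLimits:
    """One set of PLL limits; all frequencies in kHz."""
    min_freq_khz: int
    max_freq_khz: int
    min_inputfreq_khz: int
    max_inputfreq_khz: int
    min_m: int
    max_m: int
    min_n: int
    max_n: int
    min_p: int
    max_p: int


# Values copied from the "VCO1 - freq [2000-4000]MHz, ..." line in the log.
VCO1 = VcoLimits(2_000_000, 4_000_000, 150_000, 250_000, 1, 255, 8, 255, 1, 63)


def within_limits(limits, refclk_khz, n, m, p):
    """Return True if (refclk, N, M, P) respects the given limit set.

    Assumed meaning of the fields: inputfreq bounds refclk / M and
    freq bounds refclk * N / M. Comparisons are cross-multiplied by M
    so everything stays in integer arithmetic.
    """
    if not (limits.min_m <= m <= limits.max_m):
        return False
    if not (limits.min_n <= n <= limits.max_n):
        return False
    if not (limits.min_p <= p <= limits.max_p):
        return False
    if not (limits.min_inputfreq_khz * m <= refclk_khz <= limits.max_inputfreq_khz * m):
        return False
    return limits.min_freq_khz * m <= refclk_khz * n <= limits.max_freq_khz * m


if __name__ == "__main__":
    print(within_limits(VCO1, 200_000, 15, 1, 1))  # low refclk, multiply only
    print(within_limits(VCO1, 810_000, 16, 1, 1))  # 810MHz fed in with M=1
```

Because the limit set is a parameter, a separate table for a second PLL could be passed the same way, if one is ever found in the vbios.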
18:15 karolherbst: imirkin_: do you think I could just hardcode some values for now?
18:15 imirkin_: sure
18:22 karolherbst: imirkin_: https://github.com/karolherbst/nouveau/commit/44c337578d77f810fa10b0306fc0cbe9b0e1d612
18:23 imirkin_: how is min_n < max_n?
18:23 karolherbst: ohhhh
18:23 karolherbst: I get tired
18:24 imirkin_: also... this is very surprising if the values are in 0x30 but not 0x40
18:24 karolherbst: yes
18:24 imirkin_: and 0x20
18:24 imirkin_: let me check your vbios...
18:24 imirkin_: what's the base "data" offset?
18:24 karolherbst: wait
18:25 karolherbst: I did a printk(KERN_INFO "data: %i\n", data);
18:25 karolherbst: inside the bios file
18:25 karolherbst: data: 20913 data: 20899 data: 20885
18:25 karolherbst: :/
18:25 imirkin_: ?
18:26 imirkin_: hrm, those are 14 away
18:26 imirkin_: that's too bad.
18:26 imirkin_: tight packing
18:26 karolherbst: yeah
18:26 karolherbst: PLL limits table at 0x510c, version 64
18:30 imirkin_: perhaps VCO1 == VCO2?
18:31 imirkin_: hence the existing code? :)
18:31 imirkin_: or at least perhaps that was the theory
18:41 karolherbst: no clue
18:41 karolherbst: wait
18:41 karolherbst: ...
18:42 karolherbst: Register 0x00137000
18:42 karolherbst: Register 0x00137020
18:42 karolherbst: Register 0x00137040
18:42 karolherbst: still doesn't help
18:53 karolherbst: imirkin_: do you think the bios may tell us the right bounds?
18:56 karolherbst: imirkin_: I will sleep now. If mupuf comes around, could you try to convince him to put some kepler gddr5 cards into reator?
18:56 imirkin_: mupuf_: ---^ i'm done with maxwell, feel free to swap it out.
18:56 karolherbst: ohh skeggsb joined
18:56 karolherbst: maybe he knows something :D
18:58 karolherbst: but I don't think he will answer though? don't know meh :/
18:58 skeggsb: what am i supposed to be answering?
18:58 karolherbst: gddr5 reclocking stuff :p
18:58 karolherbst: I think I figured most of the issues out now, but still some questions
18:59 karolherbst: got 0f stable on my kepler gddr5 card
18:59 imirkin_: skeggsb: a week's worth of questions
18:59 karolherbst: :D
18:59 imirkin_: skeggsb: welcome back btw :)
18:59 skeggsb: thanks :)
19:00 karolherbst: and I think I know how to get it stable on other keplers as well
19:00 skeggsb: karolherbst: how, exactly?
19:00 karolherbst: easy
19:00 karolherbst: there are two PLL, but you know that
19:00 karolherbst: one used like always and the other one starting at 2400MHz or something
19:00 karolherbst: which you know
19:00 karolherbst: the tricky part comes now
19:00 karolherbst: the second PLL is only stable with multipliers
19:01 karolherbst: no dividers or anything
19:01 karolherbst: so it needs low clocks from the first one (150MHz - 250MHz range)
19:01 karolherbst: and has to multiply it up to the requested clock
19:01 karolherbst: like 4GHz
19:02 karolherbst: which means for P, N and M. M=1 P either 1 or 2, and N something below 0x30
19:02 karolherbst: done here in a hacky way: https://github.com/karolherbst/nouveau/commit/7588f95307ffec8c3d81c05e672edaaf356151dd
19:02 karolherbst: gt215_pll_calc2 is gt215_pll_calc with M=1
19:02 karolherbst: and "refclk = 182252;" is what the blob uses for my 0f pstate mem clock
19:03 karolherbst: but here is a little table with values from the blob: https://gist.github.com/karolherbst/a5dd956189a533ff3e6d#file-blob-values-for-pll
19:03 karolherbst: these are the same on gk104 and gk106 cards as far as we know
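The constraints described above (M fixed at 1, P either 1 or 2, N below 0x30, a low reference clock that is only multiplied up) can be sketched as a small search. This is an illustration under assumptions, not the blob's algorithm and not gt215_pll_calc2: the function name and parameters are hypothetical, and the usual PLL relation out = refclk * N / (M * P) is assumed.

```python
def pll2_coefficients(target_khz, refclk_khz, min_n=1, max_n=0x2F, allowed_p=(1, 2)):
    """Pick (N, P, actual_khz) for the second PLL with M fixed at 1.

    Assumes out = refclk * N / P (M = 1). For each allowed P the nearest
    integer N is tried; the combination with the smallest error wins, and
    ties go to the earlier P in allowed_p.
    """
    best = None
    for p in allowed_p:
        # Nearest integer N for this P, using integer arithmetic only.
        n = (target_khz * p + refclk_khz // 2) // refclk_khz
        if not (min_n <= n <= max_n):
            continue
        actual = refclk_khz * n // p
        err = abs(actual - target_khz)
        if best is None or err < best[0]:
            best = (err, n, p, actual)
    if best is None:
        raise ValueError("no N/P within the assumed bounds; refclk is probably too low")
    _, n, p, actual = best
    return n, p, actual


if __name__ == "__main__":
    # 182252 kHz is the refclk quoted in the log; the target is made up.
    print(pll2_coefficients(3_000_000, 182_252))
```

The sketch takes the reference clock as given. How the blob chooses that reference clock is exactly the open question in the discussion, so nothing here tries to answer it.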
19:03 skeggsb: well, you see, even *with* using the exact same coefficients as the binary driver (and the exact same resulting script), it's not enough
19:04 karolherbst: it works for me
19:04 skeggsb: i had to fuck with isohub unless i want to only reclock after cold-booting the card, and before enabling anything else
19:04 karolherbst: I had several AAA games and unigine benchmark running on my card at 0f pstate
19:04 karolherbst: also changed pstates a lot while the games were running
19:04 karolherbst: after 500 changes maybe one hang or something
19:05 skeggsb: i think you get lucky :)
19:05 karolherbst: maybe, maybe not
19:05 imirkin_: skeggsb: i'll take lucky over the current state :)
19:05 karolherbst: want to test on reator with other cards
19:05 karolherbst: :D
19:05 skeggsb: but yes, i'm aware (i believe there's comments to that effect too) that we don't properly select the refpll setup
19:05 karolherbst: yeah
19:05 karolherbst: I saw that
19:06 karolherbst: I messed around a bit and at least it works for me, so I wanted to check other cards next
19:06 karolherbst: the tricky part was the M value for the PLL
19:06 karolherbst: which has to be 1 as far as I know
19:07 karolherbst: there is no other way
19:07 karolherbst: whatever M stands for ;) but it seems to be a divider
19:08 karolherbst: the thing is, I don't know how the blob selects the refclock
19:08 karolherbst: how the algorithm is
19:08 karolherbst: I have ideas, but nothing where I would say that's it
19:08 skeggsb: me neither, i looked very briefly at it ages ago and didn't come to any conclusion, i then decided that those details could wait until i had the isohub stuff figured out
19:09 karolherbst: I see
19:09 skeggsb: basically, for me, it looks like something does memory requests while gddr5 link training is happening, which causes it to fail
19:09 skeggsb: screwing the whole reclock
19:09 karolherbst: mhhh
19:09 karolherbst: that may be the reason it failed for me once in 500 reclocks
19:10 karolherbst: but before the hack even starting a game at 0f is a risk
19:10 karolherbst: and I have an optimus laptop ;)
19:10 karolherbst: so nothing really is running on the card until I start something
19:10 skeggsb: that's why it works for you :)
19:10 karolherbst: could be, could be not
19:10 skeggsb: it's probably the display that's doing the memory requests
19:11 karolherbst: with the hack it's a lot more stable actually
19:11 skeggsb: (the isohub stuff is configuring the line buffer, etc)
19:11 karolherbst: switching pstates while something is running also is kind of stable ;)
19:11 skeggsb: yes, i'm not saying improvements there aren't needed, i know that already, but it's not the magic bullet :)
19:11 karolherbst: I know
19:12 karolherbst: but I will see how big the improvement is, when testing on other cards I guess
19:13 skeggsb: have a display connected while you're trying this :)
19:13 karolherbst: :D
19:13 karolherbst: no ports on dedicated one
19:13 karolherbst: all ports are on my intel card
19:14 skeggsb: i am actually a bit surprised that if M doesn't work for anything != 1, that the bios pll tables don't say that
19:14 karolherbst: well
19:14 karolherbst: about that
19:15 karolherbst: is info->vco2 for the PLL2 stuff?
19:16 karolherbst: in the nvbios_pll struct
19:16 karolherbst: because currently it doesn't seem that this value is filled
19:16 karolherbst: also the blob always uses M=1
19:17 karolherbst: and I just thought there may be a good reason the blob does it this way
19:17 skeggsb: it'd probably just be vco1, they aren't 2-stage plls
19:18 karolherbst: okay
19:18 karolherbst: anyway, I checked several mem clocks and this seemed like the way to go
19:18 karolherbst: because nouveau didn't do that
19:20 karolherbst: skeggsb: just to be clear: with stock nouveau, even while nothing was running while changing to 0f, the next application, even something like glxgears, could hang the gpu
19:21 skeggsb: yep, your board triggers that bug. the few i was using does not
19:21 karolherbst: I think the doing stuff while changing is one issue, but it isn't a big one at least for me
19:21 karolherbst: I see
19:21 skeggsb: obviously, it needs fixing too :P
19:22 karolherbst: anyway, if you find some time for trying stuff out, my hack should work, only the refclk value has to be changed to something sane for the card
19:22 karolherbst: or I just wait until I can play on reator :D
19:23 skeggsb: on my gk106, for example, i can use 0xf with current code so long as i boot with "config=NvForcePost=1,NvClkMode=0x0f"
19:23 karolherbst: mhh
19:23 skeggsb: changing after a mode is set doesn't work
19:23 skeggsb: unless i hack in the isohub crap that nvidia does
19:23 skeggsb: (which is magic dependent on a *lot* of variables i haven't figured out)
19:23 karolherbst: I see
19:24 karolherbst: I already looked over the PDAEMON stuff and only the PLL thingies stood out somehow
19:24 skeggsb: and is also complicated, and requires changes to the supervisor handling (modesetting stuff), and pmu ucode, and the ram reclocking code
19:25 skeggsb: in your case, that doesn't surprise me at all, given you don't have displays :)
19:25 skeggsb: we're *almost* identical to nvidia in that case already
19:25 karolherbst: okay
19:26 karolherbst: except maybe the bits I figured out today
19:26 skeggsb: that's part of the almost
19:27 karolherbst: but I also got the feeling, that changing while something was running was a lot more unstable with stock
19:27 karolherbst: than changing while nothing was running
19:29 karolherbst: but at least I know that with nouveau I am around 60% blob performance, which is kind of nice
19:30 imirkin_: skeggsb: make sure you look at Hauke's patches -- they got higher bandwidth hdmi working for him
19:31 imirkin_: skeggsb: seems like there's an artificial 225mhz limit on at least fermi, dunno about kepler
19:34 karolherbst: okay I am off now, I just hope mupuf is there tomorrow, otherwise I don't really know what I may do next, because that would be nearly everything on my todo list :D
19:34 imirkin_: skeggsb: dunno if you're doing a pull for 4.3, but if you are, may be nice to include that
19:34 karolherbst: imirkin_: any idea related to performance? Because I don't think I want to check why DRI_PRIME is causing sync issues for me
19:35 imirkin_: karolherbst: huh?
19:35 imirkin_: karolherbst: this is for hdmi
19:35 karolherbst: no, I meant like what I could do next
19:35 imirkin_: oh
19:36 karolherbst: performance seems to be the biggest issue for me after that
19:36 imirkin_: karolherbst: write patches to enable v2 and higher GT/s
19:36 karolherbst: for that I wanted to wait for skeggsb cleanups
19:36 imirkin_: oh right
19:36 karolherbst: because of pci subdev
19:36 imirkin_: ummmm.... well the rest is in the mesa driver
19:36 karolherbst: mhh
19:36 karolherbst: wait I think I had something
19:36 karolherbst: ...
19:36 karolherbst: something from yesterday
19:36 imirkin_: basically there are 2 potential sources of slowness
19:36 skeggsb: karolherbst: i didn't quite get it push-ready before i left, i'm writing proper commit messages right now and will push it today in theory
19:36 karolherbst: no worries
19:37 karolherbst: It would be fine for me to have a WIP branch
19:37 imirkin_: (a) improper resource management, i.e. stalls on previous renders, etc.
19:37 karolherbst: just didn't want to write code for a pci subdev you did anyway
19:37 imirkin_: (b) inefficient code produced by the compiler
19:37 karolherbst: imirkin_: what did I do yesterday? :D
19:38 imirkin_: (and there's a (c) unimplemented features in mesa that would allow the application to make use of faster mechanisms, but that's a minority)
19:38 karolherbst: ohh I did GPIO stuff
19:39 karolherbst: ohh no
19:39 karolherbst: imirkin_: PWM voltage thingy!
19:39 karolherbst: yeah
19:39 karolherbst: this was it
19:40 karolherbst: because nouveau can't change clock voltage here for me
19:40 karolherbst: this little thingy: https://github.com/envytools/envytools/commit/1004c7bb5b533de289faeee7456a003d0388f5fa
19:40 imirkin_: skeggsb: btw, just as an fyi, i turned tessellation off on maxwell. there's some crazy mechanism for reading/writing tess control outputs, which i decided to leave for another day (or another person).
19:41 skeggsb: yeah, i noticed that earlier when i was catching up on the mesa list
19:41 imirkin_: skeggsb: otherwise tess works pretty well. except for the fact that it hangs my GK208 (or *something* does anyways)
19:41 karolherbst: and 8x msaa
19:42 imirkin_: karolherbst: could i get a screenshot btw?
19:42 karolherbst: :D
19:42 karolherbst: okay
19:42 karolherbst: video is better though
19:42 imirkin_: karolherbst: i'll try to double-check on my GF108
19:42 imirkin_: 8x msaa is in various ways special, so not too surprised we mess it up
19:43 imirkin_: karolherbst: oh btw, in your perf tests, are you building mesa without --enable-debug? coz that's a pretty big drag on perf sometimes
19:43 karolherbst: mhh
19:43 karolherbst: I am not sure
19:43 karolherbst: I rebuilt mesa like twice today
19:43 karolherbst: once with and once without
19:46 imirkin_: k, well, no worries -- just something to keep in mind when you're actually benchmarking
19:46 karolherbst: yeah
19:46 karolherbst: I am pretty sure the msaa issue is a minor one
19:47 karolherbst: like you will immediately see what's going on
19:47 imirkin_: yeahhhhh.... there are a ton of failing msaa tests
19:47 imirkin_: there are a few things that we do explicitly wrong
19:47 imirkin_: although i don't think they matter
19:47 imirkin_: and they're not wrong enough
19:47 imirkin_: however esp with MS stencil there's major idiocy i think
19:47 imirkin_: but nothing special to 8x msaa
19:48 imirkin_: looks like the 'accuracy' piglit tests all fail for MS8 only
19:49 karolherbst: ohhh
19:49 karolherbst: you will see
19:50 imirkin_: and again, for depth-stencil
19:52 karolherbst: http://www.filebin.ca/2CPAnOjuzVUh/out.mp4
19:53 imirkin_: whoa, i like it!
19:53 karolherbst: yeah
19:53 karolherbst: looks nice
19:53 imirkin_: so my guess is that there's some sort of "flushing" we need to do in order to be able to sample from the surface
19:54 imirkin_: or.... something.
19:54 karolherbst: but tess looks nice
19:54 karolherbst: although I don't think anybody wants a road like this
19:54 imirkin_: that looks like the "extreme tess" setting?
19:54 karolherbst: yeah
19:55 imirkin_: the normal tess setting isn't quite as extruded
19:55 karolherbst: :D
19:55 karolherbst: on the walls it's fine though
19:56 karolherbst: imirkin_: do you know what's strange? that the missing voltage thingy actually didn't do anything bad for me
19:57 karolherbst: maybe it's set to a high value by default?
19:57 karolherbst: didn't check it though
19:59 karolherbst: no I actually did, now I remember
20:00 karolherbst: ohh right
20:00 karolherbst: and the new temp sensor
20:00 karolherbst: imirkin_: if you want to take a look https://github.com/karolherbst/envytools/commit/0e05adf392ef3a564a62271ccbc690a703b94cb5
20:01 imirkin_: those are the only GF119- values... why do you say GF119?
20:01 karolherbst: because I didn't find it for earlier models
20:02 karolherbst: maybe GF117 has them too
20:02 imirkin_: well, GF117 comes after GF119
20:02 karolherbst: ahhh
20:02 karolherbst: okay
20:02 imirkin_: look at nvchipsets.xml to determine the order
20:02 karolherbst: that explains why GF117- didn't show them for GF119 :D
20:02 imirkin_: gf117 came out after gf119
20:03 karolherbst: okay
20:03 karolherbst: didn't know that
20:03 karolherbst: but the blob didn't use it on any nvc* card
20:03 imirkin_: anyways, mupuf_ is the temperature master, so check with him
20:03 karolherbst: yeah he knows already
20:03 karolherbst: but you have a fermi card, don't you?
20:03 imirkin_: yea
20:03 imirkin_: i can check later
20:03 imirkin_: not in front of it right now
20:04 karolherbst: okay
20:04 karolherbst: reg is 0x0002044c
20:04 karolherbst: which fermi again?
20:05 karolherbst: anyway it's a temp sensor which also collects temps beneath 0°C. No idea why nvidia added them, but seems somebody needed them
20:06 imirkin_: gf108
20:06 karolherbst: okay
20:06 karolherbst: nice
20:06 karolherbst: if they are not there, then we are sure it's something really new
20:06 imirkin_: (fyi, gf108 = nvc1)
20:06 karolherbst: mupuf_ already said that the regs could be there for older cards but nvidia didn't use them
20:06 karolherbst: yeah I know the CodeNames page
20:07 imirkin_: ;)
20:07 karolherbst: I could also fill the last unknown regs for my card ...
20:07 karolherbst: oh well
20:07 karolherbst: so much to do
20:07 karolherbst: okay, will be off now for sure, cu then
22:02 imirkin_: skeggsb: any major objections to http://patchwork.freedesktop.org/patch/55422/ ? didn't seem to regress anything in piglit too visibly
22:09 skeggsb: imirkin_: nah, that looks good actually
22:09 imirkin_: ok cool
22:10 imirkin_: wanted to make sure i wasn't missing some bit of cleverness
22:10 skeggsb: no, that was just a fail by me really
22:55 imirkin_: oh hm. forgot all about this one. skeggsb -- something like this seems necessary right? not sure about my actual impl... http://patchwork.freedesktop.org/patch/49550/ it was so long ago
22:59 imirkin_: oh well, i'll look at it later