03:27mlankhorst: excellent, my rgb leds work with my tegra :D
03:43vlooe: what is the current state for 08:00.0 3D controller : NVIDIA Corporation GM108M [GeForce 940M] [10de:1347] (rev a2)
03:55karolherbst: vlooe: opengl should work
03:56karolherbst: vlooe: why?
03:56karolherbst: vlooe: do you plan to buy a laptop with that or do you have one with that already?
03:56karolherbst: vlooe: most of the time you are better off with intel gpus, especially the iris gpus are more powerful than a 940M
03:57RSpliet: karolherbst: I personally doubt that... got benchmarks to back that up?
03:57RSpliet: or is that under the assumption that you're running nouveau?
03:57karolherbst: RSpliet: maybe the 940M is pretty strong, but the newest intel iris have around 20% more "theoretical" processing power
03:57karolherbst: RSpliet: with nouveau
03:58karolherbst: RSpliet: even with the blob it's a close call
03:58karolherbst: then you have dual gpu overhead and stuff
03:58karolherbst: even if they were on par, the intel would give you more perf
03:59RSpliet: Irises don't score particularly impressively
03:59RSpliet: gheh ok
03:59RSpliet: nor does 940M
03:59karolherbst: iris 6200 and 940M are pretty close
04:00karolherbst: even the 6100 is pretty good compared to 940m
04:04karolherbst: RSpliet: as it seems even the first Iris are good too, but the 5000 one is slower than a 940m though
04:07vlooe: I have a Thinkpad T550 with this gpu. At the moment only the intel gpu is running.
04:08vlooe: i upgraded to kernel 4.3 and still get nouveau 0000:08:00.0: unknown chipset (118070a2)
04:08karolherbst: ohh right, it's a gm108m :/
04:08karolherbst: sorry for that
04:08karolherbst: in theory it should work just like a gm107m
04:09karolherbst: I don't think any nouveau dev ever tried it out yet
04:09karolherbst: vlooe: which cpu?
04:09karolherbst: 5200 or 5600?
04:09vlooe: to rephrase my question: what is the current state? can I provide some help to get it supported?
04:09karolherbst: we could write a hack to use the gm108m one as if it were a gm107m one
04:10karolherbst: and see how that goes
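[Editor's sketch] The "unknown chipset (118070a2)" message above prints the raw boot0 register value. A minimal sketch of how that value splits into a chipset id and a revision, assuming the mask and shift nouveau applies to NV10 and newer chips (the function name decode_boot0 is hypothetical, and the mask is an assumption rather than something stated in this log):

```python
# Sketch only: split a boot0 register value into chipset id and revision.
# The mask 0x1ff00000 and the shift by 20 are assumptions about how nouveau
# decodes NV10+ chips; decode_boot0 is a hypothetical helper name.
def decode_boot0(boot0):
    chipset = (boot0 & 0x1ff00000) >> 20  # bits 28..20
    revision = boot0 & 0xff               # low byte
    return chipset, revision


chipset, rev = decode_boot0(0x118070a2)
print(f"chipset 0x{chipset:03x}, rev 0x{rev:02x}")
```

Under that assumption the value decodes to chipset 0x118 with revision 0xa2, which lines up with the "rev a2" in the lspci output; the proposed hack would then amount to treating 0x118 like its 0x117 sibling.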
04:10karolherbst: imirkin: are you aware of anybody having done that before?
04:10vlooe: I have the i5-5200U cpu
04:11karolherbst: vlooe: mhhh
04:11vlooe: I use gentoo, so building/patching a kernel is no problem for me
04:11karolherbst: vlooe: with nouveau I would say... +30% more perf compared to the intel gpu, at most
04:11karolherbst: but you can still help if you want to :p
04:12karolherbst: I just want to make it clear, that you can't expect any wonders by using the nvidia gpu
04:12vlooe: i do not need perf, i just want it to work :D
04:13pmoreau: karolherbst: https://bugs.freedesktop.org/show_bug.cgi?id=89558#c44
04:13pmoreau: vlooe: -^
04:13pmoreau: I guess this one should help
04:13karolherbst: pmoreau: looks actually pretty good
04:13karolherbst: vlooe: yeah last comment attachment is a patch
04:14karolherbst: try that out and you should be good to go
04:14RSpliet: karolherbst: they don't have dedicated VRAM do they?
04:14RSpliet: those Iris units
04:14karolherbst: RSpliet: no
04:15karolherbst: RSpliet: but low end gpus don't have fast memory either
04:15RSpliet: yeah, but it's not shared
04:15karolherbst: RSpliet: but with intel you don't have that PRIME problem
04:16karolherbst: RSpliet: uhh his cpu don't even support 8.0 pcie :/
04:17RSpliet: hahaha, it's designed to disadvantage nVidia :-P
04:17karolherbst: yeah well, my cpu supports 8.0
04:17karolherbst: I think a i5-5200U is just too low end
04:18karolherbst: mhh even a i7-5600U don't have 3.0
04:18vlooe: is the bigger one that much better? xD
04:18vlooe: will try the patch
04:18RSpliet: I actually think that benchmark is not very accurate
04:18RSpliet: given the big variance
04:19karolherbst: both his gpus
04:19karolherbst: and you have to remove around 30% from the nvidia one cause of nouveau
04:20karolherbst: why bother...
04:20karolherbst: sometimes I think those engineers don't know how fast those intel gpus are now
04:20karolherbst: vlooe: there is one thing you could do though
04:20karolherbst: vlooe: you could check if all your display ports are connected to the nvidia or intel gpu
04:21RSpliet: that 940M is faster than that... the scores vary between 600 and 1100, that variance clearly indicates the set-up is unreliable
04:21RSpliet: (esp. since the 940M scores lower than the 930M on that website)
04:21karolherbst: I see
04:21karolherbst: RSpliet: there is also gpuboost on windows and stuff
04:21karolherbst: so cooling also matters
04:22RSpliet: hmm yes
04:22karolherbst: RSpliet: mgg
04:22RSpliet: factor 2 would be worrying though
04:22karolherbst: the 930m and 940m are identical
04:22karolherbst: just different clocks
04:22karolherbst: and clocks are different anyway
04:23vlooe: will try to load the patched nouveau kernel module, i hope i do not crash xD
04:23karolherbst: vlooe: I just hope you don't have any xorg.conf
04:23karolherbst: or something
04:23karolherbst: vlooe: usually nothing should happen as long as you didn't tinker too much ;)
04:24RSpliet: karolherbst: sure... the small diff makes it all the more strange that a 930M performs marginally *better* than the 940M... hence, I don't trust that GPU benchmark ;-)
04:24RSpliet: but it's good to see Intel is making rapid progress with their GPUs
04:24karolherbst: I see
04:24karolherbst: it's awesome yes
04:25RSpliet: curious what happens if they start shipping computers with DDR4
04:25karolherbst: RSpliet: then embedded gpus will get ddr4 too
04:25karolherbst: so what? :D
04:26RSpliet: NVIDIA might have focussed on getting their memory controller ready for HBM instead, for the high end chips... curious whether the FB is ready for DDR4 ;-)
04:26karolherbst: yeah mhh
04:26karolherbst: I won't expect nvidia to risk all non gddr gpus being slower than intel ones
04:27karolherbst: but currently 95% of them are already slower than the fastest iris
04:27karolherbst: so :D
04:27RSpliet: yeah, but that's comparing apples to oranges
04:27karolherbst: only a ddr3 950M one might keep up to them
04:27RSpliet: is any of those Irises being used in a mobile Intel CPU?
04:27karolherbst: RSpliet: yes, they are especially made for mobile
04:27RSpliet: is there an overview of which CPU has what?
04:28karolherbst: RSpliet: most of the cpus also have a iris variant, but let me check
04:28karolherbst: RSpliet: macbooks also use iris by the way
04:28karolherbst: RSpliet: https://en.wikipedia.org/wiki/Broadwell_(microarchitecture)#Mobile_processors
04:29karolherbst: and then haswell: https://en.wikipedia.org/wiki/Haswell_(microarchitecture)#Mobile_processors
04:30vlooe: seems to work: https://bpaste.net/show/5bfbc1ef62af
04:30karolherbst: RSpliet: the GT3e one has a 128MB L4 cache shared by gpu and cpu
04:30karolherbst: vlooe: nice
04:31vlooe: i guess i have to restart my xserver to see it in xrandr?
04:32karolherbst: vlooe: well it is more difficult than that
04:32karolherbst: vlooe: http://nouveau.freedesktop.org/wiki/Optimus/
04:35vlooe_: so i need a patched xserver and mesa too?
04:36karolherbst: vlooe_: mhhh good question
04:36karolherbst: I think it is easier to enable DRI3 for intel
04:36karolherbst: and just remove the nouveau ddx
04:37karolherbst: vlooe_: but the intel ddx has dri3 disabled in the ebuild
04:37karolherbst: wait a sec
04:37karolherbst: vlooe_: use this one and build with dri3 support: https://raw.githubusercontent.com/karolherbst/F.U.N.-overlay/master/x11-drivers/xf86-video-intel/xf86-video-intel-2.99.917-r3.ebuild
04:38karolherbst: vlooe_: and you need mesa[dri3] of course
04:39gryffus: imirkin: confirming your second patch also fixes the crashes :) thanks
04:41karolherbst: RSpliet: if you are interested in this eDRAM cache thingy: http://www.anandtech.com/show/6993/intel-iris-pro-5200-graphics-review-core-i74950hq-tested/3
04:45vlooe_: have to get a m2 ssd soon. gentoo on a usb drive is insanely slow xD
04:45karolherbst: ugh :D
05:29RSpliet: karolherbst: thanks, that clarifies a lot
06:39wootehfoot: Looking for a 620M GF117 VBIOS, alternatively 820M since that's also GF117.
08:49mupuf: karolherbst: hey, here is the new version of env_dump http://paste.pound-python.org/show/AKdvemx1DtNFWrPiyUqz/
08:50mupuf: look at the resolution of the SHAs :p
08:50mupuf: it also resolves some of the git versions of mesa, but it is not perfect yet
08:50mupuf: mesa's build system is ... funny
08:51mupuf: mega-drivers prevent me from detecting i965_dri.so
08:51mupuf: other than that, it all works :)
08:51mupuf: I will just add a special case for the mesa profile. Will just need to refactor some code
09:12karolherbst: mupuf: how do you read the library version?
09:15karolherbst: mupuf: " mega-drivers prevent me from detecting i965_dri.so" what do you mean by that?
09:15karolherbst: mupuf: also remove duplicate // in paths
09:16karolherbst: shouldn't have any noticeable overhead and it reduces duplicates in the list
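[Editor's sketch] The suggestion above, collapsing repeated slashes so the same library does not show up twice in the list, could look roughly like this. It is written in Python for illustration only; the helper names and the sample paths are hypothetical and this is not env_dump's own code.

```python
import re


def collapse_slashes(path):
    """Collapse every run of '/' into a single '/'."""
    return re.sub(r"/{2,}", "/", path)


def unique_paths(paths):
    """Normalize each path and drop duplicates, keeping first-seen order."""
    seen = {}
    for p in paths:
        seen.setdefault(collapse_slashes(p), None)
    return list(seen)


# Hypothetical sample input: the same file reached through two spellings.
print(unique_paths(["/usr/lib//libGL.so.1", "/usr/lib/libGL.so.1"]))
```

A plain regex is used instead of os.path.normpath because normpath also rewrites ".." components and keeps a leading "//" on POSIX, which is more (and less) than what was asked for here.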
09:17karolherbst: mupuf: also you might want to look which window manager is running, because this also might impact performance
10:55imirkin: karolherbst: can you check if http://patchwork.freedesktop.org/patch/63996/ fixes Metro:LL startup fail?
11:11karolherbst: imirkin: ohh this is a mesa patch?
11:13karolherbst: imirkin: I can give you my 1.7G trace to test it though if you want
11:15imirkin: karolherbst: nah that's ok :)
11:25karolherbst: imirkin: nope, doesn't help
11:26imirkin: ok thanks for testing
11:27imirkin: sort of as i expected, but nice to have verification :)
11:27karolherbst: it is no crash in metro though
12:32imirkin_: hakzsam-: thanks for sending me that compute trace... actually looks like GK104_COMPUTE had already been hooked up in mmt. but something's going wrong... will try to investigate why it's not decoding the kernel
12:41imirkin_: hmmm... clearly the blob knows something we don't -- it's setting the code address to 0x700000000 which is unmapped.
12:46imirkin_: oh weak
12:51karolherbst: mupuf: by the way: I used your version kind of
12:51karolherbst: but I played so much with the reg myself, that it didn't matter
12:52karolherbst: the fix works though
13:18imirkin_: hakzsam-: --^ enjoy
13:22imirkin_: skeggsb: so... is there anything i can do to place buffers *outside* of some 16MB region? i guess i could make the "annoying" buffers be 16MB in size in the first place thus guaranteeing that... but... hrmph.
13:23imirkin_: i guess outside of moving allocation policy into userspace, not a whole lot that can be done =/
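[Editor's sketch] The constraint being discussed is that certain buffers must not land inside a particular 16MB region. A minimal way to express that check, assuming half-open address ranges; the window base used in the example is made up, since the log does not say where the region sits:

```python
WINDOW_SIZE = 16 * 1024 * 1024  # 16 MiB, the size mentioned in the chat


def overlaps_window(addr, size, window_base, window_size=WINDOW_SIZE):
    """True if [addr, addr + size) intersects [window_base, window_base + window_size)."""
    return addr < window_base + window_size and window_base < addr + size


# Hypothetical window base, purely for illustration.
BASE = 0x01000000
print(overlaps_window(0x00f00000, 0x200000, BASE))  # straddles the start of the window
print(overlaps_window(0x02000000, 0x1000, BASE))    # starts right after the window
```

An allocator living in userspace could retry or skip candidate addresses for which this returns True, which is the "moving allocation policy into userspace" option mentioned above.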
13:29karolherbst: imirkin_: ha, I found out my runpm issue
13:29imirkin_: phantom VGA connector?
13:29karolherbst: imirkin_: whenever the nouveau module was loaded with runpm=0 once, then after removing it and loading it with runpm=1 it won't work
13:30karolherbst: first I thought bbswitch messes up
13:30imirkin_: it must mess something up in runpm=0 mode which forces the device to remain enabled
13:30karolherbst: but I was forced to reboot and can verify it does not
13:30karolherbst: imirkin_: yeah, I think deep inside the kernel
13:30imirkin_: probably pm_runtime count, or... something.
13:30karolherbst: I think I know what
13:30karolherbst: nah, it's more trivial
13:30imirkin_: it might call pm_runtime_super_mega_disable()
13:30karolherbst: imirkin_: nouveau calls pm_runtime_forbid(dev);
13:31karolherbst: and dev is the kernel device
13:31karolherbst: so it will stay even after nouveau was removed
13:31imirkin_: it should un-call that on unload
13:31karolherbst: at load time
13:31karolherbst: if (nouveau_runtime_pm == 1)
13:32karolherbst: or whatever
13:33imirkin_: mmmm dunno
13:33imirkin_: there could be other reasons why pm_runtime is forbidden presumably? dunno
13:33imirkin_: you can also manually change it with e.g. powertop
13:33imirkin_: or by writing stuff to sysfs
13:33karolherbst: at least powertop doesn't allow me to do that
13:34karolherbst: control is still set to auto
13:34karolherbst: even after forbid
13:34imirkin_: ah ok
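[Editor's sketch] The sysfs knob referred to above is the per-device power/control file, which accepts "auto" (runtime PM allowed) and "on" (device kept powered). A small sketch for reading and writing it; the helper names are hypothetical, the PCI address in the comment is just the one quoted earlier in this log, and writing normally needs root:

```python
from pathlib import Path


def power_dir(pci_addr, sysfs_root="/sys"):
    """Directory holding the runtime PM attributes of a PCI device."""
    return Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr / "power"


def read_control(pci_addr, sysfs_root="/sys"):
    """Return the current power/control setting ('auto' or 'on')."""
    return (power_dir(pci_addr, sysfs_root) / "control").read_text().strip()


def write_control(pci_addr, value, sysfs_root="/sys"):
    """Write 'auto' or 'on' to power/control; anything else is rejected here."""
    if value not in ("auto", "on"):
        raise ValueError("power/control accepts only 'auto' or 'on'")
    (power_dir(pci_addr, sysfs_root) / "control").write_text(value)


# Example (needs a real device and usually root):
#   print(read_control("0000:08:00.0"))
#   write_control("0000:08:00.0", "auto")
```

The sysfs_root parameter exists only so the helpers can be pointed at a scratch directory for testing.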
13:41pmoreau: imirkin_: envydis can't parse compute code for gk110? I get a "no mode cp" when trying `envydis -m gk110 -O cp -w`
13:41imirkin_: the -O cp is just for the g80 machine
13:41pmoreau: Oh ok
13:41imirkin_: for gf100+ it's all the same isa
13:41imirkin_: no extra cp variant
13:42pmoreau: Nice :-)
13:45imirkin_: a fun project would be to add aalib support to demmt so that it prints out textures ;)
13:46karolherbst: imirkin_: so nouveau should rather reset the forbid thing after it called it?
13:46pmoreau: Thanks for the envytools patches!! I couldn't find the kernel code in the trace. :D
13:46karolherbst: or should I explicitly read out the status at load time?
13:46imirkin_: karolherbst: i think so.
13:46imirkin_: pmoreau: if you have a gk110/gk208 trace that doesn't work, happy to take a look
13:47imirkin_: pmoreau: i just used the trace that hakzsam- sent me to improve things a bit
13:47imirkin_: it's all a guessing game though, so heuristics can always be improved
13:47pmoreau: imirkin_: hakzsam is my source of supply for gk110/gk208 traces, for now as I don't have one myself
13:48imirkin_: ah ok. well i happen to have a gk208 here, so i can run stuff if necessary
13:48imirkin_: i'm going to look over hansg's patches and then probably give atomic another shot given robert's feedback
13:49imirkin_: i guess i might have known about that 16M window thing actually
13:49imirkin_: but definitely forgot about it
13:49pmoreau: I wanted to have some traces on Kepler/Fermi as my work doesn't work on those, due (at least) to the fact that input params are stored in c and not in s.
13:51pmoreau: And it looks like my get_local_id() work won't work either, cause they have the ids stored in specific reg $tidx and $ctaidx, rather than in $r0
13:51imirkin_: pmoreau: that's what the sysvals are all about
13:52imirkin_: pmoreau: the relevant target knows how to retrieve SV_TIDX or whatever
13:52imirkin_: pmoreau: and all you do is stick a OP_RDSV in when converting
13:52pmoreau: Oh that's nice!
13:52imirkin_: (RDSV = read system value, WRSV = write system value... dunno when that one's used)
13:52pmoreau: Then maybe I'll try to get it working
13:53imirkin_: pmoreau: http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir.h#n358
13:53pmoreau: Cool! Thanks!
13:54pmoreau: I'll have to check what this CTAIDX is. I remember Robert talking about CTA in his reply, but already forgot what it was… --"
13:54imirkin_: i believe it has something to do with compute
13:54imirkin_: compute t___ address
13:54imirkin_: dunno maybe not
13:55imirkin_: compute thread area
13:55imirkin_: just guessing ;)
13:56pmoreau: Probably the blockIdx, just with a different name
13:56karolherbst: imirkin_: pm_runtime_forbid indeed increases the use count :/
13:56imirkin_: karolherbst: ok so we def have to call that on exit
14:00karolherbst: imirkin_: nouveau already calls it :o
14:00karolherbst: inside nouveau_drm_load
14:00karolherbst: and it doesn't help
14:01imirkin_: karolherbst: i mean call the inverse of it.
14:01karolherbst: yeah pm_runtime_allow
14:01karolherbst: this is already called
14:08karolherbst: imirkin_: mhh when the driver is loaded I get a usage count of 2 :/
14:08imirkin_: well i'm sure you'll figure out where the imbalance is
14:08karolherbst: maybe I can't decrease it anymore after the module was unloaded :/
14:09karolherbst: well there is always atomic_dec(&dev->power.usage_count);
14:10karolherbst: now it suspends again
14:10karolherbst: something messes up big time
14:14karolherbst: imirkin_: calling allow on unload does not help
14:22karolherbst: imirkin_: I just removed all forbid calls and still the device is messed up :/
14:22imirkin_: great update :)
14:23karolherbst: haha :/
14:23karolherbst: all these runtime_pm checks
14:23karolherbst: and returning EBUSY don't change a thing totally
14:24imirkin_: you realize that you've already messed up the count on your running kernel right?
14:24karolherbst: imirkin_: I repair it with atomic_dec
14:24imirkin_: ah :)
14:24imirkin_: not dangerous at all
14:24karolherbst: -1 doesn't crash the kernel
14:24karolherbst: all good
14:24karolherbst: okay, moved every runtime_pm check inside drm_load
14:24karolherbst: and still
14:24karolherbst: when I load the first time
14:24karolherbst: load_counter == 1
14:25karolherbst: load_counter == 0
14:25karolherbst: as expected
14:25karolherbst: when I load with runpm=0
14:25karolherbst: load runpm=1
14:25karolherbst: I get load_counter == 2
14:25karolherbst: at drm_load time
15:35karolherbst: imirkin_: okay, at least the usage counter increases by one every time nouveau is loaded with runpm=0
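[Editor's sketch] A toy model of the symptom described above, not of the kernel code and not of the actual root cause, which the log leaves open. It assumes only that the usage counter belongs to the long-lived device rather than to the driver module, so a load path that takes a reference without a matching drop on unload leaks one count per reload. All names are hypothetical.

```python
class Device:
    """Stands in for the kernel device, which outlives module load/unload."""

    def __init__(self):
        self.usage_count = 0


def load(dev, runpm):
    """Hypothetical load path: pins the device when runtime PM is disabled."""
    if runpm == 0:
        dev.usage_count += 1


def unload(dev, runpm, balanced):
    """Hypothetical unload path: drops the reference only if 'balanced'."""
    if runpm == 0 and balanced:
        dev.usage_count -= 1


def reload_cycles(n, balanced):
    """Load and unload n times with runpm=0; return the leftover count."""
    dev = Device()
    for _ in range(n):
        load(dev, runpm=0)
        unload(dev, runpm=0, balanced=balanced)
    return dev.usage_count


print(reload_cycles(3, balanced=False))  # leftover references after 3 reloads
print(reload_cycles(3, balanced=True))   # balanced paths leave nothing behind
```

With any leftover count above zero, a later load with runpm=1 would find the device still pinned, which matches the behaviour reported in the chat.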
15:41karolherbst: imirkin_: maybe it is a drm core issue?
15:43karolherbst: imirkin_: okay, I think I found it
15:43karolherbst: after drm_device load finishes, it has a load count of 1 with runpm=0
15:44karolherbst: and unload finishes with 3
15:46imirkin_: pmoreau: btw, i just pushed 64-bit support for nv50 emission in case you need it in your opencl adventures. do note that it's G200-only.
15:46imirkin_: pmoreau: also don't try to divide or sqrt :)
15:46imirkin_: i have a pass that emulates rcp/rsq but... it didn't seem to work
15:46imirkin_: probably something dumb but i haven't gotten back to it
15:47imirkin_: [in part because i lack a G200 to test on]
15:51imirkin_: i should be able to debug on a nvc0 but... /me is lazy
16:50imirkin_: skeggsb: actually i guess what i need to do is *reserve* 16M of VA. it doesn't even need to be backed by anything.
17:01imirkin_: alright so now i need to combine my 20 atomic branches into The One True Branch (tm)
17:24imirkin_: skeggsb: what's the best way to understand this error? http://hastebin.com/qadobomime.sm
17:25imirkin_: skeggsb: is it an unaligned memory access, or is it some 0 address somewhere?
17:28imirkin_: skeggsb: nevermind, figured it out. unrelated to gmem at all in the first place, i was doing something dumb