03:27 mlankhorst: excellent, my rgb leds work with my tegra :D
03:43 vlooe: Hi
03:43 vlooe: what is the current state for 08:00.0 3D controller [0302]: NVIDIA Corporation GM108M [GeForce 940M] [10de:1347] (rev a2)
03:55 karolherbst: vlooe: opengl should work
03:56 karolherbst: vlooe: why?
03:56 karolherbst: vlooe: do you plan to buy a laptop with that or do you have one with that already?
03:56 karolherbst: vlooe: most of the time you are better off with intel gpus, especially the iris gpus are more powerful than a 940M
03:57 RSpliet: karolherbst: I personally doubt that... got benchmarks to back that up?
03:57 RSpliet: or is that under the assumption that you're running nouveau?
03:57 karolherbst: RSpliet: maybe the 940M is pretty strong, but the newest intel iris have around 20% more "theoretical" processing power
03:57 karolherbst: RSpliet: with nouveau
03:58 karolherbst: RSpliet: even with the blob it's a close call
03:58 karolherbst: then you have dual gpu overhead and stuff
03:58 karolherbst: even if they were on par, the intel would give you more perf
03:59 RSpliet: http://www.videocardbenchmark.net/gpu_list.php
03:59 RSpliet: Irises don't score particularly impressively
03:59 RSpliet: gheh ok
03:59 RSpliet: nor does 940M
03:59 karolherbst: 6200?
03:59 RSpliet: right
03:59 karolherbst: iris 6200 and 940M are pretty close
04:00 karolherbst: even the 6100 is pretty good compared to 940m
04:04 karolherbst: RSpliet: as it seems even the first Iris are good too, but the 5000 one is slower than a 940m though
04:07 vlooe: I have a Thinkpad T550 with this gpu. At the moment only the intel gpu is running.
04:08 vlooe: i upgraded to kernel 4.3 and still get nouveau 0000:08:00.0: unknown chipset (118070a2)
04:08 karolherbst: ohh right, it's a gm108m :/
04:08 karolherbst: sorry for that
04:08 karolherbst: in theory it should work just like a gm107m
04:08 karolherbst: but
04:09 karolherbst: I don't think any nouveau dev ever tried it out yet
04:09 karolherbst: vlooe: chich cpu?
04:09 karolherbst: *which
04:09 karolherbst: 5200 or 5600?
04:09 vlooe: to rephrase my question: what is the current state? can I provide some help to get it supported?
04:09 karolherbst: mhhh
04:09 karolherbst: we could write a hack to use the gm108m as if it were a gm107m
04:10 karolherbst: and see how that goes
04:10 karolherbst: imirkin: are you aware of anybody having done that before?
04:10 vlooe: I have the i5-5200U cpu
04:11 karolherbst: vlooe: mhhh
04:11 vlooe: I use gentoo, so building/patching a kernel is no problem for me
04:11 karolherbst: vlooe: with nouveau I would say... +30% more perf compared to the intel gpu, at most
04:11 karolherbst: but you can still help if you want to :p
04:12 karolherbst: I just want to make it clear, that you can't expect any wonders by using the nvidia gpu
04:12 vlooe: i do not need perf, i just want it to work :D
04:13 karolherbst: k
04:13 pmoreau: karolherbst: https://bugs.freedesktop.org/show_bug.cgi?id=89558#c44
04:13 pmoreau: vlooe: -^
04:13 pmoreau: I guess this one should help
04:13 karolherbst: pmoreau: looks actually pretty good
04:13 karolherbst: vlooe: yeah last comment attachment is a patch
04:14 karolherbst: try that out and you should be good to go
04:14 RSpliet: karolherbst: they don't have dedicated VRAM do they?
04:14 RSpliet: those Iris units
04:14 karolherbst: RSpliet: no
04:15 karolherbst: RSpliet: but low end gpus don't have fast memory either
04:15 karolherbst: :D
04:15 RSpliet: yeah, but it's not shared
04:15 karolherbst: right
04:15 karolherbst: RSpliet: but with intel you don't have that PRIME problem
04:16 RSpliet: sure
04:16 karolherbst: RSpliet: uhh his cpu doesn't even support 8.0 GT/s pcie :/
04:17 RSpliet: hahaha, it's designed to disadvantage nVidia :-P
04:17 karolherbst: :D
04:17 karolherbst: yeah well, my cpu supports 8.0
04:17 karolherbst: I think a i5-5200U is just too low end
04:18 karolherbst: mhh even an i7-5600U doesn't have 3.0
04:18 vlooe: is the bigger one that much better? xD
04:18 RSpliet: http://www.videocardbenchmark.net/gpu.php?gpu=GeForce+940M&id=
04:18 vlooe: will try the patch
04:18 RSpliet: I actually think that benchmark is not very accurate
04:18 RSpliet: given the big variance
04:19 karolherbst: http://www.videocardbenchmark.net/compare.php?cmp[]=3155&cmp[]=2908
04:19 karolherbst: both his gpus
04:19 karolherbst: and you have to remove around 30% from the nvidia one cause of nouveau
04:20 karolherbst: why bother...
04:20 karolherbst: sometimes I think those engineers don't know how fast those intel gpus are now
04:20 karolherbst: vlooe: there is one thing you could do though
04:20 karolherbst: vlooe: you could check if all your display ports are connected to the nvidia or intel gpu
04:21 RSpliet: that 940M is faster than that... the scores vary between 600 and 1100, that variance clearly indicates the set-up is unreliable
04:21 RSpliet: (esp. since the 940M scores lower than the 930M on that website)
04:21 karolherbst: I see
04:21 karolherbst: RSpliet: there is also gpuboost on windows and stuff
04:21 karolherbst: so cooling also matters
04:22 RSpliet: hmm yes
04:22 karolherbst: RSpliet: mgg
04:22 RSpliet: factor 2 would be worrying though
04:22 karolherbst: the 930m and 940m are identical
04:22 karolherbst: just different clocks
04:22 karolherbst: and clocks are different anyway
04:23 vlooe: will try to load the patched nouveau kernel module, i hope i do not crash xD
04:23 karolherbst: vlooe: I just hope you don't have any xorg.conf
04:23 karolherbst: or something
04:23 karolherbst: vlooe: usually nothing should happen as long as you didn't tinker too much ;)
04:24 RSpliet: karolherbst: sure... the small diff makes it all the more strange that a 930M performs marginally *better* than the 940M... hence, I don't trust that GPU benchmark ;-)
04:24 RSpliet: but it's good to see Intel is making rapid progress with their GPUs
04:24 karolherbst: I see
04:24 karolherbst: it's awesome yes
04:25 RSpliet: curious what happens if they start shipping computers with DDR4
04:25 karolherbst: mhhh
04:25 karolherbst: RSpliet: then embedded gpus will get ddr4 too
04:25 karolherbst: so what? :D
04:26 RSpliet: NVIDIA might have focussed on getting their memory controller ready for HBM instead, for the high end chips... curious whether the FB is ready for DDR4 ;-)
04:26 karolherbst: yeah mhh
04:26 karolherbst: I won't expect nvidia to risk all non gddr gpus being slower than intel ones
04:27 karolherbst: but currently 95% of them are already slower than the fastest iris
04:27 karolherbst: so :D
04:27 RSpliet: yeah, but that's comparing apples to oranges
04:27 karolherbst: only a ddr3 950M one might keep up to them
04:27 RSpliet: is any of those Irises being used in a mobile Intel CPU?
04:27 karolherbst: RSpliet: yes, they are especially made for mobile
04:27 RSpliet: is there an overview of which CPU has what?
04:28 karolherbst: RSpliet: most of the cpus also have an iris variant, but let me check
04:28 karolherbst: RSpliet: macbooks also use iris by the way
04:28 karolherbst: RSpliet: https://en.wikipedia.org/wiki/Broadwell_(microarchitecture)#Mobile_processors
04:29 karolherbst: and then haswell: https://en.wikipedia.org/wiki/Haswell_(microarchitecture)#Mobile_processors
04:30 vlooe: seems to work: https://bpaste.net/show/5bfbc1ef62af
04:30 karolherbst: RSpliet: the GT3e one has a 128MB L4 cache shared by gpu and cpu
04:30 karolherbst: vlooe: nice
04:31 vlooe: i guess i have to restart my xserver to see it in xrandr?
04:32 karolherbst: vlooe: well it is more difficult than that
04:32 karolherbst: vlooe: http://nouveau.freedesktop.org/wiki/Optimus/
04:35 vlooe_: so i need a patched xserver and mesa too?
04:36 karolherbst: vlooe_: mhhh good question
04:36 karolherbst: I think it is easier to enable DRI3 for intel
04:36 karolherbst: and just remove the nouveau ddx
04:37 karolherbst: vlooe_: but the intel ddx has dri3 disabled in the ebuild
04:37 karolherbst: wait a sec
04:37 karolherbst: vlooe_: use this one and build with dri3 support: https://raw.githubusercontent.com/karolherbst/F.U.N.-overlay/master/x11-drivers/xf86-video-intel/xf86-video-intel-2.99.917-r3.ebuild
04:38 karolherbst: vlooe_: and you need mesa[dri3] of course
04:39 gryffus: imirkin: confirming your second patch also fixes the crashes :) thanks
04:41 karolherbst: RSpliet: if you are interested in this eDRAM cache thingy: http://www.anandtech.com/show/6993/intel-iris-pro-5200-graphics-review-core-i74950hq-tested/3
04:45 vlooe_: have to get a m2 ssd soon. gentoo on a usb drive is insanely slow xD
04:45 karolherbst: ugh :D
05:29 RSpliet: karolherbst: thanks, that clarifies a lot
06:18 yoyomoony: Hello.
06:39 wootehfoot: Looking for a 620M GF117 VBIOS, alternatively 820M since that's also GF117.
08:49 mupuf: karolherbst: hey, here is the new version of env_dump http://paste.pound-python.org/show/AKdvemx1DtNFWrPiyUqz/
08:50 mupuf: look at the resolution of the SHAs :p
08:50 mupuf: it also resolves some of the git versions of mesa, but it is not perfect yet
08:50 mupuf: mesa's build system is ... funny
08:51 mupuf: mega-drivers prevent me from detecting i965_dri.so
08:51 mupuf: other than that, it all works :)
08:51 mupuf: I will just add a special case for the mesa profile. Will just need to refactor some code
09:12 karolherbst: mupuf: how do you read the library version?
09:15 karolherbst: mupuf: " mega-drivers prevent me from detecting i965_dri.so" what do you mean by that?
09:15 karolherbst: mupuf: also remove duplicates // in paths
09:16 karolherbst: shouldn't have any noticeable overhead and it reduces duplicates in the list
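A minimal sketch of the path cleanup suggested above, assuming POSIX-style library paths. The sample paths and the `dedupe_paths` helper are made up for illustration; this is not env_dump's actual code.

```python
import posixpath


def dedupe_paths(paths):
    """Collapse repeated slashes and drop paths that become identical."""
    seen = set()
    unique = []
    for path in paths:
        # normpath collapses "a//b" into "a/b" (and resolves "." and "..").
        normalized = posixpath.normpath(path)
        if normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique


# Hypothetical library list, as a tool like env_dump might collect it.
libs = [
    "/usr/lib64//libGL.so.1",
    "/usr/lib64/libGL.so.1",
    "/usr//lib64/dri/i965_dri.so",
]
cleaned = dedupe_paths(libs)
print(cleaned)
```

Normalizing before comparing keeps the first-seen order, so the report stays stable between runs.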
09:17 karolherbst: mupuf: also you might want to look which window manager is running, because this also might impact performance
10:55 imirkin: karolherbst: can you check if http://patchwork.freedesktop.org/patch/63996/ fixes Metro:LL startup fail?
11:09 karolherbst: K
11:11 karolherbst: imirkin: ohh this is a mesa patch?
11:13 karolherbst: imirkin: I can give you my 1.7G trace to test it though if you want
11:15 imirkin: karolherbst: nah that's ok :)
11:25 karolherbst: imirkin: nope, doesn't help
11:26 imirkin: ok thanks for testing
11:27 imirkin: sort of as i expected, but nice to have verification :)
11:27 karolherbst: yeah
11:27 karolherbst: it is no crash in metro though
12:32 imirkin_: hakzsam-: thanks for sending me that compute trace... actually looks like GK104_COMPUTE had already been hooked up in mmt. but something's going wrong... will try to investigate why it's not decoding the kernel
12:41 imirkin_: hmmm... clearly the blob knows something we don't -- it's setting the code address to 0x700000000 which is unmapped.
12:46 imirkin_: oh weak
12:51 karolherbst: mupuf: by the way: I used your version kind of
12:51 karolherbst: but I played so much with the reg myself, that it didn't matter
12:52 karolherbst: the fix works though
13:18 imirkin_: hakzsam-: --^ enjoy
13:22 imirkin_: skeggsb: so... is there anything i can do to place buffers *outside* of some 16MB region? i guess i could make the "annoying" buffers be 16MB in size in the first place thus guaranteeing that... but... hrmph.
13:23 imirkin_: i guess outside of moving allocation policy into userspace, not a whole lot that can be done =/
13:29 karolherbst: imirkin_: ha, I found out my runpm issue
13:29 imirkin_: phantom VGA connector?
13:29 karolherbst: imirkin_: whenever the nouveau module was loaded with runpm=0 once, then after removing it and loading it with runpm=1 it won't work
13:29 imirkin_: hah
13:30 karolherbst: first I thought bbswitch messes up
13:30 imirkin_: it must mess something up in runpm=0 mode which forces the device to remain enabled
13:30 karolherbst: but I was forced to reboot and can verify it does not
13:30 karolherbst: imirkin_: yeah, I think deep inside the kernel
13:30 imirkin_: probably pm_runtime count, or... something.
13:30 karolherbst: I think I know what
13:30 karolherbst: nah, it's more trivial
13:30 imirkin_: it might call pm_runtime_super_mega_disable()
13:30 karolherbst: imirkin_: nouveau calls pm_runtime_forbid(dev);
13:31 karolherbst: ;)
13:31 imirkin_: right.
13:31 karolherbst: and dev is the kernel device
13:31 karolherbst: so it will stay even after nouveau was removed
13:31 imirkin_: it should un-call that on unload
13:31 karolherbst: yes
13:31 karolherbst: or
13:31 karolherbst: at load time
13:31 karolherbst: if (nouveau_runtime_pm == 1)
13:32 karolherbst: pm_runtime_unforbid(dev);
13:32 karolherbst: or whatever
13:33 imirkin_: mmmm dunno
13:33 imirkin_: there could be other reasons why pm_runtime is forbidden presumably? dunno
13:33 imirkin_: you can also manually change it with e.g. powertop
13:33 karolherbst: mhh
13:33 imirkin_: or by writing stuff to sysfs
13:33 karolherbst: at least powertop doesn't allow me to do that
13:34 karolherbst: control is still set to auto
13:34 karolherbst: even after forbid
13:34 imirkin_: ah ok
13:41 pmoreau: imirkin_: envydis can't parse compute code for gk110? I get a "no mode cp" when trying `envydis -m gk110 -O cp -w`
13:41 imirkin_: the -O cp is just for the g80 machine
13:41 pmoreau: Oh ok
13:41 imirkin_: for gf100+ it's all the same isa
13:41 imirkin_: no extra cp variant
13:42 pmoreau: Nice :-)
13:45 imirkin_: a fun project would be to add aalib support to demmt so that it prints out textures ;)
13:46 karolherbst: imirkin_: so nouveau should rather reset the forbid thing after it called it?
13:46 pmoreau: Thanks for the envytools patches!! I couldn't find the kernel code in the trace. :D
13:46 karolherbst: or should I explicitly read out the status at load time?
13:46 imirkin_: karolherbst: i think so.
13:46 imirkin_: pmoreau: if you have a gk110/gk208 trace that doesn't work, happy to take a look
13:47 imirkin_: pmoreau: i just used the trace that hakzsam- sent me to improve things a bit
13:47 imirkin_: it's all a guessing game though, so heuristics can always be improved
13:47 pmoreau: imirkin_: hakzsam is my source of supply for gk110/gk208 traces, for now as I don't have one myself
13:48 imirkin_: ah ok. well i happen to have a gk208 here, so i can run stuff if necessary
13:48 imirkin_: i'm going to look over hansg's patches and then probably give atomic another shot given robert's feedback
13:48 pmoreau: :-)
13:49 imirkin_: i guess i might have known about that 16M window thing actually
13:49 imirkin_: but definitely forgot about it
13:49 pmoreau: I wanted to have some traces on Kepler/Fermi as my work doesn't work on those, due (at least) to the fact that input params are stored in c[] and not in s[].
13:51 pmoreau: And it looks like my get_local_id() work won't work either, cause they have the ids stored in specific reg $tidx and $ctaidx, rather than in $r0
13:51 imirkin_: pmoreau: that's what the sysvals are all about
13:52 imirkin_: pmoreau: the relevant target knows how to retrieve SV_TIDX or whatever
13:52 imirkin_: pmoreau: and all you do is stick a OP_RDSV in when converting
13:52 pmoreau: Oh that's nice!
13:52 imirkin_: (RDSV = read system value, WRSV = write system value... dunno when that one's used)
13:52 pmoreau: Then maybe I'll try to get it working
13:53 imirkin_: pmoreau: http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir.h#n358
13:53 pmoreau: Cool! Thanks!
13:53 imirkin_: http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#n1526
13:54 pmoreau: I'll have to check what this CTAIDX is. I remember Robert talking about CTA in his reply, but already forgot what it was… --"
13:54 imirkin_: i believe it has something to do with compute
13:54 imirkin_: compute t___ address
13:54 imirkin_: dunno maybe not
13:55 imirkin_: compute thread area
13:55 imirkin_: just guessing ;)
13:56 pmoreau: :-)
13:56 pmoreau: Probably the blockIdx, just with a different name
13:56 karolherbst: imirkin_: pm_runtime_forbid indeed increases the use count :/
13:56 imirkin_: karolherbst: ok so we def have to call that on exit
13:57 karolherbst: yeah
14:00 karolherbst: imirkin_: nouveau already calls it :o
14:00 karolherbst: inside nouveau_drm_load
14:00 karolherbst: and it doesn't help
14:01 imirkin_: karolherbst: i mean call the inverse of it.
14:01 karolherbst: yeah pm_runtime_allow
14:01 karolherbst: this is already called
14:08 karolherbst: imirkin_: mhh when the driver is loaded I get a usage count of 2 :/
14:08 imirkin_: well i'm sure you'll figure out where the imbalance is
14:08 karolherbst: maybe I can't decrease it anymore after the module was unloaded :/
14:09 karolherbst: well there is always atomic_dec(&dev->power.usage_count);
14:09 karolherbst: :D
14:10 karolherbst: magic
14:10 karolherbst: now it suspends again
14:10 karolherbst: mhhh
14:10 karolherbst: something messes up big time
14:14 karolherbst: imirkin_: calling allow on unload does not help
14:22 karolherbst: imirkin_: I just removed all forbid calls and still the device is messed up :/
14:22 imirkin_: great update :)
14:23 karolherbst: haha :/
14:23 karolherbst: okay
14:23 karolherbst: all these runtime_pm checks
14:23 karolherbst: and returning EBUSY don't change a thing totally
14:24 imirkin_: you realize that you've already messed up the count on your running kernel right?
14:24 karolherbst: okay
14:24 karolherbst: imirkin_: I repair it with atomic_dec
14:24 imirkin_: ah :)
14:24 imirkin_: not dangerous at all
14:24 karolherbst: no
14:24 karolherbst: -1 doesn't crash the kernel
14:24 karolherbst: so
14:24 karolherbst: :D
14:24 karolherbst: all good
14:24 karolherbst: okay, moved every runtime_pm check inside drm_load
14:24 karolherbst: and still
14:24 karolherbst: when I load the first time
14:24 karolherbst: load_counter == 1
14:25 karolherbst: runtime_allow
14:25 karolherbst: load_counter == 0
14:25 karolherbst: as expected
14:25 karolherbst: but
14:25 karolherbst: when I load with runpm=0
14:25 karolherbst: unload
14:25 karolherbst: load runpm=1
14:25 karolherbst: I get load_counter == 2
14:25 karolherbst: at drm_load time
15:35 karolherbst: imirkin_: okay, at least the usage counter increases by one every time nouveau is loaded with runpm=0
15:41 karolherbst: imirkin_: maybe it is a drm core issue?
15:43 karolherbst: imirkin_: okay, I think I found it
15:43 karolherbst: after drm_device load finishes, it has a load count of 1 with runpm=0
15:44 karolherbst: and unload finishes with 3
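The symptom described above (a usage counter that survives module reloads because it lives on the kernel device, not in the driver) can be sketched with a toy model. Everything here is hypothetical: `Device`, `load`, `unload` and `reload_cycle` are illustrative names, and the leak shown is only one possible kind of imbalance, not the actual root cause in nouveau, which the log does not establish.

```python
class Device:
    """Toy stand-in for a kernel device; it outlives any driver module."""

    def __init__(self):
        self.usage_count = 0

    def get(self):
        self.usage_count += 1

    def put(self):
        self.usage_count -= 1

    def can_suspend(self):
        # Runtime suspend is only possible once nobody holds a reference.
        return self.usage_count == 0


def load(dev, runpm):
    # Hypothetical driver load: with runpm=0, pin the device awake.
    if not runpm:
        dev.get()


def unload(dev, runpm, balanced):
    # A balanced unload drops exactly what load took; a leaky one does not.
    if balanced and not runpm:
        dev.put()


def reload_cycle(balanced):
    dev = Device()
    load(dev, runpm=False)                       # modprobe with runpm=0
    unload(dev, runpm=False, balanced=balanced)  # rmmod
    load(dev, runpm=True)                        # modprobe with runpm=1
    return dev.can_suspend()


leaky_ok = reload_cycle(balanced=False)
balanced_ok = reload_cycle(balanced=True)
print("leaky unload, can suspend:", leaky_ok)
print("balanced unload, can suspend:", balanced_ok)
```

In this model the leaked reference is still on the device when the driver comes back with runpm=1, so the device never reaches a count of zero until something decrements it by hand.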
15:46 imirkin_: pmoreau: btw, i just pushed 64-bit support for nv50 emission in case you need it in your opencl adventures. do note that it's G200-only.
15:46 imirkin_: pmoreau: also don't try to divide or sqrt :)
15:46 imirkin_: i have a pass that emulates rcp/rsq but... it didn't seem to work
15:46 imirkin_: probably something dumb but i haven't gotten back to it
15:47 imirkin_: [in part because i lack a G200 to test on]
15:51 imirkin_: i should be able to debug on a nvc0 but... /me is lazy
16:50 imirkin_: skeggsb: actually i guess what i need to do is *reserve* 16M of VA. it doesn't even need to be backed by anything.
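A rough sketch of the "reserve VA without backing it" idea, using a toy bump allocator. `VaAllocator`, the address range and the position of the 16 MiB window are all assumptions for illustration; this is not how nouveau's allocator is implemented.

```python
MIB = 1 << 20


def align_up(value, align):
    return (value + align - 1) // align * align


class VaAllocator:
    """Toy bump allocator over a virtual address range."""

    def __init__(self, start, end):
        self.next = start
        self.end = end
        self.reserved = []  # (base, size) ranges that are never handed out

    def reserve(self, base, size):
        # Only bookkeeping: no memory is attached to the reserved range.
        self.reserved.append((base, size))

    def alloc(self, size, align=0x1000):
        addr = align_up(self.next, align)
        moved = True
        while moved:
            moved = False
            for base, rsize in self.reserved:
                # Standard interval overlap test against each reservation.
                if addr < base + rsize and base < addr + size:
                    addr = align_up(base + rsize, align)
                    moved = True
        if addr + size > self.end:
            raise MemoryError("out of virtual address space")
        self.next = addr + size
        return addr


va = VaAllocator(0x10000000, 0x20000000)
window_base = 0x10000000 + 4 * MIB   # hypothetical window position
va.reserve(window_base, 16 * MIB)
first = va.alloc(4 * MIB)    # fits exactly below the window
second = va.alloc(1 * MIB)   # would land in the window, so it is pushed past it
print(hex(first), hex(second))
```

Because the reservation is pure bookkeeping, it costs address space but no memory, which matches the remark that the region does not need to be backed by anything.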
17:01 imirkin_: alright so now i need to combine my 20 atomic branches into The One True Branch (tm)
17:24 imirkin_: skeggsb: what's the best way to understand this error? http://hastebin.com/qadobomime.sm
17:25 imirkin_: skeggsb: is it an unaligned memory access, or is it some 0 address somewhere?
17:28 imirkin_: skeggsb: nevermind, figured it out. unrelated to gmem at all in the first place, i was doing something dumb