00:13gnurou: imirkin_: I only work with Tegra, so my understanding of vbios is very lacking :/
00:14gnurou: imirkin_: moving this register check thing one generation later sounds reasonable - I will send a patch later today
00:14imirkin_: gnurou: well, like i said, there's a GK107 we have "on file" without that write as well
00:15imirkin_: gnurou: i guess i'd defer to ben if he has any opinion on this
00:15imirkin_: gnurou: either just exclude GF104, or move it up to maxwell? dunno.
00:15gnurou: imirkin_: let's move it for Maxwell then? I think issues with repeating devinit could only occur when secure firmware is involved
00:16imirkin_: gnurou: also our vbios collection isn't exactly uniform across generations - i suspect we have better coverage of older GPUs than newer
00:16gnurou: it's just that this register is documented since Fermi, so I'm confused it is not used by all vbioses...
00:16imirkin_: and it wasn't *every* GF104 that had an issue
00:16gnurou: I mean, vbios is always provided by NVIDIA, right?
00:16imirkin_: afaik, never
00:16imirkin_: NVIDIA provides a tool
00:16imirkin_: but the board vendor futzes with it
00:16imirkin_: but my understanding of this is ... incomplete
00:16skeggsb: the tool can only change the data tables afaik, the scripts/code is all nv
00:17gnurou: ah, that would explain a lot
00:17imirkin_: so the *real* problem here, btw, is that running the vbios a second time messes things up
00:17gnurou: right, so if the code is provided by NVIDIA, according to past experience, we can expect it to always perform as expec... oh wait
00:17skeggsb: also expected, devinit isn't designed to be run on an initialised board
00:18imirkin_: skeggsb: oh really? how does it differ from a warm reboot?
00:18gnurou: isn't it what was done prior to this patch though?
00:18whompy: imirkin_: Are you collecting 13.0 glxinfo yet?
00:18imirkin_: whompy: i'm not, but i should be!
00:18skeggsb: imirkin_: i don't know, i presume on a warm reboot the gpu loses power?
00:18whompy: Well! I'll send a couple of 'em your way!
00:18skeggsb: but nv told me themselves to not do that :)
00:18imirkin_: skeggsb: didn't think so, but what do i know.
00:18skeggsb: and, in my own experience, it ends horribly often
00:19skeggsb: it often ends horribly*
00:19imirkin_: either way, it seems like either i can't properly analyze VBIOS init scripts, or sometimes some nvidia boards were shipped with vbios's that don't flip that bit
00:19imirkin_: your call as to how to "fix" it - we could look at *both* the new bit and the old method
00:20imirkin_: i.e. if the new bit is set, fine, if not, check the old thing
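The "check both" idea from the last two messages can be sketched as follows. This is only an illustration of the control flow: `new_bit_set` and `legacy_check` are hypothetical callables standing in for the scratch-register test and the older detection method, neither of which is spelled out in this log.

```python
from typing import Callable


def devinit_already_ran(new_bit_set: Callable[[], bool],
                        legacy_check: Callable[[], bool]) -> bool:
    """Return True if either detection method reports that devinit already ran.

    Both arguments are hypothetical stand-ins, not real nouveau functions.
    """
    # Trust the newer indicator whenever it is set.
    if new_bit_set():
        return True
    # Some vbioses apparently never flip that bit, so fall back to the old method.
    return legacy_check()
```

The legacy check is only consulted when the new bit is clear, so boards whose vbios does set the bit would behave exactly as they do with the new method alone.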
00:20skeggsb: gnurou: perhaps you can check with the graphics firmware team and/or rm teams to see if there's any additional "workarounds" that rm performs to detect this aside from the scratch reg?
00:20gnurou: skeggsb: I did when this issue first occurred... nothing special came out of it
00:21imirkin_: ~handwave~ this vbios does not exist...
00:21gnurou: let me ping again
00:21skeggsb: gnurou: ah
00:22imirkin_: gnurou: are you going to take care of the nouveau firmware dir thing too?
00:23gnurou: imirkin_: sure, if skeggsb gives me a thumbs up to revert to the old fw paths
00:23skeggsb:wonders how to do a thumbs-up emoticon
00:24imirkin_: ok. in general linus frowns upon regressions :)
00:24whompy: imirkin_: NVA5: http://ix.io/1BxY
00:24whompy: Ah poo. That's not the droid you're looking for.
00:24imirkin_: whompy: actually it is ;)
00:24imirkin_: whompy: i've unplugged my GT215
00:24gnurou: imirkin_: I know, and I am especially in a bad spot for introducing one :)
00:25whompy: If you want it in parsed form.
00:25gnurou:will take skeggsb's wonder as a 'yes'
00:25skeggsb: gnurou: that's more my fault really
00:25imirkin_: gnurou: ideally i'd have complained louder and sooner
00:25gnurou: nah, I should have insisted, breaking userspace really is a no-no
00:26gnurou: ok, so expect plenty of patches from me today - v3 of secboot refactoring on the way
00:26imirkin_: skeggsb: as a total aside, any clues as to what's wrong with those GTX 660's? seems like it's only those, not all GK106's...
00:26skeggsb: nope, none. i have one, and mine works fine...
00:28skeggsb: it's generally my "go to" board too, so probably the one i test the most aside from my own laptop
00:29imirkin_: so not *all* GTX 660's... of course, why would it be that easy
00:29skeggsb: it never is :)
00:29imirkin_: and ultimately things tend to work best on hw you have, since that's the one you test with
00:29imirkin_: not a lot you can do about it
00:31Lekensteyn: ok, removed vblank_mode=0 from .drirc, that fixed the "flip queue failed" problem with modesetting
00:32Lekensteyn: after that, I am now using intel and nouveau ddx (from git) and mesa 13.0.0, unfortunately the wakeup issue still occurs
00:33whompy: imirkin_: Wrong channel, but RV770 too: http://ix.io/1Byb
00:56Lekensteyn: oh, the wakeup happens because glXGetFBConfigs is somehow being called by Qt, which then invokes drmGetDevice. That in turn starts reading the PCI config space for (sub)vendor/product ID and rev
00:57Lekensteyn: coincidentally, someone has posted another patch to address it: https://lkml.org/lkml/2016/11/1/249 (afaik the idea is not new)
01:31imirkin: Lekensteyn: i wonder if it's a result of xexaxo1's recent removal of udev
01:31imirkin: oh. and that patch you linked is also from xexaxo1.
01:32imirkin: xexaxo1: does anything care about the revision? ever? in the history of revisions?
01:32imirkin: xexaxo1: i suspect we can just use 0 as far as loader is concerned
04:57gnurou: imirkin: skeggsb: sent patch to fall back to "nouveau/nvxx_fucxxxx" when looking up GR firmware
05:00imirkin: cool. seems reasonable.
05:02imirkin: although ... mild preference for just passing it in as an extra arg - seems easy enough, instead of the strcmp thing
05:02imirkin: it's just ... a little jarring
05:03gnurou: imirkin: the extra arg would require updating all the callers (in gf100.c, gm200.c and gk20a.c) and would make it look like it is a feature - current way keeps it more localized IMHO
05:03imirkin: ok. i'll reply on-list, others can comment as seen fit
05:34s0be: imirkin, saw you inquiring about gX10X cards earlier, is gm107 in a optimus setup of any interest to you for anything?
05:34s0be: I have 5 days til I start my new job and have to return to the 'real world' if there's any dumps/traces I can do to help
05:34s0be: (I got a job at google... becoming part of the corporate machine)
05:34imirkin: s0be: not sure which context it was in, but i semi-recently pushed GM10x (and GM20x) support to xf86-video-nouveau, if you want to play with it
05:35s0be: just git X drivers will test it?
05:35s0be: do I need to DRI_PRIME shit?
05:35imirkin: oh, you're in an offloaded setup - probably doesn't matter then
05:36s0be: yeah, I got this laptop because I can compile android with outdir in a ram disk
05:36s0be: <3 64gb ram, fyi
05:39s0be: intel gpu is sufficient for all my use cases, I just happen to have a nvidia offload gpu, so am happy to provide testing/dumps/whatever if it helps, but I don't need it to do anything for my standard use cases
05:40imirkin: s0be: ok. i actually also have a GM107 plugged in right now too...
05:41s0be: then I'm useless. Gotcha.
05:42imirkin: enjoy the goog. great place to work :)
05:43s0be: I've heard rumors of that. I was on the Cyanogen Systems team (which was pretty much my dream job) so they have a pretty high bar to surpass.
05:43imirkin: [or at least it was a decade ago]
06:06s0be: imirkin, also fyi, bluetooth bugs are at least as bad if not worse than vendor gcc bugs when it comes to figuring out what the fuck is wrong with your program.
06:07imirkin: solution: don't use bluetooth =]
06:10s0be: I turned down a job before accepting the google position because they said they were having bluetooth issues they wanted to resolve
06:11s0be: ANY protocol that trusts the remote device to tell you the truth about what it works with with no method to validate it's working is flawed
06:13s0be: HEY, I can playback 44.1khz pcm audio, I'll just play it back like it's 48 khz samples and pitch shift with huge (10% of the playback time) dead times.
10:08xexaxo1: imirkin: yes, already noticed that plus sent a patch for the kernel + libdrm.
10:08xexaxo1: no idea why the AMD folks needed the revision, but they insisted on it.
10:09xexaxo1: afaict no open-source userspace ever used the revision
10:32karolherbst: xexaxo1: I think amd indeed uses it, doesn't sound like something new to me
10:38Lekensteyn: I've applied xexaxo1's libdrm patch, starting glxinfo still wakes up the device. I guess I have to restart Xorg too?
10:39Lekensteyn: btw, do you know any tricks with the perf (or other) tool(s) that allow you to see what causes a runtime wakeup?
10:42karolherbst: Lekensteyn: reading the config space does
10:42karolherbst: there is a config file for the pci stuff
10:42karolherbst: but no, there is no tool
10:43Lekensteyn: oh, I see, something in glxinfo still scans the config space of all PCI devices for some reason (even if the device identification now goes through /device, etc.)
10:44karolherbst: I have something for ya
10:45karolherbst: Lekensteyn: https://github.com/karolherbst/linux/commit/cb918e4c926990dfcfce92e1ecd905e0896de605
10:45karolherbst: and then libdrm just need to use this one
10:45karolherbst: instead of config
10:45karolherbst: but the work is a bit messy and a lot has to be rewritten :/
10:45karolherbst: started it, but I was like 15% done
10:47Lekensteyn: karolherbst: yes, that is what xexaxo1 patches are doing :) http://www.spinics.net/lists/dri-devel/msg122315.html http://www.spinics.net/lists/dri-devel/msg122319.html
10:52Lekensteyn: oh, using kprobes one could try to follow the runtime resume paths. I used this before for tracking down runtime ref leaks, but I guess it can also be used to find the trigger
10:52Lekensteyn: echo >/sys/kernel/debug/tracing/kprobe_events 'p __pm_runtime_resume dev=%di rpmflags=%si' && echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable && echo 1 > /sys/kernel/debug/tracing/options/stacktrace && cat /sys/kernel/debug/tracing/trace
10:53Lekensteyn: was hoping there was a nicer interface (bonus for including offending userspace process and stack) though
10:57Eliasvan: Hey all, is it normal that executing "echo 0f > /sys/kernel/debug/dri/1/pstate" on a GT 620M (optimus laptop) yields "bash: echo: write error: Function not implemented"?
10:57Eliasvan: I'm running the latest nightly Fedora live build right now
10:59Eliasvan: I verified that DRI_PRIME=1 runs slower than with Intel integrated, so at least it's using the NVIDIA card
11:17xexaxo1: Lekensteyn: karolherbst is spot on there, and I'd completely forgotten about libpciaccess.
11:24xexaxo1: Lekensteyn: glxinfo uses the GLX_MESA_query_renderer extension which on intel hardware uses libdrm/libpciaccess
11:24Lekensteyn: xexaxo1: check your mail, I replied to your patch :)
11:24xexaxo1: ^^ since it's not too obvious
11:31vitis: Eliasvan, GT620m is fermi family and fermi clocking is still work in progress
11:33Eliasvan: vitis: Oh OK, thanks for the reply; Is there any experimental code I can apply to test this nevertheless?
11:35vitis: unfortunately i'm not a developer, i just had the same question as you since i have a card from the same family. I guess you could enable fermi pstates if you build from source
11:36Eliasvan: well, I can check the pstate table, but I can't write to it, so does that mean that fermi pstates are enabled?
11:38vitis: Well i'm guessing that since you can't write pstate it's disabled, since automatic pstates aren't done yet afaik. But i might be wrong, as i've said i'm not a developer, i just read quite a lot about nouveau
11:40Lekensteyn: xexaxo1: I checked radeontool, spice (vd_agent), these do not seem to use the "revision" field from libpciaccess
11:40Lekensteyn: xexaxo1: within libdrm, only amdgpu copies pci_rev into a private field (pci_rev_id)
11:49xexaxo1: Lekensteyn: the xserver/ddx might be a bit annoying .... hmm we might want to move to a separate thread for libpciaccess
11:50xexaxo1:had the epiphany after hitting "send"
11:51karolherbst: xexaxo1: I have some WIP for libpciaccess
11:57xexaxo1: karolherbst: nice so it was you who noticed this ages ago ?
11:57xexaxo1: I was wondering the other day.... someone mentioned this, he even had some patches but who was it ...
11:59xexaxo1: from the following ddx only via seems to use the revision (chipRev) from xserver - amdgpu, ati, freedreno, intel, nouveau, opentegra and via
12:00xexaxo1: not sure how much one should be concerned about those ;-)
12:01xexaxo1: might be worth sending it over (once you have it working) to xorg-devel since it might take a long time to get it reviewed/merged/released
12:07Lekensteyn: xexaxo1: in xorg-server, SPARC could break by zeroing the revision field in libpciaccess
12:12Lekensteyn: nvm, I'm stupid
12:35karolherbst: xexaxo1: yes
13:28jneto: what nouveau does when accel is disabled?
13:28jneto: just "paints" the screen?
13:31ajax: just sets modes.
13:31ajax: xserver has a software renderer
15:28Eliasvan: imirkin: I've done some additional OpenGL timer query tests on other NVIDIA hardware: GT620M and GTX960M, both have an accuracy of higher than 99.99% compared with the CPU timer
15:28Eliasvan: imirkin: so, it seems very good
15:29imirkin_: Eliasvan: very surprising :) but thanks for checking
15:29Eliasvan: no problem
15:29Eliasvan: imirkin: it is in line with the results I get with the prop driver on Windows
15:30Eliasvan: also another interesting fact: you're aware of the Intel Skylake Mesa driver being in bad shape with an accuracy of only 96%, well, on Windows it turns out it is better, but not by much: 99.6%
15:32imirkin_: Eliasvan: well, on linux i think they're aware of it being a fail
15:32Eliasvan: yeah, but older Intel HW is doing fine on Mesa
15:33Eliasvan: also I saw that the Windows Intel driver exposes a 64bit counter, while on Mesa only 36bit
15:34Eliasvan: (not that that is a problem in practice though)
16:53Eliasvan: Hi all, I've tried to change powerstates on my GTX960M to 07, 0a or 0f, but the command locks up when I do so (dmesg first says something about the pll, then after 20s the kernel detects the workqueue doesn't respond), is this a known issue?
16:55imirkin_: which gpu is a GTX 960M?
16:55imirkin_: (lspci -nn -d 10de:)
17:09manio: was anyone here playing with wayland+nouveau?
17:10manio: is it possible to use multi gfx nvidia at the same time (using nouveau) ?
17:19Eliasvan: imirkin_: 01:00.0 3D controller : NVIDIA Corporation GM107M [GeForce GTX 960M] [10de:139b] (rev a2)
17:19imirkin_: Eliasvan: and which kernel?
17:20imirkin_: Eliasvan: oh, the gpu also needs to be powered on when you do this
17:20imirkin_: Eliasvan: so make sure you're running glxgears or something dumb when you're playing with pstate
17:20Eliasvan: fedora nightly build from 1-nov-2016, I have to check
17:20imirkin_: Eliasvan: not sure what's in there, but if you want memory reclocking, you need to grab drm-next
17:21Eliasvan: oh, thanks, good to know
17:22Eliasvan: I'll retest in about a half hour or so
17:22Eliasvan: it doesn't have to be memory reclocking, core reclocking is good already
17:54Eliasvan: imirkin_: kernel: 4.8.4-301.fc25.x86_64
18:10karolherbst: Lekensteyn: that method doesn't seem to work with a 361 driver
18:12karolherbst: but it should
18:13karolherbst: compiled it wrong
18:28karolherbst: mupuf: is there something special I have to do to fake the vbios on the gm107?
18:30karolherbst: ohh right
18:30karolherbst: 619f04 isn't set
18:33xexaxo1: karolherbst: I'm a genius, forgot to CC you on the libpciaccess thread https://lists.freedesktop.org/archives/xorg-devel/2016-November/051749.html
18:34xexaxo1: and https://lists.freedesktop.org/archives/xorg-devel/2016-November/051750.html
18:36xexaxo1: that is if you've got time/interest in beating libpciaccess into shape.
18:36xexaxo1: let me know otherwise
18:39karolherbst: xexaxo1: uhh, seems like I never committed my changes, most likely because it was ugly and annoying and everything
18:40karolherbst: xexaxo1: but basically libpciaccess has to read out the revision from the sysfs file instead of config
18:41karolherbst: src/linux_sysfs.c line 181 or so
18:41karolherbst: there it is written now from config
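A minimal sketch of the approach described above: read each ID from its own sysfs attribute file instead of parsing the `config` blob, since (per the discussion earlier in this log) reading config space is what wakes the device. This is Python for illustration only, not libpciaccess code. It assumes a kernel that exposes a `revision` attribute next to `vendor` and `device`, as in the patches linked above; the function name and the returned dict layout are made up here.

```python
from pathlib import Path

# Attribute file names as they appear under /sys/bus/pci/devices/<address>/
ID_ATTRIBUTES = ("vendor", "device", "subsystem_vendor",
                 "subsystem_device", "revision")


def read_pci_ids(device_dir):
    """Read PCI IDs from per-attribute sysfs files, never touching `config`."""
    ids = {}
    for name in ID_ATTRIBUTES:
        try:
            # Each file holds a single hex value such as "0x10de\n".
            ids[name] = int((Path(device_dir) / name).read_text().strip(), 16)
        except FileNotFoundError:
            # e.g. a kernel without the `revision` attribute
            ids[name] = None
    return ids
```

On a real system the argument would be something like `/sys/bus/pci/devices/0000:01:00.0`. A caller that gets `None` for the revision could fall back to 0, which matches the suggestion earlier in the log that nothing in the loader cares about it.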
18:41xexaxo1: karolherbst: dealing with sysfs is always ugly, that's why I always warn people ;-)
18:41karolherbst: I meant libpciaccess
18:41karolherbst: it's not the best written piece of code actually
18:42karolherbst: the linux_sysfs file is kind of a mess
18:43karolherbst: populate_entries is ... yeah well
18:44xexaxo1: the same - good thing Ian is not around ;-)
18:45xexaxo1: iirc Ian/idr is the original author of libpciaccess. no idea how much of it he did though.
18:49ajax: it should be remembered that libpciaccess was born because a) xfree86's pci abstraction was godawful and b) xorg refused to link against libpci because gpl and portability
18:49ajax: so it's really only as good as xorg needed it to be
18:49karolherbst: I figured
18:50Eliasvan: imirkin_: hi, I'm back, I've tested 07 while glxgears running, now it works, thanks!
18:50Eliasvan: now I'll test 0a...
18:51Eliasvan: cool, that works too! now going for 0f (core 270-1202 MHz memory 5010 MHz)...
18:52Eliasvan: 0f works!
18:53karolherbst: Eliasvan: what gpu?
18:53Eliasvan: wow, glxgears runs at 11267.852FPS !
18:53Eliasvan: thanks a lot!
18:53karolherbst: ahh gm107
18:54Eliasvan: 01:00.0 3D controller : NVIDIA Corporation GM107M [GeForce GTX 960M] [10de:139b] (rev a2)
18:54karolherbst: yeah, gm107 ;)
18:54karolherbst: the m is pretty much unimportant
18:54Eliasvan: great, going from 5500FPS to 11300!
18:54karolherbst: you will have a lower clock though I guess
18:54karolherbst: than the 1202MHz
18:54karolherbst: check the last line of pstate
18:55Eliasvan: [root@localhost-live liveuser]# cat /sys/kernel/debug/dri/1/pstate
18:55Eliasvan: 07: core 405 MHz memory 810 MHz
18:55Eliasvan: 0a: core 270-1202 MHz memory 1600 MHz
18:55Eliasvan: 0f: core 270-1202 MHz memory 5010 MHz AC DC *
18:55Eliasvan: DC: core 0 MHz memory 0 MHz
18:55karolherbst: your gpu is off now
18:55Eliasvan: oops, yes, wait...
18:55Eliasvan: last line is now: DC: core 1189 MHz memory 810 MHz
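For anyone scripting these checks, the pstate listing pasted above could be parsed roughly like this. The sketch assumes one entry per line in the form shown in this log, with the trailing `AC DC *` markers belonging to the `0f` line and the final `AC:`/`DC:` line reporting the current clocks. The regex and the dict layout are illustrative, not an official format description.

```python
import re

# Matches e.g. "0f: core 270-1202 MHz memory 5010 MHz AC DC *"
# and the status line "DC: core 1189 MHz memory 810 MHz".
PSTATE_LINE = re.compile(
    r"^(?P<id>[0-9a-f]{2}|AC|DC):\s+core\s+(?P<core>\d+(?:-\d+)?)\s+MHz"
    r"\s+memory\s+(?P<mem>\d+)\s+MHz(?P<flags>.*)$"
)


def parse_pstate(text):
    """Parse pstate text into a list of dicts, one per recognised line."""
    entries = []
    for line in text.splitlines():
        m = PSTATE_LINE.match(line.strip())
        if m is None:
            continue  # skip anything that does not look like a pstate line
        low, _, high = m.group("core").partition("-")
        entries.append({
            "id": m.group("id"),
            # A single clock is reported as a (value, value) range.
            "core_mhz": (int(low), int(high or low)),
            "memory_mhz": int(m.group("mem")),
            "flags": m.group("flags").split(),
            # The upper-case AC/DC line reports the current clocks.
            "is_current": m.group("id") in ("AC", "DC"),
        })
    return entries
```

A current-clock entry showing `core 0 MHz memory 0 MHz` would then be an easy way to spot that the GPU is powered down.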
18:56karolherbst: no idea with what clocks it will resume
18:56karolherbst: current nouveau mesa?
18:56karolherbst: or drm-next?
18:56Eliasvan: nightly fedora build 1 november
18:56karolherbst: I see
18:56Eliasvan: kernel 4.8.4
18:56karolherbst: no clue what is in there
18:56karolherbst: you have no memory reclocking
18:56karolherbst: which means, -> useless
18:57Eliasvan: that's likely, indeed
18:57karolherbst: you may get more perf in glxgears, but in games it won't matter
18:57karolherbst: 20% top for most things
18:57karolherbst: need some newer stuff for that to work
18:57Eliasvan: for mem reclocking I need drm-next
18:58Eliasvan: cool! I've never heard my fans as loud as now! thanks!!
18:58Eliasvan: wait, I'll spawn another glxgears on the integrated one :D
19:00karolherbst: ... :p
19:00karolherbst: the main reason glxgears is fast now, is that the pcie bus has a higher clock
19:00Eliasvan: hmm, it seems now integrated one runs at 3500FPS instead of 7500FPS... Might be the temperature (or the max power consumption limit reached)
19:01karolherbst: more likely the intel gpu has a lot of stuff to do
19:01Eliasvan: oh, I probably should try something more interesting, like e.g. shadertoy
19:01karolherbst: it also has to display the other thing
19:01karolherbst: one thing
19:01karolherbst: with 4.8.4 chances are high that the gpu will crash
19:01karolherbst: there are _a_lot_ of fixes in drm-next, which will land with 4.10
19:02Eliasvan: well, I'll try and let you know
19:03Eliasvan: interesting, so when I run glxgears on integrated without other glxgears on discrete, I'm getting ~7000FPS,
19:04Eliasvan: and when I run glxgears on discrete without other glxgears on integrated, I'm getting ~11000FPS
19:04karolherbst: you have to keep in mind that with prime offloading, you copy from the nvidia gpu to the intel one
19:04karolherbst: over the pcie bus
19:05Eliasvan: and when I run glxgears on integrated with other glxgears on discrete, I'm getting only ~3500FPS on integrated and still ~11000FPS on discrete
19:05Eliasvan: oh, right, true
19:07karolherbst: you should run 1. pixmark_piano and 2. furmark
19:08karolherbst: both from gputest
19:08karolherbst: I am sure the former one will crash the gpu
19:09karolherbst: mupuf: guess what, more fields: https://gist.github.com/karolherbst/c224713cab4c79777905f8d956d2f765
19:09Eliasvan_: just tried shadertoy, and I get about the same perf as on integrated (which is already better than the original pstate on discrete, but not satisfactory)
19:10karolherbst: you need drm-next for that
19:10karolherbst: so that the memory will be clocked to 5GHz
19:10Eliasvan_: karolherbst: I'll try 1. pixmark_piano and 2. furmark
19:10Eliasvan_: (but not yet with drm-next)
19:10karolherbst: _never_ run pixmark piano on the intel one, by the way
19:10karolherbst: if you do, you'll see why
19:11Eliasvan_: oh, will it overheat?
19:11Eliasvan_: or empty the bios?
19:12karolherbst: it isn't dangerous
19:12karolherbst: but no fun either
19:12Eliasvan_: oh, ok
19:12karolherbst: like imagine your desktop running at 4 fps
19:13Eliasvan_: ah, I see, low fps hurts your eyes :P
19:14karolherbst: not just that
19:14ss501: HOLA A TODOS Y TODAS [hello everyone]
19:15ss501: QUE PASO [what's up]
19:15karolherbst: english pls
19:20Eliasvan_: karolherbst: I see what you mean, but I don't have a problem with desktop 6fps (with 4fps I would however) :P
19:21Eliasvan_: Even in wayland the mouse lags
19:21karolherbst: it shouldn't though
19:22Eliasvan_: guess what: 1. doesn't crash on discrete!
19:22Eliasvan_: (runs only at 4fps though)
19:23Eliasvan_: but now at least I can use my mouse (because of the graphics offloading)
19:24karolherbst: it should run much faster than 4 fps on nouveau
19:25karolherbst: depends on how you start it
19:25Eliasvan_: no, I tried the windowed one
19:25karolherbst: it should run faster
19:25karolherbst: maybe not
19:25karolherbst: well it should
19:26karolherbst: Eliasvan_: did you try on 0f?
19:26Eliasvan_: yes, it's on 0f
19:26karolherbst: how fast is it on 07?
19:26Eliasvan_: Let's see...
19:27karolherbst: that is kind of slow
19:27Eliasvan_: 0a yields 4fps
19:28Eliasvan_: 0f no difference
19:28karolherbst: I get 8 fps on 07
19:28karolherbst: 17 on 0g
19:29RSpliet: 0f? :-P
19:29Eliasvan_: mmh, I'm testing on wayland, could that be the reason?
19:31Eliasvan_: I'll try in X
19:31karolherbst: Eliasvan_: my gpu is just slightly faster than yours in practise
19:31karolherbst: it's a 770m
19:31RSpliet: (karolherbst: or are we talking in septadecimals? :-P)
19:32karolherbst: RSpliet: wouldn't matter I guess
19:32Eliasvan_: in fullscreen pixmark renders incorrectly
19:33karolherbst: how so?
19:33Eliasvan_: no change on X
19:33karolherbst: maybe our maxwell support is indeed that bad...
19:34karolherbst: but well
19:34karolherbst: 4 fps vs 17 fps
19:34karolherbst: that would be intense
19:34karolherbst: I guess something else is just simply wrong
19:36Eliasvan_: I'll try to get a screenshot of that
19:37Eliasvan_: RSpliet: 770 would not be valid in septadecimals, right? '7' would not exist
19:37karolherbst: Eliasvan_: base 17
19:38Eliasvan_: oh, OK
19:41Eliasvan_: karolherbst: instead of a screenshot, I can describe the problem exactly (otherwise I would need to post a video): the screen is split in 2 triangles separated by a diagonal going from the top left corner to the bottom right corner, and the bottom left triangle lags a few frames behind the top right triangle!
19:41karolherbst: that is most likely the fault of prime offloading
19:41Eliasvan_: the scene rendered on the triangles is actually correct
19:48Eliasvan_: for furmark windowed: integrated=19fps, discrete(0f)=9fps
19:50karolherbst: yeah, you need higher memory clocks for that
19:50Eliasvan_: with https://nouveau.pmoreau.org/ I should be able to test that, right?
19:50Eliasvan_: (drm-next included)
19:51karolherbst: or wait
19:51karolherbst: let me check
19:52karolherbst: nope, doesn't seem this way
19:52Eliasvan_: well, at least now I have found a way to make my laptop generate more heat, could be useful for this winter
19:53karolherbst: ohh wait
19:53karolherbst: no, it should be in the image
19:53Eliasvan_: karolherbst: "Nouveau module: branch master"
19:53karolherbst: that's the log
19:53karolherbst: looks fine
20:43karolherbst: mupuf: the heck :O it works! the max_batt entry works indeed on mine
20:43karolherbst: nvidia clocks to that entry, when I am on battery :D
20:43karolherbst: big surprise, I know
21:04karolherbst: table completely REed!
21:04karolherbst: (more or less)
21:05karolherbst: now the important question is, what does req(Slowdown)Power mean? https://gist.github.com/karolherbst/43880879d8b02bb4330923778f19f11f
21:06imirkin_: karolherbst: the power usage when slowdown is requested?
21:06imirkin_: karolherbst: iirc there's a GPIO for that
21:06karolherbst: why is it lower than reqPower?
21:06imirkin_: so that you use less power when you're trying to reduce temp?
21:07karolherbst: well okay, example: nvidia limits the max clock, because I run on batteries to 0xa pstate and 405MHz core clock
21:08karolherbst: reqPower is 34.06W and reqSlowdownPower is 23.35W
21:08karolherbst: so, what should happen, if my card consumes like 14W
21:08imirkin_: situation: you're consuming 34W
21:08imirkin_: and overheating
21:08karolherbst: I don't overheat
21:08imirkin_: you get a GPIO signal requesting slowdown
21:09imirkin_: then you should reduce to 23W
21:09karolherbst: I see what you mean
21:09karolherbst: interesting idea
21:10imirkin_: anyways, that's my theory
21:10imirkin_: i could be misremembering the gpio thing
21:10karolherbst: would make sense, cause nvidia actually caps at 80W or something for me
21:10karolherbst: and the highest vpstate has a reqPower of 75W
21:10karolherbst: thing is
21:11imirkin_: 42 = SW Performance Level Slowdown. When asserted, the SW will lower its performance level to the lowest state.
21:11karolherbst: sw overheating is defined somewhere else already
21:11imirkin_: 43 = HW Slowdown Enable. On assertion HW will slowdown clocks (NVCLK, HOTCLK) using either _EXT_POWER, _EXT_ALERT or _EXT_OVERT settings (depends on GPIO configured: 12, 9 & 8 respectively). Than SW will take over, limit GPU p-state to battery level and disable slowdown. On deassertion SW will reenable slowdown and remove p-state limit. System will continue running full clocks.
21:11imirkin_: etc. there's a bunch of these.
21:11karolherbst: yeah, I know
21:11karolherbst: I know
21:12karolherbst: there is some "slowdown" thing and whenever the power goes above it, the slowdown is triggered, but this won't explain what reqPower is for
21:13imirkin_: 111 = HW Only Slowdown Enable. On assertion HW will slowdown clocks (NVCLK, HOTCLK) using _EXT_POWER settings (use only with GPIO12). No software action will be taken. On deassertion HW will release clock slowdown.
21:13karolherbst: those fields are useless anyway
21:13imirkin_: anyways, dunno
21:13karolherbst: cause it is only set for like 2% of all vbios
21:13karolherbst: so I don't care much about it
21:18karolherbst: mhh there are also two unknown fields in the header
21:18karolherbst: and actually a third one I have no clue what it does
21:25karolherbst: ohh, my tesla has indeed a mini DP port :O that will be fun
21:29karolherbst: imirkin_: what is the highest pixel clock of tesla? 300MHz?
21:30imirkin_: for dual-link dvi
21:30karolherbst: that means 330MHz over DP as well?
21:30imirkin_: and probably higher than that for VGA, dunno what the hw limit is
21:30imirkin_: nope, it didn't support DP 1.2
21:31imirkin_: probably 4 lanes at 270MHz? not sure. [and not sure how exactly that bandwidth translates into modeline support]
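As a rough worked example of how that link bandwidth translates into a pixel clock ceiling: DisplayPort 1.1 uses 8b/10b coding, so each lane delivers one 8-bit data symbol per symbol clock (270 MHz at the 2.7 Gbit/s rate, 162 MHz at the 1.62 Gbit/s rate). The sketch below only computes the raw payload limit. It ignores protocol overhead and says nothing about what the Tesla display hardware itself can do, so treat the result as an upper bound under those assumptions.

```python
def max_pixel_clock_mhz(lanes, symbol_clock_mhz, bits_per_pixel):
    """Upper bound on pixel clock (MHz) for an 8b/10b coded DisplayPort link."""
    # After 8b/10b decoding each lane carries 8 payload bits per symbol clock.
    payload_mbit_per_s = lanes * symbol_clock_mhz * 8
    return payload_mbit_per_s / bits_per_pixel


# 4 lanes at 270 MHz, 24 bits per pixel: 8640 Mbit/s of payload
print(max_pixel_clock_mhz(4, 270, 24))  # 360.0
# 4 lanes at 162 MHz, 24 bits per pixel: 5184 Mbit/s of payload
print(max_pixel_clock_mhz(4, 162, 24))  # 216.0
```

So with four lanes at 270 MHz and 24 bpp the link itself tops out around 360 MHz of pixel clock. Whether a given mode actually works still depends on the source and on link training succeeding at that rate.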
21:31karolherbst: was thinking about plugging in my 4k display into my tesla gpu....
21:31karolherbst: I have a mini DVI and a mini DP port
21:32karolherbst: but only a miniDVI to VGA adapter, so I would have to use miniDP to HDMI
21:32imirkin_: miniDP -> HDMI will only get you 165MHz of pixel clock
21:32imirkin_: (unless you have an active adapter)
21:32karolherbst: on the tesla?, k
21:32karolherbst: it is active
21:32imirkin_: in general
21:32karolherbst: I get 540MHz with my hsw out of it
21:33imirkin_: you'll be able to get dual-link dvi's worth out of it, i suspect
21:33karolherbst: so 330mhz
21:33imirkin_: i dunno for sure, sorry
21:33karolherbst: I could try it out
21:33imirkin_: i forget how the bandwidth stuff works
21:34karolherbst: I should have a DP 1.2 port though
21:34karolherbst: mhh, or maybe not?
21:34imirkin_: definitely not
21:35karolherbst: ohh right
21:35karolherbst: the spec is newer than the machine
21:35karolherbst: seems like 1.0 and 1.1 support 4K@30Hz
21:35imirkin_: in theory, yea
21:36karolherbst: I like how that sounds
21:47karolherbst: imirkin_: k, so it doesn't seem to really work at all
21:48karolherbst: and with that I mean the display
21:58karolherbst: imirkin_: he
21:58karolherbst: 3840x2160_24.01 23.99*
21:58karolherbst: but the display stays black
21:58imirkin_: ok. not sure what you want me to do about it :p
21:58karolherbst: it is a bug either way :p
21:59karolherbst: 266MHz is the required clock
21:59karolherbst: should work, right?
21:59imirkin_: i don't remember :)
22:00karolherbst: maybe I should try a 2560x resolution first
22:02karolherbst: "nouveau 0000:02:00.0: disp: outp 00:0006:0242: link training failed"
22:11airlied: imirkin_: so is the multithreaded kde apps not working all the same issue as other locking?
22:14hakzsam_: I think fixing the "we submit too fast" issue should also help for such applications
22:14hakzsam_: but multithreading is definitely broken anyways :)
22:16karolherbst: imirkin_: the vbios says 300MHz max freq for Analog and stuff I don't understand for the DP and the two TMDS
22:17imirkin_: airlied: afaik yes
22:18imirkin_: hakzsam_: pretty sure "submit too fast" isn't a frequent issue
22:20hakzsam_: depends on the interpretation of "frequent"
22:20hakzsam_: TK2, F1 and I don't remember the other apps, but I can reproduce it
22:20imirkin_: yeah, with some apps it happens consistently
22:20imirkin_: but those apps are rare, and almost definitely not regular desktop apps
22:21hakzsam_: does qtwebkit or something like that hit the issue?
22:21hakzsam_: or I mis-remember
22:22karolherbst: well, webkit does hit issues
22:22imirkin_: qtwebkit hits the multithreaded issue
22:22karolherbst: I think we should concentrate on fixing it and pause other stuff until we figure something out
22:22karolherbst: it is pretty much a critical issue
22:23imirkin_: go for it.
22:23karolherbst: I meant it more like: "we all should fix it"
22:23imirkin_: (a) i don't use kde or qt
22:23imirkin_: (b) this requires a long concentrated period of time to fix (probably 2-3 days)
22:23hakzsam_: I'm busy with gm107 :)
22:24imirkin_: (c) it requires the motivation to spend that period of time on this soul-sucking issue
22:24karolherbst: sure, but this is maybe one of the most critical issues so far and it will only get worse over time
22:24imirkin_: i don't hack on nouveau coz people use it. i hack on it coz it's fun.
22:25karolherbst: I know
22:25hakzsam_: and (d) you have to be very careful and don't destroy perf with existing apps (eg. mutexes)
22:25karolherbst: it is the same for me. I am just saying that this issue is _really_ critical and we should deal with it more professionally
22:25imirkin_: mutexes will definitely be part of the solution
22:25hakzsam_: for sure
22:25imirkin_: but you def don't want to do lock(mutex); sleep(); :)
22:25karolherbst: *yield ?
22:26karolherbst: does it even exist in C?
22:26karolherbst: well in the pthreads thing
22:26karolherbst: in the kernel, right
22:26karolherbst: and userspace?
22:26mwk: it's a libc function
22:26mwk: from posix
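The yield mwk refers to is presumably POSIX `sched_yield()`, declared in `<sched.h>`. The point about not doing `lock(mutex); sleep();` can be sketched as below: the waiter takes the mutex only long enough to read the shared flag, then yields with the mutex released. This is a minimal sketch under stated assumptions, not nouveau or mesa code; `state_lock`, `ready`, `producer`, `wait_until_ready` and `run_demo` are hypothetical names made up for illustration.

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical shared state, not taken from nouveau or mesa. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static bool ready = false;

/* Sets the flag under the lock and exits. */
static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&state_lock);
    ready = true;
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

/* Polls the flag, holding the mutex only for the read itself.
 * The yield happens with the mutex released, so the producer is
 * never blocked by a sleeping waiter. Returns the number of polls. */
static int wait_until_ready(void)
{
    int polls = 0;
    for (;;) {
        pthread_mutex_lock(&state_lock);
        bool done = ready;
        pthread_mutex_unlock(&state_lock);
        polls++;
        if (done)
            return polls;
        sched_yield(); /* POSIX: give up the CPU, lock not held */
    }
}

/* Starts the producer and waits for it; returns -1 on thread failure. */
int run_demo(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;
    int polls = wait_until_ready();
    pthread_join(t, NULL);
    return polls;
}
```

Calling `run_demo()` from a `main()` and building with `-pthread` should return a poll count of at least 1. Yielding in a loop still burns CPU, so it is only a step up from sleeping under the lock, not a substitute for a blocking wait.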
22:27imirkin_: there's tons of stuff that's broken on nouveau, the kernel end of it barely gets any support at all as well
22:27skeggsb: hakzsam_: ah, i forgot about the "submit too fast" thing from xdc, i'll try and deal with that this week
22:27hakzsam_: skeggsb: would be awesome :)
22:28karolherbst: well, sure there is tons of stuff broken, but the multithread issue will hit like 100% of nouveau users in some years
22:28karolherbst: maybe just 80%
22:28karolherbst: who knows
22:28karolherbst: well 100% -1 at most anyway ;)
22:28skeggsb: imirkin_: from what you recall from looking re:multithreaded gl, if we a) fixed how the crazy fencing stuff works in mesa, and b) use per-context pushbufs (with a lock around only the submission to a channel), how far does that get us?
22:29imirkin_: skeggsb: 0% of the way
22:29imirkin_: skeggsb: here's the issue... you do something innocuous like nouveau_bo_map(), and all of a sudden you get a kick callback
22:29airlied: yeah removing libdrm_nouveau is probably step one
22:29airlied: so you don't get callbacks from random places
22:30imirkin_: skeggsb: also simultaneously something that better supported userspace fences would be awesome
22:30airlied: imirkin_: though we do that in mesa, but probably at a higher level
22:30imirkin_: skeggsb: i.e. so i could do an ioctl(wait for some fence to signal) instead of while (x != 5);
22:31airlied: you go to map a bo, and have to flush because the currently prepared command stream references it
22:31imirkin_: airlied: right, of course.
22:31skeggsb: yes, well, that's going to come, but it's a rather separate issue from the multithreading stuff
22:31imirkin_: skeggsb: mostly separate
22:31imirkin_: skeggsb: but as part of that whole rewrite, i think ALL bo's should be user-fenced
22:32imirkin_: skeggsb: and i'm concerned that that will have a much higher overhead as a result
22:32imirkin_: skeggsb: without the ioctl
22:33imirkin_: skeggsb: so the lazy part of me has been waiting for that ioctl to get added
22:33imirkin_: skeggsb: and the lazy part of me is pretty big :)
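The difference between spinning on `while (x != 5);` and a blocking wait can be illustrated with a purely userspace analogy. The sketch below uses a pthread condition variable so the waiter sleeps until the sequence number is reached. It is not the ioctl imirkin_ is asking for and does not use any nouveau or libdrm interface; `struct fence`, `fence_init`, `fence_signal`, `fence_wait` and `fence_demo` are hypothetical names, and sequence-number wraparound is ignored.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace fence: a sequence number plus a condvar.
 * Illustration only; not a nouveau or libdrm structure. */
struct fence {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    uint32_t seqno;
};

static void fence_init(struct fence *f)
{
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->cond, NULL);
    f->seqno = 0;
}

/* Publishes a new sequence number and wakes every waiter. */
static void fence_signal(struct fence *f, uint32_t seqno)
{
    pthread_mutex_lock(&f->lock);
    f->seqno = seqno;
    pthread_cond_broadcast(&f->cond);
    pthread_mutex_unlock(&f->lock);
}

/* Blocks until seqno >= target. pthread_cond_wait releases the mutex
 * while sleeping, so no CPU is burned. Wraparound is ignored here. */
static void fence_wait(struct fence *f, uint32_t target)
{
    pthread_mutex_lock(&f->lock);
    while (f->seqno < target)
        pthread_cond_wait(&f->cond, &f->lock);
    pthread_mutex_unlock(&f->lock);
}

static void *signaller(void *arg)
{
    fence_signal(arg, 5);
    return NULL;
}

/* Waits for a second thread to signal value 5; returns the value seen,
 * or 0 if the thread could not be created. */
uint32_t fence_demo(void)
{
    struct fence f;
    pthread_t t;
    fence_init(&f);
    if (pthread_create(&t, NULL, signaller, &f) != 0)
        return 0;
    fence_wait(&f, 5);
    pthread_join(t, NULL);
    uint32_t seen = f.seqno;
    pthread_cond_destroy(&f.cond);
    pthread_mutex_destroy(&f.lock);
    return seen;
}
```

With a GPU the signalling side is the hardware, so a userspace condvar is not enough on its own; the analogy only shows why a sleeping wait is cheaper than polling a value in a tight loop.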
22:37Eliasvan: karolherbst: I'm still trying to test the nouveau-live image of pmoreau, but I'm having trouble using the discrete card (DRI_PRIME=1 doesn't seem to select the discrete card)
22:37karolherbst: Eliasvan: maybe you have to do the dri2 offload stuff first
22:37karolherbst: Eliasvan: xrandr --listproviders
22:38karolherbst: Eliasvan: xrandr --setprovideroffloadsink nouveau Intel
22:38karolherbst: ohh wait
22:38karolherbst: modesetting Intel for you I think
22:38karolherbst: check listproviders
22:38imirkin_: if it's modesetting, dri3 should be on by default
22:38karolherbst: mhh, true
22:38imirkin_: probably just loads intel i assume, which defaults to not enabling dri3
22:38karolherbst: but I guess intel is picked up on intel
22:39Eliasvan: So then I get 2 providers: name:Intel and name:nouveau
22:39karolherbst: then do "xrandr --setprovideroffloadsink nouveau Intel"
22:39karolherbst: then it should work
22:40Eliasvan: Thanks, it works!
22:41karolherbst: Eliasvan: clock to 0xf in your test now and you should see the memory clock being high
22:42karolherbst: the core clock will be a little lower though
22:42Eliasvan: but why did I have to do the setprovideroffloadsink?
22:42karolherbst: because it is dri2
22:42Eliasvan: (why wasn't it done automatically?)
22:42karolherbst: and dri3 is disabled by default
22:42Eliasvan: Ah, OK
22:42karolherbst: mhh, I think some desktops set it automatically? no clue
22:43Eliasvan: OK, so now when I fire up glxgears on discrete with DRI_PRIME=1, I don't see anything
22:43karolherbst: imirkin_, skeggsb: funny thing though, I was running my entire plasma5 session offloaded on my nvidia gpu with nouveau and didn't get any freeze for hours
22:44Eliasvan: (in the glxgears window)
22:44karolherbst: Eliasvan: uhhh, I think you need a compositor with dri2
22:44karolherbst: maybe it would be easier to just enable dri3 and restart X
22:44karolherbst: pmoreau: mind enabling dri3 by default for your live cd?
22:45Eliasvan: karolherbst: OK, I'll do that
23:01Eliasvan: core 1037 memory 5009
23:01Eliasvan: but now it hangs
23:02Eliasvan: and I have yet to test the gputest tests
23:14Eliasvan: karolherbst: even with memory @ 5GHz, pixmark runs at 4fps
23:16Eliasvan: karolherbst: but furmark does better: 20fps instead of the previous 9fps (without memory reclocking, at 0f)
23:16Eliasvan: although integrated runs at 19fps, so not much advantage of discrete card
23:21Eliasvan: karolherbst: and shadertoy (first demo on start page) runs at 50fps (no fullscreen) with discrete at 0f mem 5GHz, and at 60fps with integrated