00:05 vpelletier: checking /proc/interrupts, I see:
00:06 vpelletier: 32: 0 0 0 1 MPIC-U3MSI 8 Edge nvkm
00:06 vpelletier: is nvkm the correct line ? is it normal a single irq was served (by cpu3)
00:37 imirkin: yeah, it's the correct line
00:37 imirkin: however the fact that interrupts aren't coming in answers all the questions
00:38 imirkin: there are supposed to be more interrupts, i'm fairly sure
00:40 vpelletier: there are about 600 total in the spurious interrupts line
00:40 vpelletier: SPU: 107 174 201 211 Spurious interrupts
00:40 imirkin: interesting.
00:40 vpelletier: I did not see any warning in dmesg though... I thought this was logged
00:44 vpelletier: lspci https://pastebin.com/n1qXQAJr
00:44 vpelletier: (-vv)
00:46 vpelletier: $ cat /proc/irq/32/spurious count 1 unhandled 0 last_unhandled 0 ms
00:49 vpelletier: mmh, "unhandled" line is 0 or 1 only, which does not match with the ~600 total SPU
00:49 vpelletier: (0 or 1 across all IRQs)
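The two /proc/interrupts rows quoted above can be compared with a small parser. This is only a sketch: the helper name `parse_interrupt_line` is made up for illustration, and it assumes a 4-CPU machine where every row carries exactly one counter per CPU.

```python
def parse_interrupt_line(line, ncpus):
    """Split one /proc/interrupts row into (label, per-CPU counts, description)."""
    label, rest = line.split(":", 1)
    fields = rest.split()
    counts = [int(x) for x in fields[:ncpus]]   # one counter per CPU
    description = " ".join(fields[ncpus:])      # controller, hwirq, trigger, handler
    return label.strip(), counts, description


# The two rows pasted in the conversation above.
nvkm = parse_interrupt_line("32: 0 0 0 1 MPIC-U3MSI 8 Edge nvkm", ncpus=4)
spu = parse_interrupt_line("SPU: 107 174 201 211 Spurious interrupts", ncpus=4)

for label, counts, desc in (nvkm, spu):
    print(f"{label:>4}: total={sum(counts)} per-cpu={counts} ({desc})")
```

On these two rows the nvkm line totals a single interrupt while the four SPU counters sum to 693, which is the mismatch being discussed.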
00:50 imirkin: try booting with nouveau.config=NvMSI=0
00:57 vpelletier: aha, lots of errors in dmesg now
00:59 vpelletier: https://pastebin.com/ATabUGSt
01:03 vpelletier: and time to work, sorry
02:27 imirkin: vpelletier: alrighty, things are cooking now! ok, *now* try adding ,NvPCIE=0 and/or some of the hacks we tried earlier, also try nouveau.vram_pushbuf=1
02:28 imirkin: someone who understands MSI and PPC will have to look at why MSI is b0rked
02:40 imirkin: benh: is MSI on G5 PPC's reliable? iow if it's not working, is nouveau messing up, or is the hw messing up?
03:09 imirkin: looks like a known issue - https://powerpcliberation.blogspot.com/2015/07/g5-nouveau-2d-acceleration.html
03:11 imirkin: i wonder if we should just blacklist MSI on BE... it's a little harsh, but ... meh
03:14 imirkin: also https://www.mail-archive.com/debian-powerpc@lists.debian.org/msg64938.html
03:17 imirkin: unclear to me if the fault lies in the platform or in the boards apple/nvidia shipped
04:27 benh: imirkin: hrm ... let me see ...
04:27 benh: imirkin: the MSI vector/address isn't meant to be setup by the fcode but by linux...
04:28 benh: imirkin: I can dig out a g5 and test... MSIs are only supported on one of the bridges iirc due to a HW issue (endian) but the kernel should take care of only enabling them where they work...
05:56 endrift: oh boy did I miss more G5 NV40 antics?
05:56 endrift: I can test any changes as desired
08:43 gnurou: karolherbst: is my help still needed? can you explain the problem?
08:44 karolherbst: gnurou: something while starting the GR falcons doesn't work
08:44 karolherbst: we found that one issue with falcon images beyond 4G, but there seems to be another issue
08:44 gnurou: on which cards?
08:44 karolherbst: GTX 980
08:45 karolherbst: mangix is the one with the issue
08:45 gnurou: mmm, wasn't it working at some point?
08:45 karolherbst: yes
08:45 karolherbst: before your rework for 4.10 :p
08:45 karolherbst: ohh wait
08:45 karolherbst: 4.11
08:45 gnurou: have we pinpointed the commit that introduced that regression? (one of mine I assume :P)
08:46 karolherbst: well yes, but you had a fix for that
08:46 karolherbst: no idea if that other issue is caused by that as well
08:46 karolherbst: couldn't analyze it in depth
08:46 karolherbst: gnurou: on my 4.10 branch: https://github.com/karolherbst/nouveau/commit/571701c20e7c14c7b8f887445406b0ef73ccf8d0
08:47 gnurou: if you guys can confirm exactly which commit did that (one commit works ; the next doesn't), I don't mind having a look
08:47 gnurou: sadly I don't have any hardware to test on
08:47 karolherbst: it's this one, but there was already the fix for the issues within acr_r352_generate_flcn_bl_desc
08:47 gnurou: but hopefully I can still understand my own shit :)
08:48 karolherbst: but there seems to be another one
08:48 gnurou: but apparently the fix did not fix this issue, right
08:48 karolherbst: it's just super messy to figure it out
08:48 karolherbst: well
08:48 karolherbst: it did, but the timeout occurs later
08:48 gnurou: and you are super-positive that it was working fine before that commit?
08:48 karolherbst: in the code
08:48 karolherbst: yes
08:48 gnurou: ok, I must have messed something with the falcon initialization sequence then
08:48 gnurou: surprising that it only happens on GTX 980 though
08:49 karolherbst: as far as I can tell, the PMU boots up without issues
08:49 karolherbst: but secboot later fails while starting the gr falcons
08:49 karolherbst: gnurou: well 8GB vram ;)
08:49 gnurou: ah, again :)
08:49 karolherbst: that's what the first fix was about anyhow
08:49 karolherbst: code_dma_base only contained the lower 32bits
08:49 karolherbst: and so on
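A minimal sketch of the kind of truncation described here, where a descriptor field only keeps the lower 32 bits of an address above 4 GiB. The address value and the helper `split_dma_address` are hypothetical; this is not the nouveau code, only an illustration of why the bug would show up on a card with 8GB of VRAM.

```python
def split_dma_address(addr):
    """Split a 64-bit address into (low, high) 32-bit halves."""
    return addr & 0xFFFFFFFF, (addr >> 32) & 0xFFFFFFFF


addr = 0x1_2345_6000        # hypothetical firmware address just above 4 GiB
lo, hi = split_dma_address(addr)

truncated = lo              # what remains if only the lower half is stored
rebuilt = (hi << 32) | lo   # what the hardware needs to see

print(hex(truncated), hex(rebuilt))
```

Below 4 GiB the high half is zero, so dropping it goes unnoticed; once buffers can land above that boundary the truncated value points somewhere else entirely.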
08:50 gnurou: that's interesting, this commit is just supposed to move code around, without changing any behavior
08:50 gnurou: but of course considering the size it is difficult to guarantee that :/
08:52 karolherbst: yeah
08:53 gnurou: looking at the log
08:53 karolherbst: but the only thing I saw that was different was acr_r352_generate_flcn_bl_desc, but you fixed that later
08:53 karolherbst: but I also didn't take a super deep look
08:53 gnurou: and that only delayed the timeout?
08:53 karolherbst: basically yes
08:53 gnurou: as in, it doesn't happen at the same place in the code?
08:54 karolherbst: yes
08:54 gnurou: but still during gr context init
08:54 karolherbst: before the fix: PMU startup failed
08:54 karolherbst: after the fix: GR startup failed
08:54 gnurou: ah, ok
08:54 gnurou: thanks, it's hard to find the information in the IRC log :)
08:58 gnurou: mmm I need to check again, but from the log it may well be that the PMU is not running
08:58 gnurou: ah but this is GTX980... there is no PMU LS firmware
08:58 dboyan: Implemented a simple BFS scheduling policy with the dependency DAG, frame time of pixmark piano seem to drop from 127ms to 121ms on my card.
08:59 gnurou: so PMU runs the HS firmware successfully it seems
08:59 gnurou: at this stage GR should be happy and running, but it is not
08:59 karolherbst: gnurou: yeah, apparently
09:00 gnurou: would you or someone else be able to try patches? I'd like to dump a few registers to see if GR is in the proper state
09:01 mangix: mangix is here
09:02 karolherbst: gnurou: mangix is :p
09:02 gnurou: perfect. I will come later with a few regs to dump if you don't mind
09:03 gnurou: I just need to check my old code branches to know which ones...
09:03 karolherbst: :) awesome, thanks!
09:04 gnurou: probably the HS blob is loading crap into the GR falcons, while still being happy
09:05 gnurou: if that's the case... well, great security model >_<
09:07 karolherbst: yay
09:07 mangix: y'all still need ssh?
09:07 karolherbst: depends, gnurou will tell you what he needs
09:08 gnurou: mangix: I think I will just need you to apply a patch and give me the resulting logs
09:08 gnurou: which tree are you using? in-kernel or out-of-tree?
09:08 karolherbst: gnurou: he is using my out of tree tree
09:08 karolherbst: current master rebased on 4.11
09:09 mangix: 4.10
09:09 mangix: can't get 4.11 to compile
09:09 karolherbst: mangix: mhh, you can only compile it when running a 4.11 kernel
09:09 karolherbst: but I thought you already did that
09:10 mangix: no, 4.10
09:10 mangix: compile fails using master_4.11
09:10 karolherbst: mhh well, you should test at least my master_4.11 branch on 4.11
09:10 karolherbst: mangix: yeah, because you need to compile against 4.11, not 4.10
09:10 mangix: when checking out the offending commit
09:10 mangix: nono
09:10 karolherbst: ohhh, no, you should compile the HEAD against 4.11
09:10 mangix: 4.11 - master_4.11 fails. 4.10 > master_4.10 works
09:11 karolherbst: huh, odd
09:11 karolherbst: it works for me
09:11 mangix: i'm checking out the offending commits in both branches
09:11 karolherbst: no
09:11 mangix: HEAD probably works
09:11 karolherbst: you need to use the HEAD
09:11 karolherbst: and you should test on 4.11
09:11 karolherbst: not that your issue is already fixed and we just missed this
09:12 mangix: on 4.11, i believe i have HEAD
09:12 mangix: or close to it
09:12 mangix: since it compiles
09:12 mangix: it fails differently
09:13 karolherbst: mangix: output then please
09:13 mangix: it doesn't even try to launch the login screen
09:13 karolherbst: ohh
09:13 karolherbst: yeah, no worries
09:13 mangix: it stays on the terminal stuff
09:13 karolherbst: yeah, doesn't matter really
09:13 karolherbst: important are the logs
09:13 karolherbst: not what happens on the screen
09:14 mangix: will attempt to compile HEAD and post logs on 4.11
09:14 karolherbst: good thanks :) and gnurou may have a patch ready to dump stuff to debug all that
09:15 gnurou: yep - just give me some time, and please ping me if I don't come back within 24 hours :P
09:15 mangix: will do
09:18 mangix: agh
09:19 mangix: the quad screen issue
09:22 mangix: karolherbst: https://pastebin.com/5AWr6jiK
09:22 mangix: good news: no gr init failure
09:22 mangix: bad news: no login screen
09:25 mangix: will now try the skeggs repo
09:28 karolherbst: gnurou: shouldn't there be information printed about gr falcons booting?
09:29 karolherbst: wasn't there a series to lazy boot them on use for pascal tegras or anything like this?
09:30 karolherbst: mangix: can you check for dead threads?
09:30 gnurou: mmm there should, but I think you need to enable gr debug in the nouveau.debug flags
09:30 karolherbst: Xorg will hang, but there should be a kernel thread as well
09:31 karolherbst: mangix: please reboot with "nouveau.debug=debug"
09:32 gnurou: but AFAICT GR init seems to be happy here
09:33 mangix: FWIW i have a non reference design GPU. no idea if that makes a difference
09:34 mangix: also a custom VBIOS
09:36 karolherbst: mangix: yeah, but it doesn't explain why it works on 4.10
09:37 karolherbst: full debug logs should shed some light hopefully
09:37 mangix: systemctl start gdm shows a blank screen
09:37 mangix: leaving it for a minute
09:42 mangix: https://pastebin.com/nKVtm8Wt
09:44 mangix: this is some fun gibberish
09:44 mangix: pcie max speed: 8.0GT/s . i hope that's not pcie x1
09:46 gnurou: mmm, no more information about GR
09:47 karolherbst: gnurou: "gr: acquired GPCCS falcon"?
09:47 karolherbst: ohh wait
09:47 karolherbst: that's before performing secboot
09:47 gnurou: yeah, it just means it locked the falcon for exclusive use, which is expected
09:47 karolherbst: I am quite sure it hangs somewhere
09:48 karolherbst: mangix: can you check for stuck kernel threads?
09:48 gnurou: I was expecting something saying it started the falcon after secboot
09:48 mangix: how?
09:48 karolherbst: mangix: htop
09:48 karolherbst: in the settings you have to disable hiding kernel threads
09:48 gnurou: unless the HS firmware starts them itself on Maxwell? I don't quite remember
09:48 karolherbst: stuck threads are usually marked as "D"
09:48 karolherbst: gnurou: no
09:48 mangix: hmmm
09:48 karolherbst: I am sure they are started by the host
09:49 mangix: so without starting gdm?
09:49 karolherbst: maybe you changed it later
09:49 karolherbst: mangix: with starting
09:49 karolherbst: you need to ssh in
09:49 mangix: got it
09:49 mupuf: dboyan: yeah, I guess that's a good start. Has piano a lot of basic blocks?
09:49 karolherbst: mangix: what we need are the thread ids of the stuck ones
09:49 mupuf: AKA, does it have a complex control flow?
09:49 karolherbst: mangix: then we can print the stack trace and see where it got stuck
09:49 karolherbst: mupuf: while loops
09:50 karolherbst: mupuf: it's _one_ fragment shader for everything ;)
09:50 mupuf: ahah
09:50 karolherbst: and nested loops and everything
09:50 mupuf: but it still has geometry as an input, right?
09:50 mangix: soo
09:50 karolherbst: mupuf: it is so crazy, that I got a +1.5% perf increase after I enabled dual issuing for breaks
09:50 mupuf: with some attributes to say what kind of surface is supposed to be drawn
09:50 mangix: how do i enable stuck threads?
09:51 karolherbst: mupuf: but only if I reorder instructions for dual issuing
09:51 mupuf: ah ah ah
09:51 karolherbst: mupuf: no, not even a vertex
09:51 karolherbst: well, there is some fake vertex shader with 15 instructions or so
09:51 karolherbst: irrelevant
09:51 karolherbst: but I think it is for drawing the tux
09:51 karolherbst: or so
09:52 mangix: k. found hide kernel threads
09:52 mupuf: a passthrough vertex shader is fine, but I guess there is still geometry as an input, right?
09:52 karolherbst: no idea really, didn't look in depth
09:52 mupuf: otherwise, they would need to do the projection in the fragment shader and do it for every pixel
09:53 mupuf: which is crazier than it sounds!
09:53 mupuf: And it sounds super bad to begin with
09:53 karolherbst: well
09:53 karolherbst: it's a benchmark
09:54 karolherbst: mupuf: the shaders are inside the repository
09:56 mangix: karolherbst: running htop
09:56 mangix: what am i looking for?
09:56 karolherbst: there is a column "S"
09:56 karolherbst: sort by it
09:56 karolherbst: and there should be threads with a "D" there
09:56 karolherbst: we need the ids of those
09:57 mangix: VIRT?
09:57 karolherbst: no
09:57 karolherbst: "S"
09:57 karolherbst: S: state
09:57 karolherbst: D: uninterruptible sleep (usually stuck in the kernel), S: sleep, R: running
09:57 karolherbst: as the values
09:58 karolherbst: otherwise you could also just do a "find /proc -iname stack -exec grep nvkm {} +" and see for which pids you get something
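The same search can be done without htop by reading the state field of every /proc/<pid>/stat file. The sketch below assumes a Linux /proc layout; the helper names `task_state` and `blocked_tasks` are made up for illustration. Tasks in state D are in uninterruptible sleep, which is what a thread stuck inside the driver usually looks like.

```python
import os


def task_state(stat_text):
    """Return (comm, state) from the contents of a /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')' instead of splitting on spaces.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for every task currently in state D."""
    try:
        entries = os.listdir(proc)
    except OSError:
        return  # no procfs here, nothing to report
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                comm, state = task_state(f.read())
        except (OSError, ValueError):
            continue  # task exited, or unreadable/odd stat file
        if state == "D":
            yield int(entry), comm


for pid, comm in blocked_tasks():
    print(pid, comm)
```

For each pid printed, reading /proc/<pid>/stack as root (where the kernel exposes it) shows where the task is blocked.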
09:59 mangix: command gives nothing
10:00 mangix: how do i sort by column?
10:00 mangix: arrow keys move the screen
10:00 mangix: Ah f6
10:00 mangix: so I see htop as R
10:01 mangix: and S everything else
10:06 mangix: still don't see what i'm looking for
10:08 dboyan: mupuf: From the log I dumped, I don't think piano has a lot of basic blocks, compared to its computation. It also had super high "issue slot utilization" (~150%), while my other program were mostly 20% to 50%.
10:09 mupuf: dboyan: karolherbst probably has done a good job there :p
10:09 dboyan: I also tested one program I wrote, it did improve ipc by some degrees
10:10 dboyan: I guess bfs is not a good policy, but I'm surprised to see it can improve things somehow
10:13 karolherbst: :)
10:13 karolherbst: dboyan: I have patches on my branch to improve perf by 10% there :p
10:14 karolherbst: my last result was at 82% perf compared to nvidia
10:14 mupuf: dboyan: BFS standing for what here?
10:14 karolherbst: dboyan: dual issuing is very important for pixmark_piano
10:15 karolherbst: dboyan: mostly experiments, but see yourself: https://github.com/karolherbst/mesa/commits/pixmark_piano
10:15 RSpliet: dboyan: what's the difference in gpr usage for your strategy? Any shaders tipping over the magical limit of 32, or did you successfully limit that? :-)
10:16 karolherbst: you have no chance to reach 32 gpr with piano anyhow, unless you spill a crap load of stuff
10:16 dboyan: I meant Breadth-First, I currently put instructions that met dependencies in a fifo, that's brute-force
10:16 karolherbst: dboyan: in SSA?
10:16 dboyan: I haven't taken latencies and register pressure into account
10:16 dboyan: yeah, pre-RA
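A breadth-first policy of this kind can be sketched as a topological sort with a FIFO ready queue. This is an illustration under assumptions, not dboyan's actual pass: the function name `bfs_schedule`, the dictionary input format, and the toy instructions are all invented here, and every dependency is assumed to appear as a key of the dictionary.

```python
from collections import deque


def bfs_schedule(deps):
    """Order instructions so each one follows everything it depends on.

    deps maps an instruction to the list of instructions it depends on.
    Ready instructions are issued strictly in FIFO order, with no latency
    or register-pressure heuristics, which makes the policy breadth-first.
    """
    pending = {insn: len(d) for insn, d in deps.items()}
    users = {insn: [] for insn in deps}
    for insn, d in deps.items():
        for dep in d:
            users[dep].append(insn)

    ready = deque(insn for insn, n in pending.items() if n == 0)
    order = []
    while ready:
        insn = ready.popleft()
        order.append(insn)
        for user in users[insn]:
            pending[user] -= 1
            if pending[user] == 0:   # last dependency satisfied
                ready.append(user)

    if len(order) != len(deps):
        raise ValueError("dependency graph has a cycle")
    return order


example = {
    "ld a": [],
    "ld b": [],
    "mul t0, a, a": ["ld a"],
    "mul t1, b, b": ["ld b"],
    "add r, t0, t1": ["mul t0, a, a", "mul t1, b, b"],
}
print(bfs_schedule(example))
```

With the edge lists built up front, this variant visits each instruction and each dependency edge once; a pass that rescans the whole pending list on every step would instead be quadratic in the worst case.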
10:16 karolherbst: okay
10:16 karolherbst: use my branch then
10:17 karolherbst: it ensures in post-RA, that instructions are reordered to improve dual issuing
10:17 karolherbst: if you somehow mess this up, perf impact is super high
10:17 RSpliet: dboyan: okay that sounds like O(N^2), which I think isn't too bad
10:17 karolherbst: dboyan: see https://github.com/karolherbst/mesa/commit/70d24de24f798e3b23de144d3f4043a73fa4e88a
10:18 karolherbst: I think there are some wrong assumption in that pass though
10:18 mangix:goes to bed
10:18 dboyan: yeah, I noticed that. dual-issuing is best done post-RA
10:20 karolherbst: mhh, I should finish that pow to mul optimisation
10:20 karolherbst: it has an insanely high impact on perf
10:20 dboyan: RSpliet: O(N^2) in the worst case, but I think my current code runs at near linear for normal programs
10:20 RSpliet: dboyan: O is an upper bound ;-)
10:21 RSpliet: but yeah, I'd expect it to run O(n) in most cases. Even n^2 isn't to worry about though, I think other passes are of similar complexity. Shaders generally don't exceed several kilobytes of code, so it's all quite well contained
10:22 karolherbst: our backend compiler is pretty fast anyhow
10:22 karolherbst: just caching the TGSI makes shader compilation super fast
10:24 dboyan: Caching tgsi seems to reduce re-compiling, iirc. But I don't know why
10:25 karolherbst: anyway, let's not worry about compiler perf too much
10:26 dboyan: yeah, that's for sure :)
10:37 dboyan: bbl
11:23 karolherbst: mangix: okay, well probably just wait until gnurou has his patch ready then
12:18 imirkin: endrift: this was for a PCIE G5
12:22 imirkin: apparently MSI is somehow busted
13:02 vpelletier: nouveau.config=NvMSI=0,NvPCIE=0 is not enough
13:02 vpelletier: imirkin_: ^
13:02 imirkin: =/
13:02 imirkin: throw in nouveau.vram_pushbuf=1 ?
13:02 vpelletier: adding pushbuf seems to do the trick
13:02 imirkin: yay!
13:03 imirkin: probably don't need that NvPCIE=0 thing then
13:03 vpelletier: I'm also seeing therm fan update messages I do not remember seeing before
13:03 imirkin: although ... that suggests that dma is somehow wonky
13:03 imirkin: that's normal
13:04 imirkin: coz interrupts are working now ;)
13:04 imirkin: turns out interrupts are important for the proper functioning of hardware
13:04 imirkin: anyways, bbl... work...
13:04 vpelletier: hehe
13:08 vpelletier: and X accepts to start from an accelerated TTY
13:09 vpelletier: although the cursor has wrong colors. probably some of the changes we applied are breaking more things than they fix
13:09 vpelletier: I'll revert to vanilla nouveau code
13:16 vpelletier: I confirm, back to vanilla, cursor colors are back to normal
13:17 vpelletier: and vsync'ed glxgears rotate
13:18 vpelletier: so remaining issues are: debian providing 64k pages kernels out of the box, MSI broken, opengl 2.1 not being available because of color endianness
13:19 vpelletier: I won't fight with 1, I'll poke a bit at 2 and then 3
13:21 MichaelP: Opensuse leap 42.2 plasma 5 20-nouveau.conf... Option "GLXVBlank" "true" still getting screen tearing ... GeForce GT 730
13:27 vpelletier: imirkin_: lspci -t -vv and lspci -vv: https://pastebin.com/tgwgqaPQ
13:27 vpelletier: the nvidia card and its parent bus controller are the only MSI lines with "Enabled-", others are "Enabled+"
13:27 vpelletier: whatever it means (RTFM'ing)
13:33 vpelletier: looks like it exactly means MSI disabled
13:37 vpelletier: so it looks consistent with what benh described earlier
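The check done by eye on the lspci paste can be sketched as a small parser. The sample text below is a made-up excerpt that follows the usual `lspci -vv` layout (device header at column 0, capabilities indented, flag spelled `Enable+` or `Enable-`); the device names and capability offsets are hypothetical, and `msi_status` is a name invented for this example.

```python
import re

# Matches the MSI capability line but not "MSI-X: Enable+".
MSI_RE = re.compile(r"MSI: Enable([+-])")


def msi_status(lspci_text):
    """Map each device address in `lspci -vv` output to True if MSI is enabled."""
    status = {}
    device = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            device = line.split(" ", 1)[0]   # header line starts with the address
        match = MSI_RE.search(line)
        if match and device is not None:
            status[device] = match.group(1) == "+"
    return status


SAMPLE = """\
0000:00:0b.0 PCI bridge: hypothetical parent bridge
\tCapabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
0000:0a:00.0 VGA compatible controller: hypothetical NVIDIA card
\tCapabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
0001:05:04.0 Ethernet controller: hypothetical NIC
\tCapabilities: [48] MSI: Enable+ Count=1/1 Maskable- 64bit+
"""

for dev, enabled in msi_status(SAMPLE).items():
    print(dev, "MSI on" if enabled else "MSI off")
```

Note that the flag only reports whether MSI is currently switched on for the device; it does not say whether the platform could support it.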
14:04 imirkin_: vpelletier: fwiw, MSI is one of those things that has resisted being understood by me in any meaningful way
14:04 imirkin_: and also fyi, benh is the ppc maintainer, so he tends to know what he's talking about wrt those bits
14:16 vpelletier: imirkin_: is it up to nouveau to detect MSI is not enabled on parent bridge and to auto-set NvMSI=0 ?
14:16 imirkin_: see above re my understanding of MSI :)
14:16 vpelletier: yeah, I was thinking this right when I sent my message
14:17 imirkin_: i know that MSI = good (like drugs = bad), but that's about it
14:18 imirkin_: https://www.youtube.com/watch?v=Uh7l8dx-h8M
14:33 vpelletier: /* deprecated, don't use */ pci_enable_msi
14:34 vpelletier: from drivers/pci/msi.c
14:35 vpelletier: anyway, it internally calls pci_msi_supported which checks NO_MSI flags on all parent bridges
14:36 vpelletier: so looks like it's not up to nouveau to check that
14:36 vpelletier: and maybe the "disable msi" quirk in ppc would forget to set the flag ?
15:09 vpelletier: getting confused, msi does not look so disabled, as in an old dmesg I find a "u3msi: allocated virq 0x20 (hw 0x8) addr 0xf8004080" line when nouveau tried to enable MSI. anyway, too late to think ~> Zzz
17:10 endrift: imirkin ah dang :(
19:00 karolherbst: Lyude: well regarding clock gating, you can simply open nvidia-settings and set the perf mode to max
19:00 karolherbst: it will set the same clocks as nouveau at highest pstate with NvBoost=0
19:01 karolherbst: Lyude: and yeah, prime system would be nice
19:01 Lyude: karolherbst: I mean to test things with nouveau's clockgating, not nvidia's
19:01 karolherbst: Lyude: I see heavy stuttering on my prime system running pixmark_piano with nouveau prime offloaded, even if the intel gpu is super idle
19:02 karolherbst: Lyude: with nouveau the load faker does nothing useful
19:02 Lyude: karolherbst: ah
19:02 karolherbst: it just messes with some counters to get nvidia to clock
19:02 Lyude: karolherbst: also, would a 940MX/SKL do?
19:03 karolherbst: Lyude: maybe
19:03 Lyude: alright, I might have some other prime machines as well if that doesn't work
19:03 karolherbst: I think intel drops the sync if the offloaded rendering is way too slow
19:03 karolherbst: I have a hsw system with a 770m
19:04 karolherbst: but you should feel it immediately. Maybe it is caused by the window manager
19:04 karolherbst: or having only DRI3 enabled
19:04 karolherbst: or having no nouveau ddx at all
19:04 karolherbst: I don't know, but the stutters are super super super annoying
19:04 Lyude: What DE is this?
19:04 karolherbst: plasma5
19:04 Lyude: ahhh
19:05 Lyude: i would put $10 on it being plasma's fault
19:05 karolherbst: I will check with kwin killed
19:05 karolherbst: well I even tried it with tearing prevention disabled completely
19:05 karolherbst: Lyude: no, actually it is due to a change in the i915 driver, because they are syncing the output "better", but somehow nobody had any interest to follow this
19:06 karolherbst: but maybe it only happens with kwin, dunno
19:06 Lyude: karolherbst: yeah, pretty features that break everything because no one actually tried them sounds like i915
19:06 Lyude: by chance have you managed to bisect it?
19:07 karolherbst: not really
19:07 karolherbst: mhhh
19:07 karolherbst: okay, I think kwin makes it only worse
19:07 karolherbst: but hum
19:07 karolherbst: even writing something inside my IRC client is super laggy now
19:07 karolherbst: kwin is killed
19:07 Tom^: O_o
19:07 Lyude: hm
19:07 Lyude: does the lag happen to come and go
19:07 karolherbst: no
19:08 Lyude: e.g. shell animations go at 60fps for a bit, then drop down again and stutter?
19:08 karolherbst: it only happens if nouveau is super busy
19:08 karolherbst: sometimes it even causes my entire desktop to drop below the rendering speed of nouveau
19:08 karolherbst: It is caused by how i915 syncs stuff
19:08 karolherbst: I had a patch once to get the old behaviour back, but I've lost it
19:09 karolherbst: I think Kayden gave me the patch
19:09 karolherbst: Lyude: well, on an intel only system it doesn't matter, because if the intel GPU is under heavy load everything is crappy anyway
19:10 karolherbst: it would be just nice to have others confirm this issue
19:11 Lyude: GK107GLM / haswell
19:11 Lyude: that sounds closer to your setup
19:11 karolherbst: yes
19:11 karolherbst: I have a gk106, but gk107 should be close enough
19:11 Lyude: cool, lemme fire this machine up
19:15 karolherbst: awesome, thanks :)
19:17 mupuf: Lyude: yeah, the fakecounters tool won't help you, it could only help you for power gating
21:04 karolherbst: Lyude: and did you notice anything?
21:04 Lyude: karolherbst: sorry, just got back to my desk. about to run the thing you mentioned
21:09 Lyude: karolherbst: entire rest of the GUI gets laggy to the point it's almost unusable?
21:09 karolherbst: :)
21:09 karolherbst: yeah, makes sense, doesn't it?
21:09 karolherbst: you did offloading, right?
21:09 Lyude: yep, DRI_PRIME=1
21:09 karolherbst: yeah
21:09 karolherbst: well on intel its even worse
21:09 karolherbst: at least I am not the only one having this silly issue :)
21:10 karolherbst: bonus point: games running at around 40 fps are affected as well
21:10 karolherbst: but there intel can drop below 40fps
21:10 karolherbst: ....
21:10 karolherbst: so in the end I get something like a 15fps experience
21:12 Tom^: is this something intel or offloading related?
21:12 karolherbst: intel
21:12 karolherbst: here is what intel basically does:
21:12 karolherbst: it waits until the other driver has a frame ready for displaying
21:12 karolherbst: and it blocks while doing this
21:12 karolherbst: and if you miss time frames just by a slim margin, your entire desktop just gets laggy
21:13 karolherbst: maybe nouveau fails to deliver information or something, but the hell
21:13 Tom^: i see
21:13 Tom^: does it occur on blob too?
21:13 karolherbst: nvidia doesn't support prime offloading
21:14 Tom^: i thought they did, i remembered reading something about it hm
21:14 karolherbst: bumblebee doesn't have this kind of issue though, because it's 100% in userspace
21:14 karolherbst: yeah, some ass crappy kind of offloading
21:14 karolherbst: it's an insult to proper prime offloading to call it prime offloading
21:14 Lyude: it's not crappy offloading, it's nvidia(tm) crappy offloading
21:14 Tom^: https://devtalk.nvidia.com/default/topic/957814/prime-and-prime-synchronization/ this
21:14 karolherbst: hihi
21:15 karolherbst: Tom^: yeah, it's basically gpu always on and intel used as a display provider
21:16 karolherbst: Tom^: looking at this, primusrun (bumblebee) looks more advanced
21:16 Tom^: heh, perhaps. havent fiddled with any of this offloading bits. just remembered reading something about it :P
21:16 karolherbst: it's not worth spending time on
21:17 Tom^: i wonder if i can boot the onboard intel and offload on this 780ti
21:17 karolherbst: sure you can do
21:17 Tom^: and ALSO gpu passthrough it when i feel like to in a vm
21:17 karolherbst: but why do you want to do that after intel messed up?
21:17 karolherbst: ohh I see
21:17 Tom^: the ultimate gaming rig. :p
21:17 karolherbst: well GPU passthrough sounds like a nice idea
21:17 karolherbst: without hardware support it gets difficult though
21:18 Tom^: yeah i have iommu and all the bits, so it should be doable
21:18 Tom^: crap 23:18, time for bed.
21:23 karolherbst: Lyude: any idea what we can do about this issue?
21:24 Lyude: karolherbst: unfortunately I have no idea, I've never really worked on that part of the kernel.. this being said though, I -do- still have push access to i915, so if you end up coming up with any fixes I can gladly review/push
21:24 karolherbst: sounds like a plan
21:25 karolherbst: but maybe first I shall create a bug report
21:25 Lyude: yeah, probably a good idea
21:25 karolherbst: sadly I don't know in which kernel version this behaviour was introduced, but this alone is bug worthy even if not being a regression
21:26 karolherbst: Lyude: mind wanna check if this also happens on an intel/AMD setup?
21:26 karolherbst: *mind checking
21:26 Lyude: karolherbst: unfortunately I don't think I have any on me in the boston office
21:27 Lyude: yeah, just double checked. nothing :\
21:27 karolherbst: sad
21:28 Lyude: it is unfortunate how many fewer of those systems exist than the intel/nvidia ones
21:28 karolherbst: Lyude: do you have other systems where you could test this quite easily?
21:28 Lyude: karolherbst: yeah, i've got a couple of primes but they're all intel/nv
21:28 karolherbst: well, I will create the bug for HSW for now and if you want, you can check other platforms and respond on that bug
21:29 Lyude: sure thing
21:40 karolherbst: Lyude: https://gist.githubusercontent.com/karolherbst/f5e5a808bfe2110e383abddd00835016/raw/7899180e1d60461d792d1e343e872ebe9361cdf6/gistfile1.txt
21:41 Lyude: karolherbst: ah, so we're all set?
21:42 karolherbst: well, the link doesn't work
21:42 karolherbst: https://bugs.freedesktop.org/show_bug.cgi?id=101506
21:42 karolherbst: but it should be fixed upstream anyhow
22:02 Lyude: niceee
22:04 Lyude: got some more meaningful measurements of where I am right now with clockgating/other gatings. Full perf on this 760 running pixmark_piano goes down from 123-125W to 111-119W
22:04 Lyude: that's only with blcg and cg so far
22:06 karolherbst: :) nice
22:06 karolherbst: but that's around what I experienced as well. a 2-3 W drop at ~60W
22:10 Lyude: i'm curious to see what happens once I add slcg
22:10 Lyude: and, someday, maybe elpg
22:11 Lyude: mupuf: btw, the main difficulty of elpg is knowing that parts of the GPU depend on each other being turned on, correct?
22:26 Lyude: hm, if I were to split gf100_gr_mmio() so that it's accessible by all subdevs, where is the best place in nouveau to put it? (gating and gr init have very similar init patterns, enough so it doesn't make sense to have more than one function for writing large packs of mmio register values)
22:27 Lyude: it will probably not be called gf100_gr_mmio at that point but you get the picture
22:33 Lyude: skeggsb: ^ ?
22:52 skeggsb: Lyude: hm, i think you might just be better off writing a separate function, i don't think you need the multi-level separation of "packs" do you?
22:53 Lyude: skeggsb: that's what I'm trying to do actually, apologies if I didn't explain properly
22:53 Lyude: buuut i think i know where it needs to go now
22:54 skeggsb: most of the work is getting all the data i guess, so put that all together somehow, and we can figure out the rest once we see those patches :P
22:54 skeggsb: ie. make it work, then flesh out the details
22:54 Lyude: gotcha
23:14 karolherbst: skeggsb: how can you even live in australia, here it was like nearly 30°C and it's already too much for me :O
23:15 karolherbst: but we also have a humidity of 92% here, so maybe that's why
23:21 airlied: karolherbst: thats quite common here
23:22 airlied: like at least a month or two a year
23:29 skeggsb: it's only 19 here currently :) love this time of year!
23:29 mangix: 38 C here
23:29 mangix: i cannot walk outside