00:02 karolherbst: optlink: yes, I do it like that
00:22 optlink: Alright, this time X started fine but optirun glxgears never starts and system load goes through the roof, just like before. It has been running for 10 minutes now
00:23 imirkin_: perhaps there are some tricks to getting it working on optimus, dunno
00:37 optlink: Mmiotrace is producing huge amounts of data. Is that normal?
00:37 imirkin_: define 'huge'
00:37 optlink: 100MB in a few minutes
00:37 imirkin_: a successful mmiotrace of glxgears for a few seconds should be in the 50-500MB range
00:38 imirkin_: it compresses *really* nicely
00:38 optlink: Ok that makes sense then.
00:39 optlink: Looks like I'm getting errors from the NVidia driver in dmesg. Great
02:25 optlink: I have absolutely no idea how to solve this problem. Kernel 4.9 locks up solid and 4.12 just dies slowly
05:00 sooda: RSpliet: about those timeslices: there may be more to it in nvgpu, haven't looked there in a while, but one more option to give extra time to high-priority tasks was discussed too - just insert the same channel several times in the runlist, based on some simple heuristics :P this is to reduce latency under load, while the timeslice length is only about throughput
10:12 RSpliet: sooda: from the looks of it, giving a bigger timeslice to a higher prio task would stretch out the epoch too, so the effect is not super predictable. I wonder what kind of heuristics would be best for this (load based, or rather try and work out which is the foreground task)
11:18 RSpliet: And following up, who should have control over this. For security reasons you'd want contexts to be completely independent from each other, but we can't expect tasks to say anything meaningful about the size of their timeslice on the GPU because 1) we don't want to modify applications, and 2) game theory :-P
11:19 RSpliet: For heuristics like "foreground task(s) get the biggest slice", perhaps the compositor can help... for a per-context perf-counter based approach the kernel would have to get its say
11:22 RSpliet: Do tasks give up their "slice" automatically when their work-queue is empty?
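
A rough sketch of the runlist idea discussed above (repeat a high-priority channel in the runlist to cut its latency, at the cost of a longer epoch). This is not nvgpu or nouveau code: the function names, the channel names, and the "one repeat before every normal channel" heuristic are all assumptions made purely for illustration.

```python
from typing import Dict, List


def build_runlist(high: List[str], normal: List[str]) -> List[str]:
    """Hypothetical heuristic: re-insert every high-priority channel
    before each normal channel, so they are revisited more often."""
    if not normal:
        return list(high)
    runlist: List[str] = []
    for channel in normal:
        runlist.extend(high)
        runlist.append(channel)
    return runlist


def epoch_length_us(runlist: List[str], timeslice_us: Dict[str, int]) -> int:
    """Time for one full pass over the runlist, assuming every entry
    uses its whole timeslice."""
    return sum(timeslice_us[channel] for channel in runlist)


def worst_case_wait_us(runlist: List[str], channel: str,
                       timeslice_us: Dict[str, int]) -> int:
    """Longest time `channel` waits between two of its own entries,
    treating the runlist as a cycle."""
    positions = [i for i, c in enumerate(runlist) if c == channel]
    if not positions:
        raise ValueError(f"{channel!r} is not in the runlist")
    n = len(runlist)
    worst = 0
    for idx, start in enumerate(positions):
        nxt = positions[(idx + 1) % len(positions)]
        wait = 0
        i = (start + 1) % n
        while i != nxt:  # sum the slices of everything in between
            wait += timeslice_us[runlist[i]]
            i = (i + 1) % n
        worst = max(worst, wait)
    return worst


# Made-up channels, each with a 1000 us timeslice.
slices = {"ui": 1000, "a": 1000, "b": 1000, "c": 1000}
plain = ["ui", "a", "b", "c"]
boosted = build_runlist(["ui"], ["a", "b", "c"])
```

With these made-up numbers the worst-case wait for `ui` drops from 3000 us to 1000 us, while the epoch grows from 4000 us to 6000 us, which is the latency-versus-epoch trade-off raised in the messages above.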
14:29 dcomp: well it seems i still need runpm=0 config=NvClkMode=7
14:51 optlink: karolherbst: do you have any special configuration for mmiotrace with bumblebee? The secondary X server hangs and then eventually hangs the rest of my system
14:52 karolherbst: well, mmiotrace has a few bugs, skeggsb I think has a few workarounds for that? not quite sure
14:52 karolherbst: it also depends on the issue you encounter
14:56 optlink: I've tried a few different methods and the issue seems to be that the X server started with the nvidia driver hangs after a few moments and never fully starts. System load continues to increase and eventually I see messages that a kernel worker is hung in dmesg.
14:57 optlink: mmiotrace is outputting, but something it's doing is triggering this hang
14:59 dcomp: optlink: I used to have that problem, I think I had to stop it from disabling all cpus. It seems to disable all other cpus and then peg the remaining cpu so I can't stop it. It's an option somewhere in mmiotrace
15:00 optlink: dcomp: I've read about that. The kernel documentation says that disabling them is necessary in order to catch all events. Though I may as well try it seeing as nothing has worked so far
15:46 karolherbst: optlink: yeah I know
15:46 karolherbst: optlink: it is related to how the ioremapped pages are tracked and stuff
15:47 karolherbst: it is on my todo list to fix that
15:47 karolherbst: one fix is to get nvidia to only ioremap 0x1000 aligned pages
15:48 karolherbst: sometimes they map 0x200 or even 0x2040 big areas, which mmiotrace currently can't handle properly
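
For reference, the arithmetic behind "0x1000 aligned" can be sketched as below. This only illustrates the size check; it is not how mmiotrace or the nvidia glue code handles mappings, and the helper names are hypothetical.

```python
PAGE_SIZE = 0x1000  # 4 KiB, the alignment mentioned in the discussion


def is_page_aligned(value: int, page_size: int = PAGE_SIZE) -> bool:
    """True if `value` is a whole multiple of the page size."""
    return value % page_size == 0


def round_up_to_page(size: int, page_size: int = PAGE_SIZE) -> int:
    """Smallest multiple of the page size that is >= `size`."""
    return (size + page_size - 1) // page_size * page_size


# The two mapping sizes quoted above, plus one aligned size for contrast.
for size in (0x200, 0x2040, 0x1000):
    print(hex(size), is_page_aligned(size), hex(round_up_to_page(size)))
```

Neither 0x200 nor 0x2040 is a multiple of 0x1000; rounded up to whole pages they would cover 0x1000 and 0x3000 bytes respectively.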
15:52 optlink: karolherbst: I can see those larger remaps happening just prior to the hang so that must be it
15:52 optlink: unfortunately, I have no idea how to work around this or even where to start looking
15:52 karolherbst: there is glue code in the nvidia driver which gets compiled
15:53 karolherbst: skeggsb should have a workaround for that I think
15:55 optlink: I'm guessing he is at XDC at the moment?
16:11 karolherbst: optlink: yes
16:11 karolherbst: so am I
18:56 karolherbst: imirkin: by the way, GP108 support landed in 4.14 not 4.13
18:57 imirkin_: oh
18:57 imirkin_: sorry, i was off.
19:11 karolherbst: yeah no worries, I already knew that it isn't part of 4.13
20:43 Lyude: https://nvidia.custhelp.com/app/answers/detail/a_id/4544 uhoh
21:02 optlink: skeggsb: would you happen to have a workaround for mmiotrace and differently sized remaps?
21:04 karolherbst: optlink: you probably won't get it until next week
21:04 karolherbst: I talked with him
21:04 optlink: karolherbst: ah ok. Thanks.
21:17 Lyude: oh mupuf I almost forgot to ask: the other day when you were presenting on Intel's CI you mentioned that you guys use efivarfs for storing kernel panics, is there any documentation on how to store stuff in efivarfs/do you just make a random file in there and it just works?
21:17 Lyude: being able to get kernel dumps through that would be extremely useful to have
21:17 mupuf: Lyude: no, that's the job of the firmware to create files there
21:18 Lyude: aw
21:31 jayhost: Anyone know if the AMDGPU vega driver allows gpu virtualization, I was gonna grep for mxgpu in the DC DAL kernel and see if I can figure it out
21:31 karolherbst: Lyude: I use it as well, but I don't know what I needed to setup for it
21:32 karolherbst: Lyude: well the pstore panic thing
21:32 karolherbst: mupuf: why is it the job of the firmware?
21:32 karolherbst: well sure the EFI stores them, but afaik efivars is used in pstore to save those dumps
21:33 karolherbst: you even find the files in efivars as well
21:38 Lyude: tries putting something in there...
21:39 karolherbst: Lyude: well the oops stuff automagically happens in the kernel
21:39 Lyude: any idea how you access that then?
21:40 karolherbst: you need pstorefs mounted, usually under /sys/fs/pstore
21:40 Lyude: OH, pstore is a f
21:40 Lyude: *fs
21:40 Lyude: ok, that makes sense
21:40 karolherbst: well, no, a directory
21:40 karolherbst: but yeah, a fs nonetheless
21:41 imirkin_: jayhost: wrong channel?
21:41 karolherbst: Lyude: and you can simply delete all files inside the pstorefs to clean it up
21:41 karolherbst: but
21:41 karolherbst: I know about hardware where it simply doesn't work, no idea whether that's due to a wrong setup or something else
21:41 karolherbst: I simply know it works for me and the developer writing pstore
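
A minimal sketch of the workflow described above: look at what the kernel left in the pstore mount, then delete the files to clean it up. The `/sys/fs/pstore` path comes from the discussion; the function names are hypothetical, and deleting records on a real system will normally need root.

```python
import os
from typing import List

PSTORE_DIR = "/sys/fs/pstore"  # mount point named in the discussion


def list_pstore_records(pstore_dir: str = PSTORE_DIR) -> List[str]:
    """Sorted names of the regular files in the pstore directory.

    Returns an empty list if the directory does not exist, for example
    when pstore is not mounted.
    """
    if not os.path.isdir(pstore_dir):
        return []
    return sorted(
        name for name in os.listdir(pstore_dir)
        if os.path.isfile(os.path.join(pstore_dir, name))
    )


def clear_pstore_records(pstore_dir: str = PSTORE_DIR) -> int:
    """Delete every record file and return how many were removed."""
    removed = 0
    for name in list_pstore_records(pstore_dir):
        os.remove(os.path.join(pstore_dir, name))
        removed += 1
    return removed


if __name__ == "__main__":
    records = list_pstore_records()
    print(f"{len(records)} record(s) in {PSTORE_DIR}")
    for name in records:
        print(" ", name)
```

An empty listing does not say why nothing is there: as the following messages note, it could be a missing backend, a kernel configuration issue, or simply that no oops has been recorded yet.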
21:41 Lyude: ➜ pstore echo "Help! I'm stuck inside an EFI variable and I can't get out" > test.txt
21:42 Lyude: zsh: permission denied: test.txt
21:42 karolherbst: well
21:42 karolherbst: you don't write into pstore from userspace, because that isn't its task really
21:42 Lyude: ahh, ok
21:42 karolherbst: the main idea of pstore is to have a means of storing stuff if everything goes south, especially network/disk
21:42 karolherbst: or IO in general
21:42 jayhost: imirkin_ : No, I just know you know what you're talking about. Is there a better channel than Radeon or Amdgpu for amd help?
21:43 karolherbst: and there are other backends, not just efivarfs, but it's the one which makes sense on most modern desktops
21:43 imirkin_: jayhost: #radeon
21:43 karolherbst: Lyude: https://www.kernel.org/doc/Documentation/ABI/testing/pstore
21:44 jayhost: It's been super silent in radeon. I'll do some more research, glad I took your advice on getting AMDgpu. Vega is pretty rad.
21:44 Lyude: oooooooo
21:44 imirkin_: yeah - having a full-time team develop drivers, with access to documentation - almost feels like they're cheating!
21:45 karolherbst: Lyude: if you are lucky, there are already files in there on your machine
21:45 Lyude: nothing in here right now
21:46 karolherbst: is it even mounted?
21:46 Lyude: yep, but I'm going to try remounting it with the arguments from that doc
21:46 karolherbst: not important actually
21:46 karolherbst: there is something funky going on here for most distributions I think
21:46 imirkin_: jayhost: i suspect a lot of people are at XDC, and talking in RL rather than on the interwebs
21:46 karolherbst: never got time to actually figure out what
21:47 karolherbst: Lyude: it might be that your pstore simply has no backend set
21:47 karolherbst: or there is none at all
21:47 karolherbst: or it doesn't write oopses automatically in it
21:47 Lyude: :(, i will complain to our razer contacts if that's the case. this thing actually supports kernel debugging through some magic port on the board though, so I'd be surprised if it couldn't do this (but not that surprised)
21:47 karolherbst: well, I doubt it is a hardware issue
21:48 karolherbst: more like kernel configured incorrectly
21:48 Lyude: ahh
21:48 karolherbst: or it has to be enabled or so
21:48 karolherbst: no idea, I compile my own kernel, so I just enabled the kernel config for it
21:48 Lyude: alright, brb crashing laptop
21:48 imirkin_: Lyude: this might be obvious, but you have to boot with EFI
21:48 Lyude: imirkin_: i always do :)
21:48 Lyude: i think people who still use bios on machines with efi are weird
21:49 jayhost: imirkin_ Oh interesting, hosted at Google. I picked up an r5 1600 too. All runs great in Debian
21:50 jayhost: Almost same days at EGX event. So many conferences