00:20Lyude: i think I /finally/ came up with a foolproof solution for our RPM woes :)
00:21Lyude: Using current lets you do some pretty magical stuff
00:22Lyude: and so does shoving fb_helper out of the way entirely when runtime suspending
03:50nyef: ... I can't load nouveau after the nvidia blob because there's no bios image? What?
03:51gnarface: they have never played nice together
03:51nyef: Well, I unloaded the blob first. I figured trying to get them both to run the same device would be a bad idea.
03:51nyef: (Or, more accurately, a non-starter, since whichever loads first, wins.)
03:52gnarface: well, that's effectively true
03:52gnarface: you basically have to blacklist one or the other, then reboot
03:52gnarface: but afaik only the blob refuses to properly unload once loaded. nouveau shouldn't require a reboot unless something crashed and jacked it up.
03:53gnarface: (but note that most distros ship a graphical login manager by default now, which can cause a gotcha case if you forget to kill it too before unloading nouveau)
03:54nyef: The problem being that I'm trying to investigate a possible misconfiguration on the nouveau side, and wanted nvidia to do its setup first, then start nouveau to see if things are "better".
03:54gnarface: yea, that won't work, the blob is too badly behaved
03:55nyef: So much for the "easy" route.
03:55gnarface: the process as it has been described to me was to get a mmiotrace of whatever the nvidia driver does with the firmware, then reboot to clean it out so you can compare that to the equivalent dump from nouveau, then go about figuring out which bits are needed
03:56gnarface: or you could do the same thing except comparing the non-free firmware as used by both drivers
03:56gnarface: it's way above my head though
03:56nyef: Does this work for finding ctxvals?
03:57gnarface: dunno, even that question is above my head, sorry
03:57nyef: Okay, thanks.
08:05heeen: I have blacklisted nouveau but it still keeps getting loaded. I have it in modprobe.d as blacklist and alias off and ran update-initramfs -u but still no luck. running kubuntu 18.04 on 4.17.11 and nvidia-396
09:15heeen: I even blacklisted it in the kernel commandline and it still loads
09:45karolherbst: heeen: weird, no idea what might cause that except something doesn't get updated as it should be
09:45heeen: karolherbst: I found it. systemctl disable nvidia-fallback.service
09:46heeen: which is installed by nvidia-396
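A quick way to confirm whether a blacklist actually took effect is to look for the module in /proc/modules. The sketch below is only an illustration: the helper names `loaded_modules` and `read_proc_modules` are made up here, and it assumes the usual /proc/modules layout in which the module name is the first whitespace-separated field on each line.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # Assumption: the first field of each line is the module name.
            names.add(fields[0])
    return names


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of currently loaded modules (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a live system, `"nouveau" in loaded_modules(read_proc_modules())` would then tell whether something (such as the fallback service mentioned above) loaded the module despite the blacklist.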
09:46karolherbst: by nvidia?
09:46heeen: by the package maintainers at ubuntu?
09:46karolherbst: sounds like something a distribution would do
09:46heeen: I don't know
09:47karolherbst: that's evil
09:47karolherbst: I mean, usually you use the efifb or something as the fallback
09:47karolherbst: but using nouveau might just let you run into X not starting, although with glvnd it might be even okay
09:56heeen: I filed this
09:58karolherbst: heeen: ohh, what kind of laptop do you have?
09:59heeen: ok so issue for you nouveau guys: nouveau keeps crashing on suspend even if it is not used (not in lspci)
10:00karolherbst: what kind of crash?
10:00karolherbst: there might be several reasons, so without a dmesg or something it is hard to tell what issue you have
10:00karolherbst: currently we are kind of looking into those and a lot are already fixed, although the patches aren't merged yet
10:00heeen: karolherbst: http://termbin.com/o0ydo
10:01heeen: karolherbst: does not suspend, hard locks with dark screen
10:01karolherbst: mhh, nvkm_pmu_reset fails
10:01karolherbst: skeggsb_: any ideas?
10:05karolherbst: mhh, the entire secboot stuff fails before even trying suspending or anything
10:05karolherbst: that secboot stuff is just super annoying
10:05karolherbst: as it isn't debuggable
10:06karolherbst: heeen: uhhh, you run into multiple issues here, kind of
10:06karolherbst: bumblebee tries to unload nouveau as well
10:11heeen: karolherbst: the issue was there before I installed bumblebee
10:12heeen: karolherbst: I haven't even tried it once
10:14heeen: karolherbst: https://github.com/matthieugras/Prime-Ubuntu-18.04 check this
10:15heeen: How is this different to the standard 18.04 approach?
10:15heeen: Two things are different. Firstly, nvidia-prime in Ubuntu 18.04 does not use bbswitch to power-off the nvidia card when you are in intel-only mode. Instead, the developers swapped to an officially-supported kernel feature, which apparently only works when the nouveau driver is present. Unfortunately, this means the nvidia drivers have to be removed. So prime-select intel goes through an elaborate process of
10:15heeen: removing the nvidia drivers, rebuilding the initramfs image and rebooting, solely to load nouveau so the nvidia card can be turned off.
10:15heeen: is this true?
10:15heeen: not loading nouveau will eat my battery because it will not power down properly?
10:17karolherbst: yeah, something changed so bbswitch won't work anymore
10:17karolherbst: on newer laptops even the old approach doesn't work
10:18karolherbst: that prime-ubuntu thing is totally overengineered
10:18karolherbst: I wouldn't use it
10:21karolherbst: also it wouldn't work anyway
10:22karolherbst: well, not on your laptop
10:25heeen: karolherbst: so, without loading nvidia or nouveau, will it still consume power or not
10:25karolherbst: it will
10:25karolherbst: it will even with loading nvidia or bbswitch
10:25heeen: iow, I need nouveau to power it down?
10:26heeen: which means my battery runtime will suffer until it doesn't crash anymore?
10:26karolherbst: you could enable the runpm control
10:26karolherbst: even with no driver installed
10:26karolherbst: heeen: you should have a /sys/bus/pci/devices/0000:01:00.0/power/control file
10:26heeen: the nouveau.runpm=0 kernel commandline?
10:26karolherbst: if you echo auto in it with no driver loaded
10:26karolherbst: it should poweroff the GPU
10:26karolherbst: and work even quite reliably
10:27karolherbst: there is a /sys/bus/pci/devices/0000:01:00.0/power/runtime_status file which indicates the current status
10:28heeen: now it says suspended
10:28karolherbst: and the power consumption should drop
10:28heeen: sounds great
10:28heeen: why is it not on auto in the first place
10:29karolherbst: well, there is a weirdo bug I try to track down which causes the GPU to crash when nouveau is loaded and tries to do that...
10:29heeen: can I automate this with udev
10:29karolherbst: well, if you use tlp you could tell tlp to do that for example
10:29heeen: whats tlp
10:29karolherbst: you need to enable auto mode for the bus as well
10:29karolherbst: which is at /sys/bus/pci/devices/0000:00:01.0/power/control
10:30karolherbst: heeen: I would assume that it is the drivers job to know if the device can be suspended or not
10:30karolherbst: but dunno
10:30karolherbst: that all isn't really convenient right now
10:30heeen: hm no
10:31heeen: that device does not exist
10:31karolherbst: lspci -t output?
10:32karolherbst: so it is 1c.0
10:32karolherbst: /sys/bus/pci/devices/0000:00:1c.0/power/control then
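The sysfs steps described above (set power/control to auto for the GPU and for the bridge it sits behind, then read runtime_status) can be sketched in Python. This is a minimal sketch, not a tested tool: the function names are hypothetical, the addresses come from this log and differ per machine, and it assumes the usual sysfs layout in which each entry under /sys/bus/pci/devices is a symlink whose resolved path nests a device inside its upstream bridge. Writing to these files needs root.

```python
import os

SYSFS_PCI = "/sys/bus/pci/devices"  # standard sysfs location on Linux


def parent_bridge(address, root=SYSFS_PCI):
    """Return the PCI address of the upstream bridge, or None on the root bus."""
    real = os.path.realpath(os.path.join(root, address))
    parent = os.path.basename(os.path.dirname(real))
    # Assumption: root-bus devices sit directly under a 'pciDDDD:BB' directory.
    return None if parent.startswith("pci") else parent


def enable_runtime_pm(address, root=SYSFS_PCI):
    """Write 'auto' to power/control and return the reported runtime_status."""
    power = os.path.join(root, address, "power")
    with open(os.path.join(power, "control"), "w") as f:
        f.write("auto")
    with open(os.path.join(power, "runtime_status")) as f:
        return f.read().strip()
```

For the laptop in this log, `parent_bridge("0000:01:00.0")` would be expected to give the same 1c.0 answer that `lspci -t` showed, and `enable_runtime_pm` would be called once for the bridge and once for the GPU. The status may still read active right after the write, since the device only suspends once it has been idle for a while.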
10:33karolherbst: I am not quite sure how well that works together with bumblebee/nvidia though
10:33heeen: so how would you make those settings stick
10:33karolherbst: heeen: could you verify that the power consumption went down?
10:34karolherbst: mhhh, yeah, that should work
10:34heeen: does it matter if its plugged in or on battery
10:34karolherbst: depends on powertop
10:34karolherbst: sometimes it doesn't seem to display power consumption when plugged in
10:35heeen: hmm actually
10:35heeen: doesn't powertop do something similar in the tunables page?
10:36heeen: Good Runtime PM for PCI Device NVIDIA Corporation GP108M [GeForce MX150]
10:36heeen: thats it, right?
10:36karolherbst: right, but you also need the same for the bus the GPU is on
10:38heeen: powertop does not say which is the 1c.0 device
10:38heeen: but some of the pcie ports are "good"
10:38heeen: ok so I turned off the screen and am pulling powertop over wifi
10:39heeen: The battery reports a discharge rate of 4.77 W
10:39heeen: The power consumed was 97.5 J
10:39heeen: The estimated remaining time is 10 hours, 8 minutes
10:39heeen: Summary: 218,0 wakeups/second, 0,0 GPU ops/seconds, 0,0 VFS ops/sec and 3,1% CPU use
10:40heeen: changing it back to "on"
10:40heeen: The battery reports a discharge rate of 4.93 W
10:40heeen: The power consumed was 96.6 J
10:40heeen: The estimated remaining time is 9 hours, 49 minutes
10:40heeen: Summary: 226,3 wakeups/second, 0,0 GPU ops/seconds, 0,0 VFS ops/sec and 3,2% CPU use
10:40heeen: that is within the margin of error
10:40heeen: so no conclusive result
10:41heeen: 5.08w it says now
10:41karolherbst: well, you might want to wait a little more
10:41karolherbst: and maybe dmesg yells something
10:42karolherbst: but I would expect something like +5W or something when the GPU is actually on
10:42karolherbst: but the MX150 isn't really drawing much power in the first place
10:43heeen: I toggled all tunables to good
10:43heeen: The battery reports a discharge rate of 2.96 W
10:43heeen: The power consumed was 83.3 J
10:43heeen: The estimated remaining time is 16 hours, 21 minutes
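The powertop readings above can be cross-checked with a little arithmetic: discharge rate times estimated remaining time should give roughly the same battery energy each time, and the relative change between two rates shows how big a difference really is. The sketch below uses only the numbers pasted in this log; the function names are made up for illustration, and treating powertop's estimate as a plain rate-times-time product is an assumption.

```python
def implied_capacity_wh(watts, hours, minutes):
    """Energy in Wh implied by a discharge rate and a remaining-time estimate."""
    return watts * (hours + minutes / 60)


def relative_change(before, after):
    """Fractional change going from one discharge rate to another."""
    return (after - before) / before


# (label, discharge rate in W, remaining hours, remaining minutes) from the log
READINGS = [
    ("GPU on auto", 4.77, 10, 8),
    ("GPU back to on", 4.93, 9, 49),
    ("all tunables good", 2.96, 16, 21),
]

for label, watts, hours, minutes in READINGS:
    print(f"{label}: {implied_capacity_wh(watts, hours, minutes):.1f} Wh implied")
```

All three readings imply about 48 Wh of remaining charge, so the estimates are at least self-consistent. The auto/on difference is only around 3%, which supports the "margin of error" remark, while the drop to 2.96 W after toggling every tunable is close to 40%.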
10:43heeen: karolherbst: ping me if you want me to test something on my machine
10:44heeen: but it has to be built for ubuntu 18.04
10:44heeen: or give me build instructions
10:44karolherbst: kind of depends on if skeggsb_ knows something about that pmu reset issue
10:44heeen: how is the performance of nvidia vs nouveau nowadays
10:44karolherbst: _might_ be that a newer kernel/nouveau fixes that, but dunno
10:44karolherbst: we still have that runpm bug
10:44karolherbst: heeen: depends on the GPU
10:45karolherbst: on pascal, it is bad, as there is no real way to reclock the GPU without signed firmware
10:45heeen: lets say this mx150, or a gtx 970 or 1080
10:45karolherbst: so the GPU is stuck at low perf levels
10:45karolherbst: soo, the mx150 and 1080 are pascal, so there is terrible perf compared to nvidia
10:46karolherbst: on the 970 however, the situation is different. We _could_ set the GPU to the highest perf level, but without signed firmware we can't control the fans
10:46karolherbst: so heat is a problem
10:46karolherbst: on kepler based GPUs where we are able to set the perf level, perf compared to nvidia ranges from 50% to 80%
10:46karolherbst: we miss some important optimizations on maxwell still to be as close there
10:47heeen: I have a bit of graphics background
10:47heeen: why does the performance vary so much - apart from protected apis like reclocking and fans
10:47karolherbst: low level optimizations
10:47heeen: is it the command stream generation thats inefficient?
10:47karolherbst: something like that
10:48karolherbst: there are many things the driver can do to improve performance
10:48karolherbst: currently we don't really care about it, as supporting features and fixing bugs has higher priority right now
10:48heeen: on windows, with games, you could argue with all the optimized paths for each individual game which nouveau obviously won't have
10:48heeen: but with generic gl or vulkan apps
10:48karolherbst: yeah... there is also that problem with the optimized shader stuff or optimized paths for games
10:48karolherbst: well, we don't have vulkan yet
10:49karolherbst: anybody is free to start one though, but this requires some changes in the kernel module
10:49karolherbst: there is some work we could already do before
10:49heeen: are you guys working in your free time or is someone sponsoring the development
10:50heeen: I find it hard to do anything outside of work and the family right now
10:50karolherbst: well... it is a mix of free time devs and devs paid by RH right now
10:50heeen: ah redhat, right
10:50karolherbst: we had one from nvidia at some point which took care of the tegra stuff
10:50karolherbst: and the secboot things
10:50karolherbst: but he left nvidia
10:52heeen: karolherbst: how about a 760, well supported?
10:52karolherbst: 760 should be a kepler one
10:52karolherbst: but I can't really tell how well supported it is
10:53karolherbst: I mean, if this is a board with kind of normal settings, then yeah
10:53karolherbst: I expect it to work and reclock
10:53karolherbst: but there are always random issues here and there
10:53karolherbst: some even kind of board specific
10:54karolherbst: mupuf: ohh btw, is reator ready with the gf119?
11:00karolherbst: nice :)
11:01heeen: https://bugs.launchpad.net/nvidia-drivers-ubuntu/+bug/1784598 karolherbst would you say this bug report is accurate
11:02karolherbst: mupuf: still the 4.14 kernel?
11:04karolherbst: mupuf: pacman -Syu was it?
11:04karolherbst: or how shall I update the kernel
11:06karolherbst: fails anyway
11:12mupuf: Pacman failed? I doubt it, it just boots my compiled kernel first
11:12mupuf: Check grub.cfg and boot the right entry to get a 4.17
13:58karolherbst: imirkin: do you know why we cap HDMI to 297MHz?
13:59karolherbst: the HDMI 1.3 spec allows up to 340MHz, so I am wondering about that
13:59imirkin: yeah, but nothing is built to spec
13:59imirkin: nvidia limits to 297, therefore we do
13:59karolherbst: ohh, okay
13:59imirkin: verified on kepler only, of course
14:00karolherbst: I am mainly wondering where that 297 value comes from, as the evo caps are in x*10 steps only
14:01imirkin: well, it's there in the vbios
14:01RSpliet: PLL limits table?
14:01imirkin: or at least i saw a 29700 (iirc) value in the bios :)
14:01imirkin: and this stuff is listed in 10khz increments
14:01karolherbst: ahh, I see
14:01imirkin: (can you even think of a more natural unit...)
14:02karolherbst: well, the evo stuff is 10MHz units, so that's why I got confused
14:02RSpliet: imirkin: I'm in a country that works with imperial measurements... trust me, I can think of less natural units
14:02imirkin: RSpliet: you should convince them to switch to natural units.
14:02imirkin: oh wait, you're not a physicist. nevermind.
14:03karolherbst: "natural units" are an illusion anyway
14:03imirkin: (natural units measure things in units of c, ħ, and eV)
14:03RSpliet: Surely joule fits in that list as well right...
14:04imirkin: you only need 3 units...
14:04imirkin: doesn't really matter what they are as long as they're sufficiently orthogonal
14:04imirkin: cm/gram/seconds is another popular one
14:05imirkin: aka "cgs"
14:07RSpliet: beats lea/firkin/fortnight
14:11karolherbst: imirkin: I found a 0x7432 in the TMDS info table which is 29746
14:11karolherbst: but mhh
14:12imirkin: karolherbst: there's a "T" bit table iirc
14:12karolherbst: it is part of a 0x4074 which is 16500
14:12imirkin: or something else
14:12karolherbst: 0x80e8 next value which is 33000
14:13karolherbst: last value 0x57e4, which is 22500
14:13karolherbst: odd table
14:13imirkin: which is the limit on fermi iirc
14:13karolherbst: well, this is from a nve7
14:13imirkin: but i didn't find that this table was 100% accurate, so i never made use of it
14:14imirkin: perhaps something indexes into it? dunno
14:14karolherbst: maybe we've gotten some docs about that one
14:14imirkin: we didn't at the time
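The hex-to-decimal conversions quoted above are easy to double-check, and reading the results in the 10 kHz increments imirkin mentions turns them into familiar pixel-clock figures. The sketch below is illustrative only: the helper names are made up, the values are listed in the order they come up in this log (not necessarily table order), and applying the 10 kHz unit to this particular table is an assumption.

```python
# Raw 16-bit values quoted from the TMDS info table of an nve7 VBIOS.
RAW_VALUES = [0x7432, 0x4074, 0x80E8, 0x57E4]


def to_khz(raw, unit_khz=10):
    """Convert a raw table value to kHz, assuming 10 kHz increments."""
    return raw * unit_khz


def format_mhz(khz):
    """Format a kHz value as MHz with two decimals, using integer math only."""
    mhz, rem = divmod(khz, 1000)
    return f"{mhz}.{rem // 10:02d} MHz"


def fits_10mhz_steps(khz):
    """True if the clock can be expressed in the 10 MHz steps of the evo caps."""
    return khz % 10_000 == 0


for raw in RAW_VALUES:
    khz = to_khz(raw)
    print(f"{raw:#06x} = {raw} -> {format_mhz(khz)}, 10 MHz step: {fits_10mhz_steps(khz)}")
```

Under that assumption 0x7432 works out to 297.46 MHz, which is why neither it nor the 29700 value fits the 10 MHz granularity of the evo caps, while 16500, 33000 and 22500 correspond to 165, 330 and 225 MHz.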
14:16nyef: Hunh. I'm more familiar with the furlong/firkin/fortnight system than lea/firkin/fortnight.
14:17karolherbst: imirkin: how sure are you that nvidia capped at 297 and not let's say at 300 or something else?
14:17imirkin: pretty sure.
14:18imirkin: but not extremely sure.
14:18imirkin: try messing with modelines
14:18imirkin: and the blob driver
14:18imirkin: iirc it rejects everything over 297
14:18imirkin: but perhaps it's everything over 297.46
14:18imirkin: unfortunately i don't remember the provenance of this info
14:19imirkin: but i must have thought it was reliable at the time
14:24nyef: imirkin: Yesterday you mentioned the possibility of something not being initialized properly on my nvaf. Is this likely to be a ctxval or something else?
14:26karolherbst: imirkin: okay, I think the last value is some kind of limit as this value seems to change from GPU to GPU
14:27imirkin: nyef: something else
14:27imirkin: memory controller parameters
14:28imirkin: karolherbst: i had many theories, none panned out
14:28imirkin: so i gave up on it
14:29nyef: Hrm. Memory controller parameter issues on an IGP?
14:30imirkin: we had to give it some memory address
14:30imirkin: that it could do what it pleased with
14:30imirkin: i forget exactly
14:30imirkin: but leaving it blank caused all sorts of fail
14:31nyef: Okay, so find things that are nvaf-specific, especially with respect to memory management, see what the observables are, and compare them to an mmiotrace of the blob?
14:38pmoreau: imirkin: Thinking about the failures on NVAC?
14:38karolherbst: imirkin: no nvc0 actually has this table...
14:38karolherbst: super weird
14:40pmoreau: imirkin: Like, this? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.18-rc7&id=e9d91238990d89421315a556a3ba4dbbae35ffbf
14:46linkmauve: Hi, someone I know is asking how well Nouveau supports the GTX460.
14:46linkmauve: My understanding is “pretty well”, but is there any gotcha there?
14:47nyef: mcp89 uses mcp77_ram_new(), so unless the values need to be slightly different this isn't likely to be my issue.
14:49nyef: ... May have spoken too soon.
14:51nyef: Seems the blob is mostly just twiddling 0x100c14, not messing with c18, c1c, or c24.
14:52nyef: So it could be screwing up an already-correct NISO poller configuration?
14:58pmoreau: IIRC, they said that thing was only on NVAA and NVAC, not NVAF
15:00pmoreau: The VBIOS is usually the one setting this up, but that wasn't the case on MacBooks
15:07karolherbst: pmoreau: but isn't nvaf mac only?
15:07nyef: Some digging suggested a few options for PC nvaf boards.
15:07karolherbst: I am sure there aren't
15:07karolherbst: the naming is totally screwed up
15:08karolherbst: GeForce GT 320M == gt216
15:08karolherbst: GeForce 320M == MCP89
15:09nyef: Ah, okay.
15:09nyef: That might be what happened.
15:10pmoreau: karolherbst: Maybe? Dunno
15:10karolherbst: pmoreau: no, this is quite the fact
15:10karolherbst: nvidia announced mcp89 to be specifically made for macs
15:10karolherbst: _maybe_ they changed it later and produced some non mac ones
15:10karolherbst: but I highly doubt that
15:11pmoreau: Ah okay, then yes, most likely a messed up VBIOS :-D
15:12karolherbst: skeggsb_ was convinced that we don't need that mcp stuff for nvaf though
15:12karolherbst: no idea what we want to do here
15:13karolherbst: nyef: you have that .mmu = g84_mmu_new, thing for nvaf
15:13karolherbst: uhm mcp89
15:14karolherbst: if mcp77_mmu_new is set, that stuff just works
15:14nyef: Oh, right. I remember that now.
15:14nyef: The recent .mmu change that fixed an nvaf laptop and didn't break my mini?
15:15karolherbst: yeah... dunno
15:15karolherbst: I am sure I don't need it on my mini as well
15:15karolherbst: but apparently we need it on the macbooks
15:16nyef: Okay, and this puts me back to no angle other than ctxprog/ctxvals.
15:40RSpliet: linkmauve: That's fermi, so... okay. But not quite as well as the generation before or after
15:41linkmauve: Ok, thanks.
18:05karolherbst: imirkin: fyi: we got a bug report about nouveau not doing GL 3.2 and my guess is it is about compat profiles, because overriding helps and glxinfo prints the correct versions. So I might be digging into the compat stuff for real. Mainly wanted to ask if you did look into it at some point and may know what we miss for 3.2, otherwise I will just check what we might be missing
18:05karolherbst: or maybe you already know what we miss
18:05karolherbst: either way
19:23docmax: can somebody PLEASE help me to set up my 3 monitors via xorg.conf, so that i have DP-1 on the left, DP-2 middle and primary, DP-3 right... i'm trying 3 days now without success. the driver is nouveau and my config is here https://lpaste.net/1085927586816589824. THANK YOU!!!
19:37karolherbst: docmax: did you actually verify that it works with xrandr? not that there is some stupid issue and nouveau fails at driving three displays (and falls back to mirroring one or something stupid like that)
19:39karolherbst: docmax: maybe somebody in #xorg or #xorg-devel is able to help
19:45docmax: karolherbst: yes with xrandr i'm able to setup the screens
19:46docmax: and i already am in #xorg
19:46karolherbst: okay, so then it is mainly a config issue
20:28docmax: karolherbst: i found something
21:33docmax: karolherbst: ~/.config/xfce4/xfconf/xfce-perchannel-xml/displays.xml
21:33docmax: karolherbst: overrides xorg.conf