03:53scientes: So debugfs crashes (not the GPU) whenever I try to write to pstate
03:53scientes: i looked at the relevant nouveau code but no luck there
04:01imirkin: scientes: is the gpu powered down when you're doing this?
04:01imirkin: if it's being auto-suspended, it has to be up when you reclock
04:01imirkin: and that reclock only lasts until it shuts down again
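[Editor's sketch] The point above (the GPU has to be awake when pstate is written) can be checked from userspace before attempting a reclock. The following is a minimal sketch, not nouveau tooling: the PCI address, the DRM index in the debugfs path, and the helper names `gpu_is_active` / `set_pstate` are assumptions for illustration and will differ per machine.

```python
from pathlib import Path

# Assumed locations: the PCI address and the dri/<N> index vary per machine.
RUNTIME_STATUS = Path("/sys/bus/pci/devices/0000:01:00.0/power/runtime_status")
PSTATE = Path("/sys/kernel/debug/dri/0/pstate")


def gpu_is_active(status_path: Path = RUNTIME_STATUS) -> bool:
    """Return True when runtime PM reports the device as powered up."""
    return status_path.read_text().strip() == "active"


def set_pstate(level: str, status_path: Path = RUNTIME_STATUS,
               pstate_path: Path = PSTATE) -> bool:
    """Write a performance level to pstate only if the GPU is awake.

    Returns False, and writes nothing, when the GPU is not active.
    """
    if not gpu_is_active(status_path):
        return False
    pstate_path.write_text(level + "\n")
    return True
```

Run as root, `set_pstate("0f")` would refuse to write while the card is auto-suspended; as noted above, a successful reclock still only lasts until the next power-down.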
04:06scientes: hmm, you're right
04:06scientes: i'm amazed Civilization IV could run on just i915....
04:07scientes: and that the cpu could overheat ... :/
04:08imirkin: well, the GT218 isn't exactly a super powerful gpu either
04:15imirkin: and i assume you have a sandybridge? that's not entirely crap...
04:28scientes: hey i
04:28scientes: i'm poor
04:31imirkin:has a NV5 plugged in
04:31scientes:has a emu10k1
04:32imirkin: we should get together and build the awesome computer of 1999
04:32imirkin: i was very sad when i threw my sblive out
04:32scientes: it still works with linux
04:32scientes: it didn't work with Vista, which got me to switch to Linux
04:32imirkin: but ... onboard audio ... and no more pci slots
04:33scientes: and it was great through the rough pulseaudio conversion
04:33scientes: cause it has hardware mixing
04:33imirkin: i do miss the 32 hw-mixed channels though
04:33imirkin: hah. pulse. i take the same approach with it as i did with esd. kill -9.
04:34scientes: it makes the cheap current hardware work however
04:34scientes: and it's better than android's audioflinger
04:34imirkin: dunno. alsa's enough for me.
04:34scientes: alsa can only handle one app
04:35imirkin: i believe you're misinformed.
04:35imirkin: probably by the pulse fanboys
04:36scientes: i thought it could only handle one app unless you have hardware mixing
04:36imirkin: alsa can't do audio-over-the-network. that's pretty much the only time you need pulse.
04:36imirkin: it has software mixing options.
04:36imirkin: [it probably shouldn't, but ... o well]
04:39imirkin: oss required hw mixing for multiple applications
04:39imirkin: which was like the #1 reason i got an sblive in the first place
05:29jayhost: What's the newest NV card that has Good nouveau perf
06:13gnarface: jayhost: i think i heard 780 Ti?
08:41qdel: Hi, is there any way to 'overclock' the gpu core voltage?
08:42qdel: Sensors reports +1.02 while it should run at ~1.12
08:42qdel: The gpu is crashing under heavy load using nouveau. while ok under nvidia driver. Only difference seems this.
08:45gnarface: depends on which card it is
08:46gnarface: reclocking support is missing for many of them
08:48qdel: gt750m. nve7 gpu
08:48gnarface: oh i can't tell you off the top which ones they are
08:49qdel: by default the card is stuck to lowest state. I can make it go higher playing with dri/pstate.
08:49qdel: 'mobile - optimus - buggy combo ;) '
08:50gnarface: https://nouveau.freedesktop.org/wiki/FeatureMatrix/ i think it's power management?
08:54qdel: Mostly possible. Seems WIP according to https://nouveau.freedesktop.org/wiki/PowerManagement/
08:55qdel: I am no expert in kernel drivers, but i know my C well and i can recompile the kernel easily. If a developer wants to test, i am open :P
09:00pmoreau: qdel: Which kernel are you running?
09:01pmoreau: Karol had a couple of reclocking fixes, for Kepler cards among others, that made it into 4.10.
09:01qdel: Tried a couple, was on 4.13. I gave a shot to 4.14-rc5, still same problem.
09:01pmoreau: Ah, I think Karol would be interested to hear about that, but he’s not here yet.
09:01pmoreau: karolherbst ^
09:03qdel: Ok, i will wait for him :) i will prepare a kernel build on this laptop (sluggy cpu)
09:20karolherbst: imirkin: pulse allows you per application sound level mixing, regardless of if the application allows you to control the volume or not
09:20karolherbst: and pulse allows you to switch the output on the fly without needing to restart the application (if the application doesn't support this either itself)
09:21karolherbst: and these are two very good reasons to use pulse over plain alsa
09:21karolherbst: well, if you need it, but I actually do need both of those features
09:21karolherbst: qdel: what kernel version?
09:22karolherbst: ohh, 4.13/4.14
09:22karolherbst: qdel: I have the perfect tool to figure out if we mess up
09:23qdel: I can use your perfect tool when you want :)
09:23karolherbst: you need to run on nvidia for this
09:24karolherbst: buh... why is the fuc code broken... wait a second
09:29qdel: Ok, reinstalling nvidia driver. Hope it will work, as bbswitch is badly broken on this laptop (suspend makes a total apocalyptic disaster, but at boot it should work)
09:30karolherbst: mwk: fyi, 5fde30a2684041f9820aa9dc4fbd0009a45076a9 broke compilation of current PMU code within nouveau
09:30karolherbst: qdel: you can disable bbswitch entirely
09:31karolherbst: mwk: when I find some time, I will add a compile check for current nouveau fuc code to travis-ci...
09:35karolherbst: mhhh, we use movw on gk208
09:35karolherbst: mwk: is movw not available there anymore?
09:36karolherbst: but mhh, yeah, that makes sense... I thought I fixed something related to this for kepler2
09:41karolherbst: qdel: git repository: https://github.com/karolherbst/nouveau.git
09:41karolherbst: qdel: branch: awesome_tool
09:41karolherbst: run make in the top dir
09:55RSpliet: karolherbst: I'm pretty sure GT218 reclocking is spot on
09:55RSpliet: However, I never ran the code with RunPM
09:55karolherbst: RSpliet: yeah. I am super sure that the weird readings came from the GPU being suspended
09:56karolherbst: I already wrote all the fixes for that
09:56karolherbst: except reading out clocks
09:56karolherbst: I think
09:56karolherbst: not super sure about this
09:57RSpliet: That's where the magical 108MHz comes from. Presumably the sctl value is 0xffffffff or something similar due to the card being suspended
09:57karolherbst: that makes sense
09:58karolherbst: I think I fixed reading out the clock state on suspend, because on my branch I get 0 as the result
09:59karolherbst: or maybe not...
09:59karolherbst: clk->func->read is called directly
11:15qdel: karolherbst: How do i use your tool? (it was a bit a pain to put back this driver in place... forgot about diversions xD)
11:17karolherbst: qdel: did it compile?
11:17qdel: karolherbst: I am sure of one thing: using gputest / furmark on nvidia i reach 92 degrees with clock at gpu: 1163 mem: 1800. It is stable and i reach this temperature very fast (less than a minute)
11:17karolherbst: as root: bin/nv_cmp_volt
11:17qdel: Using nouveau, it never passes 75c. frequency is a bit lower but not hugely
11:18karolherbst: and only when running something on nvidia
11:18karolherbst: but it should print out stuff
11:18karolherbst: and this I want to see, let it run for 30 seconds or so
11:19qdel: failed to create device, -12
11:19qdel: (under sudo)
11:21karolherbst: qdel: did you run something on the nvidia GPU?
11:21karolherbst: mhh ohhh wait. let me check something
11:21qdel: karolherbst: I launched furmark
11:24karolherbst: qdel: you need to boot with iomem=relaxed
11:25karolherbst: otherwise that tool isn't allowed to read out the GPU state
11:25qdel: Ok, doing this
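[Editor's sketch] Whether the machine was actually booted with `iomem=relaxed` can be confirmed by looking at the kernel command line. A minimal sketch, assuming the parameter is read from `/proc/cmdline`; the helper name `has_kernel_param` is hypothetical.

```python
def has_kernel_param(cmdline: str, name: str, value: str) -> bool:
    """Return True if `name=value` appears as a whole token on the command line."""
    return f"{name}={value}" in cmdline.split()
```

For example, `has_kernel_param(open("/proc/cmdline").read(), "iomem", "relaxed")` should be true after rebooting with the parameter added.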
11:28qdel: nouveau 0000:01:00.0: bios: unable to locate usable image
11:28karolherbst: buh.... :/
11:28karolherbst: weird though
11:29karolherbst: qdel: do you need any special parameters for nouveau?
11:30qdel: Nope, i use nothing
11:30qdel: only "quiet" :)
11:31karolherbst: bin/nv_cmp_volt.c: 54
11:31karolherbst: replace "error" with "trace"
11:31qdel: For nvidia / bbswitch i had a couple in the past. But nothing work anymore.
11:31karolherbst: run again
11:35karolherbst: okay, fun
11:35karolherbst: I guess your vbios is usually in ACPI
11:36qdel: Can we retrieve it?
11:37qdel: possibly due to optimus. This is the kind of laptop where the nvidia gpu is connected to 0 screens (dp/hdmi are also linked to the intel)
11:38karolherbst: yeah, I just need to change something inside that tool
11:39karolherbst: ahhh crap
11:39karolherbst: we don't compile those userspace tools with acpi support
11:41ylwghst: nvidialegacy340 up n running
11:41ylwghst: i made it work even on nixos
11:41ylwghst: what can i provide you?
11:43karolherbst: qdel: did you extract the vbios.rom file with nouveau?
11:44qdel: karolherbst: Not manually. I used the package from debian
11:44karolherbst: the vbios.rom is something different
11:45karolherbst: there shouldn't be a vbios.rom package
11:45karolherbst: qdel: I meant the vbios.rom file near the pstate file
11:46qdel: I can switch back to nouveau to extract it
11:46karolherbst: yeah, that would be perfect
11:46karolherbst: then we can tell the tool to just use that file
11:46karolherbst: and I would like to take a look at the vbios anyway
11:46qdel: No problem, doing this
11:48karolherbst: interesting, open source trello alternative: https://wekan.github.io/
11:57qdel: karolherbst: meeeh nvidia driver is loading even if i blacklist it -_-'
11:59karolherbst: qdel: ;)
11:59karolherbst: yeah, it sucks
11:59karolherbst: best plan is to adapt to the insanity and just install both at once
12:00ylwghst: karolherbst: https://www.dropbox.com/sh/w0g1ix6t0z2m2ct/AAAUe0_dHaclsmHVgVATYOO6a?dl=0&preview=first.png
12:00qdel: I installed both but i wanted to blacklist nvidia to load nouveau (and opposite)
12:01ylwghst: karolherbst: there are 3 pictures
12:01karolherbst: ylwghst: yeah, but we need an mmiotrace
12:01karolherbst: ylwghst: https://wiki.ubuntu.com/X/MMIOTracing
12:02karolherbst: qdel: X config?
12:02karolherbst: ohh wait
12:04ylwghst: karolherbst: where should i share the logs then?
12:05karolherbst: ylwghst: doesn't matter. compress it with xz -9 and put it anywhere.
12:05karolherbst: you can mail it
12:11karolherbst: ylwghst: firstname.lastname@example.org
12:18qdel: gneee needed to blacklist nvidia-current-drm and nvidia-current-modeset
12:19qdel: karolherbst: http://qdel.ddns.net/bordel/gt750m-vbios.rom
12:22karolherbst: qdel: mhh okay, looks fine
12:34karolherbst: qdel: now when running nvidia, start the tool with "-c NvBios=$path_to_vbios"
12:34karolherbst: just set a proper path
12:39qdel: 1181250, 1175081, -6169, 99.477757, 10, 59, 79
12:39karolherbst: this is under load, right?
12:40qdel: furmark still, you want with glxgears?
12:42karolherbst: it's odd, because we seem to calculate the right voltage
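[Editor's sketch] The sample line pasted above is internally consistent: the third field is the difference between the first two, and the fourth is their ratio in percent. The log does not say what the columns mean, so the sketch below only assumes that the first two fields are two voltage figures being compared (plausibly in microvolts, which would put them around 1.18 V); the function name `check_sample` is hypothetical.

```python
def check_sample(line: str) -> tuple[int, float]:
    """Recompute the difference and percentage from the first two fields.

    Assumption: fields 0 and 1 are the two values being compared; the
    meaning of the remaining fields is not stated in the log.
    """
    fields = [f.strip() for f in line.split(",")]
    first, second = int(fields[0]), int(fields[1])
    return second - first, second / first * 100.0
```

Applied to the line from the log, this reproduces the -6169 difference and a ratio of about 99.48 %, i.e. the two values agree to within roughly half a percent, which matches the remark that the right voltage seems to be calculated.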
12:42karolherbst: now wondering what bad happens on nouveau
12:42karolherbst: qdel: did you actually try to change the pstate?
12:43karolherbst: it might be that the default clocks are just unstable
12:43qdel: By default, the card is always running at low clocks. When i change to 'high' i use either '0f' or 'AUTO'.
12:44qdel: It ramps up to ~960Mhz
12:44qdel: Using glxgears it seems stable
12:44qdel: Using furmark it doesn't go at full speed
12:44qdel: Using unigine valley, it crashes after ~5 seconds
12:55qdel: karolherbst: possibly another solution
12:55qdel: using nvidia-smi i checked gpu / mem freq
12:55qdel: it reported 1163/1800
12:55qdel: Here, using nouveau, pstate report 966;2002
12:57qdel: If the values are correct, seems ram is heavily overclocked on nouveau :)
12:59qdel: And using your tool on nouveau, i am correctly at 1.02v. so i am facing 3 problems: gpu freq ramps up more slowly on nouveau
13:00qdel: possibly because of this the card doesn't ramp up to full voltage
13:00qdel: ram clock seems to be too high by default on nouveau
13:00karolherbst: 1800 mem clock?
13:00qdel: i can recheck using nvidia-smi or nvidia-settings
13:01karolherbst: but even the nouveau values don't seem right
13:01imirkin: 1800 is pretty common for ddr3
13:02qdel: This is a weird laptop with 4gb of integrated slow ram with a gpu not powerful enough to use it ;)
13:02karolherbst: it's gddr5
13:02imirkin: huh ok
13:02karolherbst: and the vbios reports even 7GHz
13:02karolherbst: for 0f and 0e
13:02karolherbst: wait a second
13:02karolherbst: maybe I got the wrong vbios
13:03karolherbst: got the wrong one
13:03qdel: My laptop is *not* under the ice =)
13:03imirkin: they all look alike
13:03karolherbst: pwm voltage...
13:04karolherbst: vbios reports 2GHz mem clock
13:04karolherbst: which should be normal for those 750m though
13:04karolherbst: qdel: can you give me a screenshot of the nvidia-settings perf mode table?
13:04qdel: But seems the manufacturer uses slower ram
13:04qdel: yes, wait a few ;)
13:05karolherbst: imirkin: now I am wondering if there is a flag to reduce the mem clock...
13:06karolherbst: no voltage control for ram
13:08qdel: Ok, seems nvidia-smi show max freq
13:09qdel: I made this screenshot with glxgears in background
13:09qdel: with furmark, the gpu speed goes to 1163
13:10karolherbst: nouveau should do 1600 on 0a, right?
13:10karolherbst: qdel: can you try if the 0a pstate is stable on nouveau?
13:10karolherbst: there is no difference between 0a and 0f, except the memory clock is a bit higher
13:11karolherbst: qdel: also, you can increase the core clocks with nouveau.config=NvBoost=1 or =2
13:11karolherbst: but this might cause overheating issues
13:12karolherbst: skeggsb: ever found something like this?
13:17qdel: Meeeh.... the laptop crashed 3 times in a row when setting 0a before running anything
13:18karolherbst: you should run something and then set the clocks
13:18karolherbst: reclocking while the GPU is off isn't fixed yet, patches still need reviewing
13:20qdel: Ho, ok ;)
13:20qdel: Ran unigine, froze after ~10s
13:20qdel: but laptop is still responsive
13:20karolherbst: does dmesg report anything?
13:22qdel: quite a lot, i'll send it through pastebin
13:24karolherbst: ohh, okay
13:24karolherbst: that's most likely not reclocking related
13:24karolherbst: qdel: on 0f the machine crashed for real?
13:25qdel: Note that after this, the laptop is still working and i can even run glxgears with the nvidia
13:25karolherbst: or maybe it is, but this looks like something less severe
13:25karolherbst: yeah, most likely something in the gl code
13:25qdel: No, in fact when the asynchronous wait happens, the laptop is frozen
13:26karolherbst: qdel: would you like to play around a bit more on the 0a perf level and see if there are any real crashes compared to 07 or stock?
13:26karolherbst: qdel: well yeah, but that's intel waiting on nouveau
13:26qdel: it unfreezes a bit after, then freezes for a couple of seconds, i think during the kernel stack trace
13:26qdel: then it is back ok
13:26karolherbst: that's the part where nouveau kicks the client out
13:26qdel: the whole 'freezing stuttering' takes ~25s
13:26karolherbst: and intel syncs until nouveau responds
13:26karolherbst: it's ugly for users, but... yeah
13:27karolherbst: at least the machine survives
13:27qdel: No problem, it is how optimus work
13:27qdel: Yes. For this, i am happy ;)
13:27qdel: I can run furmark on 0a
13:27karolherbst: so the bug in nouveau is, we don't detect that those 2GHz are wrong
13:27karolherbst: and that unigine bug could be fixed by using nvidia's context switching firmware
13:29karolherbst: qdel: okay, so you can say for sure that 0a runs better than 0f?
13:32qdel: karolherbst: not for the moment laptop is running
13:33karolherbst: well, I will try to see if I can figure something out from your vbios related to that memory clock. Thanks for your help so far.
13:33karolherbst: not quite sure what I can do about the unigine hang without access to hardware
13:33karolherbst: or somebody else
13:34imirkin_: karolherbst: check if it happens to you? :)
13:34qdel: karolherbst: no problem. for the moment laptop is still running.
13:35qdel: I could give you an access to the hardware if you need it.
13:36karolherbst: imirkin: it doesn't
13:36karolherbst: imirkin: well maybe with the same mesa it would
13:36karolherbst: qdel: what version of mesa do you have installed?
13:37qdel: karolherbst: seems laptop just crashed
13:37qdel: i will wait a few to see if it goes back to life
13:42qdel: Last error http://qdel.ddns.net/bordel/IMG_20171025_153937.jpg
13:42qdel: machine was hard frozen
13:44qdel: mesa 17.2.3
14:27qdel: karolherbst: indeed, using the nvboost command line option makes the gpu go up to 1162 Mhz. Voltage also went higher, to 1.18. And the laptop crashed with glxgears
14:27qdel: using 0a
14:37karolherbst: qdel: yeah, we don't do quite good on higher clocks as well.
14:37karolherbst: was it with 2?
14:38karolherbst: with 1 it should be more stable
14:40qdel: karolherbst: indeed it was with 2. making a test with 1 in a couple of seconds
14:43qdel: Hummm still a crash really fast. I can give you some dmesg log if you would like
14:44qdel: Frequencies were 1084 / 1600 a bit before the crash
18:14ylwghst: karolherbst: hello
18:15ylwghst: i get echo: write error: invalid argument
18:15ylwghst: when i try echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
18:15ylwghst: any ideas?
18:19karolherbst: check dmesg
18:21ylwghst: can nouveau be loaded?
18:22karolherbst: ylwghst: cat /sys/kernel/debug/tracing/available_tracers
18:23ylwghst: mmiotrace isnt there
18:24karolherbst: maybe you can modprobe mmiotrace?
18:24karolherbst: or something like that
18:24ylwghst: module not found
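[Editor's sketch] The check suggested above (look in `available_tracers` before trying to select a tracer) is easy to script. A minimal sketch, assuming debugfs is mounted at `/sys/kernel/debug`; the helper name `tracer_available` is hypothetical. As far as I know, mmiotrace is controlled by a kernel build option (`CONFIG_MMIOTRACE`) rather than shipped as a loadable module, which would explain why modprobe finds nothing.

```python
from pathlib import Path

# Assumed debugfs mount point for the tracing directory.
TRACING_DIR = Path("/sys/kernel/debug/tracing")


def tracer_available(name: str, tracing_dir: Path = TRACING_DIR) -> bool:
    """Return True if `name` is listed in the kernel's available_tracers file."""
    listed = (tracing_dir / "available_tracers").read_text().split()
    return name in listed
```

If `tracer_available("mmiotrace")` is false, writing `mmiotrace` to `current_tracer` can be expected to fail with the "invalid argument" error shown above.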
18:31l4mRh4X0r[m]: Hey, I'm getting a lot of `nouveau 0000:01:00.0: gr: ILLEGAL_CLASS ch 6 [007f617000 java] subc 0 class 0000 mthd 2390 data 00000000` in my dmesg output, and the program seems to freeze. Any idea what the problem is?
18:32karolherbst: l4mRh4X0r[m]: my guess would be multithreading messup. somebody works on it. it's complicated to fix though
18:33l4mRh4X0r[m]: Alright, cool. It's been happening for a while (the freezing), so I figured I'd dig into it a bit deeper. But good to hear people are working on it already :)
18:35karolherbst: well it's only my assumption, but the most likely one right now
18:36l4mRh4X0r[m]: Is there any way I can help? Providing debug information, other stuff, etc?
18:38ylwghst: there isnt mmiotrace module in my kernel
18:38karolherbst: ylwghst: :/ sad
18:39l4mRh4X0r[m]: karolherbst: sounds like a plausible cause actually, since sometimes it doesn't happen :)
20:14karolherbst: oh nice, "KHR-GL45.arrays_of_arrays_gl.SubroutineFunctionCalls1" fails for something related to doubles
20:14karolherbst: that's a lead
21:06newgentoouser: Hey, I have a problem. When I in/out of fullscreen a few times (3-5), everything freezes and I have to go into a TTY and kill the application. I think this is a relevant line from syslog: "kernel: [ 1914.372935] nouveau 0000:01:00.0: disp: 0x5da8: INIT_GENERIC_CONDITON: unknown 0x07". RAM/CPU usage seems to be normal.
21:07newgentoouser: Firefox and mpv are affected, only ones I've tried
21:07newgentoouser: Firefox being Youtube
21:14gnarface: state your kernel version, mesa version, and card model
21:15newgentoouser: gnarface: GTX 770M, Linux 4.12.12, mesa 17.0.6
21:15gnarface: (not because i can actually help you, but because nobody who actually can will even respond if you don't include that information)
21:15newgentoouser: Fair enough
21:15duttasankha: I was wondering if someone could explain the difference in functionality among PFFB, PMFB and PBFB. From envy I came to know about these different kinds of MC and also about their MMIO locations. But I am curious about their functions.
21:16gnarface: someone here knows that too, but it's not me
21:16gnarface: newgentoouser: just curious though, does it happen still if you disable opengl rendering in firefox? that could be important evidence
21:17newgentoouser: gnarface: Is that hardware acceleration? Or what's the config name?
21:18duttasankha: the lookup didn't provide me much help either...
21:18karolherbst: imirkin: you have a kepler2 GPU, right? one with 256 regs?
21:19gnarface: newgentoouser: yea, hardware acceleration. i forget where you disable it. try looking in about:config
21:19newgentoouser: gnarface: Disabled it, still happened :(
21:19karolherbst: imirkin: mind checking if "KHR-GL45.arrays_of_arrays_gl.SubroutineFunctionCalls1" passes on my cts branch on it?
21:20gnarface: newgentoouser: what's your window manager? does it have compositing? if so, try it with compositing disabled too
21:20newgentoouser: gnarface: fluxbox, it does, I'll disable it and try
21:21newgentoouser: gnarface: That fixed it, hah. So I guess it's a bug in compton then?
21:25gnarface: newgentoouser: not necessarily. i couldn't say for sure what the bug is actually in, but now you've at least confirmed that it must involve opengl somehow.
21:25gnarface: so it could be a bug in compton, firefox, nouveau, OR mesa
21:25gnarface: (but it's probably a bug in nouveau)
21:26gnarface: (mind you, also nothing says it can't be TWO bugs that only ever appear together)
21:26newgentoouser: gnarface: should I post a bugreport somewhere with the data I have? Which is rather limited
21:26gnarface: (that last one is pretty rare though)
21:27gnarface: newgentoouser: yea there's a bug tracker somewhere... linked from channel topic i think?
21:27gnarface: newgentoouser: yea, there's bug reporting info on nouveau.freedesktop.org
21:28newgentoouser: gnarface: Found it, thanks for the help mate
21:28gnarface: no problem, good luck with it
21:32newgentoouser: For anyone interested, I've narrowed it down to a blur kernel setting in compton
21:32gnarface: that's actually very interesting
21:33gnarface: make sure to include that in the bug report
21:33gnarface: karolherbst, imirkin you might want to take a look at this one if you don't already know about it
21:43newgentoouser: Apologies, I was too hasty, it had nothing to do with the blur, it was actually --unredir-if-possible. I don't even know why I use it (it's an old config from a previous system). Makes more sense too since it specifically affects fullscreen windows.
21:44newgentoouser: Can't force a reproduction without that setting though, even with all the other ones active. So I guess that's that as for the cause.
21:46gnarface: it might help someone trace the problem
21:46gnarface: just being able to reliably reproduce the bug is key, a lot of the time
22:43Lyude: i'm finally, finally done with my part of eglstreams, aside from cleaning up patches. thank god.
22:43karolherbst: Lyude: nice!
22:43Lyude: now i can go back to nouveau for real :)
22:44ylwghst: karolherbst: is this how should mmiotrace.log output look https://gist.github.com/ylwghst/fcff340bc59e7cd4c3d55c251b77d191 ?
22:44karolherbst: ylwghst: yeah
22:45karolherbst: Lyude: I am quite sure if you post some updated patches on the ML, I will take a look next week :D
22:45ylwghst: karolherbst: how long should i trace it?
22:45Lyude: hehe, hopefully I should get the time to. no other <redacted> stuff to do for red hat either
22:46karolherbst: ylwghst: I think actually not so long, because it should be part of the driver init
22:46karolherbst: ylwghst: you could start the trace, load nvidia, start X, do glxgears or something, and then stop
22:47ylwghst: it has 1GB already
22:47karolherbst: ylwghst: well, use xz -9
22:47karolherbst: Lyude: hihi
22:48karolherbst: Lyude: I really don't know what I _have_ to do next month, so currently I go ahead and just write a todo list and will work on that
22:50karolherbst: :D tweet of the day: "Sorry, we just don't think Linux is a big enough market. Instead, we redesigned the whole game for the four people with a Vive headset."
22:50Lyude: karolherbst: tbh, usually the only time i have to do stuff is when there's bugs. sometimes we need something in a thing that's not there and i'm one of the people who doesn't have enough job responsibilities on their plate at the time to end up being the one who does it
22:51Lyude: red hat doesn't consider upstream work a waste of time so that helps a lot :)
22:52karolherbst: well on my todo list is passing CTS, so that's something I guess
22:53Lyude: if I can get kevin to easily say yes to me working on powergating I'm sure you'll be able to find a lot of time for that :P
22:53Lyude: the only fun project I haven't gotten him to say yes to thus far that I've actually been seriously interested in has been biopenly
22:53Lyude: but to be honest, I can't blame him there
22:54karolherbst: ohh and I want to build a nouveau CI ! very important
22:59ylwghst: karolherbst: which options xz ?
23:00karolherbst: ylwghst: ? what do you mean
23:03ylwghst: you've sent xz with some options
23:03ylwghst: xz -z
23:03karolherbst: xz -9 file
23:05ylwghst: the trace has 1.7 GB uncompressed
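[Editor's sketch] For reference, the `xz -9 file` step can also be done from Python's standard `lzma` module, which writes the same .xz container format. A minimal sketch; the helper name `compress_xz` is hypothetical, and unlike the `xz` command line tool it leaves the original file in place.

```python
import lzma
import shutil
from pathlib import Path


def compress_xz(src: Path, preset: int = 9) -> Path:
    """Compress `src` to `<src>.xz` at the given preset and return the new path.

    The input is streamed in chunks, so a multi-gigabyte trace is not
    loaded into memory at once. The source file is kept.
    """
    dst = src.with_name(src.name + ".xz")
    with open(src, "rb") as fin, lzma.open(dst, "wb", preset=preset) as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Something like `compress_xz(Path("mmiotrace.log"))` would produce `mmiotrace.log.xz` next to the original; mmiotrace logs are repetitive text and usually shrink considerably.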