08:39karolherbst: an apitrace triggering a concurrency crash, nice
08:41karolherbst: but odd, I thought apitrace does only single threaded stuff
08:45mupuf_: karolherbst: it runs in multiple threads
08:45mupuf_: but one command at a time
08:45karolherbst: ohh I see
08:45karolherbst: but even then, why just 4 replays at the same time :O I would do like 32
08:46karolherbst: but if this indeed triggers those issues reliably, this would help a lot
08:46karolherbst:is still tired
13:25imirkin: karolherbst: a totally different concurrency issue than the one plaguing most plasma folk though, i think
13:25karolherbst: well maybe
13:26imirkin: karolherbst: or if it's the same crash, it's not related to the traces directly but rather to multiple applications running at the same time.
13:26karolherbst: but if this is fixed, we have one issue less
14:49Fervi: Hello. How to diagnose errors associated with Nouveau? Dmesg does not show anything, and maybe my logs solve problems :D
14:50imirkin_: can you describe the issue? and mention what gpu and software you're using?
14:52Fervi: Devuan Ascii (Debian Stretch without systemd) and Quadro FX1800, Kernel 4.9. Yesterday (?) Debian purged the Nvidia 340xx drivers, so i removed them and am using nouveau. It works nicely for "work", but Chromium (~99% of cases), Urban Terror (~10% of cases) and VirtualBox freeze (VirtualBox always, when the VBox drivers in the virtual machine start)
14:53Fervi: dmesg doesn't show anything wrong
14:53imirkin_: ok, so that's a G94.
14:53Fervi: it just freezes and i must kill it (the application) from the console
14:53imirkin_: i believe that virtualbox tends to use multi-threaded GL, which breaks nouveau
14:54imirkin_: so i would recommend running it with LIBGL_ALWAYS_SOFTWARE=1
14:54imirkin_: [as presumably you don't really need the GL accel there]
14:54Calinou: if you want native resolution in a Linux guest, often, you need GL acceleration
14:54Calinou: (> 1024x768)
14:57imirkin_: Fervi: are you sure there's nothing in dmesg? most problems i expect would say something like CACHE_ERROR - there's an undiagnosed issue on all nv50-era chips where they seem to lose their object cache somehow
14:59imirkin_: Fervi: separately, if you're looking to use the board for gaming, you might be able to reclock it by echoing stuff to /sys/kernel/debug/dri/0/pstate. note that this may hang your board, so use with care.
14:59imirkin_: Fervi: also, if you're using KDE5 plasma, that's known to wreak all kinds of havoc on nouveau
15:00Fervi: I'm almost sure, but i can send logs
15:01Fervi: i don't have "/sys/kernel/debug/dri/0/pstate" (maybe because i'm using sysvinit)
15:02imirkin_: or you don't have debugfs or it's not mounted
15:02imirkin_: has nothing to do with your init system
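Whether debugfs is mounted at all can be checked before looking for the pstate file. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, options); `debugfs_mountpoints` is a hypothetical helper name, not part of nouveau or any library:

```python
def debugfs_mountpoints(mounts_text):
    """Return every mount point whose filesystem type is debugfs."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] == "debugfs":
            points.append(fields[1])
    return points
```

On a Linux box this would be called as `debugfs_mountpoints(open("/proc/mounts").read())`. An empty list suggests debugfs is not mounted, in which case something like `mount -t debugfs none /sys/kernel/debug` as root is the usual remedy, provided the kernel was built with debugfs support.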
15:04Fervi: Little (maybe stupid) question - does Nouveau work better with Wayland?
15:05imirkin_: Fervi: unlikely, but you can give it a shot. i've never personally tried it.
15:15pq: Fervi, I wouldn't think Wayland would change anything, certainly not for the better if your app is still an X11 app and you had to run it with Xwayland. That would not use the Nouveau Xorg DDX.
15:21imirkin_: it's been probably 2 years since i've run chromium on a nv50-era board, but it used to work fine...
15:24whompy: I think I've run it on kernel 4.9 and mesa 13 with no issues on my nv50
15:26imirkin_: whompy: which nv50 do you have btw?
16:45whompy: Whoops, it's actually an NVA5. GT216
16:45imirkin_: whompy: right, i figured you didn't have a G80 - no one does :)
16:46whompy: I actually used to in my desktop, but it has since died.
16:46whompy: I believe it was, at least. Quite some time ago.
16:47imirkin_: yeah. i think most of the actual G80's are on the trash heap by now
16:48whompy: Most of my remaining components seem to be becoming rarer as well. I fear the forced upgrades due to hardware failure may be imminent.
18:54nullbyte_: how can i fix bug with nouveau for nvidia geforce gtx 970 4gb, on various linux boot ready for installation or after then?
18:54nullbyte_: a bug with blank screen
19:03Fervi: Try CTRL + ALT + F1 and later CTRL + ALT + F7
19:03Fervi: sometimes working :D
19:06NanoSector: are there any plans on nvidia's side to start supporting PRIME that any of you know of?
19:09nullbyte_: Fervi: doesn't help :)
19:12nullbyte__: i know there is a patch
19:12Tomin: isn't there some workaround that allows to boot with 4GB GTX 970? some kernel parameter
19:13nullbyte__: Tomin: nomodeset in grub and then the proprietary driver from geforce.com
19:13nullbyte__: Tomin: but you can't boot in default with any linux os
19:13nullbyte__: you need the nomodeset kernel param
19:14Tomin: what is the problem then?
19:15nullbyte__: multiple boot with OSs
19:16Tomin: so you don't know how to use nomodeset with grub?
19:19Tomin: can you get to the grub boot menu (list of os to boot)? press e (I think, there are instructions at the bottom) on the option you want to change and then add the parameter to the end of the line that starts with linux. then press ctrl+x to boot. this is temporary, but should allow you to install nvidia drivers
19:21Tomin: I think there is some "lesser" option that disables 3d acceleration or something from nouveau and allows to use it with that card instead of disabling the whole driver with nomodeset
19:41imirkin_: nullbyte_: GTX 970 + 4GB is broken on nouveau. you can boot with nouveau.modeset=0 to help your installer along.
19:42nullbyte_: imirkin: and never can be fixed?
19:42imirkin_: nullbyte_: no, it can be fixed. but the present state is broken.
19:42nullbyte_: thank you imirkin
19:44imirkin_: nullbyte_: there's a bug about it here: https://bugs.freedesktop.org/show_bug.cgi?id=94990
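After booting with one of the parameters mentioned above (`nomodeset` or `nouveau.modeset=0`), it can be worth confirming that the kernel actually received it. A small sketch that scans a kernel command line string; `modeset_params` is a hypothetical helper, and the example command line in the usage note is made up:

```python
def modeset_params(cmdline):
    """Return the modesetting-related parameters present on a kernel command line."""
    found = []
    for token in cmdline.split():
        # Match the generic switch and the nouveau-specific one.
        if token == "nomodeset" or token.startswith("nouveau.modeset="):
            found.append(token)
    return found
```

On a running system the input would come from `open("/proc/cmdline").read()`. For a command line such as `root=/dev/sda1 ro nouveau.modeset=0 quiet`, only the `nouveau.modeset=0` token is returned.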
20:11karolherbst: I bet that widget trace won't crash for me
20:15imirkin_: nothing crashes for you :)
20:16karolherbst: some things do
20:16karolherbst: like TF2 crashed for me randomly
20:16karolherbst: and with randomly I mean after 4-5 hours on average
20:17imirkin_: oh, apparently mareko fixed that
20:17imirkin_: i think it was the LOD thing
20:17karolherbst: "timeout at /home/karol/Dokumente/repos/nouveau/drm/nouveau/nvkm/subdev/ltc/gf100.c:49/gf100_ltc_cbc_wait()!"
20:17imirkin_: or rather, the too many samplers thing
20:17imirkin_: which probably wreaked all kinds of havoc
20:17karolherbst: yeah, actually I tried to reproduce a crash
20:17karolherbst: guess what
20:18karolherbst: this is fun
20:18karolherbst: spawning that trace 8 times: https://gist.github.com/karolherbst/2e98d5b3714084c68011bb13c92307ec
20:19imirkin_: you managed to invoke ltc's wrath
20:19karolherbst: it still worked though
20:19karolherbst: just super slow
20:19karolherbst: and I think one or two windows displayed garbage
20:20karolherbst: the gpu is messed up for real
20:20karolherbst: that was fast
20:20karolherbst: it still does stuff
20:22karolherbst: looks like the buggers are totally rigged
20:23karolherbst: glxgears works
20:23karolherbst: but it stays black for a second
20:24karolherbst: sounds like we do something wrong
20:29karolherbst: who needs this ltc wait anyway
20:30karolherbst: getting EVICTED_CB and ILLEGAL_COMPSTAT
20:30karolherbst: without the wait those processes continue their work without issues
20:30karolherbst: 18 processes spawned, so let's see when a concurrency issue hits
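The parallel-replay experiment (8, then 18 copies of one trace) could be scripted with a small launcher. This is only an illustration: `spawn_parallel` is a hypothetical helper, and the `glretrace widget.trace` command in the comment assumes apitrace's replay tool and a placeholder trace file name.

```python
import subprocess


def spawn_parallel(cmd, count):
    """Start `count` copies of `cmd` at the same time and return their exit codes."""
    # Launch everything first so the processes really overlap...
    procs = [subprocess.Popen(cmd) for _ in range(count)]
    # ...then wait for each one and collect its return code.
    return [p.wait() for p in procs]


# Hypothetical usage (trace name is a placeholder):
#   codes = spawn_parallel(["glretrace", "widget.trace"], 8)
```

A non-zero or negative entry in the returned list marks a replay that failed or was killed by a signal, which is a cheap way to notice a crash among many windows.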
20:42karolherbst: so let's try to fix that LTC error first
20:42karolherbst: what's the LTC?
20:43imirkin_: large table compression? just a guess.
20:53skeggsb: Level Two Cache
20:53karolherbst: ohh nice
20:53karolherbst: something kind of important
20:55karolherbst: skeggsb: short version: we got a trace which messes the LTC up for me
20:57karolherbst: ~130 calls per frame
20:57karolherbst: that looks simple enough
20:59karolherbst: skeggsb: any idea how to debug this?
21:07orbisvicis: hey, so I'm the guy who reported failure-to-initialize-video-out on resume against a gk208 card
21:07orbisvicis: with debug logs at: curl -s https://paste.fedoraproject.org/501132/81132565/raw/ | base64 -d | bunzip2
21:08imirkin_: skeggsb: you should take a look --^
21:17orbisvicis: yeah, I'd appreciate that. Eventually I plan to switch back to a non drm-next kernel
21:34orbisvicis: is there enough information in that log, anyway ?
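The paste above is a bzip2-compressed log wrapped in base64, hence the `base64 -d | bunzip2` pipeline. The same decoding in Python, as a sketch using only the standard library (`decode_paste` is a hypothetical name, and fetching the paste is left out):

```python
import base64
import bz2


def decode_paste(raw):
    """Undo base64, then bzip2, mirroring `base64 -d | bunzip2`."""
    return bz2.decompress(base64.b64decode(raw))
```

`raw` is the downloaded paste body as bytes or ASCII text; the result is the original log as bytes.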
21:40RSpliet: karolherbst: do you know what CB stands for?
21:40RSpliet: Command Buffer?
21:41imirkin_: or callback, depending on context
21:41RSpliet: okay, so not Command Buffer :-D
21:41imirkin_: that's not a thing
21:41imirkin_: pushbuf is a command buffer
21:42imirkin_: usually it's a fifo, or a ring, or something like that.
21:42imirkin_: i don't think i've ever seen "command buffer" anywhere
21:42RSpliet: it was an uneducated guess, but glad that's not terminology. It's bad enough to evict a constbuf, let alone a pushbuf
21:43imirkin_: i think in radeon-speak, CB is also color buffer
21:44RSpliet: but I don't see how that can be an erroneous state, shouldn't that be mapped into virtmem like anything else such?
21:44imirkin_: but i don't think nvidia ever uses that term
21:44imirkin_: nothing on nvidia (other than the mmu) can bypass the VM
21:45imirkin_: [well, PGRAPH has its own interface, but it respects the PTE's authoritay]
21:46imirkin_: among other things, you can have sparse buffers
21:46imirkin_: (ARB_sparse_buffer iirc)
21:47imirkin_: 4 5/31/2016 Jon Leech Change dependency from GL 1.5 + ARB_vbo to
21:47imirkin_: GL 4.4 (Bug 14117).
21:47imirkin_: a minor little version bump :)
22:22skeggsb: orbisvicis: can you grab a debug with nouveau.debug=disp=trace,i2c=trace drm.debug=0x14 ?
22:23skeggsb: grab a log*
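The `drm.debug=0x14` value requested above is a bitmask of DRM debug categories. A rough decoder, assuming the commonly documented bit assignments for the drm module's debug parameter (these are an assumption here and should be checked against the kernel version in use):

```python
# Assumed bit assignments for the drm.debug module parameter.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """List the debug categories enabled by a drm.debug bitmask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]
```

Under those assumptions, 0x14 selects the KMS and atomic modesetting messages, which fits a request aimed at a display that fails to come back on resume.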
22:23skeggsb: karolherbst: er, not really, beyond making sure we're allocating the right number of compression tags, clearing the right ones, etc etc
22:28karolherbst: skeggsb: https://gist.githubusercontent.com/karolherbst/a5390fe2a1e79c172ed618910f3cfd51/raw/c7b4caff8463eb513b0b11b12bb31f2f90f311eb/gistfile1.txt
22:28imirkin_: karolherbst: that wasn't for you
22:28karolherbst: ohhh :O
22:34skeggsb: karolherbst: btw, any chance multiple processes are doing allocations etc at the same time?
22:34karolherbst: yes :D
22:34karolherbst: I run that trace like 8 times in parallel
22:35skeggsb: try adding a mutex around the nvkm_ltc_tags_clear() call in nvkm/subdev/mmu/gf100.c gf100_vm_map()
22:35skeggsb: i suppose just doing it in nvkm/subdev/ltc/base.c is a better idea actually..
22:36skeggsb: the allocation is protected already, in a non-obvious way, but the bashing of the clear regs isn't
22:36skeggsb: (it is in a rework branch i haven't rebased yet)
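The point about the unprotected "bashing of the clear regs" is the classic multi-step register sequence problem: if two callers interleave their writes, the hardware is triggered with a mix of both requests. The toy model below is not the kernel patch and not nouveau code; `FakeClearEngine` and its register names are hypothetical, and it only illustrates holding one lock across the whole sequence.

```python
import threading


class FakeClearEngine:
    """Toy stand-in for a block that is programmed through several registers."""

    def __init__(self):
        self.regs = {"start": 0, "limit": 0}
        self.cleared = []             # ranges the "hardware" actually cleared
        self.lock = threading.Lock()  # serializes the whole register sequence

    def clear(self, first, count):
        # Hold the lock across every step, so another thread cannot
        # overwrite "start"/"limit" before the trigger consumes them.
        with self.lock:
            self.regs["start"] = first
            self.regs["limit"] = first + count - 1
            # "Trigger": the engine acts on whatever the registers hold now.
            self.cleared.append((self.regs["start"], self.regs["limit"]))


def run_clears(engine, requests):
    """Issue every (first, count) request from its own thread; return cleared ranges."""
    threads = [threading.Thread(target=engine.clear, args=r) for r in requests]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(engine.cleared)
```

With the lock in place, every requested range shows up exactly once in the result regardless of scheduling. Dropping the `with self.lock:` line would allow a thread to trigger a clear using another thread's start or limit value, which is the kind of corruption the mutex is meant to rule out.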
22:36imirkin_: skeggsb: how long has this been an issue? "ttm: wait for bo fence to signal before unmapping vmas"
22:37karolherbst: skeggsb: I see
22:37karolherbst: skeggsb: patch?
22:37skeggsb: imirkin_: since amd pushed the ttm changes, whatever kernel that happened in.. lemme see
22:37skeggsb: karolherbst: there's *zero* chance of that applying, it's a big rework :)
22:37imirkin_: skeggsb: well, main point is ... please mark it for stable ;)
22:38skeggsb: imirkin_: yeah, already planned to
22:38imirkin_: k cool
22:39skeggsb: looks like 4.8, i think
22:39imirkin_: skeggsb: and the problem presents as "more page faults than usual"?
22:40skeggsb: page faults under memory pressure, when eviction is happening
22:40skeggsb: i triggered it with streaming-texture-leak
22:40skeggsb: (after some other mesa changes that were accidentally preventing it)
22:40imirkin_: heh, well that one tends to kill things for me pretty often
22:41skeggsb: karolherbst: https://paste.pound-python.org/show/dTGCnjM11SYRfJdYMNhD/
22:41karolherbst: skeggsb: like this? https://gist.github.com/karolherbst/93c4d38d75176a7b2c1b8cc282cc8eba
22:41karolherbst: there is a mutex already
22:42karolherbst: looks good I would say
22:42skeggsb: it fixes the issue?
22:42karolherbst: I have 16 processes running now
22:42karolherbst: yeah, I would say this fixes it
22:43karolherbst: feels much more stable now
22:43skeggsb: cool, i'll push that later on
22:45karolherbst: one concurrency issue less, progress!
22:45skeggsb: i'm working on fixing mesa btw
22:45karolherbst: I wanted to look into this : https://bugs.freedesktop.org/show_bug.cgi?id=92077#c23
22:45karolherbst: that is what triggered that ltc issue for me
22:46skeggsb: i've triggered all sorts of weird shit in the last week :P
22:47imirkin_: skeggsb: you see why i've been avoiding the issue :)
22:47karolherbst: I didn't
22:48karolherbst: no idea why, but I hardly hit any of those reported issues
22:48karolherbst: even the qtwebengine things
22:48karolherbst: no crashes
22:48karolherbst: even run entire plasma desktop prime offloaded, still no crashes
23:08orbisvicis: skeggsb: curl -s https://paste.fedoraproject.org/505551/84003148/raw/ | base64 -d | bunzip2
23:13karolherbst: uhh look what I've found, please add it to your link list :D https://developer.nvidia.com/sites/default/files/akamai/designworks/docs/NVIDIA%20Capture%20SDK%20Programming%20Guide.pdf
23:13Fervi: does nouveau have something like screensaver (power off when program is not used)?
23:16imirkin_: not really nouveau's place for that... it does support DPMS & co, so when requested, it'll turn off the screen
23:16Fervi: it is something a little different, but strange
23:17Fervi: maybe something with VSync ...
23:18skeggsb: orbisvicis: hrm, weird... did you have this bug prior to drm-next ?
23:18Fervi: OTClient freezes, but when I go to the console (CTRL + ALT + F1) and return to X - it works
23:19Fervi: LiquidSky too: the screen is not refreshing, but when i move the mouse over the window - it works for a few sec
23:20imirkin_: Fervi: oh, sounds a lot like a bug that happened with Xorg 1.19
23:20imirkin_: Fervi: the claim is that it got fixed, but i'm not so sure. there have been additional reports
23:20imirkin_: Fervi: are you using Xorg 1.19 or 1.18.4? if 1.19, can you downgrade to 1.18.4 to see if that improves the situation?
23:21Fervi: X.Org X Server 1.19.0
23:21Fervi: hmm, it is a little difficult, but maybe i can run an older Debian in a chroot? Is that a good idea? :D
23:22imirkin_: there was a workaround for that issue too...
23:22imirkin_: trying to remember what it was
23:22imirkin_: Fervi: it either happened with DRI3 or it happened with DRI2 but not both. mind pastebinning your xorg log?
23:23orbisvicis: skeggsb: it is this particular hardware combination
23:23orbisvicis: skeggsb: http://superuser.com/questions/1153248/causes-failure-to-reinitialize-video-out-on-resume#
23:23imirkin_: Fervi: ok, looks like it's only reporting DRI2. let's try with dri3 - add a Option "DRI" "3" to your device section.
23:23skeggsb: "Problem exists under Windows 7 as well."
23:24skeggsb: orbisvicis: are there any vbios updates for your board?
23:25orbisvicis: skeggsb: years ago I was running a 7300 card, iirc suspend/resume worked correctly for that card
23:26orbisvicis: let me check
23:26orbisvicis: *about vbios
23:26Fervi: imirkin_: how can i generate an xorg file?
23:29imirkin_: Fervi: look around google :)
23:30Fervi: ok, stupid question :D I got it
23:30imirkin_: not really stupid... just long to answer and i'm lazy
23:31Fervi: ok, relog
23:32orbisvicis: umm, I'm not really sure if a newer vbios is available, or if there is a non-sketchy way to get them. Mine is:
23:32imirkin_: orbisvicis: what's your exact board?
23:33skeggsb: given that windows fails, the proprietary linux driver fails, and nouveau fails - i'd say your board is somehow broken, or there's a vbios bug
23:33orbisvicis: pny geforce gt 730, the GK208 version with 1GB GDDR5
23:33imirkin_: low-profile? http://www.pny.com/gt-730-1024mb-gddr5 ?
23:34orbisvicis: yes, part# VCGGT7301D5LXPB
23:34Fervi: imirkin_: http://meshnet-users.tk/paste/p/erx.log
23:34Fervi: I'm not sure if it is working properly
23:35imirkin_: [ 51492.854] (**) NOUVEAU(0): Allowed maximum DRI level 3.
23:35imirkin_: Fervi: looks good
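The log line quoted above is what confirms that the `Option "DRI" "3"` setting took effect. Pulling that value out of an Xorg log can be sketched as follows; `max_dri_level` is a hypothetical helper and the pattern is based only on the single line shown in this conversation:

```python
import re

# Pattern modelled on: (**) NOUVEAU(0): Allowed maximum DRI level 3.
DRI_LEVEL_RE = re.compile(r"NOUVEAU\(\d+\): Allowed maximum DRI level (\d+)")


def max_dri_level(log_text):
    """Return the maximum DRI level nouveau reports in an Xorg log, or None."""
    match = DRI_LEVEL_RE.search(log_text)
    return int(match.group(1)) if match else None
```

A result of `None` only means the line was not found; other versions of the DDX may word the message differently.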
23:38imirkin_: orbisvicis: hm, i don't see anything.
23:38imirkin_: [in terms of a bios update]
23:39imirkin_: Fervi: are you still experiencing the issues?
23:39orbisvicis: yeah techpowerup had an older 80.28.78.00.06 vbios
23:39orbisvicis: is it worth downgrading ?
23:39Fervi: imirkin_ dunno, wait a sec :P
23:39imirkin_: orbisvicis: could well be a diff gpu, or diff configuration. i wouldn't.
23:40orbisvicis: yeah you're right, it doesn't support ddr5
23:40imirkin_: skeggsb: thing is... the board starts up fine ... and we run the vbios, just like it's run on boot by the oprom...
23:41skeggsb: yep, i'm well aware :)
23:41imirkin_: should work :)
23:41imirkin_: orbisvicis: are you booting via bios or uefi?
23:41orbisvicis: so these debug logs were .. normal ?
23:41orbisvicis: imirkin_: BIOS
23:42imirkin_: so much for blaming it on the UEFI boogeyman
23:42skeggsb: orbisvicis: not normal, but it doesn't *look* like we're doing anything wrong, display just isn't processing commands
23:42imirkin_: skeggsb: did you ever figure out how to undo the thing nvidia blob does to display?
23:43skeggsb: no, i haven't bothered
23:45orbisvicis: maybe it's worth trying: unplug display, suspend/resume, reconnect display ?
23:46orbisvicis: also it is an older motherboard, about 10yrs older than the gpu, but there shouldn't be any incompatibility
23:48Fervi: it looks like it's working now - thanks imirkin_
23:49orbisvicis: yeah no
23:49orbisvicis: kinda screwed myself over on any sort of return policy
23:49orbisvicis: ah well
23:50imirkin_: Fervi: awesome.
23:50imirkin_: ajax: it's baaaaaaaaaack --^
23:54orbisvicis: skeggsb imirkin_: anyway, thanks for looking over the logs, for the help
23:57orbisvicis: maybe tomorrow, or in a few days, I'll test the card in a newer computer. It's a long shot, but I hope it will work fine on suspend there
23:59orbisvicis: if so, I'll be back ...