01:06 j2lapoin: my Parabola Linux keeps crashing (nouveau freeze)
01:07 j2lapoin: Dec 10 14:41:52 m81 kernel: nouveau 0000:01:00.0: fifo: write fault at 0000240000 engine 00 [GR] client 0f [GPC0/PROP_0] reason 02 [PTE] on channel 2 [007fb4a000 Xorg[600]]
01:07 j2lapoin: Dec 10 14:41:52 m81 kernel: nouveau 0000:01:00.0: fifo: channel 2: killed
01:07 j2lapoin: Dec 10 14:41:52 m81 kernel: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
01:07 j2lapoin: Dec 10 14:41:52 m81 kernel: nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
01:07 j2lapoin: Dec 10 14:41:52 m81 kernel: nouveau 0000:01:00.0: fifo: engine 6: scheduled for recovery
01:07 j2lapoin: Dec 10 14:41:52 m81 kernel: nouveau 0000:01:00.0: Xorg[600]: channel 2 killed!
01:07 Lyude: use a pastebin next time
01:07 Lyude: (I don't know the answer to your question, btw, just noting)
01:08 j2lapoin: Sorry, was not thinking too much
01:08 Lyude: it's ok :)
01:08 j2lapoin: is there a way to restart it by console?
01:10 gnarface: yes, in theory, if you still have a console to access
01:10 gnarface: are you trying to restart the whole computer or just Xorg?
01:11 j2lapoin: the screen is frozen but it still accepts reisub, so i suppose i can even do it via ssh
01:12 j2lapoin: i keep an eye on the memory, but that's not the problem; it happens without me knowing why... and it doesn't freeze on request so i can't test too often
01:16 j2lapoin: brb
01:23 imirkin: skeggsb: did you see my pings yesterday re the various mmio faults on driver load?
01:23 imirkin: Lyude: ping re my mst issue yesterday ... not sure if you saved off the paste, i can dig it up if you need
01:24 Lyude: imirkin: hey-if you wouldn't mind digging it up yeah
01:24 imirkin: scroll wheel to the rescue!
01:25 imirkin: <imirkin> Lyude: this happened on driver unbind with 4.19.8: http://paste.debian.net/hidden/fb718bb0/
01:25 imirkin: <imirkin> and the initial issue is triggered by a DP 1.1 -> 1.2 switch on the dell monitor
01:26 imirkin: i am, btw, not being a TOTAL jerk in switching DP 1.1 -> 1.2 -- those monitors ship as 1.1, so flipping them live is going to be a fairly reasonable operation
01:26 Lyude: Do you have the full log? There's something else going on there with the core notifiers, and I'd assume that probably has more to do with this null deref
01:27 imirkin: no, i cut it off - it was "more of the same" above though, as i recall
01:27 Lyude: imirkin: I've actually got plenty of dell monitors like that, I'm pretty sure they "hotplug" themselves when switching modes like that
01:27 imirkin: note the timestamps tho
01:27 Lyude: actually, what monitor is this? I might have one
01:28 imirkin: U2415
01:28 imirkin: i might actually have the logs - they should be in my /var/log/messages :)
01:29 Lyude: nice, I've got the older U2414
01:29 imirkin: i also have the much older 2408 :)
01:29 Lyude: imirkin: mind showing me? i'm still pretty convinced the core notify timeout has something to do with it, nvkm_ioctl() shouldn't ever be the spot where we're hitting a null deref afaik
01:30 Lyude: imirkin: ahhhh, they're all terrible displays in their own special ways! :)
01:30 skeggsb: imirkin: i think i missed them somehow
01:30 Lyude: (this is not even close to the first bug I've had to fix with these displays)
01:30 imirkin: Lyude: https://hastebin.com/jowuxugida.apache
01:30 imirkin: skeggsb: have a look at https://bugs.freedesktop.org/show_bug.cgi?id=108980
01:31 Lyude: mhm-that's cut off too
01:31 imirkin: Lyude: no
01:31 imirkin: it's not :)
01:31 Lyude: ahh, that's the whole log that got saved then
01:31 imirkin: like i said, "more of the same"
01:31 imirkin: no
01:31 imirkin: that's all there was in the first place
01:31 Lyude: ahh, right -sorry
01:32 Lyude: think you can check what line in the source code nv50_disp_atomic_commit_core.isra.10+0x2c5 points to?
01:32 imirkin: sure. how do i do that?
01:32 Lyude: sec
01:33 imirkin: gdb nouveau.ko?
01:33 Lyude: imirkin: with $NV_KO being the path to your nouveau.ko module, eu-addr2line -f -e $NV_KO nv50_disp_atomic_commit_core.isra.10+0x2c5
01:34 imirkin: ??:0
01:34 imirkin: ;)
01:34 imirkin: seems like something's missing... how do i tell it where the source lives?
01:35 Lyude: imirkin: there is --debuginfo-path
01:35 Lyude: but that's for separated debuginfo
01:35 Lyude: I don't think it actually relies on needing the original source directory
01:35 imirkin: it's an unstripped module
01:36 imirkin: i see the symbol in gdb
01:36 imirkin: but i can't seem to pass it to "disassemble"
01:36 Lyude: I'm 99.9% sure that command should just work then, it's what I've got in the script I use for decoding stack traces all the time for kernel stuff
01:36 Lyude: hm.....
01:37 imirkin: gr, it does say "no debugging symbols found"
01:38 Lyude: i'm not entirely sure gdb can actually disassemble kernel stuff by default without using some of the plugins in the kernel src dir ./scripts/gdb/
01:38 imirkin: how do you get it to just print out a function though
01:38 Lyude: i might be wrong though
01:38 Lyude: imirkin: you're pointing it at the uncompressed module too right?
01:38 imirkin: ya
01:39 imirkin: gdb sees the symbol names
01:39 imirkin: i can tab-complete nv50_bla
01:39 Lyude: imirkin: tbh I've never had to resolve a relative function address in gdb before
01:39 Lyude: imirkin: can you run `file` on the kernel module and show me what it says?
01:40 imirkin: ok
01:40 imirkin: something about the . kills it
01:40 imirkin: disassemble nv50_disp_atomic_commit works fine
01:40 Lyude: hm, then maybe nv50_disp_atomic_commit_core+0x2c5?
01:41 imirkin: nah
01:41 Lyude: jfyi
01:41 Lyude: drivers/gpu/drm/nouveau/nouveau.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), BuildID[sha1]=4dd93bb9e33ed23b5521b2d83f70b0866e6ab204, with debug_info, not stripped
01:41 Lyude: if it doesn't look like that I think that's your problem
01:41 imirkin: i don't have "with debug info"
01:42 Lyude: CONFIG_DEBUG_INFO=y?
01:42 imirkin: i'd guess not :)
01:46 imirkin: i don't get why i don't get a function listing with disassemble for it though
01:46 imirkin: it works for all the other symbols =/
01:48 imirkin: even when i load the .o file directly!
01:49 imirkin: aha!
01:49 imirkin: https://hastebin.com/iterexubog.xml
01:49 imirkin: i think.
01:50 imirkin: so it's the +709 line
01:51 imirkin: which will be ~impossible to back out
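[Editor's note: the "+709" in the gdb listing is the same location as the "+0x2c5" in the backtrace; gdb's disassemble prints function-relative offsets in decimal, while kernel traces print them in hex. A minimal Python sketch of that conversion follows. The helper name parse_symbol_offset is hypothetical and not part of any tool mentioned in the channel.]

```python
import re

# Hypothetical helper (not from the log): split a kernel backtrace reference
# such as "func+0x2c5" or "func+0x2c5/0x400" into its parts.
_REF = re.compile(
    r"(?P<sym>[\w.]+)\+(?P<off>0x[0-9a-fA-F]+)(?:/(?P<size>0x[0-9a-fA-F]+))?"
)

def parse_symbol_offset(ref):
    """Return (symbol, offset, size); size is None when the trace omits it."""
    m = _REF.fullmatch(ref)
    if m is None:
        raise ValueError(f"not a symbol+offset reference: {ref!r}")
    size = int(m["size"], 16) if m["size"] else None
    return m["sym"], int(m["off"], 16), size

sym, off, size = parse_symbol_offset("nv50_disp_atomic_commit_core.isra.10+0x2c5")
# gdb shows the decimal form of the offset, e.g. <+709>
print(f"{sym} <+{off}>")
```

[The symbol name contains dots because of the compiler-generated ".isra.10" suffix, so the pattern accepts "." inside the name.]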
01:55 Lyude: uuuuuhhhhhhh
01:56 Lyude: was kind of hoping for the line in the C source :p
01:59 imirkin: right.
01:59 imirkin: let's see if i can make some makefile hackery to have this work
02:04 imirkin: Lyude: https://hastebin.com/ilewewales.pl
02:04 imirkin: 1047 nv50_msto_prepare(msto);
02:05 Lyude: ohohoho i bet i know what this is
02:06 Lyude: imirkin: also: what commit is that kernel from specifically? or is it just from that kernel version in the dmesg output you gave me
02:06 imirkin: 4.19.8 + some local patches which don't affect this
02:06 Lyude: alright
02:06 imirkin: (my series which landed in 4.20 which enable hdmi2)
02:08 Lyude: imirkin: they do affect the line offsets though :p (they don't match up with v4.19.8 in git), mind just putting your drivers/gpu/drm/nouveau/dispnv50/disp.c on pastebin?
02:08 imirkin: sure
02:08 imirkin: Lyude: https://hastebin.com/qejeyozidu.cpp
02:19 Lyude: imirkin: yeah, that line doesn't make a whole ton of sense... I think I'm going to need to reproduce this locally, but I don't think it should be too difficult
02:19 Lyude: imirkin: just switch the monitor from 1.1 mode to 1.2 when it's connected right?
02:21 imirkin: Lyude: yes. have a mode set on the monitor, and flip it
02:21 imirkin: fwiw i had another monitor on at the time too (HDMI)
02:21 imirkin: i think.
02:21 Lyude: imirkin: alright-I'll give that a shot when I'm back in the office tomorrow and let you know if I have any problems reproducing it
02:22 Lyude: (about to leave the office now)
02:23 imirkin: wow, late night
02:28 Lyude: Hehe, more of a case of "I n
02:29 Lyude: "I really need to fix my sleep schedule" :p
02:29 Lyude: (phones are hard also)
02:29 imirkin: i hear the new samsung ones are soft
02:29 Lyude: lol
02:40 j2lapoin: Gtx 1050 is supported by nouveau
02:43 HdkR: Is that a question?
02:44 j2lapoin: Yes
02:47 j2lapoin: My gt710 is giving me problems and the next card i can have is a gtx1050
02:48 joepublic: j2lapoin, perhaps you might look here: https://nouveau.freedesktop.org/wiki/FeatureMatrix/ and consider that the gtx1050 is an NV130 Pascal (137 I think)
02:48 HdkR: It's supported far enough that it'll run at idle speeds.
02:48 j2lapoin: Joepublic: can a gtx1050 run on nv130? Right now
02:49 joepublic: gtx1050 can run, yes, right now.
02:50 j2lapoin: With nv130 or with nvidia?
02:51 joepublic: your question makes no sense. nouveau supports the card == "it works"
02:51 j2lapoin: Awesome
02:51 joepublic: it is not going to set any speed records, but there will be a screen to look at.
02:52 j2lapoin: Sound enough
02:52 j2lapoin: Thanks, and i'll be back
02:59 imirkin: skeggsb: so any good ideas about how to conditionalize the vga unlock?
03:02 imirkin: or how to deal with falcon_fini running before the engine is enabled via PMC?
03:21 skeggsb: imirkin: check the pmc bit (nvkm_mc_enabled()) in falcon_fini, and skip
03:22 imirkin: oh, that's cheating. i like it.
03:22 imirkin: what about the vgalock thing?
03:22 skeggsb: *shrug* i was aware of the issue, but decided i didn't care that much, it's not like it causes any actual problem
03:25 imirkin: annoying to get the MMIO error...
03:25 imirkin: is it necessary on nv50+?
03:26 skeggsb: i *think* it was for some reason, but i can't remember the details
03:26 skeggsb: devinit scripts might touch stuff sometimes iirc
03:26 skeggsb: not sure
03:27 imirkin: k
03:27 imirkin: even on recent gpu's? that stinks =/
03:28 skeggsb: not sure about fermi and up at all
03:28 skeggsb: or anything, really :P
03:28 imirkin: :)
10:55 AndrewR: karolherbst, so, I set MESA_SHADER_DUMP_PATH and captured a few shaders ... (from wine 3.21 staging + mafia2). Any use for those, or should I dump more low-level stuff (tgsi, nvir?)
10:55 karolherbst: difficult
10:56 karolherbst: normally it's good to have those and try to optimize against wine's d3d -> gl translation, but with gallium nine all that starts to get kind of weird
10:56 karolherbst: does that even work?
10:56 karolherbst: I mean, would you get d3d shaders being dumped?
10:57 AndrewR: karolherbst, no, I was nable to amkenine work with mafia2 especially ... I can try apitrace, i think.. but resulting file will be big ....
10:57 AndrewR: *unable
10:58 AndrewR: **facepalms ....unable to make mafia 2 work with nine (as in staging 3.21). 3dmark05 was working, as far as I was able to see
10:59 karolherbst: AndrewR: does it crash or is it not a d3d9 game?
10:59 AndrewR: karolherbst, it just quits ....
10:59 karolherbst: ohh, right :/
10:59 karolherbst: yeah dunno what to do about that
10:59 AndrewR: karolherbst, without even showing intro video
11:00 karolherbst: wine or nine devs would be able to help here
11:04 AndrewR: karolherbst, yeah ..still, if the problem lies at the level where shaders are translated to hw ..:/ https://www52.zippyshare.com/v/oDnXKXLT/file.html
11:05 karolherbst: if you don't get an error it shouldn't
11:06 AndrewR: karolherbst, for some reason some older ati demos also fail to start now.. may be -staging-specific regression ...
11:07 karolherbst: maybe
11:07 AndrewR: ..i just saw a lot of activity on #nouveau these days about the compiler part of the driver ....
11:08 karolherbst: well normally that would only affect what you see, but not making an application stop working though
12:37 AndrewR: karolherbst, https://www42.zippyshare.com/v/oY5PKnoN/file.html - this is gallium log file with ST_DEBUG=tgsi
12:38 AndrewR: karolherbst, I forgot my mesa build was without any debug, so recompiled it ....
12:38 karolherbst: AndrewR: looking at the shaders won't help
13:05 AndrewR: karolherbst, https://www28.zippyshare.com/v/HvBAtpdr/file.html - apitrace ....
13:06 karolherbst: AndrewR: there is nothing I can help out with. I know neither the wine code nor the gallium nine stuff. I already suggested talking to them about that.
13:06 AndrewR: karolherbst, no, I mean relative slowness in wine mode, not crash with nine ....
13:06 karolherbst: ohh, I see
13:07 karolherbst: but yeah, the trace would help here more
13:12 AndrewR: karolherbst, not exactly the best source of info, but https://www.phoronix.com/scan.php?page=article&item=nouveau-nvidia-eoy2018&num=2 - see Bioshock graphs (yes, for nvc0/reclocked) ...because I have my own little 'Crysis' here (pun intended), and don't have Bioshock (and my card probably will not run it at all) - I realized I can try with what I actually have - maybe some optimizations are really applicable to both nv50/nvc0 ... (while chances are small..i think)
13:13 karolherbst: mhh
13:13 karolherbst: most of the perf difference isn't directly compiler related
13:13 karolherbst: there are some perf features we don't implement yet like Zcull
13:14 karolherbst: which should make up for 15% of the perf difference or something
13:15 AndrewR: karolherbst, well, 5 fps as a starting point gives little hope for even memory reclocking as a way to make the game playable here (speaking about mafia2 under wine ..because... even 8.5 fps is not my best fps to play ...)
13:15 karolherbst: AndrewR: memory reclocking is quite important
13:16 karolherbst: AndrewR: what is the difference between stock and highest clocks?
13:16 karolherbst: doubled? tripled clocks?
13:16 AndrewR: karolherbst, but can it speed up shader execution like 3 times (if mem clock only goes up from 500 to 850 MHz in my case)?
13:17 karolherbst: mhh most likely not 3 times, but at least 50% is to be expected
13:18 AndrewR: karolherbst, https://pastebin.com/6NKT2r0X
13:18 karolherbst: AndrewR: but are you sure that gallium nine is used?
13:18 karolherbst: because if it prints out glsl files then probably not
13:19 AndrewR: then I was looking at those graphs in the article, and the nvidia driver was running at 169 vs nouveau 62 fps ...
13:19 AndrewR: karolherbst, no, both nine and csmt were disabled via winecfg in my case
13:19 karolherbst: for some games that's true, right
13:20 karolherbst: without nine you can't expect reasonable perf anyway
13:20 karolherbst: but that's not because of the shader compiler being bad, but because we never optimized for performance generally
13:20 karolherbst: and we certainly don't implement all those perf features the nvidia driver has to offer
13:21 karolherbst: using nine should give you at least doubled performance, if not tripled
13:21 karolherbst: and with memory reclocking you could get another 70%
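[Editor's note: the figures quoted above can be checked with simple arithmetic. Going from 500 to 850 MHz is a 1.7x memory clock, which happens to line up with the "another 70%" estimate. The sketch below assumes, purely for illustration, a fully memory-bound workload whose frame rate scales linearly with memory clock; real workloads will usually gain less. The function name clock_gain_percent is hypothetical.]

```python
def clock_gain_percent(stock_mhz, boosted_mhz):
    """Upper-bound speedup in percent, ASSUMING performance scales linearly
    with memory clock (an illustrative assumption, not a measured result)."""
    ratio = boosted_mhz / stock_mhz
    return round((ratio - 1) * 100)

# Clocks quoted in the channel: 500 MHz stock, 850 MHz highest state.
gain = clock_gain_percent(500, 850)
print(f"+{gain}% at best")

# Frame rates quoted from the benchmark article: 169 fps vs 62 fps.
gap = round(169 / 62, 2)
print(f"proprietary driver is about {gap}x faster in that test")
```

[Under that assumption a 3x speedup would need roughly a 1500 MHz memory clock, so reclocking alone cannot close a gap of that size.]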
13:23 AndrewR: karolherbst, I can't recall 3dmark05 going to 50-75 fps from default 22 fps or something at first scene (no AA, AF16, 1440x900)
13:23 karolherbst: what do you mean? with nine?
13:23 karolherbst: well, it depends on the application
13:24 karolherbst: but usually it reduced CPU overhead a lot
13:24 karolherbst: and GPU benchmarks usually have 0 CPU usage
13:24 AndrewR: karolherbst, in my case (with plain wine) I can set the CPU to 1.4 GHz and mafia2 will be as slow as with a 3.8 GHz CPU (x4 cores each time) ..so ......
13:27 karolherbst: I can't help you, I already told you. Either you talk with the nine devs to figure that issue out and know for sure if that helps or not. We could for sure look into performance, but that won't happen today and it won't happen this or the next month. And it will take more than a year to get to a state where there are significant improvements. Get it to run with nine or not, your choice
13:27 karolherbst: or fix the perf issues yourself
13:28 Sarayan: "significant improvements"... that requires vulkan?
13:28 Sarayan: (hi y'all)
13:28 karolherbst: no
13:29 karolherbst: well, with vulkan that all would be better with wine anyway
13:29 karolherbst: but nine is equally good
13:29 karolherbst: there are more serious perf issues
13:29 Sarayan: or you mean nouveau doesn't interact correctly with nine yet?
13:29 karolherbst: for starters, we don't implement zcull, which gives you 15% more perf on average
13:29 Sarayan: (outside of the reclocking/etc issues)
13:29 karolherbst: and there are many other things we might be able to do to improve perf
13:31 karolherbst: anyway, perf is something we can look into if time allows and once we've taken care of more serious and pressing issues
13:31 Sarayan: In my case managing to start the card could improve perf :-) I'll fix it someday I'm sure :-)
15:11 AndrewR: karolherbst, glretrace --list-metrics gives me only a few lines of "nv50_screen_get_param: unhandled cap 181". Is something busted in my build of apitrace/nouveau?
15:11 karolherbst: it's normal
15:11 karolherbst: you can ignore it
15:12 AndrewR: karolherbst, still, I don't understand: the wiki says there must be at least some backend? https://nouveau.freedesktop.org/wiki/PerformanceCounters/
15:37 AndrewR: sorry, will return after sleep
19:00 AndrewR: so, this series was never merged? https://lists.freedesktop.org/archives/nouveau/2015-June/021332.html
19:30 HdkR: "Nouveau Lands Initial Open-Source NVIDIA Turing Support - But No GPU Acceleration" Nice. 10/10
19:33 karolherbst: HdkR: big surprise, I know
19:34 karolherbst: AndrewR: the kernel bits were never merged afaik
19:35 HdkR: At least some work is coming along :)
19:41 AndrewR: karolherbst, https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/nouveau/nvkm/engine/pm/g84.c?h=v4.20-rc6&id=06b7972dc915e60051cd6531d988a7c72645d00a
19:44 karolherbst: hakzsam: any idea why it was never merged? nobody reviewed? Was there something wrong with the patches?
19:45 AndrewR: aha ... https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/drivers/gpu/drm/nouveau/nvkm/engine/sw/nv50.c?h=v4.20-rc6 - missing part from https://cgit.freedesktop.org/~hakzsam/nouveau/commit/?h=nouveau_perfmon&id=a4a78700752291b919b200743768a65c513306a4
20:14 hakzsam: AndrewR: karolherbst, I don't remember to be honest. You might find something on my github account too
20:14 hakzsam: both mesa and kernel parts
20:19 hakzsam: note that global perf counters aren't really useful, they aren't accurate compared to MP counters
20:32 karolherbst: true
20:32 karolherbst: but that's all you have on tesla or are there MP counters as well?
20:55 AndrewR: hakzsam, if you mean this repo https://github.com/hakzsam - then the kernel/mesa repos have a note about moving them to freedesktop.org ....
20:58 hakzsam: yeah, but for nouveau stuff they are up-to-date
21:18 Lyude: imirkin: btw; will try to reproduce that MST issue today
21:57 Lyude: imirkin: ugh, not having much luck reproducing this with my setup
22:05 Lyude: how often does it happen with yours?
22:45 imirkin_: Lyude: happened once
22:46 imirkin_: but since i wasn't trying to crash things ... i didn't try many times
22:47 imirkin_: Lyude: note that i was using nouveau ddx, so no-atomic
22:48 imirkin_: this was with a U2415 plugged into a GM206
22:51 Lyude: imirkin_: atomic is disabled by default in nouveau so I doubt that's it
22:51 Lyude: (either way-legacy goes through atomic 100% of the time with >=nv50)
22:52 imirkin_: right
23:58 Lyude: skeggsb: poke, found a big mst memory leak in nouveau, patches on nouveau ml
23:58 Lyude: well, not sure if i'd describe it as 'big', but i wouldn't call it small
23:59 imirkin_: very small if you don't have any DP ports =]
23:59 Lyude: hehe