00:29 tarragon: question, does mesa and libdrm have nouveau improvements or it's only in the kernel module?
00:31 imirkin_: depends what kind of 'improvements' you're looking for
00:31 tarragon: also an update on my rudimentary experiment. xterm fast output also pushes steady 80% cpu usage with amdgpu but 2% under tmux. Whereas with nouveau fast terminal output was 30% cpu usage.
00:31 imirkin_: 3d accel is all in mesa
00:31 tarragon: got it.
00:31 imirkin_: but e.g. clock setting is in kernel
00:32 tarragon: are there special kernel or env options for performance or turn on things that aren't default?
00:32 imirkin_: btw, according to someone who knows these things, xterm fonts are drawn all using CPU, no acceleration, with EXA
00:32 imirkin_: if you have a gpu that reclocks, boosting isn't enabled by default
00:33 imirkin_: [what GPU do you have?]
00:33 tarragon: gk104
00:35 tarragon: is xterm retarded or something??!!!
00:36 imirkin_: xterm doesn't do any drawing at all
00:36 imirkin_: X server is what draws the fonts
00:37 imirkin_: and allegedly that's not hooked up to EXA
00:37 imirkin_: so the CPU usage is probably from the CPU drawing to VRAM directly
00:37 imirkin_: or moving buffers back and forth, dunno
00:38 tarragon: wow, dunno what to think, whether that's supposed to be remarkable or insane ...
00:38 imirkin_: anyways, if you're already making use of reclocking, you can get a bit more by booting with nouveau.config=NvBoost=1
00:38 imirkin_: it's supposed to be "it's hard to optimize every imaginable use-case"
00:38 imirkin_: [simultaneously]
00:39 tarragon: won't brick the card?
00:39 imirkin_: not unless while typing it you happen to spill your soda on the board
00:53 tarragon: thanks
04:22 _xvilka_: imirkin_: tarragon: yes, there is a terminal that uses the GPU for rendering - alacritty
04:22 imirkin_: i was talking specifically about xterm.
04:23 _xvilka_: imirkin_: yes, I noted. Posted an example of alacritty, so if he wants - he can compare
04:23 _xvilka_: *noticed
07:38 karolherbst: tried to compile valgrind-mmt: configure: error: please use gcc >= 3.0 or clang >= 2.9 or icc >= 13.0, though.. with gcc-7
07:44 karolherbst: buhh. gcc_version is 7 and we fail to handle that, meh
07:45 karolherbst: gcc -dumpversion returns just 7
07:45 karolherbst: but my gcc is 7.2.1
07:48 karolherbst: https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00567.html
07:48 karolherbst: *sigh*
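The failure described above comes down to a version check that expects a dotted version string while `gcc -dumpversion` prints only `7`. A minimal sketch of a more tolerant check follows; it is not the valgrind-mmt configure logic, and the function names `parse_compiler_version` and `gcc_is_supported` are hypothetical.

```python
import re


def parse_compiler_version(text):
    """Parse '7', '7.2' or '7.2.1' into a (major, minor, patch) tuple.

    Missing components default to 0, so a bare major version such as the
    '7' printed by `gcc -dumpversion` is still accepted.
    """
    match = re.fullmatch(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def gcc_is_supported(text, minimum=(3, 0, 0)):
    """Return True when the parsed version is at least `minimum`."""
    return parse_compiler_version(text) >= minimum
```

Comparing tuples instead of matching on the string shape means a compiler that reports fewer components is not rejected by accident.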
12:58 karolherbst: Lyude: the annoying part with bisecting this issue are those jumps back to 4.12....
13:22 karolherbst: Lyude: https://gist.githubusercontent.com/karolherbst/16f2ffa1b56d66de405f74bec33404e8/raw/604e93e620418e24545de62c890d153dde22ba66/gistfile1.txt
13:23 karolherbst: I should skip less
13:25 jolar2: imirkin_: Anything I can do to aid in debugging https://bugs.freedesktop.org/show_bug.cgi?id=101778 or is it a dead end (or perhaps low priority)?
13:25 karolherbst: jolar2: I am currently debugging a P50
13:26 jolar2: karolherbst: does it have the same issue (docking station specific I think)?
13:26 karolherbst: not quite sure
13:26 karolherbst: I could check on Monday
13:26 karolherbst: the docking station is at the office
13:26 karolherbst: jolar2: I put it on my todo list for Monday
13:26 karolherbst: no promises though
13:27 jolar2: karolherbst: Nice to hear. Both docking stations and docking via thunderbolt connector causes the same kind of kernel oops.
13:27 karolherbst: jolar2: did you test 4.13?
13:28 jolar2: karolherbst: Yes I am running 4.13.9
13:28 karolherbst: okay, same issue I assume?
13:28 jolar2: karolherbst: yup
13:28 karolherbst: okay
13:29 jolar2: It only happens when an external display is connected to the docking station.
13:29 karolherbst: jolar2: so you simply disconnected from the docking station and it oopses?
13:29 karolherbst: is any display connected and how?
13:29 karolherbst: ahh
13:29 karolherbst: which port?
13:29 jolar2: no it is when docking it happens
13:29 karolherbst: okay
13:29 jolar2: I'm pretty sure any port is affected.
13:29 jolar2: I have not tested VGA though.
13:29 karolherbst: so, DP, HDMI and DVI?
13:30 jolar2: yes
13:30 karolherbst: well most of it should use DP MST under the hood anyway
13:30 jolar2: HDMI through the laptop is fine
13:30 karolherbst: yeah, it is most likely related to DP MST
13:30 jolar2: but not HDMI via the docking station
13:30 jolar2: ok
13:30 karolherbst: jolar2: does your display have a setting for DP 1.2?
13:30 karolherbst: like can you disable 1.2 so that the display uses 1.1?
13:30 jolar2: karolherbst: no idea
13:31 imirkin: jolar2: are you the first guy or the second guy?
13:31 imirkin: [in that bug]
13:31 jolar2: imirkin: the second guy
13:31 imirkin: then file a separate bug with full information
13:31 jolar2: oh
13:31 karolherbst: jolar2: give me the link then though, so that I don't forget to check that as well
13:31 imirkin: unless you've root-caused your issue and the other guy's issue and determined that they're identical
13:32 jolar2: I am fairly sure it is the same issue.
13:32 jolar2: karolherbst: https://bugs.freedesktop.org/show_bug.cgi?id=101778
13:32 imirkin: also, iirc the mstm stuff got some fixes
13:32 imirkin: so perhaps update?
13:32 karolherbst: imirkin: well 4.14 is basically broken anyhow
13:32 imirkin: you never mention what kernel you're on or provide any logs
13:32 jolar2: imirkin: it is 4.13.9
13:33 karolherbst: jolar2: I meant the new one if you create one
13:33 jolar2: karolherbst: ofc.
13:33 jolar2: sry
13:34 jolar2: imirkin: Since I get the same failure and stack trace, and the same model and graphics card, and we both observe the failure when using a docking station when using hybrid graphics, I think it is the same issue.
13:35 imirkin: jolar2: ok. make sure to add yourself as a cc on that bug if you haven't already
13:35 karolherbst: jolar2: the vbios could be different ;) but yeah, I am also quite sure it might be the same
13:35 jolar2: imirkin: ok
13:35 karolherbst: hopefully mine won't be too different
13:35 jolar2: imirkin: done
13:35 imirkin: either way, i don't know squat about all this DP stuff =/
13:36 jolar2: if it is different, I may perhaps assist you
13:36 karolherbst: imirkin: well, me neither, but I have time to dig into that :D
13:36 imirkin: i'm just generally annoyed by people showing up on bugs and saying "everything about my setup is different, but i'm convinced that my issue is identical to this one, so i'll just say 'me too'"
13:37 karolherbst: well but if I can reproduce it on my machine, it might be easy to fix? maybe not, will find out mondays
13:37 imirkin: esp when that issue is "random hang"
13:37 karolherbst: true
13:37 imirkin: [which is not the case here]
13:37 karolherbst: but those are users, users make mistakes
13:37 jolar2: I understand
13:38 jolar2: karolherbst: yeah I am kind of hoping the P50 is affected as well
13:38 karolherbst: jolar2: it is a quadro m1000m and I probably have the same dock
13:38 karolherbst: so chances are high
13:38 jolar2: yeah
13:39 imirkin: Jul 13 15:04:27 kernel: [ 22.891014] RIP: 0010:drm_fb_helper_add_one_connector+0x17/0xd0 [drm_kms_helper]
13:39 imirkin: jolar2: you also get that?
13:39 karolherbst: but first I need to figure out what we screwed up in 4.14
13:39 jolar2: imirkin: yeah
13:39 imirkin: eeeeenteresting
13:40 karolherbst: imirkin: maybe some odd race condition?
13:40 imirkin: Jul 13 15:04:27 kernel: [ 22.763897] [drm:drm_helper_hpd_irq_event [drm_kms_helper]] [CONNECTOR:47:DP-1] status updated from disconnected to disconnected
13:40 imirkin: Jul 13 15:04:27 kernel: [ 22.826406] [drm:drm_helper_hpd_irq_event [drm_kms_helper]] [CONNECTOR:54:DP-2] status updated from disconnected to disconnected
13:40 imirkin: Jul 13 15:04:27 kernel: [ 22.890425] [drm:drm_helper_hpd_irq_event [drm_kms_helper]] [CONNECTOR:61:DP-3] status updated from unknown to disconnected
13:40 imirkin: my guess is the fact that the last one is going from 'unknown' to 'disconnected' means we're hot-adding DP connectors wrong?
13:41 karolherbst: maybe
13:41 jolar2: i.e. the resource is freed and we use NULL pointing stuff?
13:41 imirkin: jolar2: or the resource isn't allocated :)
13:41 imirkin: basically with DP, connectors can come and go
13:41 imirkin: well, with DP-MST
13:42 imirkin: the dock is a DP-MST thing, often
13:42 jolar2: nv50_mstm_register_connector being called has to do with DP-MST I guess
13:42 imirkin: anyways, don't think of this as low priority... think of nouveau as having very few able developers.
13:43 jolar2: and not much help from nvidia, from what I've heard
13:43 jolar2: funny thing is... the nvidia driver does not work very well with this setup either...
13:44 imirkin: they do their part -- make things harder for us to do over time.
13:44 imirkin: since normally it's too easy
13:44 karolherbst: :D
13:44 karolherbst: one way to see it
13:45 jolar2: :D
13:49 karolherbst: whelp... 4.13 works 4.13-rc1 is broken and 4.14-rc1 is broken
13:51 karolherbst: ohh different stack
14:25 jolar2: imirkin: I added some debug lines and printed the connector information in nv50_mstm_register_connector. Chronologically I get DP-3 going from unknown to disconnected, and then when the first oops would occur (safe guarded by my if-clause now) the connector is [CONNECTOR:69:DP-4].
14:43 jolar2: what may be wrong here is that in the nvidia case, when it works (like 1 time out of 10), their logs indicate that there are ports called DP3.1 and DP3.2
14:44 imirkin_: jolar2: ah ok. that makes even more sense... those drm prints happen AFTER the connector is added
14:44 imirkin_: yeah, nvidia calls them as DPN.M
14:44 imirkin_: while the upstream way is to just add connectors with increasing numbers DP-N
14:44 jolar2: I see
14:45 jolar2: I know next to nothing about the DP technology... but it seems a bit complicated since it allows daisy-chaining of displays etc
14:46 imirkin_: it's a networking protocol
14:46 imirkin_: you can have DP hubs
14:46 imirkin_: and it detects loops iirc
14:46 jolar2: I see
14:46 jolar2: anything more advanced than serial buses are complicated to me
14:46 jolar2: :)
14:47 imirkin_: even serial is surprisingly complicated
14:47 imirkin_: esp the w1 stuff
14:47 jolar2: not like usb!
14:47 jolar2: don't know about w1
15:06 imirkin_: single-sire
15:06 imirkin_: single-wire
15:07 jolar2: ok, that should be complicated
15:50 mlankhorst: huh? how come nouveau doesn't expose DRIVER_ATOMIC? :p
15:53 mlankhorst: skeggsb: ^
15:54 imirkin_: nouveau.atomic=1
15:54 imirkin_: my guess is it was enabled due to fear of the gathering darkness
15:57 * mlankhorst runs IGT out of curiosity's sake
15:59 imirkin_: wasn't*
16:02 mlankhorst: that's what IGT is for. ;)
16:06 mlankhorst: [0339/1189] skip: 268, pass: 32, dmesg-warn: 2, fail: 37 \
16:06 Lyude: karolherbst: I saw your post, but I'm not sure I understand it?
16:06 Lyude: did the bisect take you into i915?
16:06 imirkin_: mlankhorst: looks like a good start
16:07 imirkin_: or is that where it hung? :)
16:07 mlankhorst: didn't hang yet
16:07 imirkin_: mlankhorst: note that there's an important fix
16:07 imirkin_: not sure what kernel you're on
16:07 mlankhorst: drm-tip
16:07 imirkin_: mlankhorst: https://github.com/skeggsb/linux/commit/d324c5bc462d354d337dcf3a14ffd0eb17b4fa38
16:08 mlankhorst: yeah looks useful
16:11 Lyude: karolherbst: poke
16:12 karolherbst: hey!
16:12 Lyude: hey, gonna resend the message that I think you missed from the dc
16:12 Lyude: karolherbst: I saw your post, but I'm not sure I understand it? did the bisect take you into i915?
16:12 karolherbst: well kind of
16:12 karolherbst: but
16:12 karolherbst: it is weird
16:12 karolherbst: 4.13-rc1 is buggy as well
16:12 karolherbst: soooo
16:13 Lyude: wa? 4.13.9 is most definitely functional on here
16:13 karolherbst: exactly
16:13 karolherbst: 4.13.0 as well
16:13 Lyude: hm.
16:13 karolherbst: I will try to figure out what fixed it
16:13 karolherbst: but
16:13 karolherbst: there are like two related to the same cause
16:14 karolherbst: currently doing my third bisect
16:14 Lyude: alright, I'm currently building v4.14-rc1 since I'm curious how that acts
16:15 Lyude: If there's some difference between stable and mainline we might need to figure out another good/bad commit, probably start from the branchpoint where v4.13 mainline becomes stable
16:15 karolherbst: yeah
16:15 karolherbst: that's what I am doing right now
16:16 karolherbst: most likely a bad merge or bugfix not backported
16:16 karolherbst: but the former is more likely
16:16 karolherbst: can't imagine how we have lost a bugfix from an rc
16:18 imirkin_: what are you guys debugging exactly?
16:18 karolherbst: crash on 4.14
16:18 Lyude: specifically with nvf0
16:18 karolherbst: imirkin_: happens on rmmod for example
16:18 Lyude: the one I'm getting just happens on boot
16:18 karolherbst: and on suspend as well I think
16:18 Lyude: are you sure we're seeing the same issue?
16:18 karolherbst: Lyude: no
16:19 karolherbst: but mine is display related
16:19 imirkin_: ok, so you guys are talking about totally different things
16:19 imirkin_: cool.
16:19 Lyude: yeah i don't think I realized that
16:19 karolherbst: mine is this: https://gist.githubusercontent.com/karolherbst/2d9d28dbcb5c92ea6931d5ca69b5f621/raw/a25d88e38fd08c7885313f6b38ec2371c53406db/gistfile1.txt
16:20 Lyude: hold on
16:20 Lyude: i've seen that
16:20 karolherbst: ;)
16:20 imirkin_: yeah, that looks familiar
16:20 imirkin_: very familiar.
16:20 Lyude: i'm trying to remember which card I've seen that with
16:20 karolherbst: mine is a gm107
16:20 Lyude: that's the one
16:20 imirkin_: yeah we fixed that.
16:20 karolherbst: well
16:20 karolherbst: it's back
16:20 imirkin_: the vblanks were messed up
16:20 imirkin_: hold on
16:21 karolherbst: I even tried master afaik
16:21 karolherbst: it got fixed in 4.13
16:21 karolherbst: but it is back on 4.14-rc whatever is the newest I tried
16:21 imirkin_: https://github.com/skeggsb/nouveau/commit/205775a3ac541020238a59719411e65b6c397273
16:21 karolherbst: 7?
16:21 imirkin_: + https://github.com/skeggsb/nouveau/commit/c74b147a590c608f80d9a09fcfcc45b20a157312
16:22 karolherbst: okay, yeah makes sense
16:22 karolherbst: I think we need to fix that on 4.14 again
16:22 imirkin_: so either someone added it back
16:22 karolherbst: I don't know _why_
16:22 karolherbst: but it is broken
16:22 imirkin_: or ... something.
16:22 karolherbst: bad merge
16:22 karolherbst: is my guess
16:22 imirkin_: check if that code is there
16:22 karolherbst: yeah
16:23 karolherbst: it
16:23 karolherbst: 's mhh, different
16:24 karolherbst: mhh, weird
16:25 karolherbst: super weird
16:25 karolherbst: well as far as I can tell, it is broken on rc7
16:27 karolherbst: Lyude: anyhow, I kind of think that doing git bisect on a path only leads to endless pain
16:28 Lyude: i've still gotta do it for this to figure out what's going on, this seems to work in rc1
16:29 Lyude: karolherbst: i've dealt with some pretty nasty bisects before so that's fine with me :P
16:29 imirkin_: Lyude: can you get a backtrace? earlyprintk? netconsole?
16:29 karolherbst: :D okay
16:29 Lyude: try bisecting ~4.2-4.4 on i915 sometime ;P
16:29 karolherbst: Lyude: no thanks
16:29 Lyude: imirkin_: yep, will do so in a moment
16:30 Lyude: slow bioses are the worst
16:31 Lyude: imirkin_: https://paste.fedoraproject.org/paste/8Yub5CKDUZ4FkbIiVOWAow
16:31 Lyude: i think it changed from what it was last time though
16:32 karolherbst: uhhh
16:32 imirkin_: wtf
16:32 karolherbst: I saw that one
16:32 imirkin_: feels like you did something nasty with firmware
16:32 imirkin_: er no, i guess not
16:32 imirkin_: anyways, ehm, that shouldn't happen :)
16:32 Lyude: yeah :(
16:32 karolherbst: Lyude: check the changes in the gr engine, I think there is something nasty in there
16:33 Lyude: alright, I'm going to give rc7 a shot first though since I realized the rc7 I was testing before was from drm-fixes
16:33 imirkin_: Lyude: i don't suppose you can resolve to a line?
16:33 imirkin_: oh wait, it has it in there
16:33 imirkin_: gr
16:34 imirkin_: oh
16:34 imirkin_: [ 8.372797] nouveau 0000:22:00.0: timeout
16:34 imirkin_: it's just a WARN
16:34 imirkin_: coz your thing hangs on load.
16:34 imirkin_: super.
16:34 imirkin_: just super.
16:37 imirkin_: perhaps b68896d4982c5e77ac69d4d93edad72b8337ea2b affects it?
16:37 imirkin_: [in ben's tree]
16:38 karolherbst: would be super odd
16:38 imirkin_: or 66947ec27b83b695b5fa1d7c1a5ef3eaa4025846
16:39 karolherbst: isn't the hang related to the gr falcon not doing its job?
16:39 imirkin_: yes.
16:39 imirkin_: or this: a6d2626b7a7e44c0138f277b7654650822e502dd
16:39 karolherbst: yeah
16:39 karolherbst: that could be it
16:39 karolherbst: maybe it takes 3 seconds
16:39 karolherbst: or 10
16:39 karolherbst: who knows
16:40 karolherbst: Lyude: wanna try out 10 seconds?
16:40 imirkin_: or just revert it :)
16:40 imirkin_: instead of making random guesses
16:40 karolherbst: well, we want to have a timeout there though
16:40 Lyude: sure but give me a sec! still waiting on that last kernel build to finish
16:40 imirkin_: we can talk about that if the revert works.
16:40 karolherbst: otherwise we have threads doing loops forever
16:41 karolherbst: ahh, right
16:41 karolherbst: k
16:41 imirkin_: :)
16:41 imirkin_: step 1: identify problem. step 2: fix.
16:41 imirkin_: never get those backwards
16:41 karolherbst: well, you could merge those :p
16:41 imirkin_: tempting though it is :)
16:42 karolherbst: but yeah I know what you mean
16:43 karolherbst: I am sure we do some crap somewhere that's why we run into the timeout, but we use 2 seconds in a lot of places
16:43 karolherbst: guess there is a reason why not there
16:44 karolherbst: mhhh, wait
16:44 karolherbst: okay no, it looks fine
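The trade-off discussed above (a wait needs a timeout so a thread cannot loop forever, yet the limit has to be long enough for the hardware) can be illustrated with a bounded polling loop. This is only a sketch in Python, not the nouveau code; `wait_for` and its parameters are hypothetical, and the 2 second default simply mirrors the value mentioned in the discussion.

```python
import time


def wait_for(condition, timeout_s=2.0, poll_interval_s=0.001,
             clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True or `timeout_s` has elapsed.

    Returns True if the condition was met and False on timeout, so the
    caller can log a warning instead of spinning forever.
    """
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

A caller would treat a False result as the "timeout" case and decide whether to warn, retry, or fail.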
16:52 Lyude: okay, it's official, something's fucky
16:52 Lyude: good news is: the kernel is just fine. I think.
16:52 Lyude: at least mainline
16:53 karolherbst: Lyude: you mean 4.14-rc7?
16:53 karolherbst: or stable?
16:53 Lyude: yeah, just built it and ran it and it seems to launch fine
16:53 karolherbst: try it again from a cold boot
16:53 Lyude: ok
16:53 karolherbst: if it is really related to the timeout, you might get lucky sometimes
16:54 karolherbst: which might also cause a super unusable bisect
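An intermittent failure makes a bisect unreliable because a single lucky boot marks a bad commit as good. One common mitigation is to test each candidate several times and call it bad if any attempt fails. The sketch below models that idea in Python; it does not drive `git bisect`, and `classify_commit` and `first_bad_commit` are hypothetical helpers.

```python
def classify_commit(run_test, attempts=5):
    """Return 'bad' if any of `attempts` runs fails, otherwise 'good'.

    `run_test` is a callable returning True on success. Repeating the test
    lowers the chance that a flaky failure is missed.
    """
    for _ in range(attempts):
        if not run_test():
            return "bad"
    return "good"


def first_bad_commit(commits, is_bad):
    """Binary-search for the first bad commit.

    Assumes `commits` is ordered oldest to newest, commits[0] is known good
    and commits[-1] is known bad.
    """
    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

The search itself is only as trustworthy as `is_bad`, which is why repeated runs (or a cold boot per attempt) matter for a timing-dependent bug.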
16:54 Lyude: i love computers
16:55 karolherbst: ;)
16:56 Lyude: seems to still work, keep in mind as well with the problematic version I can reproduce this 100%.. Also, keep in mind the problematic version is based off drm-next and not master
16:56 karolherbst: ohh, okay
16:56 karolherbst: so we need to check what's new there :/
16:56 karolherbst: but bisecting drm-next is pain as well for different reasons
16:56 Lyude: merges and merges
16:57 karolherbst: well the merges aren't the bad part
16:57 karolherbst: but sometimes trees are merged together which shouldn't be
16:57 karolherbst: like a 4.12 based tree with a 4.13 one
16:57 karolherbst: and suddenly you depend on certain bug fixes.... or whatever
16:59 Lyude: hold on
16:59 Lyude: i might have done something extraordinarily dumb
17:00 Lyude: ok good, I thought for a minute it might have been me leaving clockgating on by accident
17:00 Lyude: oh hey it did the other thing now
17:01 Lyude: imirkin_: https://paste.fedoraproject.org/paste/K01lDlcKu9FFCpToT4VFZw
17:02 imirkin_: is that with the revert?
17:03 imirkin_: oh wait, this is with drm-next?
17:03 Lyude: no, that's the same branch that was causing issues before
17:03 Lyude: drm-fixes/drm-next I believe yes
17:03 imirkin_: where ben redid all the VMM stuff?
17:03 Lyude: erm
17:03 Lyude: drm-next/drm-fixes
17:03 Lyude: ohhhh, that might be it
17:08 imirkin_: he had some stuff in there related to grctx too
17:18 imirkin_: like https://github.com/skeggsb/nouveau/commit/e8bd91afbef22a3396cc0d51f47858be50ec0d8c and https://github.com/skeggsb/nouveau/commit/c824cbc5191f5e8ef7740220e8d25033cfe4b0dc
17:37 Lyude: imirkin_: btw; do you know if this stuff is going to end up going into the final 4.14 release?
17:40 karolherbst: Lyude: no
17:40 karolherbst: Lyude: at least not the vmm stuff
17:40 karolherbst: would be too late to push it, don't you think?
17:42 RSpliet: think it's all queued up for 4.15
17:42 karolherbst: yeah
17:43 imirkin_: hence drm-next, not drm-fixes :)
17:45 imirkin_: skeggsb: this looks wrong... you replaced nvkm_wo 0x8001c with 0x1c, but then you nvkm_ro from the 0x80000 region.
17:45 karolherbst: imirkin_: the second of the two commits?
17:45 imirkin_: yes
17:45 imirkin_: i think that needs to be CB_RESERVED + ...
17:46 karolherbst: yeah
17:46 imirkin_: that said, it's the gr->firmware case
17:46 imirkin_: which iirc means "external firmware"
17:46 imirkin_: which Lyude isn't using
17:46 karolherbst: what does "wo" stand for?
17:46 imirkin_: write object
17:46 Lyude: woooo
17:46 karolherbst: imirkin_: it could be right now
17:46 imirkin_: there's a boatload of stuff in there i don't understand though, so ... yeah.
17:47 karolherbst: maybe it has an internal offset set
17:47 karolherbst: so it is alright
17:47 imirkin_: karolherbst: it's a nvkm_memory object
17:47 imirkin_: it doesn't have a base.
17:47 karolherbst: okay, I see
17:47 imirkin_: it's allocated as CB_RESERVED + gr->size size.
17:47 karolherbst: k
17:48 karolherbst: yeah, I see it
17:48 karolherbst: looks wrong then
17:49 imirkin_: gr->firmware = nvkm_boolopt(device->cfgopt, "NvGrUseFW",
17:49 imirkin_: func->fecs.ucode == NULL);
17:49 imirkin_: so yeah. gr->firmware == external firmware
17:49 imirkin_: so not the issue here.
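The suspected mismatch above is a write at a bare offset (0x1c) paired with a read from the 0x80000 region of the same buffer. A toy model with a flat byte buffer shows why the two no longer meet. This is an illustration only, not the nouveau code: `CB_RESERVED = 0x80000` is assumed from the discussion, and `GR_SIZE` and the helper functions are hypothetical.

```python
CB_RESERVED = 0x80000  # assumed from the "0x80000 region" in the discussion
GR_SIZE = 0x100        # hypothetical size of the area after the reserved block


def write_u32(buf, offset, value):
    """Store a 32-bit little-endian value at `offset`."""
    buf[offset:offset + 4] = value.to_bytes(4, "little")


def read_u32(buf, offset):
    """Load a 32-bit little-endian value from `offset`."""
    return int.from_bytes(buf[offset:offset + 4], "little")


def demo():
    # The buffer is allocated as CB_RESERVED + size, with no implicit base.
    buf = bytearray(CB_RESERVED + GR_SIZE)

    # Writer uses the bare offset, reader looks in the 0x80000 region.
    write_u32(buf, 0x1C, 0xDEADBEEF)
    mismatched = read_u32(buf, CB_RESERVED + 0x1C)

    # Writer and reader agree on CB_RESERVED + offset.
    write_u32(buf, CB_RESERVED + 0x1C, 0xDEADBEEF)
    matched = read_u32(buf, CB_RESERVED + 0x1C)
    return mismatched, matched
```

Because the object has no base of its own, both sides have to add the reserved offset explicitly for the read to see the written value.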
17:50 karolherbst: Lyude: do you use blob firmware?
17:50 imirkin_: not according to the kernel logs.
17:50 karolherbst: ohh wait
17:50 karolherbst: can we even on gk110?
17:50 imirkin_: of course we can.
17:50 karolherbst: okay
17:50 Lyude: karolherbst: no I don't
17:50 karolherbst: still the code looks wrong
17:52 karolherbst: Lyude: can you check if commit 7a88cbd8d65d622c00bd76ba4ae1d893b292c91c is all right?
17:52 karolherbst: if so, you could bisect the out of tree module
17:52 karolherbst: should be easier
17:53 Lyude: i will check it out right now
17:53 karolherbst: (and faster)
18:29 Lyude: karolherbst: works
18:32 karolherbst: Lyude: then start bisecting until this commit: https://github.com/skeggsb/nouveau/commit/71e57605dc05a1ae947142f8f2639bc64a4eeed8
18:35 Lyude: karolherbst: alright, I will probably have to do some quick hacking to make my scripts work with building that out of tree but that's been a long time coming anyway
18:35 Lyude: gotta reply to some emails then i'll get started
18:35 karolherbst: airlied: famous last words: "There's a really nasty nouveau collision, hopefully someone can take a look once I pushed this out."
18:36 karolherbst: airlied: I think we indeed have a regression due to that bad merge, sadly we won't make it until rc8/release? maybe we will? Depends on how fast I figure that out
18:36 karolherbst: Lyude: pro tips: blacklist nouveau, use insmod
18:37 Lyude: noted, although luckily in my case I only ever have to deal with the modules being in one place :)
18:37 karolherbst: no extra sub dir?
18:38 Lyude: i've got one but I don't build any modules into my initramfs (for the GPU anyway), so it's just a matter of replacing those files I'd think
18:38 Lyude: erm
18:38 Lyude: those being the ones in the module directory, maybe also just not build nouveau from the kernel as well
18:41 karolherbst: I was more talking about that extra directory mess
18:41 karolherbst: when there are two nouveau.ko files inside lib/modules
18:47 karolherbst: imirkin_: I guess it broke here again: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/nouveau?h=v4.14-rc7&id=0c697fafc66830ca7d5dc19123a1d0641deaa1f6
18:50 imirkin_: hm?
18:50 karolherbst: my issue
18:50 imirkin_: dunno, not sure.
18:50 karolherbst: well, read the commit message
18:50 imirkin_: the thing is gone from nouveau_display
18:50 karolherbst: it is different
18:52 karolherbst: I will try to figure out what went wrong, otherwise maybe skeggsb has any ideas
19:03 tobijk: karolherbst: the issue mentioned there was with not cleaned up crtcs imho
19:03 tobijk: and that got fixed already
19:05 karolherbst: tobijk: guess what, it is broken on 4.14-rc7
19:05 tobijk: mh not sure, it was broken for me, but it is fixed now
19:06 karolherbst: it is broken for me
19:06 tobijk: i have no doubt
19:06 tobijk: that fixed it for me: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4813766325374af6ed0b66879ba6a0bbb05c83b6
19:06 karolherbst: I think I have a different issue
19:07 tobijk: yep, yet the commit you posted is fixed by the one i have shown you there
19:07 tobijk: yours is with gr init going wrong right?
19:11 karolherbst: no
19:11 karolherbst: suspending/rmmod being broken
19:11 tobijk: oh :/
19:11 imirkin_: well, with vblank counts getting messed up
19:11 imirkin_: (refcounts)
19:12 karolherbst: well I do another bisect, but now with more knowledge
19:12 karolherbst: I am sure that backmerge commit is the culprit
19:12 karolherbst: maybe not
19:12 karolherbst: I will see it in a few minutes
19:13 tobijk: karolherbst: where does it crash btw?
19:14 karolherbst: https://gist.githubusercontent.com/karolherbst/2d9d28dbcb5c92ea6931d5ca69b5f621/raw/a25d88e38fd08c7885313f6b38ec2371c53406db/gistfile1.txt
19:17 tobijk: right, so it's doing the same as it did for me: vblank off for a nonexistent monitor ~_~
19:18 tobijk: karolherbst: you could see if my extra chunks to the fix up there help: https://hastebin.com/qonozasiti.php
19:19 tobijk: (thats what fixed it in the meantime for me)
19:19 karolherbst: tobijk: for me as well, it is a regression
19:19 tobijk: yep it is one and you bisected to the same backmerge i did when checking
19:19 karolherbst: I just want to track down what exactly caused that one
19:19 karolherbst: well I didn't yet
19:20 tobijk: heh, well i did a few weeks back
19:21 karolherbst: tobijk: yeah okay, so the proper fix is to fix that backmerge
19:24 tobijk: karolherbst: not sure if the backmerge contains faulty code
19:24 karolherbst: well, it doesn't matter, if that's the commit which broke something, we need to figure out what it broke and how
19:24 karolherbst: that simple
19:25 karolherbst: there is even a comment regarding an ugly conflict
19:25 tobijk: it got the atomic code to think there is an active crtc left, for which it goes on to disable the vblank, which it can not as it's not there
19:26 karolherbst: well, did you see the previous fixes?
19:26 tobijk: you mean 52dfcc5ccfbb?
19:26 karolherbst: no
19:27 tobijk: then no, i did not
19:27 karolherbst: 746c842d1f64caad81d82f0054c0e063c8aa5399 and 4a5431af19bc52c4dd491e989543c66a52380f00
19:27 karolherbst: now compare with the backmerge
19:28 karolherbst: you should find the conflict
19:31 karolherbst: new_crtc_state vs crtc_state
19:31 tobijk: mhm, the question still is, why we see the crtc as active
19:33 karolherbst: ugh...
19:33 karolherbst: okay, I think I got it
19:36 karolherbst: yeah
19:36 tobijk: so?
19:37 karolherbst: mhh, the place I found was already fixed
19:37 tobijk: yeah the code changed a bit compared to the commits you showed me
19:38 karolherbst: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/nouveau/nv50_display.c?h=v4.14-rc7&id=efa479352fc780b305fa186cafb5f416fdf2b2cb
19:39 karolherbst: meh
19:39 karolherbst: that's what I get from testing rc7 and not master
19:39 tobijk: looks weird: if (old_crtc_state->active && !new_crtc_state->active)
19:39 tobijk: i think i have not understood the old new behavior yet :D
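The quoted condition, `old_crtc_state->active && !new_crtc_state->active`, reads as "active before this commit, inactive after it", so it selects only the CRTCs that the commit is turning off. A toy Python model of that check follows; it is not the DRM API, and `CrtcState` and `crtcs_being_disabled` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CrtcState:
    active: bool


def crtcs_being_disabled(old_states, new_states):
    """Return the ids of CRTCs that go from active to inactive.

    `old_states` and `new_states` map a CRTC id to its state before and
    after the commit. A CRTC that was already inactive is not selected.
    """
    return [
        crtc_id
        for crtc_id in sorted(old_states)
        if old_states[crtc_id].active and not new_states[crtc_id].active
    ]
```

In this model, a CRTC that was never active is skipped, which is the case where turning vblank off would have nothing to act on.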
19:39 karolherbst: I think it is fixed on mainline master
19:39 karolherbst: *sigh*
19:40 tobijk: yep with the commit i showed you
19:40 tobijk: the workaround thingy
19:40 tobijk: i havent seen another commit which could fix it
19:40 karolherbst: is it upstream?
19:41 tobijk: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4813766325374af6ed0b66879ba6a0bbb05c83b6
19:41 karolherbst: ahh this one
19:41 karolherbst: I thought your patch you posted here
19:42 tobijk: karolherbst: its mostly the same, mine is only more paranoid
19:42 tobijk: more checks :>
19:42 tobijk: and no its not upstream
19:43 tobijk: gtg, talk to you later
19:43 karolherbst: thanks for your help. Will check if master really fixes my problem and then simply move one
19:43 karolherbst: *on
20:25 karolherbst: okay, mainline shows the same error for me
20:47 Lyude: karolherbst btw: you still want me to try bisecting downstream nouveau?
20:48 karolherbst: Lyude: if there is a regression on bens tree, then yeah, would make sense
20:48 karolherbst: your issue is most likely a different one
20:48 Lyude: k
20:50 karolherbst: okay, even the newest drm-fixes has the regression
20:50 karolherbst: Lyude: even on the one commit you tested for the latest fixes
20:50 karolherbst: I still get the same error
20:51 karolherbst: Lyude: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d324c5bc462d354d337dcf3a14ffd0eb17b4fa38
20:52 karolherbst: ohh wait, I think this is a different issue
20:57 Lyude: also, is there anything special i need to do to get this out of tree kernel module to build? I keep getting scripts/Makefile.build:49: *** CFLAGS was changed in "/home/lyudess/Projects/nouveau/kmod/Makefile". Fix it to use ccflags-y. Stop.
20:57 karolherbst: Lyude: cd drm
20:57 karolherbst: then it should work
20:58 karolherbst: but I assume you did this already
20:58 karolherbst: so no, no idea
20:58 Lyude: i did not do that, now it builds but it says modpost 0 modules, hrm.
20:59 karolherbst: Lyude: do you have nouveau builtin in your kernel config?
20:59 Lyude: I have it set as a module, which I just realized I should turn off
20:59 karolherbst: leave it on
21:00 karolherbst: otherwise ttm gets disabled
21:00 karolherbst: Lyude: how did you build it? "make" or something else?
21:01 Lyude: make -C /home/lyudess/Projects/linux/worktrees/kepler1.5-debug M=/home/lyudess/Projects/nouveau/kmod
21:01 Lyude: the -C directory being the plain kernel source tree
21:01 karolherbst: yeah, don't do that
21:01 Lyude: hehe, it's been a while since i've built anything out of tree
21:01 karolherbst: just make
21:02 Lyude: also something else: keep in mind i build my stuff on a different machine than the one I'm actually running it on
21:02 karolherbst: okay
21:02 karolherbst: then
21:02 karolherbst: LINUXDIR=$whatever
21:02 Lyude: ahhh, I see
21:03 karolherbst: read the Makefile ;)
21:03 karolherbst: it does all the magic already
21:29 Lyude: karolherbst: so, if the regression is on ben's tree it should have just happened immediately when I built nouveau from master and loaded it, but it seems to be fine
21:31 karolherbst: Lyude: how did you load it?
21:32 Lyude: insmod and blacklisted nouveau from modprobe.d
21:32 Lyude: let me throw a message into there so I can make sure i'm getting the right module in
21:36 karolherbst: *sigh*
21:36 karolherbst: I get the feeling we will have some fun with 4.14
21:38 Lyude: oh no. confirmed that I have most definitely loaded the correct module....
21:38 karolherbst: ;) fun
21:38 Lyude: karolherbst: I don't know if it's going to be in 4.14 though if drm-fixes isn't going to be in there, since rc7 works fine
21:39 karolherbst: you mean drm-next?
21:39 karolherbst: or on which branch did you encounter issues?
21:39 karolherbst: at least I think I have two issues which might be related: 1. GPU doesn't suspend on mainline 2. I can't rmmod nouveau
21:41 Lyude: no, I'm on drm-fixes
21:41 karolherbst: mhh
21:42 karolherbst: this gets pulled in for 4.14
21:42 karolherbst: or is already pulled
21:42 karolherbst: Lyude: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log
21:42 karolherbst: " Merge tag 'drm-fixes-for-v4.14-rc8' "
21:43 Lyude: alright, i'm going to pull from master again and see if the error is happening there now
21:43 karolherbst: linus merged it 5 hours ago
21:44 Lyude: yeah, last time I checked it hadn't been pulled in yet
21:44 Lyude: so rc8 will be a new one to try
22:01 Lyude: karolherbst: i think i may have screwed something up, rc8 doesn't have the bug
22:01 Lyude: so, i'm going to assume this is something I somehow broke in one of my branches
22:02 karolherbst: yeah, maybe
22:02 karolherbst: you tested on commit d4c2e9fca5b7db8d315d93a072e65d0847f8e0c5?
22:02 Lyude: ye
22:02 karolherbst: okay
22:02 Lyude: that's a relief at least :)
22:03 karolherbst: yeah
22:04 karolherbst: I will try master once again, but I think I still have that bug there
22:38 karolherbst: Lyude: for me it is still broken on master