00:04 airlied: yeah but I wonder is sys_pitch like an errno
00:04 airlied: or actually a negative pitch
00:07 karolherbst: maybe ENOMEM
00:07 karolherbst: ?
00:07 imirkin: probably an errno...
00:07 imirkin: the product is -256k
00:07 imirkin: which means it must be a POT error :)
00:09 imirkin: should probably just try to repro
00:10 karolherbst: mhh I checked the prime factors and 12 is the only sane value I can get
00:10 karolherbst: 2^2 * 3 * 5 * 17 * 257
00:10 imirkin: oh
00:11 imirkin: 0x3fffc
00:11 imirkin: not quite 256k
00:11 imirkin: $ factor 262140
00:11 imirkin: 262140: 2 2 3 5 17 257
00:11 karolherbst: ohh there is a factor tool :D
00:11 imirkin: yeah, could easily be ENOMEM
00:24 Wolf480pl: karolherbst, do I understand correctly that you managed to replicate the GPU lockups that happened to me on 3 highest cstates?
00:25 karolherbst: yes
00:25 Wolf480pl: so it's ok if I don't read this channel for a couple days?
00:26 karolherbst: yeah I think I will somehow find out what nouveau needs to do on my own now
00:33 imirkin: Local0 += Arg0 = (VRMB (0x04) + Local0)
00:34 imirkin: i wonder if that's the same thing as "local0 += arg0; local0 = (foo + local0)"
00:34 karolherbst: imirkin: I don't think so
00:34 karolherbst: Arg0 gets set too
00:35 imirkin: karolherbst: this is AML
00:35 imirkin: or ASL or whatever
00:36 karolherbst: ohh okay
01:44 imirkin: actually that 0x3fffc feels a lot more like 0xffff * 4
02:07 karolherbst: mhh, would also make sense :D
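[Editor's note] The arithmetic discussed above can be checked with a short sketch. It reproduces the output of `factor 262140` by trial division and then tests both readings from the chat: a multiple of 12 (the Linux value of ENOMEM), and 0xffff * 4. The helper name `prime_factors` is an illustration only and appears in neither nouveau nor the factor tool.

```python
def prime_factors(n):
    """Return the prime factors of n, with multiplicity, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever is left over is itself prime
    return factors


value = 0x3fffc                    # magnitude of the product seen in the log
factors = prime_factors(value)
print(value, factors)

ENOMEM = 12                        # assumption: Linux errno value for ENOMEM
other_factor = value // ENOMEM     # what remains if one operand was 12
print(value % ENOMEM == 0, hex(other_factor))

print(value == 0xffff * 4)         # the alternative reading from 01:44
```

Both readings fit the number equally well, so the factorization alone does not decide between an errno and a 0xffff-sized operand.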
09:49 imirkin: mwk: any idea if a nv3x implements the nv2x/nv1x 3d classes?
09:50 imirkin: mwk: iirc nv2x/nv1x are all backwards compatible with one another
09:53 karolherbst: mupuf_: okay, the pdaemon counters can be tinkered with and the blob upclocks accordingly :)
10:25 karolherbst: awesome, I can't trace the blob anymore :/
10:25 pmoreau: :-(
10:25 pmoreau: What happens?
10:26 karolherbst: mmiotrace: unexpected secondary hit for address 0xffffc90010001070 on CPU 0.
10:26 karolherbst: then BUG: unable to handle kernel paging request at ffff8800f6000008
10:26 imirkin_: ohhhh that's sad
10:26 pmoreau: Oh yeah, I hit that as well
10:26 karolherbst: bit stack
10:26 karolherbst: I know it is somehow kernel config related
10:26 imirkin_: that means that they're using a funny instruction
10:26 imirkin_: that mmiotrace doesn't fully support
10:26 imirkin_: (at least iirc)
10:26 karolherbst: instruction as in x86 instruction?
10:27 imirkin_: ya
10:27 karolherbst: ohhh
10:27 karolherbst: mhhh
10:27 imirkin_: can you pastebin the full thing?
10:27 karolherbst: I compile my kernel with native optimizations
10:27 karolherbst: :D
10:27 karolherbst: imirkin_: is pstate good enough or should I bother my kernel log
10:27 karolherbst: :D
10:27 imirkin_: karolherbst: i meant the BUG
10:27 karolherbst: okay
10:28 karolherbst: yeah lol, journalctl makes it hard to copy paste, what a pain :/
10:29 karolherbst: imirkin_: https://gist.github.com/karolherbst/2d39c069fcf657f169b2
10:29 imirkin_: errrr
10:29 imirkin_: you're missing some bytes
10:30 imirkin_: the Code: line is cut off
10:31 karolherbst: nope, there isn't such line
10:31 imirkin_: huh?
10:31 imirkin_: it's there, it's just cut off
10:32 imirkin_: in width
10:32 imirkin_: there should be more letters at the end
10:32 karolherbst: ohhhh
10:32 imirkin_: at least i think there should be
10:32 karolherbst: https://gist.github.com/karolherbst/2d39c069fcf657f169b2
10:33 imirkin_: that's better!
10:33 karolherbst: yeah, journalctl uses less all the time :/
10:34 imirkin_: except... it might be the wrong code
10:34 karolherbst: it's so annoying
10:34 imirkin_: solution: don't use systemd
10:34 karolherbst: :D
10:34 karolherbst: or pass through to system logger
10:34 imirkin_: sure, you can patch around the idiocy
10:34 imirkin_: or you can just excise it
10:35 karolherbst: mhh
10:35 karolherbst: I could disable the gcc optimizations and try again
10:35 imirkin_: nah
10:35 karolherbst: or is it something inside nvidia?
10:35 imirkin_: it's in the blob code
10:35 karolherbst: ohh okay
10:35 karolherbst: I think I will just remove nvidia-smi and see what happens
10:35 karolherbst: that tool is useless for me anyway
10:35 imirkin_: also that code is from the wrong function =/
10:35 imirkin_: very sad.
10:36 imirkin_: i need to have a closer look at mmiotrace
10:36 imirkin_: last i looked closely was back when pq was hacking on it
10:36 karolherbst: like nvidia-settings doesn't care about nvidia-smi :D
10:37 karolherbst: again :/
10:37 mwk: imirkin_: nv3x implements the nv2x classes
10:37 mwk: but not the nv1x classes
10:38 mwk: this apparently includes supporting NV20 VP code (by translating it to NV30 VP ISA)
10:39 mwk: have a look at the support table at http://envytools.readthedocs.org/en/latest/hw/graph/intro.html for exact classes supported
10:39 imirkin_: mwk: awesome thanks
10:39 imirkin_: i might grab a pci nv3x so i can test all generations at the same time
10:39 imirkin_: and add a hack to optionally allow using nouveau_vieux with nv3x
10:43 karolherbst: what a mess that is
10:45 imirkin_: looks like i should get NV25_3D
10:46 imirkin_: mwk: those variants seem off btw... NV10 had NV15_3D?
10:47 imirkin_: also seems likely that NV11 would have had NV11_3D (unless NV11 is ordered after NV17? shouldn't be)
10:47 mwk: ugh... right, NV15_3D seems off
10:47 mwk: and as for NV11_3D, this is an ugly one
10:48 imirkin_: coz NV1A is pre-NV11? :)
10:48 mwk: whatever that class is (I'm not sure), it's definitely not present on NV11
10:48 imirkin_: hehe
10:48 imirkin_: ok
10:48 mwk: but it's called 0x1196 by nv blob
10:49 imirkin_: yeah, this is how we pick the class for vieux: http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/drivers/dri/nouveau/nv10_context.c#n472
10:50 mwk: FWIW weird shit *is* common with the variants
10:50 mwk: eg. NV5_SIFM that doesn't exist on actual NV5 :)
10:50 imirkin_: hehe fair enough
10:51 mwk: but the NV15_3D thing is a mistake, I'll fix that
10:52 imirkin_: that one should be nv15+, aka nv11+ if you look in numerical space
10:53 mwk: yep
10:54 mwk: rnndb seems to have the right variants...
10:54 mwk: *shrug*
11:00 karolherbst: imirkin_: reading the mmiotrace doc really helps understanding what's going wrong :D
11:00 imirkin_: cool :)
11:01 karolherbst: like page faults is the way mmiotrace does stuff
11:01 karolherbst: :D
11:01 karolherbst: and when a page fault can't be handled, mmiotrace did it
11:02 karolherbst: so I think this is the important thing: "mmiotrace: unexpected secondary hit for address 0xffffc90014001070 on CPU 0."
11:02 RSpliet: nasty, isn't it... I'm sure pagefaults weren't invented for this ;-)
11:03 karolherbst: :D
11:03 karolherbst: right
11:03 RSpliet: (oh... and considering each page fault is over 1000 cycles, you'll get why mmiotrace is slow :-D)
11:03 karolherbst: yes
11:03 karolherbst: :D
11:03 karolherbst: although
11:03 karolherbst: they are faster with mmiotrace
11:03 karolherbst: because mmiotrace handles them
11:03 karolherbst: not the _normal_ kernel thingy
11:06 RSpliet: doesn't make much of a difference unfortunately :-D
11:06 karolherbst: :D
11:07 karolherbst: mhh: ./arch/x86/mm/kmmio.c: pr_info("unexpected secondary hit for address 0x%08lx on CPU %d.\n",
11:07 karolherbst: inside kmmio_handler
11:08 karolherbst: yeah
11:08 karolherbst: I knew it
11:08 karolherbst: :/
11:08 karolherbst: okay
11:08 karolherbst: so this is what happens:
11:08 karolherbst: mmiotrace does mark the page as not there, and triggers a page fault
11:08 karolherbst: and while handling this page fault, on the same address another page fault is triggered
11:08 imirkin_: right...
11:08 mlankhorst: this is why mmiotrace offlines all cpu's
11:08 mlankhorst: except boot
11:08 imirkin_: recursive pagefault
11:09 imirkin_: but it shouldn't happen
11:09 imirkin_: unless the page is really not there
11:09 karolherbst: right
11:09 imirkin_: it might do the wrong thing in that fallback case
11:09 karolherbst: this is what the comment says: "A second fault on the same page means some other condition needs handling by do_page_fault(), the page really not being present is the most common."
11:10 mlankhorst: or forgot to insert the page
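[Editor's note] The mechanism described above can be sketched as a toy model. This is a deliberate simplification and not the kernel code in arch/x86/mm/kmmio.c: the class `ToyKmmio` and its method names are hypothetical. It only shows the cycle of arming a page (marking it not present), taking the fault, letting the access run, re-arming, and what a second fault on the same page looks like while the first is still being handled.

```python
class ToyKmmio:
    """Toy model of the arm / fault / single-step / re-arm cycle."""

    def __init__(self, traced_pages):
        # page -> True while armed (marked not present, so accesses fault)
        self.armed = {page: True for page in traced_pages}
        self.active_page = None    # page whose access is being single-stepped
        self.log = []

    def fault(self, page):
        """Handle a fault; return True if the tracer claimed it."""
        if page not in self.armed:
            return False           # not a traced page: regular fault handling
        if self.active_page is not None:
            # A fault arrived before the previous one finished.
            if page == self.active_page:
                self.log.append("secondary hit on page %#x" % page)
            else:
                self.log.append("recursive hit on page %#x" % page)
            return False           # hand over to the regular fault path
        self.active_page = page
        self.armed[page] = False   # make the page present so the access can run
        self.log.append("traced access to page %#x" % page)
        return True

    def single_step_done(self):
        """Called after the faulting instruction ran: re-arm the page."""
        if self.active_page is not None:
            self.armed[self.active_page] = True
            self.active_page = None


tracer = ToyKmmio([0x1000])
first = tracer.fault(0x1000)       # normal traced access
second = tracer.fault(0x1000)      # same page faults again before re-arming
tracer.single_step_done()
print(first, second, tracer.log)
```

In the model the second fault is not claimed by the tracer, which mirrors the comment quoted from the source: something other than tracing needs to deal with it.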
11:14 karolherbst: what does it mean when a kmmio_context is active?
11:15 karolherbst: well
11:15 karolherbst: then I need to debug this issue with nvapeek :/
11:15 karolherbst: :D
11:17 imirkin_: or use an older blob
11:18 karolherbst: I don't think it is caused by the driver directly :/
11:18 karolherbst: I had this problem with much older version already
11:18 karolherbst: and older kernel
11:18 karolherbst: and it just disappears and appears again
11:43 imirkin_: airlied: hey, so i was just perusing the ACPI 5.0 spec, and looks like _DSM is actually supposed to take a PACKAGE object as its last arg, while nouveau passes a buffer. do you know if earlier specs required a buffer?
11:44 pmoreau: imirkin_: Earlier did as well
11:44 imirkin_: looking at ACPI 4.0, also package
11:45 imirkin_: ACPI 3.0, also package
11:45 pmoreau: But as some ACPI tables were using BUFFERs, it is said in the spec that it can be a BUFFER, even if it is supposed to be a PACKAGE
11:45 pmoreau: I had some patches to improve the ACPI warnings in Nouveau
11:47 pmoreau: I made them as part of adding support for dual Nvidia GPU optimus
11:48 karolherbst: imirkin_: it is messed up, it's implemented wrongly everywhere on the laptops
11:50 imirkin_: can a package be evaluated as a buffer?
11:52 pmoreau: I'd say so, but not 100% sure
11:52 pmoreau: http://lists.freedesktop.org/archives/nouveau/2015-May/020972.html
11:52 pmoreau: -^ one of the patches I proposed
11:53 pmoreau: I think using `acpi_evaluate_dsm` rather than `acpi_evaluate_dsm_typed` is better, as it lets ACPI handle whether it is a PACKAGE or BUFFER
11:55 karolherbst: imirkin_: do you think it is coincidence that the M parameter in the clk pll is always 0x1f with the blob? :D
11:56 karolherbst: and P always 0x1
11:56 imirkin_: pmoreau: ah
11:56 karolherbst: Wolf480pl: wanna try out a patch today? :D
11:56 karolherbst: this also explains why the blob only has like 15MHz steps
12:27 karolherbst: mupuf_: okay, even the pll values from the blob don't help
12:27 karolherbst: at all
12:28 karolherbst: gpu clocked +50MHz with nouveau is already unstable and I use a higher voltage than nvidia with +135MHz
12:28 karolherbst: I even poked nvidia PCLOCK stuff into the gpu
12:28 karolherbst: and it didn't help
12:29 airlied: imirkin_: everyone implemented things wrong from what I can tell
12:30 imirkin_: airlied: how well do you know acpi?
12:30 imirkin_: airlied: do you know if CreateWordField is legal on a Package object?
12:31 * airlied has paged acpi out completely
12:31 imirkin_: so there was a time when you knew it!
12:31 airlied: probably not at that depth, I can read ASL without throwing up, but anything complex is beyond me
12:35 karolherbst: ohh wow
12:35 karolherbst: the blob failed soo hard now, that the kernel switched from tsc to hpet :D
12:39 karolherbst: can anybody make nouveau as fast as the blob? thanks :D
12:39 imirkin_: karolherbst: only you
12:39 karolherbst: :(
12:40 imirkin_: ... can prevent forest fires
12:41 karolherbst: mhh
12:41 karolherbst: my entire system went into perma hung after I nvapeeked :/
12:46 karolherbst: imirkin_: by the way, in pixmark_piano I get 75% blob performance
12:46 imirkin_: cool
12:47 karolherbst: which is a benchmark with like no gpu memory stuff
12:47 imirkin_: get your patches ready for ben...
12:47 imirkin_: and bug him early and often
12:47 imirkin_: he's forgetful and busy
12:47 karolherbst: I already do :D
12:47 karolherbst: I am already nervous enough
12:47 karolherbst: :D
12:48 karolherbst: poked him thursday the last time
12:48 imirkin_: do it again... he's putting a pull request together for dave
12:48 karolherbst: :O
13:03 airlied: imirkin_: good point
13:04 airlied: skeggsb: this -next is very late :-)
13:13 karolherbst: noo, my gddr5 patch :D
14:29 RSpliet: airlied: Ben still alive and in one piece? :-P
14:31 airlied: he was on holidays last week, I think he made it back :-P
14:32 imirkin_: i spoke with him yesterday, so he's def alive
14:32 imirkin_: or was, at least
14:34 RSpliet: haha good stuff :-)
15:11 karolherbst: he was back wednesday by the way already :D
15:20 glennk: back to the feature
15:27 mupuf_: karolherbst: hey
15:27 mupuf_: what did you check exactly?
15:32 karolherbst: mupuf_: 134000 0x1000 range
15:33 karolherbst: ohhh
15:33 karolherbst: the other one
15:33 karolherbst: 137000 0x1000
15:35 mupuf_: ok, check also in the FB area
15:36 karolherbst: mupuf_: which is it?
15:36 mupuf_: forgot on fermi+
15:45 gryffus: Was this patch reintroduced by a proper fix? http://cgit.freedesktop.org/mesa/mesa/commit/?id=d0c22560a151a1ea726df4a6e001048a7c5b225e I'm having crash with "xe: nvc0/nvc0_screen.c:543: nvc0_screen_fence_emit: Assertion `PUSH_AVAIL(push) >= 5' failed." error. Full backtrace is here: https://bpaste.net/show/371b1b641283
15:45 imirkin_: gryffus: yes it was
15:46 gryffus: i'm on nvc0... any clues why i'm having this error?
15:46 imirkin_: gryffus: but apparently not proper enough
15:46 imirkin_: gryffus: what mesa version are you on?
15:47 gryffus: which patch should solve it? I'm using gallium nine branch from https://github.com/iXit/Mesa-3D
15:47 imirkin_: hrmph... looks like a fail on my part
15:47 imirkin_: errrr... or not.
15:47 imirkin_: hm.
15:47 imirkin_: hmmmmmmmmmm
15:48 imirkin_: hm.
15:48 gryffus: :))
15:48 imirkin_: this is the case i was afraid of. ok. if i give you a patch, will you be able to test it?
15:49 gryffus: imirkin_: no problem
15:49 imirkin_: gryffus: http://hastebin.com/oroxopiwot.coffee
15:53 hakzsam: imirkin_, this assert continues to give us some troubles :)
15:53 imirkin_: hakzsam: well it's a legitimate assertion
15:53 hakzsam: sure
15:53 imirkin_: hakzsam: the problem isn't the assert... it's that it's being hit :)
15:53 imirkin_: i just made the problem more visible
15:54 imirkin_: instead of getting weird crashes and memory corruption
15:54 hakzsam: I know
15:54 imirkin_: you now get an assertion error. seems like a reasonable trade.
15:55 imirkin_: the fun part is that it's generally OK if you go over a bit -- libdrm reserves a few bytes at the end for a "return" anyways. so you don't end up corrupting memory, you just end up messing up the cmdstream
15:55 hakzsam: and I assume this PUSH_SPACE(0) will kick-off the pushbuf, right?
15:55 imirkin_: it will ensure that there's enough space for a fence emission
15:55 hakzsam: okay
15:56 imirkin_: gryffus: errr, that won't build. try this: http://hastebin.com/tirikofini.coffee
15:57 imirkin_: hakzsam: i kinda had an implicit assumption that some joker wouldn't be emitting fences all the time. however that can happen if you just call flush over and over
15:59 hakzsam: yeah, I see
15:59 imirkin_: i also have logic not to emit any fence if nothing refs the current fence
16:00 imirkin_: since nothing could possibly care
16:00 imirkin_: however if you supply a fence to ->flush() then it will cause fences to get rotated
16:00 hakzsam: makes sense
16:00 imirkin_: it's all very fragile, obviously
16:01 imirkin_: but i can't think of a non-fragile way to handle it
16:01 imirkin_: basically we can be called on to emit a fence from a callback at any point in time
16:01 imirkin_: and we can't allocate more space from that callback
16:01 imirkin_: (coz it's the callback that allocates more space)
16:02 imirkin_: so we must guarantee that no matter what, there shall always be room for a fence to get put in there
16:02 imirkin_: errrrrrrrr hm
16:02 imirkin_: i just had a realization
16:02 imirkin_: there's a rsvd_kick
16:03 imirkin_: which lets libdrm ensure this
16:03 imirkin_: interesting.
16:04 imirkin_: i might rethink my strategy
16:05 imirkin_: really i just need to do diff things from kick context and non-kick
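[Editor's note] The invariant discussed above (there must always be room for a fence, because the kick callback cannot allocate more space) can be sketched as a toy model. This is not the Mesa or libdrm code: `ToyPushbuf` and its methods are hypothetical, and only the figure of 5 words is taken from the quoted assertion `PUSH_AVAIL(push) >= 5`.

```python
FENCE_WORDS = 5  # from the quoted assertion: PUSH_AVAIL(push) >= 5


class ToyPushbuf:
    """Toy push buffer that always keeps room for one fence at the tail."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.words = []
        self.kicks = 0

    def avail(self):
        return self.capacity - len(self.words)

    def space(self, count):
        """Ensure room for `count` words plus a fence; kick if there is not."""
        # A request larger than this can never fit, even after a kick.
        assert count + FENCE_WORDS <= self.capacity
        if self.avail() < count + FENCE_WORDS:
            self.kick()

    def emit(self, data):
        self.space(len(data))
        self.words.extend(data)

    def kick(self):
        # Models the kick callback: it cannot allocate, so it relies on the
        # reserved tail being free when it emits the fence.
        assert self.avail() >= FENCE_WORDS
        self.words.extend(["fence"] * FENCE_WORDS)
        self.kicks += 1
        self.words = []            # submitted; start with an empty buffer


push = ToyPushbuf(capacity=16)
for i in range(10):
    push.emit([i, i])              # ten 2-word commands
push.space(0)                      # like PUSH_SPACE(0): only checks fence room
print(push.kicks, push.avail())
```

In this model `space(0)` kicks only when fewer than `FENCE_WORDS` words remain, which matches the answer given in the chat: it guarantees room for a fence rather than unconditionally submitting the buffer.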
16:15 aaaa: when i try and use 2 monitors with different outputs on each with my laptop it crashes. the screens turn off and i cannot switch to another tty. mirroring the displays does not crash. it worked fine with the proprietary drivers. i am running debian testing and using xrandr.
16:20 imirkin_: logs
16:20 aaaa: from where?
16:20 imirkin_: dmesg is a good start
16:21 aaaa: where should i upload?
16:21 imirkin_: pastebin
16:22 aaaa: dmesg log: https://dpaste.de/9vo4
16:27 imirkin_: aaaa: boot with nouveau.modeset=0
16:27 imirkin_: er wait
16:27 imirkin_: i thought this was an optimus setup
16:27 imirkin_: you need this patch: http://cgit.freedesktop.org/~darktama/nouveau/commit/?id=f153acb3a41432e74fdbdfba9a005007e2957c1c
16:27 imirkin_: and possibly some others
16:28 aaaa: how do i apply it?
16:29 imirkin_: aaaa: do you have an optimus option in your bios?
16:29 imirkin_: aaaa: i'd enable it if i were you... intel graphics are a lot more reliable than nvidia. plus it uses less power.
16:30 aaaa: i dont my laptop supports it. i wish i could use integrated.
16:30 imirkin_: hm ok
16:30 aaaa: *do not think
16:31 imirkin_: well i def didn't see the pci device in your log, but it'd be hidden if it were disabled in the bios
16:31 aaaa: mylaptop cannot use the intel graphics
16:33 aaaa: so how do i apply the patch?
16:34 imirkin_: things will go easier if you first install linux 4.3
17:00 aaaa: well, i did a system update and now xorg crashes but my computer does not freeze and the tty is returned. here is the log https://dpaste.de/LDV1
17:01 gryffus: https://bugs.freedesktop.org/show_bug.cgi?id=70354
17:02 imirkin_: gryffus: probably a different issue
17:02 imirkin_: gryffus: that bug is for the people for whom GPOB didn't help. but we're not even enabling GPOB for GK107
17:03 skeggsb: imirkin_: we are now
17:03 gryffus: imirkin_: oh, so sorry for confusion
17:03 imirkin_: skeggsb: with your patch which is slated for 4.4, yes
17:03 imirkin_: skeggsb: but not with kernel 4.2 or even 4.3
17:03 skeggsb: right
17:05 aaaa: so what should i try?
17:13 imirkin_: aaaa: the thing i said? install linux kernel 4.3 first
17:19 imirkin_: aaaa: that will make it easier to build ben's tree
17:19 aaaa: ok, trying to update now
17:27 imirkin_: gryffus: did my patch help btw? i'm going to send a better one later.
17:33 aaaa: imirkin_: it's not in any debian repository, any way without it?
17:34 gryffus: imirkin_: waiting for rebuild https://build.opensuse.org/package/show/home:gryffus:branches:home:pontostroy:gallium-nine/Mesa and doing a system update meanwhile, i will let you know
17:34 imirkin_: you could build your own kernel... but if you knew how to do that, you probably wouldn't be asking.
17:34 imirkin_: aaaa: you can boot with nouveau.noaccel=1
17:34 imirkin_: aaaa: this should disable acceleration, but give you working monitors
17:34 imirkin_: gryffus: ok thanks
17:34 aaaa: will try
17:35 aaaa: just as a kernel parameter?
17:35 imirkin_: aaaa: yep
17:37 imirkin_: bbl
17:47 aaaa: imirkin_: did not work
19:02 imirkin: aaaa: hmmm... maybe you also need nouveau.nofbaccel=1 in addition to it