00:02 pq: karolherbst, I'm practically always here, just call my nick to get my attention. :-)
00:04 pq: karolherbst, though I don't think I can really help with mmiotrace anymore.
00:07 karolherbst: pq: well I guess still more than most of us here :D
00:08 pq: karolherbst, I poured all my remaining knowledge in that email :-p
00:09 karolherbst: I doubt it is mmiotrace specific though or the code is spread a bit across the tree :/
00:09 pq: of course I can *try* to answer questions, but it'll be memories from 2009
00:09 karolherbst: the main problem I have is that I really don't know in which order stuff gets called
00:10 karolherbst: but I also don't know how linux handles page faults
00:10 pq: the page fault handler has an explicit call to mmiotrace
00:11 pq: ok, so I can recap the general idea:
00:11 karolherbst: There is a nice Documentation/mmiotrace.txt file though, where most of the idea is already written down
00:12 pq: oh yeah
00:12 karolherbst: I just miss the wiring up
00:12 pq: hm, right
00:12 pq: maybe you could dig up the initial patches adding mmiotrace to the kernel?
00:14 karolherbst: good idea actually
00:14 karolherbst: shouldn't be too hard to find
00:15 karolherbst: pq: https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/kernel/trace/trace_mmiotrace.c?id=f984b51e0779a6dd30feedc41404013ca54e5d05
00:15 pq: I'm digging too
00:15 karolherbst: and all the parents of that
00:16 karolherbst: pq: this one is the first? https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/kernel/trace/trace_mmiotrace.c?id=8b7d89d02ef3c6a7c73d6596f28cea7632850af4
00:16 karolherbst: ohh wait
00:16 karolherbst: I should remove the path
00:16 karolherbst: https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=8b7d89d02ef3c6a7c73d6596f28cea7632850af4
00:17 pq: yeah, that's the one
00:18 karolherbst: sadly most of this was moved
00:20 pq: yup
00:21 pq: that patch refers to "page fault notifier chain" which was then completely removed from the kernel
00:21 karolherbst: here is the ftrace plugin patch: https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=f984b51e0779a6dd30feedc41404013ca54e5d05
00:23 pq: hrm... if I 'git log --stat -- path/filename', the stats won't print other files changed...
00:26 karolherbst: I have an idea what the issue might be actually. Well the technical error is, that the kernel gets a second fault on the same memory region and I think that might be while single stepping through the stuff
00:26 karolherbst: wait, will lookup the source code line
00:27 pq: yes, I believe that is what happens
00:27 pq: that's what the secondary or recursive fault error message means
00:27 karolherbst: the stack also kind of tells us this
00:28 pq: I didn't see reliable stack trace from you
00:28 karolherbst: _nv009391rm+0x177a/0x1a80 => page fault (page_fault, do_page_fault) => single stepping into _nv009391rm+0x177a/0x1a80 = page fault!
00:28 karolherbst: soo
00:28 pq: the page fault handler should return before the execution continues in the blob
00:29 karolherbst: the first one?
00:29 karolherbst: yeah makes somewhat sense I guess
00:29 pq: yes - it does not make sense to run blob code from *within* a fault handler
00:29 karolherbst: I think I rebuilt my kernel too often and now gdb lies to me :/
00:30 karolherbst: right
00:30 pq: well, do you have question marks in the kernel's own stack trace?
00:30 karolherbst: the first time I saw the stack it didn't make much sense
00:30 karolherbst: yeah
00:30 cousin_luigi: Greetings.
00:30 pq: that means it's not reliable for the question marked entries
00:30 karolherbst: wait a sec
00:30 cousin_luigi: Am I going to have less problems with kepler on 4.5 ?
00:31 karolherbst: pq: stack-points + the debuging stuff should be fine then I suppose?
00:32 pq: I'm not sure
00:33 karolherbst: well I saw once a stack with less question marks
00:38 pq: karolherbst, I'm getting you a file list...
00:41 pq: arch/x86/mm/kmmio.c arch/x86/mm/mmio-mod.c arch/x86/mm/pf_in.c arch/x86/mm/pf_in.h arch/x86/mm/testmmiotrace.c include/linux/mmiotrace.h arch/x86/mm/pageattr.c arch/x86/mm/fault.c arch/x86/mm/ioremap.c kernel/trace/trace_mmiotrace.c
00:41 pq: karolherbst, that seems a fairly good file list of what was affected by mmiotrace
00:43 pq: karolherbst, ah, joi has also poked mmiotrace in the past (Marcin Slusarz)
00:51 pq: karolherbst, did you see my comment about joi?
00:52 karolherbst: yes
00:52 karolherbst: I rebooted because of recompiled kernel
00:52 karolherbst: now I have a stack with more information and less ?
00:52 karolherbst: pq: https://gist.github.com/karolherbst/c7b46049bff3cf808e5f
00:53 karolherbst: that ___slab_alloc.constprop is somehow weird
00:53 karolherbst: but I can look up all addresses
00:53 karolherbst: but maybe it makes sense
00:54 karolherbst: because allocating memory while handling page faults....
00:54 karolherbst: anyway, it seems like the kernel tries to continue stuff in the page fault handler
00:55 pq: nope
00:55 pq: the only reliable entries in that trace are page_fault -> do_page_fault -> __do_page_fault, which makes sense.
00:55 karolherbst: pq: are the symbols not reliable or also the addresses?
00:55 pq: everything else is just old ghosts
00:56 pq: addresses
00:56 karolherbst: mhh meh :/
00:56 karolherbst: okay
00:56 pq: I believe the question marked entries are just text addresses that happened to be found in the stack area, but not confirmed to be return addresses
00:57 karolherbst: ohh okay
00:57 pq: so they are often just ghosts from calls done and returned earlier
00:59 karolherbst: pq: well the fault happens here: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/arch/x86/include/asm/pgtable.h?id=refs/tags/v4.4.1#n567
01:00 pq: arch/x86/mm/kmmio.c arch/x86/mm/kmmio.h arch/x86/mm/mmio-mod.c arch/x86/mm/pf_in.c arch/x86/mm/pf_in.h arch/x86/mm/testmmiotrace.c include/linux/mmiotrace.h arch/x86/mm/pageattr.c arch/x86/mm/fault.c arch/x86/mm/ioremap.c kernel/trace/trace_mmiotrace.c arch/x86/kernel/mmiotrace/kmmio.c arch/x86/kernel/mmiotrace/kmmio.h arch/x86/kernel/mmiotrace/mmio-mod.c arch/x86/kernel/mmiotrace/pf_in.c
01:00 pq: arch/x86/kernel/mmiotrace/pf_in.h
01:00 pq: karolherbst, that's the file list that should get you about everything mmiotrace related, both old and moved file names
01:00 karolherbst: k
01:00 karolherbst: thanks
01:01 pq: is pmd NULL or something?
01:01 pq: the quess about hugepages is a good one
01:01 pq: *guess
01:02 pq: I suspect kmmio.c or such should handle hugepages explicitly
01:02 karolherbst: mhh how can I check it is null? :D well I have the registers, but I didn't do that kind of debugging yet
01:02 pq: neither have I done that - you check against disasm and the calling convention
01:03 karolherbst: well I have vmlinux opened in gdb
01:03 karolherbst: ohh I could let gdb show me the instruction
01:03 pq: yeh - and stack? with args? or do you not have debug info and source included?
01:04 karolherbst: https://gist.github.com/karolherbst/27e7b05836e6504cdd6a
01:04 karolherbst: the IP points to "0xffffffff81087c71 <+449>: mov (%rbx,%rdx,1),%rsi"
01:05 karolherbst: sadly pte_offset_kernel is "inline pte_offset_kernel" :D
01:05 karolherbst: maybe I should remove that inline and generate a new stack?
01:05 karolherbst: ohh wait
01:05 karolherbst: I should get the args though
01:08 pq: karolherbst, I need to get to other stuff now, but my wild guess would be that hugepages cause some things wrt. page table functions to return NULL where they work for normal pages, and you have to patch kmmio.c or something to explicitly handle hugepages.
01:09 pq: karolherbst, did you try with a kernel built without hugepage support, does it let things work?
01:10 karolherbst: I don't think you can disable hugepages directly though :/
01:10 urjaman: i think it's a kernel option... atleast for the 32bit kernels, not sure if its always there...
01:10 karolherbst: there is one for transparent hugepages
01:11 urjaman: oh yeah
01:11 urjaman: hmm...
01:11 urjaman: i might've gotten confused then
01:11 karolherbst: but I think nvidia just requests a iomm map of like 4MB
01:11 karolherbst: pq: I cut that inline down to 4 instructions, so I think I will figure that one out
01:12 pq: cool, good luck :-)
01:13 karolherbst: pte_index(address) is 0xffffffff81087c64 <vmalloc_fault+436>: shr $0x9,%rbx; 0xffffffff81087c68 <vmalloc_fault+440>: and $0xff8,%ebx
01:13 karolherbst: (pte_t *)pmd_page_vaddr(*pmd) + pte_index(address); is 0xffffffff81087c6e <vmalloc_fault+446>: add %rax,%rbx
01:13 karolherbst: so (pte_t *)pmd_page_vaddr(*pmd) is %rax
01:14 karolherbst: rax=ffff880000000000
01:14 karolherbst: rbx=ffff880000000008
01:15 karolherbst: ohh this is for mov (%rbx,%rdx,1),%rsi though
01:15 karolherbst: so before that add, %rbx is 0x8
01:17 karolherbst: mhh so no, the address can't be 0
01:18 karolherbst: and pmd can't be null either
01:20 karolherbst: I don't understand what "mov (%rbx,%rdx,1),%rsi" does though
01:21 karolherbst: ohh () means address
01:22 urjaman: Assembly 101, though the AT&T syntax is a bit goofy yeah
01:22 karolherbst: still don't get to what (%rbx,%rdx,1) maps
01:23 karolherbst: rdx=00000000f6000000
01:23 karolherbst: ffff880000000008 + 00000000f6000000?
01:23 karolherbst: but what about the 1 then
01:23 * urjaman guesses that is [rbx+rdx+1], but not sure if +1 or *1, but *1 would be silly
01:23 karolherbst: ohhhh
01:23 karolherbst: noooo
01:23 karolherbst: I now
01:23 karolherbst: *know
01:23 karolherbst: ffff880000000008 + 1*00000000f6000000
01:24 karolherbst: makes sense, right?
01:24 karolherbst: base + size*index
01:25 karolherbst: which gives us this: BUG: unable to handle kernel paging request at ffff8800f6000008
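The address arithmetic worked out above can be checked with a short sketch. AT&T syntax writes a memory operand as disp(base,index,scale), which means base + index*scale + disp. The helper name `effective_address` is made up for illustration; the register values are the ones quoted in the log.

```python
MASK64 = (1 << 64) - 1  # addresses wrap at 64 bits


def effective_address(base, index=0, scale=1, disp=0):
    """Effective address of an AT&T operand written as disp(base,index,scale)."""
    return (base + index * scale + disp) & MASK64


# Register values quoted in the log for "mov (%rbx,%rdx,1),%rsi".
rbx = 0xFFFF880000000008
rdx = 0x00000000F6000000

addr = effective_address(rbx, rdx, 1)
print(hex(addr))  # 0xffff8800f6000008
```

The result matches the address in the "unable to handle kernel paging request" message.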
01:25 urjaman: oh
01:26 karolherbst: which should point to a pte_t structure
01:26 karolherbst: but guess what, it is marked as paged out?
01:26 karolherbst: pq: should it be ever possible for a pte_t struct to be paged out?
01:29 pq: karolherbst, I guess things go wrong earlier
01:29 karolherbst: yeah
01:29 pq: we do not detect that it is a hugepage, we continue handling the page table entries like it was a normal page, and get crap
01:29 pq: so what you'd want to check is how does hugepage handling differ from normal page handling
01:30 karolherbst: mhh
01:30 karolherbst: the pages are bigger
01:30 urjaman: yeah i think hugepage is one less levels of the indirection
01:30 karolherbst: urjaman: no
01:30 karolherbst: it is the same
01:30 karolherbst: just bigger
01:30 karolherbst: basically
01:30 urjaman: so you'd get pointer to the list of 4k pages normally there
01:30 pq: karolherbst, no, he means in the pte, pmd, pgd hierarchy, or what they were called
01:30 karolherbst: ahhh okay
01:31 urjaman: but instead there's a bit saying "this is hugepage" and it points somehow to the pagey stuff... i did this once back, so i've forgotten
01:31 karolherbst: then I misunderstood, sorry
01:31 pq: so I suppose you'd skip one of the hierarchy levels somehow, maybe
01:32 Dezponia: karolherbst: Anything new under the sun to poke at with Kepler reclocking or such you'd like an extra test on?
01:32 urjaman: like a 4k page can hold 1024 entries that maps to.. 4k*1024 of stuff
01:32 urjaman: so instead we just map it to a 4k*1024 sized page
01:32 urjaman: = 4MB
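The arithmetic above (one table's worth of entries times 4k) can be sketched for both x86 flavours. This assumes the usual layouts: 4-byte entries on classic 32-bit x86 without PAE, 8-byte entries on x86-64. The function names are made up for illustration.

```python
PAGE_SIZE = 4096  # bytes in a normal page, which is also the size of one page table


def entries_per_table(entry_bytes):
    """How many page-table entries fit in one 4 KiB table."""
    return PAGE_SIZE // entry_bytes


def large_page_size(entry_bytes, levels_skipped=1):
    """Bytes mapped by one entry when that many lower table levels are skipped."""
    return PAGE_SIZE * entries_per_table(entry_bytes) ** levels_skipped


MIB = 1 << 20
GIB = 1 << 30

print(large_page_size(4) // MIB)     # 32-bit, no PAE: 1024 entries -> 4 (MiB)
print(large_page_size(8) // MIB)     # x86-64: 512 entries -> 2 (MiB)
print(large_page_size(8, 2) // GIB)  # x86-64, two levels skipped -> 1 (GiB)
```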
01:33 karolherbst: Dezponia: nope, still messing around with fixing mmiotrace and nvafakebios to get that volting issue fixed
01:33 karolherbst: urjaman: ahhhh
01:33 urjaman: go look it up in some datasheet :), i've very well forgotten all the bits etc
01:33 karolherbst: wait
01:33 karolherbst: it makes sense!
01:33 karolherbst: guess what
01:33 karolherbst: mmiotrace marked that page as pageout
01:33 karolherbst: which contains those page information
01:34 pq: that would be bad :-)
01:34 karolherbst: that's why the address pointing to pte_t isn't there
01:34 karolherbst: but it makes sense, right?
01:35 urjaman: yes for me, but i'm just a random reader here :P
01:36 Dezponia: karolherbst: I don't think I had to change voltage for my card? I could just change the clock regardless, if I recall (unless I'm thinking of something else).
01:37 karolherbst: Dezponia: yeah then you could simply use my master_karol_no_touchy branch
01:37 karolherbst: there should be everything in there
01:37 karolherbst: right vbios parsing, dynamic reclocking, other stuff
01:38 karolherbst: and a volt hack for pwm based gpus
01:38 Dezponia: karolherbst: Oooo, dynamic reclocking even? :)
01:38 karolherbst: 'course
01:38 karolherbst: everything is pretty much done (like 75% done)
01:38 karolherbst: just that volting issue is really annoying
01:38 pq: karolherbst, btw. how did you confirm that it is mmiotrace marking the page table page as not present, rather than following a random pointer?
01:38 Dezponia: karolherbst: Great to hear! Do I still have the Heaven score record BTW? :P
01:38 karolherbst: Dezponia: with dynamic reclocking it is even worse
01:39 karolherbst: Dezponia: you or Tom^
01:39 karolherbst: Dezponia: but
01:39 karolherbst: Dezponia: I am sure your gpu will crash with dynamic reclocking, because driving at lower core clocks + high memory clocks shows the volting issue more obvious
01:39 karolherbst: at least this was with my gpu
01:39 karolherbst: pq: could be
01:40 Dezponia: karolherbst: Interesting. If I get some time I'll try it out. I'm excited for the dynamic reclocking :)
01:40 karolherbst: pq: pte_index(address) returns 0x8
01:40 karolherbst: mhh maybe I get address
01:42 karolherbst: nope: shr $0x9,%rbx; and $0xff8,%ebx
01:42 karolherbst: mhhh
01:42 karolherbst: 0x8 << 9 is 0x1000 though
01:42 karolherbst: could have been 0xffffc90010001070
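The two instructions quoted above ("shr $0x9" followed by "and $0xff8") are a folded form of "take the page number within the table, then multiply by the entry size". A minimal sketch, assuming the x86-64 constants (12-bit page offset, 512 entries per table, 8 bytes per entry); the function names are illustrative, and the address is the candidate mentioned in the log.

```python
PAGE_SHIFT = 12     # 4 KiB pages
PTRS_PER_PTE = 512  # entries per page table on x86-64
PTE_BYTES = 8       # size of one entry on x86-64


def pte_byte_offset_asm(address):
    """What the two quoted instructions compute: shr $0x9 then and $0xff8."""
    return (address >> 9) & 0xFF8


def pte_byte_offset(address):
    """The same value spelled out: table index times entry size."""
    return ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) * PTE_BYTES


addr = 0xFFFFC90010001070  # candidate faulting address from the log
print(hex(pte_byte_offset_asm(addr)))  # 0x8
print(hex(pte_byte_offset(addr)))      # 0x8
```

A byte offset of 0x8 means entry index 1, which is the second 4 KiB page covered by that table.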
01:42 pq: karolherbst, I think you should check how hugepages really work with the page tables before guessing too much ;-)
01:43 pq: like does a pte exist for a hugepage at all?
01:43 karolherbst: ohhh
01:43 karolherbst: I know what you mean
01:43 pq: or maybe one of the middle entries in the page table hierarchy
01:44 karolherbst: you mean the kernel just goes a wrong path and thinks there might be pte_ts but in fact it is the memory allocated by nvidia
01:45 pq: ...
01:45 pq: yeah, wrong path, but not in those words :-)
01:45 karolherbst: :D
01:46 pq: please go look how page tables work, as I cannot explain that to you
01:46 pq: or maybe it *is* in those words, not sure I understood right...
01:46 pq: anyway, that's the clue I'd follow for now
01:48 karolherbst: it would be really nice to know where that first page fault was triggered though
03:54 karolherbst: pq: look what I found: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/arch/x86/mm/ioremap.c#n69
03:54 karolherbst: pq: quick and dirty workaround: don't create transparent huge pages when the stuff is aligned
03:54 karolherbst: I mean "kernel huge I/O mapping"
03:55 karolherbst: and it also somehow says, that changes in PAT/MTRR stuff could also disable this
04:04 pq: karolherbst, nice. You could probably hack it to not use hugepages somewhere, if you are looking for a one-off workaround.
04:05 karolherbst: it would also verify the issue
04:05 pq: making mmiotrace support hugepages would be the upstreamable fix
04:05 karolherbst: yeah
07:19 Tom^: well now with my new gpu cooler i could probably OC it a bit more and beat Dezponia
07:19 Tom^: ;)
07:19 Tom^: karolherbst: btw the max clocks with nvaboost is still not quite what the blob gives
07:21 Tom^: karolherbst: in nouveau i only get 1078 while the blob puts me on 1098 until im reaching uh 80C (if i recall the temps correct)
07:22 karolherbst: Tom^: yeah, because you use a higher voltage due to the hack
07:23 Tom^: does that give me lower available cstates? O_o
07:23 karolherbst: yeah
07:23 Tom^: i see
07:25 Dezponia: Tom^: Lies! You'll never beat me! NEVAR!
07:26 karolherbst: Tom^: we could do one thing though :D
07:26 Tom^: Dezponia: one of my fans broke so i ordered a arctic cooler and replaced it, it even came with backside cooling, now on full load and full blob boost i only ever reach around 68C
07:26 Tom^: Dezponia: xD
07:27 Tom^: so i can probably use 1.2V and increase the clocks until it goes unstable, and i think that is around 1250~ ish if you compare to other overclockers.
07:27 karolherbst: Tom^: in this if: https://github.com/karolherbst/nouveau/blob/master_karol_no_touchy/drm/nouveau/nvkm/subdev/clk/base.c#L187
07:27 karolherbst: next line you could do something like freq+=1000
07:27 karolherbst: freq is in khz
07:28 karolherbst: so +=1000 means +1MHz
07:28 karolherbst: that would be like adding a freq offset to all cstates
07:28 karolherbst: which OC tools usually do
07:28 Tom^: ah mhm
07:28 Tom^: debugfs file for this for support ! :P
07:28 karolherbst: it is important to be done in the if clause because otherwise you would also upclock other engines
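The idea of adding an offset to one clock domain only can be sketched in isolation. This is not the nouveau code: the dictionary layout, the domain names and the memory clock value are made up for illustration; only the kHz unit and the 1078 MHz core clock come from the log.

```python
def apply_core_offset(cstate_khz, offset_khz, domain="core"):
    """Return a copy of a cstate with the offset added to a single domain.

    cstate_khz maps a (hypothetical) domain name to its frequency in kHz.
    All other domains are left untouched, so other engines are not upclocked.
    """
    return {
        name: freq + offset_khz if name == domain else freq
        for name, freq in cstate_khz.items()
    }


# Core clock from the log (1078 MHz); the memory clock is a made-up placeholder.
cstate = {"core": 1_078_000, "memory": 3_004_000}

boosted = apply_core_offset(cstate, 50_000)  # +50 MHz, expressed in kHz
print(boosted["core"])    # 1128000
print(boosted["memory"])  # 3004000
```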
07:28 Tom^: *for OC support.
07:28 karolherbst: later maybe
07:29 karolherbst: we need to respect the power budget first
07:29 Tom^: hehe
07:29 karolherbst: after that's done we can have such offset clock
07:29 karolherbst: adding 50 MHz should be fine for your gpu
07:29 karolherbst: maybe even 100 or 200 would be
07:30 karolherbst: I managed to get +135MHz with mine gpu with the binary driver
07:30 karolherbst: *my?
07:30 karolherbst: :D
07:30 karolherbst: such an offset is also the same what the blob does with coolbits enabled
07:30 karolherbst: but it also clocks up other engines as well
07:31 Tom^: mm
07:31 Tom^: would be quite cool having finer OC control with nouvau the nvidias coolbit :P
07:31 karolherbst: but other engines have to be adjusted by the values in the boost table
07:31 karolherbst: "* boostS.percent / 100" there is this line in the clk/base.c file
07:31 Tom^: *then/than
07:32 karolherbst: well we could also add a voltage offset
07:32 karolherbst: or also more cstates or whatever
07:34 nightfuri: Hey guys kde plasma and nouveau doesn't get along well ? i am facing some flickering issues while shifting between windows and for a little time those windows sometimes looks messed up. how can i get it fixed ?
07:39 RSpliet: nightfuri: first start with providing more information; used GPU, distribution, kernel version, libdrm, mesa
07:47 nightfuri: RSpliet, http://dpaste.com/1DY8E80 couldn't screenshot the flickering but sometimes messed up windows looks similar to this- https://i.imgur.com/bnzkIyX.jpg
07:51 RSpliet: nightfuri: okay, that looks one heck of a lot like my laptop - apart from me using Gnome Shell rather than KDE Plasma
07:53 RSpliet: not that I'm sure it matters, but you set your back-end to OpenGL 3.1 through GLX, whereas your paste reports "GLX Version: 3.0"
07:53 nightfuri: RSpliet, the issue or my specs ?
07:53 RSpliet: your specs
07:55 RSpliet: KDE might be trying to use features not implemented by nouveau - although I can't imagine which one that may be (since iirc the only thing really missing is dxtn textures for legal reasons)
07:56 nightfuri: RSpliet, i have used same setting on another distro's kde plasma but there i use nvidia driver.
07:56 nightfuri: anyway to figure out whats causing the problem ?
07:58 RSpliet: nightfuri: nvidia's driver supports more features than nouveau; what other options are in the "Rendering backend" drop-down? OpenGL 2.0 or 3.0 by any chance?
08:01 nightfuri: RSpliet, openGL 3.1 openGL 2.0 and Xrender only. OpenGL 2.0 was the default i changed to 3.1 as i had issues there too on a fresh install. haven't checked how it is after updating the system
08:01 RSpliet: (bbl, surely some other people around here might be able to help)
08:01 RSpliet: try it first to rule out incompatibility issues
08:01 nightfuri: ok
08:01 RSpliet: karolherbst, mupuf: aren't you KDE users? what's your take on this? nouveau bug or KDE issue?
08:02 karolherbst: well I have my intel gpu
08:02 karolherbst: ohh wait
08:02 karolherbst: try to disable dri3
08:03 karolherbst: there was also an article on phoronix about dri3: https://www.phoronix.com/scan.php?page=news_item&px=DRI3-Nouveau-Display-Woes
08:03 karolherbst: nightfuri: see if disabling window shadows helps
08:05 karolherbst: or just change the color of the shadow
08:05 karolherbst: maybe the black border equals that color
08:05 imirkin: nightfuri: try updating mesa to 11.1.1, that had a bunch of nouveau fixes, dunno if they'd affect your situation or not though.
08:06 nightfuri: karolherbst, what setting is that ?
08:06 imirkin: [and there's a 11.1.2 coming out in a day or two, which will have even more fixes]
08:06 karolherbst: nightfuri: with plasma 5: application style => windows decorations => press button of style => shadows
08:08 nightfuri: ok changed back to opengl2.0 flickering is gone for now. but this is what happening when i changed from 2.0 to 3.1 before too and i got the issue back later. weird
08:08 nightfuri: karolherbst, ok
08:09 karolherbst: nightfuri: I meant dri3, that's something else, but we can look into that later
08:12 nightfuri: changed to 3.1 again no flickering
08:12 nightfuri: karolherbst, ok changed the shadow from black to blue
08:13 karolherbst: so are those block borders blue now?
08:14 karolherbst: ohh wait, I saw the window buttons getting strange a bit sometimes, now I remember, but that's like so rare for me
08:15 nightfuri: karolherbst, yeah i can see the color in the window borders
08:15 karolherbst: k
08:16 karolherbst: nightfuri: I guess you can further manipulate this area with the shadow parameters? like make them bigger if you increase the shadow size and so on?
08:16 nightfuri: imirkin, thank you i will check that too :)
08:18 nightfuri: karolherbst, yeah its increasing as i increase the size.. this could be related to my issue ?
08:19 karolherbst: yeah well, I guess something is wrong with creating those shadows in the compositor
08:19 karolherbst: might be a ddx issue though
08:19 nightfuri: oh
08:20 karolherbst: I have no clue how kwin creates the shadows though
08:20 karolherbst: nightfuri: would you like to create an apitrace of kwin?
08:20 karolherbst: ohh wait
08:20 karolherbst: no idea if that works at all
08:20 karolherbst: let me try that first
08:20 nightfuri: ok
08:21 karolherbst: it is slow as hell
08:21 karolherbst: mhh and nothing to see really :/
08:22 karolherbst: really didn't know what I expected :/
08:23 infoscav: a miracle, obviously
09:58 karolherbst: pq: do you know where mmiotrace marks the page as not available?
10:19 karolherbst: pq: ohhhh, ohhhh: http://lxr.free-electrons.com/source/arch/x86/mm/kmmio.c#L431
10:21 karolherbst: I bet this "size += PAGE_SIZE;" causes troubles
10:38 Tundr4: Hi all. Hi all. For some reason Xorg gets stuck with llvmpipe. Here's what the kernel says at startup: firmware: failed to load nvidia/gm204/fecs_inst.bin (-2)
10:38 Tundr4: lol the Hi all got pasted twice. Sorry (I'm not a bot)
10:42 karolherbst: k
10:43 karolherbst: so I fixed the issue with the second hit, at least that is gone now
10:44 karolherbst: and now I have to tell mmiotrace to fault the begin of the page not somewhere in the middle
10:47 pmoreau: Tundr4: Possibly due to the renaming of those files which happened in some recent kernel version (maybe 4.4?)
10:47 pmoreau: Oh wait, gm204
10:47 karolherbst: ohh nice, and now my kernel also doesn't get messed up anymore :) still got internet access
10:47 pmoreau: Probably the ones NVIDIA is supposed to release… some day
10:47 pmoreau: So no acceleration
10:47 pmoreau: karolherbst: \o/
10:48 pmoreau: mmiotrace fixed?
10:48 karolherbst: not yet
10:48 karolherbst: but it is better now
10:48 Tundr4: ohhh fuck. Thanks man.
10:48 karolherbst: pmoreau: real close though
10:49 pmoreau: Awesome!! I most likely need mmiotrace back to fix the lockup while switching cards
10:49 karolherbst: pmoreau: just replaced "size += PAGE_SIZE;" with "size += p->len" in http://lxr.free-electrons.com/source/arch/x86/mm/kmmio.c#L431
10:50 karolherbst: this should fix the kernel-state corruption at least
10:50 karolherbst: pmoreau: you also got the problem that after some minutes internet was just gone and "sudo -s" hang like forever?
10:51 pmoreau: Never tried `sudo -s`…
10:51 karolherbst: k
10:51 karolherbst: then did su root work?
10:51 pmoreau: I tend to use `su -c` or just `su`.
10:51 pmoreau: I don't recall any hang, so I would say yes :-)
10:51 karolherbst: :D
10:51 karolherbst: k
10:52 karolherbst: but your internet got broke?
10:52 karolherbst: but that could be driver dependent
10:52 karolherbst: anyway, that my internet still works is a good sign
10:53 karolherbst: ohh maybe I should also do that in unregister
10:59 pmoreau: Oh, you're referring to yesterday, when it took me a while to connect back?
11:00 karolherbst: pmoreau: did you try to mmiotrace?
11:03 pmoreau: I haven't tried
11:04 pmoreau: I might try a bit later, but have to take care of other things first.
11:38 karolherbst: pmoreau: at least I understand the issue now
11:38 pmoreau: That's great! :-)
11:38 karolherbst: basically nvidia requests a 16MB mapping
11:38 karolherbst: and this aligns perfectly to a 4MB huge page
11:39 karolherbst: but mmiotrace only operates on 4k page sizes
11:40 karolherbst: then if like 4k+0x70 is accessed, mmiotrace wants to page fault the "small" page behind 4k+0x70, but because the mapping is aligned to 4MB, 4k+0x70 is still in the "first" page of the mapping, but mmiotrace doesn't think it is and tries to mark the page behind 4k+0x70 as not there
11:40 karolherbst: of course the page fault can't be handled, because there is no page
11:42 karolherbst: ohh wait, it is actually 2MB or 1GB, but it doesn't matter much
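The mismatch described above comes down to rounding the same address with two different page sizes. A minimal sketch, assuming a 2 MiB-aligned mapping base (the value below is a placeholder, not taken from a real trace) and the "4k+0x70" offset used in the explanation:

```python
SZ_4K = 4 << 10
SZ_2M = 2 << 20


def page_base(addr, page_size):
    """Round an address down to the start of the page that contains it."""
    return addr & ~(page_size - 1)


mapping = 0xFFFFC90010000000     # assumed 2 MiB-aligned base of the ioremap'd region
access = mapping + SZ_4K + 0x70  # the "4k+0x70" access from the explanation

print(hex(page_base(access, SZ_4K)))  # 0xffffc90010001000, the page a 4k-only tracer looks for
print(hex(page_base(access, SZ_2M)))  # 0xffffc90010000000, the page that actually backs it
```

With a huge-page mapping there is no separate 4 KiB entry at the first address, so code that assumes one ends up reading something that is not a page table entry.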
11:43 karolherbst: painful is, I don't find any documentation for this at all
12:14 karolherbst: wish me luck :)
12:15 karlmag: karolherbst: luck
12:15 imirkin_: karolherbst: what's your question again?
12:16 imirkin_: there are large pages which are a full PDE's worth of entries
12:16 imirkin_: and then there are HUMUNGOUS pages which are like 2GB or so
12:16 karolherbst: imirkin_: right, and we get such
12:16 imirkin_: i doubt we get a huge page
12:16 karolherbst: there are three sizes: 4k, 2M and 1G
12:16 imirkin_: we probably just get a 4M page
12:16 karolherbst: imirkin_: we get for sure
12:16 imirkin_: (2M? are you sure? i think 4M...)
12:16 karolherbst: 2m
12:16 karolherbst: this is ioremap
12:17 imirkin_: i guess for those mega-pages there's actually a 3rd page level
12:17 karolherbst: yeah
12:17 karolherbst: 1G
12:17 imirkin_: and it's in THAT page level that some bit is set
12:17 karolherbst: imirkin_: read this: http://lxr.free-electrons.com/source/arch/x86/mm/ioremap.c#L69
12:17 karolherbst: and nvidia tries to map a 16MB region
12:17 karolherbst: guess what happens
12:18 Guest66899: imirkin_: how do I send you the save state file?
12:19 Guest66899: wetransfer.com?
12:21 karolherbst: imirkin_: also the pg_level enum here: http://lxr.free-electrons.com/source/arch/x86/include/asm/pgtable_types.h#L421
12:21 karolherbst: I hacked something together, maybe it works, most likely not
12:26 karolherbst: at least I didn't broke mmiotrace
12:26 karolherbst: wtf..
12:26 karolherbst: it is working?
12:26 karolherbst: mhh looks odd though
12:26 karolherbst: it works
12:27 karolherbst: I don't believe this...
12:28 karolherbst: pmoreau: fixed?
12:28 karolherbst: trace is 172MB big, looks okayish
12:29 karolherbst: mhhh though it looks fishy
12:29 karolherbst: k, I messed something up, but I don't know what exactly
12:29 karolherbst: ohh I see it now
12:30 karolherbst: but this looks okayish, doesn't it? https://gist.github.com/karolherbst/f19ad5e0f24cc38b00fb
12:34 karolherbst: k, second try then
12:38 * Guest66899 is jealous of karolherbst's trace
12:38 karolherbst: it was broken :p
12:39 imirkin_: Guest66899: filebin.ca
12:39 imirkin_: or... some similar site, doesn't matter
12:39 karolherbst: yay
12:39 karolherbst: it is working!
12:40 karolherbst: but does that sounds right? "[0] 170.458178 RAMIN8 96210 1f6dc7e210 <= 0"
12:40 karolherbst: I think the RAMIN8 things are still a bit messed up
12:41 imirkin_: it normally reads the vbios like that i think?
12:41 imirkin_: maybe with MMIO8 though, i forget
12:41 karolherbst: it looks odd, maybe it is still right
12:41 karolherbst: will do some serious stuff now
12:42 karolherbst: k, I find reclocking stuff and everything
12:42 karolherbst: so it isn't that bad
12:46 karolherbst: and that's the patch: https://gist.github.com/karolherbst/903bf75486134dd9505d
12:48 karolherbst: pmoreau: if you have time, please test this: https://gist.github.com/karolherbst/903bf75486134dd9505d
12:48 karolherbst: I might have forgotten something, because I didn't changed stuff in my git tree so I had to rely on diff
12:48 karolherbst: I don't think I changed anything else though
12:51 Guest66899: I will try valgrind yet another time
12:58 karolherbst: pq: there?
12:59 karolherbst: or maybe the blob just does something new
13:02 karolherbst: mhh pgraph stuff is missing in the trace
13:03 karolherbst: and I get UNKNOWN entries
13:03 karolherbst: UNKNOWN 362.870544 15 0xf0140448 f3,a4,c3 0x0 0
13:05 karolherbst: ohh yeah, unknown entries in the big mappings: MAP 358.616177 15 0xf0000000 0xffffc90012000000 0x1000000 0x0 0
13:46 imirkin_: skeggsb: does this need to go into 4.5-rc? https://github.com/skeggsb/nouveau/commit/eb87d86fd2c1395485d5cea93fe6159146fd1d9b
14:17 karolherbst: ahh pgraph is now there too :)
14:17 karolherbst: I messed up unaligned regions a bit
16:28 robclark: imirkin, is there a secret decoder ring somewhere for nv cards? Got someone claiming that 'Nvidia NVS 315' card won't communicate w/ any monitor.. not really sure which gen that works out to..
16:29 No1RL355: new lts blob released. looks like they opened some libraries
16:30 No1RL355: https://github.com/NVIDIA/libglvnd I'm probably a slowpoke, but it may be interesting
16:31 glennk: robclark, https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units
16:31 robclark: libglvnd is just the new vendor-neutral dispatch lib..
16:31 robclark: glennk, ta
16:32 glennk: in case its not listed in https://nouveau.freedesktop.org/wiki/CodeNames/
16:32 robclark: glennk, ok, so should be GF119, I guess.. he also mentioned something about an 'nvidia 500' tho no sure what that is yet..
16:33 glennk: think all of the 5xx ones are fermi variants?
16:34 RSpliet: glennk: except the GeForce 3 TI 500
16:34 glennk: thats a 3 series :-p
16:34 robclark: well, I'll try to find out what 'nvidia 500' means.. and either way I should be able to get some kernel logs and have a closer look at what is going on..
16:35 robclark: it's someone local in the office
18:42 jamesjr: Hello. I'm new to this developing scene. Is there some way I can help out?
18:46 jamesjr: Erm I saw on the trello link that the fbo-blending formats need some function called dst_alpha overridden. I have a geforce gtx 750m is there some way I could go about hard coding that and overriding it?
19:21 skeggsb: imirkin_: no, it doesn't need to be. the bit that makes it timeout only appears in gm20x, which we can't accelerate
19:40 jamesjr: Ok. Is there some other way I can help out? I am a novice at this kind of stuff.
19:43 jamesjr: oh failure on my comprehension part.
19:46 imirkin: skeggsb: ok. if you saw there was a report on phoronix of LTC-related errors on GM107, figured it might be connected.
19:47 imirkin: robclark: http://nouveau.freedesktop.org/wiki/CodeNames but it's incomplete. lspci should tell you with some precision.
19:47 imirkin: robclark: lspci is authoritative, the marketing names are not.
19:48 imirkin: jamesjr: yeah, so the deal with that card is to make RGB10 work with DST_ALPHA blending
19:48 imirkin: jamesjr: the thing about RGB10 is that there is no alpha, so DST_ALPHA is always 1
19:48 imirkin: jamesjr: however the hw doesn't have a RGB10X2 surface format, only RGB10A2
19:49 imirkin: jamesjr: so it doesn't realize that it should return 1 for dst alpha, and instead returns whatever junk is in those 2 alpha bits
19:50 jamesjr: So a new RGB10X2 needs to be written against this DST_ALPHA function?
19:50 imirkin: jamesjr: the idea is that you'd make 2 pre-computed command lists from a single blend state - one for "regular" and one for "dst alpha becomes 1", and use the right one depending on the fb format
19:50 imirkin: jamesjr: what GPU do you have? GK107?
19:51 jamesjr: GK107M
19:51 imirkin: ok cool
19:51 imirkin: that should be a moderately well-supported GPU
19:51 imirkin: what's your level of familiarity with GL?
19:52 jamesjr: None, I went over all those documents on the nouveau freedesktop and have been studying C++/C in my offtime from work/school
19:52 imirkin: ah ok
19:53 imirkin: so all this blending stuff that i'm talking about probably makes no sense to you then
19:53 jamesjr: specifically that yes
19:53 jamesjr: should I study and come back?
19:53 imirkin: not necessarily
19:54 imirkin: i just like to know who my audience is and adjust accordingly :)
19:54 imirkin: trying to think of a good kepler beginner task
19:55 imirkin: i guess that one's not so bad
19:55 imirkin: jamesjr: have a look at https://www.opengl.org/wiki/Blending#Blending_Parameters
19:55 imirkin: to better understand blending
19:57 jamesjr: So what is the intent of multiplying the four separate integers, erm blending them together?
19:58 imirkin: blending is used to combine a drawn polygon with the rest of the framebuffer
19:58 imirkin: so for example you might want to alpha-blend it on there so that it's 50% transparent
19:58 imirkin: or you might want to perform one of the fancier operations described
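As a rough illustration of the "50% transparent" case, the sketch below applies the usual blend equation to a single pixel, with source factor SRC_ALPHA and destination factor ONE_MINUS_SRC_ALPHA. The function name and the float-tuple representation are assumptions made for this example; on real hardware the blend unit does this work, not Python.

```python
def blend_src_alpha(src, dst):
    """Blend one RGBA source fragment over a destination pixel.

    Computes result = src * src_alpha + dst * (1 - src_alpha) per channel,
    i.e. source factor SRC_ALPHA and destination factor ONE_MINUS_SRC_ALPHA.
    Colors are (r, g, b, a) tuples of floats in [0, 1].
    """
    src_alpha = src[3]
    return tuple(s * src_alpha + d * (1.0 - src_alpha) for s, d in zip(src, dst))


# A 50% transparent red fragment drawn onto a cleared (black, opaque) pixel.
red_half = (1.0, 0.0, 0.0, 0.5)
black = (0.0, 0.0, 0.0, 1.0)
print(blend_src_alpha(red_half, black))
```

The destination pixel only contributes through its stored values, which is why a destination factor such as DST_ALPHA depends on what the framebuffer format actually holds.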
20:04 jamesjr: ok so it is manipulating the R, G, B, and A values to produce a more refined output per the equation? I.e. making the output seem like it fits on the framebuffer by adjusting it according to its color code and against the destination buffer's alpha/color codes?
20:06 imirkin: think about how drawing happens
20:06 jamesjr: So when you say fixing the input you mean finding an equation to fit R,G,B 10, A2 not returning junk in the last two bits
20:06 imirkin: you have a framebuffer
20:06 imirkin: you cleared it
20:06 imirkin: it's all black
20:06 imirkin: then you draw a triangle
20:06 imirkin: then you draw another triangle
20:06 imirkin: and another
20:06 imirkin: and another
20:06 imirkin: these all get drawn onto the framebuffer one at a time
20:07 imirkin: [at least logically]
20:07 imirkin: blending is the process of taking existing framebuffer data and fragments that are part of the rasterized triangle
20:07 imirkin: and computing new framebuffer values
20:07 imirkin: framebuffers can have formats
20:07 imirkin: some formats have alpha, some don't
20:08 imirkin: if the format doesn't have alpha, then DST_ALPHA should always return 1
20:08 imirkin: however there is no RGB10X2 format in hardware, only a RGB10A2 format
20:08 imirkin: so the hw doesn't know it should be returning a 1
20:08 jamesjr: But in that case it returns the junk in X2?
20:09 imirkin: exactly
20:09 imirkin: so in that case
20:09 imirkin: we need to "manually" adjust the destination blend parameter to "ONE" instead of "DST_ALPHA"
20:09 imirkin: (or ZERO instead of DST_INVERTED_ALPHA)
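The "manual" adjustment described here could look roughly like the sketch below. The factor names are plain strings copied from the conversation, and `fix_blend_factor` is a hypothetical helper, not a function from the driver; a real implementation would work on the driver's own enums.

```python
# Replacement for factors that read destination alpha, used when the
# framebuffer format has no alpha channel (destination alpha is then 1).
NO_ALPHA_REMAP = {
    "DST_ALPHA": "ONE",            # dst alpha == 1
    "DST_INVERTED_ALPHA": "ZERO",  # 1 - dst alpha == 0
}


def fix_blend_factor(factor, fb_has_alpha):
    """Return the blend factor to program for the given framebuffer.

    Factors are left alone when the framebuffer really has alpha, and
    when they do not read destination alpha in the first place.
    """
    if fb_has_alpha:
        return factor
    return NO_ALPHA_REMAP.get(factor, factor)


print(fix_blend_factor("DST_ALPHA", fb_has_alpha=False))
print(fix_blend_factor("DST_ALPHA", fb_has_alpha=True))
```

The remap is only needed because the stored format has alpha bits that the API-level format pretends are absent; with a true alpha-less hardware format nothing would have to change.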
20:10 imirkin: it's a bit of an esoteric situation
20:10 imirkin: using DST_ALPHA blending on a framebuffer with no alpha channel is... idiotic
20:12 jamesjr: Oh so currently it doesn't check its own consistency and then returns junk
20:12 imirkin: well, we have no way to tell the hw that it should return 1
20:13 jamesjr: Can I not just write two functions to run as sanity checks?
20:13 imirkin: ?
20:14 jamesjr: like something to check the input of the possibly nonexistent buffer for both dst_alpha and dst_inverted_alpha?
20:14 imirkin: right, so you need to do _something_ along those lines
20:14 imirkin: this is complicated by the fact that there can be multiple framebuffers
20:14 imirkin: each with their own formats
20:14 imirkin: thankfully at least fermi+ supports per-render target blend setting
20:15 imirkin: half of the tesla line doesn't support that, and the other half supports it in an odd way
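With per-render-target blend settings, the choice between the "regular" and "dst alpha becomes 1" variants can be made separately for each target. The sketch below shows that idea only; the format names, the dict layout, and both helpers are assumptions for illustration and are not taken from nvc0_blend_state_create.

```python
# Formats assumed (for illustration) to have no alpha at the API level
# while being stored in a hardware format that does have alpha bits.
FORMATS_WITHOUT_ALPHA = {"RGB10X2"}

# Factors that read destination alpha and what they become when it is 1.
REMAP = {"DST_ALPHA": "ONE", "DST_INVERTED_ALPHA": "ZERO"}


def make_variants(blend):
    """Precompute (regular, dst_alpha_is_one) variants of one blend state."""
    fixed = {key: REMAP.get(value, value) for key, value in blend.items()}
    return blend, fixed


def select_per_target(blend, target_formats):
    """Pick the right precomputed variant for each render target format."""
    regular, fixed = make_variants(blend)
    return [fixed if fmt in FORMATS_WITHOUT_ALPHA else regular
            for fmt in target_formats]


blend = {"src_factor": "DST_ALPHA", "dst_factor": "DST_INVERTED_ALPHA"}
for fmt, state in zip(["RGB10A2", "RGB10X2"],
                      select_per_target(blend, ["RGB10A2", "RGB10X2"])):
    print(fmt, state)
```

Both variants are built once from the same blend state, so picking one at bind time is a lookup per render target and not a recomputation.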
20:16 jamesjr: So it needs to be complex enough to work with the tesla line and other half, yet simple enough to use the efficiency of per-target rendering.
20:17 imirkin: don't worry about tesla for now
20:17 imirkin: take a look at nvc0_blend_state_create
20:18 jamesjr: Ok, now I am at darktama/nouveau, is that the correct repository?
20:18 imirkin: oh fun. fermi+ is also a little odd.
20:18 imirkin: jamesjr: i think you're looking for mesa.
20:27 jamesjr: Looking at nvc0_blend_state