00:59blackerking: I am the owner of an MSI notebook: MS-1683.
01:00blackerking: Every time, I have to install the original nvidia drivers or at least boot with "nomodeset", because it fails to start with the nouveau drivers.
01:01blackerking: How can I support this project with my error?
01:02blackerking: GPU is a NVIDIA® Geforce 8200M G
01:02blackerking: The same error occurs with Ubuntu and Kali Linux
01:31blackerking: Any developer recently joined?
01:34specing: maybe it went up in flames
01:49RSpliet: blackerking: that's obviously not supposed to happen; could you start with obtaining dmesg from a failed boot attempt? That's... going to be tricky, but if you have multiple machines you might be able to set up netconsole
01:49RSpliet: or use a serial console if you have the hardware (tends to be easier)
01:52blackerking: Is dmesg able to read previous boot attempts?
01:53RSpliet: the tool isn't, but the log might be saved in either /var/log or journald (systemd) if you use the magic SysRQ sync before powering off your pc
02:00pq: blackerking, can you be more specific on what "failing to start" means? Does the machine stop responding on the network?
02:22blackerking: Only the mouse cursor is movable
02:22blackerking: the screen is black
02:22blackerking: and stays black ;)
02:22pq: blackerking, oh, that's completely different to "does not start"
02:23pq: sounds like the system is alive, just the display server is failing, which is good news - you can likely log in via network, and inspect the system, e.g. run 'dmesg'
02:23pq: assuming you run a ssh server
02:24blackerking: I can install packages via safestart
02:24pq: but it also means that if you can do a controlled shutdown with a power button (not just a blunt power-off), the logs should survive and you can dig them up on the next boot
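As a minimal sketch of "digging up the logs on the next boot": on a systemd machine the kernel messages of the previous boot can be read back with journalctl, assuming the journal is stored persistently (that depends on the distribution's configuration and is not stated in the chat). The helper names below are hypothetical; only the journalctl options `--dmesg`, `--boot` and `--no-pager` are real.

```python
import subprocess


def previous_boot_kernel_log_cmd(boots_back=1):
    """Build the journalctl command line for kernel messages of an earlier boot.

    boots_back=1 selects the boot before the current one (--boot=-1).
    """
    if boots_back < 1:
        raise ValueError("boots_back must be at least 1")
    return ["journalctl", "--no-pager", "--dmesg", "--boot=-%d" % boots_back]


def read_previous_boot_kernel_log(boots_back=1):
    """Run journalctl and return the saved kernel log as text.

    Assumption: journald keeps logs across reboots; otherwise journalctl
    exits with an error and check=True raises CalledProcessError.
    """
    cmd = previous_boot_kernel_log_cmd(boots_back)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

The output of read_previous_boot_kernel_log() is the text one would attach, uncompressed, to a bug report.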
02:26blackerking: I have to go, but I'd really like to help. If you send me a brief mail on what to do, I'll respond asap
02:27blackerking: got to go cya.
03:35karolherbst: hi, anybody got time to investigate my subroutine memory issue apitrace?
05:43tambre: Would anyone happen to know what texMask2D and texMaskCentroid do?
06:40karolherbst: ohh, the nvidia-settings application is open source?
06:43karolherbst: but I doubt it's any help?
06:46RSpliet: it's been once or twice
06:46RSpliet: once added some code to add mmiotrace markers before and after reclock
06:46karolherbst: I want to figure out if it is possible to change the lanes on kepler+ cards at all through the blob
06:47karolherbst: yeah, was thinking about doing the same with lanes
06:47karolherbst: but is there anybody with kepler who can change it?
06:47RSpliet: not sure if that helps at this point; we know where PPCI is, and it's not touched all that much :-P
06:48karolherbst: nah, on kepler we can do a lot now
06:48karolherbst: changing link speed is no problem
06:48karolherbst: and I don't see why changing lanes should be a big one
06:49karolherbst: It would just be nice to add pcie lane change on tesla+ and also on kepler+, instead of just saying it works on tesla
06:50karolherbst: imirkin and I found some regs which look to us pretty much like the regs for reading out the link width, but we never managed to actually change it on kepler, so I think there is some magic missing
07:11RSpliet: did you manage to change it on tesla?
08:05gnurou: imirkin: huehner: re Maxwell FW: actually it is also in my plans to push the GM206 and other firmwares
08:05gnurou: it is just a process that involves many people who must give their sign-off
08:06gnurou: also the FW themselves are useless without the proper loading code, which is quite complex... I'm working on it too
08:06gnurou: so the process is engaged, it just takes time (and to make things worse I am on a vacation next week)
08:07xexaxo: gnurou: I wouldn't say that being on holiday makes it "worse".
08:08xexaxo: a bit slower perhaps, but we all need some break once in a while :)
08:09gnurou: xexaxo: you're right. I am just frustrated that this is not done already and would like it to happen ASAP
08:09gnurou: but sadly this is also not the only thing I have to take care of :/
08:09gnurou: juggling between many tasks is the best way to make sure none gets completed promptly
08:13imirkin: gnurou: any thoughts on getting nvidia to sign nouveau's firmware?
08:13imirkin: RSpliet: changing lane widths works fine on tesla/fermi (at least according to lspci)
08:14imirkin: but kepler changed that stuff around
08:14RSpliet: imirkin: I'd love to see how, maybe I can have a go at it this weekend
08:14huehner: gnurou: thanks for the info, any guess on when it might get finished: 1, 3, 6 months?
08:15imirkin: RSpliet: i pushed a doc to rnndb... it's pretty simple, just a 2-stage write
08:15imirkin: RSpliet: https://github.com/envytools/envytools/commit/ed0db6dc9198500a35953ed41cf10d3e8d3e104a
08:15gnurou: imirkin: that's a whole different level... signing is here for security reasons, a code audit would be required, responsibilities might get engaged, and Nouveau's workflow will become dependent on NVIDIA's promptness to respond to their sign requests (not to mention testing will be made impossible). That's my understanding of the situation, not an official statement though
08:15imirkin: RSpliet: write width to 0x88140, and then write the same value with the 1 bit set, and presto! link width has changed.
08:17imirkin: gnurou: yeah.... but then it's open firmware which matters to some people (less so myself, tbh). our firmware changes *very* rarely
08:17imirkin: RSpliet: keep the other bits that you see intact... no idea what they do, but i'm sure zero'ing them out at random is less-than-ideal
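The two-stage write described above (write the width, then write the same value again with the low bit set, leaving all other bits as found) can be sketched as a read-modify-write. This is only an illustration under stated assumptions: the register offset 0x88140 is the one quoted in the chat, "the 1 bit" is read here as bit 0, and the position of the width field (width_mask) is a hypothetical parameter; the real layout is in the rnndb commit linked above. The read32/write32 callables are placeholders for whatever MMIO access is available.

```python
LINK_WIDTH_REG = 0x88140  # register offset quoted in the chat
COMMIT_BIT = 0x1          # assumption: "the 1 bit" means bit 0


def set_link_width(read32, write32, width_bits, width_mask):
    """Two-stage link width change, preserving all unrelated bits.

    width_bits: new width value, already shifted into the field.
    width_mask: mask of the width field (hypothetical; see rnndb for the
                real layout).
    Returns the value written in the first stage.
    """
    if width_bits & ~width_mask:
        raise ValueError("width_bits does not fit inside width_mask")
    if width_mask & COMMIT_BIT:
        raise ValueError("width_mask must not overlap the commit bit")

    current = read32(LINK_WIDTH_REG) & 0xFFFFFFFF
    # Stage 1: replace only the width field. Clearing the commit bit here
    # is an assumption; every other bit is kept as it was read.
    staged = (current & ~width_mask & ~COMMIT_BIT & 0xFFFFFFFF) | width_bits
    write32(LINK_WIDTH_REG, staged)
    # Stage 2: same value again, now with the commit bit set.
    write32(LINK_WIDTH_REG, staged | COMMIT_BIT)
    return staged
```

Passing the accessors in as functions keeps the sketch testable against a fake register file before it is pointed at real hardware.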
08:18gnurou: huehner: I don't want to give an ETA that will backfire on me :) We have working loading code for Tegra, but it is ugly. We need to make it upstreamable and adapt it to work with dGPU as well. FW files themselves should be releasable in a reasonable time I think. By reasonable I mean hopefully not longer than a quarter
08:20gnurou: imirkin: I understand the issue, indeed. I suspect the biggest obstacles might be legal. This needs to be discussed with someone who actually knows (not me) :P
08:21huehner: gnurou: clear and thanks for pushing it from your side
08:21imirkin: gnurou: yeah, i figured it was unlikely to be you with the approval powers for something like that
08:22imirkin: gnurou: doubt there are any legal implications, but IANAL. just an organizational question, i.e. do you want to make it possible to un-tivo the hardware
08:23gnurou: imirkin: yep. sadly I cannot provide an answer to that question
08:23imirkin: right :)
08:24gnurou: imirkin: we should have an in-depth discussion between NVIDIA and Nouveau devs to discuss such issues
08:24gnurou: a regular event maybe
08:26gnurou: all right, need to sign off - plane taking off in a few hours
08:26imirkin: have a safe flight :)
08:27gnurou: will try to clean the secure FW code a bit on the way
11:33Wolf480pl: it worked
11:33Wolf480pl: ooops, sorry, wrong channel
11:38Wolf480pl: Anyway, regarding HUB_INIT timeouts on my GK104 - sometimes I'm getting a different error message, which leads to the same kind of freeze. It's "PGRAPH][...] wait for idle timeout" then "grctx template channel unload timeout" and then "failed to construct context"
11:38Wolf480pl: Wanna see full dmesg?
11:40Wolf480pl: I also did mmio traces of all 3 cases: nouveau loading successfully, nouveau failing with HUB_INIT timeout, and nouveau failing with grctx timeout. Would they be useful for you? If so, what do I do with them?
11:48RSpliet: Wolf480pl: that can be useful if it's a 4.1 kernel
11:49Wolf480pl: yeah, it's kernel 4.1.2
11:51Wolf480pl: so, I should compress the trace, dmesg and lspci outputs together?
11:53Wolf480pl: everything in one archive or each try in a separate archive?
11:54RSpliet: rather please only compress the trace if you're going to do a bugreport on bugzilla
11:54RSpliet: it's more convenient to take a quick peek in the dmesg if it's uncompressed
11:54imirkin_: only compress traces, if it's dmesg or xorg log or something text and relatively small (< 500KB) just include it plain
11:56Wolf480pl: should I include all traces I have, or is one for each result (success/error message) enough?
12:05Wolf480pl: Oh, I forgot to mention that I've done all these traces while using nouveau.ko built from the gk106-hack branch. Should I redo them with the in-tree nouveau module?
12:14rake: is the progress for the 900+ gtx cards still stalled by nvidias bs blobs?
12:15imirkin_: certainly not sped up by it :)
12:19Wolf480pl: RSpliet, are traces from nouveau.ko from the gk106-hack branch ok, or should I redo them with mainline nouveau.ko?
12:21rake: sorry if this is a repeat, is nouveau stalled for the gtx 900+ cards because of nvidias blob bs?
12:21rake: nephew shut off my pc
12:24RSpliet: Wolf480pl: which tree is that?
12:24RSpliet: this thing: http://cgit.freedesktop.org/~darktama/nouveau/log/?h=hack-gk106m ?
12:25RSpliet: hmm... that's relatively up to date; should suffice
12:44karolherbst: imirkin_: do you have time to check out my metro traces? Or is this more like something airlied should look at?
12:44imirkin_: karolherbst: what makes you say it has *anything* to do with subroutines?
12:51karolherbst: mhhh, I never saw this happening before?
12:51karolherbst: it's a pretty wild guess I know, but everything else seems fine
12:52karolherbst: maybe it's simply that the game does something which is strange and only accidentally handled badly in nouveau
12:54karolherbst: but it would be helpful if anybody else could reproduce this issue
12:56karolherbst: imirkin_: one question: at which point will the memory be freed within nouveau?
12:57karolherbst: maybe I find something in the trace
12:57karolherbst: there is one frame with around 300.000 calls
13:00imirkin_: karolherbst: erm... what memory?
13:00imirkin_: it's freed when possible :)
13:03karolherbst: okay nice, its not within the 300k call frame
13:04karolherbst: only three frames left which may cause this, and they are pretty small (<1500 calls)
13:04imirkin_: karolherbst: btw, if you switch back to the blob, i'd love to get a mmt trace of tests/spec/arb_tessellation_shader/execution/vs-tes-tessinner-tessouter-inputs.shader_test
13:05imirkin_: you can do a lot of damage in 1500 calls ;)
13:05karolherbst: its the frame with 218 calls
13:06karolherbst: 1.1e+03MB size
13:06imirkin_: does it upload a 1GB buffer?
13:06imirkin_: (or texture or whatever)
13:06karolherbst: it may be
13:06imirkin_: we might not handle that ... optimally
13:06imirkin_: unless you consider crashing to be optimal :)
13:07karolherbst: memcpy(ptr, data, 9.8125kb, 10048)
13:07karolherbst: but this isn't that bad
13:07karolherbst: imirkin_: which gl calls?
13:07imirkin_: oh, i just mean like glBufferData with 1GB might make nouveau unhappy
13:07karolherbst: the frame is like half full of glBufferData
13:08karolherbst: no, only at the beginning
13:08karolherbst: there are like 10 calls with each 90kb data
13:08karolherbst: + another 10 with 20MB
13:09RSpliet: I hope those aren't shaders...
13:09imirkin_: RSpliet: that's glShaderSource :p
13:09karolherbst: its somewhere in these calls
13:10karolherbst: got a big mem spike after rendering after the bufferdata stuff is over
13:18karolherbst: I am 100% sure that its inside this glBindBuffer, glBufferData stuff
13:19karolherbst: there are no other calls in between
13:20tobijk: yeah well, bind a 1GB buffer ;-)
13:20karolherbst: okay, now qapitrace has a problem interpreting the data of one such buffer call
13:21karolherbst: tobijk: I aborted the application
13:22karolherbst: at one run I had my entire physical memory full (16GB) + 16GB zram swap
13:22karolherbst: and it was too much for the kernel
13:23karolherbst: glBufferData(GL_ARRAY_BUFFER, 91980800, [data, size=89825], GL_DYNAMIC_DRAW)
13:23karolherbst: this is the one call
13:23imirkin_: what's that second arg?
13:23imirkin_: that better not be an offset
13:24imirkin_: oh my
13:24karolherbst: before is a glBindBuffer(GL_ARRAY_BUFFER, 18) call
13:24tobijk: is it inside the kernel. or is it still userspace usage? *fears a kernel bug*
13:24karolherbst: userspace reports 10% mem usage
13:24imirkin_: Specifies the size in bytes of the buffer object's new data store
13:24imirkin_: er wait, that's only like 100MB
13:24imirkin_: i thought it was 1G
13:24karolherbst: but like 10 calls of that
13:24karolherbst: but wait, there is more :D
13:25karolherbst: after this, there is a pair of glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 19) and glBufferData(GL_ELEMENT_ARRAY_BUFFER, 21390336, [data, size=20889], GL_DYNAMIC_DRAW)
13:25imirkin_: is that size in kb?
13:25karolherbst: and these pairs repeat like 10 times
13:25karolherbst: its what qapitrace is telling me
13:26karolherbst: the data size is in kb
13:26imirkin_: yeah, looks like it is
13:26imirkin_: so it's basically uploading a ton of data
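The numbers quoted above can be checked with a little arithmetic: the data sizes shown by qapitrace, read as KiB, match the byte sizes passed to glBufferData exactly, and ten pairs of such calls add up to roughly the 1.1e+03MB reported for the frame. A minimal sketch, where the pair count of 10 is the approximate "like 10" from the chat and the constant and function names are made up for illustration:

```python
KIB = 1024
MIB = 1024 * 1024

# Values copied from the two glBufferData calls quoted in the chat.
ARRAY_BUFFER_BYTES = 91980800    # GL_ARRAY_BUFFER size argument
ARRAY_DATA_KIB = 89825           # "[data, size=89825]" as shown by qapitrace
ELEMENT_BUFFER_BYTES = 21390336  # GL_ELEMENT_ARRAY_BUFFER size argument
ELEMENT_DATA_KIB = 20889         # "[data, size=20889]" as shown by qapitrace

PAIRS_PER_FRAME = 10             # assumption: "like 10" taken as exactly 10


def frame_upload_bytes(pairs=PAIRS_PER_FRAME):
    """Total bytes handed to glBufferData for one frame with `pairs` pairs."""
    return pairs * (ARRAY_BUFFER_BYTES + ELEMENT_BUFFER_BYTES)


# The displayed data sizes are in KiB: both products match the byte sizes.
assert ARRAY_DATA_KIB * KIB == ARRAY_BUFFER_BYTES
assert ELEMENT_DATA_KIB * KIB == ELEMENT_BUFFER_BYTES

# Ten pairs come to 1133711360 bytes, about 1081 MiB per frame.
print(round(frame_upload_bytes() / MIB, 1))
```

A frame with 20 pairs doubles that to about 2162 MiB of buffer data.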
13:27karolherbst: qapitrace just used around 12GB just to interpret this data
13:27karolherbst: one call
13:27karolherbst: entire disk cache wiped out
13:27imirkin_: so... all those things end up getting uploaded to vram
13:27RSpliet: is that a Java game? sounds like they run a DataFactory
13:27karolherbst: metro redux
13:27karolherbst: no such problem on intel or blob
13:28karolherbst: this is done for each frame
13:28karolherbst: the frame after it, does the same
13:29karolherbst: but this time with 20 pairs
13:29imirkin_: so all that junk gets written in with transfer_inline_write...
13:29karolherbst: last frame tried the same, but then I stopped the game
13:29imirkin_: which in turn just defaults to not doing anything too clever
13:30imirkin_: oh but..... ugh
13:30imirkin_: it creates a temp resource
13:30imirkin_: so it's 2x the copy
13:31imirkin_: i should resuscitate calim's patch to copy it in via fifo directly
13:31imirkin_: i wonder if that'd be better
13:31imirkin_: maybe not :)
13:31karolherbst: I can try it out
13:31imirkin_: step 1: find patch :)
13:32karolherbst: should be easy enough
13:32imirkin_: oh boo, it only handled constbufs
13:32karolherbst: qapitrace 30% memory usage
13:32karolherbst: that's bad
13:32imirkin_: does it die if you use glretrace?
13:33karolherbst: if you mean whether it may die because of oom, yes; if you mean whether it does with the trace I currently have, no
13:33karolherbst: but it has the same memory spike
13:34karolherbst: I guess if I manage to create a trace while having no memory to capture it, sure I think it would
13:47karolherbst: but what does glBufferData exactly do? just creating some buffer somewhere inside system memory?
13:47imirkin_: creates a buffer on the gpu
13:47imirkin_: and uploads some data to it
13:47imirkin_: however it'll first copy to system memory
13:47imirkin_: using the cpu
13:48imirkin_: and then will use the GPU to copy from that system memory to gpu vram
13:48karolherbst: yeah okay, which may be also not that bad actually
13:48karolherbst: could the system memory be freed after it?
13:48imirkin_: yeah, it should be!
13:49karolherbst: so is the bug rather that it isn't freed for whatever reason?
13:50imirkin_: i think the bug is that it's freed too early
13:50karolherbst: too early?
13:50karolherbst: I would have guessed too late, but okay
13:52imirkin_: those allocs aren't from the slab
13:56karolherbst: okay, have to go now anyway
13:56karolherbst: cu then
17:18imirkin: skeggsb: if a bo is deleted (i.e. nouveau_bo_del) while still being actively used by unfinished commands, that's fine right?
20:43Mittttens: is there anyone developing reclock for maxwell?
22:16imirkin: Mittttens: nope