03:59YannK: Hello :)
03:59YannK: I use the nouveau driver for my brand new install of Debian Sid/LXDE with a GeForce G 105M on a laptop. Thanks a lot for all the work done to have this free driver
03:59YannK: I just noticed that the 3D acceleration is slow because of Power Management, which is still indicated as "Mostly". As I see a drop from around 30 FPS to 2.5 FPS in a 3D environment I am familiar with, it is not very easy for me to work correctly (but as it is a free project, I have no stress about that :) ).
03:59YannK: I wanted to know if there is a way to help finish it? Knowing that I am not a developer but just an enthusiast (a French-speaking one).
04:04RSpliet: YannK: mind posting a copy of your dmesg to a paste website of choice, and sharing the URL here please?
04:08YannK: Answered in PM :)
04:08RSpliet: yes... well please keep the conversation in #nouveau, so others can chip in if need be
04:08RSpliet: other than that
04:08RSpliet: I feared it would be a DDR2 card, there's no support for changing the DDR2 clocks in place
04:09RSpliet: I currently don't have a lot of time to look into that, but you can always help by sending an e-mail to email@example.com containing
04:09RSpliet: 1) your vbios
04:10YannK: (would mean even with PM finished, it will not improve in speed ?)
04:10RSpliet: 2) the contents of the 101000 register (obtained with envytools' "nvapeek" tool)
04:10RSpliet: 3) an MMIOTrace of the official NVIDIA driver going through all performance levels
04:11RSpliet: (1 and 3 being the most important one, where the latter is unfortunately not the easiest thing to obtain as it requires you to install the official driver and a kernel that supports mmiotrace)
04:11RSpliet: no, I expect performance to improve quite a bit if PM is finished
04:12YannK: ok, I must look for a way to do all that, as I don't have an idea of what you're talking about. I send you the mail asap, thanks a lot for helping :)
04:12RSpliet: but... well, 2.5FPS -> 30FPS is a *very* long stretch...
04:13RSpliet: here's some info on mmiotrace: http://nouveau.freedesktop.org/wiki/MmioTrace/
04:14RSpliet: instead of running 3d for a while, just start the nvidia-settings tool, open the sub-menu labelled something like "performance", and wait for it to reach the lowest clock
04:34YannK: RSpliet: what is the vbios?
04:34YannK: And for 2), do you want it with nouveau driver or Nvidia one ?
04:40pmoreau: YannK: For 2), it doesn't matter
04:40YannK: ok, thx
04:41linkmauve1: Well, time to go download at work, on a real connection.
04:42pmoreau: linkmauve1: :p
04:44pmoreau: YannK: As for the VBIOS, it is a collection of tables containing clock speeds and various information about that particular card, and scripts that are then run by Nouveau to initialise the card
04:44linkmauve1: My home connection lags so much I can’t even make sure I’m on the right tab. :x
04:47YannK: pmoreau: where can I find those tables? I don't have any idea of the commands to run to get them
04:58pmoreau: YannK: You don't need to find the tables, I was simply explaining what one can find in a VBIOS.
04:58pmoreau: To retrieve it, you can use `nvagetbios > vbios.rom` as root, from envytools
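The two envytools commands mentioned in this conversation can be combined into one quick session; a sketch, assuming envytools is built and in the PATH, and the commands are run as root on the machine with the card:

```shell
# Dump the VBIOS (item 1 of RSpliet's list):
nvagetbios > vbios.rom

# Read the 0x101000 register (item 2), using envytools' nvapeek:
nvapeek 101000
```

The register value and vbios.rom are what would then go into the mail to the developers.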
05:01YannK: ok, thx pmoreau
06:18dRaiser: Hello. I hope this is right place for this question. I'm trying to improve performance for 9800 GTX+ card (NV92 from NV50 family) and can't get it done. I've set nouveau.pstate=1 param but don't see performance levels in dmesg. I can only see there currently set clocks which are very low: [ 12.034016] nouveau [ CLK][0000:01:00.0] --: core 399 MHz shader 810 MHz memory 399 MHz Could you help me set this? Thanks!
06:31pmoreau: dRaiser: pstate=1 unlocks a file which can be used to change the current perflvl, but dmesg does display by default all the different perflvls found
06:32pmoreau: And for G92, you will need kernel 4.4 to reclock your card IIRC
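The pstate interface pmoreau refers to is used roughly as follows; a sketch, assuming a 4.x-era kernel where nouveau exposes pstate in debugfs (on older kernels it was a sysfs file unlocked by nouveau.pstate=1), run as root:

```shell
# List the available perflvls and the current clocks:
cat /sys/kernel/debug/dri/0/pstate

# Switch to perflvl 0f (only works once reclocking is
# actually supported for the chipset):
echo 0f > /sys/kernel/debug/dri/0/pstate
```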
06:33dRaiser: that's interesting. so without 4.4 kernel it can't find different perflvls, right?
06:34pmoreau: No, without kernel 4.4 you can't change between them, but you can still look at them :-)
06:35pmoreau: But I guess you want to do a bit more than just looking at perflvls and their respective clocks, right?
06:35dRaiser: Yeah. Just wanted to confirm that these two CLK entries are the only possible perflvls for this GPU.
06:35dRaiser: Thanks. I'll try with kernel 4.4
06:36pmoreau: (which isn't released yet, we're only at rc3 IIRC)
06:37dRaiser: Well, that's not stopping me, there is git and Arch AUR packages. ;)
06:37pmoreau: Oh! … it looks like only G94-G96 got reclocking added
06:38pmoreau: RSpliet: Didn't you add support for G92 as well? I thought only NV50 was left out for Tesla?
06:51dRaiser: pmoreau I may be wrong, but with a quick look at the commit history I can see a commit named "rename g92 class to g94". http://cgit.freedesktop.org/nouveau/linux-2.6/log/?h=linux-4.4&qt=grep&q=g92
06:55pmoreau: dRaiser: That's only for the GPIO.
06:55pmoreau: Only >=G94…
06:56pmoreau: You could try to change that value to 0x92 and test whether reclocking works or not
06:57pmoreau: dRaiser: Could you paste your dmesg somewhere please?
06:58dRaiser: And that's some idea worth checking. Thanks again, I'll test it later.
06:59pmoreau: 9800 GTX+… I think that's the G92 I have, but it only has one perflvl. However the card boots to lower clocks, so you would still get a performance improvement.
07:00pmoreau: > BootPerf: Core=0.6, Shader=0.4, Memory=0.3 (compared to perflvl 1)
07:00dRaiser: here you go: http://pastebin.com/gyjRcqbv . grepped by nouveau, let me know if you need full
07:00pmoreau: This is what I saw on my G92
07:01dRaiser: I see 2 relevant "CLK" lines on mine
07:01pmoreau: Yeah, same for you
07:01pmoreau: line beginning with 0f: value for perflvl 0f
07:01pmoreau: line beginning with --: current perflvl / clocks
07:01dRaiser: yes, and the first is much better than the second, which is (I assume) enabled
07:02dRaiser: so it would make much improvement
07:02pmoreau: the second one is not a perflvl, they are the clocks to which the card boots: you only have one perflvl, 0f
07:03pmoreau: But you would still get quite some improvement
07:47karolherbst: only one perf level? :/
07:50urjaman: looked at my NV96? i think...
07:50urjaman: one perf level and a boot level that's currently afaik being used (if i read you correctly)
07:51urjaman: [ 16.437194] nouveau [ CLK][0000:02:00.0] 0f: core 550 MHz shader 1400 MHz memory 400 MHz
07:51urjaman: [ 16.437237] nouveau [ CLK][0000:02:00.0] --: core 400 MHz shader 800 MHz memory 504 MHz
07:51urjaman: but that's funny, 504 MHz memory?
07:54imirkin: karolherbst: yeah that's common for desktop Teslas
07:55imirkin: urjaman: yeah, iirc that's what my G96 looked like too
07:55imirkin: i.e. booted to a higher memory clock than described in the perf level
07:57urjaman: yeah i have to say i didnt even notice the performance difference
07:57urjaman: thought you had the clocking already done :P (in the 4.2.5 arch im running)
08:06imirkin: dRaiser: you could try enabling reclocking on your gpu based on pmoreau's pointer, but do note that all that stuff is still pretty half-baked... might work, might not
08:09dRaiser: imirkin: yes, I know. trying won't hurt though. BTW, wasn't it working before with nouveau.perflvl_wr=7777 param? I haven't used it myself, just wonder if it was working with this GPU in earlier versions.
08:09imirkin: that only worked up to kernel 3.12
08:09imirkin: you can feel free to try it out
08:09imirkin: it all got ripped out in 3.13 though
08:10dRaiser: Right, but I need newer kernel, so let's try new method.
08:12imirkin: note that you'll still need kernel 4.4-rc though
08:12dRaiser: yeah, know that
08:14karolherbst: imirkin: do you remember the guy where his gpu was running hotter with nouveau than compared to the blob?
08:21imirkin: karolherbst: not offhand...
08:21karolherbst: well Tom^ had the same issue anyway, hey Tom^ I need you :p
08:22karolherbst: I finish my kepler reclocking stuff today and will send a patch series today, I think it is safe enough for the next release and maybe we figure out boosting later, but before we do that, it should be good enough for now
08:42karolherbst: sent :)
09:26RSpliet: pmoreau: possibly, didn't test it on a G92 hence I didn't add it
09:26RSpliet: also: there's still G82-G88 that is unsupported, plus all the DDR2 cards
09:26imirkin: G84-G86 actually ;)
09:27pmoreau: RSpliet just created a new chipset in his spare time and named it G88. :-)
09:27RSpliet: yes, looking at it G88 does look a bit odd
09:28pmoreau: Oh right, DDR2 cards…
09:29imirkin: RSpliet: G92 was a pretty popular card though... it's what most of the beefier 8- and 9- series worked out as...
09:29imirkin: i.e. worthwhile spending a bit of time on it
09:29RSpliet: if someone sends me one :-D
09:30pmoreau: I could do that I guess
09:30RSpliet: pmoreau: oh, you have one? how well does it change clock(s)?
09:31RSpliet: we worked out your G96, surely we can fix your G92 as well :-P
09:31RSpliet: (also: should I return you your G94 on FOSDEM?)
09:31pmoreau: RSpliet: I have the same card as dRaiser, so only one perflvl but it doesn't boot to that one.
09:32pmoreau: :-D Well, I do use my G96 as it's in my laptop, but the G92 (and a bunch of other cards) are just sitting idle on a shelf.
09:32RSpliet: oh, your G94 I mean
09:33dRaiser: pmoreau, RSpliet: I'm just compiling kernel 4.4 with the patch changed to enable G92 support, I'll let you know when it finishes
09:33RSpliet: the one on my bookshelf :-P
09:33RSpliet: dRaiser: thanks... fingers crossed
09:34pmoreau: Once I get an accommodation, I'll set up a reator and have it run regression tests on most of my cards, but until then, I don't need it.
09:34pmoreau: And I doubt compute support will be different on the G94 than on the G96.
10:41dRaiser: pmoreau, RSpliet: So it won't be that easy. I enabled perflevel for NV92, but it went crazy: purple moving lines and fan on full speed. Made a photo: https://goo.gl/photos/6eQocozbbKGAfw249
10:46imirkin_: not too surprising tbh
10:46imirkin_: i think g94 is where it started using 100da0... g92 still used 10053c or whatever
10:49dRaiser: I'll be happy to test more when it's possible; can I leave my email somewhere, add myself to a mailing list, or do I just have to check the commit history?
10:50imirkin_: dRaiser: well, each gpu is different... if you can make an mmiotrace of the blob starting on your gpu, that should be sufficient
10:50imirkin_: dRaiser: see https://wiki.ubuntu.com/X/MMIOTracing for a guide
10:51imirkin_: basically you just have to start X with mmiotrace running, that should be enough
10:51imirkin_: (since you only have 1 perf level)
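The mmiotrace procedure imirkin_ outlines can be sketched as follows, based on the linked guide; it assumes a kernel built with CONFIG_MMIOTRACE and the proprietary driver installed but not yet loaded:

```shell
# Run as root, before the nvidia module is loaded.
mount -t debugfs debugfs /sys/kernel/debug 2>/dev/null
echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb   # enlarge the trace buffer
echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/trace_pipe > mmio.dump &

# Now load the blob and start X with default settings,
# let it settle, then stop the trace:
modprobe nvidia
# ... start X, wait ...
echo nop > /sys/kernel/debug/tracing/current_tracer
```

mmio.dump (compressed) plus the vbios is what then gets mailed in.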
10:52dRaiser: Using blob I can manually set any clock speed I want. So should I boot with blob, set performance level on nvidia-xconfig, close?
10:53Yoshimo: picture is cool. Especially with the blue light in the background; not usable for much else though
10:53imirkin_: dRaiser: just starting it up with default settings should be sufficient
10:53dRaiser: imirkin alright. I'll do it when I have some time to fiddle with switching drivers.
10:53imirkin_: dRaiser: oh, and also include your vbios (/sys/kernel/debug/dri/0/vbios.rom when nouveau is loaded)
10:54imirkin_: dRaiser: and mail the whole thing to firstname.lastname@example.org
10:54dRaiser: imirkin thanks for tips.
10:55dRaiser: Yoshimo thx ;)
12:04Guest9011: Hi. I am trying to make a laptop work with a 3D controller: NVIDIA Corporation GM108M [GeForce 930M] (rev a2). I installed linux 4.4rc3 and I get
12:05Guest9011: nouveau 0000:09:00.0: enabling device (0106 -> 0107)
12:05Guest9011: [ 6.879951] nouveau 0000:09:00.0: unknown chipset (118060a2)
12:05Guest9011: [ 6.879998] nouveau: probe of 0000:09:00.0 failed with error -12
12:05imirkin_: Guest9011: https://bugs.freedesktop.org/show_bug.cgi?id=89558
12:05imirkin_: Guest9011: although fyi, you don't need that nvidia chip for anything...
12:06imirkin_: you're better off with the intel gpu
12:06Guest9011: umm interesting
12:06karolherbst: even with the blob
12:06karolherbst: Guest9011: which intel hd do you have?
12:06imirkin_: karolherbst: probably... blob just causes pain and suffering
12:07karolherbst: yeah, use intel
12:07karolherbst: for everything
12:07Guest9011: thank you so much
12:07imirkin_: Guest9011: you do want to power it off though so that it doesn't suck power
12:07imirkin_: Guest9011: you can do that with bbswitch... or if you convince nouveau to load, it should auto-power-it-down
12:08karolherbst: Guest9011: bbswitch options load_state=0
12:08karolherbst: it was this way: options bbswitch load_state=0
12:08karolherbst: I think bbswitch by default doesn't do much without bumblebeed
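karolherbst's option belongs in a modprobe configuration file; a sketch (the filename is an assumption, anything under /etc/modprobe.d/ works):

```
# /etc/modprobe.d/bbswitch.conf (hypothetical filename)
# load_state=0  : power the discrete GPU off as soon as bbswitch loads
# unload_state=1: power it back on when the module is unloaded
options bbswitch load_state=0 unload_state=1
```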
12:08Guest9011: ok thank you so much
12:09karolherbst: it may make sense to use the 930M with bumblebee though
12:09karolherbst: but don't expect more than 30% improvements
12:09karolherbst: it may remove load from your intel card though
12:09karolherbst: so your desktop stays smooth and lagfree
12:09karolherbst: just don't expect any performance gain
12:10Guest9011: ok, no problem
12:10karolherbst: it's just that with nouveau there is no gain currently
12:10Guest9011: i have a decent desktop and games performance with intel
12:10karolherbst: ahh then it doesn't matter
12:11karolherbst: yeah, just use bbswitch to turn the card off or something
12:11Guest9011: just wondering if the nvidia could do a better job
12:11Guest9011: ok, thank you
12:11imirkin_: well, in the short-term using blob would get you GL 4.5... while GL 4.x is still some months off for intel
12:11imirkin_: you wouldn't get GL 4.x with nouveau either though
12:12karolherbst: yeah, any meaningful use case would be with bumblebee
12:12Guest9011: yes, i have opengl 3.3 with intel
12:12imirkin_: [coz i'm lazy and nobody has maxwell's]
12:12karolherbst: either for gl 4.5 or to keep your intel loadfree
12:12imirkin_: [including me]
12:12karolherbst: imirkin_: but this point about keeping the intel card load-free is very important though
12:13imirkin_: i don't think so
12:13karolherbst: some games just put my intel card under heavy load and the desktop is pretty unusable
12:13imirkin_: but what do i know... not like i've benchmarked anything
12:13karolherbst: I sometimes start games under intel where you really want the nvidia card :D
12:13karolherbst: and desktop drops to around 5 fps
12:15karolherbst: I was under the impression that reclocking worked for the maxwell card mupuf has :/
12:15karolherbst: I will check that
12:15imirkin_: maybe core clock
12:16karolherbst: I know that I did some voltage testing on that card
12:16karolherbst: so yeah, maybe
12:16joi: warsow 2.0 (released yesterday) crashes with nouveau: http://paste.ubuntu.com/13605146/
12:16imirkin_: joi: i threads
12:17imirkin_: joi: also what's work->func?
12:18imirkin_: i did see something in their release notes about using multiple threads
12:19imirkin_: joi: anwyays, thanks for the report
12:19imirkin_: joi: could you file a bug?
12:20imirkin_: there are too many problems, and i'm too tired of chasing them down
12:20imirkin_: hopefully someone will take a look at what's going on
13:05karolherbst: imirkin_: found another game with a massive perf boost through pcie: wasteland 2 (16 fps => 21 fps)
13:05imirkin_: joi: btw was that on start, or deeper into the game?
13:06imirkin_: karolherbst: get skeggsb to take your patches
13:06imirkin_: karolherbst: pester early and often
13:07karolherbst: imirkin_: he told me, that he wants to finish that for 4.5
13:07karolherbst: I am just waiting for his comments :D
13:07karolherbst: I already poked him like last week
13:07RSpliet: karolherbst: imirkin_ is right, and whatever you do, DON'T let him know we call it "pestering" behind his back!
13:07karolherbst: yeah because he won't see it in a comment below where he was mentioned :p
13:08imirkin_: anyways, i can't really bug him for anything since i told him i'd have the recovery stuff done over last weekend, and instead i watched tv
13:09karolherbst: just don't make promises :p
13:11imirkin_: and he owes me making piglit work in parallel
13:12imirkin_: so we're at an impasse
13:16joi: imirkin_: immediately on start
13:16imirkin_: i hate it when bugs are hard to trigger
13:18imirkin_: joi: fyi, repro'd
13:18imirkin_: except mine dies in FREE()
13:18joi: yeah, I had it once
13:18joi: try again
13:19joi: for me it was: *** Error in `./warsow.x86_64': double free or corruption (fasttop): 0x00007fa6e01708b0 ***
13:19imirkin_: let's hit this with a big hammer...
13:22karolherbst: imirkin_: in another scene in that game it is even worse: 9fps => 15 fps
13:29marcosps: imirkin: did you see my messages yesterday?
13:30imirkin_: marcosps: yep... no clue.
13:30imirkin_: marcosps: iirc i've seen stuff like that before... where the exa bits are all messed up for some reason
13:30marcosps: imirkin_: So, airlied said that the problem could be the Xorg crashing, and so it seems he was right: https://bugzilla.redhat.com/show_bug.cgi?id=1286874
13:33marcosps: imirkin_: the interesting fact is, only nouveau crashes my work. When using i965 it works nicely...
13:35imirkin_: marcosps: not really comparable... i bet nouveau would work fine if it were in i965's position
13:35imirkin_: marcosps: the issue is more around prime + dri2
13:36marcosps: imirkin_: so, there is something that I can do to make this work...? Because without it I don't know if I could test the issue about reupload shaders on error...
13:37imirkin_: you could try DRI3
13:37marcosps:is googling how to activate it
13:38imirkin_: Option "DRI" "3"
13:38imirkin_: you might also have to build your ddx with --enable-dri3
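imirkin_'s option goes in the Device section of xorg.conf; a minimal sketch (the Identifier is illustrative):

```
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "DRI" "3"
EndSection
```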
13:38marcosps: So, will I need to rebuild xf86-video-nouveau ?
13:39imirkin_: no, xf86-video-intel
13:39marcosps:is new to Xorg too...
13:39imirkin_: look for the dri3 section
13:40marcosps: ok, thanks....
13:43karolherbst: anybody here with a desktop nvidia gpu and either the talos principle/serious sam 3/wasteland 2?
13:43karolherbst: I would like to know if the pcie patches make a big difference there too
13:46marcosps: imirkin_: sorry but I lost our last conversation due to a Xorg crash here... can you copy what you said 5 minutes ago again?
13:47imirkin_: marcosps: see topic for logs
13:49marcosps: imirkin_: nice, thanks
13:56imirkin_: joi: well, it now hangs on a fence wait... insufficiently large hammer i guess
13:57imirkin_: got some PBENTRY errors which i guess means i need to get off my ass and do something about multi-threaded stuff
14:10hakzsam: imirkin, just tried warsow, I can't even start the game :)
14:12hakzsam: joi, is there any options to not start in fullscreen mode with warsow?
14:14imirkin_: hakzsam: yeah, well i have some local hacks
14:14imirkin_: hakzsam: http://hastebin.com/dijayuhuwa.pl
14:14imirkin_: but it won't work
14:14imirkin_: and will hang the channel
14:14imirkin_: so... be careful
14:17hakzsam: imirkin_, how can this hang the channel?
14:18imirkin_: my changes? they just let warsow get further
14:18imirkin_: but it's doing stuff multithreaded
14:18hakzsam: yeah, I know, but how your changes will hang the channel? you seem to correctly lock/unlock the mutex everywhere
14:24imirkin_: yeah ok, so it *definitely* does stuff from diff threads
14:24imirkin_: i've caught it red-handed
14:24imirkin_: so to speak
14:26imirkin_: my beautiful theory of "no one does this"... crumbling...
14:26RSpliet: imirkin_: does what?
14:27imirkin_: multi-threaded GL
14:27RSpliet: VirtualBox should've been quite a hint :-P
14:27imirkin_: i don't care about vbox
14:28RSpliet: because real men don't virtualise?
14:28imirkin_: because real men don't use closed software when there's perfectly good open software to be used
14:29imirkin_: alright universe, you win
14:29imirkin_: mutexes, here we come
14:30imirkin_: mutex here, mutex there, mutex everywhere
14:32RSpliet: can't we solve this decently with lock-free data structures?
14:33RSpliet: oh, and afaik VirtualBox is the closest you get to an open-source VM that runs cross-platform... at least the core is GPL
14:35RSpliet: I sympathise with your aversion to Oracle, but it proves to be the most useful tool on occasion :-)
14:35imirkin_: RSpliet: fundamentally not possible to do lock-free
14:35imirkin_: RSpliet: at least without making it one channel per context
14:36RSpliet: what datastructures are we trying to protect?
14:37imirkin_: the GPU's state
14:37imirkin_: you can have any amount of lockless on the cpu
14:37imirkin_: if you have 2 things that want to touch the gpu
14:37imirkin_: then they'll have to take their turn
14:38RSpliet: well, doing a lockless DLL proves to be tricky to say the least, impossible to the best of my knowledge :-) but it's not about concurrent access to pushbufs... hmm
14:39imirkin_: it doesn't matter what you do on the cpu
14:39imirkin_: if you have a single channel
14:39imirkin_: then you can only have one set of commands going in at a time
14:39imirkin_: and then you have to construct your second set of commands based on what the first set all modified
14:39imirkin_: unless you just always effectively re-init the gpu on every draw
14:40RSpliet: gheh, that sounds expensive
14:40imirkin_: yeah, questionable how expensive in practice
14:40imirkin_: i've never measured it
14:40RSpliet: but I take it that's because you have to keep the entire OpenGL state tracker happy...
14:40imirkin_: either way, the first order of business is just to forcefully single-thread everything
14:42RSpliet: we're probably talking about different things...
14:42imirkin_: yes we are
14:43imirkin_: you're talking about trying to be clever on the cpu without any regard for what one is actually trying to do
14:43RSpliet: no, that's not entirely true...
14:43imirkin_: fundamentally you can only feed one set of commands in at a time
14:46RSpliet: yes, and I take it you can only feed one set of commands in because OpenGL internally is a gigantinormous state machine. Even if you had the mechanisms to submit multiple commands, the application can never predict the right state before and after issuing a command
14:46imirkin_: because the GPU is a giant state machine
14:46imirkin_: it has a ton of state, and a few "go" commands
14:46RSpliet: the GPU closely models OpenGL iirc, so that was an assumption I made ;-)
14:47imirkin_: so when you emit "go", everything else has to be in the right state
14:47imirkin_: otherwise you won't get what you want
14:47RSpliet: good, I think we're now saying the same thing (only my wording is less comprehensible... I get that)
14:48imirkin_: but it has nothing to do with mesa
14:48imirkin_: or mesa state tracker
14:48imirkin_: or gallium helpers
14:48imirkin_: or any design decisions made inside mesa
14:49marcosps: imirkin_: So, after compiling xf86-video-intel, I just need to change Xorg conf and point to the new path of intel DDX, right?
14:49RSpliet: well, apart from the fact that Mesa isn't designed to recover state on every command just to support threading, which would be insane obviously :-D
14:50imirkin_: RSpliet: huh? that's not at all what's going on...
14:50imirkin_: RSpliet: imagine this scenario -- you have 2 threads that both want to draw()
14:50imirkin_: given that the gpu can only do one thing at a time...
14:50imirkin_: how you gonna do that lockless?
14:50RSpliet: oh I let go of the whole idea of lockless long time ago
14:51imirkin_: the doubly-linked list is the least of your concerns
14:51imirkin_: the operation is just fundamentally one where someone's gonna have to wait
14:51RSpliet: yes, I *got* that
14:51imirkin_: (you could obviously store the draw request into some other buffer, and then have another thread consume draw requests... but still serialized)
14:52imirkin_: (and such a design would cause major lifetime confusion)
14:53RSpliet: I was trying to make the exact point that this infrastructure of parallelism is infeasible *because* at construction time of your command buf you need to be able to predict the current state of the GPU, which you can't in a multithreaded env
14:55RSpliet: you can call it infeasible from the application's PoV or the GPU's PoV, both seem to be true with the OpenGL model
14:57RSpliet: (and probably infeasible from many other angles as well)
14:59imirkin_: well, you can always emit the full GPU state
14:59ravior: I'm sorry if this is the wrong channel for this, but is anyone having this problem with the latest kernel and nouveau driver? https://bugs.freedesktop.org/show_bug.cgi?id=71659
14:59imirkin_: but even then there is various subtlety
15:02RSpliet: imirkin_: yes, implementation wise that just sounds like a nightmare, a very expensive one