00:26 karolherbst: uhhh
00:27 karolherbst: with the shader cache: all texturegatheroffsets tests are crashing
00:27 karolherbst: also why does piglit use the shader cache...
00:28 karolherbst: dboyan_: look at the arb_gpu_shader5.texturegatheroffset tests
01:22 nyef: It occurs to me that I should be setting connector->stereo_allowed in nouveau_connector_create() based on the result of drm_conntype_from_dcb() instead of my current "is it a TMDS output?" logic in nouveau_connector_set_encoder().
01:23 nyef: Hrm. With the minor caveat that I have no idea if there are any pre-nv50 HDMI boards, or how to pick them off as unsupported if there are.
01:25 imirkin: there are, and they definitely don't support any 3D stuff
01:30 imirkin: actually i'm not even sure
01:31 koz_: How do I change the pstate in the newer kernels? Do I still use kernel params as before, or is there something else now?
01:34 imirkin: /sys/kernel/debug/dri/N/pstate
01:34 koz_: imirkin: Just cat the right value into it?
01:35 imirkin: usually one echoes the value in...
01:35 koz_: Sorry, I meant 'echo'.
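A minimal Python sketch of the same write, for anyone scripting it. The function name `set_pstate` and its `card` and `path` parameters are hypothetical; the only thing taken from the chat is the debugfs location /sys/kernel/debug/dri/N/pstate and that a value is echoed into it. It assumes the file accepts the pstate id as text (for example "0a"), and that the caller is root with debugfs mounted.

```python
def set_pstate(pstate, card=0, path=None):
    """Write a pstate id such as "0a" to the nouveau debugfs pstate file.

    Roughly what `echo 0a > /sys/kernel/debug/dri/0/pstate` does.
    `path` overrides the default location (hypothetical convenience).
    """
    if path is None:
        # N in the chat is the DRM card index; 0 is only a default guess.
        path = "/sys/kernel/debug/dri/%d/pstate" % card
    with open(path, "w") as f:
        f.write(pstate + "\n")
    return path
```

Calling `set_pstate("0a")` as an unprivileged user is expected to fail with a permission error, just as the plain echo would.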
01:35 imirkin: if you have a kepler or maxwell1, i highly recommend kernel 4.10
01:35 koz_: imirkin: It's on a Debian, so that might take a while.
01:35 koz_: But definitely will do ASAP.
01:35 imirkin: what gpu do you have?
01:36 imirkin: and what kernel
01:36 koz_: imirkin: Let me double-check both one sec.
01:36 imirkin: lspci -nn -d 10de::300
01:36 koz_: GK104, 4.9.0-2
01:37 imirkin: ah ok. well, with 4.10, the reclocking should actually work for highest pstate
01:37 koz_: (the former is from just lspci)
01:37 koz_: It *kinda* works for this one.
01:37 koz_: I get the memory clock up, but not the other one.
01:37 imirkin: with 4.9, you may be able to get to a middle pstate, depends
01:37 imirkin: yea
01:37 koz_: I'll try a lower state and see if it helps.
01:37 koz_: Also - how do I make it reclock on startup?
01:38 nyef: Hack the echo into rc.local ?
01:38 koz_: 0a is the best it can manage.
01:39 imirkin: boot with nouveau.config=NvClkMode=10 (in decimal, yes, very confusing)
01:46 koz_: imirkin: Thanks - I'll build a 4.10 and do that.
01:47 imirkin: with 4.10 you'll be able to do 0f as well
01:47 koz_: imirkin: That's my hope.
01:47 imirkin: so far that's worked for almost everyone
01:47 imirkin: there are also NvBoost settings
01:47 imirkin: which allow you to take advantage of the higher cstates
01:48 nyef: Okay, revised connector->stereo_allowed logic works for both DP and HDMI connectors. Perfect.
01:49 imirkin: well, the G92 came. will probably be experimenting with it another day though.
02:00 koz_: imirkin: How does one make use of NvBoost?
02:01 imirkin: nouveau.config=NvBoost=1 or 2
02:01 imirkin: i think 2 is for when you know what you're doing
02:01 imirkin: 1 is for when you don't :)
02:01 koz_: imirkin: So that means 1 for me.
02:07 nyef: Bleh. Just spotted a redundant bit set in an nvkm_mask() in two separate commits.
02:07 nyef: Not fixing.
02:07 nyef: At least, not fixing now.
06:21 imirkin: AndrewR: i think i understand what's going on with the pre-NV44 mpeg thing, but it'll require skeggsb's assistance to resolve
06:21 imirkin: AndrewR: and the G92 came, so i've plugged that in now
06:55 imirkin: gr. fail during pci unbind =/ https://hastebin.com/ihefesovah.go
06:57 koz_: If I wanna do two things in nouveau.config, do I comma-separate them or something?
06:57 koz_: Like 'nouveau.config=NvClkMode=15,NvBoost=1'?
07:00 imirkin: AndrewR: btw, observation: VP + PPP work on G92. just BSP is stuck. is that consistent with your findings? i.e. MPEG2 decoding works just fine on G92
07:00 imirkin: koz_: yes.
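A small Python sketch of how the option string confirmed above could be assembled. The helper name `nouveau_config` is hypothetical. The facts taken from the chat are that NvClkMode takes the pstate id in decimal (so pstate 0a becomes 10 and 0f becomes 15), that NvBoost takes 0, 1 or 2, and that multiple options are comma-separated.

```python
def nouveau_config(pstate_hex, boost=None):
    """Build a nouveau.config= kernel parameter (hypothetical helper).

    pstate_hex is the pstate id as reported in hex, e.g. "0f".
    NvClkMode wants the same id written in decimal.
    """
    options = ["NvClkMode=%d" % int(pstate_hex, 16)]
    if boost is not None:
        if boost not in (0, 1, 2):
            raise ValueError("NvBoost must be 0, 1 or 2")
        options.append("NvBoost=%d" % boost)
    # Multiple options are joined with commas.
    return "nouveau.config=" + ",".join(options)


print(nouveau_config("0a"))           # nouveau.config=NvClkMode=10
print(nouveau_config("0f", boost=1))  # nouveau.config=NvClkMode=15,NvBoost=1
```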
07:03 koz_: imirkin: Thanks!
07:04 koz_: I should update my gaming rig's kernel parameters to include NvBoost actually.
07:04 koz_: More power's good.
09:16 karolherbst: imirkin: 0 is for when you don't know what you're doing with NvBoost ;)
09:17 karolherbst: and 1 means "it should be fine"
09:19 karolherbst: basically 0 is the clock the GPU was sold with and 1 is the "boost" clock the GPU was sold with. And even on 2 you hardly ever get any problems, except when the fans aren't controlled or you get very close to the power budget
09:20 karolherbst: koz_: well you can do NvBoost=2, but you should keep an eye on the GPU temperature you can see for example in sensors
09:38 koz_: karolherbst: At boost freqs, this thing is still a bit short of the full clock it claims in pstate.
09:40 karolherbst: koz_: yeah, it's expected
09:40 karolherbst: koz_: you mostly only get the full clocks under a certain temperature
09:40 karolherbst: but currently we assume the GPU is at 95°C
09:40 koz_: Ah, OK.
09:40 koz_: So if I set NvBoost=2, would it hit the top end?
09:41 karolherbst: ohh you were talking about 1
09:41 karolherbst: yes, it will be short of the max clock
09:41 karolherbst: the above applies for boost 2
09:41 koz_: Ah, right.
09:41 karolherbst: nouveau needs to watch the temperature and adjust the max voltage and so also the clocks on the fly
09:41 koz_: I see, I see.
09:41 karolherbst: but it will be a difference of max ~50MHz
09:42 koz_: Doesn't seem worth it then.
09:42 karolherbst: well, I meant the difference between max and what you will get
09:42 koz_: Ah, OK.
09:42 koz_: Well, that's fine then.
09:42 karolherbst: on mine it is 797 on 1 and 862 on 2 :)
09:42 koz_: I'll try NvBoost=2 on my 680 and see how fast it'll go.
09:43 karolherbst: well, keep an eye on the GPU temperature
09:43 karolherbst: but as long as it stays below 90°C everything is fine
09:43 koz_: What's a good monitoring tool for that?
09:43 karolherbst: and it should hard shutdown itself if it's too hot anyway
09:43 karolherbst: koz_: no idea, there are a lot of sensors based panel things and so on
09:43 koz_: I'll look into it then. Thanks!
09:43 karolherbst: but a simple "sensors" in the console does it as well
09:44 karolherbst: on some cards boost 2 is unstable though for some reason (most likely because the clocks are too high and nouveau misses something)
09:44 koz_: I'll try it and see.
09:45 karolherbst: I found a card where a difference of 23MHz made the difference between super stable and super broken
09:45 koz_: Which card?
09:46 karolherbst: some mid range GPUs with insanely high clocks (above 1.1GHz)
09:46 koz_: Ah, OK.
09:47 koz_: My current card (no boost) is firing at ~1GHz.
09:48 karolherbst: is it an OC version of a 680?
09:48 koz_: 0f.
09:48 karolherbst: ohh wait, 680 are normally at 1GHz
09:48 karolherbst: though
09:48 koz_: Oh, never mind, not overclocked, no.
09:48 koz_: (I don't even know how to be honest)
09:48 koz_: I guess NvBoost might get it a bit more, right?
09:48 karolherbst: pimping the vbios (or pimping nouveau)
09:49 karolherbst: koz_: NvBoost does the same thing nvidia does
09:49 karolherbst: with NvBoost=2 I mean
09:49 karolherbst: except for monitoring temperature and power consumption
09:49 karolherbst: still on my todo list
09:49 koz_: karolherbst: As ever, we appreciate your amazing work.
10:46 AndrewR: imirkin, hi!
10:47 AndrewR: imirkin, thanks for looking into both bugs. Well, mpeg2 decoding for me was/is slow, but working...on g92.
10:57 karolherbst: yeah, I can reliable reproduce the MEMACK_TIMEOUT on parallel textureGather tests
10:57 karolherbst: *reliably
14:03 dboyan: imirkin, mupuf: I've shared my proposal on gsoc site. If you have comments, please let me know. I still have time to improve it next week.
15:39 dboyan: hmm, it seems hitman crashes at random places with glsl/tgsi cache on, but it also crashed once when I turned off the cache. I've seen some crashes of the same type, though.
15:39 dboyan: Strange...
15:47 imirkin: AndrewR: but both VP2-based and PMPEG-based decoding should work yeah?
15:48 dboyan: Anyone know how nouveau_transfer.map can become 0xffffffff? most crashes I've seen are caused by this
15:48 imirkin: dboyan: someone was just seeing this. was it you? i think karol...
15:49 dboyan: did karol also see this one?
15:49 dboyan: I only analyzed the crash today
15:53 imirkin: yeah, he saw that crash
15:53 imirkin: iirc we decided it was an out-of-bounds write overwriting the map
15:55 dboyan: imirkin: I also got invalid target (PIPE_BUFFER) in gf100_create_texture_view. I know karol was seeing this one
15:56 imirkin: erm, odd - you can create PIPE_BUFFER texture views
15:56 imirkin: was it a PIPE_BUFFER with tiling or something?
15:57 dboyan: no idea
15:58 dboyan: but PIPE_BUFFER target causes assertion failure in gf100_create_texture_view
15:58 imirkin: well, normally it's legal to create texture views of PIPE_BUFFER
15:58 imirkin: /* check for linear storage type */
15:58 imirkin: if (unlikely(!nouveau_bo_memtype(nv04_resource(texture)->bo))) {
15:58 imirkin: it should go into that path though
15:59 AndrewR: imirkin, it also worked with mplayer -vo xvmc started with NOUVEAU_PMPEG=1, and much faster, until it hung in a different way (music from another file was still playing, power button worked to shut down machine)
15:59 imirkin: AndrewR: well, those kinds of hangs can happen :)
16:04 jamm: imirkin: my xorg.log after loading the new nouveau http://pastebin.com/8MWQe0au
16:04 jamm: i can't seem to get past the login window
16:04 imirkin: jamm: yeah
16:04 imirkin: [ 16.993] (EE) Failed to load module "glx" (module does not exist, 0)
16:04 imirkin: that's bad
16:04 imirkin: you need to stick the system dirs into the modulepath as well
16:05 imirkin: jamm: e.g. here's mine: https://hastebin.com/ociyigofiz.nginx
16:08 jamm: imirkin: ah, i see, i'll try putting those, brb
16:08 imirkin: well, i dunno if you have the same system dirs
16:08 imirkin: might want to check :)
16:08 dboyan: imirkin: I even saw a more mysterious crash: tgsi/tgsi_parse.c:241:tgsi_parse_token: Assertion `!inst->Src[i].Dimension.Dimension' failed.
16:09 imirkin: dboyan: memory corruption, it sounds like
16:09 dboyan: yeah, I believe so
16:11 jamm: imirkin: ah yeah, i wouldn't put them as-is of course XD
16:11 dboyan: imirkin: I'm just wondering if the shader cache actually causes it or just raises the possibility. I hope it is the latter
16:12 imirkin: dboyan: without identifying the source, no clue :)
16:14 dboyan: well, as karol pointed out earlier, shader cache did cause some problems, it causes some piglit tests to crash, although not related to the crashes here
16:20 imirkin: skeggsb: ping me when you're back && caught up on various stuff one catches up on after vacation
16:22 jamm: [ 17.907] (II) NOUVEAU driver Date: Tue Mar 7 18:44:43 2017 -0500
16:22 jamm: imirkin: looks like it works! thank you!
16:22 imirkin: yay
16:22 imirkin: shouldn't that driver date be Mar 26?
16:22 imirkin: jamm: mind pastebin'ing the whole log?
16:23 jamm: imirkin: yeah that's kinda odd - again
16:23 jamm: is it conflicting with my existing installation ?
16:23 imirkin: the log would indicate that.
16:24 jamm: imirkin: https://hastebin.com/ocinamelak.sql
16:24 imirkin: [ 17.889] (II) Loading /home/jam/install/lib/xorg/modules/drivers/nouveau_drv.so
16:24 imirkin: is that the right one?
16:25 imirkin: [ 18.396] (EE) AIGLX error: Calling driver entry point failed
16:25 jamm: yeah
16:25 imirkin: sounds like 3d isn't quite set up though
16:25 imirkin: oh, it wants.... grrrr.
16:25 imirkin: sec
16:25 jamm: the build date is weird
16:25 jamm: [ 17.829] Build Operating System: Linux 3.16.0-4-amd64 x86_64 Debian
16:26 imirkin: that's for Xorg
16:26 jamm: ah
16:26 jamm: i keep forgetting it's the xorg logs lol
16:26 imirkin: hrm
16:27 imirkin: jamm: cd /home/jam/install/lib/; ln -s /usr/lib/dri dri
16:27 imirkin: and restart X
16:31 jamm: imirkin: https://hastebin.com/sakivafuho.sql
16:31 imirkin: still fail =/
16:31 imirkin: do you have a /usr/lib/dri ? :)
16:31 imirkin: and is there a nouveau_dri.so in there?
16:31 jamm: doesn't seem like it
16:32 imirkin: oh, sorry.
16:32 jamm: i don't think i've installed libdri
16:32 imirkin: can you check where your nouveau_dri.so is?
16:32 imirkin: nah, diff distros put it in diff places
16:32 jamm: ah, right
16:33 imirkin: anyways, symlink the right one.
16:33 jamm: imirkin: https://hastebin.com/ejisotogor.sql
16:33 jamm: i'm guessing the x86_64 one
16:33 imirkin: oh wait
16:33 imirkin: this already exists: /home/jam/install/lib/dri/
16:33 imirkin: hehe
16:34 imirkin: i wonder what that ln -s did. should have failed i think.
16:34 imirkin: or did it go the wrong way?
16:34 imirkin: (but that'd require root privs)
16:34 jamm: it worked the first time o.O
16:34 jamm: okay let me retrace my steps
16:35 imirkin: anyways, i think that your nouveau_drv.so is looking for a nouveau_dri.so at /home/jam/install/lib/dri/
16:37 jamm: imirkin: /home/jam/install/lib/dri already exists and contains nouveau_dri.so
16:38 jamm: seems like i didn't create the symlink properly in the beginning, my bad
16:38 jamm: it missed the already existing folder and created it somewhere else
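A short Python sketch of the pitfall described above, under the assumption that the command was GNU `ln -s /usr/lib/dri dri`: when the link name is already a directory, `ln` creates the link inside that directory instead of failing. The helper `link_dri_dir` is hypothetical and simply refuses to touch an existing path so the conflict is visible.

```python
import os
import tempfile


def link_dri_dir(target, link_path):
    """Create link_path -> target, but leave an existing path alone.

    Returns "exists" if something is already at link_path, otherwise
    creates the symlink and returns "linked".
    """
    if os.path.lexists(link_path):
        return "exists"
    os.symlink(target, link_path)
    return "linked"


# Demonstration in a throwaway directory standing in for the install prefix.
with tempfile.TemporaryDirectory() as prefix:
    lib = os.path.join(prefix, "lib")
    os.makedirs(os.path.join(lib, "dri"))  # dri/ is already there
    print(link_dri_dir("/usr/lib/dri", os.path.join(lib, "dri")))   # exists
    print(link_dri_dir("/usr/lib/dri", os.path.join(lib, "dri2")))  # linked
```

The paths used in the demonstration are placeholders; the real prefix in the chat is /home/jam/install/lib.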
16:44 imirkin: ok
16:44 imirkin: well, wtvr
16:44 imirkin: either way, unless you have an old mesa install there, it should have been fine =/
16:44 imirkin: unless you didn't build nouveau_drv.so with --prefix=/home/jam/install ?
16:45 jamm: imirkin: https://hastebin.com/giyapoluka.sql after removing those font directives and putting the symlink from /usr/lib/x86_64-linux-gnu/dri/
16:46 imirkin: [ 18.396] (EE) AIGLX error: Calling driver entry point failed
16:46 imirkin: yeah still fail
16:46 jamm: imirkin: i did set prefix to that value, hmm
16:46 jamm: maybe i can do a rebuild
16:46 imirkin: is it an old mesa install?
16:47 jamm: hmm, i do remember installing some mesa stuffs via apt, but never built mesa from source before yesterday for nouveau
16:47 jamm: imirkin: specifically, libgl1-mesa-glx:i386 libgl1-mesa-dri:i386
16:48 jamm: i had installed them to get steam to work
16:48 jamm: iirc
16:49 jamm: my current xorg.conf https://hastebin.com/oxotidoput.nginx
16:50 imirkin: well, wait - if it's a mesa build from yesterday, then it's definitely recent enough
16:51 jamm: imirkin: the log file looks old, sorry, i think it's not loading the new conf
16:51 jamm: i tried service gdm3 restart, looks like i have to use a new screen or something
16:53 jamm: imirkin: the mesa build was also done with prefix
16:53 imirkin: the log file says when it was created up top
16:53 imirkin: [ 17.829] (==) Log file: "/home/jam/.local/share/xorg/Xorg.0.log", Time: Mon Mar 27 01:18:27 2017
16:53 pmoreau: It seems to be loading the conf: `[ 17.851] (**) ModulePath set to "/home/jam/install/lib/xorg/modules/drivers,/usr/lib/xorg/modules"`
16:53 jamm: that's odd coz it's 1:53 right now. I'll try again
16:56 jamm: imirkin: here we go, new logs https://hastebin.com/ajoquhaxab.sql
16:57 imirkin: still fail :( [ 2255.072] (EE) AIGLX error: Calling driver entry point failed
16:57 imirkin: but it finds the driver
16:57 imirkin: the driver just can't load
16:57 imirkin: odd. not sure what's up.
16:57 imirkin: but not an issue with my patches =]
17:14 jamm: imirkin: could this be an issue with the fact that i'm running debian stretch?
17:17 jamm: alright, i'll try rebuilding everything again making sure the prefix is /home/jam/install and try again. Thanks for the help imirkin!
17:18 jamm: imirkin: one noob question, should i build as root or as my own user? since i'm building in a user directory i doubt i would need root, but just wanted to make sure
17:30 pmoreau: jamm: As your own user should be fine
17:44 jamm: pmoreau: thanks!
18:09 karolherbst: dboyan_: the out of bound map error was "fixed" by this: https://github.com/karolherbst/mesa/commit/0cfc96f7787a136fd507139b3406947bfe009d1a
18:09 karolherbst: It may be that even higher values are necessary
18:10 karolherbst: anyway, if you want you could run a valgrind
18:10 karolherbst: but it takes hours and you need to exclude the hitman internal errors otherwise you hit the 10k error threshold
19:02 imirkin: mwk: so... with the G92 thing. it looks like VP and PPP are fine. but BSP is b0rked. do you have any thoughts?
19:03 imirkin: beyond "well, that sucks"
19:08 mwk: um? G92 has no PPP
19:09 mwk: VP is fine, but BSP is not? that's... strange
19:09 mwk: VP should be the way more complex one
19:14 imirkin: mwk: er yeah, no PPP. PPP is PMPEG on there :)
19:14 imirkin: mwk: well, MPEG2 decoding works via VP
19:14 imirkin: mwk: and the issue appears to span all G92's, and i've never heard of the issue on any other GPUs
19:18 imirkin: hmmm... this seems ominous
19:18 imirkin: [0] 94.680290 MMIO32 R 0x001590 0x00000100 PBUS.CLOCK_GATING_4 => { PVP2 = 0x1 | PCIPHER = 0 | PBSP = 0 }
19:18 imirkin: [0] 94.680325 MMIO32 W 0x001590 0x00001100 PBUS.CLOCK_GATING_4 <= { PVP2 = 0x1 | PCIPHER = 0 | PBSP = 0x1 }
19:20 imirkin: i guess we leave that at 0
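A minimal Python sketch of the decode shown in the two trace lines above. The bit positions are inferred only from those lines (0x00000100 decodes as PVP2 = 1, and 0x00001100 additionally as PBSP = 1); treating both fields as single bits is an assumption, and PCIPHER is left out because its position cannot be read off the trace.

```python
# Bit positions inferred from the trace; single-bit width is an assumption.
PVP2_BIT = 8
PBSP_BIT = 12


def decode_clock_gating_4(value):
    """Return the PVP2 and PBSP fields of a PBUS.CLOCK_GATING_4 value."""
    return {
        "PVP2": (value >> PVP2_BIT) & 1,
        "PBSP": (value >> PBSP_BIT) & 1,
    }


def clear_pbsp(value):
    """Return value with the PBSP bit left at 0."""
    return value & ~(1 << PBSP_BIT)


print(decode_clock_gating_4(0x00000100))  # {'PVP2': 1, 'PBSP': 0}
print(decode_clock_gating_4(0x00001100))  # {'PVP2': 1, 'PBSP': 1}
print(hex(clear_pbsp(0x00001100)))        # 0x100
```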
19:20 karolherbst: good idea :)
19:28 karolherbst: imirkin: did it ever happen to you, that "spec@arb_timer_query@timestamp-get" fails?
19:29 imirkin: karolherbst: sounds familiar.
19:29 karolherbst: okay
19:29 karolherbst: I wonder how often that actually happens and if it matters at all
19:29 imirkin: https://people.freedesktop.org/~imirkin/nvc0-comparison/problems.html
19:30 karolherbst: I was simply comparing master against my postraloadpropagation
19:30 karolherbst: and timestamp-get randomly failed
19:30 karolherbst: everything is fine though
19:30 karolherbst: *else
19:31 imirkin: mwk: last time i debugged it, i could send commands to BSP's xtensa and it processed them fine. but the second it had to hit the underlying hw, fail.
19:31 imirkin: mwk: although i haven't reproduced that this time around.
19:40 Lekensteyn: Hi! Since upgrading from Linux 4.9.5 -> 4.10.5 (+minor xorg, xf86-video-nouveau, etc. updates) I am seeing "nouveau 0000:01:00.0: disp: outp 01:0006:0f44: link training failed" which then manifests in a black screen
19:41 Lekensteyn: workaround is to open laptop, turn the screen off/on again with xrandr. Is it a known issue? (This is a hybrid graphics laptop using PRIME output slaving)
19:42 imirkin: 4.10 moved to atomic modesetting and gained DP-MST support.
19:42 imirkin: there was some kind of issue with DPMS in the nouveau ddx
19:42 imirkin: but 1.0.14 was released which fixes the issue
19:43 Lekensteyn: I'm using 1.0.14, here is the dmesg after occurrence (possibly related to DPMS since it only occurs when I return from idleness, no suspend): http://sprunge.us/dTbF
19:44 Lekensteyn: (can try to reproduce and bisect it in the coming week)
19:45 Lekensteyn: Intel had a similar black screen issue (link training failure) after DPMS suspend before the upgrade, at least that is fixed now (yay)
19:46 Lekensteyn: (thinking about it, it is actually not fixed (in all cases?), I still had to type blindly because laptop screen was black)
19:47 karolherbst: imirkin: I should have noted on the mail, that I did a piglit test run on nve6, anyway I did it and there are no regressions, can't say for gk110 or maxwell+
19:47 imirkin: karolherbst: ok
19:48 karolherbst: but the patches are written so that the pass is only enabled for nvc0+ in the first step, then the gk110 patch enables it for gk110 and the maxwell patch for maxwell
19:50 nyef: ... link training failed? Is that an eDP panel?
19:53 Lekensteyn: nyef: yes, intel one is eDP panel, nouveau is connected through miniDP
19:54 nyef: Hrm. Okay, so probably not the problem that I have with my eDP panel.
19:54 Lekensteyn: for the Intel eDP issue I have just created a shortcut to "xset dpms force off; xset dpms force on" which works for some months now
19:55 Lekensteyn: as "feature" the login screen is now blank (feature = password not being shown), so I blindly type in the password and issue the shortcut
19:55 Lekensteyn: a feature... right:p
19:55 Lekensteyn: nyef: what issue did you have then?
19:57 nyef: Blank screen on resume from suspend. Workaround is to smack a GPIO to turn the screen back on.
19:58 Lekensteyn: is your eDP driven by i915 or nouveau?
19:58 nyef: nouveau.
19:59 nyef: The system that I have won't boot if I have an eDP installed without a discrete card, and the discrete cards disable the integrated video.
20:00 Lekensteyn: what laptop do you have?
20:00 Lekensteyn: (+GPUs?)
20:00 nyef: I haven't taken the time to put together a "real" fix for the issue because it's not a major problem for me at this point, I just run the blob on this system.
20:01 nyef: Alienware M17x R4. I seem to have a GK107 and a GK104 in two different systems.
20:03 Lekensteyn: hn ok, for me it should be easier to figure out what went wrong since I have a known working version (4.9.5) and a slightly broken one (4.10.5)
20:03 Lekensteyn: if you have a similar working/non-working version you could try a bisect
20:06 nyef: In my case, there is no working version. But I know what's wrong and why. I just haven't settled in to making a real fix of it.
20:06 karolherbst: imirkin: should spec@arb_tessellation_shader@execution@barrier give me OOR_ADDR errors? It doesn't look to me like this is expected
20:06 nyef: Other priorities, you know?
20:06 imirkin: karolherbst: sounds familiar.
20:07 imirkin: karolherbst: iirc it accesses an out-of-range patch var
20:07 karolherbst: well true, but the question I have is: does it have a penalty on perf if we get such traps
20:08 karolherbst: I think it does, cause the hitmanPro benchmark I do jumps between ~20fps and 6fps within the same scene without any obvious reason, and when I was toying around I got FPS around 18 without the issues
20:08 Lekensteyn: nyef: yep, that is why I added that shortcut for the Intel eDP issue ;)
20:08 karolherbst: even if the output was different
20:08 karolherbst: so I am not 100% sure
20:09 karolherbst: I just would like to test it out
20:09 karolherbst: *try
20:13 karolherbst: mhh okay, at least that from the tessellation shader is not global memory
21:46 __Chris: Hello, is there any known problem about nouveau driver and using LAPIC? Because I can only boot with "nolapic" kernel option and with nouveau driver, otherwise I get a kernel syncing error while booting, before KMS or nouveau driver kick in. But LAPIC is working if I don't build in nouveau driver. Any ideas?
21:50 __Chris: This is making the LAPIC very sad. Me too.
21:51 imirkin: what GPU? and anything "funny" about your setup (i.e. differing from the plain setup an average person might have)?
21:53 __Chris: Hm, not that I know. It's some older laptop, Pentium4. It's some integrated nvidia card. Let me look what GPU exactly.
21:53 imirkin: well, sometimes people come in complaining of an odd error, and then happen to, in passing, mention that it's a VAX
21:53 __Chris: NV31M
21:54 __Chris: Right, I understand.
21:54 imirkin: ok
21:54 imirkin: so ... i'm confused.
21:55 __Chris: Why?
21:55 imirkin: you're saying that building CONFIG_DRM_NOUVEAU is affecting your boot before any nouveau code hits?
21:55 __Chris: Right, I think so, because KMS is not in action if the kernel panic happens.
21:55 imirkin: are you doing CONFIG_DRM_NOUVEAU=y or =m?
21:55 __Chris: y
21:56 imirkin: can you try =m?
21:56 __Chris: Right, very good idea. Will that work with KMS?
21:56 imirkin: well, you'll have to ensure that the module is loaded, but yeah
21:56 RSpliet: __Chris: also, just for the record. This is a 4.10 kernel we're talking about?
21:56 __Chris: Okay, I will rebuild and report afterwards.
21:57 imirkin: [and with most distros' boot scripts, it'll get autoloaded]
21:57 __Chris: Right, I just updated to 4.10 kernel, selfbuild. But had exact problem with 4.5 too.
21:57 __Chris: I am using Gentoo.
21:58 imirkin: btw, i dunno what you're expecting in terms of graphics with that GPU, but don't expect nouveau to work wonders
22:01 __Chris: imirkin: No, I am realistic. I am using it with TDE (KDE3 fork). I have tested nouveau without LAPIC before and it was working very good. I am very happy about that open source variant. Binary Nvidia driver was better in performance but it is not supported anymore and lacks KMS.
22:01 imirkin: 2d should be fine. once you try to use GL, prepare for pain and suffering
22:01 __Chris: Geforce 5600Go.
22:02 imirkin: the nv30 backend is pretty crappy, and doesn't bend over backwards like the blob driver does to mask the hw's insufficiencies
22:02 __Chris: imirkin: I have tested it again. FreedroidRPG gave me some results. But I am optimistic that I can play it on this machine in a year or so. It was running fine with the binary driver.
22:03 imirkin: ok. i think blob exposes GL 2.1, while nouveau will only get you GL 1.5 on there (but with a lot of the extensions that make up GL 2.0)
22:04 __Chris: imirkin: So these GL 2.1 features were not supported by hardware, more emulated by the blob?
22:04 imirkin: kinda-sorta supported, not enough for conformant behavior
22:04 imirkin: software of the day knew what the limits were and avoided hitting the "software" path
22:05 imirkin: today's software couldn't give a damn - if you claim GL version X, it'll make full use of the features offered by that spec
22:06 __Chris: Reminds me of some unichrome chipsets of VIA...
22:07 imirkin: well, it was enough to support the early DX9 revs
22:07 __Chris: As long it is stable and gives some performance boost and KMS and a bit OpenGL, it is fine.
22:08 imirkin: and i have a NV34 plugged in atm, so i can investigate any issues you hit on the GL side
22:08 __Chris: Oh, that's nice.
22:09 imirkin: [that said, there are large classes of issues i'm aware of and haven't had time/etc to fix]
22:10 __Chris: imirkin: Yes, that's a bit what I expected. But I don't complain about performance or issues, I am happy, besides this LAPIC problem with nouveau. I will try some simple Linux games if it works and can report. Is there any bugtracker?
22:11 imirkin: bugs.freedesktop.org
22:11 __Chris: Ah, I see.
22:11 imirkin: file issues under Mesa -> DRI/nouveau
22:11 imirkin: [for 3d stuff]
22:11 __Chris: Still compiling.
22:15 imirkin: that said i have no idea what kinds of issues cause LAPIC oddness. i do remember that era when these things sometimes broke. but i don't remember the sources of those issues...
22:19 __Chris: So, result:
22:20 __Chris: It boots when built as a module and with LAPIC on.
22:20 __Chris: BUT: I think when it wants to load the module, it kernel panics.
22:20 __Chris: And KMS is not started.
22:20 __Chris: So result is just delayed.
22:21 imirkin: ok, so ... that means that it's in nouveau loading code
22:21 __Chris: imirkin: The thing is, it worked with blob driver, and LAPIC on. And LAPIC without nouveau is working too.
22:21 imirkin: is it AGP or PCI or PCIe?
22:21 __Chris: It is.... MXM? Ah, AGP.
22:22 __Chris: SIS AGP.
22:22 imirkin: lspci -vvvvnn -d 10de:
22:22 imirkin: (pastebin the results of that)
22:22 __Chris: Give me a second. Ah, for that I have to fire up IRC on this machine. Give me a second.
22:23 imirkin: well, the pastebin name is usually short :)
22:23 imirkin: anyways, could be that AGP is somehow buggered. SiS and AGP don't seem to mix well in my memory
22:24 imirkin: you could boot with nouveau.config=NvAGP=0
22:24 __Chris: I have to let it run out of battery power. It doesn't shut down, not even when pressing the power button for 30 sec.
22:25 imirkin: remove battery?
22:25 __Chris: Right.
22:25 __Chris: But I will try now first that NVAGP thing.
22:27 imirkin: if the laptop has a serial port, you could use that to get the messages before it dies
22:28 __Chris: Oh. NvAGP=0 lets it boot.
22:29 imirkin: /* SiS 761 does not support AGP cards, use PCI mode */
22:29 imirkin: { PCI_VENDOR_ID_SI, 0x0761, PCI_ANY_ID, PCI_ANY_ID, 0 },
22:29 __Chris: In dmesg I read about AGP=12x if I recall right, which was set to 8x after figuring out that 12x is not supported. Maybe it's related.
22:29 imirkin: could be.
22:29 imirkin: you can set NvAGP=4 to force it to 4x
22:30 __Chris: I will try it.
22:30 skeggsb: iirc the 12x is an oddity in how the kernel expresses agp v2
22:30 __Chris: skeggsb: Ah, I see.
22:30 skeggsb: i could be wrong though, agp has been paged out :)
22:31 imirkin: skeggsb: you've returned alive! all rested up?
22:31 skeggsb: i wouldn't say rested, it was a pretty busy holiday :P
22:31 imirkin: "i need a holiday from this holiday"? :)
22:31 skeggsb: exactly haha
22:32 skeggsb: but nah, it was good
22:32 __Chris: imirkin: I will try to build nouveau now on and test it too. But this AGP problem seems to be present only with LAPIC.
22:33 imirkin: __Chris: much like skeggsb has paged out AGP, i've long ago paged out any apic-related items. mostly they weren't in my head to begin with
22:33 imirkin: i just know "apic" == "weird shit's going down"
22:33 imirkin: and when you stick an l in front, that just makes it weirder
22:35 imirkin: skeggsb: well, when you get all caught up, i'd like to work out this mpeg stuff with you
22:35 __Chris: imirkin: These days CPUs still have LAPICs in them, as far as I know? But the BIOS implementations are more proper I think.
22:36 skeggsb: imirkin: i thought you might ;)
22:36 pmoreau: imirkin: Would you have some time to look at some patches (for SPIR-V support)? Just for me to get an external opinion on how the patches are split and general comments about `auxiliary/spirv/spirv_linker`, before sending it out to the list.
22:36 imirkin: pmoreau: not tonight =/
22:36 imirkin: pmoreau: i'm already supposed to be doing something else :)
22:37 skeggsb: imirkin: i replied to one of your patches (i think it might have just gone to dri-devel though, not directly to you/nouveau list), and merged the other
22:37 imirkin: skeggsb: thanks. now there's the pre-nv44 issue.
22:37 pmoreau: imirkin: Doesn’t have to be tonight. Just checking that you would be ok to do it at some point. :-)
22:37 imirkin: pmoreau: sure
22:37 imirkin: skeggsb: https://bugs.freedesktop.org/show_bug.cgi?id=99584#c6
22:38 pmoreau: imirkin: Should I send you the patches now, or send them when you’ll have some time?
22:38 imirkin: pmoreau: send them whenever, and then keep bugging me
22:38 pmoreau: imirkin: Ok, thanks!
22:39 karolherbst: mupuf: mind plugging a gk110 and a gm107+ GPU inside reator?
22:40 karolherbst: mupuf: or checking for regressions on both GPUs via ezbench for me? :D
22:41 imirkin: skeggsb: unfortunately i don't understand enough about the abstractions here to really propose a way out of it...
22:42 skeggsb: imirkin: ack, i'll try and sort something out
22:43 imirkin: skeggsb: ok awesome :) should be easy to repro - just creating the object is enough to get the fail.
22:43 skeggsb: i don't believe i have a board that will hit it anymore
22:43 imirkin: any pre-NV44 should do it...
22:43 skeggsb: i think i have nv44/nv49 these days
22:44 imirkin: ah. no nv3x either?
22:44 skeggsb: oh, i have a pcie 3x
22:44 imirkin: i think mesa code bails on those since i was getting hangs when doing the mpeg decode which i suspect were unrelated to the mpeg decode itself
22:44 imirkin: but should be easy to write a sample app that instantiates the object on its own
22:45 skeggsb: that's fine, i'll create the object manually
22:45 karolherbst: skeggsb: mind looking over the last nine patches on this branch? https://github.com/karolherbst/nouveau/commits/clk_update_v2
22:45 skeggsb: karolherbst: i've seen them, it's on my list :P
22:45 karolherbst: nice!
22:46 karolherbst: skeggsb: I also found a MEMACK_TIMEOUT issue while running piglit in parallel
22:46 karolherbst: only happened with the textureGather based tests
22:47 skeggsb: imirkin: i have some stuff i want to add support for (channel groups) in that whole area before/for vulkan btw, so i'll make an effort to document some of those abstractions :P
22:47 skeggsb: karolherbst: what board?
22:47 karolherbst: gk106
22:47 skeggsb: unless it's something new mesa is triggering, i've been running // piglit on >=fermi at least without issue for a while now
22:47 skeggsb: including gk106 :P
22:47 karolherbst: yeah well
22:47 karolherbst: it is random
22:47 karolherbst: I did a run without issues
22:47 karolherbst: in one run I had to resume once
22:48 karolherbst: the other I had to resume three times
22:48 skeggsb:runs piglit multiple times a day lately (well, prior to holidays)
22:48 karolherbst: but I also run it with max clocks
22:48 karolherbst: maybe something is still unstable fully reclocked? dunno
22:48 skeggsb: i'd be tempted to blame the reclocking there, personally
22:48 skeggsb: but, not necessarily
22:48 karolherbst: :O
22:48 karolherbst: no way
22:49 skeggsb: MEMACK_TIMEOUT is one of those "fatal" errors according to nvgpu, one that sw shouldn't be able to trigger on its own
22:49 karolherbst: "nouveau 0000:01:00.0: fifo: PBDMA0: 00000002 [MEMACK_TIMEOUT] ch 6 [00bf8cf000 textureGather[4606]] subc 0 mthd 001c data 00000002"
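[editor's sketch] When a fault like the one quoted above only shows up at random across piglit runs, it can help to pull the fields out of each log line and tally which tests trigger it. The following is a minimal sketch, assuming the line has exactly the layout shown in the quote and that any dmesg timestamp prefix has already been stripped. The struct, its field names, and `parse_pbdma_fault()` are hypothetical and not part of nouveau; the field meanings in the comments are inferred from the quoted line only.

```c
#include <stdio.h>

/* Hypothetical container for the fields of one PBDMA fault line. */
struct pbdma_fault {
    unsigned unit;            /* PBDMA index, e.g. 0 for "PBDMA0" */
    unsigned status;          /* raw status word (hex in the log) */
    char reason[32];          /* decoded name, e.g. "MEMACK_TIMEOUT" */
    int chid;                 /* channel id after "ch" */
    unsigned long long inst;  /* hex value inside the second bracket */
    char comm[64];            /* process name */
    int pid;                  /* process id */
    unsigned subc;            /* subchannel */
    unsigned mthd;            /* method (hex in the log) */
    unsigned data;            /* data word (hex in the log) */
};

/* Returns 0 when all ten fields were parsed, -1 otherwise.
 * Assumption: process names do not contain '['. */
static int parse_pbdma_fault(const char *line, struct pbdma_fault *f)
{
    int n = sscanf(line,
                   "nouveau %*s fifo: PBDMA%u: %x [%31[^]]] ch %d "
                   "[%llx %63[^[][%d]] subc %u mthd %x data %x",
                   &f->unit, &f->status, f->reason, &f->chid,
                   &f->inst, f->comm, &f->pid,
                   &f->subc, &f->mthd, &f->data);
    return n == 10 ? 0 : -1;
}
```

Fed the quoted line, this would report the process name `textureGather`, which is the field to group by when checking whether only the textureGather tests are affected.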
22:49 pmoreau: karolherbst: Seems like that acr_r352 error I was having on the GP102 (or maybe another one) can be triggered on another Pascal card, though after a resume.
22:50 skeggsb: pmoreau: what error? gp102 worked for me
22:50 karolherbst: skeggsb: I see
22:50 karolherbst: skeggsb: will investigate
22:50 pmoreau: skeggsb: https://hastebin.com/vudesehowi.go
22:50 karolherbst:is wondering how long we can ignore the hwmon warning
22:51 skeggsb: pmoreau: ah, gnurou knows about that one.. does it work fine otherwise?
22:51 skeggsb: * There is a bug where the LS firmware sometimes require to be started
22:51 skeggsb: * twice (this happens only on SEC). Detect and workaround that
22:51 skeggsb: * condition.
22:51 skeggsb: no, it doesn't :P
22:51 skeggsb:reads more of the log
22:52 pmoreau: skeggsb: IIRC, I could get a screen, but the point of the experiment was to try imirkin's patch to get Pascal support in xf86-video-nouveau and see if acceleration worked
22:53 pmoreau: Also, there is this new bug report on a GP106 happening on resume https://bugs.freedesktop.org/show_bug.cgi?id=100406, maybe the same issue?
22:54 pmoreau: Did gnurou have a patch for it that I could try?
22:54 skeggsb: ignore my comments about gnurou, that's referring to the initial backtrace (which is "normal" on some configs)
22:55 pmoreau: Ok, but the "cannot boot FECS falcon" is not normal, is it? :-D
22:55 skeggsb: no, it's npot
22:55 skeggsb: not*
22:55 pmoreau: Ok
22:56 skeggsb:hates that we have this magic black-box
22:56 pmoreau: :-/
22:56 karolherbst: mhhh
22:57 pmoreau: And a magic black-box without working fans!
22:57 __Chris: Interesting: With builtin nouveau and nouveau.config=NvAGP=0 it kernel panics like before.
22:57 karolherbst: well, I would say we just fix the issues annoying us the most
22:58 pmoreau: s/working/variable speed
22:58 karolherbst: well
22:58 karolherbst: even if we get a PMU image, those will be useless
22:59 karolherbst: and I wouldn't even accept them in the linux-firmware repository
22:59 imirkin: skeggsb: as for the PCI + nv04_mmu thing - your call whether you want to do it for all nv4x's or only nv4a.
22:59 imirkin: skeggsb: i don't understand enough about all this MMU stuff.
22:59 karolherbst: skeggsb: there will be only NACKs on my side for PMU images without mem reclocking support
22:59 karolherbst: and I hope you will NACK them as well, because this is just silly if they really do this this way
22:59 skeggsb: karolherbst: i plan on NACKing any tegra pmu usage if they don't provide the equivalent for dGPU
23:00 karolherbst: memory reclocking
23:00 karolherbst: you don't need it for tegra
23:00 karolherbst: that's the problem
23:00 skeggsb: no, i know, but PMU does other stuff too that nvgpu uses on tegra, which i'm sure they'd like to use
23:00 karolherbst: and doing mem reclocking on the host is silly and stupid as well
23:00 karolherbst: sure
23:00 karolherbst: I am fine with this
23:00 karolherbst: but I also want memory reclocking bits
23:00 karolherbst: at least
23:01 skeggsb: i also want fan control + cbc (compression) control.. if they give us pmu images with those, i'll accept them :P
23:01 karolherbst: otherwise that's pretty much useless for us
23:01 karolherbst: fan control + memory reclocking + your wish list, then I am fine as well
23:01 karolherbst: memory reclocking _works_ on gm20x
23:01 mupuf: karolherbst: I'm back home!
23:01 mupuf: after the week end
23:01 karolherbst: :)
23:01 karolherbst: I had some fun with reator with a full reclocked gm20x
23:02 mupuf: so, I'll plug the gpus
23:02 mupuf: how did you do it?
23:02 karolherbst: nouveau without secboot -> reclock to 0f -> unload -> load nouveau with secboot -> do tests
23:02 karolherbst: easy
23:02 mupuf: ...
23:03 karolherbst: the GPU didn't even go above 60°C!
23:03 karolherbst: but that was without the scheduling stuff
23:03 mupuf: so, how about we use another engine to do reclocking?
23:03 mupuf: like one PCOPY :D?
23:03 karolherbst: I like that idea
23:03 karolherbst: :D
23:03 skeggsb: we can't, those falcons don't have access to all registers
23:03 karolherbst: well
23:03 karolherbst: we only need regs which are also accessible from the host
23:03 skeggsb: we *can* use PMU on gm20x at least, but we're still fucked without fan control
23:04 karolherbst: true
23:04 karolherbst: but it works
23:04 skeggsb: karolherbst: the engine falcons can only access themselves, can't even *read* PMC_BOOT_0
23:04 karolherbst: and why couldn't we load the PMU image on a pcopy engine?
23:04 karolherbst: mhh :(
23:04 karolherbst: sad
23:04 karolherbst: okay well
23:04 karolherbst: I already talked with gnurou about this
23:04 karolherbst: basically
23:04 karolherbst: it should be possible to load _our_ PMU image after secboot
23:04 mupuf: karolherbst: that would be best
23:05 karolherbst: but then we are still fucked due to fan control
23:05 mupuf: skeggsb: can pgraph access all regs?
23:05 skeggsb: mupuf: nope
23:05 __Chris: imirkin: Have you read my last message? And NvMSI=0 did nothing too.
23:05 mupuf: so, the PMU is the only one that can? That's a bummer!
23:05 karolherbst: we could switch the PMU images all the time :O
23:05 mupuf: karolherbst: well, it has an associated CPU cost
23:05 RSpliet: mupuf: had any luck turning FECS or one of the GPCCS into a software PWM for fan-control? :-D
23:06 karolherbst: mupuf: we only reclock memory once :p
23:06 karolherbst: so we would switch once as well
23:06 skeggsb: we *can* still work on improving everything else we need for clocking while we wait.. it's better than holding our breath :)
23:06 mupuf: RSpliet: I tried bitbanging the GPIO from the host
23:06 mupuf: but I could really not do anything
23:06 mupuf: skeggsb: true
23:06 karolherbst: skeggsb: uhm, like what? it works already like on kepler
23:06 karolherbst: for most cards
23:06 RSpliet: mupuf: was that latency related?
23:07 skeggsb: it's (memory clocking) still incomplete enough that i would *not* be comfortable enabling it by default
23:07 karolherbst: skeggsb: I just wanted to make it clear that I wouldn't accept any silly PMU image with nothing inside it
23:07 skeggsb: karolherbst: right
23:07 skeggsb: agreed
23:07 karolherbst: skeggsb: what is the diff between maxwell1 and maxwell2?
23:07 karolherbst: regarding mem reclocking
23:07 __Chris: imirkin: "Kernel panic not syncing - Fatal exception in interrupt"
23:07 imirkin: __Chris: no MSI with nv3x in the first place
23:08 mupuf: RSpliet: what do you mean?
23:08 skeggsb: perhaps nothing, but even on kepler it's not complete, there's a ton of bios switches etc that aren't handled/handled properly, there *will* be boards out there that fail
23:08 karolherbst: skeggsb: well true, but it's enabled on kepler and maxwell1 now anyway
23:08 karolherbst: and it works for most users
23:08 skeggsb: oh, by enabled i meant "automatic reclocking"
23:08 karolherbst: okay, true
23:08 RSpliet: mupuf: was the problem inability to change the reg values at all, or non-practical latencies getting in the way of doing bitbanging?
23:08 karolherbst: but manual reclocking is good enough for now
23:08 skeggsb: not for users :)
23:09 karolherbst: better than no reclocking
23:09 skeggsb: yeah
23:09 karolherbst: also
23:09 karolherbst: automatic reclocking works on mine
23:09 __Chris: imirkin: It made no difference anyway it seems. 0 or 1. But NvAGP=0 did the trick as (m) but not as (y). Makes no sense, right?
23:09 imirkin: right :)
23:10 mupuf: RSpliet: no, it was the inability to change the reg values
23:10 karolherbst: skeggsb: I did stresstesting: it was stable with 2k reclocks/second
23:10 karolherbst: under full load
23:10 mupuf: karolherbst: you missed his point, it may work on your hw but would not work on all
23:10 mupuf: I had the same logic as yours
23:10 karolherbst: true
23:10 karolherbst: but we could have a switch: config=dynRclk=1 or so
23:10 mupuf: I got a stable reclocking on my nv86 in a few months
23:11 karolherbst: and fix the issues which come over time
23:11 karolherbst: I wouldn't enabled it by default as well
23:11 karolherbst: *enable
23:11 mupuf: but years later, it is still not there by default
23:11 karolherbst: but my point was: I want memory reclocking bits inside the Maxwell2 PMU images
23:11 karolherbst: otherwise they are useless
23:11 RSpliet: karolherbst: a big dealbreaker is the flickering screen on every reclock because we don't configure the line buffer
23:11 RSpliet: which, speaking of flickering
23:11 RSpliet: skeggsb! :-D
23:11 karolherbst: well, true
23:12 skeggsb: karolherbst: we already have that option (NvClkMode) ;)
23:12 karolherbst: I meant for automatic reclocking
23:13 skeggsb: yeah, one of the numbers you can pass to that meant "automatic" last i looked
23:13 karolherbst: :O
23:13 karolherbst: it's not inside the code?
23:13 karolherbst: first of all: you need the PMU counters for this
23:13 karolherbst: nouveau has 0 support of those
23:13 skeggsb: i know :)
23:14 karolherbst: well, I have patches for this as well though :D
23:14 skeggsb: if (clk->allow_reclock && !strncasecmpz(mode, "auto", arglen))
23:14 skeggsb: return -2;
23:14 skeggsb: it's in the code
23:14 skeggsb: but, obviously, doesn't do a great deal right now
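[editor's sketch] The two quoted lines show only the "auto" branch of the NvClkMode parsing. Below is a minimal user-space sketch of how such a mode argument could be classified, assuming the argument arrives as a pointer plus length (as `mode`/`arglen` suggest). Only the `-2` return for "auto" comes from the snippet; `arg_is()`, `parse_clk_mode()`, the `-1` "disabled" value, the 0..0xff range check, and the use of `strtol()` are assumptions made for illustration, not nouveau's actual code.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define CLK_MODE_AUTO     (-2)  /* same value as in the quoted snippet */
#define CLK_MODE_DISABLED (-1)  /* assumption made for this sketch */

/* Hypothetical stand-in for the kernel helper: true when the first `len`
 * bytes of `arg` equal `word`, ignoring case, and `word` is `len` long. */
static bool arg_is(const char *arg, size_t len, const char *word)
{
    if (strlen(word) != len)
        return false;
    for (size_t i = 0; i < len; i++) {
        if (tolower((unsigned char)arg[i]) != tolower((unsigned char)word[i]))
            return false;
    }
    return true;
}

/* Classify a mode argument of `len` bytes: "auto" (only if reclocking is
 * allowed), "disabled", or a numeric pstate id in decimal or 0x-hex. */
static int parse_clk_mode(const char *mode, size_t len, bool allow_reclock)
{
    char buf[16];
    char *end;
    long v;

    if (allow_reclock && arg_is(mode, len, "auto"))
        return CLK_MODE_AUTO;
    if (len == 0 || len >= sizeof(buf) || arg_is(mode, len, "disabled"))
        return CLK_MODE_DISABLED;

    /* Copy so the number can be NUL-terminated without touching `mode`. */
    memcpy(buf, mode, len);
    buf[len] = '\0';
    v = strtol(buf, &end, 0);
    if (*end != '\0' || v < 0 || v > 0xff)
        return CLK_MODE_DISABLED;
    return (int)v;
}
```

In this sketch a plain `10` is read as decimal, so it selects pstate 0x0a, which matches the "in decimal" advice given earlier in the log.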
23:14 karolherbst: okay, cool
23:14 RSpliet: In your atomic modesetting work, the code for setting the vblank μs register disappeared. I noticed because my Fermi DRAM clock change causes screen flicker :-(
23:15 karolherbst: then I could reuse this one
23:15 RSpliet: Could you look into reinstating that bit?
23:15 __Chris: imirkin: Thank you for your help anyway. I will go to sleep for now and will try it again tomorrow. Sending you the output from lspci then. :)
23:15 karolherbst: skeggsb: the reason I don't feel good about the PMU counter patches is this: https://github.com/karolherbst/nouveau/commit/65067e8709a20b3e5a1337aa2d9cf2ca06c16805#diff-5e5cb4582f6faff078d1cad6144b248aR154
23:15 skeggsb: RSpliet: it should still be there, just, moved..
23:15 karolherbst: skeggsb: I don't really mimic nvidia here with selecting which bits to enable
23:15 karolherbst: need to RE more there
23:16 karolherbst: especially the memory load part sucks
23:20 RSpliet: skeggsb: hmm, I see the code didn't disappear. However, I'm 100% positive it doesn't work at the moment. Easiest way to tell is nvawatch -t -m 0x8 0x10a7c4
23:21 pmoreau: skeggsb: BTW, there is no detection of GDDR5X memory in Nouveau yet. Is it just a matter of detecting the proper value inside the VBIOS and setting some defines?
23:22 skeggsb: pmoreau: yeah, that should be all
23:22 karolherbst: need to sleep... mupuf thanks for changing the GPUs. will run some piglit stuff on them tomorrow, or you could show me how to do that with ezbench, if you already have something set up for this?
23:22 skeggsb: RSpliet: i'll see if i can reproduce and figure it out
23:22 mupuf: karolherbst: what do you need ezbench for?
23:22 mupuf: do you have a test?
23:22 karolherbst: mesa regression testing via piglit
23:22 RSpliet: Thanks! I expect it's something silly...
23:23 pmoreau: skeggsb: Ok. I could send the small patch I have then, after testing it (and praying it doesn’t bust the card :-D).
23:23 karolherbst: mupuf: mainly these commits: https://github.com/karolherbst/mesa/commits/008e6dcb1d9b414e6b2c56fc7e279d0f7cfb120d
23:24 mupuf: ok, performance
23:24 mupuf: I see
23:24 karolherbst: no
23:24 karolherbst: piglit
23:24 karolherbst: I don't want to break stuff
23:24 mupuf: ah, sure
23:24 mupuf: !
23:24 mupuf: I worked a lot on this this week end, actually
23:24 karolherbst: because I changed the emitting of MAD for gk110 and gm107 ISA
23:24 karolherbst: nice
23:24 karolherbst: (and nvc0 as well, but this I could verify on my own hw)
23:25 mupuf: finally putting ezbench in a cluster of test nodes
23:25 karolherbst: :)
23:25 karolherbst: would be nice to be able to tell a CI which branch on which repository to test
23:25 mupuf: yep
23:25 mupuf: coming
23:25 karolherbst: against a shared automatic tested master commits
23:25 mupuf: but also coming is: send a patch to the ML
23:25 karolherbst: !
23:26 karolherbst: nice
23:26 mupuf: and let the machines test it and report back
23:26 karolherbst: mupuf: mhh idea: BCC the rig
23:26 mupuf: this is what we have for Intel and we are working on replacing with ezbench
23:26 mupuf: no need, patchwork is following the nouveau ML
23:26 karolherbst: ohh patchwork integration, nice
23:28 karolherbst: but somehow patchwork fails to detect that stuff got merged
23:28 mupuf: but what we are working on right now: combining patch series into one tree and testing the tree, then finding culprits of regressions and marking the faulty patch series
23:28 mupuf: true, but this part is not the end of the work
23:28 karolherbst: right
23:28 karolherbst: just means I have to be more organized on my end
23:28 karolherbst: :D
23:34 karolherbst: skeggsb: regarding my reclocking update patches, I think there is still some issue _somewhere_ but this may be related to the other suspend/resume issues others had with 4.10 I think. need to investigate. It was some funky stuff and maybe I fixed my issues already. Review still welcomed though
23:36 imirkin: skeggsb: oh, btw, runpm appears buggered in 4.10 - both i and a bunch of other people on regular desktops with multiple GPUs have had to stick runpm=0 to make stuff not-die
23:36 skeggsb: imirkin: die, how?
23:37 imirkin: skeggsb: some kind of lock imbalance
23:38 imirkin: airlied tried to look into it but couldn't repro? not sure
23:39 imirkin: skeggsb: oh, when the G92 died i tried to unbind it and got this: https://hastebin.com/ihefesovah.go
23:40 imirkin: skeggsb: this was the issue with runpm though: https://hastebin.com/inozoyasod.go
23:42 airlied:couldn't spot anything obviously different in the area