02:02 azaki: karolherbst: the loading times seem to be about the same, although i'm running this from a hard drive, not an SSD, so maybe that is bottlenecking it or something.
02:02 azaki: i used the gnome clock stopwatch to try and time it
02:03 azaki: by the way, i dunno if i was supposed to try mesa-git, i actually used mesa 18.1.4
02:03 azaki: and patched that
02:04 azaki: it also seems the framerate dropped like 10 or so fps. however, the ground texture issue i think is better now, because previously the ground was just full of garbage textures
02:04 azaki: but now the ground is all black, so it's actually playable
02:04 azaki: i can see where i'm going at least now
02:04 azaki: so i actually think this is a huge improvement
02:44 rhyskidd: new nv docs: https://download.nvidia.com/open-gpu-doc/Compute-QMD/1/
02:57 imirkin: pendingchaos: the join/joinat should be around the while loop, i think
03:00 imirkin: karolherbst: we cap at 297 by default, but you can use hdmimhz to set other things
03:00 imirkin: we don't allow HDMI 2.0 at all though, i think we need to set various regs to make it go
03:00 imirkin: and/or check the capability somehow
03:00 imirkin: both of the source and the sink
03:08 HdkR: Oh snap, QMD docs :D
03:09 imirkin: karolherbst: unfortunately i don't have a HDMI 2.0 source, so i couldn't add the support for it. and ben doesn't have a hdmi 2.0 sink.
03:10 imirkin: the logical solution -- hdmi-over-ip -- hasn't quite materialized
03:15 skeggsb_: i'd actually be surprised if it didn't "just work" if we allow it
03:16 skeggsb_: the tables we use from the vbios for modesetting are selected by clock rate ranges, i'd expect the scripts would handle it
03:16 skeggsb_: but, perhaps not
03:19 imirkin: skeggsb_: well, we have to detect both source and sink support...
03:19 imirkin: although i think if it's illegal, the sink isn't supposed to report those over edid...
04:04 nyef: Hrm. Thirteen flags tested, waited for, or mutated by the NVAF context-switch microcode for which there is no information in rnndb, envydis, or the kernel.
06:49 crmlt: Can I extract firmware from 340.107?
06:59 crmlt: How should I do it?
07:03 karolherbst: imirkin: well, checking the caps won't be a problem as I think I found a way to determine what max clocks we can use
07:17 karolherbst: imirkin, skeggsb_: also, why do we have that duallink option, same reasons as for hdmimhz?
07:17 crmlt: Can I ignore this https://hastebin.com/afopuqifeg.coffeescript ?
07:18 skeggsb_: karolherbst: legacy we inherited from -nv, there were some early g8x boards that didn't have enough memory bandwidth to do it without reclocking
07:18 skeggsb_: i think my nv50 is like that actually
07:18 karolherbst: crmlt: I am sure you need to have the exact same version as stated in the script or something
07:19 karolherbst: skeggsb_: I see
07:19 karolherbst: skeggsb_: well we default it to be true and I think we can also read it out through the caps
07:20 skeggsb_: we get that particular cap from DCB already, you can ignore that one
07:20 skeggsb_: that's one of the things nvkm is supposed to do (match up hw caps against dcb and disable stuff)
07:20 skeggsb_: we just trust dcb blindly
07:20 karolherbst: okay
07:21 skeggsb_: (we're supposed to do it so evo will throw an error instead of trying)
07:21 skeggsb_: but, we modeset in the kernel anyway, so, not entirely necessary
07:21 karolherbst: I see
07:21 karolherbst: so if I would implement checking for that cap, we could basically remove that kernel parameter and check for that early on, right?
07:22 skeggsb_: no, because it's working around memory bandwidth, *not* detecting if the board supports duallink
07:22 skeggsb_: we determine support from dcb
07:23 skeggsb_: the board i have which needs it supports duallink just fine, but without reclocking, you'll get a black screen if you try
07:23 karolherbst: I see
07:23 skeggsb_: the kernel option is just a workaround if people hit that issue
07:24 skeggsb_: if we ever get stable g8x reclocking enabled by default *then* we can remove the option ;)
07:26 karolherbst: skeggsb_: I see, I think I will just write the patch adding support for parsing out the max clocks to give it to others for testing
07:27 karolherbst: it is just a bit stupid that with gm200 they split the TMDS/LVDS max clock cap
07:27 skeggsb_: yeah, that use of caps is fine, and would be nice to see
07:27 karolherbst: my GPU actually reports 600MHz, that's why I am wondering about HDMI 2.0
07:28 skeggsb_: do you have access to a hdmi 2.0 sink?
07:28 karolherbst: basically any 4k display would do, right?
07:28 skeggsb_: it'd be cool to confirm if allowing it makes it "just work"
07:29 karolherbst: yeah, thing is I would have to disable duallink for testing I guess
07:29 karolherbst: but yeah, we have a 4k display in the office here I might be able to test things on
07:29 karolherbst: skeggsb_: is there an advantage using single link over duallink except using one less link and potentially being able to drive more displays?
07:30 skeggsb_: 4k@60, yeah
07:30 skeggsb_: hdmi is only single link
07:30 skeggsb_: dual-link dvi was never really popular
07:30 skeggsb_:is using one such display right now, however
07:30 karolherbst: ahhh
07:31 karolherbst: I see
07:31 skeggsb_: though, with a DP->DVI adapter (active type, since passive DP->TMDS is single-link too)
07:31 HdkR: Really who wanted to pay for those super expensive 2560x1600 resolution monitors anyway? :P
07:32 skeggsb_: hehe
07:32 karolherbst: skeggsb_: yeah.. I also have an active HDMI to DP adapter, which is quite helpful for intel GPUs
07:32 karolherbst: as they kind of suck with HDMI
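The 297 MHz default cap and the 600 MHz figure above can be related to display modes with some quick pixel-clock arithmetic. The sketch below is illustrative and not taken from nouveau: the function names are made up, and the total timings (active plus blanking) are the standard CTA-861 values for these modes. Under those assumptions, 4k@30 lands exactly on the 297 MHz cap and 4k@60 needs 594 MHz.

```python
# Pixel clock = total pixels per frame (including blanking) * refresh rate.
# DEFAULT_CAP_MHZ is the default cap mentioned in the log; the helper names
# are hypothetical and only exist for this illustration.

DEFAULT_CAP_MHZ = 297


def pixel_clock_mhz(htotal, vtotal, refresh_hz):
    """Return the pixel clock in MHz for the given total timings."""
    return htotal * vtotal * refresh_hz / 1_000_000


def fits_under_cap(htotal, vtotal, refresh_hz, cap_mhz=DEFAULT_CAP_MHZ):
    """True if the mode's pixel clock does not exceed the cap."""
    return pixel_clock_mhz(htotal, vtotal, refresh_hz) <= cap_mhz


# (htotal, vtotal, refresh) from the standard CTA-861 timings.
MODES = {
    "1920x1080@60": (2200, 1125, 60),
    "3840x2160@30": (4400, 2250, 30),
    "3840x2160@60": (4400, 2250, 60),
}

if __name__ == "__main__":
    for name, (h, v, r) in MODES.items():
        print(name, pixel_clock_mhz(h, v, r), "MHz, fits:", fits_under_cap(h, v, r))
```

This only covers the clock arithmetic; whether the hardware and the sink actually accept the mode is the open question in the discussion.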
08:05 karolherbst: skeggsb_: https://github.com/karolherbst/nouveau/commit/00c0f88bf481d699af6fd8fcafc6d7268e756c61
08:06 karolherbst: uhm....
08:07 skeggsb_: i'm still somewhat wondering if we shouldn't save the results per-SOR... post-gm20x they should all be the same, but prior they might differ
08:07 karolherbst: yeah.. maybe that is a bit simpler
08:07 karolherbst: well
08:07 karolherbst: I think we should at least _parse_ them per sor
08:07 karolherbst: if the consuming code loops over all and takes the worst value that would be better than the current code anyway
08:08 skeggsb_: not necessarily ;)
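The "parse per SOR, let the consumer take the worst value" idea could look roughly like the sketch below. It is a model in Python, not the kernel code: `SorCaps`, `max_tmds_mhz` and `worst_case` are hypothetical names, and only `no_interlace` corresponds to a field mentioned in the channel. As skeggsb_ hints, collapsing to the worst case is conservative and may not always be the right choice.

```python
from dataclasses import dataclass


@dataclass
class SorCaps:
    """Hypothetical per-SOR capability record."""
    max_tmds_mhz: int      # assumed field: highest TMDS clock this SOR supports
    no_interlace: bool     # True if this SOR cannot do interlaced modes


def worst_case(caps):
    """Collapse a list of per-SOR caps into one conservative set.

    Takes the lowest clock limit and treats a restriction on any SOR
    as a restriction on all of them.
    """
    if not caps:
        raise ValueError("no SOR caps were parsed")
    return SorCaps(
        max_tmds_mhz=min(c.max_tmds_mhz for c in caps),
        no_interlace=any(c.no_interlace for c in caps),
    )
```

Keeping the per-SOR list around and only reducing it at the point of use leaves room for a less pessimistic policy later.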
08:09 karolherbst: skeggsb_: https://github.com/karolherbst/nouveau/commit/e16b9cd2629bcdbebc940da015814279f8b746b8
08:10 karolherbst: could I do nv_encoder->dp.no_interlace = caps->dp[nv_encoder->or].no_interlace; ?
08:10 karolherbst: in nv50_sor_create?
08:10 skeggsb_: well... prior to gm20x, yes.. it makes no sense on gm200 and up because they're dynamically assigned
08:11 karolherbst: mhhhh
08:11 skeggsb_: gm200.. i'd probably just pick the first SOR values.. not sure what nvidia do
08:11 skeggsb_: we *were* told that all SORs should be equally capable (with the exception of LVDS/eDP, which are supposed to be fixed)
08:11 karolherbst: I am mainly wondering when nv_encoder->or is written to, but I guess this happens at acquire time and this is after sor_create?
08:11 skeggsb_: use "ffs(nv_encoder->dcb->or) - 1"
08:12 skeggsb_: that'll deal with pre-gm200
08:13 skeggsb_: and, actually, all sors being equal, that'll deal with gm200 too
08:14 karolherbst: I see
08:14 karolherbst: well we know that the sors aren't equal in regards to lvds though
08:14 skeggsb_: yes, that's a known exception
08:15 skeggsb_: it's fine though, in that case sor/link and pad/link are supposed to be identify
08:15 skeggsb_: so it'll still match
08:16 skeggsb_: identity*
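For readers unfamiliar with the "ffs(nv_encoder->dcb->or) - 1" idiom: C's `ffs()` returns the 1-based position of the lowest set bit (0 if none), so subtracting one turns a bitmask into a 0-based index. The Python sketch below mimics that behaviour; it assumes the DCB `or` field is a bitmask of usable output resources, which the use of `ffs()` suggests but the log does not spell out. `sor_index` is a made-up helper name.

```python
def ffs(mask):
    """Mimic C ffs(): 1-based index of the lowest set bit, 0 if mask is 0.

    Only meaningful for non-negative integers.
    """
    return (mask & -mask).bit_length()


def sor_index(dcb_or_mask):
    """Return the 0-based index of the first SOR named in the bitmask."""
    if dcb_or_mask == 0:
        raise ValueError("empty OR mask")
    return ffs(dcb_or_mask) - 1
```

With a mask such as `0b0110`, the result is index 1, the lowest SOR the entry allows.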
08:18 karolherbst: skeggsb_: do we have a macro for the max sors? or is it simply 8?
08:19 skeggsb_: it's 4 on earlier GPUs
08:19 skeggsb_: just only parse as many as there are caps for
08:19 karolherbst: skeggsb_: mhh in the 50 file they have sor0/1 and pror0/1/3
08:19 karolherbst: uhm pior
08:20 skeggsb_: uh, yeah, even less on even earlier GPUs :P
08:20 karolherbst: well, I was mainly wondering if I should stick a 8 or a NOUVEAU_MAX_SORS in the header
08:39 karolherbst: skeggsb_: https://github.com/skeggsb/nouveau/compare/92df3698bc35ddee9d58e0fb16bea1c715459310...karolherbst:fix_interlaced_reject
09:34 RSpliet: karolherbst, skeggsb_: we could estimate both DRAM bandwidth and scan-out bandwidth quite easily. We could replace the static option with a routine that checks required bandwidth against provided DRAM bandwidth, which once reclocking is sorted out (...) can be replaced with code to enforce a minimum perf lvl for given display mode.
09:35 RSpliet: The only caveat is that rejecting a mode is not helpful if we get users to manually change their perf lvl...
09:36 karolherbst: RSpliet: with my clocking stuff I kind of added the concept of a max clock
09:36 karolherbst: we could do the same just the other way around
09:36 karolherbst: and selecting a lower clock would simply fail
09:37 RSpliet: For these older (and even older) cards we'll need a lower bound anyway for scanout
09:38 karolherbst: yeah, I know. I have such a GPU
09:38 karolherbst: which needs 0xf for 1920x1080 already
09:38 karolherbst: or whatever the highest perf level on tesla is
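RSpliet's suggestion (compare the bandwidth a mode needs against what DRAM provides, then enforce a minimum perf level) can be sketched as below. Everything here is an assumption for illustration: the formulas are first-order estimates, `usable_fraction` is an invented safety margin, and the function names and example numbers do not come from nouveau or from any real board.

```python
def scanout_bytes_per_sec(width, height, refresh_hz, bytes_per_pixel=4):
    """Rough scan-out bandwidth for one head, ignoring blanking."""
    return width * height * refresh_hz * bytes_per_pixel


def dram_bytes_per_sec(mem_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Rough peak DRAM bandwidth; transfers_per_clock=2 assumes DDR-style memory."""
    return mem_clock_mhz * 1_000_000 * transfers_per_clock * bus_width_bits // 8


def min_perf_level(levels, required, usable_fraction=0.5):
    """Pick the lowest-bandwidth perf level that can feed the display.

    levels: list of (level_id, dram_bytes_per_sec) tuples.
    usable_fraction: assumed share of peak bandwidth scan-out may rely on.
    Returns the level id, or None if no level is sufficient.
    """
    for level_id, bandwidth in sorted(levels, key=lambda lvl: lvl[1]):
        if bandwidth * usable_fraction >= required:
            return level_id
    return None
```

A `None` result corresponds to rejecting the mode outright, while a returned level would become the lower bound that a user-selected perf level must not go below.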
11:35 karolherbst: skeggsb_: what do you think of this: https://github.com/karolherbst/nouveau/compare/master_4.17...karolherbst:fix_interlaced_reject
11:39 karolherbst: I guess I could add it for lvds as well
11:57 karolherbst: nice, today I get those EVO timeouts....
11:57 karolherbst: on boot
13:48 karolherbst: skeggsb_: uhm, we call nv50_mstm_init for every encoder except the dpmst ones?
13:48 karolherbst: shouldn't it be like only called on DRM_MODE_ENCODER_DP or DRM_MODE_ENCODER_DPMST ones?
13:50 karolherbst: in either case
13:50 karolherbst: having the dp.mstm inside a union doesn't sound like a good idea
13:58 karolherbst: "[drm:drm_mode_config_cleanup] *ERROR* connector eDP-3 leaked!" mhh, also not so nice
13:58 karolherbst: Lyude: do you know if you took care of such a bug already?
13:58 karolherbst: leaking connectors?
15:01 karolherbst: pendingchaos: do we have piglit tests for MS bindless images?
15:01 pendingchaos: I don't think so
15:02 karolherbst: would be good if you add some so that we can verify that your patch fixes it and we don't regress it
15:15 pendingchaos: karolherbst: would creating a modified basic-imageStore.shader_test be good?
15:15 pendingchaos: or should more than that be added
15:15 karolherbst: yeah, that would be good enough
15:15 karolherbst: as long as it fails without your patch and does what we expect in the fixed state
15:15 pendingchaos: that seems to be the case
15:15 karolherbst: pendingchaos: do you have a 4k HDMI display by any chance? I don't have one in reach right now
15:16 pendingchaos: no, I don't
15:17 karolherbst: :(
15:18 karolherbst: pendingchaos: maybe a 2560×1440 one?
15:18 azaki: X crashed, so I may have missed some messages
15:18 azaki: karolherbst: did you see my feedback yesterday?
15:18 pendingchaos: sorry, just 1920x1080
15:18 karolherbst: sad :(
15:18 karolherbst: azaki: maybe?
15:19 karolherbst: ahh yes
15:19 karolherbst: azaki: ahhh, so it is black now
15:19 azaki: yeah.
15:19 karolherbst: this basically means this patch isn't _that_ wrong
15:19 karolherbst: I was mainly hacking around stuff
15:19 karolherbst: mind doing another apitrace?
15:19 karolherbst: but uhm
15:19 karolherbst: kind of cache the game before
15:20 karolherbst: and try even smaller resolution
15:20 azaki: yeah it's certainly an improvement. even though the fps went down a bit. at least i can see what's going on now.
15:20 karolherbst: maybe you are able to create a ~15GB trace
15:20 karolherbst: azaki: well, a shader didn't compile
15:20 azaki: although loading times stayed the same.
15:20 karolherbst: yeah, most likely simply caching on my end
15:20 karolherbst: I have 32GB RAM on that machine, so the trace nearly fits in
15:20 azaki: i do have 32 GB of ram too though. XD
15:21 azaki: DDR3 though.
15:21 azaki: not DDR4
15:21 karolherbst: :p
15:21 karolherbst: I have that on my other laptop :D
15:21 karolherbst: anyway
15:21 azaki: i don't have swap though. which is probably a huge mistake. but eh
15:21 karolherbst: another trace, but smaller would be good
15:21 azaki: ok.
15:21 karolherbst: as I think wine was generating garbage
15:21 karolherbst: as for me the replay still showed the garbage floor
15:22 karolherbst: not black
15:22 azaki: yeah i tried the replay, it still shows garbage. but when you actually play the game itself, it's black.
15:23 karolherbst: okay
15:24 karolherbst: azaki: do you have a fermi or kepler gt 630?
15:24 azaki: kepler
15:24 karolherbst: okay nice
15:24 azaki: it's the OEM kepler, 192 'cuda cores'
15:24 karolherbst: I don't remember if you were able to reclock, but with that one, you are
15:24 azaki: not the retail one
15:24 karolherbst: so the GK107 one
15:25 azaki: yeah
15:25 azaki: i did reclock.
15:25 azaki: 0f: core 324-875 MHz memory 1782 MHz *
15:25 azaki: AC: core 875 MHz memory 1782 MHz AC DC
15:25 karolherbst: yeah
15:25 karolherbst: this is basically the highest possible then :)
15:25 azaki: i've actually kept it running at highest clock, i dunno if that's bad or not
15:25 azaki: =p
15:25 karolherbst: azaki: did you use NvBoost or is it without it?
15:26 azaki: without, i'm not sure what nvboost even is.
15:26 karolherbst: azaki: you might want to boot with nouveau.config=NvPmEnableGating=1
15:26 karolherbst: azaki: uhm, raising the max clocks
15:26 karolherbst: like you have the base/boost clocks on GPUs
15:26 karolherbst: and we default to base
15:26 karolherbst: but the pstate lines show the actual max clock
15:26 karolherbst: and the AC line would be simply capped
15:26 karolherbst: but as you already reach the top, there is nothing you could do
15:27 karolherbst: that NvPmEnableGating reduces power consumption and also heat generation
15:27 karolherbst: might be a good idea if your GPU stays at 0xf
15:28 karolherbst: you should notice the GPU be a bit cooler with that, but without a power sensor measuring power consumption it is a bit hard to tell
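The two pstate lines azaki pasted can be compared mechanically to check the "already at the top" conclusion. The parser below is a sketch that only assumes the line layout seen in this log ("id: core N[-M] MHz memory N MHz ..."); other boards may print additional fields, and the helper names are made up.

```python
import re

# Matches e.g. "0f: core 324-875 MHz memory 1782 MHz *"
# and          "AC: core 875 MHz memory 1782 MHz AC DC".
PSTATE_RE = re.compile(
    r"^(?P<id>\w+):\s+core\s+(?P<core_lo>\d+)(?:-(?P<core_hi>\d+))?\s+MHz"
    r"\s+memory\s+(?P<mem>\d+)\s+MHz"
)


def parse_pstate_line(line):
    """Parse one pstate line into a dict, or return None if it does not match."""
    match = PSTATE_RE.match(line.strip())
    if match is None:
        return None
    core_lo = int(match["core_lo"])
    # A single value (no range) means min and max are the same.
    core_hi = int(match["core_hi"] or core_lo)
    return {
        "id": match["id"],
        "core_min_mhz": core_lo,
        "core_max_mhz": core_hi,
        "memory_mhz": int(match["mem"]),
    }


def at_top_of_level(level_line, current_line):
    """True if the current core clock equals the level's maximum core clock."""
    level = parse_pstate_line(level_line)
    current = parse_pstate_line(current_line)
    return current["core_max_mhz"] == level["core_max_mhz"]
```

For the lines in the log, the current core clock (875 MHz) matches the upper end of the 0f range, which is why raising the boost setting would not gain anything here.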
15:31 azaki: ah, ok
15:31 azaki: the temp isn't too bad right now. it's at 53 C
15:35 azaki: i have to go out in a bit. when i'm back i'll do the second apitrace
15:59 pendingchaos: karolherbst: if you have access to a Kepler card, do you mind making sure this doesn't break bound and bindless image load/store sometime: https://github.com/pendingchaos/mesa/commit/b72d47f56ba971e6dbd06ae721e3b2c99f57dd99 ?
16:08 karolherbst: pendingchaos: remind me on Monday
16:08 pendingchaos:nods
18:16 Lyude: karolherbst: I thought I saw someone else post a patch for something like that recently
18:22 karolherbst: Lyude: huh?
18:22 karolherbst: I did a few days ago
18:23 karolherbst: well yesterday
18:24 Lyude: karolherbst: I meant the leaking connector
18:24 karolherbst: ahhh
18:27 karolherbst: Lyude: I have that weird issue that after reloading nouveau my internal screen stays black on one of my laptops
18:27 karolherbst: well if I put it into dedicated only mode
21:31 WoC: So, still no OpenCL ?
21:57 karolherbst: WoC: well, there is, but not upstreamed
21:57 karolherbst: why are you asking
22:36 WoC: karolherbst, trying to figure out if there is any possible use for nouveau. But for me, w/o OpenCL, it's useless
22:41 WoC: Would even settle for cuda :P
22:41 karolherbst: send patches :p