02:02azaki: karolherbst: the loading times seem to be about the same, although i'm running this from a hard drive, not an SSD, so maybe that is bottlenecking it or something.
02:02azaki: i used the gnome clock stopwatch to try and time it
02:03azaki: by the way, i dunno if i was supposed to try mesa-git, i actually used mesa 18.1.4
02:03azaki: and patched that
02:04azaki: it also seems the framerate dropped like 10 or so fps. however, the ground texture issue i think is better now, because previously the ground was just full of garbage textures
02:04azaki: but now the ground is all black, so it's actually playable
02:04azaki: i can see where i'm going at least now
02:04azaki: so i actually think this is a huge improvement
02:44rhyskidd: new nv docs: https://download.nvidia.com/open-gpu-doc/Compute-QMD/1/
02:57imirkin: pendingchaos: the join/joinat should be around the while loop, i think
03:00imirkin: karolherbst: we cap at 297 by default, but you can use hdmimhz to set other things
03:00imirkin: we don't allow HDMI 2.0 at all though, i think we need to set various regs to make it go
03:00imirkin: and/or check the capability somehow
03:00imirkin: both of the source and the sink
03:08HdkR: Oh snap, QMD docs :D
03:09imirkin: karolherbst: unfortunately i don't have a HDMI 2.0 source, so i couldn't add the support for it. and ben doesn't have a hdmi 2.0 sink.
03:10imirkin: the logical solution -- hdmi-over-ip -- hasn't quite materialized
03:15skeggsb_: i'd actually be surprised if it didn't "just work" if we allow it
03:16skeggsb_: the tables we use from the vbios for modesetting are selected by clock rate ranges, i'd expect the scripts would handle it
03:16skeggsb_: but, perhaps not
03:19imirkin: skeggsb_: well, we have to detect both source and sink support...
03:19imirkin: although i think if it's illegal, the sink isn't supposed to report those over edid...
04:04nyef: Hrm. Thirteen flags tested, waited for, or mutated by the NVAF context-switch microcode for which there is no information in rnndb, envydis, or the kernel.
06:49crmlt: Can I extract firmware from 340.107?
06:59crmlt: How should I do it?
07:03karolherbst: imirkin: well, checking the caps won't be a problem as I think I found a way to determine what max clocks we can use
07:17karolherbst: imirkin, skeggsb_: also, why do we have that duallink option, same reasons as for hdmimhz?
07:17crmlt: Can I ignore this https://hastebin.com/afopuqifeg.coffeescript ?
07:18skeggsb_: karolherbst: legacy we inherited from -nv, there were some early g8x boards that didn't have enough memory bandwidth to do it without reclocking
07:18skeggsb_: i think my nv50 is like that actually
07:18karolherbst: crmlt: I am sure you need to have the exact same version as stated in the script or something
07:19karolherbst: skeggsb_: I see
07:19karolherbst: skeggsb_: well we default it to be true and I think we can also read it out through the caps
07:20skeggsb_: we get that particular cap from DCB already, you can ignore that one
07:20skeggsb_: that's one of the things nvkm is supposed to do (match up hw caps against dcb and disable stuff)
07:20skeggsb_: we just trust dcb blindly
07:21skeggsb_: (we're supposed to do it so evo will throw an error instead of trying)
07:21skeggsb_: but, we modeset in the kernel anyway, so, not entirely necessary
07:21karolherbst: I see
07:21karolherbst: so if I would implement checking for that cap, we could basically remove that kernel parameter and check for that early on, right?
07:22skeggsb_: no, because it's working around memory bandwidth, *not* detecting if the board supports duallink
07:22skeggsb_: we determine support from dcb
07:23skeggsb_: the board i have which needs it supports duallink just fine, but without reclocking, you'll get a black screen if you try
07:23karolherbst: I see
07:23skeggsb_: the kernel option is just a workaround if people hit that issue
07:24skeggsb_: if we ever get stable g8x reclocking enabled by default *then* we can remove the option ;)
07:26karolherbst: skeggsb_: I see, I think I will just write the patch adding support for parsing out the max clocks to give it to others for testing
07:27karolherbst: it is just a bit stupid that with gm200 they split the TMDS/LVDS max clock cap
07:27skeggsb_: yeah, that use of caps is fine, and would be nice to see
07:27karolherbst: my GPU actually reports 600MHz, that's why I am wondering about HDMI 2.0
07:28skeggsb_: do you have access to a hdmi 2.0 sink?
07:28karolherbst: basically any 4k display would do, right?
07:28skeggsb_: it'd be cool to confirm if allowing it makes it "just work"
07:29karolherbst: yeah, thing is I would have to disable duallink for testing I guess
07:29karolherbst: but yeah, we have a 4k display in the office here I might be able to test things on
07:29karolherbst: skeggsb_: is there an advantage using single link over duallink except using one less link and potentially being able to drive more displays?
07:30skeggsb_: 4k@60, yeah
07:30skeggsb_: hdmi is only single link
07:30skeggsb_: dual-link dvi was never really popular
07:30skeggsb_:is using one such display right now, however
07:31karolherbst: I see
07:31skeggsb_: though, with a DP->DVI adapter (active type, since passive DP->TMDS is single-link too)
07:31HdkR: Really who wanted to pay for those super expensive 2560x1600 resolution monitors anyway? :P
07:32karolherbst: skeggsb_: yeah.. I also have an active HDMI to DP adapter, which is quite helpful for intel GPUs
07:32karolherbst: as they kind of suck with HDMI
08:05karolherbst: skeggsb_: https://github.com/karolherbst/nouveau/commit/00c0f88bf481d699af6fd8fcafc6d7268e756c61
08:07skeggsb_: i'm still somewhat wondering if we shouldn't save the results per-SOR... post-gm20x they should all be the same, but prior they might differ
08:07karolherbst: yeah.. maybe that is a bit simpler
08:07karolherbst: I think we should at least _parse_ them per sor
08:07karolherbst: if the consuming code loops over all and takes the worst value that would be better than the current code anyway
08:08skeggsb_: not necessarily ;)
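A minimal sketch of the "parse per SOR, then take the worst value" idea discussed above. The struct, its field, and the function name are hypothetical and are not taken from the nouveau source; as skeggsb_ notes, the lowest value is not necessarily the right choice on every generation.

```c
#include <stddef.h>

/* Hypothetical per-SOR capability record; the field name is illustrative only. */
struct sor_caps {
    unsigned int max_tmds_khz; /* maximum TMDS clock this SOR reports, in kHz */
};

/* Return the lowest max clock across all parsed SORs, or 0 if none were parsed. */
static unsigned int worst_max_tmds_khz(const struct sor_caps *caps, size_t count)
{
    unsigned int worst = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        /* The first entry seeds the result; later entries can only lower it. */
        if (i == 0 || caps[i].max_tmds_khz < worst)
            worst = caps[i].max_tmds_khz;
    }
    return worst;
}
```

Passing only as many entries as there are caps for keeps the loop independent of any fixed maximum SOR count.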
08:09karolherbst: skeggsb_: https://github.com/karolherbst/nouveau/commit/e16b9cd2629bcdbebc940da015814279f8b746b8
08:10karolherbst: could I do nv_encoder->dp.no_interlace = caps->dp[nv_encoder->or].no_interlace; ?
08:10karolherbst: in nv50_sor_create?
08:10skeggsb_: well... prior to gm20x, yes.. it makes no sense on gm200 and up because they're dynamically assigned
08:11skeggsb_: gm200.. i'd probably just pick the first SOR values.. not sure what nvidia do
08:11skeggsb_: we *were* told that all SORs should be equally capable (with the exception of LVDS/eDP, which are supposed to be fixed)
08:11karolherbst: I am mainly wondering when nv_encoder->or is written to, but I guess this happens at acquire time and this is after sor_create?
08:11skeggsb_: use "ffs(nv_encoder->dcb->or) - 1"
08:12skeggsb_: that'll deal with pre-gm200
08:13skeggsb_: and, actually, all sors being equal, that'll deal with gm200 too
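For reference, a small userspace sketch of what "ffs(nv_encoder->dcb->or) - 1" computes, assuming the DCB "or" field is a bitmask with one bit per output resource. The helper name is hypothetical; the POSIX ffs() used here behaves like the kernel's ffs() for this purpose (1-based index of the lowest set bit, 0 when no bit is set).

```c
#include <strings.h> /* POSIX ffs(); the kernel provides its own ffs() */

/*
 * Hypothetical helper: turn a DCB "or" bitmask into a 0-based SOR index.
 * Returns -1 when the mask is empty, because ffs(0) is 0.
 */
static int sor_index_from_dcb_or(unsigned int or_mask)
{
    return ffs((int)or_mask) - 1;
}
```

A mask of 0x4 gives index 2, and a mask with several bits set (for example 0x6) gives the lowest one, index 1.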
08:14karolherbst: I see
08:14karolherbst: well we know that the sors aren't equal in regards to lvds though
08:14skeggsb_: yes, that's a known exception
08:15skeggsb_: it's fine though, in that case sor/link and pad/link are supposed to be identical
08:15skeggsb_: so it'll still match
08:18karolherbst: skeggsb_: do we have a macro for the max sors? or is it simply 8?
08:19skeggsb_: it's 4 on earlier GPUs
08:19skeggsb_: just only parse as many as there are caps for
08:19karolherbst: skeggsb_: mhh in the 50 file they have sor0/1 and pror0/1/3
08:19karolherbst: uhm pior
08:20skeggsb_: uh, yeah, even less on even earlier GPUs :P
08:20karolherbst: well, I was mainly wondering if I should stick an 8 or a NOUVEAU_MAX_SORS in the header
08:39karolherbst: skeggsb_: https://github.com/skeggsb/nouveau/compare/92df3698bc35ddee9d58e0fb16bea1c715459310...karolherbst:fix_interlaced_reject
09:34RSpliet: karolherbst, skeggsb_: we could estimate both DRAM bandwidth and scan-out bandwidth quite easily. We could replace the static option with a routine that checks required bandwidth against provided DRAM bandwidth, which once reclocking is sorted out (...) can be replaced with code to enforce a minimum perf lvl for given display mode.
09:35RSpliet: The only caveat is that rejecting a mode is not helpful if we get users to manually change their perf lvl...
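A rough sketch of the bandwidth check RSpliet describes, under stated assumptions: scan-out demand is approximated from the visible area only (blanking and multiple heads are ignored), DRAM bandwidth is the theoretical peak from an effective transfer rate and bus width, and the share of that peak allowed for scan-out is a made-up tunable. None of these names exist in nouveau.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes per second needed to scan out the visible area (blanking ignored). */
static uint64_t scanout_bytes_per_sec(uint32_t hdisplay, uint32_t vdisplay,
                                      uint32_t refresh_hz, uint32_t bytes_per_pixel)
{
    return (uint64_t)hdisplay * vdisplay * refresh_hz * bytes_per_pixel;
}

/* Theoretical peak DRAM bandwidth: effective MT/s times bus width in bytes. */
static uint64_t dram_bytes_per_sec(uint64_t effective_mts, uint32_t bus_width_bits)
{
    return effective_mts * 1000000ULL * (bus_width_bits / 8);
}

/*
 * Accept the mode only if scan-out needs at most share_percent of the
 * DRAM bandwidth. share_percent is an assumed safety margin, not a
 * measured hardware limit.
 */
static bool mode_fits(uint64_t need, uint64_t have, uint32_t share_percent)
{
    return need * 100 <= have * share_percent;
}
```

The same comparison could later pick a minimum perf level instead of rejecting the mode: evaluate dram_bytes_per_sec() for each level and choose the lowest one for which mode_fits() holds.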
09:36karolherbst: RSpliet: with my clocking stuff I kind of added the concept of a max clock
09:36karolherbst: we could do the same just the other way around
09:36karolherbst: and selecting a lower clock would simply fail
09:37RSpliet: For these older (and even older) cards we'll need a lower bound anyway for scanout
09:38karolherbst: yeah, I know. I have such a GPU
09:38karolherbst: which needs 0xf for 1920x1080 already
09:38karolherbst: or whatever the highest perf level on tesla is
11:35karolherbst: skeggsb_: what do you think of this: https://github.com/karolherbst/nouveau/compare/master_4.17...karolherbst:fix_interlaced_reject
11:39karolherbst: I guess I could add it for lvds as well
11:57karolherbst: nice, today I get those EVO timeouts....
11:57karolherbst: on boot
13:48karolherbst: skeggsb_: uhm, we call nv50_mstm_init for every encoder except the dpmst ones?
13:48karolherbst: shouldn't it be like only called on DRM_MODE_ENCODER_DP or DRM_MODE_ENCODER_DPMST ones?
13:50karolherbst: in either case
13:50karolherbst: having the dp.mstm inside a union doesn't sound like a good idea
13:58karolherbst: "[drm:drm_mode_config_cleanup] *ERROR* connector eDP-3 leaked!" mhh, also not so nice
13:58karolherbst: Lyude: do you know if you took care of such a bug already?
13:58karolherbst: leaking connectors?
15:01karolherbst: pendingchaos: do we have piglit tests for MS bindless images?
15:01pendingchaos: I don't think so
15:02karolherbst: would be good if you add some so that we can verify that your patch fixes it and we don't regress it
15:15pendingchaos: karolherbst: would creating a modified basic-imageStore.shader_test be good?
15:15pendingchaos: or should more than that be added
15:15karolherbst: yeah, that would be good enough
15:15karolherbst: as long as it fails without your patch and does what we expect in the fixed state
15:15pendingchaos: that seems to be the case
15:15karolherbst: pendingchaos: do you have a 4k HDMI display by any chance? I don't have one in reach right now
15:16pendingchaos: no, I don't
15:18karolherbst: pendingchaos: maybe a 2560×1440 one?
15:18azaki: X crashed, so I may have missed some messages
15:18azaki: karolherbst: did you see my feedback yesterday?
15:18pendingchaos: sorry, just 1920x1080
15:18karolherbst: sad :(
15:18karolherbst: azaki: maybe?
15:19karolherbst: ahh yes
15:19karolherbst: azaki: ahhh, so it is black now
15:19karolherbst: this basically means this patch isn't _that_ wrong
15:19karolherbst: I was mainly hacking around stuff
15:19karolherbst: mind doing another apitrace?
15:19karolherbst: but uhm
15:19karolherbst: kind of cache the game before
15:20karolherbst: and try even smaller resolution
15:20azaki: yeah it's certainly an improvement. even though the fps went down a bit. at least i can see what's going on now.
15:20karolherbst: maybe you are able to create a ~15GB trace
15:20karolherbst: azaki: well, a shader didn't compile
15:20azaki: although loading times stayed the same.
15:20karolherbst: yeah, most likely simply caching on my end
15:20karolherbst: I have 32GB RAM on that machine, so the trace nearly fits in
15:20azaki: i do have 32 GB of ram too though. XD
15:21azaki: DDR3 though.
15:21azaki: not DDR4
15:21karolherbst: I have that on my other laptop :D
15:21azaki: i don't have swap though. which is probably a huge mistake. but eh
15:21karolherbst: another trace, but smaller would be good
15:21karolherbst: as I think wine was generating garbage
15:21karolherbst: as for me the replay still showed the garbage floor
15:22karolherbst: not black
15:22azaki: yeah i tried the replay, it still shows garbage. but when you actually play the game itself, it's black.
15:24karolherbst: azaki: do you have a fermi or kepler gt 630?
15:24karolherbst: okay nice
15:24azaki: it's the OEM kepler, 192 'cuda cores'
15:24karolherbst: I don't remember if you were able to reclock, but with that one, you are
15:24azaki: not the retail one
15:24karolherbst: so the GK107 one
15:25azaki: i did reclock.
15:25azaki: 0f: core 324-875 MHz memory 1782 MHz *
15:25azaki: AC: core 875 MHz memory 1782 MHz AC DC
15:25karolherbst: this is basically the highest possible then :)
15:25azaki: i've actually kept it running at highest clock, i dunno if that's bad or not
15:25karolherbst: azaki: did you use NvBoost or is it without it?
15:26azaki: without, i'm not sure what nvboost even is.
15:26karolherbst: azaki: you might want to boot with nouveau.config=NvPmEnableGating=1
15:26karolherbst: azaki: uhm, raising the max clocks
15:26karolherbst: like you have the base/boost clocks on GPUs
15:26karolherbst: and we default to base
15:26karolherbst: but the pstate lines show the actual max clock
15:26karolherbst: and the AC line would be simply capped
15:26karolherbst: but as you already reach the top, there is nothing you could do
15:27karolherbst: that NvPmEnableGating reduces power consumption and also heat generation
15:27karolherbst: might be a good idea if your GPU stays at 0xf
15:27karolherbst: you should notice the GPU being a bit cooler with that, but without a power sensor measuring power consumption it is a bit hard to tell
15:31azaki: ah, ok
15:31azaki: the temp isn't too bad right now. it's at 53 C
15:35azaki: i have to go out in a bit. when i'm back i'll do the second apitrace
15:59pendingchaos: karolherbst: if you have access to a Kepler card, do you mind making sure this doesn't break bound and bindless image load/store sometime: https://github.com/pendingchaos/mesa/commit/b72d47f56ba971e6dbd06ae721e3b2c99f57dd99 ?
16:08karolherbst: pendingchaos: remind me on Monday
18:16Lyude: karolherbst: I thought I saw someone else post a patch for something like that recently
18:22karolherbst: Lyude: huh?
18:22karolherbst: I did a few days ago
18:23karolherbst: well yesterday
18:24Lyude: karolherbst: I meant the leaking connector
18:27karolherbst: Lyude: I have that weird issue that after reloading nouveau my internal screen stays black on one of my laptops
18:27karolherbst: well if I put it into dedicated only mode
21:31WoC: So, still no OpenCL ?
21:57karolherbst: WoC: well, there is, but not upstreamed
21:57karolherbst: why are you asking
22:36WoC: karolherbst, trying to figure out if there is any possible use for nouveau. But for me, w/o OpenCL, it's useless
22:41WoC: Would even settle for cuda :P
22:41karolherbst: send patches :p