10:48hrw: can someone tell me how good/bad status of nouveau on 1050ti is?
10:48hrw: desktop (4.16.x kernel, two fullhd monitors, kde) runs fine but diablo3 is unplayable at 9-11fps
10:55karolherbst: hrw: you have to reclock the GPU
10:55hrw: karolherbst: how?
10:56karolherbst: cat it to get the perf states
10:56karolherbst: and echo the state into it to set it
10:56karolherbst: like you get 07: ... 0f: ... and so on
10:56karolherbst: last line is current status with DC/AC indicating the power source
10:56hrw: [root@puchatek 0]# cat pstate
10:56hrw: cat: pstate: No such device
10:57hrw: -rw-r--r--. 1 root root 0 04-16 17:30 pstate
10:57karolherbst: hrw: can you pastebin dmesg?
10:58hrw: karolherbst: https://paste.fedoraproject.org/paste/sGdhy61PJgwkZW6KWfkzUg
10:58karolherbst: ohhh wait
10:58karolherbst: this is a 1050 ti
10:58karolherbst: my mistake
10:59karolherbst: I was thinking about a 750 ti... for whatever reason
10:59hrw: no reclocking yet?
10:59karolherbst: yeah, basically we are locked out :(
10:59karolherbst: we need distributable signed firmware from nvidia or a way to extract those and add code against their firmware
11:00karolherbst: it's work in progress
11:00hrw: so for gaming still nvidia?
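[Editor's sketch] The pstate interface karolherbst describes at 10:56 (read the file to list performance states, write a state id back to select one) could be parsed roughly as below. This is a minimal sketch, not nouveau's own tooling: the sample text and its field layout are assumptions made for illustration, and the log does not give the full path of the pstate file. As the log shows, reading it on the 1050 Ti returned "No such device", so this only applies to GPUs where reclocking is available.

```python
import re

# Illustrative sample only; the exact field layout is an assumption.
SAMPLE = """\
07: core 405 MHz memory 810 MHz
0f: core 1020 MHz memory 5400 MHz
AC: core 405 MHz memory 810 MHz
"""

def parse_pstate(text):
    """Split pstate output into (available state ids, current-status entry)."""
    states = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        label, _, rest = line.partition(":")
        if label in ("AC", "DC"):
            # Last line: current clocks, tagged with the power source.
            current = (label, rest.strip())
        elif re.fullmatch(r"[0-9a-fA-F]{2}", label):
            # Two hex digits: an id that can be written back to the file.
            states.append(label)
    return states, current

states, current = parse_pstate(SAMPLE)
print(states)
print(current)
```

Selecting a state would then mean writing one of the listed ids (for example 0f) into the same file as root.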
14:03karolherbst: fun: nvc0_set_global_handle:1318 - Cannot map into TGSI_RESOURCE_GLOBAL: resource not contained within 32-bit address space !
14:04karolherbst: imirkin: should I be worried about that?
14:04imirkin_: nfc what set_global_handle does
14:04karolherbst: I just created a 1.5GB memory buffer :(
14:04imirkin_: some compute-related thing
14:05imirkin_: not sure why 32-bit would be an issue, but probably is given how the code works
14:05karolherbst: I've tried a 20000x20000 matrix multiplication
14:05imirkin_: [which i haven't looked at]
14:05karolherbst: maybe yeah
14:05karolherbst: maybe some of the hmm patches are screwing around here
14:06imirkin_: it's definitely an issue on nv50
14:06imirkin_: perhaps earlier nvc0 compute versions wanted to use the 32-bit load/store accessors, for which this could all be an issue.
14:06karolherbst: mhh well
14:06karolherbst: I am on gp107
14:06imirkin_: i mean versions of the code.
14:07karolherbst: 10000x10000 of floats works..
14:07imirkin_: i don't think set_global_handle is used
14:07imirkin_: except with clover
14:07karolherbst: I guess 3x 1.5GB = 4.5GB
14:07imirkin_: so it's been untested for years
14:07karolherbst: and that goes above 4GB
14:07imirkin_: or at least unreviewed
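[Editor's sketch] The arithmetic behind karolherbst's guess at 14:07 can be checked quickly. This assumes 32-bit floats and three equally sized buffers (two inputs and one output), which matches "3x 1.5GB" but is not stated explicitly in the log; whether the buffers actually sit back to back in the address space is also an assumption.

```python
BYTES_PER_FLOAT = 4      # assumes 32-bit floats
LIMIT_32BIT = 2 ** 32    # size of a 32-bit address space (4 GiB)

def matmul_footprint(n, buffers=3):
    """Return (bytes per n x n float matrix, bytes for all buffers)."""
    per_buffer = n * n * BYTES_PER_FLOAT
    return per_buffer, per_buffer * buffers

for n in (10000, 20000):
    per_buffer, total = matmul_footprint(n)
    print(n, round(per_buffer / 2 ** 30, 2), round(total / 2 ** 30, 2),
          total > LIMIT_32BIT)
```

Under these assumptions each 20000x20000 matrix takes about 1.5 GiB and the three together exceed 4 GiB, while the 10000x10000 case stays far below the limit, which is consistent with the smaller run working.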
20:23Lyude: btw mupuf, other than the actual mysterious power sensor thing we talked about yesterday, do you have any idea on an alternative method for figuring out how much power the GPU is using? karolherbst said you might know some tricks for that
20:56mupuf: Lyude: hmm, not sure what he may have been hinting at
20:57Lyude: ah. Well; do you happen to have any maxwell1 systems with working power consumption sensors on them that I could use to try out the clockgating stuff with at some point?
20:57mupuf: the one plugged in reator is at your disposal :)
20:58mupuf: but I think this is the same as yours
20:58Lyude: aw. well, did you see anything useful from the i2c stuff you spotted yesterday?
20:58imirkin_: Lyude: fwiw i have a GTX 745.
20:59Lyude: oh, that might work
20:59imirkin_: pretty sure it's this one: https://people.freedesktop.org/~imirkin/traces/gm107-vbios.rom
20:59mupuf: Lyude: well, one could decode the transactions from the input
20:59mupuf: and see what it contains
20:59Lyude: mupuf: sure thing :), mind telling me how you changed the i2c lines though?
20:59mupuf: sigrok can do the decoding for you, if you need formatting
20:59mupuf: oh, wait!
20:59mupuf: I have a tool for you :D
21:00mupuf: checking if it works on my box
21:00Lyude: yeah I played with that! but doesn't it not show anything if you have the i2c lines hooked up to the pmu?
21:00Lyude: i sure don't think I saw anything
21:00karolherbst: mupuf: that thing where you don't use the power meters
21:01mupuf: karolherbst: but that means you need to compute the power usage
21:01karolherbst: or maybe there are other ones besides the i2c devices
21:01karolherbst: yeah exactly
21:01mupuf: if you want to see changes, then maybe the default weights are OK
21:01karolherbst: but nvidia-smi seems to report on cards without i2c sensors
21:02Lyude: karolherbst: mm, that's what started part of the convo in the first place. we thought they might have been using 0x0200f8, but they don't seem to pay attention to the value there
21:02Lyude: (since it's not calibrated, of course)
21:02karolherbst: my hope was that mupuf knows more
21:03mupuf: Lyude: https://pastebin.com/XYut9P8S
21:03mupuf: looks like nvidia is actually scanning for devices :D
21:03karolherbst: mupuf: nvidia is spamming the bus like crazy anyway
21:04Lyude: anyway: i'm suspicious there's something else being used here for reporting this, because i would have expected to see something slightly more suspicious in demmio
21:04karolherbst: Lyude: maybe it is something super trivial and they do it completely in software
21:04Lyude: it's not possible they might be measuring the power consumption using something other than the GPU itself
21:04Lyude: is it
21:04karolherbst: load * SM amount * whatever + whatever2
21:04karolherbst: or so
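[Editor's sketch] karolherbst's guess at 21:04 is a simple linear model: load times SM count times a weight, plus an offset. A minimal version is below. The function name, the weight, and the offset are all hypothetical placeholders; the log gives no real coefficients and does not confirm that nvidia-smi works this way.

```python
def estimate_power(load, sm_count, weight, offset):
    """Linear estimate: load * SM count * weight + offset.

    load is a utilisation fraction in [0, 1]; weight and offset are
    hypothetical tuning constants, not values taken from any driver.
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be a fraction between 0 and 1")
    return load * sm_count * weight + offset

# Made-up numbers, purely to show the shape of the calculation.
print(estimate_power(0.5, 6, 8.0, 10.0))
```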
21:05imirkin_: Lyude: let me know if based on that vbios, the gpu is interesting to you
21:05karolherbst: Lyude: there are tables adjusting this I think. There might be tables that tell more about power consumption?
21:05imirkin_: if it is, i can try to plug it in in the next week or two
21:05Lyude: imirkin_: sure thing
21:05Lyude: karolherbst: possibly, yeah
21:05imirkin_: otherwise i'll go to the GK208
21:06mupuf: the bus 0 is polled a lot when calling nvidia-smi a1:ff =R=> 0000 (status = 6, polls = 0295): ERR
21:06mupuf: but it is odd that it cannot decode the transaction, so it really looks like there is nothing on the bus
21:07karolherbst: mupuf: try forcing high load and see what nvidia-smi does
21:07karolherbst: or not soo high load
21:07karolherbst: so it doesn't reclock
21:08mupuf: Lyude: I am sure some performance counters will tell you when you get clock gated or not
21:09mupuf: at least the high-level ones
21:09mupuf: but otherwise, just use the hw-based computation of the power
21:10Lyude: imirkin_: tbh, mostly just interested in maxwell1 since we've got most of the stuff figured out for kepler, other than some vbios tables I need to investigate at some point
21:10mupuf: Lyude: what's different with maxwell 2?
21:10mupuf: because there, we have the power sensor working
21:10Lyude: mupuf: nothing other than we can't reclock with a fan
21:10karolherbst: don't have any load :p
21:10imirkin_: Lyude: right, but can you look at that vbios and let me know if there's "interesting stuff" there?
21:10karolherbst: "it's fine (tm)"
21:11imirkin_: otherwise me plugging it in won't do you much good
21:11Lyude: Lyude: sure, one sec
21:11mupuf: karolherbst, Lyude: but you can do the bring up of clock gating there
21:11mupuf: and test it there
21:11Lyude: *oops *imirkin_
21:11karolherbst: mupuf: sure
21:11karolherbst: mupuf: and I've done it and it had quite interesting results
21:11mupuf: I am sure you will see a drop in temperature anyway on maxwell1
21:11Lyude: mupuf: ah true; getting this working on maxwell2 should just require some tracing + more fancy vim macros
21:11mupuf: Lyude: well, at least, power reading can be done through nouveau
21:12mupuf: although we apparently need to fix quite a lot of things there with karol :D
21:12karolherbst: Lyude: you can reclock just fine on maxwell2 and nothing will happen to your GPU :p
21:12Lyude: ah cool, even with a fan?
21:12mupuf: karolherbst: why reclock? This has nothing to do with reclocking
21:12karolherbst: Lyude: even putting load is fine
21:12karolherbst: the GPU just gets hotter
21:12karolherbst: but the fan is still spinning
21:12karolherbst: it just doesn't spin faster
21:13karolherbst: mupuf: better to compare the effects
21:13karolherbst: on low clocks sometimes the difference is quite small
21:13mupuf: sure thing
21:14mupuf: anyway, rebooting now. The blob is giving me trouble while watching a bluray. I get 10 minutes before the video gets black
21:14mupuf: Pascal support on kde is also problematic :D
21:14Lyude: imirkin_: yeah, I don't think that one is much better since I'm not seeing any ina sensors in your vbios
21:16Lyude: anyone have any issue with me reindenting some tab/space mixing in rnndb?
21:16mupuf: Lyude: what? I bought it for this reason
21:16mupuf: let me check
21:17imirkin_: Lyude: yeah, i suspected as much. it's a low-end thing
21:17mupuf: Lyude: ./gm206/mupuf/vbios.rom
21:20Lyude: mupuf: oh; nvbios won't even parse the i2c table on your vbios correctly. BIT 'i' table has strange size: 0x0048 > 0x0044
21:20mupuf: Lyude: it is always found in the extdev ;)
21:21Lyude: that doesn't parse either though..
21:21Lyude:takes a look
21:21mupuf: Lyude: maybe it is time to pull and recompile? :D
21:21mupuf: EXTDEV 0: type 0x4e [INA3221] at 0x80 defbus 0
21:22Lyude: oh!! i never had nvbios pointed at my envytools prefix, whoops
21:23Lyude: there we go
21:24Lyude: mupuf: cool, wanna just go and update rnndb with what I found then I'll go do the traces I need to update the blcg/slcg list for m2, unless they're the same between the two gens (which would be a bit surprising)
21:25mupuf: Lyude: it seemed quite stable, as far as I remember
21:26Lyude: what, m2?
21:26mupuf: no, I mean, changes in the addresses
21:26mupuf: but there may be new clockgating there
21:27mupuf: IIRC, they reduced the amount of power domains
21:27mupuf: or was it on pascal?
21:27Lyude: ahh, yeah the addresses don't change but the values usually do
21:27Lyude: Oh you meant in response to rnndb
21:27mupuf: yes, sorry
21:27Lyude: there are actually a couple of registers that I noticed that aren't actually listed, or might be listed with the wrong gen in rnndb
21:27mupuf: for the values, I never checked :D
21:27mupuf: yeah, I used mwk's scans to populate rnndb
21:28mupuf: I started from gk20a's source
21:28Lyude: oh btw that reminds me: if anyone could review the k2 fixup for clockgating I posted that would help
21:29karolherbst: Lyude: uhm, how old was your envytools?
21:29Lyude: the rnndb was up to date, it was just nvbios and friends that weren't
21:37mupuf: Lyude: never make install :D
21:37Lyude: nah-i've got a very specific way of doing things i like :), plus I just entirely uninstalled the ancient envytools package on my system so it wouldn't happen again anyhow
22:45Lyude: By the way-how do you get the strap peek thing?
23:18karolherbst: Lyude: 202000
23:20Lyude: karolherbst: ah, just output that to a register?
23:21karolherbst: that is the register
23:21karolherbst: ah yeah
23:21karolherbst: call it strap_peek
23:21karolherbst: then nvbios picks it up
23:21karolherbst: just put it besides the vbios.rom file
23:22Lyude: cool; I will go and grab the strap_peek for everything I've uploaded so far next time I get a chance
23:27imirkin: Lyude: iirc 101000
23:27imirkin: dunno if karol typoed it
23:31karolherbst: uhm, yeah I typoed it
23:49Lyude: looks like we may not have figured out all of clockgating
23:50Lyude: interesting.... it seems we don't touch CG_CTRL at all on maxwell2
23:50Lyude: *the blob does not touch
23:58Lyude: karolherbst: what's the chance that nvidia's controlling the CG_CTRL registers through the PMU?
23:58Lyude: i'm mostly sure they moved this over for reasons that, as that changing CG_CTRL register shows, have a chance of complicating things
23:59Lyude: (this is as of maxwell2, maxwell1 and previous generations never show anything like this)