00:46 karolherbst: imirkin: random thought: shader global arrays with constants, indirect access on that array. In those cases it should make sense to have that immediate -> const buffer optimization, right?
00:46 karolherbst: wondering if we should allocate some space inside the driver const buf for those constants
00:54 HdkR: How often does that occur?
00:55 HdkR: Seems like a reasonable optimization if it happens in some heavy shader that does it a lot :P
00:57 karolherbst: no clue
01:00 imirkin: karolherbst: yeah. glsl compiler does that.
01:00 imirkin: iirc, at least.
01:00 karolherbst: mhh, okay, but does it just create an ubo for it or how does the backend get to know about it?
01:01 imirkin: karolherbst: https://cgit.freedesktop.org/mesa/mesa/tree/src/compiler/glsl/lower_const_arrays_to_uniforms.cpp
01:01 karolherbst: ahh
01:01 imirkin: nah, they come in as regular (ubo 0) program uniforms
01:01 karolherbst: k
01:02 imirkin: it doesn't capture EVERY case, obviously
01:02 imirkin: but at least some cases.
01:02 karolherbst: yeah, true
01:03 imirkin: courtesy of Kayden i believe
01:03 imirkin: but i encouraged the notion :)
01:42 JayFoxRox: this might be an odd question, but is there any good way to quick-reset an NV2A / NV20? I don't care how brutal it is: I just want to return the hardware to a state so a new kernel can boot and take over. I don't really want to go through a lot of custom de-initialization code to be honest
01:48 karolherbst: imirkin: ohh, vdpau + gl was also causing issues, right?
01:48 karolherbst: mpv was doing that?
01:48 karolherbst: or mplayer?
02:37 imirkin: mpv
02:38 imirkin: JayFoxRox: not sure, but perhaps twiddling 0x200 will do it?
02:39 imirkin: mwk will know... see JayFoxRox's question --^
02:42 JayFoxRox: that's something I've tried: 0x200 to 0x00000000, but it didn't seem to do anything for me :( [or rather: when I did my custom kexec (working on an xbox, loading the MS bios), the xbox did lock up]. I already have some rather complicated code which manually drains CACHEs, empties FIFOs etc. but it's a lot of work and quite fragile
02:49 imirkin: JayFoxRox: there's also sometimes a vbe reset that can be called
02:49 imirkin: although dunno that xbox would have that
02:51 JayFoxRox: what is "VBE"?
03:12 imirkin: video bios extensions iirc
12:32 karolherbst: imirkin: with my print changes: https://gist.githubusercontent.com/karolherbst/6a12b1c99db1dc6d49dfefb37c2b6836/raw/75e2486ba6ab520f0dc9a9dfc118d013e44288f8/gistfile1.txt
12:32 karolherbst: for TXG
12:48 karolherbst: Lyude: yeah, something is odd with those patches. I get 0W power savings
12:51 karolherbst: ohh wait
12:51 karolherbst: forgot to set NvPmEnableGating
12:52 karolherbst: Lyude: 60 -> 40 W on 0xf
12:52 karolherbst: 19 -> 15W on 0x7
13:03 karolherbst: imirkin: example with a non 0xf mask: "tex 2D_SHADOW $r8 $s0 r___ f32 %r37 %r31 %r33 %r33"
13:06 karolherbst: anyway, will run piglit today on my patches rebased on master
13:45 karolherbst: mhh, was able to hit the black screen issue again, this time I also got a shader eviction warning, so let's see whats going on there
13:55 Sarayan: shader didn't pay the rent?
13:56 RSpliet: Probably shader ignored fire safety regulations
14:02 Sarayan: RS: that's when the laptop starts burning because reclocking/fan control was fucked?
14:30 RSpliet: While the shader is in an infinite loop, yes
14:32 karolherbst: imirkin: no regressions in piglit
14:35 Sarayan: karolherbst: congrats, that's always nice :-)
16:05 karolherbst: :O
16:05 karolherbst: crap
16:05 karolherbst: mupuf: I got 110W on my GPU
16:05 karolherbst: I am sure my power budget is 80W
16:06 mupuf: Well, time to add a power limiter ;)
16:06 mupuf: It's a good problem to have :)
16:06 karolherbst: thing is
16:06 karolherbst: on my vbios I don't have that max entry pointer :/
16:06 karolherbst: so more REing needed
16:07 karolherbst: but totally crazy what hits that
16:07 karolherbst: mupuf: some game main menu, settings screen, nothing changes
16:07 karolherbst: vsync off: 40 W -> 110W
16:08 mupuf: Meh, who cares about max? Throttle to stay in budget
16:08 karolherbst: uhm
16:08 karolherbst: question: what is the power budget ;)
16:08 mupuf: Just like over temperature
16:08 mupuf: I thought we had this REed?
16:08 karolherbst: only for special cases
16:09 mupuf: Ok. I assume you'll see a budget per power lane
16:09 karolherbst: mupuf: https://gist.githubusercontent.com/karolherbst/ec9e8ac31c1dac07dee19b30de37b04a/raw/535fc830f2fa68abec39810d5fd52067b599c563/gistfile1.txt
16:09 karolherbst: "nvidia-smi cap entry" is the thing I noticed
16:10 karolherbst: and it was directly affecting what nvidia-smi printed as the budget
16:10 karolherbst: but not all cards come with that
16:10 karolherbst: (and the driver used that value as well)
16:10 karolherbst: mhh, maybe my budget is 100W?
16:11 mupuf: I doubt it.
16:12 karolherbst: well
16:12 karolherbst: the cooling system is good enough
16:12 karolherbst: 68.0°C @ 105W
16:13 karolherbst: the CPU is actually hotter than the GPU
16:13 mupuf: that is not the only reason to limit the power consumption
16:14 mupuf: the voltage regulator and power delivery is designed to handle the sustained power usage
16:14 mupuf: no more
16:14 mupuf: you may blow a trace if you consume too much for too long
16:14 mupuf: I would assume nvidia would
16:16 karolherbst: because it is identical to maxwell1?
16:16 karolherbst: more or less
16:16 karolherbst: mupuf: we just don't enable it as we aren't able to control the fans ;)
16:16 karolherbst: everything else works
16:16 karolherbst: I just have some local patches
16:17 karolherbst: and this is a laptop, so fans are EC controlled
16:17 mupuf: I see. Well, we could enable it for laptops since the fan is not controlled by us
16:17 mupuf: yeah, but anyway, we need to make sure we stay in budget
16:17 mupuf: the maximum clocks have been designed with clock gating in mind, and it is not enabled by default yet
16:18 mupuf: that should help, but I am sure there is plenty of other things we are doing wrong
16:18 karolherbst: I have also Lyudes clock gating patches :p
16:18 karolherbst: for maxwell2
16:19 karolherbst: and those were dropping idle consumption at 0xf from 60W to 40W
16:19 mupuf: not bad
16:19 mupuf: but still super high for an idle GPU
16:19 karolherbst: yeah
16:19 mupuf: would be nice to compare that with the blob
16:19 karolherbst: it idles at 15W with 0x7
16:19 karolherbst: which is sane
16:19 karolherbst: mupuf: remember, this is a gm204
16:19 mupuf: not really
16:20 mupuf: well, please measure with the blob and see the power difference at the same clocks ;)
16:20 karolherbst: yeah
16:21 mupuf: that really makes me uneasy that we would get to 110W for a potential 80W-TDP GPU
16:21 karolherbst: mhh, nvidia-smi is useless :/
16:21 karolherbst: Cap: N/A
16:21 karolherbst: idles at 8W though at 0x7
16:22 karolherbst: 20W at 0xd
16:22 karolherbst: doesn't get up to 0xf by default...
16:22 karolherbst: weird
16:23 mupuf: karolherbst: have you tried disabling DVFS in nvidia-settings?
16:23 mupuf: it should go to the base clock
16:24 karolherbst: there are no base clocks
16:24 karolherbst: not on that GPU
16:24 karolherbst: or well
16:24 karolherbst: there are no boost clocks
16:25 karolherbst: mupuf: mhhh... maybe the power readi
16:26 karolherbst: no
16:26 karolherbst: not directly
16:26 karolherbst: there are I2C sensors
16:27 Sarayan: I'm used to i2c temperature sensors, I didn't realize there could be power ones too
16:27 karolherbst: mupuf: I can hardly believe that with max fans the temperature wouldn't change much between 50W and 105W
16:28 karolherbst: 62.0°C vs 68.0°C... but 52W vs 105W?
16:28 mupuf: karolherbst: agreed
16:28 mupuf: we may read the power incorrectly
16:28 karolherbst: yeah
16:28 mupuf: brb
16:28 karolherbst: should fix that first before power capping :p
16:29 karolherbst: Sarayan: we have INA3221 ones
16:29 mupuf: +1
16:29 * Sarayan checks the datasheet
16:29 karolherbst: Sarayan: INA219 on older GPUs, like fermi
16:30 mupuf: I wonder if we use the wrong shunt resistor values, or if we have other issues
16:30 karolherbst: the calculation might be flawed
16:30 mupuf: I can't remember much in the way of calibration though
16:30 karolherbst: "power rail 0: unk0 = 0x1, extdev_id = 0, shunt resistors = {5 mOhm (unk 0), 5 mOhm (unk 0), 255 mOhm (unk ff)}, config = 0x6607"
16:30 karolherbst: is how we parse it
16:31 Sarayan: bios data?
16:31 karolherbst: yeah
16:31 mupuf: Do we write the config to the config reg?
16:31 Sarayan: 0/0/ff doesn't sound very nice
16:31 karolherbst: mupuf: yes
16:31 karolherbst: Sarayan: 5/5/ff ;)
16:31 karolherbst: ff is obviously disabled
16:32 Sarayan: ah yeah, 5 is better
16:32 mupuf: But it's surprising you have 2 power lanes
16:32 karolherbst: code is inside drm/nouveau/nvkm/subdev/iccsense
16:32 karolherbst: mupuf: yeah...
16:33 karolherbst: I just need to debug that a little
16:33 Sarayan: Dedicated hardware circuitry on the GTX 580 graphics card performs real-time monitoring of current and voltage on each 12V rail (6-pin, 8-pin, and PCI-Express).
16:33 Sarayan: your laptop may have two rails?
16:33 karolherbst: MXM
16:34 mupuf: MXM is one and should bring 80W
16:34 karolherbst: the older mxm yes
16:40 karolherbst: yeah, most likely
16:41 mupuf: So, seems like we do a good job on your gpu
16:42 karolherbst: yeah... things aren't as bad :) stuff runs pretty well overall
16:43 mupuf: Except the "blown up budget" part
16:43 karolherbst: well, I assume the readings are wrong
16:44 karolherbst: nvidia is now at 64C@52W
16:44 mupuf: Can you check the total power consumption of your laptop?
16:44 karolherbst: maybe?
16:44 karolherbst: ohh wait, yeah, I REed that actually
16:44 mupuf: Quite likely ;)
16:44 karolherbst: the EC tells me
16:45 mupuf: Good, then you can check all things being equal on the blob and nouveau
16:45 karolherbst: :D
16:45 mupuf: Bbl
16:45 karolherbst: nvidia now: 260MHz core
16:45 karolherbst: :D
16:45 karolherbst: right...
16:46 karolherbst: -4.54 A @ +14.77 V
16:47 karolherbst: mupuf: 30W power cap on battery
16:47 karolherbst: for the GPU
17:50 mupuf: karolherbst: does the EC report the power when charging?
18:01 karolherbst: mupuf: it only reports whatever goes in/out of the battery
18:45 mupuf: karolherbst: I see. A little annoying but you can still use the idle power consumption for calibration... I guess
18:46 mupuf: just read the power with nvidia-smi, then use the tool I made to spy on the i2c bus
18:46 mupuf: and you'll see if we read the sensor correctly
18:50 Lyude: mupuf: jfyi: the not-as-good-as-expected power consumption with clockgating is most likely due to an issue I mentioned to karolherbst: I think some of the engine init stuff past kepler2 is slightly moved around in nouveau such that we accidentally load the clkgate pack after enabling the main CG registers, so it may or may not be leaving the actual applied cg settings at whatever they were in the vbios
18:50 Lyude: I likely just need to figure out where to put the init functions for the packs so they get loaded earlier
18:51 karolherbst: no, the power consumption nouveau reads out on my GPU is just wrong
18:51 Lyude: oh
18:51 Lyude: well either way testing locally the power savings I've seen aren't what they should be
18:59 karolherbst: mupuf: mhh, so it seems like nouveau only creates two rails for real :/
19:02 karolherbst: mhh
19:02 karolherbst: some random person on nvidia boards: "I forgot to mention that, after shifting, the shunt and bus values have to be scaled by 40 uV and 8 mV, respectively (based on the INA3221 datasheet). 40 uV is the LSB for the shunt-voltage register and 8 mV is the LSB for the bus-voltage register."
19:02 karolherbst: but I do that...
19:03 karolherbst: or.. wait
19:47 karolherbst: mhhh
19:47 karolherbst: no, I am sure I parse the values correctly
19:48 karolherbst: rail0: shunt: 1f0 bus: 4bb0 rail1: shunt: f8 bus: 4bb0
19:48 karolherbst: those are signed values, but have to be right shifted by 3
19:49 karolherbst: bus: 2422 shunt0: 62 shunt1: 31
19:50 karolherbst: lsb for shunt is 40uV for bus is 8mV
19:50 karolherbst: bus: 19.376V
19:51 karolherbst: shunt0: 0.002480V shunt1: 0.001240V
19:51 karolherbst: shunt resistance is 5 mOhm
19:52 karolherbst: shunt0: 0.496A shunt1: 0.248A.
19:53 karolherbst: 19.376V * (0.496A + 0.248A) = 14.415744W
19:53 karolherbst: mupuf, Sarayan: ^^ this sounds correct, no?
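The arithmetic karolherbst walks through above can be reproduced with a few lines of Python. This is only a sketch of the calculation as described in the log (signed 16-bit registers, right shift by 3, 40 uV and 8 mV LSBs, 5 mOhm shunts); the function names are made up for illustration and are not taken from the nouveau iccsense code.

```python
# Sketch of the conversion described in the log; names are hypothetical.
SHUNT_LSB_V = 40e-6   # 40 uV per count on the shunt-voltage register
BUS_LSB_V = 8e-3      # 8 mV per count on the bus-voltage register


def to_signed16(raw):
    """Interpret a raw 16-bit register value as two's complement."""
    return raw - 0x10000 if raw & 0x8000 else raw


def reg_to_counts(raw):
    """The value sits in the upper 13 bits, so shift right by 3."""
    return to_signed16(raw) >> 3


def rail_power(bus_raw, shunt_raws, shunt_ohms=0.005):
    """Return (bus volts, total amps, watts) for shunts sharing one bus."""
    bus_v = reg_to_counts(bus_raw) * BUS_LSB_V
    amps = sum(reg_to_counts(s) * SHUNT_LSB_V / shunt_ohms for s in shunt_raws)
    return bus_v, amps, bus_v * amps


# Raw values read under nouveau, as quoted in the log.
bus_v, amps, watts = rail_power(0x4BB0, [0x1F0, 0xF8])
print(f"{bus_v:.3f} V * {amps:.3f} A = {watts:.6f} W")
```

With the raw values from the log this gives 19.376 V and 0.744 A, matching the 14.415744 W figure quoted above, so the conversion itself looks consistent with the quoted LSBs.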
20:01 karolherbst: config says: channel 0+1 enabled, 64 samples per value, 140 μs "Bus and Shunt voltage conversion time", "Shunt and bus, continuous" operation mode
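For reference, the config word 0x6607 from the vbios dump can be decoded field by field. The sketch below assumes the INA3221 configuration register layout as I recall it from the datasheet (reset in bit 15, three channel-enable bits, then 3-bit averaging, bus conversion time, shunt conversion time, and mode fields); the lookup tables and the function name are assumptions to check against the datasheet, not nouveau code. Channels are numbered from 0 to match the log.

```python
# Assumed INA3221 config register layout; verify against the datasheet.
AVG_SAMPLES = (1, 4, 16, 64, 128, 256, 512, 1024)
CONV_TIME_US = (140, 204, 332, 588, 1100, 2116, 4156, 8244)
MODES = (
    "power-down",
    "shunt, single-shot",
    "bus, single-shot",
    "shunt and bus, single-shot",
    "power-down",
    "shunt, continuous",
    "bus, continuous",
    "shunt and bus, continuous",
)


def decode_config(cfg):
    """Split a 16-bit config word into its named fields."""
    return {
        "reset": bool((cfg >> 15) & 1),
        # Bits 14, 13, 12 enable channels 0, 1, 2 (0-based, as in the log).
        "channels": [ch for ch in range(3) if (cfg >> (14 - ch)) & 1],
        "avg_samples": AVG_SAMPLES[(cfg >> 9) & 0x7],
        "bus_conv_us": CONV_TIME_US[(cfg >> 6) & 0x7],
        "shunt_conv_us": CONV_TIME_US[(cfg >> 3) & 0x7],
        "mode": MODES[cfg & 0x7],
    }


decoded = decode_config(0x6607)
for field, value in decoded.items():
    print(f"{field}: {value}")
```

Under that assumed layout, 0x6607 decodes to channels 0 and 1 enabled, 64-sample averaging, 140 us conversion times, and continuous shunt-and-bus mode, which agrees with the summary in the log.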
20:09 karolherbst: ... duh, nvidia
20:23 karolherbst: nice... I am able to read the sensor out with nvidia loaded
20:24 karolherbst: nvidia's values then
20:25 karolherbst: bus: 0x4be0, shunt0: 0x150, shunt1: 0x8c
20:25 karolherbst: mhhhh
20:25 karolherbst: mupuf: seems like nvidia indeed reads out much lower values
20:28 karolherbst: 9.168128W with the same way I calculate it in nouveau
20:28 karolherbst: nvidia-smi reports 8 or 9W
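To make the comparison concrete, the two sets of raw readings quoted in the log can be put side by side. This is a rough sketch using the same conversion karolherbst describes (shift by 3, 40 uV and 8 mV LSBs, 5 mOhm shunts); it assumes the raw values are non-negative, which holds for the numbers quoted here, and the variable names are purely illustrative.

```python
# Raw register values quoted in the log: (bus, (shunt0, shunt1)).
READINGS = {
    "nouveau": (0x4BB0, (0x1F0, 0xF8)),
    "nvidia": (0x4BE0, (0x150, 0x8C)),
}

SHUNT_OHMS = 0.005  # 5 mOhm, from the vbios power rail entry


def decode(bus_raw, shunt_raws):
    """Convert raw readings to bus volts and per-shunt amps.

    Assumes non-negative raw values, so no sign handling is done here.
    """
    bus_v = (bus_raw >> 3) * 8e-3
    amps = [(s >> 3) * 40e-6 / SHUNT_OHMS for s in shunt_raws]
    return bus_v, amps


power = {}
for driver, (bus_raw, shunt_raws) in READINGS.items():
    bus_v, amps = decode(bus_raw, shunt_raws)
    power[driver] = bus_v * sum(amps)
    print(f"{driver}: {bus_v:.3f} V, {amps[0]:.3f} A + {amps[1]:.3f} A"
          f" = {power[driver]:.6f} W")

ratio = power["nouveau"] / power["nvidia"]
print(f"nouveau/nvidia power ratio: {ratio:.2f}")
```

The bus voltage is nearly identical in both cases, so the difference comes almost entirely from the shunt readings, which are noticeably higher under nouveau.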
20:28 karolherbst: soooo
20:28 karolherbst: meh
20:34 karolherbst: mupuf: any ideas why the sensor would produce higher values under nouveau?