06:44 Venemo: johnny0: wdym?
06:46 Venemo: johnny0: I assume you're talking about the "clock stuck to low" issue. well, it seems that the TDP value reported by the VBIOS is less than half of the TDP the card was advertised with, and this is the reason why the clocks are stuck to lowest. the card behaves as if it was throttling even though it doesn't get above 45 C
06:48 Venemo: johnny0: the issue is solved by either using a higher TDP value, or disabling power containment or disabling powertune entirely. with either of those, the card performs like it does under windows
07:22 Venemo: johnny0: do you think there is a risk in any of that?
11:31 johnny0: Venemo: yes, check the total card power draw with stock TDP in windows
11:42 johnny0: with hawaii, the tdp / reported power is consistent between windows and linux, but that amount grossly under-represents the amount of power actually being pulled by the card
11:43 johnny0: taking a look at the reviews/discussions for SI cards, the situation doesn't look much different
11:49 Venemo: johnny0: good question what windows actually does for these. I suppose either they hardcode a TDP or they just disable powertune and let the hw do whatever
11:51 Venemo: johnny0: fwiw the card seems to handle itself just fine without any throttling
11:56 johnny0: i don't have one of these cards, but I'd suspect the idle power draw has something to do with it
11:58 johnny0: the TDP on some of those cards is *really* small... if most of that is eaten just idling with amdgpu, that'd leave practically nothing for dpm to work with
11:59 johnny0: so, raising the power limit would be an effective workaround... but these are slot-powered cards
11:59 johnny0: designed to be placed in hot-boxes with anemic power supplies
13:16 Venemo: I know
13:29 Venemo: johnny0: still, if users get decent perf from windows, and otoh on linux the same card can't even use a higher shader clock, there clearly is a problem, and I'm pretty sure it's a wrong tdp config
13:56 johnny0: Venemo: what I'm trying to get across is that the card may already pull >50W from the slot with stock TDP in Windows; bumping up the TDP to fix the dpm issue may also have the side effect of allowing much higher power draw than expected
13:57 Venemo: johnny0: I am pretty sure that it isn't using the stock tdp on windows either.
13:58 Venemo: I don't think there is a dpm issue. I think the power limit is just simply low
13:58 johnny0: gotcha, it's well worth confirming with a power meter though
13:59 Venemo: I don't have a power meter, best I can do is run furmark on the card for some time to confirm that it doesn't fall off the bus
14:01 johnny0: so... you're proposing a fix that may very well cause the card to pull >75W from the slot and you're not going to confirm it doesn't?
14:01 Venemo: I don't think that it will
14:01 Venemo: but I'm open to suggestions, of course
14:03 Venemo: thus far all the SI cards that I tested worked fine on their own without even using any DPM, but please correct me if I'm wrong
14:05 Venemo: best would be if we could read the smc registers somehow on windows, then we would see how it programs the gpu and just do the same
14:05 Venemo: johnny0: in your estimate, how much power does the card draw other than the chip's TDP?
14:05 johnny0: I'm not saying bumping the tdp isn't a realistic fix, just that it's something that an informed user should make the call on to enable (e.g., via overdrive)
14:06 johnny0: a considerable amount on hawaii
14:06 Venemo: how much?
14:06 johnny0: IIRC the stock 138W TDP on s9150 hit the 235W advertised power usage spot on
14:07 Venemo: wow. so, your Hawaii card draws about 100W more than the chip's TDP
14:07 johnny0: under load, yes
14:08 johnny0: and the dpm behavior...
14:08 Venemo: I see. that would make me worry too
14:08 johnny0: IIRC, the card can't maintain its max 900MHz under certain workloads
14:08 johnny0: DPM won't even try
14:09 johnny0: or... I should say if it tries, it sees the situation as hopeless and drops to the lower level
14:09 johnny0: which will not max out the TDP
14:09 Venemo: interesting
14:09 johnny0: but if the temperature/workload is right, it *will* hit the max dpm state and downclock as the temperature goes up
14:10 johnny0: if you have the system hooked up to a power meter, you can see the consumption rise accordingly
14:10 Venemo: I see your point
14:11 Venemo: sounds like your Hawaii works a bit differently from mine
14:13 johnny0: yeah, iirc it has significantly lower power limits and only six DPM states
14:14 Venemo: well
14:14 Venemo: with the Radeon 430, what happens by default is that the DPM always behaves as if it was throttling, even when the card is cold. so even if I manually set it to the highest power level, the DPM will still downclock to the lowest shader clock under any load.
14:18 Venemo: now, we're not talking about doing any overclocking or od or anything, just the default behaviour
14:20 johnny0: hmm, I don't know if there's any power info for SI cards in windows (e.g., via hwinfo64), but it would be worth checking what the reported power draw delta is like when the memory is clocked up
14:20 Venemo: is there any good way for me to check this if I don't have HW to measure PCIe power draw?
14:23 johnny0: hwinfo64 is probably your best bet on the software side, but taking a delta using a cheap wall power meter (e.g., kill-a-watt) works pretty well
14:23 johnny0: just be sure to take the PSU efficiency into account
14:24 Venemo: sigh
14:24 Venemo: I guess I'll start with the hwinfo thing
14:26 Venemo: with the power meter, I guess I'd have to fiddle with the CPU otherwise it would throw off the measurement completely
14:27 johnny0: ahh yes, good point... ugh
14:29 Venemo: not exactly sure how to do that, tho. even if i set a power limit to the CPU, it may exceed that limit
14:30 Venemo: and the PSU efficiency depends on the exact power draw, so that's also highly troublesome
14:38 johnny0: yep. still useful as an "uhoh" test if you can work out how to constrain those variables
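[editor's note] The wall-meter delta johnny0 describes can be sketched roughly as below. This is a minimal illustration, not a measurement procedure: the function name and the sample readings are hypothetical, and it assumes a single fixed PSU efficiency, which Venemo points out is not strictly true because efficiency varies with load. It also cannot separate GPU draw from any change in CPU draw between the two readings.

```python
def gpu_power_delta_watts(idle_wall_w: float, load_wall_w: float,
                          psu_efficiency: float) -> float:
    """Estimate the extra DC-side power drawn under GPU load.

    idle_wall_w / load_wall_w are wall-meter readings in watts.
    psu_efficiency is a fraction in (0, 1]; treating it as constant
    across both readings is an assumption made for this sketch.
    """
    if not 0.0 < psu_efficiency <= 1.0:
        raise ValueError("psu_efficiency must be in (0, 1]")
    # The wall meter sees AC input; multiply by efficiency to get
    # the approximate DC power delivered to the components.
    return (load_wall_w - idle_wall_w) * psu_efficiency


if __name__ == "__main__":
    # Hypothetical readings, not taken from any card in this log.
    print(gpu_power_delta_watts(idle_wall_w=60.0, load_wall_w=100.0,
                                psu_efficiency=0.85))
```

The result is only an upper bound on the GPU's share if the CPU is kept at a steady state during both readings, which is the hard part raised above.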
14:41 Venemo: honestly, I don't see how else windows can get better perf out of the gpu other than raising the tdp limit
14:42 Venemo: I mean specifically in the context of allowing higher clocks
14:48 johnny0: I'd guess the windows drivers power down more of the chip leaving more available power budget to work with
14:49 Venemo: like what is there to power down?
15:36 Venemo: johnny0: anyway, I can't argue with your results from Hawaii, but as a base of comparison, I have 3 different Oland cards, and have compared the TDP value from the VBIOS: R7 250 - 57W, 520 - 30W, R5 430 - 24W.
15:37 Venemo: to be clear, all of these just draw power from the PCIe slot
15:38 Venemo: the R7 250 was a "gaming" card like 10+ years ago. the other two are cheap garbage that you can find in OEM PCs
15:39 Venemo: so, if the R7 250 is fine with 57W TDP, I don't see any reason why the other two couldn't use at least 35~40W or so
15:42 Venemo: assuming that the R7 250 didn't draw more than the PCIe slot allows, I don't think the other two would either
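[editor's note] The two positions can be put side by side with some arithmetic. The sketch below takes the VBIOS TDP values Venemo lists and the 75W slot figure johnny0 mentions, and scales each TDP by a board-power-to-TDP ratio. The ratio derived from johnny0's S9150 numbers (235W board power at 138W TDP) is applied to Oland purely as an assumption for illustration; nobody in the log measured it on these cards, and the helper names are hypothetical.

```python
PCIE_SLOT_LIMIT_W = 75.0  # slot limit referred to in the log

# TDP values read from the VBIOS, as reported by Venemo.
VBIOS_TDP_W = {"R7 250": 57.0, "520": 30.0, "R5 430": 24.0}

# Ratio of advertised board power to TDP on johnny0's S9150 (Hawaii).
# Applying it to Oland cards is an ASSUMPTION, not a measurement.
HAWAII_RATIO = 235.0 / 138.0


def estimated_board_power(tdp_w: float, ratio: float) -> float:
    """Scale a chip TDP by an assumed board-power/TDP ratio."""
    return tdp_w * ratio


def report(tdps: dict, ratio: float, limit_w: float) -> list:
    """Return (name, tdp, estimate, within_limit) for each card."""
    rows = []
    for name, tdp in tdps.items():
        est = estimated_board_power(tdp, ratio)
        rows.append((name, tdp, est, est <= limit_w))
    return rows


if __name__ == "__main__":
    # Ratio 1.0 corresponds to "TDP equals total card draw".
    for ratio in (1.0, HAWAII_RATIO):
        print(f"ratio = {ratio:.2f}")
        for name, tdp, est, ok in report(VBIOS_TDP_W, ratio, PCIE_SLOT_LIMIT_W):
            status = "within" if ok else "over"
            print(f"  {name}: TDP {tdp:.0f} W, est. {est:.1f} W, {status} slot limit")
```

How far the estimate moves between the two ratios is the reason a real power reading would settle the question better than either assumption.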
16:01 Remco: Venemo: Could you check idle power usage by doing echo 1 > /sys/class/drm/card0/device/remove perhaps?
16:01 Remco: Although that might not change power state
16:04 Venemo: Remco: I'm sorry I don't see what you mean there
16:05 Remco: I'm thinking that you could suspend the device to get the difference in power usage between normal and suspended
16:05 Remco: But looking a bit further it doesn't look like that is a normal/supported operation
16:06 Venemo: well, how would I measure its power use?
16:10 Remco: Oh, I guess I assumed you'd do at the wall measuring. That would mean you could get an isolated reading
16:12 Venemo: issue is that the cpu can still consume a shitton of power and I have no way to account for that
16:12 Venemo: as ridiculous as that sounds, my CPU can draw 5x as much power as this puny GPU
16:22 Remco: Suspending the gpu doesn't cause the cpu to change any states right? But I see your problem, it's very hard to get a good reading because it's so easy to get lost in the noise
16:23 Venemo: so basically if I measured the power consumption of this system, the GPU would be a rounding error compared to the CPU
16:23 Venemo: that is unless I had a way to measure the CPU separately and subtract it from the result
16:25 johnny0: Venemo: Oh, interesting, thanks! That's a lot less worrying then
16:26 johnny0: I was only finding vbioses in the 18-30W range, so just slamming it to 50 seemed really bad given what I observed with hawaii
16:27 Venemo: wow, what cards are those in the 18-30 W range?
16:28 johnny0: all of the oem 520 and 240s I checked
16:30 Venemo: oh, wow
16:30 Venemo: sounds like they would have the same problem
16:33 Venemo: johnny0: could I ask you to test some of those with a raised TDP? I can give you a patch to do so
16:36 johnny0: Venemo: these were just vbioses snagged from techpowerup's database
16:53 Venemo: johnny0: oh, I see. I thought you actually had some of your own
16:55 Venemo: I think ultimately it will be up to Alex to decide which solution is acceptable