00:02 karolherbst: imirkin: glxgears shader: 16 regs down to 7 regs used :)
00:02 imirkin: hm?
00:02 imirkin: with your pass?
00:02 karolherbst: yeah
00:02 imirkin: nice
00:02 karolherbst: pixmark_piano: 56 -> 41
00:02 imirkin: 16 -> 7 -- that could be a real improvement
00:02 imirkin: 56 -> 41 -- dunno
00:02 karolherbst: will check
00:03 karolherbst: without cross BB values
00:03 karolherbst: don't check that yet
00:03 imirkin: i highly suspect the improvements are at POT boundaries
00:03 imirkin: crossing BB's is hard. leave that for later.
00:03 karolherbst: my pass is pretty trivial though
00:03 imirkin: there's all sorts of jerks that create for loops. and then they want to break out of them.
00:04 karolherbst: just add values of insn scheduled, and then schedule values dests first
00:04 imirkin: heh
00:04 imirkin: well that can lead to a really bad schedule too ;)
00:05 imirkin: i think you need to have 2 diff policies
00:05 imirkin: and switch between them
00:05 imirkin: depending on the register pressure
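The two-policy idea above can be sketched roughly as follows. This is only an illustration, not the pass discussed in the log: `Candidate`, `pickNext`, `liveNow` and `pressureLimit` are hypothetical names, and the real pass works on nv50_ir instructions rather than on a summary struct.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of a ready-to-schedule instruction (not an nv50_ir type).
struct Candidate {
    int liveDelta;   // values it defines minus values whose last use it is
    int throughput;  // cost in the style of targ->getThroughput()
};

// Pick the index of the next instruction from a non-empty ready list.
// Under high register pressure, prefer the smallest liveDelta; otherwise
// prefer the highest throughput so that slow results get started early.
std::size_t pickNext(const std::vector<Candidate>& ready,
                     int liveNow, int pressureLimit)
{
    assert(!ready.empty());
    const bool highPressure = liveNow >= pressureLimit;
    std::size_t best = 0;
    for (std::size_t i = 1; i < ready.size(); ++i) {
        const Candidate& a = ready[i];
        const Candidate& b = ready[best];
        bool better;
        if (highPressure) {
            better = a.liveDelta < b.liveDelta ||
                     (a.liveDelta == b.liveDelta && a.throughput > b.throughput);
        } else {
            better = a.throughput > b.throughput ||
                     (a.throughput == b.throughput && a.liveDelta < b.liveDelta);
        }
        if (better)
            best = i;
    }
    return best;
}
```

Switching on a single threshold is the simplest possible choice; a real pass would probably want some hysteresis so that it does not flip policies on every instruction.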
00:05 karolherbst: if I cut reg use by more than half, well
00:05 karolherbst: yeah maybe
00:05 imirkin: btw, make sure that when you're benchmarking, you're comparing apples-to-apples
00:05 karolherbst: or just prefer other stuff first
00:05 imirkin: i.e. opt build to opt build
00:05 imirkin: or debug build to debug build
00:05 karolherbst: of course
00:06 karolherbst: I just modify my source, recompile and use the same paths
00:07 karolherbst: mhh, for glxgears the perf is the same though :/
00:07 imirkin: well, glxgears isn't gpu bound
00:07 imirkin: try heaven :)
00:08 imirkin: or glxspheres
00:08 karolherbst: pixmark_piano no change too
00:08 karolherbst: ohh glxspheres is pcie bound in the end
00:08 karolherbst: clock bound only up to a specific clock
00:08 imirkin: weird
00:08 imirkin: i wonder why
00:08 karolherbst: mhh
00:08 karolherbst: its around 700MHz
00:09 imirkin: i.e. what does it do over pcie
00:09 karolherbst: DRI_PRIME=
00:09 karolherbst: ?
00:09 imirkin: that's only a small image
00:09 karolherbst: transferring the image data I guess
00:09 karolherbst: mhh
00:09 karolherbst: 1000fps?
00:09 karolherbst: thats a lot
00:09 karolherbst: :D
00:09 imirkin: heh i guess
00:09 karolherbst: usually with bumblebee I have high pcie load
00:09 karolherbst: on blob I get around 80% with glxgears
00:09 karolherbst: and fps are much worse than with nouveau
00:10 karolherbst: only because of image transfers
00:10 karolherbst: texture actually
00:10 glennk: prime setup would explain why pcie link speed matters so much...
00:10 karolherbst: mhh right
00:10 karolherbst: never thought about this way
00:10 karolherbst: but I think with talos it matters for everybody
00:11 karolherbst: because there I had bigger speed improvements than with other games
00:11 karolherbst: I always got around 5% usually, but in talos principle around 25%
00:11 glennk: basically if sizeof commands ~= size of framebuffer data
00:11 glennk: usually its <<
00:17 glennk: btw gprs vs warps: http://developer.download.nvidia.com/compute/cuda/CUDA_Occupancy_calculator.xls
00:18 imirkin: errrr
00:18 imirkin: what gpu is that on?
00:18 imirkin: must be GK110+
00:18 karolherbst: imirkin: you can change it
00:19 karolherbst: just type in 64 threads per block
00:19 karolherbst: then the diagrams will change
00:19 imirkin: registers per thread, actually
00:19 karolherbst: its 32 already
00:20 karolherbst: or what are the right values for, like, my gpu?
00:20 karolherbst: 64 and 32?
00:20 karolherbst: or both 32
00:21 imirkin: oh. there's a dropdown.
00:21 imirkin: select 3.0 for your gpu
00:21 karolherbst: k
00:21 karolherbst: active threads per multiprocessor is 0 now :D
00:21 imirkin: the graph's a bit broken =/
00:22 karolherbst: ohh now it works
00:22 imirkin: anyways, i guess up to 32 regs it doesn't matter
00:22 karolherbst: but what do these graphs help with exactly?
00:22 imirkin: and then there are steps at 40 and 48
00:22 imirkin: interesting.
00:22 imirkin: not what i expected.
00:23 glennk: there are multiple limits
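One of those limits is the register file. A minimal sketch of how registers per thread could translate into resident warps is shown below. All constants are assumptions for compute capability 3.0 (the value picked in the dropdown) and are quoted from memory of the occupancy spreadsheet, so check them before relying on the numbers; `warpsLimitedByRegs` is a hypothetical helper, not part of any NVIDIA or Mesa API.

```cpp
#include <cassert>

// Assumed limits for compute capability 3.0; verify against the spreadsheet.
constexpr int kRegsPerSM = 65536;        // 32-bit registers per multiprocessor
constexpr int kMaxWarpsPerSM = 64;       // hard cap on resident warps
constexpr int kThreadsPerWarp = 32;
constexpr int kRegAllocUnit = 256;       // registers are handed out per warp in chunks
constexpr int kWarpAllocGranularity = 4; // warps are counted in groups of this size

// Upper bound on resident warps imposed by register usage alone.
int warpsLimitedByRegs(int regsPerThread)
{
    assert(regsPerThread > 0);
    int regsPerWarp = regsPerThread * kThreadsPerWarp;
    // Round the per-warp allocation up to the next allocation unit.
    regsPerWarp = (regsPerWarp + kRegAllocUnit - 1) / kRegAllocUnit * kRegAllocUnit;
    int warps = kRegsPerSM / regsPerWarp;
    // Round down to the warp allocation granularity.
    warps -= warps % kWarpAllocGranularity;
    return warps < kMaxWarpsPerSM ? warps : kMaxWarpsPerSM;
}
```

Under these assumptions, anything up to 32 registers leaves the full 64 warps, 33 to 40 registers gives 48 warps, and 41 to 48 gives 40, which would be consistent with the steps at 40 and 48 read off the graph. Shared memory and block size impose their own caps, which this sketch ignores.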
00:24 imirkin: joi: btw, i sent a more complete patch for that witcher2 cb issue.
00:25 glennk: imirkin, http://on-demand.gputechconf.com/gtc-express/2011/presentations/cuda_webinars_WarpsAndOccupancy.pdf pages 6-9
00:25 imirkin: some not quite-as-lazy individual should reinstate this logic into nv50 as well
00:28 karolherbst: imirkin: reducing live values fixes the spilling issues in heaven at least
00:29 karolherbst: now heaven runs at max settings
00:32 airlied: mlankhorst: skeggsb seems to think there is a problem :)
00:40 karolherbst: 12.8/323 => 12.9/326 .. mhh this looks somehow not significant
00:41 imirkin: you need to take latency into account if you want a significant boost
00:42 karolherbst: latencies of instructions I guess?
00:43 karolherbst: so before I try to schedule an instruction, I should check if the latencies of the sources are big enough?
00:43 karolherbst: or small enough
00:43 imirkin: see if enough instructions have been scheduled
00:44 imirkin: ideally you'd only schedule instructions when their sources are "ready"
00:44 imirkin: of course you can't always do that (since there's no point in inserting nop's, unlike some other arch's)
00:45 karolherbst: I see
00:46 karolherbst: so then I would schedule one where the source will get ready the earliest
00:46 imirkin: right
00:48 karolherbst: how do I get the latency? Is there something like getLatency(insn->op)?
00:49 imirkin: targ->getLatency
00:50 karolherbst: prog->getTarget()->getLatency I guess?
00:50 imirkin: not exactly perfect, but should be a decent idea
00:50 imirkin: actually targ->getThroughput() might be a better fit
00:51 karolherbst: what's the difference?
00:51 karolherbst: canDualIssue also sounds quite nice?
00:51 imirkin: yeah, worry about that later.
00:51 karolherbst: mhhh
00:52 karolherbst: okay
00:52 karolherbst: ohh, this speed improvement seems stable though
00:52 karolherbst: stock has 325/326 points and my modifications are at 329 points
00:52 karolherbst: across runs
00:52 karolherbst: even if that's nearly nothing
00:52 imirkin: cool. a 1% improvement!
00:52 karolherbst: :D
00:52 imirkin: almost up to blob speeds i guess
00:53 karolherbst: okay, so what's the difference between latency and throughput now?
00:54 imirkin: use throughput.
00:54 imirkin: forget i mentioned latency.
00:54 glennk: latency would be how long a value takes before its available
00:54 karolherbst: okay, and what does the number tell me exactly? or how do I know when a source is "ready"?
00:55 glennk: throughput would be how long the instruction takes to execute
00:55 * glennk is guessing
01:05 karolherbst: imirkin: does throughput mean how many instructions I have to wait?
01:06 imirkin: yes
01:06 karolherbst: mhh
01:06 imirkin: minus 1
01:06 karolherbst: vfetch has 1
01:06 imirkin: i.e. if throughput is 1, don't have to wait at all
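Read that way, the "schedule the one whose sources are ready the earliest" rule could look roughly like this. It is a sketch under the interpretation given above (throughput minus 1 instructions must sit between producer and consumer); `Source`, `stallAt` and `pickEarliestReady` are hypothetical names, not nv50_ir functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of an already scheduled producer.
struct Source {
    int position;    // slot in the new schedule where the producer was placed
    int throughput;  // value in the style of targ->getThroughput(); 1 = no wait
};

// Number of slots an instruction placed at `slot` would still have to wait.
int stallAt(int slot, const std::vector<Source>& sources)
{
    int stall = 0;
    for (const Source& s : sources) {
        assert(s.throughput >= 1);
        // throughput - 1 other instructions must sit between producer and user
        int readyAt = s.position + s.throughput;
        stall = std::max(stall, readyAt - slot);
    }
    return stall;
}

// Index of the candidate whose sources are ready the earliest.
// Each candidate is described only by the list of its sources.
std::size_t pickEarliestReady(int slot,
                              const std::vector<std::vector<Source>>& candidates)
{
    assert(!candidates.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < candidates.size(); ++i) {
        if (stallAt(slot, candidates[i]) < stallAt(slot, candidates[best]))
            best = i;
    }
    return best;
}
```

Since nops are pointless on this hardware, a stall of more than zero is not an error here; the candidate with the smallest stall simply wins.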
01:06 imirkin: yeah dunno. it seems to only have ALU ops in there
01:06 karolherbst: mhh okay
01:07 karolherbst: rsq has 8
01:07 karolherbst: what is rsq?
01:08 karolherbst: ohh yeah, this looks not so good somehow: https://gist.github.com/karolherbst/504af031a8a8346e99ee
01:08 karolherbst: ohh the other way around silly me
01:08 karolherbst: still, %r181 gets used in the next instructions
01:08 karolherbst: I guess the core has to wait then
01:08 glennk: generally if unsure what an op does, look at the tgsi op docs
01:10 karolherbst: now I need the position of an instruction inside my list
01:11 imirkin: rsq = 1/sqrt()
01:11 karolherbst: ohh okay
01:11 mupuf: karolherbst: nice!
01:11 karolherbst: mupuf: whats nice?
01:12 mupuf: I will have a tool for you to do your benchmarking :)
01:12 karolherbst: awesome
01:12 mupuf: so as you won't have to run it by hand
01:12 mupuf: the nice part is the improvement
01:12 mupuf: karolherbst: I posted a new version of my patchset
01:12 mupuf: could you test it?
01:12 karolherbst: nice
01:12 karolherbst: yeah
01:12 karolherbst: will do
01:12 karolherbst: whats the difference?
01:13 mupuf: it is much better written
01:13 mupuf: but it should not change anything
01:13 imirkin: glennk: is there a way to trace where a texture comes from in apitrace?
01:13 mupuf: except adding a message in dmesg in debug verbosity
01:13 karolherbst: mupuf: this? http://lists.freedesktop.org/archives/nouveau/2015-September/022206.html
01:13 mupuf: the message tells if you are using the GPIO mode or PQM
01:13 mupuf: and it adds an extra check
01:13 glennk: imirkin, i wish, i usually just search for glBindtexture(#texture_id)
01:14 mupuf: and it should take less cpu when changing voltage
01:14 mupuf: karolherbst: yep, the 4 patches
01:14 mupuf: the 3 first are enough for you
01:17 karolherbst: mupuf: I had a little inaccuracy with your patches though
01:17 imirkin: hmmm... looks like RGBA4 isn't handled properly?
01:17 mupuf: karolherbst: ah?
01:17 karolherbst: yeah pwm should be 0x30 at 0f and not 0x31
01:17 karolherbst: will check with the new one though
01:17 mupuf: I did not change anything
01:18 karolherbst: I think the calculation isn't exact enough
01:18 karolherbst: and you end up with 30.ff or something
01:19 karolherbst: ohh wait
01:19 karolherbst: doesn't make sense
01:24 karolherbst: mupuf: but it still works :)
01:29 mupuf: :) good, please reply onto the mailing list, saying Tested-by: Karol Herbst <git@karolherbst.whatever.domain>
01:29 mupuf: I tested it on my nve6 and gm107
02:01 karolherbst: imirkin: when I have a list of schedulable instructions and all have to wait the same, then I should schedule those with higher throughput first, so that the others will be executed later?
02:03 karolherbst: ohh finally
02:03 karolherbst: 13.4/337 now
02:04 karolherbst: that's more significant
02:04 karolherbst: mupuf: okay, gpu hang
02:06 karolherbst: it seems like the lowest voltage is still not high enough from the table
02:09 karolherbst: imirkin: okay, those dualIssue instructions: when I schedule one instruction and then one with canDualIssue(insn, anotherInsn) after that, the second instruction is nearly for free?
02:10 mupuf: karolherbst: check that the voltage indeed got changed
02:11 karolherbst: I only know the nvapeek way, is there another?
02:11 karolherbst: mupuf: guess what
02:11 karolherbst: I found the issue
02:12 karolherbst: okay, stock is around 324 points, my scheduling pass is around 337 now (+0.5 fps)
02:13 mupuf: karolherbst: what issue? :D
02:13 karolherbst: with your patch
02:14 mupuf: your patch will need to be thoroughly tested on multiple benchmarks to look for regressions :)
02:14 karolherbst: yeah I know
02:14 mupuf: also, you may want to run shaderdb on your pass to check if it increases anything
02:14 karolherbst: cpu usage is higher
02:15 karolherbst: how can I do that?
02:15 mupuf: is it higher at compilation time?
02:15 karolherbst: yeah
02:15 mupuf: or at run time?
02:15 mupuf: yeah, meh for compilation time
02:15 karolherbst: compilation time
02:15 karolherbst: heaven compiles some stuff on the fly
02:15 karolherbst: some scene changes seem to have bigger stutters
02:16 karolherbst: but you missed the | 0x80000000 thingy ;)
02:16 mupuf: karolherbst: AHAH
02:17 mupuf: well, I guess you get to shame me on the ML!
02:17 mupuf: but hey, you can review the patches, add your Tested-by and ask for this change
02:17 karolherbst: okay the speedup in heaven is pretty stable, nice benchmark
02:17 mupuf: I will just need a confirmation from skeggsb that this is an OK design
02:18 mupuf: with this done, I will even try to change the clocks on my maxwell :p
02:21 karolherbst: pixmark_piano -2ms frame time
02:22 karolherbst: mupuf: how can I run over shaderdb? ;) I thought only intel can use that or so
02:23 mupuf: well, intel has a ton more shaders but we cannot redistribute them because they are either from customer's applications or non-open source apps
02:23 mupuf: but the open source shaders are in a different repo
02:23 mupuf: and everyone can use them
02:25 mupuf: not sure if this is the tree: http://cgit.freedesktop.org/~anholt/shader-db/log/
02:26 karolherbst: mupuf: will this also capture speed and stuff?
02:31 mupuf: nope, but if you modify the compiler a bit to allow you to know the number of spills you get and the maximum live count, you can decide if you are going to hurt or help shaders
02:34 karolherbst: I see
02:41 mupuf: and you can use the repo I created to store more shaders :)
03:06 karolherbst: imirkin: what was that about again? src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp:1689: void nv50_ir::CodeEmitterNVC0::emitLoadStoreType(nv50_ir::DataType): Assertion `!"invalid type"' failed.
04:14 karolherbst: imirkin: there is something wrong with the deps, when I start with instructions with higher throughput, then the compiler does garbage things
04:17 karolherbst: imirkin: segfault: nv50_ir::TargetNVC0::getThroughput (this=0x771010, i=0x0) at ../../../../../src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp:523
04:17 karolherbst: 523 if (i->dType == TYPE_F32) {
05:09 karolherbst: imirkin: forget it, my mistake
10:06 karolherbst: imirkin: so what would be the next thing after reducing wait of sources?
10:07 joi: imirkin: weird, witcher2 does not start with your new patch
10:08 joi: I unapplied your patch - it starts, applied again - it does not
10:09 joi: it just hangs with black screen
10:15 joi: int i = ffs(i) - 1; :D
10:16 glennk: gotta love C that allows that
10:20 karolherbst: why should that be bad?
10:20 karolherbst: :p
10:31 imirkin_: joi: is that wrong?
10:31 imirkin_: oh
10:31 imirkin_: that is wrong
10:31 imirkin_: but so close! :)
10:32 joi: with s/ffs(i)/ffs(bindings)/ it works
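For readers who missed the joke: `int i = ffs(i) - 1;` initializes `i` from its own indeterminate value, which C happily compiles. A minimal sketch of the corrected pattern, assuming the surrounding code walks the set bits of a mask (the loop shape and the name `setBits` are assumptions; only `ffs(bindings)` comes from the log):

```cpp
#include <strings.h>  // POSIX ffs()
#include <cassert>
#include <vector>

// Collect the indices of all set bits in `bindings`, lowest first.
std::vector<int> setBits(unsigned bindings)
{
    std::vector<int> out;
    while (bindings) {
        // ffs() is 1-based, so subtract 1 to get a bit index.
        // The buggy form `int i = ffs(i) - 1;` reads the uninitialized `i`
        // instead of the mask.
        int i = ffs(static_cast<int>(bindings)) - 1;
        assert(i >= 0);
        out.push_back(i);
        bindings &= ~(1u << i);  // clear the bit so the loop terminates
    }
    return out;
}
```

If the garbage index never matches a set bit, the mask is never cleared and a loop like this spins forever, which would fit the perf observation a few lines further down.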
10:33 imirkin_: joi: cool :)
10:33 imirkin_: i can't believe i did that
10:33 imirkin_: nice catch
10:35 joi: perf told me it was spinning in nvc0_cb_push, so it wasn't that hard ;)
10:37 imirkin_: odd that i didn't catch it -- i tried replaying the witcher2 trace you had and it worked fine
10:37 imirkin_: i guess it's stack-specific
10:37 imirkin_: thus diff compiler does diff things
10:43 joi: lol, with this ffs bug witcher2 trace replays fine
10:44 joi: but game still hangs
10:48 karolherbst: imirkin_: what does dual issue mean? I find it odd that around 80% of the instructions can be dual issued
10:49 imirkin_: karolherbst: process 2 instructions at the same time
10:49 imirkin_: karolherbst: perhaps you'd be interested in http://people.freedesktop.org/~chrisbmr/shaders.TODO
10:49 imirkin_: i haven't read through that in quite a while
10:50 karolherbst: okay I thought that much. mhh will it change anything regarding throughput then?
10:53 karolherbst: mhhh there is another memory corruption somewhere :/
10:54 karolherbst: when I have multiple instructions which are waiting the same time and I prefer one which can be dual issued with the previous insn, then heaven crashes :/
11:01 karolherbst: imirkin_: "1369: st b96 # l[0xc] $r24t (8)" should that ever happen?
11:02 imirkin_: karolherbst: no. that's the bug. i have a bugzilla and trello open for that.
11:02 imirkin_: spilling code is unaware of this
11:02 karolherbst: mhh okay
11:02 imirkin_: and will generate it when spilling 3-wide RIG node
11:02 karolherbst: because I have a situation here where I get some of them
11:02 imirkin_: i have a prelim patch in the bug iirc
11:02 imirkin_: https://bugs.freedesktop.org/show_bug.cgi?id=90348
11:04 karolherbst: seems to work, nice thanks
11:05 karolherbst: mhhh strange, preferring dual issuing decreases perf :/
11:09 glennk: higher register usage with more dual issue?
11:10 imirkin_: karolherbst: i think there are other funny constraints wrt dual-issue
11:10 imirkin_: like alignment
11:10 imirkin_: which you can't know at the time that your scheduling pass runs
11:10 karolherbst: ohhh
11:10 karolherbst: okay
11:10 karolherbst: then I simply ignore that for now
11:13 karolherbst: mhh, somehow I don't get more than 3% more perf by reducing waits
11:14 imirkin_: karolherbst: well there are additional things
11:14 imirkin_: like you want to move tex's as far away from uses as possible
11:14 imirkin_: feel free to fix up the getThroughput function to suit your needs
11:14 imirkin_: e.g. what happens if you make OP_TEX & co have a throughput of 20 or something
11:14 karolherbst: I have a lot of waits left anyway
11:18 karolherbst: imirkin_: I get stuff like that a lot: https://gist.github.com/karolherbst/9e843fcbe29bfa8cdd80
11:18 karolherbst: always these 8 throughput instructions
11:18 karolherbst: followed by immediate use
11:18 karolherbst: and I don't find a way to move them away
11:20 imirkin_: yeah, not a ton you can do
11:20 imirkin_: why did you schedule the mul before the presin?
11:21 imirkin_: i.e. why not swap 91 and 92?
11:25 karolherbst: imirkin_: good catch
11:26 karolherbst: I have to improve my liveValue tracking anyway :/
11:45 karolherbst: imirkin_: like that? https://gist.github.com/karolherbst/9e843fcbe29bfa8cdd80
11:46 karolherbst: ohh wait, I did something stupid indeed, but this shouldn't affect this
11:46 imirkin_: why isn't the presin scheduled immediately after the relevant mul?
11:46 imirkin_: i.e. why isn't serial 60 the presin?
11:47 imirkin_: fyi, to better deal with some of this stuff, there's the option of scheduling backwards
11:47 imirkin_: i.e. start at the end
11:47 imirkin_: and basically invert the algo
11:47 imirkin_: just a thought
11:47 karolherbst: I just made a mistake, maybe its better now
11:47 karolherbst: sec
11:47 karolherbst: https://gist.github.com/karolherbst/9e843fcbe29bfa8cdd80
11:48 karolherbst: mhh
11:48 karolherbst: usually I prefer instructions with highest throughput
11:48 karolherbst: so the presin should be moved up automatically
11:48 karolherbst: but
11:48 karolherbst: maybe I have too many live values left
11:49 imirkin_: well, the presin doesn't alter the number of live values right?
11:49 imirkin_: since there are no further users of %r737
11:50 karolherbst: ohh mhh, I think my algo is stupid when I decide to not decrease live values :/
11:50 karolherbst: https://gist.github.com/karolherbst/9e843fcbe29bfa8cdd80
11:50 karolherbst: :D now they are up
11:51 imirkin_: again...
11:51 karolherbst: but the use is too near
11:51 imirkin_: why not move the presin up one
11:51 karolherbst: yeah, I see what you mean
11:51 imirkin_: i.e. why do you prefer to schedule the mul over the presin?
11:51 imirkin_: are you trying to reduce live values?
11:52 karolherbst: both would reduce live values, right?
11:52 imirkin_: if so, i'd say it's ok to schedule things that don't alter the number of live values too
11:52 imirkin_: ah yeah, i guess
11:52 imirkin_: actually, keep them the same
11:52 karolherbst: I don't try to reduce them
11:52 karolherbst: I just schedule those first, who depend on live values
11:52 karolherbst: which may or may not reduce live values
11:53 karolherbst: mhh, the presin should be preferred over the mul, strange
11:56 karolherbst: ohhh maybe
11:56 karolherbst: mhh
11:58 karolherbst: okay, found the issue
11:59 karolherbst: imirkin_: I search the values which waited the longest and don't try to also fit exact matches
11:59 karolherbst: so the mul is preferred because of "-1" wait
11:59 imirkin_: who cares about the 'wait'?
12:00 karolherbst: no instruction should ever wait until the source is ready
12:00 karolherbst: or not?
12:00 karolherbst: I just change the algo a bit
12:00 karolherbst: *need to
12:08 karolherbst: imirkin_: better? https://gist.github.com/karolherbst/9e843fcbe29bfa8cdd80
12:08 imirkin_: i think so
12:09 imirkin_: obviously only on a very micro level ;)
12:09 karolherbst: hey, less waiting
12:09 karolherbst: :D
12:09 imirkin_: throughput of insn: 8 latency: 9 serial: 39 sub ftz f32 %r536 %r528 %r475 (0)
12:10 imirkin_: that's wrong
12:10 imirkin_: that should be throughput 1
12:10 imirkin_: OP_SUB got missed in that getThroughput thingie
12:10 imirkin_: (there is no OP_SUB... it's just OP_ADD)
12:11 karolherbst: so I should handle OP_SUB like OP_ADD?
12:11 imirkin_: yes
12:11 imirkin_: add it in the relevant places for both of the if cases
12:12 imirkin_: throughput of insn: 2 latency: 9 serial: 65 mov u32 %r915 0x3a83126f (0)
12:12 imirkin_: that's gotta be wrong too
12:12 imirkin_: the type on a mov is kind of an artifact of the nv50 ir
12:12 imirkin_: not a real thing
12:12 imirkin_: should always be 1 i think
12:13 imirkin_: same for things like interp/vfetch/etc
12:13 imirkin_: although i suspect those are slower ops. dunno.
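The corrections discussed here (sub costs the same as add, mov is always 1) plus the earlier "tex at 20 or something" experiment can be summed up in a small cost table. This is only a sketch: the `Op` enum and `schedCost` are hypothetical stand-ins, not the real nv50_ir opcodes or the actual getThroughput code, and the numbers are just the ones mentioned in this log.

```cpp
#include <cassert>

// Hypothetical opcode subset; the real nv50_ir enum is much larger.
enum class Op { Mov, Add, Sub, Rsq, Tex };

// Scheduling cost in the "throughput" sense used in this chat:
// 1 means a dependent instruction can follow immediately.
int schedCost(Op op)
{
    switch (op) {
    case Op::Mov:   // the type on a mov is an IR artifact, so always 1
    case Op::Add:
    case Op::Sub:   // must be handled exactly like add
        return 1;
    case Op::Rsq:
        return 8;   // value reported for rsq earlier in the log
    case Op::Tex:
        return 20;  // the experimental "20 or something", not a measured number
    }
    assert(!"unhandled opcode");
    return 1;       // placeholder fallback, an assumption of this sketch
}
```

Keeping sub and add on the same case label makes it harder for the two to drift apart again, which is the kind of omission spotted above.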
12:17 karolherbst: mhh killed perf now :/ sad
12:17 karolherbst: but this is still in an insane state, where changes which should improve stuff make it worse
12:18 karolherbst: anything I can still optimize here? https://gist.github.com/karolherbst/9e843fcbe29bfa8cdd80
12:19 karolherbst: mhhh
12:19 karolherbst: how could I find out which the real values are?
12:19 imirkin_: dunno, seems reasonable
12:20 karolherbst: perf is much worse now though, but this could be due to other programs, should do pixmark_piano again now
12:21 karolherbst: but this has also a 4000+ ins shader :/
12:22 karolherbst: actually why not
12:23 karolherbst: imirkin_: this would be the current state: https://gist.github.com/karolherbst/f1f599ae99f7a736cab7
12:23 karolherbst: this one is really messy :/
12:23 karolherbst: tons of BBs
12:24 karolherbst: and exit not the last instruction
12:25 imirkin_: uhhhhh
12:25 imirkin_: i don't see an exit at all
12:26 karolherbst: I don't display phis and exits
12:26 imirkin_: ah ok
12:26 karolherbst: its somewhere around 2200 though
12:27 karolherbst: but this is like the only real thing this benchmark does, so optimizing should be a lot easier
13:43 imirkin_: ooh, the opengl bugs list is only 1 card longer than the opengl compiler opts list!
13:46 hakzsam: nice :)
13:46 imirkin_: and i'm definitely pushing *something* to fix the PhiMovesPass situation
13:47 imirkin_: will figure it out tonight.
14:09 mupuf: skeggsb: you were right, the gm107 is definitely locked
14:09 mupuf: and we cannot read the freq
14:10 mupuf: sorry, I am talking about ptimer
14:10 mupuf: on my nve6, the crystal is definitely running at 27 MHz
14:11 mupuf: I need to ask karol to test on his gpu whether the frequency is 27 MHz or 27.xxxx
14:11 mupuf: because that may mean I would need to update the patch for his case
14:49 mupuf: well, on the maxwell, we have a clock conflict :D For fan management, the clock really is 27 MHz but for the voltage management, it is 27.648 MHz
14:50 mupuf: so, either we have two clocks
14:50 mupuf: or there is a real weirdness here
14:50 mupuf: I guess I really need to have a look at the physical board
14:50 mupuf: and look for the voltage regulator
14:55 mupuf: so far, everything looks good
14:55 mupuf: 27 MHz crystal, 20 mOhm sensing resistor
14:55 mupuf: all according to spec
15:11 mupuf: oh oh, jackpot :) http://www.richtek.com/assets/product_file/RT8120E=RT8120F/DS8120EF-02.pdf
15:37 mupuf: well, judging by the board and the datasheet, it would seem like it is not too complicated
15:39 librin: imirkin, ICMP Echo
15:39 mupuf: the voltage controller has a 0.8 V internal reference (zener diode?), it compares that to an input
15:39 mupuf: so, if you set the voltage output of the controller to this input, you will get a stable .8 V
15:39 imirkin_: librin: ICMP Echo reply
15:41 mupuf: now, looking at the board, it looks like they are using a bridge divider to divide the voltage
15:41 librin: imirkin_, You just mentioned in Bug 91526 that commenting out some BGRA4 formats made the problem go away
15:41 mupuf: changing one resistor is enough to change the output voltage, but how do you do that programmatically?
15:41 imirkin_: librin: indeed
15:41 imirkin_: that forces mesa to do a software conversion
15:42 mupuf: Well, how about have two resistors side by side and switching between them very quickly? :D
15:42 imirkin_: it's giving BGRA4 data to a glTexImage call onto a SRGB8_ALPHA8 surface
15:42 librin: Would You share that dirty hack? I would love to test if it helps with https://bugs.freedesktop.org/show_bug.cgi?id=90513
15:42 imirkin_: librin: just comment those 2 lines out... should be pretty obvious
15:42 mupuf: and here comes our 288 kHz PWM output from the GPU
15:42 librin: o.. okay...
15:42 imirkin_: librin: search for 'B4' and just prepend the entire line with //
15:42 mupuf: simple, economical and still pretty fast
15:43 librin: >using //
15:43 librin: blasphemy! /* all the way */
15:43 librin: B)
15:43 imirkin_: librin: well, you can comment it out however you like.
15:43 librin: indeed ;]
15:43 imirkin_: librin: it sounded as though you weren't very experienced with this stuff, so i went for the simplest option
15:43 imirkin_: but if you want to *delete* the two lines, that works too
15:43 imirkin_: heh
15:44 mupuf: the PWM controller itself works at 300 kHz though. I guess they wanted to be 22 kHz away to avoid any kind of buzzing sound in the human hearing range due to intermodulation of the two clocks
15:44 librin: or #if 0 and then #endif :d
15:44 imirkin_: the possibilities are endless
15:45 librin: if (0) { /* code */ }, if to make people cringe x]
15:45 librin: okay, I'll shut up now B|
15:45 imirkin_: that may not work
15:45 imirkin_: since it's inside a table
15:45 librin: ah
15:45 imirkin_: but A for creativity
15:46 mupuf: imirkin_: do you still need the g80 btw?
15:47 imirkin_: mupuf: ugh. yes. but i just won't have time to get to it.
15:47 mupuf: I think my work here is done, I will just push my updated patches to my server
15:47 imirkin_: mupuf: so feel free to unplug.
15:47 mupuf: oh, it has been unplugged for a while
15:47 mupuf: the question was: Should I plug it back :D
15:47 imirkin_: don't worry about it
15:47 mupuf: I will be gone for a week
15:47 imirkin_: i won't have time to use it in the next several days
15:47 mupuf: so, yes, I should worry a bit :p
15:47 imirkin_: people have filed actual bugs that affect actual things
15:48 imirkin_: instead of theoretical things like lod bias on a G80
15:48 mupuf: ahah
15:49 imirkin_: hmmmmm interesting. i may see what's going on.