00:00 mupuf: Lyude: I would say that it is more likely a per-chipset magic rather than anything in the vbios
00:00 mupuf: if it were in the vbios, it would mean OEMs were supposed to be able to tweak with that
00:01 mupuf: it may be useful... but only if the power input would have different bandwidth
00:01 mupuf: I honestly don't think nvidia would allow people to shoot themselves in the foot
00:01 mupuf: especially when we see how bashed into shape the vbioses are
00:05 Lyude: mupuf: mh, somewhat interesting they vary so much between two cards of the same gen and model though (i only just noticed this part)
00:06 mupuf: ok, then you know it is up to the vbios developer :D
00:06 mupuf: good luck finding the right table!
00:06 Lyude: aaa. that being said; i'm not sure if the initial value really matters
00:06 Lyude: i hope it doesn't
00:07 Lyude: also, did kepler2 get rid of NVKM_ENGINE_CE0?
00:07 mupuf: did you read the related patents, btw?
00:07 Lyude: I haven't read through them but I took a quick peek, is there something interesting in there I didn't notice?
00:08 mupuf: well, the values you are bashing are just delays
00:08 mupuf: for a very simple state machine
00:08 Lyude: I knew that much at least, we've got it documented in rnndb
00:09 mupuf: I wonder who did that O:-)
00:09 Lyude: hehe
00:09 mupuf: well, actually, did I actually document the values?
00:09 Lyude: only for CG_CTRL
00:09 mupuf: ah, possibly
00:10 mupuf: mwk: http://ng.0x04.net/~mwk/scans/ <-- down :s
00:10 Lyude: mupuf: do you know if that's actually separate from BLCG though? i've been looking at the values and they do look somewhat alike
00:10 mupuf: Lyude: mwk has stored reg dumps of most ranges on a lot of cards, this is how I went from the gk20a down to Tesla
00:11 mupuf: finding regs with similar bitfields
00:11 mupuf: or, when they changed, similar addresses
00:13 Lyude: btw, is that website supposed to take forever to load?
00:13 mupuf: Lyude: nope. it is down
00:13 mupuf: been so for some time
00:20 Lyude: Oh, lol, thought you were linking me to that but now I see i misread
00:21 Lyude: hm, PTHERM.PCOPY_CG_CTRL looks like it's rather weird and maybe mislabeled, it seems to be the only engine on this card that isn't getting its cg settings written automatically through the engine init hooks, which makes me think it's not the engine on kepler we think it is
00:33 mupuf: Lyude: oversight? Or they got rid of the clock gating logic
02:35 rhyskidd: Lyude: am going through your SLCG registers pull to envytools
02:35 rhyskidd: quick question: did you take a look at the adjacent SLCG registers from nvgpu etc?
02:36 rhyskidd: i can drop in a comment with a few other registers that are adjacent, just trying to understand why *these* particular ones you wanted to add
03:48 AndrewR: http://www.polyteknisk.dk/related_materials/0321278542_CH03.pdf - ppc G5/pcie overview ....
05:05 AndrewR: http://www.ibmfiles.com/ibmfiles/powerpc/itso_powerpc_inside_view.pdf - see 2.4.3 for early ppc memory/caching control (not in g5)
22:41 tobijk: imirkin: what did you say was the reason we could not optimize this kind of thing https://hastebin.com/josizonawa.pl ? (i forgot :/)
22:45 imirkin_: oh
22:45 imirkin_: it's not impossible
22:45 imirkin_: it's just annoying
22:46 tobijk: imirkin_: oh well, that's right, i just "optimized" that %r255 there
22:47 tobijk: got me one instruction less in total in one big shader :D
22:48 karolherbst: imirkin_: would it be possible to do something like $r254d? Just curious
22:49 imirkin_: $r255d
22:49 karolherbst: yeah I know that one
22:49 karolherbst: but I meant 254d
22:49 imirkin_: 254d would be weird.
22:49 karolherbst: well right, but
22:50 tobijk: karolherbst: like this https://hastebin.com/iretipomil.pl
22:50 tobijk: ?
22:51 karolherbst: tobijk: well guess what, you could remove that assignment to r1 as well
22:51 tobijk: karolherbst: yeah, quick copy paste :D
22:51 karolherbst: and no
22:51 karolherbst: has to be 254 the other value as well
22:51 karolherbst: but yeah
22:51 karolherbst: in theory
22:51 karolherbst: but nothing you should target anyway, because that is just super rare that you can do that
22:51 karolherbst: maybe on chips with just 63 regs?
22:51 karolherbst: dunno
22:52 imirkin_: of course in order to do that you'd have to say you use all 255 regs
22:52 imirkin_: which means less parallelism
22:52 karolherbst: right
22:52 karolherbst: but it might make sense if you only have 63 and use all of those
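[editor's note] The trade-off imirkin_ mentions (declaring more registers per thread means less parallelism) can be sketched with some rough arithmetic. The limits below are assumptions chosen for illustration (Kepler-like figures: 65536 registers per multiprocessor, 32 threads per warp, at most 64 resident warps); `residentWarps` is a hypothetical helper, not a nouveau function, and it ignores allocation granularity and every other resource limit.

```cpp
#include <cassert>

// Assumed hardware limits, for illustration only (Kepler-like figures).
constexpr int kRegsPerSM      = 65536; // 32-bit registers per multiprocessor
constexpr int kThreadsPerWarp = 32;
constexpr int kMaxWarpsPerSM  = 64;

// Hypothetical helper: how many warps fit on one multiprocessor when every
// thread declares regsPerThread registers. Only the register file and the
// warp cap are modelled here.
constexpr int residentWarps(int regsPerThread) {
    int byRegs = kRegsPerSM / (regsPerThread * kThreadsPerWarp);
    return byRegs < kMaxWarpsPerSM ? byRegs : kMaxWarpsPerSM;
}
```

Under these assumptions a shader declaring 255 registers leaves room for far fewer warps than one declaring 63, which is why reaching for the top of the register file just to save one instruction is rarely worth it.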
22:53 tobijk: i forgot why this is not possible: https://hastebin.com/ositakuwok.pl
22:53 karolherbst: who said it isn't
22:54 tobijk: imirkin_: two days ago (no credit, i'm too lazy to find it in the logs :D)
22:55 karolherbst: imirkin_: today I tracked down a funky bug: timer triggering therm to check for the temperature causing a reclock before clk or therm ran init at all ;) PMU didn't answer
22:56 karolherbst: that's on resuming the GPU
22:56 karolherbst: (and loads of my patches)
22:56 karolherbst: but you get the idea what might be causing painful issues on resuming a GPU
23:04 feaneron: what's nouveau's main git repository?
23:05 tobijk: feaneron: there is not "the" nouveau, there are several parts
23:06 tobijk: which noe do you want, mesa, libdrm, kernel
23:06 tobijk: *one
23:06 feaneron: yeah, sorry; i had the kernel one in mind
23:06 tobijk: feaneron: of course the main linux kernel tree :)
23:07 imirkin_: tobijk: it's possible.
23:07 imirkin_: tobijk: just ... difficult to do.
23:07 tobijk: imirkin_: maybe i got you wrong then
23:07 imirkin_: that literal code will work
23:07 imirkin_: getting nouveau to generate that code? surprisingly hard.
23:07 imirkin_: may be possible in some cases though
23:08 imirkin_: like if the source is an imm and not e.g. a phi
23:08 imirkin_: you're looking for the constraint moves
23:08 tobijk: imirkin_: oh i only wanted it for imm's for mow
23:08 tobijk: *now
23:08 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp#n2294
23:09 feaneron: tobijk: hm... i'm not literate on the linux development flow. i thought the main kernel tree would have merge requests from various maintainers; thus, assumed there would be a "nouveau" maintainer tree that would get pulled in
23:09 imirkin_: feaneron: https://github.com/skeggsb/linux
23:09 feaneron: saw a skeggsb/nouveau repo in github
23:09 feaneron: aha
23:09 imirkin_: and an out-of-tree module at https://github.com/skeggsb/nouveau
23:10 feaneron: ok, thanks!
23:10 imirkin_: (same code ... kinda)
23:10 feaneron: has to figure out how to install the out-of-tree code
23:11 imirkin_: cd drm; make
23:11 feaneron: perhaps it needs some path adjustments? 'make install' isn't enough to get the kernel to load it
23:12 tobijk: imirkin_: i have my change sitting here: https://hastebin.com/mawonimiba.php
23:13 tobijk: maybe not the right place, but i started there as i found it kind of appropriate
23:14 imirkin_: + Instruction *ld = i->getSrc(0)->getInsn();
23:14 imirkin_: can't do that.
23:15 tobijk: ?
23:15 imirkin_: can't call getInsn()
23:15 imirkin_: no such thing post-ra
23:15 imirkin_: let's say you have this scenario
23:15 imirkin_: if () { x = 5; } else { x = 6; } mov foo, x
23:16 imirkin_: x->getInsn() will return the immediate load
23:16 imirkin_: and this will become mov foo, 6
23:16 imirkin_: (or 5, but either way - wrong)
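[editor's note] The hazard imirkin_ describes can be illustrated with a toy model. This is a sketch only: `ToyInsn`, `naiveImmFor`, `safeImmFor` and `branchyDefs` are hypothetical names and are not part of nv50_ir. The assumption is that, after register allocation, one register may be written on several control-flow paths, so asking for "the" defining instruction returns just one of them.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy stand-in, NOT the real nv50_ir classes: every instruction here is
// "mov reg, imm".
struct ToyInsn {
    int destReg; // register written by this instruction
    int imm;     // immediate being loaded
};

// Naive lookup, in the spirit of calling getInsn() post-RA: take the first
// definition found and pretend it is the only one.
std::optional<int> naiveImmFor(const std::vector<ToyInsn>& defs, int reg) {
    for (const ToyInsn& d : defs)
        if (d.destReg == reg)
            return d.imm;
    return std::nullopt;
}

// Conservative lookup: fold only if every definition of the register agrees.
std::optional<int> safeImmFor(const std::vector<ToyInsn>& defs, int reg) {
    std::optional<int> found;
    for (const ToyInsn& d : defs) {
        if (d.destReg != reg)
            continue;
        if (found && *found != d.imm)
            return std::nullopt; // conflicting definitions: do not fold
        found = d.imm;
    }
    return found;
}

// if () { x = 5; } else { x = 6; }  with x assumed to live in register 3
std::vector<ToyInsn> branchyDefs() {
    return { ToyInsn{3, 5}, ToyInsn{3, 6} };
}
```

With the two-branch example, the naive lookup reports a single immediate even though the other branch may have run, so folding it into `mov foo, x` would be wrong; the conservative lookup declines to fold.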
23:20 tobijk: so, what would you propose then? solely operate on the Values (and the op we are coming from)?
23:21 imirkin_: do the thing i suggested in the place i suggested
23:21 imirkin_: i.e. insertConstraintMoves()
23:22 imirkin_: oh
23:22 imirkin_: ooooh
23:22 imirkin_: another alternative
23:22 imirkin_: a much better one
23:22 imirkin_: which will actually work really well
23:22 imirkin_: and be easy to implement
23:22 imirkin_: is to allow immediates to be propagated into MERGE arguments
23:22 imirkin_: should be a 1-line change somewhere
23:23 tobijk: mhm
23:25 tobijk: btw, i can't follow your argument up there completely. how should i get different values? i only allow one single value, that is 0, and with ld i want the imm load
23:30 tobijk: imirkin_: -^