00:21 imirkin: skeggsb: did you see the bug with fbdev emulation disabled?
00:22 imirkin: skeggsb: in case you missed it: https://bugzilla.kernel.org/show_bug.cgi?id=120591
00:25 skeggsb: imirkin: can you send me a proper patch and i'll add it
00:25 imirkin: skeggsb: oh, i sorta assumed that was the wrong approach
00:25 imirkin: but i dunno anything about those structures
00:26 skeggsb: i suspect it's fine
00:26 skeggsb: it's annoying to even have to set that value where we do, but we can't hook in at the right place with how the drm helpers work
00:26 skeggsb: (the value gets overwritten again)
00:26 imirkin: bleh
00:45 imirkin: skeggsb: i've asked the original reporter to send a fix
19:40 orbea: when I clock up nouveau fully I get numbers like this in sensors, not too bad? http://dpaste.com/2AN1CV3
19:41 imirkin_: that power reading is bs, i think we decided
19:41 orbea: it fluctuates
19:41 orbea: seen it around 30 and as high as 90
19:41 imirkin_: it changes, it's just not accurate
19:42 orbea: ah
19:42 karolherbst: orbea: I have patches :p
19:42 karolherbst: orbea: you are on my reclocking branch, right?
19:42 orbea: i am using your reclocking branch, or are these different?
19:42 imirkin_: karolherbst's got patches for what ails ya
19:42 karolherbst: apply this patch too: https://github.com/karolherbst/nouveau/commit/ad52418a53a8196605650e38829c01989ac7e0f8
19:44 orbea: i'll apply it and restart in a bit to test
19:46 karolherbst: .... how insane, I don't find the combination for the tilde key anymore ...
19:46 karolherbst: ahh now
19:47 imirkin_: shift `
19:47 karolherbst: I have the german layout
19:47 karolherbst: and an apple keyboard currently
19:47 karolherbst: doubly messed up
19:47 karolherbst: cmd + for me
19:48 imirkin_: heh
19:48 karolherbst: mupuf_: ina3221: power rail 0: unk0 = 0x1, extdev_id = 0, shunt resistors = {2 mOhm (unk 0), 2 mOhm (unk 0), 5 mOhm (unk 0)}, config = 0x7807
19:48 karolherbst: mupuf_: did you ever see something like that?
19:48 imirkin_: my favorite is the swiss german layout, where the numbers are in the right places, but all the stuff in shift-number is off-by-one.
19:48 karolherbst: I thought they are always like 5 mohm
19:48 karolherbst: :D
19:49 karolherbst: I think french layouts are messed up
19:49 imirkin_: yeah, but those aren't even close
19:49 karolherbst: always wondering why I couldn't type in passwords
19:49 karolherbst: :D
19:49 imirkin_: whereas when you use a qwerty layout on a swiss-german keyboard, and look down for the shift-number stuff, you get a nasty surprise
19:49 karolherbst: and then I figured I need to press shift + numbers
19:49 karolherbst: well
19:49 imirkin_: like & is where ^ should be, etc
19:50 karolherbst: usually I use the right layout for the keyboards I am using
19:50 karolherbst: usually I can switch between US and DE layout without troubles, but now I have to use a mac at work
19:50 karolherbst: and special keys are like totally different there
19:51 karolherbst: cmd + q on my keyboard here
19:51 karolherbst: and cmd + q quits applications on mac os x....
19:51 karolherbst: cmd + q is @ here
19:51 imirkin_: [and of course the de_DE layout != de_CH layout]
19:51 karolherbst: although the DE layout is usually quite sane compared to all the other things out there
19:54 gruetzkopf: i do switch between de_DE, en_US, en_UK and sun layouts several times per day
20:38 zeq: RSpliet: I've been trying out the 8800 with the legacy NVIDIA driver, with the clocks running at a decent speed it's pretty useful for running e21 and the F/OSS games I'm interested in. It got me wondering whether it would be possible to work out the memory voltages and timings using the nvidia-settings overclocking facility and mmiotrace? Might it not even be possible to implement an interface for arbitrary clock speeds where the card/BIOS doesn't support multiple pstates as the NVIDIA "CoolBits" does?
20:39 imirkin_: zeq: definitely possible... just need to create a way to set all the parameters for a performance level
20:39 imirkin_: could do it by first copying an existing one and munging it a bit
20:40 mupuf_: karolherbst: yes, definitely seen 2 mOhm shunt resistors
20:40 zeq: imirkin_: yes, that was something like I was thinking.
20:41 imirkin_: the G80 was definitely a reasonable gpu
20:42 imirkin_: it's just that nouveau support for some of it is ... not great. and the generic tesla bugs you hit suck too.
20:42 zeq: I'm sure it's not up to the modern games, only being GL3.3 puts a limit on that, but it seems plenty fast enough for all the OSS games I tried
20:42 imirkin_: well, GL version != rendering speed :)
20:42 zeq: unless you try to emulate later versions with fallbacks :-)
20:42 imirkin_: lol
20:43 imirkin_: yeah, if you're really desperate, you could (partially) emulate tess with geometry shaders
20:43 imirkin_: that'd be ... painful
20:45 zeq: yeah. I wasn't suggesting it was a good idea! :-)
20:47 RSpliet: zeq: I don't think NVIDIA extrapolates memory timings to higher clocks when overclocking
20:47 RSpliet: in other words, whether your higher clock works is a hit-or-miss
20:48 RSpliet: implementing overclocking/underclocking is certainly not impossible, but also a very low priority (at least for me) as we can't even reliably hit the clocks that NVIDIA did validate across all boards
20:49 zeq: RSpliet: That surprises me a little since I've seen some really big stable memory overclocks.
20:49 zeq: RSpliet: how long the device lasts is an open question of course!
20:50 RSpliet: you can check it out for yourself: run nvapeek 0x100220 0x20 in a standard mode and in a mem-overclocked mode and compare, well, I expect you'll observe no difference
20:51 zeq: RSpliet: I'll give that a go tomorrow
20:52 zeq: The system isn't powered on when my wife is home, it isn't that quiet, and it's sitting in our living room!
20:58 karolherbst: mupuf_: well I saw 2 mohms too, but never for the inas
20:58 mupuf_: oh, right
20:58 mupuf_: well, why not :)
20:58 karolherbst: exactly
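A minimal sketch of why the shunt resistor value matters for the power reading discussed above, assuming the usual shunt-monitor arithmetic (current = shunt voltage / shunt resistance, power = current * bus voltage). The function names and the sample voltages are hypothetical and are not taken from nouveau or from any INA3221 driver; only the 2 mOhm and 5 mOhm values come from the conversation.

```python
def shunt_current_ma(shunt_uv: int, shunt_mohm: int) -> float:
    """Current in mA through a shunt: microvolts / milliohms = milliamps."""
    if shunt_mohm <= 0:
        raise ValueError("shunt resistance must be positive")
    return shunt_uv / shunt_mohm


def rail_power_mw(shunt_uv: int, shunt_mohm: int, bus_mv: int) -> float:
    """Power in mW on one rail: current (mA) * bus voltage (mV) / 1000."""
    return shunt_current_ma(shunt_uv, shunt_mohm) * bus_mv / 1000


# Hypothetical sample: 10 mV across the shunt on a 12 V rail.
sample_uv, bus_mv = 10_000, 12_000
actual_mw = rail_power_mw(sample_uv, 2, bus_mv)   # 2 mOhm, as in the pasted dump
assumed_mw = rail_power_mw(sample_uv, 5, bus_mv)  # 5 mOhm, the value karolherbst expected
```

With the same measured shunt voltage, assuming 5 mOhm where the board has 2 mOhm under-reports the power by a factor of 2.5, so a wrong resistor value alone is enough to make a reading look plausible but inaccurate.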
20:59 karolherbst: anyway, if you got time, there is still that sense patch ;)
21:43 imirkin_: robclark: what would be the *minimal* amount of work i'd have to do to switch nv30 over to using nir?
21:44 robclark: imirkin, 10? (units unspecified)..
21:45 imirkin_: 10U, so like 17.5"?
21:45 robclark: imirkin, I guess it depends a bit on how featureful nv30 is..
21:45 imirkin_: "not very"
21:45 imirkin_: or perhaps better described as "very not"
21:45 robclark: heheh
21:46 robclark: well, then probably fairly easy, I guess..
21:46 imirkin_: i suspect it's fairly akin to a2xx, or r300, or ... etc
21:46 imirkin_: i was hoping you could quickly identify a few things that would have to be done
21:46 robclark: imirkin, I guess easy enough to hack up a call to tgsi->nir, and then nir_print()..
21:46 imirkin_: note that currently there's *no* IR at all
21:46 imirkin_: TGSI is directly serialized into the instruction stream
21:46 robclark: I guess the main thing is register allocation?
21:47 imirkin_: right, so there's a finite quantity of registers
21:47 imirkin_: and nowhere to spill them
21:47 robclark: no arrays / indirect registers?
21:47 robclark: (indirect register addressing)
21:47 imirkin_: there's an address register
21:47 imirkin_: (i'm fairly sure)
21:47 imirkin_: since ARB_vp & co had it...
21:48 imirkin_: hrm... the pipe shader caps say no
21:48 robclark: regalloc gets much easier without..
21:48 imirkin_: but my recollection is "yes"
21:49 robclark: maybe there is some value in spiffing up NIR ssa->regs pass to do better.. since I suspect etnaviv might like to have the same thing..
21:50 imirkin_: ok, looks like indirect stuff was never hooked up, but i'm fairly sure it can do it ... somehow.
21:51 imirkin_: but on the bright side, that means i can leave it non-hooked-up
21:51 imirkin_: yeah, vertex shaders had an ARL, which probably could only be used for const[] anyways
21:53 imirkin_: anyways... i have no idea how to hook any of this stuff up - since you've done it semi-recently on a moderately simple backend, i was hoping you could point me to specifics
21:56 imirkin_: oh crazy, vp can do indirect on both const and inputs. and actually it *is* supported.
21:57 imirkin_: robclark: but basically i'd like to be able to do stuff like ... i dunno, constant/input propagation into instructions where it's allowed, etc
21:57 robclark: tbh indirect on const isn't such a problem..
21:58 robclark: and actually, if you aren't doing regalloc for inputs, it isn't really an issue there..
21:58 imirkin_: right, i access inputs by doing MOV foo, IN[]
21:58 imirkin_: much like tgsi :)
21:58 robclark: heheh
21:58 imirkin_: but not 100% like tgsi - there are various restrictions, etc
21:59 robclark: well, tbh, what I'd suggest is just play around w/ tgsi->nir + random nir opt passes + nir_print() and then abort(), vs shaderdb.. and have a look at what the NIR you get looks like..
22:00 imirkin_: i was hoping for something a little more prescriptive :)
22:01 imirkin_: i'll live though
22:01 robclark: well.. not knowing the nv30 hw too much, not sure what else I could point out.. but I think main thing is just regalloc and it sounds like nv30 is simple enough that a fairly simple regalloc should work..
22:01 robclark: imirkin, what does nv30 have as far as flow control?
22:01 imirkin_: robclark: limited :)
22:02 imirkin_: it does do branches
22:02 imirkin_: but i'll have to reimplement the relocation logic
22:02 robclark: ok, so branches slightly complicate regalloc too..
22:02 imirkin_: isn't there already a register allocator everyone uses?
22:02 robclark: yeah
22:02 robclark: in util
22:02 robclark: you still have to tell it what the live ranges are
22:03 imirkin_: nv30 frag shaders have this: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nv30/nvfx_shader.h#n268
22:03 imirkin_: and by "nv30 frag shaders" i mean "nv40 frag shaders"
22:03 robclark: heh, call/ret even
22:03 robclark: imirkin, a more practical question might be "how much of that is actually implemented in current nv30 compiler"..
22:04 imirkin_: robclark: well, as you can imagine, this maps fairly directly onto TGSI
22:05 imirkin_: i.e. the looping stuff
22:05 imirkin_: but it's implemented rather poorly
22:06 imirkin_: vertex shaders only have a branch from what i can tell
22:06 imirkin_: presumably somehow conditionalizable :)
22:06 robclark: (one would hope)
22:06 imirkin_: this is what we advertise in terms of caps: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nv30/nv30_screen.c#n255
22:07 robclark: ouch.. 13 temps..
22:07 imirkin_: right, you see how RA might be useful ;)
22:08 robclark: heheh
22:08 imirkin_: indirect_const_addr should be supported though
22:08 imirkin_: not sure why it's not listed =/
22:09 imirkin_: i'll play with that later
22:09 imirkin_: oh, and in case it's not obvious - this is a vector isa
22:09 robclark: right
22:09 robclark: so you'd skip the scalarizing pass, ofc
22:09 imirkin_: so 1 reg = 4 value
22:09 imirkin_: i doubt i can do that
22:09 imirkin_: however there's a unscalarize pass somewhere
22:10 imirkin_: otherwise SSA is a wee bit painful
22:10 robclark: actually, I think i965 scalarizes and then re-vectorizes, but I didn't look at how that works..
22:11 robclark: btw, the vc4 regalloc code looks somewhat simpler than ir3, so might be easier example of register_alloc code..
22:11 robclark: (although, it is not dealing w/ flow control yet, afaiu)
22:11 imirkin_: right, flow control's annoying
22:11 imirkin_: i was hoping someone had solved this already
22:11 imirkin_: i guess i can wait.
22:12 imirkin_: [until i'm the only person running linux with a nv3x plugged in]
22:13 robclark: flow control isn't *that* bad.. you just need to know where loop ends and extend the live-range.. or at least that is the simple way to do it..
22:13 robclark: (and I think what both ir3 and i965 do)
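The "extend the live-range to the loop end" approach robclark describes can be sketched as follows. This is a simplified illustration under stated assumptions, not the ir3, i965, or Mesa util code: instructions are modelled as hypothetical (defs, uses) pairs of virtual register names, loops are given as (first, last) instruction indices, and only values defined before a loop and last used inside it are extended.

```python
from typing import Dict, List, Tuple

Instr = Tuple[List[str], List[str]]  # (defs, uses) as virtual register names


def live_ranges(instrs: List[Instr],
                loops: List[Tuple[int, int]]) -> Dict[str, Tuple[int, int]]:
    """Return {reg: (first_index, last_index)} with loop-aware extension."""
    ranges: Dict[str, List[int]] = {}
    # Pass 1: naive interval from first mention to last mention.
    for idx, (defs, uses) in enumerate(instrs):
        for reg in defs + uses:
            if reg in ranges:
                ranges[reg][1] = idx
            else:
                ranges[reg] = [idx, idx]
    # Pass 2: a value that is live into a loop and last used inside it must
    # survive until the back-edge, so stretch its interval to the loop end.
    for start, end in sorted(loops, key=lambda loop: loop[1] - loop[0]):
        for span in ranges.values():
            if span[0] < start <= span[1] <= end:
                span[1] = end
    return {reg: (span[0], span[1]) for reg, span in ranges.items()}


program: List[Instr] = [
    (["a"], []),          # 0: a = ...
    (["i"], []),          # 1: i = 0
    (["t"], ["a", "i"]),  # 2: loop start: t = a + i
    (["u"], ["t"]),       # 3: u = f(t)
    (["i"], ["u"]),       # 4: i = u, branch back to 2 (loop end)
    (["r"], ["i"]),       # 5: r = i
]
ranges = live_ranges(program, loops=[(2, 4)])
```

Without the second pass, `a` would appear dead after instruction 2, and an allocator could hand its register to `u`, clobbering `a` before the next iteration reads it.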
22:13 imirkin_: so ...
22:13 imirkin_: in this hypothetical scenario
22:13 imirkin_: i'm not creating a fresh new ir
22:13 imirkin_: or do i have to?
22:13 orbea: how do I ensure /lib/modules/$VERSION/extra/nouveau.ko loads over /kernel/drivers/gpu/drm/nouveau/nouveau.ko at boot? I thought I had it before with just depmod -a, but now it's not loading anymore and modinfo shows the default nouveau driver...
22:14 robclark: I think if you didn't have to create an IR for tgsi, you probably don't have to nir.. guess it depends on how you do regalloc..
22:14 robclark: imirkin, and I would tend to think a nir->nir regalloc pass would be useful.. I suspect etnaviv would like something like that too..
22:15 imirkin_: orbea: don't stick the other one into /extra?
22:15 orbea: where should I put it then so that it does not overwrite the default one? Or am I supposed to do that?
22:16 imirkin_: orbea: overwrite the other one. having multiple modules that handle the same hw is not a good idea.
22:16 orbea: okay, thanks
22:17 imirkin_: robclark: hm. so i guess it's not quite at the plug & play level yet =/
22:19 robclark: well, I'd say spend 15 minutes playing w/ tgsi->nir and other passes sometime to see what comes up.. maybe try to bodge up something simple ignoring r/a just to get a feel for it without spending much time..
22:19 imirkin_: well... with 13 registers, it can be hard to ignore RA
22:19 robclark: I guess you could still draw a shaded quad or something ;-)
22:19 imirkin_: yeah, but i can do that now...
22:20 robclark: imirkin, btw, is it two different isa's for frag and vert?
22:20 imirkin_: robclark: of course!
22:20 imirkin_: why would they be the same?
22:20 imirkin_: that's just crazy talk!
22:20 imirkin_: oh, and the encodings change between nv30 and nv40
22:20 robclark: wasn't talking about the no-r/a case as an end goal, but rather just a way to scope things out..
22:20 robclark: heh, joy
22:20 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nv30/nvfx_shader.h#n10
22:22 imirkin_: and i guess *technically* there's an IR
22:22 imirkin_: but practically it's a list of instructions without any real connection to each other
22:24 robclark: imirkin, are there funky register classes, or restrictions about location of tex samp instruction srcs or that sort of thing? If no, that makes generic r/a easier..
22:24 robclark: (anyways, carpool time.. bbl..)
22:25 imirkin_: robclark: don't *think* so
23:12 gadmt: how do i manually reclock on nouveau?
23:17 imirkin: echo foo > .../pstate
23:19 gadmt: where would pstate be, and what would i set for foo? would i set the core clock speed for foo?
23:21 imirkin: gadmt: cat .../pstate gets you the available perf levels
23:21 imirkin: gadmt: what gpu do you have, and what kernel are you running?
23:22 gadmt: i have an nvidia quadro nvs 160m currently, and im on kernel 4.4
23:22 imirkin: ok. i think pstate is in debugfs in 4.4 already... check /sys/kernel/debug/dri/0/pstate
23:22 kivoduletahi: who wanted to be cool?
23:23 gadmt: no such file or directory
23:23 imirkin: is this an optimus setup?
23:24 gadmt: no
23:24 imirkin: (is there a /1/pstate?)
23:24 imirkin: ok, then that means i was wrong, the cutoff is 4.5
23:24 gadmt: when i do lshw | less , it reports a really low clock speed for the 160m
23:24 gadmt: 33MHz
23:24 imirkin: in which case you need to load nouveau with nouveau.pstate=1, and it will be in /sys/class/drm/card0/device/pstate
23:24 imirkin: that's the pci clock speed probably?
23:25 imirkin: or something else entirely unrelated, dunno
23:25 imirkin: i wouldn't worry about that
23:25 gadmt: in games, even in small 2d ones, performance is very poor
23:25 imirkin: i forget when reclocking for G98 was merged... either 4.4 or 4.5, i guess we'll see
23:25 gadmt: can barely get above 10 fps in awesomenauts even in menus
23:25 imirkin: also it's unclear how reliable it'll be if you have DDR2 vram
23:26 imirkin: well, the G98 is a very slow gpu to start with
23:26 gadmt: no file or directory either, do i create one?
23:27 imirkin: did you load nouveau with pstate=1?
23:30 gadmt: in /etc/default/grub right?
23:31 imirkin: however you load nouveau... if it's a module, you can stick something into modprobe.conf
23:31 imirkin: or you can put nouveau.pstate=1 onto the kernel cmdline
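The manual reclocking steps above (find the pstate file, cat it for the available levels, echo a level into it) can be sketched as a small helper. The two paths are the ones named in the conversation; which one exists depends on the kernel version and on whether nouveau was loaded with pstate=1. The function names are hypothetical, the output format of the file is not parsed because it is not specified here, and writing normally needs root.

```python
import os
from pathlib import Path
from typing import Iterable, Optional

# Locations mentioned in the conversation; neither is guaranteed to exist.
CANDIDATE_PSTATE_PATHS = (
    "/sys/kernel/debug/dri/0/pstate",
    "/sys/class/drm/card0/device/pstate",
)


def find_pstate(candidates: Iterable[str] = CANDIDATE_PSTATE_PATHS) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None."""
    for candidate in candidates:
        # os.path.isfile returns False (rather than raising) on missing or
        # unreadable paths, which suits debugfs when not running as root.
        if os.path.isfile(candidate):
            return Path(candidate)
    return None


def read_perf_levels(path: Path) -> str:
    """Equivalent of `cat .../pstate`: returns the raw text, unparsed."""
    return path.read_text()


def set_perf_level(path: Path, level: str) -> None:
    """Equivalent of `echo <level> > .../pstate`."""
    path.write_text(level + "\n")
```

If `find_pstate()` returns None, that matches the "no such file or directory" case in the log: either the kernel predates the interface or the module was loaded without the pstate option.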
23:33 orbea: yea, way different watts with that patch karolherbst linked me, before: http://dpaste.com/2AN1CV3 after: http://dpaste.com/1JAPVX9
23:33 robclark: imirkin, ok.. I should check what etnaviv needs.. if there is some possibility to re-use a nir ra pass, then that is something I could spend some time on..
23:51 imirkin: robclark: does nir allow different register types? like const vs temp?
23:51 gadmt_: modprobe.conf is only found in lib/linux-sound-base
23:52 gadmt_: how would i bring up the kernel command line
23:52 imirkin: gadmt_: if you're looking for help operating linux, i recommend finding a distro-specific support channel
23:54 robclark: imirkin, it is just temp.. intrinsics for everything else, which is kind of nice if things that aren't temp are kind of special.. if you can directly access input/const file with no restrictions, I suppose you'll want some sort of primitive cp..
23:54 imirkin: robclark: the thing is that load propagation has to be done before RA (or else it defeats the purpose of load propagation)
23:54 robclark: right
23:55 robclark: hmm
23:57 robclark: well, I guess it is a bit kludgy, but if you maybe had a way to flag certain things as "dont actually alloc a register to this thing".. (and ideally you'd do that before coming out of ssa?)..
23:57 robclark: (ie. so you leave the load_uniform's and load_input's in ssa form)
23:58 imirkin: right ... i think one needs to either create an IR that can represent instructions
23:58 imirkin: or nv30 is staying as-is
23:58 imirkin: (at least by my hand)