03:44karolherbst: pecisk: hi, mupuf needs your card :p
03:44pecisk: karolherbst: to do nasty things with it? :)
03:44karolherbst: of course
03:45karolherbst: your power sensor configuration is a bit odd on your card
03:45karolherbst: which completely messed up mupufs first design :p
03:51RSpliet: karolherbst: interesting... what kind of a file system is MupuFS? :-P
03:51karolherbst: multi push fs
03:51karolherbst: it is for SIMD fs operations
03:52karolherbst: or was it rather MISD?
03:52karolherbst: don't know
03:54mupuf: pecisk: yeah, your card is funky
03:55mupuf: but right now, I need to be able to read the power consumption from the blob
03:55pecisk: mupuf: I will have access in 7 hours time
03:55pecisk: at work atm
03:56mupuf: i won't be ready by then
03:56mupuf: maybe during the week end
03:56mupuf: and if necessary, I will have to be nasty and ... fake having an INA3221 by decoding transactions on the bus and then fake being the device but from the host
03:57mupuf: it is going to be fun and very cpu-intensive
04:00karolherbst: mupuf: any idea how to map the right hwmon interface to nouveau?
04:01mupuf: yes, do the opposite
04:01mupuf: go to /sys/class/drm/card0/hwmon
04:01karolherbst: there is no hwmon
04:02mupuf: ah, here you go
04:02karolherbst: this seems easy enough
04:02mupuf:has no nvidia gpu plugged on his work and home machines
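[Editor's sketch] The lookup discussed above can be sketched in shell. This is only an illustration: the function name card_hwmon_dirs is made up, and the device/ path component is an assumption about the usual sysfs layout, since the log does not show the path mupuf ended up with.

```shell
# Print the hwmon directories that belong to a DRM card.
# $1: card name such as card0; $2: sysfs root (defaults to /sys).
# Assumption: hwmon entries live under <card>/device/hwmon/hwmonN.
card_hwmon_dirs() {
    card=$1
    sysfs=${2:-/sys}
    for dir in "$sysfs/class/drm/$card/device/hwmon"/hwmon*; do
        # An unmatched glob stays literal, so check the directory exists.
        [ -d "$dir" ] && printf '%s\n' "$dir"
    done
    return 0
}
```

Something like card_hwmon_dirs card0 would then list the candidate directories, and an empty result would match karolherbst's observation that the intel gpu has no hwmon.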
04:02karolherbst: so in ezbench, do you already know which card is used?
04:02karolherbst: mupuf: the same with the intel gpu :p
04:03karolherbst: ohh wait
04:03mupuf: that is part of the environment
04:03karolherbst: the intel gpu doesn't have hwmon
04:03mupuf: yes, I don't remember intel putting anything there
04:03mupuf: but I don't work in the kernel for intel
04:04mupuf: as for ezbench, right now, it will only check if the deployed version is the one you want to test
04:04mupuf: that is it
04:04mupuf: the environment is for next week
04:04karolherbst: I was thinking that I hack power consumption monitoring in ezbench
04:04mupuf: after the auto bisection
04:04karolherbst: but for this, I need to know which gpu is used
04:04mupuf: yes, we have a tool for that for intel, hence why I have not worked on it yet for other drivers
04:05karolherbst: and there is no plan to support it through hwmon?
04:05mupuf: support what?
04:05karolherbst: power consumption for the intel gpu
04:05mupuf: no, I doubt it will happen
04:05karolherbst: mhhh :/
04:05mupuf: you need to use RAPL for that
04:05karolherbst: what a nice interface
04:06mupuf: but even then, RAPL exposes power domains but even we do not know what they actually represent on the chip
04:06mupuf: so, annoying it is!
04:06mupuf: nouveau is the odd guy when it comes to hwmon
04:06mupuf: radeon is a middleground
04:06mupuf: but /me wanted to expose everything through hwmon
04:07mupuf: and I see you agree
04:07karolherbst: I think pushing stuff through hwmon is generally a good idea, seems like I am one of the few that thinks that way :D
04:07karolherbst: I always remember these days on my mac, where temperaturemonitor just exposed like 16 sensors :D
04:07mupuf: so, back to the problem at hand
04:08mupuf: how can we link a gl context to a gpu
04:08mupuf: well, well
04:08karolherbst: I bet there is a fancy mesa GL ext for this :p
04:08mupuf: the best way would be to check which /dev/dri/cardX is being opened
04:09mupuf: or renderDXXX
04:09karolherbst: yeah render
04:09karolherbst: but this is a bit vague though
04:09karolherbst: may work
04:09mupuf: and then map it to access through sysfs
04:09karolherbst: but I don't like this approach :D
04:09mupuf: I guess you need to strace glxinfo to get this info :s
04:10karolherbst: ezbench just calls any application or is it GL specific?
04:10karolherbst: I mean for what should it be used in the end
04:10mupuf: ezbench does not care that much about the app
04:10mupuf: the app can do whatever it wants with the environment too :s
04:10mupuf: or force-load a driver
04:11mupuf: hence why I leave the environment for the end
04:11karolherbst: I would rather just preload a hook for specific glx/gl functions
04:11mupuf: it is a HARD problem
04:11mupuf: how would that help you?
04:11karolherbst: mhh right
04:11mupuf: and how would it work on other drivers than mesa?
04:11karolherbst: ohh wait
04:11karolherbst: maybe we could hook up inside the context creation, catch the context and push out which card is used through some nasty gl(x) calls
04:12mupuf: sure! Let's patch mesa!
04:12mupuf: and gl definitely needs to expose which dri node it is using
04:13mupuf: sorry, sarcasm overload :D
04:13karolherbst: we need to get the pci bus somehow
04:13mupuf: why do you need the pcibus?
04:13mupuf: you just need to find the right cardX to go to get to hwmon
04:13karolherbst: I don't really like the approach of checking what the running application opens inside /dev
04:14karolherbst: maybe there is a better one
04:14mupuf:is still thinking about this
04:14mupuf: and I would definitely value feedback
04:15karolherbst: maybe preload and hook into open() (or whatever is used for /dev/dri/...) calls :p
04:15mupuf: but the surest way is strace
04:15mupuf: how is that better than strace?
04:16mupuf: strace -e open glxinfo > /dev/null
04:17karolherbst: how can I pipe the error channel? :D
04:17karolherbst: yeah well
04:18karolherbst: that goes into /dev/null already
04:18karolherbst: I already tried that
04:18mupuf: don't redirect to dev null then
04:18karolherbst: but with a nasty grep it shall be fine
04:18RSpliet: 2> /tmp/error_trace.txt
04:18RSpliet: &1 is only a pointer to "whatever your output channel is"
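[Editor's sketch] RSpliet's point about &1 can be illustrated in shell: strace writes its trace to stderr, so the order of the two redirections decides what reaches grep. The helpers noisy and stderr_only are hypothetical stand-ins, not part of any tool mentioned here.

```shell
# Stand-in for a traced program: payload on stdout, trace on stderr.
noisy() {
    echo "normal output"
    echo 'open("/dev/dri/renderD129", O_RDWR|O_CLOEXEC) = 5' >&2
}

# Run a command and pass on only its stderr.
# 2>&1 first points stderr at the current stdout (the pipe or capture),
# then >/dev/null redirects stdout alone, so the trace survives.
stderr_only() {
    "$@" 2>&1 >/dev/null
}
```

With this, stderr_only strace -e open glxinfo | grep /dev/dri would filter the trace. On newer systems the calls may show up as openat instead, in which case -e trace=open,openat is probably what is needed.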
04:19karolherbst: that is somehow useless
04:19karolherbst: glxinfo doesn't open /dev/dri
04:19karolherbst: DRI_PRIME=1 glxinfo opens /dev/dri/renderD129 though
04:20karolherbst: intel opens /sys/devices/pci0000:00/0000:00:02.0/drm/card0/ stuff
04:21karolherbst: and /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/renderD129/uevent
04:21mupuf: yeah, I only see the uevents on intel
04:21mupuf: on dri3
04:21karolherbst: glxinfo is really nasty
04:21karolherbst: grep dev: https://gist.github.com/karolherbst/54e286cca698afc71624
04:22karolherbst: and I am not using my nvidia card
04:23karolherbst: this is for nouveau: https://gist.github.com/karolherbst/3a9dfd47cea7a662e2d3
04:24mupuf: open("/dev/dri/renderD129", O_RDWR|O_CLOEXEC) = 5 --> this is what I expected to see for both
04:24mupuf: mapping from the render node to the normal node is already a bit annoying but doable
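[Editor's sketch] One possible way to do the render node to card node mapping through sysfs, assuming the usual layout where a render node and its card share the same parent device. The function name render_to_card is made up for illustration.

```shell
# Print the cardN name(s) that share a parent device with a render node.
# $1: render node name such as renderD129; $2: sysfs root (defaults to /sys).
# Assumption: <node>/device/drm/ holds the sibling cardN directory.
render_to_card() {
    node=$1
    sysfs=${2:-/sys}
    for dir in "$sysfs/class/drm/$node/device/drm"/card*; do
        [ -d "$dir" ] && basename "$dir"
    done
    return 0
}
```

On a machine like the one in the log, render_to_card renderD129 would be expected to print the card node of the second gpu.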
04:24karolherbst: intel mesa seems to use sysfs
04:24karolherbst: and not devtmpfs
04:25karolherbst: mupuf: does ezbench want to support the blob, too?
04:25mupuf: the uevent opens are likely from libudev
04:25karolherbst: okay wait :D
04:26mupuf: I do not want to add unnecessary restrictions
04:26mupuf: and when we want to compare the performance of nouveau with the blob, we will need to be able to do it
04:26karolherbst: this is with primusrun: https://gist.github.com/karolherbst/dda0e2353244d58710cc
04:26karolherbst: optirun: https://gist.github.com/karolherbst/0c002523f0a33dd516e1
04:26mupuf: open("/dev/nvidia0", O_RDWR) = 10 --> as expected
04:27karolherbst: okay, so the nvidia part seems to be the easiest one
04:28karolherbst: the question is, how do we map to the i2c device then?
04:28mupuf: being able to detect what is the current driver running is a useful feature!
04:29mupuf: glxinfo can already tell you the name
04:29karolherbst: mhhhh :D
04:29mupuf: gl exposes it
04:29mupuf: but we need the node
04:29karolherbst: stupid nvidia, adding nvidia-modeset really messed up bumblebee for good :/
04:30karolherbst: oh well
04:42mupuf: karolherbst: what pci id do you get when you do: LIBGL_DEBUG=verbose glxgears?
04:42mupuf: because for intel, I get funky stuff :D
04:42mupuf: libGL: pci id for fd 4: 8086:0412, driver i965
04:43mupuf: but I guess it is not a real problem for intel since there can be only one gpu ... per cpu
04:43mupuf: darn it :D
04:45mupuf: libGL: pci id for fd 4: 10de:11c0, driver nouveau
04:45mupuf: that will work!
04:46mupuf: hakzsam: are you using reator?
04:46mupuf: sorry if I messed up your setup then
04:46hakzsam: everything is okay
04:51karolherbst: I really like how nvidia crashes my system sometimes
04:51karolherbst: mupuf: libGL: pci id for fd 4: 8086:0416, driver i965
04:52mupuf: well, if we can map this to a pci address like ../../../0000:05:00.0, we have a win
04:52karolherbst: lspci there ya go
04:52karolherbst: lspci -nn
04:53mupuf: lspci -n
04:53mupuf: very nice
04:56mupuf: cat /sys/bus/pci/devices/*00:01:00.0/hwmon/hwmon*/temp1_input
04:56karolherbst: lspci -nn | grep $(LIBGL_DEBUG=verbose glxinfo 2>&1 | grep "pci id" | cut -d\ -f7 | cut -d, -f1) | cut -d\ -f1
04:56mupuf: so, you just need to rename 00:01:00.0 with what you found in the pciid
04:57karolherbst: /sys/bus/pci/devices/*$(lspci -nn | grep $(LIBGL_DEBUG=verbose glxinfo 2>&1 | grep "pci id" | cut -d\ -f7 | cut -d, -f1) | cut -d\ -f1)/
04:58karolherbst: this seems to work for me
04:58mupuf: not here
04:58mupuf: but it seems like your line got cut or something
04:59mupuf: lspci -nn | grep $(LIBGL_DEBUG=verbose glxinfo 2>&1 | grep "pci id" | cut -d\ -f7 | cut -d, -f1) | cut -d\ -f1 --> this one already fails
04:59mupuf: anyway, are you satisfied with the solution?
04:59mupuf: it is ugly in bash though
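[Editor's sketch] The cut-based extraction of the PCI id depends on exact field positions, which may be why it behaved differently on the two machines. A sed pattern anchored on the libGL line format quoted above is one alternative. The function name libgl_pci_id is hypothetical, and the pattern assumes the line format shown in this log.

```shell
# Read LIBGL_DEBUG=verbose output on stdin and print the first
# vendor:device pair, e.g. 10de:11c0.
libgl_pci_id() {
    sed -n 's/^libGL: pci id for fd [0-9]*: \([0-9a-f]\{4\}:[0-9a-f]\{4\}\),.*/\1/p' |
        head -n 1
}
```

Usage would be along the lines of LIBGL_DEBUG=verbose glxinfo 2>&1 | libgl_pci_id.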
05:00karolherbst: wait... something is odd here
05:00karolherbst: and I mean the ugly kind of odd
05:00karolherbst: that's a thing
05:00karolherbst: LIBGL_DEBUG=verbose glxinfo
05:00karolherbst: this powers on my nvidia card
05:00karolherbst: even glxinfo does
05:01karolherbst: but why?
05:01mupuf: creating an opengl context is enough
05:01mupuf: because it opens the nouveau device node
05:01mupuf: and that wakes up the gpu for 5s
05:01karolherbst: no that's not what I mean
05:01karolherbst: I run glxinfo for the intel gpu
05:02karolherbst: OpenGL renderer string: Mesa DRI Intel(R) Haswell Mobile
05:02mupuf: well, something is opening the node of nouveau
05:02mupuf: use strace
05:02karolherbst: it powers on after Accelerated: yes
05:02mupuf: how can you trace that so precisely?
05:02karolherbst: open("/sys/bus/pci/devices/0000:01:00.0/config", O_RDONLY|O_CLOEXEC) = 5
05:02karolherbst: mupuf: nouveau timeouts
05:03karolherbst: I have that stupid init hack of Ben's
05:03karolherbst: and this currently has several timeouts
05:03mupuf: init hacks?
05:03karolherbst: the card needs some time until it is ready
05:03mupuf:is not aware of them
05:03karolherbst: stupid bug
05:03karolherbst: you will see
05:04mupuf: well, that's a nouveau bug
05:04mupuf: so, not on my intel time
05:04mupuf: ezbench is fine on my intel time
05:05karolherbst: it powers on the gpu :D
05:06karolherbst: cat /sys/bus/pci/devices/0000:01:00.0/config also powers on the gpu
05:06karolherbst: mupuf: I don't think it's a nouveau bug, because I solely use the intel gl part ;)
05:07karolherbst: maybe mesa core?
05:07karolherbst: or something libdrm related?
05:07karolherbst: or shall nouveau cache those config files somehow
05:09mupuf: yeah, I meant mesa core
05:09mupuf: libdrm could make a cache for the values, if necessary, yes
05:10mupuf: but how?
05:10karolherbst: why does it read this anyway?
05:10karolherbst: how is the config of the nvidia card related to the glxinfo output of the intel one
05:10mupuf: this is a prime bug
05:11mupuf: mesa must try to open both devices to figure out what PRIME=1 means
05:11mupuf: or something akin
05:11karolherbst: I don't even set DRI_PRIME :/
05:11mupuf: fair point
05:12mupuf: well, happy debugging!
05:12mupuf: not my problem right now :D
05:12mupuf: as I said, intel time
05:12karolherbst: I will open a bug and assign to intel drm :p
05:12mupuf: and I will say "prove it" :D
05:13karolherbst: meh, then I have to ask nicely in #intel-gfx :p
05:13karolherbst: ohh maybe dri-devel is a better place for this
05:13mupuf: dri-devel would be better suited
05:16karolherbst: maybe reading out the video memory does something stupid :D
05:16karolherbst: because it happens exactly before this
05:19karolherbst: pmoreau: can you do "glxinfo ; cat /sys/kernel/debug/vgaswitcheroo/switch" as root?
05:20karolherbst: and tell me if your second gpu is turned on?
06:19karolherbst: airlied: I figured out the runpm issue
06:20karolherbst: lspci powers on the gpu
06:20karolherbst: like always
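[Editor's sketch] A way to observe the wake-up from userspace, assuming the kernel exposes the power/runtime_status attribute for the device. As far as this editor knows, reading it only reports the state and does not itself touch the card. The helper name pci_runtime_status is made up.

```shell
# Print the runtime PM state of a PCI device (e.g. active, suspended).
# $1: PCI address such as 0000:01:00.0; $2: sysfs root (defaults to /sys).
pci_runtime_status() {
    cat "${2:-/sys}/bus/pci/devices/$1/power/runtime_status"
}
```

Checking pci_runtime_status 0000:01:00.0 before and after running lspci >/dev/null should make the effect described above visible.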
06:27karolherbst: mupuf: it is caused by pci_system_init called inside drm_intel_probe_agp_aperture_size
06:42karolherbst: mupuf: you are lucky it is a libpciaccess bug then :p
07:01mupuf: karolherbst: ahaha
07:02karolherbst: pci_system_init just iterates over all config files
07:03mupuf: not really a bug then
07:03mupuf: just an annoyance
07:03karolherbst: it iterates over the config files to get the vendor and device id
07:03mupuf: I do remember airlied talking about caching the information for lspci
07:03karolherbst: it could just read vendor and device
07:04mupuf: maybe it is the only portable way
07:04karolherbst: it is in a Linux-specific sysfs implementation
07:05mupuf: then you also need to check if you are not going to break older platforms
07:05mupuf: check that at least the 2.6.20+ got support for it
07:06mupuf: or the bare minimum would be 3.0 probably
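[Editor's sketch] karolherbst's suggestion to read the vendor and device attributes instead of config, shown as a shell lookup rather than as libpciaccess code. The function name pci_addr_for_id is hypothetical. The assumption, not verified in this log, is that these two files are served without accessing the device's config space.

```shell
# Print PCI addresses whose vendor/device attributes match an id.
# $1: vendor:device in lowercase hex, e.g. 10de:11c0;
# $2: sysfs root (defaults to /sys).
pci_addr_for_id() {
    want_vendor=0x${1%%:*}
    want_device=0x${1##*:}
    for dev in "${2:-/sys}/bus/pci/devices"/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        if [ "$(cat "$dev/vendor")" = "$want_vendor" ] &&
           [ "$(cat "$dev/device")" = "$want_device" ]; then
            basename "$dev"
        fi
    done
    return 0
}
```

The printed address could then be used to build a path such as /sys/bus/pci/devices/<address>/hwmon/hwmon*/temp1_input, as in the cat command earlier in the log.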
07:42ratherDIfficult: hello kids, does anyone know how to terminate threads in the warp?
07:44ratherDIfficult: should be something like rtt
07:47ratherDIfficult: terminate fallthrough preempted in tabrtt inside gk110.c
07:56ratherDIfficult: hmm, the docs say, it is actually exit
07:56ratherDIfficult: https://media.readthedocs.org/pdf/envytools/latest/envytools.pdf 233
08:02mupuf: this is not super nice to call us kids, as it implies we are kidding around .... which is not entirely untrue for some of us :D But seems like you got your answer :p
08:11ratherDIfficult: mupuf: alright, I'll call you grownups then, however i gotta update my kernel, anyone knows if gt730 is supported by nouveau?
08:12karolherbst: ratherDIfficult: everything below gm20x should work fine if you care about hardware acceleration
08:12kubast2: gm20x is maxwell?
08:13kubast2: *second gen maxwell
08:13ratherDIfficult: karolherbst: i used some kind of mint 17.1 with ancient kernel, and nouveau did not work out, i'll try something newer
08:13karolherbst: kubast2: yes
08:14kubast2: ratherDIfficult did you try to use 3.19 on lm
08:14kubast2: the newest that is in official repos
08:15ratherDIfficult: kubast2: hmm, not yet probably, i can tell what kernel version i was on though
08:15ratherDIfficult: Linux mlinux 3.13.0-24-generic
08:16kubast2: it works fine with newest nvidia drivers and lm nouveau
08:16ratherDIfficult: kubast2: you have that same card working with that kernel?
08:16ratherDIfficult: what is lm nouveau?
08:16kubast2: gtx 650
08:16kubast2: linux mint's
08:16kubast2: *it's older
08:16kubast2: than the newest one
08:17ratherDIfficult: anyways sounds like the card has worked for someone based off web information, but there are two versions of that card
08:17kubast2: maybe drm version is new+hw video acceleration fixes nothing beside that.
08:17ratherDIfficult: 64bit and 128bit
08:17karolherbst: ratherDIfficult: actually there are three
08:18karolherbst: one fermi and two kepler
08:18karolherbst: ohh gtx 650
08:18karolherbst: for the gt 730 there are three
08:18kubast2: 128 bit ddr3 is fermi I think
08:18karolherbst: yes, and one DDR3 and one GDDR5 kepler
08:19kubast2: GK208-301-A1[gddr 64] GK208-400-A1[gddr5]
08:19ratherDIfficult: i have ddr3 version
08:19karolherbst: fermi or kepler?
08:19ratherDIfficult: but i dunno yet if 64 or 128bit version
08:19ratherDIfficult: dunno actually, i thought they were all keplers
08:20shakesoda: the 750 is the only kepler in the 700 series isn't it?
08:20kubast2: gf 108
08:20kubast2: is fermi
08:20shakesoda: er, no, the 750 is the maxwell
08:20ratherDIfficult: i switched to hsw which i will be working first, but i have some questions about a trap/interrupts, i wanna inject them to rasterizer threads
08:20shakesoda: don't mind me.
08:20kubast2: used in mobile gt 5xx series
08:21ratherDIfficult: so any documentation, can the trap execute some custom routine before any other instruction continues, or just popping the stack in the end of rasterizer stream will do it without the trap
08:22ratherDIfficult: i am wondering i do not see anything obvious how the rasterizer threads get killed among the sent stream, for intel it's easy as there is EOT in the end
08:23ratherDIfficult: theoretically it does not have to exit the threads, cause fragment shader may continue with the same set
08:26ratherDIfficult: karolherbst: are you up to running a little test for me on your card?
08:27karolherbst: ratherDIfficult: it really depends on what it is
08:27ratherDIfficult: https://github.com/freedreno/mesa/blob/master/src/gallium/drivers/nouveau/nv50/nv50_state.c append something to pop the stack to the memory and try reading it back
08:29karolherbst: ratherDIfficult: no idea how to do that, sorry
08:30ratherDIfficult: wait a bit, i give the exact instructions, at the moment i only see mmio functions to pop the stack, it needs to be popped into memory and some piglit tests on lines for instance should be run
08:34karolherbst: ratherDIfficult: that's the falcon, it has nothing to do with the stuff where the usual binaries are running
08:36ratherDIfficult: its a method to pop from the top of the thread stack, but i dunno what the increment by 4 does
08:37ratherDIfficult: wait, i think the code segment is held inside some reg, but to read that, vertex arrays are needed to be done in piglit test then
08:44ratherDIfficult: well that is bit silly to get from top of the stack and increment by four, mwk: maybe you meant pop from the bottom of the stack and increment by four?
08:47karolherbst: ohh what is ASIC quality and how do I get that value? :D
08:48RSpliet: approx $9000 per square millimeter I believe, when produced in small quantity
08:48RSpliet: well, that's the ASIC value
08:48RSpliet: ASIC quality is obviously 42
10:50imirkin: hakzsam: i guess you finally completed your GSoC :)
10:51mupuf: even without the additional two year, he would have been late :D
10:52mupuf:is such a bad mentor for even proposing the project!
10:55karolherbst: mupuf: you have split it in like 3 projects ;)
10:55karolherbst: and let him do all three
10:55karolherbst: I don't know
10:55karolherbst: how long is a GSoC project, 3 months? :D
10:56karolherbst: I see
10:56mupuf: don't forget this year's project, adding support for apitrace
10:56mupuf: I mean, adding support for performance counters to apitrace
10:57mupuf: ah! Here we go. finally! I fixed the shit that was my state handling and locking and I finally got ezbench to do the right thing when needing to compile a new kernel and reboot to test something :)
10:58mupuf: and I now have ezbenchd, run by systemd
10:58imirkin: wow you really don't want me to use your thing, huh :p
10:58mupuf: you can run it with whatever you feel like, ilia :D
10:59imirkin: ah, so just a regular daemon
10:59mupuf: openrc, if that is what you use
10:59imirkin: not some sort of crazy systemd plugin
10:59mupuf: nope, nothing systemd-related here
10:59mupuf: except that I do not daemonize the code myself, if you need to, you will need to use the daemonize tool for that
11:01mupuf: time to clean up a bit and start releasing some profiles :)
11:01mupuf: imirkin: do you use grub?
11:01hakzsam: imirkin, ahah, not exactly, still need to complete those PCOUNTER ones :)
11:01imirkin: hakzsam: getting there :)
11:01mupuf: I use grub-reboot to select which kernel to boot on
11:01mupuf: and if it fails, the next reboot will be on a stable kernel
11:02imirkin: heh. you make a lot of assumptions about this stuff
11:02hakzsam: mupuf, imirkin but well, this was not a three months project, just impossible to do ;)
11:02mupuf: not that I handle the case just yet
11:02mupuf: hakzsam: stop making excuses!
11:02imirkin: like that people use grub2
11:02hakzsam: mupuf, :p
11:03mupuf: you just needed to be 100 times faster, you damn-lazy southerner!
11:03karolherbst: what is grub :p I heard it slows down boots
11:03mupuf: imirkin: yes, grub2 works too
11:03hakzsam: mupuf, your first idea was to understand GL pipeline using perf counters in 3 months, wasn't it? :p
11:03mupuf: ah ah, well, in this case, it helps
11:03imirkin: mupuf: you assume that they do. but e.g. i don't
11:03mupuf: imirkin: what are you talking about?
11:04imirkin: mupuf: grub-reboot
11:04mupuf: hakzsam: that project was completely feasible :D
11:04mupuf: more seriously, we will still need to do it at some point :p
11:05mupuf: but we can thank nvidia that we do not have to do it right now
11:05hakzsam: mupuf, please, send a proposal for the next gsoc :p
11:06mupuf: hakzsam: ah ah
11:06mupuf: the next proposal will be to complete the gui of qapitrace I guess
11:07mupuf: by then, I should write the tests for apitrace in order to get the code accepted upstream
11:07mupuf: one thing at a time though!
11:07hakzsam: that seems to be feasible in three months this time :D
11:08hakzsam: mupuf, the backend is almost ready to be upstream I guess
11:08mupuf: the code is ok, says the maintainer
11:08mupuf: but he wants us to add tests to get the code upstream
11:08mupuf: which is fine
11:09mupuf: and it makes sense
11:09mupuf: well, time will tell :D
11:09mupuf: After 3 years, maybe we will finally have a usable solution to profile applications :D
11:09hakzsam: and then we could add perfkit support too
11:10mupuf: We have come a long way!
11:10mupuf: But hey, we may still beat Intel on this
11:10mupuf: have you seen the mail about not using perf anymore?
11:11hakzsam: no, link?
11:11mupuf: lengthiest cover letter I have seen in a while
11:12mupuf: wait a sec
11:13hakzsam: oh this one
11:13hakzsam: I saw it but I didn't read
11:13hakzsam: I'll do
11:13hakzsam: mupuf, so basically, i965 won't use perf for their perf counters?
11:14mupuf: yes, but this is about linux here, so it is i915. </pedantic>
11:18karolherbst: this libpciaccess issue is just stupid :/ oh well
11:19karolherbst: mupuf: if I have problems with my intel hardware, can I poke you about them :p or could you ask somebody else to look at them? I have some serious intel_pstate issues and haven't gotten any useful response yet
11:19mupuf: ah, what issues?
11:19mupuf: want to talk about it on #intel-gfx?
11:19karolherbst: cpu related
11:20karolherbst: first: turbo disabled after suspend, second: some cores don't want to go into deeper C states and stay at C1
11:20mupuf: well, pstate was already a dead give away :D
11:20mupuf: well, for that, I have no answer, sorry :s
11:20mupuf: tried contacting the maintainer?
11:21karolherbst: I could send a mail though :/
11:21karolherbst: but the turbo issue is not the annoying one
11:21karolherbst: the C state thingy is :/
11:21karolherbst: maybe I should open a bug for that
11:22karolherbst: hey, wait, I have acpi-cpufreq compiled now, testing this first
11:37karolherbst: mupuf: cpu seems to be cooler now with acpi driver :/
11:38karolherbst: mupuf: what would you say should nvkm_volt_get return when the gpu is off?
11:44karolherbst: okay, because it returns 0.6V
11:45mupuf: well, if reading the voltage wakes up the gpu, then we should return the real voltage
11:45mupuf: if it does not, then 0V is a good idea
11:45karolherbst: it doesn't wake up the gpu
11:45karolherbst: and shouldn't
11:46karolherbst: otherwise sensord would be kind of bad :D
11:48karolherbst: mupuf: I bet the pwm method of yours returns an error
11:48karolherbst: and then the code jumps into this gpio thingy
11:50karolherbst: mupuf: what does nvkm_rd32 return when the gpu is off? :D
11:51mupuf: usually 0xffffffff
11:51karolherbst: any idea what bios->base + bios->pwm_range * 0xffffffff / 0xffffffff is? :p
11:52karolherbst: so your code returns bios->base when the nvkm_rd32 calls are both returning -1
11:52karolherbst: seems right I think
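[Editor's sketch] The arithmetic behind this can be worked through in shell. This assumes, without confirmation from the log, that the expression is evaluated in 32-bit unsigned arithmetic, so the multiplication wraps before the division. The function name pwm_voltage_u32 and the sample numbers are made up for illustration.

```shell
# Evaluate base + range * duty / div with 32-bit unsigned wraparound.
# $1: base, $2: range, $3: duty, $4: div (decimal or 0x-prefixed hex).
pwm_voltage_u32() {
    base=$1 range=$2 duty=$3 div=$4
    # The product is truncated to 32 bits before dividing (assumption).
    product=$(( ($range * $duty) & 0xffffffff ))
    echo $(( ($base + $product / $div) & 0xffffffff ))
}
```

Under that assumption, pwm_voltage_u32 600000 600000 0xffffffff 0xffffffff prints 600000, because the wrapped product is smaller than the divisor and the quotient becomes 0. That would be consistent with the function returning the base voltage when both reads come back as 0xffffffff.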
11:54karolherbst: sending out a patch if you don't mind
11:57karolherbst: mupuf: looks good? https://github.com/karolherbst/nouveau/commit/d13beaccbfa1e4c9d64d8251a6cf456722e61509
11:57karolherbst: maybe you want to take care of that in your iccsense code too? it returns 0 currently, but maybe an error would make more sense too, maybe not
11:59karolherbst: ohh wait
11:59karolherbst: this breaks voltage for gpio based ones :/
11:59mupuf: karolherbst: nope, this not acceptable as the chipset may decide to return whatever it wants
11:59karolherbst: ohhh :/
11:59karolherbst: any idea how to deal with it then?
12:00mupuf: hmm, I guess checking what is the current runpm state?
12:01mupuf: but where should it be done?
12:01karolherbst: everywhere ;)
12:01mupuf: and it should be done in a ton of places, so it is not really acceptable
12:01mupuf: well, that is not true
12:01mupuf: it is only for a handful of sysfs files
12:02mupuf: temperature, voltage and power should be it
12:02karolherbst: what about the fan?
12:02mupuf: and fan
12:02karolherbst: do we want to handle that in the sysfs implementations or rather in the nvkm_ functions?
12:03mupuf: hard to tell, nvkm is going to be hard because it cannot know the runpm state
12:03mupuf: but when we start virtualizing, it is going to be fun too
12:13imirkin_: skeggsb_: FTR, repro'd your issue on my GK208. weird that it doesn't show up on GF108.
12:13imirkin_: (the glamor issue)
12:26imirkin_: karolherbst: what does this mean? Das Argument ist ungÃ¼ltig
12:26karolherbst: argument is invalid
12:26imirkin_: k, that's what i assumed. thanks
12:26karolherbst: Ã¼: ü
12:29karolherbst: mupuf: or we just remove hwmon interface when the gpu is off
12:30karolherbst: though I don't know what implications that has for sensord
12:31mupuf: well, that has merits
12:31karolherbst: I suspect there are a lot of userspace applications which just check once which interfaces are there
12:31karolherbst: and then don't search for new ones
12:31karolherbst: the question is: how much do we care
12:32karolherbst: maybe there is something in the hwmon docs
12:33imirkin_: whoa. blob drivers emit KHR_debug messages now
12:33imirkin_: 4192: message: api issue 131185: Buffer detailed info: Buffer object 2 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (0), GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (1), GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (2), and GL_ARRAY_BUFFER_ARB, usage hint is GL_STREAM_DRAW) has been mapped WRITE_ONLY in SYSTEM HEAP memory (fast).
12:33imirkin_: 4354: message: api performance issue 131218: Program/shader state performance warning: Fragment shader in program 25 is being recompiled based on GL state.
12:33karolherbst: ohh right, the blob recompiles stuff sometimes
12:34imirkin_: which is weird coz we never do ;)
12:34imirkin_: but we should. but only in VERY rare circumstances
12:34karolherbst: I mean why only rare?
12:34karolherbst: sometimes some inputs are always the same
12:35karolherbst: then this could be simply eliminated
12:35karolherbst: I have no clue about opengl, but I could imagine that sometimes a shader always gets the same input
12:35imirkin_: you recompile when your shader would emit the wrong thing
12:35karolherbst: or part of the input is always the same
12:35imirkin_: basically there are some bits of GL state that affect how a shader works
12:36imirkin_: in those cases you have to recompile
12:36imirkin_: like glShadeModel(GL_FLAT) and a few others
12:36karolherbst: ohhh okay
12:36imirkin_: right now our shaders just don't work --problem solved
12:36imirkin_: they kinda-sorta work enough, but not fully compliant
12:36karolherbst: I thought like recompiles are also done, if you find some potential optimizations
12:37imirkin_: yeah, you could do that
12:37imirkin_: like inlining constbuf values
12:37karolherbst: I think I really saw the blob doing it, at least that's what I thought
12:39imirkin_: skeggsb_: anything odd about suspend/resume wrt application? i guess if it has *any* implicit buffer references that happened to work by luck, that will work no more?
12:59hakzsam: imirkin_, pushed with all of your comments
13:01karolherbst: I don't like my todo list, anything else I could do? :D
13:02karolherbst: though clock gates for fermi+ are nice
13:02karolherbst: yes, I will take care of that
13:02karolherbst: mupuf: any idea about the subdev? or shall I just create a new one called "pwrgate" or something
13:03karolherbst: or clkgate
13:03mupuf: I think we need to study it more
13:03mupuf: but a good way could be to try
13:03karolherbst: yes, but setting those regs is a good first step already
13:04karolherbst: it's around 10% for me
13:04karolherbst: this is a lot already
13:04mupuf: but you know, when you reset the engine, you should un-do clock gating
13:04karolherbst: even when it is on auto?
13:04mupuf: so I would suggest you study a mmiotrace and see when nvidia touches it
13:04karolherbst: reset you mean like in _fini
13:04mupuf: nope, not only
13:05mupuf: it will also be needed when we hang the gpu :D
13:06mupuf: so, maybe adding hooks in engines would be better
13:06mupuf: engines, but not only :D
13:07mupuf: but we can at least add them in engines
13:07mupuf: you should also map the clock domains to the mmio address space
13:11hakzsam: imirkin_, http://hastebin.com/ifagovafun --> when compiling in release mode only, r-b?
13:11karolherbst: mupuf: you mean by mapping I should figure out which reg is used for which engine?
13:11karolherbst: I mean, domain
13:11karolherbst: fun :/
13:11karolherbst: any good ideas how to do that?
13:11mupuf: put one domain to off
13:11karolherbst: makes sense
13:11mupuf: and then try to find what changed in the mmio space
13:11mupuf: make sure to avoid peeking values from pgraph
13:12mupuf: it will blow in your face :D
13:12karolherbst: even when I don't do anything?
13:12mupuf: yes, reading is enough
13:12imirkin_: hakzsam: certainly not
13:12karolherbst: mhhh k
13:12mupuf: not all the regs in pgraph are dangerous, but a few are
13:12mupuf: mwk could tell you more
13:12karolherbst: mupuf: ok, after poweroff/on the regs are reset :/
13:13hakzsam: imirkin_, why?
13:13mupuf: clock gating does not trash the state
13:13imirkin_: hakzsam: i'm guessing it either needs to be respected, or there should be a assert(offset == 0) in there
13:13karolherbst: mupuf: after the gpu was powered off, the 0x20200 0x60 range went back to boot values
13:14mupuf: oh, sure
13:14mupuf: power gating does not retain the content
13:15hakzsam: imirkin_, but this variable is unused, so why do you need it?
13:15karolherbst: is there something like a range based nvawatch?
13:15mupuf: you would know that if you had read the chapter of my thesis I told you to read :D
13:15imirkin_: hakzsam: it's a nice reminder that "oops, forgot to deal with offset"
13:15mupuf: karolherbst: nope
13:16hakzsam: imirkin_, oh okay, makes sense
13:16karolherbst: mupuf: is pgraph the only range dangerous to read?
13:17karolherbst: I have a very stupid idea now
13:17mupuf: dump the entire space but pgraph and diff the before and after
13:17mupuf: keep a few regs from pgraph though
13:17karolherbst: what is the biggest reg address?
13:18karolherbst: mupuf: well pgraph will be the reg with no changes then if I leave it out :p
13:18mupuf: 0xffffff IIRC
13:18mupuf: you can read most of pgraph
13:18karolherbst: can something bad happen when I peek pgraph with no driver loaded?
13:18mupuf: just not a few regs in it
13:19mupuf: hmm, that may work if you have no driver
13:19mupuf: you will have to reboot when you are done though
13:19karolherbst: ohh seems to work?
13:20karolherbst: mupuf: or power off the card?
13:20karolherbst: or won't it work anymore
13:20mupuf: yeah, that may work
13:20karolherbst: I always have bbswitch loaded for that :)
13:21karolherbst: mupuf: did two 0x0 0xffffff peeks
13:21karolherbst: diffed the output
13:21karolherbst: 132443 line patch
13:22karolherbst: mupuf: is there anything beyond 0x1fffff ?
13:22mupuf: what the fuck are you talking about? :D
13:22karolherbst: I was doing nvapeek 0x0 0xffffff twice
13:22karolherbst: and diffed the output
13:22mupuf: I guess you need to look for registers that say 0xbadf
13:23karolherbst: there are also a lot of ffffffff and 0000ffff and stuff
13:23mupuf: look for 0xbadf
13:23mupuf: in the diff
13:24mupuf: but I guess the diff is just 16 MB of - and then 16 MB of +
13:24mupuf: diff is likely very bad at this
13:24mupuf: how about you just look at an engine
13:24mupuf: and try to find the domain of it
13:25karolherbst: nvapeek 0x0 0x1fffff is fine though
13:25karolherbst: 464 lines of diff
13:25karolherbst: seems to be random stuff like temperature or something
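A rough sketch of the dump-and-diff approach discussed above, comparing two register dumps by address instead of running a textual diff over them. It assumes each dump line looks like `address: value` in hex, which is an assumption about the nvapeek output format. The helper names are hypothetical, and apart from the 0x20200 values quoted in the log the sample data is made up.

```python
import re

# Assumed line format: "<hex address>: <hex value>", with optional 0x prefixes.
LINE_RE = re.compile(r"^\s*(?:0x)?([0-9a-fA-F]+):\s*(?:0x)?([0-9a-fA-F]{1,8})\s*$")


def parse_dump(text):
    """Parse a register dump into a {address: value} dict, skipping other lines."""
    regs = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            regs[int(m.group(1), 16)] = int(m.group(2), 16)
    return regs


def diff_dumps(before, after):
    """Return (address, old, new) for every address present in both dumps
    whose value changed, sorted by address."""
    return [
        (addr, before[addr], after[addr])
        for addr in sorted(before.keys() & after.keys())
        if before[addr] != after[addr]
    ]


def is_bad(value):
    """True if the upper 16 bits read 0xbadf, the marker mupuf suggests looking for."""
    return (value >> 16) == 0xBADF


# Sample data: only the 0x20200 values come from the log, the rest is invented.
before = parse_dump("00020200: 0013cbe0\n00020204: 00000001\n00400000: badf1000\n")
after = parse_dump("00020200: 0013cc50\n00020204: 00000001\n00400000: 00000000\n")

changes = diff_dumps(before, after)
for addr, old, new in changes:
    note = " (was 0xbadf)" if is_bad(old) else ""
    print(f"{addr:#010x}: {old:08x} -> {new:08x}{note}")
```

Keying on the address avoids the "16 MB of - and then 16 MB of +" problem, since a line-based diff can lose alignment when the two dumps differ in many places.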
13:31karolherbst: found the first one
13:32karolherbst: 0x20200: 0013cbe0 to +0013cc50
13:32karolherbst: seems strange though
13:32karolherbst: maybe not
13:35karolherbst: ohh I did it while the blob was still loaded and running an X server
13:35karolherbst: it was only a little upset though
13:44karolherbst: mhh strange
13:44karolherbst: I am at the 6th reg and still found nothing
13:44mupuf: well, if the blob turns it back on, sure :D
13:45karolherbst: without the blob now :p
13:45karolherbst: and no, the blob doesn't turn it back on :D
13:46karolherbst: I just turn off everything
13:46karolherbst: and check if something changes
13:46karolherbst: mhhh no
13:47karolherbst: okay, other plan
13:50karolherbst: now I run glxgears until it crashes :)
13:50karolherbst: mhh weird, nouveau doesn't recover when I turn it back on :/
13:53karolherbst: mupuf: okay, glxgears stopped only when disabling 0x20200
13:54karolherbst: this is too vague
13:59mupuf: karolherbst: back home
14:00mupuf: your plan is mostly useless I would say
14:00karolherbst: I know
14:03karolherbst: I shall stop crashing my card :D
14:09karolherbst: mupuf: any other idea I could try?
14:21karolherbst: best language to learn programming for somebody who usually really isn't that good with that stuff :D
14:23mupuf: I'm back
14:23mupuf: it was dinner + southpark time :D
14:24karolherbst: very important
14:24mupuf: are you telling me that you do not see any difference in which registers are shown or not when you force clock gating on?
14:24mupuf: that may be true
14:24karolherbst: mhhh not really
14:25mupuf: and if so, we will need to think about other means
14:26hakzsam: imirkin_, my plan is also to make the HUD support floats but it's not trivial :)
14:26imirkin_: hakzsam: i meant that you cast the int64 to float
14:27hakzsam: imirkin_, ah okay, I see
14:27hakzsam: makes sense to use double actually
14:28imirkin_: if the values can get large, then you can benefit from the extra accuracy
14:28imirkin_: otherwise meh
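A small illustration of why casting a large 64-bit counter to a double rather than a float keeps more accuracy. This is a sketch in Python, not code from the patch: Python's own `float` is a double, and the hypothetical helper `to_float32` simulates a C `float` cast by round-tripping through a 4-byte encoding.

```python
import struct


def to_float32(value):
    """Round an integer to the nearest single-precision float (simulated C float cast)."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


# Single precision has a 24-bit significand, so 2**24 + 1 is the first
# integer it cannot represent exactly.
counter = 2**24 + 1

as_float = to_float32(counter)   # loses the low bit
as_double = float(counter)       # exact: doubles hold integers up to 2**53

print(counter, as_float, as_double)
```

For counters that stay below 2**24 both types give the same result, which matches the "otherwise meh" remark.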
14:29hakzsam: currently some of these metrics are displayed as 0 by the HUD as it doesn't support float yet
14:30hakzsam: but this is going to change once I have figured out how to add support for floats
14:31hakzsam: imirkin_, I updated the patch locally, do you want to see the v2?
14:33mupuf: hakzsam: in the mean time, why not use percents?
14:34imirkin_: hakzsam: you replaced *all* of the value = stuff with returns right?
14:34hakzsam: mupuf, because... I'm too lazy to do it :) and that's really a minor issue
14:35hakzsam: only branch_efficiency and issue_slot_utilization use percents
14:35mupuf: hakzsam: right, better add proper support
14:35hakzsam: imirkin_, yeah
14:36karolherbst: hakzsam: could somebody check with your counters if there are waits between instructions?
14:36hakzsam: karolherbst, what?
14:39karolherbst: hakzsam: I was thinking about if those counters help with reordering the instructions to get more performance
14:39hakzsam: karolherbst, they could be used to profile any applications, so yes
14:40hakzsam: but those are only compute-related for now
14:40hakzsam: graphics ones will be added later :)
14:40karolherbst: ohh okay
14:41hakzsam: karolherbst, once all perf counters are merged, it will be easy to profile the entire GL pipeline
14:41hakzsam: and my plan is to use them to improve perf :)
14:42mupuf: karolherbst: I guess you are interested in the number of clock cycles spent stalling
14:42karolherbst: yeah, something like that
14:42hakzsam: ie. active_cycles
14:42mupuf: pretty sure the compute counters would expose that
14:42hakzsam: for both fermi and kepler
14:42mupuf: hmm, this is not enough, you need to divide that by time
14:43mupuf: I guess active_cycles/pgraph_busy would help
14:43mupuf: but we also need to subtract the cycles spent waiting for fb
14:43hakzsam: mupuf, pgraph_busy (gr_idle) is not exposed yet
14:44mupuf: which is also known
14:44hakzsam: mupuf, same for fb counters :)
14:44mupuf: karolherbst: so, short answer, it is coming
14:44karolherbst: mupuf: waiting for fb is also a good thing to know
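One possible reading of the ratio mupuf describes, as a hedged sketch: divide active cycles by the cycles the graphics engine was busy, after removing the cycles spent waiting for the framebuffer. The function `active_ratio` and the name `fb_wait_cycles` are hypothetical, and subtracting the fb wait from the denominator is an assumption about what was meant. As noted above, pgraph_busy and the fb counters are not exposed yet.

```python
def active_ratio(active_cycles, pgraph_busy_cycles, fb_wait_cycles=0):
    """Fraction of busy, non-fb-waiting cycles in which the MP was active.

    Hypothetical metric: the exact formula is an assumption, not taken
    from the counters discussed in the log.
    """
    usable = pgraph_busy_cycles - fb_wait_cycles
    if usable <= 0:
        # No usable busy time in this sample; avoid dividing by zero.
        return 0.0
    return active_cycles / usable


# Made-up sample numbers, for illustration only.
print(active_ratio(active_cycles=600, pgraph_busy_cycles=1000, fb_wait_cycles=200))
```

A value well below 1.0 would hint at cycles lost to stalls other than framebuffer waits, which is the kind of signal karolherbst asks about for instruction reordering.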
15:58hakzsam: imirkin_, feel free to have a look at the v2 here http://cgit.freedesktop.org/~hakzsam/mesa/log/?h=nvc0_mp_metrics
15:58hakzsam: time to sleep, see you
15:59imirkin_: hakzsam: worksforme
15:59imirkin_: i.e. feel free to add my r-b
15:59hakzsam: imirkin_, nice, thanks! I'll push the patch tomorrow because I'm so tired that I don't want to make a mistake :)
16:00karolherbst: nouveau will be awesome in no time now! :)
16:33mupuf: RRRRrrrr, I got to try the library to get data out of the blob
16:33mupuf: and it does not want to spit out the power usage
16:33mupuf: that was to be expected though
16:33mupuf: nor did I manage to get the entire list of clocks
16:45mupuf: well, I guess I will go to sleep and try to find a good solution to figure out how to interpret the iccsense vbios table ... without getting a sensor readout ;s
16:49imirkin_: skeggsb_: any idea how suspend/resume might impact nvbo->valid_domains?
17:20imirkin_: skeggsb_: let's say a pushbuf has been submitted and refers to some gem buffer. i then go free that gem buffer and suspend. then i resume, will we move that (free'd) gem buffer back into memory to allow the pushbuf to properly execute? or do we flush everything out before suspending?
17:21imirkin_: skeggsb_: separately, let's say we forget about the suspend/resume thing -- is it ok to close a gem object while there are submitted but not yet fully executed pushbufs that refer to it?
17:54imirkin_: skeggsb_: can you remind me what NV10_SUBCHAN_REF_CNT does?
17:54imirkin_: (that also fixes glamor)