01:17karolherbst: Tom^: https://github.com/karolherbst/nouveau.git -b cstate_interface
01:20koz_: karolherbst: Did you get my email?
01:50karolherbst: koz_: yeah
01:50karolherbst: koz_: we have to verify first if you really used the compiled binary
01:52karolherbst: koz_: see that? "[drm] Initialized nouveau 1.3.0 20120801 for 0000:01:00.0 on minor 1"
01:52karolherbst: this basically means: old version
01:52karolherbst: koz_: I would rename your nouveau.ko(.xz) file inside /lib/modules
01:52karolherbst: koz_: then copy the compiled one there
01:52karolherbst: and rebuild your initramfs
01:53koz_: How do I rebuild my initramfs?
01:53karolherbst: koz_: depends on the distribution, but basically something like makeinitramfs or something
01:54koz_: karolherbst: I'm on Arch.
01:55karolherbst: pmoreau: will know :D
01:56koz_: karolherbst: Is this it? https://wiki.archlinux.org/index.php/Mkinitcpio
01:57koz_: OK, rebuilt.
01:57koz_: So now reboot and hope for the best?
01:57karolherbst: mhhh yeah
01:57koz_: Alrighty, here goes nothing.
01:57karolherbst: the version should be at least 1.3.1
01:57karolherbst: maybe 1.3.2
01:58karolherbst: no, 1.3.1
01:59koz_: OK, my display manager failed to start.
01:59koz_: So something probably didn't go right.
01:59karolherbst: it should still start :/
01:59karolherbst: what does dmesg say?
01:59koz_: I'm not even past the boot screen.
01:59koz_: It's just hanging there with a [FAILED] systemd message.
02:00koz_: My old nouveau was nouveau.ko.xz, but my new one is just nouveau.ko.
02:00karolherbst: does the message say anything useful?
02:00karolherbst: doesn't matter
02:00karolherbst: .xz just means compressed
02:00koz_: Nope - I need to run journalctl for that, which I can't.
02:00koz_: Because hanging on boot screen.
02:01karolherbst: can you switch to a tty?
02:01koz_: I can try.
02:01koz_: OK, on a tty.
02:02koz_: journalctl is ten kinds of unhelpful.
02:02koz_: As ever.
02:02koz_: startx errors out, unsurprisingly.
02:03karolherbst: Xorg.log ?
02:03koz_: Reading now, but I have no clue what I'm looking for.
02:04koz_: Would it help you if I showed you what it looks like?
02:05karolherbst: mhh wait
02:05karolherbst: do you have network on the machine?
02:05koz_: ...possibly? I'll try to ssh into it.
02:06karolherbst: then you may be able to upload stuff
02:06koz_: Yeah, should be able to.
02:06karolherbst: don't know a good arch package for that, but there are such paste commands
02:06koz_: It's fine - I can do it in other ways as well.
02:06koz_: scp for one.
02:08koz_: karolherbst: http://paste.rel4tion.org/117
02:08koz_: In all respects apart from X, the machine is quite alive.
02:11karolherbst: "KMS not enabled"
02:11karolherbst: do you have anything like modeset=0 in /proc/cmdline ?
02:13koz_: Not at all.
02:13koz_: AFAIK, KMS is enabled.
02:14koz_: Or at least, I did nothing to disable it other than moving that .ko.
02:14karolherbst: ohh I have an idea
02:14karolherbst: lsmod | grep nouveau
02:15koz_: No match.
02:15koz_: So it didn't get loaded for some reason.
02:16karolherbst: modprobe nouveau
02:16koz_: modprobe: ERROR: could not insert 'nouveau': Unknown symbol in module, or unknown parameter (see dmesg).
02:16koz_: Guess you'll be wanting the dmesg?
02:18koz_: OK, grabbing.
02:21koz_: http://paste.rel4tion.org/118 <-- karolherbst: the dmesg.
02:22karolherbst: there is stuff missing
02:24karolherbst: the last lines would be enough
02:28koz_: karolherbst: So what should I do now?
02:29karolherbst: dmesg | tail -n 50
02:30karolherbst: or is dmesg printing what you gave me?
02:30koz_: That's exactly what came out of dmesg.
02:34kubast2: Hey, I got the mmiotrace, but the thing is my gtx 650 was already in performance lvl 2 according to nvidia x server settings
02:37Tom^: karolherbst: hm but how am i supposed to load that up on the iso?
02:37Tom^: karolherbst: dont i need to reboot to load it which means its gonna get reset anyways
02:37koz_: karolherbst: I think I'll just give it a pass then. Thanks for all the help and stuff, but I'll just wait till your changes get mainlined.
02:38karolherbst: Tom^: well, you can unload nouveau without rebooting
02:38karolherbst: Tom^: just need to stop X and unbind the vtcon
02:38kubast2: though the size of this trace is quite huge
02:39karolherbst: Tom^: I think you might need ssh for that though
02:39Tom^: thats easily fetched
02:50Tom^: karolherbst: so am i supposed to use the libnvif.so from that repo?
02:50karolherbst: Tom^: cd drm
02:51Tom^: crap no kernel headers
02:51karolherbst: pmoreau: ^
02:51Tom^: pmoreau: !!!!!!!!!!!!!
02:57Tom^: karolherbst: uhm soc/tegra/fuse.h no such file or directory
02:58karolherbst: you can remove those includes
02:58karolherbst: arch doesn't ship those files
03:01karolherbst: I will create a bug for the suspend issue
03:02karolherbst: shouldn't there be a DRM/Nouveau component in bugzilla?
03:02karolherbst: because some issues aren't X server related
03:10Tom^: karolherbst: this didnt work, couldnt insert 'nouveau': Exec format error
03:10Tom^: karolherbst: perhaps using another 4.3 kernels api broke it :p
03:10Tom^: eh headers i mean
03:14karolherbst: yeah, guess so
03:18Tom^: hm wait he seems to ship the stock arch kernel too, 4.2.5 which means i can boot that and fetch headers for it
03:18Tom^: crisis may have been averted
03:21karolherbst: Tom^: well, he also doesn't use the kernel drm version
03:22Tom^: worst case scenario i have to install arch on an external disk
03:28Tom^: ;_; DRIVER_KMS_LEGACY_CONTEXT undeclared
03:31Tom^: karolherbst: i guess your repo requires 4.3+ ? :p
03:32karolherbst: but I have some older branches
03:32karolherbst: just not with that fix
03:32Tom^: urgl im gonna order a pizza make some coffee and install arch on one of the 32gb usb sticks then
03:32karolherbst: and then cherry-pick that one commit
03:32karolherbst: ohh wait
03:32karolherbst: no, you only need those cstate_interface
03:33karolherbst: master_karol_stable_4_1 also has it
03:33karolherbst: just use that branch then
03:34Tom^: all of this would have been done if i just went with installing arch on the usb stick in the first place yesterday =D
03:35karolherbst: well, most of the games are running really great with nouveau now anyway
03:35karolherbst: only dx11 is a problem
03:36karolherbst: even most of the dx9 games are running great through gallium nine
03:37Tom^: yea only reason im on NSA10 is because of dx11/dx12 , fallout 4 and soon mass effect andromeda.
03:37karolherbst: :/ yeah....
03:38karolherbst: you will cry if you see it :D
03:38karolherbst: Tom^: http://images.akamai.steamusercontent.com/ugc/646625824525324291/45BF8AE69B814D13D61047772049D0BB4221C542/
03:40karolherbst: ohh wait
03:40karolherbst: this isn't through crossover but stupid steam streaming
03:40karolherbst: forget it then :/
03:40Tom^: i almost had a stroke
03:41karolherbst: yeah me too
03:41karolherbst: then I thought about this
03:41karolherbst: dx10 games are running fine mostly
03:41karolherbst: also they said that there is a lot of work done towards dx11
03:41karolherbst: but not public yet
03:43Tom^: karolherbst: ok i think im on your nouveau now
03:44karolherbst: there should be some stuff inside /sys/kernel/debug/dri/0 now
03:44karolherbst: pstate and cstate
03:45karolherbst: go to 0f pstate
03:46karolherbst: now I have to check which cstate will work :D
03:46karolherbst: echo 41 > cstate
03:46karolherbst: then cat pstate
03:46Tom^: wait wait, wait.
03:46Tom^: if cstate fails it will freeze no?
03:46Tom^: ok good i thought i had to recompile etc :P
03:47karolherbst: this just reclocks the gpu core
03:47karolherbst: and your fans might turn on :D
03:48Tom^: cstate 41 seems to work i think
03:48karolherbst: last line of pstate
03:48Tom^: core is on 1177MHz
03:48karolherbst: 42 should fail
03:49Tom^: 19k fps in glxgears.
03:49karolherbst: meh, :D
03:49karolherbst: so close
03:49Tom^: yea 42 fails with failed to raise voltage
03:50karolherbst: I only get 16k fps :(
03:50karolherbst: what about glxspheres?
03:51Tom^: 1150 frames per sec, 1277 Mpixels/sec
03:51karolherbst: something is odd
03:52karolherbst: I expected much more but well
03:53karolherbst: though it is not bad
03:53karolherbst: it is just as fast as my card
03:54karolherbst: Tom^: I bet there is something cpu related going on
03:54karolherbst: might be cpu bound
03:54karolherbst: okay, but seems to look fine though
03:54Tom^: im only using around 11% of the cpu tho
03:55karolherbst: Tom^: why only 11%?
03:55karolherbst: I thought gpu
03:55karolherbst: Tom^: single core^
03:55karolherbst: mesa doesn't do multithreading stuff yet
03:55karolherbst: so if one core is at 100%
03:55karolherbst: then yeah
03:56karolherbst: Tom^: anyway, at the highest cstate there should be another +10% performance
03:56Tom^: just need that voltage correct then :p
03:56karolherbst: yeah, we have to figure out what is going on there
03:56karolherbst: and how to parse that table in the vbios
03:57karolherbst: maybe we could ask nvidia about that
03:57karolherbst: Tom^: ohh mhh
03:57karolherbst: Tom^: you have a working win installation you said?
03:57karolherbst: any gpu monitoring/OC tools there?
03:57Tom^: easily fetched
03:57karolherbst: I really want to know the voltage
03:57karolherbst: at max load
03:58karolherbst: if it's above 1.2125 V, then that would help a lot
04:04Tom^: hm now i need something that runs it at 100% :p
04:05karolherbst: disable vsync
04:08Tom^: meh still only 80% on fallout4
04:08karolherbst: Tom^: and how about the clocks?
04:08karolherbst: anyway, the voltage should be really high now
04:08karolherbst: I assume the blob clocks to max with 80% load already
04:09Tom^: 1097mhz and 1.1750 V
04:09Tom^: memory only at 1749
04:10karolherbst: then you got your 7000
04:11karolherbst: voltage_min = 1025000, voltage_max = 1217500
04:11karolherbst: 1.1750V seems to be in between
04:11karolherbst: we need more load :D
04:11karolherbst: Tom^: settings already at max?
04:12karolherbst: 8x msaa?
04:12karolherbst: or 4x ssaa?
04:12Tom^: hm need to check
04:12karolherbst: usually ssaa does the trick
04:15Tom^: yep fallout doesnt do the trick im probably bottlenecking on my cpu or something :P
04:16karolherbst: even with aa at max?
04:16karolherbst: isn't there ssaa or something?
04:17Tom^: cant change the levels of it and no ssaa , launching far cry 4 now instead.
04:17karolherbst: Tom^: window mode and just highest resolution? :D
04:22Tom^: not sure the tool is reporting mem clocks right
04:22Tom^: one reports it as 1749 constantly and one is at 3499 constantly :p
04:23karolherbst: Tom^: yeah, memory has to be multiplied normally
04:24karolherbst: Tom^: usually if you get your memory clock when you multiply by 4 or 2, then it's all fine
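A quick check of the multiplier arithmetic discussed here, as a Python sketch. It assumes the 1749 MHz reading is the base memory clock and that the other figures in the log (3499 from the second tool, the "7000" mentioned earlier) are the same clock multiplied by 2 and 4 and then rounded; the function name is made up for illustration.

```python
def effective_memory_clock(base_mhz: int, multiplier: int) -> int:
    """Scale a base memory clock by a data-rate multiplier (2 or 4 here)."""
    return base_mhz * multiplier


BASE_MHZ = 1749  # value one of Tom^'s tools reported

for multiplier in (2, 4):
    result = effective_memory_clock(BASE_MHZ, multiplier)
    print(f"{BASE_MHZ} MHz x {multiplier} = {result} MHz")
```

Both results land within a few MHz of 3499 and 7000, which would be consistent with the two tools reading the same clock.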
04:24Tom^: however i never seem to go above 1097 core clock and 1.1750 V
04:24Tom^: which means i already was above it on cstate 41?
04:25karolherbst: not quite sure
04:25karolherbst: there is something odd with the clocks mapping anyway
04:25karolherbst: mupuf: I think I got it :O
04:26karolherbst: there are always cstates above the "normal" clock
04:26karolherbst: and then also voltage entries for these "more" cstates
04:26karolherbst: and sometimes they just got above the max voltage
04:26karolherbst: which just means that you can't use them
04:26karolherbst: Tom^: okay, we would need to test something else
04:27karolherbst: Tom^: we should check what nvidia-settings reports as the max clock used at full load with the binary driver
04:27karolherbst: I am only 20% sure about this theory I got
04:27mupuf: karolherbst: what did you understand?
04:27mupuf: nvidia just provides the voltage vs frequency part
04:27karolherbst: maybe gpu boost 2.0 is strange
04:27mupuf: the OEMs provide the voltage controller information
04:27karolherbst: and I simply don't have it, which might explain it
04:28mupuf: and that's how it works :)
04:28karolherbst: mupuf: yes, but I saw much higher clocks in the cstates
04:28karolherbst: than what the card should have as the "max" clock
04:28karolherbst: that is what is bothering me now
04:28mupuf: even for boost?
04:28karolherbst: much higher
04:28karolherbst: Tom^: on windows: 1097 core clock
04:29karolherbst: highest cstate: 2613/2 MHz
04:29karolherbst: so 1097 vs 1307MHz
04:29karolherbst: highest working cstate is only 1960/2= 980MHz though
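The comparison in this exchange, worked through as a small Python sketch. It assumes, as karolherbst does here, that the frequency listed for a cstate is twice the core clock the monitoring tools report; the helper name is hypothetical.

```python
def core_clock_mhz(cstate_freq_mhz: int) -> float:
    """Halve the raw cstate table frequency, as done in the chat above."""
    return cstate_freq_mhz / 2


WINDOWS_CORE_CLOCK_MHZ = 1097  # what Tom^ saw under load on Windows

highest_in_table = core_clock_mhz(2613)  # highest cstate on Tom^'s card
highest_working = core_clock_mhz(1960)   # highest cstate karolherbst cites as working

print(f"highest cstate:  {highest_in_table:.1f} MHz")
print(f"highest working: {highest_working:.1f} MHz")
print(f"Windows clock:   {WINDOWS_CORE_CLOCK_MHZ} MHz")
print(f"table exceeds Windows clock by {highest_in_table - WINDOWS_CORE_CLOCK_MHZ:.1f} MHz")
```

The halved top entry (1306.5 MHz, rounded to 1307 in the chat) sits roughly 200 MHz above anything seen on Windows, which is the gap being puzzled over here.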
04:30karolherbst: mupuf: pstate table: core freq = 1098 MHz
04:30karolherbst: there we got the clock used on windows
04:31karolherbst: if now the linux driver also only clocks to 1098, we might be able to figure out what nouveau does wrong
04:31mupuf: hmm, maybe there are some gpus that have a much higher TDP and can reach much higher
04:31mupuf: this is what it is all about anyway, isn't it?
04:32karolherbst: I bet there are cstates even above boost
04:32karolherbst: just for OC stuff
04:32mupuf: well, boost for this particular card
04:32karolherbst: but normally never used
04:32mupuf: but as I said, the only thing that matters is the capacity of the fan/heatsink to dissipate the heat
04:32karolherbst: yeah, right
04:33mupuf: on high end cards, they may allow going as high as the highest cstate
04:33Tom^: also didnt pstate report 1177MHz on cstate 41? which means i was above the core clock on windows already?
04:33karolherbst: that's what I think this means
04:34karolherbst: that's the reason I want to check the linux driver on your card
04:34mupuf: hehe, well, don't worry, nouveau is not really good at using the gpu, so I am sure you were far away from the TDP
04:34Tom^: im not worried im amazed that i might get awesomeness fps :p
04:34karolherbst: mupuf: yeah, but I think I can understand the pstate => cstate => voltage mapping with that
04:34mupuf: ah ah
04:34karolherbst: if we shouldn't select a cstate above pstate.core_clock
04:34karolherbst: then this is the answer
04:35mupuf: possible, but it is likely the clock found in the boost table saying the max percentage
04:35karolherbst: and then we have a boost range with cstate.voltage < gpu.max_voltage
04:35karolherbst: mupuf: the boost table doesn't help
04:35mupuf: in any case, we need a lot of work to understand all this shit
04:35karolherbst: 0: domain 2 percent 90 min 988 max 2352
04:35karolherbst: and domain 1
04:36karolherbst: 1: domain 4 percent 100 min 1080 max 2613
04:36karolherbst: it is already too high from the current understanding
04:37karolherbst: 41: freq 2352 MHz unkn 0 unkn 1 voltage 56 is the max we can do....
04:37karolherbst: max 2352 boost table
04:37karolherbst: 41: freq 2352 MHz unkn 0 unkn 1 voltage 56 cstate
04:37karolherbst: same clock
04:37karolherbst: might be coincidence
04:46Tom^: karolherbst: http://i.imgur.com/MW2NDnC.png
04:47karolherbst: Tom^: and the gpu load?
04:47karolherbst: it is in the GPU 0 section
04:47Tom^: im not sure where to read it :p
04:48Tom^: its at 98%
04:48Tom^: GPU Utilization: that is
04:49karolherbst: mupuf: seems like I am somehow right :)
04:49karolherbst: then this is really a gpu boost thing
04:50Tom^: 33k fps ;_;
04:50karolherbst: mupuf: so there is nothing wrong with the parsing of the voltage stuff
04:56karolherbst: Tom^: okay
04:57karolherbst: Tom^: can you add a Coolbits entry in the xorg.conf ?
04:57karolherbst: Tom^: Option "Coolbits" "28" in the Device section of the nvidia gpu
05:06Tom^: karolherbst: what do you want me to set the clocks to? :p
05:06karolherbst: leave them at 0
05:06karolherbst: and see if nvidia clocks above 1098MHz now
05:07Tom^: seem to be at 1097 still
05:07karolherbst: then +135MHz
05:07karolherbst: I assume this is the most you can do?
05:08Tom^: core clock 1232
05:08Tom^: still sailing on :p
05:10karolherbst: I think this offset thing only changes the clock
05:10karolherbst: nothing else
05:10karolherbst: in a post calculation way
05:11karolherbst: that helps me a lot already
05:11Tom^: GPUCurrentCoreVoltage 1175000
05:11karolherbst: mupuf: what do you say? we shouldn't clock to cstates with clocks above what the pstate table says?
05:11Tom^: just as windows reported
05:11karolherbst: Tom^: :O
05:11karolherbst: nvidia reports the right voltage for you
05:12Tom^: i do have an GPUOverVoltageOffset that nvidia-settings doesnt seem to expose in the GUI :p
05:13karolherbst: that's normal
05:13karolherbst: you can set it a bit, but usually something under 0.01V
06:13karolherbst: mupuf: sooo my idea is this
06:13karolherbst: mupuf: we only allow those cstates which are below the pstate "core" clock and have a voltage below the gpu max voltage
06:14karolherbst: then we might add a "NvBoost" parameter which will also allow cstates above the pstate "core" clock and below the boost max clock, and still below the gpu max voltage
06:15karolherbst: thing is
06:15karolherbst: this doesn't seem to work on my gpu :/
06:35karolherbst: now it makes more sense :)
06:41karolherbst: Tom^: the core clock should be higher on windows due to gpu boost
06:41karolherbst: Tom^: found any load which triggers this?
06:41Tom^: cant say i do
06:41Tom^: but then this is a MSI factory OC'ed card.
06:41karolherbst: pain is: nvidia doesn't do boost on linux
06:41Tom^: iirc the standard 980ti is around 980mhz core clock
06:42karolherbst: 876 MHz
06:43karolherbst: you don't have a 980 ti, cause that's not kepler anymore
06:43Tom^: yea i meant 780
06:43karolherbst: nvidia behaves differently with my gpu
06:43karolherbst: that bothers me
06:44karolherbst: mine just goes straight to the max cstate
06:44karolherbst: even with the blob driver
06:44karolherbst: and idles somewhere below
06:44Tom^: http://www.sweclockers.com/test/18242-geforce-gtx-780-ti-fran-msi-pny-och-zotac/2 swedish i know but specs just a bit down.
06:44Tom^: iirc thats the card i got
06:45karolherbst: so we hit max boost already?
06:45Tom^: so it seems
06:45Tom^: and i dont think the linux nvidia blob does boost either it just simply goes to it directly no?
06:45karolherbst: no idea
06:45karolherbst: afaik on the nvidia forums some guy from nvidia said, nvidia doesn't do boost on linux
06:46karolherbst: let me check something
06:46karolherbst: 1080 MHz was your clock
06:47Tom^: well according to specs but the tools report 1097~
06:47karolherbst: ohh right
06:47karolherbst: it is that cstate then: 35: freq 2195 MHz unkn 0 unkn 1 voltage 50
06:49karolherbst: voltage_min = 1025000, voltage_max = 1212500(+50000) [µV] --
06:49karolherbst: the next cstate would have a voltage_max above the max voltage
06:50karolherbst: mupuf: I think nvidia only uses one link from the voltage map table
06:50karolherbst: then it makes totally sense to me now
06:53karolherbst: max cstate: highest cstate with voltage_map_table[cstate.voltage].max_voltage + voltage_map_table[voltage_map_table[cstate.voltage].link].max_voltage <= voltage_table.max_voltage AND cstate.clock <= boost_table[pstate].max_clock
06:53karolherbst: now I have to check other cards
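The selection rule stated above can be sketched in Python roughly as follows. This is an illustration only, not nouveau code: the class and function names are hypothetical, treating a missing link as contributing nothing is an assumption, and every voltage value in the example data (plus cstate 42's frequency) is invented. Only the 2195 and 2352 MHz frequencies and the 1217500 µV limit are taken from the log.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VoltageMapEntry:
    max_voltage: int            # microvolts
    link: Optional[int] = None  # index of one linked entry (assumed optional)


@dataclass
class Cstate:
    id: int
    clock: int    # raw table frequency in MHz
    voltage: int  # index into the voltage map table


def cstate_max_voltage(cstate: Cstate, vmap: Dict[int, VoltageMapEntry]) -> int:
    """Entry's max voltage plus the max voltage of its single linked entry."""
    entry = vmap[cstate.voltage]
    total = entry.max_voltage
    if entry.link is not None:
        total += vmap[entry.link].max_voltage
    return total


def highest_usable_cstate(cstates: List[Cstate],
                          vmap: Dict[int, VoltageMapEntry],
                          gpu_max_voltage: int,
                          boost_max_clock: int) -> Optional[Cstate]:
    """Fastest cstate that respects both the voltage and the boost clock limit."""
    usable = [c for c in cstates
              if cstate_max_voltage(c, vmap) <= gpu_max_voltage
              and c.clock <= boost_max_clock]
    return max(usable, key=lambda c: c.clock, default=None)


# Invented voltage map; indices 50/56 mirror the "voltage" fields in the log.
vmap = {
    0: VoltageMapEntry(max_voltage=50_000),
    50: VoltageMapEntry(max_voltage=1_100_000, link=0),
    56: VoltageMapEntry(max_voltage=1_150_000, link=0),
    57: VoltageMapEntry(max_voltage=1_200_000, link=0),
}
cstates = [Cstate(35, 2195, 50), Cstate(41, 2352, 56), Cstate(42, 2400, 57)]

best = highest_usable_cstate(cstates, vmap,
                             gpu_max_voltage=1_217_500,
                             boost_max_clock=2352)
print(best)
```

With this made-up data, cstate 42 is rejected on voltage and 41 is picked, which mirrors the behaviour Tom^ saw (41 works, 42 fails to raise voltage) without claiming the numbers are his card's.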
06:54karolherbst: Tom^: thanks a lot for your help. I think this is really valuable what I found out today :)
06:54Tom^: np, it was fun tinkering.
06:57jkucia: any ideas why would mesa 11.0.5 report 2.1 as max compat profile version on NVE4 (gtx 760)? GLX_MESA_query_renderer also returns compat as preferred profile and 0.0 for max core profile version
06:58karolherbst: pecisk: there?
06:58karolherbst: pecisk: I need you to check the max clock used with the binary driver
06:58karolherbst: and please tell me it is 1058 MHz
07:00karolherbst: mupuf: you still have the nve6 kepler?
07:00karolherbst: ohh I meant nve4
07:17pecisk: karolherbst, on which card
07:18karolherbst: pecisk: kepler
07:18pecisk: one sec
07:18pecisk: karolherbst, how to do that? :)
07:19karolherbst: and full load
07:22pecisk: karolherbst, when switching prefered mode to "Prefer maximum performance", I get 1006 for clock and 6008 for memory (when it's idle)
07:23Tom^: just run glxspheres twice without vsync, that maxed mine out. :p
07:34karolherbst: pecisk: yeah the clock should go a little higher
07:43imirkin: jkucia: you forgot --enable-texture-float
07:44imirkin: jdb: it would behoove you to keep all displays connected to a single gpu. otherwise you have to go the xinerama or reverse prime route, neither of which is great. and not sure the latter would work too well without acceleration (which your GPUs won't have)
07:45orbea: upgraded to mesa 11.1.0-rc1 and still OpenGL compatibility version 3.0 :/
07:45imirkin: orbea: mesa will never report an opengl compat context version above 3.0
07:46orbea: :/ Was testing what someone said last night, wasn't correct
07:46orbea: I need opengl 3.1, but I don't want to use nvidia drivers, nouveau does tty so much better....
07:47orbea: and always works
07:47jdb: imirkin: Thanks for the advice. I think I'll do this for the short term, and maybe swing around and try things out when more-than-modeset support lands for my architecture.
07:48karolherbst: orbea: then you need a core profile
07:48jdb: tangentially related: given that my set up is a little esoteric, I'm always more than happy to try things out if it'll help the world.
07:49orbea: karolherbst: core profile is 4.1, compatibility is 3.0, I need at least 3.1 compatibility version
07:49imirkin: orbea: well, there's no such thing as core/compat with GL 3.1 actually... but if you request a 3.1 profile, you get one without ARB_compatibility.
07:49imirkin: mesa just plain doesn't support it.
07:49imirkin: and there are no plans on ever adding the support
07:50orbea: sad to hear
07:50karolherbst: why is it sad?
07:50karolherbst: mac os x doesn't do it either
07:50imirkin: iirc OSX maxes out at GL 2.1
07:50orbea: i dont use apple products, I just know the only solution I have found is to use a proprietary driver and I don't like that...
07:51imirkin: then you're solving the wrong problem
07:51karolherbst: orbea: _why_ do you need a compat profile?
07:51imirkin: the problem you need to be solving is that the app wants a compat profile :)
07:51orbea: Pioneer Spacesim currently
07:51imirkin: you can always force it... MESA_GL_VERSION_OVERRIDE=3.1 MESA_GLSL_VERSION_OVERRIDE=140
07:51orbea: yea, I could take it up with them, but there will always be another game
07:52karolherbst: orbea: mhh never encountered any
07:52karolherbst: they even provide a mac os x version
07:52karolherbst: imirkin: I bet they just request the context wrong
07:53karolherbst: and it is open source
07:53imirkin: jdb: thanks for the offer :)
07:55orbea: interesting http://cgit.freedesktop.org/mesa/mesa/commit/?id=2599b92eb9751747d4eab8820384d2e5cc4f6801
07:55karolherbst: orbea: did you try it out by the way?
07:55karolherbst: launching the stuff
07:55karolherbst: I can't imagine that it won't run, because somebody would have fixed it by now
07:56orbea: try what? forcing it? I did, no difference, but maybe I was doing it wrong
07:57karolherbst: orbea: what is the error?
07:57orbea: same as before http://dpaste.com/01VN335 Their site explains how opengl 3.1 is needed
07:58karolherbst: then open a bug on their github tracker
07:58karolherbst: LOL: https://github.com/pioneerspacesim/pioneer/issues/3415
07:59karolherbst: I bet it is the same issue
07:59karolherbst: and they are simply not aware of it
08:00orbea: I'mma make a new report anyways and complain how making a free game require non-free drivers to run is bs
08:00karolherbst: orbea: complain about why they don't support a core context
08:00karolherbst: this should be the bug
08:00karolherbst: nothing else
08:00orbea: Will do
08:01karolherbst: otherwise the discussion will drift away
08:01karolherbst: but there is something else fishy though
08:02karolherbst: imirkin: is GL_ARB_explicit_attrib_location core only?
08:02imirkin: but maybe
08:02imirkin: grep should say
08:02karolherbst: then I don't see why this shader won't load
08:02imirkin: src/mesa/main/extensions_table.h:EXT(ARB_explicit_attrib_location , ARB_explicit_attrib_location , GLL, GLC, x , x , 2009)
08:03imirkin: nope, exposed for both legacy and core.
08:03imirkin: [assuming the driver flips it on]
08:03karolherbst: what about uniform stuff?
08:03imirkin: src/mesa/main/extensions_table.h:EXT(ARB_explicit_uniform_location , ARB_explicit_uniform_location , GLL, GLC, x , x , 2012)
08:04karolherbst: orbea: there is something else fishy
08:05imirkin: well, that was with a ILK gpu
08:05imirkin: which only supports GL 2.1
08:05karolherbst: ohh okay
08:05karolherbst: didn't find there which gpu it was
08:05karolherbst: makes sense then
08:05karolherbst: pecisk: any luck?
08:08pecisk: I am looking where to find glxspheres, because xonotic ultra settings maxed out didn't do the trick
08:08pecisk: on fedora that is
08:10karolherbst: glxgears is also nice
08:10imirkin: orbea: https://github.com/pioneerspacesim/pioneer/blob/master/src/graphics/WindowSDL.cpp#L19
08:11imirkin: looks like they already ask for a core context
08:11Tom^: i guess starting enough instances of glxgears without vsync should max it out too unless the cpu bogs down before :P
08:12pecisk: karolherbst, how to max out with glxgears?
08:12karolherbst: or 0?
08:12karolherbst: 0 it is :D
08:14pecisk: karolherbst, __ or _ at the beginning of variable?
08:15pecisk: running two instances, card doesn't seem to notice
08:15karolherbst: the fps should be really high
08:15karolherbst: above 20.000
08:15karolherbst: or at least near it
08:18orbea: I made issue https://github.com/pioneerspacesim/pioneer/issues/3537
08:18orbea: *an issue
08:18karolherbst: orbea: nice, good
08:18karolherbst: but imirkin found out that they should already use core profiles cause of os x
08:19karolherbst: I think the funky part is inside this shader
08:19imirkin: orbea: actually your log indicates it initialized GL 3.1 just fine
08:19pecisk: karolherbst, I am getting
08:19pecisk: Running synchronized to the vertical refresh. The framerate should be
08:19pecisk: approximately the same as the monitor refresh rate.
08:19imirkin: but i'm not overly familiar with the app in question
08:20pecisk: __GL_SYNC_TO_VBLANK=1 glxgears
08:20imirkin: pecisk: you want =0
08:20Tom^: or nvidia-settings go to opengl tab and disable vsync :p
08:20imirkin: otherwise it syncs to vblank :)
08:20Tom^: also i think glxspheres is in the virtualgl package in fedora if my google fu is correct.
08:20imirkin: as the name of the env var might suggest
08:20pecisk: that's more like it
08:21pecisk: imirkin, thanks for pointing out
08:21imirkin: np. "captain obvious" is my middle name
08:21pecisk: karolherbst, graphics clock 1150 Mhz
08:22karolherbst: not 1158? :/
08:22karolherbst: too close
08:22pecisk: nope, it stays at 1150 Mhz, and I hear card's cooling start to work
08:23karolherbst: how did I come to 1158 anyway
08:23karolherbst: okay, let me check that
08:26karolherbst: mhhh okay, I think I got it then
08:27pecisk: cool beans :)
08:27karolherbst: a bit at least
08:30pecisk: karolherbst, congrats on merge btw, seems more happy people :)
09:06RSpliet: karolherbst: gives you bragging rights you know, kernel patches
09:06RSpliet: the chicks dig that shit
09:09Tom^: RSpliet: why arent the chicks here then? https://github.com/torvalds/linux/commit/15c2bf2f3f5c1f25293ad89a9effeb75524458e8
09:12karolherbst: orbea: it starts for me
09:14orbea: interesting, i wonder where its going wrong for me
09:14karolherbst: orbea: try to start with LIBGL_DEBUG=verbose
09:15karolherbst: ohh wait
09:15karolherbst: wrong thing
09:15karolherbst: and MESA_DEBUG needs debug build right?
09:16karolherbst: orbea: do you have the newest version of it anyway?
09:16karolherbst: I meant the simulation thingy
09:17orbea: I grabbed the source tarball from their github labeled 20151109
09:18orbea: someone in #pioneer said the latest version is broken, but idk if he meant their binary build or the source or how its broken
09:18karolherbst: I downloaded the binary from their site
09:20orbea: well anyways, the libGL command added this to the output, not sure its helpful http://dpaste.com/38D83ZJ
09:23karolherbst: mupuf: I am thinking about turning the pstate/cstate lists into dynamic arrays :/
09:27orbea: the pioneer binary build has library issues here
09:28jkucia: imirkin: thanks, it was --enable-texture-float
09:34jkucia: now I have to figure out why my shader with texelFetch and textureSize renders incorrectly :)
09:35imirkin: jkucia: mmmm... not aware of any extra special issues with those
09:35imirkin: jkucia: what gpu and what kind of texture?
09:36imirkin: jkucia: fwiw the texelFetch piglits when done from vs use textureSize() and work fine
09:37imirkin: jkucia: you can always see what happens with llvmpipe
09:37jkucia: It is possible that something is wrong in my code but it renders properly with nvidia blob. NVE4 (gtx 760) R8G8 and DXT3/DXT5 textures.
09:38imirkin: is this the starwars game... or starsomething?
09:38orbea: oh crap, I see the issue, pioneer doesn't like make install...its all my fault lol
09:39imirkin: iirc there was some sort of glitch in one of those, and the person appeared convinced it was somehow related to s3tc
09:39imirkin: (which it totally wasn't)
09:40jkucia: No, it isn't Starwars. I work on D3D11 in Wine.
09:40imirkin: jkucia: https://bugs.freedesktop.org/show_bug.cgi?id=91551 -- looks like it was only a bug on nv50 though, not nvc0
09:41imirkin: jkucia: ah ok :) well, make a trace, try it on other drivers, and then figure out whether to blame nouveau or not
09:41karolherbst: jkucia: like employed or for fun? :D
09:41imirkin: jkucia: note that just plain running wine against a diff driver is often an unfair comparison as it'll pick entirely different paths
09:41imirkin: based on the available exts
09:42jkucia: karolherbst: employed :D
09:42karolherbst: jkucia: codeweavers?
09:43imirkin: jkucia: anyways, if it *is* a nouveau issue, i suspect it's entirely unrelated to texture types
09:43jkucia: imirkin: I have to write some tests for shader instructions I have implemented so I will probably find this way which part doesn't work
09:43jkucia: karolherbst: yes
09:43karolherbst: okay nice :)
09:44imirkin: jkucia: note that texelFetch takes integers, not floats
09:44karolherbst: for a little percentage I thought working on dx11 is a myth :p, but glad to hear that there is actually some work on this
09:45imirkin: i'm just waiting for st/d3d1x to be resurrected
09:45karolherbst: jkucia: is IID_IDXGIDevice1 already implemented in d3d11_device?
09:45karolherbst: I noticed some games need this
09:46jkucia: imirkin: I know it takes integers. Thanks for your help. I'll report back if I find anything :)
09:46jkucia: karolherbst: yes, it is
09:46karolherbst: I tried that myself and failed horribly :D
09:46imirkin: jkucia: yeah, figured :) make an apitrace if you find anything untoward in the nouveau driver.
09:47jkucia: Actually, I've got a first simple game running in D3D11. However, only with nvidia blob for now.
09:48karolherbst: jkucia: still awesome
09:48imirkin: well note that nouveau doesn't support like 50% of DX11 right now -- no UAV's, no compute
09:48karolherbst: jkucia: I bet most of the d3d10 games will also run because of the dx11 runtime being implemented?
09:48imirkin: but... working on it
09:49jkucia: karolherbst: yes
09:49karolherbst: imirkin: first we need the perf to actually run those games :p
09:49imirkin: karolherbst: meh, that's of little concern to me :)
09:49karolherbst: I know
09:49imirkin: i don't actually play games
09:49RSpliet: I think most people here don't play games
09:49jkucia: imirkin: It should not be a problem for this game at all. It just uses D3D11 and SM4 shaders but only basic D3D11 features.
09:49karolherbst: I do, and I would be happy if I could ditch windows and the binary driver
09:50Tom^: karolherbst: +1 on that :p
09:50imirkin: jkucia: ah ok. not up on all the windows terminology... SM4 is DX10? i thought DX11 was SM5...
09:50imirkin: or this is a DX9-type situation where you have multiple SM versions that diff drivers might be capable of?
09:50jkucia: SM4 shaders can also be used in D3D11
09:51jkucia: but SM4 instructions are at the feature level of D3D10 hardware
09:51imirkin: ah i see
09:52imirkin: well DX10.1 is well-supported by nouveau on all nvidia hw [that supports it]
09:53karolherbst: imirkin: shouldn't gallium nine also be able to support that dx10 stuff? I mean it is just an API anyway and the runtime is implemented in wine, I doubt that this will be much work actually
09:53imirkin: karolherbst: afaik it's a totally different api -- easier to just make a new st
09:53karolherbst: mhh okay
09:53imirkin: karolherbst: there used to be a d3d1x state tracker a few years ago, but it was removed -- it was wildly incomplete
09:54karolherbst: I know
09:54imirkin: in large part because the gallium support wasn't there for it either
09:54imirkin: but that's why nouveau had all sorts of DX11-type features 75%-done
09:54karolherbst: I will ask in the d3d9 channel if there are plans
09:57karolherbst: mupuf: new function: struct nvkm_cstate * get_highest_available_cstate(struct nvkm_clk *clk, struct nvkm_pstate *pstate, bool boost); and this will return the fastest cstate currently available respecting voltage range and other stuff
09:58mupuf: hmm, why not, but how about simply exposing another view of the legal cstates?
09:59karolherbst: mupuf: sounds a bit of wasted memory to me :/
09:59karolherbst: mupuf: maybe using arrays instead of linked lists is better, because we could reuse the entries then
10:01mupuf: are you seriously optimising something that will not even take one memory page?
10:03karolherbst: mupuf: I am sure it could get bigger
10:03karolherbst: nvkm_cstate isn't a small thing
10:04karolherbst: 30 * u32 + list_head + 2 * u8 and this thing like 60 times
10:06karolherbst: well more actually, because every pstate has such a list
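A rough check of the size estimate above, as a sketch only. The struct is hypothetical: it mirrors just the field counts karolherbst lists (30 u32 values, a list_head, two u8 values) and is not the real nvkm_cstate layout. The 4096-byte page size used in the usage note is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct list_head: two pointers. */
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

/* Hypothetical layout built only from the field counts quoted in the chat. */
struct cstate_size_sketch {
	struct list_head_sketch head;
	uint32_t values[30];
	uint8_t a, b;
};

/* Bytes used by one list of n cstates (allocator overhead ignored). */
size_t cstate_list_bytes(size_t n)
{
	return n * sizeof(struct cstate_size_sketch);
}

/* Nonzero if n cstates fit into a single page of page_size bytes. */
int fits_in_page(size_t n, size_t page_size)
{
	return cstate_list_bytes(n) <= page_size;
}
```

On a typical 64-bit build this struct pads out to 144 bytes, so 60 entries come to roughly 8.4 KiB per pstate: more than one 4 KiB page, yet still small in absolute terms.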
10:09imirkin: jkucia: oh, another funny thing with TXQ is that it returns the # of levels in .w - textureSize() doesn't have that. but you get that with textureQueryLevels() (part of some ext) -- dunno if that's relevant to your app.
10:10imirkin: jkucia: or SVIEWINFO... wtvr it's called :)
10:16jkucia: I use both, textureSize() and textureQueryLevels(), to implement SM4 resinfo. I guess resinfo is the equivalent of SVIEWINFO
10:16imirkin: right. resinfo. sorry, i get all these things confused.
10:17imirkin: dx9, dx10, tgsi, and tgsi has 2 versions of all these things (one for sampler one for views)
10:18imirkin: anyways... all this stuff _ought_ to work. send traces :)
10:18imirkin: [i assume you're familiar with apitrace]
10:19jkucia: yes, I use it a lot recently ;)
10:38karolherbst: mupuf: nouveau 0000:01:00.0: volt: min voltage: 600000 max voltage: 1200000 :) we also need to parse this then
10:38karolherbst: Tom^: wanna try something out?
10:39Tom^: depends :p
10:39karolherbst: it works for me
10:40karolherbst: Tom^: which branch did you use? master_karol_stable_4_1?
10:41Tom^: pmoreau: needs to add some headers to the 4.3 kernel otherwise
10:43karolherbst: mupuf: does this look okayish? https://github.com/karolherbst/nouveau/commit/1ad74517ab66134ea02a41a369ffeba2a9f9dbc5
10:44karolherbst: Tom^: well use the branch named "Tom" from my repository
10:45karolherbst: and boot with nouveau.debug=volt=debug
10:45Tom^: that means options nouveau debug=volt=debug in modprobe.conf ?
10:45karolherbst: Tom^: you should get a line like "nouveau 0000:01:00.0: volt: min voltage: 825000 max voltage: 1212500" then
10:46Tom^: ok will do in a moment
10:47mupuf: karolherbst: yeah, in theory this is good. You do not check the boost flag?
10:47mupuf: in any case, I think we need something like this :)
10:48karolherbst: I just can't concentrate enough now, so I don't think I can produce anything clean today
10:48karolherbst: I am sure part of the nvkm_clk stuff needs to be refactored for dynamic reclocking anyway
10:49karolherbst: mupuf: for example: why should the voltage entry id be stored in nvkm_cstate, why not the _exact_ voltage from the beginning?
10:49karolherbst: this will save us some cpu clocks every time nouveau reclocks
10:52mupuf: oh, because we try to keep in memory what the vbios contains
10:54RSpliet: those few clock cycles are not on the critical path
10:54karolherbst: mupuf: well boost flag will mean clocks above pstate clocks
10:54karolherbst: mupuf: but this part is tricky, because of clock_domains
10:54RSpliet: or well, not really. not very relevant
10:54mupuf: karolherbst: I got that :)
10:55karolherbst: mupuf: but the blob doesn't care about that at all
10:55mupuf: I would say, let's ignore boost for the moment and just pretend we can use the maximum clock at all time
10:55karolherbst: shader freq= 810 MHz here
10:55karolherbst: blob uses 862MHz straight away
10:55karolherbst: which is my highest cstate
10:55karolherbst: so, meh
10:56karolherbst: but this would mean my gpu is "just" 810MHz fast on paper, didn't know that
10:57karolherbst: mupuf: and I wrote a patch to support clocks lower than the lowest cstate (135MHz vs 405MHz for me), didn't see any lower power consumption and threw it away :D
10:58karolherbst: mupuf: I think with that, we now have a solution to all those can't voltage issues I know of for kepler
10:58karolherbst: mupuf: or did I forget anything?
11:04mupuf: karolherbst: not sure I am following
11:08karolherbst: mupuf: best I show you: "0: freq 810 MHz unkn 1 unkn 2 voltage 10" this is my slowest cstate
11:08karolherbst: boost table: 0: pstate 7 min 270 MHz max 810 MHz
11:08karolherbst: and the blob also uses this value on the lowest pstate
11:08karolherbst: nouveau not
11:08mupuf: oh, I was wondering where it took this low one from
11:08mupuf: I just guessed it was hard-coded
11:08mupuf: did you check this theory?
11:08karolherbst: not yet
11:09karolherbst: but the nvidia-settings thing always displays ranges like the boost table
11:09karolherbst: also on Tom^s card
11:09karolherbst: mupuf: it also displays clocks it never reaches :D
11:09karolherbst: mupuf: check this: 0: pstate 7 min 270 MHz max 810 MHz
11:10karolherbst: mupuf: this I mean: https://i.imgur.com/MW2NDnC.png
11:10karolherbst: this is at full load
11:10karolherbst: and the max/min clocks are exactly like the boost table
11:10karolherbst: except one minor thing
11:10karolherbst: perf level 1 min is +4MHz
11:19mooch: Does anybody know what CRTC reg 0x3e does on NV4?
11:19mooch: The BIOS likes to poll this register because of the way Debian uses VESA.
11:22karolherbst: Tom^: how is it going?
11:22Tom^: just compiled
11:22Tom^: now modprobe.conf and unload kms and reload module
11:22mooch: I've seen this register called DDC_STATUS
11:24imirkin: mooch: you're talking about 0x68083e right?
11:24imirkin: er wait, that makes no sense
11:24mooch: No, I'm talking about 0x3d4 index 0x3e
11:24mooch: In the VGA register space
11:24imirkin: ah right. that makes more sense :)
11:25prg: since this auto-reclocking thing turned out to be somewhat unstable for me, i've now been trying again to use nouveau and see how stable it is in normal usage. had a game running in the background and watched a movie in mpv
11:25prg: displays froze again, only thing in dmesg: nouveau 0000:01:00.0: fifo: read fault at 0021dbf000 engine 00 [GR] client 08 [GPC2/PE_2] reason 00 [PDE] on channel 6 [023f7f1000 mpv/vo]
11:25karolherbst: prg: yeah, as long as you don't reclock, most of the things should work without issues
11:25prg: nouveau 0000:01:00.0: fifo: gr engine fault on channel 6, recovering...
11:25prg: and no, it did not recover
11:25karolherbst: uhh yeah
11:25karolherbst: sometimes something fishy is going on
11:26prg: this was just on 07 pstate
11:26karolherbst: imirkin: I think something funny is going on in general. I hit this with Wasteland 2 several times a day
11:26karolherbst: imirkin: any idea how to debug those?
11:27imirkin: karolherbst: make a mmt trace that captures the issue, and note down the address reported in dmesg, then try to figure out wtf happened.
11:27karolherbst: imirkin: k
11:27karolherbst: will do tomorrow then
11:27karolherbst: imirkin: but it can take an hour :D
11:28imirkin: i don't get any more info from that message than you do.
11:28Tom^: karolherbst: volt: min voltage: 825000uv max voltage: 1212500uv
11:28karolherbst: Tom^: nice
11:28imirkin: it just lists some address. what am i supposed to do with that? :)
11:28karolherbst: Tom^: now reclock to 0f
11:28Tom^: is pstate available by default now?
11:28karolherbst: this is the part where it can actually crash
11:29karolherbst: with my branches
11:29Tom^: oki good :p
11:29karolherbst: it is just in debugfs
11:29imirkin: prg: it's entirely possible we have some resource mismanagement in the vdpau accel logic... which i assume you're using?
11:29Tom^: karolherbst: which was where?
11:30Tom^: sure seems to work just fine
11:31Tom^: no complaints, and core at 1177mhz and mem at 6999mhz
11:31karolherbst: very good
11:31Tom^: a bit above windows and blob as last time
11:31karolherbst: the blob also uses higher voltage in general
11:31karolherbst: so this is fine for now
11:32karolherbst: I think I will try to get those voltage fixes ready for the next kernel version
11:32karolherbst: mupuf: any idea what we should do about the 0ed voltage table header?
11:33mupuf: ahm right, we need to test how the blob selects
11:33imirkin: mooch: http://cgit.freedesktop.org/~darktama/nouveau/tree/drm/nouveau/dispnv04/nvreg.h#n273
11:33imirkin: i guess we call it DDC_STATUS too :)
11:33karolherbst: mupuf: ahh right
11:33karolherbst: mupuf: but would this prevent a fix like that? https://github.com/karolherbst/nouveau/commit/4e2ef5f700bf2912f71eefe877e9e9d1d4f40e73
11:33karolherbst: I mean we can change the selection later anyway
11:33mooch: imirkin: But what are the bit definitions?
11:34imirkin: nfc. nouveau never appears to read it.
11:35prg: imirkin, this was with VO: [opengl-hq]
11:35mooch: shit, because the BIOS relentlessly polls it
11:35imirkin: prg: hm ok
11:35mooch: I think it has something to do with this: https://en.wikipedia.org/wiki/Display_Data_Channel
11:35imirkin: mooch: so... just see what that bios code is expecting
11:35imirkin: yeah, ddc is to interact with the monitor
11:36imirkin: it's an i2c line
11:36mooch: Good point
11:38imirkin: mooch: check out xf86-video-nv. it sets DDCBase to 0x3e
11:38imirkin: mooch: and then has a few writes to it, but never reads
11:38imirkin: oh wait. that's just how reading works.
11:38imirkin: take a look at Riva_I2CGetBits and Riva_I2CPutBits
11:39imirkin: enjoy :)
11:41mooch: sda_mask and scl_mask...
11:41mooch: i dunno what those do
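On the sda_mask and scl_mask question: in bit-banged I2C (which DDC uses), such masks usually just pick out the bits of a register that carry the clock (SCL) and data (SDA) lines. A minimal sketch of that idea follows, using a simulated CRTC register file reached through an index/data pair as with 0x3d4/0x3d5. The mask values and function names are placeholders, not taken from xf86-video-nv or any datasheet, and using one register for both reading and writing is a simplification.

```c
#include <stdint.h>

/* Simulated VGA CRTC register file; real hardware is reached through
 * the index port 0x3d4 and the data port 0x3d5. */
static uint8_t crtc_regs[256];
static uint8_t crtc_index;

/* Emulates writing the register index to 0x3d4. */
void crtc_select(uint8_t index) { crtc_index = index; }
/* Emulates reading the selected register through 0x3d5. */
uint8_t crtc_read(void) { return crtc_regs[crtc_index]; }
/* Emulates writing the selected register through 0x3d5. */
void crtc_write(uint8_t value) { crtc_regs[crtc_index] = value; }

/* Placeholder bit positions, NOT the real hardware definitions. */
#define SCL_MASK 0x04u
#define SDA_MASK 0x08u

/* Sample the two I2C lines from the selected register. */
void i2c_get_bits(uint8_t reg, int *clock, int *data)
{
	crtc_select(reg);
	uint8_t val = crtc_read();
	*clock = (val & SCL_MASK) != 0;
	*data = (val & SDA_MASK) != 0;
}

/* Drive the two I2C lines, leaving all other bits untouched. */
void i2c_put_bits(uint8_t reg, int clock, int data)
{
	crtc_select(reg);
	uint8_t val = crtc_read();
	val = clock ? (uint8_t)(val | SCL_MASK) : (uint8_t)(val & ~SCL_MASK);
	val = data ? (uint8_t)(val | SDA_MASK) : (uint8_t)(val & ~SDA_MASK);
	crtc_write(val);
}
```

A BIOS that polls index 0x3e in a loop would, under this reading, be sampling SCL/SDA while clocking bits in from the monitor.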
11:42Tom^: karolherbst: anyhow im gonna go to bed, if you need anything else tested just highlight or pm me and i'll do it after work tomorrow.
11:43Tom^: karolherbst: im also gonna do a tiny victory dance to this https://www.youtube.com/watch?v=grnNKvd2SYI now that my 780ti runs at highest clocks on nouveau.
11:45RSpliet: Tom^: https://www.youtube.com/watch?v=M1Yt3EQfsNQ
12:01prg: if i wanted to do this mmt trace thing myself, i'd need to compile https://github.com/envytools/valgrind, right?
12:14karolherbst: imirkin: ... it segfaulted
12:19karolherbst: imirkin: when I run valgrind on the gdbed game, it works
12:19imirkin: skeggsb: oh boy, someone's running kasan on nouveau. that'll be scary to look at :)
12:19karolherbst: imirkin: what is kasan?
12:19imirkin: kernel asan
12:19karolherbst: ohh KernelAddressSanitizer
12:20imirkin: skeggsb: interesting... nvc0_fbcon_imageblit's OUT_RINGp goes out of bounds? that seems... sad.
12:20xexaxo: KASAN doesn't like nouveau :-(
12:20imirkin: skeggsb: can these things be called from multiple threads?
12:21imirkin: skeggsb: if so, all the RING_SPACE stuff is meaningless =/
12:23imirkin: skeggsb: oh wait. the error is *reading* the data, not writing it to the ring.
12:23imirkin: that's even sadder
12:25imirkin: skeggsb: soft_cursor: s_pitch = (cursor->image.width + 7) >> 3;
12:26imirkin: skeggsb: but nvc0_fbcon has:
12:26imirkin: width = ALIGN(image->width, 32);
12:26imirkin: dwords = (width * image->height) >> 5;
12:32imirkin: skeggsb: ahahaha. trinity + nouveau. that's gonna work out great :)
12:38imirkin: skeggsb: hm, i guess we might read too far by 3 bytes...
12:39imirkin: that didn't seem to be the situation here though
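The two size calculations quoted above can be compared directly. This sketch assumes the source image really is packed with byte-aligned rows, as the soft_cursor s_pitch formula suggests; if the caller pads rows to 32 bits instead, nothing is over-read. The chat does not settle which case applies, and the function names here are made up for illustration.

```c
#include <stddef.h>

/* Round x up to a multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Bytes in a 1 bpp image whose rows are padded to whole bytes,
 * following the s_pitch formula quoted from soft_cursor. */
size_t packed_bytes(size_t width, size_t height)
{
	size_t s_pitch = (width + 7) >> 3;
	return s_pitch * height;
}

/* Bytes consumed when the dword count is derived as in the quoted
 * nvc0_fbcon lines: each row rounded up to 32 bits. */
size_t dword_read_bytes(size_t width, size_t height)
{
	size_t w = ALIGN_UP(width, 32);
	size_t dwords = (w * height) >> 5;
	return dwords * 4;
}

/* Bytes read beyond a byte-packed source buffer (0 if none). */
size_t overread_bytes(size_t width, size_t height)
{
	size_t have = packed_bytes(width, height);
	size_t want = dword_read_bytes(width, height);
	return want > have ? want - have : 0;
}
```

For widths that are already multiples of 32 the two formulas agree; for something like an 8x16 glyph they differ by a wide margin under the packing assumption.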
12:42imirkin: hakzsam: btw, if you're looking for other open issues -- https://bugs.freedesktop.org/show_bug.cgi?id=91247 -- no clue why it happens, but it affects both nv50 and nvc0.
12:43imirkin: hakzsam: and also talos is pretty messed up with nouveau if you enable some of the fancier logic within. i spent a bunch of time staring at it but made no progress. there's a bug open about it somewhere.
13:32hakzsam: imirkin, I'll have a look once I have fixed this edgeflag issue :)
13:43karolherbst: mupuf: how should we handle the case, when we don't have any max/min voltages?
13:45mupuf: what do you mean?
13:45mupuf: no entries and no header?
13:45mupuf: that means we have a fixed voltage
13:45mupuf: fixed *unspecified* voltage
13:46karolherbst: yeah, but we shouldn't fail in such cases too
13:46karolherbst: mupuf: and when do we get a device without a bios?
13:47mupuf: we shouldn't fail? That's the main case where we should fail
13:47mupuf: no idea about the voltage = no reclocking, that's it :D
13:47mupuf: devices without a vbios? Like the tegra?
13:59karolherbst: mupuf: do you know anybody with tesla cards which had the same problem?
13:59mupuf: seriously, do not focus on tesla
13:59karolherbst: I guess it is no issue on tesla, because there weren't so many cstates for each card back then
13:59mupuf: teslas are fucked up
13:59karolherbst: thing is
13:59karolherbst: I don't want to break stuff for teslas :D
14:00mupuf: ah, right
14:03karolherbst: mupuf: does that look somehow cleaner? https://github.com/karolherbst/nouveau/commit/d17c8b6ba98a4d68c712253298fe5ced69a0f504
14:03mupuf: give me a few minutes
14:03karolherbst: ohh I should move the nvkm_debug call
14:07koz_: karolherbst: I'm having some 3D issues...
14:08koz_: I restored the old nouveau.ko file I had before, but now, my machine has *no* 3D at all.
14:08karolherbst: koz_: I think the initramfs generation messed something up
14:08koz_: karolherbst: So how can I fix it?
14:09karolherbst: generate the initramfs the right way, but I've never really done it
14:09karolherbst: koz_: maybe reinstalling the kernel package helps
14:09koz_: karolherbst: I tried reinstalling the kernel. Same deal.
14:09karolherbst: this should trigger all kinds of stuff
14:10karolherbst: koz_: are you using ARCH?
14:10koz_: karolherbst: Yes.
14:10karolherbst: koz_: then I would ask in an arch channel. I never encountered anybody here with this problem and this seems to be an arch issue afaik :/
14:10karolherbst: koz_: does dmesg tell anything?
14:10karolherbst: maybe there is a clue
14:10koz_: karolherbst: Nothing out of the ordinary AFAIK.
14:11karolherbst: koz_: is nouveau loaded at all?
14:11koz_: karolherbst: It is according to lsmod.
14:11karolherbst: then maybe Xorg.log helps
14:11koz_: I have normal display and everything - just *specifically* no 3D.
14:12karolherbst: maybe then something else is messed up
14:12koz_: Yeah, I just wish I knew what...
14:12karolherbst: try this: LIBGL_DEBUG=verbose glxgears
14:13karolherbst: mupuf: do you know what? I really don't care if we respect the boost table ranges or not. As long as we keep the voltage and don't produce too much heat, we should be fine, shouldn't we?
14:14karolherbst: or the clock the pstates say it should use
14:14mupuf: no, it is not acceptable
14:14mupuf: we need to stay in the power budget
14:15karolherbst: yeah of course
14:15mupuf: but right now, our compiler is ... lacking :D
14:15karolherbst: mupuf: but the clock doesn't matter with power consumption, does it?
14:15mupuf: the problem is that nvidia selected the clocks based on the worst case scenario
14:15mupuf: so as they would not need a power sensor
14:17karolherbst: mupuf: what is the worst that could happen if we go over the power budget?
14:18mupuf: blow up the power sensor? Consume more power than what the system is supposed to handle?
14:18mupuf: Like if someone selected a 200W PSU because the sum of all the TDP was 200W
14:18karolherbst: I see
14:20karolherbst: mupuf: and if the consumption gets too near to the budget, nouveau would have to lower the voltage?
14:30mupuf: voltage + freq
14:31karolherbst: mupuf: yeah okay, but we would lower the freq anyway, because the voltage might not be high enough anymore
14:33karolherbst: mupuf: okay, then my plan would be this: respect pstate max clock and voltage range for now, then add a boost parameter, so that the user can decide to go above the pstate clock bound. Later when we support keeping the budget, we could enable boosting by default on gpus with power sensors
14:37mupuf: yeah, having an OC mode would be nice :)
14:37karolherbst: no, I didn't mean OC mode
14:37karolherbst: if I respect the pstate clock, I lose 52MHz compared to the blob
14:37karolherbst: and I didn't OC with the blob
14:38karolherbst: or is boosting also considered "OC"
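The plan karolherbst outlines above (respect the pstate max clock and the voltage range, with an opt-in boost flag) could look roughly like the sketch below. The types and names are hypothetical and much simpler than the real nvkm_clk code, and treating the voltage range as a filter on both ends is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified cstate: not the real struct nvkm_cstate. */
struct cstate_entry {
	uint32_t freq_khz; /* core clock of this cstate */
	uint32_t volt_uv;  /* voltage this clock needs, in microvolts */
};

/*
 * Return the fastest cstate whose voltage lies within [min_uv, max_uv].
 * Without boost, cstates above the pstate's own clock are skipped.
 * Returns NULL if no cstate qualifies.
 */
const struct cstate_entry *
highest_available_cstate(const struct cstate_entry *cstates, size_t count,
			 uint32_t pstate_max_khz,
			 uint32_t min_uv, uint32_t max_uv, bool boost)
{
	const struct cstate_entry *best = NULL;

	for (size_t i = 0; i < count; i++) {
		const struct cstate_entry *c = &cstates[i];

		if (c->volt_uv < min_uv || c->volt_uv > max_uv)
			continue; /* outside the allowed voltage range */
		if (!boost && c->freq_khz > pstate_max_khz)
			continue; /* above the pstate clock, boost is off */
		if (!best || c->freq_khz > best->freq_khz)
			best = c;
	}
	return best;
}
```

With boost off this caps the result at the pstate clock (810 MHz in karolherbst's case); with boost on it may pick a faster cstate, as long as its voltage is still in range.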
14:39RSpliet: I think those 52MHz are not going to make a gigantic difference in perf
14:39karolherbst: it is more than 5%!
14:39karolherbst: RSpliet: 810MHz vs 862MHz
14:40RSpliet: we're talking core clock? meh, as long as you saturate the memory bus you'll be fine
14:40karolherbst: I am mostly gpu core bound
14:40RSpliet: otherwise, you'll probably be more effective implementing zculling, or proper insn scheduling... or minimising the number of registers we need to schedule more warps per SM :-D
14:41karolherbst: actually I worked on the latter already
14:41karolherbst: but I gave up in the end, because I multiplied reg usage
14:41karolherbst: and only got a 5% perf git
14:42RSpliet: nobody said it'd be easy :-D
14:42karolherbst: I know
14:42karolherbst: I spent days with that
14:42RSpliet: meh, only days :-P
14:55karolherbst: mupuf: are the clocks in the cstate multiplied by 2?
14:56karolherbst: because in the pstate table I got "shader freq = 810 MHz"
14:56karolherbst: but the memory clock is also lower there, mhh
14:59mupuf: I doubt they would be
14:59karolherbst: mupuf: but it seems like it
14:59mupuf: you know what I am going to say!
14:59mupuf: CHECK IT :D
14:59mupuf: it is easy enough to check
14:59karolherbst: 36: freq 1725 MHz unkn 0 unkn 1 voltage 47
14:59karolherbst: nouveau says 862MHz
15:00karolherbst: for this
15:00karolherbst: nvbios says 1725
15:00karolherbst: also the struct has 1725
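The numbers quoted in this log are at least arithmetically consistent with the table values being twice what nouveau reports: 1725 halves to 862 with integer truncation, and the slowest table entry (810) halves to the 405 MHz mentioned earlier. This is only an arithmetic check, not confirmation of how the table is encoded, and the helper names are made up.

```c
#include <stdint.h>

/* Halve a table clock, truncating the way integer division does. */
uint32_t halved_mhz(uint32_t table_mhz)
{
	return table_mhz / 2;
}

/* Gap between two clocks, in tenths of a percent of the lower one. */
uint32_t gap_permille(uint32_t low_mhz, uint32_t high_mhz)
{
	return (high_mhz - low_mhz) * 1000 / low_mhz;
}
```

The same helper puts the 810 MHz vs 862 MHz gap at 64 per mille, i.e. about 6.4%, matching the "more than 5%" remark.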
15:18karolherbst: mupuf: the clocks in the PM_Mode table are starting to make less sense to me :/
15:24karolherbst: mhhh, why can't I fake my vbios :/ this is annoying
15:35karolherbst: mupuf: even nouveau doesn't pick up my new vbios
15:35karolherbst: mupuf: I just set the voltage entry to 47 for all my cstate
15:35karolherbst: and still nothing changes
15:35mupuf: hmm. check with the vbios dump
15:36mupuf: err, vbios debug option
15:37karolherbst: mupuf: I get the stock one
15:37karolherbst: is the vbios cached somewhere?
15:37karolherbst: because nouveau doesn't seem to read it
15:38mupuf: well, it is weird that it reads it from prom if it is available in RAMIN
15:39karolherbst: mhh wait a second
15:39karolherbst: this is dmesg: https://gist.github.com/karolherbst/ff6f26f786b2dd287c70
15:40karolherbst: but after removing and loading nouveau again
15:40karolherbst: I don't get those lines anymore
15:43karolherbst: nvafakebios crashed my kernel
15:44karolherbst: okay, trying to upload vbios before loading nouveau the first time
15:48karolherbst: mupuf: no clue what it is, but if I try to nvafakebios after I turned on my gpu, my system is just frozen
15:48karolherbst: frozen as in there is no kernel crash
15:48karolherbst: but nothing reacts to anything
15:48karolherbst: maybe there is a crash, but pstore doesn't seem to save it
15:50mupuf: well, time for me to lsepe
15:50mupuf: hmm, I guess my point is made, I cannot write sleep, so I must be tired :D
15:50karolherbst: yeah, I have to go too :D
15:50karolherbst: you can!
15:50karolherbst: you just showed it