02:13Yoshimo: karolherbst: maybe we should move it to the correct channel as well ;)
02:14Yoshimo: and why shouldn't it? If your assumption was right it should have given us data
02:15karolherbst: yeah well, we need the pmu to reclock memory
02:15karolherbst: and this is secured
02:16Yoshimo: wasn't it just about reading instead of modifying?
02:17karolherbst: ohh you mean the daemon?
02:17karolherbst: oh yeah we have other problems too
02:17Yoshimo: no the additional output i was supposed to get from your branch
02:17karolherbst: yeah well, it doesn't matter in the end anyway, because you can only reclock your core
02:18karolherbst: and this gives you like nothing
02:20Yoshimo: isn't it about creating the foundation for additional work by understanding more of the card's internal plumbing? If i cared about performance i wouldn't experiment with nouveau on brand new cards
02:21karolherbst: well most of the vbios stuff I am pretty much sure of already. There is something odd with configuring the core clocks
02:23Yoshimo: except some stutter in videos for Xcom:EW and quite a load of mesa complaints about textures, it seemed fine
02:28karolherbst: yeah, it is only interesting on higher clocks
02:39Yoshimo: pmu was the one that needed additional work from NVIDIA in decoupling firmware and their own driver, wasn't it?
04:36karolherbst: mupuf: any idea how the blob decides a state is invalid?
04:36karolherbst: I just increased the clock by 0x100 and now it won't be selected anymore
04:39karolherbst: ohh yeah, maybe because I exceeded the boost limits...
09:05karolherbst: RSpliet: when do you have a chance to test the gddr5 kepler of yours again?
09:05karolherbst: I have an idea
09:05RSpliet: no promises, but maybe I can do some stuff tomorrow
09:05RSpliet: what did you find in the trace?
09:06karolherbst: this: 810MHz => (var N, fN = 0x61440, P = 1, M = 31) => gpc clock
09:06karolherbst: N being the only value which changes
09:06karolherbst: which also explains the 16MHz steps the blob is doing
09:08karolherbst: yeah, only N changes in the pll configuration
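A minimal sketch of why changing only N gives evenly spaced clocks, assuming the textbook integer PLL relation f_out = f_ref * N / (M * P) and ignoring the fractional fN term. The function name and the 496 MHz reference are hypothetical, chosen only so that M = 31 and P = 1 reproduce the 16 MHz step mentioned above; the log does not state the real reference clock or the blob's actual formula.

```python
def pll_out_mhz(ref_mhz, n, m, p):
    """Textbook integer PLL output: ref * N / (M * P). Assumed model, not the blob's."""
    return ref_mhz * n / (m * p)


REF_MHZ = 496.0  # hypothetical reference, picked so that ref / (M * P) == 16 MHz
M = 31           # value quoted in the trace discussion

# With M and P held constant, each increment of N moves the output by ref / (M * P).
for n in range(48, 53):
    print(n, pll_out_mhz(REF_MHZ, n, M, 1))

step_p1 = pll_out_mhz(REF_MHZ, 51, M, 1) - pll_out_mhz(REF_MHZ, 50, M, 1)
step_p2 = pll_out_mhz(REF_MHZ, 51, M, 2) - pll_out_mhz(REF_MHZ, 50, M, 2)
```

Under this model the step depends only on ref / (M * P), so a post-divider of P = 2 would halve it.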
09:08RSpliet: which register is that?
09:09karolherbst: ohh P can also be 2
09:09karolherbst: got one clock where it is 2
09:09RSpliet: -- VCO1 - freq [1100-2200]MHz, inputfreq [25-100]MHz, M [17-32], N [8-255], P [1-63] --
09:09RSpliet: interesting how they don't document that in the VBIOS
09:09karolherbst: you know how that table also lied to us regarding gddr5?
09:10RSpliet: no it doesn't
09:10karolherbst: it sure does
09:10RSpliet: you just didn't have the strap peek
09:10RSpliet: MEM TYPE table at 0x4d14, version 10, 16 entries
09:10RSpliet: Detected ram type: GDDR5
09:10RSpliet: quite sure
09:10RSpliet: but feel free to doublecheck
09:11karolherbst: I mean the pll ranges are garbage and we can't rely on them somewhat
09:11RSpliet: how much does NVIDIA's PLL generation algorithm actually differ from ours?
09:12RSpliet: (like, the gk20a one)
09:12karolherbst: a lot
09:12karolherbst: they only multiply afaik
09:12karolherbst: the source clock comes from the cpu part
09:12RSpliet: I'm not talking blob, just the contributed tegra code
09:13karolherbst: still the source clock is an input from outside nouveau
09:13RSpliet: oh yes
09:13RSpliet: (sorry, I'm not the sharpest pencil in the pencilbox today)
09:14RSpliet: anyway, yes, if you don't mind cooking up a patch for that, and maybe verifying it against the trace
09:14RSpliet: I can test it on actual hw tomorrow
09:14karolherbst: but basically they do the same I do in my gddr5 pll code
09:14karolherbst: they have two plls
09:14karolherbst: and do a loop
09:15karolherbst: mhh clk->params->max_m
09:15karolherbst: where does this one get set
09:15RSpliet: I reckon this is a simple limit of VCOs only being able to function within a certain range that doesn't stretch far enough to cover 100MHz all the way to 2GHz+
09:15karolherbst: mhh odd
09:16karolherbst: well the lower pll can get pretty much everything anyway
09:16RSpliet: karolherbst: we get similar limits from the VBIOS
09:16RSpliet: and I'm sure the lower PLL has a wide range, but the higher PLL's input requirements could be strict :-)
09:17karolherbst: for one pll they are fine actually
09:17karolherbst: that was also what I noticed back then
09:18karolherbst: M had to be 1
09:18karolherbst: but it could be I missed something
09:18karolherbst: anyway, I check the blob now on different clocks
09:19karolherbst: and the high pll seems to use a variable N and P of either 1 or 2, with M and fN being constant
09:19RSpliet: cool, I'd be happy to test either tomorrow or monday
09:20RSpliet: sorry, I can't help a lot with dev right now, little free time
09:20karolherbst: yeah I have to check what nouveau does, but I doubt nouveau does have this pattern
09:20karolherbst: yeah, no worries
09:20karolherbst: for some reason a clock higher than 2600MHz makes nvidia hang my machine...
09:27karolherbst: mhh, seems like nvidia doesn't like a clock higher than 1200MHz
10:05phillipsjk: Back in the day, the S#Virge/DX Decallarator card had a clock of 50Mhz, while the Pentium that could out-render it (without texture filtering though) ran at 60Mhz or more.
10:05phillipsjk: *S3Virge decelerator
10:53jayhost`: karolherbst yes it was I who tested your maxwell reclock branch
10:55karolherbst: jayhost`: got any instabilities?
10:57karolherbst: RSpliet: so I've added something to my kepler_reclocking_v2 branch: 61440
11:00karolherbst: RSpliet: but I am sure there is something else odd, so I've also added a printk which prints the configuration of the second pll, which is interesting when we compare nvidia with nouveau
11:30AlexAltea: hmm, I'm curious about this: While setting NV03_PFIFO_RAMHT to values like 0x0002XXXX, I can only set the RAMHT starting offset between 0x0 and 0x1F000. Could it go past this point (0x1F000)?
11:30AlexAltea: It's as if the value I set was AND'ed with 0xFFFF01F0
11:31AlexAltea: (hence the offset 0x1F000 == 0x1F0 << 8)
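A minimal sketch of the masking AlexAltea describes, using only the two numbers from the messages above (the 0xFFFF01F0 mask and the shift by 8). The constant and function names are hypothetical, and the meaning of the upper 16 bits is not covered here.

```python
OBSERVED_WRITE_MASK = 0xFFFF01F0  # bits that stuck when written, per the observation above
RAMHT_OFFSET_MASK = 0x000001F0    # low-half bits that survive the mask
RAMHT_OFFSET_SHIFT = 8            # from "0x1F000 == 0x1F0 << 8"


def readback(reg_value):
    """Value the register would hold after a write, if the mask is as observed."""
    return reg_value & OBSERVED_WRITE_MASK


def ramht_offset(reg_value):
    """Byte offset encoded by the offset field (bits 4..8 of the register)."""
    return (reg_value & RAMHT_OFFSET_MASK) << RAMHT_OFFSET_SHIFT


# The lowest offset bit (0x10) shifted by 8 is 0x1000, so offsets move in 4 KiB steps,
# and the largest encodable offset is 0x1F0 << 8 == 0x1F000.
print(hex(readback(0x0002FFFF)), hex(ramht_offset(0x0002FFFF)))
```

Since the field is only five bits wide, no written value can encode an offset beyond 0x1F000 under this reading of the mask.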
11:49jayhost`: karolherbst I don't think so. I haven't been testing but can on request.
11:54karolherbst: jayhost`: well would be nice, just run the card at highest clocks for some time and notify me whenever something unusual happens :)
11:54karolherbst: stability is our biggest concern usually
12:08imirkin: mwk: hey, so i want to make sqrt() work on tesla. we've currently been doing x * rsq(x), but that breaks down for x = inf (comes out as NaN). are you aware of a way to make it work without flipping the global "zero wins" mode?
12:42imirkin: karolherbst: did you end up setting up deqp?
12:42karolherbst: imirkin: not yet, why?
12:43karolherbst: did I promise something and forgot? :O
12:43imirkin: actually, there's probably a shader runner test that does the same thing
12:43imirkin: could i convince you to do a mmt trace on blob?
12:43karolherbst: yeah, I am hacking ebuilds currently (ubuntu stuff, so I am in a world of pain you can't imagine)
12:43karolherbst: anything else is better than that :D
12:44karolherbst: so, what shall I do?
12:44imirkin: bin/shader_runner generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-sqrt-float.shader_test -fbo -auto
12:47imirkin: it does rcp(rsq))!!
12:47imirkin: that is *very* surprising
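A minimal sketch of the two sqrt() formulations discussed above, using plain IEEE double arithmetic in Python rather than the Tesla hardware ops. The `rsq` and `rcp` helpers are hypothetical stand-ins that only model the infinity and zero cases, and inputs are assumed to be non-negative.

```python
import math


def rsq(x):
    """Reciprocal square root with IEEE-style results for 0 and inf (x >= 0 assumed)."""
    if x == 0.0:
        return math.inf
    return 1.0 / math.sqrt(x)  # math.sqrt(inf) is inf, so this yields 0.0 for inf


def rcp(x):
    """Reciprocal with IEEE-style results for 0 and inf."""
    if x == 0.0:
        return math.copysign(math.inf, x)
    return 1.0 / x


def sqrt_mul(x):
    """The x * rsq(x) form: inf * 0 and 0 * inf both give NaN without 'zero wins'."""
    return x * rsq(x)


def sqrt_rcp(x):
    """The rcp(rsq(x)) form seen in the blob trace."""
    return rcp(rsq(x))


for value in (4.0, 0.0, math.inf):
    print(value, sqrt_mul(value), sqrt_rcp(value))
```

In this model the rcp(rsq(x)) form handles both edge cases without any special multiply mode, at the cost of a second special-function op.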
13:01pmoreau_: imirkin: If you still need those MMT traces, I can get them now since I'm on the blob, and have some time (finally!)
13:02imirkin: pmoreau_: what did i ask for?
13:02pmoreau_: Some dEQP MMT
13:03imirkin: which one? :)
13:03imirkin: i solved one of the bugs i was having
13:03pmoreau_: dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler3d_float_fragment and dEQP-GLES3.functional.shaders.fragdata.draw_buffers
13:03imirkin: ah, yes please!
13:04pmoreau_: Both of them?
13:04imirkin: well, first please check if they pass
13:04imirkin: if they fail, i'm not as interested :)
13:04pmoreau_: Ok, let's see if I remember how to do that :-)
13:04pmoreau_: Eh eh eh!
13:04pmoreau_: You don't want to know how they fail? :-D
13:05imirkin: i'm sure it's the same way mine fail
13:05imirkin: [if they fail]
13:05pmoreau_: Does it have to be latest Mesa, or Mesa from one/two weeks ago will do?
13:06imirkin: mesa is not involved here
13:06imirkin: deqp test run against blob.
13:08pmoreau_: Right… --" It's getting late I guess
13:09pmoreau_: What was the flag again to give it the test to run?
13:09imirkin: ./deqp-gles3 --deqp-visibility=hidden --deqp-case='dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler3d_float_fragment'
13:11pmoreau_: Float fragment failed
13:12pmoreau_: And draw buffers as well
13:13imirkin: then nevermind
13:13imirkin: thanks for checking though
13:13pmoreau_: You're welcome
13:13pmoreau_: Sorry for taking so long
13:15imirkin: no worries
15:31AlexAltea: Does anyone have a collection of MMIO dumps I could use? The wiki only mentions how to submit dumps
15:32AlexAltea: I really need to know what offsets can I set on the NV03_PFIFO_RAMHT register
15:43mwk: AlexAltea: the offset has to be 4kiB-aligned
15:43mwk: and in range 0-0x1f000
15:59AlexAltea: mwk: thanks a lot for the information. (bad news for me though, heheh)
16:02mwk: AlexAltea: btw, what about that reg 0 I asked about?
16:03AlexAltea: mwk: E4D001A1
16:05AlexAltea: sorry for not submitting it before. I thought you weren't interested any more after the question about the objects was solved
16:06mwk: thank you
16:06AlexAltea: mwk, I also have a MMIO dump if you want, 8 MB in size
16:07mwk: I'd like that, yes
16:07AlexAltea: I couldn't get past that due to hypervisor panics
16:08mwk: that's expected
16:08mwk: 0x800000+ is FIFO submission area
16:08mwk: if you attempt to read it, it causes a FIFO error
16:08mwk: because you submitted a wrong command
16:10AlexAltea: that's interesting, heh
16:10AlexAltea: here it is: https://github.com/AlexAltea/temporary2
18:12bkeys: Does anyone know of a GPU that respects your freedom AND supports Vulkan 1.0?
18:13imirkin: please define 'respects your freedom' and 'supports Vulkan'
18:15bkeys: respects your freedom: FSF approved? Or supported by nouveau in general
18:15bkeys: supports Vulkan: I can develop software using the Vulkan API
18:15imirkin: bkeys: not aware of any
18:16bkeys: In that case; at this point does anyone recommend a GPU that is known to work really well with Nouveau?
18:16imirkin: i don't know that FSF approves of any nvidia gpu's in particular, and nouveau has no vulkan support right now
18:16imirkin: if you're looking for the best open-source support, i'd definitely recommend amd over nvidia
18:17imirkin: if you really want to go for nvidia, i think the kepler series is the best supported right now out of the recent GPUs - you get reclocking (mostly), and full GL support (to the extent that nouveau has it)
18:17bkeys: Do you have a specific card that you recommend?
18:18imirkin: find one that fits your price range? any kepler should be fine though.
18:18imirkin: the GTX 780 Ti is the most expensive/powerful card that nouveau can drive right now
18:18imirkin: (well, technically that honor might fall to something in the Tesla K40 or so series, but... those aren't really end-user gpu's)
18:19bkeys: But Kepler is an Nvidia architecture? What about the AMD?
18:19imirkin: oh, i don't know too much about specific AMD models
18:19imirkin: i just know that they have an actual paid team of engineers whose job it is to support and develop the linux drivers
18:20imirkin: and who have access to various internal resources, like documentation, hw engineers, etc
18:20bkeys: Well I am interested moreso in the philosophical aspect of the support, not just support for GNU/Linux
18:21imirkin: not really sure what that means tbh...
18:22bkeys: Ehh it is kind of hard to describe in words what I meant. Just that I care more about the driver being freely licensed
18:23imirkin: both the amd (open-source) driver and nouveau are BSD-licensed
18:23imirkin: [or is it MIT? i forget tbh]
18:24bkeys: Okay; do you have any clue which is the most powerful AMD GPU that its freely licensed driver can run?
18:24imirkin: afaik the amd driver will run on every amd gpu... i think the Fiji line is their high-end right now
18:24imirkin: you might check in #radeon for details
19:22jayhost`: imirkin these hastebins are PTX assembly? tesselation-sanity
19:22imirkin: what hastebins?
19:23imirkin: anyways, almost definitely not
19:23imirkin: PTX is not a real hardware ISA
19:28jayhost`: Maybe a better question is. What am I looking at here http://hastebin.com/nodowuxucu.cs
19:28imirkin: the output of nvdisasm ;)
19:28imirkin: it's a text representation of the actual instructions running on the gpu
19:29imirkin: that 8-dword sequence came out of a shader in the mmt
19:29jayhost`: Is it opengl assembly language
19:30imirkin: no, it's SM50 assembly
19:30imirkin: opengl is a higher-level language
19:30imirkin: like C
19:30imirkin: talking about "opengl assembly" is the same as talking about "C assembly"
19:31imirkin: but just like your C program is compiled to, say, x86_64
19:31imirkin: so are glsl shaders compiled to the target GPU. in the case of maxwell, the ISA is known as SM50
19:33jayhost`: Ahh yeah. Interesting.
19:37jayhost`: Do you have documentation
19:49imirkin: jayhost`: for what?
19:50jayhost`: Is that intel syntax
19:51imirkin: you mean for the gpu's isa? no docs.
19:51imirkin: you have to figure it out based on the glsl you write and what the gpu generates
19:51imirkin: some things are obvious, e.g. "MUL", others less so, like "ISBERD" :)
19:53jayhost`: Okay so write the C-ish code and try to mimic the blob
19:54imirkin: i think you're missing some steps
19:54imirkin: let's take a concrete shader
19:56imirkin: jayhost`: http://hastebin.com/emudigosaz.avrasm
19:56imirkin: now you can see the FFMA's at the bottom (fma = fused multiply-add), which are obviously performing the math part of the thing
19:56imirkin: but... where did those values come from?
19:57imirkin: that's what you need to trace through
19:57imirkin: if you can come up with a recipe for loading the various TEP inputs, then that will solve part of the missing functionality
19:57imirkin: (the other missing part is in TCP's... let's not worry about those for now)
19:57imirkin: note that every 4th instruction should be a sched -- if it's not, then that's just the decoder messing up.
19:58imirkin: although there doesn't seem to be any of that there.
19:58imirkin: but sometimes it happens
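A minimal sketch of the sanity check imirkin describes, applied to a list of mnemonics taken from a disassembly. It assumes the sched entry is printed as `sched` and occupies the first slot of each group of four; the function name and the sample listing are hypothetical and are not taken from the paste.

```python
def misplaced_sched(mnemonics):
    """Return the indices where the 'sched in every 4th slot' pattern is broken."""
    bad = []
    for index, op in enumerate(mnemonics):
        expect_sched = (index % 4 == 0)  # assumed: sched leads each group of four
        if (op == "sched") != expect_sched:
            bad.append(index)
    return bad


# Hypothetical listing: two well-formed groups of one sched plus three instructions.
listing = ["sched", "mov32i", "ffma", "ffma",
           "sched", "ffma", "ffma", "exit"]
print(misplaced_sched(listing))
```

A non-empty result points at the places where the decoder output should not be trusted.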
19:58imirkin: does that make more sense now?
20:02jayhost`: I think so. I only need to use nvdisasm, I don't have to run anything
20:03imirkin: you can even just use the paste i made above
20:03imirkin: just need to work through it and figure out wtf is going on
20:03imirkin: oh, remember that gl_Position is actually a vec4 -- i.e. it's 4 values in one
20:04imirkin: so all those multiplies/adds are actually doing 4 values, not 1
20:04imirkin: but the nvidia isa's are scalar, so it expands out to 4 separate ops
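A minimal sketch of the scalar expansion described above: one vec4 multiply-add becomes four independent scalar multiply-adds, one per component. The helper names are hypothetical, and Python's `a * b + c` rounds twice, so it only models the data flow of an FFMA, not its fused rounding.

```python
def ffma(a, b, c):
    """Scalar multiply-add; stands in for one FFMA (not actually fused here)."""
    return a * b + c


def vec4_mad_scalarized(a, b, c):
    """Expand a vec4 multiply-add into four scalar ops, one per component."""
    return [ffma(a[i], b[i], c[i]) for i in range(4)]


# x, y, z, w are each handled by their own scalar instruction.
result = vec4_mad_scalarized([1.0, 2.0, 3.0, 4.0],
                             [2.0, 2.0, 2.0, 2.0],
                             [1.0, 1.0, 1.0, 1.0])
print(result)
```

This is why a single line of GLSL that writes gl_Position shows up as several FFMAs in the disassembly.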
20:23jayhost`: okay I'm reading the FMA wiki page
20:28jayhost`: I assume when you said values you're talking about the GPR's
20:36jayhost`: I imagine I'll be able to figure it out. Assuming I'm supposed to start at mov32i after sched
20:36jayhost`: Maybe I should use pen and paper
21:51sveta: #kodi refused to help; on debian jessie with a GeForce Go 7300 i am trying to use nouveau and when i start kodi it gives a black screen and the whole system freezes, including that i can't switch to tty1; if i add the 'nomodeset' param in grub, screen resolution is ugly but kodi works
21:52imirkin: jayhost`: i stared at the tcp/tep a bit... i THINK i might understand what's going on
21:52imirkin: sveta: try removing libvdpau_nouveau.so from your system and see if that improves the situation
21:53imirkin: sveta: failing that, try to get some logs from the system
22:01jayhost`: ahh you got it already. Verry Nicce
22:02imirkin: jayhost`: not quite
22:02imirkin: i think i see what's going on though
22:02imirkin: isberd.o == the new ld.o, i.e. reading from tcp outputs
22:03imirkin: in order to write tcp outputs, you do gmem writes to $directbewriteaddress + something
22:03imirkin: isberd is used as, effectively, the old al2p was
22:04imirkin: there's a new al2p which is used to check if an output is out of range?
22:04imirkin: this will take a lot of hacking
22:04imirkin: [and testing]
22:04sveta: imirkin: in tty1 it says 'gpu lockup failed - switching to fbcon' (approximate wording). the file was not there and installing mesa-vdpau-drivers did not help, neither did removing it
22:04sveta: where to look for logs ?
22:05imirkin: sveta: anything before the gpu lockup message?
22:05imirkin: sveta: check dmesg
22:07imirkin: sveta: another thing to do would be to try disabling GL in kodi
22:07imirkin: (or uninstalling mesa entirely)
22:07imirkin: or forcing cpu-based accel, e.g. LIBGL_ALWAYS_SOFTWARE=1
22:09jayhost`: imirkin where is the code for hacking on
22:09imirkin: jayhost`: mesa? :)
22:10imirkin: jayhost`: you'll specifically want to look at https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#n2012
22:10imirkin: this handles a specialized condition for kepler, i suspect a separate condition for maxwell can be added there to satisfy everything
22:11imirkin: this is how PFETCH is handled: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp#n152
22:11imirkin: perhaps it just needs a clever VFETCH handler? dunno.
22:12imirkin: oh, and i guess export too
22:13sveta_: imirkin: i see nothing at all in dmesg at the time of the issue, but it's difficult, because it has weird timestamps and i don't know which of the messages are at the time. the fbcon message i see in tty1 is not in dmesg.
22:13imirkin: sveta_: that's very odd =/
22:13imirkin: sveta_: anyways... 3d on nv4x is kinda sucky. i suspect kodi does something unexpected, which breaks the universe
22:14imirkin: sveta_: try getting rid of 3d, or just telling kodi not to use it
22:14imirkin: i gtg... good luck
22:16jayhost`: imirkin I'm gonna have to study a bit more.
22:16sveta_: imirkin: in kern.log http://dpaste.com/19WVD03.txt
22:17imirkin: sveta_: right so that's all happy
22:18sveta_: i would be glad to disable 3d whatever kodi has, but my vocabulary is bad, and one guy in #kodi wants me to use proprietary stuff without even filing you guys a proper bug report
22:19imirkin: it's likely the blob driver will work better, if you can find one that will load on your system
22:19imirkin: to disable 3d, try running kodi with LIBGL_ALWAYS_SOFTWARE=1 in the environment
22:20imirkin: it's fairly common (albeit unfortunate) for other projects to not support nouveau in any way whatsoever, and frequently to be actively hostile towards it.
22:20imirkin: my general response is to ignore those projects, like i do with kodi, mpv, etc. mplayer works great :)
22:30sveta_: http://paste.debian.net/plain/414707 <-- glxinfo
23:10sveta: thanks, that worked, though with low lagging mouse movement
23:10sveta: will check how video / audio under this work later