02:13 Yoshimo: karolherbst: maybe we should move it to the correct channel as well ;)
02:14 Yoshimo: and why shouldn't it? If your assumption was right it should have given us data
02:15 karolherbst: :D
02:15 karolherbst: yeah well, we need the pmu to reclock memory
02:15 karolherbst: and this is secured
02:16 Yoshimo: wasn't it just about reading instead of modifying?
02:17 karolherbst: ohh you mean the daemon?
02:17 karolherbst: oh yeah we have other problems too
02:17 Yoshimo: no the additional output i was supposed to get from your branch
02:17 karolherbst: ahh
02:17 karolherbst: yeah well, it doesn't matter in the end anyway, because you can only reclock your core
02:18 karolherbst: and this gives you like nothing
02:20 Yoshimo: isn't it about creating the foundation for additional work by understanding more of the card's internal plumbing? If i cared about performance i wouldn't experiment with nouveau on brand new cards
02:21 karolherbst: well most of the vbios stuff I am pretty much sure of already. There is something odd with configuring the core clocks
02:23 Yoshimo: except some stutter in videos for Xcom:EW and quite a load of mesa complaints about textures, it seemed fine
02:28 karolherbst: yeah, it is only interesting on higher clocks
02:39 Yoshimo: pmu was the one that needed additional work from NVIDIA in decoupling firmware and their own driver, wasn't it?
02:39 karolherbst: yeah
04:35 karolherbst: man..
04:36 karolherbst: mupuf: any idea how the blob decides a state is invalid?
04:36 karolherbst: I just increased the clock by 0x100 and now it won't be selected anymore
04:39 karolherbst: ohh yeah, maybe because I exceeded the boost limits...
04:54 Chasis: yop
09:05 karolherbst: RSpliet: when do you have a chance to test that gddr5 kepler of yours again?
09:05 karolherbst: I have an idea
09:05 RSpliet: no promises, but maybe I can do some stuff tomorrow
09:05 RSpliet: what did you find in the trace?
09:06 karolherbst: this: 810MHz => (var N, fN = 0x61440, P = 1, M = 31) => gpc clock
09:06 karolherbst: N being the only value which changes
09:06 karolherbst: which also explains the 16MHz steps the blob is doing
09:08 karolherbst: yeah, only N changes in the pll configuration
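A minimal sketch of why a PLL that only varies N moves in fixed steps, assuming the common integer relation f_out = f_in * N / (M * P). That relation, the helper names and the 496 MHz input clock are assumptions made for illustration; the log itself only gives M = 31, P = 1 or 2 and the 16MHz step, and the fractional fN term is ignored here.

```python
from fractions import Fraction


def pll_out(f_in_mhz, n, m, p):
    """Output of an idealized integer PLL: f_in * N / (M * P).

    Hypothetical helper; the fractional fN term from the log is ignored.
    """
    return Fraction(f_in_mhz) * n / (m * p)


def step_per_n(f_in_mhz, m, p):
    """Change in output frequency when N goes up by one."""
    return pll_out(f_in_mhz, 1, m, p)


# Hypothetical input clock, picked so that M = 31 and P = 1 give a 16 MHz step.
F_IN_MHZ = 496

if __name__ == "__main__":
    for n in (50, 51, 52):
        print(n, float(pll_out(F_IN_MHZ, n, 31, 1)))
    print("step:", float(step_per_n(F_IN_MHZ, 31, 1)))
```

Under this assumption the step size depends only on f_in, M and P, so keeping M and P fixed and walking N gives evenly spaced clocks, and switching P from 1 to 2 halves the step.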
09:08 RSpliet: which register is that?
09:09 karolherbst: 137000
09:09 karolherbst: ohh P can also be 2
09:09 karolherbst: got one clock where it is 2
09:09 RSpliet: -- VCO1 - freq [1100-2200]MHz, inputfreq [25-100]MHz, M [17-32], N [8-255], P [1-63] --
09:09 RSpliet: interesting how they don't document that in the VBIOS
09:09 karolherbst: you know how that table also lied to us regarding gddr5?
09:10 RSpliet: no it doesn't
09:10 karolherbst: it sure does
09:10 RSpliet: you just didn't have the strap peek
09:10 karolherbst: sure?
09:10 RSpliet: MEM TYPE table at 0x4d14, version 10, 16 entries
09:10 RSpliet: Detected ram type: GDDR5
09:10 RSpliet: quite sure
09:10 RSpliet: but feel free to doublecheck
09:11 karolherbst: I mean the pll ranges are garbage and we can't really rely on them
09:11 RSpliet: how much does NVIDIAs PLL generation algorithm actually differ from ours?
09:12 RSpliet: (like, the gk20a one)
09:12 karolherbst: well
09:12 karolherbst: a lot
09:12 karolherbst: they only multiply afaik
09:12 karolherbst: the source clock comes from the cpu part
09:12 RSpliet: I'm not talking blob, just the contributed tegra code
09:13 karolherbst: still the source clock is an input from outside nouveau
09:13 RSpliet: oh yes
09:13 RSpliet: (sorry, I'm not the sharpest pencil in the pencilbox today)
09:14 RSpliet: anyway, yes, if you don't mind cooking up a patch for that, and maybe verifying it against the trace
09:14 RSpliet: I can test it on actual hw tomorrow
09:14 karolherbst: but basically they do the same thing I do in my gddr5 pll code
09:14 karolherbst: they have two plls
09:14 karolherbst: and do a loop
09:15 karolherbst: mhh clk->params->max_m
09:15 karolherbst: where does this one get set
09:15 RSpliet: I reckon this is a simple limit of VCOs only being able to function within a certain range that doesn't stretch far enough to cover 100MHz all the way to 2GHz+
09:15 karolherbst: mhh odd
09:16 karolherbst: https://github.com/karolherbst/nouveau/blob/master_4.4/drm/nouveau/nvkm/subdev/clk/gk20a.c#L109
09:16 karolherbst: well the lower pll can get pretty much everything anyway
09:16 RSpliet: karolherbst: we get similar limits from the VBIOS
09:16 RSpliet: and I'm sure the lower PLL has a wide range, but the higher PLL's input requirements could be strict :-)
09:17 karolherbst: for one pll they are fine actually
09:17 karolherbst: yeah
09:17 karolherbst: that was also what I noticed back then
09:18 karolherbst: although...
09:18 karolherbst: M had to be 1
09:18 karolherbst: but it could be I missed something
09:18 karolherbst: anyway, I'm checking the blob now on different clocks
09:19 karolherbst: and the high pll seems to use a variable N and P either 1 or 2, with M and fN being constant
09:19 RSpliet: cool, I'd be happy to test either tomorrow or monday
09:20 RSpliet: sorry, I can't help a lot with dev right now, little free time
09:20 karolherbst: yeah I have to check what nouveau does, but I doubt nouveau has this pattern
09:20 karolherbst: yeah, no worries
09:20 karolherbst: for some reason a clock higher than 2600MHz makes nvidia hang my machine...
09:22 karolherbst: huuh
09:27 karolherbst: mhh, seems like nvidia doesn't like a clock higher than 1200MHz
10:05 phillipsjk: Back in the day, the S#Virge/DX Decallarator card had a clock of 50Mhz, while the Pentium that could out-render it (without texture filtering though) ran at 60Mhz or more.
10:05 phillipsjk: *S3Virge decelerator
10:53 jayhost`: karolherbst yes it was I who tested your maxwell reclock branch
10:55 karolherbst: jayhost`: got any instabilities?
10:57 karolherbst: RSpliet: so I've added something to my kepler_reclocking_v2 branch: 61440
10:57 karolherbst: ....
10:57 karolherbst: https://github.com/karolherbst/nouveau/commit/7f03b990a614a9548e7b854a647533868bdcec94
11:00 karolherbst: RSpliet: but I am sure there is something else odd, so I've also added a printk which prints the configuration of the second pll, which is interesting when we compare nvidia with nouveau
11:30 AlexAltea: hmm, I'm curious about this: While setting NV03_PFIFO_RAMHT to values like 0x0002XXXX, I can only set the RAMHT starting offset between 0x0 and 0x1F000. Could it go past this point (0x1F000)?
11:30 AlexAltea: It's as if the value I set was AND'ed with 0xFFFF01F0
11:31 AlexAltea: (hence the offset 0x1F000 == 0x1F0 << 8)
11:49 jayhost`: karolherbst I don't think so. I haven't been testing but can on request.
11:54 karolherbst: jayhost`: well would be nice, just run the card at highest clocks for some time and notify me whenever something unusual happens :)
11:54 karolherbst: stability is our biggest concern usually
12:08 imirkin: mwk: hey, so i want to make sqrt() work on tesla. we've currently been doing x * rsq(x), but that breaks down for x = inf (comes out as NaN). are you aware of a way to make it work without flipping the global "zero wins" mode?
12:42 imirkin: karolherbst: did you end up setting up deqp?
12:42 karolherbst: imirkin: not yet, why?
12:43 karolherbst: did I promise something and forget? :O
12:43 imirkin: no
12:43 imirkin: actually, there's probably a shader runner test that does the same thing
12:43 imirkin: could i convince you to do a mmt trace on blob?
12:43 karolherbst: yeah, I am hacking ebuilds currently (ubuntu stuff, so I am in a world of pain you can't imagine)
12:43 karolherbst: anything else is better than that :D
12:44 karolherbst: so, what shall I do?
12:44 imirkin: bin/shader_runner generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-sqrt-float.shader_test -fbo -auto
12:46 karolherbst: http://filebin.ca/2a24TYC77Eco/fs-sqrt.mmt.xz
12:47 imirkin: HUH!
12:47 imirkin: it does rcp(rsq(x))!!
12:47 imirkin: that is *very* surprising
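A minimal sketch of the difference between the two sqrt formulations mentioned above, x * rsq(x) and rcp(rsq(x)). The rsq and rcp helpers below are idealized stand-ins with IEEE-style edge cases written for this illustration; they are not the hardware instructions, which are approximations with their own rounding behaviour.

```python
import math


def rsq(x):
    """Idealized reciprocal square root for x >= 0 (hypothetical model)."""
    if x == 0.0:
        return math.inf
    if math.isinf(x):
        return 0.0
    return 1.0 / math.sqrt(x)


def rcp(x):
    """Idealized reciprocal, returning a signed infinity for a zero input."""
    if x == 0.0:
        return math.copysign(math.inf, x)
    return 1.0 / x


def sqrt_mul(x):
    """sqrt as x * rsq(x): inf * 0 yields NaN."""
    return x * rsq(x)


def sqrt_rcp(x):
    """sqrt as rcp(rsq(x)): no inf * 0 product is ever formed."""
    return rcp(rsq(x))


if __name__ == "__main__":
    for x in (4.0, math.inf):
        print(x, sqrt_mul(x), sqrt_rcp(x))
```

In this model both forms agree for ordinary inputs, but only the rcp(rsq(x)) form returns inf for x = inf, since it never multiplies an infinity by a zero.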
13:01 pmoreau_: imirkin: If you still need those MMT traces, I can get them now since I'm on the blob, and have some time (finally!)
13:02 imirkin: pmoreau_: what did i ask for?
13:02 pmoreau_: Some dEQP MMT
13:03 imirkin: which one? :)
13:03 imirkin: i solved one of the bugs i was having
13:03 pmoreau_: dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler3d_float_fragment and dEQP-GLES3.functional.shaders.fragdata.draw_buffers
13:03 imirkin: ah, yes please!
13:04 pmoreau_: Both of them?
13:04 imirkin: yes
13:04 imirkin: well, first please check if they pass
13:04 imirkin: if they fail, i'm not as interested :)
13:04 pmoreau_: Ok, let's see if I remember how to do that :-)
13:04 pmoreau_: Eh eh eh!
13:04 pmoreau_: You don't want to know how they fail? :-D
13:05 imirkin: i'm sure it's the same way mine fail
13:05 imirkin: [if they fail]
13:05 pmoreau_: Does it have to be latest Mesa, or Mesa from one/two weeks ago will do?
13:06 imirkin: mesa is not involved here
13:06 pmoreau_: Ok
13:06 imirkin: deqp test run against blob.
13:08 pmoreau_: Right… --" It's getting late I guess
13:09 pmoreau_: What was the flag again to give it the test to run?
13:09 imirkin: ./deqp-gles3 --deqp-visibility=hidden --deqp-case='dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler3d_float_fragment'
13:10 pmoreau_: Thanks
13:11 pmoreau_: Float fragment failed
13:12 pmoreau_: And draw buffers as well
13:12 imirkin: ok
13:13 imirkin: then nevermind
13:13 imirkin: thanks for checking though
13:13 pmoreau_: You're welcome
13:13 pmoreau_: Sorry for taking so long
13:15 imirkin: no worries
15:31 AlexAltea: Does anyone have a collection of MMIO dumps I could use? The wiki only mentions how to submit dumps
15:32 AlexAltea: I really need to know what offsets I can set on the NV03_PFIFO_RAMHT register
15:43 mwk: AlexAltea: the offset has to be 4kiB-aligned
15:43 mwk: and in range 0-0x1f000
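A minimal sketch tying together the mask AlexAltea observed (0xFFFF01F0, with the offset being the 0x1F0 field shifted left by 8) and the constraint mwk gives (4 KiB aligned, in range 0-0x1f000). The constant and function names are hypothetical, and the mask is the one seen in the log rather than a documented value.

```python
RAMHT_WRITABLE_MASK = 0xFFFF01F0  # mask observed in the log, not from docs
RAMHT_OFFSET_FIELD = 0x000001F0   # low-half bits that survive the mask
RAMHT_OFFSET_SHIFT = 8            # 0x1F0 << 8 == 0x1F000


def ramht_readback(value):
    """Value the register would hold after a write, per the observed mask."""
    return value & RAMHT_WRITABLE_MASK


def ramht_offset(value):
    """RAMHT starting offset encoded in a register value."""
    return (value & RAMHT_OFFSET_FIELD) << RAMHT_OFFSET_SHIFT


def is_valid_ramht_offset(offset):
    """True if the offset is 4 KiB aligned and within 0..0x1F000."""
    return offset % 0x1000 == 0 and 0 <= offset <= 0x1F000


if __name__ == "__main__":
    written = 0x0002FFFF
    print(hex(ramht_readback(written)), hex(ramht_offset(written)))
```

The field occupies bits 4 to 8, so after the shift it lands on bits 12 to 16. That is why every reachable offset is a multiple of 0x1000 and the largest one is 0x1F000.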
15:59 AlexAltea: mwk: thanks a lot for the information. (bad news for me though, heheh)
16:02 mwk: AlexAltea: btw, what about that reg 0 I asked about?
16:03 AlexAltea: mwk: E4D001A1
16:05 AlexAltea: sorry for not submitting it before. I thought you weren't interested any more after the question about the objects was solved
16:06 mwk: thank you
16:06 AlexAltea: mwk, I also have an MMIO dump if you want, 8 MB in size
16:07 mwk: I'd like that, yes
16:07 AlexAltea: I couldn't get past that due to hypervisor panics
16:08 mwk: that's expected
16:08 mwk: 0x800000+ is FIFO submission area
16:08 mwk: if you attempt to read it, it causes a FIFO error
16:08 mwk: because you submitted a wrong command
16:10 AlexAltea: that's interesting, heh
16:10 AlexAltea: here it is: https://github.com/AlexAltea/temporary2
18:12 bkeys: Does anyone know of a GPU that respects your freedom AND supports Vulkan 1.0?
18:13 imirkin: please define 'respects your freedom' and 'supports Vulkan'
18:15 bkeys: respects your freedom: FSF approved? Or supported by nouveau in general
18:15 bkeys: supports Vulkan: I can develop software using the Vulkan API
18:15 imirkin: bkeys: not aware of any
18:16 bkeys: In that case; at this point does anyone recommend a GPU that is known to work really well with Nouveau?
18:16 imirkin: i don't know that FSF approves of any nvidia gpu's in particular, and nouveau has no vulkan support right now
18:16 imirkin: if you're looking for the best open-source support, i'd definitely recommend amd over nvidia
18:17 imirkin: if you really want to go for nvidia, i think the kepler series is the best supported right now out of the recent GPUs - you get reclocking (mostly), and full GL support (to the extent that nouveau has it)
18:17 bkeys: Do you have a specific card that you recommend?
18:18 imirkin: find one that fits your price range? any kepler should be fine though.
18:18 imirkin: the GTX 780 Ti is the most expensive/powerful card that nouveau can drive right now
18:18 imirkin: (well, technically that honor might fall to something in the Tesla K40 or so series, but... those aren't really end-user gpu's)
18:19 bkeys: But Kepler is an Nvidia architecture? What about the AMD?
18:19 imirkin: oh, i don't know too much about specific AMD models
18:19 imirkin: i just know that they have an actual paid team of engineers whose job it is to support and develop the linux drivers
18:20 imirkin: and who have access to various internal resources, like documentation, hw engineers, etc
18:20 bkeys: Well I am interested moreso in the philosophical aspect of the support, not just support for GNU/Linux
18:21 imirkin: not really sure what that means tbh...
18:22 bkeys: Ehh it is kind of hard to describe in words what I meant. Just that I care more about the driver being freely licensed
18:23 imirkin: both the amd (open-source) driver and nouveau are BSD-licensed
18:23 imirkin: [or is it MIT? i forget tbh]
18:24 bkeys: Okay; do you have any clue which is the most powerful AMD GPU that its freely licensed driver can run?
18:24 imirkin: afaik the amd driver will run on every amd gpu... i think the Fiji line is their high-end right now
18:24 imirkin: you might check in #radeon for details
19:22 jayhost`: imirkin these hastebins are PTX assembly? tesselation-sanity
19:22 imirkin: what hastebins?
19:23 imirkin: anyways, almost definitely not
19:23 imirkin: PTX is not a real hardware ISA
19:28 jayhost`: Maybe a better question is. What am I looking at here http://hastebin.com/nodowuxucu.cs
19:28 imirkin: the output of nvdisasm ;)
19:28 imirkin: it's a text representation of the actual instructions running on the gpu
19:29 imirkin: that 8-dword sequence came out of a shader in the mmt
19:29 jayhost`: haha
19:29 jayhost`: Is it opengl assembly language
19:30 imirkin: no, it's SM50 assembly
19:30 imirkin: opengl is a higher-level language
19:30 imirkin: like C
19:30 imirkin: talking about "opengl assembly" is the same as talking about "C assembly"
19:31 imirkin: but just like your C program is compiled to, say, x86_64
19:31 imirkin: so are glsl shaders compiled to the target GPU. in the case of maxwell, the ISA is known as SM50
19:33 jayhost`: Ahh yeah. Interesting.
19:37 jayhost`: Do you have documentation
19:49 imirkin: jayhost`: for what?
19:50 jayhost`: Is that intel syntax
19:51 imirkin: you mean for the gpu's isa? no docs.
19:51 imirkin: you have to figure it out based on the glsl you write and what the gpu generates
19:51 imirkin: some things are obvious, e.g. "MUL", others less so, like "ISBERD" :)
19:53 jayhost`: Okay so write the C-ish code and try to mimic the blob
19:54 imirkin: i think you're missing some steps
19:54 imirkin: let's take a concrete shader
19:56 imirkin: jayhost`: http://hastebin.com/emudigosaz.avrasm
19:56 imirkin: now you can see the FFMA's at the bottom (fma = fused multiply-add), which are obviously performing the math part of the thing
19:56 imirkin: but... where did those values come from?
19:57 imirkin: that's what you need to trace through
19:57 imirkin: if you can come up with a recipe for loading the various TEP inputs, then that will solve part of the missing functionality
19:57 imirkin: (the other missing part is in TCP's... let's not worry about those for now)
19:57 imirkin: note that every 4th instruction should be a sched -- if it's not, then that's just the decoder messing up.
19:58 imirkin: although there doesn't seem to be any of that there.
19:58 imirkin: but sometimes it happens
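A minimal sketch of the sanity check described above: scan a disassembly listing and flag places where the one-sched-per-four-instructions pattern breaks. It assumes the sched word is the first entry of each group of four, which the log does not state explicitly, and the mnemonic lists are made up for illustration rather than taken from the paste.

```python
def misplaced_sched(mnemonics, group=4):
    """Return indices where the sched/instruction pattern is violated.

    Assumes (not stated in the log) that the sched word is the first
    entry of every group of four.
    """
    bad = []
    for i, op in enumerate(mnemonics):
        expect_sched = (i % group == 0)
        if (op.lower() == "sched") != expect_sched:
            bad.append(i)
    return bad


if __name__ == "__main__":
    # Hypothetical listing, not the contents of the hastebin paste.
    listing = ["sched", "mov32i", "ffma", "ffma",
               "sched", "ffma", "ffma", "exit"]
    print(misplaced_sched(listing))
```

A non-empty result would point at the spots where the decoder probably misread a scheduling word as an instruction, or the other way round.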
19:58 imirkin: does that make more sense now?
20:02 jayhost`: I think so. I only need to use nvdisasm, I don't have to run anything
20:03 imirkin: you can even just use the paste i made above
20:03 imirkin: just need to work through it and figure out wtf is going on
20:03 imirkin: oh, remember that gl_Position is actually a vec4 -- i.e. it's 4 values in one
20:04 imirkin: so all those multiplies/adds are actually doing 4 values, not 1
20:04 imirkin: but the nvidia isa's are scalar, so it expands out to 4 separate ops
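A minimal sketch of what "expands out to 4 separate ops" means for a vec4 multiply-add on a scalar ISA. The ffma and vec4_mad helpers are hypothetical names, and the Python arithmetic rounds the product before the add, so it is not a true fused operation.

```python
def ffma(a, b, c):
    """Scalar multiply-add a * b + c (plain Python, so not truly fused)."""
    return a * b + c


def vec4_mad(a, b, c):
    """A vec4 multiply-add written as four independent scalar ops."""
    return (
        ffma(a[0], b[0], c[0]),  # x component
        ffma(a[1], b[1], c[1]),  # y component
        ffma(a[2], b[2], c[2]),  # z component
        ffma(a[3], b[3], c[3]),  # w component
    )


if __name__ == "__main__":
    print(vec4_mad((1.0, 2.0, 3.0, 4.0), (2.0, 2.0, 2.0, 2.0), (1.0, 1.0, 1.0, 1.0)))
```

Read this way, a run of FFMAs in the disassembly that differ only in their register numbers is likely one vector expression from the GLSL source.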
20:23 jayhost`: okay I'm reading the FMA wiki page
20:28 jayhost`: I assume when you said values you're talking about the GPR's
20:36 jayhost`: I imagine I'll be able to figure it out. Assuming I'm supposed to start at mov32i after sched
20:36 jayhost`: Maybe I should use pen and paper
21:51 sveta: #kodi refused to help; on debian jessie with a geforce go 7300 i am trying to use nouveau, and when i start kodi it gives a black screen and the whole system freezes, to the point that i can't switch to tty1; if i add the 'nomodeset' param in grub, the screen resolution is ugly but kodi works
21:52 imirkin: jayhost`: i stared at the tcp/tep a bit... i THINK i might understand what's going on
21:52 imirkin: sveta: try removing libvdpau_nouveau.so from your system and see if that improves the situation
21:53 imirkin: sveta: failing that, try to get some logs from the system
22:01 jayhost`: ahh you got it already. Verry Nicce
22:02 imirkin: jayhost`: not quite
22:02 imirkin: i think i see what's going on though
22:02 imirkin: isberd.o == the new ld.o, i.e. reading from tcp outputs
22:03 imirkin: in order to write tcp outputs, you do gmem writes to $directbewriteaddress + something
22:03 imirkin: isberd is used as, effectively, the old al2p was
22:04 imirkin: there's a new al2p which is used to check if an output is out of range?
22:04 imirkin: this will take a lot of hacking
22:04 imirkin: [and testing]
22:04 sveta: imirkin: in tty1 it says 'gpu lockup failed - switching to fbcon' (approximate wording). the file was not there and installing mesa-vdpau-drivers did not help, neither did removing it
22:04 sveta: where to look for logs ?
22:05 imirkin: sveta: anything before the gpu lockup message?
22:05 imirkin: sveta: check dmesg
22:07 imirkin: sveta: another thing to do would be to try disabling GL in kodi
22:07 imirkin: (or uninstalling mesa entirely)
22:07 imirkin: or forcing cpu-based accel, e.g. LIBGL_ALWAYS_SOFTWARE=1
22:09 jayhost`: imirkin where is the code to hack on
22:09 imirkin: jayhost`: mesa? :)
22:10 imirkin: jayhost`: you'll specifically want to look at https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#n2012
22:10 imirkin: this handles a specialized condition for kepler, i suspect a separate condition for maxwell can be added there to satisfy everything
22:11 imirkin: this is how PFETCH is handled: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp#n152
22:11 imirkin: perhaps it just needs a clever VFETCH handler? dunno.
22:12 imirkin: oh, and i guess export too
22:13 sveta_: imirkin: i see nothing at all in dmesg at the time of the issue, but it's difficult, because it has weird timestamps and i don't know which of the messages are at the time. the fbcon message i see in tty1 is not in dmesg.
22:13 imirkin: sveta_: that's very odd =/
22:13 imirkin: sveta_: anyways... 3d on nv4x is kinda sucky. i suspect kodi does something unexpected, which breaks the universe
22:14 imirkin: sveta_: try getting rid of 3d, or just telling kodi not to use it
22:14 imirkin: i gtg... good luck
22:16 jayhost`: imirkin I'm gonna have to study a bit more.
22:16 sveta_: imirkin: in kern.log http://dpaste.com/19WVD03.txt
22:17 imirkin: sveta_: right so that's all happy
22:18 sveta_: i would be glad to disable 3d whatever kodi has, but my vocabulary is bad, and one guy in #kodi wants me to use proprietary stuff without even filing you guys a proper bug report
22:19 imirkin: it's likely the blob driver will work better, if you can find one that will load on your system
22:19 imirkin: to disable 3d, try running kodi with LIBGL_ALWAYS_SOFTWARE=1 in the environment
22:20 imirkin: it's fairly common (albeit unfortunate) for other projects to not support nouveau in any way whatsoever, and frequently to be actively hostile towards it.
22:20 imirkin: my general response is to ignore those projects, like i do with kodi, mpv, etc. mplayer works great :)
22:30 sveta_: http://paste.debian.net/plain/414707 <-- glxinfo
23:10 sveta: thanks, that worked, though with slow, lagging mouse movement
23:10 sveta: will check how video / audio work under this later