01:57 karolherbst: Tom^: wanna test some dynamic reclocking stuff?
01:58 karolherbst: or anybody else?
01:58 karolherbst: should be rather stable now
02:08 karolherbst: mupuf: so I think all PMU related issues are fixed now :) the only thing missing for dynamic reclocking is now the power budget stuff :p
02:09 pmoreau: karolherbst: Detail! :-)
02:09 karolherbst: :D
02:09 karolherbst: pmoreau: though my dyn reclocking stuff is only for gpus with a pmu
02:09 karolherbst: so gt215+
02:09 karolherbst: works for me :D
02:09 pmoreau: Grrrr…
02:10 pmoreau: I'll cancel your room for FOSDEM! :-p
02:10 karolherbst: :D
02:10 karolherbst: pmoreau: well the PMU only triggers the reclocking
02:10 karolherbst: so for pre gt215 cards we only need a different trigger mechanism
02:10 karolherbst: pmoreau: that's the pmu code: https://github.com/karolherbst/nouveau/commit/effcdb2a1ab980bff62139ccd473320eeb85f023#diff-5e5cb4582f6faff078d1cad6144b248a
02:10 pmoreau: I'll be able to test your patches once I set up my own Reator.
02:11 pmoreau: (Which I should really start looking into!)
02:11 karolherbst: pmoreau: what is the pmu thing before gt215 by the way?
02:11 karolherbst: or does the code have to run on the host?
02:11 karolherbst: ohhh wait
02:11 karolherbst: I also use the pmu counters :/
02:11 pmoreau: I have no idea. sorry :-)
02:12 karolherbst: mhhh
02:12 karolherbst: now that I think of it, my code makes little sense for those old gpus anyway
02:12 karolherbst: because cstates weren't a think then
02:12 karolherbst: *thing
02:13 pmoreau: Dunno whether my laptop has any cstates
02:13 karolherbst: don't think so
02:16 RSpliet: cstates aren't even a thing on GT21x
02:16 karolherbst: RSpliet: sure? :/
02:17 karolherbst: because I think I saw some
02:18 RSpliet: find me a counterexample and I'll believe you :-P
02:18 RSpliet: I'm quite sure about that yes
02:19 karolherbst: k, didn't find any
02:29 mupuf: karolherbst: what was the issue with the IRQ then?
02:29 karolherbst: mupuf: 1. https://github.com/karolherbst/nouveau/commit/c62c59910255b8ffaaa7c8945c6156be3af145c6
02:29 karolherbst: mupuf: 2. https://github.com/karolherbst/nouveau/commit/d9f1c7ec2e46c9059c8376e0c3155d998ec0185f
02:30 karolherbst: both pretty much bs like issues :D
02:31 karolherbst: well the second was important when the kernel schedules processes a bit too freely
02:32 karolherbst: but the pmu acking those interrupts was a good one
02:35 mupuf: :o
02:36 mupuf: good catch, for sure!
02:45 karolherbst: any complaints about these 3 commits? https://github.com/karolherbst/envytools/commits/nvbios_unk50
02:45 karolherbst: I want to push them so that I don't have to manage my local installation :D
02:48 mupuf: unk5c is ok
02:48 mupuf: hiding base clock entries is ok too
02:49 mupuf: actually, please give a better name for both the 50 and 5c
02:49 karolherbst: ohh right
02:50 mupuf: and please state what t0 and t1 are for unk50
02:50 mupuf: then you can push
02:50 mupuf: oh, and get rid of the WIP too
02:50 karolherbst: no idea, I just know they are temperatures
02:51 karolherbst: and that t0 is lower than t1
02:52 karolherbst: to be honest, I have no clue what the unk50 table is about, there is some more stuff in that, but besides that, no idea
02:53 karolherbst: the entries are quite big though
02:59 karolherbst: mupuf: UNK TEMPERATURE and FAN MANAGEMENT?
03:00 karolherbst: though management somehow sounds wrong
03:00 mupuf: karolherbst: why divide it by 32 then?
03:01 mupuf: Fan trip points?
03:01 mupuf: fan policy?
03:01 karolherbst: mupuf: because after dividing it by 32 I get useful data: 82.53 and 95.00
03:02 mupuf: I see
03:02 mupuf: keep the unk50 then
03:02 mupuf: and rename the other one to fan management
03:02 karolherbst: k
03:02 mupuf: FAN_MGMT
03:06 karolherbst: https://github.com/karolherbst/envytools/commits/nvbios_unk50
03:09 karolherbst: mupuf: and the unk5c also has those divides by 32, so I guess the numbers are multiplied by 32 for precision, like they also increased precision with the pwm for volting
03:09 karolherbst: mupuf: now I am curious: maybe there is another temperature sensor somewhere with this increased precision, too?
03:10 mupuf: doubt that, there is a .5°C resolution already
03:10 mupuf: and it is really noisy
03:10 mupuf: and you can always read straight from the ADC and use the parameters yourself if you need the most accurate reading
03:11 mupuf: possible
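Editor's sketch of the divide-by-32 reading discussed above. It assumes the table stores temperatures as integers scaled by 32, which is karolherbst's guess and not a confirmed VBIOS format. The 16-bit width and the function names are illustrative assumptions; raw values such as 2641 and 3040 are hypothetical inputs chosen only because they decode to roughly the 82.53 and 95.00 quoted in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a temperature stored in steps of 1/32 degree C.
 * The factor 32 is the guess from the discussion, not a documented format. */
double fixed32_to_celsius(uint16_t raw)
{
    return raw / 32.0;
}

/* Inverse direction, rounding to the nearest 1/32 step. */
uint16_t celsius_to_fixed32(double celsius)
{
    /* the scaled value has to fit in the assumed 16-bit field */
    assert(celsius >= 0.0 && celsius < 2048.0);
    return (uint16_t)(celsius * 32.0 + 0.5);
}
```

With a scale of 32 the resolution is about 0.03 °C per step, which would fit the "multiplied for precision" idea in the log.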
03:12 karolherbst: so any complaints left about the commits?
03:26 dodomorandi: Hello everyone. I am having lots of troubles with a damn nVidia card... Can I ask you some help about MMIO tracing?
03:29 karolherbst: ....
03:29 karolherbst: now I wanted to answer :D
03:30 karolherbst: dodomorandi: what kind of problems do you have?
03:33 dodomorandi: I have an old onboard nvidia, and the proprietary driver seems to have troubles with composition. Nouveau, unfortunately, does not seem to like the card, and the screen flickers a lot. I was trying to trace mmios to provide some info, but i only get a black screen
03:34 karolherbst: mhhh
03:34 karolherbst: what card is that?
03:34 dodomorandi: Geforce 7025/ nForce 630a
03:35 dodomorandi: I have got two identical pc, and both have the same issue
03:37 dodomorandi: The problem is that i do not know if the mmio tracing has been successful or not
03:37 karolherbst: dodomorandi: well you should get a _big_ file
03:38 dodomorandi: If i try to send a nop to the tracer, i got a "busy resource" error
03:38 karolherbst: yeah
03:38 karolherbst: you have to stop the cat command first
03:38 karolherbst: then nop the tracer
03:38 dodomorandi: Oh, ok
03:38 karolherbst: but while tracing you should be able to start X and anything else on it
03:38 karolherbst: otherwise the trace will be useless
03:38 karolherbst: most likely
03:39 dodomorandi: From logs, it seems that nvidia driver is stuck after\during 2d hardware acceleration
03:40 dodomorandi: However, i killed the cat, but the "echo nop" is stuck. Damn
03:41 dodomorandi: Any idea of some parameter i could try to pass to nvidia module to start X when tracing?
03:43 karolherbst: dodomorandi: nope
03:43 karolherbst: but I bet there is something not right related to how nouveau drives the monitors on your gpu
03:43 karolherbst: dodomorandi: try to reduce the resolution and see if the flickering disappears with a lower one
03:44 karolherbst: it isn't like the gpu itself supports a high one anyway
03:44 dodomorandi: I tried, and yes, it stop flickering
03:44 dodomorandi: But i have to stay at a very low resolution
03:44 karolherbst: what is the highest one you can use without flickering?
03:44 dodomorandi: I will check in a second
03:54 dodomorandi: It works at 1440×900, flickers at 1680×1050
03:54 karolherbst: mhhh okay
03:55 dodomorandi: No, wait. It flickers a bit at this reso
03:55 dodomorandi: Much less, but still a bit
03:57 dodomorandi: At 1280×720 it seems to work flawlessly, but it is not 16:10
03:57 karolherbst: dodomorandi: k
03:57 karolherbst: 1440x900 is pretty high already though
03:58 karolherbst: so there is just a small thing left somewhere
03:58 karolherbst: who is the expert with such old gpus by the way? :D
03:58 karolherbst: imirkin: I guess the pixel clock is just not fast enough
03:59 dodomorandi: Mmhh... Do you think i could try to patch and recompile the nouveau and see if it works better?
04:00 karolherbst: dodomorandi: ever tried to reclock that card?
04:00 karolherbst: but I have _no_clue_ if reclocking works at all
04:00 dodomorandi: Nope. How can i do?
04:00 karolherbst: on that gpu
04:02 karolherbst: dodomorandi: boot with nouveau.pstate=1
04:03 dodomorandi: I will give it a try
04:05 dodomorandi: Same issue
04:05 karolherbst: dodomorandi: now you have to reclock first ;)
04:06 karolherbst: dodomorandi: is there a pstate file inside /sys/class/drm/card0/device
04:06 dodomorandi: Yes
04:07 karolherbst: can you cat it?
04:07 dodomorandi: Sure. It says
04:08 dodomorandi: 20: core 425 MHz shader 425 MHz memory 0 MHz
04:08 dodomorandi: And another line with everything set to 0
04:08 karolherbst: mhhhh
04:08 karolherbst: then try to echo 20 > pstate
04:08 karolherbst: you have to do that as root by the way
04:09 dodomorandi: Sure
04:09 dodomorandi: Mh
04:09 karolherbst: but if everything is set to 0, then something is odd already
04:09 dodomorandi: The second line changed
04:09 karolherbst: does it change anything related to the flickering? like now you can use a higher resolution?
04:10 dodomorandi: The core is now set to 29 Mhz, shader and memory still to 0
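Editor's sketch for reading the pstate lines quoted above. It parses only the layout dodomorandi pasted ("20: core 425 MHz shader 425 MHz memory 0 MHz"); treating the leading id as hexadecimal is an assumption, and the struct and function names are hypothetical. Lines in any other layout are simply rejected.

```c
#include <assert.h>
#include <stdio.h>

struct pstate_line {
    unsigned id;          /* pstate id, assumed to be printed as hex */
    unsigned core_mhz;
    unsigned shader_mhz;
    unsigned memory_mhz;
};

/* Parse one line such as
 *   "20: core 425 MHz shader 425 MHz memory 0 MHz"
 * Returns 0 on success, -1 if the line does not match this layout. */
int parse_pstate_line(const char *line, struct pstate_line *out)
{
    assert(line != NULL && out != NULL);

    int n = sscanf(line, "%x: core %u MHz shader %u MHz memory %u MHz",
                   &out->id, &out->core_mhz, &out->shader_mhz,
                   &out->memory_mhz);
    return n == 4 ? 0 : -1;
}
```

A caller would read /sys/class/drm/card0/device/pstate line by line (as root, per the log) and feed each line to this function.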
04:10 dodomorandi: The flickering is still there
04:11 karolherbst: yeah sadly I don't know much about those gpus
04:13 dodomorandi: Hey, you are trying to help me, NVidia doesn't care... I am appreciating that! :D
04:13 karolherbst: :D
04:13 karolherbst: the solution will be trivial though
04:14 RSpliet: dodomorandi: I think that GPU doesn't have dedicated memory, hence there's no way to increase memory bandwidth
04:15 RSpliet: is that a single monitor set-up?
04:15 dodomorandi: Yes
04:16 karolherbst: RSpliet: I highly assume that the pixel clock is just not set fast enough or anything stupid like that
04:16 karolherbst: becaue 1440x900 works
04:16 karolherbst: *because
04:16 RSpliet: hmm... I wonder how NVIDIA gets the required bandwidth then; it would be good if we understood the line buffer better
04:17 dodomorandi: Do you have any idea about the driver crazyness when i try to trace mmios?
04:17 karolherbst: RSpliet: is get_tmds_link_bandwidth the right function for this?
04:17 RSpliet: dodomorandi: mmiotrace is just horrendously slow
04:17 karolherbst: but I would assume that for his gpu it should already return 165000
04:17 RSpliet: karolherbst: well, that should be fine
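Editor's sketch of the arithmetic behind "that should be fine". A mode's pixel clock is total pixels per line times total lines per frame times the refresh rate. The 165000 is the value from the log, taken here to be kHz; the function names are hypothetical, and any timing totals passed in (for example 2240 x 1089 at 60 Hz, which is how a typical CVT timing for 1680x1050 is usually listed) are assumptions, since the real numbers come from the monitor's EDID. Whether this output is TMDS at all is also not stated in the log.

```c
#include <assert.h>
#include <stdint.h>

#define TMDS_SINGLE_LINK_KHZ 165000u   /* value quoted in the log, assumed kHz */

/* Pixel clock in kHz from the total (active + blanking) pixels per line,
 * total lines per frame and refresh rate in Hz. */
uint32_t pixel_clock_khz(uint32_t htotal, uint32_t vtotal, uint32_t refresh_hz)
{
    assert(htotal != 0 && vtotal != 0 && refresh_hz != 0);

    uint64_t hz = (uint64_t)htotal * vtotal * refresh_hz;
    return (uint32_t)(hz / 1000);
}

/* Non-zero if the clock is within the assumed single-link limit. */
int fits_single_link(uint32_t clock_khz)
{
    return clock_khz <= TMDS_SINGLE_LINK_KHZ;
}
```

With the assumed totals the result is about 146 MHz, below the limit, which is consistent with RSpliet looking at memory bandwidth and the line buffer instead.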
04:18 RSpliet: no the line buffer is an unknown component, I recall NVIDIA calling that thing the NISO poller
04:18 pmoreau: That rings a tiny bell
04:19 RSpliet: I assume that when configured correctly, it can increase bandwidth by doing bigger transfers from memory to the scanout logic
04:19 karolherbst: ahh so back to traces then
04:19 pmoreau: https://github.com/envytools/envytools/commit/f20d89ddbb565adb5fcdd54e27a09c27059a6d7a
04:19 RSpliet: results in bigger bursts, in turn higher DRAM utilisation efficiency
04:20 karolherbst: ahhh
04:20 karolherbst: dodomorandi: you may want to install envytools then
04:20 karolherbst: we could play around with that stuff and maybe we get it working somehow
04:20 dodomorandi: Yup
04:20 RSpliet: pmoreau: hmm, yes, that might be applicable to the other embedded GPUs too
04:20 pmoreau: But those regs are only valid on MCP77 and MCP79
04:21 karolherbst: pmoreau: how sure?
04:21 RSpliet: pmoreau: did you test them on an NV68? :-P
04:21 RSpliet: (NV63, nV67)
04:21 pmoreau: Maybe the NVIDIA dev didn't tell the whole truth, but he only talked of them being valid on MCP77 and MCP79
04:21 pmoreau: Let me find the email back
04:22 RSpliet: oh, well, we're talking 10+yo GPUs, I wouldn't remember the details of what I designed 10 years ago :-D
04:22 karolherbst: :D
04:22 dodomorandi: :D
04:22 karolherbst: we will know in a few seconds anyway
04:23 RSpliet: pmoreau: I also believe that this is not the whole story of the line buffer
04:23 karolherbst: dodomorandi: after you installed envytools as root: nvapeek 0x100c00 0x80
04:23 pmoreau: http://lists.freedesktop.org/archives/nouveau/2014-November/019266.html
04:24 RSpliet: there's some configurable registers in FB that get touched when changing resolutions
04:24 pmoreau: and http://lists.freedesktop.org/archives/nouveau/2014-December/019319.html
04:24 karolherbst: well
04:24 karolherbst: he doesn't say they don't exist on older gpus
04:24 karolherbst: he just states why they exist on those mcp gpus
04:25 pmoreau: Right
04:25 pmoreau: It's been some time since I read those emails
04:25 RSpliet: recalling stuff from a year ago is tricky, let alone 10+ years :-P
04:25 pmoreau: :p
04:26 dodomorandi: Does it matter the resolution when I run nvapeek?
04:26 karolherbst: no
04:26 dodomorandi: (Compiling)
04:27 karolherbst: dodomorandi: I guess a "nvapeek 0x100c14 0x14" would be enough though
04:27 karolherbst: maybe do that even while you run the blob at your highest resolution
04:27 karolherbst: then we will already know what we have to poke into that
04:28 RSpliet: resolution doesn't matter for those regs
04:28 dodomorandi: Nvapeek 0x100c00 0x80 returned "..."
04:28 pmoreau: IIRC, those regs were not present on MCP89 (I think I checked on one of those cards), but it doesn't imply they don't exist on previous cards.
04:28 karolherbst: dodomorandi: try that with the nvidia driver
04:28 RSpliet: dodomorandi: nvascan 0x100c18
04:29 karolherbst: ohh right
04:29 karolherbst: nvascan :)
04:31 RSpliet: you can do that with nouveau
04:31 dodomorandi: Oh, ok
04:32 dodomorandi: Is it normal that it did not return anything? Always "..."
04:33 RSpliet: that means the registers do not exist
04:33 karolherbst: well shouldn't nvascan not return ...?
04:33 karolherbst: or ohh wait
04:33 karolherbst: right
04:33 karolherbst: my mistake
04:33 karolherbst: no, really
04:33 karolherbst: it shouldn't
04:34 karolherbst: or it should? :/
04:34 karolherbst: meh
04:34 karolherbst: I should sleep more
04:34 karolherbst: okay, other plan then
04:36 pmoreau: Maybe the regs are placed somewhere else
04:36 RSpliet: tracing time
04:36 pmoreau: Yep
04:36 pmoreau: I can have a look at the trace tonight
04:38 dodomorandi: Anything i can try to help you?
04:38 karolherbst: dodomorandi: do a proper mmiotrace ;)
04:40 dodomorandi: Heh, the problem is that the nvidia driver does not seem to want to help me... The trace i did is 1.9k, i do not think that is sufficient, is it?
04:41 karolherbst: dodomorandi: did you enable the mmiotracer before loading the nvidia module?
04:43 dodomorandi: Sending mmiotrace to current_tracer? Sure
04:43 dodomorandi: To be sure, i tried to blacklist nvidia module at boot, enabling mmiotrace, loading nvidia and then xinit with a sleep 10
04:44 dodomorandi: I also passed maxcpus=1 to kernel
04:44 karolherbst: ahh k
04:44 karolherbst: no, it is normal that the display just stays black, cause you don't start any applications by default, do you?
04:46 dodomorandi: Nope :( it should start xterm if i do not pass anything, and with "sleep 10" it should just terminate after some time. But it is stuck
04:46 karolherbst: ohh k
04:47 dodomorandi: However, i do not have idea if the trace contain useful information or not
04:48 dodomorandi: I imagine that 1.9k is not so much data for a mmio trace...
04:48 karolherbst: no
04:48 karolherbst: it should be more like 30MB
04:48 karolherbst: dodomorandi: what distribution are you using?
04:48 dodomorandi: Arch
04:49 karolherbst: which login manager?
04:49 karolherbst: because you could also just do systemctl start your_login_manager_here
04:49 dodomorandi: Normally gdm, but i tried to use just xinit for the trace
04:49 karolherbst: well then systemctl start gdm
04:50 dodomorandi: I'll try
04:50 karolherbst: and check the last lines of dmesg after loading nvidia
04:50 karolherbst: something weird could have happened
04:51 pmoreau: dodomorandi: When you blacklisted nvidia, did you boot into multi-user or graphical mode?
04:52 pmoreau: And when launching X with modprobing, it can take some time. I would say ~30sec-1min on my laptop IIRC.
04:54 dodomorandi: Multiuser console mode
04:54 dodomorandi: I waited more than 5 minutes
04:57 dodomorandi: Argh, sorry guys, but it seems that the update to the kernel i just made made a mess with nvidia... Downgrading... O.o
05:04 dodomorandi: -.- kernel dump
05:06 dodomorandi: Mmmhh... I will try again with only one core active
05:09 dodomorandi: Ok, now the nvidia module crashes
05:11 karolherbst: I bet the same issue I have
05:11 pmoreau: Starting mmiotrace should automatically pause other cores
05:11 dodomorandi: Unfortunately, i have to go. Guys, thank you so much! I will come back in a few days, maybe when the nvidia makes its 304xx driver compatible with linux 4.3... Maybe i will be more lucky...
05:11 karolherbst: dodomorandi: issue like that? https://gist.github.com/karolherbst/f69e2a7b9c372e049525
05:12 karolherbst: "mmiotrace: unexpected secondary hit for"
05:13 dodomorandi: The backtrace is a bit different
05:13 karolherbst: but you got the "mmiotrace: unexpected secondary hit for" message?
05:13 pmoreau: I would have expected 304xx to be compatible with 4.3, 340xx is at least.
05:14 dodomorandi: Yes, there is a unexpected secondary hit
05:15 dodomorandi: However, i am having a mtrr related issue with the new kernel while loading the module
05:15 dodomorandi: Now i dont remember what it said, in detail
05:17 dodomorandi: Ah, if it can be useful, after the mmiotracer error, there is a unable to handle kernel paging request
05:17 karolherbst: yeah, it is the same issue I also get
05:17 karolherbst: no idea why though
05:19 dodomorandi: Now i have to run :( i will keep in touch, trying to run a trace on this damn proprietary bloatware
05:19 dodomorandi: Thank you again, guys :)
05:23 pmoreau: Probably the same issue as what I experienced last time I tried to trace the blob. :-/
05:41 karolherbst: pmoreau: I think the issue is that there is just a paging request while handling those mmio stuff
06:12 tacchinotacchi: is there a reason the code has so few comments? asd at least the last time i've seen it
06:14 RSpliet: tacchinotacchi: I think developers will agree that it's always difficult to understand the level of documentation required for their code. Function and variable names should be self-explanatory, but generally the writer of those has a better understanding of the terminology than the novice reader
06:15 RSpliet: luckily, we're a very approachable bunch (albeit slightly grumpy, it's the dark months of the year in the northern hemisphere)
06:15 tacchinotacchi: well, there was a function yesterday i have no idea what it is
06:15 tacchinotacchi: trying to find
06:16 tacchinotacchi: what do you mean approachable?
06:16 RSpliet: always in for pointing you in the right direction if documentation does exist, or to answer questions if it doesn't
06:16 tacchinotacchi: ram_wr32
06:16 tacchinotacchi: here it is
06:17 tacchinotacchi: i think it's got something to do with writing 32 bits to something, but to what?
06:17 RSpliet: to the "ram" subsystem. Most of what you see will use nvkm_wr32, which is mapped to a 32-bit write to the MMIO area
06:18 RSpliet: ram_wr32 instead adds a write operation to a script, and this script gets uploaded later for execution by a dedicated "falcon" core
06:18 tacchinotacchi: i sense you're trying to help but i'm helpless
06:19 tacchinotacchi: is this falcon core in every architecture?
06:19 RSpliet: this falcon core is part of modern NVIDIA GPUs (from GT21x on)
06:20 tacchinotacchi: so it is the core that manages clocks
06:20 tacchinotacchi: through scripts sent by the driver
06:20 RSpliet: takes care of performance-related tasks, so in our case monitoring performance counters, executing the scripts I mentioned earlier used for changing memory clocks, temperature monitoring
06:21 RSpliet: that's it, I think we refer to this core as the "PMU"
06:21 tacchinotacchi: yesterday they sent me this http://envytools.readthedocs.org/en/latest/nvrm/pmu/index.html
06:21 RSpliet: (there's other falcon cores available too)
06:21 tacchinotacchi: i wonder how did whoever wrote this figure out in the first place
06:22 tacchinotacchi: like how did he figure out the bits sent to the card were opcode
06:22 RSpliet: what you're looking at is the script language that NVIDIA uses
06:22 RSpliet: (not the same as implemented in nouveau)
06:22 tacchinotacchi: what? nvidia actually shared this info?
06:23 RSpliet: no
06:23 RSpliet: we started by identifying the register/data pairs that were obvious
06:23 RSpliet: then working our way up to identify opcodes (luckily everything is 32-bit aligned, so pretty obvious from trace)
06:24 RSpliet: followed by a bit of binary disassembly of the firmware implementing this scripting language to understand a lot of the missing opcodes
06:26 tacchinotacchi: can you give me an example of "obvious" opcode?
06:28 RSpliet: the format is simple, one 32-bit word containing the opcode and the number of parameters, followed by each parameter as 32-bit words. We *know* such a script should write eg. the MR value in 0x1002c0, so identifying a data-value pair for that is simple
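Editor's sketch of the format RSpliet describes: a header word carrying an opcode and a parameter count, followed by that many 32-bit parameter words. The log does not say how the header bits are split, so the layout below (opcode in the low 16 bits, count in the high 16 bits) and the OP_WR32 opcode value are hypothetical, as is the function name. The idea is the one from the log: knowing that a script must write a value to 0x1002c0, look for that register/data pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout; the real split is not stated in the log. */
#define HDR_OPCODE(h) ((h) & 0xffffu)
#define HDR_NPARAM(h) ((h) >> 16)
#define OP_WR32 0x0001u   /* hypothetical: parameters are (reg, value) pairs */

/* Scan a script for writes to `reg` and store the last value written in *val.
 * Returns 1 if a write was found, 0 if not, -1 if the script is truncated. */
int script_find_write(const uint32_t *s, size_t nwords,
                      uint32_t reg, uint32_t *val)
{
    assert(s != NULL && val != NULL);

    size_t i = 0;
    int found = 0;

    while (i < nwords) {
        uint32_t hdr = s[i++];
        size_t n = HDR_NPARAM(hdr);

        if (n > nwords - i)
            return -1;              /* header promises more words than exist */

        if (HDR_OPCODE(hdr) == OP_WR32) {
            for (size_t p = 0; p + 1 < n; p += 2) {
                if (s[i + p] == reg) {
                    *val = s[i + p + 1];
                    found = 1;
                }
            }
        }
        i += n;                     /* skip to the next header word */
    }
    return found;
}
```

Because every element is a whole 32-bit word, a walker like this stays aligned even across opcodes it does not understand, which is the property that made the traces readable.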
06:29 pmoreau: IIRC, Nouveau didn't start from scratch, but from the (now dead) open-source X driver that NVIDIA had some time ago. Though it only supported 2d accel and no card newer than Fermi?
06:29 RSpliet: reverse engineering - like debugging - is a skill that either takes a lot of time to develop or a brain the size of a planet. Don't feel bad for not getting in two days what took years and years of effort
06:30 RSpliet: pmoreau: I think it was even more limited, but even understanding and deobfuscating that code took forever ;-)
06:30 tacchinotacchi: i feel like i'm producing inconvenience
06:31 tacchinotacchi: what's MR? lel
06:31 pmoreau: RSpliet: Right, but that gives a (tiny) starting point
06:31 tacchinotacchi: i'd rather have some written documentation instead of wasting your time
06:31 RSpliet: tacchinotacchi: MR is documented in the RAM specifications and in envytools
06:32 RSpliet: RAM docs are a better source of information, as replicating documentation would've been a waste of time
06:32 karolherbst: tacchinotacchi: personally I think most of the nouveau code is quite well understandable, except the memory code :D
06:32 tacchinotacchi: i have envytools docs here
06:32 RSpliet: karolherbst: that's mostly because... nobody really understands what it does, apart from mimicking the blob :-P
06:32 tacchinotacchi: can i find RAM docs in the nouveau homepage?
06:32 karolherbst: mhh never looked into them except for the falcon stuff
06:33 tacchinotacchi: it seems like nobody ever "looked at them"
06:33 RSpliet: tacchinotacchi: no, but google some DDR3 spec sheets for instance
06:33 tacchinotacchi: aw so RAM you're talking about is random access memory
06:33 tacchinotacchi: i thought it was some nvidia specific stuff
06:33 tacchinotacchi: sry
06:34 tacchinotacchi: anyway all people i ever heard from here said "i never looked at reclocking code"
06:34 karolherbst: you know what, I will make a trace of my fermi card while reclocking :)
06:35 tacchinotacchi: i made some traces with overclocking on my fermi laptop
06:35 tacchinotacchi: the only thing i could do with it was send to you though
06:36 RSpliet: tacchinotacchi: I wrote most of the RAM reclocking code for pre-kepler, so can't claim that
06:37 RSpliet: and that stuff is ill-documented, because a lot of the docs really don't exist. It's just mimicking the behaviour of the blob as well as possible
06:38 tacchinotacchi: and does it work?
06:38 RSpliet: for various definitions of work
06:38 karolherbst: ....
06:38 RSpliet: I have a pile of GPUs in the range of G94-GT218 that can successfully change their clock speed using this, yes
06:39 karolherbst: I really don't get this stupid stuff :/ like one or two months ago I had no problem creating mmiotraces
06:39 karolherbst: but now...
06:39 RSpliet: The kepler code has recently had an important fix from karolherbst, so most of those cards work now
06:39 RSpliet: only Fermi hasn't had a lot of love yet
06:39 RSpliet: (love takes time, unfortunately)
06:39 karolherbst: RSpliet: do we want to rush this and have it ready in two weeks? :D
06:40 RSpliet: karolherbst: I don't have time to rush it and finish in two weeks
06:40 karolherbst: mhhh well I think I do
06:40 karolherbst: may be the last month I get that much time
06:40 RSpliet: well, you know where to find my repository, have fun :-P
06:42 tacchinotacchi: karolherbst: talking about fermi?
06:42 tacchinotacchi: ew can't find any docs explaining MR value
06:42 tacchinotacchi: can't find any docs on ddr3 actually
06:44 RSpliet: https://www.google.co.uk/search?q=DDR3+device+operation+timing first hit
06:44 RSpliet: and probably second hit as well
06:45 tacchinotacchi: my best guess in the search box was "ddr3 specs sheet"
06:45 tacchinotacchi: might be also that i use duckduckgo instead of google
06:47 RSpliet: anyway, there's the documentation for nvkm/subdev/fb/sddr3.c , there's similar specifications for the other RAM types
06:50 karolherbst: RSpliet: the memory stuff will be in the pmu script or is there something else?
06:50 karolherbst: and the PBFB stuff I assume
06:52 tacchinotacchi: am i supposed to know how electricity works?
06:53 tacchinotacchi: because i don't
06:57 RSpliet: karolherbst: "memory stuff" is a bit of a fuzzy term, could you clarify?
06:57 karolherbst: well memory reclocking regs
06:58 karolherbst: I thought it is inside the pmu scripts + those *FB areas
06:58 RSpliet: yes, a memory reclock is fully contained within a PMU script
06:58 karolherbst: ahh k
06:58 RSpliet: so clock, timing, MR and all the other unknown FB registers
06:58 karolherbst: okay
06:58 karolherbst: so if I have the script I have everything I need?
06:58 karolherbst: well "need"
06:58 RSpliet: pretty much, NVIDIA does a few writes just outside the script
06:59 RSpliet: we merged them in
06:59 karolherbst: but I could take the script, execute it on my gpu and it should just work
06:59 RSpliet: almost, for that reason
06:59 RSpliet: but that script only works if the NVIDIA firmware is loaded
07:00 karolherbst: yeah I know
07:00 karolherbst: I just meant the reg writing parts of it
07:01 karolherbst: but the script seems to be smaller than the one I saw on my kepler
07:01 RSpliet: the scripts are generated "on the fly", so registers that have already been configured correctly are untouched
07:02 karolherbst: ohh okay
07:02 tacchinotacchi: now i can't even understand c
07:02 karolherbst: RSpliet: what are those reg_last and val_last thingies?
07:02 karolherbst: tacchinotacchi: time to learn it then :p
07:02 karolherbst: no programming language is hard to learn anyway
07:02 tacchinotacchi: static const struct ramxlat ramddr3_wr[] = {...
07:02 RSpliet: karolherbst: consider them registers. It keeps track of the last touched reg/value for masking purposes etc.
07:02 tacchinotacchi: what's with ramxlat
07:03 karolherbst: tacchinotacchi: name of the type
07:03 RSpliet: tacchinotacchi: get a decent IDE that indexes those symbols, you'll click right through to its definition
07:03 karolherbst: :D
07:03 RSpliet: or do as I do and use a non-decent IDE called eclipse
07:03 karolherbst: :O
07:03 tacchinotacchi: this is strange
07:03 karolherbst: qt-creator is quite nice though
07:04 tacchinotacchi: oh i'm so stupid i was reading it as c++
07:04 RSpliet: anyway, too much distraction for me right now, bbl
07:04 tacchinotacchi: i thought it was the definition of the type, not of an actual variable
07:04 tacchinotacchi: bb
07:05 tacchinotacchi: thanks for the hussle
07:05 tacchinotacchi: hassle*
07:10 karolherbst: hehe, the seq script is missing stuff :/
07:20 tacchinotacchi: qt creator literally did the same reading mistake i did
07:20 karolherbst: somebody broke the SEQ parser :/
07:20 tacchinotacchi: the guy who wrote this gave the same name to struct ramxlat and function int ramxlat
07:23 tacchinotacchi: apparently it was RSpliet :D
07:24 karolherbst: yes, he was :D
07:24 karolherbst: RSpliet: this broke the SEQ parser ...
07:25 karolherbst: RSpliet: https://gist.github.com/karolherbst/2ca68aae24dc381c9986
07:30 tacchinotacchi: is NV_MEM_TYPE_STOLEN shared plain RAM?
07:31 karolherbst: I would assume this
07:34 tacchinotacchi: quite surprised this syntax for array initialization is accepted
07:34 tacchinotacchi: static const char *name[] = {
07:35 tacchinotacchi: [NV_MEM_TYPE_UNKNOWN] = "unknown",
07:35 karolherbst: why shouldn't it?
07:35 karolherbst: wtf is wrong with printf...
07:36 tacchinotacchi: i knew *name[] = {"foo", "tie"};
07:36 tacchinotacchi: with foo being index 0 and tie 1
07:36 tacchinotacchi: i didn't know you could specify indexes like that
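Editor's sketch of the initializer syntax that surprised tacchinotacchi. Designated initializers for arrays ("[index] = value") are standard C since C99; standard C++ does not accept this array form, which may explain why a reader or tool treating the file as C++ stumbles over it. The enum and names below are an illustrative stand-in, not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

enum mem_type {              /* illustrative stand-in for the driver's enum */
    MEM_TYPE_UNKNOWN = 0,
    MEM_TYPE_STOLEN,
    MEM_TYPE_DDR3,
    MEM_TYPE_GDDR3,
    MEM_TYPE_COUNT
};

/* Each entry names the index it belongs to, so the order in the source
 * does not have to follow the enum and entries can be left out. */
static const char *const mem_type_name[] = {
    [MEM_TYPE_UNKNOWN] = "unknown",
    [MEM_TYPE_GDDR3]   = "GDDR3",
    [MEM_TYPE_DDR3]    = "DDR3",
    /* MEM_TYPE_STOLEN is omitted on purpose: its slot is NULL */
};

/* The array is sized by its highest designator; check it spans the enum. */
static_assert(sizeof mem_type_name / sizeof mem_type_name[0] ==
              (size_t)MEM_TYPE_COUNT, "name table does not cover the enum");

const char *mem_type_to_string(enum mem_type t)
{
    if ((unsigned)t >= MEM_TYPE_COUNT || mem_type_name[t] == NULL)
        return "?";
    return mem_type_name[t];
}
```

The benefit over a positional list like {"foo", "tie"} is that inserting a new enum value cannot silently shift every string by one.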
07:45 karolherbst: .... nooooo
07:46 karolherbst: I guess nobody checks printf return codes right?
07:48 tacchinotacchi: i would check the specification for printf
07:50 tacchinotacchi: i wonder what kind of error it could give
07:50 tacchinotacchi: that said, i don't think anyone would ever check for it, but you should ask someone else
07:50 karolherbst: ohh it returns the string length :/
07:51 tacchinotacchi: lel
07:51 Tom^: karolherbst: yes.
07:51 tacchinotacchi: negative numbers for errors
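Editor's sketch of checking the printf family's return value, as discussed above: on success it is the number of characters written, and a negative value signals an output error. The helper name and the output format are hypothetical and unrelated to the actual seq_out code in envytools.

```c
#include <assert.h>
#include <stdio.h>

/* Print one register write to `out`.
 * Returns the number of characters written, or -1 on an output error. */
int print_reg_write(FILE *out, unsigned reg, unsigned val)
{
    assert(out != NULL);

    int n = fprintf(out, "wr32 0x%06x <- 0x%08x\n", reg, val);
    if (n < 0)
        return -1;      /* fprintf reports failures as a negative value */
    return n;
}
```

A positive return value only says the text was handed to the stream; with buffered output it can still be lost or reordered later, so a matching length does not prove the line reached the terminal or file.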
07:51 karolherbst: Tom^: you need some commits for that
07:51 Tom^: isnt my branch still around? apply it there :P
07:52 karolherbst: Tom^: nah, it doesn't make sense anymore to have it around :p
07:52 Tom^: ;_;
07:52 karolherbst: Tom^: are you on 4.4?
07:52 Tom^: nah
07:53 karolherbst: Tom^: fetch and cherry-pick those commits: c62c59910255b8ffaaa7c8945c6156be3af145c6, d9f1c7ec2e46c9059c8376e0c3155d998ec0185f and effcdb2a1ab980bff62139ccd473320eeb85f023
07:53 Tom^: ok
07:54 karolherbst: now I think not the seq parser is broken, but printf...
07:54 Tom^: i guess volt isnt fixed yet
07:58 karolherbst: you still need your branch
07:58 karolherbst: just apply those commits on top of it
07:58 tacchinotacchi: m sorry
07:59 tacchinotacchi: is there a difference between ddr3 and gddr3?
07:59 karolherbst: yes
08:00 tacchinotacchi: 'bout the initialization syntax i was surprised by before
08:00 tacchinotacchi: guides don't seem to know about it
08:00 tacchinotacchi: neither does the IDE
08:13 tacchinotacchi: bb
08:17 RSpliet: karolherbst, tacchinotacchi: nope, the ramxlat name comes from one of the other developers. Some wise words of wisdom though:
08:17 RSpliet: 1) don't blame people personally prematurely, it might piss them off
08:17 RSpliet: 2) don't blame people personally for issues other than maybe regressions, there might be a very good reason for what is written (which includes legacy and new insights)
08:17 RSpliet: 3) don't assume that things are a big issue just because your IDE doesn't like it.
08:17 RSpliet: 4) if you don't like code-style, be proactive and propose patches. Looking backwards needlessly diminishes people's morale
08:18 karolherbst: RSpliet: sorry
08:20 karolherbst: RSpliet: I just did this: https://gist.github.com/karolherbst/1dc30613bcfb8834bad2 and the message doesn't get printed and a bunch of others too
08:20 RSpliet: It's all right, just trying to maintain a healthy relationship between people in #nouveau. CompSci's are people too ;-)
08:20 karolherbst: any ideas though?
08:20 karolherbst: yeah I know
08:20 karolherbst: I also know that I am sometimes a bit too fast with my conclusions (and with sometimes I really mean like 95% of all cases)
08:21 karolherbst: still, when I let a "" write into seq_out, it works
08:21 karolherbst: so it's not your fault, but since your commit not all regs are printed ;)
08:21 karolherbst: I bet there is some stupid pointer stuff going on or something ugly
08:21 RSpliet: which commit broke things?
08:22 karolherbst: https://github.com/karolherbst/envytools/commit/ddcd223f74b58418dc2d4ac3eeac906b29d0633b
08:22 karolherbst: I have no clue why though
08:23 karolherbst: I think I will bisect it though
08:23 RSpliet: can you double-check with demmio -c, to strip out the colours
08:23 RSpliet: less sometimes does silly things when escape chars are parsed
08:23 karolherbst: yeah I already did
08:23 karolherbst: the colors aren't at fault here
08:23 karolherbst: and I don't use less
08:23 karolherbst: I just pipe it out
08:23 RSpliet: ok fair enough
08:24 karolherbst: printf returns the right length though
08:24 karolherbst: mhhh
08:24 karolherbst: maybe I shall pollute the line
08:24 RSpliet: I'm a bit busy right now, I hope to have some time later tonight to go fishing
08:24 karolherbst: k
08:24 karolherbst: I try to take care of the issue
08:25 RSpliet: btw, the https://gist.github.com/karolherbst/2ca68aae24dc381c9986 has pound-symbols behind every line, those were introduced in the patch. Instead of old, it seems like you just removed the last parameter?
08:26 karolherbst: yeah
08:26 karolherbst: I just removed the name thing and added ""
08:26 RSpliet: either way, it could fail if rnndb doesn't have a name for the register
08:26 karolherbst: but why?
08:26 karolherbst: I already checked gdb
08:26 karolherbst: the string contains the stuff in my patch
08:26 karolherbst: and the printf also doesn't print anything
08:26 RSpliet: I believe I tested it to verify it returns an empty string, so doesn't make much sense to me
08:26 RSpliet: printf is routed to a different channel I guess
08:27 karolherbst: seq_out calls printf
08:28 RSpliet: I guessed wrong :-D
08:28 karolherbst: this just sounds like a stupid glibc bug
08:28 karolherbst: or another stupid bug
08:28 karolherbst: something stupid anyway
08:28 RSpliet: route your prints to stderr for debugging purposes
08:28 karolherbst: good idea
08:28 RSpliet: they might not get interleaved nicely, but it's a start
09:08 karolherbst: Tom^: did you already test that stuff? :p
09:08 Tom^: nope :p
09:08 Tom^: sauna is hot in uh 15 min, then im afk for ~40min. maybe after that
09:09 karolherbst: :D
09:59 karolherbst: RSpliet: what should info->mspll be?
10:00 karolherbst: gf100_clk_info.mspll
10:00 karolherbst: because I get some compile errors with your changes
10:15 Tom^: karolherbst: so i had some deep thinking done in the sauna, should i remove the nvaboost option im setting , will these patches actually "boost" or just jump between pstates.
10:15 Tom^: karolherbst: and if its boosting will it drop just like blob around 80C ?
10:18 Tom^: karolherbst: and if i get 4.4 will these patches apply on your 4.4 branch. :p
10:23 karolherbst: Tom^: nvboost just decides which max clock can be used
10:26 Tom^: right so these patches will simply jump between the pstates then depending on load.
10:27 karolherbst: and cstates, yes
10:28 Tom^: so yea, these patches will apply on your 4.4 branch?
10:28 Tom^: master_4.4 that is
10:29 Tom^: oh nvm, that one dosent have all the goodies from stable_reclock_kepler yet it seems.
10:30 karolherbst: right
10:31 Tom^: karolherbst: d9f1c7ec2e46c9059c8376e0c3155d998ec0185f didnt apply so yea :P
10:32 karolherbst: right
10:32 karolherbst: then something is missing
10:32 Tom^: or im not doing this right
10:32 karolherbst: pick 6302fc929e62aef4f997db153231df69c1b04c1a before
10:33 karolherbst: which is also kind of important for that to work :D
10:37 karolherbst: Tom^: does it work now?
10:39 Tom^: nope
10:40 Tom^: https://github.com/gulafaran/nouveau/tree/tom my branch is back ! =D but yea. they all report some conflict on it
10:40 karolherbst: :D
10:41 karolherbst: meh, then wait
10:42 karolherbst: mhhh ohhh
10:42 karolherbst: wait
10:42 karolherbst: are you still on 4.3?
10:42 Tom^: well yes, didnt you ask this like an hour ago =D
10:42 Tom^: i can get 4.4 too if that makes things easier
10:42 karolherbst: mhh because your tom branch seems odd :/
10:43 Tom^: my tom branch is from your stable_reclock_kepler
10:43 Tom^: so dont blame me
10:43 karolherbst: ohhh
10:44 karolherbst: then you have a 4.4 based branch now :p
10:44 Tom^: im fetching mainline then, :P
10:45 karolherbst: isn't there a 4.4 package already?
10:45 karolherbst: because making it 4.3 compatible is piece of cake
10:45 Tom^: not in arch
10:45 Tom^: il just compile it anyways
10:45 Tom^: takes at most 15 minutes
10:47 karolherbst: because I am done now :p
10:47 karolherbst: and in 10 seconds it would be also 4.3 compatible
10:48 Tom^: mk :p
10:49 tacchinotacchi: makepkg -s :v
10:49 tacchinotacchi: i think 25 minutes for me
10:49 karolherbst: tacchinotacchi: done
10:49 karolherbst: ..
10:49 karolherbst: Tom^: done
10:49 karolherbst: :D
10:49 tacchinotacchi: lel
10:49 tacchinotacchi: i'm thinking i should change distro
10:50 Tom^: get arch
10:50 karolherbst: well I need like 5 minutes to compile my kernel
10:50 tacchinotacchi: i have arch
10:50 Tom^: karolherbst: fine il refork you then :P
10:50 Tom^: because learning rebasing and what not is not for today.
10:50 tacchinotacchi: that's why it takes me 25 mins to compile kernel
10:50 karolherbst: do you actually configure the kernel on arch?
10:50 tacchinotacchi: no
10:50 Tom^: uhm why would it go faster on any other distro :/
10:50 Tom^: i do
10:50 karolherbst: but my kernel is pretty small
10:50 tacchinotacchi: i use default set with ck
10:51 tacchinotacchi: do you actually go through disabling ALL modules everytime?
10:51 karolherbst: ?
10:51 karolherbst: no?
10:51 karolherbst: you can just take your old config?
10:52 karolherbst: but my kernel is like 7.5MB lz4 compresses big
10:52 tacchinotacchi: ok then i just don't feel like configuring the kernel
10:52 tacchinotacchi: :D
10:52 karolherbst: but I also have most of the stuff builtin
10:52 karolherbst: and nearly no modules
10:52 tacchinotacchi: well mine is 4,1
10:53 tacchinotacchi: do you mean just vmlinuz?
10:53 karolherbst: tacchinotacchi: I guess everything is a module?
10:53 Tom^: karolherbst: did you make it 4.3 or is it 4.4?
10:53 karolherbst: Tom^: your tom branch is 4.3
10:53 Tom^: im aborting this kernel compilation then ;)
10:53 tacchinotacchi: yes probably all drivers are modules
10:53 karolherbst: tacchinotacchi: well vmlinux is 21MB here, but I disabled Os
10:53 karolherbst: and practically everything is built in and no module
10:54 tacchinotacchi: disabled Os?
10:54 karolherbst: well you can optimize for size
10:54 karolherbst: but who wants that actually
10:54 karolherbst: my entire modules directory uses like 16 MB
10:54 Tom^: bootup speeds.
10:54 imirkin: karolherbst: people with cpu's that don't have infinite caches
10:54 Tom^: =D
10:54 tacchinotacchi: pff boot speed
10:54 karolherbst: imirkin: :O
10:55 karolherbst: imirkin: but then you also don't compile your own kernel
10:55 tacchinotacchi: how much time does it take even a slow hdd to load a kernel image
10:55 imirkin: karolherbst: huh?
10:55 karolherbst: tacchinotacchi: <5 seconds
10:55 tacchinotacchi: Plasma instead is a killer, takes 40 seconds at least from my boot
10:55 Tom^: "Startup finished in 8.586s (firmware) + 988ms (loader) + 1.551s (kernel) + 1.725s (userspace) = 12.852s"
10:55 karolherbst: :D
10:55 tacchinotacchi: the kernel image is not even something fragmented
10:56 tacchinotacchi: it's contiguous
10:56 tacchinotacchi: reading is super-fast
10:56 karolherbst: crappy HDD: Startup finished in 3.227s (firmware) + 3.530s (loader) + 8.368s (kernel) + 31.895s (userspace) = 47.021s
10:56 karolherbst: but something is weird with my loader, seriously
10:56 tacchinotacchi: Startup finished in 4.184s (firmware) + 5.704s (loader) + 4.527s (kernel) + 22.611s (userspace) = 37.027s
10:56 tacchinotacchi: this one didn't take plasma boot in consideration though
10:56 karolherbst: yeah and I have mysql and samba starting
10:57 karolherbst: :D
10:57 Tom^: boys, ssd.
10:57 Tom^: ;)
10:57 karolherbst: nah, ssds are dead for me
10:57 karolherbst: I broke three already
10:58 tacchinotacchi: fast hdd
10:58 karolherbst: and I have like 5 working HDDs, each of which is older than my 3 broken ssds together
10:58 Tom^: O_o i still got my corsair f60 from like 5 years ago. according to smartdata ive written around 10TB to it.
10:58 karolherbst: 10TB is like nothing :D
10:58 tacchinotacchi: programmer = many writes uh?
10:58 MichaelLong: karolherbst, you are "using it wrong" :P
10:58 karolherbst: one of my hdd has like 11025 hours power on :D
10:59 tacchinotacchi: doesn't sound like much to me though
11:01 karolherbst: well 11000 hours is already a big deal for "non-professional" ones though
11:02 karolherbst: though I bet 50k would be impressive
11:03 Tom^: Power_On_Hours -O--CK 040 040 000 - 52985
11:03 Tom^: 320gb wd black.
11:03 Tom^: ;)
11:04 karolherbst: :D
11:04 karolherbst: yeah those blacks
11:04 karolherbst: awesome
11:04 karolherbst: I think I only have a blue
11:04 Tom^: got two of them with nearly identical poweron time. never had an issue with them
11:04 karolherbst: yeah WD is awesome
11:05 karlmag: I've only bought NAS ones lately (except ssds, that is)
11:05 karolherbst: Tom^: ohh no, mine is an AV-25 one :/
11:06 karolherbst: the 1TB one that is, my 500GB is a blue and my 320GB one is a seagate, which is crappy because it is a slim one
11:07 tacchinotacchi: i need help on this even though it's the introduction
11:07 tacchinotacchi: http://envytools.readthedocs.org/en/latest/conventions.html
11:07 tacchinotacchi: (on the ram specs completely clueless)
11:08 imirkin: hm, i'm only at ~49k hours
11:08 Tom^: karolherbst: you broke pstatectl
11:08 tacchinotacchi: "2-input operation 0x4 (0b0100) is ~v1 & v2"
11:08 imirkin: a bunch of 2T drives
11:09 Tom^: karolherbst: its in debugfs which is only readable by root, and libnotify cant show as root because of missing user dbus :P
11:09 imirkin: tacchinotacchi: what's the question?
11:09 tacchinotacchi: how exactly did it get "~v1 & v2" from 0x4
11:09 tacchinotacchi: and how can 0x4 be an operation
11:09 karlmag:wonders how smart it would be to use a ssd for compile disk...
11:10 tacchinotacchi: karlmag: lel
11:10 orbea: yay, I compiled a 4.4.0 kernel and now I can clock up to 0d :)
11:10 imirkin: tacchinotacchi: experimentally
11:10 imirkin: tacchinotacchi: there are ops aka "logic op", which take a number, which indicates which op they are. this is the table of those ops.
11:11 Tom^: karolherbst: yea its working otherwise, besides this annoying flicker that sort of has to be fixed. dynamic reclocking sort of makes it flicker quite often ;)
11:11 mwk: tacchinotacchi: 0x4 is 0b0100 in binary
11:12 mwk: and you're thinking of a 2-input binary function
11:12 mwk: there are 4 2-bit combinations: 00, 01, 10, 11
11:12 mwk: so, there are 4 possible inputs to such a function
11:12 Tom^: karolherbst: im not going above 1072 core clock tho
11:12 mwk: the functions are simply numbered by encoding outputs for all possible inputs
11:13 mwk: bit 0 of function number is output for 0b00 input, bit 1 of function is output for 0b01, bit 2 is 0b10, bit 3 is 0b11
11:13 mwk: now, consider 0x4 == 0b0100
11:13 mwk: bit 0 is 0, so function(0, 0) == 0
11:13 mwk: bit 1 is 0, so function(1, 0) == 0
11:14 mwk: bit 2 is 1, so function(0, 1) == 1
11:14 mwk: bit 3 is 0, so function(1, 1) == 0
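The walk-through above can be sketched in a few lines of Python. This is only an illustration of the encoding mwk describes (bit N of the op code is the output for input combination N, with v1 as the low bit of N); the helper name `logic_op` is made up here and is not taken from envytools.

```python
def logic_op(code: int, v1: int, v2: int) -> int:
    """Evaluate a 2-input logic op on single bits.

    The op code is used directly as a truth table: v1 is bit 0 of the
    table index and v2 is bit 1, matching the numbering in the log.
    """
    index = (v2 << 1) | v1
    return (code >> index) & 1


# Enumerate inputs in table order: 0b00, 0b01, 0b10, 0b11.
rows = [(v1, v2, logic_op(0x4, v1, v2)) for v2 in (0, 1) for v1 in (0, 1)]
for v1, v2, out in rows:
    print(f"function({v1}, {v2}) == {out}")
```

The only combination that yields 1 is v1 = 0, v2 = 1, which is why 0x4 reads as ~v1 & v2.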
11:14 karolherbst: Tom^: you won't need it anymore anyway :p
11:14 Tom^: karolherbst: nor do i seem to drop down from 0d
11:14 tacchinotacchi: no i don't get the part of 0b0100 in this :\
11:14 tacchinotacchi: two inputs
11:14 imirkin: tacchinotacchi: 0bXXXX is a way to denote binary
11:15 tacchinotacchi: you change all bits to v1, then do an AND with v2
11:15 karolherbst: Tom^: k
11:15 tacchinotacchi: why is 0x4 any important in this other than being the opcode of the operation
11:15 karolherbst: Tom^: dmesg
11:15 imirkin: tacchinotacchi: it defines the op
11:15 mwk: tacchinotacchi: because it's also the truth table of the operation
11:15 imirkin: tacchinotacchi: 0x4 is effectively the truth table of the boolean op
11:15 mwk: you know what a truth table is, right?
11:16 imirkin: (if mwk and i said it at the same time, it _must_ be right ;) )
11:16 karolherbst: Tom^: and I think I forgot something
11:16 tacchinotacchi: yes you're right
11:16 Tom^: karolherbst: ignore the temp, i did nvaforcetemp to see if its the gpu fan vibrating. https://gist.github.com/anonymous/81304abd5ab931ae27b2
11:16 karolherbst: ahh no, should be fine
11:16 tacchinotacchi: did the programmers just make that up or the gpu knows what operation to do based on the truth table?
11:17 mwk: tacchinotacchi: the GPU just uses the operation code as a truth table
11:17 karolherbst: Tom^: boot with nouveau.debug=clk=debug
11:17 Tom^: mk
11:17 tacchinotacchi: so it doesn't care of what operations brought you to that truth table
11:17 mwk: I've listed the 2-input operations for an example, but GPU also uses 3-input and 4-input functions
11:18 mwk: yes
11:18 mwk: if two operations have the same truth table, they're the same operation
11:18 mwk: like ~(a | b) and ~a & ~b
11:18 mwk: same thing
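A quick way to check the "same truth table, same operation" point is to tabulate both expressions and compare. A minimal sketch, assuming single-bit inputs and the input order used earlier in the log (v1 alternates fastest); `table` is a hypothetical helper for illustration only.

```python
def table(func):
    """Return the outputs of a 2-input bitwise function for inputs
    0b00, 0b01, 0b10, 0b11 (first argument is the low bit)."""
    return [func(a, b) & 1 for b in (0, 1) for a in (0, 1)]


nor_form = table(lambda a, b: ~(a | b))
demorgan_form = table(lambda a, b: ~a & ~b)

# Both expressions produce the same column, so they share one op code.
same = nor_form == demorgan_form
code = sum(bit << i for i, bit in enumerate(nor_form))
print(nor_form, demorgan_form, same, hex(code))
```

Both forms collapse to the same code, so the hardware has no way (and no need) to tell them apart.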
11:19 Tom^: karolherbst: https://gist.github.com/anonymous/199456170ca617d9c2c3
11:20 mwk: tacchinotacchi: basically, the GPU computes the operation exactly as I specified it in the big pseudocode
11:20 mwk: think of the operation code as a truth table
11:20 mwk: 0x4 == 0b0100
11:20 mwk: better yet, let's make it an array of bits, starting from the right: [0, 0, 1, 0]
11:21 karolherbst: Tom^: well if you don't have any load it should be at the lowest clock
11:21 mwk: it constructs an index into this array from the input operands, and just picks the cell corresponding to the combination
11:21 karolherbst: Tom^: cat current_load in the debugfs directory
11:21 Tom^: karolherbst: mk il try again then
11:21 mwk: eg. inputs 0 and 1 are encoded as 0b10 (again from the right), which is 2... and entry 2 of table [0, 0, 1, 0] is 1
11:22 mwk: so, the operation code is really a LUT
11:22 karolherbst: Tom^: I mean I could have done something odd though
11:22 Tom^: karolherbst: uhm how do i parse current load?
11:22 tacchinotacchi: no i really don't get it this way
11:23 Tom^: karolherbst: https://gist.github.com/anonymous/05e33c22c4d3359c16fa
11:23 mwk: which, when performed in hardware, is simply a multiplexer... fast
11:23 tacchinotacchi: how does the gpu know which one to alternate (which input to do 1100 for, which 1010)
11:24 mwk: what do you mean by "do 1100"?
11:24 karolherbst: Tom^: 1/255
11:24 tacchinotacchi: you have column for input one
11:24 tacchinotacchi: the rows are, from top to bottom, 1100
11:24 tacchinotacchi: for input two 1010
11:25 karolherbst: Tom^: k, then nvapeek 0x10a500 0x80 on the blob
11:25 karolherbst: actually I never saw the values from the blob for those regs
11:25 tacchinotacchi: and then you make the operation and see the four values as a result
11:25 mwk: tacchinotacchi: it always goes 010101010101010101.... for the first
11:25 mwk: 0011001100110011 second
11:25 mwk: 000011110000111100001111... third
11:25 mwk: and so on
11:25 tacchinotacchi: thanks
11:25 Tom^: karolherbst: core 0 , mem 94 but im still at 0d on 548mhz and 6999 mem
11:26 karolherbst: yeah cause of the mem load :/
11:26 karolherbst: I need to know those counter configuration from the blob driver
11:26 karolherbst: memory is like real hard to get right
11:29 Tom^: karolherbst: https://gist.github.com/anonymous/92b6c8cc8d25d7535885
11:30 orbea: and im still getting video hiccups :\
11:31 mwk: tacchinotacchi: btw, this means that eg. 0xaaaa is a 4-input function that just returns the first input
11:31 mwk: 0xcccc is return second input, 0xf0f0 is third, 0xff00 fourth
11:31 karolherbst: Tom^: those memory stuff, all right
11:32 mwk: and a useful way of obtaining the function codes is such: suppose you want the code for v1 & v2
11:32 mwk: then you just compute 0xaaaa & 0xcccc == 0x8888, the code for v1 & v2 as a 4-input function
11:32 karolherbst: Tom^: update tom branch and recompile
11:33 mwk: likewise, the code for a XOR (v1 ^ v2) is 0xaaaa ^ 0xcccc == 0x6666
11:33 tacchinotacchi: why a and c
11:33 tacchinotacchi: look like hexadecimal to me
11:33 tacchinotacchi: but it wouldn't make any sense
11:33 mwk: a proper 4-input AND (v1 & v2 & v3 & v4) is: 0xaaaa & 0xcccc & 0xf0f0 & 0xff00 == 0x8000
11:34 mwk: 0xa is 0b1010, ie. alternating bits
11:34 mwk: 0xc is 0b1100, alternating every second bit
11:34 mwk: again, think of a truth table
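The mask trick can be tried directly. The sketch below first derives the four "return input N" codes from the table layout, then combines them with ordinary integer operators; the names `input_mask`, `V1`..`V4` and `MASK` are assumptions made for this example.

```python
def input_mask(i: int, n_inputs: int = 4) -> int:
    """Code of the function that just returns input i: set every table
    bit whose index has bit i set."""
    return sum(1 << idx for idx in range(1 << n_inputs) if (idx >> i) & 1)


V1, V2, V3, V4 = (input_mask(i) for i in range(4))
MASK = 0xFFFF  # a 4-input truth table has 16 entries

and2 = V1 & V2                # v1 & v2, encoded as a 4-input function
xor2 = V1 ^ V2                # v1 ^ v2
and4 = V1 & V2 & V3 & V4      # v1 & v2 & v3 & v4
nor2 = ~(V1 | V2) & MASK      # NOT needs masking back to 16 bits

print([hex(v) for v in (V1, V2, V3, V4)])
print(hex(and2), hex(xor2), hex(and4), hex(nor2))
```

Python integers are unbounded, so any expression using `~` has to be masked back to the table width, as `nor2` shows.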
11:34 tacchinotacchi: facepalm
11:34 tacchinotacchi: you are very patient
11:35 mwk: hmm
11:36 mwk: I'd really love an example right now...
11:36 tacchinotacchi: why always 4 hex digits ? shouldn't they be dependent on the number of inputs?
11:36 tacchinotacchi: like for 2 input = 4 possibilities = 1 hex digit
11:37 mwk: that's correct
11:37 mwk: 2-input functions have 1 hex digit, 3-input have 2 hex digits, 4-input have 4
11:38 mwk: and I haven't seen any bigger size so far
11:38 Tom^: karolherbst: ugh i need a new fan for my gpu
11:39 tacchinotacchi: intuition suggest since 16 is power of two, it should always be possible with an integer number of hex digits
11:39 tacchinotacchi: but i'm not sure
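The digit counts follow from the table size: an n-input function has 2^n table entries, and each hex digit holds 4 of them. A small sketch (the helper `code_width` is hypothetical) that also shows where the "always a whole number of hex digits" intuition holds:

```python
def code_width(n_inputs: int) -> tuple[int, int]:
    """Return (table bits, hex digits needed) for an n-input logic op."""
    bits = 1 << n_inputs
    hex_digits = -(-bits // 4)  # ceiling division
    return bits, hex_digits


widths = {n: code_width(n) for n in range(1, 6)}
for n, (bits, digits) in widths.items():
    print(f"{n}-input: {bits} table bits, {digits} hex digit(s)")
```

From 2 inputs upward the bit count is a multiple of 4, so the code fills its hex digits exactly; a 1-input function only needs 2 bits and would leave half a digit unused.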
11:40 mwk: tacchinotacchi: here's a 4-input example: https://gist.github.com/koriakin/d221225bab17e7f8fa03
11:40 mwk: so, there's a truth table
11:40 mwk: the first column of the truth table (input 0) is always 0, 1, 0, 1, 0, 1
11:40 mwk: second is always 001100110011...
11:40 mwk: third is 0000111100001111 and so on
11:41 tacchinotacchi: thanks again
11:41 mwk: to get the operation code for whatever operation you want, you need to write such a truth table
11:41 mwk: then read the column of your operation
11:41 mwk: and write it down from the right
11:41 tacchinotacchi: well i'd guess it's not like that for every opcode
11:42 mwk: in the example, v0 & v1 & v2 & v3 is 0, 0, 0, 0, ...., 0, 0, 1
11:42 tacchinotacchi: what about add when you have carry?
11:42 tacchinotacchi: maybe boolean ones
11:42 mwk: which is 0b1000000000000000 == 0x8000
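The recipe (write the truth table, read the output column, write it down from the right) can be followed literally. A minimal sketch for the 4-input AND from the example, with variable names chosen here for illustration:

```python
n_inputs = 4
column = []  # output column of the truth table, top row first

for index in range(1 << n_inputs):
    # Row `index`: v[0] alternates every row, v[1] every two rows, etc.
    v = [(index >> i) & 1 for i in range(n_inputs)]
    column.append(v[0] & v[1] & v[2] & v[3])

# "Write it down from the right": the top row becomes the lowest bit.
bits_from_right = "".join(str(b) for b in reversed(column))
code = int(bits_from_right, 2)
print(bits_from_right, hex(code))
```

Swapping the expression appended to `column` for any other bitwise combination of the inputs gives that operation's code in the same way.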
11:42 mwk: add? add is not a bitwise operation, you can't encode it like that
11:43 tacchinotacchi: yes that's what i was saying
11:43 tacchinotacchi: but needed someone to be sure
11:44 mwk: I'm not sure which use of the bitwise ops you have in mind
11:44 mwk: the encoding is used in at least 4 places that I know of
11:44 mwk: are you working on the perf counters, raster ops, one of the ISAs with bitwise op operation?
11:45 tacchinotacchi: not working on anything, i'm reading the documentation to learn
11:45 mwk: then you started in a scary place :p
11:45 tacchinotacchi: but as you can see when it comes to low level i'm already lost in the introduction
11:46 mwk: that page just contains a few common parts I've extracted from perf counter / vp1 ISA / raster output docs
11:46 mwk: I had nowhere to stuff that
11:46 tacchinotacchi: is it the same for bitwise operations when you do regular assembly for the cpu
11:46 tacchinotacchi: ?
11:46 mwk: depends on the CPU
11:47 mwk: some of them do use a similar encoding
11:47 karolherbst: Tom^: why do you need a new fan?
11:47 tacchinotacchi: all x86 should do the same with same opcodes
11:47 tacchinotacchi: same with all arms, with amd64 etc.
11:47 mwk: but mostly they conserve opcode space by only exposing a handful of these operations
11:48 mwk: and CPUs usually don't have 3-input bitwise ops, too rare to be useful
11:48 imirkin: or any 3-input ops for that matter
11:48 mwk: fused multiply-add is 3-input...
11:48 RSpliet: karolherbst: Microsoft PLL(c)
11:48 imirkin: how many CPUs have that?
11:48 mwk: compare-and-swap too
11:48 RSpliet: or seriously, "memory source PLL"
11:49 mwk: imirkin: modern x86 does
11:49 RSpliet: aka, the one that goes into the mpll
11:49 tacchinotacchi: when the gpu does? is it atomic?
11:49 imirkin: mwk: right :) so... not a lot
11:49 karolherbst: RSpliet: k, I just got a lot of compile errors I have to dig through, I bet most of them are caused by a crappy 4.4 porting or something
11:49 mwk: touche
11:49 imirkin: or rather, it's a very recent development
11:49 tacchinotacchi: i mean, gpus have 3 input ops in their lookup tables or they simulate that with multiple operations?
11:50 RSpliet: karolherbst: no, I just never did a compile-check
11:50 karolherbst: :D
11:50 karolherbst: k
11:50 imirkin: tacchinotacchi: gpu's have 3-input ops... have for a while
11:50 RSpliet: like formatting for documents, that's usually one of the last things I do
11:50 mwk: tacchinotacchi: depends on the place in the GPU
11:50 karolherbst: RSpliet: I bet you won't find much time to clean it up? At least not the next days?
11:50 mwk: generally, none of the on-gpu processors have 3-input bitwise ops... they tend to be too useless
11:50 tacchinotacchi: well i hope the driver guys don't have to deal with that
11:51 tacchinotacchi: i mean knowing whether a gpu has 3 input bitwise ops or simulates them
11:51 mwk: the only use of 3-input ops that I know of is in the 2d raster output, which is ancient stuff
11:51 RSpliet: karolherbst: maybe I'll see if I can do a pass tonight if you like
11:51 karolherbst: would be awesome, then I have something to do for the next weeks
11:51 tacchinotacchi: i'll leave you in peace for a while
11:51 tacchinotacchi: bb
11:52 karolherbst: Tom^: so any changes yet?
11:52 RSpliet: karolherbst: in particular, if you could make a start on DDR3 reclocking for Fermi that'd be swell :-P
11:52 karolherbst: I have a ddr3 fermi card here, yes :p
11:53 tacchinotacchi: oh i happen to have one
11:53 tacchinotacchi: lel
11:53 RSpliet: karolherbst: the memory script I have is derived from a single GDDR5 card, so I think you can do some serious work there
11:54 tacchinotacchi: if you wait a few years and my laptop doesn't die you might see an imp of mine of the core reclocking lol
11:54 karolherbst: RSpliet: k, I would like to have that one and check against my DDR3 script
11:54 karolherbst: then
11:54 karolherbst: RSpliet: by the way, have one upclock and one downclock one?
11:55 RSpliet: script? no, one script for every action
11:55 karolherbst: k
11:55 RSpliet: see GT215 for how that is done the right way
11:55 karolherbst: I am already thinking about redoing the entire thing for fermi, because it just didn't fit with my script :/
11:56 RSpliet: I wouldn't go that far
11:56 karolherbst: RSpliet: I thought you meant a SEQ script from the trace
11:56 RSpliet: yeah, one seq script for "reclock" whether it's up or down
12:04 karolherbst: ä #
12:08 Tom^: karolherbst: yea now im always on 07 :p
12:09 Tom^: karolherbst: core: 253 , mem: 36, video: 0, pcie: 40 , and on 07, i did however see it flicker so i suspect it went up a pstate but down again.
12:10 karolherbst: Tom^: yeah well, depends on what you do
12:10 karolherbst: if you vsync it should not be clocked too high, because it only has to calculate 60 fps anyway
12:13 Tom^: karolherbst: 5 glxspheres, the fifth one ran at like 3 mpixel :p
12:13 Tom^: karolherbst: and after a while it all froze https://gist.github.com/0797e021cbc2a99ddab4
12:13 karolherbst: Tom^: well you could also check the pstate file how high it clocks
12:13 Tom^: i was, 07
12:14 karolherbst: yeah there seems to be a bug in the request handling code :/
12:15 karolherbst: Tom^: could you the next time boot with nouveau.debug=clk=trace
12:15 Tom^: sure, i however have to go to bed :/
12:16 Yoshimo: karol, what are you working on these days?
12:17 karolherbst: reclocking stuff
12:19 Tom^: karolherbst: clk=trace doesnt seem to change dmesg output
12:20 karolherbst: it should :O
12:20 Tom^: hehe having just one glxspheres open kinda makes it constantly flicker because it jumps between some cstate
12:21 karolherbst: but dmesg should be filled with all kind of stuff
12:21 Tom^: oh wait i typoed the module
12:23 Tom^: karolherbst: https://gist.github.com/anonymous/f29193049b805184e7be
12:24 karolherbst: better
12:24 Tom^: il leave it at that, time for bed.
12:25 karolherbst: yeah I did something wrong :D
12:56 RSpliet: wow, this is an actual thing, isn't it: https://pbs.twimg.com/media/CYV07YpWYAA67Be.png ?
13:31 tacchinotacchi: RSpliet: maybe it's only for intel graphics?
14:03 mwk: RSpliet: nice one
14:03 mwk: with a lock... fitting
14:37 RSpliet: karolherbst: I pushed a mountain of compile fixes, but bear in mind
14:37 RSpliet: - half of what you find in the GDDR5 script is wrong, old, stale or needs to be verified; use with caution and don't try to blindly run it
14:38 RSpliet: - I most likely fucked up the voltage GPIO naming and conditions
14:38 RSpliet: - none of it is tested on an actual machine; random broken-ness is plausible
14:40 RSpliet: ... oh, hm. well, if anyone sees him again, tell him to check the IRC logs
14:41 hakzsam: okay :)
15:49 chillfan: hm so i'm using the stable_reclocking_kepler branch.. just updated it and the kernel, missing the 'pstate' interface from sys
15:49 chillfan: has it maybe moved?
15:51 chillfan: ah, i have in dmesg: [ 118.259195] nouveau: unknown parameter 'pstate' ignored
15:54 chillfan: brb
15:54 hakzsam: chillfan, yeah, this has been recently removed
15:54 hakzsam: https://github.com/karolherbst/nouveau/commit/bf55b04043389bb648236e6be7d643f60bc4dcb6
15:54 imirkin_: chillfan: the file is in debugfs now, no need for the param
15:54 hakzsam: btw, the wiki should be updated
15:54 hakzsam: http://nouveau.freedesktop.org/wiki/KernelModuleParameters/
15:55 imirkin_: yeah, that page needs some serious updates since the 4.3 rewrite where ben changed all the names... again =/
15:56 chillfan_: hm still missing it, tried rebuilding
15:57 imirkin_: chillfan_: it's gone. you don't need it anymore.
15:57 chillfan_: oh, so erm, what's the right way to check pstate etc?
15:57 imirkin_: it's in debugfs now
15:57 imirkin_: /sys/kernel/debug/dri/0/pstate
15:58 chillfan_: ah okay, i see it thanks
15:59 chillfan: brb then, have a bug to report anyway :)
16:11 chillfan: ok, it's basically a screen corruption problem, the issue happens either with or without compton enabled using vsync
16:12 chillfan: will find somewhere to post the desktop recording
16:13 chillfan: http://expirebox.com/download/f24cbfea91788a91f476b2c7b07fb42c.html .. graphics card is gtx 780ti, latest nouveau stable_reclocking_stable branch
16:13 chillfan: kernel 4.4
16:13 chillfan: from karols branch
16:18 chillfan: also occured before update, just updated there to make sure it wasn't already fixed
16:19 chillfan: brb
17:53 mwk: well that was fun.
17:55 mwk:managed to get a bit-perfect software implementation of G80's rcp instruction after lots of poking
17:56 mwk: lovely squaring algorithm and likewise lovely fudge factor in the rounding step
17:56 imirkin_: that sounds like it was fun
17:57 imirkin_: i wonder if that's how it actually works
17:57 mwk: it is
17:57 imirkin_: or if you missed some simpler explanation
17:57 mwk: there's an actual paper from nv describing how it works
17:57 mwk: I just had to figure out the details
17:57 imirkin_: ah :)
17:57 mwk: http://pctuning.tyden.cz/ilustrace3/soucek/g80/paper-164.pdf
17:58 mwk: they do mention an ugly fudge factor for rounding, but the square thing took me by surprise
17:59 mwk: imagine a 17x17 squarer, where you discard the lower 17 bits [because you're operating on fractions]
17:59 mwk: they mention hacking off lower input bits in the paper
18:00 mwk: what they really do... take all partial products in the squarer that contribute less than 0.5 to the output, and discard them
18:01 mwk: it works like that: https://github.com/envytools/envytools/blob/master/nvhw/sfu.c#L41
18:01 mwk: imirkin_: it must be working exactly this way, changing any smallest detail results in lots of test failures
18:02 mwk: fudge factor 0x47e7 is exactly correct, both 0x47e6 and 0x47e8 have dozens of failures in [1, 2) range
18:03 imirkin_: i believe you :)
18:05 mwk: now I just need the other 9 SFU functions
18:06 mwk: I wonder if they ever changed the algorithms, I only checked on a G200 so far
19:26 orbea: I got this trying to play a video with mpv, any ideas? http://dpaste.com/13MZ2CT
19:28 orbea: hmm, it only happens when the video is in a .rar (Which mpv normally can play fine)
19:29 orbea: and another video in a .rar works fine
19:30 orbea: and I cant repeat it...
19:46 imirkin: this scientific experiment isn't going the way i planned...
19:46 imirkin: i might have to take matters into my own hands
19:47 imirkin: my theory was that by disabling opt in st_glsl_to_tgsi i could speed up compiles
19:47 imirkin: but... it's not panning out
19:49 airlied: imirkin: didn't we discover disabling it caused bugs before?
19:50 imirkin: airlied: yeah, but i figured out why
19:50 imirkin: airlied: so i left in the critical bit
19:50 imirkin: which is the last part of the dead code elimination
19:50 imirkin: which just does a once-over the instructions and deletes totally dead ones
19:51 airlied: maybe the overhead of converting tgsi->nv50 ir is > than the overhead of removing dead stuff
19:52 imirkin: airlied: well, i haven't benchmarked it with *just* the dce thing
19:52 imirkin: only larger blocks
19:52 imirkin: and i found a bug in the GlobalCSE thing which took up most of the time i was going to futz with the various conversion logic
19:53 imirkin: i did recover a lot of the perf loss by disabling some extra additional stuff... i'm also wondering if i can nuke the common optimizations
19:54 imirkin: and i'm going to make LocalCSE not be O(n^2) anymore, maybe that'll help
19:57 imirkin: airlied: there's also the various glsl opt that gets done which could be removed... i have things to play with :)
20:01 imirkin: although one interesting tidbit is that taking overall timing stats, nouveau is 20% faster at compiling than i965
20:01 imirkin: obviously a bunch of that is shared logic
20:58 Guest11139: hm,i want to use nouveau,but it is not working,it is my xorg.log http://codepad.org/jw7z5b8f ,please help me .thanks I looked up a lot of information, but did not solve the problem,if u need more,please tell me ,thanks
21:02 Guest11139: dmesg |grep nouveau http://codepad.org/vTDxpwFF
21:04 imirkin: Guest11139: do you have any screens connected to the nvidia card?
21:06 Guest11139: no
21:06 imirkin: Guest11139: that's probably why X doesn't start on its own... what are you trying to achieve?
21:08 Guest11139: i have two cards ,i use intel ..but nouveau is not work
21:15 imirkin: Guest11139: i still have no clue what you're trying to achieve
23:15 tiny001: 1
23:18 tiny001: i have two videocards(intel & nvidia),intel working normal,nvidia (use nouveau)is not working ,it is my xorg.log http://codepad.org/AF5N0bs3 ,if u need more ,please tell me ,thanks
23:22 towo^work: if you can't disable the igp in your firmware, you can't use the nvidia chip in that way
23:23 towo^work: if you want the nvidia chip, you can use dri offloading for rendering programs on the nvidia chip
23:24 tiny001: it is my kernel configure http://codepad.org/50jSNsA5
23:25 imirkin: tiny001: you still haven't said what it is you're trying to do and what isn't working
23:25 imirkin: try to be specific
23:27 towo^work: tiny001, the kernel doesn't have much to do with your situation
23:28 tiny001: ok,,thanks,,i am a chinese,my english is poor,but i will try my best to specific
23:29 tiny001: I don't know where to set the wrong,
23:30 towo^work: you use an xorg.conf snippet, where you set nouveau as driver
23:30 imirkin: tiny001: what is your goal? why are you trying to force the nouveau ddx to load in your xorg config? what are you hoping will happen?
23:30 towo^work: this can't work in optimus setups, if you can't disable the IGP
23:31 imirkin: tiny001: perhaps you might benefit from looking at http://nouveau.freedesktop.org/wiki/Optimus/
23:31 imirkin: there's an automatic translation feature, no idea how well it works
23:32 tiny001: if xorg.conf setting videocards=nouveau,not use intel ,it is not working
23:33 tiny001: i hope just one nouveau ,it is to working
23:33 imirkin: what would you expect would happen? you're telling it to use a chip which doesn't have any displays connected
23:34 imirkin: as far as i can tell, everything's working as expected
23:34 imirkin: read the wiki page i linked to.
23:35 tiny001: i read it yet,but I can't fully understand
23:36 imirkin: try using the translate widget that's built into it... it uses google translate
23:48 tiny001: 1
23:50 tiny001: I do not want to switch between the two, as long as one(just nouveau ) can use
23:53 tiny001: i know that page can use google translate,but I don't know where to set the right after I finish reading it.