04:18imirkin: fergal: fyi i pushed a fix to mesa which should allow using a variable named buffer prior to glsl 430. should be included in 11.2.1 when that's released in a week or two
05:10martm: imirkin: i have several versions of the pseudocode for the scheduler now, i believe most of them are quite perverse
05:12martm: i wanna use as little cache memory as possible, although that seems to be 16000 word entries, hence i've got to do the getpc, and use a bit of multiply, divide, add and bitwise ops in the header/handler
09:45karolherbst: Misanthropos: there is one thing you could do too. Run nvatiming on nvidia at full clocks and do the same on nouveau on 0f
12:23karolherbst: skeggsb: with your recent patches I need https://github.com/karolherbst/nouveau/commit/bfc52c6334baaa677d2a464789fdb5f6cca6296d for my tool
12:23karolherbst: skeggsb: or something like this, but I think you have a deeper insight in what is going wrong there
14:22karolherbst: mupuf: could you think of any reasons why the GPU needs a higher voltage set when running on nouveau?
15:09karolherbst: unloading nouveau sure is fun
15:11karolherbst: RSpliet: did you look into the gr firmware for issues? I am not quite sure who it was
16:22martm: i tell you about the scheme, i'd have to see the SSA form of the final shader, i can imagine that every opcode has a destination operand and source operands
16:24martm: so when you have a cache address that is from base, since there is 1024 registers per thread , it's encoded in 8bit value as destination address of a register file
16:24martm: so one could do like this:
16:25martm: get the pc of the instruction, having earlier filled in the destination operand with operations relative to the pc
16:25martm: for example currentpc/2-destination register, this will consume roughly 6 bits
16:26martm: that's out of 8
16:27martm: now you'd know where next instruction also lies on the cache register
16:28martm: better off encoding that as currentpc - current destination operand - next instruction destination operand
16:30martm: now instead of the next instruction dest operand you could write a mask instead
16:30martm: and in the handler you parse two instruction from cache, and take the mask, and write back the derived operand
16:37martm: then you'd launch a branch per instruction in the handler, so the scheduler, would do 4additional instructions roughly, the fewest that i could think of
16:39karolherbst: mupuf: I think I found it in PFUSE...
16:39martm: handler is done in such way, that instructions can be pinned in reverse order
16:40martm: so the first one appears as the last, and branch per instruction starting from the last going to top all the time
16:40martm: so it will jump back to the end and end jumps to the handler where the calculation is being done
16:45martm: yeah that is all done in the shader, the beauty of it is that when you know the cache slot, it's accessed in 4 cycles, whereas when you don't it will be 27 cycles to go over all the cache
16:45martm: and the other beauty of course is that it's done on-chip instead of off-chip, where filling the cache in off-chip mode is just a lot slower
16:51martm: i concur with imirkin in a sense, when parsing is wanted to be done partly on gpu, you don't want to distract cpu's dma operations like make it wait with the cpu threads, this is just pointless the pushbuf should pause for a split second when gpu rearranges the stuff
16:51martm: while cpu goes ahead doing the slow transfers it was doing, and pushbuf will capture it again later
17:00martm: it is simple, but you must talk about how to handle it , what is the scheme you like the best etc. my budget is low so i can not meet and discuss about how we do the remaining things, but it looks good so far
17:07martm: karolherbst: there are some things i do not know, since i have wasted lots of time, there is writeback stage too of the cache, i mean how that works, plus i do not know entirely how branches work, but it will do it anyways, with slight modifications
17:08martm: the truth is i can make it work anyway i want on fpga's, but on asics some details must be made sure...but the scheme would work regardless probably
17:11martm: depending on configuration there are three caching behaviors...write back and write through
17:11martm: and one more, which i do not remember currently
17:11martm: i think the third one is not used in gpu land anyways
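The two policies martm names can be shown with a toy model. This is a minimal sketch of the textbook behaviour only, not a description of any GPU cache; `ToyCache` and its methods are hypothetical names invented for the illustration.

```python
class ToyCache:
    """Toy cache in front of a dict that stands in for backing memory."""

    def __init__(self, backing, write_back):
        self.backing = backing        # dict: address -> value
        self.write_back = write_back  # True: write-back, False: write-through
        self.lines = {}               # cached copies
        self.dirty = set()            # addresses not yet written to backing

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_back:
            # write-back: only mark the line dirty, memory is updated later
            self.dirty.add(addr)
        else:
            # write-through: memory is updated on every write
            self.backing[addr] = value

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.backing.get(addr, 0)
        return self.lines[addr]

    def flush(self):
        # write all dirty lines out to backing memory
        for addr in self.dirty:
            self.backing[addr] = self.lines[addr]
        self.dirty.clear()


mem_wt, mem_wb = {}, {}
wt = ToyCache(mem_wt, write_back=False)
wb = ToyCache(mem_wb, write_back=True)
wt.write(1, 5)
wb.write(1, 5)
print(mem_wt, mem_wb)  # write-through memory already holds the value
wb.flush()
print(mem_wb)          # write-back memory holds it only after the flush
```

With write-through the backing store is always current; with write-back it lags until a flush or eviction, which is the writeback stage martm says he is unsure about.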
17:25martm: beauty, but anyways i'm off to bed now..it seems absolutely stunning and astonishing at least how good card the kepler is also the fermi and all the newer ones probably too, but karolherbst
17:25martm: is the sched codes also there for kepler card, i mean starting from kepler last one being included?
17:27martm: there is no doubt it will take conversation to get the scheduler in/merged..as its complexity is above easy level
17:28martm: there are like hundreds of different details, that all should be aware about, and the instructions per card need to be profiled before it's almost a real work
18:02Misanthropos: karolherbst, mmiotrace resulted in segfault on nvidia so i couldnt trace anything
18:03karolherbst: segfault as in segfault or kernel crash?
18:03Misanthropos: i did nvatiming just now:
18:03Misanthropos: nouveau: https://bpaste.net/show/9e4e0282e4bc
18:03Misanthropos: nvidia: https://bpaste.net/show/d2e5730afdc1
18:03karolherbst: Misanthropos: you mean this double hit issue?
18:04Misanthropos: the driver did not load correctly with mmiotrace.. but did without
18:04karolherbst: yeah, there is a patch for this
18:05karolherbst: Misanthropos: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/arch/x86/mm?id=cfa52c0cfa4d727aa3e457bf29aeff296c528a08
18:07karolherbst: Misanthropos: the nvatiming output is a bit too odd
18:07karolherbst: Misanthropos: did you run it at max clocks on both drivers?
18:08Misanthropos: no.. i just realized
18:08karolherbst: Misanthropos: but loading nvidia with mmiotrace just didn't succeed and did you check dmesg in that case?
18:08karolherbst: I am sure you got a message about a second hit on the same address or something
18:09Misanthropos: nvidia high perf: https://bpaste.net/show/02b05fd67f91
18:09karolherbst: yeah, better, thanks
18:09Misanthropos: i checked dmesg.. and it showed some error
18:09Misanthropos: i could not start x either
18:10karolherbst: Misanthropos: don't you remember something like secondary hit? and couldn't handle page fault requests or something like that?
18:10Misanthropos: no.. but i can reproduce if you like
18:11karolherbst: or just apply my patch
18:11karolherbst: and if it works, then it was this issue
18:11karolherbst: it is already pushed upstream for 4.6
18:11Misanthropos: i remember dri wasnt able to load
18:11Misanthropos: using 4.4.0 for testing
18:12karolherbst: we could check something else
18:12karolherbst: Misanthropos: cat /proc/meminfo | grep DirectMap
18:12karolherbst: does it show 4k, 2M (and 1G)?
18:13Misanthropos: 4k and 2m it does
18:13karolherbst: k, then it is this issue most likely
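The DirectMap check above can also be scripted. A minimal sketch, assuming the usual `DirectMap<size>: <n> kB` line format of /proc/meminfo; the function name `direct_map_sizes` and the sample text are made up for illustration and are not taken from Misanthropos' machine.

```python
import re

def direct_map_sizes(meminfo_text):
    """Return {page size label: kB mapped} for each DirectMap line."""
    sizes = {}
    for line in meminfo_text.splitlines():
        m = re.match(r"DirectMap(\w+):\s+(\d+)\s*kB", line)
        if m:
            sizes[m.group(1)] = int(m.group(2))
    return sizes

# Hypothetical sample input; on a real system pass the contents of /proc/meminfo.
SAMPLE = """MemTotal:        8000000 kB
DirectMap4k:      200000 kB
DirectMap2M:     8000000 kB
"""

print(direct_map_sizes(SAMPLE))
```

A result with both a 4k and a 2M key corresponds to Misanthropos' answer "4k and 2m it does".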
18:13karolherbst: my patch should apply without issues on 4.4
18:18Misanthropos: how do i get a patch file out of that?
18:20karolherbst: click on patch
18:20karolherbst: there is commit hash (patch)
18:21Misanthropos: looks like an email? i can use it for patch?
18:22Misanthropos: worked :D
18:22Misanthropos: rebuilding kernel ...
18:48martm:wonders why imirkin is ignoring me
18:53martm: aah that was classical way abortion leftovers built up a problem in my life, by continuosuly playing sexpolicemen and afterists, wether now they play aneourmous victims around the world, talking that i am a gangaster and shit like this
18:53martm: those are hornny fucktards that scam with my life around the world
18:54martm: sane man should understand that doing such activities stuff can be not right in the head of those
18:56martm: they seem to have forgatten what they did and still arrange when talking about me
18:56martm: those fucking fellonies, and i am sent to institution, well i capture then one point and punish them all
18:59martm: very old time i called them stalkers with suicidal activity, somehitng in the middle of a stalker/scammer/horny fuckdard
18:59martm: they together understood, that it's best to get rid of me
18:59martm: as i am standing on their ways, cause i had relations of some sort with their women, allthough minimal
19:02martm: i really let the shit that they talk from one ear in, and from the other out as we say
19:03martm: cause their mission is to cumb my brain, to lower my popularity and scam everything about me
19:06martm: ia m gaoing into court anyways with this
19:07martm: later of if probrlems exist still, then prolly more drastical measures will be taken
19:07szt: karolherbst: hey, just an update: after having way too many freezes this weekend I switched to a friends ATI card
19:07szt: no problems so far
19:09martm: its sort of like enourmous team of poeple including my father who have done it really decades allready, the complaints are extraordinary about what they do
19:11martm: paoeple are in capable of finding any interests which is most poinful thing , and be any good in what they choose etc. painful that my id travels among such shitbreaks
19:12martm: all they knew that thereis more hansom man, and they want to get rid of it
19:15martm: they share information about me, and provide them as ironic jokes while stalking me daily basis
19:15martm: humour is based off how my dad has block me from getting where i want with it's life acheivements, when those painful jokes won't carry results things go worse and worse and worse
19:16martm: arrangements where i lose my healht including assaults etc.
19:16Tom^: martm: if i were to guess, the reason he or anyone would ignore you is these kinds of rants that happen on a weekly basis
19:16martm: mentally very siphisticated
19:17martm: Tom^: they do not happen in daily basis
19:17martm: Tom^: and this is not a rant, this is serious
19:17Tom^: martm: and it is related to #nouveau how? or rather why bring it up here?
19:17Tom^: martm: i understand it might be serious but this isnt really the place for such things
19:18martm: now you are talking, so we talk about scheduler then
19:18martm: but we need to discuss about it, before anyone'd be happy with the outcome after some implementation
19:19martm: if noone talks i am here to help, since i am better dealing with those things then the picketation on streets
19:20martm: but if you wont start to go to arguments with me here on irc channel, i can not play along to help either
19:21martm: or even this, way if you talked and acted abit it would be allready ready
19:21martm: but this sobbatage i get everywhere too
19:23karolherbst: szt: yeah well :/
19:24szt: I still have my old card
19:24karolherbst: yeah, maybe we figure all this out at some point
19:24szt: if the restart thingy is ever done I'll definitely return
19:26martm: Tom^: i am just thinking wether you have weak nerves, i think it is not my fault, abviously i know it isn't why they picket about me, i mean lets just ignore that, not the conversation how to speed up the development
19:26martm: nones gonna touch you anyways, if the intention was to knock out me
19:28martm: Tom^: the scheduler on nouveau does not interest me at all, cause i implement in different way whan i once start to deal with it on fpga
19:29Tom^: martm: ok, sadly i barely know how to code at all, i only helped test nouveau on my nvidia card, so i won't be much help in discussions about the scheduler or the code in nouveau.
19:29martm: but on asics there are couple of versions too wich i decided to present, ok maybe we talk about how to do it tomorrow, i put couple of versions with instructions into dpaste.com and you tell me how you would like it
19:32martm: Tom^: no one, at least not me, is a perfect coder, every snippet of code is almost pastable from the network, only minor adjustments need to be done
19:33martm: i hold my mouse so much that my hand has adjusted to the mouse shape already, i go over the links every day, like sometimes thousands of them in a day
19:34martm: when you finally read them off, modifications are easy too
19:37Misanthropos: karolherbst, no luck: https://bpaste.net/show/8e364b318361
19:38karolherbst: maybe you need to recompile nvidia after enable mmiotrace
19:38Misanthropos: i can try that
19:40martm: Tom^: as you understand that is classical sabbotage and violation against, be better off sure that there aren't any reason behind this, since there never was too, only the core envy
19:46Misanthropos: karolherbst, nope... https://bpaste.net/show/7314ccbfd8d3
19:46karolherbst: Misanthropos: no idea then. I know that nvidia sometimes does that, but I have no idea why
19:46martm: Tom^: just loops and stuff, i am almost skilled in reading c, c++ is difficulter just bit, i never use ides, my grandmaximum of gpu knowhow is allready there, and i've studied opengl too
19:47martm: i am not sure why i did , that cause everyone could violate me?
19:47Misanthropos: i will try that with a different kernel
19:47Misanthropos: maybe a different nvidia drivers version too
19:47martm: i'll branch off from those channels than and go on my own
19:47karolherbst: Misanthropos: defconfig might work + drivers you need, but I never looked into it
19:48martm: all by myself is you know how it is, a lot harder, but no other way
19:48Misanthropos: what is defconfig?
19:48martm: i definitely won't go on the streets and listen shit instead
19:48martm: all days behing my computer, or...i would just play my games and leave to home again if i start with sports again
19:48Misanthropos: i mean .. where do i find it?
19:49karolherbst: Misanthropos: if you run make defconfig in the linux tree you get the default config for your architecture
19:51karolherbst: mhh can anybody do anything with those numbers: 896529 (0xdae11), 19956 (0x4df4), 2500 (0x9c4), 30 (0x1e) ?
19:52RSpliet: karolherbst: lottery numbers?
19:52RSpliet: what context?
19:52karolherbst: RSpliet: unknown vbios table
19:52karolherbst: the 2500 could mean something, but what ... :/
19:53RSpliet: only one entry?
19:53karolherbst: 01 11 ae 0d f4 4d 00 00 c4 09 00 1e 00 00 1e 1e 0f 00 00 00
19:54karolherbst: another kepler:
19:54karolherbst: 01 11 fa 0c a0 4c 00 00 c4 09 00 10 00 00 1e 1e 0f 00 00 00
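The two dumps can be decoded the same way as the numbers quoted earlier in the conversation. A minimal sketch, assuming little-endian fields; the offsets and widths below are guesses chosen only so that the first entry reproduces 896529, 19956, 2500 and 30, and the field names are placeholders, since the real layout of this vbios table is unknown.

```python
def decode_entry(hex_dump):
    """Decode one table entry using guessed little-endian field boundaries."""
    raw = bytes.fromhex(hex_dump)

    def le(offset, size):
        return int.from_bytes(raw[offset:offset + size], "little")

    return {
        "unk_01": le(1, 3),   # assumed 24-bit field
        "unk_04": le(4, 4),   # assumed 32-bit field
        "unk_08": le(8, 2),   # assumed 16-bit field
        "unk_0b": raw[11],    # assumed single byte
    }

first = decode_entry("01 11 ae 0d f4 4d 00 00 c4 09 00 1e 00 00 1e 1e 0f 00 00 00")
second = decode_entry("01 11 fa 0c a0 4c 00 00 c4 09 00 10 00 00 1e 1e 0f 00 00 00")
print(first)
print(second)
```

Under these assumed boundaries the 2500 value is identical in both entries while the other three fields differ, which fits the suspicion that part of the table is card specific.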
19:55RSpliet: I take it it's in the P domain?
19:55karolherbst: I am searching for those stupid coefficients, but ... :/
19:56karolherbst: mupuf also has a gk106, but the voltage is 3% lower
19:56karolherbst: so I am pretty sure there are some card specific factors in it
19:56RSpliet: not that I have a clue, but... if a clock of 2500MHz maps to voltage entry 1e... does it? :-P
19:56karolherbst: the CSTEP table maps to voltage entries already
19:57RSpliet: like there's never been any redundancy
19:57martm: Tom^: then i will leave, if you try to read my non-rant programming talks, did you understand something about that?
19:58karolherbst: no, what I am searching for is something like this: voltage = a * c0 * f0 * value(mode) + b * c1 * f1 * value(mode) + ...
19:58karolherbst: c0, c1... are the coefficients from the voltage map table
19:58karolherbst: f0 are the factors which are gpu specific and a, b.. can be different factors which may be the same across all gpus
19:59karolherbst: I don't think the value(mode) thing is gpu specific, because mode 1 always says to add the temperature to it and I never saw anything different
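The shape of the formula karolherbst is looking for can be written down directly. A minimal sketch with made-up numbers; the function name, the per-term `mode` field and the `mode_values` lookup are assumptions for illustration, and none of the values come from a real vbios.

```python
def voltage(terms, mode_values):
    """Sum a * c * f * value(mode) over all terms.

    terms: iterable of (a, c, f, mode) where
      a    -- factor assumed to be shared across GPUs
      c    -- coefficient from the voltage map table
      f    -- GPU-specific factor
      mode -- key selecting the input value (e.g. mode 1 -> temperature)
    mode_values: dict mapping mode -> current input value
    """
    return sum(a * c * f * mode_values[mode] for a, c, f, mode in terms)

# Hypothetical inputs: mode 0 is a constant 1.0, mode 1 is a temperature of 50.
example_terms = [(1.0, 2.0, 0.5, 0), (2.0, 1.0, 1.0, 1)]
example_modes = {0: 1.0, 1: 50.0}
print(voltage(example_terms, example_modes))
```

Keeping `f` separate from `c` is what would let two cards with the same table, such as the two gk106 boards mentioned, end up with voltages a few percent apart.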
20:00martm: cause i can make a code that demonstrates with what i mean and what are the open questions for me how would some features work, basics are just you map the shader into pfifo cache, and execute it there pc to the cache, there you know which instruction is where, you can poke in every word, and change their code
20:00martm: that is all done in very fast memory and fast fashion
20:00karolherbst: RSpliet: also, my highest clock is 1725MHz
20:01karolherbst: ohhh I think I can figure out table unk40
20:01karolherbst: I have like 30 entries or something
20:03karolherbst: there are two numbers at least
20:04martm: pfifo cache is what it's called in the nouveau docs, those are pretty docs btw
20:05martm: 64kb*1000, that is 64000/4 = 16000 word entries it can store
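The arithmetic behind the 16000 figure, as a short sketch. The 64 kB size and the 4-byte word are martm's numbers and are not verified against hardware documentation here; `word_entries` is a hypothetical helper.

```python
WORD_BYTES = 4  # assumed 32-bit words

def word_entries(cache_bytes, word_bytes=WORD_BYTES):
    """Number of whole words that fit in a cache of the given size."""
    return cache_bytes // word_bytes

print(word_entries(64 * 1000))  # decimal kilobytes, as in the chat: 16000
print(word_entries(64 * 1024))  # binary kilobytes: 16384
```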
20:05martm: it's just an amazingly well done architecture
20:08martm: all the stream of 3d commands goes through that cache, plus there is a separate pgraph CCcache, but the pfifo one is enough to fix multithreading and scheduling
20:10martm: one shader is maximum 1024 commands on fermi/kepler, radeonsi maybe too i believe, or that kind of hw
20:11martm: the rest of that cache is all for other 3d commands
20:15karolherbst: RSpliet: any idea here? https://gist.github.com/karolherbst/ae0d6c5dca8dd96808b17109f66d0868
20:17martm: took a sleeping pill and off to bed..it's somewhere in nouveau_mm.c where the cache domain is being mapped, as i understand those addresses correspond to real cache backing instantiation indexes
20:20martm: so first slot is bo address cache (.dataout(out), .tag(tag), .index(index))
20:20martm: in corresponding verilog, so you access the caches backing reg so
20:21martm: other possibility is that you tie another tag dynamically to there, and search for that tag from all of the cache
20:24martm: in that case in vlog95 it would work like this: cache[0:16000] (.dataout(out), .tag(tag), .index(index))
20:24martm: cache is the module, and in that modules body there is written the logic when that tag is found from which index
20:25martm: and it gives the data to logic back
20:27martm: i did it with words, but verilog uses bits as known
20:36szt: karolherbst: nice one. first gpu hangup with radeon
20:36Misanthropos: karolherbst, mmiotrace.log https://drive.google.com/file/d/0B5CS2T7TCI2YM2FoajhJMEdVdkk/view?usp=sharing
20:36szt: karolherbst: maybe the problem isn't my gpu but my cpu
20:37Misanthropos: and dmesg: https://bpaste.net/show/235dcae6a431
20:38karolherbst: szt: shouldn't be :D
20:38karolherbst: Misanthropos: so defconfig helped?
20:38Misanthropos: it did
20:38szt: karolherbst: I had a fan failure earlier this year.. my cpu went up to 100+C until I noticed
20:38karolherbst: Misanthropos: did you reclock the gpu a bit while tracing?
20:38karolherbst: szt: yeah well
20:38Misanthropos: yes... from high to medium low medium high
20:39karolherbst: szt: this is kind of, meh?
20:39karolherbst: Misanthropos: awesome, thanks
20:39szt: karolherbst: what do you mean?
20:39karolherbst: szt: well my CPU usually runs around 95-99°C
20:39szt: I overclocked it
20:39szt: by adjusting the multiplier
20:39karolherbst: I undervolted it
20:40szt: karolherbst: yeah and if it was my cpu I'd be unable to ssh after gpu failure, right?
20:43karolherbst: szt: well, sometimes the kernel crashes too if the gpu is messed up
20:43karolherbst: but usually you should be, yes
20:45karolherbst: szt: what cpu do you have?
20:45martm: so when you know the index cache is almost like 16000 additional regs for only instruction caching, i believe for shading there are 65535 regs + l2, texture has another set of texture cache, constant cache
20:46szt: karolherbst: AMD Phenom(tm) II X4 965 Processor
21:04martm: Tom^: actually i no longer need to be on the channel, here are enourmous shitheades like the ones violating me here in my country i am just disguised, and good luck, i don't need that scheduler anyways, and i need to break my promise i ain't gonna do that for asics radeon/kepler/intel at least i talked how it's done, and cheers.