00:21 imirkin: skeggsb: can you submit your various patches that should probably be cc'd to stable, like the bar fix and the bios checksum fix?
06:26 gamester: Hello. I'm curious, is anyone funding this project or are the contributors just volunteers?
06:28 imirkin: RH has one developer full-time on it, most of the rest is volunteer
06:35 imirkin: every so often people will donate some HW.
06:38 gamester: Oh okay. Too bad there are no millionaires/billionaires interested in funding this kind of project :D - oh well, maybe in 10-20 years.
06:39 imirkin: perhaps you? :)
06:40 gamester: yeah, hopefully :)
06:42 s0be: I'm a dozenaire. Someday I hope to become a hundredaire and might be able to donate some hardware.
09:56 jennifeiro: listen guys, I'd still like to ask what your views are on the thing where an OpenCL kernel uses more than some amount of VGPRs,
09:56 jennifeiro: and because of that the number of warps has to be limited; I do not understand this concept at all
10:00 jennifeiro: aah right, I was not paying much attention when looking into the CodeXL tutorial
10:00 jennifeiro: VGPR pressure arises, and possible spills; will look again, that makes some sense actually
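The VGPR question above comes down to a register budget: the more registers each work-item needs, the fewer wavefronts fit on a SIMD at once. A minimal sketch in Python, assuming GCN-style figures (a budget of 256 VGPRs per SIMD lane, as mentioned later in this log, and an assumed cap of 10 resident wavefronts); the names are hypothetical, and it ignores allocation granularity and the separate SGPR and local-memory limits:

```python
# Assumed GCN-style limits; treat both numbers as illustrative assumptions.
VGPRS_PER_SIMD = 256      # VGPR budget available to each lane of one SIMD
MAX_WAVES_PER_SIMD = 10   # assumed hardware cap on resident wavefronts


def max_waves_per_simd(vgprs_per_work_item: int) -> int:
    """Return how many wavefronts can be resident given the VGPR usage.

    A result of 0 means the kernel does not fit in the register budget
    at all and would have to spill registers to memory.
    """
    if vgprs_per_work_item < 1:
        raise ValueError("a kernel uses at least one VGPR")
    return min(MAX_WAVES_PER_SIMD, VGPRS_PER_SIMD // vgprs_per_work_item)


if __name__ == "__main__":
    for vgprs in (25, 64, 128):
        print(vgprs, "VGPRs ->", max_waves_per_simd(vgprs), "waves")
```

Under these assumptions a kernel using 25 VGPRs still reaches the full 10 wavefronts, while one using 64 is limited to 4, which leaves fewer wavefronts to switch to while one waits on memory.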
10:04 karolherbst: mupuf: okay, with my new knowledge, we will be able to implement power capping for around 60% of all kepler and maxwell1 gpus :)
10:05 karolherbst: maybe 80%, but around 75% of the vbios expose the field I can use for this
10:09 jennifeiro: karolherbst: I have a couple of minor details to recap. before arriving at the patch state, I forgot about robclark's comments; the thing is, if due to VGPR pressure the waves/warps have already been limited,
10:09 karolherbst: RSpliet: I reordered the cards on trello for DRM power management, cards related to older chipsets are at the top, so your stuff is there as well
10:10 jennifeiro: whether I should then still mask the slower instructions
10:11 jennifeiro: but with performance counters that could be found out, i have to go to work in the garden now
10:41 karolherbst: mupuf: by the way, the network connection is getting laggy again
10:45 pmoreau: imirkin: I guess you had no further comment on the split 64-bit patch? Or you didn’t have time/forgot to have another look?
10:53 karolherbst: :O why did I just find this today: https://gist.github.com/karolherbst/b9bf8b838e3c154cdb2b3fce8d809f47
10:57 karolherbst: REing the pmu interface for the power readings :) this will be fun
11:01 karolherbst: hakzsam: any idea what this "Accounting Mode" might be?
11:01 karolherbst: I could imagine it is something perf counters related and the driver just monitors what is happening or so, but it may be something totally trivial though
11:02 karolherbst: "GPU Operation Mode" is "low double precision" on the titan
11:02 karolherbst: I guess there is also a "high double precision" mode
11:02 karolherbst: also interesting: https://gist.github.com/karolherbst/87e878e98d9a4931138e8b05b4528ed7
11:03 karolherbst: "GPU Shutdown Temp : 100 C" ! nice
11:03 karolherbst: that titan will be really helpful in the future
11:05 Calinou: how is power management for the GTX 960M (which, I presume, is a Maxwell 2) card going?
11:05 Calinou: suspend is unreliable with NVIDIA blob here, and using PRIME
11:05 Calinou: (I also need to log out and in to switch graphics cards…)
11:05 karolherbst: Calinou: bad, blame nvidia though
11:06 Calinou: my next laptop will likely not have a dedicated card due to this
11:06 karolherbst: Calinou: well, it is all implemented and all, we can't use it though
11:06 Calinou: even if I set the laptop to Intel-only, it'll hardlock on suspend 1/3 of the time
11:06 Calinou: (on resume actually)
11:08 RSpliet: Calinou: my Intel laptop does that, not using the NVIDIA GPU. Intel's driver is just broken
11:08 Calinou: RSpliet: I did have an Intel Atom netbook having actually reliable suspend
11:08 Calinou: but that was the only time I had reliable suspend on Linux
11:08 Calinou: that was in 2013 or so, on a netbook released in 2011
11:08 Calinou: very slow but it worked :^)
11:10 karolherbst: I have no problem with my haswell though
11:10 RSpliet: let me rephrase that: Intel Skylake suspend is broken
11:10 karolherbst: true
11:10 RSpliet: has little/nothing to do with the NVIDIA GPU
11:10 karolherbst: haswell is maybe the last generation you can use reliably ;)
11:17 Calinou: do you think the Linux graphics stack will ever implement proper crash recovery?
11:17 Calinou: Vista has had it for 10 years now
11:17 Calinou: RSpliet: I'm on Haswell laptop
11:17 Calinou: not Skylake
11:17 karolherbst: Calinou: well, there are only a few devs working on the graphics stack compared to windows
11:18 karolherbst: but yeah, maybe priorities could be shifted a bit
11:18 karolherbst: crash recovery isn't easy though
11:18 karolherbst: i915 has something somehow working though
11:18 karolherbst: it isn't perfect, but works for most cases
11:20 Calinou: what kind of business 13"/14"/15" laptop would you recommend for Linux btw? no dedicated GPU needed, high battery life is a must though
11:20 karolherbst: not lenovo
11:21 karolherbst: they are assholes, so I don't recommend them
11:24 karolherbst: Calinou: I usually use this site to find models which would suit me: https://geizhals.eu/?cat=nb
11:25 karolherbst: there are tons of filters
12:07 Calinou: German :|
12:13 karolherbst: Calinou: you can switch to european version though
12:13 karolherbst: and I actually gave you the european link
12:13 Calinou: I mean, the site language
12:13 karolherbst: yeah
12:14 karolherbst: it's english still
12:14 karolherbst: I usually use the filters to find models which would fit my needs
12:26 jennifeiro: well, I once again say that this performance thing is important, but it's not difficult at all, so it is slightly ridiculous for you to refuse to take that stuff on. I can handle that, it's only that I'm absolutely jammed though, and I'd end up being out of position
12:28 jennifeiro: some sort of common strike fist will be formulated against me by ... you know how it is, by some interesting persons, I'd have to accuse them; it would have been a lot easier if you had listened to me and we had divided the work and got it done very quickly
12:35 jennifeiro: any other then that, well good luck, as i did give the references that kernel geeks actually got the methods which are fastest under linux, nouveau implements this faster path
12:43 jennifeiro: so yeah nouveau should be inherently faster, i stand by this its only if you add a bit of code ontime for the performance path
12:44 jennifeiro: lots of delays with simpler things
12:54 karolherbst: meh, the power consumption thing is part of a 120 byte pmu response...
12:54 karolherbst: the heck
13:14 mwk: well
13:14 mwk: I got my 3.3V AGP machine back up
13:14 mwk: that was surprisingly easy
13:32 pmoreau: Cool! I figured out how remote port forwarding works and could connect on my test machine through my server.
13:33 pmoreau: Next step: configure user accounts for you, and you should be able to access it (though I’ll have to manually power it up).
13:40 karolherbst: pmoreau: :D I could have told you any time
13:41 pmoreau: It wasn’t difficult, I just spent a lot of time trying to figure out why it was saying that port forwarding was disabled, despite having enabled it. Turned out ports < 1024 need root authentication and I was connecting using a normal user account.
13:41 karolherbst: pmoreau: right
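The rule pmoreau ran into can be sketched as a small check: ports below 1024 are privileged, so a forwarded listening port in that range is only accepted when the remote login is root. This is a simplified Python model of that rule under those assumptions, not the SSH server's actual logic, and the function name is hypothetical:

```python
PRIVILEGED_PORT_LIMIT = 1024  # ports below this are privileged on Unix-like systems


def can_bind_remote_port(port: int, remote_user_is_root: bool) -> bool:
    """Model whether a remote forward may listen on the given port.

    Unprivileged ports are open to any user; privileged ones need root.
    """
    if not 0 < port < 65536:
        raise ValueError("port must be in the range 1..65535")
    return port >= PRIVILEGED_PORT_LIMIT or remote_user_is_root


if __name__ == "__main__":
    print(can_bind_remote_port(80, remote_user_is_root=False))
    print(can_bind_remote_port(8080, remote_user_is_root=False))
```

Picking a forwarded port of 1024 or above avoids the problem without logging in as root.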
13:50 jennifeiro: now what did i tell you that this sort of clocking mechanism is wrong, we have headlines like iphone 7 and samsung galaxy note 7, manufacturer said to switch them all off, cause they burn into peoples pockets, the design as said is incorrect
13:51 jennifeiro: chip will be fried unexpectedly
13:51 jennifeiro: We can also spread computationally intensive workloads around the chip to eliminate hot spots and balance temperature The methodology of interdomain
13:52 jennifeiro: one would need to use that technology, CDC and multiple phases of asynchronous clocks on programmable fabric, to avoid that
13:52 jennifeiro: https://books.google.ee/books?id=RBzOBQAAQBAJ&pg=PA177&lpg=PA177&dq=%22We+can+also+spread+computationally+intensive+workloads+around+the+chip%22&source=bl&ots=YySBiXDzs2&sig=nGhEvU-GAegkReEMX1v8rO6Dz5I&hl=et&sa=X&ved=0ahUKEwik9qPWhvHPAhWHFiwKHV5gA5EQ6AEIJTAA#v=onepage&q=%22We%20can%20also%20spread%20computationally%20intensive%20workloads%20around%20the%20chip%22&f=false
16:25 jennifeiro: I am aware of my issues though, no comments needed. I was given a chance to die or injure myself and get away with it, and I am injured, no other way was possible. hell with that, I am embarrassed that I spam, I should handle a lot more on my own, but my concentration has taken a major hit
16:26 jennifeiro: i can think really almost ok, but putting into practise is sometimes hard for me, i gonna see dentist tomorrow on couple of issues get solved
16:27 jennifeiro: i am self critical very much those days, we try to beat the last issues i understand anyone who has issues or not yet and have rough go etc.
16:30 jennifeiro: people knew how much I can take, and gambled on my nervous system until it had to blow up. that all happened a long time ago, I had very good options back then to progress into fantastic things. so tomorrow, for a couple of days, I'll do some work already
16:31 jennifeiro: on the systems i take my nvidia card and amd card and inspect the tools on binaries to collect the data more easily
16:49 jennifeiro: RSpliet: hi, yeah i intend to do the performance counter magic with appropriate gui tools yep, then yup the pressure does not probably break my mission, it seems like only gui interaction is needed
16:55 jennifeiro: I have to handwrite CUDA kernels, but generally on nvidia I referenced very precise data for kepler and maxwell and fermi, maybe even tesla; I already know the stuff on my card
16:56 jennifeiro: I only need to check how the VGPR pressure influences the masks
16:57 imirkin: jennifeiro: please stop polluting the channel logs with random rambling.
17:08 jennifeiro: imirkin: yeah ok, but that is for amd http://developer.amd.com/community/blog/2014/05/16/codexl-game-developers-analyze-hlsl-gcn/
17:09 jennifeiro: i am only working out this thing, how it affects, i logically think that masks and sched codes stay the same, but will check
17:10 imirkin: this channel is for nouveau, not AMD-related items.
17:10 imirkin: if this continues, it'll be like last time
17:11 jennifeiro: yeah I know, I think nouveau has 8 SIMD equivalents on my card, they were called... the perfkit tool should show similar stuff
17:11 jennifeiro: SMX or SMs or something
17:13 jennifeiro: i still have the kepler card, which method do you prefer, ill hook it up? sched codes or masks?
17:13 jennifeiro: month it should take , i do that as my first and last contribution
17:15 jennifeiro: I am interested in using nouveau and open source drivers, I only use the binaries to test some data because I am not skillful enough, then I go back to using nouveau if the performance is fixed, but it's an easy task
17:15 jennifeiro: i am prepared for reasonable task for my level of skills
17:19 jennifeiro: imirkin: i was not prepared that you had difficulties to understand the theory, how you hide latency, i just meant that i come here give the data, and someone implements it
17:33 jennifeiro: I personally read everything from the net. this mask based latency hiding, which is an easy deal, I read from the net too; I knew it on my own before I got the links that explicitly said so, but by assembling information together from different sources
17:33 jennifeiro: i.e i am also incapable of dreaming about how things work
17:41 jennifeiro: c ya tomorrow then, I just have a hard time believing, given the pace at which I have been reading, that no one else came across those entries. it's all good, bye for now
17:46 karolherbst: imirkin: mind replaying a trace for me to check if you also get the prime frame ordering issue?
17:47 imirkin: sure...
17:47 karolherbst: thanks
17:48 karolherbst: imirkin: did you have any issues with the frame ordering and prime?
17:53 imirkin: karolherbst: no
17:53 imirkin: was there a specific trace you wanted me to replay?
17:53 imirkin: or just in general?
17:55 karolherbst: I have a trace, but I'm compressing it
17:55 karolherbst: takes a while
17:57 karolherbst: the issue is really bad with this game/trace, so I am wondering if it's as bad for everybody. I also use dri3, so this might also change something
17:58 karolherbst: uhh, this issue is only there when I enable wait for vsync...
18:07 karolherbst: imirkin: https://drive.google.com/file/d/0B78S7GSrzebIbXhNTGtvWkFjLWs/view?usp=sharing
18:08 karolherbst: after loading the save
18:12 imirkin: downloading....................................
18:12 karolherbst: uncompressed it was like 650MB ;)
18:16 imirkin: 267M ... almost there
18:20 imirkin: ok, so what do you want me to do exactly?
18:20 imirkin: DRI_PRIME over DRI3?
18:20 karolherbst: just run the trace with prime offloading
18:20 karolherbst: doesn't matter
18:20 karolherbst: actually I would like to know if it happens with both DRI2 and DRI3
18:20 imirkin: any particular mesa version?
18:21 imirkin: (the options are 12.0.3 and whatever i have in my branch)
18:21 karolherbst: actually, I only care if it's reliably reproducible
18:21 karolherbst: shouldn't matter
18:23 imirkin: ok, well running it on my main gpu i don't get any oddness
18:23 imirkin: let's try prime
18:24 xaviergmail: Hey, Nouveau was making URXvt render vim really slowly
18:25 xaviergmail: I then switched to nvidia's proprietary drivers and sort of regret it
18:25 imirkin: karolherbst: hm, apparently that trace doesn't run so hot on my NV34 :)
18:25 karolherbst: imirkin: :D
18:25 imirkin: for some reason it has decided that DRI_PRIME should go to that one
18:25 karolherbst: the scene ran at like 30fps on mine
18:25 xaviergmail: How could I check why nouveau was so slow? Bad configuration on my part?
18:25 karolherbst: xaviergmail: what gpu do you have?
18:26 imirkin: let's try on the GM107 instead :)
18:26 xaviergmail: karolherbst: I have the nvidia 560 ti
18:26 karolherbst: so a fermi one :/
18:27 karolherbst: xaviergmail: you are stuck with lowest clocks there
18:27 xaviergmail: Why D:
18:27 karolherbst: well
18:27 karolherbst: because we don't know how to reclock the memory on those
18:27 karolherbst: (it isn't simple)
18:28 imirkin: karolherbst: well, there's a tiny bit of rendering fail on the GM107, but i'm pretty sure they're just nouveau-fail-related.
18:28 imirkin: karolherbst: i didn't see any out of order issues
18:28 karolherbst: mhh
18:28 karolherbst: k
18:28 karolherbst: with dri3?
18:28 xaviergmail: Well surely you can just turn knobs and dials until it works? :D
18:28 xaviergmail: Lol
18:28 imirkin: of course both my GK208 and GM107 never get above 10fps...
18:28 imirkin: karolherbst: yes, dri3
18:28 karolherbst: mhh, right you have no compositor as well
18:28 imirkin: xaviergmail: you can check why nouveau is slow by seeing where the cpu usage is going.
18:29 xaviergmail: Is there some sort of breakthrough newsletter I could subscribe to to be notified when you guys find the secret sauce
18:29 imirkin: xaviergmail: there's some issue where the visual bell on some GPUs in some terms happens to render super-slowly
18:30 xaviergmail: ahhhhhh I see (pun intended)
18:30 karolherbst: xaviergmail: well, I am sure everybody will tell about it when we get fermi memory reclocking working
18:31 imirkin: karolherbst: well that's just great - running with GALLIUM_HUD=fps crashes that trace on maxwell
18:31 karolherbst: huh
18:31 imirkin: and the GK208
18:31 karolherbst: ohh, you are right
18:32 imirkin: doesn't like the hud's draw
18:32 karolherbst: glXSwapBuffers+0x78
18:32 imirkin: why do i look at bugs :(
18:32 imirkin: ignorance is bliss...
18:32 karolherbst: something is fishy, with compositing disabled I still get visual errors due to frame ordering
18:32 xaviergmail: How could I go about testing performance in an opengl context?
18:33 karolherbst: imirkin: was there a way to create a video file out of a trace?
18:33 karolherbst: ohh, found it
18:35 karolherbst: the heck
18:35 karolherbst: with dump-images the issue isn't there
18:38 imirkin: probably a sync issue
18:38 imirkin: your gpu's too fast
18:38 karolherbst: yeah, most likely
18:38 karolherbst: gpu too fast, cpu too slow
19:36 karolherbst: hakzsam: did you find time to check my cse patch again?
19:41 karolherbst: huh, doesn't ST_DUMP_SHADERS work anymore?
19:42 imirkin: no, there's a new thing
19:42 imirkin: that's bigger and better :)
19:42 karolherbst: I see
19:43 karolherbst: I wish somebody would document stuff like that on the envvars.html page after adding it...
19:43 karolherbst: because you can't find it otherwise
19:44 karolherbst: except by spending half an hour finding it
19:44 chip_noob_: I remember reading somewhere that new Nvidia cards use signed binaries that stop use of free software on them, what is the situation with that? I searched and could not find the article I remember seeing.
19:45 karolherbst: chip_noob_: basically we can't really reverse engineer the vbios for gm20x gpus
19:45 karolherbst: and we need signed firmware for various things
19:45 karolherbst: so we can't use our own anymore
19:46 karolherbst: imirkin: MESA_SHADER_CAPTURE_PATH
19:48 karolherbst: imirkin: ohh I think it even takes care of SSO things and dumps every combination possible
19:48 imirkin: it does.
19:48 imirkin: hence "bigger and better"
19:48 karolherbst: k
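For reference, MESA_SHADER_CAPTURE_PATH is an environment variable naming a directory where Mesa writes out the shaders an application submits. A minimal Python sketch of preparing an environment for a captured run follows; the helper name is hypothetical, and creating the directory up front is a precaution on the assumption that Mesa does not create it itself:

```python
import os


def capture_env(capture_dir, base_env=None):
    """Return a copy of the environment with shader capture enabled.

    capture_dir is created if missing (a precaution, see the assumption above).
    base_env defaults to the current process environment.
    """
    os.makedirs(capture_dir, exist_ok=True)
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_SHADER_CAPTURE_PATH"] = str(capture_dir)
    return env


if __name__ == "__main__":
    # "shader-dump" is only an illustrative directory name.
    env = capture_env("shader-dump")
    print(env["MESA_SHADER_CAPTURE_PATH"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching the application or replaying a trace.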
19:49 karolherbst: hakzsam: I will add a status file to the shader-db repository listing every directory produced by MESA_SHADER_CAPTURE_PATH, so that we keep track of it, although there are other ways to find out
19:54 karolherbst: hakzsam: but basically we should put traces inside the traces ftp thing and be able to regenerate all the directories with a noop driver in the end
21:09 karolherbst: mupuf: still need the titan for REing the temperature things, will try to do it over the week
21:50 jennifeiro: imirkin: by any chance could you unban my home computer's IP address? I am using the tablet's 4G, which my mom wants to use tomorrow at work
21:53 jennifeiro: imirkin: anyways, I give the fermi wikipedia stuff, and AMD CodeXL and tarjan_sc.pdf. you see there are around 2560 work-items; that tarjan pdf says that with diverge on miss alone, it can do better with 4 warps than 16 warps can on those kernels, by 30%
21:53 imirkin: or you could find a different chan to annoy with random rambling.
21:53 jennifeiro: I'll help you do the math here, well 25 work-items is around
21:54 jennifeiro: I mean 640 work-items is around 25 VGPRs
21:57 jennifeiro: and well there is 256 total per simd unit on radeon, that means its a major optimization i mean if you have like
21:57 jennifeiro: heck i cant explain, i do ciggie and continue
22:26 orbea: With changes in mesa and dolphin-emu I can now play it full speed with almost everything I have tried. :) However one exception is the map menu in metroid prime which is very slow while the rest of the game runs fine, the dolphin devs said it uses some geometry shaders which some gpus / drivers can't handle. I guess this would be true for nouveau, so does this apitrace point out something obvious to improve
22:26 orbea: or is it trickier than that? http://ks392457.kimsufi.com/orbea/stuff/trace/dolphin-emu_metroid-prime-map.trace.xz
22:29 imirkin: nouveau does handle geometry shaders...
22:30 imirkin: geometry shaders are, generically speaking, very slow
22:30 orbea: err, I guess I meant can't handle well
22:30 imirkin: not as a result of anything nvidia does, but just as a result of how they work
22:30 imirkin: they're a serial point in a highly parallel pipeline
22:35 orbea: I guess it's amazing enough the rest of the game is fast :)
22:44 imirkin: orbea: any idea if it performs better on blob?
22:45 orbea: I believe it's supposed to, but that is second hand info and I have not tried myself
22:47 imirkin: wow. issue slot utilization = 10%
22:47 imirkin: that's ... not great :)