00:10tobijk: mhrpm i dont see the problem right now to be honest
00:31tobijk: karolherbst: so my idea would be to find the sequential path first (if we hit a circular one first, save that one for later), first calculate live-ranges on the sequential one
00:31tobijk: easy concept, pretty hard to implement :/
00:33tobijk: or we implement a live-range data structure which we use to follow later on, so we can add live-ranges to nodes we have already visited
00:34tobijk: e.g a chain-list of live values
00:57karolherbst: tobijk: maybe
01:00tobijk: karolherbst: actually a chan-list would be better, gets rid of troublesome while (true) progs :D
01:16tobijk: karolherbst: i think i have a way to implement this, i think i hack something up tomorrow, but not now
06:34Aristar: https://www.nvidia.com/en-us/titan/titan-v only $2,999
06:34Aristar: like wtf... the titanx and titanxp were already overpriced. screw that,skipping titan this round
12:47karolherbst: Aristar: titans are overpriced by definition
12:48karolherbst: there is no reason to buy one for any kind of desktop usage except you make money from it
12:54karolherbst: pmoreau: also something broke compute shaders
12:59mupuf: karolherbst: that's a task for ezbench
12:59mupuf: and a piglit run ;)
12:59karolherbst: mupuf: on our opencl branch
12:59karolherbst: not in master
13:00mupuf: so what?
13:00mupuf: that does not change anything
13:00karolherbst: mhh, true
13:00mupuf: find a good and bad, and let the magic happen
13:11karolherbst: imirkin, hakzsam: ping on the maxwell sched patches
14:42karolherbst: pmoreau: okay, something is messed up in the new code, I wrote a compute shader which basically has the same thing: value defined in BB before loop and used in BB after loop and RA increases the live range inside the loop of that value
14:51pmoreau: karolherbst: Hey! Give me a second to read up the logs :-)
14:52karolherbst: pmoreau: you might want to merge this https://github.com/karolherbst/mesa/commit/2c599136ea34c04bb54cfc0d419a9d14d54336b4
14:54karolherbst: pmoreau: also regarding the loop issue: https://gist.githubusercontent.com/karolherbst/afb38441fa341cd4b06c5545f2a45e65/raw/9ecb705f681c2a19e56ad652fb590817472ba9c2/gistfile1.txt
14:54karolherbst: r38 is the value defined before the loop and used after
14:55karolherbst: and you see, that RA is doing the buildLiveSets again with a new sequence number
14:55karolherbst: now we just need to figure out how to get a new sequence number working ;)
14:57karolherbst: pmoreau: here is the opencl one: https://gist.githubusercontent.com/karolherbst/87660eced27ac6ead0f4f68a41f043f2/raw/3cf4f0c508442b03743249c73ea010cf12f2e91a/gistfile1.txt
14:57pmoreau: karolherbst: You wrote a compute shader doing the same thing, but how is it expressed in TGSI?
14:57karolherbst: and value r14d is used
14:57karolherbst: pmoreau: completely different
14:57karolherbst: with prebreaks and breaks and stuff
14:58pmoreau: Will merge that patch after doing some rebase
14:58karolherbst: pmoreau: https://gist.github.com/karolherbst/1669263f4bcda20205450a4fbc0d57b4
14:58karolherbst: well "completely" different
14:59karolherbst: I mean, SPIR-V doesn't seem to have the concept of loops and so on
14:59karolherbst: and if you look into tgsi to nvir how the edges are classified... I doubt we can easily do the same in spirv to nvir
15:00pmoreau: It does have a concept of loop, with the OpLoopMerge instructions, but either it’s something that’s only for structured CF in OpenGL/Vulkan SPIR-V, or SPIRV-LLVM does not implement it
15:01karolherbst: well anyway, it took me some time to come up with a shader, which generated such a value and made codegen have the same cfg
15:01karolherbst: iterating through the loop first
15:02karolherbst: I've added a new 'running buildLiveSets for sequence' print
15:02karolherbst: there you see what is done in tgsi to nvir
15:02karolherbst: and for 44 the 38 value liveness is copied over from BB:2 into the loop
15:05karolherbst: that loopNestingBound variable got my attention
15:06karolherbst: pmoreau: for TGSI BGNLOOP: if (loopBBs.getSize() > func->loopNestingBound) func->loopNestingBound++;
15:08pmoreau: What’s that loopNestingBound?
15:08karolherbst: if it is 2, we do live analysis twice
15:09karolherbst: if it is 3, we do it three times
15:09pmoreau: Got it
15:10karolherbst: uhm. -1
15:10karolherbst: if it is 1, we do it twice...
15:10karolherbst: and so on
15:10karolherbst: for that compute shader the value is 1
15:10karolherbst: for the cl kernel it is 0
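A toy illustration of why the number of live-analysis passes matters for a value like r38. This is a minimal sketch on a hypothetical four-block CFG; the block numbers, the value name `v` and the helper `live_in_after` are made up for illustration and this is not nvir's buildLiveSets. The value is defined before the loop and used after it, and the blocks are visited loop first:

```python
# Hypothetical toy CFG, not nvir data structures:
#   BB0 defines v (before the loop)
#   BB1 is the loop header, BB2 the loop body with a back edge to BB1
#   BB3 uses v (after the loop)
SUCCS = {0: [1], 1: [2, 3], 2: [1], 3: []}
DEFS = {0: {"v"}, 1: set(), 2: set(), 3: set()}
USES = {0: set(), 1: set(), 2: set(), 3: {"v"}}

# Assumed visiting order: loop body first, definition block last.
ORDER = [2, 1, 3, 0]


def live_in_after(passes, order=ORDER):
    """Run `passes` sweeps of backward liveness and return the live-in sets."""
    live_in = {bb: set() for bb in SUCCS}
    for _ in range(passes):
        for bb in order:
            live_out = set()
            for succ in SUCCS[bb]:
                live_out |= live_in[succ]
            # Standard dataflow equation: in = use | (out - def)
            live_in[bb] = USES[bb] | (live_out - DEFS[bb])
    return live_in


def blocks_where_live(passes):
    """Blocks that have v in their live-in set after `passes` sweeps."""
    result = live_in_after(passes)
    return sorted(bb for bb, values in result.items() if "v" in values)


for n in (1, 2, 3):
    print(n, blocks_where_live(n))
```

With this visiting order a single sweep only marks the value live in the block that uses it, and it takes three sweeps before it is live through the loop body. That is at least consistent with a pass count of loopNestingBound + 1 being too low when the bound is left at 0, but it says nothing about how nvir actually walks the CFG.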
15:12karolherbst: pmoreau: duh... loop_with_if: FAIL: 20 instead of 40
15:14karolherbst: mhh, for 2 I get 2
15:14karolherbst: lets see
15:16tobijk: karolherbst: did you solve the live-range propagation for loops already?
15:16karolherbst: beignet gives me 50
15:16karolherbst: tobijk: RA is fine, it is our fault
15:16karolherbst: first rule of RA
15:18tobijk: mh not sure if this always hold true :D
15:19tobijk: so what was the problem then?
15:20karolherbst: CPU says 50 is correct
15:20karolherbst: tobijk: we don't track loops
15:24karolherbst: okay, right, there is another issue
15:25karolherbst: this time within spirv to nvir
15:26karolherbst: take a close look at %r10
15:27karolherbst: mhh maybe this is fine, but still
15:27karolherbst: we shouldn't push it
15:30pmoreau: What’s wrong with %r10? And what do you mean by “pushing it”?
15:32tobijk: karolherbst: yeah the problem was that we did not track loops, but how did you solve the problem?
15:33tobijk: btw %r10 is unset if you don't "bra BB6" on %p7
15:33tobijk: is that always given?
15:34karolherbst: pmoreau: I think a phi node is resolved here, but in SSA form we actually get a loop of phi nodes. No idea if this is fine or not
15:35karolherbst: well, I need to check why the result is wrong for that kernel
15:35karolherbst: but I have to do it later
15:35karolherbst: and also think of a solution for the loop issue
15:44pmoreau: karolherbst: I rebased and pulled your changes in it. It has the program binary work that was merged in master. Soon will be able to try OpenGL SPIR-V.
15:50karolherbst: pmoreau: nice!
15:50karolherbst: pmoreau: I figure there are some OpenGL CTS tests for that?
15:52karolherbst: and we should come up with a plan on how to support cl kernels in the long term. Is this llvm backend even a thing we want to maintain? Or is khronos planning to do that? Are they even planning to upstream something like this?
16:02pmoreau: karolherbst: They were planning to upstream it: they tried once about 1-1.5 years ago and were told to change the design, there was a second round to talk about the design earlier this year IIRC.
16:03pmoreau: I’m not sure what’s the current status from Khronos’ POV, but the activity there almost completely dropped since Dec. 2016.
16:13pmoreau: karolherbst: I went on and renamed the file to nvir_from_spirv.cpp. Might want to rebase to avoid merge conflicts
16:18karolherbst: pmoreau: so we might want to check on this, because if it won't get upstreamed at any point, then... well we are kind of screwed here
16:19pmoreau: IIRC the argument from the LLVM guys, in the second thread earlier this year, was to have it out-of-tree, similar to what they went for with clspv
16:22pmoreau: The initial thread: http://llvm.1065342.n5.nabble.com/RFC-Proposal-for-Adding-SPIRV-Target-td82552.html
16:22karolherbst: pmoreau: maybe they add support for compute spir-v there?
16:22karolherbst: or, mhh
16:22karolherbst: maybe it can be used already?
16:23pmoreau: And here is the second thread: http://llvm.1065342.n5.nabble.com/llvm-dev-SPIR-V-SPIR-V-in-LLVM-td108441.html
16:25karolherbst: tstellar: do you know anything new you are able to share?
16:25pmoreau: clspv is about generating Vulkan SPIR-V out of OpenCL kernels. And Vulkan SPIR-V might not have all the features needed to cover everything doable in OpenCL.
16:27karolherbst: well, right, but maybe the plan is to add it to the same codebase? dunno
16:33pmoreau: I’m not sure what’s the conclusion of that issue, as it seems there was some misunderstanding.
16:35karolherbst: pmoreau: I am still struggling with one major issue though: should we move over and do nir to nvir or should we continue with the spir-v stuff. That is currently what I am not 100% sure of. I mean now I kind of know the state of the spir-v work and am in a better position to get an idea what might be the most worthwhile thing to do.
16:36karolherbst: and spir-v to nir is a thing already, even if it doesn't cover those opencl features
16:38pmoreau: I think NIR supports pointers now, but still not unstructured control flow (not 100% sure on the last one).
16:38karolherbst: well right
16:38karolherbst: but those things can be added as well
16:39karolherbst: in the end the major question is: do we want to move from tgsi to nir in the long term?
16:39pmoreau: I would say yes, especially if every other driver is doing the same.
16:40karolherbst: imirkin: your opinion on moving to nir for nouveau?
16:40pmoreau: karolherbst: Are you talking about getting rid of NVIR altogether, or just implement an NIR to NVIR pass?
16:41karolherbst: not the question
16:41karolherbst: depends on what works for us
16:41karolherbst: so the target is to mainly get rid of tgsi
16:41karolherbst: and I think nobody would mind if tgsi could be just removed at some point
16:42karolherbst: or we just do nir to tgsi :D
16:42karolherbst: then I would just do nir to nvir
16:44karolherbst: pmoreau: but nir wouldn't solve our cl kernel issue, right?
16:44karolherbst: just that we would do cl -> spir-v -> nir -> nvir
16:45pmoreau: Well, OpenGL/Vulkan SPIR-V can be transformed to NIR using their existing translation pass, so we could use clspv to compile OpenCL to SPIR-
16:47karolherbst: well, lets face it: we don't need any opencl solution in 6 months anyway... or at least no complete thing
16:47karolherbst: if we cover 90% of the opencl features currently in clover this way
16:47karolherbst: this should be fine enough
16:47karolherbst: and then we can still adapt to changes to the cl to spir-v thing
16:48karolherbst: so if we reduce our view on things we can upstream first, then ARB_gl_spirv is the main thing to consider here in the first place
16:48karolherbst: if we have this upstream, doesn't matter how, then we can go from there and support more spir-v stuff
16:49karolherbst: maybe even vulkan would be a solid target there
16:49karolherbst: or vulkan compute
16:50pmoreau: ARB_gl_spirv would be the first thing that could be upstreamed, considering the issues with generating SPIR-V code out of OpenCL code.
16:51pmoreau: If we didn’t have those issues, then it would be equivalent to OpenCL support.
16:51karolherbst: personally I say supporting spir-v is the main target here
16:51karolherbst: if we can do it through ARB_gl_spirv first, fine
16:51karolherbst: if we can do it through vulkan, even better
16:52karolherbst: supporting any kind of compute will highlight other issues and won't be something we support until 2019 anyway, not seriously
16:53pmoreau: And we can get it (SPIR-V support) either by directly supporting it, or via NIR; OpenCL SPIR-V will require some patching on NIR’s side.
16:53karolherbst: I mean, it might be still a good idea to have a spir-v to nvir pass. I don't really know how much information we are losing going through nir here
16:53pmoreau: I haven’t looked at NIR, so no idea.
16:53karolherbst: okay, maybe let's do it this way: we get that spir-v to nvir thing in a stage where we kind of support getting shaders through ARB_gl_spirv
16:54karolherbst: I don't think this will be so much work
16:54karolherbst: because most of the missing bits should involve opencl features anyway, right?
16:55pmoreau: Well, the biggest missing bit might be handling images in SPIR-V to NVIR.
16:55karolherbst: it only matters how big this is compared to writing a nir to nvir pass
16:55karolherbst: I mean, it would be fine from my side to just pass the CTS and piglit tests regarding ARB_gl_spirv
16:56karolherbst: and then we can work on other issues which we are aware of after it got merged
17:27karolherbst: pmoreau: do those spir-v versions use the same API by the way?
17:27karolherbst: it kind of looked like to be the same, but
17:27karolherbst: I also don't quite fully understand that spir-v versioning thing
17:29pmoreau: From OpenCL’s point of view: OpenCL 1.2, 2.0 and 2.1 implementations only need to support SPIR-V 1.0. OpenCL 2.2 implementations need to support SPIR-V 1.0, 1.1 and 1.2.
17:29pmoreau: Not sure how it works for Vulkan and OpenGL.
17:30pmoreau: https://www.khronos.org/registry/OpenCL/specs/opencl-2.2-environment.html (For OpenCL)
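The version requirements pmoreau lists can be written down as a small lookup. This only restates the message above; the table and helper names are hypothetical, and Vulkan and OpenGL are left out because the log does not say how they handle it:

```python
# Minimum set of SPIR-V versions an OpenCL implementation has to accept,
# as stated in the chat (not independently checked against the spec).
REQUIRED_SPIRV = {
    "1.2": ("1.0",),
    "2.0": ("1.0",),
    "2.1": ("1.0",),
    "2.2": ("1.0", "1.1", "1.2"),
}


def must_accept(opencl_version, spirv_version):
    """True if an OpenCL implementation of this version must take this SPIR-V."""
    return spirv_version in REQUIRED_SPIRV.get(opencl_version, ())


print(must_accept("2.1", "1.1"))
print(must_accept("2.2", "1.1"))
```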
17:32karolherbst: pmoreau: ahh, so it is the same spir-v, just like the memoryModel is set to a different value
17:32karolherbst: and those other rules
17:32karolherbst: I see
17:33pmoreau: Huh? The memorymodel value depends on the target (OpenCL,GLSL/Vulkan), not on the SPIR-V version.
17:33karolherbst: that's what I meant
17:33karolherbst: so in the end, we have the same code for everything
17:34karolherbst: and if we want to support the opencl stuff in spirv_to_nir, we just need to deal with those extra bits
17:35pmoreau: I would assume, but IIRC airlied or jekstrand were disagreeing with that view of things.
17:39pmoreau: *assume so
17:39karolherbst: wondering why
18:00imirkin: karolherbst: i have no interest in it. i did want to add a nir->nvir adapter to play around with
18:00imirkin: karolherbst: but nvir does a lot more than nir does, so i don't see nir being useful as anything more than a transport ir, like tgsi is
18:02karolherbst: imirkin: well I don't mind starting with our pre SSA stage when coming from nir
18:03imirkin: that's the way it has to be.
18:03imirkin: without rewriting a LOT of code
18:03imirkin: i spoke with jekstrand about how to do it, he gave me a rough idea
18:03karolherbst: I mean maybe it makes sense to port those pre SSA passes over to SSA and then to nir SSA to nvir SSA at _some_ point
18:03imirkin: no, it does not.
18:04karolherbst: except reducing compiler overhead, there is no real benefit in doing so, right?
18:04imirkin: it won't reduce compiler overhead.
18:04karolherbst: why not?
18:04imirkin: why would it
18:04karolherbst: I thought nir goes into a SSA form in any case
18:05imirkin: everyone sees nir as some savior from above and everything else is crap. i see nvir as being pretty great, and nir being just an extra hop on to a great compiler.
18:05karolherbst: well, I am mainly interested in nir, because it might become a universal IR in mesa that everything goes into
18:05karolherbst: so that we can have glsl/opencl/spir-v supporting all at once just by doing nir to nvir
18:06imirkin: well, it won't be for opencl without a ton of work
18:06imirkin: opencl c supports unstructurized control flow
18:06karolherbst: if we are lucky, it is not our tons of work
18:06karolherbst: there is some opencl to spir-v work
18:06imirkin: i think the thought was to have llvm structurize it before passing it in
18:06imirkin: and while that functions, i think that may lead to crap code
18:06karolherbst: just a matter of how to pimp nir to support this as well, which could be quite a lot indeed
18:06imirkin: either way, i don't want to be locked into something dumb
18:07imirkin: when we have all the systems to support it already
18:07imirkin: which is why i like direct spir-v input
18:07imirkin: that said, i have less and less time to work on all this, so ... what i think has less and less value
18:08karolherbst: well right, this is more to figure out if I should spend my time on doing spir-v to nvir or doing nir to nvir
18:08imirkin: otoh, until someone shows up who's willing to seriously take over maintenance, it'll end up being that way.
18:08imirkin: nir -> nvir is an afternoon of hacking
18:08imirkin: i just need ... an afternoon :)
18:08karolherbst: if it isn't so much work
18:08karolherbst: well then I could just spend a week on it and see how it goes
18:09imirkin: it's mostly typing
18:09imirkin: i.e. nir op a -> nvir op b
18:09imirkin: which is ... tedious but trivial
18:09karolherbst: mhh I see
18:09karolherbst: because that spirv to nvir is anything but trivial
18:09imirkin: + a bit of plumbing to ensure it all makes it there
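The "nir op a -> nvir op b" typing that imirkin describes is essentially a lookup table plus plumbing. A minimal sketch, assuming a plain string table: the opcode and type names are borrowed from NIR and nv50_ir, but the table, its contents and `translate_alu` are hypothetical and not taken from any existing adapter:

```python
# Hypothetical table: NIR ALU opcode name -> (nvir opcode name, nvir type name).
NIR_TO_NVIR = {
    "nir_op_fadd": ("OP_ADD", "TYPE_F32"),
    "nir_op_fmul": ("OP_MUL", "TYPE_F32"),
    "nir_op_iadd": ("OP_ADD", "TYPE_U32"),
    "nir_op_fmax": ("OP_MAX", "TYPE_F32"),
    "nir_op_fmin": ("OP_MIN", "TYPE_F32"),
}


def translate_alu(nir_op, dst, srcs):
    """Turn one NIR-style ALU op into a tuple describing an nvir-style op."""
    try:
        op, ty = NIR_TO_NVIR[nir_op]
    except KeyError:
        # The tedious part: every opcode that is not in the table yet.
        raise NotImplementedError(f"no nvir mapping for {nir_op}") from None
    return (op, ty, dst, tuple(srcs))


print(translate_alu("nir_op_fadd", "%r3", ["%r1", "%r2"]))
```

Anything that does not map one to one (control flow, textures, images) would need more than a table entry, which is where the plumbing comes in.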
18:10karolherbst: I even ended up doing a spir-v nextafter to nvir translation thing
18:10imirkin: and note that spirv_to_nir is non-trivial as well
18:10karolherbst: or implement sign or whatever
18:10imirkin: well - sign is pretty easy to implement
18:10karolherbst: and that's the part I hope for
18:10imirkin: just look at how it's done for tgsi
18:10karolherbst: so that there is everybody interested in having spirv_to_nir work perfectly
18:10imirkin: (there's a trick)
18:10karolherbst: and we only need to take care of the trivial parts
18:10imirkin: yes. except spirv_to_nir doesn't support opencl spirv
18:10imirkin: only vulkan spirv
18:11karolherbst: well, that is fine
18:11imirkin: and one of the reasons you're running into trouble is that you're trying to support opencl spirv. which is a bunch of effort.
18:11karolherbst: they pretty much have to deal with a lot of the same issues
18:11imirkin: also you guys are learning about compilers as you go along, which makes it harder
18:11karolherbst: and opencl stuff is hidden behind an OpExtInst opcode
18:15karolherbst: imirkin: opengl sign != opencl sign
18:15karolherbst: the opencl one is way more annoying
18:16karolherbst: because it specifies what to do on NaN
18:17karolherbst: but maybe it would still work, dunno
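A small sketch of where the two sign variants could differ. It assumes the OpenCL C rules as I recall them (1.0 for x > 0, -1.0 for x < 0, the zero returned with its sign kept, 0.0 for NaN) and assumes that the "trick" is the usual two-compare formulation (x > 0) - (x < 0); neither assumption is confirmed in the log, and both function names are made up:

```python
import math


def opencl_sign(x):
    """sign() following the OpenCL C rules as assumed above."""
    if math.isnan(x):
        return 0.0
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return x  # +0.0 stays +0.0, -0.0 stays -0.0


def compare_sign(x):
    """Two-compare formulation: (x > 0) - (x < 0), evaluated as floats."""
    return float(x > 0) - float(x < 0)


for value in (2.5, -3.5, 0.0, -0.0, float("nan")):
    a, b = opencl_sign(value), compare_sign(value)
    # copysign exposes the sign bit, which == cannot see on zeros.
    print(value, a, b, math.copysign(1.0, a) == math.copysign(1.0, b))
```

Under these assumptions the NaN case already works, since both comparisons are false for NaN and the result is 0.0. The only mismatch is the sign bit of the result for -0.0, so "maybe it would still work" looks plausible apart from signed zero.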
18:22imirkin: karolherbst: well, work out the various options
18:24karolherbst: imirkin: yeah, I think I can just work on nir to nvir a few days and then think about what might be the best way of action for us here, or at least something we might want to do as well
18:24karolherbst: we could even stick with tgsi as being the default and just use nir for everything we can't use tgsi for
18:26imirkin: well certainly won't be any switching of anything without in-depth comparisons
19:02levrano: Overheads can include scheduling work in order to give each thread its share of data to work with. Synchronizing threads to avoid incorrect computations due to data races also add overheads. Synchronization to avoid deadlocks or incorrect ordered access to shared data is also another source of overheads.
19:04levrano: so i worked a couple more days on arrays, from yesterday till today (research only), based on the material, when there is a bank conflict on read or write ports in one of the uniformly distributed regs, it will not honor the order, and puts the non-conflicting first, which is expected and needed behavior
23:39dviola: does nouveau work fine with modesetting?
23:39dviola: the xorg driver
23:39karolherbst: more or less, it depends a bit on the card
23:40dviola: I see, thanks
23:42dviola: I want to give it a try but I don't have nvidia hardware currently
23:43dviola: over time it should get better I guess?
23:49karolherbst: well, hopefully. There are some issues with nouveau when you are doing too much at the same time
23:49dviola: I see
23:49karolherbst: and with modesetting you just get another thing doing OpenGL (through glamor)
23:50karolherbst: but besides that it should be perfectly fine.
23:50dviola: I see, I'll give it a try sometime