10:35 leberus__: mupuf: hi! Did you happen to have some time to check out v6? :)
10:41 mupuf: yes, I have an email open with some recommendations
10:41 mupuf: but I did not finish it
10:42 mupuf: basically, the only remark I have now is that you have a lot of 2-line functions which make no sense and could be folded into the giant switch statements
10:42 mupuf: leberus__: I already told you about this in v5, and I wanted to know why you did not do it
10:47 leberus__: Yes, that's what I was asking the other day. I removed the pwm_enable/set functions and put the commands inside the switch. But I wasn't sure about doing that for the rest of the functions. For me it was not clear if you wanted it just for the pwm_enable/set functions or for every function. I guess I misunderstood you. Sorry about that
10:58 mupuf: no need to be sorry about that, miscommunication is the norm ;)
10:58 mupuf: follow your reason and try to understand what reviewers may have wanted you to fix, that's the best you can do
10:59 mupuf: oh, and be ready to accept criticism and explain your work
10:59 mupuf: but you have 0 problem with that, it would seem :)
11:34 AndChat|499956: You're right. Well, as long as I don't get a real beating hehe
11:34 leberus__: I'll fix it in v7 then
11:35 leberus__: Soz for taking the time ;)! Would you send the email anyway? If not let me know and i'll send v7 this afternoon
11:41 mupuf: Well, send the v7. The email was in a state of flux because I wanted to apply your patches and got distracted
11:59 leberus__: Ok! I will do so
13:25 pmoreau: RSpliet: https://www.khronos.org/news/press/khronos-releases-opencl-2.2-with-spir-v-1.2
13:25 RSpliet: pmoreau: cool stuff... let me read the details tonight ;-)
13:25 RSpliet: meanwhile, I'm still stuck writing OpenCL 1.2 code because NVIDIA :-C
13:27 pmoreau: And because of a lazy Nouveau dev xD
13:27 RSpliet: guilty as charged...
13:27 pmoreau: Going to dive into the details of those new versions…
13:28 RSpliet: one version at a time :-D
13:28 pmoreau: Yeah
13:29 pmoreau: But the OpenCL CTS is also being opened and released. Even for the first OpenCL versions. \o/
13:34 tstellar: fyi, There is some glue code for running OpenCL CTS in piglt.
13:35 tstellar: *piglit
13:36 mupuf: tstellar: excellent :)
13:36 pmoreau: Ah, cool. But I need to revive a patch for compiling to SPIR-V in clover, rather than relying on an external compilation + clCreateProgramWithIL
16:09 jamm: hakzsam, imirkin: please take a quick look if possible: https://hastebin.com/uduhosefuj.bash
16:09 jamm: it's only one of the shaders. I'm facing some artifacts (will send screenshot in a bit)
16:10 jamm: if i can fix this, i'll get better at debugging the others so it'll be easier for hakzsam to review i think :)
16:13 pmoreau: jamm: "10: +sched (st 0xf wr 0x0 wt 0x1) (st 0xd wr 0x0 wt 0x1) (st 0xf wr 0x1 wt 0x3)" why wait on the 2nd barrier for the third insn? And on the 1st barrier for the 1st insn? (I could misremember how it works)
16:16 jamm: http://oi67.tinypic.com/348opyd.jpg <- screenshot after installing those changes
16:18 jamm: pmoreau: $r0 is used by the other two, and the instructions are of variable latency, so it cannot be predicted exactly which one of those 3 would be executed first
16:18 jamm: maxwell has a pipeline depth of 6 it seems
16:19 jamm: fwiw, the circle at the bottom is part of the background wallpaper, so that's not an issue
16:20 pmoreau: ` ipa $r3 a[0x94] $r0 0x0 0x1` only has a dependency on $r0, which is produced by `mufu rcp $r0 $r0`, so you should only need to wait for the barrier set by the mufu insn, no? Since mufu already waited for the previous ipa to finish
16:21 imirkin_: [afaik, it's free to wait on barriers that are already signalled]
16:24 jamm: pmoreau: you have a point. Actually my current assumption is that these instructions are variable latency so the exact moment when the write would happen is not predictable - i could be wrong though, as i'm new to this ^^
16:25 pmoreau: I am not that much more experienced tbh :-)
16:25 jamm: imirkin_: right, the maxas article mentions the same, and some of hakzsam's sched codes in mesa also wait on 0x3f in a couple of places, i.e. all barriers
16:25 jamm: pmoreau: no worries, i'm still learning ^^
16:25 * pmoreau needs to check how ipa works again
16:26 jamm: afaik ipa is interpolate, but i haven't looked too much into how it works yet. The dest is 32bit and src is memory location
16:26 jamm: and $r0, not sure what it's used for exactly but i've assumed it to be a read operand
16:27 jamm: ^ for e.g, ipa $r3 a[0x94] $r0 0x0 0x1
16:28 jamm: my guess is $r0 could be some sort of step size
16:28 jamm: ah, wild guess
16:28 pmoreau: I was reading it with $r0 being the dest, but as $r3 set by ipa is overwritten without being read…
16:29 jamm: yeah, i set a write barrier for that, and i'm also waiting for that same barrier as $r3 is being used elsewhere below
16:30 jamm: like tex nodep $r1 $r2 0x0 0x1 t2d 0x8 (here $r2 is actually $r2:$r3) and ipa $r3 a[0x84] $r0 0x0 0x1
16:30 pmoreau: Ah right, the address can use 2 regs
16:31 jamm: there's some assumptions i've made here, but so far they're not working hmm
16:31 jamm: although on the bright side, my first iteration on this patch gave much worse results
16:31 pmoreau: :-)
16:45 pmoreau: jamm: `mov $r2 $r3 0xf` are those variable latency? I would guess not since you have a stall of 0x6. So you shouldn't need to write to a barrier for those, even though that shouldn't create errors I would guess.
16:45 pmoreau: And what is the immediate for? Is that the mask of the active lanes?
16:54 jamm: pmoreau: well, it seems load/store instructions also require write bars
16:55 pmoreau: From/to memory/system values, but not from registers I think
16:55 jamm: you mean the 0xf?
16:56 jamm: not sure really, it does seem like a mask
16:56 jamm: (about the immediate thing)
16:56 pmoreau: yes, 0xf
16:56 jamm: pmoreau: yes you're right, they're not needed
16:57 jamm: btw, what are $pm{1..n} for?
16:58 pmoreau: Those ring a bell…
16:59 pmoreau: Maybe the performance counters that have been set up.
16:59 jamm: nvm, we're not using it but it does seem like a register
16:59 jamm: ah, i see
16:59 jamm: so it's a system value
17:00 pmoreau: I think I saw them in some of the perf counters query stuff from hakzsam
17:01 jamm: sec, restarting nouveau
17:05 jamm: pmoreau: removed bar usage in movs -> no effect (still good coz we now avoid using it anyways i think)
17:39 pmoreau: jamm: I'll take another look after dinner.
19:35 shtrb|work: Hi, Debian Sid running on Nvidia 940MX and nouveau (without any special kernel options or module options). I'm getting no freezing, flickering or crashes, but I do see messages regarding priv: HUB0: 6013d4 0000573f (1f408200) and DRM: resuming. Do I need to perform any action, or is it something that can be ignored?
19:37 towo`: shtrb|work, 940MX sounds like optimus, so the display is driven by intel
19:38 shtrb|work: ok .. and ?
19:38 towo`: nouveau is not really involved
19:39 shtrb|work: Oh, thanks, sorry, I have seen some of its messages in dmesg and was wondering
19:39 shtrb|work: thanks
19:55 imirkin_: well, those errors are from nouveau
19:56 imirkin_: however the HUB errors aren't extremely bad, just a bit odd. however on some GPUs they always happen and are, apparently, entirely harmless.
20:42 shtrb|work: Thank you imirkin_
20:42 shtrb|work: and towo`
21:18 leberus: mupuf: I've sent v7. This time I got rid of the "dummy" functions ;)
21:19 mupuf: yeepee!
21:20 mupuf: that looks much better, doesn't it? :D
21:23 mupuf: skeggsb: hey, can you pull leberus' patches please?
21:23 mupuf: you'll have to rework the commit messages to add drm/ and get rid of the capital letter, but that should be it
21:24 mupuf: and this should allow us to finally have proper temperature tracking on GPUs with i2c temperature sensors
21:30 leberus__: Yeah, it looks shorter and better :)!
22:38 pmoreau: jamm, hakzsam: Does this make sense? It's just removing some waits that might not be needed, so I doubt it will make it work. https://hastebin.com/equsozobod.bash
22:47 hakzsam: jamm: you only have to wait for all barriers at the beginning of a shader if it's a builtin
22:48 hakzsam: jamm: I'm sorry I still haven't looked at your mail...
23:00 hakzsam: jamm: just replied
23:00 hakzsam: btw, if you put wrong sched codes, you are most likely going to hang your GPU :)
23:01 hakzsam: pmoreau: why 'rd' on mufu?
23:03 pmoreau: hakzsam: Cause it has some read latency, so I interpreted that as "insert read barrier".
23:04 pmoreau: But since it's also the output from the same command, it's probably fine without it
23:06 hakzsam: pmoreau: you have to use 'wr' instead
23:07 hakzsam: wr is for RaW, while rd is for WaR
23:07 pmoreau: I think I had a wr as well
23:15 hakzsam: pmoreau: jamm, https://hastebin.com/jemuvuxoxo.bash
23:15 hakzsam: this one should work
23:16 hakzsam: mov $r1 $r3 0xf --> this needs 0x6 stalls too
23:17 pmoreau: hakzsam: Why not st 0x1? It doesn't need to wait on the previous move, does it?
23:17 hakzsam: yeah, you are correct
23:18 pmoreau: Same for the one on line 31
23:18 hakzsam: sure
23:29 _root_: hello
23:30 _root_: I am asking about bug https://bugs.freedesktop.org/show_bug.cgi?id=84721
23:31 _root_: is there any solution?
23:31 _root_: I really need a fix after all this time.
23:41 jamm: hakzsam: no worries! thanks :)
23:41 jamm: pmoreau
23:42 jamm: will apply the patch and test after work
23:43 jonan: hey guys, got a bit of an issue turning my dgpu on. cat /proc/acpi/bbswitch reports that my dgpu is off. trying to force it on with tee /proc/acpi/bbswitch <<<ON quickly says ON right back, but upon checking its status, it's still OFF. dmesg also shows that it's refusing to change the powerstate. any ideas?