13:10 pendingchaos: karolherbst: could the hang be due to an infinite loop in Payday2's SMAA shader?
13:10 pendingchaos: payday2/381.shader_test has a bunch of loops which depend on the results of textureLod()
13:10 pendingchaos: and I don't think your patches handle when textureLod() and such affect control flow
13:12 karolherbst: pendingchaos: dunno, it works without the flag set
13:12 karolherbst: ohhh
13:12 karolherbst: mhh
13:12 karolherbst: might be, yeah
13:12 karolherbst: good suggestion
13:13 karolherbst: right, so when the result is used within the same instruction inside the next loop iteration
13:13 karolherbst: :/
13:13 karolherbst: _that_ is annoying
13:15 karolherbst: okay, so then we kind of need a different approach
13:15 pendingchaos: I think changing "data.liveOnly = true;" to "data.liveOnly = !insn->asFlow();" would fix it
13:15 karolherbst: we have that loop max depth variable somewhere, which we might have to take into account
13:15 karolherbst: and iterate over the CFG like RA does
13:15 karolherbst: pendingchaos: sure, but we can do better
13:15 karolherbst: pendingchaos: but flow instructions aren't the issue here
13:16 karolherbst: not directly
13:16 karolherbst: you basically just have to follow the phis and see if you are able to get to another blacklisted instruction
13:17 karolherbst: I didn't want to follow the use chain on every tex instruction, that's why I created that table approach to save some CPU time
13:18 karolherbst: mhh
13:18 karolherbst: but it should work in theory
13:18 karolherbst: I think the issue is really if it hits the same instruction
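The idea of following the phis to see whether a texture result can reach another (or the same) texture instruction can be sketched as a plain reachability walk over uses. This is only a minimal illustration: `Kind`, `Insn` and `resultFeedsTex` are hypothetical names invented here, not the nv50/ir classes, and the real pass would work on the actual def/use chains.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified IR for illustration only (not nv50/ir).
enum class Kind { Tex, Phi, Other };

struct Insn {
    Kind kind;
    std::vector<std::size_t> users; // indices of instructions reading this result
};

// Returns true if the result of insns[start] can reach a texture
// instruction by following uses. The start instruction is not marked
// as visited up front, so a loop-carried phi that leads back to the
// same instruction is detected as well.
bool resultFeedsTex(const std::vector<Insn> &insns, std::size_t start)
{
    std::vector<bool> visited(insns.size(), false);
    std::vector<std::size_t> work(insns[start].users.begin(),
                                  insns[start].users.end());
    while (!work.empty()) {
        std::size_t i = work.back();
        work.pop_back();
        if (visited[i])
            continue;
        visited[i] = true;
        if (insns[i].kind == Kind::Tex)
            return true;
        for (std::size_t u : insns[i].users)
            work.push_back(u);
    }
    return false;
}
```

With a chain tex -> add -> phi -> tex (the same tex, one iteration later) the walk returns true; for a tex whose result is only consumed by straight-line code it returns false.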
16:08 karolherbst: imirkin: was there a fix for that nouveau_vma_del stuff? I noticed that I am able to trigger that quite reliably with OpenCL stuff
16:54 karolherbst: hum...
16:54 karolherbst: pendingchaos: didn't you have a patch for some constant folding stuff recently?
16:54 karolherbst: "nv50/ir: optimize near power-of-twos into shladd" ahh yeah, that breaks something for me :p
16:56 pendingchaos: what does it break?
16:56 karolherbst: 64 bit muls/mads ;)
16:56 karolherbst: but... you only get those with CL anyway
16:56 karolherbst: so no problem on upstream mesa
16:56 pendingchaos: I think the patch ignores 64-bit operations?
16:57 karolherbst: mhh
16:57 karolherbst: pendingchaos: https://gist.github.com/karolherbst/065773fd9be5eb1f55676d0c885db408
16:57 karolherbst: last mul and mad
16:57 karolherbst: "merge u64 %r122 %r140 %r139"
16:57 pendingchaos: unless the 64 bit operations are somehow lowered before ConstantFolding
16:57 * pendingchaos clicks
16:57 karolherbst: lowering happens after constant folding
16:58 karolherbst: maybe the issue is something else
16:58 karolherbst: but the constant folding result is wrong
16:58 karolherbst: ahh, yeah that "add u64 %r121d %r122 %r61d"
16:58 karolherbst: might not be even caused by your change
16:58 karolherbst: basically %r122 needs to be %r122d
16:58 karolherbst: then it would be fine
17:00 karolherbst: but I am sure it worked some months ago
17:01 pendingchaos: 'nv50/ir: move a * b -> a << log2(b) code into createMul()' might be it
17:02 pendingchaos: with it, 64-bit OP_MAD is turned into a OP_SHL and OP_ADD
17:02 karolherbst: yeah
17:02 karolherbst: would make sense
17:02 pendingchaos: unless OP_SHLADD supports 64-bit types
17:02 karolherbst: I didn't look into fixing it yet
17:02 karolherbst: I don't think so?
17:03 karolherbst: yeah, seems to be 32 bit only
17:09 karolherbst: pendingchaos: yep, that's it
17:09 karolherbst: "shl u64 %r122 %r95d 0x0000000000000002; add u64 %r121d %r122 %r61d"
17:10 karolherbst: ohh wait
17:10 karolherbst: something else does that
17:22 karolherbst: pendingchaos: okay... found it
17:22 karolherbst: it is trivial
17:29 karolherbst: imirkin, pendingchaos: https://github.com/karolherbst/mesa/commit/36f8c57ffd8b6e403b9af14862cb602736b0e8a3
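Why "%r122 needs to be %r122d" matters can be shown with plain integers. The sketch below assumes, as the log suggests, that the "d" suffix marks a 64-bit value and that the un-suffixed temporary is only 32 bits wide. The function names are made up for illustration, and nothing here claims to reproduce what the linked commit changes.

```cpp
#include <cstdint>

// a * 4 + c rewritten as (a << 2) + c, with the intermediate kept
// 64 bits wide (the "%r122d" case).
std::uint64_t madAsShlAdd64(std::uint64_t a, std::uint64_t c)
{
    std::uint64_t shifted = a << 2;
    return shifted + c;
}

// The same rewrite, but the shift result lands in a 32-bit temporary
// (the "%r122" case), so the upper half is lost before the add.
std::uint64_t madAsShlAddTruncated(std::uint64_t a, std::uint64_t c)
{
    std::uint32_t shifted = static_cast<std::uint32_t>(a << 2);
    return static_cast<std::uint64_t>(shifted) + c;
}
```

For a = 0x40000000 and c = 1 the first version yields 0x100000001, while the truncated one yields 1, because the shifted value no longer fits in 32 bits.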
17:33 karolherbst: pendingchaos: can xmad help with something like that? "mad u32 { $r9 $c0 } $r8 $r6 $r10"
17:33 karolherbst: or "mad (SUBOP:1) u32 $r7 $r8 $r6 $r7 $c0"?
17:33 karolherbst: dunno how carries are supported there
17:36 pendingchaos: oops
17:37 pendingchaos: fwiw, that change is R-b me
17:38 pendingchaos: the $c0 is the carry from the addition? I think XMAD has some x/cc flags
17:38 pendingchaos: even if they aren't useful, a separate add instruction could be used
17:39 karolherbst: pendingchaos: uhm, no, the mad
17:39 karolherbst: this is basically an add64 -> 2x mad32
17:39 karolherbst: I think...
17:40 karolherbst: or was it mul or mad?
17:40 karolherbst: mul
17:40 karolherbst: mul64 -> 2x mad32
17:41 karolherbst: ohhh no, I am silly
17:41 karolherbst: mul64 is lowered into quite a lot of stuff
17:41 karolherbst: it's an add64
17:42 karolherbst: or it is part of the mul64 lowering
17:42 karolherbst: don't remember
17:42 karolherbst: ayway
17:42 karolherbst: *anyway, we could end up with such mads
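How 64-bit integer ops end up as 32-bit operations chained through a carry can be sketched as follows. This is a generic illustration under the assumption that only 32-bit registers plus a widening 32x32 multiply are available; it is not the actual nv50/ir lowering, and `U64Parts`, `add64` and `mul64` are hypothetical names. The carry bit plays a role loosely comparable to the `$c0` flag in the quoted instructions.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit halves (hypothetical helper type).
struct U64Parts {
    std::uint32_t lo;
    std::uint32_t hi;
};

U64Parts split(std::uint64_t v)
{
    return {static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}

std::uint64_t join(U64Parts p)
{
    return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// 64-bit add as two 32-bit adds: the low add produces a carry,
// the high add consumes it.
U64Parts add64(U64Parts a, U64Parts b)
{
    std::uint32_t lo = a.lo + b.lo;
    std::uint32_t carry = lo < a.lo ? 1u : 0u; // set when the low add wrapped
    std::uint32_t hi = a.hi + b.hi + carry;
    return {lo, hi};
}

// Low 64 bits of a 64-bit multiply built from 32-bit pieces.
U64Parts mul64(U64Parts a, U64Parts b)
{
    std::uint64_t lowProduct = static_cast<std::uint64_t>(a.lo) * b.lo;
    std::uint32_t lo = static_cast<std::uint32_t>(lowProduct);
    std::uint32_t hi = static_cast<std::uint32_t>(lowProduct >> 32);
    hi += a.lo * b.hi; // cross terms only affect the high half (mod 2^32)
    hi += a.hi * b.lo;
    return {lo, hi};
}
```

The multiply needs several multiply-add steps for the high half, which is one way such carry-using mads could show up after lowering.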
17:55 pendingchaos: are 64-bit multiplications sort of common with OpenCL?
17:55 pendingchaos: IIRC the XMAD patches didn't handle them because they seemed non-existent in shader-db
17:59 pendingchaos: *didn't turn them into XMADs
18:02 karolherbst: pendingchaos: yes
18:02 karolherbst: all kernels have that
18:02 karolherbst: well
18:02 karolherbst: all kernels which use get_global_id() and those are pretty much all kernels
18:02 karolherbst: pendingchaos: also, pointer math
18:02 karolherbst: pointers are 64 bit
18:03 karolherbst: pendingchaos: not a big issue, I was just wondering
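The reason nearly every kernel contains a 64-bit multiply-add can be seen from how a global ID and an element address are formed. The sketch assumes a device where size_t and pointers are 64 bits wide; the function names are hypothetical and only mirror the arithmetic, not any driver code.

```cpp
#include <cstdint>

// OpenCL defines the global ID per dimension as
// group_id * local_size + local_id + global_offset.
std::uint64_t globalId(std::uint64_t groupId, std::uint64_t localSize,
                       std::uint64_t localId, std::uint64_t globalOffset)
{
    return groupId * localSize + localId + globalOffset;
}

// Indexing a buffer is pointer math: base + index * element size,
// again a 64-bit multiply followed by an add.
std::uint64_t elementAddress(std::uint64_t base, std::uint64_t index,
                             std::uint64_t elemSize)
{
    return base + index * elemSize;
}
```

For example, `elementAddress(base, globalId(3, 64, 5, 0), 4)` addresses element 197 of a buffer of 4-byte values.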
18:05 karolherbst: skeggsb: btw, you know what that one is? "sec2: unhandled intr 00000010"
18:16 pendingchaos: would Deus Ex: Mankind Divided shaders be interesting for nouveau's shader-db?
18:16 pendingchaos: it's 76.5 MB worth of shaders (fdupes was run)
18:19 karolherbst: pendingchaos: sure, just add everything :p
18:22 pendingchaos: I'll be pushing some Deus Ex: Mankind Divided and Dawn of War III shaders then
18:27 pendingchaos: (if no one objects btw (it's a huge amount of shaders for one game))
22:12 karolherbst: pendingchaos: mind scanning your installed games for "ARB_gpu_shader_int64"?
22:13 karolherbst: I really would like to have such shaders inside shader-db
22:17 ash_mishra: Hello.
22:19 pendingchaos: Dawn of War III, Dirt Rally, Hitman and Total War Saga Thrones of Britannia match
22:19 pendingchaos: looking at shader-db, the first three don't use ARB_gpu_shader_int64
22:20 pendingchaos: and I think Thrones of Britannia was a Vulkan game?
22:20 pendingchaos: yeah, it is
22:21 ash_mishra: I need some help. I am new to deep learning and neural networks. I wanted to test my neural network on my PC, which has an NVIDIA GeForce 940M graphics card. I am using Fedora 28 as my OS. I wanted to know if the default nouveau driver would work or if I have to install the specific NVIDIA driver for it. Could someone please help me?
22:21 karolherbst: pendingchaos: mhhh
22:21 karolherbst: ash_mishra: nope, currently no compute with nouveau
22:21 karolherbst: especially no neural network stuff
22:21 karolherbst: pendingchaos: mhh, interesting
22:22 karolherbst: maybe one can get those to use int64
22:22 karolherbst: like requires higher settings?
22:22 karolherbst: or some other condition has to be met
22:22 ash_mishra: karolherbst, so I have to install the driver. Ok. Thanks for your help :)
22:23 pendingchaos: seems Dirt Rally also has things like GL_ARB_transpose_matrix and GL_ARB_texture_env_dot3
22:23 karolherbst: pendingchaos: wow
22:24 pendingchaos: you can get stuff by grepping for "#extension " but it doesn't show int64 stuff
22:24 karolherbst: yeah...
22:24 karolherbst: maybe it makes sense to implement some of them to get more perf out of some games
22:27 pendingchaos: Hitman and Dawn of War III don't seem to have "#extension GL_ARB_gpu_shader_int64" too
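Grepping for "#extension " amounts to collecting the extension names from each shader's preprocessor directives. A minimal sketch of that scan over a single source string, with `listExtensions` being a hypothetical helper written for this example (it only handles the common `#extension NAME : behavior` spelling, not `# extension` with a space after the hash):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collects the names used in "#extension NAME : behavior" lines.
std::vector<std::string> listExtensions(const std::string &source)
{
    std::vector<std::string> names;
    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream words(line);
        std::string directive, name;
        // operator>> skips leading whitespace, so indented directives match too.
        if ((words >> directive >> name) && directive == "#extension") {
            // Handle "NAME: enable", where the colon is glued to the name.
            std::string::size_type colon = name.find(':');
            if (colon != std::string::npos)
                name.erase(colon);
            names.push_back(name);
        }
    }
    return names;
}
```

As noted in the discussion, a shader that never declares GL_ARB_gpu_shader_int64 this way will not show up in such a scan.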
22:27 karolherbst: pendingchaos: might be that most applications just do 64 bit integer math as this is part of core since.. 4.0 or something
22:28 pendingchaos: https://mesamatrix.net/#Version_ExtensionsthatarenotpartofanyOpenGLorOpenGLESversion seems it's still a separate extension
22:28 karolherbst: or wait
22:28 karolherbst: it isn't
22:29 karolherbst: yeah, I was mistaken
22:44 karolherbst: pendingchaos: Dying Light seems to use int64
22:44 karolherbst: or... wait, that's just the log... stupid