04:32 rhyskidd: i put into envytools details of the turing mobile pci-ids
04:33 rhyskidd: now that they are shipping in limited quantities
13:58 karolherbst: imirkin: you figured out that indirect draw issue :)
14:27 imirkin: karolherbst: yeah, i'm not proud of the solution, but ... hard to care about those cases
14:28 karolherbst: we can think about more performant solutions if it starts becoming a perf issue for random applications I guess
14:28 imirkin: we had a short discussion in #dri-devel about it, no one can see either FIXED or DOUBLE getting used with indirect draws
14:29 karolherbst: yeah...
14:29 karolherbst: not even the CTS bothers testing it
14:29 karolherbst: it's only deqp, right?
14:29 imirkin: yea
14:29 imirkin: it's legal in desktop too though
14:29 imirkin: it's a legit test
14:29 karolherbst: is there anything else besides the copyimage stuff?
14:29 karolherbst: I mean, left to fix
14:29 imirkin: atomic counters, i should have a fix today
14:29 karolherbst: ahh, right
14:29 imirkin: and 3d image stuff, which ideally i'll fix as well
14:29 imirkin: but might take longer than today
14:30 karolherbst: yeah... that 3d image stuff is annoying to fix :/
14:30 imirkin: my plan is to just de-tile along the Z axis if a 3d texture is bound as an image
14:30 imirkin: and leave it like that forever
14:30 imirkin: not sure if that's a supported mode though ... hm
14:30 imirkin: we'll see.
14:30 karolherbst: I don't think it works on kepler
14:31 karolherbst: but maybe?
14:31 karolherbst: didn't spend enough time on it though
14:31 imirkin: one way to find out ;)
14:31 imirkin: i have a GK208 plugged in, so...
14:31 karolherbst: the thing I had only worked on maxwell1+ but maybe I just messed up a lot, dunno
14:31 imirkin: when you get some time, i'd like to walk you through doing some testing with maxwell on those texture things
14:32 karolherbst: right.. starting tuesday it should be better time wise on my side
14:32 imirkin: no rush
14:33 imirkin: it's highly unlikely i'll have plugged a maxwell in myself
14:33 imirkin: for the RGBA4 stuff, perhaps time to send it to the list? no one seems to be reviewing on gitlab...
14:33 imirkin: then i can have a look
14:33 karolherbst: but with the 3d image stuff fixed, we can actually start thinking about running the gl44 or gl45 CTS :)
14:33 karolherbst: imirkin: I talked with idr at FOSDEM
14:33 karolherbst: he wants to take a look
14:33 karolherbst: he just didn't know
14:34 karolherbst: I will assign it to him, so that should work out
14:34 imirkin: oh ok
14:35 imirkin: excellent
14:39 karolherbst: imirkin: but I could send this out as a separate patch: https://github.com/karolherbst/mesa/commit/b729cb3d681fce3584a6aada83cb930442fd533e
14:39 karolherbst: I don't think it hurts _that_ much
14:39 karolherbst: just more ES tests failing
14:39 karolherbst: which are fixed by fixing copyimage
14:39 imirkin: karolherbst: then there are 2 more unfortunate dEQP tests. one that causes an assert in glsl ir - i've filed a bug about it, and done some investigation, but it's a non-trivial fix
14:40 imirkin: and then a second test which reliably triggers a CTXSW_TIMEOUT for me
14:40 karolherbst: right, the compiler bugs
14:40 karolherbst: ohh
14:40 karolherbst: interesting
14:40 karolherbst: I like small tests triggering it
14:40 imirkin: dEQP-GLES31.functional.geometry_shading.query.primitives_generated_instanced
14:41 imirkin: could well be some bit of init missing on GK208
14:41 karolherbst: passes here
14:41 imirkin: i remember when it was being brought up we were missing something that led to insta-hang when using GS
14:41 karolherbst: uhm gp107
14:41 imirkin: so ... could be another missing bit
14:41 imirkin: or ... something else
14:42 imirkin: either way, might not be an easy fix
14:42 imirkin: perhaps skeggsb has a reasonable way to debug it
14:42 karolherbst: maybe
14:43 karolherbst: imirkin: oh btw, did you spend some time thinking about the TRAP I found a week or two ago?
14:43 imirkin: see commit 64db56233cd5edc68bdb1c7084b7a4eaa0ead00f in the 'nouveau' repo
14:43 imirkin: you mean the OOR_ADDR stuff?
14:43 karolherbst: no
14:44 karolherbst: the nested loop stuff
14:44 imirkin: oh, where you were getting an unknown trap?
14:44 karolherbst: yeah
14:44 imirkin: i have not ... but what was i supposed to be thinking about?
14:44 karolherbst: no idea, I thought you were going to
14:44 imirkin: oh, we were wondering if it was overflowing the stack or something
14:44 karolherbst: yeah
14:45 karolherbst: I don't know if RSpliet ended up with a final conclusion about that issue anyway
14:45 karolherbst: maybe he never did :)
14:54 karolherbst: imirkin: do you know how we can actually increase the stack size, just to verify if that helps?
14:54 imirkin: it's in the shader header
14:54 imirkin: (/ launch descriptor)
14:54 imirkin: there's on-chip and off-chip
14:55 imirkin: i don't have a clear model in my head of how it all works
14:56 karolherbst: I guess if that's the issue indeed, we might want to have a bigger stack the deeper nested loops go within a shader, but this was this 2n^2 kind of loop :/
15:03 karolherbst: imirkin: "The SPH field ShaderLocalMemoryCrsSize sets the additional (off chip) call/return stack size (CRS_SZ). Units are in Bytes/Warp. Minimum value 0, maximum 1 megabyte. Must be multiples of 512 bytes."
15:03 karolherbst: this might help
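[editor's note] The constraints quoted above (bytes per warp, minimum 0, maximum 1 megabyte, multiples of 512 bytes) can be sketched as a small helper that turns a requested size into a legal one. This is only an illustration: the function and macro names are hypothetical and not from Mesa or nouveau, and "1 megabyte" is assumed to mean 1 MiB (1048576 bytes).

```c
#include <stdint.h>

/* Hypothetical names; values taken from the quoted SPH description. */
#define CRS_SIZE_ALIGN 512u             /* must be a multiple of 512 bytes */
#define CRS_SIZE_MAX   (1024u * 1024u)  /* assumption: 1 megabyte == 1 MiB */

/* Round a requested off-chip call/return stack size (bytes per warp)
 * up to the next multiple of 512 and clamp it to the maximum. */
static uint32_t crs_size_round(uint32_t bytes_per_warp)
{
    /* Clamp first so the round-up below cannot overflow. */
    if (bytes_per_warp >= CRS_SIZE_MAX)
        return CRS_SIZE_MAX;
    return (bytes_per_warp + CRS_SIZE_ALIGN - 1) & ~(CRS_SIZE_ALIGN - 1);
}
```

For example, a request of 513 bytes per warp would become 1024, and anything at or above 1 MiB would be clamped to 1 MiB. How the resulting buffer is sized across warps and bound is not covered here.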
15:03 imirkin: Passed: 298/298 (100.0%)
15:03 imirkin: for atomic counters :)
15:03 karolherbst: nice!
15:19 imirkin: ok, time to try to slay the 3d image monster
15:20 imirkin: karolherbst: where's your patch for that?
15:20 imirkin: (i know you said it didn't help on kepler, but i still want to see what you did)
15:20 karolherbst: imirkin: https://github.com/karolherbst/mesa/commit/8137bafc7a702f72196613750fe1d429a7a1dcd5
15:22 imirkin: thanks
15:22 imirkin: ok cool. so tile_mode == 0 along the Z axis resolved it
15:22 imirkin: that's excellent
15:22 karolherbst: well, on maxwell :)
15:22 karolherbst: maybe I just messed up the kepler code
15:22 karolherbst: dunno
15:22 imirkin: yeah, but it should be doable on kepler
15:22 imirkin: would need a bit more work though
15:23 imirkin: this was only to help the non-layered-binding thing on maxwell
15:23 karolherbst: yeah.. I saw nvidia doing that in those tests as well, but no idea if they always do it
15:23 imirkin: the "regular" 3d case would have to be accounted for
15:23 karolherbst: with that patch I was able to pass every gl45 test at some point
15:23 imirkin: yeah, i think that's part of the solution
15:24 imirkin: i'm going to handle it a bit differently
15:24 karolherbst: (ignoring some randomly failing ones and the glx tests)
15:24 imirkin: but same general idea
15:24 imirkin: i.e. allowing a 3d texture to specify that it should not be tiled along the 3d dim
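[editor's note] The idea of not tiling a 3d texture along Z can be sketched as clearing the Z component of a packed tile mode while keeping X and Y. This is a sketch under assumptions, not the patch discussed above: it assumes a packed layout with 4-bit fields for X (bits 0-3), Y (bits 4-7) and Z (bits 8-11), and every name below is hypothetical.

```c
#include <stdint.h>

/* Assumed packed layout: one 4-bit field per axis (hypothetical macros). */
#define TILE_MODE_X(m) (((m) >> 0) & 0xfu)
#define TILE_MODE_Y(m) (((m) >> 4) & 0xfu)
#define TILE_MODE_Z(m) (((m) >> 8) & 0xfu)

/* Return the same tile mode with the Z field cleared, i.e. tile depth
 * set to its smallest value, leaving the X and Y fields untouched. */
static uint32_t tile_mode_flatten_z(uint32_t tile_mode)
{
    return tile_mode & ~(0xfu << 8);
}
```

With this layout, a tile mode of 0x123 would become 0x023. Whether the hardware accepts that mode for image bindings on a given generation is exactly the open question in the conversation.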
19:34 imirkin: hakzsam: any chance you have left-over mmt traces from the images bringup?
19:34 imirkin: i'm looking for kepler 3d image stuff
19:34 imirkin: i figured out some stuff -- the tiling can actually work there
19:35 imirkin: that UNK1C is what's important, but not sure how to set it
19:38 imirkin: naturally i have that for fermi
19:38 imirkin: sigh
19:39 RSpliet: karolherbst: No, I couldn't detect anything wrong with the shader code emitted. If my understanding of hardware is correct (... which I guess I can never be 100% sure of w/o stealing all those confidential docs off NVIDIA) there is no risk of unbounded stack growth.
19:41 RSpliet: As for the ShaderLocalMemoryCrsSize: when the size of the stack grows beyond the hardware stack size, and needs to eat into local/shared memory, we might have to carve out a section of such memory. I have no idea what that would entail, but somehow you'll need to make sure that whoever else might be using that memory doesn't get the chance to overwrite the stack.
19:47 RSpliet: Wait... "Local memory"? 1MiB? Does that mean... Does that mean it doesn't use the 16/48KiB of shared mem, but rather VRAM?
19:48 RSpliet: In that case you'll probably "just" have to allocate a buffer and map it somewhere.... and pray your L1 is efficient enough that you never have to pay the penalty of using it
22:04 rhyskidd: so i had some time this afternoon to clean up documenting the turing usb type-c interface
22:04 rhyskidd: there are merge requests into envytools available for review
22:05 rhyskidd: this is the new connector on the gpu used for emerging VR headset protocols (e.g. VirtualLink)
22:06 rhyskidd: is there something wrong with readthedocs? doesn't appear to have rebuilt the web hosted envytools docs in the last month+ ?
22:07 HdkR: Or you can just use it as a USB device if you don't have a VR headset yet :P
22:08 rhyskidd: recent commits haven't been updated since 5c8b90591ee45c8a1a6e86bfd1e9086cd73e5ea9
22:08 rhyskidd: that too
22:09 imirkin: rhyskidd: let me check, iirc i had access to it...
22:10 imirkin: hm, nope, i don't
22:10 imirkin: mwk: can you make me an admin on rtfd?
22:18 imirkin: rhyskidd: https://docs.readthedocs.io/en/latest/webhooks.html#i-was-warned-i-shouldn-t-use-github-services
22:18 imirkin: probably related
22:20 imirkin: [i see that's configured in the envytools repo]
22:23 imirkin: anyone on kepler + nvidia blob and feel like running some mmt traces for me?
22:30 mwk: imirkin: sure, coming up
22:30 mwk: [wrt rtfd]
22:30 imirkin: thanks
22:31 mwk: imirkin: your username?
22:31 imirkin: imirkin
22:31 mwk: done
22:31 imirkin: (my creativity knows no bounds)
22:31 mwk: just making sure
22:31 imirkin: thanks
22:38 imirkin: ok, in theory that's added
22:38 imirkin: i kicked off a build too
22:38 imirkin: but ideally next push it should auto-build as well
23:06 imirkin: did we ever figure out what MADSP does precisely on kepler?
23:49 rhyskidd: mwk, imirkin: thanks for fixing up rtfd