00:37 imirkin_: Lyude: btw, dunno if you have the requisite hw, but doesn't look like we support the 8gbps/lane dp rate in disp/dp.c
00:38 imirkin_: would need something that does DP 1.3 for that
00:38 imirkin_: "HBR3"
00:49 karolherbst: imirkin_: do you know when the mul post factor is applied?
00:49 imirkin_: 6pm
00:50 karolherbst: mul(mul(a, imm), b) == mul_x_imm(a, b)
00:50 karolherbst: like do we know that's correct or not?
00:50 imirkin_: one way to find out.
00:51 imirkin_: the opt only works for powers of 2
00:51 imirkin_: so it's basically just an extra exponent adjustment at the end, i suspect
00:52 karolherbst: and only for 2,4 and 8, right?
00:52 imirkin_: and 0.5, 0.25, etc
00:52 karolherbst: ahh, right
00:52 imirkin_: i don't think it's precise
00:52 imirkin_: since it could be that a*2 = infinity
00:52 karolherbst: I am mainly wondering if we can still do parts inside tryCollapseChainedMULs
00:52 karolherbst: right
00:52 imirkin_: but a*b*2 = not-infinity
00:52 imirkin_: etc
00:53 karolherbst: right.. I would assume that this post factor is applied after the actual mul
00:53 imirkin_: that's what would make the most sense
00:53 imirkin_: but i don't have concrete proof of that
00:54 imirkin_: and mwk's hwtests only worked on nv50 -- requires more effort to bring up compute on nvc0+ i guess
00:54 karolherbst: mul(mul(a, b), imm) == mul_x_imm(a, b) would work though maybe
00:54 imirkin_: prolly yeah
00:55 karolherbst: but honestly, I doubt it even matters that much, so maybe we can just disable tryCollapseChainedMULs for precise values and be done with it.
00:55 imirkin_: sgtm
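
The overflow concern raised above (a*2 may be infinity while a*b*2 is not) can be reproduced with ordinary floating-point arithmetic. The sketch below is only an illustration: it uses Python doubles rather than the GPU's 32-bit floats, the values of `a` and `b` are made up, and it assumes (as the discussion suspects but does not prove) that the hardware post factor behaves like an exponent adjustment applied after the multiply.

```python
import math

# Illustrative values only; the real case involves 32-bit floats on the GPU.
a = 1e308   # close to the largest finite double
b = 0.25
imm = 2.0   # a power-of-two immediate, as required by the optimization

# mul(mul(a, imm), b): the intermediate a * imm already overflows.
early = (a * imm) * b

# mul(mul(a, b), imm): the intermediate stays finite, and so does the result.
late = (a * b) * imm

# Assumed model of the post factor: an exponent adjustment after the multiply.
post_factor = math.ldexp(a * b, 1)

print(early, late, post_factor)
```

Under that assumption only the second ordering matches the post-factor form, which is why reordering the multiplies is unsafe for values marked precise.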
00:56 karolherbst: did you see the patch? Would push it if that's okay for you
00:56 imirkin_: no, i did not
00:56 imirkin_: don't see anything in email...
00:56 karolherbst: I just posted the link here in IRC
00:57 imirkin_: oh, missed it
00:58 imirkin_: yeah, r-b. definitely makes sense.
00:58 imirkin_: the whole purpose of tryCollapseChainedMULs is to reorder mul's
01:00 karolherbst: I will CC stable as well
01:00 imirkin_: no objections
01:05 karolherbst: for the gles3 deqp: Failed: 71/42653 (0.2%), not too bad
01:06 imirkin_: cool
01:06 imirkin_: lmk if you need help
01:06 karolherbst: will create a trello card with the list of fails first :)
01:06 karolherbst: there is one fail with gles2 though
01:06 karolherbst: but... it's one of those weirdo fails
01:06 karolherbst: never fails if you only run that one test
01:06 imirkin_: doh
01:06 karolherbst: always fails if you run the whole list
01:08 karolherbst: ufff
01:08 karolherbst: those fails
01:08 karolherbst: https://trello.com/c/ZnM10QHM/30-deqp-master-gles-30-master
01:10 karolherbst: I hope none of those failed due to my teximage fixes
01:12 imirkin_: grrrr... those texturegrad failures
01:13 imirkin_: i thought i fixed those
01:13 imirkin_: like ages ago =/
01:13 imirkin_: 0cf6320eb5eca1ea20906624ad5a46ca386e0aa6 + 05944a392ebe30f8a67bf70e1fbc4eb088fb67a0
01:14 karolherbst: I am sure there is some GLES speciality
01:14 imirkin_: unlikely.
01:14 imirkin_: it's just the ones that do manual.
01:14 imirkin_: the ones that fit into the regular "txd" op all seem to work
01:15 imirkin_: what gpu did this fail on?
01:15 karolherbst: gp107
01:17 karolherbst: the hell
01:17 karolherbst: guess what
01:17 karolherbst: ohh nope
01:17 karolherbst: my mistake. DRI_PRIME=0 was set, ups
01:18 karolherbst: thought it was passing if run alone
01:20 karolherbst: imirkin_: the GL test passes though
01:20 imirkin_: which one?
01:21 karolherbst: KHR-GL45.texture_cube_map_array.sampling
01:21 imirkin_: ah
01:22 imirkin_: the interesting thing is that it's not just cube that's failing
01:22 imirkin_: but also 3d
01:22 imirkin_: last time, there were questions about whether to normalize cube coordinates first or second
01:22 imirkin_: but with 3d (or 2dshadow) that's not a thing
01:24 karolherbst: that will be fun to debug
01:24 imirkin_: oh
01:24 imirkin_: hrm, no
01:25 karolherbst: uff, that lowering
01:25 imirkin_: good times, right?
01:25 karolherbst: first time I see shfl used :D
01:25 imirkin_: on the bright side, i do understand how it works
01:26 imirkin_: oh, i do have an idea
01:27 imirkin_: try setting i->derivAll = true
01:28 imirkin_: oh crap. we already do that.
01:32 karolherbst: on the bright side, your fix has 0 effect on that test
01:33 imirkin_: which one? from the commits a while back?
01:33 karolherbst: yeah
01:33 imirkin_: heh
01:33 karolherbst: reverted 05944a392ebe30f8a67bf70e1fbc4eb088fb67a0, but output image is the same
01:33 imirkin_: "yay"
01:34 imirkin_: can you put the shader through nvdisasm?
01:34 imirkin_: for like ... the sampler3d one
01:34 imirkin_: that should be the most straightforward one
01:36 karolherbst: https://gist.githubusercontent.com/karolherbst/971e43eb178c365e0cb516217cc86969/raw/01d24ac9e1ee940c31dfc3a945edb333ea69a2e8/gistfile1.txt
01:37 imirkin_: thanks
01:38 imirkin_: fun. TEXS.
01:38 HdkR: Woo TEXS
01:38 karolherbst: :)
01:38 karolherbst: ohhhh
01:38 karolherbst: mhhh
01:38 karolherbst: let me check something fast
01:38 imirkin_: mind turning that off?
01:38 imirkin_: i don't trust it
01:38 imirkin_: since it doesn't have the .NDV
01:39 HdkR: Oh snap, look at all those SHFL
01:39 imirkin_: (which is the "i have no idea what i'm doing flag")
01:39 karolherbst: I trust my TEXS patches :p
01:39 imirkin_: HdkR: manual textureGrad.
01:39 HdkR: and FSWZADD. this is silly
01:39 karolherbst: crap
01:39 karolherbst: okay... disabling TEXS fixes the test
01:39 imirkin_: karolherbst: TEXS doesn't support derivAll though
01:39 HdkR: Ah right, textureGrad sucks
01:39 karolherbst: imirkin_: ohhhh
01:39 karolherbst: okay.. I guess I should check against that as well
01:40 imirkin_: HdkR: so basically it goes around, lane by lane, sets up the other lanes to have the right values for the auto deriv to work out, etc
01:40 imirkin_: HdkR: there's a TXD in the ISA, but it only handles 2d
01:40 imirkin_: karolherbst: either that, or figure out if there's a NDV flag for TEXS
01:40 karolherbst: I am sure there is not
01:40 imirkin_: "no more bits"? :)
01:40 karolherbst: there is simply no space for that :)
01:40 karolherbst: yeah
01:41 imirkin_: HdkR: btw, do you have any idea what SAM and RAM do?
01:41 imirkin_: i just cargo-cult them, but ... no clue.
01:41 HdkR: Yes
01:42 imirkin_: care to share? :)
01:42 karolherbst: imirkin_: Passed: 14/71 (19.7%) :) of those fails
01:42 imirkin_: karolherbst: hopefully all the textureGrad ones?
01:42 HdkR: imirkin_: You know that I wish I could
01:42 imirkin_: hehe
01:42 karolherbst: yes
01:44 rhyskidd: pmoreau: any work-in-progress patches that you can point me at would be great
01:45 imirkin_: HdkR: doesn't appear in the ptx isa :(
01:45 rhyskidd: great laptop btw, still strong after a decade (although did replace hdd->ssd and new battery)
01:45 HdkR: imirkin_: yea :|
01:46 imirkin_: i'll just keep praying to the cargo gods, and let the good times roll
01:46 karolherbst: imirkin_: https://github.com/karolherbst/mesa/commit/5396924c2865089301963af35adb0cea179feca7.patch
01:46 imirkin_: karolherbst: check the other tex->tex flags
01:46 imirkin_: like liveOnly
01:46 imirkin_: and perhaps others
01:47 karolherbst: texs knows liveonly
01:47 imirkin_: ok
01:47 karolherbst: offsets are handled as well
01:47 imirkin_: i think you're good then
01:48 imirkin_: patch is r-b me
01:49 karolherbst: huh.. there is more though
01:49 karolherbst: no bindless of course
01:49 imirkin_: that's an internal flag
01:49 karolherbst: or...
01:49 karolherbst: right
01:49 imirkin_: i think you're all set
01:49 imirkin_: that's the only one
01:49 karolherbst: what about the .s flag?
01:49 karolherbst: uhm
01:49 imirkin_: no clue what that does
01:49 karolherbst: value
01:49 imirkin_: value?
01:49 karolherbst: .r vs .s?
01:50 imirkin_: .r/.s are the texture/sampler
01:50 karolherbst: right
01:50 karolherbst: and we only have the texture value for TEXS
01:50 karolherbst: sampler is fixed to 0
01:50 imirkin_: on kepler+, really
01:50 imirkin_: it's just a lookup into that constbuf
01:50 imirkin_: which then retrieves a handle with a TIC and TSC reference
01:51 karolherbst: okay, never actually looked into that much detail into all of that
01:51 imirkin_: yeah, don't worry about it
01:51 imirkin_: "it works"
01:51 karolherbst: :)
01:51 imirkin_: :)
01:51 imirkin_: it took me an embarrassingly long time to grok it all too
01:52 imirkin_: properly understanding textures took me about a year, i think
01:52 imirkin_: if not longer.
01:52 imirkin_: levels, layers, 3d, mipmaps, minification, interpolation, etc
01:52 imirkin_: wtf this gradient junk is all about, level selection, etc
01:55 karolherbst: those invariance fails are super odd
01:56 HdkR: imirkin_: karolherbst: Have you ever officially pestered about particular instruction documentation?
01:56 karolherbst: HdkR: uhm... not directly as those are the things we are usually able to figure out ourself
01:56 HdkR: SAM/RAM without any arguments are pretty opaque :P
01:57 karolherbst: where are those used?
01:57 karolherbst: or are those instructions
01:57 karolherbst: or just flags?
01:57 HdkR: It was in that shader dump for texturegrad
01:58 karolherbst: ohh what we call quadon/quadpop
01:58 HdkR: hah. it was sam/ram in that dump :P
01:59 HdkR: Although that is pulled from the official disassembler I guess
01:59 karolherbst: yeah
01:59 karolherbst: it's quite hard to grep for those names as well
02:00 HdkR: Not a great name to grep for. so tiny and can mean other things
02:00 karolherbst: PARAM contains RAM eg
02:00 HdkR: oof
02:02 karolherbst: I have wild guesses about those, but...
02:02 karolherbst: I am sure those aren't exactly related to doing cross lane stuff, they just make it more reliable I'd figure and prevent random fails? dunno though
02:03 karolherbst: uff SAMPLER contains SAM
02:04 HdkR: Three letters that'll drive you mad while grepping
02:05 karolherbst: it's not that bad
02:06 karolherbst: it's just cheating to "forget" those in the nvdisasm documentation
02:06 karolherbst: :p
02:09 karolherbst: hah
02:09 karolherbst: I found something :D
02:10 karolherbst: imirkin_: you'll love it
02:11 karolherbst: imirkin_: "(i.e. don't surround divergent DDX/DDY with SAM+RAM)"
02:11 karolherbst: uhm.. more context
02:11 karolherbst: "Disable support for non uniform quad derivatives (i.e. don't surround divergent DDX/DDY with SAM+RAM)"
02:11 karolherbst: random string I found in the libraries
02:13 HdkR: oh noes, the secrets
02:13 karolherbst: I am sure that's basically it, right?
02:13 HdkR: :shruggie:
02:16 karolherbst: mhh, but there isn't really much more
02:21 imirkin: oh, i see
02:21 imirkin: so basically it'll auto-do those if there's divergence
02:21 imirkin: and this is just a one-stop-shop for that
02:24 karolherbst: imirkin: right, this was just a line documenting some compiler switch ;)
02:24 karolherbst: there was another line, but less helpful: "Disable optimization to merge adjacent SAM+RAM blocks"
02:24 imirkin: heh
02:24 imirkin: that makes sense too
02:24 imirkin: the whole point of it is not to do SAM/RAM too much
02:24 karolherbst: right, but that doesn't explain directly what SAM/RAM do ;)
02:24 karolherbst: or indirectly
02:24 karolherbst: just to not use them as much
02:24 imirkin: well, i think i get it
02:25 imirkin: sorta
02:25 imirkin: it's roughly what i thought
02:25 imirkin: i.e. before doing a bunch of quadops, do a SAM
02:30 airlied: save/restore A mask :-P
02:30 airlied: active mask I assume
02:30 imirkin: sounds reasonable.
02:32 karolherbst: ohh, makes sense
02:32 HdkR: airlied: Does AMD have something like that? :)
02:37 airlied: you can copy the contents of the execution mask to registers if memory serves
02:38 airlied: the exec mask "controls which threads are active and execute instructions at a given point
02:38 airlied: "
02:40 Lyude: imirkin: at the moment I don't have the hw for it, no
02:40 imirkin: Lyude: oh well
02:40 imirkin: "DP8K" apparently... fancy stuff
02:41 imirkin: karolherbst: iirc i've seen the repeated_clear thing before
02:41 imirkin: unfortunately i don't remember if i tracked it down
02:46 Lyude: imirkin: btw though, it will be interesting to enable since we'll probably also need to add support for dsc
02:47 imirkin: just need access to the hardware, probably
02:47 imirkin: my TV is HDMI 2.0 only, so it's all you
02:47 imirkin: (and ben)
02:47 imirkin: [and my U2415's are highly unlikely to support anything of the sort]
02:50 HdkR: Hm? DSC is being implemented where exactly? :)
02:51 imirkin: the modesetting portion
02:54 imirkin: karolherbst: what about those invariance ones?
02:55 HdkR: ah. Need that 8k+DSC support for those upcoming monitors :)
02:56 imirkin: HdkR: you were handing them out earlier :p
02:57 imirkin: it took me ~10y to buy a TV in the first place. it'll be a while before i get an 8k one.
02:58 HdkR: haha
02:59 imirkin: i'll be waiting for HDMI 4.0
03:00 imirkin: perhaps there will be some combo USB-C style standard, but wireless
03:00 imirkin: where everything can talk to everything else
06:25 HdkR: imirkin: btw, when I switch over to the nvidia blob from i965 on this device that had a max resolution issue. `Screen 0: minimum 8 x 8, current 3840 x 2160, maximum 32767 x 32767`
06:26 HdkR: So sad D:
12:10 karolherbst: imirkin_: uff, the other invariant tests already fail on a TGSI level :/
12:12 karolherbst: not the entire expression is marked as precise :/
12:13 karolherbst: heh
12:14 karolherbst: that's a glsl ir fail
12:16 karolherbst: actually? dunno
13:53 karolherbst: imirkin: do you remember if you actually looked into those repeated_clear fails? those are basically the last fails I am willing to look into
14:20 imirkin: yeah, i don't remember
14:20 imirkin: sorry
14:22 imirkin: karolherbst: what about the msaa ones?
14:23 karolherbst: no idea if I care enough about msaa yet
14:23 imirkin: ah :)
14:23 karolherbst: mhh, this repeated_clear fail seems like a texture reference issue
14:23 karolherbst: first two draw calls use the correct texture
14:23 karolherbst: the later calls not
14:26 karolherbst: left intel, right nouveau: https://i.imgur.com/pBGO4IF.png
14:31 karolherbst: imirkin: any ideas? it seems like clear updates the texture, but somehow the new data aren't used in the draw call?
14:50 imirkin_: karolherbst: sounds like we need a flush of some kind then?
14:51 imirkin_: karolherbst: or maybe an engine stall?
14:51 karolherbst: might be
14:51 imirkin_: e.g. the clear might be happening async to the other draw
14:51 karolherbst: mhh
14:51 imirkin_: yeah, like the texture_barrier() thing
14:51 karolherbst: doubtful
14:51 imirkin_: since a clear is using the texture as a RT
14:51 karolherbst: there are 16 draw calls in total
14:51 imirkin_: and then we're immediately sampling from it
14:51 karolherbst: and only the first two colors appear
14:51 imirkin_: seems reasonable
14:51 imirkin_: if it's easy
14:51 imirkin_: edit the test source
14:51 imirkin_: and add a call to glTextureBarrier
14:52 imirkin_: or fake it in the driver
14:52 imirkin_: might not be hooked up in deqp
14:52 imirkin_: since it's only in extended GLES
14:52 imirkin_: unrelated -- is qapitrace ui super-buggy for you?
14:52 karolherbst: MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH should work, no?
14:53 karolherbst: imirkin_: yes, but only on fedora. on my gentoo machine it works. You mean the resetting layout?
14:53 imirkin_: you click on one event, and then it scrolls around, and you end up clicking on the wrong one?
14:53 imirkin_: and the layout reset too
14:53 karolherbst: highres?
14:53 imirkin_: 1920x1200
14:53 karolherbst: mhh
14:53 karolherbst: meant hidpi actually, but mhh
14:53 imirkin_: higher than 640x480 ;)
14:59 karolherbst: imirkin_: do you know if there is an ES extension for glTextureBarrier?
14:59 imirkin_: NV_texture_barrier
15:00 imirkin_: has an ES variant
15:00 imirkin_: i also wouldn't exclude the possibility of it becoming core in ES 3.2
15:00 imirkin_: karolherbst: in the alternative, look at what texture_barrier callback does
15:00 imirkin_: and just do it on every draw
15:01 imirkin_: (or even simpler, call it internally on every draw)
15:02 karolherbst: there is EXT_shader_framebuffer_fetch which sounds somewhat related to this
15:02 imirkin_: no
15:02 imirkin_: oh
15:02 karolherbst: :)
15:02 imirkin_: but it has a texture barrier call ;)
15:02 imirkin_: as part of the ext
15:02 imirkin_: for the non-coherent version
15:03 karolherbst: that ext isn't supported in mesa though
15:03 imirkin_: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c#n1007
15:04 imirkin_: just add nvc0->base.texture_barrier(nvc0) there
15:04 imirkin_: or copy what it does. wtvr.
15:06 karolherbst: that fixes it
15:06 imirkin_: yay :)
15:06 imirkin_: can you put up the trace somewhere? i want to see precisely what it does
15:06 karolherbst: okay
15:07 karolherbst: sent it via email
15:08 imirkin_: thanks
15:08 karolherbst: uhm
15:08 karolherbst: that fixes those msaa tests as well
15:09 karolherbst: or it's just random passes, I don't know
15:09 imirkin_: if only there were no downside to doing that.
15:09 karolherbst: right
15:09 karolherbst: there isn't much fancy stuff going on in the trace though
15:10 imirkin_: ok, so what it's doing is calling glClearBuffer, right?
15:10 karolherbst: yes
15:12 karolherbst: the texture is just attached to the framebuffer as a GL_COLOR_ATTACHMENT0, and glClear clears the color buffer, some rebinding and some drawing, but I think that's basically everything
15:13 imirkin_: crap, it always calls glretrace, which is the system one =/
15:13 imirkin_: glretrace: error while loading shared libraries: libprocps.so.6: cannot open shared object file: No such file or directory
15:13 imirkin_: ok, this one's my bad :)
15:14 karolherbst: :D
15:14 imirkin_: goddamn cmake
15:14 karolherbst: ohh, right, this problem. Yeah, that's annoying
15:15 karolherbst: usually I remove the lines from the cache file
15:28 imirkin_: ok, all built.
15:29 karolherbst: imirkin_: uhh, those preprocessor tests are checking if a shader with "#line +20" is compiling... sounds like super unimportant
15:29 imirkin_: the tests are wrong
15:29 karolherbst: ohh, okay
15:30 karolherbst: then all tests are fixed afaik :p we just need to write proper patches
17:00 karolherbst: imirkin: any conclusions so far? It seems like we don't have to use SERIALIZE, although I doubt that TEX_CACHE_CTL is free either
17:00 imirkin_: got distracted.
17:00 imirkin_: and will continue to be distracted for a while
17:00 imirkin_: sorry
17:01 karolherbst: no worries. I still have that nouveau_drm->handle corruption to take care of
17:01 karolherbst: for some reason it doesn't trigger with valgrind :/
17:49 karolherbst: imirkin_: this is so weird... only the handle gets corrupted :/
17:50 karolherbst: but it's nothing touching the handle directly
18:03 imirkin_: karolherbst: just for fun... can you try changing st_cb_clear.c to, right before the end, do "quad_buffers |= clear_buffers; clear_buffers = 0;"
18:03 imirkin_: pretty sure that'll have zero effect.
18:04 karolherbst: you mean the st_Clear function?
18:04 imirkin_: yeah
18:04 imirkin_: at the bottom it does "if (clear_buffers) stuff". put that code right above that.
18:04 imirkin_: i.e. i always want it to hit the clear_with_quad path
18:06 karolherbst: yeah.. that changes nothing
18:08 imirkin_: ok
18:08 imirkin_: that's good.
18:08 imirkin_: ;)
18:09 imirkin_: at least sanity exists in the world
18:13 karolherbst: imirkin_: aha! I got totally confused by the fact that the handle is a uint32_t
18:13 karolherbst: but... something writes a pointer into it
18:14 karolherbst: and due to struct alignment, the upper bits disappeared into unused memory :)
18:14 karolherbst: and the pointer is indeed valid
18:15 imirkin_: heh
18:18 karolherbst: 0x7ffff3242d20 <main_arena+224>
18:19 karolherbst: main_arena is some glibc thing
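
How a 64-bit pointer can hide inside a 32-bit handle is easy to sketch with a byte buffer. The layout below is hypothetical, not the real nouveau struct: it assumes a little-endian machine, a 4-byte `handle` at offset 0, 4 bytes of alignment padding, and an 8-byte field after that. The pointer value is the address quoted above, used purely as an example.

```python
import struct

# Hypothetical layout: uint32 handle, 4 padding bytes, then a uint64 field.
LAYOUT = "<I4xQ"
buf = bytearray(struct.calcsize(LAYOUT))
struct.pack_into(LAYOUT, buf, 0, 42, 0x1111222233334444)

# Something stores a full 64-bit pointer at the address of the handle.
pointer = 0x7FFFF3242D20
struct.pack_into("<Q", buf, 0, pointer)

# Reading the handle as uint32 only shows the low half of the pointer.
(handle,) = struct.unpack_from("<I", buf, 0)
# The high half landed in the padding, where nothing ever looks at it.
(padding,) = struct.unpack_from("<I", buf, 4)
# The neighbouring field is untouched, so nothing else looks corrupted.
(neighbour,) = struct.unpack_from("<Q", buf, 8)

print(hex(handle), hex(padding), hex(neighbour))
```

In this model only the handle appears to change, which fits the observation that nothing touching the handle directly is at fault.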
18:24 karolherbst: chipset = 0 is very suspicious as well
18:24 karolherbst: vram_size of 125206528 is also rather odd
18:25 karolherbst: I think the pointer to the nouveau_bo object is just plan corrupted
18:25 karolherbst: *plain
18:34 karolherbst: there are two suspicious things going on as well: "GL_OUT_OF_MEMORY in glTexStorage" and "GL_INVALID_VALUE in glTexSubImage2D(xoffset 0 + width 256 > 0)"
18:36 karolherbst: ohhh interesting
18:36 karolherbst: I think those are related to the black boxes actually
18:44 imirkin_: so one way that gm107+ is different
18:44 imirkin_: is that for vertex data, we will allocate scratch bo's, since they only take data in as vbo
18:44 imirkin_: while earlier gpu's you could send that down in the cmd buffer
20:22 mslusarz: karolherbst: have you tried gdb's "watch" command with that handle corruption?
20:23 karolherbst: mslusarz: it's from a webgl cts test
20:23 mslusarz: so?
20:23 karolherbst: that handle changes like tens of thousands of times before it gets corrupted
20:23 mslusarz: oh, ok
20:24 karolherbst: mhhh, maybe I can make a conditional watchpoint?
20:24 karolherbst: if value > 0x10000 or something
20:24 karolherbst: usually a pointer gets stored there
20:24 mslusarz: I think you can
20:25 karolherbst: I just have to install/remove watchpoints as a nouveau_bo object is created/destroyed
21:19 karolherbst: mslusarz: what I actually need is something I can add to the C code and mark the memory I want to get watched
21:20 karolherbst: sadly if I use valgrind, that bug doesn't trigger for unknown reasons
22:27 mslusarz: karolherbst: last answer from https://stackoverflow.com/questions/8941711/is-it-possible-to-set-a-gdb-watchpoint-programmatically seems like something you could use
22:39 karolherbst: mslusarz: yeah, saw that already, but that more or less involves writing your own debugging tool as well, or maybe that would work with gdb right away
22:40 mslusarz: the last one works with gdb
23:14 imirkin_: karolherbst: i'm going to switch my plugged-in boards around, so i should be able to investigate that repeated clear thing more directly
23:14 karolherbst: imirkin_: cool :)
23:14 imirkin_: unless you desperately wanted to work it out yourself
23:15 karolherbst: I am sure if we solve this, it also solves those stencil issues
23:15 karolherbst: imirkin_: ohh, I was mainly waiting on you as you seem to have an idea about all that
23:15 imirkin_: well, the texture cache isn't getting flushed when it should
23:15 imirkin_: either the test is doing something illegal
23:15 imirkin_: or we're not picking something up
23:15 imirkin_: i'm fairly sure what the test is doing is fine
23:15 karolherbst: it sounds sane enough actually
23:15 karolherbst: mhh, let me check something
23:16 imirkin_: so what we're not picking up is if any of the currently-bound textures have the GPU_WRITING flag at draw time
23:16 karolherbst: maybe it also fixes those weirdo stencil bugs from the gl CTS, that would be awesome
23:16 imirkin_: we only check at bind time
23:16 imirkin_: but ... that's what texture barrier is supposed to prevent
23:16 imirkin_: so there might be a bit more to it.
23:16 imirkin_: we may be able to get away with just checking at fb change time
23:17 imirkin_: like i said - need to play around with it ;)
23:17 imirkin_: yeah, always in favor of fixing weirdo bugs.
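
The difference between checking the GPU_WRITING flag at bind time and at draw time can be shown with a toy model. This is not nouveau code: `Texture`, `ToyContext`, `run` and the single-entry "cache" are all hypothetical names invented for illustration, and the model ignores everything about real hardware except the ordering of clear, flush and sample.

```python
class Texture:
    def __init__(self, color):
        self.memory = color        # what is actually stored in memory
        self.gpu_writing = False   # set when the GPU has written to it


class ToyContext:
    def __init__(self, check_at_draw):
        self.check_at_draw = check_at_draw
        self.bound = None
        self.cache = None          # copy the sampler reads from; may go stale

    def _flush_if_written(self):
        if self.bound.gpu_writing:
            self.cache = None      # stand-in for a texture cache flush
            self.bound.gpu_writing = False

    def bind(self, tex):
        if tex is not self.bound:
            self.cache = None
        self.bound = tex
        self._flush_if_written()   # the bind-time check

    def clear(self, tex, color):
        tex.memory = color         # the clear writes via the render target
        tex.gpu_writing = True

    def draw(self):
        if self.check_at_draw:
            self._flush_if_written()
        if self.cache is None:
            self.cache = self.bound.memory
        return self.cache          # the color the shader samples


def run(check_at_draw):
    tex = Texture("red")
    ctx = ToyContext(check_at_draw)
    ctx.bind(tex)
    seen = [ctx.draw()]
    for color in ("green", "blue"):
        ctx.clear(tex, color)      # no rebind between clear and draw
        seen.append(ctx.draw())
    return seen


print(run(False))  # bind-time check only
print(run(True))   # additional draw-time check
```

With only the bind-time check, every draw after the first keeps sampling the stale color, because the texture is never rebound between the clear and the draw. Whether the real check belongs at draw time or at framebuffer change time is exactly the open question in the discussion.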
23:19 karolherbst: mhh, seems to be something else
23:20 karolherbst: imirkin_: or maybe just when we clear and only if the color buffer is bound as a texture?
23:20 imirkin_: i suspect it's something along those lines.
23:21 imirkin_: but a little wider, i think
23:22 karolherbst: mhh, glFramebufferTexture2D has some interesting notes (not directly related, but): "Special precautions need to be taken to avoid attaching a texture image to the currently bound framebuffer while the texture object is currently bound and potentially sampled by the current vertex or fragment shader. Doing so could lead to the creation of a "feedback loop" between the writing of pixels by rendering operations and the
23:22 karolherbst: simultaneous reading of those same pixels when used as texels in the currently bound texture. In this scenario, the framebuffer will be considered framebuffer complete, but the values of fragments rendered while in this state will be undefined. The values of texture samples may be undefined as well."
23:22 karolherbst: I suspect there is some more hidden information regarding that scenario
23:22 imirkin_: that's what texture barrier is all about
23:23 imirkin_: glTextureBarrier() is to enable such feedback loops to work
23:23 karolherbst: right, just that this test doesn't do glFramebufferTexture2D, but glClearBuffer
23:23 karolherbst: and launches the shader after glFramebufferTexture2D
23:23 imirkin_: yeah, but internally it's all the same thing.
23:24 imirkin_: like i said ... i need to think about when the right time to do the barrier is
23:24 imirkin_: and that's why i had you do the barrier thing
23:26 karolherbst: imirkin_: ohh
23:26 karolherbst: "It is possible to bind a texture to an FBO, bind that same texture to a shader, and then try to render with it at the same time."
23:26 karolherbst: continue reading
23:26 karolherbst: "It is perfectly valid to bind one image from a texture to an FBO and then render with that texture, as long as you prevent yourself from sampling from that image. If you do try to read and write to the same image, you get undefined results. Meaning it may do what you want, the sampler may get old data, the sampler may get half old and half new data, or it may get garbage data. Any of these are possible outcomes."
23:27 karolherbst: but that's for gl
23:27 imirkin_: yeah, but binding/unbinding creates various barrier-style effects in GL as well
23:27 karolherbst: the fb is never unbound
23:28 karolherbst: only the destination one
23:28 imirkin_: in this case.
23:28 karolherbst: right
23:28 imirkin_: but i'm thinking about hypothetical cases
23:28 karolherbst: ohh, I see
23:28 imirkin_: note that st/mesa will happily do a draw for certain types of clears
23:28 imirkin_: so we couldn't really tell the difference at the nouveau level
23:28 imirkin_: so i'm treating glClear the same as glDraw in my mind
23:29 karolherbst: maybe this is our best bet to get the proper infos: https://www.khronos.org/opengl/wiki/Memory_Model#Framebuffer_objects
23:29 imirkin_: doubtful. the wiki pages are fairly inaccurate and imprecise.
23:33 karolherbst: I kind of get the feeling that glBindFramebuffer might be the sync point here
23:33 imirkin_: precisely.
23:33 imirkin_: which is why earlier i said ...
23:34 imirkin_: <imirkin_> we may be able to get away with just checking at fb change time
23:34 karolherbst:tries reading the spec
23:40 karolherbst: mhh, nothing explicitly stated alongside those commands sadly