22:11gfxstrand[d]: I hate CTS tests that only fail when I run them as part of a full run.
22:27airlied[d]: I like the GL ones that only fail on really small windows with crazy glconfigs
22:29gfxstrand[d]: I'm gonna run this whole caselist in valgrind.
22:29gfxstrand[d]: To be clear, the whole shard passes reliably. It's only a full CTS run that reproduces it. :facepalm:
22:30airlied[d]: oh that is worse
22:32gfxstrand[d]: Maybe it's something to do with memory pressure?
22:32gfxstrand[d]: Maybe if I force some stuff to GART, it'll fail?
22:39gfxstrand[d]: GPU atomics not playing well with CPU caches sounds plausible.
22:48gfxstrand[d]: flipping to GART doesn't change anything
22:50karolherbst[d]: I'm still convinced it's something wrong on the kernel side
22:52gfxstrand[d]: Maybe? But it's really damn consistent. Like, it's exactly the same set of tests that fail.
22:53gfxstrand[d]: If it were a kernel issue, I'd expect it to flake.
22:53gfxstrand[d]: Personally, I suspect it's a CTS bug
22:53gfxstrand[d]: But it's a really subtle one if it is
22:54airlied[d]: generally with those I speculate a lot and it always ends up being some mundane state interaction
23:00gfxstrand[d]: Well, the test *is* bugged but not bugged in a way that should matter
23:00gfxstrand[d]: Unless....
23:40gfxstrand[d]: Well, if it only reproduces with full CTS runs, I guess that's what we'll do...
23:49gfxstrand[d]: One run with appropriate logging should at least tell me if it's a CPU or GPU side state tracking bug.