22:11 gfxstrand[d]: I hate CTS tests that only fail when I run them as part of a full run. 😭
22:27 airlied[d]: I like the GL ones that only fail on really small windows with crazy glconfigs
22:29 gfxstrand[d]: I'm gonna run this whole caselist in valgrind.
22:29 gfxstrand[d]: To be clear, the whole shard passes reliably. It's only a full CTS run that reproduces it. :facepalm:
22:30 airlied[d]: oh that is worse
22:32 gfxstrand[d]: Maybe it's something to do with memory pressure?
22:32 gfxstrand[d]: Maybe if I force some stuff to GART, it'll fail?
22:39 gfxstrand[d]: GPU atomics not playing well with CPU caches sounds plausible.
22:48 gfxstrand[d]: flipping to GART doesn't change anything
22:50 karolherbst[d]: I'm still convinced it's something wrong on the kernel side
22:52 gfxstrand[d]: Maybe? But it's really damn consistent. Like, it's exactly the same set of tests that fail.
22:53 gfxstrand[d]: If it were a kernel issue, I'd expect it to flake.
22:53 gfxstrand[d]: Personally, I suspect it's a CTS bug
22:53 gfxstrand[d]: But it's a really subtle one if it is
22:54 airlied[d]: generally with those I speculate a lot and it always ends up being some mundane state interaction
23:00 gfxstrand[d]: Well, the test *is* bugged but not bugged in a way that should matter
23:00 gfxstrand[d]: Unless....
23:40 gfxstrand[d]: Well, if it only reproduces with full CTS runs, I guess that's what we'll do...
23:49 gfxstrand[d]: One run with appropriate logging should at least tell me if it's a CPU or GPU side state tracking bug.