11:05 ondracka: Hi, I'm looking into some r300-driver texture corruption under memory pressure, which seems to be related to the following scenario: 1) a tiled texture is created in VRAM, 2) it gets moved to GTT later when under memory pressure, 3) we sample from it, but the HW, as far as I can tell, cannot sample tiled from GTT, so it expects linear, which results in bad rendering.
11:07 ondracka: I could use some suggestions on what the fix should be here. I was reading the radeon winsys and, as far as I can tell, the driver has no control over the placements; it's just hints. And as far as I can tell I also cannot prevent winsys/ttm from moving tiled textures from VRAM to GTT?
14:17 MrCooper: ondracka: correct; "cannot sample tiled from GTT" for all non-linear tiling parameters, or just a subset?
14:23 agd5f: ondracka, my recollection of r300 tiling is kind of hazy, but IIRC, there is no problem using tiled on GTT, but there are no real advantages to doing so since it's mainly about maximizing caching and memory channels on VRAM.
14:30 MrCooper: I have vague recollection of some specific tiling parameters only working in VRAM, or something like that
16:47 ondracka: agd5f, MrCooper: to be honest I haven't tested whether all tiling modes fail from GTT, but this is what the code implies, and at least some are definitely broken (also, this doesn't seem to be discussed in the publicly available docs). Even if some worked, this would still not fix the bug IMO...
17:00 ondracka: I was looking at the kernel and there are some pinning mechanisms like radeon_bo_pin. Would exposing that to userspace be acceptable in theory (so the mesa driver can pin tiled bos to VRAM or fall back to linear)?
17:03 agd5f: ondracka, no. doing so would allow userspace to DOS the GPU.
17:07 ondracka: agd5f: sigh, so that doesn't really leave any good options here, correct?
17:10 agd5f: you could adjust the kernel driver to guarantee that tiled buffers are always in VRAM at CS execution time.
17:11 MrCooper: first of all would need to confirm that the HW can't sample with tiling from GTT, that sounds a bit fishy; the corruption you're seeing might be a different issue
17:13 agd5f: yeah, I don't see why the hw could care, tiling just reorders the access patterns
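The point that tiling only reorders accesses can be illustrated with a minimal sketch. The 8x8 tile size and the tile-major layout below are assumptions chosen for illustration, not the actual r300 micro/macro tile layout; the function names are hypothetical.

```python
def linear_offset(x, y, width):
    """Texel index in a plain row-major (linear) surface."""
    return y * width + x


def tiled_offset(x, y, width, tile_w=8, tile_h=8):
    """Texel index in a hypothetical tile-major surface.

    Assumes width is a multiple of tile_w. Texels of one tile are stored
    contiguously, and tiles themselves are stored row by row.
    """
    tiles_per_row = width // tile_w
    tile_x, in_x = divmod(x, tile_w)
    tile_y, in_y = divmod(y, tile_h)
    tile_index = tile_y * tiles_per_row + tile_x
    return tile_index * (tile_w * tile_h) + in_y * tile_w + in_x


def is_permutation(width, height):
    """True if the tiled layout visits every linear index exactly once."""
    offsets = sorted(tiled_offset(x, y, width)
                     for y in range(height) for x in range(width))
    return offsets == list(range(width * height))
```

Under these assumptions the same texels occupy the same amount of memory in either layout; only their order differs, and that holds regardless of which memory domain backs the buffer.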
17:18 ondracka: OK, I'll run more tests, but that's what I observe so far: if I sample from VRAM all is fine. If the bo gets moved to GTT later, sampling is broken.
17:19 agd5f: ondracka, can you do a test where you allocate from GTT originally and use tiling there?
17:23 ondracka: agd5f: will do, thanks for the suggestion.
18:05 fililip: is the gfx12 HiZ bug a pure hardware bug or can it be fixed in firmware?
21:10 ondracka: agd5f: You were right, my analysis was indeed bad. If I force (well, hint) everything to GTT, it still works.
21:14 ondracka: So in fact what I'm seeing is a lot of piglit flakes in texturing and mipmaps, like tex-miplevel-selection, texgen, tex3d, etc... They usually pass when run interactively, but running them in parallel with something like tex3d-maxsize, which allocates a lot of memory, makes them fail quite consistently.
21:14 ondracka: Disabling tiling completely makes the flakes go away.
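The reproduction described above (running a flaky test repeatedly while a memory-heavy test runs alongside) could be scripted roughly as follows. This is only a sketch: the function name is hypothetical, both command lines are placeholders supplied by the caller, and it assumes a non-zero exit status means the test failed.

```python
import subprocess


def count_failures_under_pressure(stress_cmd, test_cmd, runs=20):
    """Run test_cmd `runs` times while stress_cmd runs in the background.

    Returns how many runs exited with a non-zero status (assumed to mean
    failure). The stress command is restarted whenever it exits early.
    """
    def start_stress():
        return subprocess.Popen(stress_cmd,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)

    failures = 0
    stress = start_stress()
    try:
        for _ in range(runs):
            # Keep the memory pressure going for the whole measurement.
            if stress.poll() is not None:
                stress = start_stress()
            result = subprocess.run(test_cmd,
                                    stdout=subprocess.DEVNULL,
                                    stderr=subprocess.DEVNULL)
            if result.returncode != 0:
                failures += 1
    finally:
        # Stop the background process and reap it.
        stress.terminate()
        stress.wait()
    return failures
```

Comparing the failure count with tiling enabled and disabled, for the same commands and number of runs, would give a rough measure of how consistently the flake reproduces.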
21:17 ondracka: I based my previous analysis on the fact that when I was monitoring the bo domains I was quite sure I could relate the bad rendering to GTT placements, but this was probably wrong (also, now looking at it more, the actual placement is probably not visible to mesa at all, so I was probably just monitoring the initial domains).
21:19 ondracka: I was also playing a bit with tex cache invalidations, but that doesn't seem to help either. So any advice on how to debug further would be appreciated.