00:04mareko: alyssa: no
00:04alyssa: uh oh
00:04mareko: alyssa: there is a way to make it perspective-correct using SYSTEM_VALUE_BARYCENTRIC_PULL_MODEL
00:05alyssa: but radv is ok... not doing that somehow?
00:05alyssa: (I'm trying to move it to common code, it passes the tests on Intel but..)
00:07mareko: both AMD drivers do it incorrectly by interpolating linearly within a quad
00:08alyssa: and that's.. ok?
00:08mareko: no
00:08alyssa: :-/
00:08alyssa: i so want to say "if it's good enough for radv it's good enough for anv" :p
00:09mareko: if W doesn't vary much between pixels, the error can be small
00:10mareko: CTS needs coverage
00:11alyssa: alright :/
00:19mareko: the test I would use is this: draw a quad into 16x16, pos.w must vary a lot, then draw the same into 2x2 and interpolate in the middle of that, then compare that result with the middle pixel of 16x16
00:21alyssa: well.. guess I'll just leave https://rosenzweig.io/0001-WIP-brw-use-common-barycentric-code.patch here then.. *sigh*
00:24mareko: it's better to do it incorrectly in the common code than doing it incorrectly in each driver separately
00:25mareko: however, ddx/ddy should be at the beginning of the shader because they need helper invocations
00:25alyssa: right, yeah
00:25robclark: lumag: btw for infra issues, you might want to ask on #freedesktop
00:25alyssa: radv seems to have lower_interp_center_smooth for correct with _model
00:26alyssa: I don't really understand the code, though
00:27mareko: that might actually be the correct code
00:28robclark: karolherbst: where is it decided that BuiltInLocalInvocationId and friends are 64b for cl? Is that in llvm? I'm wondering if we can early on realize that the hw size is only 32b and turn them into something sane. Otherwise int64 lowering turns `half y = (half)get_local_id(0);` into something funny/sad..
00:28alyssa: mareko: inclined to believe it, but I'm still trying to understand load_barycentric_model
00:28mareko: so I was wrong, RADV does it correctly it seems
00:29karolherbst: robclark: it's size_t in CLC
00:29alyssa: comparing https://docs.vulkan.org/refpages/latest/refpages/source/BaryCoordPullModelAMD.html to https://docs.vulkan.org/refpages/latest/refpages/source/BaryCoordNoPerspAMD.html
00:29alyssa: can I derive _model by just doing frcp on the nopersp centre one (with the appropriate 1-x-y extension to vec3)?
00:30karolherbst: robclark: but system val lowering should be able to drop stuff to 32 bits
00:30robclark: hmm
00:30karolherbst: like local_id should just be u2u64(local_id)
00:30robclark: https://www.irccloud.com/pastebin/l3W94fUV/
00:31robclark: the question is why the u2u64 doesn't get optimized out
00:31karolherbst: robclark: I'd argue you can nuke the u2u64 there via an opt
00:31karolherbst: int64 not well optimized?
00:31mareko: alyssa: persp-correct interpolation interpolates x, y, and w separately and the does x/w, y/w; linear interpolation does x/w, y/w first and then interpolates the results
00:31alyssa: robclark: opt_algebraic is a missing a ton of 16/64-bit stuff
00:31robclark: maybe a missing algebraic opt..
00:31robclark: yeah
00:31karolherbst: ohh I also have a fun one
00:32alyssa: mareko: right. thanks
00:32karolherbst: i2i64(ushr(a, 5lsb something)) -> u2u64(ushr(....)) I have to upstream that one
00:33mareko: alyssa: I think barycentrics are already divided by W, so interpolating those would result in linear interpolation, which I think is why pull_model gives us i/w, j/w, 1/w, which is the original x, y, w (I think), so we interpolate those, and then compute x/w, y/w to get persp-correct i, j
00:39mareko: pull_model can be used for both linear and perspective interpolation - linear interp would divide by W first and then interpolate, while persp interp would first interpolate and then divided by W
01:02lumag: robclark: good idea
01:20Company: any Intel person here? I'm running into a problem with everyone's favorite format VK_FORMAT_G10X6_B10X6R10X6_2PLANE_444_UNORM_3PACK16 aka P010, where the SAMPLED_BIT is set, but neither COSITED_CHROMA nur MIDPOINT_CHROMA are set as supported
01:20Company: I've read the code, and that's because the format is marked as VIDEO but not as YCBCR
01:21Company: and I'm not sure if that's an oversight and it should be marked as YCBCR, or if it should not set the SAMPLED_BIT
01:21Company: because if a multiplane format can be sampled, it must set either of the chroma flags
01:32Company:decides it's probably because of the R10X6 which would require extra support for sampling, and prepares a patch that marks the format as non-samplable
01:37Company: or it's because the Vulkan spec doesn't explicitly forbid sampling without ycbcr, only for VK_IMAGE_ASPECT_COLOR_BIT but not for VK_IMAGE_ASPECT_PLANE_0_BIT etc