12:24 karolherbst: imirkin: soo... 3D images with float coords and linear filtering are broken with a mirrored repeat addressing :/ Anything coming into your mind what it could be?
12:27 imirkin: and by "images" you mean textures of course?
12:27 karolherbst: ahh yeah.. CL read_images :)
12:27 imirkin: probably a test being too picky? dunno
12:28 karolherbst: ohh.. maybe only -0, -0, -0 is broken..
12:28 karolherbst: https://gist.github.com/karolherbst/e09aec1d15e846c48101911512efc99c
12:28 karolherbst: mhh Expected (3.34953e-08,0,0,1), got (6.96225e-08,0,0,1),
12:28 imirkin: Actual integer coords used (i = floor(x-.5)): i0:{21, 12, 30} and i1:{21, 13, 30}
12:29 imirkin: Sample 32835: coord {0.995455(0x1.fdac38p-1),0.247170(0x1.fa342cp-3),0.996774(0x1.fe593p-1)} did not validate!
12:29 karolherbst: yeah.. the output is super confusing
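Note on the quoted test output: the "i0"/"i1" pairs are the two texel indices that linear filtering blends along each axis. As a rough sketch, the per-axis rule the OpenCL spec gives for CLK_ADDRESS_MIRRORED_REPEAT with CLK_FILTER_LINEAR and normalized coords can be written as below. The function name and the axis size of 22 are assumptions for illustration; the size is not stated in the log and was picked only because it reproduces the 21/21 pair reported for x.

```python
import math

def mirrored_repeat_linear(s, size):
    """Texel indices and blend weight along one axis for a normalized coord."""
    # Fold the coordinate into [0, 1]. Python's round() rounds half to even,
    # which matches the default behaviour of rint().
    s_folded = abs(s - 2.0 * round(0.5 * s))
    u = s_folded * size
    i0 = math.floor(u - 0.5)        # the "i = floor(x-.5)" from the test output
    i1 = i0 + 1
    weight = (u - 0.5) - i0         # fractional weight toward i1
    # Mirrored repeat clamps at the edges instead of wrapping around.
    return max(i0, 0), min(i1, size - 1), weight

# x coordinate of the failing sample, with an assumed axis size of 22
print(mirrored_repeat_linear(0.995455, 22))
# the -0 case mentioned earlier in the log
print(mirrored_repeat_linear(-0.0, 22))
```

With these inputs the first call yields the index pair (21, 21) and the second (0, 0): both taps land on the same edge texel, so the filter weight stops mattering there.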
12:29 imirkin: anyways ...
12:30 imirkin: if texelFetch works as expected
12:30 imirkin: then it's a hw issue
12:30 karolherbst: yeah.. it works for like everything else
12:30 imirkin: or some bitfield changed and we're setting the sampler up incorrectly
12:31 imirkin: if texelFetch fails then we're setting the texture up incorrectly
12:31 karolherbst: could be some weirdo 2D vs 3D difference
12:31 karolherbst: fun thing is, it does work for 2D
12:31 karolherbst: it's really just that one combination which messes up
12:31 karolherbst: mhh, I'll check the shader then, could be some mess up with coords
12:32 imirkin: can you try on other gens?
12:32 imirkin: if you're testing on turing, try with pre-volta
12:32 imirkin: could be that 3d is just messed up on turing
12:33 karolherbst: maybe
12:33 imirkin: for 2d or 3d?
12:33 karolherbst: 3d
12:33 karolherbst: thing is.. it works normally
12:33 imirkin: yeah, i mean, i must say - i was very surprised that there were no weird updates to texture parameter order on volta
12:33 karolherbst: log until first error: https://gist.github.com/karolherbst/085ef96470a0adcabce0afc762f015c0
12:33 imirkin: it's like the hw engineers decided to go on vacation instead of messing with our lives
12:33 karolherbst: :D
12:33 imirkin: so ... perhaps you missed something :)
12:33 karolherbst: probably
12:34 karolherbst: I think I saw the same issues on pascal though
12:34 imirkin: oh INTERESTING
12:34 imirkin: ok
12:34 imirkin: so this is unnormalized float coords
12:34 imirkin: and it only fails with some formats
12:34 karolherbst: ahh, yeah, might be
12:35 karolherbst: missed this one little detail
12:35 imirkin: specifically it only fails with f16
12:35 karolherbst: f32 as well though
12:35 karolherbst: but other mix
12:35 imirkin: not according to this log
12:35 karolherbst: yeah.. I skipped most of the later stuff
12:36 imirkin: ok, and this is normalized coords
12:36 imirkin: you know, this doesn't have to be one bug
12:36 imirkin: it could be multiple bugs!
12:36 karolherbst: yep
12:36 karolherbst: "read_image (normalized float coords, float results) *****************************"
12:36 karolherbst: yeah...
12:36 imirkin: anyways - quick theory is that with unnormalized + f16, it's treating the coords as f16 instead of f32
12:37 imirkin: (when in doubt, blame the hw.)
12:37 imirkin: certainly couldn't be OUR code that's wrong
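Note on the f16 theory: one possible reading is that the f32 coordinate gets narrowed to half precision somewhere on the way to the sampler. The sketch below only illustrates how much precision such a narrowing would cost at different magnitudes; it is not taken from the driver or the test suite, and the helper name and sample values are made up for illustration. It would not cover the other reading, where the f32 bits are reinterpreted as f16.

```python
import struct

def round_to_f16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The "e" format character is binary16; pack rounds, unpack widens again.
    return struct.unpack("<e", struct.pack("<e", x))[0]

# A normalized coord, a small unnormalized coord, and a large unnormalized one.
for coord in (0.75, 21.3, 1000.3):
    as_f16 = round_to_f16(coord)
    print(coord, as_f16, abs(coord - as_f16))
```

Half precision keeps 10 mantissa bits, so the spacing between representable values grows with magnitude: it is 1/64 between 16 and 32 and a full 0.5 between 512 and 1024. An error of that size in an unnormalized coordinate would shift the linear filter weights noticeably, which is at least consistent with the theory above.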
12:37 karolherbst: full log: https://gist.github.com/karolherbst/085ef96470a0adcabce0afc762f015c0
12:38 imirkin: unnormalized coords are a weird case
12:38 karolherbst: yeah well
12:38 imirkin: although it's accessible from GL with rect textures
12:38 karolherbst: 2D images work correctly :)
12:38 imirkin: oh yeah. with unnormalized, only 2d in GL
12:38 imirkin: no way to do it for 3d
12:39 imirkin: and iirc unsupported in nouveau
12:39 imirkin: hold on
12:39 karolherbst: ahhh
12:39 imirkin: are these being created as PIPE_TEXTURE_RECT?
12:39 imirkin: or PIPE_TEXTURE_3D? how is the unnormalized "info" passed through?
12:40 karolherbst: well, for 3D we use PIPE_TEXTURE_3D
12:40 karolherbst: I think..
12:40 karolherbst: let me see
12:40 imirkin: and how do we know it's normalized vs unnormalized coords?
12:40 karolherbst: yeah
12:40 imirkin: coz in nouveau, unless there are additional patches, we do unnormalized coords only for PIPE_TEXTURE_RECT
12:40 karolherbst: mhh, good question
12:41 imirkin: (and PIPE_BUFFER, but that's a bit of an edge case)
12:41 karolherbst: clover doesn't use PIPE_TEXTURE_RECT :)
12:41 imirkin: er
12:41 imirkin: s/er//
12:44 karolherbst: okay.. so pipe_sampler_state.normalized_coords is filled by clover
12:44 imirkin: i don't think we look at that ;)
12:44 imirkin: is that a new field?
12:44 imirkin: src/gallium/drivers/nouveau/nv50/nv50_state.c: * ! pipe_sampler_state.normalized_coords is ignored - rectangle textures will
12:44 karolherbst: we look at it
12:44 karolherbst: well
12:45 karolherbst: in nv50
12:45 karolherbst: nv50_sampler_state_create
12:45 imirkin: ok yeah, i guess we do
12:45 karolherbst: which nvc0 uses
12:45 imirkin: well, nve4+
12:45 karolherbst: yeah
12:46 imirkin: anyways ... it's a weird use-case. could be broken in hw.
12:46 karolherbst: or something changed and we never noticed it being broken
12:46 karolherbst: maxwell did a lot of changes there
12:46 imirkin: we have those docs though
12:46 imirkin: thanks to gnurou
12:47 imirkin: oh boooo
12:47 imirkin: only the TIC changed
12:47 karolherbst: ahh
12:48 imirkin: not the TSC
12:48 imirkin: and ... *some* of the tests work
12:48 imirkin: so it can't be THAT broken
12:48 karolherbst: mhhh
12:49 imirkin: welp, good luck =]
12:49 karolherbst: :D
12:49 karolherbst: yeah.. I'll figure it out somehow
12:50 imirkin: currently my nouveau-related efforts are being directed towards bringing up nv50 compute
12:50 karolherbst: cool
12:50 imirkin: unfortunately i haven't had a lot of time in the past few days
12:50 imirkin: but i'm hoping i'll have something this weekend
12:50 karolherbst: biggest thing is images, no?
12:50 imirkin: just a handful of "tricky" functions to write
12:50 imirkin: yes
12:50 imirkin: + a few "easy" but very important things
12:51 imirkin: like having ssbo's and images at the same time
12:51 karolherbst: once you are done I'll just throw CL at it and check how much is still broken :D
12:51 imirkin: sure
12:51 imirkin: i'm doing more of the GL-style bringup
12:51 imirkin: but hopefully applicable to CL-style
12:51 karolherbst: yeah..
12:51 karolherbst: the problems are the same, CL just.. allows/requires dumb combinations of stuff
12:51 imirkin: pmoreau has some CL-enabling patches, but i'm not really touching those paths
12:52 karolherbst: yeah.. I am mildly aware of what needs to be done
12:52 karolherbst: had CL running on g84 as well
12:52 imirkin: cool
12:52 imirkin: one thing about g84 -- no shared mem atomics
12:52 imirkin: much sad.
12:52 karolherbst: might have to expose nv50 as "DEVICE_CUSTOM" and ignore the restrictions
12:52 imirkin: nva0+ have it
12:53 karolherbst: there are other dumb reqs
12:53 imirkin: (or nva3+, will have to double-check)
12:53 imirkin: and nv50 doesn't have *any* atomics :)
12:53 karolherbst: but apparently nva3+ can do CL 1.1
12:54 imirkin: i'm kinda-sorta targeting ES 3.1 on nv50
12:54 karolherbst: fun
12:56 imirkin: although i suspect i'll run into some unfixable issues
13:54 karolherbst: ehhh.. we end up with TXL because the llvm to spirv translator is stupid
13:54 karolherbst: but that shouldn't matter
13:55 karolherbst: imirkin: or are you aware of tex.ll misbehaving in some cases?
13:55 karolherbst: the lod is 0.0 so it shouldn't matter.. but
13:59 imirkin: if lod is 0
13:59 imirkin: then we'll do lz
13:59 imirkin: i.e. if it's statically 0
13:59 imirkin: but no, i'm not aware of textureLod() misbehaving
13:59 imirkin: however if they changed the order of args on some gen ... :)
14:00 imirkin: although it'd be reflected in GL as well
14:00 karolherbst: ehh... let's say changing the order would be something I could rule out quite easily :D
14:00 karolherbst: but coords are before the lod anyway
14:00 imirkin: famous last words
14:00 imirkin: :)
14:01 imirkin: also if it's like textureLodOffset or whatever
14:01 imirkin: and also what goes into the first src and what goes into the second
14:01 imirkin: all up for grabs
14:01 karolherbst: not really :p
14:01 imirkin: and then txd comes along and has totally diff arg order for no good reason
14:01 imirkin: (ok, there's a good reason. doesn't make it less annoying.)
14:02 karolherbst: right
14:03 karolherbst: txd was what in nv speak?
14:03 karolherbst: ahh TXD...
14:04 karolherbst: ehhh.. yeah, TXD is annoying :D
14:04 imirkin: the one with the explicit derivatives
14:04 karolherbst: yeah
14:04 imirkin: i forget what it's called
14:04 karolherbst: in this case it really makes sense
14:04 karolherbst: yep, TXD it is
14:05 imirkin: but they like pack stuff into arg0
14:05 imirkin: into the array index
14:06 imirkin: karolherbst: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#n956
14:08 imirkin: let's just say that wasn't my *first* guess ;)
15:47 dviola: skeggsb: hi, I know that you are pretty busy, but did you get the chance to look at this yet: https://lkml.org/lkml/2021/2/16/467 ?
19:32 imirkin: skeggsb: are you no longer updating github.com/skeggsb/nouveau ?