00:00 robclark: craftyguy: is it somehow possible that freedreno tools get built without building freedreno itself? (Although in theory that should show up w/ fdperf as well.. but there could be some luck involved, since fdperf has a couple extra dependencies)
00:03 robclark: craftyguy: ok, a shot in the dark, but does this help:
00:03 robclark: https://www.irccloud.com/pastebin/bbvej7B2/
00:04 anholt_: yeah, that looks necessary.
00:06 craftyguy: robclark: I'll give it a try
00:06 robclark: thx.. I wonder if we don't enable tools in x86 gitlab CI builds? It is the kind of snafu that would be nice to catch pre-merge..
00:08 craftyguy: robclark: intel CI is using -Dgallium-drivers=freedreno, that should build the driver right?
00:08 robclark: yeah.. but I guess it could come down to build order?
00:08 robclark: maybe intel CI has too many cores to build mesa in parallel :-P
00:12 craftyguy: heh
00:14 craftyguy: it's only doing a modest -j18
00:17 idr: imirkin_: An approach that could use optimizable wrappers would be viable without any driver opt-in. ;)
00:18 imirkin_: idr: yeah, but it wouldn't benefit the application
00:19 idr: Based on all the crap I've been doing in NIR, it seems like it could... but it would not be guaranteed.
00:19 idr: That was my plan for implementation, anyway.
00:19 imirkin_: idr: i certainly wouldn't stop you from implementing something like that, but imho it's not that useful. i think Mystral said as much too.
00:20 craftyguy: robclark: that patch seems to fix the build issue I hit. you can add my t-b
00:21 robclark: thanks
00:22 imirkin_: idr: the benefit of using the hardware thing is that it all Just Works (tm) with zero overhead. sticking if's for all mul's is going to be much heavier.
00:26 idr: But I think you're thinking about the problem wrong. You don't want to make mul produce zero, you want to prevent mul from seeing Inf.
00:26 idr: Most of the stuff I see is something like:
00:26 imirkin_: no, i want mul to produce zero. that's what the nvidia flag is.
00:26 imirkin_: and i can guarantee that if these games got tested on anything, they got tested on nvidia
00:27 idr: if (x <= 0) { copysign(y, FLT_MAX); } else { y * rsq(x) }
00:27 idr: In literally thousands of shaders.
00:27 idr: Or similar stuff.
00:27 imirkin_: probably some pattern that got copied. i'm talking about original DX9 code
00:27 robclark: craftyguy: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3939
00:27 gitbot: Mesa issue (Merge request) 3939 in mesa "freedreno/computerator: fix build dependency" [Freedreno, Opened]
00:27 idr: Yes. These are ports of DX9 games to GL by really, really talented people.
00:27 imirkin_: hehe
00:28 imirkin_: by the time it's GLSL, the dream is over.
00:28 imirkin_: i mean, you can try to detect it and convert to a different mul op
00:28 imirkin_: but i was mostly thinking of wine, which is taking in DX9 byte code and generating GLSL shaders
00:29 imirkin_: in DX9, rsq(x) takes the abs of x iirc?
00:29 Mystral: yes
00:29 idr: Sounds right.
00:29 imirkin_: so it's just y * rsq(x) there
00:30 idr: I typed rsq instead of inversesqrt(). :)
00:30 idr: (Because I was lazy.)
00:31 imirkin_: and i forgot that it wasn't rsq in glsl :)
00:31 imirkin_: but anyways, i'm not going to fight this ... my ability to sway people's opinions here is like 0 for 10
00:32 imirkin_: gonna keep it that way, no need for 11
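For illustration, a minimal C sketch of the two approaches discussed above: the explicit guard idr quotes from ported DX9 shaders, and a DX9/NVIDIA-style "legacy" multiply where a zero operand forces a zero result (the assumed semantics behind the flag imirkin mentions). Names and exact behaviour are illustrative only, not any driver's implementation.

#include <float.h>
#include <math.h>
#include <stdio.h>

/* IEEE multiply: 0 * Inf is NaN, which is what breaks y * rsq(x) at x == 0. */
static float mul_ieee(float a, float b) { return a * b; }

/* Assumed "legacy" multiply semantics: if either operand is zero, the
 * result is zero, so y * rsq(0) stays well-defined without a guard. */
static float mul_legacy(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}

/* One common form of the guard pattern quoted above, in C terms. */
static float guarded_rsq_mul(float y, float x)
{
    if (x <= 0.0f)
        return copysignf(FLT_MAX, y);
    return y * (1.0f / sqrtf(x));
}

int main(void)
{
    float y = 0.0f, x = 0.0f;
    float rsq = 1.0f / sqrtf(x);                    /* +Inf */
    printf("ieee:    %f\n", mul_ieee(y, rsq));      /* nan */
    printf("legacy:  %f\n", mul_legacy(y, rsq));    /* 0.0 */
    printf("guarded: %g\n", guarded_rsq_mul(y, x)); /* FLT_MAX */
    return 0;
}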
00:34 idr:-> home
01:27 imirkin: pendingchaos: how hard would it be to hack ACO to emit V_MUL_LEGACY unconditionally, but in a correct manner (i.e. handling mad, etc)?
01:28 imirkin: (or hakzsam ... whoever is around && knows about aco)
01:28 pendingchaos: shouldn't be too difficult
01:29 imirkin: is it possible to run the compiler standalone somehow, i.e. without the hw?
01:30 pendingchaos: yes: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3872
01:30 gitbot: Mesa issue (Merge request) 3872 in mesa "radv: implement a dummy winsys for creating devices without AMDGPU" [Radv, Opened]
01:30 imirkin: neat-o
01:30 pendingchaos: RADV_PERFTEST=aco RADV_FORCE_FAMILY=...
01:30 imirkin: and then basically shader_test the-shader-i-want-to-compile?
01:31 imirkin: presumably no integration with radeonsi?
01:31 imirkin: (given the subject...)
01:31 pendingchaos: ACO is currently RADV only
01:34 pendingchaos: should just be making instruction selection create v_mul_legacy_f32 instead of v_mul_f32 (denormal handling might be more difficult though, if that matters)
01:36 pendingchaos: combining v_mul_legacy_f32+v_add_f32 into v_mad_legacy_f32 would involve modifying the optimizer though
01:43 imirkin: why did i think there was a radeonsi integration?
01:43 imirkin:must have dreamt it
01:44 linkmauve: There are also d3d9 implementations on top of Vulkan, so maybe this extension could also be useful for wine users on radv?
01:45 imirkin: pendingchaos: afaik there are various restrictions about which opcodes are supported where, too
01:45 imirkin: but if it doesn't connect to radeonsi, that's unfortunate =/
01:46 imirkin: i was so sure it did... o well
02:59 airlied: jekstrand: ever consider adding a range to nir texture instr?
03:00 airlied: so when we lower things away from the uniforms to index + ssa offsets, we know what the original array size was
03:00 airlied: vec4 32 ssa_22 = tex ssa_15 (texture_offset), ssa_21 (sampler_offset), ssa_9 (coord), 0 (texture), 0 (sampler)
03:00 airlied: if I have that I've no idea that the original array was 4 samplers long
03:01 jekstrand: airlied: We used to have one
03:01 jekstrand: airlied: Only i965 used it
03:02 jekstrand: We deleted it
03:02 airlied: oh texture_array_size
03:02 airlied: is that the one?
03:02 jekstrand: yup
03:02 jekstrand: I thought we deleted it
03:03 airlied: oh that is used for ycbcr
03:03 airlied: maybe we nuked range somewhere else
03:03 jekstrand: No
03:03 jekstrand: We used to use it for shortening binding tables
03:03 airlied:should go dig
03:04 airlied: yeah for llvmpipe it would make it easier to limit how many texture accessors I have to build
03:06 airlied:guesses I can worry about it if it becomes a real problem :-P
03:07 jekstrand: airlied: Yeah, looks like it's still there
03:07 jekstrand: airlied: But now I want to delete it
03:07 airlied: indeed it is, I'll cautiously use it and hope you get distracted
03:08 airlied: jekstrand: *look* shiny unstructured nir!
03:09 airlied: turnip also looks at it
03:21 jekstrand: airlied: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3940
03:21 gitbot: Mesa issue (Merge request) 3940 in mesa "nir: Drop nir_tex_instr::texture_array_size" [Nir, Opened]
03:21 jekstrand: airlied: Feel free to NAK
03:24 airlied: jekstrand: nah don't want to block progress there :-)
03:25 jekstrand: airlied: Feel free to review too ;)
03:27 airlied: jekstrand: just added an r-b to it
03:31 jekstrand: airlied: I'll need to run a piglit run with NIR_TEST_CLONE before we land it
03:32 jekstrand: airlied: Why do you need to get that information from the shader code?
03:33 jekstrand: airlied: If you just want a total, nir_shader::info::num_textures will give you that
03:33 jekstrand: airlied: Unless you're talking Vulkan
03:33 jekstrand: That might get more interesting but there you have the descriptor set
03:36 airlied: jekstrand: if I have a shader with 4 normal textures and 4 textures in an array
03:36 airlied: I only want to generate the switch code and resulting mess for the array ones
03:36 jekstrand: Right
03:36 jekstrand: airlied: you could always make nir_lower_indirect_derefs run on textures
03:37 jekstrand: Generate the switch code in NIR
03:37 airlied: hmm I wonder how insane that would be
03:37 jekstrand: I have no idea
03:37 airlied: there might be some optimisations I can make better at llvm level
03:37 jekstrand: Yeah
03:37 airlied: since we generate functions that would end up common
03:37 jekstrand: Why does it have to be a switch?
03:37 jekstrand: Why not a function pointer?
03:38 jekstrand: Or a table of them
03:38 jekstrand: Or just an array of descriptor data
03:38 jekstrand:has no idea; he's just asking questions.
03:38 jekstrand: Stupid first-gen XPS 13 keyboard
03:41 jekstrand: What I will say is please don't add switch statements to NIR for this. :P
03:41 airlied: jekstrand: probably could eventually be an array of function pointers, but that requires more LLVM learning :-P
03:42 HdkR: LLVM you say? :)
03:42 jekstrand: airlied loves LLVM
03:43 HdkR: function pointer array in llvm is pretty straight forward, just create an array with the function pointer types then GEP it :D
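To make the alternatives above concrete, here is a plain-C sketch of lowering an indirect texture access "tex[i]" either to a switch ladder over every possible index, or to a table of specialized fetch functions plus one indexed indirect call (roughly what the function-pointer/GEP idea would look like once JITed). All names are invented for illustration; none are real Mesa/llvmpipe symbols.

#include <stdio.h>

typedef void (*tex_fetch_fn)(const float coord[2], float rgba[4]);

static void fetch_tex0(const float coord[2], float rgba[4])
{ (void)coord; rgba[0] = rgba[1] = rgba[2] = 0.5f; rgba[3] = 1.0f; }

static void fetch_tex1(const float coord[2], float rgba[4])
{ rgba[0] = coord[0]; rgba[1] = coord[1]; rgba[2] = 0.0f; rgba[3] = 1.0f; }

/* Strategy 1: switch ladder, one case per texture that can be indexed. */
static void sample_switch(unsigned i, const float coord[2], float rgba[4])
{
    switch (i) {
    case 0: fetch_tex0(coord, rgba); break;
    case 1: fetch_tex1(coord, rgba); break;
    }
}

/* Strategy 2: build the table once, then a single indexed indirect call. */
static const tex_fetch_fn fetch_table[] = { fetch_tex0, fetch_tex1 };

static void sample_table(unsigned i, const float coord[2], float rgba[4])
{
    fetch_table[i](coord, rgba);
}

int main(void)
{
    float coord[2] = { 0.25f, 0.75f }, a[4], b[4];
    sample_switch(1, coord, a);
    sample_table(1, coord, b);
    printf("%f %f / %f %f\n", a[0], a[1], b[0], b[1]);
    return 0;
}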
03:43 airlied: HdkR: okay I'll put that down as step 2 :-)
03:44 imirkin: after that comes profit...
03:44 HdkR: woo profit
03:46 airlied: llvmpipe is all about the profits
03:47 airlied: for cloud vendors getting paid to run CI :-P
03:48 HdkR: How soon until I can replace my GPU with a dual epyc server running LLVMPipe? :)
03:49 imirkin: you could do it today
03:49 airlied: HdkR: as soon as you buy me a dual epyc server to "test" it on :-P
03:50 airlied: (replacement of GPUs not guaranteed)
03:50 HdkR: imirkin: but I want the perf of a GPU as well :P
03:50 imirkin: HdkR: of a gpu... sure
03:50 imirkin: how do you feel about the i740?
03:52 HdkR: imirkin: Perfect, lemme just run Doom Eternal on that
04:37 robclark: airlied: fun fact.. somewhere in the last generation or two, adreno added structured flow control (if/else/endif) to the isa.. I'm not entirely sure why, or how blob decides which sort of flow control to use (but that might just come down to when they can figure out that llvm is structured)
04:38 HdkR: whoa
04:39 HdkR: Makes me think of the old ARM IT instruction
04:39 robclark: IT is more like an execution mask
04:41 robclark: if/else/endif is something we still need to experiment with.. but presumably they added it for some reason.. or maybe it was there all along and they only recently figured out how to make their llvm backend do something useful with it?
04:44 HdkR: Very curious
04:46 robclark: it is, ofc, sometimes hard to know when blob compiler choices come down to llvm vs things that actually matter for perf
04:55 airlied: robclark: I wonder if it's always been there, I think r600 has some of that
04:55 airlied: yay it passes the one piglit test I wrote for this
05:00 robclark: when it comes to llvm, I don't want to rule anything out.. a2xx had something more similar to r600 w/ CF clauses in shader.. a3xx was pretty much a new shader ISA arch.. but that is also when they switched to llvm, so maybe there were some hw features that they couldn't figure out how to utilize..
05:05 jekstrand: robclark: Guaranteed re-convergence, perhaps?
05:05 jekstrand: robclark: SPIR-V for Vulkan has some re-convergence guarantees. In particular, you're guaranteed some form of re-convergence at merge blocks.
05:06 jekstrand:doesn't 100% remember the rules
05:06 robclark: iirc in both cases it is using (jp) flag to mark potential convergence points.. at least I assume (jp) is a hint to the hw that divergent flow control might be reconverging
05:07 jekstrand: Sure but, iirc, (jp) just marks where it may re-converge; it doesn't guarantee that it does.
05:07 jekstrand: How could it?
05:07 jekstrand: Whereas if you have a structured if/else, you can provide better guarantees.
05:07 robclark: sure.. but I guess the hw just needs to know when to join up divergent threads in a warp..
05:08 jekstrand: Potentially
05:09 robclark: but ofc we could have something wrong about that.. so far I've mostly not cared too much about VS performance, and most FS that matters doesn't have too much flow control (at least for things that matter in the gles world)
05:09 jekstrand: It may be less about perf and more about guarantees for shader authors
05:09 robclark: (ie. manhattan30 has I think just one FS that has flow control)
05:09 robclark: could be
05:10 jekstrand: If every divergence point allows invocations to go their own way and never re-converge, that's bad.
05:11 jekstrand: Shader authors want to be able to write "if (...) { a = 0.5 } else { a = 0.3 } color = texture(coord) * a;" without having to worry about whether or not they have valid derivatives
05:11 robclark: yeah, it has been a thought that's crossed my mind about things we could be potentially getting wrong.. although ir3 basically ends up putting (jp) on the front of every block that has more than one predecessor
05:11 jekstrand: If you can't guarantee re-convergence, you're in a world of hurt.
05:12 robclark: yeah
06:31 dhirc: I am trying to find a libdrm example of using Variable Refresh Rate (vrr) without luck. Any pointers on where to get started with this would be greatly appreciated.
06:51 tomeu: anarsoul: done, but I think anybody else could have given their r-b, as long as the CI passes reliably with the changes
06:52 anarsoul: tomeu: thanks!
08:00 hakzsam: anholt: drm-shim is probably much better than my null winsys, I will try to add amdgpu support :)
10:08 bbrezillon: pinchartl: I'll queue "drm/bridge: lvds-codec: Add to_lvds_codec() function" and "drm/bridge: lvds-codec: Constify the drm_bridge_funcs structure" to drm-misc-next if you're okay
10:12 pinchartl: bbrezillon: sure
10:12 pinchartl: thanks for fixing the subject line :-)
11:00 daniels: tanty: your jobs from !3907 failed because we are having a small issue in our lab, I'll retry them when it's back
11:43 tanty: daniels: OK, thanks!
14:08 pinchartl: bbrezillon: could you review "[PATCH v7 04/54] drm/bridge: Document the drm_encoder.bridge_chain field as private" ?
14:08 pinchartl: and "[PATCH v7 05/54] drm/bridge: Fix atomic state ops documentation" ?
14:11 pinchartl: danvet: is the big omapdrm series a candidate for drm-misc, or should it go in through a pull request to drm ?
14:11 pinchartl: it's currently based on drm-misc-next
14:15 danvet:shrugs
14:16 danvet: not clear what your question is ...
14:16 tomba: pinchartl: I'm fine with sending a pull req, if the series doesn't depend on misc. that's perhaps a bit simpler with a series so big.
14:17 pinchartl: tomba: it currently depends on drm-misc-next, but that may have been merged in drm-next. let me check
14:18 pinchartl: tomba: no, it depends on drm-misc-next still
14:19 pinchartl: I think it's best to merge it through there
14:19 pinchartl: danvet: the question was whether drm-misc is suitable for such large rework
14:19 pinchartl: (I would assume the answer to be yes)
14:21 danvet: there's a lot of bridge stuff in there and 54 patches should still be totally fine to smash into drm-misc imo
14:22 danvet: I expect mripard_ will stuff the vc4 series into -misc too :-)
14:22 pinchartl: ok :-)
14:33 bbrezillon: pinchartl: will do
14:34 pinchartl: bbrezillon: thanks
14:55 daniels: anholt, robclark: another a6xx flake https://gitlab.freedesktop.org/llandwerlin/mesa/-/jobs/1708380
15:13 imirkin_:is amazed ... https://github.com/KhronosGroup/VK-GL-CTS/pull/189 actually got merged
15:13 gitbot: KhronosGroup issue (Pull request) 189 in VK-GL-CTS "Improve support for KHR-GL33.* tests" [Closed]
15:14 karolherbst: nice
15:15 imirkin_: nv50 still fails some of the tests (at least G84 does), but for fairly legitimate reasons
16:57 danvet: agd5f_, I'll resend the patch to unexport drm_get_pci_dev
16:58 danvet: agd5f_, I think simplest if you also add that to your amd pile
18:04 agd5f_: danvet, yeah, planning to pick that up after my next PR since my current tree still has it used in vmware, or I can backmerge current drm-next
18:06 danvet: agd5f_, ah if you need a merge anyway I guess I can wait for your pr and then stuff it into misc
18:09 agd5f_: sure, works for me
18:17 jekstrand: airlied, danvet: Are there any IOCTLs which act directly on a dma-buf prime FD?
18:17 jekstrand: As in, the IOCTL doesn't act on a DRM handle but the prime FD itself
18:19 danvet: jekstrand, yup
18:20 danvet: DMA_BUF_IOCTL_SYNC and DMA_BUF_SET_NAME are the ones right now
18:21 danvet: meh the dma-buf name isn't the same as the gem obj name
18:21 danvet:disappoint
18:21 danvet: sumits, ^^
18:23 jekstrand: danvet: Awesome! I'm going to add two more. :-)
18:29 jekstrand: danvet: I assume the one with IOCTL is the better name?
18:30 danvet:doesn't really care
18:33 Venemo: can someone explain to me how write masks are supposed to work when you are storing an output that is an array of vec3s?
18:34 imirkin_: same way as if it's not an array of vec3's...
18:34 Venemo: err
18:34 imirkin_: when you're assigning an array to another (if it's even legal), i don't think you can have a writemask at the array level
18:34 Venemo: is it an offset inside the array or inside the vec3?
18:35 imirkin_: inside the vec3
18:35 imirkin_: so like foo[5].xy = vec2(1, 2)
18:35 imirkin_: will update the xy components of foo[5]
18:35 Venemo: I have an output that is a vec3[3]
18:35 imirkin_: (which seems quite obvious, so i suspect you're asking about something else)
18:36 imirkin_: ok
18:36 Venemo: sorry, it's not obvious to me
18:36 imirkin_: what is the line of code you're unsure about?
18:36 Venemo: gimme a sec
18:37 imirkin_: also i'm guessing you're on an arch where you have to export the whole vec4 at a time rather than per-component?
18:40 Venemo: this is what I have: intrinsic store_per_vertex_output (ssa_3, ssa_0, ssa_2) (20, 7, 0) /* base=20 */ /* wrmask=xyz */ /* component=0 */
18:41 Venemo: which stores into this output: decl_var shader_out INTERP_MODE_NONE vec3[3] @4 (32.xyz, 20, 0)
18:41 imirkin_: so that's saying foo[0].xyz = ssa_one_of_them
18:42 imirkin_: oh, wait, ssa_all_of_them probably :)
18:43 imirkin_: i.e. foo[0].xyz = vec3(ssa_3, ssa_0, ssa_2)
18:43 Venemo: ssa_3 is a vec3 already
18:43 imirkin_: then ssa_0/2 mean other things. would have to look at the store_per_vertex_output spec
18:43 imirkin_: is this TCS?
18:43 Venemo: yes
18:43 imirkin_: ah heh
18:43 imirkin_: so then probably ssa_0 is gl_InvocationID?
18:44 imirkin_: but really just look at the spec for what those args mean
18:44 Venemo: I already did
18:44 pendingchaos: ssa_3 is the value, ssa_0 is the vertex and ssa_2 is the index within the vec3[]
18:44 Venemo: ssa_3 is the value to store, ssa_0 is the vertex index and ssa_2 is the output
18:45 pendingchaos: I think the writemask is for each component of ssa_3
18:45 Venemo: the question is, should the above instruction write to all vec3s in the array, or only one?
18:46 pendingchaos: only one
18:46 Venemo: but all 3 elements inside that vec3, right?
18:46 Venemo: because of xyz
18:46 pendingchaos: yes
18:47 Venemo: actually, why doesn't this thing get flattened into an array that has 9 elements?
18:47 Venemo: it kind of feels fiddly to have to care about vecs like this
18:48 airlied: welcome to tessellation
18:48 imirkin_: flattening has problems too
18:48 imirkin_: like indirect indexing
18:48 imirkin_: as well as funky ARB_enhanced_layouts things
18:48 imirkin_: as well as SSO
18:48 imirkin_: (although iirc you can't have a TCS in a SSO by itself?)
18:49 Venemo: I think you always need a TES, and the TCS is optional in OpenGL but mandatory in Vulkan
18:49 imirkin_: well, i mean in a SSO program
18:50 imirkin_: not in a full pipeline
18:50 Venemo: SSO = ?
18:50 imirkin_: separate shader objects
18:50 Venemo: sorry, I have no idea
18:50 imirkin_: which enables you to create a pipeline out of pre-linked shaders
18:50 imirkin_: which present linkable interfaces
18:50 imirkin_: for their inputs/outputs
18:50 Venemo: ahhh
18:50 Venemo: okay then
18:52 Venemo: I think my problem is that I need to care about these vecs
18:52 airlied: also an array of 3 vec3s isn't 9 floats
18:52 airlied: it's most likely 12 floats
18:52 Venemo: airlied: why?
18:53 airlied: because vec3s generally align to vec4
18:53 Venemo: so the flattening would introduce more problems than it solves
18:53 imirkin_: this is known as "varying packing"
18:54 imirkin_: and we do it in many places, but there are various restrictions
18:54 airlied: also tess vecs are not necessarily linear
18:54 Venemo: how do you mean?
18:54 airlied: tess vecs can be stored strided
18:54 Venemo: headache
18:54 airlied: AoS vs SoA
18:55 airlied: if you have 2 tess outputs a[3] and b[3]
18:55 imirkin_: so like on adreno, those aren't hardware varyings
18:56 imirkin_: you just have a buffer, and you read/write to it
18:56 imirkin_: and as long as TCS/TES agree on where to read/write from, it's all good
18:56 airlied: they can be stored a[0] b[0]
18:56 airlied: a[1] b[1]
18:56 Venemo: imirkin_: that's exactly what I do too: just find the place where it needs to be stored in memory
18:57 Venemo: well, "just"... it's been very fiddly
18:57 airlied: the ring calculations for tess are always messy
18:58 airlied: i thought i had them right, then hit compact vars
18:59 Venemo: I was wondering about that too
18:59 Venemo: what does compact mean here?
18:59 airlied: compact patch var[gl_invocationID] is quite the brain twister
18:59 airlied: Venemo: packed array
19:00 airlied: so float[4] acts like a vec4
19:00 airlied: used for tess factors and clip dists
19:01 Venemo: emm, I thought float[4] under the hood is the same as vec4
19:02 imirkin_: for normal varyings, it's actually 4x vec4
19:02 imirkin_: but only using the first component
19:02 Venemo: what the
19:02 imirkin_: (and using ARB_enhanced_layouts, you can place a float[4] into the y component as well)
19:02 imirkin_: thankfully all the location stuff is already handled by the higher levels
19:03 imirkin_: so you just do what they tell you
19:03 Venemo: yeah, "just"
19:03 imirkin_: better than having to figure it _all_ out :p
19:03 Venemo: that's true
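A small C illustration of the slot arithmetic above, under the usual assumption that every varying slot is a vec4 (16 bytes); the numbers are for illustration only and don't describe any particular driver's ring layout.

#include <stdio.h>

#define SLOT_BYTES 16 /* one vec4 per varying slot */

int main(void)
{
    /* vec3[3]: each element is padded out to a full vec4 slot, so the
     * array occupies 3 slots (12 floats of storage for 9 floats of data). */
    printf("vec3[3]          : %d slots, %d bytes\n", 3, 3 * SLOT_BYTES);

    /* Normal (non-compact) float[4]: one slot per element, only .x used,
     * so 4 slots (enhanced layouts can pack other things into .y, etc.). */
    printf("float[4]         : %d slots, %d bytes\n", 4, 4 * SLOT_BYTES);

    /* "Compact" float[4] (tess factors, clip distances): behaves like a
     * single vec4, so 1 slot. */
    printf("compact float[4] : %d slots, %d bytes\n", 1, 1 * SLOT_BYTES);
    return 0;
}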
19:03 Venemo: fortunately I can use the same layout as is used by radv_nir_to_llvm, so it's good that I have something to verify against
19:04 imirkin_: yeah, copying a working thing is always a winning solution
19:05 Venemo: it can't be just copypasted, because the nir is lowered differently
19:06 airlied: for some hw it might be possible to just lower tess i/o to ssbos
19:07 Venemo: but anyway, I got most of it right, only this vecX case remains, I think
19:07 jekstrand: If you know the max number of threads and can build some sort of free-list data structure in the SSBO, that might work.
19:08 Venemo: do I need to care about this compact thing separately, or does nir already set the locations correctly, in lower_io?
19:09 Venemo: so far, I think it does, but can't say for sure
19:12 Venemo: yeah it seems it does
19:13 Venemo: so that's one less headache, fortunately
19:14 airlied: when you get CTS perfect, then you'll know if you needed it or not :-)
19:19 Venemo: hehe
19:19 Venemo: fair enough
19:19 airlied:had GL working on llvmpipe, but vulkan tess nir looked slightly different
19:22 imirkin_: airlied: you mean when heaven runs, you'll know?
19:24 airlied: imirkin_: well he's probably doing vulkan :-P
19:24 airlied: so the sascha tess demo :-P
19:25 airlied:wants to track down why the tess factor nir is different between vulkan and GL
19:27 airlied: ah GL ends up with tessouter being a vec4, vulkan ends up with tessouter being a compact float[4]
19:30 imirkin_: i want to say there's some PIPE_CAP that controls that.
19:30 imirkin_: but i don't remember what it is.
19:30 imirkin_: something i wanted to enable for nouveau, but ... fail
19:30 imirkin_: too much change for not enough benefit
19:30 Venemo: airlied: actually the Sascha demo already requires quite a lot of stuff to get right, so instead of that I'm dealing with the vulkan cts
19:30 imirkin_: it was something to do with compact arrays
19:37 Venemo: imirkin_: so, regarding write masks that we talked about earlier, does a write mask even make sense for a simple float[N] output, or is that always just x then?
19:39 pinchartl: vsyrjala: congratulations for finding a panel that can handle 60kfps in our code base. I thought a few days ago that display was easier than cameras because we would never need to support 1kfps :-)
19:49 airlied: Venemo: for a float it will be x
19:49 Venemo: ah, ok
19:49 Venemo: so it only matters for vecs
20:43 anholt_: just to be sure before I go do a bunch of typing, do we have a piglit test that can iterate over possible target/size/format of textures and test sampling from each texel?
20:52 airlied: anholt_: tex-miplevel-selection?
20:52 airlied: or texwrap maybe
20:52 anholt_: texelFetch looks close but doesn't do formats
20:52 anholt_: texwrap doesn't do all levels
20:52 anholt_: or 1:1 all texels
20:54 anholt_: tex-miplevel-selection has a constant color across the level.
20:55 airlied: mareko or imirkin_ might know of something
20:55 anholt_: and hardcoded texture size too
20:55 imirkin_: tex-miplevel-selection is it.
20:55 imirkin_: it takes about a thousand different args
20:56 imirkin_: and tests the various texture functions
20:56 anholt_: and doesn't do formats
20:56 imirkin_: nothing which does formats afaik, except teximage-colors
20:56 anholt_: texelFetch still looks like the closest for "iterate targets, sizes, and formats"
20:56 imirkin_: and some fairly generic-looking ones
20:56 imirkin_: yeah, texelFetch is a pretty great test (at testing texelFetch)
20:57 Sesse: on that note, I really wish GL didn't bind the choice of normalized/unnormalized coordinates to whether you get interpolation or not
20:58 Sesse: sometimes, it would be really useful to be able to have unnormalized coords but still get interpolation
21:09 ajax: Sesse: https://www.khronos.org/registry/OpenGL/extensions/ARM/ARM_texture_unnormalized_coordinates.txt may be up your alley
21:10 ajax: only defined for gles for some silly reason, but that's fixable
21:11 Venemo: one more question, how is the offset of a NIR intrinsic supposed to work? is that a byte offset within the given output, or does it mean that it's gonna end up at a different output?
21:12 airlied: Venemo: within the given one
21:13 anarsoul: ajax: likely because ARM blob doesn't support desktop GL
21:14 Venemo: airlied: how is it different from the component then?
21:14 pendingchaos: what offset are you talking about?
21:16 Venemo: eg. store_per_vertex_output(val, vertex_index, offset) <--- this offset
21:17 airlied: base, wrmask, component
21:17 airlied: oops sorry
21:17 pendingchaos: IIRC that changes the vec4/slot that's written to
21:18 Venemo: so I'm talking about the thing that nir_get_io_offset_src returns for a given intrinsic
21:19 Venemo: pendingchaos: what you're saying seems to contradict what airlied said
21:19 airlied: ignore me :-)
21:20 Venemo: but in that case, what is the use case for this?
21:21 Venemo: why not just store the output you actually want to store, instead of storing another one with an offset
21:21 pendingchaos: the offset might not be a constant
21:22 Venemo: yes, I know, but I still don't see what could get compiled into such a thing
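Pulling this thread together, here is a minimal C sketch of one possible address calculation for such a store, assuming the offset source is in vec4 slots, every slot is 16 bytes, and one vertex's outputs are stored contiguously. Real drivers may stride or interleave outputs across vertices instead (the AoS vs. SoA point earlier), so this is not radv's actual layout. A non-constant offset source arises from indirectly indexed outputs like foo[i].

#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: outputs for one vertex are contiguous, each slot
 * is one vec4 (16 bytes).  "base" and the offset source are in slots;
 * "component" selects x/y/z/w within the slot. */
static uint32_t
tcs_output_byte_offset(uint32_t slots_per_vertex,
                       uint32_t vertex_index,   /* e.g. gl_InvocationID */
                       uint32_t base,           /* intrinsic base */
                       uint32_t indirect_slot,  /* the offset source */
                       uint32_t component)      /* intrinsic component */
{
    uint32_t slot = base + indirect_slot;
    return (vertex_index * slots_per_vertex + slot) * 16 + component * 4;
}

int main(void)
{
    /* For the store_per_vertex_output above (base=20, component=0,
     * wrmask=xyz), element 1 of the vec3[3] for vertex 2 would land at
     * this offset and cover 12 bytes (three components for .xyz).
     * The 32 slots per vertex here is just an example value. */
    printf("%u\n", tcs_output_byte_offset(32, 2, 20, 1, 0));
    return 0;
}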
21:23 Sesse: ajax: ironically enough, I've talked to people on the mali team about this, and at the time, they were unaware GL had such a restriction :-)
21:23 Sesse: ajax: I guess their hardware can do it no problem
21:55 kisak: O.o a 52 second mesa build? nice https://gitlab.freedesktop.org/anholt/mesa/-/jobs/1732631
21:57 Sesse: ccache ftw?
22:04 anholt_: ccache and a large, otherwise-idle machine
22:09 Sesse: I guess meson, ccache, ninja and a large machine -- perfect combination