00:02 karolherbst: jekstrand: can you check the internal logs of my 151st CI run? somehow I think it either got stuck or it broke horribly
00:07 jekstrand: karolherbst: Yeah, it got hung up
00:08 jekstrand: karolherbst: It'll time out shortly
00:09 jenatali: karolherbst: Officially trying my first full CTS runs :P
00:09 jenatali: Rather than just piecemeal running CTS tests
00:36 bl4ckb0ne: should I bump the CI image for waffle? I need meson 0.51 for the wayland scanners
01:06 dcbaker[m]: bl4ckb0ne: why do you need 0.51?
01:08 bl4ckb0ne: a line break and get_variable on the pkgbuild dep
01:08 bl4ckb0ne: i can do without those features
01:10 dcbaker[m]: I think Mesa uses 0.52? I'm okay with bumping to whatever Mesa uses
01:11 bl4ckb0ne: yup mesa uses 0.52
01:13 bl4ckb0ne: now i just have to figure out how to upgrade the ci debian image
01:18 dcbaker[m]: Good luck, lol
01:19 bl4ckb0ne: i found the same tag with a different version from xserver and gave it a shot
01:19 bl4ckb0ne: so far its still building
01:20 bl4ckb0ne: i also found a typo here https://gitlab.freedesktop.org/mesa/waffle/-/blob/master/src/waffle/meson.build#L44
01:20 bl4ckb0ne: ill make a pull request after that
01:26 dcbaker[m]: bl4ckb0ne: that's not a typo. Waffle has a C11 threads compatibility library, that's what idep_threads is
01:27 bl4ckb0ne: found it defined as dep_threads here https://gitlab.freedesktop.org/mesa/waffle/-/blob/master/meson.build#L61
01:27 bl4ckb0ne: also you were right, CI is still not working
01:27 bl4ckb0ne: debian buster only has meson 0.49
01:29 dcbaker[m]: You need buster-backports of meson
01:30 dcbaker[m]: https://gitlab.freedesktop.org/mesa/waffle/-/blob/master/third_party/threads/meson.build#L34
01:31 bl4ckb0ne: i somehow missed that thanks
01:37 bl4ckb0ne: can i install from buster-backports with DEBIAN_DEBS?
01:37 bl4ckb0ne: nevermind found it
01:39 bl4ckb0ne: https://gitlab.freedesktop.org/wayland/ci-templates/-/blob/master/templates/debian.yml was what I was searching for
01:45 bl4ckb0ne: i think there are some issues with the dependencies, i dont see them being installed in the pipeline
02:10 dcbaker[m]: It's quite possible waffle was still doing the old style dependencies as build steps instead of as part of the pipeline
02:13 bl4ckb0ne: should it be part of the xdg shell merge request?
02:15 dcbaker[m]: It might be better as two, just because the reviewers might be different, but I think it's fine to be one MR too
02:20 bl4ckb0ne: ill see what i can do about that tomorrow
02:20 bl4ckb0ne: thanks for the help
06:50 tomeu: kisak: yep, working on a fix
10:00 danvet: tzimmermann, another mgag200 series just for me
10:01 danvet: well ok a bunch more people
10:01 danvet: but no lists
10:01 danvet: I'm still confused how your workflow works
10:01 tzimmermann: danvet, arghhh
10:01 danvet: you get all the maintainers added, but none of the lists somehow
10:01 danvet: that's unique :-)
10:02 tzimmermann: i don't know why i often forget the list
10:02 danvet: script it?
10:02 danvet: dim add-missing-ccs
10:02 danvet: then just send patches to yourself or so
10:02 danvet: that dim command also adds mailing lists I think
10:02 tzimmermann: i do get_maintainers on the patches and dri-devel shows up at the very bottom. i guess, i just overlook it
10:03 danvet: uh actually no
10:03 danvet: we remove the mailing lists ...
10:03 karolherbst: jekstrand: so mhhh.. I kind of ran into an annoying issue on intel's side (and this might affect other drivers potentially as well): if we run lower_system_values, nir->info.system_values_read needs to be fixed, but doing so kills nir->info.inputs_read for tess passthrough shaders as bits are flipped without adding input variables
10:03 danvet: maybe that's the bug?
10:03 danvet: perhaps should at least not remove dri-devel ...
10:04 danvet: tzimmermann, with the dim command you can do a dim retip -x "dim add-missing-cc" and done
10:04 danvet: iirc
10:04 tzimmermann: danvet, what do you mean by 'no'? scripts/get_maintainers.pl shows dri-devel
10:04 tzimmermann: i'll try out that dim command
10:04 danvet: tzimmermann, dim add-missing-cc filters out dri-devel
10:04 tzimmermann: danvet, sorry about all this
10:05 danvet: tzimmermann, no worries, really
10:05 danvet: the workflow is kinda bonkers
10:05 danvet: the trouble is that get_maintainers always includes everyone to the top
10:05 danvet: so always cc lkml
10:05 danvet: and for i915 also always cc dri-devel
10:05 danvet: I think that's why we're filtering
12:09 bbrezillon: jenatali: I tried implementing jekstrand suggestion here https://gitlab.freedesktop.org/kusma/mesa/-/merge_requests/240/commits, but maybe I went too far
12:09 bbrezillon: I thought I could get rid of _vtn_local_load_store() and let drivers do the split
12:21 bbrezillon: jenatali, jekstrand: but it seems nir_split_{struct,array}_vars() are restricted to shader/func temps
12:21 bbrezillon: and the vtn code was apparently lowering more than that
12:22 bbrezillon: (or maybe it's just me misunderstanding what it does)
15:13 bl4ckb0ne: do I need to use ci-templates for a freedesktop project or can I use the vanilla gitlab ci?
15:21 emersion: for waffle
15:21 emersion: ?
15:25 bl4ckb0ne: yes
15:25 bl4ckb0ne: i dont see what ci-templates can add to the build since its only building on x86 debian buster
15:56 dcbaker[m]: bl4ckb0ne: It saves us runner time if you can install dependencies as part of the template build
15:56 bl4ckb0ne: yeah thats what i figured out
15:57 bl4ckb0ne: i finally managed to solve the issues i had, im squashing the commits and making a merge request
15:58 bl4ckb0ne: https://gitlab.freedesktop.org/mesa/waffle/-/merge_requests/77
15:59 bl4ckb0ne: gah pipeline is red again
15:59 bl4ckb0ne: meson test failing, maybe its missing a dep
16:11 dcbaker[m]: Looks like x11 related
16:12 bl4ckb0ne: yup
16:12 bl4ckb0ne: maybe its due to the bump of the image
16:21 bl4ckb0ne: im really not good with x11, could it be a config from the previous build image?
16:33 Kayden: jekstrand, karolherbst: using variables for passthrough TCS? I thought about doing that, but it seemed like a giant pain
16:34 Kayden: because you might have complex things like varying arrays of structs or whatnot
16:34 Kayden: and I could either replicate the entire structure of TES inputs as TCS inputs / TCS outputs, and emit code to figure out how to copy each element of those types
16:34 Kayden: or I could just loop over the number of vec4 slots and load_input/store_output
16:34 Kayden: which seemed way easier
16:35 Kayden: I suppose we could instead use a bunch of ivec4 variables and load/store those
16:36 Kayden: maybe that's better, but I'm not materially sure how
16:36 jekstrand: Kayden: You could just loop over the slots and create a bunch of vec4 variables. :)
16:37 Kayden: but what's the advantage?
16:37 Kayden: they don't match any kind of original structure and are just going to get lowered anyway...?
16:39 Kayden: I guess if you run gather_info afterwards it probably clobbers it, but...
16:39 Kayden: half the things in gather_info have to be or cannot be run at particular times
16:41 Kayden: I think I missed some context in the discussion
16:47 jekstrand: Kayden: It's nir_gather_info clobbering things
16:47 jekstrand: karolherbst was working on a MR to change the way system value lowering happens a bit and ran into trouble with the passthrough TCS
16:49 jenatali: karolherbst: Looks like the CL CTS tries to run the half tests regardless of whether the device supports the fp16 extension... I'm wondering if that's a test bug or if I missed part of the spec and some fp16 stuff isn't optional
16:53 jenatali: Oh, I see, half is required, but can only be used for conversions to/from float with the vload/vstore methods, unless the fp16 extension is supported
16:56 jekstrand: Kayden: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6119
17:02 Kayden: pretty much moving nir_gather_info gives me nightmares
17:02 jekstrand: Yeah
17:03 Kayden: ran into enough problems with that during the samplers_as_derefs rework
17:03 jekstrand: I don't really get why we need to move it
17:03 jekstrand: But I've not spent enough time looking at that MR
17:26 bl4ckb0ne: dcbaker[m]: relaunching the pipeline fixed the issue
17:27 dcbaker[m]: Awesome. I'll get out an actual computer and have a look today
17:27 * dcbaker[m] is on his phone ATM
17:28 bl4ckb0ne: thanks
17:28 bl4ckb0ne: once this is merged ill go back on the xdg-shell issues
17:28 bl4ckb0ne: i left the wip tag on the xdg shell merge request in the meantime
19:01 anholt: ha, it's not just my runner that crashes in the virgl gles31 tests. https://gitlab.freedesktop.org/anholt/mesa/-/jobs/3879484
19:21 airlied: anholt: weird, I'd really expect to have seen that
20:27 jenatali: Does nir have fp16 <-> fp32 lowering somewhere? A quick search isn't showing me anything pre-existing
20:28 jenatali: Er, specifically lowering the conversion between the two, not just replacing fp16 usage with fp32 usage
20:29 imirkin: you mean like using integer math to do the conversion?
20:29 jenatali: Yeah
20:29 imirkin: there's a library for fp64 like that, but i don't think that's there for fp32 <-> fp16 conversion
20:29 karolherbst: jenatali: you don't have any fp16 bits in dxil?
20:29 jenatali: Awesome - CL requires supporting that conversion (fp16 coming from external memory), but it's not a required feature in D3D
20:30 jenatali: karolherbst: It's optional
20:30 karolherbst: mhh
20:30 jenatali: Guess I'll add a lowering pass
20:30 karolherbst: mhhh
20:31 karolherbst: it feels like you could depend on it being there
20:31 karolherbst: at least it would cover all nvidia gpus as they all support it
20:31 karolherbst: well
20:31 karolherbst: relevant ones
20:31 karolherbst: imirkin: can tesla do fp16 <-> fp32 conversion?
20:31 karolherbst: I think so, right?
20:31 imirkin: pretty sure, yeah
20:32 jenatali: karolherbst: My dev PC's AMD GPU doesn't set the D3D bit for 16bit support
20:32 jenatali: We conflated i16 and fp16 support into a single bit
20:32 karolherbst: jenatali: I am sure the hw still supports it though
20:32 jenatali: Whether that was a good idea or not... I can't really say :)
20:32 karolherbst: ehhh
20:32 karolherbst: yeah..
20:32 jenatali: karolherbst: Doesn't matter if I can't get at it ;)
20:32 karolherbst: that was probably not a good idea :p
20:32 imirkin: whether hw supports or not, if driver doesn't allow it, not much you can do
20:32 karolherbst: yeah...
20:32 jenatali: Yup
20:32 karolherbst: annoying
20:33 jenatali: I've got C routines for the conversion, I just have to port them to nir, shouldn't be too bad
20:33 karolherbst: rounding modes are fun :(
20:33 karolherbst: not
20:33 jenatali: Yeah...
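For readers following the fp16 thread: the "integer math" conversion discussed above can be sketched as follows for the fp16 -> fp32 direction. This is only an illustration in Python, not jenatali's C routines and not an existing NIR pass; the names `half_bits_to_float_bits` and `bits_to_float` are made up for this sketch. Widening is exact, so no rounding mode is involved here; the rounding modes karolherbst mentions only matter for fp32 -> fp16.

```python
import struct


def half_bits_to_float_bits(h):
    """Widen IEEE 754 binary16 bits to binary32 bits using only integer ops.

    Hypothetical helper for illustration; not a Mesa/NIR function.
    """
    sign = (h >> 15) & 0x1
    exp = (h >> 10) & 0x1F
    man = h & 0x3FF

    if exp == 0x1F:
        # Inf (man == 0) or NaN (man != 0): all-ones exponent, keep payload.
        exp32, man32 = 0xFF, man << 13
    elif exp != 0:
        # Normal number: rebias the exponent (15 -> 127), widen the mantissa.
        exp32, man32 = exp - 15 + 127, man << 13
    elif man == 0:
        # Signed zero.
        exp32, man32 = 0, 0
    else:
        # Subnormal half: shift until the implicit leading 1 appears,
        # since every half subnormal is a normal number in binary32.
        shift = 0
        while not (man & 0x400):
            man <<= 1
            shift += 1
        exp32 = 127 - 15 + 1 - shift
        man32 = (man & 0x3FF) << 13

    return (sign << 31) | (exp32 << 23) | man32


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a float (for checking only)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


print(bits_to_float(half_bits_to_float_bits(0x3C00)))  # 1.0
print(bits_to_float(half_bits_to_float_bits(0xC000)))  # -2.0
```

The subnormal branch is the part that tends to get dropped in quick ports; the other cases are a rebias plus a 13-bit shift.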
20:34 karolherbst: convert_sat_rtp_half(double) :p
20:34 jenatali: I think daniels may have already hooked it all up for me? But I haven't looked closely at whether his code supports fp32->fp16 since we weren't planning to support fp16 at all
20:34 karolherbst: ehhh
20:34 karolherbst: _sat is at the end
20:34 karolherbst: convert_rtp_half_sat_
20:34 karolherbst: ?
20:34 karolherbst: how was it..
20:34 karolherbst: convert_half_rtp_sat!
20:34 jenatali: I think that's part of the fp16 extension - all we need to support is the vstore_half variants
20:35 karolherbst: ohh, might be
20:35 jenatali: Which do still have rounding modes
20:35 imirkin: there might actually be lowering already
20:35 imirkin: for like
20:35 karolherbst: jenatali: why are you ending up with conversions then?
20:35 imirkin: pack_half_something
20:35 karolherbst: yeah
20:35 karolherbst: the packing is there already
20:35 imirkin: this packs 2x floats into a single 32-bit value
20:35 jenatali: karolherbst: vload_halfn takes half as input and returns float
20:35 karolherbst: lower_pack_half_2x16
20:35 karolherbst: lower_unpack_half_2x16
20:36 karolherbst: jenatali: ehhh...
20:36 jenatali: And store's the opposite
20:36 karolherbst: ....
20:36 imirkin: right, that makes sense
20:36 karolherbst: why...
20:36 jenatali: Dunno
20:36 karolherbst: but yeah.. it does make sense
20:36 jenatali: I have no idea why this isn't also behind the fp16 extension
20:36 imirkin: coz everyone has the hw to do this
20:37 jenatali: Like... what're you doing with half floats in external memory?
20:37 karolherbst: jenatali: do you know if dxil has those packing/unpack operations because they are more or less mandated.. I think even by GL?
20:37 imirkin: even if they don't have the hw to do the remainder of fp16
20:37 imirkin: reducing bandwidth?
20:37 jenatali: Oh, yeah that's a good reason
20:37 karolherbst: jenatali: fp16 textures
20:38 karolherbst: imirkin: any idea if GL mandates fp16 textures?
20:38 jenatali: imirkin: The lowering you're talking about still has f2f16 or f2f32, which is the op we can't use if the D3D driver doesn't set the 16bit flag
20:39 imirkin: karolherbst: semi-modern versions of GL definitely
20:39 imirkin: probably starting with GL 3.0?
20:39 karolherbst: yeah.. I guess so
20:39 jenatali: karolherbst: Believe it or not, fp16 textures are unrelated, the D3D abstraction requires fp16 -> fp32 expansion to happen inside the sampler hardware
20:39 imirkin: jenatali: sad. i guess all hw supports it, so it's never come up :)
20:39 jenatali: "the sampler hardware" being a texture op instruction
20:39 karolherbst: jenatali: I wouldn't be surprised if dxil has fp16 operations which driver have to support though
20:39 jenatali: karolherbst: Negative, just tried
20:40 karolherbst: jenatali: not even unpack?
20:40 jenatali: Our WARP device supports 16bit ops, and f2f32 succeeds just fine, but D3D validation blocks the same shader from even getting to the driver on my AMD device
20:40 imirkin: jenatali: yeah, do you have those pack/unpack opcodes you can access? those are pretty much required for GLSL
20:41 jenatali: Let me double-check...
20:41 karolherbst: jenatali: D3DX_FLOAT2_to_R16G16_UNORM?
20:41 karolherbst: would be the hlsl think I think
20:41 karolherbst: *thing
20:41 karolherbst: stores two floats into an int packed as fp16
20:42 karolherbst: again, I wouldn't be surprised if there is _something_
20:42 jenatali: karolherbst: Pretty sure that's just int math though
20:42 karolherbst: mhhh
20:42 karolherbst: but like all hw supports it
20:42 karolherbst: even AMD
20:42 jenatali: Ah, I think I may have found it, we've got dedicated intrinsics for converting between f32/f16 apparently
20:42 karolherbst: :)
20:43 jenatali: The native LLVM opcodes require the 16bit support, but I bet these instructions don't
20:43 karolherbst: yeah
20:43 jenatali: Thanks for making me look again
20:43 karolherbst: would be weird if you could do it with GL but not d3d
20:45 karolherbst: imirkin: seems like the explicit unpacking is GL 4.2+
20:45 karolherbst: GL_ARB_shading_language_packing
20:45 karolherbst: but I guess texture stuff is way older
20:45 imirkin: yes, fp16 textures were around in DX9 hw
20:46 imirkin: and i think required starting GL 3.0
20:46 karolherbst: yeah
20:46 karolherbst: GL_ARB_texture_float in 3.0
22:05 tanty: imirkin: could I get a quick review to https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/347 ?
22:08 imirkin: tanty: a-b: me
22:08 tanty: thanks a lot!
22:08 imirkin: (i'm no expert on this stuff, but it seems not-completely-wrong)
22:08 tanty: :)
22:09 imirkin: and that's really the best i can do.
22:09 tanty: is as much as I can ask, I think
22:09 imirkin: the way i do it in my .emacs is a bit different
22:09 imirkin: i use (custom-set-variables
22:09 tanty: right
22:10 imirkin: and i don't have any mode-specific values
22:10 imirkin: coz i didn't know how to do that ;)
22:11 imirkin: wow, impressive how much crap can get into a .emacs file over 20y or so.
22:11 imirkin: no clue what any of it does. fun.
22:13 imirkin: i guess more like 15. xemacs before that...
22:18 tanty: I think I have the exact same experience :D
22:19 imirkin: and over all that time, i've managed to learn surprisingly little about it
22:20 imirkin: i know that there's all these power features, but i need exactly none of them
22:20 imirkin: tags is as far as i go with power features
22:24 Lyude: imirkin: I feel that, I've been adding on to my .vimrc and using it since high school
22:24 Lyude: so like, 8 or 9 years
22:25 imirkin: i think i used vi back in HS ... hard to remember
22:25 imirkin: lots of Turbo C as well...
22:25 imirkin: (well, elvis of course)
22:30 jenatali: karolherbst, imirkin: So apparently DXIL has instructions for fp16<->fp32 in which the fp16 data is in the low 16 bits of a 32bit value... I'm having trouble coming up with a way to construct that in nir without adding a new opcode. Interested in your thoughts
22:30 karolherbst: jenatali: unpack?
22:30 imirkin: jenatali: i'm largely unfamiliar with nir. you want jekstrand.
22:30 jenatali: Hm... let me look closer at that opcode
22:31 karolherbst: unpack_2x16 and pack_2x16 is what you need
22:31 * jekstrand sees the jekstrand symbol
22:31 jekstrand: Also known as a pink 4
22:31 imirkin: jenatali: however i'd think that you can skip over the little detail that you don't have actual 16-bit registers and implement it as a store into a 32-bit value.
22:32 karolherbst: yeah
22:32 karolherbst: skip the 16 bit one
22:32 imirkin: and for the f16-to-f32 variant, just ignore whatever result you get for the upper half
22:32 jenatali: Hmmm
22:32 jekstrand: jenatali: I think you want [un]pack_half_2x16
22:33 karolherbst: jekstrand: I am sure the dxil operation fills both halfs though :p
22:33 karolherbst: ...
22:33 karolherbst: jema
22:33 karolherbst: ...
22:33 karolherbst: jenatali:
22:33 jenatali: :P
22:33 * karolherbst is getting tired :D
22:33 karolherbst: implementing scalar regs is... annoying
22:33 karolherbst: uhm
22:33 jekstrand: jenatali: In particular, unpack_half_2x16_split_x for fp16 -> fp32 and pack_half_2x16_split(x, undef) for fp32 -> fp16
22:33 karolherbst: uniform regs
22:34 karolherbst: ahhh
22:34 jenatali: jekstrand: Thanks, I think that helps
22:34 karolherbst: jekstrand: don't you mean unpack_32_2x16_split_y?
22:34 karolherbst: ehhh
22:34 karolherbst: pack..
22:34 karolherbst: ufff
22:34 karolherbst: k
22:34 jekstrand: karolherbst: If you want it in the top bits, use y
22:34 jekstrand: x is low bits
22:34 karolherbst: right
22:35 jekstrand: little endian FTW
22:35 jekstrand: wait...
22:35 jekstrand: do I have that backwards?
22:35 karolherbst: no clue
22:35 * jekstrand doesn't feel like looking it up
22:37 jenatali: x is high bits I'm pretty sure
22:37 jenatali: er... no
22:38 jenatali: Definitely low bits
22:40 imirkin: low bits = x.
22:40 imirkin: endian determines how a 32-bit value is packed into bytes
22:41 imirkin: which isn't relevant to the whole x/y/z/w thing in a logical 32-bit value.
22:41 imirkin: little endian = low byte comes first, big endian = low byte comes last
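The x/y and endianness points above can be checked with a small model. This is a sketch in Python, not NIR code: the function names mirror the opcodes jekstrand mentions, but the bodies are an assumed model of their semantics (x in the low 16 bits, y in the high 16 bits, as agreed in the chat). The helpers `f32_to_f16_bits` and `f16_bits_to_f32` are hypothetical and lean on Python's `struct` "e" format, which rounds to nearest even; Python floats are doubles, so this does not model a real fp32 input exactly.

```python
import struct


def f32_to_f16_bits(value):
    """Float -> binary16 bit pattern (round to nearest even via struct)."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def f16_bits_to_f32(bits):
    """binary16 bit pattern -> float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def pack_half_2x16_split(x, y):
    """Model: x lands in the low 16 bits, y in the high 16 bits."""
    return (f32_to_f16_bits(y) << 16) | f32_to_f16_bits(x)


def unpack_half_2x16_split_x(packed):
    """Model: convert the low 16 bits to a float."""
    return f16_bits_to_f32(packed & 0xFFFF)


def unpack_half_2x16_split_y(packed):
    """Model: convert the high 16 bits to a float."""
    return f16_bits_to_f32(packed >> 16)


packed = pack_half_2x16_split(1.5, -2.0)
print(hex(packed))                      # 0xc0003e00
print(unpack_half_2x16_split_x(packed))  # 1.5

# Endianness only changes how the same logical value is laid out in memory.
print(struct.pack("<I", packed).hex())  # 003e00c0 (low byte first)
print(struct.pack(">I", packed).hex())  # c0003e00 (low byte last)
```

For the DXIL case discussed here, where the fp16 value sits in the low 16 bits of a 32-bit value, the second argument of the pack can be anything (0.0 in this model stands in for undef) and only the x half of the unpack is used.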
22:44 bnieuwenhuizen: jekstrand: kinda weird idea but would it be wrong to start setting maxImageCount = INT32_MAX (or something smaller if preferred) in the Vulkan WSI? the 0 case is very counter-intuitive, I don't see the value add (since you obviously cannot get more images than can be represented) and AFAICT Mesa is the only driver setting it to 0
22:45 bnieuwenhuizen: all other drivers set it to something out of [8, 64] (including windows and android)
22:47 jekstrand: bnieuwenhuizen: Answering that question would require thinking about WSI. :-P
22:48 bnieuwenhuizen: you're not the first to make that observation :P
22:48 jekstrand: hah!
22:48 bnieuwenhuizen: though I'd argue it is more API weirdness than WSI proper
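For context on the maxImageCount question: in VkSurfaceCapabilitiesKHR a maxImageCount of 0 means "no limit", so it has to be special-cased by callers. The sketch below shows one way the 0 case could trip up application code; that a naive clamp is the source of the counter-intuitive behaviour is an assumption made here, not something stated in the chat, and both function names are made up for illustration.

```python
def pick_image_count(desired, min_image_count, max_image_count):
    """Clamp a desired swapchain image count, treating max == 0 as unlimited."""
    count = max(desired, min_image_count)
    if max_image_count != 0:  # 0 means "no limit", not "zero images"
        count = min(count, max_image_count)
    return count


def naive_pick_image_count(desired, min_image_count, max_image_count):
    """Plain clamp that forgets the special case (shown only as a pitfall)."""
    return min(max(desired, min_image_count), max_image_count)


print(pick_image_count(3, 2, 0))        # 3
print(naive_pick_image_count(3, 2, 0))  # 0
print(pick_image_count(3, 2, 8))        # 3
```

Reporting a large finite maximum, as bnieuwenhuizen suggests, would make the naive clamp behave the same as the careful one.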
22:55 jenatali: jekstrand: Thanks, I think the pack/unpack approach will work nicely :)
22:56 jekstrand: jenatali: Cool!