08:12 airlied: j4ni: i think those last lspcon patches didnt end up on intel-gfx
08:12 airlied: might want to kick poster
08:13 airlied: vsyrjala: ^
08:56 j4ni: airlied: I'm not sure which patch you're referring to
09:01 airlied: https://patchwork.freedesktop.org/series/77020/
09:01 airlied: j4ni: ^
09:05 j4ni: airlied: aye. vsyrjala, please have a look at that. the last time I commented on it, you pretty much overruled me. ;)
09:05 j4ni: also bounced it to intel-gfx@
10:14 Venemo: jekstrand: regarding GS intrinsics, we need a way to not only track the number of vertices but also the number of primitives. which option would appeal to you more? do you prefer to keep two intrinsics, set_vertex_count and set_vertex_and_primitive_count, or do you prefer to have just the latter, with the primitive count optional?
10:14 Venemo: jekstrand: I vote for the latter, but I'd be happy to hear your input too
10:31 MrCooper: ugh, spam comments on the kernel bugzilla as well
14:32 danvet: tzimmermann, did you see my reply to your question on the shmem series?
14:33 danvet: apologies that I've missed it, I thought you've acked them all
14:37 tzimmermann: danvet, i did. thanks for explaining
14:38 danvet: tzimmermann, so ack or did your reply somehow get lost?
14:38 tzimmermann: you need an ack for that patch? no problem
14:40 tzimmermann: danvet, done
14:41 jekstrand: Venemo: Can't you just multiply and/or divide?
14:41 Venemo: jekstrand: what do you mean?
14:41 jekstrand: I thought all the primitives had the same number of vertices
14:41 Venemo: they did?
14:41 jekstrand: Maybe not?
14:41 Venemo: hmm
14:42 jekstrand: I'm a little rusty on GS details
14:42 pendingchaos: GS outputs strips
14:42 jekstrand: uh... right.
14:42 jekstrand: Wait, can you output strips or do you always output lists?
14:42 danvet: tzimmermann, thx
14:42 pendingchaos: and end_primitive restarts the strip
14:43 pendingchaos: you always output strips
14:43 pendingchaos: so you can't divide the vertex count to get a primitive count
14:43 imirkin: jekstrand: points, line strips, and tri strips, but with the ability to "cut" arbitrarily, as pendingchaos points out.
14:44 jekstrand: Oh, right. It's emitVertex() not emitPrimitive()
14:44 imirkin: and if you're doing points, you can actually also have multiple *streams*
14:44 jekstrand: Yeah, streams are "fun"
14:44 imirkin: which is only useful for ... transform feedback. who thought it was a good idea ... dunno.
14:45 imirkin: AMD has extensions that enable you to use streams with other primitives, as well as to rasterize them
14:45 imirkin: still seems pointless.
14:45 imirkin: nvidia hw doesn't support any of that. dunno where intel falls.
14:45 jekstrand: Venemo: I think from NIR's perspective, it'd be cleaner to have one intrinsic or separate set_vertex_count and set_primitive_count intrinsics.
14:46 jekstrand: Having set_vertex_count and set_vertex_and_primitive_count
14:46 jekstrand: seems very strange
14:46 Venemo: ok, then I think we agree
14:46 Venemo: well, you either need to track just the vertex count, or both the vertex and primitive counts
15:02 Lord: While using amdgpu with mesa, with demanding games i have some complete freezes
15:02 Lord: http://vpaste.net/AFBfs
15:03 Lord: i got a ring gfx timeout
15:03 Lord: most of the time i can work around it by decreasing the resolution or lowering some graphical options
15:04 Lord: but some games are faster than other to crash
15:04 Lord: i can play team fortress 2 for hours without any hiccup, but black mesa will crash in less than 2 minutes (on xen levels)
15:14 Lord: i'm reading on different forum/ml that this bug seems to be fixed to some users in recent kernel but not me :-/
15:20 Venemo: jekstrand, pendingchaos this is what I got: https://gitlab.freedesktop.org/Venemo/mesa/-/commit/ed5c09dcc2cc60834ed6717588e1fc75772ffb09
15:23 MrCooper: Lord: there's no 'this bug', there's lots of separate issues with similar symptoms
15:25 bl4ckb0ne: is there a way to force an attrib to be active in gles2?
15:25 pendingchaos: Venemo: emit_vertex can also increase the number of primitives if there's enough vertices in the strip
15:25 bl4ckb0ne: i have two attribs in my shaders and the second one is not active even though i use it
15:25 pendingchaos: we should probably keep track of the number of vertices since the last end_primitive and increase the primitive count if it's high enough
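The counting rule described here (strips that can be cut at any point, so the primitive count cannot be derived by dividing the vertex count) can be sketched as below. This is only an illustration, not the NIR lowering itself: the function name, the "emit"/"end" event strings, and the `verts_per_prim` parameter are made up for the example.

```python
def count_strip_primitives(events, verts_per_prim=3):
    """Count vertices and primitives for a stream of GS output events.

    events: iterable of "emit" (emit_vertex) or "end" (end_primitive).
    verts_per_prim: 1 for points, 2 for line strips, 3 for triangle strips.
    All names here are hypothetical and only serve the illustration.
    """
    total_vertices = 0
    total_primitives = 0
    verts_since_cut = 0  # vertices emitted since the last end_primitive

    for event in events:
        if event == "emit":
            total_vertices += 1
            verts_since_cut += 1
            # Once the strip is long enough, every further vertex
            # completes one more primitive.
            if verts_since_cut >= verts_per_prim:
                total_primitives += 1
        elif event == "end":
            # A cut restarts the strip; leftover vertices form nothing.
            verts_since_cut = 0
        else:
            raise ValueError(f"unknown event: {event!r}")

    return total_vertices, total_primitives


# A 4-vertex strip (2 triangles), a cut, then a 3-vertex strip (1 triangle).
events = ["emit"] * 4 + ["end"] + ["emit"] * 3
vertices, primitives = count_strip_primitives(events)
```

With 7 vertices and 3 triangles in this stream, a plain division of the vertex count by 3 would give the wrong answer, which is why a separate counter is needed.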
15:25 linkmauve: What could cause llvmpipe to advertise only OpenGL 3.1 instead of 3.3?
15:26 linkmauve: This server is on Ubuntu 20.04, so using Mesa 20.0.4.
15:27 jekstrand: Venemo: Looks ok. I noticed you added a bool parameter for whether or not to count primitives. I'm inclined to just always count them and let DCE clean-up in the back-end if they're unused.
15:27 jekstrand: Venemo: If we're really concerned, we can have a tiny pass which replaces the primitive count with an undef in NIR.
15:28 jekstrand: Kayden may have an opinion
15:30 pendingchaos: Venemo: a bit unrelated, but (if we're going to mimic llvm's NGG GS implementation) we'll also need to know if a specific emit_vertex adds a new primitive and whether it's an odd vertex
15:34 jekstrand: pendingchaos: So you want a verts_since_last_prim thing?
15:34 jekstrand: Maybe an additional argument to emit_vertex?
15:35 pendingchaos: verts_since_last_prim for any GS shader and the number of primitives for queries and transform feedback
15:37 jekstrand: I'm relatively happy to add whatever's needed. Maybe we need to make lower_gs_intrinsics take a flags parameter at this point.
15:37 jekstrand: Our HW does primitive counts for XFB for us so we don't need that. (I'm also happy to let DCE handle it)
15:37 Venemo: pendingchaos: yes but those can easily be tracked on the ACO side IMO
15:37 jekstrand: verts_since_last_prim we also don't need but, again, dead code can probably deal with it. Might be tricky for loops though.
15:38 Venemo: AFAIU the NGG HW needs us to tell it the total number of vertices and primitives
15:46 Venemo: the HW allows a lot more flexibility than would be needed by Vulkan's GS
15:47 jekstrand: Yeah. Make an MR with whatever stuff you need in it and we'll ask you to make stuff optional if we think it's a problem.
15:48 Venemo: jekstrand: so far, only the primitive count is added and that is optional
15:48 alyssa: our hw implements GS/tess/indirect-draws/XFB entirely in software
15:48 alyssa:ducks
15:48 Venemo: pendingchaos: I think end_primitive_with_counter is enough for us to tell that the last vertex completed a primitive.
15:48 imirkin: alyssa: so does llvmpipe :p
15:49 Venemo: alyssa: do you mean that the hw doesn't support it and you implement it in sw, or that the hw itself actually implements it in sw?
15:49 alyssa: Venemo: The former, the blob does massive compute shaders, and possibly patches the command stream at runtime
15:50 alyssa: The latter would be annoying but not my problem.
15:50 alyssa: XFB is sort of hw, varyings get written back to memory automatically on all draws due to tiling, so implementing ES3.0-class XFB is mostly a matter of making sure all the formats/paddings line up
15:51 alyssa: obviously stuff gets exponentially worse when you have XFB and GS simultaneously, etc
15:54 Venemo: XFB = ?
15:54 alyssa: streamout
15:55 Venemo: ah
15:55 Venemo: there are so many names for that thing
15:55 alyssa: XFB, SO, PITA...
15:57 Venemo: :D
15:59 zmike_: PITA++
16:01 ccr: <Al Pacino> ohhh .. I have so many names ..
16:13 mareko: we all love streamout
16:16 alyssa: love to hate it?
16:21 zmike_: I'm deep into the stockholm syndrome part of so, so I just love it
16:22 alyssa: zmike_: Give it time... ;(
16:22 alyssa: lfrb: wrote most of the Panfrost xfb code and hasn't worked on Panfrost since ;|
16:22 alyssa: lfrb: ....Sorry
16:23 zmike_: I just finished the zink xfb implementation, so surely I'll never need to do anything related to it again
16:23 zmike_: and my life will be nothing but sunshine and happiness
16:23 alyssa: zmike_: because now it's kusma's problem? ;P
16:23 alyssa: yes I too know how maintainership works :P
16:23 zmike_: in the sense that it needs another review pass yes
16:28 zmike_: mareko: heya, do you have a min to talk about your comment on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5338 ?
16:29 zmike_: otherwise we can do long form correspondence np
16:29 linkmauve: I have both anv and radv installed, but no GPU available. Still, vkEnumeratePhysicalDevices() does return one physical device, which then segfaults when I vkGetPhysicalDeviceProperties() it.
16:30 linkmauve: Still on Mesa 20.0.4.
16:30 linkmauve: Any way to know which of these two ICDs it is, or how to disable it?
16:33 Venemo: delete one icd, see what happens?
16:36 daniels: gdb?
16:36 pendingchaos: I think you can use VK_ICD_FILENAMES to use just one ICD without deleting anything
16:39 linkmauve: Venemo, I don’t happen to have root access on this machine.
16:40 linkmauve: daniels, gdb only crashes in vkGetPhysicalDeviceProperties(), I have no other information (or debug symbols).
16:41 Venemo: linkmauve: then try what pendingchaos said with VK_ICD_FILENAMES
16:42 linkmauve: pendingchaos, weird, even if I select only one or the other, both will fail with the same segfault.
16:43 linkmauve: VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/intel_icd.x86_64.json or VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/radeon_icd.x86_64.json
16:44 Lord: MrCooper : so, what's possible to do to debug this ?
16:53 linkmauve: I’ve reported it here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3098
16:55 MrCooper: Lord: I'd file an issue at https://gitlab.freedesktop.org/groups/mesa/-/issues , and hope pepp or mareko work with you to get the needed information
16:55 Lord: ok will do thanks
16:55 daniels: linkmauve: does it still happen if you set VK_ICD_FILENAMES to something totally invalid? if yes, then that's nothing to do with mesa
16:59 linkmauve: No, it doesn’t.
17:02 linkmauve: VK_ICD_FILENAMES= gives me “Vulkan required instance extensions: missing”, while VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nouveau_icd.x86_64.json gives me “Vulkan loader: missing”.
17:03 linkmauve: The latter behaviour seems to be triggered by the filename ending with .json weirdly.
17:05 Lord: MrCooper : i'm trying to open the issue but i don't know which project to choose. Mesa / drm ?
17:23 pepp: Lord: mesa/mesa
17:27 Lord: hoo i choose mesa/drm, i try to change it
17:30 pepp: Lord : don't bother, mesa/drm is fine too
17:31 Lord: ok, i can't find how to change it afterwards…
17:39 mareko: zmike_: just ask questions anywhere and I'll reply when I see it
17:46 mareko: zmike_: util_blitter_blit supports blitting between Z/S and color and the color buffers can only be certain formats (e.g. R32_UINT for Z24S8), so there is no driver work needed except doubles
17:49 mareko: zmike_: when converting float (texture read) to Z24_UINT, the fragment shader uses 64-bit fmul to have precise conversion from float to uint
17:52 zmike_: mareko: sorry, I got distracted
17:52 zmike_: my question is why would this be better than using the function in my MR for zink?
17:52 zmike_: I'd still have to do the same vulkan calls
17:53 zmike_: so it seems like this is actually going to end up being substantially more work
17:54 mareko: zmike_: that's up to you
17:54 mareko: zmike_: radeonsi has to do it, because Z/S are unmappable
17:55 zmike_: ah
17:56 zmike_: I think it'd end up being nearly identical in terms of all the actual calls made for this case but then more code in the end since I'd have to update the blit methods to handle this
17:56 zmike_: not to mention d3d12 will use these methods as well
17:57 zmike_: radeonsi is an interesting read here though
18:04 Kayden:reads back, happy to have NIR count things and replace them with an undef for drivers that don't care and DCE stuff...
18:04 Kayden: all of those plans sound reasonable
18:05 Kayden: since it was asked... Intel HW can do either EndPrimitive (cut vertices) OR streams. It might be possible to do other primitives with streams, as long as you don't use EndPrimitive, but...not sure. Bit crazy, really.
18:06 Kayden: gen is definitely not as cool as radeon in its transform feedback related capabilities
18:06 Kayden: but then again, it sounds like few things are :)
18:07 mareko: Kayden: "not as cool" lol
18:10 mareko: Kayden: our new geometry engine doesn't have transform feedback, it only has ordered atomic add, everything else has to be emulated, including queries
18:10 Kayden: ahh
18:10 Kayden: that's decidedly less 'cool' than your old one then
18:11 Kayden: but, it's not like much uses that anyway..
18:51 jekstrand: mareko: That actually sounds pretty nice in the same way that fetching your own vertex data sounds nice. It's simple and you can build whatever you need on top of it.
18:51 jekstrand: Of course, you have to do a lot of building....
19:01 anholt_: mareko: we've got older hw with basically that for TF, and would love to move it to NIR. Has anyone started on that?
19:23 airlied: anholt_: I think that is what Venemo is looking at now
19:23 Venemo: airlied: what do you mean? I'm working on NGG GS
19:32 airlied: Venemo: yup which I assume needs to support TF on NGG or no?
19:33 Venemo: airlied: eventually yes, but it's not the first priority
19:34 Venemo: currently AFAIR even with the LLVM backend we disable NGG when there is TF
19:35 kherbst: cwabbott or jekstrand: mind taking a look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5373 ? Would be cool to get it merged as soon as possible so we don't wait on you for getting the volta/turing stuff merged into nouveau
19:37 jekstrand: kherbst: Have you run it on Intel CI?
19:37 kherbst: yes
19:37 kherbst: result posted on the MR
19:39 jekstrand: kherbst: done
19:40 kherbst: cool thanks!
20:03 jenatali: I'm wondering if someone can walk me through when a glsl_type for a struct has valid offset information in its fields, vs when it's computed on the fly (e.g. glsl_type::struct_location_offset or struct_type_get_field_offset in nir_deref.c)
20:04 jenatali: I'm noticing that the "packed" field in the glsl_type never actually gets set by vtn for CL types and I'm wondering if I can/should just pre-cache offset information rather than having to store it in the type
20:06 jekstrand: Packed not getting set seems wrong
20:06 jenatali: Agreed :)
20:07 jekstrand: We don't have a good helper for "does this have explicit offsets?"
20:07 jekstrand: Or even a global flag on the type
20:08 jekstrand: For a given struct field, you can see if the offset >= 0
20:13 jenatali: Alright
20:13 jenatali: I'm also seeing a bug in the LLVM->SPIRV converter that accessing a packed type doesn't correctly set the alignment on the load, so that'll be fun to deal with...
20:27 kherbst: jenatali, jekstrand: yeah.. I think the packed flag was added but we never really wired it up yet.. but if vtn sets it, it should just work in the end
20:27 jenatali: Until you try to deref chain into a struct
20:27 jekstrand: I'm a bit surprised that VTN doesn't set it.
20:27 kherbst: I am sure I have a patch for it and forgot to post it
20:28 jekstrand: probably
20:30 kherbst: btw.. packed is getting set
20:30 kherbst: jenatali: check handling of SpvDecorationCPacked
20:30 jenatali: The vtn_type packed is getting set, that doesn't make it to the glsl_type
20:31 kherbst: ohhh
20:31 jenatali: Yeah those attributes are parsed after the (immutable) glsl type is already created, there's just a constant false passed to that
20:32 kherbst: yeah.. seeing it right now
20:33 kherbst: I know I had it working in the past though
20:33 kherbst: let me check my old tests and see what happens
20:34 jenatali: Sure, thanks, no rush
20:35 kherbst: yeah.. no worries.. I finished my review for today anyway :p
20:36 jenatali: But yeah even after that, there's enough places where struct offsets are recomputed in different ways with various alignment requirements that I'd be shocked if it worked in all cases right now
20:36 kherbst: yeah.. I would be surprised as well
20:36 kherbst: at least the places where the cl helpers are getting used should be correct
20:37 kherbst: and for CL we kind of always want to use those if it's about application controlled memory
20:37 jenatali: Yeah, except deref chain building, which takes a size_align helper but doesn't respect packed
20:37 kherbst: it's fine
20:37 kherbst: the cl size/align helpers are respecting packed
20:37 kherbst: or did you mean something else?
20:37 jenatali: How can they?
20:37 jenatali: Packed is a property on the struct, not the member types
20:38 kherbst: ohh. I start to remember the whole situation.. let me see
20:39 kherbst: jenatali: it's fine
20:39 kherbst: check glsl_type::cl_size
20:39 kherbst: or at least should be fine
20:40 jenatali: Take a packed struct { char a; int b; }. Generate a deref chain to b. You need to compute the offset of b based on the size/align of a and b. The align of both of them becomes 1 if the *parent* struct is packed
20:40 jenatali: The size/align helpers would only operate on the member types
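The point about packed being a property of the parent struct can be sketched with a small offset calculation. This is a toy model under stated assumptions, not Mesa's glsl_type code: `field_offsets`, `align_up`, and the `(size, align)` tuples are names invented for the example, with `char` taken as size 1 / align 1 and `int` as size 4 / align 4.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (alignment >= 1)."""
    return (value + alignment - 1) // alignment * alignment


def field_offsets(members, packed=False):
    """Return the byte offset of each member of a struct.

    members: list of (size, align) tuples, one per member.
    packed: property of the *parent* struct; when set, every member is
            placed as if its alignment were 1.
    """
    offsets = []
    cursor = 0
    for size, align in members:
        if packed:
            align = 1  # the member's own alignment is ignored
        cursor = align_up(cursor, align)
        offsets.append(cursor)
        cursor += size
    return offsets


CHAR = (1, 1)  # assumed size/align of char
INT = (4, 4)   # assumed size/align of int

natural = field_offsets([CHAR, INT])               # struct { char a; int b; }
packed = field_offsets([CHAR, INT], packed=True)   # same struct, packed
```

A helper that only sees the member types `char` and `int` has no way to tell these two layouts apart, so the packed flag has to be passed in from the struct that contains them.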
20:42 kherbst: mhhh
20:43 kherbst: but yeah.. packed seems to be broken now nonetheless
20:43 jenatali: Seems like it would be nice if we could just guarantee that the offsets were set and just use that everywhere
20:43 kherbst: I agree
20:44 jenatali: Something for the backlog :)
20:53 kherbst: ufff
20:53 kherbst: align_mul is wrong for struct params anyway
20:53 kherbst: ehh.. wait
20:53 kherbst: this is passed by value
20:53 jenatali: Hm?
20:53 kherbst: kernel void test(global float *res, struct FMAData data)
20:53 kherbst: I have a local test
20:54 kherbst: and something odd happens
20:54 kherbst: vec1 16 ssa_7 = intrinsic load_kernel_input (ssa_2) (0, 0, 2, 0) /* base=0 */ /* range=0 */ /* align_mul=2 */ /* align_offset=0 */
20:54 kherbst: which is just.. not true
20:54 kherbst: the first member of the struct is 16 bit sized, true
20:55 kherbst: doesn't matter in my test anyway
20:55 jenatali: I'm not sure where that align_mul comes from. Must be from lower_explicit_io
20:57 jenatali: The alignment info from LLVM/SPIRV gets dropped when vtn converts things into load_deref/store_deref. See https://gitlab.freedesktop.org/kusma/mesa/-/merge_requests/116/diffs?commit_id=14e7788e9765ce25567cbfe3bf8110b063c8c97c. We can only do 32bit aligned loads/stores in DXIL so I need to preserve that info and split under-aligned loads/stores
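One way to picture the constraint mentioned here (only 32-bit aligned accesses are available, so an under-aligned load has to be rebuilt from aligned ones) is the sketch below. It is an assumption about the general technique, not the code in the linked MR: `load_via_aligned_words` is a hypothetical helper, memory is modelled as a little-endian `bytes` object, and the buffer is assumed to be padded to a multiple of 4 bytes.

```python
def load_via_aligned_words(memory, offset, size):
    """Read a size-byte little-endian value at any byte offset,
    touching memory only through 4-byte aligned word reads."""
    first = offset & ~3               # aligned word containing the first byte
    last = (offset + size + 3) & ~3   # end of the word containing the last byte

    combined = 0
    for i, word_offset in enumerate(range(first, last, 4)):
        # Each iteration models one aligned 32-bit load.
        word = int.from_bytes(memory[word_offset:word_offset + 4], "little")
        combined |= word << (32 * i)

    # Drop the bytes before the requested offset, keep only size bytes.
    shift = 8 * (offset - first)
    mask = (1 << (8 * size)) - 1
    return (combined >> shift) & mask


memory = bytes(range(16))  # bytes 0x00, 0x01, ... 0x0f
unaligned = load_via_aligned_words(memory, 1, 4)  # spans two aligned words
aligned = load_via_aligned_words(memory, 4, 4)    # exactly one aligned word
```

Knowing the real alignment of a load matters for this kind of lowering: an access known to be 4-byte aligned needs a single word read, while one with alignment 1 may straddle two words.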
20:57 jenatali: But the alignment on the SPIRV loads/stores of packed fields is wrong
21:00 jenatali: kherbst: Oh hey: /* TODO: We should try and provide a better alignment. For OpenCL, we need to plumb the alignment through from SPIR-V when we have one. */
21:00 kherbst: ohhh
21:00 kherbst: it's broken because I mess up 64 to 16 bit conversions...
21:00 kherbst: my bad
21:00 kherbst: jenatali: hep
21:01 kherbst: *jep
21:01 jenatali: Always nice when I can tackle a TODO without even knowing it... or at least part of one
21:01 kherbst: :)
21:05 kherbst: okay :) packed is broken
21:05 kherbst: and I am sure it used to work
21:06 jenatali:shrugs
21:06 kherbst: jenatali: easy to fix actually
21:18 jenatali: I'm curious, how easy?
21:18 kherbst: seems like a bit harder actually.. now lower_io is broken
21:18 kherbst: but I have the packed flag set
21:18 jenatali: Yeah, lower_io would probably be another place that's building its own deref offsets...
21:18 kherbst: jenatali: https://gist.github.com/karolherbst/b531ecb00b59a1e59847737bdfb7b255
21:19 kherbst: you can ignore the added comment
21:19 jenatali: Yep, that's as far as I got in my thought experiment before I saw all the other places that'd (probably) blow up and came here to ask :P
21:19 kherbst: yeah..
21:20 kherbst: wait.. I have an idea
21:21 kherbst: ufff
21:21 kherbst: it's not lower_io
21:21 kherbst: I think...
21:21 kherbst: let me debug a little further
21:23 jenatali: It would be. The offsets stored in the glsl_type aren't updated to reflect packed. That's what lower_explicit_io is looking at
21:23 kherbst: could be.. but that's not the reason it fails for me at least
21:28 jenatali: Damn... looks like clang/llvm's the one that gets alignment wrong on loading from packed structs :( https://godbolt.org/z/fQjjo_
21:28 kherbst: :/
21:28 kherbst: jenatali: no
21:28 kherbst: it just adds fake bytes
21:29 jenatali: It adds fake bytes for aligned(), not for pack()
21:30 kherbst: ohh, I see now
21:30 kherbst: ehh
21:30 jenatali: %6 should have alignment 1, but it has 4
21:30 kherbst: yeah
21:34 jenatali: Well that's going to be fun to deal with at some point... not sure what I'm going to do about that :(
21:40 kherbst: jenatali: ahh, lower_io uses glsl_get_struct_field_offset
21:40 kherbst: mhh, so yeah
21:40 jenatali: Yeah, that's what I was saying, it's the cached offsets
21:40 kherbst: I think we only need to change that
21:40 kherbst: yep
21:40 kherbst: when are those set?
21:41 jenatali: And anyplace else deref chains are built (nir_deref.c struct_type_get_field_offset)
21:41 jenatali: Those offsets are set in vtn
21:41 jenatali: Right by (above?) the code you changed
21:41 kherbst: glsl_type::struct_location_offset I guess?
21:41 jenatali: Yeah maybe. That looks like it has weird logic that's GL-specific. Dunno
21:42 kherbst: mhh.. right
21:42 kherbst: all that looks a bit weirdo
21:42 jenatali:shrugs
21:42 jenatali: I'm happy if you want to find all the right places to change for me though ;P
21:43 kherbst: it does end up using glsl_get_struct_field_offset...
21:43 kherbst: so something is setting it
21:44 kherbst: jekstrand: do you know how the internal struct offsets gets set?
21:44 kherbst: I see glsl_type::get_explicit_type_for_size_align doing something
21:44 kherbst: but..
21:44 kherbst: I am not quite sure how that all works out with vtn
21:45 kherbst: maybe I just need to use nir_lower_vars_to_explicit_types? mhhh.. couldn't be though
21:45 kherbst: I think something is missing
21:46 kherbst: for another time
21:46 kherbst: at least we know what's up
21:46 jenatali: kherbst: In spirv_to_nir, handling of SpvOpTypeStruct, there's logic for KERNEL to assign field offsets
21:46 jekstrand: kherbst: Likely, you need lower_vars_to_explicit_types
21:46 kherbst: jenatali: ohh, right. I forgot :O
21:46 jekstrand: kherbst: We do set explicit offsets when stuff comes in from Vulkan and has them.
21:47 kherbst: jenatali: let me fix it then :D
21:47 kherbst: now it's easy
21:47 jenatali: Again, as long as you only use explicit_io to access things and nothing else that walks structs or deref chains
21:48 kherbst: jenatali: not an issue with my fix
21:48 jenatali: Great
21:48 kherbst: at least I hope...
21:48 jenatali: :P
21:49 kherbst: ahh moved the wrong line
21:49 kherbst: :D
21:52 kherbst: and apparently you can't call align with align(a, 0)
21:52 kherbst: but 1
21:53 kherbst: heh.. it's slightly better
21:53 kherbst: now I have 0xc, 0xe and 0x12 as offset...
21:53 kherbst: all three float values
21:53 kherbst: first one should be 0xa
21:54 kherbst: although
21:54 kherbst: even that is wrong
21:55 kherbst: mhhh
21:55 kherbst: 0x0: 64 bit, 0x8: 16 bit, 0xa: struct {float, float, float}[10]
21:55 kherbst: that's the struct I have essentially
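The offsets quoted here can be checked with a small layout calculation. This is a sketch under assumptions: the member sizes and alignments (8/8 for the 64-bit value, 2/2 for the 16-bit value, 4/4 for float) are guesses about the test struct, and `layout` is a helper invented for the example rather than anything in Mesa.

```python
def layout(members, packed=False):
    """Return byte offsets for a list of (size, align) members.
    With packed=True every member is placed with alignment 1."""
    offsets = []
    cursor = 0
    for size, align in members:
        if packed:
            align = 1
        cursor = (cursor + align - 1) // align * align  # round up
        offsets.append(cursor)
        cursor += size
    return offsets


FLOAT_SIZE = 4
INNER_SIZE = 3 * FLOAT_SIZE  # struct {float, float, float}, no padding

members = [
    (8, 8),                # 64-bit member (assumed align 8)
    (2, 2),                # 16-bit member (assumed align 2)
    (INNER_SIZE * 10, 4),  # inner struct[10], aligned like float
]

packed_offsets = layout(members, packed=True)
natural_offsets = layout(members, packed=False)

# Offsets of the three floats in element 0 of the array, packed layout.
packed_floats = [packed_offsets[2] + i * FLOAT_SIZE for i in range(3)]
```

In this model the packed layout puts the three floats at 0xa, 0xe and 0x12, matching the expected values. The unpacked layout would start the array at 0xc, which happens to equal the wrong first offset reported above, though that may be a coincidence.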
21:56 jenatali: IIRC according to the spec, the sub-struct maintains its alignment, unless the struct was also annotated with packed
21:57 jenatali: Eh maybe not. Looking at the last example in https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#specifying-attributes-of-types
21:57 kherbst: yeah
21:57 kherbst: :p
21:57 kherbst: C rules apply
22:00 kherbst: jenatali: that one is 122 bytes big: https://gist.github.com/karolherbst/555c2b7986e50d39da997a54c5ef9848
22:00 kherbst: at least according to a C compiler
22:01 jenatali: Cool
22:06 anholt_: austriancoder: btw, I'm also working on getting tracie working in the bare-metal world.
22:30 kherbst: jenatali: yeah.. anyway.. it's mostly working just something is weird about the first member
22:31 kherbst: jenatali: https://gist.github.com/karolherbst/b531ecb00b59a1e59847737bdfb7b255
22:33 jenatali: kherbst: What do you mean the first member?
22:33 kherbst: jenatali: https://gist.githubusercontent.com/karolherbst/493190decc66e5d6bf71b75e3c937d3c/raw/b81ffc5723cce0e87b0ad1a85293cbdf7f9686df/gistfile1.txt
22:33 kherbst: this is what I get
22:33 kherbst: three inputs are loaded in the loop
22:33 kherbst: all three are floats
22:33 kherbst: the first one has the wrong offset
22:33 kherbst: should be 0xa instead of 0xc
22:34 kherbst: the two other ones are fine
22:35 jenatali: Weird...
22:36 kherbst: yep
22:36 kherbst: anyway, with my patch the situation looks better
22:37 jenatali: Awesome, thanks
22:37 kherbst: jenatali: I am sure there is some silly if 0 corner case
22:37 jenatali: Yeah, wouldn't be surprised
23:09 jenatali: kherbst: Btw, this still doesn't address putting __attribute__((packed)) on individual struct members, which is something that you can do :(
23:11 kherbst: I see
23:12 kherbst: jenatali: how does it look inside spirv?
23:12 kherbst: CPacked on the member?
23:12 jenatali: Dunno, haven't tried it, but that'd be my guess
23:12 kherbst: mhh CPacked is structure only
23:12 jenatali: Maybe it's explicit offset? Or alignment? I dunno
23:13 kherbst: jenatali: I honestly think the entire struct is packed and spirv-llvm-translator does manual padding
23:13 kherbst: I don't see any other way with spir-v
23:13 jenatali: Ooh, that'd be clever
23:13 kherbst: yeah... or something else :p
23:13 kherbst: I kind of fully rely on the padding for the alignment bits
23:13 kherbst: but if they start doing something else we might have to rework it
23:13 jenatali: Yeah, I noticed that __attribute__((aligned())) did that
23:14 kherbst: it's how llvm does it
23:14 kherbst: and it's best to not fight against llvm :p
23:36 jenatali: Well, my alignment concerns weren't well-founded, in fighting the optimizer I made the compiler lose alignment info. Packed struct loads do have correct alignment. hooray
23:47 kherbst: nice