01:10 airlied: jenatali, jekstrand : https://paste.centos.org/view/raw/8c248dd5
01:11 airlied: memcpy lowering, the ssa_117 cast is stopping nir_src_as_deref
01:12 jenatali: Hm...
01:13 airlied: just wondering if I should make the mempy lowering peer thorugh that sort of thing
01:13 jenatali: airlied: Isn't the src a deref though? Is the problem somewhere else or is it really nir_src_as_deref?
01:16 airlied: jenatali: doh got it backwards, it's ssa_665 thats the issue
01:16 airlied: since it had the iadd
01:17 jenatali: Uh... what's even the meaning of memcpy from a non-pointer?
01:17 airlied: it's a pointer you've loaded
01:17 jenatali: Unless it's something like (void*)((size_t)ptr + 0x10) as the src
01:17 airlied: yes that
01:18 jenatali: I guess the SPIR-V is just missing that extra cast?
01:18 airlied: yeah maybe three is a missing cast to uint8_T there
01:19 jenatali: My reading of the SPIR-V spec for OpCopyMemorySized says Source needs to be a pointer
01:19 jenatali: So that reads like invalid SPIR-V to me
01:19 jenatali: airlied: Maybe force it to be cast in vtn?
01:20 airlied: I'll dump the psirv
01:20 airlied: spir-v
01:29 airlied: jenatali: spir-v is fine, and initial nir looks okay, we must eat the cast somewhree
01:29 jenatali: :(
01:34 jenatali: airlied: What's the SPIR-V look like? Ptr -> int -> add -> cast to ptr?
01:35 airlied: it's just two vars
01:35 airlied: it's when we lower explicit io
01:35 jenatali: Are you running the memcpy lowering after lower_explicit_io?
01:35 airlied: I've moved memcpy lowering up a bit to see if it helps in clvoer
01:35 jenatali: Yeah I'm pretty sure that has to be first
01:36 airlied: yeah explicit io got moved up for some other reason
01:36 airlied: whack-a-mole
01:36 jenatali: Since you could lower explicit for mem mode A, but the memcpy could involve both A and B, it gets messy
01:37 jenatali: At least, that was my reasoning for suggesting it be a separate pass rather than just part of lower_explicit_io
01:38 airlied: now darktable at least starts
01:44 airlied: starts but doesn't use CL :-P
01:46 zmike: Kayden: wow, I've been hitting that b2f64 thing for months and thought it was just my bug somewhere
01:59 Kayden: zmike: ah, nope, it's a recent(?) regression
01:59 Kayden: maybe not so recent
02:00 Kayden: I wrote an idea of how we might want to handle this in general, but I think for now that patch should be good enough
02:00 Kayden: most of the algebraic transformations just turn fp64 into other fp64, only a few produce fp64 from nothing
02:01 Kayden: I didn't check the other int64 related rules for potential problems
02:10 airlied: ah yeah darktable does annoying kernels with inline usaeg that break things
02:30 ajax: jenatali: i guess i don't know what you mean by completely side-by-side?
02:31 airlied: as a separate libGL amybe
02:35 jekstrand: airlied: It's possible someone's optimizing that cast away invalidly
02:36 airlied: yeah I think the explicit io lowering ordering is just wrong
02:36 airlied: now I'm into "inline" vs "static inline" and C99 vs opencl c
02:39 ajax: i guess the short answer is that you still need to speak some glx protocol to get the fbconfig list, and the dri drivers already have hooks in most of the places you'll need to hook, so inventing a whole parallel set of hooks is just busywork
02:40 ajax: it's still sort of a DRI driver in that it's using the DRI loader conventions for getting vtables across the libgl/driver boundary
02:41 ajax: but it's __DRI_COPPER not __DRI_DRI2 so that if i need to change calling conventions or parameters, i can
02:42 ajax: also can everyone please stop with the fucking double underscore shit, we have symbol visibility let's pretend we know how it works
02:43 ajax: second only to the word "helper" in the list of API spelling tics that annoy me. hungarian notation is less bad.
03:46 jenatali: ajax: I guess my main point was that calling it "DRI" seems like it's using the /dev/dri/card<n> infrastructure, when it's really building on all Mesa-specific infrastructure that's been built up around the rest of DRI (at least, that's my POV as an outsider)
03:47 jenatali: I'm mainly just trying to understand how I should be thinking about making our /dev/dxg infrastructure work for trying to get GPU-paravirtualization acceleration for WSL, which looks an awful lot like your copper work, and just trying to understand why it is the way it is
03:48 airlied: jenatali: dri is quite overloaded
03:48 airlied: it has at least 3 meanings
03:49 airlied: jenatali: /dev/dri/ interaces are actually called the drm interface
03:49 airlied: along with that, DRI means the X protocol between X and the clients, and the internal mesa interface between loader and drivers
03:50 ajax: yeah, terribly overloaded terminology. i'd really like to make "dri" just mean the protocol to coordinate with xserver, and make the mesa internals that use "dri" be more like, i dunno, hwr for hardware renderer?
03:50 jenatali: airlied: Ah, that makes sense
03:50 jenatali: Shame that "drm" is *also* overloaded with digital rights management
03:50 ajax: we got there first tbh
03:50 jenatali: Yeah, that was my read on history
03:51 jenatali: Most of this predates my time in the industry :)
03:52 airlied: uggh spirv 1.4 is the first time they allow vec4 = select bool vec4, vec4 to work
03:53 jenatali: I guess, as long as people won't balk at my intended use of DRI based on Penny/Copper, that's all that really matters at the moment
03:54 jenatali: airlied: Looks worse than that... looks like you could do select on structs too
03:56 airlied:got darktable to start with CL on radeonsi, crashes once I try and use it :-P
03:57 jenatali: Progress is progress
04:11 airlied: ah ouch radeonsi is very much sampler_view + sampler as one entity focused
04:44 airlied: karolherbst: I think you might have to merge inline samplesr of the same types
04:45 jenatali: airlied: We had to do that: https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/microsoft/clc/clc_nir.c#L326
04:45 jenatali: Looks like clang/llvm generates a new sampler for every call site where it's used
04:45 airlied: jenatali: yup definitely want to copy that
04:46 jenatali: Or move it - feel free
04:46 jenatali: We finally have CI in place :)
06:13 airlied: my darktable efforts ended in spirv-1.2 missing select functionality, and forcing 1.4 breaking lots of other stuff
10:47 kusma: airlied: in the long term (and for DXVK AFAIK), VK drivers needs to compile code correctly even if we disable optimizations. So the drivers need to be careful about what the difference between "this is a required transformation" and "this is something that just makes things faster". Yeah, It's not always easy, so most things should probably just ignore that flag.
10:48 kusma: airlied: but yeah, in the longer term, we should compile a "fallback"-shader quickly that does all variant stuff as conditionals etc, so we can start drawing as early as possible, and compile specialized shaders in the background.
10:49 kusma: Otherwise, we'll have to live with compilation hitches, which isn't great.
10:49 kusma: But don't get me wrong, zink doesn't even try to do this stuff yet, so that flag should be removed, like you just did.
10:52 kusma: mareko: I'm not sure if you're the right person to ask, but I'm seeing a crash that seems to have started at some point after mesa 19.3 in st_create_vp_variant, and you have touched that code in between. The problem is that we call tgsi_dup_tokens(NULL), because stvp->state.tokens is NULL.
10:54 karolherbst: airlied: yeah.. shouldn't be too hard
10:54 kusma: stvp->state.type is PIPE_SHADER_IR_NIR, but key->is_draw_shader is true
10:54 karolherbst: airlied: luckily we have more freedom on our hw and can mix and match
11:09 kusma: Ah, I think the problem is really that the "key->is_draw_shader || draw_has_llvm()"-case doesn't actually work, and I had to disable the llvm-support on my end because my build is no longer sufficient to build things properly locally...
11:11 kusma: The "key->is_draw_shader || draw_has_llvm()"-case doesn't work because "st_release_variants()" has no idea that we still need the tokens...
11:13 kusma: Hmm, no. It doesn't seem to be about "st_release_variants()", nevermind that bit...
12:10 mareko: kusma: this should fix it: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7826
12:47 kusma: mareko: wow, thanks :)
12:50 kusma: mareko: I'll give it a review soon :)
12:50 mareko: kusma: if you use a driver that's using NIR, it will never use TGSI tokens with that MR
12:58 hakzsam: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/5891724 ...
12:59 hakzsam: MrCooper: I think we need an exception when the commit message starts with "Revert", should I disable the sanity check in the meantime?
13:36 MrCooper: hakzsam: please comment out only the "ci-fairy check-commits" line, don't disable the sanity job completely
13:38 MrCooper: or, since the shortlog seems just 1 character too long, maybe shorten it somehow
13:40 MrCooper: hakzsam: can you file an issue at https://gitlab.freedesktop.org/freedesktop/ci-templates/-/issues as well?
13:40 hakzsam: yes
13:49 MrCooper: thanks!
14:08 kusma: MrCooper: Same failure again here, seems like a bad builder? https://gitlab.freedesktop.org/mesa/mesa/-/jobs/5893666#L1113
14:08 kusma: mareko: Sounds great!
14:09 hakzsam: MrCooper: btw, are you fine with the CI commit on top of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7843/commits ?
14:09 MrCooper: daniels: ^ hmm, maybe the ccache is corrupted on fdo-packet-m1xl-1, or there's something else wrong with it?
14:10 MrCooper: hakzsam: looks fine
14:23 FireBurn: Hi
14:24 FireBurn: I'm seeing a build failure on mesa git https://pastebin.com/mrFiyLSd
14:26 hakzsam: see !7843
14:26 hakzsam: I think someone should definitely review it
14:35 daniels: MrCooper: thanks for the heads-up, just obliterated ccache
15:21 igrtrrt: Hi, there is a good way to check if an optimization in the spirv_to_nir is generating a valid nir? Also, is it possible to have the same kind of report as in shader-db using a spirv shader instead of a glsl?
15:21 imirkin: igrtrrt: run with NIR_VALIDATE=1 i think?
15:22 imirkin: which may be the default in debug builds, not sure
15:22 igrtrrt: I will try it, thanks!
15:23 pendingchaos: NIR_VALIDATE is enabled by default in debug/debugoptimized builds
15:23 imirkin: even debugoptimized??
15:23 imirkin: isn't it mega-slow?
15:23 pendingchaos: yes, debugoptimized
15:24 pendingchaos: it's slow, but I think not as slow as it used to be
15:24 pendingchaos: also validation isn't compiled for release builds, so NIR_VALIDATE=1 does nothing there
15:24 imirkin: ah, i thought the whole point of debugoptimized was that it didn't run validation
15:25 imirkin: (or some of the point)
15:26 igrtrrt: What should I do to compile with debug? Is this option in the meson_option.txt?
15:27 pendingchaos: igrtrrt: fossilize/vkpipeline-db allows you to test Vulkan pipelines similar to shader-db
15:27 pendingchaos: I don't see anything in shader-db indicating that it supports OpenGL+SPIR-V
15:28 pendingchaos: meson configure <build directory> -Dbuildtype=debug
15:30 igrtrrt: Thanks pendingchaos
16:27 SomeoneSerge: Hi, idk if it's appropriate place for users questions, but is it an expected behaviour if /proc/cpuinfo reports different frequency (400MHz) from "current freq" reported by cpufreq userspace governor (1.9GHz)?
16:30 FireBurn: Ah https://gitlab.freedesktop.org/hakzsam/mesa/-/commit/32193b67ff3bac3420cf1265dfd26a9d6f94736a.patch fixes the build for me
16:49 ajax: MrCooper: hey, in !7660 i've dropped the libxdamage-dev from x86_build-base.sh since it won't be needed in the future, but i haven't explicitly bumped the distribution tag. should i, or is it harmless to just leave in place for the next person who really needs to bump it?
16:51 MrCooper: in general it's better to bump, to rule out an unpleasant surprise for the next person who bumps in an unrelated MR
16:54 MrCooper: this will require bumping the android/i386/ppc64el/s390x/x86_build tags as well
16:55 ajax: gotcha, was just looking at the git-blame to see what needed doing last time the tag got bumped. thanks!
17:00 MrCooper: np
17:59 ajax: why is the meson-release job broken
18:01 MrCooper: on fdo-packet-m1xl-1 again?
18:01 ajax: aye
18:02 MrCooper: job URL?
18:02 ajax: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/5898716
18:02 anholt:is fighting debootstrap failing due to lack of mknod permissions
18:04 MrCooper: ajax: weird, so it seems to consistently fail on that runner, but not on others
18:04 MrCooper: anyway, signing off for today, bbl
18:31 daniels: ajax, MrCooper: let me see if I can obliterate the cache harder
18:47 daniels: anholt: https://gitlab.freedesktop.org/anholt/mesa/-/jobs/5902027 - is this expected with you making changes or should I be just writing off packet-m1xl-1 and standing up a new one from scratch?
18:47 anholt: daniels: that error is the thing I've been fighting, not sure what's up with it.
18:47 anholt: (Cannot install into target '/lava-files/rootfs-arm64' mounted with noexec or nodev)
18:48 anholt: it's also happening on my personal runner
18:49 anholt: (luckily my personal runner has more cores and a big ccache so at least I can fail in 5 minutes
18:49 anholt: pretty sure we need to increase mesa's ccache size
18:50 anholt: or get on that sccache thing
19:03 daniels: anholt: if you try to do the same thing using native mknod rather than emulated, does it still fail?
19:05 anholt: "sudo docker run -it registry.freedesktop.org/anholt/mesa/debian/arm_test-base:2020-11-30-sanitizers bash; mknod test-null c 1 3" succeeds.
19:06 anholt: and copy and pasting the debootstrap line works too
20:24 anholt: cool. looks like the ci-templates bump is probably what broke arm64_test build
20:34 daniels: anholt: oh?
20:36 anholt: https://gitlab.freedesktop.org/anholt/mesa/-/jobs/5904986 rebuild on master, ci-templates bump failed to bump the image tags, and the buildah changes across those revs sound like they might explain this behavior change in buildah but not a docker run
20:36 anholt: (there's another job running on an fdo runner in that pipeline, to show that it's not just my runner)
20:42 Rodrigo_: I work on an application that requires robust vertex buffer access to work properly
20:43 Rodrigo_: on OpenGL we use NV_vertex_buffer_unified_memory when available
20:43 Rodrigo_: where should I make this feature request (here/gitlab)?
20:43 Rodrigo_: I need something like BufferAddressRangeNV, or using opaque handles instead of addresses (like vkCmdBindVertexBuffers2EXT)
20:46 imirkin: Rodrigo_: probably gitlab, but ... it might fall on deaf ears. why is something like ARB_buffer_storage not sufficient for you?
20:47 Rodrigo__: some games use more data than what they bind on an indexed draw
20:48 Rodrigo__: and expect zeroes at the end
20:48 imirkin: right...
20:48 imirkin: glBindBufferRange right?
20:48 Rodrigo__: yeah, but that doesn't exist for BindVertexBuffer
20:48 Rodrigo__: it binds the whole buffer
20:48 imirkin: o
20:48 imirkin: that's ... surprising
20:49 Rodrigo__: Vulkan had the same issue, until VK_EXT_extended_dynamic_state
20:49 imirkin: but it's been a while since i looked, so quite plausible =/
20:49 imirkin: you need like a GL_buffer_view :)
20:49 imirkin: (like ARB_texture_view, but for buffers)
20:50 Rodrigo__: yay, more discrepancy between names on APIs :P
20:50 Rodrigo__: VkBufferView is used texture buffers
20:52 imirkin: i realize it's not completely the same, but does glDrawRangeElements not satisfy your requirements?
20:53 Rodrigo__: isn't that for the contents of the index array?
20:53 imirkin: the contents of the index array strongly influence access to the vertex arrays though
20:54 imirkin: like i said, not completely the same :)
20:54 Rodrigo__: I don't know the contents of the index buffer without walking through it in some way
20:54 Rodrigo__: since it can be modified from the GPU
20:54 imirkin: you don't need to -- glDrawRangeElements will supply a min/max index into the vbo
20:55 pendingchaos: imirkin: I thought the min/max was a perf hint?
20:55 Rodrigo__: to be fair reading out of <address>+<length> in BufferAddressRangeNV yields undefined values :P
20:55 imirkin: hmmm
20:56 imirkin: yeah ok. "It is an error for indices to lie outside the range [start,end], but implementations may not check for this situation. Such indices cause implementation-dependent behavior"
20:56 imirkin: in nvidia hw, we program some limits which will trigger its "out of bounds" logic
20:56 imirkin: which iirc just returns 0's (or a configurable value, depending on gen)
20:57 Rodrigo__: it's always been zeroes for me, but that could be HOS setting it up when the application boots
20:57 Rodrigo__: (Maxwell)
20:57 imirkin: it's actually the much older gens where it's configurable what happens when you're out of bounds
20:58 imirkin: (tesla)
20:59 imirkin: and yeah, normally you just set it to 0's
20:59 imirkin: and i think newer gens might actually clamp the index to the range
21:00 imirkin: so you get a valid vertex's data
21:00 imirkin: not sure
21:00 Rodrigo__: I've tried robust index reads on deko3d and it returns zeroes
21:01 Rodrigo__: (out of bound reads from the index buffer, but that one is easier to emulate)
21:05 imirkin: Rodrigo__: anyways, if you want to make that type of ext "happen" in mesa, you need to recruit the various driver authors, since it'll do you little good if it's implemented only in some drivers but not the major ones
21:05 Rodrigo__: do you mean non-mesa driver authors?
21:05 imirkin: i mean the authors responsible for various different hardware supported for mesa
21:06 imirkin: as you might expect, it's a different team support e.g. intel and amd hw.
21:08 ajax: or write it yourself, of course
21:08 imirkin: naturally.
21:09 ajax: or write some piglit tests for it so if someone does come along to implement it they have some idea if it works right
21:09 Rodrigo__: yea, I thought on adding it, I'll have to see if vertex buffers have a length parameter on mesa
21:09 Rodrigo__: if they do, it makes things easier :P
21:10 Rodrigo__: vertex buffer bindings*
21:10 imirkin: search for pipe_vertex_*
21:11 imirkin: [that will only apply to gallium drivers, but the only non-gallium driver of note is i965 which officially supports pre-skylake hardware]
21:13 pendingchaos: if there is a way to specify a size, it looks like radeonsi ignores it: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/radeonsi/si_state_draw.c?id=86675a07f811280974e834c5164de60a315c8114#n1649
21:13 imirkin: i know nvidia hw definitely has controls for the size of each vertex buffer
21:14 Rodrigo__: I can check how it was implemented on RADV
21:14 imirkin: and yeah, i think we have similar code in nouveau, so probably no way to supply width
21:14 imirkin: (i.e. would have to be added)
21:14 Rodrigo__: yes, it makes sense to have the assumption of `size - offset` everywhere
21:15 airlied: jekstrand: printf, even a quick glance :-P
21:17 jekstrand: airlied: Ugh... Yeah, I'll take a look.
21:21 imirkin: Rodrigo__: well, not *quite*, but ... https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c#n347
21:22 pendingchaos: Rodrigo__: RADV is here: https://cgit.freedesktop.org/mesa/mesa/tree/src/amd/vulkan/radv_cmd_buffer.c?id=86675a07f811280974e834c5164de60a315c8114#n2806
21:22 pendingchaos: it works basically the same as radeonsi afaict
21:22 pendingchaos: except, like other AMD vulkan drivers, it configures the bounds 1 element too low if the size isn't a multiple of the stride on some cards
21:22 pendingchaos: and on other cards, the bounds checks are per-component instead of per-attribute in some cases
21:24 pendingchaos: (these are bugs)
21:24 Rodrigo__: looks like vertex buffers are weird on AMD
21:59 jekstrand: airlied: I still hate printf
21:59 jekstrand: airlied: Just in case you're wondering whether or not that changed. :)
22:04 jekstrand: airlied, jenatali Do we actually need to write anything into the buffer for string args?
22:04 jekstrand: As far as I can tell, we don't.
22:05 jekstrand: They're always compile-time-known
22:05 jekstrand: The only reason for %s is "security" because we can't always splice inside VTN
22:06 imirkin: i remain surprised that printf exists in opencl.
22:06 keithp: jekstrand: %n
22:07 jekstrand: keithp: Yes, that's why we can't splice
22:07 jekstrand:is suddenly reminded of that Intel "security" training he's had to suffer through
22:07 keithp: could be simplified to 'don't program'
22:07 keithp: or even 'don't touch anything with a processor'
22:08 jekstrand: "live in a mud hut"
22:08 keithp: too much silicon in mud for my comfort
22:08 keithp: could spontaneously turn into a computer
22:08 jekstrand: pieapple under the sea then?
22:08 keithp: there you go
22:09 jenatali: jekstrand: Can you do `foo ? "a" : "b"`?
22:09 jekstrand: jenatali: Not in our current impl. :)
22:09 jenatali: Hm, good point
22:10 jenatali: Can you do %.*s?
22:10 jekstrand: What does that do?
22:11 jenatali: Number of characters is variable based on another arg to printf
22:11 jekstrand: Again, not in our current impl. :P
22:11 jekstrand: Well, maybe?
22:11 jekstrand: I guess if you have the base, it can be sorted out CPU-side
22:12 jenatali: Looks like no, the CL C spec doesn't support it, but plain C does
22:12 jekstrand: Thinking about this stuff again, I think we roughly have two options:
22:13 airlied: jekstrand: oh you mean when you see a %s bake the value into the format tsring instead?
22:13 jekstrand: 1. Assume %s is 100% constant-determinable and parse it in VTN in which case we don't need to write anything to the buffer and I kind-of want to stuff the string args in a nir_printf_format_info::str_args array.
22:13 jekstrand: airlied: We can't do that; it might contain "%"
22:13 airlied: indeed that would be bad
22:13 airlied: jekstrand: I did have a str_args array before
22:14 airlied: I moved to a single concatenated string because why not
22:14 airlied: and it seemed simpler
22:14 jekstrand: 2. Don't assume we have constant-determinable string args and just let them live in nir_shader::constant_data and write the offset into the buffer.
22:14 jenatali: jekstrand: You could escape the % to %% while embedding the string if you wanted to
22:14 jekstrand: airlied: I'm hoping we can have only one array hanging off the nir_shader rather than two. Maybe that's fool-hardy
22:14 airlied: jekstrand: 2 diesn;t wirj
22:14 airlied: doesn't work
22:14 airlied: we can't extract them at clover time
22:15 jekstrand: airlied: We could, it'd just be more work. :P
22:15 airlied: jekstrand: I couldn't make it work even with more work
22:15 jekstrand: Hrm... Maybe we can just do %%
22:16 jekstrand: Given that % is the only special character in a printf string, s/%/%% should be a sufficient escape algorithm.
22:16 jenatali: Sounds fine by me
22:16 jekstrand: I'd love it if we could get rid of string arguments entirely.
22:16 airlied: jekstrand: the other option is to avoid hagnign it from nir shader
22:17 airlied: and just pass in a struct ptr to spirv_to_nir
22:17 jenatali: The one thing I'll say is that it makes it harder in the case we ever need to backtrack and support any kind of dynamic-ism on it; which I can't believe we ever would...
22:17 jekstrand: jenatali: Which makes it harder?
22:17 airlied:also isn't sure it makes the code cleaner
22:17 airlied: it's pretty ugly no matter how it ends up
22:17 jenatali: jekstrand: It means we have to change the buffer format, rather than just the VTN code
22:18 jekstrand: jenatali: Yeah, but the more I look at this the more I think we have a fork in the road.
22:18 jekstrand:will play with something
22:18 jenatali: I'm fine with just embedding it in the format string
22:19 jenatali: Though be aware you need to deal with precision modifiers, so %1s on "foo" is just "f"
22:19 airlied: my only problem with doing that I I don't want to do printf parsing and % subsitution
22:19 jenatali: Er, %.1s
22:20 airlied:wonders why that is any nicr than just passing them trhough
22:20 jekstrand: airlied: Someone has to handle it
22:20 airlied: jekstrand: printf handles
22:20 airlied: it
22:20 jekstrand: airlied: Oh....
22:20 jekstrand: airlied: Let me play. :)
22:21 jenatali: Yep
22:21 airlied: I'd rather not be parsing things when I just hand them to printf in clover later
22:21 airlied: jekstrand: I could move the strings into the other structure actually
22:22 airlied: and concatentate the strings to the printf string
22:40 karolherbst: airlied, jenatali: do you have a kernel/test handy which has duplicated inline samplers?
22:41 jenatali: karolherbst: Anything that uses an inline sampler twice
22:41 airlied: jekstrand, jenatali : https://paste.centos.org/view/raw/9a949b55 as a cleanup, totally WIP untested
22:41 jenatali: LLVM duplicates them based on the call site, not the decl site
22:41 karolherbst: ahhh
22:41 airlied: karolherbst: nope I was just testing darktable but there are lots of problems getting it running
22:41 karolherbst: maybe I hit it in the CTS then?
22:41 karolherbst: not sure
22:41 karolherbst: airlied: I guess
22:42 karolherbst: airlied: anything special to use darktable or just launch and enable CL somewhere?
22:42 karolherbst: I could dig into it and fix the issues :D
22:43 jenatali: karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/microsoft/clc/clc_compiler_test.cpp#L1861
22:43 karolherbst: jenatali: ahh, cool
22:44 karolherbst: llvm is annoying :D
22:44 karolherbst: but yeah.. shouldn't be that hard to fix
22:44 jenatali: Oh right, I can actually talk about what we were originally testing with... I saw that pretty early on while looking at Photoshop kernels :)
22:44 karolherbst: probably I do the deduplication when extracting samplers or something.. dunno
22:44 karolherbst: jenatali: .... and this had to be kept secret? :D
22:44 jenatali: karolherbst: Or just take the pass I wrote and move it to be common code
22:45 jenatali: Yeah... the fact that we were working on the project for Photoshop had to be secret for a while
22:45 karolherbst: jenatali: or that
22:45 karolherbst: jenatali: ehhh... "okay"
22:45 jenatali:shrugs
22:45 airlied: karolherbst: we have to fix inline with clang to make it's shaders laod at all
22:45 jenatali: I don't like keeping secrets, I just do it when I'm told I have to
22:45 karolherbst: airlied: too many?
22:46 airlied: karolherbst: C99 and OpenCL C define inline the same, but ltos of apps do it wrong
22:46 karolherbst: ohh.. inline
22:46 airlied: lots of CL C apps do inline when they mean "static inline"
22:46 karolherbst: yeah, no shit
22:46 karolherbst: we should just ignore inline
22:46 airlied: instead of fixing all the apps the workaround seem to be to tell clang -fc89-inlines or something
22:47 karolherbst: ehhhh
22:47 airlied: it's clang that deals with them
22:47 karolherbst: welll....
22:47 karolherbst: s/inline//
22:47 jenatali: What's the actual problem that happens?
22:48 karolherbst: I guess llvm adds multiple copies
22:48 karolherbst: and fails to compile
22:48 airlied: nope the opposite
22:48 karolherbst: none?
22:48 karolherbst: :D
22:48 karolherbst: that's even worse
22:48 airlied: it doesn't add any copies and fails to link
22:48 jenatali: ... wat
22:48 karolherbst: smells like a clang bug to me
22:48 airlied: it's not
22:49 karolherbst: not saying it is, I just say it smells like it :p
22:49 airlied: it's how the C99 inline keyword works
22:49 airlied: in the spec
22:49 karolherbst: I like my s/inline// solution
22:49 airlied: how can wedo that?
22:49 karolherbst: we get the string?
22:49 karolherbst: we pass the kernel to clang?
22:49 airlied: seems like a crappy answer
22:49 karolherbst: it is
22:50 airlied: instead of just asking clang to do the c89 thing
22:50 airlied: also what happens if someone uses inline in a stirng
22:50 airlied: you'd need a C parser
22:50 airlied: which is what clang is
22:51 airlied: anyways inline with -O0 is the main problem
22:51 airlied: but since SPIR-V output needs -O0 you have the issue
22:52 airlied: if you use inline on it's own but don't optimise, C99 expected the implementaiton to be provided by another compilation unit
22:53 airlied: you should use "static inline"
22:53 airlied: https://gustedt.wordpress.com/2010/11/29/myth-and-reality-about-inline-in-c99/
22:55 jenatali: I remember seeing this somewhere... https://github.com/KhronosGroup/SPIRV-LLVM-Translator/issues/432
22:56 airlied: yeah they end with -fgnu89-inline
22:56 airlied: in theory we could only do it for apps where we know it's doing the wrong thing
22:56 airlied: but it seems to be cut-n-pasted around a lot, at least darktable and opencv have had it
23:00 curro: karolherbst: i'm not sure i follow your concern regarding the switch to an intrusive_ptr. if you use it in sampler_argument instead of a plain pointer the temporary sampler you have created on exec_context::bind() will be automatically destroyed when it's no longer in use. i don't think you could use unique_ptr instead because the sampler pointer bound to a sampler_argument is not guaranteed to be unique, it might be a plain sampler
23:00 curro: object the user is still holding a reference to
23:00 karolherbst: curro: set takes void*
23:01 curro: karolherbst: yes, which for sampler objects it's realy a cl_sampler *, so intrusive_ptr can grab a reference to that
23:01 karolherbst: curro: but I just have a list of inline_samplers in the exec_context, but I could probably also rewrite the set functions for sampler arguments I guess?
23:01 karolherbst: mhhh
23:02 curro: karolherbst: why add an extra list to exec_context? (i didn't see that list in your patch anyway)
23:02 karolherbst: right.. so for now we store the normal sampler* thing inside the sampler argument
23:02 karolherbst: ohhh wait..
23:02 karolherbst: forgot to rebase --continue
23:03 karolherbst: but yeah.. I could make kernel::sampler_argument store the intrusive_ptr instead
23:03 curro: karolherbst: yeah i think that would be the simplest thing to do
23:03 karolherbst: but I also didn't want to have dangling pointers, even if temporary... but yeah.. I guess that would work
23:04 curro: as long as it's a refcount-holding pointer it won't be dangling :)
23:05 karolherbst: yeah.. no idea why I wasn't thinking about it..
23:07 curro: karolherbst: one minor issue i noticed in your patch is that you pass 'q->device().address_bits()/8' as size argument for sampler_argument::set(). i don't think that's right in general since 'size' refers to the CPU size of the object the void* points to (i.e. sizeof(cl_sampler)), as is the case for clSetKernelArg()
23:08 karolherbst: ohh, you are right
23:08 karolherbst: "If the argument is of type sampler_t, the arg_size value must be equal to sizeof(cl_sampler)." :)
23:08 curro: yep
23:09 karolherbst: set even checks for it :)
23:15 airlied: okay testing printf rework x+1
23:26 jenatali: Ugh... X11 defines BOOL differently from Win32
23:26 Sachiel: enum bool { true, false, file_not_found };
23:27 imirkin: i think it's abort/retry/cancel actually
23:27 jenatali: No, that's Office, not Win32 :P
23:27 HdkR: I like to confine X11 defines in to just a couple of c files rather than letting it pollute global namespace. It's such a mess D:
23:27 karolherbst: jenatali: you mean int instead of char or something?
23:28 karolherbst: but don't be dissapointed, it does cause bugs sometimes :D
23:28 ccr: true, false, maybe, try_again
23:28 jenatali: karolherbst: Yeah, though just the fact that they both need that type defined is a pain
23:28 imirkin: keep in mind that those X11 definitions go back to the dawn of time...
23:28 karolherbst: C99 for the win
23:28 karolherbst: ...
23:28 jenatali: HdkR: I'm trying to build glue between /dev/dxg and GLX, I need both :(
23:28 karolherbst: really makes you think that bool is a C99 feature
23:28 imirkin: jenatali: c++ + anonymous namespaces? :)
23:28 imirkin: er, s/anonymous/named i guess
23:29 jenatali: imirkin: That's not a bad idea... think people will scream about C++ in GLX?
23:29 imirkin: mmmm ... if it's restricted to the glue, i don't think anyone will care
23:30 jenatali: Let me see how much further I get with that
23:30 jenatali: Thanks
23:30 imirkin: and if you avoid libstdc++, then i suspect any objections would go away
23:30 imirkin: i.e. just use the core language features
23:30 jenatali: Yeah, that should be easy enough
23:42 airlied: jenatali: hey can you fixup the clc stuff in the printf branch again?
23:42 airlied: jekstrand: new version in the MR now
23:42 jenatali: airlied: Yeah, give me a minute
23:43 airlied: oops underpushed, should be correct branch now
23:48 R0b0t1``: Hi, I've got a moto g7, a msm/adreno part. How do I access toe writeback buffer? I'm trying to get screen contents, and reading /dev/graphics/fb0 fails. There is an fb1 and reading that fails also. I see that /sys/class/graphics/fb1/msm_fb_type reports "writeback panel" and a suggestion from elsewhere leads me to belive this is the device I should be interested in
23:53 airlied: jenatali: I rebased another fix in earlier some be careful not to force push over it
23:53 R0b0t1``: Is it just a case of dumb reading not being supported, and perhaps other ioctls work?
23:54 jenatali: airlied: Thanks for the heads up
23:54 jenatali: Is there some good way to collaborate on a force-push like that? Or should I just commit my change, reset, then throw away the tip and cherry-pick my new commit in?
23:55 airlied: jenatali: yeah force-push generally makes it more of use irc as a locking mechanism :-P
23:55 jenatali: The one downside to two people working on one MR
23:55 airlied: in theory git rebase should work
23:55 airlied: as long as you do it before you force push
23:56 airlied: but it's something you have to remember to do
23:56 jenatali: Yep
23:56 airlied: normally for two ppl on an MR, you jsut pile on top, and agree to rebase/force-push at a good time
23:58 jenatali: airlied: This is still writing an offset into the buffer, right? It's just now into the same char array as the format str?
23:59 airlied: jenatali: yes