00:00 karolherbst: I think I trust compilers enough to not mess this up :p
00:02 ccr: famous last words.
00:03 karolherbst: hey.. I think they should be able to do the most basic kinds of optimizatons :p
00:05 karolherbst: jenatali: but I think you want to set the alignment correctly because it can mean quite a perf impact
00:05 karolherbst: generally
00:06 karolherbst: on nv hardware shared mem is L2 cache, and you have to define a split between L2 and shared memory. So the more shared mem you need, the part used for caching can get reduced
00:06 karolherbst: and there are like various trip points
00:06 jenatali: Yeah, though my primary concern is correctness above perf
00:07 karolherbst: sure
00:07 karolherbst: I just don't think that's hard to fix, but very hard to figure out this is causing lower perf :p
00:07 karolherbst: or could be an opportunity to improve it
00:08 jenatali: We're *this* close to getting a conformance certification - once we've got that I'm looking forward to shooting for CL3.0 and improving perf
00:08 karolherbst: :) cool
00:19 imirkin: so ... anyone around who knows how uapi drm events are _supposed_ to work?
00:29 imirkin: looks like nouveau disables vblank events entirely when turning off a head, which means that any drmWaitVBlank's should be getting an error... hm
00:31 imirkin: oh wait. i think i see the bug in nouveau ddx code...
03:36 airlied: ugghh is panfrost ci timing out
05:51 airlied: daniels: throwing a temp disable for t720/760 jobs
05:55 daniels: yeah thanks, for some reason half the lab is completely AWOL this morning
05:56 imirkin: taking the sunday off?
09:09 pmoreau: Damn! 😲 “On a recent desktop, the 1.0 tests used to take about three days.” (Comment from the OpenCL 1.1 CTS, in 2010)
09:10 EdB: :)
09:11 pmoreau: The good old times :-)
09:11 EdB: karolherbst: when to you think that there will be a printf impl on nir side ?
09:11 pmoreau: Did you this MR, https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6308?
09:12 pmoreau: *Did you see
09:15 EdB: yes
09:16 EdB: I was wondering if someone can make an ETA
09:19 EdB: because if we want a print runtime that support AMD native and NIR target, I will need to know what is expected by NIR
09:20 pmoreau: I see :-)
09:20 pmoreau: Sorry, no idea about an ETA.
09:23 EdB: It seems that the fate of my Clover 1.2 changes is to languish unmerged :)
09:24 pmoreau: It wouldn’t be the first clover MR to encounter that fate, sadly.
09:24 pmoreau: I should give it another review since I think you added new features compared to last time I reviewed.
09:26 EdB: pmoreau: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5038#note_597583 I was talking about LLVM version not the LLVMSPIRVLib version
09:27 pmoreau: Ah, you meant I should make clover require 8.0.0 for LLVM as well?
09:27 EdB: yes, if it easier for version managment
09:28 pmoreau: I see
09:28 EdB: since clover can only support AMD and spirv
09:28 EdB: and that AMD already requires 8.0.0
09:29 pmoreau: There is an implicit dependency on LLVM 8.0.1 since LLVMSPIRVLib X.Y.Z.W is built against LLVM X.Y.Z.
09:30 pmoreau: But I have no idea how to express that in Meson.
09:31 pmoreau: clover does also support Nouveau and SPIR-V if you set `NOUVEAU_ENABLE_CL=1` in your environment, but we can definitely bump the LLVM requirement to 8.0.0.
09:35 EdB: Nouveau support is with NATIVE_IR or only with IR_NIR_SERIALIZED ?
09:39 pmoreau: Only IR_NIR_SERIALIZED
09:39 pmoreau: EdB: This would work for you, can I keep your Rb? https://hastebin.com/winesaluye.php
09:40 EdB: don't you want _llvm_version = '>= 8.0.1' ?
09:41 pmoreau: True
09:44 EdB: it's ok for me, you have my RB
09:47 pmoreau: EdB: How about this updated version? https://gitlab.freedesktop.org/pmoreau/mesa/snippets/1143
09:48 pmoreau: (I haven’t build-tested it yet, but more looking for feedback on the idea.)
09:57 EdB: seems strange
09:57 EdB: I can prepare a branch with if with_amd_vk or with_gallium_radeonsi or with_gallium_opencl
09:58 EdB: and the removal of all clover compat stuff
09:58 EdB: to raise the min version to 8.0.1
10:04 Vanfanel: pq: wrt cursor hotspot, you told me that "for a cursor, you'd have CRTC_X = cursor_position.x - hotspot.x". And I was also told that using drmModeMoveCursor() mixed with atomic is a good idea. So, how can I tell drmModeMoveCursor? Maybe by operating the X and Y position parameters with the hotspot coords before passing them to drmModeMoveCursor?
10:04 Vanfanel: I mean, how can I tell drmModeMoveCursor() about the cursor hotspot?
10:35 karolherbst: EdB: we more or less already have one, but the AMD llvm backend seems to post process the format string already
10:35 karolherbst: I don't have much against "driver" specific things, but right now clover is like 50% AMD specific and putting more stuff into "core" doesn't help separating all of that
10:36 karolherbst: EdB: printf for nir: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6308
10:39 karolherbst: there are just a lot of things which are very bad for nouveau, but are acceptable on AMD, like how we handle constant sources
10:39 karolherbst: how clover does things is _terrible_ for performance for us
10:39 karolherbst: and this should be a "AMD specific" optimization if that's any good for AMD
10:39 karolherbst: just one of the things I really don't want to have more
10:40 EdB: what part of clover is a performance problem ?
10:40 karolherbst: how constant args are passed into the kernel
10:40 karolherbst: cover does this 8/24 bit split
10:40 karolherbst: so you have one indirect containing index _and_ offset
10:40 karolherbst: but indirects on constant memory are terrible on nvidia hardware
10:40 karolherbst: essentailly lowers constant memory to global
10:40 karolherbst: makes constant even slower than global
10:41 karolherbst: (because you have to split the offset of and etc...
10:41 karolherbst: )
10:41 karolherbst: EdB: I have nothing against landing the printf stuff generaly, but if it's AMD specific it should land like something AMD specific, not as "core"
10:42 karolherbst: maybe have a CAP_COMPUTE_PRINTF_FORMAT thing and select between CLC and AMDHSA or so
10:42 EdB: / XXX: Correctly handle constant address space. <-- there is room for a new ABI here
10:43 karolherbst: yeah
10:43 karolherbst: and it's fine to change it, I just don't want to prevent putting more things into clover which only help AMD but is useless for everything else
10:43 EdB: I ony have AMD ardware
10:43 EdB: hardware
10:43 karolherbst: right and that's fine
10:44 karolherbst: I was mainly wondering what this printf format belongs to
10:44 karolherbst: is it part of the AMD llvm backend?
10:44 karolherbst: or is there an llvm pass doing it and it's in core?
10:44 EdB: yes it's AMD backend.
10:44 karolherbst: okay
10:44 karolherbst: so, if it's AMD specific, it should be replaceable easily by a different format
10:44 EdB: yes
10:45 EdB: as long as there is other to support, I'll be happy to support them :)
10:46 karolherbst: it's not really about support, just landing stuff in a way it's easier for others to change things around or do already consider very likely future changes
10:50 EdB: if the printf buffer format for nir is defined, I may be able to find out a more generic way
10:50 karolherbst: with nir we get the string from the CLC source
10:50 EdB: you mean the printf format
10:50 karolherbst: yes
10:51 EdB: we can make a runtine copy
10:52 EdB: how are the arg store in the printf buffer ?
10:52 karolherbst: check https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6308
10:52 EdB: how it's like to the format ? with a poiner to the constant ?
10:54 karolherbst: so we only store the id of the string and then encode the arguments after it for each print, but I am not entirely sure how where to get the id -> string mapping from
10:54 EdB: it's mostly look like what amd driver do
10:54 karolherbst: yeah.. the things aren't that different, just the format string itself is
10:56 karolherbst: but it's also beyond me why AMD has its own format...
10:56 EdB: so clover have to extrat the string from source and we need to figuring how to get a unniform id
10:57 EdB: they were the first to implement it
10:57 EdB: no spirv at that tile
10:57 EdB: time
10:57 karolherbst: the firmat we get isn't spirv specific
10:57 karolherbst: it's the opencl one
10:57 karolherbst: so the question is, why does AMD has a one different from OpenCL
10:58 karolherbst: is there space savings? Is there any other benefit? Is it easier to parse?
10:58 karolherbst: mainly wondering why to do it like that and make it harder for everybody
11:00 EdB: I won't guess :)
11:01 karolherbst: yeah... I mean, I would accept it makes sense if there is like 10% space savings or whatever :) but if there is no benefit, I'd wonder if we can make the AMD target just emit the opencl string?
11:01 karolherbst: but I guess it's hardcoded and can't be changed
11:03 EdB: but regarding the runtime, it's easier for my to have a format that give me the real number of args passed to the call even if the format string have extra or less % specifier
11:03 karolherbst: well, what we do in the driver is decoupled from the format string
11:04 karolherbst: and that's fine
11:04 karolherbst: but we could also store it the same across all paths, but I guess we won't get that :)
11:05 karolherbst: EdB: I mean, I am fine to land it as is, I just prefer to make it obviously AMD specific from the start
11:05 EdB: :)
11:06 karolherbst: and once I get to supporting the OpenCL string format and the data format we have in nir, I'd try to generalize the code
11:07 karolherbst: pmoreau: did you check how we make sure the stuff links in the target meson.build file?
11:07 EdB: I will look to add a CAP
11:08 karolherbst: yeah.. we kind of need a way to decide how the formatting goes :)
11:09 karolherbst: pmoreau: but anyway.. I think the version discussion is resolved now or what is left?
11:09 pmoreau: karolherbst: What do you mean by that? libclllvm takes LLVMSPIRVLib and LLVM as dependencies and that should handle it.
11:09 karolherbst: because llvms build system is broken
11:09 karolherbst: so you compile your shit, and then you run your application and it doesn't work :p
11:10 karolherbst: there was a reason we did this in the opencl target
11:10 karolherbst: static version of "clang-cpp" was broken in llvm-9 or something...
11:10 karolherbst: something stupid
11:10 pmoreau: I haven’t changed any of the defaults, so if LLVM linking worked before I don’t see why it would no longer work now.
11:10 karolherbst: but that stuff happens all the time with llvm
11:11 karolherbst: pmoreau: wrong combination of build time flags
11:11 karolherbst: or bugs or whatever..
11:11 karolherbst: thing is
11:11 karolherbst: you never know if a. stuff links and b. runtime is happy before trying
11:11 karolherbst: so we'd like to catch those issues before compiling
11:12 karolherbst: for one we should check if the translator and llvms version at least matches
11:12 karolherbst: and then maybe even link and runtime check or something
11:15 pmoreau: I agree for checking that the translator and LLVM versions match. I’m not too sure how to express that in Meson. Do we want to build a list of all available LLVM versions, and another one of all translator versions, then make a list of all pairs where we get a match, and finally pick the most recent one or the first one in PATH order that satisfies the version requirement?
11:15 karolherbst: I mean... I am fine for now to leave it like this, but we have to improve it sooner or later
11:15 karolherbst: not quite sure...
11:15 karolherbst: maybe just a compile time check would be enough? I really don't know.. Maybe checking the major version alone is enough?
11:15 karolherbst: but I think once there is an llvm release, the API is fixed, more or less?
11:16 pmoreau: Yeah, I ran into that issue recently when Arch updated to LLVM 10.0.1 but SPIRV-LLVM-Translator didn’t make any update against that version so the translator is still on 10.0.0. I did downgrade LLVM & co back to the previous version to keep it going.
11:16 karolherbst: pmoreau: that results in linker issues, no?
11:16 karolherbst: or runtime?
11:17 pmoreau: I don’t remember… I think it was a linker issue.
11:17 karolherbst: but in that case rebuilding the translator should be fine
11:17 karolherbst: anyway.. doing something similiar to what we have in targets/opencl could be enough already
11:17 karolherbst: (+ checking the major version)
11:19 karolherbst: brb
11:26 karolherbst: pmoreau: anyway.. whatever works to catch those issues at configure time is likely a good solution :p
11:28 EdB: karolherbst: do you store a buffer size and offset like what AMD do ?
11:29 karolherbst: for what?
11:29 karolherbst: ohh printf
11:29 EdB: yes
11:29 karolherbst: I think our differs
11:29 karolherbst: we store the number of entries in the first byte and go from there?
11:30 karolherbst: jenatali: actually.. does your printf stuff works in case there are too many or not enough args?
11:30 EdB: how do you check that the buffer is not full ?
11:30 karolherbst: ohh., first byte contains the full size actually
11:31 karolherbst: jenatali: ehh.. the bound check is flawed :p
11:31 karolherbst: I will reply on the MR
11:31 EdB: for amd it's the second one :)
11:32 EdB: the offset is put a the first and updated each time to reserve the space and check that there is enought space
11:32 karolherbst: I kind of like the format string approach, you won't have to save all the strings :)
11:35 karolherbst: but yeah.. I think I need to write the code to actually figure out how things work at all :D
11:37 EdB: :)
11:55 EdB: pmoreau: LVM min version to 8.0.1 with https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6336
11:56 EdB: I thin I forgot about the LLVM old version ci checks !
11:56 EdB: but let's see if people are ok with that anyway
11:57 pmoreau: I’m working on a version that hardens the requirements a bit more, but it’s messy. Will send a version soon; but good to see more compats bit disappear.
12:10 EdB: who is in charge of ci env ?
12:11 EdB: Because if 6336 is merged, meson-clover-old-llvm won't be useful anymore
12:12 EdB: unless it can be ugrade to LLVM8 to 9
12:20 karolherbst: EdB: you can change the ci files yourself
12:20 karolherbst: it's all in git
12:20 EdB: I can deactivate it
12:21 EdB: but not sure how to set up new backport to test build with "new" older LLVM version
12:21 EdB: the current olf build stop at LLVM 7
12:22 karolherbst: the old ones are also mainly for llvmpipe, no?
12:24 EdB: use-x86_build_old is only referenced by meson-clover-old-llvm
12:26 karolherbst: ahh
12:27 karolherbst: I think in this case you can really remove all of that and add it to the MR before increasing the version
12:27 karolherbst: EdB: but maybe you should add to the MR on why it should be increased?
12:28 EdB: yeah, let's try to write it down
12:34 EdB: AMD driver and SPIRV are our main consumers. radeonsi already require LLVM>=8.0.0 and for SPIRV there is !5038 looking for a bump to 8.0.1.
12:34 EdB: Let's use the opportunity to get ride of some clover comptat stuff
12:34 EdB: hope it's enough :)
12:39 pmoreau: EdBand karolherbst: Updated patch for the version: https://gitlab.freedesktop.org/pmoreau/mesa/snippets/1143. It checks that the existing LLVM requirements are compatible with the one we get from the requested SPIRV-LLVM-Translator, and then looks for the exact LLVM version needed by the SPIRV-LLVM-Translator that got picked.
12:41 EdB: eric_engestrom: llvm pipe may be ? I'm not able to figuring out what it want :/
12:44 karolherbst: pmoreau: is there a good reason the version has to match that cloesly?
12:44 karolherbst: *closely
12:44 pmoreau: Currently checking on that
12:45 pmoreau: I think major.minor might be enough
12:45 karolherbst: there are no minor llvm releases anyway afaik
12:45 EdB: pmoreau: I don't think you need the if _opencl != 'disabled'. because the else is only adding 'with_opencl_spirv ) false' that you already get with 'with_opencl_spirv = get_option('opencl-spirv')'
12:45 karolherbst: ohh there was 7.1.. ehhh
12:46 karolherbst: ahhh very helpful: https://releases.llvm.org/7.1.0/docs/ReleaseNotes.html
12:46 karolherbst: ....
12:48 EdB: https://releases.llvm.org/ for a list
12:48 pmoreau: EdB: But what if the user has `-Dgallium-opencl='disabled' -Dopencl-spirv='true'` for whatever reason? :-D We shouldn’t be adding the dependencies in that case.
12:49 karolherbst: mhhh..
12:52 EdB: maybe we shoud failled before in that case, with a proper message
12:53 EdB: or meson have a way to set an option to depend on another ?
12:53 karolherbst: please don't
12:53 karolherbst: this makes it just annoying to flip over options
12:54 karolherbst: if we would do this all the way, we would also require disabling all gallium drivers if gallium isn't built, etc...
12:54 EdB: there is sone stuff like that : error('OpenCL Clover implementation requires at least one gallium driver.')
12:55 karolherbst: yeah, but that's the other way around
12:55 karolherbst: if you enable a feature, you of course require its dependencies
12:55 karolherbst: mhhh...
12:55 karolherbst: maybe it does make sense for opencl-spirv then... mhh
12:56 EdB: with_opencl_spirv = get_option('opencl-spirv') && _opencl ?
12:56 jenatali: karolherbst: What's wrong with the bounds check?
12:57 karolherbst: jenatali: you increase the buffer size, then do the bound check
12:57 karolherbst: so in case it's actually out of space, you are left with the increased size
12:57 jenatali: Sure - I guess it could eventually overflow if you keep writing while it's out of space
12:58 karolherbst: right
12:58 jenatali: Didn't seem worth it to do a compare/exchange loop though
12:58 karolherbst: I mean, the runtime should still check that it doesn't
12:58 karolherbst: but you are left with the last incomplete write
12:58 karolherbst: and this could cause problems
12:58 jenatali: Ah - good point
12:58 karolherbst: mhhhh
12:59 jenatali: The CTS is *really* lax for printf
12:59 karolherbst: actually, I see why you did it this way
12:59 karolherbst: mhhhh
12:59 karolherbst: annoying
12:59 karolherbst: jenatali: it's more of a driver issue though
12:59 karolherbst: you need to know how much data is valid and what is just random stuff
12:59 jenatali: I just mean all of the CTS tests do a single format string entry :)
12:59 karolherbst: ahh :D
12:59 karolherbst: yeah.. well
12:59 jenatali: From a single thread
13:00 karolherbst: mhh.. this is slighly annoying to bound check first
13:00 EdB: jenatali: it's hard to be strict when the specs says "their is no guarantie on how the stuff is displayed" :)
13:00 jenatali: Yeah, I know :P
13:01 karolherbst: jenatali: but at the same time, not that much
13:01 karolherbst: but then again.. very
13:02 jenatali: EdB: Regarding the arg size for strings - how can the size that's stored with the format string possibly reflect the size of potentially dynamic %s entries?
13:03 karolherbst: jenatali: but I think it's not that bad...
13:03 karolherbst: just a small loop
13:03 karolherbst: and if you use printf you already say you don't care about performance
13:04 karolherbst: so...
13:04 jenatali: karolherbst: I'd happily try out a patch if you want to help write one. I can write some dedicated tests to try and force corner cases if we want to make it more robust
13:04 karolherbst: jenatali: I think you can just make the bound check return a boolean and you construct the later if based on that
13:04 karolherbst: so you don't actually have to do this one big super loop
13:05 jenatali: Right
13:05 karolherbst: jenatali: well, how I see it, the pass can write garbage the runtime can crash on
13:05 jenatali: Yeah, I understand
13:05 EdB: jenatali: %s args are string literral, so the compiler know
13:06 jenatali: EdB: %s args can be dynamically chosen between multiple string literals
13:06 EdB: it even align buffer to next 8 size
13:06 jenatali: And if I recall, we experimented a while ago and you can actually get the compiler to accept __constant char* kernel args, but I'm not sure if that was correct according to the spec
13:07 EdB: I should test. Hope it reserve buffer for max length in that case
13:07 karolherbst: well.. the phrasing of the spec is just very bad
13:07 jenatali: Indeed
13:08 EdB: runtine printf is happy to display till the first \0
13:08 jenatali: True, where you need to actually skip to the next arg, even if there's embedded \0
13:09 jenatali: Oh hey, the spec has examples of invalid printf usage, the %s arg needs to be a literal string
13:09 jenatali: A few examples of invalid use cases of printf for the conversion specifier s are given below:
13:09 jenatali: kernel void my_kernel(global char *s, ... )
13:09 jenatali: {
13:09 jenatali: printf("%s\n", s);
13:09 jenatali: constant char *p = "`this is a test string\n`";
13:09 jenatali: printf("%s\n", p);
13:09 jenatali: printf("%s\n", &p[3]);
13:09 jenatali: }
13:09 karolherbst: jenatali: from where is that?
13:09 jenatali: https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#printf
13:10 jenatali: Not sure if it's been updated or if I just missed it the first time around
13:11 EdB: it's the same on my old 1.2 pdf
13:11 karolherbst: jenatali: "The argument value must be a pointer to a literal string" sounds like _any_ string :p
13:11 karolherbst: also "pointer"
13:11 karolherbst: but mggg
13:11 karolherbst: the example is quite clear
13:11 jenatali: Seems that way, but apparently even declaring a "constant char* p" and using it on the next line is declared invalid
13:11 karolherbst: yeah..
13:11 karolherbst: cool
13:12 karolherbst: that makes it easier
13:12 jenatali: Agreed - though CLOn12's implementation could handle it either way :P
13:12 karolherbst: wondering if we should optimize it... mhhh
13:12 karolherbst: so all %s are resolveable at compile time, right?
13:13 jenatali: Right
13:13 karolherbst: so we'd only need a table with all formats and only add non string args in case we merge all string arguments int
13:13 EdB: pmoreau: "Require the LLVM version against the found SPIRV-LLVM-Translator was built" should be LLVM the main and Transaltor version adjsted according to ?
13:13 karolherbst: *in
13:13 karolherbst: no idea if it's worth it though
13:14 karolherbst: jenatali: but maybe we could do some constant folding on top of printf :D
13:14 karolherbst: oh well...
13:14 jenatali: karolherbst: That was what I expected Clover'd want to do before running the pass I wrote - swap out all strings to indices into a table that the runtime could look up
13:14 karolherbst: I somehow don't care enough
13:14 karolherbst: jenatali: I meant merging strings into the format string
13:15 jenatali: Ohh
13:15 jenatali: Sure, you could do that
13:15 karolherbst: but now that I think about it, we could also merge numeric constats :p
13:15 karolherbst: jenatali: but how is a runtime supposed to use the pass anyway? How does one get the constant table?
13:15 karolherbst: uhm
13:15 karolherbst: string table
13:16 jenatali: karolherbst: Dunno about Clover, but we have a metadata struct that we hand out to the runtime after compiling
13:16 karolherbst: mhhh
13:16 jenatali: Contains things like constant values that need to be put into hidden buffers that couldn't be inlined in the shader
13:16 karolherbst: jenatali: I have a fun idea....
13:16 karolherbst: what if the lowering would replace all strings with a string id and "returns" the table?
13:17 eric_engestrom: EdB: yeah, I just grep'ed and saw 9 lines checking LLVM_VERSION_MAJOR against `4` in gallivm, so yeah llvmpipe seems to be the only one still suporting LLVM that far back
13:17 karolherbst: or stores the table alongside nir_shader?
13:17 jenatali:shrugs
13:17 jenatali: Seems fine
13:18 jenatali: I'm not bothering with a preprocess step, since for us, the pointer to the string is as good as an entry into a table
13:18 EdB: eric_engestrom: ah ! I didn't go outside the meson.build file
13:18 karolherbst: jenatali: pointer as in host pointer?
13:18 karolherbst: yeah mhhh...
13:18 jenatali: karolherbst: Guest pointer
13:18 jenatali: ER... device
13:18 karolherbst: ahh
13:18 karolherbst: I think I will just tinker with it at some point and see how that all works out
13:19 EdB: eric_engestrom: thanks for the search
13:19 pmoreau: EdB: I could change so that 1) we check that LLVM and SPIRV-LLVM-Translator requirements do not clash, 2) look for LLVM, 3) look for SPIRV-LLVM-Translator based on the LLVM version that was found.
13:19 jenatali: Sure - like I said, I'm happy to have more patches on top, but this works pretty well for us
13:19 EdB: clover will be happy to read metadata table :)
13:19 karolherbst: pmoreau: yeah.. sounds about right
13:20 EdB: for me too
13:20 karolherbst: jenatali: yeah.. I think I am fine with landing it as is, just thinking ahead
13:20 jenatali: karolherbst: Thanks for pointing out the bounds check bug though. You also made me re-read the spec to find: "If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored."
13:20 karolherbst: ehhh..
13:20 jenatali: So, I need to add some more data somewhere - right now CLOn12 figures out where the next arg is based on the format string, but apparently that's not sufficient
13:21 pmoreau: Okay, can change that. I found a way to make the version check be on major + minor.
13:21 karolherbst: jenatali: why would it make sense to care about arguments not being part of the format string?
13:22 jenatali: Right now, I'd go to the next iteration of printing the string, but keep reading arguments that were unused from the previous one
13:22 EdB: it could ba a function call
13:22 jenatali: Right, it's just C rules, all arguments to a variadic function are evaluated
13:23 karolherbst: I'd just ignore it :p
13:23 karolherbst: how would an application know anyway
13:24 jenatali: Heh :P fair
13:24 jenatali: I'll at least flag it as a known issue
13:25 karolherbst: EdB: doesn't matter for nir as we already done linking and inlining
13:27 jenatali: EdB: Does the metadata from AMD's LLVM printf stuff ever have multiple entries for the same format string? Like if I do printf("%s", "foo"); printf("%s", "longer_foo"); -- would those be two metadata entries?
13:28 jenatali: Or the case I'm really wondering about, printf("", a, b, c, d); vs printf("");
13:30 EdB: they don't merge
13:30 jenatali: Got it, so it's each call site that has its own dedicated metadata?
13:30 EdB: a std::set could have solve that, but no, there is an id for every call
13:30 karolherbst: jenatali: heh.. maybe a printf optimization wouldn't be that terrible... we could also drop unuses args and shit :D but again.. I still don't care enough :O
13:32 EdB: got to go. see you
13:37 jenatali: karolherbst: Tracking with https://github.com/microsoft/OpenCLOn12/issues/5
13:47 karolherbst: jenatali: btw.. was vstore fixed upstream?
13:48 karolherbst: ehhh...
13:48 karolherbst: test_basic vstore_global does not even compile?
13:48 jenatali: The problem's not vstore, it's single-component-of-a-vector store
13:48 karolherbst: ahhhh
13:48 karolherbst: my compilation issue was something else though
13:48 jenatali: And no, we haven't pushed a fix for that upstream yet I don't think
13:48 karolherbst: mhhhh
13:48 karolherbst: yeah.. I think I hit that with vstore_local
13:49 karolherbst: "Data sample 2 for vstore of float2 did not validate (expected {ce4bac95 ccc3bfc2}, got {00000000 ccc3bfc2}, stored from store #442 (of 512, offset = 2) with address offset of 1)" :)
13:49 jenatali: Right, yeah we hit it with vstore_local too
13:49 karolherbst: have a patch?
13:49 jenatali: Not a clean one - we were having debates on the right way to fix it
13:50 jenatali: https://gitlab.freedesktop.org/kusma/mesa/-/merge_requests/240
14:00 karolherbst: jenatali: somehow the emited code is also garbage...
14:01 karolherbst: ehh..
14:01 karolherbst: the clc code is just stupid :D that confused me
14:09 karolherbst: ahh.. vstore wasn't the issue .. right, it's the stuff at the end..
14:09 jenatali: Yep
14:09 karolherbst: anyway, the patch works for me as well
14:10 jenatali: Cool - there was a lot of discussion with jekstrand about the right way to handle array-indexing-of-vectors long-term
14:10 jenatali: This was kind of a shortcut to what he had in mind
14:10 karolherbst: yeah...
14:11 karolherbst: I think we should just touch components actually touched by the source code :p
14:11 jenatali: So, not sure if makes sense to take upstream, even temporarily, or if we should just keep it locally-applied and try to figure out a better general approach
14:11 karolherbst: I mean.. the issue is that all components are written, no?
14:11 jenatali: Yeah, it's a read-modify-write
14:12 jenatali: Non-atomic, on shared/global memory, meaning it's racy from multiple threads
14:13 karolherbst: mhhhh
14:13 karolherbst: looking at the clc code.. something is odd
14:13 karolherbst: ohhh right
14:13 karolherbst: I remember
14:13 karolherbst: something with optimizations and so on
14:13 jenatali: Yeah, LLVM's too smart for its own good
14:14 jenatali: Turns the cast + indexing into just indexing
14:15 karolherbst: mhh
14:16 karolherbst: it actually vectorizes the for loop
14:16 jenatali: Just like when it unrolls a loop and vectorizes loads/stores into larger types
14:16 jenatali: Yep
14:18 karolherbst: okay mhhh
14:19 jenatali: karolherbst: If the vectorization causes problems for you, you might need some patches from !5900 which propagate alignment info to the loads/stores - at least LLVM keeps the original type's alignment when it vectorizes
14:19 karolherbst: I don't vectorize
14:19 karolherbst: :D
14:19 jenatali: I'm not saying you doing it, I'm saying LLVM doing it for you
14:19 karolherbst: okay.. mhh
14:19 karolherbst: so the first nir we get has two stores, one vec2 and the other being vec1
14:20 jenatali: e.g. "for (int i = 0; i < 2; ++i) a[i] = b[i];" - I've seen LLVM promote the types of a/b into a larger type
14:21 karolherbst: jenatali: can it be that the spirv is already busted?
14:21 jenatali: Depends what the problem is - if you're talking about LLVM vectorizing, then yes :P
14:22 jenatali: karolherbst: Which CLC is giving you trouble?
14:22 karolherbst: not entirely sure..
14:22 karolherbst: I mean, it's the RMW issue, I just try to track down what the actual problem is
14:22 jenatali: Ah
14:22 karolherbst: mhhh
14:23 jenatali: The SPIRV will do a store to a component of a vector
14:23 karolherbst: actually.. I don't think it does
14:23 karolherbst: it looks like single component to me
14:23 jenatali: Paste it?
14:24 karolherbst: there are two, but I am not 100% which is the one being executed https://gist.github.com/karolherbst/5787f3259637175e421d210d1e251333
14:24 karolherbst: ohhh..
14:24 karolherbst: I am silly :D
14:24 karolherbst: we dump two now
14:24 karolherbst: and the first one is before linking...
14:24 karolherbst: nvm then, just the second one
14:25 karolherbst: yeah..
14:25 karolherbst: both stores do scalar int writes... no?
14:26 jenatali: OpStore %20 %49 Aligned 1
14:26 karolherbst: so.. vtn for whatever reasons, turns the first one into a vec2 store
14:27 jenatali: %20 is an access chain which derefs to a single component of a vec2
14:27 karolherbst: yeah, so we translate "OpStore %20 %49 Aligned 1" incorrectly
14:27 jenatali: Yes, stores to array-deref-of-vector are turned into read-modify-write of the whole vector
14:27 jenatali: Without that patch
14:27 karolherbst: jenatali: where is the vector comming from though?
14:27 karolherbst: ohh.. wait
14:27 karolherbst: the first arg is the dest type.. uff
14:28 jenatali: %20 is an access chain into something of type %39, which is a pointer to a char2
14:28 karolherbst: yeah.. I see it now
14:28 karolherbst: heh...
14:28 karolherbst: wait...
14:29 karolherbst: this is wrong...
14:29 jenatali: ?
14:29 karolherbst: OpStore %22 %50 Aligned 1 vs OpStore %20 %49 Aligned 1
14:29 karolherbst: see how %22 is constructed?
14:30 karolherbst: it uses %20 and takes the component "1"
14:30 karolherbst: but.. %20 is supposed to be scalar, no?
14:30 karolherbst: although...
14:30 karolherbst: mhh
14:30 karolherbst: ahh..
14:30 karolherbst: this chain stuff always confuses me
14:30 karolherbst: of course it's uchar*
14:30 karolherbst: and you can just walk the array
14:30 Vanfanel: emersion: what's drmModeAtomicSetCursor() supposet to receive as 2nd parameter? the fb_id of a cursor fb?
14:31 jenatali: Yeah, exactly
14:31 jenatali: One level of access chain doesn't change the type (as I understand it at least)
14:31 karolherbst: okay
14:31 karolherbst: then vtn is wrong :p
14:31 karolherbst: if the spirv writes one component, vtn shouldn't write two
14:32 jenatali: Bingo
14:52 karolherbst: mhhh...
14:54 karolherbst: uhhh
14:54 karolherbst: mhhh
14:56 karolherbst: I blame vtn_local_store
14:56 karolherbst: mhhh
14:56 karolherbst: jekstrand: why does vtn_local_store read the entire vector/array?
14:57 karolherbst: uhm.. I guess it only reads an entire vector actually
14:59 karolherbst: ehh, it was always like this
14:59 karolherbst: maybe not.. mhh
15:03 karolherbst: yeah..
15:03 karolherbst: probably always was
15:03 karolherbst: to me it really doesn't make much sense to do that
15:04 karolherbst: but maybe it messes something up?
15:04 karolherbst: mhhh
15:07 karolherbst: jenatali: sometimes removing code is the solution: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/7a160213734811a71c6bb7741caa36a775050c20
15:08 karolherbst: checking what intel ci thinks about this idea
15:08 jenatali: Yeah, except there's some passes that don't support array-deref-of-vector in nir for certain kinds of memory
15:08 jenatali: You're going to explode Vulkan :)
15:08 karolherbst: guess we need to fix it then :p
15:09 karolherbst: well.. my changes only touch it for shared
15:09 karolherbst: and for shared it should be supported.. no?
15:09 karolherbst: but I guess this issue exists on other memory types as well?
15:09 karolherbst: anyway...
15:09 karolherbst: I don't see a better solution than removing the read
15:17 hanetzer: y'know, I once had some wierd situation where my cursor died. straight vanooshed. prolly will never be able to repro but it was funny.
15:18 hanetzer: had another situation where after xrandr'ing a display, a ghost cursor stuck around and I had another one moving around. x11 is wild ya'll
15:19 karolherbst: I had cursor fun with plasma today on wayland..
15:19 karolherbst: somehow it got confused and displayed a big cursor for all wayland clients
15:19 karolherbst: and a proper sized one for X clients
15:20 karolherbst: I suspect hidpi display handling as I have a 4k and a FHD display
15:21 hanetzer: yee. I have two 4k displays, I can't make use of 4k with them yet tho because of navi dc issues :)
15:21 karolherbst: "fun"...
15:22 hanetzer: they'll work on my old rx480 gpu, but it only has 2 dp outputs.
15:23 hanetzer: and even if that were working, I can't switch to wayland because sway can't do drm tiled displays yet :)
15:24 karolherbst: ehh...
15:27 hanetzer: what can I say, I like my tiles.
15:28 karolherbst: but why 4k?
15:28 karolherbst: was that really a thing?
15:28 hanetzer: yee. I like to play games and consume media, sue me ;P
15:29 karolherbst: I mean, I was aware of 5K displays doing this, but 4K?
15:29 hanetzer: yeah. its a kind of diy display.
15:29 hanetzer: buy these panels or a monitor using them, hook this fpga based tcon to it, 4k
15:29 hanetzer: 4k@120hz, specifically.
15:30 karolherbst: ahh
15:30 hanetzer: or 2k@240hz
15:30 karolherbst: but then you still have the edge in the middle, no?
15:30 karolherbst: or can you do it that properly that you wouldn't notice
15:30 hanetzer: or even smaller, more ridiculous resolutions at even higher refresh rates.
15:31 hanetzer: yeah, its imperceptible when its workign right.
15:31 karolherbst: at least you can fix your EDID with that setup :p
15:31 hanetzer: prior to agd5f_ fixing a thing in the dc driver used by the rx480, basically, remember when tv's needed to be tuned to not elevator around?
15:32 hanetzer: half the screen was shifted half the way upwards and the other was where it should be.
15:32 karolherbst: somebody once told me on the conference that he used such displays with nouveau like 2 years ago?
15:32 karolherbst: and he essentially said: you try it like 3 or 4 times and then the stuff is properly aligned
15:32 hanetzer: now its fixed on the rx480, but for the rx 5700 xt its busted :)
15:32 karolherbst: :D
15:34 karolherbst: jenatali: left things: offsets, constant memory, private memory and async :)
15:34 karolherbst: all the issues I've got now are related to that
15:34 jenatali: All the issues in basic ;)
15:34 karolherbst: and uhm.. I think isnormal is used by one of those tests as well
15:34 karolherbst: yeah....
15:35 karolherbst: I think I will fix constant next
15:35 jenatali: Offsets you should get soon, once you and jekstrand are happy with that MR and we can land it
15:35 karolherbst: or private first? mhhh
15:35 jenatali: Async will come with libclc
15:35 karolherbst: maybe private
15:36 karolherbst: jenatali: lower_io lowers function_temp to scratch, right?
15:36 jenatali: Yep
15:36 jenatali: You'll want to use the explici_types pass on it just like shared, to populate scratch_size
15:37 karolherbst: right
15:37 karolherbst: mhhh
15:37 karolherbst: nir_validate is unhappy
15:38 karolherbst: before lowering even
15:38 karolherbst: so right after spirv_to_nir
15:39 karolherbst: ohh ehhh...
15:39 karolherbst: mhhh
15:39 hanetzer: karolherbst: are you associated with herbst*wm ? (forget the full name) :P
15:39 karolherbst: no clue
15:40 hanetzer: guess not. herbstluftwm is the full name
15:40 karolherbst: jenatali: src/compiler/nir/nir_validate.c:464
15:40 karolherbst: I guess you do something about that one?
15:40 karolherbst: ahhh
15:41 jenatali: Yeah, one of my patches relaxes that validation
15:42 karolherbst: mhhhhh
15:42 jenatali: Actually, not for ptr_as_array, but for array/array_wildcard
15:42 karolherbst: do we have a pass which converts function_temp arrays to scratch?
15:42 karolherbst: maybe I want to fix this first and get rid of this stupid handling inside nir -> nv50ir
15:42 jenatali: I wonder if we fail validation too - early on I hadn't always been running debug builds
15:49 hanetzer: I mean, its a fairly reasonable assumption. someone named *herbst in a channel about linux display technology creating a linux x11 window manager named herbst*wm yeah?
15:49 karolherbst: hanetzer: maybe? :p
15:50 karolherbst: but the term is too generic, so everybody could end up using it
15:51 hanetzer: yeah. but sometimes the linux tech world is really small :)
15:54 karolherbst: before I use (parts of) my name of any project, I'd want to use it for something useful, not yet another compositor :p
15:54 karolherbst: we already got plenty
15:55 hanetzer: nah man, I want LG3D not in java :P
15:55 karolherbst: ...
15:56 karolherbst: people should stop using java for that stuff
15:56 karolherbst: it's not like the language is bad, just everbody who considers java to be the fitting language for that has no idea on how to develop without wasting resources making it all pointless :p
15:58 hanetzer: tbh LG3D is pretty cool, checked out some demos on youtube. Also wayfire is kinda cool if you're into what used to be compiz :)
16:11 marcodiego: looking glass looked revolutionary for its time. Metisse and compiz were also very advanced. Most effects are interesting for a few minutes, after that comes a feeling of unnecessary bling. I still miss some of that bling
16:15 hanetzer: yeah personally I'm not one for bling for blings sake.
17:18 EdB: pmoreau: Translator version should always be LLVM_VERSION + .0 , excpet for the 8 where it should be ?
17:33 karolherbst: pmoreau, jekstrand: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/7eafcf5cf66276ac104d854703623d09b0113e76 \o/
17:36 karolherbst: now I just hope it doesn't regress anything :D
17:40 karolherbst: I am sure it trips over a few asserts, but generally...
17:40 karolherbst: at some point I should vectorize more as well.. but that will just trip codegen over I think
18:12 EdB: pmoreau: I added a simplier version, that should do the same LLVM check
18:19 pmoreau: <EdB "Pierre Moreau: Translator versio"> ==LLVM_major.LLVM_minor.whatever.whatever, except for 8.0.whatever where it should be
18:20 EdB: ok, some my suggession should do the wor
18:20 EdB: work
18:20 EdB: *so my suggession
18:21 EdB: the error message could probably be improved
18:22 mattst88: pmoreau: presumably you're using matrix<->irc. can the tab completion be configured to be less...wordy?
18:22 mattst88: e.g., what we IRC users saw was '<EdB "Pierre Moreau: Translator versio"> ==LLVM_major.LLVM_minor.whatever.whatever, ...'
18:23 imirkin: does that imply that EdB is secretly pmoreau and it's just a conversation between the two psyche's?
18:23 pmoreau: Ah, probably the reply adding some extra :-/
18:23 EdB: :)
18:24 pmoreau: And yes, I am indeed
18:25 EdB: mh, you are using matrix or your are me ?
18:26 imirkin: what's that John Cusack movie...
18:26 pmoreau: I am using Matrix :-D
18:26 imirkin: ah yeah - Identity
18:28 pmoreau: One limit with your method, would be if you have two different installation of SPIRV-LLVM-Translator, you can end up picking the wrong one and bailing out.
18:30 EdB: well, I guess if you have different lib you are able to adjust your path
18:53 karolherbst: ehhh
18:53 karolherbst: no, we should pass the correct version into dependency
18:54 karolherbst: mhhh
18:54 karolherbst: why does it have to be as annoying
18:54 karolherbst: pmoreau: we also need to add an upper check
18:54 karolherbst: ohh wait
18:54 pmoreau: ?
18:54 karolherbst: you are doing that, aren't you?
18:54 pmoreau: Yup
18:54 karolherbst: okay
18:55 karolherbst: yeah.. then that's fine
18:55 pmoreau: Unless I messed it up
18:55 karolherbst: mhhh
18:55 karolherbst: should be fine
18:55 karolherbst: I just don't know how compatible minor updates are
18:55 karolherbst: but yeah..
18:55 karolherbst: those are so rare that we can stick with what we've got
18:56 karolherbst: EdB: the problem with your patch is, that meson thinks it found the correct one
18:56 karolherbst: but it found an invalid one
18:57 EdB: it check for major and minor and raise error if not compatible
18:58 karolherbst: right, but for meson it still looks fine
18:58 karolherbst: you just error out
18:58 karolherbst: so the dep is marked as found
18:58 EdB: ah :/
18:58 karolherbst: it still prevents from building, but if we are able to pass in the precise requiernment we should do that
18:59 karolherbst: pmoreau: why do we need three checks though?
18:59 karolherbst: ehh.. I guess it makes things simplier
18:59 karolherbst: nvm
19:00 pmoreau: Which three checks? You mean the array given to `dependency()`?
19:00 karolherbst: yeah..
19:00 karolherbst: but I think it's fine
19:00 EdB: well I'm happier with simplier version and to let dev to adjust there path. Most of the time the verson wil match
19:00 pmoreau: Well we need 2 for the min and max.
19:01 pmoreau: And the 3rd one is for the special case of LLVM 8.0.
19:02 karolherbst: EdB: I think your case wouldn't work if we increase the llvm version at some point and would have to rescan as the dep is cached
19:02 karolherbst: pmoreau: might be worth testing that
19:03 karolherbst: dependency is doing some fancy magic
19:03 EdB: karolherbst: don't meson rerun the script if you change it ?
19:03 karolherbst: yes
19:08 EdB: anyway, I'm not a fan of over engineered config script but if you give an RB, ok for me
19:10 karolherbst: llvm requires us to over engineer anyway, that's just the cost of dealing with llvm
19:27 hanetzer: yee.
19:27 hanetzer:wishes more folks would just feckin use pkgconfig
19:30 EdB: SPIRV-LLVM-Translator use pkgconfig so it's not what bother us
19:34 hanetzer: currently pkgconfig would fix my issue (gentoo, cross-compiling mesa for aarch64 from x86)
19:44 karolherbst: hanetzer: I gave up on using "native" ways of doing cross compilation as they all suck
19:46 karolherbst: just docker crosscompile and use qemu-binfmt
19:47 karolherbst: this even allows to reuse some of the gitlab scripts
19:47 karolherbst: (in case of mesa)
20:05 HdkR: Compiler running through qemu sounds terrible :P
20:05 karolherbst: HdkR: well.. you don't run the compiler through qemu
20:06 karolherbst: just the whatever binaries the build system produces
20:06 karolherbst: debian is quite good in allowing you to install a native compiler but the arch foreign dependencies eg
20:06 karolherbst: (this is what we do in CI on gitlab)
20:07 HdkR: Nice
20:08 karolherbst: I even have some local files for ppc64le
20:09 karolherbst: HdkR: https://gist.github.com/karolherbst/3b552178cad56f2489979c378bb600f9
20:09 karolherbst: and there you have a crosscompiler env for ppc64le
20:10 HdkR: neat
20:10 karolherbst: once you figured out how it works, it's quite neat
20:10 HdkR: But docker though
20:10 karolherbst: as it works a) on every linux distribution b) you can always throw it away
20:10 karolherbst: HdkR: well.. better than VMs :p
20:11 karolherbst: and better than how distributions do cross compilation
20:11 karolherbst: it's so terrible and often so terribly broken
20:11 karolherbst: I never got gentoos stuff to work relibly
20:11 karolherbst: ended up using mxe for windows cross compiles years ago
20:12 karolherbst: which.. kind of works
20:13 karolherbst: and the meson.sh script you can run like ./meson.sh .gitlab-ci/meson-build.sh
20:13 karolherbst: and then you get your result inside _build :p
20:13 karolherbst: honestly.. it's the best way doing cross compilation I've done for a long time
20:27 hanetzer: for a single package, sure, but I'm trying to pull a google and use gentoo to build my chromebook's distro :)
20:35 karolherbst: :D
20:39 karolherbst: pmoreau: you need to add your r-by on my commits on your branch
20:39 karolherbst: you overwrite that I pushed those :p
20:39 pmoreau: 🤦
20:40 pmoreau: I hope I didn’t overwrite more than that
20:40 karolherbst: I don't think so
20:42 pmoreau: Interesting, the last commit is not/no-longer signed-off by you
20:43 karolherbst: yeah.. I might have added that as well
20:44 karolherbst: I still don't like the preprocessor magic, but I think that's not too bad, it's just annoying
20:45 pmoreau: Yeah it is… but can’t do much better until there is reflection in C++.
20:45 karolherbst: well.. they could just hide this stupidity inside the library and provide a sane string interface :p
20:45 karolherbst: because.. for strings using a string as the type is more or less the right choice
20:47 pmoreau: Branch updated with the missing tags
21:47 karolherbst: pmoreau: I have 50 patches locally again :(