03:46 imirkin: grrr... why does CTS hate our clear_buffer impl :(
03:46 imirkin: KHR-GL45.direct_state_access.buffers_clear fails... if i turn off our accelerated paths, it works
03:47 imirkin: and naturally arb_clear_buffer_object-formats works just fine.
03:55 airlied: imirkin: is it only for some formats?
03:55 imirkin: nope
03:55 imirkin: and i've tried to hack it to use the pushbuf path - same fail.
03:55 imirkin: (there's a RT-based path, and a pushbuf-upload-based path)
03:55 imirkin: and i found a bug already
03:56 imirkin: (missing defining the range as valid... which can mess up the waits at map time)
03:56 imirkin: but apparently there are multiple bugs.
04:08 imirkin: ok. second bug fixed.
04:08 imirkin: all good =]
04:08 imirkin: well. more like commented out.
04:09 imirkin: moral of the story - not a *ton* of people offload GPU-side buffer clears on buffers < 64 bytes.
04:12 imirkin: moral of the story #2: use helpers, don't roll your own. ugh.
04:24 imirkin: oooh, i wonder if that's why qbo's are broken sometimes
04:56 skeggsb: imirkin: is it? ;)
04:57 imirkin: is what?
04:57 skeggsb: the reason qbos are sometimes broken
04:57 imirkin: oh, i dunno
04:57 imirkin: i hate qbo's
04:57 imirkin: it was more idle wondering
04:58 imirkin: yeah, looks like a good explanation tho
05:06 imirkin: w00t, that was it
05:09 imirkin: ok, now only KHR-GL45.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations fails, which is expected
06:21 imirkin: ok. all the KHR-GL45.direct_state_access.queries_functional are for GL_QUERY_RESULT_AVAILABLE.
06:23 imirkin: and apparently that test fails if it's != 1
06:23 imirkin: that seems ... bogus.
06:31 skeggsb: that moment when you forget to exclude glx-multithread from piglit :P
06:31 imirkin: oh fuckin' a
06:32 imirkin: yet-another place where we were missing the proper fencing
06:32 imirkin: it all passes now
06:32 skeggsb: all of cts? ;)
06:32 imirkin: that one test :p
06:34 imirkin: skeggsb: that should be a good way to test your new and improved error recovery
06:34 imirkin: [glx-multithread]
06:35 skeggsb: i'm just trying to pre-fix another issue that i encountered during that work, and always get nervous when i touch code related to delayed buffer unmap/delete.. // piglit is a good way to thrash it
06:36 imirkin: in mesa, or kernel?
06:36 skeggsb: kernel
06:37 skeggsb: potentially accidentally fixed that kasan backtrace that i seen someone report around here recently as a side-effect
06:38 imirkin: nice
06:39 skeggsb: hopefully :P
06:39 skeggsb: the last week or so has taught me to be extra-pessimistic
06:40 imirkin: as opposed to your usual pessimistic self?
06:42 imirkin: anyways, glad you've been doing some hardening
06:42 imirkin: i'm off to bed. that was a good amount of CTS fixage for one day. karolherbst - patches in my 'bindless' branch if you're interested.
08:05 karolherbst: pmoreau: what is the status of your clover patches?
09:19 pmoreau: karolherbst: They have been waiting patiently for the past two? weeks for me to send them.
09:19 karolherbst: ahh, I see
09:20 karolherbst: anythin I can help with here?
09:20 pmoreau: I need to write down all the changes since v2, and that has been deterring me from doing it :-D
09:21 karolherbst: ...
09:21 karolherbst: I see
09:21 pmoreau: I’ll try to do it tonight.
09:22 karolherbst: awesome! thanks
09:22 pmoreau: But otherwise, I think the patches are ready to be merged.
09:22 pmoreau: I was planning to leave out the SPIR-V backend, but since you have your v5 on the list, I should probably put it back.
09:23 karolherbst: spir-v backend for the C stuff?
09:23 karolherbst: I would keep it separated for now
09:23 pmoreau: For the clover stuff
09:24 pmoreau: So, the whole clover/spirv folder
09:24 karolherbst: the IL stuff?
09:24 pmoreau: Not really
09:24 karolherbst: I mean, we can first concentrate on getting FromIL support merged and then the OpenCL C frontend stuff
09:25 pmoreau: So this is the current status: https://github.com/pierremoreau/mesa/commits/clover_spirv_series_v3
09:25 karolherbst: ahh, k
09:25 pmoreau: It allows you to use clCreateProgramWithIL(), by translating SPIR-V back to LLVM IR.
09:25 karolherbst: we should move my commit further down
09:25 karolherbst: and add the other defs
09:25 karolherbst: I will update it
09:26 pmoreau: So it only supports SPIR-V -> LLVM IR, so this is not enough for your NIR frontend (nor my SPIR-V frontend) in Nouveau.
09:27 pmoreau: I was thinking of having the other part (compiling OpenCL C to SPIR-V, and keeping everything in SPIR-V until the end) with your series maybe, or a later series.
09:27 karolherbst: pmoreau: I see
09:27 pmoreau: What do you mean by “We should move my commit further down and add the other defs”? Which commit are you talking about, and which defs?
09:28 karolherbst: pmoreau: maybe we can have 3 series? 1. spir-v -> llvm IR 2. add functionality to pass spir-v to the backend 3. add support for OpenCL C?
09:28 karolherbst: pmoreau: "clover: update ICD table to support everything up to 2.2"
09:28 karolherbst: I skiped a few proper defines due to outdated headers
09:29 pmoreau: Ah, yeah, I included that one in my series (and it should have the proper From). It might be good to move it after the updated headers, and I’ll send the updated version as part of my series?
09:29 karolherbst: yeah
09:30 pmoreau: Or should you send a small series which includes updating the headers?
09:30 karolherbst: I will fix that commit and push it on my git
09:30 pmoreau: Okay, thanks
09:31 pmoreau: Regarding the “3. add support for OpenCL C“, it is trivial once you have the llvm-spirv module included, which I already need for the spir-v -> llvm IR stuff, so 2 and 3 can be safely merged
09:46 karolherbst: pmoreau: https://github.com/karolherbst/mesa/commits/clover_spirv_series_v3
09:46 pmoreau: Perfect, thanks!
09:46 karolherbst: this time I didn't run into that weird EGL compile error
09:47 karolherbst: I think it was fixed in the CL header update
09:47 pmoreau: Ah, cool
10:23 karolherbst: pmoreau: what is the current branch with the OpenCL C spir-v frontend support?
10:25 pmoreau: Compiling OpenCL C to SPIR-V in clover? I have it as part of my work on the SPRI-V frontend to Nouveau: https://github.com/pierremoreau/mesa/tree/nouveau_spirv_support
10:25 pmoreau: *SPIR-V
10:25 karolherbst: I see
10:25 pmoreau: I should have it on a separate branch
10:26 karolherbst: pmoreau: I think I want to write a proper patch for this now: https://github.com/karolherbst/mesa/commit/19fa5749f996091b69f330963fc6140264b21449
10:30 pmoreau: Sounds good
10:30 pmoreau: Planning to do that today?
10:30 karolherbst: yeah
10:30 pmoreau: Okay
10:30 karolherbst: and maybe also add those load/store_global intrinsics
10:30 karolherbst: I want to clean up all the spirv_to_nir stuff we did
10:31 pmoreau: Sounds good
10:31 pmoreau: So... I shouldn’t spend too much time reviewing those patches then? O;-D
10:31 karolherbst: ohh the nir patches won't change
10:32 karolherbst: this is all on top of that now
10:32 karolherbst: the nir series is for the graphics part only
10:32 pmoreau: Ugh, trye
10:32 pmoreau: *true
10:32 pmoreau: Dang, I can’t be lazy :-D
10:32 karolherbst: the compute part may add like 100 loc to it or something
10:32 karolherbst: like this: https://github.com/karolherbst/mesa/commit/b4289d76a89ba099e858cf610eecb61756c696f5
10:34 pmoreau: Indeed, not that many loc
10:39 karolherbst: pmoreau: "clover: Plug in compilation and linking when using SPIR-V as IR" is the patch where you add support for drivers without LLVM support, right?
10:40 pmoreau: Right
10:40 karolherbst: pmoreau: and we kind of need to figure out the situation with supported_ir vs preferred_ir
10:41 pmoreau: Preferred_ir is no longer used, at all
10:41 pmoreau: https://github.com/pierremoreau/mesa/commit/51f484bb44f2341d41dbce964d76549cd1a045cb#diff-1fd2423cda3f54237963e0ec9e1976cb
10:42 karolherbst: pmoreau: module is a llvm tyype, right?
10:42 pmoreau: depends
10:43 karolherbst: in clover?
10:43 pmoreau: there is clover::module, and llvm::Module (I think it has a capital M for the LLVM one)
10:43 karolherbst: ahh
10:43 pmoreau: But in your case, it’s most likely a clover::module
10:43 karolherbst: okay
10:43 pmoreau: Defined in clover/core/module.hpp
10:44 pmoreau: So, spirv::link_program returns a clover::module, for example (I guess that is the one you are wondering about?)
10:44 karolherbst: yeah
10:45 karolherbst: currently wondering on how to implement the nir path
10:45 karolherbst: because I just want to use the spir-v stuff + to nir translation
10:46 karolherbst: mhh
10:46 karolherbst: actually
10:46 karolherbst: we should just convert to nir when passing the shader to the driver
10:48 pmoreau: Let me double check; I thought there was a good place to do it.
10:49 karolherbst: before ../src/gallium/state_trackers/clover/core/kernel.cpp:233?
10:50 pmoreau: I think what I would do, is define a free function in clover/core/program.cpp, and call it on the result of spirv::link_program
10:50 karolherbst: ahh
10:50 karolherbst: this might make sense, right
10:51 pmoreau: That free function would check whether the sections is an executable (and not a library), and then depending on the supported IR, translate it to whatever you want.
10:51 pmoreau: **However**, that means you end up with multiple entrypoints, as you don’t know which kernel will run yet.
10:51 karolherbst: right
10:51 karolherbst: I think I will do it in bind for now
10:52 karolherbst: clover::kernel::exec_context::bind
10:52 pmoreau: It could be a good thing to add to NIR later on.
10:52 karolherbst: mhhh
10:52 karolherbst: doubtful
10:52 karolherbst: we have still all the linking mess
10:52 pmoreau: What linking mess?
10:52 karolherbst: we should convert after linking
10:53 pmoreau: “and call it on the result of spirv::link_program” isn’t it what I said?
10:53 karolherbst: I didn't say directly after linking
10:53 karolherbst: we need to know what kernel is launched
10:53 pmoreau: If NIR supported multiple entrypoints, that wouldn’t be an issue.
10:54 karolherbst: not really
10:54 karolherbst: currently you have to specify the entry point when converting from spirv to nir
10:55 pmoreau: So, **if** it supported multiple entrypoints, you wouldn’t need to pick, wouldn’t you?
10:55 karolherbst: well how is it done currently? can the driver compile a kernel with multiple entry point and at launch time decide what function to call first?
10:56 pmoreau: You compile a program, which can have multiple entry points, and then you create a kernel object, which refers to one of those entry points.
10:56 pmoreau: (In the OpenCL API)
10:57 pmoreau: https://www.khronos.org/registry/OpenCL/sdk/1.0/docs/man/xhtml/clCreateKernel.html
10:58 karolherbst: okay
10:58 karolherbst: so we could convert to nir when calling clCreateKernel?
10:59 pmoreau: That might work? Let me check
11:00 pmoreau: Might be easier in kernel::launch
11:01 karolherbst: easier, yes
11:01 karolherbst: but you probably don't want to have the runtime compile at launch time
11:01 karolherbst: I think clCreateKernel would be the perfect place
11:03 pmoreau: Note that currently the kernel class does not store the binary, but just keeps a reference to the program objects which contains the binary.
11:03 karolherbst: right
11:03 pmoreau: You could add a new attribute to kernel for storing the binary (SPIR-V, NIR, whatever) and use that in launch
11:04 karolherbst: pmoreau: ohh wait, there is that pipe_compute_state cs;
11:04 karolherbst: which contains the info of the binary
11:05 pmoreau: But it might be not that easy. Like, what if you compile, create a kernel, recompile, launch the kernel. I’m not sure whether that is legal or not though.
11:05 karolherbst: I doubt that is leval
11:05 karolherbst: or well
11:05 karolherbst: sounds rather like a noop
11:05 karolherbst: why do you want to recompile?
11:06 karolherbst: I mean sure, you could provide new sources and everything, but if the kernel is created, it is done afaik
11:06 pmoreau: Nevermind, recompiling would give you a new program object anyway.
11:07 karolherbst: :)
11:08 karolherbst: ohh, the cs struct is filled on launch
11:08 karolherbst: cs.prog = &(msec.data[0]);
11:09 karolherbst: well, for now I will convert it at launch time :D
11:09 karolherbst: we can improve that later
11:11 RSpliet: pmoreau: doesn't NVIDIA under the hood create one binary per kernel entry point (with heavy inlining and dead-code elimination and stuff)?
11:12 pmoreau: Or just store the binary when creating the kernel, by adding an extra attribute. I can’t think of another valid reason for it to fail.
11:14 pmoreau: RSpliet: Well, kinda I think, but that shouldn’t be different from one binary with multiple entry points one after the other.
11:15 RSpliet: pmoreau: if you don't do control flow it doesn't make a difference. But I believe they replicate or inline function calls for each entry point
11:15 pmoreau: Though, the launch system on GPUs allows you to specify where in the code to start from, so you might as well use that to ship one binary with multiple entrypoints.
11:15 pmoreau: We do function inlining as well (since we do not support anything else)
11:16 pmoreau: IIRC
11:16 RSpliet: ah cool, that solves the problem of "what if my kernel doesn't need some part of the generic function", you can just DCE :-)
11:17 pmoreau: Theoretically, that should happen, yes. I have not tried it in practice, to check that this is indeed the case. :-D
12:37 karolherbst: pmoreau: this line makes no sense: cs.prog = &(msec.data[0]);
12:37 karolherbst: msec.data is an array for chars
12:38 karolherbst: but what you did in nvc0_cp_state_create is just totally not fitting this
12:38 pmoreau: What did I do in nvc0_cp_state_create...
12:38 pmoreau:goes to look it up
12:39 karolherbst: this change https://github.com/pierremoreau/mesa/commit/565656182139e6c98f047f3398bba3c770a74596#diff-336065b196d4bfd11a37186c536f9aff
12:39 pmoreau: Yup, looks good to me
12:40 karolherbst: ?
12:40 pmoreau: (Well, except that it should not get rid of the TGSI path)
12:40 karolherbst: I am confused about something else
12:40 karolherbst: cso is the cs thing in clover/core/kernel
12:40 karolherbst: cs.prog = &(msec.data[0]);
12:40 karolherbst: msec.data some array
12:41 karolherbst: and now it get cast to pipe_llvm_program_header?
12:41 karolherbst: do we seriously want to do things like that?
12:41 pmoreau: That what is done for LLVM IR as well
12:41 pmoreau: We need to pass in the size, along the binary
12:42 pmoreau: In the first series, I added the size as an extra argument to the pipe_compute thing, but was told to use this structure thing instead.
12:44 karolherbst: ...
12:44 karolherbst: I don't listen to people to tell me that "cso->prog + sizeof(struct pipe_llvm_program_header)" is a sane thing to do
12:44 karolherbst: because it is not
12:45 karolherbst: *that
12:45 karolherbst: and usually this looks like a out of bound access
12:45 karolherbst: seriously
12:46 karolherbst: and it is painful to debug, because you have no clue what is going on
12:46 pmoreau: We could change pipe_llvm_program_header to be `struct pipe_spirv_program { size_t size; uint32_t* binary };`. That would work fine for me, and would definitely look better.
12:47 karolherbst: yeah
12:47 karolherbst: something like that
12:47 karolherbst: but please no out of struct bound magic crap
12:47 karolherbst: there is enough bad C code, we shouldn't add more
12:48 pmoreau: Okay, will do that tonight as well.
12:50 karolherbst: mhh
12:50 karolherbst: but I think I still do sometihng wrong
12:51 karolherbst: pmoreau: https://github.com/karolherbst/mesa/commit/f7a39e54801c5b1d8b5df82635305f5a346accea
12:51 karolherbst: look at the spirv_to_nir call
12:52 pmoreau: Ugh, that new structure is going to be painful
12:53 karolherbst: I can imagine
12:53 karolherbst: if in doubt, add silly copies in glue code
12:53 pmoreau: The whole binary is supposed to be transmitted, not just a pointer to some memory region (I mean, it could be just a pointer, but that is going to make memory management just so much more painful).
12:54 karolherbst: right
12:54 pmoreau: I’ll try to think about it tonight, but I can see why they went with the + sizeof for LLVM (IIRC, I took it from there).
12:54 pmoreau: Let’s look at your issue
12:55 karolherbst: duh.. silly pointer arithmetic stuff
12:56 pmoreau: Yeah, pointer arithmetic on void* is not going to work well
12:56 karolherbst: I put a +1 now
12:56 karolherbst: then I get 0x7230203 on [0]
12:56 karolherbst: *sigh
12:57 pmoreau: Or do `((const void*)(&(msec.data[0]) + sizeof(struct pipe_llvm_program_header)))`
12:57 karolherbst: "smad24: PASS" :)
12:57 pmoreau: Nice!
12:58 pmoreau: Or even `((const uint32_t*)(&(msec.data[0]) + sizeof(struct pipe_llvm_program_header)))` :-D
12:58 karolherbst: 1. error: invalid conversion from 'const void*' to 'const uint32_t* {aka const unsigned int*}' [-fpermissive]
12:59 karolherbst: :)
12:59 pmoreau: Does spirv_to_nir really take a uint32_t*, but the size is in bytes?
12:59 karolherbst: yes
12:59 karolherbst: uhm
12:59 pmoreau: Interesting
12:59 karolherbst: no
12:59 karolherbst: the size is in bytes / 4
12:59 karolherbst: I forgot the / 4 in the aptch
12:59 pmoreau: Okay
13:00 pmoreau: Does the second version work? (When you don’t cast to const void*)
13:00 karolherbst: https://github.com/karolherbst/mesa/commit/0d0b508c23e74a8c9d3336a2a6111f9812a842f1
13:00 karolherbst: this works now
13:00 karolherbst: yeah
13:01 pmoreau: https://github.com/karolherbst/mesa/commit/f7a39e54801c5b1d8b5df82635305f5a346accea#diff-9c0e3674d5025df7827a77bf0fa279dbR265 you can replace "test" by _name.c_str() I would guess
13:01 karolherbst: ohh right
13:01 pmoreau: That was the whole point of moving it there :-D
13:01 karolherbst: I did so in the spirv_to_nir call
13:01 pmoreau: Ah right
13:02 pmoreau: Isn’t the name of the entrypoint already set though?
13:02 karolherbst: maybe?
13:02 pmoreau: I would guess so, cause it is part of the `OpEntryPoint` instruction
13:02 karolherbst: seems like it is
13:04 pmoreau: Also, I would move all of that code outside of kernel::exec_context::bind, and into a free function that would fill in the cs struct (or just the prog part).
13:04 karolherbst: yeah.. maybe
13:05 karolherbst: I think we want to move it somewhere else anyway
13:05 pmoreau: Yeah, true
13:05 karolherbst: at least this is now better than having it in codegen
13:05 pmoreau: Yup, cause it can be used by robclark as well :-)
13:05 karolherbst: right
13:08 karolherbst: slowly it is looking quite good overall :)
13:09 pmoreau: :-)
13:09 pmoreau: So, now you can try running the CTS, without hacking in the names.
13:09 karolherbst: well yeah
13:09 karolherbst: just one issue
13:10 karolherbst: get_global_id
13:10 karolherbst: because the CTS is using it all the time
13:10 pmoreau: Do you still have the same issues with my patches?
13:10 karolherbst: depends on which patches you mean
13:11 pmoreau: Let’s see
13:11 karolherbst: tomorrow I will look at robclark patches and try to rework mine on top of his or something
13:12 pmoreau: 22:10 pmoreau: karolherbst: Take the two top most commits of https://github.com/pierremoreau/llvm-spirv/commits/integrate_with_mesa for the llvm-spirv part, and the top most commit of https://github.com/pierremoreau/mesa/commits/nouveau_spirv_support for the Mesa part.
13:13 karolherbst: nope
13:14 pmoreau: “nope” as in, it still doesn’t work with them, or you haven’t tried with them?
13:14 karolherbst: pmoreau: this is the NIR https://gist.githubusercontent.com/karolherbst/d6a0de97687013b992c26998f1128655/raw/f540c92592255e1e872c2d42f5214dca031f1607/gistfile1.txt
13:14 karolherbst: I've tried it
13:14 pmoreau: Okay
13:15 karolherbst: intrinsic store_var (ssa_1) (__spirv_BuiltInGlobalInvocationId) (7) /* wrmask=xyz */
13:15 karolherbst: vec3 64 ssa_1 = load_const (0x 0 /* 0.000000 */, 0x 0 /* 0.000000 */, 0x 0 /* 0.000000 */)
13:15 karolherbst: here the builtin is set to 0
13:16 karolherbst: directly after the conversion I still get this decl_var INTERP_MODE_NONE u64vec3 __spirv_BuiltInGlobalInvocationId = { 0x00000000 }
13:16 karolherbst: wondering why that is
13:20 pmoreau: Hum, I can’t really run your code here, as I’m on the blob.
13:22 pmoreau: Is that the final NIR btw, after all the various passes?
13:23 pmoreau: Is it possible to dump it right after the initial pass that transforms the SPIR-V to NIR?
13:28 karolherbst: this should be right after the transform: https://gist.githubusercontent.com/karolherbst/82cb992ff2f5bcdad7d16a2e797c43c2/raw/b27a4fbe9442cf39c06abc2afcea42a1fac1a182/gistfile1.txt
13:28 karolherbst: or maybe after the first pass...
13:28 karolherbst: let me check
13:30 karolherbst: pmoreau: this is directly after the transform: https://gist.githubusercontent.com/karolherbst/82cb992ff2f5bcdad7d16a2e797c43c2/raw/92506e677c6fcf1fd1dee79afee20a90b6f66755/gistfile1.txt
13:30 pmoreau: Thanks
13:33 dhgsb: how can i flash a vga bios with linux to the gpu card?
13:34 karolherbst: dhgsb: nvidia has some tools
13:35 karolherbst: dhgsb: but why do you want to flash it in the first place?
13:35 dhgsb: karolherbst: nvflash is the only thing? i cant flash with nouveau? because nouveau lets me read out the vbios really easy
13:35 dhgsb: karolherbst: because its a free fake card
13:36 karolherbst: fake card means what exactly? some clone or?
13:36 dhgsb: the scam-vbios preinstalled is of course not working propperly
13:36 karolherbst: ahh
13:36 karolherbst: so it is scam :)
13:36 karolherbst: well
13:36 karolherbst: we don't flash the vbios in nouveau
13:36 karolherbst: we just use whatever is on the GPU
13:36 karolherbst: however
13:36 dhgsb: karolherbst: no, it claims to be a more modern one with fake vbios. just want to flash the real bios.
13:37 karolherbst: dhgsb: do you know 100% sure what the real bios is? Because you can't just grab any with the same model number
13:37 karolherbst: wel maybe vendor serial/model numbers are fine though
13:37 karolherbst: *well
13:37 dhgsb: with fake bios, the official nvidia driver also loads the wrong driver and always have to mess up with inf files and so on - just flash the real vbios is the plan
13:37 karolherbst: there is the option to upload a vbios into memory and let the nvidia driver pick that one up
13:38 karolherbst: we have a tool for that in envytools called nvafakebios (I think)
13:38 karolherbst: yeah, nvafakebios
13:38 dhgsb: there is ONLY nvflash? if nvflash does not work you have to use an SPI-flasher to connect to spi chip? No other ways?
13:38 karolherbst: well, one could reverse engineer nvflash, but I don't think anybody did this?
13:39 karolherbst: dhgsb: you can tell nouveau to use a file on the filesystem instead though
13:39 karolherbst: and for nvidia there is this vbios in memory thing
13:40 dhgsb: i try to find a propper solution that always works and when i never would have to mess up with the software again. Like having a broken EDID in the monitor. Yes, you can manually load a propper one but flash the EDID chip is the propper solution to never care about any more
13:40 dhgsb: so there is really none other vbios flashing tool then nvflash, right?
13:40 karolherbst: right, but we didn't develop any replacement for nvflash
13:41 karolherbst: who knows? Maybe somebody wrote one
13:41 karolherbst: I am not aware of anything else
13:41 dhgsb: ok, thanks so far
13:41 pmoreau: karolherbst: Would you have the SPIR-V for it as well, please? Could you try breakpoint’ing on https://github.com/karolherbst/mesa/blob/0d0b508c23e74a8c9d3336a2a6111f9812a842f1/src/compiler/spirv/vtn_variables.c#L1325 and https://github.com/karolherbst/mesa/blob/0d0b508c23e74a8c9d3336a2a6111f9812a842f1/src/compiler/spirv/vtn_variables.c#L1342
13:41 karolherbst: pmoreau: how can I dump it again?
13:42 pmoreau: For both check whether nir_var->constant_initialiszer is NULL, and for the BuiltIn, check which path it takes.
13:42 pmoreau: CLOVER_DEBUG=spirv CLOVER_DEBUG_FILE="somefiletodumpto"
13:42 dhgsb: karolherbst: btw if you would like to have a free fake card (ebay/paypal refunds this scam directly to 100% ) https://www.ebay.com/itm/232658535073
13:44 karolherbst: pmoreau: CLOVER_DEBUG doesn't do anything :(
13:44 karolherbst: ohh wait
13:44 karolherbst: my mistake
13:44 pmoreau: Maybe you do not have my patch for it either
13:45 karolherbst: pmoreau: https://gist.githubusercontent.com/karolherbst/9d168ce556a45bf2409c1f8b8a84cd5f/raw/630494a3b722e24a5a73419f80035af77204a36e/gistfile1.txt
13:45 pmoreau: Perfect, thanks
13:45 karolherbst: dhgsb: :D
13:45 karolherbst: dhgsb: what card is it originally? 620?
13:45 dhgsb: karolherbst: GF116
13:46 karolherbst: mhh
13:46 karolherbst: GF116 ain't so bad though
13:47 dhgsb: karolherbst: i know. so if anyone with older GPU here have problems because nouveau simply dont fix problems on older GPUs, just send him this ebay link to the fake card and tell him to flash with external spi flasher.
13:47 karolherbst: well
13:47 karolherbst: fermi has no reclocking support yet
13:47 karolherbst: so it is rather pointless
13:47 dhgsb: karolherbst: but fermi would probably not cause a kernel-panic and other issues that other nvidia cards does but the bugs dont get fixed
13:48 karolherbst: what do you mean?
13:48 karolherbst: it isn't like anybody chooses to not fix bugs
13:49 dhgsb: karolherbst: just install a nvidia 7900gtx card into your system and use it as a daily driver. have fun with non working system and 100% fan speed
13:50 dhgsb: it sounds as a vakuum cleaner and crashes the system. on windows its working fine
13:51 karolherbst: well, then somebody with such a GPU should look into those issues, right?
13:51 dhgsb: karolherbst: it would be really great if you could get such a 1-5$ card and install it into your system.
13:51 karolherbst: and I don't see any bug reports about that GPU anyway
13:52 karolherbst: dhgsb: it would be really great if you could look into the issue as well or file a bug report with the vbios attached
13:52 karolherbst: issue is
13:52 dhgsb: karolherbst: should i flood this channel with reported bugs in freedesktop?
13:52 karolherbst: 7900 gtx != 7900 gtx
13:52 dhgsb: karolherbst: https://bugs.freedesktop.org/show_bug.cgi?id=102349
13:52 karolherbst: just because I get one, doesn't mean I get one with your issues
13:53 dhgsb: https://bugs.freedesktop.org/show_bug.cgi?id=102430
13:53 dhgsb: https://bugs.freedesktop.org/show_bug.cgi?id=102352
13:54 dhgsb: https://bugs.freedesktop.org/show_bug.cgi?id=96460
13:54 dhgsb: https://bugs.freedesktop.org/show_bug.cgi?id=91986
13:55 karolherbst: two of those have actually patches
13:55 karolherbst: wondering what is the status about those though
13:55 dhgsb: karolherbst: you mean hackpatch2? its been told that it should not be merged and done properly
13:56 karolherbst: right
13:56 pmoreau: There was a problematic fan (can’t remember for which gen), for which we received some documentation from NVIDIA like 1 or 2 weeks ago.
13:57 dhgsb: pmoreau: the vbios did not have any fan information including
13:58 dhgsb: pmoreau: just check the vios dumps here: https://bugs.freedesktop.org/show_bug.cgi?id=102352
13:58 dhgsb: pmoreau: the nouveau driver should set the values by himself, manual fan speed setting is working fine
13:58 karolherbst: dhgsb: there is fan stuff
13:59 karolherbst: but
13:59 karolherbst: I guess we just currently fail to parse the proper tables
13:59 dhgsb: karolherbst: as far as i understood the vbios have only information about "fan yes/no"
13:59 karolherbst: mupuf: bios with just the FAN CALIBRATION tbale?
13:59 karolherbst: dhgsb: far from that
13:59 karolherbst: there is bunch of fan stuff in their
14:00 mupuf: karolherbst: GOOD QUESTION
14:00 mupuf: sorry for the caps
14:00 karolherbst: ;)
14:00 dhgsb: karolherbst: mupuf told there is no fan calibration table in the vbios
14:00 karolherbst: "FAN CALIBRATION table at 0xd65a, version 10"
14:00 karolherbst: there are other fan tables though
14:00 karolherbst: this fan calibration is a thing we don't use yet
14:00 karolherbst: but mupuf was working on that
14:01 dhgsb: he told 4 months ago that he would like to look at the weekend into it. now about 20 weekends later nothing new happened
14:01 mupuf: dhgsb: I am sorry, but does it look like I get paid to do this work? :s
14:02 mupuf: I try my best
14:02 mupuf: your problem is for one person, there is another problem that affects ~30 people
14:02 dhgsb: mupuf: i know that you dont get payed, but i just listened to your word here: https://bugs.freedesktop.org/show_bug.cgi?id=102352#c7
14:02 mupuf: now tell me, which problem should I be working on first?
14:03 dhgsb: mupuf: you ask me, then i tell you: this one -> https://bugs.freedesktop.org/show_bug.cgi?id=102352
14:03 mupuf: I can't say anything else but: I am too much of an optimist and obviously can't judge how long things will take me
14:03 karolherbst: mupuf: but this is about a patch which didn't get merged, rigt?
14:03 mupuf: next time, I will not give any time frame
14:03 karolherbst: blame skeggsb then :p
14:03 mupuf: karolherbst: the patch was wrong IIRC
14:03 dhgsb: karolherbst: the patch got merged
14:04 karolherbst: mupuf: ahh
14:04 karolherbst: okay
14:04 mupuf: and to reproduce the issue, I need to have one special gpu, and hack the vbios
14:04 karolherbst: right
14:04 mupuf: dhgsb: I even plugged this GPU before going on my christmas vacation
14:04 dhgsb: karolherbst: 4.11rc1 had it inside as far as i understood
14:04 mupuf: so as I could do that remotely
14:04 karolherbst: mupuf: but it looks like this is related to the fan calibration thing though
14:04 mupuf: and I did so again for FOSDEM, since i was away for a week and would maybe have time
14:04 karolherbst: because this is the only fan related table I see in that vbios
14:04 mupuf: so, no, I did not forget you
14:05 mupuf: I am sorry I am not working as fast you would wish, I feel exactly in the same way
14:05 mupuf: karolherbst: it used to work, so I doubt it :p
14:05 karolherbst: ahh
14:05 karolherbst: I see
14:05 mupuf: but maybe it is true, maybe we did not drive the fan before :D
14:05 karolherbst: :D
14:06 karolherbst: might be
14:06 karolherbst: mupuf: usually there should be a thermal table with some stuff, right?
14:06 karolherbst: what is the fallback in case nothing is there?
14:06 dhgsb: and again, you can just tell every single user with PCIe card and curie or tesla series when you dont have time/will to to fix the bug, just send those people this link and tell them to buy and ask paypal to refund. Then they got a newer series with better nouveau code: https://www.ebay.com/itm/232658535073
14:06 karolherbst: scaling on the PWM 0%->100%?
14:07 dhgsb: karolherbst: no, thats exactly the case. it then runs 100%
14:08 karolherbst: dhgsb: please don't suggest people to buy a certain GPU here with the promises it might work better. Sometimes it doesn't and what if people don't get money back?
14:08 dhgsb: but it should be scaling on PWM 0% to 100%. but it just rin 100% whole time and on a nvidia 7900 series the high power fan lets you get the feeling you are sitting next to a 19" server rack
14:09 dhgsb: karolherbst: the people get the money back based on paypal. they use paypal for payment (nothing else is accepted there) and based on paypal rules they get always 100% refund on fakes
14:10 dhgsb: here a review of two of those fake gpus: https://www.youtube.com/watch?v=joVGTjB70dQ
14:10 karolherbst: dhgsb: they can buy other cards as well
14:11 dhgsb: karolherbst: yes, other fake ones to get free cards, yes. For example the nvidia 960 4GB that is a Nvidia GTS 450 Rev 2 with 1GB
14:11 karolherbst: anyway, it's not the place to suggest to buy any GPU here to others without them asking first
14:11 mupuf: absolutely, this behaviour was unacceptable
14:12 mupuf: we officially support nothing, if it works for you, great, if not, patches are welcome
14:12 dhgsb: telling people to get free cards from scammers to run those on nouveau?
14:12 karolherbst: if somebody asks me, I am willing to make a good enough suggestion, but I would never pinpoint to a special product and "promise" it will work out
14:12 karolherbst: dhgsb: you have to put time into it
14:12 mupuf: all this effort is a best effort
14:12 mupuf: no guarantees
14:12 karolherbst: for me my time would be too value to even get those for free
14:12 mupuf: if you want guarantees, use the blob
14:12 karolherbst: *valueable
14:13 karolherbst: I would rather not buy those and spend my time more productive
14:13 dhgsb: karolherbst: i would also value your time much more here https://bugs.freedesktop.org/show_bug.cgi?id=102430 then seeing you flashing with cheap spi flasher the working bios on such cheap cards
14:14 karolherbst: dhgsb: also, please don't get poleimcal. We do our best here and it doesn't help to get blamed for random stuff
14:14 karolherbst: if you want to get shit doen, do it yourself
14:14 karolherbst: dhgsb: okay, if you want me to fix this, pay me 100€/hour
14:14 karolherbst: problem solved
14:14 dhgsb: karolherbst: but you really cant tell the user "you see a bug, then fix it yourself"
14:15 karolherbst: dhgsb: okay, then I suggest them to try fixing it
14:15 dhgsb: karolherbst: you get payed by redhead to fix those bugs already :/
14:15 karolherbst: we help
14:15 karolherbst: dhgsb: no, I don't
14:15 karolherbst: I get paid to work on other things
14:15 mupuf: dhgsb: your understanding of free software is wrong :s
14:15 karolherbst: dhgsb: if you want to get me fixing bugs, buy RHEL supscriptions and report bugs
14:16 karolherbst: *subscriptions
14:16 karolherbst: or be nice to us
14:16 dhgsb: mupuf: i report bugs of free software to bugtracker. thats what a user should do in best case and thats what i do
14:16 karolherbst: dhgsb: your behvaiour doesn't make it more likely that we fix your bugs
14:16 karolherbst: more the contrary
14:16 karolherbst: I see your point, and you are right to get annoyed by the bugs
14:16 karolherbst: but your are not right to blame us for the issues
14:17 dhgsb: karolherbst: but thats not "my" bugs. those are bugs of many thousand people. i am always the only one who spend time into propper reporing those with logfiles and so on
14:17 karolherbst: I already told you, don't get polemical
14:17 mupuf: dhgsb: we value your bug report
14:17 mupuf: but, don't expect things to magically fix themselves
14:17 mupuf: even on i915 we are struggling with that
14:18 mupuf: especially for HW from 2005
14:18 karolherbst: dhgsb: if I would be infinitly fast and would have infinitly time, I would fix all the bugs
14:18 mupuf: if not 2004
14:18 karolherbst: but because I am/have neither, I can't
14:18 dhgsb: karolherbst: i just argue about the things you told about that. i wont spend my time into that if i wont support the environment and would like to make the world a bit better by not giving people a reason to buy new hardware when the old hardware is still fine but just a software bug
14:19 karolherbst: dhgsb: you pointed to a specific product
14:19 karolherbst: of course buying a better supported GPU will improve the situation
14:19 karolherbst: but this can be any
14:19 mupuf: dhgsb: agreed, fully, but there is also only limited time for fixing things
14:19 karolherbst: you could even suggest to search for scam and poker on refunds
14:19 karolherbst: that's fine
14:19 mupuf: and I am working on proper CI systems to prevent regressions
14:20 mupuf: so as we keep stuff that was previously-working in good condition
14:20 dhgsb: mupuf: the hardware is new from 2018 and also dont work. For example buy this and connect your monitor to it: https://www.amazon.com/ASRock-Motherboard-Motherboards-N68-GS4-R2-0/dp/B01N8OR7D8
14:20 karolherbst: dhgsb: and yes, we would like to support all those and fix all the bugs, but there are so many bugs and so many GPUs, that there is always something to work on
14:21 mupuf: dhgsb: what the fuck :o
14:21 karolherbst: the heck
14:21 mupuf: who would buy that though? This is just a waste of energy
14:22 dhgsb: mupuf: its a new nvidia 7025 GPU IGP card. Tons of people buy this hardware because its the cheapest AM3+ mainboard.
14:22 karolherbst: it is from 2015 though
14:22 karolherbst: "new nvidia 7025"
14:22 karolherbst: from 2007
14:23 mupuf: dhgsb: fair enough, but we still don't magically get more time to work on that, especially when we lack the HW for it
14:23 dhgsb: karolherbst: its the IGP. i cant do anything about that. Just dont tell me, that the hardware were from 2004. Its also build and sold in 2015, 2016 and 2017
14:23 mupuf: but that could be a good platform to buy for CI, thanks for the link
14:24 mupuf: the design still is from 2004 or 2005
14:25 dhgsb: mupuf: i had spend so many hours with this 7025 because there is no simple way to do as a user a MMIO dump of the OEM driver. But happyliy the MMIO dump was usefull and skeggsb was able to fix the assembler code that was causing the kernel panic with this mainboard
14:26 mupuf: dhgsb: see, we are trying ;)
14:26 mupuf: and thanks for caring, really
14:26 mupuf: I share your values, but my time is what it is
14:26 mupuf: maybe you want to start contributing
14:26 dhgsb: i cant code
14:26 dhgsb: and i cant learn coding
14:26 mupuf: well, so did I until I learnt
14:27 dhgsb: i cant learn coding
14:27 mupuf: why that?
14:27 karolherbst: pmoreau: for the nir_var->constant_initializer the initializer is NULL
14:27 dhgsb: btw here the bugreport: https://bugs.freedesktop.org/show_bug.cgi?id=93557
14:28 pmoreau: karolherbst: Good, but for which location? :-D
14:28 dhgsb: would be nice if someone could care about this bugreport: https://bugs.freedesktop.org/show_bug.cgi?id=93557
14:29 karolherbst: pmoreau: what do you mean? there is just one
14:29 karolherbst: ohh
14:29 karolherbst: the assert
14:29 karolherbst: which I or somebody else disabled
14:29 karolherbst: the first
14:29 karolherbst: case SpvDecorationConstant:
14:29 pmoreau: Right
14:29 pmoreau: Okay, good
14:30 pmoreau: Wait, it didn’t break in the builtin one first?
14:30 karolherbst: and the case SpvDecorationBuiltIn: runs into the first if
14:30 karolherbst: no
14:30 karolherbst: first the constant, than builtin
14:30 pmoreau: Hum okay, they might have reordered the decorations, but it shouldn’t matter.
14:30 pmoreau: It runs in the first if...
14:30 karolherbst: mhhh let me check something
14:31 pmoreau: Wait, what?
14:31 karolherbst: ufff
14:31 karolherbst: yeah... I think I messed up here
14:31 pmoreau: It runs in the `builtin == SpvBuiltInWorkgroupSize)` if??
14:31 karolherbst: pmoreau: https://github.com/karolherbst/mesa/commit/56d2ddab890
14:32 karolherbst: guess I broke it
14:32 pmoreau: Ah, I see https://github.com/karolherbst/mesa/commit/56d2ddab890#diff-20e6d2c405c49e60dcdbfae7b9e2565aR1359
14:32 karolherbst: okay
14:32 karolherbst: now another issues
14:33 karolherbst: 64 bit vs 32 bit
14:34 dhgsb: karolherbst: you wrote "we would like to support all those and fix all the bugs, but there are so many bugs and so many GPUs, that there is always something to work on" . I would like to know if it wound not be much more valuable to grow everywhere instead adding vulkan support for some gpus when on other ones the default gnome and kde screen is crashing
14:34 dhgsb: karolherbst: i think its MUCH more valuable to have "basic usage" working first
14:35 karolherbst: dhgsb: well, in my spare time I work on things I think are fun
14:35 karolherbst: on work time, I work towards goals we have internally mainly
14:35 pmoreau: karolherbst: Would you have an updated NIR for those new issues plus?
14:35 karolherbst: pmoreau: https://gist.github.com/karolherbst/cf03e317f4453f994428efa37e18b055
14:35 pmoreau: Thanks
14:36 dhgsb: from the user perspective: You hear something about "this linux thing" and then hear the most people use this "ubuntu" or "suse" and then start it and it crash. this is the first linux experience you get of linux on some systems because of nouveau.
14:36 karolherbst: dhgsb: the nouveau community isn't a coorperation you can blame for random bugs and assume they get fixed by blaiming people
14:36 karolherbst: it is much simplier than that
14:37 karolherbst: get polemical and blame us, and we don't work on your bugs. Stay friendly and work with us towards a goal and we help you
14:37 karolherbst: you can be an asshole to companies, but not to people, we are people, not a company
14:38 pmoreau: karolherbst: So, I guess some bug in the code which splits the global_id into work_group_id and local_invocation_id.
14:39 karolherbst: pmoreau: I doubt that
14:39 karolherbst: pmoreau: but maybe?
14:39 dhgsb: karolherbst: i tried to understand if i can get such a redhat support level on ANY hardware but did not get it. because i think redhat should not value nvidia for their closed source way, the "Curie" cards should be fixed first before supporting their new closed-source-enforcing-things like the nvidia 1000 series.
14:39 karolherbst: pmoreau: I think it gets lowered or something
14:39 pmoreau: karolherbst: Well, those do not exist in the SPIR-V, and when they appear in NIR, they use 32 bits instead of 64.
14:39 karolherbst: pmoreau: nir_lower_system_values
14:40 dhgsb: karolherbst: who am i blaming? i just tell from the user perspective on a new linux user
14:40 karolherbst: dhgsb: you get redhat support if you pay for RHEL subscriptions, I told you that already
14:41 dhgsb: karolherbst: can you get RHEL subscription for ANY hardware? Or is the RHEL subscription even hardware dependent or just user?
14:41 karolherbst: well I got the feeling your are upset because we don't do our job
14:41 karolherbst: dhgsb: it is a subscription for software
14:41 karolherbst: and if you run it on supported hardware, that's fine
14:42 karolherbst: but I doubt you want to get one
14:42 karolherbst: it's like 300$ a year
14:42 imirkin: dhgsb: this chan is full of optimists. since we work in our free time, we hope that we'll have more of it, but reality often differs.
14:42 dhgsb: i want to run it on the hardware that makes problems to get the reported bugs into your "<karolherbst> on work time, I work towards goals we have internally mainly"
14:45 karolherbst: pmoreau: mhh in glsl: "The value of gl_GlobalInvocationID is equal to gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID"
14:45 dhgsb: karolherbst: for 300$ i can step back and call support and the support then buy the non working hardware and make the MMIO-traces on the OEM driver and sends those then to you? I can report as many bugs as i find and get those fixed in time? That would be great and worth 300$. With 300$ then you can fix sooo many bugs in soo many opensource projects
14:45 karolherbst: pmoreau: is gloablInvocationId a different thing in OpenCL?
14:45 karolherbst: dhgsb: I doubt that works out, because those usually have to be evaluated and stuff
14:45 imirkin: dhgsb: just add a few more 0's, and you got it
14:46 karolherbst: dhgsb: usually you don't want to go the enterprise way for that
14:46 pmoreau: karolherbst: Nope, it works the same way
14:46 karolherbst: pmoreau: ahh
14:46 karolherbst: pmoreau: well the issue is, that the code generates 32 bit ops
14:46 karolherbst: instead of 64 bit ones
14:46 dhgsb: imirkin: i was not able to find any price tag on any official site. please send me the link.
14:46 pmoreau: karolherbst: Right. I would guess that this is an assumption done by the lowering code in NIR
14:47 karolherbst: dhgsb: and we are already working on some Nouveau bugs here
14:47 karolherbst: dhgsb: there are just a lot of those
14:47 pmoreau: (Or in some of the GLSL lowering thing)
14:47 karolherbst: pmoreau: right
14:47 imirkin: dhgsb: it'd be a negotiated contract. be prepared to part with many millions.
14:47 karolherbst: imirkin: well you can just buy RHEL for workstations though
14:48 imirkin: dunno what level of support that gets you
14:48 karolherbst: responses
14:48 karolherbst: "unlimited incident reports and 2-business-day or 4-business-hour responses"
14:48 karolherbst: but it is just for dev purposes
14:48 imirkin: like "we appreciate you have a choice in operating systems, and you appear to have chosen the wrong one."?
14:48 dhgsb: karolherbst: you told here something about kernel 4.15 or 4.16. Does the changes got into 4.15 or 4.16 or are they still not in 4.16? https://bugs.freedesktop.org/show_bug.cgi?id=102430
14:48 karolherbst: dhgsb: I think 4.16? not quite sure
14:49 karolherbst: or 4.15..
14:49 karolherbst: you can try out 415
14:50 karolherbst: 4.15
14:50 karolherbst: maybe it is better with it
14:50 karolherbst: maybe not
14:51 dhgsb: karolherbst: logfiles of not working would help you for the further development?
14:51 karolherbst: imirkin: maybe?
14:51 karolherbst: dhgsb: maybe? it always depends on the content
14:51 karolherbst: sometimes the logs only log the broken stuff, but not the cause
14:52 dhgsb: when telling the software + hardware that is been used, then this should be perfect, right?
14:52 karolherbst: anyway, ultimatly I would like to have a lot of GPUs here and work on bugs as I see fit and I actually wanted to look into the out of vram issues at some point
14:53 dhgsb: i dont think that the problem is getting a "Curie" based gpu from ebay, right?
14:55 karolherbst: pmoreau: mhh I think all builtins are 32 bit...
14:57 karolherbst: pmoreau: nice, now the first 1023 tests passes :)
14:57 karolherbst: uhm rdsv u32 %r11d sv[CTAID:0]
14:57 karolherbst: imirkin: rdsc is 32 bit only?
14:57 karolherbst: *rdsv
14:58 pmoreau: karolherbst: Nice!
14:58 imirkin: yeah, sysvals are 32-bit
14:59 imirkin: some are logically 64-bit, but practically they're 32-bit and you have to do 2 loads + merge
15:01 karolherbst: ahh
15:02 imirkin: and some are 64-bit in the api, but 32-bit on hw
15:02 imirkin: like the subgroup stuff
15:05 karolherbst: right
15:06 karolherbst: pmoreau: guess when the first fail is now
15:07 karolherbst: but this is kind of weird
15:07 pmoreau: Images?
15:07 karolherbst: ohh
15:07 karolherbst: channel broke
15:07 karolherbst: read fault
15:08 karolherbst: uhh
15:08 karolherbst: yeah that sounds wrong
15:09 karolherbst: 1024 still
15:10 karolherbst: mhh
15:10 karolherbst: 11: shl u32 $r11 s64 $r8 $r0d $r9 (8)
15:10 karolherbst: those shift ops
15:10 karolherbst: really
15:11 karolherbst: imirkin: I assume I can't do this? shl s64 %r22d %r21d %r1d
15:12 imirkin: not in hw, but we should lower it iirc
15:12 karolherbst: split u64 { %r136 %r137 } %r94d
15:12 karolherbst: shl u32 %r139 s64 0x0000000000000000 %r70d %r136
15:12 karolherbst: shl u32 %r138 s64 %r136 %r70d %r137
15:13 imirkin: yeah, that's actually a LSHF
15:13 karolherbst: I am wondering about the second source
15:13 imirkin: rather than a shl
15:13 karolherbst: ohh
15:13 karolherbst: weird op
15:13 imirkin: hm, that second source shouldn't be a double tho
15:13 imirkin: er, SHF.L. same diff.
15:14 karolherbst: pmoreau: is global invoc id really a 64 bit value in OpenCL?
15:14 imirkin: and dest should be s32 as well
15:14 pmoreau: Yes
15:14 imirkin: sounds like the lowering hasn't run
15:14 imirkin: or is otherwise messed up
15:14 karolherbst: mupuf: the switch GPU commands: http://switchbrew.org/index.php?title=GPU_Commands_Basics
15:15 pmoreau: karolherbst: `size_t get_global_id ( uint dimindx)`, and size_t The unsigned integer type of the result of the sizeof operator. This is a 32-bit unsigned integer if CL_DEVICE_ADDRESS_BITS defined in clGetDeviceInfo is 32-bits and is a 64-bit unsigned integer if CL_DEVICE_ADDRESS_BITS is 64-bits.”
15:15 karolherbst: duh
15:16 imirkin: karolherbst: i hope they're not just figuring this out from scratch
15:16 karolherbst: I think they do
15:16 karolherbst: but the one guy asled me
15:16 karolherbst: *asked
15:16 karolherbst: because he hoped I would tell them we already know it
15:17 imirkin: it's ... just a maxwell chip
15:17 karolherbst: yeah
15:17 imirkin: nouveau should Just Work (tm)
15:17 pmoreau: They do reference envytools and some Mesa code on that page
15:17 imirkin: yeah
15:18 pmoreau: So hopefully they have been using that information
15:18 karolherbst: imirkin: yeah, they know
15:19 imirkin: coz the stuff they have on there looks a lot like RE
15:19 imirkin: rather than "just use nouveau"
15:21 karolherbst: imirkin: the Command List list on that wiki page looks odd though
15:21 karolherbst: ohh wait
15:21 karolherbst: no
15:21 imirkin: shift-by-2
15:21 karolherbst: it is fine
15:21 karolherbst: yeah
15:24 imirkin: interesting. 0x47f (aka 11fc) == ResolveDepthBuffer
15:24 imirkin: that's not one i was previously aware of.
15:24 imirkin: perhaps solution to some of the maxwell problems?
15:25 karolherbst: maybe?
15:28 imirkin: karolherbst: can you trace this when you get a chance? KHR-GL45.direct_state_access.renderbuffers_storage_multisample
15:28 imirkin: we're failing on RGBA32I/UI formats.
15:28 karolherbst: ahh
15:28 imirkin: RGBA32F is fine apparently.
15:29 karolherbst: mhh
15:36 freecoder: imirkin: do you still need help testing that patch for fermi? although i am still scraping over the nouveau docs and figuring out the kinks so you might have to help me out a bit here
15:37 karolherbst: imirkin: switch kernel API: http://switchbrew.org/index.php?title=NV_services
15:39 imirkin: freecoder: yes.
15:40 karolherbst: imirkin, pmoreau: do you see a bigger issue here? https://gist.githubusercontent.com/karolherbst/203f5fabc04ab39079c2c332a06914aa/raw/e95b89a970fd2b9255df67cee8504fed13bea52d/gistfile1.txt
15:45 freecoder: cool. do you guys use skeggsb/nouveau repo to build nouveau?
15:45 karolherbst: I use mine :p
15:45 freecoder: karolherbst: i dont have mine :D. which one do i use?
15:46 karolherbst: the skeggsb one should be fine
15:47 freecoder: ok
15:54 karolherbst: pmoreau: mhh that code works perfectly up to 0x1023 as the global id
15:54 karolherbst: uhm
15:54 karolherbst: 1023
15:55 karolherbst: final shader for int: https://gist.githubusercontent.com/karolherbst/701a2c4ea34f734187e0005dd509701d/raw/f1e0ea2f7f2ae27e37c3e889e41670b4ffd25f6a/gistfile1.txt
15:56 karolherbst: pmoreau: ohhhh
15:56 karolherbst: pmoreau: let me guess, we just allocation 1kB of global memory somewhere :)
15:56 karolherbst: or something like that?
15:59 imirkin: freecoder: patch is against mesa
16:00 karolherbst: pmoreau: aha!
16:00 karolherbst: pmoreau: it is something else
16:00 karolherbst: pmoreau: we just don't do enough invocations for whatever reasons
16:00 karolherbst: I am doing out[0] = get_global_id(0) now
16:00 karolherbst: and usually the value is 0x3ff, but sometimes it is also 0x3df
16:01 karolherbst: pmoreau: any ideas?
16:13 freecoder: imirkin: oh. did not know that :P
16:27 pmoreau: karolherbst: Hum, one sec. Let me get some tea and get back to you. :-)
16:28 pmoreau: Plus, now I am home, so grepping things in Mesa and stuff will be way easier. :-D
16:29 karolherbst: :D
16:29 karolherbst: I think something doesn't really work out launching the kernels
16:29 karolherbst: rdsv u32 $r4 sv[TID:0] sounds correct, no?
16:29 imirkin: my guess is you're computing a local id instead of global
16:29 karolherbst: or is TID the local one?
16:29 imirkin: check how tgsi does it
16:30 imirkin: TID is local.
16:30 karolherbst: meh
16:30 imirkin: (i'm like 99% sure)
16:30 karolherbst: something went wrong then
16:30 pmoreau: Yup, I see the issue
16:30 imirkin: no, you have to get the local.
16:30 imirkin: but you also need to do math on it.
16:30 karolherbst: I have CTAID and TID
16:30 karolherbst: but the CTAID things get nuked
16:30 imirkin: CTAID is the block id
16:30 karolherbst: right
16:31 imirkin: so you need to do CTAID * blocksize + TID
16:31 pmoreau: “vec3 64 ssa_48 = load_const (0x 0 /* 0.000000 */, 0x 0 /* 0.000000 */, 0x 0 /* 0.000000 */)” this is supposed to be blockSize
16:31 imirkin: yeah, that could backfire ;)
16:31 karolherbst: ohh
16:31 pmoreau: NIR reads the value from some shader info, and I guess those aren’t filled up by clover, or rather clover fills the compute one, but not the shader one.
16:31 karolherbst: pmoreau: yeah, that makes sense
16:32 pmoreau: Something like that I guess
16:32 karolherbst: so we have block id * block size
16:32 karolherbst: and because it is * 0
16:32 karolherbst: it gets nuked
16:32 imirkin: fwiw there are ntid sysvals as well
16:32 karolherbst: pmoreau: I see, yeah, makes sense
16:33 imirkin: which could be used for a blocksize
16:33 imirkin: but ... better to have an immediate :)
16:33 karolherbst: :)
16:33 imirkin: since it's almost always a power of 2, and thus can be shifted
16:33 imirkin: (coz really, who does invocations in non-POT sizes...)
16:33 karolherbst: pmoreau: b->shader->info.cs.local_size[0]..b->shader->info.cs.local_size[2]
16:34 karolherbst: is used
16:34 pmoreau: Oh, maybe it hasn’t yet been set by clover, when you call spirv_to_nir
16:35 karolherbst: mhhh
16:35 karolherbst: good point
16:36 karolherbst: I think it is though
16:38 karolherbst: pmoreau: :) ADD and SUB apsses now
16:39 karolherbst: but!
16:39 karolherbst: LONG_MAD int test failed 0 712c4e5005cce8ec != 712c4e4f05cce8ec
16:39 karolherbst: !
16:39 karolherbst: carry sched bug again
16:39 pmoreau: Cool! So, what did you needed to fix?
16:39 karolherbst: pmoreau: I just hardcoded the grid size for now
16:39 karolherbst: uhm
16:39 pmoreau: :-D
16:39 karolherbst: block size
16:44 karolherbst: so... who is up for a mesa port to the switch to use the nouveau userspace with the switch kernel space?
16:44 pmoreau: Not me :-)
16:45 pmoreau: I don’t have time, nor the hardware. (Though it could be reallly fun!)
16:45 orbea: karolherbst: and then you can use RetroArch on a switch with nouveau? :P
16:46 freecoder: i'm fetching the mesa code from git clone git://anongit.freedesktop.org/git/mesa/mesa and its taking too long. is the anon repo throttled?
16:46 pmoreau: Could be the infrastructure which is having issues.
16:47 pmoreau: Try the mirror on GitHub, it was quite fast when I tried today
16:47 pmoreau: freecoder: https://github.com/mesa3d/mesa
16:49 freecoder: pmoreau: got it. thanks, this works
16:50 pmoreau: Once you have it cloned, you can always change the remote back to freedesktop, as you won’t need to download as much when you update.
16:51 freecoder: good idea
17:00 karolherbst: pmoreau: anyway, gtg
17:00 karolherbst: won't be doing stuff today anymore
17:02 pmoreau: Okay
17:29 imirkin_: i wonder if we're enabling some dumb compression thing on rgba32[u]i and we shouldn't be
20:25 fireion: How to find GPU codename?
20:26 fireion: One of my friends has 820M with 128 cuda cores, and other has 96 cuda cores. I want to see the die name
20:26 imirkin_: hm?
20:26 imirkin_: lspci -nn -d 10de:
20:26 imirkin_: probably GK107 vs GF108
20:28 fireion: Both have GF117
20:28 fireion: GF117M
20:29 imirkin_: so ... that's the gpu codename
20:29 fireion: But... One has 128 cores, and other has 96 cores
20:29 fireion: What is this??
20:29 imirkin_: some chips will have parts fused off
20:30 fireion: But...? 96 cores? How did they fuse off 32 cores? There are 64 cores per SM in GF117M
20:31 imirkin_: those are fake numbers anyways
20:31 imirkin_: but they probably fuse off one SM
20:31 imirkin_: or MP or whatever
20:31 fireion: fake numbers? what's real then?
20:31 imirkin_: well, i just mean the way one gets to 96 requires some creative accounting
20:32 fireion: How do we get the real details on architecture
20:32 fireion: By running profiling experiments and putting up different theories? Is that the only way to understand GPUs?
20:32 imirkin_: read fuses
20:33 imirkin_: oh, that
20:33 imirkin_: well, you can read about it
20:33 imirkin_: it's not secret how they count
20:33 fireion: Can you point a reference to "fuses"?
20:33 imirkin_: but it's like saying that if you have 2 cpu cores, and each can do AVX512 (i.e. 16-wide SIMD), then you have "32 cores"
20:34 fireion: and fuses?
20:35 imirkin_: that's just where we read which MP's are available and whatnot
20:37 fireion: some web url pointing to it?
20:44 fireion: Damn. Different versions of deviceQuery give different values? Faith in humanity lost.
20:48 imirkin_: fireion: https://github.com/envytools/envytools/blob/master/rnndb/bus/punits.xml#L54
20:50 fireion: Probably the devices were exactly same. The deviceQuery was of mint's cuda, which gave 96 cuda cores, as compared to cuda-8.0 which gave 128 cuda cores on the same GPU
20:50 fireion: *by mint I mean linux mint
20:51 imirkin_: could be yeah. all depends what you count :)
20:52 fireion: Thank you for all your help! I don't know how long I would have been looking had it not happened that you helped.
20:53 imirkin_: np
22:51 pmoreau:crosses fingers that everything will go fine this time, with sending the series
23:01 pmoreau: karolherbst: Series sent, and I created a separate branch containing the series + the SPIR-V clover backend: https://github.com/pierremoreau/mesa/commits/add_clover_spirv_backend_v1