00:00 anholt: let's be real, the unreachable() in the release build where you trigger the path is not going to infinite loop. it's going to use an undefined value just like before.
00:01 imirkin: mmm ... are you sure? i was under the impression that it was _very_ likely to cause an infinite loop
00:01 anholt: it's very unlikely to.
00:01 imirkin: hm ok
00:02 imirkin: in that case, consider my comment retracted
00:02 anholt: but if you're really concerned, pick a value you like for the default and swap back to assert
00:02 imirkin: i will do some playing around with unreachable
00:02 imirkin: iirc i definitely ran into infinite loops with it
00:02 imirkin: but it was all some long while ago
01:01 jekstrand: jenatali: It's for doing whole-array copies
01:01 jekstrand: jenatali: So nir_intrinsic_copy_deref when you want to copy an entire array.
01:01 jekstrand: jenatali: In particular, by copying the entire array instead of one element at a time, we can copy-propagate through whole-array copies.
01:02 jekstrand: So if you have "int x[5], y[5]; copy_deref(x[*], y[*]); use(x[i]);", we can propagate through and turn "use(x[i]);" into "use(y[i]);". Sometimes, we can delete the entire array "y".
01:03 jekstrand: Delete the entire array "x", rather.
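A toy model of the propagation jekstrand describes, as a rough sketch only. The tuple-based IR and the function names `propagate_whole_array_copies` and `remove_dead_copies` are made up for illustration and are not NIR's actual data structures or passes.

```python
# Toy IR, NOT real NIR. Each statement is a 3-tuple:
#   ("copy_all", dst, src)  whole-array copy, like copy_deref(dst[*], src[*])
#   ("store", arr, idx)     write to one element of arr
#   ("use", arr, idx)       read one element of arr

def propagate_whole_array_copies(stmts):
    copies = {}  # dst array -> src array, while the copy is still valid
    out = []
    for op, arr, arg in stmts:
        if op == "copy_all":
            dst, src = arr, arg
            # Overwriting dst invalidates every known copy that involves it.
            copies = {d: s for d, s in copies.items() if dst not in (d, s)}
            copies[dst] = copies.get(src, src)
            out.append((op, dst, src))
        elif op == "store":
            # A write to either side breaks the "dst equals src" fact.
            copies = {d: s for d, s in copies.items() if arr not in (d, s)}
            out.append((op, arr, arg))
        else:  # "use": read from the original source if the copy is still valid
            out.append((op, copies.get(arr, arr), arg))
    return out

def remove_dead_copies(stmts):
    # An array is read if it is used or is the source of some copy.
    read = {a for op, a, _ in stmts if op == "use"}
    read |= {src for op, _, src in stmts if op == "copy_all"}
    return [s for s in stmts if not (s[0] == "copy_all" and s[1] not in read)]

prog = [("copy_all", "x", "y"), ("use", "x", "i")]
print(remove_dead_copies(propagate_whole_array_copies(prog)))
```

After propagation nothing reads "x" any more, so the copy into it (and with it the array) can go away.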
01:35 jenatali: Ah, cool, thanks
09:19 MrCooper: anholt: the rules are such that jobs only run by default if the pipeline was created by Marge; she cannot start jobs in a pipeline created by another user
09:20 MrCooper: ergo, best leave rebasing to her, or if one has to do it oneself, remove the Part-of: tags from the commit logs
10:43 LiquidAcid: ickle, hey, any chance you remember this commit? https://github.com/torvalds/linux/commit/6259a56ba0e1c3a15954e22ea531e810944518cb
10:44 LiquidAcid: i'm running into the assertion in drm_mm_init(), or more like, lima is running into the assertion:
10:44 LiquidAcid: https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/lima/lima_vm.c#L224
10:44 LiquidAcid: both va_start and va_end are zero at this point, which is, as far as i understand, intentional
10:45 LiquidAcid: i'm wondering if the assertion condition should be relaxed to start + size < start then
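A small model of the two conditions, assuming the current check is `start + size <= start` (which is what the proposed relaxation implies) and that both values are unsigned 64-bit. The helper names are hypothetical. It only shows how the two variants differ for an empty range.

```python
U64_MASK = (1 << 64) - 1  # emulate u64 wraparound

def strict_check_fires(start, size):
    # Assumed current condition: start + size <= start
    # Fires on overflow AND on size == 0.
    return ((start + size) & U64_MASK) <= start

def relaxed_check_fires(start, size):
    # Proposed condition: start + size < start
    # Fires only when start + size wraps around.
    return ((start + size) & U64_MASK) < start

# va_start == va_end == 0 gives an empty range: start = 0, size = 0
print(strict_check_fires(0, 0), relaxed_check_fires(0, 0))
```

Only the strict form rejects the empty range. Both still catch a range that wraps past the top of the address space.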
17:29 DPA: MrCooper: It turns out the drmModeAddFB failure I was seeing, which seemed to have gone away after the mesa update, simply did so
17:29 DPA: because I had forgotten to reapply MR 3449. I've now found the real root cause of that problem, it's this here:
17:29 DPA: https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/winsys/etnaviv/drm/etnaviv_drm_winsys.c#L107-110
17:29 DPA: If I comment those lines out, the drmModeAddFB failure goes away. That's because kmsro is used with 2 different kms cards, thus
17:29 DPA: it calls this function with different ro->kms_fd, but uses the same etnaviv card for ro->gpu_fd, and since that's the only thing
17:29 DPA: used as key for the hash map here, it thinks it can reuse the pipe_screen already created for that gpu_fd, and then things go wrong.
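A stripped-down illustration of the lookup problem DPA describes. `Screen` and `get_screen` are stand-ins invented here, not the etnaviv winsys code, and the `key_includes_kms` switch is just a way to compare the two keying schemes, not a fix anyone proposed in the discussion.

```python
class Screen:
    """Stand-in for a pipe_screen that remembers which fds it was created for."""
    def __init__(self, gpu_fd, kms_fd):
        self.gpu_fd = gpu_fd
        self.kms_fd = kms_fd

def get_screen(cache, gpu_fd, kms_fd, key_includes_kms):
    # Keying on gpu_fd alone mirrors the behaviour described above.
    key = (gpu_fd, kms_fd) if key_includes_kms else gpu_fd
    if key not in cache:
        cache[key] = Screen(gpu_fd, kms_fd)
    return cache[key]

cache = {}
first = get_screen(cache, gpu_fd=5, kms_fd=10, key_includes_kms=False)
second = get_screen(cache, gpu_fd=5, kms_fd=11, key_includes_kms=False)
# The second KMS card gets the screen that was set up for the first one.
print(second is first, second.kms_fd)
```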
17:29 DPA: Anyway, I still have to figure out why Xorg can't draw to the output from the second kms card / why it stays black if prime is involved.
18:07 DPA: Also, if I configure Xorg to use different Xscreens for both dri cards, and start glxgears on both Xscreens on xfce, I get this strange behaviour:
18:07 DPA: https://dpa.li/temp/2xscreen.mp4
18:07 DPA: Is this normal, a bug, or maybe even related to my other problem?
19:52 karolherbst: ....
19:52 karolherbst: airlied, jekstrand: I am not 100% sure yet, but I think libclc just hit a nir_serialize bug
19:53 karolherbst: also.. libclc crashes with the cache disabled :)
19:56 airlied: karolherbst: the latter sounds simple to fix, the former a bit trickier
19:57 karolherbst: nir_variable.data.location is not set when fetched from the cache
19:57 karolherbst: uhhhh
19:57 karolherbst: that is easy to fix :D
19:58 karolherbst: I found the bug
19:58 airlied: I'll push a fix for the cache
19:59 karolherbst: airlied, jekstrand: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/802f1141bf3a719e8311faf7cc2c83bd753c86bc
20:01 karolherbst: maybe we should just list where it's allowed instead :p
20:01 karolherbst: but... why aren't system values lowered at that point?
20:01 airlied: karolherbst: pushed a disk cache fix
20:02 jekstrand: karolherbst: Yeah...
20:02 karolherbst: but anyway, the code is wrong :p
20:03 jekstrand: karolherbst: Yeah, we should definitely flip that list around
20:04 jekstrand: karolherbst: Or you could just not bother stripping
20:04 jekstrand: The stripping stuff makes a lot of assumptions
20:04 karolherbst: well.. doesn't change the fact that the code is wrong for system value
20:05 karolherbst: I think it just never happened before
20:06 karolherbst: but I am wondering if we shouldn't strip libclc? mhhh
20:06 karolherbst: stripping wouldn't help with cache hits here anyway
20:09 karolherbst: airlied: I think we need a better hash for libclc
20:11 karolherbst: not now, but in the future once we store kernels linked against libclc in the cache (or generally linked kernels)
20:11 karolherbst: and we want to know if we have to recompile
20:16 airlied: karolherbst: yeah might need to hash it and include that hash in the key
20:19 karolherbst: and then compile time options ...
20:19 karolherbst: uhhh
20:19 karolherbst: right...
20:19 karolherbst: airlied: I think our approach doesn't work long term...
20:20 karolherbst: I think we might end up with multiple libclcs if the application specifies different compile time flags
20:20 karolherbst: ...
20:20 karolherbst: not that we care today
20:20 karolherbst: but....
20:20 karolherbst: but maybe we can also ignore them for clc?
20:21 airlied: then you just add all the hashes I suppose
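One way to read "hash it and include that hash in the key", sketched with Python's hashlib. The function name `kernel_cache_key`, its parameters, and the choice of SHA-1 are assumptions for illustration; this is not Mesa's disk cache API.

```python
import hashlib

def kernel_cache_key(kernel_source: bytes, libclc_blob: bytes, build_options: str) -> str:
    """Hypothetical cache key covering the kernel, the libclc it links against,
    and the compile-time options."""
    h = hashlib.sha1()
    parts = (
        kernel_source,
        hashlib.sha1(libclc_blob).digest(),  # hash of the library, not the library itself
        build_options.encode(),
    )
    for part in parts:
        # Length prefix so different splits of the same bytes cannot collide.
        h.update(len(part).to_bytes(8, "little"))
        h.update(part)
    return h.hexdigest()

key = kernel_cache_key(b"kernel void k() {}", b"libclc-build-1", "-cl-fast-relaxed-math")
print(key)
```

With a key like this, a changed libclc or different build flags simply produce a different cache entry instead of a stale hit.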
20:29 vv222: Currently trying to build 32-bit Mesa 17 on an up-to-date 64-bit Debian Sid, I expect I’m going to have strange questions at some point ;)
20:34 karolherbst: vv222: why bother with an unsupported version though?
20:36 vv222: I’m trying to hunt what looks like a regression, to make a nice bug report.
20:36 vv222: cf. https://forge.dotslashplay.it/play.it/games/-/issues/139 & https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=920523
20:37 karolherbst: huh? crash in std::cout sounds kind of strange
20:39 vv222: This is an old proprietary engine, so symbol collisions might be playing a role here.
20:40 karolherbst: yeah...
20:40 karolherbst: or maybe preloading pthread might help
20:40 vv222: "preloading pthread"?
20:41 karolherbst: nvm.. pthread gets loaded quite early
20:42 karolherbst: vv222: mind checking with ldd what the game links against?
20:42 vv222: So `ldd $game_binary`?
20:42 karolherbst: yeah
20:44 vv222: karolherbst: https://forge.dotslashplay.it/play.it/games/-/issues/139#note_22604
20:45 karolherbst: okay.. yeah.. I guess that should be fine. We just had an annoying-to-track-down bug recently caused by mesa doing threading, but the application does not
20:45 karolherbst: ohh
20:46 karolherbst: but it seems like the game is written in C
20:46 karolherbst: strange
20:46 karolherbst: ahh no, missed the stdc++ one
20:48 jekstrand: karolherbst: You reminded me that no one put the alignments in the type [de]serialization stuff. Fixed now. :-)
20:49 karolherbst: :)
20:49 karolherbst: btw, with the updated patches I am still at "Pass 258 Fails 1 Crashes 0 Timeouts 0"
20:49 karolherbst: and the one fail is nouveau's fault
20:50 karolherbst: jekstrand: btw, the uniform patch conflicts with your constant stuff I think
20:50 karolherbst: and breaks a few tests
20:51 karolherbst: didn't investigate yet, but I assume that those patch series are more or less messing things up for each other
20:51 jenatali: Yeah, seems likely, unfortunately
20:51 jenatali: Using uniforms for kernel inputs makes images much cleaner
20:51 jenatali: I haven't tried merging it with the constant address space in CLOn12 though to see if it conflicts for me as well
20:52 karolherbst: jenatali: I am still for treating images as opaque things
20:52 jenatali: I don't know what you mean by that
20:52 karolherbst: and we could just add uniforms to the kernel for each image
20:52 karolherbst: jenatali: no storage inside the kernel input buffer
20:52 jenatali: Sure
20:53 jenatali: That's a matter of what you do with them before the lower_io passes
20:53 jenatali: Btw, you probably do want storage in the kernel input buffer, because you need to be able to query their CL formats, which you probably want to implement as a kernel input load
20:54 karolherbst: mhhh
20:54 karolherbst: but I thought hw can usually do that, no?
20:54 karolherbst: though there is an instruction for that
20:54 karolherbst: maybe not for graphics though...
20:55 jenatali: Potentially. But then you need a bunch more instructions to turn it back into CL's specific format construction
20:55 karolherbst: right..
20:56 karolherbst: at this point I don't really know what makes sense anyway :)
20:56 jenatali: But yeah I'm not in too much of a rush to push those changes. Maybe I'll need to figure out how to actually run Clover instead of just making you do all of the testing for me :D
20:56 karolherbst: and I assume we can figure the issue out
20:56 karolherbst: :D
20:56 karolherbst: I mean.. CL does work on top of llvmpipe
20:56 karolherbst: so you won't even have to mess with drivers
20:57 jenatali: Yeah, though it'd still be a fairly big learning curve for me
20:57 jenatali: But that's not today :) maybe next week
20:57 karolherbst: yeah.. I mean, at some point I want to get to images anyway
20:57 karolherbst: just wanted to deal with the required bits first
20:58 karolherbst: and clc will take some time I think...
20:58 karolherbst: jekstrand: what do you think about "cl" variants of nir_alu_ops?
20:58 karolherbst: either as new opcodes or a flag on nir_alu_instr
20:59 karolherbst: given that CL indeed has both
20:59 karolherbst: and then converting all cl opcodes to normal ones when lowering against clc
21:01 karolherbst: that would also help with the div/sqrt stuff
21:01 airlied: for radeonsi i think it might be okay to put image descriptors in the inout buffer
21:02 airlied: input
21:02 airlied: but for others an index or nothing might make more sense
21:05 vv222: My Mesa 17.3.9 build is failing with the following error: https://paste.debian.net/plain/1161777
21:05 vv222: Any pointer would be welcome ;)
21:05 karolherbst: use an older gcc
21:07 vv222: Thanks, I’m trying that.
21:08 vv222: Hmm, looks like I might not be able to easily downgrade gcc without downgrading glibc too…
21:09 karolherbst: yeah... there might be a compile time option though
21:09 karolherbst: but normally gcc gets stricter over time
21:09 ccr: you can try passing -fcommon
21:10 karolherbst: ohh right. that was the option, I forgot about that one
21:11 vv222: Is `-fcommon` a GCC option?
21:11 vv222: (I’m very new to all this)
21:11 jekstrand: karolherbst: Not sure
21:11 ccr: yes. not sure if you need it for linker as well, haven't used it myself.
21:13 ccr: I have a sneaking suspicion that your problem is some c++ ABI thing, but shrug.
21:23 karolherbst: jekstrand: I am also wondering about more high level optimization like on trigonometric functions for cl
21:24 karolherbst: although I doubt we have much yet, but I can see that doing something on top of the builtins could help
21:25 karolherbst: so being able to keep the CL opcodes around for optimizations _could_ be a benefit
21:25 karolherbst: and later we just lower all CL opcodes to libclc functions (or other lowerings, like for fdiv/fsqrt)
21:27 karolherbst: and I think a flag like "exact", just "cl", would be less expensive than adding a bunch of new opcodes
21:27 karolherbst: or maybe we just add those and depend on a second optimization loop?
21:27 karolherbst: mhhh
21:27 karolherbst: but I think having something would help with some stuff
21:28 karolherbst: or we call it "high precision"
21:29 karolherbst: mhhh
21:29 karolherbst: jekstrand: nir_alu_instr.exact -> precision and we have { none, exact, precise }?
21:29 karolherbst: that would turn a check against true into a >= exact check in the nir_algebraic_opt case
21:30 karolherbst: mhhhh
21:30 karolherbst: none: do whatever
21:30 karolherbst: exact: stay consistent
21:30 karolherbst: precise: full precision
21:32 karolherbst: although those are kind of different things :/
21:37 karolherbst: I will give it some thoughts and see what works best
21:37 karolherbst: but I am kind of sure we need to tag instructions with how precise they have to be...
21:38 karolherbst: especially also to support those unsafe math options in cl
21:38 karolherbst: or well.. to make use of them
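A rough sketch of the { none, exact, precise } idea as an ordered level, so that the boolean test becomes a comparison. `Precision` and `may_apply_inexact_rewrite` are hypothetical names, and the meaning attached to each level is taken from the three lines above, not from any existing NIR field.

```python
from enum import IntEnum

class Precision(IntEnum):
    NONE = 0     # do whatever
    EXACT = 1    # stay consistent
    PRECISE = 2  # full precision

def may_apply_inexact_rewrite(precision: Precision) -> bool:
    # Where a boolean "exact" is checked today, an ordered level turns the
    # test into a comparison: anything at or above EXACT blocks the rewrite.
    return precision < Precision.EXACT

for level in Precision:
    print(level.name, may_apply_inexact_rewrite(level))
```

As noted in the discussion, consistency and accuracy are not really points on one axis, so a single ordered level is only one possible encoding.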
21:48 jekstrand: karolherbst: Yeah, we need to do something but what still isn't quite clear
21:49 jekstrand: I think one thing we definitely need to do is separate ffma and fmad
21:49 jekstrand: Beyond that, though, I'm not sure.
21:49 jekstrand: I think most of the problems we've seen aren't caused by imprecise optimizations so much as sloppy lowering.
21:50 jekstrand: sqrt(x) -> rcp(rsq(x)) and fdiv(x, y) -> fmul(x, frcp(y)) for instance
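A quick illustration of why the fdiv lowering is lossy: the reciprocal is rounded once and the multiply rounds again, so the result can differ from a single correctly rounded division. This uses Python's 64-bit floats rather than the 32-bit floats under discussion, and the function names are made up, but the effect is the same in kind.

```python
def fdiv(x, y):
    return x / y            # one rounding

def fdiv_lowered(x, y):
    return x * (1.0 / y)    # two roundings: frcp, then fmul

x, y = 49.0, 49.0
print(fdiv(x, y), fdiv_lowered(x, y), fdiv(x, y) == fdiv_lowered(x, y))
```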
22:09 karolherbst: jekstrand: we have hw sqrt
22:09 karolherbst: and that one isn't enough either
22:09 karolherbst: denormal handling is busted on hw
22:09 karolherbst: I think
22:09 karolherbst: but rsqrt was also busted :)
22:10 karolherbst: let me check what was wrong with sqrt
22:12 karolherbst: "sqrt: -11863283.000000 ulp error at 0x1p-141 (0x00000100): *0x1.6a09e6p-71 vs. 0x0p+0"
22:14 jekstrand: karolherbst: I don't know what our sqrt is like or if it's busted. I should probably start looking into some of this stuff.
22:14 jekstrand: Right now, I'm trying to focus on getting pointers sorted, though.
22:14 * jekstrand can, contrary to popular opinion, only work on so many different things at once
22:14 karolherbst: I assume it's denormal handling
22:15 karolherbst: mhh.. let's see
22:15 karolherbst: jekstrand: yeah.. that's fine, I am just thinking out loud about what solution we'd like to have: one that makes optimizing clc stuff less painful, but also allows us to add precision where we need it and to track sqrt vs native_sqrt
22:16 karolherbst: but maybe backends just have to provide how precise they can implement opcodes.... mhhh
22:20 karolherbst: jekstrand: with iris: sqrt: -11863283.000000 ulp error at 0x1p-149 (0x00000001): *0x1.6a09e6p-75 vs. 0x0p+0 :)
22:20 karolherbst: I guess that's a common problem
22:23 jekstrand: karolherbst: That's using 1/rsq
22:23 karolherbst: ahh...
22:23 karolherbst: rsqrt: 151115727451828646838272.000000 ulp error at 0x1p-149 (0x00000001): *0x1.6a09e6p+74 vs. inf :p
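A small model that reproduces the shape of the two reports at 0x1p-149, under the assumption (karolherbst's guess above, not a confirmed diagnosis) that the hardware flushes 32-bit denormal inputs to zero. `ftz`, `rsq_ftz` and `sqrt_via_rsq` are hypothetical helpers, and Python's doubles stand in for the reference math.

```python
import math

FLT_MIN_NORMAL = 2.0 ** -126  # smallest normal float32 value

def ftz(x):
    """Assumed hardware behaviour: denormal float32 inputs become zero."""
    return 0.0 if abs(x) < FLT_MIN_NORMAL else x

def rsq_ftz(x):
    """Reciprocal square root on flushed input (finite, non-negative x only)."""
    x = ftz(x)
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)

def sqrt_via_rsq(x):
    """sqrt lowered to rcp(rsq(x)), i.e. the 1/rsq path."""
    r = rsq_ftz(x)
    return 0.0 if math.isinf(r) else 1.0 / r

x = 2.0 ** -149                 # smallest float32 denormal, bits 0x00000001
print(math.sqrt(x).hex())       # reference value of sqrt
print((1.0 / math.sqrt(x)).hex())  # reference value of rsqrt
print(sqrt_via_rsq(x))          # what the flushed 1/rsq path returns
print(rsq_ftz(x))               # what the flushed rsq returns
```

The references come out near 0x1.6a09e6p-75 and 0x1.6a09e6p+74, while the flushed paths return zero and infinity, matching the "vs. 0x0p+0" and "vs. inf" in the test output.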
22:24 jekstrand: karolherbst: What's the precision of rsq?
22:24 karolherbst: required by CL?
22:24 jekstrand: karolherbst: Oh, maybe our rsq isn't precise enough then. :)
22:24 karolherbst: well
22:24 karolherbst: I advertise denormal support
22:24 karolherbst: could be fine without it
22:25 karolherbst: yeah...
22:25 karolherbst: looks better without advertising support for denormals
22:25 jekstrand: karolherbst: Is advertising no denormals an option?
22:25 karolherbst: it is required for fp64 but optional for fp32 afaik.. let me check
22:26 karolherbst: yeah..
22:26 karolherbst: optional for fp32 but required for fp64
22:28 karolherbst: I mean.. maybe we should go for the minimum unless there comes up a good reason we have to start caring
22:30 karolherbst: but that still leaves us with the fdiv/sqrt problem :)
22:31 jekstrand: karolherbst: Yeah, we likely need some amount of fixup
22:31 jekstrand: But I don't know that that implies new opcodes.
22:32 karolherbst: right.. that's why I was thinking we just want to have some modifiers on the alu instructions like exact
22:32 karolherbst: just a bit different
22:33 karolherbst: mhhh... maybe we want to make .exact a bitfield instead and opt_algebraic just & WHATEVER_FLAG so it stays cheap?
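The bitfield variant could look roughly like this. `AluFlags`, its member names and `blocks_inexact_rewrite` are invented for the sketch and do not correspond to existing NIR definitions; the point is only that the algebraic pass still does a single mask test.

```python
from enum import IntFlag

class AluFlags(IntFlag):
    EXACT = 1 << 0         # hypothetical: today's "exact" meaning
    CL_PRECISION = 1 << 1  # hypothetical: needs CL-level accuracy

# Flags that should stop an inexact algebraic rewrite.
NO_INEXACT_MASK = AluFlags.EXACT | AluFlags.CL_PRECISION

def blocks_inexact_rewrite(flags: AluFlags) -> bool:
    # One AND against a constant mask keeps the check cheap.
    return bool(flags & NO_INEXACT_MASK)

print(blocks_inexact_rewrite(AluFlags(0)))
print(blocks_inexact_rewrite(AluFlags.CL_PRECISION))
```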
22:38 vv222: Still on my Mesa 17 build, with GCC 8 and 6 I get stuck on another error: https://paste.debian.net/plain/1161781
22:38 vv222: (I can not easily test with GCC 7)
22:39 karolherbst: ehh... I think that was fixed by adding an include or something
22:39 karolherbst: vv222: or just disable llvmpipe
22:40 vv222: That would be done through a ./configure option?
22:40 karolherbst: yeah
22:40 karolherbst: just disable swrats as a gallium driver
22:40 karolherbst: *swrast
22:41 vv222: OK, I think I see how to do that ;)
23:18 vv222: Hmm, I’m still getting this "implicit declaration" in gallivm/lp_bld_init.c after the addition of `--with-gallium-drivers=r300,r600 --with-dri-drivers=r200,radeon`
23:47 vv222: Looks like I found the missing include, I’m going to try that: https://bugs.freedesktop.org/show_bug.cgi?id=111077#c12
23:51 kisak: vv222: hopefully you're not trying to build legacy mesa against current-era llvm. That's an exercise in pain for no gain
23:52 vv222: I’m most probably doing that, without knowing this is a huge mistake ;)
23:53 vv222: But the system it is going to be tested on uses an up-to-date llvm, can I still build against an old one?
23:56 kisak: I don't have enough experience fiddling with Debian to know how that is handled. Over in gentoo's portage, I can just swap out the system llvm and nothing complains
23:57 vv222: You’re right, I see I can even have multiple versions of llvm installed together with no issue.