00:38lostgoat: lordheavy: hey, I just updated my arch system and ran into a problem with the vulkan-validation-layers package. It now seems to be compiled as Debug by default.
00:38lostgoat: Could this commit be reverted:
00:39lostgoat: Besides the perf problem, I've got some unit tests that listen for validation errors/warnings and that made everything explode.
00:39lostgoat: Also, thanks for maintaining the package :)
00:50HdkR: "add debugging informations", sounds like they wanted RelWithDebInfo, which is still a weird build config for a system package
01:10lostgoat: To be fair, I think the way Arch handles debug packages is annoying
01:11lostgoat: I wish simple -dbg and -src packages were available. Doing the makepkg dance is a little annoying.
05:31lordheavy: lostgoat: ok will check this
05:47lordheavy: lostgoat: Please check vulkan-validation-layers-1.2.146-2
06:10tomeu: finally got deqp to run on this AMD chromebook, but the performance is quite disappointing: https://gitlab.freedesktop.org/tomeu/mesa/-/jobs/3829813
06:10tomeu: it's significantly slower than the oldest rockchip chromebook
06:11tomeu: wonder if the GPU is locked at low frequencies, or if LLVM is just that slow when compiling shaders
06:11tomeu:goes measure CPU utilization
06:13airlied: tomeu: llvm just that slow
06:13HdkR: LLVM is slow, 10/10 slows to be had
06:14airlied: tomeu: 7m doesn't seem that brutal
06:14airlied: though that does likely mean you can only run gles2
06:14airlied: which might be a bit limiting
06:16tomeu: gles3 is taking ages: https://gitlab.freedesktop.org/tomeu/mesa/-/jobs/3829814
06:17tomeu: the conformance level is great, though
06:18airlied: tomeu: can you log in and get a perf record to confirm?
06:18tomeu: not really, but I can get some stats printed at the end
06:19airlied: I assume you ruled out things like NIR_VALIDATE and that llvm is a release build
06:23tomeu: not the former...
06:27pq: jekstrand, don't feel bad about not blogging, that time is better spent on writing the same into proper docs IMHO. But if you don't do even that, welcome to the club ;-)
06:32daniels: tomeu: try with ACO!
06:32airlied: daniels: aco doesn't work on radeonsi yet :)
06:32tomeu: yeah... :/
06:32daniels: damn :(
06:33HdkR: petition for ACO on radeonsi :P
06:34tomeu: not that it would help to test the compiler that people are using nowadays... :)
06:34tomeu: btw, how is NIR_VALIDATE involved on radeonsi if it uses llvm?
06:35dschuermann: tomeu: it uses GLSL->NIR->LLVM
06:40tomeu: ah, ok
06:40tomeu: NIR_VALIDATE just shaved off a couple of seconds
06:40tomeu: let's see what the utilization looks like
07:16tomeu: 2020-07-28T07:13:39 + cat /proc/loadavg
07:16tomeu: 2020-07-28T07:13:39 17.06 11.40 4.77 1/61 2857
07:16tomeu: guess llvm it is
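For readers decoding the `/proc/loadavg` line pasted above: on Linux the five fields are the 1-, 5- and 15-minute load averages, the number of currently runnable scheduling entities over the total, and the most recently assigned PID. A 1-minute value of 17.06 is consistent with the CPU-bound reading tomeu arrives at. The sketch below is only an illustration of how that line splits into fields; the struct and function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Parsed fields of one /proc/loadavg line (names are illustrative). */
struct loadavg {
    double avg1, avg5, avg15; /* 1-, 5- and 15-minute load averages */
    int runnable;             /* currently runnable scheduling entities */
    int total;                /* total scheduling entities */
    int last_pid;             /* most recently assigned PID */
};

/* Returns 0 on success, -1 if the line does not have the expected shape. */
static int parse_loadavg(const char *line, struct loadavg *out)
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line, "%lf %lf %lf %d/%d %d",
                   &out->avg1, &out->avg5, &out->avg15,
                   &out->runnable, &out->total, &out->last_pid);
    return n == 6 ? 0 : -1;
}
```

Feeding it the string `"17.06 11.40 4.77 1/61 2857"` from the log fills all six fields; anything that does not match the six-field shape is rejected.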
07:19MrCooper: someone should land the changes allowing dEQP to be run with "piglit run --process-isolation false"; that makes a big difference for the piglit profiles
07:21tomeu: MrCooper: you mean to run deqp within piglit?
07:22tomeu: could that ever be as fast as parallel-deqp-runner?
07:23MrCooper: don't know, does that also run multiple tests in the same process?
07:23MrCooper: then it might be similar
07:42tomeu: 2020-07-28T07:36:02 + cat /sys/class/drm/card0/device/gpu_busy_percent
07:42tomeu: 2020-07-28T07:36:02 0
07:44tomeu: we'll have to add loads of devices to get decent coverage :(
07:50MrCooper: gpu_busy_percent is just the current state, not an average over a longer period of time
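Since a single read of `gpu_busy_percent` is only a snapshot, as MrCooper points out, one way to get a more useful number is to sample it repeatedly and average. The following is a rough sketch under that assumption, not something taken from the CI scripts; every function and type name here is hypothetical, and the replay reader exists only so the averaging can be tried without the hardware.

```c
#include <assert.h>
#include <stdio.h>

/* Callback returning one instantaneous reading (0..100), or -1 on error. */
typedef int (*busy_reader)(void *ctx);

/* Reads the sysfs file once; ctx is the path as a C string. */
static int read_busy_sysfs(void *ctx)
{
    FILE *f = fopen((const char *)ctx, "r");
    int value = -1;
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Replays canned readings, useful for dry runs without a GPU. */
struct replay { const int *values; int count; int next; };

static int read_busy_replay(void *ctx)
{
    struct replay *r = ctx;
    return r->next < r->count ? r->values[r->next++] : -1;
}

/* Averages up to `samples` valid readings; `pause_between` (may be NULL)
 * is called between readings so the caller chooses the sampling interval. */
static double average_busy(busy_reader read_once, void *ctx, int samples,
                           void (*pause_between)(void))
{
    assert(read_once != NULL && samples > 0);
    long sum = 0;
    int valid = 0;
    for (int i = 0; i < samples; i++) {
        int v = read_once(ctx);
        if (v >= 0) {
            sum += v;
            valid++;
        }
        if (pause_between != NULL && i + 1 < samples)
            pause_between();
    }
    return valid > 0 ? (double)sum / valid : -1.0;
}
```

On the device one would pass `read_busy_sysfs` with the path `/sys/class/drm/card0/device/gpu_busy_percent` and a pause callback that sleeps for a short interval.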
07:56airlied: MrCooper: I think deqp parallel runner is currently pretty good at spreading the load between tests
07:57MrCooper: the main point is not to spawn a separate process for each test, as that would incur a lot of overhead loading LLVM
08:53danvet: tzimmermann, a bunch of patches in -fixes
08:53tzimmermann: danvet, yes
08:53danvet: tzimmermann, also -fixes is still stuck on -rc1, generally you should try to rebase every week
08:54tzimmermann: danvet, ok
08:54tzimmermann: i'll take care of this later today
08:54danvet: otherwise if someone has a late -rc regression with one of our patches, they might not be able to bisect because -rc1 doesn't work for them
08:54danvet: same applies also for -next, occasional backmerge is good
08:55danvet: pinchartl, there's a pile of xlnx fixups floating around
08:55danvet: since I don't have a tree I can apply these two, tag you're it :-P
09:12bnieuwenhuizen: tomeu: Stoney Chromebook?
09:34pinchartl: danvet: I need to review them, yes. on my todo list :-)
09:35danvet: pinchartl, well you're going to do another pull?
09:35danvet: or mlankhorst needs to backmerge drm-next into drm-misc-next-fixes (and maybe also into drm-misc-next for mripard)
09:35pinchartl: danvet: isn't it too late for v5.9 ?
09:36danvet: pinchartl, I mean xlnx is supposed to be in drm-misc
09:36danvet: except it isn't
09:36danvet: so I can't just go through and apply the patches
09:39pinchartl: danvet: but it had to be merged with a specific base branch. is that allowed in drm-misc, bypassing dim ?
09:47danvet: pinchartl, not bypassing dim
09:47danvet: but dim supports pull requests and topic branches ofc
09:48danvet: we probably have a topic branch about every release or so
09:48danvet: ignoring all the pulls that dim processes for drm.git
09:56pinchartl: danvet: thanks for the info. I thought dim enforced an apply for mailbox workflow
09:57danvet: for patches
09:57danvet: pinchartl, and the topic branches should be exceptions when needed only
10:05tomeu: bnieuwenhuizen: yep
10:06tomeu: MrCooper: hmm, actually, I think the parallel deqp runner is starting a new deqp process for each batch, instead of reusing existing processes for new batches
10:06tomeu: that could play a role here I guess
10:13MrCooper: depends how many tests run in each batch; if it's at least tens or hundreds, should be fine I think
10:15bnieuwenhuizen: should be a batch size of 128 by default (except for the last few)
10:16mlankhorst: danvet: hm didn't I fast-forward drm-misc-next-fixes?
10:18danvet: mlankhorst, oops indeed, apologies for the noise
10:44karolherbst: can we stop caring about 80 columns please? :D (or did we stop caring already?)
11:46karolherbst: jekstrand: rewriting the vtn unstructured CFG stuff now and it is sooo much simpler ;)
14:19SolarAquarion: there is a inline asm issue that i'm getting https://lonewolf.pedrohlc.com/chaotic-aur/makepkglogs/_daily/schoina/mesa-git.log
14:22TheRealJohnGalt: It seems those periodic hangs at 4k in detroit are related to zerovram
14:26bnieuwenhuizen: TheRealJohnGalt: what do you mean related?
14:28TheRealJohnGalt: If I revert !5710, I no longer get the hangs (last about a minute at a time).
14:30TheRealJohnGalt: Since it only happens at 4k and not lower resolutions for me, maybe related to the already larger vram usage at 4k?
14:51karolherbst: my unstructured CFG rework: 2 files changed, 116 insertions(+), 375 deletions(-)
15:12jekstrand: karolherbst: \o/
15:13jekstrand: pq: Yeah.....
15:13karolherbst: it's even more now :D doing some last cleanups
15:14karolherbst: but now it's small and easy to understand :)
15:15karolherbst: jekstrand: do we have a helper to create a nir_block and insert it into the tree?
15:16jekstrand: No, not really. Typically, the nir_blocks are already there for you.
15:16karolherbst: right.. oh well
15:16karolherbst: I needed it in like 4 places, but I already added my local helper :)
15:17karolherbst: also.. I might want to make use of the switch handling, which I removed so I don't have to deal with it for now
15:17karolherbst: at least I don't emit blocks twice, just the if chain is longer
15:17karolherbst: mhhh.. the structurizer might be messy then.. mhh
15:19karolherbst: jekstrand: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401/diffs?commit_id=bd489b1af6b10bdf3af4570e315f621ded0881c0
15:20karolherbst: ohh.. I think I can remove the start_block field as well
15:23jekstrand: karolherbst: :D
15:23karolherbst: yeah.. I decided to just not use any of those structs anymore :p
15:23karolherbst: works as well.. soo...
15:23jekstrand: karolherbst: Do you need block? Or can you just use end_nop->block?
15:24karolherbst: good question
15:24karolherbst: let me try
15:26jekstrand: One of the advantages to end_nop is that as blocks are split and stuff is inserted and moved around, block pointers aren't particularly stable. An instruction in that block, on the other hand, is something you can reliably hold onto.
15:26karolherbst: jekstrand: I think it doesn't work for the blocks I pushed into the set as they are handled later and I need the block reference to adjust the cursor
15:27karolherbst: and I really want to create the blocks before entering so I can emit the goto and goto_ifs without having to patch them later
15:27karolherbst: although I guess I could have a custom struct for the set and include the nir_block there...
15:28jekstrand: Yeah, with goto, you're getting into territory where there's not much to help you.
15:33dschuermann: jekstrand: something we can do about https://github.com/KhronosGroup/glslang/issues/2357 ?
15:33gitbot: KhronosGroup issue 2357 in glslang "Not setting the NonUniform decoration correctly for SSBO stores" [Open]
15:34jekstrand: dschuermann: Someone can fix the bug. The GLSLang code is horrible but not that hard to work on.
15:34jekstrand: I've had to hack on it a few times.
15:34jekstrand: To fix bugs like this
15:34dschuermann: I mean in the driver... what if there are shipped applications?
15:47karolherbst: jekstrand: yeah.. mhh I think I really do need the block reference... don't see how to reliably handle SpvOpBranchConditional without it
15:47karolherbst: or well.. without adding hash tables or whatever
15:49jekstrand: dschuermann: What we did before and I suppose we could start doing again, is that when we created the vtn_pointer from the OpAccessChain, we looked for any NonUniform decorations on array indices in the OpAccessChain.
15:50jekstrand: I *think* that should still work probably.
15:51jekstrand: But it's sketchy because it might go through a composite at which point the non-uniformness will get lost.
15:53jekstrand: karolherbst: That's probably fine. Just be warned that the CF helpers split blocks and shuffle things around in unpredictable ways. Sometimes, stuffing an instruction in the block is a better way to hold onto it during CF changes.
15:55jekstrand: dschuermann: The requirement of the spec is that the NonUniform decoration has to go on the SPIR-V id that is the source/destination pointer that the OpLoad or OpStore is accessing. Anywhere else and we're free to ignore it.
15:56jekstrand: But that doesn't mean GLSLang is actually correct. :-(
15:56dschuermann: no, glslang certainly has to be fixed
15:56jekstrand: I think if we find a shipping app that breaks this, it's fixable.
15:57jekstrand: Do we want to proactively fix it? I don't know.
15:57dschuermann: jekstrand: dEQP-VK.descriptor_indexing.uniform_texel_buffer seems to be broken on our side, though
15:57jekstrand: dschuermann: Hrm... That's possible, I suppose.
15:57dschuermann: yeah, I think fixing when we actually encounter an application which got it wrong should also work
15:58jekstrand: But I don't see how it'd be wrong.
15:59jekstrand: The code for handling NonUniform is now so stupid simple....
16:01dschuermann: they might have messed this up as well
16:03dschuermann: jekstrand: where does the NonUniform decoration have to go? https://pastebin.com/FHuiXwC0
16:03dschuermann: I think it's the imagefetch which would actually need it
16:03dschuermann: but I'm pretty awful at reading SPIRV
16:05jekstrand: dschuermann: %24
16:06jekstrand: dschuermann: It has to go on the thing that is the "source" parameter of the OpImageFetch
16:06dschuermann: ok, then we have 10 affected CTS tests. how does ANV get around this?
16:10karolherbst: jekstrand: I fixed the helpers in the other commits :)
16:11karolherbst: made them not do such things if you insert blocks after blocks
16:11karolherbst: that was the other idea of the rework, so I don't have to connect blocks manually anymore
16:13jenatali_: jekstrand, karolherbst: Getting to the point in reworking my invocation/workgroup id offset patches where I need to add nir caps. Trying to decide how many and what exactly they should be. The vertex id has two caps, one to lower it to zero_base + base, and the other to lower base to something else
16:14jekstrand: dschuermann: Another option would be to just run uniformity analysis and base the non-uniform lowering on that.
16:14jenatali_: I'm thinking the same - one to lower the non-zero-based ID to zero_base + base, and another to lower base to 0
16:14jekstrand: dschuermann: I wonder how often that'd actually be a false positive?
16:14karolherbst: jenatali_: I think we always need to insert the offsets though.. soo maybe one would be enough?
16:14jekstrand: dschuermann: It's possible the CTS we have in CI is just too old.
16:14jekstrand: dschuermann: It's also possible that we get around it by luck somehow.
16:14jenatali_: Oh, did we decide that no driver needed the non-zero-based intrinsic?
16:15karolherbst: jenatali_: I am sure there is no hw supporting it natively actually
16:15jenatali_: Alright, that works
16:15karolherbst: and by any chance there is hw, they can make it more complex :)
16:16karolherbst: anyway, I'd like to insert all adjustments for clover, and I think they are two? invoc_id_offset and work_group_id offset?
16:16karolherbst: or is there more?
16:16jenatali_: Nope, just those two
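The arithmetic behind the two offsets discussed here can be written out as plain C. This is only a sketch of the math, assuming the usual OpenCL definition of a global ID (global offset plus work-group ID times local size plus local ID) and the "zero_base + base" form mentioned above. The struct and function names are hypothetical and are not NIR intrinsics, and the log does not show how the actual patches compose these terms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-dispatch parameters for one dimension. */
struct dispatch_info {
    uint32_t local_size;         /* work-items per work-group */
    uint32_t base_work_group_id; /* added to the zero-based work-group ID */
    uint32_t global_offset;      /* OpenCL global work offset */
};

/* "Lower the non-zero-based ID to zero_base + base". */
static uint32_t work_group_id(const struct dispatch_info *d,
                              uint32_t zero_based_wg_id)
{
    return zero_based_wg_id + d->base_work_group_id;
}

/* Global ID built from the offset work-group ID plus the global offset. */
static uint32_t global_invocation_id(const struct dispatch_info *d,
                                     uint32_t zero_based_wg_id,
                                     uint32_t local_id)
{
    assert(local_id < d->local_size);
    return d->global_offset
         + work_group_id(d, zero_based_wg_id) * d->local_size
         + local_id;
}
```

When both `base_work_group_id` and `global_offset` are zero, which corresponds to lowering the base to 0, the formulas collapse to the zero-based values a driver without offsets already computes.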
16:16karolherbst: and num_work_groups calculates based on group_id offsets?
16:16karolherbst: orr... wait
16:17karolherbst: nope, should be fine this way
16:17karolherbst: I am never sure what those cl functions really mean
16:17jekstrand: jenatali_: If there's a reason for a second cap, it'd be to not perturb GL drivers.
16:17karolherbst: although.. CL names are sane, nirs are not :P CL just has the same names except local vs global :D
16:18jenatali_: Yeah, I took your advice and added a patch to rename the existing intrinsic before doing any rework, so I'm already touching the GL drivers
16:18dschuermann: jekstrand divergence analysis has way too many false positives
16:18karolherbst: jekstrand: anyway.. I'll try to write a kernel requiring lowering of big grids on pascal.. shouldn't be _too_ hard as the 2nd and 3rd dim are quite limited
16:19jekstrand: dschuermann: :-(
16:20karolherbst: but I think I will probably go with the current infrastructure and lower the offset load to load the kernel input...
16:20karolherbst: at least that's how clover does it right now
16:20jekstrand: karolherbst: Sounds good. I think getting at least a prototype for that will help provide clarity.
16:20jenatali_: Great, thanks :)
16:20karolherbst: jekstrand: I already had one, but I wired the values through the gallium API :)
16:21karolherbst: and let the driver deal with it
16:21karolherbst: but for LLVM we already have the arg approach...
16:21karolherbst: anyway.. doesn't matter for vtn or nir
16:21karolherbst: just that I will have to add a nir lowering pass for clover
16:21jenatali_: Yeah should be simple though
16:22jenatali_: I'll try to have these patches ready for you to try out today
16:22karolherbst: ehhh.. I am already done for today I think :D
16:22jenatali_: My today, so that it's ready for your tomorrow :P
16:22karolherbst: reworking the unstructured code already used up most of my concentration for today :p
16:23karolherbst:I really hope I don't break anything
16:23karolherbst: will probably still add the switch case deduplication thing so the lower_goto_if doesn't have to do weird shit
16:33lostgoat: lordheavy: thanks for the update. I'm on manjaro and it is sadly two weeks behind the arch packages, so I can't test the update directly.
16:33lostgoat: But I built the PKGBUILD locally and I see the following:
16:33lostgoat: 1. No more validation warning message causing failures \o/
16:33lostgoat: 2. Validation is still a lot slower than usual, and it seems related to stripping. With strip it runs in 1s with !strip it runs in 15s (both were Release builds)
16:33lostgoat: 3. Size of the package is still fairly large, 371MB. Although, for a package intended for
16:34lostgoat: development purposes having things be a little bulkier shouldn't be too bad.
16:34lostgoat: The speed difference from #2 is pretty large though
18:16karolherbst: jekstrand: soo.. done with the rework. Added case deduplication, so it should be fine now
18:16karolherbst: switch handling is really more than half of the code... :D
18:17jekstrand: karolherbst: Yeah, switch statements suck
18:18karolherbst: the annoying thing is.. the clc code has no switch
18:18karolherbst: it's llvm optimizing the if ladder
18:19jekstrand: Good job LLVM
18:20karolherbst: I actually believe nvidia's use of LLVM is the only viable use for GPUs: use the frontend and emit a stable IR and consume it in your actual compiler :p
18:21jekstrand: Yeah, jenatali_ and I were looking at one earlier this week where LLVM was "optimizing" a uint to a uint64_t just so it could initialize it with a single 64-bit store and it created casts everywhere.
18:21karolherbst: the hell
18:22karolherbst: anyway.. if anybody wants to review the vtn bits, feel free to do so :p
18:23imirkin: karolherbst: of course in nvidia's case, that output is PTX
18:23imirkin: which doesn't have switch statements :)
18:24karolherbst: :) I see this as a win
18:24karolherbst: imirkin: actually... maybe not, then somebody could think that if the IR doesn't support switches, a lookup table might be fine for small ranges :p
18:24karolherbst: could backfire
18:25karolherbst: although.. PTX doesn't allow indirect branching I think?
18:26imirkin: hw can do it based on uniform values
18:26imirkin: but whether that's a thing in ptx, who knows
18:27karolherbst: but then llvm would have to care about uniform values :p
18:28karolherbst: ohh there is indirect call
18:28karolherbst: but only direct branches.... mhhhh
18:28jenatali_: jekstrand: The worst I've seen was LLVM optimizing a 2-iteration loop writing chars, into an unaligned store of a uint16
18:28jekstrand: jenatali_: Oh, that's nice
18:29jekstrand: Incidentally, our HW would be fine with that. :-P
18:29jenatali_: Yeah, especially when vtn didn't have any alignment info for us to deal with it :(
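To make the store-merging case concrete, here is a sketch in plain C of what jenatali_ describes: a 2-iteration loop writing chars, and the single 16-bit store an optimizer may replace it with. It is an illustration only and not the kernel from the report. The merged version uses `memcpy` so the C itself stays well-defined for an unaligned destination; in the IR the consumer sees a 16-bit store whose address is only byte-aligned, which is the part that hurts when no alignment information is carried along.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* What the source code says: two independent byte stores. */
static void store_bytes_loop(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    for (int i = 0; i < 2; i++)
        dst[i] = src[i];
}

/* What the optimizer may produce: one 16-bit load and one 16-bit store.
 * memcpy expresses the possibly unaligned access without undefined behavior. */
static void store_bytes_merged(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    uint16_t pair;
    memcpy(&pair, src, sizeof pair);
    memcpy(dst, &pair, sizeof pair);
}
```

Both functions leave the same bytes in memory even when `dst` points at an odd address; the difference is only in the access width the backend has to handle.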
18:42karolherbst: jenatali_: are there plans to implement the 2.1 spir-v extension?
18:42karolherbst: if not, you could maintain your own llvm optimization list
18:42karolherbst: (or run no optimization)
18:42jenatali_: karolherbst: Yeah, I'd like to implement both SPIR and SPIR-V
18:42karolherbst: but you could have that anyway and make the clc path suck less :p
18:43karolherbst: jenatali_: can't llvm just accept SPIR?
18:43imirkin: SPIR is the llvm 3.7 or whatever IR
18:43karolherbst: SPIR is something weird I decided to not care about :p
18:43jenatali_: Yeah, there's a 1.2 extension for supporting SPIR inputs
18:43karolherbst: jenatali_: really? annoying
18:44karolherbst: I'd ignore that and just focus on the spir-v ones :p
18:44jenatali_: I've specifically had a request to support it, and our compilation chain already goes through SPIR anyway, so I figure I might as well hook it up
18:44karolherbst: please don't support SPIR :D
18:44jenatali_: Why not?
18:44karolherbst: because it's LLVM
18:44imirkin: i don't think upstream llvm can consume/produce SPIR
18:44jenatali_: Upstream LLVM absolutely can consume/produce SPIR :)
18:45karolherbst: well.. if llvm can just consume it, then I guess you could write it up :p
18:45karolherbst: I just always thought you need like this special llvm tree or something
18:45imirkin: jenatali_: huh ok. used to be that you had to use some funny branch.
18:45jenatali_: Yeah, our compilation chain right now uses clang 10.0 to compile CLC -> SPIR, and then the SPIRV-LLVM-Translator to convert to SPIR-V
18:45karolherbst: jenatali_: heh? but that's LLVM, not SPIR?
18:45jenatali_: SPIR is LLVM?
18:46imirkin: SPIR is a spec
18:46imirkin: which specs whatever LLVM had whenever they wrote it
18:46karolherbst: well.. more or less
18:46imirkin: which was ~5y ago
18:46jenatali_: SPIR is just some specific metadata and name mangling on top of normal LLVM
18:46karolherbst: yeah.. it seems like SPIR 1.2 is LLVM IR 3.2 and SPIR 2.0 is LLVM IR 3.4
18:46karolherbst: it's annoying
18:46karolherbst: just ignore SPIR
18:47airlied: jenatali_: AFAIK llvm SPIR output is just LLVM IR from that version of LLVM
18:47jenatali_: LLVM bitcode parsers are backwards-compatible, so we can load old LLVM bitcode on new LLVM bits
18:47karolherbst: or well.. let LLVM deal with it :D
18:47karolherbst: ohh.. I see
18:47airlied: jenatali_: really?
18:47karolherbst: not that terrible then
18:47karolherbst: but still
18:47jenatali_: airlied: Yep
18:47airlied: I didn't think the bitcode parser was
18:47jenatali_: At least... according to our DXIL team
18:47karolherbst: jenatali_: the issue is, if you start supporting SPIR, you support it for 25 years :p
18:47airlied: jenatali_: they might learn it isn't then
18:47jenatali_: The bitcode *writer* isn't though, which is why the DXCompiler tree is still on LLVM 3.7...
18:48karolherbst: oh crap
18:48karolherbst: SPIR was a good toy to play around having an own IR
18:48airlied: jenatali_: but yeah I hope someone got a sworn statement from llvm IR ppl that the bitcode reader is stable forever :-P
18:48karolherbst: but people saw that defining your own IR is just the better path forward :p
18:48jenatali_: Heh, yeah fair
18:48imirkin: from now and into perpetuity
18:48imirkin: seems dangerous to rely on that, but that's just me
18:49imirkin: seems more likely for consuming than producing
18:49karolherbst: jenatali_: the good thing is, if somebody has code producing SPIR, they are on old LLVM
18:49karolherbst: but if MS says: nope, just spir-v they are forced to update
18:49karolherbst: win win
18:49jenatali_: Yeah I have no plans to ever *write* SPIR anywhere, but I figure we can consume it pretty much for free
18:49jenatali_: If that turns out to not be the case then yeah it's dead
18:50imirkin: jenatali_: someone's gotta write it though ;)
18:50karolherbst: jenatali_: "for free", never underestimate 25 years of support :p
18:50imirkin: and if they are foolishly using upstream llvm for it ...
18:50jenatali_: karolherbst: You realize who you're talking to, right? We're *very* familiar with 25 years of support
18:50karolherbst: jenatali_: I know, that's why I was confused about your "for free" remark :p
18:50jenatali_: :P Yeah okay fair
18:51jenatali_: Well, we'll see. Future plans aren't set in stone yet, we're just trying to get 1.2 conformance with no extensions before we plan out what's next
18:51karolherbst: and before doing SPIR you'd add spir-v support anyway :p
18:52karolherbst: because it is CL core :p
18:52airlied: jenatali_: reading the docs, the bitcode file format is probably okay, but the IR inside it doesn't seem to be
18:52karolherbst: and has a 1.2 extension
18:52jenatali_: karolherbst: Yeah absolutely, SPIR-V I think we need to support so we'll do that first
18:52jenatali_: airlied: As I understood it (again, secondhand), the bitcode parser specifically has handling for older IR formats
18:53jenatali_: What I'd heard was that the devs who change the IR bitcode format are supposed to update the parser to handle the new and old versions
18:54jekstrand: And now that it's used for DXIL, it's actually tested. :-P
18:54jenatali_: At least coming from 3.7 :P
18:54karolherbst: *sigh* :D
18:54karolherbst: jenatali_: you might want to move from SPIR to SPIR-V :p
18:56karolherbst: I thought SPIR is the reason your DXCompiler stuff is 3.7 based
18:56jenatali_: Sorry, DXCompiler = https://github.com/microsoft/DirectXShaderCompiler
18:56jenatali_: What we're building in Mesa is a custom bitcode writer, so that we can use LLVM10+ for compiling CLC
18:57karolherbst: you know what would be fun? ditching dxil and using spir-v instead :D
18:57jenatali_: Heh, for some definition of fun, yeah
18:57karolherbst: all the driver vendors already support spir-v.. soo
18:57airlied: jenatali_: oh indeed there is some magic, messy, though it wouldn't work with amdgpu backends, which explains why we never used it
18:58jekstrand: karolherbst: So you're saying we need a D3D flavor of SPIR-V? https://xkcd.com/927/
18:58karolherbst: jekstrand: nope, just spirv as it's today :p
18:58jenatali_: Shader model 6'
18:58karolherbst: I am sure they'll come up with extensions though
18:58karolherbst: but hey
18:59karolherbst: could be dx13 and then dxvk is practically for free
18:59jekstrand: Shader model 6i because it only exists in karolherbst's imagination. :-P
19:00karolherbst: then we just need to convince openmp folks to also mandate emitting spir-v and then spir-v is everywhere1!!1!!
19:00jekstrand: And, if the phoronix readers are to be believed, we'll have GPUs soon that consume SPIR-V natively!
19:01jenatali_: Hah that'll be the day
19:01karolherbst: wouldn't be any different than how x86 is today :p
19:01karolherbst: well.. more or less
19:01imirkin: jekstrand: and soon after that, LISP
19:02karolherbst: but seriously, getting spir-v to become the de facto IR for everything would be funny
19:02jenatali_: jekstrand, karolherbst: Would love a quick skim of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891 just to validate that the approach seems sane after what we talked about yesterday
19:03karolherbst: jenatali_: yeah.. from a quick look that indeed looks better
19:04karolherbst: will try to get something done with clover over the next days then
19:05jenatali_: Awesome, thanks
19:07jenatali_: Would love to get that first commit merged sooner rather than later just to avoid new uses of the old intrinsic name
19:09jekstrand: jenatali_: If you want an excuse to merge most of it ahead of the OpenCL stuff, I can throw together a patch to use it in ANV.
19:09jekstrand: jenatali_: We don't need the global base but we do need base workgroup
19:10jenatali_: jekstrand: I absolutely wouldn't complain if you did ;)
19:12jekstrand: jenatali_: Looks like I already wrote most of the patch, actually.
19:12jekstrand: I just have to figure out what MR I left it in. :)
19:13jekstrand: jenatali_: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4856/diffs?commit_id=d393806f2eb69d440a729fd56d6ab1ef46aab039
19:13jekstrand: jenatali_: I didn't quite do all the plumbing you did so it's not as good.
19:13jekstrand: But it could be adapted in about 5 minutes.
19:14jenatali_: Yeah absolutely - I don't have a test environment set up for it, but I'm happy to pick it and get it to compile?
19:14karolherbst: jenatali_: didn't you get access to the intel CI stuff?
19:14jekstrand: Meh. I can do that quick.
19:14jenatali_: karolherbst: Not me, that was bbrezillon I think
19:29karolherbst: jenatali_: btw, mind starting to test with the new structurizer? no idea if you saw that, but I kind of really want to see this one merged asap https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401
19:29karolherbst: there is still some stuff to do though :/
19:29jenatali_: karolherbst: Yeah, I'll see if I can give it a try
19:31jenatali_: Not sure exactly what the best way is to replace the old version, given how far back in our tree it is though - I know some git-fu but maybe not enough
19:31jekstrand: jenatali_: This is going to be a bit more work than I thought.
19:31karolherbst: jenatali_: shouldn't be too painful, the interfaces are the same
19:31jekstrand: jenatali_: Not a problem; just need to re-wire a few things.
19:31jenatali_: Ah ok
19:32jekstrand: jenatali_: It's mostly that our compiler setup isn't built to set different NIR options per-driver.
19:32jekstrand: Totally fixable.
19:32jenatali_: Ah, got it
19:32karolherbst: jekstrand: per driver? you mean like iris vs i965?
19:33jekstrand: karolherbst: yeah
19:33jekstrand: karolherbst: Only, in this case, anv vs. iris
19:33karolherbst: ahh, okay
19:33jekstrand: Just need to re-plumb some stuff. Not a big deal.
19:33karolherbst: jekstrand: I saw the way you generate the option stuff and was wondering maybe you want to do the same thing we do in nouveau
19:33jekstrand: The way we have it right now is kind-of rubbish anyway
19:34karolherbst: yeah, we have something nice :p
19:34karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp#L3215 until the end of the file
19:34karolherbst: still.. requires some dispatching, but...
19:34karolherbst: could be solved with macro magic on top
19:34jekstrand: karolherbst: It's more that brw_create_compiler() only takes a devinfo
19:34jekstrand: karolherbst: I need to make it able to take more options
19:35karolherbst: I see
19:35jekstrand: We're currently creating it and then stomping stuff after the fact which is terrible.
19:35jekstrand: And NIR options aren't stompable
19:35jekstrand: So I just need to make it not rubbish
19:35karolherbst: well... I like how we do it :p creating the tables at compile time and just dispatch at runtime
19:38jekstrand: karolherbst: That's kind-of an unrelated problem given that you only have one driver, not three. :-P
19:39karolherbst: just add an argument and call it "driver" :p
19:39karolherbst: but yeah.. some macro magic would be needed to kind of automate the declarations
19:39karolherbst: and dispatching
19:40karolherbst: we don't even care about the shader type at this point
19:40karolherbst: but you also have this scalar vs simd stuff and so on...
19:40karolherbst: super annoying
19:41karolherbst: but I like the idea to "code" the values of the struct and let the compiler constant fold
19:41karolherbst: instead of declaring multiple structs and mix them together in the preprocessor
19:41jekstrand: karolherbst: Yeah, well... meh?
19:42jekstrand: I'm not going to claim that what we have is great
19:42karolherbst: what we need is a "constant" attribute to functions to declare same input results in same output and let the compiler do the magic.. ohh wait :p
19:45karolherbst: but I think that's only legit for c++ at the time? dunno
19:46jekstrand: There's a C99 or C11 attribute for that
19:46jekstrand: At least there's one for GCC
19:46jekstrand: I think GCC has both "const" and "pure" which are slightly different.
19:48karolherbst: jekstrand: pure means, same input has _always_ the same result
19:48karolherbst: ohh wait.. same for const.. uff
19:49jekstrand: I think it has to do with whether or not they ever touch memory outside themselves.
19:49jekstrand: It's a subtle distinction
19:49dcbaker[m]: doesn't "pure" mean that there's no side effects?
19:49karolherbst: "pure allows the function to read any non-volatile memory, even if it changes in between successive invocations of the function" right
19:49DrNick: one is the function depends only on arguments, the other is arguments and global memory
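The distinction being worked out here matches DrNick's summary. In GCC (and Clang), `__attribute__((const))` marks a function whose result depends only on its argument values, while `__attribute__((pure))` marks a function that has no side effects but may additionally read global memory. Both are compiler extensions rather than part of C99 or C11. A minimal sketch, with made-up function names:

```c
#include <assert.h>

/* Global state that a "pure" function may read but a "const" one may not. */
static int scale = 2;

/* const: the result depends only on the argument value. */
__attribute__((const)) static int square(int x)
{
    return x * x;
}

/* pure: no side effects, but the result may depend on global memory. */
__attribute__((pure)) static int scaled(int x)
{
    return x * scale;
}

/* Small demonstration: a change to `scale` must be visible through the
 * pure function, whereas `square` never looks at anything but its input. */
static void demo(void)
{
    assert(square(3) == 9);
    assert(scaled(3) == 6);
    scale = 5;
    assert(scaled(3) == 15);
}
```

The compiler may merge repeated calls to `square(3)` anywhere, but it may only merge repeated calls to `scaled(3)` when it can show that no intervening store changed memory the function could read.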
20:06jekstrand: jenatali_: https://gitlab.freedesktop.org/jekstrand/mesa/-/commits/wip/anv-base-work-group/
20:06jekstrand: jenatali_: Kicking it to CI now
20:06jenatali_: jekstrand: Thanks!
20:06INSANU: Hey guys, can someone point me to a doc explaining how a developer can help with mesa? =)
20:37jenatali_: karolherbst: FYI I just got a report of a miniature kernel that breaks CLOn12, I'm pretty sure due to the old version of the structurizer we're using
20:37jenatali_: Gives me even more motivation to grab your new version and try it out ;)
20:49karolherbst: jenatali_: yeah... at this point I am more interested in regressions, but also bugs :) but mainly regressions at this point :p
20:54pmoreau: karolherbst and jenatali_ When you were talking about the SPIR-V extensions, was it the cl_khr_il_program one? If so, it is already implemented and just waiting for review (as well as the core support in 2.1): https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2078
20:55jenatali_: pmoreau: We (Microsoft) are working on a separate CL runtime which would need to independently implement that extension, though we're leveraging (and contributing to) Mesa's SPIR-V -> nir converter to help us out
20:55pmoreau: Ah I see
20:56pmoreau: Hadn’t realised the runtime was independent from Mesa; my bad.
20:56jenatali_: No worries :) I recognize it's kind of weird
21:01jekstrand: jenatali_: Out of curiosity, how do you plan to invoke mesa? git submodule the mesa tree in? Build a little dynamic library that's just the compiler?
21:02jenatali_: jekstrand: The latter, though I'd hardly call it little since it's got LLVM in it
21:02jekstrand: jenatali_: Well, yes. :-)
21:03jekstrand: When we first shipped our Vulkan driver, I was very proud of the fact that an optimized binary could fit on two floppy disks.
21:03jekstrand: I tried to get it down to one but never quite got there.
21:03jenatali_: I did some experiments though and if you strip out LLVM... it's ~1/20th the size
21:04jekstrand: I briefly considered putting it on a couple of floppies and mailing it to someone at Red Hat as a gag. "Here's a Vulkan driver for you to ship in your distro".
21:05jekstrand: But then they would have asked for source code and I would have had to ship them a crate of floppies. :-)
21:05danvet: still too big even as tar.xz?
21:06jekstrand: I think so
21:06kisak: pull out one of those 100MB zip disks
21:06karolherbst: kisak: that's cheating :p
21:06jekstrand: I genuinely did try to shrink it down so I probably did experiment with gz, bz2, xz, etc.
21:06danvet: jekstrand, I mean the sourcecode
21:06karolherbst: the goal is to fit on one 640kb floppy :p
21:06danvet: that should compress decently
21:07jekstrand: danvet: Yeah, tar.gz of the source should fit on a finite number of floppies
21:07jekstrand: Small enough to mail, even.
21:09imirkin: can build different sets of floppies for different things too, slackware-style
21:09danvet: still a package
21:09danvet: the tar.bz2 is like 14MB
21:09imirkin: but you could split drivers apart, etc
21:10imirkin: have the "a" floppies for core, etc.
21:13jekstrand: "Here's the floppy for the base driver and here's floppies with extensions"
21:14imirkin: only get what you need, coz a 14.4 modem isn't so fast
21:15jekstrand: The difficulty will be in finding a machine with both a Vulkan-capable GPU and a floppy drive. :D
21:16imirkin: even my current desktop, from ~2010, i ended up opting against getting a floppy drive
21:16imirkin: i have one sitting around in a box somewhere i think
21:16imirkin: oh whoa. it's been 10 years ... might be time for an upgrade. hm.
21:16jekstrand: I've got a 5 1/4 drive in my desktop. It's not hooked up, though, because I haven't owned a motherboard with floppy controller pins in years.
21:16jekstrand: And, yes, last I knew it worked.
21:17karolherbst: same case for 15 years or what happened? :p
21:17imirkin: the trouble is floppies ... they don't age so well
21:17imirkin: apparently physics gets in the way =/
21:17karolherbst: jekstrand: still got one with a turbo button? :D
21:18imirkin: that was a great button.
21:18imirkin: let you go through certain areas of games slower
21:19imirkin: since the timing loop would be calibrated to the higher freq :)
21:19jekstrand: karolherbst: No, I've not had a turbo button in a very long time.
21:19jekstrand: karolherbst: I don't think I've ever owned a computer with one.
21:19jekstrand: We had one with a turbo button growing up but it was my dad's so I don't really count it.
21:19imirkin: it was a thing for 386 and 486's, iirc
21:21imirkin: just turned off the clock multiplier
21:21imirkin: so a 486 DX2 66mhz would start going at 33mhz
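A rough sketch of why a delay loop calibrated at the higher frequency runs slow once the multiplier is off. The function names and the figure of 4 CPU cycles per loop iteration are assumptions made up for the arithmetic, not measurements of any real 486.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cost of one busy-loop iteration, for illustration only. */
#define CYCLES_PER_ITERATION 4u

/* Number of busy-loop iterations that take 1 ms at the given clock. */
uint64_t calibrate_loops_per_ms(uint64_t hz)
{
    return hz / (CYCLES_PER_ITERATION * 1000u);
}

/* Wall-clock milliseconds that `loops` iterations take at clock `hz`. */
uint64_t elapsed_ms(uint64_t loops, uint64_t hz)
{
    assert(hz != 0);
    return loops * CYCLES_PER_ITERATION * 1000u / hz;
}
```

With these assumptions, a 10 ms delay calibrated at 66 MHz is 165000 iterations; the same count at 33 MHz takes 20 ms, so anything paced by that loop runs at half speed.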
21:22karolherbst: imirkin: I am wondering if we should open a webshop to finance nouveau development and sell turbo buttons so people can manually reclock :p
21:26sravn: danvet: Unisoc DRM patchset. Patch 4 and patch 6 would benefit from your critical eyes. Patch 4 has the atomic stuff. Patch 6 has a lot of stuff and I think it should be modelled as a bridge or more. But dunno for sure.
21:27sravn: Also I think the bindings should be better prepared for bridge use if we ask them to go that route. But again not too sure here
21:32danvet: sravn, uh
21:32danvet: the entire thing looks like a gigantic midlayer
21:32danvet: looking at patch 4/6
21:32danvet: struct dpu_layer shouldn't exist
21:32danvet: struct dpu_core_ops neither
21:33danvet: just call the functions directly
21:33danvet: or maybe I'm just confused
21:35danvet: but also generally I don't care much, drivers are going to driver in the low levels
21:35danvet: wrt 6/6, I guess when someone starts using it in a different driver, they get to make it a bridge
21:36danvet: not everything has to be a bridge, imo that only makes sense if it's some separate thing that's potentially shared among drivers
21:37sravn: danvet: Looks like it is prepared for multiple DPU which is why dpu_core_ops are there. Or so I think.
21:37sravn: I hope the intro can shed some light on it.
21:38danvet: well we do have piles of callbacks already
21:38danvet: if they don't fit, don't use the helpers maybe
21:38danvet: occasionally some private callbacks make sense on top, but not like just everything
21:39sravn: dpu_layer - is this some drm_plane thingy maybe?
21:40sravn: Would be good if you could give a bit of feedback on the ml - I will wait to see if we get some high level structure before I submit more.
21:41sravn: The small details do not really matter if there are fundamental parts that need fixing first, so my detailed comments can make them believe the rest is good :-(
21:43linkmauve: “23:15:09 jekstrand> The difficulty will be in finding a machine with both a Vulkan-capable GPU and a floppy drive. :D”, I recently built a desktop computer, Kaby Lake in a Pentium 4 case (because why buy a new case when this one works perfectly?), but I haven’t bought the adapter to plug the floppy reader at the front yet.
21:44jekstrand: My case is from the P4 era
21:45linkmauve: I should figure out how to plug in the power button someday, currently using a screwdriver. :-°
21:48jekstrand: My power button works fine. :P
21:49airlied: jekstrand: what's the trajectory for landing the linked list change?
21:50airlied:doesn't really want to nag ppl on vallium until after it lands
21:50jekstrand: airlied: It needs review
21:50jenatali_: Ugh, not looking forward to reconciling that one :(
21:50jekstrand: airlied: I think I've gotten most people who don't have CI set up to run and ACK it
21:51jekstrand: airlied: But no one's done any detailed review AFAIK
21:51jekstrand: jenatali_: Yeah, neither am I for my internal branches
21:51jekstrand: airlied: And I would very much like to land it before another big thing lands. :-/
21:51danvet: sravn, typed up something
21:52danvet: sravn, btw one nit that I think you missed is the use of devm_kzalloc
21:52danvet: I think in new drivers we should stop that
21:55airlied: jekstrand: gert did r-b a lot of the generic patches
21:56jekstrand: airlied: Did he? I thought he gave an rb on just the r600 parts
21:57jekstrand: It's kind-of hard to tell with a series like this. :-/
21:57bnieuwenhuizen: jekstrand: this looks like more though? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966#note_582208
21:58jekstrand: Oh, I hadn't seen that yet!
21:58jekstrand: Oh, and there's three from Iago too
21:59jekstrand: airlied: I'm kind-of inclined to try and land it before the branch if we can just so that backports aren't more annoying than needed.
22:00airlied: jekstrand: yeah if there's any review gaps let me know I'm reading over it now
22:01jekstrand: airlied: Cool.
22:01jekstrand: airlied: I've gotten lots of tested-by which is good
22:01jekstrand: And anholt reviewed the last patch fairly thoroughly
22:01jekstrand: airlied: I'm happy to land as soon as we've got enough review.
22:01jekstrand: airlied: I have no reason to hold it up and, in fact, I've got a very "fun" rebase I can't wait to do once it lands. :-)
22:02airlied: jekstrand: it's definitely one of those just rip it off situations, else rebasing forever nightmare
22:02jekstrand: airlied: Yup.
22:02jekstrand: airlied: I've already fixed two rebase issues.
22:03jekstrand: airlied: The one review I'd really like is for tarceri to look at the 3-4 GL linker patches.
22:03jekstrand: But even that's probably ok given that it goes through Jenkins ok.
22:04jekstrand: And in my debugging bugs tend to take the form of "nothing works"
22:09airlied: jekstrand: llvmpipe in CI gives fairly good glsl coverage as well
22:20airlied: jekstrand: okay dropped two sets of r-b, hopefully covers the remaining bits
22:20jekstrand: airlied: Ok. I'll look at it tomorrow
22:20jekstrand: Getting close to EOD here.
22:40SolarAquarion: https://repo.kitsuna.net/lab-mesa-git.log trying to build lib32-mesa, getting a translation unit too large
22:43bnieuwenhuizen: SolarAquarion: also looks like the 64-bit version of some files is used
22:43SolarAquarion: strange. Well, it's being built in a chroot, but that shouldn't be happening, i think
22:57SolarAquarion: bnieuwenhuizen: https://lists.llvm.org/pipermail/cfe-dev/2019-October/063459.html
22:58SolarAquarion: a header is getting a bit too big