02:39jekstrand: bnieuwenhuizen: I'll read tomorrow
06:17airlied:wonders how insane adding a thread to drisw would be
07:05pepp: daniels: Hi. Windows runner failure https://gitlab.freedesktop.org/pepp/mesa/-/jobs/2106453 (out of disk space)
08:36mareko: airlied: why would you do that?
09:15daniels: pepp: yep, trying to fix it
09:29danvet: sravn, 0day confirmed my theory I think
09:29danvet: can you take another look pls?
09:30danvet: for the drmm_add_final_kfree fix
09:40Venemo: gitlab CI issue: https://gitlab.freedesktop.org/Venemo/mesa/-/jobs/2107988 -> "No space left on device"
09:44danvet: Venemo, #freedesktop for server issues
09:44danvet: Venemo, ping daniels and bentiss
09:44daniels: daniels knows about it and is fixing it
09:44daniels: as discussed ~7 lines above
09:45Venemo: thx, sorry I didn't see it above
09:52danvet: daniels, sry coffee not yet working here, despite like 2nd
09:55mareko: airlied: there is u_threaded_context and that should be easy to enable for sw pipes
10:31kusma: jekstrand: do you know the reason why we encode the destination type-width both in the opcode and in the destination-value for u2u/i2i-opcodes? It's safe to assume they are the same, right?
10:37sravn: danvet: looks good and your explanation the other days explained it well. You got an r-b on mail
10:41airlied: mareko: played with it today, not much impact but llvmpipe has some internal issues that once solved it might help
10:41airlied: but the big issue is presenting
10:42airlied: flush, finish, shm put image and xsync
10:42airlied: pretty much means you stall per frame
10:43airlied: if i could remove finish and put the putimage and sync in a thread it might be more pipelined
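A minimal sketch of the pipelining idea airlied describes, assuming the present step (put image plus sync) can run on its own thread. The `render` and `present` callables and the `run_frames` helper are hypothetical stand-ins, not drisw or llvmpipe code.

```python
import queue
import threading


def run_frames(num_frames, render, present):
    """Render on the calling thread while a worker thread presents frames."""
    # One slot: the renderer can get at most one frame ahead of the presenter.
    pending = queue.Queue(maxsize=1)

    def presenter():
        while True:
            frame = pending.get()
            if frame is None:  # sentinel: no more frames
                break
            present(frame)  # stands in for "shm put image and xsync"

    worker = threading.Thread(target=presenter)
    worker.start()
    for i in range(num_frames):
        # Blocks only while an earlier frame is still waiting to be picked up.
        pending.put(render(i))
    pending.put(None)
    worker.join()
```

With this shape the renderer only stalls when the presenter falls a full frame behind, rather than once per frame.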
10:45pendingchaos: jekstrand: how would it sound to create an opcode like i2i32/u2u32 except instead of zero extending or sign extending, the backend can put garbage in the upper bits
10:45pendingchaos: should be useful for improving the code generated by nir_lower_bit_size
11:32danvet: airlied, vgem and real dma-fence?
11:32danvet: they have a 10s time-out for evil userspace, then vgem.ko force completes
11:32danvet: instead of shmem put image and all that
12:09hakzsam: CI is seriously broken these days :/
12:09hakzsam: I hit various CI issues since friday
12:12hakzsam: https://gitlab.freedesktop.org/hakzsam/mesa/-/jobs/2106839 https://gitlab.freedesktop.org/hakzsam/mesa/-/jobs/2109160 https://gitlab.freedesktop.org/hakzsam/mesa/-/jobs/2109596
12:13hakzsam: I know that the Windows issue is going to be fixed but the other ones are annoying too
12:36tomeu: daniels: in the jobs that fail to download the artifacts, the following line is missing: "Downloading artifacts for meson-arm64 (2109126)..."
12:36tomeu: do we have an idea already of what is going wrong there?
12:52daniels: hakzsam: the Windows thing is already fixed - I hadn't realised one of the details of deploying GitLab Runner under Windows so am working on installing a scheduled task to clean up the logs which were filling the disk automatically. the artifacts thing is pretty much my highest priority to try to figure out now, but I don't have dedicated time to do it.
12:52hakzsam: no worries and thanks for fixing it!
12:53daniels: the fdno flakes are for robclark/anholt - that particular test has been flaking for quite a long time I believe (at least the name hits my mental pattern-match) - I think robclark has a plan for properly fixing it but not sure what's happening with it
12:53daniels: i appreciate the artifacts thing is _really_ annoying
12:54daniels: tomeu: if you've got some time on your hands, maybe you could look into the failing jobs and try to figure out if there's some pattern as to why it's not working? I almost wonder if it's some kind of horrible race with direct upload - in which case, if you put a job which needs arm64-build + is needed by pf, and just runs 'sleep 300', that should be enough to ensure the upload is completed behind the scenes
12:57tomeu: daniels: but, wouldn't that job fail in the same way with the same results?
12:57tomeu: oh, but it wouldn't need the artifacts, right
12:57tomeu: but then, why aren't the artifacts found when the job is retried?
12:57daniels: right now we're stabbing in the dark without further information, but there are two things I suspect. one is that we just haven't declared our dependencies properly so it's not getting scheduled right, but I'm not super sure about that.
12:58tomeu: let me see when the jobs started
12:58daniels: the other is about how we upload artifacts: the runners upload to the service, then the service queues a background job to upload the artifacts to storage. perhaps we're hitting a race where the job gets scheduled before the background job completes, so sees there are no artifacts to download?
12:59daniels: we can bypass that proxy-upload thing, but last time I tried (about a year ago) it broke, so we need to do some testing first to confirm
12:59daniels: or tbh, just do it on Sunday morning when everyone's asleep and then it's quiet enough that we can manually monitor the breakage :P
13:10mareko: git rebase -Xignore-space-change is really useful
13:17emersion: ah, nice one
13:50shadeslayer: Hi, Could someone merge https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4162 please
14:11robclark: daniels: anholt figured out something that looks promising re: the flakes.. it isn't so much an issue w/ a single test but sequences of tests vs bo cache vs ubwc.. https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4290
14:12robclark: (we should probably just go ahead and land that as-is)
14:12daniels: yeah, it seems positively reviewed, so might as well
14:15MrCooper: daniels: the needs: are there, but might be worth trying to drop dependencies: in favour of https://docs.gitlab.com/ce/ci/yaml/#artifact-downloads-with-needs
14:15tomeu: daniels: so far this is the main suspect for bailing out before "Downloading artifacts for ..." gets printed
14:16tomeu: guess it could be related to what MrCooper just mentioned
14:19daniels: yeah, I think trying that out makes a lot of sense
15:19jekstrand: kusma: Yes, they are guaranteed to be the same.
15:21jekstrand: kusma: The reason why we encode the type size in the op is that NIR doesn't allow you to have more than one floating bit size. Rather than changing that (which would have been massive amounts of work), we just added one per destination bit size. Why destination? It seemed more convenient.
15:21jekstrand: pendingchaos: You can do that with a pack and an undef but such an opcode seems pretty reasonable to me.
15:22jekstrand: I agree that it's super useful in the context of bit-size lowering
15:22jekstrand: Or at least has the potential to be
15:22kusma: jekstrand: OK. Seems a bit tricky to get automatic lowering like nir_lower_bit_size to eliminate them for that reason, but for now I think I'll just optimize away some common patterns generated in nir_opt_algebraic.py
15:26kusma: jekstrand: just to be clear, I get i2i32(u2u8(a@32)) and u2u32(i2i8(a@32)) when I try to lower away all 8-bit operations.
15:27kusma: the latter is trivial, it just becomes an and. The former becomes two shifts, but I guess it's debatable that this is always better, so perhaps I need to add a flag, or make my own dxil_opt_algebraic.py...
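The two patterns kusma mentions can be checked with a small model of 32-bit values held as Python integers. This is a sketch only: the helper names below are hypothetical Python functions that mimic the conversions, not NIR's actual API or the rules in nir_opt_algebraic.py.

```python
MASK32 = 0xFFFFFFFF


def u2u8(a):
    """Truncate a 32-bit value to 8 bits."""
    return a & 0xFF


def i2i8(a):
    """Truncation keeps the same low bits whether signed or unsigned."""
    return a & 0xFF


def u2u32_from8(v):
    """Zero-extend an 8-bit value to 32 bits."""
    return v & 0xFF


def i2i32_from8(v):
    """Sign-extend an 8-bit value to a 32-bit bit pattern."""
    v &= 0xFF
    return (v | 0xFFFFFF00) if v & 0x80 else v


def ishl32(a, n):
    """32-bit left shift."""
    return (a << n) & MASK32


def ishr32(a, n):
    """32-bit arithmetic right shift on a bit pattern."""
    signed = a - (1 << 32) if a & 0x80000000 else a
    return (signed >> n) & MASK32


def zext_via_and(a):
    """u2u32(i2i8(a@32)) as a single and."""
    return a & 0xFF


def sext_via_shifts(a):
    """i2i32(u2u8(a@32)) as a left shift followed by an arithmetic right shift."""
    return ishr32(ishl32(a, 24), 24)
```

Whether the two-shift form is actually cheaper than a native sign-extend depends on the backend, which is the trade-off kusma raises.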
15:28pendingchaos: jekstrand: not sure if a pack would work as well for 8->32 as it does 16->32, since there doesn't seem to be a pack_32_4x8
15:29pendingchaos: it would also require checking for that pattern in ACO, since we lower most undefs to 0 for simplicity
15:29kusma: pendingchaos: sounds familiar ;)
15:31jekstrand: kusma: Yeah... :-(
15:32jekstrand: pendingchaos: I think having an opcode sounds like a good idea.
15:35kusma: jekstrand: gotcha, thanks.
15:36jekstrand: kusma: Yeah, I think we can do better there; we just need to figure out a good system of opcodes and elimination rules.
15:37kusma: jekstrand: Yeah, that's totally fair.
15:37jekstrand: So, for 8-bit add, for instance, you could replace it with i2i8(iadd(garbage_extend32(x), garbage_extend32(x))) and that should be easier to optimize
15:38jekstrand: Because garbage_extend32(i2i8(x@32)) could be replaced with x
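A rough model of why this lowering is safe, assuming garbage_extend32 keeps the low 8 bits and leaves the upper 24 bits arbitrary. The opcode is only a proposal in this discussion, so the Python functions here are illustrative and the explicit `garbage` argument is a hypothetical way to model "whatever the backend left there".

```python
def garbage_extend32(x8, garbage):
    """Low 8 bits come from x8; the upper 24 bits are arbitrary (modelled by `garbage`)."""
    return (((garbage & 0xFFFFFF) << 8) | (x8 & 0xFF)) & 0xFFFFFFFF


def iadd32(a, b):
    """32-bit wrapping add."""
    return (a + b) & 0xFFFFFFFF


def iadd8(a, b):
    """8-bit wrapping add, the operation being lowered."""
    return (a + b) & 0xFF


def i2i8(a):
    """Truncate back to 8 bits."""
    return a & 0xFF


def lowered_iadd8(x, y, garbage_x, garbage_y):
    """i2i8(iadd(garbage_extend32(x), garbage_extend32(y)))."""
    return i2i8(iadd32(garbage_extend32(x, garbage_x),
                       garbage_extend32(y, garbage_y)))
```

The low 8 bits of a sum depend only on the low 8 bits of the operands, so the garbage never reaches the truncated result. The elimination rule follows from the same model: for any 32-bit x, the upper bits of x itself are one valid choice of garbage, so garbage_extend32(i2i8(x)) may be replaced with x.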
15:38kusma: jekstrand: well, nir_lower_bit_size actually fixes everything for me *except* some u2u/i2i combinations to sign-extend the result, it seems...
15:39kusma: jekstrand: Right. For DXIL, we also have flags if we care about overflow on the operations.
15:39kusma: (straight outta LLVM)
15:40kusma: Not sure if that's super useful or not. Just an observation. DXC seems to use that quite heavily, but that could just be because it's based on LLVM...
16:43jekstrand: kusma: I think we're likely going to want to add such a bit to NIR at some point.
16:43jekstrand: kusma: Currently, NIR always tries to keep things overflow-correct
16:44kusma: jekstrand: OK, cool.
16:45jekstrand: For this particular case, though, even if the 8-bit thing isn't overflow-correct, if we use garbage_extend (I really don't like that name!), the resulting op is going to have to be overflow-correct because we don't know what that garbage will be.
16:45jekstrand: undef_extend might not be a terrible name, I suppose
16:49kusma: jekstrand: yeah, I suppose...
16:51jekstrand: Anyway, undef_extend seems like a good idea
16:51kusma: I'm going to try to properly wrap my head around that stuff tomorrow :)
16:52kusma: For now I'm happy to have something that kinda works with some nasty hacks ;)
16:52kusma: And my brain is starting to hurt already ;-)
16:56kusma: It's been a long day of LLVM-debugging and manual sign-extending and bit-packing. I'm still getting "invalid record"-errors from LLVM if I lower u8 to u16 instead of u32... so yeah, still some ghosts left in the computer. Ugh.
17:16shadeslayer: tomeu: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4377 could use review too :)
17:28jekstrand: kusma: :(
17:29jekstrand: kusma: Does the underlying platform not support u16 for some things?
17:30kusma: jekstrand: u16 works, turns out the problem was that binops needs to have matching operand sizes, and nir does shifts with u32 shift-counts
17:31kusma: jekstrand: so, that was easy to remove, doing the same as ac_nir_to_llvm.c does
17:31kusma: Just needed to debug deeply enough into the llvm bitcode-reader to see what was going on
17:32kusma: Did I mention I'm not the biggest fan of LLVM bitcode ;-)
17:44anholt: hakzsam: what's the status of fixing the radv fossils intermittent failures?
17:44hakzsam: anholt: should be fixed by https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335
17:45anholt: (my freedreno flake-reduction MR failed to merge due to radv and panfrost flakes)
17:45anholt: tomeu: https://gitlab.freedesktop.org/anholt/mesa/-/jobs/2115919
17:45hakzsam: I think the MR will be merged in one day or two
17:55robclark: heh, there is some kind of irony in not being able to land fd flake reduction due to flakes in other drivers :-P
17:56anholt: we've been bad at disabling flaky jobs recently.
19:07airlied: danvet: yeah considering that path also
19:15eric_engestrom: hakzsam, Venemo: !4284 is tagged for stable, but doesn't apply cleanly without !4159
19:16eric_engestrom: 3 options: 1) backport both, 2) backport neither, 3) you send me a custom backport MR (against staging/20.0)
19:16eric_engestrom: which one would you prefer?
19:18hakzsam: eric_engestrom: it's the shuffle thing?
19:18hakzsam: so, squash both and backport
19:21eric_engestrom: done :)
19:26anholt: tomeu: lava failed again, differently https://gitlab.freedesktop.org/anholt/mesa/-/jobs/2117571
20:03Venemo: eric_engestrom: yeah, the two should be squashed together, and the squashed patch is safe to backport.
20:06eric_engestrom: currently looks like this: https://gitlab.freedesktop.org/mesa/mesa/-/commit/50f3a85b690af0e4d61c621feff1c25b42e3ecf4
20:09eric_engestrom: actually no, I just swapped the two commit messages, makes more sense like this: https://gitlab.freedesktop.org/mesa/mesa/-/commit/1e598bf8e0a1e94fa87235b02c58b93996ea93f4
20:14airlied: bnieuwen1uizen: okay used the once stuff to init the mutex
20:36Venemo: eric_engestrom: the commit message could be "Enable subgroup shuffle on all chips", but what you have is also OK
20:38Lyude: this is a pretty silly question that I've never bothered to ask, but would be useful for some vbl docs I got asked to review: does the blanking region for the display raster live on the first few scanlines, or the last few? (e.g. is it on the top or bottom?)
20:38Lyude: I'm like 99% sure it's on top
20:43ajax: Lyude: it's all in your perspective
20:43ajax: if you start from a modeline kind of description then the blanking intervals are conceptually south and east of the image data
20:44ajax: (showing my northern-hemisphere bias in which up is north)
20:45Lyude: ajax: ah, so it'd be below then
20:47ajax: right. it's equally valid to think of it as coming "above" the image data, but most of the time i see it described as below
20:47Lyude: makes sense
20:47ajax: in particular for scanline-interrupt generation 0 is usually the first row of pixels
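A small sketch of the modeline view ajax describes, where scanline 0 is the first row of pixels and the blanking lines come after the active image. The `Modeline` class and its method names are hypothetical; the timings are the common 1920x1080 at 60 Hz mode, and real hardware may number scanlines or fire interrupts differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modeline:
    clock_khz: int
    hdisplay: int
    hsync_start: int
    hsync_end: int
    htotal: int
    vdisplay: int
    vsync_start: int
    vsync_end: int
    vtotal: int

    def vblank_lines(self):
        """Lines per frame that carry no image data."""
        return self.vtotal - self.vdisplay

    def hblank_pixels(self):
        """Pixel clocks per line that carry no image data."""
        return self.htotal - self.hdisplay

    def in_vblank(self, scanline):
        """True if `scanline` falls in the blanking region 'below' the image."""
        return self.vdisplay <= scanline < self.vtotal

    def refresh_hz(self):
        """Frames per second implied by the pixel clock and totals."""
        return self.clock_khz * 1000 / (self.htotal * self.vtotal)


MODE_1080P60 = Modeline(148500, 1920, 2008, 2052, 2200,
                        1080, 1084, 1089, 1125)
```

Under this convention the blanking region is "south and east" of the image: lines 1080 through 1124 vertically, and pixels 1920 through 2199 on each line.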
20:54HdkR: I tend to think of it as "in-between" because describing it one way versus the other tends to place assumptions that newer LCDs will end up breaking
20:55HdkR: (As opposed to old CRTs)
20:56agd5f_: Lyude, also hw tends to be pretty flexible as to what is considered a vblank interrupt (start, end, somewhere in the middle)
20:57Lyude: HdkR: I mean yeah, but I think we're only really concerned about just explaining things from the driver's perspective here
20:57Lyude: agd5f_: mhm, that's probably a good point to bring up in the docs
20:58HdkR: Dealing with the ArtX VI interface hurt my mind, since it's dumb flexible for this
20:59HdkR: Even gives you up to four independently controlled interrupts for letting you know where in the blanking period it is
20:59HdkR: er, in the scanning period*
21:00Lyude: HdkR: yeah-actually another very well known GPU manufacturer has the same thing as well ;)
21:04ajax: ArtX! (hits cigarette) now there's a name i haven't heard in a long time...
21:05HdkR: haha :D
21:09Lyude: huh, TIL there is literally an ascii character called "Horizontal Scan Line 7"
21:17Lyude: agd5f_: poke-just to clarify, when you say "start, end, somewhere in the middle" do you mean with respect to the vertical blanking region, or the whole raster?
21:42agd5f_: Lyude, blanking region. when hw has a vbl interrupt, it's not always consistent across vendors. some fire at the start, some at the end, etc.
21:42Lyude: agd5f_: ahh, just wanted to confirm. I know intel GPUs are like that across different gens
21:42agd5f_: the vbl interrupt is not always what you want to meet expectations, depending on what those are.
23:24robclark: anholt, daniels: does a bunch of jobs failing with "tar: artifacts/install.tar: Cannot open: No such file or directory" mean something flaked in a big way.. or an actual problem with earlier pipeline stage?
23:24robclark: re: https://gitlab.freedesktop.org/robclark/mesa/pipelines/126430
23:25robclark: hmm.. https://gitlab.freedesktop.org/robclark/mesa/-/jobs/2121472 says "ERROR: No files to upload"
23:33robclark: gitlab seems to have a different concept from me of what "Show whitespace changes" means
23:53anholt: robclark: last time I heard about it, the theory was a new version of gitlab-runner being flaky.
23:53anholt: robclark: note in the mesa-arm64 job "ERROR: No files to upload"
23:56robclark: yeah, I saw that.. I wasn't sure quite what that meant, since I didn't see any obvious compile error in that job..
23:57robclark:clicked rebase to make the whole pipeline run again