09:33pq: jekstrand, I'm certainly following your emails with figurative popcorn. :-)
09:55tango_: the sync ones?
10:33pq: tango_, yup
11:57MrCooper: anholt_ krh: is google-freedreno-db820c-01 the only db820c runner? Seems to be the only one picking up jobs ATM, causing very slow pipelines
12:00MrCooper: also, why are the freedreno test jobs running for radv MRs in the first place?
12:02daniels: MrCooper: yes; the others all dropped off between 20min-1h ago
12:04pepp: MrCooper: it seems some freedreno jobs (like arm64_a306_gles2) are not using the .freedreno-rules
12:11jadahl: is it so that atomic hw cursor planes do not have any hotspot metadata?
12:19maccraft: hello, hows performance of opengl on ironlake intel graphics in comparison to windows?
13:37danvet: mlankhorst_, just rolled drm-misc-fixes forward to -rc6 and applied a bugfix
13:39danvet: mlankhorst_, hm now I see your pull and I'm confused
13:42danvet: mlankhorst_, never had a chance for a fast-forward?
13:46mlankhorst_: danvet: I think we crossed
13:47danvet: mlankhorst_, yeah
13:47danvet: oh well, I also botched something and the fast-forward didn't work anyway
13:51danvet: sumits, ping for that ack
14:00danvet: mlankhorst_, ok it's pushed now, might want to respin that pull so you can fast-forward :-)
14:07mlankhorst_: danvet: enjoy
14:33sumits: danvet: I'm sorry, I typed the reply a little while back - of course, Ack, of course!
15:33jekstrand: pq: It's fun, isn't it?
15:38jekstrand: Can we assume c11 these days? It'd let us delete some things...
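[Editor's note: the kind of thing assuming C11 lets you delete is hand-rolled compile-time assert and alignment machinery. A minimal sketch (illustrative code, not from Mesa):]

```c
/* Pre-C11 projects carry fallbacks like these; with C11 assumed, the
 * language-provided forms below replace them outright. */
#include <stdalign.h>   /* C11: alignas */
#include <stdint.h>

/* Pre-C11 idiom: force a compile error via a negative array size. */
#define STATIC_ASSERT_PRE_C11(cond) \
    typedef char static_assert_failed[(cond) ? 1 : -1]

STATIC_ASSERT_PRE_C11(sizeof(int) >= 2);

/* C11 provides this directly, with a readable message. */
_Static_assert(sizeof(uint32_t) == 4, "uint32_t must be 4 bytes");

/* C11 also gives portable alignment control. */
struct aligned_buf {
    alignas(16) unsigned char data[64];
};

int aligned_ok(void)
{
    struct aligned_buf b;
    return ((uintptr_t)b.data % 16) == 0;
}
```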
15:44danvet: sumits, hm I'm not seeing your reply here
15:44danvet: also nothing in spam
15:45sumits: danvet: hmm... let me re-send it!
15:46sumits: danvet: sent again.
15:56danvet: sumits, showed up now, thanks
15:57sumits: np danvet!
16:48anholt_: MrCooper: one gitlab-runner instance is all of the google-freedreno-db* runners, so unless gitlab-runner gets wedged or someone pauses the individual runners in it, there shouldn't be much else that could go wrong as far as queuing.
16:49anholt_: https://gitlab.freedesktop.org/admin/runners/?scope=all&utf8=%E2%9C%93&state=opened&search=db820c shows db820c-01 having 60% more jobs, but I did all my dev with just that one turned on.
16:49MrCooper: it was the only one picking up jobs when I asked
16:52daniels: yeah, they have come back; when I looked at that same page, all the other runners were showing last contact 1h ago (or one at 21min ago)
16:52anholt_: shouldn't be anyone in the office to kick the network cable.
16:55MrCooper: could it be hung / timed out jobs?
17:02anholt_: MrCooper: maybe some lava refactor jobs https://gitlab.freedesktop.org/tomeu/mesa/-/jobs/1968271
17:03anholt_: should extend the expect-output.sh to take an optional list of failure cases, and fail out the job when they happen
17:03anholt_: (also, 2 hour timeout? I should crank that down to 20min)
17:03anholt_: that must be custom on that repo
17:24dcbaker: hakzsam: you've got a couple of patches "radv/gfx10: fix ... VK_EXT_subgroup_size_control" which don't apply cleanly to 20.0, and I'm not sure how to backport them
17:24dcbaker: that are nominated for 20.0 I mean
17:25hakzsam: dcbaker: yes, do you need a MR with a backport?
17:25dcbaker: that would be nice :)
17:25hakzsam: will do now
17:35hakzsam: dcbaker: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4230
17:37dcbaker: hakzsam: perfect, thanks!
17:38dcbaker: although it's telling me I can't rebase the branch
17:45hakzsam: dcbaker: the branch is based on staging/20.0, why can't you rebase?
17:47kisak: wasn't there an option to allow others to make changes to the merge request that needed to be checked?
17:49hakzsam: maybe I did something wrong?
17:50MrCooper: there is, "Allow commits from members who can merge to the target branch"
17:50MrCooper: hakzsam: you need to check that in the MR settings
17:51hakzsam: just checked that box
17:51hakzsam: dcbaker: ^
17:51dcbaker: that did it, thanks!
17:52MrCooper: hakzsam: if you've never checked that for your MRs before, that probably explains the mystery why others can't manipulate your CI jobs as well
17:54MrCooper: basically everybody always checks that (we'd enforce it if we could)
17:54hakzsam: do I need to check that box every time I submit a MR? or is there a global option?
18:01hakzsam: MrCooper: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232 --> can you cancel/retry jobs?
18:03MrCooper: yes, we can, and you always need to check that
18:03hakzsam: I didn't know
18:16Venemo: jekstrand: could you please review the two small NIR patches in MR 4165?
18:18Venemo: I changed the naming scheme like you suggested + fixed another small issue that I noticed
18:57jbarnes: does anyone object to Emil's latest patch for the set/drop_master stuff?
18:57jbarnes: it's one I'm interested in seeing land, because it's really the only patch I need in order to run stock upstream with a full chromeos stack
18:58jbarnes: without it, chrome fails to start due to the special semantics we added, and this is probably a better overall approach anyway
18:58jbarnes: danvet: ^
19:01anholt_: MrCooper: my thought for 390x/ppc intermittents is that we should probably either crank up the timeout or move the .c enumeration into a list in meson.build so each process spawned has a separate timeout.
19:01jekstrand: Is there an easy way to run a CI build locally?
19:01anholt_: jekstrand: nope, unfortunately
19:01jekstrand: There's piles of stuff specified in the .gitlab-ci.yml
19:01anholt_: there's a command line tool that's supposed to be able to, but it's not maintained and doesn't support modern syntax last I checked
19:02* anholt_ tends to just remove all unrelated jobs, switch policy from manual to on_success, and do lots of pushes to a personal branch
19:02jekstrand: Yeah, I need GDB for this
19:02danvet: jbarnes, smack r-b onto it, xexaxo1 has commit rights
19:02jekstrand: On a mingw build... 😭
19:03anholt_: gdb on a ci build? sounds like a bad day
19:03danvet: the thing looks reasonable and seems to come with igt tests too
19:04jbarnes: yeah tests ftw
19:12jekstrand: WTH! The test failed twice in CI but when I build it locally in the CI container, it works fine.
19:15jekstrand: Seems like I ran the script wrong?
20:59jekstrand: Is marge-bot busted or just swamped right now?
20:59airlied: seems swampy
21:00kisak: jekstrand: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests?assignee_username=marge-bot&scope=all&sort=created_asc&state=opened is the queue, lowest MR # first
21:01HdkR: open empty PRs so later you can fill the branch and jump the queue :D
21:01kisak: a bunch of MRs were rebased and queued up at the same time, so CI was behind, and there's also a minor delay coming from the a630 jobs
21:02jekstrand: Didn't we disable unneeded CI?
21:03jekstrand: My patch is ANV-only and is getting 100% of the CI
21:04kisak: jekstrand: unless I missed something, clever CI skipping hasn't landed yet
21:04pepp: jekstrand: changes under src/mesa/drivers aren't really filtered properly
21:05pepp: see https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/.gitlab-ci/test-source-dep.yml#L8
21:05anarsoul: does mesa lower external samplers for yuv textures if driver doesn't support yuv?
21:06pepp: any modified file matching "src/mesa/**/*" will cause all tests to run (because for now all tests depend on mesa_core_file_list)
21:06jekstrand: Yeah, but this MR only modifies files in src/intel/vulkan
21:07jekstrand: Oh, and I guess src/intel/blorp
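[Editor's note: the fix pepp is pointing at is giving driver-specific jobs their own `changes:` filter instead of inheriting the broad mesa_core_file_list. A sketch of what such a rules anchor might look like; the keys (`rules:`, `changes:`, `when:`) are standard GitLab CI, but the paths and job name here are illustrative, not the actual Mesa config:]

```yaml
.freedreno-rules:
  rules:
    # Run only when freedreno itself, or genuinely shared code, changes.
    - changes:
        - src/freedreno/**/*
        - src/gallium/drivers/freedreno/**/*
        - src/compiler/nir/**/*
      when: on_success
    - when: never

arm64_a306_gles2:
  extends:
    - .freedreno-rules
```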
21:07anarsoul: nevermind, found nir_lower_tex.c
21:12kisak: oops, I meant to say minor delay from a530 CI jobs, not a630
21:57Venemo: jekstrand: I thought a little more about that I/O mask topic. I realized that what I really want is basically a 'default' driver location
22:00Venemo: after I looked into what we use those masks for, we basically don't use them for anything but assigning driver locations
22:13* ajax squints at brw_gl_server_wait_sync
22:16anarsoul: so lowering external samplers is broken if driver doesn't support R8 and R8G8
22:16anarsoul: any ideas how to fix it?
22:18imirkin: support R8 and R8G8?
22:21anarsoul: haven't we discussed that few weeks ago? :)
22:21anarsoul: it's a gles2 gpu
22:21anarsoul: it doesn't support it natively
22:21imirkin: so what are the options, forgetting about existing software?
22:22imirkin: i.e. you have a NV12 surface, with Y8Y8Y8Y8 on one plane, and U8V8U8V8 on the other plane
22:22anarsoul: it can sample from L8 and L8A8 and can render into R8, B8, R8G8 and B8G8
22:23imirkin: ah, should be possible to make it work with L8 and L8A8 instead of R8 and R8G8
22:23imirkin: should just be a bunch of typing to adjust the shaders
22:23imirkin: instead of .rg it's .ra for the uv swizzle
22:26anarsoul: oh, vc4 does something similar
22:26imirkin: (L = stick the value into .rgb)
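[Editor's note: imirkin's suggestion amounts to sampling the luma plane as L8 (value replicated into .rgb) and the chroma plane as L8A8 (u in .r, v in .a), so the lowered shader reads .ra where an R8/R8G8 path reads .rg; the color math is unchanged. A simulated C sketch of the lowered fetch, assuming BT.601 limited-range coefficients (the real lowering lives in nir_lower_tex):]

```c
/* Simulated texel fetches: L8 replicates its value into r/g/b;
 * L8A8 puts L into r/g/b and A into alpha. */
struct vec4 { float r, g, b, a; };

static struct vec4 sample_l8(unsigned char l)
{
    float v = l / 255.0f;
    return (struct vec4){ v, v, v, 1.0f };
}

static struct vec4 sample_l8a8(unsigned char l, unsigned char a)
{
    float v = l / 255.0f;
    return (struct vec4){ v, v, v, a / 255.0f };
}

/* Lowered NV12 fetch: y from the L8 luma plane, u/v from the .r and
 * .a components of the L8A8 chroma plane (instead of .r/.g on R8G8). */
static struct vec4 sample_nv12(unsigned char yp, unsigned char up,
                               unsigned char vp)
{
    float y = sample_l8(yp).r;
    struct vec4 uv = sample_l8a8(up, vp);
    float u = uv.r - 0.5f;   /* .r, same as the R8G8 path would use */
    float v = uv.a - 0.5f;   /* .a where the R8G8 path would use .g */

    /* BT.601 limited-range YCbCr -> RGB (assumed colorspace). */
    float yl = 1.164f * (y - 16.0f / 255.0f);
    return (struct vec4){
        yl + 1.596f * v,
        yl - 0.392f * u - 0.813f * v,
        yl + 2.017f * u,
        1.0f,
    };
}
```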
22:47jekstrand: Venemo: I think we use them for a little bit more than that
22:49Venemo: anyway, would it make sense to add a "default driver location" to NIR? which could be used when the HW doesn't have any special requirements?
22:49Venemo: or is this something that should fully belong in the backend?
22:50jekstrand: I don't know
22:51jekstrand: We've got helpers to quickly assign locations
22:51jekstrand: They just walk the list and assign away
22:51jekstrand: They're pretty dumb
22:51jekstrand: I'm not that inclined to put something like that in common unless we're really sure the algorithm is pretty common.
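[Editor's note: the helpers jekstrand is describing really are just a linear walk over the variable list. A toy sketch of that shape; the types and name here are illustrative, not the actual nir_assign_var_locations signature:]

```c
#include <stddef.h>

/* Toy stand-in for NIR's variable list; the real helper walks the
 * shader's variables and uses a per-type size callback. */
struct toy_var {
    struct toy_var *next;
    unsigned slots;        /* how many locations this var occupies */
    int driver_location;   /* assigned output */
};

/* Walk the list and assign consecutive locations. This is the whole
 * algorithm, which is why it only belongs in common code if every
 * backend genuinely wants exactly this. */
static unsigned toy_assign_var_locations(struct toy_var *vars)
{
    unsigned location = 0;
    for (struct toy_var *v = vars; v; v = v->next) {
        v->driver_location = location;
        location += v->slots;
    }
    return location;   /* total slots used */
}
```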
22:53ajax: jekstrand: have a moment to answer random questions about explicit sync?
22:53jekstrand: ajax: go for it
22:55ajax: which drm drivers, if any, have actual command-stream level parallelism between contexts?
22:55jekstrand: i915 does
22:55jekstrand: amdgpu should be able to
22:55jekstrand: nouveau should be able to (not sure how well enabled that is)
22:56jekstrand: Not sure about freedreno, vc4, vc5, or vivante
22:56ajax: okay, let's take i915 for a moment. is there an oldest gen where that's true?
22:56jekstrand: Gen9, I think
22:57jekstrand: Maybe gen8 but I don't think so
22:57ajax: okay. so before that you probably had one big queue and drm might interleave contexts into it but that's all the parallelism you're getting
22:57jekstrand: I think nvidia's had preemption and real parallelism since nv50 or so
22:58ajax: can i get that level of parallelism for multiple gl contexts within a single process?
22:59dj-death: I thought gen8 had preemption too
22:59jekstrand: Possibly even within a single context (thinking AMD's async compute stuff) though you won't notice that.
23:00ajax: for gen9+ (given the above), and assuming you have the good sense to use GLX_ARB_context_flush_control, how expensive is a context switch?
23:00jekstrand: dj-death: I think they tried on Gen8. I don't know if we enabled it though.
23:00anholt_: ajax: for vc4/v3d, each job has two parts, where B depends on A, and there are separate rings for A and B and they get scheduled independently(-ish)
23:00anholt_: they share the shader core but separate command parsers
23:01jekstrand: ajax: It's not free but also not terrible. iris is currently using separate contexts for 3D and compute and ping-ponging between them.
23:01jekstrand: ajax: It is measurable, though, ickle did a thing in SNA a while ago to stop using contexts because he could measure the overhead.
23:01dj-death: loading of a context was like 60~70us
23:01dj-death: depends on GT size
23:01Venemo: jekstrand: which helpers do you mean?
23:02imirkin: afaik there's no preemption in the middle of a draw on nvidia
23:02jekstrand: Venemo: nir_assign_var_locations
23:02jekstrand: imirkin: I think we're talking mid-command-buffer
23:02imirkin: ah yeah, since the dawn of time
23:04ajax: jekstrand: say i did glFenceSync between every gl draw command. how much am i going to regret it?
23:04anholt_: vc4/v3d don't do any mid-cmdstream preemption, though maybe one could extend to split up the render job to at least preempt at tile granularity.
23:05imirkin: the way the command stream is structured on nvidia, it's impossible not to have pre-emption
23:05bnieuwenhuizen: ajax: jekstrand: amdgpu has parallelism between contexts, at least for compute and dma (and since our newest gen also for gfx AFAIU)
23:05imirkin: you can explicitly YIELD too
23:06jekstrand: ajax: With a sync_file tucked in there? Very badly.
23:07ajax: i don't need this to be an ipc-able fence
23:07jekstrand: ajax: Even without a sync file, very badly
23:07jekstrand: ajax: That's going to land you with one draw per batch along with flushing all the caches and a full GPU stall.
23:08jekstrand: Was that a space or am I missing a font?
23:09ajax: that's a space. i was attempting to evoke "i am speechless".
23:09dj-death: might be a jaw on the floor ;)
23:09jekstrand: ajax: Why do you need to sync on every draw?
23:09jekstrand: ajax: I think iris may be significantly better at this
23:09jekstrand: Or, rather, about to be. Ken has patches in this area
23:09ajax: like... the way i had this kind of thing working on r100 was you could just dma a breadcrumb back out to host memory
23:10dj-death: if we use HW semaphores we can control the amount of flushing from iris
23:10dj-death: not sure it's the case yet
23:10Venemo: jekstrand: I'll take a look at those. would you (intel) gain anything if they were more compact?
23:10ajax: i don't need to sync on every draw, necessarily, but it'd be the worst-case scenario
23:10bnieuwenhuizen: I think the difference is everyone is using software scheduling an on-busy-waits these days?
23:11bnieuwenhuizen: and non-busy-waits*
23:11ajax: i guess maybe if you could only do that once per batch...
23:12jekstrand: ajax: In iris, Ken has patches to do a more breadcrumb style implementation and it's just a clone of what's already in radeonsi
23:12jekstrand: And possibly other drivers as well
23:12ajax: good, i'll go ahead and not care about brw anymore
23:12jekstrand: It's still not free (full cache flush) but it's better than one per batch.
23:13ajax: i still don't really understand the cache-flush if the fence is only scoped to the current context
23:13jekstrand: If we nuke i965 performance, that's not going to be good. There are still lots of HSW users out there...
23:13jekstrand: It's just the way our hardware works. The way you stall the GPU (which you have to do to properly breadcrumb) is to flush the cache and wait on the flush.
23:15ajax: enh. sure. because you can only know the batch has been passed if all the EUs go silent, maybe. ick but fine.
23:15jekstrand: Yup, pretty much
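[Editor's note: the radeonsi-style breadcrumb jekstrand mentions is conceptually the GPU writing an increasing sequence number to memory after each flush point, with the CPU polling it. A simulated sketch; no real GPU or driver API involved, all names hypothetical:]

```c
#include <stdint.h>

/* Shared "GPU-visible" memory word the command stream writes back to. */
struct breadcrumb {
    volatile uint32_t seqno;   /* last value the GPU has written */
    uint32_t next;             /* next value the CPU will emit */
};

/* Emit: a real driver would append "flush caches, then write 'next'
 * to &bc->seqno" to the command buffer. Here the simulated GPU
 * completes immediately. */
static uint32_t breadcrumb_emit(struct breadcrumb *bc)
{
    uint32_t fence = ++bc->next;
    bc->seqno = fence;         /* simulated GPU write-back */
    return fence;
}

/* Wait: poll until the write-back reaches our fence value. The signed
 * difference handles seqno wraparound. */
static void breadcrumb_wait(struct breadcrumb *bc, uint32_t fence)
{
    while ((int32_t)(bc->seqno - fence) < 0)
        ;   /* real code would sleep or back off here */
}
```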
23:17jekstrand: Why do you need to sync so often?
23:17ajax: because i'm an x server. i can't know where i'm going to draw next, and the next command might source from the current command's destination.
23:18jekstrand: Why do you need multiple contexts?
23:18ajax: deciding whether they'd be worth it was kind of why i've been asking these things ;)
23:18jekstrand: Ah :)
23:19jekstrand: If you did a bit of tracking about where you're drawing and what's dirty where, you could be more judicious about your syncs
23:19ajax: yeah, which isn't too hard, there's internal hooks for changing the current draw target
23:19jekstrand: But given that most things will draw to the front buffer, you're going to be syncing a lot
23:20ajax: not... necessarily, i don't think.
23:21ajax: for one, i wasn't really worried about the scenario where X controls the root window pixmap contents
23:21jekstrand: Are you going to have threads?
23:22ajax: let's say no. with the asterisks of, maybe one day one xserver thread per gpu, and maybe the gl is threaded internally.
23:22jekstrand: In that case, you could just hang on to the current context and only sync when you switch contexts
23:22jekstrand: Or have a pointer per-resource to the last context to touch it which you check before you do a draw
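[Editor's note: jekstrand's per-resource suggestion, tracking the last context to touch each resource and only syncing on a cross-context hand-off, can be sketched as follows; names are hypothetical, and the sync counter just makes the saving visible:]

```c
#include <stddef.h>

struct toy_ctx { int id; };

struct toy_resource {
    struct toy_ctx *last_writer;   /* last context that drew to this */
};

static unsigned sync_count;        /* how many times we had to fence */

/* Call before drawing to 'res' from 'ctx': only insert a sync when
 * the resource was last written by a *different* context. */
static void draw_to(struct toy_ctx *ctx, struct toy_resource *res)
{
    if (res->last_writer && res->last_writer != ctx)
        sync_count++;              /* stand-in for a fence + wait */
    res->last_writer = ctx;
    /* ... emit the actual draw ... */
}
```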
23:23airlied: can we go to vulkan first :-)
23:23jekstrand: I'd be a fan of that :)
23:23airlied: I started doing some of the typing for it, but there's a lot of typing
23:26airlied: before I even get to the thinking
23:26bnieuwenhuizen: what do you do for legacy drivers? say r600?
23:27ajax: jekstrand: the main case where i'd need these fences is when i know the destination drawable has another X client listening for damage on it
23:27bnieuwenhuizen: right. Do we want to maintain an extra render backend (vulkan besides GL)? What is the feature advantage for that extra work?
23:27airlied: bnieuwenhuizen: they can continue to suck :-)
23:28jekstrand: ajax: Yeah.....
23:28airlied: glamor is pretty much "done" in that nobody cares enough to do much more with it :-P
23:28ajax: and the point, mostly, would be to notice draws to such watched windows and defer the damage event until the fence signals
23:28bnieuwenhuizen: airlied: I'm just asking in what way we'd make things suck less in the vulkan world?
23:28jekstrand: Vulkan could be potentially a bit faster
23:28jekstrand: We had some glamor perf problems on i965
23:28jekstrand: I think they're mostly gone on iris though
23:28ajax: which gets me to the X protocol sequentiality guarantees w/r/t multiple clients
23:28jekstrand: which covers nearly the same set of platforms as ANV
23:31airlied: bnieuwenhuizen: we'd have a more direct line to sucking less, whether we use it or not
23:32jekstrand: But then I might have to debug X issues again....
23:32chrisf: flexible recording order should be a win
23:32jekstrand: Seems like a downside. :-P
23:32ajax: pfft, X is easy
23:33ajax: or at least, not a moving target
23:42ajax: jekstrand: thanks.