00:32 anholt: man, even running it from CI's docker image didn't keep virgl_test_server from using my amd gpu and GPU hanging it to death.
02:34 robclark: is this an expected outcome of LIBGL_ALWAYS_SOFTWARE=1? https://paste.ubuntu.com/p/628fbJCgW5/
02:34 robclark: ie. I don't think we *really* want sw gl to be on top of zink?
02:35 bnieuwenhuizen: I think there was an MR open to fix that
02:35 bnieuwenhuizen: robclark: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8782
02:36 steev: robclark: i'll yank that in here, i guess
02:36 robclark: thx, fwd'd that
02:36 robclark: o/ steev
03:32 airlied: jenatali, zmike : uggh got a crash with gnome-shell/zink/lavapipe (no envvar set), have to track down why it's even getting zink
03:33 zmike: airlied: there's a ticket for that
03:33 zmike: but I suspect this is also related to why people get zink when they set some lavapipe/llvmpipe env vars
03:34 zmike: not at a real computer atm, but probably the ticket has zink label
03:35 airlied: zmike: no the ticket is for LIBGL_ALWAYS_SOFTWARE=1
03:35 airlied: nobody sets any vars here
03:35 zmike: right, I'm saying I suspect there's a connection
03:36 zmike: because iirc this was happening with the env vars prior to the commit from that ticket
03:36 airlied: yup I just r-b that patch before I got debugging this
03:36 zmike: (I was getting bug reports from people on my wip branch a loooong time ago and they were setting bizarre env vars to get zink)
03:36 zmike: cool
03:37 zmike: cc/ping me if you want a review tonight or I'll see anything with zink label when I check tomorrow morning 👍
04:04 jekstrand: Uh... Is gitlab not allowing pushes?
04:11 jenatali: airlied: Is there no GL hardware available such that it falls back to drisw/zink?
04:13 jenatali: For the record I really want that path to work for the d3d12 backend, since that's the environment we'll be in for WSL - no other GL hardware but a layered driver with a sw winsys works well. Not sure if such a scenario exists for zink...
04:13 airlied: jenatali: yup no hw, and then drisw loads zink and zink binds to lavapipe
04:14 airlied: jenatali: the problem is to make gnome-shell run in that scenario which i'm not sure matters to you
04:14 jenatali: Ooh... yeah I didn't think about lavapipe getting preferred over llvmpipe...
04:14 airlied: I might make zink fail to bind to cpu devices
04:14 airlied: unless forced
04:14 jenatali: Seems reasonable to me
04:15 zmike: 👍 from me
04:17 zmike: jekstrand: gitlab's been downish for ~9 hours, guess it's going to be intermittent this week during the transition
04:20 jenatali: airlied: I'm out of town at the moment, feel free to push your r-b and merge that fix if you're good with it
04:30 airlied: is throwing a fix at rawhide to see if it fixes the problem
05:03 daniels: jekstrand: are you still getting the post-receive hook fail?
05:03 daniels: you shouldn't be, because er, there aren't any post-receive hooks
05:04 daniels: hanging during push was 'expected' for a couple of hours, but Ben sorted that out
05:05 airlied: remote: Resolving deltas: 100% (6502/6502), completed with 964 local objects.
05:05 airlied: remote: GitLab: Internal API error (500)
05:05 airlied: daniels: ^ just there now
05:06 daniels: thanks, will see what I can do
05:17 daniels: airlied: will have to check in the morning I'm afraid
05:17 daniels: at a guess, it's transient fail affecting repos forked from Mesa which should be fixed by ~lunch European time
05:18 airlied: daniels: no worries, I've marked the issue as a blocker for now anyways
05:18 daniels: thanks, sorry about that
09:16 pq: re: LICENSE; there is a "new" thing: https://reuse.software/
09:23 pq: I've tried the 'reuse lint' tool in CI for a new personal tiny project with DCO, MIT, and CC-BY-4.0 and it gives a warm fuzzy feeling when it passes.
09:29 JEEB: it gets real fun when the overall license is one, but then most if not all files have another license
09:30 JEEB: like files containing LGPLv2.1+, but LICENSE being LGPLv3
09:30 JEEB: (and before that GPLv3)
09:30 JEEB: see: libaribb24
09:36 pq: is that even a valid situation?
09:36 pq: sounds very ambiguous
09:37 pq: my tiny project cannot use a single LICENSE file either, because images are CC-BY-4.0 and code is MIT. According to reuse.software guidelines, licenses are set per file according to the SPDX syntax. But there is no ambiguity.
09:41 FLHerne: Speaking of licenses, the headers added in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8816 look very fishy to me
09:43 FLHerne: There's both an MIT header *and* a 3-clause BSD one in each file, with slightly contradictory terms
10:46 JEEB: pq: https://github.com/nkoriyama/aribb24/issues/9
10:47 JEEB: effectively people are interpreting it according to the more strict definition
10:47 JEEB: which is either GPLv3 for the last release, or LGPLv3 for the current master
10:47 JEEB: but yes, it's kind of hilarious
10:47 JEEB: (and sad)
10:47 JEEB: apparently in dec, 2018 I made some grepping
10:48 danvet: emersion, I guess you'll apply some patch once the standalone fourcc bikeshed has settled?
10:48 emersion: are you fine with either version?
10:57 Sumera[m]: danvet: is this the right refactor for the second step we discussed? https://paste.ubuntu.com/p/zPtBf8Ntxv/
10:58 Sumera[m]: or did I get it wrong? I have only made code changes, haven't updated the docs yet.
11:10 danvet: Sumera[m], lgtm, some nits: move the helper above, make it static and usually in igt we use a __ prefix instead of _helper postfix for these helpers
11:22 hanetzer: yeesh, bisects are such a pain in the arse.
11:22 Sumera[m]: danvet: hm, does __igt_vblank_wait() work ?
11:24 hanetzer: agd5f: just finished the bisect re: drm/amd issue #1388
11:26 danvet: Sumera[m], ?
11:26 danvet: oh for the name, sure
11:27 Sumera[m]: danvet, yes, thanks!
11:44 JEEB: (34
13:47 danvet: tzimmermann, I think gerd has some qxl series for cross review with your ast one?
13:47 danvet: I did scroll through it quickly, looked reasonable
13:50 tzimmermann: danvet, yes we can do that.
13:52 tzimmermann: i only need reviews for the ast patches 1 to 5. these could then go in soon. the rest would be reworked rsp to the vbox driver and the shmem stuff i just posted
14:43 danvet: hwentlan, for the DC private state tracking/sync issue, probably good to look at what mripard has done in vc4 and the new drm_crtc_commit helper
14:43 danvet: I'd expect you'll need that stuff too if you state-ify DC
16:34 imirkin: dcbaker: when is mesa 21.0 supposed to come out?
16:34 imirkin: i might have to revert a change
16:54 alyssa: wonders if there's a good way to deduplicate atomics in NIR
16:54 imirkin: "de-duplicate"?
16:56 alyssa: imirkin: Providing the illusion of {global, ssbo, image, shared, deref}_* being merged
16:57 alyssa: I'd probably be content with nir_intrinsic_info having a segment property.
17:02 karolherbst: imirkin: what did you break? :p
17:05 imirkin: karolherbst: the fence thing apparently causes issues
17:05 karolherbst: ;
17:05 karolherbst: :/
17:05 karolherbst: annoying
17:06 imirkin: hence my questions in #nouveau.
17:08 alyssa: imirkin: https://rosenzweig.io/0001-nir-Unify-memory-atomics.patch there's this, for a start :p
17:09 imirkin: alyssa: as long as the type of thing being referenced is obvious, i wouldn't see a problem with that
17:10 karolherbst: alyssa: I guess that makes sense
17:10 imirkin: oh, wait, you're not even talking about removing the diff ops
17:10 imirkin: you're talking about reducing the spam in nir_intrinsics.py?
17:11 karolherbst: alyssa: you should move the comments though
17:11 alyssa: imirkin: reducing the spam in nir_intrinsics.py and then routing through the segment info so backends can handle them all with the same case
17:12 alyssa: Removing the diff ops would certainly be nicer, but I really don't want to go fiddling with every compiler in-tree, not all of which are even in mesa CI
17:12 alyssa: (And it's a can of worms because the different seg/op combinations take different arguments)
17:13 alyssa: (For my motivation, our hw only supports global atomics, everything else gets lowered to address arithmetic)
17:23 anholt: alyssa: for hw like that (V3D is the same instruction type for ssbo, ubo, scratch, and shared), I've wanted a lowering to global accesses with a load-ssbo-base-address (or whatever) intrinsic used in addressing math in nir
17:23 alyssa: anholt: We have nir_load_ssbo to rewrite ssbo->global
17:24 alyssa: jekstrand: has mentioned the pass is probably completely redundant vs nir_lower_io with the right options
17:24 anholt: nir_lower_io really needs a whole lot of documentation
17:25 alyssa: You don't like inscrutable magic code?
17:25 anholt: I have understood a good chunk of it, briefly, before. it's not knowledge that sticks, though.
17:25 jekstrand: anholt: Yeah....
17:25 jekstrand: alyssa: What do you not need with nir_lower_io?
17:25 jekstrand: It's a magical wonder-function that does most things for you
17:26 alyssa: jekstrand: nir_lower_ssbo possibly?
17:26 alyssa: getting SSBO access lowered to globals and 64-bit iadd with a sysval base
17:28 alyssa: Likewise we need (some) image accesses lowered that way. And obscurely shared computational (not xchg/cmpxchg) atomics on bifrost need to be lowered to a special SEG_ADD instruction
17:28 alyssa: it's easy enough to do this stuff in the backend and I've come to accept I'll wind up writing backend CSE down the line, but hey
17:29 jekstrand: alyssa: nir_lower_explicit_io has 95% of what you need for that already. :)
17:29 jekstrand: Assuming you start with derefs
17:29 alyssa: Ack
17:30 alyssa: Also, what's the story with all the weird 62-bit pointer stuff?
17:31 alyssa: Given hardware with a totally flat 64-bit address space and "load_shared_base/load_scratch_base" intrinsics, can all of that be ignored?
17:31 alyssa: (This is not our hardware...)
17:32 HdkR: Someone has 62bit hardware?
17:32 alyssa: Looks like NIR wants deref_mode_is/addr_mode_is intrinsics even with hardware like that..
17:33 imirkin: HdkR: intel does
17:34 imirkin: iirc the low bits of a pointer aren't used
17:34 imirkin: clever people stuff extra data in there
17:34 HdkR: fancy tag bits
17:34 imirkin: (clever/suicidal)
17:36 imirkin: (and not sure all the upper bits are usable either)
17:38 jekstrand: alyssa: The weird 62-bit pointer stuff is for OpenCL where we may not know the type of pointer at compile time.
17:38 jekstrand: So we use the top 2 bits to tag it as a global, shared, or scratch pointer
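The tagging scheme jekstrand describes can be sketched as follows. This is a minimal illustration, assuming a 64-bit value whose top 2 bits carry the mode and whose low 62 bits carry the address; the enum values and helper names are hypothetical and are not taken from NIR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mode tags; the encoding NIR actually uses may differ. */
enum ptr_mode {
    PTR_MODE_GLOBAL  = 0,
    PTR_MODE_SHARED  = 1,
    PTR_MODE_SCRATCH = 2,
};

#define PTR_TAG_SHIFT 62
#define PTR_ADDR_MASK ((UINT64_C(1) << PTR_TAG_SHIFT) - 1)

/* Pack a 62-bit address and a 2-bit mode tag into one 64-bit value. */
static inline uint64_t ptr_pack(enum ptr_mode mode, uint64_t addr)
{
    assert((addr & ~PTR_ADDR_MASK) == 0); /* address must fit in 62 bits */
    return ((uint64_t)mode << PTR_TAG_SHIFT) | addr;
}

/* Recover the mode tag from the top 2 bits. */
static inline enum ptr_mode ptr_mode_of(uint64_t ptr)
{
    return (enum ptr_mode)(ptr >> PTR_TAG_SHIFT);
}

/* Recover the address from the low 62 bits. */
static inline uint64_t ptr_addr_of(uint64_t ptr)
{
    return ptr & PTR_ADDR_MASK;
}
```

With a layout like this, a load through a pointer of unknown type can read the tag at run time and pick the matching memory access.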
17:39 alyssa: sorry, better question - why do we need to know at run time either?
17:39 alyssa: Is there an OpenCL typeofptr instruction or something?
17:39 imirkin: address 1 in shared memory != address 1 in global memory...
17:39 bnieuwenhuizen: alyssa: you have loads/stores that don't care about memory type in CL2
17:39 jenatali: alyssa: Yes there is
17:40 alyssa: jenatali: Incredible. Got it, thanks. That's frustrating.
17:40 jenatali: alyssa: https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#address-space-qualifier-functions
17:44 jekstrand: alyssa: Also, some of us don't have a unified address space. :-/
17:44 jekstrand: alyssa: On Intel, we currently emit an if-ladder for every load/store with a generic mode.
17:44 jekstrand: Fortunately, nir_opt_deref is pretty good at determining exact modes for most things.
17:45 jekstrand: alyssa: Back to your original question... What is the original question?
17:45 jekstrand: :P
17:46 anholt: "can we convince nir_lower_io (or _io_explicit) somehow to turn ssbo accesses into global accesses with addressing math based on an intrinsic to get the ssbo base/size"
17:46 jekstrand: Yes, we can totally do that
17:46 anholt: (or at least, that's the original question I'm interested in)
17:47 jekstrand: writes a patch
17:47 jenatali: Isn't there already an address mode that can do that?
17:47 jenatali: The bounded global mode or something
17:47 jekstrand: Yeah. Just need a constant_base intrinsic
17:51 jekstrand: hrm...
17:52 jekstrand: So... nir_lower_explicit_io isn't quite ready for it
17:52 dcbaker: imirkin: I'm running a bit behind this week, I'm not going to make the release till tomorrow, but I think there's still opened blocking issues, so unless that's changed it'll be -rc4 this week
17:52 jekstrand: It's got 90% of what's needed
17:55 alyssa: anholt: load_ssbo_address is what we're using for lower_ssbo fwiw
17:56 alyssa: and I guess there's already get_ssbo_size for bounds checking
17:56 imirkin: dcbaker: ok. how do i add a blocking issue?
17:56 hanetzer: alyssa: o/
17:57 alyssa: \o
17:57 anholt: alyssa: yeah, if there's a spec requirement for bounds checking, then that pass should be using get_ssbo_size to do it
17:57 anholt: but it does look like v3d could use that pass to delete a bunch of code.
17:58 anholt: though also, it looks like there's a 64-bit assumption baked in to that lowering pass
17:58 alyssa: anholt: AFAIU there's a spec requirement iff robustness is enabled on the context
17:58 anholt: while v3d is 32-bit.
17:58 jekstrand: Do you want load_ssbo_base_ptr and load_ssbo_size as separate things?
17:58 jekstrand: Or a single load_ssbo_descriptor?
17:58 alyssa: jekstrand: Makes no difference to us, but the separate things are already both in nir_intrinsics so I'd prefer separate (method of maximal laziness)
17:58 alyssa: [Kidding. Maybe.]
17:59 anholt: separate makes most sense for v3d
18:00 anholt: (obvious translation to 32-bit uniform stream values)
18:00 dcbaker: imirkin: there's a milestone
18:00 dcbaker: add your issue or MR to the milestone
18:01 imirkin: dcbaker: thanks!
18:01 jekstrand: types code
18:04 MrCooper: imirkin: you can set the milestone on the issue/MR page
18:04 imirkin: MrCooper: yeah, i found that. just didn't realize that's what the "blocking" stuff was about
18:05 imirkin: had i bothered to click on it, it would have been fairly obvious
18:05 imirkin: since the first one in the list is "Mesa 21.0 release blockers"
18:24 jekstrand: alyssa, anholt: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8844
18:24 jekstrand: I'll let one of you two figure out how to plumb the address format through gallium.
18:25 jekstrand: There's a new address format I add in !8635 if you want to save on some 64-bit arithmetic
18:25 anholt: you mean not just "stuff it in shader compiler options"? :)
18:25 jekstrand: anholt: I've been tempted....
18:25 jekstrand: anholt: I'm not convinced that's the way to do it but there are days
18:26 anholt: tbf, u_screen makes it not terrible to add new caps now.
18:26 anholt: as long as you don't need it to be per-stage (whoops)
18:28 alyssa: jekstrand: nice!
18:30 jekstrand: if you use bounded_global, you'll get bounds checking
18:30 jekstrand: If you use 64bit_global, you won't
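The difference between the two address formats can be pictured with a small sketch: an SSBO access becomes base-plus-offset addressing, and only the bounded variant compares the offset against the buffer size first. The `ssbo_desc` struct and both helpers are hypothetical, and returning 0 for an out-of-range read is an assumption made for the example, not something stated in the discussion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: base address and size in bytes of one SSBO. */
struct ssbo_desc {
    const uint8_t *base;
    uint32_t size;
};

/* Bounds-checked 32-bit load: out-of-range reads return 0 (assumed policy). */
static uint32_t load_bounded(const struct ssbo_desc *d, uint32_t offset)
{
    uint32_t v;

    if (offset > d->size || d->size - offset < sizeof(v))
        return 0;
    memcpy(&v, d->base + offset, sizeof(v));
    return v;
}

/* Unchecked 32-bit load: the caller guarantees the offset is in range. */
static uint32_t load_unchecked(const struct ssbo_desc *d, uint32_t offset)
{
    uint32_t v;

    /* Debug-only sanity check; release builds do no range test at all. */
    assert(d->size >= sizeof(v) && offset <= d->size - sizeof(v));
    memcpy(&v, d->base + offset, sizeof(v));
    return v;
}
```

The check is written as `size - offset < 4` rather than `offset + 4 > size` so that a large offset cannot wrap around and slip past the test.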
18:45 jekstrand: alyssa, anholt: Added two more patches for UBO support
18:56 agd5f: hanetzer, unfortunately, I think you are experiencing a different issue
18:58 hanetzer: agd5f: hmm. well, it presented itself the same way as mr. rx580's issue
19:01 agd5f: hanetzer, so you are just getting the dcn20_validate_bandwidth warning, but the system is otherwise usable?
19:01 hanetzer: agd5f: no; without that revert, on ~5.10 I can't use xorg
19:01 hanetzer: black screen
19:02 hanetzer: with that revert 'everything works' as far as I can tell.
19:03 hanetzer: no warnings or nothing.
19:03 agd5f: anything in the dmesg output when that happens or just the display not lighting up?
19:03 agd5f: ok, that's not the same issue as the original report. Still an issue to be sure, but not related to the OP
19:04 hanetzer: the dcn20_validate_bandwidth_fp error happens when starting xorg, without that commit reverted. should I open a new issue?
19:04 agd5f: sure. thanks. too many issues on the same bug gets confusing
19:07 agd5f: hanetzer, also provide info about the monitor(s) used and res/refresh rates
19:07 agd5f: on the report
19:08 hanetzer: geh... I suck at making good issues. will do. I've tested three different monitors on this setup :)
19:09 hanetzer: two of those custom zisworks displays, one lg ultrawide, and one old ass dell monitor with a DVI-HDMI adapter
19:09 agd5f: hanetzer, BTW, thanks for bisecting!
19:09 hanetzer: took a while =_=
19:11 hanetzer: whelp. now that I 'know' where my issue lies and I have a full bisect log and suchnot I can go ahead and do various sorts of inter-monitor testing.
19:15 pendingchaos: jekstrand: what's the difference between load_global_constant and load_global with ACCESS_CAN_REORDER?
19:17 jekstrand: pendingchaos: Uh.... I wasn't aware of ACCESS_CAN_REORDER
19:18 jekstrand: pendingchaos: For one thing, NIR will actually CSE etc. load_global_constant but it's unaware of ACCESS_CAN_REORDER
19:19 alyssa: does NIR not usually CSE intrinsics?
19:20 jekstrand: It's based on the CAN_ELIMINATE and CAN_REORDER bits
19:20 alyssa: got it
19:20 jekstrand: If both are set in the nir_intrinsic_info, you get CSE
19:20 jekstrand: If CAN_ELIMINATE is set, you get DCE
19:20 jekstrand: If neither, it leaves it alone
19:20 pendingchaos: nir_intrinsic_can_reorder() (used by CSE) already includes a few load intrinsics, load_global could be added to get CSE
19:20 jekstrand: pendingchaos: Uh... Right. I forgot about that.
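The flag rule jekstrand lists can be written down as two predicates. This is only a sketch of the rule as stated in the chat: the enum is a hypothetical stand-in for the two bits in nir_intrinsic_info, and it ignores the extra per-intrinsic cases in nir_intrinsic_can_reorder() that pendingchaos points out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two flag bits named above. */
enum intrinsic_flags {
    CAN_ELIMINATE = 1 << 0,
    CAN_REORDER   = 1 << 1,
};

/* An intrinsic whose result is unused may be dropped if CAN_ELIMINATE is set. */
static bool allows_dce(unsigned flags)
{
    return (flags & CAN_ELIMINATE) != 0;
}

/* Two identical intrinsics may be merged only if both bits are set. */
static bool allows_cse(unsigned flags)
{
    return (flags & CAN_ELIMINATE) && (flags & CAN_REORDER);
}
```

An intrinsic with neither bit set fails both predicates, which matches "it leaves it alone".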
19:22 jekstrand: So, yeah, we could add some more intrinsics there and we could probably drop load_constant
19:22 jekstrand: load_global_constant, rather
19:25 jekstrand: That'd save us some plumbing
19:28 alyssa: Guess who probably needs to finally redesign RA...
19:31 jekstrand: raises his hand
19:31 jekstrand: But not for your driver. :P
19:31 imirkin: i think pretty much everyone is in need of this.
19:31 imirkin: raise your hand if you're happy with your RA
19:31 imirkin: *crickets*
19:33 alyssa: jekstrand: am I going to regret a linear scan / graph colouring hybrid
19:35 jekstrand: alyssa: What you're going to regret is touching RA again. :)
19:35 jekstrand: Also, yes. Not because it's a linear scan / graph coloring hybrid but because it's an RA algorithm. It really doesn't matter what you use. You'll regret it sooner or later.
19:35 alyssa: True :(
19:36 anholt: aco is the only ones I haven't heard griping about theirs.
19:36 alyssa: anholt: You should listen to them more often, then ;)
19:36 alyssa: dschuermann: ;)
19:37 alyssa: Woah, awesome! I just instinctively typed `pkill Xorg` out of a fit of frustration, but I didn't suffer any data loss.
19:37 alyssa: Wayland rocks!
19:38 dcbaker: add that to the list of things wayland does better than X, lol
19:39 alyssa: [x] Dealing with alyssa's tantrums
19:40 hifi: if I'm sending a patch to dri-devel should I CC the maintainers or not?
19:40 alyssa: yes
19:40 alyssa: ("You're not a maintainer, sit down alyssa")
19:42 ajax: listing yourself as a maintainer does sort of imply you want to see yourself cc'd on patches about the thing you maintain
19:46 hifi: ok, thanks
20:17 jekstrand: alyssa: My recommendation on that MR is to pull my patches into an MR to do what you want w/ panfrost so you can show my code works. :)
20:18 alyssa: jekstrand: Ack
20:31 danvet: jstultz, I think we're at the point where you need to set up a meeting with android folks and me
20:31 danvet: this isn't working
20:36 Kayden: zmike: hey, lstrano mentioned that you might have some patches with extra u_threaded_context docs?
20:37 zmike: Kayden: uhhh I had a couple patches a long time ago but they got rejected because the part I was documenting was being rewritten
20:39 Kayden: ahh, okay
20:39 zmike: if you have some questions I might be able to answer from the perspective of someone who (vaguely) recently did an impl
20:39 danvet: jstultz, also I hope dma-buf heaps doesn't hand out !VM_SPECIAL mmaps already, that would be inconvenient at best
20:40 Kayden: I've been looking at helgrind errors with u_t_c enabled, and I'm not entirely sure what to do about some of them
20:40 zmike: ah
20:40 zmike: I ran into that as well
20:40 zmike: best advice is just ignore it
20:41 Kayden: well, some of them are real
20:41 zmike: hm
20:41 zmike: like mesa core stuff?
20:41 Kayden: the docs say that create_*_state (all CSOs and shaders) should not "use any per-context stuff"
20:42 Kayden: but...my hooks there upload the shader assembly into a buffer, which uses a u_upload_mgr today
20:42 zmike: ah, you need to use a different u_upload_mgr
20:42 Kayden: which has to use transfer maps to get a CPU map of the resource...which needs a pipe_context
20:42 zmike: or you can threaded_context_unwrap_sync(pctx)
20:43 zmike: but basically if you're in a thread you can use tc->base.stream_uploader
20:43 zmike: or at least that's my understanding and how I handled e.g., buffer mapping
20:43 alyssa: has ^^ problem for SHAREABLE_SHADERS
20:44 Kayden: aha
20:44 Kayden: I thought there might be an uploader for the tc thread
20:44 zmike: yea
20:44 zmike: it's a bit mind-melting but I'd suggest reading through si_buffer_transfer_map()
20:44 alyssa: supposes she could put an uploader on the screen
20:45 Kayden: I do need to use an uploader with a special flag set
20:45 Kayden: (which forces memory allocations to a certain address space)
20:46 Kayden: alyssa: but the uploader needs transfer maps, which have a context :/
20:46 zmike: I'd guess you could probably look into adding another uploader to tc or add flags to the tc uploader?
20:46 zmike: https://gitlab.freedesktop.org/zmike/mesa/-/blob/zink-wip/src/gallium/drivers/zink/zink_resource.c#L768 another reference for async stream upload
20:47 alyssa: Kayden: bah :(
20:48 Kayden: ah
20:49 Kayden: so I can use TC_TRANSFER_MAP_THREADED_UNSYNC to detect when we're in the other thread, and I can just make separate uploaders for each one
20:49 zmike: yeah for transfer maps
20:51 zmike: Kayden: I wrote a post with some possibly-useful tips for tc integration a long while back https://www.supergoodcode.com/last-day/
20:51 zmike: might be useful
20:53 Kayden: thanks!
20:58 airlied: Lyude: that patch is in drm-fixes now
21:04 anholt: heads up: fixed the TF api_errors test so that Mesa drivers wouldn't be so failing at it. https://gerrit.khronos.org/c/vk-gl-cts/+/6926
21:16 jstultz: danvet: regarding mmaps, I don't believe so.. Suren brought up the PFNMAP thing due to an accounting concern. I've not had time to dig into the details
21:16 jstultz: danvet: and i'm working up an email regarding a meeting
21:17 danvet: jstultz, yeah but there's a lot more to pfnmap than just accounting
21:18 danvet: just typed an rfc with a few people on it
21:18 Lyude: airlied: thanks for letting me know!
21:20 danvet: jstultz, if we go cgroups for this entire ball of topics then probably need also some amd
21:20 danvet: and the cgroups people from intel
21:20 danvet: and maybe tejun ideally
21:20 danvet: so like lpc topic at that point
21:21 danvet: maybe even throw in glisse for ZONE_DEVICE account
21:21 danvet: to make sure we don't create bad overlap
21:22 jstultz: danvet: Maybe lets start with you and the android folks? I'm not sure I've got the bandwidth to put together an extra conference here :)
21:22 danvet: not clear how that stuff really is accounted
21:22 danvet: jstultz, yeah sure :-)
21:22 jstultz: (but its a good start for an LPC topic, though hopefully we don't have to put it off that long)
21:23 danvet: jstultz, you're an optimist
21:23 danvet: cgroups for gpus has been discussed for 2 years already
21:23 danvet: with patches
21:24 jstultz: I'm not saying it gets solved by then, but just not putting the discussion off that far.
21:24 karolherbst: don't you need like 2019+ GPUs in order to guarantee anything anyway? :p
21:25 danvet: jstultz, oh sure
21:25 danvet: and it probably requires a lot of discussions with lots of folks to get somewhere
21:27 danvet: karolherbst, scheduling maybe, memory account doesn't need anything really
21:28 karolherbst: ahh, true
21:41 Lyude: hey btw - if any folks here are interested: https://lore.kernel.org/dri-devel/CAHUNapTB1tt6T931LfBWVWreXGFwd6tTPqH58i7s3WKivCDT4g@mail.gmail.com/ if anyone has ideas for shorter GSoC projects for this year, we're still looking for responses to the emails we sent out about this
21:58 raster: damn..... so is there any good reason for sometimes the timestamp for vblank interrupts on intel gfx ... going backwards between 2 frames?
21:59 Kayden: yikes
22:00 ickle: sure, it means you have an invite to Stephen Hawking's birthday party
22:00 raster: this has taken me ages to identify... like randomly, sometimes... all rendering would freeze
22:01 imirkin: raster: presumably you don't mean wrap-around?
22:01 ickle: they use ktime_t / monotonic clock, sampled on the cpu at the interrupt
22:01 ickle: so either we put the wrong timestamp on the wrong event, or the event is out of order
22:01 raster: i never could figure out why... until i finally caught it. the clock went backwards between one frame and the next - not gettimeofday clock - the timestamp in the vblank event which is what we were keying off for animation. it SHOULD be monotonic and SHOULD only go forwards... the code we had would fall into a logic hole if it ever went backwards... which was never expected. :)
22:02 raster: imirkin: no. not wrap. :)
22:02 raster: like my debug now tells me when this happens
22:02 raster: EEEEEEK! drm time went backwards! 5695.214712 -> 5695.198201
22:02 imirkin: raster: is this one of those older amd systems with unsync'd tsc's?
22:02 ickle: we have plenty of oops, this is one frame later than expected
22:02 raster: and EEEEEEK! drm time went backwards! 5519.817667 -> 5519.801114
22:03 raster: and EEEEEEK! drm time went backwards! 4521.013988 -> 4520.997437
22:03 ickle: tscs are not used for ktime_t
22:03 raster: imirkin: nope. not old.
22:03 ickle: some of the newer art may be trusted enough
22:03 raster: i7-8550U so pretty new
22:03 imirkin: and not amd :)
22:03 raster: amd is just fine
22:04 raster: it only goes forwards because it can't find reverse :)
22:04 imirkin: i just remember their first dual-core opteron's had unsync'd tsc's which caused all manner of trouble
22:04 imirkin: this was ... sigh, like 15y ago now
22:04 raster: this is intel gfx. so... hmmm. i never would have expected this.
22:05 imirkin: duh. didn't make the connection. so yeah, nevermind :)
22:05 ickle: I don't think we have any "EEEEEK" message in IGT, so searching the history for a backwards timestamp would take a bit of effort
22:05 raster: i can't think of a good reason for this to be wrong :| it's a very very very small jump backwards each time, but enough to mess up my logic to have a frame delta of < 0 :)
22:06 raster: ickle: no no - thats my debug. :)
22:06 ickle: I know
22:06 ickle: I'm lamenting we don't have an obvious thing to grep for
22:06 ickle: are you looking at the drm_event timestamp or the fence timestamp?
22:07 ickle: I think the former is an estimate of when we think the vblank actually occurred, and the latter a sampling of when we signaled the fence
22:07 raster: yeah. well first i am just floating out there the general q of "any good reason for this to not be a bug in the kernel" ... or well even a hw counter bug?
22:08 ickle: still both should be 1 frame +- 2us apart
22:08 raster: ickle: this is sec and usec passed to the vblank event handler by libdrm
22:09 raster: i have found the timestamp passed to generally be highly accurate...
22:10 ickle: we test that variation is within about 1 microsecond; once upon a time drm had a big banner proclaiming how accurate its vblank timestamping was
22:11 ickle: and lots of tests that try to make sure each frame advances by exactly one vblank
22:11 ickle: but I still can't think of any that clearly shout if it goes backwards
22:11 raster: it is pretty good... except SOMETIMES... it jumps in its delorean and takes a micro-leap back at 83mph :)
22:12 ickle: raster: see if you can catch vsyrjala
22:22 Lyude: curious, do you know if the hw vblank counter -stays- 2 frames behind after it jumps back?
22:22 Lyude: (I've never hit an issue like this, but that sounds like it'd be interesting to look at)
22:23 raster: Lyude: i do not know that....
22:24 Lyude: figuring that out might let you at least eliminate either the hw vblank counter or everything else from the possibilities here
22:24 raster: the problem is i turn vblank on/off a lot so it's not always ticking so timestamp_last_frame can easily be much earlier than current frame timestamp
22:24 raster: i can track it with the frame counter - i just don't have logic to dump that.
22:25 Lyude: I know there's some vblank logic for calculating the number of "missed" vblanks since vblanks were last enabled (maybe that's what you meant by your last message), I wonder how accurate that is to the expected values from intel's hw counters
22:25 raster: i'm still shocked from the timestamp going backwards :) this explains maybe a year or 2 worth of random user reports of "it all freezes" and where i saw it every now and again...
22:26 Lyude: this also kinda brings up the question of whether this happens if you just leave the vblank counter always enabled
22:26 raster: well i'm beginning to suspect this is either a hw bug that just somehow never was caught because nothing adverse happened until i wrote code to rely on it :)
22:27 raster: or.. it's some counter mis-read like reading top/bottom half registers separately or ... something... but i havent looked.
22:27 raster: and yeah - it may also have to do with us always scheduling a single tick each time and not having it just "on forever"
22:28 Lyude: yeah - i'm kinda leaning towards hw bug here as well, I don't think it's super likely we'd jump backwards anywhere else (unless at some point when we're re-enabling vblanks, if we were to decide to use the estimated vblank counter rather than the actual hw counter. I don't think there's anywhere we do that in drm_vblank.c though)
22:28 Lyude: at least not for drivers that actually let you read the current timestamp
22:29 raster: well i'm going to try this on my other intel laptop to see if it produces EEEEKS too... i have not seen a single EEEK from amd so far.
22:30 Lyude: raster: if it does end up being a hw bug though, at least we might be able to workaround it by just using the estimated vblank counter to offset the hw counter. assuming it stays 2 frames behind anyway
22:34 raster: Lyude: well i've put a workaround up in userspace now and i am now freeze-free even though i'm seeing my EEEKs.
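raster does not say what the userspace workaround looks like. One way to guard animation code against a vblank timestamp that moves backwards is sketched below, assuming the sec/usec pair from the libdrm vblank event handler is fed in each frame; the `frame_clock` struct and `frame_clock_advance()` are hypothetical names made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-output state; names are illustrative only. */
struct frame_clock {
    bool     have_last; /* false until the first timestamp is seen */
    uint64_t last_us;   /* last accepted vblank timestamp, microseconds */
};

/*
 * Turn the sec/usec pair of a vblank event into a frame delta in
 * microseconds. If the timestamp went backwards, report a zero delta,
 * keep the previously accepted timestamp and set *went_backwards, so
 * the caller never sees a negative frame time.
 */
static uint64_t frame_clock_advance(struct frame_clock *fc,
                                    unsigned int sec, unsigned int usec,
                                    bool *went_backwards)
{
    uint64_t now_us = (uint64_t)sec * 1000000u + usec;
    uint64_t delta;

    *went_backwards = false;

    if (!fc->have_last) {
        fc->have_last = true;
        fc->last_us = now_us;
        return 0;
    }

    if (now_us < fc->last_us) {
        *went_backwards = true; /* e.g. 5695.214712 -> 5695.198201 */
        return 0;
    }

    delta = now_us - fc->last_us;
    fc->last_us = now_us;
    return delta;
}
```

Keeping the older, larger timestamp after a backwards jump means the next good event still produces a sensible positive delta, and the flag lets the caller log the occurrence.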
22:57 vsyrjala: if the hw frame counter would misbehave then the vblank seq number would jump around. shouldn't really affect the timestamp
22:57 vsyrjala: two ways the timestamp could go bad: bad clock_monotonic, or something went horribly wrong with the scanoutpos based timestamp correction we do
23:04 vsyrjala: hmm. could be a psr related fail. in particular drm_vblank_restore() doesn't seem to have any qualms about overwriting the previous ts for the same seq number
23:04 vsyrjala: so if you ask for it twice you could get different answers each time
23:07 vsyrjala: now i wonder if we should skip if diff==0, or if we should just make diff>=1 always
23:08 cmarcelo: jenatali: in the Initializer MR, Output is not used for OpenCL right? so I will amend the patch to include it just for OpenGL/Vulkan before landing it.
23:09 jenatali: cmarcelo: Don't think output's used for CL, no
23:10 jenatali: cmarcelo: What's the problem with output?
23:11 cmarcelo: jenatali: just one of the fail text strings was mentioning it as valid for OpenCL.
23:11 jenatali: Ahhh
23:14 raster: vsyrjala: it's going to be fun figuring it out.
23:17 cmarcelo: jenatali: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8820/diffs?commit_id=09a3fbc5e23af62fefda5a0c5598162dff81d46d
23:17 vsyrjala: i guess we could go with diff==0 -> throw it away. that's what we do for the irq handler
23:18 jenatali: cmarcelo: Fixup looks good, thanks!
23:19 vsyrjala: hmm. except then we don't update our last sampled hw frame counter value
23:20 raster: vsyrjala: well i need to do some chasing and see what frame # says and if there is a jump after that
23:21 vsyrjala: any idea if it was limited to laptops with psr panels?
23:22 raster: vsyrjala: well i'm going to test on another older i5 laptop i have and see if it repeats
23:22 raster: but i have no desktops with intel gfx active
23:24 Lyude: is it likely for one kasan error to prevent a different kasan error from being seen? like - will kasan stop checking for memory errors after it detects one?
23:24 ickle: no
23:24 Lyude: awesome
23:24 ickle: but it may change the values seen by the code and so change the flow
23:27 vsyrjala: raster: https://paste.debian.net/1183916/ is what i'm thinking for a fix
23:28 ickle: potentiall_y_ observe t_w_o
23:28 raster: that implies frame count will be the same in these cases
23:28 raster: but ... my code skips pframe == frame
23:30 vsyrjala: ah
23:30 raster: i learned that trick when u have > 1 screen :)
23:31 vsyrjala: i guess then we're back to the clock_monotonic is broke and/or we miscorrect the timestamp theories
23:32 raster: at least i know this laptop produces this regularly enough
23:32 raster: https://termbin.com/6bum
23:34 vsyrjala: we do have a drm.debug=0x20 knob for vblank debugs which in theory should spew some useful information to the log. but it's a bit dangerous since it prints a bunch of stuff @60hz (or whatever), and so combined with a slow console it may effectively kill the box
23:34 vsyrjala: should probably think about converting it all to tracepoints
23:35 raster: vsyrjala: yeah. that is not going to be too helpful
23:36 raster: the noise... as it happens often enough .... but not that often as my logs will be pretty full :)
23:36 raster: what i'm thinking is add more "time went backwards" checks and stuffing them now into the kernel to only dump when a delorean-occasion happened
23:42 vsyrjala: sounds like a decent plan
23:43 emersion: can't you dump dmesg when you get the EEEK?
23:43 emersion: dmesg is a ring buffer
23:44 raster: i guess it'll have enough buffer still
23:45 emersion: the buffer size is configurable if you miss it
23:46 vsyrjala: someone should send a patch to bump it 4M by default so i wouldn't have to ask for that in every bug report
23:47 raster: gehehehe
23:49 imirkin: vsyrjala: but what about N64??
23:51 vsyrjala: how much can that afford? 4k?
23:52 imirkin: -4M would be ideal
23:52 imirkin: then it'd have 8M total ;)
23:53 vsyrjala: that patch i want to see
23:53 imirkin: they had a memory expander in johnny mnemonic...