00:08 robclark: jenatali: iirc there was some idea about adding "debug instructions" in NIR.. but jekstrand is maybe more up to date on that
00:08 robclark: mareko: btw, why isn't tc_sampler_view_destroy() deferred to driver thread?
00:09 jekstrand: robclark: Uh... Someone sent patches many moons ago. Maybe dj-death or scottph?
00:09 jekstrand: robclark: It was only in RFC shape, I think.
00:09 jekstrand: jenatali: ^^
00:09 robclark: ahh, ok.. I was assuming I was out of date on that ;-)
00:10 jekstrand: Nope, not at all.
00:10 jenatali: jekstrand: I don't see anything on gitlab with a quick search - by "many moons" I'm guessing that means mailing list before gitlab?
00:10 jekstrand: I've given zero thought to how we keep the optimizer from destroying the data.
00:10 imirkin: robclark: i wrote up a spec about adding debug ops to help RE
00:10 jekstrand: jenatali: No, it was gitlab
00:10 imirkin: but never got around to implementing it
00:11 jenatali: Ah nvm, found it: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3971
00:11 airlied: bleh llvmpipe holds a reference to constant buffers, and the only way to get rid of it seems to be to execute a dummy draw
00:11 robclark: imirkin: yeah.. that is a bit diff (ie. not debug meta-data).. but would be useful.. tho we have a shader replacement mechanism now which can kinda serve same/similar purpose
00:12 imirkin: robclark: yeah, there are other ways of achieving it
00:13 imirkin: but it can also be nice if you have a bunch of unknown bits
00:13 robclark: agreed
00:13 imirkin: nvidia was the target... there's a XMAD op which is ... extra mad
00:13 imirkin: it has like 100 diff modes
00:14 robclark: x(tra)mad?
00:15 imirkin: exactly
00:16 robclark: that might be a funnier instruction name than some that we have... umad braa? :-P
00:16 imirkin: ;)
00:46 mareko: robclark: I don't know, but it could be done in the driver thread
00:50 HdkR: imirkin: But XMAD is gone now. Nvidia no longer xtra mad :P
00:50 imirkin: now they're just half-mad (HMAD)
00:50 HdkR: hah
00:50 imirkin: (that's probably not the op... o well)
00:52 mareko: they seem pretty mad to me, considering the "should your editorial direction change"
00:53 robclark: mareko: ok.. if it isn't a problem to kick that to driver thread, then that is one less place I'll need a way to enqueue driver private callback
00:53 airlied: ppc64le is much more friendly with eieio
00:54 robclark: (I end up needing a hash table to map the set of individual sampler views to a single driver state obj.. it is one area where gallium API doesn't fit the hw very well for me)
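A minimal sketch of the kind of mapping robclark describes: a hash table keyed by the set of bound sampler views, returning one combined state object. All names here (`SamplerStateCache`, `get`, `view_destroyed`, integer view IDs) are hypothetical and for illustration only; this is not the actual driver code. It also shows why destroying a view needs a hook into the driver: any cached entry that references the view has to be dropped.

```python
class SamplerStateCache:
    """Hypothetical cache: tuple of sampler-view IDs -> one combined state object."""

    def __init__(self):
        self._by_views = {}   # key: tuple of view IDs, value: combined state
        self.created = 0      # how many combined state objects were built

    def get(self, views):
        """Return the combined state for this exact sequence of views."""
        key = tuple(views)
        state = self._by_views.get(key)
        if state is None:
            # Stand-in for building the real hardware state object.
            state = {"views": key}
            self._by_views[key] = state
            self.created += 1
        return state

    def view_destroyed(self, view):
        """Drop every cached entry that references the destroyed view."""
        stale = [key for key in self._by_views if view in key]
        for key in stale:
            del self._by_views[key]
        return len(stale)


cache = SamplerStateCache()
first = cache.get([1, 2])
second = cache.get([1, 2])   # same set of views, so the cached object is reused
assert first is second
```

Keying on the whole tuple means binding the same views again is a single lookup, at the cost of having to invalidate entries when any one view goes away.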
00:54 robclark: yeah, eieio is a pretty great instruction name
00:56 imirkin: airlied: i hope you feel similarly about the ppc simd opcode mnemonics...
09:01 MrCooper: jenatali kusma: looks like the spec/arb_timer_query/timestamp-get piglit test may be flaky on Windows: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/7239146
12:43 daniels: ajax: TIL pstack
12:47 jadahl: daniels: wait until you hear about gstack
13:25 zmike: MrCooper: it's a little flaky on zink too
13:25 zmike: the threshold is very small
13:33 MrCooper: jadahl: wonder why gstack isn't in gdb upstream
14:06 j4ni: bb52cb0dec8d ("drm/ttm: make the pool shrinker lock a mutex") in upstream and drm-misc-fixes creates a silent conflict with ba051901d10f ("drm/ttm: add a debugfs file for the global page pools") in drm-misc-next, causing the latter to use spin_lock/unlock on a mutex.
14:07 j4ni: mripard, mlankhorst_, tzimmermann, danvet: ^
14:10 jadahl: MrCooper: interesting. here pstack is just a symlink to gstack
14:16 baryluk: gstack is literally just a simple call to gdb with `thread apply all bt`. that is all basically.
14:36 tzimmermann: j4ni, maybe mention it to christian könig
14:36 tzimmermann: he sent out an email about a ttm conflict this morning
14:37 tzimmermann: cc'ed you
14:55 tzimmermann: danvet, did you see my reply to the simplekms thread? I'd like to make a simple form of these readout helpers
14:56 danvet: tzimmermann, sry I'm a bit behind on mails and well everything :-/
14:56 tzimmermann: danvet, ok. don't worry
15:05 danvet: tzimmermann, quickly looked, if you're looking at state recovery helpers I thought I discussed them with mripard or maybe pinchartl
15:05 danvet: wrt the fastboot problem and simplekms ... this looks tough
15:06 danvet: maybe chat with hans de goede, iirc he's the one at rh who's done a lot of the fastboot stuff
15:08 tzimmermann: danvet, short story: BOs are allocated by clients. and fbdev has its own code and drm file to do this. reading out the plane/crtc state can be done easily. but the FB BOs are owned by a client. so fbdev has to read out the kms state. and that requires driver-independent interfaces
15:09 tzimmermann: i can chat up hdegoede and mripard. i'd like to make a simple version of this for the needs of simplekms
15:21 mripard: danvet: we discussed it yeah, but I never got the time to actually start working on it
15:35 danvet: mripard, can you still find the discussions for tzimmermann?
15:35 danvet: or did we do a todo.rst?
15:45 mripard: https://lore.kernel.org/dri-devel/CAKMK7uHtqHy_oz4W7F+hmp9iqp7W5Ra8CxPvJ=9BwmvfU-O0gg@mail.gmail.com/
16:18 tzimmermann: mripard, thanks. i've seen the mail
16:18 tzimmermann: i don't think i could solve all the i915 corner cases
16:19 tzimmermann: but maybe start with something that is useful for a simple driver with fbdev
17:11 zmike: maybe a dumb question, but would it be possible to avoid running all the hw-specific ci builds for doc changes?
17:12 zmike: or is it that ci gets run any time any file in the ci directory changes?
17:12 ajax: we mostly do that, see https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8701
17:12 zmike: hm
17:13 zmike: I'm just wondering because my zink version bump triggered a full pipeline
17:13 zmike: which feels...maybe like overkill
17:13 ajax: but that change _only_ touches docs/, so yeah, i'd expect touching .gitlab-ci* would kick everything
17:13 zmike: I see
17:13 ajax: we could probably granulate that finer if we wanted
17:13 zmike: I think that would be good, at least for zink, since changing zink's ci files should never affect another driver
17:13 zmike: and we'll be seeing a ton of changes there in the near future
17:28 MrCooper: zmike: the problem is it's not easy to share some parts of the path filters between different jobs, so taking what you want to the extreme would result in each job having its own full list of every path that affects it, which isn't really worth it; it's a fine balance
17:41 jekstrand: Does LLVMpipe display stuff on Windows?
17:41 jekstrand: Or is it headless somehow?
17:41 zmike: hm I guess I was more just hoping to slightly reduce the jobs needed for zink (and ci overall) since I'm anticipating that almost every MR I do from now on will require changes there
17:42 ajax: jekstrand: we have a wgl target
17:42 ajax: i'm _fairly_ sure it knows how to display to an HWND
17:42 zmike: makes sense though MrCooper, thanks for the info 👍
17:43 jekstrand: ajax: Ok, then someone could probably copy+paste some code into a Vulkan WSI back-end if they felt so inclined.
17:43 * jekstrand does not feel so inclined.
17:44 hch12907: yeah, llvmpipe does display stuff on win32
17:44 hch12907: it's affected by
17:44 hch12907: #4226
17:44 hch12907: though
17:45 MrCooper: zmike: one thing you could do is adapt 6d2afe1c8325 "ci: Move out expect files from .gitlab-ci" to the piglit jobs
17:45 zmike: ooh will have to take a look at that when I get home, thanks!
17:46 MrCooper: actually I think that should be pretty much what you asked for
17:46 daniels: jekstrand: ajax is right - that's how we bootstrapped GLon12 originally
17:46 jekstrand: daniels: neat
17:47 daniels: slow, but neat
17:47 jekstrand: Well, sure. But we're talking about strapping it on to LLVMpipe so....
17:47 jekstrand: Slow is sort-of an assumption here.
17:49 jenatali: jekstrand: The wgl winsys stuff is part of the wgl gallium frontend
17:51 jekstrand: jenatali: Sure. I didn't think it could be used directly, but it gives an enterprising person something to copy+paste.
17:51 jenatali: Yep, agreed
17:51 jekstrand: And, since lavapipe lives in gallium, maybe it could actually use it? I don't know.
17:53 jenatali: I... doubt that
17:53 jenatali: Actually displaying stuff in a window should be pretty simple
18:05 jenatali: jekstrand: Were you actually interested in trying to hook it up?
18:06 jekstrand: jenatali: Not at all. :) But someone seems to care about lavapipe on Windows:
18:06 jekstrand: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7208
18:07 jenatali: Yeah, I've been following along. Seems close
18:09 jenatali: Unrelated - if I've got a multi-commit fix I'd like to backport without an explicit Fixes tag, is the right thing to add a Cc tag to all of them?
18:09 dcbaker: jenatali: yes
18:09 jenatali: Eh I guess I could actually pick a commit that they can be considered to fix
18:10 jekstrand: Yeah, sometimes you just can't pick a good Fixes: commit
18:13 vsyrjala: Fixes: all of them
18:14 vsyrjala: maybe only applicable to git rm *
18:15 ajax: every bug exists in a file, after all
18:15 ajax: so if there are no files....
18:16 ccr: !
18:17 jenatali: https://memegenerator.net/img/instances/76877838.jpg
18:17 zmike: good meme
18:26 ajax: jenatali: i always liked this phrasing: http://vladimiruspensky.com/assets/2015-08-02-development-tips/deleted.jpg
18:26 HdkR: Tested it all the way to the trash bin
18:26 jenatali: Heh, it's also the fastest code
18:27 ajax: code you don't run has some wonderful properties
18:29 dcbaker: it also takes 0 CPU time, and saves power, thereby reducing climate change. therefore deleting code == good environment stewardship
18:31 jekstrand: Why does anyone ever write code? What is it all for? What are we doing with our lives? Why are we all here?
18:31 jekstrand: Oh, right, it's to mine bitcoin.
18:31 zmike: hahah
18:31 dcbaker: btw, bitcoin mining consumed more power than Argentina last year
18:31 jekstrand: That's depressing
18:32 HdkR: I saw it was causing power outages in Iran as well :/
18:37 jekstrand: So here's a fun question: If I wanted to represent a load_const that takes on a different value per-subgroup-invocation, any ideas for what that should look like?
18:37 jekstrand: For the NIR peeps
18:37 jenatali: Isn't that just a system value?
18:39 jekstrand: It's like gl_SubgroupInvocation only shuffled around a bit.
18:39 jekstrand: Actually.... This doesn't even need an intrinsic.
18:40 jekstrand: In this case, I could represent the whole thing as (64bit_imm >> (gl_LocalInvocationId * 8) & 0xff) if I really wanted to.
18:40 jekstrand: Maybe what I want is a small const array to shift optimization.
18:40 jekstrand: We can do something more efficient in our back-end than the shift but what I really care about is getting rid of the memory load.
18:41 jekstrand: The array looks like
18:41 jekstrand: __constant const uint shuffleE[8] = {2, 3, 0, 1, 6, 7, 4, 5};
18:41 jekstrand: There are 11 of them in this kernel.
18:43 pendingchaos: shuffleE[i] = i ^ 2
18:43 jekstrand: Or I can just re-write the kernel to do a shift. I have that option today. :)
18:44 jekstrand: pendingchaos: Dang! They're all i ^ N for some N, I think. I should just fix the kernel. :)
18:46 jekstrand: No, they're not all i ^ N
18:46 jekstrand: __constant const uint shuffleG[8] = {0, 2, 1, 3, 5, 4, 7, 6};
18:46 jekstrand: That one flips some bits
18:47 jekstrand: On our HW, this can all be an immediate. We have these SIMD vector immediates that evaluate to a unique 4-bit value per lane.
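The two ideas above can be checked with a short sketch: packing an 8-entry table into a single 64-bit immediate and reading it back with a shift and mask, and testing whether a table is just `i ^ N`. The helper names (`pack_table`, `lookup`, `xor_constant`) are made up for illustration, and Python stands in for what would really be a NIR optimization; only the two arrays are taken from the discussion.

```python
# Lookup tables quoted in the chat.
SHUFFLE_E = [2, 3, 0, 1, 6, 7, 4, 5]
SHUFFLE_G = [0, 2, 1, 3, 5, 4, 7, 6]


def pack_table(table):
    """Pack an 8-entry table of byte-sized values into one 64-bit immediate."""
    imm = 0
    for lane, value in enumerate(table):
        assert 0 <= value <= 0xFF
        imm |= value << (lane * 8)   # entry `lane` lives in byte `lane`
    return imm


def lookup(imm, lane):
    """Recover entry `lane` with a shift and a mask instead of a memory load."""
    return (imm >> (lane * 8)) & 0xFF


def xor_constant(table):
    """Return N if table[i] == i ^ N for every i, otherwise None."""
    n = table[0]   # for i = 0, 0 ^ N == N
    if all(value == (i ^ n) for i, value in enumerate(table)):
        return n
    return None


imm_e = pack_table(SHUFFLE_E)
assert [lookup(imm_e, i) for i in range(8)] == SHUFFLE_E

print(xor_constant(SHUFFLE_E))   # 2, matching shuffleE[i] = i ^ 2
print(xor_constant(SHUFFLE_G))   # None: shuffleG is not a single XOR
```

shuffleG fails the XOR test at the first two entries: entry 0 maps to 0, which forces N = 0, but entry 1 maps to 2. The shift-and-mask form still works for it, since that form makes no assumption about the table's contents.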
19:01 airlied: jekstrand: I did try building mesa in a windows vm locally but I gave up around flex/bison install, I got tired of manipulating my path variable
19:02 jekstrand: airlied: You shouldn't need flex/bison for lavapipe.
19:02 dcbaker: we should probably move the flex and bison checks into the glsl compiler code
19:04 jekstrand: But, also, lavapipe depends on all of gallium so....
19:05 airlied: jekstrand: yeah shouldn't and me giving a crap weren't aligned that day :-P
19:05 jekstrand: airlied: :)
19:48 jenatali: airlied: Feel free to ping me if you try again at some point and want help :P
19:48 jenatali: Regarding flex/bison, one of our folks looking at the generic spirv->dxil conversion stuff mentioned he was looking at making that more optional
19:49 airlied: jenatali: I'm afraid to ask if there's a better way than having to set the longest PATH ever
19:49 jenatali: airlied: I don't think so, I think PATH is right - but trust me, you're not hitting the longest PATH ever just from those two
19:50 jenatali: My home PC's got 19 entries and I don't even really use that for development
19:50 airlied: jenatali: oh it was already long before those, it just was getting longer and longer :-P
19:51 airlied: I think being in a VM didn't help, maybe I should repurpose a real machine :-P
19:59 jenatali: airlied: Only 42 paths on my work PC, I would've expected more :P
20:14 zmike: MrCooper: I had a look at that commit, but it looks like that's just for deqp? or will piglit tests work too?
21:42 austriancoder: imirkin: is there a reason nouveau does not use u_pipe_screen_get_param_defaults(..)?
21:43 imirkin: i thought it did
21:43 imirkin: austriancoder: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c#n434
21:44 austriancoder: imirkin: my fault.. I thought I have not seen it
22:38 mareko: robclark: I'm going to skip the sync for create_stream_output_target for manhattan
22:39 mareko: Kayden: ^^
22:39 mareko: zmike: ^^
22:39 robclark: still called on driver thread?
22:39 mareko: drivers might need changes
22:39 mareko: robclark: it syncs and is called in the frontend thread, but I'll remove the sync
22:40 zmike: mareko: I was wondering if you'd ever do something about that
22:40 robclark: that looks largely ok, looks like I only need pctx->screen
22:41 Kayden: mareko: I don't use the context in create_stream_output_target, on the branch that uses u_threaded_context
22:41 Kayden: so I think it should be fine
22:43 zmike: same for me
22:43 zmike: 👍
22:44 mareko: I also have 16-bit varyings using new slots VARYING_SLOT_VAR(0..15)_16BIT, which have 2 packed varyings per slot
22:46 mareko: I just need FP16 GLSL uniforms and my pixel shader codegen will be almost perfect
22:46 imirkin: and you can finally retire ;)
22:47 mareko: no rest for the wicked
22:48 robclark: mareko: did that also change the linking code to not pack 32b and 16b varying together in a single vec4? That is one annoying point that I've not got around to fixing yet
22:50 imirkin: up next: 8-bit floats
22:52 mareko: robclark: a simple change in cmp_varying_component should do that
22:53 robclark: (tbf it is actually a problem after I get "vectorish" mode working, which has been languishing on a branch)
23:01 mareko: robclark: the linker only needs to group mediump varyings together, then the driver can choose to remap them to my new slots, which will keep them packed but decrease the slot usage by 50%
23:03 robclark: slot usage doesn't actually really matter for us in some sense.. they get re-packed anyways (pre-dating the nir linking).. it's really just about avoiding an f2f16 for a subset of the components of a single varying fetch (because that can't be folded into a single "vectorish" bary.f)
23:04 mareko: yes, same here
23:05 mareko: the slot remapping gives me packed slots, which you don't need
23:12 mareko: manhattan calls generic_nop, which is slowing us down by 2% due to _mesa_error lol
23:23 robclark: I suppose manhattan is a bit more cpu limited on a big vram gpu
23:24 jekstrand: mareko: That's just fantastic