00:19imirkin: anholt: let me know how the nouveau shim goes if you decide to use it
00:20anholt: imirkin: definitely looking forward to trying it!
00:20anholt: hitting some more cleanups of the gtn-ntt branch
00:20imirkin: anholt: to get maximum feature coverage, use like a chipset=124 override. but there's various generations in between which are variously different. i tried to cover it in the README but i didn't want to write a long book :)
00:21anholt: loved seeing a good readme like that
00:46alyssa: Long books are nice
00:46alyssa: Although perhaps unwieldy in print.
02:19imirkin: gitlab having issues?
02:21jenatali: Looks that way to me
08:49danvet: airlied, I have some core mm patches that jason gunthorpe suggested I send as a topic branch directly to linus next merge window
08:49danvet: ok if I stuff that into drm.git so I can use dim and all?
08:49danvet: also need to ping sfr to include it
08:53airlied: danvet: sounds good
09:07MrCooper: HdkR: you can get ops from ChanServ anytime
09:14HdkR: MrCooper: Aye, I just didn't deop from last time
10:27daniels: imirkin, jenatali: can you be more specific? what's failing?
12:27karolherbst: ehhh... mtx_timedlock is broken :/
12:31karolherbst: or well.. the two implementations differ in their behaviour
12:31karolherbst: I think.. :D
14:32jenatali: daniels: MR pages were only partially loading. It recovered
14:34danvet: emersion, I think chen doesn't have commit rights, so feel like also pushing that little fix?
14:36emersion: danvet: sure
14:41alyssa: who's the right cc for a mesa/st patch?
14:41alyssa: (https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8300 -- last thing we're blocking on for GL3.1)
14:43daniels: jenatali: given the time of day, it may have been a window for a kernel upgrade
14:46danvet: emersion, thx
14:46danvet: emersion, for next time around https://people.freedesktop.org/~seanpaul/whomisc.html
14:46emersion: oh, nice
14:51tzimmermann: danvet, since we're not taking the dma_resv lock within shmem; could we go through all drivers one-by-one and replace the local lock with the resv lock semi-mechanically? this way we can tackle drivers individually
14:59danvet: tzimmermann, yeah I think that's the only feasible approach
14:59danvet: the problem is, it's not mechanical in many cases, since they do use dma_resv already for some things
14:59danvet: just not as the general per-obj lock
15:00danvet: but for any driver that doesn't use dma_resv anywhere it should be mechanical replacement
15:00danvet: also once we have all callers of a given function converted, we could add dma_resv_assert_lock already
15:01danvet: with a comment saying it's just to make sure the partial conversion isn't broken, not that it does protect actual state yet
15:01tzimmermann: danvet, that's quite a bit of work. but i think i'd prefer that over the dma-buf-passthrough.
15:01danvet: I'd guesstimate it's about half a year of full time work to convert all shmem users over
15:02tzimmermann: lol, damnit
15:02tzimmermann: i guess i'll put the vmap_local aside for now and take a look at the shmem drivers
15:03danvet: tzimmermann, imo do the pass-through
15:03danvet: it's not the prettiest, but it's fairly ok
15:03danvet: and it's what I've done for get/put_pages too iirc
15:03tzimmermann: it's papering over the issue ... :/
15:03danvet: maybe add a comment or todo.rst
15:03tzimmermann: i'll think about it
15:03danvet: nah, just trying to dissect the problem into incremental steps we can actually pull off
15:03danvet: without holding up features
15:04danvet: I mean the vmap_local stuff is meant to unblock some feature work
15:04danvet: so doing that + todo.rst entry with fairly detailed plan is plenty good enough
15:04danvet: then chip away at it over the next year as a side project
15:04danvet: or find a gsoc to do it for you :-)
15:05tzimmermann: unfortunately, that's probably the more realistic approach
15:05tzimmermann: still, not my favorite
15:06danvet: yeah, but pragmatic&gradual approaches have value in themselves
15:06danvet: as long as we have a clear enough idea where we want to head towards, to avoid pulling in different directions
15:48danvet: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function ‘amdgpu_device_resize_fb_bar’:
15:48danvet: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1109:6: warning: unused variable ‘space_needed’ [-Wunused-variable]
15:48danvet: 1109 | u64 space_needed = roundup_pow_of_two(adev->gmc.real_vram_size);
15:49danvet: agd5f, ^^ just spotted this one, not sure you have a fix already somewhere
16:22kisak: Friendly reminder that the mesa 21.0 branchpoint is scheduled for tomorrow. At least it's harder to swamp the CI and marge-bot these days.
16:44karolherbst: we overflow shader_info.system_values_read
16:45karolherbst: so we have more than 64 system values
16:45jenatali: karolherbst: In what context? Is that CL?
16:45karolherbst: jekstrand: what's the proper way of checking if a system value is actually used by the nir?
16:45karolherbst: jenatali: GL
16:46karolherbst: doing a for (uint8_t i = 0; i < SYSTEM_VALUE_MAX; ++i) if (!(nir->info.system_values_read & 1ull << i)) continue; loop
16:46karolherbst: but that doesn't work obviously
16:50jekstrand: karolherbst: system_values_read is the way but it doesn't have enough bits
16:51dj-death: we started reusing inputs/outputs
17:15karolherbst: jekstrand: ehhh :/
17:16karolherbst: jekstrand: so anything I can do to work around it?
17:17jekstrand: Make system_values_read a BITSET_WORD?
17:22alyssa: kisak: You haven't noticed the flurry of features getting flipped on?=)
17:22karolherbst: I am actually annoyed by the fact that nobody feels responsible for actually fixing it, but still adding more and more system values :/
17:25kisak: alyssa: am I allowed to lie and act all surprised?
17:26karolherbst: dj-death: what do you mean by using inputs/outputs?
17:28Kayden: we're using uint64_t for inputs/outputs these days, should probably do that for system values
17:29karolherbst: Kayden: we have more than 64 system values
17:29karolherbst: SYSTEM_VALUE_MAX is 75
17:29karolherbst: and it seems to have been this way since May
17:30karolherbst: and I only noticed because tgsi_get_sysval_semantic asserted
17:33dj-death: karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/compiler/shader_enums.h#L291
17:33karolherbst: dj-death: how is that even related?
17:34alyssa: kisak: sure
17:35dj-death: karolherbst: just mentioning we're running out of bits everywhere ;)
17:38karolherbst: dj-death: well.. then fix it instead of adding workarounds?
17:38karolherbst: anyway, system values are now just broken as it seems
17:40karolherbst: okay.. actually just for two months
17:40karolherbst: jekstrand: do you want to fix it as I am pretty sure your raytracing MR broke it or should somebody else do it?
17:50alyssa: Don't go blaming raytracing ;P
17:50hakzsam: dcbaker: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8412#note_759989 why did Marge fail again?
17:50HdkR: alyssa: Ray tracing solves all the problems. It "Just works"
17:51jekstrand: karolherbst: I don't think I totally broke it
17:51jekstrand: I tried pretty hard not to
17:51dcbaker: hakzsam: I think because it didn't need to rebase, so it assumes that the results it has are still valid
17:51karolherbst: jekstrand: right.. but that loop doesn't work anymore
17:51dcbaker: It's going to fail on freedreno anyway I think
17:52dcbaker: oh, maybe not
17:52karolherbst: so if there is a better way of knowing what system values are enabled, fine, if not.. then not
17:52karolherbst: anyway, I hit an assert
17:52dcbaker: hakzsam: I've got the CS:GO fix staging, and I'll get yours next
17:52hakzsam: dcbaker: great, thanks
17:53jekstrand: karolherbst: Who's doing that loop?
17:53karolherbst: nouveau does
17:54jekstrand: What are you checking for?
17:54karolherbst: because we need to know what system values are enabled
17:54jekstrand: In what way is it failing now? You should never see anything above 64 set
17:55karolherbst: jekstrand: wrap around
17:55jekstrand: Replacing it with a proper bitset will take surgery across all of Mesa. Fixing nouveau won't.
17:56jekstrand: On the upside, the surgery would be the type that compile fails every place you have to touch
17:57karolherbst: which I value as more important than the amount of time any fix needs
17:58karolherbst: but mhh.. maybe I could rework nouveau, but it's also not as trivial
17:58karolherbst: I could just ignore every value above 63 though
18:04karolherbst: jekstrand: seems like I will rework that bit for nouveau anyway as the current code is suboptimal and probably would be able to get rid of the loop
18:04jekstrand: But, yeah, we should make it a proper bitset eventually
18:04jekstrand: I'll stick that on my back-burner todo
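The wrap-around discussed above comes from shifting a 64-bit `1ull` by an index that can reach 74 (SYSTEM_VALUE_MAX is 75). A shift by 64 or more is undefined behaviour in C and C++, and on many CPUs the shift count is masked, so index 64 can alias bit 0. The following is a minimal sketch of the two options mentioned in the conversation: a multi-word bitset, and capping the loop at 63. All names (`SysvalSet`, `used_sysvals`, `used_sysvals_mask64`, `kSystemValueMax`) are hypothetical stand-ins, not Mesa's actual types or its `BITSET_WORD` helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SYSTEM_VALUE_MAX (75 per the discussion).
constexpr unsigned kSystemValueMax = 75;
constexpr unsigned kBitsPerWord = 32;
constexpr unsigned kWords = (kSystemValueMax + kBitsPerWord - 1) / kBitsPerWord;

// Multi-word bitset: one bit per system value, no 64-bit limit.
struct SysvalSet {
    uint32_t words[kWords] = {};

    void set(unsigned i) { words[i / kBitsPerWord] |= 1u << (i % kBitsPerWord); }
    bool test(unsigned i) const {
        return (words[i / kBitsPerWord] >> (i % kBitsPerWord)) & 1u;
    }
};

// Collect the indices of all system values marked as read.
std::vector<unsigned> used_sysvals(const SysvalSet &s) {
    std::vector<unsigned> out;
    for (unsigned i = 0; i < kSystemValueMax; ++i)
        if (s.test(i)) out.push_back(i);
    return out;
}

// Driver-side workaround: keep the 64-bit mask, but never shift by 64 or
// more, so values above 63 are simply ignored.
std::vector<unsigned> used_sysvals_mask64(uint64_t mask) {
    std::vector<unsigned> out;
    for (unsigned i = 0; i < kSystemValueMax && i < 64; ++i)
        if (mask & (UINT64_C(1) << i)) out.push_back(i);
    return out;
}
```

The bitset variant is the tree-wide change jekstrand describes; the capped loop is the kind of local fix a single driver could make in the meantime, at the cost of never seeing values 64 and up.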
18:27zackr: danvet: Hi, could you give me the submit access to drm-misc? I need to push some of our changes out so if we can't get access to drm-misc I'll have to go back to bothering airlied to pull from us directly
18:28danvet: zackr, do you have a fd.o shell account?
18:28zackr: danvet: zack on fdo
18:29danvet: mripard, ack for adding zack for vmwgfx stuff ^^ ?
18:29danvet: hm I thought we've also added roland, but seems to not have happened
18:29danvet: seanpaul, or is your whomisc not up to date anymore?
18:29seanpaul: danvet: mm, not sure, it should run each night
18:30seanpaul: i'll see if i can kick it
18:30danvet: zackr, happen to know whether roland has an fdo shell account too?
18:31danvet: seanpaul, I think it's all ok
18:31zackr: danvet: he should have. it should be "sroland"
18:31daniels: danvet: sroland?
18:31daniels: zackr: we are of one mind
18:31zackr: daniels: haha, one mind, two accents
18:32zackr: are you bundling?
18:32danvet: hm getent on gabe finds neither
18:32daniels: yeah, I've got a pretty strong hybrid going on
18:32daniels: danvet: they're both there, I can add them as soon as someone tells me yes
18:33danvet: waiting for some drm-misc maintainer ack for formality's sake
18:33daniels: only a very limited number of people are on gabe - annarchy and kemper have everyone
18:33danvet: zackr, tomorrow also good enough?
18:33zackr: danvet: sure, sounds great
18:34zackr: i'll get everything ready here then. i'll switch our internal ci to drm-tip tomorrow too then
20:47jekstrand: ngcortes: Can you get cwabbott set up on Intel CI, please?
21:05ngcortes: jekstrand, can do
21:36agd5f: danvet, yeah, Nirmoy is sending out a fix
22:12mripard: danvet: yep, ack-by: me
22:12danvet: mripard, also ack for roland from vmwgfx?
22:21mareko: I've been considering this idea I call "super draws", an evolution of multi draws. It's a multi draw where the following 3 variables vary: start, count, vertex buffers (but not their count). u_threaded_context would generate them from its batch buffers. I think this is the future of super low overhead gallium drivers.
22:26alyssa: How would the driver make use of that?
22:28zmike: I'm all for lowering overhead with tc 👍
22:28mareko: drivers would have a small inner-most loop that sets vertex buffer bindings and generates draws, skipping expensive draw time validation that slows drivers down
22:30mareko: draw time validation is a big obstacle if you want to execute hundreds of thousands or millions of draws per frame
22:31zmike: not to mention stuff like batch-based tracking for lifetimes
22:32alyssa: Fair enough
22:34mareko: it's not really a proposal or ask, it's an announcement that it might happen because only our driver uses u_threaded_context, so we have the freedom to change it however we want
22:35zmike: technically not entirely accurate since I'm using it, but that isn't in master so I guess it's not wrong either
22:35jekstrand: Kayden: ^^
22:35jekstrand: mareko: That doesn't seem crazy, TBH.
22:36imirkin: mareko: when you say "vertex buffers", do you really mean the buffers, or just the mappings of inputs to things in those buffers?
22:37jekstrand: mareko: We'd like to enable it in iris, we just haven't yet.
22:39mareko: imirkin: the most common pattern is set_vertex_buffers and draw_vbo interleaved; tc can combine consecutive draw_vbo calls into a multi draw with a constant draw ID (typically 0), this will just extend it by folding set_vertex_buffers in there
22:40imirkin: i see
22:40imirkin: makes sense
22:40jenatali: We'd like to use u_threaded_context too I think
22:41jekstrand: mareko: If we can have a tight loop and not re-validate all state, that'd probably help.
22:42jekstrand: Swapping out vertex buffers is simple enough
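As a rough illustration of the "super draw" idea described above, the sketch below models a batch in which only start, count and the vertex buffer bindings vary per entry, while the number of buffers stays fixed. Every name here (`VertexBuffer`, `SuperDrawEntry`, `CommandLog`, `super_draw`) is invented for illustration. None of it is the gallium or u_threaded_context interface, and the "command log" only counts what a real driver would emit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// All names are hypothetical; this is not the gallium API.
struct VertexBuffer {
    uint64_t gpu_address;
    uint32_t stride;
};

// One entry of a super draw: only start, count and the vertex buffer
// bindings differ between entries.
struct SuperDrawEntry {
    uint32_t start;
    uint32_t count;
    const VertexBuffer *buffers;  // num_buffers elements
};

// Stand-in for a command stream; records what would be emitted.
struct CommandLog {
    unsigned validations = 0;
    unsigned vb_binds = 0;
    unsigned draws = 0;
    uint64_t vertices = 0;
    uint64_t last_vb_address = 0;
};

void super_draw(CommandLog &cmd, unsigned num_buffers,
                const std::vector<SuperDrawEntry> &entries) {
    cmd.validations++;  // expensive state validation happens once per batch
    for (const SuperDrawEntry &e : entries) {
        // Tight inner loop: rebind vertex buffers, then emit the draw.
        for (unsigned i = 0; i < num_buffers; ++i) {
            cmd.last_vb_address = e.buffers[i].gpu_address;
            cmd.vb_binds++;
        }
        cmd.draws++;
        cmd.vertices += e.count;
    }
}
```

The point of the shape is that validation cost is paid once per batch rather than once per draw, which is the overhead mareko is describing.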
22:42jenatali: jekstrand: Are you still interested in using CL programs with global variables shared between kernels? I think !8133 probably broke that for you. I seem to recall you were looking at that functionality a while ago
22:44ngcortes: cwabbott, per jekstrand I've gone ahead and set you up on the mesa CI here: http://otc-mesa-ci.jf.intel.com/job/cwabbott/
22:45jekstrand: jenatali: It may have. It's not a shipping feature so I don't care that much.
22:45jenatali: Fair enough, just happened to see it
22:47alyssa: mareko: Giving it some thought I think that makes sense 👍
22:49ngcortes: cwabbott, anyway it looks like your jenkins job is set up to just test out vulkan. so of course that'll only run on platforms that support that API
22:49karolherbst: jenatali: btw, anything in particular I should take a look at, now that I am more or less back from PTO? :D
22:50jenatali: karolherbst: Nah, I don't have anything in flight at the moment
22:50jenatali: Eventually I'd like to get around to merging our clang invocation stuff with Clover, and moving our SPIR-V metadata parsing into vtn, but I don't have cycles for that atm
22:52karolherbst: yeah.. sounds like good ideas
22:57mareko: recently we started using a C++ template for our draw_vbo function with template parameters <hw_generation, has_tess, has_gs, ...> and we change the draw_vbo callback according to bound shaders. This means we have the optimal draw function when e.g. tessellation is enabled and also when it's disabled.
22:58jenatali: mareko: I'm curious, did you measure the impact on code size?
22:59imirkin: mareko: cool - i think swr does something similar
23:01mareko: jenatali: we don't know, it's just the compile time that got a little worse, we only did CPU profiling with it and it was really worth whatever the code size cost is, it's not much code, just 1 source file
23:03jenatali: Cool, makes sense, I can see how that'd make a big difference
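A minimal sketch of the template approach mareko describes, assuming a context that stores the draw callback as a function pointer and re-selects it when the bound shaders change. The names (`Context`, `DrawFn`, `draw_vbo_impl`, `update_draw_callback`) and the returned "stage count" are hypothetical and exist only to make the dispatch observable. This is not the actual driver code.

```cpp
#include <cassert>

// Hypothetical names throughout; not the actual driver code.
struct Context;
using DrawFn = int (*)(Context &);

struct Context {
    bool has_tess = false;
    bool has_gs = false;
    DrawFn draw_vbo = nullptr;
};

// The flags are compile-time constants, so the compiler can drop the dead
// branches in each instantiation. HwGen would select generation-specific
// code in a real driver; it is left unused in this sketch.
template <int HwGen, bool HasTess, bool HasGs>
int draw_vbo_impl(Context &) {
    int stages = 2;             // vertex + fragment
    if (HasTess) stages += 2;   // tessellation control + evaluation
    if (HasGs) stages += 1;     // geometry
    return stages;
}

// Re-select the callback when shader bindings change, not on every draw.
template <int HwGen>
void update_draw_callback(Context &ctx) {
    static constexpr DrawFn table[2][2] = {
        {draw_vbo_impl<HwGen, false, false>, draw_vbo_impl<HwGen, false, true>},
        {draw_vbo_impl<HwGen, true, false>, draw_vbo_impl<HwGen, true, true>},
    };
    ctx.draw_vbo = table[ctx.has_tess][ctx.has_gs];
}
```

Each extra boolean parameter doubles the number of instantiations, which is where jenatali's code-size question comes from.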
23:15bnieuwenhuizen: emersion: did you push the changes for the new libdrm release? I can't find them here : https://gitlab.freedesktop.org/mesa/drm/
23:15emersion: bnieuwenhuizen: you mean this? https://gitlab.freedesktop.org/mesa/drm/-/commit/cdd14e92e9bd11e7307512b81b9ce26dd5ee2bce
23:16bnieuwenhuizen: bleh, confused by the date of the last patch :(
23:41idr: mareko: I tried a similar thing a few years ago in C using various __attribute__ options to force inlining. Basically, the "real" function has all those things as parameters, and it is only called directly by functions that pass constants for those parameters and force it to inline.
23:41idr: tl;dr: It did not work out. :(
23:43idr: There's some stuff in the various Intel drivers now that do some similar kinds of things using #define for generation / #include of "real" source file.
23:45idr: It means that your template looks like "define_function(function_name, GEN)" (that's not what's actually used, but it's the same idea) instead of "function_name<GEN>".
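For comparison, the macro-based variant idr mentions can be sketched as below: a preprocessor macro stamps out one function per hardware generation instead of instantiating a template. The macro name, function names and the "dword" arithmetic are all made up for illustration (idr notes that `define_function` is not the real spelling either), and the real Intel drivers reportedly do this by re-including a source file per generation rather than with a single macro.

```cpp
#include <cassert>

// Hypothetical macro: defines name_gen<GEN>() with GEN baked in as a
// compile-time constant, so the generation check folds away.
#define DEFINE_EMIT_DRAW(name, GEN)               \
    static int name##_gen##GEN(bool has_tess) {   \
        int dwords = 4;                           \
        if (GEN >= 9) dwords += 2;                \
        if (has_tess) dwords += 3;                \
        return dwords;                            \
    }

// One definition per generation, analogous to function_name<GEN>.
DEFINE_EMIT_DRAW(emit_draw, 8)
DEFINE_EMIT_DRAW(emit_draw, 9)
```

Callers then pick `emit_draw_gen8` or `emit_draw_gen9` once at context creation, much like choosing a template instantiation.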
23:49Kayden: I am planning to use u_threaded_context, and I had code working towards that, but it now requires multidraw, so my code is broken and I have to get it back up to functioning again