09:01bbrezillon: karolherbst, jekstrand: would you mind if I send a MR for https://gitlab.freedesktop.org/kusma/mesa/-/merge_requests/72/diffs?commit_id=b7f1f3c595bac70c698ab327f19823a81d5da1b4 even if there's no upstream user yet ?
10:17karolherbst: bbrezillon: yeah, I think this one change is fine
10:17karolherbst: I am just not happy with the general way we calculate strides and so on as they differ between CL and GL and we have a few places where we would get it wrong
11:02MrCooper: mupuf: does link training failure really result in a modeset error though, as opposed to setting the link-status property to BAD? What if the mode is set for multiple connectors attached to the same CRTC, and link training fails only for some of them?
11:53daniels: anholt: RTYI https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4162
12:46TheRealJohnGalt: How do you save captures you've made in renderdoccmd? Can't seem to find it in documentation.
12:58dj-death: MrCooper: did you see my comment about the missing commit for 20.0
13:45MrCooper: dj-death: yeah, so we should be all good, just need to backport that along?
13:46dj-death: MrCooper: yep
13:46dj-death: MrCooper: probably impacts amdgpu too
13:47MrCooper: yeah, should be fine
13:49dj-death: MrCooper: should I open an MR against staging/20.0 ?
13:53MrCooper: once the Intel fixes have landed on master, sure
13:59dj-death: MrCooper: no I mean for the other missing patch
13:59MrCooper: is there a reason to backport them separately?
14:00dj-death: MrCooper: like I said, since it probably impacts amd drivers, it could now
14:02emersion: eric_engestrom: do i need to do something in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4953 ?
14:03eric_engestrom: iirc it was good to go last time I checked
14:03MrCooper: dj-death: it can, just not sure why it should :)
14:03eric_engestrom: emersion: let me have a quick look and I'll either assign it to marge or post a comment
14:03eric_engestrom: (also I'm about to leave for a couple of hours)
14:03emersion: thx :)
14:06eric_engestrom: emersion: yup, all good, assigned to marge
14:06eric_engestrom: thanks for the update!
14:07alyssa: I'm not familiar with the 20.1 backport process, can I just assign https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5193 to Marge?
14:07eric_engestrom: emersion: btw you might want to have a look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5177
14:08eric_engestrom: alyssa: the release maintainer does that, ie me for 20.1
14:08alyssa: eric_engestrom: In that case, could you merge that? thank you :)
14:09emersion: eric_engestrom: oh nice, will definitely have a look
14:09eric_engestrom: alyssa: lgtm, done
14:09alyssa: eric_engestrom: thanks :)
14:10eric_engestrom: ok, now I'm leaving ^^
14:10emersion: daniels: marge is broken on the inside 😢 https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4953#note_510068
14:19danvet: robher, did you get around to test drive my shmem series?
14:29daniels: emersion: the logs don't go back far enough but I'm monitoring now
14:32MrCooper: daniels: I for one want to run the CI pipeline less often (specifically, not post-merge), not more :)
14:33daniels: MrCooper: why so?
14:33jekstrand: bbrezillon: When are you planning to upstream your Dx12 stuff?
14:34daniels: jekstrand: RSN
14:34jekstrand: bbrezillon: I'd like to see core changes landing in master sooner rather than later.
14:34daniels: jekstrand: we were holding off (like most new drivers) whilst it was still a bit frantic and being rewritten by six different people every other week
14:34jekstrand: alyssa: I assume "real soon now"
14:34jekstrand: daniels: Yeah, that makes sense.
14:34daniels: jekstrand: we've now hit the point where the core of the driver makes sense, modulo things like 'how is pointer'
14:34daniels: and the rest is mostly just GL exts
14:35jekstrand: daniels: Yeah, that makes sense.
14:35MrCooper: daniels: because post-merge pipelines are wasteful when every MR must pass the pipeline pre-merge, and running the pipeline only for Marge by default seems to be working fine
14:38daniels: yeah, post-merge is helpful, but tbh if we're able infrastructurally to run CI for everyone all the time, what's the loss?
14:38daniels: (I realise the enabling condition is not currently true)
14:39alyssa: bbrezillon: Aside, I'm noticing panfrost_batch_add_bo has some overhead just from the hashtables. Wonder if there's a different data structure we want.
14:39jekstrand: alyssa: ANV uses a bitset. :-P
14:40jekstrand: alyssa: but anholt_ tried converting freedreno to a bitset and got a perf loss.
14:40jekstrand: No idea why
14:40alyssa: jekstrand: how does that.. work?
14:40jekstrand: alyssa: It's a bitset indexed by GEM handle
14:40jekstrand: And then we have a util_sparse_array indexed by GEM handle which contains every `anv_bo` ever created.
14:41MrCooper: daniels: apart from the network bandwidth cost, it'll always result in higher runner load / longer turnaround times for the pipeline
14:42alyssa: jekstrand: and that.. works? and copes with being able to create new BOs mid batch? (I guess realloc the bitset but that's expensive... fallback to a regular set..?)
14:42alyssa: otoh, tiny cost next to allocating a BO at all, acceptable
14:43jekstrand: alyssa: Yup. It works fine. And yes, you sometimes have to realloc the bitset but that's not very common
14:43jekstrand: And if you power-of-two grow it with a minimum size of 8 dwords (256 bits), it's very very not often.
14:43bnieuwenhuizen: alyssa: in Vulkan I've pretty much never seen more than 4k buffers used at the same time which is 512 bytes. I'm pretty sure the hit of reallocating that once in a while is not too bad
14:44alyssa: True, true :-)
14:44bnieuwenhuizen: (in GL it probably depends on whether your driver has a BO cache)
14:44bnieuwenhuizen: or BO suballocator really
14:44alyssa: We have a BO cache but not a suballocator atm.
14:45bnieuwenhuizen: gallium/auxiliary IIRC has one ready for you to use :P
14:45alyssa: (Except for transient memory like cmdstreams which get tossed after a frame anyhow)
14:46jekstrand: As far as size goes, when you consider that a hash table is 72B in 64-bit or 68B in 32-bit and that doesn't include the table which is 16/8B (32/64) per entry, the bitset is basically guaranteed to be smaller unless it's absurdly sparse and your batch contains virtually no BOs.
14:47alyssa: --Okay, you've convinced me then :-)
14:47jekstrand: alyssa: The big downside is, like I said, anholt tried it and it was slower
14:47jekstrand: We never figured out why though
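The bitset scheme jekstrand describes, including the power-of-two growth from an 8-dword (256-bit) minimum, can be sketched roughly like this (hypothetical names; ANV's real code differs):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical residency set: one bit per GEM handle, alongside a
 * sparse array (not shown) mapping handles back to BO structs. */
struct bo_set {
    uint32_t *bits;  /* bitset indexed by GEM handle */
    uint32_t size;   /* capacity in bits */
};

/* Power-of-two growth with a minimum of 8 dwords (256 bits), so
 * reallocation is rare in practice. */
static void bo_set_reserve(struct bo_set *s, uint32_t handle)
{
    uint32_t size = s->size ? s->size : 256;
    while (size <= handle)
        size *= 2;
    if (size != s->size) {
        uint32_t old_dw = s->size / 32, new_dw = size / 32;
        s->bits = realloc(s->bits, new_dw * sizeof(*s->bits));
        memset(s->bits + old_dw, 0, (new_dw - old_dw) * sizeof(*s->bits));
        s->size = size;
    }
}

static void bo_set_add(struct bo_set *s, uint32_t handle)
{
    bo_set_reserve(s, handle);
    s->bits[handle / 32] |= 1u << (handle % 32);
}

static int bo_set_contains(const struct bo_set *s, uint32_t handle)
{
    if (handle >= s->size)
        return 0;
    return (s->bits[handle / 32] >> (handle % 32)) & 1;
}
```

Inserts and membership tests are O(1), and per bnieuwenhuizen's numbers below, even ~4k live buffers only cost 512 bytes of bitset.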
14:48alyssa: Mm. A bigger fish to fry is knowing when BOs are free again for the BO cache <===> when a batch completes kernel side.
14:49alyssa: Polling bo_wait in the cache_fetch routine is... highly suboptimal.
14:49alyssa: We could do better doing that on batch_submit, I think.
14:49alyssa: AFAIK we don't have callbacks for that :p
14:50bbrezillon: alyssa: hehe, no we don't
14:50daniels: MrCooper: well, not always. previously we were on small crap runners - we're not anymore, because Packet offered us larger ones. we couldn't easily scale horizontally by adding new ones - we have a one-line auto-provision script now. our hardware labs were bottlenecking - fdno is laughably overprovisioned & we solved the Collabora USB-host instability issues ..
14:50mareko: pb_slab is for managing suballocations
14:50bbrezillon: alyssa: but checking for BO-idleness in the submit path is not a bad idea
14:51bbrezillon: though you still have to do extra ioctls
14:51bbrezillon: so I'm not sure it makes a big difference
14:51daniels: ... bentiss has helped us burn down an unbelievable amount of tech debt as well which has freed us from a lot of the terror that kept us bound to GCP. if we could find a sponsor (say, Packet) to give us a platform to run our own Kubernetes with free egress, we could instantly scale our CI back up to handle the volume
14:51MrCooper: daniels: more jobs always means more load / longer pipeline run-times, with any given runner capacity
14:51mareko: we have 2 MB allocations as the smallest size in radeonsi and we allocate anything smaller from those buffers using pb_slab
14:52daniels: MrCooper: er? if your capacity is greater than the requests, why does it imply longer runtimes?
14:52jekstrand: alyssa: You could always have a "tag" mechanism where, for each batch submission, you have a list of things it uses and then you only bo_wait on the batch and set !idle on all the BOs if the batch is idle.
14:52jekstrand: alyssa: That'd likely require some sort of busy refcount though
14:52alyssa: bbrezillon: Right, but there are N calls to cache_fetch per frame, and only 1 to submit. So a lot of redundant calls can go away, I think, since heuristically a frame will finish in about as long as the previous frame so submit() is a good place to stick things.
14:52bbrezillon: jekstrand: we have that already, I think
14:52alyssa: Maybe poll for frame (n - 2), but yeah.
14:53bbrezillon: (I mean the refcount stuff)
14:53alyssa: bbrezillon: jekstrand: We have the mechanism needed in kernel space, just need to respin the BO cache to use it.
14:54bbrezillon: alyssa: IIRC, BOs are put at the end of a bucket list when they've just been used
14:54bbrezillon: so calling bo_wait(DONT_WAIT) for the N first BOs in the list should do the trick
14:55alyssa: cool :)
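bbrezillon's suggestion (BOs are appended to the bucket on release, so at submit time do a no-wait idle check on the oldest entries and stop at the first busy one) could look roughly like this; the list layout and the idle predicate are assumptions, not panfrost's real code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry; real panfrost BOs carry much more state. */
struct cached_bo {
    struct cached_bo *next;
    uint32_t gem_handle;
    bool idle;  /* set once the kernel reports the BO idle */
};

/* Stand-in for a bo_wait(DONT_WAIT)-style ioctl: returns immediately
 * with the BO's busy/idle status. */
typedef bool (*bo_is_idle_fn)(uint32_t gem_handle);

/* Walk a bucket (oldest first) at submit time, marking BOs idle.
 * Since entries are appended on release, everything after the first
 * busy BO is at least as recent, so we can stop there. */
static int cache_mark_idle(struct cached_bo *head, bo_is_idle_fn is_idle)
{
    int marked = 0;
    for (struct cached_bo *bo = head; bo; bo = bo->next) {
        if (bo->idle)
            continue;
        if (!is_idle(bo->gem_handle))
            break;  /* stop at the first busy BO */
        bo->idle = true;
        marked++;
    }
    return marked;
}

/* Example predicate for demonstration: pretend handles below 10 are idle. */
static bool example_is_idle(uint32_t h) { return h < 10; }
```

With this, cache_fetch only needs to look at the idle flag; the per-BO ioctl happens once per submit instead of once per fetch.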
14:55MrCooper: daniels: the number of jobs to run varies a lot between different times, runner capacity which is plenty even for the worst case would be wasteful for the average case
14:55bbrezillon: IIUC, the problem is not that you have to wait for BOs, it's just that you have to do an ioctl() to realize it's busy/idle
14:56bbrezillon: or did I misunderstand what the problem was?
14:56alyssa: bbrezillon: It's more that we do that ioctl every time we go to fetch a BO from the cache, which happens extremely frequently.
14:56alyssa: So you do the same redundant ioctl for the same BO huge numbers of times per frame.
14:56alyssa: So if we defer that to submit time (for instance), the # of ioctls goes down by a huge number already.
14:56alyssa: It's still non-zero but shouldn't be a bottleneck anymore.
14:57bbrezillon: well, you'd still have to do it at some point
14:57bbrezillon: in the BO cache path, right?
14:57bbrezillon: unless you have some marked idle already
14:58alyssa: I'm suggesting we keep the kernel side as-is, but move the idle check to submit() instead of cache_fetch() so cache_fetch doesn't need to call wait
14:58alyssa: since submit is called far less often than cache_fetch
14:58bbrezillon: but how many ioctls will you issue there?
14:58daniels: MrCooper: well, between the monitoring we have which can tell us about demand vs. capacity, and our ability to non-interactively provision runners, that's a solvable problem, no?
14:59bbrezillon: I mean, the number of ioctls you'll do will still stay high
14:59alyssa: doing it in cache fetch is O(bn) for b BOs and n allocations; doing it in submit is O(b)
14:59alyssa: so that's strictly better
14:59alyssa: (except maybe for contrived examples.)
15:00MrCooper: daniels: anyway, the main point for me is: the current arrangement seems to work fine, so I don't see the point in wasting resources on running more jobs that aren't really needed most of the time
15:01bbrezillon: alyssa: maybe I'm missing something, to me you'd just end up with the same number of calls, unless old BOs keep being busy when you go fetch something from the cache
15:02bbrezillon: I mean, you have the same problem
15:02bbrezillon: in the submit path
15:02daniels: gitlab is going down very briefly in order to give psql more connections to play with, so we can kill all the 500 errors
15:02bbrezillon: so maybe we should have some aging
15:02bbrezillon: on the BO objects
15:02daniels: MrCooper: shrug, if we are in a position to enable more testing more frequently, rather than having people just bounce off marge, I don't see why we wouldn't try to do that
15:03bbrezillon: to avoid querying their states when they've just been added to the cache
15:04bnieuwenhuizen: bbrezillon: alyssa: or have fences that are queryable from userspace (e.g. write number to memory at the end of a batch, just check the number and do the expensive thing if the number is too low)
15:04MrCooper: daniels: I mean, we've seen a very clear example of what can happen when one doesn't try to keep resource consumption minimal
15:05bnieuwenhuizen: daniels: MrCooper: honestly if we have the capacity I think there is much more value in the initial upload of a MR than post-submit (I agree with MrCooper that post-submit is fairly useless with Marge, unless we decide to do more expensive testing that would be too long to wait on)
15:07MrCooper: not sure it's possible to discriminate between "initial upload" and any other MR pipeline
15:09daniels: MrCooper: well sure, we have seen that, but that happened unchecked because we had zero monitoring
15:09daniels: which is not the case now
15:10daniels: I'm not saying 'switch it all on tomorrow and eat the GCP bill', I'm saying 'if we move our infrastructure to something with free egress anyway because it's a thing that makes sense, and a thing we're now much better-placed to do, then we could very plausibly widen access to CI without impacting on runtimes (because we've done all that prep work too), and without costing us the earth (because monitoring)'
15:10MrCooper: fair enough
15:11MrCooper: I just don't see it being all that important
15:12daniels: yeah, I'm not saying that I'm desperate to do it tomorrow, but in the long-term direction I think it's a healthy thing for us to pursue if it doesn't compromise other things, like the feasibility of running the service at all + developer experience via runtime
15:17alyssa: bnieuwenhuizen: That... is possibly
15:17alyssa: Simultaneously incredible and horrifying.
15:17bnieuwenhuizen: alyssa: we have it on AMD HW :)
15:18alyssa: bnieuwenhuizen: In hw, or in the driver?
15:18alyssa: (Driver as in - for us we have a WRITE_VALUE job type which is meant for testing/debug mostly, but can be used as a generic write primitive.)
15:19bnieuwenhuizen: alyssa: userspace specifies a buffer and offset, and then the driver, when submitting a batch, will put an appropriate write packet in the ringbuffer after the call to the batch
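bnieuwenhuizen's userspace-queryable fence amounts to something like this sketch (struct and field names are made up for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace fence: the GPU writes the submit's sequence
 * number to this memory word when the batch finishes. */
struct user_fence {
    volatile uint64_t *seqno_va;  /* GPU-visible memory word */
    uint64_t last_submitted;      /* bumped on every submit */
};

/* Cheap idle check: just a memory read, no ioctl. Only if the number
 * is too low do we fall back to the expensive kernel wait. */
static bool batch_signaled(const struct user_fence *f, uint64_t batch_seqno)
{
    return *f->seqno_va >= batch_seqno;
}
```

The BO cache could then record the seqno of the last batch referencing each BO and compare against the fence word, avoiding bo_wait ioctls entirely in the common case.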
15:26danvet: hwentlan, [RFC 00/17] dma-fence lockdep annotations <- did you see this fly by?
15:27danvet: agd5f_ did reply on one patch, but there's more in dc land there that might be relevant
16:03hwentlan: danvet: haven't had a chance to look at it. Let me take a quick look now
16:23MrCooper: mupuf: even a legacy modeset can affect any number of connectors associated with the same CRTC; an atomic commit can even do so for any number of CRTCs at once
16:37pcercuei: Is it possible to attach a DRM object (e.g. drm_plane) to a DRM driver long after it was loaded?
16:38pcercuei: I'd like to have an external module, that *if* it is loaded, adds a drm_plane to my driver
16:38lynxeye: pcercuei: attaching might work (while detaching definitely won't), but there is no way to signal userspace that something has changed
16:39emersion: in theory user-space could pick it up on uevent…
16:39emersion: pretty sure nobody supports that
16:40pcercuei: hmm. Alright
16:42daniels: apologies, but we're going to have a couple of GitLab outages over the next half-hour or so whilst we try to fix the fallout from having too few postgresql connections now. appreciate it's not ideal timing (usually we try to do these things on weekends), but it would be nicer to have things all ready and just working for Tuesday morning I think.
17:42imirkin: daniels: still seeing outages on gitlab - presumably related to your previous announcement? (you said 30 min, but it's been 1h since you wrote it)
17:43imirkin: [also, it's a holiday in US and UK, so definitely not the worst timing in the world, as a good chunk of developers are based in those 2]
17:43daniels: yeah, things have gone a lot worse than expected :(
17:43daniels: I don't have a solid ETA since at this point I'm iteratively bugfixing rather than moving forward with a clear plan
17:43imirkin: ok, good luck!
17:46imirkin: (i'd say i'd help, but i doubt i have anything of substance to offer in that department)
17:46alyssa: 2 hours later, srht.fd.o pops up... =P
17:48imirkin: airlied: re the bptc stuff, check the original email discussion when this was landed (by nroberts, some short 6 years ago) -- iirc he had said that it wasn't perfect around compression but "good enough". are you having trouble with compression, or decompression?
17:49daniels: alyssa: tbf gitlab isn't the problem at all, it's the constellation of things we use to run it
17:49hwentlan: danvet, that's a large set and there's too much going on today on my side. i'll need a few days to leave a response on the dma-fence lockdep patches
17:50danvet: hwentlan, I loled when you said "a quick look", but figured I'll leave the fun unspoilt :-)
17:51danvet: for this even a few days would be ludicrous speed
17:51hwentlan: haha, i remember why i didn't read the whole set yet when i looked at it again
18:36airlied: imirkin: I think it's decompression, I'm assuming the CTS is doing the compression, but maybe it's compression
18:36airlied: actually it's vulkan failing ,so it's decompression
18:36imirkin: airlied: GL allows doing like glTexImage on a compressed format, which will do online compression
18:37imirkin: i'm not aware of any issues in the online decompression though. but the discussion was like 6 years ago, so my memory might be a bit hazy.
18:39bnieuwenhuizen: airlied: FWIW CTS can be quite particular about precision
18:40airlied: bnieuwenhuizen: it's only about 5-10 pixels wrong
18:40airlied: so I expect it's like a single bit encoded wrong
18:40airlied: or decoded wrong
19:04daniels: the captain has switched off the 'fasten seatbelts' sign. you may now move around the cabin. however, for your safety, when seated, we recommend you always keep your seatbelt fastened. thank you for flying with Air Kubernetes.
19:06imirkin: we appreciate that you have a choice of air carriers, and apparently you made the wrong one
19:06imirkin: [that's from south park, btw]
19:08jekstrand: daniels: Good. Now I can finally tell you you're wrong on GitLab. :-P
19:14alyssa: something something 38 Planes
19:16jekstrand: Oh, I hope not...
19:16jekstrand: What would you do with 38 planes? One per bit and 4 more for compression?
19:17alyssa: land them in Newfoundland, apparently
19:18jekstrand: That would work, I suppose.
19:18glisse: anyone know where the pagelist of deferred_io for fbdev gets filled up?
19:19airlied: glisse: is it just from vmalloc?
19:20glisse: if you believe fb_deferred_io_page() but i can not find where the hell pages get added to fbdefio->pagelist
19:20airlied: oh in the drivers I think
19:20airlied: hmm maybe not
19:21glisse: grep is failing me hard
19:22airlied: glisse: the list_add_tail does it
19:22airlied: it's just ugly
19:23glisse: ok my grep was missing a 'd' to add
19:23danvet: glisse, why?
19:23danvet: well not sure I want to hear the answer
19:23danvet: I mean, why do you dig around in there
19:24glisse: if you knew where i am digging you would be appalled ;)
19:24glisse: fbdev is not the worst place i have been ...
19:25daniels: jekstrand: \o/
19:25airlied: defio is always a pita since it can't work with shmem due to everyone wanting the lru
19:26glisse: oh well my life is much harder, i'm taking page->mapping,private,index all for myself ...
19:56nroberts: airlied, imirkin: I’d hope that the vulkan API wouldn’t provide anything that ended up needing software decompression or compression
19:57airlied: nroberts: well a vulkan sw rasterizer would :-P
19:57nroberts: oh, right
19:58nroberts: if it hasn’t changed since I wrote it then software decompressor is really inefficient
19:58alyssa: nroberts: Not sure if VK allows it, but for GL at least, if you modify a few pixels (CPU-side) of a compressed buffer, it's faster to patch up in s/w than to defer to the dedicated blit pipe
19:58nroberts: but as far as I know it should be accurate unless there are bugs
19:58alyssa: (if you have one, 3D cores if you don't)
19:59airlied: nroberts: yeah it seems to fail the bc7 tests in vulkanCTS
19:59airlied: but beyond that I can't say how or why, its just one of the remaining failures I have
19:59nroberts: hm :/
20:03nroberts: alyssa: I’m not sure what you mean by the dedicated blit pipe? surely there isn’t hardware that can compress to bptc?
20:03alyssa: nroberts: Oh, thought you were talking about framebuffer compression, nvm
20:03airlied: nroberts: that's the result file
20:04airlied: see only two blocks show differences
20:09mannerov: Hi, I would like to add some nine memory allocation statistics to the gallium hud. It looks like currently there is no state tracker/frontend displaying their statistics on the hud.
20:09mannerov: It looks like I could wrap the pipe_screen passed to the hud and overwrite its get_driver_query_info/get_driver_query_group_info calls
20:09mannerov: but I'm not sure that's the way to go...
20:29mannerov: It would probably better to extend hud to query atomics from the frontend
20:44nroberts: oh fun, that CTS test generates random data for the compressed texture
20:46dinosomething: anyone know what a "secondary gpu" means? for example, in a X screen section, you set "device", which is a graphics device, but then theres also "GpuDevice" that you can set, which adds a "secondary" gpu... what does that mean in the context of x?
20:52mannerov: dinosomething: As you have guessed it's for multi-gpu systems. DRI2 GPU Offloading (DRI_PRIME) needed both GPUs to be registered by X. I guess the option is related to that. Now everything should be using DRI3, which doesn't need that.
20:52airlied: it's also used for secondary gpu displays
20:58dinosomething: mannerov: ahhh, so DRI2 _is not_ direct rendering then, right?
20:59dinosomething: why would X need to be involved if its "direct" rendering
20:59airlied: dinosomething: it is direct rendering, that's literally what the DR stands for
20:59dinosomething: airlied: lol yea
20:59dinosomething: but whats X involvement when it comes to DRI2?
20:59airlied: with PRIME and DRI2 the X server does the copy from one GPU to the other
21:00airlied: dri3/present made that unnecessary
21:00dinosomething: airlied: whoa whoa whoa
21:00dinosomething: what.... why did it do that?
21:01dinosomething: in dri2, does the process still hit "/dev/dri/cardX" ?
21:01airlied: because we hadn't invented dri3/present yet
21:01dinosomething: uh huh
21:02airlied: with dri2 the process renders everything to the backbuffer, and the X server does the copy to the other GPU's frontbuffer
21:03dinosomething: so in dri3, the process has direct access to the front buffer or something?
21:03dinosomething: at least as far as X goes?
21:03airlied: dinosomething: no there's just more magic
21:04airlied: and we blit on the client side now
21:04jekstrand: Nothing has access to the front buffer except X
21:05dinosomething: so, is the "device" that is in a "screen" section, essentially the device where you want to have the frontbuffer be?
21:05dinosomething: and then when you use GpuDevice, it adds a device for a process to utilize the other gpu, with x being able to copy it (for dri2)
21:07airlied: dinosomething: yes generally it's all autoconfigured
21:07airlied: but gpudevice adds the ones used for output and dri2 offload
21:07dinosomething: airlied: ahhhhh
21:07airlied: if you don't need either of those, then you don't need a gpudevice
21:08dinosomething: GpuDevice essentially does that "xrandr --setproviderblahblah" stuff?
21:08airlied: it exposes a gpu to do that
21:08airlied: if you are static configuring
21:08dinosomething: i was before, yea
21:08dinosomething: but now im not
21:09dinosomething: but what you have explained, it kinda explains something funny ive been seeing. right now, i set primarygpu true on my amdgpu driver. but when i run DRI_PRIME=1 glxgears, the screen shows just like a bunch of weird lines
21:10dinosomething: but i mean, im on ubuntu 20.04 so i would assume that dri3 would be default....
21:11mannerov: dinosomething: it would be helpful if you gave exact details about your configs (which cards, monitors connected to which)
21:14dinosomething: mannerov: ive got an x1 carbon 7th gen, so only graphics is just the integrated intel graphics. but then i got an egpu which has a radeon card
21:15dinosomething: i set "PrimaryGPU" "true" on my amdgpu OutputClass, so that way DRI_PRIME=0 is my amd card
21:16dinosomething: thats the only custom xorg config i changed at all, is just "PrimaryGpu" true to the amdgpu
21:18mannerov: It's possible the iris driver hasn't implemented completely what's needed for dri3 DRI_PRIME. After all, using a less powerful card is not expected
21:18mannerov: the non-iris driver hasn't implemented to my knowledge what's needed to enable dri3 DRI_PRIME (it will just use the amd card always)
21:19mannerov: I think your gpu must use iris
21:19dinosomething: mannerov: yea. the reason is that since im using the monitors that are connected to my egpu card, i want to make sure that xorg is doing as much as possible on that card (like all the compositing and memory and whatnot)
21:19dinosomething: mannerov: yea xorg log shows iris
21:19dinosomething: for DRI driver
21:22dinosomething: what exactly are the implications of setting PrimaryGpu though? like, internally in xorg, how does the primarygpu differ from any other? am i right in assuming that the primarygpu is the gpu in which xorg will composite or whatever? im sure thats wrong because im still confused about it, but i just want to nail down what primary gpu means
21:32dinosomething: am i atleast correct in assuming that the X framebuffer would be located on the graphics card which is selected as PrimaryGpu?
21:33dinosomething: is there any sort of command i can run to query X for where low level information like that, like the memory address of its framebuffer?
21:34robclark: glisse: grep? The cool kids these days are using vscode (allegedly.. :-P)
21:34HdkR: I've been using ag recently, ends up being faster than grep for code searching
21:36mannerov: dinosomething: there is no magic, DRI_PRIME=0 gives you the main gpu, which is the one on which compositing, etc occurs
21:41anholt_:ponders how to debug vk cts startup: SERIAL-CPU> FATAL ERROR: vk.createInstance(pCreateInfo, pAllocator, &object): VK_ERROR_INITIALIZATION_FAILED at vkRefUtilImpl.inl:168
21:41glisse: robclark: aren't you still using the java thingy ... keep forgetting the name of that monstrosity :)
21:42imirkin: or what was that ibm thing ... websomething
21:42glisse: maybe i should revive my cscope kung fu or see what's around nowadays
21:42robclark: heheh, eclipse.. still using that for mesa, since it still has better code indexing than anything else (or at least anything else that is open src?).. been playing around w/ vscode for kernel where eclipse has, emm, scalability problems
21:42imirkin: robclark: have you ever looked at emacs + ctags?
21:43robclark: vscode is at least light enough that I can use it on aarch64 laptop for kernel stuff
21:43glisse: isn't the rule of java that if it gets slow you just go out and buy more terabytes of RAM?
21:43robclark: imirkin: yeah, krh has shown it to me.. I walked away not so impressed
21:43imirkin: glisse: you can still run out of permgen (if that's still a thing with recent java)
21:43imirkin: which is basically the biggest "f u" a vm can create
21:43anholt_:uses emacs and remains unimpressed with it
21:43robclark: glisse: yeah, that is more or less the reason for not using eclipse on aarch64 laptop
21:44imirkin: "you're out of this non-recyclable memory, and there's no way to continue. have a nice day."
21:44imirkin: i've been using emacs since ~forever, and while i wouldn't say i'm a power user of it by any means, it does what i need
21:44imirkin: (which, quite frankly, is not a whole lot)
21:44glisse: does it brew beers ?
21:45imirkin: glisse: nah, i got a store for that
21:45karolherbst: robclark: clion is probably the cool kids IDE these days :p
21:45imirkin: glisse: i dunno, maybe that's one of the power user features :)
21:45karolherbst: but I used to use qt-creator.. it just has painfully broken support for Make
21:45karolherbst: so it's not useful for the kernel
21:46karolherbst: or maybe make is just stupid
21:46karolherbst: eclipse is just terrible performance wise
21:47robclark: as far as indexing, I'd say vscode is as good or better than qt-creator, and seems to be as fast.. it is a step down as far as indexing compared to eclipse but it isn't nearly as bloated..
21:47imirkin: robclark: fwiw, my "emacs" workflow does include a whole lot "git grep in another shell". but it's perfectly effective, and allows me to do all kinds of stuff that might not integrate into a "generic" package very well
21:47karolherbst: robclark: right.. but at least qt-creator has a useful debugging interface and it's less bloated than vscode :p
21:47robclark: eclipse is fine on mesa, but kernel pushes it a bit.. and defn not something I could use on laptop where I can't just add a bunch of ram
21:48robclark: imirkin: sure, ofc.. but once you use somethign better it is hard to go back ;-)
21:49robclark: karolherbst: hmm, haven't compared memory footprint/etc.. but just based on using both, I'm not sure I agree.. I haven't played much w/ debugging interface on vscode, but I know some people who have been using it w/ kgdb
21:49karolherbst: I like qt-creator.. it's good enough and doesn't put a lot of pressure on system memory
21:49karolherbst: qt-creator still doesn't support meson: clion
21:50robclark: tbh, the "integrated debugger" aspect of IDEs isn't a big deal for me.. but the code-search aspect is
21:50emersion:thought kernel hackers were more hardcore
21:50imirkin: robclark doesn't write bugs.
21:50robclark: vscode just uses compile_commands.json.. which kernel can generate, fwiw
21:50imirkin: emersion: not everyone uses 'debug' for everything
21:50robclark: (and meson/ninja too)
21:51imirkin: (in case someone's not familiar with `debug` - https://en.wikipedia.org/wiki/Debug_(command) )
22:02karolherbst: robclark: vscode has a few nice features though :P
22:02karolherbst: extensions rather
22:02karolherbst: that instant git blame is nice :D
22:04robclark: yeah, it's got a lot of extensions.. and I will say that I expected to dislike it when I tried it.. I still haven't switched over to it full-time for anything (one of the blockers for kernel work was lack of cross-compilers in the vscode flatpak.. but I built it from git to run outside container yesterday to get around that)
22:04karolherbst: imirkin: "Tool to help find documentation and device tree matching on device driver source code, by device tree binding compatible strings." :O
22:05robclark: haven't played too much w/ git integration yet but krh (emacs user) seemed to think the git integration showed promise.. and emacs's git "plugin thing" is about the only thing that would make me consider going back to emacs, so I guess that is high praise
22:06karolherbst: robclark: I have gitLens installed
22:06karolherbst: and this seems to have everything
22:06robclark: (ahh, magit is the emacs git thing)
22:07ric96: robclark: have you tried running minetest with freedreno on 845? I've been having some, for lack of my terminology, "transparency issues". Works fine on 410 tho, same mesa build.
22:09ric96: not sure if this is a known issue, would love to help debug
22:09robclark: ric96: hmm, can't say I have.. I can have a look.. there are some FD_MESA_DEBUG switches (like 'noubwc') to turn off various features..
22:11karolherbst: robclark: what do I need to do to get the compile_commands.json?
22:12robclark: hmm, 2020-05-25 15:11:45: ERROR[Main]: /builddir/build/BUILD/minetest-5.2.0/src/script/cpp_api/s_base.cpp:89: static int ScriptApiBase::luaPanic(lua_State*): A fatal error occurred: LUA PANIC: unprotected error in call to Lua API (bad light userdata pointer)
22:13robclark: karolherbst: there is a vscode "plugin(ish)" thing for kernel that includes a script to generate that.. or scripts/gen_compile_commands.py
22:13karolherbst: there it is
22:13karolherbst: I didn't see it :)
22:14robclark: karolherbst: https://github.com/amezin/vscode-linux-kernel is the other thing
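For reference, the compile_commands.json that scripts/gen_compile_commands.py (or meson/ninja) emits is just an array of per-file entries in the format clang tooling consumes; a hypothetical entry, with made-up paths:

```json
[
  {
    "directory": "/home/user/linux",
    "command": "gcc -Wall -c -o drivers/gpu/drm/drm_file.o drivers/gpu/drm/drm_file.c",
    "file": "drivers/gpu/drm/drm_file.c"
  }
]
```

vscode's C/C++ indexer reads this to resolve include paths and per-file defines, which is why it works for the kernel without any build-system integration.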
22:17robclark: ric96: maybe you could 'apitrace trace' minetest.. run the game long enough to show misrendering and then exit)? Possibly interesting to record a trace on a3xx, and replay on a6xx (to rule out whether it is related to some extension that a6xx has that a3xx does not)
22:18robclark: either way I could use that to see where rendering goes wrong.. does look a bit like an alpha channel issue.. at least if the correct rendering is what I assume it is..
22:19karolherbst: robclark: huh.. do I need to do anything special to be able to build in vscode? :D there is no build button
22:20robclark: sudo dnf install nodejs yarnpkg
22:20robclark: the default instructions aren't great for bootstrapping, tbf
22:21karolherbst: yeah.. especially because vscode thinks true and false aren't defined :/
22:21robclark: iirc, just run 'yarn' from root of git tree, and then ./scripts/code.sh
22:22karolherbst: huh? there is no ./scripts/code.sh
22:22robclark: and then maybe this.. since oss version doesn't seem to have URL for extensions configured:
22:23robclark: git-blame tells me that it's there..
22:24karolherbst: we talk about the linux kernel git tree, right?
22:24robclark: no, talking about vscode git tree
22:25robclark:assumed you were asking how to build vscode
22:25karolherbst: I was asking on how to get the kernel built in vscode :p
22:25robclark: ahh.. hmm
22:25karolherbst: I have the compile_commands.json file generated but it doesn't seem to do much
22:26robclark: ctrl-shift-b ?
22:26karolherbst: just builtin default tasks
22:26robclark: I don't normally build from ide, just use compile_commands.json so it's indexer can resolve header file includes
22:26karolherbst: ahh yeah..
22:26karolherbst: well that doesn't work either
22:29robclark: I've only used https://github.com/amezin/vscode-linux-kernel .. haven't tried without it and just using scripts/gen_compile_commands.py.. but *looks* like the only thing vscode-linux-kernel thing adds beyond that is settings for indenting rules..
22:30robclark: iirc there is a way via some .json stuff to add your own entries to what shows up in ctrl-shift-b
22:31robclark: (I haven't played with it too much yet, but https://github.com/amezin/vscode-linux-kernel/blob/master/tasks.json looks like an example)
22:31karolherbst: ohh, it seems like I have to set it up first..
22:31karolherbst: let's see
22:43ric96: robclark: preferred place to upload apitrace? about 200mb each
22:46robclark: ric96: gzip or xz might reduce the size.. but upload wherever..
22:46robclark: if no good options to upload, file a bug on gitlab and attach
22:47robclark: ie. at https://gitlab.freedesktop.org/mesa/mesa/-/issues
22:47ric96: yeah, i'll go that route then
22:50ric96: robclark: um.. not sure if this is supposed to happen but apitrace replay looks fine, running it on my workstation.
22:50ric96: is it just running the calls on my workstation hardware then?
22:51robclark: it is just replaying the sequence of gl calls
22:52robclark: so if it looks fine replaying on workstation and not on device, probably a driver bug
22:52robclark: (but apitrace has nice tools like dump-images/diff-images that is useful to compare rendering between drivers to track down where things go wrong)
23:23ric96: robclark: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3045