00:25 alyssa: pac85: fair point. gl stands for gallium =D
00:26 alyssa: cmarcelo: hmm. the reason I want this is specifically *so* grep works properly. grepping src/asahi should get everything in the driver. the AGX compiler shouldn't be #included outside src/asahi. etc
00:26 alyssa: right now it's src/asahi and also src/gallium/drivers/asahi if the GL driver is touched
00:27 alyssa: only reason to keep in src/gallium/drivers is so mareko doesn't get annoyed about gallium drivers disappearing, I guess
00:27 alyssa: (so that grepping src/gallium/drivers hits all the gl drivers)
00:27 alyssa: idk if a symlink is the right tool for that though
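The "nothing outside src/asahi should #include the AGX compiler" rule above could be checked mechanically. Below is a minimal sketch, not anything that exists in Mesa: the function name `stray_includes`, the scanned file extensions, and the idea of matching headers by a regex are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches both #include "foo.h" and #include <foo.h>, capturing the header path.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def stray_includes(root, owner_dir, header_pattern):
    """Return (file, header) pairs for includes that match header_pattern
    but live in source files outside owner_dir (both relative to root).

    Hypothetical helper: owner_dir would be e.g. "src/asahi" and
    header_pattern a regex describing the driver-private headers.
    """
    root = Path(root)
    owner = root / owner_dir
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in {".c", ".h", ".cpp"}:
            continue
        if owner in path.parents:
            continue  # files inside the owning directory may include anything
        text = path.read_text(errors="replace")
        for header in INCLUDE_RE.findall(text):
            if re.search(header_pattern, header):
                hits.append((path.relative_to(root).as_posix(), header))
    return hits
```

With everything under one directory the list should come back empty; with the GL driver living in src/gallium/drivers/asahi, every one of its compiler includes would be reported.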
01:31 benjaminl: is there a rule of thumb for when a new nir pass should go in src/compiler vs a driver-specific directory?
01:33 benjaminl: (context is I'm currently looking at writing a pass to lower noperspective varyings for hardware that doesn't have linear varying interpolation. Afaik this only applies to panfrost, but in theory it's general)
04:46 alyssa: benjaminl: "is this useful for multiple drivers", pretty much
04:46 alyssa: I tend to be a fan of "incubate in driver, then the second driver who wants to use it can move to common code" guaranteeing common passes have 2 users
04:46 alyssa: not a hard rule though
04:47 alyssa: what was the problem with noperspective on mali though?
11:16 mahkoh: Hi, VkSwapchainCreateInfoKHR allows the client to pass in an oldSwapchain. The spec does not require the oldSwapchain to have been created from the same device, only from the same instance. But mesa performs an unconditional cast to its internal swapchain type. Am I missing some part of the spec that requires the oldSwapchain to have been created from the same device?
11:55 emersion: doesn't same instance imply same device?
12:03 mahkoh: The instance is the top-level object that can be used to enumerate devices. On systems with an nvidia GPU and an AMD GPU, the same instance can be used to create mesa and nvidia devices.
12:04 mahkoh: The SurfaceKHR type is defined in vk_icd.h which is shared by all drivers. But swapchains seem to always have a driver-defined layout so sharing swapchains between devices should always lead to undefined behavior.
12:05 zmike: might be a spec oversight
12:05 mahkoh: I guess this is just an oversight. vkDestroySwapchainKHR requires that the swapchain was created from the given device.
12:05 zmike: raise an issue
12:05 mahkoh: Ok
12:05 zmike: on the spec
12:12 mahkoh: https://github.com/KhronosGroup/Vulkan-Docs/issues/2454
13:08 Company: does the spec forbid creating multiple swapchains with different devices?
13:08 Company: because I'm pretty sure that's not gonna work
13:34 bromiumse: shannon was an american computer engineer was also more like programmer, who calculated the possible combinations after each move or split in chess too though. I am not as big caliber guy i assume. https://en.wikipedia.org/wiki/Claude_Shannon but whatever. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4942903 so the heavy lifting is done by the decoder where the expensive loop
13:34 bromiumse: happens, where each power has the multiplier later on, and a remainder that is interpreted per sum of invariant operands and translated to by the accessed alu operations result. That paper gives the same results, the implementation is not really needed (but you can also find free doi of the reference 15, compression ratio is way beyond winzip there so 0.05 vs 0.343
13:34 bromiumse: https://www.sciencedirect.com/science/article/pii/S0304397511009418?via%3Dihub). max number is presented by log (L,n) bits. which comes back as 10 bits per 4.2Billion. So the system ddrm procedures are broken due to bad PCIe bugs, well read-modify-write is needed and dma, unaligned is not a requirement, as shifting can be performed on host? Mr Shannon was a smart man, i did not fully
13:34 bromiumse: understand his chess take on things though, but powers in computer is likely simpler than chess paradigm.
14:35 mahkoh: Is it possible to recover from vkQueuePresentKHR failing? It seems that mesa updates the state of the image (acquired) somewhere in the middle of its logic, which means that the client cannot use the image anymore. The spec doesn't seem to say anything about this.
14:35 mahkoh: For a wrapper, would it be best to discard the swapchain at that point?
14:40 daniels: depends on the return - afaict the only failure after the implementation takes ownership of the image is VK_ERROR_OUT_OF_DATE_KHR, which is pretty clear that you can no longer use the swapchain
14:41 stsquad: has anyone tried debugging mesa/virglrenderer with rr? is swrast fallback the only way to go?
14:42 mahkoh: I see several goto fail_present before and after the acquired field has been reset.
14:42 mahkoh: For example, result = wsi_signal_dma_buf_from_semaphore(swapchain, image), which does not look like it would return VK_ERROR_OUT_OF_DATE_KHR.
14:43 mahkoh: But I have not checked.
14:58 stsquad: hmm it looks like venus won't work with swrast as a backend
15:31 Lynne: is there a way to test whether somehow mesa's nir frontend miscompiles my code?
15:31 Lynne: validation layers pass, and the code runs excellent on nvidia's drivers
15:31 Lynne: it uses BDA extensively, so I don't know how well that's tested
15:34 karolherbst: BDA is just a raw pointer in the shader and it's kinda up to backends to validate that their compiler doesn't screw up
15:35 karolherbst: but there is nir validation stuff going on
15:35 karolherbst: `NIR_DEBUG=validate` I think
15:35 karolherbst: mhhh
15:35 karolherbst: might be disabled in release builds
15:36 karolherbst: in debug builds it's validating by default
15:38 pendingchaos: NIR_DEBUG=validate_ssa_dominance too, which isn't ever enabled by default
15:38 pendingchaos: there's no way to tell for certain without looking at the original shader and final nir, but NIR_DEBUG can catch some bugs
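For anyone wanting to script the NIR_DEBUG runs suggested above, here is a rough sketch. The flag names are just the ones quoted in the conversation (`validate` was only offered tentatively), combining several flags with commas is an assumption, and the helper names `nir_debug_env` and `run_with_nir_debug` are made up for this example.

```python
import os
import subprocess


def nir_debug_env(flags, base=None):
    """Return a copy of the environment with the given flags added to NIR_DEBUG.

    Flags already present in NIR_DEBUG are kept and not duplicated.
    Joining flags with commas is an assumption about how the variable is parsed.
    """
    env = dict(os.environ if base is None else base)
    merged = [f for f in env.get("NIR_DEBUG", "").split(",") if f]
    for flag in flags:
        if flag not in merged:
            merged.append(flag)
    env["NIR_DEBUG"] = ",".join(merged)
    return env


def run_with_nir_debug(argv, flags):
    """Run argv (a list such as ["./my_app"]) with NIR_DEBUG set; return its exit code."""
    return subprocess.run(argv, env=nir_debug_env(flags), check=False).returncode
```

As noted in the discussion, a clean run only rules out the classes of bugs the validator knows about; comparing the original shader against the final NIR is still the definitive check.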
15:47 Lynne: both are clean
15:47 Lynne: original shader is https://paste.debian.net/1333820/ (can link you a program if you want to try running it)
15:48 Lynne: most of the code isn't called, the issue is that renorm_encoder_full() puts junk data in the bitstream buffer
17:18 cmarcelo: alyssa: ack, tks for clarifying. I do a lot of project wide grep that's why it crossed my mind.
17:49 alyssa: cmarcelo: not sure what the right way would be
17:58 cmarcelo: alyssa: to be clear, no opposition on your suggestion.
18:34 alyssa: ack
18:51 indopaster: if you look at that this way, you can only have somewhere around 1024 combinations, when the operands become known there will be round robin pinning, which produces less than 1024*1024 non-ordered combinations, hence there is no carry handling anymore, cause carry is already handled by invariance you could pin based of the truth table but the result of binary multiplication would come to
18:51 indopaster: the same solution, as for operand multiplication and encoded to invariant contiguous weights. There is quite few doubt that this pdf is correct, because, it's the decoder which will output all the not used bits 1024*1024*1024*4-1 is the max, 1024 is the operand 1 powers combination count, second is the other operands comb_count, result is the combinations of decoder+result and *4 is
18:51 indopaster: handling the carry. So *4 is handled by invariance, 1024*1024 is handled by access table through sums and *1024 of result is handled by the results which are prepinned. The results are so sparse, cause the collisions at higher powers are met so often.so 1024 mersenne number is 1023, where as 1024 is 32+11+11 the mersenne of that is larger in encoded format but results of the access are
18:51 indopaster: smaller or bigger for multiply alu but the translation of decoder is bigger, in other words it can use collisions as intermediate results and hence the outcome is the same, and it's not like united states or british military did not know what i talked about, russian language i am learning, cause british and us sent me 2years of feedback on such things, this method is known as electrical
18:51 indopaster: suppression for the bright headed people. and is achieved with routines over plus and minus. In hw without opcodes reloaded you can not have transition from small electricity to bigger per opcode AFAIK. Now PCIe troubles i dunno very well,i was never close to their development for hw nor sw. I know that this is an interrupt based third party dma which needs to be used, it needs reliable
18:51 indopaster: pcie encoding and timing likely indeed. We would really fancy putting the bursts there in fact, it's very low power as well as fast there and pretty easy to develop.
18:58 airlied: sima: do we care about kunit tests causing lockdep warnings :-)
19:50 Lyude: drm-tip conflict detected, taking a look now
19:51 airlied: Lyude: oh I'm solving it here
19:51 airlied: was just waiting for a rebuild
19:54 airlied: Lyude: pushed it
19:59 Lyude: airlied: ah cool, got to me first then :P
19:59 Lyude: erm, got there first I mean
21:11 airlied: mlankhorst: can you apply Arnd's patch to fix imx/dcss/dcss-kms.c to drm-misc-next and send me a new PR?
21:11 airlied: the current PR doesn't build for me here
21:15 benjaminl: alyssa: the problem with noperspective on mali is that the only available instructions to load varyings do perspective-correction. LD_VARY*.clobber *almost* does the right thing but not quite
21:15 alyssa: ouch
21:16 alyssa: yeah, ok. frustrating
21:17 benjaminl: hahaha yeah. The docs I have for G610 say that .clobber is supposed to do noperspective, but the docs for later chips just say it's undefined
21:17 alyssa: entertaining
21:17 benjaminl: possibly it was intended as noperspective but hardware bug?
21:17 alyssa: I'd believe it
21:18 alyssa: but "change the docs to match the bug" instead of "fix the bug" is just
21:18 alyssa: chef's kiss
21:18 benjaminl: :)
21:18 alyssa: I guess Mali doesn't have any way to do per-vertex varying loads either
21:19 alyssa: (neither does Imaginapple but still)
21:19 benjaminl: plan is to "undo" the perspective correction by writing 'v * w' in the vs and then multiplying by gl_FragCoord.w in the fs
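The 'v * w' trick can be sanity-checked numerically. The sketch below models one triangle on the CPU and assumes the standard perspective-correct interpolation formula and that gl_FragCoord.w is the screen-space-linear interpolation of 1/w. All function names are illustrative and this is not the actual NIR pass.

```python
def persp_interp(values, ws, bary):
    """Perspective-correct interpolation, which is all the hardware offers here.

    values: per-vertex attribute, ws: per-vertex clip-space w,
    bary: screen-space barycentric weights of the fragment (summing to 1).
    """
    num = sum(b * v / w for b, v, w in zip(bary, values, ws))
    den = sum(b / w for b, w in zip(bary, ws))
    return num / den


def frag_coord_w(ws, bary):
    """gl_FragCoord.w: 1/w interpolated linearly in screen space."""
    return sum(b / w for b, w in zip(bary, ws))


def linear_interp(values, bary):
    """What noperspective is supposed to produce."""
    return sum(b * v for b, v in zip(bary, values))


def noperspective_via_workaround(values, ws, bary):
    # Vertex shader side: write v * w instead of v.
    premultiplied = [v * w for v, w in zip(values, ws)]
    # Fragment shader side: perspective-correct load, then multiply by gl_FragCoord.w.
    return persp_interp(premultiplied, ws, bary) * frag_coord_w(ws, bary)
```

The reason it works: pre-multiplying by w cancels the 1/w in the numerator, leaving sum(b_i * v_i) divided by sum(b_i / w_i), and the denominator is exactly gl_FragCoord.w, so the final multiply removes it.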
21:20 benjaminl: and yeah, no per-vertex varying loads
21:20 alyssa: ouch
21:20 alyssa: any clue what the DDK does?
21:20 benjaminl: not even possible to fake it by loading directly from the vertex packet buffer afaict
21:20 benjaminl: I asked a DDK dev but haven't heard back yet
21:21 benjaminl: I should really get around to setting up pandecode