08:15mareko: glehmann: where do you see that?
08:36glehmann: mareko: looks like I misremembered, you wanted to do that for glsl but never actually followed through: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25392
13:06konstantin: sghuge: That sounds like a lot of effort (if it is doable at all), most literature about tree construction uses binary trees. In my performance tests, the binary -> n-ary stage was quite fast
13:08konstantin: Something like PLOC sounds impossible to do efficiently with generating the n-ary tree directly since you need to find groups of n nodes that have minimal surface area instead of pairs.
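
For readers following the PLOC point, here is a minimal sketch of one pairwise merge-candidate search, assuming the clusters are already sorted along a Morton curve and each box is a `(min_xyz, max_xyz)` tuple. All names (`union`, `surface_area`, `nearest_neighbor`, `ploc_pairs`, `naive_group_candidates`) are hypothetical and not taken from Mesa. The last helper only counts how many subsets a naive n-wide search would have to score per node, which hints at why the direct n-ary variant looks expensive.

```python
from math import comb

def union(a, b):
    """Smallest AABB enclosing boxes a and b; a box is (min_xyz, max_xyz)."""
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return lo, hi

def surface_area(box):
    """Surface area of an axis-aligned box."""
    dx, dy, dz = (h - l for l, h in zip(box[0], box[1]))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def nearest_neighbor(boxes, i, radius):
    """Index j within `radius` slots of i that minimizes the merged surface area."""
    lo = max(0, i - radius)
    hi = min(len(boxes), i + radius + 1)
    candidates = [j for j in range(lo, hi) if j != i]
    return min(candidates, key=lambda j: surface_area(union(boxes[i], boxes[j])))

def ploc_pairs(boxes, radius=2):
    """One PLOC-style step: return mutually nearest pairs (i, j) with i < j."""
    nn = [nearest_neighbor(boxes, i, radius) for i in range(len(boxes))]
    return [(i, j) for i, j in enumerate(nn) if i < j and nn[j] == i]

def naive_group_candidates(radius, n):
    """Subsets of size n-1 a brute-force n-wide search would score per node."""
    return comb(2 * radius, n - 1)

# Four unit cubes: two near the origin, two far away along x.
boxes = [((0, 0, 0), (1, 1, 1)), ((1, 0, 0), (2, 1, 1)),
         ((10, 0, 0), (11, 1, 1)), ((11, 0, 0), (12, 1, 1))]
pairs = ploc_pairs(boxes)
```

With pairs, each node scores at most `2 * radius` candidates; the brute-force group count grows combinatorially with `n`, which is only an illustration and not a claim about any real implementation.
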
15:52phasta: Have a nice End-of-Year folks. See you next year
18:10sghuge: konstantin: yeah I was more afraid about PLOC... LBVH is still fine I believe. Intel HW does not perform better with the common approach... AMD can have children scattered (more like a pointer-based BVH) but Intel HW has a very strict requirement that all children have to be contiguous... and that's killing most of the performance I believe.
21:46konstantin: sghuge: AMD has the same requirement since GFX12 but it only means that we need to use an atomic for computing the final offsets. We recently changed to using atomics on older gens as well for more compact BVHs but I didn't notice a performance drop
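
The atomic-offset idea can be sketched on the CPU as a bump allocator: every internal node reserves a contiguous block of child slots with a single fetch-and-add. This is a simulation under assumptions, not the Mesa shader code; `AtomicCounter` and `reserve_child_ranges` are hypothetical names, and a lock stands in for the hardware atomic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class AtomicCounter:
    """Lock-based stand-in for a GPU global atomic counter."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, amount):
        """Add `amount` and return the value from before the add."""
        with self._lock:
            old = self._value
            self._value += amount
            return old

    @property
    def value(self):
        with self._lock:
            return self._value

def reserve_child_ranges(child_counts, workers=4):
    """Reserve one contiguous slot range per node.

    Returns (offsets, total). offsets[i] is the first slot of node i's
    children. The order in which nodes get their ranges depends on thread
    scheduling, but the ranges never overlap and leave no gaps.
    """
    counter = AtomicCounter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        offsets = list(pool.map(counter.fetch_add, child_counts))
    return offsets, counter.value

counts = [4, 2, 6, 3]
offsets, total = reserve_child_ranges(counts)
```

Because each node's children land in one block, the parent only needs to store a base offset and a child count, which is what makes the layout compact.
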
21:46sghuge: konstantin: atomics on Intel are bad AFAIK :(
21:47konstantin: I have a tool (please ignore the WIP stuff) https://gitlab.freedesktop.org/KonstantinSeurer/mesa/-/commits/vulkan-profiling?ref_type=heads that can give you a detailed view of how expensive each stage is
21:48konstantin: On AMD the sort and ploc seem to be the most expensive
21:49konstantin: You should probably implement updates on intel, they help us a lot because they have insane throughput (tri/sec)
21:50konstantin: Maybe it would make sense for you to use larger workgroups and shared atomics to reduce the number of global atomics
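
The workgroup suggestion can be sketched as a two-level allocation: invocations first accumulate into a per-workgroup (shared) counter, then the workgroup issues a single global add for its whole batch. This is a serial simulation under assumptions; `allocate_with_workgroups` is a hypothetical name, and plain integer adds stand in for the shared and global atomics.

```python
def allocate_with_workgroups(counts, workgroup_size):
    """Two-level slot allocation.

    Returns (offsets, total, global_atomics), where global_atomics is the
    number of global adds issued: one per workgroup instead of one per item.
    """
    global_counter = 0
    global_atomics = 0
    offsets = []
    for start in range(0, len(counts), workgroup_size):
        group = counts[start:start + workgroup_size]

        # Stage 1: each invocation adds to the shared (workgroup-local) counter.
        shared_counter = 0
        local_offsets = []
        for count in group:
            local_offsets.append(shared_counter)
            shared_counter += count

        # Stage 2: one global add reserves the whole workgroup's range.
        base = global_counter
        global_counter += shared_counter
        global_atomics += 1

        # Stage 3: every invocation turns its local offset into a global one.
        offsets.extend(base + local for local in local_offsets)
    return offsets, global_counter, global_atomics

offsets, total, global_atomics = allocate_with_workgroups([2, 3, 1, 4, 2], workgroup_size=2)
```

Larger workgroups reduce the global add count further, at the cost of more traffic on the shared counter; whether that trade pays off on a given GPU is something only profiling can answer.
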
21:50sghuge: yeah, that's one thing we haven't implemented yet... Tool looks pretty neat. Thanks for sharing.
21:52konstantin: BVH updates allow us to build most BVHs in ~0.5ms in cp2077 on the 9070xt
21:56sghuge: konstantin: ACK! wow, ~0.5ms is insane! I will start looking at update, that's on my TODO list.