03:14 AndrewR: imirkin, (re: is that 23 MSamples/s or /frame? ) sorry, I don't know if those values are per second or per frame ..
03:19 AndrewR: imirkin, but then, in the game menu or for intro videos it was more like ... 40/60/30 fps vs 5 Msamples ... in-game - 5 fps and 23 Msamples .....
03:29 AndrewR: https://www.techspot.com/review/312-mafia2-performance/page3.html - no card with 384Mb of vram .... still, they used higher resolutions and even in-game AA variant ....
03:33 AndrewR: also, game runs at nearly the same 5-7 fps when I set cpu to 1.4 GHz, or let it turbo-run up to nearly 4 GHz .... and lowering resolution makes fps go higher (up to 15 fps), but then game looks like crap ....
03:44 imirkin: karolherbst: so ... is there more bits to look at with that control flow thing, or just let it go?
03:45 karolherbst: imirkin: I mean, we figured out why it didn't work
03:45 karolherbst: I think we should basically decide what to do
03:45 karolherbst: should we depend on the IR being sane in a way that loops are always declared through precont/conts?
03:46 karolherbst: or should we check before merging that join with bras to verify that there are only unconditional bras?
03:46 imirkin: i don't think we do for the other case in the fermi/kepler lowering of shared atomics
03:46 karolherbst: yeah, doesn't look that way
03:46 karolherbst: but the join merging issue didn't happen there for some reason
03:47 karolherbst: maybe we always merge it with the next instruction instead of adding a phi nop?
03:47 karolherbst: although, mhhh
03:47 karolherbst: I think I would prefer to fix that join merging though
03:48 karolherbst: gives us a bit more freedom in the IR
03:48 karolherbst: not entirely sure why it never caused any issues on kepler
04:29 imirkin: karolherbst: could add an assert() for now
04:29 imirkin: that the branch goes to the join bb
14:33 AndrewR: karolherbst, yes, you were right, furmark can generate nearly 1G of instructions: https://ibin.co/4PWtISndgr5v.png mafia2/wine for comparison: https://ibin.co/4PWsWNk39CPS.png
14:35 karolherbst: yeah, furmark is quite the beast
14:35 karolherbst: pixmark_piano is a different kind of beast
14:35 karolherbst: much bigger in terms of instructions
16:28 tagr: skeggsb: are you aware of the discussion that's currently going on regarding various crashes with Nouveau on Tegra?
16:28 tagr: skeggsb: in a nutshell, we need to get this in as soon as possible: https://patchwork.freedesktop.org/patch/263588/
16:28 tagr: fixes a regression introduced in v4.20-rc1
18:53 Lyude: skeggsb: poke: there are some patches from nvidia for tegra that fix a regression from me converting over to the new drm probing functions for nouveau that I think you may have missed
18:53 Lyude: https://patchwork.freedesktop.org/patch/263587/ specifically
19:09 karolherbst: Lyude: tagr already poked here :p
19:10 pmoreau: If anyone is interested, the VK_NV_ray_tracing extension is exposed on Pascal hardware (at least on a 1080 Ti) with 415.22, which wasn’t the case on previous driver versions. I haven’t tried yet to check whether it’s actually working or not, but will definitely do when I have time.
19:10 karolherbst: pmoreau: that's something for 2021 :p
19:11 karolherbst: supporting that in mesa alone would require huge reworks :/
19:11 pmoreau: Pffff
19:11 karolherbst: pmoreau: we basically have to teach mesa to support multiple pipelines instead of just the normal gl one
19:11 karolherbst: but we would also come closer to that VR extension at the same time
19:12 pmoreau: What, no one wants to do ray tracing on an underclocked GPU? You might be able to get 1 frame per minute!! :-D
19:12 karolherbst: well
19:12 karolherbst: it might make sense for AMD to implement it
19:12 karolherbst: who knows
19:12 karolherbst: or doing it in software :D
19:13 pmoreau: How does it work for compute in Mesa? Isn’t it kind of a different pipeline as well? (Though, pipeline with a single stage.)
19:13 karolherbst: well
19:13 karolherbst: compute is hardly a pipeline
19:13 karolherbst: the problem is that inter stage code
19:13 karolherbst: pmoreau: we have tons of loops looping over all the stages except compute
19:14 karolherbst: like the linker
19:14 karolherbst: I would assume the changes to be straightforward
19:14 karolherbst: but
19:14 karolherbst: they have to be done
19:15 pmoreau: Yeah, and I don’t see Nouveau leading the charge on that, since there are other things to focus on. Plus it would require a Vulkan driver for Nouveau since there is only a Vulkan extension.
19:16 karolherbst: ohh, I thought it would also be valid for gl
19:16 karolherbst: mhh
19:16 HdkR: pmoreau: The good thing is that there is a large amount of code that could be shared between drivers on that extension
19:16 karolherbst: same for the VR one
19:17 pmoreau: It might be possible to create an OpenGL extension for ray tracing (that would require bindless), but currently there is none.
19:30 HdkR: pmoreau: A Mesa backed GL raytracing extension based on VK_NV? :P
19:38 pmoreau: That would be cool! :-)
19:42 HdkR: There is a reference DXR compute based implementation that is open source FYI
19:42 HdkR: Gives references to how to do guaranteed watertight intersection testing using FP32
21:38 rhyskidd: imirkin: thanks
23:25 rhyskidd: i think i've inferred the missing devinit opcodes from Volta (5 of them) and Turing (only 1 new seen): https://paste.fedoraproject.org/paste/DHxB5fwIZLIIzCa2PG9TOA
23:25 rhyskidd: confirmation of their actual use is lacking in some areas
23:26 rhyskidd: is this the kind of info that nv have in the past confirmed via that email address for nouveau developers to request specs?
23:26 rhyskidd: asking for others' experiences, before i go to the effort of drafting up and sending off the email to nvidia team
23:35 skeggsb: it never hurts to ask, i suspect it'll be very low priority though, since it's not something we "need to know" unless they're used outside the devinit scripts
23:35 skeggsb: (because ucode parses them these days)
23:55 rhyskidd: okay, i'll draft up something and send in the next few days