11:29fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> How easy is it to implement nir_intrinsic_reduce? :triangle_nvk:
11:48fdobridge: <gfxstrand> Probably not horrible except that it's a subgroup op and neither compiler has proper re-convergence semantics right now.
12:06fdobridge: <phomes> gfxstrand: I did start on the pipeline cache, but real life happened. I can try to get back to it tonight or you can also implement it yourself if you prefer. Don't wait for me if you need it soon 🙂
12:31fdobridge: <gfxstrand> Nah. It'll be nice once we have it but for now I'm happy to let it be
12:32fdobridge: <gfxstrand> I mentioned it because it occurred to me yesterday that I think the reason NAK runs take longer (by 10-15 min) is because I'm running more NIR optimizations. Then I realized we could be caching all that.
12:33fdobridge: <gfxstrand> But also meh? 🤷🏻♀️
12:33fdobridge: <gfxstrand> We should eventually but it's fine for now
12:33fdobridge: <gfxstrand> The whole compile pipeline is a mess and will be until we land NAK and use it for everything.
12:37fdobridge: <gfxstrand> I guess we could add some wrappers to make codegen look more like NAK. 🤔
15:29fdobridge: <gfxstrand> `Pass 405328 Fail 56 Crash 65 Warn 4`
15:32fdobridge: <gfxstrand> IDK what I think about threads. We'll see how it works out.
15:33fdobridge: <karolherbst🐧🦀> me neither... I'm always keeping in mind that medium-term we want to move to something else anyway 🙃
15:34fdobridge: <karolherbst🐧🦀> I'm kinda thinking about giving matrix another go as it did improve in the meantime and might actually provide what we'd need
15:34fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Meta/Facebook Threads is definitely concerning though
15:34fdobridge: <karolherbst🐧🦀> but it also bridges better to here
15:34fdobridge: <karolherbst🐧🦀> just have to find the spoons to set everything up
15:35fdobridge: <karolherbst🐧🦀> matrix also has threads since last year...
15:35fdobridge: <karolherbst🐧🦀> just the permission system is totally and entirely broken and useless
15:35fdobridge: <karolherbst🐧🦀> but
15:36fdobridge: <karolherbst🐧🦀> we can have a FDO "space" and sub-spaces for each individual driver
15:36fdobridge: <karolherbst🐧🦀> and it kinda looks like discord servers
15:36fdobridge: <karolherbst🐧🦀> so that's what initially was my reason to just ignore it
15:41fdobridge: <gfxstrand> @marysaka did you make any SPH progress while I was asleep?
15:43fdobridge: <marysaka> oh yeah I forgot to push, but I'm not that happy with the result so far tbh
15:43fdobridge: <gfxstrand> Ok, cool. Please push!
15:43fdobridge: <gfxstrand> I'm going to try and pull that in today and see if it fixes all my VS woes.
15:45fdobridge: <marysaka> On it 👍
15:47fdobridge: <karolherbst🐧🦀> I think I need to ping again on the thread to get the sph docs 🙃
15:47fdobridge: <karolherbst🐧🦀> @gfxstrand how is your endeavor with getting docs doing?
17:33fdobridge: <gfxstrand> Not great. If I don't hear anything by then, I'm going to pester people at the F2F after XDC
17:37fdobridge: <gfxstrand> @marysaka The good news is that your SPH stuff looks mostly like what I was hoping for. The bad news is that it doesn't work with non-constant offsets and neither do the NIR intrinsics today. 🤦🏻♀️
17:37fdobridge: <gfxstrand> I need to think about that. One option is to do a bit of NIR hackery to add the information. Another is to look at semantics (a bit annoying). A third would be to propagate more detailed information from C but I was really hoping to avoid that one.
17:38fdobridge: <gfxstrand> In any case, it should be good enough for FS so I'm going to go ahead and do a run.
17:38fdobridge: <marysaka> Oh no :vReiAgony:
17:38fdobridge: <gfxstrand> Totally my fault, that one.
17:39fdobridge: <marysaka> yeah I kind of thought it was weird but kept it that way if you are talking about the interpolated stuffs 😅
17:39fdobridge: <gfxstrand> It's a fundamental hole in my current I/O lowering plan.
17:41fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> How much I/O lowering does NAK do?
17:43fdobridge: <gfxstrand> The normal amount?
17:43fdobridge: <karolherbst🐧🦀> random thought: we might want to wire up 32 bit addressing mode to save on 64 bit alus... not sure if it makes sense with vulkan though....
17:43fdobridge: <gfxstrand> What would that do?
17:43fdobridge: <karolherbst🐧🦀> pointers are 32 bit
17:43fdobridge: <karolherbst🐧🦀> it's the `.E` flag on `LD` and `ST`
17:43fdobridge: <gfxstrand> Yeah, we can't restrict our global address space that far
17:44fdobridge: <karolherbst🐧🦀> `.E` -> 64 bit, 32 bit otherwise
17:44fdobridge: <gfxstrand> It's fine for local and shared but not global.
17:44fdobridge: <karolherbst🐧🦀> yeah...., but local and shared are already 32 bit anyway
17:44fdobridge: <karolherbst🐧🦀> I'm sure it made more sense 15 years ago 😄
17:45fdobridge: <karolherbst🐧🦀> CUDA still allows you to choose though, I wonder how long they'll keep it
17:45fdobridge: <karolherbst🐧🦀> but it does make sense for performance
17:47fdobridge: <gfxstrand> I'm sure it's a legacy "nvidia never deletes hardware" thing at this point. It makes no sense whatsoever when my card has 12GB of VRAM.
17:53fdobridge: <karolherbst🐧🦀> yeah...
17:55fdobridge: <karolherbst🐧🦀> maybe something we could consider on low VRAM cards on older gens at some point, but then again... it doesn't really matter
18:07fdobridge: <gfxstrand> no
18:07fdobridge: <gfxstrand> That's just asking for bug reports I can't reproduce
18:08fdobridge: <gfxstrand> In exchange for a performance gain I can't reproduce, either.
18:08fdobridge: <karolherbst🐧🦀> yeah...
18:08fdobridge: <karolherbst🐧🦀> fermi supports 64 bit pointers afaik, so that only leaves us with nv50
18:09fdobridge: <karolherbst🐧🦀> and nv50.... is uhm....
18:09fdobridge: <karolherbst🐧🦀> they have native ssbos
18:09fdobridge: <karolherbst🐧🦀> 16 slots of 32 bit address space
18:10fdobridge: <karolherbst🐧🦀> no idea if the hardware does OOB checks
18:10fdobridge: <karolherbst🐧🦀> but if we want to ditch codegen, we'd have to support nv50
18:10fdobridge: <karolherbst🐧🦀> or we keep codegen for old hardware
18:10fdobridge: <gfxstrand> codegen for old hardware
18:11fdobridge: <gfxstrand> Throw it in amber and forget it exists
18:11fdobridge: <karolherbst🐧🦀> I knew you'd say that 😛
18:11fdobridge: <karolherbst🐧🦀> yeah...
18:11fdobridge: <karolherbst🐧🦀> nv50 has its own driver anyway
18:11fdobridge: <karolherbst🐧🦀> so that's fine
18:11fdobridge: <gfxstrand> Repeat after me: Life's too short to optimize nv50
18:12fdobridge: <karolherbst🐧🦀> 😄
18:12fdobridge: <karolherbst🐧🦀> I agree
18:12fdobridge: <karolherbst🐧🦀> kepler is kinda the gen where nvidia started to become saneish
18:12fdobridge: <karolherbst🐧🦀> fermi is an oddball 🙃
18:13fdobridge: <karolherbst🐧🦀> however, Ben mentioned we could in theory make the dma-copy class work on fermi...
18:13fdobridge: <gfxstrand> https://cdn.discordapp.com/attachments/1034184951790305330/1156292383739215935/PfWJP7AAAAABJRU5ErkJggg.png?ex=651470b3&is=65131f33&hm=95bbdf66bfa3541de097f5f9f6734085462ce8b97f802818bb75a63488f3310d&
18:13fdobridge: <karolherbst🐧🦀> just needs firmware from nvidia
18:13fdobridge: <karolherbst🐧🦀> but fermi doesn't have reclocking, soooo.. again.. whatever
18:13fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Unless 🔺 starts learning NVIDIA hardware (and tries to make Teslakan) 😅
18:14fdobridge: <karolherbst🐧🦀> yo, no
18:14fdobridge: <karolherbst🐧🦀> I'm sure you can't make a compliant vk 1.0 driver on tesla 😄
18:14fdobridge: <karolherbst🐧🦀> maybe on the 3rd gen of tesla...
18:14fdobridge: <gfxstrand> You can't make a 1.3 driver on Kepler
18:14fdobridge: <karolherbst🐧🦀> but uhhh
18:14fdobridge: <karolherbst🐧🦀> the horrors
18:15fdobridge: <karolherbst🐧🦀> because of some image stuff?
18:15fdobridge: <gfxstrand> No, something with atomics and caching. It can't pass the memory model tests.
18:15fdobridge: <karolherbst🐧🦀> uhh.. pain
18:15fdobridge: <gfxstrand> And nvidia folks are very sure it's a HW thing
18:15fdobridge: <karolherbst🐧🦀> yeah.. sounds about right
18:16fdobridge: <karolherbst🐧🦀> kinda sad, but also...
18:16fdobridge: <karolherbst🐧🦀> is vk 1.3 strictly needed for dxvk anyway?
18:17fdobridge: <karolherbst🐧🦀> worst case people see some weirdo glitches, with the only alternative being not being able to game at all
18:17fdobridge: <karolherbst🐧🦀> sooo...
18:17fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> It does use the memory model stuff I think (at least v2.0+)
18:17fdobridge: <karolherbst🐧🦀> pain
18:21karolherbst: zink intel: Pass 2417 Fails 35 Crashes 2 AMD: Pass 2285 Fails 30 Crashes 139 lvp: Pass 2286 Fails 30 Crashes 138
18:21karolherbst: I think that's good enough :D
18:22karolherbst: though the crash might be something silly
18:22karolherbst: ehh
18:22karolherbst: wrong channel
19:14fdobridge: <dadschoorse> it needs d3d11 behavior, so what ever kepler does for atomics is fine
19:15fdobridge: <dadschoorse> but really, are there even users with kepler left?
19:15fdobridge: <karolherbst🐧🦀> it's the only generation we can more or less properly reclock
19:15fdobridge: <karolherbst🐧🦀> and there are users using those GPUs for that reason
19:16fdobridge: <karolherbst🐧🦀> if you absolutely don't want to install binary drivers, kepler is kinda your only option if you want to play some games
19:16fdobridge: <dadschoorse> I guess then you can just add an app profile that enables 1.3 for dxvk 🐸
19:17fdobridge: <karolherbst🐧🦀> yeah.. something
19:17fdobridge: <karolherbst🐧🦀> as I said, the alternative is poor performance gaming through opengl
19:18fdobridge: <dadschoorse> some of these gpus are getting old enough that you might have to worry about dying capacitors 🐸
19:19fdobridge: <karolherbst🐧🦀> yeah...
19:19fdobridge: <karolherbst🐧🦀> but we also have users of 20+ year old GPUs
19:19fdobridge: <karolherbst🐧🦀> and the bugs we see are generally driver bugs 🙃
19:19fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> I'm getting https://wiki.archlinux.org/title/Intel_graphics#OpenGL_2.1_with_i915_driver flashbacks
19:19fdobridge: <karolherbst🐧🦀> though I have some nv40 GPUs I'm convinced are dying
19:20fdobridge: <![NVK Whacker] Echo (she) 🇱🇹> Are they all AGP?
19:20fdobridge: <karolherbst🐧🦀> PCIe
19:20fdobridge: <karolherbst🐧🦀> I don't have AGP or PCI gpus 😄
19:20fdobridge: <karolherbst🐧🦀> it's kinda wild to put those old "beasts" into a system with a 12th gen intel cpu
21:07fdobridge: <marysaka> ... I think I still have a Geforce 6600 somewhere, not sure if it still works tho
21:10fdobridge: <karolherbst🐧🦀> should run gnome 🙃
21:11fdobridge: <karolherbst🐧🦀> I have a wonky nv40 which runs gnome without issues, but crashes once you start glxgears
21:11fdobridge: <karolherbst🐧🦀> and with gnome I mean like modern gnome
21:15fdobridge: <marysaka> The fact it even runs it is a miracle :AkkoDerp:
21:16fdobridge: <karolherbst🐧🦀> hey.. we fixed a critical bug like.. 3 years ago
21:16fdobridge: <karolherbst🐧🦀> it was a hilarious bug
21:17fdobridge: <karolherbst🐧🦀> https://gitlab.freedesktop.org/mesa/mesa/-/commit/1387d1d41103b3120d40f93f66a7cfe00304bfd7
21:17fdobridge: <karolherbst🐧🦀> okay.. 2.5 years
21:18fdobridge: <dadschoorse> I have a laptop with a GeForce2 card somewhere still, is that still supported?
21:18fdobridge: <karolherbst🐧🦀> gnome was _broken_ without that
21:18fdobridge: <karolherbst🐧🦀> like entirely and completely 😄
21:18fdobridge: <karolherbst🐧🦀> it was a bunch of random vertices doing random stuff
21:18fdobridge: <karolherbst🐧🦀> sure
21:18fdobridge: <karolherbst🐧🦀> without GL though 😄
21:18fdobridge: <karolherbst🐧🦀> but you should get kms
21:19fdobridge: <karolherbst🐧🦀> though it might be broken depending on if users still use it and if they do report bugs 😄
21:20fdobridge: <karolherbst🐧🦀> I know that we get some bug reports for nv40 era hardware
21:20fdobridge: <karolherbst🐧🦀> maybe nv30?
21:20fdobridge: <karolherbst🐧🦀> dunno
21:21fdobridge: <dadschoorse> I actually used it in 2017-2018 for school essays and watching yt videos with mpv
21:22fdobridge: <karolherbst🐧🦀> oh wow
21:23fdobridge: <dadschoorse> my first computer where I used linux full time, LXDE was the only thing that could run with decent performance
23:24pabs: from #debian-offtopic:
23:24pabs: <tomman> related: if you wanna swap to VRAM on a noVideo card, do NOT use Nouveau (much less the blob!) - either live without a framebuffer console, or fallback to uvesafb
23:24pabs: <tomman> if you tell Nouveau to only use, say, 8MB VRAM, Nouveau will say "oooook, but no warranties, I'm free to do what I please with the VRAM so go away" *kernel panics*
23:24pabs: <tomman> but if you tell uvesafb to use 8MB VRAM, it will use 8MB VRAM and not a single byte outside that range!