13:37 dcunit3d: is there any way to run JAX on the rocm/jax-build image?
13:37 dcunit3d: i got it running, but the python install is crippled. it wasn't built with sqlite support, so i can't run jupyter
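A minimal reproduction of the sqlite problem, assuming the image's bundled python is on PATH; jupyter's notebook/history machinery imports sqlite3, so this fails fast on a python compiled without it:

    # fails with ImportError on a python built without sqlite support
    try:
        import sqlite3
        print("sqlite3 ok:", sqlite3.sqlite_version)
    except ImportError as exc:
        print("python was built without sqlite:", exc)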
13:37 dcunit3d: it shows jax is 0.4.6, but there are no ROCmSoftwarePlatform branches that match that
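A quick way to pin down what to diff against is to print both the jax and jaxlib versions, since the ROCm builds track jaxlib separately from jax:

    # compare the installed versions against upstream and fork tags
    import jax
    import jaxlib
    print("jax:   ", jax.__version__)
    print("jaxlib:", jaxlib.__version__)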
13:38 dcunit3d: i've spent like a month trying to get to the point where i can run python code
13:39 dcunit3d: and i know the AMD compute situation isn't going to be like this forever... but by the time any reasonably installable software runs on my GPU, it will be worth $200
13:39 dcunit3d: i don't give a shit about video games and i never would have bought a GPU to play games
13:40 dcunit3d: and i certainly don't have time for that after trying to get OpenCL software to run
13:47 dcunit3d: just make your own language/framework. it's too complicated to deal with the limitations https://docs.cupy.dev/en/v12.0.0/install.html#limitations
13:49 dcunit3d: how do i run these projects? is there like a grant or something where i can get software that will run on my GPU? https://www.amd.com/en/graphics/servers-solutions-rocm-hpc
13:51 dcunit3d: most people haven't even thought about using DHCPv6 to boot an HPC cluster ... oh but i can play cyberpunk on linux yay!
20:04 dcunit3d: ok so one of the biggest problems i've had in figuring out how to use the ROCm stuff is trying to learn how the builds/dependencies differ between the official tensorflow/etc repositories and the ROCmSoftwarePlatform repositories.
20:05 dcunit3d: diffing across the repositories, each checked out at a specific ref, helps a lot. see the org-babel scripts here https://github.com/dcunited001/nb-jax/#repo
20:05 dcunit3d: and the resulting diff here: https://github.com/dcunited001/nb-jax/blob/master/rocm046.diff
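A rough python sketch of that cross-repo diff workflow, in the spirit of the nb-jax scripts; the fork ref names below are placeholders, not the actual refs used there:

    # clone upstream jax and the ROCm fork at specific refs, then diff the trees
    import shutil
    import subprocess

    def clone_at(url, ref, dest):
        subprocess.run(["git", "clone", url, dest], check=True)
        subprocess.run(["git", "-C", dest, "checkout", ref], check=True)
        shutil.rmtree(f"{dest}/.git")  # keep .git noise out of the diff

    clone_at("https://github.com/google/jax", "jax-v0.4.6", "jax-upstream")
    clone_at("https://github.com/ROCmSoftwarePlatform/jax", "rocm-v0.4.6", "jax-rocm")

    # git diff --no-index compares two directories; exit code 1 just means "differs"
    diff = subprocess.run(
        ["git", "diff", "--no-index", "jax-upstream", "jax-rocm"],
        capture_output=True, text=True,
    )
    with open("rocm046.diff", "w") as f:
        f.write(diff.stdout)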
21:45 superkuh: Hah. If you think ROCm support will still be available for your AMD gpu by the time it's worth $200, you are in for a surprise. Look at the lack of ROCm support for the RX 580, released in 2017.
21:54 soreau: what about rusticl?
21:58 superkuh: I don't have any knowledge about rusticl yet. I am reading now. For things that are OpenCL, that seems promising. But no ROCm/HIP-mediated CUDA-ish support?
21:59 dcunit3d: yeah, i'm fixing to just write something that compiles, superkuh
21:59 superkuh: Ah, sorry to interject with my pet peeves. ;)
22:00 dcunit3d: i think i finally got JAX working though, from the rocm/tensorflow-build image. that image then wants to run a build of TF, but i'm not sure i need it.
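A sanity check that jax actually sees the ROCm GPU, assuming jaxlib was built with ROCm support (the exact device repr varies by jax version):

    import jax
    import jax.numpy as jnp

    print(jax.default_backend())  # expect "gpu" on a working ROCm build
    print(jax.devices())

    # jit-compile and run a small kernel; lands on the GPU if one was found
    x = jnp.arange(1_000_000, dtype=jnp.float32)
    print(jax.jit(lambda v: (v * 2.0).sum())(x))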
22:02 dcunit3d: the differences between nvidia & AMD's architectures (the ability to have Infinity Cache, etc.) make matching CUDA's interface a moving target
22:02 dcunit3d: i just had no idea what i was getting into. i can't stand nvidia.
22:03 dcunit3d: i can understand that older hardware needs to be phased out. that is definitely frustrating.
22:04 superkuh: 4 years of life.
22:04 dcunit3d: i was stuck with a 2013 macbook pro where the nvidia 400-series drivers kept phasing in and out of various linux package managers. so things like NVENC/NVDEC were supported and would work, then wouldn't.
22:06 dcunit3d: i think the difficulty in supporting those cards has more to do with the switch to RDNA/CDNA architecture
22:08 dcunit3d: maybe openmp can help open up older architectures eventually. i have like 8x AMD 520s or something from 2014. i know they could compute /something/, but the power usage just isn't worth it.
22:08 superkuh: "lets use all the newest libs when writing the RDNA stuff!" and then suddenly all the old code which executed just fine now has to be re-written too.
22:08 dcunit3d: i would actually try to figure that out
22:10 dcunit3d: yeah, but i think relaxing the restrictions on caching requires significant changes to some of the firmware and low-level software interfaces (like the AMD equivalent of nvidia PTX, which is like assembler for graphics cards)
22:11 dcunit3d: i've been through like 10 years of being completely capable of doing ML & data science, but unable to participate for lack of a decent GPU.
22:11 dcunit3d: it sucks
22:11 psykose: don't think you're missing out at all
22:11 superkuh: I don't know, having to use CPU inference for large language models kind of sucks when I have an 8GB GPU just sitting there.
22:12 dcunit3d: i'm actually surprised that there aren't significant restrictions in functionality resulting from the Infinity Cache, but i think that's at least one reason for separating RDNA/CDNA
22:13 dcunit3d: firefox seems to want to eat half my VRAM. i got the 5950X (a zen 3 chip without integrated graphics, so that was maybe a mistake)
22:14 dcunit3d: KDE eats like 1.5 GB and each browser window eats ~250 MB, and then desktop environments like KDE make it almost impossible to turn off transparency.
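One way to see what's actually eating VRAM is to poll rocm-smi; a sketch assuming rocm-smi is on PATH (its output format varies across ROCm releases):

    # dump current vram usage as reported by the ROCm driver
    import subprocess

    out = subprocess.run(
        ["rocm-smi", "--showmeminfo", "vram"],
        capture_output=True, text=True, check=True,
    ).stdout
    print(out)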
22:14 dcunit3d: i have a huge desktop, so it matters (i think?)