16:19 jhugo: tomeu: is there documentation for NIR from the driver perspective? Something like how to implement a compute shader in a gallium driver using NIR? docs.mesa3d.org does not seem to have much. We have a gallium driver that interfaces with teflon, and can compile/invoke subgraphs using the CPU. Next step is actually interacting with the
16:19 jhugo: accelerator. We need to compile the subgraph into an ELF .so and I'm hoping there is existing functionality in Mesa we can leverage for that
16:22 tomeu: jhugo: I think you can find some of that in Faith's blog: https://www.gfxstrand.net/faith/projects/mesa/nir-notes/
16:22 tomeu: so the .so is passed through the kernel to the firmware on the other side?
16:25 jhugo: Yes
16:26 tomeu: ok, so right now we don't have NIR operations that operate on tensors
16:27 tomeu: so you either have to add those (hardware-independent), convert the graph to it, and lower from there to your ISA
16:28 tomeu: or you can generate lower-level NIR from the graph
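A minimal sketch of the first option, under two big assumptions: NIR has no tensor operations today, so the "conv2d" intrinsic here is hypothetical and would have to be added to nir_intrinsics.py (which is what makes the build generate a nir_conv2d() builder helper), and the teflon-side graph walk is elided entirely.

    /* Convert the delegated teflon subgraph into NIR, one intrinsic per ML op.
     * The conv2d intrinsic and nir_conv2d() helper are assumptions, not
     * existing NIR operations. */
    #include "nir.h"
    #include "nir_builder.h"

    nir_shader *
    teflon_subgraph_to_nir(const struct nir_shader_compiler_options *opts)
    {
       nir_builder b = nir_builder_init_simple_shader(MESA_SHADER_COMPUTE, opts,
                                                      "teflon subgraph");

       /* For each node in the subgraph: look up the SSA values produced by
        * earlier nodes, then emit the matching intrinsic.  For a CONV_2D
        * node it could look like this (placeholders for the operands): */
       nir_def *input = NULL, *weights = NULL;  /* produced by earlier nodes */
       nir_def *out = nir_conv2d(&b, input, weights);
       (void)out;

       return b.shader;
    }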
16:33 jhugo: Hmm, maybe I'm thinking about this wrong? I thought that I could teach NIR about our ISA, and then hook into the lower-level parts of NIR to convert the teflon graph to the .so (or at least get the contents of a text segment which we can then package into a .so in our driver)
16:35 tomeu: you use NIR to lower down to a level from which you can generate machine code
16:35 tomeu: then I guess you can stuff that into a .so
16:35 tomeu: you may need to add some NIR operations specific to your hardware so you can get low enough
16:36 tomeu: but if the ISA is capable of supporting CL-like compute kernels, then you probably can reuse most of them
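And a sketch of the lowering side, assuming the same hypothetical conv2d intrinsic and the standard nir_shader_instructions_pass helper; the replacement sequence emitted in the callback is where the ISA-specific knowledge would live.

    /* Lower the tensor-level intrinsic to something the backend can turn into
     * machine code: buffer loads/stores, ALU ops, or a hardware-specific
     * intrinsic added for the matrix co-processor. */
    #include "nir.h"
    #include "nir_builder.h"

    static bool
    lower_tensor_op(nir_builder *b, nir_instr *instr, void *data)
    {
       if (instr->type != nir_instr_type_intrinsic)
          return false;

       nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
       if (intr->intrinsic != nir_intrinsic_conv2d)   /* hypothetical op */
          return false;

       b->cursor = nir_before_instr(instr);
       /* Emit the replacement here (tiling loops, loads, a co-processor
        * intrinsic, stores), rewrite uses of the old result, then remove
        * the original instruction. */
       return true;
    }

    bool
    my_isa_lower_tensor_ops(nir_shader *s)
    {
       return nir_shader_instructions_pass(s, lower_tensor_op,
                                           nir_metadata_block_index |
                                           nir_metadata_dominance, NULL);
    }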
16:39 jhugo: The ISA is a DSP (Hexagon) with Vector and Matrix co-processors. We haven't hooked up compute to it (as in OpenCL), but that's on my wishlist
16:39 tomeu: ah, very cool
16:39 jhugo: We've got caches, local memory and DDR to play with
16:40 tomeu: so yeah, you should be able to share the compiler between teflon and rusticl (for opencl)
16:40 tomeu: that's fine, you would deal with that in the higher-level passes
16:41 tomeu: if you know that you want to do opencl, then maybe it would make sense to start with it, even if not compliant
16:42 tomeu: so once you have opencl kernels running (and thus a compiler), you can convert the graph to NIR (if you have added tensor operations) and lower from there down to the CL level
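One way that sharing can look at the gallium level, as a rough sketch: both rusticl and the teflon path can end up handing the driver NIR, so a single create_compute_state implementation can feed a common NIR-to-Hexagon backend. my_compile_nir() is a placeholder for that backend, not an existing function.

    #include <assert.h>
    #include "pipe/p_context.h"
    #include "pipe/p_state.h"
    #include "nir.h"

    /* Hypothetical backend entry point: lower the NIR further and emit the
     * machine code that ends up in the ELF .so sent to the firmware. */
    void *my_compile_nir(struct pipe_screen *screen, nir_shader *nir);

    static void *
    my_create_compute_state(struct pipe_context *pctx,
                            const struct pipe_compute_state *cso)
    {
       assert(cso->ir_type == PIPE_SHADER_IR_NIR);
       nir_shader *nir = (nir_shader *)cso->prog;

       return my_compile_nir(pctx->screen, nir);
    }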
16:42 jhugo: I'm guessing we'd need to extend NIR eventually for the ML ops. I need to double check the ISA details, but the matrix co-processor should be able to directly consume a 2d conv
16:43 tomeu: I think that should be fine
16:43 tomeu: but will require new NIR ops obviously
16:44 jhugo: Yep.
16:44 tomeu: you program the matrix co-processor through a DSP instruction?
16:45 tomeu: or is it something bolted to the side?
16:45 jhugo: Through DSP instruction. ISA extension
16:45 tomeu: that's great
16:46 jhugo: I presume you are somewhat familiar with x86 SIMD or ARM NEON
16:46 jhugo: Similar to that
16:46 jhugo: Very powerful through its flexibility, but with that comes complexity :)
16:46 tomeu: guess some instructions to prepare the input, then another to configure the array, then another to retrieve the output?
16:48 tomeu: that's what the Coral TPU does, and I think TI as well
16:49 jhugo: Basically. Lay out ("configure") the operation in SRAM, invoke the co-processor instruction, and it dumps the output back into SRAM
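Purely as an illustration of that sequence, none of these helpers are real Hexagon intrinsics or instructions; they only mark where the backend would emit the real ones.

    /* Hypothetical tile execution for the flow described above: stage the
     * operands and descriptor in SRAM, fire the co-processor instruction,
     * read the result back out of SRAM. */
    static void
    run_matrix_op(void *sram, const struct my_tile_desc *desc)
    {
       layout_operands_in_sram(sram, desc);   /* the "configure" step         */
       issue_matrix_coproc_insn(sram);        /* the DSP/ISA-extension insn   */
       copy_output_from_sram(sram, desc);     /* result is dumped into SRAM   */
    }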
16:51 tomeu: cool, nothing weird then
16:52 tomeu: if you have enough people, then maybe you can parallelize the work by having one person do the opencl driver, and another add the NIR tensor operations and translate from the tflite ops
16:53 tomeu: if we find that adding a tensor data type to NIR is too weird, we could consider using MLIR instead
16:53 tomeu: and lowering from MLIR to NIR at some point
16:54 tomeu: but if NIR can reasonably handle tensors, I would avoid MLIR in Mesa
16:55 jhugo: Yeah. In concept, I'd like to stick to NIR so we get leverage between compute and ML, but I'm very new at this so I'm expecting reality to end up being very different from what I picture
16:56 tomeu: with MLIR you would still have NIR underneath: you would lower from tensor-level operations in MLIR to image/buffer-level operations in NIR
16:56 tomeu: but it brings a lot of complexity and has a lot of overlap with NIR
16:56 tomeu: as both are multi-level IRs that are supposed to support a broad range of complexity levels
16:59 jhugo: Ok. Sounds like I've got a list of topics to go research, but I think I've got an idea of the direction to go in
17:00 tomeu: sounds like a lot of fun to me :)