00:00 karolherbst: ohhh no
00:00 karolherbst: ehh wrong channel
00:01 airlied: karolherbst: hardware vm
00:02 karolherbst: airlied: eh.. just having a VM or what do you mean?
00:02 airlied: pretty much, having per-client VM
00:03 karolherbst: yeah well.. we've had that on nv hardware for a lot longer than anything nvidia or we would expose vulkan support for
00:03 mangix: karolherbst: close enough :P
00:03 karolherbst: I think NV50 got it already
00:03 karolherbst: but vulkan is kepler+
00:03 airlied: so you could bring up a cutdown vulkan on those for zink :-P
00:04 imirkin: should be able to do vulkan on nv50 i think
00:04 karolherbst: nvidia had a faulty impl for fermi afaik
00:04 karolherbst: and they disabled it
00:04 airlied: you just end up like intel haswell, small things don't conform and are pita to fix
00:04 karolherbst: would be fun though if nv50 supported vulkan
00:04 karolherbst: yeah, probably that
00:04 karolherbst: nv50 doesn't have a 64 bit VA
00:05 imirkin: i guess i'm not completely sure what the min requirements for vk are
00:05 imirkin: karolherbst: is that a problem?
00:05 karolherbst: imirkin: if you map host memory, then yes
00:05 imirkin: karolherbst: kepler doesn't either...
00:05 karolherbst: kepler doesn't?
00:05 imirkin: all 40-bit
00:05 imirkin: (including nv50)
00:05 karolherbst: mhhh
00:05 airlied: I think 40-bit is enough
00:05 karolherbst: well, nv is 32 bit in shaders
00:05 karolherbst: *nv50
00:05 imirkin: welllll
00:05 imirkin: 16x32-bit ;)
00:06 karolherbst: ;)
00:06 imirkin: i.e. you get 16 32-bit segments
00:06 airlied: I think you could do a reasonable nv50 vulkan, but you'd be hitting a lot of edges
00:06 karolherbst: yeah, sounds like pain though
00:06 karolherbst: airlied: sounds like a fun project
00:06 karolherbst: but I think nv50 would be the next step
00:06 karolherbst: if we could do vulkan on fermi that would be good
00:06 airlied: like cayman could have done it for amd, but it was one generation of gpu
00:06 karolherbst: as nvc0 is fermi+
00:07 karolherbst: at least I would love to do benchmarks between nvc0 and vulkan + zink and evaluate if we find the time like ever to perf improve nvc0 (probably not) or just trash it and use zink
00:09 karolherbst: I honestly don't see a path where anyone would spend much time improving perf on a GL driver in the future really
00:09 karolherbst: well.. especially nouveau
00:09 mangix: zink was vulkan to opengl, right?
00:09 karolherbst: uhm
00:09 karolherbst: the other way around I think would be the better phrasing
00:10 mangix: right, depends on point of view
00:12 imirkin: airlied: it'd definitely be on the weak side. lots of atomic things aren't supported, esp on earlier variants, dunno if they're required on vk
00:12 karolherbst: could lower it
00:12 karolherbst: or just make it supported enough so that things work we care about
00:12 imirkin: how do you lower atomics exactly?
00:13 karolherbst: like intel lowers 64 bit atomics :p
00:13 imirkin: yeah
00:13 airlied: like if you can do gles3.1 then you can usually do baseline vulkan
00:13 imirkin: but they have _something_
00:13 karolherbst: imirkin: they don't
00:13 karolherbst: not for 64 bit
00:13 imirkin: karolherbst: they have 32-bit
00:13 karolherbst: (at least not all memory)
00:13 karolherbst: ahh yeah
00:13 imirkin: karolherbst: G80 has _nothing_ iirc
00:13 imirkin: G84-GT200 has global but not shared
00:13 karolherbst: not even an atomic store/load?
00:13 imirkin: GT215+ has shared
00:14 karolherbst: cas is just optional really
00:14 imirkin: i'm talking about anything
00:14 karolherbst: mhh
00:14 karolherbst: yeah okay.. that's annoying
00:14 karolherbst: but it could be a best effort impl
00:14 karolherbst: dunno
00:15 karolherbst: write an ext to make things optional? :D
00:15 karolherbst: imirkin: do we do gles3.1 on nv50?
00:15 imirkin: karolherbst: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp#n1484
00:15 imirkin: karolherbst: on GT215, i flipped it on
00:15 karolherbst: okay
00:15 imirkin: in truth, we do miss on some very minor bits
00:15 karolherbst: what's missing for prev gens?
00:15 imirkin: er, GT215+
00:15 imirkin: for ES 3.1?
00:15 karolherbst: yeah
00:16 imirkin: GT215 adds texgather, texquerylod
00:16 karolherbst: uhhh
00:16 imirkin: and some other small bits iirc
00:16 imirkin: slightly earlier gens add shared atomics
00:17 imirkin: (looks like nva0+)
00:17 karolherbst: yeah...
00:18 karolherbst: not sure if it makes sense to bother then
00:18 karolherbst: but nvc0+ would be a cool target
00:18 imirkin: in practice, the "big" missing feature for GT215 on ES 3.1 is component select on tex gather
00:18 karolherbst: like if anybody cares
00:18 imirkin: ;)
00:18 imirkin: seemed like an OK thing to just whiff on
00:19 imirkin: in exchange for compute, images, etc
00:19 karolherbst: yeah..
00:20 karolherbst: anyway... I think zink would be a viable mid term solution, until we either care enough or find enough time to write a proper perf oriented driver, because if we brought the current one up to speed, we would rewrite it completely anyway I think
00:20 karolherbst: especially with all the threaded context, shader variants and what else other drivers are doing
00:20 karolherbst: state tracking has to be reworked anyway
00:21 karolherbst: oh well
00:21 karolherbst: depends on the actual perf in the end
00:21 karolherbst: but I would say it's a safe bet to assume zink being faster
00:22 imirkin: heh
00:23 airlied: I think the one area zink fails against radeonsi is compiler side
00:23 karolherbst: ahh yeah, makes sense
00:23 airlied: like there is CPU overhead, but mareko put a lot of work into reducing compilation/linking times
00:23 karolherbst: we do the glsl -> nir -> spir-v round trip, don't we?
00:23 airlied: granted LLVM as a backend made them excessive to begin with
00:24 karolherbst: uhh
00:24 airlied: karolherbst: yes
00:24 airlied: it's not really the compiler IR stack as much as when you have to compile
00:24 karolherbst: would be cool if we could do glsl -> GL spir-v -> Vk spir-v
00:24 airlied: not really
00:24 karolherbst: why not?
00:24 airlied: just doesn't save you anything where you need to save it
00:24 karolherbst: ahh
00:25 karolherbst: although I'd assume that would lower the overhead a little, but granted glsl parsing is the real killer anyway
00:25 airlied: the problem is that vulkan pipelines are a big compile, and GL doesn't really fit that model
00:25 karolherbst: ohh, I see
00:25 karolherbst: could add more extensions or something :P
00:26 airlied: there are some coming
00:26 airlied: https://www.khronos.org/registry/vulkan/specs/1.3-extensions/man/html/VK_KHR_pipeline_library.html
00:26 airlied: or here already
00:27 karolherbst: cool
00:28 karolherbst: if we keep adding ext after ext to make translation a practical no-op I guess then it really doesn't make much sense to keep a GL driver
00:29 karolherbst: though the biggest advantage of caring more about vulkan are all the windows games anyway
00:29 airlied: yeah with steam and proton, now GL usage is quite reduced
00:29 airlied: unless you really want to tackle the workstation market
00:29 karolherbst: so where it matters you have vulkan anyway and where it don't the overhead won't be a problem
00:29 karolherbst: *you
00:30 karolherbst: airlied: why workstation?
00:30 karolherbst: legacy software?
00:30 airlied: that CAD software stuff is stuck on GL
00:30 airlied: and they love excessive vertex heavy workloads
00:30 karolherbst: yeah sure fine.. that 5% perf difference won't kill them
00:30 airlied: though mareko has optimised mesa for that in most of the generic paths
00:30 karolherbst: ahh, cool
07:26 mynacol: karolherbst: Running your MT fixes MR since January. Definitely better, but still freezes especially in MPV sometimes.
07:26 mynacol: Will be switching to the branch with even more fixes.
23:02 TimurTabi: I may have asked this before, but is there a way to write to an offset from the base address of a subdev?
23:02 TimurTabi: I have a function that needs to write to a particular register in one of three different Falcons, either the GSP, the SEC, or the NVDEC (whatever that is).
23:02 TimurTabi: It's the same register, but the base address is different (either 110000, 840000, or 830000)
23:03 TimurTabi: I was hoping that struct nvkm_subdev would have a "u32 base_address" field or something like that.
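[editor's sketch] The pattern Timur describes (one register offset, three possible base addresses) could be handled with a small lookup table rather than a field in struct nvkm_subdev. This is only an illustration, not nouveau's actual API: the names falcon_id, falcon_base, falcon_reg_addr, falcon_wr32 and mock_mmio are all hypothetical, the base values are the ones quoted above and are assumed to be hexadecimal, and the MMIO write is replaced by a stand-in that just records the access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical identifiers for the three Falcons named in the question. */
enum falcon_id { FALCON_GSP, FALCON_SEC, FALCON_NVDEC, FALCON_COUNT };

/* Base addresses quoted in the chat, assumed here to be hexadecimal. */
static const uint32_t falcon_base[FALCON_COUNT] = {
    [FALCON_GSP]   = 0x110000,
    [FALCON_SEC]   = 0x840000,
    [FALCON_NVDEC] = 0x830000,
};

/* Absolute register address = per-Falcon base + shared register offset. */
static uint32_t falcon_reg_addr(enum falcon_id id, uint32_t offset)
{
    assert((unsigned)id < FALCON_COUNT);
    return falcon_base[id] + offset;
}

/* Stand-in for a real MMIO write: records the last access for inspection. */
struct mock_mmio {
    uint32_t last_addr;
    uint32_t last_val;
};

/* Write 'val' to the register at 'offset' inside the selected Falcon. */
static void falcon_wr32(struct mock_mmio *io, enum falcon_id id,
                        uint32_t offset, uint32_t val)
{
    io->last_addr = falcon_reg_addr(id, offset);
    io->last_val = val;
}
```

In a real driver the last step would go through whatever register-write helper the driver already provides; the only point here is that the caller passes a Falcon identifier plus a common offset and the base is resolved in one place.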
23:29 anholt: well, gm206 looks way more stable than gm20b.
23:30 anholt: deqp transform feedback is still a flakefest
23:37 karolherbst: mynacol: yeah.... I did not test those patches with mpv or something _at_all_
23:54 airlied: skeggsb: probably a q for you above from Timur
23:54 airlied: though he's gone now