00:01karolherbst: imirkin: TEX.T R5, R4, 0x0, 0x0, 2D, 0x1; on SM20
00:02karolherbst: btw, you need cuda-8 for fermi
00:02edgecase: ok looks like the right place to start.
01:28karolherbst: imirkin: any idea what this means? ce0: LAUNCHERR 00000003 [MEM2MEM_RECT_OUT_OF_BOUNDS]
01:28karolherbst: probably something super stupid
01:29karolherbst: I guess something went wrong with the texture upload
01:32karolherbst: mhhh... maybe we could benefit from an op turning indirect samplers/textures into direct ones if the value is actually a constant... mhhh
01:34imirkin: yeah, m2mf parameters are bad
16:55edgecase: what's an imem? nv50, PCI BAR2 is used to map vram objects to kernel address space, with ioremap() and gpu vmm
16:55edgecase: instmem, instance?
16:56edgecase: also a mystery is nvkm and nvif, what do those mean?
18:28edgecase: subdev = OO inheritance it seems
18:32karolherbst: edgecase: the entire driver is developed in an OOP style
18:33imirkin_: nvkm = kernel module. nvif = InterFace
18:34imirkin_: various bits of nvkm present a semi-stable interface that can be called via nvif
18:34imirkin_: almost like syscall-style
18:34imirkin_: the idea being that this could potentially be done across a VM boundary
18:34imirkin_: but that's just future hope
18:34karolherbst: well, the nvif interface is actually an API to userspace and therefore needs to be stable
18:34imirkin_: right. and there's also usif, for maximal confusion.
18:35karolherbst: yeah.. but it's essentially the same interface
18:35karolherbst: it just wraps in a weird way
18:35karolherbst: I always get lost when looking into that
18:35imirkin_: used to be even less explicit than it is now :)
18:45edgecase: container_of() secret sauce
18:45imirkin_: there's a lot less of that now, iirc
18:45imirkin_: as well as naked pointer casts
18:46edgecase: it's fine, just getting up to speed on basic Linux stuff
18:46imirkin_: that's not how most drivers are done
18:46edgecase: was container_of replaced with something else?
18:46imirkin_: nouveau just covers a LOT of generations
18:46imirkin_: so there's a lot of indirection, since they're all a little different
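[Note: a minimal sketch of the container_of() pattern and per-generation indirection discussed above. All `demo_*` names are hypothetical and are not taken from nouveau; the macro is a simplified stand-in for the kernel's own container_of(), without its type checking.]

```c
#include <stddef.h>  /* offsetof */

/* Simplified stand-in for the kernel's container_of(): recover the
 * enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_base;

/* Per-generation function table (hypothetical). */
struct demo_base_func {
	unsigned long (*addr)(struct demo_base *);
};

/* "Base class": holds only the function table pointer. */
struct demo_base {
	const struct demo_base_func *func;
};

/* "Subclass": embeds the base and adds generation-specific state. */
struct demo_gen50 {
	unsigned long map;      /* hypothetical per-generation field */
	struct demo_base base;  /* deliberately not the first member */
};

static unsigned long demo_gen50_addr(struct demo_base *base)
{
	/* Downcast from the embedded base to the enclosing subclass. */
	struct demo_gen50 *obj = container_of(base, struct demo_gen50, base);

	return obj->map;
}

static const struct demo_base_func demo_gen50_func = {
	.addr = demo_gen50_addr,
};

/* Generic code only sees the base pointer and dispatches indirectly. */
static unsigned long demo_addr(struct demo_base *base)
{
	return base->func->addr(base);
}
```

[Because the base struct is not the first member here, a naked pointer cast would yield the wrong address; container_of() subtracts the member offset, which is why it is the safer of the two.]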
18:47karolherbst: imirkin_: actually.. we might need to harden nouveau against those speculative attacks :D.. I think? Or was that already done? but we have quite a lot of indirect calls
18:47karolherbst: or it got fixed and we didn't mitigate the costs
18:48karolherbst: one of the mitigations one can do is to compare a func pointer against static function addresses (2 or 3 until it stops improving perf) and call the static one instead of indirect :)
18:48karolherbst: but no idea how many of those func pointers we have where we kind of always know the actual pointer
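[Note: a rough sketch of the pointer-comparison idea described above, assuming two likely targets are known at the call site. The `demo_*` names are hypothetical; this is not how nouveau does it, and whether it pays off depends on how predictable the target is.]

```c
/* Hypothetical ops table with a single indirect entry point. */
struct demo_ops {
	int (*xform)(int);
};

static int demo_xform_a(int x) { return x + 1; }
static int demo_xform_b(int x) { return x * 2; }
static int demo_xform_c(int x) { return x - 3; }

/* Compare the pointer against the expected targets first. A match takes
 * a direct call, which needs no indirect branch; anything else falls
 * back to the ordinary indirect call. */
static int demo_call(const struct demo_ops *ops, int x)
{
	int (*fn)(int) = ops->xform;

	if (fn == demo_xform_a)
		return demo_xform_a(x);
	if (fn == demo_xform_b)
		return demo_xform_b(x);
	return fn(x);
}
```

[The result is the same whichever branch is taken; only the kind of call changes. Each extra comparison adds a conditional branch, which is presumably why the gain stops after two or three candidates.]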
18:53edgecase: so there is a hierarchy of memory management, it seems gpu mmu -> vma's -> instobjs
18:54edgecase: instobjs base class is nvkm_instobj, (it's abstract?), subclass nv50_instobj has the actual ioremap address
18:54edgecase: so, instobjs are "host readable/writable things in VRAM"
18:54imirkin_: sounds right.
18:54imirkin_: they're not used for much iirc
18:55edgecase: instobjs aren't used much?
18:55imirkin_: iirc no
18:56imirkin_: they're used for backing certain things like object definitions, etc
18:56imirkin_: i could be lying. skeggsb is the one who understands it all.
18:56edgecase: gah, wasting my time? they do account for part of the usage of phys vram tho, right?
18:56imirkin_: i tend to understand bits and pieces, and rarely at the same time.
18:56imirkin_: part? sure.
18:56edgecase: what are the names of other classes of things that use vram?
18:57imirkin_: well, i think VM is just allocated
18:57imirkin_: and stuck into a nouveau_bo
18:57imirkin_: start in nouveau_gem.c and trace down.
18:57edgecase: sure, gem handle -> TTM obj -> nv50_instobj?
18:58imirkin_: then i just don't know what i'm talking about.
18:58edgecase: well between nouveau_bo (TTM object) and nv50_instobj, is what I'm unravelling, so (or someone) can tell me.
18:59edgecase: if instobj is the only allocator of vram, better for me.
19:01edgecase: but even then, there's gem objects, ttm objects, ioremappings in nv50_instobj's, GPU vm mapping, gpu window mappings, then gpu physical vram allocator/heap
19:01edgecase: each one has some kind of heap implementation, and resource limits
19:01edgecase: and could suffer from fragmentation possibly
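[Note: a toy illustration of the fragmentation concern raised above, assuming a first-fit allocator over a handful of fixed-size units. It is not modelled on any of the actual heaps in nouveau, TTM or GEM; every name is hypothetical.]

```c
#include <stdbool.h>

#define DEMO_UNITS 8

/* Toy heap: one flag per fixed-size unit. */
struct demo_heap {
	bool used[DEMO_UNITS];
};

/* First-fit: claim the first run of n contiguous free units (n >= 1).
 * Returns the start index, or -1 if no such run exists. */
static int demo_alloc(struct demo_heap *h, int n)
{
	int run = 0;

	for (int i = 0; i < DEMO_UNITS; i++) {
		run = h->used[i] ? 0 : run + 1;
		if (run == n) {
			int start = i - n + 1;

			for (int j = start; j <= i; j++)
				h->used[j] = true;
			return start;
		}
	}
	return -1;
}

/* Release n units starting at start. */
static void demo_free(struct demo_heap *h, int start, int n)
{
	for (int j = start; j < start + n; j++)
		h->used[j] = false;
}

/* Total number of free units, contiguous or not. */
static int demo_free_units(const struct demo_heap *h)
{
	int n = 0;

	for (int i = 0; i < DEMO_UNITS; i++)
		n += !h->used[i];
	return n;
}
```

[Allocating four 2-unit blocks and then freeing the first and third leaves four units free, yet a 4-unit request fails because no free run is longer than two. With several allocators stacked on top of each other, any layer can hit this independently.]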