08:57 Sarayan: I think Rust is being praised because they seem to have managed to reduce the friction to usable levels on powerful but complicated stuff like borrow checking
09:03 herbas: this is simply a test message to see if this message will get logged (my previous one didn't somehow, maybe because I wasn't registered with nickserv)
09:04 herbas: Now, it works, anyway: I am using nouveau on an NVIDIA Jetson nano, with Fedora 36 (5.17 kernel and mesa 22), however display simply doesn't work. In Fedora 33 it did work however, although with very severe graphics artifacts. Also, sometimes when booting, but not always, I get this kernel panic, which has nouveau a few times in it https://pastebin.com/raw/81vbhXAr
10:10 karolherbst: Sarayan: the borrow checker isn't complicated
10:10 karolherbst: it just frictions with unsafe things C devs are doing
10:11 karolherbst: it literally checks that any value is only used once if it's changed somewhere
10:12 karolherbst: (and even that isn't true)
10:13 karolherbst: and I am sure if C devs approach rust with a "everything I did in the past was very wrong and unsafe, and now I learn how to write safe software" mindset, rust becomes very simple
10:13 karolherbst: a lot of issues I was encountering were me trying to force my C habits until I understood why it was bad and why it's even bad doing it in C
11:15 Sarayan: karolherbst: Sure, but they managed to make a syntax and some exit traps (like unsafe) that make it sane to use in production, rather than just a toy language
11:19 RSpliet: Sarayan: I think on a higher level that's why Rust is relatively successful currently. I haven't looked into it much at all, but from what I've heard it's a clean language that's taken past experience from languages like C plus "academic" innovation from projects like ML to create something that's pragmatic but safer than the previous pragmatic thing
11:23 Sarayan: yeah, from what I see I agree
11:23 Sarayan: the pragmatic part is really important
11:30 karolherbst: although C could be made much safer without having to change much
11:30 karolherbst: it just breaks existing code
11:36 RSpliet: Yeah... that's a problem :-D
11:38 RSpliet: Although... every time I upgrade my compiler flags to a newer flavour of C++ I see existing code break and require fix-ups. Some amount of code breakage is apparently acceptable as long as it's optional. I suspect C's breakage with the past would be much worse though if you really want to fix some of those issues :-D
13:34 karolherbst: imirkin: what do you think about the theory, that fencing fails on nv50, because the hardware doesn't reach them in time and does _something_ instead? Because if I just resume after the dma_fence wait timeout, everything is fine
15:55 imirkin: karolherbst: there is no "fencing" ... you just insert things into the pushbuf that tells the fifo thing to wait until some value is > X
15:56 imirkin: normally when such a thing happens, we also instruct it to raise an interrupt
15:56 imirkin: which then in turn triggers some sort of completion logic
15:56 imirkin: <end of knowledge>
15:57 imirkin: IME when that condition is never reached, it's a hang for that channel
15:57 imirkin: but i don't know too much about it -- i just tried to avoid the situation at all cost
16:32 Wally: .
16:33 Wally: imirkin: Sorry, did you mean on the ogk driver or nouveau?
16:33 imirkin: i meant on the nvidia hardware.
16:34 Wally: generally?
16:34 imirkin: yes.
16:34 Wally: I thought I saw fencing ioctls in the uapi headers for ogk...
16:34 imirkin: the completion logic is driver-specific of course.
16:34 imirkin: ok
16:34 Wally: k
16:39 karolherbst: imirkin: ehh, right.. but I meant whatever we do for that ttm dma_fence stuff
16:40 karolherbst: imirkin: inside ttm_bo_move_accel_cleanup
16:41 imirkin: right, so that's basically saying "the fence didn't get hit within a certain period of time" right?
16:41 imirkin: so... wait longer? or the channel is dead? or the interrupt never made it
16:41 karolherbst: anyway.. ttm_bo_move_accel_cleanup returns with EBUSY (after 15 seconds) and we fall back to software, that already looks like a very wrong thing to do
16:41 imirkin: don't have a clear explanation
16:42 karolherbst: but if I shorten the timeout and disable the fallback, things "just work"
16:42 imirkin: yea, that's basically the YOLO approach
16:42 karolherbst: so nothing is dead, and the fence gets hit (because new things do "just work")
16:42 karolherbst: I had a similar thing in mesa, where I used the same workaround
16:42 imirkin: yeah, so probably we miss reporting the fence as completed
16:43 karolherbst: _somehow_ we don't notice the fence getting hit, because the counter doesn't increase because of... caching or whatever
16:43 karolherbst: yeah.. maybe
16:43 imirkin: are you sure it doesn't increase?
16:43 imirkin: worth checking that explicitly
16:43 karolherbst: not yet
16:43 karolherbst: doing that on a kernel level where you run the android emulator is... well... spamming dmesg
16:44 karolherbst: bad part: the emulator runs for a bit until this happens (but reproducible)
16:44 imirkin: ok
16:44 karolherbst: another strange thing: on the gt200 with max clocks it generally doesn't "freeze"
16:44 imirkin: well i wish you good luck :)
16:44 karolherbst: with default clocks it happens sometimes
16:44 karolherbst: yeah...
16:44 karolherbst: it's a stupid issue
16:44 imirkin: the G200 is pretty powerful as far as GPUs go
16:45 karolherbst: yep
16:45 karolherbst: it could be that we just enqueue something seriously expensive, but if I wait forever, then it waits forever
16:45 karolherbst: with dma_fence_wait I mean
16:45 imirkin: yeah
16:45 imirkin: so that means it's either stuck or missed
16:46 imirkin: given that ignoring the "stuck" lets things proceed as normal
16:46 karolherbst: I bet on missed, because if I just ignore it, it works
16:46 imirkin: it must be getting missed
17:02 karolherbst: mhh, could also be that we insert things in a different order than we expect... I really need to put in some printks and figure out what's the situation on the kernel side
17:04 imirkin: karolherbst: or could be that interrupts are coalesced
17:05 karolherbst: uh.. another good idea
17:05 imirkin: ideally the kernel "semaphore interrupt" handler has the same logic as we do in mesa
17:05 imirkin: where it completes all the "old" fences
17:06 karolherbst: yeah.. hopefully I'll figure it out tomorrow :)
19:15 Wally: Sorry, is it true that we don't implement zcull?
19:15 imirkin: yes
19:15 Wally: eek
19:16 Wally: [that may have contributed to the bad kde performance]
19:16 imirkin: unfortunately it's not the sort of thing that can 50% work
19:16 imirkin: or even 95%
19:16 Wally: why?
19:16 imirkin: it has to be 100%, or else everything fails
19:16 Wally: oh
19:17 imirkin: and we never were able to work out precisely how to operate it
19:17 Wally: Is there an experimental branch somewhere for it?
19:17 imirkin: no
19:17 imirkin: there are some ifdef'd things in nvc0 somewhere i think
19:18 Wally: Oh! So that's why when I grepped for cull I found some...
19:18 imirkin: but it's not like close to a proper implementation
19:18 imirkin: tbh i never even bothered looking into it
19:18 imirkin: it was stuff calim had tried to sort out ~10y ago
19:21 airlied: is zcull like hiz?
19:21 Wally: Its face culling
19:22 imirkin: airlied: zcull is exactly like hiz
19:22 imirkin: the details are obv different, but same ideas
19:22 imirkin: nv50 and nvc0 zcull are pretty different
19:22 imirkin: on nvc0 you have separate cull surfaces. on nv50 there's some weird shared internal state.
19:23 airlied: yeah hiz evolved over the years to be simpler thankfully
19:24 airlied: early hiz had a shared private RAM you had to allocate to the process
19:24 imirkin: yeah, pre-nv50 was even worse
20:00 karolherbst_: Wally: the thing is, we suspect it needs kernel/firmware level support for zcull, so we guess it can be implemented maxwell2+, but did nvidia put zcull code into the firmware they gave us? no clue
20:00 Wally: karolherbst_: Ah
20:01 karolherbst: needs to be context switched and stuff
20:01 karolherbst: it might just work, but when I was looking into it I didn't see any difference, not sure if I did something wrong or not
20:01 karolherbst: but feel free to experiment with it
20:01 karolherbst: it's not terribly difficult to implement though
20:01 Wally: karolherbst: the deko3d guys claim to have solved it
20:02 karolherbst: you essentially allocate a bo alongside any fb and attach the zcull buffer whenever you switch fb afaik
20:02 karolherbst: Wally: yeah.. it's not hard
20:02 karolherbst: but it needs support from the kernel side
20:02 karolherbst: afaik
20:02 Wally: For hardware acceleration?
20:02 karolherbst: nope. It has to be context switched
20:03 karolherbst: like.. well.. the context switching code needs to be aware of the zcull buffer
20:03 karolherbst: but I might be wrong, but I think it's like that
20:03 karolherbst: grep for zcull in nvidias driver :)
20:03 Wally: So even for occlusion that doesn't use any hardware features a kernel patch is needed
20:04 Wally: ctrl2080 controls it
20:04 karolherbst: src/nvidia/src/kernel/gpu/gr/kernel_graphics.c has tons of references
20:05 karolherbst: it's all easy to boast about figuring random stuff out, if you already got your perfectly working kernel :)
20:07 Wally: karolherbst: Sorry, in Mesa where are tris drawn for nvc0? I think I'm grepping for the wrong things
20:07 karolherbst: anyway.. I just got super annoyed by those people, because of this "why didn't you do this _simple_ (tm) thing" attitude
20:08 Wally: its not simple(tm)
20:08 karolherbst: of course it isn't
20:08 karolherbst: the idea is though
20:08 karolherbst: just getting it right is hard, especially if you put random stuff into those zcull methods and literally nothing changes :)
20:09 karolherbst: should poke nvidia if they are more willing to release docs for all of this though
20:09 karolherbst: Wally: isn't all this stuff inside nvc0_draw?
20:10 Wally: thx
20:10 karolherbst: ehh nvc0_draw_* rather
20:10 karolherbst: there are multiple functions
20:10 karolherbst: nvc0_draw_vbo being the main one
20:10 karolherbst: anyway, it would be nice to see some patches for that stuff instead of just annoying remarks :)
20:11 Wally: or testing
20:14 karolherbst: yeah, testing is always nice
20:14 karolherbst: it's really cool when people test your big MRs
20:20 Sarayan: karol: Yeah, why don't you just implement pmu for pascal? It's just new bios structures, new methods, falcon code and a signature key we don't have :-)
20:21 karolherbst: duh, I should really just do it
20:21 karolherbst: why didn't I come up with this idea myself
20:21 Wally: l3l
20:30 karolherbst: nice, no flickering with nv50 inside the android emulator either :3
20:42 karolherbst: oh wow.. people appear to be more up to helping now that nvidias source is out
20:42 karolherbst: I hope this will increase over time :)
21:17 Wally: with drm-shim is there anything needed to be done except preloading the library?