15:25 pmoreau: karolherbst: I opened a new MR with some of the clover fixes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5038
16:05 simernes: Hi. My desktop environment (KDE) keeps randomly freezing. Any suggestions on how to troubleshoot? I can still ssh into the machine, but I can't open new ttys with ctrl+alt+fx. It seems to maybe be correlated with jumping between tabs in firefox and similar gui-ish actions although I'm not sure about that. Sometimes it freezes just for a while and I'm lucky in that it wakes up from the freeze, but
16:05 simernes: other times not.
16:05 simernes: I'm on void linux with kernel 5.4.40, with GTX 650 and nouveau.
16:05 simernes: I reported this issue here a few days ago and it now happened again while seeking in a video in vlc (skipping to a specific time with the "progress" bar of the time). I was supposed to report back if killing firefox through ssh unfroze my system, and neither killing vlc nor firefox helped.
16:07 loonycyborg: nouveau tends to leave some output in dmesg in such cases generally
16:07 simernes: I see this in dmesg: perf: interrupt took too long (4919 > 4918), lowering kernel.perf_event_max_sample_rate to 40000
16:07 simernes: nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
16:07 simernes: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
16:08 simernes: nouveau 0000:01:00.0: fifo: channel 14: killed
16:08 simernes: nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
16:08 simernes: nouveau 0000:01:00.0: vlc[8200]: channel 14 killed!
18:29 simernes: I'm a little new to nouveau so please excuse my ignorance. Just wondering where I should go to follow up: is this it? https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/339 or is xf86-video-nouveau something other than just nouveau?
18:31 imirkin: which of the 30 people saying "i have the same problem" are you?
18:31 simernes: I'll be 31
18:31 imirkin: without doing a deep analysis, it's best to treat the problems as separate
18:36 simernes: cheers dude
18:36 imirkin: so what's your problem, what are the symptoms, what's the hw?
18:37 imirkin: oh. i see you listed stuff earlier.
18:37 imirkin: so a ctxsw fw hang on a GTX 650 is unusual.
18:37 imirkin: is that CTXSW_TIMEOUT the first nouveau-related message in the logs?
18:37 imirkin: or is there stuff before?
18:41 simernes: don't really see anything else. Are the timestamps or whatever they are relative to boot? This is my first time looking in dmesg honestly but all I see before it is basically elogind stuff and wifi
18:42 imirkin: yes, it's seconds since boot
18:43 imirkin: are you using plasma?
18:43 simernes: yes
18:43 imirkin: lots of people have reported issues with plasma
18:44 imirkin: i'd recommend trying a diff environment, or not using nouveau
18:44 simernes: ok, great. I guess that's a solution
18:47 simernes: getting another gpu model would change the behaviour or no?
18:48 imirkin: a non-nvidia one, yes
19:08 imirkin: NVIDIA CEO Jensen Huang unveiled the company's new Ampere A100 GPU architecture for machine learning and HPC markets today. Jensen claims the 54B transistor A100 is the biggest, most powerful GPU NVIDIA has ever made, and it's also the largest chip ever produced on 7nm semiconductor process. There are a total of 6,912 FP32 CUDA cores, 432 Tensor cores, and 108 SMs (Streaming Multiprocessors) in the A100, paired to 40GB of HBM2e memory with maximum memory
19:08 imirkin: bandwidth of 1.6TB/sec. FP32 compute comes in at a staggering 19.5 TFLOPs, compared to 16.4 TFLOPs for NVIDIA's previous gen Tesla V100.
19:09 imirkin: https://hothardware.com/news/nvidia-ampere-dgx-a100-ai-machine-learning
19:14 Conmanx360: Ah, is that the one Jensen pulled from his oven?
19:14 Conmanx360: https://www.youtube.com/watch?v=So7TNRhIYJ8
19:15 Conmanx360: Yeah, looks like it.
22:00 HdkR: imirkin: But how soon until Nouveau supports A100?
22:03 imirkin: 2025
22:04 imirkin: there'll be a Z100 out by then, of course
22:04 imirkin: with the AA100 being announced soon after
22:11 HdkR: :D
22:15 karolherbst: hey :D it might be not that bad