02:34 karolherbst: mhh okay.. so suspending using the sw path takes like 220 seconds :/
02:35 karolherbst: but it does seem to work reliably
02:39 imirkin: woohoo, great success. by the time it's done suspending, you already want to resume ;)
02:39 karolherbst: huh..
02:39 karolherbst: what's up with nouveau_bo_move_prep?
02:39 karolherbst: that function looks...... odd
02:42 karolherbst: anyway.. doesn't matter
02:43 karolherbst: imirkin: I kind of expect fencing to be broken... remember the issue I hit within mesa, but everything was fine regardless?
02:44 karolherbst: I am actually wondering what would happen if I'd do the same in the kernel.. just assume everything is fine and move on
02:47 karolherbst: ohhhhhhhhhhhhhhhhhhhhhh
02:49 karolherbst: oh no..
02:49 karolherbst: I hope my current theory is wrong
02:50 imirkin: if your past theories are any prediction...
02:57 karolherbst: fuck
02:57 karolherbst: we don't stop the channel before evicting memory
02:57 karolherbst: well
02:57 karolherbst: wait for idle
02:57 karolherbst: _but_ it seems like something pushes work to the gpu while we evict things
02:59 karolherbst: imirkin: soo... would you expect that after calling into nouveau_fbcon_set_suspend and nouveau_display_suspend channel 1 (used by the kernel) is active and doing random stuff?
03:00 karolherbst: mhhh
03:02 karolherbst: maybe I misunderstand what ttm_resource_manager_evict_all is supposed to do
03:02 karolherbst: but my understanding is kind of that we remove all memory out of VRAM with that
03:03 karolherbst: well move, and it gets copied over to sys mem
08:14 graphitemaster: imirkin, Does NV hardware microarchitecturally implement SIMT as 8x SIMD with each SIMD being 4 lanes? Is this where that 4x8 configuration comes from, and why mapping thread indices to some texture space (as an example) does not end up within a single warp? It's actually kind of weird that a warp is not a multiple of gl_LocalInvocationIndex
08:15 imirkin: it changes on volta+
08:15 imirkin: on fermi/kepler/etc afaik it's just waves of 32 SIMD thingies
08:16 imirkin: volta+ is magic
08:16 imirkin: but i don't really know what the magic is
08:16 graphitemaster: So are warps just partitioned into multiples of the thread index then?
08:21 imirkin: well, multiples of 32
08:21 imirkin: last one probably just has a bunch of lanes masked off
08:21 imirkin: moral of the story: don't run compute shaders with local size 1
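A rough sketch of the point above, assuming a warp size of 32 (as described for fermi/kepler) and assuming a workgroup is simply split into consecutive chunks of 32 with the leftover lanes of the last warp masked off. The function name `warp_layout` is made up for illustration; nothing here is taken from the driver or hardware docs.

```python
import math

WARP_SIZE = 32  # assumption: 32 lanes per warp, per the fermi/kepler remark


def warp_layout(local_size, warp_size=WARP_SIZE):
    """Return (active_lanes, masked_lanes) for each warp of one workgroup.

    Hypothetical model: invocations are packed into warps in order, and
    only the last warp can have lanes masked off.
    """
    num_warps = math.ceil(local_size / warp_size)
    layout = []
    for w in range(num_warps):
        active = min(warp_size, local_size - w * warp_size)
        layout.append((active, warp_size - active))
    return layout


# Local size 1 still occupies a whole warp: 1 active lane, 31 masked.
print(warp_layout(1))
# Local size 70 needs three warps; the last has 6 active lanes.
print(warp_layout(70))
```

Under this model a local size of 1 wastes 31 of 32 lanes, which is the reason for the "don't run compute shaders with local size 1" advice.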
08:24 graphitemaster: So gl_SubgroupInvocationID == gl_LocalInvocationIndex % gl_SubgroupInvocationSize
08:24 graphitemaster: Or is that wrong
08:25 graphitemaster: Looking at https://www.khronos.org/registry/OpenGL/extensions/KHR/KHR_shader_subgroup.txt
08:26 graphitemaster: If the extension GL_KHR_shader_subgroup_basic is enabled, the variable
08:26 graphitemaster: <gl_SubgroupInvocationID> is a built-in containing the index of an
08:26 graphitemaster: invocation within a subgroup. The value of this variable is in the range
08:26 graphitemaster: 0 to <gl_SubgroupSize>-1.
08:27 graphitemaster: Er, typo'd there, gl_SubgroupInvocationID == gl_LocalInvocationIndex % gl_SubgroupSize ?
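The question is left unanswered in the log. A small sketch of what the proposed relation would mean, under the assumption that invocations are packed linearly into subgroups. The quoted extension text only says the ID lies in the range 0 to gl_SubgroupSize-1; it does not, in the part quoted, promise this particular mapping, so treat it as a hypothesis to test on real hardware rather than a guarantee. The helper `assumed_subgroup_ids` is hypothetical.

```python
def assumed_subgroup_ids(local_invocation_index, subgroup_size=32):
    """Hypothetical linear packing of invocations into subgroups.

    Returns (subgroup_id, subgroup_invocation_id) for one invocation,
    assuming invocation i lands in subgroup i // size at lane i % size.
    """
    if subgroup_size <= 0:
        raise ValueError("subgroup_size must be positive")
    return (local_invocation_index // subgroup_size,
            local_invocation_index % subgroup_size)


# Under the assumption, every lane ID stays inside 0..subgroup_size-1,
# which matches the range stated in the quoted spec text.
for idx in (0, 31, 32, 33, 95):
    sg, lane = assumed_subgroup_ids(idx)
    assert 0 <= lane < 32
    print(idx, sg, lane)
```

A shader that writes gl_SubgroupInvocationID and gl_LocalInvocationIndex to a buffer would be one way to check whether a given driver actually follows this packing.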
15:23 karolherbst: imirkin: btw, I was hitting https://gitlab.freedesktop.org/drm/nouveau/-/issues/150 as well
18:06 imirkin: karolherbst: heh. i guess people aren't totally crazy :)
18:06 karolherbst: it's weird though
18:07 karolherbst: I wouldn't be surprised if it's related to the suspend/resume issue
18:09 karolherbst: imirkin: maybe we cut off the upper bits somewhere for stupid reasons?
18:09 karolherbst: mhh
18:09 karolherbst: something for next week