00:15 nyef: ReinUsesLisp: Out of curiosity, which Lisp?
00:15 ReinUsesLisp: a programming language family
00:15 nyef: Yes. Which one do you use, though?
00:15 ReinUsesLisp: Common Lisp
00:16 nyef: Cool. Which implementation?
00:16 * nyef uses SBCL, primarily.
00:16 ReinUsesLisp: unless you want to embed it to an application, then I would use a scheme dialect
00:16 ReinUsesLisp: yea, sbcl or cmucl
00:16 nyef: Every so often I am surprised that CMUCL is still maintained.
00:18 * imirkin has only ever used clisp
00:19 imirkin: eliza. good times.
00:20 ReinUsesLisp: btw how does a shader instruction know which vertex input it is reading in a geometry shader?
00:23 imirkin: huh?
00:23 nyef: Ugh. GNU clisp is dreadful. /-:
00:23 imirkin: a shader instruction like IADD just takes 2 args and adds them... not sure what your question is
00:25 ReinUsesLisp: e.g.: "ALD R3, a[0x78], R4 ;", is it reading gl_in[0], gl_in[1] or gl_in[n]?
00:25 imirkin: n, where n = R4
00:26 imirkin: or something along those lines
00:27 imirkin: more like R4 = PFETCH n
00:27 imirkin: i forget what the nvidia mnemonic is for PFETCH
00:27 imirkin: ah yes - ISBERD
00:28 imirkin: (at least on maxwell. on earlier gpu's it's different iirc)
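A rough C model of the two-step access imirkin describes, for illustration only: the struct layout and names are invented, and only the idea that ISBERD/PFETCH turns a gl_in[] index into a handle which ALD then reads through comes from the exchange above.

    #include <stdint.h>

    struct gs_vertex {
        uint32_t attribs[64];              /* attribute words, indexed by a[] offset / 4 */
    };

    struct gs_primitive {
        unsigned handle[3];                /* per-vertex handles within the primitive */
        const struct gs_vertex *pool;      /* vertex storage addressed by those handles */
    };

    /* "ALD R3, a[0x78], R4" with R4 = ISBERD(n): read attribute 0x78 of gl_in[n]. */
    static uint32_t gs_input_read(const struct gs_primitive *prim,
                                  unsigned n, unsigned attr_offset)
    {
        unsigned r4 = prim->handle[n];                    /* R4 = ISBERD n (PFETCH) */
        return prim->pool[r4].attribs[attr_offset / 4];   /* R3 = ALD a[attr_offset], R4 */
    }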
00:35 HdkR: isberd is still a weird name for an instruction
00:35 nyef: HdkR: So is "EIEIO".
00:35 HdkR: isbe-rd but makes me think, "Is Beard?"
00:36 nyef: (Enforce In-order Execution of I/O, a PPC instruction.)
00:37 HdkR: Yea, I'm well aware of PPC :D
00:37 HdkR: I love the rlw line of instructions
00:40 ReinUsesLisp: so it'd be something like this: https://hastebin.com/abutoresed.makefile ?
00:41 ReinUsesLisp: and yes, IMO clisp is kind of bad, at least for building external packages
00:43 orbea: I guess this nouveau/dolphin-emu crash is similar to the last one? Happens when starting a game, playing a little, closing it and then starting it or another game again. removing ~/.cache/dolphin-emu/ seems to work around it. https://pastebin.com/BdyxvK0v
00:44 HdkR: orbea: Same issue, it's calling glFinish on a worker thread still
00:45 orbea: okay, thanks
00:45 imirkin: ReinUsesLisp: sounds right
00:46 imirkin: but i haven't checked anything
00:46 imirkin: and then there's AL2P if you want to do indirect shader input access on kepler+
00:47 imirkin: (AFETCH in nv50_ir)
00:47 imirkin: on fermi, ALD could take a reg offset directly
00:47 imirkin: all this stuff gets a bit confusing in TCS with loading outputs (and TES too iirc, even for inputs)
00:48 imirkin: took some fiddling to work it all out back in the day
00:48 ReinUsesLisp: I have to emulate instructions with GLSL, so any extra feature nvidia GPUs have is kind of a pain
00:48 ReinUsesLisp: every*
00:52 HdkR: :P
00:53 HdkR: Just implement a ton of GL_MESA extensions
00:53 ReinUsesLisp: the nightmare instructions are LDG and STG because Switch's Maxwell shares memory with its CPU
00:57 ReinUsesLisp: thanks, this really helps
00:57 ReinUsesLisp: see you
04:30 karolherbst: mhhh, skeggsb "fifo: write fault at 000fa24000 engine 00 [GR] client 1d [DFALCON] reason 02 [PTE] on channel 3 [017f0de000 Xorg[3116]]"
04:30 karolherbst: that DFALCON client makes me suspicious
04:35 karolherbst: anyway, the engines aren't recovered, or at least the userspace program doesn't get notified or whatever, and hangs infinitely
04:38 karolherbst: hangs inside nouveau_bo_wait
04:38 karolherbst: well doing an ioctl actually
04:38 karolherbst: called from nvc0_hw_get_query_result
04:41 skeggsb: yeah... that's kinda going to happen if you try and wait on stuff from a dead channel...
04:42 skeggsb: recovery can't make the channel survive, it's dead, gone. just like a SIGSEGV
04:42 karolherbst: okay
04:42 karolherbst: but then we probably want to kill the applications or something
04:42 karolherbst: or handle that somehow
04:42 skeggsb: i have patches that make the kernel behave a bit better (fences immediately timeout etc), but userspace is still stupid and spins forever
04:43 karolherbst: skeggsb: we might also have to report reset status soonish
04:43 karolherbst: for GLX_ARB_create_context_robustness
04:43 skeggsb: that can already happen, userspace just doesn't use it...
04:43 karolherbst: to implement pipe_context::get_device_reset_status
04:43 karolherbst: ahh
04:43 karolherbst: okay, once the kernel bits are there that should be fairly trivial
04:44 skeggsb: the kernel bits for saying "your channel is dead" are already there
04:44 skeggsb: they have been for a very long time
04:44 karolherbst: right, but that method is a bit more precise
04:44 karolherbst: you can specify whose fault it was
04:44 skeggsb: you get told which channel caused it
04:44 karolherbst: (application or driver, stuff like that)
04:44 skeggsb: the kernel can't possibly know that...
04:45 karolherbst: maybe I misread the enum
04:45 skeggsb: the userspace driver might be able to determine something more, the kernel doesn't know anything beyond "here's the push buffer i got told to execute"
04:47 karolherbst: oh, we actually look at the context
04:47 karolherbst: so we need to know which context was causing it
04:47 karolherbst: context meaning OpenGL context
04:48 karolherbst: or we always return PIPE_UNKNOWN_CONTEXT_RESET... but I am not quite sure if that's good enough for the extension
04:48 skeggsb: forget about the extension for now, you're complicating things.. you want to make the 3d driver behave more nicely than just locking up when a channel dies, right?
04:49 skeggsb: separate issue from reporting to a robustness user :P
04:49 skeggsb: in the context of GL anyway, you can probably just delete the channel, recreate it, mark all state dirty, and continue on
04:50 karolherbst: well, we need it for passing the CTS
04:52 karolherbst: mhh weird
04:52 karolherbst: that thing is all used internally in drivers anyway
04:56 karolherbst: skeggsb: okay, I see
04:56 karolherbst: anyway, for the extension we indeed need to be able to attribute the reset channel to the context
04:56 karolherbst: although returning UNKNOWN_CONTEXT_RESET_ARB might do the trick
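A minimal sketch of the gallium hook in question, assuming a hypothetical channel_is_dead() check: the PIPE_* values and the get_device_reset_status callback are the real gallium interface, but the dead-channel query is invented for illustration.

    #include <stdbool.h>
    #include "pipe/p_context.h"
    #include "pipe/p_defines.h"

    /* hypothetical: would ask the kernel whether this context's channel died */
    static bool channel_is_dead(struct pipe_context *pipe) { (void)pipe; return false; }

    static enum pipe_reset_status
    nvc0_get_device_reset_status_sketch(struct pipe_context *pipe)
    {
        /* without knowing which context caused the fault, reporting an
         * unknown reset is the best this sketch can do, as discussed above */
        if (channel_is_dead(pipe))
            return PIPE_UNKNOWN_CONTEXT_RESET;
        return PIPE_NO_RESET;
    }

Something like this would be assigned to pipe->get_device_reset_status at context creation.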
04:59 karolherbst: skeggsb: okay, but it seems like there is nothing I can do from userspace right now, as the ioctl blocks forever
05:00 skeggsb: it doesn't, it blocks for 30 seconds i believe... used to be a lot shorter, but someone changed it at some point.. probably with the dma_fence/reservation object stuff
05:00 karolherbst: mhh, let me check
05:00 skeggsb: userspace just keeps trying...
05:00 karolherbst: well the application is dead for 2 hours now
05:00 karolherbst: ahh
05:01 karolherbst: then let me check with gdb
05:01 karolherbst: ohh, you are right
05:02 karolherbst: ....
05:02 karolherbst: oh heck
05:02 karolherbst: skeggsb: it loops here: https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/state_tracker/st_cb_queryobj.c#n302
05:02 karolherbst: :(
05:03 skeggsb: i guess in the "dead channel" case, you probably need to lie and return some garbage result
05:03 skeggsb: or just return the current value of the query buffer i guess
05:04 skeggsb: but yeah, the 3d driver should make some attempt to recover (by recreating the channel) when that shit happens
05:04 karolherbst: yeah
05:04 skeggsb: spinning forever isn't nice
05:04 karolherbst: not really :)
05:05 skeggsb: our recovery stuff is far from perfect, but it works a lot better than it appears due to userspace issues
05:05 karolherbst: I see
05:05 karolherbst: have some example code on how to recover the channel?
05:06 karolherbst: I don't really plan to test a lot of stuff here as it takes around 6 hours until the cts runs into such issues
05:06 skeggsb: my attempt at threading fixes would've gone a long way to helping that with some restructuring i did... you need to delete everything that depends on the channel, and the channel itself, and recreate it all
05:06 skeggsb: (so, engine objects etc all need to be recreated after you allocate a new channel)
05:14 karolherbst: skeggsb: mhh, nouveau_gem_ioctl_cpu_prep returns EBUSY
05:14 karolherbst: which is what ends up being called on the kernel side
05:22 karolherbst: I guess with your kernel patches it will become clearer to userspace that the channel is dead or something
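The loop linked above keeps calling the driver's get_query_result until it reports the result as ready, so the driver side has to stop saying "not ready" once the channel is gone. A self-contained sketch of the bail-out skeggsb suggests; the query struct and the channel_dead flag are stand-ins, not the real nvc0 query code.

    #include <stdbool.h>
    #include <stdint.h>

    struct query_sketch {
        bool channel_dead;       /* hypothetical flag, set when the channel faulted */
        bool fence_signalled;    /* stand-in for "the GPU finished writing the result" */
        uint64_t value;
    };

    static bool
    get_query_result_sketch(struct query_sketch *q, bool wait, uint64_t *result)
    {
        if (q->channel_dead) {
            /* "lie and return some garbage result" so the wait loop in
             * st_cb_queryobj.c terminates instead of spinning forever */
            *result = 0;
            return true;
        }
        if (!q->fence_signalled && !wait)
            return false;                  /* genuinely not ready yet */
        /* normal path: (wait for the fence and) read the value back */
        *result = q->value;
        return true;
    }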
05:31 azaki: how good is kepler support right now? is reclocking pretty stable, etc?
05:31 karolherbst: stableish
05:31 karolherbst: if it works for you, good
05:32 karolherbst: if not, then we might be able to track down the issues
05:32 karolherbst: there are some issues
05:32 karolherbst: skeggsb: btw, did you see the patches I posted a few days ago?
05:57 karolherbst: skeggsb: btw, did you see my messages regarding that EVO timeout?
08:35 azaki: is it normal that my gt630 (oem, kepler) shows up with GL version 4.3 ? mesamatrix shows that nvc0 and later seem to have 4.4 support
08:42 karolherbst: azaki: yes
08:43 karolherbst: azaki: for 4.4 and newer we need to pass the conformance test
08:43 karolherbst: but we don't, so we only expose up to 4.3
08:43 karolherbst: the 4.5 bits are implemented though and only a few things still need to be fixed
08:43 karolherbst: azaki: if you need 4.4 or newer for something you can always force the OpenGL version
08:49 azaki: yeah, i did that in wine just now to get elder scrolls online running
08:49 azaki: apparently wine's d3d implementation needs it to expose d3d11 to games.
08:50 azaki: there was some weird bug with the ground textures also. but that's a known issue i think.
08:51 azaki: this one. https://bugs.winehq.org/show_bug.cgi?id=44514
08:52 azaki: so looks like a wine bug
08:55 karolherbst: azaki: ahh, it uses 4.4 compat, right?
08:56 karolherbst: imirkin: can we support more than 32 sampler in a fragment shader?
08:57 azaki: karolherbst: i'm not sure to be honest, i was asking in #winehq earlier and from what they were saying, i think they're using core now, not sure though.
08:57 karolherbst: azaki: some mesa devs were working on getting compatibility profiles up to 4.5 working, and wine is one of the users
08:58 karolherbst: not that this is a problem for nouveau, because it should work out quite well even if we cap at 3.1 or something
08:59 karolherbst: we could probably just expose 4.3 compat, but it isn't tested right now
09:00 azaki: <iive> azaki, i think wine developers made it to prefer core profiles recently, and maxversiongl is probably the way to enable it on older.
09:00 azaki: this was what i was told. not sure if it's accurate or not though. i guess i can ask tomorrow, there might be some devs around.
09:01 karolherbst: azaki: you might want to reclock your card though if things are super slow
09:01 karolherbst: or use the nine state tracker when the game supports dx9
09:02 azaki: yeah i did that. i did the pstate thing. i set it to auto, and then checked, and it seemed to be running on the highest frequency
09:02 azaki: although it's a bit hard to read, i'm assuming "AC" is the current settings or something?
09:03 azaki: TESO only supported dx11, that's why i ended up in this situation. apparently they used to support dx9, but they patched it out at one point. =\
09:04 karolherbst: azaki: last line is current, yes
09:05 karolherbst: oh well, I guess with a 630 you are out of luck anyway
09:05 karolherbst: because that GPU is quite slow
09:05 karolherbst: it would be a different story if nvidia were like 5x faster or something
09:06 azaki: yeah it is. i had to set it to absolute minimum settings. it seems ESO can go pretty low in terms of graphics settings. it ran well enough for me to at least get a daily quest done (just talking to an npc and pressing a button basically).
09:06 karolherbst: azaki: you can also try to set MESA_NO_ERROR=1
09:06 karolherbst: as wine is usually CPU bottlenecked
09:07 azaki: the ground texture thing was pretty awkward though.
09:08 karolherbst: azaki: if you got some time, mind trying out with nvidia or dumping the shaders with MESA_SHADER_CAPTURE_PATH?
09:08 karolherbst: you can pack the dumped shaders in an archive and give it to us
09:09 karolherbst: then we might be able to optimize things a bit
09:09 karolherbst: maybe wine produces weird shaders we don't handle that well
09:11 azaki: it's 5am here, so i should go to bed. but i can try the shader capture thing tomorrow
09:12 azaki: how does that work though? do you just set the variable and then set a path to dump the shaders to? like MESA_SHADER_CAPTURE_PATH="/home/name/captured_stuff/" or whatever ?
09:12 karolherbst: yes
09:13 karolherbst: you can keep it there while playing for a bit
09:13 karolherbst: so you can also collect shaders compiled later
09:14 azaki: i'll try. it can be a bit of an lsd trip to play with that ground texture bug. since it's not just like some dark empty void, rather, it's more like the ground seems to be showing random textures from the environment as you rotate the camera around.
09:15 azaki: so it's kind of trippy =p
09:17 karolherbst: :D
09:17 karolherbst: yeah
09:17 karolherbst: but I guess if wine actually starts using bindless_textures it might even speed up some games
09:17 karolherbst: as wine is usually CPU bound and not GPU
09:17 karolherbst: but maybe not
09:19 azaki: thank you for the help btw. i appreciate it. i'll just idle in the channel while i sleep
09:19 azaki: goodnight.
09:19 azaki: =)
13:15 imirkin: karolherbst: theoretically, sure. we don't currently though.
13:16 karolherbst: imirkin: yeah, it seems like that dx10/11 allows more than 32
13:16 karolherbst: but
13:16 karolherbst: I don't know how much the value of 32 is enforced by glsl
13:16 imirkin: they allow up to 128 i think
13:16 karolherbst: insane
13:16 imirkin: starting DX10
13:16 karolherbst: or maybe not
13:18 karolherbst: is this a limitation on our end?
13:18 karolherbst: I mean, that we only support 32
13:19 imirkin: indeed it is
13:19 karolherbst: okay
13:19 imirkin: of course DX10 supported 128 sampler views and 16 samplers
13:19 imirkin: so this doesn't map too nicely to GL
13:19 imirkin: but starting kepler, we can do whatever we like
13:20 imirkin: and on fermi/tesla, we can do this by flipping into linked_tsc mode
13:20 karolherbst: ahhh
13:20 karolherbst: well, I guess the focus would be to bump that up to 128 on kepler+ for now
13:21 karolherbst: because games requiring more than 32 will probably not run that great on unreclocked hardware
13:21 karolherbst: maybe tesla would be fine though
13:21 karolherbst: imirkin: is it the GL_MAX_TEXTURE_IMAGE_UNITS_ARB limit?
13:22 imirkin: no
13:22 imirkin: that's the program-wide max which includes all stages i think
13:23 imirkin: but it's something like that.
13:23 karolherbst: PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS/PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS in gallium
13:23 imirkin: right
13:23 karolherbst: ohh, 16 pre kepler for us
13:24 karolherbst: yeah, and the GL one also has a VERTEX/GEOMETRY variant
13:27 karolherbst: I guess it shouldn't be too hard to bump that up to 128 for kepler+ hardware
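For reference, these caps are per-stage values reported by the driver; a sketch of the bump being discussed, with the class check and the values taken from the conversation rather than verified against the tree.

    #include <stdbool.h>

    /* illustration only: what backing PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS with a
     * higher Kepler+ limit might look like; 32 (16 pre-Kepler) are the current
     * values mentioned above, 128 is the proposed bump */
    static int
    max_sampler_views_sketch(bool is_kepler_or_newer, bool bumped)
    {
        if (!is_kepler_or_newer)
            return 16;
        return bumped ? 128 : 32;
    }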
13:38 karolherbst: azaki: when you are back, mind creating an apitrace and uploading it somewhere as well?
13:38 karolherbst: maybe we can fix the texturing issues inside nouveau
16:46 pmoreau: pendingchaos: Thanks! I’ll give it a try over the weekend. :-)
17:20 endrift: Hey so I've ordered a laptop that reportedly has lots of crashing issues with nouveau. Is there anything I can do to help the project with this? Crash logs or envytools poking or anything?
17:21 endrift: I've heard that the crashing goes away when using the blob
17:21 endrift: It's the Gigabyte Aero 15X with a 1070
17:22 endrift: Also I heard through the grapevine that there were some PPC fixes recently; should I give my AGP cards another spin?
17:23 HdkR: endrift: Hope you like running at idle clocks :)
17:28 karolherbst: endrift: not really. I have hardware with some laptop issues as well and I am making some progress on tracking those issues down
17:29 endrift: Ok. Just wanted to make sure
17:32 endrift: HdkR I can still undervolt right? :P
17:32 endrift: (missed that message at first due to hopping between client sessions)
17:36 HdkR: No idea how power control works :D
17:39 endrift: HdkR I've been reading up on the x64 MSR so I think I know how to do it :P
17:39 karolherbst: endrift: no, you can't
17:40 endrift: Oh sorry I was talking about the CPU
17:40 karolherbst: ahh
17:40 karolherbst: yeah, that might work
17:40 endrift: I'm not sure if HdkR was now that I think about it
17:40 karolherbst: but is it still reliably possible to do on modern x86 CPUs?
17:41 endrift: I'll be honest. One of the reasons I'm jumping ship from Mac is because of shitty OpenGL/Vulkan support :P
17:41 karolherbst: I was under the assumption that there isn't really any puffer anymore
17:41 karolherbst: *buffer
17:41 endrift: I've been reading about this specific CPU so...yeah I think so.
17:41 endrift: Worst case scenario I can underclock a bit too
17:41 karolherbst: I can't imagine that the voltage can be reduced by more than 5% in the end
17:42 endrift: So long as it doesn't burn my legs off I'm happy
17:42 karolherbst: well
17:42 karolherbst: "this specific" no
17:42 karolherbst: in this area, just reading about the model number doesn't help at all
17:42 endrift: Fair enough
17:42 endrift: CPU binning is a thing
17:42 karolherbst: it depends on the manufacturing quality in the end
17:42 karolherbst: and how stable the clocks are for a given voltage
17:43 karolherbst: but usually such metrics are fused into the die
17:43 karolherbst: and accounted for
17:43 karolherbst: but stock CPUs try to be 100% safe in this regard
17:43 karolherbst: and sometimes if you ignore the existence of certain workloads it might just work out
17:43 endrift: Let me put it this way. I've seen people successfully undervolt this CPU model by 0.125V, so it may be possible with the one I receive
17:43 HdkR: Ah, I thought you meant GPU voltage regulation
17:44 endrift: But you're right. It may not be possible too
17:44 karolherbst: endrift: again, it might just work out and then suddenly the machine crashes due to a very specific workload
17:44 endrift: Yup
17:44 karolherbst: it's always a risk
17:44 endrift: I'd say I may toss Prime95 at it but that tends to crash processors regardless :P I'm looking at you Skylake
17:45 nyef: "An eighth of a volt here, an eighth of a volt there, pretty soon you're talking about real power."
17:45 karolherbst: anyway, I REed such a fused-in value on some Nvidia chipsets and the precision is quite high, as it is a ~13-bit value which you use to tweak your voltage requirement calculations
17:45 endrift: Anyway I don't plan to be stressing the GPU often
17:45 karolherbst: and nv hw has 6.125mW precise voltage regulation
17:45 karolherbst: or something like that
17:45 karolherbst: I know it is 6. something mW
17:46 karolherbst: uhm
17:46 karolherbst: mV
17:46 karolherbst: and 0.125V usually makes the difference between stable and totally not stable
17:47 karolherbst: and on some boards 0.25V is the full range from lowest to highest clocks
17:48 endrift: yeah I was surprised by the magnitude of the number
17:48 karolherbst: well intel CPUs have higher ranges usually
17:48 karolherbst: on nv kepler/maxwell you are usually around 0.85V - 1.2V
17:49 karolherbst: laptops cap normally around 1.0V
17:49 karolherbst: mostly higher end GPUs have tighter ranges
17:49 karolherbst: as they also have tighter clock ranges
17:50 karolherbst: starting with pascal things are different, so I am not quite sure how things are working out there
17:50 endrift: Not that I expect anyone here to know (except the people who work at NV who are hanging around, who I know can't say), but I'm really curious what's going on with Volta/Ampere/Turing/whatever
17:50 endrift: they've released...one Volta card and it's a Quadro
17:50 karolherbst: yeah, I know
17:50 endrift: and now people are saying they're skipping Volta??
17:50 endrift: It makes me wonder a lot
17:50 karolherbst: well, that card isn't that great
17:51 karolherbst: it has a lot of VRAM though
17:51 endrift: it's $3K, how can it not be great? :P
17:51 karolherbst: 8k$
17:51 endrift: oh shit really
17:51 karolherbst: the GeForce GV100 was like 3K$
17:51 karolherbst: uhm
17:51 karolherbst: Titan V?
17:51 endrift: yeah that
17:52 karolherbst: yeah, it has only 16GB of vram
17:52 karolherbst: lame
17:52 karolherbst: 12 actually
17:52 endrift: oh sorry, two cards
17:52 endrift: the Titan V, which is $3K, and the Quadro GV100, which is $9K
17:52 karolherbst: well, I happen to have access to a Quadro GV100... so we will end up getting support for those with nouveau
17:53 karolherbst: but as you already hinted, it is questionable if _anybody_ will ever use nouveau on those
17:55 karolherbst: endrift: there is actually a third GV100
17:55 karolherbst: "Nvidia Titan V CEO Edition"
17:56 karolherbst: which kind of seems to be identical to the quadro gv100
18:00 HdkR: So many Voltas on the market :P
18:03 aric49: Hi everyone --- having a hanging issue with Nouveau driver on Ubuntu 18.04 using a Nvidia GM107GLM. My gnome session seems to hang periodically such that only the mouse is responsive
18:03 karolherbst: aric49: hanging for a short time?
18:03 karolherbst: maybe there is something in dmesg
18:03 aric49: no, totally freezing and causing me to have to reboot
18:03 karolherbst: could be related to the issues we have with multiple contexts/multithreading
18:03 karolherbst: ahh
18:04 karolherbst: aric49: there seems to be improvement when the nouveau ddx is used instead of the modesetting one
18:04 karolherbst: you can check /var/log/Xorg.0.log which is used
18:04 aric49: sure
18:04 karolherbst: but distributions kind of seem to default to modesetting
18:05 karolherbst: aric49: creating a /etc/X11/xorg.conf file with that content might help: https://gist.githubusercontent.com/karolherbst/c2f9ed0411a82d4ed5defe2a85028287/raw/b357613fe69e9163308a9a0869086806c8ca6d33/xorg.conf
18:06 aric49: karolherbst, I don't seem to have a Xorg.0.log in my /var/log
18:06 karolherbst: uhm
18:06 karolherbst: ohh
18:06 karolherbst: maybe you use wayland instead
18:06 karolherbst: or maybe it is only in the journald logs
18:07 aric49: display servers are fairly new to me.. is there an easy way to tell which I'm using?
18:07 karolherbst: aric49: can you switch to X11 mode in gdm?
18:07 karolherbst: there might be an option somewhere
18:07 karolherbst: related to using a different session or something
18:08 karolherbst: anyway, have to go, bbl
18:08 aric49: yes, I am using x11 karolherbst
18:09 aric49: looks like it's using journald for nouveau logs
18:19 pendingchaos: aric49: perhaps check ~/.local/share/xorg/?
18:20 aric49: ah. Seems to be there @pendingchaos
18:21 aric49: is there anything i can grep for in that log to show recent errors?
18:25 Lyude: karolherbst: figured out the runtime suspend deadlock with nouveau
18:26 Lyude: MST probing can take 5+ seconds in a worst case scenario when we manage to confuse the hub enough to make it stop responding, which is the amount of time it takes for the GPU to autosuspend!
18:27 Lyude: so we end up polling for outputs, which takes too long with 0 power refs and causes the pm core to try autosuspending the gpu and thus grab locks, which causes a deadlock of sorts when we end up trying to get a power reference in the poll worker for doing an atomic commit
18:27 Lyude: tbqh we need to be really careful about making sure we're grabbing pm refs at the right time in nouveau
18:34 karolherbst: Lyude: :O
18:34 karolherbst: crap
18:34 Lyude: hehehe
18:35 karolherbst: I assume this can also happen when hotplugging?
18:35 Lyude: in theory
18:35 aric49: are there any common patterns to look for in xorg logs to determine what might be causing a hang? Nothing I'm seeing seems to indicate errors
18:35 Lyude: it relies on the hub getting scared and not responding for 10+ seconds
18:35 karolherbst: Lyude: okay, so we should mark the GPU in use for the entire drm_load function
18:35 karolherbst: otherwise we will always run into stupid issues
18:35 Lyude: that as well
18:36 Lyude: karolherbst: if it gives you any ideas: the way intel handles this is by having the functions that they use to keep individual power domains on also grab runtime power references for the device
18:36 karolherbst: sure, but we don't really have to be that sophisticated
18:36 Lyude: so there aren't a whole ton of spots in i915 you actually have to grab power refs manually
18:37 karolherbst: right, or you only use the power ref thing
18:37 Lyude: yeah, I guess I'm trying to think of ways we could avoid having dumb issues like this in the future potentially
18:37 karolherbst: mhhh
18:37 karolherbst: basically everything needs to be guarded
18:38 karolherbst: I also had this issue with my reclocking patches where the GPU might suspend when I adjust the voltage on the fly, due to temperature changes
18:38 karolherbst: but this can be solved by the subdev infrastructure already
18:38 karolherbst: basically _never_ have active work when there is no active engine/subdev
18:38 Lyude: the other thing though you need to keep in mind is avoiding potentially calling pm_runtime_get() in a spot where the resume function would call down to something else that tries to grab a power ref, which leads to infinite recursion (this usually can be avoided by incrementing/decrementing dev->power.disable_depth)
18:39 karolherbst: right
18:39 karolherbst: we could have a wrapper around it
18:39 Lyude: nouveau seems to be pretty OK at this so far though, you guys are sane and don't just do hotplug polling the moment you come out of runtime resume :)
18:39 karolherbst: but nvkm doesn't need it
18:39 Lyude: or if you do, it's guarded properly
18:39 karolherbst: nvkm is guarded
18:40 karolherbst: you can't get it wrong in nvkm except with kworkers
18:40 karolherbst: but then you shouldn't have a kworker when its subdev is not active anymore
18:40 karolherbst: so this is just working out by itself
18:40 Lyude: jfyi; I'm also going to add guards into the output_poll_function as well, unless you don't think we're going to need that
18:40 karolherbst: again: everything needs to be guarded ;)
18:41 karolherbst: the smart thing to do is, to ensure nothing is called when the GPU is suspended
18:41 karolherbst: except the resume function
18:41 karolherbst: we also have some hwmon bugs regarding this
18:41 karolherbst: but I also fixed those, just need to rebase my patches
18:42 karolherbst: or the pstate file ;)
18:43 karolherbst: Lyude: https://github.com/karolherbst/nouveau/commit/6a0f9ff72cc575e209c64935c26984850771df6c
18:43 karolherbst: although maybe we should handle that stuff implicitly inside nvif
18:43 Lyude: huh, IS_ERR_VALUE() <-- is that new?
18:43 karolherbst: and nvif just knows when to wake the gpu up
18:43 karolherbst: Lyude: no :D
18:43 karolherbst: it is super old
18:44 karolherbst: like seriously super old
18:44 karolherbst: like you wouldn't believe it old
18:44 Lyude: !?!?
18:44 Lyude: and, it's not even used anywhere
18:44 karolherbst: well maybe not _that_ old
18:44 Lyude: including 4 lines below lol
18:44 karolherbst: added with v2.6.12
18:45 karolherbst: rc5
18:45 karolherbst: so yeah, like 13 years and a bit
18:45 Lyude: huh
18:46 Lyude: so is that what we're supposed to use instead of < 0?
18:46 karolherbst: first definition of IS_ERR_VALUE: #define IS_ERR_VALUE(x) unlikely((x) > (unsigned long)-1000L)
18:46 karolherbst: ;)
18:46 karolherbst: it was already smarter back then :p
18:46 karolherbst: now it is #define IS_ERR_VALUE(x) unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)
18:47 karolherbst: which is super, as you can also use quite some negative values
18:48 Lyude: ahhh, well I will definitely use that then
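For reference, a minimal usage sketch: IS_ERR_VALUE() is for plain integer return values, unlike IS_ERR(), which takes a pointer. The call it checks here is a placeholder.

    #include <linux/err.h>

    /* placeholder for something that returns a count on success or -errno on failure */
    static long do_something(void) { return 0; }

    static long check_ret_sketch(void)
    {
        long ret = do_something();

        if (IS_ERR_VALUE(ret))
            return ret;          /* only -MAX_ERRNO..-1 are treated as errors */
        return 0;                /* ret is a genuine (possibly large) value */
    }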
18:48 Lyude: btw karolherbst, think you could help review some of the other patches I've got up for nouveau?
18:48 karolherbst: sure
18:48 Lyude: I found a /lot/ of issues :)
18:48 Lyude: and still have a couple to go
18:49 Lyude: amazing what having physical access to a machine can do
18:49 karolherbst: :D
18:49 karolherbst: big surprise, isn't it
18:49 Lyude: hehehe
18:58 Lyude: karolherbst: hm, I don't know if the drm_load part will be needed, it looks like we don't call pm_runtime_use_autosuspend() until the entire drm_load process has finished
19:02 aric49: I wonder if these logs might indicate the source of the hanging I am seeing: https://gist.github.com/aric49/d5e8bf4e07f4e1765b3376613877c7ab#file-nouveau-log-L1-L7 ?
19:03 aric49: any ideas what might be causing that?
19:04 karolherbst: Lyude: mhh, how do we run into that 5-second timeout then? a thread?
19:06 aric49: also seeing this from an earlier crash: https://gist.github.com/aric49/cc354a7e08eafec409b16a3cde82dd66
19:06 karolherbst: Lyude: maybe we should write a wrapper around the kworker stuff and handle the autosuspend stuff there
19:07 Lyude: karolherbst: it's just us waiting for a response from the MST hub
19:07 karolherbst: Lyude: sure, but what is the thing around that
19:07 Lyude: eventually it does actually respond which causes us to try to enable the new displays in fbcon, but at that point there's a runtime suspend request active
19:08 Lyude: so we wait for that request to finish while the request waits for us, the output polling thread, to finish
19:08 Lyude: karolherbst: just nv50_mstm_detect()
19:08 Lyude: erm, hold on I think I am misunderstanding you
19:08 kiljacken: Now that there's some activity: Does it sound plausible that the reason my 1070 Ti (which was released november 2017) is not working at all with nouveau is lack of firmware support (given that the gp102 firmware blob is from feb 2018)? The HS bootloader gets uploaded to the card, but returns an error code, so there's nothing more I can really look into from this end
19:09 kiljacken: *feb 2017
19:09 Lyude: karolherbst: https://paste.fedoraproject.org/paste/RFoEszW9Ax~TpVkZibLDVw jfyi, this is my current fix
19:09 karolherbst: mhhh
19:10 karolherbst: this is a call from within drm
19:10 karolherbst: okay, but that is for hotplugging
19:10 Lyude: actually let me get you the full patch because that describes this way more in detail
19:10 karolherbst: Lyude: what about drm_connector_funcs.detect?
19:10 Lyude: https://paste.fedoraproject.org/paste/YZF8w~GuDXHqB1B7FpTyVg
19:11 Lyude: karolherbst: that would be nv50_mstm_detect()
19:11 Lyude: we should probably also have a ref in there I suppose
19:11 karolherbst: well nv50_mstm_detect is just a tiny fraction of detect
19:12 Lyude: karolherbst: not in this case
19:12 karolherbst: true
19:12 karolherbst: but I am more thinking about the general issue now
19:13 karolherbst: so basically we have to get to something like that: every time we do something with the hardware, increase the runpm counter _or_ return early in case of a suspended GPU
19:14 karolherbst: to return early or waking up the GPU should always depend on the actual thing happening
19:14 karolherbst: for hotplugging we have to wake up of course
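A sketch of that pattern using the stock runtime-PM helpers; pm_runtime_get_sync()/pm_runtime_put_autosuspend() and friends are the real kernel API, while the hardware-work function is just a placeholder.

    #include <linux/device.h>
    #include <linux/pm_runtime.h>

    static void do_hardware_work(struct device *dev) { (void)dev; /* placeholder */ }

    /* "every time we do something with the hardware, increase the runpm counter
     * _or_ return early in case of a suspended GPU" */
    static int guarded_hw_access(struct device *dev, bool wake_if_suspended)
    {
        int ret;

        if (!wake_if_suspended && pm_runtime_suspended(dev))
            return 0;                       /* nothing urgent, leave the GPU asleep */

        ret = pm_runtime_get_sync(dev);     /* wake the device and take a reference */
        if (ret < 0) {
            pm_runtime_put_noidle(dev);     /* get_sync bumps the count even on failure */
            return ret;
        }

        do_hardware_work(dev);

        pm_runtime_mark_last_busy(dev);     /* restarts the autosuspend timer */
        pm_runtime_put_autosuspend(dev);    /* drop the reference, let it idle off later */
        return 0;
    }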
19:15 karolherbst: Lyude: I could imagine that the issue with multiple displays might be something related
19:16 karolherbst: like the process of detecting all displays could take longer the more displays there are
19:16 Lyude: actually that is probably one of the other issues I fixed on the ml
19:16 Lyude: specifically the connector iteration issue
19:16 karolherbst: yeah...
19:16 karolherbst: I think we just need to add those things basically everywhere we touch the hardware
19:16 Lyude: mhm
19:16 karolherbst: otherwise we might always get those weirdo runpm issues
19:17 Lyude: btw, for the time being do you think that patch is good enough to fix this issue?
19:18 karolherbst: why pm_runtime_mark_last_busy?
19:18 Lyude: no specific reason, maybe pm_runtime_put() makes more sense
19:18 karolherbst: the pm_runtime_put_autosuspend is fine
19:19 karolherbst: mark_last_busy just resets the autosuspend timer
19:19 karolherbst: so you wait another 5 seconds
19:19 Lyude: ahhh, now I know
19:19 Lyude: I'll fix it up and get it out on the ML list in just a moment then
19:20 Lyude: are there any equivalent paths like this we need to handle for nv04+ as well?
19:20 karolherbst: well sometimes it makes sense, but here it doesn't. Either we detect a new display, in which case userspace may start to use it and the GPU is in use and resumed, or we disconnected the last display, in which case we could basically suspend the GPU right away anyway :)
19:20 karolherbst: Lyude: probably
19:20 karolherbst: hard to say, as I never really looked much into the code outside of nvkm
19:21 Lyude: alright, I'll try tripping up nouveau a couple other ways before I send this out then to make sure there isn't anywhere we're missi ng
19:21 Lyude: *missing
19:21 Lyude: actually, hm, I think pretty much any of the drm entrypoints from userspace should be good
19:21 karolherbst: well I guess basically every drm callback could hit such issues
19:22 karolherbst: mhhh
19:22 karolherbst: maybe
19:22 karolherbst: depends on the semantics there
19:22 karolherbst: I doubt that drm itself would wake up the GPU
19:22 karolherbst: and in every case we shouldn't rely on the 5 seconds being 5 seconds
19:23 karolherbst: I don't even want to know in how many cases we just luck out because things are usually faster than 5 seconds
19:23 karolherbst: and then it hits us hard like in this case
19:23 Lyude: yeah
19:27 Lyude: found one
19:30 kiljacken: I'll give it another try, sorry if the lack of response was an "i don't know": Does it sound plausible that a 1070 Ti (gp104 variant released november 2017) is not working with nouveau due to lack of firmware support (the gp102 firmware blob is from feb 2017)? The HS bootloader gets uploaded to the card, but returns an error code, so there's nothing more I can really look into from this end.
19:33 karolherbst: kiljacken: something like that
19:33 karolherbst: well there can be always bugs inside the firmware
19:33 karolherbst: or the secboot code not bugfree
19:33 karolherbst: and impossible to debug as the hardware doesn't tell us what's wrong
19:36 kiljacken: So basically, unless nvidia decides to be nice guys and drop some docs or something, i'm sol. Thanks for the reply, and have a nice whatever time of day it is in your timezone :)
19:56 Lyude: karolherbst: btw, where in nvkm does the rpm stuff happen?