00:34 Lekensteyn: karolherbst: I remember still seeing the audio device without writing to that reg (it was not functional though)
00:34 karolherbst: Lekensteyn: yeah well sure, but on my system I never saw the audio device to begin with before
00:35 Lekensteyn: kepler right?
00:35 karolherbst: yes
00:35 Lekensteyn: maybe it is different between generations (I have a maxwell, GTX 965M 10de:13d9)
00:36 karolherbst: doubtful
00:36 Lekensteyn: will have to try once I have a HDMI screen + audio available for testing
00:36 karolherbst: I am pretty sure it has something to do with how the gpu is initialized during boot
00:37 karolherbst: Lekensteyn: anyway, I think the best we could do is add some quirks to properly set up the GPU, I don't see any way to make it work inside nouveau for every system
00:38 karolherbst: even if that reg is written to prior to suspending, it won't show up on rescan
00:38 karolherbst: no idea why, but removing the card from the bus is somehow important
00:39 Lekensteyn: are PCI regs fixed for all generations or are some devices moving it around?
00:39 karolherbst: they are pretty much fixed
00:39 karolherbst: kepler added a second pci region at 0x8c000
00:39 karolherbst: but the prior one stayed fully functional afaik
00:40 karolherbst: that 0x88488 reg seems to be there since 0x84 or so
00:40 karolherbst: didn't see any reads from nvidia on a nv50
00:40 Lekensteyn: what do you mean by 0x8c000? That is outside the 0x1000 PCI config space range
00:40 Lekensteyn: *extended
00:40 karolherbst: yeah
00:41 karolherbst: it doesn't map to the config space afaik
00:41 karolherbst: and 0x88000 doesn't do that as well, at least for some parts it doesn't
00:41 karolherbst: or well... maybe it does, but most of the regs are not defined anyhow
00:43 Lekensteyn: were you able to get a functional audio device after revealing the audio function?
00:43 karolherbst: I don't have any ports on the nvidia gpu connected
00:44 karolherbst: but it is a mxm card
00:44 karolherbst: so it depends on the motherboard
00:45 karolherbst: the connectors are inside the vbios, 3x DP and one HDMI
00:45 karolherbst: and others
00:46 karolherbst: Lekensteyn: but anyhow, I would trust that nvidia guy on the answer, because this is basically what we saw on other cards as well, we just didn't really know which reg to poke, or why to poke it
00:49 Lekensteyn: yes, the remove/rescan requirement is quite ugly though...
00:49 karolherbst: mhh meh, found another ugly thing
00:49 karolherbst: my audio device stayed suspended after the ACPI suspend/resume calls
00:50 karolherbst: uhh
00:50 karolherbst: crap
00:50 karolherbst: this is ugly
00:50 karolherbst: after suspend/resume that reg is reset anyhow
00:51 karolherbst: so I assume that the audio device isn't properly restored unless we do this remove/rescan dance on every resume as well
00:52 karolherbst: Lekensteyn: at least also detects 4 pcms on my nvidia HDMI device
00:52 karolherbst: *alsa
00:53 Lekensteyn: I have also observed a non-functioning audio device after s/r, it somehow also triggered an oops at some point (I gave up after that)
00:54 karolherbst: yeah
00:54 karolherbst: and on resume I get this: "[ 2784.325151] snd_hda_codec_hdmi hdaudioC2D0: Unable to sync register 0x4f0100. -5"
00:54 karolherbst: so the card stays off
00:54 karolherbst: and "snd_hda_codec_hdmi hdaudioC2D0: HDMI: invalid ELD buf size -1"
00:54 karolherbst: okay, this means a lot of fun then
00:55 karolherbst: so basically what we need to do is: whenever the GPU is resumed: poke that reg, remove the GPU, rescan the bus
00:55 karolherbst: ohhh
00:55 karolherbst: it is enough to poke that reg
00:56 karolherbst: maybe
00:56 karolherbst: hopefully
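The resume sequence described above (poke the register, remove the GPU, rescan the bus) could be sketched from userspace through sysfs roughly as follows. This is only a minimal sketch, not anything taken from nouveau. It assumes that the MMIO register 0x88488 is also reachable as offset 0x488 of the device's PCI config file, which the log itself leaves open. The bit number, the device address and the helper names are hypothetical placeholders. The per-device `config` and `remove` files and the bus-wide `rescan` file are standard Linux PCI sysfs entries, and writing to them requires root.

```python
import os
import struct


def set_config_bit(config_path, offset, bit):
    """Set one bit of a 32-bit little-endian register in a PCI config file.

    Returns (old_value, new_value). Unbuffered I/O is used so that only the
    four bytes of the target register are read and written.
    """
    with open(config_path, "r+b", buffering=0) as f:
        f.seek(offset)
        (old_value,) = struct.unpack("<I", f.read(4))
        new_value = old_value | (1 << bit)
        f.seek(offset)
        f.write(struct.pack("<I", new_value))
    return old_value, new_value


def remove_and_rescan(bdf, sysfs_root="/sys/bus/pci"):
    """Remove one PCI device from the bus, then ask the kernel to rescan."""
    with open(os.path.join(sysfs_root, "devices", bdf, "remove"), "w") as f:
        f.write("1")
    with open(os.path.join(sysfs_root, "rescan"), "w") as f:
        f.write("1")


# Hypothetical usage on a real system (run as root). The address and the
# bit number are assumptions, not values given in the log:
#   set_config_bit("/sys/bus/pci/devices/0000:01:00.0/config", 0x488, bit=25)
#   remove_and_rescan("0000:01:00.0")
```

Taking the sysfs root as a parameter keeps the helpers testable against a scratch directory before they are pointed at real hardware.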
17:14 pmoreau: I’m having some issues mmt’ing the blob… This is what I get when tracing an OpenCL program on the blob https://hastebin.com/bijokidisu.vbs
17:15 pmoreau: I was a bit more successful when I tried mmt’ing on my other computer, even though it didn’t grab everything.
17:17 pmoreau: For information, this is on kernel 4.13.4 + NVIDIA 384.90 + CUDA 9.0.176
17:17 imirkin: are you having an issue with nouveau code? i.e. why are you trying to trace the cvt thing?
17:19 pmoreau: Yes, cvt u64 -> u32 returns an illegal instruction encoding.
17:20 pmoreau: I had sent a patch for it some time ago, and you asked for me to check which cvt were resulting in errors, which I never truly did.
17:21 pmoreau: As I had an issue with my patch a couple of days ago, I decided to check all combinations on the blob and see which cvt it did manually, without using the cvt instruction, and submit a new version of the patch using that information.
17:29 imirkin: yeah
17:29 imirkin: check what i did
17:29 imirkin: for tgsi
17:31 pmoreau: Will do
17:32 pmoreau: Would be nice to have some assert in the emission to break if trying to emit wrong instructions.
17:34 imirkin: yep
17:34 imirkin: we have no ir validation
17:34 imirkin: which is a bit unfortunate
17:35 pmoreau: Will try to add some in the patch
17:37 pmoreau: I would argue that doing custom things depending on the cvt instruction should be done as a pass over NVIR, rather than upon generating NVIR.
17:38 pmoreau: As otherwise we duplicate that custom behaviour for each frontend, though maybe you create an awesome helper function which does that, and which I should be using. :-)
17:39 imirkin: agreed.
17:40 imirkin: iirc i did what was easy with tgsi.
17:40 imirkin: feel free to refactor
17:40 imirkin: check all the U642U or whatever ops
17:42 pmoreau: Will do after finishing a small CUDA test program, as I should be able to check its generated assembly without mmt.
17:43 imirkin: yeah, can just use nvdisasm on the cubin
17:43 pmoreau: Right
18:32 pmoreau: imirkin: Found your code, and it seems to be exactly what the blob is doing. :-)
19:42 imirkin: cool
22:24 pmoreau: Hum… I thought there would be way more different cases, but apparently not. I am not going to complain about that though. :-)
22:25 pmoreau: Need to check when the cvt has a sat modifier, run some tests, and it should be good.
22:42 karolherbst: pmoreau: ohh, fun with sat again?
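To illustrate what a u64 -> u32 conversion done "manually, without using the cvt instruction" could amount to, here is a minimal sketch of the arithmetic on a 64-bit value held as two 32-bit halves. It is not the code from nouveau or from the blob, neither of which is quoted in the log. The function names are hypothetical, and the saturating case assumes the usual integer meaning of sat (clamp to the destination range).

```python
U32_MAX = 0xFFFFFFFF


def split_u64(value):
    """Split an unsigned 64-bit value into its (low, high) 32-bit halves."""
    lo = value & U32_MAX
    hi = (value >> 32) & U32_MAX
    return lo, hi


def cvt_u64_to_u32(value, sat=False):
    """Convert u64 to u32 using only the two 32-bit halves.

    Without sat the result is simply the low half (truncation).
    With sat, any non-zero high half means the value does not fit,
    so the result is clamped to the largest u32.
    """
    lo, hi = split_u64(value)
    if sat and hi != 0:
        return U32_MAX
    return lo


print(hex(cvt_u64_to_u32(0x100000005)))            # truncating conversion
print(hex(cvt_u64_to_u32(0x100000005, sat=True)))  # saturating conversion
```

The non-saturating case needs no arithmetic at all, only a selection of the low half, which is consistent with there being few distinct cases to handle.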