01:16karolherbst: imirkin: I git cloned the WebGL registry and opened the test suite through file:// ... it's like 10x faster
01:17karolherbst: just requires chromium to be opened with --allow-file-access-from-files
01:17karolherbst: okay... it's like 100x faster
01:18karolherbst: some tests are done in <30 ms
01:19karolherbst: and hopefully no timeouts
01:24imirkin: i didn't realize there would be such a huge perf difference
01:24imirkin: but it makes sense in hindsight
01:25karolherbst: well it doesn't help with the longer running tests, but eliminating timeouts due to HTTP fail + faster in general are nice advantages :)
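The local run karolherbst describes (a clone of the WebGL registry opened over file:// with `--allow-file-access-from-files`) could be scripted roughly as below. This is only a sketch: the `chromium` binary name and the `sdk/tests/webgl-conformance-tests.html` location inside the checkout are assumptions, not something stated in the log.

```python
from pathlib import Path


def chromium_command(checkout, browser="chromium"):
    """Build the command line for a local file:// conformance run.

    `checkout` is the path to a clone of the WebGL registry.
    The page location below is an assumption about the checkout layout.
    """
    page = Path(checkout).resolve() / "sdk" / "tests" / "webgl-conformance-tests.html"
    # Without this flag Chromium refuses to load sibling files from file:// pages.
    return [browser, "--allow-file-access-from-files", page.as_uri()]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(chromium_command("WebGL")))
```

The returned list can be handed to `subprocess.run` as-is once the paths are confirmed.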
01:35imirkin: i just wasn't thinking about it
03:56rhyskidd: Lightsword: an example of the interface to one of the falcons (this one is "PMU") as we understand it is here: https://envytools.readthedocs.io/en/latest/nvrm/pmu/ucode-cmds.html?highlight=nv_ucode_cmd
04:00rhyskidd: that's just one falcon, but the commands provided by the binary blob firmware are read/write memory, do some special commands stuff, run some verification routines etc
04:00rhyskidd: we (as in a user or developer) only see this interface
04:01rhyskidd: and this interface is only present if the relevant binary blob firmware is loaded at runtime onto the gpu
04:01rhyskidd: and by implication, the loaded blob passes whatever verification or validation the silicon requires
04:04Lightsword: rhyskidd, the validation isn’t done in the bios right?
04:17imirkin: the boot process for the board isn't entirely clear to me
04:18imirkin: we have to use a secure blob in order to load other secure blobs
04:18imirkin: but ... how does the first secure blob get there
04:18imirkin: afaik the validation of the signature is done in the silicon though
04:19imirkin: could be the other engine doing it though, pre-loaded with firmware off the board somehow
04:19Lightsword: yeah, at least the comments indicate that it’s a 16byte signature which is unusually short even for ECC https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c#L35
04:19imirkin: i pointed you at that stuff the other day, no?
04:20Lightsword: I think a slightly different loader though
04:20imirkin: there are several versions
04:20imirkin: r352 is in reference to the blob version numbers
04:20Lightsword: yeah, my guess is they use the same signature scheme everywhere and the struct there seems to only have a single key
04:21imirkin: that struct is the contents of the _sig file
04:21Lightsword: which sig file does acr_r352 correspond to?
04:22Lightsword: or well acr_r352_flcn_bl_desc I guess
04:23imirkin: the gm200 ones
04:23imirkin: the gp10x's are the r36x's
04:23imirkin: or whatever
04:24imirkin: (you can tell by the filesize)
05:43Lightsword: any idea what’s going on here with the PMU_SHA1_GID? https://github.com/madisongh/linux-tegra-4.9/blob/c6f99c1e23745448434885c0d3e4c02320a34a2d/nvidia/nvgpu/drivers/gpu/nvgpu/include/nvgpu/pmuif/gpmuif_pmu.h#L73-L77
05:49imirkin: not a clue. where is GID_SIGNATURE used?
05:51Lightsword: in this function I guess https://github.com/madisongh/linux-tegra-4.9/blob/c6f99c1e23745448434885c0d3e4c02320a34a2d/nvidia/nvgpu/drivers/gpu/nvgpu/common/pmu/pmu.c#L306-L384
06:20Lightsword: btw shouldn’t it at least be possible to extract the signed images from nvidia’s closed source driver?
06:21imirkin: just gotta find them
06:21Lightsword: that pretty tricky?
06:22imirkin: i wrote these tools: https://github.com/envytools/firmware
06:22imirkin: the relevant one here is probably the scanner
06:22imirkin: unfortunately the blob no longer uploads these directly (thus visible in a mmiotrace)
06:22imirkin: but instead the GPU will DMA them from sysmem directly
06:23imirkin: which means you gotta capture the command to get the GPU to do that, and make a copy for yourself
06:23Lightsword: does the open source signed firmware loader work differently?
06:24imirkin: good question
06:24imirkin: doubt it
06:24imirkin: but i didn't think to look
12:20karolherbst: Lightsword, imirkin: our loader is more or less the same thing. Also some falcons have a crypto extension with 128 bit crypto regs to do all the validation in hardware
12:21Lightsword: karolherbst, any idea what algo is used for those 128 bit crypto regs?
12:21karolherbst: and the challenge is that the driver puts a 128 bit signature into one of those and the falcon does the same with the current firmware + its key
12:21karolherbst: Lightsword: I am not aware if that's already public information or not sadly
12:22karolherbst: but the end result is a 128 bit value
12:22karolherbst: so, it can't be _that_ secure
12:23Lightsword: yeah, sounds like high chance that signatures can be forged if the validation scheme can be figured out
12:24karolherbst: thing is, the falcon needs to have the private key, anything else wouldn't make sense
12:24karolherbst: if it's RSA
12:24karolherbst: but then it's quite easy to find the key on the silicon
12:24Lightsword: 128 bit does not sound like it’s using asymmetric crypto though
12:24karolherbst: that's why you hash
12:24Lightsword: 128 bit RSA would be totally broken
12:25Lightsword: it could be just a hash potentially, in that case would only have to figure out the hash algo
12:25karolherbst: think about why an RSA key is that long
12:25Lightsword: 512 bit RSA is totally broken
12:25karolherbst: no, you can have a 4k RSA key, but only generate a 128 bit signature
12:25Lightsword: so 128 bit would be even more broken
12:25karolherbst: or 256 bit
12:26karolherbst: not saying it's as secure, but the signature doesn't have to be as weak as an equally long RSA key
12:26Lightsword: uh, don’t think that’s how it works, RSA signatures are the same size as the key from my understanding
12:26karolherbst: normally they are, yes
12:27Lightsword: signature schemes typically sign a hash
12:27Lightsword: the hash would be 128 bits…but isn’t much good if it’s not signed
12:28karolherbst: but that's not the point here
12:28karolherbst: the falcon just wants to know if you have the key, nothing more
12:28karolherbst: aka you are able to generate something valid
12:29Lightsword: and the falcon would have only a public key or would it have a private key?
12:29Lightsword: normally one would embed a public key and nvidia would retain the private key
12:29Lightsword: but the key length indicates that is not what was done
12:29karolherbst: right, but then you require the full signature, or not?
12:30karolherbst: so it has the private key and it has to be hidden on the hardware
12:30karolherbst: and then it makes more sense to just use AES
12:30Lightsword: so without a full signature if someone can extract the validation function they can forge signatures
12:30karolherbst: smaller key == easier to hide
12:31Lightsword: yes but forgeable
12:31karolherbst: Lightsword: keep in mind, that there is no software controlled firmware doing the validation, it's basically triggered by the instruction decoder
12:32Lightsword: true, so the tricky part is key extraction
12:32karolherbst: what the unsigned part of the firmware does is basically set the stack pointer, write the signature into a crypto reg, jump into the secure code
12:32karolherbst: and jumping into the code triggers the validation
12:33Lightsword: so would probably need either a glitching vulnerability or to decap the chip and put it under an electron microscope
12:33karolherbst: more or less, yes
12:33Lightsword: that’s assuming there’s no other vulnerability that would allow readout of the validation scheme
12:33karolherbst: the second option is trivial
12:33karolherbst: it just costs money
12:34karolherbst: maybe somebody would be able to do a successful decap on first try
12:34Lightsword: yeah, once one extracts the validation routine and key anyone would be able to forge signatures for any GPUs that use the same scheme
12:35Lightsword: a proper scheme would embed only a pubkey, which would not allow key forging
12:35karolherbst: I wouldn't be surprised if the key is already somewhere public
12:36karolherbst: Lightsword: the validation is way too fast for doing RSA though
12:36karolherbst: and would be terribly slow on current hardware
12:36karolherbst: I guess
12:36Lightsword: right, only asymmetric crypto scheme that might be potentially secure would be some custom ECC scheme, but those key lengths are really too small for that to be likely
12:37Lightsword: standard is 256 bit for ECC
12:39Lightsword: and I think 256 bit ECC produces signatures larger than the key unlike RSA
12:39Lightsword: although I think some schemes may get as small as 160…but still larger than 128
12:41Lightsword: karolherbst, any idea where one might look in public for the keys? I was looking around using github search a bit
12:43Lightsword: karolherbst, any idea why nvidia is requiring signed microcode here? is it so that they can disable features on certain models easier?
12:47karolherbst: no idea if they made a public statement
12:49Lightsword: so I’d guess it’s probably something like HMAC-MD5 based on the 128 bit keys
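Lightsword's HMAC-MD5 guess is unconfirmed, but it illustrates the point both sides were circling: a keyed hash yields exactly 128 bits, and because the verifier holds the same secret as the signer, anyone who extracts the key can sign arbitrary firmware. A minimal sketch, with a made-up key and made-up firmware bytes (nothing here reflects the real falcon scheme):

```python
import hashlib
import hmac


def tag(key: bytes, firmware: bytes) -> bytes:
    """Compute a 128-bit HMAC-MD5 tag over a firmware image."""
    return hmac.new(key, firmware, hashlib.md5).digest()


def verify(key: bytes, firmware: bytes, signature: bytes) -> bool:
    """Recompute the tag with the shared key and compare in constant time."""
    return hmac.compare_digest(tag(key, firmware), signature)


if __name__ == "__main__":
    key = b"hypothetical-key"          # stands in for a secret hidden in silicon
    official = b"official firmware"    # placeholder image
    sig = tag(key, official)
    print(len(sig) * 8)                        # tag size in bits
    print(verify(key, official, sig))          # genuine image passes
    print(verify(key, b"patched", sig))        # modified image fails
    # With the key extracted, a tag for any other image is just as valid:
    print(verify(key, b"patched", tag(key, b"patched")))
```

With a public-key scheme the last step would not be possible from what the chip stores, which is the distinction Lightsword draws at 12:35.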
12:53karolherbst: *sigh* this WebGL cts is killing me
12:53karolherbst: "out of memory" sure.....
16:54karolherbst: imirkin: so with a local WebGL run I got 1 timeout, and still around 0.6% failed subtests
16:55imirkin: and no hangs?
16:55imirkin: or did you skip the max-texture-size tests?
16:55imirkin: oh btw - are you running the 1.x or 2.x tests?
16:55karolherbst: I didn't skip it
16:55imirkin: ok good
16:55imirkin: i guess you have more ram than i do
16:56karolherbst: most of the fails seem to be compiler/API related
16:56imirkin: or something got fixed somewhere
16:56karolherbst: yeah... 32GB+ :p
16:56imirkin: i have 6GB
16:56imirkin: not sure how much that GPU had... whatever GTX960's typically have
16:56imirkin: (no longer have that gpu on me)
16:59karolherbst: we fail the precision tests for some SFU operations
17:00karolherbst: lowp is fine for cos, but mediump and highp not
17:00imirkin: perhaps tests are too picky, dunno
17:00imirkin: i don't think nvidia does any conditioning on those
17:00imirkin: should check
17:01karolherbst: fract as well, odd
17:02karolherbst: something is odd with rgba32ui textures as well
17:02karolherbst: fbocolorbuffer tests
17:02imirkin: oh wait
17:02imirkin: we don't clamp to zero
17:02imirkin: but it's unclear that we're supposed to
17:02imirkin: it's with some stupid blit of int -> uint
17:02karolherbst: I kind of fixed something in that regard, no?
17:02karolherbst: for the CTS
17:02imirkin: sounds familiar.
17:03imirkin: maybe that's what i'm remembering
17:03karolherbst: maybe it needs a bit more work
17:03imirkin: for each one you investigate
17:03imirkin: create a new issue on the nouveau-cts board
17:04imirkin: with notes
17:04karolherbst: now.. how to select subtests :/
17:04imirkin: you can just edit the html ;)
17:53l4mRh4X0r[m]: Right, those last messages didn't come through because I wasn't identified
17:54l4mRh4X0r[m]: Hi! I'm trying to start a game using `DRI_PRIME=1`, but I'm getting the following message: java: ../libdrm-2.4.96/nouveau/pushbuf.c:723: nouveau_pushbuf_data: Assertion `kref' failed.
17:54l4mRh4X0r[m]: And my dmesg is full of `gr: DATA_ERROR` messages
17:54l4mRh4X0r[m]: I'm guessing it's got something to do with multiple threads doing GL, but I'm not sure
17:56l4mRh4X0r[m]: Any idea how to solve this?
18:13gnarface: l4mRh4X0r[m]: probably not, but any chance you've got any bad capacitors on the board? https://bugs.freedesktop.org/show_bug.cgi?id=95330
18:14gnarface: my brief google searching returns other complaints about the same error, but at least one of them determined it was related to capacitor failure and fixed it by replacing them....
18:14gnarface: worth checking, anyway
18:26karolherbst: imirkin: we fail the dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rgba32ui test, but I am not sure if that's related... the only rgba32ui test we fail on the gles CTS afaik
18:26karolherbst: seems like I really have to dig through that WebGL stuff and how to properly debug all that
18:33karolherbst: imirkin: ............. it was using the software stuff, argh
18:37l4mRh4X0r[m]: gnarface: Well, it's a laptop with Nvidia Optimus, so I doubt it. And even if it were true, I'd rather not void my warranty 🙂
18:38l4mRh4X0r[m]: Card is a GM107 (NV117) by the way
18:45l4mRh4X0r[m]: My googling turned up https://bugs.freedesktop.org/show_bug.cgi?id=92438, which is an error I'm also getting sometimes
18:48gnarface: l4mRh4X0r[m]: eh, it's over my head, sorry.
18:49gnarface: looks like the best thing you can do is participate in testing if you can reproduce the issue
18:49l4mRh4X0r[m]: No worries
18:49gnarface: but i wouldn't hold my breath for a fix
18:50gnarface: a lot of this stuff simply never will get fixed unless NVidia wants it fixed
18:50karolherbst: l4mRh4X0r[m]: what game?
18:50karolherbst: l4mRh4X0r[m]: I have some fixes for that multithreaded stuff and would like to know if it fixes that game as well
18:51l4mRh4X0r[m]: In this case, a modded Minecraft. The loading screen flickers a lot between two loaders, hence my guess it's a multithreading issue
18:51l4mRh4X0r[m]: an unmodded minecraft already works fine
18:52l4mRh4X0r[m]: Sure, how would I go about testing those fixes?
18:52karolherbst: l4mRh4X0r[m]: compile mesa locally and use that. repository is here: https://github.com/karolherbst/mesa.git branch is named "mt_fixes_take2"
18:58l4mRh4X0r[m]: karolherbst: would using LD_LIBRARY_PATH do the trick?
18:58karolherbst: l4mRh4X0r[m]: depends on how you build/install it, but it should
18:58karolherbst: like if you have your local prefix and use LD_LIBRARY_PATH it should work
19:01huehner: karolherbst: l4mRh4X0r[m] about modded minecraft, i was getting crashes and imirkin pointed me to your mt_fixes_take2 branch which when used made it perfectly stable for me (using ld_library_path to select it just for that game)
19:01karolherbst: huehner: thanks, good to know.
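Selecting a locally built Mesa for a single game, as huehner describes, comes down to setting the environment for that one process. A rough sketch follows; the `lib` and `lib/dri` subdirectories are assumptions (some builds install to `lib64` or a multiarch directory instead), and `LIBGL_DRIVERS_PATH` is added here on the assumption that the DRI drivers live under the same prefix, which the log does not say.

```python
import os


def mesa_env(prefix, base=None):
    """Return a copy of the environment pointing at a local Mesa prefix.

    `prefix` is wherever the mt_fixes_take2 build was installed (hypothetical).
    """
    env = dict(os.environ if base is None else base)
    libdir = os.path.join(prefix, "lib")  # assumed layout, adjust for lib64 etc.
    old = env.get("LD_LIBRARY_PATH")
    # Prepend so the local libGL wins over the system copy.
    env["LD_LIBRARY_PATH"] = libdir if not old else libdir + os.pathsep + old
    env["LIBGL_DRIVERS_PATH"] = os.path.join(libdir, "dri")
    env["DRI_PRIME"] = "1"  # render on the secondary (nouveau) GPU
    return env


if __name__ == "__main__":
    env = mesa_env(os.path.expanduser("~/mesa-local"))
    for name in ("LD_LIBRARY_PATH", "LIBGL_DRIVERS_PATH", "DRI_PRIME"):
        print(name + "=" + env[name])
```

Passing the result as `env=` to `subprocess.run` keeps the override limited to the game and leaves the rest of the session on the system Mesa.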
19:30pmoreau: karolherbst: I rebased the clover series on top of the latest master and fixed a few merge conflicts. I’m going to check again that the different build configurations still build properly and then send it to the ML.
19:31pmoreau: Going to do a bit of reviewing as well, on your NIR in Nouveau series.
19:34pmoreau: karolherbst: How do you want to proceed regarding Tesla BTW: merge in the current work and fix Tesla afterwards, or fix it before merging?
19:37l4mRh4X0r[m]: karolherbst: I'm getting `double free or corruption (fasttop)`
19:40l4mRh4X0r[m]: Or at least, I did the first time. Now I'm getting a truckload of `nouveau 0000:01:00.0: gr: ILLEGAL_CLASS ch 2 [007fb24000 java] subc 0 class ffff mthd 2384 data 00000000` in my kernel log, and the program hangs
19:41l4mRh4X0r[m]: And now the double free message twice
19:41l4mRh4X0r[m]: It's crashing a lot later though, so that's good
19:43l4mRh4X0r[m]: Now a segfault in `nvc0_screen_fence_update+0x8`
19:45l4mRh4X0r[m]: Now the double free message once again. So it's already an improvement, but not quite there yet.
19:58l4mRh4X0r[m]: FWIW, this is what it looks like with intel drivers:
19:59l4mRh4X0r[m]: And this is what it looks like with nouveau: https://canarymod.net/~willem/nouveau.ogv
20:00karolherbst: l4mRh4X0r[m]: it seems like something is terribly wrong on your end if it also causes issues with intel :/
20:00l4mRh4X0r[m]: I don't see the flashes of another application through the screen by the way, that's just the recording software
20:01karolherbst: I see
20:05karolherbst: l4mRh4X0r[m]: are you sure you switched the branch btw?
20:06l4mRh4X0r[m]: Fun fact: Using i965 with your branch also crashes, but with SIGFPE
20:07l4mRh4X0r[m]: Not reproducibly though
20:10karolherbst: maybe something since my latest rebase broke it...
20:12karolherbst: seems like I get some crashes as well, let me check
20:12karolherbst: huehner: can you tell me the git commit id of the latest commit on the branch you have locally?
20:14karolherbst: l4mRh4X0r[m]: I force-pushed an older version, care to retry with that one?
20:14karolherbst: top commit should be 7e0837d6436
20:21l4mRh4X0r[m]: Yeah, this one works fine
20:22karolherbst: mhh, I guess it is those tsc changes imirkin pushed recently... so I need to dig into that and see what to do about this one
20:24l4mRh4X0r[m]: I do have some weird rendering though, but I don't know whether that's due to your branch
20:32karolherbst: might be some nouveau bug though
21:52pmoreau: karolherbst: You might want to grab https://github.com/pierremoreau/mesa/commit/6283b5f28422017134850a04c8e4515e6f9edf3d: one gets an undefined reference otherwise when building without SPIRV-Tools and SPIRV-LLVM-Translator.
23:06huehner: karolherbst: i got cf4b03f35466ef61021afe149772710ff72a3ef6 of your take2 branch here
23:13karolherbst: huehner: ohh, this is from an even older version
23:39huehner: karolherbst: cloned that around 19th last month