01:16 karolherbst: imirkin: I git cloned the WebGL registry and opened the test suite through file:// ... it's like 10x faster
01:17 karolherbst: just requires chromium to be opened with --allow-file-access-from-files
01:17 karolherbst: okay... it's like 100x faster
01:18 karolherbst: some tests are done in <30 ms
01:19 imirkin: :)
01:19 karolherbst: and hopefully no timeouts
01:24 imirkin: i didn't realize there would be such a huge perf difference
01:24 imirkin: but it makes sense in hindsight
01:25 karolherbst: well it doesn't help with the longer running tests, but eliminating timeouts due to HTTP fail + faster in general are nice advantages :)
01:35 imirkin: yeah
01:35 imirkin: i just wasn't thinking about it
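A minimal sketch of the local run karolherbst describes, assuming a checkout of the WebGL registry on disk. The `chromium` binary name and the `--allow-file-access-from-files` flag come from the log; the checkout location and the path of the conformance page inside the checkout are assumptions and may need adjusting.

```python
from pathlib import Path


def chromium_command(checkout,
                     page="sdk/tests/webgl-conformance-tests.html",
                     browser="chromium"):
    """Build the command line for opening the test suite through file://.

    `checkout` is a hypothetical path to the local WebGL registry clone and
    `page` is an assumed location of the test runner inside it.
    """
    # as_uri() needs an absolute path, so resolve the checkout first.
    url = (Path(checkout).resolve() / page).as_uri()
    # Without this flag Chromium blocks the file:// cross-file requests
    # the test pages make.
    return [browser, "--allow-file-access-from-files", url]


if __name__ == "__main__":
    # Print the command instead of launching it, so nothing is started
    # by accident.
    print(" ".join(chromium_command("WebGL")))
```

The returned list can be handed to `subprocess.run` as-is once the paths match the local setup.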
03:56 rhyskidd: Lightsword: an example of the interface to one of the falcons (this one is "PMU") as we understand it is here: https://envytools.readthedocs.io/en/latest/nvrm/pmu/ucode-cmds.html?highlight=nv_ucode_cmd
03:58 Lightsword: ah
04:00 rhyskidd: that's just one falcon, but the commands provided by the binary blob firmware are read/write memory, do some special commands stuff, run some verification routines etc
04:00 rhyskidd: we (as in a user or developer) only see this interface
04:01 rhyskidd: and this interface is only present if the relevant binary blob firmware is loaded at runtime onto the gpu
04:01 rhyskidd: and by implication, the loaded blob passes whatever verification or validation the silicon requires
04:04 Lightsword: rhyskidd, the validation isn’t done in the bios right?
04:17 imirkin: the boot process for the board isn't entirely clear to me
04:18 imirkin: we have to use a secure blob in order to load other secure blobs
04:18 imirkin: but ... how does the first secure blob get there
04:18 imirkin: afaik the validation of the signature is done in the silicon though
04:19 imirkin: could be the other engine doing it though, pre-loaded with firmware off the board somehow
04:19 Lightsword: yeah, at least the comments indicate that it’s a 16byte signature which is unusually short even for ECC https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c#L35
04:19 imirkin: i pointed you at that stuff the other day, no?
04:19 Lightsword: yeah
04:20 Lightsword: I think a slightly different loader though
04:20 imirkin: there are several versions
04:20 imirkin: r352 is in reference to the blob version numbers
04:20 Lightsword: yeah, my guess is they use the same signature scheme everywhere and the struct there seems to only have a single key
04:21 imirkin: that struct is the contents of the _sig file
04:21 Lightsword: which sig file does acr_r352 correspond to?
04:22 Lightsword: or well acr_r352_flcn_bl_desc I guess
04:23 imirkin: the gm200 ones
04:23 imirkin: the gp10x's are the r36x's
04:23 imirkin: or whatever
04:24 imirkin: (you can tell by the filesize)
05:43 Lightsword: any idea what’s going on here with the PMU_SHA1_GID? https://github.com/madisongh/linux-tegra-4.9/blob/c6f99c1e23745448434885c0d3e4c02320a34a2d/nvidia/nvgpu/drivers/gpu/nvgpu/include/nvgpu/pmuif/gpmuif_pmu.h#L73-L77
05:49 imirkin: not a clue. where is GID_SIGNATURE used?
05:51 Lightsword: in this function I guess https://github.com/madisongh/linux-tegra-4.9/blob/c6f99c1e23745448434885c0d3e4c02320a34a2d/nvidia/nvgpu/drivers/gpu/nvgpu/common/pmu/pmu.c#L306-L384
06:20 Lightsword: btw shouldn’t it at least be possible to extract the signed images from nvidia’s closed source driver?
06:21 imirkin: sure
06:21 imirkin: just gotta find them
06:21 Lightsword: that pretty tricky?
06:22 imirkin: i wrote these tools: https://github.com/envytools/firmware
06:22 imirkin: the relevant one here is probably the scanner
06:22 imirkin: unfortunately the blob no longer uploads these directly (thus visible in a mmiotrace)
06:22 imirkin: but instead the GPU will DMA them from sysmem directly
06:23 imirkin: which means you gotta capture the command to get the GPU to do that, and make a copy for yourself
06:23 Lightsword: does the open source signed firmware loader work differently?
06:24 imirkin: good question
06:24 imirkin: doubt it
06:24 imirkin: but i didn't think to look
12:20 karolherbst: Lightsword, imirkin: our loader is more or less the same thing. Also some falcons have a crypto extension with 128 bit crypto regs to do all the validation in hardware
12:21 Lightsword: karolherbst, any idea what algo is used for those 128 bit crypto regs?
12:21 karolherbst: and the challenge is that the driver puts a 128 bit signature into one of those and the falcon does the same with the current firmware + its key
12:21 karolherbst: Lightsword: I am not aware if that's already public information or not sadly
12:22 karolherbst: but the end result is a 128 bit value
12:22 karolherbst: so, it can't be _that_ secure
12:23 Lightsword: yeah, sounds like high chance that signatures can be forged if the validation scheme can be figured out
12:24 karolherbst: thing is, the falcon needs to have the private key, anything else wouldn't make sense
12:24 karolherbst: if it's RSA
12:24 karolherbst: but then it's quite easy to find the key on the silicon
12:24 Lightsword: 128 bit does not sound like it’s using asymmetric crypto though
12:24 karolherbst: that's why you hash
12:24 Lightsword: 128 bit RSA would be totally broken
12:25 Lightsword: it could be just a hash potentially, in that case would only have to figure out the hash algo
12:25 karolherbst: why?
12:25 karolherbst: think about why an RSA key is that long
12:25 Lightsword: 512 bit RSA is totally broken
12:25 karolherbst: no, you can have a 4k RSA key, but only generate a 128 bit signature
12:25 Lightsword: so 128 bit would be even more broken
12:25 karolherbst: or 256 bit
12:26 karolherbst: not saying it's as secure, but the signature doesn't have to be as weak as an equally long RSA key
12:26 Lightsword: uh, don’t think that’s how it works, RSA signatures are the same size as the key from my understanding
12:26 karolherbst: normally they are, yes
12:27 Lightsword: signature schemes typically sign a hash
12:27 Lightsword: the hash would be 128 bits…but isn’t much good if it’s not signed
12:27 karolherbst: right
12:28 karolherbst: but that's not the point here
12:28 karolherbst: the falcon just wants to know if you have the key, nothing more
12:28 karolherbst: aka you are able to generate something valid
12:29 Lightsword: and the falcon would have only a public key or would it have a private key?
12:29 Lightsword: normally one would embed a public key and nvidia would retain the private key
12:29 Lightsword: but the key length indicates that is not what was done
12:29 karolherbst: right, but then you require the full signature, or not?
12:30 Lightsword: yes
12:30 karolherbst: so it has the private key and it has to be hidden on the hardware
12:30 karolherbst: and then it makes more sense to just use AES
12:30 Lightsword: so without a full signature if someone can extract the validation function they can forge signatures
12:30 karolherbst: smaller key == easier to hide
12:31 Lightsword: yes but forgeable
12:31 karolherbst: Lightsword: keep in mind, that there is no software controlled firmware doing the validation, it's basically triggered by the instruction decoder
12:32 Lightsword: true, so the tricky part is key extraction
12:32 karolherbst: what the unsigned part of the firmware does is basically set the stack pointer, write the signature into a crypto reg, jump into the secure code
12:32 karolherbst: and jumping into the code triggers the validation
12:33 Lightsword: so would probably need either a glitching vulnerability or to decap the chip and put it under an electron microscope
12:33 karolherbst: more or less, yes
12:33 Lightsword: that’s assuming there’s no other vulnerability that would allow readout of the validation scheme
12:33 karolherbst: the second option is trivial
12:33 karolherbst: it just costs money
12:34 karolherbst: maybe somebody would be able to do a successful decap on first try
12:34 karolherbst: dunno
12:34 Lightsword: yeah, once one extracts the validation routine and key anyone would be able to forge signatures for any GPUs that use the same scheme
12:35 Lightsword: a proper scheme would embed only a pubkey, which would not allow key forging
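A toy sketch of the public-key arrangement Lightsword is describing, using textbook RSA with deliberately tiny, insecure numbers. It only illustrates two points from the discussion: the signature is a value modulo the key modulus, so it is as large as the key, and the verifier needs nothing but the public half. It says nothing about what the falcon actually does; all names and parameters are made up for the example.

```python
import hashlib

# Toy textbook-RSA parameters, far too small for real use.
P, Q = 61, 53
N = P * Q   # public modulus (12 bits)
E = 17      # public exponent, what a verifier would embed
D = 2753    # private exponent, what the vendor would keep (E*D = 1 mod 3120)


def digest_mod_n(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Needs the private exponent D."""
    return pow(digest_mod_n(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Needs only the public values N and E."""
    return pow(signature, E, N) == digest_mod_n(message)


if __name__ == "__main__":
    sig = sign(b"firmware image")
    print(verify(b"firmware image", sig), sig < N)
```

Because the signature lives in the range `0..N-1`, a 128 bit value could only come from a 128 bit modulus in this kind of scheme, which is the size objection raised in the log.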
12:35 karolherbst: I wouldn't be surprised if the key is already somewhere public
12:36 karolherbst: Lightsword: the validation is way too fast for doing RSA though
12:36 karolherbst: and would be terribly slow on current hardware
12:36 karolherbst: I guess
12:36 Lightsword: right, only asymmetric crypto scheme that might be potentially secure would be some custom ECC scheme, but those key lengths are really too small for that to be likely
12:37 Lightsword: standard is 256 bit for ECC
12:39 Lightsword: and I think 256 bit ECC produces signatures larger than the key unlike RSA
12:39 Lightsword: although I think some schemes may get as small as 160…but still larger than 128
12:41 Lightsword: karolherbst, any idea where one might look in public for the keys? I was looking around using github search a bit
12:42 karolherbst: dunno
12:43 Lightsword: karolherbst, any idea why nvidia is requiring signed microcode here? is it so that they can disable features on certain models easier?
12:47 karolherbst: no idea if they made a public statement
12:49 Lightsword: so I’d guess it’s probably something like HMAC-MD5 based on the 128 bit keys
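To make that guess concrete: HMAC-MD5 does yield a 128 bit (16 byte) tag, matching the signature size mentioned in the log. The sketch below uses Python's standard `hmac` module with a made-up key and image; whether the hardware uses HMAC-MD5 or anything like it is unknown, so this only illustrates the size and the symmetric-key property discussed above.

```python
import hashlib
import hmac


def tag_firmware(key: bytes, image: bytes) -> bytes:
    """Compute a 16-byte HMAC-MD5 tag over a firmware image."""
    return hmac.new(key, image, hashlib.md5).digest()


def verify_firmware(key: bytes, image: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag_firmware(key, image), tag)


if __name__ == "__main__":
    key = b"hypothetical-key"        # placeholder, not a real key
    image = b"\x00" * 64             # placeholder firmware bytes
    tag = tag_firmware(key, image)
    print(len(tag) * 8, verify_firmware(key, image, tag))
```

Verification uses the same secret as tagging, so anyone who recovers the key can produce valid tags for arbitrary images. That is the forgeability concern raised in the conversation.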
12:53 karolherbst: *sigh* this WebGL cts is killing me
12:53 karolherbst: "out of memory" sure.....
16:54 karolherbst: imirkin: so with a local WebGL run I got 1 timeout, and still around 0.6% failed subtests
16:55 imirkin: and no hangs?
16:55 imirkin: or did you skip the max-texture-size tests?
16:55 imirkin: oh btw - are you running the 1.x or 2.x tests?
16:55 karolherbst: I didn't skip it
16:55 karolherbst: 2.x
16:55 imirkin: ok good
16:55 imirkin: i guess you have more ram than i do
16:56 karolherbst: most of the fails seem to be compiler/API related
16:56 imirkin: or something got fixed somewhere
16:56 karolherbst: yeah... 32GB+ :p
16:56 imirkin: i have 6GB
16:56 imirkin: not sure how much that GPU had... whatever GTX960's typically have
16:56 imirkin: (no longer have that gpu on me)
16:59 karolherbst: we fail the precision tests for some SFU operations
17:00 karolherbst: lowp is fine for cos, but mediump and highp not
17:00 imirkin: perhaps tests are too picky, dunno
17:00 imirkin: i don't think nvidia does any conditioning on those
17:00 imirkin: should check
17:01 karolherbst: fract as well, odd
17:02 karolherbst: something is odd with rgba32ui textures as well
17:02 imirkin: unlikely
17:02 karolherbst: fbocolorbuffer tests
17:02 imirkin: oh wait
17:02 imirkin: we don't clamp to zero
17:02 imirkin: but it's unclear that we're supposed to
17:02 imirkin: it's with some stupid blit of int -> uint
17:02 karolherbst: mhhhh
17:02 karolherbst: I kind of fixed something in that regard, no?
17:02 karolherbst: for the CTS
17:02 imirkin: sounds familiar.
17:03 imirkin: maybe that's what i'm remembering
17:03 karolherbst: maybe it needs a bit more work
17:03 imirkin: for each one you investigate
17:03 imirkin: create a new issue on the nouveau-cts board
17:03 imirkin: s/issue/card
17:04 imirkin: with notes
17:04 karolherbst: now.. how to select subtests :/
17:04 imirkin: you can just edit the html ;)
17:04 karolherbst: true
17:53 l4mRh4X0r[m]: Right, those last messages didn't come through because I wasn't identified
17:54 l4mRh4X0r[m]: Hi! I'm trying to start a game using `DRI_PRIME=1`, but I'm getting the following message: java: ../libdrm-2.4.96/nouveau/pushbuf.c:723: nouveau_pushbuf_data: Assertion `kref' failed.
17:54 l4mRh4X0r[m]: And my dmesg is full of `gr: DATA_ERROR` messages
17:54 l4mRh4X0r[m]: http://sprunge.us/3S2i8E
17:54 l4mRh4X0r[m]: I'm guessing it's got something to do with multiple threads doing GL, but I'm not sure
17:56 l4mRh4X0r[m]: Any idea how to solve this?
18:13 gnarface: l4mRh4X0r[m]: probably not, but any chance you've got any bad capacitors on the board? https://bugs.freedesktop.org/show_bug.cgi?id=95330
18:14 gnarface: my brief google searching returns other complaints about the same error, but at least one of them determined it was related to capacitor failure and fixed it by replacing them....
18:14 gnarface: worth checking, anyway
18:26 karolherbst: imirkin: we fail the dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rgba32ui test, but I am not sure if that's related... the only rgba32ui test we fail on the gles CTS afaik
18:26 karolherbst: seems like I really have to dig through that WebGL stuff and how to properly debug all that
18:33 karolherbst: imirkin: ............. it was using the software stuff, argh
18:37 l4mRh4X0r[m]: gnarface: Well, it's a laptop with Nvidia Optimus, so I doubt it. And even if it were true, I'd rather not void my warranty 🙂
18:38 l4mRh4X0r[m]: Card is a GM107 (NV117) by the way
18:45 l4mRh4X0r[m]: My googling turned up https://bugs.freedesktop.org/show_bug.cgi?id=92438, which is an error I'm also getting sometimes
18:48 gnarface: l4mRh4X0r[m]: eh, it's over my head, sorry.
18:49 gnarface: looks like the best thing you can do is participate in testing if you can reproduce the issue
18:49 l4mRh4X0r[m]: No worries
18:49 gnarface: but i wouldn't hold my breath for a fix
18:50 gnarface: a lot of this stuff simply never will get fixed unless NVidia wants it fixed
18:50 karolherbst: l4mRh4X0r[m]: what game?
18:50 karolherbst: l4mRh4X0r[m]: I have some fixes for that multithreaded stuff and would like to know if it fixes that game as well
18:51 l4mRh4X0r[m]: In this case, a modded Minecraft. The loading screen flickers a lot between two loaders, hence my guess it's a multithreading issue
18:51 l4mRh4X0r[m]: an unmodded minecraft already works fine
18:52 l4mRh4X0r[m]: Sure, how would I go about testing those fixes?
18:52 karolherbst: l4mRh4X0r[m]: compile mesa locally and use that. repository is here: https://github.com/karolherbst/mesa.git branch is named "mt_fixes_take2"
18:58 l4mRh4X0r[m]: karolherbst: would using LD_LIBRARY_PATH do the trick?
18:58 karolherbst: l4mRh4X0r[m]: depends on how you build/install it, but it should
18:58 karolherbst: like if you have your local prefix and use LD_LIBRARY_PATH it should work
19:01 huehner: karolherbst: l4mRh4X0r[m] about modded minecraft, i was getting crashes and imirkin pointed me to your mt_fixes_take2 branch which when used made it perfectly stable for me (using ld_library_path to select it just for that game)
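A rough sketch of the "local prefix plus LD_LIBRARY_PATH" setup mentioned above, applied to a single program only. The install prefix and its `lib` / `lib/dri` layout are assumptions that depend on how mesa was configured; `LIBGL_DRIVERS_PATH` is the Mesa variable for the driver directory and `DRI_PRIME=1` is taken from the original report.

```python
import os


def env_with_local_mesa(prefix, base_env=None):
    """Return an environment that prefers a locally built mesa.

    `prefix` is a hypothetical install prefix; the library directory may be
    lib64 or an arch-specific subdirectory on some distributions.
    """
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.join(prefix, "lib")
    old = env.get("LD_LIBRARY_PATH")
    # Put the local build first so it wins over the system libraries.
    env["LD_LIBRARY_PATH"] = lib_dir if not old else lib_dir + os.pathsep + old
    # The DRI drivers are looked up separately from libGL itself.
    env["LIBGL_DRIVERS_PATH"] = os.path.join(lib_dir, "dri")
    # Render on the secondary (nouveau) GPU on an Optimus laptop.
    env["DRI_PRIME"] = "1"
    return env


if __name__ == "__main__":
    env = env_with_local_mesa("/opt/mesa-mt-fixes")
    print(env["LD_LIBRARY_PATH"])
```

Passing the result as `env=` to `subprocess.run` keeps the override limited to the one game, as huehner describes, instead of changing the system-wide mesa.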
19:01 karolherbst: huehner: thanks, good to know.
19:30 pmoreau: karolherbst: I rebased the clover series on top of the latest master and fixed a few merge conflicts. I’m going to check again that the different build configurations still build properly and then send it to the ML.
19:31 pmoreau: Going to do a bit of reviewing as well, on your NIR in Nouveau series.
19:34 pmoreau: karolherbst: How do you want to proceed regarding Tesla BTW: merge in the current work and fix Tesla afterwards, or fix it before merging?
19:37 l4mRh4X0r[m]: karolherbst: I'm getting `double free or corruption (fasttop)`
19:40 l4mRh4X0r[m]: Or at least, I did the first time. Now I'm getting a truckload of `nouveau 0000:01:00.0: gr: ILLEGAL_CLASS ch 2 [007fb24000 java[25095]] subc 0 class ffff mthd 2384 data 00000000` in my kernel log, and the program hangs
19:41 l4mRh4X0r[m]: And now the double free message twice
19:41 l4mRh4X0r[m]: It's crashing a lot later though, so that's good
19:43 l4mRh4X0r[m]: Now a segfault in `nvc0_screen_fence_update+0x8`
19:45 l4mRh4X0r[m]: Now the double free message once again. So it's already an improvement, but not quite there yet.
19:58 l4mRh4X0r[m]: FWIW, this is what it looks like with intel drivers:
19:59 l4mRh4X0r[m]: https://canarymod.net/~willem/intel.ogv
19:59 l4mRh4X0r[m]: And this is what it looks like with nouveau: https://canarymod.net/~willem/nouveau.ogv
20:00 karolherbst: l4mRh4X0r[m]: it seems like something is terribly wrong on your end if it also causes issues with intel :/
20:00 l4mRh4X0r[m]: I don't see the flashes of another application through the screen by the way, that's just the recording software
20:01 karolherbst: I see
20:05 karolherbst: l4mRh4X0r[m]: are you sure you switch the branch btw?
20:06 l4mRh4X0r[m]: Yes
20:06 l4mRh4X0r[m]: Fun fact: Using i965 with your branch also crashes, but with SIGFPE
20:07 l4mRh4X0r[m]: Not reproducibly though
20:10 karolherbst: maybe something since my latest rebase broke it...
20:12 karolherbst: seems like I get some crashes as well, let me check
20:12 karolherbst: huehner: can you tell me the git commit id of the latest commit on the branch you have locally?
20:14 karolherbst: l4mRh4X0r[m]: I force pushed an older version, care to retry with that one?
20:14 karolherbst: top commit should be 7e0837d6436
20:18 l4mRh4X0r[m]: Alright
20:21 l4mRh4X0r[m]: Yeah, this one works fine
20:22 karolherbst: mhh, I guess it is those tsc changes imirkin pushed recently... so I need to dig into that and see what to do about this one
20:24 l4mRh4X0r[m]: I do have some weird rendering though, but I don't know whether that's due to your branch
20:32 karolherbst: might be some nouveau bug though
21:52 pmoreau: karolherbst: You might want to grab https://github.com/pierremoreau/mesa/commit/6283b5f28422017134850a04c8e4515e6f9edf3d: one gets an undefined reference otherwise when building without SPIRV-Tools and SPIRV-LLVM-Translator.
23:06 huehner: karolherbst: i got cf4b03f35466ef61021afe149772710ff72a3ef6 of your take2 branch here
23:13 karolherbst: huehner: ohh, this is from an even older version
23:39 huehner: karolherbst: cloned that around 19th last month