00:00 AndrewR: mooch2, only as user/tester, sorry (not-a-real-dev)
00:00 mooch2: damn
00:02 AndrewR: mooch2, but wasn't nv4/5 emulation also in the works? Are they too complex, compared to nv3? (i mean, riva 128 sounds a bit too weak...)
00:03 mooch2: i'm the developer of that, actually
00:03 mooch2: and honestly, i'm just trying to preserve computer history
00:03 mooch2: also, nv3 by itself is already a tremendous undertaking
00:04 AndrewR: mooch2, guess i read some of the logs where you tried to make nt4 under modified pcem (86box) to actually display something first..)
00:06 mooch2: also, 86box uses software rendering for EVERYTHING
00:06 mooch2: even voodoo2!
00:06 mwk: mooch2: sounds like another fun project :)
00:07 mooch2: oh hey mwk!
00:07 mooch2: what parts of qemu do you work on?
00:07 mooch2: oh wait derp
00:07 mwk: :p
00:07 mooch2: sorry, i thought this was the qemu irc channel
00:07 mooch2: WHOOPS
00:07 mooch2: but yeah
00:07 mwk: though having said that
00:07 mwk: I did write a qemu module this week
00:07 mwk: emulating a PCI device
00:07 mooch2: oh? for what?
00:08 mwk: but that's a complete accident :p
00:08 mwk: for a university course
00:08 mooch2: ah
00:08 mwk: it's supposed to be a "crypto accelerator", bignum multiplication in hardware
00:10 mooch2: ah neat!
00:10 mwk: learned a bit about crypto that way, too
00:10 mooch2: ah
00:10 mwk: Montgomery multiplication... I've wanted to understand that stuff ever since I REd a piece of Falcon firmware doing it
00:10 mooch2: i guess i'll be the next one to plunge down that rabbit hole lmao
00:11 mwk: which rabbit hole?
00:11 mwk: we've got plenty of those :p
00:11 mooch2: qemu
00:11 mwk: heh
00:11 mwk: good luck
00:11 mooch2: thanks man
00:11 mwk: I wish I could be of more help
00:11 mooch2: say, what else have you been up to recently?
00:12 mooch2: specifically involving nvidia hardware, of course
00:12 mwk: basically nothing
00:12 nyef: I was looking at trying to use qemu for something a while back, but it turned out not to have R10k CPU emulation. /-:
00:12 mwk: I haven't touched my big rig of nv cards since january or so
00:12 mooch2: jeez
00:12 mooch2: you should do more hwtests lol
00:12 mooch2: or send that rig to me :D
00:13 mwk: right now I'm busy trying to get Veles off the ground, do my coursework, and write a MSc all at once
00:14 mwk: well, I'm not leaving my rig
00:14 mwk: but if you want remote access to any part of it, just let me know
00:14 mwk: it's not any good if it's standing unused...
00:14 nyef: HDMI input progress: works properly in Windows if the nvidia driver is installed (which took some .ini hacking to make happen). Sound works (tested with a PS3), but sound from the MCP89 is *dreadful*.
00:32 mooch3: mwk: can you please send me the login details on mooch2? thanks
05:13 dboyan_: skeggsb: Is there plan to add compute support for pascal in mesa?
05:15 skeggsb: it shouldn't be terribly difficult for someone to do
05:16 skeggsb: for the test kernel i launched a while back, i just took gm20x code and modified for the newer QMD format
05:17 dboyan_: iirc there are new local and shared memory window fields
05:19 dboyan_: There is a new LoadInlineQmdData in pascal. Do you make use of them?
05:20 dboyan_: In compute class method
05:20 skeggsb: nope, as i said, i only changed the qmd format to see if it worked
05:20 skeggsb: everything else i left just like gm20x would be
05:20 dboyan_: okay, it doesn't seem hard then
05:21 skeggsb: iirc pascal has fp16 support, which would need reverse-engineering too
05:22 dboyan_: yeah, but I guess its performance is not so great for geforce cards
05:25 dboyan_: Now that I've built drm-next kernel, I think I'll play with my gp107 when I have time.
08:14 pmoreau: dboyan_: fp16 on !(GP100 || GP10B) is indeed crap: it’s half the double throughput, or a 64th of the float one…
08:19 dboyan_: yeah =/
08:20 dboyan_: no idea why they added this. just to showcase that they have native fp16 support?
08:22 pmoreau: I am not sure. Maybe because they had it in the GP100, and wanted to have at least some support on the other Pascal chipsets?
08:23 dboyan_: maybe
08:23 pmoreau: Having talked with some NVIDIA researchers, they were quite surprised that people were expecting native fp16 support for the 1080 and the other cards.
08:44 karolherbst: ....
08:44 karolherbst: yeah, because f16 is so much more important than f32 :p
08:45 airlied: for ai it is
08:46 karolherbst: sure, but you can also just use f32
08:46 karolherbst: what is your disadvantage using f32?
08:46 karolherbst: you can always buy 2 1080 instead of the gp100 one and adjust your code
08:48 karolherbst: they should have left out f16 support completely on non GP100 GPUs, so that software can deal with it. Now it's just a bs situation
08:49 karolherbst: ... well okay, dev systems might need f16 for writing software supporting f16
09:03 dboyan_: skeggsb: Are you sure the open-gpu-doc from nvidia is correct?
09:04 dboyan_: skeggsb: The way to specify blockDim on pascal seems changed, but it isn't on the doc
09:09 dboyan_: not blockDim, gridDim
09:09 pmoreau: karolherbst: Throughput + memory usage
09:11 karolherbst: pmoreau: well sure it is _faster_, but you don't miss anything else. Of course Nvidia wants more money for the GP100, that's obvious. But you can always compensate by buying many cheap ones and do f32
09:14 pmoreau: karolherbst: With native fp16, you can have two registers (A and B), each containing two fp16, and if you do an add on those, the hardware will do C_1 = A_1 + B_1, C_2 = A_2 + B_2, without needing to extract the fp16 and convert it to an fp32, then back to fp16, using more registers
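The packed layout pmoreau describes can be sketched in software. This is only an illustration, assuming two fp16 values share one 32-bit register with the first value in the low 16 bits; the helper names are hypothetical. Note that the emulation below has to unpack and repack each lane, which is exactly the extra work native fp16 hardware avoids.

```python
import struct

def pack_half2(lo, hi):
    """Pack two fp16 values into one 32-bit word (lo in bits 0-15)."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]

def unpack_half2(word):
    """Split a 32-bit word back into its two fp16 lanes."""
    return struct.unpack("<2e", struct.pack("<I", word))

def half2_add(a, b):
    """Lane-wise add: C_1 = A_1 + B_1, C_2 = A_2 + B_2."""
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    # struct's "e" format rounds each sum back to fp16 when repacking.
    return pack_half2(a_lo + b_lo, a_hi + b_hi)
```

For example, `half2_add(pack_half2(1.0, 2.0), pack_half2(0.5, 0.25))` holds the lanes 1.5 and 2.25 in a single 32-bit value.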
09:14 pmoreau: Are any NVIDIA cards cheap? :-D
09:15 karolherbst: well sure, but I was saying there are no reasons besides performance
09:15 karolherbst: but you can also just do f32 arithm for AI as well, it's just slower
09:15 karolherbst: or do I miss anything?
09:15 pmoreau: Using less memory, though it kind of fits in performance as well
09:16 karolherbst: yeah
09:16 karolherbst: but if you buy 2 GPUs, instead of one, you have doubled memory as well
09:17 pmoreau: Yes, but then it depends how your algorithm works: can you split the memory content between both cards, or do you need to replicate it?
09:17 karolherbst: true
09:18 karolherbst: I was just trying to say, that missing good f16 isn't such a big problem. Sure it's a silly one, but none where you would say, that everything is stupid and Nvidia is shit
09:19 pmoreau: I wouldn’t go as far as saying it is stupid and NVIDIA is shit, but I would have loved good fp16 support for my work. Especially since I have to emulate them, as fp32 was too costly in some cases.
09:21 dboyan_: so go buy some tesla p100s ;)
09:21 karolherbst: pmoreau: doing the conversions is cheaper than just doing f32 overall?
09:22 pmoreau: Yes, because you fetch half as much data from memory!
09:23 pmoreau: And the kernel was (and still is) memory-bound.
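The memory-bound argument is simple arithmetic: if runtime is dominated by memory traffic, storing values as fp16 halves the bytes fetched and so roughly halves the time. A minimal sketch, assuming a purely bandwidth-limited kernel and ignoring cache effects; the function names and the bandwidth figure are hypothetical.

```python
def bytes_fetched(n_elements, bytes_per_element):
    """Total bytes read from memory for n_elements values."""
    return n_elements * bytes_per_element

def memory_bound_time(n_elements, bytes_per_element, bandwidth_bytes_per_s):
    """Lower bound on runtime when memory bandwidth is the only limit."""
    return bytes_fetched(n_elements, bytes_per_element) / bandwidth_bytes_per_s

# fp32 uses 4 bytes per value, fp16 uses 2.
n = 1 << 20
bandwidth = 1e9  # hypothetical figure, not a measured value for any card
t_fp32 = memory_bound_time(n, 4, bandwidth)
t_fp16 = memory_bound_time(n, 2, bandwidth)
```

Under these assumptions `t_fp32` is twice `t_fp16`; conversion cost is not modelled.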
09:24 karolherbst: well sure, but we talk here about 50% perf hit, not 90% perf hit or so
09:25 dboyan_: Does pascal have the potential to make y and z dimension of grid 2^32-1 or am I just overlooking something?
09:26 pmoreau: Well, it could be more than 50%, since you use less space in the cache. But I don’t remember how much it did improve those kernels.
09:27 dboyan_: pmoreau: Do you have a gp100 around?
09:27 pmoreau: dboyan_: In the CUDA doc, it still puts a limit of 65535 for Pascal on the y and z dimensions of the grid.
09:27 pmoreau: dboyan_: I don’t.
09:27 dboyan_: okay.
09:28 dboyan_: yeah, I'm seeing it in cuda docs. But GRIDDIM in QMD seems to become 3 words on my gp107, one dim each
09:29 pmoreau: Oh, interesting. Will have to try launching grids with gridDim.y > 64k
09:30 dboyan_: So I'm curious to see what's the case on gp100.
09:32 pmoreau: I could check that on the GP102
09:32 pmoreau: Did you get the info by tracing with MMT a compute application?
09:33 dboyan_: yeah
09:34 dboyan_: pmoreau: Then could you do me an mmt trace of https://gist.github.com/dboyan/ab68999dd7de4a7831d20cfe63441a32
09:35 pmoreau: Yes
09:35 dboyan_: compile with -lGL -lGLEW -lglfw
09:38 pmoreau: dboyan_: https://phabricator.pmoreau.org/F130120 (if I didn’t make any mistakes)
09:39 dboyan_: thanks
09:41 pmoreau: Ah, it has the chipset detection bug
09:42 dboyan_: yeah, I know
09:43 dboyan_: GP102 is using GP104_COMPUTE?! Interesting
09:44 dboyan_: It is basically the same with my trace then
09:48 pmoreau: I am not sure how GXYZ_COMPUTE matches to SM versions, but there are only three SM versions for Pascal: SM_60 for GP100, SM_62 for the GP10B, and SM_61 for everything else.
09:51 dboyan_: well there are "object classes" where c0c0 stands for "Pascal Compute A", c1c0 stands for "Pascal Compute B". We call them GP100_COMPUTE and GP104_COMPUTE in rnndb respectively
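The class naming dboyan_ describes can be written down as a small lookup table. This is a sketch limited to the two classes named in the discussion; the function name is hypothetical, and any other class ID returns None rather than a guess.

```python
# Class ID -> (NVIDIA class name, rnndb name), as stated in the discussion.
PASCAL_COMPUTE_CLASSES = {
    0xC0C0: ("Pascal Compute A", "GP100_COMPUTE"),
    0xC1C0: ("Pascal Compute B", "GP104_COMPUTE"),
}

def describe_compute_class(class_id):
    """Return (class name, rnndb name) or None for unknown class IDs."""
    return PASCAL_COMPUTE_CLASSES.get(class_id)
```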
09:52 dboyan_: both of them occurred in your trace but only the latter is actually used
09:52 karolherbst: so compute b is for !GP100 ?
09:52 dboyan_: I guess so
09:53 dboyan_: but from rnndb names it seems gp102 should use class a
09:53 dboyan_: or we'll have GP102_COMPUTE
09:54 dboyan_: well, I don't know a lot about rnndb naming, might be wrong
09:56 dboyan_: pmoreau: btw, are you sure it is a gp102?
09:58 dboyan_: I just passed -m 132 to demmt, and it is the only reason GP100_COMPUTE shows up there
09:58 dboyan_: If I passed -m 134 instead, GP100_COMPUTE is totally gone. So it was demmt that thought GP100_COMPUTE should be there
10:45 pmoreau: dboyan_: "01:00.0 VGA compatible controller: NVIDIA Corporation GP102 [TITAN X] (rev a1)", yes, I am sure ;-)
10:46 pmoreau: There could be a c2c0 object for GP10B, but we haven’t stumbled upon it yet
17:01 leberus: karolherbst: Hi :) I've sent v3 of the patch series with the fixups you and Mirkin pointed out. This time I've signed them with Signed-off-by
17:01 karolherbst: leberus: will look at them tomorrow, thanks
17:03 leberus: thanks ;)!
17:40 gregory38: imirkin_: Hello. Is DRI3 stable enough on nouveau ?
17:42 gregory38: it seems I hit some kinds of deadlock (or missing event) with glthread
17:42 nyef: Are you sure it's a DRI3 problem and not a threading problem?
17:46 gregory38: I'm sure of nothing
17:46 gregory38: I upgraded to DRI3 yesterday to avoid thread issue with glthread and x11
17:47 gregory38: but dri3 isn't the default
17:47 gregory38: I want to be sure that it doesn't have any known stability issues
17:48 gregory38: but yeah it feels like a deadlock due to threading
17:49 gregory38: https://pastebin.com/3fgcApAg
17:49 gregory38: stacktrace of the 2 threads
17:52 Lyude: karolherbst: have any other stuff with that iccsense thing you want me to try?
17:54 karolherbst: Lyude: not yet, got interesting news from work and need to deal with that first... Need to get my head around what went wrong, but I fear that the i2c bus is also secured...
17:56 karolherbst: ......
17:56 karolherbst: Lyude: found it
17:56 karolherbst: Lyude: guess what: https://github.com/skeggsb/nouveau/blob/master/drm/nouveau/include/nvkm/subdev/bios/iccsense.h#L4
17:57 Lyude: haha
17:57 Lyude: this time I will just do a search and replace with sed :)
17:57 karolherbst: :p
17:57 karolherbst: that's what you get for saving any tiny bit of memory
17:58 Lyude: hehe
17:58 gregory38: hum, maybe my version of xcb is too old, I found at least 2 patches to fix some hangs.
18:21 imirkin_: gregory38: well, mrcooper suggests that dri3 can't work with exa
18:21 imirkin_: we haven't flipped it on by default because it messes up kde, among other things
18:21 Lyude: karolherbst: congrats! we're closer, I think
18:21 Lyude: "power1: N/A (max = 120.00 mW, crit = 140.00 mW)"
18:21 karolherbst: mhh, well
18:22 karolherbst: dmesg please
18:22 Lyude: yep, one sec
18:23 gregory38: imirkin_: can I replace EXA with something else ?
18:23 imirkin_: you can try the modesetting driver
18:23 imirkin_: that will bring its own set of issues, of course
18:25 gregory38: what kind of issues ?
18:25 Lyude: karolherbst: https://people.freedesktop.org/~lyudess/iccsense/dmesg.txt
18:25 imirkin_: gregory38: well, mainly stability
18:26 gregory38: ok
18:26 imirkin_: nouveau is, shall we say, imperfect
18:26 gregory38: at least I will see if I still have the hang/deadlock
18:26 imirkin_: as i think i summed it up recently, xf86-video-nouveau is essentially Bug Free (tm), while the GL driver is Bug Full (tm)
18:26 gregory38: otherwise next step is to upgrade my distribution (I should do it anyway)
18:26 gregory38: lol
18:27 imirkin_: at least as far as crashes, stability, etc goes
18:27 karolherbst: Lyude: that N/A makes me nervous, I think we don't get any data from the sensor actually. Might be something off with how we find the right I2C bus or something
18:27 gregory38: what is modesetting
18:27 gregory38: ?
18:27 gregory38: glamor ?
18:28 gregory38: that being said a 2 years old buggy modesetting driver is maybe not a good idea
18:28 Lyude: karolherbst: here's to hoping…
18:28 imirkin_: gregory38: xf86-video-modesetting is part of the X server starting ... 1.18 iirc
18:28 imirkin_: it's a DDX that uses the KMS api for all the modesetting logic
18:29 imirkin_: and has an optional acceleration component via GLAMOR, which effectively uses EGL (via GBM) to do the drawing
18:29 gregory38: ok
18:31 mooch2: do you guys know of any 0x010000 register on nv4?
18:33 karolherbst: mhhh
18:34 karolherbst: didn't we have a tool to read out i2c devices?
18:34 Teklad: I forgot xlsatoms existed... so I ended up writing my own... but it all turned out okay... mine's nicer. :)
18:35 Teklad: karolherbst: How's the pascal stuff going? :p
18:40 karolherbst: Lyude: okay, I have an idea
18:41 karolherbst: Lyude: load nvidia and run "nvaspyi2c $i" starting with i=0 until you see some read/writes
18:41 karolherbst: for me it is 2
18:41 karolherbst: and then you get lines like "81:00 =R=> 4e07 (status = 0, polls = 0595)"
18:42 Lyude: karolherbst: sure thing, i gotta do a release for the nouveau ddx first to fix some nasty libdrm bug skeggsb found but i'll try that in a minute
18:42 karolherbst: Lyude: you should see a read 81:00 =R=> 7407. The first byte can be different though?
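When sifting through a long nvaspyi2c dump for successful reads, a small parser helps. This sketch assumes the line format quoted above ("81:00 =R=> 4e07 (status = 0, polls = 0595)") and nothing else; the field names are hypothetical labels, since the log does not say what each column means, and any line that does not match (such as the ERR lines) is simply skipped.

```python
import re

# Matches only the read-line shape quoted in the log.
READ_LINE = re.compile(
    r"^\s*([0-9a-fA-F]{2}):([0-9a-fA-F]{2}) =R=> ([0-9a-fA-F]+) "
    r"\(status = (\d+), polls = (\w+)\)"
)

def parse_read_line(line):
    """Return the fields of a read line as strings, or None if no match."""
    m = READ_LINE.match(line)
    if m is None:
        return None
    first, second, data, status, polls = m.groups()
    # Field names are illustrative; values are kept as raw strings.
    return {"first": first, "second": second, "data": data,
            "status": status, "polls": polls}

def successful_reads(lines):
    """Keep only the lines that parse as reads."""
    return [p for p in map(parse_read_line, lines) if p is not None]
```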
18:47 imirkin_: Lyude: make sure you use the script this time ;)
18:47 Lyude: yep!
18:47 Lyude: doing so right now :P
19:03 Lyude: karolherbst: trying to launch nvaspyi2c now and I keep getting "WARN: Can't probe 0000:01:00.0" but I've definitely loaded nvidia and nvidia-drm…
19:14 karolherbst: Lyude: tried as root?
19:14 Lyude: karolherbst: yep
19:14 karolherbst: iomem=relaxed set?
19:17 Lyude: karolherbst: where would I set that?
19:28 karolherbst: Lyude: kernel boot parameter
19:28 Lyude: gotcha
19:29 karolherbst: "some guy" added a security feature, so that iomem regions bound by a driver can't be tinkered with, totally useless if you ask me
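Before retrying the tool it can be worth confirming that the running kernel was actually booted with the parameter. A minimal sketch, assuming a Linux system where the boot command line is readable from /proc/cmdline; the function name is hypothetical.

```python
def has_boot_param(cmdline, param):
    """True if param appears as a whole word in a kernel command line."""
    return param in cmdline.split()

def running_kernel_has(param, path="/proc/cmdline"):
    """Check the command line of the currently running kernel."""
    with open(path) as f:
        return has_boot_param(f.read(), param)
```

`running_kernel_has("iomem=relaxed")` then reports whether a reboot with the new parameter is still needed.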
19:31 Lyude: there we go, lemme figure out which port it is then
19:31 karolherbst: I think mupuf already scolded him :p
19:33 Lyude: karolherbst: btw, do you have any idea where that load faking tool is? figure I should turn that on since I don't have any GUI loaded on this machine
19:34 karolherbst: doesn't matter for the power sensor readings
19:34 Lyude: ah, okay
19:34 karolherbst: nvidia polls for that every second
19:34 karolherbst: more often if there is load
19:36 Lyude: karolherbst: I'm not seeing anything on any of the ports, we're sure I don't need a display or anything like that for this to work?
19:36 Lyude: i have a feeling that is also a bad sign
19:37 karolherbst: mhh
19:37 karolherbst: you might want to start X
19:37 Lyude: alright
19:39 karolherbst: okay, we might parse the I2C table wrong though. But we have docs
19:40 karolherbst: ohh, since maxwell there is a new version of that table anyway
19:40 Lyude: there we go
19:40 Lyude: got something spitting out tons of messages, seems to be 2
19:41 karolherbst: k
19:41 karolherbst: could you give me a dump of it?
19:44 Lyude: karolherbst: https://people.freedesktop.org/~lyudess/iccsense/nvaspyi2c-2.txt
19:45 karolherbst: mhh
19:45 karolherbst: always ERR
19:48 karolherbst: ohhh mhh okay
19:49 karolherbst: Lyude: could you relaunch nvaspyi2c until rows without "ERR" are printed?
19:52 Lyude: alright, had it launched for a few seconds now with | grep --invert-match ERR and nothing seems to have come up yet
19:52 karolherbst: :/ okay
19:52 karolherbst: mhh, I will adjust nvbios for now anyway, cause the table layout is different and parsed wrongly. But nouveau has the required changes
19:52 Lyude: this output comes much faster than once per second btw
19:52 karolherbst: yeah, because there are errors
20:01 karolherbst: Lyude: nvidia-smi prints the power consumption, right?
20:03 Lyude: karolherbst: is that one of the utilities with the blob?
20:03 karolherbst: yes
20:07 Lyude: karolherbst: I can't see any cli utility called that, but I can see the power stuff getting printed in the control panel for the binary blob
20:08 Lyude: *find any
20:08 karolherbst: mhh
20:14 Lyude: btw karolherbst, where in the rnndb can I find the registers for clockgating? Beginning my work on that right now with a GK110
20:15 Lyude: *GF110
20:15 karolherbst: it's in the comments of the card
20:16 karolherbst: 20200 - 2025c
20:16 Lyude: oh, so we don't actually have them documented
20:16 karolherbst: we should
20:16 RSpliet: Lyude: they have had various names over the past
20:17 RSpliet: oh wait, which clock gating are you talking about?
20:17 Lyude: Implementing it for the GF110
20:17 RSpliet: the very fine grain "BLCG" ones?
20:17 karolherbst: Lyude: lookup -c0 20200
20:17 karolherbst: Lyude: lookup -ac0 20200
20:17 RSpliet: or the thermal protection ones?
20:17 karolherbst: RSpliet: neither
20:18 karolherbst: CG_CTRL
20:18 Lyude: lookup -ac0?
20:18 karolherbst: -a is for chipset selection
20:18 karolherbst: ohh
20:18 karolherbst: lookup is a command from rnn
20:18 Lyude: I think I'm about to learn about an envytool I haven't touched yet
20:19 karolherbst: lookup is super useful :D
20:19 karolherbst: lookup -a $chipset $reg $value
20:21 Lyude: oooh, yes that is extremely useful
20:23 Lyude: so I'm guessing I'm going to need to figure out which clockgate control register goes to which things
20:26 RSpliet: karolherbst: so the thermal protection ones it is
20:26 Lyude: also curious: why is there an executable and a python script for each rnn util?
20:26 RSpliet: https://github.com/kfractal/envytools/commit/763944016ce33b8f325d5cbef80adea573991ab3#diff-4f78f1169bad0f11dba7f3c3db10ea48R15
20:27 karolherbst: RSpliet: it has nothing to do with thermal protection
20:27 karolherbst: it enables clock gating
20:27 karolherbst: Lyude: well, it's all reverse engineered already though, well mostly
20:27 RSpliet: karolherbst: the use of that mechanism was always "when the shit hits the fan, slash the clock"
20:28 karolherbst: Lyude: "switching on bit 1 on 20200 - 2025c seems to do the trick without any issues. "
20:28 Lyude: I see, and all of the registers have a setting to just have the GPU turn clockgating on/off as necessary
20:28 karolherbst: RSpliet: no
20:28 karolherbst: Lyude: no
20:28 karolherbst: Lyude: you set them to "AUTO" mode ;)
20:28 Lyude: sorry, that's what I meant :P
20:28 karolherbst: and then the engines clock gate themselves
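The register range mentioned above (20200 - 2025c) can be enumerated to see how many control registers are involved. This is a sketch assuming 32-bit registers at a 4-byte stride, which the log does not state explicitly; the function name is hypothetical and no register contents or bit meanings are implied.

```python
def cg_ctrl_registers(start=0x20200, end=0x2025C, stride=4):
    """List register offsets from start to end inclusive."""
    return list(range(start, end + 1, stride))

regs = cg_ctrl_registers()
# Print the offsets in the same hex form used in the discussion.
print(len(regs), [f"{r:05x}" for r in regs[:3]], "...", f"{regs[-1]:05x}")
```

Under the 4-byte-stride assumption this gives 24 registers, spanning 0x60 bytes starting at 0x20200.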
20:29 karolherbst: on fermi you will see tons of writes into the regs
20:29 karolherbst: but afaik those are unneeded?
20:29 karolherbst: no clue
20:29 RSpliet: karolherbst: mupuf and I looked at this like 8 years ago... Things might have changed in the meanwhile but the fact that NVIDIA labels them in the scope of "THERM" should at least ring a bell
20:29 karolherbst: RSpliet: well, they reduce power consumption
20:29 karolherbst: on idle
20:29 RSpliet: and then there's this stuff https://android.googlesource.com/kernel/tegra/+/android-tegra-3.10/drivers/gpu/nvgpu/gk20a/gk20a_gating_reglist.c
20:29 Lyude: karolherbst: it always makes me chuckle how nvidia's driver seems to do a lot of things we eventually find out don't have much use (as far as we know anyway)
20:29 RSpliet: which does clock gating on a fine grain <- and is very necessary to control the gating parameters
20:29 karolherbst: RSpliet: well of course they are therm related, cause clock gating reduces power consumption and also heat
20:29 karolherbst: but it's totally unrelated to therm shutdown or anything like it
20:30 karolherbst: no, the parameters don't matter as well
20:30 karolherbst: well you can tweak those
20:30 karolherbst: but this is more like "after x time, auto shutdown the clock signal"
20:30 Lyude: https://android.googlesource.com/kernel/tegra/+/android-tegra-3.10/drivers/gpu/nvgpu/gk20a/gk20a_gating_reglist.c#298 I like how none of those are actually static inlined functions
20:30 karolherbst: that's all
20:31 RSpliet: if they're set to the disable value, the "block" is not gated. Fat chance they boot in their disable value
20:31 karolherbst: Lyude: I have a "nvapoke 0x20200 0x60 27722455" in my nouveau loading script :D
20:31 Lyude: karolherbst: lol
20:31 karolherbst: yes
20:31 karolherbst: no issue so far
20:31 Lyude: anyway, this definitely looks very easy
20:31 karolherbst: yes
20:31 karolherbst: but on fermi some crazy stuff is going on
20:31 karolherbst: you may see it in the traces
20:31 karolherbst: tons of reads/writes
20:31 karolherbst: no idea why
20:32 karolherbst: maybe on fermi it needs to be disabled in certain situations?
20:32 Lyude: only one way to find out :)
20:32 karolherbst: well
20:32 karolherbst: it's easier to figure out you do something right if you can read out the power consumption
20:32 karolherbst: can you on your fermi?
20:32 Lyude: lemme see
20:33 RSpliet: mind you, from the top of my head there was a hw-bug in pre-fermi HW that led some engine to lose its memory when BLCG was enabled. I think it was PMU, and can't recall the conditions
20:33 karolherbst: yes
20:33 karolherbst: that's why there is another card for pre fermi
20:33 RSpliet: but you'd see a ton of disable-reenable around a perflvl change
20:33 karolherbst: yeah
20:33 karolherbst: something like this
20:33 karolherbst: but only on fermi
20:33 karolherbst: not kepler
20:34 karolherbst: but this isn't about block gating
20:34 karolherbst: but simply the clock signal
20:34 Lyude: karolherbst: doesn't look like we have anything for reading voltage, or at least it's not working on this GPU. I'm down to adding support for this if it just happens to be not implemented in nouveau
20:34 karolherbst: it's implemented
20:34 karolherbst: maybe your fermi GPU is just super low end
20:35 Lyude: GTX570 here GF110
20:35 karolherbst: mhh
20:35 karolherbst: kernel version?
20:35 Lyude: Linux LyudeCowCube 4.11.0-rc7Lyude-Test+ #1 SMP Fri Apr 21 15:55:39 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
20:35 karolherbst: huh
20:35 karolherbst: odd
20:35 karolherbst: sensors should print the voltage on this
20:36 karolherbst: except there is no volting
20:36 karolherbst: which would be super odd for a fermi one
20:36 karolherbst: vbios please?
20:36 Lyude: sure thing
20:36 karolherbst: I am sure it's supported, because I've added all that stuff :D
20:38 Lyude: karolherbst: https://people.freedesktop.org/~lyudess/nouveau/vbios/gf110.rom.xz
20:38 karolherbst: thx
20:39 RSpliet: Lyude: https://groups.google.com/a/chromium.org/forum/#!msg/chromium-os-reviews/FVQxrQjPtCw/ZzVI7zAJBQAJ
20:39 karolherbst: yay, power sensors
20:39 RSpliet: that patch implements the whole thing for Tegra K1. It loads the values (that I pointed to earlier, hard-coded in the driver), then enables the right bits
20:40 karolherbst: "patch"
20:40 RSpliet: have a better description?
20:40 karolherbst: ohh, the patch is above
20:40 karolherbst: you linked a comment
20:40 RSpliet: Not deliberately
20:41 karolherbst: but there is more inside the patch
20:41 karolherbst: I would focus on the 20200+ regs first
20:41 karolherbst: because this is very easy
20:41 karolherbst: and has a noticeable effect already
20:42 karolherbst: Lyude: dmesg on your fermi? I think I know what issue this is about
20:42 RSpliet: the 20200+ regs *depend* on the values in the BLCG/ELCG registers. They control parameters of the same mechanism
20:42 karolherbst: it works without touching the other regs on my GPU
20:43 karolherbst: and any one else I tried it on
20:43 RSpliet: "it works" doesn't mean it's the whole story
20:43 Lyude: karolherbst: oh
20:43 Lyude: no, I know the problem
20:43 karolherbst: I never said it would be
20:43 Lyude: i forgot to load nouveau
20:44 Lyude: OK, everything works
20:45 karolherbst: also power readings?
20:45 Lyude: gimme a sec, I am going to get hastebin set up on here cause I'm tired of fpaste being broken from RH's infrastructure dying
20:47 karolherbst: RSpliet: I think those are actually setup by some vbios scripts
20:49 karolherbst: or maybe not? dunno, anyway, the clock gating things are enabled by default on any GPU I tested this on
20:50 RSpliet: would be good to verify before someone gets to submitting patches...
20:51 Lyude: karolherbst: https://hastebin.com/ofocinibaq.swift
20:51 karolherbst: Lyude: what kind of monster sensors have you on your board :D
20:51 karolherbst: I want that too
20:52 karolherbst: "ERROR: Can't get value of subfeature in0_min: Can't read" this is well, I know that issue
20:52 karolherbst: hum
20:53 Lyude: karolherbst: hehe. sensors-detect --auto usually gets everything like that. That's what added all the extra sensors on my desktop's board: https://hastebin.com/enuqeqibuq.txt
20:54 karolherbst: yeah, maybe I configured it in a stupid way or so, dunno
20:54 karolherbst: anyway, can you grab me your dmesg?
20:54 karolherbst: I think something is messed up with the sensors stuff again
20:54 Lyude: whoops, right
20:54 karolherbst: because you have power sensors for sure
20:54 karolherbst: 3 to be correct
20:55 Lyude: before I do that: do you want me to turn on debugging in nouveau
20:55 karolherbst: yeah, good idea
20:55 karolherbst: mhh
20:55 karolherbst: best with this:
20:55 karolherbst: nouveau.debug=iccsense=trace
20:57 karolherbst: Lyude: another thing, do you have "i2c-X" files within "/dev"?
20:58 Lyude: Not in dev, but I do in /sys/class/i2c-adapter
20:59 karolherbst: mhh
20:59 karolherbst: I don't have that one
20:59 Lyude: also, is nouveau safe to reload?
20:59 karolherbst: yeah
21:00 karolherbst: I guess you have "i2c_algo_bit" loaded as a module or builtin ?
21:00 Lyude: as a module
21:00 karolherbst: but without that, something else would go bad already, so you should have it
21:01 karolherbst: most likely I just implemented it wrongly and made a terrible mistake. It works on my GPU though
21:02 Lyude: nice, thank you guys for actually making unload/reload work
21:02 Lyude: radeon is much less kind with that
21:05 Lyude: karolherbst: https://hastebin.com/tamiteviya.go
21:05 karolherbst: ohh, I think I see the problem
21:08 karolherbst: Lyude: https://github.com/karolherbst/nouveau/blob/master_4.10/drm/nouveau/nvkm/subdev/iccsense/base.c#L247
21:08 karolherbst: remove that check against mode
21:08 karolherbst: I need to reverse engineer this one byte still
21:08 karolherbst: it's 0xff, 0xfc and 0xfd for your sensors
21:09 karolherbst: so it fails
21:09 karolherbst: the check is correct on vbios with the table version 0x20, but yours is 0x10 and I didn't give it enough of my time to figure that one out
21:15 karolherbst: also I would feel totally incompetent, if this won't work with that check removed :O
21:25 Lyude: karolherbst: still doesn't work :P, but I can just debug this if you want..
21:25 karolherbst: yeah, dmesg would be nice
21:27 karolherbst: mupuf: can you plug your highest end Fermi GPUs? Thanks
21:28 Lyude: karolherbst: https://hastebin.com/heyalahexe.go ignore the messages prefixed with "Lyude:", that is just me making sure that my changes to the module were actually getting loaded
21:29 karolherbst: mhhh
21:29 karolherbst: the iccsense stuff went through
21:30 Lyude: are we sure that error in sensors isn't just from the lm_sensors lib having some funky config file somewhere?
21:30 karolherbst: check inside /sys/class/hwmon/
21:31 karolherbst: there should be a file named "power1_input" in one of the drivers
21:31 karolherbst: maybe
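Looking for the file by hand across every hwmon device gets tedious, so the check can be scripted. A minimal sketch, assuming the standard hwmon sysfs layout where each device directory under /sys/class/hwmon has a "name" file; the function name is hypothetical. It only tests for the attribute's existence, since reading a value can fail with an error such as "no such device".

```python
from pathlib import Path

def find_hwmon_attr(attr, root="/sys/class/hwmon"):
    """Return (hwmon dir, driver name) pairs for devices exposing attr."""
    hits = []
    base = Path(root)
    if not base.is_dir():
        return hits
    for dev in sorted(base.iterdir()):
        if (dev / attr).exists():
            name_file = dev / "name"
            # Fall back to "?" when the device has no readable name file.
            name = name_file.read_text().strip() if name_file.is_file() else "?"
            hits.append((dev.name, name))
    return hits
```

`find_hwmon_attr("power1_input")` returns an empty list when no device exposes a power reading.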
21:31 karolherbst: Lyude: https://github.com/karolherbst/nouveau/blob/master_4.10/drm/nouveau/nouveau_hwmon.c#L763
21:31 Lyude: "no such device"
21:31 karolherbst: okay
21:31 Lyude: error when I try to cat in0_min
21:31 karolherbst: but the file is there
21:32 karolherbst: ohh
21:32 mupuf: Yes, will do in an hour
21:32 karolherbst: Lyude: that's hwmon being silly
21:32 karolherbst: Lyude: in0_min isn't set right on a few fermi cards
21:32 karolherbst: I have a patch to fix that
21:32 karolherbst: Lyude: https://github.com/karolherbst/nouveau/commit/bb9ac9192c03132d7b8c1e40ec61aab0e1a252cd
21:33 karolherbst: Lyude: with that patch, the in0_min file shouldn't return an error anymore
21:34 karolherbst: Lyude: and can you check iccsense->data_valid in the nouveau_hwmon file and ignore it for now?
21:34 karolherbst: maybe we made too many changes and something got broken
21:34 karolherbst: mupuf: thanks
21:35 Lyude: karolherbst: didn't need to do anything other than just apply that patch :)
21:35 Lyude: karolherbst: https://hastebin.com/isupegahuj.swift
21:35 karolherbst: :)
21:36 karolherbst: I meant for the power sensor stuff
21:36 karolherbst: the other things
21:36 Lyude: oh, didn't realize some things were still not coming up in sensors
21:36 karolherbst: yeah, power consumption
21:36 karolherbst: I am sure iccsense isn't NULL
21:36 karolherbst: so
21:36 karolherbst: data_valid is most likely false
21:36 karolherbst: no idea why
21:36 Lyude: ah, yeah I don't have any of the changes from the stuff we were doing last night for iccsense->data_valid
21:37 karolherbst: ohh, mhh
21:37 karolherbst: odd
21:37 karolherbst: can you just remove the if check then in hwmon and unconditionally add the power attgroup?
21:38 karolherbst: so remove "if (iccsense && iccsense->data_valid && !list_empty(&iccsense->rails)) {" and the closing } ;)
21:39 Lyude: karolherbst: I think this might go a little bit faster if you just handed me a diff :P
21:39 karolherbst: true
21:40 karolherbst: https://gist.github.com/karolherbst/8ec7a4c0f01d9459a525a0648bf606d0
21:42 Lyude: karolherbst: got power1 now, but it's 0.00W
21:43 karolherbst: okay
21:45 karolherbst: I will debug that on mupufs GPU then, because his card seems to be pretty much the same regarding this
21:46 karolherbst: I am sure there is a super silly mistake somewhere, will figure that out then
21:46 Lyude: karolherbst: alright, again you're always welcome to ssh access on one of my machines as well if you need it
21:46 karolherbst: mhhh
21:46 karolherbst: okay, I would do the same with mupufs machine :D but yeah, that would be helpful if you can grant me access
21:47 karolherbst: then I can do it now :D
21:47 Lyude: sure thing, can you give me an ssh key?
21:53 karolherbst: Lyude: https://gist.github.com/karolherbst/795df36833641b2f9eb1245fdef8f6e7
21:53 karolherbst: I should rather post the raw urls ...
21:53 karolherbst: https://gist.githubusercontent.com/karolherbst/795df36833641b2f9eb1245fdef8f6e7/raw/5c385a55175497776b85eb0c810047ff3e3eb199/gistfile1.txt
21:58 Lyude: alright, writing a script to add users to speed this up in the future
21:59 karolherbst: I wrote something like that for work
21:59 karolherbst: on push into git repository deployment of all pub keys on all servers :3
22:02 Lyude: karolherbst: can you try logging into karolherbst@lyude.net port 50010?
22:03 karolherbst: works, thanks
22:03 karolherbst: I like that kernel is tainted message :D
22:04 karolherbst: I will simply use my out of tree module
22:04 Lyude: np, do whatever you want with the machine.
22:05 karolherbst: mhhh
22:05 karolherbst: permission denied
22:05 Lyude: sudo not working?
22:05 karolherbst: is there something odd with the kernel build scripts?
22:05 karolherbst: it does
22:06 karolherbst: but why do I need to be root to build a kernel module
22:06 xen: hi everyone
22:06 imirkin_: Lyude: you never appear to have completed the release
22:06 Lyude: imirkin_: hm? I ran the script and sent out the email
22:06 imirkin_: (i don't see an email)
22:06 Lyude: oh, the nouveau one was cause I wasn't on the mailing list. unfortunately red hat's email is down so I had to use my gmail
22:06 imirkin_: are you not subscribed to nouveau@?
22:06 imirkin_: ah ok
22:07 Lyude: yeah, my other email is properly subscribed to things
22:07 xen: a bit unrelated, but is it possible to do 1080p transcoding with ffmpeg with an odroid c2 (used as dlna server) with proper quality/framerate?
22:07 imirkin_: xen: wrong channel?
22:07 Lyude: karolherbst: are you trying to use kernel sources already on that machine?
22:07 xen: imirkin_: oops :\
22:07 Lyude: i wasn't aware there were any other than the ones in my /home
22:09 karolherbst: Lyude: no, but I try to build nouveau out of tree
22:09 Lyude: are you building on that machine or your own machine?
22:09 karolherbst: on that one, why?
22:10 Lyude: i was just wondering
22:10 karolherbst: on mine it works without being root, that's why I was wondering
22:10 Lyude: ahhh, yeah that is strange. I'm not sure
22:10 Lyude: one of my scripts for loading kernels probably changed some permission somewhere at some point in time
22:11 karolherbst: noooo, my editor isn't installed :(
22:11 Lyude: karolherbst: dnf install whatever you wantr
22:11 Lyude: *want
22:12 karolherbst: k, thanks
22:12 Lyude: in the mean time, is there a GPU this sensors stuff works on that I should start off with for clockgating instead?
22:13 karolherbst: I am sure it works on most kepler GPUs
22:13 karolherbst: and maxwell
22:13 Lyude: cool, I'll go pull out one of my keplers then and see if I can get started
22:16 karolherbst: uhm, is there ccache installed?
22:16 karolherbst: indeed it is
22:21 karolherbst: Lyude: what kind of kernel do you have installed on that machine? some super drm-next thing?
22:21 karolherbst: .gamma_set changed or so
22:22 karolherbst: https://cgit.freedesktop.org/~airlied/linux/commit/drivers/gpu/drm/nouveau?h=drm-next&id=6d124ff845334bc466f56c059147e7ad587c2e7e
22:27 karolherbst: Lyude: "power1: 28.09 W " :3
22:27 karolherbst: after the nvapoke: "power1: 26.91 W "
22:27 karolherbst: more than 1W \o/
22:29 karolherbst: https://gist.githubusercontent.com/karolherbst/97e1b08f49109f85d8be8e061d683243/raw/77ade36d1e77a8a22532c03e14dab30be0a3a2c3/gistfile1.txt
22:32 karolherbst: Lyude: this patch was enough to get it working: https://github.com/karolherbst/nouveau/commit/47d083659e84de5610a5d3d151f9d4eef7ba6902
22:32 karolherbst: mhh
22:34 karolherbst: RSpliet: I remember now, mupuf thought it didn't work, because the power sensors weren't set up correctly and the power readings were _super_ inaccurate, which they aren't anymore under nouveau now
22:35 karolherbst: so I can pretty much notice a change of 0.1W with high confidence
22:35 karolherbst: and before that a change of 0.5W couldn't be detected really
22:36 karolherbst: Lyude: so you can use my module for testing if you want now. Maybe something with the module deployment went wrong or anything like this? dunno
22:36 RSpliet: mupuf and I discussed this stuff when he did his measurements with a multimeter or sth
22:37 RSpliet: many many hairs ago
22:37 karolherbst: well, it's a mess on pre fermi hardware, yes
22:38 RSpliet: Not more so than post-fermi... just nobody bothered implementing it proper yet
22:38 karolherbst: fermi is fine as well
22:38 karolherbst: I just tested it (tm) and it just works (tm)
22:38 RSpliet: not this again... there's fine, and there's upstreamable
22:39 karolherbst: true
22:39 Teklad: I didn't realize how few people are actually committing to the nouveau driver.
22:39 karolherbst: but on kepler nvidia basically does fire and forget
22:39 Teklad: The number is pretty darn low, lol.
22:39 karolherbst: Teklad: what did you expect? 100 monkeys :D
22:39 Teklad: karolherbst: At least 50.
22:39 RSpliet: HAH!
22:39 karolherbst: :O
22:39 karolherbst: 50?
22:39 RSpliet: well, over the course of 10 years maybe just
22:40 Teklad: I'm gonna fork and start poking at pascal.
22:40 karolherbst: yeah
22:40 RSpliet: but that includes fuzzer one-liners
22:40 karolherbst: at most
22:40 RSpliet: and drm maintainers doing fix-ups
22:40 RSpliet: and that kind of stuff :-P
22:40 karolherbst: Teklad: why "fork"? or do you mean github fork?
22:40 Teklad: Yes... github fork
22:40 Teklad: I do plan to make pull requests.
22:40 karolherbst: Teklad: also, forget about pascal, you won't be able to do anything regarding power management
22:40 karolherbst: we don't do pull requests
22:40 karolherbst: we have a mailing list
22:40 Teklad: Oh?
22:41 karolherbst: Teklad: not because of missing skills, just that the registers are write protected
22:41 karolherbst: and you need signed PMU images to actually do anything
22:41 karolherbst: we can't even set the voltage
22:41 karolherbst: -> no reclocking
22:41 Teklad: karolherbst: So initially most of the work is going to be getting the signed thingies, I assume.
22:41 karolherbst: so, forget about pascal, it's a waste of time for now (tm)
22:41 karolherbst: :D
22:41 karolherbst: well
22:41 karolherbst: kind of
22:42 karolherbst: nvidia needs to release them
22:42 karolherbst: we are still waiting for the maxwell ones
22:42 Teklad: karolherbst: Usually if you poke them with a big enough stick they'll respond.
22:42 karolherbst: try it
22:42 Lyude: Teklad: Red Hat's been poking them with large sticks for basically the entire lifetime of nouveau
22:42 karolherbst: we are kind of upset about the situation already
22:43 Lyude: hasn't really helped much :\
22:43 RSpliet: Teklad: the last guy we poked resigned a few weeks ago
22:43 karolherbst: well, he poked inside nvidia
22:43 Teklad: Lol... so employees that ask no-no questions for us get a forceful resignation.
22:43 Teklad: Sounds nice.
22:43 karolherbst: it wasn't like that :D
22:44 RSpliet: Teklad: no, it wasn't forceful
22:44 RSpliet: he decided to pursue something else
22:44 Lyude: yeah, I am pretty sure that the general consensus across people at nvidia working on their blob is that they want to just use nouveau
22:44 Lyude: The problem is upper management
22:44 Teklad: Upper management is always an issue... regardless of the company.
22:45 Lyude: Well yes, but almost everyone else has budged at this point.
22:45 Teklad: Lyude: But upper management typically has like 0 developer experience... so they don't like what they don't understand.
22:45 Lyude: oh yeah i'm not denying that
22:46 Lyude: my point is more so, most other companies have budged. some took a while, but basically all of them have other than like, nvidia, broadcom and qualcomm
22:46 Lyude: i just stopped buying nvidia hardware for my own machines
22:46 Teklad: I'm sure nouveau's code base looks better than that 20 year old blob of death.
22:46 Lyude: I can get it through Red Hat since I'm pretty interested in making nouveau work
22:47 Teklad:doesn't like proprietary nonsense on his machines.
22:47 Lyude: but I'm also hopeful AMD will eventually get bigger in the GPU market again and maybe, just maybe, someday dethrone nvidia
22:47 Teklad: I forgot to read up on Nvidia and their latest evil before buying my GTX 1060.
22:47 Lyude: putting them in a position where they might actually have to become open to compete with them, since AMD has opened up most of their stuff
22:47 Teklad: So now I'm stuck in no-man's land.
22:48 Lyude: i still would like to get something for reclocking on pascal, firmware or no firmware, if not just to get us to the point where we can confidently say "the only reason we can't reclock your GPU is because nvidia doesn't want us to"
22:48 RSpliet: Lyude: Broadcom actually has Eric Anholt on board for OSS support
22:48 Teklad: Lyude: You're talking possible years of stalled nouveau development on newer hardware because nvidia is being stubborn.
22:48 Lyude: RSpliet: damn, I didn't hear about that
22:48 karolherbst: Lyude: well, we are already at that point :p
22:49 Lyude: yeah
22:49 RSpliet: Lyude: a long time already... for vc4 (raspberry pi & co)
22:49 Lyude: nice
22:49 Lyude: so that just makes nvidia and qualcomm then
22:49 RSpliet: In the meanwhile, you guys (Rob Clark) are taking care of qualcomm quite well
22:49 RSpliet: Qualcomm doesn't seem too hostile, and has contributed docs & patches
22:50 RSpliet: so that leaves ARM and Imagination as the worst guys :-P
22:50 Lyude: RSpliet: they even recommend our driver to customers according to rob
22:50 Teklad: Nvidia is one of those monolithic black masses from the old days of computing... they're still catching up with modern practices.
22:50 Lyude: for anyone who wants to use a standard linux distro and not blobs
22:50 Lyude: i still want to work on getting something working for pascal, even if it's not anything we could hope to enable out of the box due to lack of missing firmware
22:51 Lyude: *lack of firmware
22:51 karolherbst: well, we could always try to find a way to get our own firmware working
22:51 karolherbst: *cough*
22:51 Lyude: karolherbst: OK, so you WERE actually suggesting that
22:51 RSpliet: Although I'm not sure whether Imgtec should be considered "hostile" or just unable to attract developers/their interest
22:51 Teklad: I'm game for that.
22:51 Lyude: i was skeptical at first cause that sounds like one hell of a job
22:51 Lyude: but yeah i'm game for it too
22:52 karolherbst: Lyude: what else can I suggest at that point? waiting? hell no
22:52 Lyude: karolherbst: damn straight, that's the right attitude :)
22:52 Teklad: karolherbst: There's a mild to moderate possibility I'll die from old age if we wait on Nvidia. :p
22:52 karolherbst: we waited long enough and we are pissed
22:52 karolherbst: we could have working reclocking on maxwell2 gpus
22:52 Lyude: if nvidia ever complains about their encryption keys being broken
22:52 karolherbst: but we can't control the fans
22:52 Lyude: we'll tell them "you COULD have just given us a firmware blob"
22:52 karolherbst: which is just pure utter bs
22:52 karolherbst: :D
22:53 RSpliet: Lyude: not sure if lawyers agree with that line of reasoning
22:53 RSpliet: but #yolo
22:53 karolherbst: :3
22:53 karolherbst: we just do it under a new nick
22:53 Lyude: i'm not sure where the legality of that stands, but it's not like this hasn't been done before
22:53 karolherbst: "
22:53 karolherbst: @shadow_nouveau_brokers
22:53 Lyude: bad encryption practices is why we can jailbreak older PS3s
22:53 Teklad: karolthirstyturtle@yahoo.com yo
22:53 Lyude: haha
22:54 RSpliet:taunts HACK THE PLANET!!!
22:54 karolherbst: not _my_ fault if their private keys "just happen" to appear on a pastebin
22:54 Teklad:mind suddenly thought of Captain Planet
22:54 RSpliet: But seriously, I think it's a pretty standard AESque pubkey/privkey signing scheme
22:54 karolherbst: yes
22:54 karolherbst: it is
22:54 karolherbst: 128bit
22:54 RSpliet: or, that's what I recall mwk saying before
22:55 mupuf: karolherbst: would the GTX 470 work?
22:55 karolherbst: it's 128 bit for sure
22:55 mupuf: I have an nvce too
22:55 karolherbst: mupuf: I am already done, thanks :p
22:55 Teklad: So how do you go about getting that? Exploring the nvidia blob patiently?
22:55 mupuf: lol
22:55 karolherbst: but I need to RE a bit anyway, but oh well
22:55 Lyude: what if we cracked the encryption key with all of the GPUs we can get reclocking working on
22:55 karolherbst: not really needed
22:55 RSpliet: karolherbst: how many hours of GTX1080 does that take to brute-force? I'd love the irony of a GPU cracking its own privkey
22:55 Lyude: that would be really nice irony
22:56 karolherbst: RSpliet: 128 bit AES key, do the math
22:56 RSpliet: Lyude: great minds...
22:56 karolherbst: RSpliet: keep in mind: you need to upload the image to the PMU for each try
22:56 karolherbst: and start it
22:56 karolherbst: and check if the HS mode was entered
22:56 Teklad: RSpliet: Gonna put them GPU processing cores to work on itself? :p
22:56 Lyude: karolherbst: btw, lemme know when you're done with that machine
22:56 karolherbst: so around 4,5 seconds per try?
22:56 karolherbst: Lyude: I am done. I think something went wrong in your deployment, because it just worked for me (tm)
22:56 RSpliet: karolherbst: there's no pubkey certificate?
22:56 Lyude: karolherbst: that is not unlikely at all
22:57 karolherbst: Lyude: https://github.com/karolherbst/nouveau/commits/fermi_iccsense
22:57 karolherbst: this is the tree I used
22:57 karolherbst: it's compatible with the kernel running on that machine
22:57 Lyude: i may just start using the out of tree one
22:57 RSpliet: Oh... hw
22:57 RSpliet: No... the signature is uploaded, we need a privkey that generates the same sig for the same firmware
22:57 karolherbst: Lyude: check the nvapoke commands and power lines: https://gist.github.com/karolherbst/97e1b08f49109f85d8be8e061d683243
22:57 Lyude: oh, then that's easy
22:57 RSpliet: Surely that doesn't require to run it on PMU
22:58 karolherbst: mhhh
22:58 karolherbst: RSpliet: no, won't work
22:58 karolherbst: the PMU locks down on failed verification
22:58 karolherbst: ohh wait
22:58 karolherbst: okay
22:58 karolherbst: I see what you meant
22:58 karolherbst: true
22:58 karolherbst: we could do that
22:59 Lyude: if we were bruteforcing a key though ideally I'd think we'd want to do that without uploading each attempt to the pmu
22:59 karolherbst: I know which parts are relevant for the signature generation, because not the entire image is important for that
22:59 karolherbst: only the actual signed pages
22:59 karolherbst: Lyude: nooo, RSpliet has the good idea
22:59 karolherbst: Lyude: we have the image, we have the signature
22:59 karolherbst: we can just sign the image until we get the signature
23:00 Lyude: wait, really!?
23:00 karolherbst: it's a 128 bit key though
23:00 karolherbst: so....
23:00 Lyude: oh yeah
23:00 Lyude: sorry, that is what I was trying to say
23:00 Lyude: 128 bit key does not sound very hard...
23:00 karolherbst: well
23:00 karolherbst: it takes a while
23:00 karolherbst: it's AES
23:00 karolherbst: not RSA
23:00 Lyude: ah
23:01 Lyude: i hear AMD GPUs are pretty good with number crunching :)
23:01 karolherbst: it's still a lot of tries
23:01 karolherbst: and those images aren't exactly small
23:01 RSpliet: karolherbst: plus I don't know the details, but there could be multiple privkeys that for given data generate the same sig, but on different data could produce different sigs?
23:01 Teklad: Well I leave my computer idling quite a bit... I could technically leave said brute-forcing program to do its work all day every day.
23:01 Lyude: karolherbst: btw, think it might be those two patches you have on top of drm-next/
23:02 Lyude: erm, that make sensors work I mean
23:02 karolherbst: RSpliet: no, each private key has to generate a different sig
23:02 Lyude: would like to get something working on my tree for the time being
23:02 karolherbst: RSpliet: otherwise AES would be less secure
23:02 RSpliet: karolherbst: that would mean no privkey is longer than the sig?
23:02 Lyude: also yeah if this is not illegal in such a way that would get me fired I will be more than happy to dedicate machines for this
23:02 karolherbst: RSpliet: would it make sense otherwise?
23:02 karolherbst: the sig is 128 bit
23:02 Lyude: ecstatic, even
23:02 karolherbst: the key as well
23:03 karolherbst: well
23:03 Lyude: karolherbst: you've seen utilities such as hashcat, correct?
23:03 karolherbst: it depends on how you merge the sigs though
23:03 karolherbst: Lyude: yes
23:03 karolherbst: RSpliet: well, I just left out the merging of the sigs
23:03 Lyude: sounds like it would just be a matter of getting hashcat working with opencl, and getting it to try to crack the key
23:04 karolherbst: but what I said is true for 128 bit keys + 128 bit data
23:04 karolherbst: Lyude: details are important
23:04 karolherbst: we don't know how they merge the sigs together
23:04 karolherbst: CBC most likely though?
23:04 Lyude: hrm
23:05 RSpliet: "This is a very small gain, as a 126-bit key (instead of 128-bits) would still take billions of years to brute force on current and foreseeable hardware."
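Note: the estimates traded above can be sanity-checked with a few lines of arithmetic. This is only a sketch. It assumes a plain exhaustive search that succeeds, on average, after trying half the keyspace. The 4.5 s per attempt is karolherbst's rough guess for the upload-and-run approach; the offline rate of 10^12 keys per second is an assumed figure chosen purely for illustration, not a measurement of any real hardware.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def expected_years(key_bits, seconds_per_try):
    """Average duration of an exhaustive key search, in years.

    On average the right key turns up after half the keyspace,
    i.e. after 2**(key_bits - 1) attempts.
    """
    expected_tries = 2 ** (key_bits - 1)
    return expected_tries * seconds_per_try / SECONDS_PER_YEAR


# Upload each candidate image to the PMU: about 4.5 s per attempt (from the log).
on_pmu_years = expected_years(128, 4.5)

# Offline signing at an ASSUMED 1e12 keys per second (illustrative only).
offline_years = expected_years(128, 1e-12)

# The quoted 126-bit case, at the same assumed offline rate.
reduced_years = expected_years(126, 1e-12)

print(f"on PMU, 128-bit:  {on_pmu_years:.2e} years")
print(f"offline, 128-bit: {offline_years:.2e} years")
print(f"offline, 126-bit: {reduced_years:.2e} years")
```

Dropping two key bits only divides the expected time by four, which is why the quoted 126-bit figure still ends up far beyond "billions of years" under these assumptions.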
23:05 karolherbst: I think they even use custom IVs
23:06 Lyude: yay, got sensors to work on my machine, looks like I was missing the commit "hack", which I thought i added earlier…
23:07 karolherbst: your dmesg looked also fine
23:07 karolherbst: that's why I was baffled
23:07 Lyude: yeah, sorry about that lol
23:07 Lyude: figured giving you ssh access might point out something silly like that
23:07 karolherbst: no worries
23:08 karolherbst: time I spend on nouveau stuff is never wasted :3
23:08 karolherbst: RSpliet: so, good luck :p
23:08 nyef: ... GF119 HD Audio does not have tegra-esque "scratch" registers.
23:09 karolherbst: RSpliet: you have to keep in mind, that you need to sign the blocks, chain them and do that for like _a_lot_ of keys
23:10 karolherbst: RSpliet: do you think an "electron microscope" might help?
23:10 RSpliet: karolherbst: with the right expertise...
23:10 RSpliet: don't think those microscopes come cheap though
23:11 karolherbst: RSpliet: I know people with access to those :p
23:11 Lyude: but some people here do work at universities...
23:11 RSpliet:hides
23:11 karolherbst: RSpliet: you know the nm of the falcons?
23:12 RSpliet: the nm? presumably the entire chip is produced in the same process...
23:12 karolherbst: mhh
23:12 RSpliet: 16?
23:12 Lyude: btw karolherbst, where do you think in nouveau's kernel source tree I should start sticking in code for enabling clock gating?
23:12 karolherbst: so the falcons would be 28 nm as well
23:12 karolherbst: RSpliet: maxwell2 is still 28nm
23:12 RSpliet: ok
23:12 karolherbst: Lyude: I always work with the out of tree module
23:12 karolherbst: ohhh
23:12 karolherbst: Lyude: subdev/therm
23:12 Lyude: cool
23:12 karolherbst: I think
23:12 karolherbst: not _quite_ sure
23:13 karolherbst: you want to add checks for which engine actually exists
23:13 karolherbst: but my poke is soo trivial I doubt anything will go wrong
23:13 karolherbst: but we can still do it the right way
23:14 Lyude: I'm guessing those regs are universal across different gens based on the notes in trello?
23:14 karolherbst: RSpliet: the PMU is somewhere on the die I assume? not somewhere else?
23:14 nyef: Preliminary conclusion: I don't need an HDMI analyzer, I need to see if the blob does anything useful with HDMI audio on this hardware, and if it *doesn't* then I need to see if smacking the tegra "scratch" registers from the HDA side trigger it to do so anyway. /-:
23:14 karolherbst: Lyude: well, starting with fermi until maxwell2
23:15 karolherbst: Lyude: pascal, dunno, tesla? different
23:15 Lyude: alright, just trying to decide what file to start off with. there are a lot of things in nouveau's source tree...
23:15 karolherbst: Lyude: so you need to add a function pointer for handling that, and enable that only for gf1xx -> gm2xx
23:16 karolherbst: Lyude: example for a design like this: subdev/pci/pcie.c + implementation pointers inside various subdev/pci/$chipset.c files
23:17 karolherbst: nvkm_pcie_set_link is called from outside
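Note: the design karolherbst describes (a common entry point called from outside, with per-chipset implementation pointers behind it) can be modelled roughly as follows. This is a toy Python model, not nouveau code: the real driver is C and uses structs of function pointers. Every name here (`ThermFunc`, `clkgate_enable`, `therm_clkgate_enable`, the `CHIPSETS` table and its contents) is hypothetical and chosen only for illustration. Leaving Tesla and Pascal without a hook mirrors the "different" and "dunno" in the log, nothing more.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ThermFunc:
    """Per-chipset table of hooks; an unimplemented hook stays None."""
    clkgate_enable: Optional[Callable[[], str]] = None


def fermi_clkgate_enable() -> str:
    # Hypothetical stand-in for the trivial register poke from the log.
    return "clock gating enabled"


# One table per family. Only Fermi through Maxwell 2 get the hook here.
NO_CLKGATE = ThermFunc()
FERMI_TO_MAXWELL2 = ThermFunc(clkgate_enable=fermi_clkgate_enable)

# Illustrative chipset-to-table mapping (assumed, not taken from the driver).
CHIPSETS = {
    "g84": NO_CLKGATE,           # Tesla: handled differently
    "gf100": FERMI_TO_MAXWELL2,  # Fermi
    "gk104": FERMI_TO_MAXWELL2,  # Kepler
    "gm204": FERMI_TO_MAXWELL2,  # Maxwell 2
    "gp104": NO_CLKGATE,         # Pascal: unknown
}


def therm_clkgate_enable(chipset: str) -> str:
    """Common entry point: callers never touch the per-chipset table."""
    func = CHIPSETS[chipset]
    if func.clkgate_enable is None:
        return "not supported on this chipset"
    return func.clkgate_enable()


for name in ("g84", "gf100", "gp104"):
    print(name, "->", therm_clkgate_enable(name))
```

The point of the pattern is that supporting another chipset means adding one table entry rather than adding branches to the common code.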
23:17 Lyude: awesome
23:20 RSpliet: karolherbst: yes, it's just going to be on-die
23:21 karolherbst: :( maybe it has a label "PMU HERE *==>"
23:21 RSpliet: so only 3 billion transistors to RE
23:22 karolherbst: and the star below: "*you nouveau guys rock <3" :O
23:22 RSpliet: a significant portion of that transistor budget is spent on SRAMs
23:22 karolherbst: RSpliet: the question is, what is faster, brute forcing the key .... or that
23:23 RSpliet: DRAM controllers are by the side
23:23 RSpliet: large replicated patches are likely the GPCs
23:23 RSpliet: so you're looking for a small semi-irregular core that is not one of the video decoders :-P
23:24 karolherbst: well
23:24 karolherbst: the PMU is the biggest falcon afaik
23:24 RSpliet: I thought vdecs had SIMD extensions, expect them to have significantly more ALUs
23:24 karolherbst: mhh, true
23:26 RSpliet: (or ALUs with deep pipelines...)
23:26 karolherbst: or we just break the crypto hardware
23:26 karolherbst: would be the cleanest way
23:40 Lyude: btw, how is the model stuff in subdev supposed to go? I see that for thermal stuff on fermi we have gf119, is that because gf119 handles things differently than earlier gf* chips? or is that just because that's what the person writing it had available at the time
23:41 karolherbst: Lyude: the former
23:42 Lyude: alright, I figured as much. time to make a gf100.c then
23:42 karolherbst: Lyude: checkout nvkm/engine/device/base.c
23:42 Lyude: ooooooooooooooooooh. this helps a ton
23:43 karolherbst: Lyude: "lifecycle" of the nvkm_subdev pointers: oneinit: at loading time, init: on every boot/resume -> fini: on every shutdown/suspend, onefini: at unloading time
23:43 Teklad: Now I'm kinda wishing I had an older card to dev on.
23:43 Teklad: So much hurt.
23:44 Lyude: i should go and get some older cards from friends/ebay at some point
23:44 karolherbst: ohh wait
23:44 karolherbst: there is no onefini
23:44 Teklad: only xul
23:44 Lyude: true
23:44 karolherbst: Lyude: and there is also preinit
23:44 karolherbst: preinit basically means: before init, but no other subdevs are up
23:44 Teklad: Oh right... I do have an older card sitting here.
23:44 Teklad: what was it again
23:44 karolherbst: at init time you can expect that for every engine/subdev preinit was run
23:45 karolherbst: and that you can actually use other subdevs
23:45 karolherbst: at preinit/oneinit, not so much
23:45 Teklad: darn
23:45 Teklad: All I got is an old 730k
23:46 karolherbst: Lyude: we usually use dtor for cleaning up on unloading time
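Note: the hook order described above can be pictured with a small simulation. This is a toy Python model, not nouveau code. The classes `Subdev` and `Device` and the methods `bring_up`, `bring_down` and `unload` are hypothetical names. The model also makes assumptions the log does not spell out: that preinit runs for all subdevs before any oneinit or init, that it runs again on resume, and that teardown happens in reverse order.

```python
class Subdev:
    """Toy stand-in for a subdev; each hook only records that it ran."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.oneinit_done = False

    def record(self, hook):
        self.log.append(f"{self.name}.{hook}")


class Device:
    def __init__(self, names):
        self.log = []
        self.subdevs = [Subdev(n, self.log) for n in names]

    def bring_up(self):
        # preinit: before init, while no other subdev can be relied on yet.
        for s in self.subdevs:
            s.record("preinit")
        for s in self.subdevs:
            # oneinit: once, at load time.
            if not s.oneinit_done:
                s.record("oneinit")
                s.oneinit_done = True
            # init: on every boot/resume; preinit has run for everyone by now.
            s.record("init")

    def bring_down(self):
        # fini: on every shutdown/suspend (reverse order is an assumption).
        for s in reversed(self.subdevs):
            s.record("fini")

    def unload(self):
        self.bring_down()
        # There is no onefini; the dtor cleans up at unload time.
        for s in reversed(self.subdevs):
            s.record("dtor")


dev = Device(["pci", "therm"])
dev.bring_up()    # module load / boot
dev.bring_down()  # suspend
dev.bring_up()    # resume
dev.unload()      # module unload
print("\n".join(dev.log))
```

In this model oneinit and dtor appear once per subdev, while init and fini repeat for every suspend/resume cycle.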
23:47 Lyude: honestly now that I am understanding this organization scheme you guys have for source files in nouveau more, this is realllllly nice
23:47 karolherbst: well
23:47 karolherbst: it is basically object oriented C :p
23:47 Lyude: of course! but intel just jams all of their gen specific things in the same source files
23:47 karolherbst: just better than what glib is doing :D
23:47 Lyude: amd has different files, all of which copy paste as much non-gen specific crap as they can
23:47 karolherbst: AMD source is shit
23:47 karolherbst: :D
23:48 Lyude: oh yeah
23:48 Teklad: You guys got some kind of bug tracker thingamajig?
23:48 karolherbst: I tried to figure out how they do their pcie stuff
23:48 karolherbst: it was terrible
23:48 Lyude: like
23:48 Lyude: anyone working at AMD should not legally be allowed to ever type "CTRL-C" or "CTRL-V"
23:48 karolherbst: :O
23:48 karolherbst: :p
23:48 karolherbst: true
23:48 imirkin_: Lyude: how do you kill a process?
23:48 imirkin_: just let it run then?
23:49 Lyude: imirkin_: hehe, good point
23:49 karolherbst: Lyude: I really like the design of nouveau, because it makes it so easy to get stuff working for new chipsets, we just hack some pointers together
23:49 karolherbst: :)
23:49 Lyude: CTRL+Z + kill %1 would work
23:49 imirkin_: and entering newlines kinda requires ^V ...
23:49 imirkin_: hehe, i learned long ago to never do that. always do kill %%
23:49 karolherbst: imirkin_: ctrl-d :p
23:49 imirkin_: otherwise you could kill the wrong process
23:49 Lyude: oh neat, never knew about that one
23:49 karolherbst: ctrl-d is SIGTERM
23:49 karolherbst: ohhh
23:49 karolherbst: wait
23:49 Lyude: karolherbst: that is exactly how it should be, honestly
23:49 karolherbst: EOF
23:49 imirkin_: [and in fact, i did - that's how i learned. heh.]
23:49 karolherbst: I think it's EOF
23:49 karolherbst: or SIGTERM
23:50 karolherbst: dunno
23:50 karolherbst: don't care
23:50 imirkin_: it's EOF for stdin
23:50 karolherbst: it issues a logout in ssh
23:50 karolherbst: and doesn't kill the process
23:50 karolherbst: yeah, EOF
23:50 karolherbst: most likely
23:50 imirkin_: bash has a special interpretation for it
23:50 imirkin_: as do most shells
23:50 karolherbst: true
23:50 karolherbst: bash calls exit
23:50 imirkin_: back in the day you had to type out 'exit'
23:50 karolherbst: :O
23:50 karolherbst: no tab?
23:50 karolherbst: ohh wait
23:50 karolherbst: that takes even longer
23:51 imirkin_: back in day, this was the only tab: https://en.wikipedia.org/wiki/Tab_(soft_drink)
23:51 karolherbst: please stay away with your sugar water :p
23:52 karolherbst: ohh wait, it has none
23:52 imirkin_: all this computer hacking is making me thirsty... where's my tab?
23:53 imirkin_: https://www.youtube.com/watch?v=siaxGjttoVM
23:53 karolherbst: :D
23:54 imirkin_: it took me a *very* long time before i got that ... probably years since i first saw it
23:54 karolherbst: Lyude: maybe we should get ben to refactor AMDs code, so that it looks good as well :3 I am sure he would like to do it _very_much_ !
23:54 Lyude: i've considered doing it every now and then
23:54 karolherbst: imirkin_: ... seriously? :D
23:55 Lyude: i'd be willing to do it for radeon because a lot of the cleaning up is -really- trivial
23:55 imirkin_: well, i didn't know that tab was a drink, so i had no idea what the reference was to
23:55 karolherbst: Lyude: merging the drivers would be a first step :p
23:55 Lyude: and it would be hilarious to have a -1000+ line diff that adds maybe like, 100 lines
23:55 Lyude: karolherbst: yeah :\
23:55 Lyude: there are things I would like to do with amdgpu but I haven't because I don't know what's going to come out of this ugly dal crap
23:56 karolherbst: although nouveau's support for pre nv50 hardware is a bit... bad? no clue, it seems to work, but that's it I think
23:56 imirkin_: the unfortunate thing about the current arrangement in nouveau is that you end up having to jump around between a LOT of fail
23:56 Lyude: i might clean things up then suddenly everything's been replaced by dal
23:56 imirkin_: files*
23:56 Lyude: assuming amd ever manages to get dal into an acceptable state
23:56 karolherbst: imirkin_: get a proper IDE :p
23:56 imirkin_: not sure how that'd fix anything
23:56 imirkin_: ctags aren't an issue - i can jump to definitions/etc
23:56 Lyude: i'm fine with that
23:56 karolherbst: mhh well, IDEs give you a list of implementations you can jump to
23:56 imirkin_: it's just all spread out
23:56 Lyude: karolherbst: you can do that with vim too :P
23:56 karolherbst: ohh, k, sure
23:56 Lyude:uses vim like an IDE
23:57 karolherbst: Lyude: well, I first need to learn how to use vim anyway....
23:57 karolherbst: I even bought a book! that's how dedicated I am!
23:57 Lyude: haha
23:57 karolherbst: (bought it like 6 years ago)
23:57 imirkin_: you can pass it on to your children
23:57 karolherbst: this still takes a while
23:57 karolherbst: I am not _that_ old actually
23:58 karolherbst: although I am
23:58 karolherbst: mhh
23:58 imirkin_: that's how everyone thinks about themselves
23:58 imirkin_: the definition for what's old is your age + 5 or so :)
23:58 karolherbst: uhm... I am the youngest in my company currently
23:58 Lyude: hehe, I'm only 21 and I know I am constantly one of the young people in most of the communities I'm in
23:58 imirkin_: of course the rest are all octogenarians...
23:59 imirkin_: Lyude: esp the retirement communities? :p
23:59 Lyude: imirkin_: hehe
23:59 karolherbst: Lyude: noooo, somebody is younger than me and working on nouveau :(
23:59 imirkin_: and i'm probably the oldest
23:59 imirkin_: =/
23:59 karolherbst: could be?
23:59 karolherbst: RSpliet once told me he could be my father :P