04:06dviola: skeggsb: I tested your patch and the timeout is gone when booting my system, but it's back after a suspend/resume
04:07imirkin: pmoreau: ok. rebased and split up: https://gitlab.freedesktop.org/imirkin/mesa/-/commits/nv50_compute
04:07imirkin: pmoreau: still a few things to address
04:07imirkin: pmoreau: i left out all the nir / opencl stuff
04:11skeggsb: dviola: what's the backtrace from that?
04:12dviola: one sec, I'll get it
04:13dviola: ugh, I reverted it to the stock arch kernel, it will take me a few minutes to get it to you, I have to recompile the kernel
04:14dviola: shouldn't take that long
04:14skeggsb: cool, because i don't believe it's possible to hit that timeout now, so i'm curious :P
04:28imirkin: mwk: are the "postincr" op variants basically where on top of doing whatever the op was doing, the $a reg's value is increased too?
04:30imirkin: (and if so, by how much exactly?)
04:34dviola: skeggsb: this is after a suspend/resume: http://ix.io/2QLS
04:37skeggsb: hrmmm, i'm gunna need to think about that for a bit
04:39skeggsb: those are different timeouts you're hitting now, yes?
04:40dviola: looks like it: http://ix.io/2OTM
04:40dviola: the other was related to: drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c:107 nvkm_pmu_reset
04:40skeggsb: yeah, good, that sorta makes sense now then
04:43skeggsb: i'm apparently attempting to come up with an additional patch now
04:57skeggsb: dviola: alright, there's another patch in the same branch
04:59dviola: cool, testing now
05:08dviola: skeggsb: sweet, no timeouts anymore :)
05:08dviola: tried suspend/resume as well
05:09dviola: let me try a few more reboots, etc
05:09skeggsb: sure :) let me know how you go
05:10dviola: yep, it's perfect now
05:11dviola: skeggsb: thanks a lot
05:11skeggsb: thanks for the pokes and testing :) i won't send them anywhere yet, i need to look carefully to make sure nothing unexpected is going to happen
05:11dviola: alright, sure
05:12skeggsb: there's a twisty maze between pre-ACR falcon use and post-ACR in the driver that's hard to follow
05:13dviola: and yw :) I have to actually confess I installed arch on that machine just to see which issues I would run into :P that's the laptop I am supposed to use for work, I might actually use it now :P
05:16dviola: skeggsb: feel free to ping me anytime if you want me to test more patches
05:19skeggsb: dviola: sure, thanks!
05:34damo22: thank you all for this driver! I've been reading the backlog for only a few days but i can tell there is a lot of work that went into the development and reverse engineering and appreciate how hard this stuff can be
05:40damo22: fingers crossed the signature check has a bypass somewhere and one day we'll figure it out :D
06:20salawat: Well, that was quite the rabbit hole.
06:24salawat: Went from not having a bloody clue to discovering documentation that has prior to now stayed firmly somewhere in the fog induced by search engine SEO, and discovered I've got a new ISA to look into, and fiddle with tooling for.
06:29salawat: The asn1 parsing experiment was downright a shot in the dark that resulted in more amusement than anything... but I feel like it's at least gotten me deeper into actually learning the guts of the card thanks to the envytools docs. Also stumbled across the DRIVE docs, and while I still haven't seen much in terms of binary dumpable crypto stores accessible from the host machine, I'm at least aware of what algorithms they're packing in their
06:31salawat: got a lot of reading to do I think before I'll be any use, but eh... the journey of a million miles starts with a single step, right?
08:51pmoreau: imirkin: Awesome, thanks a lot! Don’t we all love having one giant WIP patch that just keeps on accumulating even more stuff? :-D Should I go from the compute branch or the nv50_compute one?
08:52pmoreau: As for the patches missing S-o-b, I most likely forgot to add the flag for it when committing so you can add them (or I will do it when I update my branch).
08:52pmoreau: I should change my Git config for this repo so that it automatically adds the S-o-b.
10:49zocker: is there a way to read the current core and memory clocks of a GT1030/gp108 card while running on nouveau?
11:03zocker: or would pmu firmware be required for that?
11:14kherbst: zocker: I think in theory we should be able to read it out, but it's a bit hard to RE which registers to touch for that
11:17RSpliet: kherbst: is it? Can't you set manual clocks with... ehh... what's that feature called
11:18RSpliet: Coolbits I think
11:21RSpliet: kherbst: https://forums.developer.nvidia.com/t/option-coolbits-is-not-used-optimus-enabled-laptop-running-an-rtx-2070-manjaro-linux/111771
11:21RSpliet: Should still be a thing for RTX2070 at least
11:21kherbst: RSpliet: it still goes through the PMU
11:21kherbst: but anyway.. I don't think much of the regs changed in pascal
11:22kherbst: and I think most is actually identical
11:22RSpliet: kherbst: even the non-memory PLLs?
11:22kherbst: yeah.. I think so
11:22kherbst: I was looking at it briefly
11:22kherbst: the thing which they locked down was the voltage controller
11:22kherbst: but I wasn't looking in depth, because... quite pointless
11:22RSpliet: Oh. Eh, I expect PLL registers to look exactly the same as they always do, PLLs themselves haven't fundamentally changed :-D
11:23kherbst: but we don't even calculate the clocks correctly on kepler
11:23RSpliet: Yeah, limited benefit. Nice to report speeds, but that's about it.
11:24RSpliet: I expect REing those registers, if at all necessary, isn't the problem though. Just... rather have you focus on multithreading or OpenCL or other more useful bits :-D
11:24kherbst: the bigger issue is just, that we can't really trace what the blob is doing
11:24kherbst: and this sucks
11:25RSpliet: Why not? You can't alter the VBIOS, which is a headache. But you can still intercept the PMU scripts no?
11:25kherbst: we can't
11:26kherbst: also, what scripts? :p
11:26RSpliet: My knowledge is Kepler/Maxwell, do enlighten me about changes :-P
11:26kherbst: yeah well.. the PMU is doing everything
11:27kherbst: dunno if this is the case with pascal though
11:27RSpliet: Oh, so no more "calculate all the memory registers in the driver and upload a script of changes"
11:27RSpliet: Does it read out the VBIOS directly?
11:27kherbst: but the tables changed massively
11:27kherbst: with kepler we had like .. 15 tables?
11:27kherbst: we are now at 40
11:28RSpliet: Are they running MySQL on those cores?
11:28kherbst: highest P table offset was 0x138 or whatever
11:28kherbst: it's massive
11:28kherbst: maybe I am exaggerating a little.. let me check on turing :D
11:29RSpliet: There's a point of over-engineering :')
11:29RSpliet: Anyway, yeah, clear. However, if all the logic ends up in the PMU, then we don't need to RE none of it. We just need the gosh darn firmwares :-P
11:29kherbst: RSpliet: https://gist.github.com/karolherbst/be862bd529a8b4e31a6b35575f2adb22
11:30kherbst: so.. I wasn't that off
11:30RSpliet: Yeah that's 40 alright
11:30kherbst: 40 existing tables ;)
11:31kherbst: there are some missing ones
11:31kherbst: but I assume those are mostly legacy ones
11:31kherbst: but yeah.. I think what you do with the PMU is, that you read out the tables and configure the PMU accordingly
11:32kherbst: but... oh well
11:32kherbst: we'll see
11:36RSpliet: We've been seeing for a couple of years now. So far there's little to see on the perf management front :-P
11:36kherbst: I know
11:36RSpliet: Even I got an AMD APU the other day... my heart hurts, but the ease of having it just work does soothe a bit
12:01zocker: these missing firmwares should be somewhere in the nvidia blob and the difficulty is to actually extract them somehow, correct?
12:01zocker: (sorry for my noob questions, i just started researching this stuff yesterday :D)
12:04kherbst: problem is redistributing
12:09zocker: so if i managed to somehow obtain the pmu firmware, what else would be missing for reclocking?
12:10kherbst: wiring it up inside the kernel
12:11zocker: so the driver would talk to it in order to set voltages and clocks, and the card wouldn't manage that itself?
12:13zocker: as in, it wouldn't be sufficient to just upload the pmu firmware to the gpu?
12:18zocker: i see
12:21zocker: is the procedure to extract the firmware documented somewhere? at least that knowledge should be legal to redistribute?
12:24kherbst: there is code somewhere.. imirkin_ might know
12:30zocker: alright, thanks!
12:33zocker: i'm really impressed with nouveau's functionality, kde plasma works great with desktop effects and everything up to 1920x1080
12:34zocker: tried it with a 4K display as well, then it starts to fall apart performance wise
12:51zocker: i'm going to use it with a 1080p projector anyway, so for me it's fine... i'm just curious about possible performance improvements :)
13:22imirkin_: pmoreau: the nv50_compute one is the up-to-date rebased one. it's all nice and split up.
13:23imirkin_: (i think?)
13:23imirkin_: let me know if you think something in there deserves further splitting
13:24pmoreau: I will have a look at it later (most likely during the weekend).
13:28pmoreau: I rebased my “rework_mem_ldst” branch to the latest master; I started reworking some of my patches to properly use a, but it’s still a WIP (I don’t even remember what the current status is there). I’ll try to split patches out (like changes in nv50_ir_from_nir.cpp) to simplify things, and send MRs.
13:33imirkin_: pmoreau: i'd recommend basing on mine. i'm going to start upstreaming lots of this stuff.
13:33imirkin_: unless you see issues.
13:33pmoreau: Sounds good!
13:33imirkin_: what do you mean by "properly use a"?
13:34imirkin_: i need to copy over the nvc0 lowering for shared memory atomics
13:34pmoreau: Uh, not a, but $a
13:34imirkin_: but otherwise shared should be fine ... i think
13:34imirkin_: i'll double-check
13:34pmoreau: I was converting away from using the address registers and always using regular registers instead.
13:35imirkin_: yeah, the input ir needs to make sure they're FILE_ADDRESS
13:35imirkin_: maybe nvir screws that up? dunno
13:35imirkin_: er, nvir_from_nir
13:36pmoreau: from_nir is doing it properly IIRC, but I had an explicit pass to replace $a with $r, cause something was not working/I did not understand how to use $a properly.
13:36pmoreau: So in the new series I am trying to revert and fix that.
13:39imirkin_: oh oops
13:39imirkin_: rm -rf ;)
13:39imirkin_: having split out patches helps with this stuff
13:42mwk: imirkin_: the postincr for Tesla? it first uses the unchanged $a value, then increments it by the instruction field that would otherwise encode offset from $a
13:42imirkin_: mwk: ah ok. so the offset isn't applied
13:42imirkin_: (to the address)
13:42pmoreau: Hahaha :-D
13:42pmoreau: It does, and thankfully it isn’t part of a big huge WIP patch.
13:44mwk: yes, the offset is only applied afterwards
13:47imirkin_: mwk: cool thanks
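[editor's note] The post-increment behaviour mwk describes can be sketched as a toy model. This is only an illustration under assumptions: memory is a flat Python list, $a is a plain integer, and the function names `load_postincr` and `load_offset` are hypothetical. It does not model the real Tesla instruction encoding or address units.

```python
# Toy model of the addressing behaviour described in the chat above.
# Assumption: flat list as memory, plain ints for $a and the instruction field.

def load_postincr(mem, a, field):
    """Post-increment variant: return (loaded value, new $a)."""
    value = mem[a]            # the address is the unchanged $a; field is not added
    return value, a + field   # only afterwards is $a bumped by the field


def load_offset(mem, a, field):
    """Regular variant: return (loaded value, new $a)."""
    return mem[a + field], a  # field acts as an offset; $a itself is unchanged


mem = [10, 11, 12, 13, 14, 15, 16, 17]
print(load_postincr(mem, 2, 4))  # (12, 6)
print(load_offset(mem, 2, 4))    # (16, 2)
```

The same instruction field plays two roles: an address offset in the regular form, and an increment applied after the access in the postincr form.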
15:08imirkin_: pmoreau: i'll put another branch together with what i propose we merge since it's clearly "good"
15:49pmoreau: I might even have some time to look at it today, we will see.
15:54imirkin_: basically i think many compiler changes are good to go, and some of the state tracking stuff
15:54imirkin_: and probably leave off the enablement bits until later
15:54imirkin_: as well as some of the more bogus WIP-marked changes
16:54Lyude: -finally- think I figured out why the cursor transparency igt test has been failing and it's, silly (and I'm surprised I didn't think of this)
16:55imirkin_: Lyude: speaking of cursors, did you see the discussion about the 256x256 cursor change on the ML?
16:56imirkin_: by the sounds of it, at least GK10x's don't fully support 256x256 cursors. or there's more that we're supposed to do for them.
16:56Lyude: we're not clearing fbs before we allocate them, which I didn't realize would actually matter until I realized that cairo (which igt uses for drawing) would be doing alpha blending using the data already in the fb. which, means drawing with alpha = 0 just leaves whatever color was previously on the fb
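[editor's note] A minimal sketch of why an uncleared fb matters here, assuming the draw uses the usual Porter-Duff "over" compositing on premultiplied RGBA pixels. The `over` helper and the pixel values are hypothetical illustrations, not igt or cairo code.

```python
# "Over" compositing for a single premultiplied RGBA pixel (floats in 0..1).
# Assumption: this stands in for whatever blending the drawing library does.

def over(src, dst):
    src_alpha = src[3]
    return tuple(s + d * (1.0 - src_alpha) for s, d in zip(src, dst))


stale = (0.8, 0.1, 0.1, 1.0)        # leftover contents of an uncleared fb
cleared = (0.0, 0.0, 0.0, 0.0)      # what a zeroed fb would contain
transparent = (0.0, 0.0, 0.0, 0.0)  # drawing with alpha = 0

print(over(transparent, stale))     # (0.8, 0.1, 0.1, 1.0): the old data survives
print(over(transparent, cleared))   # (0.0, 0.0, 0.0, 0.0)
```

With alpha = 0 the source contributes nothing, so the result is whatever the destination already held. A test that expects transparency only gets it if the fb started out cleared.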
16:56Lyude: imirkin_: hm.
16:57imirkin_: 128x128 is fine though
16:57Lyude: yeah I did see that I just hadn't really read too deeply into it. I actually got some of my other kepler cards back from the boston office the other day so I can probably try to reproduce this
16:57imirkin_: Lyude: cool
16:57imirkin_: would be good to have someone with (a) hardware (b) graphics experience have a looksy
16:57Lyude: imirkin_: that's... also very weird, I'm not sure that actually matches up with some of the docs I have
16:57Lyude: maybe I missed something though
16:57imirkin_: i asked emersion to retest on his GK208, since that has a slightly later display controller
16:58imirkin_: could be. it's been known to happen.
16:58imirkin_: and yeah, the docs are pretty clear about 256x256 being a valid enum value. but nothing to say that there wasn't an errata that went with it that said "don't use 256x256, it's broken" :)
16:59imirkin_: modetest + SMPTE pattern for the cursor should be able to flush it out though
17:01Lyude: skeggsb: btw - I remember you mentioning that at some point with all of the mmu rework stuff for nouveau you were going to add something for auto-clearing surfaces we allocate, any idea what the timeline for that is (or if I could help speed it up?)
17:01Lyude: imirkin_: eh-I've kinda just gotten in the habit of using igt, I need to hack modetest at some point so that you don't have to specify drm object IDs because it makes it kind of tedious to use in the commandline :v
17:02imirkin_: yeah, connector names would be super
17:02imirkin_: but i only ever light up 1 connector at a time
17:02imirkin_: but anyways, you want to use igt for it, go for it
17:02imirkin_: but it's important to verify it actually displays correctly ;)
17:04Lyude: imirkin_: I mean you can see the output with igt :P, either way CRCs tend to be pretty accurate
17:04imirkin_: but CRC's don't verify anything is displaying correctly...
17:05imirkin_: just displaying differently ;)
17:05Lyude: mhm, yeah I've already found some bugs like that with igt :P
17:05imirkin_: ultimately a human sorta has to be the judge of "correct" vs "not"
17:06imirkin_: or a full writeback scheme with frame-by-frame comparison to a set of goldens
17:08Lyude: generally it's fine as long as you've verified at some point that displaying basic things works correctly, and also try actually generating different CRCs by changing the display output (it's technically possible for two different frames to have identical CRCs, but any decent CRC algorithm should give a different number for two frames that are only a tiny bit different from one another)
17:10imirkin_: but we don't have CRC goldens do we?
17:10imirkin_: (for each hardware for each test)
17:11Lyude: imirkin_: neither does intel, but they still catch bugs pretty well for the most part. i've only found like, I think one false positive in all the tests I've ran so far?
17:11imirkin_: i'm not saying it's bad to do the crc thing
17:11imirkin_: i'm saying it's not an indicator of accurate display
17:11Lyude: yeah, gotcha
17:11imirkin_: like you could have e.g. inverted colors, and the crc thing would be all good
17:13Lyude: good point
17:13imirkin_: or the stride of the image could be messed up, and crc would be all good
17:14imirkin_: but crc's catch other types of issues which are important too
17:14imirkin_: (like "i tried to display a plane, and nothing changed. weird")
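[editor's note] The distinction being drawn (CRCs show that output changed or stayed the same, not that it is right) can be sketched with a toy frame. Assumptions: `zlib.crc32` stands in for the hardware CRC, and `render` is a hypothetical 8x4 single-channel scanout whose `invert` flag models a pipeline that always inverts colors.

```python
import zlib

WIDTH, HEIGHT = 8, 4


def render(invert=False):
    """Hypothetical scanout: a small gradient frame as raw bytes."""
    frame = bytes((x * 16 + y) & 0xFF for y in range(HEIGHT) for x in range(WIDTH))
    if invert:
        frame = bytes(255 - b for b in frame)  # consistently wrong output
    return frame


def crc(frame):
    return zlib.crc32(frame)


good = render()
broken = render(invert=True)

# Comparing the broken pipeline against itself passes: the CRC is stable.
print(crc(render(invert=True)) == crc(broken))  # True

# A real change to the frame is still caught, even a one-bit difference.
changed = bytearray(broken)
changed[0] ^= 1
print(crc(bytes(changed)) != crc(broken))       # True

# Only a golden CRC from a known-good render exposes the inversion.
print(crc(broken) != crc(good))                 # True
```

Without goldens, a test can only assert "same as before" or "different from before", which is why a consistently inverted or mis-strided image passes.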
17:14Lyude: yeah, that's kinda the reason I got intel into using chameliums for testing in the first place
17:15imirkin_: having crc goldens would be a reasonable idea
17:15Lyude: (I really need to use mine for like, actual igt tests more instead of just using it as my ultimate remote display)
17:15imirkin_: but i wonder if it's too sensitive
17:15imirkin_: yes, the simplest way to have remote display: chamelium.
17:15imirkin_: definitely not vnc.
17:16Lyude: imirkin_: it is too sensitive unfortunately, on a lot of hw (and I think this is the case for some nv hardware in my experiments) I think even just things like different data pipe widths between each head can even cause crcs to come out differently
17:16imirkin_: yeah, makes sense
17:16Lyude: (volta+ actually has a mode for fixing this though! but, I haven't actually experimented enough to see how reliable that is)
17:18Lyude: imirkin_: imho what would be quite nice (mainly because it'd let us actually use CRCs for non-active parts of the raster for verifying stuff like HDMI audio) is if we could actually get the algorithm for computing a CRC from an fb, or figure out if the hardware provides any way of doing it
17:19imirkin_: yeah, that's next-level
17:19imirkin_: verifying that we're sending the right infoframes/etc would be super
17:20imirkin_: like ACTUALLY sending the right infoframes
17:20imirkin_: as opposed to thinking we're sending the right infoframes :)
17:20imirkin_: that said, esp audio is tricky
17:20imirkin_: we use the "auto" setting for the audio stuff
17:20Lyude: mhm, we can probably do that with the chamelium but i'd much rather us be able to just do it with crcs if possible (since it's faster, sometimes more reliable, and definitely cheaper for everyone involved)
17:20imirkin_: rather than computing all that junk by hand
17:20imirkin_: so we don't really know what the "right" thing is in the first place
17:21imirkin_: the hw does allow explicit setting of hdmi audio parameters, but everything i've seen on-list about it with hw that requires that is endless pain and lack of audio sync
17:24Lyude: i do wonder if we could at least implement a basic hdmi audio test, one that isn't robust but at least verifies that something changes the CRC in the non-active raster. would be better than nothing for when we finally get CI running
17:34emersion: chamelium already should have audio tests
17:35Lyude: emersion: I mentioned that btw
17:35emersion: i worked on those during 6 full months :P
17:35emersion: ah, sorry, haven't read the whole thing
17:36Lyude: but it'd still be nice to have CRC tests for doing BAT-like runs (and for people who might just want to do basic sanity checks for regressions on their own machines with igt)
17:36Lyude: emersion: also yeah I forgot I helped you with reviewing all of those lol
17:38emersion: hm, how is CRC related to audio?
17:39Lyude: emersion: nvidia hardware provides the ability to specify which part of the scanout raster you want to generate CRCs from, including the non-active portions (which I -believe- contain audio for HDMI? I might completely be misremembering though)
17:39emersion: ah. yeah i think audio is just stuffed after the frame data
17:40emersion: i wonder if it'll work though -- that may be used for other things too, like completely unrelated infoframes
17:41emersion: maybe it's just garbage when there's no audio playing
17:41emersion: so, not sure a CRC on that would help, but worth trying stuff out
17:42Lyude: yeah, I definitely have to dig into it a bit more
17:43Lyude: but, that'll probably happen after I've at least got basic atomic working on both pascal and earlier + turing
17:43Lyude: erm, s/turing/volta+
20:00tertl3: good afternoon fam
20:00tertl3: may I inquire regarding NetBSD and a 9500 with 6GB
20:16imirkin: inquire away
20:23imirkin: (except there were no GT 9500's with 6GB of vram. 512MB would be more like it)
20:26RSpliet: do we even have *BSD devs here?
20:26kherbst: dunno, but last time I had to do with *BSD they had drm based on 3.x
20:27imirkin: riastradh used to be here for a time when trying to get netbsd up and running. but i don't think he's been around much lately
20:27imirkin: yeah, it was a 3.15 or so port at the time (years ago)
20:27imirkin: perhaps it's been updated since