00:05RSpliet: Oh I didn't realise the dri-logger bounced along to OFTC. Good stuff! Could someone add the words "Logged" and "https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=nouveau" to the topic? :-D
00:10imirkin: RSpliet: go for it
15:36Svanto: Greetings, esteemed graphics wizards!
15:36Svanto: I was looking through the troubleshooting and FAQ sections but I cannot seem to find what I was looking for.
15:36Svanto: Is the nvFanless kernel module parameter still a thing, or has it been removed?
15:36Svanto: I wanted to test it out with / sacrifice a Maxwell 2 GPU to science.
15:36Svanto: I understand the ramifications of the setting and why one would want to omit it from official documentation, but I prepared a watercooled loop with a coolant temp monitor, so I'm as prepared as I think I can be.
15:37imirkin: still a thing. but it's called NvFanless
15:38imirkin: although i'm not 100% sure that stuff was merged in the first place. maybe?
15:38imirkin: karolherbst: --^
15:38karolherbst: Svanto: it never landed upstream
15:39imirkin: Svanto: btw, just so there are no false expectations... clock-for-clock, nvidia is still much faster than nouveau
15:41karolherbst: oh btw, I now have an ampere GPU :3 I hope I will get to enabling acceleration soon enough :D
15:42karolherbst: the ISA is even the same as with turing
15:42karolherbst: so hopefully it will be easy to do
15:45Svanto: karolherbst: Ah, I'm sorry to hear that. Thank you very much for the info. And congratulations on the new hardware!
15:46imirkin: Svanto: there's a branch somewhere
15:46karolherbst: yeah.. it took a while, but now I have one :D
15:48Svanto: imirkin: thank you for the info, yeah, I'm well aware. I was mainly interested in nouveau because it appears to be the only blobless driver for mainstream GPUs.
15:48Svanto: Plus, I use Sway, so not a whole lot of choice given the state of the GPU market, and given that unlike nouveau, proprietary nvidia does not appear to support GBM, nor will it ever by the looks of it.
15:48imirkin: karolherbst: do you have a link to your fanless branch?
15:49imirkin: i couldn't find it quickly
15:50karolherbst: probably this one: https://github.com/karolherbst/nouveau/commits/clk_to_upstream
15:51karolherbst: it also shows why that stuff isn't upstreamed yet...
15:51karolherbst: driver side thermal throttling
15:52karolherbst: or maybe use this one? https://github.com/karolherbst/nouveau/commits/clk_cleanup_for_real
15:52karolherbst: I lost track a long time ago :D
15:54Svanto: Ahh. Well, I can pin my first goal in nouveau tinkering then; find NvFanless branch ;D
15:54karolherbst: if somebody wants to clean this mess up :D
15:54karolherbst: that would be appreciated
15:55karolherbst: that branch contains like 5 new features though
15:55imirkin: and then you can get another branch, "clk_cleanup_no_really_this_is_the_one"
15:55karolherbst: and might not even apply on latest code
15:55karolherbst: imirkin: I used to use versions, but at some point I arrived at like v9
15:56karolherbst: the minimum I want to see included is thermal throttling before adding a flag like this
15:56karolherbst: just never found the time to really RE this stupid table
15:57imirkin: karolherbst: did you land the "don't crash when trying to reclock when it's runpm-suspended" work?
15:57karolherbst: I think so?
15:58karolherbst: I mean.. I don't really have control over what patches land anyway, so if you track like 10 patches and some never land even I lose interest at some point :/
15:58karolherbst: probably the reason why most of my time I just spend on mesa stuff :p
16:06Svanto: karolherbst: >the minimum I want to see included is thermal throttling
16:06Svanto: Yeah, that is entirely understandable.
16:35Svanto: karolherbst: so the thermal policies table is the main issue that needs to be dealt with?
16:38karolherbst: Svanto: more or less
16:39karolherbst: it's not technically required.. but it's there to protect users from themselves, so even if they turn it on, we make sure the GPU doesn't get too hot
16:39karolherbst: even if we don't do it, the GPU would at some point downclock and even force shutdown
16:39karolherbst: but this way you get less perf and more efficient throttling
16:40karolherbst: ehh.. more perf
16:44imirkin: Svanto: probably just want to get the existing work rebased on something more recent
16:44imirkin: that'd be a great first step
16:45Svanto: >but this way you get less perf and more efficient throttling
16:45Svanto: Ah, I see! Thanks for explaining.
16:46imirkin: Svanto: i sort of assume you're a developer? otherwise that task may be out of reach for you
16:48Svanto: imirkin: Well yes, but my experience with C is limited and my experience with Mesa is nonexistent. I'm interested but it will be a while before I feel qualified enough. But thank you for all your help.
16:48imirkin: you won't be touching mesa at all
16:48imirkin: this is going to be kernel work
16:48imirkin: and everyone starts out with no experience with the kernel, so ... it's been done :)
16:49imirkin: but if you are comfortable working with code, it shouldn't be too difficult to apply the existing work onto the latest
16:49imirkin: if you're not sure what to do about something, feel free to ask
16:50Svanto: I will be sure to! Thank you.
17:22karolherbst: *sigh* :/
17:23karolherbst: why does stuff break even though pushbuffer content doesn't change :/
17:24karolherbst: is there something up with queries I totally miss?
17:27karolherbst: imirkin: soo.. normally if we forget to bind a buffer to the pushbuffer and access it, we would assume there is some kind of read/write error in dmesg, no?
17:35karolherbst: something is strange
17:37imirkin: karolherbst: certainly not necessary
17:37imirkin: you only get a read/write error if it's not in vram
17:37imirkin: might still be in vram :)
17:37imirkin: karolherbst: queries stuff is very sensitive
17:38karolherbst: yeah well..
17:38imirkin: it #define's the pushbuf-no-touch define
17:38karolherbst: I even found something simpler I broke
17:38imirkin: so it doesn't implicitly PUSH_SPACE on every thing
17:38karolherbst: KHR-GL46.texture_repeat_mode.r8_49x23_0_clamp_to_edge even
17:38karolherbst: I am sure I got the fencing right, but now I actually want to move to per context pushbuf...
17:38karolherbst: I even verified what I push out
17:38karolherbst: same commands
17:38karolherbst: same order
17:38karolherbst: fences.. also correct
17:38karolherbst: test still fails
17:39karolherbst: well.. I didn't check if all changes inside CB_DATA are "okay" but...
17:39karolherbst: they look like pointers
17:39karolherbst: maybe I am missing something...
17:40imirkin: CB DATA should generally not have pointers
17:40imirkin: we do use P2MF to upload texture descriptors
17:40imirkin: which *do* have pointers in them
17:41karolherbst: it happens right before setting SAMPLE_LOCATIONS
17:41karolherbst: those are definitely pointers
17:41karolherbst: I think? mhh
17:41imirkin: or floats :)
17:42karolherbst: why would they be different between runs though?
17:42karolherbst: let me take three dumps in total, two from working runs, one from a non working one..
17:42karolherbst: maybe I missed something
17:43imirkin: the only time CB DATA would have pointers is for compute
17:43imirkin: for the UBO bases for UBO's > 7
17:43imirkin: or 6 or whatever
17:43karolherbst: so CB_DATA differs between all runs
17:44imirkin: oh, and SSBO bases too i guess?
17:44karolherbst: weird thing is..
17:44karolherbst: some pointers are equal between both working runs
17:44karolherbst: but different in the broken one
17:44imirkin: that said, the memory allocation tends to be fairly static
17:44imirkin: so it generally doesn't vary between runs of a simple test
17:45karolherbst: ohhhh wait.....
17:45karolherbst: let me try something
17:45karolherbst: the heck
17:46karolherbst: it's not the pushbuf...
17:46karolherbst: it's the client
17:46karolherbst: so.. because of internal libdrm details, ben told me to allocate nouveau_clients per context as well, which do... something
17:46karolherbst: you still use the same channel though
17:47karolherbst: but if you have bos bound to multiple pushbuffers
17:47karolherbst: you kick other pushbuffers as long as they are using the same client
17:47karolherbst: that fpush if inside pushbuf_kref
17:48karolherbst: but why does using a new client break things?
17:49karolherbst: yeah... using the same fixes all regressions
18:00karolherbst: what is this client thing actually doing..
18:02imirkin: karolherbst: client per context means one hw thing per context
18:02imirkin: which we were trying to avoid
18:02karolherbst: it does not
18:02karolherbst: nouveau_client is purely some libdrm thing
18:02imirkin: then don't listen to me
18:18karolherbst: in the working case:
18:18karolherbst: bo 0x7f101c269000 0x6e0000 0x80000
18:18karolherbst: bo 0x7f10273b4000 0x14000 0x1000
18:18karolherbst: broken case:
18:18karolherbst: bo 0x7f3e42782000 0x6e0000 0x80000
18:19karolherbst: that's the TSC bo
18:19karolherbst: I think...
18:19karolherbst: anyway.. it's missing and I guess this is causing issues :)
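The comparison above (which bo shows up in the working run but not in the broken one) can be sketched as a small script. This is only an illustration: the helper names `parse_bos` and `missing_bos` are hypothetical, and it assumes each `bo` line holds a CPU mapping address, then a GPU offset, then a size, which is a guess from the values pasted in the channel rather than a documented format. The CPU address is ignored because it differs between runs.

```python
import re

# Assumed line shape: "bo <cpu address> <gpu offset> <size>", all in hex.
BO_LINE = re.compile(
    r"bo\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)"
)


def parse_bos(dump):
    """Return the set of (gpu_offset, size) pairs found in a dump."""
    bos = set()
    for line in dump.splitlines():
        m = BO_LINE.search(line)
        if m:
            # Group 1 (the CPU mapping) is skipped: it changes from run to run.
            bos.add((int(m.group(2), 16), int(m.group(3), 16)))
    return bos


def missing_bos(working, broken):
    """Return bos present in the working dump but absent from the broken one."""
    return sorted(parse_bos(working) - parse_bos(broken))


working = """bo 0x7f101c269000 0x6e0000 0x80000
bo 0x7f10273b4000 0x14000 0x1000"""
broken = "bo 0x7f3e42782000 0x6e0000 0x80000"

for offset, size in missing_bos(working, broken):
    print(f"missing bo: offset={offset:#x} size={size:#x}")
```

With the two dumps pasted above, this reports the bo at offset 0x14000 with size 0x1000 as the one missing from the broken run.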
18:27karolherbst: imirkin: btw.. you asked for this: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/169
18:27gitlab-bot: Mesa issue (Merge request) 169 in drm "nouveau: print bo address in the GPU/CPU vm and its size" [Nvidia, Opened]
18:27karolherbst: once anything like this lands, I'll push out an updated depushbuf version
18:28karolherbst: ehh wait
18:28karolherbst: offset is 64 bit...
18:28imirkin: karolherbst: tsc is kinda important.
18:28karolherbst: yeah... :D
18:28karolherbst: I guess something is broken in regards to reference screen data
18:36Svanto: Is there an efficient way to search through the logs for this channel?
18:37karolherbst: I doubt it...
18:37karolherbst: but I guess you could download the logs and grep it
18:39imirkin: Svanto: google
18:40imirkin: with site:people.freedesktop.org
18:40Svanto: imirkin: I did try that, but it's difficult to bring up results for nouveau specifically that way.
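For the "download the logs and grep it" route, a minimal sketch of a nick-aware search follows. It assumes downloaded log lines look like the ones in this log (`HH:MM` immediately followed by `nick: message`); the function name `search_log` and the sample lines are made up for illustration, and fetching the logs is left out entirely.

```python
import re

# Assumed line shape: "HH:MM" directly followed by "nick: message".
LOG_LINE = re.compile(r"^(\d{2}:\d{2})\s*([^:\s]+):\s?(.*)$")


def search_log(lines, pattern, nick=None):
    """Return (time, nick, text) tuples whose text matches pattern.

    The match is case-insensitive; pass nick to keep one speaker only.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # not a chat line (blank line, page header, ...)
        time, who, text = m.groups()
        if nick is not None and who != nick:
            continue
        if rx.search(text):
            hits.append((time, who, text))
    return hits


sample = [
    "15:37imirkin: still a thing. but it's called NvFanless",
    "15:38karolherbst: Svanto: it never landed upstream",
]

for hit in search_log(sample, r"nvfanless"):
    print(hit)
```

Filtering by nick as well as by pattern is what a plain web search cannot do easily, for example `search_log(lines, "reclock", nick="karolherbst")`.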
18:41imirkin: is there something concrete you're looking for?
18:49Svanto: imirkin: There are some things that I was curious about and didn't want to waste people's time with
18:49Svanto: But namely it was:
18:49Svanto: - I know that Nouveau is held back by nvidia's refusal to provide the PMU firmware, and I know this is why we cannot control the onboard fan speeds. Are there other functions held back by that inaccessible firmware?
18:49Svanto: - I heard rumors about Vulkan on Nouveau back in 2018. Did it hit a technical roadblock or is it simply an undertaking that there are not yet enough contributors for?
18:49imirkin: Svanto: doing RE for how to do reclocking
18:49imirkin: (is held back by lack of ability to fuzz the vbios, due to signature verification)
18:50Svanto: Ah, thank you very much for the info.
18:50imirkin: Svanto: the vulkan stuff is missing primarily due to laziness. there's a lot of heavy work that needs to be done in the nouveau kernel driver to enable userspace-side VA management, which is semi-required for vulkan
18:51imirkin: skeggsb has been working on it for ages, but i get the feeling that it's not a primary focus for him
19:02Svanto: imirkin: Ohh, I see. Can't call it laziness, a volunteer project is a volunteer project... I'm glad to have this info.
19:02Svanto: BTW by RE you mean reverse engineering, right? Did the documentation nvidia supposedly provided help anyhow?
19:02Svanto: referring to this
19:03imirkin: Svanto: yes, RE = reverse engineering
19:04imirkin: Svanto: the full info is here: https://nvidia.github.io/open-gpu-doc/
19:04imirkin: we're all well aware of it
19:04imirkin: in general they release the info after we've figured out how it works.
19:05imirkin: but they've released zero about reclocking
19:05imirkin: even the stuff that we know how it works :)
19:05imirkin: there's one exception to that, which is that aritger (iirc) made excellent docs available on fermi fan stuff, which we get wrong
19:05Svanto: Ah, I almost thought nvidia was being helpful for a second. My naivete.
19:05imirkin: but no one's had time / motivation to look into it properly
19:06imirkin: Svanto: they do make display docs available, which have been quite helpful actually
19:06imirkin: but those are for controlling the display stuff
19:11Svanto: Ah, that's something then
19:33karolherbst: Svanto: they are, but that's mainly focused on new gens
19:33karolherbst: imirkin: and compute classes
19:33karolherbst: and I think for ampere we have nearly everything
19:34imirkin: karolherbst: they only give the function names for compute
19:34imirkin: not the full manual
19:34imirkin: (and for display too, i guess)
19:35karolherbst: imirkin: we have a lot more
19:35karolherbst: but yeah.. some interesting things for userspace are still missing
19:36karolherbst: imirkin: we even got the nvenc stuff
19:36Santurysim: Can I help with your effort? I have a GK106 card
19:37karolherbst: Santurysim: yes, just depends on what you want and can do
19:40karolherbst: imirkin: https://github.com/NVIDIA/open-gpu-doc/blob/master/classes/video/nvdec_drv.h
19:40karolherbst: sadly the class docs we have are... volta+?
19:40imirkin: karolherbst: i saw those.
19:41karolherbst: ahh no.. turing+
19:41imirkin: karolherbst: any will to do video stuff has long-ago been sucked out of me though
19:41karolherbst: and I have more important things to work on :/
19:41karolherbst: also.. firmware
19:41imirkin: that's the main thing :)
19:42imirkin: the video stuff is too finicky
19:42imirkin: too many pieces have to be perfectly in place for anything at all to work
19:42imirkin: and then it still weirdly fails
19:42imirkin: and i'm not enough of a h.264 expert to work out what's up
19:42karolherbst: and is totally broken in regards to multithreading anyway :)
19:43karolherbst: at least after figuring out the nouveau_client issue, I have a branch actually fixing it without deadlocks :D
19:44imirkin: karolherbst: multithreading only matters if you're not a total screwup in single-threaded ;)
19:45karolherbst: you mean regressions or generally?
19:48karolherbst: imirkin: have you ever looked at this threaded context stuff?
19:55Santurysim: karolherbst: I want a nice gpu driver. Also I want to learn how the GPU subsystem works in Linux. I can try to collect data for you
20:27karolherbst: imirkin: ehh.. I was mistaken.. it wasn't the tls bo.. it was the fence one...
20:27karolherbst: which is surprising...
20:28karolherbst: *sigh* I see it now
21:35Svanto: karolherbst: >I think for ampere we have nearly everything
21:35Svanto: Even PMU firmware and other signed firmware?
21:36karolherbst: I meant docs
21:36karolherbst: register names
21:37Svanto: Ohh I see, thank you
21:41imirkin: karolherbst: have they released 3d names for anything?
21:41karolherbst: I don't think so
21:42karolherbst: it's also the biggest one
21:42karolherbst: so.. it's taking more time
21:42karolherbst: I don't even know if they are working on it, but I would guess so
21:42karolherbst: it's just... huge
21:42imirkin: pretty sure it's not some dude who's typing it up line by line...
21:42imirkin: they already have the doc
21:44RSpliet: It might be hand-vetted by legal line by line
21:44RSpliet: But this better be part-generated doxygen-style
21:45imirkin: RSpliet: nah, their standard docs which strip out all the text
21:45imirkin: and so it's just reg + bitfield names
21:45karolherbst: RSpliet: yep
21:45imirkin: which is still better than nothing
21:45karolherbst: imirkin: nope
21:45karolherbst: they have legal checking them
21:45RSpliet: imirkin: fairly sure they also strip out some of the bitfields and regs
21:45imirkin: RSpliet: harder to tell
21:45imirkin: much easier to tell that all the descriptive docs are gone :)
21:46karolherbst: imirkin: that's true, but they have legal checking the docs :D
21:46karolherbst: I am not kidding
21:46RSpliet: imirkin: recall coming across regs in envydocs that weren't in the NV-published docs
21:46RSpliet: envydocs? envytools!
21:47karolherbst: docs for new gens are easier, because you can check the diff
21:47imirkin: RSpliet: and probably the nvc0 TFB_UNFUCKUP_OFFSET_QUERY is not the official register name?
21:47RSpliet: We may never know :-D
21:48imirkin: it _could_ be the official name...
21:48imirkin: it's just esp funny that offset queries were fucked up, and i had no idea why
21:48imirkin: and then i started looking at traces
21:48imirkin: and there was, annotated in the mmiotrace, a write to that register :)
21:55Santurysim: Why not use swear words in classified documentation? :D
21:55imirkin: [and sure enough, writing a 1 to that reg fixed everything]
22:34Svanto: imirkin: >and probably the nvc0 TFB_UNFUCKUP_OFFSET_QUERY is not the official register name?
22:34Svanto: Perhaps the reason nvidia won't open source the drivers is to bar us from seeing into the dwindling sanity of their programmers :-)
22:36RSpliet: that, and patent(troll)s
23:04karolherbst: imirkin: does it actually make a difference how often and when we push?
23:05karolherbst: like are there any good reasons when we can't split up a pushbuffer?
23:06karolherbst: that doesn't seem to be the issue either
23:09karolherbst: maybe I am just overlooking something somewhere.. mhhh
23:38imirkin: karolherbst: each push takes up an ib ring slot