07:58 karolherbst: skeggsb: saw that "fb: 0 MiB DDR3" bug coming up on a few machines?
10:33 RSpliet: karolherbst: a few patches have disappeared from the tip of Ben's tree. One I recall being related to detecting the correct amount of DRAM
10:34 karolherbst: RSpliet: yeah... I kind of remember that something happened regarding this, but I forgot all the details
11:49 karolherbst: does anybody have access to a gm108 and could create an mmiotrace for me?
13:40 dboyan: hakzsam: I came up with a dirty hack to fix percentage values for AMD_perfmon in nouveau.
13:40 dboyan: hakzsam: https://hastebin.com/asajajivup.cs
13:41 dboyan: hakzsam: I don't think it'll break hud, but the code is not that clean for now.
13:41 tagr: RSpliet: no disagreement here =)
21:11 Lyude: mupuf: alright, looking at the clockgating stuff again. It looks like all that was really left to do was testing, am I correct? (don't see any responses to my v3 patch)
21:11 mupuf: oh, so sorry I missed your v3!
21:12 karolherbst: there was a v3? :/
21:12 Lyude: yeah
21:12 mupuf: where the heck did you send it :D?
21:12 Lyude: nouveau@lists.freedesktop.org
21:12 Lyude: i can resend if you want
21:12 karolherbst: I think it got lost
21:12 karolherbst: I don't see it on the list
21:12 mupuf: yeah, I can't see it either
21:13 mupuf: I don't even see the v2
21:13 Lyude: alright, lemme rebase it and i'll resend in a bit
21:13 karolherbst: mupuf: true
21:14 mupuf: ah! It was in my dri-devel folder
21:14 karolherbst: ohhh
21:15 karolherbst: mupuf: was the nouveau list in CC or anything?
21:15 mupuf: yes
21:15 Lyude: ah cool, so we're good? (shouldn't need a rebase, since it rebased cleanly here)
21:15 karolherbst: mhhh
21:15 karolherbst: mupuf: super odd
21:15 karolherbst: mupuf: date?
21:15 mupuf: 27/04
21:16 karolherbst: ohhh
21:16 karolherbst: now I see it
21:16 mupuf: Lyude: you forgot about setting all the SLPC and ELCG regs
21:17 mupuf: there are quite a few of them
21:17 Lyude: mupuf: huh, alright
21:17 karolherbst: ohh right those
21:20 karolherbst: Lyude: that "if (!ret) nvkm_therm_clkgate_" part is a bit tricky... I wouldn't like to see it depending on the return code really. Usually !ret means something is wrong and we shouldn't continue by enabling more stuff
21:20 karolherbst: let me read the entire source though
21:21 karolherbst: ohh. I am silly
21:22 karolherbst: I shouldn't do reviews when I am super tired
21:33 Lyude: btw mupuf, where are the rnndb entries for the registers you're talking about?
21:36 mupuf: BLCG and SLCG
21:36 mupuf: that's the names I was looking for
21:36 mupuf: HW_CGBLK*
21:36 karolherbst: Lyude: now a real comment: if the clkgate.c file won't get bigger with the adjustments mupuf mentioned, you could just put nvkm_therm_clkgate_engine into base.c
21:37 Lyude: yeah, I was thinking that
21:37 Lyude: such a small lonely file
21:37 mupuf: karolherbst: it won't, what I am talking about is a one-time write and should be part of the engine/subdev init
21:37 karolherbst: yeah, most likely
21:37 mupuf: Lyude: https://android.googlesource.com/kernel/tegra/+/b445e5296764d18861a6450f6851f25b9ca59dee/drivers/video/tegra/host/gk20a/hw_gr_gk20a.h may contain information about these regs BTW
21:38 mupuf: Lyude: take mmiotraces from the vbios repo, compare them
21:38 mupuf: and make the same writes
21:38 karolherbst: Lyude: usually we put such dispatch calls always into base.c even if there is more real stuff later one
21:38 karolherbst: *on
21:38 karolherbst: and usually also do pointer checks
21:38 Lyude: holy crap what is with all of these inline functions
21:38 karolherbst: if (therm && therm->func->clkgate_engine)
21:39 mupuf: Lyude: isn't life amazing? :D
21:39 karolherbst: Lyude: it's called from outside therm, so therm could be NULL actually
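
A minimal sketch of the pointer check karolherbst suggests for a dispatch call that lives outside the subdev. The struct and function names below (`therm_sketch`, `therm_func_sketch`, `therm_clkgate_engine_sketch`) are hypothetical stand-ins for illustration, not the actual nvkm types or the code from the patch under review.

```c
#include <stddef.h>

struct therm_sketch;

/* Hypothetical per-chipset function table. */
struct therm_func_sketch {
	/* Optional hook: NULL when the chipset has no clockgating support. */
	void (*clkgate_engine)(struct therm_sketch *therm, int engine, int enable);
};

/* Hypothetical subdev object; `calls` only exists to make the sketch observable. */
struct therm_sketch {
	const struct therm_func_sketch *func;
	int calls;
};

/* Example hook that just counts how often it was invoked. */
void clkgate_engine_hook_sketch(struct therm_sketch *therm, int engine, int enable)
{
	(void)engine;
	(void)enable;
	therm->calls++;
}

/*
 * Dispatch wrapper of the kind that would sit in base.c.
 * Callers outside therm may pass NULL, and the hook itself may be
 * missing, so both are checked before the call.
 * Returns 1 if the hook ran, 0 if the call was skipped.
 */
int therm_clkgate_engine_sketch(struct therm_sketch *therm, int engine, int enable)
{
	if (therm && therm->func->clkgate_engine) {
		therm->func->clkgate_engine(therm, engine, enable);
		return 1;
	}
	return 0;
}
```

The return value is only there so the skipped case is visible; a real wrapper could just as well return void.
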
21:39 Lyude: mupuf: this is tegra e.g. not nouveau right
21:39 mupuf: Lyude: of course :D
21:39 karolherbst: yes, those are fun
21:39 Lyude: god damn i didn't realize nvidia was that bad
21:39 mupuf: It is nvgpu
21:39 Lyude: amd has some competition in that field it seems
21:39 karolherbst: I think this is generated code?
21:39 Lyude: ooh, that would make more sense
21:40 karolherbst: yeah, AMD is like 90% headers and 10% source :D
21:40 karolherbst: ;)
21:40 Lyude: ehhhh, i dunno about that one. there's a -lot- of source
21:40 Lyude: it's just, not good source
21:40 mupuf: ah ah
21:40 mupuf: of course it is auto-generated :D
21:40 Lyude: i have more irq cleanups for cik and r600 on the way and i think i probably removed another 1000 lines from their driver
21:40 mupuf: _f == field
21:40 karolherbst: well, maybe it's 10% after removing all that duplicated code?
21:40 mupuf: _v == default value
21:41 Lyude: karolherbst: yeah that sounds about right :P
21:41 karolherbst: although that was a pure guess, is there a lot of duplicated code? :D
21:41 Lyude: oh my god yes there is
21:42 karolherbst: figures
21:43 Lyude: karolherbst: if you are curious how much duplicated code btw, just look at some of the diffs for the commits from here https://github.com/Lyude/linux/tree/wip/radeon-cik-irq-cleanup-v1
21:47 karolherbst: oh crap
21:49 Lyude: yeah, it's -bad-
21:53 karolherbst: I hope we get access to Dawn of War fast, so that I can try it out :)
22:06 karolherbst: mhhh
22:06 karolherbst: I get notification spamming from the PMU
22:07 karolherbst: when I set min/max thresholds and have something running at full load, I can't really tell the PMU to say: please don't notify the host on high loads anymore
22:07 karolherbst: I also need to add some kind of order, like if one domain is between min/max and everything is below min, there is no point in poking the host to reclock as well
22:08 karolherbst: I think I already solved that problem with my old PMU code
22:09 karolherbst: mupuf: by the way, do you mind taking a look at my most recent pmu counter series?
22:10 mupuf: karolherbst: yes, you did solve this before: send one IRQ and nothing else until it gets acked
22:10 karolherbst: no, it's a different issue
22:10 karolherbst: after the PMU pokes the host, it sets max to 0xff and min to 0x0, so it won't send anything, but I was always acking now... I was talking about a different issue though
22:11 karolherbst: but basically I know now how to fix it
22:12 karolherbst: I make the thresholds 16 bit so that I can set max to 0x100
22:13 karolherbst: and max gets set to 0x100 if the domain has the highest clock, min to 0x0 on lowest clock
22:13 karolherbst: and if one domain is between min/max and every other below min, the host won't get poked, because there is no reason to
22:14 karolherbst: notifications only when at least one domain is above max or every domain is below min
22:16 mupuf: hmm, but how does this work with the cstate then?
22:16 karolherbst: what do you mean?
22:17 mupuf: you never lower the cstate when it is too high, unless all the other domains are below min?
22:17 karolherbst: mhhhh
22:18 karolherbst: crap
22:19 karolherbst: mupuf: imagine that video engines are also clocked by cstates or something else. Then you would have it always at load 0x0, but if the GPU renders something, the core domain is always up
22:20 karolherbst: so the PMU would spam with IRQs, because the video engines could be downclocked
22:20 mupuf: no, vdec engines are not part of the cstate
22:20 mupuf: there is only one clock, dude ;)
22:20 karolherbst: yeah I know, but those are still at 0 load
22:20 mupuf: sure, and?
22:20 karolherbst: and we can't really downclock pstates either if we have to set high cstates
22:20 mupuf: yep
22:21 karolherbst: and if I set the core domain to min 0x70 and max 0x100, the load varies between 0xe0-0xf0 and video at 0x0, should the PMU send IRQs?
22:22 mupuf: dude, can you first tell what these numbers mean :D?
22:22 karolherbst: on some GPUs, highest cstates are only enabled on pstate 0xd+ and 0xa doesn't allow the highest ones
22:22 mupuf: yep
22:22 karolherbst: mupuf: loads are represtended between 0x0 and 0xff
22:22 karolherbst: *represented
22:22 karolherbst: *as values
22:23 karolherbst: min/max are simply thresholds when the PMU should poke the host, so if the load drops below min, it should notify the host about that
22:23 mupuf: basically, for the core domain, only care about the cstate. You just change the pstate up and down based on the rest of the domains
22:23 karolherbst: well, this won't work
22:23 karolherbst: I need to care about pstates as well for core domain
22:23 mupuf: basically: pstate = max(vdec.wanted_pstate, mem.wanted_pstate, core.wanted_pstate, etc...)
22:24 mupuf: no, you don't. It just comes as a dependency of the cstate
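
mupuf's rule above can be sketched as a plain maximum over what each domain asks for. This is only an illustration: `wanted_pstate_sketch` is a hypothetical helper, and it assumes that a numerically larger pstate id means a higher performance level and that the core entry is whatever pstate the chosen cstate requires.

```c
#include <stddef.h>

/*
 * Hypothetical helper: pick the pstate as the maximum of the pstates
 * wanted by each domain (vdec, mem, core, ...).
 * Assumption: a larger id is a higher performance level.
 * Returns 0 when no domain is listed.
 */
int wanted_pstate_sketch(const int *wanted, size_t n)
{
	int pstate = 0;

	for (size_t i = 0; i < n; i++) {
		if (wanted[i] > pstate)
			pstate = wanted[i];
	}
	return pstate;
}
```

With this shape the core domain never needs its own pstate logic: a high cstate simply contributes a high wanted pstate to the list.
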
22:24 karolherbst: mhh, okay, true
22:24 karolherbst: but this isn't the issue I am talking about though
22:24 mupuf: you sure about that? :D
22:24 karolherbst: the host side is not interesting here really
22:25 karolherbst: I am more concerned about when the PMU should send IRQs
22:25 mupuf: well, what you want is to let the PMU tell you when a reclock needs to happen
22:25 karolherbst: yes
22:25 karolherbst: but I also want to prevent IRQs spamming just because one domain has a load of 0
22:26 karolherbst: so I need a way to tell the PMU from the host: don't bother to notify me, just because the video engines have nothing to do
22:26 mupuf: remember what I wrote above
22:26 mupuf: pstate = max(vdec.wanted_pstate, mem.wanted_pstate, core.wanted_pstate, etc...)
22:27 karolherbst: yeah, the PMU doesn't know about the relationships
22:27 mupuf: so, an irq should be fired when all the loads are low-enough
22:27 mupuf: (well, aside from the cstate, let's keep it on the side for now)
22:27 karolherbst: mhh
22:27 karolherbst: I need to be able to set some flags for the counter slots then
22:27 karolherbst: from the host side
22:28 mupuf: you need to be able to set the min thresholds of all counters, yes
22:28 mupuf: and only when all are under, then you fire an irq
22:28 karolherbst: mhhh
22:28 karolherbst: this would be a good way
22:28 mupuf: that will work for all the domains but the cstate
22:29 mupuf: as for upclocking, any domain above a certain load needs to trigger an IRQ too
22:29 mupuf: when you are at the highest, set the threshold to 0x100, like you already do
22:29 mupuf: but there is one more thing missing from your code
22:29 mupuf: downclocking should only happen if all domains are under a threshold for a certain time
22:30 karolherbst: yeah
22:30 mupuf: you can't be downclocking at 10Hz
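
The notification rule discussed so far could be prototyped on the host along these lines. It is a sketch under stated assumptions, not the PMU firmware or any existing nouveau code: all names are hypothetical, loads are 8-bit values, thresholds are 16-bit so that a max of 0x100 can never be exceeded, "below min" is a strict comparison, and the hold time is counted in samples rather than in wall-clock time.

```c
#include <stddef.h>
#include <stdint.h>

enum notify_sketch { NOTIFY_NONE, NOTIFY_UP, NOTIFY_DOWN };

/* One load counter slot (hypothetical layout). */
struct domain_sketch {
	uint8_t load;   /* current load, 0x00..0xff */
	uint16_t min;   /* downclock threshold */
	uint16_t max;   /* upclock threshold; 0x100 can never be exceeded */
};

struct monitor_sketch {
	unsigned low_samples;   /* consecutive samples with every domain below min */
	unsigned hold_samples;  /* samples required before asking for a downclock */
};

/*
 * Evaluate one sample of all domains:
 * - any domain above its max asks for an upclock right away;
 * - a downclock is only requested once every domain has stayed below
 *   its min for hold_samples consecutive samples;
 * - anything else produces no notification.
 */
enum notify_sketch monitor_sample_sketch(struct monitor_sketch *m,
					 const struct domain_sketch *d, size_t n)
{
	int all_low = (n > 0);

	for (size_t i = 0; i < n; i++) {
		if (d[i].load > d[i].max) {
			m->low_samples = 0;
			return NOTIFY_UP;
		}
		if (d[i].load >= d[i].min)
			all_low = 0;
	}

	if (!all_low) {
		m->low_samples = 0;
		return NOTIFY_NONE;
	}

	if (++m->low_samples >= m->hold_samples) {
		m->low_samples = 0;
		return NOTIFY_DOWN;
	}
	return NOTIFY_NONE;
}
```

Under these assumptions, a core domain with min 0x70 and max 0x100 running at load 0xe0 next to an idle video domain produces no notification, because not every domain is below its min and none is above its max.
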
22:30 karolherbst: I know, I was just rewriting all the code from scratch basically
22:30 mupuf: but seriously, stop with the PMU for now and do it on the kernel side, if not in the userspace
22:30 mupuf: make it work, then put it in the PMU
22:31 karolherbst: what do you mean now? I don't really do anything on the PMU except checking the thresholds
22:31 mupuf: do nothing there
22:31 mupuf: prove your algorithm first
22:31 mupuf: in the userspace
22:31 karolherbst: well the PMU reads out the counter slots
22:31 mupuf: you don't have to do it there, you are just making it harder for you
22:31 mupuf: and it is just a way to avoid doing the real work: coming up with good algorithms :D
22:33 mupuf: with that being said, I will go to bed, and review your code when possible!
22:33 karolherbst: k
22:33 karolherbst: the code doesn't contain any of that anyway
22:33 karolherbst: it's just the part to provide the loads
22:33 mupuf: I know :)
22:33 karolherbst: okay
22:33 mupuf: and that is nice
22:34 karolherbst: yeah I guess I should just do more real research and concentrate on implementing later when I know what I need for sure
22:35 karolherbst: good night
22:36 mupuf: thanks, same to you!
22:36 mupuf: I really need to get back to all this. So much fun (and frustration) to be had!
22:36 karolherbst: :)
23:13 pmoreau: Maybe we should have a Trello team, so that anyone in the team can access all the boards, without needing to add people manually to each board.
23:14 pmoreau: Plus it would be easy to see all the team’s boards without pinning/star'ing them all. :-)
23:14 pmoreau: I had completely forgotten about that CTS board
23:24 pmoreau: imirkin: I created a public Trello team and added you as an admin. Could you please change the CTS board from public to team, and pick the Nouveau team? If it works fine, and remains public, I’ll add everyone to the team and switch the main one to the team.
23:32 pmoreau: imirkin: Looks like you can simply go to Settings -> Change Team and select Nouveau.
23:38 imirkin: pmoreau: iirc that's a non-free feature...
23:38 pmoreau: To have public boards in a team?
23:39 imirkin: to have teams at all... i guess we'll see
23:39 imirkin: anyways, changed
23:39 pmoreau: Seems to work
23:40 imirkin: maybe it's to have private + team
23:40 pmoreau: Teams are free, but you have some limitations to them in that you can't change many settings (https://trello.com/nouveau8/account)
23:40 imirkin: i see
23:42 imirkin: Lyude: in case you want to add more fun GM200+ features, ARB_sample_locations should be *relatively* straightforward, but i suspect it's a lot of typing.
23:43 Lyude: imirkin: sure, probably going to see if I can finish up clockgating first though
23:43 pmoreau: I think I added everyone from the board to the team.
23:43 imirkin: Lyude: whatever you like to work on. just pointing things out that are semi-similar to other things you did
23:43 Lyude: ahh, cool
23:44 imirkin: Lyude: i have no agenda here... it'd be nice to work on useful things, but fun things tend to rule the day
23:44 Lyude: hehe
23:46 imirkin: pmoreau: thanks for setting the team thing up - that's def a good idea.
23:47 pmoreau: Cool, the URL of the boards did not change after adding it to the team! :-)
23:47 imirkin: always a nice little bonus.
23:47 pmoreau: imirkin: You are welcome! That way it will be easier to add someone to the boards, and track all the boards we create.
23:48 imirkin: yeah. i think that the "nouveau" board has been getting a lot of unrelated lists, which perhaps make it harder to use
23:48 pmoreau: I do not remember if Karol created a separate board for falcon… I’ll have to ask him tomorrow.
23:48 pmoreau: True. On the other hand, I am not sure how I would split things up.
23:49 pmoreau: One thing I would change, would be the "OpenGL compiler" list, to "NVIR compiler", or just compiler, as it applies to OpenCL and CUDA as well. :-)
23:51 pmoreau: Maybe a board for the compiler which is shared across APIs, one for the kernel and another for the DDX, and then one for each graphic API?
23:51 pmoreau: And if a card should be categorised as "compiler" rather than "OpenGL", it is very easy to move the card between boards without losing any information.
23:52 imirkin: pmoreau: go for it
23:52 imirkin: er
23:52 imirkin: OpenGL compiler -> NVIR compiler
23:53 imirkin: but yeah... some of this stuff can be reorganized a bit
23:53 imirkin: the opengl was there more to differentiate it from, say, kernel
23:53 imirkin: or dd
23:53 imirkin: ddx
23:53 pmoreau: imirkin: I think I’ll go for my bed first, let the night pass, see the reactions tomorrow morning, and then do something. ;-)
23:53 pmoreau: Makes sense
23:58 Lyude: btw mupuf, in that hw_gr_gk20a.h file you showed me do you have any idea what l1c is?
23:58 skeggsb: level 1 cache, probably
23:58 skeggsb: ltc == l2c == level 2 cache, so, that'd be my guess
23:59 Lyude: oh yeah, that seems to make sense here