01:15 i509vcb[d]: If you can somehow determine whether userspace gives you a tegra, you could do a different way of generating the uuid kind of like what I copied from another driver since a system will only have 1 tegra gpu typically:
01:15 i509vcb[d]: https://gitlab.freedesktop.org/asahi/mesa/-/blob/main/src/asahi/lib/agx_device.c?ref_type=heads#L756-780
09:27 ahuillet[d]: karolherbst[d]: I haven't looked, I wasn't clear on whether you wanted me to look or not :)
09:28 ahuillet[d]: oh actually you were pretty clear I just didn't notice due to being swamped in other things. I'll see what I can find now.
09:56 ahuillet[d]: karolherbst[d]: we use https://github.com/NVIDIA/open-gpu-kernel-modules/blob/main/src/common/sdk/nvidia/inc/ctrl/ctrl0000/ctrl0000gpu.h#L586 for the Vulkan GPU UUID
10:01 ahuillet[d]: which seems to go in here: https://github.com/NVIDIA/open-gpu-kernel-modules/blob/448d5cc65624d3aa69015efa0d3fb50fd9729f41/src/nvidia/src/kernel/platform/chipset/pci_pbi.c#L259
10:01 ahuillet[d]: I don't know what PCI PBI is but it sounds like it might be some special interface in PCI to talk to the device? so you're actually asking the physical HW for its UUID
10:02 ahuillet[d]: Google is useless on the topic as seems to be standard these days
10:04 ahuillet[d]: ahuillet[d]: I guess you'd want to copy paste this in nouveau.ko or similar. FWIW PBI seems to be an NVIDIA term, stands for "post box interface" and according to internal docs seems to be a way to talk to the VBIOS. I can probably share more details, just not here (NDA)
10:05 ahuillet[d]: at any rate the open source code shows you how to talk to it which is probably enough for your purposes?
10:43 magic_rb[d]: Random again: I want to thank everyone involved in the nvidia open kernel driver. I haven't been able to game on my laptop since nouveau is not there yet and the fully proprietary driver lasted around 20 minutes before segfaulting either inside X or the kernel. But with the open one, works great, so thanks!
11:35 karolherbst[d]: ahuillet[d]: right... I think I'd still need to dig into how nvgpu is doing all of this, just to be sure we get it right on every gpu
11:38 karolherbst[d]: and also what to do pre GSP
11:38 ahuillet[d]: that seems unrelated to GSP
11:38 ahuillet[d]: the stuff I linked is all open source, not executed by GSP
11:39 karolherbst[d]: isn't `NV0000_CTRL_CMD_GPU_GET_UUID_FROM_GPU_ID` a GSP RPC?
11:40 ahuillet[d]: it's an RM ctrl, but it seems to be implemented on the CPU
11:40 karolherbst[d]: ahh
11:40 ahuillet[d]: ahuillet[d]: hence why this is open source
11:40 ahuillet[d]: basically - PCIe tricks to get the UUID from vbios, which you can do from nouveau.ko without involving GSP
11:45 karolherbst[d]: yeah...
14:32 gfxstrand[d]: Ah, yes. Another bug which comes down to me working around codegen to pass CTS tests when NAK doesn't need the workaround and works just fine. :facepalm:
14:34 karolherbst[d]: oh no
14:35 karolherbst[d]: working on kepler or what are you up to? 😄
14:35 karolherbst[d]: I thought Maxwell is already NAK only?
14:37 gfxstrand[d]: VK_EXT_post_depth_coverage
14:38 gfxstrand[d]: It wasn't working because of a workaround in NVK to work around codegen's workaround for some unknown ancient bit of hardware.
14:38 gfxstrand[d]: https://tenor.com/view/blocks-wrecked-fail-fall-jenga-gif-15311832
14:38 karolherbst[d]: oh no
14:39 gfxstrand[d]: I'm doing a full CTS run on Ampere and then I'll pull out my 750 Ti and make sure that NAK is fine on Maxwell A before I merge.
14:39 gfxstrand[d]: I won't do a full CTS run but I should be able to run the gl_SampleMaskIn tests and that should be enough.
14:41 gfxstrand[d]: I have no idea what HW codegen was working around
14:41 karolherbst[d]: anyway.. I'm considering using the `num_enum` crate, would you have any use of it?
14:41 gfxstrand[d]: In particular, codegen was replacing `gl_SampleMaskIn[]` by `1 << gl_SampleID` which is fine (sort-of) as long as you force per-sample shading whenever you read `gl_SampleMaskIn[]`.
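A minimal sketch of why that substitution only holds with per-sample shading. The function names and the `invocation_samples` parameter are made up for illustration; this is a toy model, not NVK or codegen code.

```rust
/// Mask produced by the substitution described above: only the bit of the
/// sample this invocation is shading.
fn substituted_mask(sample_id: u32) -> u32 {
    1u32 << sample_id
}

/// Toy model of the expected mask: the covered samples that belong to this
/// invocation. `invocation_samples` is a hypothetical bitmask naming the
/// samples assigned to the invocation.
fn expected_mask(coverage: u32, invocation_samples: u32) -> u32 {
    coverage & invocation_samples
}
```

With one invocation per sample, `invocation_samples` has a single bit set and both functions agree for a covered sample. With one invocation per pixel, the expected mask can have several bits set, which a single `1 << gl_SampleID` bit cannot represent.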
14:42 gfxstrand[d]: karolherbst[d]: Yeah, I've been eyeing it for a while.
14:42 gfxstrand[d]: I've got multiple things that would benefit from it
14:42 karolherbst[d]: it would help me simplify (de)serialization
14:42 gfxstrand[d]: `nak::ir::RegFile` being the big one
14:42 gfxstrand[d]: I've also thought about just re-typing it as my own proc macro because it's not hard.
14:43 karolherbst[d]: yeah.. but..
14:43 karolherbst[d]: do we have to wrap "dev-dependencies" as well or just the normal ones?
14:43 gfxstrand[d]: All of them.
14:43 karolherbst[d]: mhh...
14:44 gfxstrand[d]: And things with proc macros are extra painful because you have to deal with cross builds
14:44 gfxstrand[d]: Though maybe meson does that automatically now
14:44 karolherbst[d]: annoying
14:45 gfxstrand[d]: But yes I would use it if you pulled it in
14:45 gfxstrand[d]: I've just been trying to keep the crates down until we no longer need to mess about with wraps
14:46 karolherbst[d]: yeah....
14:46 karolherbst[d]: `num_enum` pulls in quite a lot of dev deps
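For context, a sketch of the hand-written conversions that derive macros such as `num_enum`'s `TryFromPrimitive` and `IntoPrimitive` are meant to replace. `DemoRegFile`, its variants, and their values are assumptions for illustration only, not the real `nak::ir::RegFile` definition.

```rust
use std::convert::TryFrom;

/// Illustrative enum; variant names and discriminants are assumptions.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DemoRegFile {
    Gpr = 0,
    Pred = 1,
    Carry = 2,
}

/// Enum -> integer is a plain cast on a `#[repr(u8)]` enum.
impl From<DemoRegFile> for u8 {
    fn from(file: DemoRegFile) -> u8 {
        file as u8
    }
}

/// Integer -> enum has to be fallible; the unknown value is returned as the error.
impl TryFrom<u8> for DemoRegFile {
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(DemoRegFile::Gpr),
            1 => Ok(DemoRegFile::Pred),
            2 => Ok(DemoRegFile::Carry),
            other => Err(other),
        }
    }
}
```

The `match` has to be kept in sync with the enum by hand, which is the part a derive (or a small custom proc macro) would take over.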
14:49 asdqueerfromeu[d]: gfxstrand[d]: codegen-ception :nouveau:
14:53 gfxstrand[d]: Also, the CTS test is kinda bad.
14:53 gfxstrand[d]: Bugs all the way down.
15:28 phomes_[d][d]: I am glad that post_depth_coverage finally gets fixed 🙂 I looked for the problem in the wrong place. At least I learned a lot about shader SPHs 🙂
15:35 gfxstrand[d]: Well, it's not fixed yet
15:35 gfxstrand[d]: Getting rid of the hack leads to other fails. 😭
15:38 karolherbst[d]: makes sense
15:38 gfxstrand[d]: The blob uses `pixld.covmask` for `gl_SampleMaskIn[]` but doing that fails CTS tests because it contains the coverage mask for the whole pixel, not what's executing in the FS.
15:39 gfxstrand[d]: IDK how to get the set of samples corresponding to the current FS invocation
15:39 gfxstrand[d]: Or if there's some API bit I can whack to change what `pixld.covmask` reports
15:40 karolherbst[d]: is it doing anything else in the shader?
15:40 karolherbst[d]: though let me check if I know more
15:41 karolherbst[d]: ahhh
15:41 karolherbst[d]: found it
15:41 karolherbst[d]: gfxstrand[d]: yeah, there is a flag to modify how covmask behaves
15:41 gfxstrand[d]: Ah
15:41 gfxstrand[d]: which flag?
15:41 karolherbst[d]: raster multisample mode
15:42 gfxstrand[d]: Is that an SPH thing? Because I don't see anything named that in the headers
15:43 karolherbst[d]: not sure
15:43 gfxstrand[d]: I don't see anything in the RE'd headers either
15:44 karolherbst[d]: `NVC797_SET_ANTI_ALIAS_RASTER` maybe?
15:45 karolherbst[d]: anyway, number of valid bits depends on the sample count
15:46 karolherbst[d]: but anyway.. there could be more to it anyway
15:54 gfxstrand[d]: clearly
15:58 gfxstrand[d]: There's also SET_POST_PS_INITIAL_COVERAGE which is new on MaxwellB
16:04 gfxstrand[d]: Maybe I should dump the blob command stream for one of these CTS tests
16:07 karolherbst[d]: fyi, the RID_MODE stuff we've talked about a few weeks ago got coded up after I dropped a comment in an old AMD bug :ferrisUpsideDown: https://patchwork.kernel.org/project/dri-devel/patch/20240731214941.257975-1-hamza.mahfooz@amd.com/
16:08 karolherbst[d]: apparently they considered adding edid quirks as well
16:09 gfxstrand[d]: Also, SAMPLE_MASK_RASTER_OUT
16:09 karolherbst[d]: should fix those bugs with those ultra widescreen displays not getting the correct mode
16:10 karolherbst[d]: sadly I forgot what triggered it all
16:23 gfxstrand[d]: SET_POST_Z_PS_IMASK
16:23 gfxstrand[d]: Maybe?
16:24 gfxstrand[d]: No, that's just post-depth-coverage
16:31 gfxstrand[d]: Ugh... None of the new regs on Maxwell seem to affect this
16:37 gfxstrand[d]: karolherbst[d]: So there is a SET_RASTER_SAMPLES_MODE but it errors out every time I try to use it
16:37 karolherbst[d]: mhhh...
16:37 karolherbst[d]: where is it?
16:38 karolherbst[d]: or do you have a typo in the name, because I can't find it in the doc repo
16:38 gfxstrand[d]: That's what it's called in the header
16:38 karolherbst[d]: huh..
16:38 gfxstrand[d]: SET_ANTI_ALIAS_RASTER_SAMPLES_MODE
16:39 karolherbst[d]: ahh
16:39 karolherbst[d]: right, the thing I've pointed out above
16:39 karolherbst[d]: in what sense does it error?
16:39 gfxstrand[d]: The blob sets that to the sample count and then sets `SET_ANTI_ALIAS` to `MODE_1x1` which makes no sense
16:39 karolherbst[d]: the GPU complains?
16:40 gfxstrand[d]: [152613.948664] nouveau 0000:65:00.0: gsp: rc engn:00000001 chid:24 type:69 scope:1 part:233
16:40 gfxstrand[d]: [152613.948677] nouveau 0000:65:00.0: fifo:c00000:0003:0018:[deqp-vk[47980]] errored - disabling channel
16:40 gfxstrand[d]: [152613.948685] nouveau 0000:65:00.0: deqp-vk[47980]: channel 24 killed!
16:40 gfxstrand[d]: And this is ampere so the errors are useless
16:40 karolherbst[d]: annoying
16:40 karolherbst[d]: there is `NVCB97_SET_ANTI_ALIAS_ENABLE` and `NVCB97_SET_ANTI_ALIAS_SAMPLE_POSITIONS` as well
16:41 karolherbst[d]: though I think you set those
16:42 marysaka[d]: Yeah those should be set
16:42 gfxstrand[d]: Yeah, we set those
16:43 gfxstrand[d]: gfxstrand[d]: I guess I could swap back to my Maxwell B where I have errors. :frog_upside_down:
16:43 marysaka[d]: I remember last time I tried to get post depth coverage working the only real diff I was seeing in the command stream was just SET_ANTI_ALIAS using D3D variant compared to us
16:43 marysaka[d]: (So probably unrelated?)
16:49 kestrel_: Hi! I'm having trouble getting nouveau to work on 940m, the driver keeps crashing. I've already reported this at #372. Is there anything else I could attach to the issue that would be useful?
16:50 gfxstrand[d]: gfxstrand[d]: I'm going to do that. I doubt it'll actually help but maybe?
16:52 karolherbst[d]: kestrel_: ohh, I forgot to check the logs with your report, sorry for that
16:52 karolherbst[d]: something goes wrong with loading the GPU firmware it seems...
16:56 karolherbst[d]: skeggsb9778[d]: mind taking a look at this? `nouveau 0000:04:00.0: fifo: fault 01 [WRITE] at 0000000000160000 engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 0c [UNSUPPORTED_KIND] on channel -1 [007fd38000 unknown]` is kinda weird, never saw that one. https://gitlab.freedesktop.org/drm/nouveau/-/issues/372
16:57 karolherbst[d]: ohh
16:57 karolherbst[d]: I'm sure it's `intel_iommu=on` that nouveau runs into those issues...
16:57 karolherbst[d]: kestrel_: mind trying with the iommu disabled?
17:18 kestrel_: yes i have it off explicitly karolherbst[d] im on linux-libre 6.9.10
17:20 karolherbst[d]: kestrel_: but it was enabled at least according to the log you attached to the issue
17:35 kestrel_: karolherbst: I've uploaded a log with intel_iommu=off in the issue
17:39 karolherbst[d]: kestrel_: something weird is going on and I have no idea what precisely yet...
17:40 karolherbst[d]: ohh wait..
17:41 karolherbst[d]: kestrel_: this is a GPU connected through thunderbolt, right?
17:42 karolherbst[d]: there are also some errors in regards to the PCI bar happening
17:42 karolherbst[d]: `pci 0000:04:00.0: ROM [mem 0xfff80000-0xffffffff pref]: can't claim; no compatible bridge window`
17:42 karolherbst[d]: `pci 0000:04:00.0: ROM [mem size 0x00080000 pref]: can't assign; no space`
17:43 karolherbst[d]: yeah...
17:43 kestrel_: The chip is soldered to the motherboard as far as I'm aware
17:43 karolherbst[d]: kestrel_: I think something goes very wrong on the PCI level. I _think_ someone here had an idea on how to deal with it...
17:43 karolherbst[d]: huh?
17:43 karolherbst[d]: weird...
17:43 karolherbst[d]: `pci 0000:04:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x4 link at 0000:00:1c.4 (capable of 31.504 Gb/s with 8.0 GT/s PCIe x4 link)`
17:44 karolherbst[d]: the `x4` link is suspicious, because that's usually a strong hint that it's thunderbolt
17:44 karolherbst[d]: but yeah.. maybe the laptop just comes with a x4 link...
17:44 karolherbst[d]: weird
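The two bandwidth figures in that kernel message can be reproduced with a small sketch. The encoding overheads are standard PCIe facts (8b/10b at 5.0 GT/s, 128b/130b at 8.0 GT/s); that the kernel truncates per lane in exactly this way is an assumption inferred from the numbers in the log, and the function names are made up.

```rust
/// Usable per-lane bandwidth in Mb/s for a raw rate in MT/s and a line
/// encoding carrying `payload_bits` out of every `total_bits`.
/// Integer division truncates the result.
fn lane_mbps(raw_mts: u32, payload_bits: u32, total_bits: u32) -> u32 {
    raw_mts * payload_bits / total_bits
}

/// Usable bandwidth of a whole link with `lanes` lanes, in Mb/s.
fn link_mbps(raw_mts: u32, payload_bits: u32, total_bits: u32, lanes: u32) -> u32 {
    lane_mbps(raw_mts, payload_bits, total_bits) * lanes
}
```

`link_mbps(5000, 8, 10, 4)` gives 16000 Mb/s (the "16.000 Gb/s available") and `link_mbps(8000, 128, 130, 4)` gives 31504 Mb/s (the "capable of 31.504 Gb/s"), so the device itself could do 8.0 GT/s but the port at 0000:00:1c.4 only runs at 5.0 GT/s.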
17:45 karolherbst[d]: kestrel_: why do you have NvMsi=0 set btw?
17:46 kestrel_: I've tried with and without
17:46 kestrel_: Would it make a difference?
17:46 karolherbst[d]: doubtful
17:46 karolherbst[d]: though you should leave it unset
17:47 karolherbst[d]: anyway, I think something goes very wrong on the PCI level, but not sure if there is something nouveau could do about it or not... Do you know if the nvidia driver works better on your system?
17:48 kestrel_: Yes I've been using the nvidia driver since 2017 and as of this spring
17:48 karolherbst[d]: do you have a boot log with nvidia?
17:49 karolherbst[d]: they might do something to reconfigure something on the PCI level
17:52 kestrel_: I'll boot with nvidia
18:32 phomes_[d][d]: for post_depth_coverage I left some notes of what I found when looking into it in a similar way with sph and regs: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29194#note_2436180
18:32 phomes_[d][d]: I was not able to find anything other than the cts using pdc so I did write something myself. I did not see anything interesting from that either
18:39 triang3l[d]: phomes_[d][d]: Is there some sort of auto-revert to the default state? Because I don't see a branch setting this register when post-depth coverage is enabled
18:40 kestrel_: karolherbst[d]: I have glx-gears with nvidia version 550.67, do I just need dmesg logs now?
18:40 triang3l[d]: oh yes, Faith has suggested a setting there
18:40 karolherbst[d]: kestrel_: correct
18:50 kestrel_: karolherbst[d]: posted in the issue
18:51 gfxstrand[d]: `[ 5893.734384] nouveau 0000:17:00.0: gr: DATA_ERROR 00000004 [INVALID_VALUE] ch 7 [017ed46000 deqp-vk[3042]] subc 0 class b197 mthd 0fb8 data 00000000`
18:52 gfxstrand[d]: That's not that useful...
18:56 karolherbst[d]: yeah....
18:57 karolherbst[d]: might require something else to be set first
19:11 redsheep[d]: karolherbst[d]: Would this also likely fix my missing 4k120 mode? Do you expect this will have an impact on the weird issues where some people were getting 6 bpc?
19:17 karolherbst[d]: redsheep[d]: mhhhhhhh maybe?
19:17 karolherbst[d]: Though it's mostly about ultra wide modes not appearing
19:17 karolherbst[d]: but maybe this also impacts this?
19:18 redsheep[d]: It would be nice if that's fixed. Have you poked at the FRL or DSC support any further?
19:19 karolherbst[d]: nope
19:20 redsheep[d]: I have a new monitor that relies on dsc arriving tomorrow 🥳
19:20 redsheep[d]: So I can test that aspect on any patches or nova when the time comes
19:27 karolherbst[d]: airlied[d]: any idea if nouveau should call into `request_mem_region` or something before accessing the PCI bars?
19:35 airlied[d]: don't think it's required to do it
19:45 karolherbst[d]: mhhh
19:45 karolherbst[d]: we got a bug report where the PCI BAR is all screwed up: https://gitlab.freedesktop.org/drm/nouveau/-/issues/372
19:46 karolherbst[d]: so I was wondering if that matters in like some cases
19:46 karolherbst[d]: nvidia is doing it
19:49 karolherbst[d]: though atm we just do `pci_resource_start` + `ioremap` on the address
21:21 airlied[d]: All the bars or just rom?
21:39 karolherbst[d]: dunno
21:40 karolherbst[d]: the GPU seems to throw errors for `BAR1` and `BAR2`
21:40 karolherbst[d]: but I think the `BAR` with the mmio regs works
21:42 karolherbst[d]: I mean.. that one has to work otherwise nouveau wouldn't get that far
22:07 gfxstrand[d]: Ugh... How is the blob passing this test?!?
22:09 karolherbst[d]: ~~maybe there is another secret bit we have to flip like the helper invoc stuff~~
22:17 gfxstrand[d]: I hope not
22:21 gfxstrand[d]: Looking at the bits the blob sets, I'm not seeing anything
22:22 phomes_[d][d]: We also pass with 29194 right?
22:23 gfxstrand[d]: Yeah but it ends up disabling sample shading which isn't right
22:24 airlied[d]: karolherbst[d]: that looks pretty special alright, the PCI bits only seem to be around the ROM, and since it's an onboard GPU it probably gets the rom from another method anyways
22:24 kestrel_: [ 4096.348384] nouveau 0000:04:00.0: fifo: fault 01 [WRITE] at 0000000000001000 engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 0a [UNSUPPORTED_APERTURE] on channel -1 [007fd38000 unknown]
22:25 karolherbst[d]: airlied[d]: mhhhhhhhh....
22:25 karolherbst[d]: good point actually
22:25 airlied[d]: `[ 20.636679] nouveau 0000:04:00.0: bus: MMIO read of 00000000 FAULT at 6013d4 [ PRIVRING ]` starts bad and goes downhill
22:25 kestrel_: I booted with noaccel=1 and then reinserted it later and it does bunch of these, should i attach the log?
22:25 kestrel_: with noaccel=0
22:26 kestrel_: for the latter
22:26 karolherbst[d]: though...
22:26 karolherbst[d]: mhh
22:26 karolherbst[d]: the vbios gets loaded
22:26 airlied[d]: the fact noaccel works means at least the framebuffer BAR is there
22:26 karolherbst[d]: `nouveau 0000:04:00.0: bios: version 82.08.46.00.1c`
22:26 karolherbst[d]: I'm sure it fetches it from ACPI or something...
22:40 phomes_[d][d]: gfxstrand[d]: Yes it is certainly not the right fix. In that MR I just made nvk do the same as nv and I suppose that we pass in the same way. The cts test is not testing much so it is hard to tell if pdc is correctly implemented or not
23:36 gfxstrand[d]: I know what I need to do now. I'll type it all up on Monday
23:36 gfxstrand[d]: The reason why I was only seeing `pixld.covmask` is because I was looking at an ESO shader. For `minSampleShading > 0` cases, there's an emulation we need.
23:39 gfxstrand[d]: I'm not crazy. `pixld.covmask` returns the coverage mask for the pixel. The blob pushes in a LUT of masks. You index into the table by `pixld.my_id` to look up the mask for your pass and then `&` that with the coverage mask.
23:39 gfxstrand[d]: But I need to figure out the LUT and that might mean a bit of fuzzing.
23:39 gfxstrand[d]: Hopefully it just evenly divides the samples into N groups of M samples (sample count and number of passes are both powers of two).
23:43 gfxstrand[d]: Really, it's just a fancier version of the emulation I was doing before. It just needs to be restricted a bit more and I need to think through it a lot harder.
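A rough sketch of the lookup described above, under the hoped-for assumption that the samples are split evenly into contiguous groups, one group per pass. The real table layout is unknown at this point in the discussion, and `build_pass_masks` and `sample_mask_in` are hypothetical names rather than NVK or blob code.

```rust
/// Hypothetical LUT: split `samples` samples into `passes` contiguous groups
/// of equal size and return one bitmask per pass.
fn build_pass_masks(samples: u32, passes: u32) -> Vec<u32> {
    assert!(samples.is_power_of_two() && passes.is_power_of_two());
    assert!(passes <= samples && samples <= 32);
    let per_pass = samples / passes;
    // Low `per_pass` bits set; the u64 intermediate avoids overflow when
    // `per_pass` is 32.
    let group = ((1u64 << per_pass) - 1) as u32;
    (0..passes).map(|p| group << (p * per_pass)).collect()
}

/// Mask for one fragment shader pass: the LUT entry selected by the pass
/// index (`pixld.my_id` in the chat), ANDed with the whole-pixel coverage
/// mask (`pixld.covmask`).
fn sample_mask_in(lut: &[u32], my_id: usize, covmask: u32) -> u32 {
    lut[my_id] & covmask
}
```

For 8 samples in 4 passes this yields the table `[0x03, 0x0c, 0x30, 0xc0]`. When the number of passes equals the sample count, each entry has a single bit set and the lookup degenerates to the older `1 << gl_SampleID` emulation masked by coverage.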