01:36 vdpafaor[d]: it looks like nvk does nir_lower_non_uniform_ubo_access before nir_lower_explicit_io for UBOs
01:37 vdpafaor[d]: how does that work? doesn't nir_lower_non_uniform_access need lowered io?
01:38 gfxstrand[d]: No. It works on derefs.
01:42 gfxstrand[d]: We should probably not be lowering non-uniform UBO access
01:43 gfxstrand[d]: Well, ish. We want to lower non-uniform handles but non-uniform offsets are okay
01:44 vdpafaor[d]: hmm, I think it can do deref images/textures, but I don't see code to handle `load_deref` for ubos/ssbos
01:45 vdpafaor[d]: I did also initially try putting it before nir_lower_explicit_io on panvk and it didn't touch nonuniform ubos, but possible I'm holding it wrong
01:45 gfxstrand[d]: Well, if that's true that's probably good. I don't think we want to lower those anyway. 😂
01:46 vdpafaor[d]: ah, so it hasn't been a problem because you don't actually need to lower them anyway
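(Editor's sketch, not from the log.) To make "lowering a non-uniform handle" concrete: the usual approach is a loop that repeatedly picks the handle of the first still-pending invocation, lets every invocation sharing that handle do the access with a now-uniform handle, and repeats until no invocation is left. The Rust below only simulates that idea on a slice of per-lane handles. The function name `waterfall` and the `access` callback are hypothetical and are not Mesa or NIR APIs.

```rust
/// Simulates a "waterfall" loop over the lanes of a subgroup.
/// `handles[lane]` is the (possibly non-uniform) resource handle of each lane.
/// `access(handle, lane)` stands in for the real load; it is a hypothetical
/// callback used only for illustration.
/// Returns the per-lane results and the number of loop iterations taken.
pub fn waterfall<F: Fn(u32, usize) -> u32>(handles: &[u32], access: F) -> (Vec<u32>, usize) {
    let mut results = vec![0; handles.len()];
    let mut pending = vec![true; handles.len()];
    let mut iterations = 0;

    // Keep going while at least one lane has not done its access yet.
    while let Some(first) = pending.iter().position(|&p| p) {
        // Handle of the first pending lane; uniform for this iteration.
        let uniform = handles[first];
        for lane in 0..handles.len() {
            if pending[lane] && handles[lane] == uniform {
                results[lane] = access(uniform, lane);
                pending[lane] = false;
            }
        }
        iterations += 1;
    }
    (results, iterations)
}
```

The iteration count equals the number of distinct handles among the lanes, which is the "we might have to loop" cost that the non-uniform "native" feature bits were meant to signal.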
01:58 mhenning[d]: I think the proprietary driver sets shaderUniformBufferArrayNonUniformIndexingNative and we don't - which is one of the things on my long todo list
01:59 mhenning[d]: if we're not actually lowering them then maybe we can just drop the nir_lower_non_uniform_ubo_access and set shaderUniformBufferArrayNonUniformIndexingNative
02:08 gfxstrand[d]: I've got a Maxwell in my box right now so it's easy to test.
02:08 gfxstrand[d]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35268
02:09 gfxstrand[d]: mhenning[d]: We already do on Turing+
02:09 gfxstrand[d]: For UBOs, whether or not they're native is a matter of taste
02:10 gfxstrand[d]: Is `ld.constant` really "native" when compared to `ldc cx[urN][]`?
02:11 gfxstrand[d]: gfxstrand[d]: We do on Turing+ for textures and images, that is.
02:12 gfxstrand[d]: And we claim native everywhere for SSBOs because all we have is `ld.global`
02:15 gfxstrand[d]: But yeah, there are no loops so I guess it's native? Or as native as it can be? Like, we also promote stuff to cbufs at times so compared to that nothing is ever native. It's kind of a "do whatever you want. We'll figure it out." So...
02:15 gfxstrand[d]: :shrug:
02:16 gfxstrand[d]: Maybe that's as native as it gets?
02:18 gfxstrand[d]: Honestly, those bits were always sort of a horrible compromise. They exist because NVIDIA pushed for non-uniform everything and lots of people were like "OMG! We might have to loop! The client needs to know that we might have to loop and that they shouldn't do that."
02:18 gfxstrand[d]: These days, those bits probably wouldn't have made it in.
02:43 gfxstrand[d]: I think what I'm saying is that the bit is BS and we can set it to whatever we want. I guess saying UBOs are native is probably fine.
02:49 gfxstrand[d]: There. Added a commit to the MR with a detailed commit message that someone can go read if they're ever wondering why we set that bit.
03:08 gfxstrand[d]: I'll merge tomorrow once my CTS run is done but it looks pretty good so far
03:08 gfxstrand[d]: `Pass: 474398, Skip: 684596, Timeout: 3, Flake: 3, Duration: 1:00:38, Remaining: 1:28:11`
03:08 gfxstrand[d]: On Maxwell A
03:09 gfxstrand[d]: Which will maybe matter one day in the future when people are running Postmarket OS on their old Switches.
04:04 gfxstrand[d]: I love these little compiler deep-dives where we collectively figure out that things are deeply cursed, always have been, and then make them better. 💜
04:32 kar1m0[d]: gfxstrand[d]: Thank you for your work 🙏
04:32 kar1m0[d]: I am no longer dependent on proprietary drivers
04:42 gfxstrand[d]: https://social.treehouse.systems/@gfxstrand/114600551487497359
05:16 orowith2os[d]: gfxstrand[d]: what are the languages that compile down to assembly? Grimoires? :akipeek:
10:10 kar1m0[d]: I wanted to check dual monitor setup with nouveau
10:10 kar1m0[d]: and the second monitor is black
10:11 kar1m0[d]: the kernel log I get about it is
10:11 kar1m0[d]: [ 9.347151] nouveau 0000:01:00.0: [drm] Cannot find any crtc or sizes
10:11 kar1m0[d]: how can I fix it?
10:11 kar1m0[d]: or should I make a request on gitlab for it?
10:50 danadreamjr: It's a bit embarrassing to be that cranky as i am too cause of only some computer bits or pointless arguments, but what nowadays i really think, it is still me that so many speak and sing about, but i had not possibly understood that cause my self-esteem is so low that i use none of my chances delivered, and it might be that i am too old to make any corrections there as well :( mental
10:50 danadreamjr: illness shines through the best in situations where so many dream about you but you are too stupid to understand, too weak to deliver or act on it properly, in my case it's the highest version of stupidity i estimated my worth so much below that in years of time would not even understand that it could be me that they do that with/to. Every day laura would tell the same thing that i got to
10:50 danadreamjr: start to love myself. I would not go very deep to analyse as to why she does that, instead shout internally in anger and forget everything like a brick, but i am getting there now. So all in all i think the computing is none of this why i am considered to be so stupid by many people, but the emotional world of not fitting to understand other inhabitants, be those celebrities stars or
10:50 danadreamjr: people who love me and people i love. By thoughts shut off when i get too much attention i am always working around the problem, this is definitely not me , but this could be me indeed instead. It's somewhat honorable that i crept under so many people's skin however unintentionally, even lovable to feel that as of yesterday when i analysed things again, but yes i came with mistakes of such kind
10:50 danadreamjr: i lack celebrity properties tbh :(
10:51 snowycoder[d]: kar1m0[d]: What card are you using? I have the same problem with my GK104
10:52 kar1m0[d]: snowycoder[d]: rtx 4080 laptop gpu
10:58 danadreamjr: Couple of other things i am analysing what is often said to me, that i seriously always store too much negativeness, which spoils the game for too many.
11:03 kar1m0[d]: ada lovelace ad104 iirc
11:50 mangodev[d]: strange, i'm using dual monitor on turing
12:08 kar1m0[d]: mangodev[d]: might be a laptop thing because laptops often use both integrated and discrete gpus
12:09 kar1m0[d]: forgot what it's called
13:10 avhe[d]: is there a way to tell which version of nvcc was used to compile a given kernel?
13:10 avhe[d]: i don't see anything in cuobjdump
13:13 avhe[d]: (context is that i wasn't able to reproduce a SASS pattern using SGXT that i was pretty confident mapped to __mul24 (<https://godbolt.org/z/aG7qr7E37>). Turns out nvcc started emitting shifts instead of bitfield extract starting in 11.6.0, which took me a while to figure out because i was testing on its latest version)
13:50 gfxstrand[d]: kar1m0[d]: Does a USB-C adapter work? That would let you drive both from the Intel card, which is probably better for your thermals anyway.
13:51 kar1m0[d]: gfxstrand[d]: I don't have an adapter unfortunately
13:51 kar1m0[d]: I always used hdmi
13:52 kar1m0[d]: meanwhile I have tested the finals
13:52 kar1m0[d]: and well
13:52 kar1m0[d]: it's not good
13:52 kar1m0[d]: managed to get 15 fps on minimal settings
13:54 kar1m0[d]: gfxstrand[d]: I am planning to try to work on getting more gpu output info in mangohud, will that be helpful? My idea is that I would be able to see gpu clock speeds, memory speeds, gpu power usage and vram usage. what do you think?
13:55 kar1m0[d]: I am not sure how long it will take for me to do it or if it will be successful but I am willing to try
13:56 esdrastarsis[d]: kar1m0[d]: Need kernel patches to get this info from the gsp firmware
13:56 kar1m0[d]: esdrastarsis[d]: yes I know
13:56 kar1m0[d]: esdrastarsis[d]: have you tried to do something similar?
13:57 kar1m0[d]: I think it might be helpful to see mangohud output
14:02 gfxstrand[d]: kar1m0[d]: Is it a D3D12 title?
14:07 kar1m0[d]: gfxstrand[d]: yes
14:07 kar1m0[d]: also I noticed that steam often crashes for me
14:07 kar1m0[d]: and I have to restart my pc to fix it
14:10 kar1m0[d]: probably something about drivers with wayland
14:21 gfxstrand[d]: There's known issues with steam. I need to look into it. Seems no one else wants to debug Kopper. 🙄 Not that I particularly want to debug Kopper, either. 🤷🏻‍♀️😂
14:21 esdrastarsis[d]: kar1m0[d]: no
14:22 gfxstrand[d]: kar1m0[d]: Okay, so that's half your FPS right there. We're working on making D3D12 suck less but it's a long road. (It sucks for the blob driver, too.)
14:23 gfxstrand[d]: But there may also be some pessimal cases it's hitting.
14:30 gfxstrand[d]: I think karolherbst[d] and I need to write a blog post answering the firmware questions once and for all so I can just link it any time anyone asks.
14:31 gfxstrand[d]: So far I've had 3 people respond to my Fedi post from last night asking why we can't reclock. 😩
14:32 karolherbst[d]: 🙃
14:33 karolherbst[d]: well.. in theory we could use nvidia's firmware and reverse engineer the interface, and....
14:34 gfxstrand[d]: In theory...
14:37 karolherbst[d]: on Pascal that might even be doable, it's more of a pain on Maxwell
14:41 gfxstrand[d]: Honestly, if we could just get Pascal...
14:43 karolherbst[d]: Pascal is nice, because the firmware boot process doesn't run on the PMU
14:43 karolherbst[d]: on Maxwell, it's the PMU that bootstraps everything
14:43 karolherbst[d]: and then loads the PMU firmware onto itself or something weird
14:43 karolherbst[d]: dunno the details
14:46 gfxstrand[d]: In any case, it would be good to write it all down. I'm tired of having these discussions.
14:46 karolherbst[d]: you'll get used to it 😄
14:46 karolherbst[d]: but yeah...
14:46 karolherbst[d]: we kinda want to reboot the wiki, and might be good to have a few new pages there just explaining new things.. dunno
14:46 karolherbst[d]: or just do a new wiki
14:55 gfxstrand[d]: Well, we need a new wiki
14:56 magic_rb[d]: hows the legality of reverse engineering nvidia's blob firmware? because a technical challenge can be overcome, a legal one cannot
15:01 karolherbst[d]: depends on the country (tm)
15:01 karolherbst[d]: usually all this is based on drm law
15:01 karolherbst[d]: and encryption/signing keys are protected as they are used as a means to protect IP
15:02 kayliemoony[d]: general rules i've seen in RE spaces are
15:02 kayliemoony[d]: - don't fuck with signing keys
15:02 kayliemoony[d]: - be in the EU
15:02 karolherbst[d]: the reverse engineering isn't the issue, the problem is rather, do you need "secrets" you shouldn't have to run anything
15:02 kar1m0[d]: gfxstrand[d]: I can help writing it if needed
15:02 karolherbst[d]: yeah.. the signing key is a big one. Though as long as we use nvidia's blobs there isn't a real issue besides.. IP laws again, because redistribution might not be legal
15:03 karolherbst[d]: nvidia's driver licensing terms are a bit... vague
15:03 karolherbst[d]: like they allow distribution of unmodified files
15:03 kayliemoony[d]: if you can redistribute the blob then it Should be entirely legal to just use the keys from known locations in the blob without ever redistributing the keys yourself
15:03 kayliemoony[d]: (this is how romhack distribution effectively works for example)
15:03 karolherbst[d]: but is "extracting part of a binary" already a modification? Probably yes
15:03 karolherbst[d]: well we don't need keys
15:04 karolherbst[d]: though part of the boot process is also to place signatures at the right place 🙃
15:04 kar1m0[d]: karolherbst[d]: Like zink for example?
15:05 kar1m0[d]: Might need to add it yes
15:13 gfxstrand[d]: The whole nouveau wiki is ancient. I don't think it even mentions NVK.
15:14 gfxstrand[d]: We should move it to GitLab and rewrite large chunks of it.
15:16 karolherbst[d]: it's already on gitlab
15:25 kar1m0[d]: sure you can move entirely to gitlab but if not for the nouveau wiki I wouldn't have been here, I think it is important for newcomers or people interested in finding out more about nouveau.
15:26 kar1m0[d]: as I said I can rewrite the wiki I don't mind doing it
15:27 kar1m0[d]: but since I haven't been present here for long I might need help in knowing what needs to be added.
15:27 marysaka[d]: karolherbst[d]: ... or that one trick that is a disaster to figure out
15:27 gfxstrand[d]: karolherbst[d]: Oh? Do you mean it's a GitLab wiki or that it's backed by a git repo that's in GitLab?
15:29 karolherbst[d]: gfxstrand[d]: it's gitlab pages
15:29 karolherbst[d]: deployed from a gitlab pipeline on a gitlab git repo
15:29 karolherbst[d]: https://gitlab.freedesktop.org/nouveau/wiki/
15:30 magic_rb[d]: https://nouveau.freedesktop.org/ are we talking about that wiki? or what wiki
15:30 kar1m0[d]: magic_rb[d]: this one is outdated
15:30 kar1m0[d]: the gitlab one is newer
15:30 magic_rb[d]: kar1m0[d]: its also insanely nostalgic
15:30 magic_rb[d]: :P
15:31 gfxstrand[d]: karolherbst[d]: Oh, nice! I didn't realize it had been updated. Maybe now I can work on editing stuff.
15:31 magic_rb[d]: i recall trying to play stalker ogse on a gtx 520m on my old old laptop, those were the days, eventually got it running with proprietary nvidia i think
15:31 karolherbst[d]: gfxstrand[d]: yeah.. it's still ikiwiki, but it's just markdown
15:32 karolherbst[d]: so as long as the pages don't do weird things, can just use whatever wiki software in the future and keep the pages
15:33 gfxstrand[d]: Okay, next time I have a mediocre brain day, I'm gonna get typing.
17:32 snowycoder[d]: Can I always assume that the first BB in NAK does not have any incoming edge?
17:39 mhenning[d]: yes. It's a result of the graph being in reverse postorder
17:59 mhenning[d]: out of curiosity, what are you working on that you need that assumption for?
18:18 snowycoder[d]: mhenning[d]: Texdepbar insertion in kepler, culling them requires computing the min/max bounds of the texture fetch stack at each instruction.
18:18 snowycoder[d]: I require the first block to be:
18:18 snowycoder[d]: - the entry point
18:18 snowycoder[d]: - not have any previous block
18:18 snowycoder[d]: So that I can ensure the stack to start empty
19:02 mhenning[d]: well, we have exactly one entry block and it's guaranteed to be the first one so you should be good
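(Editor's sketch, not from the log.) The point about reverse postorder can be checked on a toy graph: a depth-first walk from the entry block finishes the entry last, so after reversing the postorder the entry sits at index 0. The representation below (successor lists indexed by block number) and the name `reverse_postorder` are assumptions for illustration and do not mirror NAK's actual CFG types.

```rust
/// Computes a reverse postorder of the blocks reachable from `entry`.
/// `succs[b]` lists the successor block indices of block `b`
/// (a hypothetical CFG encoding used only for this example).
pub fn reverse_postorder(succs: &[Vec<usize>], entry: usize) -> Vec<usize> {
    fn visit(b: usize, succs: &[Vec<usize>], seen: &mut [bool], post: &mut Vec<usize>) {
        seen[b] = true;
        for &s in &succs[b] {
            if !seen[s] {
                visit(s, succs, seen, post);
            }
        }
        // A block is emitted only after everything reachable from it.
        post.push(b);
    }

    let mut seen = vec![false; succs.len()];
    let mut post = Vec::with_capacity(succs.len());
    visit(entry, succs, &mut seen, &mut post);
    // The entry block is pushed last, so reversing puts it first.
    post.reverse();
    post
}
```

For a diamond `0 -> {1, 2} -> 3` this yields an order starting with block 0, with every block appearing after its predecessors (back edges aside).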
20:10 gfxstrand[d]: Doesn't texdepbar take the number of texture instructions to look back?
20:11 mhenning[d]: yes
20:12 mhenning[d]: I think a dataflow analysis would probably make sense
20:14 gfxstrand[d]: I'm less worried about data flow. I'm just not sure what to do at joins. I guess if we can assume they're a queue and if we assume the hardware ignores the barrier if there aren't enough in the queue then we can just take a minimum over all predecessors
20:14 gfxstrand[d]: And initialize to the maximum
20:14 mhenning[d]: yeah, it's a queue
20:15 mhenning[d]: and yes, join operator would be minimum
20:16 gfxstrand[d]: That should be fine then
20:19 snowycoder[d]: gfxstrand[d]: That's what the old codegen does, but barrier culling is not fast.
20:19 snowycoder[d]: There's a stack bounds propagation algorithm with complexity O(max_nested_loop*E)
20:37 mhenning[d]: I wouldn't worry about bounds propagation. You can probably just iterate until fixpoint and it will be fine
20:37 mhenning[d]: That's what eg. liveness analysis already does and it works well in practice
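(Editor's sketch, not from the log.) The scheme described above (queue depth as the dataflow state, minimum as the join, every block initialized to the maximum, entry block starting empty, iterate to a fixpoint) might look roughly like this. Everything here is a simplified model under stated assumptions: `Op`, `Block`, `MAX_DEPTH` and `queue_depth_at_entry` are hypothetical names, the cap of 63 is an arbitrary placeholder, and `DepBar(n)` is modeled as "at most `n` fetches remain outstanding afterwards". It is not NAK's IR or mhenning's dataflow framework.

```rust
/// Placeholder cap on the tracked queue depth (an assumption, not a hardware fact).
pub const MAX_DEPTH: u32 = 63;

#[derive(Clone, Copy)]
pub enum Op {
    Tex,         // queues one texture fetch
    DepBar(u32), // modeled as: afterwards at most `n` fetches are outstanding
    Other,       // does not touch the queue
}

pub struct Block {
    pub preds: Vec<usize>, // predecessor block indices
    pub ops: Vec<Op>,
}

/// Applies a block's instructions to an incoming queue depth.
fn transfer(mut depth: u32, ops: &[Op]) -> u32 {
    for op in ops {
        depth = match *op {
            Op::Tex => (depth + 1).min(MAX_DEPTH),
            Op::DepBar(n) => depth.min(n),
            Op::Other => depth,
        };
    }
    depth
}

/// Returns, per block, a lower bound on the queue depth at block entry.
/// Block 0 is assumed to be the unique entry block with an empty queue.
pub fn queue_depth_at_entry(blocks: &[Block]) -> Vec<u32> {
    // Initialize to the maximum; values only ever decrease from here.
    let mut depth_in = vec![MAX_DEPTH; blocks.len()];
    if blocks.is_empty() {
        return depth_in;
    }
    depth_in[0] = 0;

    let mut changed = true;
    while changed {
        changed = false;
        for (i, block) in blocks.iter().enumerate().skip(1) {
            // Join: minimum over the outgoing depth of every predecessor.
            let joined = block
                .preds
                .iter()
                .map(|&p| transfer(depth_in[p], &blocks[p].ops))
                .min()
                .unwrap_or(MAX_DEPTH);
            if joined < depth_in[i] {
                depth_in[i] = joined;
                changed = true;
            }
        }
    }
    depth_in
}
```

Because the state can only decrease from `MAX_DEPTH` and is bounded below by zero, the loop terminates without any explicit loop-nesting bound, which is the same argument that makes iterate-until-fixpoint liveness analysis work.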
20:39 mhenning[d]: I've spent some time working on a dataflow framework for nak, which is here if you want to play around with that: https://gitlab.freedesktop.org/mhenning/mesa/-/commits/dataflow?ref_type=heads
20:47 snowycoder[d]: mhenning[d]: Thank you, I'll look into it more when I get back to NAK