02:20 indonesianpaul: cryptographically when curve is drawn of the range, it would resemble an ellipsoid shape. This elliptic cryptography is known well for having very many keys, but maintaining a strong encryption. Technically collision is only possible if the value is the same so the rng must have failed. Inverse index or value found validation with its bounded inverse where the first is subtract from
02:20 indonesianpaul: last is already enough of validation it needs to be 58, and the more you stretch the range to bigger distances accommodating more keys, the more it takes elliptic shape, so it's also easiest to use on compute model. Since it's possible to pull to minus or twice the minus, or positive or twice the positive etc. Now the advantages and disadvantages. on the adv. side you are able to break
02:20 indonesianpaul: encryption keys such as nvidia's signed fw can be reverse engineered, if enough thought is put having an extraordinary throughput of decryption procedures. Disadvantage being overall global thieves and lazy people would scam others and would not want to go to work either which is a smaller problem when robotics take over, especially when they are of angry and lazy type, it's degenerative
02:20 indonesianpaul: curve as result on the social matters. One should at least go to gym etc to be respectful to others i would assume. So my work here is done, the compute module i won't upload and i am still choosing the sides Europe or Russian federation if it's going to be conflict zone here. They are both close, so America is distant. I would propose the peace finally, but it's not my thing to tell.
02:20 indonesianpaul: This nouveau GSP firmware as well as KMS driver could be a great contribution if you had enough resilience like i have to get it stable, and i would even have faith in you, the team seems good, but you need to quit violating others. I am not a stupid guy at all or mentally ill.
02:21 lru: lol
02:43 pavlo_kozlenko[d]: make yt channel
02:43 pavlo_kozlenko[d]: i will subscribe you
02:44 pavlo_kozlenko[d]: i will subscribe to you
02:46 mangodev[d]: mangodev[d]: something interesting is that when firefox crashes, it causes the whole computer to lag for about 10 seconds or so
02:46 mangodev[d]: don't know how much that helps diagnose the issue, but at least it may be a clue toward the cause
03:12 gfxstrand[d]: What do you mean by lag? It's possible your computer is cleaning up processes or that the FF crash is an OOM of some sort.
03:12 gfxstrand[d]: I'm not aware of anything that's snuck into Zink/NVK that would cause an OOM but stranger things have happened.
03:13 airlied[d]: 10s sounds like a fence wait
04:08 mangodev[d]: gfxstrand[d]: i'd be concerned if it's an OOM because it does it at seemingly random times
04:09 mangodev[d]: it can crash literal seconds after booting
04:09 mangodev[d]: other times it takes hours
04:09 gfxstrand[d]: airlied[d]: That's also very possible
04:09 mangodev[d]: i mean, the traces firefox gave did bring up fences
04:09 gfxstrand[d]: I was unclear on what "lag for 10s" meant
04:10 mangodev[d]: gfxstrand[d]: the compositor stays choppy for about 5-10s after crashing
04:10 mangodev[d]: it's hard to tell because the logs are nondescript (though *maybe* i can get extra info if i contact mozilla's crash log admins :/)
04:11 mangodev[d]: the weird thing is that chrome and electron *don't* do this, only firefox
04:11 mangodev[d]: this is still happening on latest too, i built my drivers an hour ago and it's still crashing spontaneously
04:12 mangodev[d]: it doesn't have much correlation between the contents/complexity of the webpage and whether it'll crash or not
04:13 mangodev[d]: i've had super long reddit threads crash, pages with gsap or threejs crash, and pages with really unoptimized react code crash (github >:|), but i've also had ancient html 2.0 pages crash as well
04:14 mangodev[d]: i think it's something to do with firefox "activating" the page renderer
04:14 mangodev[d]: because the most common crash is on down scroll, and it gets more frequent with more tabs open
04:14 mangodev[d]: might crash more with discord open to the side, but it's hard to tell with certainty, could just be a red herring
04:14 gfxstrand[d]: Okay, choppy after the crash is definitely weird.
04:14 mangodev[d]: and it seems to logspam a bit while crashing
04:15 mangodev[d]: normally, firefox only gives one error from crashing, but it logs quite a few times when crashing
04:15 mangodev[d]: i'll check the logs again, but it shouldn't be different from last time
04:16 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364817169212510239/image.png?ex=680b0c8a&is=6809bb0a&hm=61196e6d24694fb17ac752bfee941832c2beb089141fc5ba2b872097a9d14ad9&
04:16 mangodev[d]: this is what it logs when it crashes
04:16 mangodev[d]: oooh wait that `.dmp` file is probably important
04:16 mangodev[d]: didn't notice that before
04:17 mangodev[d]: wait
04:18 mangodev[d]: uhhh
04:18 mangodev[d]: it failed to generate the dump :|
04:18 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364817739344248933/image.png?ex=680b0d12&is=6809bb92&hm=654103ccc44c7ec96e99a91e8ca544b73b474f819827fddaaaf2e1c0fee89a57&
04:18 mangodev[d]: how :(
04:25 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364819393254789211/image.png?ex=680b0e9c&is=6809bd1c&hm=86c058962d825fda5e31cd6c2d4c1b2969e18109bb5b501cee7bed6087f1512a&
04:25 mangodev[d]: now in the instance folder
04:25 mangodev[d]: update: `minidumps` and `crashes` have nothing useful
04:25 mhenning[d]: do you get anything interesting in dmesg?
04:25 mangodev[d]: mhenning[d]: nope, only from electron crashes
04:26 mangodev[d]: the discord soft crash seems easier to debug, but is probably lower priority
04:26 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364819739792375860/image.png?ex=680b0eef&is=6809bd6f&hm=8078d69620487306b5527ae880c1d436173d3bb4de3530d732709cfcf9746732&
04:26 mangodev[d]: i think i found something
04:49 gfxstrand[d]: mangodev[d]: status=11 probably means segfault but it's possibly another enum.
04:51 mangodev[d]: gfxstrand[d]: what do you think the `Exiting due to channel error.` means? it only does this on this crash
04:52 mangodev[d]: the other day i was having an extension crash (from `WebExtensions`, must've been a broken plugin because i disabled some plugins and it's fixed now), and it didn't give the channel error log
04:53 mangodev[d]: it seems to make a stutter for each time it logs that channel error log
04:53 mangodev[d]: when i did some live logging yesterday, it stuttered per log
04:54 mangodev[d]: so that channel error exit seems to cause some lag
04:54 mangodev[d]: …but what is it exiting? the crash reporter? why on this crash in specific?
04:54 mangodev[d]: something also to note is that closing the crash report window on these crashes takes a bit longer than usual
04:54 mangodev[d]: the window freezes for a good 5 seconds before disappearing like it's supposed to
05:03 gfxstrand[d]: mangodev[d]: I don't know. Nouveau has a very specific meaning for "channel error" but I don't see how that could possibly get plumbed through to Firefox so it's probably something Firefox specific.
05:03 mangodev[d]: it seems to happen more often when tabs are opened in quick succession?
05:03 mangodev[d]: that's the biggest thing all these crashes have in common
05:04 mangodev[d]: and also the crashing on down scroll
05:04 mangodev[d]: maybe a race condition of some sort?
05:06 gfxstrand[d]: Really, I think someone just needs to reproduce it in GDB.
05:06 gfxstrand[d]: Once we can poke around at the actual crash site, we should have a way better idea of what's going on.
05:07 mangodev[d]: mangodev[d]: which one of these can i even get data from
05:08 mangodev[d]: why does firefox have to be so difficult to get logs from :(
05:21 gfxstrand[d]: 1. Start Firefox
05:21 gfxstrand[d]: 2. Run `ps aux | grep firefox`
05:21 gfxstrand[d]: 3. Look for the thing that looks like a render process
05:21 gfxstrand[d]: 4. `gdb --pid=<render process PID>`
05:21 gfxstrand[d]: 5. wait for the crash
05:21 gfxstrand[d]: 6. In GDB, type `bt full`
05:24 gfxstrand[d]: If GDB refuses to connect to the GPU process (access permissions can do that), there's a flag to disable GPU process isolation. I just don't remember what it is. I'd have to look it up.
05:27 tiredchiku[d]: security.sandbox.gpu.level
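Steps 2 to 4 above can be sketched as a small helper that picks candidate child processes out of `ps` output and builds the attach command. This is only a sketch under assumptions: it assumes Firefox child processes are started with a `-contentproc` argument and that the last argument names the process type (for example `tab` or `rdd`). The helper names `find_child_processes` and `gdb_command` are made up for illustration.

```python
def find_child_processes(ps_output):
    """Return (pid, process_type) pairs found in `ps -eo pid,args` text."""
    found = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not "<pid> <cmd> <args...>".
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        # Assumption: child processes carry -contentproc and end with their type.
        if "-contentproc" not in fields:
            continue
        found.append((int(fields[0]), fields[-1]))
    return found


def gdb_command(pid):
    """Build the attach command from step 4; after the crash, type `bt full`."""
    return ["gdb", f"--pid={pid}"]
```

Feed it the text printed by `ps -eo pid,args`, pick the PID whose type looks like the render process, and run the resulting command by hand.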
05:55 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364842115925807104/image.png?ex=680b23c5&is=6809d245&hm=8b3972e35b23571188afc520d008133f7d8c4c8704e8d63ba26597458a8f5292&
05:55 mangodev[d]: gfxstrand[d]: rdd?
05:55 mangodev[d]: i'm on step 2
06:06 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364844854240350259/image.png?ex=680b2652&is=6809d4d2&hm=e1b50eeded2b631789b5227937cf8e7d40a300a4ef25152e142e1d441c45e5d3&
06:06 mangodev[d]: ?
06:09 mangodev[d]: uhhhhhh hmmmm
06:10 mangodev[d]: wait
06:10 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364846037721682083/image.png?ex=680b276d&is=6809d5ed&hm=0b3e9472a7eec9a643801d683305aede8203dce534627eef13cd20b731b102f4&
06:10 mangodev[d]: why does firefox have so many issues
06:20 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364848498620764232/image.png?ex=680b29b7&is=6809d837&hm=3139720914958a1e7b009ec85883069baaacd8d445e080ce15d932b7589ef297&
06:20 mangodev[d]: gfxstrand[d]: even gdb doesn't have any idea
06:21 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364848737498828810/image.png?ex=680b29f0&is=6809d870&hm=e055ba2b816e23063e5a06222b65992355f70dbe4c50ca4f0c51fc4a63433cad&
06:21 mangodev[d]: journalctl gave an extra log this time though
06:23 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364849164625772574/image.png?ex=680b2a56&is=6809d8d6&hm=7bae0de51049727de5cbd937df9d591d3a1e5ae69b96a071d7d9df2eb2032b54&
06:23 mangodev[d]: also gave this 4 seconds before the crash, may be the cause?
06:24 mangodev[d]: https://cdn.discordapp.com/attachments/1034184951790305330/1364849407333498880/image.png?ex=680b2a90&is=6809d910&hm=df02ca5d0cf67920d27cc2ad9886009a819f613320c3b79061a7c37732414dca&
06:24 mangodev[d]: lmao
06:24 mangodev[d]: found this while scrolling through logs
06:35 mangodev[d]: i'm gonna head off to bed, i hope this light debugging session was able to help diagnose the issue at least a little
08:23 kwizart: Hi, can you confirm a few things related to nvidia:
08:23 kwizart: 1/ Is the nvidia-open kernel driver of any use to any mesa driver as is? (outside of study purposes)
08:23 kwizart: 2/ Does the new mesa driver rely on nouveau.ko as found in the stable upstream kernel, or are downstream patches needed?
08:41 airlied[d]: 1/ Not yet useful. 2/ Stable upstream.
09:14 x512[m]: kwizart: Nvidia open KMD is used with NVK Vulkan driver on Haiku.
09:15 x512[m]: Because Nvidia open KMD is designed to be portable unlike Nouveau KMD.
09:35 kwizart: thanks
09:37 kwizart: x512[m], so it means nvidia-open can be used for NVK also under linux with mesa-next (or with a dedicated mesa branch ?)
09:39 tiredchiku[d]: not yet
09:39 kwizart: also not sure where the nova kernel module stands?
09:39 x512[m]: It can (and has been demonstrated to work), but most Linux distros will likely continue to use Nouveau KMD, and Nova KMD in the future. Out-of-tree kernel modules with an unstable UAPI are a big taboo in the Linux ecosystem, but that is perfectly fine for Haiku.
09:39 x512[m]: Nova KMD is not ready at all for now.
09:41 x512[m]: Nova KMD is intended to be a Nouveau replacement for Turing+ GPUs, written in Rust.
09:42 x512[m]: Linux-only I suppose.
09:43 x512[m]: In theory NVK with Nvidia KMD (NVRM) can interop with CUDA or official Nvidia proprietary OpenGL/Vulkan drivers.
13:29 snowycoder[d]: Ok maybe I understand what's going wrong with SR and ISBE on sm32.
13:29 snowycoder[d]: The issue is: there is no ISBE(?).
13:29 snowycoder[d]: There's VILD, that I mapped to ISBE, but codegen instead calls PFETCH.
13:29 snowycoder[d]: PFETCH loads a vertex address indirectly(?)
13:29 snowycoder[d]: So codegen just does PFETCH vtxid instead of the invocation_info thinginess
13:30 snowycoder[d]: Maybe I can hack it together
13:33 gfxstrand[d]: Yeah, we may want a new `OpViLd`
13:35 gfxstrand[d]: Codegen likes to go "Oh, these do kinda similar things. Let's pretend it's the same op but with a GPU generational difference." I prefer to add a new op so there's no confusion.
13:36 snowycoder[d]: Yep, I just mapped them together because I didn't know what either of them did😂
13:39 gfxstrand[d]: That's fair
13:40 gfxstrand[d]: But yeah, my read of the lowering code for nvc0 was that it maybe multiplied that vertex index by 4 and that was it.
13:40 gfxstrand[d]: It didn't do a separate ISBE anything
14:34 snowycoder[d]: Good news: now geometry shaders can read position attributes.
14:34 snowycoder[d]: Bad news: Basic geometry tests like `dEQP-VK.geometry.basic.primitive_id` still fail with `ILLEGAL_SPH_INSTR_COMBO` (although it doesn't seem like an fp64 problem)
16:07 mohamexiety[d]: gfxstrand[d]: so did this (with pushing different push constants every time) and I am seeing something interesting.
16:07 mohamexiety[d]: - two packets indeed go A B A, so I guess one of them is the shader pointer. but no clue what the other one could be
16:07 mohamexiety[d]: - two packets instead are different every time :thonk:
16:08 mohamexiety[d]: I'd guess one is cbuf0 but not sure what the other would be
16:11 mohamexiety[d]: could be helpful to see what they look like, so I'll post them here. starting with the two packets that go A B A: (note: they're not next to each other)
16:11 mohamexiety[d]: Packet 1, A:
16:11 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:11 mohamexiety[d]: .VALUE = 0xf3ab2800
16:11 mohamexiety[d]: Packet 1, B:
16:11 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:11 mohamexiety[d]: .VALUE = 0xf3ab2810
16:11 mohamexiety[d]: Packet 2, A:
16:11 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:11 mohamexiety[d]: .VALUE = 0x7ceaca04
16:11 mohamexiety[d]: Packet 2, B:
16:11 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:11 mohamexiety[d]: .VALUE = 0x7ceaca08
16:20 mohamexiety[d]: now the different ones: (again, not next to each other)
16:20 mohamexiety[d]: Packet 1, A:
16:20 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:20 mohamexiety[d]: .VALUE = 0x6f3d9
16:20 mohamexiety[d]: Packet 1, B:
16:20 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:20 mohamexiety[d]: .VALUE = 0x6f3e1
16:20 mohamexiety[d]: Packet 1, A:
16:20 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:20 mohamexiety[d]: .VALUE = 0x6f3e9
16:20 mohamexiety[d]: Packet 2, A:
16:20 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:20 mohamexiety[d]: .VALUE = 0x1bcf68
16:20 mohamexiety[d]: Packet 2, B:
16:20 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:20 mohamexiety[d]: .VALUE = 0x1bcf88
16:20 mohamexiety[d]: Packet 2, A:
16:20 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
16:20 mohamexiety[d]: .VALUE = 0x1bcfa8
16:20 mohamexiety[d]: one other thing which could be helpful/relevant, the first packet is actually the second packet they send in the whole QMD
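To make the pattern in the dumps above easier to see, here is a minimal sketch that computes the step between consecutive dispatches for each packet. The values are copied from the messages above. The third entry of the two A B A packets is filled in as a repeat of A, as described, and the variable names are only labels chosen for this sketch. It says nothing about what the fields actually are.

```python
def deltas(values):
    """Difference between each value and the one from the previous dispatch."""
    return [b - a for a, b in zip(values, values[1:])]


# Packets that follow the shader order A, B, A (third value repeats A).
aba_packet_1 = [0xF3AB2800, 0xF3AB2810, 0xF3AB2800]
aba_packet_2 = [0x7CEACA04, 0x7CEACA08, 0x7CEACA04]

# Packets that are different on every dispatch.
changing_packet_1 = [0x6F3D9, 0x6F3E1, 0x6F3E9]
changing_packet_2 = [0x1BCF68, 0x1BCF88, 0x1BCFA8]

for name, values in [
    ("aba_packet_1", aba_packet_1),
    ("aba_packet_2", aba_packet_2),
    ("changing_packet_1", changing_packet_1),
    ("changing_packet_2", changing_packet_2),
]:
    print(name, [hex(d) for d in deltas(values)])
```

The A B A packets step by 0x10 and 0x4 and then step back, while the always-changing packets advance by a constant 0x8 and 0x20 per dispatch.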
17:03 karolherbst[d]: marysaka[d]: I found your bug... 🙃
17:03 marysaka[d]: oh?
17:04 karolherbst[d]: I _think_ uhm...
17:04 tiredchiku[d]: 🎷🐛
17:05 karolherbst[d]: marysaka[d]: like.. the `idx` to `compute_matrix_offsets` is kinda the "tile" within the matrix, no?
17:06 karolherbst[d]: so I was confused about the `compute_matrix_16x8x32_target(b, desc, lane_id, idx % 16, col_offset, row_offset);` part
17:06 karolherbst[d]: and why `idx % 16` specifically
17:06 karolherbst[d]: and I'm wondering why that's not `idx % 4`
17:08 karolherbst[d]: mhhh
17:08 karolherbst[d]: maybe I need to take a deeper look..
17:08 karolherbst[d]: this is all a bit complicated 🙃
17:09 karolherbst[d]: ehh wait.. `idx` is the element index...
17:09 marysaka[d]: idx is the element yeah
17:09 karolherbst[d]: I was wondering why my change made it pass, turns out it didn't use the nvidia gpu 🙃
17:09 gfxstrand[d]: mohamexiety[d]: What does your compute shader do? Is it super trivial? I'm guessing the first one is shader address lower and the second one is cbuf0 but I'm not 100% sure.
17:10 marysaka[d]: something I remember seeing at least on the graphics side: the cbuf0 is usually right after the shader program
17:11 karolherbst[d]: marysaka[d]: but I also have an idea how to do the encoding better, and I think `nak_cmat_type` needs to encode the type as well
17:11 karolherbst[d]: because the matrix layout can change depending on the actual type
17:13 marysaka[d]: yeah,,,
17:13 marysaka[d]: we pass the info for the actual MMA intrinsic at least but not the layout computation yet
17:14 gfxstrand[d]: marysaka[d]: Not on QMD 4.0
17:14 gfxstrand[d]: IDK about 5.0
17:15 marysaka[d]: ah rip...
17:16 karolherbst[d]: but I think `16x8x16` wasn't correct
17:16 karolherbst[d]: sometimes A and C/D seem to use different cut offs
17:17 karolherbst[d]: you had `if (idx >= 2) row = nir_iadd_imm(b, row, 8);` for both, but it seems for A it has to be `(idx >= 4)`
17:17 karolherbst[d]: looking at https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-fragment-mma-16816-i8-f8
17:17 karolherbst[d]: just wondering if I'm misunderstanding things or not
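The different cut-offs can be illustrated with a small sketch of the per-lane element positions for the 8-bit `m16n8k16` shape, as best understood from the linked PTX section. Treat the formulas as an assumption to check against the docs, not as the NAK implementation. The function names are invented for this sketch. The only thing it verifies on its own is that each formula covers its matrix exactly once.

```python
def a_fragment_coord(lane_id, idx):
    """(row, col) of element idx (0..7) of the 16x16 A matrix held by a lane."""
    group_id = lane_id >> 2        # 32 lanes form 8 groups of 4
    thread_in_group = lane_id % 4
    row = group_id + (8 if idx >= 4 else 0)   # cut-off at 4 for A
    col = thread_in_group * 4 + (idx & 3)
    return row, col


def cd_fragment_coord(lane_id, idx):
    """(row, col) of element idx (0..3) of the 16x8 C/D matrix held by a lane."""
    group_id = lane_id >> 2
    thread_in_group = lane_id % 4
    row = group_id + (8 if idx >= 2 else 0)   # cut-off at 2 for C/D
    col = thread_in_group * 2 + (idx & 1)
    return row, col


def covered(coord_fn, elements_per_lane):
    """Set of all (row, col) positions touched by the 32 lanes of a warp."""
    return {coord_fn(lane, i) for lane in range(32) for i in range(elements_per_lane)}
```

With 8 elements per lane for A and 4 for C/D, the two formulas touch every cell of a 16x16 and a 16x8 matrix respectively, which is consistent with A needing `idx >= 4` where C/D uses `idx >= 2`.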
17:22 karolherbst[d]: marysaka[d]: also.. I don't find the docs of the 16x16x32 matrix, is there another source for that?
17:24 marysaka[d]: karolherbst[d]: Hammering the driver with tests and diffing 🙃
17:24 karolherbst[d]: pain
17:24 marysaka[d]: https://github.com/marysaka/usami/tree/master/coop_matrix_layout_store_shaders/sm86
17:25 karolherbst[d]: thanks!
17:25 marysaka[d]: codegen here I should update that part buut yeah
17:25 marysaka[d]: I think SM86 output isn't up to date
17:25 karolherbst[d]: are the layouts different between gens? 🙃
17:25 marysaka[d]: yes
17:25 karolherbst[d]: mhhh
17:25 marysaka[d]: well, it's more that it possibly lowers to a smaller size
17:26 marysaka[d]: when not supported
17:26 karolherbst[d]: I see
17:26 karolherbst[d]: I mean more if I can rely on the cuda docs layout to be the same across the gens
17:26 marysaka[d]: but 16x16x32 is two IMMA btw
17:26 karolherbst[d]: yeah.. I'm ignoring those for now
17:26 marysaka[d]: using 16x8x32
17:26 karolherbst[d]: want to get 8x8x32 working first
17:27 karolherbst[d]: marysaka[d]: that one doesn't exist on turing either
17:28 marysaka[d]: correct
17:47 gfxstrand[d]: snowycoder[d]: FYI: I'm about to shut down the desktop with the Kepler in it and won't have access to it again for about 2 weeks.
17:48 karolherbst[d]: I think at this point I'm making it worse
17:51 gfxstrand[d]: is tempted to implement AMD_trinary_minmax just so she can use an actual predicate for the min source of `fmnmx`.
17:52 gfxstrand[d]: The `spirv_to_nir` lowering emits 4 min/max instructions. We can do it in 3, I think.
17:53 gfxstrand[d]: AMD can do it in 1
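For reference, a sketch of a four-operation form, assuming the operation being discussed is the extension's median-of-three (`mid3`); the chat does not say which one is meant. This is the textbook composition of two-input min/max operations, not code copied from `spirv_to_nir`, and the function name is made up here. The three-operation variant hinted at above is not shown because the chat gives no details.

```python
def mid3_four_ops(a, b, c):
    """Median of three values using four two-input min/max operations."""
    lo = min(a, b)          # op 1
    hi = max(a, b)          # op 2
    capped = min(hi, c)     # op 3
    return max(lo, capped)  # op 4
```

For any ordering of the inputs the result equals the middle element of the sorted triple.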
17:54 snowycoder[d]: gfxstrand[d]: No more help on sm20/sm32 then? 😦
17:55 gfxstrand[d]: I can review stuff and try to answer questions. I just can't run tests for a bit
17:56 snowycoder[d]: Ok, thank you.
17:58 snowycoder[d]: For vild, I have a branch I'm hacking on: https://gitlab.freedesktop.org/SnowyCoder/mesa/-/tree/nak_sm32_vild?ref_type=heads
17:58 snowycoder[d]: sph is identical and shader code is equivalent so I have no clue about what's wrong.
18:00 HdkR: Is AMD_trinary_minmax used heavily enough that it's actually interesting to implement?
18:00 HdkR: Always seemed like one of those extensions to expose a hardware feature that barely anything hammers :D
18:01 mangodev[d]: i've heard of it before, but idk if it's *that* well used :P
18:02 mangodev[d]: ik Mary used to have a `vk_ext_mesh_shader` PR, but last time i checked, it got a little broken from the post pass scheduler
18:07 gfxstrand[d]: HdkR: No but it gives me an excuse to use a fun little hardware quirk.
18:08 HdkR: Reasonable enough
18:33 karolherbst[d]: I want tuples in C...
18:38 mhenning[d]: Have you tried using Structs™, The Official Product Type of C ?
18:38 ermine1716[d]: who hasn't
18:38 mohamexiety[d]: gfxstrand[d]: yep, they're both really trivial.
18:38 mohamexiety[d]: VkShaderModule cs_1 = qoCreateShaderModuleGLSL(
18:38 mohamexiety[d]: t_device, COMPUTE,
18:38 mohamexiety[d]: layout(push_constant, std430) uniform Push {
18:38 mohamexiety[d]: uint push_const;
18:38 mohamexiety[d]: } pc;
18:38 mohamexiety[d]: layout(set = 0, binding = 0, std430) buffer Storage {
18:38 mohamexiety[d]: uint ua[];
18:38 mohamexiety[d]: } ssbo;
18:39 mohamexiety[d]: layout (local_size_x = 32) in;
18:39 mohamexiety[d]: void main()
18:39 mohamexiety[d]: {
18:39 mohamexiety[d]: ssbo.ua[gl_LocalInvocationID.x] = pc.push_const + gl_LocalInvocationID.x;
18:39 mohamexiety[d]: }
18:39 mohamexiety[d]: );
18:39 mohamexiety[d]: VkShaderModule cs_2 = qoCreateShaderModuleGLSL(
18:39 mohamexiety[d]: t_device, COMPUTE,
18:39 mohamexiety[d]: layout(push_constant, std430) uniform Push {
18:39 mohamexiety[d]: uint push_const;
18:39 mohamexiety[d]: } pc;
18:39 mohamexiety[d]: layout(set = 0, binding = 0, std430) buffer Storage {
18:39 mohamexiety[d]: uint ua[];
18:39 mohamexiety[d]: } ssbo;
18:39 mohamexiety[d]: layout (local_size_x = 32) in;
18:39 mohamexiety[d]: void main()
18:39 mohamexiety[d]: {
18:39 mohamexiety[d]: ssbo.ua[gl_LocalInvocationID.x] = pc.push_const * gl_LocalInvocationID.x;
18:39 mohamexiety[d]: }
18:39 mohamexiety[d]: );
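What the two shaders above write can be modelled on the CPU with a short sketch. It assumes a single workgroup of 32 invocations is dispatched and models `uint` arithmetic as wrapping at 32 bits. The names `run_cs_1` and `run_cs_2` are labels for this sketch only.

```python
MASK32 = 0xFFFFFFFF   # GLSL uint arithmetic wraps modulo 2**32
LOCAL_SIZE_X = 32     # matches layout (local_size_x = 32)


def run_cs_1(push_const):
    """Contents of ssbo.ua after cs_1: push_const + gl_LocalInvocationID.x."""
    return [(push_const + i) & MASK32 for i in range(LOCAL_SIZE_X)]


def run_cs_2(push_const):
    """Contents of ssbo.ua after cs_2: push_const * gl_LocalInvocationID.x."""
    return [(push_const * i) & MASK32 for i in range(LOCAL_SIZE_X)]
```

The only difference between the two is the operator, which is why they are convenient for spotting which QMD fields follow the shader and which follow the data.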
18:39 mohamexiety[d]: I am going to try not using push constants and will also try having all constants being the same
18:46 karolherbst[d]: mhenning[d]: can't use them in switch statements
18:46 karolherbst[d]: okay.....
18:46 karolherbst[d]: I've disabled enough coop matrix to get this:
18:46 karolherbst[d]: Passed: 1018/31414 (3.2%)
18:46 karolherbst[d]: Failed: 0/31414 (0.0%)
18:46 karolherbst[d]: Not supported: 30396/31414 (96.8%)
18:46 karolherbst[d]: I think I slowly understand the code
18:47 karolherbst[d]: now let's add more matrix types and go from there
18:49 mohamexiety[d]: I am seeing a 100% pass rate, I say ship it
19:08 karolherbst[d]: I'm a bit confused on how to deal with lowered operations...
19:08 karolherbst[d]: maybe we should ignore the matrix layout there and uhm.. but...
19:08 karolherbst[d]: like operations on a single matrix
19:08 karolherbst[d]: currently it's a bit hard to figure out what's the layout supposed to be, but I also don't know if it matters...
19:09 karolherbst[d]: maybe it should be split in the code
19:09 karolherbst[d]: layout of single matrices, and then the muladd layouts..
19:09 karolherbst[d]: hopefully it's the same
19:09 karolherbst[d]: mhhh
19:09 karolherbst[d]: 16x8x8 is entirely wild
19:10 karolherbst[d]: the layout changes depending on the types of everything
19:10 karolherbst[d]: like.. inputs of fp16/fp32/fp64 all have different layouts
19:10 karolherbst[d]: and then for c/d it's either fp16+fp32 or fp64
19:19 mohamexiety[d]: mohamexiety[d]: so I tried having no push constants at all and having the same push constant on all 3 dispatches. for the constant push const case, I am seeing exactly these same numbers, no difference at all
19:22 mohamexiety[d]: for the no push const case, I am seeing the exact same numbers for the things that change ABA. however, for the things that become different with each dispatch, we get different numbers for everything:
19:22 mohamexiety[d]: now the different ones: (again, not next to each other)
19:22 mohamexiety[d]: Packet 1, A:
19:22 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
19:22 mohamexiety[d]: .VALUE = 0x6f3d9
19:22 mohamexiety[d]: Packet 1, B:
19:22 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
19:22 mohamexiety[d]: .VALUE = 0x6f3db
19:22 mohamexiety[d]: Packet 1, A:
19:22 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
19:22 mohamexiety[d]: .VALUE = 0x6f3dd
19:22 mohamexiety[d]: Packet 2, A:
19:22 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
19:22 mohamexiety[d]: .VALUE = 0x1bcf68
19:22 mohamexiety[d]: Packet 2, B:
19:22 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
19:22 mohamexiety[d]: .VALUE = 0x1bcf70
19:22 mohamexiety[d]: Packet 2, A:
19:22 mohamexiety[d]: mthd 3bc4 NVC7C0_CALL_MME_DATA(120)
19:22 mohamexiety[d]: .VALUE = 0x1bcf78
19:22 mohamexiety[d]: interestingly the first 'A' is the same as the other cases
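One purely arithmetic observation about the always-changing packets, sketched below with the values copied from the dumps in this log: in both the push-constant and the no-push-constant runs, the second packet equals four times the first plus four. That is only a property of these posted numbers. What it means for the QMD layout is not established here, and the list names are labels for this sketch.

```python
def second_is_four_times_first_plus_four(first, second):
    """Check second == 4 * first + 4 for every dispatch."""
    return all(s == 4 * f + 4 for f, s in zip(first, second))


# With a different push constant on each dispatch.
with_push_1 = [0x6F3D9, 0x6F3E1, 0x6F3E9]
with_push_2 = [0x1BCF68, 0x1BCF88, 0x1BCFA8]

# With no push constants at all.
no_push_1 = [0x6F3D9, 0x6F3DB, 0x6F3DD]
no_push_2 = [0x1BCF68, 0x1BCF70, 0x1BCF78]

print(second_is_four_times_first_plus_four(with_push_1, with_push_2))
print(second_is_four_times_first_plus_four(no_push_1, no_push_2))
```

The per-dispatch step of the first packet also shrinks from 8 to 2 when the push constants are removed, and the step of the second from 0x20 to 8.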
19:25 mohamexiety[d]: btw these are the only things missing for qmd re work
20:32 mohamexiety[d]: I wonder if trying on Ada would be helpful..
20:32 mohamexiety[d]: but at the same time given the amount of things changing, it wouldn't really help with pinpointing what is what
20:34 mohamexiety[d]: mohamexiety[d]: it could help due to sizes though. I'd _guess_ (and that could be a super bad guess) that the sizes of things would be the same across the QMD versions. so e.g., both of these are 32bit.
20:34 mohamexiety[d]: mohamexiety[d]: on the other hand here the first packet is 20bit and the second is 24bit
20:38 toponeyankee: there is no such thing as genetical mental illness as there are no such thing as reading from an index smaller value than the index worth removed with arithmetic, you may accidentally on a fluke way read out higher value than 256-X-Y=58, which you can yet more validate, because 127+71=198 and 256-198 has to be 58, that remainder if there is any , one can append back to double validate.
20:38 toponeyankee: Genetical mental illness such retarded condition (rather absurd as you are) was fabricated, cause medical world did not want to face charges of butchering me and using my stem cells illegally. Those Europes/Estonia tyrans are very good at terroring and very transparent when it comes down to being responsible of what they did and do and very incapable as of showing their legal skillsets. At
20:38 toponeyankee: one point trying to call someone mentally ill, and from the birth is most well understood fiction to deny harm cause and annoy the victim. So there is nothing funny about by hashing function either, it's fundamental way is indexes are yielding two equal and one other relevant value from a constant if enough larger than only +1, which means twice the index value can be conditioned right
20:38 toponeyankee: after to measure the target, so those in the example those such indexes were 58 and 116. Mental illness as such condition is a voodoo absurd.
20:38 toponeyankee: it simply is vague never appearing condition to complain about.
20:39 mohamexiety[d]: you know, for someone who looks here occasionally with no context, I wonder if my qmd ramblings read similar to these posts :nervous:
20:44 dwfreed: the difference your ramblings probably make sense to the devs here
20:44 dwfreed: s/your/is &/
20:44 dwfreed: that guy doesn't make sense to anybody
20:44 mohamexiety[d]: yeah..