00:02fdobridge: <gfxstrand> Yeah, I know what's going wrong, I think. I just don't know how to fix it yet
00:02fdobridge: <karolherbst🐧🦀> I hope it's something funky
00:02fdobridge: <gfxstrand> It's an interaction between MMEs and SEMAPHORE_WRITE
00:02fdobridge: <karolherbst🐧🦀> pain
00:37fdobridge: <marysaka> Is it related to the changes I made for queries maybe or is it more semaphore write in MME? (edited)
01:18fdobridge: <gfxstrand> I think it's related(ish) but there's still something I'm missing.
01:19fdobridge: <gfxstrand> I'm going to dump the blob and see what happens
01:30fdobridge: <gfxstrand> What does `SET_REFERENCE` do?
01:34fdobridge: <karolherbst🐧🦀> given that the field is called `SET_REFERENCE_COUNT` it sounds like a reference count of some sort
01:35fdobridge: <karolherbst🐧🦀> there was/is some dynamic part of the subchannel assignment
01:35fdobridge: <karolherbst🐧🦀> but I don't know if that's still relevant on modern GPUs
01:35fdobridge: <karolherbst🐧🦀> but on older GPUs you could assign subchannels to any id and other things, but that mapping became fixed on newer gpus afaik
01:43fdobridge: <airlied> it's in the open gpu docs
01:43fdobridge: <airlied> REF - Reference Count
01:47fdobridge: <bylaws> I think blob calls that before indirect draws
01:48fdobridge: <bylaws> Like it's some sort of cmdbuf barrier
01:50fdobridge: <gfxstrand> Yeah, that seems to be how the blob is using it
01:50fdobridge: <gfxstrand> I'm just not sure exactly what it's doing there. 😅 (edited)
05:19fdobridge: <gfxstrand> Okay, it's passing CTS now. It also fixes some indirect draw tests. Tomorrow I need to do some selective testing on Maxwell to make sure I didn't break anything there but I think we should be good.
15:41fdobridge: <gfxstrand> Now I just need someone to benchmark it. :triangle_nvk:
17:12Ilgaz: hi. sorry, health issue, can't type well. gnome-wayland works perfectly on opensuse tw except for a single stock application (gnome-terminal): its text area, and the window when it is moved, corrupt the display. zero hacks, mods, extensions. If I get the same issue with a Fedora live USB, should I report to freedesktop? distro maintainers send me to freedesktop.
17:13Ilgaz: nouveau, nv9400, macbook 5.1
17:15Ilgaz: The interesting part is, it is a gtk3 application which moved to gtk4 in current git.
17:24Ilgaz: A horrible photo of the screen. That same thing looks perfectly fine on file if I record/screenshot it. It is almost like an electronics issue. https://photos.app.goo.gl/7A1AjYxukLN47hzd7
17:52Ilgaz: gnome system monitor, another gtk3 application doing same things
18:18Ilgaz: sorry it was plasma system monitor and it created a crisis here. I have a huge dmesg log in hand. It flooded RAM and CPU displaying same glitches
19:36songafear: [12:15] songafear: https://www.codeproject.com/Articles/330174/Part-9-OpenCL-Extensions-and-Device-Fission so those things that let you control compute units or PEs and their SIMDS are called asynchronous command queues, that let you use two contexts per same device, so in opencl 1.2 as well as 3.0 that falls back to 1.2 are not available, so 1.2 is good enough standard for graphics too, however their event lists are synchronous blocking calls but only q
19:36songafear: [12:15] songafear: is async non-blocking call.
19:36songafear: [12:16] songafear: In other words you cannot start the kernel until all earlier kernels have finished.
19:36songafear: [12:29] songafear: As for the blocks grids and threads in cuda terms, you can vary all of them, but they get executed synchronous between each other, which is simpler and does the job very well on the most modern paradigm, so 1.2 is enough until you do same latency on all PEs which is bit almost inherent to the way hashed execution works.
19:36songafear: [12:30] songafear: However 2.1 is most sophisticated model that is not much needed in fact.
19:36songafear: [12:34] songafear: MST seems a cool idea on the other hand, but I must admit I assume things cause I haven't looked into this hub code
19:36songafear: [12:35] songafear: Opencl I was aware about
19:36songafear: [12:36] songafear: The biggest issue is we waste each other's time primarily you do that to me, however tech wise things are brilliant or should be, and I know that well
19:36songafear: [12:44] songafear: And all in all, if I forget about the terror Estonian entirely mindill and born Ill humans put me through, I am very educated and aware intelligent person, 40 years old, cunts from south Africa can not stalk or say things to me or england 2.5 years in a row, it's not allowed for abortion leftovers like that nor for their Finnish sluts.
19:36songafear: [12:46] songafear: Their delusions scams, theft and criminal and fucker stalker life is not welcome around me, they can not say things to me, if they do that again physical force will be used against those.
19:36songafear: [12:58] songafear: Gangsters are those as well as genetically born Ill leftovers, I am totally average mid class person without such delusions.
19:36songafear: [13:01] songafear: When I was a youth champion or good player has been on archives for long time already, cause after how I was treated I am no longer able to compete in such level anyhow if miracle treatment won't come through, so...
19:36songafear: [13:46] songafear: Whether I get some justice out of this is not questionable even, it's known to me, I do not coordinate this at all, they are soon all dead it's known that people dislike gods of abortion leftovers as in teams to commit terror against regulation like laws.
19:36songafear: [13:48] songafear: I have so much other hobbies and work to do that I do not coordinate this fightback or revenge even, but the outcome is very much expected.
19:36songafear: [14:00] songafear: But the hardwares are super strong that fact I discovered in fact some five years ago, there's only a little work yet to do, and nothing much complex anymore, I have the algorithms for this, how graphics are treated it's still good, that fixed function units no longer serve us well is just small evolution, graphics hw is still good.
19:36songafear: [14:00] songafear: And Microsoft spec is also good for shader model.
19:36songafear: [14:02] songafear: They all knew we eventually replace the FF units, so graphics still serve very well, the science behind it docs are so good
19:36songafear: [15:21] songafear: So the stashing algorithm, so 513 and 511 where 511 is focal you add 512 to it and say we have 128 banks in 1 sector so index is from 1 to 128. 1023-125 is 898 so the inverse buffer has 1+125+511
19:36songafear: [16:27] songafear: You notice that when you remove 512 from top but add 512 and 125 to lower it they sum as 1024
19:36songafear: [17:07] songafear: So it's good in case you do not remove 512 it would sum as 1535 , but in that case you remove 1024 and you get your 511 .
19:36songafear: [17:29] songafear: This is the most straightforward way to do it, so that itself is years of work how it is with all steps mixed into compiler solution, but that's the fastest way to start
19:36songafear: [17:30] songafear: So you want to give only a sum of values with some fixed or derivable metadata
19:36songafear: [17:35] songafear: I already know the best way so my years were already served
19:36songafear: [19:52] songafear: So the proof is if you never added 512 in the first place but only index you filtered out 511 as 1024 when you summed while others in the vector reach inverse index times to 511
19:36songafear: [20:05] songafear: If I forget later on, and if I live, tell me too :)