04:58 violet_purple_red[d]: mohamexiety[d]: Maybe I should test it myself then
04:58 violet_purple_red[d]: The thing is that the start of the game doesn't have much open world
04:58 violet_purple_red[d]: And I would want to see how nouveau handles the wide open world of wuwa
04:58 violet_purple_red[d]: (wuwa = wuthering waves)
10:09 kar1m0[d]: I am having a weird issue with steam
10:09 kar1m0[d]: it is just in an endless loop
10:09 kar1m0[d]: and it doesn't open
10:16 kar1m0[d]: wait maybe it's a mesa issue
17:09 mhenning[d]: leftmostcat: Did you figure out if this is a compiler bug or not? I'm currently looking at https://gitlab.freedesktop.org/mesa/mesa/-/issues/13966 which could possibly be a plop3 bug, so if you haven't had time I might work on it a bit.
17:11 leftmostcat[d]: mhenning[d]: I tried a few cases with nvdisasm tests and couldn't get it to produce anything unexpected, but I didn't do a comprehensive search of the space.
17:12 mhenning[d]: Okay, thanks
17:14 leftmostcat[d]: In particular, I couldn't get the not bit for operand 3 to make sense, as that diagram doesn't show one at all.
17:34 mhenning[d]: Oh, there might not be a `not` bit for that src - not every modifier can be used everywhere
17:41 leftmostcat[d]: I'm trying to scrounge up some hardware to test `iadd_sat`/`isub_sat` on, but if I can't manage that, I may just have to write the hw tests and ask someone else to run them. 😛
17:42 leftmostcat[d]:is a master of planning ahead.
17:46 mhenning[d]: oh, do you not have an nvidia gpu? That does make testing harder
17:54 leftmostcat[d]: Not a new enough one, no.
18:00 mohamexiety[d]: cubanismo: alright having looked more at things, yeah the domain check in the vmm code is fully redundant; the bo code check is sufficient and protects us against all cases that are marked with `GART`. if the bo has that flag it will always get 4K no matter what. for eviction, we only really have eviction for `GART | VRAM` stuff, so that will always get 4K and won't get compression. under high
18:00 mohamexiety[d]: memory pressure we evict `VRAM`-only stuff to sysmem, but the context is halted until it gets paged back into VRAM and it's unusable in that scenario so we don't have to worry about page sizing.
18:00 mohamexiety[d]: the tldr I guess is we can either remove that check entirely or keep it just in case, I don't really have any preferences towards the matter
18:03 mohamexiety[d]: if the HW really doesn't care about having larger page sizes in sysmem then I guess future work could try seeing why nouveau doesn't do that and fix things up because this would make life a lot a lot a lot easier in this area
18:03 cubanismo[d]: OK, makes sense.
18:05 mohamexiety[d]: (the whole dedicated alloc nonsense is specifically because of this kernel limit)
18:06 mhenning[d]: leftmostcat[d]: what card do you have? there's plenty of compiler stuff to do for kepler/maxwell/pascal if you would prefer to switch to a project you can test - or you can keep going on the iadd_sat stuff if you prefer
18:08 cubanismo[d]: mohamexiety[d]: Up to you/others then if you want to leave the check in for clarity or future-proofing. As I say, no strong opinion from me. Just reply what you decide on the patch thread and I'll do a Reviewed-by:
18:10 leftmostcat[d]: mhenning[d]: I believe my newest Nvidia card is Fermi, but it's in a box and I'd have to dig it out. I'm putting out feelers locally to see if I can turn something up.
18:11 x512[m]: cubanismo[d]: Do you know how to get flip completion notifications from NVKMS_IOCTL_FLIP? Tried to use NVKMS_IOCTL_DECLARE_EVENT_INTEREST and NVKMS_IOCTL_GET_NEXT_EVENT, but got nothing.
18:11 mhenning[d]: leftmostcat[d]: ah, yeah, that's too old for nvk. good luck in the search!
18:11 cubanismo[d]: Uh
18:12 cubanismo[d]: That's kind of a mess.
18:12 cubanismo[d]: And will probably be reworked really soon.
18:12 leftmostcat[d]: Thanks. I should be able to manage something. 🙂
18:15 mohamexiety[d]: if you want the absolute cheapest, look for a quadro T400 or A400, those are like $135 MSRP, let alone second hand. they're really weak cards but the first is TU117 Turing (does not have RT or tensor cores) and the second is GA107 Ampere (has RT, tensor cores, etc). there's also the GTX 1630 for a tiny bit more powerful offering. that's also TU117
18:16 cubanismo[d]: x512[m]: Note the documentation/comment in nvkms-api.h:
18:16 cubanismo[d]: > When a client requests a flip and specifies a completion notifier with NvKmsCompletionNotifierDescription::awaken == TRUE, this event will be generated. This event is only delivered to clients with flipping permission.
18:17 mohamexiety[d]: for higher perf but still relatively cheaper stuff I'd probably look at the 2060/2060S/3050/4060/5050/5060
20:03 x512[m]: Yeah, NVKMS flip notification is working. 2 notifiers were needed.
20:03 x512[m]:sent a code block: https://matrix.org/oftc/media/v1/media/download/AeDjTSckBpHZinOLswahcVPfDuospMhNN3c2b9IQujxBYP90idQOU7n7lyOJF_aEEER0dfJVDVrwLJW-U1tl_C9Ceak2ol8QAG1hdHJpeC5vcmcvSFNQc0FzekRUWkpiQnBPSmNCUVB5c0NC
21:20 mhenning[d]: who wants to review a small compiler fix? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38220
21:26 cubanismo[d]: x512[m]: Oh, yeah, you would need a notifier per outstanding frame. Was that the issue?
21:26 x512[m]: I originally didn't understand the notifiers' purpose.
21:27 x512[m]: Another issue is that acquire/release semaphores in the flip request do not work.
21:28 x512[m]:sent a code block: https://matrix.org/oftc/media/v1/media/download/AW92a1dRvTaUGysVUhvjKx0r3jpG3p8jRJcG93_dZX6_fQvzDQfqDqDpyuflbL0wglYkMDQ4_rpuH3ezaWID6SBCeak7fdbQAG1hdHJpeC5vcmcvY0xIYUtDRkVkWXVoakxEUFRnTmpLTUtP
21:29 x512[m]: The semaphore value is not affected and the flip request succeeds without blocking.
21:32 x512[m]: Also under some conditions I managed to get a process that calls NVKMS stuck and unkillable.
21:40 leftmostcat[d]: mhenning[d]: The logic changes make sense to me, though I admit I don't fully understand why dedup helps in the first place.
21:48 snowycoder[d]: leftmostcat[d]: `a & ~a` can probably be constant-folded, don't know why it should happen that late though
21:54 mhenning[d]: yeah, it's just an optimization. we can fold things like (x ^ y) & x into a single LUT on x and y which then frees up space in the instruction
21:55 leftmostcat[d]: Ahh, okay.
21:57 leftmostcat[d]: Thanks.
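
Editor's note: the folding mhenning describes, turning `(x ^ y) & x` into a single LUT on x and y, can be illustrated with a small truth-table sketch. This is not the NAK code. It assumes the usual LOP3-style convention where the three sources are represented by the masks 0xF0, 0xCC and 0xAA and the LUT is an 8-bit truth table; whether plop3 encodes things exactly this way is an assumption here. `lut_of` and `apply_lut` are hypothetical helper names made up for the illustration.

```python
# Truth-table masks for the three LUT sources (assumed LOP3-style convention).
SRC_A, SRC_B, SRC_C = 0xF0, 0xCC, 0xAA


def lut_of(fn):
    """Evaluate fn on the source masks to get its 8-bit truth table."""
    return fn(SRC_A, SRC_B, SRC_C) & 0xFF


def apply_lut(lut, a, b, c):
    """Look up the result of a 3-input LUT for single-bit inputs a, b, c."""
    index = (a << 2) | (b << 1) | c
    return (lut >> index) & 1


# (x ^ y) & x depends only on x and y, so both operations fit in one LUT
# and the third source slot is left free.
folded = lut_of(lambda x, y, _z: (x ^ y) & x)

# The same table falls out of the simpler form x & ~y.
and_not = lut_of(lambda x, y, _z: x & ~y)
```

Because the folded table never looks at the third input, the two-operation expression collapses into one instruction with a spare source, which is the "frees up space in the instruction" part of the explanation.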