IRC Logs of #d3d9 on irc.freenode.net for 2023-04-14

05:23 adavy: q4a: found the wfog programmable ps issue
05:23 adavy: it was not updating the ps dirty flag
05:24 q4a: Good catch)
05:25 adavy: now it fully passes the tests
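The bug described above (a state change that did not mark the pixel shader dirty) can be illustrated with a minimal sketch. Everything here is hypothetical: the `ctx` struct, the `DIRTY_PS`/`DIRTY_VS` flags and the function names are made up for illustration and are not nine's actual code. The only point is that a state setter must raise the dirty bit, otherwise the validation step never picks a new shader variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical dirty bits, not nine's real flag names. */
#define DIRTY_PS (1u << 0)
#define DIRTY_VS (1u << 1)

struct ctx {
    bool fog_enable;      /* the state that influences the ps variant */
    unsigned dirty;       /* bitmask of things to revalidate before a draw */
    unsigned ps_variant;  /* 0 = no fog code, 1 = fog code */
    unsigned rebuilds;    /* how many times the ps variant was re-selected */
};

static void set_fog_enable(struct ctx *c, bool enable)
{
    if (c->fog_enable == enable)
        return;              /* redundant set: nothing to revalidate */
    c->fog_enable = enable;
    c->dirty |= DIRTY_PS;    /* forgetting this line reproduces the kind of bug described */
}

/* Called before each draw: re-select the ps variant only when flagged. */
static void validate(struct ctx *c)
{
    assert((c->dirty & ~(DIRTY_PS | DIRTY_VS)) == 0); /* no unknown bits */
    if (c->dirty & DIRTY_PS) {
        c->ps_variant = c->fog_enable ? 1 : 0;
        c->rebuilds++;
    }
    c->dirty = 0;
}
```

Without the `c->dirty |= DIRTY_PS` line, `validate` would keep the stale variant even though `fog_enable` changed.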
05:25 adavy: only remains the question about odepth
05:26 q4a: Did you create an MR with the fix, or did you push it to your branch?
05:26 q4a: I'll try to add odepth test today
05:34 q4a: ok. Looks like it is this branch: https://gitlab.freedesktop.org/axeldavy/mesa/-/commits/fixes_for_merge
05:39 adavy: yes
05:50 adavy: btw
05:50 adavy: if you want to optimize nine, one area that can benefit a lot and is not too hard is ff constant compaction
05:52 adavy: you'll notice in the programmable vs path that two optimizations are present that are not in the ff path
05:53 adavy: a minor one is that we recompile after inserting the boolean and integer constants
05:53 adavy: this allows loop unrolling and makes loops faster
05:53 adavy: for ff, we already handle loops manually for ps, and the only loop is for vs lighting
05:53 adavy: this is a rare case, probably not worth optimizing
05:54 adavy: (vs takes only a fraction of the time of ps since it is run less often)
05:54 adavy: however the second optimization is after we know which constants are needed, we recompile the shader with compacted constants
05:55 adavy: basically if you need 100 constants, you only upload 100
05:57 adavy: right now for ff any constant change leads to uploading 3KB of constants
05:57 adavy: because absolutely no compaction or detection of needed size is implemented
05:58 adavy: the only drawback of compaction is you need to reupload constants when you change the shader
05:58 adavy: but let's face it, constants change when the shader changes
05:59 adavy: ofc this optimization will only affect ff games which usually are pretty light and already fast
06:02 adavy: compaction would also help the very rare case of games rendering with software vertex processing (swvp). This is rare, but when they do, we upload a lot of constants as no compaction is done and the range available is huge (prepare_vs_constants_userbuf_swvp). The software vertex processing is emulated in hw right now
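The compaction idea described above can be sketched as follows. This is only an illustration under stated assumptions, not nine's implementation: `vec4`, `const_map`, `build_const_map`, `pack_constants` and `MAX_CONSTS` are hypothetical names, and the "used" mask is assumed to come from whatever pass already knows which constants the shader reads. The shader would then be compiled to index the compact slots, and only the packed range is uploaded.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONSTS 256   /* hypothetical size of the full constant file */

typedef struct { float v[4]; } vec4;

/* Hypothetical remap table, rebuilt whenever the shader changes. */
struct const_map {
    unsigned count;              /* number of constants the shader reads */
    unsigned slot[MAX_CONSTS];   /* compact index -> original register index */
};

/* Build the remap from a per-register "used" mask. */
static void build_const_map(struct const_map *map,
                            const unsigned char *used, unsigned n)
{
    assert(n <= MAX_CONSTS);
    map->count = 0;
    for (unsigned i = 0; i < n; i++)
        if (used[i])
            map->slot[map->count++] = i;
}

/* Gather only the used constants into dst; returns the bytes to upload. */
static size_t pack_constants(const struct const_map *map,
                             const vec4 *full, vec4 *dst)
{
    for (unsigned i = 0; i < map->count; i++)
        dst[i] = full[map->slot[i]];
    return map->count * sizeof(vec4);
}
```

As an arithmetic illustration only: if the 3KB mentioned above corresponded to 192 slots of 16 bytes, a shader reading 3 of them would upload 48 bytes instead of 3072. The cost noted in the discussion still applies: the map is tied to one shader, so the packed constants have to be re-gathered and re-uploaded when the shader changes.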
06:05 q4a: Good. I will try to optimize that, but first I want to finish with the fog.
06:07 q4a: I will try my best. I'm not so smart, but I have free time to learn if I use the right books/materials/friends' help.
06:10 adavy: the other area that can be optimized is stateblocks, but this is harder
06:10 adavy: basically stateblocks make it possible to set a range of recorded states at the same time
06:11 adavy: right now we record what needs to be done, and call manually one command for each state to set
06:11 adavy: this could be less heavy, but it is easy to make a mistake
06:16 adavy: in the case of stateblocks, the optimization wouldn't necessarily be to avoid making in the end one call for each state (because there are a lot of behaviours to handle when a state is set)
06:17 adavy: but right now the calls are made in the main thread and result in a lot of calls appended to the list of calls to make for the secondary thread
06:17 adavy: this is ok for the secondary thread to have a bit of work, but the process of appending the calls to the list is done in the main thread, and worse if the list is full it waits for the secondary thread
06:17 adavy: the optimization would be to reduce the amount of work in the main thread
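One possible reading of the stateblock idea above, sketched under assumptions: instead of the main thread appending one command per recorded state, it appends a single command that points at the recorded list, and the secondary thread expands it into the per-state work. All names here (`cmd`, `queue`, `enqueue_per_state`, `enqueue_block`, `drain`, `QUEUE_CAP`) are hypothetical, and the queue is a plain single-threaded array so the sketch stays runnable; a real queue between two threads needs synchronization.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 64   /* hypothetical queue capacity */

struct state_entry { unsigned state; unsigned value; };

enum cmd_type { CMD_SET_STATE, CMD_APPLY_BLOCK };

struct cmd {
    enum cmd_type type;
    struct state_entry one;              /* payload for CMD_SET_STATE */
    const struct state_entry *entries;   /* payload for CMD_APPLY_BLOCK */
    size_t count;
};

struct queue { struct cmd items[QUEUE_CAP]; size_t len; };

/* Returns 0 when full; a real main thread would have to wait here. */
static int push(struct queue *q, struct cmd c)
{
    if (q->len == QUEUE_CAP)
        return 0;
    q->items[q->len++] = c;
    return 1;
}

/* Per-state approach: the main thread appends n commands. */
static size_t enqueue_per_state(struct queue *q,
                                const struct state_entry *e, size_t n)
{
    size_t pushed = 0;
    for (size_t i = 0; i < n; i++) {
        struct cmd c = { CMD_SET_STATE, e[i], NULL, 0 };
        pushed += (size_t)push(q, c);
    }
    return pushed;
}

/* Batched approach: the main thread appends one command for the block. */
static size_t enqueue_block(struct queue *q,
                            const struct state_entry *e, size_t n)
{
    struct cmd c = { CMD_APPLY_BLOCK, {0, 0}, e, n };
    return (size_t)push(q, c);
}

/* Secondary-thread side: still one state write per recorded entry. */
static size_t drain(struct queue *q, unsigned *states, size_t nstates)
{
    size_t applied = 0;
    for (size_t i = 0; i < q->len; i++) {
        const struct cmd *c = &q->items[i];
        if (c->type == CMD_SET_STATE) {
            assert(c->one.state < nstates);
            states[c->one.state] = c->one.value;
            applied++;
        } else {
            for (size_t j = 0; j < c->count; j++) {
                assert(c->entries[j].state < nstates);
                states[c->entries[j].state] = c->entries[j].value;
                applied++;
            }
        }
    }
    q->len = 0;
    return applied;
}
```

Both paths end with the same per-state work on the consumer side; only the number of queue appends on the producer side differs. One design constraint the sketch glosses over: the recorded entries must stay alive and unmodified until the secondary thread has consumed the command.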
06:19 adavy: we can discuss that when we come to it
06:22 q4a: yea.. Also I would like to keep these suggestions in mesa issues with tags like: nine, difficulty: easy/medium, good-first-task or something like that
06:22 q4a: That would let people who don't read all the irc logs see the tasks
06:24 q4a: Also I can add comment in MR that it fixes one of issues
06:24 adavy: ok
06:24 q4a: * Also I will be able to add comment in MR that it fixes one of issues
15:53 adavy: q4a: I'm not sure of the process. One issue per suggestion or a big issue ?
16:00 q4a: adavy: I would like to see one issue per suggestion, but you can do it any way you want =)