05:17 imirkin: cwabbott: i think the idea is that instead of spilling the 4-wide merge (which is just an RA constraint), you'd spill/restore the individual components into the appropriate regs.
05:18 imirkin: it's literally the "wide" value which is marked as no-spill, not its components
05:18 imirkin: anyways, i still haven't really looked at the wider details here
06:19 cwabbott: imirkin: well, you still might need to spill/unspill the "wide" value
06:19 imirkin: when would truly spilling it be beneficial?
06:20 cwabbott: it might be necessary
06:20 cwabbott: which means spilling/unspilling its components too and deleting the merge
06:20 imirkin: it's made up of the components
06:20 imirkin: anyways, in the current scenario, the fact that it doesn't get a weight computed is tripping things up
06:20 cwabbott: but some things use the whole register, no?
06:21 imirkin: sure, a tex might want those 4 values all loaded into the appropriate sequence of 4 regs
06:21 imirkin: but spilling is done 1 reg at a time
06:21 imirkin: (admittedly you can do wider writes, but that's a second order opt)
06:21 cwabbott: then you have to spill/unspill all the subregs
06:21 imirkin: correct
06:22 cwabbott: even at the texture
06:22 imirkin: well, by the time the texture runs, it _has_ to have 4 sequential regs
06:22 imirkin: no matter what
06:22 cwabbott: yes, that's right
06:22 imirkin: so either there's room for those 4 values
06:22 imirkin: or there isn't
06:23 imirkin: if some regs only become available right before the tex
06:23 imirkin: then those components need to be restored
06:23 imirkin: but it's still going to be a per-component decision
06:23 imirkin: rather than a 4-reg-wide decision
06:24 imirkin: at least that's the idea. perhaps it doesn't jibe with the algorithm :)
06:24 imirkin: (well, clearly it doesn't)
06:25 cwabbott: well, that assumes that the merge always comes right before the tex, I guess?
06:25 imirkin: yes. which is accurate, i believe
06:25 imirkin: and there are "constraint" moves into values which are merged
06:26 imirkin: i.e. it won't take random ssa values from elsewhere - it'll create fresh ones just to be sources into the merge
06:26 cwabbott: right, so the merges could be deleted and the SSA values could become subregs of the larger reg and it would work out
06:26 imirkin: exactly.
06:27 imirkin: that's the dream :)
06:27 cwabbott: there might still be some case where multiple wide values are live
06:27 imirkin: sure
06:27 cwabbott: and then you'd be screwed
06:27 imirkin: but if a tex needs 2x 4-wide things
06:27 cwabbott: seems very fragile
06:27 imirkin: then it needs 2x 4-wide things
06:27 cwabbott: I mean, aren't there other things that emit merges though?
06:28 imirkin: yeah, but those aren't marked noSpill. i think.
06:28 imirkin: this is specifically for tex (and surface) ops
06:28 imirkin: [and some other super-weird cases iirc]
06:29 cwabbott: i dunno, that all seems pretty fragile though
06:29 imirkin: [which definitely don't apply here]
06:29 imirkin: yes. fragile indeed.
06:29 imirkin: the immediate problem is that the noSpill merge tries to get simplified, which bails due to a missing weight
06:29 imirkin: which is explicitly not calculated for noSpill nodes
06:29 cwabbott: is the weight a spill weight?
06:30 imirkin: yes, spill weight.
06:30 imirkin: or rather, it gets an "infinite" spill weight
06:30 imirkin: anyways, i'll get time to investigate this more .... in the future
06:30 imirkin: life keeps getting in the way
06:31 imirkin: (stupid taxes)
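
A rough sketch of the "infinite spill weight" idea discussed above, assuming a simple cost model. The names `spill_weight` and `pick_spill_candidate`, the uses-times-loop-depth formula, and the weight/degree ratio are all illustrative assumptions, not the actual nouveau codegen code. It only shows how a noSpill node can be kept out of spill selection by giving it an infinite weight rather than leaving the weight uncomputed.

```python
import math

def spill_weight(uses, loop_depth, no_spill):
    # Hypothetical cost model: values used often, or inside loops, are
    # more expensive to spill. noSpill values get an infinite weight.
    if no_spill:
        return math.inf
    return uses * 10 ** loop_depth

def pick_spill_candidate(nodes):
    """nodes maps name -> (weight, degree). Returns the cheapest node to
    spill per unit of interference, or None if every node is noSpill."""
    best = min(nodes, key=lambda n: nodes[n][0] / max(nodes[n][1], 1))
    if math.isinf(nodes[best][0]):
        return None  # only noSpill nodes remain: nothing can be spilled
    return best

nodes = {
    "merge": (spill_weight(3, 1, no_spill=True), 5),   # 4-wide tex source
    "t0":    (spill_weight(2, 1, no_spill=False), 4),
    "t1":    (spill_weight(4, 0, no_spill=False), 1),
}
candidate = pick_spill_candidate(nodes)
```

With these made-up numbers the merge is never chosen, and the selection falls on whichever ordinary value has the lowest weight per neighbor.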
14:21 cwabbott: imirkin: ah, seems like the way optimistic coloring and spilling are implemented is a bit dumb
14:22 cwabbott: (optimistic coloring is when you push something onto the stack without being sure that you can color it, and hope for the best)
14:22 cwabbott: normally you choose the node with the smallest degree, since that's the most likely to succeed
14:23 cwabbott: but it seems that there's some attempt to choose which nodes to spill when selecting - it just spills anything it can't find a register for
14:24 cwabbott: which, of course, has to be an optimistically-colored node
14:25 cwabbott: but, if the optimistically-colored node is a noSpill node... you're screwed
14:28 cwabbott: also, if I'm reading this correctly it chooses the higher degree first, which is really gonna do you bad
14:28 cwabbott: the classic algorithm just spills one thing at a time, which avoids all of this but can be slower
14:29 cwabbott: since you'll need more round-trips through the simplify/select/spill loop
14:30 cwabbott: (only for things that spill of course)
14:31 cwabbott: but choosing (effectively) the node with the highest degree will really increase the probability of spilling because you'll try to color the node with the most already-colored neighbors and probably fail
14:32 cwabbott: I *think* the idea is to pick the "best" node to spill, since the order things get pushed on the stack determines what gets spilled
14:32 cwabbott: but it just screws you over because it makes you more likely to spill in the first place
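
The simplify/select loop with optimistic coloring that cwabbott describes can be sketched roughly as follows. This is a textbook-style illustration under stated assumptions, not the nouveau implementation: `color_graph`, its arguments, and the error raised for noSpill nodes are hypothetical. It always removes the lowest-degree node first, so a push is "optimistic" whenever that degree is still >= k.

```python
def color_graph(graph, k, no_spill=frozenset()):
    """graph maps node -> set of neighbours; k is the number of registers.
    Returns (colors, spilled)."""
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    # Simplify: repeatedly remove the lowest-degree node. If its degree
    # is >= k the push is optimistic: it may or may not get a color.
    while work:
        node = min(work, key=lambda n: (len(work[n]), n))
        stack.append(node)
        for neighbour in work.pop(node):
            work[neighbour].discard(node)
    # Select: pop nodes and give each the first color no neighbour uses.
    colors, spilled = {}, []
    while stack:
        node = stack.pop()
        used = {colors[m] for m in graph[node] if m in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[node] = free[0]
        elif node in no_spill:
            # The failure mode from the log: an optimistically pushed
            # noSpill node could not be colored.
            raise RuntimeError(f"cannot color noSpill node {node!r}")
        else:
            spilled.append(node)
    return colors, spilled

# A 4-cycle with 2 registers: every node has degree 2 (>= k), so the
# first push is optimistic, yet coloring succeeds.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
colors, spilled = color_graph(cycle, 2)

# A triangle with 2 registers: one node must spill.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
tri_colors, tri_spilled = color_graph(triangle, 2)
```

Marking the spilled triangle node as noSpill (`no_spill={"a"}`) makes the sketch raise instead of spilling, which mirrors the "you're screwed" case above.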
14:43 karolherbst: imirkin: ever seen supported-gpus/supported-gpus.json inside the nvidia driver? it has some flags for hdmi4k and hdmi4k60rgb444