05:24 endrift: Hmm, is the modulo operator (on integers) just really slow on Maxwell shaders? I have some shader code that conditionally does x -= x % y to grid-align coordinates to an arbitrary (read: npot) grid, but when it's enabled it's super slow on TX1 using Nouveau
05:24 endrift: it's fine everywhere else
05:25 imirkin: endrift: nvidia gpu's don't have integer division
05:26 endrift: that would do it
05:26 HdkR: None of them have integer division, you're relying on bigger GPUs brute forcing it
05:26 endrift: I was just wondering if that was the case
05:27 endrift: so is it possible to make this faster on my end? Or do I just have to suffer?
05:27 endrift: casting to float, dividing, then back to int seems...less than ideal
05:27 HdkR: If you can do BFEs it'll be faster
05:27 endrift: I might be able to munge it into some weird reciprocal division I guess
05:27 endrift: BFE?
05:27 HdkR: bitfield extract, and masking, etc
05:27 imirkin: if you do x % <immediate value> then it'll be faster
05:28 endrift: it's configurable via uniform unfortunately
05:28 imirkin: maybe blob inlines it, dunno
05:28 imirkin: oh, also i recommend making these values unsigned
05:28 imirkin: modulo with signed quantities is extra-annoying
05:29 imirkin: i.e. ensure that both x and y have unsigned types
05:29 endrift: ohhh fair
05:31 imirkin: (like wtf is -10 % -2? who knows. takes extra ops to figure it out)
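[note] A rough illustration of the "extra ops" point above, written in C rather than shader code. It assumes C's truncating convention (the remainder takes the sign of the dividend), which is only one possible answer to the -10 % -2 question. The names smod_via_unsigned and check_smod are hypothetical and not taken from any driver. The idea is that a signed remainder can be built as an unsigned remainder plus sign handling on both sides, which is the work that unsigned operands avoid.

```c
#include <assert.h>
#include <stdint.h>

/* Signed remainder expressed as an unsigned remainder plus sign fixups.
 * Assumption: C-style truncation, so the result has the sign of x. */
static int32_t smod_via_unsigned(int32_t x, int32_t y)
{
    /* Take magnitudes in unsigned arithmetic so INT32_MIN is handled. */
    uint32_t ax = x < 0 ? 0u - (uint32_t)x : (uint32_t)x;
    uint32_t ay = y < 0 ? 0u - (uint32_t)y : (uint32_t)y;

    uint32_t r = ax % ay;  /* the only "real" modulo; r < ay <= 2^31 */

    /* Re-apply the dividend's sign. */
    return x < 0 ? -(int32_t)r : (int32_t)r;
}

/* Compare against the C operator on a few sign combinations. */
static void check_smod(void)
{
    assert(smod_via_unsigned(-10, -2) == -10 % -2);
    assert(smod_via_unsigned(-7, 2) == -7 % 2);
    assert(smod_via_unsigned(7, -2) == 7 % -2);
    assert(smod_via_unsigned(7, 2) == 7 % 2);
}
```

With both operands declared unsigned, everything except the single `ax % ay` line goes away.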
06:02 endrift: given the limited range of both sides I wonder if there are any weird tricks I can do
06:03 endrift: lhs is always going to be between 0 and 255, rhs is always going to be between 2 and 8, both inclusive
06:04 HdkR: Use the shader5 extension to get uint8_t types? Maybe mesa will optimize it for you :P
06:31 endrift: ok, I did it the way the compiler does integer division by constant :P
06:31 endrift: multiply by a shifted up reciprocal, then shift down
06:31 endrift: since the number of divisors was small I can just precompute it and stuff it in a table
06:32 endrift: hopefully I can get away with using 20 bits of integer in the process though
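[note] A minimal sketch of the table-of-reciprocals approach described above, in C for easy testing. It is not endrift's actual shader. The 12-bit shift, the rounded-up reciprocals, and the names RECIP_SHIFT, recip_table, div_small, align_to_grid and verify_recip_table are all assumptions made for illustration. With x in 0..255 and y in 2..8, the largest product is 255 * 2048 = 522240, which stays below 2^20, and the exhaustive check confirms that the result matches true integer division over that whole range.

```c
#include <assert.h>
#include <stdint.h>

#define RECIP_SHIFT 12u
/* ceil(2^RECIP_SHIFT / y): the "shifted up reciprocal", rounded up. */
#define RECIP(y) (((1u << RECIP_SHIFT) + (y) - 1u) / (y))

/* Indexed directly by the divisor; entries 0 and 1 are unused. */
static const uint32_t recip_table[9] = {
    0u, 0u,
    RECIP(2u), RECIP(3u), RECIP(4u), RECIP(5u),
    RECIP(6u), RECIP(7u), RECIP(8u)
};

/* x / y for 0 <= x <= 255 and 2 <= y <= 8, with no divide instruction. */
static uint32_t div_small(uint32_t x, uint32_t y)
{
    return (x * recip_table[y]) >> RECIP_SHIFT;
}

/* Same result as x -= x % y, written as (x / y) * y. */
static uint32_t align_to_grid(uint32_t x, uint32_t y)
{
    return div_small(x, y) * y;
}

/* Exhaustive check over the stated input range. */
static void verify_recip_table(void)
{
    for (uint32_t y = 2u; y <= 8u; y++) {
        for (uint32_t x = 0u; x <= 255u; x++) {
            assert(x * recip_table[y] < (1u << 20)); /* fits in 20 bits */
            assert(div_small(x, y) == x / y);
            assert(align_to_grid(x, y) == x - x % y);
        }
    }
}
```

In a shader the table would be a small constant array indexed by the uniform. Rounding the reciprocal up works here because x times the rounding error stays below 2^12 for every input in range; a wider x or y would need the check rerun with a larger shift.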