Rotating a vector using integer math
Hello,
Today's galaxy brain idea is to use integer math to do arbitrary rotation about a point.
Why?
I can already hear you saying "We have very nice floating point units that are good at this already!" and I say, No thanks to the floating point unit. They're all well and good, but converting between floats and ints is slow on the GPU, wasting a whole instruction and especially so when you want to do index calculations. Also consider that on newer versions of GCN, we have packed ops, but no packed float/int conversions. Also, also consider that GCN doesn't have scalar floating point ops, and if you want to do float math in a scalar context, you'll be issuing vector instructions with SGPR sources! If you are maxed out on VALU and want to offload some calculations to SALU, then you're only making your problem worse if you need to do floating point ops.
Think about the index rotation case:
- read in some integer index data, maybe a thread ID.
- convert to float
- do some mads for your rotation
- convert back to int to access some resource
Wouldn't it be sick if you could avoid those unpackable conversions? Wouldn't it be dope if we could use SALU for this calculation? Well, you can, and I am going to show you how.
Are you for real?
Heck yes, I've gone from 4 ms to 1.5 ms in a shader using this technique. I'M HITTING 170% INSTRUCTION ISSUE RATE, BABY~!
(my SALU and VALU tandem energies)
Is this just fixed point math?
Maybe! I don't know exactly how fixed point works, but it might work like this. I won't say that I invented fixed point math, but I invented this technique. The whole idea behind this is to multiply our basis vector components by some BIG_NUMBER, multiply that with the vector we want to rotate, and then divide it by the BIG_NUMBER, all without overflowing. If your BIG_NUMBER is a power of two, then this turns out to be a left shift, [[your math]], and then a right shift. BIG_NUMBER needs to be selected carefully not to overflow.
You want to pick the biggest BIG_NUMBER you can without overflowing. To not overflow, the power of two that you choose for BIG_NUMBER has to be integer size minus the log2( largest value of your vector ) - 2. Subtract one because you need to add in the offset (which could bump you up to the next bit), and one for the sign bit. Example time:
- 32 bit integers, with one sign bit
- largest value in your vector is 1023 (say you're rotating an index in a 1024 x 1024 grid).
- log2(largest value) rounded up is 10.
- BIG_NUMBER is 1 << ( 32 - 10 - 2) = (1 << 20)
Preparation
First thing we need to do is prepare our data on the CPU. If you want to rotate a vector about a point, you can do with a 4x4 matrix using your standard vector math library (I'm using Unity here):
We also have a Basis class that compresses the matrix for a 2D rotation, taking just the X and Y components of the basis vectors and translation component. Our integer basis just has the BIG_NUMBER multiplied in and is rounded to the nearest int.
And that's all you need to do to prepare the data on the CPU.
Shader side
For the shader itself, just do your standard vector-matrix multiply, but with a division at the end. Using our IntegerBasis, it looks like this:
And that's really all there is to it. Conversion free, packable, and SALU compatible! I hope this can save you some cycles. BONUS ALGORITHM: You can also use this technique to implement integer lerp! Tweet me your spiciest Integer Lerp implementation @pyromuffin if you're feeling that SALU POWER!
XOXO
Kelly
Comments
Post a Comment