How to implement Ballot() with DX11
Hi, today I will talk about a quick (slow) hack for getting some platform-specific code working cross-platform. This is really about correctness, not speed; otherwise the shader code wouldn't run on Windows at all. The hack won't be necessary once we can switch over to DXC, Microsoft's new shader compiler with Shader Model 6.0 support, but for now that's out of reach.

from GPUopen... though I'm pretty sure this isn't actually what it does.

To explain what the function Ballot() actually does, I first need to take a little detour and explain GPU hardware.

GPU SIMDs

A (modern, discrete) GPU executes each instruction for WAVE_SIZE threads at a time (64 on AMD and 32 on Nvidia), all in lockstep. An easy way to think about it is that a set of WAVE_SIZE threads all share one instruction pointer. A group of WAVE_SIZE threads all located on a single SIMD is called a wave (or warp, in NV terminology). The cool thing about...