Float precision on GPU, bugs/features
- Reading float bits from texture and losing float bits. (first screenshot)
- Precompiled floats. CPU vs GPU.
- More about float bits and int/uint, float can be equal to 0. but…
Reading float bits from texture and losing float bits:
Screenshot above. Test shader: shadertoy link. Expected result: 123 everywhere.
Numbers on the left side of the screenshot:
The blue number is the result of this operation (line 29 in the Shadertoy code):
uint value_2 = floatBitsToUint(uintBitsToFloat(uint(value+iZero))+fZero);
Result: either 0 or value, depending on the GPU and the GPU shader language.
Reason: uintBitsToFloat(uint(value+iZero)) stores value as float bits, but the test uses the small uint value 123, and that bit pattern is a denormal float (about 1.7e-43).
The sum float(<from uint 123>) + 0. is processed as a "float operation" by the GPU float units, which may flush denormals to zero, so uint(0) can come back as the result.
Conclusion: float arithmetic on a value that only carries float bits may lose those bits.
The red number can be 0 or 123 depending on the CPU shader compiler (a buggy compiler can produce 0).
The green number should be 123 everywhere.
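To make the blue-number case easier to try in isolation, here is a minimal Shadertoy-style sketch, not the original test shader. It assumes GLSL ES 3.00 and the Shadertoy built-in iFrame. The names kept and risky are mine, and iZero/fZero are rebuilt here from iFrame only as one possible way to get a runtime zero.

```glsl
// Minimal sketch, not the original test shader.
// Assumes Shadertoy (GLSL ES 3.00); iFrame is a Shadertoy built-in.
void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    // Runtime zeros that the compiler cannot fold away.
    uint  iZero = uint(min(iFrame, 0));
    float fZero = float(min(iFrame, 0));

    uint value = 123u;

    // Pure bit round trip, no float arithmetic in between.
    uint kept = floatBitsToUint(uintBitsToFloat(value + iZero));

    // Same round trip with a float add in the middle. The bit pattern 123
    // is a denormal float, so hardware that flushes denormals may return 0.
    uint risky = floatBitsToUint(uintBitsToFloat(value + iZero) + fZero);

    // Green: the bits survived the add. Red: the bits were lost.
    vec3 col = (risky == value) ? vec3(0.0, 1.0, 0.0) : vec3(1.0, 0.0, 0.0);

    // Blue: even the plain round trip did not return 123.
    if (kept != value) col = vec3(0.0, 0.0, 1.0);

    fragColor = vec4(col, 1.0);
}
```

If the screen is red on your GPU, keep packed bits in uint variables and convert with uintBitsToFloat only at the moment of writing to the output.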
Right side of the screenshot, texture reading:
Do not expect to read valid bits with textureLod or texture when the FBO interpolation is set to Linear or Mipmap.
Only texelFetch always returns valid bits.
texture and textureLod may return valid bits only when the interpolation is set to Nearest.
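The difference between the two ways of reading can be sketched like this. It is an assumption-heavy example: iChannel0 is assumed to be a float buffer (for example Buffer A) of the same size as the image, with uintBitsToFloat(some_uint) stored in its red channel. The names exact and filtered are mine.

```glsl
// Minimal sketch. Assumes Shadertoy, with iChannel0 bound to a float buffer
// of the same resolution whose red channel holds uintBitsToFloat(some_uint).
void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    // texelFetch addresses one texel by integer coordinate, no filtering.
    ivec2 texel = ivec2(fragCoord);
    uint exact = floatBitsToUint(texelFetch(iChannel0, texel, 0).r);

    // texture() goes through the sampler. With Linear or Mipmap filtering
    // the value is processed as a float, which can change the packed bits.
    vec2 uv = fragCoord / iResolution.xy;
    uint filtered = floatBitsToUint(texture(iChannel0, uv).r);

    // Green where both reads agree, red where the sampler changed the bits.
    fragColor = (exact == filtered) ? vec4(0.0, 1.0, 0.0, 1.0)
                                    : vec4(1.0, 0.0, 0.0, 1.0);
}
```

Switching the channel filter between Nearest, Linear and Mipmap in the Shadertoy sampler settings is the quickest way to see when the two reads diverge.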
Results of functions such as the trigonometric functions (sin etc.), pow, sqrt and others computed on the CPU may not be equal to the results on the GPU.
Test shader: shadertoy link. Expected: the left side of the screenshot equals the right side.
The left-side code uses a static iTime=32., so the shader code is precalculated on the CPU.
The right-side code uses iTime = 32 + GPU(0), which forces the code to be executed on the GPU.
First line on the screenshot: the result of a sin-based random. It is not the same on CPU and GPU.
Also remember that the result of a sin-based hash random is not the same from GPU to GPU; only a uint-based hash random will be the same on every GPU and CPU.
Myths About Floating-Point Numbers by Adam Sawicki says this:
The reason random numbers are generated on NVIDIA cards and not on AMD is that sine instruction on AMD GPU architectures actually has period of 1, not 2*PI. But it is still fully deterministic in regards to input value. It just returns different results between different platforms.
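For comparison, here is a sketch of both kinds of hash side by side. The sin-based one is the commonly copied one-liner. The uint-based one is a generic xorshift-multiply mixer with one illustrative set of constants, not necessarily the hash used in the test shader. The function names are mine.

```glsl
// Minimal sketch. Assumes Shadertoy (GLSL ES 3.00).

// Common sin-based hash: depends on how each GPU implements sin().
float hash_sin(vec2 p)
{
    return fract(sin(dot(p, vec2(12.9898, 78.233))) * 43758.5453);
}

// Integer mixer: only uses xor, shift and wrapping multiply on uint.
// The constants are one illustrative choice.
uint hash_u32(uint x)
{
    x ^= x >> 16;
    x *= 0x7feb352du;
    x ^= x >> 15;
    x *= 0x846ca68bu;
    x ^= x >> 16;
    return x;
}

// uint-based hash mapped to [0, 1). Only the top 24 bits are used so that
// the uint-to-float conversion and the power-of-two scale are both exact.
float hash_uint(vec2 p)
{
    uvec2 q = uvec2(p); // fragCoord is non-negative
    uint h = hash_u32(q.x ^ hash_u32(q.y));
    return float(h >> 8) * (1.0 / 16777216.0);
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    // Left half: sin-based noise. Right half: uint-based noise.
    float r = (fragCoord.x < 0.5 * iResolution.x) ? hash_sin(fragCoord)
                                                  : hash_uint(fragCoord);
    fragColor = vec4(vec3(r), 1.0);
}
```

The right half should look identical on every GPU, while the left half is the part that can change between vendors.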
The next lines show 1 or 2 and 243 or 242 because pow on the CPU is not equal to pow on the GPU.
Functions like smoothstep may follow the spec literally when evaluated by the CPU shader compiler:
smoothstep returns 0.0 if x ≤ edge0 and 1.0 if x ≥ edge1.
Results are undefined if edge0 ≥ edge1.
Test shader: shadertoy link. The result of smoothstep(1., 0.9, 0.) is 0 on the CPU and 1 on the GPU.
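One possible explanation for the 0 versus 1 result can be sketched with the reference formula from the GLSL documentation. This is a guess at what happens, not something read from any driver. smoothstep_formula and the variable names are mine, and min(iTime, 0.0) is used as the runtime zero.

```glsl
// Minimal sketch. Assumes Shadertoy (GLSL ES 3.00).

// Reference formula from the GLSL documentation, without any edge checks.
float smoothstep_formula(float edge0, float edge1, float x)
{
    float t = clamp((x - edge0) / (edge1 - edge0), 0.0, 1.0);
    return t * t * (3.0 - 2.0 * t);
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    float zero = min(iTime, 0.0); // 0 at runtime, blocks constant folding

    // edge0 > edge1, so the result is undefined by the spec.
    // A compiler applying "0.0 if x <= edge0" would fold this to 0.
    float folded = smoothstep(1.0, 0.9, 0.0);

    // Same call, but the argument is only known on the GPU.
    float runtime = smoothstep(1.0, 0.9, 0.0 + zero);

    // The plain formula: (0 - 1) / (0.9 - 1) = 10, clamped to 1, result 1.
    float formula = smoothstep_formula(1.0, 0.9, 0.0 + zero);

    // Red: folded call, green: runtime call, blue: reference formula.
    fragColor = vec4(folded, runtime, formula, 1.0);
}
```

Keeping edge0 < edge1 and flipping the result with 1.0 - smoothstep(...) avoids the undefined case entirely.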
Linear interpolation depends on GPU/API:
The shader compiler may use 32-bit or 64-bit floats to precompile static code:
Test shader: shadertoy link. Expected: 3248488448 and negative 20 everywhere.
First line on the screenshot: val0, the CPU-precompiled result, is 3248488448.
Second line: val1 changes 0.3 from a constant to (0.3+min(iTime,0.)).
This changes the result in Vulkan to 3248488447.
Last line: val2 changes 0.4 from a constant to (0.4+min(iTime,0.)).
This changes the result in OpenGL and Vulkan to 3248488447 or 3248488449.
The Vulkan values may differ this much because of RelaxedPrecision.
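The const versus dynamic pattern used in this test looks roughly like the sketch below. The expression itself is hypothetical (the real formula lives in the test shader), and only the (0.3+min(iTime,0.)) and (0.4+min(iTime,0.)) trick is taken from the text above. Whether the two bit patterns differ depends on the compiler and GPU.

```glsl
// Minimal sketch. Assumes Shadertoy (GLSL ES 3.00).
// The expression is hypothetical, not the one from the test shader.
void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    float zero = min(iTime, 0.0); // 0 at runtime

    // All operands are constants: the compiler may precompute this on the
    // CPU, possibly with 64-bit floats, and store only the final value.
    float val_const = sin(0.3) * 1000.0 / 0.4;

    // Same expression with runtime operands: has to be evaluated on the GPU.
    float val_dyn = sin(0.3 + zero) * 1000.0 / (0.4 + zero);

    uint bits_const = floatBitsToUint(val_const);
    uint bits_dyn   = floatBitsToUint(val_dyn);

    // Green when both bit patterns match, red when they differ.
    fragColor = (bits_const == bits_dyn) ? vec4(0.0, 1.0, 0.0, 1.0)
                                         : vec4(1.0, 0.0, 0.0, 1.0);
}
```

Printing bits_const and bits_dyn as numbers, like the test shader does, shows how far apart the two results are.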
GPU precision never 0, but it can be 0:
Test shader: shadertoy link. Expected: 0 for each of val0 to val3.
In the shader and screenshot:
The value of val_const is equal to val_dyn in its uint bit representation on the GPU.
But their difference may not be 0. on the GPU.
The result of val2 = val_const - val_dyn; is a very small positive value that may not be equal to 0. (see the testing example below).
The result of val3 = val_dyn - val_const; is a very small negative value that also may not be equal to 0. (see the testing example below).
The result of the operation will be equal to 0. only when the “source of the variable” is the same for every variable in the operation, like GPU_value minus GPU_value.
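A sketch of the three checks one might compare in this situation: bit equality, exact difference, and difference within a tolerance. val_const and val_dyn are named after the text above, but the expression 1.0 / 3.0 and the tolerance 1e-6 are hypothetical choices, and the effect may not reproduce with this exact expression on every GPU.

```glsl
// Minimal sketch. Assumes Shadertoy (GLSL ES 3.00).
// The expression and the tolerance are hypothetical.
void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    float zero = min(iTime, 0.0); // 0 at runtime

    float val_const = 1.0 / 3.0;          // may be precomputed on the CPU
    float val_dyn   = 1.0 / (3.0 + zero); // computed on the GPU

    // Check 1: compare the stored bit patterns.
    bool same_bits = floatBitsToUint(val_const) == floatBitsToUint(val_dyn);

    // Check 2: compare the difference with exactly 0.
    // The compiler may evaluate this subtraction with more precision
    // than the stored values have, so it can fail even when check 1 passes.
    bool diff_is_zero = (val_const - val_dyn) == 0.0;

    // Check 3: compare the difference with a small tolerance.
    bool diff_is_small = abs(val_const - val_dyn) < 1e-6;

    // Red: check 1, green: check 2, blue: check 3.
    fragColor = vec4(same_bits ? 1.0 : 0.0,
                     diff_is_zero ? 1.0 : 0.0,
                     diff_is_small ? 1.0 : 0.0,
                     1.0);
}
```

White means all three checks agree. Magenta (red and blue without green) is the case described above, where the bits are equal but the difference is not 0.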
Testing this behavior:
Open my test shader: shadertoy link.
Use this line of code, adding it at the line numbers given below:
Add this line of code at line 27.
Then press compile on Shadertoy.
Result: (val3==0.) is true.
Add this line of code at line 41.
Then press compile on Shadertoy.
Result: (val3==0.) is false.
When val3 in this line is changed to val0 or val1, the result will be true in any part of the shader.
Undefined behavior on CPU and GPU:
The resulting value is undefined if <condition>
Undefined does not mean a guaranteed NaN or INF for each operation; the result can be anything.
This is also a source of errors and bugs, and it makes shader debugging much harder because the results of operations are not the same on CPU and GPU.
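One way to stay out of the undefined cases is to clamp inputs before calling the function. This is a minimal sketch under my own assumptions: safe_sqrt and safe_pow are hypothetical helpers, and the 1e-6 lower bound is an arbitrary choice. It relies on the GLSL rule that sqrt is undefined for x < 0 and pow is undefined for x < 0, or for x = 0 with y <= 0.

```glsl
// Minimal sketch. Assumes Shadertoy (GLSL ES 3.00).
// safe_sqrt, safe_pow and the 1e-6 bound are hypothetical choices.

// sqrt is undefined for x < 0, so clamp the input to 0.
float safe_sqrt(float x)
{
    return sqrt(max(x, 0.0));
}

// pow is undefined for x < 0 and for x = 0 with y <= 0,
// so keep the base strictly positive.
float safe_pow(float x, float y)
{
    return pow(max(x, 1e-6), y);
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    float zero = min(iTime, 0.0); // 0 at runtime
    float x = -1.0 + zero;        // negative input, known only on the GPU

    float raw = sqrt(x);          // undefined: can be NaN, 0 or anything

    // Red lights up only if the GPU reports NaN or INF for the raw call,
    // which is not guaranteed.
    bool bad = isnan(raw) || isinf(raw);

    // Green and blue use the guarded versions, both defined for any x.
    fragColor = vec4(bad ? 1.0 : 0.0, safe_sqrt(x), safe_pow(x, 2.0), 1.0);
}
```

Clamping changes the math for out-of-range inputs, so it only fits when those inputs are errors anyway.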
I wrote this article as a note for myself.
I hope this info is not completely useless for you.
I also made a list of other GPU bugs: link to list of bugs.
Thanks for reading!