Float precision on GPU, bugs/features

Expected 123 everywhere, screenshot from ANGLE DX11, Nvidia https://www.shadertoy.com/view/tlfBRB


  1. Precompiled floats. CPU vs GPU.
  2. More about float bits and int/uint, float can be equal to 0. but

Reading float bits from texture and losing float bits:

Numbers on the left side of the screenshot:

uint value_2 = floatBitsToUint(uintBitsToFloat(uint(value+iZero))+fZero);

Result: can be 0 or value, depending on the GPU and the GPU shader language.
uintBitsToFloat(uint(value+iZero)) stores value as float bits; for the test I use the small uint value 123.
The sum float(<from uint 123>) + 0. is processed as a float operation by the GPU float units, and uint(0) can be the result. The bits of 123 form a denormal float (all exponent bits are 0), and GPUs are allowed to flush denormals to zero.

Conclusion: adding floats to a float_bits value may result in losing bits.

The red number can be 0 or 123, depending on the CPU shader compiler (it can be bugged and result in 0).
The green number should be 123 everywhere.
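A minimal Shadertoy-style sketch of the two cases, for anyone who wants to try this on their own GPU. The macros iZero and fZero are assumed here to be runtime zeros built from iFrame and iTime, and the color mapping is hypothetical; it is not the code of the linked test shader.

```glsl
// Runtime zeros (assumption): the compiler cannot fold them away on the CPU.
#define iZero min(iFrame, 0)
#define fZero min(iTime, 0.)

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    uint value = 123u;

    // Pure bit reinterpretation: no float arithmetic touches the value.
    uint kept = floatBitsToUint(uintBitsToFloat(value + uint(iZero)));

    // Same bits, but passed through a float add. The bits of 123u are a
    // denormal float, so hardware that flushes denormals may return 0u.
    uint risky = floatBitsToUint(uintBitsToFloat(value + uint(iZero)) + fZero);

    // Red lights up only if the float add lost the bits;
    // green lights up if the plain round trip kept them.
    float r = (risky == value) ? 0. : 1.;
    float g = (kept == value) ? 1. : 0.;
    fragColor = vec4(r, g, 0., 1.);
}
```

If the value has to survive, it seems safer to keep it as uint all the way and only reinterpret it, never to run it through float arithmetic.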

Right side of screenshot, texture reading:

Precompiled floats:

https://www.shadertoy.com/view/wdXGW8 Nvidia

Results of functions such as the trigonometric functions (sin etc.), pow, sqrt and others computed on the CPU may not be equal to the results on the GPU.

The left side code has a static iTime=32., so the shader code is precalculated on the CPU.
In the right side code iTime is 32 + GPU(0), which forces the code to be executed on the GPU.
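The left/right setup can be sketched like this. The function f is a hypothetical stand-in (any expression built from sin, pow or sqrt would do), and min(iTime, 0.) is assumed as the "GPU(0)" term; the real test shader may be organized differently.

```glsl
// Hypothetical stand-in for the expression under test.
float f(float t)
{
    return fract(sin(t) * 43758.5453);
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    vec2 uv = fragCoord / iResolution.xy;

    // Constant argument: the compiler is free to evaluate f() on the CPU.
    float folded = f(32.);

    // min(iTime, 0.) is 0. at run time, but the compiler cannot know that,
    // so f() has to be evaluated on the GPU.
    float live = f(32. + min(iTime, 0.));

    // Left half shows the precalculated result, right half the GPU result.
    float v = (uv.x < 0.5) ? folded : live;
    fragColor = vec4(vec3(v), 1.);
}
```

If both halves show the same gray, the compiler and the GPU agree for this input; a visible seam means they do not.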

First line on the screenshot: result of the trigonometric sin-based random. It is not the same on CPU and GPU.
Also remember that the result of a sin-based hash random is not the same from GPU to GPU; only a uint-based hash random will be the same on every GPU and CPU.
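For comparison, here is a sketch of both kinds of hash. The sin-based one-liner is the widely copied variant and is only assumed to resemble the one in the test shader; the integer hash is the Wang hash, swapped in as one example of a uint-based hash, and hashUint is a hypothetical helper name.

```glsl
// Common sin-based hash: depends on how sin() is implemented on the device.
float hashSin(vec2 p)
{
    return fract(sin(dot(p, vec2(12.9898, 78.233))) * 43758.5453);
}

// Wang hash: integer operations only, so no float rounding is involved.
uint hashWang(uint seed)
{
    seed = (seed ^ 61u) ^ (seed >> 16u);
    seed *= 9u;
    seed = seed ^ (seed >> 4u);
    seed *= 0x27d4eb2du;
    seed = seed ^ (seed >> 15u);
    return seed;
}

// Hypothetical helper: 2D integer coordinate to a float in [0, 1).
float hashUint(uvec2 p)
{
    uint h = hashWang(p.x ^ hashWang(p.y));
    // The top 24 bits fit exactly into a float mantissa.
    return float(h >> 8u) * (1. / 16777216.);
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    vec2 uv = fragCoord / iResolution.xy;
    // Left half: sin-based, right half: uint-based.
    float v = (uv.x < 0.5) ? hashSin(fragCoord) : hashUint(uvec2(fragCoord));
    fragColor = vec4(vec3(v), 1.);
}
```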

Myths About Floating-Point Numbers by Adam Sawicki says this:

The reason random numbers are generated on NVIDIA cards and not on AMD is that sine instruction on AMD GPU architectures actually has period of 1, not 2*PI. But it is still fully deterministic in regards to input value. It just returns different results between different platforms.

The next lines result in 1 or 2 and 243 or 242 because pow on the CPU is not equal to pow on the GPU.
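One way such an off-by-one can appear is truncating a pow result to int. The values below (3 to the power of 5) are an assumption chosen to match 243; they are not taken from the test shader, and ipow is a hypothetical helper.

```glsl
// Truncation: if pow() returns slightly less than 243., int() gives 242.
int powTruncated(float base, float exponent)
{
    return int(pow(base, exponent));
}

// Rounding first tolerates errors below 0.5.
int powRounded(float base, float exponent)
{
    return int(round(pow(base, exponent)));
}

// Hypothetical helper: integer power by repeated multiplication,
// no float math at all (valid for exponent >= 0 and results that fit in int).
int ipow(int base, int exponent)
{
    int result = 1;
    for (int i = 0; i < exponent; i++)
    {
        result *= base;
    }
    return result;
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    // Runtime exponent, so pow() is evaluated on the GPU.
    float e = 5. + min(iTime, 0.);
    bool same = powTruncated(3., e) == ipow(3, 5);
    bool sameRounded = powRounded(3., e) == ipow(3, 5);
    // Red: truncated value differs. Green: rounded value matches.
    fragColor = vec4(same ? 0. : 1., sameRounded ? 1. : 0., 0., 1.);
}
```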

Also remember:

smoothstep returns 0.0 if x <= edge0 and 1.0 if x >= edge1.
Results are undefined if edge0 >= edge1.

In the test shader (Shadertoy link) the result of smoothstep(1., 0.9, 0.) is 0 on CPU and 1 on GPU.
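A descending ramp can be written without reversed edges. The sketch below assumes that a falling edge between 0.9 and 1. was the intent; reversedSmoothstep is a hypothetical helper name.

```glsl
// Hypothetical helper: 1. for x <= edgeLo, 0. for x >= edgeHi.
// Requires edgeLo < edgeHi, so smoothstep stays in its defined range.
float reversedSmoothstep(float edgeLo, float edgeHi, float x)
{
    return 1. - smoothstep(edgeLo, edgeHi, x);
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    float x = fragCoord.x / iResolution.x;

    // Undefined: edge0 >= edge1, the result may differ between CPU and GPU.
    // float bad = smoothstep(1., 0.9, x);

    // Defined everywhere: bright on the left, fading out between 0.9 and 1.
    float good = reversedSmoothstep(0.9, 1., x);
    fragColor = vec4(vec3(good), 1.);
}
```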

Shader compiler may use 32 or 64 bit floats to pre-compile static code:

https://www.shadertoy.com/view/sllXW8 Nvidia

Test shader: Shadertoy link. Expected: 3248488448 and negative 20 everywhere.

First line on the screenshot: val0, the CPU precompiled result, 3248488448.
Second line: val1, changing 0.3 from a const to (0.3+min(iTime,0.)).
This changes the result in Vulkan to 3248488447.
Last line: val2, changing 0.4 from a const to (0.4+min(iTime,0.)).
This changes the result in OpenGL and Vulkan to 3248488447 or 3248488449.

The Vulkan values may be this different because of RelaxedPrecision.
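Note that 3248488448 is 0xC1A00000, the bit pattern of the float -20., so 3248488447 and 3248488449 are its direct neighbors, one ULP away. When such results need to be compared, a bit-exact check is fragile; a sketch of a comparison that tolerates a small ULP difference follows. The helper names and the tolerance of 1 ULP are assumptions, not part of the test shader.

```glsl
// Hypothetical helper: distance in ULPs between two finite floats
// of the same sign (their bit patterns are ordered by magnitude).
uint ulpDistance(float a, float b)
{
    uint ua = floatBitsToUint(a);
    uint ub = floatBitsToUint(b);
    return (ua > ub) ? (ua - ub) : (ub - ua);
}

// Treat values at most maxUlps apart as equal.
bool nearlyEqual(float a, float b, uint maxUlps)
{
    return ulpDistance(a, b) <= maxUlps;
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    float expected = -20.;
    // One ULP away from -20., as seen on the screenshot.
    float measured = uintBitsToFloat(3248488447u);

    bool exact = (measured == expected);              // false
    bool close = nearlyEqual(measured, expected, 1u); // true
    fragColor = vec4(exact ? 1. : 0., close ? 1. : 0., 0., 1.);
}
```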

GPU precision is never 0, but it can be 0:

https://www.shadertoy.com/view/ftXSWB Nvidia

Test shader: Shadertoy link. Expected: 0 for each of val0 to val3.

In shader and screenshot:

The result of val2 = val_const - val_dyn; is a very small positive value that may not be equal to 0. (see the testing example below)

The result of val3 = val_dyn - val_const; is a very small negative value that also may not be equal to 0. (see the testing example below)

The result of the operation will be equal to 0. only when the "source of the variable" is the same for every variable in the operation.
Like GPU_value minus GPU_value.
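A sketch of the usual workaround: compare against a tolerance instead of exact 0. The function g is a hypothetical stand-in for the expression in the test shader, and the tolerance 1e-5 is an assumption that has to be tuned to the value range in use.

```glsl
// Hypothetical stand-in for the expression under test.
float g(float x)
{
    return sin(x) * 10.;
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    float val_const = g(1.);                  // may be precalculated on the CPU
    float val_dyn = g(1. + min(iTime, 0.));   // evaluated on the GPU

    float diff = val_dyn - val_const;         // may be tiny but nonzero

    bool exactZero = (diff == 0.);            // fragile: mixes CPU and GPU results
    bool nearZero = (abs(diff) < 1e-5);       // tolerant comparison

    // Red: exact test failed. Green: tolerant test passed.
    fragColor = vec4(exactZero ? 0. : 1., nearZero ? 1. : 0., 0., 1.);
}
```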

Testing this behavior:


First test:
Add this line of code to line 27.
And press compile on shadertoy.
Result — (val3==0.) is true.

Second test:
Add this line of code to line 41.
And press compile on shadertoy.
Result — (val3==0.) is false.

When val3 in this line is changed to val0 or val1, the result will be true in any part of the shader.

Undefined behavior on CPU and GPU:

The resulting value is undefined if <condition>

Undefined does not mean that NAN or INF is guaranteed for each such operation; the result can be anything.
This is also a source of errors and bugs, and it makes shader debugging much harder because the result of such operations is not the same on CPU and GPU.
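One defensive pattern is to bring the arguments into the defined range before the call. The GLSL specification states that sqrt is undefined for x < 0 and clamp is undefined for minVal > maxVal; the wrapper names below are hypothetical, and whether clamping the input is the right fix depends on the shader.

```glsl
// sqrt(x) is undefined for x < 0, so negative input is clamped to 0.
float safeSqrt(float x)
{
    return sqrt(max(x, 0.));
}

// clamp(x, a, b) is undefined for a > b, so the bounds are ordered first.
float safeClamp(float x, float a, float b)
{
    return clamp(x, min(a, b), max(a, b));
}

void mainImage(out vec4 fragColor, in vec2 fragCoord)
{
    // Range [-1, 1] across the screen, so half of the inputs are negative.
    float x = fragCoord.x / iResolution.x * 2. - 1.;

    float s = safeSqrt(x);            // 0. on the left half, sqrt(x) on the right
    float c = safeClamp(x, 0.5, 0.);  // bounds given in the wrong order on purpose
    fragColor = vec4(s, c, 0., 1.);
}
```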

I made this article as a note for myself.
I hope this info is not completely useless for you.

Thanks for reading!
