Floor and Ceil versus Denormals on CPU and GPU
Recorded: May 30, 2026, 10:01 a.m.
| Original | Summarized |
Recently, I dove deep into floating-point numbers and their behavior. This topic has haunted me in my programming practice ever since I created the Floating-Point Formats Cheatsheet back in 2013; I also released a comprehensive article, The Secrets of Floating-Point Numbers, in 2024.

floor - rounding "down", i.e., towards -infinity.

When talking about round, there is also the question of what happens when we are exactly halfway, like round(2.5). Various programming languages define it differently: the standard C/C++ function rounds away from zero, so round(2.5) = 3.0, round(-3.5) = -4.0.

Knowing all this, we can turn to our main question of how floor and ceil treat denormals. Some platforms preserve denormals (processing the values they represent), while others flush them to zero. This is the problem I stumbled upon recently. In most cases, it doesn't matter. For example, when rendering graphics, the difference between such a small number and 0 would produce an indistinguishable difference in results. After applying functions such as floor and ceil, however, the difference is significant.

If the platform preserves denormals:
floor(-1.175493930432748e-38) = -1.0
ceil(-1.175493930432748e-38) = -0.0

If the platform flushes denormals to 0:
floor(-1.175493930432748e-38) = -0.0
ceil(-1.175493930432748e-38) = -0.0

The behavior of a specific platform may depend on many factors, such as the flags used when compiling our source code, as well as floating-point modes controlled at runtime. It may be an unexpected source of nondeterminism between CPU and GPU, as well as between GPU vendors.

A CPU with the x86 64-bit architecture (AMD Ryzen 7 7800X3D, but I don't expect differences between AMD and Intel here) on Windows, executing C++ code compiled using Visual Studio 2022, appeared to preserve denormals when doing floor and ceil.
I've tested the following options, with no change in the results: both Release and Debug configurations (with and without compiler optimizations).

A GPU from Nvidia (GeForce RTX 4090, Ada architecture) executing a Direct3D 12 program with HLSL code compiled using the modern DXC shader compiler flushes denormals. I've tested the following options, with no change in the results: with and without the DXC parameter -Gis (Force IEEE strictness).

A GPU from Intel (Arc B580, Xe2-HPG architecture) executing the same shader flushes denormals by default. However, using -denorm preserve makes it preserve denormals.

This is not the first time we can see Nvidia taking shortcuts to achieve maximum performance of their GPUs 😉 We could see another example in the article "Mipmap selection in too much detail" by Pema Malling.

float DeterministicCeil(float x) |
The discussion centers on the behavior of floating-point operations, specifically the floor and ceil functions, and the handling of denormal numbers on both Central Processing Units (CPUs) and Graphics Processing Units (GPUs). The initial context establishes the mathematical definitions of floor, ceil, trunc, and round, noting that these functions transform a floating-point number into an integral floating-point value by rounding towards negative infinity (floor), towards positive infinity (ceil), towards zero (trunc), or to the nearest integer (round). The text differentiates the halfway behavior of the round function across programming languages, noting that standard C/C++ rounds away from zero, HLSL rounds to the nearest even number, and GLSL leaves it implementation-dependent. A critical point introduced is the issue of denormal numbers, or subnormal numbers, which are values extremely close to zero that use a special representation in which the implicit leading 1 bit is not assumed. These numbers have an exponent field of zero and a magnitude smaller than the minimum positive normalized value, and hardware platforms differ in how they process them. The core problem arises because some systems preserve these denormal values while others flush them to zero, which leads to nondeterministic results when applying floor or ceil functions to very small inputs. The text demonstrates that the outcome of floor and ceil operations on a denormal input depends entirely on which of these two policies the platform follows. For instance, the floor of a small negative denormal yields -1.0 if denormals are preserved, versus -0.0 if they are flushed to zero. Testing across different hardware revealed this disparity in behavior. 
On x86 64-bit CPUs, C++ code appeared to preserve denormals, regardless of the compilation flags tested. Conversely, the results on GPUs showed platform-specific handling: Nvidia GPUs running DirectX with HLSL appeared to flush denormals, whereas Intel and AMD GPUs generally flushed them by default unless specific flags were used. This suggests a source of nondeterminism across CPU and GPU architectures and between different GPU vendors. The text notes that the DirectX Specification requires GPUs to flush denormals on both input and output of floating-point operations. To obtain floor and ceil functions that preserve denormals consistently across all CPUs and GPUs, the author proposes custom implementations that use simple bit manipulation on the underlying bit representation of the floating-point numbers. This bypasses the hardware-dependent behavior, so the functions return the same results whether the platform preserves or flushes denormals. |
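
The bit-manipulation idea mentioned above can be illustrated with a minimal C++ sketch. It is not the author's implementation (only the signature `float DeterministicCeil(float x)` survives in this copy); the names `FloatToBits`, `BitsToFloat`, `IsDenormalBits`, `FloorPreservingDenormals`, and `CeilPreservingDenormals` are hypothetical. The sketch assumes 32-bit IEEE 754 floats and detects a denormal using integer operations only, so the decision does not depend on the floating-point flush mode.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helpers: reinterpret a float as its 32-bit pattern and back.
uint32_t FloatToBits(float x) {
    uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

float BitsToFloat(uint32_t u) {
    float x;
    std::memcpy(&x, &u, sizeof x);
    return x;
}

// A denormal has an all-zero exponent field and a nonzero mantissa.
// Only integer operations are used here.
bool IsDenormalBits(uint32_t bits) {
    return (bits & 0x7F800000u) == 0u && (bits & 0x007FFFFFu) != 0u;
}

// Sketch: floor that treats a denormal as the tiny nonzero value it encodes.
float FloorPreservingDenormals(float x) {
    const uint32_t bits = FloatToBits(x);
    if (IsDenormalBits(bits)) {
        // A negative denormal lies in (-1, 0), a positive one in (0, 1).
        return (bits & 0x80000000u) ? -1.0f : 0.0f;
    }
    return std::floor(x);  // Zeros, normals, infinities, NaN: library behavior.
}

// Sketch: ceil that treats a denormal as the tiny nonzero value it encodes.
float CeilPreservingDenormals(float x) {
    const uint32_t bits = FloatToBits(x);
    if (IsDenormalBits(bits)) {
        // Negative denormals round up to -0.0 (sign bit only), positive ones to 1.0.
        return (bits & 0x80000000u) ? BitsToFloat(0x80000000u) : 1.0f;
    }
    return std::ceil(x);
}
```

With these definitions, `FloorPreservingDenormals` returns -1.0 and `CeilPreservingDenormals` returns -0.0 for a negative denormal input, matching the "preserve" results quoted in the article. In HLSL, the same bit test could presumably be written with the `asuint` intrinsic, though whether a denormal input reaches the shader unflushed depends on how the value is loaded, which this sketch does not address.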