In computer science, subnormal numbers are the subset of denormalized numbers (sometimes called denormals) that fill the underflow gap around zero in floating-point arithmetic. Any non-zero number with magnitude smaller than the smallest positive normal number is subnormal, while denormal can also refer to numbers outside that range.
The term "number" is used rather loosely, to describe a particular sequence of digits, rather than a mathematical abstraction; see Floating-point arithmetic for details of how real numbers relate to floating-point representations. "Representation" rather than "number" may be used when clarity is required.
In a normal floating-point value, there are no leading zeros in the significand (also commonly called mantissa); rather, leading zeros are removed by adjusting the exponent (for example, the number 0.0123 would be written as $1.23 \times 10^{-2}$). Conversely, a denormalized floating-point value has a significand with a leading digit of zero. Of these, the subnormal numbers represent values which, if normalized, would have exponents below the smallest representable exponent (the exponent having a limited range).
The significand (or mantissa) of an IEEE floating-point number is the part of a floating-point number that represents the significant digits. For a positive normalized number, it can be represented as $m_0.m_1 m_2 m_3 \ldots m_{p-2} m_{p-1}$ (where $m$ represents a significant digit, and $p$ is the precision) with non-zero $m_0$. Notice that for a binary radix, the leading binary digit is always 1. In a subnormal number, since the exponent is the least that it can be, zero is the leading significant digit ($0.m_1 m_2 m_3 \ldots m_{p-2} m_{p-1}$), allowing the representation of numbers closer to zero than the smallest normal number. In a binary format, a non-zero floating-point number may be recognized as subnormal whenever its encoded exponent field has the least possible value.
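The recognition rule described above can be sketched in C in two ways: with the standard `fpclassify` macro, and by inspecting the bit fields directly. This is a minimal sketch that assumes `float` is an IEEE 754 binary32 value (8-bit exponent field, 23-bit fraction field); the function names are hypothetical.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if x is subnormal according to the C99 classification macro. */
int is_subnormal_std(float x) {
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Returns 1 if the bit pattern of x has an all-zero exponent field and a
   non-zero fraction field. Assumption: float is IEEE 754 binary32. */
int is_subnormal_bits(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* well-defined type punning */
    uint32_t exponent = (bits >> 23) & 0xFFu;  /* 8-bit biased exponent field */
    uint32_t fraction = bits & 0x7FFFFFu;      /* 23-bit trailing significand */
    return exponent == 0 && fraction != 0;     /* fraction == 0 would be zero */
}
```

The fraction check is what separates subnormal values from positive and negative zero, which share the same all-zero exponent field.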
By filling the underflow gap like this, significant digits are lost, but not as abruptly as when using the flush to zero on underflow approach (discarding all significant digits when underflow is reached). Hence the production of a subnormal number is sometimes called gradual underflow because it allows a calculation to lose precision slowly when the result is small.
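Gradual underflow can be observed by repeatedly halving the smallest positive normal number. The following sketch assumes an IEEE 754 binary32 `float` in the default round-to-nearest mode; the function name is hypothetical.

```c
#include <float.h>

/* Halves FLT_MIN repeatedly and counts how many non-zero (subnormal)
   values are produced before the result finally rounds to zero. */
int count_gradual_underflow_steps(void) {
    volatile float x = FLT_MIN;  /* volatile: round to float at every step */
    int steps = 0;
    for (;;) {
        x = x * 0.5f;            /* each halving drops one significant bit */
        if (x == 0.0f) {
            break;
        }
        steps++;
    }
    return steps;
}
```

With gradual underflow the count should equal `FLT_MANT_DIG - 1` (23 for binary32), one step per bit of the trailing significand. Under a flush-to-zero mode the very first halving would already produce zero.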
In IEEE 754-2008, subnormal numbers are supported in both binary and decimal formats. In binary interchange formats, subnormal numbers are encoded with a biased exponent of 0, but are interpreted with the value of the smallest allowed exponent, which is one greater (i.e., as if it were encoded as a 1). In decimal interchange formats they require no special encoding because the format supports unnormalized numbers directly.
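The binary encoding rule can be illustrated with a small decoder for binary32 bit patterns. This is a sketch under stated assumptions: it hard-codes the binary32 parameters (bias 127, 23 fraction bits), handles only finite values, and the name `decode_binary32` is hypothetical.

```c
#include <stdint.h>

/* Decodes a finite IEEE 754 binary32 bit pattern into a double.
   Infinities and NaNs (exponent field 255) are not handled here. */
double decode_binary32(uint32_t bits) {
    int sign = (bits >> 31) ? -1 : 1;
    int exp_field = (int)((bits >> 23) & 0xFFu);
    uint32_t fraction = bits & 0x7FFFFFu;

    int exponent;
    double significand;
    if (exp_field == 0) {
        /* Subnormal or zero: no implicit leading 1, and the exponent is
           interpreted as if the field held 1, i.e. 1 - 127 = -126. */
        exponent = 1 - 127;
        significand = fraction / 8388608.0;        /* 0.f, 2^23 = 8388608 */
    } else {
        exponent = exp_field - 127;
        significand = 1.0 + fraction / 8388608.0;  /* 1.f */
    }

    /* Scale by 2^exponent using exact multiplications by 2 or 0.5. */
    double scale = 1.0;
    for (int i = 0; i < exponent; i++) scale *= 2.0;
    for (int i = 0; i > exponent; i--) scale *= 0.5;
    return sign * significand * scale;
}
```

Because a subnormal shares the exponent of the smallest normal number, the decoded values continue smoothly across the boundary: the bit patterns `0x007FFFFF` (largest subnormal) and `0x00800000` (smallest normal) differ by exactly the same step as any two adjacent subnormals.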
Mathematically speaking, the normalized floating-point numbers of a given sign are roughly logarithmically spaced, and as such any finite-sized normal float cannot include zero. The subnormal floats are a linearly spaced set of values, which span the gap between the negative and positive normal floats.
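The difference in spacing can be checked numerically by measuring the gap between adjacent values. The sketch below assumes an IEEE 754 binary32 `float` and relies on the fact that adjacent positive floats have consecutive bit patterns; the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets a binary32 bit pattern as a float.
   Assumption: float is IEEE 754 binary32. */
static float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Gap between the positive finite float with the given bit pattern and the
   next larger float. The subtraction of adjacent floats is exact. */
float gap_above(uint32_t bits) {
    return float_from_bits(bits + 1) - float_from_bits(bits);
}
```

For bit patterns in the subnormal range (`0x00000001` to `0x007FFFFF`) the gap stays constant, while in the normal range it doubles each time the exponent field increases by one.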
Subnormal numbers were implemented in the Intel 8087 while the IEEE 754 standard was being written. They were by far the most controversial feature in the K-C-S format proposal that was eventually adopted, but this implementation demonstrated that subnormal numbers could be supported in a practical implementation. Some implementations of floating-point units do not directly support subnormal numbers in hardware, but rather trap to some kind of software support. While this may be transparent to the user, it can result in calculations that produce or consume subnormal numbers being much slower than similar calculations on normal numbers.
No other denormalized numbers exist in the IEEE binary floating-point formats, but they exist in some other formats, including the IEEE decimal floating-point formats.
This speed difference can be a security risk. Researchers showed that it provides a timing side channel that allows a malicious web site to extract page content from another site inside a web browser.
Some applications need to contain code to avoid subnormal numbers, either to maintain accuracy or to avoid the performance penalty on some processors. For instance, in audio processing applications, subnormal values usually represent a signal so quiet that it is out of the human hearing range. Because of this, a common measure to avoid subnormals on processors where there would be a performance penalty is to cut the signal to zero once it reaches subnormal levels, or to mix in an extremely quiet noise signal. Other methods of preventing subnormal numbers include adding a DC offset, quantizing numbers, and adding a Nyquist signal. Since the SSE2 processor extension, Intel has provided this functionality in CPU hardware, which rounds subnormal numbers to zero.
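The "cut the signal to zero" measure can be sketched in portable C, without relying on any processor mode. The one-pole decay filter here is only an illustrative stand-in for an audio feedback path, and both function names are hypothetical.

```c
#include <float.h>
#include <stddef.h>

/* Replaces any subnormal sample with zero; other values pass through. */
static float flush_subnormal(float x) {
    return (x > -FLT_MIN && x < FLT_MIN) ? 0.0f : x;
}

/* One-pole decay y[n] = feedback * y[n-1] + in[n]. The state is flushed
   at every step so a decaying tail never lingers in the subnormal range. */
void decay_filter(const float *in, float *out, size_t n, float feedback) {
    float state = 0.0f;
    for (size_t i = 0; i < n; i++) {
        state = flush_subnormal(feedback * state + in[i]);
        out[i] = state;
    }
}
```

Without the flush, a feedback value below 1 would make the state pass through every subnormal magnitude on its way to zero, which is exactly the case that is slow on affected processors.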
A non-C99-compliant method of enabling the DAZ (denormals-are-zero) and FTZ (flush-to-zero) flags on targets supporting SSE is given below, but it is not widely supported. It is known to work on Mac OS X since at least 2006.
#include <fenv.h>
#pragma STDC FENV_ACCESS ON
// Sets DAZ and FTZ, clobbering other CSR settings.
// See fenv.c and fenv.h.
fesetenv(FE_DFL_DISABLE_SSE_DENORMS_ENV);
// fesetenv(FE_DFL_ENV); // Disable both, clobbering other CSR settings.
For other x86-SSE platforms where the C library has not yet implemented this flag, the following may work:
#include <xmmintrin.h>
_mm_setcsr(_mm_getcsr() | 0x0040);  // DAZ
_mm_setcsr(_mm_getcsr() | 0x8000);  // FTZ
_mm_setcsr(_mm_getcsr() | 0x8040);  // Both
_mm_setcsr(_mm_getcsr() & ~0x8040); // Disable both
The _MM_SET_DENORMALS_ZERO_MODE and _MM_SET_FLUSH_ZERO_MODE macros wrap a more readable interface for the code above.
// To enable DAZ
#include <pmmintrin.h>
_MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
// To enable FTZ
#include <xmmintrin.h>
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
Most compilers already provide the previous macro by default; otherwise, the following code snippet can be used (the definition for FTZ is analogous):
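A fallback definition for the DAZ macro might look like the sketch below. It assumes only that `_mm_getcsr` and `_mm_setcsr` are available from `<xmmintrin.h>` and reuses the DAZ bit value 0x0040 from the code above; the `#ifndef` guard is an addition so that a compiler-provided definition takes precedence.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */

#ifndef _MM_DENORMALS_ZERO_MASK
#define _MM_DENORMALS_ZERO_MASK 0x0040  /* DAZ bit of the MXCSR register */
#define _MM_DENORMALS_ZERO_ON   0x0040
#define _MM_DENORMALS_ZERO_OFF  0x0000

/* Clear the DAZ bit, then OR in the requested mode. */
#define _MM_SET_DENORMALS_ZERO_MODE(mode) \
    _mm_setcsr((_mm_getcsr() & ~_MM_DENORMALS_ZERO_MASK) | (mode))

/* Read back the current DAZ setting. */
#define _MM_GET_DENORMALS_ZERO_MODE() \
    (_mm_getcsr() & _MM_DENORMALS_ZERO_MASK)
#endif
```

Unlike the plain OR in the earlier one-liners, the setter clears the bit first, so the same macro can both enable and disable the mode.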
The default denormalization behavior is mandated by the ABI, and therefore well-behaved software should save and restore the denormalization mode before returning to the caller or calling code in other libraries.
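The save-and-restore discipline can be sketched as follows for x86 targets where `float` arithmetic is performed with SSE instructions (an assumption that holds for typical x86-64 builds). The function name and the `MXCSR_*` constants are hypothetical; the bit values are the ones used in the code above.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */

#define MXCSR_DAZ 0x0040u  /* treat subnormal inputs as zero */
#define MXCSR_FTZ 0x8000u  /* flush subnormal results to zero */

/* Halves x with FTZ and DAZ temporarily enabled, then restores the
   caller's MXCSR so the mode change does not leak out of this function. */
float halve_flushing_subnormals(float x) {
    unsigned int saved = _mm_getcsr();          /* save the caller's mode */
    _mm_setcsr(saved | MXCSR_FTZ | MXCSR_DAZ);  /* enable both flags */

    volatile float input = x;   /* volatile: keep the multiply at run time */
    volatile float result = input * 0.5f;

    _mm_setcsr(saved);                          /* restore before returning */
    return result;
}
```

Restoring the saved value rather than simply clearing the two bits matters when the caller had already enabled one of the flags itself.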
AArch32 NEON (SIMD) FPU always uses a flush-to-zero mode, which is the same as FTZ + DAZ. For the scalar FPU and in the AArch64 SIMD, the flush-to-zero behavior is optional and controlled by the FZ bit of the control register: FPSCR in Arm32 and FPCR in AArch64.
One way to set this bit on AArch64 is:
#include <stdint.h>
uint64_t fpcr;
asm volatile("mrs %0, fpcr" : "=r"(fpcr));                       // Read the FPCR register
asm volatile("msr fpcr, %0" :: "r"(fpcr | (UINT64_C(1) << 24))); // Set bit 24 (FZ, flush-to-zero) to 1
Some ARM processors have hardware handling of subnormals.