Sunday, March 15, 2009

Floating Point Exceptions

As further evidence that floating point math is stranger than you think, consider the following code:

#include <stdio.h>
#include <math.h>
#ifdef __unix__
#define _GNU_SOURCE   // gives us feenableexcept on older gcc's
#define __USE_GNU     // gives us feenableexcept on newer gcc's
#include <fenv.h>
#include <float.h>
#ifndef _EM_OVERFLOW

int main(int argc, char **argv)
    float a, b;

#ifdef __unix__
    _controlfp(0, _EM_OVERFLOW);

    // gcc will compute these expressions at compile-time (which is
    // actually kinda cool, but it ruins the example) if we just do
    // pow(10.0, 50) and fabs(-13).
    int exponent = 50;
    a = pow(10.0, exponent);

    puts("1e50 is too big for a float, but we got this far,\n"
         "so we must be okay, right?\n");

    b = -13;
    b = fabs(b);

    puts("You'll never see this, because we just got an overflow trying\n"
         "to process a really small number!  How can this be!?!?\n"
         "The world is coming to an end!\n");

    printf("%f %f\n", a, b);

    return 0;

By default, at least on x86 systems, floating point exceptions are usually masked 1, but I've been enabling them as part of my attempt at hairshirt programming or fail fast programming; exceptions such as overflow or division by 0 probably indicate logic errors or algorithmic limitations within the software, and I'd like to know about such problems as soon as possible. That led me to this problem: the pow call generates the exception, but it's not raised until the fabs call, completely confusing the poor programmer. Section 8.6 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1 (hereafter IASDM) explains how this behavior comes to be:

Because the integer unit and x87 FPU are separate execution units, it is possible for the processor to execute floating-point, integer, and system instructions concurrently. No special programming techniques are required to gain the advantages of concurrent execution. (Floating-point instructions are placed in the instruction stream along with the integer and system instructions.) However, concurrent execution can cause problems for floating-point exception handlers.

Intel's manual provides more details; to summarize, exceptions generated within the FPU are flagged within the FPU's status register but not raised until the CPU executes the next floating point operation. In practice, this should rarely be a problem: if the exception is generated during an operation, it will be raised when the results of the operation are written back to memory, and if the exception is generated when the results are written to memory (due to truncating to a smaller precision), it will be immediately raised as long as you immediately use the result that you just stored. So as long as you're not storing a low-precision value for later use and immediately going on to something else, you ought to avoid the confusing behavior described here.

If you do suspect that something like this is happening, then using C99's fetestexcept functions can help pinpoint the culprit:

    int exponent = 50;
    a = pow(10.0, exponent);

    if (fetestexcept(FE_OVERFLOW)) {
        puts("Warning; doom is impending.");

    b = -13;
    b = fabs(b);

Or you can use C99's fegetenv and the undocumented internals of its fenv_t parameter to examine the entire x87 FPU state - its status word (including pending exceptions), its control word (including the current exception mask), and so on. If your environment doesn't provide an implementation of these fenv.h functions, you can borrow the MinGW runtime's (after adjusting the syntax for your compiler); the implementation is extremely simple (a few lines of assembly per function), and the MinGW runtime is in the public domain.

As an aside, this particular problem would not have happened if the code used doubles throughout instead of floats. Using floats at all is probably a mistake, but I've had some difficulty finding detailed information on the tradeoffs of floats versus doubles. Here's what I've been able to find so far:

  • Doubles take up twice as much memory as floats, although the extra memory usage is rarely an issue.
  • The x87 FPU always operates on 80-bit long doubles internally, as described in the IASDM (volume 1, section 8.2). For this reason, doubles are usually no slower than floats.
  • As a small exception to the previous statement, some builtin functions like pow that offer both float and double versions seem to run faster in their lower-precision versions.
  • As a larger exception to the previous statement, the performance SSE is entirely dependent on the size of the data it's operating on, so floats are twice as fast as doubles. On the other hand, if you're doing the kind of graphics, scientific, or gaming development that would most benefit from SSE, you probably already know more than I could tell you about the kinds of performance issues you'll face.
  • Floating point precision has a few historical oddities in C and C++; particularly old compilers promoted floats to doubles while doing calculations, and floats are promoted to doubles when given as arguments to variadic functions. (Since pointers can't be promoted, printf and scanf are asymmetrical: printf("%f", f); will work whether f is a float or a double, but scanf("%f", &f); requires that f be a float.)
  • The IASDM says (volume 1, section 8.2) that doubles should be used "as a general rule" but notes that "the single-precision format is useful for debugging algorithms, because rounding problems will manifest themselves more quickly in this format" and that 80-bit long doubles can be used "when an application requires the maximum range and precision of the x86 FPU."
  • If you're using Microsoft Visual C++ (as of MSVC++ 2008), you can't use 80-bit long doubles. Sorry.

If there are other pros and cons involved in selecting a precision, please let me know. Of course, if you or your predecessor designed your C/C++ code right, all of this is controlled by a single typedef, so it would be very easy to test for yourself. (If you do have the pleasure of working on such well-designed code, feel free to post a comment and gloat over those of us stuck maintaining legacy apps.)

1. Specifically, the x87 FPU disables all floating point exceptions by default, as described in section 8.1.5 of the Intel 64 and IA-32 Software Developer's Manual Volume 1. Microsoft follows suit (unless you set a compiler option), and apparently Linux does too, but CodeGear C++Builder only masks underflow, denormal operand, and inexact result exceptions as part of its startup code.