Wednesday, 28 November 2012

Bitwise Reproducibility with the NAG Libraries

I've written in this blog before about the problems of Wandering Precision - where the results computed by a program are not consistent, even when running exactly the same program on the same machine with the same data several times in a row.

At SC12 in Salt Lake City a couple of weeks ago I took part in a "Birds of a Feather" session organised by Intel, where problems like these, associated with bitwise reproducibility of results, were discussed. On the panel, apart from myself, were representatives of Intel's Math Kernel Library, The MathWorks (producers of MATLAB), German engineering company GNS, and the University Of California, Berkeley.

We all gave brief presentations discussing the impact of non-reproducible results on our work or on our users, and ways around the problem. It turned out that most of
our presentations were remarkably similar, involving tales of disgruntled users who were not happy about accepting such varying results. This is true even when the same
users completely understand the fact that numerical software is subject to rounding errors, and that the varying results they see are not incorrect - they may be equally good approximations to an ideal mathematical result. But they just don't like inconsistency - in fact, often they would prefer to get an answer that is accurate to fewer digits of precision so long as it is consistent, rather than inconsistent but slightly more accurate results.

As it happens, we do have some control over these inconsistent results, which are largely due to optimizations introduced by clever compilers that are designed to take advantage of fast SIMD instructions like SSE and AVX that are available on modern hardware. By using appropriate compiler flags on the Intel Fortran and C compilers, for example, we can avoid these optimizations, at the cost of making the code run up to 12 or 15% slower (according to Intel).