Thanks to mathematician friends and social networking sites, today I came across this 2008 article by McCullough and Heiser on the statistical inaccuracies of Microsoft Excel. What is so striking about the article is not merely its angry tone (something rather unusual for an academic publication) but the fact that the authors’ anger is entirely justified. As they write,
Excel’s statistical distributions have always been inadequate. Over the years, Microsoft has fixed some distributions, fixed others incorrectly, and failed to fix others.
Microsoft occasionally fixes errors, more often ignores them, and sometimes fixes them incorrectly. Consequently, every time there is a new version of Excel, the tests must be repeated. Indeed, every time a new version of Excel is released, we receive emails asking, “Is it safe to use Excel?” There appears to be a sentiment that if only Excel would pass the intermediate tests, then it would be safe to use. Nothing could be farther from the truth. Microsoft has not even fixed all the flaws identified by Sawitzki (1994) over 15 years ago (his paper took a couple years to be published).
There are scores of statistical functions in Excel, and we have benchmarked only a fraction of them. In the fraction that we have examined, we have found enough unfixed/incorrectly-fixed errors to cast grave doubt on the above-mentioned sentiment. To wit, we write this paper in part to warn members of the statistical community that even should there come a day when Microsoft can fix enough of the errors in Excel that Excel can pass these intermediate-level tests, it will not necessarily be safe to use.
And even more:
What is particularly pernicious, at least so far as consumers of statistics are concerned, is that Microsoft seems to take the approach that specialized domain knowledge is not necessary to the production of statistical code, i.e., that the entire field of statistical computing has no merit.
The article is available here. Please read it, even (or especially) if you, like many sociologists, rely on software packages for your work. Or ignore it, and use Excel for statistical analyses at your peril…
P.S. The authors do not explicitly recommend other packages.