In the last couple of years I've found myself discussing some allegedly controversial aspects of floating-point mathematics and applications. The arguments have usually focused on some supposedly irrational (eh) choices in the IEEE-754 standard and how they are detrimental (and confusing) to programmers.

I must say that I've grown somewhat tired of these arguments, and yet when someone is wrong on the Internet I still feel the urge to correct them. So I've decided to discuss some key points in a series of articles about computers and numbers, trying to clear up some common misunderstandings: the idea that math on the computer is made more complex than necessary by the IEEE-754 standard, and the idea that integer math (often proposed as an alternative solution) is a panacea, which it is far from being.

I will try and proceed mostly in a bottom-up fashion, by discussing possible answers to the following questions (some of them rhetorical, some of them not):

  • integer math, is it really that simple?
  • how can computers deal with non-integer math?
  • fixed- and floating-point, what are the benefits and downsides?

Topics that will be discussed include: representations and their limits, encoding, accuracy, approximation, rounding, significance, correctness, validity.

Although I will try and keep my discussion as general as possible, I will make some assumptions. In particular, I will assume that we are discussing these topics in the context of binary computers with 8-bit bytes and power-of-two word sizes (although there would be much to be said about ternary computers).

Integer math and computers
Posted by Oblomov