(How to) avoid division by zero (in C)
Leveraging boolean operators to avoid divisions by zero without conditional expressions.
Let's say we're collecting some data, and we want to compute an average of the values. Or we computed the absolute error, and we want the relative error. This requires the division of some number (e.g. the sum of the values, or the absolute error) by some other number (e.g. the number of values, the reference value).
Catastrophe arises when the number we want to divide by is 0: if the list of values we want to average is empty, for example, we would end up with an expression such as 0/0 (undefined).
Programmatically, we would like to avoid such corner cases with as little hassle as possible. The standard way to handle these cases is by using conditional expressions: if the value we want to divide by is zero, do something special, otherwise do the division we're actually interested in.
This can be cumbersome.
In what follows, we'll assume that the special handling of the zero division case would be to return the numerator unchanged: we want a/b if b is non-zero, otherwise a will do. In (C) code, this could be written:
if (b != 0)
r = a/b;
else
r = a;
We can write this more succinctly using the ternary operator:
r = b != 0 ? a/b : a;
or, leveraging the fact that any non-zero value is “true”:
r = b ? a/b : a;
I'll leave it to the reader to decide if this expression is more readable or not, but the fundamental issue remains that this kind of conditional handling is still not nice. Worse, if this is done in a loop (e.g. to convert a set of absolute errors into a set of relative errors, dividing each by the corresponding, potentially zero, reference value), it can even produce sub-optimal code on modern machines with vector capabilities: since the expression for the two sides is different, and there is no way to know (until the program is running) which elements will follow which path, the compiler may have to produce scalar code instead of potentially much faster vectorized code.
Ideally, we would want to have the same operation done on both sides of the
conditional. This can, in fact, be achieved by remarking that a is
the same as a/1. We can thus write:
r = a/(b ? b : 1);
The advantage of this expression is that, as the body of a loop, it leads to better vectorization opportunities, delegating the conditional to the construction of the divisor.
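As a minimal sketch of such a loop, here is the relative-error conversion mentioned earlier. The function name relative_errors and its parameters are made up for this illustration, and whether a given compiler actually vectorizes the loop depends on the compiler, its flags and the target, so that part is worth checking on your own setup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert absolute errors into relative errors.
 * Where the reference value is zero, the absolute error is kept unchanged. */
void relative_errors(const float *abs_err, const float *ref, float *rel, size_t n)
{
    assert(n == 0 || (abs_err != NULL && ref != NULL && rel != NULL));
    for (size_t i = 0; i < n; ++i) {
        /* The conditional only selects the divisor;
         * the division itself is the same for every element. */
        float divisor = ref[i] != 0.0f ? ref[i] : 1.0f;
        rel[i] = abs_err[i] / divisor;
    }
}
```

Calling it with abs_err = {1, 2, 3} and ref = {2, 0, 4} fills rel with {0.5, 2, 0.75}: the middle element has a zero reference value, so it is passed through unchanged.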
But we can do better! There's a nifty trick we can employ (at least in C), leveraging the fact that the boolean negation of any non-zero value is 0, and the boolean negation of 0 is 1. The trick is:
r = a/(b + !b);
Why does this work?
If b == 0, then !b == 1, and b + !b == 0 + 1 == 1.
If b != 0, then !b == 0, and b + !b == b + 0 == b.
The result of b + !b is thus exactly the same as b ? b : 1,
without using conditionals.
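The two cases above can be checked mechanically. The sketch below wraps the trick in two small helpers (safe_divisor_int and safe_divisor_double are names invented here, not part of any library) and compares them against the ternary form on a handful of sample values. It is a spot check, not a proof.

```c
#include <assert.h>

/* Hypothetical helpers: return b when b is non-zero, 1 otherwise. */
int safe_divisor_int(int b) { return b + !b; }
double safe_divisor_double(double b) { return b + !b; }

/* Compare the trick against the ternary form on a few sample values. */
void check_safe_divisor(void)
{
    const int samples[] = { -3, -1, 0, 1, 7 };
    for (int i = 0; i < 5; ++i) {
        int b = samples[i];
        assert(safe_divisor_int(b) == (b ? b : 1));
    }
    /* !b is an int (0 or 1) that gets converted to double in the sum. */
    assert(safe_divisor_double(0.0) == 1.0);
    assert(safe_divisor_double(-0.0) == 1.0); /* negative zero counts as zero too */
    assert(safe_divisor_double(2.5) == 2.5);
}
```

With these helpers, 10 / safe_divisor_int(0) evaluates to 10 and 10 / safe_divisor_int(5) evaluates to 2.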
Addendum (OpenCL C and vector types)
The trick above doesn't work if a, b are vector types, at least in OpenCL C,
since the specification in this case requires that the component-wise negation
of 0 is -1 rather than 1. So, for vector types, the trick becomes:
r = a/(b - !b);
to correct for the difference in sign.
Other programming languages
The trick extends trivially to any programming language that can seamlessly cast between numerical and logical values. For example, in MATLAB, Octave or Scilab one would use:
r = a./(b + ~b)
for the same purpose (notice the use of ./ rather than / to allow component-wise division
between equi-dimensional vectors or matrices), and in Python:
r = a/(b + (not b))
Other languages may need explicit casting. For example, the expression in Mathematica would be:
r = a/(b + Boole[b == 0])
using the Boole function introduced in version 5.1, and in FORTRAN you would need something even uglier such as
r = a/(b + MERGE(1, 0, b == 0))
which is just as ugly as the C version with the ternary operator (and needs a recent enough version of the standard where MERGE is defined; I believe this was introduced with Fortran 90).