The Number With Negative Absolute Value

As you know, computers internally work only with ones and zeros, and everything is represented in some way using these “bits.” Using M bits you can encode 2M different values. Let’s say you’re interested in encoding the signed integers …, -1, 0, 1, 2, …. To make things simple, let’s use a fixed number of bits: for example with 8 bits you can encode 28 = 256 different numbers. One of the numbers is zero, so you are left with 255 bit patterns to be divided between positive and negative numbers. Can you see any problems with this?

You can’t have the same number of positive and negative integers because 255 is odd. The range of representable numbers will not be symmetric around zero; this is the two’s complement anomaly. (You could solve the issue by deciding to just not use one, but that brings problems of its own.) In the two’s complement system there’s one negative number that doesn’t have a positive counterpart. This anomaly has a counter-intuitive consequence.

For example, if x is -128 as an 8-bit number, what is -x? If you work this out in binary, you start with x = 1000 00002, and then you compute -x which is by definition ~x+1, you get -x = 0111 11112 + 1 = 1000 00002. This is exactly what we started with: -x = x = -128, a mathematical impossibility?

Is the two’s complement system is broken?

Not really. Recall that computer integers are actually defined modulo 2M, and in this case the correct result of -x = 128 is equivalent to -128 modulo 28. We did get the right result, it just has an unexpected interpretation.

You’ll just have to get used to it: there are numbers with a negative absolute value and every time you see a call to abs() you should ask yourself if the argument can be the two’s complement anomaly, and what happens if it is. For example, suppose you want to produce a random stage from the Dreyfus model of skill acquisition, using a function rnd() that produces uniformly random 32-bit signed integers. Here’s a function that does just that:

char *dreyfus_stage() {
    char *stages[] = {"novice", "advanced beginner", 
                      "competent", "proficient", "expert"};
    int index = abs(rnd()) % 5;
    return stages[index];
}

This is the stuff of the Underhanded C Contest: the function looks innocent, but has serious flaws. Not only is it slightly biased; it can also end up computing index as -3 and return random garbage.