What happens when you add one to an integer?

It depends. You saw in the previous post that there are plenty of different integer types, some with known sizes and some where the size is set by the implementation. Well for each size of integer type there are two main variants: signed and unsigned.

Unsigned numbers are always zero or positive. They’re the easiest ones to understand, and their behaviour is well defined. In almost all cases, adding one to an unsigned integer in C makes that integer bigger by one. The only exceptional case is when the number already represents the maximum value that will fit in its type; adding one to the maximum “overflows” and gets you back to 0.

Signed integers are tricky. Computers don’t natively handle negative numbers, but signed values can (as the name suggests) be negative. Various conventions have been created to allow support for negative numbers: the most common is to treat one bit of a variable as the “sign” bit (as a note for overly-sensitive nerds: sometimes these conventions are honoured in CPU instructions, and you could say that such computers do natively handle negative numbers). If the sign bit is set, then the number is negative; otherwise it is positive. Some platforms have an extra bit separate from the storage of the number that indicates the sign of the number.

What this means is that if the C language were to specify what happens when a signed integer overflows, some implementations would be able to handle this efficiently but some would not as they’d have to translate the particular platform-specific behaviour into that required by the standard.

The result then of adding one to a signed integer is quite surprising: if it causes the number to overflow, the result is undefined. An implementation is free to do anything (implementers usually choose to do whatever’s most efficient); relying on the behaviour from one implementation means writing unportable code.

As a result of this it’s important to guard against integer overflow in C (and C++ and Objective-C) programs. Typically the unsigned integer types should only be used either as bitmasks, where the value of each bit is important but doesn’t affect interpretation of the other bits, or in situations where the known overflow behaviour is actually desired. In cases where you “know” a number will always be positive, it’s still best to use a signed integer, as that offers the possibility of detecting bugs that end up pushing the value below zero.

As an example, consider a data type in my application that I “know” will always have a count that’s positive and smaller than 200. I could use a uint8_t to represent that, but there are conditions that are erroneous and yet will lead to valid-looking answers. Imagine removing 80 objects from an instance with count 50, or adding 80 objects to an instance with count 180. Because of the overflow behaviour of uint8_t, these problems would leave the result “looking” OK. It would be better to represent this type using int16_t, which both accepts values below 0 and above 200; now the problematic cases described earlier do not overflow, but result in numbers that are within the range that can be represented and can therefore be tested against my application-specific requirements.

About Graham

I make it faster and easier for you to create high-quality code.
This entry was posted in buffer-overflow, code-level. Bookmark the permalink.