What is the defined behavior in C for UINT_MAX + 1u? How safe is it to assume it is zero?
From the standard (C11, 6.2.5/9):
[...] A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
For illustration, if UINT_MAX were 10 (real UINT_MAX values are always one less than a power of two):
(10 + 1) % (10 + 1) == 0
So, yes, it's safe to assume it's zero.
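A minimal demonstration (nothing here is implementation-specific; the result is guaranteed by the standard):

#include <limits.h>
#include <stdio.h>

int main(void) {
    unsigned int u = UINT_MAX;
    u = u + 1u;            /* reduced modulo UINT_MAX + 1, so this is exactly 0 */
    printf("%u\n", u);     /* prints 0 */
    return 0;
}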
It's worth emphasizing that while unsigned behavior is well-defined, signed integer overflow isn't:
http://en.wikipedia.org/wiki/Integer_overflow
In the C programming language, signed integer overflow causes
undefined behavior, while unsigned integer overflow causes the number
to be reduced modulo a power of two
A very good paper on the subject:
http://www.cs.utah.edu/~regehr/papers/overflow12.pdf
EXAMPLES OF C/C++ INTEGER OPERATIONS AND THEIR RESULTS

Expression              Result
----------              ------
UINT_MAX+1              0
LONG_MAX+1              undefined
INT_MAX+1               undefined
SHRT_MAX+1              SHRT_MAX+1 if INT_MAX > SHRT_MAX, otherwise undefined
char c = CHAR_MAX; c++  varies
-INT_MIN                undefined
(char)INT_MAX           commonly -1
1<<-1                   undefined
1<<0                    1
1<<31                   commonly INT_MIN in ANSI C and C++98; undefined in C99 and C++11 (assuming 32-bit int)
1<<32                   undefined (assuming int is at most 32 bits)
1/0                     undefined
INT_MIN%-1              undefined in C11, otherwise undefined in practice
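One row worth spelling out is SHRT_MAX+1: integer promotion means the addition happens in int, so it does not overflow on platforms where INT_MAX > SHRT_MAX. A small illustration, assuming such a platform:

#include <limits.h>
#include <stdio.h>

int main(void) {
    short s = SHRT_MAX;
    int r = s + 1;        /* s is promoted to int, so no overflow when INT_MAX > SHRT_MAX */
    printf("%d\n", r);    /* e.g. 32768 when SHRT_MAX is 32767 */
    return 0;
}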
It's safe. The C standard guarantees that unsigned arithmetic wraps around, so UINT_MAX + 1u is exactly zero.
Should be safe: see the Wikipedia article on unsigned overflow.
Note that unsigned int overflow is well defined.
There is also an existing question covering exactly this.
I hope you can help me with this
What final value will x have?
int32_t x = 0xE5;
int8_t i;
for (i = 0; i < 200; i++)
{
    x++;
}
What happens to int8_t when it exceeds its range of -128 to 127?
Thank you
Consider what happens in i++ when i is 127. The C standard’s specification of postfix ++ says, in C 2018 6.5.2.4 2:
… As a side effect, the value of the operand object is incremented (that is, the value 1 of the appropriate type is added to it)…
Unfortunately, it says nothing else about the arithmetic used; it does not say whether the addition is performed using int8_t arithmetic or int arithmetic or something else. In most operations in C, operands are promoted to at least the int type. For example, i += 1 is specified to be effectively equivalent to i = i + 1, and i + 1 is specified to promote this i to int. Then the addition yields 128, because 127 + 1 = 128 and 128 is representable in the int type. Then the 128 is converted to int8_t for storage in i. This is a problem because 128 is not representable in int8_t. C 2018 6.3.1.3 3 says there is either an implementation-defined result or an implementation-defined signal.
This means your compiler must document what happens here. There should be a manual for the compiler, and it should say what happens when an out-of-range result is converted to int8_t. For example, GCC documents that the result wraps modulo 256.
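For example, under GCC's documented behaviour the following sketch (my example, not from the standard) prints -128:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    int8_t i = 127;
    i++;                  /* computed as (int)i + 1 == 128, then converted back to int8_t;
                             the conversion is implementation-defined (GCC wraps modulo 256) */
    printf("%d\n", i);    /* -128 with GCC; another implementation could differ or raise a signal */
    return 0;
}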
Since the standard is vague about the arithmetic used, it is possible the intent is the arithmetic would be performed in the int8_t type, and the addition would overflow, which has undefined behavior. But this would contrast with the general nature of the standard.
Since int8_t cannot represent any value as large as 200, the condition i < 200 can never become false if i simply wraps, so the loop keeps running, and x++ will eventually exceed the int32_t type. If int32_t is the same as the int type, that addition will overflow and have undefined behavior. If int is wider than int32_t, we have the same situation as before: a successful addition followed by an implementation-defined conversion.
I was working with integers in C, trying to explore more on when and how overflow happens.
I noticed that when I added two positive numbers, the sum of which overflows, I always got a negative number.
On the other hand, if I added two negative numbers, the sum of which overflows, I always got a positive number (including 0).
I made a few experiments, but I would like to know if this is true in every case.
Integer overflows are undefined behavior in C.
C says an expression involving integers overflows if its result, after the usual arithmetic conversions, is of a signed type and cannot be represented in the type of the result. Assignment and cast expressions are an exception, as they are governed by the integer conversions.
Expressions of unsigned type cannot overflow; they wrap, e.g., 0U - 1 is UINT_MAX.
Examples:
INT_MAX + 1 // integer overflow
UINT_MAX + 1 // no overflow, the resulting type is unsigned
(unsigned char) INT_MAX // no overflow, integer conversion occurs
Never let any integer expression overflow; modern compilers (like gcc) take advantage of integer overflow being undefined behavior to perform various kinds of optimizations.
For example:
a - 10 < 20
when a is of type int after promotion, the expression is reduced by gcc (when optimizations are enabled) to:
a < 30
It takes advantage of the expression being undefined behavior when a is in the range INT_MIN to INT_MIN + 9, where a - 10 would overflow.
This optimization could not be done when a is unsigned int because if a is 0, then a - 10 has to be evaluated as UINT_MAX - 9 (no undefined behavior). Optimizing a - 10 < 20 to a < 30 would then lead to a different result than the required one when a is 0 to 9.
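A small sketch of the difference (the function names are mine, purely for illustration); with a signed operand the compiler may fold the test, while with an unsigned operand it cannot:

#include <stdio.h>

int test_signed(int a)        { return a - 10 < 20; }   /* may be folded to a < 30 */
int test_unsigned(unsigned a) { return a - 10 < 20; }   /* a - 10 wraps, so no such folding */

int main(void) {
    printf("%d\n", test_signed(0));    /* 1: -10 < 20 */
    printf("%d\n", test_unsigned(0));  /* 0: 0u - 10 is a huge value, not below 20 */
    return 0;
}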
Overflow of signed integers is undefined behaviour in C, so there are no guarantees.
That said, wrap around, or arithmetic modulo 2^N, where N is the number of bits in the type, is a common behaviour. For that behaviour, indeed if a sum overflows, the result has the opposite sign of the operands.
Formally, the behaviour of signed arithmetic on overflow is undefined; anything can happen and it is 'correct'. This contrasts with unsigned arithmetic, where overflow is completely defined.
In practice, many older compilers used signed arithmetic which overflowed as you describe. However, modern GCC is making changes to the way it works, and you'd be very ill-advised to rely on the behaviour. It may change at any time when anything in the environment where your code is compiled changes — the compiler, the platform, ...
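If you actually need wrap-around semantics for signed values, a more defensible approach (a sketch of mine, not part of the answer above) is to do the arithmetic in unsigned, where wrapping is defined, and convert back; GCC and Clang also offer -fwrapv, which makes signed overflow wrap as a compiler-specific guarantee.

#include <limits.h>
#include <stdio.h>

/* Wrap-around addition: the unsigned addition is well defined; the
   conversion back to int is implementation-defined, but wraps on the
   common two's complement compilers. */
int add_wrap(int a, int b) {
    return (int)((unsigned)a + (unsigned)b);
}

int main(void) {
    printf("%d\n", add_wrap(INT_MAX, 1));  /* commonly prints INT_MIN */
    return 0;
}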
Overflow in C is a godawful mess.
Overflow during unsigned arithmetic or conversion to an unsigned type results in wrapping modulo 2^n.
Overflow during conversion to a signed type is implementation-defined; most implementations will wrap modulo 2^n, but some may not.
Overflow during signed arithmetic is undefined behaviour; according to the standard, anything might happen. In practice it will sometimes do what you want, and sometimes it will cause strange issues later in your code as the compiler optimises out important tests.
What makes things even worse is how this interacts with integer promotion. Thanks to promotion you can be doing signed arithmetic when it looks like you are doing unsigned arithmetic. For example consider the following code
uint16_t a = 65535;
uint16_t b = a * a;   /* both operands are promoted to int before the multiplication */
On a system with 16-bit int this code is well-defined. However on a system with 32-bit int the multiplication will take place as signed int and the resulting overflow will be undefined behavior!
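A common way around this particular trap (my sketch, not part of the original answer) is to force the multiplication into an unsigned type wide enough for the result before truncating back:

#include <stdint.h>

uint16_t square_low16(uint16_t a) {
    /* (uint32_t)a * a is performed in unsigned arithmetic, so 65535 * 65535
       cannot invoke signed overflow; the cast back keeps the low 16 bits. */
    return (uint16_t)((uint32_t)a * a);
}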
int tx = INT_MAX +1; // 2147483648;
printf("tx = %d\n", tx);
prints tx = -2147483648.
I was wondering how to explain the result based on 6.3 Conversions in the C11 standard.
When evaluating INT_MAX + 1, are both operands int? Is the result 2147483648, of type long int? Which rule in 6.3 determines the type of the result?
When evaluating tx = ..., are the higher bits of the bit representation of the right-hand side truncated so that its size changes from long int to int, and is the truncated result then interpreted as int? Which rules in 6.3 determine how the conversion in this step is done?
Both INT_MAX and 1 have type int, so the result will have type int. Performing this operation causes signed integer overflow which is undefined behavior.
Section 3.4.3p3 Gives this as an example of undefined behavior:
EXAMPLE An example of undefined behavior is the behavior on integer overflow.
The relevant part here is 6.5/5:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
This happens because both INT_MAX and the integer constant 1 have types int. So you simply can't do INT_MAX + 1. And there are no implicit promotions/conversions present to save the day, so 6.3 does not apply. It's a bug, anything can happen.
What you could do is to force a conversion by changing the code to int tx = INT_MAX + 1u;. Here one operand, 1u, is of unsigned int type. Therefore the usual arithmetic conversions convert INT_MAX to type unsigned int (See Implicit type promotion rules). The result is a well-defined 2147483648 and of type unsigned int.
Then there's an attempt to store this inside int tx, conversion to the left operand of assignment applies and then the conversion rules of 6.3 kick in. Specifically 6.3.1.3/3:
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
So by changing the constant to 1u we changed the code from undefined to implementation-defined behavior. Still not ideal, but at least now the code has deterministic behavior on the given compiler. In theory, the result could be a SIGFPE signal, but in practice all real-world two's complement 32/64-bit compilers are likely to give you the result -2147483648.
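Put together, a sketch of what this looks like in practice (assuming a typical 32-bit, two's complement int where the implementation-defined conversion wraps):

#include <limits.h>
#include <stdio.h>

int main(void) {
    unsigned int u = INT_MAX + 1u;  /* well defined: INT_MAX is converted to unsigned int */
    int tx = (int)u;                /* out of range: implementation-defined result or signal */
    printf("u  = %u\n", u);         /* 2147483648 with 32-bit int */
    printf("tx = %d\n", tx);        /* commonly -2147483648 */
    return 0;
}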
Ironically, all real-world 2's complement CPUs I've ever heard of perform signed overflow in a deterministic way. So the undefined behavior part of C is just an artificial construct by the C standard, caused by the useless language feature that allows exotic 1's complement and signed magnitude formats. In such exotic formats, signed overflow could lead to a trap representation and so C must claim that integer overflow is undefined behavior, even though it is not on the real-world 2's complement CPU that the C program is executing on.
In C99 there are some (optional) types like int8_t, int16_t and the like, which are guaranteed to have exactly the specified width, no padding bits, and a two's complement representation (7.18.1.1). In 6.2.6.2, signed integer overflow is mentioned in footnotes 44) and 45), namely that it might result in trapping values in padding bits.
As intN_t don't have any padding bits, and they are guaranteed to be two's complement, does this mean that their overflow doesn't generate any undefined behavior? What would be the result of e.g. overflowing multiplication? What about addition? Is the result reduced modulo 2^N as for unsigned types?
Footnotes are not normative. If a footnote states that overflow can result in trapping values in padding bits, it is not really wrong, but I can see how it is slightly misleading. The normative text merely says the behaviour is undefined. Placing trapping values in padding bits is one possible consequence of undefined behaviour, but not the only one.
So no, this does not mean overflow is defined. It's possible for operations involving intN_t/uintN_t operands to overflow, and for that overflow to result in undefined behaviour.
Something like int8_t i = 127; ++i; has no UB. int8_t is subject to integral promotions, so the addition is carried out as if you had written i = (int8_t) ((int) i + 1);. The addition itself does not overflow, and the conversion back to int8_t produces an implementation-defined result.
Something like uint16_t u = 65535; u *= u; has UB on current typical implementations where int has 32 sign/value bits. uint16_t too is subject to integral promotions, so the multiplication is carried out as if you had written u = (uint16_t) ((int) u * (int) u);. The multiplication itself overflows, and the behaviour is undefined.
Something like int64_t i = 9223372036854775807; ++i; has UB on almost all implementations. The addition itself overflows.
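If you need to increment one of these fixed-width types safely, checking against the limit first avoids both problems; a minimal sketch (the function name is mine):

#include <stdbool.h>
#include <stdint.h>

/* Increment *p only when doing so cannot overflow; report failure otherwise. */
bool inc64_checked(int64_t *p) {
    if (*p == INT64_MAX)
        return false;      /* incrementing would overflow int64_t */
    ++*p;
    return true;
}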
No, it isn't well defined because ... it isn't defined at all. There simply is no text in the standard that would give semantics to the overflow of a signed integer.
Don't overstress the term "undefined behavior" as being something mysterious, but take it in its direct sense. There is no definition of the behavior, so the standard doesn't specify what should happen, and no portable code should rely on such a feature.