Is overflow of intN_t well-defined? [duplicate] - c

This question already has answers here:
Do C99 signed integer types defined in stdint.h exhibit well-defined behaviour in case of an overflow?
(2 answers)
Closed 7 years ago.
In C99 there're some (optional) types like int8_t, int16_t and the like, which are guaranteed to be have exactly specified width and no padding bits, and represent numbers in two's complement (7.18.1.1). In 6.2.6.2 signed integer overflow is mentioned as footnotes 44) and 45), namely that it might result in trapping values in padding bits.
As intN_t don't have any padding bits, and they are guaranteed to be two's complement, does this mean that their overflow doesn't generate any undefined behavior? What would be the result of e.g. overflowing multiplication? What about addition? Is the result reduced modulo 2^N as for unsigned types?

Footnotes are not normative. If a footnote states that overflow can result in trapping values in padding bits, it is not really wrong, but I can see how it is slightly misleading. The normative text merely says the behaviour is undefined. Placing trapping values in padding bits is one possible consequence of undefined behaviour, but not the only one.
So no, this does not mean overflow is defined. It's possible for operations involving intN_t/uintN_t operands to overflow, and for that overflow to result in undefined behaviour.
Something like int8_t i = 127; ++i; has no UB. int8_t is subject to integral promotions, so the addition is carried out as if you had written i = (int8_t) ((int) i + 1);. The addition itself does not overflow, and the conversion back to int8_t produces an implementation-defined result.
Something like uint16_t u = 65535; u *= u; has UB on current typical implementations where int has 32 sign/value bits. uint16_t too is subject to integral promotions, so the multiplication is carried out as if you had written u = (uint16_t) ((int) u * (int) u);. The multiplication itself overflows, and the behaviour is undefined.
Something like int64_t i = 9223372036854775807; ++i; has UB on almost all implementations. The addition itself overflows.

No, it isn't well defined because ... it isn't defined at all. There simply is no text in the standard that would give semantic to the overflow of a signed integer.
Don't overstress the term "undefined behavior" as being something mysterious, but take in its direct sense. There is no definition of the behavior, so the standard doesn't specify anything what should happen and no portable code should rely on such a feature.

Related

Why the variable whose value is i+1000000 is not less than i when i is big enough? [duplicate]

I was working with integers in C, trying to explore more on when and how overflow happens.
I noticed that when I added two positive numbers, the sum of which overflows, I always got a negative number.
On the other hand, if I added two negative numbers, the sum of which overflows, I always got a positive number (including 0).
I made few experiments, but I would like to know if this is true for every case.
Integer overflows are undefined behavior in C.
C says an expression involving integers overflows, if its result after the usual arithmetic conversions is of a signed typed and cannot be represented in the type of the result. Assignment and cast expressions are an exception as they are ruled by the integer conversions.
Expressions of unsigned type cannot overflow, they wrap, e. g., 0U - 1 is UINT_MAX.
Examples:
INT_MAX + 1 // integer overflow
UINT_MAX + 1 // no overflow, the resulting type is unsigned
(unsigned char) INT_MAX // no overflow, integer conversion occurs
Never let any integer expression overflows, modern compilers (like gcc) take advantage of integer overflows being undefined behavior to perform various types of optimizations.
For example:
a - 10 < 20
when a is of type int after promotion, the expression is reduced in gcc (when optimization are enabled) to:
a < 30
It takes advantage of the expression being undefined behavior when a is in the range INT_MIN + 10 - 1 to INT_MIN.
This optimization could not be done when a is unsigned int because if a is 0, then a - 10 has to be evaluated as UINT_MAX - 9 (no undefined behavior). Optimizing a - 10 < 20 to a < 30 would then lead to a different result than the required one when a is 0 to 9.
Overflow of signed integers is undefined behaviour in C, so there are no guarantees.
That said, wrap around, or arithmetic modulo 2N, where N is the number of bits in the type, is a common behaviour. For that behaviour, indeed if a sum overflows, the result has the opposite sign of the operands.
Formally, the behaviour of signed arithmetic on overflow is undefined; anything can happen and it is 'correct'. This contrasts with unsigned arithmetic, where overflow is completely defined.
In practice, many older compilers used signed arithmetic which overflowed as you describe. However, modern GCC is making changes to the way it works, and you'd be very ill-advised to rely on the behaviour. It may change at any time when anything in the environment where your code is compiled changes — the compiler, the platform, ...
Overflow in C is a godawful mess.
Overflow during unsigned arithmetic or conversion to an unsigned type results in wraping modulo 2n
Overflow during conversion to a signed type is implementation defined, most implementations will wrap modulo 2n but some may not.
Overflow during signed arithmetic is undefined behaviour, according to the standard anything might happen. In practice sometimes it will do what you wan't, sometimes it will cause strange issues later in yoir code as the compiler optimises out important tests.
What makes things even worse is how this interacts with integer promotion. Thanks to promotion you can be doing signed arithmetic when it looks like you are doing unsigned arithmetic. For example consider the following code
uint16_t a = 65535;
uint16_t b = a * a;
On a system with 16-bit int this code is well-defined. However on a system with 32-bit int the multiplication will take place as signed int and the resulting overflow will be undefined behavior!

What rules in C11 standard determine the evaluation of `int tx = INT_MAX +1`?

int tx = INT_MAX +1; // 2147483648;
printf("tx = %d\n", tx);
prints tx = -2147483648.
I was wondering how to explain the result based on 6.3 Conversions in C11 standard?
when evaluating INT_MAX +1, are both operands int? Is the result 2147483648 long int? Which rule in 6.3 determines the type of the result?
when evaluating tx = ..., are the higher bits of the bit representation of the right hand side truncated so that its size changes from long int size to int size, and then are the truncated result interpreted as int? What rules in 6.3 determines how the conversion in this step is done?
Both INT_MAX and 1 have type int, so the result will have type int. Performing this operation causes signed integer overflow which is undefined behavior.
Section 3.4.3p3 Gives this as an example of undefined behavior:
EXAMPLE An example of undefined behavior is the behavior on integer overflow.
The relevant part here is 6.5/5:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
This happens because both INT_MAX and the integer constant 1 have types int. So you simply can't do INT_MAX + 1. And there are no implicit promotions/conversions present to save the day, so 6.3 does not apply. It's a bug, anything can happen.
What you could do is to force a conversion by changing the code to int tx = INT_MAX + 1u;. Here one operand, 1u, is of unsigned int type. Therefore the usual arithmetic conversions convert INT_MAX to type unsigned int (See Implicit type promotion rules). The result is a well-defined 2147483648 and of type unsigned int.
Then there's an attempt to store this inside int tx, conversion to the left operand of assignment applies and then the conversion rules of 6.3 kick in. Specifically 6.3.1.3/3:
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
So by changing the type to 1u we changed the code from undefined to impl.defined behavior. Still not ideal, but at least now the code has deterministic behavior on the given compiler. In theory, the result could be a SIGFPE signal, but in practice all real-world 2's complement 32/64 bit compilers are likely to give you the result -2147483648.
Ironically, all real-world 2's complement CPUs I've ever heard of perform signed overflow in a deterministic way. So the undefined behavior part of C is just an artificial construct by the C standard, caused by the useless language feature that allows exotic 1's complement and signed magnitude formats. In such exotic formats, signed overflow could lead to a trap representation and so C must claim that integer overflow is undefined behavior, even though it is not on the real-world 2's complement CPU that the C program is executing on.

C Integer Overflow explanation [duplicate]

I was working with integers in C, trying to explore more on when and how overflow happens.
I noticed that when I added two positive numbers, the sum of which overflows, I always got a negative number.
On the other hand, if I added two negative numbers, the sum of which overflows, I always got a positive number (including 0).
I made few experiments, but I would like to know if this is true for every case.
Integer overflows are undefined behavior in C.
C says an expression involving integers overflows, if its result after the usual arithmetic conversions is of a signed typed and cannot be represented in the type of the result. Assignment and cast expressions are an exception as they are ruled by the integer conversions.
Expressions of unsigned type cannot overflow, they wrap, e. g., 0U - 1 is UINT_MAX.
Examples:
INT_MAX + 1 // integer overflow
UINT_MAX + 1 // no overflow, the resulting type is unsigned
(unsigned char) INT_MAX // no overflow, integer conversion occurs
Never let any integer expression overflows, modern compilers (like gcc) take advantage of integer overflows being undefined behavior to perform various types of optimizations.
For example:
a - 10 < 20
when a is of type int after promotion, the expression is reduced in gcc (when optimization are enabled) to:
a < 30
It takes advantage of the expression being undefined behavior when a is in the range INT_MIN + 10 - 1 to INT_MIN.
This optimization could not be done when a is unsigned int because if a is 0, then a - 10 has to be evaluated as UINT_MAX - 9 (no undefined behavior). Optimizing a - 10 < 20 to a < 30 would then lead to a different result than the required one when a is 0 to 9.
Overflow of signed integers is undefined behaviour in C, so there are no guarantees.
That said, wrap around, or arithmetic modulo 2N, where N is the number of bits in the type, is a common behaviour. For that behaviour, indeed if a sum overflows, the result has the opposite sign of the operands.
Formally, the behaviour of signed arithmetic on overflow is undefined; anything can happen and it is 'correct'. This contrasts with unsigned arithmetic, where overflow is completely defined.
In practice, many older compilers used signed arithmetic which overflowed as you describe. However, modern GCC is making changes to the way it works, and you'd be very ill-advised to rely on the behaviour. It may change at any time when anything in the environment where your code is compiled changes — the compiler, the platform, ...
Overflow in C is a godawful mess.
Overflow during unsigned arithmetic or conversion to an unsigned type results in wraping modulo 2n
Overflow during conversion to a signed type is implementation defined, most implementations will wrap modulo 2n but some may not.
Overflow during signed arithmetic is undefined behaviour, according to the standard anything might happen. In practice sometimes it will do what you wan't, sometimes it will cause strange issues later in yoir code as the compiler optimises out important tests.
What makes things even worse is how this interacts with integer promotion. Thanks to promotion you can be doing signed arithmetic when it looks like you are doing unsigned arithmetic. For example consider the following code
uint16_t a = 65535;
uint16_t b = a * a;
On a system with 16-bit int this code is well-defined. However on a system with 32-bit int the multiplication will take place as signed int and the resulting overflow will be undefined behavior!

Cyclic Nature of char datatype [duplicate]

This question already has answers here:
Simple Character Interpretation In C
(9 answers)
Closed 5 years ago.
I had been learning C and came across topic called Cyclic Nature of Data Type in C.
It is like example
char c=125;
c=c+10;
printf("%d",c);
The output is -121.
the logic given was
125+1= 126
125+2= 127
125+3=-128
125+4=-127
125+5=-126
125+6=-125
125+7=-124
125+8=-123
125+9=-122
125+10=-121
This is due to cyclic nature if char datatype. Y does Char exhibit cyclic nature?? How is it possible to char??
On your system char is signed char. When a signed integral type overflows, the result is undefined. It may or may not be cyclic. Although on most of the machines performing 2's complement arithmetic, you may observe this as cyclic.
The char data type is signed type, as per your implementation. As such it can store values in range: -128 to 127. When you store a value greater than 127, you would end up with a value that might be in negative or positive number, depending on how large is the value stored and what kind of platform you are working on.
Signed integer overflow is undefined behavior in C and then not at all unsigned numbers are guaranteed to wrap around.
char isn’t special in this regard (besides its implementation-defined signedness), all conversions to signed types usually exhibit this “cyclic nature”. However, there are undefined and implementation-defined aspects of signed overflow, so be careful when doing such things.
What happens here:
In the expression
c=c+10
the operands of + are subject to the usual arithmetic conversions. They include integer promotion, which converts all values to int if all values of their type can be represented as an int. This means, the left operand of + (c) is converted to an int (an int can hold every char value1)). The result of the addition has type int. The assignment implicitly converts this value to a char, which happens to be signed on your platform. An (8-bit) signed char cannot hold the value 135 so it is converted in an implementation-defined way 2). For gcc:
For conversion to a type of width N, the value is reduced modulo 2N to be within range of the type; no signal is raised.
Your char has a width of 8, 28 is 256, and 135 ☰ -121 mod 256 (cf. e.g. 2’s complement on Wikipedia).
You didn’t say which compiler you use, but the behaviour should be the same for all compilers (there aren’t really any non-2’s-complement machines anymore and with 2’s complement, that’s the only reasonable signed conversion definition I can think of).
Note, that this implementation-defined behaviour only applies to conversions, not to overflows in arbitrary expressions, so e.g.
int n = INT_MAX;
n += 1;
is undefined behaviour and used for optimizations by some compilers (e.g. by optimizing such statements out), so such things should definitely be avoided.
A third case (unrelated here, but for sake of completeness) are unsigned integer types: No overflow occurs (there are exceptions, however, e.g. bit-shifting by more than the width of the type), the result is always reduced modulo 2N for a type with precision N.
Related:
A simple C program output is not as expected
Allowing signed integer overflows in C/C++
1 At least for 8-bit chars, signed chars, or ints with higher precision than char, so virtually always.
2 The C standard says (C99 and C11 (n1570) 6.3.1.3 p.3) “[…] either the result is implementation-defined or an implementation-defined signal is raised.” I don’t know of any implementation raising a signal in this case. But it’s probably better not to rely on that conversion without reading the compiler documentation.

UINT_MAX + 1 equals what?

What is the defined behavior in C for UINT_MAX + 1u? How safe is to assume it is zero?
From the standard (C11, 6.2.5/9, emphasis mine):
[...] A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
If UINT_MAX is 10:
(10 + 1) % (10 + 1) == 0
So, yes, it's safe to assume it's zero.
It's worth emphasizing that while unsigned behavior is well-defined, signed integer overflow isn't:
http://en.wikipedia.org/wiki/Integer_overflow
In the C programming language, signed integer overflow causes
undefined behavior, while unsigned integer overflow causes the number
to be reduced modulo a power of two
A very good paper on the subject:
http://www.cs.utah.edu/~regehr/papers/overflow12.pdf
EXAMPLES OF C/C++ INTEGER OPERATIONS AND THEIR RESULTS
Expression Result
---------- ------
UINT_MAX+1 0
LONG_MAX+1 undefined
INT_MAX+1 undefined
SHRT_MAX+1 SHRT_MAX+1 if INT_MAX>SHRT_MAX, otherwise undefined
char c = CHAR_MAX; c++ varies
-INT_MIN undefined
(char)INT_MAX commonly -1
1<<-1 undefined
1<<0 1
1<<31 commonly INT_MIN in ANSI C and C++98; undefined in C99 and C++11
1<<32 undefined
1/0 undefined
INT_MIN%-1 undefined in C11, otherwise undefined in practice
It's safe. The C standard guarantees that unsigned integer overflow wrap-around results in zero.
Should be safe:
Wiki on unsigned overflow
Note the unsigned int overflow is well defined.
Also, here's a whole question on this.

Resources