I hope you can help me with this. What final value will x have?
int32_t x = 0xE5;
int8_t i;

for (i = 0; i < 200; i++)
{
    x++;
}
What happens to int8_t when it exceeds its range of -128 to 127?
Thank you
Consider what happens in i++ when i is 127. The C standard’s specification of postfix ++ says, in C 2018 6.5.2.4 2:
… As a side effect, the value of the operand object is incremented (that is, the value 1 of the appropriate type is added to it)…
Unfortunately, it says nothing else about the arithmetic used; it does not say whether the addition is performed using int8_t arithmetic or int arithmetic or something else. In most operations in C, operands are promoted to at least the int type. For example, i += 1 is specified to be effectively equivalent to i = i + 1, and i + 1 is specified to promote this i to int. Then the addition yields 128, because 127 + 1 = 128 and 128 is representable in the int type. Then the 128 is converted to int8_t for storage in i. This is a problem because 128 is not representable in int8_t. C 2018 6.3.1.3 3 says there is either an implementation-defined result or an implementation-defined signal.
This means your compiler must document what happens here. There should be a manual for the compiler, and it should say what happens when an out-of-range result is converted to int8_t. For example, GCC documents that the result wraps modulo 256.
Since the standard is vague about the arithmetic used, it is possible the intent is the arithmetic would be performed in the int8_t type, and the addition would overflow, which has undefined behavior. But this would contrast with the general nature of the standard.
If the loop does continue past the wrap (and it will, since an int8_t can never hold a value as large as 200, so i < 200 remains true), then x++ will eventually exceed the int32_t type. If int32_t is the same as the int type, that addition overflows and has undefined behavior. If int is wider than int32_t, we have the same situation as before: a successful addition followed by an implementation-defined conversion.
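A small sketch of the wrapping case, assuming an implementation that (like GCC) documents wrapping the conversion modulo 256; this snippet is illustrative, not part of the original question:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int8_t i = 127;
    i++;                /* 127 + 1 is computed as int (128), then converted back to
                           int8_t: implementation-defined; GCC documents a wrap to -128 */
    printf("%d\n", i);  /* prints -128 on such implementations */

    /* Because an int8_t can never hold 200, the condition i < 200 in the original
       loop is then always true, so the loop never terminates and x keeps growing
       until x++ itself overflows int32_t. */
    return 0;
}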
Related
I am trying to add 1 to a character that is holding the maximum positive value it can hold. It is giving 0 as output instead of -256.
#include <stdio.h>

int main() {
    signed char c = 255;
    printf("%d\n", c + 1);
}

Output: 0
c + 2 gives 1
c + 3 gives 2
As per my understanding, it should give negative numbers once it reaches the maximum limit. Is this correct? I am testing on Ubuntu.
A signed char is very often 8 bits wide, encoding values in the range [-128...127].
signed char c = 255; is attempting to initialize c to a value outside the signed char range.
What happens next is implementation-defined behavior. Very commonly, 255 is converted "mod" 256 to the value -1.
signed char c = 255;
printf("%d\n", c ); // -1 expected
printf("%d\n", c + 1 ); // 0 expected
As per my understanding, it should give negative numbers once it reaches the maximum limit. Is this correct?
No. Adding 1 to the maximum int value is undefined behavior. There is no should. It might result in a negative number, it might not, it might exit the program - it is not defined.
Had code been
signed char c = 127;
printf("%d\n", c + 1 );
c + 1 would be 128 and "128\n" would be printed, as c + 1 is an int operation with an in-range int sum.
There are several implicit conversions to keep track of here:
signed char c = 255; is a conversion of the constant 255, which has type int, into the smaller type signed char. Initialization follows the rules of assignment, so the right operand gets converted to the type of the left.
The actual conversion from a large signed type to a small signed type follows this rule:
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
In practice, the very likely conversion to happen on a two's complement computer is that you end up with the signed char having the decimal value equivalent to 0xFF, which is -1.
c + 1 is an operation with two operands of types signed char and int respectively. For the + operator, it means that the usual arithmetic conversions are performed, see Implicit type promotion rules.
Meaning c gets converted to int and the operation is carried out on int type, which is also the type of the result.
printf("%d\n", stuff ); The functions like printf accepting a variable number of arguments undergo an oddball conversion rule called the default argument promotions. In case of integers, it means that the integer promotions (see link above) are carried out. If you pass c + 1 as parameter, then the type is int and no promotion takes place. But if you had just passed c, then it gets implicitly promoted to int as per these rules. Which is why using %d together with character type actually works, even though it's the wrong conversion specifier for printing characters.
As per my understanding, it should give negative numbers once it reaches the maximum limit. Is this correct?
If you simply do signed char c = 127; c++; then the increment is carried out in int, and converting the out-of-range result 128 back to signed char is implementation-defined (or raises an implementation-defined signal) - commonly it simply wraps to -128.
If you do signed char c = 127; ... c + 1 then there's no overflow because of the implicit promotion to int.
If you do unsigned char c = 255; c++; then there is a well-defined wrap-around, since this is an unsigned type: c becomes zero. Signed types do not get such a well-defined wrap-around - out-of-range conversions to them are implementation-defined, and genuine signed arithmetic overflow is undefined.
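To make the difference concrete, here is a short sketch (assuming an 8-bit char and an implementation that wraps the signed conversion, as most do):

#include <stdio.h>

int main(void)
{
    unsigned char u = 255;
    u++;                    /* well defined: the result wraps modulo 256, so u is 0 */
    printf("%d\n", u);

    signed char s = 127;
    printf("%d\n", s + 1);  /* s is promoted to int, so this safely prints 128 */

    s++;                    /* the int result 128 is converted back to signed char:
                               implementation-defined, commonly wrapping to -128 */
    printf("%d\n", s);
    return 0;
}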
In practice, signed number overflow is artificial nonsense invented by the C standard. All well-known computers just set an overflow and/or carry bit when an overflow happens at the assembly level, properly documented and well-defined by the core manual. The reason it turns into "undefined behavior" in C is mainly because C allows for nonsensical signedness formats like one's complement or signed magnitude, which may have padding bits, trap representations or other such exotic, mostly fictional stuff.
Nowadays, though, optimizing compilers take advantage of overflow not being allowed to happen in order to generate more efficient code. This is unfortunate, since we could have had both fast and 100% deterministic code if 2's complement were the only allowed format.
#include <limits.h>

int main() {
    int a = UINT_MAX;
    return 0;
}
Is this UB or implementation-defined?
Links saying it's UB
https://www.gnu.org/software/autoconf/manual/autoconf-2.63/html_node/Integer-Overflow-Basics
Allowing signed integer overflows in C/C++
Links saying it's implementation-defined
http://www.enseignement.polytechnique.fr/informatique/INF478/docs/Cpp/en/c/language/signed_and_unsigned_integers.html
Conversion rule says:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Aren't we converting a max unsigned value into a signed value?
The way I have seen it, gcc just truncates the result.
Both references are correct, but they do not address the same issue.
int a = UINT_MAX; is not an instance of signed integer overflow: this declaration involves a conversion from unsigned int to int with a value that exceeds the range of type int. As quoted from the École polytechnique's site, the C Standard defines the behavior as implementation-defined.
#include <limits.h>

int main() {
    int a = UINT_MAX;    // implementation-defined behavior
    int b = INT_MAX + 1; // undefined behavior
    return 0;
}
Here is the text from the C Standard:
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Some compilers have a command line option to change the behavior of signed arithmetic overflow from undefined behavior to implementation-defined: gcc and clang support -fwrapv to force integer computations to be performed modulo 2^32 or 2^64 depending on the width of the signed type. This prevents some useful optimisations, but also prevents some counterintuitive optimisations that may break innocent looking code. See this question for some examples: What does -fwrapv do?
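As a small sketch of the effect: when built with -fwrapv the addition below is defined to wrap, while without the flag it is undefined behavior (32-bit int assumed):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int x = INT_MAX;
    int y = x + 1;      /* undefined behavior by the standard; with -fwrapv the
                           computation wraps modulo 2^32 and y becomes INT_MIN */
    printf("%d\n", y);  /* typically prints -2147483648 when built with -fwrapv */
    return 0;
}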
int a = UINT_MAX; does not overflow because no exceptional condition occurs while evaluating this declaration or the expression within it. This code is defined to convert UINT_MAX to the type int for the initialization of a, and the conversion is defined by the rules in C 2018 6.3.1.3.
Briefly, the rules that apply are:
6.7.9 11 says initialization behaves similarly to simple assignment: “… The initial value of the object is that of the expression (after conversion); the same type constraints and conversions as for simple assignment apply,…”
6.5.16.1 2 says simple assignment performs a conversion: “In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.”
6.3.1.3 3, which covers conversion to a signed integer type when the operand value cannot be represented in the type, says: “either the result is implementation-defined or an implementation-defined signal is raised.”
So, the behavior is defined.
There is a general rule in 2018 6.5 5 about exceptional conditions that occur while evaluating expressions:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
However, this rule never applies in the chain above. While doing the evaluations, including the implied assignment of the initialization, we never get a result out of range of its type. The input to the conversion is out of range of the destination type, int, but the result of the conversion is in range, so there is no out-of-range result to trigger an exceptional condition.
(A possible exception to this is that the C implementation could, I suppose, define the result of the conversion to be out of range of int. I am not aware of any that do, and this is likely not what was intended by 6.3.1.3 3.)
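As a concrete sketch of that defined-but-implementation-specific behavior (assuming an implementation that, like GCC, documents wrapping the conversion):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int a = UINT_MAX;   /* implementation-defined conversion, not undefined behavior */
    printf("%d\n", a);  /* on implementations that wrap modulo 2^32, this prints -1 */
    return 0;
}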
This is not signed integer overflow:
int a = UINT_MAX;
It is a conversion from an unsigned to a signed integer type and is implementation defined. This is covered in section 6.3.1.3 of the C standard regarding conversion of signed and unsigned integer types:
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
An example of signed integer overflow would be:
int x = INT_MAX;
x = x + 1;
And this is undefined. In fact, section 3.4.3 of the C standard, which defines undefined behavior, states in paragraph 4:
An example of undefined behavior is the behavior on integer overflow
And integer overflow only applies to signed types as per 6.2.5p9:
The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type
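A short sketch of the unsigned case described in that paragraph:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned int u = UINT_MAX;
    u = u + 1;           /* defined: reduced modulo UINT_MAX + 1, so u becomes 0 */
    printf("%u\n", u);   /* prints 0 */
    return 0;
}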
In the pre-existing "language" (family of dialects) the C Standard was written to describe, implementations would generally process signed integer overflow either by doing whatever the underlying platform did, by truncating values to the width of the underlying type (which is what most platforms did) even on platforms which would otherwise do something else, or by triggering some form of signal or diagnostic.
In K&R's book "The C Programming Language", the behavior is described as "machine-dependent".
Although the authors of the Standard, in the published Rationale document, identified some cases where they expected that implementations for commonplace platforms would behave in commonplace fashion, they didn't want to say that certain actions would have defined behavior on some platforms but not others. Further, characterizing the behavior as "implementation-defined" would have created a problem. Consider something like:
int f1(void);
int f2(int a, int b, int c);

int test(int x, int y)
{
    int test = x*y;
    if (f1())
        f2(test, x, y);
}
If the behavior of integer overflow were "Implementation Defined", then any implementation where it could raise a signal or have other observable side effects would be required to perform the multiplication before calling f1(), even though the result of the multiply would be ignored unless f1() returns a non-zero value. Classifying it as "Undefined Behavior" avoids such issues.
Unfortunately, gcc interprets the classification as "Undefined Behavior" as an invitation to treat integer overflow in ways that aren't bound by ordinary laws of causality. Given a function like:
unsigned mul_mod_32768(unsigned short x, unsigned short y)
{
    return (x*y) & 0x7FFFu;
}
an attempt to call it with x greater than INT_MAX/y may arbitrarily disrupt the behavior of surrounding code, even if the result of the function would not otherwise have been used in any observable fashion.
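A commonly suggested workaround (not part of the original answer) is to force the multiplication into unsigned arithmetic, which is defined to wrap; a sketch modifying the function above:

unsigned mul_mod_32768(unsigned short x, unsigned short y)
{
    /* Multiplying by 1u first converts x (and then the product with y) to
       unsigned int, so the arithmetic wraps instead of overflowing as int. */
    return (1u * x * y) & 0x7FFFu;
}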
So can I cast the values to unsigned, do the operation, and cast back, and get the same result? I want to do this because unsigned integers can overflow, while signed can't.
Unsigned integer arithmetic does not overflow in C terminology because it is defined to wrap modulo 2^N, where N is the number of bits in the unsigned type being operated on, per C 2018 6.2.5 9:
… A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
For other types, if an overflow occurs, the behavior is not defined by the C standard, per 6.5 5:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

Note that not just the result is undefined; the entire behavior of the program is undefined. It could give a result you do not expect, it could trap, or it could execute entirely different code from what you expect.
Regarding your question:
So can I cast the values to unsigned values, do the operation and cast back, and get the same result?
we have two problems. First, consider a + b given int a, b;. If a + b overflows, then the behavior is not defined by the C standard. So we cannot say whether converting to unsigned, adding, and converting back to int will produce the same result because there is no defined result for a + b to start with.
Second, the conversion back is partly implementation-defined, per C 6.3.1.3. Consider int c = (unsigned) a + (unsigned) b;, which implicitly converts the unsigned sum to an int to store in c. Paragraph 1 tells us that, if the value of the sum is representable in int, it is the result of the conversion. But paragraph 3 tells us what happens if the value is not representable in int:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
GCC, for example, defines the result to be the result of wrapping modulo 2^N. So, for int c = (unsigned) a + (unsigned) b;, GCC will produce the same result as int c = a + b; would if a + b wrapped modulo 2^N. However, GCC does not guarantee the latter. When optimizing, GCC expects overflow will not occur, which can result in it eliminating any code branches where the program does allow overflow to occur. (GCC may have some options regarding its treatment of overflow.)
Additionally, even if both signed arithmetic and unsigned arithmetic wrap, performing an operation using unsigned values and converting back does not mathematically produce the same result as doing the operation with signed values. For example, consider -3/2. The int result is −1. But if -3 is converted to 32-bit unsigned, the resulting value is 2^32 − 3, and then (int) ((unsigned) -3 / (unsigned) 2) is 2^31 − 2 = 2,147,483,646.
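For instance (assuming a 32-bit int), a quick check of that division example:

#include <stdio.h>

int main(void)
{
    int a = -3, b = 2;
    printf("%d\n", a / b);                              /* -1: ordinary signed division */
    printf("%d\n", (int)((unsigned)a / (unsigned)b));   /* 2147483646 with 32-bit int */
    return 0;
}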
int tx = INT_MAX + 1; // 2147483648
printf("tx = %d\n", tx);
prints tx = -2147483648.
I was wondering how to explain the result based on 6.3 Conversions in the C11 standard.

When evaluating INT_MAX + 1, are both operands int? Is the result 2147483648 a long int? Which rule in 6.3 determines the type of the result?

When evaluating tx = ..., are the higher bits of the bit representation of the right-hand side truncated so that its size changes from long int to int, and is the truncated result then interpreted as int? Which rules in 6.3 determine how the conversion in this step is done?
Both INT_MAX and 1 have type int, so the result will have type int. Performing this operation causes signed integer overflow which is undefined behavior.
Section 3.4.3p3 gives this as an example of undefined behavior:
EXAMPLE An example of undefined behavior is the behavior on integer overflow.
The relevant part here is 6.5/5:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
This happens because both INT_MAX and the integer constant 1 have type int. So you simply can't do INT_MAX + 1. And there are no implicit promotions/conversions present to save the day, so 6.3 does not apply. It's a bug; anything can happen.
What you could do is to force a conversion by changing the code to int tx = INT_MAX + 1u;. Here one operand, 1u, is of unsigned int type. Therefore the usual arithmetic conversions convert INT_MAX to type unsigned int (See Implicit type promotion rules). The result is a well-defined 2147483648 and of type unsigned int.
Then there's an attempt to store this inside int tx, conversion to the left operand of assignment applies and then the conversion rules of 6.3 kick in. Specifically 6.3.1.3/3:
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
So by changing the type to 1u we changed the code from undefined to implementation-defined behavior. Still not ideal, but at least now the code has deterministic behavior on the given compiler. In theory, the result could be a SIGFPE signal, but in practice all real-world 2's complement 32/64-bit compilers are likely to give you the result -2147483648.
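A small sketch of the two steps described above (assuming a 32-bit int and an implementation that wraps the final conversion, as typical 2's complement compilers do):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* The usual arithmetic conversions convert INT_MAX to unsigned int, so the
       addition is well defined and yields 2147483648u (with a 32-bit int). */
    unsigned int u = INT_MAX + 1u;
    printf("%u\n", u);

    /* Storing that value in an int triggers the implementation-defined conversion
       of 6.3.1.3/3; a typical 2's complement implementation gives -2147483648. */
    int tx = INT_MAX + 1u;
    printf("%d\n", tx);
    return 0;
}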
Ironically, all real-world 2's complement CPUs I've ever heard of perform signed overflow in a deterministic way. So the undefined behavior part of C is just an artificial construct by the C standard, caused by the useless language feature that allows exotic 1's complement and signed magnitude formats. In such exotic formats, signed overflow could lead to a trap representation and so C must claim that integer overflow is undefined behavior, even though it is not on the real-world 2's complement CPU that the C program is executing on.
Is the following code undefined behavior according to GCC in C99 mode:
signed char c = CHAR_MAX; // assume CHAR_MAX < INT_MAX
c = c + 1;
printf("%d", c);
signed char overflow does cause undefined behavior, but that is not what happens in the posted code.
With c = c + 1, the integer promotions are performed before the addition, so c is promoted to int in the expression on the right. Since 128 is less than INT_MAX, this addition occurs without incident. Note that char is typically narrower than int, but on rare systems char and int may be the same width. In either case a char is promoted to int in arithmetic expressions.
When the assignment to c is then made, if plain char is unsigned on the system in question, the result of the addition is less than UCHAR_MAX (which must be at least 255) and this value remains unchanged in the conversion and assignment to c.
If instead plain char is signed, the result of the addition is converted to a signed char value before assignment. Here, if the result of the addition can't be represented in a signed char the conversion "is implementation-defined, or an implementation-defined signal is raised," according to §6.3.1.3/3 of the Standard. SCHAR_MAX must be at least 127, and if this is the case then the behavior is implementation-defined for the values in the posted code when plain char is signed.
The behavior is not undefined for the code in question, but is implementation-defined.
No, it has implementation-defined behavior, either storing an implementation-defined result or possibly raising a signal.
Firstly, the usual arithmetic conversions are applied to the operands. This converts the operands to type int, and so the computation is performed in type int. The result value 128 is guaranteed to be representable in int, since INT_MAX is guaranteed to be at least 32767 (5.2.4.2.1 Sizes of integer types), so next a value of 128 in type int must be converted to type char to be stored in c. If char is unsigned, CHAR_MAX is guaranteed to be at least 255, so 128 is representable and the conversion leaves it unchanged; otherwise, if SCHAR_MAX takes its minimal value of 127:
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type, [if] the new type is signed and the value cannot be represented in it[,] either the
result is implementation-defined or an implementation-defined signal is raised.
In particular, gcc can be configured to treat char as either signed or unsigned (-f[un]signed-char); by default it will pick the appropriate configuration for the target platform ABI, if any. If a signed char is selected, all current gcc target platforms that I am aware of have an 8-bit byte (some obsolete targets such as AT&T DSP1600 had a 16-bit byte), so it will have range [-128, 127] (8-bit, two's complement) and gcc will apply modulo arithmetic yielding -128 as the result:
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
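Putting that documented behavior together with the code in question, a sketch (using SCHAR_MAX so the example does not depend on whether plain char is signed):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    signed char c = SCHAR_MAX;   /* 127 on an 8-bit signed char */
    c = c + 1;                   /* addition happens in int (128); the conversion back
                                    is implementation-defined; gcc wraps modulo 2^8 */
    printf("%d\n", c);           /* prints -128 with gcc's documented behavior */
    return 0;
}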