Adding a signed integer beyond 0xFFFFFFFF - C

#include <stdio.h>

void fun3(int a, int b, int c)
{
    printf("%d \n", a + b + c);
}

void fun2(int x, int y)
{
    fun3(0x33333333, 0x30303030, 0x31313131);
    printf("%d \n", x + y);
}

void fun1(int x)
{
    fun2(0x22222222, 0x20202020);
    printf("%d \n", x);
}

int main(void)
{
    fun1(0x1111111);
    return 0;
}
I'm stepping through the above program while studying stack corruption, and the output contains some unexpected values. All I could work out is that when the sum goes beyond 0xFFFFFFFF, a small negative integer maps to a large value, e.g. -1 becomes 0xFFFFFFFF. Any insights on this?

EDIT (Correction): I missed the point. My answer is right for constants, but the question involves function parameters, so what happens here is overflow of signed integer objects and, as @Cornstalks correctly pointed out in his comment, this is undefined behaviour.
/EDIT
In fun1() you are using printf() incorrectly.
You wrote "%d" to accept an int, but that does not hold if your number is greater than INT_MAX.
You have to check the value of INT_MAX on your system.
If you write an integer constant in hexadecimal format, standard C (ISO C99 or C11) gives it the first type in which the value fits, following this order:
int, unsigned int, long int, unsigned long int, long long int,
unsigned long long int.
Thus, if you have a constant greater than INT_MAX (the maximum value in the range of int), your constant (if positive) has type unsigned int, but the %d directive expects a signed int value, so some negative number will be shown.
Worse, if your constant is greater than UINT_MAX (the maximum value in the range of unsigned int), then the type of the constant will be the first of long int, unsigned long int, long long int with precision strictly greater than that of unsigned int.
This means %d is the wrong directive.
If you cannot be completely sure how big your values will be, you can cast to the widest integer type:
printf("%lld", (long long int) 0x33333333333);
The directive %lld stands for long long int.
If you are always interested in positive values, use %llu and cast to unsigned long long int:
printf("%llu", (unsigned long long int) 0x33333333333);
In this way you avoid any "funny" numbers, and you can show big numbers without losing any precision.
Remark: the constants INT_MAX, UINT_MAX, and the like are defined in <limits.h>.
Important: this automatic sequence of types is only valid for octal and hexadecimal constants. For decimal constants there is a different rule (see the sketch below):
int, long int, long long int.
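For example, a quick way to see which type a given constant ends up with is C11's _Generic. This is just an illustrative sketch, not part of the original answer; it assumes a 32-bit int, so 2147483648 does not fit in int:

#include <stdio.h>

#define TYPE_NAME(x) _Generic((x),                     \
    int: "int",                                        \
    unsigned int: "unsigned int",                      \
    long int: "long int",                              \
    unsigned long int: "unsigned long int",            \
    long long int: "long long int",                    \
    unsigned long long int: "unsigned long long int",  \
    default: "other")

int main(void)
{
    /* Decimal constants skip the unsigned types... */
    printf("2147483648 -> %s\n", TYPE_NAME(2147483648));  /* long (long) int */
    /* ...but hexadecimal constants try unsigned int first. */
    printf("0x80000000 -> %s\n", TYPE_NAME(0x80000000));  /* unsigned int */
    return 0;
}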

To @Cornstalks' point: INT_MIN is 0x80000000, and (int)-1 is 0xFFFFFFFF in 2's complement (on a 32-bit system, anyway).
This allows the instruction set to do things in signed arithmetic like:
1 + -2 = -1
becomes (as signed shorts, for brevity)
0x0001 + 0xFFFE = 0xFFFF
... then:
1 + -1 = 0
is represented internally with overflow as
0x0001 + 0xFFFF = 0x0000
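For what it's worth, the same two additions can be reproduced with fixed-width unsigned types, where the wraparound is well defined (a small sketch, not part of the original answer; it assumes <stdint.h> provides uint16_t):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t one       = 0x0001;  /* 1 */
    uint16_t minus_two = 0xFFFE;  /* -2 as a two's complement bit pattern */
    uint16_t minus_one = 0xFFFF;  /* -1 as a two's complement bit pattern */

    printf("0x%04X\n", (uint16_t)(one + minus_two));  /* 0xFFFF, i.e. -1 */
    printf("0x%04X\n", (uint16_t)(one + minus_one));  /* 0x0000: 1 + -1 = 0 */
    return 0;
}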
Also to @Cornstalks' point: the internal representation (as well as overflow addition) is an implementation detail. C implementations (and instruction sets) need not represent integers in 2's complement, so providing hex values for signed integer types may tie you to a subset of C implementations.

fun3 will attempt to print the value 0x94949494. This is greater than the max 4-byte integer value of 0x7FFFFFFF, so it will "overflow" and (on virtually every computer made today) produce (if I did my arithmetic correctly) the negative number -0x6B6B6B6C, which is -1802201964.
fun1 and fun2 should print the "expected" positive results.
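To see the numbers concretely, here is a small sketch of what fun3 computes; it does the sum in unsigned arithmetic (which is well defined) and then converts it back, assuming a 32-bit int, where that out-of-range conversion typically produces the value mentioned above:

#include <stdio.h>

int main(void)
{
    unsigned int sum = 0x33333333u + 0x30303030u + 0x31313131u;  /* 0x94949494 */
    printf("as unsigned: %u (0x%X)\n", sum, sum);  /* 2492765332 (0x94949494) */
    printf("converted to int: %d\n", (int)sum);    /* typically -1802201964 */
    return 0;
}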

Related

Type conversion of objects involving unsigned type in C language

I have been reading K&R lately and came across this statement:
"Type conversions of Expressions involving unsigned type values are
complicated"
So to understand this I have written a very small program:
#include <stdio.h>
int main()
{
    signed char x = -1;
    printf("Signed : %d\n", x);
    printf("Unsigned : %d\n", (unsigned)x);
}
signed x bit form will be : 1000 0001 = -1
when converted to unsigned its value has to be 1000 0001 = 129.
But even after the type conversion it still prints -1.
Note: I'm using gcc compiler.
C11 6.3.1.3, paragraph 2:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Since −1 cannot be represented as an unsigned int value, the −1 is converted to UINT_MAX. Thus -1 becomes a very large positive integer.
Also, use %u for unsigned int; otherwise the result is undefined behaviour.
C semantics regarding conversion of integers is defined in terms of the mathematical values, not in terms of the bits. It is good to understand how values are represented as bits, but, when analyzing expressions, you should think about values.
In this statement:
printf("Signed : %d\n",x);
the signed char x is automatically promoted to int. Since x is −1, the new int also has the value −1, and printf prints “-1”.
In this statement:
printf("Unsigned : %d\n",(unsigned)x);
the signed char x is automatically promoted to int. The value is still −1. Then the cast converts this to unsigned. The rule for conversion to unsigned is that UINT_MAX+1 is added to or subtracted from the value as needed to bring it into the range of unsigned. In this case, adding UINT_MAX+1 to −1 once brings the value to UINT_MAX, which is within range. So the result of the conversion is UINT_MAX.
However, this unsigned value is then passed to printf to be printed with the %d conversion. That violates C 2018 7.21.6.1 9, which says the behavior is undefined if the types do not match.
This means a C implementation is allowed to do anything. In your case, it seems what happened is:
The unsigned value UINT_MAX is represented with all one bits.
The printf interpreted the all-one-bits value as an int.
Your implementation uses two’s complement for int types.
In two’s complement, an object with all one bits represents −1.
So printf printed “-1”.
If you had used this correct code instead:
printf("Unsigned : %u\n",(unsigned)x);
then printf would print the value of UINT_MAX, which is likely 4,294,967,295, so printf would print “4294967295”.
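Putting the whole corrected example together (a sketch; the exact number assumes a 32-bit unsigned int, so UINT_MAX is 4294967295):

#include <stdio.h>

int main(void)
{
    signed char x = -1;

    printf("Signed   : %d\n", x);            /* x promoted to int: prints -1 */
    printf("Unsigned : %u\n", (unsigned)x);  /* -1 converted to unsigned: */
                                             /* prints UINT_MAX, 4294967295 */
    return 0;
}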

Output of a short int and an unsigned short?

So I'm trying to interpret the following output:
short int v = -12345;
unsigned short uv = (unsigned short) v;
printf("v = %d, uv = %u\n", v, uv);
Output:
v = -12345, uv = 53191
So the question is: why is this exact output generated when this program is run on a two's complement machine?
What operations lead to this result when casting the value to unsigned short?
My answer assumes 16-bit two's complement arithmetic.
To find the value of -12345, take 12345, complement it, and add 1.
12345 is 0x3039 is 0011000000111001.
Complementing means changing all the 1's to 0's and all the 0's to 1's:
1100111111000110 is 0xcfc6 is 53190.
Add one: 53191.
So internally, -12345 is represented by 0xcfc7 = 53191.
But if you interpret it as an unsigned number, it's obviously just 53191. (And when you assign a signed value to an unsigned integer of the same size, what typically ends up happening is that you assign the exact bit pattern, without converting anything. Later, however, you will typically interpret that value differently, such as when you print it with %u.)
Another, perhaps easier way to think about this is that 16-bit arithmetic "wraps around" at 2^16 = 65536. So you can think of 65536 as being another name for 0 (just like 0:00 and 24:00 are both names for midnight). So -12345 is 65536 - 12345 = 53191.
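Both views give the same number; here is a quick sketch that checks the arithmetic (it assumes a 16-bit unsigned short and uses <stdint.h> for the fixed-width type):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t pos = 12345;                  /* 0x3039 */
    uint16_t neg = (uint16_t)(~pos + 1u);  /* complement, then add one */

    printf("0x%04X = %u\n", neg, neg);     /* 0xCFC7 = 53191 */
    printf("%u\n", 65536u - 12345u);       /* 53191 again: the wrap-around view */
    return 0;
}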
The conversion rule defined by the C standard for converting a signed integer to an unsigned integer is to repeatedly add TYPE_MAX + 1 to the value.
From 6.3.1.3 Signed and unsigned integers:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
If USHRT_MAX is 65535, then 65535 + 1 + -12345 is 53191.
The output seen does not depend on 2's complement nor on 16- vs 32-bit int. The output is entirely defined and would be the same on a rare 1's complement or sign-magnitude machine. The result does depend on unsigned short being 16-bit.
-12345 is within the minimum range of a short, so there are no issues with that assignment. When a short is passed as a ... argument, it goes through the usual promotion to int with no change in value, as all short values are in the range of int. "%d" expects an int, so the output is "-12345".
short int v = -12345;
printf("%d\n", v); // output "-12345\n"
Assigning a negative number to an unsigned type is well defined. With a 16-bit unsigned short, the value of uv is -12345 plus the minimum multiple of USHRT_MAX+1 (65536 in this case) needed to bring it into range, for a final value of 53191.
When an unsigned short is passed as a ... argument, the value is converted to int or unsigned, whichever type contains the entire range of unsigned short. In any case, the value does not change. "%u" matches an unsigned. It also matches an int whose value is expressible as either an int or unsigned.
short int v = -12345;
unsigned short uv = (unsigned short) v;
printf("%u\n", v); // output "53191\n"
What operations lead to this result when casting the value to unsigned short?
The casting did not affect the final outcome. The same result would have occurred without the cast. The cast may be useful to quiet warnings.
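As a small sketch of that last point (assuming a 16-bit unsigned short), the cast can be dropped without changing the result:

#include <stdio.h>

int main(void)
{
    short v = -12345;
    unsigned short uv_cast   = (unsigned short) v;  /* explicit cast */
    unsigned short uv_nocast = v;                   /* same implicit conversion */

    printf("%u %u\n", uv_cast, uv_nocast);          /* prints "53191 53191" */
    return 0;
}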

Why is 0 < -0x80000000?

I have below a simple program:
#include <stdio.h>

#define INT32_MIN (-0x80000000)

int main(void)
{
    long long bal = 0;
    if (bal < INT32_MIN)
    {
        printf("Failed!!!");
    }
    else
    {
        printf("Success!!!");
    }
    return 0;
}
The condition if(bal < INT32_MIN ) is always true. How is it possible?
It works fine if I change the macro to:
#define INT32_MIN (-2147483648L)
Can anyone point out the issue?
This is quite subtle.
Every integer literal in your program has a type. Which type it has is regulated by a table in 6.4.4.1:
Suffix    Decimal Constant    Octal or Hexadecimal Constant
none      int                 int
          long int            unsigned int
          long long int       long int
                              unsigned long int
                              long long int
                              unsigned long long int
If a literal number can't fit inside the default int type, it will attempt the next larger type as indicated in the above table. So for regular decimal integer literals it goes like:
Try int
If it can't fit, try long
If it can't fit, try long long.
Hex literals behave differently though! If the literal can't fit inside a signed type like int, it will first try unsigned int before moving on to trying larger types. See the difference in the above table.
So on a 32 bit system, your literal 0x80000000 is of type unsigned int.
This means that you can apply the unary - operator on the literal without invoking the undefined behavior you would otherwise get from overflowing a signed integer. Instead, you will get the value 0x80000000, a positive value.
bal < INT32_MIN invokes the usual arithmetic conversions and the operand 0x80000000 is converted from unsigned int to long long. The value 0x80000000 is preserved and 0 is less than 0x80000000, hence the result.
When you replace the literal with 2147483648L you use decimal notation and therefore the compiler doesn't pick unsigned int, but rather tries to fit it inside a long. Also the L suffix says that you want a long if possible. The L suffix actually has similar rules if you continue to read the mentioned table in 6.4.4.1: if the number doesn't fit inside the requested long, which it doesn't in the 32 bit case, the compiler will give you a long long where it will fit just fine.
0x80000000 is an unsigned literal with value 2147483648.
Applying the unary minus on this still gives you an unsigned type with a non-zero value. (In fact, for a non-zero value x, the value you end up with is UINT_MAX - x + 1.)
This integer literal 0x80000000 has type unsigned int.
According to the C Standard (6.4.4.1 Integer constants)
5 The type of an integer constant is the first of the corresponding
list in which its value can be represented.
And this integer constant can be represented by the type of unsigned int.
So the expression -0x80000000 has the same unsigned int type. Moreover, it has the same value 0x80000000, which in the two's complement representation is calculated the following way:
-0x80000000 = ~0x80000000 + 1 => 0x7FFFFFFF + 1 => 0x80000000
This has a side effect if you write, for example:
int x = INT_MIN;
x = abs( x );
The result will again be INT_MIN.
Thus in this condition
bal < INT32_MIN
0 is compared with the unsigned value 0x80000000 converted to type long long int according to the rules of the usual arithmetic conversions.
It is evident that 0 is less than 0x80000000.
The numeric constant 0x80000000 is of type unsigned int. If we take -0x80000000 and do 2's complement math on it, we get this:
~0x80000000 = 0x7FFFFFFF
0x7FFFFFFF + 1 = 0x80000000
So -0x80000000 == 0x80000000. And comparing (0 < 0x80000000) (since 0x80000000 is unsigned) is true.
A point of confusion arises from thinking the - is part of the numeric constant.
In the code below, 0x80000000 is the numeric constant. Its type is determined from that alone. The - is applied afterward and does not change the type.
#define INT32_MIN (-0x80000000)
long long bal = 0;
if (bal < INT32_MIN )
Raw unadorned numeric constants are positive.
If it is decimal, then the type assigned is the first type that will hold it: int, long, long long.
If the constant is octal or hexadecimal, it gets the first type that holds it: int, unsigned, long, unsigned long, long long, unsigned long long.
0x80000000, on the OP's system, gets the type unsigned or unsigned long. Either way, it is some unsigned type.
-0x80000000 is also some non-zero value and, being of some unsigned type, it is greater than 0. When code compares that to a long long, the values on the two sides of the comparison are not changed, so 0 < INT32_MIN is true.
An alternate definition avoids this curious behavior
#define INT32_MIN (-2147483647 - 1)
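Here is a sketch contrasting the two macro definitions. The names INT32_MIN_BROKEN and INT32_MIN_FIXED are only illustrative; the behaviour described assumes a 32-bit int, so the hex constant ends up unsigned:

#include <stdio.h>

#define INT32_MIN_BROKEN (-0x80000000)      /* unsigned int when int is 32-bit */
#define INT32_MIN_FIXED  (-2147483647 - 1)  /* plain int with value -2147483648 */

int main(void)
{
    long long bal = 0;

    /* 0x80000000 stays a large positive value, so this compares true. */
    printf("%s\n", bal < INT32_MIN_BROKEN ? "Failed!!!" : "Success!!!");
    /* Here the right-hand side really is negative, so this compares false. */
    printf("%s\n", bal < INT32_MIN_FIXED ? "Failed!!!" : "Success!!!");
    return 0;
}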
Let us walk in fantasy land for a while where int and unsigned are 48-bit.
Then 0x80000000 fits in int and so has type int. -0x80000000 is then a negative number and the result of the printout is different.
[Back to the real world]
Since 0x80000000 fits in some unsigned type before a signed type (it is just larger than some_signed_MAX yet within some_unsigned_MAX), it is of some unsigned type.
C has a rule that an integer literal may be signed or unsigned, depending on whether it fits in a signed or unsigned type. On a 32-bit machine the literal 0x80000000 will be unsigned. The 2's complement of -0x80000000 is 0x80000000 on a 32-bit machine. Therefore, the comparison bal < INT32_MIN is between signed and unsigned, and before the comparison, as per the C rule, the unsigned int will be converted to long long.
C11: 6.3.1.8/1:
[...] Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Therefore, bal < INT32_MIN is always true.

Initializing unsigned short int to signed value

#include <stdio.h>
int main()
{
    unsigned short a = -1;
    printf("%d", a);
    return 0;
}
This gives me the output 65535. Why?
When I make a more negative, the output is 65536 + a, e.g. (2^16 - 1 =) 65535 for a = -1, 65534 for a = -2, and so on.
I know the range of unsigned short int is 0 to 65535.
But why does it wrap around within the range 0 to 65535? What is going on inside?
#include <stdio.h>
int main()
{
    unsigned int a = -1;
    printf("%d", a);
    return 0;
}
Output is -1.
%d is used for a signed decimal integer, so why isn't it printing the largest value of the (unsigned int) range here?
Why is the output in this part -1?
I know %u is used for printing an unsigned decimal integer.
Why is the behaviour undefined in the second program and not in the first?
I compiled this with gcc. It's C code.
On my machine the size of short int is 2 bytes and the size of int is 4 bytes.
In your implementation, short is 16 bits and int is 32 bits.
unsigned short a=-1;
printf("%d",a);
First, -1 is converted to unsigned short. This results in the value 65535. For the precise definition see the standard's "integer conversions". To summarize: the value is taken modulo USHRT_MAX+1.
This value 65535 is assigned to a.
Then for the printf, which uses varargs, the value is promoted back to int. varargs never pass integer types smaller than int, they're always converted to int. This results in the value 65535, which is printed.
unsigned int a=-1;
printf("%d",a);
First line, same as before but modulo UINT_MAX+1. a is 4294967295.
For the printf, a is passed as an unsigned int. Since %d requires an int, the behavior is undefined by the C standard. But your implementation appears to have reinterpreted the unsigned value 4294967295, which has all bits set, as a signed integer with all bits set, i.e. the two's-complement value -1. This behavior is common but not guaranteed.
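A compact sketch of both cases from the question (it assumes a 16-bit short and a 32-bit int; the last line deliberately repeats the question's mismatch to show the undefined, but commonly observed, result):

#include <stdio.h>

int main(void)
{
    unsigned short a = -1;   /* converted modulo USHRT_MAX+1: a is 65535 */
    unsigned int   b = -1;   /* converted modulo UINT_MAX+1: b is 4294967295 */

    printf("%d\n", a);   /* a promoted to int, value 65535 fits: prints 65535 */
    printf("%u\n", b);   /* matching conversion specifier: prints 4294967295 */
    printf("%d\n", b);   /* undefined behaviour; commonly prints -1 */
    return 0;
}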
Variable assignment writes into the amount of memory occupied by the variable's type (typically short is 2 bytes and int is 4 bytes on 32-bit hardware). The sign of the variable does not matter for the assignment; what matters is how you later access it. When you assign to a short (signed or unsigned), you store the value in 2 bytes of memory. If you then use '%d' in printf, printf treats the argument as an integer (4 bytes on your hardware), so the two MSBs are 0 and you get [0|0] (two MSBs) [-1] (two LSBs). Because of the new MSBs (introduced by the %d in printf), your sign bit is hidden in the LSBs, printf considers the value unsigned (since the MSBs are 0), and you see a positive value. To get a negative number in the first case you need to use '%hd'. In the second case you assigned to 4 bytes of memory, the MSB got its SIGN bit set to '1' (meaning negative) during the assignment, and hence you see the negative number with '%d' in printf. Hope that explains it; for more clarification please comment on the answer.
NB: I used 'MSB' as shorthand for the higher-order byte(s). Please read it according to context (e.g., next to 'SIGN bit' it reads as 'most significant bit'). Thanks.

Overflow in C code

Can anyone explain to me why this code prints "error"? This only happens for the minimal value of an integer.
#include <stdio.h>

int abs(int x) {
    int result = 0;
    if (x < 0)
        result = -1 * x;
    else
        result = x;
    return result;
}

int main() {
    printf("Testing abs... ");
    if (abs(-2147483648) != 2147483648)
        printf("error\n");
    else
        printf("success\n");
}
Because for a 32-bit signed integer, using two's complement, the largest number you can store is 2147483647.
The range is -2147483648 to 2147483647.
You must be careful - overflowing signed numbers is undefined behavior.
The maximal value of a 32 bit integer is 2,147,483,647.
Put long instead of int. For bigger integers you will need long long. Google for the ranges that these types offer. Also, for the comparison with a literal number you must write it with a suffix, e.g. 8438328L.
Because of the way integers are represented (2's complement), if your int is 32 bits, -2147483648 is its own negative.
After -2147483648 is returned by your abs(), it is probably being compared as a long, 64-bit integer. If the comparison were 32-bit, 2147483648 would be equivalent to -2147483648. Perhaps if you turn on all warnings for your compiler, it will complain?
The range of 32-bit signed integers is, as has been mentioned before, -2147483648 (= -2^31) to 2147483647 (= 2^31 - 1). In your abs() function you thus have overflow of a signed integer, which is undefined behaviour (citation of standard to be inserted). Therefore anything could happen, but what actually happens is very probably that the result just wraps around, producing -2147483648 again. However, you compare that to the integer literal 2147483648, which does not fit into a 32-bit signed integer; hence, since it has no (un)signedness suffix, that literal has the next type in the list
int
long int
long long int
which can represent its value (if any). On 64-bit systems that could be long int or long long int; the former is typically the case on Linux, the latter, as far as I know, on Windows. On 32-bit systems it's almost certainly long long int.
Then the int value -2147483648 is promoted to long (long) int and the tested condition is
if (-2147483648L != 2147483648L) // or LL
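For completeness, one way to side-step the overflow is to widen before negating. This is just a sketch, not the original poster's code; it assumes long long is wider than int, which holds on common platforms:

#include <stdio.h>
#include <limits.h>

/* Return |x| as a long long, so that -INT_MIN is representable. */
long long abs_wide(int x)
{
    long long wide = x;              /* widening conversion, never overflows */
    return wide < 0 ? -wide : wide;
}

int main(void)
{
    printf("Testing abs... ");
    if (abs_wide(INT_MIN) != 2147483648LL)   /* assumes 32-bit int */
        printf("error\n");
    else
        printf("success\n");
    return 0;
}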
