Overflow in C code

Can anyone explain to me why this code prints "error"? This only happens for the minimum value of an integer.
#include <stdio.h>

int abs(int x) {
    int result = 0;
    if (x < 0)
        result = -1 * x;
    else
        result = x;
    return result;
}

int main() {
    printf("Testing abs... ");
    if (abs(-2147483648) != 2147483648)
        printf("error\n");
    else
        printf("success\n");
}

Because for a 32-bit signed integer, using two's complement, the largest number you can store is 2147483647.
The range is -2147483648 to 2147483647.
You must be careful: overflowing a signed number is undefined behavior.

The maximum value of a 32-bit signed integer is 2,147,483,647.

Use long instead of int. For bigger integers you will need long long. Look up the ranges these types offer. Also, for the comparison with a constant you must give it a matching suffix, e.g. 8438328L.

Because of the way integers are represented (2's complement), if your int is 32 bits, -2147483648 is its own negative.
After -2147483648 is returned by your abs(), it is probably being compared as a long, 64-bit integer. If the comparison were 32-bit, 2147483648 would be equivalent to -2147483648. Perhaps if you turn on all warnings for your compiler, it will complain?

The range of 32-bit signed integers is, as has been mentioned before, -2147483648 (= -2^31) to 2147483647 (= 2^31 - 1). In your abs() function you thus have an overflow of a signed integer, which is undefined behaviour. Therefore anything could happen, but what actually happens is very probably that the result just wraps around, producing -2147483648 again. However, you compare that to the integer literal 2147483648, which does not fit into a 32-bit signed integer; since it has no unsignedness suffix, that literal gets the first type in the list
int
long int
long long int
which can represent its value (if any). On 64-bit systems that could be long int or long long int; the former is typically the case on Linux, the latter, as far as I know, on Windows. On 32-bit systems it's almost certainly long long int.
Then the int value -2147483648 is promoted to long (long) int and the tested condition is
if (-2147483648L != 2147483648L) // or LL

Related

how can something be bigger than (unsigned long long) > LONG_MAX?

I found this code in an algorithm I need to update:
if (value > (unsigned long long) LONG_MAX)
EDIT: value is the result of a division of two uint64_t numbers.
I understand that (unsigned long long) LONG_MAX is a VERY big number:
#include "stdio.h"
#include "limits.h"
int main() {
    unsigned long long ull = (unsigned long long) LONG_MAX;
    printf("%lu", ull);
    return 0;
}
prints 9223372036854775807
So what am I comparing here? In what case will this if statement evaluate to true?
A float or double can be larger than that. Annex E, paragraph 5, of the C standard states that either type must be able to hold a value at least as large as 1E37, which is a larger value than LONG_MAX, which must be at least 2147483647:
The values given in the following list shall be replaced by
implementation-defined constant expressions with values that are
greater than or equal to those shown:
#define DBL_MAX 1E+37
#define FLT_MAX 1E+37
#define LDBL_MAX 1E+37
So if value is either of those types, it could evaluate to true.
EDIT:
Since value is a uint64_t, whose max value is 18446744073709551615, this can also be larger than LONG_MAX.
how can something be bigger than (unsigned long long) > LONG_MAX?
Easily. LONG_MAX is the maximum value that can be represented as a long int. Converting that to unsigned long long does not change its value, only its data type. The maximum value that can be represented as an unsigned long int is larger on every C implementation you're likely to meet. The maximum value of long long int is larger on some, and the maximum value of unsigned long long int is, again, larger on every C implementation you're likely to meet (much larger on some).
However, this ...
unsigned long long ull = (unsigned long long) LONG_MAX;
printf("%lu",ull);
... is not a conforming way to investigate the value in question because %lu is a formatting directive for type unsigned long, not unsigned long long. The printf call presented therefore has undefined behavior.
I found this code in an algorithm I need to update:
if (value > (unsigned long long) LONG_MAX)
EDIT: value is the result of a division of two uint64_t numbers.
[...]
In what case this if statement will
evaluate to true?
Supposing that value has type uint64_t, which is probably the same as unsigned long long in your implementation, the condition in that if statement will evaluate to true at least when the most-significant bit of value is set. If your long int is only 32 bits wide, however, then the condition will evaluate to true much more widely than that, because there are many 64-bit integers that are larger than the largest value representable as a signed 32-bit integer.
I would be inclined to guess that the code was indeed written under the assumption that long int is 32 bits wide, so that the if statement asks a very natural question: "can the result of the previous uint64_t division be represented as a long?" In fact, that's what the if statement is evaluating in any case, but it makes more sense if long is only 32 bits wide, which is typical of 32-bit computers and standard on both 32- and 64-bit Windows.
LONG_MAX is the maximum value for a signed long. By the standard it must be >= +2147483647 (it fits in a 32-bit signed integer).
There's also ULLONG_MAX for unsigned long long, which is most often 18446744073709551615. The standard mandates that it be at least 18446744073709551615.

Variable conversion mixup in C

I have the following C code and I have to understand why the result is a fairly big positive number:
int k;
unsigned int l;
float f;
f=4; l = 1; k = -2;
printf("\n %f", (unsigned int)l+k+f);
The result is a very large number (around 4 billion, max for 32 bit integers), so I suspect it has something to do with the representation of signed negative integers (two's complement) that looks fairly big if we look at it as unsigned.
However I don't quite understand why it behaves this way and what the float has to do with the behavior (if I remove it it stops doing it). Could someone explain how does it do that ? What is the process that happens when adding the numbers that leads to this result ?
The problem is that when you add a signed int to an unsigned int, C converts the signed operand to unsigned, even when it is negative. Since k is negative, it gets re-interpreted as a large positive number before the addition. After that f is added, but it is small in comparison to -2 re-interpreted as a large positive number.
Here is a short illustration of the problem:
int k = -2;
unsigned int l = 1;
printf("\n %u", l+k);
This prints 4294967295 on a 32-bit system, because -2 in two's complement representation is 0xFFFFFFFE.
The answer is that when the compiler evaluates l+k it converts k to unsigned int, which turns it into that big number. If you change the order to l+f+k, this behavior will not occur.
What happens is that a negative integer is converted to an unsigned integer by repeatedly adding the maximum unsigned value plus one to the signed value until it is representable. This is why you see such a large value.
With two's complement, it's the same as truncating or sign extending the value and interpreting it as unsigned which is simple for a computer to do.

Adding a signed integer beyond 0xFFFFFFFF

#include <stdio.h>

void fun3(int a, int b, int c)
{
    printf("%d \n", a + b + c);
}

void fun2(int x, int y)
{
    fun3(0x33333333, 0x30303030, 0x31313131);
    printf("%d \n", x + y);
}

void fun1(int x)
{
    fun2(0x22222222, 0x20202020);
    printf("%d \n", x);
}

int main()
{
    fun1(0x1111111);
}
I'm going through the above program looking at stack corruption. I am getting output with some undesired values. All I could understand is that if the added value goes beyond 0xFFFFFFFF then a small negative integer becomes the largest value, say -1 becomes 0xFFFFFFFF. Any insights on this?
EDIT (Corrections) (I missed the point. My answer is right for constants, but the question contains parameters of functions, so what happens here is overflow of signed integer objects and, as correctly pointed out by @Cornstalks in his comment, this is undefined behaviour.)
/EDIT
In fun1() you are using printf() in a wrong way.
You wrote "%d" to accept an int, but this is not true if your number is greater than INT_MAX.
You have to check the value of INT_MAX on your system.
If you write an integer constant in hexadecimal format, standard C (ISO C99 or C11) puts the value in the first type in which the constant fits, following this order:
int, unsigned int, long int, unsigned long int, long long int,
unsigned long long int.
Thus, if you have a constant greater than INT_MAX (the maximum value in the range of int), your constant (if positive) has type unsigned int, but the directive %d expects a signed int value. Thus, some negative number will be shown.
Worse, if your constant is a value greater than UINT_MAX (the maximum value in the range of unsigned int), then the type of the constant will be the first of long int, unsigned long int, long long int with precision strictly greater than that of unsigned int.
This implies that %d becomes a wrong directive.
If you cannot be completely sure about how big will be your values, you could do a cast to the biggest integer type:
printf("%lld", (long long int) 0x33333333333);
The directive %lld stands for long long int.
If you are interested always in positive values, you have to use %llu and cast to unsigned long long int:
printf("%llu", (unsigned long long int) 0x33333333333);
In this way you avoid any "funny" numbers, and you can show big numbers without losing any precision.
Remark: The constants INT_MAX, UINT_MAX, and the like, are in limits.h.
Important: The automatic sequence of casts is only valid for octal and hexadecimal constants. For decimal constants there is another rule:
int, long int, long long int.
To @Cornstalks' point: INT_MIN is 0x80000000, and (int)-1 is 0xFFFFFFFF in 2's complement (on a 32-bit system, anyway).
This allows the instruction set to do things in signed arithmetic like:
1 + -2 = -1
becomes (as signed shorts, for brevity)
0x0001 + 0xFFFE = 0xFFFF
... then:
1 + -1 = 0
is represented internally with overflow as
0x0001 + 0xFFFF = 0x0000
Also to @Cornstalks' point: the internal representation (as well as overflow addition) is an implementation detail. C implementations (and instruction sets) need not represent integers in 2's complement, so providing hex values for signed integer types may tie you to a subset of C implementations.
fun3 will attempt to print the value 0x94949494. This is greater than the max 4-byte integer value of 0x7FFFFFFF, so it will "overflow" and (on virtually every computer made today) produce (if I did my arithmetic correctly) the negative number -0x6B6B6B6C, which is -1802201964.
fun1 and fun2 should print the "expected" positive results.

sscanf with hexadecimal negative value

I need to convert hexadecimal 4-digit values to decimal, so I used sscanf, but it does not work on negative numbers...
For example,
int test;
sscanf("0xfff6","%x",&test);
returns 65526 instead of -10.
How can I resolve that ?
You have to do it manually. The %x conversion specifier performs a conversion to unsigned int and requires an argument of type pointer to unsigned int. For example, with a cast:
#include <stdint.h>  /* for int16_t */

unsigned int test;
int result;
sscanf("0xfff6", "%x", &test);
result = (int16_t) test;
The fact that the value of test is greater than 32767 indicates it is being stored in a wider (32-bit) integer, so the sign bit of the 16-bit value isn't seen. You would have to read the value into an unsigned short integer, then cast it to a signed short, in order to see the negative value.
I think the core issue is that you are assuming that an int is 16 bits, whereas in your system it appears to be larger than that.
int is always signed, so the fact that it reports the result as 65526 proves that INT_MAX is greater than the 16-bit maximum of 32767.
Try changing your code from int test to short test and I suspect it will work.
It's probably because your integers are more than 16 bits wide so that fff6 is indeed positive - you may need fffffff6 or even wider to properly represent a negative number.
To fix this, simply place the following after the sscanf:
if (test > 32767) test -= 65536;
This adjusts values with the top bit set (in 16-bit terms) to be negative.

C: Casting minimum 32-bit integer (-2147483648) to float gives positive number (2147483648.0)

I was working on an embedded project when I ran into something which I thought was strange behaviour. I managed to reproduce it on codepad (see below) to confirm, but don't have any other C compilers on my machine to try it on them.
Scenario: I have a #define for the most negative value a 32-bit integer can hold, and then I try to use this to compare with a floating point value as shown below:
#include <stdio.h>

#define INT32_MIN (-2147483648L)

int main()
{
    float myNumber = 0.0f;
    if (myNumber > INT32_MIN)
    {
        printf("Everything is OK");
    }
    else
    {
        printf("The universe is broken!!");
    }
}
Codepad link: http://codepad.org/cBneMZL5
To me it looks as though this code should work fine, but to my surprise it prints out The universe is broken!!.
This code implicitly casts the INT32_MIN to a float, but it turns out that this results in a floating point value of 2147483648.0 (positive!), even though the floating point type is perfectly capable of representing -2147483648.0.
Does anyone have any insights into the cause of this behaviour?
CODE SOLUTION: As Steve Jessop mentioned in his answer, limits.h and stdint.h contain correct (working) int range defines already, so I'm now using these instead of my own #define
PROBLEM/SOLUTION EXPLANATION SUMMARY: Given the answers and discussions, I think this is a good summary of what's going on (note: still read the answers/comments because they provide a more detailed explanation):
I'm using a C89 compiler with 32-bit longs, so any value greater than LONG_MAX and less than or equal to ULONG_MAX, followed by the L suffix, has type unsigned long.
(-2147483648L) is actually a unary - on an unsigned long (see previous point) value: -(2147483648L). This negation operation 'wraps' the value around to be the unsigned long value of 2147483648 (because 32-bit unsigned longs have the range 0 - 4294967295).
This unsigned long number looks like the expected negative int value when printed as an int or passed to a function, because it is first cast to an int, which wraps the out-of-range 2147483648 around to -2147483648 (because 32-bit ints have the range -2147483648 to 2147483647).
The cast to float, however, is using the actual unsigned long value 2147483648 for conversion, resulting in the floating-point value of 2147483648.0.
Replace
#define INT32_MIN (-2147483648L)
with
#define INT32_MIN (-2147483647 - 1)
-2147483648 is interpreted by the compiler to be the negation of 2147483648, which causes overflow on an int. So you should write (-2147483647 - 1) instead.
This is all C89 standard though. See Steve Jessop's answer for C99.
Also, long is typically 32 bits on 32-bit machines and 64 bits on 64-bit machines; using int here gets the job done.
In C89 with a 32 bit long, 2147483648L has type unsigned long int (see 3.1.3.2 Integer constants). So once modulo arithmetic has been applied to the unary minus operation, INT32_MIN is the positive value 2147483648 with type unsigned long.
In C99, 2147483648L has type long if long is bigger than 32 bits, or long long otherwise (see 6.4.4.1 Integer constants). So there is no problem and INT32_MIN is the negative value -2147483648 with type long or long long.
Similarly in C89 with long larger than 32 bits, 2147483648L has type long and INT32_MIN is negative.
I guess you're using a C89 compiler with a 32 bit long.
One way to look at it is that C99 fixes a "mistake" in C89. In C99 a decimal literal with no U suffix always has signed type, whereas in C89 it may be signed or unsigned depending on its value.
What you should probably do, btw, is include limits.h and use INT_MIN for the minimum value of an int, and LONG_MIN for the minimum value of a long. They have the correct value and the expected type (INT_MIN is an int, LONG_MIN is a long). If you need an exact 32 bit type then (assuming your implementation is 2's complement):
for code that doesn't have to be portable, you could use whichever type you prefer that's the correct size, and assert it to be on the safe side.
for code that has to be portable, search for a version of the C99 header stdint.h that works on your C89 compiler, and use int32_t and INT32_MIN from that.
if all else fails, write stdint.h yourself, and use the expression in WiSaGaN's answer. It has type int if int is at least 32 bits, otherwise long.
