Overflows and underflows in C

Can you explain how overflow and underflow work for signed char and unsigned char?
#include <stdio.h>

int main(void)
{
    signed char c;
    scanf("%d", &c);
    printf("%d\n", c);
    printf("%c\n", c);
    return 0;
}
In this case, if I enter 200 via scanf, the value overflows, and the first printf shows that. Yet the second printf still gives me the ASCII symbol for 200...
Why?

scanf's %d expects a pointer to int, so passing it anything else is undefined behavior.
You should do this:
int d;
scanf("%d", &d);
whatevertype c = (whatevertype)d;
However, signed integer overflow is undefined behavior. If you instead use an unsigned type, like
unsigned char c = (unsigned char)d;
then c is guaranteed to be d modulo 2 raised to the number of bits in an unsigned char (i.e. modulo UCHAR_MAX + 1).
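For example, here is a minimal sketch of that approach, assuming an 8-bit unsigned char (the usual case):

#include <stdio.h>

int main(void)
{
    int d;
    if (scanf("%d", &d) != 1)            /* %d matches the int argument here */
        return 1;
    unsigned char c = (unsigned char)d;  /* well-defined: d reduced modulo UCHAR_MAX + 1 */
    printf("%d\n", c);                   /* an input of 200 prints 200 when unsigned char is 8-bit */
    printf("%c\n", c);
    return 0;
}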

Every compiler targets a specific machine and specific hardware.
Say, for example, that our machine's signed integers are 16 bits wide. That means INT_MAX is the hex value 0x7fff, which is 32767 in decimal, and INT_MIN is the hex value 0x8000, which is -32768 in decimal.
Many processors have an ALU control register that defines how signed integers behave on overflow. This register generally has a saturation flag.
Overflow Example:
If the saturation flag is set, then whenever the result of a signed integer ALU operation exceeds INT_MAX, the result is clamped to INT_MAX.
For example, if the last operation added 0x7ffe and 0x2, the result would be 0x7fff.
If the saturation flag is not set, then a result larger than INT_MAX will most likely be truncated to the lower 16 bits of the mathematically correct result. In our case 0x7ffe + 0x2 = 0x8000, which is the minimum integer.
For unsigned integers, the compiler guarantees that the result follows the C definition of unsigned addition, i.e. it wraps modulo 2^N.
Underflow example:
Every machine defines a smallest representable floating-point value. Again, if the saturation flag is set, a result smaller than that minimum is rounded to it; otherwise the result depends on how the processor carries out the operation. (Search the internet for the terms mantissa and exponent if you want to learn more about floating-point representation and operations.)
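As an illustration of the guaranteed unsigned wraparound mentioned above, here is a small sketch, assuming a 16-bit unsigned short (typical, though not required by the standard):

#include <stdio.h>

int main(void)
{
    unsigned short a = 0xFFFE;                   /* near the top of a 16-bit range */
    unsigned short b = (unsigned short)(a + 3);  /* the int sum 0x10001 is reduced modulo 0x10000 */
    printf("%u\n", (unsigned)b);                 /* prints 1, i.e. (0xFFFE + 3) mod 0x10000 */
    return 0;
}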

The reason you get an overflow at the first printf is that you declared c as a signed char. A char occupies one byte of memory, so an unsigned char ranges from 0 to 255, while a signed char splits that byte between positive and negative values, giving 0 to 127 and -1 to -128. Printing it with the integer format %d therefore shows a value in the signed range, so an input of 200 cannot be represented as-is.
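A short sketch that prints the actual ranges on your implementation, using limits.h (assuming the usual 8-bit char):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("bits per char:  %d\n", CHAR_BIT);                   /* normally 8 */
    printf("signed char:    %d to %d\n", SCHAR_MIN, SCHAR_MAX); /* typically -128 to 127 */
    printf("unsigned char:  0 to %u\n", (unsigned)UCHAR_MAX);   /* typically 0 to 255 */
    return 0;
}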

Related

How does conversion to hexadecimal with %x work?

I am learning C and I have a question about this conversion.
short int x = -0x52ea;
printf ( "%x", x );
output:
ffffad16
I would like to know how this conversion works, because it's supposed to be on a test and we won't be able to use a compiler. Thank you.
I would like to know how this conversion works
It is undefined behavior (UB)
short int x = -0x52ea;
0x52ea is a hexadecimal constant. It has the value 52EA in hex, or 21,226 in decimal. It has type int, as it fits in an int even if int were only 16 bits. OP's int is evidently 32-bit.
- negates the value to -21,226.
The value is assigned to a short int which can encode -21,226, so no special issues with assigning this int to a short int.
printf("%x", x );
short int x is passed to a variadic (...) function, so it goes through the default argument promotions and becomes an int. So an int with the value -21,226 is passed.
"%x" used with printf(), expects an unsigned argument. Since the type passed is not an unsigned (and not an int with a non-negative value - See exception C11dr ยง6.5.2.2 6), the result is undefined behavior (UB). Apparently the UB on your machine was to print the hex pattern of a 32-bit 2's complement of -21,226 or FFFFAD16.
If the exam result is anything but UB, just smile and nod and realize the curriculum needs updating.
The point here is that when a number is negative, it's structured in a completely different way.
In 16 bits, 1 is 0001 in hexadecimal and -1 is ffff. The most significant bit (8000) indicates a negative number (assuming a signed integer), which is why the range only goes as high as 32767 (7fff) in the positive direction and as low as -32768 (8000) in the negative direction.
To negate a number, you invert all the bits and add 1: 0001 inverted is fffe, and adding 1 gives ffff.
This convention is called two's complement, and it is used because arithmetic on such numbers is straightforward to implement with bitwise operations.
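If the goal is to print just the 16 bits of the short without the sign extension, one sketch (assuming a 16-bit short, as in the question) is to convert to unsigned short first or use the h length modifier:

#include <stdio.h>

int main(void)
{
    short int x = -0x52ea;
    printf("%x\n", (unsigned short)x);  /* ad16: only the low 16 bits, no sign extension */
    printf("%hx\n", x);                 /* same result using the h length modifier */
    return 0;
}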

Can we assign a negative integer to an unsigned integer?

#include <stdio.h>
#include <conio.h>

int main()
{
    int i = -5;
    unsigned int j = i;
    printf("%d", j);
    getch();
}

Output:
-5
#include <stdio.h>
#include <conio.h>

int main()
{
    int i = -5;
    unsigned int j = i;
    printf("%u", j);
    getch();
}

Output:
4255644633
Here I am not getting any compilation error.
The program prints -5 with the %d specifier, and when printing with %u it prints some garbage value.
The things I want to know are:
1) Why does the compiler not complain when a negative integer is assigned to an unsigned int?
2) How is the signed value converted to unsigned?
Who are "we?"
There's no "garbage value", it's probably just the result of viewing the bits of the signed integer as an unsigned. Typically two's complement will result in very large values for many a negative values. Try printing the value in hex to see the pattern more clearly, in decimal they're often hard to decipher.
I'd simply add that the concept of signed or unsigned is something that humans appreciate more than machines.
Assuming a 32-bit machine, your value of -5 is going to be represented internally by the 32-bit value 0xFFFFFFFB (two's complement).
When you write printf("%d",j); in your source code, the compiler couldn't care less whether j is signed or unsigned; it just pushes 0xFFFFFFFB onto the stack followed by a pointer to the "%d" string. When printf is called, it looks at the format string, sees the %d, and from that knows it has to interpret 0xFFFFFFFB as a signed value, which is why it displays -5 despite j being an unsigned int.
On the other hand, when you write printf("%u",j);, the "%u" makes printf interpret your 0xFFFFFFFB as an unsigned value. That value is 2^32 - 5, or 4294967291.
It's the format string passed to printf that determines how the value will be interpreted, not the type of the variable j.
There's nothing unusual about assigning a negative value to an unsigned variable. The implicit conversion that happens in such cases is perfectly well defined by the C language: the value is brought into the range of the target unsigned type in accordance with the rules of modulo arithmetic, where the modulus is 2^N and N is the number of value bits in the unsigned recipient. This is how it has always been in C.
Printing an unsigned int value with the %d specifier makes no sense. That specifier requires a signed int argument. Because of this mismatch, the behavior of your first code is undefined.
In other words, you got it completely backwards with regard to which value is garbage and which is not.
Your first code is essentially "printing a garbage value" due to undefined behavior. The fact that it happens to match your original value of -5 is just a specific manifestation of that undefined behavior.
Meanwhile, the second code is supposed to print a well-defined, proper value: the result of converting -5 to unsigned int modulo UINT_MAX + 1. In your case that modulus is probably 2^32 = 4294967296, which is why you should see 4294967296 - 5 = 4294967291.
How you managed to get 4255644633 is not clear; that value apparently came from different code than what you posted.
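The conversion rule can be checked directly. A minimal sketch using limits.h (the exact decimal value depends on UINT_MAX on your platform):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned int j = -5;                /* converted modulo UINT_MAX + 1 */
    printf("%u\n", j);                  /* 4294967291 when UINT_MAX is 4294967295 */
    printf("%d\n", j == UINT_MAX - 4);  /* prints 1 on any conforming implementation */
    return 0;
}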
You can and you should get a warning (or perhaps failure) depending on the compiler and the settings.
The value you get is due to two's complement.
The output in the second case is not a garbage value.
int i = -5;
When converted to binary form, the most significant bit is set to 1 because -5 is a negative number. But when you use %u, that binary form is treated as an ordinary unsigned number, and the 1 in the MSB is treated as part of the number's value.

Difference between '(unsigned)1' and '(unsigned)~0'

What is the difference between (unsigned)~0 and (unsigned)1? Why does (unsigned)~0 print as -1 while (unsigned)1 prints as 1? Does it have something to do with the way unsigned numbers are stored in memory? Why does an unsigned number give a signed result? It didn't give any overflow error either. I am using the GCC compiler:
#include <stdio.h>

int main()
{
    unsigned int x = (unsigned)~0;
    unsigned int y = (unsigned)1;
    printf("%d\n", x); // prints -1
    printf("%d\n", y); // prints 1
}
Because %d is the signed int specifier. Use %u instead, which prints 4294967295 on my machine.
As others mentioned, if you interpret the highest unsigned value as signed, you get -1; see the Wikipedia entry for two's complement.
Your system uses the two's-complement representation of negative numbers. In this representation, a binary number composed of all ones represents -1.
Since inverting all the bits of zero gives you a number composed of all ones, you get -1 when you reinterpret that number as signed by printing it with %d, which expects a signed number, not an unsigned one.
First, in your use of printf you are telling it to print the number as signed ("%d") instead of unsigned ("%u").
Second, you are right in that it has "something to do with the way numbers are stored in memory". An int (signed or unsigned) is not a single bit on your computer, but a collection of k bits. The exact value of k depends on the specifics of your computer architecture, but most likely you have k=32.
For the sake of succinctness, let's assume your ints are 8 bits long, so k=8 (this is almost certainly not the case unless you are working on a very limited embedded system). In that case (int)0 is actually 00000000, and (int)~0 (which flips all the bits) is 11111111.
Finally, in two's complement (which is the most common binary representation of signed numbers), 11111111 is actually -1. See http://en.wikipedia.org/wiki/Two's_complement for a description of two's complement.
If you change your printf to use "%u", it will print a positive integer representing 2^k - 1, where k is the number of bits in an integer (so it will probably print 4294967295).
printf() only knows what type of variable you passed it by what format specifiers you used in your format string. So what's happening here is that you're printing x and y as signed integers, because you used %d in your format string. Try %u instead, and you'll get a result more in line with what you're probably expecting.
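Putting the advice together, a corrected sketch of the snippet (prints UINT_MAX and 1 rather than -1; the 4294967295 assumes a 32-bit unsigned int):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned int x = ~0u;            /* all value bits set */
    unsigned int y = 1u;
    printf("%u\n", x);               /* 4294967295 with a 32-bit unsigned int, i.e. UINT_MAX */
    printf("%u\n", y);               /* 1 */
    printf("%d\n", x == UINT_MAX);   /* prints 1 */
    return 0;
}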

Initializing an unsigned short int to a signed value

#include <stdio.h>

int main()
{
    unsigned short a = -1;
    printf("%d", a);
    return 0;
}
This is giving me the output 65535. Why?
When I make a more negative, the output is 65536 minus the magnitude (starting from 2^16 - 1 = 65535 for a = -1).
I know the range of unsigned short int is 0 to 65535.
But why does the value wrap around within the range 0 to 65535? What is going on inside?
#include <stdio.h>

int main()
{
    unsigned int a = -1;
    printf("%d", a);
    return 0;
}
Output is -1.
%d is used for signed decimal integers, so why does it not print the largest value of the int range here?
Why is the output in this part -1?
I know %u is used for printing unsigned decimal integers.
Why is the behavior undefined in the second code and not in the first?
I compiled this with gcc; it's C code.
On my machine, sizeof(short int) is 2 bytes and sizeof(int) is 4 bytes.
In your implementation, short is 16 bits and int is 32 bits.
unsigned short a=-1;
printf("%d",a);
First, -1 is converted to unsigned short. This results in the value 65535. For the precise definition see the "integer conversions" section of the standard. To summarize: the value is taken modulo USHRT_MAX + 1.
This value 65535 is assigned to a.
Then, for the printf, which uses varargs, the value is promoted back to int. Varargs never pass integer types smaller than int; they are always promoted to int. This results in the value 65535, which is printed.
unsigned int a=-1;
printf("%d",a);
The first line is the same as before, but modulo UINT_MAX + 1, so a is 4294967295.
For the printf, a is passed as an unsigned int. Since %d requires an int, the behavior is undefined by the C standard. But your implementation appears to have reinterpreted the unsigned value 4294967295, which has all bits set, as a signed integer with all bits set, i.e. the two's-complement value -1. This behavior is common but not guaranteed.
Variable assignment writes to the amount of memory that the variable's type occupies (e.g. short is 2 bytes and int is 4 bytes on typical 32-bit hardware). The signedness of the variable does not matter for the assignment itself; what matters is how you access it afterwards. When you assign to a short (signed or unsigned), you store a 2-byte value. If you then print it with %d, printf treats the argument as an int (4 bytes on your hardware): after promotion the two higher-order bytes are 0 and the two lower-order bytes hold the 16-bit pattern of -1. Because the new high bytes are zero, the original sign bit no longer sits in the int's sign position, so printf sees a non-negative value and you get the positive result. To get a negative number in the first case you need to use %hd. In the second case you assigned to a 4-byte int, so the most significant (sign) bit is 1 after the assignment, and %d prints a negative number. Hope this explains it; for more clarification please comment on the answer.
NB: I used 'MSB' as shorthand for the higher-order byte(s); read it according to context (e.g. in 'sign bit' it means the most significant bit). Thanks.
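To see the point about length modifiers, a small sketch contrasting %d with %hd, assuming a 16-bit short and 32-bit int as in the question:

#include <stdio.h>

int main(void)
{
    unsigned short a = -1;  /* stored as 65535 after the modulo conversion */
    printf("%d\n", a);      /* 65535: the promoted int holds the value directly */
    printf("%hd\n", a);     /* commonly -1: printf converts the argument back to short,
                               reinterpreting the low 16 bits (implementation-dependent) */
    return 0;
}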

Integer overflow problem

Please explain the following paragraph.
"The next question is whether we can assign a certain value to a variable without losing precision. It is not sufficient if we just check for overflow during the addition or subtraction, because someone might add 1 to -5 and assign the result to an unsigned int. Then the actual addition does not overflow, but the result still does not fit."
When I add 1 to -5, I don't see any reason to worry; the answer is -4, as it should be.
So what is the problem with the result not fitting?
You can find the full article that I was reading here:
http://www.fefe.de/intof.html
The binary representation of -4 in a 32-bit word is as follows (hex notation):
0xfffffffc
When interpreted as an unsigned integer, this bit pattern represents the number 2**32 - 4, or 4294967292. I'm not sure I would call this phenomenon "overflow", but it is a common mistake to assign a small negative integer to a variable of unsigned type and wind up with a really big positive integer.
This trick is actually exploited for bounds checking: if you have a signed integer i and want to know if it is in the range 0 <= i < n, you can test
if ((unsigned)i < n) { ... }
which gives you the answer using one comparison instead of two. The cast to unsigned has no run-time cost; it just tells the compiler to generate an unsigned comparison instead of a signed comparison.
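A self-contained sketch of that bounds-checking idiom (the function name here is illustrative):

#include <stdio.h>

/* returns 1 when 0 <= i < n, using a single unsigned comparison */
static int in_range(int i, unsigned int n)
{
    return (unsigned int)i < n;  /* a negative i converts to a huge unsigned value and fails the test */
}

int main(void)
{
    printf("%d %d %d\n", in_range(3, 10), in_range(-1, 10), in_range(10, 10));  /* prints: 1 0 0 */
    return 0;
}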
Try assigning it to an unsigned int, not an int.
The term unsigned int is the key: by default an int holds both negative and positive numbers, whereas unsigned ints are always non-negative. Unsigned ints can technically hold larger positive values than regular signed ints of the same size, because they do not need to spend a bit tracking whether the value is negative or positive.
Please see:
Signed versus Unsigned Integers
The problem is that you're storing -4 in an unsigned int. Unsigned ints can only hold zero and positive values, so if you assign -4 to one, you'll actually end up with a very large positive number (the exact value depends on how wide your int is).
The problem is that a storage type such as unsigned int can only hold so much. With 1 and -5 it does not matter much, but with 1 and -500000000 you might end up with a confusing result. Also, unsigned storage interprets anything stored in it as positive, so you cannot meaningfully put a negative value in an unsigned variable.
Two big things to watch out for:
1. Overflow in the operation itself: 1 + -500000000
2. Issues in casting: (unsigned int)(1 + -500)
Unsigned variables, like unsigned int, cannot hold negative values. So assigning 1 - 5 to an unsigned int won't give you -4; it gives a large positive value, since the conversion wraps modulo UINT_MAX + 1.
Some code:
signed int first, second;
unsigned int result;
first = obtain(); // happens to be 1
second = obtain(); // happens to be -5
result = first + second; // unexpected result here - very large number - and it's too late to check that there's a problem
Say you obtained those values from the keyboard. You need to check before the addition that the result can be represented in an unsigned int. That's what the article talks about.
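A sketch of the kind of pre-check the article means, assuming the sum is meant to end up in an unsigned int (variable names are illustrative):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    int first = 1;
    int second = -5;

    /* do the arithmetic in a wider type so the check itself cannot overflow
       (assumes long long is wider than int, which holds on common platforms) */
    long long sum = (long long)first + second;

    if (sum < 0 || sum > UINT_MAX) {
        printf("result does not fit in an unsigned int\n");  /* this branch runs for 1 + -5 */
    } else {
        unsigned int result = (unsigned int)sum;
        printf("%u\n", result);
    }
    return 0;
}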
By definition the number -4 cannot be represented in an unsigned int. -4 is a signed integer. The same goes for any negative number.
When you assign a negative integer to an unsigned int the actual bits of the number do not change, but they are merely represented differently. You'll get some ridiculously-large number due to the way integers are represented in binary (two's complement).
In two's complement, -4 is represented as 0xfffffffc. When 0xfffffffc is represented as an unsigned int you'll get the number 4,294,967,292.
You have to remember that fundamentally you're working with bits. You can assign the value -4 to an unsigned integer, and that places a particular series of bits into that memory location. Those bits can be interpreted as -4 in certain circumstances. One such circumstance is the obvious one: you've told the compiler/system that the bits in that memory location should be interpreted as a two's-complement signed number. So if you do printf("%d", i), printf does its magic and converts the two's-complement number into a magnitude and a sign: the magnitude is 4 and the sign is negative, so it displays '-4'.
However, if you tell the compiler that the data at that memory location is not signed, the bits don't change, but their interpretation does. So when you do your addition, store the result in an unsigned integer memory location, and then call printf on the result, it doesn't bother looking for a sign, because by definition the value is always non-negative. It calculates the magnitude and prints it. The magnitude will be off, because the sign information is still encoded in the bits but is treated as magnitude information.
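A small sketch of that "same bits, different interpretation" idea, assuming a 32-bit int and two's complement:

#include <stdio.h>

int main(void)
{
    unsigned int u = (unsigned int)-4;  /* stores the bit pattern 0xfffffffc on a 32-bit, two's-complement machine */
    printf("%u\n", u);                  /* 4294967292: interpreted as pure magnitude */
    printf("%x\n", u);                  /* fffffffc: the underlying bits */
    printf("%d\n", (int)u);             /* usually -4: the same bits read back as signed
                                           (this conversion is implementation-defined) */
    return 0;
}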
