Datatypes and datatype modifiers in C

I am pretty new to C. I recently came across this piece of code in C:
#include <stdio.h>
int main()
{
    unsigned Abc = 1;
    signed Xyz = -1;
    if (Abc < Xyz)
        printf("Less");
    else if (Abc > Xyz)
        printf("Great");
    else if (Abc == Xyz)
        printf("Equal");
    return 0;
}
I tried running it and it outputs "Less". How does it work? What is the meaning of unsigned Abc? I could understand unsigned char Abc, but simply unsigned Abc? I am pretty sure Abc is not a data type! How (and why) does this work?

Two things are happening.
The default data type in C is int, so you have variables of type signed int and unsigned int.
When an unsigned int and a signed int are used in an expression, the signed int is converted to unsigned before the expression is evaluated. This causes the signed -1 to turn into a very large unsigned number (due to the two's complement representation).

The default type in C is int. Therefore unsigned is a synonym for unsigned int.
Signed integers are usually handled using two's complement. With a 16-bit int, for example, the actual representation of 1 is 0x0001 and the actual representation of -1 is 0xFFFF.

int is the "default" type in C. unsigned Abc means unsigned int Abc just like long L means long int L.
When you have an expression that mixes signed and unsigned ints, the signed ints get automatically converted to unsigned. Most systems use two's complement to store integers, so (unsigned int)(-1) is equal to the largest possible unsigned int.
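To see the conversion in action, here is a minimal sketch of my own (the printed values assume a 32-bit int):
#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned Abc = 1;   /* unsigned int */
    signed Xyz = -1;    /* signed int */

    /* In Abc < Xyz, Xyz is converted to unsigned int before the comparison. */
    printf("Xyz as unsigned: %u\n", (unsigned)Xyz);   /* 4294967295 with 32-bit int */
    printf("UINT_MAX       : %u\n", UINT_MAX);
    printf("Abc < Xyz      : %s\n", (Abc < Xyz) ? "yes" : "no");   /* yes */
    return 0;
}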

As far as I know, the signed value gets promoted to an unsigned value and so becomes very large.

Comparing signed and unsigned types triggers implicit conversions that can be surprising, and since the width of int is implementation-defined, your program can print different results on different platforms.
Please see comments.

unsigned/signed is just a short specification for unsigned int/signed int, so no, you don't have a variable with "no data type".

The signed value will get promoted to unsigned and therefore it will be bigger than 1.

Add the following line after signed Xyz = -1;
printf("is Abc => %x less than Xyz => %x\n",Abc,Xyz);
and see the result for yourself.

Related

unsigned and signed behavior in C

I tried an older post but was not able to understand the following behavior.
https://stackoverflow.com/questions/12295168/c-signed-unsigned-mismatch
unsigned int and signed char comparison
#include <stdio.h>

#define T long
int main()
{
    unsigned T a;
    T b;
    a = 1;
    b = -1;
    if (a > b)
        printf("True\n");
    else
        printf("False\n");
    return 0;
}
I tried the above code for T defined as char, short int, and long.
The observed output for char and short is True, while for int and long it is False. I tried the above code with gcc on Ubuntu.
Can anyone explain why I am getting different output for different data types?
When testing against the signed b value, for char and short the value gets widened to an int, and this replicates the sign bit, whereas for the a value the sign bit is not replicated.
Thus for char the if effectively becomes if (0x00000001 > 0xFFFFFFFF) as a signed comparison, i.e. 1 > -1, which is true (assuming a 32-bit int).
But when using an unsigned type that is an int or bigger, the test is done as an unsigned comparison.
char is promoted to int in cases such as yours where you compare two variables.
Let's see what happens underneath for char types:
a is promoted to an int and it remains as 1. b is also promoted to an int, the sign is preserved and it also remains as -1. Is 1 > -1? Yes!
And what about int types:
As there is an unsigned operand involved, all of them will be converted to unsigned. In the case of a, which is already unsigned, 1 is preserved as is. However, b is signed and therefore we need to lose the sign.
Due to the underlying bit representation, on a 32 bit machine, -1 actually has the same bits as 4294967295. And you end up comparing if 1 is bigger than 4294967295. I think the answer is obvious.
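A small sketch of my own that prints what the operands become in each case; the comments assume 32-bit int and 64-bit long:
#include <stdio.h>

int main(void)
{
    unsigned char ca = 1;
    signed char cb = -1;   /* both promote to int: 1 and -1, sign preserved */
    unsigned long la = 1;
    long lb = -1;          /* in la > lb, lb is converted to unsigned long */

    printf("char case: %d > %d -> %s\n", ca, cb, (ca > cb) ? "True" : "False");
    printf("long case: %lu > %lu -> %s\n", la, (unsigned long)lb,
           (la > lb) ? "True" : "False");
    return 0;
}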

Type conversion - unsigned to signed int/char

I tried to execute the program below:
#include <stdio.h>
int main() {
    signed char a = -5;
    unsigned char b = -5;
    int c = -5;
    unsigned int d = -5;
    if (a == b)
        printf("\r\n char is SAME!!!");
    else
        printf("\r\n char is DIFF!!!");
    if (c == d)
        printf("\r\n int is SAME!!!");
    else
        printf("\r\n int is DIFF!!!");
    return 0;
}
For this program, I am getting the output:
char is DIFF!!!
int is SAME!!!
Why are we getting different outputs for both?
Should the output be as below ?
char is SAME!!!
int is SAME!!!
A codepad link.
This is because of the various implicit type conversion rules in C. There are two of them that a C programmer must know: the usual arithmetic conversions and the integer promotions (the latter are part of the former).
In the char case you have the types (signed char) == (unsigned char). These are both small integer types. Other such small integer types are bool and short. The integer promotion rules state that whenever a small integer type is an operand of an operation, its type will get promoted to int, which is signed. This will happen no matter if the type was signed or unsigned.
In the case of the signed char, the sign will be preserved and it will be promoted to an int containing the value -5. In the case of the unsigned char, it contains a value which is 251 (0xFB ). It will be promoted to an int containing that same value. You end up with
if( (int)-5 == (int)251 )
In the integer case you have the types (signed int) == (unsigned int). They are not small integer types, so the integer promotions do not apply. Instead, they are balanced by the usual arithmetic conversions, which state that if two operands have the same "rank" (size) but different signedness, the signed operand is converted to the same type as the unsigned one. You end up with
if( (unsigned int)-5 == (unsigned int)-5)
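As a rough illustration of the two cases (a sketch of my own, assuming 8-bit char and 32-bit int), you can print the values the operands actually take after promotion/conversion:
#include <stdio.h>

int main(void)
{
    signed char a = -5;
    unsigned char b = -5;   /* stores 251 with 8-bit char */
    int c = -5;
    unsigned int d = -5;    /* stores UINT_MAX - 4 */

    printf("(int)a = %d, (int)b = %d\n", (int)a, (int)b);      /* -5 vs 251 */
    printf("(unsigned)c = %u, d = %u\n", (unsigned int)c, d);  /* equal */
    return 0;
}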
Cool question!
The int comparison works, because both ints contain exactly the same bits, so they are essentially the same. But what about the chars?
Ah, C implicitly promotes chars to ints on various occasions. This is one of them. Your code says if(a==b), but what the compiler actually turns that to is:
if((int)a==(int)b)
(int)a is -5, but (int)b is 251. Those are definitely not the same.
EDIT: As @Carbonic-Acid pointed out, (int)b is 251 only if a char is 8 bits long. On a platform with wider chars, (int)b would be a different positive value, not -5.
REDIT: There's a whole bunch of comments discussing the nature of the answer if a byte is not 8 bits long. The only difference in this case is that (int)b is not 251 but a different positive number, which isn't -5. This is not really relevant to the question which is still very cool.
Welcome to integer promotion. If I may quote from the website:
If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions. All other types are unchanged
by the integer promotions.
C can be really confusing when you do comparisons such as these, I recently puzzled some of my non-C programming friends with the following tease:
#include <stdio.h>
#include <string.h>

int main()
{
    char *string = "One looooooooooong string";
    printf("%zu\n", strlen(string));
    if (strlen(string) < -1)
        printf("This cannot be happening :(");
    return 0;
}
Which indeed does print This cannot be happening :( and seemingly demonstrates that 25 is smaller than -1!
What happens underneath however is that -1 is represented as an unsigned integer which due to the underlying bits representation is equal to 4294967295 on a 32 bit system. And naturally 25 is smaller than 4294967295.
If we however explicitly cast the size_t type returned by strlen as a signed integer:
if ((int)(strlen(string)) < -1)
Then it will compare 25 against -1 and all will be well with the world.
A good compiler should warn you about the comparison between an unsigned and signed integer and yet it is still so easy to miss (especially if you don't enable warnings).
This is especially confusing for Java programmers as all primitive types there are signed. Here's what James Gosling (one of the creators of Java) had to say on the subject:
Gosling: For me as a language designer, which I don't really count
myself as these days, what "simple" really ended up meaning was could
I expect J. Random Developer to hold the spec in his head. That
definition says that, for instance, Java isn't -- and in fact a lot of
these languages end up with a lot of corner cases, things that nobody
really understands. Quiz any C developer about unsigned, and pretty
soon you discover that almost no C developers actually understand what
goes on with unsigned, what unsigned arithmetic is. Things like that
made C complex. The language part of Java is, I think, pretty simple.
The libraries you have to look up.
The hex representation of -5 is:
8-bit, two's complement signed char: 0xfb
32-bit, two's complement signed int: 0xfffffffb
When you convert a signed number to an unsigned number, or vice versa, the compiler does ... precisely nothing. What is there to do? Signed-to-unsigned conversion is well-defined modular arithmetic; for unsigned-to-signed, an out-of-range value gives implementation-defined behaviour, and the most efficient implementation-defined behaviour is to do nothing.
So, the hex representation of (unsigned <type>)-5 is:
8-bit, unsigned char: 0xfb
32-bit, unsigned int: 0xfffffffb
Look familiar? They're bit-for-bit the same as the signed versions.
When you write if (a == b), where a and b are of type char, what the compiler is actually required to read is if ((int)a == (int)b). (This is that "integer promotion" that everyone else is banging on about.)
So, what happens when we convert char to int?
8-bit signed char to 32-bit signed int: 0xfb -> 0xfffffffb
Well, that makes sense because it matches the representations of -5 above!
It's called a "sign-extend", because it copies the top bit of the byte, the "sign-bit", leftwards into the new, wider value.
8-bit unsigned char to 32-bit signed int: 0xfb -> 0x000000fb
This time it does a "zero-extend" because the source type is unsigned, so there is no sign-bit to copy.
So, a == b really does 0xfffffffb == 0x000000fb => no match!
And, c == d really does 0xfffffffb == 0xfffffffb => match!
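A small sketch of my own that makes the sign-extend/zero-extend visible (it assumes 8-bit char and 32-bit int):
#include <stdio.h>

int main(void)
{
    signed char   sc = -5;
    unsigned char uc = -5;   /* 0xfb */

    /* Print the int each char is converted to, as hex. */
    printf("signed char   -> int: 0x%08x\n", (unsigned)(int)sc);  /* 0xfffffffb (sign-extend) */
    printf("unsigned char -> int: 0x%08x\n", (unsigned)(int)uc);  /* 0x000000fb (zero-extend) */
    return 0;
}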
My point is: didn't you get a warning at compile time, "comparing signed and unsigned expression"?
The compiler is trying to inform you that it is entitled to do crazy stuff! :) I would add, crazy stuff will happen using big values, close to the capacity of the primitive type. And
unsigned int d = -5;
definitely assigns a big value to d; it is equivalent (in fact the conversion rules guarantee the equivalence) to:
unsigned int d = UINT_MAX - 4; // since -1 converts to UINT_MAX
Edit:
However, it is interesting to notice that only the second comparison gives a warning (check the code). So the compiler, applying the conversion rules, is confident that there won't be errors in the comparison between unsigned char and char (during the comparison they will be converted to a type that can safely represent all their possible values). And it is right on this point. Then, it informs you that this won't be the case for unsigned int and int: during the comparison one of the two will be converted to a type that cannot fully represent it.
For completeness, I checked it also for short: the compiler behaves in the same way as for chars and, as expected, there are no errors at runtime.
Related to this topic, I recently asked this question (though C++ oriented).

When will an unsigned int variable becomes negative

I was going through the existing code and when debugging the UTC time which is declared as
unsigned int utc_time;
I would get a positive integer every time, so I could be sure I was getting the time. But suddenly I got a negative value for this variable, which is declared as an unsigned integer.
Please help me understand what might be the reason.
Unsigned integers, by their very nature, can never be negative.
You may end up with a negative value if you cast it to a signed integer, or simply assign the value to a signed integer, or even incorrectly treat it as signed, such as with:
#include <stdio.h>
int main (void) {
    unsigned int u = 3333333333u;
    printf ("unsigned = %u, signed = %d\n", u, u);
    return 0;
}
which outputs:
unsigned = 3333333333, signed = -961633963
on my 32-bit integer system.
When it's cast or treated as a signed type. You probably printed your unsigned int as an int, and the bit sequence of the unsigned would have corresponded to a negative signed value.
ie. Perhaps you did:
unsigned int utc_time;
...
printf("%d", utc_time);
Where %d is for signed integers, compared to %u which is used for unsigned. Anyway if you show us the code we'll be able to tell you for certain.
There's no notion of positive or negative in an unsigned variable.
Make sure you using
printf("%u", utc_time);
to display it
In response to the comment: %u displays the variable as an unsigned int, whereas %i or %d will display the variable as a signed int.
Negative numbers in most (all?) C implementations are represented in two's complement: the bitwise complement of the unsigned magnitude plus one. It's possible that your debugger or a program listing the values doesn't show the variable as an unsigned type, so you see its two's complement interpretation.

How to cast or convert an unsigned int to int in C?

My apologies if the question seems weird. I'm debugging my code and this seems to be the problem, but I'm not sure.
Thanks!
It depends on what you want the behaviour to be. An int cannot hold many of the values that an unsigned int can.
You can cast as usual:
int signedInt = (int) myUnsigned;
but this will cause problems if the unsigned value is past the max int can hold. This means half of the possible unsigned values will result in erroneous behaviour unless you specifically watch out for it.
You should probably reexamine how you store values in the first place if you're having to convert for no good reason.
EDIT: As mentioned by ProdigySim in the comments, the maximum value is platform dependent. But you can access it with INT_MAX and UINT_MAX.
For the usual 4-byte types:
4 bytes = (4*8) bits = 32 bits
If all 32 bits are used, as in unsigned, the maximum value will be 2^32 - 1, or 4,294,967,295.
A signed int effectively sacrifices one bit for the sign, so the maximum value will be 2^31 - 1, or 2,147,483,647. Note that this is half of the other value.
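These limits don't need to be computed by hand; <limits.h> provides them. A minimal sketch of my own (the printed values assume a 32-bit int):
#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("INT_MAX  = %d\n", INT_MAX);   /* 2147483647 with a 32-bit int */
    printf("UINT_MAX = %u\n", UINT_MAX);  /* 4294967295 with a 32-bit int */
    return 0;
}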
Unsigned int can be converted to signed (or vice versa) by a simple expression, as shown below:
unsigned int z;
int y = 5;
z = (unsigned int)y;
Though not targeted at the question, you may want to read the following links:
signed to unsigned conversion in C - is it always safe?
performance of unsigned vs signed integers
Unsigned and signed values in C
What type-conversions are happening?
IMHO this question is an evergreen. As stated in various answers, the assignment of an unsigned value that is not in the range [0,INT_MAX] is implementation defined and might even raise a signal. If the unsigned value is considered to be a two's complement representation of a signed number, the probably most portable way is IMHO the way shown in the following code snippet:
#include <limits.h>

unsigned int u;
int i;

if (u <= (unsigned int)INT_MAX)
    i = (int)u;         /*(1)*/
else if (u >= (unsigned int)INT_MIN)
    i = -(int)~u - 1;   /*(2)*/
else
    i = INT_MIN;        /*(3)*/
Branch (1) is obvious and cannot invoke overflow or traps, since it
is value-preserving.
Branch (2) goes through some pains to avoid signed integer overflow
by taking the one's complement of the value by bit-wise NOT, casts it
to 'int' (which cannot overflow now), negates the value and subtracts
one, which can also not overflow here.
Branch (3) provides the poison we have to take on one's complement or
sign/magnitude targets, because the signed integer representation
range is smaller than the two's complement representation range.
This is likely to boil down to a simple move on a two's complement target; at least I've observed such with GCC and CLANG. Also branch (3) is unreachable on such a target -- if one wants to limit the execution to two's complement targets, the code could be condensed to
#include <limits.h>

unsigned int u;
int i;

if (u <= (unsigned int)INT_MAX)
    i = (int)u;         /*(1)*/
else
    i = -(int)~u - 1;   /*(2)*/
The recipe works with any signed/unsigned type pair, and the code is best put into a macro or inline function so the compiler/optimizer can sort it out. (In which case rewriting the recipe with a ternary operator is helpful. But it's less readable and therefore not a good way to explain the strategy.)
And yes, some of the casts to 'unsigned int' are redundant, but they might help the casual reader, and some compilers issue warnings on signed/unsigned compares because the implicit cast causes some non-intuitive behavior by language design.
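As a sketch of the "macro or inline function" packaging mentioned above (the helper name to_int is my own, and this is the two's-complement-only variant):
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper wrapping the two's-complement-only recipe above. */
static inline int to_int(unsigned int u)
{
    return (u <= (unsigned int)INT_MAX) ? (int)u : -(int)~u - 1;
}

int main(void)
{
    printf("%d\n", to_int(4294967291u));  /* -5 with a 32-bit, two's complement int */
    return 0;
}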
If you have a variable unsigned int x;, you can convert it to an int using (int)x.
It's as simple as this:
unsigned int foo;
int bar = 10;
foo = (unsigned int)bar;
Or vice versa...
If an unsigned int and a (signed) int are used in the same expression, the signed int gets implicitly converted to unsigned. This is a rather dangerous feature of the C language, and one you therefore need to be aware of. It may or may not be the cause of your bug. If you want a more detailed answer, you'll have to post some code.
Some explanation from C++ Primer, 5th edition, page 35:
If we assign an out-of-range value to an object of unsigned type, the result is the remainder of the value modulo the number of values the target type can hold.
For example, an 8-bit unsigned char can hold values from 0 through 255, inclusive. If we assign a value outside the range, the compiler assigns the remainder of that value modulo 256.
unsigned char c = -1; // assuming 8-bit chars, c has value 255
If we assign an out-of-range value to an object of signed type, the result is undefined. The program might appear to work, it might crash, or it might produce garbage values.
Page 160:
If any operand is an unsigned type, the type to which the operands are converted depends on the relative sizes of the integral types on the machine.
...
When the signedness differs and the type of the unsigned operand is the same as or larger than that of the signed operand, the signed operand is converted to unsigned.
The remaining case is when the signed operand has a larger type than the unsigned operand. In this case, the result is machine dependent. If all values in the unsigned type fit in the large type, then the unsigned operand is converted to the signed type. If the values don't fit, then the signed operand is converted to the unsigned type.
For example, if the operands are long and unsigned int, and int and long have the same size, the long will be converted to unsigned int. If the long type has more bits, then the unsigned int will be converted to long.
I found reading this book is very helpful.
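A minimal sketch of my own illustrating the modulo rule quoted above (assuming 8-bit char):
#include <stdio.h>

int main(void)
{
    unsigned char c = -1;   /* -1 modulo 256 is 255 */
    unsigned char d = 300;  /* 300 modulo 256 is 44 */
    printf("c = %d, d = %d\n", c, d);
    return 0;
}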

Comparison operation on unsigned and signed integers

See this code snippet
#include <stdio.h>

int main()
{
    unsigned int a = 1000;
    int b = -1;
    if (a > b) printf("A is BIG! %d\n", a - b);
    else printf("a is SMALL! %d\n", a - b);
    return 0;
}
This gives the output: a is SMALL! 1001
I don't understand what's happening here. How does the > operator work here? Why is "a" smaller than "b"? If it is indeed smaller, why do I get a positive number (1001) as the difference?
Binary operations between different integral types are performed within a "common" type defined by so called usual arithmetic conversions (see the language specification, 6.3.1.8). In your case the "common" type is unsigned int. This means that int operand (your b) will get converted to unsigned int before the comparison, as well as for the purpose of performing subtraction.
When -1 is converted to unsigned int the result is the maximal possible unsigned int value (same as UINT_MAX). Needless to say, it is going to be greater than your unsigned 1000 value, meaning that a > b is indeed false and a is indeed small compared to (unsigned) b. The if in your code should resolve to else branch, which is what you observed in your experiment.
The same conversion rules apply to subtraction. Your a-b is really interpreted as a - (unsigned) b and the result has type unsigned int. Such value cannot be printed with %d format specifier, since %d only works with signed values. Your attempt to print it with %d results in undefined behavior, so the value that you see printed (even though it has a logical deterministic explanation in practice) is completely meaningless from the point of view of C language.
Edit: Actually, I could be wrong about the undefined behavior part. According to C language specification, the common part of the range of the corresponding signed and unsigned integer type shall have identical representation (implying, according to the footnote 31, "interchangeability as arguments to functions"). So, the result of a - b expression is unsigned 1001 as described above, and unless I'm missing something, it is legal to print this specific unsigned value with %d specifier, since it falls within the positive range of int. Printing (unsigned) INT_MAX + 1 with %d would be undefined, but 1001u is fine.
On a typical implementation where int is 32-bit, -1 when converted to an unsigned int is 4,294,967,295 which is indeed ≥ 1000.
Even if you treat the subtraction in an unsigned world, 1000 - 4,294,967,295 = -4,294,966,295, which wraps around modulo 2^32 to 1,001, and that is what you get.
That's why gcc will spit a warning when you compare unsigned with signed. (If you don't see a warning, pass the -Wsign-compare flag.)
You are doing unsigned comparison, i.e. comparing 1000 to 2^32 - 1.
The output is signed because of %d in printf.
N.B. sometimes the behavior when you mix signed and unsigned operands is compiler-specific. I think it's best to avoid them and do casts when in doubt.
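A minimal variation of the snippet (my own sketch, assuming a 32-bit int) that prints the converted values with matching format specifiers:
#include <stdio.h>

int main(void)
{
    unsigned int a = 1000;
    int b = -1;

    printf("b converted to unsigned: %u\n", (unsigned int)b);  /* 4294967295 */
    printf("a > b -> %s\n", (a > b) ? "true" : "false");       /* false */
    printf("a - b -> %u\n", a - b);                            /* 1001 */
    return 0;
}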
#include <stdio.h>
int main()
{
    int a = 1000;
    signed int b = -1, c = -2;
    printf("%u\n", (unsigned int)b);
    printf("%u\n", (unsigned int)c);
    printf("%u\n", (unsigned int)a);
    if (1000 > -1) {
        printf("\ntrue");
    }
    else
        printf("\nfalse");
    return 0;
}
For this you need to understand the usual arithmetic conversions.
When a relational operator has one unsigned and one signed operand, as in the question's a > b, the unsigned type takes priority and the signed operand is converted to unsigned (its range of positive values is greater than the signed type's).
So -1 is changed into an unsigned number; it changes into a very big number (which is what the casts in the printf calls above show). A comparison between two plain int constants such as if (1000 > -1) stays signed and is simply true.
An easy way to compare, maybe useful when you cannot get rid of the unsigned declaration (for example, [NSArray count]), is to just force the "unsigned int" to an "int".
Please correct me if I am wrong.
if (((int)a)>b) {
....
}
The hardware is designed to compare signed to signed and unsigned to unsigned.
If you want the arithmetic result, convert the unsigned value to a larger signed type first. Otherwise the compiler will assume that the comparison is really between unsigned values.
And -1 is represented as 1111...1111, so it is a very big quantity ... the biggest ... when interpreted as unsigned.
While comparing a > b, where a is of unsigned int type and b is of int type, b is converted to unsigned int, so the signed int value -1 is converted to the maximum value of unsigned int (range: 0 to 2^32 - 1).
Thus a > b, i.e. (1000 > 4294967295), is false. Hence the else branch printf("a is SMALL! %d\n", a-b); is executed.

Resources