It's been a while since I started programming in C, but I still feel confused about unsigned. If we compile this code:
#include <stdio.h>
int main(int argc, char **argv)
{
unsigned int x = -1;
return 0;
}
neither gcc nor VC++ raises any error, or even a warning, about assigning a negative number to an unsigned variable.
My question is: does unsigned do any internal work, or is it just a hint to the programmer that this value shouldn't be negative?
It is NOT just a hint. The following two snippets should behave differently:
Signed int:
int x = -1;
printf("%d\n", x > 0); // prints 0
Unsigned int:
unsigned int x = -1;
printf("%d\n", x > 0); // prints 1
And you could probably come up with 5 more examples where the signedness matters. For example, shifting right with the >> operator.
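To make the shift example concrete, here is a minimal sketch (assuming a 32-bit, two's complement int, which is typical):

#include <stdio.h>

int main(void)
{
    int s = -8;
    unsigned int u = 0xFFFFFFF8;  /* the same bit pattern as -8 */

    /* Right-shifting a negative signed value is implementation-defined;
       most compilers perform an arithmetic shift that keeps the sign bit. */
    printf("%d\n", s >> 1);  /* typically prints -4 */

    /* Unsigned right shift is always a logical shift, filling with zeros. */
    printf("%u\n", u >> 1);  /* prints 2147483644 */

    return 0;
}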
Use -Wsign-conversion to get a warning with gcc.
With gcc 4.7.1, -Wsign-conversion is neither a part of -Wall nor -Wextra.
Also note that the C Standard does NOT require a warning for this initialization.
unsigned is not a qualifier like static, extern, const or inline. It is part of the type.
int and unsigned int are two completely different types. You will never find an unsigned int that can hold a negative number. Note also that int and signed int are exactly the same type. It's a slightly different story for char, but I'll leave that for another time.
Assigning -1 to an unsigned integer is a common trick to set it to the largest value it can hold.
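A minimal illustration of the trick, checked against UINT_MAX from <limits.h>:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned int x = -1;  /* wraps around to the largest unsigned int */

    printf("%u\n", x);              /* prints 4294967295 with 32-bit unsigned int */
    printf("%d\n", x == UINT_MAX);  /* prints 1 */

    return 0;
}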
unsigned can affect integer overflow, comparison, and bitwise-shift behavior. It is not just a "hint" that it should not be negative.
Try enabling warnings in gcc: gcc -Wall -o out input.c
Another difference between unsigned int and plain int is the range: with 4 bytes, int covers -2,147,483,648 to 2,147,483,647, while unsigned int covers 0 to 4,294,967,295. Because no bit has to be spent on the sign (the highest bit), you can store larger values and the application will work properly.
((unsigned int)-1) is equivalent to UINT_MAX (from <limits.h>), because a conversion from a negative value to an unsigned type is guaranteed to be performed modulo one more than the maximum value of the unsigned type:
C11 § 6.3.1.3 Signed and unsigned integers:
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
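A small sketch of that rule in action with values other than -1 (the printed results assume a 32-bit unsigned int and an 8-bit unsigned char):

#include <stdio.h>

int main(void)
{
    unsigned int a = -5;    /* -5 + (UINT_MAX + 1) = 4294967291 */
    unsigned char b = 300;  /* 300 - 256 = 44 */

    printf("%u\n", a);  /* prints 4294967291 */
    printf("%d\n", b);  /* prints 44 */

    return 0;
}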
There are both signed and unsigned versions of the machine-level instructions for calculations such as multiplication: mul (unsigned) and imul (signed) on x86. The compiler chooses between them according to whether you declared your variables signed or unsigned.
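You can see why distinct instructions are needed from C itself: the same bit pattern yields different quotients depending on signedness. A minimal sketch, assuming a 32-bit, two's complement int:

#include <stdio.h>

int main(void)
{
    int s = -2;
    unsigned int u = 0xFFFFFFFE;  /* the same bit pattern as -2 */

    printf("%d\n", s / 2);  /* signed division:   prints -1 */
    printf("%u\n", u / 2);  /* unsigned division: prints 2147483647 */

    return 0;
}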
(assuming a 64-bit machine)
e.g.
int n = 0xFFFFFFFF; //max 32bit unsigned number
printf("%u\n", n);
The maximum positive number that a regular signed integer (32bit) can store is 0x7FFFFFFF.
In the above example I'm assigning the maximum unsigned integer value to a regular signed integer; I'm receiving no warnings or errors from GCC, even with -Wall -Wextra, and the result is printed without problems.
Appending U or L to the hex constant changes nothing.
Why is that?
0xFFFFFFFF, on a platform where unsigned int has a maximum value of 2^32 - 1, will have the type unsigned int according to "6.4.4.1 Integer constants" of the Standard.
And then we get to the conversion:
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.60)
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
So, the result is implementation-defined or raises an implementation-defined signal.
Now, you print your int with the format %u, which is just plain mismatched. And while that is, strictly speaking, UB, you will likely get the original constant, assuming two's complement and that the original assignment wrapped around.
The C standard doesn't specify the behaviour but requires that the implementation specifies it. GCC always uses 2's complement representation and converts via truncation, therefore int32_t i = 0xFFFFFFFF; will result in i being set to -1 when compiled with GCC. On other compilers YMMV.
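A quick way to check what your compiler does (a minimal sketch; the behavior of the conversion is implementation-defined, so the comments describe GCC's choice):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int32_t i = 0xFFFFFFFF;  /* out of range for int32_t; GCC truncates to -1 */

    printf("%d\n", i);  /* prints -1 with GCC */

    return 0;
}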
To get the warning from GCC you need to give the -Wsign-conversion flag:
% gcc 0xfffffff.c -c -Wsign-conversion
0xfffffff.c:1:9: warning: conversion of unsigned constant value to negative integer
[-Wsign-conversion]
int i = 0xFFFFFFFF;
        ^~~~~~~~~~
In general, C compilers by default produce warnings only about very blatant errors and constraint violations. -Wsign-conversion would make many compilations very noisy, even ones that are well-defined, like:
unsigned char c = '\x80';
which produces
unsignedchar.c:1:19: warning: negative integer implicitly converted to unsigned type
[-Wsign-conversion]
unsigned char c = '\x80';
^~~~~~
on implementations where char is signed.
Assume that int and unsigned int are 32 bits, which is the case on most platforms you're likely to be using (both 32-bit and 64-bit systems). Then the constant 0xFFFFFFFF is of type unsigned int, and has the value 4294967295.
This:
int n = 0xFFFFFFFF;
implicitly converts that value from unsigned int to int. The result of the conversion is implementation-defined; there is no undefined behavior. (In principle, it can also cause an implementation-defined signal to be raised, but I know of no implementations that do that).
Most likely the value stored in n will be -1.
printf("%u\n", n);
Here you use a %u format specifier, which requires an argument of type unsigned int, but you pass it an argument of type int. The standard says that values of corresponding signed and unsigned type are interchangeable as function arguments, but only for values that are within the range of both types, which is not the case here.
This call does not perform a conversion from int to unsigned int. Rather, an int value is passed to printf, which assumes that the value it received is of type unsigned int. The behavior is undefined. (Again, this would be a reasonable thing to warn about.)
The most likely result is that the int value of -1, which (assuming 2's-complement) has the same representation as 0xFFFFFFFF, will be treated as if it were an unsigned int value of 0xFFFFFFFF, which is printed in decimal as 4294967295.
You can get a warning on int n = 0xFFFFFFFF; by using the -Wconversion or -Wsign-conversion option. These options are not included in -Wextra or -Wall. (You'd have to ask the gcc maintainers why.)
I don't know of an option that will cause a warning on the printf call.
(Of course the fix is to define n as an unsigned int, which makes everything correct and consistent.)
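For reference, a sketch of the corrected version, with the type changed as suggested:

#include <stdio.h>

int main(void)
{
    unsigned int n = 0xFFFFFFFF;  /* value fits: no conversion takes place */

    printf("%u\n", n);  /* argument type matches %u: prints 4294967295 */

    return 0;
}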
The expected output of the following C program is to print the elements in the array. But when actually run, it doesn't do so.
#include <stdio.h>

#define TOTAL_ELEMENTS (sizeof(array) / sizeof(array[0]))

int array[] = {23, 34, 12, 17, 204, 99, 16};

int main()
{
    int d;

    for (d = -1; d <= (TOTAL_ELEMENTS - 2); d++)
        printf("%d\n", array[d + 1]);

    return 0;
}
Because sizeof gives you an unsigned value, which you probably would have noticed had you turned up the warning level, such as using -Wall -Wextra with gcc (a):
xyzzy.c: In function 'main':
xyzzy.c:8: warning: comparison between signed and unsigned
If you force it to signed, it works fine:
#define TOTAL_ELEMENTS (int)((sizeof(array) / sizeof(array[0])))
What happens in detail can be gleaned from the ISO standard. In comparisons between different types, promotions are performed to make the types compatible. The compatible type chosen depends on several factors such as sign compatibility, precision and rank but, in this case, it was deemed that the unsigned type size_t was the compatible type so d was upgraded to that type.
Unfortunately, casting -1 to an unsigned type (at least for two's complement which is almost certainly what you're using) results in a rather large positive number.
One that's certainly larger than the 5 you get from (TOTAL_ELEMENTS-2). In other words, your for statement effectively becomes:
for (d = some big honking number way greater than five;
d <= 5;
d++
) {
// fat chance of getting in here !!
}
(a) This requirement to use extra remains a point of contention between the gcc developers and myself. They're obviously using some new definition of the word "all" of which I was previously unaware (with apologies to Douglas Adams).
TOTAL_ELEMENTS is of type size_t; subtracting 2 is done at compile time, so the right-hand side is 5UL (emphasis on the unsigned suffix). The comparison with the signed integer d is then always false. Try
for(d=-1;d <= (ssize_t)(TOTAL_ELEMENTS-2);d++)
FWIW, the Intel compiler warns about exactly this when you try to compile the code.
To clarify what went wrong: sizeof() yields a result of type size_t, which is an unsigned integer type (on most platforms at least as wide as unsigned int).
So in (sizeof(array) / sizeof(array[0])), both operands have type size_t. The division is performed as size_t / size_t; since both operands have the same type it works fine, and the result of the division is again size_t. That is therefore the type TOTAL_ELEMENTS yields.
The expression (TOTAL_ELEMENTS-2) therefore mixes the types size_t and int, since the integer literal 2 has type int.
Here we have two different types. What happens then is something called balancing (formally "the usual arithmetic conversions"), which happens when the compiler spots two different types. The balancing rules state that if one operand is signed and the other unsigned, then the signed one is silently, implicitly converted to an unsigned type.
This is what happens in this code. size_t - int is converted to size_t - size_t, then the subtraction is executed, the result is size_t. Then int <= size_t is converted to size_t <= size_t. The variable d turns unsigned, and if it had a negative value, the code goes haywire.
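A minimal sketch of that balancing in isolation (assuming, as on typical platforms, that size_t is at least as wide as int):

#include <stddef.h>
#include <stdio.h>

int main(void)
{
    int d = -1;
    size_t n = 5;  /* stand-in for TOTAL_ELEMENTS - 2 */

    /* d is converted to the unsigned type and becomes a huge value,
       so the test is false even though -1 <= 5 mathematically. */
    if (d <= n)
        printf("true\n");
    else
        printf("false\n");  /* this branch runs */

    return 0;
}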
My apologies if the question seems weird. I'm debugging my code and this seems to be the problem, but I'm not sure.
Thanks!
It depends on what you want the behaviour to be. An int cannot hold many of the values that an unsigned int can.
You can cast as usual:
int signedInt = (int) myUnsigned;
but this will cause problems if the unsigned value is past the max int can hold. This means half of the possible unsigned values will result in erroneous behaviour unless you specifically watch out for it.
You should probably reexamine how you store values in the first place if you're having to convert for no good reason.
EDIT: As mentioned by ProdigySim in the comments, the maximum value is platform dependent. But you can access it with INT_MAX and UINT_MAX.
For the usual 4-byte types:
4 bytes = (4*8) bits = 32 bits
If all 32 bits are used, as in unsigned, the maximum value will be 2^32 - 1, or 4,294,967,295.
A signed int effectively sacrifices one bit for the sign, so the maximum value will be 2^31 - 1, or 2,147,483,647. Note that this is half of the other value.
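Rather than memorizing these numbers, you can print the limits your platform actually uses via <limits.h>:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("INT_MIN  = %d\n", INT_MIN);   /* -2147483648 with 32-bit int */
    printf("INT_MAX  = %d\n", INT_MAX);   /* 2147483647 */
    printf("UINT_MAX = %u\n", UINT_MAX);  /* 4294967295 */

    return 0;
}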
An unsigned int can be converted to signed (or vice versa) with a simple cast, as shown below:
unsigned int z;
int y=5;
z= (unsigned int)y;
Though not directly targeted at the question, you might like to read the following links:
signed to unsigned conversion in C - is it always safe?
performance of unsigned vs signed integers
Unsigned and signed values in C
What type-conversions are happening?
IMHO this question is an evergreen. As stated in various answers, the conversion to int of an unsigned value that is not in the range [0, INT_MAX] is implementation-defined and might even raise a signal. If the unsigned value is considered to be a two's complement representation of a signed number, the probably most portable way is IMHO the way shown in the following code snippet:
#include <limits.h>
unsigned int u;
int i;
if (u <= (unsigned int)INT_MAX)
    i = (int)u;        /* (1) */
else if (u >= (unsigned int)INT_MIN)
    i = -(int)~u - 1;  /* (2) */
else
    i = INT_MIN;       /* (3) */
Branch (1) is obvious and cannot invoke overflow or traps, since it is value-preserving.
Branch (2) goes through some pains to avoid signed integer overflow: it takes the one's complement of the value by bitwise NOT, casts it to int (which cannot overflow now), negates the value, and subtracts one, which also cannot overflow here.
Branch (3) provides the poison we have to take on one's complement or sign/magnitude targets, because their signed integer representation range is smaller than the two's complement representation range.
This is likely to boil down to a simple move on a two's complement target; at least I've observed such with GCC and CLANG. Also branch (3) is unreachable on such a target -- if one wants to limit the execution to two's complement targets, the code could be condensed to
#include <limits.h>
unsigned int u;
int i;
if (u <= (unsigned int)INT_MAX)
    i = (int)u;        /* (1) */
else
    i = -(int)~u - 1;  /* (2) */
The recipe works with any signed/unsigned type pair, and the code is best put into a macro or inline function so the compiler/optimizer can sort it out. (In which case rewriting the recipe with a ternary operator is helpful. But it's less readable and therefore not a good way to explain the strategy.)
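For illustration, a hypothetical inline-function wrapper of the two's complement variant (the name utoi is made up here, not part of the recipe above):

#include <limits.h>

/* Convert unsigned int to int on a two's complement target without
   relying on the implementation-defined conversion. */
static inline int utoi(unsigned int u)
{
    return (u <= (unsigned int)INT_MAX) ? (int)u : -(int)~u - 1;
}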
And yes, some of the casts to unsigned int are redundant, but:
- they might help the casual reader
- some compilers issue warnings on signed/unsigned compares, because the implicit cast causes some non-intuitive behavior by language design
If you have a variable unsigned int x;, you can convert it to an int using (int)x.
It's as simple as this:
unsigned int foo;
int bar = 10;
foo = (unsigned int)bar;
Or vice versa...
If an unsigned int and a (signed) int are used in the same expression, the signed int gets implicitly converted to unsigned. This is a rather dangerous feature of the C language, and one you therefore need to be aware of. It may or may not be the cause of your bug. If you want a more detailed answer, you'll have to post some code.
Some explanation from C++ Primer, 5th edition, page 35:
If we assign an out-of-range value to an object of unsigned type, the result is the remainder of the value modulo the number of values the target type can hold.
For example, an 8-bit unsigned char can hold values from 0 through 255, inclusive. If we assign a value outside the range, the compiler assigns the remainder of that value modulo 256.
unsigned char c = -1; // assuming 8-bit chars, c has value 255
If we assign an out-of-range value to an object of signed type, the result is undefined. The program might appear to work, it might crash, or it might produce garbage values.
Page 160:
If any operand is an unsigned type, the type to which the operands are converted depends on the relative sizes of the integral types on the machine.
...
When the signedness differs and the type of the unsigned operand is the same as or larger than that of the signed operand, the signed operand is converted to unsigned.
The remaining case is when the signed operand has a larger type than the unsigned operand. In this case, the result is machine dependent. If all values in the unsigned type fit in the large type, then the unsigned operand is converted to the signed type. If the values don't fit, then the signed operand is converted to the unsigned type.
For example, if the operands are long and unsigned int, and int and long have the same size, the long will be converted to unsigned int. If the long type has more bits, then the unsigned int will be converted to long.
I found reading this book is very helpful.
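A small sketch of the "larger signed type" case described above, assuming a platform where long is 64 bits and int is 32 bits (e.g. x86-64 Linux):

#include <stdio.h>

int main(void)
{
    long l = -1;
    unsigned int u = 1;

    /* Every unsigned int value fits in a 64-bit long, so u is converted
       to long and the comparison behaves intuitively. */
    printf("%d\n", l < u);  /* prints 1 */

    return 0;
}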
I am pretty new to C. I recently came across this piece of code in C:
#include <stdio.h>

int main()
{
    unsigned Abc = 1;
    signed Xyz = -1;

    if (Abc < Xyz)
        printf("Less");
    else if (Abc > Xyz)
        printf("Great");
    else if (Abc == Xyz)
        printf("Equal");

    return 0;
}
I tried running it and it outputs "Less". How does that work? What is the meaning of unsigned Abc? I could understand unsigned char Abc, but simply unsigned Abc? I was pretty sure unsigned on its own is not a data type! How (and why) does this work?
Two things are happening.
First, the default data type in C is int, so you have variables of type unsigned int and signed int.
Second, when an unsigned int and a signed int are used in an expression, the signed int is converted to unsigned before the expression is evaluated. This causes the signed -1 to turn into a very large unsigned number (due to two's complement representation).
The default type in C is int. Therefore unsigned is a synonym for unsigned int.
Signed integers are usually handled using two's complement. This means that, in 16 bits, the actual value for 1 is 0x0001 and the actual value for -1 is 0xFFFF.
int is the "default" type in C. unsigned Abc means unsigned int Abc just like long L means long int L.
When you have an expression that mixes signed and unsigned ints, the signed ints get automatically converted to unsigned. Most systems use two's complement to store integers, so (unsigned int)(-1) is equal to the largest possible unsigned int.
As far as I know, the signed value gets promoted to an unsigned value and so becomes very large.
Comparing signed and unsigned types triggers an implicit conversion of the signed operand to unsigned. That is well-defined, but the results are often surprising: here -1 becomes the largest unsigned int, so the program prints "Less".
unsigned/signed is just a shorthand for unsigned int/signed int, so no, you don't have a variable with "no data type".
The signed value gets converted to unsigned and therefore becomes far bigger than 1.
Add the following line after signed Xyz = -1;
printf("is Abc => %x less than Xyz => %x\n",Abc,Xyz);
and see the result for yourself.
See this code snippet
#include <stdio.h>

int main()
{
    unsigned int a = 1000;
    int b = -1;

    if (a > b)
        printf("A is BIG! %d\n", a - b);
    else
        printf("a is SMALL! %d\n", a - b);

    return 0;
}
This gives the output: a is SMALL! 1001
I don't understand what's happening here. How does the > operator work here? Why is "a" smaller than "b"? If it is indeed smaller, why do i get a positive number (1001) as the difference?
Binary operations between different integral types are performed within a "common" type defined by so called usual arithmetic conversions (see the language specification, 6.3.1.8). In your case the "common" type is unsigned int. This means that int operand (your b) will get converted to unsigned int before the comparison, as well as for the purpose of performing subtraction.
When -1 is converted to unsigned int the result is the maximal possible unsigned int value (same as UINT_MAX). Needless to say, it is going to be greater than your unsigned 1000 value, meaning that a > b is indeed false and a is indeed small compared to (unsigned) b. The if in your code should resolve to else branch, which is what you observed in your experiment.
The same conversion rules apply to subtraction. Your a-b is really interpreted as a - (unsigned) b and the result has type unsigned int. Such value cannot be printed with %d format specifier, since %d only works with signed values. Your attempt to print it with %d results in undefined behavior, so the value that you see printed (even though it has a logical deterministic explanation in practice) is completely meaningless from the point of view of C language.
Edit: Actually, I could be wrong about the undefined behavior part. According to C language specification, the common part of the range of the corresponding signed and unsigned integer type shall have identical representation (implying, according to the footnote 31, "interchangeability as arguments to functions"). So, the result of a - b expression is unsigned 1001 as described above, and unless I'm missing something, it is legal to print this specific unsigned value with %d specifier, since it falls within the positive range of int. Printing (unsigned) INT_MAX + 1 with %d would be undefined, but 1001u is fine.
On a typical implementation where int is 32-bit, -1 when converted to an unsigned int is 4,294,967,295 which is indeed ≥ 1000.
Even if you carry out the subtraction in the unsigned world, 1000 - 4,294,967,295 = -4,294,966,295, which wraps modulo 2^32 to 1,001, and that is exactly what you get.
That's why gcc will spit a warning when you compare unsigned with signed. (If you don't see a warning, pass the -Wsign-compare flag.)
You are doing unsigned comparison, i.e. comparing 1000 to 2^32 - 1.
The output is signed because of %d in printf.
N.B. sometimes the behavior when you mix signed and unsigned operands is compiler-specific. I think it's best to avoid them and do casts when in doubt.
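One defensive pattern is to cast both sides to a wider signed type before comparing; a sketch assuming int is 32 bits, so long long can hold both ranges:

#include <stdio.h>

int main(void)
{
    unsigned int a = 1000;
    int b = -1;

    /* With both operands widened, no value changes meaning. */
    if ((long long)a > (long long)b)
        printf("A is BIG! %lld\n", (long long)a - (long long)b);  /* prints 1001 */

    return 0;
}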
#include <stdio.h>

int main()
{
    int a = 1000;
    signed int b = -1, c = -2;

    printf("%d", (unsigned int)b);
    printf("%d\n", (unsigned int)c);
    printf("%d\n", (unsigned int)a);

    if (1000 > -1) {
        printf("\ntrue");
    }
    else
        printf("\nfalse");

    return 0;
}
For this you need to understand the usual arithmetic conversions rather than operator precedence.
In if(1000 > -1) both operands are plain ints, so the comparison is simply true and "true" is printed.
The surprises come when one operand is unsigned, as in a > b earlier: the conversion rules favor the unsigned type, so the signed operand is converted to unsigned first.
-1, converted to an unsigned number, changes into a very big number.
An easy way to compare, maybe useful when you cannot get rid of the unsigned declaration (for example, [NSArray count]): just force the unsigned int to an int.
Please correct me if I am wrong.
if (((int)a)>b) {
....
}
The hardware is designed to compare signed to signed and unsigned to unsigned.
If you want the arithmetic result, convert the unsigned value to a larger signed type first. Otherwise the compiler will assume that the comparison is really between unsigned values.
And -1 is represented as 1111...1111, so it is a very big quantity... the biggest, when interpreted as unsigned.
When comparing a > b, where a is unsigned int and b is int, b is converted to unsigned int, so the signed value -1 becomes the maximum unsigned value (the unsigned range being 0 to 2^32 - 1).
Thus a > b, i.e. (1000 > 4294967295), is false. Hence the else branch printf("a is SMALL! %d\n", a-b); executes.