I have the following code:
#include <stdio.h>
int main() {
    unsigned int a = -1;
    int b = -1;
    printf("%x\n", a);
    printf("%x\n", b);
    printf("%d\n", a);
    printf("%d\n", b);
    printf("%u\n", a);
    printf("%u\n", b);
    return 0;
}
The output is:
ffffffff
ffffffff
-1
-1
4294967295
4294967295
I can see that a value is interpreted as signed or unsigned according to the format specifier passed to the printf function. In both cases, the bytes are the same (ffffffff). Then, what is the unsigned word for?
Assigning an int -1 to an unsigned: as -1 does not fit in the range [0...UINT_MAX], multiples of UINT_MAX+1 are added until the result is in range. Evidently UINT_MAX is pow(2,32)-1, or 4294967295, on OP's machine, so a has the value 4294967295.
unsigned int a = -1;
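A minimal sketch of that modular reduction (variable names are mine; the printed values assume a 32-bit unsigned int, so UINT_MAX+1 is 4294967296):
#include <stdio.h>
int main(void) {
    /* Each out-of-range initializer is brought into [0...UINT_MAX]
       by adding UINT_MAX+1, i.e. 4294967296 here. */
    unsigned int w = -1;   /* -1   + 4294967296 = 4294967295 */
    unsigned int x = -2;   /* -2   + 4294967296 = 4294967294 */
    unsigned int y = -100; /* -100 + 4294967296 = 4294967196 */
    printf("%u %u %u\n", w, x, y);
    return 0;
}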
The "%x", "%u" specifier expects a matching unsigned. Since these do not match, "If a conversion specification is invalid, the behavior is undefined.
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined." C11 §7.21.6.1 9. The printf specifier does not change b.
printf("%x\n", b); // UB
printf("%u\n", b); // UB
The "%d" specifier expects a matching int. Since these do not match, more UB.
printf("%d\n", a); // UB
Given undefined behavior, the conclusions are not supported.
In both cases, the bytes are the same (ffffffff).
Even with the same bit pattern, different types may have different values. ffffffff as an unsigned has the value 4294967295. As an int, depending on the signed integer encoding, it has the value -1 (two's complement), -2147483647 (sign-magnitude), or negative zero (ones' complement). As a float, the same bit pattern may be a NaN.
what is the unsigned word for?
unsigned stores a whole number in the range [0 ... UINT_MAX]. It never has a negative value. If code needs a non-negative number, use unsigned. If code needs a number that may be positive, negative, or zero, use int.
Update: to avoid a compiler warning about assigning a signed int to an unsigned, use the below. This negates the unsigned constant 1u, which is well defined as above. The effect is the same as -1, but it conveys the intent directly to the compiler.
unsigned int a = -1u;
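A quick sketch confirming the idiom (the equality with UINT_MAX is guaranteed by the standard; only the printed value assumes a 32-bit unsigned int):
#include <assert.h>
#include <limits.h>
#include <stdio.h>
int main(void) {
    unsigned int a = -1u; /* negating unsigned 1u wraps to UINT_MAX, no signedness warning */
    assert(a == UINT_MAX);
    printf("%u\n", a); /* 4294967295 with a 32-bit unsigned int */
    return 0;
}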
Having unsigned in a variable declaration is mostly useful for the programmers themselves: it says not to treat the variable as negative. As you've noticed, -1 and 4294967295 have exactly the same bit representation in a 4-byte integer. It's all about how you want to treat or see them.
The statement unsigned int a = -1; converts -1 to its two's complement representation and assigns that bit pattern to a. The printf() specifiers x, d, and u show how the bit pattern stored in a looks in different formats.
When you initialize unsigned int a to -1, you are storing the 2's complement of -1 into the memory of a, which is nothing but 0xffffffff, or 4294967295.
Hence, when you print it using the %x or %u format specifier, you get that output.
Specifying the signedness of a variable decides the minimum and maximum values it can store. With unsigned int the range is 0 to 4,294,967,295, while with int the range is -2,147,483,648 to 2,147,483,647 (assuming a 32-bit int).
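Rather than assuming those numbers, the actual limits on a given platform can be read from the <limits.h> macros; a minimal sketch:
#include <limits.h>
#include <stdio.h>
int main(void) {
    printf("int range:          %d to %d\n", INT_MIN, INT_MAX);
    printf("unsigned int range: 0 to %u\n", UINT_MAX);
    return 0;
}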
Hi, I'm currently learning C and there's something that I don't quite understand.
First of all I was told that if I did this:
unsigned int c2 = -1;
printf("c2 = %u\n", c2);
It would output 255, according to a table of type ranges I was given.
But I get a weird result: c2 = 4294967295
Now what's weirder is that this works:
unsigned char c2 = -1;
printf("c2 = %d\n", c2);
But I don't understand: since a char is, well, a char, why does it even print anything? The specifier here is %d and not %u, as it should be for unsigned types.
The following code:
unsigned int c2 = -1;
printf("c2 = %u\n", c2);
Will never print 255. The table you are looking at refers to an unsigned integer of 8 bits. An int in C needs to be at least 16 bits in order to comply with the C standard (UINT_MAX must be at least 2^16-1, per §5.2.4.2.1). Therefore the value you will see is going to be a much larger number than 255. The most common implementations use 32 bits for an int, and in that case you'll see 4294967295 (2^32 - 1).
You can check how many bits are used for any kind of variable on your system with sizeof(type_or_variable) * CHAR_BIT (CHAR_BIT is defined in limits.h and represents the number of bits per byte, which is almost always 8).
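For example, a small sketch of that check (%zu is the matching conversion specifier for the size_t result of sizeof):
#include <limits.h>
#include <stdio.h>
int main(void) {
    printf("char : %zu bits\n", sizeof(char) * CHAR_BIT);
    printf("short: %zu bits\n", sizeof(short) * CHAR_BIT);
    printf("int  : %zu bits\n", sizeof(int) * CHAR_BIT);
    return 0;
}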
The correct code to obtain 255 as output is:
unsigned char c = -1;
printf("c = %hhu\n", c);
Where the hh length modifier means (from man 3 printf):
hh: A following integer conversion corresponds to a signed char or unsigned char argument, or a following n conversion corresponds to a pointer to a signed char argument.
Anything else is just implementation-defined or, even worse, undefined behavior.
In this declaration
unsigned char c2 = -1;
the internal representation of -1 is truncated to one byte and interpreted as unsigned char. That is, all bits of the object c2 are set.
In this call
printf("c2 = %d\n", c2);
the argument that has the type unsigned char is promoted to the type int preserving its value that is 255. This value is outputted as an integer.
In this declaration
unsigned int c2 = -1;
there is no truncation. The integer value -1 that usually occupies 4 bytes (according to the size of the type int) is interpreted as an unsigned value with all bits set.
So in this call
printf("c2 = %u\n", c2);
the maximum value of the type unsigned int is output. It is the maximum value because all bits in the internal representation are set. The conversion from a signed integer type to an unsigned integer type preserves the sign by propagating it across the width of the unsigned integer object.
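Both declarations side by side, as a sketch (variable names are mine; assumes an 8-bit char and a 32-bit int, as in the question):
#include <stdio.h>
int main(void) {
    unsigned char c2a = -1; /* truncated to one byte: all 8 bits set, value 255 */
    unsigned int  c2b = -1; /* no truncation: all 32 bits set, value UINT_MAX */
    printf("c2a = %d\n", c2a); /* promoted to int, prints 255 */
    printf("c2b = %u\n", c2b); /* prints 4294967295 */
    return 0;
}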
In C, integers can have multiple representations, and therefore multiple storage sizes and value ranges. Typical sizes and ranges (assuming the common 16-bit short / 32-bit int platform) are:
short: 2 bytes, -32,768 to 32,767
unsigned short: 2 bytes, 0 to 65,535
int: 4 bytes, -2,147,483,648 to 2,147,483,647
unsigned int: 4 bytes, 0 to 4,294,967,295
I'm trying to figure out how variables really work in C, and I find it strange how the printf function seems to know the difference between different variables of the same size; I'm assuming they're both 16 bits.
#include <stdio.h>
int main(void) {
    unsigned short int positive = 0xffff;
    short int negative = 0xffff;
    printf("%d\n%d\n", positive, negative);
    return 0;
}
Output:
65535
-1
I think we have to distinguish more carefully between the type conversions of the different integer types on the one hand, and the printf format specifiers (which let you force printf to interpret the data a certain way) on the other hand.
Systematically:
printf("%hd %hd\n", positive, negative);
// gives: -1 -1
Both values are interpreted as signed short int by printf, regardless of the declaration.
printf("%hu %hu\n", positive, negative);
// gives: 65535 65535
Both values are interpreted as unsigned short int by printf, regardless of the declaration.
However,
printf("%d %d\n", positive, negative);
// gives: 65535 -1
Both values are implicitly converted to (a longer) int, while the sign is kept.
Finally,
printf("%u %u\n", positive, negative);
// gives 65535 4294967295
Again, both values are implicitly converted to int, while the sign is kept, but then the negative value is interpreted as unsigned. As we can see here, plain int is actually 32-bit (on this system).
Curiously, gcc gives me a warning for the assignment short int negative = 0xffff; only if I compile with -Wpedantic.
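The whole experiment as one self-contained sketch (the commented outputs are those observed above on a 16-bit short / 32-bit two's complement int system; the %u line on the negative value is formally undefined behavior):
#include <stdio.h>
int main(void) {
    unsigned short int positive = 0xffff;
    short int negative = 0xffff; /* implementation-defined conversion */
    printf("%hd %hd\n", positive, negative); /* -1 -1 */
    printf("%hu %hu\n", positive, negative); /* 65535 65535 */
    printf("%d %d\n", positive, negative);   /* 65535 -1 */
    printf("%u %u\n", positive, negative);   /* 65535 4294967295 (UB for the negative value) */
    return 0;
}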
I have been reading K&R lately and have come across a statement:
"Type conversions of Expressions involving unsigned type values are
complicated"
So to understand this, I have written a very small program:
#include <stdio.h>
int main()
{
    signed char x = -1;
    printf("Signed : %d\n", x);
    printf("Unsigned : %d\n", (unsigned)x);
}
The bit form of signed x will be: 1000 0001 = -1
When converted to unsigned, its value has to be 1000 0001 = 129.
But even after the type conversion, it prints -1.
Note: I'm using gcc compiler.
C11 6.3.1.3, paragraph 2:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Since −1 cannot be represented as an unsigned int value, it is converted by adding UINT_MAX+1 once: −1 + (UINT_MAX+1) = UINT_MAX. Thus −1 becomes a very large positive integer.
Also, use %u for unsigned int; otherwise the result is undefined behaviour.
C semantics regarding conversion of integers is defined in terms of the mathematical values, not in terms of the bits. It is good to understand how values are represented as bits, but, when analyzing expressions, you should think about values.
In this statement:
printf("Signed : %d\n",x);
the signed char x is automatically promoted to int. Since x is −1, the new int also has the value −1, and printf prints “-1”.
In this statement:
printf("Unsigned : %d\n",(unsigned)x);
the signed char x is automatically promoted to int. The value is still −1. Then the cast converts this to unsigned. The rule for conversion to unsigned is that UINT_MAX+1 is added to or subtracted from the value as needed to bring it into the range of unsigned. In this case, adding UINT_MAX+1 to −1 once brings the value to UINT_MAX, which is within range. So the result of the conversion is UINT_MAX.
However, this unsigned value is then passed to printf to be printed with the %d conversion. That violates C 2018 7.21.6.1 9, which says the behavior is undefined if the types do not match.
This means a C implementation is allowed to do anything. In your case, it seems what happened is:
The unsigned value UINT_MAX is represented with all one bits.
The printf interpreted the all-one-bits value as an int.
Your implementation uses two’s complement for int types.
In two’s complement, an object with all one bits represents −1.
So printf printed “-1”.
If you had used this correct code instead:
printf("Unsigned : %u\n",(unsigned)x);
then printf would print the value of UINT_MAX, which is likely 4,294,967,295, so printf would print “4294967295”.
Given,
unsigned short y = 0xFFFF;
When I print
printf("%x", y);
I get: 0xFFFF
But when I print
printf("%x", (signed short)y);
I get: 0xFFFFFFFF
Whole program below:
#include <stdio.h>
int main() {
    unsigned short y = 0xFFFF;
    unsigned short z = 0x7FFF;
    printf("%x %x\n", y, z);
    printf("%x %x", (signed short)y, (signed short)z);
    return 0;
}
Sign extension happens when we cast from a smaller to a larger data type, but here we are casting unsigned short to signed short.
In both cases sizeof((signed short)y) and sizeof((signed short)z) give 2 bytes. A short remains 2 bytes, and no sign extension shows up when the sign bit is zero, as in the case of 0x7FFF.
Any help is very much appreciated!
Output of the first printf is as expected. The second printf produces undefined behavior.
In the C language, when you pass a value smaller than int as a variadic argument, that value is always implicitly converted to type int. It is not possible to physically pass a short or char variadic argument. That implicit conversion to int is where your "sign extension" takes place.
For this reason, your printf("%x", y); is equivalent to printf("%x", (int) y);. The value that is passed to printf is 0xFFFF of type int. Technically, %x format requires an unsigned int argument, but a non-negative int value is also OK (unless I'm missing some technicality). The output is 0xFFFF.
Conversion to int happens in the second case as well. I.e. your printf("%x", (signed short) y); is equivalent to printf("%x", (int) (signed short) y);. The conversion of 0xFFFF to (signed short) is implementation-defined, because 0xFFFF is apparently out of range of signed short on your platform. But most likely it produces a negative value (-1). When converted to int it produces the same negative value of type int (again, -1 represented as 0xFFFFFFFF for a 32-bit int). The further behavior is undefined, since you are passing a negative int value for format specifier %x, which requires unsigned int argument. It is illegal to use %x with negative int values.
In other words, formally your second printf prints unpredictable garbage. But practically the above explains where that 0xFFFFFFFF came from.
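If the goal is to display only the 16-bit pattern without the extra FFFF, here is a sketch of two options (the (signed short) conversion itself remains implementation-defined, as explained above):
#include <stdio.h>
int main(void) {
    unsigned short y = 0xFFFF;
    /* %hx makes printf convert the promoted argument back to
       unsigned short before formatting, so only 16 bits appear. */
    printf("%hx\n", (unsigned short)(signed short)y); /* ffff */
    /* Or mask the promoted value down to the low 16 bits explicitly. */
    printf("%x\n", (unsigned)(signed short)y & 0xFFFFu); /* ffff */
    return 0;
}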
Let's break it down into smaller pieces:
Given,
unsigned short y = 0xFFFF;
Assuming a two-byte unsigned short, the maximum value is 2^16-1, which is indeed 0xFFFF.
When I print
printf("%x", y);
Due to default argument promotion (as printf() is a variadic function), the value of y is implicitly promoted to type int. With the %x format specifier it's treated as unsigned int. Assuming the common two's complement representation and a four-byte int type, the most significant bit is zero, so the bit patterns of the int and the unsigned int are simply the same.
But when I print
printf("%x", (signed short)y);
What you have done is cast to a signed type that cannot represent the value 0xFFFF. Such a conversion is, as the standard says, implementation-defined, so you can get whatever result. After the implicit conversion to int, you apparently have a bit pattern of 32 ones, which is displayed as 0xFFFFFFFF.
#include <stdio.h>
int main()
{
    unsigned short a = -1;
    printf("%d", a);
    return 0;
}
This is giving me the output 65535. Why?
When I make a more negative, the output is 65536 + a (the value wraps modulo 2^16).
I know the range of unsigned short int is 0 to 65535.
But why does it wrap around in the range 0 to 65535? What is going on inside?
#include <stdio.h>
int main()
{
    unsigned int a = -1;
    printf("%d", a);
    return 0;
}
The output is -1.
%d is used for a signed decimal integer, so why is it not following the rule here and printing the largest value of the (unsigned) int range?
Why is the output in this part -1?
I know %u is used for printing an unsigned decimal integer.
Why is the behavior undefined in the second snippet and not in the first?
I compiled this with gcc. It's C code.
On my machine, sizeof(short int) is 2 bytes and sizeof(int) is 4 bytes.
In your implementation, short is 16 bits and int is 32 bits.
unsigned short a=-1;
printf("%d",a);
First, -1 is converted to unsigned short. This results in the value 65535. For the precise definition see the standard's "integer conversions". To summarize: the value is taken modulo USHRT_MAX+1.
This value 65535 is assigned to a.
Then for the printf, which uses varargs, the value is promoted back to int. Varargs never pass integer types smaller than int; they're always converted to int. This results in the value 65535, which is printed.
unsigned int a=-1;
printf("%d",a);
First line, same as before but modulo UINT_MAX+1. a is 4294967295.
For the printf, a is passed as an unsigned int. Since %d requires an int, the behavior is undefined by the C standard. But your implementation appears to have reinterpreted the unsigned value 4294967295, which has all bits set, as a signed integer with all bits set, i.e. the two's complement value -1. This behavior is common but not guaranteed.
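A sketch of both snippets with matching conversion specifiers, so neither call relies on that unguaranteed reinterpretation:
#include <stdio.h>
int main(void) {
    unsigned short a = -1; /* wraps to 65535 (modulo USHRT_MAX+1, with 16-bit short) */
    unsigned int b = -1;   /* wraps to 4294967295 (modulo UINT_MAX+1, with 32-bit int) */
    printf("%hu\n", a); /* 65535: %hu matches unsigned short */
    printf("%u\n", b);  /* 4294967295: %u matches unsigned int */
    return 0;
}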
Variable assignment writes to the amount of memory given by the type of the variable (e.g., short is 2 bytes and int is 4 bytes on typical 32-bit hardware). The signedness of the variable does not matter for the assignment itself; what matters is how you are going to access it.
When you assign to a short (signed or unsigned), you assign the value to 2 bytes of memory. If you then use %d in printf, printf considers it an integer (4 bytes on your hardware), and the two higher-order bytes will be 0: you get [0|0] (two higher bytes) [-1] (two lower bytes). Because of those new higher-order bytes introduced by the promotion for %d, your sign bit is hidden in the lower-order bytes, so printf considers the value non-negative (the higher-order bytes being 0) and you see the positive value. To get a negative number in this case you need to use %hd.
In the second case you assigned to 4 bytes of memory, and the most significant bit (the sign bit) got set to 1 (meaning negative) during the assignment, hence you see the negative number with %d.
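Illustrating that last point with a small sketch (the %hd result relies on the typical two's complement reinterpretation described above):
#include <stdio.h>
int main(void) {
    unsigned short a = -1; /* stores 0xFFFF in 2 bytes */
    printf("%d\n", a);     /* promoted to int with zero upper bytes: 65535 */
    printf("%hd\n", a);    /* low 2 bytes read back as signed short: -1 */
    return 0;
}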