Why does unsigned int still act signed? [duplicate] - c

This question already has answers here:
C Unsigned int providing a negative value?
(3 answers)
Closed 5 years ago.
I create an unsigned int and an unsigned char. Then I assign the value -10 to both: the char stays unsigned and gives me 246, but the unsigned int keeps the value -10.
#include <stdio.h>
int main()
{
    unsigned char a;
    unsigned int b;
    a = -10;
    b = -10;
    printf("%d\t%d\n", a, b);
}
Compiling and running it, I get this:
246 -10
I have no idea why the unsigned int still acts signed, or why the char is unsigned.
Reading the book "The C Programming Language, 2nd edition", I can see that char can be unsigned by default depending on the machine.
(I'm running NetBSD as an operating system.)
Why is the int signed while I declared it as unsigned int, and why does the char take the value 246?
Is this a compiler or operating system "feature"?

Passing an unsigned int whose value exceeds INT_MAX to %d is undefined behavior: using the wrong format specifier is UB. (The unsigned char is promoted to int before the call, so %d is fine for it.)
If you assign a negative value to an unsigned variable, that is well defined: the value is reduced modulo UINT_MAX + 1 (or UCHAR_MAX + 1). So a becomes -10 + 256 = 246, and b becomes -10 + 4294967296 = 4294967286. Unsigned integer types are required to wrap around like this.
When printf interprets these numbers, 246 fits a signed int, so %d prints it as-is, while 4294967286 is reinterpreted by your implementation as -10. That's all.
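For illustration (not part of the original answer), here is the same program with matching unsigned conversion specifiers; the commented output assumes an 8-bit unsigned char and a 32-bit unsigned int:
#include <stdio.h>

int main(void)
{
    unsigned char a = -10;   /* reduced modulo UCHAR_MAX + 1: 256 - 10 = 246 */
    unsigned int  b = -10;   /* reduced modulo UINT_MAX + 1: 4294967296 - 10 = 4294967286 */

    /* %u matches the unsigned values, so nothing is reinterpreted. */
    printf("%u\t%u\n", a, b);   /* prints: 246    4294967286 */
    return 0;
}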

When you assign -10 to an unsigned char variable, the value is reduced modulo UCHAR_MAX + 1, which results in 246 on your platform. Printing an unsigned char value using format %d is fine on most platforms. The value gets implicitly converted to int, which is the correct type for %d format. So, you see that 246 as you should.
When you assign -10 to an unsigned int variable, the value is reduced modulo UINT_MAX + 1, which results in some large value (depends on the range of unsigned int on your platform). Printing such large unsigned int value (greater than INT_MAX) using format %d leads to undefined behavior. The output is meaningless.
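As a sketch of the reductions described above (assuming the usual 8-bit unsigned char and 32-bit unsigned int), the same numbers can be computed explicitly from the <limits.h> macros:
#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned char a = -10;
    unsigned int  b = -10;

    /* -10 reduced modulo UCHAR_MAX + 1 and UINT_MAX + 1 respectively */
    printf("%u %u\n", (unsigned int)(UCHAR_MAX + 1 - 10), UINT_MAX - 9);
    printf("%d %u\n", a, b);   /* %d is fine for a (it is promoted to int); b needs %u */
    return 0;
}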

%d is the specifier used to print signed int, so it is not strange. Use %u instead.
http://www.cplusplus.com/reference/cstdio/printf/
And when you assign a negative value to an unsigned variable, it wraps around (modular arithmetic). That's why you get these seemingly strange values.
printf("%d", a) means that will take the content of variable a and interpret it as a signed int.
Oh, and by the way: you are causing undefined behavior, which means there is really no point in asking why something happens; undefined behavior is undefined, so avoid it at all costs. Note that the only undefined part here is the printf statement. Assigning an out-of-range value to an unsigned variable is defined behavior. The opposite is not true, though: int a = UINT_MAX produces an implementation-defined result (or raises an implementation-defined signal).
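A small sketch of that last point (illustrative, not from the original answer): the unsigned direction is fully defined, while the signed direction is only implementation-defined:
#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned int u = -1;   /* well defined: u == UINT_MAX */
    int i = UINT_MAX;      /* implementation-defined result (commonly -1 on two's complement) */

    printf("%u %d\n", u, i);
    return 0;
}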

Related

How does printf know whether the variable passed is signed or unsigned

Given the following code snippet:
signed char x = 150;
unsigned char y = 150;
printf("%d %d\n", x, y);
The output is:
-106 150
However, I'm using the same format specifier for variables that are represented in memory in the same way. How does printf know whether it's signed or unsigned?
Memory representation in both cases is:
10010110
signed char x = 150; incurs implementation-defined behavior, as 150 is not in the range of OP's signed char.
The 150 is an int, and since it does not fit in the signed char range it undergoes:
the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised. C17 § 6.3.1.3 ¶3
In this case, x took on the value of 150 - 256.
Good code would not assume this result of -106 and instead would not assign values outside its range to a signed char.
Then ...
Commonly, both signed char x and unsigned char y are promoted to int before being passed as arguments to a ... (variadic) function, due to the default argument promotions (integer types whose values fit in the range of int are promoted to int).
Thus printf("%d %d\n", x, y); is not a problem: printf() receives two ints, which matches the "%d" specifiers.
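To make that promotion visible, here is an illustrative sketch (not code from the original question) assuming an 8-bit, two's-complement signed char:
#include <stdio.h>

int main(void)
{
    signed char   x = -106;   /* the value OP's x ends up with, written directly to stay in range */
    unsigned char y = 150;

    /* Both arguments undergo the default argument promotions to int, so %d matches. */
    printf("%d %d\n", x, y);             /* -106 150 */
    printf("%d %d\n", (int)x, (int)y);   /* identical; the casts only make the promotion explicit */
    return 0;
}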
Let's first recognize this issue:
char x = 150;
x never had the value 150 to begin with. That 150 is implicitly converted to signed char, so x immediately takes the value -106, since 150 can't be represented in a signed 8-bit value. You might as well have written:
char x = (signed char)150; // same as -106, which is 0x96 in hex
Second, char and short values passed as variadic arguments are promoted to int as part of being placed on the stack. For signed types this includes sign extension.
So when you invoke printf("%d %d\n", x, y);, the compiler will massage it to really be this:
printf("%d %d\n", (int)x, (unsigned int)y);
the following gets put onto the stack:
"%d %d\n"
0xffffff96 (int)x
0x00000096 (unsigned int)y
When printf runs, it parses the format string (%d %d\n) and sees it needs to interpret the next two items on the stack as signed integers. It reads 0xffffff96 and 0x00000096 as those values respectively and renders both to the console in decimal form.
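To see those two promoted bit patterns directly, here is a sketch assuming a 32-bit int and two's complement (the casts to unsigned int keep the %x conversions well defined):
#include <stdio.h>

int main(void)
{
    signed char   x = -106;
    unsigned char y = 150;

    /* x is sign-extended (0xffffff96), y is zero-extended (0x00000096). */
    printf("%08x %08x\n", (unsigned int)(int)x, (unsigned int)y);
    return 0;
}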
How does printf know whether the variable passed is signed or unsigned?
The printf function doesn't "know".
You effectively tell it by using either a signed conversion specifier (d or i) or an unsigned conversion specifier (o, u, x or X).
And if you print a signed integer as unsigned or vice versa, printf just does what you told it to do.
I used the same specifier "%d", and it printed different values (the positive one and the negative one).
In your example, you are printing signed and unsigned char values.
signed char x = 150;
The value in x is -106 (8 bits, signed) because 150 is greater than the largest value a signed char can hold. (The signed char type's range is -128 to +127 on any hardware / C compiler that you are likely to encounter.)
unsigned char y = 150;
The value in y is 150 (8 bits unsigned) as expected.
At the call site, the signed char value -106 is sign-extended to a larger integer type. The unsigned char value 150 is converted without sign extension.
By the time printf is called, the values that have been passed to it have different representations.
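If you want printf itself to treat the promoted values as char-sized again, C99's hh length modifier does that; a minimal sketch (assumes an 8-bit, two's-complement char):
#include <stdio.h>

int main(void)
{
    signed char   x = -106;
    unsigned char y = 150;

    /* hh tells printf to convert the promoted int back to signed/unsigned char before printing. */
    printf("%hhd %hhu\n", x, y);   /* -106 150 */
    return 0;
}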

Information lost with type conversion in C [duplicate]

This question already has answers here:
assigning 128 to char variable in c
(3 answers)
Closed 4 years ago.
#include <stdio.h>
int main(int argc, char const *argv[])
{
    int x = 128;
    char y = x;
    int z = y;
    printf("%d\n", z);
    return 0;
}
I don't understand why this program prints -128.
I have tried converting 128 to binary, but I'm still confused about how the C compiler converts int to char and char to int.
Note: on my machine, sizeof(char) = 1 and sizeof(int) = 4.
Assuming a char is signed and 8 bits, its range is -128 to 127, which means the value 128 is out of range for a char. So the value is converted in an implementation-defined manner. In the case of gcc, the result is reduced modulo 2^8 to bring it into range.
What this basically means is that the low-order byte of the int value 128 is assigned to the char variable. 128 as a 32-bit value in hex is 0x00000080, so 0x80 is assigned to y. Assuming two's complement representation, this represents the value -128. When this value is then assigned to z, which is an int, that value can be represented in an int, so that's what gets assigned to it, and its representation is 0xFFFFFF80.
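A sketch of those steps with the intermediate byte made visible (assumes an 8-bit signed char and a 32-bit two's-complement int, as on the asker's machine):
#include <stdio.h>

int main(void)
{
    int  x = 128;   /* 0x00000080 */
    char y = x;     /* low byte 0x80; with a signed 8-bit char this is -128 (implementation-defined) */
    int  z = y;     /* sign-extended back to int: 0xFFFFFF80, i.e. -128 */

    printf("%d 0x%02X %d\n", x, (unsigned char)y, z);   /* 128 0x80 -128 */
    return 0;
}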
The C standard does not specify whether char is signed or unsigned. In fact, assigning a char a value outside the basic execution character set is implementation-defined. You can use the macros in <limits.h> to verify.
I suspect that on your system char is signed, which makes the maximum value 127. Converting a value that does not fit into a signed type gives an implementation-defined result, so there are no guarantees about the output. In this case, it looks like it wraps around.

Difference between int and unsigned int in C [duplicate]

This question already has answers here:
Unsigned and Signed int and printf
(2 answers)
Closed 6 years ago.
I know that the range of a 32-bit unsigned int is 0 <= k <= 2^32 - 1.
However, when I write this in C (Visual Studio 2015):
int main(void){
    unsigned int k = -10;
    printf("%d", k);
}
then why does the computer print -10 on the screen? I think there should be an error.
int stores signed numbers by default, which means it can range from -2,147,483,648 to 2,147,483,647. An unsigned int means you won't be using negative numbers, so the range is much larger, because you have freed up the leftmost bit, which is normally used to indicate whether the value is negative. So an unsigned int can go from 0 to 4,294,967,295. This applies to types like char as well: a char normally goes from -128 to 127, and an unsigned char, which holds exactly one byte, goes from 0 to 255.
Visual Studio (and most compilers) should give you a warning for trying to store a signed value into an unsigned type.
When you use printf("%d", k), the %d tells printf() to print a signed int, so that is what it does. If you want printf() to print an unsigned int, you need to use printf("%u", k).
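The exact ranges on a given implementation can be read from <limits.h>; a quick illustrative check (the commented values are the typical ones, not guaranteed):
#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("int:           %d .. %d\n", INT_MIN, INT_MAX);    /* typically -2147483648 .. 2147483647 */
    printf("unsigned int:  0 .. %u\n", UINT_MAX);             /* typically 4294967295 */
    printf("char:          %d .. %d\n", CHAR_MIN, CHAR_MAX);  /* -128 .. 127 if char is signed */
    printf("unsigned char: 0 .. %d\n", UCHAR_MAX);            /* typically 255 */
    return 0;
}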
In hexadecimal form, -10 (as a 32-bit two's complement value) is 0xFFFFFFF6, so
unsigned int k = -10; means
unsigned int k = 0xFFFFFFF6;
and when printing this value, if you write
printf("%d", k);, the value (0xFFFFFFF6) is interpreted as a signed integer because of the %d specifier (see http://www.cplusplus.com/reference/cstdio/printf/ about format specifiers).
If you write printf("%u", k);, the value (0xFFFFFFF6) is interpreted as an unsigned integer and a value between 0 and 2^32 - 1 is printed.
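Putting the three views of the same bit pattern side by side, as a sketch assuming a 32-bit, two's-complement unsigned int (the %d line relies on the implementation's reinterpretation and is not guaranteed by the standard):
#include <stdio.h>

int main(void)
{
    unsigned int k = -10;   /* k holds 0xFFFFFFF6, i.e. 4294967286 */

    printf("%d\n", k);   /* commonly prints -10, but this is not guaranteed */
    printf("%u\n", k);   /* 4294967286, the value k actually holds */
    printf("%X\n", k);   /* FFFFFFF6 */
    return 0;
}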

What's wrong with this C code?

My sourcecode:
#include <stdio.h>
int main()
{
    char myArray[150];
    int n = sizeof(myArray);
    for(int i = 0; i < n; i++)
    {
        myArray[i] = i + 1;
        printf("%d\n", myArray[i]);
    }
    return 0;
}
I'm using Ubuntu 14 and gcc to compile it, and what it prints is:
1
2
3
...
125
126
127
-128
-127
-126
-125
...
Why doesn't it just count up to 150?
The int value of a char can range from 0 to 255 or from -128 to 127, depending on the implementation.
Therefore, once the value goes past 127 in your case, the conversion to char wraps around and you get negative values as output.
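A sketch of the usual fix (also suggested further down in this thread): make the array unsigned char, whose 0..255 range covers all the values 1..150:
#include <stdio.h>

int main()
{
    unsigned char myArray[150];   /* unsigned char holds 0..255, so 1..150 all fit */
    int n = sizeof(myArray);
    for(int i = 0; i < n; i++)
    {
        myArray[i] = i + 1;
        printf("%d\n", myArray[i]);   /* counts 1..150 as intended */
    }
    return 0;
}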
The signedness of a plain char is implementation defined.
In your case, a char is a signed char, which can hold values in the range -128 to +127.
As you increment i beyond the limit a signed char can hold and assign it to myArray[i], you run into implementation-defined behaviour.
To quote C11, chapter §6.3.1.4,
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Because a char is a SIGNED BYTE here. That means its value range is -128 to 127.
EDIT: Due to all the comments below suggesting this is wrong / not the issue / about signedness / whatnot...
Running this code:
#include <stdio.h>

int main(void)
{
    char a, b;
    unsigned char c, d;
    int si, ui, t;

    t = 200;
    a = b = t;
    c = d = t;
    si = a + b;
    ui = c + d;

    printf("Signed:%d | Unsigned:%d", si, ui);
    return 0;
}
Prints: Signed:-112 | Unsigned:400
Try yourself
The reason is the same. a and b are signed chars (signed variables one byte, 8 bits, in size). c and d are unsigned. Assigning 200 to the signed variables wraps around and they get the value -56. In memory, a, b, c and d all hold the same bit pattern, but when used, their type's signedness dictates how the value is interpreted, and in this case it makes a big difference.
Note about standard
It has been noted (in the comments to this answer, as well as in other answers) that the standard doesn't mandate that char is signed. That is true. However, in the case presented by OP, as well as in the code above, char IS signed.
It seems that your compiler by default treats type char as type signed char. In this case CHAR_MIN is equal to SCHAR_MIN, which is -128, while CHAR_MAX is equal to SCHAR_MAX, which is 127 (see header <limits.h>).
According to the C Standard (6.2.5 Types)
15 The three types char, signed char, and unsigned char are
collectively called the character types. The implementation shall
define char to have the same range, representation, and behavior as
either signed char or unsigned char
For signed types, one bit is used as the sign bit. So for the type signed char the maximum value corresponds to the following representation in hexadecimal notation
0x7F
and is equal to 127. The most significant bit is the sign bit and is equal to 0.
For negative values the sign bit is set to 1; for example, -128 is represented as
0x80
When the value stored in the char in your program reaches its positive maximum 0x7F and is increased, it becomes 0x80, which in decimal notation (for a signed char) is -128.
You should explicitly use type unsigned char instead of char if you want the result of the program not to depend on the compiler settings.
Or, in the printf statement, you could explicitly cast type char to type unsigned char. For example:
printf("%d\n", ( unsigned char )myArray[i]);
Or to compare results you could write in the loop
printf("%d %d\n", myArray[i], ( unsigned char )myArray[i]);

Initializing unsigned short int to signed value

#include <stdio.h>
int main()
{
    unsigned short a = -1;
    printf("%d", a);
    return 0;
}
This gives me the output 65535. Why?
When I make the assigned value more negative, the output is 65536 minus its magnitude (it stays within 0 to 2^16 - 1 = 65535).
I know the range of unsigned short int is 0 to 65535.
But why does it wrap around within the range 0 to 65535? What is going on inside?
#include <stdio.h>
int main()
{
    unsigned int a = -1;
    printf("%d", a);
    return 0;
}
Output is -1.
%d is used for a signed decimal integer, so why is it not following that rule here and printing the largest value of the (unsigned) int range?
Why is the output -1 in this part?
I know %u is used for printing an unsigned decimal integer.
Why is the behaviour undefined in the second snippet and not in the first?
I compiled this with gcc; it's C code.
On my machine sizeof(short int) is 2 bytes and sizeof(int) is 4 bytes.
In your implementation, short is 16 bits and int is 32 bits.
unsigned short a=-1;
printf("%d",a);
First, -1 is converted to unsigned short. This results in the value 65535. For the precise definition see "integer conversions" in the standard. To summarize: the value is reduced modulo USHRT_MAX + 1.
This value 65535 is assigned to a.
Then, for the printf, which uses varargs, the value is promoted back to int. Varargs never pass integer types smaller than int; they are always converted to int. This results in the value 65535, which is printed.
unsigned int a=-1;
printf("%d",a);
First line, same as before but modulo UINT_MAX+1. a is 4294967295.
For the printf, a is passed as an unsigned int. Since %d requires an int, the behavior is undefined by the C standard. But your implementation appears to have reinterpreted the unsigned value 4294967295, which has all bits set, as a signed integer with all bits set, i.e. the two's complement value -1. This behavior is common but not guaranteed.
Variable assignment writes to the amount of memory that the variable's type occupies (e.g. a short is 2 bytes and an int is 4 bytes on typical 32-bit hardware). The signedness of the variable does not matter for the assignment itself; what matters is how you later access the value. When you assign to a short (signed or unsigned), you write to 2 bytes of memory.

Now, if you use '%d' in printf, printf treats the argument as an 'integer' (4 bytes on your hardware) and the two MSBs are 0, so you effectively have [0|0] (two MSBs) [-1] (two LSBs). Because of the new MSBs introduced by the promotion for '%d', your sign bit is hidden in the LSBs, so printf considers the value non-negative (the MSBs being 0) and you see the positive number. To get a negative number in the first case you need to use '%hd'. In the second case you assigned to 4 bytes of memory, and the sign bit of that int-sized value became 1 (meaning negative) during the assignment, hence you see the negative number with '%d'. Hope that explains it. For more clarification please comment on the answer.

NB: I used 'MSB' as shorthand for the higher-order byte(s). Please read it according to context (e.g. for 'SIGN bit' read 'most significant bit'). Thanks.
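A short sketch tying this together (the h length modifier itself is standard; the %hd line relies on an implementation-defined conversion of 65535 back to short and commonly prints -1):
#include <stdio.h>

int main()
{
    unsigned short a = -1;   /* converted to USHRT_MAX, i.e. 65535 with a 16-bit short */

    printf("%d\n", a);    /* a is promoted to int 65535, so this prints 65535 */
    printf("%hd\n", a);   /* value converted back to short before printing: commonly -1 */
    return 0;
}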
