How does printf know if the variable passed is signed or unsigned - c

Given the following code snippet:
signed char x = 150;
unsigned char y = 150;
printf("%d %d\n", x, y);
The output is:
-106 150
However, I'm using the same format specifier for variables that are represented in memory in the same way. How does printf know whether it's signed or unsigned?
Memory representation in both cases is:
10010110

signed char x = 150; incurs implementation-defined behavior, as 150 is not in the range of OP's signed char.
The 150 is an int; since it does not fit in the signed char range, the conversion undergoes:
the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised. C17dr § 6.3.1.3 3
In this case, x took on the value of 150 - 256.
Good code would not assume this result of -106 and instead would not assign values outside its range to a signed char.
Then ...
Commonly, both signed char x and unsigned char y are promoted to int before being passed as arguments to a ... (variadic) function due to the default argument promotions (types whose entire range fits in int are promoted to int).
Thus printf("%d %d\n", x, y); is not a problem. printf() receives 2 ints, and that matches the "%d" specifiers.
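As a minimal sketch of those promotions (assuming an 8-bit char and a 32-bit int; the -106 is the implementation-defined result described above):

#include <stdio.h>

int main(void)
{
    signed char x = -106;   /* the implementation-defined result of storing 150 */
    unsigned char y = 150;

    /* Both arguments are promoted to int by the default argument
       promotions before printf ever sees them. */
    printf("%d %d\n", x, y);                            /* -106 150 */
    printf("%zu %zu\n", sizeof(x + 0), sizeof(y + 0));  /* both equal sizeof(int) */
    return 0;
}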

Let's first recognize this issue:
char x = 150;
x never had the value 150 to begin with. That 150 is converted to signed char, so x immediately assumes the value -106, since 150 can't be represented in a signed 8-bit value. You might as well have written:
char x = (signed char)150; // same as -106, which is 0x96 in hex
Second, char and short values, when passed as variadic arguments, are automatically promoted to int as part of being put on the stack. For signed types this includes sign extension.
So when you invoke printf("%d %d\n", x, y);, the compiler will massage it to really be this:
printf("%d %d\n", (int)x, (unsigned int)y);
the following gets put onto the stack:
"%d %d\n"
0xffffff96 (int)x
0x00000096 (int)y
When printf runs, it parses the format string (%d %d\n) and sees it needs to interpret the next two items on the stack as signed integers. It reads 0xffffff96 and 0x00000096 from the stack respectively and renders both to the console in decimal form.
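To observe the sign extension described above, one can print the same two promoted arguments in hex; a small sketch (assuming a 32-bit int; the casts avoid passing a negative int to %x):

#include <stdio.h>

int main(void)
{
    signed char x = -106;    /* bit pattern 0x96 */
    unsigned char y = 150;   /* the same bit pattern 0x96 */

    /* x is sign-extended to 0xffffff96; y is zero-extended to 0x00000096 */
    printf("%x %x\n", (unsigned int)x, (unsigned int)y);   /* ffffff96 96 */
    return 0;
}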

How does printf know if the variable passed is signed or unsigned?
The printf function doesn't "know".
You effectively tell it by using either a signed conversion specifier (d or i) or an unsigned conversion specifier (o, u, x or X).
And if you print a signed integer as unsigned or vice versa, printf just does what you told it to do.
I used the same specifier "%d", and it printed different values (the positive one and the negative one).
In your example, you are printing signed and unsigned char values.
signed char x = 150;
The value in x is -106 (8 bits, signed) because 150 is greater than the largest value for signed char. (The signed char type's range is -128 to +127 with any hardware / C compiler that you are likely to encounter.)
unsigned char y = 150;
The value in y is 150 (8 bits unsigned) as expected.
At the call site, the signed char value -106 is sign-extended to a larger integer type. The unsigned char value 150 is converted without sign extension.
By the time printf is called, the values that have been passed to it have different representations.
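A short sketch of that "you tell printf" point: the same bit pattern prints differently depending only on the conversion specifier (assuming a 32-bit int):

#include <stdio.h>

int main(void)
{
    int v = -106;
    printf("%d\n", v);             /* -106: signed interpretation */
    printf("%u\n", (unsigned)v);   /* 4294967190: unsigned interpretation */
    return 0;
}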

Related

Why does the unsigned int still seem signed? [duplicate]

I create an unsigned int and an unsigned char. Then I assign the value -10 to both; the char remains unsigned and gives me a value of 246, but the unsigned int takes the -10 value.
#include <stdio.h>

int main()
{
    unsigned char a;
    unsigned int b;

    a = -10;
    b = -10;
    printf("%d\t%d\n", a, b);
}
Compiling and executing, I get this:
246 -10
I have no idea why the unsigned int still seems signed, and why the char is unsigned.
Reading the book "The C Programming Language, 2nd edition", I can see that char can be unsigned by default depending on the machine.
(I'm running NetBSD as the operating system.)
Why does the int print as signed while I'm declaring it as unsigned int, and why does the char take the value 246?
Is this a compiler or operating system "feature"?
This is undefined behavior: you are passing unsigned integers to %d. Using the wrong format specifier is UB.
If you assign a negative value to an unsigned variable, that's fine: the value is taken modulo UINT_MAX + 1 (or UCHAR_MAX + 1). So (-10) mod (UCHAR_MAX + 1) = 256 - 10 = 246, and b is 4294967296 - 10 = 4294967286. Unsigned integral arithmetic is required to wrap around.
When printf is interpreting these numbers, it finds 246 is suitable for %d, the format specifier for signed int, and 4294967286 is reinterpreted as -10. That's all.
When you assign -10 to an unsigned char variable, the value is reduced modulo UCHAR_MAX + 1, which results in 246 on your platform. Printing an unsigned char value using format %d is fine on most platforms: the value gets implicitly converted to int, which is the correct type for the %d format. So you see 246, as you should.
When you assign -10 to an unsigned int variable, the value is reduced modulo UINT_MAX + 1, which results in some large value (depends on the range of unsigned int on your platform). Printing such large unsigned int value (greater than INT_MAX) using format %d leads to undefined behavior. The output is meaningless.
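A sketch of the well-defined way to print both values (the wrap-around on assignment is fine; only the %d mismatch was the problem):

#include <stdio.h>

int main(void)
{
    unsigned char a = -10;   /* reduced modulo UCHAR_MAX + 1: 246 */
    unsigned int b = -10;    /* reduced modulo UINT_MAX + 1: 4294967286 */

    printf("%d\n", a);       /* fine: a promotes to int and 246 fits */
    printf("%u\n", b);       /* fine: %u matches unsigned int */
    return 0;
}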
%d is the specifier used to print signed int, so it is not strange. Use %u instead.
http://www.cplusplus.com/reference/cstdio/printf/
And when you assign a negative value to an unsigned variable, the value wraps around (it is reduced modulo the type's maximum plus one). That's why you get those surprising values.
printf("%d", a) means that will take the content of variable a and interpret it as a signed int.
Oh, and btw, you are causing undefined behavior, which implies that there's really no reason to ask why something happens: undefined behavior will always be undefined. Avoid it at all costs. Note that the only thing that is undefined here is the printf call. Assigning an out-of-range value to an unsigned variable is defined behavior. However, the opposite is not true: int a = UINT_MAX causes implementation-defined behavior, not undefined behavior.

Problems casting a double into an unsigned char

Why does casting a double 728.3 to an unsigned char produce zero? 728 is 0x2D8, so shouldn't w be 0xD8 (216)?
int w = (unsigned char)728.3;
int x = (int)728.3;
int y = (int)(unsigned char)728.3;
int z = (unsigned char)(int)728.3;
printf( "%i %i %i %i", w, x, y, z );
// prints 0 728 0 216
From the C standard 6.3.1.4p1:
When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
So, unless your unsigned char is at least 10 bits wide, your code invokes undefined behaviour.
Note that the cast explicitly tells the compiler you know what you are doing, thus suppresses a warning.
Supposing that unsigned char has 8 value bits, as is nearly (but not completely) certain for your implementation, the behavior of converting the double value 728.3 to type unsigned char is undefined, as specified by paragraph 6.3.1.4/1 of the standard:
When a finite value of real floating type is converted to an integer
type other than _Bool, the fractional part is discarded (i.e., the
value is truncated toward zero). If the value of the integral part
cannot be represented by the integer type, the behavior is undefined.
This applies to both your w and your y. It does not apply to your x, and the rules covering conversions between integer values (i.e. your z) are different.
Basically, then, there is no answer at the C level for why you see the specific results you do, nor for why I see different ones when I run your code. The behavior is undefined; I can be thankful that it did not turn out to be an outpouring of nasal demons.
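For completeness, a sketch of the defined route that the question's z already takes: go through a wide enough integer type first, since integer-to-unsigned-char narrowing is well defined (modular):

#include <stdio.h>

int main(void)
{
    double d = 728.3;
    /* double -> int is fine here (728 fits in int), and int -> unsigned char
       is well defined: 728 % 256 == 216. */
    unsigned char z = (unsigned char)(int)d;
    printf("%d\n", z);   /* 216 */
    return 0;
}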

What's wrong with this C code?

My sourcecode:
#include <stdio.h>

int main()
{
    char myArray[150];
    int n = sizeof(myArray);

    for (int i = 0; i < n; i++)
    {
        myArray[i] = i + 1;
        printf("%d\n", myArray[i]);
    }
    return 0;
}
I'm using Ubuntu 14 and gcc to compile it, what it prints out is:
1
2
3
...
125
126
127
-128
-127
-126
-125
...
Why doesn't it just count up to 150?
The int value of a char can range from 0 to 255 or from -128 to 127 (typically), depending on the implementation.
Therefore, once the value exceeds 127 in your case, the result wraps around and you get negative values as output.
The signedness of a plain char is implementation defined.
In your case, a char is a signed char, which can hold values in the range -128 to +127.
As you increment i beyond the limit a signed char can hold and assign it to myArray[i], you run into implementation-defined behaviour.
To quote C11, chapter §6.3.1.3:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Because on your platform a char is a signed byte. That means its value range is -128 to 127.
EDIT: Due to all the comments below suggesting this is wrong / not the issue / signedness / what not...
Running this code:
#include <stdio.h>

int main(void)
{
    char a, b;
    unsigned char c, d;
    int si, ui, t;

    t = 200;
    a = b = t;   /* implementation-defined: typically wraps to -56 */
    c = d = t;   /* well-defined: c and d hold 200 */
    si = a + b;
    ui = c + d;
    printf("Signed:%d | Unsigned:%d", si, ui);
    return 0;
}
Prints: Signed:-112 | Unsigned:400
The reason is the same. a and b are signed chars (signed byte-sized variables, 8 bits each). c and d are unsigned. Assigning 200 to the signed variables overflows their range and they get the value -56. In memory, a, b, c and d all hold the same bit pattern, but when used, each type's signedness dictates how the value is interpreted, and in this case it makes a big difference.
Note about standard
It has been noted (in the comments to this answer, as well as in other answers) that the standard doesn't mandate that char is signed. That is true. However, in the case presented by the OP, as well as in the code above, char IS signed.
It seems that your compiler by default treats type char as type signed char. In this case CHAR_MIN is equal to SCHAR_MIN, in turn equal to -128, while CHAR_MAX is equal to SCHAR_MAX, in turn equal to 127 (see header <limits.h>).
According to the C Standard (6.2.5 Types)
15 The three types char, signed char, and unsigned char are
collectively called the character types. The implementation shall
define char to have the same range, representation, and behavior as
either signed char or unsigned char
For signed types, one bit is used as the sign bit. So for the type signed char the maximum value corresponds to the following representation in hexadecimal notation:
0x7F
and is equal to 127. The most significant bit is the sign bit and is equal to 0.
For negative values the sign bit is set to 1; for example, -128 is represented as
0x80
When the value stored in the char in your program reaches its positive maximum 0x7F and is increased, it becomes 0x80, which in decimal notation is -128.
You should explicitly use type unsigned char instead of char if you want the result of the program execution not to depend on compiler settings.
Or in the printf statement you could explicitly cast type char to type unsigned char. For example
printf("%d\n", ( unsigned char )myArray[i]);
Or to compare results you could write in the loop
printf("%d %d\n", myArray[i], ( unsigned char )myArray[i]);

Range of unsigned char in C language

As per my knowledge, the range of unsigned char in C is 0-255, but when I executed the code below it printed 256 as output. How is this possible? I got this code from the book "Test Your C Skills", which says char size is one byte.
main()
{
    unsigned char i = 0x80;
    printf("\n %d", i << 1);
}
Because the operands to <<* undergo integer promotion. It's effectively equivalent to (int)i << 1.
* This is true for most operators in C.
Several things are happening.
First, the expression i << 1 has type int, not char; the operands of << undergo the integer promotions, so i is "promoted" to int, and 0x100 is well within the range of a signed integer.
Secondly, the %d conversion specifier expects its corresponding argument to have type int. So the argument is being interpreted as an integer.
If you want to print the numeric value of a signed char, use the conversion specifier %hhd. If you want to print the numeric value of an unsigned char, use %hhu.
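For example, a sketch contrasting the two (assuming a C99 printf that supports %hhu):

#include <stdio.h>

int main(void)
{
    unsigned char i = 0x80;
    printf("%hhu\n", i);      /* 128: converted back to unsigned char for printing */
    printf("%d\n", i << 1);   /* 256: i is promoted to int before the shift */
    return 0;
}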
For arithmetic operations, char is promoted to int before the operation is performed. See the standard for details. Simplified: the "smaller" type is first brought to the "larger" type before the operation is performed. For the shift operators, the resulting type is that of the promoted left operand, while for e.g. + and other "combining" operators it is the larger of the two, but at least int. The latter means that char and short (and their unsigned counterparts) are always promoted to int, with the result being int, too. (Simplified; for details please read the standard.)
Note also that %d takes an int argument, not a char.
Additional notes:
unsigned char does not necessarily have the range 0..255. Check limits.h; you will find UCHAR_MAX there.
char and "byte" are synonymously used in the standard, but neither are necessarily 8 bits wide (just very likely for modern general purpose CPUs).
As others have already explained, the statement printf("\n %d", i << 1); performs integer promotion, so shifting the integer value 128 left by one results in 256. You can try the following code to print the maximum value of unsigned char. The maximum value of unsigned char has all bits set, so a bitwise NOT operation using "~" gives you the maximum value of 255.
int main()
{
    unsigned char ch = ~0;
    printf("ch = %d\n", ch);
    return 0;
}
Output:
ch = 255

Sign extension query in case of short

Given,
unsigned short y = 0xFFFF;
When I print
printf("%x", y);
I get: 0xFFFF
But when I print
printf("%x", (signed short)y);
I get: 0xFFFFFFFF
Whole program below:
#include <stdio.h>

int main() {
    unsigned short y = 0xFFFF;
    unsigned short z = 0x7FFF;
    printf("%x %x\n", y, z);
    printf("%x %x", (signed short)y, (signed short)z);
    return 0;
}
Sign extension happens when we cast a narrower type to a wider one, but here we are casting unsigned short to signed short.
In both cases, sizeof((signed short)y) and sizeof((signed short)z) print 2 bytes. The short remains 2 bytes, and the sign bit is zero in the case of 0x7FFF.
Any help is very much appreciated!
Output of the first printf is as expected. The second printf produces undefined behavior.
In C, when you pass a value smaller than int as a variadic argument, that value is always implicitly converted to type int. It is not possible to physically pass a short or char variadic argument. That implicit conversion to int is where your "sign extension" takes place.
For this reason, your printf("%x", y); is equivalent to printf("%x", (int) y);. The value that is passed to printf is 0xFFFF of type int. Technically, %x format requires an unsigned int argument, but a non-negative int value is also OK (unless I'm missing some technicality). The output is 0xFFFF.
Conversion to int happens in the second case as well. I.e. your printf("%x", (signed short) y); is equivalent to printf("%x", (int) (signed short) y);. The conversion of 0xFFFF to (signed short) is implementation-defined, because 0xFFFF is apparently out of range of signed short on your platform. But most likely it produces a negative value (-1). When converted to int it produces the same negative value of type int (again, -1 represented as 0xFFFFFFFF for a 32-bit int). The further behavior is undefined, since you are passing a negative int value for format specifier %x, which requires unsigned int argument. It is illegal to use %x with negative int values.
In other words, formally your second printf prints unpredictable garbage. But practically the above explains where that 0xFFFFFFFF came from.
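If the goal is just the 16-bit pattern, a sketch of two ways to get it without the mismatch (%hx converts the promoted argument back to unsigned short):

#include <stdio.h>

int main(void)
{
    unsigned short y = 0xFFFF;
    signed short s = (signed short)y;   /* implementation-defined; typically -1 */

    printf("%hx\n", y);                 /* ffff: printed as unsigned short */
    printf("%hx\n", (unsigned short)s); /* ffff: the sign extension is undone */
    return 0;
}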
Let's break it down into smaller pieces:
Given,
unsigned short y = 0xFFFF;
Assuming a two-byte unsigned short, its maximum value is 2^16 - 1, which is indeed 0xFFFF.
When I print
printf("%x", y);
Due to default argument promotion (printf() is a variadic function), the value of y is implicitly promoted to type int. With the %x format specifier it is treated as unsigned int. Assuming the common two's complement representation and a four-byte int type, since the most significant bit is zero, the bit patterns of int and unsigned int are simply the same.
But when I print
printf("%x", (signed short)y);
What you have done is cast to a signed type that cannot represent the value 0xFFFF. Such a conversion, as the standard says, is implementation-defined, so you can get whatever result. After the implicit conversion to int, you apparently have a bit pattern of 32 ones, which is rendered as 0xFFFFFFFF.
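To make that chain explicit, a sketch (the signed short conversion is implementation-defined; the comments show the typical two's complement results):

#include <stdio.h>

int main(void)
{
    unsigned short y = 0xFFFF;
    signed short s = (signed short)y;   /* implementation-defined; typically -1 */
    int i = s;                          /* sign-extended: typically all bits set */

    printf("%d\n", s);                  /* typically -1 */
    printf("%x\n", (unsigned int)i);    /* typically ffffffff */
    return 0;
}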
