Why is it printing 255? - C

#include <stdio.h>

int main()
{
    unsigned char a = -1;
    printf("%d", a);
    printf("%u", a);
}
When I executed the above program I got 255 255 as the answer.
We know negative numbers are stored in 2's complement.
Since it is 2's complement, the representation would be
1111 1111 --> 2's complement.
But in the above we are printing with %d (int), and an integer is four bytes.
My assumption is that even though it is a character, we are forcing the compiler to treat it as an integer,
so it internally uses the sign extension concept:
1111 1111 1111 1111 1111 1111 1111 1111.
According to the above representation it has to be -1 in the first case, since %d is signed.
In the second case it has to print (2^32 - 1), but it is printing 255 and 255.
Why is it printing 255 in both cases?
Tell me if my assumption is wrong and give me the real interpretation.

Your assumption is wrong; the character will "roll over" to 255, then be padded to the size of an integer. Assuming a 32-bit integer:
11111111
would be padded to:
00000000 00000000 00000000 11111111
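
A minimal sketch to see the zero-padding (assumes 8-bit char and 32-bit int; the variable names are mine):

#include <stdio.h>

int main(void)
{
    unsigned char a = -1;   /* -1 wraps modulo 256 to 255 (0xFF) */
    int widened = a;        /* value-preserving: the upper 3 bytes are zero */

    printf("%d\n", widened);            /* prints 255 */
    printf("%#x\n", (unsigned)widened); /* prints 0xff: padded, not sign-extended */
    return 0;
}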

Up to the representation of a, you are correct. However, due to the default argument promotions, the %d and %u conversions of the printf() function both receive an int argument here. That is, your code is the same as if you had written
#include <stdio.h>

int main() {
    unsigned char a = -1;
    printf("%d", (int)a);
    printf("%u", (int)a);
}
The moment you assign -1 to a, you lose the information that it once was a signed value; the logical value of a is 255. When you then convert an unsigned char to an int, the compiler preserves the logical value of a, and the code prints 255.
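
A small runnable sketch of that contrast, under the usual assumption of an 8-bit char:

#include <stdio.h>

int main(void)
{
    signed char   s = -1;  /* bit pattern 0xFF, logical value -1  */
    unsigned char u = -1;  /* bit pattern 0xFF, logical value 255 */

    /* Both are promoted to int when passed to printf; the conversion
       preserves each operand's logical value, not its bit pattern. */
    printf("%d %d\n", s, u);  /* prints: -1 255 */
    return 0;
}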

The compiler doesn't know what type the extra parameters to printf should be, since the only thing that specifies how they are read is the format string, which is irrelevant at compile time.
What actually happens behind the scenes is that the caller applies the default argument promotions (so a char is widened to int before the call), and the callee (printf) then reads each argument back assuming the type implied by the format string.
When the passed type and the format string disagree, the effect is roughly as if printf reinterpreted raw memory, like this:
char a = -1;
int * p = (int*)&a; // BAD CAST
int numberToPrint = *p; // Accesses 3 extra bytes from somewhere on the stack
Since you're likely running on a little endian CPU, the 4-byte int 0x12345678 is arranged in memory as | 0x78 | 0x56 | 0x34 | 0x12 |
If the 3 bytes on the stack following a are all 0x00 (they probably are due to stack alignment, but it's NOT GUARANTEED), the memory looks like this:
&a: | 0xFF |
(int*)&a: | 0xFF | 0x00 | 0x00 | 0x00 |
which evaluates to *(int*)&a == 0x000000FF.
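
If you want to look at that layout yourself, here is a small sketch (my own, not from the question) that dumps the bytes of an int through an unsigned char pointer, which is a legal way to inspect an object's representation:

#include <stdio.h>

int main(void)
{
    int n = 0x12345678;
    unsigned char *p = (unsigned char *)&n;  /* byte-wise view is allowed */

    for (unsigned i = 0; i < sizeof n; i++)
        printf("| 0x%02X ", p[i]);  /* 78 56 34 12 on a little endian CPU */
    printf("|\n");
    return 0;
}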

unsigned char runs from 0 to 255, so the negative number -1 will print 255, -2 will print 254, and so on.
signed char runs from -128 to +127, so you would get -1 from the same printf(), which is not the case with unsigned char.
Once you assign to a char, the remaining bytes are zero-padded when the value is widened again, so your assumption of 2^32 - 1 is wrong.
Negative numbers are represented using 2's complement (implementation dependent). So
    1 = 0000 0001
and to get -1 we take the 2's complement:
    2's complement of 0000 0001 = 1111 1111 = 255 (as an unsigned value)
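
The roll-over is easy to check; a minimal sketch assuming the usual 8-bit unsigned char:

#include <stdio.h>

int main(void)
{
    unsigned char a = -1, b = -2, c = -128;

    /* Negative initializers wrap modulo 256: 256-1, 256-2, 256-128. */
    printf("%d %d %d\n", a, b, c);  /* prints: 255 254 128 */
    return 0;
}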

It is printing 255 simply because that is the intent of ISO/IEC 9899:
H.2.2 Integer types
1 The signed C integer types int, long int, long long int, and the corresponding
unsigned types are compatible with LIA−1. If an implementation adds support for the
LIA−1 exceptional values ‘‘integer_overflow’’ and ‘‘undefined’’, then those types are
LIA−1 conformant types. C’s unsigned integer types are ‘‘modulo’’ in the LIA−1 sense
in that overflows or out-of-bounds results silently wrap. An implementation that defines
signed integer types as also being modulo need not detect integer overflow, in which case,
only integer divide-by-zero need be detected.
Given this, printing 255 is exactly what LIA-1 would expect.
Otherwise, even if your implementation doesn't support C99's LIA-1 annex, the result is the same: the core standard already defines conversion to an unsigned type as modulo arithmetic, so the wrap to 255 is guaranteed.
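
The "modulo" behaviour the annex refers to is easy to observe; a minimal sketch:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned int u = UINT_MAX;
    u = u + 1;  /* the out-of-range result silently wraps to 0 ("modulo" in
                   the LIA-1 sense); well defined for unsigned types */
    printf("%u\n", u);  /* prints: 0 */
    return 0;
}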

Related

Explanation for the given code output in C

I have written code in C as
#include <stdio.h>

int main()
{
    char a = 128;
    char b = -128;
    printf("%c", a);
    printf("%c", b);
}
The output of the above code is ÇÇ.
Using 128 or -128, the output comes out the same. Why? Please explain using binary if possible.
A signed char type typically has a range of -128 to 127. Since 128 is outside this range, your compiler converts it to a value with the same 8-bit bit pattern, and that is -128.
The literal -128 has type int and, in a 32-bit 2's complement representation, has the bit pattern:
1111 1111 1111 1111 1111 1111 1000 0000
When you assign it to a char there is an implicit conversion such that only the least significant byte is kept: 1000 0000, the same pattern you get from 128. Hence the result is the same.
Strictly, the behaviour is implementation-defined if char is signed, and the standard defines the behaviour in somewhat arcane "as-if" terms for unsigned char. Whether char itself is signed or unsigned is itself implementation-defined, as are the actual width and therefore the range of a char. In practice, though, the above explanation is what is happening here and is the most likely behaviour for any implementation with an 8-bit char; it makes no difference whether char is signed or unsigned.
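
You can make the shared bit pattern visible by printing the raw byte; a small sketch assuming an 8-bit char:

#include <stdio.h>

int main(void)
{
    char a = 128;   /* out of range for a signed 8-bit char: becomes -128 */
    char b = -128;

    /* View the raw bytes: both hold the pattern 1000 0000 (0x80). */
    printf("%02X %02X\n", (unsigned)(unsigned char)a,
                          (unsigned)(unsigned char)b);  /* prints: 80 80 */
    return 0;
}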

What is the reasoning behind char to int conversion output?

For example:
int x = 65535;
char y = x;
printf("%d\n", y);
This will output -1. Any way to derive this by hand?
In order to derive this by hand you need to know several implementation-defined aspects of your system, namely:
- whether char is signed or not
- if char is signed, what representation scheme is used
- how your system treats narrowing conversions of values that cannot be represented exactly in the narrow type
Although the standard allows implementations to decide, a very common approach to narrowing conversions is to truncate the bits that do not fit in the narrow type. Assuming that this is the approach taken by your system, the first part of figuring out the output is to find the last eight bits of the int value being converted. In your case, 65535 is 1111111111111111 in binary, so the last eight bits are all ones.
Now you need to decide the interpretation of 11111111. On your system char is signed, and the system uses two's complement representation of negative values, so this pattern is interpreted as an eight-bit -1 value.
When you call printf, the eight-bit signed char value is promoted to int, so the preserved value -1 is printed.
On systems where char is unsigned by default, the same pattern would be interpreted as 255.
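
The hand derivation can be mirrored in code. A sketch under the same assumptions (truncating narrowing, signed 8-bit char, two's complement); the variable names are mine:

#include <stdio.h>

int main(void)
{
    int x = 65535;                 /* 0xFFFF */
    unsigned char low = x & 0xFF;  /* truncation keeps the last 8 bits: 0xFF */
    signed char reinterpreted = (signed char)low;  /* implementation-defined;
                                                      -1 on two's complement */

    printf("%d\n", low);            /* prints: 255 */
    printf("%d\n", reinterpreted);  /* prints: -1 on a typical system */
    return 0;
}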
65535 is 0xffff.
When converting to char, the upper (left) bits are cut off:
0xffff AND 0xff is 0xff
When passing a char to a function, it is expanded to int. As the leftmost bit of the char is 1, it is sign-extended, so it becomes 0xffffffff (32 bits).
This is -1, so -1 is printed.
As Dasblinkenlight points out, it matters whether the char is signed or unsigned (whether as a default of the implementation or via the declaration unsigned char y). For an unsigned char, my last lines would read:
When passing an unsigned char to a function, it is expanded to unsigned int. As it is unsigned, zeroes are added at the left, so it becomes 0x000000ff (32 bits).
This is 255, so 255 is printed.
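
A sketch that makes sign extension versus zero extension visible, assuming a signed 8-bit char and a 32-bit int:

#include <stdio.h>

int main(void)
{
    signed char   sc = -1;   /* bit pattern 0xFF */
    unsigned char uc = 0xFF;

    /* Default argument promotions: sc sign-extends, uc zero-extends. */
    printf("%x\n", (unsigned)sc);  /* prints: ffffffff (with 32-bit int) */
    printf("%x\n", (unsigned)uc);  /* prints: ff */
    return 0;
}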
65535 when converted to binary is 1111111111111111,
and when you assign this to a character variable it gets trimmed to the least significant 8 bits, which is 11111111.
This is equivalent to -1: the 2's complement of a number gives its negative value, and the 2's complement of 00000001 is
11111110 + 1 = 11111111, which is -1.
int x = 65535;
The value of x, 65535, is 0xffff in hexadecimal (binary 1111 1111 1111 1111), a 2-byte (16-bit) number.
The value of x is then converted to the character y. The character data type is only 1 byte (8 bits), so the high-order (left) bits of x are cut off in the assignment, because y has only 1 byte of memory.
So the value of y is 0xff (binary 1111 1111), which equals -1 in the signed integer representation.
And we display y's value as an integer via %d.

tilde operator query in C working differently

I came across this question.
What is the output of this C code?
#include <stdio.h>

int main()
{
    unsigned int a = 10;
    a = ~a;
    printf("%d\n", a);
}
I know what the tilde operator does. Now, 10 can be represented as 1010 in binary, and if I bitwise-NOT it I get 0101, so I do not understand the output -11. Can anyone explain?
The bitwise negation will not result in 0101. Note that an int contains at least 16 bits. So, for 16 bits, it will generate:
 a = 0000 0000 0000 1010
~a = 1111 1111 1111 0101
So we expect to see a large number (with 16 bits that would be 65525), but you use %d as the format specifier. This means you interpret the integer as a signed integer. Signed integers use the two's complement representation [wiki], which means that every integer whose highest bit is set is negative, and that its value equals -1 - (~x); here ~x is 10, so the value is -11. If the specifier were %u, the value would be formatted as an unsigned integer.
EDIT: as @R. says, using %d is only well defined for unsigned values that are also in the range of the signed type; outside that range, the result depends on the implementation.
It's undefined behaviour, since "%d" is for signed integers; for unsigned ones, use "%u".
Otherwise, note that negative values are often represented in two's complement, so -a == (~a) + 1, or the other way round: (~a) == -a - 1. Hence, (~10) is the same as -10 - 1, which is -11.
The format specifier for an unsigned decimal integer is %u. %d is for a signed decimal integer.
printf("%d\n", a) is interpreting a as a signed int. You want printf("%u\n", a).

Simple Character Interpretation In C

Here is my code
#include <stdio.h>

int main(void)
{
    char ch = 129;
    printf("%d", ch);
}
I get the output as -127. What does it mean?
It means that char is an 8-bit variable that can only hold 2^8 = 256 values. With the declaration char ch, ch is (on your system) a signed variable, which means it can store values from -128 to 127. When you ask it to go past 127, the value starts over from -128.
Think of it like some arcade games where you go from one side of the screen to the other:

ch = 50;
                                    -----> 50 is stored
|___________________________________|___________|     since it fits
-128                0               50         127     between -128
                                                       and 127

ch = 129;
 ---                                                   129 goes over
 -->                                                   127 by 2, so
|__|_____________________________________________|     it 'lands' in
-128 -127           0                          127     -127

BUT!! you shouldn't rely on this, since it is implementation-defined behaviour!
In honor of Luchian Grigore here's the bit representation of what's happening:
A char is a variable that holds 8 bits, or a byte. So we have 8 0's and 1's struggling to represent whatever value you desire. If the char is a signed variable, it also has to represent whether the value is positive or negative. You probably read about one bit representing the sign; that's an abstraction of the true process, in fact only one of the first solutions implemented in electronics. But such a trivial method had a problem: you would have 2 ways of representing 0 (+0 and -0):
0 0000000 -> +0             1 0000000 -> -0
^                           ^
|_ sign bit 0: positive     |_ sign bit 1: negative
Inconsistencies guaranteed!! So, some very smart folks came up with a system called Ones' Complement which would represent a negative number as the negation (NOT operation) of its positive counterpart:
01010101 -> +85
10101010 -> -85
This system... had the same problem: 0 could be represented as 00000000 (+0) and 11111111 (-0). Then came some smarter folks who created Two's Complement, which keeps the negation part of the earlier method and then adds 1, thereby removing that pesky -0 and giving our range a shiny new number: -128! So how does our range look now?
00000000 +0
00000001 +1
00000010 +2
...
01111110 +126
01111111 +127
10000000 -128
10000001 -127
10000010 -126
...
11111110 -2
11111111 -1
So, this should give an idea of what's happening when our little processor tries to add numbers to our variable:
 00110010   50        01111111   127
+00000010  +  2      +00000010  +  2
---------  ----      ---------  ----
 00110100   52        10000001  -127
       ^              ^      ^
       |_ 1 + 1 = 10  |      |_ wait, what?!
                      |_ 129 in bin
Yep, if you review the range table above you can see that up to 127 (01111111) the binary was fine and dandy, nothing weird happening, but after the 8th bit is set, at -128 (10000000), the interpreted number no longer follows the binary magnitude but the Two's Complement representation. This means the binary representation, the bits in your variable, the 1's and 0's, the heart of our beloved char, does hold a 129... it's there, look at it! But the evil processor reads that as a measly -127, because the variable HAD to be signed, undermining all its positive potential for a smelly shift through the real number line in the Euclidean space of dimension one.
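
Here is the same story in runnable form; a sketch assuming a signed 8-bit char on a two's complement machine:

#include <stdio.h>

int main(void)
{
    signed char ch = 127;
    ch = ch + 2;  /* the addition happens in int (129); storing 129 back into
                     a signed char is implementation-defined, and on two's
                     complement machines it wraps to -127 */
    printf("%d\n", ch);  /* typically prints: -127 */
    return 0;
}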
It means you ran into implementation-defined behavior.
The outcome can differ from one compiler to the next.
char ch = 129; is not portable, because 129 is not a representable value for a char in your specific setup.
Your char is most likely an 8-bit signed integer stored using Two's complement. Such a variable can only represent numbers between -128 and 127. If you do 127 + 1 it wraps around to -128, so 129 is equivalent to -127.
This comes from the fact that a char is coded on one byte, i.e. 8 bits of data.
A signed char effectively uses 7 bits for the magnitude and one bit for the sign, while an unsigned char has all 8 bits available for its value.
This means:
Taking abcdefgh as the 8 bits respectively (a being the leftmost bit, and h the rightmost), the value is encoded with a determining the sign and bcdefgh carrying the rest of the bits:
42(decimal) = 101010(binary)
stored as :
abcdefgh
00101010
When using this value from the memory:
a is 0: the number is positive, bcdefgh = 0101010: the value is 42
What happens when you put 129:
129(decimal) = 10000001(binary)
stored as :
abcdefgh
10000001
When using this value from the memory:
a is 1: the number is negative; we subtract one and invert all bits of the value, so (bcdefgh - 1) inverted = 1111111: the magnitude is 127.
The number is -127.
On your system, char 129 has the same bits as the 8-bit signed integer -127.
An unsigned 8-bit integer goes from 0 to 255; a signed one from -128 to 127.
Related (C++):
You may also be interested in reading the nice top answer to What is an unsigned char?
As @jmquigley points out, strictly speaking you should not rely on this, since the result of the conversion is implementation-defined.
Allowing signed integer overflows in C/C++
The char type is, on your platform, an 8-bit signed integer. If you interpret the representation of the unsigned byte 129 in the two's complement signed representation, you get -127.
The type char can be either signed or unsigned; it's up to the compiler. Most compilers make it signed.
In your case, the compiler silently converts the integer 129 to its signed variant and puts it in an 8-bit field, which yields -127.
char is 8 bits and signed here. It can only hold values -128 to 127. When you try to assign 129 to it, you get the result you see because the bit that indicates the sign gets flipped. Another way to think of it is that the number "wraps" around.
Whether a plain char is signed or unsigned is implementation-defined. This is a quite stupid, obscure rule in the C language: int, long etc. are guaranteed to be signed, but char could be signed or unsigned, it is up to the compiler implementation.
On your particular compiler, char is apparently signed. This means, assuming that your system uses two's complement, that it can hold values of -128 to 127.
You attempt to store the value 129 in such a variable. Strictly speaking this is not an overflow but an out-of-range conversion, whose result is implementation-defined: the standard does not pin down the value you get, so each compiler may document its own result and still conform to ISO C. In practice, most (all?) compilers implement it as "wrap around", as described in other answers.
To sum it up, your code relies on two different behaviors that aren't fully defined by the standard. Understanding how the result of such unpredictable code ends up a certain way has limited value. The important thing here is to recognize that the code is obscure, and to learn how to write it in a way that isn't.
The code could for example be rewritten as:
unsigned char ch = 129;
Or even better:
#include <stdint.h>
...
uint8_t ch = 129;
As a rule of thumb, make sure to follow these rules in MISRA-C:2004:
6.1 The plain char type shall be used only for the storage and use of character values.
6.2 signed and unsigned char type shall be used only for the storage and use of numeric values.
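
For completeness, a sketch of the fixed-width variant; PRIu8 from <inttypes.h> supplies the matching format specifier:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    uint8_t ch = 129;  /* fits comfortably: uint8_t holds 0..255 */

    printf("%" PRIu8 "\n", ch);  /* prints: 129, no sign surprises */
    return 0;
}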

Bit field manipulation-setting a bit

#include <stdio.h>

int main()
{
    struct s {
        int bit_fld : 3;
    };
    struct s a;
    a.bit_fld = 0x10;
    a.bit_fld = (a.bit_fld | (1 << 2));
    printf("%x\n", a.bit_fld);
    return 0;
}
This program outputs fffffffc.
I tried to do the manual calculation of the output and I could not get the output that the compiler produced.
bit_fld = 00010000 and (1 << 2) = 0100; ORing both will result in 00010100, which is 0x14 in hexadecimal.
Why is my perception of the output wrong? Help me understand where I'm mistaken.
a.bit_fld is only 3 bits big; it can't store the value 0x10. The behavior is implementation-defined, but in this case it has probably stored 0.
Then 1 << 2 is binary 100 as you say. Assuming we did store 0 at the first step, the result of ( a.bit_fld | (1<<2)) is an int with value 4 (binary 100).
In a signed 2's complement 3-bit representation, this bit pattern represents the value -4, so it's not at all surprising if -4 is what you get when you store the value 4 to a.bit_fld, although again this is implementation-defined.
In the printf, a.bit_fld is promoted to int before passing it as a vararg. The 2's complement 32 bit representation of -4 is 0xfffffffc, which is what you see.
It's also undefined behavior to pass an int instead of an unsigned int to printf for the %x format. It's not surprising that it appears to work, though: for varargs in general there are certain circumstances where it's valid to pass an int and read it as an unsigned int. printf isn't one of them, but an implementation isn't going to go out of its way to stop it appearing to work.
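
A sketch of the same effect, under the assumptions above (plain int bit fields signed, two's complement, 32-bit int); the in-range maximum for the field is 3:

#include <stdio.h>

int main(void)
{
    struct s { int bit_fld : 3; } a;  /* a plain int bit field may be signed;
                                         3 bits then hold -4..3 */
    a.bit_fld = 4;                    /* pattern 100: implementation-defined,
                                         reads back as -4 on two's complement */

    printf("%d\n", a.bit_fld);            /* typically prints: -4 */
    printf("%x\n", (unsigned)a.bit_fld);  /* prints: fffffffc with 32-bit int */
    return 0;
}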
