C programming on storage of bits in a byte

#include <stdio.h>
int main()
{
    char a = 128;
    char b = -128;
    printf("a is %d -- b is %d\n", a, b);
    return 0;
}
The output is :
a is -128 -- b is -128
Since the signed char range is -128 to 127, can you explain, based on the code above, how out-of-range values such as 128 are assigned?
Thanks in advance.

The range of a char type depends on the implementation. If it is a signed type, its range is at least -127 to 127, and if it is an unsigned type its range is at least 0 to 255 (these are the bare minimums the standard requires; the range an actual implementation supports may be larger, and on a two's-complement system a signed char spans -128 to 127).
Also note that when you convert an integer to a signed type that cannot hold that value, the result is implementation-defined (per C11 6.3.1.3, either the result is implementation-defined or an implementation-defined signal is raised). So assigning 128 to a signed char that cannot hold it (i.e. when 128 is greater than CHAR_MAX) gives a non-portable result. In this case it has wrapped around to -128 because it shares the same byte representation as an unsigned char holding 128, but you cannot guarantee that this will be the case on all implementations.
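As a sketch of what typically happens on a two's-complement machine with a signed 8-bit char (an assumption about the implementation, not a guarantee), you can inspect the stored byte through unsigned char:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    char a = 128;                          /* out of range: result is implementation-defined */
    unsigned char bits = (unsigned char)a; /* well-defined: yields the byte's unsigned value */
    printf("a = %d, byte = 0x%02X\n", a, bits);
    printf("CHAR_MIN = %d, CHAR_MAX = %d\n", CHAR_MIN, CHAR_MAX);
    return 0;
}

On such a system this prints a = -128, byte = 0x80: the single bit pattern 10000000 reads as 128 when treated as unsigned and as -128 when treated as two's-complement signed.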

Related

typecasting unsigned char and signed char to int in C

#include <stdio.h>
int main()
{
    char ch1 = 128;
    unsigned char ch2 = 128;
    printf("%d\n", (int)ch1);
    printf("%d\n", (int)ch2);
    return 0;
}
The first printf statement outputs -128 and the second 128. As I see it, both ch1 and ch2 will have the same binary representation of the stored number: 10000000. So when I cast both values to int, how do they end up as different values?
First of all, a char can be signed or unsigned; that depends on the compiler implementation. But since you got different results, your compiler treats char as signed.
A signed char can only hold values from -128 to 127, so a value of 128 is out of range and wraps around to -128 here (an implementation-defined conversion).
But an unsigned char can hold values from 0 to 255, so the value 128 is stored unchanged.
An unsigned char can have a value of 0 to 255. A signed char can have a value of -128 to 127. Setting a signed char to 128 in your compiler probably wrapped around to the lowest possible value, which is -128.
Your fundamental error here is a misunderstanding of what a cast (or any conversion) does in C. It does not reinterpret bits. It's purely an operation on values.
Assuming plain char is signed, ch1 has value -128 and ch2 has value 128. Both -128 and 128 are representable in int, and therefore the cast does not change their value. (Moreover, writing it is redundant since the default promotions automatically convert variadic arguments of types lower-rank than int up to int.) Conversions can only change the value of an expression when the original value is not representable in the destination type.
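To illustrate that point (again assuming a signed 8-bit plain char on a two's-complement system), compare converting the value with reading the raw byte through unsigned char; only the latter recovers 128:

#include <stdio.h>

int main(void)
{
    char ch1 = 128;  /* implementation-defined: -128 on a typical two's-complement system */
    printf("%d\n", (int)ch1);                /* -128: the value is converted, not the bits */
    printf("%d\n", (int)(unsigned char)ch1); /* 128: the byte reinterpreted as unsigned    */
    return 0;
}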
For starters, these casts
printf("%d\n", (int)ch1);
printf("%d\n", (int)ch2);
are redundant. You could just write
printf("%d\n", ch1);
printf("%d\n", ch2);
because, due to the default argument promotions, arguments of integer types whose rank is less than the rank of int are promoted to int (provided int can represent all values of the original type).
The type char can behave either as the type signed char or unsigned char depending on compiler options.
From the C Standard (5.2.4.2.1 Sizes of integer types <limits.h>)
2 If the value of an object of type char is treated as a signed integer when used in an expression, the value of CHAR_MIN shall be the same as that of SCHAR_MIN and the value of CHAR_MAX shall be the same as that of SCHAR_MAX. Otherwise, the value of CHAR_MIN shall be 0 and the value of CHAR_MAX shall be the same as that of UCHAR_MAX. The value UCHAR_MAX shall equal 2^CHAR_BIT − 1.
So it seems that by default your compiler treats the type char as signed char.
As a result, in the first of these declarations
char ch1 = 128;
unsigned char ch2 = 128;
the internal representation 0x80 of the value 128 is interpreted as a signed value, because the sign bit is set, and that value is equal to -128.
So the first call of printf outputs the value -128
printf("%d\n", (int)ch1);
while the second call of printf, which uses an object of type unsigned char,
printf("%d\n", (int)ch2);
outputs the value 128.
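A quick way to confirm which choice an implementation makes for plain char is to print the <limits.h> constants mentioned in the quoted paragraph; a minimal sketch:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* CHAR_MIN is either SCHAR_MIN or 0; that distinguishes the two options. */
    printf("CHAR_BIT = %d, CHAR_MIN = %d, CHAR_MAX = %d\n", CHAR_BIT, CHAR_MIN, CHAR_MAX);
    printf("plain char is %s\n", CHAR_MIN == 0 ? "unsigned" : "signed");
    return 0;
}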

Output of C code in which a character is assigned an octal number

#include <stdio.h>
int main(void)
{
    char a = 01212;
    printf("%d", a);
    return 0;
}
On compiling I get a warning, and the output is -118. How? I know any number starting with 0 in C is treated as octal. The decimal equivalent of octal 01212 is 650, so why is the output -118?
The assignment char a = 01212; is out of range on most systems, and the result is implementation-dependent. A system with an 8-bit char that implements 2's complement will print -118.
For detail, please read below explanation.
Unlike int, a char is not necessarily signed by default; there are three different char types in C:
char,
signed char
and
unsigned char
A char has a range from CHAR_MIN to CHAR_MAX. For a particular compiler, char will use either an underlying signed or unsigned representation. You can check these values in limits.h on your system.
Here is the text from the C99 standard, 6.2.5 paragraph 15:
6.2.5 Types
The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char. 35)
And again note 35 says
35) CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
Having said this, the value in char a = 01212; does not fit in 8 bits. The C standard allows a char wider than 8 bits, but nearly all computers today implement an 8-bit char.
So if char is implemented as unsigned char and the value is more than CHAR_MAX, the value is converted modulo CHAR_MAX + 1.
On an 8-bit system the converted value is 650 modulo 256, which is 650 − 512 = 138.
If char is implemented as signed char, the conversion is implementation-dependent. On an 8-bit 2's-complement system the value will be -118 (138 − 256 = −118), as you saw in your result. Note that on such a system the range of char is -128 to +127.
The value 650 is most likely out of range for your char type. In C, the behavior of such out-of-range integer conversions is implementation-defined. I.e. it is clear that you will not get 650 in your char, and what exactly you will get depends on your compiler. Consult your compiler documentation to figure out why you got -118.
A char occupies one byte, or 8 bits on typical systems, so the maximum number an unsigned char can hold is 2^8 − 1 = 255, and a signed char has a maximum of 127. Assigning a number greater than that is an out-of-range conversion whose result is implementation-defined, so a negative number may appear.
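The arithmetic above can be checked directly. A small sketch, assuming an 8-bit two's-complement char, reproduces the -118:

#include <stdio.h>

int main(void)
{
    int v = 01212;       /* octal 1212 == decimal 650 */
    int low8 = v % 256;  /* 138: the low 8 bits that an 8-bit char keeps */
    printf("650 %% 256 = %d, as two's complement: %d\n", low8, low8 - 256);
    return 0;
}

This prints: 650 % 256 = 138, as two's complement: -118.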

Execution output of program

Why does the program given below print -128?
#include <stdio.h>
int main(void)
{
    char i = 0;
    for (; i >= 0; i++)
        ;
    printf("%d", i);
    return 0;
}
Also, can I assign an int value to a char without casting it? And if I put a print statement inside the for loop, it prints up to 127, which is correct, but this program prints -128. Why?
If char is a signed type on your platform, then the result is not portable: the increment is performed in int, and converting the out-of-range result back to char is implementation-defined behaviour in C.
A 2's-complement 8-bit number with the value -128 has the same bit pattern as an unsigned 8-bit number with the value +128. It seems that this is what is happening in your case, and -128 is, of course, a termination condition for your loop (you could even call it "wraparound to the smallest negative value"). But don't rely on this.
According to N1570, whether a char is signed is implementation-defined (emphasis mine):
6.2.5 Types
15 The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.
If it's unsigned, it will never overflow:
9 The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
For example, suppose UCHAR_MAX == 127 (usually it will be 255, though): then 127 + 1 = (127 + 1) % (UCHAR_MAX + 1) = (127 + 1) % (127 + 1) = 0.
But if it's signed, the result of converting the out-of-range value back to char is implementation-defined, so CHAR_MAX + 1 can become CHAR_MIN, 0, or whatever, and the standard even allows an implementation-defined signal to be raised.
In your case it seems that char is signed and CHAR_MAX + 1 wraps to CHAR_MIN. Why? Simply because your implementation defines it so. But this is not portable or reliable at all.
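If the intent is to count up to the largest char value without relying on implementation-defined wraparound, a portable sketch does the counting in int and stops at CHAR_MAX explicitly:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    int i;
    for (i = 0; i < CHAR_MAX; i++)
        ;                  /* counting in int: no wraparound involved */
    printf("%d\n", i);     /* prints CHAR_MAX, e.g. 127 on an 8-bit-char system */
    return 0;
}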

Char multiplication in C

I have code like this:
#include <stdio.h>
int main()
{
    char a = 20, b = 30;
    char c = a * b;
    printf("%c\n", c);
    return 0;
}
The output of this program is X.
How is this output possible when a*b = 600, which overflows, since char values lie between -128 and 127?
Whether char is signed or unsigned is implementation defined. Either way, it is an integer type.
Anyway, the multiplication is done as int due to integer promotions and the result is converted to char.
If the value does not fit into the "smaller" type, it is implementation-defined for a signed char how this is done. By far most (if not all) implementations simply cut off the upper bits.
For an unsigned char, the standard actually requires cutting off the upper bits (reduction modulo 2^CHAR_BIT).
So:
(int)20 * (int)30 -> (int)600 -> (char)(600 % 256) -> 88 == 'X'
(Assuming an 8-bit char.)
Note: if you enable compiler warnings (as always recommended), you should get a truncation warning for the assignment. This can be silenced with an explicit cast (only if you are really sure about all the implications). The relevant gcc option is -Wconversion.
First off, the behavior is implementation-defined here. A char may be either unsigned char or signed char, so it may be able to hold 0 to 255 or -128 to 127, assuming CHAR_BIT == 8.
600 in decimal is 0x258. What happens is that only the least significant eight bits are stored, so the value is 0x58, a.k.a. 'X' in ASCII.
This code will cause undefined behavior if char is signed.
I thought overflow of signed integer is undefined behavior, but conversion to smaller type is implementation-defined.
Quote from N1256, 6.3.1.3 Signed and unsigned integers:
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
If the value is simply truncated to 8 bits, (20 * 30) & 0xff == 0x58, and 0x58 is the ASCII code for X. So if your system does this and uses ASCII, the output will be X.
First, it looks like your char is unsigned, with a range from 0 to 255.
You're right about the overflow.
600 - 256 - 256 = 88
This is just the ASCII code of 'X'.
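To make the promotion-then-truncation sequence concrete, here is a small sketch (assuming an 8-bit char and ASCII):

#include <stdio.h>

int main(void)
{
    int product = 20 * 30;              /* integer promotions: computed as int, 600 */
    unsigned char low = product & 0xFF; /* keep the low 8 bits: 600 & 0xFF == 88    */
    printf("%d '%c'\n", low, low);      /* prints: 88 'X'                           */
    return 0;
}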

What's wrong with this C code?

My source code:
#include <stdio.h>
int main()
{
    char myArray[150];
    int n = sizeof(myArray);
    for (int i = 0; i < n; i++)
    {
        myArray[i] = i + 1;
        printf("%d\n", myArray[i]);
    }
    return 0;
}
I'm using Ubuntu 14 and gcc to compile it, and what it prints out is:
1
2
3
...
125
126
127
-128
-127
-126
-125
...
Why doesn't it just count up to 150?
The int value of a char can range from 0 to 255 or from -128 to 127, depending on the implementation.
Therefore, once the value passes 127 in your case, it wraps around and you get negative values as output.
The signedness of a plain char is implementation defined.
In your case, a char is a signed char, which can hold values in the range -128 to +127.
Because you increment i beyond the limit a signed char can hold and assign that value to myArray[i], you run into implementation-defined behaviour.
To quote C11, chapter §6.3.1.3:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Because a char is a SIGNED BYTE. That means its value range is -128 to 127.
EDIT: Due to all the comments below suggesting this is wrong / not the issue / signedness / whatnot...
Running this code:
#include <stdio.h>

int main(void)
{
    char a, b;
    unsigned char c, d;
    int si, ui, t;
    t = 200;
    a = b = t;
    c = d = t;
    si = a + b;
    ui = c + d;
    printf("Signed:%d | Unsigned:%d", si, ui);
    return 0;
}
Prints: Signed:-112 | Unsigned:400
Try it yourself.
The reason is the same. a and b are signed chars (signed byte-sized variables, 8 bits); c and d are unsigned. Assigning 200 to the signed variables goes out of range, and they get the value -56. In memory a, b, c and d all hold the same bit pattern, but when used, each type's signedness dictates how that value is interpreted, and in this case it makes a big difference.
Note about standard
It has been noted (in the comments to this answer, as well as in other answers) that the standard doesn't mandate that char is signed. That is true. However, in the case presented by the OP, as well as in the code above, char IS signed.
It seems that your compiler by default treats type char as type signed char. In this case CHAR_MIN is equal to SCHAR_MIN, which is -128, while CHAR_MAX is equal to SCHAR_MAX, which is 127 (see the header <limits.h>).
According to the C Standard (6.2.5 Types)
15 The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.
For signed types one bit is used as the sign bit. So for the type signed char the maximum value corresponds to the following representation in hexadecimal notation
0x7F
and equals 127. The most significant bit is the sign bit and is equal to 0.
For negative values the sign bit is set to 1; for example, -128 is represented as
0x80
When the value stored in your char reaches its positive maximum 0x7F and is increased, it becomes equal to 0x80, which in decimal notation is -128.
You should explicitly use the type unsigned char instead of char if you want the result of the program not to depend on compiler settings.
Or in the printf statement you could explicitly cast type char to type unsigned char. For example:
printf("%d\n", (unsigned char)myArray[i]);
Or, to compare the results, you could write in the loop:
printf("%d %d\n", myArray[i], (unsigned char)myArray[i]);
