How does a character pointer treat the MSB of each byte? - c

for the following code:
#include <stdio.h>

int main(void)
{
    int i;
    float a = 5.2;
    char *ptr;

    ptr = (char *)&a;
    for (i = 0; i <= 3; i++)
        printf("%d ", *ptr++);
    return 0;
}
I'm getting the output 102 102 -90 64. Why? How does the character pointer treat the MSB of each byte?

Whether char is signed or unsigned is implementation-defined. Clearly the char type on your system is signed, so the MSB acts as the sign bit.

In your case, apparently, it treats the most significant bit as a sign bit; in other words, on your implementation char is a signed integer type, incidentally with a two's complement representation.

If you convert the 5.2 floating point value into its binary format you get:
5.2 = 01000000 (= 64) 10100110 (= 166) 01100110 (= 102) 01100110 (= 102)
If you take the byte printed third (166) and convert it into a signed char value (in the range [-128, 127]) you obtain 166 - 256 = -90.
Compile your program with -funsigned-char (a GCC option) to obtain 102 102 166 64 as output.

In your case, char uses a signed representation. As for the order of the values, that depends on the endianness of the system you are working on.
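If you just want to inspect the raw bytes without sign effects, here is a minimal sketch (assuming a 4-byte IEEE 754 float; the commented output assumes a little-endian machine) that walks the float through an unsigned char pointer:

#include <stdio.h>

int main(void)
{
    float a = 5.2;
    unsigned char *ptr = (unsigned char *)&a; /* unsigned: no sign extension */
    int i;

    /* Print each byte in memory order; on a little-endian IEEE 754
       machine this prints 102 102 166 64. */
    for (i = 0; i < (int)sizeof a; i++)
        printf("%d ", ptr[i]);
    printf("\n");
    return 0;
}

Using unsigned char for byte-level inspection sidesteps the signed/unsigned question entirely, without needing compiler flags.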

Related

How is an integer literal mapped in memory?

Take a look at the following example:
int a = 130;
char *ptr;
ptr = (char *) &a;
printf("%d", *ptr);
I expected the value 0 to be printed but to my surprise it's -126. I came to the conclusion that since char is 8 bits, the int might be getting truncated.
Until now I used to think that memory is filled in such a way that the MSB is on the left. But now everything seems to be mixed up. How exactly is memory laid out?
In your case a is (probably) a 4-byte little-endian value, and 130 is 10000010 in binary.
int a = 130; // 10000010 00000000 00000000 00000000 (note the little-endian byte order)
You're pointing at the first byte with a char*:
char *ptr = (char *)&a; // 10000010
and trying to print it with the %d format, which prints the signed integer value of 10000010, which is -126 (see: two's complement).
Your output is a hint that your system is little-endian (the least significant byte has the lowest memory address).
In hexadecimal (exactly 2 digits per byte), 130 is written 0x82. Assuming 4 bytes for an int, on a little-endian system the integer will be stored as 0x82, 0, 0, 0. So *ptr will be (char) 0x82.
But you use printf to display its value. As all parameters after the first have no declared type in printf's prototype, the char value will be promoted to an int. Assuming a two's complement representation (currently the most common one), you will get either 130 if char is unsigned, or -126 if it is signed.
TL;DR: the output is normal on a little-endian system with a two's complement integer representation where the char type is signed.
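Building on that, a small sketch (assuming a 4-byte int) that uses the same char-pointer trick to report the byte order of the machine it runs on:

#include <stdio.h>

int main(void)
{
    int a = 130;                            /* 0x00000082 as a 4-byte int */
    unsigned char *p = (unsigned char *)&a;

    /* On a little-endian machine the lowest-addressed byte is 0x82;
       on a big-endian machine it is 0x00. */
    if (p[0] == 0x82)
        printf("little endian: first byte is 0x%02X\n", (unsigned)p[0]);
    else
        printf("big endian: first byte is 0x%02X\n", (unsigned)p[0]);
    return 0;
}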

Converting unsigned char to signed int

Consider the following program
#include <stdio.h>

int main(void)
{
    char t = 179;
    printf("%d ", t);
    return 0;
}
The output is -77.
But the binary representation of 179 is
10110011
So shouldn't the output be -51, considering the 1st bit is the sign bit?
The binary representation of -77 is
11001101
It seems the bit order is reversed. What's going on? Please, somebody advise.
You claimed that the binary representation of -77 is 11001101, but the reversal is yours. The binary representation of -77 is 10110011.
Binary 10110011 unsigned is decimal 179.
Binary 10110011 signed is decimal -77.
You assigned the out-of-range value 179 to a signed char. Strictly speaking the result is implementation-defined, but it would be a very poor compiler that placed anything other than that 8-bit pattern in the signed char.
But when printed, it's interpreted as a negative number because bit 7 is set.
Looks like char is a signed type on your system. The valid range for a char would then be [-128, 127].
By using
char t = 179;
the compiler stores the 8-bit pattern of 179, which read as two's complement is most likely -77, and assigns that value to t.
To convert between a positive and a negative number in 2's complement you invert all the bits and then you add 1.
10110011
01001100 (invert all bits)
01001101 (add one)
That is 77 decimal
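A short sketch of that invert-and-add-one rule, done in unsigned arithmetic so each step is well-defined (the final line assumes a two's-complement signed char):

#include <stdio.h>

int main(void)
{
    unsigned char x = 179;                           /* 10110011 */
    unsigned char inverted = (unsigned char)~x;      /* 01001100 = 76 */
    unsigned char negated = (unsigned char)(~x + 1); /* 01001101 = 77 */

    printf("x = %d, ~x = %d, ~x + 1 = %d\n", x, inverted, negated);

    /* 179 and -77 share the bit pattern 10110011 in 8 bits, because
       256 - 179 = 77; reading the pattern as signed gives -77. */
    printf("as signed char: %d\n", (signed char)x);
    return 0;
}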
Char could be signed or unsigned: that's up to your compiler.
If signed, it could use two's complement or ones' complement.
Furthermore, it could be larger than 8 bits, although sizeof(char) is defined to be 1.
So it's inadvisable to rely on a specific representation.
Please note that on many systems char is signed. Hence, when you assign 179 (which is of type int) to a char, the value is outside the char range, and the result is implementation-defined. 6.3.1.3 Signed and unsigned integers:
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
If you change the type to unsigned char, your program will behave as you expect.
Also note that char, signed char and unsigned char are 3 distinct types, unlike int and signed int, which are the same type. The signedness of char is implementation-defined.
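If you want to know which choice your implementation made, a minimal sketch using limits.h:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* CHAR_MIN is 0 when plain char is unsigned, negative when signed. */
    printf("plain char is %s on this implementation\n",
           CHAR_MIN < 0 ? "signed" : "unsigned");
    return 0;
}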

How do we get the following output?

#include <stdio.h>

int main(void)
{
    int i = 258;
    char ch = i;
    printf("%d", ch);
    return 0;
}
The output is 2!
How does the range of a variable work? What are the ranges of the different data types in C?
When assigning to a smaller type, the value is
truncated, i.e. 258 % 256, if the new type is unsigned, or
modified in an implementation-defined fashion if the new type is signed. As the standard (6.3.1.3) puts it:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
So, on a typical implementation, all that fancy "adding or subtracting" means it is assigned as if you had written:
ch = i % 256;
char is 8 bits long, while 258 requires nine bits to represent. Converting to char chops off the most significant bit of 258, which is 100000010 in binary, leaving 00000010, which is 2.
When you pass a char to printf, it gets promoted to int, which is then picked up by the %d format specifier and printed as 2.
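A quick sketch verifying that the conversion behaves like keeping the low 8 bits (assuming an 8-bit char and the typical two's-complement behaviour for the signed case):

#include <stdio.h>

int main(void)
{
    int i = 258;              /* binary: 1 00000010 */
    char ch = i;              /* keeps the low 8 bits: 00000010 */

    printf("%d\n", ch);       /* 2 */
    printf("%d\n", i % 256);  /* 2 */
    printf("%d\n", i & 0xFF); /* 2: masking the low byte gives the same */
    return 0;
}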
#include <stdio.h>

int main(void)
{
    int i = 258;
    char ch = i;
    printf("%d", ch);
    return 0;
}
At the machine level, i is 00000001 00000010 (in its low 16 bits). ch takes 1 byte, so it gets the last 8 bits: 00000010, which is 2.
To find out how long the various types are in C you should refer to limits.h (or climits in C++). char is not guaranteed to be 8 bits long. It is just:
the smallest addressable unit of the machine that can contain the basic character set. It is an integer type. The actual type can be either signed or unsigned, depending on the implementation.
The same sort of vague definitions are given for the other types.
Alternatively, you can use the sizeof operator to find out the size of a type in bytes.
You may not assume exact ranges for the native C data types. The standard places only minimal restrictions, so you can say that an unsigned short can hold at least 65536 different values; the upper limit can differ.
Refer to Wikipedia for more reading.
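For instance, a minimal sketch that prints the actual sizes and ranges on the machine it runs on, using sizeof and limits.h:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("char:  %zu byte(s), range [%d, %d]\n",
           sizeof(char), CHAR_MIN, CHAR_MAX);
    printf("short: %zu byte(s), range [%d, %d]\n",
           sizeof(short), SHRT_MIN, SHRT_MAX);
    printf("int:   %zu byte(s), range [%d, %d]\n",
           sizeof(int), INT_MIN, INT_MAX);
    printf("unsigned short max: %u\n", (unsigned)USHRT_MAX);
    return 0;
}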
char is 8 bits wide here, so when you assign an int to a char on a 32-bit machine (where int is 32 bits), the variable i is:
00000000 00000000 00000001 00000010 = 258 (in binary)
When you convert this int to a char, only the last 8 bits are kept (char is 8 bits wide), so you get:
00000010, which means 2 in decimal. This is why you see this output.
This is an overflow; the result depends on whether char is signed (implementation-defined behavior) or unsigned (well-defined "wrap-around" behavior).
You are using a little-endian machine.
The binary representation of 258 is
00000000 00000000 00000001 00000010
When assigning the integer to the char, only 8 bits of data are copied to the char, i.e. the LSB: here only 00000010 (0x02) ends up in the char.
Note, though, that this assignment is defined in terms of value, not memory layout, so it also gives 2 on a big-endian machine; byte order would only matter if you read the first byte through a pointer, as in *(char *)&i.
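To make the distinction concrete, a sketch (assuming a 4-byte int) contrasting the value conversion with reading the first byte of memory directly:

#include <stdio.h>

int main(void)
{
    int i = 258;                  /* 0x00000102 */

    char by_value = i;            /* value conversion: low 8 bits, always 2 */
    char by_memory = *(char *)&i; /* lowest-addressed byte: endian-dependent */

    printf("by value:  %d\n", by_value);  /* 2 on any byte order */
    printf("by memory: %d\n", by_memory); /* 2 on little endian, 0 on big endian */
    return 0;
}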

Right shift operator in C

I got the following code:
#include <stdio.h>

int main(int argc, char *argv[])
{
    char c = 128;
    c = c >> 1;
    printf("c = %d\n", c);
    return 0;
}
Running the above code on 32-bit Windows XP, I got the result -64. Why -64?
Because the char type is a signed 8-bit integer in the implementation of C that you are using. If you try to store the value 128 in it, it will actually hold -128.
The bits for that would be:
10000000
Shifting a negative number will keep the sign bit set (as your implementation uses an arithmetic shift):
11000000
The result is -64.
The C standard doesn't specify whether char is signed or unsigned. In this case it looks like you're getting a signed char, with a range from -128 to +127. Assigning 128 to it wraps around and leaves you with -128, so c >> 1 is -64.
If you declare c as unsigned char, c >> 1 will be 64.
As the comment says, right-shifting a negative value is implementation-defined by the standard, so it's your implementation's choice (not a guarantee) that it comes out as -64.
You are using type char, which on your platform is signed. Signed chars have a range of -128 to 127, which means char c = 128 really sets c to -128 (this is because most processors use two's complement to represent negative numbers). Thus when you shift right you get -64.
Bottom line: when doing bit manipulation, use unsigned types to get the results you expect.
Variable c is signed. Changing the declaration to unsigned char c... will yield a result of 64.
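To see both behaviours side by side, a sketch contrasting the signed and unsigned shifts (the signed result assumes an arithmetic shift, which is the implementation-defined choice most compilers make):

#include <stdio.h>

int main(void)
{
    signed char sc = -128;  /* bit pattern 10000000, i.e. what char c = 128
                               typically stores when char is signed */
    unsigned char uc = 128; /* same bit pattern, no sign bit */

    /* Arithmetic shift copies the sign bit in: 10000000 -> 11000000 = -64. */
    printf("signed:   %d\n", sc >> 1);
    /* A non-negative value shifts with zero fill: 10000000 -> 01000000 = 64. */
    printf("unsigned: %d\n", uc >> 1);
    return 0;
}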

-250 prints as 6

I could not figure out how the following program outputs 6 and -250.
#include <stdio.h>

int main()
{
    unsigned char p = -250;
    printf("%d", p);
    unsigned int p1 = -250;
    printf("%d", p1);
    return 0;
}
Being unsigned, they should hold only positive values. How does p end up as 6? Please help me understand.
printf is not typesafe. It prints whatever you ask it to, and %d says "signed integer". It is your responsibility to provide a variable of matching type. Since unsigned char is only 8 bits wide, the literal -250 wraps around to +6, which remains +6 when interpreted as a signed integer. Note that char and short int (and their signed/unsigned counterparts) all get promoted to int-sized types when passed as variadic arguments.
By default, integer constants such as -250 have type int, and negative values are stored in memory in two's complement form. Let's calculate the two's complement form of -250 (see the "Making two's complement form" paragraph in the wiki):
Positive 250 is 11111010 (low 8 bits; the leading zeros are omitted)
Complement it and get 00000101 (low 8 bits; the leading ones are omitted)
Add one and get 00000110 (low 8 bits; the leading ones are omitted)
The type conversion rules for integer types in C say that the upper bits are dropped to fit the 8-bit char. For more details see K&R A.6.2 (the section number is from the Russian edition; it may differ in the original book).
So unsigned char p gets exactly the value 00000110 (6 in decimal). That is why you get 6 in the output.
I think you now understand why the second printf shows -250. ;)
An unsigned char can hold only the numbers 0..255;
numbers are converted modulo 256, so -250 becomes 6.
Strictly speaking this wrap-around is well-defined for unsigned types, but it is easy to trip over, so avoid relying on such conversions by accident.
As for p1, -250 is converted to a large unsigned int, but it is printed as -250 because the %d specifier makes printf reinterpret those bits as signed.
p1 is unsigned, but the %d specifier treats the corresponding argument as signed, so even though the stored value is in fact a large positive number, it is printed as negative.
Whether a number is signed or unsigned is all about the interpretation applied; at the machine level the bits are the same.
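A closing sketch showing the same values printed with matching format specifiers (the large number assumes a 32-bit unsigned int):

#include <stdio.h>

int main(void)
{
    unsigned char p = -250; /* wraps modulo 256 to 6 */
    unsigned int p1 = -250; /* wraps modulo 2^32 to 4294967046 */

    printf("%d\n", p);      /* 6: promoted to int, value preserved */
    printf("%u\n", p1);     /* 4294967046: %u matches the unsigned type */
    printf("%d\n", p1);     /* -250: %d reinterprets the same bits as signed */
    return 0;
}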
