Bitwise operation on a character in C

I am curious about the behavior of C's bitwise operators on a char.
#include <stdio.h>

int main()
{
    int x = 108;
    x = x << 1;
    printf("%d\n", x);

    char y = 108;
    y = y << 1;
    printf("%d", y);
    //printf("%d", y<<1);
    return 0;
}
Here, if I assign it back like this, y = y<<1, the output was -40, but when I print it directly, like
printf("%d", y<<1);
the output was 216.
How can I simulate it?

Note that there is really no << operation on char types - the operands of << are promoted to (at least) int types, and the result is, similarly, an int.
So, when you do y = y << 1, you are truncating the int result of the operation to a (signed) char, which leaves the most significant bit (the sign bit) set, so it is interpreted as a negative value.
However, when you pass y << 1 directly to printf, the resulting int is left unchanged.

y<<1 produces an int. To get -40, you were implicitly casting it to a char. In your printf case, you'll need to do the cast explicitly: (char)(y<<1)
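For illustration, here is a minimal sketch combining both cases (assuming an 8-bit signed char and two's complement, as on the asker's platform):
#include <stdio.h>

int main(void)
{
    char y = 108;

    /* y << 1 is computed as an int with value 216; truncating that
       back into a signed 8-bit char leaves the sign bit set: -40 */
    printf("%d\n", (char)(y << 1));   /* -40, same as after y = y << 1 */

    /* passed directly, the int result is printed unchanged */
    printf("%d\n", y << 1);           /* 216 */

    return 0;
}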


How int is converted to char and how char is converted to int?

In the following example, the bit representation of a byte with all ones is printed:
#include <stdio.h>

int main(void)
{
    char c = 255;
    char z;
    for (int i = 7; i >= 0; i--) {
        z = 1 << i;
        if ((z & c) == z) printf("1"); else printf("0");
    }
    printf("\n");
    return 0;
}
The output is 11111111
Now we change char c to int c, so that the example becomes:
#include <stdio.h>

int main(void)
{
    int c = 255;
    char z;
    for (int i = 7; i >= 0; i--) {
        z = 1 << i;
        if ((z & c) == z) printf("1"); else printf("0");
    }
    printf("\n");
    return 0;
}
Now the output is 01111111.
Why is the output different?
UPDATE
Compile the following test.c:
#include <stdio.h>

int main(void)
{
    char c = -1;
    printf("%c", c);
    return 0;
}
$ gcc test.c
$ ./a.out | od -b
0000000 377
0000001
The output is 377, which means that glibc contradicts gcc, because the signed char is converted to an unsigned char automatically.
Why such complications? It is reasonable to have char unsigned by default. Is there any specific reason why not?
The first problem here is the char type. This type should never be used for storing integer values, because it has implementation-defined signedness. This means that it could be either signed or unsigned, and you will get different results on different compilers. If char is unsigned on the given compiler, then this code will behave as you expected.
But in case char is signed, char c = 255; will result in a value which is too large. The value 255 will then get converted to a signed number in some compiler-specific way. Usually by translating the raw data value to the two's complement equivalent.
Good compilers like GCC will give a warning for this: "overflow in implicit constant conversion".
Solve this bug by never using char for storing integers. Use uint8_t instead.
The same problem appears when you try to store 1 << 7 inside a char type that is signed on your given compiler. z will end up as a negative value (-128) when that happens.
In the expression z & c, both operands are silently integer promoted to type int. This happens in most C expressions whenever you use small integer types such as char.
The & operator doesn't care if the operands are signed or not, it will do a bitwise AND on the "raw data" values of the variables. When c is a signed char and has the raw value 0xFF, you will get a result which is negative, with the sign bit set. Value -1 on two's complement computers.
So to answer why you get different results in the two cases:
When you switch type to int, the value 255 will fit inside c without getting converted to a negative value. The result of the & operation will also be an int and the sign bit of this int will never be set, unlike it was in the char case.
When you execute -128 & 255 the result will be 128 (0x80). This is a positive integer. z is however a negative integer with the value -128. It will get promoted to int by the == operator but the sign is preserved. Since 128 is not equal to -128, the MSB will get printed as a zero.
You would get the same result if you switched char to uint8_t.
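For example, a minimal sketch of the loop rewritten with uint8_t (assuming <stdint.h> is available; the answer above only names the type):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t c = 255;    /* 255 fits, no implementation-defined conversion */
    uint8_t z;

    for (int i = 7; i >= 0; i--) {
        z = (uint8_t)(1u << i);   /* 0x80 stays the positive value 128 */
        if ((z & c) == z) printf("1"); else printf("0");
    }
    printf("\n");                 /* prints 11111111 */
    return 0;
}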
For char to int, you have to declare the char as unsigned, because on this compiler char is treated as signed by default.
#include <stdio.h>

int main(void)
{
    int c = 255;
    unsigned char z;
    int i;
    for (i = 7; i >= 0; i--) {
        z = 1 << i;
        if ((z & c) == z) printf("1"); else printf("0");
    }
    printf("\n");
    return 0;
}
(edit to clarify "signed by default")
In the first listing, (z & c) == z compares two chars; in the second listing, however, it compares a char with an int.
In order to perform the & and == operations between a char and an int, the compiler expands the char to the size of an int.
Regarding bit 7 (the 8th bit):
If your compiler would consider char to be unsigned by default, the condition
(((int)(128) & (int)255) == (int)128)
would render true, and a 1 would be printed. However in your case the result is false, and a 0 is displayed.
The reason is likely that your compiler considers char to be signed (as gcc does by default). In this case, a char set to 1 << 7 is actually -128, while in an int (at least two bytes) 255 is positive.
(char)-128 expanded to an int is (int)-128, thus the condition
if ((z & c) == z)
reads
if (((int)(-128) & (int)255) == (int)-128)
which is false in this case.
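A small sketch of that comparison, assuming a signed 8-bit char and two's complement as described above:
#include <stdio.h>

int main(void)
{
    signed char z = 1 << 7;   /* 128 converted to signed char: typically -128 */
    int c = 255;

    printf("%d\n", z & c);          /* 128: AND of the promoted values */
    printf("%d\n", (z & c) == z);   /* 0: 128 is not equal to -128 */
    return 0;
}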

strange behaviour of char

If char a = -128; it is represented in binary as 10000000,
but when I shift this binary equivalent to the left by one bit it gives me -256, which doesn't make any sense to me.
Can anyone explain to me how this strange behaviour comes about?
#include <stdio.h>

int main() {
    char a = -128;
    printf("%d", a << 1);
    return 0;
}
-128 in an int variable is 0xffffff80.
The result of shifting it left is 0xffffff00, which is -256.
You can test it with this code:
#include <stdio.h>

int main(void)
{
    int n = -128;
    printf("Decimal value = %d\n", n);
    printf("Hex value = %x\n", n);
    n <<= 1;
    printf("Decimal value = %d\n", n);
    printf("Hex Value = %x\n", n);
    return 0;
}
EDIT
In your code, the char variable is promoted to int before it is shifted and passed to printf.
As per the rule# for the shift operators,
The integer promotions are performed on each of the operands. [...]
So, when a<<1 is used as the argument to printf(), with a being of type char and 1 being an int literal, a's value is promoted to type int, then the shift is performed, and the result is printed out as an int value.
[#] - C11, chapter §6.5.7, Bitwise shift operators
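A small check of that rule (a sketch; the sizeof of the expression shows the operand was promoted, and the last line shows what truncating back to char would give; formally, left-shifting a negative value is undefined behaviour, the values shown are the usual two's-complement results):
#include <stdio.h>

int main(void)
{
    char a = -128;

    /* the operand of << is promoted, so a << 1 has type int, not char */
    printf("%zu\n", sizeof(a << 1));   /* typically 4, i.e. sizeof(int) */
    printf("%d\n", a << 1);            /* -256, as explained above */
    printf("%d\n", (char)(a << 1));    /* 0 once truncated back to 8 bits */
    return 0;
}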

summing unsigned and signed ints, same or different answer?

If I have the following code in C
#include <stdio.h>

int main()
{
    int x = <a number>;
    int y = <a number>;
    unsigned int v = x;
    unsigned int w = y;

    int ssum = x + y;
    unsigned int usum = v + w;

    printf("%d\n", ssum);
    printf("%d\n", usum);

    if (ssum == usum) {
        printf("Same\n");
    } else {
        printf("Different\n");
    }
    return 0;
}
Which would print the most? Would they be equal, since signed and unsigned would produce the same result? Say you have a negative value like -1: when it gets assigned to int x it becomes 0xFF, and if you want to do -1 + (-1), doing it the signed way gives -2 = 0xFE; since the unsigned variables would also be set to 0xFF, adding them still gives 0xFE. The same holds true for 2 + (-3) or -2 + 3: in the end the hexadecimal values are identical. So in C, is that what's looked at when it sees signedSum == unsignedSum? Does it not care that one is actually a large number and the other is -2, as long as the 1's and 0's are the same?
Are there any values that would make this not true?
The examples you have given are incorrect in C. Also, converting between signed and unsigned types is not required to preserve bit patterns (the conversion is by value), although with some representations bit patterns are preserved.
There are circumstances where the result of operations will be the same, and circumstances where the result will differ.
If the (actual) sum of adding two ints would overflow an int (i.e. a value outside the range an int can represent), the result is undefined behaviour. Anything can happen at that point (including the program terminating abnormally); subsequently converting to an unsigned doesn't change anything.
Converting an int with negative value to unsigned int uses modulo arithmetic (modulo the maximum value that an unsigned can represent, plus one). That is well defined by the standard, but means -1 (type int) will convert to the maximum value that an unsigned can represent (i.e. UINT_MAX, an implementation-defined value specified in <limits.h>).
Similarly, adding two variables of type unsigned int always uses modulo arithmetic.
Because of things like this, your question "which would produce the most?" is meaningless.
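A short sketch of the well-defined parts (conversion of a negative int to unsigned, and unsigned wrap-around), assuming a 32-bit unsigned int:
#include <stdio.h>
#include <limits.h>

int main(void)
{
    int x = -1;
    unsigned int v = x;              /* converted modulo UINT_MAX + 1 */

    printf("%u\n", v);               /* UINT_MAX, e.g. 4294967295 */
    printf("%d\n", v == UINT_MAX);   /* 1 */

    unsigned int w = v + v;          /* unsigned addition wraps, well defined */
    printf("%u\n", w);               /* UINT_MAX - 1, e.g. 4294967294 */
    return 0;
}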

function to convert float to int (huge integers)

This is a university question. Just to make sure :-) We need to implement (float)x
I have the following code which must convert integer x to its floating point binary representation stored in an unsigned integer.
unsigned float_i2f(int x) {
    if (!x) return x;
    /* get sign of x */
    int sign = (x >> 31) & 0x1;
    /* absolute value of x */
    int a = sign ? ~x + 1 : x;
    /* calculate exponent */
    int e = 0;
    int t = a;
    while (t != 1) {
        /* divide by two until t is 1 */
        t >>= 1;
        e++;
    }
    /* calculate mantissa */
    int m = a << (32 - e);
    /* logical right shift */
    m = (m >> 9) & ~(((0x1 << 31) >> 9 << 1));
    /* add bias for 32bit float */
    e += 127;
    int res = sign << 31;
    res |= (e << 23);
    res |= m;
    /* lots of printf */
    return res;
}
One problem I encounter now is that when my integers are too big, my code fails. I have this control procedure implemented:
float f = (float)x;
unsigned int r;
memcpy(&r, &f, sizeof(unsigned int));
This of course always produces the correct output.
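For reference, a self-contained version of that control procedure (a sketch; ref_i2f is just a name used here for illustration, and it assumes 32-bit int, unsigned int and float):
#include <stdio.h>
#include <string.h>

/* let the compiler do (float)x, then copy out the raw bits */
unsigned int ref_i2f(int x)
{
    float f = (float)x;
    unsigned int r;
    memcpy(&r, &f, sizeof r);
    return r;
}

int main(void)
{
    printf("0x%08x\n", ref_i2f(1));                  /* 0x3f800000 */
    printf("0x%08x\n", ref_i2f(-2147483647 - 1));    /* 0xcf000000 */
    return 0;
}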
Now when I do some test runs, these are my outputs (GOAL is what it needs to be, result is what I got):
:!make && ./btest -f float_i2f -1 0x80004999
make: Nothing to be done for `all'.
Score Rating Errors Function
x: [-2147464807] 10000000000000000100100110011001
sign: 1
expone: 01001110100000000000000000000000
mantis: 00000000011111111111111101101100
result: 11001110111111111111111101101100
GOAL: 11001110111111111111111101101101
So in this case, a 1 is added as the LSB.
Next case:
:!make && ./btest -f float_i2f -1 0x80000001
make: Nothing to be done for `all'.
Score Rating Errors Function
x: [-2147483647] 10000000000000000000000000000001
sign: 1
expone: 01001110100000000000000000000000
mantis: 00000000011111111111111111111111
result: 11001110111111111111111111111111
GOAL: 11001111000000000000000000000000
Here 1 is added to the exponent while the mantissa is the complement of it.
I tried for hours to look it up on the internet and in my books etc., but I can't find any reference to this problem. I guess it has something to do with the fact that the mantissa is only 23 bits. But how do I have to handle that then?
EDIT: THIS PART IS OBSOLETE THANKS TO THE COMMENTS BELOW. int l must be unsigned l.
int x = 2147483647;
float f = (float)x;
int l = f;
printf("l: %d\n", l);
then l becomes -2147483648.
How can this happen? So C is doing the casting wrong?
Hope someone can help me here!
Thx
Markus
EDIT 2:
My updated code is now this:
unsigned float_i2f(int x) {
    if (x == 0) return 0;
    /* get sign of x */
    int sign = (x >> 31) & 0x1;
    /* absolute value of x */
    int a = sign ? ~x + 1 : x;
    /* calculate exponent */
    int e = 158;
    int t = a;
    while (!(t >> 31) & 0x1) {
        t <<= 1;
        e--;
    }
    /* calculate mantissa */
    int m = (t >> 8) & ~(((0x1 << 31) >> 8 << 1));
    m &= 0x7fffff;
    int res = sign << 31;
    res |= (e << 23);
    res |= m;
    return res;
}
I also figured out that the code works for all integers in the range -2^24, 2^24. Everything above/below sometimes works but mostly doesn't.
Something is missing, but I really have no idea what. Can anyone help me?
The answer printed is absolutely correct, as it is totally dependent on the underlying representation of the numbers being cast. However, if we understand the binary representation of the number, the result is not surprising.
To understand it, note that an implicit conversion is associated with the assignment operator (ref. C99 Standard 6.5.16). The C99 Standard goes on to say:
6.3.1.4 Real floating and integer
When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
Your earlier example illustrates undefined behavior due to assigning a value outside the range of the destination type (trying to assign a negative value to an unsigned type), not from converting floating point to integer.
The asserts in the following snippet ought to prevent any undefined behavior from occurring.
#include <assert.h>
#include <limits.h>
#include <math.h>

unsigned int convertFloatingPoint(double v) {
    double d;
    assert(isfinite(v));
    d = trunc(v);
    assert((d >= 0.0) && (d <= (double)UINT_MAX));
    return (unsigned int)d;
}
Another way of doing the same thing is to create a union containing a 32-bit integer and a float. The int and float are now just different ways of looking at the same bit of memory:
union {
    int myInt;
    float myFloat;
} my_union;

my_union.myInt = 0xBFFFF2E5;
printf("float is %f\n", my_union.myFloat);
float is -1.999600
You are telling the compiler to take the number you have (large integer) and make it into a float, not to interpret the number AS float. To do that, you need to tell the compiler to read the number from that address in a different form, so this:
myFloat = *(float *)&myInt ;
That means, if we take it apart, starting from the right:
&myInt - the location in memory that holds your integer.
(float *) - really, I want the compiler to use this as a pointer to float, not whatever the compiler thinks it may be.
* - read from the address of whatever is to the right.
myFloat = - set this variable to whatever is to the right.
So, you are telling the compiler: In the location of (myInt), there is a floating point number, now put that float into myFloat.
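A self-contained sketch of the same reinterpretation done with memcpy instead of the pointer cast (memcpy is what the asker's own control procedure uses, and it sidesteps aliasing concerns; the constant is the one from the union example above):
#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    uint32_t myInt = 0xBFFFF2E5;   /* the bit pattern of a float */
    float myFloat;

    /* same intent as myFloat = *(float *)&myInt, but copies the raw bytes */
    memcpy(&myFloat, &myInt, sizeof myFloat);

    printf("float is %f\n", myFloat);   /* prints roughly -1.999600 */
    return 0;
}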

Unexpected result of ++((unsigned)x), x is uint8_t?

Why doesn't this piece of code result in y == 0x100?
uint8_t x = 0xff;
unsigned y = ++((unsigned)x);
Check it out for yourself here: http://codepad.org/dmsmrtsg
The code you posted is invalid from the point of view of the C language. The result of any cast in C is an rvalue. It cannot be used as an operand of ++. Operator ++ requires an lvalue operand. I.e. the expression ++((unsigned) x) will not compile in standard C.
What you actually observe in this case is GCC's "generalized lvalues" extension
http://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Lvalues.html
Per that extension (and contrary to the standard C), a cast applied to an lvalue produces an lvalue. When you attempt to write something into the resultant "generalized" lvalue, the value being written is converted twice: it is first converted to the type specified by the explicit cast, and then the intermediate result is converted again to the type of recipient object. The final result is placed into the recipient object.
For example, if with your x you do
(unsigned) x = 0x100;
it will be actually interpreted by GCC as
x = (uint8_t) (unsigned) 0x100;
and the final value of x will be 0.
And this is exactly what happens in your example. In GCC your
++((unsigned) x)
is equivalent to
(unsigned) x = (unsigned) x + 1;
which is in turn interpreted by GCC as
x = (uint8_t) (unsigned) ((unsigned) x + 1);
This is why you get 0 in x as the result, and that is the 0 that then gets assigned to your y.
This extension is referred to as deprecated by GCC docs.
To start, this is not valid C code; I don't know how you got it to compile, but your link does show an output, so I'll try to explain what's happening based on this one major assumption:
I guess that with the line unsigned y = ++((unsigned)x); the second unsigned (the cast) is being dropped by your compiler, which is why you're able to build.
So, assuming that...
uint8_t x = 0xff; // 8 bit value, max is 255(10) or 0xFF(16)
unsigned y = ++((unsigned)x);
Now x already has the max value for its type. You want to know why, if we add 1 via ++, y doesn't get the value 0x100.
x is 8 bits; typecasting it doesn't change the fact that it's 8 bits. So when we say:
++x
We're incrementing x (x = x + 1). So we have an unsigned 8-bit value at its max, we add 1 to it, and now it has wrapped around to 0. So y will get 0.
If you wanted this to work you could do something like:
#include <stdio.h>

int main(void)
{
    unsigned char x = 0xFF; // I'm using char because it's 8 bit too
    unsigned int y = 1 + x; // no need to typecast, we're already unsigned
    printf("%#x %#x\n", x, y);
    return 0;
}
Now you'll get the expected values (x==0xFF and y==0x100)
Try this:
uint8_t x = 0xff;
unsigned y = ((unsigned)x) + 1;
It will come out as you expect, because ((unsigned)x) + 1 is computed in the wider unsigned type, giving the value 0x0100.
Now try this:
uint8_t x = 0xff;
++x;
The value of 0xff wraps around to 0x00.
I put some little transparency code into your excerpt and it explains everything.
#include <stdio.h>
#include <stdint.h> // needed for uint8_t

int main(void) {
    uint8_t x = 0xff;
    printf("%zu\n", sizeof(x));
    unsigned y = ++((unsigned)x);
    printf("%zu\n", sizeof(y));
    printf("0x%x\n", y);
    printf("%zu\n", sizeof(y));
    return 0;
}
and the output is
1 // size of x
4 // size of y before computation
0x100 // computed value of y from your code
4 // size of y after computation
First thing to notice is that the sizeof(y) stays constant across computation.
From the outputs,
uint8_t = 1 byte
unsigned = 4 bytes
When you do a cast in C, think of it as an implicit call to realloc which says: "take this data I have from its block, increase (or decrease) its size in memory to the size I want to cast it to, then return the same data in a new block."
And from our sizes, unsigned will have enough space to fit the result of the computation from a one-byte operation.
Re-explaining your code in byte-level detail:
x = 11111111 = 0xff (in a byte)
(unsigned)x = 00000000000000000000000011111111 = 0xff (in a word)
++((unsigned)x) = 00000000000000000000000100000000 = 0x100

Resources