Bitwise operation in C

I am new to bitwise operations. I understand the basic concepts of AND, OR, XOR, and two's complement. However, I came across the following piece of code and am unable to figure out the output.
char c1 = 0xFF; // -1
int shifted = c1 << 8; //-256 (-1 * 256)
printf("%d, %x\n", shifted, shifted);
int myInt;
myInt = 0xFFFFFFE2;
printf("%d\n", myInt);
int i = 0xff;
printf("%d\n", i<<2);
The output is:
-256, ffffff00
-30
1020
Please help me understand what's going on here!

char c1 = 0xFF; // -1
int shifted = c1 << 8; //-256 (-1 * 256)
c1 is promoted to int for the shift, so it is still -1. Strictly speaking, left-shifting a negative int is undefined behavior in C (only right-shifting one is implementation-defined), but your implementation does what most do and shifts it as if it were an unsigned bit pattern, so a left shift by eight places amounts to multiplication by 256.
printf( "%d, %x\n", shifted, shifted );
-256, ffffff00
as expected. The bit pattern of -256 in two's complement is 0xFFFFFF00 (32-bit ints).
int myInt;
myInt = 0xFFFFFFE2;
That bit pattern is -30 in two's complement
printf("%d\n",myInt);
int i = 0xff ;
This is 255, 255*4 = 1020
printf("%d\n", i<<2);
-30
1020

Write down c1, myInt and i in binary with the declared number of bits.
Apply the operations to them.
Follow it through.
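To follow that advice without converting by hand, a small helper can produce the bit pattern as text. This is only a minimal sketch, not something from the original answers: the name to_binary is made up for illustration, and it assumes unsigned has no padding bits.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: writes the bits of `value` into `out`,
   most significant bit first, followed by a terminating '\0'.
   `out` must have room for sizeof(unsigned) * CHAR_BIT + 1 characters. */
void to_binary(unsigned value, char *out)
{
    size_t width = sizeof(unsigned) * CHAR_BIT;
    for (size_t i = 0; i < width; i++) {
        /* Pick bit (width - 1 - i), so the highest bit is written first. */
        unsigned bit = (value >> (width - 1 - i)) & 1u;
        out[i] = (char)('0' + bit);
    }
    out[width] = '\0';
}
```

For example, passing 0xFFu << 2 fills the buffer with a string whose last twelve characters are 001111111100, which is 1020.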

OK let me explain in some detail what's going on. Any binary notation to clarify will be prefixed with 0b and the most significant bit on the left.
char c1 = 0xFF; // -1
char c1 is, on your platform, an eight-bit signed integer type (whether plain char is signed is implementation-defined), and it ends up holding the bit pattern 0b11111111.
For signed types such as int, short and signed char, the leftmost bit is used as the sign bit. In two's complement, the most common representation for signed types, any signed integer with all bits set is equal to -1.
int shifted = c1 << 8; //-256 (-1 * 256)
c1 is implicitly promoted to int before the shift (integer promotion), so we have an int of -1, which is 0xffffffff. The shift operator in C is a non-rotating shift, i.e. any bits shifted in "from" outside the value are set to zero. After the shift we have 0xffffff00, which is equal to -256 in two's complement. (Formally, left-shifting a negative value is undefined behavior; this is what typical implementations produce.)
printf( "%d, %x\n", shifted, shifted );
int myInt;
myInt = 0xFFFFFFE2;
printf("%d\n",myInt);
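The constant 0xFFFFFFE2 does not fit in a 32-bit int, so it is converted on assignment. On a two's complement machine the stored bit pattern is unchanged, and printing it as a signed integer gives -30 (0xFFFFFFE2 - 2^32 = -30).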
int i = 0xff ;
printf("%d\n", i<<2);
The initial i is equivalent to 0b11111111, the sign bit not being set. After the shift we have 0b1111111100, which is equal to 1020, again because of non-rotating shift. Hope that clarifies things a bit. If you want to do bit shifts and AND/OR logic you should typically use unsigned types as mentioned before.
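As a sketch of that last piece of advice, the two shifts from the question can be redone on unsigned values. The helper names below are hypothetical, and the code assumes the fixed-width types from stdint.h are available.

```c
#include <stdint.h>

/* Hypothetical helper: widen a byte to an unsigned 32-bit value, then shift.
   The operand is never negative, so no sign extension takes place.
   `count` must be less than 32. */
uint32_t shift_byte_left(uint8_t byte, unsigned count)
{
    return (uint32_t)byte << count;
}

/* Hypothetical helper: reproduce the sign-extended pattern on purpose.
   Converting a negative value to uint32_t is well defined (modulo 2^32),
   so -1 becomes 0xFFFFFFFF before the shift. `count` must be less than 32. */
uint32_t sign_extended_shift(int8_t value, unsigned count)
{
    return (uint32_t)value << count;
}
```

shift_byte_left(0xFF, 2) gives 1020 and shift_byte_left(0xFF, 8) gives 0xFF00, while sign_extended_shift(-1, 8) reproduces the 0xFFFFFF00 pattern from the question without ever shifting a negative int.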

Related

Extract k bits from any side of hex notation

int X = 0x1234ABCD;
int Y = 0xcdba4321;
// a) print the lower 10 bits of X in hex notation
int output1 = X & 0xFF;
printf("%X\n", output1);
// b) print the upper 12 bits of Y in hex notation
int output2 = Y >> 20;
printf("%X\n", output2);
I want to print the lower 10 bits of X in hex notation; since each character in hex is 4 bits, FF = 8 bits, would it be right to & with 0x2FF to get the lower 10 bits in hex notation.
Also, would shifting right by 20 drop all 20 bits at the end, and keep the upper 12 bits only?
I want to print the lower 10 bits of X in hex notation; since each character in hex is 4 bits, FF = 8 bits, would it be right to & with 0x2FF to get the lower 10 bits in hex notation.
No, that would be incorrect. You'd want to use 0x3FF to get the low 10 bits (0x2FF in binary is 1011111111). If you're a little uncertain with hex values, an easier way to do that these days is via binary constants instead (standard since C23, and a GCC/Clang extension before that), e.g.
// mask lowest ten bits in hex
int output1 = X & 0x3FF;
// mask lowest ten bits in binary
int output1 = X & 0b1111111111;
Also, would shifting right by 20 drop all 20 bits at the end, and keep the upper 12 bits only?
In the case of LEFT shift, zeros will be shifted in from the right, and the higher bits will be dropped.
In the case of RIGHT shift, it depends on the signedness of the type you are shifting.
// unsigned right shift
unsigned U = 0x80000000;
U = U >> 20;
printf("%x\n", U); // prints: 800
// signed right shift
int S = 0x80000000;
S = S >> 20;
printf("%x\n", S); // prints: fffff800
Signed right shift of a negative value typically shifts copies of the highest bit in from the left. Unsigned right shift always shifts in zeros.
As an aside: the C standard leaves the result of right-shifting a negative signed integer implementation-defined, so it is possible to have a platform that shifts in zeros for signed right shift (e.g. some micro-controllers). Most typical platforms (Intel/Arm) shift in the highest bit though.
Assuming 32 bit int, then you have the following problems:
0xcdba4321 is too large to fit inside an int. The hex constant itself will actually be unsigned int in this specific case, because of an oddball type rule in C. From there you force an implicit conversion to int, likely ending up with a negative number.
Y >> 20 right shifts a negative number, which is non-portable behavior. It can either shift in ones (arithmetic shift) or zeroes (logical shift), depending on compiler. Whereas right shifting unsigned types is well-defined and always results in logical shift.
& 0xFF masks out 8 bits, not 10.
%X expects an unsigned int, not an int.
The root of all your problems is "sloppy typing", that is, writing int all over the place when you actually need a more suitable type. You should start using the portable types from stdint.h instead, in this case uint32_t. Also make a habit of always ending your hex constants with a u or U suffix.
A fixed program:
#include <stdio.h>
#include <stdint.h>
int main (void)
{
uint32_t X = 0x1234ABCDu;
uint32_t Y = 0xcdba4321u;
printf("%X\n", X & 0x3FFu);
printf("%X\n", Y >> (32-12));
}
The 0x3FFu mask can also be written as ( (1u<<10) - 1).
(Strictly speaking, you need to printf the stdint.h types using the specifiers from inttypes.h, but let's not confuse the answer by introducing those at the same time.)
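The two extractions can also be wrapped in small functions. This is a sketch under the same 32-bit uint32_t assumption as the fixed program; the names low_bits and high_bits are invented here and do not come from any library.

```c
#include <stdint.h>

/* Hypothetical helper: keep the lowest k bits of value.
   Valid for 1 <= k <= 31; shifting by 32 or more would be undefined. */
uint32_t low_bits(uint32_t value, unsigned k)
{
    return value & ((UINT32_C(1) << k) - 1u);
}

/* Hypothetical helper: move the highest k bits down to the bottom.
   Valid for 1 <= k <= 32. */
uint32_t high_bits(uint32_t value, unsigned k)
{
    return value >> (32u - k);
}
```

low_bits(0x1234ABCDu, 10) yields 0x3CD and high_bits(0xcdba4321u, 12) yields 0xCDB. To print the results strictly by the book, the PRIX32 macro from inttypes.h can be used, e.g. printf("%" PRIX32 "\n", low_bits(X, 10));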
Lots of high-value answers to this question.
Here's more info that might spark curiosity...
int main() {
uint32_t X;
X = 0x1234ABCDu; // your first hex number
printf( "%X\n", X );
X &= ((1u<<12)-1)<<20; // mask 12 bits, shifting mask left
printf( "%X\n", X );
X = 0x1234ABCDu; // your first hex number
X &= ~0u^(~0u>>12);
printf( "%X\n", X );
X = 0x0234ABCDu; // Note leading 0 printed in two styles
printf( "%X %08X\n", X, X );
return 0;
}
1234ABCD
12300000
12300000
234ABCD 0234ABCD
print the upper 12 bits of Y in hex notation
To handle this when the width of int is not known, first determine the width with code like sizeof(unsigned)*CHAR_BIT. (C specifies it must be at least 16-bit.)
Best to use unsigned or mask the shifted result with an unsigned.
#include <limits.h>
int output2 = Y;
printf("%X\n", (unsigned) output2 >> (sizeof(unsigned)*CHAR_BIT - 12));
// or
printf("%X\n", (output2 >> (sizeof output2 * CHAR_BIT - 12)) & 0xFFFu); // 0xFFF keeps 12 bits
Rare non-2's complement encoded int needs additional code - not shown.
Very rare padded int needs other bit width detection - not shown.

Unknown system bitsize for int, how to create mask

I would like to create a mask for the MSB only. However, the width of int on the target system is supposed to be unknown, so you cannot assume 32 bits.
see the following
// THE FOLLOWING FAILS BECAUSE OF SYSTEM IMPLEMENTING A LOGICAL
// RIGHT SHIFT
// Idea is
// 1. 0 inverted = all 1's
// 2. Arithmetic shift right
// 3. Then invert again to preserve MSB '1'
const int unsigned mask = ~(~0>>1); // FAIL, because of logic shift
Assuming 16 bit system
~0 give FFFF
~0>>1 give 7FFF
~(~0 >> 1) give 8000
You should add a u suffix to make the shifted value unsigned, so that a logical right shift is performed instead of an arithmetic one.
const int unsigned mask = ~(~0u>>1);
You can just left shift the (unsigned) value 1 by the number of bits in the type minus 1 (i.e. for a 32-bit type, the MSB will be 1u << 31). To get the number of bits, use a combination of the sizeof operator and the CHAR_BIT constant (defined in <limits.h>):
const unsigned int MSB = 1u << (sizeof(unsigned int) * CHAR_BIT - 1);
INT_MAX is the int bit pattern of 0111...1111 (of some width)* for all implementations.
To form 1000...0000, invert those bits.
~INT_MAX
The above treads on undefined behavior (UB).
Better to look to unsigned or wider types.
unsigned mask = ~(unsigned) INT_MAX;
On rare machines, INT_MAX == UINT_MAX, so on those, look to wider types:
long long mask = ~(long long) INT_MAX;
On rarer machines (unheard of), INT_MAX == LONG_MAX is also true, then we are out of luck.
Pedantic: Rare machines use padding on int/unsigned, so best to drive code with (U)INT_MAX than sizeof.
* Maybe some padding bits too - very rare.
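In the same spirit of driving the code from the limits macros rather than sizeof, here is a minimal sketch that builds the MSB mask for unsigned from UINT_MAX. The function name msb_mask is made up for this example.

```c
#include <limits.h>

/* Hypothetical helper: a mask with only the most significant value bit
   of unsigned set. UINT_MAX >> 1 clears the top bit (logical shift on an
   unsigned type), and XOR with UINT_MAX leaves just that bit. */
unsigned msb_mask(void)
{
    return UINT_MAX ^ (UINT_MAX >> 1);
}
```

Because everything is done in unsigned arithmetic, no shift ever touches a sign bit, and the width of the type is never spelled out.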

Bit shift with a signed int resets one bit too much

Please have a look at the following code snippet, which basically simply bit shifts 1 byte by 24 bits to the left.
uint64_t i = 0xFFFFFFFF00000000;
printf("before: %016llX \n", i);
i += 0xFF << 24;
printf("after : %016llX \n", i);
// gives:
// before: FFFFFFFF00000000
// after : FFFFFFFEFF000000
The most significant 32 bits are FFFFFFFE (watch the E at the end). This is not what I expected. I don't see why shifting 1 byte 24 bits left would touch bit #32 (bit #31 should be the last one modified). It changed the last F (1111) into E (1110).
To make it work properly, I had to use 0xFF unsigned (0xFFU).
uint64_t i = 0xFFFFFFFF00000000;
printf("before: %016llX \n", i);
i += 0xFFU << 24;
printf("after : %016llX \n", i);
// gives:
// before: FFFFFFFF00000000
// after : FFFFFFFFFF000000
Why does the bit shift with a signed int (0xFF) touch/reset one bit too much?
You left-shifted into the sign bit.
The integer constant 0xFF has type int. Assuming an int is 32 bits, the expression 0xFF << 24 shifts a bit set to 1 into the high-order bit of a signed integer, which triggers undefined behavior. In your case it most likely manifested as a negative int that was sign-extended when converted to uint64_t for the addition, hence the unexpected value.
This is spelled out in section 6.5.7p4 of the C standard:
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Using the U suffix gives the constant type unsigned int, and it is valid to shift bits set to 1 into the high-order bit of an unsigned value because there is no sign bit.
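One plausible reading of the observed output can be reproduced without invoking undefined behavior. The sketch below is an illustration only, assuming a 32-bit int; both function names are hypothetical. The first adds an already-negative 32-bit value, the second widens before shifting.

```c
#include <stdint.h>

/* Hypothetical helper: add a signed 32-bit value to a 64-bit unsigned one.
   Converting a negative int32_t to uint64_t is well defined (modulo 2^64),
   and it sets the upper 32 bits, which is the sign extension at work. */
uint64_t add_widened_signed(uint64_t base, int32_t addend)
{
    return base + (uint64_t)addend;
}

/* Hypothetical helper: widen the byte first, then shift.
   The shift happens in 64-bit unsigned arithmetic; `count` must be < 64. */
uint64_t add_shifted_byte(uint64_t base, uint8_t byte, unsigned count)
{
    return base + ((uint64_t)byte << count);
}
```

With base 0xFFFFFFFF00000000, add_widened_signed(base, -16777216) returns 0xFFFFFFFEFF000000 (the value -16777216 has the 32-bit pattern 0xFF000000), matching the surprising output, while add_shifted_byte(base, 0xFF, 24) returns the expected 0xFFFFFFFFFF000000.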

Negative numbers: How can I change the sign bit in a signed int to a 0?

I was thinking this would work, but it does not:
int a = -500;
a = a << 1;
a = (unsigned int)a >> 1;
//printf("%d",a) gives me "2147483148"
My thought was that the left-shift would remove the leftmost sign bit, so right-shifting it as an unsigned int would guarantee that it's a logical shift rather than arithmetic. Why is this incorrect?
Also:
int a = -500;
a = a << 1;
//printf("%d",a) gives me "-1000"
TL;DR: the easiest way is to use the abs function from <stdlib.h>. The rest of the answer involves the representation of negative numbers on a computer.
Negative integers are (almost always) represented in 2's complement form. (see note below)
The method of getting the negative of a number is:
Take the binary representation of the whole number (including leading zeroes for the data type, except the MSB which will serve as the sign bit).
Take the 1's complement of the above number.
Add 1 to the 1's complement.
Prefix a sign bit.
Using 500 as an example,
Take the binary representation of 500: _000 0001 1111 0100 (_ is a placeholder for the sign bit).
Take the 1's-complement / inverse of it: _111 1110 0000 1011
Add 1 to the 1's complement: _111 1110 0000 1011 + 1 = _111 1110 0000 1100. This is the same as 2147483148 that you obtained, when you replaced the sign-bit by zero.
Prefix 0 to show a positive number and 1 for a negative number: 1111 1110 0000 1100. (This will be different from 2147483148 above. The reason you got the above value is because you nuked the MSB).
Inverting the sign is a similar process. You get leading ones if you use 16-bit or 32-bit numbers leading to the large value that you see. The LSB should be the same in each case.
Note: there are machines with 1's complement representation, but they are a minority. 2's complement is usually preferred because 0 has a single representation: in 1's complement, -0 and 0 have different bit patterns, whereas in 2's complement both are all zeroes.
Left-shifting negative integers invokes undefined behavior, so you can't do that. You could have used your code if you did a = (unsigned int)a << 1;. You'd get -500 = 0xFFFFFE0C, left-shifted 1 = 0xFFFFFC18.
a = (unsigned int)a >> 1; does indeed guarantee logical shift, so you get 0x7FFFFE0C. This is decimal 2147483148.
But this is needlessly complex. The best and most portable way to change the sign bit is simply a = -a. Any other code or method is questionable.
If you however insist on bit-twiddling, you could also do something like
(int32_t)a & ~(1u << 31)
This is portable to 32 bit systems, since (int32_t) guarantees two's complement, but 1u << 31 assumes 32 bit int type.
Demo:
#include <stdio.h>
#include <stdint.h>
int main (void)
{
int a = -500;
a = (unsigned int)a << 1;
a = (unsigned int)a >> 1;
printf("%.8X = %d\n", a, a);
_Static_assert(sizeof(int)>=4, "Int must be at least 32 bits.");
a = -500;
a = (int32_t)a & ~(1u << 31);
printf("%.8X = %d\n", a, a);
return 0;
}
As you put in your "Also" section, after your first left shift of 1 bit, a DOES reflect -1000 as expected.
The issue is in your cast to unsigned int. As explained above, the negative number is represented as 2's complement, meaning the sign is determined by the left most bit (most significant bit). When cast to an unsigned int, that value no longer represents sign but increases the maximum value your int can take.
Assuming 32 bit ints, the MSB used to represent -2^31 (= -2147483648) and now represents positive 2147483648 in an unsigned int, for an increase of 2* 2147483648 = 4294967296. Add this to your original value of -1000 and you get 4294966296. Right shift divides this by 2 and you arrive at 2147483148.
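The arithmetic above can be checked with a short sketch that does every step in unsigned 32-bit arithmetic, so none of the shifts is undefined. The function name is invented for this example, and a 32-bit representation is assumed via uint32_t.

```c
#include <stdint.h>

/* Hypothetical helper: clear the top bit by shifting left, then right. */
uint32_t clear_top_bit_by_shifting(int32_t value)
{
    uint32_t bits = (uint32_t)value; /* -500 becomes 4294966796 (0xFFFFFE0C) */
    bits <<= 1;                      /* top bit falls off: 0xFFFFFC18 */
    bits >>= 1;                      /* logical shift, a zero comes in: 0x7FFFFE0C */
    return bits;
}
```

clear_top_bit_by_shifting(-500) returns 2147483148, the same as masking with 0x7FFFFFFF. That is the two's complement pattern of -500 with its top bit cleared, not the value 500.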
Hoping this may be helpful: (modified printing func from Print an int in binary representation using C)
void int2bin(int a, char *buffer, int buf_size) {
buffer += (buf_size - 1);
for (int i = buf_size-1; i >= 0; i--) {
*buffer-- = (a & 1) + '0';
a >>= 1;
}
}
int main() {
int test = -500;
int bufSize = sizeof(int)*8 + 1;
char buf[bufSize];
buf[bufSize-1] = '\0';
int2bin(test, buf, bufSize-1);
printf("%i (%u): %s\n", test, (unsigned int)test, buf);
//Prints: -500 (4294966796): 11111111111111111111111000001100
test = test << 1;
int2bin(test, buf, bufSize-1);
printf("%i (%u): %s\n", test, (unsigned int)test, buf);
//Prints: -1000 (4294966296): 11111111111111111111110000011000
test = 500;
int2bin(test, buf, bufSize-1);
printf("%i (%u): %s\n", test, (unsigned int)test, buf);
//Prints: 500 (500): 00000000000000000000000111110100
return 0;
}

cast without * operator

Could someone explain to me what's happening to "n" in this situation?
main.c
unsigned long temp0;
PLLSYS0_FWD_DIV_A_DECODE(n);
main.h
#define PLLSYS0_FWD_DIV_A_DECODE(n) ((((unsigned long)(n))>>8)& 0x0000000f)
I understand that n is being shifted 8 bits and then anded with 0x0000000f. So what does (unsigned long)(n) actually do?
#include <stdio.h>
int main(void)
{
unsigned long test1 = 1;
printf("test1 = %d \n", test1);
printf("(unsigned long)test1 = %d \n", (unsigned long)(test1));
return 0;
}
Output:
test1 = 1
(unsigned long)test1 = 1
In your code example, the cast doesn't make much sense because test1 is already an unsigned long, but it makes sense when the macro is used on a different type like unsigned char etc.
Also you should use %lu in printf to print unsigned long.
printf("(unsigned long)test1 = %lu\n", (unsigned long)(test1));
// ^^
It widens it to be the size of an unsigned long. Imagine if you called this with a char and shifted it 8 bits to the right, the anding wouldn't work the same.
Also just found this (look under right-shift operator) for why it's unsigned. Apparently unsigned forces a logical shift, in which the left-most bit is replaced with a zero for each position shifted, whereas shifting a negative signed value typically performs an arithmetic shift, in which the vacated left-most bits are filled with copies of the sign bit.
Example:
11000011 ( unsigned, shifted to the right by 1 )
01100001
11000011 ( signed, shifted to the right by 1 )
11100001
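The two 8-bit cases can be sketched in code. Since right-shifting a negative signed value is implementation-defined, the arithmetic variant below is emulated on an unsigned byte by copying the old top bit back in; both function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: logical shift right by one, a zero enters on the left. */
uint8_t logical_shift_right_1(uint8_t bits)
{
    return (uint8_t)(bits >> 1);
}

/* Hypothetical helper: emulated arithmetic shift right by one.
   The old top bit (0x80) is kept in place, so the "sign" is preserved. */
uint8_t arithmetic_shift_right_1(uint8_t bits)
{
    return (uint8_t)((bits >> 1) | (bits & 0x80u));
}
```

For the pattern 11000011 (0xC3), the logical version gives 01100001 (0x61) and the emulated arithmetic version gives 11100001 (0xE1), matching the example above.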
Could someone explain to me what's happening to "n" in this situation?
You are casting n to unsigned long.
So what does (unsigned long)(n) actually do?
It will convert n to unsigned long.
Casting the input is all it's doing before the bit shift and the anding. The extra parentheses are there to be careful about order of operations and operator precedence. It's pretty ugly.
But looks like they're avoiding hitting the sign bit and by doing this instead of a function, there's no type checking on n.
It's just ugly.
Better form would be to have a clean clear function that has input type checking.
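A function version might look like the sketch below. The function name is hypothetical (a lower-case counterpart of the macro), and it assumes the intent is exactly what the macro does: take bits 8 to 11 of the argument.

```c
/* Hypothetical function equivalent of PLLSYS0_FWD_DIV_A_DECODE.
   The parameter type makes the conversion to unsigned long explicit,
   and the compiler can now check the argument. */
static inline unsigned long pllsys0_fwd_div_a_decode(unsigned long n)
{
    return (n >> 8) & 0xFul;
}
```

Called with 0x1234, it returns 0x2. Any argument is converted to unsigned long at the call site, so the shift is always a logical one.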
That ensures that n has the proper size (in bits) and, most importantly, is treated as unsigned. Right-shifting a signed negative number typically performs sign extension, so the vacated bits are filled with 1 rather than 0. This means a negative number shifted right will always remain negative on such implementations.
For example:
int main()
{
long i = -1;
long x, y;
x = ((unsigned long)i) >> 8;
y = i >> 8;
printf("%ld %ld\n", x, y);
}
On my machine it outputs:
72057594037927935 -1
Because of the sign extension in y, the number continues to be -1.

Resources