Analysing program bitshifting - c

I'll start with the code right away:
#include <stdio.h>
int main()
{
unsigned char value = 0xAF;
printf("%02x\n", value);
value = (value << 4) | (value >> 4);
printf("%02x\n", value);
return 0;
}
At first I thought you couldn't store numbers in chars and that you would need to make that an int. Apparently not. Then, when I did the bit-shifting maths:
value << 4 = 101011110
value >> 4 = 1010111
101011110
| 1010111
=101011111
and that would be 0x15f.
If I compile that code it prints
af
fa
Can anyone explain to me where I'm thinking wrong?

Bit shifting 4 shifts 4 binary digits, not 2 as you seem to be showing. It also shifts 1 hex digit. So if you have 0xAF, shifting left 4 gives you 0xF0. Because it is a char, it only has 8 bits and the A is cut off. Shifting right 4 similarly yields 0xA. 0x0A | 0xF0 == 0xFA.

Start with the baseline: 0xaf is 1010-1111 in binary (and we're assuming an eight-bit char here based on the code, though that's not mandated by the standard).
The expression value << 4 will left-shift that by four bits (not one as you seem to think), giving binary 1010-1111-0000 and, yes, it's more than an eight-bit char because of integer promotions (both operands of a << expression are promoted to int as per ISO C11 6.5.7 and also in earlier iterations of the standard).
The expression value >> 4 will right-shift it by four bits, giving binary 1010.
When you bitwise-or those together, you get:
1010-1111-0000
          1010
==============
1010-1111-1010
and when you finally try to shoe-horn that back into the eight-bit value, it lops off the upper bits, giving binary 1111-1010, which is 0xFA.
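To make the promotion and the truncation visible as two separate steps, here is a minimal sketch. The helper names swap_nibbles_wide and swap_nibbles are made up for illustration, and an eight-bit char is assumed, as in the answers above.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the expression from the question, kept at int width. */
int swap_nibbles_wide(unsigned char value)
{
    /* value is promoted to int before either shift happens,
       so no bits fall off the left end here. */
    return (value << 4) | (value >> 4);
}

/* Hypothetical helper: the same expression stored back into 8 bits. */
unsigned char swap_nibbles(unsigned char value)
{
    assert(CHAR_BIT == 8);  /* assumption shared with the answers above */
    /* Converting to unsigned char keeps only the low 8 bits. */
    return (unsigned char)swap_nibbles_wide(value);
}
```

For 0xAF the wide version returns 0xAFA, and the narrowing conversion is what turns that into 0xFA.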

You might have messed up the bit representations in your calculation.
Ok. I will try to explain according to the code you have provided.
value 0XAF = 10101111
value << 4 = 11110000
value >> 4 = 00001010
11110000
|00001010 = 11111010 and hence the 0XFA.
Explanation:
1. Representation is in binary 8 bit.
2. When you left/right shift by a number, I think you are considering it in terms of multiplication and division, but in 8-bit binary representation it just gets shifted by 4 places and the bits get replaced by 0.
Hope this helps.

Because sizeof(unsigned char) is equal to 1, "value" holds a single byte, which is 8 bits on typical platforms.
The range of "value" is from 0x0 to 0xFF, so its valid bits are bit 0 to bit 7.
So if 0x15F were assigned to "value" after bit shifting, only bit 0 to bit 7 would be stored in the variable; bit 8 would be cut off.
0x15f ---binarization---> 0001 0101 1111
The variable "value" is 8-bit data, so only 0101 1111 is assigned to it.
value ---binarization---> 0101 1111

Related

What does hibyte = Value >> 8 meaning?

I am using C for developing my program and I found out from an example code
unHiByte = unVal >> 8;
What does this mean? If unVal = 250. What could be the value for unHiByte?
>> in C is a bitwise operator: it means shift right.
So unVal >> 8 means shift unVal right by 8 bits. For an unsigned value, shifting right by one bit is the same as dividing by 2 and discarding the remainder.
Hence, unHiByte = unVal >> 8 means unHiByte = unVal/(2^8) (divide unVal by 2 eight times)
Without going into the shift operator itself (since that is answered already), here the assumption is that unVal is a two byte variable with a high byte (the upper 8 bits) and a low byte (the lower 8 bits). The intent is to obtain the value produced by ONLY the upper 8 bits and discarding the lower bits.
The shift operator though should easily be learned via any book / tutorial and perhaps was the reason some one down voted the question.
The >> is a bitwise right shift.
It operates on bits. Take unHiByte = unVal >> 8; with unVal = 250.
Its binary form is 11111010.
Right shift means to shift the bits to the right. So when you shift 1111 1010 eight places to the right you get 0000 0000.
Note: You can easily determine the result of a right shift by dividing the number to the left of >> by 2^(number to the right of >>) and discarding the remainder.
So, 250/2^8 = 0
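As a small sketch of the high-byte idea, the helper below picks one byte out of a wider value. The name byte_at and the choice of a 32-bit argument are assumptions made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return byte number `index` of v,
   where byte 0 is the least significant byte. */
uint8_t byte_at(uint32_t v, unsigned index)
{
    assert(index < 4);                    /* a uint32_t has four bytes */
    /* Shift the wanted byte down to the bottom, then let the
       conversion to uint8_t drop everything above it. */
    return (uint8_t)(v >> (8 * index));
}
```

With this, byte_at(250, 1) is the "high byte" of 250 in a two-byte variable, which is 0, while byte_at(0x2A63, 1) is 0x2A.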
For example: if you have a hex 0x2A63 and you want to take 2A or you want to take 63 out of it, then you will do this.
For example, if we convert 2A63 to binary which is: 0010101001100011. (that is 16 bits, first 8 bits are 2A and the second 8 bits are 63)
The issue is that a conversion to uint8_t keeps only the rightmost 8 bits. So we have to push the first 8 bits (2A) to the right side to be able to get them.
uint16_t hex = 0x2A63;
uint8_t part2A = (uint8_t)(hex >> 8); // Pushes the first eight bits (2A) to the right;
// (63) is shifted out of the way, leaving 0000000000101010.
// The line above gives us 0x2A, which now sits in the last 8 bits.
// To get 63 we simply do:
uint8_t part63 = (uint8_t)hex; // By default 63 is already on the rightmost side of the binary value.
It is that simple.

Why does the following bitwise operation return an unintended result?

3 bits can hold up to a maximum number of 7 (4 + 2 + 1). I'm trying to calculate this using a bitwise operation.
3 is 0b011
~3 is 0b100
Doing a bitwise OR I would expect 0b111 (i.e. 7). Instead I get
int result = (~3) | 3;
printf("%i\n", result);
-1
What am I doing wrong?
You are doing everything right: N | ~N results in a number whose binary representation consists of all ones. Such a number is interpreted as -1 in the two's complement representation of negative numbers.
How many bits wide is an int? You seem to think it's three bits wide. Certainly not correct! Guess again. What is ~0u? Try printf("%u\n", ~0u);. What about ~1u? ... and ~2u? Do you notice a pattern?
Note the u suffix, which tells the compiler that it's an unsigned literal. You can't work with signed integer types with the ~ operator... Well, you can, but you might run into trap representations and negative zeros, according to 6.2.6.2 of n1570.pdf. Using a trap representation is undefined behaviour. That might work on your system, but only by coincidence. Do you want to rely upon coincidence?
Similarly, I suggest using the %u directive to print unsigned values, as %d would produce undefined behaviour according to 7.21.6.1p9 of n1570.pdf.
When you do ~3 you are inverting the bits that make up 3 - so you turn 0000 0000 0000 0000 0000 0000 0000 0011 into 1111 1111 1111 1111 1111 1111 1111 1100. Since the high bit is set, this is interpreted as a negative number - all 1s is -1, one less than that is -2, one less -3 and so on. This number is the signed 32 bit integer for -4.
If you binary OR this with 3, you get all 1s (by definition) - which is the signed 32 bit integer for -1.
Your only problem is that you think you are working with 3 bit numbers, but you are actually working with 32 bit numbers.
After doing this in the code
int result = (~3) | 3;
Add this line
result = result & 0x07;
This will give you the answer that you expect.
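The same masking idea can be wrapped in a function that inverts only the lowest few bits. This is just a sketch; invert_low_bits is a name invented for the example, and it assumes the width is smaller than the number of bits in an unsigned.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: invert the lowest `width` bits of value
   and return 0 in every higher bit position. */
unsigned invert_low_bits(unsigned value, unsigned width)
{
    assert(width >= 1 && width < sizeof(unsigned) * CHAR_BIT);
    unsigned mask = (1u << width) - 1u;   /* width = 3 gives 0x7 */
    return ~value & mask;                 /* ~ flips every bit, & trims the result */
}
```

So invert_low_bits(3, 3) is 4 (0b100), and OR-ing that with 3 gives the 7 the question was expecting.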
#include <stdio.h>
int main (){
unsigned d3 = 0b011;
unsigned invd3 = ~d3;
unsigned d4 = 0b100;
unsigned result = d3 | invd3;
printf("%X\n", result);//FFFFFFFF
result = d3 | d4;
printf("%X\n", result);//7
return 0;
}

reading 2 bits off a register

I'm looking at a datasheet specification of a NIC and it says:
bits 2:3 of register contain the NIC speed, 4 contains link state, etc. How can I isolate these bits using bitwise?
For example, I've seen the code to isolate the link state which is something like:
(link_reg & (1 << 4))>>4
But I don't quite get why the right shift. I must say, I'm still not fairly comfortable with the bitwise ops, even though I understand how to convert to binary and what each operation does, but it doesn't ring as practical.
It depends on what you want to do with that bit. The link state, call it L is in a variable/register somewhere
    43210
xxxxLxxxx
To isolate that bit you want to and it with a 1, a bitwise operation:
  xxLxxxx
& 0010000
=========
  00L0000
1<<4 = 1 with 4 zeros or 0b10000, the number you want to and with.
status&(1<<4)
This will give a result of either zero or 0b10000. You can do a boolean comparison to determine if it is false (zero) or true (not zero)
if(status&(1<<4))
{
//bit was on/one
}
else
{
//bit was off/zero
}
If you want to have the result be a 1 or zero, you need to shift the result to the ones column
(0b00L0000 >> 4) = 0b0000L
If the result of the and was zero then shifting still gives zero, if the result was 0b10000 then the shift right of 4 gives a 0b00001
so
(status&(1<<4))>>4 gives either a 1 or 0;
(xxxxLxxxx & (00001<<4))>>4 =
(xxxxLxxxx & (10000))>>4 =
(0000L0000) >> 4 =
0000L
Another way to do this using fewer operations is
(status>>4)&1;
xxxxLxxxx >> 4 = xxxxxxL
xxxxxxL & 00001 = 00000L
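The shift-then-mask form can be written as a small function. This is only a sketch, and get_bit is a made-up name rather than anything from a NIC driver.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: return bit number `pos` of reg as 0 or 1
   (bit 0 is the least significant bit). */
unsigned get_bit(unsigned reg, unsigned pos)
{
    assert(pos < sizeof reg * CHAR_BIT);  /* shifting by the full width is undefined */
    return (reg >> pos) & 1u;             /* move the bit to position 0, drop the rest */
}
```

For the link state in bit 4, get_bit(status, 4) gives the same 0 or 1 as (status & (1 << 4)) >> 4.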
Easiest to look at some binary numbers.
Here's a possible register value, with the bit index underneath:
00111010
76543210
So, bit 4 is 1. How do we get just that bit? We construct a mask containing only that bit (which we can do by shifting a 1 into the right place, i.e. 1<<4), and use &:
  00111010
& 00010000
----------
  00010000
But we want a 0 or a 1. So, one way is to shift the result down: 00010000 >> 4 == 1. Another alternative is !!val, which turns 0 into 0 and nonzero into 1 (note that this only works for single bits, not a two-bit value like the link speed).
Now, if you want bits 3:2, you can use a mask with both of those bits set. You can write 3 << 2 to get 00001100 (since 3 has two bits set). Then we & with it:
  00111010
& 00001100
----------
  00001000
and shift down by 2 to get 10, the desired two bits. So, the statement to get the two-bit link speed would be (link_reg & (3<<2))>>2.
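Generalising from one bit to a multi-bit field, a sketch could look like the following. The name get_field and its parameters lsb and width are assumptions made for the example; the mask is built as (1u << width) - 1 instead of writing 3 by hand.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: return the `width`-bit field of reg whose
   lowest bit sits at position `lsb` (bit 0 is the least significant). */
unsigned get_field(unsigned reg, unsigned lsb, unsigned width)
{
    const unsigned bits = sizeof reg * CHAR_BIT;
    assert(width >= 1 && width < bits && lsb <= bits - width);
    unsigned mask = (1u << width) - 1u;   /* width = 2 gives 0b11 */
    return (reg >> lsb) & mask;           /* shift first, then mask */
}
```

For the register value 00111010 used above, get_field(reg, 2, 2) returns 2 (binary 10), matching (link_reg & (3<<2))>>2.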
If you want to treat bits 2 and 3 (starting the count at 0) as a number, you can do this:
unsigned int n = (link_reg & 0xF) >> 2;
The bitwise and with 15 (which is 0b1111 in binary) sets all but the bottom four bits to zero, and the following right-shift by 2 gets you the number in bits 2 and 3.
You can use this to determine if the bit at position pos is set in val:
#define CHECK_BIT(val, pos) ((val) & (1U<<(pos)))
if (CHECK_BIT(reg, 4)) {
/* bit 4 is set */
}
The bitwise and operator (&) sets each bit in the result to 1 if both operands have the corresponding bit set to 1. Otherwise, the result bit is 0.
The problem is that isolating bits is not enough: you need to shift them to get the correct size order of the value.
In your example you have bits 2 and 3 for the speed (I'm assuming that the least significant bit is bit 0), which means it is a value in the range [0,3]. Now you can mask these bits with reg & (0x03<<2) or, converted, (reg & 0x0C), but this is not enough:
reg   0110 1010 &
0x0C  0000 1100
---------------
0x08  0000 1000
As you can see the result is 1000b, which is 8, which is outside the range. To solve this you need to shift the result back so that the least significant bit of the value you are interested in corresponds to the least significant bit of the containing byte:
0000 1000 >> 2 = 10b = 2
which now is correct.

How to create mask with least significant bits set to 1 in C

Can someone please explain this function to me?
A mask with the least significant n bits set to 1.
Ex:
n = 6 --> 0x2F, n = 17 --> 0x1FFFF // I don't get these at all, especially how n = 6 --> 0x2F
Also, what is a mask?
The usual way is to take a 1, and shift it left n bits. That will give you something like: 00100000. Then subtract one from that, which will clear the bit that's set, and set all the less significant bits, so in this case we'd get: 00011111.
A mask is normally used with bitwise operations, especially and. You'd use the mask above to get the 5 least significant bits by themselves, isolated from anything else that might be present. This is especially common when dealing with hardware that will often have a single hardware register containing bits representing a number of entirely separate, unrelated quantities and/or flags.
A mask is a common term for an integer value that is bit-wise ANDed, ORed, XORed, etc with another integer value.
For example, if you want to extract the 8 least significant bits of an int variable, you do variable & 0xFF. 0xFF is a mask.
Likewise if you want to set bits 0 and 8, you do variable | 0x101, where 0x101 is a mask.
Or if you want to invert the same bits, you do variable ^ 0x101, where 0x101 is a mask.
To generate a mask for your case you should exploit the simple mathematical fact that if you add 1 to your mask (the mask having all its least significant bits set to 1 and the rest to 0), you get a value that is a power of 2.
So, if you generate the closest power of 2, then you can subtract 1 from it to get the mask.
Positive powers of 2 are easily generated with the left shift << operator in C.
Hence, 1 << n yields 2^n. In binary it's 10...0 with n 0s.
(1 << n) - 1 will produce a mask with n lowest bits set to 1.
Now, you need to watch out for overflows in left shifts. In C (and in C++) you can't legally shift a variable left by as many bit positions as the variable has, so if ints are 32-bit, 1<<32 results in undefined behavior. Signed integer overflows should also be avoided, so you should use unsigned values, e.g. 1u << 31.
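Putting the (1 << n) - 1 idea and the overflow caveat together, a 32-bit sketch might look like this. The name low_mask32 is invented for the example, and the full-width case is handled separately because a shift by 32 would be undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a mask with the n least significant
   bits set to 1, for 0 <= n <= 32. */
uint32_t low_mask32(unsigned n)
{
    assert(n <= 32);
    if (n == 32)
        return UINT32_MAX;                /* 1 << 32 would be undefined */
    return ((uint32_t)1 << n) - 1u;       /* 2^n - 1: n ones at the bottom */
}
```

This gives 0x3F for n = 6 and 0x1FFFF for n = 17.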
For both correctness and performance, the best way to accomplish this has changed since this question was asked back in 2012 due to the advent of BMI instructions in modern x86 processors, specifically BLSMSK.
Here's a good way of approaching this problem, while retaining backwards compatibility with older processors.
This method is correct, whereas the current top answers produce undefined behavior in edge cases.
Clang and GCC, when allowed to optimize using BMI instructions, will condense gen_mask() to just two ops. With supporting hardware, be sure to add compiler flags for BMI instructions:
-mbmi -mbmi2
#include <inttypes.h>
#include <stdio.h>
uint64_t gen_mask(const uint_fast8_t msb) {
const uint64_t src = (uint64_t)1 << msb;
return (src - 1) ^ src;
}
int main() {
uint_fast8_t msb;
for (msb = 0; msb < 64; ++msb) {
printf("%016" PRIx64 "\n", gen_mask(msb));
}
return 0;
}
First, for those who only want the code to create the mask:
uint64_t bits = 6;
uint64_t mask = ((uint64_t)1 << bits) - 1;
// Results in 0b111111 (or 0x03F)
Thanks to @Benni who asked about using bits = 64. If you need the code to support this value as well, you can use:
uint64_t bits = 6;
uint64_t mask = (bits < 64)
? ((uint64_t)1 << bits) - 1
: (uint64_t)0 - 1;
For those who want to know what a mask is:
A mask is usually a name for a value that we use to manipulate other values using bitwise operations such as AND, OR, XOR, etc.
Short masks are usually represented in binary, where we can explicitly see all the bits that are set to 1.
Longer masks are usually represented in hexadecimal, which is really easy to read once you get the hang of it.
You can read more about bitwise operations in C here.
I believe your first example should be 0x3f.
0x3f is hexadecimal notation for the number 63 which is 111111 in binary, so that last 6 bits (the least significant 6 bits) are set to 1.
The following little C program will calculate the correct mask:
#include <stdio.h>
/* Build the mask one bit at a time; unsigned avoids overflowing into the sign bit. */
unsigned mask_for_n_bits(int n)
{
    unsigned mask = 0;
    for (int i = 0; i < n; ++i)
        mask |= 1u << i;
    return mask;
}
int main (int argc, char const *argv[])
{
    printf("6: 0x%x\n17: 0x%x\n", mask_for_n_bits(6), mask_for_n_bits(17));
    return 0;
}
0x2F is 0010 1111 in binary - this should be 0x3f, which is 0011 1111 in binary and which has the 6 least-significant bits set.
Similarly, 0x1FFFF is 0001 1111 1111 1111 1111 in binary, which has the 17 least-significant bits set.
A "mask" is a value that is intended to be combined with another value using a bitwise operator like &, | or ^ to individually set, unset, flip or leave unchanged the bits in that other value.
For example, if you combine the mask 0x3F with some value n using the & operator, the result will have zeroes in all but the 6 least significant bits, and those 6 bits will be copied unchanged from the value n.
In the case of an & mask, a binary 0 in the mask means "unconditionally set the result bit to 0" and a 1 means "set the result bit to the input value bit". For an | mask, a 0 in the mask sets the result bit to the input bit and a 1 unconditionally sets the result bit to 1, and for an ^ mask, a 0 sets the result bit to the input bit and a 1 sets the result bit to the complement of the input bit.
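The three kinds of mask described above can be tried side by side with a small sketch. The function apply_mask and its op parameter are hypothetical, introduced only to put the three operators in one place.

```c
#include <assert.h>

/* Hypothetical helper: combine value with mask using the operator
   named by op, which must be '&', '|' or '^'. */
unsigned apply_mask(unsigned value, unsigned mask, char op)
{
    switch (op) {
    case '&': return value & mask;   /* 0 in mask clears, 1 keeps the input bit */
    case '|': return value | mask;   /* 0 in mask keeps the input bit, 1 sets it */
    case '^': return value ^ mask;   /* 0 in mask keeps the input bit, 1 flips it */
    default:
        assert(!"op must be '&', '|' or '^'");
        return value;
    }
}
```

For instance, apply_mask(0xABCD, 0x3F, '&') leaves only the low six bits, 0x0D.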

Need help understanding "getbits()" method in Chapter 2 of K&R C

In chapter 2, the section on bitwise operators (section 2.9), I'm having trouble understanding how one of the sample methods works.
Here's the method provided:
unsigned int getbits(unsigned int x, int p, int n) {
return (x >> (p + 1 - n)) & ~(~0 << n);
}
The idea is that, for the given number x, it will return the n bits starting at position p, counting from the right (with the farthest right bit being position 0). Given the following main() method:
int main(void) {
int x = 0xF994, p = 4, n = 3;
int z = getbits(x, p, n);
printf("getbits(%u (%x), %d, %d) = %u (%X)\n", x, x, p, n, z, z);
return 0;
}
The output is:
getbits(63892 (f994), 4, 3) = 5 (5)
I get portions of this, but am having trouble with the "big picture," mostly because of the bits (no pun intended) that I don't understand.
The part I'm specifically having issues with is the complements piece: ~(~0 << n). I think I get the first part, dealing with x; it's this part (and then the mask) that I'm struggling with -- and how it all comes together to actually retrieve those bits. (Which I've verified it is doing, both with code and checking my results using calc.exe -- thank God it has a binary view!)
Any help?
Let's use 16 bits for our example. In that case, ~0 is equal to
1111111111111111
When we left-shift this n bits (3 in your case), we get:
1111111111111000
because the 1s at the left are discarded and 0s are fed in at the right. Then re-complementing it gives:
0000000000000111
so it's just a clever way to get n 1-bits in the least significant part of the number.
The "x bit" you describe has shifted the given number (f994 = 1111 1001 1001 0100) right far enough so that the least significant 3 bits are the ones you want. In this example, the input bits you're requesting are there, all other input bits are marked . since they're not important to the final result:
f994             ...........101..  # original number
>> p+1-n     [2] .............101  # shift desired bits to right
& ~(~0 << n) [7] 0000000000000101  # clear all the other (left) bits
As you can see, you now have the relevant bits, in the rightmost bit positions.
I would say the best thing to do is to do a problem out by hand, that way you'll understand how it works.
Here is what I did using an 8-bit unsigned int.
Our number is 75, and we want the 4 bits starting from position 6.
The call for the function would be getbits(75,6,4);
75 in binary is 0100 1011.
So we create a mask that is 4 bits long, starting with the lowest-order bit. This is done as follows:
~0  = 1111 1111
<<4 = 1111 0000
~   = 0000 1111
Okay, we've got our mask.
Now we push the bits we want out of the number into the lowest-order bits, so
we shift binary 75 right by 6+1-4=3.
0100 1011 >> 3 = 0000 1001
Now we have a mask of the correct number of bits in the low order and the bits we want out of the original number in the low order.
So we & them:
  0000 1001
& 0000 1111
============
  0000 1001
So the answer is decimal 9.
Note: the higher order nibble just happens to be all zeros, making the masking redundant in this case but it could have been anything depending on the value of the number we started with.
~(~0 << n) creates a mask that will have the n right-most bits turned on.
0
0000000000000000
~0
1111111111111111
~0 << 4
1111111111110000
~(~0 << 4)
0000000000001111
ANDing the result with something else will return what's in those n bits.
Edit: I wanted to point out this programmer's calculator I've been using forever: AnalogX PCalc.
Nobody mentioned it yet, but in ANSI C ~0 << n causes undefined behaviour.
This is because ~0 is a negative number and left-shifting negative numbers is undefined.
Reference: C11 6.5.7/4 (earlier versions had similar text)
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. [...] If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
In K&R C this code would have relied on the particular class of system that K&R developed on, naively shifting 1 bits off the left when performing left-shift of a signed number (and this code also relies on 2's complement representation), but some other systems don't share those properties so the C standardization process did not define this behaviour.
So this example is really only interesting as a historical curiosity, it should not be used in any real code since 1989 (if not earlier).
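One way to keep the same idea while staying inside defined behaviour is to start from the unsigned constant ~0u. The variant below is a sketch, not the K&R text; the name getbits_u and the precondition checks are additions for this example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical variant of getbits: return the n bits of x that end at
   position p (bit 0 is the rightmost), using unsigned arithmetic only. */
unsigned getbits_u(unsigned x, unsigned p, unsigned n)
{
    const unsigned bits = sizeof x * CHAR_BIT;
    assert(p < bits && n >= 1 && n <= p + 1 && n < bits);
    /* ~0u is all ones and unsigned, so shifting it left is well defined. */
    return (x >> (p + 1 - n)) & ~(~0u << n);
}
```

It returns 5 for getbits_u(0xF994, 4, 3) and 9 for getbits_u(75, 6, 4), the same values worked out by hand above.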
Using the example:
int x = 0xF994, p = 4, n = 3;
int z = getbits(x, p, n);
and focusing on this set of operations
~(~0 << n)
For any bit pattern (10010011 etc.) you want to generate a "mask" that pulls out only the bits you want to see. So for 10010011 (0x93), I'm interested in xxxxx011. What is the mask that will extract that set? 00000111. Now, I want to be independent of sizeof int, so I'll let the machine do the work: start with 0. On a byte machine it's 0x00, on a word machine it's 0x0000, etc. A 64-bit machine would represent it with 64 bits, or 0x0000000000000000.
Now apply "not" (~0) and get 11111111
shift left (<<) by n and get 11111000
and "not" that and get 00000111
so 10010011 & 00000111 = 00000011
You remember how boolean operations work?
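In ANSI C the result of ~0 >> n is implementation-defined, because ~0 is a negative int.
// In practice it is the right shift of ~0, rather than the left shift, that produces a surprising value:
unsigned char m, l;
m = ~0 >> 4; produces 255, the same as ~0 (with the usual arithmetic right shift), but
m = ~0;
l = m >> 4; produces the correct value 15, the same as:
m = 255 >> 4;
On common two's-complement compilers ~0 << n does yield the expected bit pattern, but the standard still leaves left-shifting a negative value undefined, so ~0u << n is the portable form.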

Resources