How to extract bits from a number in C?

I need to extract a specific range of bits from a short in C.
For example, the binary representation of 45 is 101101, and I just want 2 bits from the middle, such as (10).
I started writing C two days ago, so please don't use a lot of functions.
How do I extract them?

Please search for bitwise operations for more general information, and bit masking for your specific question. I wouldn't recommend jumping into bits if you are new to programming, though.
The solution changes slightly depending on whether your input has a fixed length. If it doesn't, you need to build your mask accordingly. You could also use a different method, but this is probably the simplest way.
To get the specific bits you want, you can use bit masking.
E.g. you have 101101 and you want the middle two bits. If you & this with 001100, only the bits that are 1 in the mask remain unchanged from the source; all other bits are set to 0. Effectively, you are left with the bits you are interested in.
If you don't know what & (bitwise AND) is: it takes two operands and, for each bit position, yields 1 only if that bit is 1 in both the first AND the second operand, and 0 otherwise.
input : 1 0 1 1 0 1
mask : 0 0 1 1 0 0
result : 0 0 1 1 0 0
In C syntax, we can do this like:
unsigned int input = 45;
unsigned int mask = 0b001100; // binary literals are a GCC/Clang extension (standard only since C23), so this may not work with all compilers
// or, equivalently and portably:
// unsigned int mask = 12; // or 0x0C
unsigned int result = input & mask; // result contains ...001100
As you can see, we filtered the bits we wanted. The next step depends on what you want to do with those bits.
At this point, the result 001100 corresponds to 12, which I assume is not really useful. What you can do is move those bits around. To get rid of the 0s on the right, we can shift the result 2 bits to the right with the >> operator.
0 0 1 1 0 0 >> 2 ≡ 0 0 0 0 1 1
result = result >> 2; // result contains ...011
From there, you can store each bit in its own flag variable.
unsigned char flag1 = result & 0b01; // or just 1; flag1 is 1 if the lower bit is set
unsigned char flag2 = result & 0b10; // or just 2; flag2 is 2 (not 1) if the upper bit is set
You could do this without shifting at all, but this way it's clearer.

You need to mask the bits that you want to extract. Suppose you want to create a mask with the lowest 4 bits set. You can do that with:
(1 << 4) - 1
#include <stdio.h>
#include <limits.h>
/* Print every bit of n, most significant bit first. */
void print_bin(short n)
{
    unsigned long i = CHAR_BIT * sizeof(n);
    while (i--)
        putchar('0' + ((n >> i) & 1));
    printf("\n");
}
int main(void)
{
    short num = 45;   /* Binary 101101 */
    short width = 4;  /* number of bits to extract */
    short start = 0;  /* start from the rightmost (least significant) bit, position 0 */
    print_bin((num >> start) & ((1 << width) - 1)); /* Prints ...1101 */
    width = 2;        /* 2 bits */
    start = 1;        /* start from the bit at position 1 */
    print_bin((num >> start) & ((1 << width) - 1)); /* Prints ...10 */
    return 0;
}
Output:
0000000000001101
0000000000000010

Related

Swapping bits in an integer in C, can you explain this function to me?

I want to write a function that receives an unsigned char and swaps between bit 2 and bit 4 and returns the new number.
I am not allowed to use if statement.
So I found this function, among other functions, but this was the most simple one to understand (or try to understand).
All other functions involve XOR which I don't really understand to be honest.
unsigned char SwapBits(unsigned char num)
{
unsigned char mask2 = ( num & 0x04 ) << 2;
unsigned char mask4 = ( num & 0x10 ) >> 2;
unsigned char mask = mask3 | mask5 ;
return ( num & 0xeb ) | mask;
}
Can someone explain to me what happens here and, most importantly, why?
Why is AND required here, and why with a hex address?
Why should I AND with 0xeb (255)? I know that's the range of char, but why should I do that?
In short,
I know how to read code. I can follow this code, but I don't understand the purpose of each line.
Thanks.

First, the usual convention is that bits are numbered starting from 0 for the least significant bit and counting up. In this case, you have an 8-bit value, so the bits go from 0 on the right up to 7 on the left.
The function you posted still isn't quite right, but I think I see where you (it) was going with it. Here are the steps it's doing:
Pull out bit 2 (which is 3rd from the right) using a mask
Pull out bit 4 (which is 5th from the right) using a mask
Shift bit 2 left 2 positions so it's now in bit 4's original position
Shift bit 4 right 2 positions so it's now in bit 2's original position
Join these two bits together into one value that is now bits 2 and 4 swapped
Mask out (erase using &) only bits 2 and 4 from the original value
Join in (insert using |) the new swapped bits 2 and 4 to complete the transformation
I have rewritten the function to show each step one at a time to help make it clearer. In the original function or other examples you find, you'll see many of these steps all happen together in the same statement.
unsigned char SwapBits(unsigned char num)
{
// preserve only bit 2
unsigned char bit2 = num & 0x04;
// preserve only bit 4
unsigned char bit4 = num & 0x10;
// move bit 2 left to bit 4 position
unsigned char bit2_moved = bit2 << 2;
// move bit 4 right to bit 2 position
unsigned char bit4_moved = bit4 >> 2;
// put the two moved bits together into one swapped value
unsigned char swapped_bits = bit2_moved | bit4_moved;
// clear bits 2 and 4 from the original value
unsigned char num_with_swapped_bits_cleared = num & ~0x14;
// put swapped bits back into the original value to complete the swap
return num_with_swapped_bits_cleared | swapped_bits;
}
The second to last step num & ~0x14 probably needs some explanation. Since we want to save all the original bits except for bits 2 and 4, we mask out (erase) only the bits we're changing and leave all the others alone. The bits we want to erase are in positions 2 and 4, which are the 1s in the mask 0x14. So we do a complement (~) on 0x14 to turn it into all 1s everywhere except for 0s in bits 2 and 4. Then we AND this value with the original number, which has the effect of changing bits 2 and 4 to 0 while leaving all the others alone. This allows us to OR in the new swapped bits as the final step to complete the process.

You have to read about the binary representation of numbers.
unsigned char SwapBits(unsigned char num)
{
    // let's say that num = 46, which is represented as 0b00101110
    unsigned char mask2 = ( num & 0x04 ) << 2;
    //   0b00101110   num
    // & 0b00000100   0x04
    // = 0b00000100   only bit 2 survives, because & needs BOTH bits to be set
    // << 2 gives 0b00010000, so mask2 = 16: bit 2 now sits in bit 4's place
    unsigned char mask4 = ( num & 0x10 ) >> 2;
    //   0b00101110   num
    // & 0b00010000   0x10 -> 16 in decimal, or 2^4 (the power is also the number of trailing 0s after the set bit)
    // = 0b00000000   bit 4 of num is not set, so mask4 = 0 (shifting 0 still gives 0)
    unsigned char mask = mask2 | mask4;
    // mask takes a bit at each position if it is set in mask2 [or] mask4:
    //   0b00010000   mask2
    // | 0b00000000   mask4
    // = 0b00010000   mask
    return ( num & 0xeb ) | mask; // you now know how it works ;) solve this one. PS: operations between brackets have priority
}
If you are interested to learn the basics of bitwise operators you can take a look at this introduction.
After you build confidence you can try solving algorithms using only bitwise operators, where you will explore even deeper bitwise operations and see its impact on the runtime ;)
I also recommend reading Bit Twiddling Hacks, Oldies but Goodies!
b = ((b * 0x80200802ULL) & 0x0884422110ULL) * 0x0101010101ULL >> 32; // reverse your byte!
A simple example for understanding the swap of bits 3 and 5:
If you want to swap bit index 3 and bit index 5, you have to do the following:
int n = 0b100010;
int mask = 0b100000; // keep bit index 5 (starting from index 0)
int mask2 = 0b1000; // keep bit index 3
n = (n & mask) >> 2 | (n & mask2) << 2 | (n & 0b010111);
// (n & mask) >> 2
// the AND captures the bit of n at index 5, and >> 2 moves it down 2 positions to index 3.
// | (n & mask2) << 2
// the AND captures the bit of n at index 3, and << 2 moves it up to index 5; here it is 0, since n didn't have a bit set at index 3 originally.
// | (n & 0b010111); // bits 0, 1, 2 and 4 are preserved
// since we assign the result back to n, all other bits would be wiped out if we didn't keep their original values with this last mask, on which we perform no shift.

Bit rearrangement/manipulation in C

I want to arrange the bits in a byte in a certain order. For example, if the starting byte is 0 1 1 0 1 0 1 0 with bits labeled 1 2 3 4 5 6 7 8, I want to rearrange it to match the following positioning: 2 4 3 5 7 1 8 6, which results in 1 0 1 1 1 0 0 0. What would be the most efficient way of doing so? I read about "look-up" tables but I am not sure how they work. Can someone give an example and an explanation of an efficient way of doing this bit rearrangement in C?

You could create an array of "unsigned char" with 256 entries. The index into that array would be the current value of the byte to be converted, and the value at that entry would be the "converted" value.
Alternatively, you could use bit masking and "if" statements, but it would be less efficient.
Here's a snippet of the "array" method, with only a few values defined
and without printing the output in "binary-text" format.
#include<stdio.h>
unsigned char lkup[256] =
{ 0x00, /* idx: 0 (0x00) */
0x02, /* idx: 1 (0x01) (0b00000001) */
0x08, /* idx: 2 (0x02) (0b00000010) */
0x0a, /* idx: 3 (0x03) (0b00000011) */
0x01 /* idx: 4 (0x04) (0b00000100) */
};
int main(int argc, char **argv)
{
unsigned char wk = 3;
printf("Input: %u output: >%u\n", wk, lkup[wk]);
}

I think I understand what you want to achieve. This code may help you:
#include <stdio.h>
#include <stdint.h>
int main(void) {
    uint8_t original = 0b01101010;
    uint8_t positions[8] = {1,3,2,4,6,0,7,5};
    uint8_t result = 0;
    for(int i = 0; i < 8; i++)
    {
        if(original & (1 << (7 - positions[i])))
            result |= (1 << (7-i));
    }
    // print the result bit by bit, most significant bit first
    for(int i = 7; i >= 0; i--)
        printf("%d", (result >> i) & 1);
    printf("\n");
    return 0;
}
The first thing I did was create a byte that holds the original value, as well as an array of the positions you want to rearrange. The next step is to check whether the bit at each listed position of the original byte is zero or one, and to set the corresponding bit in the result if it is one. The last for-loop is just for printing the result.
I adjusted your indices to be zero-based.

Here is one way to change bit positions. With & (the AND operator) we select certain bits from the char and then shift them to new bit positions. Finally, all the shifted bits are joined together with | (the OR operator).
The left shift << moves bits to the left and the right shift >> moves them to the right. I took the liberty of renumbering the bit positions: 7 is the most significant bit on the left and 0 is the least significant bit, so the left and right shift operations describe the shifting direction correctly.
And why do the last two rows shift first and only then apply the AND operation?
– Because the char type can be signed, and if we right-shift a negative value, e.g. 11111000 (-8), the most significant bit is typically copied in: 11111000 >> 2 results in (1s filled in from this end -->) 11111110 (-2).
(See Right shifting negative numbers in C.)
But back to the function:
char changebitpositions (char ch) {
// bit locations (- = don't care)
// before after
return (ch & 0x20) // --5----- => --5----- (no change)
| ((ch & 0x49) << 1) // -6--3--0 => 6--3--0-
| ((ch & 0x12) << 2) // ---4--1- => -4--1---
| ((ch >> 5) & 0x04) // 7------- => -----7--
| ((ch >> 2) & 0x01); // -----2-- => -------2
// original and result: 76543210 => 64531702
}

How do you set only certain bits of a byte in C without affecting the rest?

Say I have a byte like this 1010XXXX where the X values could be anything. I want to set the lower four bits to a specific pattern, say 1100, while leaving the upper four bits unaffected. How would I do this the fastest in C?

In general:
value = (value & ~mask) | (newvalue & mask);
mask is a value with all bits to be changed (and only them) set to 1 - it would be 0xf in your case. newvalue is a value that contains the new state of those bits - all other bits are essentially ignored.
This will work for all types for which bitwise operators are supported.

You can set all those bits to 0 by bitwise-ANDing with a value that has those 4 bits set to 0 and all others set to 1 (this is the complement of the value with the 4 bits set to 1). You can then bitwise-OR in the new bits as you normally would.
i.e.
val &= ~0xf; // Clear lower 4 bits. Note: ~0xf == 0xfffffff0 when int is 32 bits wide
val |= lower4Bits & 0xf; // Worth ANDing with 0xf to make sure no
// other bits are set.

Use the bitwise OR operator | when you want to change a bit of a byte from 0 to 1.
Use the bitwise AND operator & when you want to change a bit of a byte from 1 to 0.
Example
#include <stdio.h>
int byte;
int chb;
int main() {
// Change bit 2 of byte from 0 to 1
byte = 0b10101010;
chb = 0b00000100; //0 to 1 changer byte
printf("%d\n",byte); // display current status of byte
byte = byte | chb; // perform 0 to 1 single bit changing operation
printf("%d\n",byte);
// Change bit 2 of byte back from 1 to 0
chb = 0b11111011; //1 to 0 changer byte
byte = byte & chb; // perform 1 to 0 single bit changing operation
printf("%d\n",byte);
}
Maybe there are better ways, I don't know. This will help you for now.

To further generalise the answers given, here are a couple of macros (for 32-bit values; adjust for different bitfield lengths).
#include <stdio.h>
#include <stdint.h>
/* L = field length in bits (must be less than 32), P = position of the field's lowest bit */
#define MASK(L,P) (~(0xffffffffu << (L)) << (P))
#define GET_VAL(BF,L,P) (((BF) & MASK(L,P)) >> (P))
#define SET_VAL(BF,L,P,V) ( (BF) = ((BF) & ~MASK(L,P)) | (((V) << (P)) & MASK(L,P)) )
int main(void)
{
    uint32_t bf = 1024;
    printf("Bitfield before : %u , 0x%08x\n", (unsigned)bf, (unsigned)bf);
    printf("Mask(5,3): %u , 0x%08x\n", MASK(5,3), MASK(5,3));
    SET_VAL(bf,5,3,19);
    printf("Bitfield after : %u , 0x%08x\n", (unsigned)bf, (unsigned)bf);
    return 0;
}
As an aside, it's frustrating that C bit-fields are of so little use here. They look like the perfect syntactic sugar for this requirement, but because the layout is left to each compiler to implement as it sees fit, they can't be relied on to match an externally defined bit layout.

How do the bit manipulations in this bit-sorting code work?

Jon Bentley, in Column 1 of his book Programming Pearls, introduces a technique for sorting a sequence of non-zero positive integers using bit vectors.
I have taken the program bitsort.c from here and pasted it below:
/* Copyright (C) 1999 Lucent Technologies */
/* From 'Programming Pearls' by Jon Bentley */
/* bitsort.c -- bitmap sort from Column 1
* Sort distinct integers in the range [0..N-1]
*/
#include <stdio.h>
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];
void set(int i)
{
int sh = i>>SHIFT;
a[i>>SHIFT] |= (1<<(i & MASK));
}
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }
int main()
{ int i;
for (i = 0; i < N; i++)
clr(i);
/*Replace above 2 lines with below 3 for word-parallel init
int top = 1 + N/BITSPERWORD;
for (i = 0; i < top; i++)
a[i] = 0;
*/
while (scanf("%d", &i) != EOF)
set(i);
for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);
return 0;
}
I understand what the functions clr, set and test are doing and explain them below: ( please correct me if I am wrong here ).
clr clears the ith bit
set sets the ith bit
test returns the value at the ith bit
Now, I don't understand how the functions do what they do. I am unable to figure out all the bit manipulation happening in those three functions.

The first 3 constants are inter-related. BITSPERWORD is 32. This you'd want to set based on your compiler+architecture. SHIFT is 5, because 2^5 = 32. Finally, MASK is 0x1F which is 11111 in binary (ie: the bottom 5 bits are all set). Equivalently, MASK = BITSPERWORD - 1.
The bitset is conceptually just an array of bits. This implementation actually uses an array of ints, and assumes 32 bits per int. So whenever we want to set, clear or test (read) a bit we need to figure out two things:
which int (of the array) is it in
which of that int's bits are we talking about
Because we're assuming 32 bits per int, we can just divide by 32 (and truncate) to get the array index we want. Dividing by 32 (BITSPERWORD) is the same as shifting to the right by 5 (SHIFT). So that's what the a[i>>SHIFT] bit is about. You could also write this as a[i/BITSPERWORD] (and in fact, you'd probably get the same or very similar code assuming your compiler has a reasonable optimizer).
Now that we know which element of a we want, we need to figure out which bit. Really, we want the remainder. We could do this with i%BITSPERWORD, but it turns out that i&MASK is equivalent. This is because BITSPERWORD is a power of 2 (2^5 in this case) and MASK is the bottom 5 bits all set.

Basically it is an optimized bucket sort:
reserve a bit array of length n bits.
clear the bit array (the first for loop in main).
read the items one by one (they must all be distinct).
set the i'th bit in the bit array if the number read is i.
iterate over the bit array.
if a bit is set, print its position.
Or in other words (for numbers below 10, sorting the 3 numbers 4, 6, 2):
start with an empty 10-bit array (which usually fits in one integer)
0000000000
read 4 and set the bit in the array
0000100000
read 6 and set the bit in the array
0000101000
read 2 and set the bit in the array
0010101000
iterate over the array and print every position in which the bit is set to one.
2, 4, 6
sorted.

Starting with set():
A right shift of 5 is the same as dividing by 32. It does that to find which int the bit is in.
MASK is 0x1f or 31. ANDing it with the bit number i gives the bit index within the int. It's the same as the remainder of dividing i by 32.
Shifting 1 left by the bit index ("1<<(i & MASK)") results in an integer which has just 1 bit in the given position set.
ORing sets the bit.
The line "int sh = i>>SHIFT;" is a wasted line, because they didn't use sh again beneath it, and instead just repeated "i>>SHIFT".
clr() is basically the same as set, except instead of ORing with 1<<(i & MASK) to set the bit, it ANDs with the inverse to clear the bit. test() ANDs with 1<<(i & MASK) to test the bit.
The bitsort will also remove duplicates from the list, because it will only count up to 1 per integer. A sort that uses integers instead of bits to count more than 1 of each is called a counting sort.

The bit magic is used as a special addressing scheme that works well with row sizes that are powers of two.
Try to understand it this way (note: I use bits-per-row rather than bits-per-word, since we're talking about a bit matrix here):
// supposing a 1-bit integer type int1 existed...
int1 bits[BITSPERROW * N]; // an array of N x BITSPERROW elements
// set bit at x,y:
int linear_address = y*BITSPERROW + x;
bits[linear_address] = 1; // or 0
// 0 1 2 3 4 5 6 7 8 9 10 11 ... 31
// . . . . . . . . . . . . .
// . . . . X . . . . . . . . -> x = 4, y = 1 => i = (1*32 + 4)
The statement linear_address = y*BITSPERROW + x also means that x = linear_address % BITSPERROW and y = linear_address / BITSPERROW.
When you optimize this in memory by using 1 word of 32 bits per row, you get the fact that a bit at column x can be set using
int bitrow = 0;
bitrow |= 1 << (x);
Now when we iterate over the bits, we have the linear address, but need to find the corresponding word.
int column = linear_address % BITSPERROW;
int bit_mask = 1 << column; // meaning for the xth column,
// you take 1 and shift that bit x times
int row = linear_address / BITSPERROW;
So to set the i'th bit, you can do this:
bits[ i / BITSPERROW ] |= 1 << ( i % BITSPERROW );
An extra trick is that the modulo operator can be replaced by a bitwise AND, and the / operator by a shift, if the second operand is a power of two (and the first operand is non-negative).
a % BITSPERROW == a & ( BITSPERROW - 1 ) == a & MASK
a / BITSPERROW == a >> ( log2(BITSPERROW) ) == a >> SHIFT
This ultimately boils down to the very dense notation, which is hard to read if you are not used to bit manipulation:
a[ i >> SHIFT ] |= ( 1 << (i&MASK) );
But I don't see the shift-and-mask version working for e.g. 40 bits per word, since 40 is not a power of two.

Quoting the excerpts from Bentley's original article in DDJ, this is what the code does at a high level:
/* phase 1: initialize set to empty */
for (i = 0; i < n; i++)
bit[i] = 0
/* phase 2: insert present elements */
for each i in the input file
bit[i] = 1
/* phase 3: write sorted output */
for (i = 0; i < n; i++)
if bit[i] == 1
write i on the output file

A few questions:
1. Why is there a need for 32 bits?
2. Can we do this in Java by creating a HashMap with keys from 0000000 to 9999999
and values 0 or 1 based on the presence/absence of the number? What are the implications
of such a program?
