Read a single bit from a buffer of char - c

I would like to implement a function like this:
int read_single_bit(unsigned char* buffer, unsigned int index)
where index is the offset of the bit that I would want to read.
How do I use bit shifting or masking to achieve this?

You might want to split this into three separate tasks:
1. Determining which char contains the bit that you're looking for.
2. Determining the bit offset into that char that you need to read.
3. Actually selecting that bit out of that char.
I'll leave parts (1) and (2) as exercises, since they're not too bad. For part (3), one trick you might find useful would be to do a bitwise AND between the byte in question and a byte with a single 1 bit at the index that you want. For example, suppose you want to get the fourth bit out of a byte. You could then do something like this:
Byte: 11011100
Mask: 00001000
----------------
AND: 00001000
So think about the following: how would you generate the mask that you need given that you know the bit index? And how would you convert the AND result back to a single bit?
Good luck!
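To make part (3) concrete, here is a minimal sketch of the masking step for a single byte. The helper name `bit_of_byte` is hypothetical, and bits are assumed to be numbered from 0 at the least significant end, so the "fourth bit" in the example above is bit 3.

```c
/* Sketch only: extract one bit from a single byte.
   Assumes bit 0 is the least significant bit and bit <= 7. */
int bit_of_byte(unsigned char byte, unsigned int bit)
{
    unsigned char mask = (unsigned char)(1u << bit); /* e.g. bit 3 -> 00001000 */
    return (byte & mask) != 0;                       /* collapse the AND result to 0 or 1 */
}
```

With the byte 11011100 (0xDC) from the example, `bit_of_byte(0xDC, 3)` yields 1 and `bit_of_byte(0xDC, 0)` yields 0.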

buffer[index/8] & (1u<<(index%8))
should do it (that is, view buffer as a bit array and test the bit at index).
Similarly:
buffer[index/8] |= (1u<<(index%8))
should set the index-th bit.
Or you could store a table of the eight shift states of 1 and & against that
unsigned char bits[] = { 1u<<0, 1u<<1, 1u<<2, 1u<<3, 1u<<4, 1u<<5, 1u<<6, 1u<<7 };
If your compiler doesn't optimize those / and % to bit ops (more efficient), then:
unsigned_int / 8 == unsigned_int >> 3
unsigned_int % 8 == unsigned_int & 0x07 //0x07 == 0000 0111
so
buffer[index>>3] & (1u<<(index&0x07u)) //test
buffer[index>>3] |= (1u<<(index&0x07u)) //set
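The test and set expressions above can be wrapped into small functions. This is a sketch under the same convention (index 0 is the least significant bit of buffer[0]); the function names are hypothetical, and the clear operation is not part of the answer above but is the usual complement-and-AND counterpart.

```c
/* Sketch: treat buffer as an array of bits, LSB of buffer[0] first. */
int bit_test(const unsigned char *buffer, unsigned int index)
{
    return (buffer[index >> 3] & (1u << (index & 0x07u))) != 0;
}

void bit_set(unsigned char *buffer, unsigned int index)
{
    buffer[index >> 3] |= (unsigned char)(1u << (index & 0x07u));
}

/* Not in the answer above: clear a bit by ANDing with the inverted mask. */
void bit_clear(unsigned char *buffer, unsigned int index)
{
    buffer[index >> 3] &= (unsigned char)~(1u << (index & 0x07u));
}
```

For a two-byte buffer of zeros, `bit_set(buf, 10)` touches buf[1] (10 / 8 == 1) and sets bit 2 of it (10 % 8 == 2).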

One possible implementation of your function might look like this:
int read_single_bit(unsigned char* buffer, unsigned int index)
{
unsigned char c = buffer[index / 8]; //getting the byte which contains the bit
unsigned int bit_position = index % 8; //getting the position of that bit within the byte
return ((c >> (7 - bit_position)) & 1);
//shifting that byte to the right with (7 - bit_position) will move the bit whose value you want to know at "the end" of the byte.
//then, by doing bitwise AND with the new byte and 1 (whose binary representation is 00000001) will yield 1 or 0, depending on the value of the bit you need.
}
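Note that this implementation counts bits from the most significant end of each byte, whereas the shift-and-mask expressions in the previous answer count from the least significant end. A minimal sketch of both conventions side by side (the function names are hypothetical) may help in picking the one your data format needs:

```c
/* LSB-first: index 0 is the least significant bit of buffer[0]. */
int read_bit_lsb_first(const unsigned char *buffer, unsigned int index)
{
    return (buffer[index / 8] >> (index % 8)) & 1;
}

/* MSB-first: index 0 is the most significant bit of buffer[0]. */
int read_bit_msb_first(const unsigned char *buffer, unsigned int index)
{
    return (buffer[index / 8] >> (7 - index % 8)) & 1;
}
```

For a buffer holding the single byte 0x80, index 0 reads as 1 under the MSB-first convention and as 0 under the LSB-first one.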

Related

What does this code do? There are so many weird things

int n_b ( char *addr , int i ) {
char char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
Like what is that: "i%8 & 0x1"?
Edit: Note that 0x1 is the hexadecimal notation for 1. Also note that :
0x1 = 0x01 = 0x000001 = 0x0...01
i%8 means i modulo 8, i.e. the remainder of the Euclidean division of i by 8.
& 0x1 is a bitwise AND, it converts the number before to binary form then computes the bitwise operation. (it's already in binary but it's just so you understand)
Example: 0b1101 & 0b1001 = 0b1001
Note that any number & 0x1 is either 0 or 1.
Example: 0b11111111 & 0b00000001 is 1 and 0b11111110 & 0b00000001 is 0
Essentially, it is testing the last bit of the number, which is the bit that determines parity (odd or even).
Final edit:
I got the precedence wrong, thanks to the comments for pointing it out. Here is the real precedence.
First, we compute i%8.
The result could be 0, 1, 2, 3, 4, 5, 6, 7.
Then, we shift the char by the result, which is maximum 7. That means the i % 8 th bit is now the least significant bit.
Then, we check if the original i % 8 bit is set (equals one) or not. If it is, return 1. Else, return 0.
This function returns the value of a specific bit in a char array as the integer 0 or 1.
addr is the pointer to the first char.
i is the index to the bit. 8 bits are commonly stored in a char.
First, the char at the correct offset is fetched:
char char_in_chain = addr [ i / 8 ] ;
i / 8 divides i by 8, ignoring the remainder. For example, any value in the range from 24 to 31 gives 3 as the result.
This result is used as the index to the char in the array.
Next and finally, the bit is obtained and returned:
return char_in_chain >> i%8 & 0x1;
Let's just look at the expression char_in_chain >> i%8 & 0x1.
It is confusing, because it does not show which operation is done in what sequence. Therefore, I duplicate it with appropriate parentheses: (char_in_chain >> (i % 8)) & 0x1. The rules (operation precedence) are given by the C standard.
First, the remainder of the division of i by 8 is calculated. This is used to right-shift the obtained char_in_chain. Now the interesting bit is in the least significant bit. Finally, this bit is "masked" with the binary AND operator and the second operand 0x1. BTW, there is no need to mark this constant as hex.
Example:
The array contains the bytes 0x5A, 0x23, and 0x42. The index of the bit to retrieve is 13.
i as given as argument is 13.
i / 8 gives 13 / 8 = 1, remainder ignored.
addr[1] returns 0x23, which is stored in char_in_chain.
i % 8 gives 5 (13 / 8 = 1, remainder 5).
0x23 is binary 0b00100011, and right-shifted by 5 gives 0b00000001.
0b00000001 ANDed with 0b00000001 gives 0b00000001.
The value returned is 1.
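The walk-through above can be reproduced in code. This is a sketch only: `struct bit_trace` and `trace_bit` are hypothetical names, and unsigned types are used so that the shift is well defined.

```c
/* Hypothetical helper that records every intermediate value of the walk-through. */
struct bit_trace {
    unsigned int  byte_index; /* i / 8 */
    unsigned int  shift;      /* i % 8 */
    unsigned char byte;       /* addr[i / 8] */
    unsigned char shifted;    /* byte >> shift */
    int           result;     /* shifted & 1 */
};

struct bit_trace trace_bit(const unsigned char *addr, unsigned int i)
{
    struct bit_trace t;
    t.byte_index = i / 8;
    t.shift      = i % 8;
    t.byte       = addr[t.byte_index];
    t.shifted    = (unsigned char)(t.byte >> t.shift);
    t.result     = t.shifted & 0x1;
    return t;
}
```

Calling `trace_bit` on the array {0x5A, 0x23, 0x42} with i = 13 gives byte_index 1, shift 5, byte 0x23, shifted 0x01 and result 1, matching the steps listed above.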
Note: If more is not clear, feel free to comment.
What the various operators do is explained by any C book, so I won't address that here. To instead analyse the code step by step...
The function and types used:
int as return type is an indication of the programmer being inexperienced at writing hardware-related code. We should always avoid signed types for such purposes. An experienced programmer would have used an unsigned type, like for example uint8_t. (Or in this specific case maybe even bool, depending on what the data is supposed to represent.)
n_b is a rubbish name, we should obviously never give an identifier such a nondescript name. get_bit or similar would have been a better name.
char* is, again, an indication of the programmer being inexperienced. char is particularly problematic when dealing with raw data, since we can't even know if it is signed or unsigned; it depends on which compiler is used. Had the raw data contained a value of 0x80 or larger and char was signed, we would have gotten a negative value. And then right shifting a negative value is also problematic, since that behavior too is compiler-specific.
char* is proof of the programmer lacking the fundamental knowledge of const correctness. The function does not modify this parameter so it should have been const qualified. Good code would use const uint8_t* addr.
int i is not really incorrect, the signedness doesn't really matter. But good programming practice would have used an unsigned type or even size_t.
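To illustrate the signedness point: if the data has to arrive through a plain char pointer anyway, converting each element to unsigned char before shifting avoids right-shifting a negative value. A minimal sketch, with the hypothetical name `get_bit_from_char`:

```c
/* Sketch: read a bit through a plain char pointer without ever
   shifting a possibly negative value. */
int get_bit_from_char(const char *addr, unsigned int i)
{
    /* Conversion to unsigned char is well defined; the value is now 0..255. */
    unsigned char byte = (unsigned char)addr[i / 8];
    return (byte >> (i % 8)) & 1;
}
```

With a byte of 0x80, bit 7 reads as 1 and bit 6 as 0, regardless of whether char is signed on the platform.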
With types unsloppified and corrected, the function might look like this:
#include <stddef.h> /* size_t */
#include <stdint.h>
uint8_t get_bit (const uint8_t* addr, size_t i ) {
uint8_t char_in_chain = addr [ i / 8 ] ;
return char_in_chain >> i%8 & 0x1;
}
This is still somewhat problematic, because the average C programmer might not remember the precedence of >> vs % vs & off the top of their head. It happens to be % over >> over &, but let's make the code a bit more readable still by making precedence explicit: (char_in_chain >> (i%8)) & 0x1.
Then I would question if the local variable really adds anything to readability. Not really, we might as well write:
uint8_t get_bit (const uint8_t* addr, size_t i ) {
return ((addr[i/8]) >> (i%8)) & 0x1;
}
As for what this code actually does: this happens to be a common design pattern for how to access a specific bit in a raw bit-field.
Any bit-field in C may be accessed as an array of bytes.
Bit number n in that bit-field, will be found at byte n/8.
Inside that byte, the bit will be located at n%8.
Bit masking in C is most readably done as data & (1u << bit). Which can be obfuscated as somewhat equivalent but less readable (data >> bit) & 1u, where the masked bit ends up in the LSB.
For example, let's assume we have 64 bits of raw data. Bits are always enumerated from 0 to 63 and bytes (just like any C array) from index 0. We want to access bit 33. Then 33/8 integer division = 4.
So byte[4]. Bit 33 will be found at 33%8 = 1. So we can obtain the value of bit 33 from ordinary bit masking byte[33/8] & (1u << (33%8)). Or similarly, (byte[33/8] >> (33%8)) & 1u
An alternative, more readable version of it all:
bool is_bit_set (const uint8_t* data, size_t bit)
{
uint8_t byte = data [bit / 8u];
size_t mask = 1u << (bit % 8u);
return (byte & mask) != 0u;
}
(Strictly speaking we could as well do return byte & mask; since a boolean type is used, but it doesn't hurt to be explicit.)
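The bit-33 example can be checked with a self-contained sketch. The function below repeats the idea of is_bit_set under the hypothetical name `bit_is_set`, with the headers it needs, so that it compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: same logic as is_bit_set above, repeated so it is self-contained. */
bool bit_is_set(const uint8_t *data, size_t bit)
{
    uint8_t byte = data[bit / 8u];               /* bit 33 -> byte 4 */
    uint8_t mask = (uint8_t)(1u << (bit % 8u));  /* bit 33 -> mask 0x02 */
    return (byte & mask) != 0u;
}
```

For 8 bytes of raw data that are all zero except byte 4 holding 0x02, `bit_is_set(raw, 33)` is true while bits 32 and 34 read as false.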

Swapping bits in an integer in C, can you explain this function to me?

I want to write a function that receives an unsigned char and swaps between bit 2 and bit 4 and returns the new number.
I am not allowed to use if statement.
So I found this function, among other functions, but this was the most simple one to understand (or try to understand).
All other functions involve XOR which I don't really understand to be honest.
unsigned char SwapBits(unsigned char num)
{
unsigned char mask2 = ( num & 0x04 ) << 2;
unsigned char mask4 = ( num & 0x10 ) >> 2;
unsigned char mask = mask3 | mask5 ;
return ( num & 0xeb ) | mask;
}
Can someone explain to me what happens here and, most importantly, why?
Why AND is required here and why with hex address?
Why should I AND with 0xeb (255)? I know that's the range of char but why should I do that.
In short,
I know how to read codes. I understand this code, but I don't understand the purpose of each line.
Thanks.
First, the usual convention is that bits are numbered starting from 0 for the least significant bit and counting up. In this case, you have an 8-bit value, so the bits go from 0 on the right up to 7 on the left.
The function you posted still isn't quite right, but I think I see where you (it) was going with it. Here are the steps it's doing:
Pull out bit 2 (which is 3rd from the right) using a mask
Pull out bit 4 (which is 5th from the right) using a mask
Shift bit 2 left 2 positions so it's now in bit 4's original position
Shift bit 4 right 2 positions so it's now in bit 2's original position
Join these two bits together into one value that is now bits 2 and 4 swapped
Mask out (erase using &) only bits 2 and 4 from the original value
Join in (insert using |) the new swapped bits 2 and 4 to complete the transformation
I have rewritten the function to show each step one at a time to help make it clearer. In the original function or other examples you find, you'll see many of these steps all happen together in the same statement.
unsigned char SwapBits(unsigned char num)
{
// preserve only bit 2
unsigned char bit2 = num & 0x04;
// preserve only bit 4
unsigned char bit4 = num & 0x10;
// move bit 2 left to bit 4 position
unsigned char bit2_moved = bit2 << 2;
// move bit 4 right to bit 2 position
unsigned char bit4_moved = bit4 >> 2;
// put the two moved bits together into one swapped value
unsigned char swapped_bits = bit2_moved | bit4_moved;
// clear bits 2 and 4 from the original value
unsigned char num_with_swapped_bits_cleared = num & ~0x14;
// put swapped bits back into the original value to complete the swap
return num_with_swapped_bits_cleared | swapped_bits;
}
The second to last step num & ~0x14 probably needs some explanation. Since we want to save all the original bits except for bits 2 and 4, we mask out (erase) only the bits we're changing and leave all the others alone. The bits we want to erase are in positions 2 and 4, which are the 1s in the mask 0x14. So we do a complement (~) on 0x14 to turn it into all 1s everywhere except for 0s in bits 2 and 4. Then we AND this value with the original number, which has the effect of changing bits 2 and 4 to 0 while leaving all the others alone. This allows us to OR in the new swapped bits as the final step to complete the process.
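The same clear-then-OR pattern works for any pair of positions. A minimal sketch, assuming i < j <= 7 and using the hypothetical name `swap_bit_pair`:

```c
/* Sketch: swap bits i and j of an 8-bit value without branching.
   Assumes i < j <= 7. */
unsigned char swap_bit_pair(unsigned char num, unsigned int i, unsigned int j)
{
    unsigned int mask_i = 1u << i;
    unsigned int mask_j = 1u << j;
    unsigned int low_moved  = (num & mask_i) << (j - i); /* bit i moved up to position j */
    unsigned int high_moved = (num & mask_j) >> (j - i); /* bit j moved down to position i */
    unsigned int cleared    = num & ~(mask_i | mask_j);  /* original with both bits erased */
    return (unsigned char)(cleared | low_moved | high_moved);
}
```

With i = 2 and j = 4 the combined mask is 0x14, so this reduces to the step-by-step function above: `swap_bit_pair(0x2E, 2, 4)` turns 00101110 into 00111010 (0x3A).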
You have to read about the binary representation of numbers.
unsigned char SwapBits(unsigned char num)
{
// let's say that num = 46, which is represented as 0b00101110
unsigned char mask2 = ( num & 0x04 ) << 2;
// now, another byte named mask2 is computed:
// 0b00101110 num
// 0b00000100 0x04
// 0b00000100 num & 0x04: & keeps a bit only where BOTH operands have it set, so only bit 2 (the 3rd bit from the right) survives
// 0b00010000 mask2 = 16 after << 2: the bit now sits at position 4
unsigned char mask4 = ( num & 0x10 ) >> 2;
// 0b00101110 num
// 0b00010000 0x10 -> means 16 in decimal or 0b10000 in binary or 2^4 (the power is also the number of trailing 0s after the set bit)
// 0b00000000 mask4 = 0, because bit 4 of num is not set
unsigned char mask = mask2 | mask4 ;
// mask takes a bit at each position if it is set in either mask2 [or] mask4, so:
// 0b00010000 mask2
// 0b00000000 mask4
// 0b00010000 mask
return ( num & 0xeb ) | mask; // you now know how it works ;) solve this one. PS: operations inside brackets have priority
}
If you are interested to learn the basics of bitwise operators you can take a look at this introduction.
After you build confidence you can try solving algorithms using only bitwise operators, where you will explore even deeper bitwise operations and see its impact on the runtime ;)
I also recommend reading Bit Twiddling Hacks, Oldies but Goodies!
b = ((b * 0x80200802ULL) & 0x0884422110ULL) * 0x0101010101ULL >> 32; // reverse your byte!
Simple function to understand swap of bit 3 and 5:
if you want to swap bit index 3 and bit index 5, then you have to do the following:
int n = 0b100010;
int mask = 0b100000; // keep bit index 5 (starting from index 0)
int mask2 = 0b1000; // keep bit index 3
n = (n & mask) >> 2 | (n & mask2) << 2 | (n & 0b010111);
// (n & mask) >> 2
// the bit that the AND captured at index 5 is moved down by 2 positions (>> 2) to index 3.
// | (n & mask2) << 2
// the bit captured at index 3 is moved up by 2 positions to index 5; here it is 0, since n didn't have a bit set at index 3 originally.
// | (n & 0b010111); // bits 0 1 2 and 4 are preserved
// since we assign the value to n, all other bits would have been wiped out if we hadn't kept their original value with this mask, on which we do not perform any shift operations.
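The walk-through above can be written as a compilable function. This is a sketch with the hypothetical name `swap_bits_3_and_5`; it uses hex literals because the 0b prefix is only standard from C23 on (before that it is a compiler extension), and like the original expression it keeps only the low six bits.

```c
/* Sketch: swap bits 3 and 5 within the low 6 bits of n. */
unsigned int swap_bits_3_and_5(unsigned int n)
{
    const unsigned int mask5 = 0x20u; /* 100000: bit index 5 */
    const unsigned int mask3 = 0x08u; /* 001000: bit index 3 */
    const unsigned int keep  = 0x17u; /* 010111: bits 0, 1, 2 and 4 */
    return ((n & mask5) >> 2) | ((n & mask3) << 2) | (n & keep);
}
```

For n = 100010 in binary (0x22), the result is 001010 (0x0A): the bit at index 5 has moved to index 3 and bit 1 is untouched.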

Setting bits in a bit stream

I have encountered the following C function while working on legacy code, and I am completely baffled by the way the code is organized. I can see that the function is trying to set bits at a given position in a bit stream, but I can't get my head around the individual statements and expressions. Can somebody please explain why the developer used division by 8 (/8) and modulus 8 (%8) expressions here and there? Is there an easy way to read these kinds of bit manipulation functions in C?
static void setBits(U8 *input, U16 *bPos, U8 len, U8 val)
{
U16 pos;
if (bPos==0)
{
pos=0;
}
else
{
pos = *bPos;
*bPos += len;
}
input[pos/8] = (input[pos/8]&(0xFF-((0xFF>>(pos%8))&(0xFF<<(pos%8+len>=8?0:8-(pos+len)%8)))))
|((((0xFF>>(8-len)) & val)<<(8-len))>>(pos%8));
if ((pos/8 == (pos+len)/8)|(!((pos+len)%8)))
return;
input[(pos+len)/8] = (input[(pos+len)/8]
&(0xFF-(0xFF<<(8-(pos+len)%8))))
|((0xFF>>(8-len)) & val)<<(8-(pos+len)%8);
}
please explain why the developer used division by 8 (/8) and modulus 8 (%8) expressions here and there
First of all, note that the individual bits of a byte are numbered 0 to 7, where bit 0 is the least significant one. There are 8 bits in a byte, hence the "magic number" 8.
Generally speaking: if you have any raw data, it consists of n bytes and can therefore always be treated as an array of bytes uint8_t data[n]. To access bit x in that byte array, you can for example do like this:
Given x = 17, bit x is then found in byte number 17/8 = 2. Note that integer division "floors" the value, instead of 2.125 you get 2.
The remainder of the integer division gives you the bit position in that byte, 17%8 = 1.
So bit number 17 is located in byte 2, bit 1. data[2] gives the byte.
To mask out a bit from a byte in C, the bitwise AND operator & is used. And in order to use that, a bit mask is needed. Such bit masks are best obtained by shifting the value 1 by the desired amount of bits. Bit masks are perhaps most clearly expressed in hex and the possible bit masks for a byte will be (1<<0) == 0x01, (1<<1) == 0x02, (1<<2) == 0x04, (1<<3) == 0x08 and so on.
In this case (1<<1) == 0x02.
C code:
uint8_t data[n];
...
size_t byte_index = x / 8;
size_t bit_index = x % 8;
bool is_bit_set;
is_bit_set = ( data[byte_index] & (1<<bit_index) ) != 0;
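The /8 and %8 pattern is also easier to follow when a field is written one bit at a time. The following is a minimal sketch, not a verified replacement for the legacy function: `set_bits_simple` is a hypothetical helper, and it assumes (this is only my reading of the legacy code) that stream position 0 is the most significant bit of the first byte and that len is at most 8.

```c
#include <stdint.h>

/* Sketch: write the low `len` bits of `val` into `buf`, starting at
   stream position `pos`, most significant bit first.
   Assumes len <= 8 and that buf is large enough. */
void set_bits_simple(uint8_t *buf, uint16_t pos, uint8_t len, uint8_t val)
{
    for (uint8_t k = 0; k < len; k++) {
        unsigned int bit_value = (val >> (len - 1u - k)) & 1u; /* next bit of val, MSB first */
        unsigned int p         = (unsigned int)pos + k;        /* position in the stream */
        uint8_t mask           = (uint8_t)(0x80u >> (p % 8u)); /* p % 8 == 0 is the byte's MSB */
        if (bit_value)
            buf[p / 8u] |= mask;            /* set the bit */
        else
            buf[p / 8u] &= (uint8_t)~mask;  /* clear the bit */
    }
}
```

Writing the 4-bit value 1010 at position 6 into a zeroed two-byte buffer crosses a byte boundary: p / 8 selects byte 0 for positions 6 and 7 and byte 1 for positions 8 and 9, leaving the buffer as {0x02, 0x80}. The loop is slower than the mask arithmetic in the legacy code, but each step is a single-bit operation.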

How to flip a specific bit in a byte in C?

I'm trying to use masks and manipulating specific bits in a byte.
For example:
I want to write a program in C that flips two bits at particular positions e.g. the bit at position 0 and the one at the third position.
So, 11100011, would become 01110011.
How can I swap these bits?
Flipping a bit is done by XOR-ing with a mask: set bits at the positions that you want to flip, and then execute a XOR, like this:
int mask = 0x90; // 10010000
int num = 0xE3; // 11100011
num ^= mask; // 01110011
Here are a few notes:
bits are commonly counted from the least significant position, so your example flips bits in positions 4 and 7, not at positions 0 and 3
To construct a bit mask for a single position, use expression 1 << n, where n is the position number counting from the least significant bit.
To combine multiple bits in a single mask, use | operator. For example, (1 << 4) | (1 << 7) constructs the mask for flipping bits 4 and 7.
If your byte is x, and you want to switch the bits at the i-th and j-th position:
x = x ^ ((1<<i) | (1<<j));
So, in your case, it would just be (1<<4) | (1<<7). :)
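As a small sketch, the expression can be wrapped in a function (the name `flip_two_bits` is hypothetical; positions count from the least significant bit and must be at most 7):

```c
/* Sketch: flip bits i and j of a byte with a single XOR. */
unsigned char flip_two_bits(unsigned char x, unsigned int i, unsigned int j)
{
    return (unsigned char)(x ^ ((1u << i) | (1u << j)));
}
```

`flip_two_bits(0xE3, 4, 7)` turns 11100011 into 01110011 (0x73), and applying it a second time restores the original value, since XOR with the same mask is its own inverse.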
First of all, good luck!
One remark - it is more useful to count the bits from the right and not left, since there are various byte/word sizes (8-bit,16-bit,etc.) and that count preserves compatibility better. So in your case you are referring to bits #7 and #4 (zero-count).
Did you mean 'flip' (change 0<->1 bits) or 'switch' them between one and the other?
For the first option, the answer above (XOR with "int mask = 0x90; // 10010000") is very good. For the second one, it's a bit more tricky (but not much).
To flip bits, you can use the exclusive OR (XOR) bitwise operator. It takes two operands (typically, the value you want to operate on and a mask defining which bits will be flipped). Each result bit is 1 if, and only if, exactly one of the two corresponding input bits is 1, so a 1 in the mask flips that bit of the value and a 0 leaves it unchanged. See the (simple) example below:
#include <stdio.h>
int main(int argc, char** argv)
{
int num = 7; //00000111
int mask = 3; //00000011
int result = num ^ mask; //00000100
printf("result = %d\n", result); //should be 4
return 0;
}

How to store two different things in one byte and then access them again?

I am trying to learn C for my class. One thing I need to know is, given an array, how to take information from two characters and store it in one byte. E.g., if the string is "A1B3C5", then I have to store A = 001 in the higher 3 bits and then store 1 in the lower 5 bits. I have a function that can get two chars from the array at a time and print them; here is that function:
void print2(char string[])
{
int i = 0;
int length = 0;
char char1, char2;
length = strlen(string);
for ( i = 0; i <length; i= i + 2)
{
char1 = string[i];
char2 = string[i+1];
printf("%c, %c\n", char1, char2);
}
}
But now I am not sure how to encode it and then decode it again. Can anyone help me please?
Assuming an ASCII character set, subtract '@' from the letter (so that 'A' becomes 1) and shift left five bits, then subtract '0' from the character representing the digit and add it to the first part.
So you've got a byte, and you want the following bit layout:
76543210
AAABBBBB
To store A, you would do:
unsigned char result = 0; // start from a known value
int input_a = somevalue;
result &= 0x1F; // Clear the upper 3 bits.
// Store "A": make sure only the lower 3 bits of input_a are used,
// Then shift it by 5 positions. Finally, store it by OR'ing.
result |= (char)((input_a & 7) << 5);
To read it:
// Simply shift the byte by five positions.
int output_a = (result >> 5);
To store B, you would do:
int input_b = yetanothervalue;
result &= 0xE0; // Clear the lower 5 bits.
// Store "B": make sure only the lower 5 bits of input_b are used,
// then store them by OR'ing.
result |= (char)(input_b & 0x1F);
To read it:
// Simply get the lower 5 bits.
int output_b = (result & 0x1F);
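Putting the store and read steps together for the "A1B3C5" case gives something like the sketch below. The function names are hypothetical, and it assumes ASCII, letters 'A' to 'G' (so that the values 1 to 7 fit in 3 bits) and digits '0' to '9'; subtracting '@', the character just before 'A' in ASCII, maps 'A' to 1.

```c
/* Sketch: pack a letter and a digit into one byte laid out as AAABBBBB. */
unsigned char pack_pair(char letter, char digit)
{
    unsigned int a = (unsigned int)(letter - '@') & 0x07u; /* 'A' -> 1, kept to 3 bits */
    unsigned int b = (unsigned int)(digit - '0') & 0x1Fu;  /* '1' -> 1, kept to 5 bits */
    return (unsigned char)((a << 5) | b);
}

/* Sketch: recover the two characters from a packed byte. */
void unpack_pair(unsigned char packed, char *letter, char *digit)
{
    *letter = (char)('@' + (packed >> 5));    /* upper 3 bits */
    *digit  = (char)('0' + (packed & 0x1Fu)); /* lower 5 bits */
}
```

`pack_pair('A', '1')` gives 00100001 (0x21), and unpacking that byte returns 'A' and '1' again. Letters beyond 'G' would need more than 3 bits and are silently truncated by the & 0x07u mask in this sketch.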
You may want to read about the boolean operations AND and OR, bit shifting and finally bit masks.
First of all, one bit can only represent two states: 0 and 1, or TRUE and FALSE. What you mean is a Byte, which consists of 8 bits and can thus represent 2^8 states.
To put two values in one byte, use bitwise OR (|) and bitwise shift (<< and >>).
I don't post the code here since you should learn this stuff - it's really important to know what bits and bytes are and how to work with them. But feel free to ask follow up question if something is not clear to you.

Resources