Reading n-bit elements from a data stream in C

Given a data stream in C, I need to read the nth element which is x bits wide. x can vary from 1-64. How do I do this in C? I tried some bit fiddling but could not come up with a solution.
For example, for a data stream
01101010 11010101 11111111 00000010 00000000 10000000
                      ==== ======
if each element is 10 bits wide and the element to parse is the third one, the expected data is 1111 000000.
The data stream is byte-addressable.

First find out what the most significant bit represents. Specifically, does it represent bit 0 or bit 7 in your bit stream?
To find the nth element (counting n from 0), you will need to find which byte it starts on ((n*x)/8) and at which bit within that byte ((n*x)%8), get the appropriate bits from that byte, then get the remaining bits from the following byte or bytes.
But, which bits should be taken from the bytes depends on what the most significant bit represents.
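The steps above can be sketched as follows, assuming the most significant bit of each byte is the first bit of the stream and that n is counted from 0. The name read_element_msb_first is made up for illustration, and the loop reads one bit at a time for clarity rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns element n (0-based) of a stream of
   width-bit elements, 1 <= width <= 64.
   Assumption: bits are packed MSB-first within each byte.
   The caller must make sure the stream holds at least (n + 1) * width bits. */
uint64_t read_element_msb_first(const uint8_t *stream, size_t n, unsigned width)
{
    size_t bit = n * (size_t)width;   /* absolute index of the element's first bit */
    uint64_t value = 0;

    for (unsigned i = 0; i < width; ++i, ++bit) {
        /* the MSB of a byte is the earliest stream bit in that byte */
        unsigned pos = 7u - (unsigned)(bit % 8);
        uint64_t b = (uint64_t)((stream[bit / 8] >> pos) & 1u);
        value = (value << 1) | b;     /* append the bit at the low end */
    }
    return value;
}
```

With the bytes from the question, read_element_msb_first(data, 2, 10) gives 0x3C0, which is 1111 000000. If the stream is packed LSB-first instead, only the computation of pos changes.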

#include <stdio.h>
#include <stdint.h>
// Returns bits start..end of ds, counted from 1 starting at the MSB of ds[0].
// Requires 1 <= start <= end <= 64.
uint64_t bit_slice(const uint8_t ds[], int start, int end){
    uint64_t s = 0;
    int i, n = (end - 1) / 8;            // index of the last byte needed
    int len = end - start + 1;
    for(i = 0; i <= n; ++i)
        s = (s << 8) + ds[i];            // accumulate the bytes, MSB first
    s >>= (n+1) * 8 - end;               // drop the bits that follow end
    // a shift by 64 is undefined, so the full-width case is handled separately
    uint64_t mask = len >= 64 ? UINT64_MAX : (((uint64_t)1) << len) - 1;
    return s & mask;
}
int main(void){
    uint8_t data[8] = {
        0b01101010, 0b11010101, 0b11111111, 0b00000010, 0b00000000, 0b10000000 //0b... GCC extension
    };
    unsigned x = (unsigned)bit_slice(data, 21, 30);
    printf("%X\n", x);//3C0 : 11 1100 0000
    return 0;
}

Related

how do i split an unsigned 64 bit int into individual 8 bits? (little endian) in C

For example, I have uint64_t value = 42 and I would like to split it into 8 uint8_t values (8 bits each), little endian. But I am unsure how to do the bit shifting. Help would be much appreciated.
If you want the individual bytes of a 64-bit value in little endian, then you can do the following:
In order to get the 1st byte, you simply apply the AND-bitmask 0xFF. This will mask out all bits except for the 8 least-significant bits.
In order to get the 2nd byte, you shift right by 8 bits before applying the bit-mask.
In order to get the 3rd byte, you shift right by 16 bits before applying the bit-mask.
In order to get the 4th byte, you shift right by 24 bits before applying the bit-mask.
(...)
In order to get the 8th byte, you shift right by 56 bits before applying the bit-mask.
Here is the code for the value 42 (which is the example in the question):
#include <stdio.h>
#include <stdint.h>
int main( void )
{
    uint64_t value = 42;
    uint8_t bytes[8];
    //extract the individual bytes
    for ( int i = 0; i < 8; i++ )
    {
        bytes[i] = (value >> (8 * i)) & 0xFF;
    }
    //print the individual bytes
    for ( int i = 0; i < 8; i++ )
    {
        printf( "%2d ", bytes[i] );
    }
    printf( "\n" );
}
Output:
42 0 0 0 0 0 0 0
If you replace the value 42 with the value 74579834759 in the program above, then you get the following output:
135 247 77 93 17 0 0 0
The following code works on both little-endian and big-endian platforms. On both types of platforms, it will produce the bytes in little-endian byte order.
uint64_t input = 42;
uint8_t values[8];
values[0] = input >> 0 & 0xFF;
values[1] = input >> 8 & 0xFF;
values[2] = input >> 16 & 0xFF;
values[3] = input >> 24 & 0xFF;
values[4] = input >> 32 & 0xFF;
values[5] = input >> 40 & 0xFF;
values[6] = input >> 48 & 0xFF;
values[7] = input >> 56 & 0xFF;
Note that the & 0xFF is redundant here, but it makes the code more clear and it's useful if you want to do anything with the value other than immediately assign it to a uint8_t variable.
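As a quick check of the shift-and-mask approach, here is a minimal round-trip sketch. The function names u64_to_le_bytes and u64_from_le_bytes are made up for illustration; the second one is simply the reverse operation, shifting each byte back into place.

```c
#include <stdint.h>

/* Hypothetical helper: store the 8 bytes of value in out[],
   least significant byte first, regardless of host endianness. */
void u64_to_le_bytes(uint64_t value, uint8_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)((value >> (8 * i)) & 0xFF);
}

/* Hypothetical helper: rebuild the value from 8 little-endian bytes. */
uint64_t u64_from_le_bytes(const uint8_t in[8])
{
    uint64_t value = 0;
    for (int i = 0; i < 8; i++)
        value |= (uint64_t)in[i] << (8 * i);   /* widen before shifting */
    return value;
}
```

Note the cast to uint64_t before the left shift in the second function: without it, in[i] would be promoted to int and shifts of 32 or more would be undefined.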
This macro extracts the b-th byte from the integer u:
#define EXTRACT(u,b) (((u) >> (8 * (b))) & 0xFF)
void foo(uint64_t x)
{
    uint8_t b[8] = {
        EXTRACT(x,0),
        EXTRACT(x,1),
        EXTRACT(x,2),
        EXTRACT(x,3),
        EXTRACT(x,4),
        EXTRACT(x,5),
        EXTRACT(x,6),
        EXTRACT(x,7),
    };
    /* b[] now holds the bytes of x, least significant first */
}
If the platform is little endian you can also use memcpy (declared in <string.h>):
void foo(uint64_t x)
{
    uint8_t b[8];
    memcpy(b, &x, sizeof(b)); /* copies the bytes in the host's memory order */
}
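If the same memcpy code also has to run on a big-endian host, one option is to check the byte order at run time and reverse the copy when needed. The sketch below is only an illustration: host_is_little_endian and u64_bytes_le are made-up names, and it assumes the host is either plainly little-endian or plainly big-endian.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);   /* read the byte stored at the lowest address */
    return first == 1;
}

/* Hypothetical helper: copies the bytes of x into b[] in little-endian order. */
void u64_bytes_le(uint64_t x, uint8_t b[8])
{
    memcpy(b, &x, 8);
    if (!host_is_little_endian()) {
        /* assumption: a plain big-endian host, so reversing is enough */
        for (int i = 0; i < 4; i++) {
            uint8_t tmp = b[i];
            b[i] = b[7 - i];
            b[7 - i] = tmp;
        }
    }
}
```

The shift-and-mask version avoids this check entirely, which is why it is usually the simpler choice when portability matters.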
Here's a pointer approach I usually use to retrieve the bytes of a 64-bit value. With this approach the bytes come out in the platform's memory order, so the user has to take care of the order.
#include <stdio.h>
#include <stdint.h>
int main(void)
{
    int i;
    uint64_t v = 0x123456789abcdef0;
    uint8_t* ptrb;
    ptrb = (uint8_t*)&v;
    for (i = 0; i < 8; i++)
    {
        printf("%2x ", ptrb[i]);
    }
    printf("\n");
    return 0;
}
Below is the output of my sample code (the order shown is that of a little-endian machine):
$ ./foo
f0 de bc 9a 78 56 34 12

Extracting a particular range of bits and find number of zeros between them in C

I want to extract a particular range of bits in an integer variable.
For example: 0xA5 (10100101)
I want to extract bit 2 to bit 5, i.e. 1001, into a variable and count the number of zeros between them.
I have another variable which gives the starting point; in this case its value is 2. So the starting point can be found by 0xA5 >> 2.
The 5th bit position is arbitrary here; it could just as well be 6 or 7. The main idea is to find whichever bit is set to 1 after the 2nd bit; that is the one I have to extract.
How can I do the rest?
Assuming you are dealing with unsigned int for your variable.
You will have to construct the appropriate mask.
Suppose you want the bits from position x to position y, there need to be y - x + 1 1s in the mask.
You can get this by -
int digits = y - x + 1;
unsigned int mask = (1u << digits) - 1; /* parentheses needed: - binds tighter than << */
Now you need to remove the lower x bits from the initial number, which can be done by -
unsigned int result = number >> x;
Finally apply the mask to remove the upper bits -
result = result & mask;
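Put together, the three steps can be sketched as one function. The name extract_bits is made up for illustration, and the full-width case is handled separately because shifting by the full width of the type is undefined in C.

```c
#include <limits.h>

/* Hypothetical helper: returns bits x..y (inclusive) of number,
   where bit 0 is the least significant bit.
   Assumes x <= y and y is below the width of unsigned int. */
unsigned int extract_bits(unsigned int number, unsigned int x, unsigned int y)
{
    unsigned int digits = y - x + 1;
    unsigned int width = (unsigned int)(sizeof(unsigned int) * CHAR_BIT);
    /* build a mask of `digits` ones; avoid an undefined full-width shift */
    unsigned int mask = (digits >= width) ? UINT_MAX : (1u << digits) - 1u;
    return (number >> x) & mask;   /* drop the low x bits, then the high bits */
}
```

For the value in the question, extract_bits(0xA5, 2, 5) gives 0x9, i.e. 1001.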
In this example we put the individual bits, as 0 or 1 values, into an array (least significant bit first). After that you can treat the array as you like.
#include <stdio.h>
#include <stdint.h>
int main(int argc, char **argv) {
    uint8_t value = 0xA5;
    unsigned char bytes[8];
    unsigned char i;
    for (i = 0; i < 8; i++) {
        bytes[i] = (value & (1 << i)) != 0 ? 1 : 0;
    }
    for (i = 0; i < 8; i++) {
        printf("%d", bytes[i]);
    }
    return 0;
}
You could use a mask and the "&" (AND) operation:
a = 0xA5;
a = a >> OFFSET; // OFFSET is the position of the lowest bit wanted (2 in the example)
mask = 0x0F; // equals 00001111
a = a & mask;
In your example a = 0xA5 (10100101), and the offset is 2.
a >> 2: a now equals 0x29 (00101001)
a & 0x0F: (00101001 AND 00001111) = 00001001 = 0x09
If you want bits from the offset X then shift right by X.
If you want Y bits, then the mask (after the shift) will be 2 to the power of Y minus one (for your example with four bits, 2 to the power of 4 is 16, minus one is 15 which is 1111 binary). This can be done by left-shifting 1 by Y bits and subtracting 1.
However, the masking isn't needed if you want to count the number of zeros in the wanted bits, only the right shift. Loop Y times, each time shifting a 1 left one more step, and use bitwise AND to check whether that bit of the value is zero. If it is, increment a counter. At the end of the loop the counter is the number of zeros.
To put it all in code:
// Count the number of zeros in a specific amount of bits starting at a specific offset
// value is the original value
// offset is the offset in bits
// bits is the number of bits to check
unsigned int count_zeros(unsigned int value, unsigned int offset, unsigned int bits)
{
    // Get the bits we're interested in the rightmost position
    value >>= offset;
    unsigned int counter = 0; // Zero-counter
    for (unsigned int i = 0; i < bits; ++i)
    {
        if ((value & (1u << i)) == 0)
        {
            ++counter; // Bit is a zero
        }
    }
    return counter;
}
To use with the example data you have:
count_zeros(0xa5, 2, 4);
The result should be 2, since the four bits extracted are 1001.
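int32_t do_test(int32_t value, int32_t offset)
{
    /* counts the zero bits between the bit at offset and the next set bit above it */
    uint32_t v = (uint32_t)value >> offset;
    int32_t zeros = 0;
    int i = 1;
    /* stop at the first set bit, or when the remaining bits run out */
    while (i < 32 - offset && ((v >> i) & 1u) == 0) {
        zeros += 1;
        i++;
    }
    return zeros;
}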
int result = (0xA5 >> 2) & 0x0F;
Truth table for the & operator
|  INPUTS   | OUTPUT |
----------------------
|  0  |  0  |   0    |
|  0  |  1  |   0    |
|  1  |  0  |   0    |
|  1  |  1  |   1    |
----------------------

Bitwise Operation on a byte and an int

I have a byte array represented as
char * bytes = getbytes(object); //some api function
I want to check whether the bit at some position x is set.
I've been trying this
int mask = 1 << x % 8;
y= bytes[x>>3] & mask;
However y returns as all zeros? What am I doing incorrectly and is there an easier way to check if a bit is set?
EDIT:
I did run this as well. It didn't return with the expected result either.
int k = x >> 3;
int mask = x % 8;
unsigned char byte = bytes[k];
return (byte & mask);
It failed an assert-true ctest I ran. At that point, byte and mask were "0002" and 2 respectively when printed from gdb.
edit 2: This is how I set the bits in the first place. I'm just trying to write a test to verify they are set.
unsigned long x = somehash(void* a);
unsigned int mask = 1 << (x % 8);
unsigned int location = x >> 3;
char* filter = getData(ref);
filter[location] |= mask;
This would be one (perhaps crude) way off the top of my head:
#include <stdio.h>
#include <stdlib.h>
// this function *changes* the byte array
int getBit(char *b, int bit)
{
    int bitToCheck = bit % 8;
    b = b + (bitToCheck ? (bit / 8) : (bit / 8 - 1));
    if (bitToCheck)
        *b = (*b) >> (8 - bitToCheck);
    return (*b) & 1;
}
int main(void)
{
    char *bytes = calloc(2, 1);
    *(bytes + 1) = 5; // writing to the appropriate bits
    printf("%d\n", getBit(bytes, 16)); // checking the 16th bit from the left
    free(bytes);
    return 0;
}
Assumptions:
A byte is represented as:
----------------------------------------
| 2^7 | 2^6 | 2^5 | 2^4 | 2^3 |... |
----------------------------------------
The left most bit is considered bit number 1 and the right most bit is considered the max. numbered bit (16th bit in a 2 byte object).
It's OK to overwrite the actual byte object (if this is not wanted, use memcpy).
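One thing to note about the snippet in the question's EDIT is that x % 8 is a bit index, not a single-bit mask, whereas the setter in edit 2 uses 1 << (x % 8). A minimal sketch of a non-destructive test that matches that setter follows; here bit 0 is the least significant bit of each byte. The names set_bit and test_bit are made up, and the array is accessed as unsigned char (cast the char * from the API) so that the sign of char cannot interfere.

```c
/* Hypothetical helper: sets bit x of the byte array, using the same
   layout as the setter in the question (byte x / 8, bit x % 8). */
void set_bit(unsigned char *bytes, unsigned long x)
{
    bytes[x >> 3] |= (unsigned char)(1u << (x % 8));
}

/* Hypothetical helper: returns 1 if bit x is set, 0 otherwise.
   Does not modify the array. */
int test_bit(const unsigned char *bytes, unsigned long x)
{
    return (bytes[x >> 3] >> (x % 8)) & 1u;
}
```

Because test_bit shifts the byte down and masks with 1, the result is always exactly 0 or 1, which is convenient for an assert in a test.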

how to replace given nibbles with another set of nibbles in an integer

Suppose you have an integer a = 0x12345678 & a short b = 0xabcd
What I want to do is replace the given nibbles in integer a with nibbles from short b
Eg: Replace 0,2,5,7th nibbles in a = 0x12345678 (where 8 = 0th nibble, 7=1st nibble, 6=2nd nibble and so on...) with nibbles from b = 0xabcd (where d = 0th nibble, c=1st nibble, b=2nd nibble & so on...)
My approach is -
Clear the bits we're going to replace from a.
like a = 0x02045070
Create the mask from the short b like mask = 0xa0b00c0d
bitwise OR them to get the result. result = a| mask i.e result = 0xa2b45c7d hence nibbles replaced.
My problem is I don't know any efficient way to create the desired mask (like in step 2) from the given short b
If you can give me an efficient way of doing so, it would be a great help to me and I thank you for that in advance ;)
Please ask if more info needed.
EDIT:
My code to solve the problem (not good enough though)
Any improvement is highly appreciated.
int index[4] = {0,1,5,7}; // Given nibbles to be replaced in integer
int s = 0x01024300; // integer mask i.e. cleared nibbles
int r = 0x0000abcd; // short (converted to int )
r = ((r & 0x0000000f) << 4*(index[0]-0)) |
((r & 0x000000f0) << 4*(index[1]-1)) |
((r & 0x00000f00) << 4*(index[2]-2)) |
((r & 0x0000f000) << 4*(index[3]-3));
s = s|r;
A nibble has 4 bits, and according to your indexing scheme, the zeroth nibble is represented by the least significant bits at positions 0-3, the first nibble by the bits at positions 4-7, and so on.
Simply shift the values the necessary amount. This will set the nibble at position set by the variable index:
size_t index = 5; //6th nibble is at index 5
size_t shift = 4 * index; //6th nibble is represented by bits 20-23
unsigned long nibble = 0xC;
unsigned long result = 0x12345678;
result = result & ~( 0xFul << shift ); //clear the 6th nibble (ul so the mask is as wide as result)
result = result | ( nibble << shift ); //set the 6th nibble
If you want to set more than one value, put this code in a loop. The variable index should be changed to an array of values, and variable nibble could also be an array of values, or it could contain more than one nibble, in which case you extract them one by one by shifting values to the right.
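A minimal sketch of that loop, assuming a 32-bit a, a 16-bit b, and a list of target nibble positions. The function name replace_nibbles and its parameters are made up for illustration; nibble i of b goes to position index[i] of a.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: replaces the nibbles of a listed in index[]
   with successive nibbles of b, starting from b's lowest nibble.
   Each index[i] must be in 0..7; at most 4 nibbles are taken from b. */
uint32_t replace_nibbles(uint32_t a, uint16_t b, const unsigned index[], size_t count)
{
    for (size_t i = 0; i < count && i < 4; i++) {
        uint32_t nibble = ((uint32_t)b >> (4 * i)) & 0xFu;  /* i-th nibble of b */
        unsigned shift = 4 * index[i];                      /* target bit position */
        a &= ~((uint32_t)0xF << shift);                     /* clear the target nibble */
        a |= nibble << shift;                               /* set it from b */
    }
    return a;
}
```

With the values from the question, a = 0x12345678, b = 0xabcd and positions {0, 2, 5, 7}, the result is 0xa2b45c7d.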
A lot depends on how flexible you are in accepting the "nibble list" (index[4] in your case).
You mentioned that you can replace anywhere from 0 to 8 nibbles. If you take your nibble bits as an 8-bit bitmap, rather than as a list, you can use the bitmap as a lookup in a 256-entry table, which maps from bitmap to a (fixed) mask with 1s in the nibble positions. For example, for the nibble list {1, 3}, you'd have the bitmap 0b00001010 which would map to the mask 0x0000F0F0.
Then you can use pdep which has intrinsics on gcc, clang, icc and MSVC on x86 to expand the bits in your short to the right position. E.g., for b == 0xab you'd have _pdep_u32(b, mask) == 0x0000a0b0.
If you aren't on a platform with pdep, you can accomplish the same thing with multiplication.
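For platforms without pdep, the sketch below uses a plain bit-by-bit loop as a stand-in. This is not the multiplication trick mentioned above, just a portable emulation of what pdep does; the names nibble_mask_from_bitmap and pdep32_soft are made up for illustration. The first function computes the mask for one bitmap and could be used to fill the 256-entry table.

```c
#include <stdint.h>

/* Hypothetical helper: expands an 8-bit nibble bitmap into a 32-bit mask.
   If bit i of bitmap is set, nibble i of the mask is 0xF. */
uint32_t nibble_mask_from_bitmap(uint8_t bitmap)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < 8; i++)
        if (bitmap & (1u << i))
            mask |= (uint32_t)0xF << (4 * i);
    return mask;
}

/* Hypothetical software stand-in for pdep: deposits the low bits of src,
   in order, into the positions of the set bits of mask. */
uint32_t pdep32_soft(uint32_t src, uint32_t mask)
{
    uint32_t result = 0;
    for (uint32_t bit = 1; mask != 0; bit <<= 1) {
        uint32_t lowest = mask & (~mask + 1u);  /* isolate the lowest set bit of mask */
        if (src & bit)
            result |= lowest;                   /* copy the next source bit there */
        mask &= mask - 1u;                      /* remove that bit from the mask */
    }
    return result;
}
```

For the question's example, the bitmap for nibbles {0, 2, 5, 7} is 0xA5, which gives the mask 0xF0F00F0F, and (a & ~mask) | pdep32_soft(b, mask) then yields 0xa2b45c7d for a = 0x12345678 and b = 0xabcd.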
To make the nibble assignment easy to change, a union with a bit-field structure could be used. Note that the order in which bit-fields are allocated is implementation-defined, so this relies on the compiler placing nib0 in the least significant bits.
Step 1 - create a union that gives access to the individual nibbles
typedef union u_nibble {
    uint32_t dwValue;
    uint16_t wValue;
    struct sNibble {
        uint32_t nib0: 4;
        uint32_t nib1: 4;
        uint32_t nib2: 4;
        uint32_t nib3: 4;
        uint32_t nib4: 4;
        uint32_t nib5: 4;
        uint32_t nib6: 4;
        uint32_t nib7: 4;
    } uNibble;
} NIBBLE;
Step 2 - assign two NIBBLE items with your integer a and short b
NIBBLE myNibbles[2];
uint32_t a = 0x12345678;
uint16_t b = 0xabcd;
myNibbles[0].dwValue = a;
myNibbles[1].wValue = b;
Step 3 - initialize nibbles of a by nibbles of b
printf("a = %08x\n",myNibbles[0].dwValue);
myNibbles[0].uNibble.nib0 = myNibbles[1].uNibble.nib0;
myNibbles[0].uNibble.nib2 = myNibbles[1].uNibble.nib1;
myNibbles[0].uNibble.nib5 = myNibbles[1].uNibble.nib2;
myNibbles[0].uNibble.nib7 = myNibbles[1].uNibble.nib3;
printf("a = %08x\n",myNibbles[0].dwValue);
Output will be:
a = 12345678
a = a2b45c7d
If I understand your goal, the fun you are having comes from the reversal of the order of your fill from the upper half to the lower half of your final number. (instead of 0, 2, 4, 6, you want 0, 2, 5, 7) It isn't any more difficult, but it does make you count where the holes are in the final number. If I understood, then you could mask with 0x0f0ff0f0 and then fill in the zeros with shifts of 16, 12, 4 and 0. For example:
#include <stdio.h>
int main (void) {
    unsigned a = 0x12345678, c = 0, mask = 0x0f0ff0f0;
    unsigned short b = 0xabcd;
    /* mask a, fill in the holes with the bits from b */
    c = (a & mask) | (((unsigned)b & 0xf000) << 16);
    c |= (((unsigned)b & 0x0f00) << 12);
    c |= (((unsigned)b & 0x00f0) << 4);
    c |= (unsigned)b & 0x000f;
    printf (" a : 0x%08x\n b : 0x%0hx\n c : 0x%08x\n", a, b, c);
    return 0;
}
Example Use/Output
$ ./bin/bit_swap_nibble
a : 0x12345678
b : 0xabcd
c : 0xa2b45c7d
Let me know if I misunderstood, I'm happy to help further.
With nibble = 4 bits and unsigned int = 32 bits, a nibble inside an unsigned int can be found as follows:
x = 0x00a0b000, find the nibble at index 3 in x, i.e. locate 'b'. Note that the nibble index starts at 0.
The nibble at index 3 spans bit 12 to bit 15.
It can be selected with n = 2^16 - 2^12. In n, all the bits of that nibble are 1 and all the bits of the other nibbles are 0. That is, n = 0x0000F000.
In general, suppose if you want to find a continuous sequence of 1 in binary representation in which sequence starts from Xth bit to Yth bit then formula is 2^(Y+1) - 2^X.
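The formula can be sketched as a small function. The name bit_range_mask is made up for illustration, and 64-bit arithmetic is used internally so that Y = 31 does not require shifting a 32-bit value by 32 bits, which would be undefined.

```c
#include <stdint.h>

/* Hypothetical helper: returns a mask with bits x..y set (inclusive),
   computed as 2^(y+1) - 2^x. Requires 0 <= x <= y <= 31. */
uint32_t bit_range_mask(unsigned x, unsigned y)
{
    /* widen to 64 bits so that 2^(y+1) is representable when y == 31 */
    return (uint32_t)(((uint64_t)1 << (y + 1)) - ((uint64_t)1 << x));
}
```

For the example above, bit_range_mask(12, 15) is 0x0000F000, and (0x00a0b000 & bit_range_mask(12, 15)) >> 12 gives 0xb.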
#include <stdio.h>
#include <string.h> /* memset */
#define BUF_SIZE 33
/* writes the 32 bits of a as '0'/'1' characters into buffer */
char *int2bin(unsigned int a, char *buffer, int buf_size)
{
    int i;
    buffer[BUF_SIZE - 1] = '\0';
    buffer += (buf_size - 1);
    for(i = 31; i >= 0; i--)
    {
        *buffer-- = (a & 1) + '0';
        a >>= 1;
    }
    return buffer;
}
int main()
{
    unsigned int a = 0;
    unsigned int b = 65535;
    unsigned int b_nibble;
    unsigned int b_at_a;
    unsigned int a_nibble_clear;
    signed char replace_with[8]; /* signed, so the -1 marker compares correctly */
    unsigned int ai;
    char buffer[BUF_SIZE];
    memset(replace_with, -1, sizeof(replace_with));
    replace_with[0] = 0; //replace 0th nibble of a with 0th nibble of b
    replace_with[2] = 1; //replace 2nd nibble of a with 1st nibble of b
    replace_with[5] = 2; //replace 5th nibble of a with 2nd nibble of b
    replace_with[7] = 3; //replace 7th nibble of a with 3rd nibble of b
    int2bin(a, buffer, BUF_SIZE - 1);
    printf("a = %s, %08x\n", buffer, a);
    int2bin(b, buffer, BUF_SIZE - 1);
    printf("b = %s, %08x\n", buffer, b);
    for(ai = 0; ai < 8; ++ai)
    {
        if(replace_with[ai] != -1)
        {
            b_nibble = (b & ((1LL << ((replace_with[ai] + 1)*4)) - (1LL << (replace_with[ai]*4)))) >> (replace_with[ai]*4);
            b_at_a = b_nibble << (ai * 4);
            a_nibble_clear = (a & ~(a & ((1LL << ((ai + 1) * 4)) - (1LL << (ai * 4)))));
            a = a_nibble_clear | b_at_a;
        }
    }
    int2bin(a, buffer, BUF_SIZE - 1);
    printf("a = %s, %08x\n", buffer, a);
    return 0;
}
Output:
a = 00000000000000000000000000000000, 00000000
b = 00000000000000001111111111111111, 0000ffff
a = 11110000111100000000111100001111, f0f00f0f

XORing a 32 bit integer with itself

I'm stuck on XORing a 32-bit integer with itself. I'm supposed to XOR the four 8-bit portions of the integer together. I understand how it works, but without storing the integer anywhere, I don't get how to do this.
I've thought it over and I'm thinking of using binary left shift and right shift operators to separate the 32 bit integer into 4 parts to XOR them. For example, if I were to use an 8-bit integer, I would do something like this:
int a = <some integer here>
(a << 4) ^ (a >> 4)
So far, it isn't working the way I thought it would work.
Here's a part of my code:
else if (choice == 2) {
int bits = 8;
printf("Enter an integer for checksum calculation: ");
scanf("%d", &in);
printf("Integer: %d, ", in);
int x = in, i;
int mask = 1 << sizeof(int) * bits - 1;
printf("Bit representation: ");
for (i = 1; i <= sizeof(int) * bits; i++) {
if (x & mask)
putchar('1');
else
putchar('0');
x <<= 1;
if (! (i % 8)) {
putchar(' ');
}
}
printf("\n");
}
Here's an example of an output:
What type of display do you want?
Enter 1 for character parity, 2 for integer checksum: 2
Enter an integer for checksum calculation: 1024
Integer: 1024, Bit representation: 00000000 00000000 00000100 00000000
Checksum of the number is: 4, Bit representation: 00000100
To accumulate the XOR of 8-bit values, you simply shift and XOR each part of the value. Conceptually it's this:
uint32_t checksum = ( (a >> 24) ^ (a >> 16) ^ (a >> 8) ^ a ) & 0xff;
However, since XOR can be done in any order, you can do the same with fewer operations:
uint32_t checksum = (a >> 16) ^ a;
checksum = ((checksum >> 8) ^ checksum) & 0xff;
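To see that the two forms agree, here is a small sketch that wraps each of them in a function. The names xor_bytes_direct and xor_bytes_folded are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: XOR of the four bytes of a, written out directly. */
uint32_t xor_bytes_direct(uint32_t a)
{
    return ((a >> 24) ^ (a >> 16) ^ (a >> 8) ^ a) & 0xff;
}

/* Hypothetical helper: same result by folding halves.
   First the high 16 bits are XORed into the low 16 bits,
   then the high byte of that result into its low byte. */
uint32_t xor_bytes_folded(uint32_t a)
{
    uint32_t checksum = (a >> 16) ^ a;
    return ((checksum >> 8) ^ checksum) & 0xff;
}
```

For the input 1024 from the question's sample output, both functions return 4.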
If you're doing this over many values, you can extend this idea by only condensing the value at the very end. This is quite similar to how parallel commutative operations are done in larger registers with technologies like SIMD (and indeed, compilers with SIMD support should be able to optimize the following code to make it much faster):
uint32_t simple_checksum( uint32_t *v, size_t count )
{
    uint32_t checksum = 0;
    uint32_t *end = v + count;
    for( ; v != end; v++ )
    {
        checksum ^= *v; /* accumulate XOR of each 32-bit value */
    }
    checksum ^= (checksum >> 16); /* XOR high and low words into low word */
    checksum ^= (checksum >> 8 ); /* XOR each byte of low word into low byte */
    return checksum & 0xff; /* everything from bits 8-31 is rubbish */
}
In general, XORing a number with itself gives 0, so you can just as easily set the variable to 0.
0100101^0100101=0
This follows from the truth table of the XOR operation, which yields 0 when both bits are one or both are zero.

Resources