Very fast way to check set bit in C

I'm using a sort of BitStream in my code that has a read_bit() function. This function is called very often (more than one billion times in a single stream). This is what the struct BitStream looks like:
typedef struct BitStream {
    unsigned char* data;
    unsigned int size;
    unsigned int currentByte;
    unsigned char buffer;
    unsigned char bitsInBuffer;
} BitStream;
And the read_bit() function is defined as follows:
unsigned char bitstream_read_bit(BitStream* stream, unsigned long long bitPos) {
    unsigned int byte = bitPos / 8;
    unsigned char byteVal = stream->data[byte];
    unsigned char mask = 128 >> (bitPos & 7);
    if (mask & byteVal) {
        return 1;
    } else {
        return 0;
    }
}
Now, I found out through trial and error that the line unsigned char mask = 128 >> (bitPos & 7); is very slow. Is there some way I can speed up this bit check? I've already tried an array that holds the 8 possible masks, but that was no faster (I suspect because of the extra memory access).
EDIT: I tried many of the answers over the past week and ran a lot of benchmarks, but there wasn't much performance improvement. I eventually managed a 10-second improvement by reversing the order of the bits in the bitstream. So instead of using the mask 128 >> (bitPos & 7), I use this function:
unsigned char bitstream_read_bit_2(BitStream* stream, const unsigned long long bitPos) {
    unsigned int byte = (unsigned int) (bitPos / 8);
    unsigned char byteVal = stream->data[byte];
    unsigned char mod = bitPos & 7;
    return (byteVal & (1 << mod)) >> mod;
}
I have obviously also changed the corresponding write function.

The obvious first improvement is to shift the loaded value rather than the mask:
unsigned char bitstream_read_bit(BitStream* stream, unsigned long long bitPos) {
    unsigned int byte = bitPos / 8;
    unsigned char byteVal = stream->data[byte];
    unsigned char maskVal = byteVal >> (bitPos & 7);
    return maskVal & 1;
}
This removes the need for a conditional (no if, !, or ?:).
If you can modify the struct, I'd recommend accessing by larger units than bytes:
#include <stddef.h>
#include <limits.h>
#include <stdbool.h>

typedef struct WBitStream
{
    size_t *data;
    size_t size;
} WBitStream;

bool Wbitstream_read_bit(WBitStream* stream, size_t bitPos)
{
    size_t location = bitPos / (sizeof(size_t)*CHAR_BIT);
    size_t locval = stream->data[location];
    size_t maskval = locval >> (bitPos & (sizeof(size_t)*CHAR_BIT-1));
    return maskval & 1;
}
On some processors (notably the ubiquitous x86), masking the shift amount is a no-op, since the processor's native shift instruction only considers the low bits of the shift amount anyway. At least gcc knows about this.
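To illustrate, a minimal usage sketch; the packing order here (bit 0 as the least significant bit of data[0]) is my assumption and must match whatever writes the stream:

#include <stdio.h>

int main(void)
{
    size_t words[2] = { 5, 0 };         /* binary ...0101: bits 0 and 2 set */
    WBitStream stream = { words, 2 };
    for (size_t i = 0; i < 4; i++)      /* prints 1 0 1 0 */
        printf("bit %zu = %d\n", i, (int) Wbitstream_read_bit(&stream, i));
    return 0;
}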

I tested optimized macros against your initial source code:
static unsigned char tMask[8] = { 128, 64, 32, 16, 8, 4, 2, 1 };
#define BITSTREAM_READ_BIT1(stream, bitPos) (((128 >> ((bitPos) & 7)) & (stream)->data[(bitPos) >> 3]) != 0)
#define BITSTREAM_READ_BIT2(stream, bitPos) ((tMask[(bitPos) & 7] & (stream)->data[(bitPos) >> 3]) != 0)
Replacing the mask computation with an array lookup does not improve performance.
The main gap is between function and macro (about 6 times faster on my computer, over 80,000,000 calls).
And a static inline function is not far behind the macro.
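For reference, a static inline version along those lines might look like this (a sketch; the speed claim assumes the compiler actually inlines it):

static inline unsigned char bitstream_read_bit_inline(const BitStream *stream, unsigned long long bitPos)
{
    /* Same lookup as BITSTREAM_READ_BIT2, but the argument is evaluated only once. */
    return (tMask[bitPos & 7] & stream->data[bitPos >> 3]) != 0;
}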

Here's how I initially optimized your code:
unsigned char bitstream_read_bit(BitStream* stream, unsigned long long bitPos)
{
    return !!(stream->data[bitPos / 8] & (128 >> (bitPos % 8)));
}
But the function-call overhead itself is likely more instructions than the bit-twiddling code inside it. So if you really want to optimize even further, let's take advantage of inlining and just convert it to a macro:
#define bitstream_read_bit(stream, bitPos) (!!((stream)->data[((bitPos) / 8)] & (128 >> ((bitPos) % 8))))

Related

How can you change a single bit of a bitmap inside a char array in C?

I'm forced to use a char array as a bitmap. For instance, this would be a 32-bit bitmap:
char bitmap[4];
Beforehand, I have initialized every byte of this array to 0. My question is: how can I set a single bit of this array to the value I want? I'm looking for a function with a structure similar to this, where the bitmap is passed as a parameter, along with the index of the bit we want to change and the value we want to change it to:
set_bit(char *bitmap, int bit, int value);
They force me to use a char array instead of an unsigned char array. It would also be useful to have a get_bit function with a similar structure that only asks for the bitmap and the bit to be probed as arguments.
Thank you in advance.
EDIT: I fixed the type of the bitmap in the set_bit definition
void setbit(void *arr, size_t bit, unsigned val)
{
    unsigned char *ucarr = arr;          // void * prevents compiler warnings when you pass a pointer of another type
    size_t index = bit >> 3;             // >>3 divides by 8, the number of bits in a char on most systems: the byte index
    unsigned char mask = 1 << (bit & 7); // &7 gives the bit number within the 8-bit byte
    ucarr[index] &= ~mask;               // zero the bit
    ucarr[index] |= mask * !!val;        // set the bit to the value (1 if val is nonzero, 0 if val == 0)
}
Or, if you are sure that val will be 0 or 1, a slightly more efficient version (saves a few clocks):
void setbit1(void *arr, size_t bit, unsigned val)
{
    unsigned char *ucarr = arr;
    size_t index = bit >> 3;
    size_t bitindex = bit & 7;
    unsigned char mask = 1 << bitindex;
    ucarr[index] &= ~mask;
    ucarr[index] |= val << bitindex;
}
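The question also asked for a get_bit counterpart; a minimal sketch in the same style, using the same bit numbering as setbit above:

unsigned getbit(const void *arr, size_t bit)
{
    const unsigned char *ucarr = arr;
    return (ucarr[bit >> 3] >> (bit & 7)) & 1;  // extract the bit and shift it down to 0 or 1
}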
Some other versions: https://godbolt.org/z/JGK-Zo
Or a somewhat more portable version (works for any power-of-two CHAR_BIT up to 256):
#include <limits.h>
// CHO evaluates to log2(CHAR_BIT) for power-of-two CHAR_BIT
#define CHO (((CHAR_BIT >> 1) & 1)*1 + ((CHAR_BIT >> 2) & 1)*2 + ((CHAR_BIT >> 3) & 1)*3 + ((CHAR_BIT >> 4) & 1)*4 + ((CHAR_BIT >> 5) & 1)*5 + ((CHAR_BIT >> 6) & 1)*6 + ((CHAR_BIT >> 7) & 1)*7 + ((CHAR_BIT >> 8) & 1)*8)
void setbit(void *arr, size_t bit, unsigned val)
{
    unsigned char *ucarr = arr;
    size_t index = bit >> CHO;
    unsigned char mask = 1 << (bit & (CHAR_BIT - 1));
    ucarr[index] &= ~mask;
    ucarr[index] |= mask * !!val;
}

void setbit1(void *arr, size_t bit, unsigned val)
{
    unsigned char *ucarr = arr;
    size_t index = bit >> CHO;
    size_t bitindex = bit & (CHAR_BIT - 1);
    unsigned char mask = 1 << bitindex;
    ucarr[index] &= ~mask;
    ucarr[index] |= val << bitindex;
}
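A quick usage sketch, reusing the getbit sketch shown earlier (which assumes CHAR_BIT == 8):

#include <stdio.h>

int main(void)
{
    unsigned char bitmap[4] = { 0 };   // a 32-bit bitmap, as in the question
    setbit(bitmap, 10, 1);
    setbit1(bitmap, 31, 1);
    setbit(bitmap, 10, 0);             // clear it again
    printf("%u %u\n", getbit(bitmap, 10), getbit(bitmap, 31));  // prints: 0 1
    return 0;
}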

Extracting 3 bytes to a number

What is the FASTEST way, using bit operators, to return the number represented by 3 different unsigned char variables?
unsigned char byte1 = 200;
unsigned char byte2 = 40;
unsigned char byte3 = 33;
unsigned long number = byte1 + byte2 * 256 + byte3 * 256 * 256;
is the slowest way possible.
Just shift each one into place, and OR them together:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t a = 0xAB, b = 0xCD, c = 0xEF;
    /*
     * 'a' must first be cast to uint32_t because of the implicit conversion
     * to int, which is only guaranteed to be at least 16 bits.
     * (Thanks Matt McNabb and Tim Čas.)
     */
    uint32_t i = ((uint32_t)a << 16) | (b << 8) | c;
    printf("0x%X\n", i);
    return 0;
}
Do note, however, that almost any modern compiler will replace a multiplication by a power of two with a bit-shift of the appropriate amount.
The fastest way would be direct memory writing, assuming you know the endianness of your system (here the assumption is little-endian):
unsigned char byte1 = 200;
unsigned char byte2 = 40;
unsigned char byte3 = 33;
unsigned long number = 0;
((unsigned char*)&number)[0] = byte1;
((unsigned char*)&number)[1] = byte2;
((unsigned char*)&number)[2] = byte3;
Or, if you don't mind a small exercise, you can do something like:
union
{
    unsigned long ulongVal;
    unsigned char chars[4]; // in case your long is 32 bits
} a;
and then by assigning:
a.chars[0] = byte1;
a.chars[1] = byte2;
a.chars[2] = byte3;
a.chars[3] = 0;
you can read the final value from a.ulongVal. This spares extra memory operations.
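Put together, a complete sketch; it assumes a 32-bit unsigned long and a little-endian machine, as stated above:

#include <stdio.h>

int main(void)
{
    union {
        unsigned long ulongVal;
        unsigned char chars[4];   // assumes a 32-bit long
    } a;
    a.chars[0] = 200;   // byte1
    a.chars[1] = 40;    // byte2
    a.chars[2] = 33;    // byte3
    a.chars[3] = 0;
    printf("%lu\n", a.ulongVal);  // 2173128 (200 + 40*256 + 33*65536) on little-endian
    return 0;
}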

How can I create a 48-bit uint for bit mask

I am trying to create a 48-bit integer value. I understand it may be possible to use a char array or struct, but I want to be able to do bit masking/manipulation and I'm not sure how that can be done.
Currently the program uses a 16-bit uint and I need to change it to 48. It is a bytecode interpreter and I want to expand the memory addressing to 4GB. I could just use 64-bit, but that would waste a lot of space.
Here is a sample of the code:
unsigned int program[] = { 0x1064, 0x11C8, 0x2201, 0x0000 };

void decode()
{
    instrNum = (program[i] & 0xF000) >> 12; // the instruction
    reg1     = (program[i] & 0xF00)  >> 8;  // registers
    reg2     = (program[i] & 0xF0)   >> 4;
    reg3     = (program[i] & 0xF);
    imm      = (program[i] & 0xFF);         // pointer to data
}
full program: http://en.wikibooks.org/wiki/Creating_a_Virtual_Machine/Register_VM_in_C
You can use bit-fields, which are often used to represent integral types of a known, fixed bit-width. A well-known use of bit-fields is to represent a set of bits, or a series of bits, known as flags. You can apply bit operations to them.
#include <stdio.h>
#include <stdint.h>

struct uint48 {
    uint64_t x : 48;
} __attribute__((packed));
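Note that __attribute__((packed)) is a GCC/Clang extension; with it, sizeof(struct uint48) is typically 6. A short sketch showing that arithmetic wraps at 48 bits:

int main(void)
{
    struct uint48 v = { 0xFFFFFFFFFFFFull };   // maximum 48-bit value
    v.x += 1;                                  // wraps to 0 within the 48-bit field
    printf("sizeof = %zu, v.x = %llu\n",
           sizeof(struct uint48), (unsigned long long) v.x);
    return 0;
}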
Use a structure or uint16_t array with helper functions for an array of uint48.
For individual values, use uint64_t or unsigned long long. uint64_t will work fine for an individual 48-bit value, but you may want to mask the results of operations like * or << to keep the upper bits cleared. The space-saving routines are only needed for arrays.
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t uint48;
const uint48 uint48mask = 0xFFFFFFFFFFFFull;  // low 48 bits set

uint48 uint48_get(const uint48 *a48, size_t index) {
    const uint16_t *a16 = (const uint16_t *) a48;
    index *= 3;
    return a16[index] | (uint32_t) a16[index + 1] << 16
            | (uint64_t) a16[index + 2] << 32;
}

void uint48_set(uint48 *a48, size_t index, uint48 value) {
    uint16_t *a16 = (uint16_t *) a48;
    index *= 3;
    a16[index] = (uint16_t) value;
    a16[++index] = (uint16_t) (value >> 16);
    a16[++index] = (uint16_t) (value >> 32);
}

uint48 *uint48_new(size_t n) {
    size_t size = n * 3 * sizeof(uint16_t);
    // Ensure the size allocated is a multiple of sizeof(uint64_t).
    // Not fully certain this is needed - but it doesn't hurt.
    if (size % sizeof(uint64_t)) {
        size += sizeof(uint64_t) - size % sizeof(uint64_t);
    }
    return malloc(size);
}
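A usage sketch of these routines (remember to free the array when done):

#include <stdio.h>

int main(void)
{
    uint48 *a = uint48_new(4);   // room for 4 packed 48-bit values
    if (a == NULL)
        return 1;
    uint48_set(a, 2, 0xABCD12345678ull & uint48mask);
    printf("%llx\n", (unsigned long long) uint48_get(a, 2));  // prints abcd12345678
    free(a);
    return 0;
}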

implement the bit map in the following situation

My question is how to implement the bit map in the following situation?
If a vertex of a graph is in a minimum spanning tree (MST), mark the corresponding bit; later on check whether it is in the MST by checking the bit.
At the beginning, I was thinking of using
typedef struct bit_t {
    char bit0 : 1;
} bit;

bit bitmap[num_of_vertex];
and using the bitmap array to record the bits.
But then I found that sizeof(bitmap) is num_of_vertex bytes, not num_of_vertex/8 bytes, so it does not save space the way I thought.
So far I use
long bit_record = 0;
...
bit_record |= 1<< u;//set vertex as in MST
...
Then later on I check whether a vertex is in the MST using:
static bool is_in_MST(int v, int bit_record) {
    int mask = 1 << v;
    if (mask & bit_record)
        return true;
    else
        return false;
}
Though the code works, it breaks if num_of_vertex is larger than 32.
How is a bitmap generally implemented in a situation like this?
The situation is that you just can't have 1-bit types in C. The smallest addressable unit in C is the byte, so even if you declare a struct with a one-bit bit-field, it will be padded to (at least) a byte. What you can do is create an array of bytes, then access the bits in the array using division and modulus.
#include <limits.h>  // for CHAR_BIT

unsigned char bitmap[0x100] = { 0 };

void set_nth_bit(unsigned char *bitmap, int idx)
{
    bitmap[idx / CHAR_BIT] |= 1 << (idx % CHAR_BIT);
}

void clear_nth_bit(unsigned char *bitmap, int idx)
{
    bitmap[idx / CHAR_BIT] &= ~(1 << (idx % CHAR_BIT));
}

int get_nth_bit(unsigned char *bitmap, int idx)
{
    return (bitmap[idx / CHAR_BIT] >> (idx % CHAR_BIT)) & 1;
}
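Applied to the MST question, a sketch: size the byte array from the vertex count, then mark and probe vertices by index (the vertex count here is hypothetical):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    enum { NUM_VERTICES = 100 };
    unsigned char mst[(NUM_VERTICES + CHAR_BIT - 1) / CHAR_BIT] = { 0 };
    set_nth_bit(mst, 42);                  // vertex 42 joins the MST
    printf("%d\n", get_nth_bit(mst, 42));  // prints 1
    clear_nth_bit(mst, 42);
    printf("%d\n", get_nth_bit(mst, 42));  // prints 0
    return 0;
}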
Regarding bitmaps, here is an example from Programming Pearls, to which I have added some notes:
#define BITPERWORD 32  // bits in an int, which depends on your computer
#define N 10000000     // number of your elements
#define SHIFT 5        // 32 = 2^5
#define MASK 0x1F      // 11111 in binary

int a[N/BITPERWORD + 1];  // space for your bitmap

// i is the bit you want to use
void set(int i) { a[i >> SHIFT] |=  (1 << (i & MASK)); }
void clr(int i) { a[i >> SHIFT] &= ~(1 << (i & MASK)); }
int test(int i) { return a[i >> SHIFT] & (1 << (i & MASK)); }
and don't forget the initialization at the beginning:
for (i = 0; i < N; i++)
    clr(i);
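If the array lives on the heap or is reused, the whole bitmap can also be cleared in one call instead of looping (a global array like the one above is already zero-initialized):

#include <string.h>
memset(a, 0, sizeof a);   // clear all the bits at once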

Converting Char array to Long in C

This question may look silly, but please guide me.
I have a function to convert long data to a char array:
void ConvertLongToChar(char *pSrc, char *pDest)
{
    pDest[0] = pSrc[0];
    pDest[1] = pSrc[1];
    pDest[2] = pSrc[2];
    pDest[3] = pSrc[3];
}
And I call the above function like this
long lTemp = (long) (fRxPower * 1000);
ConvertLongToChar ((char *)&lTemp, pBuffer);
Which works fine.
I need a similar function to reverse the procedure: convert a char array back to long.
I cannot use atol or similar functions.
You can do:
union {
    unsigned char c[4];
    long l;
} conv;

conv.l = 0xABC;
and access c[0], c[1], c[2], c[3]. This is good because it wastes no memory and is very fast: there is no shifting or any assignment besides the initial one, and it works both ways.
Leaving the burden of matching the endianness with your other function to you, here's one way:
unsigned long int l = pdest[0] | ((unsigned long) pdest[1] << 8) | ((unsigned long) pdest[2] << 16) | ((unsigned long) pdest[3] << 24);
Just to be safe, here's the corresponding other direction:
unsigned char pdest[4];
unsigned long int l;
pdest[0] = l & 0xFF;
pdest[1] = (l >> 8) & 0xFF;
pdest[2] = (l >> 16) & 0xFF;
pdest[3] = (l >> 24) & 0xFF;
Going from char[4] to long and back is entirely reversible; going from long to char[4] and back is reversible for values up to 2^32-1.
Note that all this is only well-defined for unsigned types.
(My example is little endian if you read pdest from left to right.)
Addendum: I'm also assuming that CHAR_BIT == 8. In general, substitute multiples of 8 by multiples of CHAR_BIT in the code.
A simple way would be to use memcpy:
char * buffer = ...;
long l;
memcpy(&l, buffer, sizeof(long));
That does not take endianness into account, however, so beware if you have to share data between multiple computers.
If you mean to treat sizeof(long) bytes of memory as a single long, then you should do the below:
char char_arr[sizeof(long)];
long l;
memcpy (&l, char_arr, sizeof (long));
This can also be done by assembling each byte of the long with bit shifts and ORs, as below (the casts avoid sign extension and overflow when shifting):
l = 0;
l |= (unsigned char) char_arr[0];
l |= (unsigned long) (unsigned char) char_arr[1] << 8;
l |= (unsigned long) (unsigned char) char_arr[2] << 16;
l |= (unsigned long) (unsigned char) char_arr[3] << 24;
If you mean to convert the string "1234\0" into 1234L, then you should:
l = strtol (char_arr, NULL, 10); /* to interpret the base as decimal */
Does this work:
#include <stdio.h>
#include <string.h>

long ConvertCharToLong(char *pSrc) {
    int i = 1;
    long result = pSrc[0] - '0';
    while (i < (int) strlen(pSrc)) {
        result = result * 10 + (pSrc[i] - '0');
        ++i;
    }
    return result;
}

int main() {
    char *str = "34878";
    printf("The answer is %ld", ConvertCharToLong(str));
    return 0;
}
This is dirty but it works:
unsigned char myCharArray[8];
// Put some data in myCharArray here...
long long integer = *((long long*) myCharArray);
char charArray[8]; //ideally, zero initialise
unsigned long long int combined = *(unsigned long long int *) &charArray[0];
Be wary of strings that are null terminated, as you will end up copying any bytes beyond the null terminator into combined; thus in the above assignment, charArray needs to be fully zero-initialised for a "clean" conversion.
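A memcpy-based variant sidesteps the strict-aliasing concern of the pointer cast (a sketch; endianness still determines the resulting value):

#include <string.h>

unsigned char myCharArray[8] = { 0 };   // fill with your data
unsigned long long combined;
memcpy(&combined, myCharArray, sizeof combined);  // copies 8 bytes verbatim, no aliasing issue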
Just found this, having tried more than one of the above to no avail :( :
char * vIn = "0";
long vOut = strtol(vIn,NULL,10);
Worked perfectly for me.
To give credit where it is due, this is where I found it:
https://www.convertdatatypes.com/Convert-char-Array-to-long-in-C.html