Need help explaining bit manipulation with / and % in wiss code - c

setbit(mapptr, pos)
register unsigned char *mapptr;
register int pos;
{
/* adjust word pointer and bit offset (pos) */
/************** FROM HERE ****************/
mapptr += pos / BITSPERBYTE;
pos %= BITSPERBYTE;
/*************** TO HERE ***************/
*mapptr |= 1 << pos; /* set bit */
}
There is bit manipulation code in wiss filesystem I don't understand (marked with a pair of comments).
Why do divide the pos into BITPERBYTE (defined as 8) and operate the pos into BITPERBYTE?

The code "packs" bits into bytes in such a way that bits 0..7 go into byte 0, 8..15 go into byte 1, 16..23 go into byte 2, and so on. If you divide a bit number by 8 and drop the remainder, you will end up with the corresponding byte number:
int byteNumber = bitNumber / BITSPERBYTE;
Your code snippet adds byteNumber to mapptr, which amounts to indexing an array of bytes through pointer arithmetic.
The result of % 8 is the same as obtaining these last three bits, i.e. a number from 0 to 7, inclusive. This is the bit number within the corresponding byte:
int bitInByte = bitNumber % BITSPERBYTE;
The result of % is always in the range 0..BITSPERBYTE-1.
Note: Your code is pre-ANSI, consider rewriting it using the syntax that has been standard for the last few decades, and change the type of pos to unsigned:
void setbit(unsigned char *mapptr, unsigned int pos) {
mapptr += pos / BITSPERBYTE;
pos %= BITSPERBYTE;
*mapptr |= 1 << pos;
}

You want to set bit X in a bitmap. If there are eight bits in a byte,
bit X will lie in byte X/8 (0-based, integer maths => rounded towards zero, i.e. rounded down for positive pos)
it will be the X%8 th bit of that byte (i.e. removing the component we used to find the byte in this computation)
so
mapptr += pos / BITSPERBYTE;
computes the byte offset of the bit we want to set, adding it to the pointer, and
pos %= BITSPERBYTE;
modified pos to index the bit with that byte that we want to set; then 1 << pos generates that bit, and we 'or' it into the byte we've picked:
*mapptr |= 1 << pos; /* set bit */

Related

what is the purpose of using the XOR operator and this line "if (current & 1)" "current = exclusive >> I;"

#include "main.h"
/**
* flip_bits - count the number of bits to change
* to get from one number to another
* #n: the first number
* #m: second number
* Return: the number of bits needed to change
*/
unsigned int flip_bits(unsigned long int n, unsigned long int m)
{
int i, count = 0;
unsigned long int current;
unsigned long int exclusive = n ^ m;
for (i = 63; i >= 0; i--)
{
current = exclusive >> i;
if (current & 1)
count++;
}
return (count);
}
The behaviour of XOR operation is that it returns a 1 when two operands(bits) are different and 0 if they are same. So XOR of two numbers would give you a value which when interpreted in binary shows the bit position where the two numbers differ. And to get the count of number of bits to change all you need to do is count the number of 1s in the result of XOR.
current = exclusive >> i;
this line shifts MSB of exclusive, bringing it to LSB and assigns it to current. And the line below
if (current & 1)
count++;
is doing a check whether the LSB of current(i.e. MSB of XOR result) is 1 or not(since 1 in binary is all zeros except LSB 1 and) counting that.
These lines are in a loop so in all it is checking for each position if it is a 1 and counting it. You can do it the other way around as well going from LSB to MSB but that is a messier way, and you should perhaps use sizeof(unsigned long int)*8 - 1 as the initial value of i in your for loop instead of 63, as the sizes are platform dependent.

Writing a stream of 9 bit values as bytes to a file in C

I have an array with integer values from 0-511 (9 bits max). I am trying to write this to a file with fwrite.
For Example, with the array:
[257, 258, 259]
Which is 100000001, 100000010, 100000011
I am trying to write
100000001100000010100000011 + extra padding of 0s to the file
But since fwrite limits writes to 1 byte at a time, I am not sure how to go about doing this. I am new to bitwise operations and am not how to separate out the individual bytes.
You need a bit buffer.
Since you are writing 8 bits at the time, you must
have data type that can hold at least 9+7 bits at minimum. uint16_t would do,
but I recommend using size that would at least as big as your native int. Make sure you use unsigned types to avoid shifting issues.
uint32_t bitBuffer = 0; // Our temporary bit storage
uint32_t count = 0; // Number of bits in buffer
Let's assume we have single data:
uint32_t data9b = 257; // 1 0000 0001
Adding bits to buffer is simple; just shift bits at the end of the buffer,
and combine with OR.
bitBuffer |= (data9b << count); // At first iteration, shift does nothing
count += 9; // Update counter
After 9 bits are added, we can flush 8 bits to file.
while(count >= 8) {
writeToFile(bitBuffer & 0xFF); // Mask out lowest bits with AND
bitBuffer >>= 8; // Remove written bits
count -= 8; // Fix counter
}
After each cycle you have 0 - 7 bits left over in the buffer. At the end of all data, if you finish with non-multiple of 8 bits, just write remaining contents of bitBuffer to file.
if(count > 0)
writeToFile(bitBuffer);
Ok, so did it using bit shifting, oring (can also do with *, '+', % and /) but shift is more appropriate / readable, imo.
// Your data, n is the number of 9-bits values
uint16_t dat[] = { 257, 258, 259 };
int i,n = sizeof(dat)/sizeof(*dat);
// out file
FILE *fp = fopen("f8.bin","w");
uint16_t one = 0;
int shift = 0;
uint8_t b;
// main loop
for(i=0 ; i<n ; i++) {
b = (dat[i] << shift) | one; // one is remainder from previous value
one = dat[i]>>(8-shift); // Move the remaining MSb to the right
shift = (shift+9) % 8; // next shift
fwrite(&b, 1, 1, fp); // write our b byte
}
// remainder, always have a remainder
fwrite(&one, 1, 1, fp);
fclose(fp);
Had fun :-)

CRC code and implementation compatibility

the mechanism or the steps for the CRC checksum is easy , but the software is somehow much complicated and there are some steps in software that are not compatible with the steps of CRC
the following picture is the steps for getting the checksum of the CRC ( which is simply a modulo 2 division):
the checksum is the remainder = 001
the software for calculating the CRC checksum is for a string of bits is:
/*
* The width of the CRC calculation and result.
* Modify the typedef for a 16 or 32-bit CRC standard.
*/
typedef uint8_t crc;
#define WIDTH (8 * sizeof(crc))
#define TOPBIT (1 << (WIDTH - 1))
crc
crcSlow(uint8_t const message[], int nBytes)
{
crc remainder = 0;
/*
* Perform modulo-2 division, a byte at a time.
*/
for (int byte = 0; byte < nBytes; ++byte)
{
/*
* Bring the next byte into the remainder.
*/
remainder ^= (message[byte] << (WIDTH - 8));
/*
* Perform modulo-2 division, a bit at a time.
*/
for (uint8_t bit = 8; bit > 0; --bit)
{
/*
* Try to divide the current data bit.
*/
if (remainder & TOPBIT)
{
remainder = (remainder << 1) ^ POLYNOMIAL;
}
else
{
remainder = (remainder << 1);
}
}
}
/*
* The final remainder is the CRC result.
*/
return (remainder);
}
I see that there is incompatibility in the software in the part( remainder<<1 ) because the shifting will always add 0 at the right even if the following bit is not 0.
and also in the part:
remainder ^= (message[byte] << (WIDTH - 8));
when putting the first byte I don't see problem because the initial value is because the initial value is 0, but when inserting the next bytes why we xor every byte of them with the previous remainder
The code example appears to use a variable sized CRC, where the size of the CRC is WIDTH. POLYNOMIAL is the bottom WIDTH bits of a WIDTH+1 bit polynomial, which will have the least significant bit set to 1. Since the operations are XOR, the order in which the data bits are XOR'ed into the remainder doesn't matter, so 8 data bits can be XOR'ed into the upper bits of remainder at the same time. Then the bit at a time feedback cycle occurs for 8 bits. Since the bottom bit of POLYNOMIAL is a 1, that will keep the cycle going, as long as there are any 1 bits in the data.

Merging 13 bits array into an array of unsigned char

I'm writing an algorithm that compresses data (LZSS) and it requires me to have two 13-bit values which I'll have to later merge together.
In some cases, however, I don't need 13 bits; 8 are enough.
For this purpose I have a structure like this:
typedef struct pattern
{
char is_compressed:1; //flag
short index :13; //first value
short length :13; //second value
unsigned char c; //is 8 bits are enough, use this instead
} Pattern;
I therefore have an array of these structures, and each structure can either contain the two 13-bit values or an 8-bit value.
I am now looping over this array, and my objective is to merge all these bits together.
I easily calculated the total number of bits used and the number of arrays of unsigned chars (8 bits) needed in order to store all the values:
int compressed = 0, plain = 0;
//count is the amount of patterns i have and p is the array of patterns (the structures)
for (int i = 0; i < count; i++)
{
if (p[i]->is_compressed)
compressed++;
else
plain++;
}
//this stores the number of bits used in the pattern (13 for length and 13 for the index or 8 for the plain uchar)
int tot_bits = compressed * 26 + plain * 8;
//since we can only write a minimum of 8 bits, we calculate how many arrays are needed to store the bits
int nr_of_arrays = (tot_bits % 8 == 0) ? tot_bits / 8 : (tot_bits / 8) + 1;
//we allocate the needed memory for the array of unsigned chars that will contain, concatenated, all the bits
unsigned char* uc = (unsigned char*) malloc(nr_of_arrays * sizeof(unsigned char));
After allocating the memory for the array I'm going to fill, I simply loop through the array of structures and recognize whether the structure I'm looking at contains the two 13-bit values or just the 8-bit one
for (int i = 0; i < count; i++)
{
if (p->is_compressed)
{
//The structure contains the two 13 bits value
}
else
{
//The structure only contains the 8 bits value
}
}
Here I'm stuck and can't seem to figure out a proper way of getting the job done.
Does anybody of you know how to implement that part there?
A practical example would be:
pattern 1 contains the 2 13-bit values:
1111 1111 1111 1
0000 0000 0000 0
pattern 2 contains the 8-bit value
1010 1010
total bits: 34
number of arrays required: 5 (that will waste 6 bits)
resulting array is:
[0] 1111 1111
[1] 1111 1000
[2] 0000 0000
[3] 0010 1010
[4] 1000 0000 (the remaining 6 bits are set to 0)
One way to do that is to write bytes one by one and keep track of partial bytes as you write.
You need a pointer to your char array, and an integer to keep track of how many bits you wrote to the last byte. Every time you write bits, you check how many bits you can write to the last byte, and you write these bits accordingly (ex: if there is 5 bits free, you shift your next value by 3 and add it to the last byte). Every time a byte is complete, you increment your array pointer and reset your bit tracker.
A clean way to implement this would be to write functions like :
void BitWriter_init( char *myArray );
void BitWriter_write( int theBitsToWrite, int howManyBits );
Now you just have to figure out how to implement these functions, or use any other method of your choice.
The problem intrigued me. Here's a possible implementation of "by using a lot of bitwise operations":
/* A writable bit string, with an indicator of the next available bit */
struct bitbuffer {
uint8_t *bytes;
size_t next_bit;
};
/*
* writes the bits represented by the given pattern to the next available
* positions in the specified bit buffer
*/
void write_bits(struct bitbuffer *buffer, Pattern *pattern) {
/* The index of the byte containing the next available bit */
size_t next_byte = buffer->next_bit / 8;
/* the number of bits already used in the next available byte */
unsigned bits_used = buffer->next_bit % 8;
if (pattern->is_compressed) {
/* assemble the bits to write in a 32-bit block */
uint32_t bits = pattern->index << 13 + pattern->length;
if (bits_used == 7) {
/* special case: the bits to write will span 5 bytes */
/* the first bit written will be the last in the current byte */
uint8_t first_bit = bits >> 25;
buffer->bytes[next_byte] |= first_bit;
/* write the next 8 bits to the next byte */
buffer->bytes[++next_byte] = (bits >> 17) & 0xFF;
/* align the tail of the bit block with the buffer*/
bits <<= 7;
} else {
/* the first bits written will fill out the current byte */
uint8_t first_bits = (bits >> (18 + bits_used)) & 0xFF;
buffer->bytes[next_byte] |= first_bits;
/* align the tail of the bit block with the buffer*/
bits <<= (6 - bits_used);
}
/*
* Write the remainder of the bit block to the buffer,
* most-significant bits first. Three (more) bytes will be modified.
*/
buffer->bytes[++next_byte] = (bits >> 16) & 0xFF;
buffer->bytes[++next_byte] = (bits >> 8) & 0xFF;
buffer->bytes[++next_byte] = bits & 0xFF;
/* update the buffer's index of the next available bit */
buffer->next_bit += 26;
} else { /* the pattern is not compressed */
if (bits_used) {
/* the bits to write will span two bytes in the buffer */
buffer->bytes[next_byte] |= (pattern->c >> bits_used);
buffer[++next_byte] = (pattern->c << bits_used) & 0xFF;
} else {
/* the bits to write exactly fill the next buffer byte */
buffer->bytes[next_byte] = pattern->c;
}
/* update the buffer's index of the next available bit */
buffer->next_bit += 8;
}
}

How to define and work with an array of bits in C?

I want to create a very large array on which I write '0's and '1's. I'm trying to simulate a physical process called random sequential adsorption, where units of length 2, dimers, are deposited onto an n-dimensional lattice at a random location, without overlapping each other. The process stops when there is no more room left on the lattice for depositing more dimers (lattice is jammed).
Initially I start with a lattice of zeroes, and the dimers are represented by a pair of '1's. As each dimer is deposited, the site on the left of the dimer is blocked, due to the fact that the dimers cannot overlap. So I simulate this process by depositing a triple of '1's on the lattice. I need to repeat the entire simulation a large number of times and then work out the average coverage %.
I've already done this using an array of chars for 1D and 2D lattices. At the moment I'm trying to make the code as efficient as possible, before working on the 3D problem and more complicated generalisations.
This is basically what the code looks like in 1D, simplified:
int main()
{
/* Define lattice */
array = (char*)malloc(N * sizeof(char));
total_c = 0;
/* Carry out RSA multiple times */
for (i = 0; i < 1000; i++)
rand_seq_ads();
/* Calculate average coverage efficiency at jamming */
printf("coverage efficiency = %lf", total_c/1000);
return 0;
}
void rand_seq_ads()
{
/* Initialise array, initial conditions */
memset(a, 0, N * sizeof(char));
available_sites = N;
count = 0;
/* While the lattice still has enough room... */
while(available_sites != 0)
{
/* Generate random site location */
x = rand();
/* Deposit dimer (if site is available) */
if(array[x] == 0)
{
array[x] = 1;
array[x+1] = 1;
count += 1;
available_sites += -2;
}
/* Mark site left of dimer as unavailable (if its empty) */
if(array[x-1] == 0)
{
array[x-1] = 1;
available_sites += -1;
}
}
/* Calculate coverage %, and add to total */
c = count/N
total_c += c;
}
For the actual project I'm doing, it involves not just dimers but trimers, quadrimers, and all sorts of shapes and sizes (for 2D and 3D).
I was hoping that I would be able to work with individual bits instead of bytes, but I've been reading around and as far as I can tell you can only change 1 byte at a time, so either I need to do some complicated indexing or there is a simpler way to do it?
Thanks for your answers
If I am not too late, this page gives awesome explanation with examples.
An array of int can be used to deal with array of bits. Assuming size of int to be 4 bytes, when we talk about an int, we are dealing with 32 bits. Say we have int A[10], means we are working on 10*4*8 = 320 bits and following figure shows it: (each element of array has 4 big blocks, each of which represent a byte and each of the smaller blocks represent a bit)
So, to set the kth bit in array A:
// NOTE: if using "uint8_t A[]" instead of "int A[]" then divide by 8, not 32
void SetBit( int A[], int k )
{
int i = k/32; //gives the corresponding index in the array A
int pos = k%32; //gives the corresponding bit position in A[i]
unsigned int flag = 1; // flag = 0000.....00001
flag = flag << pos; // flag = 0000...010...000 (shifted k positions)
A[i] = A[i] | flag; // Set the bit at the k-th position in A[i]
}
or in the shortened version
void SetBit( int A[], int k )
{
A[k/32] |= 1 << (k%32); // Set the bit at the k-th position in A[i]
}
similarly to clear kth bit:
void ClearBit( int A[], int k )
{
A[k/32] &= ~(1 << (k%32));
}
and to test if the kth bit:
int TestBit( int A[], int k )
{
return ( (A[k/32] & (1 << (k%32) )) != 0 ) ;
}
As said above, these manipulations can be written as macros too:
// Due order of operation wrap 'k' in parentheses in case it
// is passed as an equation, e.g. i + 1, otherwise the first
// part evaluates to "A[i + (1/32)]" not "A[(i + 1)/32]"
#define SetBit(A,k) ( A[(k)/32] |= (1 << ((k)%32)) )
#define ClearBit(A,k) ( A[(k)/32] &= ~(1 << ((k)%32)) )
#define TestBit(A,k) ( A[(k)/32] & (1 << ((k)%32)) )
typedef unsigned long bfield_t[ size_needed/sizeof(long) ];
// long because that's probably what your cpu is best at
// The size_needed should be evenly divisable by sizeof(long) or
// you could (sizeof(long)-1+size_needed)/sizeof(long) to force it to round up
Now, each long in a bfield_t can hold sizeof(long)*8 bits.
You can calculate the index of a needed big by:
bindex = index / (8 * sizeof(long) );
and your bit number by
b = index % (8 * sizeof(long) );
You can then look up the long you need and then mask out the bit you need from it.
result = my_field[bindex] & (1<<b);
or
result = 1 & (my_field[bindex]>>b); // if you prefer them to be in bit0
The first one may be faster on some cpus or may save you shifting back up of you need
to perform operations between the same bit in multiple bit arrays. It also mirrors
the setting and clearing of a bit in the field more closely than the second implemention.
set:
my_field[bindex] |= 1<<b;
clear:
my_field[bindex] &= ~(1<<b);
You should remember that you can use bitwise operations on the longs that hold the fields
and that's the same as the operations on the individual bits.
You'll probably also want to look into the ffs, fls, ffc, and flc functions if available. ffs should always be avaiable in strings.h. It's there just for this purpose -- a string of bits.
Anyway, it is find first set and essentially:
int ffs(int x) {
int c = 0;
while (!(x&1) ) {
c++;
x>>=1;
}
return c; // except that it handles x = 0 differently
}
This is a common operation for processors to have an instruction for and your compiler will probably generate that instruction rather than calling a function like the one I wrote. x86 has an instruction for this, by the way. Oh, and ffsl and ffsll are the same function except take long and long long, respectively.
You can use & (bitwise and) and << (left shift).
For example, (1 << 3) results in "00001000" in binary. So your code could look like:
char eightBits = 0;
//Set the 5th and 6th bits from the right to 1
eightBits &= (1 << 4);
eightBits &= (1 << 5);
//eightBits now looks like "00110000".
Then just scale it up with an array of chars and figure out the appropriate byte to modify first.
For more efficiency, you could define a list of bitfields in advance and put them in an array:
#define BIT8 0x01
#define BIT7 0x02
#define BIT6 0x04
#define BIT5 0x08
#define BIT4 0x10
#define BIT3 0x20
#define BIT2 0x40
#define BIT1 0x80
char bits[8] = {BIT1, BIT2, BIT3, BIT4, BIT5, BIT6, BIT7, BIT8};
Then you avoid the overhead of the bit shifting and you can index your bits, turning the previous code into:
eightBits &= (bits[3] & bits[4]);
Alternatively, if you can use C++, you could just use an std::vector<bool> which is internally defined as a vector of bits, complete with direct indexing.
bitarray.h:
#include <inttypes.h> // defines uint32_t
//typedef unsigned int bitarray_t; // if you know that int is 32 bits
typedef uint32_t bitarray_t;
#define RESERVE_BITS(n) (((n)+0x1f)>>5)
#define DW_INDEX(x) ((x)>>5)
#define BIT_INDEX(x) ((x)&0x1f)
#define getbit(array,index) (((array)[DW_INDEX(index)]>>BIT_INDEX(index))&1)
#define putbit(array, index, bit) \
((bit)&1 ? ((array)[DW_INDEX(index)] |= 1<<BIT_INDEX(index)) \
: ((array)[DW_INDEX(index)] &= ~(1<<BIT_INDEX(index))) \
, 0 \
)
Use:
bitarray_t arr[RESERVE_BITS(130)] = {0, 0x12345678,0xabcdef0,0xffff0000,0};
int i = getbit(arr,5);
putbit(arr,6,1);
int x=2; // the least significant bit is 0
putbit(arr,6,x); // sets bit 6 to 0 because 2&1 is 0
putbit(arr,6,!!x); // sets bit 6 to 1 because !!2 is 1
EDIT the docs:
"dword" = "double word" = 32-bit value (unsigned, but that's not really important)
RESERVE_BITS: number_of_bits --> number_of_dwords
RESERVE_BITS(n) is the number of 32-bit integers enough to store n bits
DW_INDEX: bit_index_in_array --> dword_index_in_array
DW_INDEX(i) is the index of dword where the i-th bit is stored.
Both bit and dword indexes start from 0.
BIT_INDEX: bit_index_in_array --> bit_index_in_dword
If i is the number of some bit in the array, BIT_INDEX(i) is the number
of that bit in the dword where the bit is stored.
And the dword is known via DW_INDEX().
getbit: bit_array, bit_index_in_array --> bit_value
putbit: bit_array, bit_index_in_array, bit_value --> 0
getbit(array,i) fetches the dword containing the bit i and shifts the dword right, so that the bit i becomes the least significant bit. Then, a bitwise and with 1 clears all other bits.
putbit(array, i, v) first of all checks the least significant bit of v; if it is 0, we have to clear the bit, and if it is 1, we have to set it.
To set the bit, we do a bitwise or of the dword that contains the bit and the value of 1 shifted left by bit_index_in_dword: that bit is set, and other bits do not change.
To clear the bit, we do a bitwise and of the dword that contains the bit and the bitwise complement of 1 shifted left by bit_index_in_dword: that value has all bits set to one except the only zero bit in the position that we want to clear.
The macro ends with , 0 because otherwise it would return the value of dword where the bit i is stored, and that value is not meaningful. One could also use ((void)0).
It's a trade-off:
(1) use 1 byte for each 2 bit value - simple, fast, but uses 4x memory
(2) pack bits into bytes - more complex, some performance overhead, uses minimum memory
If you have enough memory available then go for (1), otherwise consider (2).

Resources