Bitwise Operation on a byte and an int - c

I have a byte array represented as
char * bytes = getbytes(object); //some api function
I want to check whether the bit at some position x is set.
I've been trying this
int mask = 1 << x % 8;
y= bytes[x>>3] & mask;
However y returns as all zeros? What am I doing incorrectly and is there an easier way to check if a bit is set?
EDIT:
I did run this as well. It didn't return with the expected result either.
int k = x >> 3;
int mask = x % 8;
unsigned char byte = bytes[k];
return (byte & mask);
It failed an assert-true ctest I ran. When printed from gdb at that point, byte and mask were "0002" and 2 respectively.
edit 2: This is how I set the bits in the first place. I'm just trying to write a test to verify they are set.
unsigned long x = somehash(void* a);
unsigned int mask = 1 << (x % 8);
unsigned int location = x >> 3;
char* filter = getData(ref);
filter[location] |= mask;

This would be one (perhaps crude) way, off the top of my head:
#include <stdio.h>
#include <stdlib.h>
// this function *changes* the byte array
int getBit(char *b, int bit)
{
int bitToCheck = bit % 8;
b = b + (bitToCheck ? (bit / 8) : (bit / 8 - 1));
if (bitToCheck)
*b = (*b) >> (8 - bitToCheck);
return (*b) & 1;
}
int main(void)
{
char *bytes = calloc(2, 1);
*(bytes + 1)= 5; // writing to the appropriate bits
printf("%d\n", getBit(bytes, 16)); // checking the 16th bit from the left
return 0;
}
Assumptions:
A byte is represented as:
----------------------------------------
| 2^7 | 2^6 | 2^5 | 2^4 | 2^3 |... |
----------------------------------------
The left most bit is considered bit number 1 and the right most bit is considered the max. numbered bit (16th bit in a 2 byte object).
It's OK to overwrite the actual byte object (if this is not wanted, use memcpy).
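For comparison, here is a minimal non-mutating sketch that follows the layout used in the question's "edit 2": byte index x >> 3, bit x % 8 counted from the least significant bit. Note that the second attempt in the question uses x % 8 directly as the mask instead of 1 << (x % 8), which tests the wrong bits. The names set_bit and test_bit are illustrative and not part of any API mentioned in the question.

```c
/* Set bit number x, counting from the least significant bit of bytes[0]. */
static void set_bit(unsigned char *bytes, unsigned long x)
{
    bytes[x >> 3] |= (unsigned char)(1u << (x % 8));
}

/* Return 1 if bit number x is set, 0 otherwise; the array is not modified. */
static int test_bit(const unsigned char *bytes, unsigned long x)
{
    return (bytes[x >> 3] >> (x % 8)) & 1u;
}
```

Using unsigned char sidesteps sign-extension surprises on platforms where plain char is signed, and normalising the result to 0 or 1 makes it safe to compare against 1 in a test.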

Related

Reversing the nibbles

I am trying to "build a new number by reversing its nibbles".
This is the exercise:
Write a function that given an unsigned n
a) returns the value with the nibbles placed in reverse order
I was thinking that all the 8 nibbles from the 32 bit unsigned should be placed in reverse order. So , as an example for the number 24, which is 00000000000000000000000000011000.
=> The reversed value should be: 10000001000000000000000000000000.
#include <stdio.h>
unsigned getNibble(unsigned n,unsigned p){
unsigned mask = 0xFu;
unsigned nibble = 0;
nibble = (n&(mask<<p))>>p;
return nibble;
}
unsigned swapNibbles(unsigned n){
unsigned new = 0;
unsigned nibble;
for(unsigned i=0;i<(sizeof(n)*8);i=i+4){
nibble = getNibble(n,i);
new = (new<<i) + nibble;
}
return new;
}
int main(void) {
printf("0x%x",swapNibbles(24));
return 0;
}
I tried to debug it , and it went well until one point.
At one of the right shifts , it transformed my "new" variable into 0.
This statement
new = (new << i) + nibble;
is wrong. There should be
new = (new << 4) + nibble;
An approach that does work in parallel:
uint32_t n = ...;
// Swap the nibbles of each byte.
n = (n & 0x0F0F0F0F ) << 4
| (n & 0xF0F0F0F0 ) >> 4;
// Swap the bytes of each byte pair.
n = ( n & 0x00FF00FF ) << 8
| ( n & 0xFF00FF00 ) >> 8;
// Swap the byte pairs.
n = ( n & 0x0000FFFF ) << 16
| ( n & 0xFFFF0000 ) >> 16;
Doing the work in parallel greatly reduces the number of operations.
            OP's        This
            Approach    Approach
--------    ---------   ---------
Shifts      24 / 48      6 / 8
Ands         8 / 16      6 / 8
Ors*         8 / 16      3 / 4
Assigns      8 / 16      3 / 4
Adds         8 / 16      0 / 0
Compares     8 / 16      0 / 0
--------    ---------   ---------
Total       64 / 128    18 / 24
--------    ---------   ---------
Scale       O(N)        O(log(N))
Each cell reads "32 bits / 64 bits".
* Addition was used as "or" in the OP's solution.
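As a sketch, the three parallel swap steps can be wrapped in a function for 32-bit values. The name reverse_nibbles is an assumption made for illustration; the masks are the ones from the answer, with explicit parentheses around each shifted term.

```c
#include <stdint.h>

/* Reverse the order of the eight nibbles in a 32-bit value. */
uint32_t reverse_nibbles(uint32_t n)
{
    /* Swap the two nibbles inside every byte. */
    n = ((n & 0x0F0F0F0Fu) << 4) | ((n & 0xF0F0F0F0u) >> 4);
    /* Swap the two bytes inside every 16-bit pair. */
    n = ((n & 0x00FF00FFu) << 8) | ((n & 0xFF00FF00u) >> 8);
    /* Swap the two 16-bit halves. */
    n = ((n & 0x0000FFFFu) << 16) | ((n & 0xFFFF0000u) >> 16);
    return n;
}
```

For the value from the question, reverse_nibbles(24) yields 0x81000000, and applying the function twice returns the original value.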
#include <stdio.h>
#include <stdint.h>
int main (void)
{
uint32_t x = 0xDEADBEEF;
printf("original 4 bytes %X\n", x);
uint32_t y = 0;
for(uint8_t i = 0 ; i < 32 ; i += 4)
{
y <<= 4;
y |= x>>i & 0xF;
}
printf("reverse order nibbles %X\n", y);
return 0;
}
This could be made into a generic function accepting 8-, 16- and 32-bit numbers, but for now it resolves the bug you are facing in your code.
I would point out, though, that ikegami's code is much better than this approach.

Re-Indexing Bits Within a Char

I have an exercise where I have to encode and decode strings at the bit level that are given in by the command line.
The caveat for this is that I have to use a permutation mapping to re-order the bits.
Here's an Example:
The User Inputs The Character To Encode
H
The Binary for H is
01001000
However, that is the regular mapping of the 8 bits, through 0-7.
My program will have to permute the bits according to whatever mapping pattern I use.
For Example, If I use Mapping 64752031
The Bits for the Char 'H'
01001000
Turn To
01000001
When encoding the char, the 0th bit turns to the 6th bit, the 2nd bit turns to the 4th bit, the 3rd bit turns to the 7th bit, and so on. Whatever is based on for that mapping.
Is there a way that I can manipulate and change the order of bits based on the permutation map given?
Thank you.
If you need to process large strings, it is probably better to use a look-up table that will precompute the translation.
#include <stdio.h>
unsigned char perm[256]; // permutation table
unsigned mapping[8]={6,4,7,5,2,0,3,1};
// assumes 7 6 5 4 3 2 1 0
// => 6 4 7 5 2 0 3 1
void mkperm(unsigned char perm[256]) {
for (int i=0; i<256; i++)
perm[i]=0;
for (int i=0;i<256;i++) {
for (int j=7; j>=0; j--) {
int pos=mapping[7-j]; // at mapping[0] is the new position of bit 7
if (i & (1<<j)) // only considers set bits, the table is previously cleared
perm[i] |= (1<<pos) ;
}
}
}
int main() {
mkperm(perm);
printf("%.2x => %.2x\n",'H',perm['H']);
}
mkperm() computes the permutation table by scanning the successive bits of every char value. If bit j is set in value i, the bit at the position given by the mapping is set in entry i of the translation table, by ORing that entry with a suitably shifted 1.
Use bitwise operators.
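Here's an example of how to move the second bit to the seventh bit:
x = (x & ~(1<<6)) | (((x >> 1) & 1) << 6); // copy bit 2 into bit 7
x &= ~(1<<1); // clear bit 2
If my bit numbering bothers anybody, I'm sorry. This is just how I read binary numbers.
You can also put this into an inline function:
static inline int bit_mode(int *x, int bit1, int bit2)
{
int bit = (*x >> (bit1-1)) & 1; // value of the source bit (1-based numbering)
*x &= ~(1<<(bit1-1)); // clear the source bit
*x = (*x & ~(1<<(bit2-1))) | (bit << (bit2-1)); // write it to the destination bit
return *x;
}
int a = 0x02; // example value with the second bit set
bit_mode(&a, 2, 7);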
Just shift the bits to proper positions. After some fun, I think I've got this:
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <limits.h>
#include <stdint.h>
/**
* A little helper function
* get the bit number 'as' from the byte 'in'
* and put that bit as the number 'num' in the output
*/
static inline
uint8_t map_get_bit_as(uint8_t in,
uint8_t num, uint8_t as)
{
return (!!(in & (1 << as))) << num;
}
uint8_t map(unsigned long mapping, uint8_t in)
{
// static_assert(CHAR_BIT == 8, "are you insane?");
const int bit0 = mapping / 10000000 % 10;
const int bit1 = mapping / 1000000 % 10;
const int bit2 = mapping / 100000 % 10;
const int bit3 = mapping / 10000 % 10;
const int bit4 = mapping / 1000 % 10;
const int bit5 = mapping / 100 % 10;
const int bit6 = mapping / 10 % 10;
const int bit7 = mapping / 1 % 10;
return
map_get_bit_as(in, 0, bit0) |
map_get_bit_as(in, 1, bit1) |
map_get_bit_as(in, 2, bit2) |
map_get_bit_as(in, 3, bit3) |
map_get_bit_as(in, 4, bit4) |
map_get_bit_as(in, 5, bit5) |
map_get_bit_as(in, 6, bit6) |
map_get_bit_as(in, 7, bit7);
}
int main() {
printf("%#02x %#02x\n\n", 'H', map(64752031, 'H'));
}
will output:
0x48 0x41
tested on repl.
If I have understood correctly how you are counting the bits, the function can look as shown in this demonstrative program.
#include <stdio.h>
#include <limits.h>
#include <stdint.h>
char encode( char c, uint32_t mask )
{
unsigned char result = '\0';
for ( size_t i = 0; i < 2 * sizeof( mask ) ; i++ )
{
uint32_t bit = ( ( ( uint32_t )1 << ( CHAR_BIT - 1 - ( mask & 0xf ) ) ) & c ) != 0;
result |= bit << i;
mask >>= 4;
}
return ( char )result;
}
int main( void )
{
uint32_t mask = 0x64752031;
char c = 'H';
printf( "c = %hhx\n", c );
c = encode( c, mask );
printf( "c = %hhx\n", c );
}
The program output is
c = 48
c = 41
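Since the exercise asks for decoding as well, here is a hedged sketch of an encode/decode pair. It assumes the convention of the map() answer above: the k-th decimal digit of the mapping (counting from the left, starting at 0) names the input bit that becomes output bit k. The function names are made up for illustration.

```c
#include <stdint.h>

/* Split a decimal mapping such as 64752031 into digits.
 * src[k] is the input bit that feeds output bit k (leftmost digit -> k = 0). */
static void mapping_digits(unsigned long mapping, int src[8])
{
    for (int k = 7; k >= 0; k--) {
        src[k] = (int)(mapping % 10);
        mapping /= 10;
    }
}

/* Encode: output bit k is a copy of input bit src[k]. */
uint8_t permute_encode(unsigned long mapping, uint8_t in)
{
    int src[8];
    uint8_t out = 0;
    mapping_digits(mapping, src);
    for (int k = 0; k < 8; k++)
        out |= (uint8_t)(((in >> src[k]) & 1u) << k);
    return out;
}

/* Decode: put encoded bit k back at position src[k]. */
uint8_t permute_decode(unsigned long mapping, uint8_t in)
{
    int src[8];
    uint8_t out = 0;
    mapping_digits(mapping, src);
    for (int k = 0; k < 8; k++)
        out |= (uint8_t)(((in >> k) & 1u) << src[k]);
    return out;
}
```

With the mapping 64752031, permute_encode turns 'H' (0x48) into 0x41 and permute_decode turns 0x41 back into 0x48. This only round-trips when the mapping is a true permutation of the digits 0 to 7.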

Extracting a particular range of bits and find number of zeros between them in C

I want to extract a particular range of bits in an integer variable.
For example: 0xA5 (10100101)
I want to extract from bit2 to bit5. i.e 1001 to a variable and count number of zeros between them.
I have another variable which gives the starting point; in this case its value is 2. So the starting point can be found with 0xA5 >> 2.
The 5th bit position is arbitrary here; it could be 6 or 7. The main idea is to find whichever bit is set to 1 after the 2nd bit, and extract up to that bit.
How can I do the rest?
Assuming you are dealing with unsigned int for your variable.
You will have to construct the appropriate mask.
Suppose you want the bits from position x to position y, there need to be y - x + 1 1s in the mask.
You can get this by -
int digits = y - x + 1;
unsigned int mask = (1u << digits) - 1;
Now you need to remove the lower x bits from the initial number, which can be done by -
unsigned int result = number >> x;
Finally apply the mask to remove the upper bits -
result = result & mask;
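Put together, the steps can be sketched as one function. A mask with y - x + 1 one-bits is (1u << digits) - 1; the parentheses matter because subtraction binds tighter than the shift. The name extract_bits is illustrative, and the sketch assumes x <= y and that y is smaller than the width of unsigned int.

```c
#include <limits.h>

/* Return bits x..y (inclusive, bit 0 = least significant) of number,
 * shifted down so that bit x ends up at position 0. */
unsigned int extract_bits(unsigned int number, unsigned int x, unsigned int y)
{
    unsigned int digits = y - x + 1;
    unsigned int width = (unsigned int)(sizeof(unsigned int) * CHAR_BIT);
    /* Shifting by the full width is undefined, so handle that case separately. */
    unsigned int mask = (digits >= width) ? ~0u : (1u << digits) - 1u;
    return (number >> x) & mask;
}
```

For the example in the question, extract_bits(0xA5, 2, 5) gives 0x9 (binary 1001).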
In this example we put 0 or 1 values into array. After that you can treat array as you like.
#include <stdio.h>
#include <stdint.h>
int main(int argc, char **argv) {
uint8_t value = 0xA5;
unsigned char bytes[8];
unsigned char i;
for (i = 0; i < 8; i++) {
bytes[i] = (value & (1 << i)) != 0 ? 1 : 0;
}
for (i = 0; i < 8; i++) {
printf("%d", bytes[i]);
}
return 0;
}
You could use a mask and the "&" (AND) operation:
a = 0xA5;
a = a >> OFFSET; //OFFSET
mask = 0x0F; // equals 00001111
a = a & mask;
In your example a = 0xA5 (10100101), and the offset is 2.
a >> 2: a now equals 0x29 (00101001)
a & 0x0F: 00101001 AND 00001111 = 00001001 = 0x09
If you want bits from the offset X then shift right by X.
If you want Y bits, then the mask (after the shift) will be 2 to the power of Y minus one (for your example with four bits, 2 to the power of 4 is 16, minus one is 15, which is 1111 binary). This can be done by left-shifting 1 by Y bits and subtracting 1.
However, the masking isn't needed if you want to count the number of zeros in the wanted bits, only the right shift. Loop Y times, each time shifting a 1 left one step, and check using bitwise and if the value is zero. If it is then increment a counter. At the end of the loop the counter is the number of zeros.
To put it all in code:
// Count the number of zeros in a specific amount of bits starting at a specific offset
// value is the original value
// offset is the offset in bits
// bits is the number of bits to check
unsigned int count_zeros(unsigned int value, unsigned int offset, unsigned int bits)
{
// Get the bits we're interested in the rightmost position
value >>= offset;
unsigned int counter = 0; // Zero-counter
for (unsigned int i = 0; i < bits; ++i)
{
if ((value & (1 << i)) == 0)
{
++counter; // Bit is a zero
}
}
return counter;
}
To use with the example data you have:
count_zeros(0xa5, 2, 4);
The result should be 2, which is what the live program shows.
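#include <stdint.h>
int32_t do_test(int32_t value, int32_t offset)
{
int32_t _zeros = 0;
value >>= offset;
int i = 1;
// count the zero bits above the start bit until the next set bit
while(i < 32 - offset) {
if((value >> i) % 2 == 0) {
_zeros += 1;
i++;
} else {
break;
}
}
return _zeros;
}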
int result = (0xA5 >> 2) & 0x0F;
Truth table for the & operator
| INPUTS | OUTPUT |
-----------------------
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
-----------------------

how to replace given nibbles with another set of nibbles in an integer

Suppose you have an integer a = 0x12345678 & a short b = 0xabcd
What i wanna do is replace the given nibbles in integer a with nibbles from short b
Eg: Replace 0,2,5,7th nibbles in a = 0x12345678 (where 8 = 0th nibble, 7=1st nibble, 6=2nd nibble and so on...) with nibbles from b = 0xabcd (where d = 0th nibble, c=1st nibble, b=2nd nibble & so on...)
My approach is -
Clear the bits we're going to replace from a.
like a = 0x02045070
Create the mask from the short b like mask = 0xa0b00c0d
bitwise OR them to get the result. result = a| mask i.e result = 0xa2b45c7d hence nibbles replaced.
My problem is I don't know any efficient way to create the desired mask (like in step 2) from the given short b
If you can give me an efficient way of doing so, it would be a great help to me and I thank you for that in advance ;)
Please ask if more info needed.
EDIT:
My code to solve the problem (not good enough though)
Any improvement is highly appreciated.
int index[4] = {0,1,5,7}; // Given nibbles to be replaced in integer
int s = 0x01024300; // integer mask i.e. cleared nibbles
int r = 0x0000abcd; // short (converted to int )
r = ((r & 0x0000000f) << 4*(index[0]-0)) |
((r & 0x000000f0) << 4*(index[1]-1)) |
((r & 0x00000f00) << 4*(index[2]-2)) |
((r & 0x0000f000) << 4*(index[3]-3));
s = s|r;
A nibble has 4 bits, and according to your indexing scheme, the zeroth nibble is represented by the least significant bits at positions 0-3, the first nibble by the bits at positions 4-7, and so on.
Simply shift the values the necessary amount. This will set the nibble at position set by the variable index:
size_t index = 5; //6th nibble is at index 5
size_t shift = 4 * index; //6th nibble is represented by bits 20-23
unsigned long nibble = 0xC;
unsigned long result = 0x12345678;
result = result & ~( 0xFul << shift ); //clear the 6th nibble
result = result | ( nibble << shift ); //set the 6th nibble
If you want to set more than one value, put this code in a loop. The variable index should be changed to an array of values, and variable nibble could also be an array of values, or it could contain more than one nibble, in which case you extract them one by one by shifting values to the right.
A lot depends on how flexible you are in accepting the "nibble list" (index[4] in your case).
You mentioned that you can replace anywhere from 0 to 8 nibbles. If you take your nibble bits as an 8-bit bitmap, rather than as a list, you can use the bitmap as a lookup in a 256-entry table, which maps from bitmap to a (fixed) mask with 1s in the nibble positions. For example, for the nibble list {1, 3}, you'd have the bitmap 0b00001010 which would map to the mask 0x0000F0F0.
Then you can use pdep which has intrinsics on gcc, clang, icc and MSVC on x86 to expand the bits in your short to the right position. E.g., for b == 0xab you'd have _pdep_u32(b, mask) == 0x0000a0b0.
If you aren't on a platform with pdep, you can accomplish the same thing with multiplication.
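Where pdep is not available, a plain loop over the 8-bit nibble bitmap described above gives the same result. This is a simple portable sketch, not the multiplication technique just mentioned, and the name and signature of replace_nibbles are assumptions made for illustration.

```c
#include <stdint.h>

/* For every set bit i in bitmap (lowest first), replace nibble i of a
 * with the next unused nibble of b, starting from b's lowest nibble. */
uint32_t replace_nibbles(uint32_t a, uint16_t b, uint8_t bitmap)
{
    for (unsigned i = 0; i < 8; i++) {
        if (bitmap & (1u << i)) {
            unsigned shift = 4u * i;
            a &= ~((uint32_t)0xF << shift);      /* clear nibble i of a */
            a |= (uint32_t)(b & 0xFu) << shift;  /* insert lowest nibble of b */
            b >>= 4;                             /* move on to b's next nibble */
        }
    }
    return a;
}
```

For the question's example, nibbles 0, 2, 5 and 7 correspond to the bitmap 0xA5, so replace_nibbles(0x12345678, 0xabcd, 0xA5) gives 0xa2b45c7d.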
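To make the nibble assignment easy to change, a union with bit-fields could be used (note that the order of bit-fields within a word is implementation-defined, so this relies on nib0 being the least significant nibble, as on typical little-endian compilers):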
Step 1 - create a union allowing to have nibbles access
typedef union u_nibble {
uint32_t dwValue;
uint16_t wValue;
struct sNibble {
uint32_t nib0: 4;
uint32_t nib1: 4;
uint32_t nib2: 4;
uint32_t nib3: 4;
uint32_t nib4: 4;
uint32_t nib5: 4;
uint32_t nib6: 4;
uint32_t nib7: 4;
} uNibble;
} NIBBLE;
Step 2 - assign two NIBBLE items with your integer a and short b
NIBBLE myNibbles[2];
uint32_t a = 0x12345678;
uint16_t b = 0xabcd;
myNibbles[0].dwValue = a;
myNibbles[1].wValue = b;
Step 3 - initialize nibbles of a by nibbles of b
printf("a = %08x\n",myNibbles[0].dwValue);
myNibbles[0].uNibble.nib0 = myNibbles[1].uNibble.nib0;
myNibbles[0].uNibble.nib2 = myNibbles[1].uNibble.nib1;
myNibbles[0].uNibble.nib5 = myNibbles[1].uNibble.nib2;
myNibbles[0].uNibble.nib7 = myNibbles[1].uNibble.nib3;
printf("a = %08x\n",myNibbles[0].dwValue);
Output will be:
a = 12345678
a = a2b45c7d
If I understand your goal, the fun you are having comes from the reversal of the order of your fill from the upper half to the lower half of your final number. (instead of 0, 2, 4, 6, you want 0, 2, 5, 7) It isn't any more difficult, but it does make you count where the holes are in the final number. If I understood, then you could mask with 0x0f0ff0f0 and then fill in the zeros with shifts of 16, 12, 4 and 0. For example:
#include <stdio.h>
int main (void) {
unsigned a = 0x12345678, c = 0, mask = 0x0f0ff0f0;
unsigned short b = 0xabcd;
/* mask a, fill in the holes with the bits from b */
c = (a & mask) | (((unsigned)b & 0xf000) << 16);
c |= (((unsigned)b & 0x0f00) << 12);
c |= (((unsigned)b & 0x00f0) << 4);
c |= (unsigned)b & 0x000f;
printf (" a : 0x%08x\n b : 0x%0hx\n c : 0x%08x\n", a, b, c);
return 0;
}
Example Use/Output
$ ./bin/bit_swap_nibble
a : 0x12345678
b : 0xabcd
c : 0xa2b45c7d
Let me know if I misunderstood, I'm happy to help further.
With nibble = 4 bits and unsigned int = 32 bits, a nibble inside a unsigned int can be found as follows:
x = 0x00a0b000, find 3rd nibble in x i.e locate 'b'. Note nibble index starts with 0.
Now 3rd nibble is from 12th bit to 15th bit.
The 3rd nibble can be selected with n = 2^16 - 2^12. In n, all the bits of the 3rd nibble are 1 and all the bits of the other nibbles are 0. That is, n = 0x0000f000.
In general, suppose if you want to find a continuous sequence of 1 in binary representation in which sequence starts from Xth bit to Yth bit then formula is 2^(Y+1) - 2^X.
#include <stdio.h>
#include <string.h>
#define BUF_SIZE 33
char *int2bin(int a, char *buffer, int buf_size)
{
int i;
buffer[BUF_SIZE - 1] = '\0';
buffer += (buf_size - 1);
for(i = 31; i >= 0; i--)
{
*buffer-- = (a & 1) + '0';
a >>= 1;
}
return buffer;
}
int main()
{
unsigned int a = 0;
unsigned int b = 65535;
unsigned int b_nibble;
unsigned int b_at_a;
unsigned int a_nibble_clear;
signed char replace_with[8]; // signed, so the -1 sentinel compares correctly
unsigned int ai;
char buffer[BUF_SIZE];
memset(replace_with, -1, sizeof(replace_with));
replace_with[0] = 0; //replace 0th nibble of a with 0th nibble of b
replace_with[2] = 1; //replace 2nd nibble of a with 1st nibble of b
replace_with[5] = 2; //replace 5th nibble of a with 2nd nibble of b
replace_with[7] = 3; //replace 7th nibble of a with 3rd nibble of b
int2bin(a, buffer, BUF_SIZE - 1);
printf("a = %s, %08x\n", buffer, a);
int2bin(b, buffer, BUF_SIZE - 1);
printf("b = %s, %08x\n", buffer, b);
for(ai = 0; ai < 8; ++ai)
{
if(replace_with[ai] != -1)
{
b_nibble = (b & (1LL << ((replace_with[ai] + 1)*4)) - (1LL << (replace_with[ai]*4))) >> (replace_with[ai]*4);
b_at_a = b_nibble << (ai * 4);
a_nibble_clear = (a & ~(a & (1LL << ((ai + 1) * 4)) - (1LL << (ai * 4))));
a = a_nibble_clear | b_at_a;
}
}
int2bin(a, buffer, BUF_SIZE - 1);
printf("a = %s, %08x\n", buffer, a);
return 0;
}
Output:
a = 00000000000000000000000000000000, 00000000
b = 00000000000000001111111111111111, 0000ffff
a = 11110000111100000000111100001111, f0f00f0f

how to make a bit-set/byte-array conversion in c

Given an array,
unsigned char q[32]="1100111...",
how can I generate a 4-bytes bit-set, unsigned char p[4], such that, the bit of this bit-set, equals to value inside the array, e.g., the first byte p[0]= "q[0] ... q[7]"; 2nd byte p[1]="q[8] ... q[15]", etc.
and also how to do it in opposite, i.e., given bit-set, generate the array?
my own trial out for the first part.
unsigned char p[4]={0};
for (int j=0; j<N; j++)
{
if (q[j] == '1')
{
p [j / 8] |= 1 << (7-(j % 8));
}
}
Is the above right? any conditions to check? Is there any better way?
EDIT - 1
I wonder if above is efficient way? As the array size could be upto 4096 or even more.
First, use strtoul to get a 32-bit value. Then convert the byte order to big-endian with htonl. Finally, store the result in your array:
#include <arpa/inet.h>
#include <stdlib.h>
/* ... */
unsigned char q[32] = "1100111...";
unsigned char result[4] = {0};
*(unsigned long*)result = htonl(strtoul(q, NULL, 2));
There are other ways as well.
But I lack <arpa/inet.h>!
Then you need to know what byte order your platform is. If it's big endian, then htonl does nothing and can be omitted. If it's little-endian, then htonl is just:
unsigned long htonl(unsigned long x)
{
x = ((x & 0xFF00FF00) >> 8) | ((x & 0x00FF00FF) << 8);
x = ((x & 0xFFFF0000) >> 16) | ((x & 0x0000FFFF) << 16);
return x;
}
If you're lucky, your optimizer might see what you're doing and make it into efficient code. If not, well, at least it's all implementable in registers and O(log N).
If you don't know what byte order your platform is, then you need to detect it:
typedef union {
char c[sizeof(int) / sizeof(char)];
int i;
} OrderTest;
unsigned long htonl(unsigned long x)
{
OrderTest test;
test.i = 1;
if(!test.c[0])
return x;
x = ((x & 0xFF00FF00) >> 8) | ((x & 0x00FF00FF) << 8);
x = ((x & 0xFFFF0000) >> 16) | ((x & 0x0000FFFF) << 16);
return x;
}
Maybe long is 8 bytes!
Well, the OP implied 4-byte inputs with their array size, but 8-byte long is doable:
#define kCharsPerLong (sizeof(long) / sizeof(char))
unsigned char q[8 * kCharsPerLong] = "1100111...";
unsigned char result[kCharsPerLong] = {0};
*(unsigned long*)result = htonl(strtoul(q, NULL, 2));
#include <limits.h> /* ULONG_MAX; sizeof cannot be evaluated in #if */
unsigned long htonl(unsigned long x)
{
#if ULONG_MAX == 0xFFFFFFFFUL
x = ((x & 0xFF00FF00UL) >> 8) | ((x & 0x00FF00FFUL) << 8);
x = ((x & 0xFFFF0000UL) >> 16) | ((x & 0x0000FFFFUL) << 16);
#elif ULONG_MAX == 0xFFFFFFFFFFFFFFFFUL
x = ((x & 0xFF00FF00FF00FF00UL) >> 8) | ((x & 0x00FF00FF00FF00FFUL) << 8);
x = ((x & 0xFFFF0000FFFF0000UL) >> 16) | ((x & 0x0000FFFF0000FFFFUL) << 16);
x = ((x & 0xFFFFFFFF00000000UL) >> 32) | ((x & 0x00000000FFFFFFFFUL) << 32);
#else
#error Unsupported word size.
#endif
return x;
}
For char that isn't 8 bits (DSPs like to do this), you're on your own. (This is why it was a Big Deal when the SHARC series of DSPs had 8-bit bytes; it made it a LOT easier to port existing code because, face it, C does a horrible job of portability support.)
What about arbitrary length buffers? No funny pointer typecasts, please.
The main thing that can be improved with the OP's version is to rethink the loop's internals. Instead of thinking of the output bytes as a fixed data register, think of it as a shift register, where each successive bit is shifted into the right (LSB) end. This will save you from all those divisions and mods (which, hopefully, are optimized away to bit shifts).
For sanity, I'm ditching unsigned char for uint8_t.
#include <stddef.h>
#include <stdint.h>
unsigned StringToBits(const char* inChars, uint8_t* outBytes, size_t numBytes,
size_t* bytesRead)
/* Converts the string of '1' and '0' characters in `inChars` to a buffer of
* bytes in `outBytes`. `numBytes` is the number of available bytes in the
* `outBytes` buffer. On exit, if `bytesRead` is not NULL, the value it points
* to is set to the number of bytes read (rounding up to the nearest full
* byte). If a multiple of 8 bits is not read, the last byte written will be
* padded with 0 bits to reach a multiple of 8 bits. This function returns the
* number of padding bits that were added. For example, an input of 11 bits
* will result in `bytesRead` being set to 2 and the function will return 5. This
* means that if a nonzero value is returned, then a partial byte was read,
* which may be an error.
*/
{ size_t bytes = 0;
unsigned bits = 0;
uint8_t x = 0;
while(bytes < numBytes)
{ /* Parse a character. */
switch(*inChars++)
{ case '0': x <<= 1; ++bits; break;
case '1': x = (x << 1) | 1; ++bits; break;
default: numBytes = 0;
}
/* See if we filled a byte. */
if(bits == 8)
{ outBytes[bytes++] = x;
x = 0;
bits = 0;
}
}
/* Padding, if needed. */
if(bits)
{ bits = 8 - bits;
outBytes[bytes++] = x << bits;
}
/* Finish up. */
if(bytesRead)
*bytesRead = bytes;
return bits;
}
It's your responsibility to make sure inChars is null-terminated. The function will return on the first non-'0' or '1' character it sees or if it runs out of output buffer. Some example usage:
unsigned char q[32] = "1100111...";
uint8_t buf[4];
size_t bytesRead = 5;
if(StringToBits(q, buf, 4, &bytesRead) || bytesRead != 4)
{
/* Partial read; handle error here. */
}
This just reads 4 bytes, and traps the error if it can't.
unsigned char q[4096] = "1100111...";
uint8_t buf[512];
StringToBits(q, buf, 512, NULL);
This just converts what it can and sets the rest to 0 bits.
This function could be done better if C had the ability to break out of more than one level of loop or switch; as it stands, I'd have to add a flag value to get the same effect, which is clutter, or I'd have to add a goto, which I simply refuse.
I don't think that will quite work. You are comparing each "bit" to 1 when it should really be '1'. You can also make it a bit more efficient by getting rid of the if:
unsigned char p[4]={0};
for (int j=0; j<32; j++)
{
p [j / 8] |= (q[j] == '1') << (7-(j % 8));
}
Going in reverse is pretty simple too. Just mask for each "bit" that you set earlier.
unsigned char q[32]={0};
for (int j=0; j<32; j++) {
q[j] = ((p[j / 8] >> (7-(j % 8))) & 1) + '0';
}
You'll notice the creative use of (boolean) + '0' to convert between 1/0 and '1'/'0'.
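As a sketch, both directions can be wrapped in a pair of functions that work for any length, not just 32 characters. The names pack_bits and unpack_bits are illustrative; the bit order (first character goes to the most significant bit of the first byte) follows the loops above.

```c
#include <stddef.h>

/* Pack n characters ('0' or '1') from q into p, most significant bit first.
 * p must hold (n + 7) / 8 bytes; any character other than '1' counts as 0. */
void pack_bits(const char *q, size_t n, unsigned char *p)
{
    for (size_t i = 0; i < (n + 7) / 8; i++)
        p[i] = 0;
    for (size_t j = 0; j < n; j++)
        p[j / 8] |= (unsigned char)((q[j] == '1') << (7 - j % 8));
}

/* Expand n bits from p back into '0'/'1' characters in q.
 * No terminating null character is written. */
void unpack_bits(const unsigned char *p, size_t n, char *q)
{
    for (size_t j = 0; j < n; j++)
        q[j] = (char)('0' + ((p[j / 8] >> (7 - j % 8)) & 1));
}
```

Shifting the selected bit down to position 0 before adding '0' guarantees the result is exactly '0' or '1', whichever bit position was tested.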
Judging by your example, you are not going for readability, and after a (late) refresh my solution looks very similar to Chriszuma's, except for the lack of parentheses (relying on operator precedence) and the addition of the !! to enforce a 0 or 1.
enum { N = 32 }; //N must be a multiple of 8 (a constant expression, so the arrays below can be initialized)
unsigned char q[N+1] = "11011101001001101001111110000111";
unsigned char p[N/8] = {0};
unsigned char r[N+1] = {0}; //reversed
for(size_t i = 0; i < N; ++i)
p[i / 8] |= (q[i] == '1') << 7 - i % 8;
for(size_t i = 0; i < N; ++i)
r[i] = '0' + !!(p[i / 8] & 1 << 7 - i % 8);
printf("%x %x %x %x\n", p[0], p[1], p[2], p[3]);
printf("%s\n%s\n", q,r);
If you are looking for extreme efficiency, try to use the following techniques:
Replace if by subtraction of '0' (seems like you can assume your input symbols can be only 0 or 1).
Also process the input from lower indices to higher ones.
for (int c = 0; c < N; c += 8)
{
int y = 0;
for (int b = 0; b < 8; ++b)
y = y * 2 + q[c + b] - '0';
p[c / 8] = y;
}
Replace array indices by auto-incrementing pointers:
const char* qptr = q;
unsigned char* pptr = p;
for (int c = 0; c < N; c += 8)
{
int y = 0;
for (int b = 0; b < 8; ++b)
y = y * 2 + *qptr++ - '0';
*pptr++ = y;
}
Unroll the inner loop:
const char* qptr = q;
unsigned char* pptr = p;
for (int c = 0; c < N; c += 8)
{
*pptr++ =
qptr[0] - '0' << 7 |
qptr[1] - '0' << 6 |
qptr[2] - '0' << 5 |
qptr[3] - '0' << 4 |
qptr[4] - '0' << 3 |
qptr[5] - '0' << 2 |
qptr[6] - '0' << 1 |
qptr[7] - '0' << 0;
qptr += 8;
}
Process several input characters simultaneously (using bit twiddling hacks or MMX instructions) - this has great speedup potential!
