Can someone explain how this bitMask code works?

This is code that my partner came up with, but for some reason I can't get hold of him to ask how it's supposed to work. I've been through it many times now and can't seem to get the answer I'm supposed to get.
/**
* bitMask - Generate a mask consisting of all 1's between
* lowbit and highbit
* Examples: bitMask(5,3) = 0x38
* Assume 0 <= lowbit <= 31, and 0 <= highbit <= 31
* If lowbit > highbit, then mask should be all 0's
* Legal ops: ! ~ & ^ | + << >>
*/
int bitMask(int highbit, int lowbit) {
int i = ~0;
return ~(i << highbit << 1) & (i << lowbit);
}

This function is actually incorrect: for large values of highbit and lowbit, it may have implementation-defined or even undefined behavior, because it shifts signed values. It should use and return unsigned types:
unsigned bitMask(int highbit, int lowbit) {
unsigned i = ~0U;
return ~(i << highbit << 1) & (i << lowbit);
}
Here are the steps:
i = ~0U; sets i to all bits 1.
i << highbit shifts these bits to the left, inserting highbit 0 bits in the low order bits.
i << highbit << 1 makes room for one more 0 bit. One should not simplify this expression to i << (highbit + 1), because such a shift has undefined behavior if highbit + 1 is greater than or equal to the number of bits in the type of i (for example, highbit = 31 with a 32-bit unsigned).
~(i << highbit << 1) complements this mask, creating a mask with highbit + 1 bits set in the low order positions and 0 for the higher bits.
i << lowbit creates a mask with lowbit 0 bits and 1 in the higher positions.
~(i << highbit << 1) & (i << lowbit) computes the intersection of these 2 masks. The result has 1 bits from bit number lowbit to bit number highbit inclusive, numbering the bits from 0 for the least significant.
Examples:
bitMask(31, 0) -> 0xFFFFFFFF.
bitMask(0, 0) -> 0x00000001.
bitMask(31, 16) -> 0xFFFF0000.
bitMask(15, 0) -> 0x0000FFFF.
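The examples above can be checked mechanically. Here is a minimal sketch, assuming a 32-bit word; the name bit_mask_u32 and the intermediate variable names are hypothetical, and the body is the same expression as the unsigned bitMask shown earlier, just split into its two masks:

```c
#include <assert.h>
#include <stdint.h>

/* Ones from bit lowbit up to bit highbit, inclusive (bit 0 = least significant).
 * Assumes 0 <= highbit <= 31 and 0 <= lowbit <= 31, as in the original comment. */
uint32_t bit_mask_u32(int highbit, int lowbit) {
    assert(highbit >= 0 && highbit <= 31 && lowbit >= 0 && lowbit <= 31);
    uint32_t all = ~(uint32_t)0;                  /* all bits set */
    uint32_t upto_high = ~(all << highbit << 1);  /* ones in bits 0..highbit */
    uint32_t from_low = all << lowbit;            /* ones in bits lowbit..31 */
    return upto_high & from_low;                  /* intersection of the two */
}
```

Calling bit_mask_u32(5, 3) gives 0x38, and a call with lowbit > highbit, such as bit_mask_u32(4, 8), gives 0 because the two masks no longer overlap.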
This numbering method is used in hardware specifications. I personally prefer a different method where one specifies the number of bits to skip and the number of bits to set, more consistent with bit-field specifications:
unsigned bitSpec(int start, int len) {
return (~0U >> (32 - len)) << start;
}
and the same examples:
bitSpec(0, 32) -> 0xFFFFFFFF.
bitSpec(0, 1) -> 0x00000001.
bitSpec(16, 16) -> 0xFFFF0000.
bitSpec(0, 16) -> 0x0000FFFF.
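One caveat with bitSpec as written: a call with len equal to 0 shifts by 32, which is undefined for a 32-bit unsigned. A hedged sketch that guards that case, assuming a 32-bit word (bit_spec_u32 is a hypothetical name, not part of the answer above):

```c
#include <assert.h>
#include <stdint.h>

/* len one-bits starting at bit position start (bit 0 = least significant).
 * Assumes start >= 0, len >= 0 and start + len <= 32. */
uint32_t bit_spec_u32(int start, int len) {
    assert(start >= 0 && len >= 0 && start + len <= 32);
    if (len == 0)
        return 0;  /* avoids the undefined shift by 32 */
    return (~(uint32_t)0 >> (32 - len)) << start;
}
```

With this guard, bit_spec_u32(3, 3) gives the same 0x38 as bitMask(5, 3), and bit_spec_u32(5, 0) gives 0.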

In your case, given the description included with your function, the function is doing exactly what you seem to intend it to do. The primary problem is that you are using int instead of unsigned int. That will cause problems with sign extension (not to mention that left-shifting a negative signed value is undefined in C, and right-shifting one is implementation-defined).
A simple conversion to unsigned will show you it is operating as you expect:
Short example:
#include <stdio.h>
#include <stdlib.h>
unsigned int bitMask (unsigned int highbit, unsigned int lowbit) {
unsigned int i = ~0;
return ~(i << highbit << 1) & (i << lowbit);
}
char *binstr (unsigned long n, unsigned char sz, unsigned char szs, char sep) {
static char s[128 + 1] = {0};
char *p = s + 128;
unsigned char i;
for (i = 0; i < sz; i++) {
p--;
if (i > 0 && szs > 0 && i % szs == 0)
*p-- = sep;
*p = (n >> i & 1) ? '1' : '0';
}
return p;
}
int main (int argc, char **argv) {
unsigned high = argc > 1 ? (unsigned)strtoul (argv[1], NULL, 10) : 5;
unsigned low = argc > 2 ? (unsigned)strtoul (argv[2], NULL, 10) : 3;
printf ("%s\n", binstr (bitMask (high, low), 32, 8, '-'));
return 0;
}
Output
$ ./bin/bitmask
00000000-00000000-00000000-00111000
$ ./bin/bitmask 10 3
00000000-00000000-00000111-11111000
$ ./bin/bitmask 31 5
11111111-11111111-11111111-11100000
$ ./bin/bitmask 4 8
00000000-00000000-00000000-00000000

Related

Bitwise Operation on a byte and an int

I have a byte array represented as
char * bytes = getbytes(object); //some api function
I want to check whether the bit at some position x is set.
I've been trying this
int mask = 1 << x % 8;
y= bytes[x>>3] & mask;
However y returns as all zeros? What am I doing incorrectly and is there an easier way to check if a bit is set?
EDIT:
I did run this as well. It didn't return with the expected result either.
int k = x >> 3;
int mask = x % 8;
unsigned char byte = bytes[k];
return (byte & mask);
It failed an assert-true ctest I ran. byte and mask at this time were "0002" and 2, respectively, when printed from gdb.
edit 2: This is how I set the bits in the first place. I'm just trying to write a test to verify they are set.
unsigned long x = somehash(void* a);
unsigned int mask = 1 << (x % 8);
unsigned int location = x >> 3;
char* filter = getData(ref);
filter[location] |= mask;

This would be one (perhaps crude) way, off the top of my head:
#include <stdio.h>
#include <stdlib.h>
// this function *changes* the byte array
int getBit(char *b, int bit)
{
int bitToCheck = bit % 8;
b = b + (bitToCheck ? (bit / 8) : (bit / 8 - 1));
if (bitToCheck)
*b = (*b) >> (8 - bitToCheck);
return (*b) & 1;
}
int main(void)
{
char *bytes = calloc(2, 1);
*(bytes + 1) = 5; // writing to the appropriate bits
printf("%d\n", getBit(bytes, 16)); // checking the 16th bit from the left
return 0;
}
Assumptions:
A byte is represented as:
----------------------------------------
| 2^7 | 2^6 | 2^5 | 2^4 | 2^3 |... |
----------------------------------------
The left most bit is considered bit number 1 and the right most bit is considered the max. numbered bit (16th bit in a 2 byte object).
It's OK to overwrite the actual byte object (if this is not wanted, use memcpy).
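If overwriting the array is not acceptable, a test that reads the bit with the same layout used to set it is another option. This is only a sketch: set_bit and test_bit are hypothetical names, and the layout (byte index x >> 3, bit x % 8 counted from the least significant bit) is taken from the question's "edit 2", not from the left-to-right numbering assumed above. It uses unsigned char so that a byte with its top bit set is not sign-extended.

```c
#include <assert.h>
#include <stddef.h>

/* Set bit x: byte x >> 3, bit (x % 8) counted from the least significant bit. */
void set_bit(unsigned char *bytes, size_t x) {
    assert(bytes != NULL);
    bytes[x >> 3] |= (unsigned char)(1u << (x & 7));
}

/* Return 1 if bit x is set, 0 otherwise. The array is not modified. */
int test_bit(const unsigned char *bytes, size_t x) {
    assert(bytes != NULL);
    return (bytes[x >> 3] >> (x & 7)) & 1;
}
```

Shifting the byte down and masking with 1 yields exactly 0 or 1, which is easier to assert on than the raw result of byte & mask.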

Breaking apart bit patterns, shifting and creating new patterns

As part of a larger problem, I have to take some binary value: 00000000 11011110 (8)
Then, I have to:
Derive the bit count in this function - so I've done that by finding the place of the most sig fig.
Then store the first 6 numbers of this value into the value 128, such that it equals: 10011110
Then store the last 5 numbers of this value into the value 192, such that it equals: 11000011 10011110
The two bytes should be stored in some array, buffer[]
I have written this function however, position does not appear to initialise properly in gdb and the values are not outputting correctly. This is my attempt:
void create_value(unsigned short init_val, unsigned char buffer[])
{
// get the count
int position = 0;
while (init_val >>= 1)
position++;
// get total
int count = position++;
int start = 128;
for (int i = 0; i < 7; i++)
if (((1 << i) & init_val) != 0) start = start | 1 << i;
buffer[0] = start;
start = 192;
for (int i = 7; i < 11; i++) {
if (((1 << i) & init_val) !=0) start = start | 1 << i;
}
buf[1] = start;
}

After
while (init_val >>= 1)
position++;
init_val will be 0. When you later use
if (((1 << i) & init_val) != 0) start = start | 1 << i;
you will never change start.
So, after reading through what you're trying to do (which is pretty confusingly described), why don't you:
void create_value(unsigned short init_value, unsigned char buffer[])
{
buffer[0] = (init_value & 63) | 128;
buffer[1] = ((init_value >> 6) & 31) | 192;
return;
}
What this does: init_value & 63 masks off all but the lowest 6 bits in init_value, as you wanted. The | 128 then sets the most significant bit of the byte (if and only if CHAR_BIT == 8, mind you).
(init_value >> 6) shifts init_value down by 6 bits, so now the original bits 6-10 are bits 0-4. & 31 masks off all but the lowest 5 bits in this value, and | 192 sets the two most significant bits.
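To check this against the value from the question, 00000000 11011110 (0xDE): the low six bits are 011110, giving 10011110 (0x9E), and the next five bits are 00011, giving 11000011 (0xC3). A small sketch of the same function with that precondition made explicit (the assert and the casts are my additions):

```c
#include <assert.h>
#include <stddef.h>

/* Split init_value into a 6-bit and a 5-bit field with marker bits,
 * assuming 8-bit bytes: buffer[0] = 10xxxxxx, buffer[1] = 110xxxxx. */
void create_value(unsigned short init_value, unsigned char buffer[])
{
    assert(buffer != NULL);
    buffer[0] = (unsigned char)((init_value & 63) | 128);         /* low 6 bits  */
    buffer[1] = (unsigned char)(((init_value >> 6) & 31) | 192);  /* next 5 bits */
}
```

For 0x00DE this stores 0x9E in buffer[0] and 0xC3 in buffer[1], which matches the two bytes the question asks for.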

Bitwise operations equivalent of greater than operator

I am working on a function that will essentially see which of two ints is larger. The parameters that are passed are 2 32-bit ints. The trick is the only operators allowed are ! ~ | & << >> ^ (no casting, other data types besides signed int, *, /, -, etc..).
My idea so far is to ^ the two binaries together to see all the positions of the 1 values that they don't share. What I want to do is then take that value and isolate the 1 farthest to the left. Then see of which of them has that value in it. That value then will be the larger.
(Say we use 8-bit ints instead of 32-bit).
If the two values passed were 01011011 and 01101001
I used ^ on them to get 00110010.
I then want to make it 00100000 in other words 01xxxxxx -> 01000000
Then & it with the first number
!! the result and return it.
If it is 1, then the first # is larger.
Any thoughts on how to 01xxxxxx -> 01000000 or anything else to help?
Forgot to note: no ifs, whiles, fors etc...

Here's a loop-free version which compares unsigned integers in O(lg b) operations, where b is the word size of the machine. Note the OP states no other data types than signed int, so it seems likely the top part of this answer does not meet the OP's specifications. (The spoiler version is at the bottom.)
Note that the behavior we want to capture is when the most significant bit mismatch is 1 for a and 0 for b. Another way of thinking about this is any bit in a being larger than the corresponding bit in b means a is greater than b, so long as there wasn't an earlier bit in a that was less than the corresponding bit in b.
To that end, we compute all the bits in a greater than the corresponding bits in b, and likewise compute all the bits in a less than the corresponding bits in b. We now want to mask out all the 'greater than' bits that are below any 'less than' bits, so we take all the 'less than' bits and smear them all to the right making a mask: the most significant bit set all the way down to the least significant bit are now 1.
Now all we have to do is remove the 'greater than' bits set by using simple bit masking logic.
The resulting value is 0 if a <= b and nonzero if a > b. If we want it to be 1 in the latter case we can do a similar smearing trick and just take a look at the least significant bit.
#include <stdio.h>
// Works for unsigned ints.
// Scroll down to the "actual algorithm" to see the interesting code.
// Utility function for displaying binary representation of an unsigned integer
void printBin(unsigned int x) {
for (int i = 31; i >= 0; i--) printf("%i", (x >> i) & 1);
printf("\n");
}
// Utility function to print out a separator
void printSep() {
for (int i = 31; i>= 0; i--) printf("-");
printf("\n");
}
int main()
{
while (1)
{
unsigned int a, b;
printf("Enter two unsigned integers separated by spaces: ");
if (scanf("%u %u", &a, &b) != 2) break; // stop on EOF or malformed input
getchar();
printBin(a);
printBin(b);
printSep();
/************ The actual algorithm starts here ************/
// These are all the bits in a that are less than their corresponding bits in b.
unsigned int ltb = ~a & b;
// These are all the bits in a that are greater than their corresponding bits in b.
unsigned int gtb = a & ~b;
ltb |= ltb >> 1;
ltb |= ltb >> 2;
ltb |= ltb >> 4;
ltb |= ltb >> 8;
ltb |= ltb >> 16;
// Nonzero if a > b
// Zero if a <= b
unsigned int isGt = gtb & ~ltb;
// If you want to make this exactly '1' when nonzero do this part:
isGt |= isGt >> 1;
isGt |= isGt >> 2;
isGt |= isGt >> 4;
isGt |= isGt >> 8;
isGt |= isGt >> 16;
isGt &= 1;
/************ The actual algorithm ends here ************/
// Print out the results.
printBin(ltb); // Debug info
printBin(gtb); // Debug info
printSep();
printBin(isGt); // The actual result
}
}
Note: This should work for signed integers as well if you flip the top bit on both of the inputs, e.g. a ^= 0x80000000.
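As a rough illustration of that note, here is the same algorithm pulled out into functions, with the sign-bit flip applied for the signed case. This is a sketch assuming 32-bit two's complement values; is_gt_u32, is_gt_i32 and is_gt_demo are hypothetical names, and the final != 0 is a convenience that the puzzle's operator list would not allow.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a > b as unsigned 32-bit values, else 0. */
int is_gt_u32(uint32_t a, uint32_t b) {
    uint32_t ltb = ~a & b;   /* bits where a has 0 and b has 1 */
    uint32_t gtb = a & ~b;   /* bits where a has 1 and b has 0 */
    /* Smear the highest "less than" bit down to bit 0. */
    ltb |= ltb >> 1;
    ltb |= ltb >> 2;
    ltb |= ltb >> 4;
    ltb |= ltb >> 8;
    ltb |= ltb >> 16;
    /* Keep only "greater than" bits above every "less than" bit. */
    return (gtb & ~ltb) != 0;
}

/* Signed comparison: flipping the top bit maps the signed order onto the unsigned order. */
int is_gt_i32(int32_t a, int32_t b) {
    return is_gt_u32((uint32_t)a ^ 0x80000000u, (uint32_t)b ^ 0x80000000u);
}

/* Usage example. */
void is_gt_demo(void) {
    assert(is_gt_i32(0, -33) == 1);
    assert(is_gt_i32(-33, -5) == 0);
}
```

The flip works because it turns the most negative value into 0 and the most positive value into all ones, so the unsigned comparison sees the values in their signed order.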
Spoiler
If you want an answer that meets all of the requirements (including 25 operators or less):
int isGt(int a, int b)
{
int diff = a ^ b;
diff |= diff >> 1;
diff |= diff >> 2;
diff |= diff >> 4;
diff |= diff >> 8;
diff |= diff >> 16;
diff &= ~(diff >> 1) | 0x80000000;
diff &= (a ^ 0x80000000) & (b ^ 0x7fffffff);
return !!diff;
}
I'll leave explaining why it works up to you.

To convert 001xxxxx to 00100000, you first execute:
x |= x >> 4;
x |= x >> 2;
x |= x >> 1;
(this is for 8 bits; to extend it to 32, add shifts by 8 and 16 at the start of the sequence).
This leaves us with 00111111 (this technique is sometimes called "bit-smearing"). We can then chop off all but the first 1 bit:
x ^= x >> 1;
leaving us with 00100000.
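Extended to 32 bits, the two steps can be sketched as one function. This assumes an unsigned type so that every right shift brings in zeros (with a negative signed int, an arithmetic shift would fill with ones instead); highest_set_bit and its demo are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Returns a value with only the most significant set bit of x kept (0 if x is 0). */
uint32_t highest_set_bit(uint32_t x) {
    /* Bit-smearing: copy the top set bit into every position below it. */
    x |= x >> 16;
    x |= x >> 8;
    x |= x >> 4;
    x |= x >> 2;
    x |= x >> 1;
    /* Chop off everything but the first 1 bit. */
    return x ^ (x >> 1);
}

/* Usage example: 00110010 -> 00100000. */
void highest_set_bit_demo(void) {
    assert(highest_set_bit(0x32u) == 0x20u);
}
```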

An unsigned variant given that one can use logical (&&, ||) and comparison (!=, ==).
int u_isgt(unsigned int a, unsigned int b)
{
return a != b && ( /* If a == b then a !> b and a !< b. */
b == 0 || /* Else if b == 0 a has to be > b (as a != 0). */
(a / b) /* Else divide; integer division always truncates */
); /* towards zero, giving 0 if a < b. */
}
!= and == can easily be eliminated, e.g.:
int u_isgt(unsigned int a, unsigned int b)
{
return a ^ b && (
!(b ^ 0) ||
(a / b)
);
}
For signed one could then expand to something like:
int isgt(int a, int b)
{
return
(a != b) &&
(
(!(0x80000000 & a) && 0x80000000 & b) || /* if a >= 0 && b < 0 */
(!(0x80000000 & a) && b == 0) ||
/* Two more lines, can add them if you like, but as it is homework
* I'll leave it up to you to decide.
* Hint: check on "both negative" and "both not negative". */
)
;
}
This can be made more compact and some ops (at least one) can be eliminated, but I put it like this for clarity.
Instead of 0x80000000 one could say, e.g.:
#include <limits.h>
static const int INT_NEG = (1 << ((sizeof(int) * CHAR_BIT) - 1));
Using this to test:
void test_isgt(int a, int b)
{
fprintf(stdout,
"%11d > %11d = %d : %d %s\n",
a, b,
isgt(a, b), (a > b),
isgt(a, b) != (a>b) ? "BAD!" : "OK!");
}
Result:
33 > 0 = 1 : 1 OK!
-33 > 0 = 0 : 0 OK!
0 > 33 = 0 : 0 OK!
0 > -33 = 1 : 1 OK!
0 > 0 = 0 : 0 OK!
33 > 33 = 0 : 0 OK!
-33 > -33 = 0 : 0 OK!
-5 > -33 = 1 : 1 OK!
-33 > -5 = 0 : 0 OK!
-2147483647 > 2147483647 = 0 : 0 OK!
2147483647 > -2147483647 = 1 : 1 OK!
2147483647 > 2147483647 = 0 : 0 OK!
2147483647 > 0 = 1 : 1 OK!
0 > 2147483647 = 0 : 0 OK!

A fully branchless version of Kaganar's smaller isGt function might look like so:
int isGt(int a, int b)
{
int diff = a ^ b;
diff |= diff >> 1;
diff |= diff >> 2;
diff |= diff >> 4;
diff |= diff >> 8;
diff |= diff >> 16;
//1+ on GT, 0 otherwise.
diff &= ~(diff >> 1) | 0x80000000;
diff &= (a ^ 0x80000000) & (b ^ 0x7fffffff);
//flatten back to range of 0 or 1.
diff |= diff >> 1;
diff |= diff >> 2;
diff |= diff >> 4;
diff |= diff >> 8;
diff |= diff >> 16;
diff &= 1;
return diff;
}
This clocks in at around 60 instructions for the actual computation (MSVC 2010 compiler, on an x86 arch), plus an extra 10 stack ops or so for the function's prolog/epilog.

EDIT:
Okay, there were some issues with the code, but I revised it and the following works.
This auxiliary function compares bit n of the two numbers (counting from 0 at the least significant bit):
int compare ( int a, int b, int n )
{
unsigned digit = 1u << n;
if ( (a & digit) && !(b & digit) )
return 1; //a has the bit and b does not: a is greater
if ( !(a & digit) && (b & digit) )
return -1; //b has the bit and a does not: b is greater
return 0; //the bit is the same in both
}
The following loops from the most significant bit down and returns the larger number (the bit patterns are compared as unsigned values):
int larger ( int a, int b )
{
for ( int i = 8*sizeof(a) - 1 ; i >= 0 ; i-- )
{
int k = compare ( a, b, i );
if ( k )
{
return (k == 1) ? a : b;
}
}
return 0; //equal
}

As much as I don't want to do someone else's homework, I couldn't resist this one. :) I am sure others can think of a more compact one, but here is mine. It works well, including negative numbers.
Edit: there are a couple of bugs, though. I will leave it to the OP to find and fix them.
#include<unistd.h>
#include<stdio.h>
int a, b, i, ma, mb, a_neg, b_neg, stop;
int flipnum(int *num, int *is_neg) {
*num = ~(*num) + 1;
*is_neg = 1;
return 0;
}
int print_num1() {
return ((a_neg && printf("bigger number %d\n", mb)) ||
printf("bigger number %d\n", ma));
}
int print_num2() {
return ((b_neg && printf("bigger number %d\n", ma)) ||
printf("bigger number %d\n", mb));
}
int check_num1(int j) {
return ((a & j) && print_num1());
}
int check_num2(int j) {
return ((b & j) && print_num2());
}
int recursive_check (int j) {
((a & j) ^ (b & j)) && (check_num1(j) || check_num2(j)) && (stop = 1, j = 0);
return(!stop && (j = j >> 1) && recursive_check(j));
}
int main() {
int j;
scanf("%d%d", &a, &b);
ma = a; mb = b;
i = (sizeof (int) * 8) - 1;
j = 1 << i;
((a & j) && flipnum(&a, &a_neg));
((b & j) && flipnum(&b, &b_neg));
j = 1 << (i - 1);
recursive_check(j);
(!stop && printf("numbers are same..\n"));
}

I think I have a solution with 3 operations:
Add one to the first number, then subtract it from the largest possible number you can represent (all 1's). Add that number to the second number. If it overflows, then the first number is less than the second.
I'm not 100% sure this is correct. That is, you might not need to add 1, and I don't know if it's possible to check for overflow (if not, then just reserve the last bit and test whether it's 1 at the end).
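Working the arithmetic through for unsigned values suggests the extra 1 is indeed not needed: with M being all 1's, (M - a) + b exceeds M exactly when b > a, and M - a is simply ~a. The sketch below observes the carry by doing the addition in 64 bits, which the puzzle itself would not allow; less_by_carry and its demo are hypothetical names, and this only covers unsigned 32-bit inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a < b (unsigned), detected as the carry out of ~a + b. */
int less_by_carry(uint32_t a, uint32_t b) {
    uint64_t sum = (uint64_t)(uint32_t)~a + b;  /* 33-bit result held in 64 bits */
    return (int)(sum >> 32);                    /* bit 32 is the carry */
}

/* Usage example. */
void less_by_carry_demo(void) {
    assert(less_by_carry(3u, 5u) == 1);
    assert(less_by_carry(5u, 3u) == 0);
}
```

Reserving the top bit, as suggested above, amounts to the same thing with 31-bit inputs: the carry then lands in bit 31 of an ordinary 32-bit sum.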

EDIT: The constraints make the simple approach at the bottom invalid. I am adding the binary search function and the final comparison to detect the greater value:
unsigned long greater(unsigned long a, unsigned long b) {
unsigned long x = a;
unsigned long y = b;
unsigned long t = a ^ b;
if (t & 0xFFFF0000) {
x >>= 16;
y >>= 16;
t >>= 16;
}
if (t & 0xFF00) {
x >>= 8;
y >>= 8;
t >>= 8;
}
if (t & 0xf0) {
x >>= 4;
y >>= 4;
t >>= 4;
}
if ( t & 0xc) {
x >>= 2;
y >>= 2;
t >>= 2;
}
if ( t & 0x2) {
x >>= 1;
y >>= 1;
t >>= 1;
}
return (x & 1) ? a : b;
}
The idea is to start off with the most significant half of the word we are interested in and see if there are any set bits in there. If there are, then we don't need the least significant half, so we shift the unwanted bits away. If not, we do nothing (the half is zero anyway, so it won't get in the way). Since we cannot keep track of the shifted amount (it would require addition), we also shift the original values so that we can do the final and to determine the larger number. We repeat this process with half the size of the previous mask until we collapse the interesting bits into bit position 0.
I didn't add the equal case in here on purpose.
Old answer:
The simplest method is probably the best for homework. Once you've got the mismatching bit value, you start off with another mask at 0x80000000 (or whatever max bit position suits your word size), and keep right-shifting it until you hit a bit that is set in your mismatch value. If your right shift ends up with 0, then the mismatch value is 0.
I assume you already know the final step required to determine the larger number.

replace byte in 32 bit number

I have a function called replaceByte(x,n,c) that is to replace byte n in x with c with the following restrictions:
Bytes numbered from 0 (LSB) to 3 (MSB)
Examples: replaceByte(0x12345678,1,0xab) = 0x1234ab78
You can assume 0 <= n <= 3 and 0 <= c <= 255
Legal ops: ! ~ & ^ | + << >>
Max ops: 10
int replaceByte(int x, int n, int c) {
int shift = (c << (8 * n));
int mask = 0xff << shift;
return (mask & x) | shift;
}
but when I test it I get this error:
ERROR: Test replaceByte(-2147483648[0x80000000],0[0x0],0[0x0]) failed...
...Gives 0[0x0]. Should be -2147483648[0x80000000]
after realizing that * is not a legal operator I have finally figured it out...and if you are curious, this is what I did:
int replaceByte(int x, int n, int c) {
int mask = 0xff << (n << 3);
int shift = (c << (n << 3));
return (~mask & x) | shift;
}

Since this looks like homework I'm not going to post code, but list the steps you need to perform:
Cast c into a 32-bit number so you don't lose any bits while shifting
Next, shift c by the appropriate number of bits to the left (if n==0 no shifting, if n==1 shift by 8 etc.)
Create a 32-bit bitmask that will zero the lowest 8 bits of x, then shift this mask by the same amount as the last step
Perform bitwise AND of the shifted bitmask and x to zero out the appropriate bits of x
Perform bitwise OR (or addition) of the shifted c value and x to replace the masked bits of the latter
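For readers who have already solved it, the steps above can be sketched as follows. This version uses uint32_t rather than the puzzle's int so that the shifts are well defined; replace_byte and the variable names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Replace byte n of x (0 = least significant, 3 = most significant) with c. */
uint32_t replace_byte(uint32_t x, int n, uint32_t c) {
    assert(n >= 0 && n <= 3 && c <= 255);
    int shift = n << 3;                       /* bit offset of byte n: 0, 8, 16 or 24 */
    uint32_t value = c << shift;              /* new byte moved into position */
    uint32_t mask = (uint32_t)0xff << shift;  /* ones over the byte being replaced */
    return (x & ~mask) | value;               /* clear the old byte, then insert */
}
```

For example, replace_byte(0x12345678, 1, 0xab) yields 0x1234ab78, and replace_byte(0x80000000, 0, 0) leaves 0x80000000 unchanged.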

Ahh... You are almost there.
Just change
return (mask & x) | shift;
to
return (~mask & x) | shift;
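The mask should contain all ones except for the region to be masked and not vice versa. The mask also has to be shifted by the bit offset of byte n, not by the already shifted value of c.
I am using this simple code and it works fine in gcc:
#include <stdio.h>
int replaceByte(int x, int n, int c)
{
int offset = n << 3; /* bit offset of byte n; * is not a legal op */
int shift = c << offset; /* new byte moved into position */
int mask = 0xff << offset; /* ones over the byte being replaced */
return (~mask & x) | shift;
}
int main (void)
{
printf("%X\n", (unsigned)replaceByte(0x80000000, 0, 0));
return 0;
}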

A proper solution, which works for c = 0 as well:
int replaceByte(int x, int n, int c)
{
int shift = n << 3; /* 8 * n without the illegal * operator */
int value = c << shift;
int mask = 0xff << shift;
return (~mask & x) | value;
}

how to make a bit-set/byte-array conversion in c

Given an array,
unsigned char q[32]="1100111...",
how can I generate a 4-bytes bit-set, unsigned char p[4], such that, the bit of this bit-set, equals to value inside the array, e.g., the first byte p[0]= "q[0] ... q[7]"; 2nd byte p[1]="q[8] ... q[15]", etc.
and also how to do it in opposite, i.e., given bit-set, generate the array?
my own trial out for the first part.
unsigned char p[4]={0};
for (int j=0; j<N; j++)
{
if (q[j] == '1')
{
p [j / 8] |= 1 << (7-(j % 8));
}
}
Is the above right? any conditions to check? Is there any better way?
EDIT - 1
I wonder whether the above is an efficient way, as the array size could be up to 4096 or even more.

First, use strtoul to get a 32-bit value (the string must be null-terminated for this to work). Then convert the byte order to big-endian with htonl. Finally, store the result in your array:
#include <arpa/inet.h>
#include <stdlib.h>
/* ... */
unsigned char q[32] = "1100111...";
unsigned char result[4] = {0};
*(unsigned long*)result = htonl(strtoul(q, NULL, 2));
There are other ways as well.
But I lack <arpa/inet.h>!
Then you need to know what byte order your platform is. If it's big endian, then htonl does nothing and can be omitted. If it's little-endian, then htonl is just:
unsigned long htonl(unsigned long x)
{
x = ((x & 0xFF00FF00) >> 8) | ((x & 0x00FF00FF) << 8);
x = ((x & 0xFFFF0000) >> 16) | ((x & 0x0000FFFF) << 16);
return x;
}
If you're lucky, your optimizer might see what you're doing and make it into efficient code. If not, well, at least it's all implementable in registers and O(log N).
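To see what the two lines do, it helps to trace one value: the first line swaps adjacent bytes (0x11223344 becomes 0x22114433) and the second swaps the two 16-bit halves (giving 0x44332211). A small sketch using a fixed-width type so it behaves the same regardless of the size of long; swap32 and its demo are hypothetical names rather than a replacement for the real htonl.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t x) {
    x = ((x & 0xFF00FF00u) >> 8) | ((x & 0x00FF00FFu) << 8);    /* swap adjacent bytes */
    x = ((x & 0xFFFF0000u) >> 16) | ((x & 0x0000FFFFu) << 16);  /* swap 16-bit halves */
    return x;
}

/* Usage example. */
void swap32_demo(void) {
    assert(swap32(0x11223344u) == 0x44332211u);
    assert(swap32(swap32(0xDEADBEEFu)) == 0xDEADBEEFu);  /* swapping twice is a no-op */
}
```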
If you don't know what byte order your platform is, then you need to detect it:
typedef union {
char c[sizeof(int) / sizeof(char)];
int i;
} OrderTest;
unsigned long htonl(unsigned long x)
{
OrderTest test;
test.i = 1;
if(!test.c[0])
return x;
x = ((x & 0xFF00FF00) >> 8) | ((x & 0x00FF00FF) << 8);
x = ((x & 0xFFFF0000) >> 16) | ((x & 0x0000FFFF) << 16);
return x;
}
Maybe long is 8 bytes!
Well, the OP implied 4-byte inputs with their array size, but 8-byte long is doable:
#define kCharsPerLong (sizeof(long) / sizeof(char))
unsigned char q[8 * kCharsPerLong] = "1100111...";
unsigned char result[kCharsPerLong] = {0};
*(unsigned long*)result = htonl(strtoul(q, NULL, 2));
#include <limits.h>
unsigned long htonl(unsigned long x)
{
/* sizeof cannot be evaluated by the preprocessor, so test ULONG_MAX instead. */
#if ULONG_MAX == 0xFFFFFFFFUL
x = ((x & 0xFF00FF00UL) >> 8) | ((x & 0x00FF00FFUL) << 8);
x = ((x & 0xFFFF0000UL) >> 16) | ((x & 0x0000FFFFUL) << 16);
#elif ULONG_MAX == 0xFFFFFFFFFFFFFFFFUL
x = ((x & 0xFF00FF00FF00FF00UL) >> 8) | ((x & 0x00FF00FF00FF00FFUL) << 8);
x = ((x & 0xFFFF0000FFFF0000UL) >> 16) | ((x & 0x0000FFFF0000FFFFUL) << 16);
x = ((x & 0xFFFFFFFF00000000UL) >> 32) | ((x & 0x00000000FFFFFFFFUL) << 32);
#else
#error Unsupported word size.
#endif
return x;
}
For char that isn't 8 bits (DSPs like to do this), you're on your own. (This is why it was a Big Deal when the SHARC series of DSPs had 8-bit bytes; it made it a LOT easier to port existing code because, face it, C does a horrible job of portability support.)
What about arbitrary length buffers? No funny pointer typecasts, please.
The main thing that can be improved with the OP's version is to rethink the loop's internals. Instead of thinking of the output bytes as a fixed data register, think of it as a shift register, where each successive bit is shifted into the right (LSB) end. This will save you from all those divisions and mods (which, hopefully, are optimized away to bit shifts).
For sanity, I'm ditching unsigned char for uint8_t.
#include <stddef.h>
#include <stdint.h>
unsigned StringToBits(const char* inChars, uint8_t* outBytes, size_t numBytes,
size_t* bytesRead)
/* Converts the string of '1' and '0' characters in `inChars` to a buffer of
* bytes in `outBytes`. `numBytes` is the number of available bytes in the
* `outBytes` buffer. On exit, if `bytesRead` is not NULL, the value it points
* to is set to the number of bytes read (rounding up to the nearest full
* byte). If a multiple of 8 bits is not read, the last byte written will be
* padded with 0 bits to reach a multiple of 8 bits. This function returns the
* number of padding bits that were added. For example, an input of 11 bits
* will result in `bytesRead` being set to 2 and the function will return 5. This
* means that if a nonzero value is returned, then a partial byte was read,
* which may be an error.
*/
{ size_t bytes = 0;
unsigned bits = 0;
uint8_t x = 0;
while(bytes < numBytes)
{ /* Parse a character. */
switch(*inChars++)
{ case '0': x <<= 1; ++bits; break;
case '1': x = (x << 1) | 1; ++bits; break;
default: numBytes = 0;
}
/* See if we filled a byte. */
if(bits == 8)
{ outBytes[bytes++] = x;
x = 0;
bits = 0;
}
}
/* Padding, if needed. */
if(bits)
{ bits = 8 - bits;
outBytes[bytes++] = x << bits;
}
/* Finish up. */
if(bytesRead)
*bytesRead = bytes;
return bits;
}
It's your responsibility to make sure inChars is null-terminated. The function will return on the first non-'0' or '1' character it sees or if it runs out of output buffer. Some example usage:
unsigned char q[32] = "1100111...";
uint8_t buf[4];
size_t bytesRead = 5;
if(StringToBits((const char*)q, buf, 4, &bytesRead) || bytesRead != 4)
{
/* Partial read; handle error here. */
}
This just reads 4 bytes, and traps the error if it can't.
unsigned char q[4096] = "1100111...";
uint8_t buf[512];
StringToBits((const char*)q, buf, 512, NULL);
This just converts what it can and sets the rest to 0 bits.
This function could be done better if C had the ability to break out of more than one level of loop or switch; as it stands, I'd have to add a flag value to get the same effect, which is clutter, or I'd have to add a goto, which I simply refuse.

I don't think that will quite work. You are comparing each "bit" to 1 when it should really be '1'. You can also make it a bit more efficient by getting rid of the if:
unsigned char p[4]={0};
for (int j=0; j<32; j++)
{
p [j / 8] |= (q[j] == '1') << (7-(j % 8));
}
Going in reverse is pretty simple too. Just mask for each "bit" that you set earlier.
unsigned char q[32]={0};
for (int j=0; j<32; j++) {
q[j] = !!(p[j / 8] & ( 1 << (7-(j % 8)) )) + '0';
}
You'll notice the creative use of (boolean) + '0' to convert between 1/0 and '1'/'0'.

According to your example it does not look like you are going for readability, and after a (late) refresh my solution looks very similar to Chriszuma's, except for the lack of parentheses (relying on operator precedence) and the addition of the !! to enforce a 0 or 1.
enum { N = 32 }; //N must be a multiple of 8 (an enum constant, so the arrays below can take initializers in C)
unsigned char q[N+1] = "11011101001001101001111110000111";
unsigned char p[N/8] = {0};
unsigned char r[N+1] = {0}; //reversed
for(size_t i = 0; i < N; ++i)
p[i / 8] |= (q[i] == '1') << 7 - i % 8;
for(size_t i = 0; i < N; ++i)
r[i] = '0' + !!(p[i / 8] & 1 << 7 - i % 8);
printf("%x %x %x %x\n", p[0], p[1], p[2], p[3]);
printf("%s\n%s\n", q,r);

If you are looking for extreme efficiency, try to use the following techniques:
Replace if by subtraction of '0' (seems like you can assume your input symbols can be only 0 or 1).
Also process the input from lower indices to higher ones.
for (int c = 0; c < N; c += 8)
{
int y = 0;
for (int b = 0; b < 8; ++b)
y = y * 2 + q[c + b] - '0';
p[c / 8] = y;
}
Replace array indices by auto-incrementing pointers:
const char* qptr = q;
unsigned char* pptr = p;
for (int c = 0; c < N; c += 8)
{
int y = 0;
for (int b = 0; b < 8; ++b)
y = y * 2 + *qptr++ - '0';
*pptr++ = y;
}
Unroll the inner loop:
const char* qptr = q;
unsigned char* pptr = p;
for (int c = 0; c < N; c += 8)
{
*pptr++ =
qptr[0] - '0' << 7 |
qptr[1] - '0' << 6 |
qptr[2] - '0' << 5 |
qptr[3] - '0' << 4 |
qptr[4] - '0' << 3 |
qptr[5] - '0' << 2 |
qptr[6] - '0' << 1 |
qptr[7] - '0' << 0;
qptr += 8;
}
Process several input characters simultaneously (using bit twiddling hacks or MMX instructions) - this has great speedup potential!
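Before optimizing further, it may be worth having a plain reference version to compare results against. The sketch below packs with the multiply-by-two loop from the first technique and unpacks in the opposite direction; pack_bits and unpack_bits are hypothetical names, and it assumes the input contains only '0' and '1' and that the bit count is a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>

/* Pack nbits characters ('0' or '1') from q into p, first character in the MSB. */
void pack_bits(const char *q, unsigned char *p, size_t nbits) {
    assert(nbits % 8 == 0);
    for (size_t c = 0; c < nbits; c += 8) {
        unsigned y = 0;
        for (size_t b = 0; b < 8; ++b)
            y = y * 2 + (unsigned)(q[c + b] - '0');  /* shift in the next bit */
        p[c / 8] = (unsigned char)y;
    }
}

/* Unpack nbits bits from p into q as '0'/'1' characters; q needs nbits + 1 bytes. */
void unpack_bits(const unsigned char *p, char *q, size_t nbits) {
    for (size_t j = 0; j < nbits; ++j)
        q[j] = (char)('0' + ((p[j / 8] >> (7 - j % 8)) & 1));
    q[nbits] = '\0';
}
```

Packing the 32-character string "11011101001001101001111110000111" should give the bytes dd 26 9f 87, and unpacking those bytes should give the original string back, which makes a convenient regression check for any faster variant.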
