Aligning bit pattern by most-significant bit - c

I want to XOR two numbers as follows:
11001110 and 110
However, I need to align the bit patterns as such:
11001110
11000000
Any ideas how to do this? I imagine some bitwise operation might be needed, although how would I know how many bits to shift by?

Here's one attempt, assuming I got the requirements right:
#include <limits.h>
#include <stdio.h>
int topbit(unsigned int x)
{
for (int i = CHAR_BIT * sizeof x - 1; i >= 0; --i)
{
if (x & (1u << i))
return i;
}
return -1;
}
unsigned int alignedxor(unsigned int a, unsigned int b)
{
const int topa = topbit(a);
const int topb = topbit(b);
if (topa < 0)
return b;
if (topb < 0)
return a;
if (topa > topb)
return a ^ (b << (topa - topb));
return (a << (topb - topa)) ^ b;
}
int main(void) {
printf("%x\n", alignedxor(0xce, 6));
printf("%x\n", alignedxor(6, 0xce));
return 0;
}
This prints e, twice, which seems correct but that's all the testing I did.
And yes, you can get the index of the topmost 1-bit more efficiently, but who cares? Also used my rich imagination to deal with corner cases (such as one number being 0).
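For anyone who does care about the "more efficiently" part, here is a minimal sketch of one common alternative: a binary search over the bit positions instead of a linear scan. The name topbit32 and the fixed 32-bit width are my assumptions, not part of the answer above.

```c
#include <stdint.h>

/* Hypothetical helper (not from the answer above): index of the highest
   set bit of a 32-bit value, or -1 when x is 0. */
int topbit32(uint32_t x)
{
    int index = 0;

    if (x == 0)
        return -1;
    /* Each step asks whether anything is set in the upper half of the
       remaining window; if so, drop the lower half and add its width. */
    if (x >> 16) { x >>= 16; index += 16; }
    if (x >> 8)  { x >>= 8;  index += 8;  }
    if (x >> 4)  { x >>= 4;  index += 4;  }
    if (x >> 2)  { x >>= 2;  index += 2;  }
    if (x >> 1)  {           index += 1;  }
    return index;
}
```

This always takes five comparisons, whereas the loop takes up to 32 iterations; whether that matters in practice is something only a profiler can tell you.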

To find out how many bits to shift by, on Windows you can use the MS-specific intrinsic _BitScanReverse, or you can implement your own, something along the lines of:
int findFirstSetBit(uint32_t _n)
{
int idx = 31;
for( ; idx >= 0; --idx){
if((_n & (UINT32_C(1) << idx)) != 0){ /* parentheses needed: != binds tighter than & */
return idx;
}
}
return -1;
}
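If the code has to build with more than one compiler, the intrinsic and the fallback loop can be put behind a single function. The following is only a sketch: the name highest_set_bit is made up, and the GCC/Clang branch assumes unsigned int is 32 bits wide.

```c
#include <stdint.h>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical wrapper: index of the highest set bit, or -1 for 0. */
int highest_set_bit(uint32_t n)
{
    if (n == 0)
        return -1;                  /* neither intrinsic gives a usable index for 0 */
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanReverse(&index, n);     /* MSVC: stores the bit index in index */
    return (int)index;
#elif defined(__GNUC__)
    return 31 - __builtin_clz(n);   /* GCC/Clang: count leading zeros (32-bit unsigned assumed) */
#else
    int idx = 31;                   /* portable fallback: scan down from the top bit */
    while ((n & (UINT32_C(1) << idx)) == 0)
        --idx;
    return idx;
#endif
}
```

With this, the shift amount for aligning b under a is highest_set_bit(a) - highest_set_bit(b), as long as both values are non-zero.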

Related

Invert operation for bitwise in C

Dear C programmers,
X = 1 << N; (left shift)
How can I recover N from X?
Thanks
N in this case is the bit position where the 1 was shifted in. Assuming X has only one bit set (and is of an unsigned type), you can find that position by iterating over the bits and masking with bitwise AND:
for(size_t i=0; i<sizeof(X)*8; i++)
if((X >> i) & 1) /* shifting X instead of 1 avoids overflowing a signed 1 */
printf("%zu", i); /* %zu is the conversion for size_t */
If performance is important, then you'd make a look-up table with all possible results instead.
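One possible shape for such a look-up table, as a rough sketch: the 32 powers of two all have different remainders modulo 37, so X % 37 can index a small table directly. The function name log2_pow2 and the 32-bit width are my assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: returns N for x == 1u << N (0 <= N <= 31),
   or -1 if x is not a power of two. */
int log2_pow2(uint32_t x)
{
    static int8_t table[37];
    static int ready = 0;

    if (!ready) {
        /* Fill the table once: slot (2^n mod 37) holds n. */
        for (int i = 0; i < 37; i++)
            table[i] = -1;
        for (int n = 0; n < 32; n++)
            table[(UINT32_C(1) << n) % 37] = (int8_t)n;
        ready = 1;
    }
    /* Reject 0 and values with more than one bit set. */
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    return table[x % 37];
}
```

For example, log2_pow2(1u << 20) gives 20. The lazy initialisation is not thread-safe; a precomputed constant table would avoid that.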
In a while loop, keep shifting right until X == 1 and count the shifts; the counter gives you N.
unsigned int var = X; /* unsigned, so the right shift is well defined */
int count = 0;
while (var > 1){ /* also terminates if X is 0 */
var >>= 1;
count++;
}
printf("N is %d", count);
Try this (flsl, which is available from string.h on macOS):
int flsl(long mask)
{
int bit;
if (mask == 0) return (0);
for (bit = 1; mask != 1; bit++)
mask = (unsigned long)mask >> 1;
return (bit);
}
unsigned char binlog(long mask) { return mask ? flsl(mask) - 1 : 0; }
int x = 1 << 20;
printf("%d\n", binlog(x)); /* prints 20 */

Is there a more optimal way to approach some of these functions?

I recently completed some bit manipulation exercises from a textbook and have a firm grasp of the core ideas behind manipulating bits. I'm posting mainly to ask about optimizing my current code: I have a hunch that some of these functions could be approached better. Do you have any recommendations for the following code?
#include <stdio.h>
#include "funcs.h"
// basically sizeof(int) using bit manipulation
unsigned int int_size(){
int size = 0;
for(unsigned int i = ~00u; i > 0; i >>= 1, size++);
return size;
}
// get a bit at a specific nth index
// index starts with 0 on the most significant bit
unsigned int bit_get(unsigned int data, unsigned int n){
return (data >> (int_size() - n - 1)) & 1;
}
// set a bit at a specific nth index
// index starts with 0 on the most significant bit
unsigned int bit_set(unsigned int data, unsigned int n){
return data | (1 << (int_size() - n - 1));
}
// gets the bit width of the data (<32)
unsigned int bit_width(unsigned int data){
int width = int_size();
for(; width > 0; width--)
if((data & (1 << width)) != 0)
break;
return width + 1;
}
// print the data contained in an unsigned int
void print_data(unsigned int data){
printf("%016X = ",data);
for(int i = 0; i < int_size(); i++)
printf("%X",bit_get(data,i));
putchar('\n');
}
// search for pattern in source (where pattern is n wide)
unsigned int bitpat_search(unsigned int source, unsigned int pattern,
unsigned int n){
int right = int_size() - n;
unsigned int mask = 0;
for(int i = 0; i < n; i++)
mask |= 1 << i;
for(int i = 0; i < right; i++)
if(((source & (mask << (right - i))) >> (right - i) ^ pattern) == 0)
return i - bit_width(source);
return -1;
}
// extract {count} bits from data starting at {start}
unsigned int bitpat_get(unsigned int data, int start, int count){
if(start < 0 || count < 0 || int_size() <= start || int_size() <= count || bit_width(data) != count)
return -1;
unsigned int mask = 1;
for(int i = 0; i < count; i++)
mask |= 1 << i;
mask <<= int_size() - start - count;
return (data & mask) >> (int_size() - start - count);
}
// set {count} bits (basically width of {replace}) in {*data} starting at {start}
void bitpat_set(unsigned int *data, unsigned int replace, int start, int count){
if(start < 0 || count < 0 || int_size() <= start || int_size() <= count || bit_width(replace) != count)
return;
unsigned int mask = 1;
for(int i = 0; i < count; i++)
mask |= 1 << i;
*data = ((*data | (mask << (int_size() - start - count))) & ~(mask << (int_size() - start - count))) | (replace << (int_size() - start - count));
}
Since your int_size() function returns the same value each time, you could save some time there:
unsigned int int_size(){
static unsigned int size = 0;
if (size == 0)
for(unsigned int i = ~00u; i > 0; i >>= 1, size++);
return size;
}
This way the value is calculated only once.
But replacing all calls of this function with sizeof(int)*8 would be much better.
I looked through your code and there's nothing that jumps out at me.
Overall, don't sweat the small stuff. If the code runs and works fine, no worries. If you are really concerned about performance, go ahead and run your code through a profiler.
The one thing that stands out is the "paranoia" I see in your code regarding the width of an int. I generally use the fixed-width types in stdint.h and give the caller some options regarding what width (e.g. uint8_t, uint16_t, uint32_t, etc.) they want to deal with.
Also, C has bit-fields, which let you give names to individual bits or groups of bits inside a struct.
unsigned int int_size(){
return __builtin_popcount((unsigned int) -1) / __builtin_popcount((unsigned char) -1);
}
This should be faster than looping.
Calling int_size() in all the other functions seems like it's going to kill performance unless the compiler is really good at optimizing that loop out.
You could use a uint32_t instead of an int and then you would know up front the size.
You could also use sizeof(int) to get the size in bytes of an int and multiply by 8. I haven't seen an environment that defined a byte to be other than 8 bits, but the standard does allow for it: the number of bits in a byte is implementation-defined and is available as CHAR_BIT in limits.h.
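Putting the suggestions from these answers together, the first few helpers might look something like the sketch below. The macro name UINT_BITS is made up, the code assumes unsigned int has no padding bits, and bit_width here returns 0 for an input of 0, which may not be what the original intended.

```c
#include <limits.h>

/* Width of unsigned int in bits, known at compile time. */
#define UINT_BITS ((unsigned int)(CHAR_BIT * sizeof(unsigned int)))

/* Read bit n, counting from 0 at the most significant bit. */
unsigned int bit_get(unsigned int data, unsigned int n)
{
    return (data >> (UINT_BITS - n - 1)) & 1u;
}

/* Set bit n, counting from 0 at the most significant bit. */
unsigned int bit_set(unsigned int data, unsigned int n)
{
    return data | (1u << (UINT_BITS - n - 1));
}

/* Number of bits needed to represent data (0 for data == 0). */
unsigned int bit_width(unsigned int data)
{
    unsigned int width = 0;

    while (data != 0) {
        data >>= 1;
        width++;
    }
    return width;
}
```

Using 1u rather than 1 in the shifts matters: shifting a signed 1 into the sign bit is undefined behaviour.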

Set n highest bits twiddling

I have the following function that sets the N highest bits, e.g. set_n_high(8) == 0xff00000000000000
uint64_t set_n_high(int n)
{
uint64_t v = 0;
int i;
for (i = 63 ; i > 63 - n; i--) {
v |= (1llu << i);
}
return v;
}
Now just out of curiosity, is there any way in C to accomplish the same without using a loop (or a lookup table) ?
EDIT: n = 0 and n = 64 are cases to be handled, just as the loop variant does.
If you're OK with the n = 0 case not working, you can simplify it to
uint64_t set_n_high(int n)
{
return ~UINT64_C(0) << (64 - n);
}
If, in addition to that, you're OK with "weird shift counts" (undefined behaviour, but Works On My Machine), you can simplify that even further to
uint64_t set_n_high(int n)
{
return ~UINT64_C(0) << -n;
}
If you're OK with the n = 64 case not working, you can simplify it to
uint64_t set_n_high(int n)
{
return ~(~UINT64_C(0) >> n);
}
If using this means that you have to validate n, it won't be faster. Otherwise, it might be.
If you're not OK with either case not working, it gets trickier. Here's a suggestion (there may be a better way)
uint64_t set_n_high(int n)
{
return ~(~UINT64_C(0) >> (n & 63)) | -(uint64_t)(n >> 6);
}
Note that negating an unsigned number is perfectly well-defined.
uint64_t set_n_high(int n) {
/* valid for 0 < n < 64 only: shifting a 64-bit value by 64 is undefined */
return ((1llu << n) - 1) << (64-n);
}
Use a conditional to handle n == 0 and then it becomes trivial.
uint64_t set_n_high(int n) {
/* optional error checking:
if (n < 0 || n > 64) do something */
if (n == 0) return 0;
return -(uint64_t)1 << 64 - n;
}
There’s really no good reason to do anything more complicated than that. The cast from int to uint64_t is fully specified, as are the negation and shift (because the shift amount is guaranteed to lie in [0,63] if n is in [0,64]).
Well, taking @harold's answer and changing it a little:
uint64_t set_n_high(int n)
{
int carry = n>>6;
return ~((~0uLL >> (n-carry)) >> carry);
}
For what it's worth, of the posts so far (that handle n of 0-64), this one produces the least amount of assembly on an x86_64 and a raspberry pi (and does 1 branch operation) (with gcc 4.8.2). It looks fairly readable too.
uint64_t set_n_high2(int n)
{
uint64_t v = 0;
if (n != 0) {
v = ~UINT64_C(0) << (64 - n);
}
return v;
}
Well I'm presenting a weird-looking one.
:)
/* works for 0<=n<=64 */
uint64_t set_n_high(int n)
{
/* split the shift into two parts that sum to exactly 64 - n, each below 64 */
return ~0llu << ((64 - n) / 2) << ((64 - n) - (64 - n) / 2);
}
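Since n only ranges over 0 to 64, any of these variants can be checked exhaustively against the loop from the question. Here is a small sketch that does so for two of them; the function names (set_n_high_loop, set_n_high_branch, set_n_high_branchless, first_mismatch) are mine, the bodies are taken from the posts above.

```c
#include <stdint.h>

/* Reference: the loop from the question. */
uint64_t set_n_high_loop(int n)
{
    uint64_t v = 0;
    for (int i = 63; i > 63 - n; i--)
        v |= UINT64_C(1) << i;
    return v;
}

/* Variant with a branch for n == 0. */
uint64_t set_n_high_branch(int n)
{
    if (n == 0)
        return 0;
    return ~UINT64_C(0) << (64 - n);
}

/* Branch-free variant handling both n == 0 and n == 64. */
uint64_t set_n_high_branchless(int n)
{
    return ~(~UINT64_C(0) >> (n & 63)) | -(uint64_t)(n >> 6);
}

/* Hypothetical checker: returns the first n in [0, 64] where a variant
   disagrees with the loop, or -1 if they all agree. */
int first_mismatch(void)
{
    for (int n = 0; n <= 64; n++) {
        uint64_t expected = set_n_high_loop(n);
        if (set_n_high_branch(n) != expected ||
            set_n_high_branchless(n) != expected)
            return n;
    }
    return -1;
}
```

A check like this catches off-by-one shift counts, but it will not reliably catch undefined behaviour such as a shift by 64, which may happen to give the expected result on one machine and not on another.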

masking most significant bit

I wrote this function to remove the most significant bit of every byte, but it doesn't seem to work the way I wanted.
The output file size is always 0, and I don't understand why nothing is written to the output file. Is there a better, simpler way to remove the most significant bit of every byte?
In relation to shift operators, section 6.5.7 of the C standard says:
If the value of the right operand is negative or is greater than or
equal to the width of the promoted left operand, the behavior is
undefined.
So firstly, remove nBuffer << 8;. Even if it were well defined, it wouldn't be an assignment operator.
As people have mentioned, you'd be better off using CHAR_BIT than 8. I'm pretty sure, instead of 0x7f you mean UCHAR_MAX >> 1 and instead of 7 you meant CHAR_BIT - 1.
Let's just focus on nBuffer and bit_count, here. I shall comment out anything that doesn't use either of these.
bit_count += 7;
if (bit_count == 7*8)
{
*out_buf++ = nBuffer;
/*if((write(out_fd, bit_buf, sizeof(char))) == -1)
oops("Cannot write on the file", "");*/
nBuffer << 8;
bit_count -= 8;
}
nBuffer = 0;
bit_count = 0;
At the end of this code, what is the value of nBuffer? What about bit_count? What impact would that have on your second loop? while (bit_count > 0)
Now let's focus on the commented out code:
if((write(out_fd, bit_buf, sizeof(char))) == -1)
oops("Cannot write on the file", "");
Where are you assigning a value to bit_buf? Using an uninitialised variable is undefined behaviour.
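For comparison, here is a rough sketch of how the accumulator bookkeeping could look, assuming the goal is to drop the top bit of each byte and pack the remaining 7-bit values back to back. The function pack7 and its signature are made up for illustration; it works on memory buffers and leaves the file I/O out entirely.

```c
#include <stddef.h>

/* Hypothetical helper: drops the top bit of each input byte and packs the
   remaining 7-bit values back to back. out must have room for
   (n * 7 + 7) / 8 bytes. Returns the number of bytes written.
   Assumes 8-bit bytes. */
size_t pack7(const unsigned char *in, size_t n, unsigned char *out)
{
    unsigned int acc = 0;   /* pending bits, right-aligned */
    int bits = 0;           /* how many bits of acc are valid */
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        acc = (acc << 7) | (in[i] & 0x7Fu);
        bits += 7;
        if (bits >= 8) {
            bits -= 8;
            out[written++] = (unsigned char)((acc >> bits) & 0xFFu);
            acc &= (1u << bits) - 1u;   /* keep only the bits not yet written */
        }
    }
    if (bits > 0)   /* flush the tail, padded with zero bits on the right */
        out[written++] = (unsigned char)((acc << (8 - bits)) & 0xFFu);
    return written;
}
```

Note that the accumulator and the bit count are carried from one input byte to the next and are never reset inside the loop, and that a byte is emitted as soon as 8 bits are available rather than when the count reaches some exact value.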
Instead of going through all of the bits to find the high one, this goes through only the 1 bits. high() returns the high bit of the argument, or zero if the argument is zero.
static inline int high(int n) /* static: a plain C99 inline needs a separate external definition */
{
int k;
do {
k = n ^ (n - 1);
n &= ~k;
} while (n);
return (k + 1) >> 1;
}
static inline int drop_high(int n)
{
return n ^ high(n);
}
unsigned char remove_most_significant_bit(unsigned char b)
{
int bit;
for(bit = 0; bit < 8; bit++)
{
unsigned char mask = (0x80 >> bit);
if( mask & b) return b & ~mask;
}
return b;
}
void remove_most_significant_bit_from_buffer(unsigned char* b, int length)
{
int i;
for(i=0; i<length;i++)
{
b[i] = remove_most_significant_bit(b[i]);
}
}
void test_it()
{
unsigned char data[8];
int i;
for(i = 0; i < 8; i++)
{
data[i] = (1 << i) + i;
}
for(i = 0; i < 8; i++)
{
printf("%d\r\n", data[i]);
}
remove_most_significant_bit_from_buffer(data, 8);
for(i = 0; i < 8; i++)
{
printf("%d\r\n", data[i]);
}
}
I won't go through your entire question and rework your code, but removing the most significant bit is easy: its position is the base-2 logarithm of the value, truncated to an integer.
#include <stdio.h>
#include <math.h>
int RemoveMSB(int a)
{
return a ^ (1 << (int)log2(a)); /* requires a > 0: log2 is not defined for 0 or negative values */
}
int main(int argc, char const *argv[])
{
int a = 4387;
printf("MSB of %d is %d\n", a, (int)log2(a));
a = RemoveMSB(a);
printf("MSB of %d is %d\n", a, (int)log2(a));
return 0;
}
Output:
MSB of 4387 is 12
MSB of 291 is 8
As such, 4387 in binary is 1000100100011 with a most significant bit at 12.
Likewise, 291 in binary is 0000100100011 with a most significant bit at 8.
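If pulling in floating point for this feels heavy, the same result can be had with integer operations only. This is a sketch of a different technique (smearing the top bit downwards), not the log2 approach above; the name remove_msb and the 32-bit width are my assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: returns a with its highest set bit cleared
   (0 stays 0), using only integer operations. */
uint32_t remove_msb(uint32_t a)
{
    uint32_t smeared = a;

    /* Copy the highest set bit into every lower position. */
    smeared |= smeared >> 1;
    smeared |= smeared >> 2;
    smeared |= smeared >> 4;
    smeared |= smeared >> 8;
    smeared |= smeared >> 16;

    /* smeared >> 1 is a mask of everything below the highest set bit. */
    return a & (smeared >> 1);
}
```

With the example above, remove_msb(4387) gives 291, and unlike the log2 version an input of 0 is handled without a special case.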

How do I get bit-by-bit data from an integer value in C?

I want to extract bits of a decimal number.
For example, 7 is binary 0111, and I want to get 0 1 1 1 all bits stored in bool. How can I do so?
OK, a loop is not a good option, can I do something else for this?
If you want the k-th bit of n, then do
(n & ( 1 << k )) >> k
Here we create a mask, apply the mask to n, and then right shift the masked value to get just the bit we want. We could write it out more fully as:
int mask = 1 << k;
int masked_n = n & mask;
int thebit = masked_n >> k;
Here is a program:
#include <stdio.h>
#include <stdlib.h>
int *get_bits(int n, int bitswanted){
int *bits = malloc(sizeof(int) * bitswanted);
int k;
for(k=0; k<bitswanted; k++){
int mask = 1 << k;
int masked_n = n & mask;
int thebit = masked_n >> k;
bits[k] = thebit;
}
return bits;
}
int main(){
int n=7;
int bitswanted = 5;
int *bits = get_bits(n, bitswanted);
printf("%d = ", n);
int i;
for(i=bitswanted-1; i>=0;i--){
printf("%d ", bits[i]);
}
printf("\n");
free(bits); /* release the array allocated by get_bits */
return 0;
}
As requested, I decided to extend my comment on forefinger's answer to a full-fledged answer. Although his answer is correct, it is needlessly complex. Furthermore, all current answers use signed ints to represent the values. This is dangerous, as right-shifting negative values is implementation-defined (i.e. not portable) and left-shifting them can lead to undefined behavior.
By right-shifting the desired bit into the least significant bit position, masking can be done with 1. No need to compute a new mask value for each bit.
(n >> k) & 1
As a complete program, computing (and subsequently printing) an array of single bit values:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv)
{
unsigned
input = 0x7u, /* binary 0111; 0b literals need C23 or a compiler extension */
n_bits = 4u,
*bits = (unsigned*)malloc(sizeof(unsigned) * n_bits),
bit = 0;
for(bit = 0; bit < n_bits; ++bit)
bits[bit] = (input >> bit) & 1;
for(bit = n_bits; bit--;)
printf("%u", bits[bit]);
printf("\n");
free(bits);
}
Assuming that you want to calculate all bits as in this case, and not a specific one, the loop can be further changed to
for(bit = 0; bit < n_bits; ++bit, input >>= 1)
bits[bit] = input & 1;
This modifies input in place and thereby allows the use of a constant width, single-bit shift, which may be more efficient on some architectures.
Here's one way to do it—there are many others:
bool b[4];
int v = 7; // number to dissect
for (int j = 0; j < 4; ++j)
b [j] = 0 != (v & (1 << j));
It is hard to understand why use of a loop is not desired, but it is easy enough to unroll the loop:
bool b[4];
int v = 7; // number to dissect
b [0] = 0 != (v & (1 << 0));
b [1] = 0 != (v & (1 << 1));
b [2] = 0 != (v & (1 << 2));
b [3] = 0 != (v & (1 << 3));
Or evaluating constant expressions in the last four statements:
b [0] = 0 != (v & 1);
b [1] = 0 != (v & 2);
b [2] = 0 != (v & 4);
b [3] = 0 != (v & 8);
Here's a very simple way to do it (in C++):
#include <iostream>
#include <vector>
using namespace std;
int main()
{
int s=7,l=1;
vector <bool> v;
v.clear();
while (l <= 4)
{
v.push_back(s%2);
s /= 2;
l++;
}
for (l=(v.size()-1); l >= 0; l--)
{
cout<<v[l]<<" ";
}
return 0;
}
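Using std::bitset (C++):
#include <bitset>
#include <climits>
#include <iostream>
int value = 123;
std::bitset<sizeof(int) * CHAR_BIT> bits(value); // the template argument is a bit count, not a byte count
std::cout << bits.to_string();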
@prateek, thank you for your help. I rewrote the function with comments for use in a program. Increase 8 for more bits (up to 32 for an integer).
std::vector <bool> bits_from_int (int integer) // discern which bits of PLC codes are true
{
std::vector <bool> bool_bits;
// repeatedly divide the integer by 2; if there is a remainder, the bit is 1, else it's 0
for (int i = 0; i < 8; i++)
{
bool_bits.push_back (integer%2); // remainder of dividing by 2
integer /= 2; // integer equals itself divided by 2
}
return bool_bits;
}
#include <stdio.h>
int main(void)
{
int number = 7; /* signed */
int vbool[8 * sizeof(int)];
int i;
for (i = 0; i < 8 * sizeof(int); i++)
{
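vbool[i] = (((unsigned int)number << i) & ~(~0u >> 1)) != 0; /* test the top bit; shifting as unsigned avoids signed-overflow UB */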
printf("%d", vbool[i]);
}
return 0;
}
If you don't want any loops, you'll have to write it out:
#include <stdio.h>
#include <stdbool.h>
int main(void)
{
int num = 7;
#if 0
bool arr[4] = { (num&1) ?true: false, (num&2) ?true: false, (num&4) ?true: false, (num&8) ?true: false };
#else
#define BTB(v,i) (((v) & (1u << (i))) ? true : false)
bool arr[4] = { BTB(num,0), BTB(num,1), BTB(num,2), BTB(num,3)};
#undef BTB
#endif
printf("%d %d %d %d\n", arr[3], arr[2], arr[1], arr[0]);
return 0;
}
As demonstrated here, this also works in an initializer.

Resources