Convert signed int of variable bit size - c

I have a number of bits (the number of bits can change) in an unsigned int (uint32_t). For example (12 bits in the example):
uint32_t a = 0xF9C;
The bits represent a signed int of that length.
In this case the number in decimal should be -100.
I want to store the variable in a signed variable and gets is actual value.
If I just use:
int32_t b = (int32_t)a;
it will be just the value 3996, since it gets casted to (0x00000F9C) but it actually needs to be (0xFFFFFF9C)
I know one way to do it:
union test
{
signed temp :12;
};
union test x;
x.temp = a;
int32_t result = (int32_t) x.temp;
now i get the correct value -100
But is there a better way to do it?
My solution is not very flexbile, as I mentioned the number of bits can vary (anything between 1-64bits).

But is there a better way to do it?
Well, depends on what you mean by "better". The example below shows a more flexible way of doing it as the size of the bit field isn't fixed. If your use case requires different bit sizes, you could consider it a "better" way.
unsigned sign_extend(unsigned x, unsigned num_bits)
{
unsigned f = ~((1 << (num_bits-1)) - 1);
if (x & f) x = x | f;
return x;
}
int main(void)
{
int x = sign_extend(0xf9c, 12);
printf("%d\n", x);
int y = sign_extend(0x79c, 12);
printf("%d\n", y);
}
Output:
-100
1948

A branch free way to sign extend a bitfield (Henry S. Warren Jr., CACM v20 n6 June 1977) is this:
// value i of bit-length len is a bitfield to sign extend
// i is right aligned and zero-filled to the left
sext = 1 << (len - 1);
i = (i ^ sext) - sext;
UPDATE based on #Lundin's comment
Here's tested code (prints -100):
#include <stdio.h>
#include <stdint.h>
int32_t sign_extend (uint32_t x, int32_t len)
{
int32_t i = (x & ((1u << len) - 1)); // or just x if you know there are no extraneous bits
int32_t sext = 1 << (len - 1);
return (i ^ sext) - sext;
}
int main(void)
{
printf("%d\n", sign_extend(0xF9C, 12));
return 0;
}

This relies on the implementation defined behavior of sign extension when right-shifting signed negative integers. First you shift your unsigned integer all the way left until the sign bit is becoming MSB, then you cast it to signed integer and shift back:
#include <stdio.h>
#include <stdint.h>
#define NUMBER_OF_BITS 12
int main(void) {
uint32_t x = 0xF9C;
int32_t y = (int32_t)(x << (32-NUMBER_OF_BITS)) >> (32-NUMBER_OF_BITS);
printf("%d\n", y);
return 0;
}

This is a solution to your problem:
int32_t sign_extend(uint32_t x, uint32_t bit_size)
{
// The expression (0xffffffff << bit_size) will fill the upper bits to sign extend the number.
// The expression (-(x >> (bit_size-1))) is a mask that will zero the previous expression in case the number was positive (to avoid having an if statemet).
return (0xffffffff << bit_size) & (-(x >> (bit_size-1))) | x;
}
int main()
{
printf("%d\n", sign_extend(0xf9c, 12)); // -100
printf("%d\n", sign_extend(0x7ff, 12)); // 2047
return 0;
}

The sane, portable and effective way to do this is simply to mask out the data part, then fill up everything else with 0xFF... to get proper 2's complement representation. You need to know is how many bits that are the data part.
We can mask out the data with (1u << data_length) - 1.
In this case with data_length = 8, the data mask becomes 0xFF. Lets call this data_mask.
Thus the data part of the number is a & data_mask.
The rest of the number needs to be filled with zeroes. That is, everything not part of the data mask. Simply do ~data_mask to achieve that.
C code: a = (a & data_mask) | ~data_mask. Now a is proper 32 bit 2's complement.
Example:
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
const uint32_t data_length = 8;
const uint32_t data_mask = (1u << data_length) - 1;
uint32_t a = 0xF9C;
a = (a & data_mask) | ~data_mask;
printf("%"PRIX32 "\t%"PRIi32, a, (int32_t)a);
}
Output:
FFFFFF9C -100
This relies on int being 32 bits 2's complement but is otherwise fully portable.

Related

Sign extending to 32 bits, starting with n bits - C

I am new to C and getting some practice with bit manipulation.
Suppose I have an n bit two's complement number such that n > 0 and n < 31. If I know the size of n in advance, how can I sign extend it to 32 bits?
If n was 16 bits,
int32_t extendMe(int16_t n) {
return (int32_t) n;
}
assuming I have the data definitions.
Suppose I have an n bit value that I want to sign extend to 32, how can I accomplish this?
Thank you.
If this really is about interpreting arbitrary bit patterns as numbers represented in n bits using two's complement, here's some sloppy example code doing that:
#include <stdio.h>
#include <inttypes.h>
// this assumes the number is in the least significant `bits`, with
// the most significat of these being the sign bit.
int32_t fromTwosComplement(uint32_t pattern, unsigned int bits)
{
// read sign bit
int negative = !!(pattern & (1U << (bits-1)));
// bit mask for all bits *except* the sign bit
uint32_t mask = (1U << (bits-1)) - 1;
// extract value without sign
uint32_t val = pattern & mask;
if (negative)
{
// if negative, apply two's complement
val ^= mask;
++val;
return -val;
}
else
{
return val;
}
}
int main(void)
{
printf("%" PRId32 "\n", fromTwosComplement(0x1f, 5)); // output -1
printf("%" PRId32 "\n", fromTwosComplement(0x01, 5)); // output 1
}
An n-bit 2's complement number is negative if bit n - 1 is 1. In that case you want to fill all the bits from n to 31 with 1's. If it's zero, for completeness, you might also want to fill the bits from n to 31 with 0. So you need a mask, that you can use with bit operations to accomplish the above. This is easy to make. Assuming your n bit 2's complement number is held in a uint32_t:
int32_t signExtend(uint32_t number, int n)
{
uint32_t ret;
uint32_t mask = 0xffffffff << n;
if (number & (1 << (n - 1)) != 0)
{
// number is negative
ret = number | mask;
}
else
{
// number is positive
ret = number & ~mask;
}
return (int32_t) ret;
}
Completely untested and the last line might be UB but it should work on most implementations.

store 2 signed shorts in one unsigned int

This is given:
signed short a, b;
a = -16;
b = 340;
Now I want to store these 2 signed shorts in one unsigned int and later retrieve these 2 signed shorts again. I tried this but the resulting shorts are not the same:
unsigned int c = a << 16 | b;
signed short ar, br;
ar = c >> 16;
br = c & 0xFFFF;
OP almost had it right
#include <assert.h>
#include <limits.h>
unsigned ab_to_c(signed short a, signed short b) {
assert(SHRT_MAX == 32767);
assert(UINT_MAX == 4294967295);
// unsigned int c = a << 16 | b; fails as `b` get sign extended before the `|`.
// *1u insures the shift of `a` is done as `unsigned` to avoid UB
// of shifting into the sign bit.
unsigned c = (a*1u << 16) | (b & 0xFFFF);
return c;
}
void c_to_ab(unsigned c, signed short *a, signed short *b) {
*a = c >> 16;
*b = c & 0xFFFF;
}
Since a has a negative value,
unsigned int c = a << 16 | b;
results in undefined behavior.
From the C99 standard (emphasis mine):
6.5.7 Bitwise shift operators
4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 x 2E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 x 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
You can explicitly cast the signed short to unsigned short to get a predictable behavior.
#include <stdio.h>
int main()
{
signed short a, b;
a = -16;
b = 340;
unsigned int c = (unsigned short)a << 16 | (unsigned short)b;
signed short ar, br;
ar = c >> 16;
br = c & 0xFFFF;
printf("ar: %hd, br: %hd\n", ar, br);
}
Output:
ar: -16, br: 340
This is really weird, I've compiled your code and it works for me
perhaps this is undefined behavior I'm not sure, however if I were you
I'd add castings to explicitly avoid some bit loss that may or may not be caused by abusing two complement or compiler auto casting....
In my opinion what's happening is probably you shifting out all the bits in
a... try this
unsigned int c = ((unsigned int) a) << 16 | b;
This is because you are using an unsigned int, which is usually 32 bits and a negative signed short which is usually 16 bits.
When you put a short with a negative value into an unsigned int, that "negative" bit is going to be interpreted as part of a positive number.
And so you get a vastly different number in the unsigned int.
Storing two positive numbers would solve this problem....but you might need to store a negative one.
Not sure if this way of doing is good for portability or others but I use...
#ifndef STDIO_H
#define STDIO_H
#include <stdio.h>
#endif
#ifndef SDTINT_H
#define STDINT_H
#include <stdint.h>
#endif
#ifndef BOOLEAN_TE
#define BOOLEAN_TE
typedef enum {false, true} bool;
#endif
#ifndef UINT32_WIDTH
#define UINT32_WIDTH 32 // defined in stdint.h, inttypes.h even in libc.h... undefined ??
#endif
typedef struct{
struct{ // anonymous struct
uint32_t x;
uint32_t y;
};}ts_point;
typedef struct{
struct{ // anonymous struct
uint32_t line;
uint32_t column;
};}ts_position;
bool is_little_endian()
{
uint8_t n = 1;
return *(char *)&n == 1;
}
int main(void)
{
uint32_t x, y;
uint64_t packed;
ts_point *point;
ts_position *position;
x = -12;
y = 3457254;
printf("at start: x = %i | y = %i\n", x, y);
if (is_little_endian()){
packed = (uint64_t)y << UINT32_WIDTH | (uint64_t)x;
}else{
packed = (uint64_t)x << UINT32_WIDTH | (uint64_t)y;
}
printf("packed: position = %llu\n", packed);
point = (ts_point*)&packed;
printf("unpacked: x = %i | y = %i\n", point->x, point->y); // access via pointer
position = (ts_position*)&packed;
printf("unpacked: line = %i | column = %i\n", position->line, position->column);
return 0;
}
I like the way I do as it's offer lots of readiness and can be applied in manay ways ie. 02x32, 04x16, 08x08, etc. I'm new at C so feel free to critic my code and way of doing... thanks

Bit invert function in K&R exercise 2-7

Exercise 2-7 of The C Programming Language:
Write a function invert(x,p,n) that returns x with the n bits that begin at position p inverted (i.e., 1 changed to 0 and vice versa), leaving the others unchanged.
I understood the question like this: I have 182 which is 101(101)10 in binary, the part in parentheses has to be inverted without changing the rest. The return value should be 10101010 then, which is 170 in decimal.
Here is my attempt:
#include <stdio.h>
unsigned int getbits(unsigned int bitfield, int pos, int num);
unsigned int invert(unsigned int bitfield, int pos, int num);
int main(void)
{
printf("%d\n", invert(182, 4, 3));
return 0;
}
/* getbits: get num bits from position pos */
unsigned int getbits(unsigned int bitfield, int pos, int num)
{
return (bitfield >> (pos+1-n)) & ~(~0 << num);
}
/* invert: flip pos-num bits in bitfield */
unsigned int invert(unsigned int bitfield, int pos, int num)
{
unsigned int mask;
unsigned int bits = getbits(bitfield,pos,num);
mask = (bits << (num-1)) | ((~bits << (pos+1)) >> num);
return bitfield ^ mask;
}
It seems correct (to me), but invert(182, 4, 3) outputs 536870730. getbits() works fine (it's straight from the book). I wrote down what happens in the expression I've assigned to y:
(00000101 << 2) | ((~00000101 << 5) >> 3) -- 000000101 is the part being flipped: 101(101)10
00010100 | ((11111010 << 5) >> 3)
00010100 | (01000000 >> 3)
00010100 | 00001000
= 00011100
10110110 (182)
^ 00011100
----------
= 10101010 (170)
Should be correct, but it isn't. I found out this is where it goes wrong: ((~xpn << (p+1)) >> n). I don't see how.
Also, I've no idea how general this code is. My first priority is to just get this case working. Help in this issue is welcome too.
((1u<<n)-1) is a bit mask with n '1' bits at the RHS. <<p shifts this block of ones p positions to the left. (you should shift with (p-n) instead of p if you want to count from the left).
return val ^ (((1u<<n)-1) <<p) ;
There still is a problem when p is larger than the wordsize (shifting by more than the wordsize is undefined), but that should be the responsability of the caller ;-)
For the example 101(101)10 with p=2 and n=3:
1u<<n := 1000
((1u<<n)-1) := 0111
(((1u<<n)-1) <<p) := 011100
original val := 10110110
val ^ mask := 10101010
I think you have an off-by-one issue in one of the shifts (it's just a hunch, I'm not entirely sure). Nevertheless, I'd keep it simple (I'm assuming the index position p starts from the LSB, i.e. p=0 is the LSB):
unsigned int getbits(unsigned int x, int p, int n) {
unsigned int ones = ~(unsigned int)0;
return x ^ (ones << p) ^ (ones << (p+n));
}
edit: If you need p=0 to be the MSB, just invert the shifts (this works correctly because ones is defined as unsigned int):
unsigned int getbits(unsigned int x, int p, int n) {
unsigned int ones = ~(unsigned int)0;
return x ^ (ones >> p) ^ (ones >> (p+n));
}
note: in both cases if p < 0, p >= sizeof(int)*8, p+n < 0 or p+n >= sizeof(int)*8 the result of getbits is undefined.
Take a look at Steve Summit's "Introductory C programming" and at Ted Jensen's "At tutorial on pointers and arrays in C". The language they cover is a bit different from today's C (also programming customs have evolved, machines are much larger, and real men don't write assembler anymore), but much of what they say is as true today as it was then. Sean Anderson's "Bit twiddling hacks" will make your eyes bulge. Guaranteed.
I found out what was wrong in my implementation (other than counting num from the wrong direction). Seems fairly obvious afterwards now that I've learned more about bits.
When a 1-bit is shifted left, out of range of the bit field, it's expanded.
1000 (8) << 1
== 10000 (16)
bitfield << n multiplies bitfield by 2 n times. My expression ((~bits << (pos+1)) >> num) has 5, 4 and 3 as values for bits, pos and num, respectively. I was multiplying a number almost the size of a 32-bit int by 2, twice.
how about my function? i think it so good.
unsigned invert(unsigned x,int p,int n)
{
return (x^((~(~0<<n))<<p+1-n));
}

How do I extract specific 'n' bits of a 32-bit unsigned integer in C?

Could anyone tell me as to how to extract 'n' specific bits from a 32-bit unsigned integer in C.
For example, say I want the first 17 bits of the 32-bit value; what is it that I should do?
I presume I am supposed to use the modulus operator and I tried it and was able to get the last 8 bits and last 16 bits as
unsigned last8bitsvalue=(32 bit integer) % 16
unsigned last16bitsvalue=(32 bit integer) % 32
Is this correct? Is there a better and more efficient way to do this?
Instead of thinking of it as 'extracting', I like to think of it as 'isolating'. Once the desired bits are isolated, you can do what you will with them.
To isolate any set of bits, apply an AND mask.
If you want the last X bits of a value, there is a simple trick that can be used.
unsigned mask;
mask = (1 << X) - 1;
lastXbits = value & mask;
If you want to isolate a run of X bits in the middle of 'value' starting at 'startBit' ...
unsigned mask;
mask = ((1 << X) - 1) << startBit;
isolatedXbits = value & mask;
Hope this helps.
If you want n bits specific then you could first create a bitmask and then AND it with your number to take the desired bits.
Simple function to create mask from bit a to bit b.
unsigned createMask(unsigned a, unsigned b)
{
unsigned r = 0;
for (unsigned i=a; i<=b; i++)
r |= 1 << i;
return r;
}
You should check that a<=b.
If you want bits 12 to 16 call the function and then simply & (logical AND) r with your number N
r = createMask(12,16);
unsigned result = r & N;
If you want you can shift the result. Hope this helps
Modulus works to get bottom bits (only), although I think value & 0x1ffff expresses "take the bottom 17 bits" more directly than value % 131072, and so is easier to understand as doing that.
The top 17 bits of a 32-bit unsigned value would be value & 0xffff8000 (if you want them still in their positions at the top), or value >> 15 if you want the top 17 bits of the value in the bottom 17 bits of the result.
There is a single BEXTR (Bit field extract (with register)) x86 instruction on Intel and AMD CPUs and UBFX on ARM. There are intrinsic functions such as _bextr_u32() (link requires sign-in) that allow to invoke this instruction explicitly.
They implement (source >> offset) & ((1 << n) - 1) C code: get n continuous bits from source starting at the offset bit. Here's a complete function definition that handles edge cases:
#include <limits.h>
unsigned getbits(unsigned value, unsigned offset, unsigned n)
{
const unsigned max_n = CHAR_BIT * sizeof(unsigned);
if (offset >= max_n)
return 0; /* value is padded with infinite zeros on the left */
value >>= offset; /* drop offset bits */
if (n >= max_n)
return value; /* all bits requested */
const unsigned mask = (1u << n) - 1; /* n '1's */
return value & mask;
}
For example, to get 3 bits from 2273 (0b100011100001) starting at 5-th bit, call getbits(2273, 5, 3)—it extracts 7 (0b111).
For example, say I want the first 17 bits of the 32-bit value; what is it that I should do?
unsigned first_bits = value & ((1u << 17) - 1); // & 0x1ffff
Assuming CHAR_BIT * sizeof(unsigned) is 32 on your system.
I presume I am supposed to use the modulus operator and I tried it and was able to get the last 8 bits and last 16 bits
unsigned last8bitsvalue = value & ((1u << 8) - 1); // & 0xff
unsigned last16bitsvalue = value & ((1u << 16) - 1); // & 0xffff
If the offset is always zero as in all your examples in the question then you don't need the more general getbits(). There is a special cpu instruction BLSMSK that helps to compute the mask ((1 << n) - 1).
If you need the X last bits of your integer, use a binary mask :
unsigned last8bitsvalue=(32 bit integer) & 0xFF
unsigned last16bitsvalue=(32 bit integer) & 0xFFFF
This is a briefer variation of the accepted answer: the function below extracts the bits from-to inclusive by creating a bitmask. After applying an AND logic over the original number the result is shifted so the function returns just the extracted bits.
Skipped index/integrity checks for clarity.
uint16_t extractInt(uint16_t orig16BitWord, unsigned from, unsigned to)
{
unsigned mask = ( (1<<(to-from+1))-1) << from;
return (orig16BitWord & mask) >> from;
}
Bitwise AND your integer with the mask having exactly those bits set that you want to extract. Then shift the result right to reposition the extracted bits if desired.
unsigned int lowest_17_bits = myuint32 & 0x1FFFF;
unsigned int highest_17_bits = (myuint32 & (0x1FFFF << (32 - 17))) >> (32 - 17);
Edit: The latter repositions the highest 17 bits as the lowest 17; this can be useful if you need to extract an integer from “within” a larger one. You can omit the right shift (>>) if this is not desired.
#define GENERAL__GET_BITS_FROM_U8(source,lsb,msb) \
((uint8_t)((source) & \
((uint8_t)(((uint8_t)(0xFF >> ((uint8_t)(7-((uint8_t)(msb) & 7))))) & \
((uint8_t)(0xFF << ((uint8_t)(lsb) & 7)))))))
#define GENERAL__GET_BITS_FROM_U16(source,lsb,msb) \
((uint16_t)((source) & \
((uint16_t)(((uint16_t)(0xFFFF >> ((uint8_t)(15-((uint8_t)(msb) & 15))))) & \
((uint16_t)(0xFFFF << ((uint8_t)(lsb) & 15)))))))
#define GENERAL__GET_BITS_FROM_U32(source,lsb,msb) \
((uint32_t)((source) & \
((uint32_t)(((uint32_t)(0xFFFFFFFF >> ((uint8_t)(31-((uint8_t)(msb) & 31))))) & \
((uint32_t)(0xFFFFFFFF << ((uint8_t)(lsb) & 31)))))))
int get_nbits(int num, int n)
{
return (((1<<n)-1) & num);
}
I have another method for accomplishing this. You can use a union of an integer type that has enough bits for your application and a bit field struct.
Example:
typedef thesebits
{
unsigned long first4 : 4;
unsigned long second4 : 4;
unsigned long third8 : 8;
unsigned long forth7 : 7;
unsigned long fifth3 : 3;
unsigned long sixth5 : 5;
unsigned long last1 : 1;
} thesebits;
you can set that struct up to whatever bit pattern you want. If you have multiple bit patterns, you can even use that in your union as well.
typedef thesebitstwo
{
unsigned long first8 : 8;
unsigned long second8 : 8;
unsigned long third8 : 8;
unsigned long last8 : 8;
} thesebitstwo;
Now you can set up your union:
typedef union myunion
{
unsigned long mynumber;
thesebits mybits;
thesebitstwo mybitstwo;
} myunion;
Then you can access the bits you want from any number you assign to the member mynumber:
myunion getmybits;
getmybits.mynumber = 1234567890;
If you want the last 8 bits:
last16bits = getmybits.mybitstwo.last8;
If you want the second 4 bits:
second4bits = getmybits.mybits.second4;
I gave two examples kind of randomly assigned different bits to show. You can set the struct bit-fields up for whatever bits you want to get. I made all of the variables type unsigned long but you can use any variable type as long as the number of bits doesn't exceed those that can be used in the type. So most of these could have been just unsigned int and some even could be unsigned short
The caveat here is this works if you always want the same set of bits over and over. If there's a reason you may need to vary which bits you're looking at to anything, you could use a struct with an array that keeps a copy of the bits like so:
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
typedef struct bits32
{
bool b0 : 1;
bool b1 : 1;
bool b2 : 1;
bool b3 : 1;
bool b4 : 1;
bool b5 : 1;
bool b6 : 1;
bool b7 : 1;
bool b8 : 1;
bool b9 : 1;
bool b10 : 1;
bool b11 : 1;
bool b12 : 1;
bool b13 : 1;
bool b14 : 1;
bool b15 : 1;
bool b16 : 1;
bool b17 : 1;
bool b18 : 1;
bool b19 : 1;
bool b20 : 1;
bool b21 : 1;
bool b22 : 1;
bool b23 : 1;
bool b24 : 1;
bool b25 : 1;
bool b26 : 1;
bool b27 : 1;
bool b28 : 1;
bool b29 : 1;
bool b30 : 1;
bool b31 : 1;
} bits32;
typedef struct flags32 {
union
{
uint32_t number;
struct bits32 bits;
};
bool b[32];
} flags32;
struct flags32 assignarray ( unsigned long thisnumber )
{
struct flags32 f;
f.number = thisnumber;
f.b[0] = f.bits.b0;
f.b[1] = f.bits.b1;
f.b[2] = f.bits.b2;
f.b[3] = f.bits.b3;
f.b[4] = f.bits.b4;
f.b[5] = f.bits.b5;
f.b[6] = f.bits.b6;
f.b[7] = f.bits.b7;
f.b[8] = f.bits.b8;
f.b[9] = f.bits.b9;
f.b[10] = f.bits.b10;
f.b[11] = f.bits.b11;
f.b[12] = f.bits.b12;
f.b[13] = f.bits.b13;
f.b[14] = f.bits.b14;
f.b[15] = f.bits.b15;
f.b[16] = f.bits.b16;
f.b[17] = f.bits.b17;
f.b[18] = f.bits.b18;
f.b[19] = f.bits.b19;
f.b[20] = f.bits.b20;
f.b[21] = f.bits.b21;
f.b[22] = f.bits.b22;
f.b[23] = f.bits.b23;
f.b[24] = f.bits.b24;
f.b[25] = f.bits.b25;
f.b[26] = f.bits.b26;
f.b[27] = f.bits.b27;
f.b[28] = f.bits.b28;
f.b[29] = f.bits.b29;
f.b[30] = f.bits.b30;
f.b[31] = f.bits.b31;
return f;
}
int main ()
{
struct flags32 bitmaster;
bitmaster = assignarray(1234567890);
printf("%d\n", bitmaster.number);
printf("%d\n",bitmaster.bits.b9);
printf("%d\n",bitmaster.b[9]);
printf("%lu\n", sizeof(bitmaster));
printf("%lu\n", sizeof(bitmaster.number));
printf("%lu\n", sizeof(bitmaster.bits));
printf("%lu\n", sizeof(bitmaster.b));
}
The issue with this last example is that it's not compact. The union itself is only 4 bytes, but since you can't do pointers to bit-fields (without complicated and debatably "non-standard" code), then the array makes a copy of each boolean value and uses a full byte for each one, instead of just the bit, so it takes up 9x the total memory space (if you run the printf statement examples I gave, you'll see).
But now, you can address each bit one-by-one and use a variable to index each one, which is great if you're not short on memory.
By using the typedefs above and the assignarray function as a constructor for flags32, you can easily expand to multiple variables. If you're OK just addressing with .b# and not being able to use a variable, you can just define the union as flags32 and omit the rest of the struct. Then you also don't need the assignarray function, and you'll use much less memory.

Bit reversal of an integer, ignoring integer size and endianness

Given an integer typedef:
typedef unsigned int TYPE;
or
typedef unsigned long TYPE;
I have the following code to reverse the bits of an integer:
TYPE max_bit= (TYPE)-1;
void reverse_int_setup()
{
TYPE bits= (TYPE)max_bit;
while (bits <<= 1)
max_bit= bits;
}
TYPE reverse_int(TYPE arg)
{
TYPE bit_setter= 1, bit_tester= max_bit, result= 0;
for (result= 0; bit_tester; bit_tester>>= 1, bit_setter<<= 1)
if (arg & bit_tester)
result|= bit_setter;
return result;
}
One just needs first to run reverse_int_setup(), which stores an integer with the highest bit turned on, then any call to reverse_int(arg) returns arg with its bits reversed (to be used as a key to a binary tree, taken from an increasing counter, but that's more or less irrelevant).
Is there a platform-agnostic way to have in compile-time the correct value for max_int after the call to reverse_int_setup(); Otherwise, is there an algorithm you consider better/leaner than the one I have for reverse_int()?
Thanks.
#include<stdio.h>
#include<limits.h>
#define TYPE_BITS sizeof(TYPE)*CHAR_BIT
typedef unsigned long TYPE;
TYPE reverser(TYPE n)
{
TYPE nrev = 0, i, bit1, bit2;
int count;
for(i = 0; i < TYPE_BITS; i += 2)
{
/*In each iteration, we swap one bit on the 'right half'
of the number with another on the left half*/
count = TYPE_BITS - i - 1; /*this is used to find how many positions
to the left (and right) we gotta move
the bits in this iteration*/
bit1 = n & (1<<(i/2)); /*Extract 'right half' bit*/
bit1 <<= count; /*Shift it to where it belongs*/
bit2 = n & 1<<((i/2) + count); /*Find the 'left half' bit*/
bit2 >>= count; /*Place that bit in bit1's original position*/
nrev |= bit1; /*Now add the bits to the reversal result*/
nrev |= bit2;
}
return nrev;
}
int main()
{
TYPE n = 6;
printf("%lu", reverser(n));
return 0;
}
This time I've used the 'number of bits' idea from TK, but made it somewhat more portable by not assuming a byte contains 8 bits and instead using the CHAR_BIT macro. The code is more efficient now (with the inner for loop removed). I hope the code is also slightly less cryptic this time. :)
The need for using count is that the number of positions by which we have to shift a bit varies in each iteration - we have to move the rightmost bit by 31 positions (assuming 32 bit number), the second rightmost bit by 29 positions and so on. Hence count must decrease with each iteration as i increases.
Hope that bit of info proves helpful in understanding the code...
The following program serves to demonstrate a leaner algorithm for reversing bits, which can be easily extended to handle 64bit numbers.
#include <stdio.h>
#include <stdint.h>
int main(int argc, char**argv)
{
int32_t x;
if ( argc != 2 )
{
printf("Usage: %s hexadecimal\n", argv[0]);
return 1;
}
sscanf(argv[1],"%x", &x);
/* swap every neigbouring bit */
x = (x&0xAAAAAAAA)>>1 | (x&0x55555555)<<1;
/* swap every 2 neighbouring bits */
x = (x&0xCCCCCCCC)>>2 | (x&0x33333333)<<2;
/* swap every 4 neighbouring bits */
x = (x&0xF0F0F0F0)>>4 | (x&0x0F0F0F0F)<<4;
/* swap every 8 neighbouring bits */
x = (x&0xFF00FF00)>>8 | (x&0x00FF00FF)<<8;
/* and so forth, for say, 32 bit int */
x = (x&0xFFFF0000)>>16 | (x&0x0000FFFF)<<16;
printf("0x%x\n",x);
return 0;
}
This code should not contain errors, and was tested using 0x12345678 to produce 0x1e6a2c48 which is the correct answer.
typedef unsigned long TYPE;
TYPE reverser(TYPE n)
{
TYPE k = 1, nrev = 0, i, nrevbit1, nrevbit2;
int count;
for(i = 0; !i || (1 << i && (1 << i) != 1); i+=2)
{
/*In each iteration, we swap one bit
on the 'right half' of the number with another
on the left half*/
k = 1<<i; /*this is used to find how many positions
to the left (or right, for the other bit)
we gotta move the bits in this iteration*/
count = 0;
while(k << 1 && k << 1 != 1)
{
k <<= 1;
count++;
}
nrevbit1 = n & (1<<(i/2));
nrevbit1 <<= count;
nrevbit2 = n & 1<<((i/2) + count);
nrevbit2 >>= count;
nrev |= nrevbit1;
nrev |= nrevbit2;
}
return nrev;
}
This works fine in gcc under Windows, but I'm not sure if it's completely platform independent. A few places of concern are:
the condition in the for loop - it assumes that when you left shift 1 beyond the leftmost bit, you get either a 0 with the 1 'falling out' (what I'd expect and what good old Turbo C gives iirc), or the 1 circles around and you get a 1 (what seems to be gcc's behaviour).
the condition in the inner while loop: see above. But there's a strange thing happening here: in this case, gcc seems to let the 1 fall out and not circle around!
The code might prove cryptic: if you're interested and need an explanation please don't hesitate to ask - I'll put it up someplace.
#ΤΖΩΤΖΙΟΥ
In reply to ΤΖΩΤΖΙΟΥ 's comments, I present modified version of above which depends on a upper limit for bit width.
#include <stdio.h>
#include <stdint.h>
typedef int32_t TYPE;
TYPE reverse(TYPE x, int bits)
{
TYPE m=~0;
switch(bits)
{
case 64:
x = (x&0xFFFFFFFF00000000&m)>>16 | (x&0x00000000FFFFFFFF&m)<<16;
case 32:
x = (x&0xFFFF0000FFFF0000&m)>>16 | (x&0x0000FFFF0000FFFF&m)<<16;
case 16:
x = (x&0xFF00FF00FF00FF00&m)>>8 | (x&0x00FF00FF00FF00FF&m)<<8;
case 8:
x = (x&0xF0F0F0F0F0F0F0F0&m)>>4 | (x&0x0F0F0F0F0F0F0F0F&m)<<4;
x = (x&0xCCCCCCCCCCCCCCCC&m)>>2 | (x&0x3333333333333333&m)<<2;
x = (x&0xAAAAAAAAAAAAAAAA&m)>>1 | (x&0x5555555555555555&m)<<1;
}
return x;
}
int main(int argc, char**argv)
{
TYPE x;
TYPE b = (TYPE)-1;
int bits;
if ( argc != 2 )
{
printf("Usage: %s hexadecimal\n", argv[0]);
return 1;
}
for(bits=1;b;b<<=1,bits++);
--bits;
printf("TYPE has %d bits\n", bits);
sscanf(argv[1],"%x", &x);
printf("0x%x\n",reverse(x, bits));
return 0;
}
Notes:
gcc will warn on the 64bit constants
the printfs will generate warnings too
If you need more than 64bit, the code should be simple enough to extend
I apologise in advance for the coding crimes I committed above - mercy good sir!
There's a nice collection of "Bit Twiddling Hacks", including a variety of simple and not-so simple bit reversing algorithms coded in C at http://graphics.stanford.edu/~seander/bithacks.html.
I personally like the "Obvious" algorigthm (http://graphics.stanford.edu/~seander/bithacks.html#BitReverseObvious) because, well, it's obvious. Some of the others may require less instructions to execute. If I really need to optimize the heck out of something I may choose the not-so-obvious but faster versions. Otherwise, for readability, maintainability, and portability I would choose the Obvious one.
Here is a more generally useful variation. Its advantage is its ability to work in situations where the bit length of the value to be reversed -- the codeword -- is unknown but is guaranteed not to exceed a value we'll call maxLength. A good example of this case is Huffman code decompression.
The code below works on codewords from 1 to 24 bits in length. It has been optimized for fast execution on a Pentium D. Note that it accesses the lookup table as many as 3 times per use. I experimented with many variations that reduced that number to 2 at the expense of a larger table (4096 and 65,536 entries). This version, with the 256-byte table, was the clear winner, partly because it is so advantageous for table data to be in the caches, and perhaps also because the processor has an 8-bit table lookup/translation instruction.
const unsigned char table[] = {
0x00,0x80,0x40,0xC0,0x20,0xA0,0x60,0xE0,0x10,0x90,0x50,0xD0,0x30,0xB0,0x70,0xF0,
0x08,0x88,0x48,0xC8,0x28,0xA8,0x68,0xE8,0x18,0x98,0x58,0xD8,0x38,0xB8,0x78,0xF8,
0x04,0x84,0x44,0xC4,0x24,0xA4,0x64,0xE4,0x14,0x94,0x54,0xD4,0x34,0xB4,0x74,0xF4,
0x0C,0x8C,0x4C,0xCC,0x2C,0xAC,0x6C,0xEC,0x1C,0x9C,0x5C,0xDC,0x3C,0xBC,0x7C,0xFC,
0x02,0x82,0x42,0xC2,0x22,0xA2,0x62,0xE2,0x12,0x92,0x52,0xD2,0x32,0xB2,0x72,0xF2,
0x0A,0x8A,0x4A,0xCA,0x2A,0xAA,0x6A,0xEA,0x1A,0x9A,0x5A,0xDA,0x3A,0xBA,0x7A,0xFA,
0x06,0x86,0x46,0xC6,0x26,0xA6,0x66,0xE6,0x16,0x96,0x56,0xD6,0x36,0xB6,0x76,0xF6,
0x0E,0x8E,0x4E,0xCE,0x2E,0xAE,0x6E,0xEE,0x1E,0x9E,0x5E,0xDE,0x3E,0xBE,0x7E,0xFE,
0x01,0x81,0x41,0xC1,0x21,0xA1,0x61,0xE1,0x11,0x91,0x51,0xD1,0x31,0xB1,0x71,0xF1,
0x09,0x89,0x49,0xC9,0x29,0xA9,0x69,0xE9,0x19,0x99,0x59,0xD9,0x39,0xB9,0x79,0xF9,
0x05,0x85,0x45,0xC5,0x25,0xA5,0x65,0xE5,0x15,0x95,0x55,0xD5,0x35,0xB5,0x75,0xF5,
0x0D,0x8D,0x4D,0xCD,0x2D,0xAD,0x6D,0xED,0x1D,0x9D,0x5D,0xDD,0x3D,0xBD,0x7D,0xFD,
0x03,0x83,0x43,0xC3,0x23,0xA3,0x63,0xE3,0x13,0x93,0x53,0xD3,0x33,0xB3,0x73,0xF3,
0x0B,0x8B,0x4B,0xCB,0x2B,0xAB,0x6B,0xEB,0x1B,0x9B,0x5B,0xDB,0x3B,0xBB,0x7B,0xFB,
0x07,0x87,0x47,0xC7,0x27,0xA7,0x67,0xE7,0x17,0x97,0x57,0xD7,0x37,0xB7,0x77,0xF7,
0x0F,0x8F,0x4F,0xCF,0x2F,0xAF,0x6F,0xEF,0x1F,0x9F,0x5F,0xDF,0x3F,0xBF,0x7F,0xFF};
const unsigned short masks[17] =
{0,0,0,0,0,0,0,0,0,0X0100,0X0300,0X0700,0X0F00,0X1F00,0X3F00,0X7F00,0XFF00};
unsigned long codeword; // value to be reversed, occupying the low 1-24 bits
unsigned char maxLength; // bit length of longest possible codeword (<= 24)
unsigned char sc; // shift count in bits and index into masks array
if (maxLength <= 8)
{
codeword = table[codeword << (8 - maxLength)];
}
else
{
sc = maxLength - 8;
if (maxLength <= 16)
{
codeword = (table[codeword & 0X00FF] << sc)
| table[codeword >> sc];
}
else if (maxLength & 1) // if maxLength is 17, 19, 21, or 23
{
codeword = (table[codeword & 0X00FF] << sc)
| table[codeword >> sc] |
(table[(codeword & masks[sc]) >> (sc - 8)] << 8);
}
else // if maxlength is 18, 20, 22, or 24
{
codeword = (table[codeword & 0X00FF] << sc)
| table[codeword >> sc]
| (table[(codeword & masks[sc]) >> (sc >> 1)] << (sc >> 1));
}
}
How about:
long temp = 0;
int counter = 0;
int number_of_bits = sizeof(value) * 8; // get the number of bits that represent value (assuming that it is aligned to a byte boundary)
while(value > 0) // loop until value is empty
{
temp <<= 1; // shift whatever was in temp left to create room for the next bit
temp |= (value & 0x01); // get the lsb from value and set as lsb in temp
value >>= 1; // shift value right by one to look at next lsb
counter++;
}
value = temp;
if (counter < number_of_bits)
{
value <<= counter-number_of_bits;
}
(I'm assuming that you know how many bits value holds and it is stored in number_of_bits)
Obviously temp needs to be the longest imaginable data type and when you copy temp back into value, all the extraneous bits in temp should magically vanish (I think!).
Or, the 'c' way would be to say :
while(value)
your choice
We can store the results of reversing all possible 1 byte sequences in an array (256 distinct entries), then use a combination of lookups into this table and some oring logic to get the reverse of integer.
Here is a variation and correction to TK's solution which might be clearer than the solutions by sundar. It takes single bits from t and pushes them into return_val:
typedef unsigned long TYPE;
#define TYPE_BITS sizeof(TYPE)*8
TYPE reverser(TYPE t)
{
unsigned int i;
TYPE return_val = 0
for(i = 0; i < TYPE_BITS; i++)
{/*foreach bit in TYPE*/
/* shift the value of return_val to the left and add the rightmost bit from t */
return_val = (return_val << 1) + (t & 1);
/* shift off the rightmost bit of t */
t = t >> 1;
}
return(return_val);
}
The generic approach hat would work for objects of any type of any size would be to reverse the of bytes of the object, and the reverse the order of bits in each byte. In this case the bit-level algorithm is tied to a concrete number of bits (a byte), while the "variable" logic (with regard to size) is lifted to the level of whole bytes.
Here's my generalization of freespace's solution (in case we one day get 128-bit machines). It results in jump-free code when compiled with gcc -O3, and is obviously insensitive to the definition of foo_t on sane machines. Unfortunately it does depend on shift being a power of 2!
#include <limits.h>
#include <stdio.h>
typedef unsigned long foo_t;
foo_t reverse(foo_t x)
{
int shift = sizeof (x) * CHAR_BIT / 2;
foo_t mask = (1 << shift) - 1;
int i;
for (i = 0; shift; i++) {
x = ((x & mask) << shift) | ((x & ~mask) >> shift);
shift >>= 1;
mask ^= (mask << shift);
}
return x;
}
int main() {
printf("reverse = 0x%08lx\n", reverse(0x12345678L));
}
In case bit-reversal is time critical, and mainly in conjunction with FFT, the best is to store the whole bit reversed array. In any case, this array will be smaller in size than the roots of unity that have to be precomputed in FFT Cooley-Tukey algorithm. An easy way to compute the array is:
int BitReverse[Size]; // Size is power of 2
void Init()
{
BitReverse[0] = 0;
for(int i = 0; i < Size/2; i++)
{
BitReverse[2*i] = BitReverse[i]/2;
BitReverse[2*i+1] = (BitReverse[i] + Size)/2;
}
} // end it's all

Resources