Performance of bitwise operators in C

What is the fastest way to make the last 2 bits of a byte zero?
x = x >> 2 << 2;
OR
x &= 252;
Is there a better way?

It depends on many factors, including the compiler and the machine architecture (i.e. the processor).
My experience is that
x &= 252; // or...
x &= ~3;
are more efficient (and faster) than
x = x >> 2 << 2;

If your compiler is smart enough, it might replace
x = x >> 2 << 2;
by
x &= ~3;
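The latter is faster than the former when the compiler does not make that substitution, because the latter is only one machine instruction, while the former is two. On most processors, simple bit manipulation instructions such as AND and shifts can be expected to execute in a single cycle, although this is not guaranteed on every architecture.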
Note:
The expression ~3 is the correct way to say: A bit mask with all bits set but the last two. For a one-byte type, this is equivalent to using 252 as you did, but ~3 will work for all types up to int. If you need to specify such a bitmask for a larger type like a long, add the appropriate suffix to the number, ~3l in the case of a long.
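As a minimal sketch of the forms discussed above (the function names are made up for illustration and are not from the original answers), the mask and shift variants can be written side by side. An unsigned mask has to be as wide as the operand, which is why the unsigned long version uses the UL suffix:

```c
#include <stdint.h>

/* Clear the two least significant bits of a byte with a single AND. */
static inline uint8_t clear_low2_u8(uint8_t x)
{
    return (uint8_t)(x & ~3u);
}

/* Same idea for a wider type: an unsigned mask must be as wide as the
   operand, hence the UL suffix. */
static inline unsigned long clear_low2_ul(unsigned long x)
{
    return x & ~3UL;
}

/* Shift-based form, for comparison; same result for unsigned values. */
static inline uint8_t clear_low2_shift(uint8_t x)
{
    return (uint8_t)((x >> 2) << 2);
}
```

Whether the shift form compiles to one instruction or two depends on the compiler and optimisation level, so inspecting the generated assembly is the only reliable way to compare them.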

Related

Is there a better way to define a preprocessor macro for doing bit manipulation?

Take macro:
GPIOxMODE(gpio,mode,port) ( GPIO##gpio->MODER = ((GPIO##gpio->MODER & ~((uint32_t)GPIO2BITMASK << (port*2))) | (mode << (port * 2))) )
Assuming that the reset value of the register is 0xFFFFFFFF, I want to set a 2-bit-wide field to an arbitrary value. This was written for an STM32 MCU that has 16 pins per port (0 to 15). GPIO2BITMASK is defined as 0x3. Is there a better way of clearing and setting an arbitrary 2 bits anywhere in the 32-bit wide register?
Valid range for port 0 - 15
Valid range for mode 0 - 3
The method I came up with is to bit shift the mask, invert it, bitwise AND it with the existing register value, and bitwise OR the result with a bit-shifted new value.
I am looking to combine the mask and new value to reduce the number of logical operations and bit shift operations. The goal is also to keep the process generic enough that I can use it for bit operations of 1, 2, 3 or 4 bit widths.
Is there a better way?
In the long and short of it, "is there a better way" is really an open question. I am looking specifically for a method that will reduce the number of logical operations and bit shift operations, while being a simple one-line statement.
The answer is NO.
You MUST do reset/set to ensure that the bit field you are writing to has the desired value.
The answers received can be better (as a matter of opinion/preference/philosophy/practice) in that they aren't necessarily macros and have parameter checking. Also, pitfalls of this style have been pointed out in both the comments and responses.

This kind of macro should be avoided like the plague, for many reasons:
They are not debuggable
They are error prone, and the errors are hard to find
and many other reasons
You can achieve the same result using inline functions. The resulting code will be just as efficient:
static inline __attribute__((always_inline)) void GPIOMODE(GPIO_TypeDef *gpio, unsigned mode, unsigned pin)
{
gpio -> MODER &= ~(GPIO_MODER_MODE0_Msk << (pin * 2));
gpio -> MODER |= mode << (pin * 2);
}
but if you love macros
#define GPIOxMODE(gpio,mode,port) {volatile uint32_t *mdr = &GPIO##gpio->MODER; *mdr &= ~(GPIO_MODER_MODE0_Msk << (port*2)); *mdr |= mode << (port * 2);}
I am looking to combine the mask and new value to reduce the number of
logical operations bit shift operations.
You can't. You need to reset and then set the bits.

The method I came up with is to bit shift the mask, invert it,
logically AND it with the existing register value, logically OR the
result with a bit shifted new value.
That or an equivalent is the way to do it.
I am looking to combine the mask and new value to reduce the number of
logical operations bit shift operations. The goal is also keep the
process generic enough so that I can use for bit operations of 1,2,3
or 4 bit widths.
Is there a better way?
You must accomplish two basic objectives:
ensure that the bits that should be off in the affected range are in fact off, and
ensure that the bits that should be on in the affected range are in fact on.
In the general case, those require two separate operations: a bitwise AND to force bits off, and a bitwise OR (or XOR, if the bits are first cleared) to turn the wanted bits on. There may be ways to shortcut for specific cases of original and target values, but if you want something general-purpose, as you say, then your options are limited.
Personally, though, I think I would be inclined to build it from multiple pieces, separating the GPIO selection from the actual computation. At minimum, you can separate out a generic macro for setting a range of bits:
#define SETBITS32(x,bits,offset,mask) ((((uint32_t)(x)) & ~(((uint32_t)(mask)) << (offset))) | (((uint32_t)(bits)) << (offset)))
#define GPIOxMODE(gpio,mode,port) (GPIO##gpio->MODER = SETBITS32(GPIO##gpio->MODER, mode, (port) * 2, GPIO2BITMASK))
But do note that there appears to be no good way to avoid such a macro evaluating some of its arguments more than once. It might therefore be safer to write SETBITS32 as a function instead. The compiler will probably inline such a function in any case, but you can maximize the likelihood of that by declaring it static and inline:
static inline uint32_t SETBITS32(uint32_t x, uint32_t bits, unsigned offset, uint32_t mask) {
return x & ~(mask << offset) | (bits << offset);
}
That's easier to read, too, though it, like the macro, does assume that bits has no set bits outside the mask region.
Of course there are other, similar formulations. For instance, if you do not need to support discontinuous bit ranges, you might specify a bit count instead of a bit mask. This alternative does that, protects against the user providing bits outside the specified range, and also has some parameter validation:
static inline uint32_t set_bitrange_32(uint32_t x, uint32_t bits, unsigned width,
unsigned offset) {
if (width + offset > 32) {
// error: invalid parameters
return x;
} else if (width == 0) {
return x;
}
uint32_t mask = ~(uint32_t)0 >> (32 - width);
return x & ~(mask << offset) | ((bits & mask) << offset);
}
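To show how such a function might be used for the 2-bit mode fields from the question, here is a small sketch that works on a plain uint32_t rather than a real register. The helper name moder_with_pin_mode is hypothetical, and set_bitrange_32 is repeated (with the two early-return checks merged) only so that the block compiles on its own:

```c
#include <stdint.h>

/* Same logic as set_bitrange_32 above, repeated so the sketch compiles alone. */
static inline uint32_t set_bitrange_32(uint32_t x, uint32_t bits,
                                       unsigned width, unsigned offset)
{
    if (width == 0 || width + offset > 32) {
        return x;   /* empty or invalid range: leave x unchanged */
    }
    uint32_t mask = ~(uint32_t)0 >> (32 - width);
    return (x & ~(mask << offset)) | ((bits & mask) << offset);
}

/* Hypothetical helper: return a MODER-style value with the 2-bit mode
   of one pin (0..15) replaced. */
static inline uint32_t moder_with_pin_mode(uint32_t moder, unsigned pin,
                                           uint32_t mode)
{
    return set_bitrange_32(moder, mode, 2, pin * 2);
}
```

On real hardware the result would be written back to the register in a single assignment, for example GPIOA->MODER = moder_with_pin_mode(GPIOA->MODER, 3, 1); assuming the usual STM32 device header is in use.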

Rotate right by n only using bitwise operators in C

I'm trying to implement a rotateRight by n function in C by only using bitwise operators.
So far, I have settled on using this.
y = x >> n
z = x << (32 - n)
g = y | z
So take for instance the value 11010011
If I were to try rotateRight(5):
y becomes 11111110
z becomes 10011000
Then g becomes 11111110
However the correct answer should be 10011110
This almost works, but the problem is that the right-shift copies the sign bit when I need it to perform logical shift, and so some of my answers are the negative of what they should be. How can I fix this?
Note
I am unable to use casting or unsigned types

You could shift unsigned values:
y = (int)((unsigned)x >> n);
z = x << (32 - n);
g = y | z;
Or, you could mask appropriately:
y = (x >> n) & ~(-1 << (32 - n));
z = x << (32 - n);
g = y | z;

Though @jlahd's answer is correct, I will try to provide a brief explanation of the difference between a logical shift right and an arithmetic shift right (another nice diagram of the difference can be found here).
Please read the links first and then if you're still confused read below:
Brief explanation of the two different shifts right
Now, if you declare your variable as int x = 8; the C compiler knows that this number is signed and when you use a shift operator like this:
int x = 8;
int y = -8;
int shifted_x, shifted_y;
shifted_x = x >> 2; // After this operation shifted_x == 2
shifted_y = y >> 2; // After this operation shifted_y == -2
The reason for this is that a shift right represents a division by a power of 2.
Now, I'm lazy so lets make int's on my hypothetical machine 8 bits so I can save myself some writing. In binary 8 and -8 would look like this:
8 = 00001000
-8 = 11111000 ( invert and add 1 for two's complement representation )
But in computing the binary number 11111000 is 248 in decimal. It can only represent -8 if we remember that that variable has a sign...
If we want to keep the nice property of a shift where the shift represents a division by a power of 2 (this is really useful) and we want to now have signed numbers, we need to make two different types of right shifts because
248 >> 1 = 124 = 01111100
-8 >> 1 = -4 = 11111100
// And for comparison
8 >> 1 = 4 = 00000100
We can see that the first shift inserted a 0 at the front while the second shift inserted a 1. This is because of the difference between the signed numbers and unsigned numbers, in two's complement representation, when dividing by a power of 2.
To keep this nicety we have two different right shift operators for signed and unsigned variables. In assembly you can explicitly state which you wish to use while in C the compiler decides for you based on the declared type.
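The 8-bit figures above can be reproduced with a short sketch. The function names are invented for illustration, and the arithmetic shift is emulated on unsigned values so that the result does not depend on how a particular compiler shifts negative numbers (n is assumed to be in the range 0 to 7):

```c
#include <stdint.h>

/* Logical shift: unsigned operands always get zeros shifted in from the left. */
static inline uint8_t shr_logical8(uint8_t v, unsigned n)
{
    return (uint8_t)(v >> n);
}

/* Arithmetic shift emulated with well-defined operations only:
   shift logically, then fill the vacated top bits with copies of the sign bit. */
static inline uint8_t shr_arithmetic8(uint8_t v, unsigned n)
{
    uint8_t shifted = (uint8_t)(v >> n);
    if (v & 0x80u) {
        shifted |= (uint8_t)~(0xFFu >> n);
    }
    return shifted;
}
```

With these, the bit pattern 11111000 becomes 01111100 (124) under the logical shift and 11111100 (252, which reads as -4 in two's complement) under the arithmetic one.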
Code generalisation
I would write the code a little differently in an attempt to keep myself at least a little platform agnostic.
#define ROTR(x,n) (((x) >> (n)) | ((x) << ((sizeof(x) * 8) - (n))))
This is a little better but you still have to remember to keep the variables unsigned when using this macro. I could try casting the macro like this:
#define ROTR(x,n) (((size_t)(x) >> (n)) | ((size_t)(x) << ((sizeof(x) * 8) - (n))))
but now I'm assuming that you're never going to try and rotate an integer larger than size_t...
In order to get rid of the upper bits of the right shift which may be 1's or 0's depending on the type of shift the compiler chooses one might try the following (which satisfies your no casting requirement):
#define ROTR(x,n) ((((x) >> (n)) & (~(0u) >> (n))) | ((x) << ((sizeof(x) * 8) - (n))))
But it would not work as expected for the long type, since ~(0u) is of type unsigned int (the first type in the table that zero fits in) and hence restricts us to rotations of fewer than sizeof(unsigned int) * 8 bits. In that case we could use ~(0ul), but that makes it of unsigned long type, which may be inefficient on your platform, and what do we do if you want to pass in a long long? We would need it to be of the same type as x, and we could achieve that with more magical expressions like ~((x)^(x)), but we would still need to turn it into an unsigned version, so let's not go there.
@MattMcNabb also points out in the comments two more problems:
Our left shift operation could overflow. When operating on signed types, even though in practice the result is most often the same, we need to cast the x in the left shift operation to an unsigned type, because it is undefined behavior when an arithmetic shift operation overflows (see this answer's reference to the standard). But if we cast it we will once again need to pick a suitable type for the cast, because its size in bytes will act as an upper limit on what we can rotate...
We are assuming that bytes have 8 bits. Which is not always the case, and we should use CHAR_BIT instead of 8.
In which case why bother? Why not go back to the previous solution and just use the largest integer type, uintmax_t (C99), instead of size_t. But this now means that we could be penalized in performance since we might be using integers larger than the processor word and that could involve more than just one assembly instruction per arithmetic operation... Nevertheless here it is:
#define ROTR(x,n) (((uintmax_t)(x) >> (n)) | ((uintmax_t)(x) << ((sizeof(x) * CHAR_BIT) - (n))))
So really, there is likely no perfect way to do it (at least none that I can think of). You can either have it work for all types or have it be fast by only dealing with things equal to or smaller than the processor word (eliminate long long and the likes). But this is nice and generic and should adhere to the standard...
If you want fast algorithms there is a point where you need to know what machine/s you're writing code for otherwise you can't optimize.
So in the end @jlahd's solution will work better, whilst mine might help you make things more generic (at a cost).
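For the case where the width is known and unsigned types are allowed (which the original question ruled out, so this is an alternative rather than a solution to it), a fixed-width rotate can be sketched as follows. The names rotr32 and rotr8 are illustrative; masking the left shift count keeps it in range even when n is 0:

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits; n is reduced modulo 32. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;
    /* (32 - n) & 31 keeps the left shift count in 0..31, so n == 0 is safe. */
    return (x >> n) | (x << ((32u - n) & 31u));
}

/* 8-bit variant matching the 11010011 example from the question. */
static inline uint8_t rotr8(uint8_t x, unsigned n)
{
    n &= 7u;
    return (uint8_t)((x >> n) | (x << ((8u - n) & 7u)));
}
```

Rotating 11010011 right by 5 with rotr8 gives 10011110, the value the question expected.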

I've tried your code on x86 Linux with gcc 4.6.3.
y = x >> n
z = x << (32 - n)
g = y | z
This works correctly. If x equals 11010011 then rotateRight(5) makes y become 00000110; ">>" will not shift in 1s here, because as a 32-bit int this value is positive.

Convert two 8-bit uint to one 12-bit uint

I'm reading two registers from a microcontroller. One holds the 4-bit MSB (its other 4 bits hold other things) and the other holds the 8-bit LSB. I want to combine them into one 12-bit uint (stored in 16 bits, to be precise). So far I have done it like this:
UINT16 x;
UINT8 RegValue = 0;
UINT8 RegValue1 = 0;
ReadRegister(Register01, &RegValue1);
ReadRegister(Register02, &RegValue);
x = RegValue1 & 0x000F;
x = x << 8;
x = x | (RegValue & 0x00FF);
Is there any better way to do that?
/* To be more precise, ReadRegister is I2C communication to another ADC. Register01 and Register02 are different addresses. RegValue1 is 8 bits wide but only its 4 LSBs are needed; they are concatenated with RegValue (the 4 LSBs of RegValue1 followed by all 8 bits of RegValue). */

If you know the endianness of your machine, you can read the bytes
directly into x like this:
ReadRegister(Register01, (UINT8*)&x + 1);
ReadRegister(Register02, (UINT8*)&x);
x &= 0xfff;
Note that this is not portable and the performance gain (if any) will
likely be small.

The RegValue & 0x00FF mask is unnecessary since RegValue is already 8 bit.
Breaking it down into three statements may be good for clarity, but this expression is probably simple enough to implement in one statement:
x = ((RegValue1 & 0x0Fu) << 8u) | RegValue ;
The use of an unsigned literal (0x0Fu) makes little difference but emphasises that we are dealing with unsigned 8-bit data. It is in fact an unsigned int even with only two digits, but again this emphasises to the reader perhaps that we are only dealing with 8 bits, and is purely stylistic rather than semantic. In C there is no 8-bit literal constant type (though in C++ '\x0f' has type char). You can force better type agreement as follows:
#define LS4BITMASK ((UINT8)0x0fu)
x = ((RegValue1 & LS4BITMASK) << 8u) | RegValue ;
The macro merely avoids repetition and clutter in the expression.
None of the above is necessarily "better" than your original code in terms of performance or actual generated code, and is largely a matter of preference or local coding standards or practices.
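Wrapped in a function, the one-statement form might look like the sketch below. It uses the standard uint8_t and uint16_t types in place of the UINT8 and UINT16 typedefs from the question, and the name combine12 is made up for illustration:

```c
#include <stdint.h>

/* Combine the low 4 bits of msb_reg with all 8 bits of lsb_reg
   into a 12-bit value held in a uint16_t. */
static inline uint16_t combine12(uint8_t msb_reg, uint8_t lsb_reg)
{
    return (uint16_t)(((uint16_t)(msb_reg & 0x0Fu) << 8) | lsb_reg);
}
```

For example, combine12(0xAB, 0xCD) discards the upper nibble of the first byte and yields 0x0BCD.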

If the registers are adjacent to each other, they will most likely also be in the correct order with respect to target endianness. That being the case, they can be read as a single 16-bit register and masked accordingly, assuming that Register01 is the lower address value:
ReadRegister16(Register01, &x ) ;
x &= 0x0fffu ;
Of course I have invented here the ReadRegister16() function, but if the registers are memory mapped, and Register01 is simply an address then this may simply be:
UINT16 x = *Register01 ;
x &= 0x0fffu ;

Explain this Function

Can someone explain to me the reason why someone would want to use bitwise comparison?
example:
int f(int x) {
return x & (x-1);
}
int main(){
printf("F(10) = %d", f(10));
}
This is what I really want to know: "Why check for common set bits"
x is any positive number.

Bitwise operations are used for three reasons:
You can use the least possible space to store information
You can compare/modify an entire register (e.g. 32, 64, or 128 bits depending on your processor) in a single CPU instruction, usually taking a single clock cycle. That means you can do a lot of work (of certain types) blindingly fast compared to regular arithmetic.
It's cool, fun and interesting. Programmers like these things, and they can often be the differentiator when there is no difference between techniques in terms of efficiency/performance.
You can use this for all kinds of very handy things. For example, in my database I can store a lot of true/false information about my customers in a tiny space (a single byte can store 8 different true/false facts) and then use '&' operations to query their status:
Is my customer Male and Single and a Smoker?
if ((customerFlags & (maleFlag | singleFlag | smokerFlag)) ==
(maleFlag | singleFlag | smokerFlag))
Is my customer (any combination of) Male or Single or a Smoker?
if ((customerFlags & (maleFlag | singleFlag | smokerFlag)) != 0)
Is my customer not Male and not Single and not a Smoker?
if ((customerFlags & (maleFlag | singleFlag | smokerFlag)) == 0)
Aside from just "checking for common bits", you can also do:
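Certain arithmetic, e.g. value & 15 is equivalent to value % 16 for non-negative values, and can be faster when the compiler does not already make that substitution itself. This only works when the divisor is a power of 2, but if you can use it, it can be a great optimisation.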
Data packing/unpacking. e.g. a colour is often expressed as a 32-bit integer that contains Alpha, Red, Green and Blue byte values. The Red value might be extracted with an expression like red = (value >> 16) & 255; (shift the value down 16 bit positions and then carve off the bottom byte)
Data manipulation and swizzling. Some clever tricks can be achieved with bitwise operations. For example, swapping two integer values without needing to use a third temporary variable, or converting ARGB colour values into another format (e.g RGBA or BGRA)
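A minimal sketch of the flag queries and the colour unpacking described above follows. The flag names, their bit positions and the helper functions are assumptions made for illustration. Note that the result of & is parenthesised before comparing, because == and != bind more tightly than & in C:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical flag layout: one bit per true/false fact. */
enum { FLAG_MALE = 1 << 0, FLAG_SINGLE = 1 << 1, FLAG_SMOKER = 1 << 2 };

/* True if every bit in wanted is set in flags. */
static inline bool has_all(uint8_t flags, uint8_t wanted)
{
    return (flags & wanted) == wanted;
}

/* True if at least one bit in wanted is set in flags. */
static inline bool has_any(uint8_t flags, uint8_t wanted)
{
    return (flags & wanted) != 0;
}

/* Extract the red byte from a 32-bit ARGB colour. */
static inline uint8_t argb_red(uint32_t argb)
{
    return (uint8_t)((argb >> 16) & 255u);
}

/* Convert ARGB to RGBA by moving the alpha byte from the top to the bottom. */
static inline uint32_t argb_to_rgba(uint32_t argb)
{
    return (argb << 8) | (argb >> 24);
}
```

The "none of these" query is simply the negation of has_any.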

The Ur-example is "testing if a number is even or odd":
unsigned int number = ...;
bool isOdd = (0 != (number & 1));
More complex uses include bitmasks (multiple boolean values in a single integer, each one taking up one bit of space) and encryption/hashing (which frequently involve bit shifting, XOR, etc.)

The example you've given is kinda odd, but I'll use bitwise comparisons all the time in embedded code.
I'll often have code that looks like the following:
volatile uint32_t *flags = (volatile uint32_t *)0x000A000;
bool flagA = *flags & 0x1;
bool flagB = *flags & 0x2;
bool flagC = *flags & 0x4;

It's not a bitwise comparison. It doesn't return a boolean.
Bitwise operators are used to read and modify individual bits of a number.
n & 0x8 // Peek at bit3
n |= 0x8 // Set bit3
n &= ~0x8 // Clear bit3
n ^= 0x8 // Toggle bit3
Bits are used in order to save space. 8 chars takes a lot more memory than 8 bits in a char.
The following example gets the range of an IP subnet, given an IP address in the subnet and the subnet mask.
uint32_t mask = (((((255u << 8) | 255) << 8) | 255) << 8) | 255;
uint32_t ip = (((((192u << 8) | 168) << 8) | 3) << 8) | 4;
uint32_t first = ip & mask;
uint32_t last = ip | ~mask;
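The same calculation can be sketched with a small helper that builds an address from its four octets. The function names are illustrative, and a 255.255.255.0 mask is assumed here so that the first and last addresses actually differ:

```c
#include <stdint.h>

/* Pack four octets into a 32-bit IPv4 address, most significant octet first. */
static inline uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* First address of the subnet: host bits forced to 0. */
static inline uint32_t subnet_first(uint32_t ip, uint32_t mask)
{
    return ip & mask;
}

/* Last address of the subnet: host bits forced to 1. */
static inline uint32_t subnet_last(uint32_t ip, uint32_t mask)
{
    return ip | ~mask;
}
```

For 192.168.3.4 with that mask, the range runs from 192.168.3.0 to 192.168.3.255.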

e.g. if you have a number of status flags in order to save space you may want to put each flag as a bit.
so x, if declared as a byte, would have 8 flags.

I think you mean bitwise combination (in your case a bitwise AND operation). This is a very common operation in those cases where the byte, word or dword value is handled as a collection of bits, eg status information, eg in SCADA or control programs.

Your example tests whether x has at most 1 bit set. f returns 0 if x is a power of 2 and non-zero if it is not.

Your particular example clears the lowest set bit of x, so the result is non-zero exactly when x has two or more bits set.
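Two common uses of the x & (x - 1) expression from the question are sketched below, with illustrative function names. Since the expression clears the lowest set bit, it can test for a power of two and count set bits:

```c
#include <stdbool.h>

/* A power of two has exactly one bit set, so clearing the lowest set bit
   leaves zero. Zero itself is excluded explicitly. */
static inline bool is_power_of_two(unsigned x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Count set bits by clearing the lowest one until none remain. */
static inline unsigned count_set_bits(unsigned x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}
```

For the value in the question, 10 is 1010 in binary and 9 is 1001, so f(10) returns 1000, which is 8.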

bitwise indexing in C?

I'm trying to implement a data compression idea I've had, and since I'm imagining running it against a large corpus of test data, I had thought to code it in C (I mostly have experience in scripting languages like Ruby and Tcl.)
Looking through the O'Reilly 'cow' books on C, I realize that I can't simply index the bits of a simple 'char' or 'int' type variable as I'd like to to do bitwise comparisons and operators.
Am I correct in this perception? Is it reasonable for me to use an enumerated type for representing a bit (and make an array of these, and writing functions to convert to and from char)? If so, is such a type and functions defined in a standard library already somewhere? Are there other (better?) approaches? Is there some example code somewhere that someone could point me to?
Thanks -

Following on from what Kyle has said, you can use a macro to do the hard work for you.
It is possible.
To set the nth bit, use OR:
x |= (1 << 5); // sets the 6th-from-right
To clear a bit, use AND:
x &= ~(1 << 5); // clears 6th-from-right
To flip a bit, use XOR:
x ^= (1 << 5); // flips 6th-from-right
Or...
#define GetBit(var, bit) (((var) & (1 << (bit))) != 0) // Returns true / false if bit is set
#define SetBit(var, bit) ((var) |= (1 << (bit)))
#define FlipBit(var, bit) ((var) ^= (1 << (bit)))
Then you can use it in code like:
int myVar = 0;
SetBit(myVar, 5);
if (GetBit(myVar, 5))
{
// Do something
}

It is possible.
To set the nth bit, use OR:
x |= (1 << 5); // sets bit 5 (the 6th from the right)
To clear a bit, use AND:
x &= ~(1 << 5); // clears bit 5 (the 6th from the right)
To flip a bit, use XOR:
x ^= (1 << 5); // flips bit 5 (the 6th from the right)
To get the value of a bit use shift and AND:
(x & (1 << 5)) >> 5 // gets the value (0 or 1) of bit 5 (the 6th from the right)
note: the shift right 5 is to ensure the value is either 0 or 1. If you're just interested in 0/not 0, you can get by without the shift.
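The four operations above can also be sketched as inline functions instead of expressions with a hard-coded 5. The names are illustrative, the bit index n counts from 0 at the least significant bit, and an unsigned 32-bit type is used so that bit 31 is safe to touch:

```c
#include <stdint.h>

/* Return x with bit n set. */
static inline uint32_t bit_set(uint32_t x, unsigned n)
{
    return x | ((uint32_t)1 << n);
}

/* Return x with bit n cleared. */
static inline uint32_t bit_clear(uint32_t x, unsigned n)
{
    return x & ~((uint32_t)1 << n);
}

/* Return x with bit n inverted. */
static inline uint32_t bit_flip(uint32_t x, unsigned n)
{
    return x ^ ((uint32_t)1 << n);
}

/* Return the value (0 or 1) of bit n. */
static inline unsigned bit_get(uint32_t x, unsigned n)
{
    return (unsigned)((x >> n) & 1u);
}
```

Unlike macros, these evaluate each argument exactly once, so a call such as bit_set(x, i++) behaves predictably.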

Have a look at the answers to this question.

Theory
There is no C syntax for accessing or setting the n-th bit of a built-in datatype (e.g. a 'char'). However, you can access bits using a bitwise AND operation, and set bits using a bitwise OR operation.
As an example, say that you have a variable that holds 1101 and you want to check the 2nd bit from the left. Simply perform a logical AND with 0100:
1101
0100
---- AND
0100
If the result is non-zero, then the 2nd bit must have been set; otherwise it was not set.
If you want to set the 3rd bit from the left, then perform a logical OR with 0010:
1101
0010
---- OR
1111
You can use the C operators & (for bitwise AND) and | (for bitwise OR) to perform these tasks. Note that && and || are the logical operators and do not work on individual bits. You will need to construct the bit access patterns (the 0100 and 0010 in the above examples) yourself. The trick is to remember that the least significant bit (LSB) counts 1s, the next LSB counts 2s, then 4s etc. So, the bit access pattern for the n-th LSB (starting at 0) is simply the value of 2^n. The easiest way to compute this in C is to shift the binary value 0001 (in this four bit example) to the left by the required number of places. As this value is always equal to 1 in unsigned integer-like quantities, this is just '1 << n'.
Example
unsigned char myVal = 0x65; /* in hex; this is 01100101 in binary. */
/* Q: is the 3-rd least significant bit set (again, the LSB is the 0th bit)? */
unsigned char pattern = 1;
pattern <<= 3; /* Shift pattern left by three places.*/
if(myVal & pattern) {printf("Yes!\n");} /* Perform the test with bitwise AND. */
/* Set the most significant bit. */
myVal |= (char)(1<<7);
This example hasn't been tested, but should serve to illustrate the general idea.

To query state of bit with specific index:
int index_state = variable & ( 1 << bit_index );
To set bit:
variable |= 1 << bit_index;
To reset (clear) a bit:
variable &= ~( 1 << bit_index );

Try using bitfields. Be careful: the layout is implementation-defined and can vary by compiler.
http://publications.gbdirect.co.uk/c_book/chapter6/bitfields.html

If you want to index a bit you could:
bit = (ch & 0x80) >> 7;
This gets the MSB of a char named ch. You could even leave out the right shift and test against 0:
bit = ch & 0x80;
If the bit is set, the result will be non-zero.
Obviously, you need to change the mask to get different bits (NB: the 0x80 is the bit mask, if it is unclear). It is possible to define numerous masks, e.g.
#define BIT_0 0x1 // or 1 << 0
#define BIT_1 0x2 // or 1 << 1
#define BIT_2 0x4 // or 1 << 2
#define BIT_3 0x8 // or 1 << 3
etc...
This gives you:
bit = ch & BIT_1;
You can use these definitions in the above code to successfully index a bit within either a macro or a function.
To set a bit:
ch |= BIT_2;
To clear a bit:
ch &= ~BIT_3;
To toggle a bit:
ch ^= BIT_4;
Does this help?

Individual bits can be indexed as follows.
Define a struct like this one:
struct
{
unsigned bit0 : 1;
unsigned bit1 : 1;
unsigned bit2 : 1;
unsigned bit3 : 1;
unsigned reserved : 28;
} bitPattern;
Now if I want to know the individual bit values of a var named "value", do the following:
CopyMemory( &bitPattern, &value, sizeof(value) ); // memcpy from <string.h> is the portable equivalent
To see if bit 2 is high or low:
int state = bitPattern.bit2;
Hope this helps.

There is a standard library container for bits: std::vector<bool>. It is specialised in the library to be space efficient. There is also a boost dynamic_bitset class.
These will let you perform operations on a set of boolean values, using one bit per value of underlying storage.
Boost dynamic bitset documentation
For the STL documentation, see your compiler documentation.
Of course, you can also address the individual bits in other integral types by hand. If you do that, you should use unsigned types so that you don't get implementation-defined behaviour if you decide to do a right shift on a value with the high bit set. However, it sounds like you want the containers.
To the commenter who claimed this takes 32x more space than necessary: boost::dynamic_bitset and vector<bool> are specialised to use one bit per entry, and so there is not a space penalty, assuming that you actually want more than the number of bits in a primitive type. These classes allow you to address individual bits in a large container with efficient underlying storage. If you just want (say) 32 bits, by all means, use an int. If you want some large number of bits, you can use a library container.
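Since the question asked about C, where those C++ containers are not available, a comparable bit array can be sketched by hand over an array of unsigned char. The function names are made up for illustration, and the caller is assumed to provide a buffer of at least (nbits + CHAR_BIT - 1) / CHAR_BIT bytes:

```c
#include <limits.h>
#include <stddef.h>

/* Set bit i of a packed bit array. */
static inline void bitarray_set(unsigned char *bits, size_t i)
{
    bits[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

/* Clear bit i of a packed bit array. */
static inline void bitarray_clear(unsigned char *bits, size_t i)
{
    bits[i / CHAR_BIT] &= (unsigned char)~(1u << (i % CHAR_BIT));
}

/* Read bit i of a packed bit array; returns 0 or 1. */
static inline int bitarray_get(const unsigned char *bits, size_t i)
{
    return (bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}
```

This stores one bit per entry, in the same spirit as the containers mentioned above, at the cost of doing the index arithmetic yourself.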
