How to get the most significant bit of an unsigned 8-bit type in C

I'm trying to get the most significant bit of an unsigned 8-bit type in C.
This is what I'm trying to do right now:
uint8_t *var = ...;
...
(*var >> 6) & 1
Is this right? If it's not, what would be?

To get the most significant bit of the byte that a uint8_t pointer points to, you need to shift by 7 bits, not 6, because the bits are numbered 0 through 7.
(*var >> 7) & 1
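As a minimal sketch of the corrected expression, the shift can be wrapped in a small function. The name msb_of_u8 is hypothetical and only used for illustration.

```c
#include <stdint.h>

/* Returns 1 if bit 7 (the most significant bit) of *var is set, else 0.
   msb_of_u8 is a hypothetical helper name, not part of any library. */
static unsigned msb_of_u8(const uint8_t *var)
{
    return (*var >> 7) & 1u;
}
```

Shifting by 6 instead would test bit 6, which is why the original attempt gives the wrong answer for values such as 0x40.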

The most standard/correct way of masking bits is to use a readable bit mask of the form 1u << bit. Any C programmer spotting 1u << n in code will know that it is a bit mask - so it is self-documenting code.
So if you want bit number 7, you would write
*var & (1u << 7)
The u suffix is important for rugged code, since you want to avoid accidental implicit promotions to signed types.

Another option is to simply apply a bit mask and check the resulting value:
*var & 0x80u // 1000 0000
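One detail worth spelling out: the masked value is either 0 or 0x80, never 1, so it usually gets compared against zero. A small sketch, assuming hypothetical names MSB_MASK, msb_masked and msb_is_set:

```c
#include <stdbool.h>
#include <stdint.h>

#define MSB_MASK (1u << 7) /* same value as 0x80u */

/* Raw masked value: either 0 or 0x80. */
static unsigned msb_masked(uint8_t v)
{
    return v & MSB_MASK;
}

/* Normalised to true/false by comparing against zero. */
static bool msb_is_set(uint8_t v)
{
    return (v & MSB_MASK) != 0;
}
```

The masked form is enough inside an if condition; the comparison only matters when the result must be exactly 0 or 1.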

Related

Casting in C: gotchas

I am programming an Atmel SAMD20 in C. I came upon an error, that I have now fixed, but I'm not quite sure why it happened in the first place. Can someone point it out to me? (it's probably far too obvious, and I'm going to facepalm later.)
An array of sensors is generating uint16_t data, which I converted to uint8_t to send over I2C. So, this is how I originally wrote it:
for (i = 0; i < SENSBUS1_COUNT; ++i)
{
write_buffer[ (i*2) ] = (uint8_t) sample_sensbus1[i] & 0xff;
write_buffer[(i*2)+1] = (uint8_t) sample_sensbus1[i] >> 8;
}
Here, write_buffer is uint8_t and sample_sensbus1 is uint16_t.
This, for some reason, ends up messing up the most significant byte (in most cases, the most significant byte is just 1 (i.e. 0x100)). This, on the other hand, works fine, and is exactly what it should be:
for (i = 0; i < SENSBUS1_COUNT; ++i)
{
write_buffer[ (i*2) ] = sample_sensbus1[i] & 0xff;
write_buffer[(i*2)+1] = sample_sensbus1[i] >> 8;
}
Clearly, the implicit cast is smarter than I am.
What is going on?
write_buffer[(i*2)+1] = (uint8_t) sample_sensbus1[i] >> 8;
This is equivalent to:
write_buffer[(i*2)+1] = ((uint8_t) sample_sensbus1[i]) >> 8;
As you see, it does the cast before it does the shift. Your most significant byte is now gone.
This should work, though:
write_buffer[(i*2)+1] = (uint8_t) (sample_sensbus1[i] >> 8);
Your cast converts the uint16_t to uint8_t before it does the shift or mask. It is treated as though you wrote:
write_buffer[ (i*2) ] = ((uint8_t)sample_sensbus1[i]) & 0xff;
write_buffer[(i*2)+1] = ((uint8_t)sample_sensbus1[i]) >> 8;
You might need:
write_buffer[ (i*2) ] = (uint8_t)(sample_sensbus1[i] & 0xff);
write_buffer[(i*2)+1] = (uint8_t)(sample_sensbus1[i] >> 8);
In practice, the uncast version is OK too. Remember, a cast tells the compiler "I know more about this than you do; do as I say". That's dangerous if you don't know more than the compiler. Avoid casts whenever you can.
You might also note that shifting (left or right) by the size of the type in bits (or more) is undefined behaviour. However, ((uint8_t)sample_sensbus1[i]) >> 8 is not undefined behaviour, because the integer promotions convert the result of (uint8_t)sample_sensbus1[i] to int before the shift occurs, and an int cannot be 8 bits wide (it must be at least 16 bits to satisfy the standard), so the shift count is not too big.
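The parenthesised form can be sketched as a standalone helper. The function name split_u16 and its parameters are hypothetical; the point is only that the mask or shift happens on the full 16-bit value before the result is narrowed.

```c
#include <stdint.h>

/* Splits one 16-bit sample into two bytes, low byte first.
   split_u16 is a hypothetical helper used only for illustration. */
static void split_u16(uint16_t sample, uint8_t out[2])
{
    out[0] = (uint8_t)(sample & 0xff); /* mask first, then narrow */
    out[1] = (uint8_t)(sample >> 8);   /* shift first, then narrow */
}
```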
This is a question of operator precedence. In the first example, you are first converting to uint8_t and are applying the & and >> operators second. In the second example, those are applied before the implicit conversion takes place.
Casting is a unary prefix operator and as such has very high precedence.
(uint8_t) sample_sensbus1[i] & 0xff
parses as
((uint8_t)sample_sensbus1[i]) & 0xff
In this case & 0xff is redundant. But:
(uint8_t) sample_sensbus1[i] >> 8
parses as
((uint8_t)sample_sensbus1[i]) >> 8
Here the cast truncates the number to 8 bits, then >> 8 shifts everything out.
The problem is in this expression:
(uint8_t) sample_sensbus1[i] >> 8;
It is doing the following sequence:
Converting the sample_sensbus1[i] to uint8_t, effectively truncating it to the 8 least significant bits. This is where you are losing your data.
Converting the above to int as part of the integer promotions, making an int in which only the 8 lower bits can be set.
Shifting the above int right 8 bits, effectively making the whole expression zero.
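The sequence above can be checked with a side-by-side sketch. Both function names are hypothetical; the only difference between them is where the parentheses sit.

```c
#include <stdint.h>

/* Cast binds tighter than >>, so the value is truncated to 8 bits,
   promoted to int, then shifted: the result is always 0. */
static uint8_t high_byte_cast_first(uint16_t s)
{
    return (uint8_t) s >> 8;
}

/* Shift happens on the full 16-bit value, then the result is narrowed. */
static uint8_t high_byte_shift_first(uint16_t s)
{
    return (uint8_t)(s >> 8);
}
```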

Getting four bits from the right only in a byte using bit shift operations

I wanted to try to get only the four bits from the right in a byte by using only bit shift operations but it sometimes worked and sometimes not, but I don't understand why.
Here's an example:
unsigned char b = foo; //say foo is 1000 1010
unsigned char temp=0u;
temp |= ((b << 4) >> 4);//I want this to be 00001010
PS: I know I can use a mask=F and do temp =(mask&=b).
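The shift operators only work on integer types, and their operands undergo integer promotion first. In b << 4, b is promoted to int, so the higher bits are not shifted out; they are "protected" in the wider int and come straight back with >> 4.
To solve this, truncate to unsigned char between the two shifts: temp = ((unsigned char)(b << 4)) >> 4;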
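A short sketch of why the original "sometimes worked", assuming 8-bit chars; both function names are hypothetical. Without the intermediate cast the value comes back unchanged, which happens to look correct whenever the upper four bits were already zero.

```c
/* Truncates to 8 bits between the shifts, so bits 4..7 are really dropped. */
static unsigned char low_nibble_by_shifts(unsigned char b)
{
    unsigned char shifted = (unsigned char)(b << 4);
    return (unsigned char)(shifted >> 4);
}

/* No truncation: b is promoted to int, so the two shifts cancel out. */
static unsigned char shifts_without_truncation(unsigned char b)
{
    return (unsigned char)((b << 4) >> 4);
}
```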

c Code that reads a 4 byte little endian number from a buffer

I encountered this piece of C code that's existing. I am struggling to understand it.
It supposedly reads a 4 byte unsigned value passed in a buffer (in little endian format) into a variable of type "long".
This code runs on a 64 bit word size, little endian x86 machine - where sizeof(long) is 8 bytes.
My guess is that this code is intended to also run on a 32 bit x86 machine - so a variable of type long is used instead of int for sake of storing value from a four byte input data.
I am having some doubts and have put comments in the code to express what I understand, or what I don't :-)
Please answer questions below in that context
void read_Value_From_Four_Byte_Buff( char*input)
{
/* use long so on 32 bit machine, can still accommodate 4 bytes */
long intValueOfInput;
/* Bitwise and of input buffer's byte 0 with 0xFF gives MSB or LSB ?*/
/* This code seems to assume that assignment will store in rightmost byte - is that true on a x86 machine ?*/
intValueOfInput = 0xFF & input[0];
/*left shift byte-1 eight times, bitwise "or" places in 2nd byte frm right*/
intValueOfInput |= ((0xFF & input[1]) << 8);
/* similar left shift in mult. of 8 and bitwise "or" for next two bytes */
intValueOfInput |= ((0xFF & input[2]) << 16);
intValueOfInput |= ((0xFF & input[3]) << 24);
}
My questions
1) The input buffer is expected to be in "Little endian". But from the code it looks like the assumption is that it is read in as Byte 0 = MSB, Byte 1, Byte 2, Byte 3 = LSB. I thought so because the code reads bytes starting from Byte 0, and subsequent bytes (1 onwards) are placed in the target variable after left shifting. Is that how it is, or am I getting it wrong?
2) I feel this is a convoluted way of doing things. Is there a simpler alternative to copy the value from a 4 byte buffer into a long variable?
3) Will the assumption "that this code will run on a 64 bit machine" have any bearing on how easily I can do this alternatively? I mean, is all this trouble taken to keep it agnostic to word size (I assume it is agnostic to word size now, but I'm not sure)?
Thanks for your enlightenment :-)
You have it backwards. When you left shift, you're putting bits into more significant positions. So ((0xFF & input[3]) << 24) puts Byte 3 into the most significant byte of the 32-bit value.
This is the way to do it in standard C. POSIX has the function ntohl() that converts from network byte order to a native 32-bit integer, so this is usually used in Unix/Linux applications. Note that network byte order is big endian, so ntohl() suits big-endian input rather than the little-endian buffer here.
This will not work exactly the same on a 64-bit machine, unless you use unsigned long instead of long. As currently written, the highest bit of input[3] will be put into the sign bit of the result (assuming a two's-complement machine), so you can get negative results. If long is 64 bits, all the results will be positive.
The code you are using does indeed treat the input buffer as little endian. Look how it takes the first byte of the buffer and just assigns it to the variable without any shifting. If the first byte increases by 1, the value of your result increases by 1, so it is the least-significant byte (LSB). Left-shifting makes a byte more significant, not less. Left-shifting by 8 is generally the same as multiplying by 256.
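The remark that left-shifting by 8 matches multiplying by 256 can be written out as plain arithmetic. This is only a sketch with a hypothetical function name, taking the bytes as unsigned char to keep sign extension out of the picture.

```c
#include <stdint.h>

/* Little-endian decode as a weighted sum: byte 0 counts 1s,
   byte 1 counts 256s, byte 2 counts 65536s, byte 3 counts 16777216s. */
static uint32_t le_by_multiply(const unsigned char b[4])
{
    return (uint32_t)b[0]
         + (uint32_t)b[1] * 256u
         + (uint32_t)b[2] * 65536u
         + (uint32_t)b[3] * 16777216u;
}
```

Increasing byte 0 by one increases the result by one, which is what makes it the least significant byte.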
I don't think you can get much simpler than this unless you use an external function, or make assumptions about the machine this code is running on, or invoke undefined behavior. In most instances, it would work to just write uint32_t x = *(uint32_t *)input; but this assumes your machine is little endian, and it risks undefined behavior under the C standard (the buffer may be misaligned for uint32_t, and reading a char buffer through a uint32_t pointer breaks the aliasing rules).
No, running on a 64-bit machine is not a problem. I recommend using types like uint32_t and int32_t to make it easier to reason about whether your code will work on different architectures. You just need to include the stdint.h header from C99 to use those types.
The right-hand side of the last line of this function might exhibit undefined behavior depending on the data in the input:
((0xFF & input[3]) << 24)
The problem is that (0xFF & input[3]) will be a signed int (because of integer promotion). The int will probably be 32-bit, and you are shifting it so far to the left that the resulting value might not be representable in an int. The C standard says this is undefined behavior, and you should really try to avoid that because it gives the compiler a license to do whatever it wants and you won't be able to predict the result.
A solution is to convert it from an int to a uint32_t before shifting it, using a cast.
Finally, the variable intValueOfInput is written to but never used. Shouldn't you return it or store it somewhere?
Taking all this into account, I would rewrite the function like this:
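#include <stdint.h> /* uint32_t */

/* Reads a 4-byte little-endian value from input and returns it. */
uint32_t read_value_from_four_byte_buff(char * input)
{
uint32_t x;
x = 0xFF & input[0];
x |= (0xFF & input[1]) << 8;
x |= (0xFF & input[2]) << 16;
x |= (uint32_t)(0xFF & input[3]) << 24; /* convert before shifting into bit 31 */
return x;
}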
From the code, Byte 0 is LSB, Byte 3 is MSB. But there are some typos. The lines should be
intValueOfInput |= ((0xFF & input[2]) << 16);
intValueOfInput |= ((0xFF & input[3]) << 24);
You can make the code shorter by dropping 0xFF but using the type "unsigned char" in the argument type.
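A minimal sketch of that suggestion, assuming the caller can pass the buffer as unsigned char; the name read_u32_le is hypothetical. Each byte is converted to uint32_t before shifting, so no mask is needed and nothing is shifted into the sign bit of an int.

```c
#include <stdint.h>

/* Reads a 4-byte little-endian value; byte 0 is the least significant. */
static uint32_t read_u32_le(const unsigned char *input)
{
    return  (uint32_t)input[0]
         | ((uint32_t)input[1] << 8)
         | ((uint32_t)input[2] << 16)
         | ((uint32_t)input[3] << 24);
}
```

This reads the same value regardless of the host's own byte order or word size.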
To make the code shorter, you can do:
long intValueOfInput = 0;
for (int i = 0, shift = 0; i < 4; i++, shift += 8)
intValueOfInput |= (unsigned long)(unsigned char)input[i] << shift; /* unsigned long avoids overflowing int at shift 24 */

How to pad or extend the most significant bit (bit 23) into bits 24 through 31

I want to know, how could I extend the most significant bit (bit 23) into bits 24 through 31? How could I do that in C code? I am using C code to program Nios II.
I was thinking of using a bit-shifting operation, but I don't know in detail how that could achieve the above. Any link or resource is much appreciated.
Thank you in advance.
As Carl said, right shifting a negative value is implementation-defined. You can use other bitwise operators that will always work:
if (0 != (0x00800000 & x)) //test if bit 23 is set
{
x |= 0xFF000000; //set bits 24-31
}
else
{
x &= 0x00FFFFFF; //clear bits 24-31
}
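The same branch can be packaged as a function for testing. This is a sketch that assumes the value is held in a uint32_t; the name sign_extend_24_mask is hypothetical.

```c
#include <stdint.h>

/* Copies bit 23 into bits 24..31 using only AND/OR, no right shift. */
static uint32_t sign_extend_24_mask(uint32_t x)
{
    if (x & UINT32_C(0x00800000))   /* bit 23 set: fill the top byte with ones */
        x |= UINT32_C(0xFF000000);
    else                            /* bit 23 clear: zero the top byte */
        x &= UINT32_C(0x00FFFFFF);
    return x;
}
```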
The C right-shift operator has implementation-defined behaviour when right-shifting a negative signed value. Since Nios II has an arithmetic right-shift instruction, you can likely simply do:
x = (x << 8) >> 8;
Double check the output assembly to be sure it uses an instruction from the sra family.
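If relying on the compiler's right-shift behaviour is a concern, a different technique avoids shifts altogether: the XOR-and-subtract idiom. This is not what the answer above proposes, just a portable alternative sketched under the assumption that the result is wanted as an int32_t; the function name is hypothetical.

```c
#include <stdint.h>

/* Sign-extends a 24-bit two's-complement value held in the low bits of x.
   Flipping bit 23 and then subtracting its weight maps 0x800000..0xFFFFFF
   to the negative range without any implementation-defined shift. */
static int32_t sign_extend_24_xor(uint32_t x)
{
    int32_t v = (int32_t)(x & UINT32_C(0x00FFFFFF)); /* 0 .. 0xFFFFFF */
    return (v ^ 0x00800000) - 0x00800000;
}
```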
A variation on @IronMensan's answer, which relies on the reasonable assumption that the integer being modified is 32 bits.
The following only affects bits 24-31, even if the integer is wider.
#define Mask2431 (0xFF000000)
#define Bit23 (0x800000)
some_int |= Mask2431;
if (!(some_int & Bit23))
some_int ^= Mask2431;
The following affects bit 24 and all higher even when using wider than a 32-bit integer:
#define Mask24 (0xFFFFFF)
#define Bit23 (0x800000)
some_int &= Mask24;
if (some_int & Bit23)
some_int = ~some_int ^ Mask24;

bitwise indexing in C?

I'm trying to implement a data compression idea I've had, and since I'm imagining running it against a large corpus of test data, I had thought to code it in C (I mostly have experience in scripting languages like Ruby and Tcl.)
Looking through the O'Reilly 'cow' books on C, I realize that I can't simply index the bits of a simple 'char' or 'int' type variable as I'd like to in order to do bitwise comparisons and operations.
Am I correct in this perception? Is it reasonable for me to use an enumerated type for representing a bit (and make an array of these, and writing functions to convert to and from char)? If so, is such a type and functions defined in a standard library already somewhere? Are there other (better?) approaches? Is there some example code somewhere that someone could point me to?
Thanks -
Following on from what Kyle has said, you can use a macro to do the hard work for you.
It is possible.
To set the nth bit, use OR:
x |= (1 << 5); // sets the 6th-from-right
To clear a bit, use AND:
x &= ~(1 << 5); // clears 6th-from-right
To flip a bit, use XOR:
x ^= (1 << 5); // flips 6th-from-right
Or...
#define GetBit(var, bit) (((var) & (1 << (bit))) != 0) // Returns true / false if bit is set
#define SetBit(var, bit) ((var) |= (1 << (bit)))
#define FlipBit(var, bit) ((var) ^= (1 << (bit)))
Then you can use it in code like:
int myVar = 0;
SetBit(myVar, 5);
if (GetBit(myVar, 5))
{
// Do something
}
It is possible.
To set the nth bit, use OR:
x |= (1 << 5); // sets bit 5 (the 6th from the right)
To clear a bit, use AND:
x &= ~(1 << 5); // clears bit 5
To flip a bit, use XOR:
x ^= (1 << 5); // flips bit 5
To get the value of a bit use shift and AND:
(x & (1 << 5)) >> 5 // gets the value (0 or 1) of bit 5
note: the shift right 5 is to ensure the value is either 0 or 1. If you're just interested in 0/not 0, you can get by without the shift.
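The four operations can also be collected into small functions instead of macros. This is a sketch with hypothetical names; it uses unsigned operands and 1u so that no shift touches a sign bit, and it assumes n is smaller than the width of unsigned.

```c
#include <stdbool.h>

/* Bit n counts from 0 at the least significant end. */
static bool bit_get(unsigned x, unsigned n)
{
    return ((x >> n) & 1u) != 0;
}

static unsigned bit_set(unsigned x, unsigned n)
{
    return x | (1u << n);
}

static unsigned bit_clear(unsigned x, unsigned n)
{
    return x & ~(1u << n);
}

static unsigned bit_flip(unsigned x, unsigned n)
{
    return x ^ (1u << n);
}
```

Unlike the compound-assignment forms, these return the new value, so a call looks like x = bit_set(x, 5);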
Have a look at the answers to this question.
Theory
There is no C syntax for accessing or setting the n-th bit of a built-in datatype (e.g. a 'char'). However, you can access bits using a bitwise AND operation, and set bits using a bitwise OR operation.
As an example, say that you have a variable that holds 1101 and you want to check the 2nd bit from the left. Simply perform a bitwise AND with 0100:
1101
0100
---- AND
0100
If the result is non-zero, then the 2nd bit must have been set; otherwise it was not set.
If you want to set the 3rd bit from the left, then perform a bitwise OR with 0010:
1101
0010
---- OR
1111
You can use the C operators & (bitwise AND) and | (bitwise OR) to perform these tasks. The logical operators && and || will not do: they only test whether each operand is zero or non-zero. You will need to construct the bit access patterns (the 0100 and 0010 in the above examples) yourself. The trick is to remember that the least significant bit (LSB) counts 1s, the next LSB counts 2s, then 4s etc. So, the bit access pattern for the n-th LSB (starting at 0) is simply the value of 2^n. The easiest way to compute this in C is to shift the binary value 0001 (in this four bit example) to the left by the required number of places. As this value is always equal to 1 in unsigned integer-like quantities, this is just '1 << n'.
Example
unsigned char myVal = 0x65; /* in hex; this is 01100101 in binary. */
/* Q: is the 3-rd least significant bit set (again, the LSB is the 0th bit)? */
unsigned char pattern = 1;
pattern <<= 3; /* Shift pattern left by three places.*/
if(myVal & pattern) {printf("Yes!\n");} /* Perform the test with bitwise AND. */
/* Set the most significant bit. */
myVal |= (unsigned char)(1u<<7);
This example hasn't been tested, but should serve to illustrate the general idea.
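A common slip in tests like this is writing logical && where bitwise & is meant. The sketch below, with hypothetical function names, shows how the two differ for the value 0x65 used in the example: bit 3 is clear, yet the logical form still reports true because both operands are non-zero.

```c
#include <stdbool.h>

/* Bitwise AND: true only if bit n of v is actually set. */
static bool test_bitwise(unsigned char v, unsigned n)
{
    return (v & (1u << n)) != 0;
}

/* Logical AND: true whenever v and the mask are both non-zero,
   regardless of which bits are set. Shown only as a counter-example. */
static bool test_logical(unsigned char v, unsigned n)
{
    return v && (1u << n);
}
```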
To query state of bit with specific index:
int index_state = variable & ( 1 << bit_index );
To set a bit:
variable |= 1 << bit_index;
To reset (clear) a bit:
variable &= ~( 1 << bit_index );
Try using bitfields. Be careful: the implementation can vary by compiler.
http://publications.gbdirect.co.uk/c_book/chapter6/bitfields.html
If you want to index a bit of a char variable c, you could write:
bit = (c & 0x80) >> 7;
This gets the MSB of c. You could even leave out the right shift and do a test on 0.
bit = c & 0x80;
If the bit is set, the result will be non-zero.
Obviously, you need to change the mask to get different bits (NB: 0x80 is the bit mask, if it is unclear). It is possible to define numerous masks, e.g.
#define BIT_0 0x1 // or 1 << 0
#define BIT_1 0x2 // or 1 << 1
#define BIT_2 0x4 // or 1 << 2
#define BIT_3 0x8 // or 1 << 3
etc...
This gives you:
bit = c & BIT_1;
You can use these definitions in the above code to successfully index a bit within either a macro or a function.
To set a bit:
c |= BIT_2;
To clear a bit:
c &= ~BIT_3;
To toggle a bit:
c ^= BIT_4;
This help?
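A compact sketch of the named-mask approach, assuming an 8-bit unsigned char. The enum spells out all eight masks (the answer stops at BIT_3 with "etc..."), and the helper function names are hypothetical.

```c
/* One mask per bit of an 8-bit value. */
enum {
    BIT_0 = 1 << 0, BIT_1 = 1 << 1, BIT_2 = 1 << 2, BIT_3 = 1 << 3,
    BIT_4 = 1 << 4, BIT_5 = 1 << 5, BIT_6 = 1 << 6, BIT_7 = 1 << 7
};

static int test_mask(unsigned char c, unsigned char mask)
{
    return (c & mask) != 0; /* 1 if any masked bit is set */
}

static unsigned char set_mask(unsigned char c, unsigned char mask)
{
    return (unsigned char)(c | mask);
}

static unsigned char clear_mask(unsigned char c, unsigned char mask)
{
    return (unsigned char)(c & ~mask);
}

static unsigned char toggle_mask(unsigned char c, unsigned char mask)
{
    return (unsigned char)(c ^ mask);
}
```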
Individual bits can be indexed as follows.
Define a struct like this one:
struct
{
unsigned bit0 : 1;
unsigned bit1 : 1;
unsigned bit2 : 1;
unsigned bit3 : 1;
unsigned reserved : 28;
} bitPattern;
Now if I want to know the individual bit values of a var named "value", do the following:
CopyMemory( &bitPattern, &value, sizeof(value) );
To see if bit 2 is high or low:
int state = bitPattern.bit2;
Hope this helps.
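A portable sketch of the same idea, using standard memcpy in place of the Windows CopyMemory macro. The names bit_pattern and load_bits are hypothetical. Which bit of the integer ends up in which field is implementation-defined, so this is only safe where the compiler's bit-field layout is known.

```c
#include <string.h>

struct bit_pattern {
    unsigned bit0 : 1;
    unsigned bit1 : 1;
    unsigned bit2 : 1;
    unsigned bit3 : 1;
    unsigned reserved : 28;
};

/* Overlays the bytes of value onto the bit-field struct.
   The mapping of integer bits to fields depends on the compiler and target. */
static struct bit_pattern load_bits(unsigned value)
{
    struct bit_pattern p;
    size_t n = sizeof p < sizeof value ? sizeof p : sizeof value;

    memset(&p, 0, sizeof p);
    memcpy(&p, &value, n);
    return p;
}
```

For code that must behave the same on every compiler, the shift-and-mask forms are the safer choice.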
There is a C++ standard library container for bits: std::vector<bool>. It is specialised in the library to be space efficient. There is also a boost dynamic_bitset class.
These will let you perform operations on a set of boolean values, using one bit per value of underlying storage.
Boost dynamic bitset documentation
For the STL documentation, see your compiler documentation.
Of course, you can also address the individual bits in other integral types by hand. If you do that, you should use unsigned types so that you don't get implementation-defined behaviour if you decide to do a right shift on a value with the high bit set. However, it sounds like you want the containers.
To the commenter who claimed this takes 32x more space than necessary: boost::dynamic_bitset and vector<bool> are specialised to use one bit per entry, and so there is not a space penalty, assuming that you actually want more than the number of bits in a primitive type. These classes allow you to address individual bits in a large container with efficient underlying storage. If you just want (say) 32 bits, by all means, use an int. If you want some large number of bits, you can use a library container.
