I'm calculating a CCITT CRC-16 bit by bit. I do it this way because it's a prototype that will later be ported to VHDL and end up in hardware to check a serial bit-stream.
On the net I found code for a single-bit CRC-16 update step. I wrote a test program and it works, except for one strange thing: I have to feed the bits of each byte from lowest to highest bit. If I do it this way, I get correct results.
In the CCITT definition of CRC-16, though, the bits should be fed from highest bit to lowest bit. The data stream that I want to calculate the CRC from comes in this format as well, so my current code is kind of useless for me.
I'm confused. I would not have expected that feeding the bits the wrong way around could work at all.
Question: Why is it possible that a CRC can be written to take the data in two different bit orders, and how do I transform my single-bit update code so that it accepts the data MSB first?
For reference, here is the relevant code. Initialization and the final check have been removed to keep the example short:
typedef unsigned char bit;
void update_crc_single_bit (bit * crc, bit data)
{
// update CRC for a single bit:
bit temp[16];
int i;
temp[0] = data ^ crc[15];
temp[1] = crc[0];
temp[2] = crc[1];
temp[3] = crc[2];
temp[4] = crc[3];
temp[5] = data ^ crc[4] ^ crc[15];
temp[6] = crc[5];
temp[7] = crc[6];
temp[8] = crc[7];
temp[9] = crc[8];
temp[10] = crc[9];
temp[11] = crc[10];
temp[12] = data ^ crc[11] ^ crc[15];
temp[13] = crc[12];
temp[14] = crc[13];
temp[15] = crc[14];
for (i=0; i<16; i++)
crc[i] = temp[i];
}
void update_crc_byte (bit * crc, unsigned char data)
{
int j;
// calculate CRC lowest bit first
for (j=0; j<8; j++)
{
bit b = (data>>j)&1;
update_crc_single_bit(crc, b);
}
}
Edit: Since there is some confusion here: I have to compute the CRC bit by bit, and for each byte MSB first. I can't simply store the bits because the code shown above is a prototype for something that will end up in hardware (without memory).
The code shown above generates the correct result if I feed in a bit-stream in the following order (shown is the index of the received bit. Each byte gets transmitted MSB first):
|- first byte -|- second byte -|- third byte
7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8,....
I need the single update loop to be transformed so that it generates the same CRC using natural order (i.e. as received):
|- first byte -|- second byte -|- third byte
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,....
If you look at the RevEng 16-bit CRC Catalogue, you see that there are two different CRCs called "CCITT", one of which is labeled there "CCITT-False". Somewhere along the way someone got confused about what the CCITT 16-bit CRC was, and that confusion was propagated widely. The two CRCs are described thusly, with the first one (KERMIT) being the true CCITT CRC:
KERMIT
width=16 poly=0x1021 init=0x0000 refin=true refout=true xorout=0x0000 check=0x2189 name="KERMIT"
and
CRC-16/CCITT-FALSE
width=16 poly=0x1021 init=0xffff refin=false refout=false xorout=0x0000 check=0x29b1 name="CRC-16/CCITT-FALSE"
You will note that the real one is reflected, and the false one is not, and there is another difference in the initialization. In reflected CRCs, the lowest bit of the data is processed first, so it appears that you are trying to compute the true CCITT CRC.
When the CRC is reflected, so is the order of the bits in the polynomial that is exclusive-ored into the register, so 0x1021 becomes 0x8408. Here is a simple C implementation that you can check against:
#include <stddef.h>
#define POLY 0x8408
unsigned crc16_ccitt(unsigned crc, unsigned char *buf, size_t len)
{
int k;
while (len--) {
crc ^= *buf++;
for (k = 0; k < 8; k++)
crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
}
return crc;
}
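Since the question asks for a one-bit-at-a-time update, the byte loop above can also be written as a single-bit step. The following is a minimal sketch, assuming the KERMIT parameters listed above (reflected, register initialised to 0x0000); the names crc16_kermit_bit and crc16_kermit_bitwise are made up for illustration and are not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

#define POLY_REFLECTED 0x8408u  /* 0x1021 with its bits reversed */

/* Advance the reflected CRC register by one input bit (hypothetical helper). */
static uint16_t crc16_kermit_bit(uint16_t crc, unsigned bit)
{
    unsigned feedback = (crc ^ bit) & 1u;  /* low register bit XOR input bit */
    crc >>= 1;
    if (feedback)
        crc ^= POLY_REFLECTED;
    return crc;
}

/* Feed a buffer bit by bit, lowest bit of each byte first. */
static uint16_t crc16_kermit_bitwise(uint16_t crc, const unsigned char *buf, size_t len)
{
    while (len--) {
        unsigned char byte = *buf++;
        for (int j = 0; j < 8; j++)
            crc = crc16_kermit_bit(crc, (byte >> j) & 1u);
    }
    return crc;
}
```

Starting from 0 and feeding the nine ASCII bytes "123456789" should reproduce the catalogue check value 0x2189 quoted above, which is a convenient way to compare a bit-serial prototype against the byte-wise routine.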
I don't know what you mean by "In the CCITT definition of CRC-16 the bits should be feed highest bit to lowest bit though". What definition are you referring to?
In this Altera document, you can see the shift register implementation of the CRC for a hardware implementation. Here is a copy of the diagram:
For your code, you need to reverse the indices of your register array temp[]: temp[0] becomes temp[15], and so on.
Update - If you look at:
RevEng 16-bit CRC Catalogue
there's a link to:
Online CRC calculator
The first three labeled as CRC-CCITT operate on data sent or received MSB to LSB using the polynomial 0x11021. The only difference is the starting value:
CRC-CCITT (XModem) - crc initialized to 0x0000, same as prefixing by 0x0000.
CRC-CCITT (0xFFFF) - crc initialized to 0xFFFF, same as prefixing by 0x84CF.
CRC-CCITT (0x1D0F) - crc initialized to 0x1D0F, same as prefixing by 0xFFFF.
So my guess is that you want to use one of these three.
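For comparison, here is a minimal sketch of a bit-serial update that consumes each byte MSB first with the non-reflected polynomial 0x1021. The function names are hypothetical, and the initial value is passed in as a parameter so that the three variants listed above can be tried with the same code; it is an illustration under those assumptions, not a statement about which variant the data stream in the question actually uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Advance a non-reflected CRC-16 (polynomial 0x1021) by one input bit. */
static uint16_t crc16_msb_bit(uint16_t crc, unsigned bit)
{
    unsigned feedback = ((crc >> 15) ^ bit) & 1u;  /* top register bit XOR input bit */
    crc = (uint16_t)(crc << 1);
    if (feedback)
        crc ^= 0x1021u;
    return crc;
}

/* Feed a buffer bit by bit, highest bit of each byte first.
   init is the starting register value, e.g. 0x0000, 0xFFFF or 0x1D0F. */
static uint16_t crc16_msb_first(uint16_t init, const unsigned char *buf, size_t len)
{
    uint16_t crc = init;
    while (len--) {
        unsigned char byte = *buf++;
        for (int j = 7; j >= 0; j--)
            crc = crc16_msb_bit(crc, (byte >> j) & 1u);
    }
    return crc;
}
```

With init 0x0000, feeding the two bytes 0xFF 0xFF leaves 0x1D0F in the register, which matches the remark above that an initial value of 0x1D0F behaves like prefixing the message with 0xFFFF.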
Normally, bits are transferred on the line least significant bit first. So, if you have an array of bytes, the first bit is the least significant bit of the first byte, then comes the next-to-least significant bit, and so on up to the most significant bit of the first byte; then comes the least significant bit of the next byte. This is the order of the bits (coefficients) in the polynomial division you are performing. Try my routines at https://github.com/mojadita/crc.git (there is a table for CRC16-CCITT there).
I am trying to read the 'size' of an SD card. The sample code I have contains the following lines:
unsigned char xdata *pchar; // Pointer to external mem space for FLASH Read function;
pchar += 9; // Size indicator is in the 9th byte of CSD (Card specific data) register;
// Extract size indicator bits;
size = (unsigned int)((((*pchar) & 0x03) << 1) | (((*(pchar+1)) & 0x80) >> 7));
I am not able to understand what is actually being done in the line above where the indicator bits are extracted. Can somebody help me understand this?
The size is made up of bits from two bytes. One byte is at pchar, the other at pchar + 1.
((*pchar) & 0x03) takes the 2 least significant bits (chopping off the 6 most significant ones).
This result is shifted one bit to the left using << 1. For example:
11011010 (& 0x03/00000011)==> 00000010 (<< 1)==> 00000100 (-----10-)
Something similar is done with pchar + 1. For example:
11110110 (& 0x80/10000000)==> 10000000 (>> 7)==> 00000001 (-------1)
Then these two values are OR-ed together with |. So in this example you'd get:
00000100 | 00000001 = 00000101 (-----101)
But note that the 5 most significant bits will always be 0 (indicated above with -) because they were &-ed away.
To summarize, the first byte holds two bits of size, while the second byte only one bit.
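The same steps can be written out as a small function. This is only a sketch: the name extract_size_indicator is made up, and it assumes csd points at the start of the CSD register so that the two bytes of interest are csd[9] and csd[10], as in the question's pchar += 9.

```c
/* Combine the low two bits of csd[9] with the top bit of csd[10]
   into a 3-bit value (hypothetical helper, not from the SD sample code). */
static unsigned extract_size_indicator(const unsigned char *csd)
{
    unsigned high_two = (csd[9] & 0x03u) << 1;   /* ------GH -> -----GH- */
    unsigned low_one  = (csd[10] & 0x80u) >> 7;  /* I------- -> -------I */
    return high_two | low_one;                   /* -----GHI */
}
```

With the example bytes used above (11011010 followed by 11110110) the function returns 5, i.e. binary 101.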
It seems the size indicator, say SI, consists of 3 bits, where *pchar contains the two most significant bits of SI in its lowest two bits (0x03) and *(pchar+1) contains the least significant bit of SI in its highest bit (0x80).
The first and second line figure out how to point to the data that you want.
Let's now go through the steps involved, from left to right.
The first portion of the operation takes the byte pointed to by pchar, performs a bitwise AND on the byte and 0x03, and shifts that result left by one bit.
That result is then bitwise ORed with the next byte, *(pchar+1), which has first been ANDed with 0x80 and right shifted by seven bits. Essentially, this portion isolates the most significant bit of that byte and moves it down to the least significant position.
What the result is essentially this:
Imagine pchar points to the byte where bits are represented by letters: ABCDEFGH.
The first part ANDs with 0x03, so we are left with 000000GH. This is then left shifted by one bit, so we are left with 00000GH0.
Same thing for the right portion. pchar+1 is represented by IJKLMNOP. With the first AND, we are left with I0000000. This is then right shifted seven times, so we have 0000000I. This is combined with the left-hand portion using the OR, so we have 00000GHI, which is then cast to an unsigned int, which holds your size.
Basically, there are three bits that hold the size, but they are not byte aligned. As a result, some manipulation is necessary.
size = (unsigned int)((((*pchar) & 0x03) << 1) | (((*(pchar+1)) & 0x80) >> 7));
Can somebody help me in understanding this?
We have byte *pchar and byte *(pchar+1). Each byte consists of 8 bits.
Let's label the bits of *pchar with digits, 76543210, and the bits of *(pchar+1) with letters, hgfedcba (h being bit 7); x marks a bit that has been forced to zero.
1. ((*pchar) & 0x03) << 1 means "zero all bits of *pchar except bits 0 and 1, then shift the result left by 1 bit":
76543210 --> xxxxxx10 --> xxxxx10x
2. (((*(pchar+1)) & 0x80) >> 7) means "zero all bits of *(pchar+1) except bit 7, then shift the result right by 7 bits":
hgfedcba --> hxxxxxxx --> xxxxxxxh
3. ((((*pchar) & 0x03) << 1) | (((*(pchar+1)) & 0x80) >> 7)) means "combine all non-zero bits of the left and right operands into one byte":
xxxxx10x | xxxxxxxh --> xxxxx10h
So, in the result we have two low bits from *pchar and one high bit from *(pchar+1).
Can someone please explain this function to me?
A mask with the least significant n bits set to 1.
Ex:
n = 6 --> 0x2F, n = 17 --> 0x1FFFF // I don't get these at all, especially how n = 6 --> 0x2F
Also, what is a mask?
The usual way is to take a 1, and shift it left n bits. That will give you something like: 00100000. Then subtract one from that, which will clear the bit that's set, and set all the less significant bits, so in this case we'd get: 00011111.
A mask is normally used with bitwise operations, especially and. You'd use the mask above to get the 5 least significant bits by themselves, isolated from anything else that might be present. This is especially common when dealing with hardware that will often have a single hardware register containing bits representing a number of entirely separate, unrelated quantities and/or flags.
A mask is a common term for an integer value that is bit-wise ANDed, ORed, XORed, etc with another integer value.
For example, if you want to extract the 8 least significant bits of an int variable, you do variable & 0xFF. 0xFF is a mask.
Likewise if you want to set bits 0 and 8, you do variable | 0x101, where 0x101 is a mask.
Or if you want to invert the same bits, you do variable ^ 0x101, where 0x101 is a mask.
To generate a mask for your case you should exploit the simple mathematical fact that if you add 1 to your mask (the mask having all its least significant bits set to 1 and the rest to 0), you get a value that is a power of 2.
So, if you generate the closest power of 2, then you can subtract 1 from it to get the mask.
Positive powers of 2 are easily generated with the left shift << operator in C.
Hence, 1 << n yields 2^n. In binary it's 10...0 with n 0s.
(1 << n) - 1 will produce a mask with n lowest bits set to 1.
Now, you need to watch out for overflows in left shifts. In C (and in C++) you can't legally shift a variable left by as many bit positions as the variable has, so if ints are 32-bit, 1<<32 results in undefined behavior. Signed integer overflows should also be avoided, so you should use unsigned values, e.g. 1u << 31.
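Putting those points together, a minimal sketch for a 32-bit mask might look like the following. The function name low_mask32 is hypothetical; the explicit check for n >= 32 is there only to avoid the undefined full-width shift mentioned above.

```c
#include <stdint.h>

/* Mask with the n least significant bits set, for 0 <= n <= 32. */
static uint32_t low_mask32(unsigned n)
{
    if (n >= 32)
        return UINT32_MAX;            /* never shift by the full width */
    return ((uint32_t)1 << n) - 1u;   /* 2^n - 1 */
}
```

For the two values in the question, low_mask32(6) gives 0x3F and low_mask32(17) gives 0x1FFFF.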
For both correctness and performance, the best way to accomplish this has changed since this question was asked back in 2012 due to the advent of BMI instructions in modern x86 processors, specifically BLSMSK.
Here's a good way of approaching this problem, while retaining backwards compatibility with older processors.
This method is correct, whereas the current top answers produce undefined behavior in edge cases.
Clang and GCC, when allowed to optimize using BMI instructions, will condense gen_mask() to just two ops. With supporting hardware, be sure to add compiler flags for BMI instructions:
-mbmi -mbmi2
#include <inttypes.h>
#include <stdio.h>
uint64_t gen_mask(const uint_fast8_t msb) {
const uint64_t src = (uint64_t)1 << msb;
return (src - 1) ^ src;
}
int main() {
uint_fast8_t msb;
for (msb = 0; msb < 64; ++msb) {
printf("%016" PRIx64 "\n", gen_mask(msb));
}
return 0;
}
First, for those who only want the code to create the mask:
uint64_t bits = 6;
uint64_t mask = ((uint64_t)1 << bits) - 1;
// Results in 0b111111 (or 0x3F)
Thanks to @Benni who asked about using bits = 64. If you need the code to support this value as well, you can use:
uint64_t bits = 6;
uint64_t mask = (bits < 64)
? ((uint64_t)1 << bits) - 1
: (uint64_t)0 - 1;
For those who want to know what a mask is:
A mask is usually a name for a value that we use to manipulate other values using bitwise operations such as AND, OR, XOR, etc.
Short masks are usually represented in binary, where we can explicitly see all the bits that are set to 1.
Longer masks are usually represented in hexadecimal, which is easy to read once you get the hang of it.
You can read more about bitwise operations in C here.
I believe your first example should be 0x3f.
0x3f is hexadecimal notation for the number 63 which is 111111 in binary, so that last 6 bits (the least significant 6 bits) are set to 1.
The following little C program will calculate the correct mask:
#include <stdarg.h>
#include <stdio.h>
int mask_for_n_bits(int n)
{
int mask = 0;
for (int i = 0; i < n; ++i)
mask |= 1 << i;
return mask;
}
int main (int argc, char const *argv[])
{
printf("6: 0x%x\n17: 0x%x\n", mask_for_n_bits(6), mask_for_n_bits(17));
return 0;
}
0x2F is 0010 1111 in binary - this should be 0x3f, which is 0011 1111 in binary and which has the 6 least-significant bits set.
Similarly, 0x1FFFF is 0001 1111 1111 1111 1111 in binary, which has the 17 least-significant bits set.
A "mask" is a value that is intended to be combined with another value using a bitwise operator like &, | or ^ to individually set, unset, flip or leave unchanged the bits in that other value.
For example, if you combine the mask 0x3F with some value n using the & operator, the result will have zeroes in all but the 6 least significant bits, and those 6 bits will be copied unchanged from the value n.
In the case of an & mask, a binary 0 in the mask means "unconditionally set the result bit to 0" and a 1 means "set the result bit to the input value bit". For an | mask, an 0 in the mask sets the result bit to the input bit and a 1 unconditionally sets the result bit to 1, and for an ^ mask, an 0 sets the result bit to the input bit and a 1 sets the result bit to the complement of the input bit.
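The three cases can be tried out with a few one-line helpers. This is just an illustrative sketch; the function names are made up, and 8-bit values are used only to keep the binary easy to read.

```c
#include <stdint.h>

/* & mask: keep the bits where mask is 1, force the rest to 0. */
static uint8_t keep_bits(uint8_t v, uint8_t mask)   { return (uint8_t)(v & mask); }

/* | mask: force the bits where mask is 1 to 1, leave the rest alone. */
static uint8_t set_bits(uint8_t v, uint8_t mask)    { return (uint8_t)(v | mask); }

/* ^ mask: invert the bits where mask is 1, leave the rest alone. */
static uint8_t toggle_bits(uint8_t v, uint8_t mask) { return (uint8_t)(v ^ mask); }
```

Taking v = 0xA5 (1010 0101) and the mask 0x3F (0011 1111): keep_bits gives 0x25 (0010 0101), set_bits gives 0xBF (1011 1111) and toggle_bits gives 0x9A (1001 1010).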
Hey, in the Programming Pearls book, there is a source code for setting, clearing and testing a bit of the given index in an array of ints that is actually a set representation.
The code is the following:
#include<stdio.h>
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1+ N/BITSPERWORD];
void set(int i)
{
a[i>>SHIFT] |= (1<<(i & MASK));
}
void clr(int i)
{
a[i>>SHIFT] &= ~(1<<(i & MASK));
}
int test(int i)
{
return a[i>>SHIFT] & (1<<(i & MASK));
}
Could somebody explain me the reason of the SHIFT and the MASK defines? And what are their purposes in the code?
I've already read the previous related question.
VonC posted a good answer about bitmasks in general. Here's some information that's more specific to the code you posted.
Given an integer representing a bit, we work out which member of the array holds that bit. That is: Bits 0 to 31 live in a[0], bits 32 to 63 live in a[1], etc. All that i>>SHIFT does is i / 32. This works out which member of a the bit lives in. With an optimising compiler, these are probably equivalent.
Obviously, now we've found out which member of a that bitflag lives in, we need to ensure that we set the correct bit in that integer. This is what 1 << i does. However, we need to ensure that we don't try to access the 33rd bit in a 32-bit integer, so the shift operation is constrained by using 1 << (i & 0x1F). The magic here is that 0x1F is 31, so we'll never left-shift the bit represented by i more than 31 places (otherwise it should have gone in the next member of a).
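To make the word-index and bit-position split concrete, here is a small sketch along the lines of the code in the question. It is a variation, not the book's code: it uses uint32_t words so the shifts stay unsigned, a deliberately small NBITS, and hypothetical function names.

```c
#include <stdint.h>

#define BITSPERWORD 32
#define SHIFT 5       /* log2(BITSPERWORD): i >> SHIFT equals i / 32 */
#define MASK 0x1F     /* BITSPERWORD - 1:   i & MASK  equals i % 32 */
#define NBITS 1000    /* small size chosen for this sketch */

static uint32_t words[1 + NBITS / BITSPERWORD];

/* i >> SHIFT picks the word, i & MASK picks the bit inside that word. */
static void bit_set(unsigned i)  { words[i >> SHIFT] |= (uint32_t)1 << (i & MASK); }
static void bit_clr(unsigned i)  { words[i >> SHIFT] &= ~((uint32_t)1 << (i & MASK)); }
static int  bit_test(unsigned i) { return (int)((words[i >> SHIFT] >> (i & MASK)) & 1u); }
```

For example, bit 37 lives in words[1] (37 / 32 = 1) at position 5 (37 % 32 = 5), so bit_set(37) leaves words[1] equal to 0x20.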
From Here (General answer to get this thread started)
A bit mask is a value (which may be stored in a variable) that enables you to isolate a specific set of bits within an integer type.
Normally the mask will have the bits you are interested in set to 1 and all the other bits set to 0. The mask then allows you to isolate the value of the bits, clear all the bits, set all the bits, or assign a new value to the bits.
Masks (particularly multi-bit ones) often have an associated shift value, which is the amount the bits need shifting right so that the least significant masked bit ends up in the least significant bit of the type.
For example, using a 16-bit short data type, suppose you wanted to be able to mask bits 3, 4 and 5 (LSB is number 0). Your mask and shift would look something like
#define MASK 0x0038
#define SHIFT 3
Masks are often assigned in hexadecimal because it is easier to work with bits in the data type in that base as opposed to decimal. Historically octal has also been used for bit masks.
If I have a variable, var, that contains data that the mask is relevant to then I can isolate the bits like this
var & MASK
I can isolate all the other bits like this
var & ~MASK
I can clear the bits like this
var &= ~MASK;
I can clear all the other bits like this
var &= MASK;
I can set all the bits like this
var |= MASK;
I can set all the other bits like this
var |= ~MASK;
I can extract the numeric value of the bits like this
(var & MASK) >> SHIFT
I can assign a new value to the bits like this
var &= ~MASK;
var |= (newValue << SHIFT) & MASK;
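The extract and assign operations above can be wrapped in two small functions. This is a sketch only: the names get_field and set_field are made up, and the macros are renamed FIELD_MASK and FIELD_SHIFT here purely to keep the example self-contained.

```c
#include <stdint.h>

#define FIELD_MASK  0x0038u   /* bits 3, 4 and 5 */
#define FIELD_SHIFT 3

/* Read the 3-bit field as a number in the range 0..7. */
static unsigned get_field(uint16_t var)
{
    return (var & FIELD_MASK) >> FIELD_SHIFT;
}

/* Return var with the field replaced by new_value; bits of new_value
   that do not fit in the field are discarded by the final & FIELD_MASK. */
static uint16_t set_field(uint16_t var, unsigned new_value)
{
    var &= (uint16_t)~FIELD_MASK;
    var |= (uint16_t)((new_value << FIELD_SHIFT) & FIELD_MASK);
    return var;
}
```

For instance, set_field(0xFFFF, 2) yields 0xFFD7: every bit outside the field is untouched, and the field itself now reads back as 2.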
When you want to set a bit inside the array, you have to
seek to the right array index and
set the appropriate bit inside this array item.
There are BITSPERWORD (=32) bits in one array item, which means that the index i has to be split into two parts:
the rightmost 5 bits serve as an index into the array item and
the rest of the bits (the leftmost 27) serve as an index into the array.
You get:
the leftmost 27 bits by discarding the rightmost five, which is exactly what i>>SHIFT does, and
the rightmost five bits by masking out anything but the rightmost five bits, which is what i & MASK does.
I guess you understand the rest.
Bitwise operation and the leading paragraphs of Mask are a concise explanation, and contain some pointers for further study.
Think of an 8-bit byte as a set of elements from an 8-member universe. A member is IN the set when the corresponding bit is set. Setting a bit more than once doesn't modify set membership (a bit can have only 2 states). The bitwise operators in C provide access to bits by masking and shifting.
The code is trying to store N bits in an array, where each element of the array contains BITSPERWORD (32) bits.
Thus if you're trying to access bit i, you need to calculate the index of the array element that stores it (i/32), which is what i>>SHIFT does.
And then you need to access that bit in the array element we just got.
(i & MASK) gives the bit position within the array element (word).
(1<<(i & MASK)) produces a word with only the bit at that position set.
Now you can set/clear/test that bit in a[i>>SHIFT] using (1<<(i & MASK)).
You may also think of i as a 32-bit number in which bits 5~31 are the index of the array element that stores the bit, and bits 0~4 represent the bit position in the word.
I'm trying to implement a data compression idea I've had, and since I'm imagining running it against a large corpus of test data, I had thought to code it in C (I mostly have experience in scripting languages like Ruby and Tcl.)
Looking through the O'Reilly 'cow' books on C, I realize that I can't simply index the bits of a simple 'char' or 'int' type variable as I'd like to in order to do bitwise comparisons and operations.
Am I correct in this perception? Is it reasonable for me to use an enumerated type for representing a bit (and make an array of these, and writing functions to convert to and from char)? If so, is such a type and functions defined in a standard library already somewhere? Are there other (better?) approaches? Is there some example code somewhere that someone could point me to?
Thanks -
Following on from what Kyle has said, you can use a macro to do the hard work for you.
It is possible.
To set the nth bit, use OR:
x |= (1 << 5); // sets the 6th-from-right
To clear a bit, use AND:
x &= ~(1 << 5); // clears 6th-from-right
To flip a bit, use XOR:
x ^= (1 << 5); // flips 6th-from-right
Or...
#define GetBit(var, bit) (((var) & (1 << (bit))) != 0) // Returns true / false if bit is set
#define SetBit(var, bit) ((var) |= (1 << (bit)))
#define FlipBit(var, bit) ((var) ^= (1 << (bit)))
Then you can use it in code like:
int myVar = 0;
SetBit(myVar, 5);
if (GetBit(myVar, 5))
{
// Do something
}
It is possible.
To set the nth bit, use OR:
x |= (1 << 5); // sets bit 5 (the 6th from the right)
To clear a bit, use AND:
x &= ~(1 << 5); // clears bit 5
To flip a bit, use XOR:
x ^= (1 << 5); // flips bit 5
To get the value of a bit use shift and AND:
(x & (1 << 5)) >> 5 // gets the value (0 or 1) of bit 5
note: the shift right 5 is to ensure the value is either 0 or 1. If you're just interested in 0/not 0, you can get by without the shift.
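If macros or inline expressions feel error-prone, the same four operations can be sketched as small functions. The names below are hypothetical, and unsigned operands are used so that shifting into the top bit is well defined.

```c
/* n is the bit index, counted from 0 at the least significant end. */
static unsigned get_bit(unsigned x, unsigned n)   { return (x >> n) & 1u; }   /* 0 or 1 */
static unsigned set_bit(unsigned x, unsigned n)   { return x | (1u << n); }
static unsigned clear_bit(unsigned x, unsigned n) { return x & ~(1u << n); }
static unsigned flip_bit(unsigned x, unsigned n)  { return x ^ (1u << n); }
```

Each function returns the new value rather than modifying its argument, so a typical use would be x = set_bit(x, 5);.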
Have a look at the answers to this question.
Theory
There is no C syntax for accessing or setting the n-th bit of a built-in datatype (e.g. a 'char'). However, you can access bits using a bitwise AND operation, and set bits using a bitwise OR operation.
As an example, say that you have a variable that holds 1101 and you want to check the 2nd bit from the left. Simply perform a bitwise AND with 0100:
1101
0100
---- AND
0100
If the result is non-zero, then the 2nd bit must have been set; otherwise it was not set.
If you want to set the 3rd bit from the left, then perform a bitwise OR with 0010:
1101
0010
---- OR
1111
You can use the C operators & (for bitwise AND) and | (for bitwise OR) to perform these tasks; note that && and || are the logical operators and do not work bit by bit. You will need to construct the bit access patterns (the 0100 and 0010 in the above examples) yourself. The trick is to remember that the least significant bit (LSB) counts 1s, the next LSB counts 2s, then 4s etc. So, the bit access pattern for the n-th LSB (starting at 0) is simply the value of 2^n. The easiest way to compute this in C is to shift the value 1 (binary 0001 in this four-bit example) to the left by the required number of places, which is just '1 << n'.
Example
unsigned char myVal = 0x65; /* in hex; this is 01100101 in binary. */
/* Q: is the 3-rd least significant bit set (again, the LSB is the 0th bit)? */
unsigned char pattern = 1;
pattern <<= 3; /* Shift pattern left by three places. */
if (myVal & pattern) {printf("Yes!\n");} /* Perform the test with bitwise AND. */
/* Set the most significant bit. */
myVal |= (unsigned char)(1 << 7);
This example hasn't been tested, but should serve to illustrate the general idea.
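One detail worth checking in any version of this example is the difference between the bitwise operator & and the logical operator &&. The sketch below uses two hypothetical helper functions to contrast them on the same value, 0x65.

```c
/* Returns 1 if bit n of value is set, using bitwise AND. */
static int bit_is_set(unsigned char value, unsigned n)
{
    return (value & (1u << n)) != 0;
}

/* For contrast: logical AND only asks whether both operands are
   non-zero, so it says nothing about any particular bit. */
static int logical_and_result(unsigned char value, unsigned n)
{
    return value && (1u << n);
}
```

For 0x65 (01100101), bit 3 is clear, so bit_is_set(0x65, 3) is 0, while logical_and_result(0x65, 3) is 1 simply because both operands are non-zero.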
To query the state of the bit with a specific index:
int index_state = variable & ( 1 << bit_index );
To set a bit:
variable |= 1 << bit_index;
To reset (clear) a bit:
variable &= ~( 1 << bit_index );
Try using bitfields. Be careful: the implementation can vary by compiler.
http://publications.gbdirect.co.uk/c_book/chapter6/bitfields.html
If you want to index a bit you could:
bit = (ch & 0x80) >> 7;
gets the msb of a char named ch. You could even leave out the right shift and do a test on 0.
bit = ch & 0x80;
if the bit is set the result will be non-zero;
obviously, you need to change the mask to get different bits (NB: the 0x80 is the bit mask if it is unclear). It is possible to define numerous masks e.g.
#define BIT_0 0x1 // or 1 << 0
#define BIT_1 0x2 // or 1 << 1
#define BIT_2 0x4 // or 1 << 2
#define BIT_3 0x8 // or 1 << 3
etc...
This gives you:
bit = ch & BIT_1;
You can use these definitions in the above code to successfully index a bit within either a macro or a function.
To set a bit:
ch |= BIT_2;
To clear a bit:
ch &= ~BIT_3;
To toggle a bit:
ch ^= BIT_4;
This help?
Individual bits can be indexed as follows.
Define a struct like this one:
struct
{
unsigned bit0 : 1;
unsigned bit1 : 1;
unsigned bit2 : 1;
unsigned bit3 : 1;
unsigned reserved : 28;
} bitPattern;
Now if I want to know the individual bit values of a var named "value", do the following:
CopyMemory( &bitPattern, &value, sizeof(value) );
To see if bit 2 is high or low:
int state = bitPattern.bit2;
Hope this helps.
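A portable variant of the same idea can be sketched with memcpy from the standard library instead of the Windows-specific CopyMemory. The struct tag and function names below are made up for illustration. Note that the C standard leaves the order in which bit-fields are laid out inside the storage unit to the implementation, so which bit of value ends up in bit2 can differ between compilers; the shift-based helper is included as a layout-independent alternative.

```c
#include <string.h>

struct bit_pattern {
    unsigned bit0 : 1;
    unsigned bit1 : 1;
    unsigned bit2 : 1;
    unsigned bit3 : 1;
    unsigned reserved : 28;
};

/* Copy the raw bytes of value into the bit-field struct.
   Which bit lands in which field is implementation-defined. */
static struct bit_pattern to_bit_pattern(unsigned value)
{
    struct bit_pattern p;
    size_t n = sizeof p < sizeof value ? sizeof p : sizeof value;
    memset(&p, 0, sizeof p);
    memcpy(&p, &value, n);
    return p;
}

/* Layout-independent alternative: bit n counted from the LSB. */
static unsigned bit_by_shift(unsigned value, unsigned n)
{
    return (value >> n) & 1u;
}
```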
There is a standard library container for bits: std::vector<bool>. It is specialised in the library to be space efficient. There is also a boost dynamic_bitset class.
These will let you perform operations on a set of boolean values, using one bit per value of underlying storage.
Boost dynamic bitset documentation
For the STL documentation, see your compiler documentation.
Of course, you can also address the individual bits in other integral types by hand. If you do that, you should use unsigned types so that you don't get undefined behaviour if you decide to do a right shift on a value with the high bit set. However, it sounds like you want the containers.
To the commenter who claimed this takes 32x more space than necessary: boost::dynamic_bitset and vector<bool> are specialised to use one bit per entry, and so there is not a space penalty, assuming that you actually want more than the number of bits in a primitive type. These classes allow you to address individual bits in a large container with efficient underlying storage. If you just want (say) 32 bits, by all means, use an int. If you want some large number of bits, you can use a library container.