CRC bit-order confusion (C)

I'm calculating a CCITT CRC-16 bit by bit. I do it this way because it's a prototype that will later be ported to VHDL and end up in hardware to check a serial bit-stream.
On the net I found code for a single-bit CRC-16 update step. I wrote a test program and it works. Except for one strange thing: I have to feed the bits of a byte from lowest to highest bit. If I do it this way, I get correct results.
In the CCITT definition of CRC-16, though, the bits should be fed highest bit to lowest. The data-stream I want to calculate the CRC of comes in that format as well, so my current code is kind of useless for me.
I'm confused. I would not have expected that feeding the bits the wrong way around could work at all.
Question: Why is it possible that a CRC can be written to take the data in two different bit-orders, and how do I transform my single-bit update code so that it accepts the data MSB first?
For reference, here is the relevant code. Initialization and the final check have been removed to keep the example short:
typedef unsigned char bit;

void update_crc_single_bit(bit *crc, bit data)
{
    // update CRC for a single bit:
    bit temp[16];
    int i;

    temp[0]  = data ^ crc[15];
    temp[1]  = crc[0];
    temp[2]  = crc[1];
    temp[3]  = crc[2];
    temp[4]  = crc[3];
    temp[5]  = data ^ crc[4] ^ crc[15];
    temp[6]  = crc[5];
    temp[7]  = crc[6];
    temp[8]  = crc[7];
    temp[9]  = crc[8];
    temp[10] = crc[9];
    temp[11] = crc[10];
    temp[12] = data ^ crc[11] ^ crc[15];
    temp[13] = crc[12];
    temp[14] = crc[13];
    temp[15] = crc[14];

    for (i = 0; i < 16; i++)
        crc[i] = temp[i];
}
void update_crc_byte(bit *crc, unsigned char data)
{
    int j;

    // calculate CRC lowest bit first
    for (j = 0; j < 8; j++)
    {
        bit b = (data >> j) & 1;
        update_crc_single_bit(crc, b);
    }
}
Edit: Since there is some confusion here: I have to compute the CRC bit by bit, and for each byte MSB first. I can't simply store the bits because the code shown above is a prototype for something that will end up in hardware (without memory).
The code shown above generates the correct result if I feed in a bit-stream in the following order (shown is the index of the received bit. Each byte gets transmitted MSB first):
|- first byte -|- second byte -|- third byte
7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8,....
I need the single-bit update loop to be transformed so that it generates the same CRC using the natural order (i.e. as received):
|- first byte -|- second byte -|- third byte
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,....

If you look at the RevEng 16-bit CRC Catalogue, you see that there are two different CRCs called "CCITT", one of which is labeled there "CCITT-False". Somewhere along the way someone got confused about what the CCITT 16-bit CRC was, and that confusion was propagated widely. The two CRCs are described thusly, with the first one (KERMIT) being the true CCITT CRC:
KERMIT
width=16 poly=0x1021 init=0x0000 refin=true refout=true xorout=0x0000 check=0x2189 name="KERMIT"
and
CRC-16/CCITT-FALSE
width=16 poly=0x1021 init=0xffff refin=false refout=false xorout=0x0000 check=0x29b1 name="CRC-16/CCITT-FALSE"
You will note that the real one is reflected, and the false one is not, and there is another difference in the initialization. In reflected CRCs, the lowest bit of the data is processed first, so it appears that you are trying to compute the true CCITT CRC.
When the CRC is reflected, so is the order of the bits in the polynomial that is exclusive-ored into the register, so 0x1021 becomes 0x8408. Here is a simple C implementation that you can check against:
#include <stddef.h>

#define POLY 0x8408

unsigned crc16_ccitt(unsigned crc, unsigned char *buf, size_t len)
{
    int k;

    while (len--) {
        crc ^= *buf++;
        for (k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
    }
    return crc;
}
I don't know what you mean by "In the CCITT definition of CRC-16 the bits should be feed highest bit to lowest bit though". What definition are you referring to?
In this Altera document, you can see the shift-register diagram of the CRC for a hardware implementation (the diagram itself is not reproduced here).
For your code, you need to reverse the indices of your register temp[]: temp[0] becomes temp[15], and so on.

Update - If you look at the RevEng 16-bit CRC Catalogue, there's a link to an online CRC calculator.
The first three labeled as CRC-CCITT operate on data sent or received MSB to LSB using the polynomial 0x11021. The only difference is the starting value:
CRC-CCITT (XModem) - crc initialized to 0x0000, same as prefixing by 0x0000.
CRC-CCITT (0xFFFF) - crc initialized to 0xFFFF, same as prefixing by 0x84CF.
CRC-CCITT (0x1D0F) - crc initialized to 0x1D0F, same as prefixing by 0xFFFF.
So my guess is that you want to use one of these three.

Normally, bits are transferred on the line least significant bit first. So, if you have an array of bytes, the first bit is the least significant bit of the first byte, then comes the next-to-least significant bit, and so on up to the most significant bit of the first byte; then comes the least significant bit of the next byte. This is the order of the bits (coefficients) in the polynomial division you are making. Try my routines at https://github.com/mojadita/crc.git (there is a table there for CRC16-CCITT).


Compute CRC lookup table of poly CRC-32K/6.2

I need the properties of the poly (0x992c1a4c; 0x132583499) from the CRC Zoo.
I have read Wikipedia and Ross N. Williams thoroughly but I can't make the final connection. Also I don't know how to check the generated table for correctness.
Do I need to take into account the endianness of the system where it is implemented? Can I use the reflected algorithm regardless of that? Which initial and XorOut values should I pick? How do I check my results?
Trivia - the polynomial 0x132583499 is the product of 3 prime factors (carryless multiply):
0x3 * 0x3 * 0x5A12A42D = 0x132583499
This depends on whether the crc is left shifted or right shifted. Assuming the table is for working with one byte at a time, a 256-entry table of 32-bit values is used. For a left shifting crc, the most significant bit is masked off: 0x132583499 -> 0x32583499:
void gentbl(void)
{
    uint32_t crc;
    uint32_t b;
    uint32_t c;
    uint32_t i;

    for (c = 0; c < 0x100; c++) {
        crc = c << 24;
        for (i = 0; i < 8; i++) {
            b = crc >> 31;
            crc <<= 1;
            crc ^= (0 - b) & 0x32583499;
        }
        crctbl[c] = crc;
    }
}
For a right shifting crc, the polynomial is reversed and right shifted 1 bit (0x132583499 reversed = 0x132583499, since this polynomial happens to be its own bit reversal; shifted right 1 bit = 0x992c1a4c):
void gentbl(void)
{
    uint32_t crc;
    uint32_t b;
    uint32_t c;
    uint32_t i;

    for (c = 0; c < 0x100; c++) {
        crc = c;
        for (i = 0; i < 8; i++) {
            b = crc & 1;
            crc >>= 1;
            crc ^= (0 - b) & 0x992c1a4c;
        }
        crctbl[c] = crc;
    }
}
Do I need to take into account the endianness of the system where it is implemented?
Only if the code loads or stores more than a byte at a time. This may be required if the symbol size is not the same as a byte.
Can I use the reflected algorithm regardless of that (endianness)?
Yes, the endianness only affects the loading and storing of data. The reflected algorithm is used for right shifting crc, non-reflected for left shifting crc.
Which initial and XorOut values should I pick?
This is arbitrary and depends on the specific crc. Initial value is typically all zero bits or all one bits with a few exceptions. XorOut is most often 0, but sometimes all one bits to post complement a crc.
How do I check my results?
Use the code with the same initial, xorout, and polynomial as some crc used with an online calculator to verify some crc values. Note that endianness affects the output shown on some online calculators. The string size for an online calculator is limited, but even a few bytes should be enough to check the crc. If the table is created using code similar to the examples above, it's unlikely to have a mix of good and bad entries.
A 32 bit crc is the remainder produced by treating the message as a long n bit dividend and dividing it by a 33 bit polynomial, resulting in a 32 bit remainder, which is the crc. The remainder is appended to the message, resulting in an encoded string of n+32 bits that is an exact multiple of the crc polynomial. If there are no errors and the crc is generated for the n+32 bit encoded string of bits, the crc will always be some constant, such as 0 if xorout == 0.
The CRC Zoo table contains additional information, such as a list of maximum numbers of data bits (before the 32-bit crc is appended) versus the Hamming distance (HD), starting with HD=3, which means that every valid encoded string differs by at least 3 bits from any other valid encoded string, and therefore any 2-bit error can be detected if the message length is not too long. You can click on the lengths to see an expanded list including failure examples, showing the indexes of the leading bits and the last 32 bits of the message (somewhat confusing; I converted some of these to show everything as indexes). There are 12 lengths shown; I added a third row showing the lengths including the 32-bit crc:
HD =  {    3,    4,    5,    6,  7,  8, 9,10,11,12,13,14}
      {65506,65506,32738,32738,134,134,26,26,16,16, 3, 3}
      {65538,65538,32770,32770,166,166,58,58,48,48,35,35} +32 for crc
The web site includes examples of failures with crc poly of 0x132583499 when the message length is too long for a given hamming distance. For HD=3 or 4, encoded message with length (65538+1) 65539 bits, all zero bits except bit[0] and bit[65538] = 1, this will pass a crc check, even though there are 2 bits in error. For HD=5 or 6, encoded message length 32771, all zero bits except bit[{0, 1, 32769, 32770}] = 1, passes crc check with 4 bit error. For HD=7 or 8, encoded message length 167, all zero bits except bit[{0, 43, 44, 122, 123, 166}] = 1, passes crc check with 6 bit error. For HD=9 or 10, encoded message length 59, all zero bits except bit[{0, 5, 21, 25, 33, 37, 53, 58}] = 1, passes crc check with 8 bit error.

CRC lookup table generated in C always gives different results

I'm trying to create a function that generates a CRC lookup table. I'm working with an 8051 micro-controller, and I'd rather do the table lookup method but at the same time, I'd rather have my computer generate the values which then I can load directly into the micro-controller. Most of this source code has been borrowed from: http://www.rajivchakravorty.com/source-code/uncertainty/multimedia-sim/html/crc8_8c-source.html
I only added in the "main" function
#include <stdio.h>

#define GP 0x107
#define DI 0x07

static unsigned char crc8_table[256];
static int made_table = 0;

static void init_crc8()
{
    int i, j;
    unsigned char crc;

    if (!made_table) {
        for (i = 0; i < 256; i++) {
            crc = i;
            for (j = 0; j < 8; j++)
                crc = (crc << 1) ^ ((crc & 0x80) ? DI : 0);
            crc8_table[i] = crc & 0xFF;
        }
        made_table = 1;
    }
}

void crc8(unsigned char *crc, unsigned char m)
{
    if (!made_table)
        init_crc8();
    *crc = crc8_table[(*crc) ^ m];
    *crc &= 0xFF;
}

int main()
{
    unsigned char crc[1];

    crc8(crc, 'S');
    printf("S=%x\n", crc[0]); // different hex code almost every time
    crc8(crc, 'T');
    printf("T=%x\n", crc[0]); // different hex code almost every time
    return 0;
}
When I execute the program, I expect the same values on the screen each time, but the hex codes after the printed equals signs change on nearly every program execution.
What can I do to correct that issue? I don't want to be collecting incorrect CRC values.
crc[0] is not initialized. You need crc[0] = 0; or *crc = 0; before calling crc8() with crc. Then you won't get random answers coming from the random initial contents of crc[0].
You don't need the *crc &= 0xff; in crc8(). If char is eight bits, then it does nothing. If you have an odd architecture where char is more than eight bits, then you need to do *crc = crc8_table[((*crc) ^ m) & 0xff]; to assure that you don't go outside the bounds of the table. (Only the low eight bits of m will be used in the CRC calculation.) The contents of the table have already been limited to eight bits, so in any case you don't need a final & 0xff.
You may need a different initial value than zero, and you may need to exclusive-or the final CRC value with something, depending on the definition of the CRC-8 that you want. In the RevEng catalog of CRC's, there are two 8-bit CRCs with that polynomial that are not reflected. Both happen to start with an initial value of zero, but one is exclusive-ored with 0x55 at the end. Also the CRC definition you need may be reflected, in which case the shift direction changes and the polynomial is flipped. If your CRC-8 needs to be interoperable with some other software, then you need to find out the full definition of the CRC being used.
Passing a pointer seems like an odd choice here. It would be more efficient to just pass and return the CRC value directly. E.g. unsigned crc8(unsigned crc, unsigned ch) {, which would apply the eight bits in ch to the CRC crc, and return the new value. Note that you do not need to make the CRC value a char. unsigned is generally what C routines most efficiently take as an argument and return. In fact usually the first argument is passed in a register and returned in the same register.
Usually one computes a CRC on a message consisting of a series of bytes. It would be more efficient to have a routine that does the whole message with a loop, so that you don't need to check to see if the table has been built yet for every single byte of the message.
In the main, crc[0] has not been initialized. As a result, in crc8, *crc in the expression (*crc) ^ m is uninitialized, hence your random values.
Fix: initialize crc[0]. Something like
unsigned char crc[1] = { 0 };

What does mask variable do in this CRC checksum calculation?

The question is about code in figure 14-6 in here.
The mask is calculated as:
mask = -(crc & 1)
Why do we AND crc with 1 and then negate the result? Figure 14-5 does not have this mask variable; why?
Edit:
So since this point is clear, why do we have this line also:
crc = crc ^ byte;
This line is not present in Figure 14-5.
Can this program be used if the generator polynomial length is not multiple of 8 bits?
What that does is check the least significant bit of crc and then negate it. The effect is that if the bit is zero the mask will be zero (all zeroes), and if the bit is one the mask will be -1 (all ones). This is used to conditionally XOR with 0xEDB88320.
The other solution instead uses if to make that condition.
The second trick they're using in the second solution is to do the XOR for the bit check in one operation for all eight bits. In the first example they use (int)(crc^byte) < 0 (which amounts to checking the XOR of the most significant bits, i.e. the sign bits); they then shift both crc and byte one bit to the left and do the same on the next bit. In the second example they do the XOR eight bits at a time and then check each bit of the result.
To see what happens, consider if we change the first example to:
for (j = 0; j <= 7; j++) {
    crc = crc ^ mask_sign_bit(byte);
    if ((int)crc < 0)
        crc = (crc << 1) ^ 0x04C11DB7;
    else
        crc = crc << 1;
    byte = byte << 1;
}
where mask_sign_bit masks out every bit except the sign bit. The sign of crc ^ byte is the same as that of crc ^ mask_sign_bit(byte), so the outcome of the if statement is the same. Then, when crc is shifted one step to the left, the bit modified by crc = crc ^ mask_sign_bit(byte) is lost.
This operation turns the least significant bit into a mask.
For example, for an 8-bit value (for simplicity) we have:
00000000 -> 00000000
00000001 -> 11111111
Using unary minus may look like it complicates the circuitry of the CRC function, which otherwise requires no addition operations; negation can be expressed in terms of complement and addition as follows:
-x = ~x + 1
Some architectures support a bit-vector "broadcast" operation that sends the least significant bit to all other bits, which can give a large performance gain.

Read a single bit from a buffer of char

I would like to implement a function like this:
int read_single_bit(unsigned char* buffer, unsigned int index)
where index is the offset of the bit that I would want to read.
How do I use bit shifting or masking to achieve this?
You might want to split this into three separate tasks:
Determining which char contains the bit that you're looking for.
Determining the bit offset into that char that you need to read.
Actually selecting that bit out of that char.
I'll leave parts (1) and (2) as exercises, since they're not too bad. For part (3), one trick you might find useful would be to do a bitwise AND between the byte in question and a byte with a single 1 bit at the index that you want. For example, suppose you want to get the fourth bit out of a byte. You could then do something like this:
Byte: 11011100
Mask: 00001000
----------------
AND: 00001000
So think about the following: how would you generate the mask that you need given that you know the bit index? And how would you convert the AND result back to a single bit?
Good luck!
buffer[index/8] & (1u<<(index%8))
should do it (that is, view buffer as a bit array and test the bit at index).
Similarly:
buffer[index/8] |= (1u<<(index%8))
should set the index-th bit.
Or you could store a table of the eight shift states of 1 and & against that
unsigned char bits[] = { 1u<<0, 1u<<1, 1u<<2, 1u<<3, 1u<<4, 1u<<5, 1u<<6, 1u<<7 };
If your compiler doesn't optimize those / and % to bit ops (more efficient), then:
unsigned_int / 8 == unsigned_int >> 3
unsigned_int % 8 == unsigned_int & 0x07 //0x07 == 0000 0111
so
buffer[index>>3] & (1u<<(index&0x07u)) //test
buffer[index>>3] |= (1u<<(index&0x07u)) //set
One possible implementation of your function might look like this:
int read_single_bit(unsigned char *buffer, unsigned int index)
{
    unsigned char c = buffer[index / 8];   // the byte which contains the bit
    unsigned int bit_position = index % 8; // position of that bit within the byte

    // Shifting the byte right by (7 - bit_position) moves the bit whose
    // value you want into the lowest position; ANDing with 1 (binary
    // 00000001) then yields 1 or 0, depending on the value of that bit.
    return (c >> (7 - bit_position)) & 1;
}

Large bit arrays in C

Our OS professor mentioned that to assign a process id to a new process, the kernel incrementally searches for the first zero bit in an array whose size equals the maximum number of processes (~32,768 by default), where an allocated process id has a 1 stored in its slot.
As far as I know, there is no bit data type in C. Obviously, there's something I'm missing here.
Is there any such special construct from which we can build up a bit array? How is this done exactly?
More importantly, what are the operations that can be performed on such an array?
Bit arrays are simply byte arrays where you use bitwise operators to read the individual bits.
Suppose you have a 1-byte char variable. This contains 8 bits. You can test if the lowest bit is true by performing a bitwise AND operation with the value 1, e.g.
char a = /*something*/;
if (a & 1) {
/* lowest bit is true */
}
Notice that this is a single ampersand. It is completely different from the logical AND operator &&. This works because a & 1 will "mask out" all bits except the first, and so a & 1 will be nonzero if and only if the lowest bit of a is 1. Similarly, you can check if the second lowest bit is true by ANDing it with 2, and the third by ANDing with 4, etc, for continuing powers of two.
So a 32,768-element bit array would be represented as a 4096-element byte array, where the first byte holds bits 0-7, the second byte holds bits 8-15, etc. To perform the check, the code would select the byte from the array containing the bit that it wanted to check, and then use a bitwise operation to read the bit value from the byte.
As far as what the operations are, like any other data type, you can read values and write values. I explained how to read values above, and I'll explain how to write values below, but if you're really interested in understanding bitwise operations, read the link I provided in the first sentence.
How you write a bit depends on if you want to write a 0 or a 1. To write a 1-bit into a byte a, you perform the opposite of an AND operation: an OR operation, e.g.
char a = /*something*/;
a = a | 1; /* or a |= 1 */
After this, the lowest bit of a will be set to 1 whether it was set before or not. Again, you could write this into the second position by replacing 1 with 2, or into the third with 4, and so on for powers of two.
Finally, to write a zero bit, you AND with the inverse of the position you want to write to, e.g.
char a = /*something*/;
a = a & ~1; /* or a &= ~1 */
Now, the lowest bit of a is set to 0, regardless of its previous value. This works because ~1 will have all bits other than the lowest set to 1, and the lowest set to zero. This "masks out" the lowest bit to zero, and leaves the remaining bits of a alone.
A struct can assign bit-sizes to its members, but that's the extent of a "bit type" in C.
struct int_sized_struct {
    int foo:4;
    int bar:4;
    int baz:24;
};
The rest of it is done with bitwise operations. For example, searching that PID bitmap can be done with:
extern uint32_t *process_bitmap;
uint32_t *p = process_bitmap;
uint32_t bit_offset = 0;
uint32_t bit_test;

/* Scan pid bitmap 32 entries per cycle. */
while ((*p & 0xffffffff) == 0xffffffff) {
    p++;
}

/* Scan the 32-bit block that has an open slot for the open PID. */
bit_test = 0x80000000;
while ((*p & bit_test) == bit_test) {
    bit_test >>= 1;
    bit_offset++;
}
pid = (p - process_bitmap) * 32 + bit_offset; /* p counts 32-bit words */
This is roughly 32x faster than a simple for loop scanning an array with one byte per PID. (Actually more than 32x, since more of the bitmap will stay in the CPU cache.)
see http://graphics.stanford.edu/~seander/bithacks.html
There is no bit type in C, but bit manipulation is fairly straightforward. Some processors have bit-specific instructions, which the code below would optimize nicely to; even without them it should be pretty fast. It may or may not be faster to use an array of 32-bit words instead of bytes. Inlining instead of function calls would also help performance.
If you have the memory to burn, you can use a whole byte (or a whole 32-bit word, etc.) to store one bit, greatly improving performance at the cost of the memory used.
unsigned char data[SIZE];

unsigned char get_bit(unsigned int offset)
{
    // TODO: limit check offset
    if (data[offset >> 3] & (1 << (offset & 7)))
        return 1;
    else
        return 0;
}

void set_bit(unsigned int offset, unsigned char bit)
{
    // TODO: limit check offset
    if (bit)
        data[offset >> 3] |= 1 << (offset & 7);
    else
        data[offset >> 3] &= ~(1 << (offset & 7));
}
