Simple serial comm. & Calculating CRC16 using C

I have a sensor from which I wish to get some data. I want to ask it for the date or ask it for data.
The sensor uses RS-232 (9600 8N1) to communicate.
The communication packet is composed of a header, a payload and a CRC. The manual provides the header and payload for whatever you want to do. Each communication packet is laid out as follows:
<SOH> header <STX> payload <ETB> CRC16 <ETX>
<SOH>: 0x01
<STX>: 0x02
<ETB>: 0x17
<ETX>: 0x03
The manual gives an example: if you want to ask for the date, the header is 0x31 and the payload is 0x41.
Thus the command to send the sensor is: \x01\x31\x02\x41\x17 [CRC16] \x03
Now, as an example, the manual also calculates the CRC16 for you: A0D5, sent as ASCII hex characters. The CRC16 needs to be transmitted little endian (low byte first).
So the full command is now:
\x01\x31\x02\x41\x17\x44\x35\x41\x30\x03
The manual doesn't provide any other CRC16 calculations, and it expects the user to do it which is fine :)
From the manual: Each packet is validated by a 16-bit CRC transferred in hexadecimal ASCII coding (four chars). The CRC is calculated from the header and payload concatenated.
WORD CRC16_Compute( BYTE *pBuffer, WORD length )
{
    BYTE i;
    BOOL bs;
    WORD crc = 0;

    while( length-- )
    {
        crc ^= *pBuffer++;
        for( i = 0; i < 8; i++ )
        {
            bs = crc & 1;
            crc >>= 1;
            if( bs )
            {
                crc ^= 0xA001;
            }
        }
    }
    return crc;
}
That is the CRC calculator in C. I am not too savvy with C, but this code snippet is all they provide, with no context.
In ASCII, they are feeding in '1' and 'A' (0x31 and 0x41) to get the 16-bit value A0D5. Could someone explain to me what the CRC code is doing? Thanks!

#include <stdio.h>

typedef unsigned char BYTE;
typedef unsigned int BOOL;
typedef unsigned int WORD;

WORD CRC16_Compute( BYTE *pBuffer, WORD length )
{
    BYTE i;
    BOOL bs;
    WORD crc = 0;

    while( length-- )
    {
        crc ^= *pBuffer++;
        for( i = 0; i < 8; i++ )
        {
            bs = crc & 1;
            crc >>= 1;
            if( bs )
            {
                crc ^= 0xA001;
            }
        }
    }
    return crc;
}

int main ( void )
{
    unsigned char data[2];
    unsigned char sdata[10];    /* SOH hdr STX pay ETB + four CRC chars + ETX = 10 bytes */
    unsigned int x;
    unsigned int z;
    unsigned int i;

    data[0] = 0x31;             /* header */
    data[1] = 0x41;             /* payload */
    x = CRC16_Compute(data, 2);
    x &= 0xFFFF;
    printf("0x%X\n", x);

    z = 0;
    sdata[z++] = 0x01;          /* SOH */
    sdata[z++] = data[0];       /* header */
    sdata[z++] = 0x02;          /* STX */
    sdata[z++] = data[1];       /* payload */
    sdata[z++] = 0x17;          /* ETB */
    /* CRC16 as four ASCII hex chars, low byte first; values above '9' are bumped to 'A'-'F' */
    sdata[z] = ((x >>  4) & 0xF) + 0x30; if(sdata[z] > 0x39) sdata[z] += 7; z++;
    sdata[z] = ((x >>  0) & 0xF) + 0x30; if(sdata[z] > 0x39) sdata[z] += 7; z++;
    sdata[z] = ((x >> 12) & 0xF) + 0x30; if(sdata[z] > 0x39) sdata[z] += 7; z++;
    sdata[z] = ((x >>  8) & 0xF) + 0x30; if(sdata[z] > 0x39) sdata[z] += 7; z++;
    sdata[z++] = 0x03;          /* ETX */

    for(i = 0; i < z; i++) printf("%02X ", sdata[i]);
    printf("\n");
    return 0;
}
Run it:
gcc so.c -o so
./so
0xA0D5
01 31 02 41 17 44 35 41 30 03
How about that, the right answer....
Everything you need to know is in your question; just do what it says. A few minutes of crudely putting it together.
CRC is calculated from the header+payload concatenated
data[0]=0x31;
data[1]=0x41;
That is header and payload and it gives the right answer based on the CRC code provided.
Then you build the packet with the other items. If you google an ASCII table you can see the values for 'D', 'A', '0' and '5', and can figure out how to get from 0xD to 0x44 and from 0xA to 0x41. First look at how 0x0 becomes 0x30 and 0x5 becomes 0x35: for 0-9 you just add 0x30, but 0x0A would give 0x3A when it needs to be 0x41 ('A'), so for values above 9 you adjust by another 7.
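A more compact way to express that adjustment (my sketch, not part of the original answer) is to index into a string of hex digits, which avoids the compare-and-add-7 dance:

/* sketch: convert each nibble of the CRC to its ASCII hex character */
const char *hexdigits = "0123456789ABCDEF";
sdata[z++] = hexdigits[(x >>  4) & 0xF];   /* low byte, high nibble  -> 'D' */
sdata[z++] = hexdigits[(x >>  0) & 0xF];   /* low byte, low nibble   -> '5' */
sdata[z++] = hexdigits[(x >> 12) & 0xF];   /* high byte, high nibble -> 'A' */
sdata[z++] = hexdigits[(x >>  8) & 0xF];   /* high byte, low nibble  -> '0' */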
So the code works as described in the manual. I don't know this sensor, and the way they did it seems goofy, but good for them for providing a worked example and the details of the CRC16; there are multiple standard variations, including the initial value, so again good for them for saving tons of time trying to figure that out.


Is there an architecture-independent method to create a little-endian byte stream from a value in C?

I am trying to transmit values between architectures by creating a uint8_t[] buffer and then sending that. To ensure they are transmitted correctly, the spec is to convert all values to little-endian as they go into the buffer.
I read this article here which discussed how to convert from one endianness to the other, and here where it discusses how to check the endianness of the system.
I am curious whether there is a method to read bytes from a uint64_t or other value in little-endian order regardless of whether the system is big or little endian (i.e. through some sequence of bitwise operations)?
Or is the only method to first check the endianness of the system, and then explicitly convert to little-endian if it is big?
That's actually quite easy -- you just use shifts to convert between 'native' format (whatever that is) and little-endian:
/* put a 32-bit value into a buffer in little-endian order (4 bytes) */
void put32(uint8_t *buf, uint32_t val) {
    buf[0] = val;
    buf[1] = val >> 8;
    buf[2] = val >> 16;
    buf[3] = val >> 24;
}

/* get a 32-bit value from a buffer (little-endian) */
uint32_t get32(uint8_t *buf) {
    return (uint32_t)buf[0] + ((uint32_t)buf[1] << 8) +
           ((uint32_t)buf[2] << 16) + ((uint32_t)buf[3] << 24);
}
If you put a value into a buffer, transmit it as a byte stream to another machine, and then get the value from the received buffer, the two machines will have the same 32-bit value regardless of whether they have the same or different native byte ordering. The casts are needed because the default promotions will just convert to int, which might be smaller than a uint32_t, in which case the shifts could be out of range.
Be careful if your buffers are char rather than uint8_t (char might or might not be signed) -- you need to mask in that case:
uint32_t get32(char *buf) {
    return ((uint32_t)buf[0] & 0xff) + (((uint32_t)buf[1] & 0xff) << 8) +
           (((uint32_t)buf[2] & 0xff) << 16) + (((uint32_t)buf[3] & 0xff) << 24);
}
You can always serialize a uint64_t value to an array of uint8_t in little-endian order as simply:
uint64_t source = ...;
uint8_t target[8];
target[0] = source;
target[1] = source >> 8;
target[2] = source >> 16;
target[3] = source >> 24;
target[4] = source >> 32;
target[5] = source >> 40;
target[6] = source >> 48;
target[7] = source >> 56;
or
for (int i = 0; i < sizeof (uint64_t); i++) {
    target[i] = source >> i * 8;
}
and this will work anywhere where uint64_t and uint8_t exists.
Notice that this assumes that the source value is unsigned. Bit-shifting negative signed values will cause all sorts of headaches and you just don't want to do that.
Deserialization is a bit more complex if reading a byte at a time in order:
uint8_t source[8] = ...;
uint64_t target = 0;

for (int i = 0; i < sizeof (uint64_t); i++) {
    target |= (uint64_t)source[i] << i * 8;
}
The cast to (uint64_t) is absolutely necessary, because the operands of << will undergo integer promotions, and uint8_t would always be converted to a signed int - and "funny" things will happen when you shift a set bit into the sign bit of a signed int.
If you write this into a function
#include <inttypes.h>

void serialize(uint64_t source, uint8_t *target) {
    target[0] = source;
    target[1] = source >> 8;
    target[2] = source >> 16;
    target[3] = source >> 24;
    target[4] = source >> 32;
    target[5] = source >> 40;
    target[6] = source >> 48;
    target[7] = source >> 56;
}
and compile for x86-64 using GCC 11 and -O3, the function will be compiled to
serialize:
        movq    %rdi, (%rsi)
        ret
which just moves the 64-bit value of source into the target array as is. If you reverse the indices (7 ... 0; big-endian), GCC is clever enough to recognize that too and will compile it (with -O3) to
serialize:
        bswap   %rdi
        movq    %rdi, (%rsi)
        ret
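For reference, the reversed-index (big-endian) variant referred to above would be, as a sketch:

void serialize_be(uint64_t source, uint8_t *target) {
    target[7] = source;          /* least significant byte last */
    target[6] = source >> 8;
    target[5] = source >> 16;
    target[4] = source >> 24;
    target[3] = source >> 32;
    target[2] = source >> 40;
    target[1] = source >> 48;
    target[0] = source >> 56;    /* most significant byte first */
}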
Most standardized network protocols specify numbers in big-endian format. In fact, big-endian is also referred to as network byte order, and there are functions specifically for translating integers of various sizes between host and network byte order.
These functions are htons and ntohs for 16-bit values, and htonl and ntohl for 32-bit values. However, there is no equivalent for 64-bit values, and you're using little-endian for the network protocol, so these won't help you.
You can still however translate between the host byte order and the network byte order (little-endian in this case) without knowing the host order. You can do this by bit shifting the relevant values in to or out of the host numbers.
For example, to convert a 32 bit value from host to little endian and back to host:
uint32_t src_value = *some value*;
uint8_t buf[sizeof(uint32_t)];
int i;

for (i = 0; i < sizeof(uint32_t); i++) {
    buf[i] = (src_value >> (8 * i)) & 0xff;
}

uint32_t dest_value = 0;
for (i = 0; i < sizeof(uint32_t); i++) {
    dest_value |= (uint32_t)buf[i] << (8 * i);
}
For two systems that must communicate, you specify an "intercommunication byte order". Then you have functions that convert between that and the native byte order of each system's architecture.
There are three approaches to this problem. In order of efficiency:
Compile-time detection of endianness
Run-time detection of endianness
Endian-agnostic code (corresponding to the "sequence of bitwise operations" in your question).
Compile-time detection of endianness
On architectures whose byte order is the same as the intercomm byte order, these functions do no transformation, but by using them, the same code becomes portable between systems.
Such functions may already exist on your target platform, for example:
Linux's endian.h be64toh() et al.
POSIX htonl, htons, ntohl, ntohs
Windows' winsock.h (same as POSIX, but adds 64-bit htonll() and ntohll())
Where they don't exist, creating them with cross-platform support is trivial. For example:
uint16_t intercom_to_host_16( uint16_t intercom_word )
{
#if __BIG_ENDIAN__
    return intercom_word ;
#else
    return intercom_word >> 8 | intercom_word << 8 ;
#endif
}
Here I have assumed that the intercom order is big-endian, which makes the function compatible with network byte order per ntohs() et al. The macro __BIG_ENDIAN__ is predefined on most compilers; if not, simply define it as a command-line macro when compiling, e.g. -D __BIG_ENDIAN__.
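Since the question involves uint64_t, a 64-bit variant is built the same way (a sketch, assuming as above that the intercom order is big-endian and that __BIG_ENDIAN__ is defined appropriately):

uint64_t intercom_to_host_64( uint64_t intercom_word )
{
#if __BIG_ENDIAN__
    return intercom_word ;
#else
    uint64_t r = 0 ;
    for( int i = 0; i < 8; i++ )
    {
        r = (r << 8) | ((intercom_word >> (8 * i)) & 0xffu) ;   /* reverse the eight bytes */
    }
    return r ;
#endif
}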
Run-time detection of endianness
It is possible to detect endianness at runtime with minimal overhead:
uint16_t intercom_to_host_16( uint16_t intercom_word )
{
    static const union
    {
        uint16_t word ;
        uint8_t bytes[2] ;
    } test = { .word = 0xff00u } ;

    return test.bytes[0] == 0xffu ? intercom_word :
                                    intercom_word >> 8 | intercom_word << 8 ;
}
Of course you might wrap the test in a function for use in similar functions for other word sizes:
#include <stdbool.h>

bool isBigEndian()
{
    static const union
    {
        uint16_t word ;
        uint8_t bytes[2] ;
    } test = { .word = 0xff00u } ;

    return test.bytes[0] == 0xffu ;
}
Then simply have:
uint16_t intercom_to_host_16( uint16_t intercom_word )
{
    return isBigEndian() ? intercom_word :
                           intercom_word >> 8 | intercom_word << 8 ;
}
Endian agnostic code
It is entirely possible to use endian-agnostic code, but in that case all participants in the communication or file processing pay the software overhead even when the native byte order already matches the intercom byte order.
uint16_t intercom_to_host_16( uint16_t intercom_word )
{
    /* store the bytes in intercom (big-endian) order, then read them back natively */
    uint8_t host_word [2] = { intercom_word >> 8,
                              intercom_word & 0xffu } ;
    uint16_t result ;
    memcpy( &result, host_word, sizeof result ) ;   /* memcpy (string.h) avoids a strict-aliasing violation */
    return result ;
}

Filtering bit strings efficiently [duplicate]

This question already has an answer here: Mask and aggregate bits (1 answer). Closed 5 years ago.
I am looking for a bit manipulation function that takes two bit strings and filters and compacts the first string based on the second, so only the values where the second string are 1 are kept. Eg:
01101010 and 11110000 gives 00000110
01101010 and 00001111 gives 00001010
01101010 and 10101000 gives 00000011
By using loops, conditionals and working with each bit independently this is easy to implement, but I'm looking for a faster method using bit-manipulation tricks, if one exists, without conditionals and loops. It does not have to work for input longer than 32 bits. A solution would therefore have a signature like: uint32_t filter(uint32_t in, uint32_t mask)
In C it would look something like this with arrays and a loop:
void filter(bool in[], bool mask[], bool out[], int size) {
    int output_index = 0;
    for (int input_index = 0; input_index < size; ++input_index) {
        if (mask[input_index]) {
            out[output_index++] = in[input_index];
        }
    }
}
Here are a bunch of examples of the types of solutions I'm looking for: Bit Twiddling Hacks
If you only need to store bit sequences of up to 32 bits, it would be far more efficient to store them as 32-bit unsigned integers. Here's one way of doing it:
#include <stdio.h>
#include <stdint.h>
uint32_t filter(uint32_t in, uint32_t mask) {
    uint32_t result = 0, t, p = 1, q = 1;
    while (mask) {
        if ( (t = mask & 1) ) {
            if ( (q & in) ) result |= p;
            p <<= 1;
        }
        mask >>= 1;
        q <<= 1;
    }
    return result;
}
int main() {
    /* 01101010 and 11110000 gives 00000110 */
    printf("%04x %04x %04x\n", 0x6a, 0xf0, filter(0x6a, 0xf0)); /* Output: 0006 */
    /* 01101010 and 00001111 gives 00001010 */
    printf("%04x %04x %04x\n", 0x6a, 0x0f, filter(0x6a, 0x0f)); /* Output: 000a */
    /* 01101010 and 10101000 gives 00000011 */
    printf("%04x %04x %04x\n", 0x6a, 0xa8, filter(0x6a, 0xa8)); /* Output: 0003 */
    return 0;
}
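For what it's worth, if you can target a CPU with the BMI2 instruction set (Intel Haswell and later, plus recent AMD parts), this exact operation exists in hardware as PEXT (parallel bit extract). A minimal sketch using the intrinsic, compiled with -mbmi2:

#include <stdint.h>
#include <immintrin.h>

/* same contract as filter() above, but done by a single PEXT instruction */
uint32_t filter_pext(uint32_t in, uint32_t mask) {
    return _pext_u32(in, mask);
}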

Parallel Verilog CRC algorithm from C-like reference

I have a set of C-like snippets provided that describe a CRC algorithm, and this article that explains how to transform a serial implementation to a parallel one, which I need to implement in Verilog.
I tried using multiple online code generators, both serial and parallel (although serial would not work in the final solution), and also tried working from the article, but got nothing similar to what these snippets generate.
I should say I'm more or less exclusively a hardware engineer, and my understanding of C is rudimentary. I have also never worked with CRCs beyond a straightforward shift-register implementation. I can see the polynomial and initial value from what I have, but that is more or less it.
The serial implementation uses an augmented message. Should I also make the parallel one work on a message 6 bits wider, with zeros appended?
I do not understand too well how the final value crc6 is generated. crcValue is produced by running the final zeros of the augmented message through the CalcCrc function; then its top bit is written to its place in crc6 and removed before feeding it to the function again. Why is that? When working through the algorithm to get the matrices for the parallel implementation, should I take crc6 as my final result rather than the last value of crcValue?
Regardless of how crc6 is obtained, the snippet for the CRC check only runs the message through the function. How does that work?
Here are the code snippets:
const unsigned crc6Polynom = 0x03;   // x**6 + x + 1

unsigned CalcCrc(unsigned crcValue, unsigned thisbit) {
    unsigned m = crcValue & crc6Polynom;
    while (m > 0) {
        thisbit ^= (m & 1);
        m >>= 1;
        return (((thisbit << 6) | crcValue) >> 1);
    }
}
// obtain CRC6 for sending (6 bit)
unsigned GetCrc(unsigned crcValue) {
    unsigned crc6 = 0;
    for (i = 0; i < 6; i++) {
        crcValue = CalcCrc(crcValue, 0);
        crc6 |= (crcValue & 0x20) | (crc6 >> 1);
        crcValue &= 0x1F;   // remove output bit
    }
    return (crc6);
}
// Calculate CRC6
unsigned crcValue = 0x3F;
for (i = 1; i < nDataBits; i++) {      // Startbit excluded
    unsigned thisBit = (unsigned)((telegram >> i) & 0x1);
    crcValue = CalcCrc(crcValue, thisBit);
}
/* now send telegram + GetCrc(crcValue) */

// Check CRC6
unsigned crcValue = 0x3F;
for (i = 1; i < nDataBits + 6; i++) {  // No startbit, but with CRC
    unsigned thisBit = (unsigned)((telegram >> i) & 0x1);
    crcValue = CalcCrc(crcValue, thisBit);
}
if (crcValue != 0) { /* put error handler here */ }
Thanks in advance for any advice, I'm really stuck there.
XORing bits of the data stream can be done in parallel because only the least significant bit is used for feedback (in this case), and the order of the data-stream bit XOR operations doesn't affect the result.
Whether the hardware would need a parallel version depends on how a data stream is handled. The hardware could calculate the CRC one bit at a time during transmission or reception. If the hardware is staged to work with 6 bit characters, then a parallel version would make sense.
Since the snippets use a right shift for the CRC, it would seem that data for each 6 bit character is transmitted and received least significant bit first, to allow for hardware that could calculate CRC 1 bit at a time as it's transmitted or received. After all 6 bit data characters are transmitted, then the 6 bit CRC is transmitted (also least significant bit first).
The snippets seem wrong. My guess at what they should be:
/* calculate crc6 1 bit at a time */
const unsigned crc6Polynom = 0x43;   /* x**6 + x + 1 */

unsigned CalcCrc(unsigned crcValue, unsigned thisbit) {
    crcValue ^= thisbit;
    if (crcValue & 1)
        crcValue ^= crc6Polynom;
    crcValue >>= 1;
    return crcValue;
}
Example for passing 6 bits at a time; a 64-entry by 6-bit table lookup could be used to replace the for loop (a sketch of that follows the code below).
/* calculate 6 bits at a time */
unsigned CalcCrc6(unsigned crcValue, unsigned sixbits) {
    int i;
    crcValue ^= sixbits;
    for (i = 0; i < 6; i++) {
        if (crcValue & 1)
            crcValue ^= crc6Polynom;
        crcValue >>= 1;
    }
    return crcValue;
}
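That table-lookup sketch (my addition, not part of the original answer): generate the 64 entries once with the same bit-at-a-time rounds, after which the 6-bit update collapses to a single index. This works because the CRC register is only 6 bits wide, so the result depends solely on crcValue ^ sixbits.

/* sketch: 64-entry table so the 6-bit update becomes one lookup */
unsigned crc6Table[64];

void InitCrc6Table(void) {
    unsigned n;
    for (n = 0; n < 64; n++) {
        unsigned crc = n;
        int i;
        for (i = 0; i < 6; i++) {
            if (crc & 1)
                crc ^= 0x43;    /* crc6Polynom */
            crc >>= 1;
        }
        crc6Table[n] = crc;
    }
}

unsigned CalcCrc6Table(unsigned crcValue, unsigned sixbits) {
    return crc6Table[(crcValue ^ sixbits) & 0x3f];
}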
Assume that telegram contains 31 bits, 1 start bit + 30 data bits (five 6 bit characters):
/* code to calculate crc 6 bits at a time: */
unsigned crcValue = 0x3F;
int i;

telegram >>= 1;                      /* skip start bit */
for (i = 0; i < 5; i++) {
    crcValue = CalcCrc6(crcValue, telegram & 0x3f);
    telegram >>= 6;
}

_mm_crc32_u8 gives different result than reference code

I've been struggling with the intrinsics. In particular, I don't get the same results using the standard CRC calculation and the supposedly equivalent Intel intrinsics. I'd like to move to using _mm_crc32_u16 and _mm_crc32_u32, but if I can't get the 8-bit operation to work there's no point.
static UINT32 g_ui32CRC32Table[256] =
{
    0x00000000L, 0x77073096L, 0xEE0E612CL, 0x990951BAL,
    0x076DC419L, 0x706AF48FL, 0xE963A535L, 0x9E6495A3L,
    0x0EDB8832L, 0x79DCB8A4L, 0xE0D5E91EL, 0x97D2D988L,
    ....
// Your basic 32-bit CRC calculator
// NOTE: this code cannot be changed
UINT32 CalcCRC32(unsigned char *pucBuff, int iLen)
{
    UINT32 crc = 0xFFFFFFFF;
    for (int x = 0; x < iLen; x++)
    {
        crc = g_ui32CRC32Table[(crc ^ *pucBuff++) & 0xFFL] ^ (crc >> 8);
    }
    return crc ^ 0xFFFFFFFF;
}

UINT32 CalcCRC32_Intrinsic(unsigned char *pucBuff, int iLen)
{
    UINT32 crc = 0xFFFFFFFF;
    for (int x = 0; x < iLen; x++)
    {
        crc = _mm_crc32_u8(crc, *pucBuff++);
    }
    return crc ^ 0xFFFFFFFF;
}
That table is for a different CRC polynomial than the one used by the Intel instruction. The table is for the Ethernet/ZIP/etc. CRC, often referred to as CRC-32. The Intel instruction uses the iSCSI (Castagnoli) polynomial, for the CRC often referred to as CRC-32C.
This short example code can calculate either, by uncommenting the desired polynomial:
#include <stddef.h>
#include <stdint.h>
/* CRC-32 (Ethernet, ZIP, etc.) polynomial in reversed bit order. */
#define POLY 0xedb88320
/* CRC-32C (iSCSI) polynomial in reversed bit order. */
/* #define POLY 0x82f63b78 */
/* Compute CRC of buf[0..len-1] with initial CRC crc. This permits the
   computation of a CRC by feeding this routine a chunk of the input data at a
   time. The value of crc for the first chunk should be zero. */
uint32_t crc32c(uint32_t crc, const unsigned char *buf, size_t len)
{
    int k;

    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
    }
    return ~crc;
}
You can use the bit-at-a-time loop in this code to generate a replacement table for your code: the entry for each byte value n is the result of running n through the eight inner-loop rounds (with no pre- or post-inversion).
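A minimal sketch of that table generation (my addition; it prints entries in the same style as the table in the question, here for the CRC-32C polynomial):

#include <stdio.h>
#include <stdint.h>

#define POLY 0x82f63b78   /* CRC-32C (iSCSI) polynomial in reversed bit order */

int main(void) {
    int n, k;
    for (n = 0; n < 256; n++) {
        uint32_t crc = n;
        for (k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        printf("0x%08lXL,%c", (unsigned long)crc, n % 4 == 3 ? '\n' : ' ');
    }
    return 0;
}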
FWIW, I've obtained SW code that demonstrably matches the Intel crc32c instruction, but it uses a different polynomial: 0x82f63b78. The function definitely doesn't match any of the iSCSI test examples here: https://www.rfc-editor.org/rfc/rfc3720#appendix-B.4
What's frustrating in all this is that every implementation I've tried for CRC-32C comes out with different hashes from all the others. Is there a true piece of reference code out there?

How are my bytes in C stored?

First, I'm still a student, so I am not very experienced.
I'm working with a piece of Bluetooth hardware and I am using its protocol to send it commands. The protocol requires packets to be sent with the LSB first for each packet field.
I was getting error packets back indicating my CRC values were wrong, so I did some investigating. I found the problem, but I became confused in the process.
Here is some GDB output and other information elucidating my confusion.
I'm sending a packet that should look like this:
|Start Flag| Packet Num | Command | Payload | CRC | End Flag|
0xfc 0x1 0x0 0x8 0x0 0x5 0x59 0x42 0xfd
Here is some GDB output:
print /x reqId_ep
$1 = {start_flag = 0xfc, data = {packet_num = 0x1, command = {0x0, 0x8}, payload = {
0x0, 0x5}}, crc = 0x5942, end_flag = 0xfd}
reqId_ep is the variable name of the packet I'm sending. It looks all good there, but I am receiving the CRC error codes from it so something must be wrong.
Here I examine 9 bytes in hex starting from the address of my packet to send:
x/9bx 0x7fffffffdee0
0xfc 0x01 0x00 0x08 0x00 0x05 0x42 0x59 0xfd
And here the problem becomes apparent. The CRC is not LSB first. (0x42 0x59)
To fix my problem I removed the htons() call that I was using to set my CRC value.
And here is the same output above without htons():
p/x reqId_ep
$1 = {start_flag = 0xfc, data = {packet_num = 0x1, command = {0x0, 0x8}, payload = {
0x0, 0x5}}, crc = 0x4259, end_flag = 0xfd}
Here the CRC value does not appear LSB first.
But then:
x/9bx 0x7fffffffdee0
0xfc 0x01 0x00 0x08 0x00 0x05 0x59 0x42 0xfd
Here the CRC value is LSB first.
So apparently C stores values LSB first? Can someone please cast a light of knowledge upon me for this situation? Thank you kindly.
This has to do with endianness in computing:
http://en.wikipedia.org/wiki/Endianness#Endianness_and_operating_systems_on_architectures
For example, the value 4660 (base ten) is 0x1234 in hex. On a big-endian system it would be stored in memory as the bytes 0x12 0x34, while on a little-endian system it would be stored as 0x34 0x12.
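A quick way to see this on your own machine (a small sketch; it prints the bytes of 0x1234 in the order they sit in memory):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    uint16_t value = 0x1234;
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);        /* copy out the storage bytes */
    printf("%02x %02x\n", bytes[0], bytes[1]);  /* "12 34" = big endian, "34 12" = little endian */
    return 0;
}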
If you want to avoid this sort of issue in the future, it might just be easiest to create a large array or struct of unsigned char, and store individual values in it.
eg:
|Start Flag| Packet Num | Command | Payload | CRC | End Flag|
0xfc 0x1 0x0 0x8 0x0 0x5 0x59 0x42 0xfd
typedef struct packet {
    unsigned char startFlag;
    unsigned char packetNum;
    unsigned char commandMSB;
    unsigned char commandLSB;
    unsigned char payloadMSB;
    unsigned char payloadLSB;
    unsigned char crcMSB;
    unsigned char crcLSB;
    unsigned char endFlag;
} packet_t;
You could then create a function that you compile differently based on the type of system you are building for, using preprocessor macros.
e.g.:
/* Uncomment the line below if you are using a little-endian system;
   otherwise, leave it commented out. */
//#define LITTLE_ENDIAN_SYSTEM

// Function prototype
void writeCommand(int cmd, packet_t* pkt);

// Function definition
void writeCommand(int cmd, packet_t* pkt)
{
    if(!pkt)
    {
        printf("Error, invalid pointer!");
        return;
    }

#ifdef LITTLE_ENDIAN_SYSTEM
    pkt->commandMSB = (cmd & 0xFF00) >> 8;
    pkt->commandLSB = (cmd & 0x00FF);
#else   // Big-endian system
    pkt->commandMSB = (cmd & 0x00FF);
    pkt->commandLSB = (cmd & 0xFF00) >> 8;
#endif
    // Done
}

int main(void)
{
    packet_t myPacket = {0};    // Initialize so it is zeroed out
    writeCommand(0x1234, &myPacket);
    return 0;
}
One final note: avoid sending structs as a raw stream of data; send their individual elements one at a time instead! That is, don't assume the struct is stored internally like a giant array of unsigned chars. The compiler and system add things like padding and alignment, so the struct could actually be larger than 9 * sizeof(unsigned char).
Good luck!
This is architecture-dependent, based on which processor you're targeting. There are what are known as "Big Endian" systems, which store the most significant byte of a word first, and "Little Endian" systems that store the least significant byte first. It looks like you're looking at a little-endian system there.
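As a closing sketch (my addition, assuming the 9-byte packet layout shown in the question): because shifts and masks operate on values rather than on storage, you can fill the packet LSB-first on any host, with no htons() and no endianness check:

unsigned char pkt[9];
uint16_t crc = 0x4259;      /* the CRC value from the question */

pkt[0] = 0xfc;              /* start flag */
pkt[1] = 0x01;              /* packet num */
pkt[2] = 0x00;              /* command (2 bytes) */
pkt[3] = 0x08;
pkt[4] = 0x00;              /* payload (2 bytes) */
pkt[5] = 0x05;
pkt[6] = crc & 0xff;        /* CRC LSB first, as the protocol requires: 0x59 */
pkt[7] = crc >> 8;          /* CRC MSB: 0x42 */
pkt[8] = 0xfd;              /* end flag */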
