CRC-32 on MicroController (Atmel) - c

I am currently trying to implement a CRC-32 for an incoming datastream (Serial communication) on an ATMEGA1280 and I am a little lost how to do this on the embedded side in C.... If anyone could point me in the proper direction and/or help in anyway i would greatly appreciate it...

There are plenty of CRC-32 implementations in C. The AT MEGA1280 has 128 KB of code space, it shoudn't have any problems running any off-the-shelf implementation.
Here is pretty much the first one I found.

You should know what polynomial you are dealing with. So it is not enough to know that you are using CRC, but you should also know polynomial.
You are looking for function with this kind of prototype
uint32_t crc(uint8_t * data, int len, uint32_t polynomial)
Or even further, this function can also support updating if you getting your data asynchronously, so there is extra parameter for resuming computation.
With CRC32 you stream bits through CRC function and you get 32 bit number that is used for data corruption checking.
I am sure you will be able to find C code of crc online.
EDIT:
It looks like, CRC32 polynomial is sort of standard and is usually unified.
That mean that CRC32 implementation will employ correct polynomial.
http://en.wikipedia.org/wiki/Computation_of_CRC

Copy-paste solution for AVR, ripped from here:
/* crc32.c -- compute the CRC-32 of a data stream
* Copyright (C) 1995-1996 Mark Adler
* For conditions of distribution and use, see copyright notice in zlib.h
*/
/* $Id: crc32.c,v 1.1 2007/10/07 20:47:35 matthias Exp $ */
/* ========================================================================
* Table of CRC-32's of all single-byte values (made by make_crc_table)
*/
static unsigned long crc_table[256] = {
0x00000000L, 0x77073096L, 0xee0e612cL, 0x990951baL, 0x076dc419L,
0x706af48fL, 0xe963a535L, 0x9e6495a3L, 0x0edb8832L, 0x79dcb8a4L,
0xe0d5e91eL, 0x97d2d988L, 0x09b64c2bL, 0x7eb17cbdL, 0xe7b82d07L,
0x90bf1d91L, 0x1db71064L, 0x6ab020f2L, 0xf3b97148L, 0x84be41deL,
0x1adad47dL, 0x6ddde4ebL, 0xf4d4b551L, 0x83d385c7L, 0x136c9856L,
0x646ba8c0L, 0xfd62f97aL, 0x8a65c9ecL, 0x14015c4fL, 0x63066cd9L,
0xfa0f3d63L, 0x8d080df5L, 0x3b6e20c8L, 0x4c69105eL, 0xd56041e4L,
0xa2677172L, 0x3c03e4d1L, 0x4b04d447L, 0xd20d85fdL, 0xa50ab56bL,
0x35b5a8faL, 0x42b2986cL, 0xdbbbc9d6L, 0xacbcf940L, 0x32d86ce3L,
0x45df5c75L, 0xdcd60dcfL, 0xabd13d59L, 0x26d930acL, 0x51de003aL,
0xc8d75180L, 0xbfd06116L, 0x21b4f4b5L, 0x56b3c423L, 0xcfba9599L,
0xb8bda50fL, 0x2802b89eL, 0x5f058808L, 0xc60cd9b2L, 0xb10be924L,
0x2f6f7c87L, 0x58684c11L, 0xc1611dabL, 0xb6662d3dL, 0x76dc4190L,
0x01db7106L, 0x98d220bcL, 0xefd5102aL, 0x71b18589L, 0x06b6b51fL,
0x9fbfe4a5L, 0xe8b8d433L, 0x7807c9a2L, 0x0f00f934L, 0x9609a88eL,
0xe10e9818L, 0x7f6a0dbbL, 0x086d3d2dL, 0x91646c97L, 0xe6635c01L,
0x6b6b51f4L, 0x1c6c6162L, 0x856530d8L, 0xf262004eL, 0x6c0695edL,
0x1b01a57bL, 0x8208f4c1L, 0xf50fc457L, 0x65b0d9c6L, 0x12b7e950L,
0x8bbeb8eaL, 0xfcb9887cL, 0x62dd1ddfL, 0x15da2d49L, 0x8cd37cf3L,
0xfbd44c65L, 0x4db26158L, 0x3ab551ceL, 0xa3bc0074L, 0xd4bb30e2L,
0x4adfa541L, 0x3dd895d7L, 0xa4d1c46dL, 0xd3d6f4fbL, 0x4369e96aL,
0x346ed9fcL, 0xad678846L, 0xda60b8d0L, 0x44042d73L, 0x33031de5L,
0xaa0a4c5fL, 0xdd0d7cc9L, 0x5005713cL, 0x270241aaL, 0xbe0b1010L,
0xc90c2086L, 0x5768b525L, 0x206f85b3L, 0xb966d409L, 0xce61e49fL,
0x5edef90eL, 0x29d9c998L, 0xb0d09822L, 0xc7d7a8b4L, 0x59b33d17L,
0x2eb40d81L, 0xb7bd5c3bL, 0xc0ba6cadL, 0xedb88320L, 0x9abfb3b6L,
0x03b6e20cL, 0x74b1d29aL, 0xead54739L, 0x9dd277afL, 0x04db2615L,
0x73dc1683L, 0xe3630b12L, 0x94643b84L, 0x0d6d6a3eL, 0x7a6a5aa8L,
0xe40ecf0bL, 0x9309ff9dL, 0x0a00ae27L, 0x7d079eb1L, 0xf00f9344L,
0x8708a3d2L, 0x1e01f268L, 0x6906c2feL, 0xf762575dL, 0x806567cbL,
0x196c3671L, 0x6e6b06e7L, 0xfed41b76L, 0x89d32be0L, 0x10da7a5aL,
0x67dd4accL, 0xf9b9df6fL, 0x8ebeeff9L, 0x17b7be43L, 0x60b08ed5L,
0xd6d6a3e8L, 0xa1d1937eL, 0x38d8c2c4L, 0x4fdff252L, 0xd1bb67f1L,
0xa6bc5767L, 0x3fb506ddL, 0x48b2364bL, 0xd80d2bdaL, 0xaf0a1b4cL,
0x36034af6L, 0x41047a60L, 0xdf60efc3L, 0xa867df55L, 0x316e8eefL,
0x4669be79L, 0xcb61b38cL, 0xbc66831aL, 0x256fd2a0L, 0x5268e236L,
0xcc0c7795L, 0xbb0b4703L, 0x220216b9L, 0x5505262fL, 0xc5ba3bbeL,
0xb2bd0b28L, 0x2bb45a92L, 0x5cb36a04L, 0xc2d7ffa7L, 0xb5d0cf31L,
0x2cd99e8bL, 0x5bdeae1dL, 0x9b64c2b0L, 0xec63f226L, 0x756aa39cL,
0x026d930aL, 0x9c0906a9L, 0xeb0e363fL, 0x72076785L, 0x05005713L,
0x95bf4a82L, 0xe2b87a14L, 0x7bb12baeL, 0x0cb61b38L, 0x92d28e9bL,
0xe5d5be0dL, 0x7cdcefb7L, 0x0bdbdf21L, 0x86d3d2d4L, 0xf1d4e242L,
0x68ddb3f8L, 0x1fda836eL, 0x81be16cdL, 0xf6b9265bL, 0x6fb077e1L,
0x18b74777L, 0x88085ae6L, 0xff0f6a70L, 0x66063bcaL, 0x11010b5cL,
0x8f659effL, 0xf862ae69L, 0x616bffd3L, 0x166ccf45L, 0xa00ae278L,
0xd70dd2eeL, 0x4e048354L, 0x3903b3c2L, 0xa7672661L, 0xd06016f7L,
0x4969474dL, 0x3e6e77dbL, 0xaed16a4aL, 0xd9d65adcL, 0x40df0b66L,
0x37d83bf0L, 0xa9bcae53L, 0xdebb9ec5L, 0x47b2cf7fL, 0x30b5ffe9L,
0xbdbdf21cL, 0xcabac28aL, 0x53b39330L, 0x24b4a3a6L, 0xbad03605L,
0xcdd70693L, 0x54de5729L, 0x23d967bfL, 0xb3667a2eL, 0xc4614ab8L,
0x5d681b02L, 0x2a6f2b94L, 0xb40bbe37L, 0xc30c8ea1L, 0x5a05df1bL,
0x2d02ef8dL
};
#define DO1(buf) crc = crc_table[((int)crc ^ (*buf++)) & 0xff] ^ (crc >> 8);
#define DO2(buf) DO1(buf); DO1(buf);
#define DO4(buf) DO2(buf); DO2(buf);
#define DO8(buf) DO4(buf); DO4(buf);
unsigned long crc32(crc, buf, len)
unsigned long crc;
const unsigned char *buf;
unsigned int len;
{
if (!buf) return(0L);
crc = crc ^ 0xffffffffL;
while (len >= 8)
{
DO8(buf);
len -= 8;
}
if (len) do {
DO1(buf);
} while (--len);
return(crc^0xffffffffL);
}

Related

Broken CRC32 Combine

I have a problem with a certain CRC method when trying to combine CRC's.
I have been using the combine CRC method and even adapted it a while ago to work with CRC16, etc. and I (hope?) understand how the filling of 0's work according to the answer here https://stackoverflow.com/a/23126768/5495036 with zlib's code
Thing is, I've been rapping my brain on how to get an operator that I can use as I am also working with a non standard CRC calculation. The original CRC calculation is based on the CRC32 0xEDB88320 polynomial for the lookup table but the calculation itself was broken, so instead of 256 byte lookup table looking like
0x00000000,0x77073096,0xee0e612c,0x990951ba,0x076dc419,0x706af48f,0xe963a535,... it now looks like this
0x00000000,0x96000000,0x30960000,0x07309600,0x77073096,0x2c770730,0x612c7707,... and the combine CRC uses the operator to be able to zero out the rest of the bits.
I can't change the calculation unfortunately so that idea is out :P
Any ideas?
EDIT:
The calculation is standard, it just uses a table that has been built using the standard polynomial of 0xEDB88320 but building it incorrectly. Table is still 256 ints. Starting CRC of 0xFFFFFFFF
Complete class code
public static byte[] ByteLookupTable =
{
0x00,0x00,0x00,0x00,0x96,0x30,0x07,0x77,0x2C,0x61,0x0E,0xEE,0xBA,0x51,
0x09,0x99,0x19,0xC4,0x6D,0x07,0x8F,0xF4,0x6A,0x70,0x35,0xA5,0x63,
0xE9,0xA3,0x95,0x64,0x9E,0x32,0x88,0xDB,0x0E,0xA4,0xB8,0xDC,0x79,0x1E,
0xE9,0xD5,0xE0,0x88,0xD9,0xD2,0x97,0x2B,0x4C,0xB6,0x09,0xBD,0x7C,
0xB1,0x7E,0x07,0x2D,0xB8,0xE7,0x91,0x1D,0xBF,0x90,0x64,0x10,0xB7,0x1D,
0xF2,0x20,0xB0,0x6A,0x48,0x71,0xB9,0xF3,0xDE,0x41,0xBE,0x84,0x7D,
0xD4,0xDA,0x1A,0xEB,0xE4,0xDD,0x6D,0x51,0xB5,0xD4,0xF4,0xC7,0x85,0xD3,
0x83,0x56,0x98,0x6C,0x13,0xC0,0xA8,0x6B,0x64,0x7A,0xF9,0x62,0xFD,
0xEC,0xC9,0x65,0x8A,0x4F,0x5C,0x01,0x14,0xD9,0x6C,0x06,0x63,0x63,0x3D,
0x0F,0xFA,0xF5,0x0D,0x08,0x8D,0xC8,0x20,0x6E,0x3B,0x5E,0x10,0x69,
0x4C,0xE4,0x41,0x60,0xD5,0x72,0x71,0x67,0xA2,0xD1,0xE4,0x03,0x3C,0x47,
0xD4,0x04,0x4B,0xFD,0x85,0x0D,0xD2,0x6B,0xB5,0x0A,0xA5,0xFA,0xA8,
0xB5,0x35,0x6C,0x98,0xB2,0x42,0xD6,0xC9,0xBB,0xDB,0x40,0xF9,0xBC,0xAC,
0xE3,0x6C,0xD8,0x32,0x75,0x5C,0xDF,0x45,0xCF,0x0D,0xD6,0xDC,0x59,
0x3D,0xD1,0xAB,0xAC,0x30,0xD9,0x26,0x3A,0x00,0xDE,0x51,0x80,0x51,0xD7,
0xC8,0x16,0x61,0xD0,0xBF,0xB5,0xF4,0xB4,0x21,0x23,0xC4,0xB3,0x56,
0x99,0x95,0xBA,0xCF,0x0F,0xA5,0xBD,0xB8,0x9E,0xB8,0x02,0x28,0x08,0x88,
0x05,0x5F,0xB2,0xD9,0x0C,0xC6,0x24,0xE9,0x0B,0xB1,0x87,0x7C,0x6F,
0x2F,0x11,0x4C,0x68,0x58,0xAB,0x1D,0x61,0xC1,0x3D,0x2D,0x66,0xB6,0x90,
0x41,0xDC,0x76,0x06,0x71,0xDB,0x01,0xBC,0x20,0xD2,0x98,0x2A,0x10,
0xD5,0xEF,0x89,0x85,0xB1,0x71,0x1F,0xB5,0xB6,0x06,0xA5,0xE4,0xBF,0x9F,
0x33,0xD4,0xB8,0xE8,0xA2,0xC9,0x07,0x78,0x34,0xF9,0x00,0x0F,0x8E,
0xA8,0x09,0x96,0x18,0x98,0x0E,0xE1,0xBB,0x0D,0x6A,0x7F,0x2D,0x3D,0x6D,
0x08,0x97,0x6C,0x64,0x91,0x01,0x5C,0x63,0xE6,0xF4,0x51,0x6B,0x6B,
0x62,0x61,0x6C,0x1C,0xD8,0x30,0x65,0x85,0x4E,0x00,0x62,0xF2,0xED,0x95,
0x06,0x6C,0x7B,0xA5,0x01,0x1B,0xC1,0xF4,0x08,0x82,0x57,0xC4,0x0F,
0xF5,0xC6,0xD9,0xB0,0x65,0x50,0xE9,0xB7,0x12,0xEA,0xB8,0xBE,0x8B,0x7C,
0x88,0xB9,0xFC,0xDF,0x1D,0xDD,0x62,0x49,0x2D,0xDA,0x15,0xF3,0x7C,
0xD3,0x8C,0x65,0x4C,0xD4,0xFB,0x58,0x61,0xB2,0x4D,0xCE,0x51,0xB5,0x3A,
0x74,0x00,0xBC,0xA3,0xE2,0x30,0xBB,0xD4,0x41,0xA5,0xDF,0x4A,0xD7,
0x95,0xD8,0x3D,0x6D,0xC4,0xD1,0xA4,0xFB,0xF4,0xD6,0xD3,0x6A,0xE9,0x69,
0x43,0xFC,0xD9,0x6E,0x34,0x46,0x88,0x67,0xAD,0xD0,0xB8,0x60,0xDA,
0x73,0x2D,0x04,0x44,0xE5,0x1D,0x03,0x33,0x5F,0x4C,0x0A,0xAA,0xC9,0x7C,
0x0D,0xDD,0x3C,0x71,0x05,0x50,0xAA,0x41,0x02,0x27,0x10,0x10,0x0B,
0xBE,0x86,0x20,0x0C,0xC9,0x25,0xB5,0x68,0x57,0xB3,0x85,0x6F,0x20,0x09,
0xD4,0x66,0xB9,0x9F,0xE4,0x61,0xCE,0x0E,0xF9,0xDE,0x5E,0x98,0xC9,
0xD9,0x29,0x22,0x98,0xD0,0xB0,0xB4,0xA8,0xD7,0xC7,0x17,0x3D,0xB3,0x59,
0x81,0x0D,0xB4,0x2E,0x3B,0x5C,0xBD,0xB7,0xAD,0x6C,0xBA,0xC0,0x20,
0x83,0xB8,0xED,0xB6,0xB3,0xBF,0x9A,0x0C,0xE2,0xB6,0x03,0x9A,0xD2,0xB1,
0x74,0x39,0x47,0xD5,0xEA,0xAF,0x77,0xD2,0x9D,0x15,0x26,0xDB,0x04,
0x83,0x16,0xDC,0x73,0x12,0x0B,0x63,0xE3,0x84,0x3B,0x64,0x94,0x3E,0x6A,
0x6D,0x0D,0xA8,0x5A,0x6A,0x7A,0x0B,0xCF,0x0E,0xE4,0x9D,0xFF,0x09,
0x93,0x27,0xAE,0x00,0x0A,0xB1,0x9E,0x07,0x7D,0x44,0x93,0x0F,0xF0,0xD2,
0xA3,0x08,0x87,0x68,0xF2,0x01,0x1E,0xFE,0xC2,0x06,0x69,0x5D,0x57,
0x62,0xF7,0xCB,0x67,0x65,0x80,0x71,0x36,0x6C,0x19,0xE7,0x06,0x6B,0x6E,
0x76,0x1B,0xD4,0xFE,0xE0,0x2B,0xD3,0x89,0x5A,0x7A,0xDA,0x10,0xCC,
0x4A,0xDD,0x67,0x6F,0xDF,0xB9,0xF9,0xF9,0xEF,0xBE,0x8E,0x43,0xBE,0xB7,
0x17,0xD5,0x8E,0xB0,0x60,0xE8,0xA3,0xD6,0xD6,0x7E,0x93,0xD1,0xA1,
0xC4,0xC2,0xD8,0x38,0x52,0xF2,0xDF,0x4F,0xF1,0x67,0xBB,0xD1,0x67,0x57,
0xBC,0xA6,0xDD,0x06,0xB5,0x3F,0x4B,0x36,0xB2,0x48,0xDA,0x2B,0x0D,
0xD8,0x4C,0x1B,0x0A,0xAF,0xF6,0x4A,0x03,0x36,0x60,0x7A,0x04,0x41,0xC3,
0xEF,0x60,0xDF,0x55,0xDF,0x67,0xA8,0xEF,0x8E,0x6E,0x31,0x79,0xBE,
0x69,0x46,0x8C,0xB3,0x61,0xCB,0x1A,0x83,0x66,0xBC,0xA0,0xD2,0x6F,0x25,
0x36,0xE2,0x68,0x52,0x95,0x77,0x0C,0xCC,0x03,0x47,0x0B,0xBB,0xB9,
0x16,0x02,0x22,0x2F,0x26,0x05,0x55,0xBE,0x3B,0xBA,0xC5,0x28,0x0B,0xBD,
0xB2,0x92,0x5A,0xB4,0x2B,0x04,0x6A,0xB3,0x5C,0xA7,0xFF,0xD7,0xC2,
0x31,0xCF,0xD0,0xB5,0x8B,0x9E,0xD9,0x2C,0x1D,0xAE,0xDE,0x5B,0xB0,0xC2,
0x64,0x9B,0x26,0xF2,0x63,0xEC,0x9C,0xA3,0x6A,0x75,0x0A,0x93,0x6D,
0x02,0xA9,0x06,0x09,0x9C,0x3F,0x36,0x0E,0xEB,0x85,0x67,0x07,0x72,0x13,
0x57,0x00,0x05,0x82,0x4A,0xBF,0x95,0x14,0x7A,0xB8,0xE2,0xAE,0x2B,
0xB1,0x7B,0x38,0x1B,0xB6,0x0C,0x9B,0x8E,0xD2,0x92,0x0D,0xBE,0xD5,0xE5,
0xB7,0xEF,0xDC,0x7C,0x21,0xDF,0xDB,0x0B,0xD4,0xD2,0xD3,0x86,0x42,
0xE2,0xD4,0xF1,0xF8,0xB3,0xDD,0x68,0x6E,0x83,0xDA,0x1F,0xCD,0x16,0xBE,
0x81,0x5B,0x26,0xB9,0xF6,0xE1,0x77,0xB0,0x6F,0x77,0x47,0xB7,0x18,
0xE6,0x5A,0x08,0x88,0x70,0x6A,0x0F,0xFF,0xCA,0x3B,0x06,0x66,0x5C,0x0B,
0x01,0x11,0xFF,0x9E,0x65,0x8F,0x69,0xAE,0x62,0xF8,0xD3,0xFF,0x6B,
0x61,0x45,0xCF,0x6C,0x16,0x78,0xE2,0x0A,0xA0,0xEE,0xD2,0x0D,0xD7,0x54,
0x83,0x04,0x4E,0xC2,0xB3,0x03,0x39,0x61,0x26,0x67,0xA7,0xF7,0x16,
0x60,0xD0,0x4D,0x47,0x69,0x49,0xDB,0x77,0x6E,0x3E,0x4A,0x6A,0xD1,0xAE,
0xDC,0x5A,0xD6,0xD9,0x66,0x0B,0xDF,0x40,0xF0,0x3B,0xD8,0x37,0x53,
0xAE,0xBC,0xA9,0xC5,0x9E,0xBB,0xDE,0x7F,0xCF,0xB2,0x47,0xE9,0xFF,0xB5,
0x30,0x1C,0xF2,0xBD,0xBD,0x8A,0xC2,0xBA,0xCA,0x30,0x93,0xB3,0x53,
0xA6,0xA3,0xB4,0x24,0x05,0x36,0xD0,0xBA,0x93,0x06,0xD7,0xCD,0x29,0x57,
0xDE,0x54,0xBF,0x67,0xD9,0x23,0x2E,0x7A,0x66,0xB3,0xB8,0x4A,0x61,
0xC4,0x02,0x1B,0x68,0x5D,0x94,0x2B,0x6F,0x2A,0x37,0xBE,0x0B,0xB4,0xA1,
0x8E,0x0C,0xC3,0x1B,0xDF,0x05,0x5A,0x8D,0xEF,0x02,0x2D,0xC3,0x8D,0x40,0x00
};
protected void BuildLookupTable()
{
if (LookupTable == null)
{
LookupTable = new uint[256];
for (int i = 0; i < LookupTable.Length; i++)
{
LookupTable[i] = BitConverter.ToUInt32(ByteLookupTable, i);
}
}
}
protected override uint CalculateBuffer(byte[] buffer, uint crc, int startPos, int endPos)
{
for (int i = 0; i < endPos; i++)
{
crc = LookupTable[(crc ^ buffer[i]) & 0xff] ^ (crc >> 8);
}
return crc;
}
The lookup table constants are correct for that polynomial, but apparently the conversion done by BuildLookupTable() is totally messed up. It would have been easy to rewrite the table to avoid needing BuildLookupTable().
What follows is from the original answer, which assumed that the table was converted correctly. Which it isn't.
As it is, this isn't a CRC, so the combination approach for a CRC does not apply here.
The one thing missing from your definition is what the initial value for the CRC is, and possibly if there is an exclusive-or done on the final CRC. That polynomial is the same as used by zlib, PKZIP, etc., where the initial CRC value is 0xffffffff and the final exclusive-or is with 0xffffffff. That CRC is referred to as CRC-32/ISO-HDLC.
Whatever your initial and the final exclusive-or values are, if they are equal, then you can use the crc32_combine() function from zlib as is. If they are not, you can still use crc32_combine(), but you need to exclusive-or the input and output CRC values of that function with the exclusive-or of the initial and final exclusive-or values.

How to improve Mersenne Twister vor AVX/SSE?

Today i have started a project having the goal to optimize the generation of random numbers.
I want to wipe several hard drives, using the Mersenne Twister PRNG, but unfortunately i'm only able to produce around 200MB/s of random data, on 8 hard drives, so 25MB/s each.
Is there a way to optimize this code, using AVX, or SSE (for legacy reasons) to improve this code?
Is it something what can be done without having studied computer science?
Unfortunately i'm a simple C programmer, only having the experience of a few years, but not having studied computer science yet.
Could someone provide some information, how to bring this forward?
Which of these processes could be improved? Can someone give some examples, how an enhanced version of it could look like? Can someone provide some good books to retrieve a better knowledge about this matter?
/*
A C-program for MT19937, with initialization improved 2002/1/26.
Coded by Takuji Nishimura and Makoto Matsumoto.
Before using, initialize the state by using init_genrand(seed)
or init_by_array(init_key, key_length).
Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura,
All rights reserved.
Copyright (C) 2005, Mutsuo Saito,
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The names of its contributors may not be used to endorse or promote
products derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Any feedback is very welcome.
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
email: m-mat # math.sci.hiroshima-u.ac.jp (remove space)
*/
#include <stdio.h>
#include "mt19937ar.h"
/* Period parameters */
#define N 624
#define M 397
#define MATRIX_A 0x9908b0dfUL /* constant vector a */
#define UPPER_MASK 0x80000000UL /* most significant w-r bits */
#define LOWER_MASK 0x7fffffffUL /* least significant r bits */
static unsigned long mt[N]; /* the array for the state vector */
static int mti=N+1; /* mti==N+1 means mt[N] is not initialized */
/* initializes mt[N] with a seed */
void init_genrand(unsigned long s)
{
mt[0]= s & 0xffffffffUL;
for (mti=1; mti<N; mti++) {
mt[mti] =
(1812433253UL * (mt[mti-1] ^ (mt[mti-1] >> 30)) + mti);
/* See Knuth TAOCP Vol2. 3rd Ed. P.106 for multiplier. */
/* In the previous versions, MSBs of the seed affect */
/* only MSBs of the array mt[]. */
/* 2002/01/09 modified by Makoto Matsumoto */
mt[mti] &= 0xffffffffUL;
/* for >32 bit machines */
}
}
/* initialize by an array with array-length */
/* init_key is the array for initializing keys */
/* key_length is its length */
/* slight change for C++, 2004/2/26 */
void init_by_array(unsigned long init_key[], int key_length)
{
int i, j, k;
init_genrand(19650218UL);
i=1; j=0;
k = (N>key_length ? N : key_length);
for (; k; k--) {
mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >> 30)) * 1664525UL))
+ init_key[j] + j; /* non linear */
mt[i] &= 0xffffffffUL; /* for WORDSIZE > 32 machines */
i++; j++;
if (i>=N) { mt[0] = mt[N-1]; i=1; }
if (j>=key_length) j=0;
}
for (k=N-1; k; k--) {
mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >> 30)) * 1566083941UL))
- i; /* non linear */
mt[i] &= 0xffffffffUL; /* for WORDSIZE > 32 machines */
i++;
if (i>=N) { mt[0] = mt[N-1]; i=1; }
}
mt[0] = 0x80000000UL; /* MSB is 1; assuring non-zero initial array */
}
/* generates a random number on [0,0xffffffff]-interval */
unsigned long genrand_int32(void)
{
unsigned long y;
static unsigned long mag01[2]={0x0UL, MATRIX_A};
/* mag01[x] = x * MATRIX_A for x=0,1 */
if (mti >= N) { /* generate N words at one time */
int kk;
if (mti == N+1) /* if init_genrand() has not been called, */
init_genrand(5489UL); /* a default initial seed is used */
for (kk=0;kk<N-M;kk++) {
y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
mt[kk] = mt[kk+M] ^ (y >> 1) ^ mag01[y & 0x1UL];
}
for (;kk<N-1;kk++) {
y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
mt[kk] = mt[kk+(M-N)] ^ (y >> 1) ^ mag01[y & 0x1UL];
}
y = (mt[N-1]&UPPER_MASK)|(mt[0]&LOWER_MASK);
mt[N-1] = mt[M-1] ^ (y >> 1) ^ mag01[y & 0x1UL];
mti = 0;
}
y = mt[mti++];
/* Tempering */
y ^= (y >> 11);
y ^= (y << 7) & 0x9d2c5680UL;
y ^= (y << 15) & 0xefc60000UL;
y ^= (y >> 18);
return y;
}
I think, Mersenne Twister PRNG is not suitable for your purpose, because of it is not cryptographically secure. If cryptographer can guess or recover piece of MT sequence (for example, from initially 0-filled disk part), he can recover your generator state, and recover entire your sequence, by re-run PRNG from that state.
I suggest you to use modification of RC4 PRNG, but works with array of "uint16_t [1 << 16]", rather than original byte-oriented RC4 with array "uint8_t [1 << 8]". This modification will give you 2 benefits:
For each computation step, you will extract 2 bytes, not one, as in the original. By other words, you will get ~2x performance improvement, this is important for you.
Period of this generator would be extremely long, I estimate it as 2^(2^14).
To preserve attack to your KSA (major RC4 vulnerability), I suggest you to "empty intial run" 2^18 times before usage. By other words, you initially, just read and drop initial 2^18 uint16_t values from PRNG.
Also, you can modify the KSA, to make it more hack-resistant.

What's the "to little endian" equivalent of htonl? [duplicate]

I need to convert a short value from the host byte order to little endian. If the target was big endian, I could use the htons() function, but alas - it's not.
I guess I could do:
swap(htons(val))
But this could potentially cause the bytes to be swapped twice, rendering the result correct but giving me a performance penalty which is not alright in my case.
Here is an article about endianness and how to determine it from IBM:
Writing endian-independent code in C: Don't let endianness "byte" you
It includes an example of how to determine endianness at run time ( which you would only need to do once )
const int i = 1;
#define is_bigendian() ( (*(char*)&i) == 0 )
int main(void) {
int val;
char *ptr;
ptr = (char*) &val;
val = 0x12345678;
if (is_bigendian()) {
printf(“%X.%X.%X.%X\n", u.c[0], u.c[1], u.c[2], u.c[3]);
} else {
printf(“%X.%X.%X.%X\n", u.c[3], u.c[2], u.c[1], u.c[0]);
}
exit(0);
}
The page also has a section on methods for reversing byte order:
short reverseShort (short s) {
unsigned char c1, c2;
if (is_bigendian()) {
return s;
} else {
c1 = s & 255;
c2 = (s >> 8) & 255;
return (c1 << 8) + c2;
}
}
;
short reverseShort (char *c) {
short s;
char *p = (char *)&s;
if (is_bigendian()) {
p[0] = c[0];
p[1] = c[1];
} else {
p[0] = c[1];
p[1] = c[0];
}
return s;
}
Then you should know your endianness and call htons() conditionally. Actually, not even htons, but just swap bytes conditionally. Compile-time, of course.
Something like the following:
unsigned short swaps( unsigned short val)
{
return ((val & 0xff) << 8) | ((val & 0xff00) >> 8);
}
/* host to little endian */
#define PLATFORM_IS_BIG_ENDIAN 1
#if PLATFORM_IS_LITTLE_ENDIAN
unsigned short htoles( unsigned short val)
{
/* no-op on a little endian platform */
return val;
}
#elif PLATFORM_IS_BIG_ENDIAN
unsigned short htoles( unsigned short val)
{
/* need to swap bytes on a big endian platform */
return swaps( val);
}
#else
unsigned short htoles( unsigned short val)
{
/* the platform hasn't been properly configured for the */
/* preprocessor to know if it's little or big endian */
/* use potentially less-performant, but always works option */
return swaps( htons(val));
}
#endif
If you have a system that's properly configured (such that the preprocessor knows whether the target id little or big endian) you get an 'optimized' version of htoles(). Otherwise you get the potentially non-optimized version that depends on htons(). In any case, you get something that works.
Nothing too tricky and more or less portable.
Of course, you can further improve the optimization possibilities by implementing this with inline or as macros as you see fit.
You might want to look at something like the "Portable Open Source Harness (POSH)" for an actual implementation that defines the endianness for various compilers. Note, getting to the library requires going though a pseudo-authentication page (though you don't need to register to give any personal details): http://hookatooka.com/poshlib/
This trick should would: at startup, use ntohs with a dummy value and then compare the resulting value to the original value. If both values are the same, then the machine uses big endian, otherwise it is little endian.
Then, use a ToLittleEndian method that either does nothing or invokes ntohs, depending on the result of the initial test.
(Edited with the information provided in comments)
My rule-of-thumb performance guess is that depends whether you are little-endian-ising a big block of data in one go, or just one value:
If just one value, then the function call overhead is probably going to swamp the overhead of unnecessary byte-swaps, and that's even if the compiler doesn't optimise away the unnecessary byte swaps. Then you're maybe going to write the value as the port number of a socket connection, and try to open or bind a socket, which takes an age compared with any sort of bit-manipulation. So just don't worry about it.
If a large block, then you might worry the compiler won't handle it. So do something like this:
if (!is_little_endian()) {
for (int i = 0; i < size; ++i) {
vals[i] = swap_short(vals[i]);
}
}
Or look into SIMD instructions on your architecture which can do it considerably faster.
Write is_little_endian() using whatever trick you like. I think the one Robert S. Barnes provides is sound, but since you usually know for a given target whether it's going to be big- or little-endian, maybe you should have a platform-specific header file, that defines it to be a macro evaluating either to 1 or 0.
As always, if you really care about performance, then look at the generated assembly to see whether pointless code has been removed or not, and time the various alternatives against each other to see what actually goes fastest.
Unfortunately, there's not really a cross-platform way to determine a system's byte order at compile-time with standard C. I suggest adding a #define to your config.h (or whatever else you or your build system uses for build configuration).
A unit test to check for the correct definition of LITTLE_ENDIAN or BIG_ENDIAN could look like this:
#include <assert.h>
#include <limits.h>
#include <stdint.h>
void check_bits_per_byte(void)
{ assert(CHAR_BIT == 8); }
void check_sizeof_uint32(void)
{ assert(sizeof (uint32_t) == 4); }
void check_byte_order(void)
{
static const union { unsigned char bytes[4]; uint32_t value; } byte_order =
{ { 1, 2, 3, 4 } };
static const uint32_t little_endian = 0x04030201ul;
static const uint32_t big_endian = 0x01020304ul;
#ifdef LITTLE_ENDIAN
assert(byte_order.value == little_endian);
#endif
#ifdef BIG_ENDIAN
assert(byte_order.value == big_endian);
#endif
#if !defined LITTLE_ENDIAN && !defined BIG_ENDIAN
assert(!"byte order unknown or unsupported");
#endif
}
int main(void)
{
check_bits_per_byte();
check_sizeof_uint32();
check_byte_order();
}
On many Linux systems, there is a <endian.h> or <sys/endian.h> with conversion functions. man page for ENDIAN(3)

Seeking single file encryption implemenation which can handle whole file en/de-crypt in Delphi and C

[Update] I am offering a bonus for this. Frankly, I don't care which encryption method is used. Preferably something simple like XTEA, RC4, BlowFish ... but you chose.
I want minimum effort on my part, preferably just drop the files into my projects and build.
Idealy you should already have used the code to en/de-crypt a file in Delphi and C (I want to trade files between an Atmel UC3 micro-processor (coding in C) and a Windows PC (coding in Delphi) en-and-de-crypt in both directions).
I have a strong preference for a single .PAS unit and a single .C/.H file. I do not want to use a DLL or a library supporting dozens of encryption algorithms, just one (and I certainly don't want anything with an install program).
I hope that I don't sound too picky here, but I have been googling & trying code for over a week and still can't find two implementations which match. I suspect that only someone who has already done this can help me ...
Thanks in advance.
As a follow up to my previous post, I am still looking for some very simple code with why I can - with minimal effort - en-de crypt a file and exchange it between Delphi on a PC and C on an Atmel UC3 u-processor.
It sounds simple in theory, but in practice it's a nightmare. There are many possible candidates and I have spend days googling and trying them out - to no avail.
Some are humonous libraries, supporting many encryption algorithms, and I want something lightweight (especially on the C / u-processor end).
Some look good, but one set of source offers only block manipulation, the other strings (I would prefer whole file en/de-crypt).
Most seem to be very poorly documented, with meaningless parameter names and no example code to call the functions.
Over the past weekend (plus a few more days), I have burned my way through a slew of XTEA, XXTEA and BlowFish implementations, but while I can encrypt, I can't reverse the process.
Now I am looking at AES-256. Dos anyone know of an implementation in C which is a single AES.C file? (plus AES.H, of course)
Frankly, I will take anything that will do whole file en/de-crypt between Delphi and C, but unless anyone has actually done this themselves, I expect to hear only "any implementation that meets the standard should do" - which is a nice theoory but just not working out for me :-(
Any simple AES-256 in C out there? I have some reasonable looking Delphi code, but won't be sure until I try them together.
Thanks in advance ...
I would suggest using the .NET Micro Framework on a secondary microcontroller (e.g. Atmel SAM7X) as a crypto coprocessor. You can test this out on a Netduino, which you can pick up for around $35 / £30. The framework includes an AES implementation within it, under the System.Security.Cryptography namespace, alongside a variety of other cryptographic functions that might be useful for you. The benefit here is that you get a fully tested and working implementation, and increased security via type-safe code.
You could use SPI or I2C to communicate between the two microcontrollers, or bit-bang your own data transfer protocol over several I/O lines in parallel if higher throughput is needed.
I did exactly this with an Arduino and a Netduino (using the Netduino to hash blocks of data for a hardware BitTorrent device) and implemented a rudimentary asynchronous system using various commands sent between the devices via SPI and an interrupt mechanism.
Arduino is SPI master, Netduino is SPI slave.
A GPIO pin on the Netduino is set as an output, and tied to another interrupt-enabled GPIO pin on the Arduino that is set as an input. This is the interrupt pin.
Arduino sends 0xF1 as a "hello" initialization message.
Netduino sends back 0xF2 as an acknolwedgement.
When Arduino wants to hash a block, it sends 0x48 (ASCII 'H') followed by the data. When it is done sending data, it sets CS low. It must send whole bytes; setting CS low when the number of received bits is not divisible by 8 causes an error.
The Netduino receives the data, and sends back 0x68 (ASCII 'h') followed by the number of received bytes as a 2-byte unsigned integer. If an error occurred, it sends back 0x21 (ASCII '!') instead.
If it succeeded, the Netduino computes the hash, then sets the interrupt pin high. During the computation time, the Arduino is free to continue its job whilst waiting.
The Arduino sends 0x52 (ASCII 'R') to request the result.
The Netduino sets the interrupt pin low, then sends 0x72 (ASCII 'r') and the raw hash data back.
Since the Arduino can service interrupts via GPIO pins, it allowed me to make the processing entirely asynchronous. A variable on the Arduino side tracks whether we're currently waiting on the coprocessor to complete its task, so we don't try to send it a new block whilst it's still working on the old one.
You could easily adapt this scheme for computing AES blocks.
Small C library for AES-256 by Ilya Levin. Short implementation, asm-less, simple usage. Not sure how would it work on your current micro CPU, though.
[Edit]
You've mentioned having some delphi implementation, but in case something not working together, try this or this.
Also I've found arduino (avr-based) module using the Ilya's library - so it should also work on your micro CPU.
Can you compile C code from Delphi (you can compile Delphi code from C++ Builder, not sure about VV). Or maybe use the Free Borland Command line C++ compiler or even another C compiler.
The idea is to use the same C code in your Windows app as you use on your microprocessor.. That way you can be reasonably sure that the code will work in both directions.
[Update] See
http://www.drbob42.com/examines/examin92.htm
http://www.hflib.gov.cn/e_book/e_book_file/bcb/ch06.htm (Using C++ Code in Delphi)
http://edn.embarcadero.com/article/10156#H11
It looks like you need to use a DLL, but you can statically link it if you don't want to distribute it
Here is RC4 code. It is very lightweight.
The C has been used in a production system for five years.
I have added lightly tested Delphi code. The Pascal is a line-by-line port with with unsigned char going to Byte. I have only run the Pascal in Free Pascal with Delphi option turned on, not Delphi itself. Both C and Pascal have simple file processors.
Scrambling the ciphertext gives the original cleartext back.
No bugs reported to date. Hope this solves your problem.
rc4.h
#ifndef RC4_H
#define RC4_H
/*
* rc4.h -- Declarations for a simple rc4 encryption/decryption implementation.
* The code was inspired by libtomcrypt. See www.libtomcrypt.org.
*/
typedef struct TRC4State_s {
int x, y;
unsigned char buf[256];
} TRC4State;
/* rc4.c */
void init_rc4(TRC4State *state);
void setup_rc4(TRC4State *state, char *key, int keylen);
unsigned endecrypt_rc4(unsigned char *buf, unsigned len, TRC4State *state);
#endif
rc4.c
void init_rc4(TRC4State *state)
{
int x;
state->x = state->y = 0;
for (x = 0; x < 256; x++)
state->buf[x] = x;
}
void setup_rc4(TRC4State *state, char *key, int keylen)
{
unsigned tmp;
int x, y;
// use only first 256 characters of key
if (keylen > 256)
keylen = 256;
for (x = y = 0; x < 256; x++) {
y = (y + state->buf[x] + key[x % keylen]) & 255;
tmp = state->buf[x];
state->buf[x] = state->buf[y];
state->buf[y] = tmp;
}
state->x = 255;
state->y = y;
}
unsigned endecrypt_rc4(unsigned char *buf, unsigned len, TRC4State *state)
{
int x, y;
unsigned char *s, tmp;
unsigned n;
x = state->x;
y = state->y;
s = state->buf;
n = len;
while (n--) {
x = (x + 1) & 255;
y = (y + s[x]) & 255;
tmp = s[x]; s[x] = s[y]; s[y] = tmp;
tmp = (s[x] + s[y]) & 255;
*buf++ ^= s[tmp];
}
state->x = x;
state->y = y;
return len;
}
int endecrypt_file(FILE *f_in, FILE *f_out, char *key)
{
TRC4State state[1];
unsigned char buf[4096];
size_t n_read, n_written;
init_rc4(state);
setup_rc4(state, key, strlen(key));
do {
n_read = fread(buf, 1, sizeof buf, f_in);
endecrypt_rc4(buf, n_read, state);
n_written = fwrite(buf, 1, n_read, f_out);
} while (n_read == sizeof buf && n_written == n_read);
return (n_written == n_read) ? 0 : 1;
}
int endecrypt_file_at(char *f_in_name, char *f_out_name, char *key)
{
int rtn;
FILE *f_in = fopen(f_in_name, "rb");
if (!f_in) {
return 1;
}
FILE *f_out = fopen(f_out_name, "wb");
if (!f_out) {
close(f_in);
return 2;
}
rtn = endecrypt_file(f_in, f_out, key);
fclose(f_in);
fclose(f_out);
return rtn;
}
#ifdef TEST
// Simple test.
int main(void)
{
char *key = "This is the key!";
endecrypt_file_at("rc4.pas", "rc4-scrambled.c", key);
endecrypt_file_at("rc4-scrambled.c", "rc4-unscrambled.c", key);
return 0;
}
#endif
Here is lightly tested Pascal. I can scramble the source code in C and descramble it with the Pascal implementation just fine.
type
RC4State = record
x, y : Integer;
buf : array[0..255] of Byte;
end;
KeyString = String[255];
procedure initRC4(var state : RC4State);
var
x : Integer;
begin
state.x := 0;
state.y := 0;
for x := 0 to 255 do
state.buf[x] := Byte(x);
end;
procedure setupRC4(var state : RC4State; var key : KeyString);
var
tmp : Byte;
x, y : Integer;
begin
y := 0;
for x := 0 to 255 do begin
y := (y + state.buf[x] + Integer(key[1 + x mod Length(key)])) and 255;
tmp := state.buf[x];
state.buf[x] := state.buf[y];
state.buf[y] := tmp;
end;
state.x := 255;
state.y := y;
end;
procedure endecryptRC4(var buf : array of Byte; len : Integer; var state : RC4State);
var
x, y, i : Integer;
tmp : Byte;
begin
x := state.x;
y := state.y;
for i := 0 to len - 1 do begin
x := (x + 1) and 255;
y := (y + state.buf[x]) and 255;
tmp := state.buf[x];
state.buf[x] := state.buf[y];
state.buf[y] := tmp;
tmp := (state.buf[x] + state.buf[y]) and 255;
buf[i] := buf[i] xor state.buf[tmp]
end;
state.x := x;
state.y := y;
end;
procedure endecryptFile(var fIn, fOut : File; key : KeyString);
var
nRead, nWritten : Longword;
buf : array[0..4095] of Byte;
state : RC4State;
begin
initRC4(state);
setupRC4(state, key);
repeat
BlockRead(fIN, buf, sizeof(buf), nRead);
endecryptRC4(buf, nRead, state);
BlockWrite(fOut, buf, nRead, nWritten);
until (nRead <> sizeof(buf)) or (nRead <> nWritten);
end;
procedure endecryptFileAt(fInName, fOutName, key : String);
var
fIn, fOut : File;
begin
Assign(fIn, fInName);
Assign(fOut, fOutName);
Reset(fIn, 1);
Rewrite(fOut, 1);
endecryptFile(fIn, fOut, key);
Close(fIn);
Close(fOut);
end;
{$IFDEF TEST}
// Very small test.
const
key = 'This is the key!';
begin
endecryptFileAt('rc4.pas', 'rc4-scrambled.pas', key);
endecryptFileAt('rc4-scrambled.pas', 'rc4-unscrambled.pas', key);
end.
{$ENDIF}
It looks easier would be to get reference AES implementation (which works with blocks), and add some code to handle CBC (or CTR encryption).
This would need from you only adding ~30-50 lines of code, something like the following (for CBC):
aes_expand_key();
first_block = iv;
for (i = 0; i < filesize / 16; i++)
{
data_block = read(file, 16);
data_block = (data_block ^ iv);
iv = encrypt_block(data_block);
write(outputfile, iv);
}
// if filesize % 16 != 0, then you also need to add some padding and encrypt the last block
Assuming the encryption strength isn't an issue, as in satisfying an organization's Chinese Wall requirement, the very simple "Sawtooth" encryption scheme of adding (i++ % modulo 256) to fgetc(), for each byte, starting at the beginning of the file, might work just fine.
Declaring i as a UCHAR will eliminate the modulo requirement, as the single byte integer cannot help but cycle through its 0-255 range.
The code is so simple it's not worth posting. A little imagination, and you'll have some embellishments that can add a lot to the strength of this cypher. The primary vulnerability of this cypher is large blocks of identical characters. Rectifying this is a good place to start improving its strength.
This cypher works on every possible file type, and is especially effective if you've already 7Zipped the file.
Performance is phenomenal. You won't even know the code is there. Totally I/O bound.

Bit manipulation library for ANSI C

Does anyone knows a good bit manipulation library for ANSI C?
What I basically need, is ability, like in Jovial to set specific bits in a variable, something like
// I assume LSB has index of 0
int a = 0x123;
setBits(&a,2,5, 0xFF);
printf("0x%x"); // should be 0x13F
int a = 0x123;
printf("0x%x",getBits(&a,2,5)); // should be 0x4
char a[] = {0xCC, 0xBB};
char b[] = {0x11, 0x12};
copyBits(a,/*to=*/4,b,/*from=*/,4,/*lengthToCopy=*/8);
// Now a == {0x1C, 0xB2}
There's a similar library called bitfile, but it doesn't seem to support direct memory manipulation. It only supports feeding bits to file streams.
It's not hard to write, but if there's something tested - I won't reinvent the wheel.
Maybe this library exists as a part of bigger library (bzip2, gzip are the usual suspects)?
I think is considered "too simple" for a library; most functions would only be a statement or two, which would make the overhead of calling a library function a bit more than typical C programmers tolerate. :)
That said, the always-excellent glib has two of the more complicated bit-oriented functions: g_bit_nth_lsf() and g_bit_nth_msf(). These are used to find the index of the first bit set, searching from the lowest or the highest bit, respectively.
This seems to be the problem I was tackling in my question
Algorithm for copying N bits at arbitrary position from one int to another
There are several different alternatives provided, with the fastest being the assembly solution by fnieto.
You will come a long way with the following macros:
#define SETBITS(mem, bits) (mem) |= (bits)
#define CLEARBITS(mem, bits) (mem) &= ~(bits)
#define BIN(b7,b6,b5,b4, b3,b2,b1,b0) \
(unsigned char)( \
((b7)<<7) + ((b6)<<6) + ((b5)<<5) + ((b4)<<4) + \
((b3)<<3) + ((b2)<<2) + ((b1)<<1) + ((b0)<<0) \
)
Then you can write
int a = 0x123;
SETBITS(a, BIN(0,0,0,1, 1,1,1,0));
printf("0x%x", a); // should be 0x13F
Maybe the algorithms from the "FXT" book (link at the bottom of the page) will be useful.

Resources