How can I create a 48-bit uint for bit mask


I am trying to create a 48-bit integer value. I understand it may be possible to use a char array or struct, but I want to be able to do bit masking/manipulation and I'm not sure how that can be done.
Currently the program uses a 16-bit uint and I need to change it to 48. It is a bytecode interpreter and I want to expand the memory addressing to 4GB. I could just use 64-bit, but that would waste a lot of space.
Here is a sample of the code:
unsigned int program[] = { 0x1064, 0x11C8, 0x2201, 0x0000 };
void decode( )
{
instrNum = (program[i] & 0xF000) >> 12; //the instruction
reg1 = (program[i] & 0xF00 ) >> 8; //registers
reg2 = (program[i] & 0xF0 ) >> 4;
reg3 = (program[i] & 0xF );
imm = (program[i] & 0xFF ); //pointer to data
}
full program: http://en.wikibooks.org/wiki/Creating_a_Virtual_Machine/Register_VM_in_C

You can use bit-fields, which represent integral types of a known, fixed bit width. A well-known use of bit-fields is to represent a set or series of bits, known as flags, and you can apply bit operations to them.
#include <stdio.h>
#include <stdint.h>
struct uint48 {
uint64_t x:48;
} __attribute__((packed));

Use a structure or a uint16_t array with special functions for an array of uint48.
For individual instances, use uint64_t or unsigned long long. uint64_t works fine for an individual uint48, but you may want to mask the results of operations like * or << to keep the upper bits cleared. Only arrays need some space-saving routines.
#include <stdint.h>
#include <stdlib.h>
typedef uint64_t uint48;
const uint48 uint48mask = 0xFFFFFFFFFFFFull; // low 48 bits set
uint48 uint48_get(const uint48 *a48, size_t index) {
const uint16_t *a16 = (const uint16_t *) a48;
index *= 3;
return a16[index] | (uint32_t) a16[index + 1] << 16
| (uint64_t) a16[index + 2] << 32;
}
void uint48_set(uint48 *a48, size_t index, uint48 value) {
uint16_t *a16 = (uint16_t *) a48;
index *= 3;
a16[index] = (uint16_t) value;
a16[++index] = (uint16_t) (value >> 16);
a16[++index] = (uint16_t) (value >> 32);
}
uint48 *uint48_new(size_t n) {
size_t size = n * 3 * sizeof(uint16_t);
// Ensure the allocated size is a multiple of `sizeof(uint64_t)`
// Not fully certain this is needed - but doesn't hurt.
if (size % sizeof(uint64_t)) {
size += sizeof(uint64_t) - size % sizeof(uint64_t);
}
return malloc(size);
}
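As an alternative sketch of the "space-saving routines" idea, the 48-bit values can also be kept in a plain byte array, six bytes per element, and widened to uint64_t only while they are being manipulated. The names `u48_load`, `u48_store`, `u48_opcode` and `u48_imm` are hypothetical, and the instruction layout (4-bit opcode in the top bits, 32-bit immediate in the low bits) is only an assumption made for illustration; it is not taken from the linked interpreter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define U48_BYTES 6
#define U48_MASK  0xFFFFFFFFFFFFull   /* low 48 bits set */

/* Read element `index` from a tightly packed array of 6-byte values.
 * Bytes are stored least significant first, independent of the host. */
static uint64_t u48_load(const uint8_t *bytes, size_t index)
{
    const uint8_t *p = bytes + index * U48_BYTES;
    uint64_t v = 0;
    for (int i = U48_BYTES - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Write the low 48 bits of `value` to element `index`. */
static void u48_store(uint8_t *bytes, size_t index, uint64_t value)
{
    uint8_t *p = bytes + index * U48_BYTES;
    assert(bytes != NULL);
    value &= U48_MASK;                 /* drop anything above bit 47 */
    for (int i = 0; i < U48_BYTES; i++) {
        p[i] = (uint8_t)(value & 0xFF);
        value >>= 8;
    }
}

/* Assumed layout: bits 47..44 opcode, bits 31..0 immediate. */
static unsigned u48_opcode(uint64_t instr)
{
    return (unsigned)((instr >> 44) & 0xF);
}

static uint32_t u48_imm(uint64_t instr)
{
    return (uint32_t)(instr & 0xFFFFFFFFu);
}
```

Decoding then works on an ordinary uint64_t, for example `u48_opcode(u48_load(program, pc))`, so the existing mask-and-shift style of the interpreter carries over with wider constants.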

Related

How to set the values of an array to a single variable

I'm reading the values from a SD card in an ARM micro:
Res = f_read(&fil, (void*)buf, 6, &NumBytesRead);
where fil is a pointer, buf is a buffer where the data is stored.
And that's the problem: it's an array but I'd like to have the contents of that array in a single variable.
To give an actual example: the 6 bytes read from the file are:
buf[0] = 0x1B
buf[1] = 0x26
buf[2] = 0xB3
buf[3] = 0x54
buf[4] = 0xA1
buf[5] = 0xCF
And I'd like to have: uint64_t data be equal to 0x1B26B354A1CF. That is, all the elements of the array "concatenated" in one single 64 bit integer.

Without type punning you can do as below.
uint64_t data = 0;
for (int i=0; i<6; i++)
{
data <<= 8;
data |= (uint64_t) buf[i];
}
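The same loop can be wrapped in a small helper, shown here as a sketch. The name `bytes_to_u64_be` is hypothetical; it assumes the first byte of the buffer is the most significant one, as in the question's example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine `len` bytes (most significant byte first) into one integer. */
static uint64_t bytes_to_u64_be(const uint8_t *buf, size_t len)
{
    uint64_t data = 0;
    assert(len <= sizeof data);   /* more than 8 bytes would not fit */
    for (size_t i = 0; i < len; i++)
        data = (data << 8) | buf[i];
    return data;
}
```

With the six bytes from the question, `bytes_to_u64_be(buf, 6)` gives 0x1B26B354A1CF regardless of the endianness of the microcontroller, because only shifts on the value are used and memory layout never enters the picture.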

Use a union, but keep the endianness in mind.
union
{
uint8_t u8[8];
uint64_t u64;
}u64;
A complete example that checks the host endianness first:
typedef union
{
uint8_t u8[8];
uint64_t u64;
}u64;
typedef enum
{
LITTLE_E,
BIG_E,
}ENDIANESS;
ENDIANESS checkEndianess(void)
{
ENDIANESS result = BIG_E;
u64 d64 = {.u64 = 0xff};
if(d64.u8[0]) result = LITTLE_E;
return result;
}
uint64_t arrayToU64(uint8_t *array, ENDIANESS e) // for the array BE
{
u64 d64;
if(e == LITTLE_E)
{
memmove(&d64, array, sizeof(d64.u64));
}
else
{
for(int index = sizeof(d64.u64) - 1; index >= 0; index--)
{
d64.u8[sizeof(d64.u64) - index - 1] = array[index];
}
}
return d64.u64;
}
int main()
{
uint8_t BIG_E_Array[] = {0x10,0x20,0x30,0x40,0x50,0x60,0x70,0x80};
ENDIANESS e;
printf("This system endianness: %s\n", (e = checkEndianess()) == BIG_E ? "BIG":"LITTLE");
printf("Punned uint64_t for our system 0x%llx\n", (unsigned long long)arrayToU64(BIG_E_Array, e));
printf("Punned uint64_t for the opposite endianness system 0x%llx\n", (unsigned long long)arrayToU64(BIG_E_Array, e == BIG_E ? LITTLE_E : BIG_E));
return 0;
}

Two things to take care of here:
1. have the bytes be ordered correctly
2. read the six bytes into one 64-bit integer
Issue 1 can be taken care of by storing the bytes coming in in network byte order (big endian) into the 64-bit integer in host byte order, for example by using the two macros below:
/* below defines of htonll() and ntohll() are taken from this answer:
https://stackoverflow.com/a/28592202/694576
*/
#if __BIG_ENDIAN__
# define htonll(x) (x)
# define ntohll(x) (x)
#else
# define htonll(x) (((uint64_t)htonl((x) & 0xFFFFFFFF) << 32) | htonl((x) >> 32))
# define ntohll(x) (((uint64_t)ntohl((x) & 0xFFFFFFFF) << 32) | ntohl((x) >> 32))
#endif
Issue 2 can be solved in multiple ways:
Extending your approach
#define BUFFER_SIZE (6)
...
assert(BUFFER_SIZE <= sizeof (uint64_t));
uint8_t buffer[BUFFER_SIZE];
FILE * pf = ...; /* open file here */
/* test if file has been opened successfully here */
... result = f_read(pf, buffer, BUFFER_SIZE, ...);
/* test result for success */
uint64_t number = 0;
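memcpy(&number, buffer, BUFFER_SIZE);
number = ntohll(number);
/* the BUFFER_SIZE bytes end up in the most significant positions, so shift them down */
number >>= 8 * (sizeof number - BUFFER_SIZE);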
Use "Type Punning" by using a union
union buffer_wrapper
{
uint8_t u8[sizeof (uint64_t)];
uint64_t u64;
};
Instead of
uint8_t buffer[BUFFER_SIZE];
use
union buffer_wrapper buffer;
and instead of
memcpy(&number, buffer, BUFFER_SIZE)
number = ntohll(number)
use
number = ntohll(buffer.u64)

custom data type in c: 4x14bit + 1x8bit within a 64 bit 'container'

I am currently trying to figure out an elegant and convenient way to store four 14-bit values and one 8-bit value within a 64-bit boundary.
something like this:
typedef struct my64bit{
unsigned data1 : 14;
unsigned data2 : 14;
unsigned data3 : 14;
unsigned data4 : 14;
unsigned other : 8;
}tmy64Bit;
Later I want to create an array of these 'containers':
tmy64Bit myArray[1000];
so that I have a pointer "myArray" which points to 1000x64 bits of memory.
This array is sent via TCP to an embedded Linux SOCFPGA system, where it should be copied (with correction of endianness and network byte order) into a specific memory (directly accessible from the FPGA).
My problem is that the code above doesn't create a 64-bit type:
sizeof(tmy64Bit)
returns 12, so 12 bytes are allocated instead of 8
filling the struct with data and watching the memory (on my 64 bit linux system) returns this
tmy64Bit test;
memset(&test,0,sizeof(tmy64Bit));
test.data1 = 0x3fff;
...
test.other = 0xAA;
Memory View:
after d1 written = 0xFF3F0000 00000000 00000000
after d2 written = 0xFFFFFF0F 00000000 00000000
after d3 written = 0xFFFFFF0F FF3F0000 00000000
after d4 written = 0xFFFFFF0F FFFFFF0F 00000000
after o written = 0xFFFFFF0F FFFFFF0F AA000000
So the first two 14-bit variables are stored correctly, but then padding fills up the last half-byte, and at the end the last byte has to be stored beyond the first 64 bits.
Another approach would be
typedef struct my2nd64Bit{
uint8_t data[7];
uint8_t other;
}tmy2nd64Bit;
where a
sizeof(tmy2nd64Bit)
returns an 8 (which was expected)
This generates a correctly sized structure, but storing the 14-bit values always involves a lot of bit shifting and masking.

Avoid bit-fields; they are so poorly defined by the C standard that they can barely be used in practice. Your bit-field struct code contains something like 5 to 10 different forms of poorly-specified behavior. C standard bit-fields are a dangerous and superfluous feature, simple as that.
Instead, simply use a raw array of binary values, something like this:
typedef union {
uint8_t array [sizeof(uint64_t)];
uint64_t val64;
}tmy64Bit;
(Note that the uint64_t in the union will be endianness-dependent.)
The de facto way to set and clear bits in such a raw array is:
void set_bit (tmy64Bit* x, size_t bit)
{
x->array [bit / 8] |= 1 << (bit % 8);
}
void clear_bit (tmy64Bit* x, size_t bit)
{
x->array [bit / 8] &= ~(1 << (bit % 8));
}
Or if you will, a more readable version (equivalent):
void set_bit (tmy64Bit* x, size_t bit)
{
uint8_t byte_index = bit / 8;
uint8_t bit_index = bit % 8;
x->array[byte_index] |= 1 << bit_index;
}
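Building on the same bit / 8 and bit % 8 indexing, a multi-bit field such as one of the 14-bit values can be written and read one bit at a time. This is only a sketch: `set_field` and `get_field` are hypothetical helpers, they assume `width` is at most 32, and they favor clarity over speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low `width` bits of `value` starting at bit `first_bit`. */
static void set_field(uint8_t *array, size_t first_bit, unsigned width, uint32_t value)
{
    assert(width <= 32);
    for (unsigned i = 0; i < width; i++) {
        size_t bit = first_bit + i;
        uint8_t mask = (uint8_t)(1u << (bit % 8));
        if ((value >> i) & 1u)
            array[bit / 8] |= mask;            /* set this bit */
        else
            array[bit / 8] &= (uint8_t)~mask;  /* clear this bit */
    }
}

/* Read `width` bits starting at bit `first_bit`. */
static uint32_t get_field(const uint8_t *array, size_t first_bit, unsigned width)
{
    uint32_t value = 0;
    assert(width <= 32);
    for (unsigned i = 0; i < width; i++) {
        size_t bit = first_bit + i;
        if (array[bit / 8] & (1u << (bit % 8)))
            value |= (uint32_t)1 << i;
    }
    return value;
}
```

With an 8-byte array, `set_field(a, 14, 14, v)` would write the second 14-bit value and `set_field(a, 56, 8, o)` the trailing 8-bit value, assuming the fields are laid out back to back from bit 0.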

This is what you want :
typedef struct my64bit{
uint64_t data1 : 14;
uint64_t data2 : 14;
uint64_t data3 : 14;
uint64_t data4 : 14;
uint64_t other : 8;
}tmy64Bit;
unsigned means unsigned int, and this type is 32-bit on most systems. This will cause padding because the individual fields won't be allowed to cross 32-bit boundaries. Using a 64-bit member type won't add padding for this case (you don't cross any 64-bit boundary).
As for any question about bit-fields, you need to remember that most of the bit-field mechanics are implementation defined, which means that if you want to use that, you should check that you actually get what you want. Also, if you plan to use another compiler, check that the behavior is the same (usually it is, but maybe not on exotic platforms). If you properly check, this is safe to use (not undefined behavior), but you might want to use a more portable way, using bit operations for example.
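One way to do that check is to let the compiler verify the size at build time and to test a round trip at run time. The following is a sketch under the assumption of a C11 compiler that accepts uint64_t as a bit-field type (an implementation-defined extension that GCC and Clang provide); `checked64` and `checked64_roundtrip_ok` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Same field layout as discussed above, under a hypothetical name. */
typedef struct {
    uint64_t data1 : 14;
    uint64_t data2 : 14;
    uint64_t data3 : 14;
    uint64_t data4 : 14;
    uint64_t other : 8;
} checked64;

/* Fails the build if this compiler adds padding. */
_Static_assert(sizeof(checked64) == sizeof(uint64_t),
               "checked64 must occupy exactly 8 bytes");

/* Returns 1 if every field keeps the value written to it. */
static int checked64_roundtrip_ok(void)
{
    checked64 t = {0};
    t.data1 = 0x3FFF;
    t.data2 = 0x1234;
    t.data3 = 0;
    t.data4 = 0x2AAA;
    t.other = 0xAA;
    return t.data1 == 0x3FFF && t.data2 == 0x1234 && t.data3 == 0 &&
           t.data4 == 0x2AAA && t.other == 0xAA;
}
```

Note that this only checks the size and the field values. The order of the fields inside the 64 bits is still up to the compiler, so the bytes should not be sent over the network as they are without checking that as well.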

I agree with Lundin's answer, but for this particular situation, my implementation would be a bit different (no pun).
First, I would decide how to pack the fields into each 64-bit word. For example:
Bits Description
0-13 data[0]
14-27 data[1]
28-41 data[2]
42-55 data[3]
56-63 other
Second, I would use a dynamically allocated structure with the FPGA data in a C99 flexible array member, to describe the target device:
typedef struct {
/* Other FPGA-related fields, maybe
* a struct sockaddr_in or _in6
* to identify the FPGA */
size_t words;
uint64_t word[];
} fpga;
fpga *fpga_create(const size_t words)
{
fpga *f;
f = malloc(sizeof (fpga) + words * sizeof(f->word[0]));
if (!f)
return NULL;
f->words = words;
memset(f->word, 0, f->words * sizeof (f->word[0]));
return f;
}
Third, I would use static inline accessors to manipulate the data:
static inline unsigned int fpga_get_data(const fpga *f, const int w, const int i)
{
assert(f != NULL);
assert(w >= 0 && (size_t)w < f->words);
assert(i >= 0 && i < 4);
return (f->word[(size_t)w] >> (i * 14)) & 0x3FFFU;
}
static inline unsigned int fpga_get_other(const fpga *f, const int w)
{
assert(f != NULL);
assert(w >= 0 && (size_t)w < f->words);
return (f->word[(size_t)w] >> 56) & 0xFFU;
}
static inline void fpga_set_data(fpga *f, const int w, const int i,
const unsigned int value)
{
assert(f != NULL);
assert(w >= 0 && (size_t)w < f->words);
assert(i >= 0 && i < 4);
/* cast to uint64_t before shifting: a 32-bit operand cannot be shifted by 28 or more without losing bits */
f->word[(size_t)w] = (f->word[(size_t)w] & (~((uint64_t)0x3FFFU << (i*14))))
| ((uint64_t)(value & 0x3FFFU) << (i*14));
}
static inline void fpga_set_other(fpga *f, const int w, const unsigned int value)
{
assert(f != NULL);
assert(w >= 0 && (size_t)w < f->words);
f->word[(size_t)w] = (f->word[(size_t)w] & (uint64_t)0x00FFFFFFFFFFFFFFULL)
| ((uint64_t)(value & 0xFFU) << 56);
}
Above, w is the index to the word, and i is the index of the data entry (0 to 3). If you want a continuous data array, you could use
static inline unsigned int fpga_get_data(const fpga *f, const int i)
{
/* Without assert(), check the same conditions with an if and
 * return 0U (or abort with an error) when they do not hold. */
assert(f != NULL);
assert(i >= 0 && (size_t)i < 4 * f->words);
return (f->word[(size_t)i / 4] >> ((i & 3) * 14)) & 0x3FFFU;
}
static inline void fpga_set_data(fpga *f, const int i, const unsigned int value)
{
assert(f != NULL);
assert(i >= 0 && (size_t)i / 4 < f->words);
f->word[(size_t)i / 4] = (f->word[(size_t)i / 4] & (~((uint64_t)0x3FFFU << ((i & 3) * 14))))
| ((uint64_t)(value & 0x3FFFU) << ((i & 3) * 14));
}
The accessors should be defined in the header file that defines the structure.
Note that if other is actually some sort of checksum for the four data fields, I would calculate them just before sending.

Extract 14-bit values from an array of bytes in C

In an arbitrary-sized array of bytes in C, I want to store 14-bit numbers (0-16,383) tightly packed. In other words, in the sequence:
0000000000000100000000000001
there are two numbers that I wish to be able to arbitrarily store and retrieve into a 16-bit integer. (in this case, both of them are 1, but could be anything in the given range) If I were to have the functions uint16_t 14bitarr_get(unsigned char* arr, unsigned int index) and void 14bitarr_set(unsigned char* arr, unsigned int index, uint16_t value), how would I implement those functions?
This is not for a homework project, merely my own curiosity. I have a specific project that this would be used for, and it is the key/center of the entire project.
I do not want an array of structs that have 14-bit values in them, as that generates waste bits for every struct that is stored. I want to be able to tightly pack as many 14-bit values as I possibly can into an array of bytes. (e.g.: in a comment I made, putting as many 14-bit values into a chunk of 64 bytes is desirable, with no waste bits. the way those 64 bytes work is completely tightly packed for a specific use case, such that even a single bit of waste would take away the ability to store another 14 bit value)

Well, this is bit fiddling at its best. Doing it with an array of bytes makes it more complicated than it would be with larger elements because a single 14 bit quantity can span 3 bytes, where uint16_t or anything bigger would require no more than two. But I'll take you at your word that this is what you want (no pun intended). This code will actually work with the constant set to anything 8 or larger (but not over the size of an int; for that, additional type casts are needed). Of course the value type must be adjusted if larger than 16.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#define W 14
uint16_t arr_get(unsigned char* arr, size_t index) {
size_t bit_index = W * index;
size_t byte_index = bit_index / 8;
unsigned bit_in_byte_index = bit_index % 8;
uint16_t result = arr[byte_index] >> bit_in_byte_index;
for (unsigned n_bits = 8 - bit_in_byte_index; n_bits < W; n_bits += 8)
result |= arr[++byte_index] << n_bits;
return result & ~(~0u << W);
}
void arr_set(unsigned char* arr, size_t index, uint16_t value) {
size_t bit_index = W * index;
size_t byte_index = bit_index / 8;
unsigned bit_in_byte_index = bit_index % 8;
arr[byte_index] &= ~(0xff << bit_in_byte_index);
arr[byte_index++] |= value << bit_in_byte_index;
unsigned n_bits = 8 - bit_in_byte_index;
value >>= n_bits;
while (n_bits < W - 8) {
arr[byte_index++] = value;
value >>= 8;
n_bits += 8;
}
arr[byte_index] &= 0xff << (W - n_bits);
arr[byte_index] |= value;
}
int main(void) {
int mod = 1 << W;
int n = 50000;
unsigned x[n];
unsigned char b[2 * n];
for (int tries = 0; tries < 10000; tries++) {
for (int i = 0; i < n; i++) {
x[i] = rand() % mod;
arr_set(b, i, x[i]);
}
for (int i = 0; i < n; i++)
if (arr_get(b, i) != x[i])
printf("Err #%d: %d should be %d\n", i, arr_get(b, i), x[i]);
}
return 0;
}
Faster versions
Since you said in comments that performance is an issue: open coding the loops gives a roughly 10% speed improvement on my machine on the little test driver included in the original. This includes random number generation and testing, so perhaps the primitives are 20% faster. I'm confident that 16- or 32-bit array elements would give further improvements because byte access is expensive:
uint16_t arr_get(unsigned char* a, size_t i) {
size_t ib = 14 * i;
size_t iy = ib / 8;
switch (ib % 8) {
case 0:
return (a[iy] | (a[iy+1] << 8)) & 0x3fff;
case 2:
return ((a[iy] >> 2) | (a[iy+1] << 6)) & 0x3fff;
case 4:
return ((a[iy] >> 4) | (a[iy+1] << 4) | (a[iy+2] << 12)) & 0x3fff;
}
return ((a[iy] >> 6) | (a[iy+1] << 2) | (a[iy+2] << 10)) & 0x3fff;
}
#define M(IB) (~0u << (IB))
#define SETLO(IY, IB, V) a[IY] = (a[IY] & M(IB)) | ((V) >> (14 - (IB)))
#define SETHI(IY, IB, V) a[IY] = (a[IY] & ~M(IB)) | ((V) << (IB))
void arr_set(unsigned char* a, size_t i, uint16_t val) {
size_t ib = 14 * i;
size_t iy = ib / 8;
switch (ib % 8) {
case 0:
a[iy] = val;
SETLO(iy+1, 6, val);
return;
case 2:
SETHI(iy, 2, val);
a[iy+1] = val >> 6;
return;
case 4:
SETHI(iy, 4, val);
a[iy+1] = val >> 4;
SETLO(iy+2, 2, val);
return;
}
SETHI(iy, 6, val);
a[iy+1] = val >> 2;
SETLO(iy+2, 4, val);
}
Another variation
This is quite a bit faster yet on my machine, about 20% better than above:
uint16_t arr_get2(unsigned char* a, size_t i) {
size_t ib = i * 14;
size_t iy = ib / 8;
unsigned buf = a[iy] | (a[iy+1] << 8) | (a[iy+2] << 16);
return (buf >> (ib % 8)) & 0x3fff;
}
void arr_set2(unsigned char* a, size_t i, unsigned val) {
size_t ib = i * 14;
size_t iy = ib / 8;
unsigned buf = a[iy] | (a[iy+1] << 8) | (a[iy+2] << 16);
unsigned io = ib % 8;
buf = (buf & ~(0x3fff << io)) | (val << io);
a[iy] = buf;
a[iy+1] = buf >> 8;
a[iy+2] = buf >> 16;
}
Note that for this code to be safe you should allocate one extra byte at the end of the packed array. It always reads and writes 3 bytes even when the desired 14 bits are in the first 2.
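The extra byte can be folded into a size helper so that callers cannot forget it. This is a sketch: `packed14_bytes` and `packed14_new` are hypothetical names, and `packed14_get` / `packed14_set` restate the three-byte variant above with explicit casts and a mask on the incoming value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bytes needed for `count` packed 14-bit values, plus one slack byte
 * because every access touches three consecutive bytes. */
static size_t packed14_bytes(size_t count)
{
    return (count * 14 + 7) / 8 + 1;
}

/* Zero-filled buffer for `count` values; the caller frees it. */
static unsigned char *packed14_new(size_t count)
{
    return calloc(packed14_bytes(count), 1);
}

static uint16_t packed14_get(const unsigned char *a, size_t i)
{
    size_t ib = i * 14;
    size_t iy = ib / 8;
    unsigned buf = a[iy] | ((unsigned)a[iy + 1] << 8) | ((unsigned)a[iy + 2] << 16);
    return (uint16_t)((buf >> (ib % 8)) & 0x3fff);
}

static void packed14_set(unsigned char *a, size_t i, unsigned val)
{
    size_t ib = i * 14;
    size_t iy = ib / 8;
    unsigned io = (unsigned)(ib % 8);
    unsigned buf = a[iy] | ((unsigned)a[iy + 1] << 8) | ((unsigned)a[iy + 2] << 16);
    val &= 0x3fff;                               /* keep only 14 bits */
    buf = (buf & ~(0x3fffu << io)) | (val << io);
    a[iy] = (unsigned char)buf;
    a[iy + 1] = (unsigned char)(buf >> 8);
    a[iy + 2] = (unsigned char)(buf >> 16);
}
```

For four values the helper returns 8 bytes: 7 for the 56 bits of data and 1 of slack, which is exactly what the access to the last element (bytes 5 to 7) needs.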
One more variation
Finally, this runs just a bit slower than the one above (again on my machine; YMMV), but you don't need the extra byte. It uses one comparison per operation:
uint16_t arr_get2(unsigned char* a, size_t i) {
size_t ib = i * 14;
size_t iy = ib / 8;
unsigned io = ib % 8;
unsigned buf = ib % 8 <= 2
? a[iy] | (a[iy+1] << 8)
: a[iy] | (a[iy+1] << 8) | (a[iy+2] << 16);
return (buf >> io) & 0x3fff;
}
void arr_set2(unsigned char* a, size_t i, unsigned val) {
size_t ib = i * 14;
size_t iy = ib / 8;
unsigned io = ib % 8;
if (io <= 2) {
unsigned buf = a[iy] | (a[iy+1] << 8);
buf = (buf & ~(0x3fff << io)) | (val << io);
a[iy] = buf;
a[iy+1] = buf >> 8;
} else {
unsigned buf = a[iy] | (a[iy+1] << 8) | (a[iy+2] << 16);
buf = (buf & ~(0x3fff << io)) | (val << io);
a[iy] = buf;
a[iy+1] = buf >> 8;
a[iy+2] = buf >> 16;
}
}

The easiest solution is to use a struct of eight bitfields:
typedef struct __attribute__((__packed__)) EightValues {
uint16_t v0 : 14,
v1 : 14,
v2 : 14,
v3 : 14,
v4 : 14,
v5 : 14,
v6 : 14,
v7 : 14;
} EightValues;
This struct has a size of 14*8 = 112 bits, which is 14 bytes (seven uint16_t). Now, all you need is to use the last three bits of the array index to select the right bitfield:
uint16_t 14bitarr_get(unsigned char* arr, unsigned int index) {
EightValues* accessPointer = (EightValues*)arr;
accessPointer += index >> 3; //select the right structure in the array
switch(index & 7) { //use the last three bits of the index to access the right bitfield
case 0: return accessPointer->v0;
case 1: return accessPointer->v1;
case 2: return accessPointer->v2;
case 3: return accessPointer->v3;
case 4: return accessPointer->v4;
case 5: return accessPointer->v5;
case 6: return accessPointer->v6;
case 7: return accessPointer->v7;
}
}
Your compiler will do the bit-fiddling for you.

The Basis for Storage Issue
The biggest issue you are facing is the fundamental question of "What is my basis for storage going to be?" You know the basics, what you have available to you is char, short, int, etc... The smallest being 8-bits. No matter how you slice your storage scheme, it will ultimately have to rest in memory in a unit of memory based on this 8 bit per byte layout.
The only optimal, no-bits-wasted memory allocation is an array whose total size in bits is a common multiple of 14 and the element width. With 16-bit shorts that is 112 bits (7 shorts or 14 chars), which stores exactly 8 14-bit values; with plain chars, 56 bits (7 chars, 4 values) is already enough. This may be the best option. Of course, if you have no need for that many values, it wouldn't be of much use anyway, as it would waste more than the 4 bits lost on a single unsigned value.
Let me know if this is something you would like to further explore. If it is, I'm happy to help with the implementation.
Bitfield Struct
The comments regarding bitfield packing or bit packing are exactly what you need to do. This can involve a structure alone or in combination with a union, or by manually right/left shifting values directly as needed.
A short example applicable to your situation (if I understood correctly you want 2 14-bit areas in memory) would be:
#include <stdio.h>
typedef struct bitarr14 {
unsigned n1 : 14,
n2 : 14;
} bitarr14;
char *binstr (unsigned long n, size_t sz);
int main (void) {
bitarr14 mybitfield;
mybitfield.n1 = 1;
mybitfield.n2 = 1;
printf ("\n mybitfield in memory : %s\n\n",
binstr (*(unsigned *)&mybitfield, 28));
return 0;
}
char *binstr (unsigned long n, size_t sz)
{
static char s[64 + 1] = {0};
char *p = s + 64;
register size_t i = 0;
for (i = 0; i < sz; i++) {
p--;
*p = (n >> i & 1) ? '1' : '0';
}
return p;
}
Output
$ ./bin/bitfield14
mybitfield in memory : 0000000000000100000000000001
Note: the dereference of mybitfield for purposes of printing the value in memory breaks strict aliasing and it is intentional just for the purpose of the output example.
The beauty, and purpose for using a struct in the manner provided is it will allow direct access to each 14-bit part of the struct directly, without having to manually shift, etc.

Update - assuming you want big endian bit packing. This is code meant for a fixed size code word. It's based on code I've used for data compression algorithms. The switch case and fixed logic helps with performance.
typedef unsigned short uint16_t;
void bit14arr_set(unsigned char* arr, unsigned int index, uint16_t value)
{
unsigned int bitofs = (index*14)%8;
arr += (index*14)/8;
switch(bitofs){
case 0: /* bit offset == 0 */
*arr++ = (unsigned char)(value >> 6);
*arr &= 0x03;
*arr |= (unsigned char)(value << 2);
break;
case 2: /* bit offset == 2 */
*arr &= 0xc0;
*arr++ |= (unsigned char)(value >> 8);
*arr = (unsigned char)(value << 0);
break;
case 4: /* bit offset == 4 */
*arr &= 0xf0;
*arr++ |= (unsigned char)(value >> 10);
*arr++ = (unsigned char)(value >> 2);
*arr &= 0x3f;
*arr |= (unsigned char)(value << 6);
break;
case 6: /* bit offset == 6 */
*arr &= 0xfc;
*arr++ |= (unsigned char)(value >> 12);
*arr++ = (unsigned char)(value >> 4);
*arr &= 0x0f;
*arr |= (unsigned char)(value << 4);
break;
}
}
uint16_t bit14arr_get(unsigned char* arr, unsigned int index)
{
unsigned int bitofs = (index*14)%8;
unsigned short value;
arr += (index*14)/8;
switch(bitofs){
case 0: /* bit offset == 0 */
value = ((unsigned int)(*arr++) ) << 6;
value |= ((unsigned int)(*arr ) ) >> 2;
break;
case 2: /* bit offset == 2 */
value = ((unsigned int)(*arr++)&0x3f) << 8;
value |= ((unsigned int)(*arr ) ) >> 0;
break;
case 4: /* bit offset == 4 */
value = ((unsigned int)(*arr++)&0x0f) << 10;
value |= ((unsigned int)(*arr++) ) << 2;
value |= ((unsigned int)(*arr ) ) >> 6;
break;
case 6: /* bit offset == 6 */
value = ((unsigned int)(*arr++)&0x03) << 12;
value |= ((unsigned int)(*arr++) ) << 4;
value |= ((unsigned int)(*arr ) ) >> 4;
break;
}
return value;
}

Here's my version (updated to fix bugs):
#define PACKWID 14 // number of bits in packed number
#define PACKMSK ((1 << PACKWID) - 1)
#ifndef ARCHBYTEALIGN
#define ARCHBYTEALIGN 1 // align to 1=bytes, 2=words
#endif
#define ARCHBITALIGN (ARCHBYTEALIGN * 8)
typedef unsigned char byte;
typedef unsigned short u16;
typedef unsigned int u32;
typedef long long s64;
typedef u16 pcknum_t; // container for packed number
typedef u32 acc_t; // working accumulator
#ifndef ARYOFF
#define ARYOFF long
#endif
#define PRT(_val) ((unsigned long) _val)
typedef unsigned ARYOFF aryoff_t; // bit offset
// packary -- access array of packed numbers
// RETURNS: old value
extern inline pcknum_t
packary(byte *ary,aryoff_t idx,int setflg,pcknum_t newval)
// ary -- byte array pointer
// idx -- index into array (packed number relative)
// setflg -- 1=set new value, 0=just get old value
// newval -- new value to set (if setflg set)
{
aryoff_t absbitoff;
aryoff_t bytoff;
aryoff_t absbitlhs;
acc_t acc;
acc_t nval;
int shf;
acc_t curmsk;
pcknum_t oldval;
// get the absolute bit number for the given array index
absbitoff = idx * PACKWID;
// get the byte offset of the lowest byte containing the number
bytoff = absbitoff / ARCHBITALIGN;
// get absolute bit offset of first containing byte
absbitlhs = bytoff * ARCHBITALIGN;
// get amount we need to shift things by:
// (1) our accumulator
// (2) values to set/get
shf = absbitoff - absbitlhs;
#ifdef MODSHOW
do {
static int modshow;
if (modshow > 50)
break;
++modshow;
printf("packary: MODSHOW idx=%ld shf=%d bytoff=%ld absbitlhs=%ld absbitoff=%ld\n",
PRT(idx),shf,PRT(bytoff),PRT(absbitlhs),PRT(absbitoff));
} while (0);
#endif
// adjust array pointer to the portion we want (guaranteed to span)
ary += bytoff * ARCHBYTEALIGN;
// fetch the number + some other bits
acc = *(acc_t *) ary;
// get the old value
oldval = (acc >> shf) & PACKMSK;
// set the new value
if (setflg) {
// get shifted mask for packed number
curmsk = PACKMSK << shf;
// remove the old value
acc &= ~curmsk;
// ensure caller doesn't pass us a bad value
nval = newval;
#if 0
nval &= PACKMSK;
#endif
nval <<= shf;
// add in the value
acc |= nval;
*(acc_t *) ary = acc;
}
return oldval;
}
pcknum_t
int_get(byte *ary,aryoff_t idx)
{
return packary(ary,idx,0,0);
}
void
int_set(byte *ary,aryoff_t idx,pcknum_t newval)
{
packary(ary,idx,1,newval);
}
Here are benchmarks:
set: 354740751 7.095 -- gene
set: 203407176 4.068 -- rcgldr
set: 298946533 5.979 -- craig
get: 268574627 5.371 -- gene
get: 166839767 3.337 -- rcgldr
get: 207764612 4.155 -- craig

get16bits macro in hash function

I was looking at hash functions the other day and came across a website that had an example of one. Most of the code was easy to grasp, however this macro function I can't really wrap my head around.
Could someone breakdown what's going on here?
#define get16bits(d) ((((uint32_t)(((const uint8_t *)(d))[1])) << 8) +(uint32_t)(((const uint8_t *)(d))[0]))

Basically, it gets the lower 16 bits of the 32-bit integer that d points to (on a little-endian machine).
Let's break it down:
#define get16bits(d) ((((uint32_t)(((const uint8_t *)(d))[1])) << 8) +(uint32_t)(((const uint8_t *)(d))[0]))
uint32_t a = 0x12345678;
uint16_t b = get16bits(&a); // b == 0x00005678
First, we must pass the address of a to get16bits() or it will not work.
(((uint32_t)(((const uint8_t *)(d))[1])) << 8)
This first converts the pointer to the 32-bit integer into a pointer to an array of 8-bit integers and retrieves the second one.
It then shifts that value left by 8 bits and adds the lowest 8 bits to it:
+ (uint32_t)(((const uint8_t *)(d))[0]))
In our example it will be
const uint8_t *tmp = (const uint8_t *)&a;
uint32_t result;
result = tmp[1] << 8; // 0x00005600
result += tmp[0]; //tmp[0] == 0x78
// result is now 0x00005678

The macro is more or less equivalent to:
static uint32_t get16bits(SOMETYPE *d)
{
unsigned char temp[ sizeof *d];
uint32_t val;
memcpy(temp, d, sizeof *d);
val = (temp[1] << 8)
+ temp[0];
return val;
}
, but the macro argument has no type, and the function argument does.
Another way would be to actually cast:
static uint32_t get16bits(SOMETYPE *d)
{
unsigned char *cp = (unsigned char*) d;
uint32_t val;
val = (cp[1] << 8)
+ cp[0];
return val;
}
, which also shows the weakness: by indexing with 1, the code assumes that sizeof (*d) is at least 2.

How to encode a numeric value as bytes

I need to be able to send a numeric value to a remote socket server, so I need to encode possible numbers as bytes.
The numbers are up to 64 bit, ie requiring up to 8 bytes. The very first byte is the type, and it is always a number under 255 so fits in 1 byte.
For example, if the number was 8 and the type was a 32 bit unsigned integer then the type would be 7 which would be copied to the first (leftmost) byte and then the next 4 bytes would be encoded with the actual number (8 in this case).
So in terms of bytes:
byte1: 7
byte2: 0
byte3: 0
byte4: 0
byte5: 8
I hope this is making sense.
Does this code to perform this encoding look like a reasonable approach?
int type = 7;
uint32_t number = 8;
unsigned char* msg7 = (unsigned char*)malloc(5);
unsigned char* p = msg7;
*p++ = type;
for (int i = sizeof(uint32_t) - 1; i >= 0; --i)
*p++ = number & 0xFF << (i * 8);

You'll want to explicitly cast type to avoid a warning:
*p++ = (unsigned char) type;
You want to encode the number with most significant byte first, but you're shifting in the wrong direction. The loop should be:
for (int i = sizeof(uint32_t) - 1; i >= 0; --i)
*p++ = (unsigned char) ((number >> (i * 8)) & 0xFF);
It looks good otherwise.

Your code is reasonable (although I'd use uint8_t, since you are not using the bytes as “characters”, and Peter is of course right wrt the typo), and unlike the commonly found alternatives like
uint32_t number = 8;
uint8_t* p = (uint8_t *) &number;
or
union {
uint32_t number;
uint8_t bytes[4];
} val;
val.number = 8;
// access val.bytes[0] .. val.bytes[3]
is even guaranteed to work. The first alternative will probably work in a debug build, but more and more compilers might break it when optimizing, while the second one tends to work in practice just about everywhere, but is explicitly marked as a bad thing™ by the language standard.

I would drop the loop and use a "caller allocates" interface, like
int convert_32 (unsigned char *target, size_t size, uint32_t val)
{
if (size < 5) return -1;
target[0] = 7;
target[1] = (val >> 24) & 0xff;
target[2] = (val >> 16) & 0xff;
target[3] = (val >> 8) & 0xff;
target[4] = (val) & 0xff;
return 5;
}
This makes it easier for the caller to concatenate multiple fragments into one big binary packet and keep track of the used/needed buffer size.
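The receiving side can mirror the same interface. The sketch below pairs an encoder in the style above with a hypothetical decoder; `encode_u32`, `decode_u32` and the `TYPE_U32` constant are illustrative names, and the type tag 7 is taken from the question's example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TYPE_U32 7   /* type tag for a 32-bit unsigned integer */

/* Write tag + 4 bytes (most significant first); returns bytes used or -1. */
static int encode_u32(unsigned char *target, size_t size, uint32_t val)
{
    if (size < 5) return -1;
    target[0] = TYPE_U32;
    target[1] = (unsigned char)((val >> 24) & 0xff);
    target[2] = (unsigned char)((val >> 16) & 0xff);
    target[3] = (unsigned char)((val >> 8) & 0xff);
    target[4] = (unsigned char)(val & 0xff);
    return 5;
}

/* Read tag + 4 bytes; returns bytes consumed, or -1 on a short
 * buffer or an unexpected type tag. */
static int decode_u32(const unsigned char *source, size_t size, uint32_t *val)
{
    assert(val != NULL);
    if (size < 5 || source[0] != TYPE_U32) return -1;
    *val = ((uint32_t)source[1] << 24) | ((uint32_t)source[2] << 16) |
           ((uint32_t)source[3] << 8) | (uint32_t)source[4];
    return 5;
}
```

Returning the number of bytes used lets the caller advance through a larger packet on both sides, adding each return value to a running offset.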

Do you mean?
for (int i = sizeof(uint32_t) - 1; i >= 0; --i)
*p++ = (number >> (i * 8)) & 0xFF;
Another option might be to do
// this would work on big-endian systems, e.g. SPARC,
// provided the compiler inserts no padding between type and value
struct unsignedMsg {
unsigned char type;
uint32_t value;
};
struct unsignedMsg msg;
msg.type = 7;
msg.value = number;
unsigned char *p = (unsigned char *) &msg;
or
unsigned char* p =
p[0] = 7;
*((uint32_t *) &(p[1])) = number;