Byte level length description - c

I have a protocol that requires a length field up to 32-bits, and it must be
generated at runtime to describe how many bytes are in a given packet.
The code below is kind of ugly but I am wondering if this can be refactored to
be slightly more efficient or easily understandable. The problem is that the
code will only generate enough bytes to describe the length of the packet, so
a length of up to 255 takes 1 byte, up to 65535 takes 2 bytes,
and so on.
{
extern char byte_stream[];
int bytes = offset_in_packet;
int n = length_of_packet;
/* Under 4 billion, so this can be represented in 32 bits. */
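int t;
/* 32-bit number used for temporary storage. */
int write_zeros = 0;
/* Set once a non-zero byte has been written, so later zero bytes are kept. */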
/* These are the bytes we will break up n into. */
unsigned char first, second, third, fourth;
t = n & 0xFF000000;
/* We have used AND to "mask out" the first byte of the number. */
/* The only bits which can be on in t are the first 8 bits. */
first = t >> 24;
if (t) {
printf("byte 1: 0x%02x\n",first );
byte_stream[bytes] = first; bytes++;
write_zeros = 1;
}
/* Now we shift t so that it is between 0 and 255. This is the first, highest byte of n. */
t = n & 0x00FF0000;
second = t >> 16;
if (t || write_zeros) {
printf("byte 2: 0x%02x\n", second );
byte_stream[bytes] = second; bytes++;
write_zeros = 1;
}
t = n & 0x0000FF00;
third = t >> 8;
if ( t || write_zeros) {
printf("byte 3: 0x%02x\n", third );
byte_stream[bytes] = third; bytes++;
write_zeros = 1;
}
t = n & 0x000000FF;
fourth = t;
if (t || write_zeros) {
printf("byte 4: 0x%02x\n", fourth);
byte_stream[bytes] = fourth; bytes++;
}
}

You should really use a fixed-width field for your length.
When the program on the receiving end has to read the length field of your packet, how does it know where the length stops?
If the length of a packet can potentially reach 4 GB, does a 1-3 byte overhead really matter?
Do you see how complex your code has already become?
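To illustrate the fixed-width alternative, here is a minimal sketch. It assumes a 4-byte length field in big-endian (network) order; the names put_len32 and get_len32 are made up for this example and are not part of the asker's protocol.

```c
#include <stdint.h>

/* Write n as exactly four bytes, most significant byte first.
 * Hypothetical helper; the big-endian order is an assumption. */
void put_len32(unsigned char *out, uint32_t n)
{
    out[0] = (unsigned char)(n >> 24);
    out[1] = (unsigned char)(n >> 16);
    out[2] = (unsigned char)(n >> 8);
    out[3] = (unsigned char)n;
}

/* Read the four bytes back. The receiver never has to guess
 * where the length field ends, because it is always 4 bytes. */
uint32_t get_len32(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

The cost is at most three extra bytes per packet, and both sides become a handful of shifts with no state to track.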

Really you're only doing four calculations, so readability seems way more important here than efficiency. My approach to make something like this more readable is to
Extract common code to a function
Put similar calculations together to make the patterns more obvious
Get rid of the intermediate variable write_zeros and be explicit about the cases in which you output bytes even if they're zero (i.e. a preceding byte was non-zero)
I've changed the random code block into a function and changed a few variables (underscores are giving me trouble in the markdown preview screen). I've also assumed that bytes is being passed in, and that whoever is passing it in will pass us a pointer so we can modify it.
Here's the code:
/* append byte b to stream, increment index */
/* really needs to check length of stream before appending */
void output( int i, unsigned char b, char stream[], unsigned int *index )
{
printf("byte %d: 0x%02x\n", i, b);
stream[(*index)++] = b;
}
void answer( char bytestream[], unsigned int *bytes, unsigned int n)
{
/* mask out four bytes from word n */
unsigned char first = (n & 0xFF000000) >> 24;
unsigned char second = (n & 0x00FF0000) >> 16;
unsigned char third = (n & 0x0000FF00) >> 8;
unsigned char fourth = (n & 0x000000FF) >> 0;
/* conditionally output each byte starting with the */
/* first non-zero byte */
if (first)
output( 1, first, bytestream, bytes);
if (first || second)
output( 2, second, bytestream, bytes);
if (first || second || third)
output( 3, third, bytestream, bytes);
if (first || second || third || fourth)
output( 4, fourth, bytestream, bytes);
}
Ever so slightly more efficient, and maybe easier to understand would be this modification to the last four if statements:
if (n>0x00FFFFFF)
output( 1, first, bytestream, bytes);
if (n>0x0000FFFF)
output( 2, second, bytestream, bytes);
if (n>0x000000FF)
output( 3, third, bytestream, bytes);
if (1)
output( 4, fourth, bytestream, bytes);
I agree, however, that compressing this field makes the receiving state machine overly complicated. But if you can't change the protocol, this code is much easier to read.

Try this loop:
{
extern char byte_stream[];
int bytes = offset_in_packet;
unsigned int n = length_of_packet; /* Under 4 billion, so this can be represented in 32 bits. */
unsigned int t; /* 32-bit number used for temporary storage. */
int i;
int write_zeros = 0; /* set once the first non-zero byte has been written */
unsigned char curByte;
for (i = 0; i < 4; i++) {
t = n & (0xFF000000u >> (i * 8)); /* the mask moves down one byte per iteration */
curByte = t >> (24 - (i * 8));
if (t || write_zeros) {
printf("byte %d: 0x%02x\n", i + 1, curByte );
byte_stream[bytes] = curByte;
bytes++;
write_zeros = 1;
}
}
}

I'm not sure I understand your question. What exactly are you trying to count? If I understand correctly you're trying to find the Most Significant non-zero byte.
You're probably better off using a loop like this:
int i;
int write_zeros = 0;
for (i = 3; i >=0 ; --i) {
t = (n >> (8 * i)) & 0xff;
if (t || write_zeros) {
write_zeros = 1;
printf ("byte %d : 0x%02x\n", 4-i, t);
byte_stream[bytes++] = t;
}
}
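Pulling the loop idea into a self-contained function might look like the sketch below. The name encode_length and the decision to write into a caller-supplied buffer are assumptions made for illustration; the behaviour for n == 0 (no bytes written) mirrors the question's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Write n to out using as few bytes as possible, most significant first.
 * Leading zero bytes are skipped; n == 0 produces no bytes at all.
 * Returns the number of bytes written (0 to 4); out needs room for 4. */
size_t encode_length(uint32_t n, unsigned char *out)
{
    size_t count = 0;
    for (int shift = 24; shift >= 0; shift -= 8) {
        unsigned char b = (unsigned char)(n >> shift);
        if (b != 0 || count > 0)   /* once started, zero bytes are kept */
            out[count++] = b;
    }
    return count;
}
```

Using count itself as the "have we started yet" test removes the separate write_zeros flag.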

writing 3 bits at a time to binary file in C

Hello, I have a list of locations, as described in the image, stored in a linked list. Every node holds an array of 2 chars (chessPos in the code): the first element is the row and the second is the column. For example, the first node has row = 'C', col = '5', and so on. The list is passed to the function; I don't need to build it.
I need to write the data to a binary file, with each row or column written in 3 bits. So 'C' is written as 010 and right after it '5' is written as 100 (the 3 bits represent the row/column minus 1, which is why '5' is represented by 100, which is 4 in binary).
The difficulty is that a byte is 8 bits, so every byte I write to the file contains the 6 bits of one row and column plus 2 bits of the next position.
How can I make it work?
Thanks.
This is my code so far:
typedef char chessPos[2];
typedef struct _chessPosArray {
unsigned int size;
chessPos* positions;
}chessPosArray;
typedef struct _chessPosCell {
chessPos position;
struct _chessPosCell* next;
}chessPosCell;
typedef struct _chessPosList {
chessPosCell* head;
chessPosCell* tail;
}chessPosList;
void function_name(char* file_name, chessPosList* pos_list)
{
FILE* file;
short list_len;
int i = 0;
unsigned char row, col, byte_to_file, next_byte;
chessPosCell* curr = pos_list->head;
file = fopen(file_name, "wb"); /* open binary file to writing */
checkFileOpening(file);
while (curr != NULL)
{
row = curr->position[0] - 'A' - 17; /* 'A' ---> '1' ---> '0' */
col = curr->position[1] - 1; /* '4' ---> '3' */
if (remain < 6)
{
curr = curr->next;
remain += 8;
}
if (i > 1)
{
i = 0;
}
if (curr->next != NULL)
{
next_byte = curr->next->position[i] >> (remain - 7);
byte_to_file = ((row << (remain - 3)) | (col << (remain - 6))) | (next_byte);
i++;
}
else
{
byte_to_file = ((row << (remain - 3)) | (col << (remain - 6)));
}
fwrite(&byte_to_file, sizeof(unsigned char), 1, file);
remain -= 6;
}
how can i make it work?
Since each location requires both a column and a row, you can actually think of a location as a single 6-bit value in which the lowest 3 bits are the row and the high 3 bits are the column. If you think of it that way, then the problem is a little bit simpler in that you're actually just talking about base-64 encoding/decoding, and there are lots of open-source implementations available if you really want to pack the data into the smallest possible space.
That said, I'd encourage you to consider whether your problem really requires minimizing the storage space. You could instead store each location in one byte, either using 4 bits for the row and 4 for the column, or continuing to treat locations as 6-bit values and just ignoring the two extra bits. Unless you're storing a huge number of these locations, the benefit of saving two bits per location isn't likely to matter.
how can i make it work?
Well, first start with a good abstraction. Anyway, it's actually pretty simple:
let's take a 16-bit (2-byte) buffer and a bit position within the buffer
it's much easier when the buffer is contiguous (one uint16_t) instead of two separate bytes (unsigned char byte_to_file, next_byte). The next_byte bits just shift into place and byte_to_file can be extracted with a shift.
I picture the MSB on the left and the LSB on the right
push each new 6-bit value into the leftmost position that is not occupied yet
so shift it left by 16 - 6 minus the position
if we have filled 8 bits or more
take the top byte and output it
and shift the buffer 8 bits to the left
Here's a sample program that prints Hello world\n:
#include <stdint.h>
#include <stdio.h>
struct bitwritter {
FILE *out;
// our buffer for bits
uint16_t buf;
// the count of set bits within buffer counting from MSB
unsigned char pos;
};
struct bitwritter bitwritter_init(FILE *out) {
return (struct bitwritter){ .out = out };
}
int bitwritter_write_6bits(struct bitwritter *t, unsigned char bits6) {
// we always write starting from MSB
unsigned char toshift = 16 - 6 - t->pos;
// just a mask with 6 bits
bits6 &= 0x3f;
t->buf |= bits6 << toshift;
t->pos += 6;
// do we have whole byte?
if (t->pos >= 8) {
// extract the byte - note it's in MSB
unsigned char towrite = t->buf >> 8;
// shift out buffer
t->buf <<= 8;
t->pos -= 8;
// write output
if (fwrite(&towrite, sizeof(towrite), 1, t->out) != 1) {
return -1;
}
return 1;
}
return 0;
}
int main() {
struct bitwritter bw = bitwritter_init(stdout);
// echo 'Hello world' | xxd -c1 -p | while read l; do python -c "print(\"{0:08b}\".format(0x$l))"; done | paste -sd '' | sed -E 's/.{6}/0b&,\n/g'
unsigned char data[] = {
0b010010,
0b000110,
0b010101,
0b101100,
0b011011,
0b000110,
0b111100,
0b100000,
0b011101,
0b110110,
0b111101,
0b110010,
0b011011,
0b000110,
0b010000,
0b001010,
};
for (size_t i = 0; i < sizeof(data); ++i) {
bitwritter_write_6bits(&bw, data[i]);
}
}
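Applied to the chess positions from the question, the same left-aligned 16-bit buffer could be used as in the sketch below. It packs into a memory buffer rather than a FILE so it is easy to check; pack_positions is a hypothetical name, and padding the final partial byte with zero bits is an assumption, since the question does not say how the file should end.

```c
#include <stddef.h>
#include <stdint.h>

typedef char chessPos[2];   /* as in the question: {row letter, column digit} */

/* Pack n positions ('A'..'H', '1'..'8') into out, 3 bits per coordinate,
 * row first, most significant bit first.
 * Returns the number of bytes written; out must hold (n * 6 + 7) / 8 bytes. */
size_t pack_positions(const chessPos *pos, size_t n, unsigned char *out)
{
    uint16_t buf = 0;    /* pending bits, left-aligned */
    unsigned used = 0;   /* how many bits of buf are occupied (0..7 between steps) */
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned row = (unsigned)(pos[i][0] - 'A') & 7u;
        unsigned col = (unsigned)(pos[i][1] - '1') & 7u;
        unsigned six = (row << 3) | col;

        buf |= (uint16_t)(six << (16 - 6 - used));
        used += 6;
        if (used >= 8) {                     /* a full byte is ready */
            out[written++] = (unsigned char)(buf >> 8);
            buf = (uint16_t)(buf << 8);
            used -= 8;
        }
    }
    if (used > 0)                            /* flush the partial last byte */
        out[written++] = (unsigned char)(buf >> 8);
    return written;
}
```

For the positions C5 and A4 the bit string is 010100 000011, which comes out as the two bytes 0x50 and 0x30. The resulting buffer can then be written with a single fwrite call.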

C decompress Bitmask source

This may be a somewhat odd question, as well as my first one ever on this site, and it's a pretty complicated thing to ask. Basically, I have this decompressor for a very specific archive file format. I barely understand it, but from what I can grasp it uses some sort of "bit mask": it reads the first 2 bytes of the target file and stores them as a sequence.
The first for loop is where I get confused.
Say for arguments sake mask is 2 bytes 10 04, or 1040(decimal) thats what it usually is in these files
for (t = 0; t<16; t++) {
if (mask & (1 << (15 - t))) {
This seems to be looping through all 16 bits of those 2 bytes and running an AND operation on mask (1040) against every bit?
The if statement is what I don't understand completely. What's triggering the if? If the bit is greater than 0?
Because if mask is 1040, then really what we're looking at is
if(1040 & 32768) index 15
if(1040 & 16384) index 14
if(1040 & 8192) index 13
if(1040 & 4096) index 12
if(1040 & 2048) index 11
if(1040 & 1024) index 10
if(1040 & 512) and so on.....
if(1040 & 256)
I just really need to know what's triggering this if statement. I think I might be overthinking it, but does it simply trigger if the current bit is greater than 0?
The only other thing I can do is compile this source myself, insert printfs on key variables, and go hand in hand with a hex editor to try to figure out what's actually going on here. If anyone could give me a hand, that would be awesome.
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
uint8_t dest[1024 * 1024 * 4]; // holds the actual data
int main(int argc, char *argv[]) {
FILE *fi, *fo;
char fname[255];
uint16_t mask, tmp, offset, length;
uint16_t seq;
uint32_t dptr, sptr;
uint16_t l, ct;
uint16_t t, s;
int test_len;
int t_length, t_off;
// Print Usage if filename is missing
if (argc<3) {
printf("sld_unpack - Decompressor for .sld files ()\nsld_unpack <filename.sld> <filename.t2>\n");
return(-1);
}
// Open .SLD-file
if (!(fi = fopen(argv[1], "rb"))) {
printf("Error opening %s\n", argv[1]);
return(-1);
}
dptr = 0;
fread((uint16_t*)&seq, 1, 2, fi); // read 1st 2 bytes in file
test_len = ftell(fi);
printf("[Main Header sequence: %d]\n 'offset' : %d \n", seq, test_len);
sptr = 0;
fread((uint16_t*)&seq, 1, 2, fi);
while (!feof(fi)) { // while not at the end of the file set mask equal to sequence (first 2 bytes of header)
mask = seq;
// loop through 16 bit mask
for (t = 0; t<16; t++) {
if (mask & (1 << (15 - t))) { // check all bit fields and run AND check to if value greater then 0?
test_len = ftell(fi);
fread((uint16_t*)&seq, 1, 2, fi); // read
sptr = sptr + 2; // set from 0 to 2
tmp = seq; // set tmp to sequence
offset = ((uint32_t)tmp & 0x07ff) * 2;
length = ((tmp >> 11) & 0x1f) * 2; // 32 - 1?
if (length>0) {
for (l = 0; l<length; l++) {
dest[dptr] = dest[dptr - offset];
dptr++;
}
}
else { // if length == 0
t_length = ftell(fi);
fread((uint16_t*)&seq, 1, 2, fi);
sptr = sptr + 2;
length = seq * 2;
for (s = 0; s<length; s++) {
dest[dptr] = dest[dptr - offset];
dptr++;
}
}
}
else { // if sequence AND returns 0 (or less)?
fread((uint16_t*)&seq, 1, 2, fi);
t_length = ftell(fi);
sptr = sptr + 2;
dest[dptr++] = seq & 0xff;
dest[dptr++] = (seq >> 8) & 0xff;
}
}
fread((uint16_t*)&seq, 1, 2, fi);
}
fclose(fi);
sprintf(fname, "%s\0", argv[2]);
if (!(fo = fopen(fname, "wb"))) { // if file
printf("Error creating %s\n", fname);
return(-1);
}
fwrite((uint8_t*)&dest, 1, dptr, fo);
fclose(fo);
printf("Done.\n");
return(0);
}
Be careful here.
for arguments sake mask is 2 bytes 10 04, or 1040(decimal)
That assumption may be nowhere close to true. You need to show how mask is defined, but in general a mask made of the bytes 10 (00001010) and 40 (00101000) is binary 101000101000, or decimal 2600, not 1040.
With a mask of 2600, the test succeeds on loop iterations t = 4, 6, 10 and 12 (bit positions 11, 9, 5 and 3, counting from the least significant bit). Remember that a bit mask is nothing more than a number whose binary representation, when ANDed or ORed with a second number, produces some desired result. There is nothing magic about a bit mask; it's just a number with the right bits set for your intended purpose.
When you AND two numbers together and test the result, you are testing whether there are common bits set in both numbers. Using the for loop and shift, you test one bit position at a time. With the mask of 2600, the test is true for t = 4, 6, 10 and 12, in other words when the AND expression evaluates to 2048, 512, 32 or 8.
The following is a short example of what is happening in the loop and if statements.
#include <stdio.h>
/* BUILD_64 */
#if defined(__LP64__) || defined(_LP64)
# define BUILD_64 1
#endif
/* BITS_PER_LONG */
#ifdef BUILD_64
# define BITS_PER_LONG 64
#else
# define BITS_PER_LONG 32
#endif
/* CHAR_BIT */
#ifndef CHAR_BIT
# define CHAR_BIT 8
#endif
char *binpad (unsigned long n, size_t sz);
int main (void) {
unsigned short t, mask;
mask = (10 << 8) | 40;
printf ("\n mask : %s (%hu)\n\n",
binpad (mask, sizeof mask * CHAR_BIT), mask);
for (t = 0; t<16; t++)
if (mask & (1 << (15 - t)))
printf (" t %2hu : %s (%hu)\n", t,
binpad (mask & (1 << (15 - t)), sizeof mask * CHAR_BIT),
mask & (1 << (15 - t)));
return 0;
}
/** returns pointer to binary representation of 'n' zero padded to 'sz'.
* returns pointer to string containing binary representation of
* unsigned 64-bit (or less ) value zero padded to 'sz' digits.
*/
char *binpad (unsigned long n, size_t sz)
{
static char s[BITS_PER_LONG + 1] = {0};
char *p = s + BITS_PER_LONG;
register size_t i;
for (i = 0; i < sz; i++)
*--p = (n>>i & 1) ? '1' : '0';
return p;
}
Output
$ ./bin/bitmask1040
mask : 0000101000101000 (2600)
t 4 : 0000100000000000 (2048)
t 6 : 0000001000000000 (512)
t 10 : 0000000000100000 (32)
t 12 : 0000000000001000 (8)
The if statement is what I don't understand completely. What's triggering the if? If the bit is greater than 0? ... I just really need to know what's triggering this if statement. I think I might be overthinking it, but does it simply trigger if the current bit is greater than 0?
The C (and C++) if statement "triggers" when the controlling expression evaluates to true, which in C means any non-zero value; zero is false.
C originally had no Boolean type (C99 added _Bool and <stdbool.h>), and conditions still follow the convention that zero (0) is false and any other value is true.
if (mask & (1 << (15 - t))) {...}
is the same as
if ((mask & (1 << (15 - t))) != 0) {...}
The expression you gave is only true (non-zero) when the mask has a bit set in the same position that the 1 was shifted to. For example, for t = 0 it asks whether bit 15 of the mask is set.
N.b.
mask & (1 << (15 - t))
is not limited to 0 or 1: it is either 0 or a value with exactly one bit set.
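If a plain 0-or-1 answer is more convenient than "zero or some power of two", the shift can be applied to the mask instead of to the constant 1. This is a small sketch; bit_at is a name invented for the example.

```c
#include <stdint.h>

/* Returns 1 if the bit examined on loop iteration t is set, else 0.
 * t counts from the most significant bit of a 16-bit mask, matching
 * the loop in the question (t = 0 tests bit 15). */
int bit_at(uint16_t mask, unsigned t)
{
    return (mask >> (15u - t)) & 1u;
}
```

For every t from 0 to 15, bit_at(mask, t) is non-zero exactly when mask & (1 << (15 - t)) is non-zero, so either form can drive the if statement.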

Efficient algorithm for finding a byte in a bit array

Given a bytearray uint8_t data[N] what is an efficient method to find a byte uint8_t search within it even if search is not octet aligned? i.e. the first three bits of search could be in data[i] and the next 5 bits in data[i+1].
My current method involves creating a bool get_bit(const uint8_t* src, struct internal_state* state) function (struct internal_state contains a mask that is bitshifted right, &ed with src and returned, maintaining size_t src_index < size_t src_len) , leftshifting the returned bits into a uint8_t my_register and comparing it with search every time, and using state->src_index and state->src_mask to get the position of the matched byte.
Is there a better method for this?
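For reference, the bit-at-a-time approach described above can be condensed into one function. This is only a sketch of that idea, not the asker's actual code: it assumes bits are read most significant first, and find_byte_bitwise is a made-up name. It is slow, but useful as a baseline for checking faster variants.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the bit offset of the first occurrence of search in data,
 * reading bits most significant first, or -1 if it never occurs. */
long find_byte_bitwise(const uint8_t *data, size_t len, uint8_t search)
{
    uint8_t reg = 0;                          /* the last 8 bits seen */
    for (size_t bit = 0; bit < len * 8; bit++) {
        unsigned b = (data[bit / 8] >> (7 - bit % 8)) & 1u;
        reg = (uint8_t)((reg << 1) | b);
        if (bit >= 7 && reg == search)        /* need 8 bits before comparing */
            return (long)(bit - 7);
    }
    return -1;
}
```

With data = {0x12, 0x34}, the pattern 0x23 straddles the two bytes and is found at bit offset 4.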
If you're searching an eight bit pattern within a large array you can implement a sliding window over 16 bit values to check if the searched pattern is part of the two bytes forming that 16 bit value.
To be portable you have to take care of endianness issues which is done by my implementation by building the 16 bit value to search for the pattern manually. The high byte is always the currently iterated byte and the low byte is the following byte. If you do a simple conversion like value = *((unsigned short *)pData) you will run into trouble on x86 processors...
Once value, cmp and mask are set up, cmp and mask are shifted right one bit at a time. If the pattern is not found starting within the high byte, the loop continues with the next byte as the start byte.
Here is my implementation including some debug printouts (the function returns the bit position or -1 if pattern was not found):
int findPattern(unsigned char *data, int size, unsigned char pattern)
{
int result = -1;
unsigned char *pData;
unsigned char *pEnd;
unsigned short value;
unsigned short mask;
unsigned short cmp;
int tmpResult;
if ((data != NULL) && (size > 0))
{
pData = data;
pEnd = data + size;
while ((pData < pEnd) && (result == -1))
{
if ((pData + 1) < pEnd) /* still at least two bytes to check? */
{
printf("\n\npData = {%02x, %02x, ...};\n", pData[0], pData[1]); /* debug output; pData[1] is in bounds here */
tmpResult = (int)(pData - data) * 8; /* calculate bit offset according to current byte */
/* avoid endianness troubles by "manually" building value! */
value = *pData << 8;
pData++;
value += *pData;
/* create a sliding window to check if search pattern is within value */
cmp = pattern << 8;
mask = 0xFF00;
while (mask > 0x00FF) /* the low byte is checked within next iteration! */
{
printf("cmp = %04x, mask = %04x, tmpResult = %d\n", cmp, mask, tmpResult);
if ((value & mask) == cmp)
{
result = tmpResult;
break;
}
tmpResult++; /* count bits! */
mask >>= 1;
cmp >>= 1;
}
}
else
{
/* only one chance left if there is only one byte left to check! */
if (*pData == pattern)
{
result = (int)(pData - data) * 8;
}
pData++;
}
}
}
return (result);
}
I don't think you can do much better than this in C:
/*
* Searches for the 8-bit pattern represented by 'needle' in the bit array
* represented by 'haystack'.
*
* Returns the index *in bits* of the first appearance of 'needle', or
* -1 if 'needle' is not found.
*/
int search(uint8_t needle, int num_bytes, uint8_t haystack[num_bytes]) {
if (num_bytes > 0) {
uint16_t window = haystack[0];
if (window == needle) return 0;
for (int i = 1; i < num_bytes; i += 1) {
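window = (uint16_t)((window << 8) | haystack[i]); /* parentheses needed: + and | bind differently from << */
/* Candidate for unrolling: */
for (int j = 7; j >= 0; j -= 1) {
if (((window >> j) & 0xff) == needle) { /* == binds tighter than &, so parenthesize */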
return 8 * i - j;
}
}
}
}
return -1;
}
The main idea is to handle the 87.5% of cases that cross the boundary between consecutive bytes by pairing bytes in a wider data type (uint16_t in this case). You could adjust it to use an even wider data type, but I'm not sure that would gain anything.
What you cannot safely or easily do is anything involving casting part or all of your array to a wider integer type via a pointer (i.e. (uint16_t *)&haystack[i]). You cannot be ensured of proper alignment for such a cast, nor of the byte order with which the result might be interpreted.
I don't know if it would be better, but i would use sliding window.
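/* Needs <stddef.h> and <stdint.h>; assumes len > 0.
 * Bits are taken least significant first within each byte. */
long find_lsb_first(const uint8_t *data, size_t len, uint8_t search)
{
size_t counter = 0;
unsigned feeder = 8; /* number of valid bits currently in window */
unsigned window = data[0];
while (search ^ (window & 0xff)){
window >>= 1;
feeder--;
if (feeder < 8){
counter++;
if (counter >= len) {
feeder = 0;
break;
}
window |= (unsigned)data[counter] << feeder;
feeder += 8;
}
}
//Returns index of first bit of first sequence occurrence or -1 if sequence is not found
return (feeder > 0) ? (long)((counter+1)*8-feeder) : -1;
}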
Also with some alterations you can use this method to search for arbitrary length (1 to 64-array_element_size_in_bits) bits sequence.
If AVX2 is acceptable (with earlier versions it didn't work out so well, but you can still do something there), you can search in a lot of places at the same time. I couldn't test this on my machine (only compile) so the following is more to give to you an idea of how it could be approached than copy&paste code, so I'll try to explain it rather than just code-dump.
The main idea is to read an uint64_t, shift it right by all values that make sense (0 through 7), then for each of those 8 new uint64_t's, test whether the byte is in there. Small complication: for the uint64_t's shifted by more than 0, the highest position should not be counted since it has zeroes shifted into it that might not be in the actual data. Once this is done, the next uint64_t should be read at an offset of 7 from the current one, otherwise there is a boundary that is not checked across. That's fine though, unaligned loads aren't so bad anymore, especially if they're not wide.
So now for some (untested, and incomplete, see below) code,
__m256i needle = _mm256_set1_epi8(find);
size_t i;
for (i = 0; i + 7 < n; i += 7) { // all 8 loaded bytes must be in bounds
// unaligned load here, but that's OK
uint64_t d = *(uint64_t*)(data + i);
__m256i x = _mm256_set1_epi64x(d);
__m256i low = _mm256_srlv_epi64(x, _mm256_set_epi64x(3, 2, 1, 0));
__m256i high = _mm256_srlv_epi64(x, _mm256_set_epi64x(7, 6, 5, 4));
low = _mm256_cmpeq_epi8(low, needle);
high = _mm256_cmpeq_epi8(high, needle);
// in the qword right-shifted by 0, all positions are valid
// otherwise, the top position corresponds to an incomplete byte
uint32_t lowmask = 0x7f7f7fffu & _mm256_movemask_epi8(low);
uint32_t highmask = 0x7f7f7f7fu & _mm256_movemask_epi8(high);
uint64_t mask = lowmask | ((uint64_t)highmask << 32);
if (mask) {
int bitindex = __builtin_ctzll(mask); // zero-based index of the lowest set bit (ffs would be one-based)
// the bit-index and byte-index are swapped
return 8 * (i + (bitindex & 7)) + (bitindex >> 3);
}
}
The funny "bit-index and byte-index are swapped" thing is because searching within a qword is done byte by byte and the results of those comparisons end up in 8 adjacent bits, while the search for "shifted by 1" ends up in the next 8 bits and so on. So in the resulting masks, the index of the byte that contains the 1 is a bit-offset, but the bit-index within that byte is actually the byte-offset, for example 0x8000 would correspond to finding the byte at the 7th byte of the qword that was right-shifted by 1, so the actual index is 8*7+1.
There is also the issue of the "tail", the part of the data left over when all blocks of 7 bytes have been processed. It can be done much the same way, but now more positions contain bogus bytes. Now n - i bytes are left over, so the mask has to have n - i bits set in the lowest byte, and one fewer for all other bytes (for the same reason as earlier, the other positions have zeroes shifted in). Also, if there is exactly 1 byte "left", it isn't really left because it would have been tested already, but that doesn't really matter. I'll assume the data is sufficiently padded that accessing out of bounds doesn't matter. Here it is, untested:
if (i < n - 1) {
// make n-i-1 bits, then copy them to every byte
uint32_t validh = ((1u << (n - i - 1)) - 1) * 0x01010101;
// the lowest position has an extra valid bit, set lowest zero
uint32_t validl = (validh + 1) | validh;
uint64_t d = *(uint64_t*)(data + i);
__m256i x = _mm256_set1_epi64x(d);
__m256i low = _mm256_srlv_epi64(x, _mm256_set_epi64x(3, 2, 1, 0));
__m256i high = _mm256_srlv_epi64(x, _mm256_set_epi64x(7, 6, 5, 4));
low = _mm256_cmpeq_epi8(low, needle);
high = _mm256_cmpeq_epi8(high, needle);
uint32_t lowmask = validl & _mm256_movemask_epi8(low);
uint32_t highmask = validh & _mm256_movemask_epi8(high);
uint64_t mask = lowmask | ((uint64_t)highmask << 32);
if (mask) {
int bitindex = __builtin_ctzll(mask); // zero-based index of the lowest set bit
return 8 * (i + (bitindex & 7)) + (bitindex >> 3);
}
}
If you are searching a large amount of memory and can afford an expensive setup, another approach is to use a 64K lookup table. For each possible 16-bit value, the table stores a byte containing the bit shift offset at which the matching octet occurs (+1, so 0 can indicate no match). You can initialize it like this:
static uint8_t g_pLookupTable[65536]; // one entry per 16-bit window (a malloc call is not allowed in a file-scope initializer)
void initLUT(uint8_t octet)
{
memset(g_pLookupTable, 0, 65536); // zero out
for(int i = 0; i < 65536; i++)
{
for(int j = 7; j >= 0; j--)
{
if(((i >> j) & 255) == octet)
{
g_pLookupTable[i] = j + 1;
break;
}
}
}
}
Note that the case where the value is shifted 8 bits is not included (the reason will be obvious in a minute).
Then you can scan through your array of bytes like this:
int findByteMatch(uint8_t* pArray, uint8_t octet, int length)
{
if(length > 0) // pArray[0] is read below, so at least one byte is required
{
uint16_t index = (uint16_t)pArray[0];
if(index == octet)
return 0;
for(int bit, i = 1; i < length; i++)
{
index = (index << 8) | pArray[i];
if((bit = g_pLookupTable[index]) != 0) // assignment is intentional
return (i * 8) - (bit - 1);
}
}
return -1;
}
Further optimization:
Read 32 or however many bits at a time from pArray into a uint32_t and then shift and AND each to get byte one at a time, OR with index and test, before reading another 4.
Pack the LUT into 32K by storing a nybble for each index. This might help it squeeze into the cache on some systems.
It will depend on your memory architecture whether this is faster than an unrolled loop that doesn't use a lookup table.

How can I store variable length codes sequentially in memory?

Say I have a two dimensional array where each entry contains a length and a value:
int array[4][2] = { /* {length, value}, */
{5, 3},
{6, 7},
{1, 0},
{8, 15},
};
I want to store them sequentially into memory with leading zeros to make each field the appropriate length. The example above would be:
00011 000111 0 00001111
The first block is five bits long and stores decimal 3. The second block is six bits long and stores decimal seven. The third block is one bit long and stores decimal 0, and the last block is eight bits long and stores decimal 15.
I can do it with some bitwise manipulation but I thought I would ask to see if there is an easier way.
I am coding in C for a Tensilica 32-bit RISC processor.
The purpose is to write a sequence of Exponential-Golomb codes.
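For context, a {length, value} pair of the kind shown above can be derived for an order-0 Exp-Golomb code as in this sketch. The function name exp_golomb_pair and the two-element output array are assumptions made for illustration, chosen to match the layout of the array in the question.

```c
/* Compute the {length, value} pair of the order-0 Exp-Golomb code for x:
 * write x + 1 in binary (nbits bits) and prefix it with nbits - 1 zeros.
 * x must be less than UINT_MAX so that x + 1 does not wrap. */
void exp_golomb_pair(unsigned int x, unsigned int pair[2])
{
    unsigned int v = x + 1;
    unsigned int nbits = 0;
    for (unsigned int t = v; t != 0; t >>= 1)
        nbits++;                 /* count significant bits of x + 1 */
    pair[0] = 2 * nbits - 1;     /* total length in bits */
    pair[1] = v;                 /* value; the leading zeros are implied by the length */
}
```

For example, x = 3 gives {5, 4}, i.e. the bit string 00100. Storing the value in a field of the given length produces the leading zeros automatically.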
EDIT: SOLUTION:
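#include <stdio.h>
#include <stdlib.h>
/* The three macros below were not shown in the post; these values are inferred from how they are used. */
#define NUM_FIELDS 7
#define LENGTH 0
#define VALUE 1
void setBit(unsigned char *array, unsigned int array_len, unsigned int bit_num, unsigned int bit_value);
int main(int argc, char *argv[])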
{
unsigned int i = 0, j = 0;
unsigned char bit = 0;
unsigned int bit_num = 0;
unsigned int field_length_bits = 0;
unsigned int field_length_bytes = 0;
unsigned int field_array_length = 0;
unsigned int field_list[NUM_FIELDS][2] = {
/*{Length, Value},*/
{4, 3},
{5, 5},
{6, 9},
{7, 11},
{8, 13},
{9, 15},
{10, 17},
};
unsigned char *seq_array;
// Find total length of field list in bits
for (i = 0; i < NUM_FIELDS; i++)
field_length_bits += field_list[i][LENGTH];
// Number of bytes needed to store FIELD parameters
for (i = 0; (field_length_bits + i) % 8 != 0; i++) ;
field_length_bytes = (field_length_bits + i) / 8;
// Size of array we need to allocate (multiple of 4 bytes)
for (i = 0; (field_length_bytes + i) % 4 != 0; i++) ;
field_array_length = (field_length_bytes + i);
// Allocate memory
seq_array = (unsigned char *) calloc(field_array_length, sizeof(unsigned char));
// Traverse source and set destination
for(i = 0; i < NUM_FIELDS; i++)
{
for(j = 0; j < field_list[i][LENGTH]; j++)
{
bit = 0x01 & (field_list[i][VALUE] >> (field_list[i][LENGTH] - j - 1));
if (bit)
setBit(seq_array, field_array_length, bit_num, 1);
else
setBit(seq_array, field_array_length, bit_num, 0);
bit_num++;
}
}
return 0;
}
void setBit(unsigned char *array, unsigned int array_len, unsigned int bit_num, unsigned int bit_value)
{
unsigned int byte_location = 0;
unsigned int bit_location = 0;
byte_location = bit_num / 8;
if(byte_location > array_len - 1)
{
printf("setBit(): Unauthorized memory access");
return;
}
bit_location = bit_num % 8;
if(bit_value)
array[byte_location] |= (1 << (7-bit_location));
else
array[byte_location] &= ~(1 << (7-bit_location));
return;
}
You can use a bitstream library:
Highly recommended, because it seems to be very self-contained and doesn't seem to require external includes:
http://cpansearch.perl.org/src/KURIHARA/Imager-QRCode-0.033/src/bitstream.c
http://cpansearch.perl.org/src/KURIHARA/Imager-QRCode-0.033/src/bitstream.h
http://www.codeproject.com/Articles/32783/CBitStream-A-simple-C-class-for-reading-and-writin - C library, but using windows WORD, DWORD types (you can still typedef to use this library)
http://code.google.com/p/youtube-mobile-ffmpeg/source/browse/trunk/libavcodec/bitstream.c?r=8 - includes quite a few other include files to use the bitstream library
If you just want exponential golomb codes, there are open-source C implementations:
http://www.koders.com/c/fid8A317DF502A7D61CC96EC4DA07021850B6AD97ED.aspx?s=gcd
Or you can use bit manipulation techniques.
For example:
unsigned int array[4][2] = ???
unsigned int mem[100] = {0};
int index=0,bit=0;
for (int i=0;i<4;i++) {
int shift = 32 - (int)array[i][0] - bit;
if (shift>=0) mem[index] |= array[i][1] << shift; /* OR the bits in; AND with zeroed memory would store nothing */
else {
mem[index] |= array[i][1] >> -shift;
mem[index+1] |= array[i][1] << (32+shift);
}
bit += array[i][0]; /* advance by the field length, not its value */
if (bit>=32) {
bit-=32;
index++;
}
}
Disclaimer:
The code only works if your computer byte-order is little endian, and the result will actually be little-endian within each 4-byte boundary, and big-endian across 4-byte boundaries. If you convert mem from int type to char, and replace the constants 32 to 8, you will get a big-endian representation of your bit-array.
It also assumes that the length is less than 32. Obviously, the code you actually want will depend on the bounds of valid input, and what you want in terms of byte-ordering.
Do you mean something like a bit field?
struct myBF
{
unsigned int v1 : 5;
unsigned int v2 : 6;
unsigned int v3 : 1;
unsigned int v4 : 8;
};
struct myBF b = { 3, 7, 0, 15 };
I may be misunderstanding your requirements entirely. Please comment if that's the case.
Update: Suppose you want to do this dynamically. Let's make a function that accepts an array of pairs, like in your example, and an output buffer:
/* Fill dst with bits.
* Returns the number of bytes used, or 0 on error.
* Needs <limits.h> (CHAR_BIT), <stdio.h> (printf) and <string.h> (memset).
*/
size_t bitstream(int (*arr)[2], size_t arrlen,
unsigned char * dst, size_t dstlen)
{
size_t total_bits = 0, bits_so_far = 0;
/* Check if there's enough space */
for (size_t i = 0; i != arrlen; ++i) { total_bits += arr[i][0]; }
if (dst == NULL || total_bits > CHAR_BIT * dstlen) { return 0; }
/* Set the output range to all zero */
memset(dst, 0, dstlen);
/* Populate the output range */
for (size_t i = 0; i != arrlen; ++i)
{
for (size_t bits_to_spend = arr[i][0], value = arr[i][1];
bits_to_spend != 0; /* no increment */ )
{
size_t const bit_offset = bits_so_far % CHAR_BIT;
size_t const byte_index = bits_so_far / CHAR_BIT;
size_t const cur_byte_capacity = CHAR_BIT - bit_offset;
/* Debug: Watch it work! */
printf("Need to store %zu, %zu bits to spend, capacity %zu.\n",
value, bits_to_spend, cur_byte_capacity);
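/* keep only the bits still to be spent, so an oversized value cannot spill into later fields */
dst[byte_index] |= (unsigned char)((value & (((size_t)1 << bits_to_spend) - 1)) << bit_offset);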
if (cur_byte_capacity < bits_to_spend)
{
value >>= cur_byte_capacity;
bits_so_far += cur_byte_capacity;
bits_to_spend -= cur_byte_capacity;
}
else
{
bits_so_far += bits_to_spend;
bits_to_spend = 0;
}
}
}
return (bits_so_far + CHAR_BIT - 1) / CHAR_BIT;
}
Notes:
If the number arr[i][1] does not fit into arr[i][0] bits, only the residue modulo 2^arr[i][0] is stored.
To be perfectly correct, the array type should be unsigned as well, otherwise the initialization size_t value = arr[i][1] may be undefined behaviour.
You can modify the error handling behaviour. For example, you could forgo transactionality and move the length check into the main loop. Also, instead of returning 0, you could return the number of required bytes, so that the user can figure out how big the destination array needs to be (like snprintf does).
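The last note could be sketched roughly as follows. The helper name `bitstream_required_bytes` is hypothetical and not part of the code above; it only sums the widths and reports how many bytes the destination needs, so a caller can size the buffer before calling bitstream. It assumes every width arr[i][0] is non-negative.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes needed to hold all fields in arr,
 * where arr[i][0] is the bit width of field i (assumed non-negative). */
size_t bitstream_required_bytes(int (*arr)[2], size_t arrlen)
{
    size_t total_bits = 0;
    for (size_t i = 0; i != arrlen; ++i) {
        assert(arr[i][0] >= 0);
        total_bits += (size_t)arr[i][0];
    }
    /* Round up to whole bytes. */
    return (total_bits + CHAR_BIT - 1) / CHAR_BIT;
}
```

A caller would allocate that many bytes and then pass the buffer and its size to bitstream.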
Usage:
unsigned char dst[N];
size_t n = bitstream(array, sizeof array / sizeof *array, dst, sizeof dst);
for (size_t i = 0; i != n; ++i) { printf("0x%02X ", dst[n - i - 1]); }
For your example, this will produce 0x00 0xF0 0xE3, which is:
0x00 0xF0 0xE3
00000000 11110000 11100011
0000 00001111 0 000111 00011
padd 15 0 7 3
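To check a layout like this, it can help to read the fields back out. Below is a minimal sketch of a reader; `read_bits_lsb` is a hypothetical name, and it assumes the same convention as the writer above (bit 0 is the least significant bit of the first byte, values are stored least significant bit first). The field widths 5, 6, 1 and 8 used in the usage note are inferred from the bit breakdown shown above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: read `width` bits starting at bit `pos`
 * (bit 0 = least significant bit of src[0]), least significant bit first. */
unsigned long read_bits_lsb(const unsigned char *src, size_t pos, unsigned width)
{
    unsigned long value = 0;
    assert(width <= sizeof value * CHAR_BIT);
    for (unsigned i = 0; i < width; ++i) {
        size_t bit = pos + i;
        unsigned long b = (src[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1u;
        value |= b << i;
    }
    return value;
}
```

With the bytes 0xE3, 0xF0, 0x00 in memory order, reading widths 5, 6, 1 and 8 at bit offsets 0, 5, 11 and 12 should give back 3, 7, 0 and 15.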
In standard C there is no way to access anything smaller than a char other than the "bitwise manipulation" you mention. I'm afraid you're out of luck, unless you come across a library somewhere out there that can help you.
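For reference, that kind of bitwise manipulation usually boils down to a pair of small helpers like the ones below. This is only a sketch; the names `get_bit_at` and `set_bit_at` are made up for illustration, and bit 0 is taken to be the least significant bit of the first byte.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helpers; bit 0 is the least significant bit of buf[0]. */
int get_bit_at(const unsigned char *buf, size_t pos)
{
    assert(buf != NULL);
    return (buf[pos / CHAR_BIT] >> (pos % CHAR_BIT)) & 1;
}

void set_bit_at(unsigned char *buf, size_t pos, int on)
{
    unsigned char mask = (unsigned char)(1u << (pos % CHAR_BIT));
    assert(buf != NULL);
    if (on)
        buf[pos / CHAR_BIT] |= mask;                  /* turn the bit on */
    else
        buf[pos / CHAR_BIT] &= (unsigned char)~mask;  /* turn the bit off */
}
```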

Strip parity bits in C from 8 bits of data followed by 1 parity bit

I have a buffer of bits with 8 bits of data followed by 1 parity bit. This pattern repeats itself. The buffer is currently stored as an array of octets.
Example (p are parity bits):
0001 0001 p000 0100 0p00 0001 00p01 1100 ...
should become
0001 0001 0000 1000 0000 0100 0111 00 ...
Basically, I need to strip off every ninth bit to obtain just the data bits. How can I achieve this?
This is related to another question asked here sometime back.
This is on a 32 bit machine so the solution to the related question may not be applicable. The maximum possible number of bits is 45 i.e. 5 data octets
This is what I have tried so far. I created a "boolean" array and added the bits to it based on the bits set in each octet. I then look at every ninth index of the array and throw it away, then move the remaining array down one index. That leaves only the data bits. I was thinking there may be better ways of doing this.
Your idea of having an array of bits is good. Just implement the array of bits by a 32-bit number (buffer).
To remove a bit from the middle of the buffer:
void remove_bit(uint32_t* buffer, int* occupancy, int pos)
{
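assert(*occupancy > 0);
assert(pos >= 0 && pos < *occupancy);
/* Bits above pos, moved down by one position to close the gap. */
uint32_t high_half = (*buffer >> pos >> 1) << pos;
/* Bits below pos stay in place; a mask avoids an undefined shift by 32 when pos is 0. */
uint32_t low_half = *buffer & ((UINT32_C(1) << pos) - 1);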
*buffer = high_half | low_half;
--*occupancy;
}
To add a byte to the buffer:
void add_byte(uint32_t* buffer, int* occupancy, uint8_t byte)
{
assert(*occupancy <= 24);
*buffer = (*buffer << 8) | byte;
*occupancy += 8;
}
To remove a byte from the buffer:
uint8_t remove_byte(uint32_t* buffer, int* occupancy)
{
assert(*occupancy >= 8); /* check before shifting, so the shift count is never negative */
uint8_t result = (uint8_t)(*buffer >> (*occupancy - 8));
*occupancy -= 8;
return result;
}
You will have to arrange the calls so that the buffer never overflows. For example:
uint32_t buffer = 0;
int occupancy = 0;
add_byte(&buffer, &occupancy, *input++);
add_byte(&buffer, &occupancy, *input++);
remove_bit(&buffer, &occupancy, 7);
*output++ = remove_byte(&buffer, &occupancy);
add_byte(&buffer, &occupancy, *input++);
remove_bit(&buffer, &occupancy, 6);
*output++ = remove_byte(&buffer, &occupancy);
... (there are only 6 input bytes, so this should be easy)
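If you would rather not unroll the calls by hand, the same accumulator idea can be put in a loop. The following is a sketch, not the code above: `strip_parity_acc` is a hypothetical name, and it assumes the stream is MSB first with the parity bit immediately after each group of 8 data bits. Trailing bits that do not form a full 9-bit group are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: a 32-bit accumulator that emits a data byte and
 * drops its parity bit whenever at least 9 bits are available.
 * Returns the number of data bytes written to out. */
size_t strip_parity_acc(const uint8_t *in, size_t in_len, uint8_t *out)
{
    uint32_t acc = 0;   /* holds at most 15 valid bits */
    int occupancy = 0;  /* number of valid low bits in acc */
    size_t written = 0;

    for (size_t i = 0; i < in_len; ++i) {
        acc = (acc << 8) | in[i];
        occupancy += 8;
        if (occupancy >= 9) {
            /* The top 8 valid bits are data; the next one is parity. */
            out[written++] = (uint8_t)(acc >> (occupancy - 8));
            occupancy -= 9;
            acc &= (UINT32_C(1) << occupancy) - 1;  /* keep only unread bits */
        }
    }
    assert(occupancy < 9);
    return written;
}
```

For six input bytes (48 bits) this writes five data bytes and leaves three padding bits unread.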
In pseudo-code (since you're not providing any proof you've tried something), I would probably do it like this, for simplicity:
View the data (with parity bits included) as a stream of bits
While there are bits left to read:
    Read the next 8 bits
    Write to the output
    Read one more bit, and discard it
This "lifts you up" from worrying about reading bytes, which no longer is a useful operation since your bytes are interleaved with bits you want to discard.
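A direct, unoptimized translation of the pseudo-code into C might look like this. It is a sketch under a few assumptions: bits are numbered MSB first (bit 0 is the top bit of the first input byte), the caller passes the number of valid input bits, and the names `stream_bit` and `strip_parity_stream` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read bit number `pos` of the stream, MSB first
 * (pos 0 is the top bit of in[0]). */
static unsigned stream_bit(const uint8_t *in, size_t pos)
{
    return (in[pos / 8] >> (7 - pos % 8)) & 1u;
}

/* Copies 8 data bits, skips 1 parity bit, and repeats while a full
 * 9-bit group is available. Returns the number of bytes written. */
size_t strip_parity_stream(const uint8_t *in, size_t in_bits, uint8_t *out)
{
    size_t pos = 0, written = 0;
    assert(in != NULL && out != NULL);
    while (in_bits - pos >= 9) {
        uint8_t byte = 0;
        for (int k = 0; k < 8; ++k)   /* read the next 8 bits */
            byte = (uint8_t)((byte << 1) | stream_bit(in, pos++));
        out[written++] = byte;        /* write to the output */
        ++pos;                        /* read one more bit and discard it */
    }
    return written;
}
```

Reading one bit at a time is slow for large inputs, but with at most 45 bits that hardly matters here.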
I have written helper functions to read unaligned bit buffers (this was for AVC streams, see original source here). The code itself is GPL, I'm pasting interesting (modified) bits here.
typedef struct bit_buffer_ {
uint8_t * start;
size_t size;
uint8_t * current;
uint8_t read_bits;
} bit_buffer;
/* reads one bit and returns its value as an 8-bit integer */
uint8_t get_bit(bit_buffer * bb) {
uint8_t ret;
ret = (*(bb->current) >> (7 - bb->read_bits)) & 0x1;
if (bb->read_bits == 7) {
bb->read_bits = 0;
bb->current++;
}
else {
bb->read_bits++;
}
return ret;
}
/* reads up to 32 bits and returns the value as a 32-bit integer */
uint32_t get_bits(bit_buffer * bb, size_t nbits) {
uint32_t i, ret;
ret = 0;
for (i = 0; i < nbits; i++) {
ret = (ret << 1) + get_bit(bb);
}
return ret;
}
You can use the structure like this:
uint8_t * buffer;
size_t buffer_size;
/* assumes buffer points to your data */
bit_buffer bb;
bb.start = buffer;
bb.size = buffer_size;
bb.current = buffer;
bb.read_bits = 0;
uint32_t value = get_bits(&bb, 8);
uint8_t parity = get_bit(&bb);
uint32_t value2 = get_bits(&bb, 8);
uint8_t parity2 = get_bit(&bb);
/* etc */
I must stress that this code can be improved: in particular, proper bounds checking must be added. It works fine in my use case, though.
I leave it as an exercise to you to implement a proper bit buffer reader using this for inspiration.
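As a starting point for that exercise, here is one possible shape for a bounds-checked reader. It is a sketch rather than the original GPL code: the struct `checked_reader` and the function `checked_get_bits` are hypothetical names, the size is tracked in bits, and a failed read leaves the reader untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds-checked variant; names are illustrative. */
typedef struct {
    const uint8_t *data;
    size_t size_bits;   /* total number of readable bits */
    size_t pos;         /* index of the next bit, MSB first */
} checked_reader;

/* Reads nbits (at most 32) into *out. Returns 1 on success, or 0 if fewer
 * than nbits remain, in which case the reader is left unchanged. */
int checked_get_bits(checked_reader *r, unsigned nbits, uint32_t *out)
{
    uint32_t value = 0;
    assert(nbits <= 32);
    if (r->size_bits - r->pos < nbits)
        return 0;
    for (unsigned i = 0; i < nbits; ++i, ++r->pos)
        value = (value << 1) | ((r->data[r->pos / 8] >> (7 - r->pos % 8)) & 1u);
    *out = value;
    return 1;
}
```

To strip parity, a caller would alternate reads of 8 bits and 1 bit until the function returns 0.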
This also works
void RemoveParity(unsigned char buffer[], int size)
{
int offset = 0;
int j = 0;
for(int i = 1; i + j < size; i++)
{
if (offset == 0)
{
printf("%u\n", buffer[i + j - 1]);
}
else
{
unsigned char left = buffer[i + j - 1] << offset;
unsigned char right = buffer[i + j] >> (8 - offset);
printf("%u\n", (unsigned char)(left | right));
}
offset++;
if (offset == 8)
{
offset = 0;
j++; // advance buffer (8 parity bit consumed)
}
}
}

Resources