Fast strlen with bit operations - c

I found this code
int strlen_my(const char *s)
{
int len = 0;
for(;;)
{
unsigned x = *(unsigned*)s;
if((x & 0xFF) == 0) return len;
if((x & 0xFF00) == 0) return len + 1;
if((x & 0xFF0000) == 0) return len + 2;
if((x & 0xFF000000) == 0) return len + 3;
s += 4, len += 4;
}
}
I'm very interested in knowing how it works. Can anyone explain how it works?

A bitwise AND with ones will retrieve the bit pattern from the other operand. Meaning, 10101 & 11111 = 10101. If the result of that bitwise AND is 0, then we know the other operand was 0. A result of 0 when ANDing a single byte with 0xFF (ones) will indicate a NUL byte.
The code itself checks each byte of the char array in four-byte partitions. NOTE: This code isn't portable; on another machine or compiler, an unsigned int could be more than 4 bytes. It would probably be better to use the uint32_t data type to ensure 32-bit unsigned integers.
The first thing to note is that on a little-endian machine, the bytes making up the character array will be read into an unsigned data type in reverse order; that is, if the four bytes at the current address are the bit pattern corresponding to abcd, then the unsigned variable will contain the bit pattern corresponding to dcba.
The second is that a hexadecimal number constant in C results in an int-sized number with the specified bytes at the little-end of the bit pattern. Meaning, 0xFF is actually 0x000000FF when compiling with 4-byte ints. 0xFF00 is 0x0000FF00. And so on.
So the program is basically looking for the NUL character in the four possible positions. If there is no NUL character in the current partition, it advances to the next four-byte slot.
Take the char array abcdef for an example. In C, string constants will always have null terminators at the end, so there's a 0x00 byte at the end of that string.
It'll work as follows:
Read "abcd" into unsigned int x:
x: 0x64636261 [ASCII representations for "dcba"]
Check each byte for a null terminator:
0x64636261
& 0x000000FF
0x00000061 != 0,
0x64636261
& 0x0000FF00
0x00006200 != 0,
And check the other two positions; there are no null terminators in this 4-byte partition, so advance to the next partition.
Read "ef" into unsigned int x:
x: 0xBF006665 [ASCII representations for "fe"]
Note the 0xBF byte; this is past the end of the string, so we're reading in garbage from the runtime stack. It could be anything. The read itself is also a hazard: on a machine that doesn't allow unaligned accesses, loading 4 bytes through an unsigned pointer can fault unless the address is suitably aligned, and the bytes past the terminator may not belong to the array at all. If there were just one character left in the string, we'd be reading two extra bytes past the terminator.
Check each byte for a null terminator:
0xBF006665
& 0x000000FF
0x00000065 != 0,
0xBF006665
& 0x0000FF00
0x00006600 != 0,
0xBF006665
& 0x00FF0000
0x00000000 == 0 !!!
So we return len + 2; len was 4 since we incremented it once by 4, so we return 6, which is indeed the length of the string.
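The walk-through above can be reproduced without depending on the host's byte order. This is a minimal sketch, assuming 8-bit bytes; load_le32 and first_zero_byte are hypothetical helper names, not part of the original code. It assembles four bytes the way a little-endian load would and then applies the same four masks:

```c
#include <stdint.h>

/* Combine four bytes as a little-endian machine would load them:
   the byte at the lowest address ends up in the low 8 bits. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Index (0..3) of the first zero byte in x, or 4 if there is none.
   These are the same four masks the original loop tests. */
int first_zero_byte(uint32_t x)
{
    if ((x & 0x000000FFu) == 0) return 0;
    if ((x & 0x0000FF00u) == 0) return 1;
    if ((x & 0x00FF0000u) == 0) return 2;
    if ((x & 0xFF000000u) == 0) return 3;
    return 4;
}
```

With these, load_le32 on the bytes of "abcd" gives 0x64636261, and first_zero_byte(0xBF006665) is 2, which matches the len + 2 result above.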

The code "works" by attempting to read 4 bytes at a time, assuming the string is laid out and accessible like an array of int. It reads an int and then tests each of its bytes in turn for the null character. In theory, working with an int will run faster than 4 individual char operations.
But there are problems:
Alignment is an issue: e.g. *(unsigned*)s may seg-fault.
Endianness is an issue: if((x & 0xFF) == 0) might not test the byte at address s.
s += 4 is a problem, as sizeof(int) may differ from 4.
Array sizes may exceed the int range; better to use size_t.
An attempt to right these difficulties:
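#include <limits.h> /* UINT_MAX, UCHAR_MAX, CHAR_BIT */
#include <stddef.h> /* size_t, max_align_t */
#include <stdint.h> /* uintptr_t */
static inline int aligned_as_int(const char *s) {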
max_align_t mat; // C11
uintptr_t i = (uintptr_t) s;
return i % sizeof mat == 0;
}
size_t strlen_my(const char *s) {
size_t len = 0;
// align
while (!aligned_as_int(s)) {
if (*s == 0) return len;
s++;
len++;
}
for (;;) {
unsigned x = *(unsigned*) s;
#if UINT_MAX >> CHAR_BIT == UCHAR_MAX
if(!(x & 0xFF) || !(x & 0xFF00)) break;
s += 2, len += 2;
#elif UINT_MAX >> CHAR_BIT*3 == UCHAR_MAX
if (!(x & 0xFF) || !(x & 0xFF00) || !(x & 0xFF0000) || !(x & 0xFF000000)) break;
s += 4, len += 4;
#elif UINT_MAX >> CHAR_BIT*7 == UCHAR_MAX
if ( !(x & 0xFF) || !(x & 0xFF00)
|| !(x & 0xFF0000) || !(x & 0xFF000000)
|| !(x & 0xFF00000000) || !(x & 0xFF0000000000)
|| !(x & 0xFF000000000000) || !(x & 0xFF00000000000000)) break;
s += 8, len += 8;
#else
#error TBD code
#endif
}
while (*s++) {
len++;
}
return len;
}

It trades undefined behaviour (unaligned accesses, 75% probability to access beyond the end of the array) for a very questionable speedup (it is very possibly even slower). And is not standard-compliant, because it returns int instead of size_t. Even if unaligned accesses are allowed on the platform, they can be much slower than aligned accesses.
It also does not work on big-endian systems, or if unsigned is not 32 bits. Not to mention the multiple mask and conditional operations.
That said:
It tests four 8-bit bytes at a time by loading an unsigned (which is not even guaranteed to have more than 16 bits). Once any of the bytes contains the '\0' terminator, it returns the sum of the current length plus the position of that byte. Otherwise it increments the current length by the number of bytes tested in parallel (4) and gets the next unsigned.
My advice: bad example of optimization plus too many uncertainties/pitfalls. It's likely not even faster — just profile it against the standard version:
size_t strlen(const char *restrict s)
{
size_t l = 0;
while ( *s++ )
l++;
return l;
}
There might be a way to use special vector-instructions, but unless you can prove this is a critical function, you should leave this to the compiler — some may unroll/speedup such loops much better.

All these proposals are slower than a simple strlen().
The reason is that they do not reduce the number of comparisons and only one deals with alignment.
Look up the strlen() implementation by Torbjorn Granlund (tege@sics.se) and Dan Sahlin (dan@sics.se) on the net. If you are on a 64-bit platform this really helps to speed things up.
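That implementation is not reproduced here. To illustrate how the number of comparisons can be reduced, here is a minimal sketch of a widely known word-at-a-time zero-byte test; whether it matches the Granlund/Sahlin code in detail is an assumption. The names has_zero_byte and bounded_strlen are hypothetical, and the cap parameter (the number of readable bytes) is added only so that the sketch never reads past the buffer:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the four bytes of x is 0x00.
   One test replaces the four separate mask-and-compare steps. */
static int has_zero_byte(uint32_t x)
{
    return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Length of the string at s, examining at most cap bytes. */
size_t bounded_strlen(const char *s, size_t cap)
{
    size_t i = 0;

    /* Whole 4-byte groups: memcpy avoids alignment and aliasing problems. */
    while (cap - i >= 4) {
        uint32_t x;
        memcpy(&x, s + i, sizeof x);
        if (has_zero_byte(x))
            break;              /* the terminator is in this group */
        i += 4;
    }

    /* Locate the exact position byte by byte (at most a few steps). */
    while (i < cap && s[i] != '\0')
        i++;
    return i;
}
```

Because the final position is found by address rather than by mask, the sketch does not depend on the host's byte order. Whether it actually beats the library strlen() is something to measure, not assume.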

It detects if any bits are set at a specific byte on a little-endian machine. Since we're only checking a single byte (since all the nibbles, 0 or 0xF, are doubled up) and it happens to be the last byte position (since the machine is little-endian and the byte pattern for the numbers is therefore reversed) we can immediately know which byte contains NUL.

The loop is taking 4 bytes of the char array for each iteration. The four if statements are used to determine if the string is over, using bitmask with AND operator to read the status of i-th element of the substring selected.

Related

C/C++ code to convert big endian to little endian

I've seen several different examples of code that converts big endian to little endian and vice versa, but I've come across a piece of code someone wrote that seems to work, but I'm stumped as to why it does.
Basically, there's a char buffer that, at a certain position, contains a 4-byte int stored as big-endian. The code would extract the integer and store it as native little endian. Here's a brief example:
char test[8] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07};
char *ptr = test;
int32_t value = 0;
value = ((*ptr) & 0xFF) << 24;
value |= ((*(ptr + 1)) & 0xFF) << 16;
value |= ((*(ptr + 2)) & 0xFF) << 8;
value |= (*(ptr + 3)) & 0xFF;
printf("value: %d\n", value);
value: 66051
The above code takes the first four bytes, stores it as little endian, and prints the result. Can anyone explain step by step how this works? I'm confused why ((*ptr) & 0xFF) << X wouldn't just evaluate to 0 for any X >= 8.
This code is constructing the value, one byte at a time.
First it captures the lowest byte
(*ptr) & 0xFF
And then shifts it to the highest byte
((*ptr) & 0xFF) << 24
And then assigns it to the previously 0 initialized value.
value =((*ptr) & 0xFF) << 24
Now the "magic" comes into play. Since the ptr value was declared as a char* adding one to it advances the pointer by one character.
(ptr + 1) /* the next character address */
*(ptr + 1) /* the next character */
Once you see that they are using pointer math to update the relative starting address, the rest of the operations are the same as the ones already described, except that to preserve the partially shifted values, they OR the new values into the existing value variable
value |= ((*(ptr + 1)) & 0xFF) << 16
Note that pointer math is why you can do things like
char* ptr = ... some value ...
while (*ptr != 0) {
... do something ...
ptr++;
}
but it comes at a price of possibly really messing up your pointer addresses, greatly increasing your risk of a segmentation fault. Some languages saw this as such a problem that they removed the ability to do pointer math. An almost-pointer that you cannot do pointer math on is typically called a reference.
If you want to convert a little-endian representation to big endian you can use htonl, htons, ntohl and ntohs. These functions convert values between host and network byte order. Big endian is also used on some ARM-based platforms. See here: https://linux.die.net/man/3/endian
The code below is based on the idea that numbers on the network are sent in big-endian order.
The functions htonl() and htons() convert a 32-bit and a 16-bit integer, respectively, to big endian when your system is little endian, and leave the numbers unchanged otherwise.
Here is the code:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <arpa/inet.h>
int main(void)
{
uint32_t x,y;
uint16_t s,z;
x=0xFF567890;
y=htonl(x);
printf("LE=%08X BE=%08X\n",x,y);
s=0x7891;
z=htons(s);
printf("LE=%04X BE=%04X\n",s,z);
return 0;
}
This code is written to convert from LE to BE on a LE machine.
You can use the opposite functions ntohl() and ntohs() to convert from BE to LE. These functions convert the integers from BE to LE on LE machines and leave them unchanged on BE machines.
I'm confused why ((*ptr) & 0xFF) << X wouldn't just evaluate to 0 for any X >= 8.
I think you misinterpret the shift functionality.
value = ((*ptr) & 0xFF) << 24;
means a masking of the value at ptr with 0xff (the byte) and afterwards a shift by 24 BITS (not bytes). That is a shift by 24/8 bytes (3 bytes) to the highest byte.
One of the key points to understanding the evaluation of ((*ptr) & 0xFF) << X is integer promotion: the value (*ptr) & 0xFF is promoted to int before being shifted.
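Because the shift happens on an int (at least 16 bits, 32 on most current platforms) and not on a char, the shifted byte is not lost. A minimal sketch of the same extraction follows; read_be32 is a hypothetical name. It casts each byte to uint32_t explicitly, so that a byte of 0x80 or above shifted by 24 never lands in the sign bit of a signed int:

```c
#include <stdint.h>

/* Read a 32-bit big-endian value from 4 bytes.
   The result is the same on little- and big-endian hosts. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)   /* first byte is the most significant */
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];         /* last byte is the least significant */
}
```

For the bytes 0x00, 0x01, 0x02, 0x03 this yields 0x00010203, which is the 66051 printed in the question.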
I've written the code below. This code contains two functions swapmem() and swap64().
swapmem() swaps the bytes of a memory area of an arbitrary dimension.
swap64() swaps the bytes of a 64 bits integer.
At the end of this reply I suggest a way to solve your problem with the byte buffer.
Here is the code:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdlib.h>
void * swapmem(void *x, size_t len, int retnew);
uint64_t swap64(uint64_t k);
/**
brief swapmem
This function swaps the byte into a memory buffer.
param x
pointer to the buffer to be swapped
param len
length of the buffer to be swapped
param retnew
If this parameter is 1 the buffer is swapped in a new
buffer. The new buffer shall be deallocated by using
free() when it's no longer useful.
If this parameter is 0 the buffer is swapped in its
memory area.
return
The pointer to the memory area where the bytes have been
swapped or NULL if an error occurs.
*/
void * swapmem(void *x, size_t len, int retnew)
{
char *b = NULL, app;
size_t i;
if (x != NULL) {
if (retnew) {
b = malloc(len);
if (b!=NULL) {
for(i=0;i<len;i++) {
b[i]=*((char *)x+len-1-i);
}
}
} else {
b=(char *)x;
for(i=0;i<len/2;i++) {
app=b[i];
b[i]=b[len-1-i];
b[len-1-i]=app;
}
}
}
return b;
}
uint64_t swap64(uint64_t k)
{
return ((k << 56) |
((k & 0x000000000000FF00) << 40) |
((k & 0x0000000000FF0000) << 24) |
((k & 0x00000000FF000000) << 8) |
((k & 0x000000FF00000000) >> 8) |
((k & 0x0000FF0000000000) >> 24)|
((k & 0x00FF000000000000) >> 40)|
(k >> 56)
);
}
int main(void)
{
uint32_t x,*y;
uint16_t s,z;
uint64_t k,t;
x=0xFF567890;
/* Dynamic allocation is used to avoid to change the contents of x */
y=(uint32_t *)swapmem(&x,sizeof(x),1);
if (y!=NULL) {
printf("LE=%08X BE=%08X\n",x,*y);
free(y);
}
/* Dynamic allocation is not used. The contents of z and k will change */
z=s=0x7891;
swapmem(&z,sizeof(z),0);
printf("LE=%04X BE=%04X\n",s,z);
k=t=0x1120324351657389;
swapmem(&k,sizeof(k),0);
printf("LE=%16"PRIX64" BE=%16"PRIX64"\n",t,k);
/* LE64 to BE64 (or viceversa) using shift */
k=swap64(t);
printf("LE=%16"PRIX64" BE=%16"PRIX64"\n",t,k);
return 0;
}
After the program was compiled I had the curiosity to see the assembly code gcc generated. I discovered that the function swap64 is generated as indicated below.
00000000004007a0 <swap64>:
4007a0: 48 89 f8 mov %rdi,%rax
4007a3: 48 0f c8 bswap %rax
4007a6: c3 retq
This result is obtained compiling the code, on a PC with Intel I3 CPU, with the gcc options: -Ofast, or -O3, or -O2, or -Os.
You may solve your problem using something like the swap64() function, for example the following function, which I've named swap32():
uint32_t swap32(uint32_t k)
{
return ((k << 24) |
((k & 0x0000FF00) << 8) |
((k & 0x00FF0000) >> 8) |
(k >> 24)
);
}
You may use it as:
uint32_t j=swap32(*(uint32_t *)ptr);
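One caveat about the last line: the cast *(uint32_t *)ptr assumes that ptr is suitably aligned for a uint32_t, which a position inside a char buffer need not be. A minimal sketch that sidesteps this by copying the bytes out first; load_swapped32 is a hypothetical helper, and swap32 is repeated only to keep the example self-contained:

```c
#include <stdint.h>
#include <string.h>

/* Reverse the four bytes of k. */
uint32_t swap32(uint32_t k)
{
    return ((k << 24) |
            ((k & 0x0000FF00u) << 8) |
            ((k & 0x00FF0000u) >> 8) |
            (k >> 24));
}

/* Copy 4 bytes from an arbitrary (possibly unaligned) address,
   then reverse their order. */
uint32_t load_swapped32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return swap32(v);
}
```

It would be used as uint32_t j = load_swapped32(ptr);. Like the original, this swaps unconditionally, so it only converts big endian to native order on a little-endian host.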

Setting bits in a bit stream

I have encountered the following C function while working on legacy code and I am completely baffled by the way the code is organized. I can see that the function is trying to set bits at a given position in a bit stream, but I can't get my head around the individual statements and expressions. Can somebody please explain why the developer used division by 8 (/8) and modulus 8 (%8) expressions here and there? Is there an easy way to read these kinds of bit manipulation functions in C?
static void setBits(U8 *input, U16 *bPos, U8 len, U8 val)
{
U16 pos;
if (bPos==0)
{
pos=0;
}
else
{
pos = *bPos;
*bPos += len;
}
input[pos/8] = (input[pos/8]&(0xFF-((0xFF>>(pos%8))&(0xFF<<(pos%8+len>=8?0:8-(pos+len)%8)))))
|((((0xFF>>(8-len)) & val)<<(8-len))>>(pos%8));
if ((pos/8 == (pos+len)/8)|(!((pos+len)%8)))
return;
input[(pos+len)/8] = (input[(pos+len)/8]
&(0xFF-(0xFF<<(8-(pos+len)%8))))
|((0xFF>>(8-len)) & val)<<(8-(pos+len)%8);
}
please explain why the developer used division by 8 (/8) and modulus 8 (%8) expressions here and there
First of all, note that the individual bits of a byte are numbered 0 to 7, where bit 0 is the least significant one. There are 8 bits in a byte, hence the "magic number" 8.
Generally speaking: if you have any raw data, it consists of n bytes and can therefore always be treated as an array of bytes uint8_t data[n]. To access bit x in that byte array, you can for example do the following:
Given x = 17, bit x is then found in byte number 17/8 = 2. Note that integer division "floors" the value, instead of 2.125 you get 2.
The remainder of the integer division gives you the bit position in that byte, 17%8 = 1.
So bit number 17 is located in byte 2, bit 1. data[2] gives the byte.
To mask out a bit from a byte in C, the bitwise AND operator & is used. And in order to use that, a bit mask is needed. Such bit masks are best obtained by shifting the value 1 by the desired amount of bits. Bit masks are perhaps most clearly expressed in hex and the possible bit masks for a byte will be (1<<0) == 0x01, (1<<1) == 0x02, (1<<2) == 0x04, (1<<3) == 0x08 and so on.
In this case (1<<1) == 0x02.
C code:
uint8_t data[n];
...
size_t byte_index = x / 8;
size_t bit_index = x % 8;
bool is_bit_set;
is_bit_set = ( data[byte_index] & (1<<bit_index) ) != 0;
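The same byte-index and bit-index arithmetic gives a much more readable way to write a field of bits. The following is a minimal sketch, not the legacy function itself; put_bits is a hypothetical name. It assumes, based on reading the masks in setBits (0xFF >> (pos%8)), that the legacy code counts bit 0 as the most significant bit of input[0], which is the opposite of the numbering used in this answer:

```c
#include <stddef.h>
#include <stdint.h>

/* Write the low `len` bits of `val` (len <= 8) into `buf`, starting at
   bit position `pos`. Bit 0 is the most significant bit of buf[0]. */
void put_bits(uint8_t *buf, size_t pos, unsigned len, uint8_t val)
{
    if (len > 8)
        len = 8;                       /* val only holds 8 bits */

    for (unsigned k = 0; k < len; k++) {
        size_t   bit  = pos + k;       /* absolute bit position */
        uint8_t  mask = (uint8_t)(0x80u >> (bit % 8));
        unsigned v    = (val >> (len - 1u - k)) & 1u;  /* MSB of the field first */

        if (v)
            buf[bit / 8] |= mask;              /* set the bit */
        else
            buf[bit / 8] &= (uint8_t)~mask;    /* clear the bit */
    }
}
```

Writing one bit per iteration is slower than the legacy two-byte version, but the /8 and %8 appear exactly once each and the handling of a field that straddles two bytes needs no special case.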

Converting little endian to big endian using Bitshift Operators

I am working on endianness. My little endian program works, and gives the correct output. But I am not able to find my way around big endian. Below is what I have so far.
I know I have to use bit shifts and I don't think I am doing a good job at it. I tried asking my TAs and prof but they are not much help.
I have been following this link (convert big endian to little endian in C [without using provided func]) to understand more but still cannot make it work. Thank you for the help.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
FILE* input;
FILE* output;
input = fopen(argv[1],"r");
output = fopen(argv[2],"w");
int value,value2;
int i;
int zipcode, population;
while(fscanf(input,"%d %d\n",&zipcode, &population)!= EOF)
{
for(i = 0; i<4; i++)
{
population = ((population >> 4)|(population << 4));
}
fwrite(&population, sizeof(int), 1, output);
}
fclose(input);
fclose(output);
return 0;
}
I'm answering not to give you the answer but to help you solve it yourself.
First ask yourself this: how many bits are in a byte? (hint: 8) Next, how many bytes are in an int? (hint: probably 4) Picture this 32-bit integer in memory:
+--------+
0x|12345678|
+--------+
Now picture it on a little-endian machine, byte-wise. It would look like this:
+--+--+--+--+
0x|78|56|34|12|
+--+--+--+--+
What shift operations are required to get the bytes into the correct spot?
Remember, when you use a bitwise operator like >>, you are operating on bits. So 1 << 24 would be the integer value 1 converted into the processor's opposite endianness.
"little-endian" and "big-endian" refer to the order of bytes (we can assume 8 bits here) in a binary representation. When referring to machines, it's about the order of the bytes in memory: on big-endian machines, the address of an int will point to its highest-order byte, while on a little-endian machine the address of an int will refer to its lowest-order byte.
When referring to binary files (or pipes or transmission protocols etc.), however, it refers to the order of the bytes in the file: a "little-endian representation" will have the lowest-order byte first and the highest-order byte last.
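To see which kind of machine a program is running on, one can look at the first byte of a known integer. A minimal sketch, assuming the host is either little- or big-endian (no mixed byte orders); host_is_little_endian is a hypothetical name:

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the lowest-order byte of an integer is stored at its
   address (little endian), 0 otherwise. */
int host_is_little_endian(void)
{
    uint32_t      probe = 1;   /* lowest-order byte is 0x01, the rest are 0 */
    unsigned char first;

    memcpy(&first, &probe, 1); /* the byte the address of probe points to */
    return first == 1;
}
```

Code that writes a file format should not normally branch on this; it is usually simpler to build the bytes with shifts, which gives the same output on every host.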
How does one obtain the lowest-order byte of an int? That's the low 8 bits, so it's (n & 0xFF) (or ((n >> 0) & 0xFF), the usefulness of which you will see below).
The next lowest-order byte is ((n >> 8) & 0xFF).
The next lowest-order byte is ((n >> 16) & 0xFF) ... or (((n >> 8) >> 8) & 0xFF).
And so on.
So you can peel off bytes from n in a loop and output them one byte at a time ... you can use fwrite for that but it's simpler just to use putchar or putc.
You say that your teacher requires you to use fwrite. There are two ways to do that: 1) use fwrite(&n, 1, 1, filePtr) in a loop as described above. 2) Use the loop to reorder your int value by storing the bytes in the desired order in a char array rather than outputting them, then use fwrite to write it out. The latter is probably what your teacher has in mind.
Note that, if you just use fwrite to output your int it will work ... if you're running on a little-endian machine, where the bytes of the int are already stored in the right order. But the bytes will be backwards if running on a big-endian machine.
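The second approach (reorder into a char array, then one fwrite) can be sketched as follows. This is an illustration under assumptions, not the assignment's required solution: store_be32 and write_be32 are hypothetical names, and the value is taken as a uint32_t so that the shifts are well defined:

```c
#include <stdint.h>
#include <stdio.h>

/* Store n into out[0..3] with the highest-order byte first (big endian). */
void store_be32(unsigned char out[4], uint32_t n)
{
    out[0] = (unsigned char)((n >> 24) & 0xFF);
    out[1] = (unsigned char)((n >> 16) & 0xFF);
    out[2] = (unsigned char)((n >> 8) & 0xFF);
    out[3] = (unsigned char)((n >> 0) & 0xFF);
}

/* Write n to f as 4 big-endian bytes; returns 1 on success, 0 on failure. */
int write_be32(FILE *f, uint32_t n)
{
    unsigned char bytes[4];

    store_be32(bytes, n);
    return fwrite(bytes, 1, sizeof bytes, f) == sizeof bytes;
}
```

The output is identical on little- and big-endian hosts because the byte order is fixed by the shifts, not by how the int sits in memory. The file should be opened in binary mode ("wb") so that no byte is translated on the way out.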
The problem with most answers to this question is portability. I've provided a portable answer here, but it received relatively little positive feedback. Note that C defines undefined behavior as: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.
The answer I'll give here won't assume that int is 16 bits in width; It'll give you an idea of how to represent "larger int" values. It's the same concept, but uses a dynamic loop rather than two fputcs.
Declare an array of sizeof(int) unsigned chars: unsigned char big_endian[sizeof(int)];
Separate the sign and the absolute value.
int sign = value < 0;
value = sign ? -value : value;
Loop from sizeof(int) down to 0, writing the least significant bytes first:
size_t foo = sizeof(int);
do {
big_endian[--foo] = value % (UCHAR_MAX + 1);
value /= (UCHAR_MAX + 1);
} while (foo > 0);
Now insert the sign: big_endian[0] |= sign << (CHAR_BIT - 1);
Simple, yeah? Little endian is equally simple. Just reverse the order of the loop to go from 0 to sizeof(int), instead of from sizeof(int) to 0:
size_t foo = 0;
do {
big_endian[foo++] = value % (UCHAR_MAX + 1);
value /= (UCHAR_MAX + 1);
} while (foo < sizeof(int));
The portable methods make more sense, because they're well defined.

Reading characters on a bit level

I would like to be able to enter a character from the keyboard and display the binary code for said key in the format 00000001 for example.
Furthermore i would also like to read the bits in a way that allows me to output if they are true or false.
e.g.
01010101 = false,true,false,true,false,true,false,true
I would post an idea of how i have tried to do it myself but I have absolutely no idea, i'm still experimenting with C and this is my first taste of programming at such a low level scale.
Thank you
For bit tweaking, it is often safer to use unsigned types, because shifts of signed negative values have an implementation-dependent effect. The plain char can be either signed or unsigned (traditionally, it is unsigned on Macintosh platforms, but signed on PC). Hence, first cast your character into the unsigned char type.
Then, your friends are the bitwise boolean operators (&, |, ^ and ~) and the shift operators (<< and >>). For instance, if your character is in variable x, then to get the 5th bit you simply use: ((x >> 5) & 1). The shift operator moves the value towards the right, dropping the five lower bits and moving the bit you are interested in to the "lowest position" (aka "rightmost"). The bitwise AND with 1 simply sets all other bits to 0, so the resulting value is either 0 or 1, which is your bit. Note here that I number bits from least significant (rightmost) to most significant (leftmost) and I begin with zero, not one.
If you assume that your characters are 8-bits, you could write your code as:
unsigned char x = (unsigned char)your_character;
int i;
for (i = 7; i >= 0; i --) {
if (i != 7)
printf(",");
printf("%s", ((x >> i) & 1) ? "true" : "false");
}
You may note that since I number bits from right to left, but you want output from left to right, the loop index must be decreasing.
Note that according to the C standard, unsigned char has at least eight bits but may have more (nowadays, only a handful of embedded DSPs have characters which are not 8-bit). To be extra safe, add this near the beginning of your code (as a top-level declaration):
#include <limits.h>
#if CHAR_BIT != 8
#error I need 8-bit bytes!
#endif
This will prevent successful compilation if the target system happens to be one of those special embedded DSPs. As a note on the note, the term "byte" in the C standard means "the elementary memory unit which corresponds to an unsigned char", so that, in C-speak, a byte may have more than eight bits (a byte is not always an octet). This is a traditional source of confusion.
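An alternative to refusing to compile is to write the loop against CHAR_BIT instead of a hard-coded 8. A minimal sketch of that variant, producing the 00000001 style of output asked for in the question; byte_to_bits is a hypothetical name:

```c
#include <limits.h>

/* Write the bits of c, most significant first, as '0'/'1' characters.
   out must have room for CHAR_BIT + 1 chars (the last is the terminator). */
void byte_to_bits(unsigned char c, char *out)
{
    for (int i = CHAR_BIT - 1; i >= 0; i--)
        *out++ = ((c >> i) & 1) ? '1' : '0';   /* same shift-and-mask as above */
    *out = '\0';
}
```

It would be called with a buffer declared as char text[CHAR_BIT + 1]; and the result printed with printf("%s\n", text);.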
This is probably not the safest way - no sanity/size/type checks - but it should still work.
unsigned char myBools[8];
char myChar;
// get your character - this is not safe and you should
// use a better method to obtain input...
// cin >> myChar; <- C++
scanf("%c", &myChar);
// binary AND against each bit in the char and then
// compare the result with zero: nonzero means the bit is set ('true'),
// zero means it is clear ('false'). The inner parentheses matter,
// because comparison operators bind tighter than &.
for(int i = 0; i < 8; ++i)
{
myBools[i] = ( ((myChar & (1 << i)) != 0) ? 1 : 0 );
}
This will give you an array of unsigned chars - either 0 or 1 (true or false) - for the character.
This code is C89:
/* we need this to use exit */
#include <stdlib.h>
/* we need this to use CHAR_BIT */
#include <limits.h>
/* we need this to use fgetc and printf */
#include <stdio.h>
int main() {
/* Declare everything we need */
int input, index;
unsigned int mask;
char inputchar;
/* an array to store integers telling us the values of the individual bits.
There are (almost) always 8 bits in a char, but it doesn't hurt to get into
good habits early, and in C, the sizes of the basic types are different
on different platforms. CHAR_BIT tells us the number of bits in a byte.
*/
int bits[CHAR_BIT];
/* the simplest way to read a single character is fgetc, but note that
the user will probably have to press "return", since input is generally
buffered */
input = fgetc(stdin);
printf("%d\n", input);
/* Check for errors. In C, we must always check for errors */
if (input == EOF) {
printf("No character read\n");
exit(1);
}
/* convert the value read from type int to type char. Not strictly needed,
we can examine the bits of an int or a char, but here's how it's done.
*/
inputchar = input;
/* the most common way to examine individual bits in a value is to use a
"mask" - in this case we have just 1 bit set, the most significant bit
of a char. */
mask = 1 << (CHAR_BIT - 1);
/* this is a loop, index takes each value from 0 to CHAR_BIT-1 in turn,
and we will read the bits from most significant to least significant. */
for (index = 0; index < CHAR_BIT; ++index) {
/* the bitwise-and operator & is how we use the mask.
"inputchar & mask" will be 0 if the bit corresponding to the mask
is 0, and non-zero if the bit is 1. ?: is the ternary conditional
operator, and in C when you use an integer value in a boolean context,
non-zero values are true. So we're converting any non-zero value to 1.
*/
bits[index] = (inputchar & mask) ? 1 : 0;
/* output what we've done */
printf("index %d, value %u\n", index, inputchar & mask);
/* we need a new mask for the next bit */
mask = mask >> 1;
}
/* output each bit as 0 or 1 */
for (index = 0; index < CHAR_BIT; ++index) {
printf("%d", bits[index]);
}
printf("\n");
/* output each bit as "true" or "false" */
for (index = 0; index < CHAR_BIT; ++index) {
printf(bits[index] ? "true" : "false");
/* fiddly part - we want a comma between each bit, but not at the end */
if (index != CHAR_BIT - 1) printf(",");
}
printf("\n");
return 0;
}
You don't necessarily need three loops - you could combine them together if you wanted, and if you're only doing one of the two kinds of output, then you wouldn't need the array, you could just use each bit value as you mask it off. But I think this keeps things separate and hopefully easier to understand.

How to shift an array of bytes by 12-bits

I want to shift the contents of an array of bytes by 12-bit to the left.
For example, starting with this array of type uint8_t shift[10]:
{0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0A, 0xBC}
I'd like to shift it to the left by 12-bits resulting in:
{0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xAB, 0xC0, 0x00}
Hurray for pointers!
This code works by looking ahead 12 bits for each byte and copying the proper bits forward. 12 bits is the bottom half (nybble) of the next byte and the top half of 2 bytes away.
unsigned char length = 10;
unsigned char data[10] = {0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0A,0xBC};
unsigned char *shift = data;
while (shift < data+(length-2)) {
*shift = (*(shift+1)&0x0F)<<4 | (*(shift+2)&0xF0)>>4;
shift++;
}
*(data+length-2) = (*(data+length-1)&0x0F)<<4;
*(data+length-1) = 0x00;
Justin wrote:
@Mike, your solution works, but does not carry.
Well, I'd say a normal shift operation does just that (called overflow), and just lets the extra bits fall off the right or left. It's simple enough to carry if you wanted to - just save the 12 bits before you start to shift. Maybe you want a circular shift, to put the overflowed bits back at the bottom? Maybe you want to realloc the array and make it larger? Return the overflow to the caller? Return a boolean if non-zero data was overflowed? You'd have to define what carry means to you.
unsigned char overflow[2];
*overflow = (*data&0xF0)>>4;
*(overflow+1) = (*data&0x0F)<<4 | (*(data+1)&0xF0)>>4;
while (shift < data+(length-2)) {
/* normal shifting */
}
/* now would be the time to copy it back if you want to carry it somewhere */
*(data+length-2) = (*(data+length-1)&0x0F)<<4 | (*(overflow)&0x0F);
*(data+length-1) = *(overflow+1);
/* You could return a 16-bit carry int,
* but endian-ness makes that look weird
* if you care about the physical layout */
unsigned short carry = *(overflow+1)<<8 | *overflow;
Here's my solution, but even more importantly my approach to solving the problem.
I approached the problem by
drawing the memory cells and drawing arrows from the destination to the source.
making a table showing the above drawing.
labeling each row in the table with the relative byte address.
This showed me the pattern:
let iL be the low nybble (half byte) of a[i]
let iH be the high nybble of a[i]
iH = (i+1)L
iL = (i+2)H
This pattern holds for all bytes.
Translating into C, this means:
a[i] = (iH << 4) OR iL
a[i] = ((a[i+1] & 0x0f) << 4) | ((a[i+2] & 0xf0) >> 4)
We now make three more observations:
since we carry out the assignments left to right, we don't need to store any values in temporary variables.
we will have a special case for the tail: all 12 bits at the end will be zero.
we must avoid reading undefined memory past the array. since we never read more than a[i+2], this only affects the last two bytes
So, we
handle the general case by looping for N-2 bytes and performing the general calculation above
handle the next-to-last byte by setting iH = (i+1)L
handle the last byte by setting it to 0
given a with length N, we get:
for (i = 0; i < N - 2; ++i) {
a[i] = ((a[i+1] & 0x0f) << 4) | ((a[i+2] & 0xf0) >> 4);
}
a[N-2] = (a[N-1] & 0x0f) << 4;
a[N-1] = 0;
And there you have it... the array is shifted left by 12 bits. It could easily be generalized to shifting N bits, noting that there will be M assignment statements where M = number of bits modulo 8, I believe.
The loop could be made more efficient on some machines by translating to pointers
for (p = a, p2=a+N-2; p != p2; ++p) {
*p = ((*(p+1) & 0x0f) << 4) | ((*(p+2) & 0xf0) >> 4);
}
and by using the largest integer data type supported by the CPU.
(I've just typed this in, so now would be a good time for somebody to review the code, especially since bit twiddling is notoriously easy to get wrong.)
Let's work out the best way to shift N bits in an array of 8-bit integers.
N - Total number of bits to shift
F = (N / 8) - Full 8 bit integers shifted
R = (N % 8) - Remaining bits that need to be shifted
I guess from here you would have to find the most optimal way to make use of this data to move around ints in an array. Generic algorithms would be to apply the full integer shifts by starting from the right of the array and moving each integer F indexes. Zero fill the newly empty spaces. Then finally perform an R bit shift on all of the indexes, again starting from the right.
When shifting a byte left by R bits you can calculate the overflow with a bitwise AND and a shift, and the shifted value with the bitshift operator. For example, with R = 4:
// 0xAB shifted left by 4 bits:
(0xAB & 0xF0) >> 4 // is the overflow (0x0A)
(0xAB << 4) & 0xFF // is the shifted value (0xB0)
Keep in mind that the mask for the top 4 bits is simple: 0xF0, or 0b11110000. Such a mask is easy to calculate or build dynamically, or you can even use a simple static lookup table.
I hope that is generic enough. I'm not good with C/C++ at all so maybe someone can clean up my syntax or be more specific.
Bonus: If you're crafty with your C you might be able to fudge multiple array indexes into a single 16, 32, or even 64 bit integer and perform the shifts. But that is probably not very portable and I would recommend against this. Just a possible optimization.
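The F/R decomposition above can be turned into C roughly as follows. This is a minimal sketch under assumptions: shl_bytes is a hypothetical name, bytes are 8 bits, and the array is stored most significant byte first as in the question's example. For that layout a left shift moves data toward index 0, so this version walks from the lowest index upward, which guarantees every source byte is read before it is overwritten:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Shift the n-byte array a left by `bits` bits, filling with zeros. */
void shl_bytes(uint8_t *a, size_t n, size_t bits)
{
    size_t   full = bits / 8;              /* F: whole bytes to move  */
    unsigned rem  = (unsigned)(bits % 8);  /* R: remaining bit shift  */

    if (n == 0)
        return;
    if (full >= n) {                       /* everything is shifted out */
        memset(a, 0, n);
        return;
    }
    for (size_t i = 0; i < n; i++) {
        size_t   src = i + full;
        unsigned hi  = src < n ? a[src] : 0u;          /* supplies the top bits */
        unsigned lo  = src + 1 < n ? a[src + 1] : 0u;  /* supplies the low bits */

        /* When rem is 0, lo >> 8 is 0 and the byte is copied unchanged. */
        a[i] = (uint8_t)((hi << rem) | (lo >> (8 - rem)));
    }
}
```

Called as shl_bytes(shift, 10, 12) on the array from the question, it produces the requested {..., 0xAB, 0xC0, 0x00}.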
Here is a working solution, using temporary variables:
void shift_4bits_left(uint8_t* array, uint16_t size)
{
int i;
uint8_t shifted = 0x00;
uint8_t overflow = 0x00; /* zeros are shifted into the last byte; seeding this from array[0] would rotate instead of shift */
for (i = (size - 1); i >= 0; i--)
{
shifted = (array[i] << 4) | overflow;
overflow = (0xF0 & array[i]) >> 4;
array[i] = shifted;
}
}
Call this function 3 times for a 12-bit shift.
Mike's solution may be faster, since this one goes through temporary variables.
The 32 bit version... :-) Handles 1 <= count <= num_words
#include <stdio.h>
unsigned int array[] = {0x12345678,0x9abcdef0,0x12345678,0x9abcdef0,0x66666666};
int main(void) {
int count;
unsigned int *from, *to;
from = &array[0];
to = &array[0];
count = 5;
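while (count-- > 1) {
/* read *from before advancing it; reading and incrementing
   from in one expression is undefined behaviour */
unsigned int high = *from << 12;
++from;
*to++ = high | ((*from >> 20) & 0xfff);
}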
*to = (*from<<12);
printf("%x\n", array[0]);
printf("%x\n", array[1]);
printf("%x\n", array[2]);
printf("%x\n", array[3]);
printf("%x\n", array[4]);
return 0;
}
@Joseph, notice that the variables are 8 bits wide, while the shift is 12 bits wide. Your solution works only for N <= variable size.
If you can assume your array length is a multiple of 8 bytes you can treat the array as an array of uint64_t and then work on that. If it isn't a multiple of 8, you can work in 64-bit chunks on as much as you can and handle the remaining bytes one by one.
This may be a bit more coding, but I think it's more elegant in the end.
There are a couple of edge-cases which make this a neat problem:
the input array might be empty
the last and next-to-last bytes need to be treated specially, because they have zero bits shifted into them
Here's a simple solution which loops over the array copying the low-order nibble of the next byte into its high-order nibble, and the high-order nibble of the next-next (+2) byte into its low-order nibble. To save dereferencing the look-ahead pointer twice, it maintains a two-element buffer with the "last" and "next" bytes:
void shl12(uint8_t *v, size_t length) {
if (length == 0) {
return; // nothing to do
}
if (length > 1) {
uint8_t last_byte, next_byte;
next_byte = *(v + 1);
for (size_t i = 0; i + 2 < length; i++, v++) {
last_byte = next_byte;
next_byte = *(v + 2);
*v = ((last_byte & 0x0f) << 4) | (((next_byte) & 0xf0) >> 4);
}
// the next-to-last byte is half-empty
*(v++) = (next_byte & 0x0f) << 4;
}
// the last byte is always empty
*v = 0;
}
Consider the boundary cases, which activate successively more parts of the function:
When length is zero, we bail out without touching memory.
When length is one, we set the one and only element to zero.
When length is two, we set the high-order nibble of the first byte to the low-order nibble of the second byte (that is, bits 12-15), and the second byte to zero. We don't activate the loop.
When length is greater than two we hit the loop, shuffling the bytes across the two-element buffer.
If efficiency is your goal, the answer probably depends largely on your machine's architecture. Typically you should maintain the two-element buffer, but handle a machine word (32/64 bit unsigned integer) at a time. If you're shifting a lot of data it will be worthwhile treating the first few bytes as a special case so that you can get your machine word pointers word-aligned. Most CPUs access memory more efficiently if the accesses fall on machine word boundaries. Of course, the trailing bytes have to be handled specially too so you don't touch memory past the end of the array.

Resources