How to handle a bunch of 0s and 1s with a microcontroller? - c

I'm writing a program that receives a stream of 0s and 1s on a µC, and I need to extract any number of bits (1 to 16) from any position.
For example, I have 150 bits and I want to take 6 bits starting at the 32nd bit and copy them into a char (8-bit) variable. I know I could do it with strings by storing ASCII '0' and '1' characters, but I don't have much RAM, so I need to store the data as bits.
The largest variable type is an unsigned 32-bit long. Storing the data is not my problem; the problem is how to access specific bit positions and copy them into a char (8-bit) variable.

You can use bitwise operators:
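// bits: your bits (byte array), start: index of the first bit of the char you want
// Bit 0 is the least significant bit of bits[0].
unsigned char select(const unsigned char *bits, int start) {
    int dec = start % 8;
    // Low part comes from the current byte, high part from the next one.
    return (unsigned char)((bits[start / 8] >> dec) | (bits[start / 8 + 1] << (8 - dec)));
}
The code above assumes start < (total number of bits - 8).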
[EDIT]
You can change the char* to any type you want. However, you will need to change the word size to the appropriate number of bits (8*SIZE_IN_BYTES) and then apply operator & ("bitwise AND") with 0xff to get your char back.
Example:
unsigned char select(const unsigned int *bits, int start) {
    const int nbitsint = 8 * 4;  // assumes a 32-bit unsigned int
    int dec = start % nbitsint;
    unsigned int word = bits[start / nbitsint];
    if (dec <= nbitsint - 8) {
        // & 0xff keeps only the low byte
        return (word >> dec) & 0xff;
    }
    // Getting a byte which is astride two values is tricky
    return ((word >> dec) | (bits[start / nbitsint + 1] << (nbitsint - dec))) & 0xff;
}
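For the general case in the question (1 to 16 bits from any position), a bit-by-bit loop is slower than the shift tricks above but much harder to get wrong. This is only a minimal sketch: extract_bits is a made-up name, and it assumes that bit 0 is the least significant bit of buf[0] and that the caller keeps start + count inside the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns `count` bits (1..16) starting at bit `start`.
   Assumption: bit 0 is the least significant bit of buf[0]. */
uint16_t extract_bits(const uint8_t *buf, size_t start, unsigned count)
{
    uint16_t result = 0;
    for (unsigned i = 0; i < count; i++) {
        size_t pos = start + i;
        /* Pick the byte, move the wanted bit to position 0, isolate it. */
        unsigned bit = (buf[pos / 8] >> (pos % 8)) & 1u;
        /* Bit i of the result is bit start+i of the buffer. */
        result |= (uint16_t)(bit << i);
    }
    return result;
}
```

With this, "6 bits from the 32nd bit" becomes extract_bits(buf, 32, 6), and the result fits in an unsigned char whenever count is 8 or less.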

Related

Converting 32 bit number to four 8bit numbers

I am trying to convert the input from a device (always integer between 1 and 600000) to four 8-bit integers.
For example,
If the input is 32700, I want 188 127 00 00.
I achieved this by using:
32700 % 256
32700 / 256
The above works till 32700. From 32800 onward, I start getting incorrect conversions.
I am totally new to this and would like some help to understand how this can be done properly.
Major edit following clarifications:
Given that someone has already mentioned the shift-and-mask approach (which is undeniably the right one), I'll give another approach, which, to be pedantic, is not portable, machine-dependent, and possibly exhibits undefined behavior. It is nevertheless a good learning exercise, IMO.
For various reasons, your computer represents integers as groups of 8-bit values (called bytes); note that, although extremely common, this is not always the case (see CHAR_BIT). For this reason, values that are represented using more than 8 bits use multiple bytes (hence those using a number of bits which is a multiple of 8). For a 32-bit value, you use 4 bytes and, in memory, those bytes always follow each other.
We call a pointer a value containing the address in memory of another value. In that context, a byte is defined as the smallest (in terms of bit count) value that can be referred to by a pointer. For example, your 32-bit value, covering 4 bytes, will have 4 "addressable" cells (one per byte) and its address is defined as the first of those addresses:
|==================|
| MEMORY | ADDRESS |
|========|=========|
|  ...   |   x-1   | <== Pointer to byte before
|--------|---------|
| BYTE 0 |    x    | <== Pointer to first byte (also pointer to 32-bit value)
|--------|---------|
| BYTE 1 |   x+1   | <== Pointer to second byte
|--------|---------|
| BYTE 2 |   x+2   | <== Pointer to third byte
|--------|---------|
| BYTE 3 |   x+3   | <== Pointer to fourth byte
|--------|---------|
|  ...   |   x+4   | <== Pointer to byte after
|==================|
So what you want to do (split the 32-bit word into 8-bit words) has already been done by your computer, as it is imposed onto it by its processor and/or memory architecture. To reap the benefits of this almost-coincidence, we are going to find where your 32-bit value is stored and read its memory byte-by-byte (instead of 32 bits at a time).
As all serious SO answers seem to do so, let me cite the Standard (ISO/IEC 9899:2018, 6.2.5-20) to define the last thing I need (emphasis mine):
Any number of derived types can be constructed from the object and function types, as follows:
An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. [...] Array types are characterized by their element type and by the number of elements in the array. [...]
[...]
So, as elements in an array are defined to be contiguous, a 32-bit value in memory, on a machine with 8-bit bytes, really is nothing more, in its machine representation, than an array of 4 bytes!
Given a 32-bit signed value:
int32_t value;
its address is given by &value. Meanwhile, an array of 4 8-bit bytes may be represented by:
uint8_t arr[4];
notice that I use the unsigned variant because those bytes don't really represent a number per se so interpreting them as "signed" would not make sense. Now, a pointer-to-array-of-4-uint8_t is defined as:
uint8_t (*ptr)[4];
and if I assign the address of our 32-bit value to such a pointer, I will be able to index each byte individually, which means that I will be reading the byte directly, avoiding any pesky shifting-and-masking operations!
uint8_t (*bytes)[4] = (void *) &value;
I need to cast the pointer ("(void *)") because &value's type is "pointer-to-int32_t" while I'm assigning it to a "pointer-to-array-of-4-uint8_t"; this type mismatch is caught by the compiler and pedantically warned against by the Standard. This is a first warning that what we're doing is not ideal!
Finally, we can access each byte individually by reading it directly from memory through indexing: (*bytes)[n] reads the n-th byte of value!
To put it all together, given a send_can(uint8_t) function:
for (size_t i = 0; i < sizeof(*bytes); i++)
send_can((*bytes)[i]);
and, for testing purposes, we define:
void send_can(uint8_t b)
{
printf("%hhu\n", b);
}
which prints, on my machine, when value is 32700:
188
127
0
0
Lastly, this shows yet another reason why this method is platform-dependent: the order in which the bytes of the 32-bit word are stored isn't always what you would expect from a theoretical discussion of binary representation, i.e.:
byte 0 contains bits 31-24
byte 1 contains bits 23-16
byte 2 contains bits 15-8
byte 3 contains bits 7-0
actually, AFAIK, the C Language permits any of the 24 possibilities for ordering those 4 bytes (this is called endianness). Meanwhile, shifting and masking will always get you the n-th "logical" byte.
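If you still want to look at the bytes as they sit in memory, copying them out with memcpy avoids the pointer cast altogether. The sketch below is only an illustration: bytes_in_memory and host_is_little_endian are made-up names, and it assumes 8-bit bytes so that a uint32_t occupies exactly 4 of them.

```c
#include <stdint.h>
#include <string.h>

/* Copies the bytes of `value` exactly as they are stored in memory.
   The order of out[0..3] depends on the host's endianness. */
void bytes_in_memory(uint32_t value, uint8_t out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 if the least significant byte is stored first. */
int host_is_little_endian(void)
{
    uint8_t b[4];
    bytes_in_memory(1u, b);
    return b[0] == 1;
}
```

On a little-endian host, bytes_in_memory(32700, b) fills b with 188, 127, 0, 0; on a big-endian host the same four values come out in the opposite order, which is exactly why shifting and masking is preferred when the byte order matters.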
It really depends on how your architecture stores an int. For example
8 or 16 bit system short=16, int=16, long=32
32 bit system, short=16, int=32, long=32
64 bit system, short=16, int=32, long=64
This is not a hard and fast rule - you need to check your architecture first. There is also a long long but some compilers do not recognize it and the size varies according to architecture.
Some compilers have uint8_t etc defined so you can actually specify how many bits your number is instead of worrying about ints and longs.
Having said that you wish to convert a number into 4 8 bit ints. You could have something like
unsigned long x = 600000UL; // you need UL to indicate it is unsigned long
unsigned int b1 = (unsigned int)(x & 0xff);
unsigned int b2 = (unsigned int)(x >> 8) & 0xff;
unsigned int b3 = (unsigned int)(x >> 16) & 0xff;
unsigned int b4 = (unsigned int)(x >> 24);
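For unsigned values, shifting and masking is at least as fast as multiplication, division or mod (most compilers turn a division by a power of two into a shift anyway). The order of b1..b4 depends on the endianness you wish to achieve: you could reverse the assignments, using b1 with the formula for b4, etc.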
You could do some bit masking.
600000 is 0x927C0
600000 / (256 * 256) gets you the 9, no masking yet.
(600000 & (255 * 256)) >> 8 gets you the 0x27 == 39. This uses a mask of 8 set bits shifted left by 8 bits (255 * 256 == 0xFF00), followed by a right shift by 8 bits, the >> 8, which would also be possible as a / 256.
600000 % 256 gets you the 0xC0 == 192 as you did it. Masking would be 600000 & 255.
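To check the arithmetic above, here is a small sketch that computes the same byte both ways. The helper names byte_by_shift and byte_by_division are invented for this example; n = 0 means the least significant byte.

```c
#include <stdint.h>

/* Hypothetical helper: byte `n` of `value` using shift and mask. */
uint8_t byte_by_shift(uint32_t value, unsigned n)
{
    return (uint8_t)((value >> (8u * n)) & 0xFFu);
}

/* Same result using only division and remainder:
   each division by 256 drops the lowest byte. */
uint8_t byte_by_division(uint32_t value, unsigned n)
{
    while (n-- > 0)
        value /= 256u;
    return (uint8_t)(value % 256u);
}
```

For 600000 (0x927C0), both functions give 192 for n = 0, 39 for n = 1, 9 for n = 2 and 0 for n = 3.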
I ended up doing this:
unsigned char bytes[4];
unsigned long n;
n = (unsigned long) sensore1 * 100;
bytes[0] = n & 0xFF;
bytes[1] = (n >> 8) & 0xFF;
bytes[2] = (n >> 16) & 0xFF;
bytes[3] = (n >> 24) & 0xFF;
CAN_WRITE(0x7FD,8,01,sizeof(n),bytes[0],bytes[1],bytes[2],bytes[3],07,255);
I have been in a similar situation while packing and unpacking large custom data packets to be transmitted/received. I suggest you try the approach below:
typedef union
{
uint32_t u4_input;
uint8_t u1_byte_arr[4];
}UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;
UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;
un_t_mode_reg.u4_input = input;/*your 32 bit input*/
// 1st byte = un_t_mode_reg.u1_byte_arr[0];
// 2nd byte = un_t_mode_reg.u1_byte_arr[1];
// 3rd byte = un_t_mode_reg.u1_byte_arr[2];
// 4th byte = un_t_mode_reg.u1_byte_arr[3];
The largest positive value you can store in a 16-bit signed int is 32767. If you force a number bigger than that, you'll get a negative number as a result, hence unexpected values returned by % and /.
Use either unsigned 16-bit int for a range up to 65535 or a 32-bit integer type.

How to add and extract bits to a byte?

I am trying to understand bit and byte manipulation and I have seen many examples in SO. Still, I have some questions regarding my understanding.
First, lets say we have a byte array with the byte order as Least Significant Byte. I want to get the byte 2 from this array. I can get the byte like byte[1]. Am I right?
Second, we have a byte array with the byte order as Least Significant Byte. And I want to get first 2 bits of the byte 1!. How can I get the first 2 bits from that byte?
Also, how can I add a number into the first 2 bits of a byte?
Any help or link to understand those logics are much appreciated.
First, lets say we have a byte array with the byte order as LSB. I want to get the byte 2 from this array. I can get the byte like byte[1]. Am I right?
Yes.
Second, we have a byte array with the byte order as LSB. And I want to get first 2 bits of the byte 1!. How can I get the first 2 bits from that byte? Also, how can I add a number into the first 2 bits of a byte?
You can use the bitwise AND operator & with the constant 3 to retrieve only the first two bits. num & 3 performs an AND between each bit of num and the corresponding bit of 3, producing a 1 only where both bits are 1. As 3 has only its 2 lowest bits set, every bit in num other than the first 2 is cleared.
unsigned char foo = 47;
unsigned char twobits = foo & 3; // this will return only the value of the two bits of foo.
unsigned char number_to_add = 78;
foo = (foo & ~3u) | (number_to_add & 3); // this clears the 2 lowest bits of foo and stores the 2 lowest bits of number_to_add there.
Or, if you don't care about the bit order within the byte (bit-field layout is implementation-defined), you can use bit-fields:
struct st_foo
{
unsigned char bit1 : 1;
unsigned char bit2 : 1;
unsigned char the_rest : 6;
};
struct st_foo my_byte;
my_byte.bit1 = 1;
my_byte.bit2 = 0;
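If the bit layout does matter, the mask approach can be wrapped in two small helpers. This is just a sketch; get_low2 and set_low2 are names made up for the example, and "first two bits" is taken to mean the two least significant bits.

```c
/* Returns the two lowest bits of `byte` (a value from 0 to 3). */
unsigned char get_low2(unsigned char byte)
{
    return byte & 3u;
}

/* Returns `byte` with its two lowest bits replaced by those of `value`.
   The other six bits are left untouched. */
unsigned char set_low2(unsigned char byte, unsigned char value)
{
    return (unsigned char)((byte & ~3u) | (value & 3u));
}
```

For example, get_low2(47) is 3, and set_low2(47, 78) is 46, because 78 & 3 is 2 and 47 with its low two bits cleared is 44.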

Read a single bit from a buffer of char

I would like to implement a function like this:
int read_single_bit(unsigned char* buffer, unsigned int index)
where index is the offset of the bit that I would want to read.
How do I use bit shifting or masking to achieve this?
You might want to split this into three separate tasks:
1. Determining which char contains the bit that you're looking for.
2. Determining the bit offset into that char that you need to read.
3. Actually selecting that bit out of that char.
I'll leave parts (1) and (2) as exercises, since they're not too bad. For part (3), one trick you might find useful would be to do a bitwise AND between the byte in question and a byte with a single 1 bit at the index that you want. For example, suppose you want to get the fourth bit out of a byte. You could then do something like this:
Byte: 11011100
Mask: 00001000
----------------
AND: 00001000
So think about the following: how would you generate the mask that you need given that you know the bit index? And how would you convert the AND result back to a single bit?
Good luck!
buffer[index/8] & (1u<<(index%8))
should do it (that is, view buffer as a bit array and test the bit at index).
Similarly:
buffer[index/8] |= (1u<<(index%8))
should set the index-th bit.
Or you could store a table of the eight shift states of 1 and & against that
unsigned char bits[] = { 1u<<0, 1u<<1, 1u<<2, 1u<<3, 1u<<4, 1u<<5, 1u<<6, 1u<<7 };
If your compiler doesn't optimize those / and % to bit ops (more efficient), then:
unsigned_int / 8 == unsigned_int >> 3
unsigned_int % 8 == unsigned_int & 0x07 //0x07 == 0000 0111
so
buffer[index>>3] & (1u<<(index&0x07u)) //test
buffer[index>>3] |= (1u<<(index&0x07u)) //set
One possible implementation of your function might look like this:
int read_single_bit(unsigned char* buffer, unsigned int index)
{
unsigned char c = buffer[index / 8]; //getting the byte which contains the bit
unsigned int bit_position = index % 8; //getting the position of that bit within the byte
return ((c >> (7 - bit_position)) & 1);
//shifting that byte to the right with (7 - bit_position) will move the bit whose value you want to know at "the end" of the byte.
//then, by doing bitwise AND with the new byte and 1 (whose binary representation is 00000001) will yield 1 or 0, depending on the value of the bit you need.
}
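A matching writer can follow the same convention as read_single_bit above, where index 0 is the most significant bit of buffer[0]. This is only a sketch; write_single_bit is a hypothetical name and no bounds checking is done.

```c
/* Sets (value != 0) or clears (value == 0) the bit at `index`.
   Assumption: index 0 is the most significant bit of buffer[0]. */
void write_single_bit(unsigned char *buffer, unsigned int index, int value)
{
    /* A byte with a single 1 at the wanted position. */
    unsigned char mask = (unsigned char)(1u << (7 - index % 8));
    if (value)
        buffer[index / 8] |= mask;                 /* OR sets the bit */
    else
        buffer[index / 8] &= (unsigned char)~mask; /* AND with the inverse clears it */
}
```

Note that the buffer[index/8] & (1u<<(index%8)) variants shown earlier count from the least significant bit instead, so pick one convention and use it for both reading and writing.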

Merging 13 bits array into an array of unsigned char

I'm writing an algorithm that compresses data (LZSS) and it requires me to have two 13-bit values which I'll have to later merge together.
In some cases, however, I don't need 13 bits; 8 are enough.
For this purpose I have a structure like this:
typedef struct pattern
{
char is_compressed:1; //flag
short index :13; //first value
short length :13; //second value
unsigned char c; //if 8 bits are enough, use this instead
} Pattern;
I therefore have an array of these structures, and each structure can either contain the two 13-bit values or an 8-bit value.
I am now looping over this array, and my objective is to merge all these bits together.
I easily calculated the total number of bits used and the number of arrays of unsigned chars (8 bits) needed in order to store all the values:
int compressed = 0, plain = 0;
//count is the amount of patterns i have and p is the array of patterns (the structures)
for (int i = 0; i < count; i++)
{
if (p[i]->is_compressed)
compressed++;
else
plain++;
}
//this stores the number of bits used in the pattern (13 for length and 13 for the index or 8 for the plain uchar)
int tot_bits = compressed * 26 + plain * 8;
//since we can only write a minimum of 8 bits, we calculate how many arrays are needed to store the bits
int nr_of_arrays = (tot_bits % 8 == 0) ? tot_bits / 8 : (tot_bits / 8) + 1;
//we allocate the needed memory for the array of unsigned chars that will contain, concatenated, all the bits
unsigned char* uc = (unsigned char*) malloc(nr_of_arrays * sizeof(unsigned char));
After allocating the memory for the array I'm going to fill, I simply loop through the array of structures and recognize whether the structure I'm looking at contains the two 13-bit values or just the 8-bit one
for (int i = 0; i < count; i++)
{
if (p[i]->is_compressed)
{
//The structure contains the two 13 bits value
}
else
{
//The structure only contains the 8 bits value
}
}
Here I'm stuck and can't seem to figure out a proper way of getting the job done.
Does anybody know how to implement that part?
A practical example would be:
pattern 1 contains the 2 13-bit values:
1111 1111 1111 1
0000 0000 0000 0
pattern 2 contains the 8-bit value
1010 1010
total bits: 34
number of arrays required: 5 (that will waste 6 bits)
resulting array is:
[0] 1111 1111
[1] 1111 1000
[2] 0000 0000
[3] 0010 1010
[4] 1000 0000 (the remaining 6 bits are set to 0)
One way to do that is to write bytes one by one and keep track of partial bytes as you write.
You need a pointer to your char array, and an integer to keep track of how many bits you wrote to the last byte. Every time you write bits, you check how many bits you can write to the last byte, and you write these bits accordingly (ex: if there is 5 bits free, you shift your next value by 3 and add it to the last byte). Every time a byte is complete, you increment your array pointer and reset your bit tracker.
A clean way to implement this would be to write functions like :
void BitWriter_init( char *myArray );
void BitWriter_write( int theBitsToWrite, int howManyBits );
Now you just have to figure out how to implement these functions, or use any other method of your choice.
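One possible way to fill in those functions is sketched below. It differs from the prototypes above in that the state lives in a struct instead of being hidden, and it writes one bit at a time for clarity rather than speed. BitWriter, bw_init and bw_write are names invented for this sketch, and the caller is assumed to provide a buffer that is large enough.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t *bytes;   /* destination buffer, zeroed by bw_init */
    size_t   bit_pos; /* number of bits written so far */
} BitWriter;

void bw_init(BitWriter *w, uint8_t *bytes, size_t size)
{
    memset(bytes, 0, size); /* unused trailing bits stay 0 */
    w->bytes = bytes;
    w->bit_pos = 0;
}

/* Appends the low `count` bits of `value`, most significant bit first. */
void bw_write(BitWriter *w, uint32_t value, unsigned count)
{
    while (count > 0) {
        count--;
        unsigned bit = (value >> count) & 1u;
        /* Bits fill each byte from its most significant end. */
        w->bytes[w->bit_pos / 8] |= (uint8_t)(bit << (7 - w->bit_pos % 8));
        w->bit_pos++;
    }
}
```

Writing 0x1FFF with 13 bits, then 0 with 13 bits, then 0xAA with 8 bits into a 5-byte buffer produces FF F8 00 2A 80, which is the layout from the question's example.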
The problem intrigued me. Here's a possible implementation of "by using a lot of bitwise operations":
#include <stddef.h>
#include <stdint.h>

/* A writable bit string, with an indicator of the next available bit */
struct bitbuffer {
    uint8_t *bytes;
    size_t next_bit;
};

/*
 * writes the bits represented by the given pattern to the next available
 * positions in the specified bit buffer (the buffer must start zero-filled)
 */
void write_bits(struct bitbuffer *buffer, Pattern *pattern) {
    /* The index of the byte containing the next available bit */
    size_t next_byte = buffer->next_bit / 8;
    /* the number of bits already used in the next available byte */
    unsigned bits_used = buffer->next_bit % 8;

    if (pattern->is_compressed) {
        /* assemble the 26 bits to write in a 32-bit block */
        uint32_t bits = ((uint32_t)(pattern->index & 0x1FFF) << 13)
                      | (uint32_t)(pattern->length & 0x1FFF);

        if (bits_used == 7) {
            /* special case: the bits to write will span 5 bytes */
            /* the first bit written will be the last in the current byte */
            uint8_t first_bit = bits >> 25;
            buffer->bytes[next_byte] |= first_bit;
            /* write the next 8 bits to the next byte */
            buffer->bytes[++next_byte] = (bits >> 17) & 0xFF;
            /* align the tail of the bit block with the buffer */
            bits <<= 7;
        } else {
            /* the first bits written will fill out the current byte */
            uint8_t first_bits = (bits >> (18 + bits_used)) & 0xFF;
            buffer->bytes[next_byte] |= first_bits;
            /* align the tail of the bit block with the buffer */
            bits <<= (6 - bits_used);
        }
        /*
         * Write the remainder of the bit block to the buffer,
         * most-significant bits first. Three (more) bytes will be modified.
         */
        buffer->bytes[++next_byte] = (bits >> 16) & 0xFF;
        buffer->bytes[++next_byte] = (bits >> 8) & 0xFF;
        buffer->bytes[++next_byte] = bits & 0xFF;
        /* update the buffer's index of the next available bit */
        buffer->next_bit += 26;
    } else { /* the pattern is not compressed */
        if (bits_used) {
            /* the bits to write will span two bytes in the buffer */
            buffer->bytes[next_byte] |= (pattern->c >> bits_used);
            buffer->bytes[++next_byte] = (pattern->c << (8 - bits_used)) & 0xFF;
        } else {
            /* the bits to write exactly fill the next buffer byte */
            buffer->bytes[next_byte] = pattern->c;
        }
        /* update the buffer's index of the next available bit */
        buffer->next_bit += 8;
    }
}
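For decompression you need the reverse operation. The sketch below reads an arbitrary number of bits back, most significant bit first, which matches the byte layout in the question's example. read_bits is a made-up name, and the sketch does not decide whether the next record is a 26-bit pair or an 8-bit literal: the packed layout shown in the question does not store the flag, so the caller is assumed to know that from somewhere else.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads `count` bits (1..32) starting at bit *pos, most significant
   bit first, and advances *pos past them. No bounds checking. */
uint32_t read_bits(const uint8_t *bytes, size_t *pos, unsigned count)
{
    uint32_t value = 0;
    for (unsigned i = 0; i < count; i++) {
        /* Bit 0 of the stream is the top bit of bytes[0]. */
        unsigned bit = (bytes[*pos / 8] >> (7 - *pos % 8)) & 1u;
        value = (value << 1) | bit;
        (*pos)++;
    }
    return value;
}
```

Applied to the bytes FF F8 00 2A 80 from the question, reading 13, 13 and 8 bits in turn gives 0x1FFF, 0 and 0xAA.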

How to store two different things in one byte and then access them again?

I am trying to learn C for my class. One thing I need to know is how, given an array, to take information from two characters and store it in one byte. For example, if the string is "A1B3C5", then I have to store A = 001 in the higher 3 bits and then store 1 in the lower 5 bits. I have a function that gets two chars from the array at a time and prints them; here is that function:
void print2(char string[])
{
int i = 0;
int length = 0;
char char1, char2;
length = strlen(string);
for ( i = 0; i <length; i= i + 2)
{
char1 = string[i];
char2 = string[i+1];
printf("%c, %c\n", char1, char2);
}
}
but now I am not sure how to encode it and then decode it again. Can anyone help me please?
Assuming an ASCII character set, subtract '@' from the letter (so that 'A' becomes 1) and shift left five bits, then subtract '0' from the character representing the digit and add it to the first part.
So you've got a byte, and you want the following bit layout:
76543210
AAABBBBB
To store A, you would do:
unsigned char result = 0; // start from a known value before masking
int input_a = somevalue;
result &= 0x1F; // Clear the upper 3 bits.
// Store "A": make sure only the lower 3 bits of input_a are used,
// Then shift it by 5 positions. Finally, store it by OR'ing.
result |= (char)((input_a & 7) << 5);
To read it:
// Simply shift the byte by five positions.
int output_a = (result >> 5);
To store B, you would do:
int input_b = yetanothervalue;
result &= 0xE0; // Clear the lower 5 bits.
// Store "B": make sure only the lower 5 bits of input_b are used,
// then store them by OR'ing.
result |= (char)(input_b & 0x1F);
To read it:
// Simply get the lower 5 bits.
int output_b = (result & 0x1F);
You may want to read about the boolean operations AND and OR, bit shifting and finally bit masks.
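Putting the pieces together for the "A1B3C5" example, a minimal sketch could look like this. The function names are invented for the example, and it assumes ASCII, where 'A' - '@' is 1, so only the letters 'A' to 'G' fit in the 3 upper bits.

```c
/* Packs a letter 'A'..'G' (stored as 1..7 in the upper 3 bits) and a
   digit '0'..'9' (stored in the lower 5 bits) into one byte. */
unsigned char pack_pair(char letter, char digit)
{
    unsigned char a = (unsigned char)((letter - '@') & 0x07);
    unsigned char b = (unsigned char)((digit - '0') & 0x1F);
    return (unsigned char)((a << 5) | b);
}

/* Recovers the letter from the upper 3 bits. */
char unpack_letter(unsigned char packed)
{
    return (char)((packed >> 5) + '@');
}

/* Recovers the digit from the lower 5 bits. */
char unpack_digit(unsigned char packed)
{
    return (char)((packed & 0x1F) + '0');
}
```

Inside the loop of print2, pack_pair(char1, char2) would give 0x21 for "A1", 0x43 for "B3" and 0x65 for "C5", and the two unpack functions turn each byte back into the original characters.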
First of all, one bit can only represent two states: 0 and 1, or TRUE and FALSE. What you mean is a Byte, which consists of 8 bits and can thus represent 2^8 states.
To put two values in one byte, use bitwise OR (|) and bitwise shift (<< and >>).
I don't post the code here since you should learn this stuff - it's really important to know what bits and bytes are and how to work with them. But feel free to ask follow up question if something is not clear to you.

Resources