Mapping a number to bit position in C

I'm developing a program running on an Atmel AT90CAN128. Connected to this controller there are 40 devices, each with a status (on/off). As I need to report the status of each of these devices to a PC through serial communication, I have 40 bits, which define whether each device is on or off. In addition, the PC can turn any of these devices on or off.
So, my first attempt was to create the following struct:
typedef struct {
    unsigned char length;   //!< Data Length
    unsigned data_type;     //!< Data type
    unsigned char data[5];  //!< CAN data array 5 * 8 = 40 bits
} SERIAL_packet;
The problem with this was that the PC will send an unsigned char address telling me the device to turn on/off, so accessing the bit corresponding to that address number turned out to be rather complicated...
So I started looking for options, and I stumbled upon the C99 _Bool type. I thought, great, now I'll just create a _Bool data[40] and I can access the address bit just by indexing my data array. It turns out that in C (and C++) memory is addressed in whole bytes, so even if I declare a _Bool, it will occupy 8 bits. That's a problem (it needs to be as fast as possible, so the more bits I send the slower it gets, and the PC will be expecting 40 bits only) and not very efficient for the communication. So I started looking into bit fields, and tried the following:
typedef struct {
    unsigned char aux : 1;
} arrayData;

typedef struct {
    unsigned char length;   //!< Data Length
    unsigned data_type;     //!< Data type
    arrayData data[40];     //!< Data array 5 bytes == 40 bits
} SERIAL_packet;
And I wonder: is this going to map data[40] into a contiguous memory block with a size of 40 bits (5 bytes)?
If not, is there any obvious solution I'm missing? This doesn't seem like a very complicated thing to do (it would be much simpler if there were fewer than 32 devices, so I could use an int and just access it through a bit mask).

Assuming the addresses you get back are in the range 0 - 39 and that a char has 8 bits, you can treat your data array as an array of bits:
|            data[0]            |            data[1]            | ...
-----------------------------------------------------------------
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15|
-----------------------------------------------------------------
To set bit i:
packet.data[i/8] |= (1 << (i%8));
To clear bit i:
packet.data[i/8] &= (1 << (i%8)) ^ 0xff;
To read bit i:
int flag = (packet.data[i/8] & (1 << (i%8))) != 0;
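For example, a minimal compilable sketch of this applied to the SERIAL_packet from the question (the helper names and the main() test are mine, not part of the original answer):
#include <string.h>

typedef struct {
    unsigned char length;   //!< Data Length
    unsigned data_type;     //!< Data type
    unsigned char data[5];  //!< 40 status bits, one per device (0..39)
} SERIAL_packet;

/* hypothetical helpers built on the bit operations above */
static void device_on(SERIAL_packet *p, unsigned char addr)  { p->data[addr / 8] |= (unsigned char)(1 << (addr % 8)); }
static void device_off(SERIAL_packet *p, unsigned char addr) { p->data[addr / 8] &= (unsigned char)~(1 << (addr % 8)); }
static int  device_status(const SERIAL_packet *p, unsigned char addr) { return (p->data[addr / 8] & (1 << (addr % 8))) != 0; }

int main(void)
{
    SERIAL_packet packet;
    memset(&packet, 0, sizeof packet);
    device_on(&packet, 39);                    /* PC asked to turn device 39 on */
    return device_status(&packet, 39) ? 0 : 1; /* exit 0 if the bit is set */
}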

Related

Converting 32 bit number to four 8bit numbers

I am trying to convert the input from a device (always integer between 1 and 600000) to four 8-bit integers.
For example,
If the input is 32700, I want 188 127 00 00.
I achieved this by using:
32700 % 256
32700 / 256
The above works till 32700. From 32800 onward, I start getting incorrect conversions.
I am totally new to this and would like some help to understand how this can be done properly.
Major edit following clarifications:
Given that someone has already mentioned the shift-and-mask approach (which is undeniably the right one), I'll give another approach, which, to be pedantic, is not portable, machine-dependent, and possibly exhibits undefined behavior. It is nevertheless a good learning exercise, IMO.
For various reasons, your computer represents integers as groups of 8-bit values (called bytes); note that, although extremely common, this is not always the case (see CHAR_BIT). For this reason, values that are represented using more than 8 bits use multiple bytes (hence the bit counts that are multiples of 8). For a 32-bit value, you use 4 bytes and, in memory, those bytes always follow each other.
We call a pointer a value containing the address in memory of another value. In that context, a byte is defined as the smallest (in terms of bit count) value that can be referred to by a pointer. For example, your 32-bit value, covering 4 bytes, will have 4 "addressable" cells (one per byte) and its address is defined as the first of those addresses:
|==================|
| MEMORY | ADDRESS |
|========|=========|
| ... | x-1 | <== Pointer to byte before
|--------|---------|
| BYTE 0 | x | <== Pointer to first byte (also pointer to 32-bit value)
|--------|---------|
| BYTE 1 | x+1 | <== Pointer to second byte
|--------|---------|
| BYTE 2 | x+2 | <== Pointer to third byte
|--------|---------|
| BYTE 3 | x+3 | <== Pointer to fourth byte
|--------|---------|
| ... | x+4 | <== Pointer to byte after
|==================|
So what you want to do (split the 32-bit word into 8-bit words) has already been done by your computer, as it is imposed on it by its processor and/or memory architecture. To reap the benefits of this almost-coincidence, we are going to find where your 32-bit value is stored and read its memory byte by byte (instead of 32 bits at a time).
As all serious SO answers seem to do so, let me cite the Standard (ISO/IEC 9899:2018, 6.2.5-20) to define the last thing I need (emphasis mine):
Any number of derived types can be constructed from the object and function types, as follows:
An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. [...] Array types are characterized by their element type and by the number of elements in the array. [...]
[...]
So, as elements in an array are defined to be contiguous, a 32-bit value in memory, on a machine with 8-bit bytes, really is nothing more, in its machine representation, than an array of 4 bytes!
Given a 32-bit signed value:
int32_t value;
its address is given by &value. Meanwhile, an array of 4 8-bit bytes may be represented by:
uint8_t arr[4];
notice that I use the unsigned variant because those bytes don't really represent a number per se so interpreting them as "signed" would not make sense. Now, a pointer-to-array-of-4-uint8_t is defined as:
uint8_t (*ptr)[4];
and if I assign the address of our 32-bit value to such an array, I will be able to index each byte individually, which means that I will be reading the byte directly, avoiding any pesky shifting-and-masking operations!
uint8_t (*bytes)[4] = (void *) &value;
I need to cast the pointer ("(void *)") because I can't bear that whining compiler: &value's type is "pointer-to-int32_t" while I'm assigning it to a "pointer-to-array-of-4-uint8_t", and this type mismatch is caught by the compiler and pedantically warned against by the Standard; this is a first warning that what we're doing is not ideal!
Finally, we can access each byte individually by reading it directly from memory through indexing: (*bytes)[n] reads the n-th byte of value!
To put it all together, given a send_can(uint8_t) function:
for (size_t i = 0; i < sizeof(*bytes); i++)
    send_can((*bytes)[i]);
and, for testing purposes, we define:
void send_can(uint8_t b)
{
    printf("%hhu\n", b);
}
which prints, on my machine, when value is 32700:
188
127
0
0
Lastly, this shows yet another reason why this method is platform-dependent: the order in which the bytes of the 32-bit word are stored isn't always what you would expect from a theoretical discussion of binary representation, i.e.:
byte 0 contains bits 31-24
byte 1 contains bits 23-16
byte 2 contains bits 15-8
byte 3 contains bits 7-0
actually, AFAIK, the C Language permits any of the 24 possibilities for ordering those 4 bytes (this is called endianness). Meanwhile, shifting and masking will always get you the n-th "logical" byte.
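For convenience, here are the fragments above assembled into a single compilable sketch (same portability caveats apply; send_can is still the printf stand-in):
#include <stdint.h>
#include <stdio.h>

static void send_can(uint8_t b)
{
    printf("%hhu\n", b);                      /* test stand-in for the real CAN send */
}

int main(void)
{
    int32_t value = 32700;
    uint8_t (*bytes)[4] = (void *) &value;    /* view the int32_t as 4 raw bytes */

    for (size_t i = 0; i < sizeof(*bytes); i++)
        send_can((*bytes)[i]);
    return 0;
}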
It really depends on how your architecture stores an int. For example
8 or 16 bit system short=16, int=16, long=32
32 bit system, short=16, int=32, long=32
64 bit system, short=16, int=32, long=64
This is not a hard and fast rule - you need to check your architecture first. There is also long long, which is at least 64 bits, but some older compilers do not recognize it.
Some compilers provide fixed-width types such as uint8_t (from <stdint.h>), so you can specify exactly how many bits your number has instead of worrying about ints and longs.
Having said that, since you wish to convert a number into four 8-bit values, you could do something like:
unsigned long x = 600000UL; // you need UL to indicate it is unsigned long
unsigned int b1 = (unsigned int)(x & 0xff);
unsigned int b2 = (unsigned int)(x >> 8) & 0xff;
unsigned int b3 = (unsigned int)(x >> 16) & 0xff;
unsigned int b4 = (unsigned int)(x >> 24);
Using shifts is a lot faster than multiplication, division or mod. Which byte ends up as b1 versus b4 depends on the byte order you wish to produce: you could reverse the assignments, using b1 with the formula for b4, and so on.
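Wrapped into a small function it might look like this (a sketch; the function name and the least-significant-byte-first output order are my own choices):
#include <stdint.h>

/* split x into 4 bytes, least significant byte first */
static void split_u32(uint32_t x, uint8_t out[4])
{
    out[0] = (uint8_t)(x & 0xff);
    out[1] = (uint8_t)((x >> 8) & 0xff);
    out[2] = (uint8_t)((x >> 16) & 0xff);
    out[3] = (uint8_t)((x >> 24) & 0xff);
}

/* e.g. split_u32(32700, b) gives b[0]==188, b[1]==127, b[2]==0, b[3]==0 */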
You could do some bit masking.
600000 is 0x927C0
600000 / (256 * 256) gets you the 9; no masking yet.
(600000 & (255 * 256)) >> 8 gets you the 0x27 == 39, using an 8-bit-shifted mask of 8 set bits (255 * 256) and a right shift by 8 bits; the >> 8 could also be done as another / 256 on the masked value.
600000 % 256 gets you the 0xC0 == 192, as you did it. With masking it would be 600000 & 255.
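If you want to check that arithmetic yourself, a quick throwaway program (the printf calls are only there to show the three results in hex):
#include <stdio.h>

int main(void)
{
    unsigned long n = 600000UL;                  /* 0x927C0 */
    printf("%lx\n", n / (256UL * 256UL));        /* 9  (0x09) */
    printf("%lx\n", (n & (255UL * 256UL)) >> 8); /* 27 (0x27 == 39) */
    printf("%lx\n", n % 256UL);                  /* c0 (0xC0 == 192) */
    return 0;
}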
I ended up doing this:
unsigned char bytes[4];
unsigned long n;
n = (unsigned long) sensore1 * 100;
bytes[0] = n & 0xFF;
bytes[1] = (n >> 8) & 0xFF;
bytes[2] = (n >> 16) & 0xFF;
bytes[3] = (n >> 24) & 0xFF;
CAN_WRITE(0x7FD,8,01,sizeof(n),bytes[0],bytes[1],bytes[2],bytes[3],07,255);
I have been in a similar kind of situation while packing and unpacking huge custom packets of data to be transmitted/received. I suggest you try the approach below:
typedef union
{
    uint32_t u4_input;
    uint8_t  u1_byte_arr[4];
} UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;

UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;
un_t_mode_reg.u4_input = input; /* your 32-bit input */
// 1st byte = un_t_mode_reg.u1_byte_arr[0];
// 2nd byte = un_t_mode_reg.u1_byte_arr[1];
// 3rd byte = un_t_mode_reg.u1_byte_arr[2];
// 4th byte = un_t_mode_reg.u1_byte_arr[3];
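A complete usage example, assuming a little-endian target (on a big-endian machine the array indices come out in the opposite order; the test value is mine):
#include <stdint.h>
#include <stdio.h>

typedef union
{
    uint32_t u4_input;
    uint8_t  u1_byte_arr[4];
} UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;

int main(void)
{
    UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;
    un_t_mode_reg.u4_input = 32700u;          /* your 32-bit input */
    for (int i = 0; i < 4; i++)
        printf("%u\n", (unsigned)un_t_mode_reg.u1_byte_arr[i]); /* 188 127 0 0 on little-endian */
    return 0;
}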
The largest positive value you can store in a 16-bit signed int is 32767. If you force a bigger number than that into it, the result typically wraps around to a negative number (strictly speaking, signed overflow is undefined behaviour), hence the unexpected values returned by % and /.
Use either unsigned 16-bit int for a range up to 65535 or a 32-bit integer type.

Structure for an array of bits in C

It has come to my attention that there is no builtin structure for a single bit in C. There is (unsigned) char and int, which are 8 bits (one byte), and long which is 64+ bits, and so on (uint64_t, bool...)
I came across this while coding up a huffman tree, and the encodings for certain characters were not necessarily exactly 8 bits long (like 00101), so there was no efficient way to store the encodings. I had to find makeshift solutions such as strings or boolean arrays, but this takes far more memory.
But anyways, my question is more general: is there a good way to store an array of bits, or some sort of user-defined struct? I scoured the web for one but the smallest structure seems to be 8 bits (one byte). I tried things such as int a : 1 but it didn't work. I read about bit fields but they do not simply achieve exactly what I want to do. I know questions have already been asked about this in C++ and if there is a struct for a single bit, but mostly I want to know specifically what would be the most memory-efficient way to store an encoding such as 00101 in C.
If you're mainly interested in accessing a single bit at a time, you can take an array of unsigned char and treat it as a bit array. For example:
unsigned char array[125];
Assuming 8 bits per byte, this can be treated as an array of 1000 bits. The first 16 logically look like this:
----------------------------------------------------------------------------
byte |               0               |                  1                  |
----------------------------------------------------------------------------
bit  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
----------------------------------------------------------------------------
Let's say you want to work with bit b. You can then do the following:
Read bit b:
value = (array[b/8] & (1 << (b%8))) != 0;
Set bit b:
array[b/8] |= (1 << (b%8));
Clear bit b:
array[b/8] &= ~(1 << (b%8));
Dividing the bit number by 8 gets you the relevant byte. Similarly, taking the bit number mod 8 gives you the relevant bit inside that byte. You then left-shift the value 1 by that bit position to give you the necessary bit mask.
While there is integer division and modulus at work here, the divisor is a power of 2, so any decent compiler should replace them with bit shifting/masking.
It has come to my attention that there is no builtin structure for a single bit in C.
That is true, and it makes sense because essentially no machines have bit-addressable memory.
But anyways, my question is more general: is there a good way to store an array of bits, or some sort of user-defined struct?
One generally uses an unsigned char or another unsigned integer type, or an array of such. Along with that you need some masking and shifting to set or read the values of individual bits.
I scoured the web for one but the smallest structure seems to be 8 bits (one byte).
Technically, the smallest addressable storage unit ([[un]signed] char) could be larger than 8 bits, though you're unlikely ever to see that.
I tried things such as int a : 1 but it didn't work. I read about bit fields but they do not simply achieve exactly what I want to do.
Bit fields can appear only as structure members. A structure object containing such a bitfield will still have a size that is a multiple of the size of a char, so that doesn't map very well onto a bit array or any part of one.
I know questions have already been asked about this in C++ and if there is a struct for a single bit, but mostly I want to know specifically what would be the most memory-efficient way to store an encoding such as 00101 in C.
If you need a bit pattern and a separate bit count -- such as if some of the bits available in the bit-storage object are not actually significant -- then you need a separate datum for the significant-bit count. If you want a data structure for a small but variable number of bits, then you might go with something along these lines:
struct bit_array_small {
    unsigned char bits;
    unsigned char num_bits;
};
Of course, you can make that larger by choosing a different data type for the bits member and, maybe, the num_bits member. I'm sure you can see how you might extend the concept to handling arbitrary-length bit arrays if you should happen to need that.
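If you do need arbitrary-length bit arrays, one possible shape for the extended concept is the following sketch (the names and the allocation scheme are mine, not from the answer above):
#include <stdlib.h>

struct bit_array {
    unsigned char *bits;     /* ceil(num_bits / 8) bytes of storage */
    size_t num_bits;         /* number of significant bits */
};

static struct bit_array *bit_array_new(size_t num_bits)
{
    struct bit_array *a = malloc(sizeof *a);
    if (!a)
        return NULL;
    a->bits = calloc((num_bits + 7) / 8, 1);  /* zero-initialized bit storage */
    if (!a->bits) {
        free(a);
        return NULL;
    }
    a->num_bits = num_bits;
    return a;
}
Freeing is the mirror image: free the bits member, then the struct itself.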
If you really want the most memory efficiency, you can encode the Huffman tree itself as a stream of bits. See, for example:
https://www.siggraph.org/education/materials/HyperGraph/video/mpeg/mpegfaq/huffman_tutorial.html
Then just encode those bits as an array of bytes, with a possible waste of 7 bits.
But that would be a horrible idea. For the structure in memory to be useful, it must be easy to access. You can still do that very efficiently. Let's say you want to encode up to 12-bit codes. Use a 16-bit integer and bitfields:
struct huffcode {
    uint16_t length : 4,
             value  : 12;
};
C will store this as a single 16-bit value, and allow you to access the length and value fields separately. The complete Huffman node would also contain the input code value, and tree pointers (which, if you want further compactness, can be integer indices into an array).
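To see how this packs on your own compiler (bit-fields on uint16_t are a common extension rather than a strict C guarantee, and the exact layout is implementation-defined), a small check:
#include <stdint.h>
#include <stdio.h>

struct huffcode {
    uint16_t length : 4,
             value  : 12;
};

int main(void)
{
    struct huffcode code = { .length = 5, .value = 0x5 };   /* the code 00101: 5 bits, value 0b00101 */
    printf("sizeof(struct huffcode) = %zu\n", sizeof(struct huffcode));
    printf("length = %u, value = %u\n", (unsigned)code.length, (unsigned)code.value);
    return 0;
}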
You can make your own bit array in no time.
#include <string.h>

#define ba_set(ptr, bit)   { (ptr)[(bit) >> 3] |= (char)(1 << ((bit) & 7)); }
#define ba_clear(ptr, bit) { (ptr)[(bit) >> 3] &= (char)(~(1 << ((bit) & 7))); }
#define ba_get(ptr, bit)   ( ((ptr)[(bit) >> 3] & (char)(1 << ((bit) & 7))) ? 1 : 0 )
#define ba_setbit(ptr, bit, value) { if (value) { ba_set((ptr), (bit)) } else { ba_clear((ptr), (bit)); } }

#define BITARRAY_BITS (120)

int main()
{
    char mybits[(BITARRAY_BITS + 7) / 8];
    memset(mybits, 0, sizeof(mybits));
    ba_setbit(mybits, 33, 1);
    if (!ba_get(mybits, 33))
        return 1;
    return 0;
}

Different types for zero length bit fields in c?

Found this statement: "A zero-width bit field can cause the next field to be aligned on the next container boundary where the container is the same size as the underlying type of the bit field."
To put it into practice, assume int is 2 bytes (16 bits) and short is 1 byte (8 bits), to save typing. Also, let's say we are using the gcc compiler (it would be nice to explain the differences to clang).
struct foo {
    unsigned int a:5;
    unsigned int :0;
    unsigned int b:3;
};
In memory this looks like
struct address
|
|
v
aaaaa000 00000000 bbb00000 00000000
Question 1: In my understanding it cannot look like aaaaa000 00000000 0..00bbb00000..., so bbb has to align with the container directly following the current container. Is this actually true?
Moving on, if I specify
struct bar {
    unsigned short x:5;
    unsigned int :0;
    unsigned short y:7;
};
Will it be like so?
struct address
|        short stops here  short starts
|        |                 |
v        v | this is uint| v
xxxxx000 00000000 00000000 yyyyyyy0
Edit 1
It was pointed out that a short cannot be smaller than 16 bits. That is slightly beside the point in this question, but if it's important to you, you can replace short with char and int with short.
Update, after reading the text in context:
The result of your example (corrected to use char):
struct bar {
    unsigned char x:5;
    unsigned int :0;
    unsigned char y:7;
};
would look like this (assuming 16-bit int):
char pad pad      int boundary
|    |   |        |
v    v   v        v
xxxxx000 00000000 yyyyyyy0
(I'm ignoring endianness.)
The zero-length bit field causes the position to move to the next int boundary. You defined int to be 16 bits, so 16 minus 5 gives 11 bits of padding.
It does not insert an entire blank int. The example on the page you link demonstrates this (but using 32-bit integers).
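If you want to observe the effect on your own compiler (the printed size is ABI-dependent, and char-typed bit-fields are themselves a common extension), a small check:
#include <stdio.h>

struct bar {
    unsigned char x : 5;
    unsigned int    : 0;    /* forces the next field to the next int boundary */
    unsigned char y : 7;
};

int main(void)
{
    printf("sizeof(struct bar) = %zu\n", sizeof(struct bar));
    return 0;
}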
First, whenever writing bit-fields, it is generally recommended that you declare the members in either ascending or descending order of the sizes of the data types used. That way the compiler picks the largest data type size and carves it into chunks of the same size.
This is what I think it will be:
struct address
|        short stops here  short starts
|        |                 |
v        v | this is uint| v
xxxxx000 00000000 00000000 yyyyyyy0

Structs in a 32-bit architecture [duplicate]

The following code:
struct s1 {
    void *a;
    char b[2];
    int c;
};

struct s2 {
    void *a;
    char b[2];
    int c;
} __attribute__((packed));
If s1 has a size of 12 bytes and s2 has a size of 10 bytes, is this due to data being read in 4-byte chunks, and does __attribute__((packed)) reduce the size of void *a to only 2 bytes?
I'm a little confused as to what __attribute__((packed)) does.
Many thanks
It is due to alignment, a process in which the compiler adds hidden "junk" between the fields to make sure they have optimal (for performance) starting addresses.
Using packed forces the compiler to not do that, which often means that accessing the structure becomes slower (or simply impossible, causing e.g. a bus error) if the hardware has problems doing e.g. 32-bit accesses on addresses that are not multiples of 4.
On Intel processors, fetches of 32-bit aligned data are considerably faster than unaligned ones; on many other processors unaligned fetches might be illegal altogether, or need to be simulated using two instructions. Thus, on these 32-bit architectures, the first structure will always have c aligned to a byte address divisible by 4. This, however, requires that 2 bytes be wasted as padding.
struct s1 {
    void *a;
    char b[2];
    int c;
};
// Byte layout in memory (32-bit little-endian):
// | a0 | a1 | a2 | a3 | b0 | b1 | NA | NA | c0 | c1 | c2 | c3 |
// addresses increasing ====>
On the other hand, sometimes you absolutely need to map some unaligned datastructures (like file formats, or network packets), as is, into C structures; there you can use the __attribute__((packed)) to specify that you want everything without padding bytes:
struct s2 {
    void *a;
    char b[2];
    int c;
} __attribute__((packed));
// Byte layout in memory (32-bit little-endian):
// | a0 | a1 | a2 | a3 | b0 | b1 | c0 | c1 | c2 | c3 |
// addresses increasing ====>
This is due to data structure alignment, a combination of two processes: data alignment and data padding. The first structure is aligned to the word, as you said; the second structure is packed, which forces the compiler not to pad the structure to the word.
The second structure is 10 bytes because it is the padding after the 2-byte character array that disappears, not the void pointer (it remains 4 bytes, as all pointers are on this architecture). This can hinder performance: the trade-off of 2 bytes of space is usually not worth the efficiency lost by the hardware, and in some circumstances unaligned access could even lead to undefined behaviour.
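To see the padding for yourself on a given target, you can print the sizes and the offset of c (a small check; the exact numbers depend on the ABI, and __attribute__((packed)) is a GCC/Clang extension):
#include <stddef.h>
#include <stdio.h>

struct s1 { void *a; char b[2]; int c; };
struct s2 { void *a; char b[2]; int c; } __attribute__((packed));

int main(void)
{
    printf("sizeof(s1) = %zu, offsetof(s1, c) = %zu\n", sizeof(struct s1), offsetof(struct s1, c));
    printf("sizeof(s2) = %zu, offsetof(s2, c) = %zu\n", sizeof(struct s2), offsetof(struct s2, c));
    return 0;
}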

How to write a byte to register with specific memory address?

I want to write a byte to a register at a specific memory address (0x1228A432).
But this register has the following structure:
Bits   | Access     | Name     | Reset | Description
[31:8] | Read only  | -------- | ----- | Reserved
[7:0]  | Read-write | REG[7:0] | 0xXX  | -----------
Please tell me, how to write a byte to this register without "touching" the Reserved bits?
EDIT1: My target is Cortex A9.
I could successfully read/write to onboard DDR2 memory using byte values (such as 0xFF).
EDIT2: I used to work with DDR2 memory in the following way:
// First stage
static unsigned char *p = 0;
char *argv1 = "0x60000000";
unsigned long address = strtoul(argv1, 0, 0);
p = (unsigned char *) address;   // point at the parsed address, not at the string
// Second stage
char *argv4 = "FF";
int value = strtol(argv4, 0, 16);
// Third stage
int offset = 9;
p[offset] = value;
EDIT3: I found out the following information:
All registers are 32 bits wide and do not support byte writes.
Write operations must be word-wide and bits marked as reserved must be preserved
using read-modify-write.
One way to preserve bits [31:8], assuming 32-bit wide access, is to read the value, zero-out bits [7:0], bitwise-or it with the value needed and then write it back to the register.
Something like (stealing from RedX a bit ;) ):
uint8_t your_8_bit_value = 0x42;
uint32_t volatile * const mem_map_register = (uint32_t volatile *) 0x1228a432;
*mem_map_register = (*mem_map_register & 0xFFFFFF00) | your_8_bit_value;
Yet I think there should be more info available about your hardware. I've seen several datasheets saying, e.g., that you have to write all 1s to reserved bits (meaning that the bits are reserved for future use, and 1 is a safe default), etc. So it is not always obvious that leaving reserved bits untouched is the right thing to do.
You should find more details about your hardware: are byte-wide writes supported, are writes to reserved bits perhaps ignored, or should they all be 0/1, etc.
Look up the assembler instruction handbook for your core for an 8-bit write instruction (not sure if it exists). If it does, use a uint8_t for your assignment to that memory location (uint8_t volatile * const reg = (uint8_t volatile *) 0x1228a432;).
Else do what Omkant said. Overwriting the bits with the same number should not produce any unwanted results, since they are not "zeroed" before being overwritten.
His code in C (this is the verbose version for better readability):
uint8_t your_8_bit_value = 0x42;
uint32_t volatile * const mem_map_register = (uint32_t volatile *) 0x1228a432;
*mem_map_register = (*mem_map_register & 0xFFFFFF00) | your_8_bit_value;
[register value] = ([register value] | [00 00 00 FF]) & [FF FF FF XX]
Here, XX is the byte you want to end up in bits [7:0]. The OR with 0x000000FF forces the low byte to all ones without touching the upper 24 bits, and the AND with 0xFFFFFFXX then brings the low byte down to XX while leaving the upper 24 bits as they were.
I think this should work.
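To convince yourself that the OR-then-AND combination really does preserve bits [31:8], you can try it on an ordinary variable first (pure arithmetic, no hardware access; the sample values are made up):
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint32_t reg = 0xA5A5A512u;     /* pretend this was read from the register */
    uint8_t new_low = 0x42;         /* the byte we want in bits [7:0] */

    reg = (reg | 0x000000FFu) & (0xFFFFFF00u | new_low);
    printf("0x%08" PRIX32 "\n", reg);   /* prints 0xA5A5A542: upper 24 bits unchanged */
    return 0;
}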
