struct padding extra byte after member [duplicate] - c

This question already has answers here:
Why isn't sizeof for a struct equal to the sum of sizeof of each member?
(13 answers)
Closed 4 years ago.
#include <stdio.h>
#include <stdint.h>
typedef struct s{
    uint8_t a[1];
    uint16_t b;
    uint8_t c[1];
}s_t;
int main()
{
    printf("size = %zu", sizeof(s_t)); // sizeof yields a size_t, so use %zu rather than %d
    return 0;
}
I am not sure why the output of this program is 6 bytes and not 5. Why does the compiler pad an extra byte after the last member? It also seems that if you make the last member an array of length 3, the padding makes the size 8. I cannot explain this, since it does not happen when the struct contains only the two arrays.

Here is an illustration of the alignment that the compiler generates:
Bytes:
+-----+---------------+
|  0  | a[1]          |
+-----+---------------+
|  1  | N/A (padding) |
+-----+---------------+
|  2  | b             |
+-----+---------------+
|  3  | b             |
+-----+---------------+
|  4  | c             |
+-----+---------------+
|  5  | N/A (padding) |
+-----+---------------+
As 16-bit quantities:
+---+------+-----+
| 0 | a[1] | pad |
+---+------+-----+
| 2 | b          |
+---+------+-----+
| 4 | c    | pad |
+---+------+-----+
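Processors like to fetch 16-bit quantities from even addresses.
When they are on odd addresses, the computer may have to make two 16-bit fetches and extract the unaligned data out of them.
The easy method to eliminate this extra fetch is to add padding bytes so that 16-bit quantities align to even addresses.
The same requirement explains the byte after c: in an array of s_t, the b of every element must also land on an even address, so the compiler rounds the struct size up to a multiple of 2. That gives 6 rather than 5, and 8 rather than 7 when c has length 3.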
A rule of thumb is to place the larger items first, then the smaller.
Applying this rule:
+---+------+
| 0 | b |
+---+------+
| 2 | a[1] |
+---+------+
| 3 | c |
+---+------+
The rule eliminates the need for an extra padding byte.

Related

Why am I getting this output from a C union with bitfields in my code?

Sorry for the non descriptive title - I wasn't sure how to pose this in one line.
I have a data structure, where I have two values: one 14-bit, one 10-bit. I want to be able to access them as bytes in a union. I have the following:
struct test
{
    union
    {
        struct
        {
            unsigned int a : 14;
            unsigned int b : 10;
        } fields;
        struct
        {
            unsigned char i0;
            unsigned char i1;
            unsigned char i2;
        } bytes;
    } id;
};
Now, when I assign 1 to bytes.i2, I would expect fields.b to also take the value 1. But the value in fields.b is actually bytes.i2 shifted left by 2 bits.
#include <stdio.h>
int main()
{
    struct test x = {0}; // start from zeroed fields so a and b are not indeterminate
    x.id.bytes.i2 = 1;
    printf("%d", x.id.fields.b); // OUTPUTS 4
    return 0;
}
I must be missing some basic principle here, any insight would be helpful!
In little endian, packed structs (bits written least significant first):
fields  a              |b
bytes   i0      |i1    :  |i2
BITS    00000000|000000|00|10000000    i2 = 1;  b = 4
BITS    00000000|000000|10|00000000    i1 = 64; b = 1
INDEX   01234567|890123|45|67890123
        0          1           2
As you can see b = 0b00000100 (4)
The exact layout and ordering of bitfields in a struct is entirely up to the implementation.
On a little endian machine, the layout of the union most likely looks like this:
|a              |b   a          |b              |
|7 6 5 4 3 2 1 0|1 0 d c b a 9 8|9 8 7 6 5 4 3 2|
|      i0       |      i1       |      i2       |
-------------------------------------------------
| | | | | | | | | | | | | | | | | | | | | | | | |
-------------------------------------------------
In this layout, we can see that the 8 low order bits of a are in the first byte, then the 6 high order bits of a and the 2 low order bits of b in the second byte, followed by the high order 8 bits of b in the third byte. This explains the result you're seeing.
A little-endian machine will typically also number the bits within a byte in little-endian order, so if you reverse the order of the bits in each byte above, reflecting the physical representation instead of the logical representation, you can see that the bits of each bitfield are contiguous.

Which C type should be used in this protocol implementation?

There are communication protocol definitions that use fields for values that are multiple of bytes but don't use all the space in C types such as uint8_t, uint16_t, uint32_t and uint64_t.
For example take this fake protocol (every line is a byte):
* | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
*
* 1 | TYPE | spare |
* 2 | Packet Number |
* 3 | Packet Number |
* 4 | Number Of Packets |
* 5 | Number Of Packets |
* 6 | Number Of Packets |
* 7 | Number Of Octet |
* 8 | Number Of Octet |
* 9 | Number Of Octet |
* 10 | Number Of Octet |
* 11 | Number Of Octet |
There are fields that can easily fit in C types such as TYPE and spare fields using uint8_t and bit operations or Packet Number in a uint16_t. But what about the fields Number Of Packets and Number Of Octet?
Questions:
Should uint32_t be used for Number Of Packets and uint64_t for Number Of Octets, even though they only use 24 bits and 40 bits respectively?
Should a new type be created using bitfields of 24 bits and 40 bits?
Or should uint8_t[3] and uint8_t[5] arrays be used to hold these values?
I'm asking because I want the best run-time performance.
Thanks!

How memory is allocated when we use malloc to create 2-dimensional array?

I want to create an integer array[5][10] using malloc(). The difference between memory address of array[0] and array[1] is showing 8. Why?
#include <stdio.h>
#include <stdlib.h>
int main() {
    int *b[5];
    for (int loop = 0; loop < 5; loop++)
        b[loop] = (int*)malloc(10 * sizeof(int));
    printf("b=%u \n", b);
    printf("(b+1)=%u \n", (b + 1));
    printf("(b+2)=%u \n", (b + 2));
}
The output is:
b=2151122304
(b+1)=2151122312
(b+2)=2151122320
The difference between memory address of array[0] and array[1] is showing 8. Why?
That's because the size of a pointer on your platform is 8 bytes.
BTW, use of %u to print a pointer leads to undefined behavior. Use %p instead, with the argument cast to void *.
printf("(b+1)=%p \n", (void *)(b + 1));
printf("(b+2)=%p \n", (void *)(b + 2));
Difference between array of pointers and a 2D array
When you use:
int *b[5];
The memory used for b is:
&b[0]    &b[1]    &b[2]
|        |        |
v        v        v
+--------+--------+--------+
|  b[0]  |  b[1]  |  b[2]  |
+--------+--------+--------+
+--------+--------+--------+
(b+1) is the same as &b[1]
(b+2) is the same as &b[2]
Hence, the difference between (b+2) and (b+1) is the size of a pointer.
When you use:
int b[5][10];
The memory used for b is:
&b[0][0]                                          &b[1][0]                                          &b[2][0]
|                                                 |                                                 |
v                                                 v                                                 v
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ ...
|    |    |    |    |    |    |    |    |    |    |    |    |    |    |    |    |    |    |    |    | ...
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ ...
(b+1) is the same as &b[1]. The value of that pointer is the same as the value of &b[1][0], even though they are pointers to different types.
(b+2) is the same as &b[2]. The value of that pointer is the same as the value of &b[2][0].
Hence, the difference between (b+2) and (b+1) is the size of 10 ints.
First, with int *b[5] you are not creating a two dimensional array, but an array of pointers.
The elements of the array b are pointers. Each occupies the size of a pointer, which depends on your architecture. On a 64-bit architecture it will probably occupy 64 bits (8 bytes). You can check that by printing sizeof(int*) or sizeof(b[0]).
Memory allocation will look like
b
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[0]+--------------> | | | | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[1]+--------------> | | |....... | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[2]+--------------> | | | ...... | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[3]+--------------> | | | ...... | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
| | +------+------+-----------+-----+-----+-----+-----+
| b[4]+--------------> | | | ...... | | | | |
| | +------+------+-----------+-----+-----+-----+-----+
+-----+
b will point to b[0], after decay, and b + 1 will give the address of b[1]. Size of pointer on your machine is 8 bytes, therefore you are getting a difference of 8 in the address.
Besides this, do not cast the return value of malloc:
b[loop]=malloc(10*sizeof(int));
and use %p, with a cast to void *, to print pointers:
printf("b=%p \n",(void *)b);
printf("(b+1)=%p \n",(void *)(b+1));
printf("(b+2)=%p \n",(void *)(b+2));
What you've declared is not technically a two dimensional array but an array of pointers to int, each of which points to an array of int. The reason array[0] and array[1] are 8 bytes apart is because you have an array of pointers, and pointers on your system are 8 bytes.
When you allocate each individual 1 dimensional array, they don't necessarily exist next to each other in memory. If on the other hand you declared int b[5][10], you would have 10 * 5 = 50 contiguous integers arranged in 5 rows of 10.

What does "int *[5]" indicate in c? [duplicate]

This question already has answers here:
What is the meaning of int (*pt)[5] in c [duplicate]
(3 answers)
Closed 7 years ago.
I was going through study materials from my previous year at University, and I saw a question like:
What is the difference between int *a and int a[5] and int *[5]. What does the last one indicate?
int *a[5] declares an array of pointers to int.
The easiest way to determine the specifics of a variable declaration is to read it from right to left.
In a nutshell:
int *a - creates a pointer to an int. This should contain the memory address of another int.
Example values of a are 0x00001, 0x000057, etc.
int a[5] - creates an array that contains five int elements, each holding an int value.
Here's a visualization of the possible values of each element in the array:
-------------------------
| Array element | Value |
-------------------------
| a[0] | 1 |
| a[1] | 2 |
| a[2] | 3 |
| a[3] | 4 |
| a[4] | 5 |
-------------------------
int *a[5] - creates an array that contains five pointer-to-int elements, each holding the memory address of another int.
Here's a visualization of the possible values of each element in the pointer array:
-------------------------
| Array element | Value |
-------------------------
| a[0] | 0x000 |
| a[1] | 0x001 |
| a[2] | 0x002 |
| a[3] | 0x003 |
| a[4] | 0x004 |
-------------------------

improving memory alignment of a structure on 32 bit machine

Correct the alignment of structure below that is bad.
typedef struct{
    char *string; // 4 byte (type of address int)
    char temp;    // 1 byte
    short pick;   // 2 byte
    char temp2;   // 1 byte
}hello;
string = 4
temp + pick + temp2(offset 7) = 1+2+1
answer given, good alignment is
char *string; // 4 byte (type of address int)
short pick; // 2 byte
char temp; // 1 byte
char temp2; // 1 byte
string = 4
pick + temp + temp2(offset 7) = 2+1+1
I am unable to understand why temp2 ends up at offset 7 rather than 8 in the reordered version. Please help.
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+
|    string     | pick  | t1| t2|
+---+---+---+---+---+---+---+---+
Using t1 for temp and t2 for temp2, this is the revised layout. The offset of t2 is 7.
For the original structure on a system where n-byte quantities are n-byte aligned, the layout would be:
+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B |
+---+---+---+---+---+---+---+---+---+---+---+---+
|    string     | t1|pad| pick  | t2|pad|pad|pad|
+---+---+---+---+---+---+---+---+---+---+---+---+
The three trailing pad bytes are there because the 4-byte pointer needs to be 4-byte aligned, so the size of the structure must be a multiple of 4 bytes for every element of an array of the structure to stay aligned.
Thus, in the original structure, the offset of t2 would have been 8, not 7.

Resources