How many bytes does char defined in a structure take? - c

When I define a character type in a structure, it seems to take more than 1 byte; in fact it seems to take 4 bytes.
Below is my program:
#include <stdio.h>
int main(void)
{
struct book{
char name;
float price;
int pages;
};
struct book b1={'B',130.00,550};
printf("\nAddress of structure:%u",&b1);
printf("\nAddress of character name:%u",&b1.name);
printf("\nAddress of float price:%u",&b1.price);
printf("\nAddress of integer pages:%u",&b1.pages);
printf("\n\n");
return 0;
}
When I run the above program, I get the output below:
Address of structure:557762432
Address of character name:557762432
Address of float price:557762436
Address of integer pages:557762440
Why is it that I see the difference of 4 bytes between address of variable "name" and variable "price"?
The system on which this program is being run is x86_64 bit arch running Fedora-14.

The C standard allows implementations to add additional padding bits to structures so as to access the structure faster by making it aligned to byte boundaries as required by that implementation.
This is known as Structure Padding.
Given the above the size of a structure may not be the same as the sum of the sizes of the individual members. You should always use sizeof to determine the size of structure.
Also, the above mentioned is the reason that you do not see the structures members placed at memory addresses you expect them to be at.

You are running into alignment rules. This is a very compiler- and system-specific thing, but by default (that is, unless you specify otherwise using compiler flags or special alignment requests in the code) GCC on x64 Linux aligns each field of a struct on a multiple of its size. So, a one-character byte has nothing in particular to worry about for alignment. However, an int or float is always placed on a 4-byte boundary, and a double is always placed on an 8-byte boundary. That's what you are seeing here.
As Ethan notes above in the comments, some processors won't even access memory objects that aren't aligned a certain way, and Intel processors will access memory much more slowly if it isn't aligned.

Strictly, the char always occupies one byte, but it may be followed by 0 or more bytes padding. How much padding depends on the compiler, and on the CPU, and on the alignment of the type following the char in a structure, and any packing pragmas that are in effect. If an N-byte type must be N-byte aligned, as on SPARC, for example, then:
struct s16 { char c; double d; };
struct s8 { char c; long l; };
struct s4 { char c; short s; };
struct s2 { char c; char b; };
The solitary char is followed by 7, 3, 1 and 0 bytes in the structures above on a SPARC machine, and also on an x86_64 machine.

A "char" in C typically takes exactly 8 bits (one byte).
However, a char in a struct is "aligned" on a word, dword or even quadword boundary:
http://en.wikipedia.org/wiki/Data_structure_alignment
There are compiler directives available to force alignment to 4, 2 or 1 byte (which circumvents this behavior).
For example, in Visual C++, you can say "#pragma pack(1)".

Related

Size of a struct with float and uint8_t in C [duplicate]

Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members?
This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:
Mis-aligned access might be a hard error (often SIGBUS).
Mis-aligned access might be a soft error.
Either corrected in hardware, for a modest performance-degradation.
Or corrected by emulation in software, for a severe performance-degradation.
In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.
Here's an example using typical settings for an x86 processor (all used 32 and 64 bit modes):
struct X
{
short s; /* 2 bytes */
/* 2 padding bytes */
int i; /* 4 bytes */
char c; /* 1 byte */
/* 3 padding bytes */
};
struct Y
{
int i; /* 4 bytes */
char c; /* 1 byte */
/* 1 padding byte */
short s; /* 2 bytes */
};
struct Z
{
int i; /* 4 bytes */
short s; /* 2 bytes */
char c; /* 1 byte */
/* 1 padding byte */
};
const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */
One can minimize the size of structures by sorting members by alignment (sorting by size suffices for that in basic types) (like structure Z in the example above).
IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.
Packing and byte alignment, as described in the C FAQ here:
It's for alignment. Many processors can't access 2- and 4-byte
quantities (e.g. ints and long ints) if they're crammed in
every-which-way.
Suppose you have this structure:
struct {
char a[3];
short int b;
long int c;
char d[3];
};
Now, you might think that it ought to be possible to pack this
structure into memory like this:
+-------+-------+-------+-------+
| a | b |
+-------+-------+-------+-------+
| b | c |
+-------+-------+-------+-------+
| c | d |
+-------+-------+-------+-------+
But it's much, much easier on the processor if the compiler arranges
it like this:
+-------+-------+-------+
| a |
+-------+-------+-------+
| b |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d |
+-------+-------+-------+
In the packed version, notice how it's at least a little bit hard for
you and me to see how the b and c fields wrap around? In a nutshell,
it's hard for the processor, too. Therefore, most compilers will pad
the structure (as if with extra, invisible fields) like this:
+-------+-------+-------+-------+
| a | pad1 |
+-------+-------+-------+-------+
| b | pad2 |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d | pad3 |
+-------+-------+-------+-------+
If you want the structure to have a certain size with GCC for example use __attribute__((packed)).
On Windows you can set the alignment to one byte when using the cl.exe compier with the /Zp option.
Usually it is easier for the CPU to access data that is a multiple of 4 (or 8), depending platform and also on the compiler.
So it is a matter of alignment basically.
You need to have good reasons to change it.
This can be due to byte alignment and padding so that the structure comes out to an even number of bytes (or words) on your platform. For example in C on Linux, the following 3 structures:
#include "stdio.h"
struct oneInt {
int x;
};
struct twoInts {
int x;
int y;
};
struct someBits {
int x:2;
int y:6;
};
int main (int argc, char** argv) {
printf("oneInt=%zu\n",sizeof(struct oneInt));
printf("twoInts=%zu\n",sizeof(struct twoInts));
printf("someBits=%zu\n",sizeof(struct someBits));
return 0;
}
Have members who's sizes (in bytes) are 4 bytes (32 bits), 8 bytes (2x 32 bits) and 1 byte (2+6 bits) respectively. The above program (on Linux using gcc) prints the sizes as 4, 8, and 4 - where the last structure is padded so that it is a single word (4 x 8 bit bytes on my 32bit platform).
oneInt=4
twoInts=8
someBits=4
See also:
for Microsoft Visual C:
http://msdn.microsoft.com/en-us/library/2e70t5y1%28v=vs.80%29.aspx
and GCC claim compatibility with Microsoft's compiler.:
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Structure_002dPacking-Pragmas.html
In addition to the previous answers, please note that regardless the packaging, there is no members-order-guarantee in C++. Compilers may (and certainly do) add virtual table pointer and base structures' members to the structure. Even the existence of virtual table is not ensured by the standard (virtual mechanism implementation is not specified) and therefore one can conclude that such guarantee is just impossible.
I'm quite sure member-order is guaranteed in C, but I wouldn't count on it, when writing a cross-platform or cross-compiler program.
The size of a structure is greater than the sum of its parts because of what is called packing. A particular processor has a preferred data size that it works with. Most modern processors' preferred size if 32-bits (4 bytes). Accessing the memory when data is on this kind of boundary is more efficient than things that straddle that size boundary.
For example. Consider the simple structure:
struct myStruct
{
int a;
char b;
int c;
} data;
If the machine is a 32-bit machine and data is aligned on a 32-bit boundary, we see an immediate problem (assuming no structure alignment). In this example, let us assume that the structure data starts at address 1024 (0x400 - note that the lowest 2 bits are zero, so the data is aligned to a 32-bit boundary). The access to data.a will work fine because it starts on a boundary - 0x400. The access to data.b will also work fine, because it is at address 0x404 - another 32-bit boundary. But an unaligned structure would put data.c at address 0x405. The 4 bytes of data.c are at 0x405, 0x406, 0x407, 0x408. On a 32-bit machine, the system would read data.c during one memory cycle, but would only get 3 of the 4 bytes (the 4th byte is on the next boundary). So, the system would have to do a second memory access to get the 4th byte,
Now, if instead of putting data.c at address 0x405, the compiler padded the structure by 3 bytes and put data.c at address 0x408, then the system would only need 1 cycle to read the data, cutting access time to that data element by 50%. Padding swaps memory efficiency for processing efficiency. Given that computers can have huge amounts of memory (many gigabytes), the compilers feel that the swap (speed over size) is a reasonable one.
Unfortunately, this problem becomes a killer when you attempt to send structures over a network or even write the binary data to a binary file. The padding inserted between elements of a structure or class can disrupt the data sent to the file or network. In order to write portable code (one that will go to several different compilers), you will probably have to access each element of the structure separately to ensure the proper "packing".
On the other hand, different compilers have different abilities to manage data structure packing. For example, in Visual C/C++ the compiler supports the #pragma pack command. This will allow you to adjust data packing and alignment.
For example:
#pragma pack 1
struct MyStruct
{
int a;
char b;
int c;
short d;
} myData;
I = sizeof(myData);
I should now have the length of 11. Without the pragma, I could be anything from 11 to 14 (and for some systems, as much as 32), depending on the default packing of the compiler.
C99 N1256 standard draft
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
6.5.3.4 The sizeof operator:
3 When applied to an operand that has structure or union type,
the result is the total number of bytes in such an object,
including internal and trailing padding.
6.7.2.1 Structure and union specifiers:
13 ... There may be unnamed
padding within a structure object, but not at its beginning.
and:
15 There may be unnamed padding at the end of a structure or union.
The new C99 flexible array member feature (struct S {int is[];};) may also affect padding:
16 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply.
Annex J Portability Issues reiterates:
The following are unspecified: ...
The value of padding bytes when storing values in structures or unions (6.2.6.1)
C++11 N3337 standard draft
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
5.3.3 Sizeof:
2 When applied
to a class, the result is the number of bytes in an object of that class including any padding required for
placing objects of that type in an array.
9.2 Class members:
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. — end note ]
I only know enough C++ to understand the note :-)
It can do so if you have implicitly or explicitly set the alignment of the struct. A struct that is aligned 4 will always be a multiple of 4 bytes even if the size of its members would be something that's not a multiple of 4 bytes.
Also a library may be compiled under x86 with 32-bit ints and you may be comparing its components on a 64-bit process would would give you a different result if you were doing this by hand.
C language leaves compiler some freedom about the location of the structural elements in the memory:
memory holes may appear between any two components, and after the last component. It was due to the fact that certain types of objects on the target computer may be limited by the boundaries of addressing
"memory holes" size included in the result of sizeof operator. The sizeof only doesn't include size of the flexible array, which is available in C/C++
Some implementations of the language allow you to control the memory layout of structures through the pragma and compiler options
The C language provides some assurance to the programmer of the elements layout in the structure:
compilers required to assign a sequence of components increasing memory addresses
Address of the first component coincides with the start address of the structure
unnamed bit fields may be included in the structure to the required address alignments of adjacent elements
Problems related to the elements alignment:
Different computers line the edges of objects in different ways
Different restrictions on the width of the bit field
Computers differ on how to store the bytes in a word (Intel 80x86 and Motorola 68000)
How alignment works:
The volume occupied by the structure is calculated as the size of the aligned single element of an array of such structures. The structure should
end so that the first element of the next following structure does not the violate requirements of alignment
p.s More detailed info are available here: "Samuel P.Harbison, Guy L.Steele C A Reference, (5.6.2 - 5.6.7)"
The idea is that for speed and cache considerations, operands should be read from addresses aligned to their natural size. To make this happen, the compiler pads structure members so the following member or following struct will be aligned.
struct pixel {
unsigned char red; // 0
unsigned char green; // 1
unsigned int alpha; // 4 (gotta skip to an aligned offset)
unsigned char blue; // 8 (then skip 9 10 11)
};
// next offset: 12
The x86 architecture has always been able to fetch misaligned addresses. However, it's slower and when the misalignment overlaps two different cache lines, then it evicts two cache lines when an aligned access would only evict one.
Some architectures actually have to trap on misaligned reads and writes, and early versions of the ARM architecture (the one that evolved into all of today's mobile CPUs) ... well, they actually just returned bad data on for those. (They ignored the low-order bits.)
Finally, note that cache lines can be arbitrarily large, and the compiler doesn't attempt to guess at those or make a space-vs-speed tradeoff. Instead, the alignment decisions are part of the ABI and represent the minimum alignment that will eventually evenly fill up a cache line.
TL;DR: alignment is important.
In addition to the other answers, a struct can (but usually doesn't) have virtual functions, in which case the size of the struct will also include the space for the vtbl.
Among the other well-explained answers about memory alignment and structure padding/packing, there is something which I have discovered in the question itself by reading it carefully.
"Why isn't sizeof for a struct equal to the sum of sizeof of each member?"
"Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members"?
Both questions suggest something what is plain wrong. At least in a generic, non-example focused view, which is the case here.
The result of the sizeof operand applied to a structure object can be equal to the sum of sizeof applied to each member separately. It doesn't have to be larger/different.
If there is no reason for padding, no memory will be padded.
One most implementations, if the structure contains only members of the same type:
struct foo {
int a;
int b;
int c;
} bar;
Assuming sizeof(int) == 4, the size of the structure bar will be equal to the sum of the sizes of all members together, sizeof(bar) == 12. No padding done here.
Same goes for example here:
struct foo {
short int a;
short int b;
int c;
} bar;
Assuming sizeof(short int) == 2 and sizeof(int) == 4. The sum of allocated bytes for a and b is equal to the allocated bytes for c, the largest member and with that everything is perfectly aligned. Thus, sizeof(bar) == 8.
This is also object of the second most popular question regarding structure padding, here:
Memory alignment in C-structs
given a lot information(explanation) above.
And, I just would like to share some method in order to solve this issue.
You can avoid it by adding pragma pack
#pragma pack(push, 1)
// your structure
#pragma pack(pop)

Memory - Natural address boundary

Definition
Structure padding is the process of aligning data members of the structure in accordance with the memory alignment rules specified by the processor.
what is the memory alignment rule for Intel x86 processor?
As per my understanding, natural address boundaries for Intel-x86 processor is 32 bits each(i.e.,addressOffset%4==0)
So, In x86 processor,
struct mystruct_A {
char a;
int b;
char c;
};
will be constructed as,
struct mystruct_A {
char a;
char gap_0[3]; /* inserted by compiler: for alignment of b using array */
int b;
char c;
char gap_1[3]; /* for alignment of the whole struct using array */
};
what is the memory alignment rule for Intel x86-64 processor?
As per my understanding, natural address boundaries for Intel x86-64 processor is 64 bits each(i.e.,addressOffset%8==0)
So, In x86-64 processor,
struct mystruct_A {
char a;
int b;
char c;
};
will be constructed as,
struct mystruct_A {
char a;
char gap_0[7]; /* inserted by compiler: for alignment of b using array */
int b;
char c;
char gap_1[7]; /* for alignment of the whole struct using array */
};
If the above understanding is correct, then I would like to know why use an array of int for bit operation?
Recommends to use int sized data, as mentioned here, that says, because the most cost efficient access to memory is accessing int sized data.
Question:
Is this memory alignment rule that forces to declare int sized data for bit operations?
Addendum: this is valid for x86/-64 bit processors, but also for others. I am blindly assuming you're using those. For others, you should check the respective manuals.
If fasm automatically added fillers into my structs i'd go insane. In general, performance is better when accesses to memory are on a boundary corresponding to the size of the element you want to retrieve. That being said, it's not a definite necessity!
This article here might be worth a look: https://software.intel.com/en-us/articles/coding-for-performance-data-alignment-and-structures
Intel's suggestion for optimal layout is to start with the biggest elements first and going smaller as the structure increases. That way you'll stay aligned properly, as long as the first element is aligned properly. There are no three-byte elements, thus misalignment is out of the question and all the compiler might do is adding bytes at the end, which is the best way to make sure it won't ruin things if you choose to do direct memory accesses instead of using variables.
The safest procedure is to not rely on your compiler, but instead aligning the data properly yourself.
Fun Fact: loops work the same way. Padding NOPs in your code, before the start of a loop, can make a difference.

Why difference between structure variable is always giving 16bytes in 64 bit OS?

#include<stdio.h>
struct std
{
char c;
char e;
};
void main()
{
struct std p,q;
int start,last;
last= &q;
start= &p;
printf("Size:%u %u Address_Diff:%d \n",&p, &q,(last-start) );
}
I am learning structure alignment and padding while we executing above program every time difference between two structure variables showing 16 bytes. I am not understanding why the difference is 16 bytes but i'm expecting the difference is 2 bytes.
some examples:
4077309424 4077309440 diff:16
3600238672 3600238688 diff:16
3272011008 3272011024 diff:16
Generally compiler may pad bytes, to keep each structure variable start position to word or double-word alignment. I guess the choice of alignment position may be architecture and compiler dependent.
It seems your compiler is keeping 16-byte alignment for each structure variable. Sometimes compiler may pad bytes to elements of structure to keep word or double-word alignment. You can check where the bytes padding happening, I mean bytes or padded at element level or structure level.
Do print each element address (as below) and see the difference between them.
printf("Size:%u %u Address_Diff:%d \n",&p.c, &p.e);
If difference is just one byte, it means that bytes are padded at structure level i.e., at the end of structure.

C malloc offsets relative to struct definition locations (and padding)

C question:
Does malloc'ing a struct always result in linear placement from top to bottom of the data inside? As a second minor question: is there a standard on the padding size, or does it vary between 32 and 64 bit machines?
Using this test code:
#include <stdio.h>
#include <stdlib.h>
struct test
{
char a;
/* char pad[3]; // I'm assuming this happens from the compiler */
int b;
};
int main() {
int size;
char* testarray;
struct test* testarraystruct;
size = sizeof(struct test);
testarray = malloc(size * 4);
testarraystruct = (struct test *)testarray;
testarraystruct[1].a = 123;
printf("test.a = %d\n", testarray[size]); // Will this always be test.a?
free(testarray);
return 0;
}
On my machine, size is always 8. Therefore I check testarray[8] to see if it's the second struct's 'char a' field. In this example, my machine returns 123, but this obviously isn't proof it always is.
Does the C compiler have any guarantees that struct's are created in linear order?
I am not claiming the way this is done is safe or the proper way, this is a curiosity question.
Does this change if this is becomes C++?
Better yet, would my memory look like this if it malloc'd to 0x00001000?
0x00001000 char a // First test struct
0x00001001 char pad[0]
0x00001002 char pad[1]
0x00001003 char pad[2]
0x00001004 int b // Next four belong to byte b
0x00001005
0x00001006
0x00001007
0x00001008 char a // Second test struct
0x00001009 char pad[0]
0x0000100a char pad[1]
0x0000100b char pad[2]
0x0000100c int b // Next four belong to byte b
0x0000100d
0x0000100e
0x0000100f
NOTE: This question assumes int's are 32 bits
As far as I know, malloc for struct is not linear placement of data but it's a linear allocation of memory for the members with in the structure that too when you create an object of it.
This is also necessary for padding.
Padding also depends on the type of machine (i.e 32 bit or 64 bit).
The CPU fetches the memory based on whether it is 32 bit or 64 bit.
For 32 bit machine your structure will be:
struct test
{
char a; /* 3 bytes padding done to a */
int b;
};
Here your CPU fetch cycle is 32 bit i.e 4 bytes
So in this case (for this example) the CPU takes two fetch cycles.
To make it more clear in one fetch cycle CPU allocates 4 bytes of memory. So 3 bytes of padding will be done to "char a".
For 64 bit machine your structure will be:
struct test
{
char a;
int b; /* 3 bytes padding done to b */
}
Here the CPU fetch cycle is 8 bytes.
So in this case (for this example) the CPU takes one fetch cycles. So 3 bytes of padding here must be done to "int b".
However you can avoid the padding you can use #pragma pack 1
But this will not be efficient w.r.t time because here CPU fetch cycles will be more (for this example CPU fetch cycles will be 5).
This is tradeoff between CPU fetch cycles and padding.
For many CPU types, it is most efficient to read an N-byte quantity (where N is a power of 2 — 1, 2, 4, 8, sometimes 16) when it is aligned on an N-byte address boundary. Some CPU types will generate a SIGBUS error if you try to read an incorrectly aligned quantity; others will make extra memory accesses as necessary to retrieve an incorrectly aligned quantity. AFAICR, the DEC Alpha (subsequently Compaq and HP) had a mechanism that effectively used a system call to fix up a misaligned memory access, which was fiendishly expensive. You could control whether that was allowed with a program (uac — unaligned access control) which would stop the kernel from aborting the process and would do the necessary double reads.
C compilers are aware of the benefits and costs of accessing misaligned data, and go to lengths to avoid doing so. In particular, they ensure that data within a structure, or an array of structures, is appropriately aligned for fast access unless you hold them to ransom with quasi-standard #pragma directives like #pragma pack(1).
For your sample data structure, for a machine where sizeof(int) == 4 (most 32-bit and 64-bit systems), then there will indeed be 3 bytes of padding after an initial 1 byte char field and before a 4-byte int. If you use short s; after the single character, there would be just 1 byte of padding. Indeed, the following 3 structures are all the same size on many machines:
struct test_1
{
char a;
/* char pad[3]; // I'm assuming this happens from the compiler */
int b;
};
struct test_2
{
char a;
short s;
int b;
};
struct test_3
{
char a;
char c;
short s;
int b;
};
The C standard mandates that the elements of a structure are laid out in the sequence in which they are defined. That is, in struct test_3, the element a comes first, then c, then s, then b. That is, a is at the lowest address (and the standard mandates that there is no padding before the first element), then c is at an address higher than a (but the standard does not mandate that it will be one byte higher), then s is at an address higher than c, and that b is at an address higher than s. Further, the elements cannot overlap. There may be padding after the last element of a structure. For example in struct test_4, on many computers, there will be 7 bytes of padding between a and d, and there will be 7 bytes of padding after b:
struct test_4
{
char a;
double d;
char b;
};
This ensures that every element of an array of struct test_4 will have the d member properly aligned on an 8-byte boundary for optimal access (but at the cost of space; the size of the structure is often 24 bytes).
As noted in the first comment to the question, the layout and alignment of the structure is independent of whether the space is allocated by malloc() or on the stack or in global variables. Note that malloc() does not know what the pointer it returns will be used for. Its job is simply to ensure that no matter what the pointer is used for, there will be no misaligned access. That often means the pointer returned by malloc() will fall on an 8-byte boundary; on some 64-bit systems, the address is always a multiple of 16 bytes. That means that consecutive malloc() calls each allocating 1 byte will seldom produce addresses 1 byte apart.
For your sample code, I believe that standard does require that testdata[size] does equal 123 after the assignment. At the very least, you would be hard-pressed to find a compiler where it is not the case.
For simple structures containing plain old data (POD — simple C data types), C++ provides the same layout as C. If the structure is a class with virtual functions, etc, then the layout rules depend on the compiler. Virtual bases and the dreaded 'diamond of death' multiple inheritance, etc, also make changes to the layout of structures.

Questions about C bitfields

Is bitfield a C concept or C++?
Can it be used only within a structure? What are the other places we can use them?
AFAIK, bitfields are special structure variables that occupy the memory only for specified no. of bits. It is useful in saving memory and nothing else. Am I correct?
I coded a small program to understand the usage of bitfields - But, I think it is not working as expected. I expect the size of the below structure to be 1+4+2 = 7 bytes (considering the size of unsigned int is 4 bytes on my machine), But to my surprise it turns out to be 12 bytes (4+4+4). Can anyone let me know why?
#include <stdio.h>
struct s{
unsigned int a:1;
unsigned int b;
unsigned int c:2;
};
int main()
{
printf("sizeof struct s = %d bytes \n",sizeof(struct s));
return 0;
}
OUTPUT:
sizeof struct s = 12 bytes
Because a and c are not contiguous, they each reserve a full int's worth of memory space. If you move a and c together, the size of the struct becomes 8 bytes.
Moreover, you are telling the compiler that you want a to occupy only 1 bit, not 1 byte. So even though a and c next to each other should occupy only 3 bits total (still under a single byte), the combination of a and c still become word-aligned in memory on your 32-bit machine, hence occupying a full 4 bytes in addition to the int b.
Similarly, you would find that
struct s{
unsigned int b;
short s1;
short s2;
};
occupies 8 bytes, while
struct s{
short s1;
unsigned int b;
short s2;
};
occupies 12 bytes because in the latter case, the two shorts each sit in their own 32-bit alignment.
1) They originated in C, but are part of C++ too, unfortunately.
2) Yes, or within a class in C++.
3) As well as saving memory, they can be used for some forms of bit twiddling. However, both memory saving and twiddling are inherently implementation dependent - if you want to write portable software, avoid bit fields.
Its C.
Your comiler has rounded the memory allocation to 12 bytes for alignment purposes. Most computer memory syubsystems can't handle byte addressing.
Your program is working exactly as I'd expect. The compiler allocates adjacent bitfields into the same memory word, but yours are separated by a non-bitfield.
Move the bitfields next to each other and you'll probably get 8, which is the size of two ints on your machine. The bitfields would be packed into one int. This is compiler specific, however.
Bitfields are useful for saving space, but not much else.
Bitfields are widely used in firmware to map different fields in registers. This save a lot of manual bitwise operations which would have been necessary to read / write fields without it.
One disadvantage is you can't take address of bitfields.

Resources