Sizeof Structure in C [duplicate] - c

This question already has answers here:
Why isn't sizeof for a struct equal to the sum of sizeof of each member?
(13 answers)
Closed 8 years ago.
To find the size of a structure in C
#include <stdio.h>

struct student
{
    char name;
    int age;
    float weight;
};

int main(void)
{
    struct student s1;
    /* sizeof yields a size_t, so store it as one and print it with %zu */
    size_t i = sizeof(s1.name);
    size_t j = sizeof(s1.age);
    size_t k = sizeof(s1.weight);
    size_t l = sizeof(s1);
    printf("\n size of name %zu", i);
    printf("\n size of age %zu", j);
    printf("\n size of weight %zu", k);
    printf("\n size of s1 %zu", l);
    printf("\n");
    return 0;
}
My output is:
size of name 1
size of age 4
size of weight 4
size of s1 12
But the structure size should be the sum of the sizes of its members. Why am I getting 12 instead of 9 as the size of the structure variable s1? Can someone explain what is wrong?

For performance or hardware reasons, fields in structures should be suitably aligned. Read about data structure alignment (details depend upon the target processor and the ABI).
In your example on x86-64/Linux:
struct student {
char name;
int age;
float weight;
};
the field name needs 1 byte and has no alignment requirement; it is followed by 3 bytes of padding so that age starts at offset 4
the field age needs 4 bytes aligned to a multiple of 4
the field weight needs 4 bytes aligned to a multiple of 4
so the overall struct student needs 1 + 3 + 4 + 4 = 12 bytes aligned to a multiple of 4
If weight was declared double it would need 8 bytes aligned to a multiple of 8, and the entire structure would need 16 bytes aligned to 8.
BTW, the type of your name field is wrong. Usually names are more than one single char. (My family name needs 13 letters + the terminating null byte, i.e. 14 bytes). Probably you should declare it a pointer char *name; (8 bytes aligned to 8) or an array e.g. char name[16]; (16 bytes aligned to 1 byte).
The GCC compiler provides a nice extension: __alignof__ and relevant type attributes.
If performance or size is important to you, you should put fields in struct in order of decreasing alignment requirements (so usually start with the double fields, then long and pointers, etc...)

Related

Struct memory allocation, memory allocation should be in multiple of 4

#include <stdio.h>

struct x
{
    char b;
    short s;
    char bb;
};

int main(void)
{
    /* sizeof yields a size_t, which is printed with %zu */
    printf("%zu", sizeof(struct x));
    return 0;
}
Output is : 6
I ran this code with a 32-bit compiler. I expected the output to be 8 bytes.
My explanation: the char needs 1 byte, and the following short must start at a multiple of 2, so 1 byte of padding is inserted and the short takes 2 bytes; 4 bytes are allocated so far. The remaining char member takes 1 byte, but since memory is allocated in multiples of 4, the overall size should be 8 bytes.
The alignment requirement of a struct is that of the member with the maximum alignment. The max alignment here is for short, so probably 2. Hence b takes 1 byte plus 1 byte of padding, s takes 2 bytes, and bb takes 1 byte plus 1 byte of trailing padding, which gives 6.
The C struct memory layout is completely implementation-specific and you can't assume all of this.
Also, in the typical alignment of C structs a struct like this:
struct MyData
{
short Data1;
short Data2;
short Data3;
};
will also have sizeof = 6 because if the type "short" is stored in two bytes of memory then each member of the data structure depicted above would be 2-byte aligned. Data1 would be at offset 0, Data2 at offset 2, and Data3 at offset 4. The size of this structure would be 6 bytes.
See https://en.wikipedia.org/wiki/Data_structure_alignment

C memory allocation for struct with malloc

I am trying to understand the memory allocation in C for struct but I am stuck on it.
struct Person {
char *name;
int age;
int height;
int weight;
};
struct Person *Person_create(char *name, int age, int height, int weight)
{
struct Person *who = malloc(sizeof(struct Person));
assert(who != NULL);
who->age = age;
who->height = height;
who->weight = weight;
who->name = strdup(name);
return who;
}
int main(int argc, char *argv[])
{
struct Person *joe = Person_create("ABC", 10, 170, 60);
printf("Size of joe: %d\n", sizeof(*joe));
printf("1. Address of joe \t= %x\n", joe);
printf("2. Address of Age \t= %x\n", &joe->age);
printf("3. Address of Height \t= %x\n", &joe->height);
printf("4. Address of Weight \t= %x\n", &joe->weight);
printf("5. Address of name \t= %x\n", joe->name);
...
What I don't understand is the memory allocation for this struct. On my printout I see this:
Size of joe: 24
1. Address of joe = 602010
2. Address of Age = 602018
3. Address of Height = 60201c
4. Address of Weight = 602020
5. Address of name = 602030
Questions:
Why is there a gap between 1 and 2?
Why is there a gap between 4 and 5?
How is the size of *name calculated, given that name points only to the first char?
There is no gap between the address of the object joe and the address of data member age. This extent is occupied by data member name.
struct Person {
char *name;
int age;
//...
According to the output
1. Address of joe = 602010
2. Address of Age = 602018
name occupies 8 bytes; that is, sizeof( char * ) on your platform is equal to 8. Its address coincides with the address of the object joe itself.
In this statement
printf("5. Address of name \t= %x\n", joe->name);
you did not output the address of name itself. You printed the value stored in this pointer and this value is the address of the first character of a copy of the string literal "ABC" that was gotten by using strdup.
So there is a gap between values in the outputs 4 and 5 because they are different extents of memory. Data member weight belongs to object joe while the copy of the string literal "ABC" is stored outside the object. The object just has data member name that points to the first character of the copy of the literal.
As name is a pointer then its size is calculated like
sizeof( char * )
or
sizeof( joe->name )
and equal to 8 as I explained in the beginning of the post.
If you want to determine the length of the string literal you should use standard function strlen declared in header <string.h>. For example
printf( "%zu\n", strlen( joe->name ) );
Why there is a gap between the 1 and 2?
A struct's start address is always equal to the address of its first member. From the C standard:
6.7.2.1-13. A pointer to a structure object, suitably converted, points to its initial member
The first member is not age, but name. So the following two lines should print the same address:
printf("1. Address of joe \t= %x\n", joe);
printf("1. Address of name-pointer \t= %x\n", &joe->name);
In your code,
printf("5. Address of name \t= %x\n", joe->name);
does not print the address of the pointer, but the address of the data the pointer points to.
How the size of *name is being calculated as the name points only to first char?
name is a pointer, which occupies 8 bytes of memory regardless of the size of the data it points to (which may be a string as in your case, a single char, an int or whatever).
Why there is a gap between the 4 and 5?
The memory for storing the actual name string is not within the struct - strdup allocates memory somewhere to duplicate the string into. This happens to be 16 bytes after the last member of your struct. This memory location is then pointed to by your name pointer.
Note that padding and memory alignment affect only the size of the struct (they do not matter for the questions you asked explicitly). Since the struct contains one pointer (8 bytes on your machine) and 3 integers (4 bytes each), one would assume that the total size is 20 bytes. However, the pointer member requires 8-byte alignment on your platform, so the struct as a whole is 8-byte aligned and its size is rounded up to the next multiple of 8, which is 24 bytes. This way, if you declare an array of Persons, each array element starts at an address that is 8-byte aligned, i.e., the address value is evenly divisible by 8.
The only things the C standard guarantees are that the address of the first member is the same as the address of the structure, and that the addresses of subsequent members increase with their position in the structure.
Compilers are allowed to insert spaces between members; this is called padding. Regard it as the compiler optimising the structure for a particular platform.
Arrays must always be contiguous in memory though.
It is due to something called Data alignment. To quote from this website
Every data type in C/C++ will have alignment requirement (in fact it is mandated by processor architecture, not by language).
And then extending this requirement for structures:
Because of the alignment requirements of various data types, every member of structure should be naturally aligned.
You can go through this article for a detailed read..
The memory layout of the struct is machine dependent, so you should not bother with that unless you are trying to implement a DBMS or a device driver or something like that.
sizeof(*name) would be equal to sizeof(char). I do not see what confused you here; can you explain further?

Different Result than calculated,using the SIZEOF operator? [duplicate]

I was writing code to check the size of int, char and a struct, but it's giving a different result from the one I calculated manually.
#include<stdio.h>
struct person
{
int roll;
char name[10];
};
int main(void)
{
    struct person p1;
    /* sizeof yields a size_t, which is printed with %zu */
    printf("\n The size of the integer on machine is \t :: %zu \n ", sizeof(int));
    printf("\n The size of the char on machine is \t :: %zu \n ", sizeof(char));
    printf("\n The size of structre is \t :: %zu \n", sizeof(struct person));
    printf("\n The size of structre is \t :: %zu \n", sizeof(p1));
    return 0;
}
I think the structure should have size = 10 * 1 + 4 = 14. But the output is
The size of the integer on machine is :: 4
The size of the char on machine is :: 1
The size of structre is :: 16
See what wikipedia says!
To calculate the size of any object type, the compiler must take into account any address alignment that may be needed to meet efficiency or architectural constraints. Many computer architectures do not support multiple-byte access starting at any byte address that is not a multiple of the word size, and even when the architecture allows it, usually the processor can fetch a word-aligned object faster than it can fetch an object that straddles multiple words in memory.[4] Therefore, compilers usually align data structures to at least a word alignment boundary, and also align individual members to their respective alignment boundaries. In the following example, the structure student is likely to be aligned on a word boundary, which is also where the member grade begins, and the member age is likely to start at the next word address. The compiler accomplishes the latter by inserting unused "padding" bytes between members as needed to satisfy the alignment requirements. There may also be padding at the end of a structure to ensure proper alignment in case the structure is ever used as an element of an array.
Thus, the aggregate size of a structure in C can be greater than the sum of the sizes of its individual members. For example, on many systems the following code will print 8:
struct student{
char grade; /* char is 1 byte long */
int age; /* int is 4 bytes long */
};
printf("%zu", sizeof (struct student));
You should try altering the size of the char array in your structure for a better understanding
for example:
struct person
{
int roll;
char name[4];
};
Gives answer as 8
struct person
{
int roll;
char name[7];
};
Gives answer as 12
First, you need to change %d to %zu because sizeof yields a value of type size_t.
sizeof(p1) gives 16 bytes instead of 14 because 2 bytes of padding are added to the structure.
struct person
{
int roll; // 4 bytes
char name[10]; // 10 bytes. 2 bytes are needed for structure alignment
};

How does gcc calculate the required space for a structure?

struct {
integer a;
struct c b;
...
}
In general how does gcc calculate the required space? Is there anyone here who has ever peeked into the internals?
I have not "peeked at the internals", but it's pretty clear, and any sane compiler will do it exactly the same way. The process goes like:
Begin with size 0.
For each element, round size up to the next multiple of the alignment for that element, then add the size of that element.
Finally, round size up to the least common multiple of the alignments of all members.
Here's an example (assume int is 4 bytes and has 4 byte alignment):
struct foo {
char a;
int b;
char c;
};
Size is initially 0.
Round to alignment of char (1); size is still 0.
Add size of char (1); size is now 1.
Round to alignment of int (4); size is now 4.
Add size of int (4); size is now 8.
Round to alignment of char (1); size is still 8.
Add size of char (1); size is now 9.
Round to lcm(1,4) (4); size is now 12.
Edit: To address why the last step is necessary, suppose instead the size were just 9, not 12. Now declare struct foo myfoo[2]; and consider &myfoo[1].b, which is 13 bytes past the beginning of myfoo and 9 bytes past &myfoo[0].b. This means it's impossible for both myfoo[0].b and myfoo[1].b to be aligned to their required alignment (4).
There's no truly standardized way of aligning a struct, but the rule of thumb goes like this: The entire struct is aligned at a 4 or 8 byte boundary (depending on the platform). Within the struct, each member is aligned by its size. So the following packs with no padding:
char // 1
char
char
char
short int // 2
short int
int // 4
This will have a total size of 12. However, this next one will cause padding:
char // 1, + 1 bytes padding
short // 2
int // 4
char // 1, + 1 byte padding
short // 2
char // 1
char // 1, + 2 bytes padding
Now the structure takes up 16 bytes.
This is just a typical example; the details will depend on your platform. Sometimes you can tell a compiler to never add any padding. This causes more expensive memory access (possibly introducing concurrency problems) but will save space.
To lay out aggregates as efficiently as possible, order the members by size, starting with the biggest.
The size of a structure is implementation-defined, and it is hard to say what the size of your structure will be without more information (the definition is incomplete). For instance, consider this struct:
struct MyStruct {
int abc;
int def;
char temp;
};
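This struct yields a size of 9 on my compiler: 4 bytes for each int and 1 byte for the char. Most compilers will report 12 instead, because they pad the end of the struct to a multiple of the int alignment unless packing is enabled.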
Have modified your code so that it compiles and ran it on Eclipse/Microsoft C compiler platform:
struct c {
int a;
struct c *b;
};
struct c d;
printf("\nsizeof c=%d, sizeof a=%d, sizeof b=%d",
sizeof(d), sizeof(d.a), sizeof(d.b));
printf("\naddrof c =%08x", &d);
printf("\naddrof c.a=%08x", &d.a);
printf("\naddrof c.b=%08x", &d.b);
The above code fragment produced the following output:
sizeof c=8, sizeof a=4, sizeof b=4
addrof c =0012ff38
addrof c.a=0012ff38
addrof c.b=0012ff3c
Do something like this so you can see (WITHOUT GUESSING) exactly how your compiler formats a structure.

Read binary data (from file) into a struct

I'm reading binary data from a file, specifically from a zip file. (To know more about the zip format structure see http://en.wikipedia.org/wiki/ZIP_%28file_format%29)
I've created a struct that stores the data:
typedef struct {
/*Start Size Description */
int signature; /* 0 4 Local file header signature = 0x04034b50 */
short int version; /*  4 2 Version needed to extract (minimum) */
short int bit_flag; /*  6 2 General purpose bit flag */
short int compression_method; /*  8 2 Compression method */
short int time; /* 10 2 File last modification time */
short int date; /* 12 2 File last modification date */
int crc; /* 14 4 CRC-32 */
int compressed_size; /* 18 4 Compressed size */
int uncompressed_size; /* 22 4 Uncompressed size */
short int name_length; /* 26 2 File name length (n) */
short int extra_field_length; /* 28 2 Extra field length (m) */
char *name; /* 30 n File name */
char *extra_field; /*30+n m Extra field */
} ZIP_local_file_header;
The size returned by sizeof(ZIP_local_file_header) is 40, but if the sum of each field is calculated with sizeof operator the total size is 38.
If we have the next struct:
typedef struct {
short int x;
int y;
} FOO;
sizeof(FOO) returns 8 because the memory is allocated 4 bytes at a time. So 4 bytes are reserved for x (although its real size is 2 bytes). If the next member were another short int, it would fill the remaining 2 bytes of the previous allocation. But since the next member is an int, another 4 bytes are allocated and the 2 empty bytes are wasted.
To read data from file, we can use the function fread:
ZIP_local_file_header p;
fread(&p,sizeof(ZIP_local_file_header),1,file);
But as there're empty bytes in the middle, it isn't read correctly.
What can I do to sequentially and efficiently store data with ZIP_local_file_header wasting no bytes?
In order to meet the alignment requirements of the underlying platform, structs may have "padding" bytes between members so that each member starts at a properly aligned address.
There are several ways around this: one is to read each element of the header separately using the appropriately-sized member:
fread(&p.signature, sizeof p.signature, 1, file);
fread(&p.version, sizeof p.version, 1, file);
...
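Another is to use bit fields in your struct definition, which let you specify the width of each member in bits. The downsides are that bit fields must be unsigned int or int or, as of C99, _Bool, and that their layout is implementation-defined: a compiler may still insert padding when a field does not fit in the current storage unit, so check sizeof before relying on this approach. You may also have to cast the raw data to the target type to interpret it correctly: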
typedef struct {
unsigned int signature : 32;
unsigned int version : 16;
unsigned int bit_flag : 16;
unsigned int compression_method : 16;
unsigned int time : 16;
unsigned int date : 16;
unsigned int crc : 32;
unsigned int compressed_size : 32;
unsigned int uncompressed_size : 32;
unsigned int name_length : 16;
unsigned int extra_field_length : 16;
} ZIP_local_file_header;
You may also have to do some byte-swapping in each member if the file was written in big-endian but your system is little-endian.
Note that name and extra field aren't part of the struct definition; when you read from the file, you're not going to be reading pointer values for the name and extra field, you're going to be reading the actual contents of the name and extra field. Since you don't know the sizes of those fields until you read the rest of the header, you should defer reading them until after you've read the structure above. Something like
ZIP_local_file_header p;
char *name = NULL;
char *extra = NULL;
...
fread(&p, sizeof p, 1, file);
if (name = malloc(p.name_length + 1))
{
fread(name, p.name_length, 1, file);
name[p.name_length] = 0;
}
if (extra = malloc(p.extra_field_length + 1))
{
fread(extra, p.extra_field_length, 1, file);
extra[p.extra_field_length] = 0;
}
C structs are just about grouping related pieces of data together, they do not specify a particular layout in memory. (Just as the width of an int isn't defined either.) Little-endian/Big-endian is also not defined, and depends on the processor.
Different compilers, or the same compiler on different architectures or operating systems, will all lay out structs differently.
As the file format you want to read is defined in terms of which bytes go where, a struct, although it looks very convenient and tempting, isn't the right solution. You need to treat the file as a char[] and pull out the bytes you need and shift them in order to make numbers composed of multiple bytes, etc.
The solution is compiler-specific, but for instance in GCC, you can force it to pack the structure more tightly by appending __attribute__((packed)) to the definition. See http://gcc.gnu.org/onlinedocs/gcc-3.2.3/gcc/Type-Attributes.html.
It's been a while since I worked with zip-compressed files, but I do remember the practice of adding my own padding to hit the 4-byte alignment rules of PowerPC arch.
At best you simply need to define each element of your struct to the size of the piece of data you want to read in. Don't just use 'int' as that may be platform/compiler defined to various sizes.
Do something like this in a header (using the fixed-width types from <stdint.h>, because unsigned long is 8 bytes on many 64-bit platforms):
#include <stdint.h>
typedef uint32_t unsigned32;
typedef uint16_t unsigned16;
typedef uint8_t unsigned8;
typedef uint8_t byte;
Then instead of just int use an unsigned32 where you have a known 4-byte value, and an unsigned16 for any known 2-byte value.
This will help you see where you can add padding bytes to hit 4-byte alignment, or where you can group 2, 2-byte elements to make up a 4-byte alignment.
Ideally you can use a minimum of padding bytes (which can be used to add additional data later as you expand the program) or none at all if you can align everything to 4-byte boundaries with variable-length data at the end.
Also, the name and extra_field will not contain any meaningful data, most likely. At least not between runs of the program, since these are pointers.
