struct xyz {
int a;
int b;
char c[0];
};
struct xyz x1;
printf("Size of structure is %zu", sizeof(x1));
Output: 8
Why isn't the size of the structure 9 bytes? Is it because the declared character array has size 0?
Zero-length arrays are not part of standard C, but many compilers allow them as an extension.
The idea is that they must be placed as the very last field in a struct, but they don't occupy any bytes. The struct works as a header for the array that is placed just next to it in memory.
For example:
struct Hdr
{
int a, b, c;
struct Foo foos[0];
};
struct Hdr *buffer = malloc(sizeof(struct Hdr) + 10*sizeof(struct Foo));
buffer->a = ...;
buffer->foos[0] = ...;
buffer->foos[9] = ...;
The standard way to do that is to declare an array of size 1 and then subtract that one element when computing the allocation size. But even that technique is controversial...
For more details and the similar flexible array member see this document.
Your array of characters has a length of 0, and hence the size of c is 0 bytes. Therefore, when your compiler lays out the structure, it only counts the two integers, and since int is 4 bytes on your platform (assuming so from your result), the size of the structure is 8 bytes.
Remark: You can still access the field c without any compiler warnings (compiled with gcc); however, it refers to memory past the end of the structure, so reading it yields some garbage value.
An array of length 0 is actually not permitted in standard C, but apparently your compiler supports it as an extension.
It's one way of implementing the so-called "struct hack", explained in question 2.6 of the comp.lang.c FAQ.
Because C implementations typically don't do bounds checking for arrays, a zero-element array (or in a more portable variant, a one-element array) gives you a base address for an array of arbitrary size. You have to allocate, typically using malloc, enough memory for the enclosing struct so that there's room for as many array elements as you need:
struct xyz *ptr = malloc(sizeof *ptr + COUNT * sizeof (char));
C99 added a new feature, "flexible array members", that does the same thing without specifying a fake array size:
struct xyz {
int a;
int b;
char c[];
};
In this case:
int a; // 4 bytes
int b; // 4 bytes
char c[0]; // 0 bytes
So it is 8.
Also, sizeof would never return 9 for your struct, even if you gave the array a size of 1:
int a; // 4 bytes
int b; // 4 bytes
char c[1]; // 1 byte, but the struct is padded to keep the other members aligned
Laid out like this:
^^^^
^^^^
^~~~ // padding for alignment
So sizeof returns 12, not 9.
Struct members in C are laid out contiguously in declaration order (first member, then the second, then the third, and so on), with padding between them. Why, then, is the size of this structure the same even when its members are rearranged?
#include <stdio.h>
int main(void)
{
struct student
{
float c;
int a;
char b;
};
printf("%zu\n", sizeof(struct student));
return 0;
}
Output:
12
Does the memory layout look like this for the above structure?
f f f f i i i i c
_ _ _ _ _ _ _ _ _ _ _
0 1 2 3 4 5 6 7 8 9 10
It appears that on your machine both int and float require 4 bytes each. In this sense, their location in the struct is irrelevant.
However, a char takes (usually) only 1 byte, so you might wonder how come sizeof doesn't return 9 (4+4+1) and the reason is padding.
Padding is added in a number of situations, but most obviously to allow for type alignment (I assume that both the int and float types on your system are naturally aligned on the 4 byte boundary).
I assume that this would make perfect sense if the order was changed to:
struct student
{
float c;
char b;
int a;
};
In this example we would have 4 bytes (float) + 1 byte (char) + 3 bytes (padding) + 4 bytes (int). i.e.:
struct student
{
float c;
char b;
char padding[3];
int a;
};
However, in your original example we have:
struct student
{
float c;
int a;
char b;
};
This results in 4 bytes (float) + 4 bytes (int) + 1 byte (char) + 3 bytes (padding) - i.e.:
struct student
{
float c;
int a;
char b;
char padding[3];
};
The reason we still get padding at the end of the struct is to allow for arrays (struct student array[32]).
If there weren't any padding at the end of the struct, then the second element of the array (array[1]) would start at offset 9, and its first member (the float) wouldn't be properly aligned on its natural 4-byte boundary.
When you declare the type, the compiler always adds the padding required for the type to be usable in an array (including one allocated with malloc).
I hope this answers your question.
EDIT:
To clarify the padding in the remaining struct that I hadn't listed above (see comment), it would probably look something like this (assuming the compiler is compiling code for a similar system):
struct student
{
char b;
char padding[3];
float c;
int a;
};
If a was a char as well, we would get padding in both ends of the struct:
struct student
{
char b;
char padding[3];
float c;
char a;
char padding2[3];
};
However, if we re-organized the struct so the chars were next to each other, their padding would be different since the char type doesn't have a natural alignment of 4 bytes on this system:
struct student
{
char b;
char a;
char padding[2];
float c;
};
Note:
Most compilers support a directive that tells the compiler to "pack" the structure (ignore type alignment and padding)... this, however, should be highly discouraged IMHO since misaligned accesses can crash on some CPU architectures and it introduces non-portable code (see also here).
On your system, the int and float data types appear to have sizes of 4 bytes each; thus, the compiler (unless told otherwise) will align structure members of those types on 4-byte boundaries. This is why, if you have the char b field as the second member, then 3 bytes of 'padding' will be added between that and the next field - giving a total size of 12 bytes for the structure.
However, the compiler will also add padding at the end of the structure! (Three bytes, again, in the case when char b is the last field.)
Why? Well, consider an array of such struct types. Without that 'terminal padding', the first field of the second array element would be misaligned – that is, it would not be on a 4-byte boundary, thus reducing any efficiency gained from 'internal' padding. Similar issues would arise for other structures that have your type included as a nested field.
EDIT: I can't really offer much improvement on the following statement from this Wikipedia page:
It is important to note that the last member is padded with the number
of bytes required so that the total size of the structure should be a
multiple of the largest alignment of any structure member …
The compiler aligns the fields and pads the remainder of the struct so that, if you build an array, the next element will be aligned too.
In your posted case, the float and int fields each have a 4-byte alignment requirement, so that is the alignment of the whole structure. The compiler therefore pads the end of the structure so that the next array element of that type is also aligned. This means it has to add three padding bytes after the char field, even though that field is at the end of the structure.
If the struct were only 9 bytes (4 + 4 + 1), the next element of an array of such structs would not be aligned, as its float and int would start one byte past a 4-byte boundary.
typedef struct {
int num;
char arr[64];
} A;
typedef struct {
int num;
char arr[];
} B;
I declared A* a; and then put some data into it. Now I want to cast it to a B*.
A* a;
a->num = 1;
strcpy(a->arr, "Hi");
B* b = (B*)a;
Is this right?
I get a segmentation fault sometimes (not always), and I wonder if this could be the cause of the problem.
I got a segmentation fault even though I didn't try to access char arr[] after the cast.
This defines a pointer variable
A* a;
It does not point to anything valid yet; the pointer is uninitialised.
This accesses whatever it is pointing to
a->num = 1;
strcpy(a->arr, "Hi");
Without allocating anything to the pointer beforehand (by e.g. using malloc()) this is asking for segfaults as one possible consequence of the undefined behaviour it invokes.
This is an addendum to Yunnosch's answer, which identifies the problem correctly. Let's assume you do it correctly and either write just
A a;
which gives you an object of automatic storage duration when declared inside a function, or you dynamically allocated an instance of A like this:
A *a = malloc(sizeof *a);
if (!a) return -1; // or whatever else to do in case of allocation error
Then, the next thing is your cast:
B* b = (B*)a;
This is not correct: types A and B are not compatible. Here, it will probably work in practice because the struct members are compatible, but beware that strange things can happen because the compiler is allowed to assume a and b point to different objects, as their types are not compatible. For more information, read up on what's commonly called the strict aliasing rule.
You should also know that an incomplete array type (without a size) is only allowed as the very last member of a struct. With a definition like yours:
typedef struct {
int num;
char arr[];
} B;
the member arr is allowed to have any size, but it's your responsibility to allocate it correctly. The size of B (sizeof(B)) doesn't include this member. So if you just write
B b;
you can't store anything in b.arr, it has a size of 0. This last member is called a flexible array member and can only be used correctly with dynamic allocation, adding the size manually, like this:
B *b = malloc(sizeof *b + 64);
This gives you an instance *b with an arr of size 64. If the array doesn't have the type char, you must multiply manually with the size of your member type -- it's not necessary for char because sizeof(char) is by definition 1. So if you change the type of your array to something different, e.g. int, you'd write this to allocate it with 64 elements:
B *b = malloc(sizeof *b + 64 * sizeof *(b->arr));
It appears that you are confusing two different topics. In C99/C11 char arr[]; as the last member of a structure is a Flexible Array Member (FAM) and it allows you to allocate for the structure itself and N number of elements for the flexible array. However -- you must allocate storage for it. The FAM provides the benefit of allowing one-allocation and one-free where there would normally be two required. (In C89 a similar implementation went by the name struct hack, but it was slightly different).
For example, B *b = malloc (sizeof *b + 64 * sizeof *b->arr); would allocate storage for b plus 64-characters of storage for b->arr. You could then copy the members of a to b using the proper '.' and '->' syntax.
A short example can illustrate:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NCHAR 64 /* if you need a constant, #define one (or more) */
typedef struct {
int num;
char arr[NCHAR];
} A;
typedef struct {
int num;
char arr[]; /* flexible array member */
} B;
int main (void) {
A a = { 1, "Hi" };
B *b = malloc (sizeof *b + NCHAR * sizeof *b->arr);
if (!b) {
perror ("malloc-b");
return 1;
}
b->num = a.num;
strcpy (b->arr, a.arr);
printf ("b->num: %d\nb->arr: %s\n", b->num, b->arr);
free (b);
return 0;
}
Example Use/Output
$ ./bin/struct_fam
b->num: 1
b->arr: Hi
Look things over and let me know if that helps clear things up. Also let me know if you were asking something different. It is a little unclear exactly where your confusion lies.
I have the two pieces of code below. What's the difference between them?
In the first one, the address of the buf member is 4 bytes higher than the address of the struct, while in the second one it is not.
First
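#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct A
{
int i;
char buf[]; //Here
}A;
int main()
{
A *pa = malloc(sizeof(A) + 13); // leave room for the bytes copied into buf
char *p = malloc(13);
memcpy(p, "helloworld", 11); // include the terminating '\0'
memcpy(pa->buf, p, 11);
printf("%p %p %td %s\n", (void *)pa->buf, (void *)pa, (char *)pa->buf - (char *)pa, pa->buf);
}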
Second
typedef struct A
{
int i;
char *buf; //Here
}A;
The first is a C99 'flexible array member'. The second is the reliable fallback for when you don't have C99 or later.
With a flexible array member, you allocate the space you need for the array along with the main structure:
A *pa = malloc(sizeof(A) + strlen(string) + 1);
pa->i = index;
strcpy(pa->buf, string);
...use pa...
free(pa);
As far as the memory allocation goes, the buf member has no size (so sizeof(A) == sizeof(int) unless there are padding issues because of array alignment — eg if you had a flexible array of double).
The alternative requires either two allocations (and two releases), or some care in the setup:
typedef struct A2
{
int i;
char *buf;
} A2;
A2 *pa2 = malloc(sizeof(A2));
pa2->buf = strdup(string);
...use pa2...
free(pa2->buf);
free(pa2);
Or:
A2 *pa2 = malloc(sizeof(A2) + strlen(string) + 1);
pa2->buf = (char *)pa2 + sizeof(A2);
...use pa2...
free(pa2);
Note that using A2 requires more memory, either by the size of the pointer (single allocation), or by the size of the pointer and the overhead for the second memory allocation (double allocation).
You will sometimes see something known as the 'struct hack' in use; this predates the C99 standard and is obsoleted by flexible array members. The code for this looks like:
typedef struct A3
{
int i;
char buf[1];
} A3;
A3 *pa3 = malloc(sizeof(A3) + strlen(string) + 1);
strcpy(pa3->buf, string);
This is almost the same as a flexible array member, but the structure is bigger. In the example, on most machines, the structure A3 would be 8 bytes long (instead of 4 bytes for A).
GCC has some support for zero length arrays; you might see the struct hack with an array dimension of 0. That is not portable to any compiler that is not mimicking GCC.
It's called the 'struct hack' because it is not guaranteed to be portable by the language standard (because you are accessing outside the bounds of the declared array). However, empirically, it has 'always worked' and probably will continue to do so. Nevertheless, you should use flexible array members in preference to the struct hack.
ISO/IEC 9899:2011 §6.7.2.1 Structure and union specifiers
¶3 A structure or union shall not contain a member with incomplete or function type (hence,
a structure shall not contain an instance of itself, but may contain a pointer to an instance
of itself), except that the last member of a structure with more than one named member
may have incomplete array type; such a structure (and any union containing, possibly
recursively, a member that is such a structure) shall not be a member of a structure or an
element of an array.
¶18 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. However, when a . (or ->) operator has a left operand that is
(a pointer to) a structure with a flexible array member and the right operand names that
member, it behaves as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object being accessed; the
offset of the array shall remain that of the flexible array member, even if this would differ
from that of the replacement array. If this array would have no elements, it behaves as if
it had one element but the behavior is undefined if any attempt is made to access that
element or to generate a pointer one past it.
struct A {
int i;
char buf[];
};
does not reserve any space for the array, or for a pointer to an array. What this says is that an array can directly follow the body of A and be accessed via buf, like so:
struct A *a = malloc(sizeof(*a) + 6);
strcpy(a->buf, "hello");
assert(a->buf[0] == 'h');
assert(a->buf[5] == '\0');
Note I reserved 6 bytes following a for "hello" and the nul terminator.
The pointer form uses an indirection (the memory could be contiguous, but this is neither depended on nor required)
struct B {
int i;
char *buf;
};
/* requiring two allocations: */
struct B *b1 = malloc(sizeof(*b1));
b1->buf = strdup("hello");
/* or some pointer arithmetic */
struct B *b2 = malloc(sizeof(*b2) + 6);
b2->buf = (char *)((&b2->buf)+1);
The second is now laid out the same as a above, except with a pointer between the integer and the char array.
Suppose I have:
int (* arrPtr)[10] = NULL; // A pointer to an array of ten elements with type int.
int (*ptr)[3]= NULL;
int var[10] = {1,2,3,4,5,6,7,8,9,10};
int matrix[3][10];
Now if I do,
arrPtr = matrix; //.....This is fine...
Now can I do this:
ptr = var; //.....***This is working***
OR is it compulsory to do this:
ptr= (int (*)[10])var; //....I don't understand why this is necessary
Also,
printf("%d",(*ptr)[4]);
is working even though we declare
int (*ptr)[3]=NULL;
^^^
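In most expressions, the name of an array is converted to a pointer to its first element.
So, when you do,
ptr = var;
you are assigning the address of var[0] to ptr.
int var[10] declares an array, not a pointer, but in this assignment var decays to an int *.
ptr is an int (*)[3], so the two pointer types are not compatible. Most compilers only warn about it, and it appears to work because both pointers hold the same address.
For the second question,
when you declare a pointer, it points to some address.
Say
int *ptr = (int *)0x1234; //Some random address
now when you write ptr[3], it refers to the int at address 0x1234 + (sizeof(int) * 3).
So indexing through a pointer works irrespective of any declared array size; the compiler does no bounds checking.
But when ptr = NULL,
(*ptr)[4] would access address NULL + (sizeof(int) * 4),
i.e. it compiles, but dereferencing a null pointer is undefined behaviour.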
ptr and var aren't compatible pointers because ptr is a pointer to an array of 3 ints and var is an array of 10 ints, 3 ≠ 10.
(*ptr)[4] works likely because the compiler doesn't do rigorous boundary checks when indexing arrays. This probably has to do with the fact that a lot of existing C code uses variable-size structures defined something like this:
typedef struct
{
int type;
size_t size; // actual number of chars in data[]
unsigned char data[1];
} DATA_PACKET;
The code allocates more memory to a DATA_PACKET* pointer than sizeof(DATA_PACKET), here it would be sizeof(DATA_PACKET)-1+how many chars need to be in data[].
So, the compiler ignores index=4 when dereferencing (*ptr)[4] even though it's >= 3 in the declaration int (*ptr)[3].
Also, the compiler cannot always keep track of arrays and their sizes when accessing them through pointers. Code analysis is hard.
ptr is a pointer to an array of 3 integers, so ptr[0] is the first such array, ptr[1] is the second array, and so on.
In your case:
printf("%d",(*ptr)[4]);
works as you print element no. 5 counting from the start of the first array
and
printf("%d",(*(ptr+1))[4]);
would print element no. 5 of the second array (which of course doesn't exist)
for example the following is the same as yours
printf("%d",ptr[0][4]);
but this doesn't mean that you should depend on this. As var is an array of 10 integers, ptr should be declared as
int *ptr = var;
in this case to print element no. 5
printf("%d", ptr[4]);
#define STRMAX 50
struct Person {
char sName[STRMAX];
int iAge;
};
typedef struct Person PERSON;
int main() {
PERSON *personen[1];
personen[0]->sName = "Pieter";
personen[0]->iAge = 18;
return 0;
}
This code generates an error on personen[0]->sName = "Pieter"; saying incompatible types in assignment. Why?
You don't want an array of pointers. Try
PERSON personen[1];
And like others have said, use the strcpy function!
Don't try to assign arrays. Use strcpy to copy the string from one array to the other.
...sName is an array of chars, while "Pieter" is a string literal that decays to a char * in this expression. You cannot assign the latter to the former. The compiler is always right :)
Change
PERSON *personen[1];
to
PERSON personen[1];
and use strcpy to copy the string.
strcpy(personen[0].sName,"Pieter");
I agree with the above but I figured it was also important to include the "why"
int a; // an integer
int *b; // pointer to an integer; must be given memory (e.g. with malloc) before it can be used as an array
int c[]; // array of unknown size (an incomplete type), not a pointer; it cannot be given memory with malloc
int d[5]; // an array of 5 integers; the storage is part of d itself
to give b memory to match d, use the following
b = (int *) malloc(sizeof(int)*5);
where the pointer returned from malloc is cast to an int pointer (the cast is optional in C), and a memory block of 5 times the size of an integer is allocated (thus it will hold 5 integers like d)