Defined an empty array in c - c

What actually happened in following program, I have defined an empty array int arr[]; in code and with the GCC compiler, but the compiler doesn't give an error. It's successfully worked.
#include <stdio.h>
#include <stdlib.h>
typedef struct st
{
int i;
int arr[];
}ST,*ptr;
int main()
{
ST s1;
ptr p1= (ptr)malloc(sizeof(ST)+4*sizeof(int));
p1->i=10;
p1->arr[0]=1;
p1->arr[3] = 1;
printf("%d\n",p1->arr[3]);
printf("%ld\n", sizeof(s1));
}
C language not allowed undefined array length. but GCC compiler allowed. Why?
Just curious, what actually happens?

It's valid and allowed since C99. It's called flexible array member - useful to make the last member of the struct variable-length and it must be the last member of the struct if used.

You are looking at a C99 feature, Flexible array member. If the array is defined at the end of the struct, then you will define its size during allocation via malloc, by allocating a bigger chunk of memory.
for an array of size len:
struct st *mySt = malloc(sizeof(struct st) + len * sizeof(mySt->arr[0]));

C99, the structure of the last element allows the size of the array is unknown, which is called a flexible array member, but in front of the structure of a flexible array member must have at least one other member.
Flexible array members allowed to structure contains an array of variable size. This size of the structure sizeof returns do not include a flexible array of memory. Structure comprising a flexible array member using malloc () function allocates memory dynamically and allocates memory should be larger than the size of the structure to accommodate the expected size of the flexible array.
The basic model
typedef struct st
{
int i;
int arr[];
}ST,*ptr;
Tested under linux gcc about operating results for the structure, the last element of an array, and an array of undefined length, operating results consistent with theory, this is a bit like a c ++ class member functions, member functions is not occupied class space size. Well, this flexible array, there is the role of what is.

Related

Array of size 0 at the end of struct [duplicate]

This question already has answers here:
What's the need of array with zero elements?
(5 answers)
Closed 5 years ago.
My professor of a systems programming course I'm taking told us today to define a struct with a zero-length array at the end:
struct array{
size_t size;
int data[0];
};
typedef struct array array;
This is a useful struct to define or initialize an array with a variable, i.e., something as follows:
array *array_new(size_t size){
array* a = malloc(sizeof(array) + size * sizeof(int));
if(a){
a->size = size;
}
return a;
}
That is, using malloc(), we also allocate memory for the array of size zero. This is completely new for me, and it's seems odd, because, from my understanding, structs do not have their elements necessarily in continuous locations.
Why does the code in array_new allocate memory to data[0]? Why would it be legal to access then, say
array * a = array_new(3);
a->data[1] = 12;
?
From what he told us, it seems that an array defined as length zero at the end of a struct is ensured to come immediately after the last element of the struct, but this seems strange, because, again, from my understanding, structs could have padding.
I've also seen around that this is just a feature of gcc and not defined by any standard. Is this true?
Currently, there exists a standard feature, as mentioned in C11, chapter §6.7.2.1, called flexible array member.
Quoting the standard,
As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. [...]
The syntax should be
struct s { int n; double d[]; };
where the last element is incomplete type, (no array dimensions, not even 0).
So, your code should better look like
struct array{
size_t size;
int data[ ];
};
to be standard-conforming.
Now, coming to your example, of a 0-sized array, this was a legacy way ("struct hack") of achieving the same. Before C99, GCC supported this as an extension to emulate flexible array member functionality.
Your professor is confused. They should go read what happens if I define a zero size array. This is a non-standard GCC extension; it is not valid C and not something they should teach students to use (*).
Instead, use standard C flexible array member. Unlike your zero-size array, it will actually work, portably:
struct array{
size_t size;
int data[];
};
Flexible array members are guaranteed to count as zero when you use sizeof on the struct, allowing you to do things like:
malloc(sizeof(array) + sizeof(int[size]));
(*) Back in the 90s people used an unsafe exploit to add data after structs, known as the "struct hack". To provide a safe way to extend a struct, GCC implemented the zero-size array feature as a non-standard extension. It became obsolete in 1999 when the C standard finally provided a better way to do this.
Other answers explains that zero-length arrays are GCC extension and C allows variable length array but no one addressed your other questions.
from my understanding, structs do not have their elements necessarily in continuous locations.
Yes. struct data type do not have their elements necessarily in continuous locations.
Why does the code in array_new allocate memory to data[0]? Why would it be legal to access then, say
array * a = array_new(3);
a->data[1] = 12;
?
You should note that one of the the restriction on zero-length array is that it must be the last member of a structure. By this, compiler knows that the struct can have variable length object and some more memory will be needed at runtime.
But, you shouldn't be confused with; "since zero-length array is the last member of the structure then the memory allocated for zero-length array must be added to the end of the structure and since structs do not have their elements necessarily in continuous locations then how could that allocated memory be accessed?"
No. That's not the case. Memory allocation for structure members not necessarily be contiguous, there may be padding between them, but that allocated memory must be accessed with variable data. And yes, padding will have no effect over here. The rule is:
§6.7.2.1/15
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared.
I've also seen around that this is just a feature of gcc and not defined by any standard. Is this true?
Yes. As other answers already mentioned that zero-length arrays are not supported by standard C, but an extension of GCC compilers. C99 introduced flexible array member. An example from C standard (6.7.2.1):
After the declaration:
struct s { int n; double d[]; };
the structure struct s has a flexible array member d. A typical way to use this is:
int m = /* some value */;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
and assuming that the call to malloc succeeds, the object pointed to by p behaves, for most purposes, as if p had been declared as:
struct { int n; double d[m]; } *p;
(there are circumstances in which this equivalence is broken; in particular, the offsets of member d might not be the same).
A more standard way would be to define your array with a data size of 1, as in:
struct array{
size_t size;
int data[1]; // <--- will work across compilers
};
Then use the offset of the data member (not the size of the array) in the calculation:
array *array_new(size_t size){
array* a = malloc(offsetof(array, data) + size * sizeof(int));
if(a){
a->size = size;
}
return a;
}
This is effectively using array.data as a marker for where the extra data might go (depending on size).
The way I used to do it is without a dummy member at the end of the structure: the size of the structure itself tells you the address just past it. Adding 1 to the typed pointer goes there:
header * p = malloc (sizeof (header) + buffersize);
char * buffer = (char*)(p+1);
As for structs in general, you can know that the fields are layed out in order. Being able to match some imposed structure needed by a file format binary image, operating system call, or hardware is one advantage of using C. You have to know how the padding for alignment works, but they are in order and in one contiguous block.

Internal mechanism of sizeof in C?

I use sizeof to get size of a struct in C, but the result I got is unexpected.
struct sdshdr {
int len;
int free;
char buf[];
};
int main(){
printf("struct len:%d\n",(sizeof(struct sdshdr)));
return 0;
} //struct len:8, with or without buf
my question is why does buf not occupy any space and why is the size of the int type still 4 on a 64-bit CPU?
here is the output from gcc -v:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.4.0
Thread model: posix
The [] is a flexible array member. They do not count towards the total size of the struct, because the C standard explicitly says so:
6.7.2.1/18
As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply.
This is intentional by design, because the purpose of a flexible array member is to allow you to allocate trailing data dynamically after the struct. (When the struct is a file header, protocol header etc.)
Example including discussion about non-standard gcc extensions and the old pre-C99 "struct hack".
From C99 onwards the size of an array at the end of a struct may be omitted. For purposes of sizeof(struct) this array will appear to have zero size (although its presence may add some padding to the struct), but the intent is for its length to be flexible, i.e., when allocating space for the struct one must allocate the desired amount of extra space for the array at the end. (To avoid going out of bounds, the actual allocated length of the array should be stored somewhere.)
Before C99 it was a fairly common hack to have an array of size 1 (or 0 where allowed by the compiler) at the end of a struct and then allocate more space for it, so C99 made this practice explicitly allowed by introducing the flexible array member with no size given.
As a GNU c extension, you have zero-length arrays:
As a GNU extension, the number of elements can be as small as zero. Zero-length arrays are useful as the last element of a structure which is really a header for a variable-length object:
for example, consider this code from The gnu c manual
struct line
{
int length;
char contents[0];
};
{
struct line *this_line = (struct line *)
malloc (sizeof (struct line) + this_length);
this_line -> length = this_length;
}
In ISO C99, you would use a flexible array member, which is slightly different in syntax and semantics:
Flexible array members are written as contents[] without the 0.
Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero.
Flexible array members may only appear as the last member of a struct that is otherwise non-empty.
A structure containing a flexible array member, or a union containing such a structure (possibly recursively), may not be a member of a structure or an element of an array. (However, these uses are permitted by GCC as extensions.)
buf here is a flexible array member
Flexible array members have incomplete type, and so the sizeof operator may not be applied whereas original implementation of zero-length arrays, sizeof evaluates to zero.

How are zero-length arrays in a structure useful? [duplicate]

This question already has answers here:
What's the need of array with zero elements?
(5 answers)
Closed 8 years ago.
Quoting from this site,
Zero-length arrays are allowed in GNU C. They are very useful as the last element of a structure that is really a header for a variable-length object.
Code:
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)malloc(sizeof (struct line) + this_length);
thisline->length = this_length;
I really did not understand what they have meant by this: "last element of a structure that is really a header for a variable-length object". Can someone elaborate on that sentence?
Zero-length arrays (which have been replaced with flexible array members in C99) let you allocate structures with variable size in a single allocation instead of two allocations. Without flexible array members you would be forced to do this:
struct line {
int length;
char *contents;
};
struct line *thisline = malloc(sizeof (struct line));
thisline->length = this_length;
thisline->contents=malloc(this_length);
Flexible array member saves you from making the second allocation, and also from having to call free a second time.
In addition to its convenience, this little trick helps you save on memory overhead. Each time you allocate a block of memory, malloc needs to store some "bookkeeping" information for the use of free. In addition, the blocks the malloc allocates are sized at specific increments, leaving small unused portions at the end of each block. By reducing the number of allocations in half you can reduce the overhead by a significant fraction.
It is old C. With current C99 or C11 standard you can use flexible array members (which should be the last member of a struct) like:
struct line {
unsigned length;
char contents[];
};
and you need a convention about the occupied size of the flexible array. If contents is a C string (like we all expect), you could suppose its size is length+1 (since you need a terminating null byte). Then you should comment that.
See also this & that
I really did not understand what they have meant by this: "last element of a structure that is really a header for a variable-length object".
It means this is OK (in GNU C):
struct line {
int length;
char contents[0]; // OK, last member of the structure
};
but this is not:
struct line {
char contents[0]; // Wrong, not the last member of the structure
int length;
};

Why do we use zero length array instead of pointers?

It's said that zero length array is for variable length structure, which I can understand. But what puzzle me is why we don't simply use a pointer, we can dereference and allocate a different size structure in the same way.
EDIT - Added example from comments
Assuming:
struct p
{
char ch;
int *arr;
};
We can use this:
struct p *p = malloc(sizeof(*p) + (sizeof(int) * n));
p->arr = (struct p*)(p + 1);
To get a contiguous chunk of memory. However, I seemed to forget the space p->arr occupies and it seems to be a disparate thing from the zero size array method.
If you use a pointer, the structure would no longer be of variable length: it will have fixed length, but its data will be stored in a different place.
The idea behind zero-length arrays* is to store the data of the array "in line" with the rest of the data in the structure, so that the array's data follows the structure's data in memory. Pointer to a separately allocated region of memory does not let you do that.
* Such arrays are also known as flexible arrays; in C99 you declare them as element_type flexArray[] instead of element_type flexArray[0], i.e. you drop zero.
The pointer isn't really needed, so it costs space for no benefit. Also, it might imply another level of indirection, which also isn't really needed.
Compare these example declarations, for a dynamic integer array:
typedef struct {
size_t length;
int data[0];
} IntArray1;
and:
typedef struct {
size_t length;
int *data;
} IntArray2;
Basically, the pointer expresses "the first element of the array is at this address, which can be anything" which is more generic than is typically needed. The desired model is "the first element of the array is right here, but I don't know how large the array is".
Of course, the second form makes it possible to grow the array without risking that the "base" address (the address of the IntArray2 structure itself) changes, which can be really neat. You can't do that with IntArray1, since you need to allocate the base structure and the integer data elements together. Trade-offs, trade-offs ...
These are various forms of the so-called "struct hack", discussed in question 2.6 of the comp.lang.c FAQ.
Defining an array of size 0 is actually illegal in C, and has been at least since the 1989 ANSI standard. Some compilers permit it as an extension, but relying on that leads to non-portable code.
A more portable way to implement this is to use an array of length 1, for example:
struct foo {
size_t len;
char str[1];
};
You could allocate more than sizeof (struct foo) bytes, using len to keep track of the allocated size, and then access str[N] to get the Nth element of the array. Since C compilers typically don't do array bounds checking, this would generally "work". But, strictly speaking, the behavior is undefined.
The 1999 ISO standard added a feature called "flexible array members", intended to replace this usage:
struct foo {
size_t len;
char str[];
};
You can deal with these in the same way as the older struct hack, but the behavior is well defined. But you have to do all the bookkeeping yourself; sizeof (struct foo) still doesn't include the size of the array, for example.
You can, of course, use a pointer instead:
struct bar {
size_t len;
char *ptr;
};
And this is a perfectly good approach, but it has different semantics. The main advantage of the "struct hack", or of flexible array members, is that the array is allocated contiguously with the rest of the structure, and you can copy the array along with the structure using memcpy (as long as the target has been properly allocated). With a pointer, the array is allocated separately -- which may or may not be exactly what you want.
This is because with a pointer you need a separate allocation and assignment.
struct WithPointer
{
int someOtherField;
...
int* array;
};
struct WithArray
{
int someOtherField;
...
int array[1];
};
To get an 'object' of WithPointer you need to do:
struct WithPointer* withPointer = malloc(sizeof(struct WithPointer));
withPointer.array = malloc(ARRAY_SIZE * sizeof(int));
To get an 'object' of WithArray:
struct WithArray* withArray = malloc(sizeof(struct WithArray) +
(ARRAY_SIZE - 1) * sizeof(int));
That's it.
In some cases it's also very handy, or even necessary, to have the array in consecutive memory; for example in network protocol packets.

Increasing The Size of Memory Allocated to a Struct via Malloc

I just learned that it's possible to increase the size of the memory you'll allocate to a struct when using the malloc function. For example, you can have a struct like this:
struct test{
char a;
int v[1];
char b;
};
Which clearly has space for only 2 chars and 1 int (pointer to an int in reality, but anyway). But you could call malloc in such a way to make the struct holds 2 chars and as many ints as you wanted (let's say 10):
int main(){
struct test *ptr;
ptr = malloc (sizeof(struct test)+sizeof(int)*9);
ptr->v[9]=50;
printf("%d\n",ptr->v[9]);
return 0;
}
The output here would be "50" printed on the screen, meaning that the array inside the struct was holding up to 10 ints.
My questions for the experienced C programmers out there:
What is happening behind the scenes here? Does the computer allocate 2+4 (2 chars + pointer to int) bytes for the standard "struct test", and then 4*9 more bytes of memory and let the pointer "ptr" put whatever kind of data it wants on those extra bytes?
Does this trick only works when there is an array inside the struct?
If the array is not the last member of the struct, how does the computer manage the memory block allocated?
...Which clearly has space for only 2 chars and 1 int (pointer to an
int in reality, but anyway)...
Already incorrect. Arrays are not pointers. Your struct holds space for 2 chars and 1 int. There's no pointer of any kind there. What you have declared is essentially equivalent to
struct test {
char a;
int v;
char b;
};
There's not much difference between an array of 1 element and an ordinary variable (there's conceptual difference only, i.e. syntactic sugar).
...But you could call malloc in such a way to make it hold 1 char and as
many ints as you wanted (let's say 10)...
Er... If you want it to hold 1 char, why did you declare your struct with 2 chars???
Anyway, in order to implement an array of flexible size as a member of a struct you have to place your array at the very end of the struct.
struct test {
char a;
char b;
int v[1];
};
Then you can allocate memory for your struct with some "extra" memory for the array at the end
struct test *ptr = malloc(offsetof(struct test, v) + sizeof(int) * 10);
(Note how offsetof is used to calculate the proper size).
That way it will work, giving you an array of size 10 and 2 chars in the struct (as declared). It is called "struct hack" and it depends critically on the array being the very last member of the struct.
C99 version of C language introduced dedicated support for "struct hack". In C99 it can be done as
struct test {
char a;
char b;
int v[];
};
...
struct test *ptr = malloc(sizeof(struct test) + sizeof(int) * 10);
What is happening behind the scenes here? Does the computer allocate
2+4 (2 chars + pointer to int) bytes for the standard "struct test",
and then 4*9 more bytes of memory and let the pointer "ptr" put
whatever kind of data it wants on those extra bytes?
malloc allocates as much memory as you ask it to allocate. It is just a single flat block of raw memory. Nothing else happens "behind the scenes". There's no "pointer to int" of any kind in your struct, so any questions that involve "pointer to int" make no sense at all.
Does this trick only works when there is an array inside the struct?
Well, that's the whole point: to access the extra memory as if it belongs to an array declared as the last member of the struct.
If the array is not the last member of the struct, how does the computer manage the memory block allocated?
It doesn't manage anything. If the array is not the last member of the struct, then trying to work with the extra elements of the array will trash the members of the struct that declared after the array. This is pretty useless, which is why the "flexible" array has to be the last member.
No, that does not work. You can't change the immutable size of a struct (which is a compile-time allocation, after all) by using malloc ( ) at run time. But you can allocate a memory block, or change its size, such that it holds more than one struct:
int main(){
struct test *ptr;
ptr = malloc (sizeof(struct test) * 9);
}
That's just about all you can do with malloc ( ) in this context.
In addition to what others have told you (summary: arrays are not pointers, pointers are not arrays, read section 6 of the comp.lang.c FAQ), attempting to access array elements past the last element invokes undefined behavior.
Let's look at an example that doesn't involve dynamic allocation:
struct foo {
int arr1[1];
int arr2[1000];
};
struct foo obj;
The language guarantees that obj.arr1 will be allocated starting at offset 0, and that the offset of obj.arr2 will be sizeof (int) or more (the compiler may insert padding between struct members and after the last member, but not before the first one). So we know that there's enough room in obj for multiple int objects immediately following obj.arr1. That means that if you write obj.arr1[5] = 42, and then later access obj.arr[5], you'll probably get back the value 42 that you stored there (and you'll probably have clobbered obj.arr2[4]).
The C language doesn't require array bounds checking, but it makes the behavior of accessing an array outside its declared bounds undefined. Anything could happen -- including having the code quietly behave just the way you want it to. In fact, C permits array bounds checking; it just doesn't provide a way to handle errors, and most compilers don't implement it.
For an example like this, you're most likely to run into visible problems in the presence of optimization. A compiler (particularly an optimizing compiler) is permitted to assume that your program's behavior is well-defined, and to rearrange the generated code to take advantage of that assumption. If you write
int index = 5;
obj.arr1[index] = 42;
the compiler is permitted to assume that the index operation doesn't go outside the declared bounds of the array. As Henry Spencer wrote, "If you lie to the compiler, it will get its revenge".
Strictly speaking, the struct hack probably involves undefined behavior (which is why C99 added a well-defined version of it), but it's been so widely used that most or all compilers will support it. This is covered in question 2.6 of the comp.lang.c FAQ.

Resources