I have seen several questions on this topic, but none of them resolved my query. Links:
Structure memory allocation,
Allocating memory for nested structure pointer,
Understanding Nested Structures
Basically, memory is allocated when we create an instance of a structure, not when we define it. So what happens if I create an object of another structure inside this structure, i.e. something like this:
struct a{
int c;
};
struct b
{
struct a obj;
};
Is memory now given to the struct a object when we declare it in struct b? (We could also do it through a pointer, but what if we do it like this?)
In your case, struct b is just another declaration, the same as struct a.
No memory allocation happens here. The declaration exists so that the compiler knows how much memory to allocate should a variable of this type be defined. Just because a member of a structure is another structure does not mean memory has to be allocated at that point. Once you define a variable of the type, memory allocation takes place.
The only thing to note here is that the inner structure type must be fully defined before it is used as a member of the outer type.
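A minimal sketch of the point above, reusing struct a and struct b from the question. The helper names nested_value_demo and print_nested_layout are hypothetical and exist only for illustration; the exact sizes printed depend on the compiler and platform.

```c
#include <stddef.h>
#include <stdio.h>

struct a {
    int c;
};

struct b {
    struct a obj;   /* embedded by value: part of struct b's own storage */
};

/* Hypothetical helper: storage for var (including var.obj) exists only
 * once this variable is defined, not when the types are declared. */
int nested_value_demo(void)
{
    struct b var = { .obj = { .c = 42 } };
    return var.obj.c;
}

/* Hypothetical helper: prints the layout the compiler chose. */
void print_nested_layout(void)
{
    printf("sizeof(struct a) = %zu\n", sizeof(struct a));
    printf("sizeof(struct b) = %zu\n", sizeof(struct b));
    printf("offset of obj in struct b = %zu\n", offsetof(struct b, obj));
}
```

Defining one variable of type struct b is enough: the embedded struct a lives inside it and needs no allocation of its own.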
Related
Could someone please explain to me the difference between creating a structure with and without malloc. When should malloc be used and when should the regular initialization be used?
For example:
struct person {
char* name;
};
struct person p = {.name="apple"};
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
What is really the difference between the two? When would one approach be used over others?
Given a data structure like:
struct myStruct {
int a;
char *b;
};
struct myStruct p; // alternative 1
struct myStruct *q = malloc(sizeof(struct myStruct)); // alternative 2
Alternative 1: Allocates a myStruct-sized block of memory on the stack; &p gives you the address of the first byte of the struct. If p is declared in a function, its lifetime ends when the function returns (i.e. once execution leaves the scope, you can no longer use it).
Alternative 2: Allocates a myStruct-sized block of memory on the heap and a pointer-sized variable of type (struct myStruct *) on the stack. The pointer on the stack is assigned the address of the struct (which is on the heap), and it is this address (not the struct itself) that is handed back to you. The struct's lifetime does not end until you call free(q).
In the latter case, say the heap struct sits at memory address 0xabcd0000 and q sits at memory address 0xdddd0000; then the pointer stored at address 0xdddd0000 holds the value 0xabcd0000.
printf("%p\n", (void *)&p); // prints the address of p itself (a separate stack address)
printf("%p\n", (void *)q); // will print "0xabcd0000" (the address of the heap struct)
printf("%p\n", (void *)&q); // will print "0xdddd0000" (the address of the pointer)
Addressing the second part of your question, when to use which:
If this struct is in a function and you need to use it after the function exits, you need to malloc it. You can use the value of the struct by returning the pointer, like: return q;.
If this struct is temporary and you do not need its value after, you do not need to malloc memory.
Usage with an example:
struct myStruct {
int a;
char *b;
};
struct myStruct *foo() {
struct myStruct p;
p.a = 5;
return &p; // after this point, it's out of scope; possible warning
}
struct myStruct *bar() {
struct myStruct *q = malloc(sizeof(struct myStruct));
q->a = 5;
return q;
}
int main(void) {
    struct myStruct *pMain = foo();
    // p lived in foo's stack frame and p.a was assigned '5',
    // but p's lifetime ended when foo returned.
    // be careful!!! pMain is now a dangling pointer:
    // the memory is susceptible to be overwritten,
    // and dereferencing pMain is undefined behavior.
    struct myStruct *qMain = bar();
    // memory is allocated in bar. q->a was assigned as '5'.
    // a memory address is returned.
    // memory is *not* susceptible to be overwritten
    // until you use 'free(qMain);'
    free(qMain);
}
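The two safe ways of getting the struct out of a function can be sketched as follows. This is only an illustration under assumptions: make_by_value and make_on_heap are hypothetical names, and returning the struct by value is an option the answer above does not mention but which avoids both the dangling pointer and malloc for small structs.

```c
#include <stdlib.h>

struct myStruct {
    int a;
    char *b;
};

/* Returns a copy of the struct. The local p dies when the function
 * returns, but the caller receives its value, not its address. */
struct myStruct make_by_value(void)
{
    struct myStruct p = { .a = 5, .b = NULL };
    return p;
}

/* Returns heap memory that stays valid until the caller frees it.
 * Returns NULL if malloc fails. */
struct myStruct *make_on_heap(void)
{
    struct myStruct *q = malloc(sizeof *q);
    if (q == NULL)
        return NULL;
    q->a = 5;
    q->b = NULL;
    return q;
}
```

A caller would write struct myStruct v = make_by_value(); for the first form, and must eventually call free() on the pointer returned by the second.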
If we assume both examples occur inside a function, then in:
struct person p = {.name="apple"};
the C implementation automatically allocates memory for p and releases it when execution of the function ends (or, if the statement is inside a block nested in the function, when execution of that block ends). This is useful when:
You are working with objects of modest size. (For big objects, using many kibibytes of memory, malloc may be better. The thresholds vary depending on circumstances.)
You are working with a small number of objects at one time.
In:
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
the program explicitly requests memory for an object, and the program generally should release that memory with free when it is done with the object. This is useful when:
The object must be returned to the caller of the function. An automatic object, as used above, will cease to exist (in the C model of computation; the actual memory in your computer does not stop existing—rather it is merely no longer reserved for use for the object) when execution of the function ends, but this allocated object will continue to exist until the program frees it (or ends execution).
The object is very large. (Generally, C implementations provide more memory for allocation by malloc than they do for automatic objects.)
The program will create a variable number of such objects, depending on circumstances, such as creating linked lists, trees, or other structures from input whose size is not known before it is read.
Note that struct person p = {.name="apple"}; initializes the name member with "apple" and initializes all other members to zero. However, the code that uses malloc and assigns to p_tr->name does not initialize the other members.
If struct person p = {.name="apple"}; appears outside of a function, then it creates an object with static storage duration. It will exist for the duration of program execution.
Instead of struct person* p_tr = malloc(sizeof(struct person));, it is preferable to use struct person *p_tr = malloc(sizeof *p_tr);. With the former, a change to the type of p_tr requires edits in two places, which gives a human an opportunity to make mistakes. With the latter, changing the type of p_tr in just one place will still result in the correct size being requested.
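Both remarks (the sizeof *p_tr idiom and the uninitialized members after malloc) can be combined in one small sketch. The age member and the function name person_new are assumptions added for illustration; the original struct person has only name.

```c
#include <stdlib.h>

struct person {
    char *name;
    int age;    /* hypothetical extra member, not in the original struct */
};

/* Allocates a person and initializes every member.
 * Returns NULL if malloc fails. */
struct person *person_new(char *name)
{
    struct person *p = malloc(sizeof *p);   /* size follows the type of p */
    if (p == NULL)
        return NULL;

    /* Assigning a compound literal sets name and zero-initializes
     * all members that are not mentioned, just like the initializer
     * of an automatic or static object would. */
    *p = (struct person){ .name = name };
    return p;
}
```

With this, struct person *p_tr = person_new("apple"); leaves p_tr->age at zero instead of indeterminate, and the caller releases the object with free(p_tr).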
struct person p = {.name="apple"};
^This is automatic allocation for a variable/instance of type struct person.
struct person* p_tr = malloc(sizeof(struct person));
^This is dynamic allocation for a variable/instance of type struct person.
Static memory allocation occurs at compile time.
Dynamic memory allocation means memory is allocated at runtime, when the program executes that line.
Judging by your comments, you are interested in when to use one or the other. Note that all types of allocation reserve enough memory to hold the value of the variable. The size depends on the type of the variable. Statically allocated variables are pinned to a place in memory by the compiler. Automatically allocated variables are given a place on the stack, also arranged by the compiler. Dynamically allocated variables do not exist before the program starts and do not have any place in memory until they are allocated by 'malloc' or other functions.
All named variables are allocated statically or automatically. Dynamic variables are allocated by the program, but in order to be able to access them, one still needs a named variable, which is a pointer. A pointer is a variable which is big enough to keep an address of another variable. The latter could be allocated dynamically or statically or automatically.
The question is, what to do if your program does not know the number of objects it needs to use during the execution time. For example, what if you read some data from a file and create a dynamic struct, like a list or a tree in your program. You do not know exactly how many members of such a struct you would have. This is the main use for the dynamically allocated variables. You can create as many of them as needed and put all on the list. In the simplest case you only need one named variable which points to the beginning of the list to know about all of the objects on the list.
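As a rough sketch of that use case, here is a singly linked list that grows one dynamically allocated node at a time, with a single named pointer to its head. The type and function names (struct node, list_push, list_length, list_free) are hypothetical.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Prepends a value to the list whose head pointer is *head.
 * Returns 0 on success, -1 if malloc fails (list unchanged). */
int list_push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Counts the nodes reachable from head. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Frees every node; each malloc is matched by exactly one free. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

The program only needs one named variable, struct node *head = NULL;, no matter how many values it later reads and pushes.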
Another interesting use is when you return a complex struct from a function. If allocated automatically on the stack, it will cease to exist after returning from the function. Dynamically allocated data will be persistent till it is explicitly freed. So, using the dynamic allocation would help here.
There are other uses as well.
In your simple example there is not much difference between the two cases. The second requires additional work: a call to the 'malloc' function to allocate the memory for your struct. In the first case, by contrast, the memory for the struct is reserved without any call: in a static program region set up at program start if the definition is at file scope, or on the stack if it is inside a function. Note that the pointer itself in the second case is also a named variable allocated that way, not dynamically. It just keeps the address of the memory region for the struct.
Also, as a general rule, the dynamically allocated data should be eventually freed by the 'free' function. You cannot free the static data.
What is the difference between using a flexible array member (FAM) and a pointer member? In both cases, a malloc and an element-by-element assignment must be done. But with a FAM, one memory allocation is done for the whole structure, while with a pointer member, a separate allocation is done for the array the pointer refers to (see code). What are the pros and cons of these two methods?
#include <stdio.h>
#include <stdlib.h>
typedef struct farr_mb {
int lg;
int arr[];
} Farr_mb;
typedef struct ptr_mb {
int lg;
int * ptr;
} Ptr_mb;
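int main(void) {
    int lg=5;
    Farr_mb *a=malloc(sizeof(Farr_mb)+lg*sizeof(int)); // one block: struct + array
    Ptr_mb b; b.ptr=malloc(lg*sizeof(int)); // struct on the stack, array on the heap
    if (a==NULL || b.ptr==NULL) return 1;
    a->lg=lg; b.lg=lg; // record the array lengths
    for (int i=0;i<lg;i++) (a->arr)[i]=i;
    for (int i=0;i<lg;i++) (b.ptr)[i]=i;
    for (int i=0;i<lg;i++) printf("%d \t",(a->arr)[i]); // print only, no assignment
    printf("\n");
    for (int i=0;i<lg;i++) printf("%d \t",(b.ptr)[i]);
    printf("\n");
    free(a); free(b.ptr);
    return 0;
}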
Before we get to the pros and cons, let's look at some real-world examples.
Let's say we wish to implement a hash table, where each entry is a dynamically managed array of elements:
struct hash_entry {
size_t allocated;
size_t used;
element array[];
};
struct hash_table {
size_t size;
struct hash_entry **entry;
};
#define HASH_TABLE_INITIALIZER { 0, NULL }
This in fact uses both. The hash table itself is a structure with two members. The size member indicates the size of the hash table, and the entry member is a pointer to an array of hash table entry pointers. This way, each unused entry is just a NULL pointer. When adding elements to a hash table entry, the entire struct hash_entry can be reallocated (to sizeof (struct hash_entry) + allocated * sizeof (element)) or freed, as long as the corresponding pointer in the entry member in the struct hash_table is updated accordingly.
If we used element *array instead, we would need to either use struct hash_entry *entry; in the struct hash_table; or allocate the struct hash_entry separately from the array; or allocate both struct hash_entry and array in a single chunk, with the array pointer pointing just after the same struct hash_entry.
The cost of that would be two extra size_ts worth of memory used for each unused hash table slot, as well as an extra pointer dereference when accessing elements. (Or, to get the address of the array, two consecutive pointer dereferences, instead of one pointer dereference plus offset.) If this is a key structure heavily used in an implementation, that cost can be visible in profiling, and negatively affect cache performance. For random accesses, the larger the element array is, the less difference there is, however; the cost is largest when the arrays are small, and fit within the same cacheline (or a few cachelines) as the allocated and used members.
We do not usually want to make the entry member in the struct hash_table a flexible array member, because that would mean you no longer can declare a hash table statically, using struct hash_table my_table = HASH_TABLE_INITIALIZER;; you would need to use a pointer to a table, and an initializer function: struct hash_table *my_table; my_table = hash_table_init(); or similar.
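To make the layout above concrete, here is a minimal sketch of growing the slot array of a statically initialized table and appending to one slot. It is not the author's implementation: element is assumed to be int (the source never defines it), the helper names hash_table_grow and hash_slot_append are hypothetical, and no hashing or rehashing is shown.

```c
#include <stdlib.h>

typedef int element;   /* assumption: the source never defines element */

struct hash_entry {
    size_t allocated;
    size_t used;
    element array[];
};

struct hash_table {
    size_t size;
    struct hash_entry **entry;
};

#define HASH_TABLE_INITIALIZER { 0, NULL }

/* Hypothetical helper: grows the slot array (no rehashing).
 * Works on a statically initialized table because realloc(NULL, n)
 * behaves like malloc(n). Returns 0 on success, -1 on failure. */
int hash_table_grow(struct hash_table *table, size_t newsize)
{
    if (newsize <= table->size)
        return 0;
    struct hash_entry **tmp =
        realloc(table->entry, newsize * sizeof table->entry[0]);
    if (tmp == NULL)
        return -1;
    for (size_t i = table->size; i < newsize; i++)
        tmp[i] = NULL;              /* new slots start unused */
    table->entry = tmp;
    table->size = newsize;
    return 0;
}

/* Hypothetical helper: appends value to slot i, reallocating the
 * whole entry (header plus flexible array) when it is full. */
int hash_slot_append(struct hash_table *table, size_t i, element value)
{
    if (i >= table->size)
        return -1;

    struct hash_entry *e = table->entry[i];
    size_t used = e ? e->used : 0;
    size_t allocated = e ? e->allocated : 0;

    if (used >= allocated) {
        size_t new_allocated = allocated ? 2 * allocated : 4;
        e = realloc(e, sizeof *e + new_allocated * sizeof e->array[0]);
        if (e == NULL)
            return -1;              /* table->entry[i] is still valid */
        e->allocated = new_allocated;
        e->used = used;
        table->entry[i] = e;        /* the entry may have moved */
    }
    e->array[e->used++] = value;
    return 0;
}
```

Usage would start from struct hash_table t = HASH_TABLE_INITIALIZER; followed by hash_table_grow(&t, n); an unused slot costs only one NULL pointer.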
I do have another example of related data structures using both pointer members and flexible array members. It allows one to use variables of type matrix to represent any 2D matrix with double entries, even when a matrix is a view to another (say, a transpose, a block, a row or column vector, or even a diagonal vector); these views are all equal (unlike in e.g. GNU Scientific Library, where matrix views are represented by a separate data type). This matrix representation approach makes writing robust numerical linear algebra code easy, and the ensuing code is much more readable than when using GSL or BLAS+LAPACK. In my opinion, that is.
So, let's look at the pros and cons, from the point of view of how to choose which approach to use. (For that reason, I will not designate any feature as "pro" or "con", as the determination depends on the context, on each particular use case.)
Structures with flexible array members cannot be initialized statically. You can only refer to them via pointers.
You can declare and initialize structures with pointer members. As shown in above example, using a preprocessor initializer macro can mean you do not need an initializer function. For example, a function accepting a struct hash_table *table parameter can always resize the array of pointers using realloc(table->entry, newsize * sizeof table->entry[0]), even when table->entry is NULL. This reduces the number of functions needed, and simplifies their implementation.
Accessing an array via a pointer member can require an extra pointer dereference.
If we compare the accesses to arrays in statically initialized structures with pointer to the array, to a structure with a flexible array member referred via a static pointer, the same number of dereferences are made.
If we have a function that gets the address of a structure as a parameter, then accessing an array element via a pointer member requires two pointer dereferences, whereas accessing a flexible array element requires only one pointer dereference and one offset. If the array elements are small enough and the array index small enough, so that the accessed array element is in the same cacheline, the flexible array member access is often significantly faster. For larger arrays, the difference in performance tends to be insignificant. This does vary between hardware architectures, however.
Reallocating an array via a pointer member hides the complexity from those using the structure as an opaque variable.
This means that if we have a function that receives a pointer to a structure as a parameter, and that structure has a pointer to a dynamically allocated array, the function can reallocate that array without the caller seeing any change in the structure address itself (only structure contents change).
However, if we have a function that receives a pointer to a structure with a flexible array member, reallocating the array means reallocating the entire structure. That potentially modifies the address of the structure. Because the pointer is passed by value, the modification is not visible to the caller. Thus, a function that may resize a flexible array member, must receive a pointer to a pointer to the structure with a flexible array member.
If the function only examines the contents of a structure with a flexible array member, say counts the number of elements that fulfill some criteria, then a pointer to the structure suffices; and both the pointer and the pointed-to data can be marked const. This might help the compiler produce better code. Furthermore, all the data accessed is linear in memory, which helps more complex processors manage caching more efficiently. (To do the same with an array having a pointer member, one would need to pass the pointer to the array, as well as the size field at least, as parameters to the counting function, instead of a pointer to the structure containing those values.)
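The difference in the resize interface can be sketched with the two types from the question. The function names ptr_mb_resize and farr_mb_resize are hypothetical, and both assume new_lg is greater than zero.

```c
#include <stdlib.h>

typedef struct farr_mb {
    int lg;
    int arr[];
} Farr_mb;

typedef struct ptr_mb {
    int lg;
    int *ptr;
} Ptr_mb;

/* Pointer member: only the array may move. The structure itself
 * stays where it is, so a plain pointer to it is enough. */
int ptr_mb_resize(Ptr_mb *p, int new_lg)
{
    int *tmp = realloc(p->ptr, (size_t)new_lg * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    p->ptr = tmp;
    p->lg = new_lg;
    return 0;
}

/* Flexible array member: the whole structure may move, so the
 * caller's pointer has to be updated through a pointer to it.
 * *pp may be NULL, in which case a new structure is allocated. */
int farr_mb_resize(Farr_mb **pp, int new_lg)
{
    Farr_mb *tmp = realloc(*pp, sizeof *tmp + (size_t)new_lg * sizeof tmp->arr[0]);
    if (tmp == NULL)
        return -1;
    tmp->lg = new_lg;
    *pp = tmp;
    return 0;
}
```

In both cases the old contents are preserved by realloc; only the second form can change the address the caller holds.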
An unused/empty structure with a flexible array member can be represented by a NULL pointer (to such structure). This can be important when you have an array of arrays.
With structures with flexible array members, the outer array is just an array of pointers. With structures with pointer members, the outer array can be either an array of structures, or an array of pointers to structures.
Both can support different types of sub-arrays, if the structures have a common type tag as the first member, and you use a union of those structures. (What 'use' means in this context is unfortunately debatable. Some claim you need to access the array via the union; I claim the visibility of such a union is sufficient, because anything else would break a huge amount of existing POSIX C code; basically all server-side C code using sockets.)
Those are the major ones I can think of right now. Both forms are ubiquitous in my own code, and I have had no issues with either. (In particular, I prefer using a structure free helper function that poisons the structure to help detect use-after-free bugs in early testing; and my programs do not often have any memory-related issues.)
I will edit the above list, if I find I've missed important facets. Therefore, if you have a suggestion or think I've overlooked something above, please let me know in a comment, so I can verify and edit as appropriate.
Suppose we have a struct that contains a pointer to an array and its size, like this one:
typedef struct {
int * array;
int arr_size;
}IntArray;
To have this inside another struct, there are two ways:
typedef struct{
IntArray ia;
//other variables
}Base1;
typedef struct{
IntArray * ia;
//other variables
}Base2;
What happens when I dynamically allocate Base1 and Base2 (e.g., Base1 *b1 = malloc(sizeof(Base1));) and why should I choose one way over the other?
A nested struct's storage is part of its parent struct, which means it doesn't need its own allocation (though it may still need its own initialization), whereas struct fields that are pointers need their targets allocated when the parent object is initialized and freed when it is destroyed (this is a common cause of memory leaks in C because it does not have automatic object destructors like C++ does). With a pointer you could instead point to another array/object that exists on the stack (thus avoiding malloc/free), but then you might run into object lifetime bugs, depending on how the scopes and lifetimes of your objects differ.
Nested structs exist in-place, so they cannot be shared by other instances. This may or may not be ideal (you could solve this with a template in C++, in C you'd have to settle for a hideous preprocessor macro).
Because dynamically-allocated objects (such as your array and your Base2 type's nested ia member) exist in different locations in memory, your code may not benefit from the spatial locality that the CPU's caches exploit, and you will incur a double pointer dereference. So your code will likely run slower.
Anyway: when in C, you should generally try to minimize pointer use.
Basically the question is the same as, should I allocate a struct or a pointer to a struct? That is:
IntArray myStruct;
or
IntArray *myStructPtr;
The fact that the variables in question are within a struct makes no difference; you can choose either.
You access them in the same manner as you would if they were not inside another structure, after first going through the field of the outer structure, of course.
Base1 contains the actual IntArray struct, so you would write:
Base1 *b1 = malloc(sizeof(*b1));
b1->ia.array = malloc(yourSizeHere);
Base2 contains a pointer to a IntArray struct, so you would need to point it to an existing IntArray struct or malloc() memory for it, and then access it as a pointer.
Base2 *b2 = malloc(sizeof(*b2));
b2->ia = malloc(sizeof(*(b2->ia)));
b2->ia->array = malloc(yourSizeHere);
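Spelled out with error handling and matching cleanup, the two variants might look like the sketch below. The constructor and destructor names (base1_new, base1_free, base2_new, base2_free) are hypothetical, the "other variables" of the original structs are omitted, and n is assumed to be greater than zero.

```c
#include <stdlib.h>

typedef struct {
    int *array;
    int arr_size;
} IntArray;

typedef struct {
    IntArray ia;    /* embedded: allocated together with Base1 */
} Base1;

typedef struct {
    IntArray *ia;   /* separate object: needs its own malloc/free */
} Base2;

/* Two allocations: the Base1 (which contains ia) and the array. */
Base1 *base1_new(int n)
{
    Base1 *b1 = malloc(sizeof *b1);
    if (b1 == NULL)
        return NULL;
    b1->ia.array = malloc((size_t)n * sizeof *b1->ia.array);
    if (b1->ia.array == NULL) {
        free(b1);
        return NULL;
    }
    b1->ia.arr_size = n;
    return b1;
}

void base1_free(Base1 *b1)
{
    if (b1 == NULL)
        return;
    free(b1->ia.array);
    free(b1);
}

/* Three allocations: the Base2, the IntArray, and the array. */
Base2 *base2_new(int n)
{
    Base2 *b2 = malloc(sizeof *b2);
    if (b2 == NULL)
        return NULL;
    b2->ia = malloc(sizeof *b2->ia);
    if (b2->ia == NULL) {
        free(b2);
        return NULL;
    }
    b2->ia->array = malloc((size_t)n * sizeof *b2->ia->array);
    if (b2->ia->array == NULL) {
        free(b2->ia);
        free(b2);
        return NULL;
    }
    b2->ia->arr_size = n;
    return b2;
}

void base2_free(Base2 *b2)
{
    if (b2 == NULL)
        return;
    if (b2->ia != NULL)
        free(b2->ia->array);   /* innermost block first */
    free(b2->ia);
    free(b2);
}
```

The extra level in Base2 buys the ability to share or swap the IntArray, at the price of one more allocation, one more free, and one more dereference.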
I have a structure as below
typedef struct Mystruct{
char *name;
int telno;
struct Mystruct *nextp;
}data;
Now I malloc the structure
data *addnode;
addnode = malloc (sizeof(data));
Now I want to add data to the char *name.
addnode->name = malloc (MAX);
Question: Why is it required to malloc again?
I was under the assumption that malloc-ing addnode would also allocate the memory for addnode->name, but it does not.
malloc is not deep and doesn't do recursion. So it won't allocate memory for any of the pointers inside the structure you pass.
If you think about this a bit more, you can see that must be so. You don't pass in any information about the structure you are allocating. You just pass a size for the memory block. Now, malloc doesn't even know what type of data you are allocating. It doesn't know that the block itself contains a pointer.
As for why this design choice was made, how can the library tell who owns the memory that your pointer refers to? Perhaps it's owned by that structure. Or perhaps you want to use that pointer to refer to some memory allocated elsewhere. Only you can know that which is why the responsibility falls to you. In fact your structure is a fine example of this. Probably the name member is owned by the structure, and the nextp member is not.
Allocating memory for Mystruct provides enough memory for a pointer to name. At this point we have no idea how many characters will be in a name so can't possibly allocate the memory for it.
If you want to fully allocate the structure in a single allocation, you could decide on a max size for name and change the structure definition to
#define MAX_NAME (10) /* change this as required */
typedef struct Mystruct{
char name[MAX_NAME];
int telno;
struct Mystruct *nextp;
}data;
Or, if you know the name when you allocate the struct, you could hide the need for two allocations from the caller by providing a constructor function
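struct Mystruct* Mystruct_create(const char* name)
{
    struct Mystruct* ms = malloc(sizeof(*ms));
    if (ms == NULL) return NULL;
    ms->name = strdup(name); /* strdup is POSIX (and C23); it may return NULL */
    ms->telno = 0;
    ms->nextp = NULL;
    return ms;
}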
No. The first malloc() allocates memory for the whole structure, including the memory that holds the pointer name itself, i.e. 4 bytes on a typical 32-bit system.
You need to allocate memory separately for the data it points to. Until it is initialized, that pointer holds an indeterminate (garbage) value.
The same applies to free(): you have to free the inner blocks first, then free the memory for the whole structure. Neither malloc() nor free() works recursively.
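A sketch of a matching create/free pair for the structure in the question, showing the order of the two frees. The function names node_create and node_free are hypothetical, and the name is copied with malloc plus memcpy so that only standard C library calls are used.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Mystruct {
    char *name;
    int telno;
    struct Mystruct *nextp;
} data;

/* Allocates the node and, separately, a copy of name.
 * Returns NULL if either allocation fails. */
data *node_create(const char *name, int telno)
{
    data *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;

    size_t len = strlen(name) + 1;      /* include the terminator */
    node->name = malloc(len);
    if (node->name == NULL) {
        free(node);
        return NULL;
    }
    memcpy(node->name, name, len);
    node->telno = telno;
    node->nextp = NULL;                 /* not owned by this node */
    return node;
}

/* Frees the inner block first, then the structure itself.
 * nextp is deliberately left alone: the node does not own it. */
void node_free(data *node)
{
    if (node == NULL)
        return;
    free(node->name);
    free(node);
}
```

Freeing node before node->name would read a member of already released memory, which is why the inner block goes first.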
What is the benefit of declaring a C structure member as an array of size 1 instead of a pointer:
struct {
a_struct_t a_member[1];
...
}b_struct;
Thanks in advance
In a typical case, a structure with a member that's declared as an array of one item will have that member as the last item in the struct. The intent is that the struct will be allocated dynamically. When it is allocated, the code will allocate space for as many items as you really want/need in that array:
struct X {
time_t birthday;
char name[1];
};
struct X *x = malloc(sizeof(*x) + 35);
x->birthday = mktime(&t);
strcpy(x->name, "no more than 35 characters");
This works particularly well for strings -- the character you've allocated in the struct gives you space for the NUL terminator, so when you do the allocation, the number of characters you allocate is exactly the strlen() of the string you're going to put there. For most other kinds of items, you normally want to subtract one from the allocation size (or just live with the allocated space being one item larger than is strictly necessary).
You can do (sort of) the same thing with a pointer, but it results in allocating the body of the struct separately from the item you refer to via the pointer. The good point is that (unlike the method above) more than one item can be allocated dynamically, where the method above only works for the last member of the struct.
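Since C99 the same single-allocation layout can be written with a flexible array member instead of an array of size 1. The sketch below assumes a C99 or later compiler; the function name x_new is hypothetical. Note one difference from the size-1 version: the flexible array contributes nothing to sizeof, so the terminator has to be counted explicitly.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

struct X {
    time_t birthday;
    char name[];    /* C99 flexible array member, must be last */
};

/* Allocates the struct and the string in one block.
 * Returns NULL if malloc fails. */
struct X *x_new(time_t birthday, const char *name)
{
    size_t len = strlen(name) + 1;          /* +1 for the terminator */
    struct X *x = malloc(sizeof *x + len);  /* header + characters */
    if (x == NULL)
        return NULL;
    x->birthday = birthday;
    memcpy(x->name, name, len);
    return x;
}
```

A single free(x) releases both the header and the name, because they were obtained with a single malloc.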
What you describe are two different things entirely. If you have a pointer as a member:
a_struct_t* a_member;
then it is simply a pointer. There is no memory allocated inside of the struct to hold an a_struct_t. If, on the other hand, you have an array of size 1:
a_struct_t a_member[1];
then your struct actually has an object of type a_struct_t inside of it. From a memory standpoint, it isn't much different from just putting an object of that type inside the struct:
a_struct_t a_member;
From a usage standpoint, an array requires indirection to access the one element (i.e., you need to use *a_member instead of a_member).
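The three declarations can be compared directly. In this sketch a_struct_t is given a hypothetical payload of eight doubles (the question never shows its definition), and the struct and function names are made up for illustration.

```c
#include <stddef.h>

typedef struct {
    double values[8];   /* hypothetical payload for illustration */
} a_struct_t;

struct with_pointer { a_struct_t *a_member; };    /* holds only an address */
struct with_array   { a_struct_t a_member[1]; };  /* holds one object inline */
struct with_plain   { a_struct_t a_member; };     /* holds one object inline */

/* Returns 1 if the single array element lives inside the struct and
 * can be reached with *a_member, as described above. */
int array_member_is_inline(void)
{
    struct with_array s;
    s.a_member[0].values[0] = 1.5;
    return (void *)s.a_member == (void *)&s     /* stored at the start of s */
        && (*s.a_member).values[0] == 1.5;      /* *a_member is a_member[0] */
}
```

Comparing sizeof(struct with_pointer) with sizeof(struct with_array) shows the difference: the first is the size of a pointer, the second is at least the size of a whole a_struct_t.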
"Array of size 1 instead of a pointer"? Sorry, but I don't see how this question can possibly make sense. I would understand if you asked about "array of size 1 instead of an ordinary member (non-array)". But "instead of a pointer"? What does a pointer have to do with this? How is it interchangeable with an array, to justify the question?
If what you really wanted to ask is why it is declared as an array of size 1 instead of non-array as in
struct {
a_struct_t a_member;
} b_struct;
then one possible explanation is the well-known idiom called "struct hack". You might see a declaration like
struct {
...
a_struct_t a_member[1];
} b_struct;
used to implement an array of flexible size as the last member of the struct object. The actual struct object is later created within a memory block that is large enough to accommodate as many array elements as necessary. But in this case the array has to be the last member of the struct, not the first one as in your example.
P.S. From time to time you might see the "struct hack" implemented through an array of size 0, which is actually a constraint violation in standard C (a conforming compiler must issue a diagnostic, although some, such as GCC, accept it as an extension).
So I think it's been stated that the main difference between pointers and arrays is that you have to allocate memory for pointers.
The tricky part about your question is that even after you allocate space for your struct, if the struct contains a pointer you have to allocate a SECOND time for what the pointer points to; the pointer itself is allocated as part of the struct's allocation.
If your struct contained an array of 1 you would not have to allocate any additional memory, it would be stored in the struct (which you still have to allocate).
These are different things.
The name of such an array member refers to memory allocated inside the struct instance itself.