I am putting together a project in C where I must pass around a variable length byte sequence, but I'm trying to limit malloc calls due to potentially limited heap.
Say I have a struct, my_struct, that contains the variable length byte sequence, ptr, and a function, my_func, that creates an instance of my_struct. In my_func, my_struct.ptr is malloc'd and my_struct is returned by value. my_struct will then be used by other functions being passed by value: another_func. Code below.
Is this "safe" to do against memory leaks provided somewhere on the original or any copy of my_struct when passed by value, I call my_struct_destroy or free the malloc'd pointer? Specifically, is there any way that when another_func returns, that inst.ptr is open to being rewritten or dangling?
Since stackoverflow doesn't like opinion-based questions, are there any good references that discuss this behavior? I'm not sure what to search for.
typedef struct {
char * ptr;
} my_struct;
// allocates n bytes to pointer in structure and initializes.
my_struct my_func(size_t n) {
my_struct out = {(char *) malloc(n)};
/* initialization of out.ptr */
return out;
}
void another_func(my_struct inst) {
/*
do something using the passed-by-value inst
are there problems with inst.ptr here or after this function returns?
*/
}
void my_struct_destroy(my_struct * ms_ptr) {
free(ms_ptr->ptr);
ms_ptr->ptr = NULL;
}
int main() {
my_struct inst = my_func(20);
another_func(inst);
my_struct_destroy(&inst);
}
I's safe to pass and return a struct containing a pointer by value as you did it. It contains a copy of ptr. Nothing is changed in the calling function. There would, of course, be a big problem if another_func frees ptr and then the caller tries to use it or free it again.
Locality of alloc+free is a best practice. Wherever possible, make the function that allocates an object also responsible for freeing it. Where that's not feasible, malloc and free of the same object should be in the same source file. Where that's not possible (think complex graph data structure with deletes), the collection of files that manage objects of a given type should be clearly identified and conventions documented. There's a common technique useful for programs (like compilers) that work in stages where much of the memory allocated in one stage should be freed before the next starts. Here, memory is only malloced in big blocks by a manager. From these, the manager allocs objects of any size. But it knows only one way to free: all at once, presumably at the end of a stage. This is a gcc idea: obstacks. When allocation is more complex, bigger systems implement some kind of garbage collector. Beyond these ideas, there are as many ways to manage C storage as there are colors. Sorry I don't have any pointers to references (pun intended :)
If you only have one variable-length field and its size doesn't need to be dynamically updated, consider making the last field in the struct an array to hold it. This is okay with the C standard:
typedef struct {
... other fields
char a[1]; // variable length
} my_struct;
my_struct my_func(size_t n) {
my_struct *p = malloc(sizeof *p + (n - 1) * sizeof p->a[0]);
... initialize fields of p
return p;
}
This avoids the need to separately free the variable length field. Unfortunately it only works for one.
If you're okay with gcc extensions, you can allocate the array with size zero. In C 99, you can get the same effect with a[]. This avoids the - 1 in the size calculation.
Related
I've recently started learning C and currently I'm working on a project which involves implementing a struct with two variables and I don't really know how to apparoach this.
The gist of it is I need to implement a struct which contains two variables, a pointer to an int array AND an int value which indicates the number of elements conatained within the array. The size of the array is declared upon the invocation of the constructor and is dependent on the input.
For the constructor I'm using a different function which recieves a string
as input which is encoded into a decimal code. Also this function recieves another input
which is a pointer to an int array (the pointer defined in the struct) and the problem is I'm using the malloc() function to allocate memory for it but I dont really understand how and when to use the free() function properly.
So, the questions are:
When am I supposed to free the allocated memory? (assuming I need this struct for later use throughout the program's running time)
What are the best ways to avoid memory leaks? What should you look out for?
It's unclear whether you're expected to manage the memory of the array inside, but this is functionally the setup you need for allocating the containing structure.
#include <malloc.h>
#include <stdio.h>
struct my_struct {
size_t num_entries;
int *array;
};
int main() {
struct my_struct *storage = malloc(sizeof(struct my_struct));
storage->num_entries = 4;
storage->array = malloc(sizeof(int) * storage->num_entries);
storage->array[0] = 1;
storage->array[3] = 2;
printf("allocated %ld entries\n", storage->num_entries);
printf("entry #4 (index=3): %d\n", storage->array[3]);
free(storage->array); /* MUST be first! */
free(storage);
storage = 0; /* safety to ensure you can't read the freed memory later */
}
if you're responsible for freeing the internal storage array, then you must free it first before freeing the containing memory.
The biggest key to memory management: only one part of the code at any time "owns" the memory in the pointer and is responsible for freeing it or passing it to something else that will.
Could someone please explain to me the difference between creating a structure with and without malloc. When should malloc be used and when should the regular initialization be used?
For example:
struct person {
char* name;
};
struct person p = {.name="apple"};
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
What is really the difference between the two? When would one approach be used over others?
Having a data structure like;
struct myStruct {
int a;
char *b;
};
struct myStruct p; // alternative 1
struct myStruct *q = malloc(sizeof(struct myStruct)); // alternative 2
Alternative 1: Allocates a myStruct width of memory space on stack and hands back to you the memory address of the struct (i.e., &p gives you the first byte address of the struct). If it is declared in a function, its life ends when the function exits (i.e. if function gets out of the scope, you can't reach it).
Alternative 2: Allocates a myStruct width of memory space on heap and a pointer width of memory space of type (struct myStruct*) on stack. The pointer value on the stack gets assigned the value of the memory address of the struct (which is on the heap) and this pointer address (not the actual structs address) is handed back to you. It's life time never ends until you use free(q).
In the latter case, say, myStruct sits on memory address 0xabcd0000 and q sits on memory address 0xdddd0000; then, the pointer value on memory address 0xdddd0000 is assigned as 0xabcd0000 and this is returned back to you.
printf("%p\n", &p); // will print "0xabcd0000" (the address of struct)
printf("%p\n", q); // will print "0xabcd0000" (the address of struct)
printf("%p\n", &q); // will print "0xdddd0000" (the address of pointer)
Addressing the second part of your; when to use which:
If this struct is in a function and you need to use it after the function exits, you need to malloc it. You can use the value of the struct by returning the pointer, like: return q;.
If this struct is temporary and you do not need its value after, you do not need to malloc memory.
Usage with an example:
struct myStruct {
int a;
char *b;
};
struct myStruct *foo() {
struct myStruct p;
p.a = 5;
return &p; // after this point, it's out of scope; possible warning
}
struct myStruct *bar() {
struct myStruct *q = malloc(sizeof(struct myStruct));
q->a = 5;
return q;
}
int main() {
struct myStruct *pMain = foo();
// memory is allocated in foo. p.a was assigned as '5'.
// a memory address is returned.
// but be careful!!!
// memory is susceptible to be overwritten.
// it is out of your control.
struct myStruct *qMain = bar();
// memory is allocated in bar. q->a was assigned as '5'.
// a memory address is returned.
// memory is *not* susceptible to be overwritten
// until you use 'free(qMain);'
}
If we assume both examples occur inside a function, then in:
struct person p = {.name="apple"};
the C implementation automatically allocates memory for p and releases it when execution of the function ends (or, if the statement is inside a block nested in the function, when execution of that block ends). This is useful when:
You are working with objects of modest size. (For big objects, using many kibibytes of memory, malloc may be better. The thresholds vary depending on circumstances.)
You are working with a small number of objects at one time.
In:
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
the program explicitly requests memory for an object, and the program generally should release that memory with free when it is done with the object. This is useful when:
The object must be returned to the caller of the function. An automatic object, as used above, will cease to exist (in the C model of computation; the actual memory in your computer does not stop existing—rather it is merely no longer reserved for use for the object) when execution of the function ends, but this allocated object will continue to exist until the program frees it (or ends execution).
The object is very large. (Generally, C implementations provide more memory for allocation by malloc than they do for automatic objects.)
The program will create a variable number of such objects, depending on circumstances, such as creating linked lists, trees, or other structures from input whose size is not known before it is read.
Note that struct person p = {.name="apple"}; initializes the name member with "apple" and initializes all other members to zero. However, the code that uses malloc and assigns to p_tr->name does not initialize the other members.
If struct person p = {.name="apple"}; appears outside of a function, then it creates an object with static storage duration. It will exist for the duration of program execution.
Instead of struct person* p_tr = malloc(sizeof(struct person));, it is preferable to use struct person *p_tr = malloc(sizeof *p_tr);. With the former, a change to the p_tr requires edits in two places, which allows a human opportunity to make mistakes. With the latter, changing the type of p_tr in just one place will still result in the correct size being requested.
struct person p = {.name="apple"};
^This is Automatic allocation for a variable/instance of type person.
struct person* p_tr = malloc(sizeof(person));
^This is dynamic allocation for a variable/instance of type person.
Static memory allocation occurs at Compile Time.
Dynamic memory allocation means it allocates memory at runtime when the program executes that line of instruction
Judging by your comments, you are interested in when to use one or the other. Note that all types of allocation reserve a computer memory sufficient to fit the value of the variable in it. The size depends on the type of the variable. Statically allocated variables are pined to a place in the memory by the compiler. Automatically allocated variables are pinned to a place in stack by the same compiler. Dynamically allocated variables do not exist before the program starts and do not have any place in memory till they are allocated by 'malloc' or other functions.
All named variables are allocated statically or automatically. Dynamic variables are allocated by the program, but in order to be able to access them, one still needs a named variable, which is a pointer. A pointer is a variable which is big enough to keep an address of another variable. The latter could be allocated dynamically or statically or automatically.
The question is, what to do if your program does not know the number of objects it needs to use during the execution time. For example, what if you read some data from a file and create a dynamic struct, like a list or a tree in your program. You do not know exactly how many members of such a struct you would have. This is the main use for the dynamically allocated variables. You can create as many of them as needed and put all on the list. In the simplest case you only need one named variable which points to the beginning of the list to know about all of the objects on the list.
Another interesting use is when you return a complex struct from a function. If allocated automatically on the stack, it will cease to exist after returning from the function. Dynamically allocated data will be persistent till it is explicitly freed. So, using the dynamic allocation would help here.
There are other uses as well.
In your simple example there is no much difference between both cases. The second requires additional computer operations, call to the 'malloc' function to allocate the memory for your struct. Whether in the first case the memory for the struct is allocated in a static program region defined at the program start up time. Note that the pointer in the second case also allocated statically. It just keeps the address of the memory region for the struct.
Also, as a general rule, the dynamically allocated data should be eventually freed by the 'free' function. You cannot free the static data.
I created a struct like the following:
typedef struct header{
int hc;
char src[18];
char dst=[18];
char reason[15];
char d[3];
char m[3];
char y[4];
struct measurements{
char h_ip[17];
int h_ttl;
int h_id;
float h_rtt;
}HOPS[100];
}HEADER;
INSIDE MAIN:
HEADER *head;
for(...){
head=(HEADER*) malloc(sizeof(HEADER));
.....
free(head);
}
Will the above malloc automatically allocate memory for the inner struct as well? Also, I'm facing a weird problem here. After I free the header, I'm still able to print the values of head->HOPS[i].h_ip. Should I explicitly free the inner struct as well so that even the values get cleared?
Yes, it allocates memory for the inner structure. And you need not free the inner structure separately.
If you have a pointer defined inside your structure, in that case you have to allocate separately for that pointer member of the structure and free that separately.
Consider freeing memory as a black box. All what you know is that after freeing you shouldn't refer to freed memory.
You may find that that memory block still exists and still contains some old values. That's ok: it just was marked as freed and probably it will be used again soon by allocator.
For example when you call malloc again and realized that just allocated block contains values from the old structure. It happens and that's alright. Just use this block as usually.
So, after the problem with the wrong declaration of head was resolved:
free returns a previously allocated memory block to the heap. It does not clear anything (for performance reasons). However, you are not supposed to access that block anymore afterwards. Doing so results in undefined behaviour and might let your computer fly out of the window.
Worst that can happen is ... nothing ... Yes, you might even not notice anything strang happens. However, that does not mean your program run correctly, it just does not show any symptoms.
To catch illegal accesses, you might set the pointer to NULL once you freed the object it points to. Some operating systems catch accesses to addresses near the null pointer address, but there is no guarantee. It is a good practice anyway and does no harm.
For your other question: malloc allocates a block of memory large enough to store that many bytes you passed as argument. If it cannot, it will return a null pointer. You should always check if malloc & friends returned a valid pointer (i.e. not a null pointer).
int *p = malloc(sizeof(int));
if ( p == NULL ) {
error: out of memory
}
...
Notice the omission of the cast of the result of malloc. In C you should not cast void * as returned by malloc & friends (but also elsewhere). As much as you did not for free(head). Both take the same type: void *, btw. (so why cast one and not the other?). Note that in C any object pointer can freely be assigned to/from void * without cast. Warning functions are no objects in the C standard!
Finally: sizeof(HEADER) returns the size of the struct. Of course that include all fields. A nested struct is a field. A pointer to another struct is a field. For the latter, however note: the pointer itself is a field, but not what it points to! If that was another struct, you have to malloc that seperately **and also free seperately (remember what I wrote above).
But as you do not have pointer inside your struct, that is not your problem here. (keep it in mind, if you continue programming, you will eventually need that!)
I have been churning through C for the last several months. In an effort to learn the language, the project is an arithmetic parser - formulas, variables, etc.
I recently decided to go ahead and work out garbage collection because I have a lot of calls to this method:
char* read_token(const Source* source, const Token* token) {
int szWord = token->t_L + 1; // +1 for NULL terminator
char* word = (char*)malloc(sizeof(char)*szWord);
memset(word, '\0', sizeof(char)*(szWord));
char* p_T = source->p_Src + token->t_S;
memcpy(word, p_T, token->t_L);
return word;
}
... which means calling free(...) quite a bit.
The Source struct has two buffer properties among others:
typedef struct source Source;
struct source {
// ...
char* p_Src; // malloc'd source buffer
int srcLen;
Token* p_tokens; // malloc'd Token buffer
// ...
};
The Token struct has start and length properties:
typedef struct token Token;
struct token {
int t_S; // buffer start index
int t_L; // token length
};
In addition, since there could be many sources, a Source* buffer is malloc'd.
When a buffer is malloc'd, the size of the struct is provided (* numStructs). But if a given struct has a buffer that may be allocated at a later time, such as Token*, does that change the size of Source? Is the code in danger of overwriting previously allocated memory?
For some reason I was getting the idea that all of the memory used for a struct, including any buffers, is allocated in a linear manner. If Token* buffer in struct is allocated to 10 tokens, that space is not then linearly allocated within the Source struct right?
The pointer members in your struct are variables that store addresses of memory blocks and as you state yourself pointers and pointees are allocated independently. Hence those buffers might be located just next to where their 'parent' struct is stored, or not (and most probably won't).
If it is needed, ensuring contiguous storage of the struct members and its pointed buffers can be achieved by allocating everything in one call to the *alloc function.
This can be done
using fixed-size buffers: not really convenient since any flexibility on the buffer sizes is lost. Also note that declaring this updates the value of sizeof(struct foo) accordingly.
using C99's flexible array member or tricks to enable the feature in pre-C99 C: Allocate Pointer and pointee at once .
using not recommended hacks resorting on pointer arithmetic, watching out for the compiler's alignment policy.
A pointer in a struct is a fixed size, regardless of what it is pointing at, even if it is uninitialised. That way, sizeof(struct token) is a fixed length.
When malloc is used, the memory is taken somewhere from the heap, we know not where, and should not care either. It is highly unlikely that memory will be allocated anywhere near the original struct, and even if it was, that would be implementation specific and you could not count on that behaviour.
Obviously (?) you should call free() on the pointer before the struct that it lives in is destroyed.
Also note C99's Variable Length Arrays (VLAs).
Sorry, i'm writing this in the "Answer" section instead of "Comments" section because my Stackoverflow reputation isn't high enough yet.
What i was about to comment is Why don't you just use this line:
char word[ sizeof(char)*szWord ];
instead of
char* word = (char*)malloc(sizeof(char)*szWord); ?
In C, it is possible for functions to return pointers to memory that that function dynamically-allocated and require the calling code to free it. It's also common to require that the calling code supplies a buffer to a second function, which then sets the contents of that buffer. For example:
struct mystruct {
int a;
char *b;
};
struct mystruct *get_a_struct(int a, char*b)
{
struct mystruct *m = malloc(sizeof(struct mystruct));
m->a = a;
m->b = b;
return m;
}
int init_a_struct(int a, char*b, struct mystruct *s)
{
int success = 0;
if (a < 10) {
success = 1;
s->a = a;
s->b = b;
}
return success;
}
Is one or the other method preferable? I can think of arguments for both: for the get_a_struct method the calling code is simplified because it only needs to free() the returned struct; for the init_a_struct method there is a very low likelihood that the calling code will fail to free() dynamically-allocated memory since the calling code itself probably allocated it.
It depends on the specific situation but in general supplying the allocated buffer seems to be preferable.
As mentioned by Jim, DLLs can cause problems if called function allocates memory. That would be the case if you decide to distribute the code as a Dll and get_a_struct is exported to/is visible by the users of the DLL. Then the users have to figure out, hopefully from documentation, if they should free the memory using free, delete or other OS specific function. Furthermore, even if they use the correct function to free the memory they might be using a different version of the C/C++ runtime. This can lead to bugs that are rather hard to find. Check this Raymond Chen post or search for "memory allocation dll boundaries". The typical solution is export from the DLL your own free function. So you will have the pair: get_a_struct/release_a_struct.
In the other hand, sometimes only the called function knows the amount of memory that needs to be allocated. In this case it makes more sense for the called function to do the allocation. If that is not possible, say because of the DLL boundary issue, a typical albeit ugly solution is to provide a mechanism to find this information. For example in Windows the GetCurrentDirectory function will return the required buffer size if you pass 0 and NULL as its parameters.
I think that providing the already allocated struct as an argument is preferable, because in most cases you wouldn't need to call malloc/calloc in the calling code, and therefore worrying about free'ing it. Example:
int init_struct(struct some_struct *ss, args...)
{
// init ss
}
int main()
{
struct some_struct foo;
init_struct(&foo, some_args...);
// free is not needed
}
The "pass an pointer in is preferred", unless it's absolutely required that every object is a "new object allocated from the heap" for some logistical reason - e.g. it's going to be put into a linked list as a node, and the linked-list handler will eventually destroy the elements by calling free - or some other situation where "all things created from here will go to free later on).
Note that "not calling malloc" is always the preferred solution if possible. Not only does calling malloc take some time, it also means that some place, you will have to call free on the allocated memory, and every allocated object takes several bytes (typically 12-40 bytes) of "overhead" - so allocating space for small objects is definitely wasteful.
I agree with other answers that passing the allocated struct is preferred, but there is one situation where returning a pointer may be preferred.
In case you need to explicitly free some resource at the end (close a file or socket, or free some memory internal to the struct, or join a thread, or something else that would require a destructor in C++), I think it may be better to allocate internally, then returning the pointer.
I think it so because, in C, it means some kind of a contract: if you allocate your own struct, you shouldn't have to do anything to destroy it, and it be automatically cleared at the end of the function. On the other hand, if you received some dynamically allocated pointer, you feel compelled to call something to destroy it at the end, and this destroy_a_struct function is where you will put the other cleanup tasks needed, alongside free.