I've read and understand the concepts behind the binary buddies approach to memory allocation, and I'm trying to put it to work in C but I have a few implementation specific questions before I can really get started.
https://drive.google.com/file/d/0BxJX9LHXUU59OWZ6ZmhvV1lBX2M/view?usp=sharing
- This is a link to the assignment specification, my question pertains to problem 5.
The problem specifies that one call to malloc is to be made at the initialization of the allocator, and all requests for memory must be serviced using the space acquired from this call.
It's clear that the initial pointer to this space must be incremented in some way when a call to get_memory() is made, and the new pointer will be returned to the calling process. How can I increment the pointer by a specific number of bytes?
I understand that free lists for each block size must be kept, but I'm unsure exactly how these will be initialized and maintained. What is stored in the free list exactly? The memory pointer?
I apologize if these questions have been asked before, I haven't found a relevant question that provided enough clarity for me to get working.
For your first question, you just have to increment your pointer like a normal variable.
The value of a pointer corresponds to the address in memory of the data it points to. By incrementing it by, say 10, you actually move 10 bytes further into your memory.
As for the free list, malloc() creates a structure contingent with the allocated memory block containing informations such as the address of the memory block,its size, and whether it is free or not.
You goal is to create these structures so you can keep track of the status the different memory blocks you have allocated or free with your get_memory() and release_memory() function.
You might also find this useful : https://stackoverflow.com/a/1957125/4758798
It's clear that the initial pointer to this space must be incremented in some way when a call to get_memory() is made, and the new pointer will be returned to the calling process. How can I increment the pointer by a specific number of bytes?
When you call get_memory(), you will return a pointer to the main memory added to some offset. The word 'increment' implies that you are going to change the value of the initial pointer, which you should not do.
Here is some simple code of me subaddressing one big memory block.
#include <stdlib.h>
#include <stdio.h>
int main (void)
{
// Allocate a block of memory
void * memory_block = malloc (512);
// Now "Split" that memory into two halves.
void * first_half = memory_block;
void * second_half = memory_block + 256;
// We can even keep splitting...
void * second_first_half = second_half;
void * second_second_half = second_half + 128;
// Note that this splitting doesn't actually change the main memory block.
// We're just bookmarking locations in it.
printf ("memory_block %p\n", memory_block);
printf ("first_half %p\n", first_half);
printf ("second_half %p\n", second_half);
printf ("second_first_half %p\n", second_first_half);
printf ("second_second_half %p\n", second_second_half);
return 0;
}
I understand that free lists for each block size must be kept, but I'm unsure exactly how these will be initialized and maintained. What is stored in the free list exactly? The memory pointer?
At a minimum, you probably want to keep track of the memory pointer and the size of that memory block, so something like this...
typedef struct memory_block {
void * memory;
size_t size;
} memory_block_t;
There are other ways to represent this though. For example, you get equivalent information by keeping track of their memory offsets relative to the global malloc. I would suggest treating memory as a set of offsets like this:
void * global_memory; // Assigned by start_memory()
// Functionally equivalent to the above struct
// memory = global_memory + begin;
// size = end - begin;
typedef struct memory_block {
size_t begin;
size_t end;
} memory_block_t;
There are multiple approaches to this difficult problem.
Related
I am putting together a project in C where I must pass around a variable length byte sequence, but I'm trying to limit malloc calls due to potentially limited heap.
Say I have a struct, my_struct, that contains the variable length byte sequence, ptr, and a function, my_func, that creates an instance of my_struct. In my_func, my_struct.ptr is malloc'd and my_struct is returned by value. my_struct will then be used by other functions being passed by value: another_func. Code below.
Is this "safe" to do against memory leaks provided somewhere on the original or any copy of my_struct when passed by value, I call my_struct_destroy or free the malloc'd pointer? Specifically, is there any way that when another_func returns, that inst.ptr is open to being rewritten or dangling?
Since stackoverflow doesn't like opinion-based questions, are there any good references that discuss this behavior? I'm not sure what to search for.
typedef struct {
char * ptr;
} my_struct;
// allocates n bytes to pointer in structure and initializes.
my_struct my_func(size_t n) {
my_struct out = {(char *) malloc(n)};
/* initialization of out.ptr */
return out;
}
void another_func(my_struct inst) {
/*
do something using the passed-by-value inst
are there problems with inst.ptr here or after this function returns?
*/
}
void my_struct_destroy(my_struct * ms_ptr) {
free(ms_ptr->ptr);
ms_ptr->ptr = NULL;
}
int main() {
my_struct inst = my_func(20);
another_func(inst);
my_struct_destroy(&inst);
}
I's safe to pass and return a struct containing a pointer by value as you did it. It contains a copy of ptr. Nothing is changed in the calling function. There would, of course, be a big problem if another_func frees ptr and then the caller tries to use it or free it again.
Locality of alloc+free is a best practice. Wherever possible, make the function that allocates an object also responsible for freeing it. Where that's not feasible, malloc and free of the same object should be in the same source file. Where that's not possible (think complex graph data structure with deletes), the collection of files that manage objects of a given type should be clearly identified and conventions documented. There's a common technique useful for programs (like compilers) that work in stages where much of the memory allocated in one stage should be freed before the next starts. Here, memory is only malloced in big blocks by a manager. From these, the manager allocs objects of any size. But it knows only one way to free: all at once, presumably at the end of a stage. This is a gcc idea: obstacks. When allocation is more complex, bigger systems implement some kind of garbage collector. Beyond these ideas, there are as many ways to manage C storage as there are colors. Sorry I don't have any pointers to references (pun intended :)
If you only have one variable-length field and its size doesn't need to be dynamically updated, consider making the last field in the struct an array to hold it. This is okay with the C standard:
typedef struct {
... other fields
char a[1]; // variable length
} my_struct;
my_struct my_func(size_t n) {
my_struct *p = malloc(sizeof *p + (n - 1) * sizeof p->a[0]);
... initialize fields of p
return p;
}
This avoids the need to separately free the variable length field. Unfortunately it only works for one.
If you're okay with gcc extensions, you can allocate the array with size zero. In C 99, you can get the same effect with a[]. This avoids the - 1 in the size calculation.
I've recently started learning C and currently I'm working on a project which involves implementing a struct with two variables and I don't really know how to apparoach this.
The gist of it is I need to implement a struct which contains two variables, a pointer to an int array AND an int value which indicates the number of elements conatained within the array. The size of the array is declared upon the invocation of the constructor and is dependent on the input.
For the constructor I'm using a different function which recieves a string
as input which is encoded into a decimal code. Also this function recieves another input
which is a pointer to an int array (the pointer defined in the struct) and the problem is I'm using the malloc() function to allocate memory for it but I dont really understand how and when to use the free() function properly.
So, the questions are:
When am I supposed to free the allocated memory? (assuming I need this struct for later use throughout the program's running time)
What are the best ways to avoid memory leaks? What should you look out for?
It's unclear whether you're expected to manage the memory of the array inside, but this is functionally the setup you need for allocating the containing structure.
#include <malloc.h>
#include <stdio.h>
struct my_struct {
size_t num_entries;
int *array;
};
int main() {
struct my_struct *storage = malloc(sizeof(struct my_struct));
storage->num_entries = 4;
storage->array = malloc(sizeof(int) * storage->num_entries);
storage->array[0] = 1;
storage->array[3] = 2;
printf("allocated %ld entries\n", storage->num_entries);
printf("entry #4 (index=3): %d\n", storage->array[3]);
free(storage->array); /* MUST be first! */
free(storage);
storage = 0; /* safety to ensure you can't read the freed memory later */
}
if you're responsible for freeing the internal storage array, then you must free it first before freeing the containing memory.
The biggest key to memory management: only one part of the code at any time "owns" the memory in the pointer and is responsible for freeing it or passing it to something else that will.
I am learning the basic of C programming and find it quite strange and difficult. Especially with dynamic memory allocation, pointer and similar things. I came upon this function and don't quite understand what is wrong with it.
char *strdup(const char *p)
{
char *q;
strcpy(q, p);
return q;
}
I think I have to malloc and free q. But the function " return q". Doesn't that mean that it will store q value in its own memory. So the data still be saved after the function executed?
When it is appropriate to use malloc? As I understand so far is that I have to malloc a new variable every time I need that variable declared in a function to be used elsewhere. Is that true? Is there any other situation where malloc is needed?
The type of q is a pointer, and pointers hold addresses -- so what you are returning is the address that pointer holds.
Until you give that pointer a valid address, it points off to who-knows-where, memory that you may or may not own and have the right to access. So, the strdup call will copy a string from the address held in p into some location you probably don't own.
If you had done a malloc first, and given q the results of the malloc, then q would hold a valid address, and your strdup would put the copy into memory that you did own (assuming you malloc'd enough space for the string -- a strlen on p would tell you how much you needed).
Then, when you returned q, you would be giving the caller the address as well. Any code with that address can see the string you put there. If some future code were to free that address, then what it holds is up in the air -- it could be anything at all.
So, you don't want to free q before you return the address that it holds -- you need to let the caller free the address it gets from you, when it is ready to do so.
In terms of when you malloc, yes, if you want to return an address that will remain viable after your function completes, you need to malloc it -- giving the caller the address of a local variable, for example, would be bad: the memory is freed when the function returns, you don't own it anymore.
Another typical use of malloc is for building up dynamic data structures like trees and lists -- you can't know how much memory you need up front, so you build the list or tree up as you need to, malloc'ing more memory for each node in the structure.
My personal rules are: use malloc() when an object is too big to put on the stack or/and when it has to live outside the scope of the current block. In your case, I believe, you should do something like the following:
char *strdup(const char *p)
{
char *q;
q = malloc(strlen(p) + 1);
if(NULL != q)
strcpy(q, p);
return q;
}
malloc(X) creates space (of size X bytes) on the heap for you to play with. The data that you write to these X byes stays put when your function returns, as a result, you can read what your strdup() function wrote to that space on the heap.
free() is used for freeing space on the heap. You can pass a pointer to this function that you obtained only as a result of a malloc() or realloc() call. This function may not clear out the data, it just means that a subsequent call to malloc() may return the address of the same space that you just freed. I say "may" because these are all implementaion defined and should not be relied upon.
In the piece of code you wrote, the strcpy() function copies the bytes one by one from p to q until it finds a \0 at the location pointed to by q (and then it copies the \0 as well). In order to write data somewhere, you need to allocate space first for the data to be written, hence, one option is that you call malloc() to create some space and then write data there.
Well, calling free() is not mandatory as your OS will reclaim the space allocated by malloc() when you program ends, but as long as your program runs, you may be ocupying more space than you need to - and that's bad for your program, for other programs, for the OS and the universe as a whole.
I think I have to malloc and free q. But the function " return q". Doesn't that mean that it will store q value in its own memory. So the data still be saved after the function executed?
No, your data won't be saved. In fact your pointer q is being used without allocating it's size can cause problems. Also, once this function execution complete, variable char* q will be destroyed.
You need to allocate memory to pointer q before copying data as suggested by #Michael's answer.But once you finish using the data return by this function, you will need to manually free() the memory you allocated or else it will cause memory leak (a situation where the memory is allocated but there is no pointer refer to that chunk of memory you allocated and hence will be inaccessible throughout program execution)
char *strdup(const char *p) // From #Michael's answer
{
char *q;
q = malloc(strlen(p) + 1);
if(NULL != q)
strcpy(q, p);
return q;
}
void someFunction()
{
char* aDupString = strdup("Hello World!!!");
/*... somecode use aDupString */
free(aDupString); // If not freed, will cause memory leaks
}
When it is appropriate to use malloc? Is there any other situation where malloc is needed?
It is appropriate to use in following situations:
1> The usage size of array are unknown at compile time.
2> You need size flexibility. For example, your function need to work with small data size and large data size. (like Data structure such as Link list,Stacks, Queues, etc.)
As I understand so far is that I have to malloc a new variable every time I need that variable declared in a function to be used elsewhere. Is that true?
I think this one is partially true. depending on what you are trying to achive, there might be a way to get around using malloc though. For example, your strdup can also be rewrite in following way:
void strdup2(const char *p, char* strOut)
{
// malloc not require
strcpy(strOut, p);
}
void someFunction()
{
char aString[15] = "Hello World!!!";
char aDupStr[sizeof(aString)];
strdup2(aString, aDupStr);
// free() memory not required. but size is not dynamic.
}
Suppose I have a block of memory as such:
void *block = malloc(sizeof(void *) + size);
How do I set a pointer to the beginning of the block while still being able to access the rest of the reserved space? For this reason, I do not want to simply assign 'block' to another pointer or NULL.
How do I set the first two bytes of the block as NULL or have it point somewhere?
This doesn't make any sense unless you're running on a 16-bit machine.
Based on the way that you're calling malloc(), you're planning to have the first N bytes be a pointer to something else (where N may be 2, 4, or 8 depending on whether you're running on a 16-, 32-, or 64-bit architecture). Is this what you really want to do?
If it is, then you can create use a pointer-to-a-pointer approach (recognizing that you can't actually use a void* to change anything, but I don't want to confuse matters by introducing a real type):
void** ptr = block;
However, it would be far more elegant to define your block with a struct (this may contain syntax errors; I haven't run it through a compiler):
typedef struct {
void* ptr; /* replace void* with whatever your pointer type really is */
char[1] data; } MY_STRUCT;
MY_STRUCT* block = malloc(sizeof(MY_STRUCT) + additional);
block->ptr = /* something */
memset(block, 0, 2);
memset can be found in string.h
Putting the first two bytes of the allocated memory block to 0 is easy. There is many ways to do it, for example:
((char*)block)[0] = 0;
((char*)block)[1] = 0;
Now, the way the question is asked show some misunderstanding.
You can put anything in the first two bytes of your allocated block, it doesn't change anything for accessing the following bytes. The only difference is that C string manipulation operator use as a convention that strings end with a 0 byte. Then if you do things like strcpy((char*)block, target) it will stop copying immediately if the first byte is a zero. But you can still do strcpy((char*)block+2, target).
Now if you want to store a pointer a the beginning of the block (and usually it's not 2 bytes).
You can do the same thing as above but using void* instead of char.
((void**)block)[0] = your_pointer;
You access the rest of the block as you like, just get it's address and go on. You could do it for example with.
void * pointer_to_rest = &((void**)block)[1];
PS: I do not recommand such pointer games. They are very error prone. Your best move would probably be to follow the struct method proposed by #Anon.
void *block = malloc(sizeof(void *) + size); // allocate block
void *ptr = NULL; // some pointer
memcpy(block, &ptr, sizeof(void *)); // copy pointer to start of block
I have a guess at what you're trying to ask, but your wording is so confusing that I could be totally wrong. I am assuming that you want a pointer that points to the "first 2 bytes" of the block you allocated, and then another pointer that points to the rest of the block.
Pointers carry no information about the size of the memory block that they point to, so you can do this:
void *block = malloc(sizeof(void *) + size);
void *first_two_bytes = block;
void *rest_of_block = ((char*)block)+2;
Now, first_two_bytes points to the beginning of the block that you allocated, and you should just treat it as if it pointed to a memory area 2 bytes long.
And rest_of_block points to the portion of the block starting 3 bytes in, and you should treat it as if it pointed to a memory area 2 bytes smaller than what you allocated.
Note, however, that this is still only a single allocation, and you should only free the block pointer. If you free all three pointers, you will corrupt the heap, since you will be calling free more than once on the same block.
While implementing a map interface using a hash table I faced a similar issue, where each key-value pair (both of which are not statically sized, omitting the option of defining a compile-time struct) had to be stored in block of heap memory that also included a pointer to the next element in a linked list (should the blocks be chained in the event that more than one is hashed to the same index in the hash table array). Leaving space for the pointer at the beginning of the block, I found that the solution mentioned by kriss:
((void**)block)[0] = your_pointer;
where you cast the pointer to the block as an array, and then use the bracket syntax to handle pointer arithmetic and dereferencing, was the cleanest solution for copying a new value into this pointer "field" of the block.
Someone here recently pointed out to me in a piece of code of mine I am using
char* name = malloc(256*sizeof(char));
// more code
free(name);
I was under the impression that this way of setting up an array was identical to using
char name[256];
and that both ways would require the use of free(). Am I wrong and if so could someone please explain in low level terms what the difference is?
In the first code, the memory is dynamically allocated on the heap. That memory needs to be freed with free(). Its lifetime is arbitrary: it can cross function boundaries, etc.
In the second code, the 256 bytes are allocated on the stack, and are automatically reclaimed when the function returns (or at program termination if it is outside all functions). So you don't have to (and cannot) call free() on it. It can't leak, but it also won't live beyond the end of the function.
Choose between the two based on the requirements for the memory.
Addendum (Pax):
If I may add to this, Ned, most implementations will typically provide more heap than stack (at least by default). This won't typically matter for 256 bytes unless you're already running out of stack or doing heavily recursive stuff.
Also, sizeof(char) is always 1 according to the standard so you don't need that superfluous multiply. Even though the compiler will probably optimize it away, it makes the code ugly IMNSHO.
End addendum (Pax).
and that both ways would require the use of free().
No, only the first needs the use of a free. The second is allocated on the stack. That makes it incredibly fast to allocate. Look here:
void doit() {
/* ... */
/* SP += 10 * sizeof(int) */
int a[10];
/* ... (using a) */
} /* SP -= 10 */
When you create it, the compiler at compile time knows its size and will allocate the right size at the stack for it. The stack is a large chunk of continuous memory located somewhere. Putting something at the stack will just increment (or decrement depending on your platform) the stackpointer. Going out of scope will do the reverse, and your array is freed. That will happen automatically. Therefor variables created that way have automatic storage duration.
Using malloc is different. It will order some arbitrary large memory chunk (from a place called freestore). The runtime will have to lookup a reasonably large block of memory. The size can be determined at runtime, so the compiler generally cannot optimize it at compile time. Because the pointer can go out of scope, or be copied around, there is no inherent coupling between the memory allocated, and the pointer to which the memory address is assigned, so the memory is still allocated even if you have left the function long ago. You have to call free passing it the address you got from malloc manually if the time has come to do so.
Some "recent" form of C, called C99, allows you to give arrays an runtime size. I.e you are allowed to do:
void doit(int n) {
int a[n]; // allocate n * sizeof(int) size on the stack */
}
But that feature should better be avoided if you don't have a reason to use it. One reason is that it's not failsafe: If no memory is available anymore, anything can happen. Another is that C99 is not very portable among compilers.
There is a third possibility here, which is that the array can be declared external to a function, but statically, eg,
// file foo.c
char name[256];
int foo() {
// do something here.
}
I was rather surprised in answers to another question on SO that someone felt this was inappropriate in C; here's it's not even mentioned, and I'm a little confused and surprised (like "what are they teaching kids in school these days?") about this.
If you use this definition, the memory is allocated statically, neither on the heap nor the stack, but in data space in the image. Thus is neither must be managed as with malloc/free, nor do you have to worry about the address being reused as you would with an auto definition.
It's useful to recall the whole "declared" vs "defined" thing here. Here's an example
/* example.c */
char buf1[256] ; /* declared extern, defined in data space */
static char buf2[256] ; /* declared static, defined in data space */
char * buf3 ; /* declared extern, defined one ptr in data space */
int example(int c) { /* c declared here, defined on stack */
char buf4[256] ; /* declared here, defined on stack */
char * buf5 = malloc(256)] /* pointer declared here, defined on stack */
/* and buf4 is address of 256 bytes alloc'd on heap */
buf3 = malloc(256); /* now buf3 contains address of 256 more bytes on heap */
return 0; /* stack unwound; buf4 and buf5 lost. */
/* NOTICE buf4 memory on heap still allocated */
/* so this leaks 256 bytes of memory */
}
Now in a whole different file
/* example2.c */
extern char buf1[]; /* gets the SAME chunk of memory as from example.c */
static char buf2[256]; /* DIFFERENT 256 char buffer than example.c */
extern char * buf3 ; /* Same pointer as from example.c */
void undoes() {
free(buf3); /* this will work as long as example() called first */
return ;
}
This is incorrect - the array declaration does not require a free. Further, if this is within a function, it is allocated on the stack (if memory serves) and is automatically released with the function returns - don't pass a reference to it back the caller!
Break down your statement
char* name = malloc(256*sizeof(char)); // one statement
char *name; // Step 1 declare pointer to character
name = malloc(256*sizeof(char)); // assign address to pointer of memory from heap
name[2]; // access 3rd item in array
*(name+2); // access 3rd item in array
name++; // move name to item 1
Translation: name is now a pointer to character which is assigned the address of some memory on the heap
char name[256]; // declare an array on the stack
name++; // error name is a constant pointer
*(name+2); // access 3rd item in array
name[2]; // access 3rd item in array
char *p = name;
p[2]; // access 3rd item in array
*(p+2); // access 3rd item in array
p++; // move p to item 1
p[0]; // item 1 in array
Translation: Name is a constant pointer to a character that points to some memory on the stack
In C arrays and pointers are the same thing more or less. Arrays are constant pointers to memory. The main difference is that when you call malloc you take your memory from the heap and any memory taken from the heap must be freed from the heap. When you declare the array with a size it is assigned memory from the stack. You can't free this memory because free is made to free memory from the heap. The memory on the stack will automatically be freed when the current program unit returns. In the second example free(p) would be an error also. p is a pointer the name array on the stack. So by freeing p you are attempting to free the memory on the stack.
This is no different from:
int n = 10;
int *p = &n;
freeing p in this case would be an error because p points to n which is a variable on the stack. Therefore p holds a memory location in the stack and cannot be freed.
int *p = (int *) malloc(sizeof(int));
*p = 10;
free(p);
in this case the free is correct because p points to a memory location on the heap which was allocated by malloc.
depending on where you are running this, stack space might be at a HUGE premium. If, for example, you're writing BREW code for Verizon/Alltel handsets, you are generally restricted to miniscule stacks but have ever increasing heap access.
Also, as char[] are most often used for strings, it's not a bad idea to allow the string constructing method to allocate the memory it needs for the string in question, rather than hope that for ever and always 256 (or whatever number you decree) will suffice.