Implementing Explicit free lists memory allocation - c

As part of an assignment, we have to implement a (basic) malloc function (that is, we should somehow simulate dynamic memory allocation). I already implemented a solution based on an implicit free list, but the problem is that I get a utilization of only 50% and a throughput of only 9% (I have to get 90% utilization + throughput). The problem with an implicit free list is that it takes a lot of time searching for free blocks, so I wanted to implement an explicit free list to see how much the program can improve. Now the problem is that I have to keep track of next/prev pointers for free blocks, and since I can only use scalar variables and cannot use any kind of data structure (e.g. linked list, struct, ...), I couldn't implement it. Can someone point out how I can keep track of (virtual) pointers in C?
thanks,

From the images in the slides (linked in the comments), I think you're expected to store the links as integer offsets.
In C, this really is best described as a struct.
struct alloc_cell {
int size;
int forward;
int back;
int data[];
};
Now, he also has the size repeated at the end, and that is much harder to describe with a struct. It's not clear how necessary it is. It's used for boundary coalescing.
Edit: much later, considering comment.
Even if you can't use structs in the code, you can still use them in the pseudocode to help organize your thinking. The struct fields, being all the same type, map naturally to an array representation which can be accessed by a pointer to the first member/element.
struct alloc_cell { // int *cell;
int size; // cell[0] // *cell
int forward; // cell[1]
int back; // cell[2]
int data[]; // (cell+3)[...] // &cell[3]
};
You could even go so far as to give these offsets mnemonic names (but this may be considered overkill and unnecessary obfuscation).
enum { SIZE, FORWARD, BACK, DATA };
cell[SIZE]; // alloc_cell.size
cell[FORWARD]; // alloc_cell.forward
cell[BACK]; // alloc_cell.back
cell+DATA; // &cell[DATA] // alloc_cell.data
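To make the offset idea concrete, here is a minimal sketch of an explicit free list that uses only an int array and scalar variables. Everything beyond the SIZE/FORWARD/BACK/DATA offsets is an assumption made for illustration: the names heap, free_head, NIL, HEAP_WORDS and the fl_* functions are hypothetical, sizes are counted in int words, the policy is first fit with splitting, and there is no coalescing or end-of-block size tag.

```c
#include <assert.h>

enum { SIZE, FORWARD, BACK, DATA };     /* word offsets inside a cell, as above */
enum { HEAP_WORDS = 64, NIL = -1 };     /* hypothetical heap size and "no link" marker */

static int heap[HEAP_WORDS];            /* the simulated heap, one int per word */
static int free_head = NIL;             /* offset of the first free cell */

/* Push the cell at offset c onto the front of the free list. */
static void fl_push(int c)
{
    heap[c + FORWARD] = free_head;
    heap[c + BACK] = NIL;
    if (free_head != NIL)
        heap[free_head + BACK] = c;
    free_head = c;
}

/* Unlink the cell at offset c from the free list. */
static void fl_unlink(int c)
{
    int next = heap[c + FORWARD];
    int prev = heap[c + BACK];
    if (prev == NIL)
        free_head = next;
    else
        heap[prev + FORWARD] = next;
    if (next != NIL)
        heap[next + BACK] = prev;
}

/* Start with a single free cell that covers the whole heap. */
void fl_init(void)
{
    free_head = NIL;
    heap[SIZE] = HEAP_WORDS - DATA;     /* payload words available */
    fl_push(0);
}

/* First fit. Returns the offset of the payload, or NIL if nothing fits. */
int fl_alloc(int words)
{
    assert(words > 0);
    for (int c = free_head; c != NIL; c = heap[c + FORWARD]) {
        if (heap[c + SIZE] < words)
            continue;
        int rest = heap[c + SIZE] - words;
        fl_unlink(c);
        if (rest > DATA) {              /* split: room for a header plus payload */
            int n = c + DATA + words;
            heap[n + SIZE] = rest - DATA;
            fl_push(n);
            heap[c + SIZE] = words;
        }
        return c + DATA;
    }
    return NIL;
}

/* Return a payload offset obtained from fl_alloc to the free list. */
void fl_free(int payload)
{
    fl_push(payload - DATA);
}
```

The "pointers" here are plain int offsets into heap, so only the free cells are ever visited during a search. A real allocator would typically overlay the FORWARD/BACK words with the payload while a cell is in use, and would add the trailing size copy to support coalescing.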

Related

How to store custom objects (struct) in C?

I want to know how to store custom objects (not their pointers) in C. I have created a custom structure called Node
#define MAXQ 100
typedef struct {
int state[MAXQ];
int height;
} Node;
(which works) and I want to store a few of these Nodes in a container (without using pointers, since they are not stored elsewhere) so I can access them later.
The internet seems to suggest something like calloc() so my last attempt was to make a container Neighbors following this example, with numNeighbors being just an integer:
Node Neighbors = (Node*)calloc(numNeighbors, sizeof(Node));
At compilation, I got an error from this line saying
initializing 'Node' with an expression of incompatible type 'void *'
and in places where I referenced to this container (as in Neighbors[i]) I got errors of
subscripted value is not an array, pointer, or vector
Since I'm spoiled by Python, I have no idea if I've got my syntax all wrong (it should tell you something that I'm still not there after scouring a ton of tutorials, docs, and stackoverflows on malloc(), calloc() and the like), or if I am on a completely wrong approach to storing custom objects (searching "store custom objects in C" on the internet gives irrelevant results dealing with iOS and C# so I would really appreciate some help).
EDIT: Thanks for the tips everyone, it finally compiled without errors!
You can create a regular array using your custom struct:
Node Neighbors[10];
You can then reference them like any other array, for example:
Neighbors[3].height = 10;
If your C implementation supports C.1999 style VLA, simply define your array.
Node Neighbors[numNeighbors];
(Note that VLA has no error reporting mechanism. A failed allocation results in undefined behavior, which probably expresses itself as a crash.)
Otherwise, you will need dynamic allocation. calloc is suitable, but it returns a pointer representing the contiguous allocation.
Node *Neighbors = calloc(numNeighbors, sizeof(*Neighbors));
Note, do not cast the result of malloc/calloc/realloc when programming in C. It is not required, and in the worst case, can mask a fatal error.
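As a rough sketch of that calloc pattern with the failure check included, assuming the Node type from the question; the helper names make_neighbors and sum_heights are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAXQ 100

typedef struct {
    int state[MAXQ];
    int height;
} Node;

/* Hypothetical helper: allocate n zero-initialized Nodes, or return NULL. */
Node *make_neighbors(size_t n)
{
    assert(n > 0);
    Node *neighbors = calloc(n, sizeof *neighbors);   /* no cast needed in C */
    if (neighbors == NULL)
        fprintf(stderr, "calloc failed for %zu nodes\n", n);
    return neighbors;
}

/* Example use: index the block like an array, then release it. */
int sum_heights(size_t n)
{
    Node *neighbors = make_neighbors(n);
    if (neighbors == NULL)
        return -1;
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        neighbors[i].height = (int)i;
        total += neighbors[i].height;
    }
    free(neighbors);
    return total;
}
```

Unlike the VLA version, a failed allocation here is reported as a NULL return that the caller can handle.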
I want to store a few of these Nodes in a container (without using pointers, since they are not stored elsewhere) so I can access them later.
If you know the amount of them at compile-time (or at the very least a reasonable maximum); then you can create an array of stack-allocated objects. For instance, say you are OK with a maximum of 10 objects:
#define MAX_NODES 10
Node nodes[MAX_NODES];
int number_nodes = 0;
Then when you add an object, you keep in sync number_nodes (so that you know where to put the next one). Technically, you will always have 10, but you only use the ones you want/need. Removing objects is similar, although more involved if you want to take out some in the middle.
However, if you don't know how many you will have (nor a maximum); or even if you know but they are way too many to fit in the stack; then you are forced to use the heap (typically with malloc() and free()):
int number_nodes; // unknown until runtime or too big
Node * nodes = malloc(sizeof(Node) * number_nodes);
...
free(nodes);
In any case, you will be using pointers in the dynamically allocated memory case, and most probably in the stack case as well.
Python is hiding and doing all this dance for you behind the scenes -- which is quite useful and time saving as you have probably already realized, as long as you do not need precise control over it (read: performance).
malloc and calloc are for dynamic allocation, and they need pointer variables. I don't see any reason for you to use dynamic allocation. Just define a regular array until you have a reason not to.
#define MAXQ 100
#define NUM_NEIGHBORS 50
typedef struct {
int state[MAXQ];
int height;
} Node;
int main(void)
{
Node Neighbors[NUM_NEIGHBORS];
Neighbors[0].state[0] = 0;
Neighbors[0].height = 1;
}
Here NUM_NEIGHBORS needs to be a constant. (Hence static) If you want it to be variable or dynamic, then you need dynamic allocations, and pointers inevitably:
#include <stdlib.h>
#define MAXQ 100
typedef struct {
int state[MAXQ];
int height;
} Node;
int main(void)
{
int numNeighbors = 50;
Node *Neighbors = calloc(numNeighbors, sizeof *Neighbors); /* no cast needed in C */
if (Neighbors == NULL)
return 1; /* allocation failed */
Neighbors[0].state[0] = 0;
Neighbors[0].height = 1;
free(Neighbors);
return 0;
}

Structure Elements On The Heap vs The Stack

So, I am creating a structure that currently needs a lot of memory. I hope to reduce it in the future, but for now, it is what it is. Hence, I need to allocate some of its elements on the heap because I get a stack overflow if they are put on the stack. And yes, I increased the stack size but on the target platform I only have so much.
In this case, would it be 'better' to allocate every structure element on the heap, or put some on the stack and the big stuff on the heap? For instance:
typedef struct my_structure_s{
int bounds[2];
int num_values;
int* values; //needs to be very large
} my_structure_t;
Vs:
typedef struct my_structure_s{
int* bounds;
int* num_values;
int* values;
} my_structure_t;
I know 'better' is largely subjective, and could quite possibly incite a riot here. So, what are the pros and cons of both examples? What do you usually do? Why?
Also, forgive the _s, _t stuff...I know some of you may find it in bad taste but that is the convention for the legacy codebase this will be integrated into.
Thanks everyone!
It is better to keep the simple members as direct values, and allocate just the array. Using the extra two pointers just slows down access for no benefit.
One other option to consider if you have C99 or C11 is to use a flexible array member (FAM).
You'd define your structure using the notation:
typedef struct my_structure_s
{
int bounds[2];
int num_values;
int values[];
} my_structure_t;
You'd allocate enough memory for the structure and an N-element array in values all in a single operation, using:
my_structure_t *np = malloc(sizeof(*np) + N * sizeof(np->values[0]));
This means you only have one block of memory to free.
You can find references to the 'struct hack' if you search. This notation is effectively the standardized form of the struct hack.
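A small sketch of the flexible array member allocation wrapped in a helper, assuming C99 or later. The function name my_structure_create is hypothetical, and zeroing the fields is just one reasonable choice of initial state.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct my_structure_s
{
    int bounds[2];
    int num_values;
    int values[];               /* flexible array member, must be last */
} my_structure_t;

/* Hypothetical helper: one malloc for the header plus n values. */
my_structure_t *my_structure_create(int n)
{
    assert(n >= 0);
    my_structure_t *np = malloc(sizeof *np + (size_t)n * sizeof np->values[0]);
    if (np == NULL)
        return NULL;
    np->bounds[0] = 0;
    np->bounds[1] = 0;
    np->num_values = n;
    for (int i = 0; i < n; i++)
        np->values[i] = 0;
    return np;
}
```

A single free(np) releases both the header fields and the values array, since they live in the same block.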
In comments, the discussion continued:
This is an interesting approach; however, I can't guarantee I will have C99.
If need be, you can use the 'struct hack' version of the code, which would look like:
typedef struct my_structure_s
{
int bounds[2];
int num_values;
int values[1];
} my_structure_t;
The rest of the code remains unchanged. This uses slightly more memory (4-8 bytes more) than the FAM solution, and isn't strictly supported by the standard, but it was used extensively before the C99 standard so it is unlikely that a compiler would invalidate such code.
Okay, but how about:
typedef struct my_structure_s
{
int bounds[2];
int num_values;
int values[MAX_SIZE];
} my_structure_t;
And then: my_structure_t *the_structure = malloc(sizeof(my_structure_t));
This will also give me a fixed block size on the heap right? (Except here, my block size will be bigger than it needs to be, in some instances, because I won't always get to MAX_SIZE).
If there is not too much wasted space on average, then the fixed-size array in the structure is simpler still. Further, it means that if the MAX_SIZE is not too huge, you can allocate on the stack or on the heap, whereas the FAM approach mandates dynamic (heap) allocation. The issue is whether the wasted space is enough of a problem, and what you do if MAX_SIZE isn't big enough after all. Otherwise, this is much the simplest approach; I simply assumed you'd already ruled it out.
Note that every one of the suggested solutions avoids the pointers to bounds and num_values suggested in option 2 in the question.
Do the first one. It is simpler and less error-prone (with the second one you have to remember to allocate and release more things).
BTW, note that the first example will not necessarily put num_values on the stack. It will go wherever you allocate the struct: stack, heap, or static storage.

Trie, C code. Low efficiency?

I saw some people use this structure for Trie nodes:
struct trie_node_st {
int count;
struct trie_node_st *next[TREE_WIDTH];
};
Isn't this inefficient, since we don't always need the full TREE_WIDTH length for each array?
Or am I misunderstanding something?
It's a CPU/memory trade-off. By allocating the array up front you use a certain minimum amount of memory per node to store those pointers (TREE_WIDTH * sizeof (struct trie_node_st *) bytes), but you use less CPU later because the space for the array is laid out at compile time (unless you allocate the struct with malloc()). However, this is hardly an overhead. Even if you had a ton of pointers, it wouldn't matter. The likely design decision was just that the programmer did not feel like having to dynamically allocate an array of pointers to struct trie_node_st each time he used this structure.
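To illustrate what the fixed-width array buys, here is a minimal sketch of insertion and lookup on that node layout. It assumes TREE_WIDTH is 26 and that keys contain only the letters 'a' to 'z' in a character set where they are contiguous (as in ASCII); the trie_* function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define TREE_WIDTH 26   /* assumption: keys use only 'a'..'z' */

struct trie_node_st {
    int count;
    struct trie_node_st *next[TREE_WIDTH];
};

/* Allocate a node with no children. */
struct trie_node_st *trie_node_new(void)
{
    struct trie_node_st *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->count = 0;
    for (int i = 0; i < TREE_WIDTH; i++)
        node->next[i] = NULL;
    return node;
}

/* Insert word; returns 0 on success, -1 on a bad character or failed allocation. */
int trie_insert(struct trie_node_st *root, const char *word)
{
    assert(root != NULL && word != NULL);
    struct trie_node_st *node = root;
    for (; *word != '\0'; word++) {
        int i = *word - 'a';
        if (i < 0 || i >= TREE_WIDTH)
            return -1;
        if (node->next[i] == NULL && (node->next[i] = trie_node_new()) == NULL)
            return -1;
        node = node->next[i];       /* child lookup is a single array index */
    }
    node->count++;
    return 0;
}

/* Number of times word was inserted (0 if absent). */
int trie_count(const struct trie_node_st *root, const char *word)
{
    const struct trie_node_st *node = root;
    for (; node != NULL && *word != '\0'; word++) {
        int i = *word - 'a';
        if (i < 0 || i >= TREE_WIDTH)
            return 0;
        node = node->next[i];
    }
    return node != NULL ? node->count : 0;
}

/* Free a node and everything below it. */
void trie_free(struct trie_node_st *node)
{
    if (node == NULL)
        return;
    for (int i = 0; i < TREE_WIDTH; i++)
        trie_free(node->next[i]);
    free(node);
}
```

Each node costs TREE_WIDTH pointers whether or not the children exist, but moving to a child never requires a search or a per-node resize.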

Casting a 'BigStruct' to a 'SmallStruct' in C (similar structs with static arrays of different sizes)

Suppose that for some reason you are only allowed to use static memory in a C program.
I have a basic structure that I am using in several places defined as below:
#define SMALL_STUFF_MAX_SIZE 64
typedef struct {
/* Various fields would go here */
...
double data[SMALL_STUFF_MAX_SIZE]; /* array to hold some data */
} SmallStuff;
Now, I have been asked to add a new feature that led to a particular case where I need the same structure but with a much larger array. I can't afford to max out the array of the SmallStuff structure as memory is too tight. So I made a special version of the struct, defined as below, that I eventually cast to a (SmallStuff*) when calling functions that expect a pointer to a SmallStuff structure (the actual size of 'data' is properly handled in these functions).
#define BIG_STUFF_MAX_SIZE 1000000
typedef struct {
/* Various fields, identical to the ones in SmallStuff would go here */
...
double data[BIG_STUFF_MAX_SIZE]; /* array to hold some data */
} BigStuff;
Obviously, the proper way to do it would be to dynamically allocate the memory but as said above I can't use dynamic memory allocation.
Are there any side-effects that I should consider?
Or better ways to deal with this kind of problem?
Thanks in advance.
What you're doing is fine, though it tends to scare people who are uncomfortable with pointers and casting.
The general solution for your problem is to get rid of BigStuff and SmallStuff and make a single Stuff structure with a size member and a double *data that points to an array of your choosing, instead of risking potential miscasts in your code later or having to change your functions when you discover you also need MediumStuff. This gives you the flexibility of using whatever sizes are appropriate.
typedef struct
{
// the usual
size_t data_length;
double *data;
} Stuff;
double bigdata[BIG_STUFF_MAX_SIZE];
Stuff big = { ..., BIG_STUFF_MAX_SIZE, bigdata };
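A sketch of how the single Stuff structure could be used with statically allocated arrays of different sizes. The id field stands in for "the usual" fields and is purely hypothetical, as are the function names stuff_fill and stuff_sum.

```c
#include <assert.h>
#include <stddef.h>

#define SMALL_STUFF_MAX_SIZE 64
#define BIG_STUFF_MAX_SIZE 1000000

typedef struct
{
    int id;                 /* hypothetical stand-in for "the usual" fields */
    size_t data_length;     /* number of elements data points to */
    double *data;
} Stuff;

/* Static storage only: no dynamic allocation is involved. */
static double smalldata[SMALL_STUFF_MAX_SIZE];
static double bigdata[BIG_STUFF_MAX_SIZE];

static Stuff small = { 1, SMALL_STUFF_MAX_SIZE, smalldata };
static Stuff big   = { 2, BIG_STUFF_MAX_SIZE, bigdata };

/* One routine serves every size, because the length travels with the data. */
void stuff_fill(Stuff *s, double value)
{
    assert(s != NULL && s->data != NULL);
    for (size_t i = 0; i < s->data_length; i++)
        s->data[i] = value;
}

double stuff_sum(const Stuff *s)
{
    assert(s != NULL && s->data != NULL);
    double total = 0.0;
    for (size_t i = 0; i < s->data_length; i++)
        total += s->data[i];
    return total;
}
```

No casts are needed, and adding a medium-sized variant later only means declaring another array and another Stuff initializer.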
typedef struct {
/* Various fields would go here */
double data[]; /* a flexible array member (standard since C99) */
} AnySizeStuff;
typedef struct {
AnySizeStuff header;
double smalldata[SMALL_STUFF_MAX_SIZE];
} SmallStuff;
typedef struct {
AnySizeStuff header;
double bigdata[BIG_STUFF_MAX_SIZE];
} BigStuff;
Then if x is either a SmallStuff or BigStuff, you can pass &x.header to routines that can take either. Note that ISO C does not allow a structure containing a flexible array member to be a member of another structure, so this layout relies on a compiler extension (GCC, for example, accepts it).
Although it's ugly code because of the complexity, there should not be any runtime problems because the sizes are hard-coded.
A better way to deal with it is to have algorithms that don't need two separate structs which differ only by size. However, I don't know your application, so you know best how to deal with this problem.

Resizing a char[x] to char[y] at runtime

OK, I hope I explain this one correctly.
I have a struct:
typedef struct _MyData
{
char Data[256];
int Index;
} MyData;
Now, I run into a problem. Most of the time MyData.Data is OK with 256, but in some cases I need to expand the amount of chars it can hold to different sizes.
I can't use a pointer.
Is there any way to resize Data at run time? How?
Code is appreciated.
EDIT 1:
While I am very thankful for all the comments, the "maybe try this...", "do that", or "what you are doing is wrong..." comments are not helping. Code is the help here. Please, if you know the answer, post the code.
Please note that:
I cannot use pointers. Please don't try to figure out why, I just can't.
The struct is being injected into another program's memory that's why no pointers can be used.
Sorry for being a bit rough here, but I asked the question here because I already tried all the different approaches that I thought might work.
Again, I am looking for code. At this point I am not interested in "might work..." or " have you considered this..."
Thank you and my apologies again.
EDIT 2
Why was this set as answered?
You can use a flexible array member
typedef struct _MyData
{
int Index;
char Data[];
} MyData;
So that you can then allocate the right amount of space
MyData *d = malloc(sizeof *d + sizeof(char[100]));
d->Data[0..99] = ...;
Later, you can free, and allocate another chunk of memory and make a pointer to MyData point to it, at which time you will have more / less elements in the flexible array member (realloc). Note that you will have to save the length somewhere, too.
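A rough sketch of the resize step with realloc, assuming the flexible array member layout above. The Length field and the function name mydata_resize are hypothetical additions for illustration (the answer only says the length has to be saved somewhere), and adding a field does change the struct layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct _MyData
{
    int Index;
    size_t Length;      /* hypothetical: current capacity of Data in bytes */
    char Data[];
} MyData;

/* Grow or shrink d so that Data holds new_len bytes.
 * Pass d == NULL to create a new object. Returns NULL on failure,
 * in which case the original d is left untouched. */
MyData *mydata_resize(MyData *d, size_t new_len)
{
    size_t old_len = (d != NULL) ? d->Length : 0;
    MyData *p = realloc(d, sizeof *p + new_len);
    if (p == NULL)
        return NULL;
    if (d == NULL)
        p->Index = 0;
    if (new_len > old_len)      /* zero only the newly added bytes */
        memset(p->Data + old_len, 0, new_len - old_len);
    p->Length = new_len;
    return p;
}
```

realloc preserves the existing contents up to the smaller of the old and new sizes, so the caller only has to replace its pointer with the returned one.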
In pre-C99 times, there was no flexible array member: char Data[] was simply regarded as an array with incomplete type, and the compiler would moan about that. Here I recommend two possible ways out:
Using a pointer: char *Data and make it point to the allocated memory. This won't be as convenient as using the embedded array, because you will possibly need to have two allocations: One for the struct, and one for the memory pointed to by the pointer. You can also have the struct allocated on the stack instead, if the situation in your program allows this.
Using a char Data[1] instead, but treat it as if it were bigger, so that it overlays the whole allocated object. This is formally undefined behavior, but is a common technique, so it's probably safe to use with your compiler.
The problem here is your statement "I can't use a pointer". You will have to, and it will make everything much easier. Hey, realloc even copies your existing data, what do you want more?
So why do you think you can't use a pointer? Better try to fix that.
You would re-arrange the structure like that
typedef struct _MyData
{
int Index;
char Data[256];
} MyData;
And allocate instances with malloc/realloc like that:
my_data = (MyData*) malloc ( sizeof(MyData) + extra_space_needed );
This is an ugly approach and I would not recommend it (I would use pointers), but is an answer to your question how to do it without a pointer.
A limitation is that it allows for only one variable size member per struct, and has to be at the end.
Let me sum up two important points I see in this thread:
The structure is used to interact between two programs through some IPC mechanism
The destination program cannot be changed
You cannot therefore change that structure in any way, because the destination program is stuck trying to read it as currently defined. I'm afraid you are stuck.
You can try to find ways to get the equivalent behavior, or find some evil hack to force the destination program to read a new structure (e.g., modifying the binary offsets in the executable). That's all pretty application specific so I can't give much better guidance than that.
You might consider writing a third program to act as an interface between the two. It can take the "long" messages and do something with them, and pass the "short" messages onward to the old program. You can inject that in between the IPC mechanisms fairly easily.
You may be able to do this like this, without allocating a pointer for the array:
typedef struct _MyData
{
int Index;
char Data[1];
} MyData;
Later, you allocate like this:
int bcount = 256;
MyData *foo;
foo = (MyData *)malloc(sizeof(*foo) + bcount);
realloc:
int newbcount = 512;
MyData *resized_foo;
resized_foo = realloc((void *)foo, sizeof(*foo) + newbcount);
It looks like from what you're saying that you definitely have to keep MyData as a static block of data. In which case I think the only option open to you is to somehow (optionally) chain these data structures together in a way that can be re-assembled by the other process.
You'd need an additional member in MyData, e.g.
typedef struct _MyData
{
int Sequence;
char Data[256];
int Index;
} MyData;
Where Sequence identifies the descending sequence in which to re-assemble the data (a sequence number of zero would indicate the final data buffer).
The problem is in the way you're putting the question. Don't think about C semantics: instead, think like a hacker. Explain exactly how you are currently getting your data into the other process at the right time, and also how the other program knows where the data begins and ends. Is the other program expecting a null-terminated string? If you declare your struct with a char[300] does the other program crash?
You see, when you say "passing data" to the other program, you might be [a] tricking the other process into copying what you put in front of it, [b] tricking the other program into letting you overwrite its normally 'private' memory, or [c] some other approach. No matter which is the case, if the other program can take your larger data, there is a way to get it to them.
I find KIV's trick quite usable. Though, I would suggest investigating the pointer issue first.
If you look at the malloc implementations
(check this IBM article, Listing 5: Pseudo-code for the main allocator),
When you allocate, the memory manager allocates a control header and
then free space following it based on your requested size.
This is very much like saying,
typedef struct _MyData
{
int size;
char Data[1]; // we are going to break the array-bound up-to size length
} MyData;
Now, your problem is,
How do you pass such a (mis-sized?) structure to this other process?
That brings us to the question,
How does the other process figure out the size of this data?
I would expect a length field as part of the communication.
If you have all that, what's wrong with passing a pointer to the other process?
Will the other process identify the difference between a pointer to a
structure and one to allocated memory?
You can't reallocate manually.
You can do some tricks which I was using when I was working on a simple data-holding system (a very simple filesystem).
typedef struct
{
int index ;
char x[250];
} data_ztorage_250_char;
typedef struct
{
int index;
char x[1000];
} data_ztorage_1000_char;
int main(void)
{
char just_raw_data[sizeof(data_ztorage_1000_char)];
data_ztorage_1000_char* big_struct;
data_ztorage_250_char* small_struct;
big_struct = (data_ztorage_1000_char*)just_raw_data; //now you have the big struct
// notice that the line above is the same as writing
// big_struct = (data_ztorage_1000_char*)(&just_raw_data[0]);
small_struct = (data_ztorage_250_char*)just_raw_data; //now you have the small struct
//both structs start at the same location and share the same memory
//(this assumes just_raw_data is suitably aligned for the structs)
//addressing the data is
small_struct -> index = 250;
}
You don't state what the Index value is for.
As I understand it you are passing data to another program using the structure shown.
Is there a reason why you can't break your data to send into chunks of 256bytes and then set the index value accordingly? e.g.
Data is 512 bytes so you send one struct with the first 256 bytes and index=0, then another with the next 256 bytes in your array and Index=1.
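The chunking idea could be sketched as follows, keeping the original 256-byte MyData layout unchanged. The function name split_into_chunks is hypothetical, and using Index as a zero-based chunk number is an assumption, since the question never says what Index means to the receiving program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct _MyData
{
    char Data[256];
    int Index;
} MyData;

/* Split len bytes from src into consecutive MyData chunks.
 * Returns the number of chunks written, or 0 if max_chunks is too small. */
size_t split_into_chunks(const char *src, size_t len, MyData *out, size_t max_chunks)
{
    assert(src != NULL && out != NULL);
    const size_t chunk = sizeof out->Data;
    size_t needed = (len + chunk - 1) / chunk;      /* round up */
    if (needed > max_chunks)
        return 0;
    for (size_t i = 0; i < needed; i++) {
        size_t offset = i * chunk;
        size_t n = (len - offset < chunk) ? len - offset : chunk;
        memset(out[i].Data, 0, chunk);              /* pad a short last chunk */
        memcpy(out[i].Data, src + offset, n);
        out[i].Index = (int)i;                      /* assumed meaning: chunk number */
    }
    return needed;
}
```

The receiver still needs some way to learn how many chunks to expect and how long the last one is, which this sketch leaves open.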
How about a really, really simple solution? Could you do:
typedef struct _MyData
{
char Data[1024];
int Index;
} MyData;
I have a feeling I know your response will be "No, because the other program I don't have control over expects 256 bytes"... And if that is indeed your answer to my answer, then my answer becomes: this is impossible.

Resources