How to avoid multiple deallocation - c

A Scene struct has a pointer to (a linked list of) SceneObjects.
Each SceneObject refers to a Mesh.
Some SceneObjects may however refer to the same Mesh (by sharing the same pointer - or handle, see later - to the Mesh). Meshes are pretty big and doing it this way has obvious advantages for rendering speed.
typedef struct SceneObject {
Mesh *mesh;
...
struct SceneObject *next;
} SceneObject;
typedef struct Scene {
SceneObject *objects;
...
} Scene;
My question:
How do I free a Scene without freeing the same Mesh pointer multiple times?
I thought I could solve this by using a handle to the Mesh (Mesh** mesh_handle) instead of a pointer, so I could set the referenced Mesh pointer to 0 and let successive frees on it just free 0, but I can't make it work. I just can't get my head around how to avoid multiple deallocations.
Am I forced to keep references for such a scenario? Or am I forced to put all the Mesh objects into a separate Mesh table and free it separately? Is there a way to tackle this without doing these things? By tagging the objects as instances of each other I can naturally adjust the free algorithm so it deals with the problem, but I was wondering if there is a more 'pure' solution for this problem.

One standard solution is to use reference counters: every object that can be referred to by many other objects has a counter that records how many of them are pointing to it. This is done with something like
typedef struct T_Object
{
int refcount;
....
} Object;
Object *newObject(....)
{
Object *obj = my_malloc(sizeof(Object));
obj->refcount = 1;
....
return obj;
}
Object *ref(Object *p)
{
if (p) p->refcount++;
return p;
}
void deref(Object *p)
{
if (p && p->refcount-- == 1)
destroyObject(p);
}
Whoever first allocates the object is its first owner (hence the counter is initialized to 1). Whenever you need to store the pointer somewhere else, store ref(p) instead, so that the counter is incremented. When an owner is no longer going to point to the object, it should call deref(p). Once the last reference to the object is gone the counter becomes zero and the deref call actually destroys the object.
It takes some discipline to get it working (you should always think when calling ref and deref) but it's possible to write complex software that has zero leaks using this approach.
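As a rough sketch of how this could apply to the Scene from the question: each SceneObject takes its own reference to its Mesh, and freeing the Scene simply drops one reference per object. The Mesh payload and the names mesh_new, mesh_ref, mesh_deref, scene_add and scene_free are assumptions made for illustration, not part of the original code.

```c
#include <stdlib.h>

/* Hypothetical minimal versions of the question's types. */
typedef struct Mesh {
    int refcount;      /* number of owners currently holding this mesh */
    float *vertices;   /* stand-in for the real mesh payload; may be NULL */
} Mesh;

typedef struct SceneObject {
    Mesh *mesh;
    struct SceneObject *next;
} SceneObject;

typedef struct Scene {
    SceneObject *objects;
} Scene;

Mesh *mesh_new(void)
{
    Mesh *m = calloc(1, sizeof *m);
    if (m) m->refcount = 1;          /* the creator is the first owner */
    return m;
}

Mesh *mesh_ref(Mesh *m)
{
    if (m) m->refcount++;
    return m;
}

/* Drops one reference; returns 1 if the mesh was actually freed. */
int mesh_deref(Mesh *m)
{
    if (m && --m->refcount == 0) {
        free(m->vertices);
        free(m);
        return 1;
    }
    return 0;
}

/* Prepends an object that takes its own reference to `mesh`. */
SceneObject *scene_add(Scene *s, Mesh *mesh)
{
    SceneObject *o = malloc(sizeof *o);
    if (!o) return NULL;
    o->mesh = mesh_ref(mesh);
    o->next = s->objects;
    s->objects = o;
    return o;
}

/* Frees every object; returns how many meshes were destroyed. */
int scene_free(Scene *s)
{
    int destroyed = 0;
    SceneObject *o = s->objects;
    while (o) {
        SceneObject *next = o->next;
        destroyed += mesh_deref(o->mesh);  /* frees only on the last reference */
        free(o);
        o = next;
    }
    s->objects = NULL;
    return destroyed;
}
```

With this layout the code that created the mesh calls mesh_deref once it has handed the mesh to the scene, and scene_free never needs to know which objects share a mesh.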
A simpler solution that is sometimes applicable is having all your shared objects also stored in a separate list... you freely assign and change complex data structures pointing to these objects but you never free them during the normal use. Only when you need to throw everything away you deallocate those objects by using that separate list.
Note that this approach is only practical if you're not allocating many objects during "normal use", because otherwise delaying the destruction may not be viable.
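A minimal sketch of that separate list, assuming a hypothetical MeshPool that is the sole owner of every Mesh. Scene objects would only borrow pointers from it and never free them. The Mesh stand-in and the function names are invented for the example.

```c
#include <stdlib.h>

typedef struct Mesh { int id; } Mesh;   /* hypothetical stand-in */

typedef struct MeshPool {
    Mesh **items;       /* every mesh ever allocated through the pool */
    size_t count;
    size_t capacity;
} MeshPool;

/* Allocates a mesh and records it in the pool, which becomes its sole owner. */
Mesh *pool_new_mesh(MeshPool *pool, int id)
{
    if (pool->count == pool->capacity) {
        size_t cap = pool->capacity ? pool->capacity * 2 : 8;
        Mesh **grown = realloc(pool->items, cap * sizeof *grown);
        if (!grown) return NULL;
        pool->items = grown;
        pool->capacity = cap;
    }
    Mesh *m = malloc(sizeof *m);
    if (!m) return NULL;
    m->id = id;
    pool->items[pool->count++] = m;
    return m;
}

/* Frees every mesh exactly once; returns how many were freed. */
size_t pool_free(MeshPool *pool)
{
    size_t freed = pool->count;
    for (size_t i = 0; i < pool->count; i++)
        free(pool->items[i]);
    free(pool->items);
    pool->items = NULL;
    pool->count = pool->capacity = 0;
    return freed;
}
```

Freeing a scene then means freeing only its SceneObject nodes; pool_free is called once when everything is thrown away.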

Related

Does C have a version of JavaScript "this"?

I've used quite a bit of JavaScript so far. If you use an object constructor in JavaScript, you have access to the this keyword.
So my question relates to trying to use a similar concept in C. I created a struct that I want to be able to self reference:
struct Storage {
void (*delete)();
};
So if I were to allocate a Storage class:
struct Storage *myStruct = malloc(sizeof(struct Storage));
Let's say I'm trying to delete myStruct. If I have some delete function that I point to (with myStruct->delete = deleteStructure), I would like to do something like this:
myStruct->delete();
which would then free() the struct through a self referencing variable inside of said delete function. I'm wondering if there would be a way to have the delete function look like:
void deleteStructure() {
free( /* "this" or some equivalent C self-reference */ );
}
My assumption from research so far is that this is not possible since this is usually only in object oriented programming languages. If this is not possible, I'm wondering what would be the semantically correct way to do this. I'm hoping to make the usage of this delete functionality rather simplistic from a user interface perspective. The only way I understand this to work would be passing a reference to the structure like:
void deleteStructure(struct Storage *someStructure) {
free(someStructure);
}
which would then require deletion to be done as follows:
deleteStructure(myStruct);
To sum up: is there a way to make a delete function that uses self references in C, and if not, what would be the most semantically correct way to delete a structure in the most user friendly way?
No. You cannot even define a function for a struct.
struct Storage {
void (*delete)();
};
simply stores a pointer to a void function. That could be any void function and when it is being called, it has no connection to Storage whatsoever.
Also note that in your code, every instance of the struct stores one pointer to a void function. You could initialize them so that they all point to the same function, in which case you would simply waste one pointer (typically 64 bits) per instance without any real benefit. You could also make them point to completely different functions with different semantics.
As per @UnholySheep's comment, the semantically correct way to pair a struct with a C function follows this structure:
struct Storage {
/* Some definitions here */
};
void deleteStructure(struct Storage *someStructure) {
free( /* all inner structure allocations */ );
free(someStructure);
}
Here's more about passing structs by reference.
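If the method-call look is still wanted, the closest C gets is a function pointer that receives the object explicitly. The sketch below assumes a hypothetical value payload, names the member destroy, and adds a live-object counter only so the effect can be observed; none of these come from the original code.

```c
#include <stdlib.h>

struct Storage {
    int value;                                /* hypothetical payload */
    void (*destroy)(struct Storage *self);    /* "method" taking an explicit self */
};

static int storage_live_count = 0;   /* only here to make the effect observable */

static void storage_destroy(struct Storage *self)
{
    if (!self) return;
    storage_live_count--;
    free(self);
}

/* Acts as the constructor: allocates the struct and wires up the "method". */
struct Storage *storage_create(int value)
{
    struct Storage *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->value = value;
    s->destroy = storage_destroy;
    storage_live_count++;
    return s;
}
```

The call site becomes s->destroy(s); the object still has to be passed by hand, which is why a plain deleteStructure(s) is usually the simpler choice.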

Shallow and deep destructors?

Imagine a list "a", and there's a copy constructor for lists which performs deep copying. If "b" is a list deep copied from "a", then both can be destroyed using simple destructors. This destructor should use deep destruction.
typedef struct list { void * first; struct list * next; } list;
struct list * list_copy_constructor(const struct list * input)
REQUIRE_RETURNED_VALUE_CAPTURE;
void list_destructor(struct list * input);
Now imagine that you rename the copy constructor for lists as a deep copy constructor for lists, and add another shallow copy constructor for lists.
/** Performs shallow copy. */
struct list * list_shallow_copy_constructor(const struct list * input)
REQUIRE_RETURNED_VALUE_CAPTURE;
/** Performs deep copy. */
struct list * list_deep_copy_constructor(const struct list * input)
REQUIRE_RETURNED_VALUE_CAPTURE;
/** Be warned performs deep destruction. */
void list_destructor(struct list * input);
The destructor, which performs a deep destruction, can be used paired with the deep copy constructor calls.
Once you have used the shallow copy constructor, you need to know which of the two lists owns the elements. The list that owns the elements can be destroyed with the (deep) destructor, but the list that doesn't own them must be destroyed with a shallow destructor, which I would need to create, and this must happen before the owning list is destroyed.
/** Performs shallow copy. */
struct list * list_shallow_copy_constructor(const struct list * input)
REQUIRE_RETURNED_VALUE_CAPTURE;
/** Performs deep copy. */
struct list * list_deep_copy_constructor(const struct list * input)
REQUIRE_RETURNED_VALUE_CAPTURE;
/** Performs shallow destruction. */
void list_shallow_destructor(struct list * input);
/** Performs deep destruction. */
void list_deep_destructor(struct list * input);
But, the problem is, I don't recognize shallow destructor as a term in bibliography, so I thought I might be doing something wrong. Am I doing something wrong? E.g. should I be using smart pointers already instead of deep and shallow destructors?
The concept of deep or shallow exists only in the mind of the programmer, and in C++ it is very arbitrary. By default raw pointer members are not deep destroyed when an object is destroyed, which you might call shallow, but you can write extra code in your object destructor to destroy deeply. On the other hand, any members that have destructors get their destructors called, which you might call deep, and there is no way to avoid that. Exactly the same mechanism applies to the default copy and assignment, so it is equally impossible to say at a glance that an object is wholly deep or shallow copied or destroyed.
So the distinction is not really a property of object destructors, but their members.
Of course, now the question is about C again, but still mentions smart pointers. In C you have to decide what philosophy you want to implement, as there is no built-in concept of destruction. One option is to follow a C++-like philosophy: write a destruction function for each type of member and have each one call the destruction functions of its own members.
Having said that there are a number of strategies you might consider that would potentially produce a leaner model:
If /all/ the members of a particular container are owned or /all/ not owned, then a simple flag in the container for whether to destroy /all/ children is an adequate model.
If /all/ the objects are shared with another container, or this might be the last/only such container for a particular set of content, you could keep a circular list of sharing containers. When the destructor realises it is the last container it could destroy /all/ the content. On the other hand, you could simply implement this model with a shared_ptr to the one container instance, and when the last pointer is released then the container is destroyed.
If individual items in the container may be shared in arbitrary ways, then make it a container of shared_ptr to each item of content. This is the most robust model, but may have costs in terms of memory usage. Ultimately, somewhere there needs to be a reference count (circular lists of referees also work, but they are much harder to protect with a mutex across threads). In C++ shared_ptr this is implemented using a stub (a separate control block), but in your own C objects, this is probably a counter member in the child object.
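The first of these strategies, a single ownership flag per container, could look roughly like this in C. The list_handle wrapper, the owns_elements field and both function names are assumptions made for the sketch, and elements are assumed to come from malloc.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct list {
    void *first;
    struct list *next;
} list;

typedef struct list_handle {
    list *head;
    bool owns_elements;   /* true: destroying the handle also frees each `first` */
} list_handle;

/* Helper for the example: pushes a node holding `element` onto the front. */
bool list_handle_push(list_handle *h, void *element)
{
    list *node = malloc(sizeof *node);
    if (!node) return false;
    node->first = element;
    node->next = h->head;
    h->head = node;
    return true;
}

/* Frees the nodes, and the elements too if this handle owns them.
   Returns the number of elements freed. */
size_t list_handle_destroy(list_handle *h)
{
    size_t freed = 0;
    list *node = h->head;
    while (node) {
        list *next = node->next;
        if (h->owns_elements) {
            free(node->first);
            freed++;
        }
        free(node);
        node = next;
    }
    h->head = NULL;
    return freed;
}
```

There is still only one destructor; the flag decides how deep it goes. Non-owning handles must be destroyed, or at least stop being used, before the owner is.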
If you want lists with shared elements to be destroyed correctly, you cannot simply use "shallow destructors". Destroying them all with shallow destructors leaves the elements in memory, i.e. leaked. It also doesn't look good to give one list a deep destructor and the others shallow ones: if you destroy the list with the deep destructor first, the others are left with dangling pointers that you can accidentally access. So "shallow destructor" is probably not a well-recognized term simply because it is not of much use. You just have destructors: functions that destroy the stuff that your objects conceptually own.
Now, for the particular task of sharing elements between lists and destroying everything at the right time, shared pointers seem to be a reasonable solution. A shared pointer is conceptually a pointer to a struct consisting of two elements: an actual object (list element) and a reference counter. Copying the shared pointer increases the counter; destroying the shared pointer decreases the counter and destroys the struct holding the object and counter if the counter has fallen to 0. In this scenario, each list owns its copies of the shared pointers, but the list elements themselves are owned by the shared pointers rather than by a list. Due to the shared pointer destruction semantics there is no trouble with destroying all the shared pointer copies that a list owns (the destructor doesn't deallocate memory unless there are no references left), so there is no need to distinguish "shallow" and "deep" destructors: the shared pointers take care of deleting the elements at the right time automatically.
As you already suspected, your design is weird. Think about it: if you are going to "shallow copy" a list, why not just take a pointer to it? "Shallow copy" has no use outside of a classroom, where it is only useful to explain what a "deep copy" is.
You either want multiple users to have independent lists that can be used and destroyed independently of each other, or you want one user to "own" the list and the others to just point to that list. Your idea of "shallow copy" has no advantage over a simple pointer, but is much more complex to handle.
What is actually useful is having multiple "lists" but with shared data, where each user has its own "shared copy" of the list that can be used and destroyed independently, but points to the same data, and will only really be deallocated when the last user has destroyed it. This is a very common pattern usually handled by an algorithm called reference counting, and is implemented by many libraries and languages, like Python, glib, and even in C++ as the smart pointer std::shared_ptr.
If you are using C, you may want to add support for reference counting to your struct list, and it is not very difficult: just add a field unsigned reference_count; and set it to 1 when the list is allocated. Decrement it when a copy is destroyed, but only really deallocate if reference_count == 0, in which case there are no more users and you must do a "deep deallocation". You would still have only one destructor function, but two copy constructors:
/** Performs shared copy.
*
* Actually, just increments reference_count and returns the same pointer.
*/
struct list * list_shared_copy_constructor(struct list * input)
REQUIRE_RETURNED_VALUE_CAPTURE;
/** Performs deep copy.
*
* reference_count for the new copy is set to 1.
*/
struct list * list_deep_copy_constructor(const struct list * input)
REQUIRE_RETURNED_VALUE_CAPTURE;
/** Performs destruction. */
void list_destructor(struct list * input);
If you are actually using C++, as you hinted in your question, then simply use std::shared_ptr.
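For the C case, one possible implementation of the shared copy and the single destructor is sketched below. It assumes that only the head node's reference_count is consulted, that tail nodes are never shared separately, and that elements come from malloc. The list_constructor helper, the destructor name list_destructor and its return value are additions made for the example.

```c
#include <stdlib.h>

typedef struct list {
    unsigned reference_count;   /* only consulted on the head node in this sketch */
    void *first;
    struct list *next;
} list;

/* Hypothetical helper: builds a node that takes ownership of `first` and `next`. */
list *list_constructor(void *first, list *next)
{
    list *l = malloc(sizeof *l);
    if (!l) return NULL;
    l->reference_count = 1;
    l->first = first;
    l->next = next;
    return l;
}

/* Shared copy: bumps the count and returns the same pointer. */
list *list_shared_copy_constructor(list *input)
{
    if (input) input->reference_count++;
    return input;
}

/* Drops one reference; deep-frees the list when the last user lets go.
   Returns 1 if the list was deallocated, 0 otherwise. */
int list_destructor(list *input)
{
    if (!input || --input->reference_count > 0) return 0;
    while (input) {
        list *next = input->next;
        free(input->first);
        free(input);
        input = next;
    }
    return 1;
}
```

Every user calls the same destructor on its own copy, and only the last call frees anything.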
But, the problem is, I don't recognize shallow destructor as a term in bibliography, so I thought I might be doing something wrong. Am I doing something wrong?
The point of the destructor (or in C just myobject_destroy(myobject)) is to clean up the resources the instance holds (memory, OS handles, ...). Whether you need a "shallow destructor" or a "deep destructor" depends on the way you decided to implement your object; what matters is that it does its job of cleaning up.
If you are using modern C++, stack allocation and smart pointers are your friends, because they manage memory by themselves.

Changing a pointer as a result of destroying an "object" in C

As part of a course I am attending at the moment, we are working in C with self-developed low level libraries, and we are now working on our final project, which is a game.
At a certain point, it seemed relevant to have a struct (serving as a sort of object) that held some important information about the current game status, namely a pointer to a player "object" (can't really call the simulated objects we are using actual objects, can we?).
It would go something like this:
typedef struct {
//Holds relevant information about game current state
state_st currstate;
//Buffer of events to process ('array of events')
//Needs to be pointers because of deallocating memory
event_st ** event_buffer;
//Indicates the size of the event buffer array above
unsigned int n_events_to_process;
//... Other members ...
//Pointer to a player (Pointer to allow allocation and deallocation)
Player * player;
//Flag that indicates if a player has been created
bool player_created;
} Game_Info;
The problem is the following:
If we are to stick to the design philosophy that is used in most of this course, we are to "abstract" these "objects" using functions like Game_Info * create_game_info() and destroy_game_info(Game_Info * gi_ptr) to act as constructors and destructors for these "objects" (also, "member functions" would be something like update_game_state(Game_Info * gi_ptr), acting like C++ by passing the normally implicit this as the first argument).
Therefore, as a way of detecting if the player object inside a Game_Info "instance" had already been deleted I am comparing the player pointer to NULL, since in all of the "destructors", after deallocating the memory I set the passed pointer to NULL, to show that the object was successfully deallocated.
This obviously causes a problem (one I did not detect at first, hence the player_created bool flag that worked around it while I was still getting a grasp on what was happening): because the pointer is passed by value and not by reference, the caller's copy is not set to NULL after the call to the "object" "destructor", so comparing it to NULL is not a reliable way to know whether the object was deallocated.
I am writing this, then, to ask for input on what would be the best way to overcome this problem:
A flag to indicate if an "object" is "instanced" or not - using the flag instead of ptr == NULL in comparisons to assert if the "object" is "instanced" - the solution I am currently using
Passing a pointer to the pointer (calling the functions with &player instead of only player) - would enable setting to NULL
Setting the pointer to NULL one "level" above, after calling the "destructor"
Any other solution, since I am not very experienced in C and am probably overlooking an easier way to solve this problem.
Thank you for reading and for any advice you might be able to provide!
I am writing this, then, to ask for input on what would be the best way to overcome this problem: …
What would be the best way is primarily opinion-based, but of the ways you listed the worst is the first, where one has to keep two variables (pointer and flag) synchronized.
Any other solution…
Another solution would be using a macro, e. g.:
#define destroy_player(p) do { /* whatever cleanup needed */ free(p); (p) = NULL; } while (0)
…
destroy_player(gi_ptr->player);
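The second option from the question, passing a pointer to the pointer, can be sketched without a macro as follows. The Player layout and the names create_player and destroy_player_ptr are assumptions made for the example.

```c
#include <stdlib.h>

typedef struct Player { int health; } Player;   /* hypothetical layout */

Player *create_player(void)
{
    return calloc(1, sizeof(Player));
}

/* Frees the player and clears the caller's pointer, so a later
   comparison with NULL reliably reports that the object is gone. */
void destroy_player_ptr(Player **pp)
{
    if (pp && *pp) {
        /* whatever cleanup is needed goes here */
        free(*pp);
        *pp = NULL;
    }
}
```

A call would look like destroy_player_ptr(&gi_ptr->player); calling it a second time on the same pointer is a harmless no-op.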

What is reference counter and how does it work?

I've been writing code, and I'm in a point where I should have another program calling my library. I should make a reference counter for the output of my library. Basic idea as I have understood is that, I need to have reference counter struct inside my struct that I want to pass around. So my questions are following:
What should I keep in mind when making a reference counter?
What are complete don'ts when making a reference counter?
Is there really detailed examples where to start with this?
Thank you for your answers in advance!
Reference counting allows clients of your library to keep references to objects that your library created on the heap, and allows you to keep track of how many references are still active. When the reference count goes to zero you can safely free the memory used by the object. It is a way to implement basic "garbage collection".
In C++, you can do this more easily, by using "smart pointers" that manage the reference count through the constructor and destructor, but it sounds like you are looking to do it in C.
You need to be very clear on the protocol that you expect users of your libraries to follow when accessing your objects so that they properly communicate when a new reference is created or when a reference is no longer needed. Getting this wrong will either prematurely free memory that is still being referenced or cause memory to never be freed (memory leak).
Basically, you include a reference count in your struct that gets incremented each time your library returns the struct.
You also need to provide a function that releases the reference:
typedef struct Object {
int ref;
....
} Object;
Object* getObject (...) {
Object *p = .... // find or malloc the object
p->ref++;
return p;
}
void releaseReference (Object* p) {
p->ref--;
if (p->ref == 0) free(p);
}
void grabReference (Object* p) {
p->ref++;
}
Use grabReference() if a client of your library passes a reference to another client (in the above example, the initial caller of your library doesn't need to call grabReference())
If your code is multi-threaded, make sure that the increments and decrements of the reference count are synchronized (for example, with atomic operations or a lock).
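One way to handle the multi-threaded case, assuming a C11 compiler with <stdatomic.h>, is to make the counter atomic. This is only a sketch: the payload field and the object_* names are invented here, and it covers the counter itself, not access to the rest of the object.

```c
#include <stdatomic.h>
#include <stdlib.h>

typedef struct Object {
    atomic_int ref;    /* reference count, safe to update from several threads */
    int payload;       /* hypothetical content */
} Object;

Object *object_create(int payload)
{
    Object *p = malloc(sizeof *p);
    if (!p) return NULL;
    atomic_init(&p->ref, 1);   /* the caller receives the first reference */
    p->payload = payload;
    return p;
}

void object_grab(Object *p)
{
    atomic_fetch_add(&p->ref, 1);
}

/* Returns 1 if this call released the last reference and freed the object. */
int object_release(Object *p)
{
    /* atomic_fetch_sub returns the value held before the subtraction,
       so exactly one thread can observe 1 and perform the free. */
    if (atomic_fetch_sub(&p->ref, 1) == 1) {
        free(p);
        return 1;
    }
    return 0;
}
```

Comparing the value returned by the atomic decrement, rather than re-reading the counter afterwards, avoids two threads both deciding to free the object.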

Create a new copy of a data structure based on pointers

I started with a programming assignment. I had to design a DFA based on graphs. Here is the data structure I used for it:
typedef struct n{
struct n *next[255]; //pointer to the next state. Value is NULL if no transition for the input character( denoted by their ascii value)
bool start_state;
bool end_state;
}node;
Now I have a DFA graph-based structure ready with me. I need to utilize this DFA in several places; The DFA will get modified in each of these several places. But I want unmodified DFAs to be passed to these various functions. One way is to create a copy of this DFA. What is the most elegant way of doing this? So all of them are initialized either with a NULL value or some pointer to another state.
NOTE:
I want the copy to be created in the called function, i.e. I pass the DFA, and the called function creates its own copy and operates on it. This way, my original DFA remains unaltered.
MORE NOTES:
From each node of a DFA, I can have a directed edge connecting it with another node. If the transition takes place when the input symbol is c, then state->next[c] will hold a pointer to the next node. It is possible that several elements of the next array are NULL. Modifying the DFA means both adding new nodes and altering the existing ones.
If you need a private copy on each call, and since this is a linked data structure, I see no way to avoid copying the whole graph (except perhaps to do a copy-on-write to some sub branches if the performance is that critical, but the complexity is significant and so is the chance of bugs).
Had this been C++, you could have done this in a copy constructor, but in C you need to clone explicitly in every function. One way is to clone the entire structure (as Mark suggested). It is fairly complicated, since you need to track cycles and back edges in the graph (which show up as pointers to previously visited nodes; for those you must reuse the copy you have already allocated rather than allocate again).
Another way, if you're willing to change your data structure, is to work with arrays - keep all the nodes in a single array of type node. The array should be big enough to accommodate all nodes if you know the limit, or just reallocate it to increase upon demand, and each "pointer" is replaced by a simple index.
Building this array is different: instead of mallocing a new node, use the next available index (keep it on the side), or, if you're going to add and remove nodes on the fly, keep a queue/stack of "free" indices (populate it at the beginning with 1..N, and pop/push there whenever you need a new location or are about to free an old one).
The upside is that copying would be much faster: since all the links are relative to the instance of the array, you just copy a chunk of contiguous memory (memcpy would now work fine).
Another upside is that the performance of using this data structure should be superior to the linked one, since the memory accesses are spatially close and easily prefetchable.
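A small sketch of that array-based layout, where transitions are indices into the array instead of pointers, so that a clone is a single memcpy. The type names, the NO_STATE marker and the helper functions are assumptions made for illustration, and dfa_init assumes count is greater than zero.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define ALPHABET 255      /* same transition table size as the question's node */
#define NO_STATE (-1)     /* marks "no transition", like NULL in the pointer version */

typedef struct {
    int next[ALPHABET];   /* index of the next state, or NO_STATE */
    bool start_state;
    bool end_state;
} state;

typedef struct {
    state *states;        /* all states live in one contiguous array */
    size_t count;
} dfa;

/* Allocates `count` states with no transitions. */
bool dfa_init(dfa *d, size_t count)
{
    d->states = malloc(count * sizeof *d->states);
    if (!d->states) return false;
    d->count = count;
    for (size_t i = 0; i < count; i++) {
        for (int c = 0; c < ALPHABET; c++)
            d->states[i].next[c] = NO_STATE;
        d->states[i].start_state = false;
        d->states[i].end_state = false;
    }
    return true;
}

/* Indices stay valid in the copy, so cycles need no special handling. */
bool dfa_clone(dfa *dst, const dfa *src)
{
    dst->states = malloc(src->count * sizeof *dst->states);
    if (!dst->states) return false;
    memcpy(dst->states, src->states, src->count * sizeof *dst->states);
    dst->count = src->count;
    return true;
}

void dfa_free(dfa *d)
{
    free(d->states);
    d->states = NULL;
    d->count = 0;
}
```

A function that needs a private DFA calls dfa_clone on entry and dfa_free on exit, leaving the caller's automaton untouched.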
You'll need to write a recursive function that visits all the nodes, with a global dictionary that keeps track of the mapping from the source graph nodes to the copied graph nodes. This dictionary will be basically a table that maps old pointers to new pointers.
Here's the idea. I haven't compiled it or debugged it...
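#include <stdlib.h>
/* MAX_NODES is an upper bound on the number of states, defined elsewhere. */
/* Maps each source node to its copy; the first empty entry ends the table. */
struct {
    node* existing;
    node* copy;
} dictionary[MAX_NODES] = {0};
node* do_copy(node* existing)
{
    node* copy;
    int i;
    /* Reuse the copy if this node was already visited (handles cycles). */
    for(i=0;dictionary[i].existing;i++) {
        if (dictionary[i].existing == existing) return dictionary[i].copy;
    }
    /* Keep the last slot empty so the search loop above always terminates. */
    if (i == MAX_NODES - 1) return NULL;
    copy = malloc(sizeof(node));
    if (!copy) return NULL;
    /* Register the copy before recursing so that back edges can find it. */
    dictionary[i].existing = existing;
    dictionary[i].copy = copy;
    copy->end_state = existing->end_state;
    copy->start_state = existing->start_state;
    /* Visit every transition; missing transitions stay NULL in the copy. */
    for(int j=0;j<255;j++) {
        copy->next[j] = existing->next[j] ? do_copy(existing->next[j]) : NULL;
    }
    return copy;
}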

Resources