Common datastructure library in C


Hello, I have started writing a common data structure library in C, similar to the STL.
Here is the link: http://code.google.com/p/cstl/
I struggled a lot over whether to use void* as the basic element type for the data structures, and ended up with a structure that has two members:
typedef struct __c_lib__object {
void* raw_data;
size_t size;
} clib_object, *clib_object_ptr;
This approach allows me to store any element, but it requires a lot of memory allocation when saving elements into the container and when returning them from it.
Can anybody please review this and let me know if there is a better approach?
Thanks
Avinash

Names starting with double-underscore are reserved to 'the implementation' and should be avoided in user code.
Personally, I dislike typedefs for pointers; I'd rather use clib_object *x; than clib_object_ptr x;.
Why do you need to record the size of the object?
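One possible alternative, given only as a minimal sketch: copy elements by value into a single contiguous buffer and record the element size once per container instead of once per element. This avoids a separate allocation for every stored object. The names vec, vec_init, vec_push, vec_at and vec_free are hypothetical and are not taken from the cstl project.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: elements are copied into one contiguous buffer,
   and the element size is stored once per container, not per element. */
typedef struct {
    unsigned char *data;   /* contiguous storage for all elements */
    size_t elem_size;      /* size of one element in bytes */
    size_t count;          /* number of elements in use */
    size_t capacity;       /* number of elements allocated */
} vec;

/* Initialise an empty vector for elements of elem_size bytes. */
void vec_init(vec *v, size_t elem_size)
{
    assert(v != NULL && elem_size > 0);
    v->data = NULL;
    v->elem_size = elem_size;
    v->count = 0;
    v->capacity = 0;
}

/* Copy one element to the end. Returns 0 on success, -1 if allocation fails. */
int vec_push(vec *v, const void *elem)
{
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 8;
        unsigned char *p = realloc(v->data, new_cap * v->elem_size);
        if (p == NULL)
            return -1;
        v->data = p;
        v->capacity = new_cap;
    }
    memcpy(v->data + v->count * v->elem_size, elem, v->elem_size);
    v->count++;
    return 0;
}

/* Return a pointer to element i inside the buffer (no copy), or NULL
   if i is out of range. */
void *vec_at(const vec *v, size_t i)
{
    return i < v->count ? v->data + i * v->elem_size : NULL;
}

/* Release the buffer; the vector can be reused after vec_init. */
void vec_free(vec *v)
{
    free(v->data);
    v->data = NULL;
    v->count = v->capacity = 0;
}
```

With this layout, storing an int is vec_push(&v, &i) and reading it back is *(int *)vec_at(&v, n). The trade-off is that every element in one container must have the same size, and pointers returned by vec_at become invalid when the buffer grows.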

Related

Use of function pointers in C on data structure development

I'm having an Algorithms course next semester and so I dived into C with the purpose of making a few data structures ahead of time to be prepared.
As I learned about function pointers, I found I could store them in structs and create an object-oriented-like use for my data structure. Here's an example:
#include <stdio.h>
void insert(char * object)
{
printf("Adding %s to the data structure\n", object);
}
typedef struct data_structure {
char * obj;
void (*insert)(char * object);
} data_structure;
int main()
{
data_structure d;
d.insert = insert;
d.insert("bacon");
return 0;
}
But is this kind of procedure actually useful in the scope of data structure and algorithm studying in C? Or is it just taking up memory on the data structure?
I've found other posts talking about function pointers, but none that explores this kind of approach. I think this could be useful to a bunch of curious students out there :)

In the past I have certainly seen objects constructed this way as sets of function pointers effectively representing a vtable. Usually, for a vtable you add one extra level of indirection such that all data objects with similar traits point to the same function pointer object. This reduces the cost per data object if there is more than 1 function, but at a slight execution cost.
It can also be used as a lightweight way to organise and structure callback objects that pair a function with void* data, by insisting that the first member of the data is the callback function. Of course, you can't define inherited classes in C, but you can have nested structures which can be bullied to the same purpose.
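The extra level of indirection described in this answer might look like the sketch below. Every name in it (struct stack, struct stack_ops, array_push, array_pop, stack_init) is made up for illustration. Each object carries one pointer to a shared table of operations, and each operation receives the object explicitly as self, since C has no implicit "this".

```c
#include <assert.h>
#include <stddef.h>

struct stack;  /* forward declaration so the table can mention it */

/* One table of operations, shared by every object of the same kind. */
struct stack_ops {
    int (*push)(struct stack *self, int value);
    int (*pop)(struct stack *self, int *out);
};

struct stack {
    const struct stack_ops *ops;  /* one pointer per object, not one per function */
    int items[16];
    size_t count;
};

/* Returns 0 on success, -1 if the stack is full. */
int array_push(struct stack *self, int value)
{
    if (self->count == sizeof self->items / sizeof self->items[0])
        return -1;
    self->items[self->count++] = value;
    return 0;
}

/* Returns 0 and stores the top value in *out, or -1 if the stack is empty. */
int array_pop(struct stack *self, int *out)
{
    if (self->count == 0)
        return -1;
    *out = self->items[--self->count];
    return 0;
}

/* The single shared table for array-backed stacks. */
const struct stack_ops array_stack_ops = { array_push, array_pop };

void stack_init(struct stack *s)
{
    assert(s != NULL);
    s->ops = &array_stack_ops;
    s->count = 0;
}
```

A call then reads s.ops->push(&s, 42). A second implementation (say, a linked stack) would supply its own table, and callers that only go through ops would not need to change.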

After reading all the answers, here's the insight this post gathered regarding the use of function pointers as C structure attributes.
Advantages:
Good ol' practice on a somewhat advanced subject if you're a student
Provides encapsulation and object oriented code in C
Gives you better understanding of the object orientated programming paradigm if you're not already well familiarized with it
Can be used to implement VTables
Disadvantages:
On a functional level, you still have to pass the data structure to the function, as it doesn't have access to said data structure
Slight performance overhead
In conclusion, the use of function pointers as in the original question would really only have practical uses if one wishes to explore OOP whilst getting into more advanced aspects of C, or to construct VTables.
Thank you to all the people who replied.

Is there a more efficient way to store an n-ary tree in a file in C?

Say I'm writing an adventure game. The map is built of tiles of different types. I have tiles that form paths, and tiles that form doors, and so on.
I will use a struct to describe the type and content of a tile, and to which other tiles it connects.
Then I'll make a quadruple-linked list to connect them all together.
But a struct that will describe a room will have far more elements than one that will describe a door, so many elements in a door struct will be redundant. I could make a smaller door struct, but structs can only point to structs of the same type*, so I couldn't connect a room struct to a door struct. The redundancy may be negligible but I wondered if there's another way.
Another option is using an array of structs, but then I'd have lots of 'padding' structs wasting even more space. However an array would make reading and re-building a map from file much easier.
Is there any way around the limitation that a struct can only point to a struct of the same type? Or is there another common solution to this problem that I haven't mentioned?
One idea I had was that each tile could have pointers for every other type of tile. Some would be redundant, but it would be a lesser redundancy than the option above.
*By this I mean that typically in a linked list, structs contain pointers to struct of the same type that they're in.

You really don't have to have a uniform struct describing everything. Instead, you could do the following (this is somewhat like writing your own C++ virtual tables in C, and is very widely used).
Your basic tile struct can look like this:
struct tile
{
// common tile stuff
...
enum tile_type type;
void *type_info;
};
So in this struct you store stuff that's common to every tile type. Then you make other structs for other types: one for a room, one for a path, etc. Within an object of tile, you make the enum describe the actual type, and store a pointer to the concrete type within the void *.
There are many links describing variations of this technique. Here's one.
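As a rough illustration of the layout described above: the common struct holds the links and a type tag, and the void * points at a smaller, type-specific struct. The concrete members (door_info, room_info, the four direction pointers) and the helper functions are assumptions made for this sketch, not part of the original answer.

```c
#include <assert.h>
#include <stdlib.h>

enum tile_type { TILE_PATH, TILE_DOOR, TILE_ROOM };

/* Type-specific data: each tile kind only pays for what it needs. */
struct door_info { int locked; };
struct room_info { int width, height; const char *description; };

struct tile {
    struct tile *north, *south, *east, *west;  /* common to every tile */
    enum tile_type type;                       /* says what type_info points at */
    void *type_info;                           /* door_info, room_info, or NULL for a path */
};

/* Allocate an unconnected tile. Returns NULL if allocation fails.
   The tile does not take ownership of type_info. */
struct tile *tile_new(enum tile_type type, void *type_info)
{
    struct tile *t = malloc(sizeof *t);
    if (t != NULL) {
        t->north = t->south = t->east = t->west = NULL;
        t->type = type;
        t->type_info = type_info;
    }
    return t;
}

/* Returns 1 if the player can walk onto the tile, 0 otherwise. */
int tile_is_passable(const struct tile *t)
{
    assert(t != NULL);
    switch (t->type) {
    case TILE_DOOR:
        /* The enum tells us which cast is valid. */
        return !((const struct door_info *)t->type_info)->locked;
    case TILE_PATH:
    case TILE_ROOM:
        return 1;
    }
    return 0;
}
```

Because every link is a struct tile *, a room can point at a door and a door at a path without the door carrying any room-sized members.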

Instead of storing elements in a tile, store only a pointer to the linked list of elements.

For pass to function is it worth packing the matrix and its dimension in a struct or is it OK to use additional parameters?

Here is a matrix declared as pointer to an array of pointers to rows.
(source: Numerical Recipes in C)
What is the better way to pass this matrix to a function along with its dimensions?
void printMatrix(float **matrix, int rows, int cols);
Or pack it in a struct
struct Matrix {
int rows, cols;
float **data;
};
and pass a pointer to the struct?
void printMatrix(struct Matrix *m);

Both ways work; however, the struct approach is a bit easier to use. You (or whoever uses the function) won't have to worry about passing the correct dimensions alongside the data: you just handle one struct, one logical object. If you split everything up, you have to handle the data as well as the metadata yourself (i.e. storing and passing both the data and its dimensions).
Is there a downside to using the struct? Not that I know of (other than having to handle one more pointer). However, there is one huge advantage: with the struct you can still call a function that wants data and metadata separately (by passing the struct members rather than a pointer to the struct). This isn't as easy the other way around.
As for "is it worth it?" in the sense of "should I do it for organisation?": do it if the grouping is logical. Lots of Windows APIs work with structs that way, but I'm not a real fan of them if the grouping isn't logical or if it creates additional "pains". In other words: don't group your parameters into a struct if they're not related or if the user most likely wouldn't have them in that form (i.e. they're grouped for this call only).
Edit:
As an example:
I'd group your example data, as width and height belong to the matrix data and they're related (plus they might be used in other functions the same way).
However, I wouldn't group parameters such as this: write_log(LOG_INFO, "All data has been processed"); Adding a struct here would add complexity that isn't required. It's very likely that this group of data won't be used elsewhere and makes calling the function more complicated (as you'll have to create the struct first).
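To make the point about mixing both styles concrete, here is a small sketch. It assumes the struct's data member is float ** so that it matches the first printMatrix prototype in the question; sumElements and sumMatrix are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>

struct Matrix {
    int rows, cols;
    float **data;      /* data[r] points at row r */
};

/* Style 1: data and dimensions passed as separate parameters. */
float sumElements(float **matrix, int rows, int cols)
{
    float total = 0.0f;
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            total += matrix[r][c];
    return total;
}

/* Style 2: one logical object. It can simply forward its members
   to a function written in style 1. */
float sumMatrix(const struct Matrix *m)
{
    assert(m != NULL);
    return sumElements(m->data, m->rows, m->cols);
}
```

Going the other way, a caller holding only loose parameters has to build a struct Matrix first before calling a struct-based function.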

For the sake of optimization, I would consider simply passing the struct by value. i.e.
void printMatrix(struct Matrix m);
without the pointer. It's a very small data structure and the processor might just store this top-level data in the cache. The compiler and processor may be able to optimize access to this top-level data.
Then again, it might do nothing or even make it worse. Optimization can be a black art.
(And don't forget that if you make changes to the top-level Matrix struct, then you'll need to return it somehow). So maybe this should only be considered in place of const struct Matrix *m.

There is no single perfect method. In the appropriate chapter of c-faq you can see 5 methods and their comparison.

how to write a c function which creates an empty queue?

I am new to C. I have no idea how to write a C function which creates an empty queue and returns a void pointer.
void* queue_open(void)
I also want to know how to write a C function which puts an element at end of a queue.
void queue_put(void *p, void *elementp)
Thanks for your help!

If you are coming from an object-oriented background (as your function signatures seem to indicate), here is how each object-oriented idea maps to a good way of doing it in C.
Object creation -> malloc a struct, then pass it to an initialization function:
struct queue* q = (struct queue*)malloc(sizeof(struct queue));
queue_initialize(q);
If you want, you can wrap this in a function, like so:
struct queue* queue_construct(void) {
struct queue* q = (struct queue*)malloc(sizeof(struct queue));
if (q != NULL)
queue_initialize(q);
return q;
}
Note that these pointers shouldn't be void*; let C do at least some of the type checking for you.
Implement a method -> create a function that takes a pointer to the struct as its first parameter, standing in for "this".
struct user* user = ... whatever we do here ...;
queue_add(q, (void*)user);
As far as how to actually implement a queue, I suggest a good data structures or algorithms book, as there are many ways to go about it; and, the specific techniques you choose will have different impacts on performance and reliability. There's no one best way, it depends heavily on how the queue is to be used, and which aspects of performance are more important.
The book I recommend is Introduction to Algorithms. This book is overkill for most situations, with very detailed listings of nearly every major data structure you are likely to encounter in the first few years of programming. As such, it makes a great reference, despite its attempt at a language neutral approach, which now looks odd when compared to common programming languages.
Once you understand what is going on, you can do it in nearly any language.

You need to decide what a queue element should look like, what a queue is, and what it means for a queue to be empty. If you know those things, writing queue_open and queue_put should be pretty easy. I'd suggest that you start by defining a structure that represents your queue element.
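Following those three decisions, one possible sketch is a singly linked list: an element is a node holding the caller's pointer, a queue is a head and tail pointer, and "empty" means both are NULL. The signatures of queue_open and queue_put come from the question; struct queue_node, struct queue, queue_get and queue_close are assumptions added here so the example can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* A queue element: the caller's pointer plus a link to the next node. */
struct queue_node {
    void *element;
    struct queue_node *next;
};

/* A queue: both pointers are NULL exactly when the queue is empty. */
struct queue {
    struct queue_node *head;   /* oldest element */
    struct queue_node *tail;   /* newest element */
};

/* Create an empty queue. Returns NULL if allocation fails. */
void *queue_open(void)
{
    struct queue *q = malloc(sizeof *q);
    if (q != NULL) {
        q->head = NULL;
        q->tail = NULL;
    }
    return q;
}

/* Append elementp at the end. The required signature returns void,
   so an allocation failure is silently ignored in this sketch. */
void queue_put(void *p, void *elementp)
{
    struct queue *q = p;
    struct queue_node *n;
    assert(q != NULL);
    n = malloc(sizeof *n);
    if (n == NULL)
        return;
    n->element = elementp;
    n->next = NULL;
    if (q->tail == NULL)
        q->head = n;           /* first element of an empty queue */
    else
        q->tail->next = n;
    q->tail = n;
}

/* Hypothetical helper: remove and return the oldest element, or NULL if empty. */
void *queue_get(void *p)
{
    struct queue *q = p;
    struct queue_node *n;
    void *element;
    assert(q != NULL);
    n = q->head;
    if (n == NULL)
        return NULL;
    element = n->element;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    free(n);
    return element;
}

/* Hypothetical helper: free the nodes and the queue, but not the elements. */
void queue_close(void *p)
{
    if (p == NULL)
        return;
    while (((struct queue *)p)->head != NULL)
        (void)queue_get(p);
    free(p);
}
```

Since queue_get uses NULL to signal "empty", this sketch cannot distinguish an empty queue from a stored NULL element; a real design would need to decide how to handle that.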

You can learn about queues here:
http://en.wikipedia.org/wiki/Queue_(data_structure)
While you could easily copy and paste the sample code from the link above and with little modification solve your homework problem, you are not going to learn a lot by doing that.
After understanding a queue conceptually, I recommend you try to implement it yourself, then use the sample code from the link above as a reference when you get stuck.
The best thing you could do is pair up with another student in your class who is smarter than you. Then pair program ( http://en.wikipedia.org/wiki/Pair_programming ) with him/her to solve the problem. You'll become a better programmer.

Serialize Data Structures in C

I'd like a C library that can serialize my data structures to disk, and then load them again later. It should accept arbitrarily nested structures, possibly with circular references.
I presume that this tool would need a configuration file describing my data structures. The library is allowed to use code generation, although I'm fairly sure it's possible to do this without it.
Note I'm not interested in data portability. I'd like to use it as a cache, so I can rely on the environment not changing.
Thanks.
Results
Someone suggested Tpl which is an awesome library, but I believe that it does not do arbitrary object graphs, such as a tree of Nodes that each contain two other Nodes.
Another candidate is Eet, which is a project of the Enlightenment window manager. Looks interesting but, again, seems not to have the ability to serialize nested structures.

Check out tpl. From the overview:
Tpl is a library for serializing C data. The data is stored in its natural binary form. The API is small and tries to stay "out of the way". Compared to using XML, tpl is faster and easier to use in C programs. Tpl can serialize many C data types, including structures.

I know you're asking for a library. If you can't find one (::boggle::, you'd think this was a solved problem!), here is an outline for a solution:
You should be able to write a code generator[1] to serialize trees/graphs without (run-time) pre-processing fairly simply.
You'll need to parse the node structure (typedef handling?), and write the included data values in a straight ahead fashion, but treat the pointers with some care.
For pointers to other objects (e.g. char *name;) which you know are singly referenced, you can serialize the target data directly.
For objects that might be multiply referenced, and for other nodes of your tree, you'll have to represent the pointer structure. Each object gets assigned a serialization number, which is what is written out in place of the pointer. Maintain a translation structure between current memory position and serialization number. On encountering a pointer, see if it is already assigned a number; if not, give it one and queue that object up for serialization.
Reading back also requires a node-#/memory-location translation step, and might be easier to do in two passes: regenerate the nodes with the node numbers in the pointer slots (bad pointer, be warned) to find out where each node gets put, then walk the structure again fixing the pointers.
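A hand-written sketch of the numbering scheme for one fixed node type is given below (a code generator would emit something similar per type). It assumes a hypothetical struct node with two pointers, a fixed-size translation table searched linearly, and a file made only of (value, left number, right number) records with -1 standing for NULL. The loader trusts the numbers in the file and allocates all nodes as one array, so free() on the returned root releases everything.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *left, *right;   /* may be NULL, shared, or form cycles */
};

#define MAX_NODES 1024           /* sketch limit for the translation table */

/* Translation table: index = serialization number, entry = memory address.
   It doubles as the queue of nodes still waiting to be written. */
struct id_table {
    const struct node *ptr[MAX_NODES];
    long count;
};

/* Number for p, assigning a fresh one (which queues p) if needed; -1 for NULL. */
static long id_of(struct id_table *t, const struct node *p)
{
    long i;
    if (p == NULL)
        return -1;
    for (i = 0; i < t->count; i++)       /* linear search: fine for a sketch */
        if (t->ptr[i] == p)
            return i;
    assert(t->count < MAX_NODES);
    t->ptr[t->count] = p;
    return t->count++;
}

/* Write every node reachable from root. Returns the node count, or -1 on I/O error. */
long graph_save(FILE *f, const struct node *root)
{
    struct id_table t;
    long next;
    t.count = 0;
    id_of(&t, root);                     /* the root gets number 0 */
    for (next = 0; next < t.count; next++) {
        const struct node *n = t.ptr[next];
        long rec[3];
        rec[0] = n->value;
        rec[1] = id_of(&t, n->left);     /* numbers replace the pointers */
        rec[2] = id_of(&t, n->right);
        if (fwrite(rec, sizeof rec, 1, f) != 1)
            return -1;
    }
    return t.count;
}

/* Pass 1 reads all records; pass 2 creates the nodes and turns numbers
   back into pointers. Returns the root (node 0) or NULL. */
struct node *graph_load(FILE *f, long *count_out)
{
    long (*recs)[3] = malloc(MAX_NODES * sizeof *recs);
    struct node *nodes = NULL;
    long count = 0, i;
    if (recs == NULL)
        return NULL;
    while (count < MAX_NODES && fread(recs[count], sizeof recs[count], 1, f) == 1)
        count++;
    if (count > 0)
        nodes = malloc((size_t)count * sizeof *nodes);
    if (nodes != NULL) {
        for (i = 0; i < count; i++) {
            nodes[i].value = (int)recs[i][0];
            nodes[i].left  = recs[i][1] < 0 ? NULL : &nodes[recs[i][1]];
            nodes[i].right = recs[i][2] < 0 ? NULL : &nodes[recs[i][2]];
        }
    }
    free(recs);
    if (count_out != NULL)
        *count_out = nodes != NULL ? count : 0;
    return nodes;
}
```

Because each node is numbered the first time it is seen, a shared child is written once and a cycle terminates: the second encounter finds the existing number instead of queueing the node again.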
I don't know anything about tpl, but you might be able to piggy-back on it.
The on-disk/network format should probably be framed with some type information. You'll need a name-mangling scheme.
[1] ROOT uses this mechanism to provide very flexible serialization support in C++.
Late addition: It occurs to me that this is not always as easy as I implied above. Consider the following (contrived and badly designed) declaration:
enum {
mask_none = 0x00,
mask_something = 0x01,
mask_another = 0x02,
/* ... */
mask_all = 0xff
};
typedef struct mask_map {
int mask_val;
char *mask_name;
} mask_map_t;
mask_map_t mask_list[] = {
{mask_something, "mask_something"},
{mask_another, "mask_another"},
/* ... */
};
struct saved_setup {
char* name;
/* various configuration data */
char* mask_name;
/* ... */
};
and assume that we initialize our struct saved_setup items so that mask_name points at mask_list[foo].mask_name.
When we go to serialize the data, what do we do with struct saved_setup.mask_name?
You will need to take care in designing your data structures and/or bring some case-specific intelligence to the serialization process.

This is my solution. It uses my own implementation of malloc and free, together with the mmap and munmap system calls. Follow the given example code. Ref: http://amscata.blogspot.com/2013/02/serialize-your-memory.html
In my approach I create a char array as my own RAM space. Then there are functions to allocate memory from it and to free it. After creating the data structure, I write the char array to a file using mmap.
Whenever you want to load it back into memory, there is a function which uses munmap to put the data structure back into the char array. Since it has virtual addresses for your pointers, you can reuse your data structure. That means you can create a data structure, save it, load it, edit it again, and save it again.

You can take a look at eet, a library from the Enlightenment project for storing C data types (including nested structures). Although nearly all libraries of the Enlightenment project are in a pre-alpha state, eet is already released. I'm not sure, however, if it can handle circular references. Probably not.

http://s11n.net/c11n/
HTH

You should check out gwlib. The serializer/deserializer is extensive, and there are extensive tests available to look at. http://gwlib.com/

I'm assuming you are talking about storing a graph structure, if not then disregard...
If you're storing a graph, I personally think the best idea would be to implement a function that converts your graph into an adjacency matrix. You can then write a function that converts an adjacency matrix back into your graph data structure.
This has three benefits (that may or may not matter in your application):
An adjacency matrix is a very natural way to create and store a graph
You can create adjacency matrices and import them into your applications
You can store and read your data in a meaningful way.
I used this method during a CS project, and it is definitely how I would do it again.
You can read more about adjacency matrix here: http://en.wikipedia.org/wiki/Modified_adjacency_matrix
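As a small sketch of the storing and reading half of this idea, assume the graph has already been converted into a fixed-capacity matrix. The struct graph type, the MAX_V limit, the text file layout (vertex count on the first line, then one row of 0/1 values per vertex) and the function names are all assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_V 8   /* hypothetical fixed vertex limit for the sketch */

struct graph {
    int n;                             /* number of vertices in use */
    unsigned char adj[MAX_V][MAX_V];   /* adj[i][j] != 0 means an edge i -> j */
};

/* Write the vertex count, then one text row of 0/1 values per vertex.
   Returns 0 on success, -1 on I/O error. */
int graph_write(FILE *f, const struct graph *g)
{
    int i, j;
    assert(g != NULL && g->n >= 0 && g->n <= MAX_V);
    if (fprintf(f, "%d\n", g->n) < 0)
        return -1;
    for (i = 0; i < g->n; i++) {
        for (j = 0; j < g->n; j++)
            if (fprintf(f, "%d ", g->adj[i][j] ? 1 : 0) < 0)
                return -1;
        if (fputc('\n', f) == EOF)
            return -1;
    }
    return 0;
}

/* Read a matrix written by graph_write. Returns 0 on success, -1 on bad input. */
int graph_read(FILE *f, struct graph *g)
{
    int i, j, v;
    assert(g != NULL);
    if (fscanf(f, "%d", &g->n) != 1 || g->n < 0 || g->n > MAX_V)
        return -1;
    for (i = 0; i < g->n; i++)
        for (j = 0; j < g->n; j++) {
            if (fscanf(f, "%d", &v) != 1)
                return -1;
            g->adj[i][j] = (unsigned char)(v != 0);
        }
    return 0;
}
```

The resulting file is human-readable and can be produced by other tools, at the cost of n*n entries even for sparse graphs.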

Another option is Avro C, an implementation of Apache Avro in C.

Here is an example using the Binn library (my creation):
binn *obj;
// create a new object
obj = binn_object();
// add values to it
binn_object_set_int32(obj, "id", 123);
binn_object_set_str(obj, "name", "Samsung Galaxy Charger");
binn_object_set_double(obj, "price", 12.50);
binn_object_set_blob(obj, "picture", picptr, piclen);
// send over the network
send(sock, binn_ptr(obj), binn_size(obj), 0);
// release the buffer
binn_free(obj);
If you don't want to use strings as keys you can use a binn_map which uses integers as keys.
There is also support for lists, and all these structures can be nested:
binn *list;
// create a new list
list = binn_list();
// add values to it
binn_list_add_int32(list, 123);
binn_list_add_double(list, 2.50);
// add the list to the object
binn_object_set_list(obj, "items", list);
// or add the object to the list
binn_list_add_object(list, obj);

In theory YAML should do what you want http://code.google.com/p/yaml-cpp/
Please let me know if it works for you.
