generic data structure in C [duplicate] - c

Is there any way to create a generic data structure in C and use functions according to the stored data type? I mean a structure that holds various types of data and that can, for example, be printed according to the stored data.
For example,
Suppose I wish to make a binary search tree that stores only floats and ints. The natural approach would be to create an enumeration with INT and FLOAT. It would look something like this:
typedef enum {INT, FLOAT} DataType;
typedef struct node
{
void *data;
DataType t;
struct node *left,
*right;
}Node;
If I want to print it out:
void printTree(Node *n)
{
if (n != NULL)
{
if (n->t == INT)
{
int *a = (int *) n->data;
printf("%d ", *a);
}
else
{
float *a = (float *) n->data;
printf("%f ", *a);
}
printTree(n->left);
printTree(n->right);
}
}
That's OK, but I want to store other data types too, such as a stack, a queue, or something else. That's why I created a tree that does not depend on a specific data type, such as:
typedef struct node
{
void *data;
struct node *left,
*right;
}Node;
If I want to print it out, I use callback functions, such as:
void printTree(Node *n, void (*print)(const void *))
{
if (n != NULL)
{
print(n->data);
printTree(n->left, print);
printTree(n->right, print);
}
}
But it falls down when I try to insert an integer and a float and print them out. My question is: is there a way of creating a generic data structure where a routine depends on a specific data type in one situation but not in another, for mixed data types? In this situation, should I create a structure that stores ints and floats, and use a print function like the first print code above as the callback function?
Observation: I just declared a node in the structure and did everything on it to keep things simple, but the idea is to use the structure with .h and .c files and all the abstraction that data structures involve.

I would suggest trying something like the following. You'll notice that Node contains a tagged union that allows for either a pointer type, an integer, or a floating point number. When Node holds a pointer type, the custom print function is called, and in the other cases, the appropriate printf format is used.
typedef enum {POINTER, INT, FLOAT} DataType;
typedef struct node
{
DataType t;
union {
void *pointer;
int integer;
float floating;
} data;
struct node *left,
*right;
} Node;
void printTree(Node *n, void (*print)(const void *))
{
if (n != NULL) {
switch (n->t) {
case POINTER:
print(n->data.pointer);
break;
case INT:
printf("%d ", n->data.integer);
break;
case FLOAT:
printf("%f ", n->data.floating);
break;
}
printTree(n->left, print);
printTree(n->right, print);
}
}

C doesn't support this kind of generic data type or structure directly. You have a few options:
If you can use Clang as the compiler, there is a language extension for overloading functions in C. You still have to cast the argument to the specific type so the compiler knows which function to call.
Use C++, either with overloaded functions (you still have to cast the argument so the compiler knows which of the available print functions to call) or with templates.
Create a function called print which takes something like
struct data_info {
void *data;
enum_describing_type type;
};
print then does a switch on type and calls the appropriate printInt, printFloat, etc.
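The last option can be sketched roughly as follows. This is only an illustration: the enum data_type and its TYPE_INT and TYPE_FLOAT values are assumed names standing in for enum_describing_type, the data pointer is made const, and the print functions return the character count from printf so the result can be checked.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed tag type; stands in for "enum_describing_type". */
enum data_type { TYPE_INT, TYPE_FLOAT };

struct data_info {
    const void *data;       /* points at the caller's value */
    enum data_type type;    /* says what data really points to */
};

static int printInt(const int *p)     { return printf("%d ", *p); }
static int printFloat(const float *p) { return printf("%.1f ", *p); }

/* Dispatches on the tag; returns the number of characters written,
   or -1 for an unknown tag. */
int print(struct data_info info)
{
    assert(info.data != NULL);
    switch (info.type) {
    case TYPE_INT:   return printInt(info.data);
    case TYPE_FLOAT: return printFloat(info.data);
    }
    return -1;
}
```

A caller would fill in a struct data_info with the address of a value and the matching tag, then pass it to print. Nothing checks that the tag matches the pointed-to object, so that pairing is the caller's responsibility.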

uthash is a collection of header files that provide typed hash table, linked list, etc. implementations, all using C preprocessor macros.

Related

Is there a way to make a single function operate on different structures (having common members) in c

After passing a void* pointer as an argument to a function, is there a way to specify the type to which it is cast as another parameter? If I have two structs like:
struct A {
int key;
char c;
};
struct B {
int key;
float d;
};
Is it possible to define a function,
void func(void * ptr, ...){
//operate on key
}
and pass a pointer to either struct to the function after casting to void*, and access the key element from within the function?
I'm trying to understand the use of void*, how structure definitions are stored (how are the offsets of the various elements determined from the structure definition?), and how polymorphism may be implemented in C.
I was trying to see if I could write binary search tree functions that could deal with nodes of any struct.
After passing a void* pointer as argument to a function, is there a way to specify the type to which it is cast as another parameter.
Yes and no.
I suppose you're hoping for something specific to this purpose, such as a variable that conveys a type name that the function can somehow use to perform the cast. Something along the lines of a type parameter in a C++ template, or a Java generic method, for example. C does not have any such thing.
But of course, you can use an ordinary integer to convey a code representing which of several known-in-advance types to cast to. If you like, you can even use an enum to give those codes meaningful names. For example:
enum arg_type { STRUCT_A_TYPE, STRUCT_B_TYPE };
void func(void *ptr, enum arg_type type) {
int key = 0;
switch (type) {
case STRUCT_A_TYPE:
key = ((struct A *) ptr)->key;
break;
case STRUCT_B_TYPE:
key = ((struct B *) ptr)->key;
break;
default:
assert(0);
}
// ...
}
Note well that that approach allows accessing any member of the pointed-to structure, but if you only want to access the first member, and it has the same type in every structure type of interest, then you don't need to know the specific structure type. In that particular case, you can cast directly to the member type:
void func(void *ptr) {
int key = *(int *)ptr;
// ...
}
That relies on C's guarantee that a pointer to any structure, suitably cast, points to that structure's first member.
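A small sketch of that first-member access, using the two structs from the question. The function name get_key is an assumption made for illustration, and the static assertions are an optional extra that documents the layout requirement at compile time.

```c
#include <assert.h>
#include <stddef.h>

struct A { int key; char c; };
struct B { int key; float d; };

/* Works for any struct whose FIRST member is an int key. */
int get_key(const void *ptr)
{
    assert(ptr != NULL);
    return *(const int *)ptr;
}

/* Compile-time check that key really sits at offset 0. */
_Static_assert(offsetof(struct A, key) == 0, "key must come first in struct A");
_Static_assert(offsetof(struct B, key) == 0, "key must come first in struct B");
```

Passing the address of a struct A or a struct B to get_key yields that object's key. If key were moved away from the front of either struct, the static assertion would stop the build.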
Trying to understand the use of void*, how structure definitions are stored and how polymorphism may be implemented in C.
That's awfully broad.
C does not offer polymorphism as a language feature, and C objects do not carry information about their type such as could be used to dispatch type-specific functions. You can, of course, implement that yourself, but it is non-trivial. Available approaches include, but are not limited to,
passing pointers to functions that do the right thing for the type of your data. The standard qsort() and bsearch() functions are the canonical examples of this approach.
putting some kind of descriptor object as the first member of every (structure) type. The type of that member can be a structure type itself, so it can convey arbitrarily complex data. Such as a vtable. As long as it is the first member of all your polymorphic structures, you can always access it from a pointer to one of them by casting to its type, as discussed above.
Using tagged unions of groups of polymorphic types (requiring that all the type alternatives in each group be known at build time). C then allows you to look at any members of the common initial sequence of all union members without knowing which member actually has a value. That initial sequence would ordinarily include the tag, so that you don't have to pass it separately, but it might include other information as well.
Polymorphism via (single-)inheritance can be implemented by giving each child type an object of its parent type as its first member. That then allows you to cast to (a pointer to) any supertype and get the right thing.
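As a rough sketch of the descriptor-as-first-member and single-inheritance approaches combined, the following uses made-up shape types. All names here (shape, rect, tri, shape_area, and so on) are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct shape;   /* the "parent" type */

struct shape_vtable {
    double (*area)(const struct shape *self);
};

struct shape {
    const struct shape_vtable *vt;   /* descriptor shared by all shapes */
};

struct rect {
    struct shape base;   /* parent object as the FIRST member */
    double w, h;
};

struct tri {
    struct shape base;
    double b, h;
};

static double rect_area(const struct shape *self)
{
    const struct rect *r = (const struct rect *)self;   /* downcast */
    return r->w * r->h;
}

static double tri_area(const struct shape *self)
{
    const struct tri *t = (const struct tri *)self;
    return 0.5 * t->b * t->h;
}

static const struct shape_vtable rect_vt = { rect_area };
static const struct shape_vtable tri_vt  = { tri_area };

struct rect make_rect(double w, double h)
{
    struct rect r = { { &rect_vt }, w, h };
    return r;
}

struct tri make_tri(double b, double h)
{
    struct tri t = { { &tri_vt }, b, h };
    return t;
}

/* Generic code: works on any shape without knowing its concrete type. */
double shape_area(const struct shape *s)
{
    assert(s != NULL && s->vt != NULL);
    return s->vt->area(s);
}
```

Callers can pass either &r.base or a cast of &r to shape_area, since a pointer to a structure, suitably converted, points to its first member.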
Let's say you had a sort function that takes a function as a parameter which implements the "compare" functionality of the sort. The sort would then be capable of sorting a list of any arbitrary struct, by handing it a comparer function that implements the correct order for your particular struct.
void bubbleSort(Node* start, bool comparerFunction(void* a, void* b));
Consider the following struct definition:
typedef struct {
int book_id;
char title[50];
char author[50];
char subject[100];
char ISBN[13];
} Book;
And this unremarkable linked list definition:
typedef struct node{
void* item;
struct node* next;
} Node;
Which can store an arbitrary struct in the item member.
Because you know the type of the members you've placed in your linked list, you can write a comparer function that will do the right thing:
bool sortByTitle(void* left, void* right) {
Book* a = (Book*)left;
Book* b = (Book*)right;
return strcmp(a->title, b->title) > 0;
}
And then call your sort like this:
bubbleSort(myList, sortByTitle);
For completeness, here is the bubbleSort implementation:
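/* Needs <stdbool.h> for bool and <stddef.h> for NULL. */
/* swap() is defined further down, so declare it before its first use. */
void swap(Node *a, Node *b);
/* Bubble sort the given linked list */
void bubbleSort(Node *start, bool greaterThan(void* a, void* b))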
{
int swapped, i;
Node* ptr1;
Node* lptr = NULL;
/* Checking for empty list */
if (start == NULL)
return;
do
{
swapped = 0;
ptr1 = start;
while (ptr1->next != lptr)
{
if (greaterThan(ptr1->item, ptr1->next->item))
{
swap(ptr1, ptr1->next);
swapped = 1;
}
ptr1 = ptr1->next;
}
lptr = ptr1;
}
while (swapped);
}
/* function to swap data of two nodes a and b*/
void swap(Node *a, Node *b)
{
void* temp = a->item;
a->item = b->item;
b->item = temp;
}
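The standard library's qsort uses the same comparator-callback idea, for arrays rather than linked lists. Here is a minimal sketch, assuming a trimmed-down Book with only two members; the names compare_by_title and sort_books_by_title are made up for this example. Note that qsort expects a three-way result (negative, zero, positive) rather than a bool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Trimmed version of the Book struct, for brevity. */
typedef struct {
    int book_id;
    char title[50];
} Book;

/* qsort comparator: negative, zero, or positive, like strcmp. */
static int compare_by_title(const void *left, const void *right)
{
    const Book *a = left;
    const Book *b = right;
    return strcmp(a->title, b->title);
}

void sort_books_by_title(Book *books, size_t count)
{
    if (count == 0)
        return;             /* nothing to sort */
    assert(books != NULL);
    qsort(books, count, sizeof books[0], compare_by_title);
}
```

Sorting by another member only requires a different comparator; the call to qsort stays the same.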

Struct pointer inheritance in C?

Recently I started to look into the source code of an operating system, and there is a special coding technique that puzzles me a lot.
First the source code declares a very basic struct, such as:
struct cmd {
int type;
};
It then goes on to declare several other structs that begin with the same member as the basic struct:
struct execcmd {
int type; //Here.
char *argv[MAXARGS];
char *eargv[MAXARGS];
};
struct redircmd {
int type; //Here.
struct cmd *cmd;
char *file;
char *efile;
int mode;
int fd;
};
Because the first few bytes of these structs are identical, we can access the shared int type member even when we are not sure exactly which structure it is. We can then use the int type member to cast the struct pointer to the correct type:
void runcmd(struct cmd *cmd)
{
switch(cmd->type){
case EXEC:
ecmd = (struct execcmd*)cmd;
break;
case REDIR:
rcmd = (struct redircmd*)cmd;
break;
case LIST:
lcmd = (struct listcmd*)cmd;
break;
case PIPE:
pcmd = (struct pipecmd*)cmd;
break;
case BACK:
bcmd = (struct backcmd*)cmd;
break;
}
}
So my question is: what is the name and benefit of this technique, and what is its normal use case?
This is known as a "common initial sequence". Per 6.5.2.3 Structure and union members, paragraph 6 of the C11 standard:
One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
Strictly speaking, the code you have posted is incorrect as the struct members are not used via a common union.
This technique is used any time you're storing some data that may have an arbitrary type from some set of types. This is what could be called a variant type.
Suppose you were writing a parser for mathematical expressions - building an AST (abstract syntax tree). You'd want each node in the tree to be able to be handled generically by some code that for example can serialize and deserialize the tree. The generic code could use the type tag to call the type-specific serialization/deserialization method (also called a virtual observer method). The observers would then cast the node to a concrete "derived" type, and use that to operate on it.
enum NodeType { NodeA, NodeB };
struct Node {
enum NodeType type;
} typedef Node;
typedef void (*Observer)(Node *node, void *context);
void serializeNodeA(Node *node, void *context);
void deserializeNodeA(Node *node, void *context);
void serializeNodeB(Node *node, void *context);
void deserializeNodeB(Node *node, void *context);
struct VirtualMethods {
Observer serialize;
Observer deserialize;
} typedef VirtualMethods;
const VirtualMethods vtables[] = {
{serializeNodeA, deserializeNodeA},
{serializeNodeB, deserializeNodeB}};
void serializeNode(Node *node, void *context) {
int type = node->type;
Observer serialize = vtables[type].serialize;
serialize(node, context);
}
void deserializeNode(Node *node, void *context) {
int type = node->type;
Observer deserialize = vtables[type].deserialize;
deserialize(node, context);
}
There are other applications as well, of course.
Using the type integer tag to select a virtual function table saves space compared to directly storing a virtual function table pointer, and is more flexible since to compare types you don't need to compare pointers to tables.

Generic data structure search in c

I'm trying to code a fully generic data structure library in C.
Is there any way or technique in C programming that allows searching data without knowing its type?
As it stands, I have to define a new compare function for each data type.
list.h
typedef struct _node
{
void *data;
struct _node *next;
}NODE;
typedef struct _list
{
NODE *head;
NODE *tail;
NODE *current;
}GLIST;
int search(GLIST *list,void *data,int (*COMPARE)(void*,void*));
and
list.c
int search(GLIST *list,void *data,int(*COMPARE)(void*,void*))
{
list->current=list->head;
int cIndex=1;
while(list->current)
{
if(COMPARE(list->current->data,data))
{
printf("data found at position %i.\n",cIndex);
return 1; // stop at the first match
}
list->current=list->current->next;
cIndex++;
}
printf("NO DATA FOUND.\n");
return 0;
}
and
mycode.c
int compare(void *list,void *data);
typedef struct _student
{
int studentNumber;
char name[64];
}STUDENT;
int main()
{
GLIST list;
//initializing list......
STUDENT stud;
//code .....
search(&list,&stud,compare); // I want an alternative to using compare here
// search(&list,&stud); // I want the call to look like this and still be generic!
return 0;
}
int compare(void *list,void *data)
{
// I would rather not have to declare this function at all
return !strcmp(((STUDENT*)list)->name,((STUDENT*)data)->name);
}
I'm wondering whether C has a common way to compare elements (structures, unions, arrays), or any other technique for this.
There is no way of comparing two objects without knowing their data type.
A first attempt would probably be to use something like memcmp, but this fails for at least three reasons:
Without knowing the type, you do not know the size of the object.
Even if you could somehow derive a size, comparing objects of type struct or union could give wrong results because of padding.
A comparison based on the memory layout could at most achieve a "shallow" comparison, which may not represent "equality" in terms of the respective data type.
So the only way (and this is used by generic libraries) is to define functions that accept user-defined comparison functions as parameters.
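The comparison function cannot be avoided, but it can be registered once instead of being passed on every call. The following is a sketch under assumptions: the compare member, the COMPARE_FN typedef, list_init, list_append, and student_name_equal are all made-up names, and nodes are supplied by the caller to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*COMPARE_FN)(const void *, const void *);

typedef struct gnode {
    void *data;
    struct gnode *next;
} NODE;

typedef struct glist {
    NODE *head;
    NODE *tail;
    COMPARE_FN compare;   /* set once, when the list is created */
} GLIST;

void list_init(GLIST *list, COMPARE_FN compare)
{
    assert(list != NULL && compare != NULL);
    list->head = list->tail = NULL;
    list->compare = compare;
}

/* Appends a caller-owned node; no allocation in this sketch. */
void list_append(GLIST *list, NODE *node, void *data)
{
    node->data = data;
    node->next = NULL;
    if (list->tail)
        list->tail->next = node;
    else
        list->head = node;
    list->tail = node;
}

/* Returns the 1-based position of the first match, or 0 if none. */
int search(const GLIST *list, const void *data)
{
    int index = 1;
    for (const NODE *n = list->head; n != NULL; n = n->next, index++) {
        if (list->compare(n->data, data))
            return index;
    }
    return 0;
}

typedef struct {
    int studentNumber;
    char name[64];
} STUDENT;

/* Type-specific part: still written once per element type. */
int student_name_equal(const void *a, const void *b)
{
    return strcmp(((const STUDENT *)a)->name,
                  ((const STUDENT *)b)->name) == 0;
}
```

With this layout, a call looks like search(&list, &stud). The type-specific knowledge has only moved into the list itself.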

Storing and using type information in C

I'm coming from Java and I'm trying to implement a doubly linked list in C as an exercise. I wanted to do something like Java generics, where I would pass a pointer type to the list initialization and this pointer type would be used to cast the list's void pointer, but I'm not sure if this is possible.
What I'm looking for is something that can be stored in a list struct and used to cast *data to the correct type from a node. I was thinking of using a double pointer, but then I'd need to declare that as a void pointer and I'd have the same problem.
typedef struct node {
void *data;
struct node *next;
struct node *previous;
} node;
typedef struct list {
node *head;
node *tail;
//??? is there any way to store the data type of *data?
} list;
Typically, type-specific functions like the following are used.
void List_Put_int(list *L, int *i);
void List_Put_double(list *L, double *d);
int * List_Get_int(list *L);
double *List_Get_double(list *L);
A more advanced approach, less friendly to learners, uses _Generic. C11 offers _Generic, which lets code be steered at compile time based on the type of an expression.
The code below is a basic sketch that saves and fetches 3 types of pointers. The macros need to be extended for each new type. _Generic does not allow listing 2 types that may be the same, like unsigned * and size_t *, so there are limitations.
The type_id(X) macro creates an enumeration for the 3 types, which may be used to check for run-time problems, as with LIST_POP(L, &d); below.
typedef struct node {
void *data;
int type;
} node;
typedef struct list {
node *head;
node *tail;
} list;
node node_var;
void List_Push(list *l, void *p, int type) {
// tbd code - simplistic use of global for illustration only
node_var.data = p;
node_var.type = type;
}
void *List_Pop(list *l, int type) {
// tbd code
assert(node_var.type == type);
return node_var.data;
}
#define cast(X,ptr) _Generic((X), \
double *: (double *) (ptr), \
unsigned *: (unsigned *) (ptr), \
int *: (int *) (ptr) \
)
#define type_id(X) _Generic((X), \
double *: 1, \
unsigned *: 2, \
int *: 3 \
)
#define LIST_PUSH(L, data) { List_Push((L),(data), type_id(data)); }
#define LIST_POP(L, dataptr) (*(dataptr)=cast(*(dataptr), List_Pop((L), type_id(*(dataptr)))) )
Usage example and output
int main() {
list *L = 0; // tbd initialization
int i = 42;
printf("%p %d\n", (void*) &i, i);
LIST_PUSH(L, &i);
int *j;
LIST_POP(L, &j);
printf("%p %d\n", (void*) j, *j);
double *d;
LIST_POP(L, &d);
}
42
42
assertion error
There is no way to do what you want in C. There is no way to store a type in a variable and C doesn't have a template system like C++ that would allow you to fake it in the preprocessor.
You could define your own template-like macros that could quickly define your node and list structs for whatever type you need, but I think that sort of hackery is generally frowned upon unless you really need a whole bunch of linked lists that only differ in the type they store.
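A small sketch of such a template-like macro, using the token-pasting operator ## to build type-specific names. It generates a fixed-capacity stack rather than a linked list to keep the example short; DEFINE_STACK and the generated names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* DEFINE_STACK(int, int_stack) expands to a struct type int_stack and
   the functions int_stack_push and int_stack_pop. */
#define DEFINE_STACK(T, NAME)                                   \
    typedef struct {                                            \
        T items[16];                                            \
        size_t count;                                           \
    } NAME;                                                     \
    static inline int NAME##_push(NAME *s, T value)             \
    {                                                           \
        if (s->count == 16)                                     \
            return 0;                                           \
        s->items[s->count++] = value;                           \
        return 1;                                               \
    }                                                           \
    static inline T NAME##_pop(NAME *s)                         \
    {                                                           \
        assert(s->count > 0);                                   \
        return s->items[--s->count];                            \
    }

/* One line per element type stamps out a fully typed container. */
DEFINE_STACK(int, int_stack)
DEFINE_STACK(double, double_stack)
```

Unlike the void * approach, the compiler checks element types here, at the cost of generated code per type and harder-to-read error messages.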
C doesn't have any runtime type information and doesn't have a type "Type". Types are meaningless once the code was compiled. So, there's no solution to what you ask provided by the language.
One common reason you would want to have a type available at runtime is that you have some code that might see different instances of your container and must do different things for different types stored in the container. You can easily solve such a situation using an enum, e.g.
enum ElementType
{
ET_INT, // int
ET_DOUBLE, // double
ET_CAR, // struct Car
// ...
};
and enumerate any type here that should ever go into your container. Another reason is if your container should take ownership of the objects stored in it and therefore must know how to destroy them (and sometimes how to clone them). For such cases, I recommend the use of function pointers:
typedef void (*ElementDeleter)(void *element);
typedef void *(*ElementCloner)(const void *element);
Then extend your struct to contain these:
typedef struct list {
node *head;
node *tail;
ElementDeleter deleter;
ElementCloner cloner;
} list;
Make sure they are set to a function that actually deletes resp. clones an element of the type to be stored in your container and then use them where needed, e.g. in a remove function, you could do something like
myList->deleter(myNode->data);
// delete the contained element without knowing its type
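To make the ownership idea concrete, here is a sketch of a list that frees its elements through the stored deleter. Only the deleter is shown (no cloner), and the helper names list_append and list_clear are assumptions for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*ElementDeleter)(void *element);

typedef struct node {
    void *data;
    struct node *next;
    struct node *previous;
} node;

typedef struct list {
    node *head;
    node *tail;
    ElementDeleter deleter;   /* may be NULL if the list does not own its data */
} list;

/* Appends data at the tail; returns 1 on success, 0 if allocation fails. */
int list_append(list *l, void *data)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->data = data;
    n->next = NULL;
    n->previous = l->tail;
    if (l->tail)
        l->tail->next = n;
    else
        l->head = n;
    l->tail = n;
    return 1;
}

/* Destroys every element via the deleter, frees the nodes, and
   returns how many nodes were removed. */
size_t list_clear(list *l)
{
    size_t removed = 0;
    assert(l != NULL);
    for (node *n = l->head; n != NULL; ) {
        node *next = n->next;
        if (l->deleter)
            l->deleter(n->data);
        free(n);
        n = next;
        removed++;
    }
    l->head = l->tail = NULL;
    return removed;
}
```

For elements allocated with malloc, the standard free function can serve directly as the deleter, since its signature matches ElementDeleter.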
Create an enum type that stores the data type, and allocate memory according to this enum. This could be done in a switch/case construct.
C provides much weaker type safety than Java or C++. To answer your question succinctly, you could rearrange your node type this way:
struct node {
struct node *prev; /* put these at front */
struct node *next;
/* no data here */
};
You could then separately declare nodes carrying any data
struct data_node {
struct data_node *prev; // keep these two data members at the front
struct data_node *next; // and in the same order as in struct node.
// you can add more data members here.
};
/* OR... */
struct data_node2 {
struct node node_data; /* WARNING: this may look a bit safer, but it only works if placed at the front. */
/* more data ... */
};
You can then create a library that operates on data-less lists of nodes.
void list_add(struct list* l, struct node* n);
void list_remove(struct list* l, struct node* n);
/* etc... */
By casting, you can then use this "generic list" API to operate on your lists.
You can still put some type information in your list declarations, for what it's worth, since C does not provide meaningful type protection here.
struct data_list
{
struct data_node* head; /* this makes intent clear. */
struct data_node* tail;
};
struct data2_list
{
struct data_node2* head;
struct data_node2* tail;
};
/* ... */
struct data_node* my_data_node = malloc(sizeof(struct data_node));
struct data_node2* my_data_node2 = malloc(sizeof(struct data_node2));
/* ... */
list_add((struct list*)&my_list, (struct node*)my_data_node);
list_add((struct list*)&my_list2, &(my_data_node2->node_data));
/* warning above is because one could write this */
list_add((struct list*)&my_list2, (struct node*)my_data_node2);
/* etc... */
These two techniques generate the same object code, so which one you choose is up to you, really.
As an aside, I prefer to avoid the typedef struct notation. In C that means writing the struct keyword at every use, but it increases readability in the long run, IMHO. You can be certain that some will agree with me on this subject and some won't.

Modular data structure in C with dynamic data type

For my upcoming university C project, I'm requested to have modular code as C allows it. Basically, I'll have .c file and a corresponding .h file for some data structure, like a linked list, binary tree, hash table, whatever...
Using a linked list as an example, I have this:
typedef struct sLinkedList {
int value;
struct sLinkedList *next;
} List;
But this forces value to be of type int and the user using this linked list library would be forced to directly change the source code of the library. I want to avoid that, I want to avoid the need to change the library, to make the code as modular as possible.
My project may need to use a linked list for a list of integers, or maybe a list of some structure. But I'm not going to duplicate the library files/code and change the code accordingly.
How can I solve this?
Unfortunately, there is no simple way to solve this. The most common, pure C approach to this type of situation is to use a void*, and to copy the value into memory allocated by you into the pointer. This makes usage tricky, though, and is very error prone.
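A minimal sketch of that void * approach, in which each node owns a heap copy of the value. The function names list_push_copy and list_free are assumptions, and the caller must remember which type each node holds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct sLinkedList {
    void *value;                  /* heap copy owned by the node */
    struct sLinkedList *next;
} List;

/* Pushes a copy of `size` bytes from `value` onto the front.
   Returns the new head, or NULL on allocation failure (in which
   case the old list is left untouched). */
List *list_push_copy(List *head, const void *value, size_t size)
{
    assert(value != NULL && size > 0);
    List *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->value = malloc(size);
    if (node->value == NULL) {
        free(node);
        return NULL;
    }
    memcpy(node->value, value, size);
    node->next = head;
    return node;
}

/* Frees every node together with its copied value. */
void list_free(List *head)
{
    while (head != NULL) {
        List *next = head->next;
        free(head->value);
        free(head);
        head = next;
    }
}
```

Because the list stores copies, the original variables can change or go out of scope afterwards. Reading a value back requires a cast such as *(int *)node->value, which is where the error-prone part lies.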
Another alternative no one has mentioned yet can be found in the Linux kernel's list.h generic linked list implementation. The principle is this:
/* generic definition */
struct list {
struct list *next, *prev;
};
// some more code
/* specific version */
struct intlist {
struct list list;
int i;
};
If you make struct intlist* pointers, they can safely be cast (in C) to struct list* pointers, thus allowing you to write genericized functions that operate on struct list* and have them work regardless of datatype.
The list.h implementation uses some macro trickery to support arbitrary placement of the struct list inside your specific list, but I prefer to rely on the struct-cast-to-first-member trick myself. It makes the calling code much easier to read. Granted, it disables "multiple inheritance" (assuming you consider this to be some kind of inheritance) but next(mylist) looks nicer than next(mylist, list). Plus, if you can avoid delving into offsetof hackery, you're probably going to end up in better shape.
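For comparison, the offsetof-based alternative can be sketched as follows. This is a simplified take on the idea behind the kernel's container_of macro, not the kernel's actual definition, and the member name link and the function intlist_value are assumptions.

```c
#include <assert.h>
#include <stddef.h>

struct list {
    struct list *next, *prev;
};

/* Recover a pointer to the enclosing struct from a pointer to one
   of its members, by subtracting the member's offset. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct intlist {
    int i;
    struct list link;   /* deliberately NOT the first member */
};

/* Generic code hands us a struct list *; we map it back to the item. */
int intlist_value(struct list *link)
{
    assert(link != NULL);
    return container_of(link, struct intlist, link)->i;
}
```

This lets one struct sit on several lists at once by embedding several struct list members, which the first-member cast cannot do. The price is that every access must name the type and the member.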
Since this is a university project, we can't just give you the answer. Instead, I'd invite you to meditate on two C features: the void pointer (which you've likely encountered before), and the token pasting operator (which you may not have).
You can avoid this by defining value as void* value;. You can assign a pointer to any type of data this way, but the calling code is required to cast and dereference the pointer to the correct type. One way to keep track of this would be to add a short char array to the struct to note the type name.
This problem is precisely the reason why templates were developed for C++. The approach I've used once or twice in C is to have the value field be a void*, and cast the values thereto on insertion and cast them back on retrieval. This is far from type-safe, of course. For extra modularity, I might write insert_int(), get_mystruct() etc. functions for each type you use this for, and do the casting there.
You can use void* instead of int. This allows the data to be of any type, but the user must know the type of the data.
For that, you can optionally add another member that records the type, for example an enum {INT, CHAR, FLOAT, ...}.
Unlike C++ where one can use template, void * is the de-facto C solution.
Also, you can put the elements of the linked list in a separate struct, e.g:
typedef struct sLinkedListElem {
int value; /* or "void * value" */
} ListElem;
typedef struct sLinkedList {
ListElem data;
struct sLinkedList *next;
} List;
so that the elements can be changed without affecting the link-ing code.
Here is an example of linked list utilities in C:
struct Single_List_Node
{
struct Single_List_Node * p_next;
void * p_data;
};
struct Double_List_Node
{
struct Double_List_Node * p_next;
struct Double_List_Node * p_prev; // pointer to previous node
void * p_data;
};
struct Single_List_Data_Type
{
size_t size; // Number of elements in list
struct Single_List_Node * p_first_node;
struct Single_List_Node * p_last_node; // To make appending faster.
};
Some generic functions:
/* Needs <stdlib.h> for malloc and NULL. */
void Single_List_Create(struct Single_List_Data_Type * p_list)
{
if (p_list)
{
p_list->size = 0;
p_list->p_first_node = NULL;
p_list->p_last_node = NULL;
}
}
void Single_List_Append(struct Single_List_Data_Type * p_list,
void * p_data)
{
if (p_list)
{
struct Single_List_Node * p_new_node = malloc(sizeof(struct Single_List_Node));
if (p_new_node)
{
p_new_node->p_data = p_data;
p_new_node->p_next = NULL;
if (p_list->p_last_node)
{
/* Non-empty list: link the new node after the current tail. */
p_list->p_last_node->p_next = p_new_node;
}
else
{
/* Empty list: the new node is also the first node. */
p_list->p_first_node = p_new_node;
}
p_list->p_last_node = p_new_node;
++(p_list->size);
}
}
}
You can put all these functions into a single source file and the function declarations into a header file. This will allow you to use the functions with other programs and not have to recompile all the time. The void * for the pointer to data will allow you to use the list with many different data types.
(The above code comes as-is and has not been tested with any compiler. The responsibility of bug fixing is up to the user of the examples.)
