I am trying to wrap my head around the concept of using macros to define data structure operations. The following code is a simple example that uses the built-in list library in FreeBSD. In the library, all operations are defined as macros. I have seen this approach in a couple of other libraries as well.
I can see that this has some advantages, e.g. the ability to use any data structure as an element in the list. But I do not quite understand how this works. For example:
What is stailhead? This seems to be "just" defined.
How to pass head and entries to a function?
What type is head, how can I declare a pointer to it?
Is there a standard name for this technique which I can use to search google, or any book which explains this concept? Any links or good explanation as to how this technique works will be much appreciated.
Thanks to Niklas B. I ran gcc -E and got this definition for head
struct stailhead {
struct stailq_entry *stqh_first;
struct stailq_entry **stqh_last;
} head = { ((void *)0), &(head).stqh_first };
and this for stailq_entry
struct stailq_entry {
int value;
struct { struct stailq_entry *stqe_next; } entries;
};
So I guess head is of type struct stailhead.
#include <stdio.h>
#include <stdlib.h>
#include <sys/queue.h>
struct stailq_entry {
int value;
STAILQ_ENTRY(stailq_entry) entries;
};
int main(void)
{
STAILQ_HEAD(stailhead, stailq_entry) head = STAILQ_HEAD_INITIALIZER(head);
struct stailq_entry *n1;
unsigned i;
STAILQ_INIT(&head); /* Initialize the queue. */
for (i=0;i<10;i++){
n1 = malloc(sizeof(struct stailq_entry)); /* Insert at the head. */
n1->value = i;
STAILQ_INSERT_HEAD(&head, n1, entries);
}
n1 = NULL;
while (!STAILQ_EMPTY(&head)) {
n1 = STAILQ_LAST(&head, stailq_entry, entries);
STAILQ_REMOVE(&head, n1, stailq_entry, entries);
printf ("n2: %d\n", n1->value);
free(n1);
}
return (0);
}
First read this to get a handle on what these macros do, and then go to queue.h. You'll find a treasure trove there!
I found a few gold coins for you:
#define STAILQ_HEAD(name, type) \
struct name { \
struct type *stqh_first;/* first element */ \
struct type **stqh_last;/* addr of last next element */ \
}
Let's dig a bit deeper and answer your questions.
What is stailhead? This seems to be "just" defined.
#define STAILQ_HEAD(name, type) \
struct name { \
struct type *stqh_first;/* first element */ \
struct type **stqh_last;/* addr of last next element */ \
}
STAILQ_HEAD(stailhead, entry) head =
STAILQ_HEAD_INITIALIZER(head);
struct stailhead *headp; /* Singly-linked tail queue head. */
So stailhead is the tag of a structure type, struct stailhead, which the STAILQ_HEAD macro defines.
How to pass head and entries to a function?
#define STAILQ_ENTRY(type) \
struct { \
struct type *stqe_next; /* next element */ \
}
So entries and head (as explained before) are just structures, and you can pass them as you would any other structure, typically by address: &structure_variable.
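To make the pointer-to-pointer trick behind stqh_last concrete, here is a minimal sketch that writes the two structures out by hand, the way gcc -E shows them, together with hand-written equivalents of STAILQ_INIT and STAILQ_INSERT_TAIL. The function names my_init and my_insert_tail are made up for illustration and are not part of queue.h.

```c
#include <assert.h>
#include <stddef.h>

/* What STAILQ_ENTRY and STAILQ_HEAD expand to, written out by hand. */
struct stailq_entry {
    int value;
    struct { struct stailq_entry *stqe_next; } entries;
};

struct stailhead {
    struct stailq_entry *stqh_first;  /* first element */
    struct stailq_entry **stqh_last;  /* address of the last next pointer */
};

/* Hypothetical hand-written equivalent of STAILQ_INIT. */
void my_init(struct stailhead *h)
{
    h->stqh_first = NULL;
    h->stqh_last = &h->stqh_first;
}

/* Hypothetical hand-written equivalent of STAILQ_INSERT_TAIL.
 * No special case is needed for an empty queue, because stqh_last
 * always points at the pointer that has to be overwritten. */
void my_insert_tail(struct stailhead *h, struct stailq_entry *n)
{
    n->entries.stqe_next = NULL;
    *h->stqh_last = n;
    h->stqh_last = &n->entries.stqe_next;
}
```

Since both functions take a plain struct stailhead *, this also shows that the head can be passed around like any other structure once its type has a name.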
What type is head, how can I declare a pointer to it?
As explained above, head is of type struct stailhead, so a pointer to it is declared as struct stailhead *headp;
Read this man page for nice pretty examples.
I'm working on some data structures in C. I have created a queue data structure that can take any type of data. This is currently done with a macro that defaults to int.
#ifndef DATATYPE
#define DATATYPE int
#endif
The queue header is being included in another data structure - the binary search tree, and I am using the queue for a breadth-first-search implementation. In the Makefile, I have modified the DATATYPE macro from an int to a binary_tree_node_t * type.
binary_tree: DEFS=-DDATATYPE="struct BINARY_TREE_NODE *"
My question is, is there a better way to do this using typedefs? Can I define a type DATATYPE as an int in the queue implementation, but have it get modified in a different header file?
Or is it possible to implement a queue that can take any datatype?
Here are the source files (trimmed for the sake of brevity) for reference:
queue.h
#ifndef _QUEUE_H
#define _QUEUE_H
#ifndef DATATYPE
#define DATATYPE int
#endif
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
typedef struct LL_NODE {
DATATYPE data;
struct LL_NODE* next;
} node_t;
typedef struct QUEUE {
node_t* head;
node_t* tail;
int size;
} queue_t;
queue_t* init_queue();
void destroy(queue_t* queue);
bool is_empty(queue_t* queue);
int size(queue_t* queue);
void enqueue(queue_t* queue, DATATYPE data);
DATATYPE dequeue(queue_t* queue);
DATATYPE peek(queue_t* queue);
#endif
binary_search_tree.c
#include "binary_search_tree.h"
void bfs_trav(binary_tree_node_t* root) {
queue_t* queue = init_queue();
binary_tree_node_t* temp = root;
enqueue(queue, root);
while (!is_empty(queue)) {
temp = dequeue(queue);
printf("%d ", temp->data);
if (temp->left) {
enqueue(queue, temp->left);
}
if (temp->right) {
enqueue(queue, temp->right);
}
}
destroy(queue);
return;
}
binary_search_tree.h
#ifndef _BINARY_SEARCH_TREE_H
#define _BINARY_SEARCH_TREE_H
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include "../queue/queue.h"
typedef struct BINARY_TREE_NODE {
int data;
struct BINARY_TREE_NODE *left;
struct BINARY_TREE_NODE *right;
} binary_tree_node_t;
void bfs_trav(binary_tree_node_t* root);
#endif
One popular way to do what you're trying to do is to nest your data structures and thereby make "polymorphic" types. In your case, that would mean removing the payload from the LL_NODE struct entirely and declaring various "derived" structs as needed. For example:
typedef struct LL_NODE {
struct LL_NODE* next;
} node_t;
typedef struct INT_NODE {
node_t node;
int payload;
} int_node_t;
typedef struct TREE_NODE {
node_t node;
struct BINARY_TREE_NODE *payload;
} tree_node_t;
Any time you want to work with the linked list simply pass a pointer to the node struct member for any derived type.
Of course for this to be useful, you'll need a way to "downcast" from a linked list node to a derived type. This is usually done with a macro that will look roughly as follows:
/* Requires <stddef.h> for offsetof. */
#define LIST_NODE_DOWNCAST(ptr, derivedType, nodeName) \
((derivedType *)((char *)(ptr) - offsetof(derivedType, nodeName)))
So for example, given a node_t pointer called list_ptr, that you know refers to an int_node_t object, you can recover the int_node_t object as follows:
int_node_t *derived = LIST_NODE_DOWNCAST(list_ptr, int_node_t, node);
Note that this is the primary method used in the Linux kernel for managing queues and lists. I answered a related question a while ago where you can see a specific example in usage in the kernel.
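As a rough sketch of how the downcast works when the embedded node is not the first member, the following uses offsetof from <stddef.h>. The macro name NODE_TO_CONTAINER and the helpers list_push and sum_ints are assumptions made for this illustration; they are not taken from the kernel or from your code.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, NULL */

typedef struct LL_NODE {
    struct LL_NODE *next;
} node_t;

typedef struct INT_NODE {
    int payload;      /* deliberately placed before the link */
    node_t node;
} int_node_t;

/* Recover the enclosing struct from a pointer to its embedded node. */
#define NODE_TO_CONTAINER(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Push a bare node onto the front of a singly linked list. */
void list_push(node_t **head, node_t *n)
{
    n->next = *head;
    *head = n;
}

/* Sum the payloads of a list known to contain only int_node_t objects. */
int sum_ints(node_t *head)
{
    int total = 0;
    for (node_t *p = head; p != NULL; p = p->next)
        total += NODE_TO_CONTAINER(p, int_node_t, node)->payload;
    return total;
}
```

The list code only ever sees node_t pointers; it is the caller's job to know which derived type a given list holds.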
I would suggest (quite strongly) that the makefile is the wrong place to make that change. You should be setting the data type in the BST code before including queue.h:
…
#define DATATYPE struct BINARY_TREE_NODE *
#include "queue/queue.h"
…
Your binary tree code knows what it wants to use; it should not rely on the vagaries of the makefile to ensure it gets what it needs.
Whether you should add #undef DATATYPE before defining it is trickier; I'd not do it unless it was proven necessary, and I'd also debate why it's necessary and how you are going to have two sets of functions with the same names but working with different types, given that this is C and not C++. On the whole, not using #undef is safer; you will be told if there is a problem which using #undef will conceal, and the concealed problems will be hell to debug!
I'm coming from Java and I'm trying to implement a doubly linked list in C as an exercise. I wanted to do something like Java generics, where I would pass a pointer type to the list initialization and this pointer type would be used to cast the list's void pointer, but I'm not sure if this is possible.
What I'm looking for is something that can be stored in a list struct and used to cast *data to the correct type from a node. I was thinking of using a double pointer but then I'd need to declare that as a void pointer and I'd have the same problem.
typedef struct node {
void *data;
struct node *next;
struct node *previous;
} node;
typedef struct list {
node *head;
node *tail;
//??? is there any way to store the data type of *data?
} list;
Typically, type-specific functions like the following are used.
void List_Put_int(list *L, int *i);
void List_Put_double(list *L, double *d);
int * List_Get_int(list *L);
double *List_Get_double(list *L);
An approach that is not so easy for learners uses _Generic. C11 offers _Generic, which allows code to be steered at compile time based on type.
The code below offers basic support for saving and fetching 3 types of pointers. The macros would need to be expanded for each new type. _Generic does not allow two listed types that may be the same, like unsigned * and size_t *, so there are limitations.
The type_id(X) macro creates an enumeration for the 3 types, which may be used to check for run-time problems, as with LIST_POP(L, &d); below.
#include <assert.h>
#include <stdio.h>
typedef struct node {
void *data;
int type;
} node;
typedef struct list {
node *head;
node *tail;
} list;
node node_var;
void List_Push(list *l, void *p, int type) {
// tbd code - simplistic use of global for illustration only
node_var.data = p;
node_var.type = type;
}
void *List_Pop(list *l, int type) {
// tbd code
assert(node_var.type == type);
return node_var.data;
}
#define cast(X,ptr) _Generic((X), \
double *: (double *) (ptr), \
unsigned *: (unsigned *) (ptr), \
int *: (int *) (ptr) \
)
#define type_id(X) _Generic((X), \
double *: 1, \
unsigned *: 2, \
int *: 3 \
)
#define LIST_PUSH(L, data) { List_Push((L),(data), type_id(data)); }
#define LIST_POP(L, dataptr) (*(dataptr)=cast(*dataptr, List_Pop((L), type_id(*dataptr))) )
Usage example and output
int main() {
list *L = 0; // tbd initialization
int i = 42;
printf("%p %d\n", (void*) &i, i);
LIST_PUSH(L, &i);
int *j;
LIST_POP(L, &j);
printf("%p %d\n", (void*) j, *j);
double *d;
LIST_POP(L, &d);
}
42
42
assertion error
There is no way to do what you want in C. There is no way to store a type in a variable and C doesn't have a template system like C++ that would allow you to fake it in the preprocessor.
You could define your own template-like macros that could quickly define your node and list structs for whatever type you need, but I think that sort of hackery is generally frowned upon unless you really need a whole bunch of linked lists that only differ in the type they store.
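For what it's worth, a minimal sketch of such a template-like macro might look like this. DEFINE_LIST and the generated names (int_list, int_list_push, and so on) are invented for this example; each expansion produces a separate node type, list type, and pair of functions for the given element type.

```c
#include <assert.h>
#include <stdlib.h>

/* Generates struct NAME, struct NAME_node and two functions for type T:
 * NAME_push adds to the front (0 on success, -1 if malloc fails);
 * NAME_pop removes from the front into *out (0 on success, -1 if empty). */
#define DEFINE_LIST(NAME, T)                                        \
    struct NAME##_node { T value; struct NAME##_node *next; };      \
    struct NAME { struct NAME##_node *head; };                      \
                                                                    \
    int NAME##_push(struct NAME *l, T v)                            \
    {                                                               \
        struct NAME##_node *n = malloc(sizeof *n);                  \
        if (n == NULL) return -1;                                   \
        n->value = v;                                               \
        n->next = l->head;                                          \
        l->head = n;                                                \
        return 0;                                                   \
    }                                                               \
                                                                    \
    int NAME##_pop(struct NAME *l, T *out)                          \
    {                                                               \
        struct NAME##_node *n = l->head;                            \
        if (n == NULL) return -1;                                   \
        *out = n->value;                                            \
        l->head = n->next;                                          \
        free(n);                                                    \
        return 0;                                                   \
    }

DEFINE_LIST(int_list, int)
DEFINE_LIST(dbl_list, double)
```

The compiler then checks element types for you (passing a double * to int_list_pop is diagnosed), at the cost of duplicated code and error messages that point into the macro.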
C doesn't have any runtime type information and doesn't have a type "Type". Types are meaningless once the code was compiled. So, there's no solution to what you ask provided by the language.
One common reason you would want to have a type available at runtime is that you have some code that might see different instances of your container and must do different things for different types stored in the container. You can easily solve such a situation using an enum, e.g.
enum ElementType
{
    ET_INT,    // int
    ET_DOUBLE, // double
    ET_CAR,    // struct Car
    // ...
};
and enumerate any type here that should ever go into your container. Another reason is if your container should take ownership of the objects stored in it and therefore must know how to destroy them (and sometimes how to clone them). For such cases, I recommend the use of function pointers:
typedef void (*ElementDeleter)(void *element);
typedef void *(*ElementCloner)(const void *element);
Then extend your struct to contain these:
typedef struct list {
node *head;
node *tail;
ElementDeleter deleter;
ElementCloner cloner;
} list;
Make sure they are set to functions that actually delete or clone, respectively, an element of the type to be stored in your container, and then use them where needed. For example, in a remove function, you could do something like
myList->deleter(myNode->data);
// delete the contained element without knowing its type
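Here is a small sketch of how the deleter could be wired into the list, assuming the doubly linked node from the question. The functions list_append and list_clear are hypothetical helpers written for this example, and the cloner is left out to keep it short.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*ElementDeleter)(void *element);

typedef struct node {
    void *data;
    struct node *next;
    struct node *previous;
} node;

typedef struct list {
    node *head;
    node *tail;
    ElementDeleter deleter;  /* may be NULL if the list does not own its data */
} list;

/* Append data at the tail; returns 0 on success, -1 if malloc fails. */
int list_append(list *l, void *data)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = data;
    n->next = NULL;
    n->previous = l->tail;
    if (l->tail != NULL)
        l->tail->next = n;
    else
        l->head = n;
    l->tail = n;
    return 0;
}

/* Free every node, calling the deleter on each element without knowing
 * its type. Returns the number of nodes freed. */
size_t list_clear(list *l)
{
    size_t freed = 0;
    node *n = l->head;
    while (n != NULL) {
        node *next = n->next;
        if (l->deleter != NULL)
            l->deleter(n->data);
        free(n);
        n = next;
        freed++;
    }
    l->head = l->tail = NULL;
    return freed;
}
```

For elements allocated with a single malloc, the standard free function already has the right signature and can be stored as the deleter directly; a struct with its own heap members would need a dedicated function.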
Create an enum type that records the data type, and allocate memory according to this enum. This can be done with a switch/case construct.
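A minimal sketch of that idea, assuming only two element types: the tag is stored next to a void pointer, and a switch picks the allocation size. All names here (tagged_node, element_size, tagged_node_new, tagged_node_free) are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum ElementType { ET_INT, ET_DOUBLE };

typedef struct tagged_node {
    enum ElementType type;
    void *data;            /* heap copy sized according to 'type' */
} tagged_node;

/* Payload size in bytes for a given tag; 0 for an unknown tag. */
size_t element_size(enum ElementType type)
{
    switch (type) {
    case ET_INT:    return sizeof(int);
    case ET_DOUBLE: return sizeof(double);
    }
    return 0;
}

/* Allocate a node holding a copy of *src; returns NULL on failure. */
tagged_node *tagged_node_new(enum ElementType type, const void *src)
{
    size_t size = element_size(type);
    tagged_node *n;

    if (size == 0 || (n = malloc(sizeof *n)) == NULL)
        return NULL;
    n->data = malloc(size);
    if (n->data == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->data, src, size);
    n->type = type;
    return n;
}

/* Release a node and its payload; accepts NULL. */
void tagged_node_free(tagged_node *n)
{
    if (n != NULL) {
        free(n->data);
        free(n);
    }
}
```

Code that reads the node back checks n->type before casting n->data, so a mismatch can be caught at run time rather than silently misread.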
Unlike Java or C++, C offers little type safety for this kind of generic code. To answer your question succinctly, you can rearrange your node type this way:
struct node {
    struct node *prev; /* put these at front */
    struct node *next;
    /* no data here */
};
You could then separately declare nodes carrying any data
struct data_node {
    struct data_node *prev; // keep these two data members at the front
    struct data_node *next; // and in the same order as in struct node.
    // you can add more data members here.
};
/* OR... */
struct data_node2 {
    struct node node_data; /* WARNING: this may look a bit safer, but it only is if placed at the front. */
    /* more data ... */
};
You can then create a library that operates on data-less lists of nodes.
void list_add(struct list *l, struct node *n);
void list_remove(struct list *l, struct node *n);
/* etc... */
Then, by casting, you can use this generic list API to operate on your lists.
You can have some sort of type information in your list declaration, for what it's worth, since C does not provide meaningful type protection.
struct data_list
{
    struct data_node *head; /* this makes intent clear. */
    struct data_node *tail;
};
struct data2_list
{
    struct data_node2 *head;
    struct data_node2 *tail;
};
/* ... */
struct data_node *my_data_node = malloc(sizeof *my_data_node);
struct data_node2 *my_data_node2 = malloc(sizeof *my_data_node2);
/* ... */
list_add((struct list *)&my_list, (struct node *)my_data_node);
list_add((struct list *)&my_list2, &my_data_node2->node_data);
/* the warning above is because one could write this */
list_add((struct list *)&my_list2, (struct node *)my_data_node2);
/* etc... */
/* etc... */
These two techniques generate the same object code, so which one you choose is up to you, really.
As an aside, avoid the typedef struct notation if your compiler allows it; most compilers do these days. In my opinion it improves readability in the long run, though you can be certain that some will agree with me on this and some won't.
I was looking at the glibc source code, and part of its queue implementation caught my attention. I couldn't make sense of this struct definition. The struct doesn't have a name. Why? How does it work?
#define LIST_ENTRY(type) \
struct { \
struct type *le_next; /* next element */ \
struct type **le_prev; /* address of previous next element */ \
}
Source
That is actually a preprocessor macro, which is expanded somewhere else (most probably followed by a member name).
In the comments at the start of that header file there is a reference to queue(3) man page that contains more details on that and other macros:
The macro LIST_ENTRY declares a structure that connects the elements
in the list.
And an example of use:
LIST_HEAD(listhead, entry) head = LIST_HEAD_INITIALIZER(head);
struct listhead *headp; /* List head. */
struct entry {
...
LIST_ENTRY(entry) entries; /* List. */
...
}
*n1, *n2, *n3, *np, *np_temp;
LIST_INIT(&head); /* Initialize the list. */
n1 = malloc(sizeof(struct entry)); /* Insert at the head. */
LIST_INSERT_HEAD(&head, n1, entries);
Since this is C code (not C++) and C lacks templates, this preprocessor macro can be used to "simulate" templates (note the type parameter).
It's a macro that is used to declare a struct type, with next and prev pointers to instances of a second struct type. That second type can be a parent type, so you can make a "linkable struct" like this:
struct foo {
LIST_ENTRY(foo) list;
int value;
};
This creates a struct foo containing a member called list which in turn is the structure in the question, with the pointers pointing at struct foo.
We can now create a little linked list of struct foos like so:
struct foo fa, fb;
fa.value = 47;
fa.list.le_next = &fb;
fa.list.le_prev = NULL;
fb.value = 11;
fb.list.le_next = NULL;
fb.list.le_prev = &fa.list.le_next;
I'm not 100% sure about the last line, but I think it kind of makes sense.
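To check the le_prev semantics without wiring the pointers by hand, here is a small sketch that lets the queue(3) macros build the list. The head type name foohead and the helper foo_sum are assumptions for this example. Note that in a list built by the macros, the first element's le_prev points at the head's lh_first rather than being NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct foo {
    LIST_ENTRY(foo) list;
    int value;
};

/* Define the head type once so that functions can refer to it by name. */
LIST_HEAD(foohead, foo);

/* Sum the values of every element reachable from the head. */
int foo_sum(struct foohead *h)
{
    struct foo *p;
    int total = 0;

    LIST_FOREACH(p, h, list)
        total += p->value;
    return total;
}
```

After LIST_INSERT_HEAD(&head, &fb, list) followed by LIST_INSERT_HEAD(&head, &fa, list), the list order is fa then fb, and fb.list.le_prev equals &fa.list.le_next, which matches the last line of the hand-built version. That back pointer is what lets LIST_REMOVE unlink an element without walking the list.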
I am trying to port a library written in Java to C. To replace a Java interface, I intend to use a struct of function pointers, for instance:
// Java code
public interface ActionsFunction {
Set<Action> actions(Object s);
}
/* C code */
typedef struct ActionsFunction {
List* (*actions)(void* s);
void (*clear_actions)(struct List **list); /* Since C doesn't have garbage collector */
} ActionsFunction;
My question is whether this is a suitable solution, and how I can simulate a generic interface such as:
public interface List <E> {
void add(E x);
Iterator<E> iterator();
}
UPDATE:
I also face another problem: implementing generic abstract data structures such as List, Queue, and Stack, since the C standard library lacks them. My approach is that client code passes a pointer to its data along with the data's size, allowing the library to hold the data without knowing its type. Again, this is just my idea; I would appreciate advice on the design as well as on implementation techniques.
My initial port can be found at:
https://github.com/PhamPhiLong/AIMA
The generic abstract data structures can be found in the utility subfolder.
Here's a very brief example using macros to accomplish something like this. This can get hairy pretty quick, but if done correctly, you can maintain complete static type safety.
#include <stdlib.h>
#include <stdio.h>
#define list_type(type) struct __list_##type
/* A generic list node that keeps 'type' by value. */
#define define_list_val(type) \
list_type(type) { \
list_type(type) *next; \
type value; \
}
#define list_add(plist, node) \
do \
{ \
typeof(plist) p; \
for (p = plist; *p != NULL; p = &(*p)->next) ; \
*p = node; \
node->next = NULL; \
} while(0)
#define list_foreach(plist, p) \
for (p = *plist; p != NULL; p = p->next)
define_list_val(int) *g_list_ints;
define_list_val(float) *g_list_floats;
int main(void)
{
list_type(int) *node;
node = malloc(sizeof(*node));
node->value = 42;
list_add(&g_list_ints, node);
node = malloc(sizeof(*node));
node->value = 66;
list_add(&g_list_ints, node);
list_foreach(&g_list_ints, node) {
printf("Node: %d\n", node->value);
}
return 0;
}
There are a few common ways to do generic-ish programming in C. I would expect to use one or more of the following methods in trying to accomplish the task you've described.
MACROS: One is to use macros. In this example, MAX looks like a function, but operates on anything that can be compared with the ">" operator:
#define MAX(a,b) ((a) > (b) ? (a) : (b))
int i;
float f;
unsigned char b;
f = MAX(7.4, 2.5);
i = MAX(3, 4);
b = MAX(10, 20);
VOID *: Another method is to use void * pointers for representing generic data, and then pass function pointers into your algorithms to operate on the data. Look up the <stdlib.h> function qsort for a classic example of this technique.
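As a quick illustration of the void * style, here are two comparator functions that can be handed to qsort. The comparator names are my own; qsort itself is the standard <stdlib.h> function and only ever sees untyped pointers and an element size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for ints: qsort passes pointers to two array elements. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow that x - y can cause */
}

/* Comparator for C strings stored in an array of char pointers:
 * each argument points at an element, i.e. at a char pointer. */
int compare_strings(const void *a, const void *b)
{
    const char *const *x = a;
    const char *const *y = b;
    return strcmp(*x, *y);
}
```

A call such as qsort(nums, count, sizeof nums[0], compare_ints) sorts an int array, and the same qsort sorts an array of strings when given compare_strings; the algorithm is written once and the type knowledge lives entirely in the callback.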
UNIONS: Yet another, though probably seen less often, technique is to use unions to hold data of multiple different types. This makes your algorithms that operate on the data kinda ugly though and might not save much coding:
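#include <stdio.h>
enum { VAR_DOUBLE, VAR_INT, VAR_STRING };
/* Declare a generic container struct for any type of data you want to operate on */
struct VarType
{
    int type;   /* one of the VAR_* constants above */
    union
    {
        double d;
        int i;
        char * sptr;
    } data;     /* named member, so the fields are reached as x.data.d etc. */
};
/* An illustrative function that sorts out what to do based on the value of v.type */
void my_function( struct VarType v )
{
    switch (v.type) {
    case VAR_DOUBLE: printf("%f\n", v.data.d);    break;
    case VAR_INT:    printf("%d\n", v.data.i);    break;
    case VAR_STRING: printf("%s\n", v.data.sptr); break;
    }
}
int main(void){
    struct VarType x;
    x.data.d = 1.75;
    x.type = VAR_DOUBLE;
    my_function( x );
    return 0;
}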
CLEVER CASTING & POINTER MATH: It's a pretty common idiom to see data structures with functions that operate on a specific kind of struct, and which require that struct to be included in your own struct to do anything useful.
The easy way to do this is to force the struct that allows insertion into the data structure to be the first member of your derived type. Then you can seamlessly cast back and forth between the two. The more versatile way is to use 'offsetof'. Here's a simple example:
#include <stddef.h> /* offsetof */
/* Simple types */
struct listNode { struct listNode * next; struct listNode * prev; };
struct list { struct listNode dummy; };
/* Functions that operate on those types (implementations not shown) */
int append( struct list * theList, struct listNode * theNode );
struct listNode * first( struct list *theList );
/* To use, you must do something like this: */
/* Define your own type that includes a list node */
typedef struct {
    int x;
    double y;
    char name[16];
    struct listNode node;
} MyCoolType;
int main(void) {
    struct list myList;
    MyCoolType coolObject;
    MyCoolType * ptr;
    /* Add the 'coolObject's 'listNode' member to the list */
    append( &myList, &coolObject.node );
    /* Use ugly casting & pointer math to get back to your original type.
       You may want to google 'offsetof' here. */
    ptr = (MyCoolType *) ( (char*) first( &myList )
                           - offsetof(MyCoolType, node) );
}
The libev documentation has some more good examples of this last technique:
http://search.cpan.org/dist/EV/libev/ev.pod#COMMON_OR_USEFUL_IDIOMS_(OR_BOTH)
I am trying to understand the inner workings of the queue(3) macros in FreeBSD. I asked a previous question about the same topic, and this is a follow-up to it.
I am trying to define a function that inserts an element into the queue. queue(3) provides the macro STAILQ_INSERT_HEAD, which needs a pointer to the head of the queue, the item to be inserted, and the name of the link field in the item. My problem is that I am getting
stailq.c:31: warning: passing argument 1 of 'addelement' from incompatible pointer type
when I try to pass the address of head to the function. The full source code is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <sys/queue.h>
struct stailq_entry {
int value;
STAILQ_ENTRY(stailq_entry) entries;
};
STAILQ_HEAD(stailhead, stailq_entry);
int addelement(struct stailhead *h1, int e){
struct stailq_entry *n1;
n1 = malloc(sizeof(struct stailq_entry));
n1->value = e;
STAILQ_INSERT_HEAD(h1, n1, entries);
return (0);
}
int main(void)
{
STAILQ_HEAD(stailhead, stailq_entry) head = STAILQ_HEAD_INITIALIZER(head);
struct stailq_entry *n1;
unsigned i;
STAILQ_INIT(&head); /* Initialize the queue. */
for (i=0;i<10;i++){
addelement(&head, i);
}
n1 = NULL;
while (!STAILQ_EMPTY(&head)) {
n1 = STAILQ_LAST(&head, stailq_entry, entries);
STAILQ_REMOVE(&head, n1, stailq_entry, entries);
printf ("n2: %d\n", n1->value);
free(n1);
}
return (0);
}
As far as I can tell, head is of type struct stailhead and the addelement function also expects a pointer to struct stailhead.
STAILQ_HEAD(stailhead, stailq_entry); expands to:
struct stailhead {
struct stailq_entry *stqh_first;
struct stailq_entry **stqh_last;
};
What am I missing here?
Thanks.
You just need to convert the first line in your main function from
STAILQ_HEAD(stailhead, stailq_entry) head = STAILQ_HEAD_INITIALIZER(head);
to
struct stailhead head = STAILQ_HEAD_INITIALIZER(head);
What's happening is that STAILQ_HEAD is a macro that defines a new type: a struct whose tag is the first parameter and whose element type is the second parameter.
You're only supposed to invoke STAILQ_HEAD once to define the struct type; from then on you use that type name to create new data structures of this type.
What you did in your code sample is simple: you defined a struct named stailhead twice, once at file scope and once in the scope of your main function. You were then passing a pointer to the local stailhead to a function that accepted the file-scope type with the same name.
Even though both structs are identical, they're declared in two different scopes and the compiler treats them as distinct types. It's warning you that you're converting from type main::stailhead to type global::stailhead (note that I have just made up this notation, I do not believe it is canon).
You only need to define stailhead by calling the STAILQ_HEAD macro just once at the top of the file where you already did, and from thereon use struct stailhead to define an object of this type.
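Put together, a sketch of the corrected layout could look like the following: the head type is defined exactly once at file scope, and every function refers to it as struct stailhead. The popelement helper is my own addition for illustration and is not part of your original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/queue.h>

struct stailq_entry {
    int value;
    STAILQ_ENTRY(stailq_entry) entries;
};

/* The one and only definition of struct stailhead, at file scope. */
STAILQ_HEAD(stailhead, stailq_entry);

/* Insert a value at the head; returns 0 on success, -1 if malloc fails. */
int addelement(struct stailhead *h1, int e)
{
    struct stailq_entry *n1 = malloc(sizeof *n1);
    if (n1 == NULL)
        return -1;
    n1->value = e;
    STAILQ_INSERT_HEAD(h1, n1, entries);
    return 0;
}

/* Hypothetical helper: remove and free the first element, storing its
 * value in *out. Returns 0 on success, -1 if the queue is empty. */
int popelement(struct stailhead *h1, int *out)
{
    struct stailq_entry *n1 = STAILQ_FIRST(h1);
    if (n1 == NULL)
        return -1;
    *out = n1->value;
    STAILQ_REMOVE_HEAD(h1, entries);
    free(n1);
    return 0;
}
```

In main you would then write struct stailhead head = STAILQ_HEAD_INITIALIZER(head); and call addelement(&head, i), which compiles without the incompatible pointer warning because both sides now name the same type.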