Type-safe generic data structures in plain-old C? - c

I have done far more C++ programming than "plain old C" programming. One thing I sorely miss when programming in plain C is type-safe generic data structures, which are provided in C++ via templates.
For sake of concreteness, consider a generic singly linked list. In C++, it is a simple matter to define your own template class, and then instantiate it for the types you need.
In C, I can think of a few ways of implementing a generic singly linked list:
Write the linked list type(s) and supporting procedures once, using void pointers to go around the type system.
Write preprocessor macros taking the necessary type names, etc, to generate a type-specific version of the data structure and supporting procedures.
Use a more sophisticated, stand-alone tool to generate the code for the types you need.
I don't like option 1, as it is subverts the type system, and would likely have worse performance than a specialized type-specific implementation. Using a uniform representation of the data structure for all types, and casting to/from void pointers, so far as I can see, necessitates an indirection that would be avoided by an implementation specialized for the element type.
Option 2 doesn't require any extra tools, but it feels somewhat clunky, and could give bad compiler errors when used improperly.
Option 3 could give better compiler error messages than option 2, as the specialized data structure code would reside in expanded form that could be opened in an editor and inspected by the programmer (as opposed to code generated by preprocessor macros). However, this option is the most heavyweight, a sort of "poor-man's templates". I have used this approach before, using a simple sed script to specialize a "templated" version of some C code.
I would like to program my future "low-level" projects in C rather than C++, but have been frightened by the thought of rewriting common data structures for each specific type.
What experience do people have with this issue? Are there good libraries of generic data structures and algorithms in C that do not go with Option 1 (i.e. casting to and from void pointers, which sacrifices type safety and adds a level of indirection)?

Option 1 is the approach taken by most C implementations of generic containers that I see. The Windows driver kit and the Linux kernel use a macro to allow links for the containers to be embedded anywhere in a structure, with the macro used to obtain the structure pointer from a pointer to the link field:
list_entry() macro in Linux
CONTAINING_RECORD() macro in Windows
Option 2 is the tack taken by BSD's tree.h and queue.h container implementation:
http://openbsd.su/src/sys/sys/queue.h
http://openbsd.su/src/sys/sys/tree.h
I don't think I'd consider either of these approaches type safe. Useful, but not type safe.

C has a different kind of beauty to it than C++, and type safety and being able to always see what everything is when tracing through code without involving casts in your debugger is typically not one of them.
C's beauty comes a lot from its lack of type safety, of working around the type system and at the raw level of bits and bytes. Because of that, there's certain things it can do more easily without fighting against the language like, say, variable-length structs, using the stack even for arrays whose sizes are determined at runtime, etc. It also tends to be a lot simpler to preserve ABI when you're working at this lower level.
So there's a different kind of aesthetic involved here as well as different challenges, and I'd recommend a shift in mindset when you work in C. To really appreciate it, I'd suggest doing things many people take for granted these days, like implementing your own memory allocator or device driver. When you're working at such a low level, you can't help but look at everything as memory layouts of bits and bytes as opposed to 'objects' with behaviors attached. Furthermore, there can come a point in such low-level bit/byte manipulation code where C becomes easier to comprehend than C++ code littered with reinterpret_casts, e.g.
As for your linked list example, I would suggest a non-intrusive version of a linked node (one that does not require storing list pointers into the element type, T, itself, allowing the linked list logic and representation to be decoupled from T itself), like so:
struct ListNode
{
struct ListNode* prev;
struct ListNode* next;
MAX_ALIGN char element[1]; // Watch out for alignment here.
// see your compiler's specific info on
// aligning data members.
};
Now we can create a list node like so:
struct ListNode* list_new_node(int element_size)
{
// Watch out for alignment here.
return malloc_max_aligned(sizeof(struct ListNode) + element_size - 1);
}
// create a list node for 'struct Foo'
void foo_init(struct Foo*);
struct ListNode* foo_node = list_new_node(sizeof(struct Foo));
foo_init(foo_node->element);
To retrieve the element from the list as T*:
T* element = list_node->element;
Since it's C, there's no type checking whatsoever when casting pointers in this way, and that will probably also give you an uneasy feeling if you're coming from a C++ background.
The tricky part here is to make sure that this member, element, is properly aligned for whatever type you want to store. When you can solve that problem as portably as you need it to be, you'll have a powerful solution for creating efficient memory layouts and allocators. Often this will have you just using max alignment for everything which might seem wasteful, but typically isn't if you are using appropriate data structures and allocators which aren't paying this overhead for numerous small elements on an individual basis.
Now this solution still involves the type casting. There's little you can do about that short of having a separate version of code of this list node and the corresponding logic to work with it for every type, T, that you want to support (short of dynamic polymorphism). However, it does not involve an additional level of indirection as you might have thought was needed, and still allocates the entire list node and element in a single allocation.
And I would recommend this simple way to achieve genericity in C in many cases. Simply replace T with a buffer that has a length matching sizeof(T) and aligned properly. If you have a reasonably portable and safe way you can generalize to ensure proper alignment, you'll have a very powerful way of working with memory in a way that often improves cache hits, reduces the frequency of heap allocations/deallocations, the amount of indirection required, build times, etc.
If you need more automation like having list_new_node automatically initialize struct Foo, I would recommend creating a general type table struct that you can pass around which contains information like how big T is, a function pointer pointing to a function to create a default instance of T, another to copy T, clone T, destroy T, a comparator, etc. In C++, you can generate this table automatically using templates and built-in language concepts like copy constructors and destructors. C requires a bit more manual effort, but you can still reduce it the boilerplate a bit with macros.
Another trick that can be useful if you go with a more macro-oriented code generation route is to cash in a prefix or suffix-based naming convention of identifiers. For example, CLONE(Type, ptr) could be defined to return Type##Clone(ptr), so CLONE(Foo, foo) could invoke FooClone(foo). This is kind of a cheat to get something akin to function overloading in C, and is useful when generating code in bulk (when CLONE is used to implement another macro) or even a bit of copying and pasting of boilerplate-type code to at least improve the uniformity of the boilerplate.

Option 1, either using void * or some union based variant is what most C programs use, and it may give you BETTER performance than the C++/macro style of having multiple implementations for different types, as it has less code duplication, and thus less icache pressure and fewer icache misses.

GLib is has a bunch of generic data structures in it, http://www.gtk.org/
CCAN has a bunch of useful snippets and such http://ccan.ozlabs.org/

Your option 1 is what most old time c programmers would go for, possibly salted with a little of 2 to cut down on the repetitive typing, and just maybe employing a few function pointers for a flavor of polymorphism.

There's a common variation to option 1 which is more efficient as it uses unions to store the values in the list nodes, ie there's no additional indirection. This has the downside that the list only accepts values of certain types and potentially wastes some memory if the types are of different sizes.
However, it's possible to get rid of the union by using flexible array member instead if you're willing to break strict aliasing. C99 example code:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct ll_node
{
struct ll_node *next;
long long data[]; // use `long long` for alignment
};
extern struct ll_node *ll_unshift(
struct ll_node *head, size_t size, void *value);
extern void *ll_get(struct ll_node *head, size_t index);
#define ll_unshift_value(LIST, TYPE, ...) \
ll_unshift((LIST), sizeof (TYPE), &(TYPE){ __VA_ARGS__ })
#define ll_get_value(LIST, INDEX, TYPE) \
(*(TYPE *)ll_get((LIST), (INDEX)))
struct ll_node *ll_unshift(struct ll_node *head, size_t size, void *value)
{
struct ll_node *node = malloc(sizeof *node + size);
if(!node) assert(!"PANIC");
memcpy(node->data, value, size);
node->next = head;
return node;
}
void *ll_get(struct ll_node *head, size_t index)
{
struct ll_node *current = head;
while(current && index--)
current = current->next;
return current ? current->data : NULL;
}
int main(void)
{
struct ll_node *head = NULL;
head = ll_unshift_value(head, int, 1);
head = ll_unshift_value(head, int, 2);
head = ll_unshift_value(head, int, 3);
printf("%i\n", ll_get_value(head, 0, int));
printf("%i\n", ll_get_value(head, 1, int));
printf("%i\n", ll_get_value(head, 2, int));
return 0;
}

An old question, I know, but in case it is still of interest: I was experimenting with option 2) (pre-processor macros) today, and came up with the example I will paste below. Slightly clunky indeed, but not terrible. The code is not fully type safe, but contains sanity checks to provide a reasonable level of safety. And dealing with the compiler error messages while writing it was mild compared to what I have seen when C++ templates came into play. You are probably best starting reading this at the example use code in the "main" function.
#include <stdio.h>
#define LIST_ELEMENT(type) \
struct \
{ \
void *pvNext; \
type value; \
}
#define ASSERT_POINTER_TO_LIST_ELEMENT(type, pElement) \
do { \
(void)(&(pElement)->value == (type *)&(pElement)->value); \
(void)(sizeof(*(pElement)) == sizeof(LIST_ELEMENT(type))); \
} while(0)
#define SET_POINTER_TO_LIST_ELEMENT(type, pDest, pSource) \
do { \
ASSERT_POINTER_TO_LIST_ELEMENT(type, pSource); \
ASSERT_POINTER_TO_LIST_ELEMENT(type, pDest); \
void **pvDest = (void **)&(pDest); \
*pvDest = ((void *)(pSource)); \
} while(0)
#define LINK_LIST_ELEMENT(type, pDest, pSource) \
do { \
ASSERT_POINTER_TO_LIST_ELEMENT(type, pSource); \
ASSERT_POINTER_TO_LIST_ELEMENT(type, pDest); \
(pDest)->pvNext = ((void *)(pSource)); \
} while(0)
#define TERMINATE_LIST_AT_ELEMENT(type, pDest) \
do { \
ASSERT_POINTER_TO_LIST_ELEMENT(type, pDest); \
(pDest)->pvNext = NULL; \
} while(0)
#define ADVANCE_POINTER_TO_LIST_ELEMENT(type, pElement) \
do { \
ASSERT_POINTER_TO_LIST_ELEMENT(type, pElement); \
void **pvElement = (void **)&(pElement); \
*pvElement = (pElement)->pvNext; \
} while(0)
typedef struct { int a; int b; } mytype;
int main(int argc, char **argv)
{
LIST_ELEMENT(mytype) el1;
LIST_ELEMENT(mytype) el2;
LIST_ELEMENT(mytype) *pEl;
el1.value.a = 1;
el1.value.b = 2;
el2.value.a = 3;
el2.value.b = 4;
LINK_LIST_ELEMENT(mytype, &el1, &el2);
TERMINATE_LIST_AT_ELEMENT(mytype, &el2);
printf("Testing.\n");
SET_POINTER_TO_LIST_ELEMENT(mytype, pEl, &el1);
if (pEl->value.a != 1)
printf("pEl->value.a != 1: %d.\n", pEl->value.a);
ADVANCE_POINTER_TO_LIST_ELEMENT(mytype, pEl);
if (pEl->value.a != 3)
printf("pEl->value.a != 3: %d.\n", pEl->value.a);
ADVANCE_POINTER_TO_LIST_ELEMENT(mytype, pEl);
if (pEl != NULL)
printf("pEl != NULL.\n");
printf("Done.\n");
return 0;
}

I use void pointers (void*) to represent generic data structures defined with structs and typedefs. Below I share my implementation of a lib which I'm working on.
With this kind of implementation, you can think of each new type, defined with typedef, like a pseudo-class. Here, this pseudo-class is the set of the source code (some_type_implementation.c) and its header file (some_type_implementation.h).
In the source code, you have to define the struct that will present the new type. Note the struct in the "node.c" source file. There I made a void pointer to the "info" atribute. This pointer may carry any type of pointer (I think), but the price you have to pay is a type identifier inside the struct (int type), and all the switchs to make the propper handle of each type defined. So, in the node.h" header file, I defined the type "Node" (just to avoid have to type struct node every time), and also I had to define the constants "EMPTY_NODE", "COMPLEX_NODE", and "MATRIX_NODE".
You can perform the compilation, by hand, with "gcc *.c -lm".
main.c Source File
#include <stdio.h>
#include <math.h>
#define PI M_PI
#include "complex.h"
#include "matrix.h"
#include "node.h"
int main()
{
//testCpx();
//testMtx();
testNode();
return 0;
}
node.c Source File
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "node.h"
#include "complex.h"
#include "matrix.h"
#define PI M_PI
struct node
{
int type;
void* info;
};
Node* newNode(int type,void* info)
{
Node* newNode = (Node*) malloc(sizeof(Node));
newNode->type = type;
if(info != NULL)
{
switch(type)
{
case COMPLEX_NODE:
newNode->info = (Complex*) info;
break;
case MATRIX_NODE:
newNode->info = (Matrix*) info;
break;
}
}
else
newNode->info = NULL;
return newNode;
}
int emptyInfoNode(Node* node)
{
return (node->info == NULL);
}
void printNode(Node* node)
{
if(emptyInfoNode(node))
{
printf("Type:%d\n",node->type);
printf("Empty info\n");
}
else
{
switch(node->type)
{
case COMPLEX_NODE:
printCpx(node->info);
break;
case MATRIX_NODE:
printMtx(node->info);
break;
}
}
}
void testNode()
{
Node *node1,*node2, *node3;
Complex *Z;
Matrix *M;
Z = mkCpx(POLAR,5,3*PI/4);
M = newMtx(3,4,PI);
node1 = newNode(COMPLEX_NODE,Z);
node2 = newNode(MATRIX_NODE,M);
node3 = newNode(EMPTY_NODE,NULL);
printNode(node1);
printNode(node2);
printNode(node3);
}
node.h Header File
#define EMPTY_NODE 0
#define COMPLEX_NODE 1
#define MATRIX_NODE 2
typedef struct node Node;
Node* newNode(int type,void* info);
int emptyInfoNode(Node* node);
void printNode(Node* node);
void testNode();
matrix.c Source File
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "matrix.h"
struct matrix
{
// Meta-information about the matrix
int rows;
int cols;
// The elements of the matrix, in the form of a vector
double** MTX;
};
Matrix* newMtx(int rows,int cols,double value)
{
register int row , col;
Matrix* M = (Matrix*)malloc(sizeof(Matrix));
M->rows = rows;
M->cols = cols;
M->MTX = (double**) malloc(rows*sizeof(double*));
for(row = 0; row < rows ; row++)
{
M->MTX[row] = (double*) malloc(cols*sizeof(double));
for(col = 0; col < cols ; col++)
M->MTX[row][col] = value;
}
return M;
}
Matrix* mkMtx(int rows,int cols,double** MTX)
{
Matrix* M;
if(MTX == NULL)
{
M = newMtx(rows,cols,0);
}
else
{
M = (Matrix*)malloc(sizeof(Matrix));
M->rows = rows;
M->cols = cols;
M->MTX = MTX;
}
return M;
}
double getElemMtx(Matrix* M , int row , int col)
{
return M->MTX[row][col];
}
void printRowMtx(double* row,int cols)
{
register int j;
for(j = 0 ; j < cols ; j++)
printf("%g ",row[j]);
}
void printMtx(Matrix* M)
{
register int row = 0, col = 0;
printf("\vSize\n");
printf("\tRows:%d\n",M->rows);
printf("\tCols:%d\n",M->cols);
printf("\n");
for(; row < M->rows ; row++)
{
printRowMtx(M->MTX[row],M->cols);
printf("\n");
}
printf("\n");
}
void testMtx()
{
Matrix* M = mkMtx(10,10,NULL);
printMtx(M);
}
matrix.h Header File
typedef struct matrix Matrix;
Matrix* newMtx(int rows,int cols,double value);
Matrix* mkMatrix(int rows,int cols,double** MTX);
void print(Matrix* M);
double getMtx(Matrix* M , int row , int col);
void printRowMtx(double* row,int cols);
void printMtx(Matrix* M);
void testMtx();
complex.c Source File
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "complex.h"
struct complex
{
int type;
double a;
double b;
};
Complex* mkCpx(int type,double a,double b)
{
/** Doc - {{{
* This function makes a new Complex number.
*
* #params:
* |-->type: Is an interger that denotes if the number is in
* | the analitic or in the polar form.
* | ANALITIC:0
* | POLAR :1
* |
* |-->a: Is the real part if type = 0 and is the radius if
* | type = 1
* |
* `-->b: Is the imaginary part if type = 0 and is the argument
* if type = 1
*
* #return:
* Returns the new Complex number initialized with the values
* passed
*}}} */
Complex* number = (Complex*)malloc(sizeof(Complex));
number->type = type;
number->a = a;
number->b = b;
return number;
}
void printCpx(Complex* number)
{
switch(number->type)
{
case ANALITIC:
printf("Re:%g | Im:%g\n",number->a,number->b);
break;
case POLAR:
printf("Radius:%g | Arg:%g\n",number->a,number->b);
break;
}
}
void testCpx()
{
Complex* Z = mkCpx(ANALITIC,3,2);
printCpx(Z);
}
complex.h Header File
#define ANALITIC 0
#define POLAR 1
typedef struct complex Complex;
Complex* mkCpx(int type,double a,double b);
void printCpx(Complex* number);
void testCpx();
I hope I hadn't missed nothing.

I am using option 2 for a couple of high performance collections, and it is extremely time-consuming working through the amount of macro logic needed to do anything truly compile-time generic and worth using. I am doing this purely for raw performance (games). An X-macros approach is used.
A painful issue that constantly comes up with Option 2 is, "Assuming some finite number of options, such as 8/16/32/64 bit keys, do I make said value a constant and define several functions each with a different element of this set of values that constant can take on, or do I just make it a member variable?" The former means a less performant instruction cache since you have a lot of repeated functions with just one or two numbers different, while the latter means you have to reference allocated variables which in the worst case means a data cache miss. Since Option 1 is purely dynamic, you will make such values member variables without even thinking about it. This truly is micro-optimisation, though.
Also bear in mind the trade-off between returning pointers vs. values: the latter is most performant when the size of the data item is less than or equal to pointer size; whereas if the data item is larger, it is most likely better to return pointers than to force a copy of a large object by returning value.
I would strongly suggest going for Option 1 in any scenario where you are not 100% certain that collection performance will be your bottleneck. Even with my use of Option 2, my collections library supplies a "quick setup" which is like Option 1, i.e. use of void * values in my list and map. This is sufficient for 90+% of circumstances.

You could check out https://github.com/clehner/ll.c
It's easy to use:
#include <stdio.h>
#include <string.h>
#include "ll.h"
int main()
{
int *numbers = NULL;
*( numbers = ll_new(numbers) ) = 100;
*( numbers = ll_new(numbers) ) = 200;
printf("num is %d\n", *numbers);
numbers = ll_next(numbers);
printf("num is %d\n", *numbers);
typedef struct _s {
char *word;
} s;
s *string = NULL;
*( string = ll_new(string) ) = (s) {"a string"};
*( string = ll_new(string) ) = (s) {"another string"};
printf("string is %s\n", string->word);
string = ll_next( string );
printf("string is %s\n", string->word);
return 0;
}
Output:
num is 200
num is 100
string is another string
string is a string

Related

Abstract Data Structures - Immutability and Cleanup in C

Suppose we have a tree data structure:
#ifndef TREE_H
#define TREE_H
#include <stdarg.h>
typedef struct tree {
char *tag;
struct tree **children;
int num_children;
} tree;
tree *leaf(char *tag);
tree *node(char *tag, int num_children, ...);
#endif
With this, we can define a tree in the following, syntactically pleasing way:
#include "tree.h"
int main() {
tree *test =
node("Tywin", 2,
node("Cersei", 2,
leaf("Joffrey"),
leaf("Tommen")
),
leaf("Tyrion")
);
return 0;
}
Now, I have thought of two possible ways we could implement this. When we define nodes and leaves, we could just copy all the data, as follows:
#include "tree.h"
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
tree *leaf(char *tag) {
tree *res = malloc(sizeof(*res));
size_t tag_length = strlen(tag) + 1;
res->tag = malloc(tag_length * sizeof((*res->tag)));
strcpy(res->tag, tag);
res->children = NULL;
res->num_children = 0;
return res;
}
tree *node(char *tag, int num_children, ...) {
tree *res = malloc(sizeof(*res));
size_t tag_length = strlen(tag) + 1;
res->tag = malloc(tag_length * sizeof((*res->tag)));
strcpy(res->tag, tag);
res->children = malloc(num_children * sizeof(*res->children));
va_list ap;
va_start(ap, num_children);
tree *current_child;
for (int i = 0; i < num_children; ++i) {
*(res->children + i) = malloc(sizeof(**(res->children + i)));
current_child = va_arg(ap, tree *);
memcpy(*(res->children + i), current_child, sizeof(*current_child));
}
va_end(ap);
res->num_children = num_children;
return res;
}
But with this, the intermediary calls to node (like the ones creating Tyrion and Cersei) create memory leaks. Additionally, even if you could somehow clean up the leaks, you're potentially allocating far more memory than you 'should' be. So the obvious alternative to just to copy the pointers:
#include "tree.h"
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
tree *leaf(char *tag) {
tree *res = malloc(sizeof(*res));
res->tag = tag;
res->children = NULL;
res->num_children = 0;
return res;
}
tree *node(char *tag, int num_children, ...) {
tree *res = malloc(sizeof(*res));
res->tag = tag;
res->children = malloc(num_children * sizeof(*res->children));
va_list ap;
va_start(ap, num_children);
for (int i = 0; i < num_children; ++i) {
*(res->children + i) = va_arg(ap, tree *);
}
va_end(ap);
res->num_children = num_children;
return res;
}
You could then define a recursive cleanup method that you call on your top-level objects once you've finished (i.e. recursively freeing and setting to NULL pointers, so if something is freed twice, you should be fine). But my concern with this is that you're hoping the user won't change the underlying data later on i.e. there are no guarantees this data structure is immutable.
Questions:
Is the second approach the canonical way of defining abstract data structures in C?
Can I adjust the code at all to give some sort of immutability assurance?
What you have defined can be called an "intrusive" data structure, because you store the hierarchy data (children) along with the payload (tag) in each tree instance. There is hardly any "canonical" way of defining such a structure - the definition usually depends on the requirements. Some textbooks and open libraries may define tress exactly in this way, some code may provide a custom memory allocators to keep allocated nodes close to each other in memory.
Considering other options, it is preferred to separate node payload and actual hierarchy data. For example, if your tree is only a (complete) binary tree, you can get away with storing node's left/right sibling indices in an array and avoid allocations/pointers altogether. If you need k-ary trees without too much node insertions/deletions, you can look at the LCRS tree representation to represent hierarchy and store all the data associated to node (tags and whatever else) in a separate location.
There is my answer to a seemingly similar question about a simpler data structure. You may take a look at it.
That said, you representation (given the absence of bugs and a correct recursive cleanup routine) is a valid tree representation in C.
In C you have an option to lock everything down or apply something like a Copy-on-Write strategy while modifying your trees. It all depends on requirements, once again. Do you have to search the tree, do you frequently update the structure or only recalculate/reassign node data ? Do you process this data locally using CPUs only or you need to share the tree with GPU/DSP or transfer it over the network ? These questions may help you to select the representation and synchronization primitives required. Otherwise, there is no "standard" way to "make everything thread-safe/immutable" once and use all the time. One suggestion I may have is when using the LCRS, you can program in a "functional way" and "calculate" the new hierarchy array from an old one each time you need to modify the tree structure.

Simulate a Java generic interface and abstract data type in C

I am trying to port a library written in Java into C programming language. For Java interface, I intend to use a struct of function-pointers to replace, for instance:
// Java code
public interface ActionsFunction {
Set<Action> actions(Object s);
}
/* C code */
typedef struct ActionsFunction {
List* (*actions)(void* s);
void (*clear_actions)(struct List **list); /* Since C doesn't have garbage collector */
} ActionsFunction;
My question is: whether it is a suitable solution or not, and how can I simulate a generic interface such as:
public interface List <E> {
void add(E x);
Iterator<E> iterator();
}
UPDATE:
I also have to face with another problem: implementing generic abstract data structure like List, Queue, Stack, etc since the C standard library lacks of those implementation. My approach is client code should pass the pointer of its data accompanying with its size, thus allowing library to hold that one without specifying its type. One more time, it just my idea. I need your advices for the design as well as implementing technique.
My initial porting code can be found at:
https://github.com/PhamPhiLong/AIMA
generic abstract data structure can be found in utility sub folder.
Here's a very brief example using macros to accomplish something like this. This can get hairy pretty quick, but if done correctly, you can maintain complete static type safety.
#include <stdlib.h>
#include <stdio.h>
#define list_type(type) struct __list_##type
/* A generic list node that keeps 'type' by value. */
#define define_list_val(type) \
list_type(type) { \
list_type(type) *next; \
type value; \
}
#define list_add(plist, node) \
do \
{ \
typeof(plist) p; \
for (p = plist; *p != NULL; p = &(*p)->next) ; \
*p = node; \
node->next = NULL; \
} while(0)
#define list_foreach(plist, p) \
for (p = *plist; p != NULL; p = p->next)
define_list_val(int) *g_list_ints;
define_list_val(float) *g_list_floats;
int main(void)
{
list_type(int) *node;
node = malloc(sizeof(*node));
node->value = 42;
list_add(&g_list_ints, node);
node = malloc(sizeof(*node));
node->value = 66;
list_add(&g_list_ints, node);
list_foreach(&g_list_ints, node) {
printf("Node: %d\n", node->value);
}
return 0;
}
There are a few common ways to do generic-ish programming in C. I would expect to use one or more of the following methods in trying to accomplish the task you've described.
MACROS: One is to use macros. In this example, MAX looks like a function, but operate on anything that can be compared with the ">" operator:
#define MAX(a,b) ((a) > (b) ? (a) : (b))
int i;
float f;
unsigned char b;
f = MAX(7.4, 2.5)
i = MAX(3, 4)
b = MAX(10, 20)
VOID *: Another method is to use void * pointers for representing generic data, and then pass function pointers into your algorithms to operate on the data. Look up the <stdlib.h> function qsort for a classic example of this technique.
UNIONS: Yet another, though probably seen less often, technique is to use unions to hold data of multiple different types. This makes your algorithms that operate on the data kinda ugly though and might not save much coding:
enum { VAR_DOUBLE, VAR_INT, VAR_STRING }
/* Declare a generic container struct for any type of data you want to operate on */
struct VarType
{
int type;
union data
{
double d;
int i;
char * sptr;
};
}
int main(){
VarType x;
x.data.d = 1.75;
x.type = VAR_DOUBLE;
/* call some function that sorts out what to do based on value of x.type */
my_function( x );
}
CLEVER CASTING & POINTER MATH It's a pretty common idiom to see data structures with functions that operate on a specific kind of struct and then require that the struct by included in your struct to do anything useful.
The easy way to do this, is the force the struct that allows insertion into the data structure to be the first member of your derived type. Then you can seamless cast back & forth between the two. The more versatile way is to use 'offsetof'. Here's a simple example.
For example:
/* Simple types */
struct listNode { struct listNode * next; struct listNode * prev };
struct list { struct listNode dummy; }
/* Functions that operate on those types */
int append( struct list * theList, struct listNode * theNode );
listNode * first( struct list *theList );
/* To use, you must do something like this: */
/* Define your own type that includes a list node */
typedef struct {
int x;
double y;
char name[16];
struct listNode node;
} MyCoolType;
int main() {
struct list myList;
MyCoolType coolObject;
MyCoolType * ptr;
/* Add the 'coolObject's 'listNode' member to the list */
appendList( &myList, &coolObject.node );
/* Use ugly casting & pointer math to get back you your original type
You may want to google 'offsetof' here. */
ptr = (MyCoolType *) ( (char*) first( &myList )
- offsetof(MyCoolType,node);
}
The libev documentation has some more good examples of this last technique:
http://search.cpan.org/dist/EV/libev/ev.pod#COMMON_OR_USEFUL_IDIOMS_(OR_BOTH)

Complete encapsulation without malloc

I was experimenting with C11 and VLAs, trying to declare a struct variable on the stack with only an incomplete declaration. The objective is to provide a mechanism to create a variable of some struct type without showing the internals (like the PIMPL idiom) but without the need to create the variable on the heap and return a pointer to it. Also, if the struct layout changes, I don't want to recompile every file that uses the struct.
I have managed to program the following:
private.h:
#ifndef PRIVATE_H_
#define PRIVATE_H_
typedef struct A{
int value;
}A;
#endif /* PRIVATE_H_ */
public.h:
#ifndef PUBLIC_H_
#define PUBLIC_H_
typedef struct A A;
size_t A_getSizeOf(void);
void A_setValue(A * a, int value);
void A_printValue(A * a);
#endif /* PUBLIC_H_ */
implementation.c:
#include "private.h"
#include "stdio.h"
size_t A_getSizeOf(void)
{
return sizeof(A);
}
void A_setValue(A * a, int value)
{
a->value = value;
}
void A_printValue(A * a)
{
printf("%d\n", a->value);
}
main.c:
#include <stdalign.h>
#include <stddef.h>
#include "public.h"
#define createOnStack(type, variable) \
alignas(max_align_t) char variable ## _stack[type ## _getSizeOf()]; \
type * variable = (type *)&variable ## _stack
int main(int argc, char *argv[]) {
createOnStack(A, var);
A_setValue(var, 5335);
A_printValue(var);
}
I have tested this code and it seems to work. However I'm not sure if I'm overlooking something (like aliasing, alignment or something like that) that could be dangerous or unportable, or could hurt performance. Also I want to know if there are better (portable) solutions to this problem in C.
This of course violates the effective typing rules (aka strict aliasing) because the C language does not allow an object of tye char [] to be accessed through a pointer that does not have that type (or a compatible one).
You could disable strict aliasing analysis via compiler flags like -fno-strict-aliasing or attributes like
#ifdef __GNUC__
#define MAY_ALIAS __attribute__((__may_alias__))
#else
#define MAY_ALIAS
#endif
(thanks go to R.. for pointing out the latter), but even if you do not do so, in practice everything should work just fine as long as you only ever use the variable's proper name to initialize the typed pointer.
Personally, I'd simplify your declarations to something along the lines of
#define stackbuffer(NAME, SIZE) \
_Alignas (max_align_t) char NAME[SIZE]
typedef struct Foo Foo;
extern const size_t SIZEOF_FOO;
stackbuffer(buffer, SIZEOF_FOO);
Foo *foo = (void *)buffer;
The alternative would be using the non-standard alloca(), but that 'function' comes with its own set of issues.
I am considering adopting a strategy similar to the following to solve essentially the same problem. Perhaps it will be of interest despite being a year late.
I wish to prevent clients of a struct from accessing the fields directly, in order to make it easier to reason about their state and easier to write reliable design contracts. I'd also prefer to avoid allocating small structures on the heap. But I can't afford a C11 public interface - much of the joy of C is that almost any code knows how to talk to C89.
To that end, consider the adequate application code:
#include "opaque.h"
int main(void)
{
opaque on_the_stack = create_opaque(42,3.14); // constructor
print_opaque(&on_the_stack);
delete_opaque(&on_the_stack); // destructor
return 0;
}
The opaque header is fairly nasty, but not completely absurd. Providing both create and delete functions is mostly for the sake of consistency with structs where calling the destructor actually matters.
/* opaque.h */
#ifndef OPAQUE_H
#define OPAQUE_H
/* max_align_t is not reliably available in stddef, esp. in c89 */
typedef union
{
int foo;
long long _longlong;
unsigned long long _ulonglong;
double _double;
void * _voidptr;
void (*_voidfuncptr)(void);
/* I believe the above types are sufficient */
} alignment_hack;
#define sizeof_opaque 16 /* Tedious to keep up to date */
typedef struct
{
union
{
char state [sizeof_opaque];
alignment_hack hack;
} private;
} opaque;
#undef sizeof_opaque /* minimise the scope of the macro */
void print_opaque(opaque * o);
opaque create_opaque(int foo, double bar);
void delete_opaque(opaque *);
#endif
Finally an implementation, which is welcome to use C11 as it's not the interface. _Static_assert(alignof...) is particularly reassuring. Several layers of static functions are used to indicate the obvious refinement of generating the wrap/unwrap layers. Pretty much the entire mess is amenable to code gen.
#include "opaque.h"
#include <stdalign.h>
#include <stdio.h>
typedef struct
{
int foo;
double bar;
} opaque_impl;
/* Zero tolerance approach to letting the sizes drift */
_Static_assert(sizeof (opaque) == sizeof (opaque_impl), "Opaque size incorrect");
_Static_assert(alignof (opaque) == alignof (opaque_impl), "Opaque alignment incorrect");
static void print_opaque_impl(opaque_impl *o)
{
printf("Foo = %d and Bar = %g\n",o->foo,o->bar);
}
static void create_opaque_impl(opaque_impl * o, int foo, double bar)
{
o->foo = foo;
o->bar = bar;
}
static void create_opaque_hack(opaque * o, int foo, double bar)
{
opaque_impl * ptr = (opaque_impl*)o;
create_opaque_impl(ptr,foo,bar);
}
static void delete_opaque_impl(opaque_impl *o)
{
o->foo = 0;
o->bar = 0;
}
static void delete_opaque_hack(opaque * o)
{
opaque_impl * ptr = (opaque_impl*)o;
delete_opaque_impl(ptr);
}
void print_opaque(opaque * o)
{
return print_opaque_impl((opaque_impl*)o);
}
opaque create_opaque(int foo, double bar)
{
opaque tmp;
unsigned int i;
/* Useful to zero out padding */
for (i=0; i < sizeof (opaque_impl); i++)
{
tmp.private.state[i] = 0;
}
create_opaque_hack(&tmp,foo,bar);
return tmp;
}
void delete_opaque(opaque *o)
{
delete_opaque_hack(o);
}
The drawbacks I can see myself:
Changing the size define manually would be irritating
The casting should hinder optimisation (I haven't checked this yet)
This may violate strict pointer aliasing. Need to re-read the spec.
I am concerned about accidentally invoking undefined behaviour. I would also be interested in general feedback on the above, or whether it looks like a credible alternative to the inventive VLA technique in the question.

Is there something in C like C++ templates? If not, how to re-use structures and functions for different data types?

I want to write a linked list that can have the data field store any build-in or user-define types. In C++ I would just use a template, but how do I accomplish this in C?
Do I have to re-write the linked list struct and a bunch of operations of it for each data type I want it to store? Unions wouldn't work because what type can it store is predefined.
There's a reason people use languages other than C.... :-)
In C, you'd have your data structure operate with void* members, and you'd cast wherever you used them to the correct types. Macros can help with some of that noise.
There are different approaches to this problem:
using datatype void*: these means, you have pointers to memory locations whose type is not further specified. If you retrieve such a pointer, you can explicitly state what is inside it: *(int*)(mystruct->voidptr) tells the compiler: look at the memory location mystruct->voidptr and interpret the contents as int.
another thing can be tricky preprocessor directives. However, this is usually a very non-trivial issue:
I also found http://sglib.sourceforge.net/
Edit: For the preprocessor trick:
#include <stdio.h>
#define mytype(t) struct { t val; }
int main(int argc, char *argv[]) {
mytype(int) myint;
myint.val=6;
printf ("%d\n", myint.val);
return 0;
}
This would be a simple wrapper for types, but I think it can become quite complicated.
It's less comfortable in C (there's a reason C++ is called C incremented), but it can be done with generic pointers (void *) and the applocation handles the type management itself.
A very nice implementation of generic data structures in C can be found in ubiqx modules, the sources are definitely worth reading.
With some care, you can do this using macros that build and manipulate structs. One of the most well-tested examples of this is the BSD "queue" library. It works on every platform I've tried (Unix, Windows, VMS) and consists of a single header file (no C file).
It has the unfortunate downside of being a bit hard to use, but it preserves as much type-safety as it can in C.
The header file is here: http://www.openbsd.org/cgi-bin/cvsweb/src/sys/sys/queue.h?rev=1.34;content-type=text%2Fplain, and the documentation on how to use it is here: http://www.openbsd.org/cgi-bin/man.cgi?query=queue.
Beyond that, no, you're stuck with losing type-safety (using (void *) all over the place) or moving to the STL.
Here's an option that's very flexible but requires a lot of work.
In your list node, store a pointer to the data as a void *:
struct node {
void *data;
struct node *next;
};
Then you'd create a suite of functions for each type that handle tasks like comparison, assignment, duplication, etc.:
// create a new instance of the data item and copy the value
// of the parameter to it.
void *copyInt(void *src)
{
int *p = malloc(sizeof *p);
if (p) *p = *(int *)src;
return p;
}
void assignInt(void *target, void *src)
{
// we create a new instance for the assignment
*(int *)target = copyInt(src);
}
// returns -1 if lhs < rhs, 0 if lhs == rhs, 1 if lhs > rhs
int testInt(void *lhs, void *rhs)
{
if (*(int *)lhs < *(int *)rhs) return -1;
else if (*(int *)lhs == *(int *)rhs) return 0;
else return 1;
}
char *intToString(void *data)
{
size_t digits = however_many_digits_in_an_int();
char *s = malloc(digits + 2); // sign + digits + terminator
sprintf(s, "%d", *(int *)data);
return s;
}
Then you could create a list type that has pointers to these functions, such as
struct list {
struct node *head;
void *(*cpy)(void *); // copy operation
int (*test)(void *, void *); // test operation
void (*asgn)(void *, void *); // assign operation
char *(*toStr)(void *); // get string representation
...
}
struct list myIntList;
struct list myDoubleList;
myIntList.cpy = copyInt;
myIntList.test = testInt;
myIntList.asgn = assignInt;
myIntList.toStr = intToString;
myDoubleList.cpy = copyDouble;
myDoubleList.test = testDouble;
myDoubleList.asgn = assignDouble;
myDoubleList.toStr = doubleToString;
...
Then, when you pass the list to an insert or search operation, you'd call the functions from the list object:
void addToList(struct list *l, void *value)
{
struct node *new, *cur = l->head;
while (cur->next != NULL && l->test(cur->data, value) <= 0)
cur = cur->next;
new = malloc(sizeof *new);
if (!new)
{
// handle error here
}
else
{
new->data = l->cpy(value);
new->next = cur->next;
cur->next = new;
if (logging)
{
char *s = l->toStr(new->data);
fprintf(log, "Added value %s to list\n", s);
free(s);
}
}
}
...
i = 1;
addToList(&myIntList, &i);
f = 3.4;
addToList(&myDoubleList, &f);
By delegating the type-aware operations to separate functions called through function pointers, you now have a list structure that can store values of any type. To add support for new types, you only need to implement new copy, assign, toString, etc., functions for that new type.
There are drawbacks. For one thing, you can't use constants as function parameters (e.g., you can't do something simple like addToList(&myIntList, 1);) -- you have to assign everything to a variable first, and pass the address of the variable (which is why you need to create new instances of the data member when you add it to the list; if you just assigned the address of the variable, every element in the list would wind up pointing to the same object, which may no longer exist depending on the context).
Secondly, you wind up doing a lot of memory management; you don't just create a new instance of the list node, but you also must create a new instance of the data member. You must remember to free the data member before freeing the node. Then you're creating a new string instance every time you want to display the data, and you have to remember to free that string when you're done with it.
Finally, this solution throws type safety right out the window and into oncoming traffic (after lighting it on fire). The delegate functions are counting on you to keep the types straight; there's nothing preventing you from passing the address of a double variable to one of the int handling functions.
Between the memory management and the fact that you must make a function call for just about every operation, performance is going to suffer. This isn't a fast solution.
Of course, this assumes that every element in the list is the same type; if you're wanting to store elements of different types in the same list, then you're going to have to do something different, such as associate the functions with each node, rather than the list overall.
I wrote a generic linked list "template" in C using the preprocessor, but it's pretty horrible to look at, and heavily pre-processed code is not easy to debug.
These days I think you'd be better off using some other code generation tool such as Python / Cog: http://www.python.org/about/success/cog/
I agree with JonathanPatschke's answer that you should look at sys/queue.h, although I've never tried it myself, as it is not on some of the platforms I work with. I also agree with Vicki's answer to use Python.
But I've found that five or six very simple C macros meet most of my garden-variety needs. These macros help clean up ugly, bug-prone code, without littering it with hidden void *'s, which destroy type-safety. Some of these macros are:
#define ADD_LINK_TO_END_OF_LIST(add, head, tail) \
if (!(head)) \
(tail) = (head) = (add); \
else \
(tail) = (tail)->next = (add)
#define ADD_DOUBLE_LINK_TO_END_OF_LIST(add, head, tail) \
if (!(head)) \
(tail) = (head) = (add); \
else \
(tail) = ((add)->prev = (tail), (tail)->next = (add))
#define FREE_LINK_IN_LIST(p, dtor) do { /* singly-linked */ \
void *myLocalTemporaryPtr = (p)->next; \
dtor(p); \
(p) = myLocalTemporaryPtr;} while (0)
#define FREE_LINKED_LIST(p, dtor) do { \
while (p) \
FREE_LINK_IN_LIST(p, dtor);} while (0)
// copy "ctor" (shallow)
#define NEW_COPY(p) memcpy(myMalloc(sizeof *(p)), p, sizeof *(p))
// iterator
#define NEXT_IN_LIST(p, list) ((p) ? (p)->next : (list))
So, for example:
struct MyContact {
char *name;
char *address;
char *telephone;
...
struct MyContact *next;
} *myContactList = 0, *myContactTail; // the tail doesn't need to be init'd
...
struct MyContact newEntry = {};
...
ADD_LINK_TO_END_OF_LIST(NEW_COPY(newEntry), myContactList, myContactTail);
...
struct MyContact *i = 0;
while ((i = NEXT_IN_LIST(i, myContactList))) // iterate through list
// ...
The next and prev members have hard-coded names. They don't need to be void *, which avoids problems with strict anti-aliasing. They do need to be zeroed when the data item is created.
The dtor argument for FREE_LINK_IN_LIST would typically be a function like free, or (void) to do nothing, or another macro such as:
#define MY_CONTACT_ENTRY_DTOR(p) \
do { if (p) { \
free((p)->name); \
free((p)->address); \
free((p)->telephone); \
free(p); \
}} while (0)
So for example, FREE_LINKED_LIST(myContactList, MY_CONTACT_ENTRY_DTOR) would free all the members of the (duck-typed) list headed by myContactList.
There is one void * here, but perhaps it could be removed via gcc's typeof.
If you need a list that can hold elements of different types simultaneously, e.g. an int followed by three char * followed by a struct tm, then using void * for the data is the solution. But if you only need multiple list types with identical methods, the best solution depends on if you want to avoid generating many instances of almost identical machine code, or just avoid typing source code.
A struct declaration doesn't generate any machine code...
struct int_node {
void *next;
int data;
};
struct long_node {
void *next;
long data;
};
...and one single function which uses a void * parameter and/or return value, can handle them all.
struct generic_node {
void *next;
};
void *insert(void *before_this, void *element, size_t element_sizes);

What is your favorite C programming trick? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
For example, I recently came across this in the linux kernel:
/* Force a compilation error if condition is true */
#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
So, in your code, if you have some structure which must be, say a multiple of 8 bytes in size, maybe because of some hardware constraints, you can do:
BUILD_BUG_ON((sizeof(struct mystruct) % 8) != 0);
and it won't compile unless the size of struct mystruct is a multiple of 8, and if it is a multiple of 8, no runtime code is generated at all.
Another trick I know is from the book "Graphics Gems" which allows a single header file to both declare and initialize variables in one module while in other modules using that module, merely declare them as externs.
#ifdef DEFINE_MYHEADER_GLOBALS
#define GLOBAL
#define INIT(x, y) (x) = (y)
#else
#define GLOBAL extern
#define INIT(x, y)
#endif
GLOBAL int INIT(x, 0);
GLOBAL int somefunc(int a, int b);
With that, the code which defines x and somefunc does:
#define DEFINE_MYHEADER_GLOBALS
#include "the_above_header_file.h"
while code that's merely using x and somefunc() does:
#include "the_above_header_file.h"
So you get one header file that declares both instances of globals and function prototypes where they are needed, and the corresponding extern declarations.
So, what are your favorite C programming tricks along those lines?
C99 offers some really cool stuff using anonymous arrays:
Removing pointless variables
{
int yes=1;
setsockopt(yourSocket, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(int));
}
becomes
setsockopt(yourSocket, SOL_SOCKET, SO_REUSEADDR, (int[]){1}, sizeof(int));
Passing a Variable Amount of Arguments
void func(type* values) {
while(*values) {
x = *values++;
/* do whatever with x */
}
}
func((type[]){val1,val2,val3,val4,0});
Static linked lists
int main() {
struct llist { int a; struct llist* next;};
#define cons(x,y) (struct llist[]){{x,y}}
struct llist *list=cons(1, cons(2, cons(3, cons(4, NULL))));
struct llist *p = list;
while(p != 0) {
printf("%d\n", p->a);
p = p->next;
}
}
Any I'm sure many other cool techniques I haven't thought of.
While reading Quake 2 source code I came up with something like this:
double normals[][] = {
#include "normals.txt"
};
(more or less, I don't have the code handy to check it now).
Since then, a new world of creative use of the preprocessor opened in front of my eyes. I no longer include just headers, but entire chunks of code now and then (it improves reusability a lot) :-p
Thanks John Carmack! xD
I'm fond of using = {0}; to initialize structures without needing to call memset.
struct something X = {0};
This will initialize all of the members of the struct (or array) to zero (but not any padding bytes - use memset if you need to zero those as well).
But you should be aware there are some issues with this for large, dynamically allocated structures.
If we are talking about c tricks my favourite has to be Duff's Device for loop unrolling! I'm just waiting for the right opportunity to come along for me to actually use it in anger...
using __FILE__ and __LINE__ for debugging
#define WHERE fprintf(stderr,"[LOG]%s:%d\n",__FILE__,__LINE__);
In C99
typedef struct{
int value;
int otherValue;
} s;
s test = {.value = 15, .otherValue = 16};
/* or */
int a[100] = {1,2,[50]=3,4,5,[23]=6,7};
Once a mate of mine and I redefined return to find a tricky stack corruption bug.
Something like:
#define return DoSomeStackCheckStuff, return
I like the "struct hack" for having a dynamically sized object. This site explains it pretty well too (though they refer to the C99 version where you can write "str[]" as the last member of a struct). you could make a string "object" like this:
struct X {
int len;
char str[1];
};
int n = strlen("hello world");
struct X *string = malloc(sizeof(struct X) + n);
strcpy(string->str, "hello world");
string->len = n;
here, we've allocated a structure of type X on the heap that is the size of an int (for len), plus the length of "hello world", plus 1 (since str1 is included in the sizeof(X).
It is generally useful when you want to have a "header" right before some variable length data in the same block.
Object oriented code with C, by emulating classes.
Simply create a struct and a set of functions that take a pointer to that struct as a first parameter.
Instead of
printf("counter=%d\n",counter);
Use
#define print_dec(var) printf("%s=%d\n",#var,var);
print_dec(counter);
Using a stupid macro trick to make record definitions easier to maintain.
#define COLUMNS(S,E) [(E) - (S) + 1]
typedef struct
{
char studentNumber COLUMNS( 1, 9);
char firstName COLUMNS(10, 30);
char lastName COLUMNS(31, 51);
} StudentRecord;
For creating a variable which is read-only in all modules except the one it's declared in:
// Header1.h:
#ifndef SOURCE1_C
extern const int MyVar;
#endif
// Source1.c:
#define SOURCE1_C
#include Header1.h // MyVar isn't seen in the header
int MyVar; // Declared in this file, and is writeable
// Source2.c
#include Header1.h // MyVar is seen as a constant, declared elsewhere
Bit-shifts are only defined up to a shift-amount of 31 (on a 32 bit integer)..
What do you do if you want to have a computed shift that need to work with higher shift-values as well? Here is how the Theora vide-codec does it:
unsigned int shiftmystuff (unsigned int a, unsigned int v)
{
return (a>>(v>>1))>>((v+1)>>1);
}
Or much more readable:
unsigned int shiftmystuff (unsigned int a, unsigned int v)
{
unsigned int halfshift = v>>1;
unsigned int otherhalf = (v+1)>>1;
return (a >> halfshift) >> otherhalf;
}
Performing the task the way shown above is a good deal faster than using a branch like this:
unsigned int shiftmystuff (unsigned int a, unsigned int v)
{
if (v<=31)
return a>>v;
else
return 0;
}
Declaring array's of pointer to functions for implementing finite state machines.
int (* fsm[])(void) = { ... }
The most pleasing advantage is that it is simple to force each stimulus/state to check all code paths.
In an embedded system, I'll often map an ISR to point to such a table and revector it as needed (outside the ISR).
Another nice pre-processor "trick" is to use the "#" character to print debugging expressions. For example:
#define MY_ASSERT(cond) \
do { \
if( !(cond) ) { \
printf("MY_ASSERT(%s) failed\n", #cond); \
exit(-1); \
} \
} while( 0 )
edit: the code below only works on C++. Thanks to smcameron and Evan Teran.
Yes, the compile time assert is always great. It can also be written as:
#define COMPILE_ASSERT(cond)\
typedef char __compile_time_assert[ (cond) ? 0 : -1]
I wouldn't really call it a favorite trick, since I've never used it, but the mention of Duff's Device reminded me of this article about implementing Coroutines in C. It always gives me a chuckle, but I'm sure it could be useful some time.
#if TESTMODE == 1
debug=1;
while(0); // Get attention
#endif
The while(0); has no effect on the program, but the compiler will issue a warning about "this does nothing", which is enough to get me to go look at the offending line and then see the real reason I wanted to call attention to it.
I'm a fan of xor hacks:
Swap 2 pointers without third temp pointer:
int * a;
int * b;
a ^= b;
b ^= a;
a ^= b;
Or I really like the xor linked list with only one pointer. (http://en.wikipedia.org/wiki/XOR_linked_list)
Each node in the linked list is the Xor of the previous node and the next node. To traverse forward, the address of the nodes are found in the following manner :
LLNode * first = head;
LLNode * second = first.linked_nodes;
LLNode * third = second.linked_nodes ^ first;
LLNode * fourth = third.linked_nodes ^ second;
etc.
or to traverse backwards:
LLNode * last = tail;
LLNode * second_to_last = last.linked_nodes;
LLNode * third_to_last = second_to_last.linked_nodes ^ last;
LLNode * fourth_to_last = third_to_last.linked_nodes ^ second_to_last;
etc.
While not terribly useful (you can't start traversing from an arbitrary node) I find it to be very cool.
This one comes from the book 'Enough rope to shoot yourself in the foot':
In the header declare
#ifndef RELEASE
# define D(x) do { x; } while (0)
#else
# define D(x)
#endif
In your code place testing statements eg:
D(printf("Test statement\n"));
The do/while helps in case the contents of the macro expand to multiple statements.
The statement will only be printed if '-D RELEASE' flag for compiler is not used.
You can then eg. pass the flag to your makefile etc.
Not sure how this works in windows but in *nix it works well
Rusty actually produced a whole set of build conditionals in ccan, check out the build assert module:
#include <stddef.h>
#include <ccan/build_assert/build_assert.h>
struct foo {
char string[5];
int x;
};
char *foo_string(struct foo *foo)
{
// This trick requires that the string be first in the structure
BUILD_ASSERT(offsetof(struct foo, string) == 0);
return (char *)foo;
}
There are lots of other helpful macros in the actual header, which are easy to drop into place.
I try, with all of my might to resist the pull of the dark side (and preprocessor abuse) by sticking mostly to inline functions, but I do enjoy clever, useful macros like the ones you described.
Two good source books for this sort of stuff are The Practice of Programming and Writing Solid Code. One of them (I don't remember which) says: Prefer enum to #define where you can, because enum gets checked by the compiler.
Not specific to C, but I've always liked the XOR operator. One cool thing it can do is "swap without a temp value":
int a = 1;
int b = 2;
printf("a = %d, b = %d\n", a, b);
a ^= b;
b ^= a;
a ^= b;
printf("a = %d, b = %d\n", a, b);
Result:
a = 1, b = 2
a = 2, b = 1
See "Hidden features of C" question.
I like the concept of container_of used for example in lists. Basically, you do not need to specify next and last fields for each structure which will be in the list. Instead, you append the list structure header to actual linked items.
Have a look on include/linux/list.h for real-life examples.
I think the use of userdata pointers is pretty neat. A fashion losing ground nowdays. It's not so much a C feature but is pretty easy to use in C.
I use X-Macros to to let the pre-compiler generate code. They are especially useful for defining error values and associated error strings in one place, but they can go far beyond that.
Our codebase has a trick similar to
#ifdef DEBUG
#define my_malloc(amt) my_malloc_debug(amt, __FILE__, __LINE__)
void * my_malloc_debug(int amt, char* file, int line)
#else
void * my_malloc(int amt)
#endif
{
//remember file and line no. for this malloc in debug mode
}
which allows for the tracking of memory leaks in debug mode. I always thought this was cool.
Fun with macros:
#define SOME_ENUMS(F) \
F(ZERO, zero) \
F(ONE, one) \
F(TWO, two)
/* Now define the constant values. See how succinct this is. */
enum Constants {
#define DEFINE_ENUM(A, B) A,
SOME_ENUMS(DEFINE_ENUMS)
#undef DEFINE_ENUM
};
/* Now a function to return the name of an enum: */
const char *ToString(int c) {
switch (c) {
default: return NULL; /* Or whatever. */
#define CASE_MACRO(A, B) case A: return #b;
SOME_ENUMS(CASE_MACRO)
#undef CASE_MACRO
}
}
Here is an example how to make C code completly unaware about what is actually used of HW for running the app. The main.c does the setup and then the free layer can be implemented on any compiler/arch. I think it is quite neat for abstracting C code a bit, so it does not get to be to spesific.
Adding a complete compilable example here.
/* free.h */
#ifndef _FREE_H_
#define _FREE_H_
#include <stdio.h>
#include <string.h>
typedef unsigned char ubyte;
typedef void (*F_ParameterlessFunction)() ;
typedef void (*F_CommandFunction)(ubyte byte) ;
void F_SetupLowerLayer (
F_ParameterlessFunction initRequest,
F_CommandFunction sending_command,
F_CommandFunction *receiving_command);
#endif
/* free.c */
static F_ParameterlessFunction Init_Lower_Layer = NULL;
static F_CommandFunction Send_Command = NULL;
static ubyte init = 0;
void recieve_value(ubyte my_input)
{
if(init == 0)
{
Init_Lower_Layer();
init = 1;
}
printf("Receiving 0x%02x\n",my_input);
Send_Command(++my_input);
}
void F_SetupLowerLayer (
F_ParameterlessFunction initRequest,
F_CommandFunction sending_command,
F_CommandFunction *receiving_command)
{
Init_Lower_Layer = initRequest;
Send_Command = sending_command;
*receiving_command = &recieve_value;
}
/* main.c */
int my_hw_do_init()
{
printf("Doing HW init\n");
return 0;
}
int my_hw_do_sending(ubyte send_this)
{
printf("doing HW sending 0x%02x\n",send_this);
return 0;
}
F_CommandFunction my_hw_send_to_read = NULL;
int main (void)
{
ubyte rx = 0x40;
F_SetupLowerLayer(my_hw_do_init,my_hw_do_sending,&my_hw_send_to_read);
my_hw_send_to_read(rx);
getchar();
return 0;
}
if(---------)
printf("hello");
else
printf("hi");
Fill in the blanks so that neither hello nor hi would appear in output.
ans: fclose(stdout)

Resources