C, generalize functions argument's type - c

I was writing a c library for linked lists and trees and i was looking for a solution to generalize the data type handled by these libraries without making a list/tree library for each type i need to handle.
For example, my list library has these functions:
/* list.h */
typedef int list_element; // <-- need to generalize that
struct list_node {
list_element value;
struct list_node* next;
};
typedef struct list_node list_node;
typedef struct list_node* list;
extern list list_cons(list_element d, list l)
And then my list.c:
/* list.c */
#include <list.h>
list list_cons(list_element d, list l){
list m = malloc(sizeof(list_node));
m->value = element_copy(d);
m->next = l;
return m;
}
Now suppose that in my main program i've to use a list of int and a list of double, i should create 2 couple of file, something like list_float.c/.h and list_int.c/.h
Also some list_element can be struct and need functions like copy/isLess/isEqual to compare themselves
I want to write something like this in my code:
/* main.c */
list_cons(void *data, list l);
Where data is a pointer to any type i want and inside list_cons, element_copy work for any type of data i pass (obviously i need to copy the data not the pointer to data, void* is the only idea i had to generalize the argument's type of a function)

As a general suggestion, you should not assume that list_cons would be the only way to construct a linked list. Sometimes malloc is just not available or user wants to preallocate everything in a static array or wants to use custom allocator or...
As a concrete sample, you may look at https://github.com/torvalds/linux/blob/master/include/linux/list.h.
If you want other license for your code, search for data structure implementations in xBSD Unix sources.
The general idea is that you only require a linked list structure to contain next/prev and similar fields, not limiting user to your type names. All iterations and basic operations are defined as preprocessor macros, on top of which you implement complex algorithms.

Notice that in C11 (read its standard n1570 and some C reference site) different types can be handled differently (see its §6.5.2).
In particular, the implementation can handle int, double and pointer values differently (their size and alignment is often different, see sizeof & alignof), and some ABI conventions decide that they are passed in different registers (e.g. in function calls).
So you cannot write something which handles (portably) all of int, double etc... the same way (unless you have variadic functions; with <stdarg.h>)
You might decide to implement some "generic" list whose content is some arbitrary pointer. But then you need some conventions about them (who is allocating that pointer, who is freeing it, perhaps what operations are allowed, etc...). Look into Glib doubly-linked lists for inspiration.
You could also use preprocessor techniques to generate an abstract data type (of your list) and functions implementing it, given some type name for the content. Look into SGLIB for inspiration.
You could also use some metaprogramming techniques: you'll describe somehow the type of the element, and you feed that description into your metaprogram which is some C code generator. Look into SWIG for inspiration. The generated C code would implement your list's abstract data type.
Don't forget memory management issues and describe and document clearly your conventions around them. Read about RAII.
Think also of complex cases like some list of list of strings (perhaps dynamically allocated à la strdup or obtained using asprintf). You'll discover that things are not simple, and you'll need to explicit conventions (e.g. could some string be shared between two sublists? When would that string be free-d, ...).

This might be a good place to use a union.
You can define your base datatype as a union of the most common types you want to support. Then you would define a enum of the types in question as use that value to flag what the union contains.
enum element_type {
TYPE_INT,
TYPE_DOUBLE
};
typedef union {
int e_int;
double e_double;
} list_element;
struct list_node {
enum element_type type;
list_element value;
struct list_node* next;
};
Then you add to the list like this:
list list_cons(list_element d, enum element_type type, list l){
list m = malloc(sizeof(list_node));
m->type = type;
m->value = element_copy(d);
m->next = l;
return m;
}

Related

How can I create a C LinkedList that holds dynamically set data types

In C :
struct node {
int data;
int key;
struct node *next;
};
The above C data structure can only hold the set data types, that is, int data and int key.
In Java :
List<?> objectList = new ArrayList<>();
How can I create a C LinkedList that holds dynamically set data types?
The idea is :
struct node {
<?> data;
<?> key;
struct node *next;
};
So that I can do :
List<String> objectList = new ArrayList<>();
OR
List<MyCustomObject> objectList = new ArrayList<>();
in C
struct node {
<String> data;
<String> key;
struct node *next;
};
OR
struct node {
<MyCustomObject> data;
<MyCustomObject> key;
struct node *next;
};
You can't, because dynamic data types do not exist in general in C. Check by reading the C11 standard n1570.
What you might do is implementing some kind of tagged union type, and make a list containing these (or perhaps a list containing some pointers to them, but you need to decide on some memory management strategy and related coding conventions).
How to implement such tagged union types is different question. For inspiration, look into Glib GVariant type (but there are other ways to do that; one example is Python value type, read more about it in extending Python chapter; or some struct containing some union, or some union of struct all starting with a common discriminating field, or some union of pointers to such struct, etc.), and study the source code of that implementation in Glib, which is free software.
You could also consider some metaprogramming approach, e.g. generating some C code from some description of your data types. Look into SWIG or RPCGEN for inspiration.
BTW, the SGLIB header-mostly library uses a lot of preprocessor techniques (with huge C macros) to provide "generic" containers.
Be also aware of the ABI and calling conventions of your C implementation. In some cases, it might help understanding it (e.g. if you decide to use tagged pointers). Notice that many basic C types (int, long, char, data pointers, function pointers) have essentially different representations (different sizes, different alignments, different ways of being passed as arguments and returned as results - e.g. in various processor registers or on the call stack) and perhaps even different address spaces (think of Harvard architecture where function pointers are in a different space than data and could even have different sizes than data pointers).
Data should be a void *. You will then need to implicitly cast it to whatever pointer type it needs to be.

Convention for declaring structs in C [duplicate]

I have seen many programs consisting of structures like the one below
typedef struct
{
int i;
char k;
} elem;
elem user;
Why is it needed so often? Any specific reason or applicable area?
As Greg Hewgill said, the typedef means you no longer have to write struct all over the place. That not only saves keystrokes, it also can make the code cleaner since it provides a smidgen more abstraction.
Stuff like
typedef struct {
int x, y;
} Point;
Point point_new(int x, int y)
{
Point a;
a.x = x;
a.y = y;
return a;
}
becomes cleaner when you don't need to see the "struct" keyword all over the place, it looks more as if there really is a type called "Point" in your language. Which, after the typedef, is the case I guess.
Also note that while your example (and mine) omitted naming the struct itself, actually naming it is also useful for when you want to provide an opaque type. Then you'd have code like this in the header, for instance:
typedef struct Point Point;
Point * point_new(int x, int y);
and then provide the struct definition in the implementation file:
struct Point
{
int x, y;
};
Point * point_new(int x, int y)
{
Point *p;
if((p = malloc(sizeof *p)) != NULL)
{
p->x = x;
p->y = y;
}
return p;
}
In this latter case, you cannot return the Point by value, since its definition is hidden from users of the header file. This is a technique used widely in GTK+, for instance.
UPDATE Note that there are also highly-regarded C projects where this use of typedef to hide struct is considered a bad idea, the Linux kernel is probably the most well-known such project. See Chapter 5 of The Linux Kernel CodingStyle document for Linus' angry words. :) My point is that the "should" in the question is perhaps not set in stone, after all.
It's amazing how many people get this wrong. PLEASE don't typedef structs in C, it needlessly pollutes the global namespace which is typically very polluted already in large C programs.
Also, typedef'd structs without a tag name are a major cause of needless imposition of ordering relationships among header files.
Consider:
#ifndef FOO_H
#define FOO_H 1
#define FOO_DEF (0xDEADBABE)
struct bar; /* forward declaration, defined in bar.h*/
struct foo {
struct bar *bar;
};
#endif
With such a definition, not using typedefs, it is possible for a compiland unit to include foo.h to get at the FOO_DEF definition. If it doesn't attempt to dereference the 'bar' member of the foo struct then there will be no need to include the "bar.h" file.
Also, since the namespaces are different between the tag names and the member names, it is possible to write very readable code such as:
struct foo *foo;
printf("foo->bar = %p", foo->bar);
Since the namespaces are separate, there is no conflict in naming variables coincident with their struct tag name.
If I have to maintain your code, I will remove your typedef'd structs.
From an old article by Dan Saks (http://www.ddj.com/cpp/184403396?pgno=3):
The C language rules for naming
structs are a little eccentric, but
they're pretty harmless. However, when
extended to classes in C++, those same
rules open little cracks for bugs to
crawl through.
In C, the name s appearing in
struct s
{
...
};
is a tag. A tag name is not a type
name. Given the definition above,
declarations such as
s x; /* error in C */
s *p; /* error in C */
are errors in C. You must write them
as
struct s x; /* OK */
struct s *p; /* OK */
The names of unions and enumerations
are also tags rather than types.
In C, tags are distinct from all other
names (for functions, types,
variables, and enumeration constants).
C compilers maintain tags in a symbol
table that's conceptually if not
physically separate from the table
that holds all other names. Thus, it
is possible for a C program to have
both a tag and an another name with
the same spelling in the same scope.
For example,
struct s s;
is a valid declaration which declares
variable s of type struct s. It may
not be good practice, but C compilers
must accept it. I have never seen a
rationale for why C was designed this
way. I have always thought it was a
mistake, but there it is.
Many programmers (including yours
truly) prefer to think of struct names
as type names, so they define an alias
for the tag using a typedef. For
example, defining
struct s
{
...
};
typedef struct s S;
lets you use S in place of struct s,
as in
S x;
S *p;
A program cannot use S as the name of
both a type and a variable (or
function or enumeration constant):
S S; // error
This is good.
The tag name in a struct, union, or
enum definition is optional. Many
programmers fold the struct definition
into the typedef and dispense with the
tag altogether, as in:
typedef struct
{
...
} S;
The linked article also has a discussion about how the C++ behavior of not requireing a typedef can cause subtle name hiding problems. To prevent these problems, it's a good idea to typedef your classes and structs in C++, too, even though at first glance it appears to be unnecessary. In C++, with the typedef the name hiding become an error that the compiler tells you about rather than a hidden source of potential problems.
Using a typedef avoids having to write struct every time you declare a variable of that type:
struct elem
{
int i;
char k;
};
elem user; // compile error!
struct elem user; // this is correct
One other good reason to always typedef enums and structs results from this problem:
enum EnumDef
{
FIRST_ITEM,
SECOND_ITEM
};
struct StructDef
{
enum EnuumDef MyEnum;
unsigned int MyVar;
} MyStruct;
Notice the typo in EnumDef in the struct (EnuumDef)? This compiles without error (or warning) and is (depending on the literal interpretation of the C Standard) correct. The problem is that I just created an new (empty) enumeration definition within my struct. I am not (as intended) using the previous definition EnumDef.
With a typdef similar kind of typos would have resulted in a compiler errors for using an unknown type:
typedef
{
FIRST_ITEM,
SECOND_ITEM
} EnumDef;
typedef struct
{
EnuumDef MyEnum; /* compiler error (unknown type) */
unsigned int MyVar;
} StructDef;
StrructDef MyStruct; /* compiler error (unknown type) */
I would advocate ALWAYS typedef'ing structs and enumerations.
Not only to save some typing (no pun intended ;)), but because it is safer.
Linux kernel coding style Chapter 5 gives great pros and cons (mostly cons) of using typedef.
Please don't use things like "vps_t".
It's a mistake to use typedef for structures and pointers. When you see a
vps_t a;
in the source, what does it mean?
In contrast, if it says
struct virtual_container *a;
you can actually tell what "a" is.
Lots of people think that typedefs "help readability". Not so. They are useful only for:
(a) totally opaque objects (where the typedef is actively used to hide what the object is).
Example: "pte_t" etc. opaque objects that you can only access using the proper accessor functions.
NOTE! Opaqueness and "accessor functions" are not good in themselves. The reason we have them for things like pte_t etc. is that there really is absolutely zero portably accessible information there.
(b) Clear integer types, where the abstraction helps avoid confusion whether it is "int" or "long".
u8/u16/u32 are perfectly fine typedefs, although they fit into category (d) better than here.
NOTE! Again - there needs to be a reason for this. If something is "unsigned long", then there's no reason to do
typedef unsigned long myflags_t;
but if there is a clear reason for why it under certain circumstances might be an "unsigned int" and under other configurations might be "unsigned long", then by all means go ahead and use a typedef.
(c) when you use sparse to literally create a new type for type-checking.
(d) New types which are identical to standard C99 types, in certain exceptional circumstances.
Although it would only take a short amount of time for the eyes and brain to become accustomed to the standard types like 'uint32_t', some people object to their use anyway.
Therefore, the Linux-specific 'u8/u16/u32/u64' types and their signed equivalents which are identical to standard types are permitted -- although they are not mandatory in new code of your own.
When editing existing code which already uses one or the other set of types, you should conform to the existing choices in that code.
(e) Types safe for use in userspace.
In certain structures which are visible to userspace, we cannot require C99 types and cannot use the 'u32' form above. Thus, we use __u32 and similar types in all structures which are shared with userspace.
Maybe there are other cases too, but the rule should basically be to NEVER EVER use a typedef unless you can clearly match one of those rules.
In general, a pointer, or a struct that has elements that can reasonably be directly accessed should never be a typedef.
It turns out that there are pros and cons. A useful source of information is the seminal book "Expert C Programming" (Chapter 3). Briefly, in C you have multiple namespaces: tags, types, member names and identifiers. typedef introduces an alias for a type and locates it in the tag namespace. Namely,
typedef struct Tag{
...members...
}Type;
defines two things. 1) Tag in the tag namespace and 2) Type in the type namespace. So you can do both Type myType and struct Tag myTagType. Declarations like struct Type myType or Tag myTagType are illegal. In addition, in a declaration like this:
typedef Type *Type_ptr;
we define a pointer to our Type. So if we declare:
Type_ptr var1, var2;
struct Tag *myTagType1, myTagType2;
then var1,var2 and myTagType1 are pointers to Type but myTagType2 not.
In the above-mentioned book, it mentions that typedefing structs are not very useful as it only saves the programmer from writing the word struct. However, I have an objection, like many other C programmers. Although it sometimes turns to obfuscate some names (that's why it is not advisable in large code bases like the kernel) when you want to implement polymorphism in C it helps a lot look here for details. Example:
typedef struct MyWriter_t{
MyPipe super;
MyQueue relative;
uint32_t flags;
...
}MyWriter;
you can do:
void my_writer_func(MyPipe *s)
{
MyWriter *self = (MyWriter *) s;
uint32_t myFlags = self->flags;
...
}
So you can access an outer member (flags) by the inner struct (MyPipe) through casting. For me it is less confusing to cast the whole type than doing (struct MyWriter_ *) s; every time you want to perform such functionality. In these cases brief referencing is a big deal especially if you heavily employ the technique in your code.
Finally, the last aspect with typedefed types is the inability to extend them, in contrast to macros. If for example, you have:
#define X char[10] or
typedef char Y[10]
you can then declare
unsigned X x; but not
unsigned Y y;
We do not really care for this for structs because it does not apply to storage specifiers (volatile and const).
I don't think forward declarations are even possible with typedef. Use of struct, enum, and union allow for forwarding declarations when dependencies (knows about) is bidirectional.
Style:
Use of typedef in C++ makes quite a bit of sense. It can almost be necessary when dealing with templates that require multiple and/or variable parameters. The typedef helps keep the naming straight.
Not so in the C programming language. The use of typedef most often serves no purpose but to obfuscate the data structure usage. Since only { struct (6), enum (4), union (5) } number of keystrokes are used to declare a data type there is almost no use for the aliasing of the struct. Is that data type a union or a struct? Using the straightforward non-typdefed declaration lets you know right away what type it is.
Notice how Linux is written with strict avoidance of this aliasing nonsense typedef brings. The result is a minimalist and clean style.
Let's start with the basics and work our way up.
Here is an example of Structure definition:
struct point
{
int x, y;
};
Here the name point is optional.
A Structure can be declared during its definition or after.
Declaring during definition
struct point
{
int x, y;
} first_point, second_point;
Declaring after definition
struct point
{
int x, y;
};
struct point first_point, second_point;
Now, carefully note the last case above; you need to write struct point to declare Structures of that type if you decide to create that type at a later point in your code.
Enter typedef. If you intend to create new Structure ( Structure is a custom data-type) at a later time in your program using the same blueprint, using typedef during its definition might be a good idea since you can save some typing moving forward.
typedef struct point
{
int x, y;
} Points;
Points first_point, second_point;
A word of caution while naming your custom type
Nothing prevents you from using _t suffix at the end of your custom type name but POSIX standard reserves the use of suffix _t to denote standard library type names.
The name you (optionally) give the struct is called the tag name and, as has been noted, is not a type in itself. To get to the type requires the struct prefix.
GTK+ aside, I'm not sure the tagname is used anything like as commonly as a typedef to the struct type, so in C++ that is recognised and you can omit the struct keyword and use the tagname as the type name too:
struct MyStruct
{
int i;
};
// The following is legal in C++:
MyStruct obj;
obj.i = 7;
typedef will not provide a co-dependent set of data structures. This you cannot do with typdef:
struct bar;
struct foo;
struct foo {
struct bar *b;
};
struct bar {
struct foo *f;
};
Of course you can always add:
typedef struct foo foo_t;
typedef struct bar bar_t;
What exactly is the point of that?
A>
a typdef aids in the meaning and documentation of a program by allowing creation of more meaningful synonyms for data types. In addition, they help parameterize a program against portability problems (K&R, pg147, C prog lang).
B>
a structure defines a type. Structs allows convenient grouping of a collection of vars for convenience of handling (K&R, pg127, C prog lang.) as a single unit
C>
typedef'ing a struct is explained in A above.
D> To me, structs are custom types or containers or collections or namespaces or complex types, whereas a typdef is just a means to create more nicknames.
In 'C' programming language the keyword 'typedef' is used to declare a new name for some object(struct, array, function..enum type). For example, I will use a 'struct-s'.
In 'C' we often declare a 'struct' outside of the 'main' function. For example:
struct complex{ int real_part, img_part }COMPLEX;
main(){
struct KOMPLEKS number; // number type is now a struct type
number.real_part = 3;
number.img_part = -1;
printf("Number: %d.%d i \n",number.real_part, number.img_part);
}
Each time I decide to use a struct type I will need this keyword 'struct 'something' 'name'.'typedef' will simply rename that type and I can use that new name in my program every time I want. So our code will be:
typedef struct complex{int real_part, img_part; }COMPLEX;
//now COMPLEX is the new name for this structure and if I want to use it without
// a keyword like in the first example 'struct complex number'.
main(){
COMPLEX number; // number is now the same type as in the first example
number.real_part = 1;
number.img)part = 5;
printf("%d %d \n", number.real_part, number.img_part);
}
If you have some local object(struct, array, valuable) that will be used in your entire program you can simply give it a name using a 'typedef'.
Turns out in C99 typedef is required. It is outdated, but a lot of tools (ala HackRank) use c99 as its pure C implementation. And typedef is required there.
I'm not saying they should change (maybe have two C options) if the requirement changed, those of us studing for interviews on the site would be SOL.
At all, in C language, struct/union/enum are macro instruction processed by the C language preprocessor (do not mistake with the preprocessor that treat "#include" and other)
so :
struct a
{
int i;
};
struct b
{
struct a;
int i;
int j;
};
struct b is expended as something like this :
struct b
{
struct a
{
int i;
};
int i;
int j;
}
and so, at compile time it evolve on stack as something like:
b:
int ai
int i
int j
that also why it's dificult to have selfreferent structs, C preprocessor round in a déclaration loop that can't terminate.
typedef are type specifier, that means only C compiler process it and it can do like he want for optimise assembler code implementation. It also dont expend member of type par stupidly like préprocessor do with structs but use more complex reference construction algorithm, so construction like :
typedef struct a A; //anticipated declaration for member declaration
typedef struct a //Implemented declaration
{
A* b; // member declaration
}A;
is permited and fully functional. This implementation give also access to compilator type conversion and remove some bugging effects when execution thread leave the application field of initialisation functions.
This mean that in C typedefs are more near as C++ class than lonely structs.

Difference between using a structure member and cast a structure pointer when "emulate" polymorphism in C

I'm not sure if my wording is technically correct, so please correct me in both title and the main body of this question.
So basically my question is regarding emulating polymorphism in C. For example, suppose I have a tree, and there is a struct tree_node type. And I have some functions to help me insert nodes, delete nodes etc like this as an example:
void tree_insert(tree_node **root, tree_node *new_node);
Then I start to build other stuff for my app, and and need to use this tree to maintain, say, family members. But for human, I have another struct, let's call it "struct human_node" which is defined like this, for example:
typedef struct human_node_ {
tree_node t_node;
char *name;
} human_node;
Now apparently I want to use those tree utility functions I build for the generic tree. But they take tree_node pointers. Now time for the polymorphism emulation. So here are the two options I have, one is to cast my human_node, one is to use the t_node member in the human_node:
human_node *myfamily_tree_root, *new_family_guy;
//some initialization code and other code later...
tree_insert((tree_node **)&myfamily_tree_root, &(new_family_guy->t_node));
For concise I put both ways in one function call above.
And this is exactly where I have my confusion. So which one should I use and more importantly, why?
Both are standard, but in general if you can avoid type casts then you should pick the solution that avoids the type casts.
A common thing for such inline data structure implementations is to not even require that the tree node (or equivalent) is the first element in the struct since you might want to enter your nodes into multiple trees. Then you definitely want to use the second approach. To convert between the tree_node element and the struct containing you'll have to have some black magic macros, but it's worth it. For example in an implementation of avl trees I have these macros:
#ifndef offsetof
#define offsetof(s, e) ((size_t)&((s *)0)->e)
#endif
/* the bit at the end is to prevent mistakes where n is not an avl_node */
#define avl_data(n, type, field) ((type *)(void*)((char *)n - offsetof(type, field) - (n - (struct avl_node *)n)))
So I can have something like:
struct foo {
int data;
struct avl_node tree_node_1;
struct avl_node tree_node_2;
};
int
tree_node_1_to_data(struct avl_node *x)
{
return avl_data(x, struct foo, tree_node_1)->data;
}
If you choose to make your code this generic you definitely want to take references to your tree_node members and not typecast the pointer to the struct.
This question is probably too broad for any specific answers, but you could for instance see how CPython does it.
Basically, all Python structs have the same header, and to define your own types you must be sure to start your struct with the PyObject_HEAD macro (or PyObject_VAR_HEAD for variably sized objects like strings). This adds stuff like a type tag, reference counts, and so on.
After instantiating objects, you pass them around as PyObject*s, and the functions will infer what type the object actually is (e.g. a string, a list, etc.) and be able to dispatch based on that. Yes, you have to type cast at some point to get to your actual object contents.
For instance, this is how Python's character strings are defined:
typedef struct {
PyObject_VAR_HEAD
Py_hash_t ob_shash;
char ob_sval[1];
/* Invariants:
* ob_sval contains space for 'ob_size+1' elements.
* ob_sval[ob_size] == 0.
* ob_shash is the hash of the string or -1 if not computed yet.
*/
} PyBytesObject;
You can read more about the CPython's type object inheritance model. Extract:
Objects are always accessed through pointers of the type 'PyObject *'.
The type 'PyObject' is a structure that only contains the reference
count and the type pointer. The actual memory allocated for an object
contains other data that can only be accessed after casting the
pointer to a pointer to a longer structure type.
Note that this angle of attack may be more suited for interpreted code. There are probably other open source projects you may look at that are more directly relevant to your needs.

Should custom deallocation functions consider compatibility with generic containers?

In C if you want to have generic containers, one of the popular approaches is to use void*. If the generic containers hold some custom struct that has its own deallocation function, it's likely going to ask for that function:
struct Foo {...};
Foo *Foo_Allocate(...);
void Foo_Deallocate(const Foo*);
int main(void)
{
/* Let's assume that when you create the list you have to
specify the deallocator of the type you want to hold */
List *list = List_Allocate(Foo_Deallocate);
/* Here we allocate a new Foo and push it into the list.
The list now has possession of the pointer. */
List_PushBack(list, Foo_Allocate());
/* When we deallocate the list, it will also deallocate all the
items we inserted, using the deallocator specified at the beginning */
List_Deallocate(list);
}
But most likely the type of the deallocator function will be something that takes a void*
typedef void (*List_FnItemDeallocator)(const void*);
The problem is that Foo_Deallocate takes a const Foo*, not a const void*. Is it still safe to pass the function, even though their signatures do not match? Probably not, since pointer types are not necessarily the same size in C.
If that's not possible, would it be a good idea to have all deallocator functions take a const void* instead of a pointer to the type they are related to, so that they would be compatible with generic containers?
As you said, assigning pointer to a different function type is not valid.
You should take a void* as a parameter, and perform some check inside each function to see if the given pointer matches the expected type (like checking for a magic number at the beginning of the struct).
As mentioned, you can use a magic number or 'header' to specify the destructor function. You can go quite far with this header, and even select a 'well known, registered' deallocator (in which case you don't actually need to store a function pointer, just possibly an integer index into an array), or have a 'flags' section within your header which specifies that this contains an 'extended' deallocator. The possibilities are quite far and quite fun.
So your list 'headers' would look something like this
#define LIST_HEAD struct list *next; struct list *prev; short flags;
struct list { LIST_HEAD };
struct list_with_custom_deallocator { LIST_HEAD void (*dealloc)(void*); };
Now, actually answering your question.. why not just define a common header type, and have your deallocators take a pointer to that type (in my example, a struct list*) and then cast it to whatever specific relevant type -- or even better,maybe heuristically determine the actual structure and deallocator from the flags (as hinted by Binyamin).

Why should we typedef a struct so often in C?

I have seen many programs consisting of structures like the one below
typedef struct
{
int i;
char k;
} elem;
elem user;
Why is it needed so often? Any specific reason or applicable area?
As Greg Hewgill said, the typedef means you no longer have to write struct all over the place. That not only saves keystrokes, it also can make the code cleaner since it provides a smidgen more abstraction.
Stuff like
typedef struct {
int x, y;
} Point;
Point point_new(int x, int y)
{
Point a;
a.x = x;
a.y = y;
return a;
}
becomes cleaner when you don't need to see the "struct" keyword all over the place, it looks more as if there really is a type called "Point" in your language. Which, after the typedef, is the case I guess.
Also note that while your example (and mine) omitted naming the struct itself, actually naming it is also useful for when you want to provide an opaque type. Then you'd have code like this in the header, for instance:
typedef struct Point Point;
Point * point_new(int x, int y);
and then provide the struct definition in the implementation file:
struct Point
{
int x, y;
};
Point * point_new(int x, int y)
{
Point *p;
if((p = malloc(sizeof *p)) != NULL)
{
p->x = x;
p->y = y;
}
return p;
}
In this latter case, you cannot return the Point by value, since its definition is hidden from users of the header file. This is a technique used widely in GTK+, for instance.
UPDATE Note that there are also highly-regarded C projects where this use of typedef to hide struct is considered a bad idea, the Linux kernel is probably the most well-known such project. See Chapter 5 of The Linux Kernel CodingStyle document for Linus' angry words. :) My point is that the "should" in the question is perhaps not set in stone, after all.
It's amazing how many people get this wrong. PLEASE don't typedef structs in C, it needlessly pollutes the global namespace which is typically very polluted already in large C programs.
Also, typedef'd structs without a tag name are a major cause of needless imposition of ordering relationships among header files.
Consider:
#ifndef FOO_H
#define FOO_H 1
#define FOO_DEF (0xDEADBABE)
struct bar; /* forward declaration, defined in bar.h*/
struct foo {
struct bar *bar;
};
#endif
With such a definition, not using typedefs, it is possible for a compiland unit to include foo.h to get at the FOO_DEF definition. If it doesn't attempt to dereference the 'bar' member of the foo struct then there will be no need to include the "bar.h" file.
Also, since the namespaces are different between the tag names and the member names, it is possible to write very readable code such as:
struct foo *foo;
printf("foo->bar = %p", foo->bar);
Since the namespaces are separate, there is no conflict in naming variables coincident with their struct tag name.
If I have to maintain your code, I will remove your typedef'd structs.
From an old article by Dan Saks (http://www.ddj.com/cpp/184403396?pgno=3):
The C language rules for naming
structs are a little eccentric, but
they're pretty harmless. However, when
extended to classes in C++, those same
rules open little cracks for bugs to
crawl through.
In C, the name s appearing in
struct s
{
...
};
is a tag. A tag name is not a type
name. Given the definition above,
declarations such as
s x; /* error in C */
s *p; /* error in C */
are errors in C. You must write them
as
struct s x; /* OK */
struct s *p; /* OK */
The names of unions and enumerations
are also tags rather than types.
In C, tags are distinct from all other
names (for functions, types,
variables, and enumeration constants).
C compilers maintain tags in a symbol
table that's conceptually if not
physically separate from the table
that holds all other names. Thus, it
is possible for a C program to have
both a tag and an another name with
the same spelling in the same scope.
For example,
struct s s;
is a valid declaration which declares
variable s of type struct s. It may
not be good practice, but C compilers
must accept it. I have never seen a
rationale for why C was designed this
way. I have always thought it was a
mistake, but there it is.
Many programmers (including yours
truly) prefer to think of struct names
as type names, so they define an alias
for the tag using a typedef. For
example, defining
struct s
{
...
};
typedef struct s S;
lets you use S in place of struct s,
as in
S x;
S *p;
A program cannot use S as the name of
both a type and a variable (or
function or enumeration constant):
S S; // error
This is good.
The tag name in a struct, union, or
enum definition is optional. Many
programmers fold the struct definition
into the typedef and dispense with the
tag altogether, as in:
typedef struct
{
...
} S;
The linked article also has a discussion about how the C++ behavior of not requireing a typedef can cause subtle name hiding problems. To prevent these problems, it's a good idea to typedef your classes and structs in C++, too, even though at first glance it appears to be unnecessary. In C++, with the typedef the name hiding become an error that the compiler tells you about rather than a hidden source of potential problems.
Using a typedef avoids having to write struct every time you declare a variable of that type:
struct elem
{
int i;
char k;
};
elem user; // compile error!
struct elem user; // this is correct
One other good reason to always typedef enums and structs results from this problem:
enum EnumDef
{
FIRST_ITEM,
SECOND_ITEM
};
struct StructDef
{
enum EnuumDef MyEnum;
unsigned int MyVar;
} MyStruct;
Notice the typo in EnumDef in the struct (EnuumDef)? This compiles without error (or warning) and is (depending on the literal interpretation of the C Standard) correct. The problem is that I just created an new (empty) enumeration definition within my struct. I am not (as intended) using the previous definition EnumDef.
With a typdef similar kind of typos would have resulted in a compiler errors for using an unknown type:
typedef
{
FIRST_ITEM,
SECOND_ITEM
} EnumDef;
typedef struct
{
EnuumDef MyEnum; /* compiler error (unknown type) */
unsigned int MyVar;
} StructDef;
StrructDef MyStruct; /* compiler error (unknown type) */
I would advocate ALWAYS typedef'ing structs and enumerations.
Not only to save some typing (no pun intended ;)), but because it is safer.
Linux kernel coding style Chapter 5 gives great pros and cons (mostly cons) of using typedef.
Please don't use things like "vps_t".
It's a mistake to use typedef for structures and pointers. When you see a
vps_t a;
in the source, what does it mean?
In contrast, if it says
struct virtual_container *a;
you can actually tell what "a" is.
Lots of people think that typedefs "help readability". Not so. They are useful only for:
(a) totally opaque objects (where the typedef is actively used to hide what the object is).
Example: "pte_t" etc. opaque objects that you can only access using the proper accessor functions.
NOTE! Opaqueness and "accessor functions" are not good in themselves. The reason we have them for things like pte_t etc. is that there really is absolutely zero portably accessible information there.
(b) Clear integer types, where the abstraction helps avoid confusion whether it is "int" or "long".
u8/u16/u32 are perfectly fine typedefs, although they fit into category (d) better than here.
NOTE! Again - there needs to be a reason for this. If something is "unsigned long", then there's no reason to do
typedef unsigned long myflags_t;
but if there is a clear reason for why it under certain circumstances might be an "unsigned int" and under other configurations might be "unsigned long", then by all means go ahead and use a typedef.
(c) when you use sparse to literally create a new type for type-checking.
(d) New types which are identical to standard C99 types, in certain exceptional circumstances.
Although it would only take a short amount of time for the eyes and brain to become accustomed to the standard types like 'uint32_t', some people object to their use anyway.
Therefore, the Linux-specific 'u8/u16/u32/u64' types and their signed equivalents which are identical to standard types are permitted -- although they are not mandatory in new code of your own.
When editing existing code which already uses one or the other set of types, you should conform to the existing choices in that code.
(e) Types safe for use in userspace.
In certain structures which are visible to userspace, we cannot require C99 types and cannot use the 'u32' form above. Thus, we use __u32 and similar types in all structures which are shared with userspace.
Maybe there are other cases too, but the rule should basically be to NEVER EVER use a typedef unless you can clearly match one of those rules.
In general, a pointer, or a struct that has elements that can reasonably be directly accessed should never be a typedef.
It turns out that there are pros and cons. A useful source of information is the seminal book "Expert C Programming" (Chapter 3). Briefly, in C you have multiple namespaces: tags, types, member names and identifiers. typedef introduces an alias for a type and locates it in the tag namespace. Namely,
typedef struct Tag{
...members...
}Type;
defines two things. 1) Tag in the tag namespace and 2) Type in the type namespace. So you can do both Type myType and struct Tag myTagType. Declarations like struct Type myType or Tag myTagType are illegal. In addition, in a declaration like this:
typedef Type *Type_ptr;
we define a pointer to our Type. So if we declare:
Type_ptr var1, var2;
struct Tag *myTagType1, myTagType2;
then var1,var2 and myTagType1 are pointers to Type but myTagType2 not.
In the above-mentioned book, it mentions that typedefing structs are not very useful as it only saves the programmer from writing the word struct. However, I have an objection, like many other C programmers. Although it sometimes turns to obfuscate some names (that's why it is not advisable in large code bases like the kernel) when you want to implement polymorphism in C it helps a lot look here for details. Example:
typedef struct MyWriter_t{
MyPipe super;
MyQueue relative;
uint32_t flags;
...
}MyWriter;
you can do:
void my_writer_func(MyPipe *s)
{
MyWriter *self = (MyWriter *) s;
uint32_t myFlags = self->flags;
...
}
So you can access an outer member (flags) by the inner struct (MyPipe) through casting. For me it is less confusing to cast the whole type than doing (struct MyWriter_ *) s; every time you want to perform such functionality. In these cases brief referencing is a big deal especially if you heavily employ the technique in your code.
Finally, the last aspect with typedefed types is the inability to extend them, in contrast to macros. If for example, you have:
#define X char[10] or
typedef char Y[10]
you can then declare
unsigned X x; but not
unsigned Y y;
We do not really care for this for structs because it does not apply to storage specifiers (volatile and const).
I don't think forward declarations are even possible with typedef. Use of struct, enum, and union allow for forwarding declarations when dependencies (knows about) is bidirectional.
Style:
Use of typedef in C++ makes quite a bit of sense. It can almost be necessary when dealing with templates that require multiple and/or variable parameters. The typedef helps keep the naming straight.
Not so in the C programming language. The use of typedef most often serves no purpose but to obfuscate the data structure usage. Since only { struct (6), enum (4), union (5) } number of keystrokes are used to declare a data type there is almost no use for the aliasing of the struct. Is that data type a union or a struct? Using the straightforward non-typdefed declaration lets you know right away what type it is.
Notice how Linux is written with strict avoidance of this aliasing nonsense typedef brings. The result is a minimalist and clean style.
Let's start with the basics and work our way up.
Here is an example of Structure definition:
struct point
{
int x, y;
};
Here the name point is optional.
A Structure can be declared during its definition or after.
Declaring during definition
struct point
{
int x, y;
} first_point, second_point;
Declaring after definition
struct point
{
int x, y;
};
struct point first_point, second_point;
Now, carefully note the last case above; you need to write struct point to declare Structures of that type if you decide to create that type at a later point in your code.
Enter typedef. If you intend to create new Structure ( Structure is a custom data-type) at a later time in your program using the same blueprint, using typedef during its definition might be a good idea since you can save some typing moving forward.
typedef struct point
{
int x, y;
} Points;
Points first_point, second_point;
A word of caution while naming your custom type
Nothing prevents you from using _t suffix at the end of your custom type name but POSIX standard reserves the use of suffix _t to denote standard library type names.
The name you (optionally) give the struct is called the tag name and, as has been noted, is not a type in itself. To get to the type requires the struct prefix.
GTK+ aside, I'm not sure the tagname is used anything like as commonly as a typedef to the struct type, so in C++ that is recognised and you can omit the struct keyword and use the tagname as the type name too:
struct MyStruct
{
int i;
};
// The following is legal in C++:
MyStruct obj;
obj.i = 7;
typedef will not provide a co-dependent set of data structures. This you cannot do with typdef:
struct bar;
struct foo;
struct foo {
struct bar *b;
};
struct bar {
struct foo *f;
};
Of course you can always add:
typedef struct foo foo_t;
typedef struct bar bar_t;
What exactly is the point of that?
A>
a typdef aids in the meaning and documentation of a program by allowing creation of more meaningful synonyms for data types. In addition, they help parameterize a program against portability problems (K&R, pg147, C prog lang).
B>
a structure defines a type. Structs allows convenient grouping of a collection of vars for convenience of handling (K&R, pg127, C prog lang.) as a single unit
C>
typedef'ing a struct is explained in A above.
D> To me, structs are custom types or containers or collections or namespaces or complex types, whereas a typdef is just a means to create more nicknames.
In 'C' programming language the keyword 'typedef' is used to declare a new name for some object(struct, array, function..enum type). For example, I will use a 'struct-s'.
In 'C' we often declare a 'struct' outside of the 'main' function. For example:
struct complex{ int real_part, img_part }COMPLEX;
main(){
struct KOMPLEKS number; // number type is now a struct type
number.real_part = 3;
number.img_part = -1;
printf("Number: %d.%d i \n",number.real_part, number.img_part);
}
Each time I decide to use a struct type I will need this keyword 'struct 'something' 'name'.'typedef' will simply rename that type and I can use that new name in my program every time I want. So our code will be:
typedef struct complex{int real_part, img_part; }COMPLEX;
//now COMPLEX is the new name for this structure and if I want to use it without
// a keyword like in the first example 'struct complex number'.
main(){
COMPLEX number; // number is now the same type as in the first example
number.real_part = 1;
number.img)part = 5;
printf("%d %d \n", number.real_part, number.img_part);
}
If you have some local object(struct, array, valuable) that will be used in your entire program you can simply give it a name using a 'typedef'.
Turns out in C99 typedef is required. It is outdated, but a lot of tools (ala HackRank) use c99 as its pure C implementation. And typedef is required there.
I'm not saying they should change (maybe have two C options) if the requirement changed, those of us studing for interviews on the site would be SOL.
At all, in C language, struct/union/enum are macro instruction processed by the C language preprocessor (do not mistake with the preprocessor that treat "#include" and other)
so :
struct a
{
int i;
};
struct b
{
struct a;
int i;
int j;
};
struct b is expended as something like this :
struct b
{
struct a
{
int i;
};
int i;
int j;
}
and so, at compile time it evolve on stack as something like:
b:
int ai
int i
int j
that also why it's dificult to have selfreferent structs, C preprocessor round in a déclaration loop that can't terminate.
typedef are type specifier, that means only C compiler process it and it can do like he want for optimise assembler code implementation. It also dont expend member of type par stupidly like préprocessor do with structs but use more complex reference construction algorithm, so construction like :
typedef struct a A; //anticipated declaration for member declaration
typedef struct a //Implemented declaration
{
A* b; // member declaration
}A;
is permited and fully functional. This implementation give also access to compilator type conversion and remove some bugging effects when execution thread leave the application field of initialisation functions.
This mean that in C typedefs are more near as C++ class than lonely structs.

Resources