C Variable Member List for structs, is this possible? - c

I have a question about structures having a "variable members list" similar to the "variable argument list" that we can define functions as having.
I may sound stupid or completely off the line in terms of C language basics, but please correct me if I am wrong.
So can I have a C struct like this:
struct Var_Members_Interface
{
int intMember;
char *charMember;
... // is this possible?
};
My idea is to have a C-style interface that classes can implement; these classes may add their own members to the structure, but they must all have intMember and charMember.
Thanks in advance.

The closest approximation in C99 (but not C89) is to have a flexible array member at the end of the structure:
struct Var_Members_Interface
{
int intMember;
char *charMember;
Type flexArrayMember[];
};
You can now dynamically allocate the structure with an array of the type Type at the end, and access the array:
struct Var_Members_Interface *vmi = malloc(sizeof(*vmi) + N * sizeof(Type));
vmi->flexArrayMember[i] = ...;
Note that this cannot be used in C++.
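As a minimal sketch of the allocation above, assuming Type is double and that intMember records the number of trailing elements (both are assumptions made for this example, as is the helper name vmi_create):

```c
#include <stdlib.h>

/* Hypothetical element type standing in for `Type` in the text. */
typedef double Type;

struct Var_Members_Interface
{
    int intMember;            /* assumption: stores the element count */
    char *charMember;
    Type flexArrayMember[];   /* C99 flexible array member, must be last */
};

/* Allocate the header plus room for n trailing elements in one block. */
struct Var_Members_Interface *vmi_create(int n)
{
    if (n < 0)
        return NULL;

    struct Var_Members_Interface *vmi =
        malloc(sizeof *vmi + (size_t)n * sizeof(Type));
    if (vmi == NULL)
        return NULL;

    vmi->intMember = n;
    vmi->charMember = NULL;
    for (int i = 0; i < n; i++)
        vmi->flexArrayMember[i] = 0;
    return vmi;
}
```

A single free(vmi) releases both the header and the array, since they live in one allocation.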
But that isn't a very close approximation to what you are after. What you are after cannot be done in C with a single structure type, and can only be approximated in C++ via inheritance - see other answers.
One trick that you can get away with - usually - in C uses multiple structure types and lots of casts:
struct VM_Base
{
int intMember;
char *charMember;
};
struct VM_Variant1
{
int intMember;
char *charMember;
int intArray[3];
};
struct VM_Variant2
{
int intMember;
char *charMember;
Type typeMember;
};
struct VM_Variant3
{
int intMember;
char *charMember;
double doubleMember;
};
Now, with some sledgehammering casts, you can write functions which take 'struct VM_Base *' arguments, and pass in a pointer to any of the VM_VariantN types. The 'intMember' can probably be used to tell which of the variants you actually have. This is more or less what happens with the POSIX sockets functions. There are different types of socket address, and the structures have different lengths, but they have a common prefix, and the correct code ends up being called because the common prefix identifies the type of socket address. (The design is not elegant; but it was standard - a de facto standard from BSD sockets - before POSIX standardized it. And the BSD design pre-dates C89, let alone C99. Were it being designed now, from scratch, with no requirement for compatibility with existing code, it would be done differently.)
This technique is ugly as sin and requires casts galore to make it compile -- and great care to make it work correctly. You shouldn't bother with this sort of mess in C++.

You can't do anything like this with direct language support in C; but in C++, classes that extended your struct would inherit those data members and could add their own. So in C++, not only can you do this, but it's a normal mode of operation.

You first need to understand what a struct really is.
A struct in C is little more than a standard for interpreting bytes in memory.
To see what that means, let's use your struct:
struct Var_Members_Interface
{
int intMember;
char *charMember;
};
struct Var_Members_Interface instance; //An instance of the struct
What this means is, "I'll reserve some memory and call it instance, and I'll interpret the first few bytes to mean an integer, and the next few bytes to mean a pointer to somewhere in memory."
Given this, it makes little sense to have "variable-member" structs, because a struct is just the layout specification for an existing block of memory -- and existing blocks don't have "variable" length.
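To make the "layout specification" idea concrete, here is a small sketch that reads intMember out of the raw bytes of an instance using only its offset. The helper name read_int_member is hypothetical and exists only for illustration:

```c
#include <stddef.h>
#include <string.h>

struct Var_Members_Interface
{
    int intMember;
    char *charMember;
};

/* Treat an instance as plain bytes and copy out the int that the
 * struct definition says lives at offsetof(..., intMember). */
int read_int_member(const void *bytes)
{
    int value;
    memcpy(&value,
           (const unsigned char *)bytes
               + offsetof(struct Var_Members_Interface, intMember),
           sizeof value);
    return value;
}
```

The compiler fixes those offsets when it sees the struct definition, which is why a member list that varies at run time has nowhere to live.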

You could do it the way the old X11 Xt widget library did it:
struct Var_Members_Interface {
int intMember;
char *charMember;
};
struct Other_Part {
int extraInt;
char *extraString;
};
struct Var_Other_Interface {
struct Var_Members_Interface base;
struct Other_Part other;
};
As long as you're careful with your allocations, alignment, and padding issues, then this will work:
struct Var_Other_Interface *other = create_other();
struct Var_Members_Interface *member = (struct Var_Members_Interface *)other;
struct Var_Other_Interface *back_again = (struct Var_Other_Interface *)member;
And you can nest the structs as deep as needed to get a single inheritance hierarchy.
This sort of thing is not for the faint of heart: you have to be very careful with your allocations, structure nesting, etc.
Have a look at an old school Xt widget and you'll get the idea; Xt widgets were usually implemented in three files: a C source file, a public header with the function interface, and a private header to define the structure layout (this one would be needed for subclassing).
For example, the Ghostscript widget that I used to use in mgv looked like this:
typedef struct {
/* Bunch of stuff. */
} GhostviewPart;
typedef struct _GhostviewRec {
CorePart core;
GhostviewPart ghostview;
} GhostviewRec;
The CorePart was the standard Xt widget definition and the GhostviewRec was the actual widget itself.

Not exactly what you are looking for but you can create a void pointer within your struct that can be used to point to another struct where the new types are defined.
struct Var_Members_Interface
{
int intMember;
char *charMember;
void *otherMembers;
};
Edit: The solution in this article might be a lot closer to what you are looking for.

You have two choices I think. You can emulate what C++ does but unfortunately, you have to see all the gory details. You define your common base struct and have that as a member of all your variant structs e.g.
struct VM_Base
{
int intMember;
char *charMember;
};
struct VM_Variant1
{
struct VM_Base base;
int foo;
};
struct VM_Variant2
{
struct VM_Base base;
char *charMember;
double bar;
};
struct VM_Variant3
{
struct VM_Base base;
char *charMember;
char baz[10];
};
Pointers to any of the variant structs are also pointers to the base member of the variant struct so you can cast to the base member freely. Going back the other way is obviously more problematic, since you need a check to make sure you are casting to the right type.
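A rough sketch of that checked "cast back" might look like the following. It assumes intMember doubles as a type tag; the enum values and the helper names vm1_as_base and vm_as_variant2 are invented for this example:

```c
#include <stddef.h>

/* Hypothetical tags stored in intMember to identify the variant. */
enum vm_kind { VM_KIND_1 = 1, VM_KIND_2 = 2 };

struct VM_Base
{
    int intMember;
    char *charMember;
};

struct VM_Variant1
{
    struct VM_Base base;   /* must be the first member */
    int foo;
};

struct VM_Variant2
{
    struct VM_Base base;   /* must be the first member */
    double bar;
};

/* Going "up" needs no cast at all: take the address of the base member. */
struct VM_Base *vm1_as_base(struct VM_Variant1 *v)
{
    return &v->base;
}

/* Going "down" is only safe after checking the tag. */
struct VM_Variant2 *vm_as_variant2(struct VM_Base *b)
{
    if (b == NULL || b->intMember != VM_KIND_2)
        return NULL;
    return (struct VM_Variant2 *)b;
}
```

The downcast relies on the C rule that a pointer to a structure, suitably converted, points to its first member and vice versa, so it works only while base stays first.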
You can do away with the casts by using union instead e.g.
struct VM_Variant1
{
struct VM_Base base;
int foo;
};
struct VM_Variant2
{
struct VM_Base base;
char *charMember;
double bar;
};
struct VM_Variant3
{
struct VM_Base base;
char *charMember;
char baz[10];
};
struct VM
{
int intMember;
char *charMember;
union
{
struct VM_Variant1 vm1;
struct VM_Variant2 vm2;
struct VM_Variant3 vm3;
};
};
This second method obviates the need for type casts. You access the members like this:
double aDouble = aVMStruct.vm2.bar;
The three members of the union overlay each other in memory so the allocated block will only be the size of the largest of the three variants.

Related

C structure multiple types

I'd like to write a library in C and I don't know what is the recommended way. I got for example structure and multiple functions like this:
typedef struct example
{
int *val;
struct example *next;
} Example;
and I have build functions for multiple types of val:
Example *build(void);  /* val is int */
Example *buildf(void); /* val is float */
Example *buildd(void); /* val is double */
What is the better practice (as used in "professional" libraries): use a pointer to void and casting, or have a structure for each possibility (int, float, double)?
Use a union and some way to store type info:
typedef struct example
{
enum{ T_STRUCT_WITH_INT, T_STRUCT_WITH_FLOAT, T_SO_ON } type;
union {
int val_int;
float val_float;
} val;
struct example *next;
} Example;
Access the fields after checking type, e.g. s->val.val_int.
In C11 you can make the union anonymous, so the fields can be accessed like s->val_int.
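As a sketch of "check the type, then access the field", the function below walks a list and adds up the values. The shortened tag names, the extra val_double field and the function example_sum are assumptions made for this example:

```c
#include <stddef.h>

typedef struct example
{
    enum { T_INT, T_FLOAT, T_DOUBLE } type;   /* says which field is live */
    union {
        int val_int;
        float val_float;
        double val_double;
    } val;
    struct example *next;
} Example;

/* Sum every node's value as a double, checking the tag before each access. */
double example_sum(const Example *head)
{
    double total = 0.0;
    for (const Example *e = head; e != NULL; e = e->next) {
        switch (e->type) {
        case T_INT:    total += e->val.val_int;    break;
        case T_FLOAT:  total += e->val.val_float;  break;
        case T_DOUBLE: total += e->val.val_double; break;
        }
    }
    return total;
}
```

One set of functions then serves every value type, at the cost of each node being as large as the biggest union member.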
This is primarily based on some combination of opinion, experience and the specific requirements at hand.
The following approach is possible, inspired by some container library work by Jacob Navia. I've never used it myself:
struct container_node {
struct container_node *link_here, *link_there, *link_elsewhere;
/*...*/
char data[0]; /* GNU C zero-length array; C99 spells this flexible array member as data[] */
};
struct container_node *container_node_alloc(size_t data_size);
The allocation function allocates the node large enough so that data[0] through data[data_size-1] bytes of storage are available. Through another set of API functions, user data of arbitrary type can be copied in and out.
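A minimal sketch of such an allocator is shown below. It is not taken from the library mentioned above: the data_size field and the container_node_put / container_node_get accessors are hypothetical, and the C99 data[] spelling is used for the trailing array:

```c
#include <stdlib.h>
#include <string.h>

struct container_node {
    struct container_node *link_here, *link_there, *link_elsewhere;
    size_t data_size;   /* hypothetical field: remembers the payload size */
    char data[];        /* C99 flexible array member */
};

/* One malloc covers the links and data_size bytes of payload. */
struct container_node *container_node_alloc(size_t data_size)
{
    struct container_node *n = malloc(sizeof *n + data_size);
    if (n == NULL)
        return NULL;
    n->link_here = n->link_there = n->link_elsewhere = NULL;
    n->data_size = data_size;
    return n;
}

/* Copy user data in; returns 0 on success, -1 if it does not fit. */
int container_node_put(struct container_node *n, const void *src, size_t size)
{
    if (size > n->data_size)
        return -1;
    memcpy(n->data, src, size);
    return 0;
}

/* Copy user data out; returns 0 on success, -1 if size is too large. */
int container_node_get(const struct container_node *n, void *dst, size_t size)
{
    if (size > n->data_size)
        return -1;
    memcpy(dst, n->data, size);
    return 0;
}
```

Copying with memcpy rather than casting data to the user's type sidesteps any question about whether the byte array is suitably aligned for that type.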
The following approach is sometimes called "intrusive container". The container defines only a "base class" consisting of the link structure. The user must embed this structure into their own structure:
struct container_node {
struct container_node *next, *prev;
};
void container_insert(struct container *container, struct container_node *n);
struct container_node *container_first(struct container *container);
The user does this:
struct my_widget {
struct container_node container_links;
int widget_height;
/* ... */
};
/* .... */
/* We don't insert my_widget, but rather its links base. */
container_insert(&widg_container, &widget->container_links);
Some macros are used to convert between a pointer to the widget and a pointer to the container links. See the container_of macro used widely in the Linux kernel:
struct my_widget *wptr = container_of(container_first(&widg_container),
struct my_widget, container_links);
See this question.
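For reference, a portable approximation of container_of can be written with offsetof. This is only a sketch: the Linux kernel's own macro adds type checking through GNU C extensions, and widget_from_links is a hypothetical helper. The links are deliberately placed second here to show that the member does not need to come first:

```c
#include <stddef.h>

struct container_node {
    struct container_node *next, *prev;
};

/* Step back from a member's address to the start of the enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct my_widget {
    int widget_height;
    struct container_node container_links;   /* not the first member */
};

/* Recover the widget that owns a given set of links. */
struct my_widget *widget_from_links(struct container_node *n)
{
    return container_of(n, struct my_widget, container_links);
}
```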
Then there are approaches that store a union in each node, which provides an integer, a floating-point value or a pointer. In that case, the data is separately allocated (though not necessarily: if the caller controls the allocation of the nodes, it's still possible to put the node structure and the user data in a buffer that came from a single malloc call).
Finally, there are also approaches which wrap these techniques with preprocessor templating, an example of which are the BSD QUEUE macros.

Using macro in C11 anonymous struct definition

The typical C99 way to extend a struct is something like
struct Base {
int x;
/* ... */
};
struct Derived {
struct Base base_part;
int y;
/* ... */
};
Then we may cast a struct Derived * to struct Base * and access x.
I want to access the base elements of struct Derived * obj; directly, for example obj->x and obj->y. C11 provides anonymous structs, but as explained here we can use this feature only with anonymous definitions. So how about writing
#define BASE_BODY { \
int x; \
}
struct Base BASE_BODY;
struct Derived {
struct BASE_BODY;
int y;
};
Then I may access the Base members as if they were part of Derived, without any casts or intermediate members. I can cast a Derived pointer to a Base pointer if need be.
Is this acceptable? Are there any pitfalls?
There are pitfalls.
Consider:
#define BASE_BODY { \
double a; \
short b; \
}
struct Base BASE_BODY;
struct Derived {
struct BASE_BODY;
short c;
};
On some implementations it could be that sizeof(struct Base) == sizeof(struct Derived), and yet the two could be laid out like this:
struct Base {
double a;
// Padding here
short b;
};
struct Derived {
double a;
short b;
short c;
};
There is no guarantee that the memory layout at the beginning of the two structs is the same. Therefore you cannot pass this kind of Derived * to a function expecting Base * and expect it to work.
And even if padding does not mess up the layout, there is still a potential problem with trap representations:
Suppose again that sizeof(Base) == sizeof(Derived), but c ends up in an area that is covered by the padding at the end of Base. Passing a pointer to this struct to a function that expects Base * and modifies it might affect the padding bytes too (padding has unspecified values), thus possibly corrupting c and maybe even creating a trap representation.

How to handle different data types, in one array, in C

I would like to simulate object-oriented programming as in C++. Consider the following C code:
typedef struct tAnimal{
char * name;
int age;
}tAnimal;
typedef struct tAnimal2{
char * name;
int age;
float size;
}tAnimal2;
In C++ you can create a table of different objects which are inherited from the same class.
I would like to do the same in C, let's consider the following code:
tAnimal ** tab;
tab = malloc(sizeof(tAnimal*)*2);
tab[0] = malloc(sizeof(tAnimal));
tab[1] = malloc(sizeof(tAnimal2));
Notice that the allocation works because malloc returns a void pointer, and C does not require casting. But I still have no access to the size field, because the type of tab elements is tAnimal after all.
Is there any way to fix this? I would like to stay away from void ** pointers.
In C it's common to use a structure with a type-flag, and a union of the data:
typedef enum
{
Animal1,
Animal2
} AnimalType;
struct Animal
{
AnimalType type;
union
{
tAnimal animal;
tAnimal2 animal2;
};
};
Now you can create an array of the Animal structure.
If you want to access the size field you have to cast your pointer to tAnimal2. Note, the same would be true for C++.
You can kind of simulate inheritance by embedding the first struct at the beginning of the second:
struct tAnimal{
char * name;
int age;
};
struct tAnimal2{
struct tAnimal parent;
float size;
};
In order to access the size field in tab[1] you can cast the pointer to a tAnimal2 pointer.
tAnimal2* panimal2 = (tAnimal2*) tab[1];
panimal2->size = 1.0;
However this practice is prone to data corruption since you will need a method to ensure that the element of the table you cast to tAnimal2 is indeed an instance of tAnimal2. You could use an additional type field as Joachim Pileborg suggests to check the type of the object.

When are anonymous structs and unions useful in C11?

C11 adds, among other things, 'Anonymous Structs and Unions'.
I poked around but could not find a clear explanation of when anonymous structs and unions would be useful. I ask because I don't completely understand what they are. I get that they are structs or unions without the name afterwards, but I have always (had to?) treat that as an error so I can only conceive a use for named structs.
Anonymous unions inside structures are very useful in practice. Consider that you want to implement a discriminated sum type (or tagged union), an aggregate with a boolean and either a float or a char* (i.e. a string), depending upon the boolean flag. With C11 you should be able to code
#include <stdbool.h> /* bool */
#include <stdlib.h>  /* atof */
typedef struct {
bool is_float;
union {
float f;
char* s;
};
} mychoice_t;
double as_float(mychoice_t* ch)
{
if (ch->is_float) return ch->f;
else return atof(ch->s);
}
With C99, you'll have to name the union, and code ch->u.f and ch->u.s which is less readable and more verbose.
Another way to implement some tagged union type is to use casts. The OCaml runtime gives a lot of examples.
The SBCL implementation of Common Lisp does use some union to implement tagged union types. And GNU make also uses them.
A typical, real-world use of anonymous structs and unions is to provide an alternative view of data. For example, when implementing a 3D point type:
typedef struct {
union{
struct{
double x;
double y;
double z;
};
double raw[3];
};
}vec3d_t;
vec3d_t v;
v.x = 4.0;
v.raw[1] = 3.0; // Equivalent to v.y = 3.0
v.z = 2.0;
This is useful if you interface to code that expects a 3D vector as a pointer to three doubles. Instead of doing f(&v.x) which is ugly, you can do f(v.raw) which makes your intent clear.
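Here is a small sketch of that interfacing case. The consumer dot3 is a hypothetical function standing in for "code that expects a pointer to three doubles", and the static assertion records the assumption that x, y and z are laid out without padding:

```c
typedef struct {
    union {
        struct {
            double x;
            double y;
            double z;
        };
        double raw[3];
    };
} vec3d_t;

/* Assumption made explicit: no padding between x, y and z, so raw[]
 * overlays them exactly. Compilation fails if that does not hold. */
_Static_assert(sizeof(vec3d_t) == 3 * sizeof(double),
               "x, y, z must be contiguous for raw[] to alias them");

/* Hypothetical consumer that only knows about arrays of three doubles. */
double dot3(const double *a, const double *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

A call such as dot3(v.raw, w.raw) then reads naturally, while the rest of the code keeps using v.x, v.y and v.z.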
struct bla {
struct { int a; int b; };
int c;
};
Here the type struct bla has a member of a C11 anonymous structure type.
struct { int a; int b; } has no tag and the object has no name: it is an anonymous structure type.
You can access the members of the anonymous structure this way:
struct bla myobject;
myobject.a = 1; // a is a member of the anonymous structure inside struct bla
myobject.b = 2; // same for b
myobject.c = 3; // c is a member of the structure struct bla
Another useful application is when you are dealing with RGBA colors, since you might want to access each channel on its own or all of them as a single int.
#include <stdint.h> /* uint8_t, uint32_t */
#include <stdio.h>  /* printf */
typedef struct {
union{
struct {uint8_t a, b, g, r;};
uint32_t val;
};
}Color;
Now you can access the individual RGBA values or the entire value; on a little-endian machine its highest byte is r, i.e.:
int main(void)
{
Color x;
x.r = 0x11;
x.g = 0xAA;
x.b = 0xCC;
x.a = 0xFF;
printf("%X\n", x.val);
return 0;
}
Prints 11AACCFF on a little-endian machine (a big-endian one would print FFCCAA11).
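Because the union view depends on byte order, code that needs the same packed value on every machine can build it with shifts instead. This is a different technique from the union above, sketched with hypothetical helper names color_pack and color_red:

```c
#include <stdint.h>

/* Pack four channels so that r is always the most significant byte,
 * regardless of the machine's byte order. */
uint32_t color_pack(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16)
         | ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Extract the red channel back out of a packed value. */
uint8_t color_red(uint32_t val)
{
    return (uint8_t)(val >> 24);
}
```

With the channel values used above, color_pack(0x11, 0xAA, 0xCC, 0xFF) yields 0x11AACCFF.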
I'm not sure why C11 allows anonymous structures inside structures. But Linux uses it with a certain language extension:
/**
* struct blk_mq_ctx - State for a software queue facing the submitting CPUs
*/
struct blk_mq_ctx {
struct {
spinlock_t lock;
struct list_head rq_lists[HCTX_MAX_TYPES];
} ____cacheline_aligned_in_smp;
/* ... other fields without explicit alignment annotations ... */
} ____cacheline_aligned_in_smp;
I'm not sure if that example is strictly necessary, except to make the intent clear.
EDIT: I found another similar pattern which is more clear-cut. The anonymous struct feature is used with this attribute:
#if defined(RANDSTRUCT_PLUGIN) && !defined(__CHECKER__)
#define __randomize_layout __attribute__((randomize_layout))
#define __no_randomize_layout __attribute__((no_randomize_layout))
/* This anon struct can add padding, so only enable it under randstruct. */
#define randomized_struct_fields_start struct {
#define randomized_struct_fields_end } __randomize_layout;
#endif
I.e. a language extension / compiler plugin to randomize field order (ASLR-style exploit "hardening"):
struct kiocb {
struct file *ki_filp;
/* The 'ki_filp' pointer is shared in a union for aio */
randomized_struct_fields_start
loff_t ki_pos;
void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
void *private;
int ki_flags;
u16 ki_hint;
u16 ki_ioprio; /* See linux/ioprio.h */
unsigned int ki_cookie; /* for ->iopoll */
randomized_struct_fields_end
};
Well, if you declare variables from that struct only once in your code, why does it need a name?
struct {
int a;
struct {
int b;
int c;
} d;
} e,f;
And you can now write things like e.a, f.d.b, etc.
(I added the inner struct because I think this is one of the most common uses of anonymous structs.)

Is this a coding convention?

I am doing feature enhancement on a piece of code, and here is what I saw in the existing code. Whenever an enum or struct is declared, there is always a typedef later:
enum _Mode {
MODE1 = 0,
MODE2,
MODE3
};
typedef enum _Mode Mode;
Similarly for a structure:
struct _Slot {
void * mem1;
int mem2;
};
typedef struct _Slot Slot;
Can't the structures be declared directly, as with the enum? Why is there a typedef for something as minor as an underscore? Is this a coding convention?
Kindly give good answers, because I need to add some code, and if this is a rule, I need to follow it.
Please help.
P.S: As an additional info, the source code is written in C, and Linux is the platform.
In C, to declare a variable with a struct type you would have to use the following:
struct _Slot a;
The typedef allows you to make this look somewhat neater by essentially creating an alias, allowing a variable declaration like so:
Slot a;
In C there are separate "namespaces" for struct and typedef. Thus, without a typedef you would have to access Slot as struct _Slot, which is more typing. Compare:
struct Slot { ... };
struct Slot s;
struct Slot create_s() { ... }
void use_s(struct Slot s) { ... }
vs
typedef struct _Slot { ... } Slot;
Slot s;
Slot create_s() { ... }
void use_s(Slot s) { ... }
Also see http://en.wikipedia.org/wiki/Struct_(C_programming_language)#typedef for details, like possible namespace clash.
If the following is a structure:
struct _Slot {
void * mem1;
int mem2;
};
you need the following to declare a variable:
struct _Slot s;
Notice the extra struct before _Slot. It seems more natural to declare a variable like Slot s, doesn't it?
If you want to get rid of extra struct, you need a typedef:
typedef struct _Slot Slot;
Slot s;
It's a sort of code obfuscation technique which only makes sense in a small number of cases.
People say it's more natural not to write "struct", along with other subjective things.
But objectively, one at least a) can't forward declare such a typedef'ed struct, and b) has to jump through an extra hoop when using ctags.
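On point a), one pattern that keeps both the short name and a forward declaration is to give the struct a tag and let the typedef itself act as the forward declaration. The sketch below is an illustration only: the slot_* functions are hypothetical, and the tag is spelled Slot rather than _Slot because identifiers that begin with an underscore followed by an uppercase letter are reserved for the implementation in C:

```c
#include <stdlib.h>

/* What a header would expose: the typedef names an incomplete type,
 * so callers can hold Slot pointers without seeing the members. */
typedef struct Slot Slot;

Slot *slot_create(int mem2);
int slot_mem2(const Slot *s);
void slot_destroy(Slot *s);

/* What the implementation file would contain. */
struct Slot {
    void *mem1;
    int mem2;
};

Slot *slot_create(int mem2)
{
    Slot *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->mem1 = NULL;
    s->mem2 = mem2;
    return s;
}

int slot_mem2(const Slot *s)
{
    return s->mem2;
}

void slot_destroy(Slot *s)
{
    free(s);
}
```

The tag and the typedef name can be identical because struct tags and ordinary identifiers live in separate name spaces.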
