The typical C99 way to extend a struct is something like
struct Base {
int x;
/* ... */
};
struct Derived {
struct Base base_part;
int y;
/* ... */
};
Then we may cast an instance of struct Derived * to struct Base * and access x.
I want to access the base members of struct Derived * obj; directly, for example obj->x and obj->y. C11 provides extended structs (anonymous struct members), but as explained here, this feature can be used only with anonymous definitions. So how about writing
#define BASE_BODY { \
int x; \
}
struct Base BASE_BODY;
struct Derived {
struct BASE_BODY;
int y;
};
Then I can access the Base members as if they were part of Derived, without any casts or intermediate members. I can cast a Derived pointer to a Base pointer if I need to.
Is this acceptable? Are there any pitfalls?
There are pitfalls.
Consider:
#define BASE_BODY { \
double a; \
short b; \
}
struct Base BASE_BODY;
struct Derived {
struct BASE_BODY;
short c;
};
On some implementations it could be that sizeof(Base) == sizeof(Derived), but:
struct Base {
double a;
// Padding here
short b;
};
struct Derived {
double a;
short b;
short c;
};
There is no guarantee that the memory layout at the beginning of the struct is the same. Therefore you cannot pass this kind of Derived * to a function expecting a Base * and expect it to work.
And even if padding does not disturb the layout, there is still a potential problem with trap representations:
Suppose again that sizeof(Base) == sizeof(Derived), but c ends up in an area that is covered by the padding at the end of Base. Passing a pointer to this struct to a function that expects a Base * and modifies it might affect the padding bytes too (padding has unspecified values), thus possibly corrupting c and maybe even creating a trap representation.
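The layout concern above can at least be checked by the compiler. The following is a minimal sketch, assuming a C11 compiler; the helper name layouts_match is hypothetical and not part of the question. It only verifies that the shared members sit at the same offsets; it does not make the cast itself well-defined.

```c
#include <assert.h>
#include <stddef.h>

#define BASE_BODY { \
    double a;       \
    short b;        \
}

struct Base BASE_BODY;

struct Derived {
    struct BASE_BODY;   /* C11 anonymous struct: members a and b */
    short c;
};

/* Fail the build if the shared members do not sit at the same offsets. */
static_assert(offsetof(struct Derived, a) == offsetof(struct Base, a),
              "member a is laid out differently");
static_assert(offsetof(struct Derived, b) == offsetof(struct Base, b),
              "member b is laid out differently");

/* Hypothetical run-time helper reporting the same check (1 = match). */
int layouts_match(void)
{
    return offsetof(struct Derived, a) == offsetof(struct Base, a)
        && offsetof(struct Derived, b) == offsetof(struct Base, b);
}
```

On common ABIs the assertions pass, but the standard does not promise that two separately declared structs share a layout, so the check is a safety net rather than a guarantee.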
I have a struct X which inherits from struct Base. However, in my current setup, due to alignment, size of X is 24B:
typedef struct {
double_t a;
int8_t b;
} Base;
typedef struct {
Base base;
int8_t c;
} X;
In order to save the memory, I'd like to unwind the Base struct, so I created struct Y which contains fields from Base (in the same order, always at the beginning of the struct), so the size of the struct is 16B:
typedef struct {
double_t base_a;
int8_t base_b;
int8_t c;
} Y;
Then I'm going to use instance of struct Y in a method which expects a pointer to Base struct:
void print_base(Base* b)
{
printf("%f %d\n", b->a, b->b);
}
// ...
Y data;
print_base((Base*)&data);
Does the code above violate the strict aliasing rule and cause undefined behavior?
First, Base and Y are not compatible types as defined by the standard (6.2.7): all members must match.
To access a Y through a Base* without creating a strict aliasing violation, Y needs to be "an aggregate type" (it is) that contains a Base type among its members. It does not.
So it is a strict aliasing violation and furthermore, since Y and Base are not compatible, they may have different memory layouts. Which is kind of the whole point, you made them different types for that very reason :)
What you can do in situations like this is use unions with struct members that share a common initial sequence, which is a special allowed case. Example of valid code from C11 6.5.2.3:
union {
struct {
int alltypes;
} n;
struct {
int type;
int intnode;
} ni;
struct {
int type;
double doublenode;
} nf;
} u;
u.nf.type = 1;
u.nf.doublenode = 3.14;
/* ... */
if (u.n.alltypes == 1)
if (sin(u.nf.doublenode) == 0.0)
/* ... */
I'm in a position where I need to get some object-oriented features working in C, in particular inheritance. Luckily there are some good references on Stack Overflow, notably "Semi-inheritance in C: How does this snippet work?" and "Object-orientation in C". The idea is to contain an instance of the base class within the derived class and typecast it, like so:
struct base {
int x;
int y;
};
struct derived {
struct base super;
int z;
};
struct derived d;
d.super.x = 1;
d.super.y = 2;
d.z = 3;
struct base *b = (struct base *)&d;
This is great, but it becomes cumbersome with deep inheritance trees - I'll have chains of about 5-6 "classes" and I'd really rather not type derived.super.super.super.super.super.super all the time. What I was hoping was that I could typecast to a struct of the first n elements, like this:
struct base {
int x;
int y;
};
struct derived {
int x;
int y;
int z;
};
struct derived d;
d.x = 1;
d.y = 2;
d.z = 3;
struct base *b = (struct base *)&d;
I've tested this on the C compiler that comes with Visual Studio 2012 and it works, but I have no idea if the C standard actually guarantees it. Is there anyone that might know for sure if this is ok? I don't want to write mountains of code only to discover it's broken at such a fundamental level.
What you describe here is a construct that was fully portable and would have been essentially guaranteed to work by the design of the language, except that the authors of the Standard didn't think it was necessary to explicitly mandate that compilers support things that should obviously work. C89 specified the Common Initial Sequence rule for unions, rather than pointers to structures, because given:
struct s1 {int x; int y; ... other stuff... };
struct s2 {int x; int y; ... other stuff... };
union u { struct s1 v1; struct s2 v2; };
code which received a struct s1* to an outside object that was either a union u* or a malloc'ed object could legally cast it to a union u* if it was aligned for that type, and it could legally cast the resulting pointer to struct s2*, and the effect of accessing the object through either struct s1* or struct s2* would have to be the same as accessing the union via either the v1 or v2 member. Consequently, the only way for a compiler to make all of the indicated rules work would be to say that converting a pointer of one structure type into a pointer of another type and using that pointer to inspect members of the Common Initial Sequence would work.
Unfortunately, compiler writers have said that the CIS rule applies only when the underlying object has a union type, even though that is a very rare use case (compared with situations where the union type exists only to let the compiler know that pointers to the structures should be treated interchangeably for purposes of inspecting the CIS). Further, since it would be rare for code to receive a struct s1* or struct s2* that identifies an object within a union u, they think they should be allowed to ignore that possibility. Thus, even if the above declarations are visible, gcc will assume that a struct s1* will never be used to access members of the CIS from a struct s2*.
By using pointers you can always create references to base classes at any level in the hierarchy. And if you use some kind of description of the inheritance structure, you can generate both the "class definitions" and factory functions needed as a build step.
#include <stdio.h>
#include <stdlib.h>
struct foo_class {
int a;
int b;
};
struct bar_class {
struct foo_class foo;
struct foo_class* base;
int c;
int d;
};
struct gazonk_class {
struct bar_class bar;
struct bar_class* base;
struct foo_class* Foo;
int e;
int f;
};
struct gazonk_class* gazonk_factory(void) {
struct gazonk_class* new_instance = malloc(sizeof(struct gazonk_class));
if (new_instance == NULL) return NULL; /* allocation failed */
/* wire up the pointers to the embedded base parts */
new_instance->bar.base = &new_instance->bar.foo;
new_instance->base = &new_instance->bar;
new_instance->Foo = &new_instance->bar.foo;
return new_instance;
}
int main(void) {
struct gazonk_class* object = gazonk_factory();
if (object == NULL) return EXIT_FAILURE;
object->Foo->a = 1;
object->Foo->b = 2;
object->base->c = 3;
object->base->d = 4;
object->e = 5;
object->f = 6;
fprintf(stdout, "%d %d %d %d %d %d\n",
object->base->base->a,
object->base->base->b,
object->base->c,
object->base->d,
object->e,
object->f);
free(object);
return 0;
}
In this example you can either use base pointers to work your way back or directly reference a base class.
The address of a struct is the address of its first element, guaranteed.
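As a minimal sketch of that guarantee, using the base and derived structs from the question (the function names base_sum and demo_upcast are hypothetical), a pointer to the derived struct can be converted to a pointer to its first member:

```c
#include <assert.h>

struct base {
    int x;
    int y;
};

struct derived {
    struct base super;  /* must be the first member */
    int z;
};

/* Works on any object whose first member is a struct base. */
int base_sum(const struct base *b)
{
    return b->x + b->y;
}

/* Hypothetical helper: builds a derived object and views it as its base. */
int demo_upcast(void)
{
    struct derived d = { { 1, 2 }, 3 };
    struct base *b = (struct base *)&d;   /* same address as &d.super */
    assert((void *)b == (void *)&d.super);
    return base_sum(b);
}
```

Writing &d.super instead of the cast gives the same pointer and lets the compiler check the type.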
C11 adds, among other things, 'Anonymous Structs and Unions'.
I poked around but could not find a clear explanation of when anonymous structs and unions would be useful. I ask because I don't completely understand what they are. I get that they are structs or unions without the name afterwards, but I have always (had to?) treat that as an error so I can only conceive a use for named structs.
Anonymous unions inside structures are very useful in practice. Consider that you want to implement a discriminated sum type (or tagged union), an aggregate with a boolean and either a float or a char* (i.e. a string), depending upon the boolean flag. With C11 you should be able to code
typedef struct {
bool is_float;
union {
float f;
char* s;
};
} mychoice_t;
double as_float(mychoice_t* ch)
{
if (ch->is_float) return ch->f;
else return atof(ch->s);
}
With C99, you'll have to name the union, and code ch->u.f and ch->u.s which is less readable and more verbose.
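For comparison, a sketch of the C99 form with a named union could look like this. The names mychoice99_t and as_float99 are made up here to keep them apart from the C11 version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    bool is_float;
    union {
        float f;
        char *s;
    } u;                /* C99 requires a member name here */
} mychoice99_t;

/* Same logic as as_float, but every access goes through the u member. */
double as_float99(const mychoice99_t *ch)
{
    assert(ch != NULL);
    if (ch->is_float)
        return ch->u.f;
    return atof(ch->u.s);
}
```

A value can then be built with designated initializers, for example `mychoice99_t c = { .is_float = true, .u.f = 2.5f };`.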
Another way to implement some tagged union type is to use casts. The Ocaml runtime gives a lot of examples.
The SBCL implementation of Common Lisp does use some union to implement tagged union types. And GNU make also uses them.
A typical, real-world use of anonymous structs and unions is to provide an alternative view of data. For example, when implementing a 3D point type:
typedef struct {
union{
struct{
double x;
double y;
double z;
};
double raw[3];
};
}vec3d_t;
vec3d_t v;
v.x = 4.0;
v.raw[1] = 3.0; // Equivalent to v.y = 3.0
v.z = 2.0;
This is useful if you interface to code that expects a 3D vector as a pointer to three doubles. Instead of doing f(&v.x) which is ugly, you can do f(v.raw) which makes your intent clear.
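A short sketch of that interfacing case follows. The routine sum3 is a hypothetical stand-in for any function that expects a pointer to three doubles, and the static assertion records the assumption that the anonymous struct has no padding between x, y and z.

```c
#include <assert.h>

typedef struct {
    union {
        struct {
            double x;
            double y;
            double z;
        };
        double raw[3];
    };
} vec3d_t;

/* The aliasing only lines up if the anonymous struct has no padding. */
static_assert(sizeof(vec3d_t) == 3 * sizeof(double),
              "x, y, z must overlay raw[0..2] exactly");

/* Hypothetical library routine that only knows about double[3]. */
double sum3(const double *p)
{
    return p[0] + p[1] + p[2];
}

/* Hypothetical helper: fill by name, pass as an array. */
double demo_vec3(void)
{
    vec3d_t v;
    v.x = 4.0;
    v.raw[1] = 3.0;   /* same storage as v.y */
    v.z = 2.0;
    return sum3(v.raw);
}
```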
struct bla {
struct { int a; int b; };
int c;
};
the type struct bla has a member of a C11 anonymous structure type.
struct { int a; int b; } has no tag and the object has no name: it is an anonymous structure type.
You can access the members of the anonymous structure this way:
struct bla myobject;
myobject.a = 1; // a is a member of the anonymous structure inside struct bla
myobject.b = 2; // same for b
myobject.c = 3; // c is a member of the structure struct bla
Another useful application is dealing with rgba colors, since you might want to access each channel on its own or all of them as a single int.
typedef struct {
union{
struct {uint8_t a, b, g, r;};
uint32_t val;
};
}Color;
Now you can access the individual rgba values or the entire value. On a little-endian machine the highest byte of val is r, i.e.:
int main(void)
{
Color x;
x.r = 0x11;
x.g = 0xAA;
x.b = 0xCC;
x.a = 0xFF;
printf("%X\n", x.val);
return 0;
}
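This prints 11AACCFF on a little-endian machine; on a big-endian machine the bytes appear in the opposite order (FFCCAA11).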
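If the packed value has to be the same on every machine, shifting the channels into place avoids the dependence on byte order. This is a minimal sketch; pack_rgba, unpack_r and demo_pack are hypothetical helpers, not part of the answer above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: r ends up in the most significant byte
   regardless of the machine's byte order. */
uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16)
         | ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Hypothetical inverse: extract the red channel again. */
uint8_t unpack_r(uint32_t val)
{
    return (uint8_t)(val >> 24);
}

/* Hypothetical self-check using the same channel values as above. */
uint32_t demo_pack(void)
{
    uint32_t val = pack_rgba(0x11, 0xAA, 0xCC, 0xFF);
    assert(unpack_r(val) == 0x11);
    return val;
}
```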
I'm not sure why C11 allows anonymous structures inside structures. But Linux uses it with a certain language extension:
/**
* struct blk_mq_ctx - State for a software queue facing the submitting CPUs
*/
struct blk_mq_ctx {
struct {
spinlock_t lock;
struct list_head rq_lists[HCTX_MAX_TYPES];
} ____cacheline_aligned_in_smp;
/* ... other fields without explicit alignment annotations ... */
} ____cacheline_aligned_in_smp;
I'm not sure if that example is strictly necessary, except to make the intent clear.
EDIT: I found another similar pattern which is more clear-cut. The anonymous struct feature is used with this attribute:
#if defined(RANDSTRUCT_PLUGIN) && !defined(__CHECKER__)
#define __randomize_layout __attribute__((randomize_layout))
#define __no_randomize_layout __attribute__((no_randomize_layout))
/* This anon struct can add padding, so only enable it under randstruct. */
#define randomized_struct_fields_start struct {
#define randomized_struct_fields_end } __randomize_layout;
#endif
I.e. a language extension / compiler plugin to randomize field order (ASLR-style exploit "hardening"):
struct kiocb {
struct file *ki_filp;
/* The 'ki_filp' pointer is shared in a union for aio */
randomized_struct_fields_start
loff_t ki_pos;
void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
void *private;
int ki_flags;
u16 ki_hint;
u16 ki_ioprio; /* See linux/ioprio.h */
unsigned int ki_cookie; /* for ->iopoll */
randomized_struct_fields_end
};
Well, if you declare variables from that struct only once in your code, why does it need a name?
struct {
int a;
struct {
int b;
int c;
} d;
} e,f;
And you can now write things like e.a, f.d.b, etc.
(I added the inner struct because I think this is one of the most common usages of anonymous structs.)
I have a question about structures having a "variable members list" similar to the "variable argument list" that we can define functions as having.
I may sound stupid or completely off the line in terms of C language basics, but please correct me if I am wrong.
So can I have a C struct like this:
struct Var_Members_Interface
{
int intMember;
char *charMember;
... // is this possible?
};
My idea is to have a c style interface that can be implemented by the classes but these classes can have additional members in this structure. However, they must have intMember and charMember.
Thanks in advance.
The closest approximation in C99 (but not C89) is to have a flexible array member at the end of the structure:
struct Var_Members_Interface
{
int intMember;
char *charMember;
Type flexArrayMember[];
};
You can now dynamically allocate the structure with an array of the type Type at the end, and access the array:
struct Var_Members_Interface *vmi = malloc(sizeof(*vmi) + N * sizeof(Type));
vmi->flexArrayMember[i] = ...;
Note that this cannot be used in C++.
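A fuller sketch of the flexible array member approach is shown below. Since Type is a placeholder in the text, int is assumed here, and the functions vmi_create and vmi_get are hypothetical helpers; storing the element count in intMember is likewise just a choice made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* "Type" in the text is a placeholder; int is assumed here. */
struct Var_Members_Interface {
    int intMember;
    char *charMember;
    int flexArrayMember[];   /* must be the last member */
};

/* Hypothetical constructor: one allocation for header plus n elements. */
struct Var_Members_Interface *vmi_create(size_t n)
{
    struct Var_Members_Interface *vmi =
        malloc(sizeof *vmi + n * sizeof vmi->flexArrayMember[0]);
    if (vmi == NULL)
        return NULL;
    vmi->intMember = (int)n;     /* remember the element count */
    vmi->charMember = NULL;
    for (size_t i = 0; i < n; i++)
        vmi->flexArrayMember[i] = (int)(i * i);
    return vmi;
}

/* Bounds-checked read of one trailing element. */
int vmi_get(const struct Var_Members_Interface *vmi, size_t i)
{
    assert(i < (size_t)vmi->intMember);
    return vmi->flexArrayMember[i];
}
```

A single free(vmi) releases the header and the trailing array together.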
But that isn't a very close approximation to what you are after. What you are after cannot be done in C with a single structure type, and can only be approximated in C++ via inheritance - see other answers.
One trick that you can get away with - usually - in C uses multiple structure types and lots of casts:
struct VM_Base
{
int intMember;
char *charMember;
};
struct VM_Variant1
{
int intMember;
char *charMember;
int intArray[3];
};
struct VM_Variant2
{
int intMember;
char *charMember;
Type typeMember;
};
struct VM_Variant3
{
int intMember;
char *charMember;
double doubleMember;
};
Now, with some sledgehammering casts, you can write functions which take 'struct VM_Base *' arguments, and pass in a pointer to any of the VM_VariantN types. The 'intMember' can probably be used to tell which of the variants you actually have. This is more or less what happens with the POSIX sockets functions. There are different types of socket address, and the structures have different lengths, but they have a common prefix, and the correct code ends up being called because the common prefix identifies the type of socket address. (The design is not elegant; but it was standard - a de facto standard from BSD sockets - before POSIX standardized it. And the BSD design pre-dates C89, let alone C99. Were it being designed now, from scratch, with no requirement for compatibility with existing code, it would be done differently.)
This technique is ugly as sin and requires casts galore to make it compile -- and great care to make it work correctly. You shouldn't bother with this sort of mess in C++.
You can't do anything like this with direct language support in C; but in C++, classes that extended your struct would inherit those data members and could add their own. So in C++, not only can you do this, but it's a normal mode of operation.
You first need to understand what a struct really is.
A struct in C is little more than a standard for interpreting bytes in memory.
To see what that means, let's use your struct:
struct Var_Members_Interface
{
int intMember;
char *charMember;
};
struct Var_Members_Interface instance; //An instance of the struct
What this means is, "I'll reserve some memory and call it instance, and I'll interpret the first few bytes to mean an integer, and the next few bytes to mean a pointer to somewhere in memory."
Given this, it makes little sense to have "variable-member" structs, because a struct is just the layout specification for an existing block of memory -- and existing blocks don't have "variable" length.
You could do it the way the old X11 Xt widget library did it:
struct Var_Members_Interface {
int intMember;
char *charMember;
};
struct Other_Part {
int extraInt;
char *extraString;
};
struct Var_Other_Interface {
struct Var_Members_Interface base;
struct Other_Part other;
};
As long as you're careful with your allocations, alignment, and padding issues, then this will work:
struct Var_Other_Interface *other = create_other();
struct Var_Members_Interface *member = (struct Var_Members_Interface *)other;
struct Var_Other_Interface *back_again = (struct Var_Other_Interface *)member;
And you can nest the structs as deep as needed to get a single inheritance hierarchy.
This sort of thing is not for the faint of heart: you have to be very careful with your allocations, structure nesting, etc.
Have a look at an old school Xt widget and you'll get the idea; Xt widgets were usually implemented in three files: a C source file, a public header with the function interface, and a private header to define the structure layout (this one would be needed for subclassing).
For example, the Ghostscript widget that I used to use in mgv looked like this:
typedef struct {
/* Bunch of stuff. */
} GhostviewPart;
typedef struct _GhostviewRec {
CorePart core;
GhostviewPart ghostview;
} GhostviewRec;
The CorePart was the standard Xt widget definition and the GhostviewRec was the actual widget itself.
Not exactly what you are looking for but you can create a void pointer within your struct that can be used to point to another struct where the new types are defined.
struct Var_Members_Interface
{
int intMember;
char *charMember;
void *otherMembers;
};
Edit: The solution in this article might be a lot closer to what you are looking for.
You have two choices I think. You can emulate what C++ does but unfortunately, you have to see all the gory details. You define your common base struct and have that as a member of all your variant structs e.g.
struct VM_Base
{
int intMember;
char *charMember;
};
struct VM_Variant1
{
struct VM_Base base;
int foo;
};
struct VM_Variant2
{
struct VM_Base base;
char *charMember;
double bar;
};
struct VM_Variant3
{
struct VM_Base base;
char *charMember;
char baz[10];
};
Pointers to any of the variant structs are also pointers to the base member of the variant struct so you can cast to the base member freely. Going back the other way is obviously more problematic, since you need a check to make sure you are casting to the right type.
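The check needed for the downward cast could be sketched as follows. This assumes that intMember is used as the type tag; the enum vm_kind and the functions vm_as_variant1 and demo_downcast are hypothetical, and the variants are simplified compared with the ones above.

```c
#include <assert.h>
#include <stddef.h>

enum vm_kind { VM_KIND_1 = 1, VM_KIND_2 = 2 };

struct VM_Base {
    int intMember;      /* assumed here to hold the variant tag */
    char *charMember;
};

struct VM_Variant1 {
    struct VM_Base base;   /* first member, so &v == &v.base */
    int foo;
};

struct VM_Variant2 {
    struct VM_Base base;
    double bar;
};

/* Hypothetical checked downcast: NULL unless the tag matches. */
struct VM_Variant1 *vm_as_variant1(struct VM_Base *b)
{
    if (b == NULL || b->intMember != VM_KIND_1)
        return NULL;
    return (struct VM_Variant1 *)b;
}

/* Hypothetical usage: the upcast is just &v.base, the downcast is checked. */
int demo_downcast(void)
{
    struct VM_Variant1 v1 = { { VM_KIND_1, NULL }, 42 };
    struct VM_Variant1 *back = vm_as_variant1(&v1.base);
    assert(back == &v1);
    return back->foo;
}
```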
You can do away with the casts by using union instead e.g.
struct VM_Variant1
{
struct VM_Base base;
int foo;
};
struct VM_Variant2
{
struct VM_Base base;
char *charMember;
double bar;
};
struct VM_Variant3
{
struct VM_Base base;
char *charMember;
char baz[10];
};
struct VM
{
int intMember;
char *charMember;
union
{
struct VM_Variant1 vm1;
struct VM_Variant2 vm2;
struct VM_Variant3 vm3;
};
};
This second method obviates the need for type casts. You access the members like this:
double aDouble = aVMStruct.vm2.bar;
The three members of the union overlay each other in memory so the allocated block will only be the size of the largest of the three variants.
What is the use of the typedef keyword in C?
When is it needed?
typedef is for defining something as a type. For instance:
typedef struct {
int a;
int b;
} THINGY;
...defines THINGY as the given struct. That way, you can use it like this:
THINGY t;
...rather than:
struct _THINGY_STRUCT {
int a;
int b;
};
struct _THINGY_STRUCT t;
...which is a bit more verbose. typedefs can make some things dramatically clearer, especially pointers to functions.
From wikipedia:
typedef is a keyword in the C and C++ programming languages. The purpose of typedef is to assign alternative names to existing types, most often those whose standard declaration is cumbersome, potentially confusing, or likely to vary from one implementation to another.
And:
K&R states that there are two reasons for using a typedef. First, it provides a means to make a program more portable. Instead of having to change a type everywhere it appears throughout the program's source files, only a single typedef statement needs to be changed. Second, a typedef can make a complex declaration easier to understand.
And an argument against:
He (Greg K.H.) argues that this practice not only unnecessarily obfuscates code, it can also cause programmers to accidentally misuse large structures thinking them to be simple types.
Typedef is used to create aliases to existing types. It's a bit of a misnomer: typedef does not define new types as the new types are interchangeable with the underlying type. Typedefs are often used for clarity and portability in interface definitions when the underlying type is subject to change or is not of importance.
For example:
// Possibly useful in POSIX:
typedef int filedescriptor_t;
// Define a struct foo and then give it a typedef...
struct foo { int i; };
typedef struct foo foo_t;
// ...or just define everything in one go.
typedef struct bar { int i; } bar_t;
// Typedef is very, very useful with function pointers:
typedef int (*CompareFunction)(char const *, char const *);
CompareFunction c = strcmp;
Typedef can also be used to give names to unnamed types. In such cases, the typedef will be the only name for said type:
typedef struct { int i; } data_t;
typedef enum { YES, NO, FILE_NOT_FOUND } return_code_t;
Naming conventions differ. Usually it's recommended to use a trailing_underscore_and_t or CamelCase.
The following example explains the use of typedef. Typedef is also used to make code more readable.
#include <stdio.h>
#include <math.h>
/*
To define a new type name with typedef, follow these steps:
1. Write the statement as if a variable of the desired type were being declared.
2. Where the name of the declared variable would normally appear, substitute the new type name.
3. In front of everything, place the keyword typedef.
*/
// typedef a primitive data type
typedef double distance;
// typedef struct
typedef struct{
int x;
int y;
} point;
//typedef an array
typedef point points[100];
points ps = {0}; // ps is an array of 100 point
// typedef a function pointer
typedef distance (*distanceFun_p)(point,point) ; // distanceFun_p is the type "pointer to function taking (point,point) and returning distance"
// prototype a function
distance findDistance(point, point);
int main(int argc, char const *argv[])
{
// declare a function pointer
distanceFun_p func_p;
// initialize the function pointer with a function address
func_p = findDistance;
// initialize two point variables
point p1 = {0,0} , p2 = {1,1};
// call the function through the pointer
distance d = func_p(p1,p2);
printf("the distance is %f\n", d );
return 0;
}
distance findDistance(point p1, point p2)
{
distance xdiff = p1.x - p2.x;
distance ydiff = p1.y - p2.y;
return sqrt( (xdiff * xdiff) + (ydiff * ydiff) );
}
typedef does not introduce a new type; it just provides a new name for an existing type.
typedef can be used for:
Types that combine arrays, structs, pointers or functions.
To facilitate portability, typedef the type you require. Then, when you port the code to a different platform, select the right type by changing only the typedef.
A typedef can provide a simple name for a complicated type cast.
typedef can also be used to give names to unnamed types. In such cases, the typedef will be the only name for said type.
Note: you shouldn't use typedef with structs. Always use a tag in a structure definition, even if it's not needed.
from Wikipedia:
"K&R states that there are two reasons for using a typedef. First ... . Second, a typedef can make a complex declaration easier to understand."
Here is an example of the second reason for using typedef, simplifying complex types (the complex type is taken from K&R, "The C Programming Language", second edition, p. 136).
char (*(*x())[])()
x is a function returning pointer to array[] of pointer to function returning char.
We can make the above declaration understandable using typedefs. Please see the example below.
typedef char (*pfType)(); // pfType is the type of pointer to function returning
// char
typedef pfType pArrType[2]; // pArrType is the type of array of pointers to
// functions returning char
char charf()
{ return('b');
}
pArrType pArr={charf,charf};
pfType *FinalF() // FinalF is a function returning a pointer to the first element
// of an array of pointers to functions returning char
{
return(pArr);
}
It can alias another type.
typedef unsigned int uint; /* uint is now an alias for "unsigned int" */
typedef unsigned char BYTE;
After this type definition, the identifier BYTE can be used as an abbreviation for the type unsigned char, for example:
BYTE b1, b2;