Mocking "inheritance" in accessing members of structures in C

I have a question about C and attempting to mock a partial type of "inheritance", only in accessing members of structures. Look at the following example:
#pragma pack(push,1)
typedef struct foo
{
int value;
int value2;
}foo;
typedef struct foo_extended
{
// "inherits" foo
int value;
int value2;
// "inherits" foo stops
//we also have some additional data
float additional;
}foo_extended;
#pragma pack(pop)
//! This function works for both foo types
void workboth(void* objP)
{
foo* obj = (foo*)objP;
obj->value = 5;
obj->value2 = 15;
}
//! This works only for the extended
void workextended(foo_extended* obj)
{
obj->value = 25;
obj->value2 = 35;
obj->additional = 3.14;
}
int main()
{
foo a;
foo_extended b;
workboth(&a);
workboth(&b);
workextended(&b);
return 0;
}
This works on my system, but my question is whether it is portable as long as the structures involved are packed correctly (which depends on the compiler). I suppose it would need #ifndefs that correctly invoke tight packing on other compilers too.
Of course the obvious problem is total lack of type checking and putting all of the responsibility of correct usage to the programmer but I am wondering if this is portable or not. Thanks!
P.S.: Forgot to mention that the standard I attempt to adhere to is C99

I have a slightly different method. Instead of using the same values, I create a struct in the struct, and this way the packing is unnecessary:
typedef struct foo
{
int value;
int value2;
}foo;
typedef struct foo_extended
{
foo father;
float additional;
}foo_extended;
Now the rest is pretty much as you showed, with a small difference:
void workextended(foo_extended* obj)
{
obj->father.value = 25;
obj->father.value2 = 35;
obj->additional = 3.14;
}
I would also add an id field to the first object in the hierarchy, to make sure that any cast is done to the correct type.
This method is guaranteed to work by the C standard.
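The id idea mentioned above could look roughly like this. It is only a sketch: the enum foo_kind, its enumerators, and the helper foo_as_extended are made-up names, and it assumes every object gets its kind set correctly when it is created.

```c
#include <stddef.h>

/* Hypothetical type tags; the names are illustrative only. */
enum foo_kind { FOO_PLAIN, FOO_EXTENDED };

typedef struct foo {
    enum foo_kind kind;   /* assumed to be set once, at creation */
    int value;
    int value2;
} foo;

typedef struct foo_extended {
    foo father;           /* must stay the first member */
    float additional;
} foo_extended;

/* Returns obj viewed as a foo_extended, or NULL if it is a plain foo. */
foo_extended *foo_as_extended(foo *obj)
{
    if (obj == NULL || obj->kind != FOO_EXTENDED)
        return NULL;
    return (foo_extended *)obj;   /* father sits at offset 0 */
}
```

A caller holding only a foo * can then test the result for NULL before touching additional, instead of trusting the cast blindly.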

As of C11 (and also supported by some existing compilers as an extension to older standards), you can use an anonymous struct for that:
struct foo_extended {
struct {
int value;
int value2;
};
//we also have some additional data
float additional;
};
This way your substructure has exactly the same layout as foo, in particular as far as the alignment of its parts is concerned: to be compatible between different compilation units, structs that have exactly the same fields in the same order must be laid out identically.
(The impact of your packed pragma is not so clear to me)
Since your foo structure is the first in foo_extended it must always be at offset 0 within that one.

By reading 6.7.2.1/12 and /13 in the C99 draft, I think it can be assumed that two different structs with the same initial members are compatible up to the first different member.
6.7.2.1/12
Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.
6.7.2.1/13
Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
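A layout assumption like this can at least be checked at compile time. The sketch below assumes a C11 compiler for _Static_assert (C99 has no standard equivalent) and reuses the struct names from the question. It only verifies that the shared members sit at the same offsets; it says nothing about whether accessing one struct type through a pointer to the other is permitted.

```c
#include <stddef.h>   /* offsetof */

typedef struct foo {
    int value;
    int value2;
} foo;

typedef struct foo_extended {
    int value;
    int value2;
    float additional;
} foo_extended;

/* Compilation fails if the shared members ever end up at different offsets. */
_Static_assert(offsetof(foo, value) == offsetof(foo_extended, value),
               "value must sit at the same offset in both structs");
_Static_assert(offsetof(foo, value2) == offsetof(foo_extended, value2),
               "value2 must sit at the same offset in both structs");
```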

Related

generic API to initialize structs with common fields

I have a number of structures, which have first 3 fields common, below is
simplified example:
struct my_struct1 {
/* common fields. */
int a;
int b;
int c;
};
struct my_struct2 {
int a;
int b;
int c;
uint32_t d;
uint32_t e;
};
struct my_struct3 {
int a;
int b;
int c;
uint16_t d;
char e;
};
static void func1(struct my_struct1 *s)
{
/* ... */
}
static void func2(struct my_struct2 *s)
{
/* ... */
}
static void func3(struct my_struct3 *s)
{
/* ... */
}
int main(void)
{
struct my_struct1 s = {1, 2, 3};
struct my_struct2 p = {1, 2, 3, 4, 5};
struct my_struct3 q = {1, 2, 3, 4, 'a'};
func1(&s);
func2(&p);
func3(&q);
/* XXX */
func3((struct my_struct3 *)&s);
return 0;
}
Is it safe to typecast s to struct my_struct3 * and pass to func3 and ensure that s or other objects allocated on stack would not be corrupted?
The reason is that I would like to write a generic API that takes a pointer, initializes common fields (which are common for structures). The other function is specific to my_struct* and sets the rest of the fields.
I'm not sure if void * can solve this.
UPDATE
I should mention, that unfortunately I can't change the structures layout, i.e. adding a common part isn't an option, because the code I'm working with is pretty old and I'm not allowed to change its core structures.
The only ugly workaround I'm seeing is to pass void * and enum struct_type parameters to generic_init function, and based on struct_type cast void * to appropriate structure.
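For reference, the workaround described above might be sketched like this. The enumerator names, the INIT_COMMON macro, the return convention, and the values 1, 2, 3 are all assumptions made for illustration; only generic_init, enum struct_type, and the struct definitions come from the question.

```c
#include <stdint.h>

struct my_struct1 { int a; int b; int c; };
struct my_struct2 { int a; int b; int c; uint32_t d; uint32_t e; };
struct my_struct3 { int a; int b; int c; uint16_t d; char e; };

/* Hypothetical tag naming which structure a void * really points to. */
enum struct_type { MY_STRUCT1, MY_STRUCT2, MY_STRUCT3 };

/* Expands to the three common assignments for whichever type p has. */
#define INIT_COMMON(p) do { (p)->a = 1; (p)->b = 2; (p)->c = 3; } while (0)

/* Returns 0 on success, -1 if the tag is not recognised. */
int generic_init(void *obj, enum struct_type type)
{
    switch (type) {
    case MY_STRUCT1: INIT_COMMON((struct my_struct1 *)obj); return 0;
    case MY_STRUCT2: INIT_COMMON((struct my_struct2 *)obj); return 0;
    case MY_STRUCT3: INIT_COMMON((struct my_struct3 *)obj); return 0;
    }
    return -1;
}
```

Each pointer is cast back to the type the object really has, so no structure is ever accessed through the wrong type; the price is that the caller must pass the right tag.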
As far as I can interpret the standard, casting a pointer of type my_struct1* to a pointer of type my_struct3* or vice versa may yield undefined behaviour because of the pointer conversion rules (cf. the C99 standard, ISO/IEC 9899:TC2):
6.3.2.3 Pointers ... (7) A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If
the resulting pointer is not correctly aligned for the pointed-to
type, the behavior is undefined. ...
Hence, as my_struct1 and my_struct3 may have different alignment requirements, a pointer that is correctly aligned for my_struct1 is not necessarily correctly aligned for my_struct3.
But even if you can guarantee that all structs have the same alignment, passing a pointer to an object of type my_struct1 to func3 is, in my opinion, not safe, even if the common members are the first ones in each struct and even if func3 accesses only the common members.
The reason is that a compiler may introduce padding between members:
6.7.2.1 Structure and union specifiers ... (13) Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are
declared. A pointer to a structure object, suitably converted, points
to its initial member (or if that member is a bit-field, then to the
unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
Hence, as my_struct1 and my_struct3 have different sets of members, the rules of how a compiler introduces padding may vary between these two structs. I think that it is unlikely that this happens, but I did not find any statement in the standard that guarantees that padding - even for the first three members - is the same for my_struct1 and my_struct3.
To flesh out what the comments of EOF and Eugene Sh. already explain:
It would not be safe to cast a my_struct1 to a my_struct3, as my_struct3 has more members than my_struct1 and the compiler would not warn at all about accessing those additional members (d and e), which would overwrite whatever lies behind the my_struct1 object. Doing it the other way around may work, as long as my_struct1 exactly corresponds to the start of my_struct3. I am not sure if there is any guarantee in the standard that would cover you there, but I would not bet on it.
The advantages of separating out the common part into a separate structure type are the following:
This reduces repetition in your code, which allows you to change the common part in one place and reduces the risk of errors.
The compiler can check the types passed around; by casting the structs you would effectively be disabling such checks.
There is no need for the common struct to be at the start of a struct; by making it a struct member, the compiler can figure out the correct offsets for you.
struct common {
int a;
int b;
int c;
};
struct my_struct1 {
struct common com;
};
struct my_struct2 {
struct common com;
uint32_t d;
uint32_t e;
};
struct my_struct3 {
struct common com;
uint16_t d;
char e;
};
void init_common(struct common *com)
{
com->a = 1;
com->b = 2;
/* ... */
}
struct my_struct1 s = {{1, 2, 3}};
struct my_struct2 p = {{1, 2, 3}, 4, 5};
struct my_struct3 q = {{1, 2, 3}, 4, 'a'};
init_common(&s.com);
init_common(&p.com);
init_common(&q.com);
It's semi-safe. If the first member is identical, you are guaranteed that a pointer to the structure is also a pointer to the first member. Technically, a compiler could insert arbitrary padding after the first member. In practice, no compiler pads the two structs differently, so if two structs share a first and second member, the second members also have the same offset. However, the offset of your member "b" may not be sizeof(int); ints might be padded to 8 bytes for performance.
To avoid ambiguity, you can explicitly set the common members to a struct "common".

Using macro in C11 anonymous struct definition

The typical C99 way to extend a struct is something like
struct Base {
int x;
/* ... */
};
struct Derived {
struct Base base_part;
int y;
/* ... */
};
Then we may cast a struct Derived * to struct Base * and access x.
I want to access the base members of struct Derived * obj; directly, for example obj->x and obj->y. C11 provides anonymous structs, but as explained here we can use this feature only with anonymous definitions. So how about writing
#define BASE_BODY { \
int x; \
}
struct Base BASE_BODY;
struct Derived {
struct BASE_BODY;
int y;
};
Then I can access the Base members as if they were part of Derived, without any casts or intermediate members. I can cast a Derived pointer to a Base pointer if need be.
Is this acceptable? Are there any pitfalls?
There are pitfalls.
Consider:
#define BASE_BODY { \
double a; \
short b; \
}
struct Base BASE_BODY;
struct Derived {
struct BASE_BODY;
short c;
};
On some implementations it could be that sizeof(struct Base) == sizeof(struct Derived), with layouts like these:
struct Base {
double a;
// Padding here
short b;
};
struct Derived {
double a;
short b;
short c;
};
There is no guarantee that the memory layout at the beginning of the two structs is the same. Therefore you cannot pass this kind of Derived * to a function expecting a Base * and expect it to work.
And even if padding does not mess up the layout, there is still a potential problem with trap representations:
Suppose again that sizeof(struct Base) == sizeof(struct Derived), but c ends up in an area that is covered by the padding at the end of Base. Passing a pointer to this struct to a function that expects a Base * and modifies it might affect the padding bytes too (padding has unspecified values), thus possibly corrupting c and maybe even creating a trap representation.
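One way to keep both direct member access and a real Base object, sketched under the assumption of a C11 compiler, is to overlay a named Base member and an anonymous struct in an anonymous union. The member list has to be written twice (or generated by a macro as in the question), and base_get_x is a made-up helper used only to show the call.

```c
struct Base {
    int x;
};

struct Derived {
    union {
        struct Base base;     /* named view, for passing to Base APIs */
        struct { int x; };    /* anonymous view, so obj->x works */
    };
    int y;
};

/* Takes the base type only. */
int base_get_x(const struct Base *b)
{
    return b->x;
}
```

Because y is placed after the whole union, whose size is at least sizeof(struct Base), a function that writes a complete struct Base through &obj->base cannot spill into y. Reading x through either view relies on the common-initial-sequence rule for unions, which is why both views must declare their members identically and in the same order.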

C inheritance through type punning, without containment?

I'm in a position where I need to get some object-oriented features working in C, in particular inheritance. Luckily there are some good references on Stack Overflow, notably Semi-inheritance in C: How does this snippet work? and Object-orientation in C. The idea is to contain an instance of the base class within the derived class and typecast it, like so:
struct base {
int x;
int y;
};
struct derived {
struct base super;
int z;
};
struct derived d;
d.super.x = 1;
d.super.y = 2;
d.z = 3;
struct base *b = (struct base *)&d;
This is great, but it becomes cumbersome with deep inheritance trees - I'll have chains of about 5-6 "classes" and I'd really rather not type derived.super.super.super.super.super.super all the time. What I was hoping was that I could typecast to a struct of the first n elements, like this:
struct base {
int x;
int y;
};
struct derived {
int x;
int y;
int z;
};
struct derived d;
d.x = 1;
d.y = 2;
d.z = 3;
struct base *b = (struct base *)&d;
I've tested this on the C compiler that comes with Visual Studio 2012 and it works, but I have no idea if the C standard actually guarantees it. Is there anyone that might know for sure if this is ok? I don't want to write mountains of code only to discover it's broken at such a fundamental level.
What you describe here is a construct that was fully portable and would have been essentially guaranteed to work by the design of the language, except that the authors of the Standard didn't think it was necessary to explicitly mandate that compilers support things that should obviously work. C89 specified the Common Initial Sequence rule for unions, rather than pointers to structures, because given:
struct s1 {int x; int y; ... other stuff... };
struct s2 {int x; int y; ... other stuff... };
union u { struct s1 v1; struct s2 v2; };
code which received a struct s1* to an outside object that was either a union u or a malloc'ed object could legally cast it to a union u* if it was aligned for that type, and it could legally cast the resulting pointer to struct s2*, and the effect of accessing either struct s1* or struct s2* would have to be the same as accessing the union via either the v1 or v2 member. Consequently, the only way for a compiler to make all of the indicated rules work would be to say that converting a pointer of one structure type into a pointer of another type and using that pointer to inspect members of the Common Initial Sequence would work.
Unfortunately, compiler writers have said that the CIS rule is only applicable in cases where the underlying object has a union type, notwithstanding the fact that such a thing represents a very rare usage case (compared with situations where the union type exists for the purpose of letting the compiler know that pointers to the structures should be treated interchangeably for purposes of inspecting the CIS), and further since it would be rare for code to receive a struct s1* or struct s2* that identifies an object within a union u, they think they should be allowed to ignore that possibility. Thus, even if the above declarations are visible, gcc will assume that a struct s1* will never be used to access members of the CIS from a struct s2*.
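The form that compilers do honour is the one where the access visibly goes through the union. A minimal sketch, with the elided "other stuff" replaced by a made-up double member and a hypothetical helper u_get_x:

```c
struct s1 { int x; int y; };
struct s2 { int x; int y; double extra; };

/* Both structures are members of one union, so they share storage. */
union u {
    struct s1 v1;
    struct s2 v2;
};

/* Reads the common member x through the union, whichever of v1 or v2
   was written last. */
int u_get_x(const union u *p)
{
    return p->v1.x;
}
```

Here the object really is a union u and the member access names the union, so the Common Initial Sequence guarantee applies without relying on any pointer cast between struct s1* and struct s2*.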
By using pointers you can always create references to base classes at any level in the hierarchy. And if you use some kind of description of the inheritance structure, you can generate both the "class definitions" and factory functions needed as a build step.
#include <stdio.h>
#include <stdlib.h>
struct foo_class {
int a;
int b;
};
struct bar_class {
struct foo_class foo;
struct foo_class* base;
int c;
int d;
};
struct gazonk_class {
struct bar_class bar;
struct bar_class* base;
struct foo_class* Foo;
int e;
int f;
};
struct gazonk_class* gazonk_factory(void) {
struct gazonk_class* new_instance = malloc(sizeof(struct gazonk_class));
if (new_instance == NULL) return NULL; /* allocation failed */
/* wire up the back-references to the embedded base objects */
new_instance->bar.base = &new_instance->bar.foo;
new_instance->base = &new_instance->bar;
new_instance->Foo = &new_instance->bar.foo;
return new_instance;
}
int main(void) {
struct gazonk_class* object = gazonk_factory();
if (object == NULL) return EXIT_FAILURE;
object->Foo->a = 1;
object->Foo->b = 2;
object->base->c = 3;
object->base->d = 4;
object->e = 5;
object->f = 6;
fprintf(stdout, "%d %d %d %d %d %d\n",
object->base->base->a,
object->base->base->b,
object->base->c,
object->base->d,
object->e,
object->f);
free(object);
return 0;
}
In this example you can either use base pointers to work your way back or directly reference a base class.
The address of a struct is the address of its first element, guaranteed.
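Since that guarantee applies at every level of nesting, the long super.super chains can be hidden behind small upcast helpers instead of stored pointers. A sketch with simplified versions of the structs above (the stored base pointers are dropped, and the helper names are invented here):

```c
struct foo_class    { int a; int b; };
struct bar_class    { struct foo_class foo; int c; };
struct gazonk_class { struct bar_class bar; int e; };

/* Upcast helpers: each takes the address of a first member, so no cast
   is needed and the compiler still checks the types. */
struct bar_class *gazonk_as_bar(struct gazonk_class *g)
{
    return &g->bar;
}

struct foo_class *gazonk_as_foo(struct gazonk_class *g)
{
    return &g->bar.foo;
}
```

With these, gazonk_as_foo(obj)->a reads the innermost member, and all three pointers compare equal to the address of the object itself.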

When are anonymous structs and unions useful in C11?

C11 adds, among other things, 'Anonymous Structs and Unions'.
I poked around but could not find a clear explanation of when anonymous structs and unions would be useful. I ask because I don't completely understand what they are. I get that they are structs or unions without the name afterwards, but I have always (had to?) treat that as an error so I can only conceive a use for named structs.
Anonymous union inside structures are very useful in practice. Consider that you want to implement a discriminated sum type (or tagged union), an aggregate with a boolean and either a float or a char* (i.e. a string), depending upon the boolean flag. With C11 you should be able to code
typedef struct {
bool is_float;
union {
float f;
char* s;
};
} mychoice_t;
double as_float(mychoice_t* ch)
{
if (ch->is_float) return ch->f;
else return atof(ch->s);
}
With C99, you'll have to name the union, and code ch->u.f and ch->u.s which is less readable and more verbose.
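For comparison, the C99 spelling would look something like this. The names mychoice99_t and as_float99 are chosen here only to keep the two versions apart:

```c
#include <stdbool.h>
#include <stdlib.h>   /* atof */

typedef struct {
    bool is_float;
    union {
        float f;
        char *s;
    } u;              /* C99 requires a member name here */
} mychoice99_t;

double as_float99(const mychoice99_t *ch)
{
    if (ch->is_float)
        return ch->u.f;
    return atof(ch->u.s);
}
```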
Another way to implement some tagged union type is to use casts. The Ocaml runtime gives a lot of examples.
The SBCL implementation of Common Lisp does use some union to implement tagged union types. And GNU make also uses them.
A typical real-world use of anonymous structs and unions is to provide an alternative view of the data. For example, when implementing a 3D point type:
typedef struct {
union{
struct{
double x;
double y;
double z;
};
double raw[3];
};
}vec3d_t;
vec3d_t v;
v.x = 4.0;
v.raw[1] = 3.0; // Equivalent to v.y = 3.0
v.z = 2.0;
This is useful if you interface to code that expects a 3D vector as a pointer to three doubles. Instead of doing f(&v.x) which is ugly, you can do f(v.raw) which makes your intent clear.
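As a rough illustration of that kind of interface, here is the same type passed to a routine that only knows about arrays of three doubles. dot3 is a made-up function, and the static assertion documents the assumption that the compiler puts no padding between x, y and z:

```c
typedef struct {
    union {
        struct { double x; double y; double z; };
        double raw[3];
    };
} vec3d_t;

/* The named view and the array view only line up if there is no padding. */
_Static_assert(sizeof(vec3d_t) == sizeof(double[3]),
               "x, y, z are assumed to be laid out like double[3]");

/* Hypothetical routine that takes plain arrays of three doubles. */
double dot3(const double a[3], const double b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

A call then reads dot3(v.raw, w.raw), while the rest of the code keeps using v.x, v.y and v.z.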
struct bla {
struct { int a; int b; };
int c;
};
the type struct bla has a member of a C11 anonymous structure type.
struct { int a; int b; } has no tag and the object has no name: it is an anonymous structure type.
You can access the members of the anonymous structure this way:
struct bla myobject;
myobject.a = 1; // a is a member of the anonymous structure inside struct bla
myobject.b = 2; // same for b
myobject.c = 3; // c is a member of the structure struct bla
Another useful implementation is when you are dealing with rgba colors, since you might want access each color on its own or as a single int.
#include <stdint.h>
#include <stdio.h>
typedef struct {
union{
struct {uint8_t a, b, g, r;};
uint32_t val;
};
}Color;
Now you can access the individual rgba values or the entire value, with its highest byte being r (given the member order above and a little-endian byte order), i.e.:
int main(void)
{
Color x;
x.r = 0x11;
x.g = 0xAA;
x.b = 0xCC;
x.a = 0xFF;
printf("%X\n", x.val);
return 0;
}
On a little-endian machine this prints 11AACCFF; on a big-endian one the bytes come out in the opposite order.
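If the packed value has to look the same on every machine, one alternative is to build it with shifts instead of relying on the union's byte order. A minimal sketch; rgba_pack and rgba_red are made-up names, not part of the answer above:

```c
#include <stdint.h>

/* Builds 0xRRGGBBAA from the four channels, regardless of byte order. */
uint32_t rgba_pack(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Extracts the red channel from a packed 0xRRGGBBAA value. */
uint8_t rgba_red(uint32_t val)
{
    return (uint8_t)(val >> 24);
}
```

The union version is more convenient for in-memory access; the shift version is the safer choice when the 32-bit value is written to a file or sent over the wire.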
I'm not sure why C11 allows anonymous structures inside structures. But Linux uses it with a certain language extension:
/**
* struct blk_mq_ctx - State for a software queue facing the submitting CPUs
*/
struct blk_mq_ctx {
struct {
spinlock_t lock;
struct list_head rq_lists[HCTX_MAX_TYPES];
} ____cacheline_aligned_in_smp;
/* ... other fields without explicit alignment annotations ... */
} ____cacheline_aligned_in_smp;
I'm not sure if that example is strictly necessary, except to make the intent clear.
EDIT: I found another similar pattern which is more clear-cut. The anonymous struct feature is used with this attribute:
#if defined(RANDSTRUCT_PLUGIN) && !defined(__CHECKER__)
#define __randomize_layout __attribute__((randomize_layout))
#define __no_randomize_layout __attribute__((no_randomize_layout))
/* This anon struct can add padding, so only enable it under randstruct. */
#define randomized_struct_fields_start struct {
#define randomized_struct_fields_end } __randomize_layout;
#endif
I.e. a language extension / compiler plugin to randomize field order (ASLR-style exploit "hardening"):
struct kiocb {
struct file *ki_filp;
/* The 'ki_filp' pointer is shared in a union for aio */
randomized_struct_fields_start
loff_t ki_pos;
void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
void *private;
int ki_flags;
u16 ki_hint;
u16 ki_ioprio; /* See linux/ioprio.h */
unsigned int ki_cookie; /* for ->iopoll */
randomized_struct_fields_end
};
Well, if you declare variables from that struct only once in your code, why does it need a name?
struct {
int a;
struct {
int b;
int c;
} d;
} e,f;
And you can now write things like e.a, f.d.b, etc.
(I added the inner struct because I think this is one of the most common uses of anonymous structs.)

Using structure in C and C++

I am new to C and I want to know how to access elements inside a structure which is placed inside a structure.
struct profile_t
{
unsigned char length;
unsigned char type;
unsigned char *data;
};
typedef struct profile_datagram_t
{
unsigned char src[4];
unsigned char dst[4];
unsigned char ver;
unsigned char n;
struct profile_t profiles[MAXPROFILES];
} header;
How do I access the elements inside profile_t?
struct profile_t;
The above statement doesn't create an object of type profile_t. What you need to do is -
struct profile_t inObj ;
Then create object for profile_datagram_t. i.e.,
header outObj ; // header typedef for profile_datagram_t
Now you can access elements like -
outObj.inObj.type = 'a' ; // As an example
In C++, the struct keyword isn't necessary when creating an object of a structure type.
On your question edit and comment :
struct profile_t profiles[MAXPROFILES];
profiles is an array of objects of type profile_t. To access the individual object, just use the [] operator. i.e.,
header obj ;
obj.profiles[0].type = 'a' ; // Example
obj.profiles[i], where i can take values from 0 to MAXPROFILES - 1, gives the object at index i.
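Put together, walking over the array might look like the sketch below. MAXPROFILES is given an arbitrary value because the question does not show one, the reading of n as the number of profiles in use is a guess, and total_profile_length is a made-up helper:

```c
#include <stddef.h>

#define MAXPROFILES 4   /* assumed value; the question does not give one */

struct profile_t {
    unsigned char length;
    unsigned char type;
    unsigned char *data;
};

typedef struct profile_datagram_t {
    unsigned char src[4];
    unsigned char dst[4];
    unsigned char ver;
    unsigned char n;                        /* assumed: profiles in use */
    struct profile_t profiles[MAXPROFILES];
} header;

/* Sums the length fields of the first h->n profiles. */
unsigned int total_profile_length(const header *h)
{
    unsigned int total = 0;
    for (size_t i = 0; i < h->n && i < MAXPROFILES; i++)
        total += h->profiles[i].length;
    return total;
}
```

The extra i < MAXPROFILES test keeps the loop inside the array even if n holds a larger value than expected.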
Not sure what happens in C, but in C++, the rest of the stuff aside, the following declares two types.
struct profile_datagram_t
{
struct profile_t;
};
One type is named profile_datagram_t and the other is called profile_datagram_t::profile_t. The inner type declaration is just a forward declaration, so you'll need to define the type after.
struct profile_datagram_t::profile_t
{
// ...
};
Then, you can use the struct as follows:
int main ( int, char ** )
{
profile_datagram_t::profile_t profile;
}
Some compilers support a nonstandard extension to the C language (that I actually rather like, despite it being nonstandard) called anonymous structs (or unions). Code demonstration:
struct x {
int i;
};
struct y {
struct x;
};
int main(void)
{
struct y y; // declares an object; "struct y;" alone would declare nothing usable
y.i = 1; // this accesses member i of the struct x nested in struct y
return 0;
}
In a nutshell, if you don't give the struct (or union) member a name, you can access its members directly from the containing struct (or union). This is useful in situations where you might have given it the name _, and had to do y._.i - the anonymous struct syntax is much simpler. However, it does mean that you have to remember the names of all members of both structs and ensure they never clash.
This is all, of course, a nonstandard extension, and should be used with caution. I believe it works on MSVC and can be enabled in GCC with a switch. Don't know about any other compilers. If you're worried about portability, give the member a proper name.
EDIT: According to the GCC reference (below) this behavior is being added to the upcoming C1X standard, so it won't be nonstandard for long. I doubt MSVC will support C1X since they refuse to support C99 as it is, but at least this feature is becoming part of the standard.
However, the behavior shown above is MSVC only. The C1X syntax (and GCC without the -fms-extensions switch) doesn't allow the unnamed struct member to have a tag; it has to be defined inline:
struct y {
struct {
int i;
};
};
int main(void) {
struct y y;
y.i = 1; // this accesses member i of the anonymous struct nested in struct y
return 0;
}
References for various compilers (they have different names but are the same concept):
GCC (unnamed fields): http://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
MSVC (anonymous structs): http://msdn.microsoft.com/en-us/library/z2cx9y4f.aspx
Basically you can use the following format:
variable = profile_t.element
profile_t.element = ?
EDIT: In your declaration of profile_datagram_t, the proper definition for struct profile_t should be:
struct profile_t someProfile;
Let's say you have:
header profileDiagram1;
struct profile_t profile1;
profileDiagram1.someProfile = profile1;
To access length, type or *data from profile_t:
profileDiagram1.someProfile.type;
profileDiagram1.someProfile.length;
...

Resources