When doing inheritance in pure C like this:
typedef struct {
char name[NAMESIZE];
char sex;
} Person;
typedef struct {
Person person;
char job[JOBSIZE];
} Employee;
typedef struct {
Person person;
char booktitle[TITLESIZE];
} LiteraryCharacter;
I know it is okay to cast an instance of type "LiteraryCharacter" to type "Person" and use it as such. But is it also OK/safe to cast an instance of type "LiteraryCharacter" to type "Employee" and use it as such?
Such a cast is undefined behavior in standard C, although it will work with many compilers.
Even if it works on a compiler you are currently using, be aware that it might break in a future version of the compiler, or with a different compiler. The C standard allows the compiler to assume that pointers to different types don't point to the same memory - except for some well-documented exceptions, which include cast of LiteraryCharacter * to Person *. The code that casts LiteraryCharacter * to Employee * can and will break under a compiler that makes use of this assumption to generate efficient code.
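As a minimal sketch of the one conversion that is well defined here: a pointer to a struct can be converted to a pointer to its first member. The array sizes and the helper names (person_name, name_via_base) are assumptions made for illustration; the question leaves them unspecified.

```c
#include <string.h>

#define NAMESIZE 32    /* assumed size, not given in the question */
#define TITLESIZE 64   /* assumed size, not given in the question */

typedef struct {
    char name[NAMESIZE];
    char sex;
} Person;

typedef struct {
    Person person;               /* must be the FIRST member for the cast to be valid */
    char booktitle[TITLESIZE];
} LiteraryCharacter;

/* Works on any Person, including one embedded in a larger struct. */
static const char *person_name(const Person *p)
{
    return p->name;
}

/* Hypothetical helper: stores a name, then reads it back through a Person pointer. */
static const char *name_via_base(LiteraryCharacter *lc, const char *name)
{
    strncpy(lc->person.name, name, NAMESIZE - 1);
    lc->person.name[NAMESIZE - 1] = '\0';

    /* Preferred: take the address of the member, no cast needed. */
    const Person *p1 = &lc->person;
    /* Also valid: a struct pointer, suitably converted, points to its first member. */
    const Person *p2 = (const Person *)lc;

    return (p1 == p2) ? person_name(p1) : NULL;
}
```

Taking &lc->person avoids the cast entirely and keeps working even if the member is later moved away from the first position.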
It might be a good idea to explain why you think you need this cast in the first place. Its equivalent would be quite incorrect in C++, and would throw a ClassCastException in Java. After all, LiteraryCharacter doesn't have the fields of Employee, such as job.
That would be undefined behaviour. There is no such thing as inheritance in C.
You might get away with it if the structures had exactly the same layout, but that would be very brittle code indeed.
Undefined behavior:
In your example you are casting between incompatible struct types, which doesn't even compile: C only permits casts between scalar types, so one struct value cannot be cast to another. You can, however, cast between compatible (or even incompatible) pointer types, and it will compile.
Casting between incompatible (or rather pseudo-compatible) pointer types and then using the result is undefined behavior: with some structures and compiler implementations it will work, with others it won't.
Casting to a compatible pointer type is safe, because the compiler takes care of the object structure, internal alignment, memory layout, etc. (there's an example below).
The "OK" part is debatable. Will it work? Yes because they have the same "in memory" layout and you are aware of that particular fact. But what happens when another programmer created that structure? You have to read its source to assess if it can be safely cast to another type.
Encapsulation and future proofing:
Also, most encapsulation techniques only expose an "opaque type" that you use via pointers: you pass it along to functions implemented within the same library/package, and they do all of the structure-specific work. With an opaque type you only know that the structure exists; you don't know its internal structure, so you can't just change the pointer type you're using to access it, because you can't be sure of the structure compatibility behind it.
The point of encapsulation is to allow modules to be developed individually. So when the original developer of this external module changes the structure in a future version it breaks compatibility with your code (which shouldn't be using it that way anyway).
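A rough sketch of the opaque-type idea, squeezed into one file for brevity. The counter type, its functions, and the header/implementation split marked in the comments are all hypothetical names chosen for illustration.

```c
#include <stdlib.h>

/* ---- counter.h (hypothetical public header) ---- */
struct counter;                               /* incomplete type: layout is hidden */
struct counter *counter_create(int start);
void counter_increment(struct counter *c);
int counter_value(const struct counter *c);
void counter_destroy(struct counter *c);

/* ---- counter.c (hypothetical implementation file) ---- */
struct counter {
    int value;    /* free to change in a later version without breaking callers */
};

struct counter *counter_create(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;
}

void counter_increment(struct counter *c)
{
    c->value++;
}

int counter_value(const struct counter *c)
{
    return c->value;
}

void counter_destroy(struct counter *c)
{
    free(c);
}
```

Code that only sees the header part cannot dereference a struct counter * or cast it to anything meaningful, which is exactly the point: the layout stays private to the module.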
Safer way to do this:
This is a non-standard (yet pretty much ubiquitous) compiler extension; GCC enables it with -fms-extensions.
typedef struct {
char name[NAMESIZE];
char sex;
} Person;
typedef struct {
Person;
char job[JOBSIZE];
} Employee;
typedef struct {
Employee;
char booktitle[TITLESIZE];
} LiteraryEmployeeCharacter;
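/* needs <stdlib.h> and <string.h> */
LiteraryEmployeeCharacter *lec = malloc(sizeof *lec);
/* arrays cannot be assigned with =, so copy the strings in
   (each literal must fit in its destination array) */
strcpy(lec->name, "My character");
lec->sex = 'F';
strcpy(lec->job, "Steward");
strcpy(lec->booktitle, "A book");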
By using unnamed fields you get access to name, sex and job as if they were defined in the LiteraryEmployeeCharacter structure.
By casting to a compatible pointer type the compiler knows the exact structure and position of every field and can easily handle the structure correctly.
Structs in C declare a data structure that associates different data types into a contiguous piece of memory.
Typedefs are a way to create user-defined data type names. This is useful for many applications including <stdint.h>
Structs seem to be exclusively used with typedefs. It seems like the default behaviour of defining a struct should also define a typedef.
Why would I ever want to define struct without also using a typedef?
It's a matter of personal preference, possibly imposed onto other people working on the same project as a convention. For instance, the Linux kernel coding style guide discourages the introduction of new typedefs in no uncertain terms.
Though I don't necessarily agree with everything in that guide, some of which seems silly (for instance, the vps_t a; example could just as well be virtual_container_t a;, so the issue hinges on the cryptic name chosen for the typedef more than on the existence of the typedef), here are some raw stats from my TXR language project:
txr$ git grep '^typedef struct' '*/*.[ch]' '*.[ch]' | wc
25 91 839
txr$ git grep '^struct' '*/*.[ch]' '*.[ch]' | wc
135 528 4710
Lines of code beginning with struct outnumber typedef struct lines by a factor of 5.4!
The struct/union tag namespace feature of C means that you can have a variable called foo in the same scope as a struct foo without a clash, which is useful. This extra namespace gives us an opportunity not to pollute the regular identifier namespace with user-defined type names, which improves hygiene.
The cost is that we have to type struct foo instead of just foo in declarations.
It makes particular sense if you have experience with languages in which class/type names do not intrude into the lexical variable namespace, like Common Lisp.
The above code base, though, compiles as C++, so that throws a bit of a monkey wrench into it. In C++, struct foo defines foo as a type, which can be referenced in the ordinary namespace.
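The separate tag namespace can be seen in a few lines. This is only an illustration; the function name tag_and_variable is made up for the example.

```c
struct foo {
    int value;
};

/* Hypothetical demo: a variable named foo coexists with the tag "struct foo". */
static int tag_and_variable(void)
{
    struct foo foo = { 42 };   /* tag and variable live in different namespaces */
    return foo.value;
}
```

Had the type been introduced as typedef struct { int value; } foo;, a local variable called foo would hide the type name for the rest of that scope.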
Another reason is that there is some clarity. When we see a declaration like:
struct foo x;
we know that x is a structure, whereas
foo x;
could be anything; it could be typedef double foo. Sometimes we go for that kind of abstraction. When you want to hide how something is implemented, reach for typedef. However, typedef doesn't provide perfect abstraction.
Also, if we see:
struct foo f;
struct bar b;
we know that these are necessarily different, incompatible types. We do not know that given:
foo f;
bar b;
They could both be typedefs for the same structure or for int for all we know.
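To make that concrete, here is a small sketch in which two typedef names turn out to be the same type. The struct point definition and the function name are assumptions for illustration only.

```c
struct point {
    int x;
    int y;
};

typedef struct point foo;   /* two different names ...            */
typedef struct point bar;   /* ... for one and the same type      */

/* Hypothetical demo: assignment between the aliases compiles without a cast. */
static int assign_across_aliases(void)
{
    foo f = { 1, 2 };
    bar b;
    b = f;                  /* fine: foo and bar are both struct point */
    return b.x + b.y;
}
```

With two distinct tagged types, struct foo and struct bar, the same assignment would be rejected by the compiler, and that is visible from the declarations alone.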
If you typedef pointer types, but the code dereferences them, it looks pretty silly:
foo x = get_foo();
char *fname = x->name; /* what? */
That kernel coding style document has this to say about the above: In general, a pointer, or a struct that has elements that can reasonably be directly accessed should never be a typedef.
Not using typedef for structures has as much to do with some of the hygiene of using the tag namespace, as it has to do with avoiding typedef as such: keeping the code explicit, and reserving typedef for situations in which we actually need a proper abstraction.
Structs seem to be exclusively used with typedefs.
This is a mistaken impression. Structs are frequently used without typedefs, and I personally prefer that. There are numerous struct types declared and used, without built-in typedefs, by the C standard library and POSIX standard library extension functions, for example.
It seems like the default behavior of defining a struct should also define a typedef.
In C++, it effectively does.
Why would I ever want to define struct without also using a typedef?
Why would you ever want to define a struct with a typedef? Using a (tagged) structure type via the struct keyword and its tag clarifies what kind of type it is, and enables you to determine quickly by eye whether two types are the same. On the other hand, a typedefed alias can represent any type at all, and there can be multiple such aliases for the same type.
There are some good and appropriate uses of typedef, but there are a lot of other uses whose propriety is a code style consideration. Myself, I strongly prefer styles that minimize use of typedef.
typedef is just a convenience to allow you to refer to your struct without explicitly stating struct MyStruct every time you refer to it.
Some actually prefer this explicitness, making it clear you're working with a user-defined type.
I never typedef my structs, so that they can be forward-declared in other headers more easily.
You ask: Why would I ever want to define struct without also using a typedef?
It depends what you mean by 'define a struct'. Not all uses of struct are to define named types.
For example you could define a variable via
struct
{ double a, b;
} dbls = { -1.0, 1.0};
Then dbls.a etc makes sense, but there is no named type.
Similarly in an anonymous union you might have
struct ructT
{ union
{ struct { int a; int b; } ints;
struct { float a; float b; } floats;
};
};
Here one has defined a type struct ructT but the inner structs are unnamed.
In each of these cases if one was to insist on typedefs, there would be more names to dream up, and more code to type.
typedef isn't just about saving a few keystrokes - it's about abstracting away implementation details of the underlying type. IOW, if you provide a typedef name for a struct type, you should also provide a complete API for setting and accessing members, formatting for output, allocating, deallocating, etc. You're hiding the "struct"-ness of the type from whoever is using it. Think about the FILE type in the standard library - that's a typedef name, usually for some implementation-specific structure. However, you never access any element of a FILE type directly - you just pass FILE * objects to various functions (fprintf, fread, feof, ferror, etc.) that hide the implementation from you.
If you're expecting the user of the type to explicitly access members with the . or -> operators, then don't create a typedef name for it - just leave it as struct whatever. Otherwise you create a "leaky" abstraction, which creates heartburn down the line.
Similar rule for pointer types - don't hide the "pointer"-ness of a type behind a typedef unless you're willing to create a full API to abstract away pointer operations as well.
I have a library with the following structure:
struct frame_meta_data
{
uint8_t id;
uint8_t general_field_1;
uint8_t general_field_2;
...
uint8_t user_data[16];
};
And I would like users of the library to be able to save custom data into frame objects (that's what the user_data field is for).
However when trying to cast user_data into a custom structure:
struct frame_meta_data cur_frame;
...
#define USER_HDR ((struct my_user_header*)cur_frame.user_data)
I get the following warning:
warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
#define USER_HDR ((struct my_user_header*)cur_frame.user_data)
How can I work around this?
Thanks in advance.
Reinterpreting addresses like that isn't allowed by the C standard. Strict aliasing means that compilers are free to assume two pointers of different types will never point at the same object, and then make all sorts of optimizations based on that.
Your code violates the C standard and has undefined behavior on account of that. But you can fix it still. Like melpomene suggested in the comments, don't cast, but use memcpy:
struct my_user_header obj;
memcpy(&obj, cur_frame.user_data, sizeof obj);
Alternatively, some compilers allow you to write non-standard code with a compiler option, such as GCC's -fno-strict-aliasing.
If you know what you are doing, you can disable that warning. There is however a potential problem.
Assume that the structure you want to use contains something that is larger than 1 byte, for example a 4-byte integer. If you simply cast that user_data field to your structure, the int may not be aligned to a 4-byte boundary as it should be. This might result in a runtime exception on some architectures.
Using memcpy should solve that problem though. And remove the warning.
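A minimal sketch of the memcpy approach in both directions. The fields of struct my_user_header and the two helper functions are assumptions (the question does not show them), and the elided members of frame_meta_data are left out. _Static_assert assumes a C11 compiler.

```c
#include <stdint.h>
#include <string.h>

#define USER_DATA_SIZE 16

struct frame_meta_data {
    uint8_t id;
    uint8_t user_data[USER_DATA_SIZE];
};

/* Hypothetical user header; the real fields are not shown in the question. */
struct my_user_header {
    uint32_t sequence;
    uint16_t flags;
};

/* C11: refuse to compile if the header does not fit into the user area. */
_Static_assert(sizeof(struct my_user_header) <= USER_DATA_SIZE,
               "user header too large for user_data");

/* Copy the header's bytes into the frame; no pointer cast, no aliasing issue. */
static void frame_set_user_header(struct frame_meta_data *f,
                                  const struct my_user_header *h)
{
    memcpy(f->user_data, h, sizeof *h);
}

/* Copy the bytes back out into a properly typed and aligned object. */
static struct my_user_header frame_get_user_header(const struct frame_meta_data *f)
{
    struct my_user_header h;
    memcpy(&h, f->user_data, sizeof h);
    return h;
}
```

Because the caller always works on a local copy, neither the alignment of user_data nor the strict-aliasing rule comes into play; compilers typically turn such small fixed-size memcpy calls into plain loads and stores.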
I suspect it's because you're involving copies of this USER_HDR macro expression in multiple accesses to the data area, and it's confusing the compiler. Or perhaps you're even mixing USER_HDR accesses with manipulations of the underlying char array.
If the data area is only being initialized and accessed as that given user data type, there isn't any aliasing going on. Ensure you use it that way.
I would provide an external (as in, non-inlined, external linkage) API function which, given a frame object, returns a void * to the associated user data:
struct foobar *fbs = (struct foobar *) frame_get_userdata(fr);
// now work just with fbs
The cast isn't necessary; that's my style.
Depending on what precedes the user data, it might not be suitably aligned for arbitrary use. One easy way to fix that would be to make it the first struct member, if that option is available. Otherwise there are various fairly portable tricks involving making a union between a char array and various types like long double and whatnot, or else using compiler-specific constructs, like __attribute__((aligned)) with GCC.
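If a C11 compiler is available, the alignment part can also be handled with alignas instead of a union trick or a compiler-specific attribute. This is only a sketch: the elided fields are omitted, and the body of frame_get_userdata is an assumption about what such an accessor could look like.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

struct frame_meta_data {
    uint8_t id;
    uint8_t general_field_1;
    uint8_t general_field_2;
    /* C11: align the user area for any ordinary object type. */
    alignas(max_align_t) uint8_t user_data[16];
};

/* Assumed accessor: hands out the user area as an untyped pointer. */
void *frame_get_userdata(struct frame_meta_data *fr)
{
    return fr->user_data;
}
```

This only addresses alignment. It increases the size and alignment of the whole struct, which matters if frames are stored in bulk or sent over the wire.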
Sorry if this has been asked before, I wasn't really even sure what to search for to come up with this.
When I create a typedef struct, I usually do something like this:
typedef struct myStruct {
int a;
int b;
struct myStruct *next;
} MyStruct;
So I declare it with MyStruct at the end. Then when I create functions that pass that in as a parameter, I write
int doSomething(MyStruct *ptr){
}
Yet I am collaborating with a friend on a project and I have come across his coding style, which is to also declare *MyStructP like this:
typedef struct myStruct {
int a;
int b;
struct myStruct *next;
} MyStructR, *MyStructP;
And then he uses MyStructP in his functions, so his parameters look like:
int doSomething(MyStructP)
So he doesn't have to use the * in the parameter list. This confused me because when I look at the parameter list, I always look for the * to determine if the arg is a pointer or not. On top of that, I am creating a function that takes in a struct I created and a struct he created, so my arg has the * and his does not. Ultra confusing!!
Can someone give insight/comparison/advice on the differences between the two? Pros? Cons? Which way is better or worse, or more widely used? Any information at all. Thanks!
It is generally considered poor style to hide pointers behind typedefs, unless they are meant to be opaque handles (for example SDL_GLContext is a void*).
This being not the case here, I agree with you that it's more confusing than helping.
The Linux kernel coding style says to avoid these kinds of typedefs:
Chapter 5: Typedefs
Please don't use things like "vps_t".
It's a mistake to use typedef for structures and pointers. When you see a
vps_t a;
in the source, what does it mean?
In contrast, if it says
struct virtual_container *a;
you can actually tell what "a" is.
Some people like to go with ideas from Hungarian Notation when they name variables. And some people take that concept further when they name types.
I think it's a matter of taste.
However, I think it obscures things (like in your example) because you'd have to dig up the declaration of the name in order to find its type. I prefer things to be obvious and explicit, and I would avoid such type names.
(And remember, typedef does not introduce a new type but merely a new name that aliases an existing type.)
The main good reason why people occasionally typedef pointers is to represent the type as a "black box object" to the programmer and to allow its implementation to more easily be changed in the future.
For example, maybe today the type is a pointer to a struct but tomorrow the type becomes an index into some table, a handle/key of some sort, or a file descriptor. Typedef'ing this way tells the programmer that they shouldn't try things they might normally do to a pointer such as comparing it against 0 / NULL, dereferencing it (e.g. - directly accessing members), incrementing it, etc., as their code may become broken in the future. Of course, using a naming convention, such as your friend did, that reveals and encodes that the underlying implementation actually is a pointer conflicts with that purpose.
The other reason to do this is to make this kind of error less likely:
MyStructR *ptr1, ptr2;
MyStructP ptr3, ptr4;
That's pretty weak sauce as the compiler will typically catch you misusing ptr2 later, but that is a reason given for doing this.
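The difference between those two declarations can be checked at compile time. The sketch below assumes a C11 compiler for _Generic; the IS_POINTER macro and the count_pointers function are hypothetical helpers for the demonstration.

```c
typedef struct myStruct {
    int a;
    int b;
    struct myStruct *next;
} MyStructR, *MyStructP;

/* C11: yields 1 if the expression has type "pointer to MyStructR", else 0. */
#define IS_POINTER(x) _Generic((x), MyStructR *: 1, default: 0)

/* Hypothetical demo: counts how many of the four variables are pointers. */
static int count_pointers(void)
{
    MyStructR *ptr1 = 0, ptr2 = { 0 };   /* the * binds to ptr1 only; ptr2 is a struct */
    MyStructP ptr3 = 0, ptr4 = 0;        /* both are pointers */

    return IS_POINTER(ptr1) + IS_POINTER(ptr2)
         + IS_POINTER(ptr3) + IS_POINTER(ptr4);
}
```

Only three of the four variables are pointers: ptr2 is a whole MyStructR object.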
How to literally translate the following empty C struct inside struct to Delphi (from winnt.h):
typedef struct _TP_CALLBACK_ENVIRON_V3 {
...
struct _ACTIVATION_CONTEXT *ActivationContext;
...
} TP_CALLBACK_ENVIRON_V3;
I'm inclined to use just Pointer since this structure must not be manipulated and it's a pointer anyway. I'm just curious how would one translate it literally (if possible). I was thinking about something like this:
type
PActivationContext = ^TActivationContext;
TActivationContext = record
end;
TTPCallbackEnvironV3 = record
...
ActivationContext: PActivationContext;
...
end;
But, you know, an empty record... So, how would you literally translate the above structure to Delphi?
The C struct is what is known as an incomplete type. The C code is a common technique used to implement an opaque pointer. By implementing it this way in C you have type safety in the sense that variables of type struct _ACTIVATION_CONTEXT* are not assignment compatible with other pointers. Well, apart from void* pointers which are assignment compatible with all pointer types.
In Delphi there is no such thing as an incomplete type. So I think that the best solution is exactly what you have proposed. It's not particularly important to mimic the C code exactly. What you are aiming for is to have the benefits, specifically type safety. And what you propose is probably as good as you will get.
On the other hand, it depends how visible this type is. If it is very private, perhaps declared only in the implementation section of a unit, and used sparingly, then you may take the stance that declaring an empty record is a little over the top. You may conclude that PActivationContext = Pointer is reasonable.
I've seen this oddity in many APIs I have used:
typedef type_t *TYPE;
My point is that declaring a variable of type TYPE will not make it clear that in fact a pointer is declared.
Do you, like me, think that this brings a lot of confusion? Is this meant to enforce encapsulation, or there are other reasons as well? Do you consider this to be a bad practice?
In general, it's a bad practice. The significant problem is that it does not play well with const:
typedef type_t *TYPE;
extern void set_type(TYPE t);
void foo(const TYPE mytype) {
set_type(mytype); // Error expected, but in fact compiles
}
In order for the author of foo() to express what they really mean, the library that provides TYPE must also provide CONST_TYPE:
typedef const type_t *CONST_TYPE;
so that foo() can have the signature void foo(CONST_TYPE mytype), and at this point we have descended into farce.
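A short sketch of why the const does not do what the author of foo() expected. The definition of type_t and the function name are assumptions for illustration, since the original only names the type.

```c
/* Assumed definition; the answer above only names type_t. */
typedef struct type_t {
    int field;
} type_t;

typedef type_t *TYPE;

/* "const TYPE" means "type_t *const": the pointer itself is const,
   but the object it points to is still writable. */
static void modify_through_const_typedef(const TYPE t)
{
    t->field = 99;    /* compiles: the pointee is not const */
    /* t = 0; */      /* this is the only thing the const forbids */
}
```

With the parameter declared as const type_t *t instead, the assignment to t->field would be rejected at compile time.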
Hence a rule of thumb:
Make typedefs of structs (particularly incomplete structs), not pointers to those structs.
If the definition of the underlying struct is not to be publicly available (which is often laudable), then that encapsulation should be supplied by the struct being incomplete, rather than by inconvenient typedefs:
struct type_t;
typedef struct type_t type_t;
void set_type(type_t *);
int get_type_field(const type_t *);
A common idiom is to suffix the type with _p to indicate that it's a pointer while still retaining the pointery qualities.
Sometimes it is necessary to use only the pointer type if the struct that it is pointing to is not publicly available. This helps facilitate data hiding. I.e.
typedef struct hidden_secret_object * object;
void change_object(object foo);
this allows you to change the way that hidden_secret_object is structured without breaking external code.
I don't find it clear either. I'm not fond of full capitalised types either (I try to reserve those for #defines).
This way makes it easy to kid oneself by thinking it is in fact a value type, while we're talking about a pointer type. The pointer type can be completely abstracted away with smart pointers, but that isn't common practise in C.
Suffixing with (as mentioned previously) _p, _ptr, Pointer or anything along those lines creates clarity; increases typing, that's true, but will prevent you from silly mistakes (such as using '.' instead of '->', ...) costing you valuable developing time.
It depends on what you are trying to achieve. There is no meaningful "yes or no" answer to your question the way it is stated.
If you are trying to create an abstract handle kind of type, implying that the user is not supposed to know or care what is hiding behind the type, then typedef-ing a pointer type is perfectly fine. The whole point is that today it might be a pointer type, and tomorrow it might become an integer type, and later it might become something else. This is exactly what pointer type typedefs are normally used for in most library interfaces.
You are saying that sometimes it is "not clear that a pointer is declared". But under this usage model that's exactly the point! It is supposed to be "not clear". The fact that the type happens to be an obfuscated pointer is none of your business. It is something that you don't need to know and are not supposed to rely upon.
A classic example of this usage model is the va_list type in the standard library. In some implementations it might easily be a typedef for a pointer type. But that's something you are not supposed to know or rely upon.
Another example would be the definition of HWND type in Windows API. It is a typedef for pointer type as well, but that's none of your business.
A completely different situation is when you are typedef-ing a pointer type as a form of shorthand, just to make the declarations shorter by not having to type the * character every time. In this case the fact that the typedef is (and will always be) standing for a pointer type is exposed to the user. Normally this usage is not a good programming practice. If users want to create an alias to avoid typing * every time, they can do it by themselves.
This usage model usually leads to more obfuscated code for the reasons you already mentioned in your OP.
Example of this bad usage of typedefs can also be found in Windows API. Typedef names like PINT follow exactly that flawed usage model.
I don't think it's bad practice if it's a pointer to an incomplete type, or if for any other reason the user isn't expected to dereference it. I never understood FILE*.
I also don't think it's bad practice if you're doing it because you have several levels of indirection, and you want to use it in situations where some of them are irrelevant. typedef char **argarray, or something.
If the user is expected to dereference it then in C, I think it's probably best to retain the *. In C++, people are used to user-defined types with overloaded operator*, such as iterators. In C that's just not normal.
Type qualifiers like 'const' work differently with typedef'ed pointers than with 'natural' ones. While this isn't typically a good thing with 'const', it can be very useful with compiler-specific storage classes like "xdata". A declaration like:
xdata WOKKA *foo;
will declare "foo" to be a pointer, stored in the default storage class, to a WOKKA in xdata. A declaration:
xdata WOKKA_PTR bar;
would declare "bar" to be a pointer, stored in xdata, to a WOKKA in whatever storage class was specified in WOKKA_PTR. If library routines are going to expect pointers to things with a particular storage class, it may be useful to define those storage classes within the pointer types.
It's bitten me in the ass on occasion:
for (vector<typedef_name_that_doesnt_indicate_pointerness_at_all>::iterator it = v.begin();
it != v.end(); ++it)
{
it->foo(); // should have been written (*it)->foo();
}
The only time it's acceptable is if the type is meant to be truly opaque and not accessed directly at all. IOW, if someone's going to have to dereference it outside of an API, then the pointerness should not be hidden behind a typedef.
Maybe a way to make it more specific would be to call the new pointer type type_ptr or something like that:
typedef type_t* type_ptr;