Suppose you are writing a library that internally uses certain data structures and want to export only a subset of them to the user (or hide the exact type using something like void *). The definitions for all the structs and functions used in the library are in a header library.h, which is used when building the library.
Is it considered good practice to also produce another copy of library.h that would not be used during the build process but only by users linking to the library?
For example suppose the library internally uses the following library.h:
#ifndef LIBRARY_H
#define LIBRARY_H
struct myStruct {
int some_x;
void (*some_callback)(void);
};
typedef struct myStruct *myStruct_t;
#endif
We would like to hide the definition of myStruct from the user, so we export a header library.h that is:
#ifndef LIBRARY_H
#define LIBRARY_H
typedef void *myStruct_t;
#endif
Is it considered good practice to also produce another copy of library.h that would not be used during the build process but only by users linking to the library?
No. While the details of a best practice for what you want to do are probably a matter of taste, delivering headers that are not used during the build is objectively not good practice: you risk introducing typing errors that are never caught when you build your project.
So, without going into details on how you should organize that, what you should definitely do is have each "private" header #include the respective "public" header and not repeat public declarations in the private header. For your example, this would look e.g. like:
library.h:
#ifndef LIBRARY_H
#define LIBRARY_H
typedef struct myStruct *myStruct_t;
// there's absolutely no need to use void * here. An incomplete struct
// type is perfectly fine as long as only pointers to it are used.
#endif
library_internal.h:
#ifndef LIBRARY_INTERNAL_H
#define LIBRARY_INTERNAL_H
#include "library.h"
struct myStruct {
int some_x;
void (*some_callback)(void);
};
#endif
Additional "best practice" notes:
Don't hide pointers behind typedefs. Most C programmers are well aware that a pointer is part of the declarator and expect to explicitly see a pointer when there is one. Dereferencing something that doesn't look like a pointer will just cause confusion for others reading the code. You also might confuse consumers of your library into expecting a myStruct_t to exhibit call-by-value semantics.
Don't define your own types with the _t suffix. At least in POSIX, this is reserved for the implementation (of the compiler/runtime). There's nothing wrong with defining a type of the same name as a struct tag.
Example with these additional suggestions:
library.h:
#ifndef LIBRARY_H
#define LIBRARY_H
typedef struct myStruct myStruct;
#endif
library_internal.h:
#ifndef LIBRARY_INTERNAL_H
#define LIBRARY_INTERNAL_H
#include "library.h"
struct myStruct {
int some_x;
void (*some_callback)(void);
};
#endif
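A minimal sketch of how this split plays out in practice, with both headers and the source file shown inline in one file for brevity. The functions myStruct_create, myStruct_x and myStruct_destroy are hypothetical names chosen for illustration; they are not part of the question.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- library.h (public): only the incomplete type and the API ---- */
typedef struct myStruct myStruct;

myStruct *myStruct_create(int x);          /* hypothetical constructor */
int       myStruct_x(const myStruct *s);   /* hypothetical accessor    */
void      myStruct_destroy(myStruct *s);   /* hypothetical destructor  */

/* ---- library_internal.h (private): completes the type ---- */
struct myStruct {
    int some_x;
    void (*some_callback)(void);
};

/* ---- library.c: includes library_internal.h, so it sees the members ---- */
myStruct *myStruct_create(int x)
{
    myStruct *s = malloc(sizeof *s);
    if (s != NULL) {
        s->some_x = x;
        s->some_callback = NULL;
    }
    return s;
}

int myStruct_x(const myStruct *s)
{
    return s->some_x;
}

void myStruct_destroy(myStruct *s)
{
    free(s);
}

/* ---- user code: in a real project this would include only library.h ---- */
int library_demo(void)
{
    myStruct *s = myStruct_create(42);
    assert(s != NULL);
    int x = myStruct_x(s);
    myStruct_destroy(s);
    return x;
}
```

Because library.c is compiled against the same library.h that users receive, any mismatch between the public prototypes and their definitions is diagnosed when the library itself is built.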
Notice that the C standard doesn't guarantee that a pointer to void has a representation that is compatible with a pointer to a struct! Thus:
typedef struct myStruct *myStruct_t;
typedef void *myStruct_t;
these two are not compatible and cannot be used in a strictly conforming program.
Another thing is that you usually shouldn't hide pointers, unless needed. Consider for example the FILE in the standard library. Its contents are not defined anywhere, but all the functions specifically return a pointer to it and accept a pointer to it.
You can even use a simple struct declaration, instead of definition:
struct myStruct;
Then external users can define a variable as a pointer to it
struct myStruct *handle;
Or if you wish to hide the fact that it indeed is a struct, use a typedef:
typedef struct myStruct myStruct;
Then the users of the external interface can define their variables simply as
myStruct *handle;
Related
I have the following files:
A.h
#ifndef __A_H_
#define __A_H_
#include <B.h> // contains foo_t
typedef struct {
foo_t foo;
...
} baz_t;
#endif
B.h
#ifndef __B_H_
#define __B_H_
#include <A.h> // contains baz_t
typedef struct {
...
} foo_t;
extern int useful_func(baz_t d);
#endif
When I compile this, B.h fails to compile with error: unknown type name 'baz_t'
I am assuming this error is owing to circular dependency between the two files. But I am wondering how do I forward declare baz_t to solve this? I found answers relating to circular dependencies between structs. But I am unsure how I would solve this. I would appreciate some help here. I am looking for a strictly C99 solution.
EDIT
I previously forgot to mention this but I have already used include guards.
A very obvious solution, as suggested by user KamilCuk, is moving useful_func to A.h. This has also occurred to me, but in terms of software organization useful_func unfortunately belongs in B.h. This problem could be a reflection of poor design as well.
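You can use a forward declaration. Note that it only gives you an incomplete type, so you can store a pointer to it but cannot embed it by value:
struct foo;
typedef struct {
struct foo *foo; /* pointer: struct foo is still incomplete here */
...
} baz_t;
B.h then has to define the type with that tag (typedef struct foo { ... } foo_t;), after which you can use it normally.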
That said, circular dependencies can be avoided if you define everything in a third header that you use as the interface. It would be cleaner, but it's not always possible.
Typically, one would pass pointers to structures rather than passing them by value. If that would be acceptable, things are easy, since C compilers will accept a declaration:
void doSomethingWithAFoo(struct foo *it);
without regard for whether they have seen any definition for struct foo. Indeed, compilers will even accept function definitions like:
void doSomethingWithAFooTwice(struct foo *it)
{
doSomethingWithAFoo(it);
doSomethingWithAFoo(it);
}
without having to know or care about whether, where, or how struct foo is defined.
Note that an advantage of using struct tag syntax rather than typedef names is that prototypes using the struct tag syntax don't require declaring or defining anything that can't be harmlessly redeclared arbitrarily many times.
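A minimal sketch of that approach applied to the A.h/B.h situation, with the contents of both headers shown inline in one file. It makes two assumptions that differ from the question: the structs are given tags (struct foo, struct baz) instead of being anonymous typedefs, and useful_func takes a pointer rather than a value. The members value and extra are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* ---- B.h: needs struct baz only through a pointer ---- */
struct baz;                              /* forward declaration, no #include of A.h */

struct foo {
    int value;                           /* hypothetical member */
};

int useful_func(const struct baz *d);    /* fine: pointer to an incomplete type */

/* ---- A.h: needs the complete struct foo, so it would #include "B.h" ---- */
struct baz {
    struct foo foo;                      /* by value: struct foo is complete here */
    int extra;                           /* hypothetical member */
};

/* ---- B.c: would include A.h, so struct baz is complete here ---- */
int useful_func(const struct baz *d)
{
    assert(d != NULL);
    return d->foo.value + d->extra;
}

int circular_demo(void)
{
    struct baz b = { { 40 }, 2 };
    return useful_func(&b);
}
```

With this layout only A.h includes B.h, so the include cycle disappears while useful_func stays declared in B.h.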
I tried to do data encapsulation in C based on this post here https://alastairs-place.net/blog/2013/06/03/encapsulation-in-c/.
In a header file I have:
#ifndef FUNCTIONS_H
#define FUNCTIONS_H
// Pre-declaration of struct. Contains data that is hidden
typedef struct person *Person;
void getName(Person obj);
void getBirthYear(Person obj);
void getAge(Person obj);
void printFields(const Person obj);
#endif
In functions.c I have defined the structure like this:
#include "Functions.h"
enum { SIZE = 60 };
struct person
{
char name[SIZE];
int birthYear;
int age;
};
plus I have defined the functions as well.
In main.c I have:
#include "Functions.h"
#include <stdlib.h>
int main(void)
{
// Works because *Person makes new a pointer
Person new = malloc(sizeof new);
getName(new);
getAge(new);
getBirthYear(new);
printFields(new);
free(new);
return 0;
}
Is it true that when I use Person new, new is already a pointer because of typedef struct person *Person;?
How is it possible that the linker cannot see the body and members that I have declared in my struct person?
Is this only possible using a pointer?
Is the correct (and only) way to implement OOP principles in my case to make a different struct in functions.h like so:
typedef struct classPerson
{ // This data should be hidden
Person data;
void (*fPtrGetName)(Person obj);
void (*fPtrBirthYear)(Person obj);
void (*fPtrGetAge)(Person obj);
void (*fPtrPrintFields)(const Person obj);
} ClassPerson;
First of all, it is usually better to not hide pointers behind a typedef, but to let the caller use pointer types. This prevents all kinds of misunderstandings when reading and maintaining the code. For example void printFields(const Person obj); looks like nonsense if you don't realize that Person is a pointer type.
Have I understood correctly that when I use Person new, new is already a pointer because of typedef struct person *Person;?
Yes. You are confused because of the mentioned typedef.
How is it possible that the linker cannot see the body and members that I have declared in my struct person?
The linker can see everything that is linked, or you wouldn't end up with a working executable.
The compiler however, works on "translation units" (roughly means a .c file and all its included headers). When compiling the caller's translation unit, the compiler doesn't see functions.c, it only sees functions.h. And in functions.h, the struct declaration gives an incomplete type. Meaning "this struct definition is elsewhere".
Is this only possible using pointer?
Yes, it is the only way if you want to do proper OO programming in C. This concept is sometimes called opaque pointers or opaque type.
(Though you could also achieve "poor man's private encapsulation" through the static keyword. This is usually not really recommended, since it wouldn't be thread-safe.)
Is the correct (and only) way to implement OOP principles in my case to make a different struct in functions.h like so:
Pretty much, yeah (apart from the nit-pick about the mentioned pointer typedef). Using function pointers to the public functions isn't necessary though, although that's how you implement polymorphism.
What your example lacks though is a "constructor" and "destructor". Without them the code wouldn't be meaningful. The malloc and free calls should be inside those, and not done by the caller.
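A minimal sketch of such a constructor and destructor, with the header and source shown inline in one file. The names person_create, person_destroy, person_name and person_birth_year are hypothetical, and the sketch uses struct person * directly instead of the Person pointer typedef. It also sidesteps a problem in the question's main.c: there, sizeof new is the size of a pointer, not of struct person, and the caller cannot do better because the type is incomplete outside functions.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- Functions.h: the struct stays incomplete for callers ---- */
struct person;

struct person *person_create(const char *name, int birthYear);  /* hypothetical */
void           person_destroy(struct person *p);                /* hypothetical */
const char    *person_name(const struct person *p);             /* hypothetical */
int            person_birth_year(const struct person *p);       /* hypothetical */

/* ---- functions.c: the only place that knows the layout ---- */
enum { SIZE = 60 };

struct person {
    char name[SIZE];
    int birthYear;
    int age;
};

struct person *person_create(const char *name, int birthYear)
{
    /* sizeof *p is the full struct size; this only compiles here,
       where the type is complete */
    struct person *p = calloc(1, sizeof *p);
    if (p == NULL)
        return NULL;
    /* copy at most SIZE - 1 bytes; calloc left the last byte as '\0' */
    strncpy(p->name, name, SIZE - 1);
    p->birthYear = birthYear;
    return p;
}

void person_destroy(struct person *p)
{
    free(p);
}

const char *person_name(const struct person *p)
{
    return p->name;
}

int person_birth_year(const struct person *p)
{
    return p->birthYear;
}

/* ---- main.c side: never calls malloc or free itself ---- */
int person_demo(void)
{
    struct person *p = person_create("Ada", 1815);
    assert(p != NULL);
    assert(strcmp(person_name(p), "Ada") == 0);
    int year = person_birth_year(p);
    person_destroy(p);
    return year;
}
```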
With or without typedef, in C you hide data by declaring incomplete types. In /usr/include/stdio.h, you'll find fread(3) takes a FILE * argument:
extern size_t fread (void *__restrict __ptr, size_t __size,
size_t __n, FILE *__restrict __stream) __wur;
and FILE is declared something like this:
struct _IO_FILE;
typedef struct _IO_FILE FILE;
Using stdio.h you cannot define a variable of type FILE, because type FILE is incomplete: it's declared, but not defined. But you can happily pass FILE * around, because the compiler knows the size of a pointer to a struct without knowing the struct's contents. You're just going to have to call fopen(3) to make it point to an open file.
To partially define a type, as in your case:
struct classPerson
{ // This data should be hidden
Person data;
void (*fPtrGetName)(Person obj);
...
};
is a little trickier. First of all, you should have a really good reason, namely that two implementations of fPtrGetName are implemented. Otherwise you're just building complexity on the altar of OOP.
A good example of a good reason is bind(2). You can bind a unix domain socket or a network socket, among others. Both types are represented by struct sockaddr, but that's just a stand-in type for struct sockaddr_un and struct sockaddr_in. Functions that take struct sockaddr depend on the fact that all such structures start with an address-family member (sa_family in struct sockaddr, sun_family in struct sockaddr_un, sin_family in struct sockaddr_in), and branch accordingly. Et voila, polymorphism: one function, many types.
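The same idea can be sketched with made-up types. This is not the sockets API: struct addr, struct addr_local, struct addr_net and addr_port are hypothetical names, and the common header is embedded as a first member rather than repeated as a plain integer field.

```c
#include <assert.h>

enum addr_family { ADDR_LOCAL, ADDR_NET };

/* Generic stand-in type: only the discriminating first member */
struct addr {
    enum addr_family family;
};

struct addr_local {
    struct addr base;        /* must be the first member */
    char path[32];
};

struct addr_net {
    struct addr base;        /* must be the first member */
    unsigned short port;
};

/* One function, many types: branch on the family, then convert back
   to the concrete type. The conversion is valid because a pointer to a
   struct and a pointer to its first member can be converted to each other. */
int addr_port(const struct addr *a)
{
    switch (a->family) {
    case ADDR_NET:
        return ((const struct addr_net *)a)->port;
    case ADDR_LOCAL:
    default:
        return -1;           /* local addresses have no port */
    }
}

int addr_demo(void)
{
    struct addr_net   n = { { ADDR_NET }, 8080 };
    struct addr_local l = { { ADDR_LOCAL }, "/tmp/sock" };

    assert(addr_port(&l.base) == -1);
    return addr_port(&n.base);
}
```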
For an example of a struct full of function pointers, I recommend looking at SQLite. Its API is loaded with structures to isolate it from the OS and let the user define plug-ins.
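As a rough illustration of the pattern (not SQLite's actual API), a struct of function pointers lets generic code work against any backend that fills in the table. Everything here, including struct storage_ops, struct mem_ctx and checksum, is a hypothetical name for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table of operations that a backend must provide */
struct storage_ops {
    int    (*read_byte)(void *ctx, size_t offset);  /* returns -1 past the end */
    size_t (*size)(void *ctx);
};

/* One backend: a read-only in-memory buffer */
struct mem_ctx {
    const unsigned char *data;
    size_t len;
};

static int mem_read_byte(void *ctx, size_t offset)
{
    struct mem_ctx *m = ctx;
    return offset < m->len ? m->data[offset] : -1;
}

static size_t mem_size(void *ctx)
{
    struct mem_ctx *m = ctx;
    return m->len;
}

static const struct storage_ops mem_ops = { mem_read_byte, mem_size };

/* Generic code: knows only the table, not the backend behind it */
long checksum(const struct storage_ops *ops, void *ctx)
{
    long sum = 0;
    size_t n = ops->size(ctx);
    for (size_t i = 0; i < n; i++)
        sum += ops->read_byte(ctx, i);
    return sum;
}

long ops_demo(void)
{
    static const unsigned char bytes[] = { 1, 2, 3, 4 };
    struct mem_ctx m = { bytes, sizeof bytes };

    assert(mem_ops.size(&m) == 4);
    return checksum(&mem_ops, &m);
}
```

A second backend (a file, a network source) would supply its own table and context, and checksum would not need to change. That is the case where the extra indirection earns its keep.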
BTW, if I may say so, fPtrGetName is a terrible name. It's not interesting that it's a function pointer and (controversy!) "get" is noise on a function that takes no arguments. Compare
struct classPerson sargent;
sargent.fPtrGetName();
sargent.name();
Which would you rather use? I reserve "get" (or similar) for I/O functions; at least then you're getting something, not just moving it from one pocket to another! For setting, in C++ I overload the function, so that get/set functions have the same name, but in C I wind up with e.g. set_name(const char name[]).
I have a C project that is designed to be portable to various (PC and embedded) platforms.
Application code will use various calls that will have platform-specific implementations, but share a common (generic) API to aid in portability. I'm trying to settle on the most appropriate way to declare the function prototypes and structures.
Here's what I've come up with so far:
main.c:
#include "generic.h"
int main (int argc, char *argv[]) {
int ret;
gen_t *data;
ret = foo(data);
...
}
generic.h: (platform-agnostic include)
typedef struct impl_t gen_t;
int foo (gen_t *data);
impl.h: (platform-specific declaration)
#include "generic.h"
typedef struct impl_t {
/* ... */
} gen_t;
impl.c: (platform-specific implementation)
int foo (gen_t *data) {
...
}
Build:
gcc -c -fPIC -o platform.o impl.c
gcc -o app main.c platform.o
Now, this appears to work... in that it compiles OK. However, I don't usually tag my structures since they're never accessed outside of the typedef'd alias. It's a small nit-pick, but I'm wondering if there's a way to achieve the same effect with anonymous structs?
I'm also asking for posterity, since I searched for a while and the closest answer I found was this: (Link)
In my case, that wouldn't be the right approach, as the application specifically shouldn't ever include the implementation headers directly -- the whole point is to decouple the program from the platform.
I see a couple of other less-than-ideal ways to resolve this, for example:
generic.h:
#ifdef PLATFORM_X
#include "platform_x/impl.h"
#endif
/* or */
int foo (struct impl_t *data);
Neither of these seems particularly appealing, and definitely not my style. While I don't want to swim upstream, I also don't want conflicting style when there might be a nicer way to implement exactly what I had in mind. So I think the typedef solution is on the right track, and it's just the struct tag baggage I'm left with.
Thoughts?
Your current technique is correct. Trying to use an anonymous (untagged) struct defeats what you're trying to do — you'd have to expose the details of definition of the struct everywhere, which means you no longer have an opaque data type.
In a comment, user3629249 said:
The order of the header file inclusions means there is a forward reference to the struct by the generic.h file; that is, before the struct is defined, it is used. It is unlikely this would compile.
This observation is incorrect for the headers shown in the question; it is accurate for the sample main() code (which I hadn't noticed until adding this response).
The key point is that the interface functions shown take or return pointers to the type gen_t, which in turn maps to a struct impl_t pointer. As long as the client code does not need to allocate space for the structure, or dereference a pointer to a structure to access a member of the structure, the client code does not need to know the details of the structure. It is sufficient to have the structure type declared as existing. You could use either of these to declare the existence of struct impl_t:
struct impl_t;
typedef struct impl_t gen_t;
The latter also introduces the alias gen_t for the type struct impl_t. See also Which part of the C standard allows this code to compile? and Does the C standard consider that there are one or two struct uperms entry types in this header?
The original main() program in the question was:
int main (int argc, char *argv[]) {
int ret;
gen_t data;
ret = foo(&data);
…
}
This code cannot be compiled with gen_t as an opaque (non-pointer) type. It would work OK with:
typedef struct impl_t *gen_t;
It would not compile with:
typedef struct impl_t gen_t;
because the compiler must know how big the structure is to allocate the correct space for data, but the compiler cannot know that size by definition of what an opaque type is. (See Is it a good idea to typedef pointers? for typedefing pointers to structures.)
Thus, the main() code should be more like:
#include "generic.h"
int main(int argc, char **argv)
{
gen_t *data = bar(argc, argv);
int ret = foo(data);
...
}
where (for this example) bar() is defined as extern gen_t *bar(int argc, char **argv);, so it returns a pointer to the opaque type gen_t.
Opinion is split over whether it is better to always use struct tagname or to use a typedef for the name. The Linux kernel is one substantial body of code that does not use the typedef mechanism; all structures are explicitly struct tagname. On the other hand, C++ does away with the need for the explicit typedef; writing:
struct impl_t;
in a C++ program means that the name impl_t is now the name of a type. Since opaque structure types require a tag (or you end up using void * for everything, which is bad for a whole legion of reasons, but the primary reason is that you lose all type safety using void *; remember, typedef introduces an alias for an underlying type, not a new distinct type), the way I code in C simulates C++:
typedef struct Generic Generic;
I avoid using the _t suffix on my types because POSIX reserves the _t for the implementation to use* (see also What does a type followed by _t represent?). You may be lucky and get away with it. I've worked on code bases where types like dec_t and loc_t were defined by the code base (which was not part of the implementation — where 'the implementation' means the C compiler and its supporting code, or the C library and its supporting code), and both those types caused pain for decades because some of the systems where the code was ported defined those types, as is the system's prerogative. One of the names I managed to get rid of; the other I didn't. 'Twas painful! If you must use _t (it is a convenient way to indicate that something is a type), I recommend using a distinctive prefix too: pqr_typename_t for some project pqr, for example.
* See the bottom line of the second table in The Name Space in the POSIX standard.
I have a situation where two of my header files each require data structures defined in the other, i.e. no matter which order you include them in, it won't compile.
However, one of the problem data structures only contains pointers to the data structure declared in the other header file, so I would have thought that technically it doesn't need to know at this point how big the data structure is, and it shouldn't be complaining.
A simplified example of what I mean is outlined below. I would have thought that the array of modes in Library doesn't need to know how big a Mode is, only how big a pointer to a Mode is, and therefore the compiler shouldn't complain if it hasn't yet seen the declaration of Mode in the other header file.
header_1.h
typedef struct
{
Mode **modes;
} Library;
header_2.h
typedef struct
{
int number;
char *name;
} Mode;
It doesn't need to know the size, but it must have seen a declaration. A forward declaration
typedef struct Mode Mode;
before the definition of struct Library suffices.
As currently written, your example does not show the mutual cross-referencing that you mention in the question.
The compiler must be told something about each type it uses. You could use in header_1.h just:
typedef struct Mode Mode;
typedef struct
{
Mode **modes;
} Library;
That would make it compile, at least. The compiler doesn't need the details, but it does need to know that Mode is a type.
Edit:
Note that header_2.h should be modified for this to work. You have to ensure that each typedef appears just once. After you have the typedefs in place, you specify the structure content (definition) once, and you omit the keyword typedef and the typedef name from the structure definition. And you have to decide exactly how the cross-references will be managed. For example, should header_1.h include header_2.h anyway?
I don't remember encountering a case where I really needed mutually referencing structures (in quite a long time programming; long enough that I could have forgotten an example). I do now remember a case of structures mutually referencing each other; it was in a version of make originally written for Minix. I still regard such a requirement as somewhat 'pathological' (or, if you prefer, as a 'code smell') and as something to be avoided whenever possible. If you really must manage it, then the section below explains how I'd go about doing it (and more or less how the make program did go about it).
Mutually-referencing structures
If you truly have two mutually referencing structures, you should (re)consider why you think two headers are better than one. If you still need two headers, you use an idiom like:
header_1.h
#ifndef HEADER_1_H_INCLUDED
#define HEADER_1_H_INCLUDED
#ifndef TYPEDEF_MODE
#define TYPEDEF_MODE
typedef struct Mode Mode;
#endif
#ifndef TYPEDEF_LIBRARY
#define TYPEDEF_LIBRARY
typedef struct Library Library;
#endif
struct Library
{
...
Mode **modes;
...
};
#endif /* HEADER_1_H_INCLUDED */
header_2.h
#ifndef HEADER_2_H_INCLUDED
#define HEADER_2_H_INCLUDED
#ifndef TYPEDEF_MODE
#define TYPEDEF_MODE
typedef struct Mode Mode;
#endif
#ifndef TYPEDEF_LIBRARY
#define TYPEDEF_LIBRARY
typedef struct Library Library;
#endif
struct Mode
{
...
Library **liblist;
...
};
#endif /* HEADER_2_H_INCLUDED */
The repeated typedef 'detection' code is not nice; a single header is better, in my estimation. However, you can include header_1.h and header_2.h above in either order and it should compile.
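A minimal sketch that pastes the contents of both headers into one file, header_2.h first, to check that the idiom holds in the "wrong" order. The elided members are replaced by the hypothetical members number and mode_count so that the example is complete.

```c
#include <assert.h>
#include <stddef.h>

/* ---- contents of header_2.h (deliberately included first) ---- */
#ifndef TYPEDEF_MODE
#define TYPEDEF_MODE
typedef struct Mode Mode;
#endif
#ifndef TYPEDEF_LIBRARY
#define TYPEDEF_LIBRARY
typedef struct Library Library;
#endif

struct Mode {
    int number;              /* hypothetical member */
    Library **liblist;       /* Library is still incomplete: pointers are fine */
};

/* ---- contents of header_1.h ---- */
#ifndef TYPEDEF_MODE         /* already defined above, so this is skipped */
#define TYPEDEF_MODE
typedef struct Mode Mode;
#endif
#ifndef TYPEDEF_LIBRARY      /* already defined above, so this is skipped */
#define TYPEDEF_LIBRARY
typedef struct Library Library;
#endif

struct Library {
    size_t mode_count;       /* hypothetical member */
    Mode **modes;
};

int mutual_demo(void)
{
    Library lib;
    Mode mode;
    Mode *modes[] = { &mode };
    Library *libs[] = { &lib };

    mode.number = 7;
    mode.liblist = libs;
    lib.mode_count = 1;
    lib.modes = modes;

    assert(mode.liblist[0] == &lib);
    /* Follow the links in both directions */
    return lib.modes[0]->liblist[0]->modes[0]->number;
}
```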
I believe this is happening because "Mode" is a type defined using typedef and it's not the name of the struct. You will either need to explicitly forward declare it or you can try using the code structured as follows:
header_1.h
typedef struct
{
struct _Mode_t **modes;
} Library;
header_2.h
typedef struct _Mode_t
{
int number;
char *name;
} Mode;
I've been using the following code to create various struct, but only give people outside of the C file a pointer to it. (Yes, I know that they could potentially mess around with it, so it's not entirely like the private keyword in Java, but that's okay with me).
Anyway, I've been using the following code, and I looked at it today, and I'm really surprised that it's actually working, can anyone explain why this is?
In my C file, I create my struct, but don't give it a tag in the typedef namespace:
struct LABall {
int x;
int y;
int radius;
Vector velocity;
};
And in the H file, I put this:
typedef struct LABall* LABall;
I am obviously using #include "LABall.h" in the c file, but I am NOT using #include "LABall.c" in the header file, as that would defeat the whole purpose of a separate header file. So, why am I able to create a pointer to struct LABall in the H file when I haven't actually included its definition? Does it have something to do with the struct namespace working across files, even when one file is in no way linked to another?
Thank you.
A common pattern for stuff like that is to have a foo.h file defining the API like
typedef struct _Foo Foo;
Foo *foo_new(void);
void foo_do_something(Foo *foo);
and a foo.c file providing an implementation for that API like
#include <stdlib.h>
#include "foo.h"
struct _Foo {
int bar;
};
Foo *foo_new(void) {
Foo *foo = malloc(sizeof *foo);
if (foo == NULL) return NULL; /* allocation failed */
foo->bar = 0;
return foo;
}
void foo_do_something(Foo *foo) {
foo->bar++;
}
This hides all the memory layout and size of the struct in the implementation in foo.c, and the interface exposed via foo.h is completely independent of those internals: A caller.c which only does #include "foo.h" will only have to store a pointer to something, and pointers to structs are always the same size:
#include "foo.h"
void bleh() {
Foo *f = foo_new();
foo_do_something(f);
}
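Note: The ISO C standard section on reserved identifiers says that identifiers beginning with an underscore followed by an uppercase letter or another underscore are always reserved, and that all other identifiers beginning with an underscore are reserved at file scope. So typedef struct Foo Foo; is actually a better way to name things than typedef struct _Foo Foo;.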
Note: I have left freeing the memory as an exercise to the reader. :-)
Of course, this means that the following file broken.c will NOT work:
#include "foo.h"
void broken() {
Foo f;
foo_do_something(&f);
}
as the memory size necessary for actually creating a variable of type Foo is not known in this file.
Since you're asking for a precise reason as to "why" the language works this way, I'm assuming you want some precise references. If you find that pedantic, just skip the notes...
It works because of two things:
All pointer to structure types have the same representation (note that it's not true of all pointer types, as far as standard C is concerned).[1] Hence, the compiler has enough information to generate proper code for all uses of your pointer-to-struct type.
The tag namespace (struct, enum, union) is indeed compatible across all translation units.[2] Thus, the two structures (even though one is not completely defined, i.e. it lacks member declarations) are one and the same.
(BTW, #import is non-standard.)
[1] As per n1256 §6.2.5.27:
All pointers to structure types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.
[2] As per n1256 §6.2.7.1:
two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are complete types, then the following additional requirements apply: [does not concern us].
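A small sketch of what note [1] means for the compiler. The struct tags never_defined_a and never_defined_b are hypothetical and intentionally never completed in this file; pointers to them are nevertheless complete object types with a known size.

```c
#include <assert.h>
#include <stddef.h>

struct never_defined_a;   /* incomplete in this translation unit */
struct never_defined_b;   /* incomplete in this translation unit */

int pointer_size_demo(void)
{
    /* Both variables are ordinary pointers: they can be declared,
       assigned, copied and passed around without the struct contents. */
    struct never_defined_a *pa = NULL;
    struct never_defined_b *pb = NULL;

    /* Same representation for all pointers to structs implies same size */
    assert(sizeof pa == sizeof pb);
    return (int)(sizeof pa == sizeof pb);
}
```

What would not compile here is anything that needs the struct itself, such as sizeof(struct never_defined_a) or dereferencing pa to reach a member.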
In
typedef struct A* B;
since all pointers to structure types have the same representation, knowing that B means a pointer to a struct A contains enough information already. The actual implementation of A is irrelevant (this technique is called "opaque pointer".)
(BTW, better rename one of the LABall's. It's confusing that the same name is used for incompatible types.)