How does linking work in C with regards to opaque pointers?

So, I've been having a bit of confusion regarding linking of various things. For this question I'm going to focus on opaque pointers.
I'll illustrate my confusion with an example. Let's say I have these three files:
main.c
#include <stdio.h>
#include "obj.h" //this directive is replaced with the code in obj.h
int main()
{
myobj = make_obj();
setid(myobj, 6);
int i = getid(myobj);
printf("ID: %i\n",i);
getchar();
return 0;
}
obj.c
#include <stdlib.h>
struct obj{
int id;
};
struct obj *make_obj(void){
    return calloc(1, sizeof(struct obj));
}
void setid(struct obj *o, int i){
    o->id = i;
}
int getid(struct obj *o){
    return o->id;
}
obj.h
struct obj;
struct obj *make_obj(void);
void setid(struct obj *o, int i);
int getid(struct obj *o);
struct obj *myobj;
Because of the preprocessor directives, these would essentially become two files:
(I know technically stdio.h and stdlib.h would have their code replace the preprocessor directives, but I didn't bother to replace them for the sake of readability)
main.c
#include <stdio.h>
//obj.h
struct obj;
struct obj *make_obj(void);
void setid(struct obj *o, int i);
int getid(struct obj *o);
struct obj *myobj;
int main()
{
myobj = make_obj();
setid(myobj, 6);
int i = getid(myobj);
printf("ID: %i\n",i);
getchar();
return 0;
}
obj.c
#include <stdlib.h>
struct obj{
int id;
};
struct obj *make_obj(void){
    return calloc(1, sizeof(struct obj));
}
void setid(struct obj *o, int i){
    o->id = i;
}
int getid(struct obj *o){
    return o->id;
}
Now here's where I get a bit confused. If I try to make a struct obj in main.c, I get an incomplete type error, even though main.c has the declaration struct obj;.
Even if I change the code to use extern, it still won't compile:
main.c
#include <stdio.h>
extern struct obj;
int main()
{
struct obj myobj;
myobj.id = 5;
int i = myobj.id;
printf("ID: %i\n",i);
getchar();
return 0;
}
obj.c
#include <stdlib.h>
struct obj{
int id;
};
As far as I can tell, main.c and obj.c do not share struct definitions (unlike functions or variables, which only need a declaration in the other file).
So, main.c has no link with struct obj types, but for some reason, in the previous example, it was able to create a pointer to one just fine struct obj *myobj;. How, why? I feel like I'm missing some vital piece of information. What are the rules regarding what can or can't go from one .c file to another?
ADDENDUM
To address the possible duplicate, I must emphasize, I'm not asking what an opaque pointer is but how it functions with regards to files linking.

Converting comments into a semi-coherent answer.
The problems with the second main.c arise because it does not have the details of struct obj; it knows that the type exists, but it knows nothing about what it contains. You can create and use pointers to struct obj; you cannot dereference those pointers, not even to copy the structure, let alone access data within the structure, because it is not known how big it is. That's why you have the functions in obj.c. They provide the services you need — object allocation, release, access to and modification of the contents (except that the object release is missing; maybe free(obj); is OK, but it's best to provide a 'destructor').
Note that obj.c should include obj.h to ensure consistency between obj.c and main.c — even if you use opaque pointers.
I'm not 100% sure what you mean by 'ensuring consistency'; what does that entail and why is it important?
At the moment, you could have struct obj *make_obj(int initializer) { … } in obj.c, but because you don't include obj.h in obj.c, the compiler can't tell you that your code in main.c will call it without the initializer — leading to quasi-random (indeterminate) values being used to 'initialize' the structure. If you include obj.h in obj.c, the discrepancy between the declaration in the header and the definition in the source file will be reported by the compiler and the code won't compile. The code in main.c wouldn't compile either — once the header is fixed. The header files are the 'glue' that hold the system together, ensuring consistency between the function definition and the places that use the function (references). The declaration in the header ensures that they're all consistent.
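As a rough sketch of that arrangement, the three files can be collapsed into one translation unit so it compiles on its own. In a real project the first part would be obj.h, the second part would be obj.c (starting with #include "obj.h"), and main.c would include only the header. The destroy_obj name is an assumption; the original code has no destructor.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- obj.h: public interface; struct obj stays incomplete here ---- */
struct obj;
struct obj *make_obj(void);
void destroy_obj(struct obj *o);   /* hypothetical destructor, not in the question */
void setid(struct obj *o, int i);
int getid(struct obj *o);

/* ---- obj.c: the only place that sees the members ---- */
/* Because the prototypes above are visible here, a definition such as
   struct obj *make_obj(int initializer) would be rejected by the compiler. */
struct obj {
    int id;
};

struct obj *make_obj(void)
{
    /* calloc zero-fills the object, so id starts at 0; may return NULL */
    return calloc(1, sizeof(struct obj));
}

void destroy_obj(struct obj *o)
{
    free(o);   /* free(NULL) is a no-op */
}

void setid(struct obj *o, int i)
{
    assert(o != NULL);
    o->id = i;
}

int getid(struct obj *o)
{
    assert(o != NULL);
    return o->id;
}
```

Client code would call make_obj(), check the result for NULL, use setid() and getid(), and finish with destroy_obj(), never touching o->id itself.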
Also, I thought the whole reason why pointers are type-specific was because the pointers need the size which can vary depending on the type. How can a pointer be to something of unknown size?
As to why you can have pointers to types without knowing all the details, it is an important feature of C that provides for the interworking of separately compiled modules. All pointers to structures (of any type) must have the same size and alignment requirements. You can specify that the structure type exists by simply saying struct WhatEver; where appropriate. That's usually at file scope, not inside a function; there are complex rules for defining (or possibly redefining) structure types inside functions. And you can then use pointers to that type without more information for the compiler.
Without the detailed body of the structure (struct WhatEver { … };, where the braces and the content in between them are crucial), you cannot access what's in the structure, or create variables of type struct WhatEver — but you can create pointers (struct WhatEver *ptr = NULL;). This is important for 'type safety'. Avoid void * as a universal pointer type when you can, and you usually can avoid it — not always, but usually.
Oh okay, so the obj.h in obj.c is a means of ensuring the prototype being used matches the definition, by causing an error message if they don't.
Yes.
I'm still not entirely following in terms of all pointers having the same size and alignment. Wouldn't the size and alignment of a struct be unique to that particular struct?
The structures are all different, but the pointers to them are all the same size.
And the pointers can be the same size because struct pointers can't be dereferenced, so they don't need specific sizes?
If the compiler knows the details of the structure (there's a definition of the structure type with the { … } part present), then the pointer can be dereferenced (and variables of the structure type can be defined, as well as pointers to it, of course). If the compiler doesn't know the details, you can only define (and use) pointers to the type.
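A small sketch of what is and is not allowed with an incomplete type, assuming a C11 compiler for static_assert. The names struct Other, remember, recall and struct_pointer_sizes_match are made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* NULL */

struct WhatEver;                   /* incomplete: the compiler knows no members */
struct Other { double d[8]; };     /* complete, and deliberately large */

/* The structs differ in size (one is not even known), the pointers do not. */
static_assert(sizeof(struct WhatEver *) == sizeof(struct Other *),
              "pointers to struct types share one size");

/* Pointers to the incomplete type work as variables, parameters and results. */
static struct WhatEver *stored = NULL;

void remember(struct WhatEver *p) { stored = p; }
struct WhatEver *recall(void)     { return stored; }

/* Run-time version of the check above; returns 1 when the sizes match. */
int struct_pointer_sizes_match(void)
{
    return sizeof(struct WhatEver *) == sizeof(struct Other *);
}

/* None of these compile while struct WhatEver is incomplete:
 *   struct WhatEver w;                    storage size unknown
 *   size_t n = sizeof(struct WhatEver);   size unknown
 *   stored->member = 1;                   no members known
 */
```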
Also, out of curiosity, why would one avoid void * as a universal pointer?
You avoid void * because you lose all type safety. If you have the declaration:
extern void *delicate_and_dangerous(void *vptr);
then the compiler can't complain if you write the calls:
bool *bptr = delicate_and_dangerous(stdin);
struct AnyThing *aptr = delicate_and_dangerous(argv[1]);
If you have the declaration:
extern struct SpecialCase *delicate_and_dangerous(struct UnusualDevice *udptr);
then the compiler will tell you when you call it with a wrong pointer type, such as stdin (a FILE *) or argv[1] (a char * if you're in main()), etc. or if you assign to the wrong type of pointer variable.
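To make the contrast concrete, here is a minimal sketch. The member names port and port_copy, and the functions delicate_but_checked and demo_port, are invented for the example; only the struct tags and delicate_and_dangerous come from the text above.

```c
#include <assert.h>
#include <stddef.h>

struct UnusualDevice { int port; };       /* hypothetical contents */
struct SpecialCase   { int port_copy; };  /* hypothetical contents */

/* Typed interface: only a struct UnusualDevice * is accepted. */
struct SpecialCase *delicate_but_checked(struct UnusualDevice *udptr)
{
    static struct SpecialCase result;     /* static storage keeps the sketch short */
    assert(udptr != NULL);
    result.port_copy = udptr->port;
    return &result;
}

/* Untyped interface: any object pointer converts to void * without complaint. */
void *delicate_and_dangerous(void *vptr)
{
    return vptr;
}

int demo_port(void)
{
    struct UnusualDevice dev = { 42 };
    struct SpecialCase *sc = delicate_but_checked(&dev);   /* types match */
    /* delicate_but_checked(stdin);  would draw an incompatible-pointer diagnostic */
    /* delicate_and_dangerous(stdin); would compile silently                       */
    return sc->port_copy;
}
```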

Related

OOP and forward declaration of structure in C

I am studying C and have recently learned how to write OOP-style code in it. Most of it was not that hard for me to understand, except for the names of the structure types used to create a new class.
My textbook used struct dummy_t for the forward declaration and typedef struct {...} dummy_t for its definition. In my understanding, these are two different types, because the former is the type struct dummy_t and the latter is a struct type without a tag, but the sample code from the textbook worked well.
So I deliberately modified the sample code so that the difference in the names of structures will be much clearer. Below are the lines of code I tried.
//class.h
struct test_a;
struct test_a * test_init(void);
void test_print(struct test_a*);
//class.c
#include <stdio.h>
#include <stdlib.h>
typedef struct dummy{
int x;
int y;
} test_b;
test_b * test_init(void){
test_b * temp = (test_b *) malloc(sizeof(test_b));
temp->x = 10;
temp->y = 11;
return temp;
}
void test_print(test_b* obj){
printf("x: %d, y: %d\n", obj->x, obj->y);
}
//main.c
#include "class.h"
int main(void){
struct test_a * obj;
obj = test_init();
test_print(obj);
return 0;
}
// It printed "x: 10, y: 11"
As you can see, I used struct test_a for forward declaration and typedef struct dummy {...} test_b for definition.
I am wondering why I did not get the compile error and it worked.
I am wondering why I did not get the compile error
When you compile main.c the compiler is told via a forward declaration from class.h that there is a function with the signature struct test_a * test_init(void);
The compiler can't do anything other than just trusting that, i.e. no errors, no warnings can be issued.
When you compile class.c there is no forward declaration but only the function definition, i.e. no errors, no warnings.
It's always a good idea to include the .h file into the corresponding .c file. Had you had a #include "class.h" in class.c the compiler would have been able to detect the mismatch.
..and it worked
What happens is:
A pointer to test_b is assigned to a pointer to test_a variable
The variable is then passed as argument to a function expecting a pointer to test_b
So once you use the pointer, it is used as it was created (i.e. as a pointer to test_b). In between, you just stored it in a variable of another pointer type.
Is that ok? No
Storing a pointer to one type in an object defined for another pointer type is not ok. It's undefined behavior. In this case it "just happened to work". In real life it will "just happen to work" on most systems because most systems use the same pointer layout for all types. But according to the C standard it's undefined behavior.
It 'worked' because you did not include class.h in class.c. So the compiler can't see the implementation does not match the declaration.
The proper way is (but without the typedef for clarity):
// class.h
struct test_a;
struct test_a* test_init(void);
//class.c
#include "class.h"
struct test_a {
int x;
int y;
};
struct test_a* test_init(void)
{
...
}
The struct test_a in the header file makes the name test_a known to the compiler as being a struct. But as it does not know what is in the struct, you can only use pointers to such a struct.
The members are defined in the implementation file and can only be used there.
If you want to use a typedef:
// header
typedef struct test_a_struct test_a;
test_a* test_init(void);
//implementation
struct test_a_struct {
int x;
int y;
};
test_a* test_init(void)
{
...
}

Using a structure in main.c which is declared in module.h and defined in module.c

Context
We have three files:
module.h: it holds the declaration of a structure,
module.c: it holds the definition of the structure,
main.c: it holds an instance of the structure.
The goal is to use a structure in main.c by using an API (module.h) and not directly by manipulating the structure members. It is why the definition of the structure is in module.c and not in module.h.
Code
module.h
#ifndef MODULE_H
#define MODULE_H
typedef struct test_struct test_struct;
void initialize_test_struct(int a, int b, test_struct * test_struct_handler);
#endif
module.c
#include "module.h"
struct test_struct
{
int a;
int b;
};
void initialize_test_struct(int a, int b, test_struct * test_struct_handler)
{
test_struct_handler->a = a;
test_struct_handler->b = b;
}
main.c
#include "module.h"
int main(void)
{
test_struct my_struct; // <- GCC error here
test_struct * my_struct_handler = &my_struct;
initialize_test_struct(1, 2, my_struct_handler);
return 0;
}
Problem
If we compile those files with GCC, we will get the following error:
main.c:7:17: error: storage size of ‘my_struct’ isn’t known
Question
How can we force client code to go through an API, and forbid it from manipulating the structure's members directly, when the structure is declared and defined in a different module than main.c?
Since the definition of test_struct is not visible to your main function, you cannot create an instance of this object nor can you access its members. You can however create a pointer to it. So you need a function in module.c that allocates memory for an instance and returns a pointer to it. You'll also need functions to read the members.
In module.h:
test_struct *allocate_test_struct(void);
int get_a(test_struct *p);
int get_b(test_struct *p);
In module.c:
/* needs #include <stdio.h> and #include <stdlib.h> at the top of module.c */
test_struct *allocate_test_struct(void)
{
    test_struct *p = malloc(sizeof(test_struct));
    if (!p) {
        perror("malloc failed");
        exit(1);
    }
    return p;
}
int get_a(test_struct *p)
{
    return p->a;
}
int get_b(test_struct *p)
{
    return p->b;
}
In main.c:
test_struct * my_struct_handler = allocate_test_struct();
initialize_test_struct(1, 2, my_struct_handler);
printf("%d\n", get_a(my_struct_handler));
printf("%d\n", get_b(my_struct_handler));
free(my_struct_handler);
You cannot instantiate a test_struct directly with only the include file because the details are not known when the C file is processed in the compiler. The language will only let you initialize pointers to objects of unknown size, not the objects themselves. The size and other details of test_struct are only known by the compiler when processing module.c
To get around this, you need to have module.c allocate data and provide a pointer to it in the initialize call. This means you have to either have the initialize function return a pointer to a newly created object (one that was either malloc'd or a global or static object), or have the function accept a test_struct **:
void initialize_test_struct(int a, int b, test_struct ** test_struct_handler)
{
*test_struct_handler = malloc(sizeof(test_struct));
//Do rest of init. You should also check return value of malloc
}
//Alternatively
test_struct * initialize_test_struct(int a, int b)
{
    test_struct *temp;
    temp = malloc(sizeof(test_struct));
    //Init members as needed (after checking that temp is not NULL)
    return temp;
}
Normally in this situation the typedef is for a pointer to the opaque structure, and named to indicate that it's not the struct itself - 'typedef struct test_struct* test_struct_handle' could be appropriate, as the struct name itself is rather useless to users of the module (except to make pointers).
It's also good practice to:
Have accessor functions if you need them (which you do for your main file - see dbush's answer)
Have a 'de-init'/'free' function. The user does not necessarily know that malloc was used, and having a de-init function will make it possible to hide more implementation details.
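Putting the last few suggestions together (a handle typedef, accessors, and a matching 'free' function), a sketch might look like this. All function names here are assumptions for illustration, and the header and implementation are shown in one block only so that it compiles standalone.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- module.h (sketch): users only ever see the handle ---- */
typedef struct test_struct *test_struct_handle;

test_struct_handle test_struct_create(int a, int b);   /* returns NULL on failure */
void test_struct_destroy(test_struct_handle h);
int test_struct_get_a(test_struct_handle h);
int test_struct_get_b(test_struct_handle h);

/* ---- module.c (sketch): the complete type is private to this file ---- */
struct test_struct {
    int a;
    int b;
};

test_struct_handle test_struct_create(int a, int b)
{
    test_struct_handle h = malloc(sizeof *h);
    if (h != NULL) {
        h->a = a;
        h->b = b;
    }
    return h;
}

void test_struct_destroy(test_struct_handle h)
{
    /* The caller never needs to know that malloc was used. */
    free(h);
}

int test_struct_get_a(test_struct_handle h)
{
    assert(h != NULL);
    return h->a;
}

int test_struct_get_b(test_struct_handle h)
{
    assert(h != NULL);
    return h->b;
}
```

Because allocation and release both live in the module, the storage strategy could later change (a static pool, for instance) without touching client code.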
First, what is it that you think clients of your library want to do that your API doesn’t support, and they’ll be tempted to manipulate the fields of your structure to accomplish? Consider extending the API.
One option is to force client code to get its structures from your factory function, rather than allocate them itself.
Another is to create a phony definition, perhaps containing only an array of char to establish a minimum size and alignment, so that the client code knows just enough to allocate an array of them, and the library module itself can cast the pointer and twiddle the bits.
Another is to put the definition in the header and add a comment saying that the fields are internal implementation details that might change.
Another is to port to an object-oriented language.
If we compile those files with GCC, we will get the following error:
main.c:7:17: error: storage size of ‘my_struct’ isn’t known
The reason why structures are inside the header files in the first place is so that the user source (here main.c) can access its members...
The compiler has only seen typedef struct test_struct test_struct;, which declares an incomplete type. It does not know the size of test_struct, so it cannot reserve storage for my_struct.
And without an object, there is nothing for &my_struct to point to!
Explanation:
test_struct my_struct;
Here, you define a variable of an incomplete type, which is not valid: the compiler cannot allocate storage for it because the size of the type is unknown...
test_struct * my_struct_handler = &my_struct;
And here you take the address of my_struct, which cannot exist because its definition was rejected (the structure has no definition in the header, and main.c never sees the contents of module.c)...
So use a pointer in this case, and let module.c, which sees the complete type, allocate the object:
/* Don't forget to change to
* 'void initialize_test_struct(int a, int b, test_struct ** test_struct_handler)'
* in the header file!
*/
void initialize_test_struct(int a, int b, test_struct ** test_struct_handler)
{
    // Allocate the object here, where the complete type is visible...
    *test_struct_handler = malloc(sizeof(test_struct));
    if (*test_struct_handler == NULL)
        return; // allocation failed; the caller sees a NULL pointer
    (*test_struct_handler)->a = a;
    (*test_struct_handler)->b = b;
}
// The declarations have to be present inside the headers as well...
// A function that returns the pointer to the variables a and b respectively...
// These functions can readily change their values while returning them...
int * get_a_ref(test_struct * test_struct_handler)
{
return &test_struct_handler->a;
}
int * get_b_ref(test_struct * test_struct_handler)
{
return &test_struct_handler->b;
}
and use it in main like this:
#include <stdio.h>
#include <stdlib.h>
#include "module.h"
int main(void)
{
    test_struct * my_struct_handler;
    // Here the module allocates the object and stores its address in the pointer...
    initialize_test_struct(1, 2, &my_struct_handler);
    if (my_struct_handler == NULL)
        return 1;
    // Will change the value of 'a' and similar for b as well...
    // *get_a_ref(my_struct_handler) = 10;
    printf("%d\n", *get_a_ref(my_struct_handler));
    printf("%d\n", *get_b_ref(my_struct_handler));
    free(my_struct_handler);
    return 0;
}
Just to remind you about the magic (and obscurity) of typedefs...

C: Incomplete definition of type struct

This error seems very easy to fix, but I've been trying and have no clue.
So I have three files:
symtable.h:
typedef struct symbolTable *SymTable_T;
symtablelist.c:
#include "symtable.h"
struct Node{
char* key;
void* value;
struct Node* next;
};
struct symbolTable{
struct Node* head;
int length;
};
SymTable_T SymTable_new(void){
/* code */
}
And main.c:
#include "symtable.h"
int main(int argc, const char * argv[]) {
// insert code here...
SymTable_T emptyTable = SymTable_new();
emptyTable->length = 3; /* <------- ERROR */
return 0;
}
I'm getting error: Incomplete definition of type "struct symbolTable"
Can anyone please give me a hint?
The reason I define my struct in my source file is that I will have another implementation for the header file. So is there another way to fix my bug besides moving my struct definition?
You can't access the members directly with an opaque pointer - if you keep the implementation in a separate source file, you'll have to access all the members via your interface, and not directly mess with the struct.
For instance, add this to symtable.h:
void SymTable_set_length(SymTable_T table, int len);
this to symtablelist.c:
void SymTable_set_length(SymTable_T table, int len)
{
table->length = len;
}
and in main.c change this:
emptyTable->length = 3;
to this:
SymTable_set_length(emptyTable, 3);
although in this specific case passing the length as an argument to SymTable_new() is an obviously superior solution. Better still is not letting the user set the length of a linked list data structure at all - the length is the number of items in it, and it is what it is. It would make no sense to, for instance, add three items to the list, and then allow main.c to set the length to 2. symtablelist.c can calculate and store the length privately, and main.c can find out what the length is, but it doesn't make much sense for main.c to be able to set the length directly. Indeed, the whole point of hiding the members of a struct behind an opaque pointer like this is precisely to prevent client code from being able to mess with the data like that and breaking the data structure's invariants in this manner.
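A sketch of that read-only design, where the module keeps the length in step with the list and clients can only query it. SymTable_new and the two structs come from the question; SymTable_put, SymTable_getLength and SymTable_free are assumed names, and duplicate keys are not checked to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- symtable.h (sketch) ---- */
typedef struct symbolTable *SymTable_T;

SymTable_T SymTable_new(void);
void SymTable_free(SymTable_T table);                              /* hypothetical */
int SymTable_put(SymTable_T table, const char *key, void *value);  /* hypothetical */
int SymTable_getLength(SymTable_T table);                          /* hypothetical */

/* ---- symtablelist.c (sketch) ---- */
struct Node {
    char *key;
    void *value;
    struct Node *next;
};

struct symbolTable {
    struct Node *head;
    int length;
};

SymTable_T SymTable_new(void)
{
    SymTable_T table = malloc(sizeof *table);
    if (table != NULL) {
        table->head = NULL;
        table->length = 0;
    }
    return table;
}

/* Adds a binding at the head of the list; returns 1 on success, 0 on failure. */
int SymTable_put(SymTable_T table, const char *key, void *value)
{
    struct Node *n;
    assert(table != NULL && key != NULL);
    n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->key = malloc(strlen(key) + 1);
    if (n->key == NULL) {
        free(n);
        return 0;
    }
    strcpy(n->key, key);
    n->value = value;
    n->next = table->head;
    table->head = n;
    table->length++;          /* only the module ever changes the length */
    return 1;
}

int SymTable_getLength(SymTable_T table)
{
    assert(table != NULL);
    return table->length;
}

void SymTable_free(SymTable_T table)
{
    struct Node *n, *next;
    if (table == NULL)
        return;
    for (n = table->head; n != NULL; n = next) {
        next = n->next;
        free(n->key);
        free(n);
    }
    free(table);
}
```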
If you want to access the members directly in main.c, then you have to have the struct definition visible, there is no alternative. This will mean either putting the struct definition in the header file (recommended) or duplicating it in main.c (highly unrecommended).
In typedef symbolTable *SymTable_T;, you refer to a non-existent type symbolTable. In C (unlike C++) the type is named struct symbolTable. (Note: the question has changed to fix this since answering it.)
There's a second problem. In main.c the code will need to be able to see the definition of struct symbolTable for you to be able to refer to fields of emptyTable. At the moment, the definition is hidden in a .c file... it should be moved to the header.

Using structs in multiple files

I am trying to call a function in main.c from io.h that reads data from a file, stores that data into multiple structs, then somehow lets me pass the different structs as arguments in later functions in main. Those later functions will be defined in other files, such as alg.h.
How do I go about doing this? Do I use extern and make the structs global and put them in a separate file? Is it possible to have a function from alg.h have a return type of one of the structs? Does it depend on the order of my includes?
The code pasted below compiles and works, but any attempt to move either of the structs causes the program to not compile.
Also, is it possible to have, for example, a struct declared in alg.h, then functions that have that struct as a parameter declared later in alg.h. Then in main.c, you initialize and pass the struct into a function declared in io.h, give the struct some values, have it returned to main.c, then pass that into the function declared in alg.h? I know that sounds like a class, but I need a C solution and I only need one instance of the struct floating around.
Thanks.
io.h
struct s1 {
int num1;
double num2;
};
struct s2 {
int num3;
double num4;
};
void io_init(struct s1*, struct s2*);
io.c
#include <stdio.h>
#include <stdlib.h>
#include "io.h"
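void io_init(struct s1* s1i, struct s2* s2i)
{
    FILE *fp;
    char line[80];
    s1i->num1 = 5;
    s1i->num2 = 2.4;
    fp = fopen("input.txt","rt");
    if (fp == NULL) {
        perror("input.txt");
        exit(EXIT_FAILURE);
    }
    /* num3 is an int, so %i is correct */
    if (fgets(line, sizeof line, fp) != NULL)
        sscanf(line,"%i",&s2i->num3);
    /* num4 is a double, so it needs %lf rather than %i */
    if (fgets(line, sizeof line, fp) != NULL)
        sscanf(line,"%lf",&s2i->num4);
    fclose(fp);
}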
alg.h
void ga_init(struct s1);
alg.c
#include <stdio.h>
#include "io.h"
#include "alg.h"
void ga_init(struct s1 s1i)
{
    printf("%i", s1i.num1);
}
main.c:
#include <stdio.h>
#include "io.h"
#include "alg.h"
int main() {
struct s1 s1i;
struct s2 s2i;
io_init(&s1i, &s2i);
ga_init(s1i);
return 0;
}
Every file which requires the declaration of your types (i.e., wants to use them) must include your header file (ok, so forward declarations and pointers will work, but they can't be dereferenced without the definition and that's not really applicable here anyway.)
So, to elaborate, if file X needs to use struct Y then it needs to include the header file which contains its declaration, that's it.
/* X.c */
#include "Y.h" /* <-- that's it! */
void foo(struct Y *obj) {
/* ... */
}
Here is some advice.
Your .h file is not defining struct objects. It's just defining the type. It's fine the way it is. Everyone who touches any struct of those types should include this file.
It's very rare to need to pass a struct by value as you are doing in the call to ga_init. You will essentially always want to call by reference, like you did with io_init.
Yes, you can return a struct, but again, it would almost always be better to return a reference to a struct.
You can certainly share globally defined structs and you don't need extern unless your linker is something awful. But sharing a reference to a struct allocated in main() amounts to roughly the same thing.
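A brief sketch of the pass-by-pointer and return-by-value options for a struct whose definition lives in a shared header. The struct is the one from io.h above; s1_fill, s1_make and s1_num1 are assumed names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* ---- io.h (sketch): the complete type is visible to every file that includes it ---- */
struct s1 {
    int num1;
    double num2;
};

/* Fill a caller-owned struct through a pointer (no copy is made). */
void s1_fill(struct s1 *out, int num1, double num2)
{
    assert(out != NULL);
    out->num1 = num1;
    out->num2 = num2;
}

/* Returning by value is legal because struct s1 is a complete type here. */
struct s1 s1_make(int num1, double num2)
{
    struct s1 result;
    s1_fill(&result, num1, num2);
    return result;
}

/* Read-only access through a pointer to const avoids copying the struct. */
int s1_num1(const struct s1 *in)
{
    assert(in != NULL);
    return in->num1;
}
```

Include order does not matter as long as each .c file includes the header that defines struct s1 before the first use of the type.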

How can I hide the declaration of a struct in C?

In the question Why should we typedef a struct so often in C?, unwind answered that:
In this latter case, you cannot return the Point by value, since its declaration is hidden from users of the header file. This is a technique used widely in GTK+, for instance.
How is declaration hiding accomplished? Why can't I return the Point by value?
ADD:
I understand why I can't return the struct by value, but it is still hard to see why I can't dereference this pointer in my function. That is, if my struct has a member named y, why can't I do this?
pointer_to_struct->y = some_value;
Why should I use methods to do it? (Like Gtk+)
Thanks guys, and sorry for my bad english again.
Have a look at this example of a library, using a public header file, a private header file and an implementation file.
In file public.h:
struct Point;
struct Point* getSomePoint();
In file private.h:
struct Point
{
    int x;
    int y;
};
In file private.c:
struct Point* getSomePoint()
{
/* ... */
}
If you compile these three files into a library, you only give public.h and the library object file to the consumer of the library.
getSomePoint has to return a pointer to Point, because public.h does not define the size of Point, only that it is a struct and that it exists. Consumers of the library can use pointers to Point, but cannot access the members or copy it around, because they do not know the size of the structure.
Regarding your further question:
You cannot dereference because the program using the library only has the information from public.h, which does not contain the member declarations. It therefore cannot access the members of the point structure.
You can see this as the encapsulation feature of C, just like you would declare the data members of a C++ class as private.
What he means is that you cannot return the struct by-value in the header, because for that, the struct must be completely declared. But that happens in the C file (the declaration that makes X a complete type is "hidden" in the C file, and not exposed into the header), in his example. The following declares only an incomplete type, if that's the first declaration of the struct
struct X;
Then, you can declare the function
struct X f(void);
But you cannot define the function, because you cannot create a variable of that type, and much less so return it (its size is not known).
struct X f(void) { // <- error here
// ...
}
The error happens because struct X is still incomplete. Now, if you only include the header with the incomplete declaration in it, then you cannot call that function, because the function call expression would yield an incomplete type, which is not allowed.
If you were to provide a declaration of the complete type struct X in between, it would be valid
struct X;
struct X f(void);
// ...
struct X { int data; };
struct X f(void) { // valid now: struct X is a complete type
// ...
}
This would apply to the way using typedef too: They both name the same, (possibly incomplete) type. One time using an ordinary identifier X, and another time using a tag struct X.
In the header file:
typedef struct _point * Point;
After the compiler sees this it knows:
There is a struct called _point.
There is a pointer type Point that can refer to a struct _point.
The compiler does not know:
What the struct _point looks like.
What members struct _point contains.
How big struct _point is.
Not only does the compiler not know it - we as programmers don't know it either. This means we can't write code that depends on those properties of struct _point, which means that our code may be more portable.
Given the above code, you can write functions like:
Point f() {
....
}
because Point is a pointer and struct pointers are all the same size and the compiler doesn't need to know anything else about them. But you can't write a function that returns by value:
struct _point f() {
....
}
because the compiler does not know anything about struct _point, specifically its size, which it needs in order to construct the return value.
Thus, we can only refer to struct _point via the Point type, which is really a pointer. This is why Standard C has types like FILE, which can only be accessed via a pointer - you can't create a FILE structure instance in your code.
Old question, better answer:
In Header File:
typedef struct _Point Point;
In C File:
struct _Point
{
int X;
int Y;
};
What that post means is: If you see the header
typedef struct _Point Point;
Point * point_new(int x, int y);
then you don't know the implementation details of Point.
As an alternative to using opaque pointers (as others have mentioned), you can instead return an opaque bag of bytes if you want to avoid using heap memory:
// In public.h:
#include <stdint.h> // for uint8_t
struct Point
{
    uint8_t data[SIZEOF_POINT]; // make sure this size is correct!
};
void MakePoint(struct Point *p);
// In private.h:
struct Point
{
int x, y, z;
};
void MakePoint(struct Point *p);
// In private.c:
void MakePoint(struct Point *p)
{
p->x = 1;
p->y = 2;
p->z = 3;
}
Then, you can create instances of the struct on the stack in client code, but the client doesn't know what's in it -- all it knows is that it's a blob of bytes with a given size. Of course, it can still access the data if it can guess the offsets and data types of the members, but then again you have the same problem with opaque pointers (though clients don't know the object size in that case).
For example, the pthreads library uses opaque types such as pthread_mutex_t and pthread_cond_t, which on many implementations are structs or unions of opaque bytes -- you can still create instances of those on the stack (and you usually do), but you have no idea what's in them. Just take a peek into your /usr/include/pthread.h and the various files it includes.
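One conservative way to sketch the "bag of bytes" idea is to copy the private struct in and out of the public byte array with memcpy, which sidesteps alignment and aliasing questions at the cost of a copy. Everything named here (PointBox, PointImpl, POINT_STORAGE_SIZE and the two functions) is an assumption for illustration, and static_assert needs a C11 compiler.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>   /* memcpy */

/* ---- public.h (sketch): clients see only a sized blob ---- */
#define POINT_STORAGE_SIZE 32   /* assumed upper bound, verified below */

struct PointBox {
    unsigned char data[POINT_STORAGE_SIZE];
};

void PointBox_set(struct PointBox *box, int x, int y, int z);
int PointBox_x(const struct PointBox *box);

/* ---- private.c (sketch): the real layout is known only here ---- */
struct PointImpl {
    int x, y, z;
};

/* Fails at compile time if the public blob is too small. */
static_assert(sizeof(struct PointImpl) <= POINT_STORAGE_SIZE,
              "POINT_STORAGE_SIZE is too small for struct PointImpl");

void PointBox_set(struct PointBox *box, int x, int y, int z)
{
    struct PointImpl p;
    p.x = x;
    p.y = y;
    p.z = z;
    memcpy(box->data, &p, sizeof p);   /* store the private representation */
}

int PointBox_x(const struct PointBox *box)
{
    struct PointImpl p;
    memcpy(&p, box->data, sizeof p);   /* load it back before reading a member */
    return p.x;
}
```

Client code can then declare struct PointBox box; on the stack and pass &box to the module, with no heap allocation involved.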
