Why is it possible to use an undefined struct in C?

#include <stdio.h>
int main()
{
printf("%zu\n", sizeof(struct token *)); /* sizeof yields a size_t, so use %zu */
}
The above code compiles and links with gcc under Linux. Could anyone explain what happens behind the scenes? I know a pointer takes a fixed amount of memory, so the definition of struct token is irrelevant to sizeof, but even with gcc's warning options turned on there are no warnings about the nonexistent struct at all.
The context for the question is that I'm reading source code written by others and trying very hard to find the definition of "struct token", but of course I failed.

Because you are trying to get the size of a pointer to struct token. The size of a pointer doesn't depend on how the structure is actually defined.
Generally, you can even declare a variable of type struct token*, but you can't dereference it (e. g. access a member through the pointer).
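A minimal sketch of that rule, assuming struct token is declared but never defined anywhere in the program. The helper names token_ptr_size and token_is_null are made up for illustration.

```c
#include <stddef.h>

struct token;   /* declared but never defined: an incomplete type */

/* Legal: the size of a pointer does not depend on the pointee's layout. */
size_t token_ptr_size(void)
{
    return sizeof(struct token *);
}

/* Legal: pointers to an incomplete type can be stored, passed and compared. */
int token_is_null(const struct token *t)
{
    return t == NULL;
}

/* Not legal while struct token is incomplete (each would be a compile error):
 *     sizeof(struct token)
 *     struct token t;
 *     t_ptr->member
 */
```

Only the three commented-out uses need the definition, which is why the compiler stays silent about the code in the question.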

To paraphrase the C standard, an incomplete type is a type that describes an
object but lacks information needed to determine its size.
void is another incomplete type. Unlike other incomplete types, void cannot
be completed.
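As an illustration of a struct type being completed later in the same translation unit (the names struct point, origin and point_sum are hypothetical, not from the question): the first two declarations need only the tag, while the member access compiles only after the full definition has been seen.

```c
struct point;                        /* incomplete: size and members unknown */

struct point *origin(void);          /* fine: only a pointer is involved */

struct point { int x; int y; };      /* the type is complete from here on */

static struct point the_origin = { 0, 0 };

struct point *origin(void)
{
    return &the_origin;
}

int point_sum(const struct point *p)
{
    return p->x + p->y;              /* allowed only because the type is now complete */
}
```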
This "incomplete type" is often used for handles: a library lets you allocate a "handle" to something, work with it, and dispose of it again. All of this happens encapsulated in the library. As a user, you have no idea what happens inside.
Example:
lib.h:
struct data * d_generate(void);
void d_set(struct data *, int);
int d_get(struct data *);
void d_free(struct data*);
lib.c:
#include "lib.h"
#include <stdlib.h>
struct data { int value; };
struct data * d_generate(void) {
return malloc(sizeof(struct data));
}
void d_set(struct data * d, int v) {
d -> value = v;
}
int d_get(struct data * d) {
return d->value;
}
void d_free(struct data* d) {
free(d);
}
user.c:
#include "lib.h"
[...]
struct data * d = d_generate();
d_set(d, 42);
int v = d_get(d);
// but v = d->value; doesn't work here because of the incomplete definition.
d_free(d);

Related

C compiler checking of a typedef'ed void *

We have an anonymous type, typedefed to void *, which is the handle for an API (all code in C11). It is deliberately void * as what it is pointing to changes depending on the platform we are compiled for and we also don't want the application to try dereferencing it. Internally we know what it should be pointing to and we cast it appropriately. This is fine, the code is public, we've been using it for years, it cannot be changed.
The problem is that we now need to introduce another one of these, and we don't want the user to get the two confused, we want the compiler to throw an error if the wrong handle is passed to one of our functions. However, all of the versions of all of the C compilers I have tried so far (GCC, Clang, MSVC) don't care; they know that the underlying type is void * and so anything goes (this is with -Wall and -Werror). Putting it another way, our typedef has not achieved anything, we might as well have just used void *. I have also tried Lint and CodeChecker, who also don't seem to care (though you could probably question my configurations for these). Note that I am not able to use -Wpedantic as we include third party code where that wouldn't fly.
I have tried making the new thing a specific typedefed pointer rather than a void * but that doesn't entirely fix things as the compiler is still happy for the caller to pass that new specific typedefed pointer into the existing functions that are expecting the existing handle typedef.
Is there (a) a way to construct a new anonymous handle such that the compiler will not allow it to be passed to the existing functions or (b) a checker that we can apply to pick the problem up, at least in our own use of these APIs?
Here is some code to illustrate the problem:
#include <stdlib.h>
typedef struct {
int contents;
} existingThing_t;
typedef void *anonExistingHandle_t;
typedef struct {
char contents[10];
} newThing_t;
typedef void *anonNewHandle_t;
typedef newThing_t *newHandle_t;
static void functionExisting(anonExistingHandle_t handle)
{
existingThing_t *pThing = (existingThing_t *) handle;
// Perform the function
(void) pThing;
}
static void functionNew(anonNewHandle_t handle)
{
newThing_t *pThing = (newThing_t *) handle;
// Perform a new function
(void) pThing;
}
int main() {
anonExistingHandle_t existingHandle = NULL;
anonNewHandle_t newHandleA = NULL;
newHandle_t newHandleB = NULL;
functionExisting(existingHandle);
functionNew(newHandleA);
// These should result in a compilation error
functionExisting(newHandleA);
functionNew(existingHandle);
functionExisting(newHandleB);
return 0;
}
Is there (a) a way to construct a new anonymous handle such that the compiler will not allow it to be passed to the existing functions
Yes, use a type that can't be implicitly converted to void *. Use a structure.
typedef struct {
struct newThing_s *p;
} anonNewHandle_t;
Anyway, your design is just flawed and disables all static compiler checks. Do not use void *, instead use structures or structures with void * inside, to enable compile checks. Research how the very, very standard FILE * works. FILE is not void.
Do not use typedef pointers. They are very confusing. https://wiki.sei.cmu.edu/confluence/display/c/DCL05-C.+Use+typedefs+of+non-pointer+types+only
I suggest rewriting your library so that you do not use void * and do not use typedef pointers.
The design may look like the following:
// handle.h
struct handle_s;
typedef struct {
struct handle_s *p;
} handle_t;
handle_t handle_init(void);
void handle_deinit(handle_t t);
void handle_do_something(handle_t t);
// handle.c
struct handle_s {
int the_stuff_you_need;
};
handle_t handle_init(void) {
return (handle_t){
.p = calloc(1, sizeof(struct handle_s))
};
}
void handle_do_something(handle_t h) {
struct handle_s *t = h.p; // h is a struct passed by value, so use '.', not '->'
// etc.
}
// anotherhandle.h
// similar to above
typedef struct {
struct anotherhandle_s *p;
} anotherhandle_t;
void anotherhandle_do_something(anotherhandle_t h);
// main
int main() {
handle_t h = handle_init();
handle_do_something(h);
handle_deinit(h);
anotherhandle_do_something(h); // compiler error
}
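The answer above rewrites both handles. If the existing void * typedef really cannot change, as the question states, a hedged sketch of the same idea is to leave it alone and wrap only the new handle in a struct. The names newOpen and theThing are hypothetical, and the functions return values here only so that their effect is observable.

```c
#include <stddef.h>

typedef void *anonExistingHandle_t;           /* existing public type, unchanged */

struct newThing_s;                            /* opaque to callers */
typedef struct {
    struct newThing_s *p;
} anonNewHandle_t;

struct newThing_s {                           /* would live in the .c file */
    char contents[10];
};

static struct newThing_s theThing = { "new" };

/* Hypothetical factory for the new handle. */
anonNewHandle_t newOpen(void)
{
    anonNewHandle_t h = { &theThing };
    return h;
}

char functionNew(anonNewHandle_t handle)
{
    return handle.p->contents[0];
}

int functionExisting(anonExistingHandle_t handle)
{
    return handle != NULL;
}

/* Both of these are rejected at compile time:
 *     functionExisting(newOpen());   a struct does not convert to void *
 *     functionNew(NULL);             a pointer does not convert to a struct
 */
```

The existing handle is still as loose as before (any pointer converts to void *), but the new one can no longer be confused with it.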

OOP and forward declaration of structure in C

I am studying C and have recently learned how to write object-oriented code in it. Most of it was not hard to understand, except for the names of the structure types used to create a new class.
My textbook used struct dummy_t for the forward declaration and typedef struct {...} dummy_t for its definition. In my understanding, these are two different types, because the former is the type struct dummy_t and the latter is a struct type without a tag, but the sample code from the textbook worked well.
So I deliberately modified the sample code to make the difference in the structure names much clearer. Below is the code I tried.
//class.h
struct test_a;
struct test_a * test_init(void);
void test_print(struct test_a*);
//class.c
#include <stdio.h>
#include <stdlib.h>
typedef struct dummy{
int x;
int y;
} test_b;
test_b * test_init(void){
test_b * temp = (test_b *) malloc(sizeof(test_b));
temp->x = 10;
temp->y = 11;
return temp;
}
void test_print(test_b* obj){
printf("x: %d, y: %d\n", obj->x, obj->y);
}
//main.c
#include "class.h"
int main(void){
struct test_a * obj;
obj = test_init();
test_print(obj);
return 0;
}
// It printed "x: 10, y: 11"
As you can see, I used struct test_a for forward declaration and typedef struct dummy {...} test_b for definition.
I am wondering why I did not get the compile error and it worked.
I am wondering why I did not get the compile error
When you compile main.c the compiler is told via a forward declaration from class.h that there is a function with the signature struct test_a * test_init(void);
The compiler can't do anything other than just trusting that, i.e. no errors, no warnings can be issued.
When you compile class.c there is no forward declaration but only the function definition, i.e. no errors, no warnings.
It's always a good idea to include the .h file into the corresponding .c file. Had you had a #include "class.h" in class.c the compiler would have been able to detect the mismatch.
..and it worked
What happens is:
A pointer to test_b is assigned to a pointer to test_a variable
The variable is then passed as argument to a function expecting a pointer to test_b
So once you use the pointer it is used as it was created (i.e. as pointer to test_b). In between you just stored in a variable of another pointer type.
Is that ok? No
Storing a pointer to one type in an object defined for another pointer type is not ok. It's undefined behavior. In this case it "just happened to work". In real life it will "just happen to work" on most systems because most systems use the same pointer layout for all types. But according to the C standard it's undefined behavior.
It 'worked' because you did not include class.h in class.c. So the compiler can't see the implementation does not match the declaration.
The proper way is (but without the typedef for clarity):
// class.h
struct test_a;
struct test_a* test_init(void);
//class.c
#include "class.h"
struct test_a {
int x;
int y;
};
struct test_a* test_init(void)
{
...
}
The struct test_a in the header file makes the name test_a known to the compiler as being a struct. But as it does not know what is in the struct, you can only use pointers to such a struct.
The members are defined in the implementation file and can only be used there.
If you want to use a typedef:
// header
typedef struct test_a_struct test_a;
test_a* test_init(void);
//implementation
struct test_a_struct {
int x;
int y;
};
test_a* test_init(void)
{
...
}
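A fuller sketch of the typedef variant, shown as one translation unit for brevity. The member values come from the question; test_sum and test_free are hypothetical additions, included because callers cannot read the members or know how the object was allocated.

```c
#include <stdlib.h>

/* What the header would contain */
typedef struct test_a_struct test_a;
test_a *test_init(void);
int test_sum(const test_a *obj);      /* hypothetical accessor */
void test_free(test_a *obj);          /* hypothetical destructor */

/* What the implementation file would contain after including the header */
struct test_a_struct {
    int x;
    int y;
};

test_a *test_init(void)
{
    test_a *temp = malloc(sizeof *temp);
    if (temp == NULL) {
        return NULL;                  /* caller must check for allocation failure */
    }
    temp->x = 10;
    temp->y = 11;
    return temp;
}

int test_sum(const test_a *obj)
{
    return obj->x + obj->y;
}

void test_free(test_a *obj)
{
    free(obj);
}
```

Because the declarations and definitions are seen together, a mismatch such as returning a different struct type from test_init would be diagnosed as conflicting types.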

How does linking work in C with regards to opaque pointers?

So, I've been having a bit of confusion regarding linking of various things. For this question I'm going to focus on opaque pointers.
I'll illustrate my confusion with an example. Let's say I have these three files:
main.c
#include <stdio.h>
#include "obj.h" //this directive is replaced with the code in obj.h
int main()
{
myobj = make_obj();
setid(myobj, 6);
int i = getid(myobj);
printf("ID: %i\n",i);
getchar();
return 0;
}
obj.c
#include <stdlib.h>
struct obj{
int id;
};
struct obj *make_obj(void){
return calloc(1, sizeof(struct obj));
};
void setid(struct obj *o, int i){
o->id = i;
};
int getid(struct obj *o){
return o->id;
};
obj.h
struct obj;
struct obj *make_obj(void);
void setid(struct obj *o, int i);
int getid(struct obj *o);
struct obj *myobj;
Because of the preprocessor directives, these would essentially become two files:
(I know technically stdio.h and stdlib.h would have their code replace the preprocessor directives, but I didn't bother to replace them for the sake of readability)
main.c
#include <stdio.h>
//obj.h
struct obj;
struct obj *make_obj(void);
void setid(struct obj *o, int i);
int getid(struct obj *o);
struct obj *myobj;
int main()
{
myobj = make_obj();
setid(myobj, 6);
int i = getid(myobj);
printf("ID: %i\n",i);
getchar();
return 0;
}
obj.c
#include <stdlib.h>
struct obj{
int id;
};
struct obj *make_obj(void){
return calloc(1, sizeof(struct obj));
};
void setid(struct obj *o, int i){
o->id = i;
};
int getid(struct obj *o){
return o->id;
};
Now here's where I get a bit confused. If I try to make a struct obj in main.c, I get an incomplete type error, even though main.c has the declaration struct obj;.
Even if I change the code to use extern, it still won't compile:
main.c
#include <stdio.h>
extern struct obj;
int main()
{
struct obj myobj;
myobj.id = 5;
int i = myobj.id;
printf("ID: %i\n",i);
getchar();
return 0;
}
obj.c
#include <stdlib.h>
struct obj{
int id;
};
So far as I can tell, main.c and obj.c do not communicate structs (unlike functions or variables, which just need a declaration in the other file).
So, main.c has no link with struct obj types, but for some reason, in the previous example, it was able to create a pointer to one just fine: struct obj *myobj;. How, and why? I feel like I'm missing some vital piece of information. What are the rules regarding what can or can't go from one .c file to another?
ADDENDUM
To address the possible duplicate, I must emphasize, I'm not asking what an opaque pointer is but how it functions with regards to files linking.
Converting comments into a semi-coherent answer.
The problems with the second main.c arise because it does not have the details of struct obj; it knows that the type exists, but it knows nothing about what it contains. You can create and use pointers to struct obj; you cannot dereference those pointers, not even to copy the structure, let alone access data within the structure, because it is not known how big it is. That's why you have the functions in obj.c. They provide the services you need — object allocation, release, access to and modification of the contents (except that the object release is missing; maybe free(obj); is OK, but it's best to provide a 'destructor').
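A sketch of the 'destructor' mentioned above, added to the question's obj.c. The name destroy_obj is an assumption; the other functions are reproduced from the question so the block stands on its own.

```c
#include <stdlib.h>

/* obj.h would declare these four functions and the incomplete type. */
struct obj;
struct obj *make_obj(void);
void setid(struct obj *o, int i);
int getid(struct obj *o);
void destroy_obj(struct obj *o);      /* hypothetical name */

/* obj.c: the only place where the members are visible. */
struct obj {
    int id;
};

struct obj *make_obj(void)
{
    return calloc(1, sizeof(struct obj));   /* may return NULL */
}

void setid(struct obj *o, int i)
{
    o->id = i;
}

int getid(struct obj *o)
{
    return o->id;
}

void destroy_obj(struct obj *o)
{
    free(o);                          /* free(NULL) is a harmless no-op */
}
```

With a destructor in the API, callers no longer need to know that the object came from calloc, so the allocation strategy can change without touching main.c.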
Note that obj.c should include obj.h to ensure consistency between obj.c and main.c — even if you use opaque pointers.
I'm not 100% sure what you mean by 'ensuring consistency'; what does that entail and why is it important?
At the moment, you could have struct obj *make_obj(int initializer) { … } in obj.c, but because you don't include obj.h in obj.c, the compiler can't tell you that your code in main.c will call it without the initializer — leading to quasi-random (indeterminate) values being used to 'initialize' the structure. If you include obj.h in obj.c, the discrepancy between the declaration in the header and the definition in the source file will be reported by the compiler and the code won't compile. The code in main.c wouldn't compile either — once the header is fixed. The header files are the 'glue' that hold the system together, ensuring consistency between the function definition and the places that use the function (references). The declaration in the header ensures that they're all consistent.
Also, I thought the whole reason why pointers are type-specific was because the pointers need the size which can vary depending on the type. How can a pointer be to something of unknown size?
As to why you can have pointers to types without knowing all the details, it is an important feature of C that provides for the interworking of separately compiled modules. All pointers to structures (of any type) must have the same size and alignment requirements. You can specify that the structure type exists by simply saying struct WhatEver; where appropriate. That's usually at file scope, not inside a function; there are complex rules for defining (or possibly redefining) structure types inside functions. And you can then use pointers to that type without more information for the compiler.
Without the detailed body of the structure (struct WhatEver { … };, where the braces and the content in between them are crucial), you cannot access what's in the structure, or create variables of type struct WhatEver — but you can create pointers (struct WhatEver *ptr = NULL;). This is important for 'type safety'. Avoid void * as a universal pointer type when you can, and you usually can avoid it — not always, but usually.
Oh okay, so the obj.h in obj.c is a means of ensuring the prototype being used matches the definition, by causing an error message if they don't.
Yes.
I'm still not entirely following in terms of all pointers having the same size and alignment. Wouldn't the size and alignment of a struct be unique to that particular struct?
The structures are all different, but the pointers to them are all the same size.
And the pointers can be the same size because struct pointers can't be dereferenced, so they don't need specific sizes?
If the compiler knows the details of the structure (there's a definition of the structure type with the { … } part present), then the pointer can be dereferenced (and variables of the structure type can be defined, as well as pointers to it, of course). If the compiler doesn't know the details, you can only define (and use) pointers to the type.
Also, out of curiosity, why would one avoid void * as a universal pointer?
You avoid void * because you lose all type safety. If you have the declaration:
extern void *delicate_and_dangerous(void *vptr);
then the compiler can't complain if you write the calls:
bool *bptr = delicate_and_dangerous(stdin);
struct AnyThing *aptr = delicate_and_dangerous(argv[1]);
If you have the declaration:
extern struct SpecialCase *delicate_and_dangerous(struct UnusualDevice *udptr);
then the compiler will tell you when you call it with a wrong pointer type, such as stdin (a FILE *) or argv[1] (a char * if you're in main()), etc. or if you assign to the wrong type of pointer variable.

Using a structure in main.c which is declared in module.h and defined in module.c

Context
We have three files:
module.h: it holds the declaration of a structure,
module.c: it holds the definition of the structure,
main.c: it holds an instance of the structure.
The goal is to use a structure in main.c by using an API (module.h) and not directly by manipulating the structure members. It is why the definition of the structure is in module.c and not in module.h.
Code
module.h
#ifndef MODULE_H
#define MODULE_H
typedef struct test_struct test_struct;
void initialize_test_struct(int a, int b, test_struct * test_struct_handler);
#endif
module.c
#include "module.h"
struct test_struct
{
int a;
int b;
};
void initialize_test_struct(int a, int b, test_struct * test_struct_handler)
{
test_struct_handler->a = a;
test_struct_handler->b = b;
}
main.c
#include "module.h"
int main(void)
{
test_struct my_struct; // <- GCC error here
test_struct * my_struct_handler = &my_struct;
initialize_test_struct(1, 2, my_struct_handler);
return 0;
}
Problem
If we compile those files with GCC, we will get the following error:
main.c:7:17: error: storage size of ‘my_struct’ isn’t known
Question
How can we force client code to go through the API, and forbid it from manipulating the structure's members directly, when the structure's declaration and definition live in a different module than main.c?
Since the definition of test_struct is not visible to your main function, you cannot create an instance of this object nor can you access its members. You can however create a pointer to it. So you need a function in module.c that allocates memory for an instance and returns a pointer to it. You'll also need functions to read the members.
In module.h:
test_struct *allocate_test_struct();
int get_a(test_struct *p);
int get_b(test_struct *p);
In module.c:
test_struct *allocate_test_struct()
{
test_struct *p = malloc(sizeof(test_struct));
if (!p) {
perror("malloc failed");
exit(1);
}
return p;
}
int get_a(test_struct *p)
{
return p->a;
}
int get_b(test_struct *p)
{
return p->b;
}
In main.c:
test_struct * my_struct_handler = allocate_test_struct();
initialize_test_struct(1, 2, my_struct_handler);
printf("%d\n", get_a(my_struct_handler));
printf("%d\n", get_b(my_struct_handler));
free(my_struct_handler);
You cannot instantiate a test_struct directly with only the include file because the details are not known when the C file is processed in the compiler. The language will only let you initialize pointers to objects of unknown size, not the objects themselves. The size and other details of test_struct are only known by the compiler when processing module.c
To get around this, you need to have module.c allocate data and provide a pointer to it in the initialize call. This means you have to either have the initialize function return a pointer to a newly created object (one that was either malloc'd or a global or static object), or have the function accept a test_struct **:
void initialize_test_struct(int a, int b, test_struct ** test_struct_handler)
{
*test_struct_handler = malloc(sizeof(test_struct));
//Do rest of init. You should also check return value of malloc
}
//Alternatively
test_struct * initialize_test_struct(int a, int b)
{
test_struct *temp;
temp = malloc(sizeof(test_struct));
//Init members as needed
return temp;
}
Normally in this situation the typedef is for a pointer to the opaque structure, and named to indicate that it's not the struct itself - 'typedef struct test_struct* test_struct_handle' could be appropriate, as the struct name itself is rather useless to users of the module (except to make pointers).
It's also good practice to:
Have accessor functions if you need them (which you do for your main file - see dbush's answer)
Have a 'de-init'/'free' function. The user does not necessarily know that malloc was used, and having a de-init function will make it possible to hide more implementation details.
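The two practices above can be sketched together as follows. The names create_test_struct and destroy_test_struct are hypothetical; header and source are shown in one block for brevity.

```c
#include <stdlib.h>

/* module.h (sketch) */
typedef struct test_struct test_struct;
test_struct *create_test_struct(int a, int b);   /* hypothetical name */
void destroy_test_struct(test_struct *ts);       /* hypothetical name */
int get_a(const test_struct *ts);
int get_b(const test_struct *ts);

/* module.c (sketch) */
struct test_struct {
    int a;
    int b;
};

test_struct *create_test_struct(int a, int b)
{
    test_struct *ts = malloc(sizeof *ts);
    if (ts != NULL) {
        ts->a = a;
        ts->b = b;
    }
    return ts;                        /* NULL on allocation failure */
}

void destroy_test_struct(test_struct *ts)
{
    free(ts);
}

int get_a(const test_struct *ts)
{
    return ts->a;
}

int get_b(const test_struct *ts)
{
    return ts->b;
}
```

The caller pairs every create_test_struct with a destroy_test_struct and never calls free itself.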
First, what is it that you think clients of your library want to do that your API doesn’t support, and they’ll be tempted to manipulate the fields of your structure to accomplish? Consider extending the API.
One option is to force client code to get its structures from your factory function, rather than allocate them itself.
Another is to create a phony definition, perhaps containing only an array of char to establish a minimum size and alignment, so that the client code knows just enough to allocate an array of them, and the library module itself can cast the pointer and twiddle the bits.
Another is to put the definition in the header and add a comment saying that the fields are internal implementation details that might change.
Another is to port to an object-oriented language.
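A rough sketch of the "phony definition" option, under some loudly stated assumptions: the 16-byte storage size and the name test_struct_impl are invented here, and the module copies the private layout in and out with memcpy rather than casting the pointer as the answer describes, which sidesteps alignment and aliasing questions at the cost of a small copy.

```c
#include <string.h>

/* Public header (sketch): callers see only an opaque block of bytes. */
typedef struct {
    unsigned char storage[16];        /* size is an assumption, checked below */
} test_struct;

void initialize_test_struct(int a, int b, test_struct *ts);
int get_a(const test_struct *ts);
int get_b(const test_struct *ts);

/* Module source (sketch): the real layout stays private. */
struct test_struct_impl {
    int a;
    int b;
};

_Static_assert(sizeof(struct test_struct_impl) <= sizeof(test_struct),
               "public storage is too small for the private layout");

void initialize_test_struct(int a, int b, test_struct *ts)
{
    struct test_struct_impl impl = { a, b };
    memcpy(ts->storage, &impl, sizeof impl);   /* copy private layout into the bytes */
}

int get_a(const test_struct *ts)
{
    struct test_struct_impl impl;
    memcpy(&impl, ts->storage, sizeof impl);   /* copy back out before reading */
    return impl.a;
}

int get_b(const test_struct *ts)
{
    struct test_struct_impl impl;
    memcpy(&impl, ts->storage, sizeof impl);
    return impl.b;
}
```

With this header, the declaration test_struct my_struct; from the question compiles in main.c, while the members a and b remain invisible to it.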
If we compile those files with GCC, we will get the following error:
main.c:7:17: error: storage size of ‘my_struct’ isn’t known
The reason why structures are inside the header files in the first place is so that the user source (here main.c) can access its members...
Your compiler does not know the size of the definition-less struct declared by typedef struct test_struct test_struct; so it cannot reserve storage for my_struct, and there is no object for &my_struct to point to!
And without knowing the size, you cannot define a variable of that type!
Explanation:
test_struct my_struct;
Here, you define a variable of an incomplete type, which is not valid: the compiler cannot allocate it because its size and members are unknown...
test_struct * my_struct_handler = &my_struct;
And here you take the address of my_struct, which cannot exist for the reason above (the structure has no definition in the header, and the definition in module.c is compiled separately, so main.c cannot see it)...
So you use pointers in this case, and let the module allocate the object and hand back its address:
/* Don't forget to change to
* 'void initialize_test_struct(int a, int b, test_struct ** test_struct_handler)'
* in the header file!
*/
void initialize_test_struct(int a, int b, test_struct ** test_struct_handler)
{
// Allocate the object here, where the type is complete, and store its address in the caller's pointer...
*test_struct_handler = malloc(sizeof(test_struct));
(*test_struct_handler)->a = a;
(*test_struct_handler)->b = b;
}
// The declarations have to be present in the header as well...
// Functions that return pointers to the members a and b respectively...
// The caller can read or change the values through the returned pointers...
int * get_a_ref(test_struct * test_struct_handler)
{
return &test_struct_handler->a;
}
int * get_b_ref(test_struct * test_struct_handler)
{
return &test_struct_handler->b;
}
and use it in main like this:
#include <stdio.h>
#include "module.h"
int main(void)
{
test_struct * my_struct_handler;
// Here the object is malloc'd inside the module and its address is stored in the pointer...
initialize_test_struct(1, 2, &my_struct_handler);
// Will change the value of 'a' and similar for b as well...
// *get_a_ref(my_struct_handler) = 10;
printf("%d\n", *get_a_ref(my_struct_handler));
printf("%d\n", *get_b_ref(my_struct_handler));
return 0;
}
Just to remind you about the magic (and obscurity) of typedefs...

GCC Casting Pointer to Incompatible Type

I have a working C code when compiled using GCC, but I am trying to find out if the code works because of pure luck or because GCC handles this code as I expect by design.
NOTE
I am not trying to "fix" it. I am trying to understand the compiler
Here is what I have:
iexample.h
#ifndef IEXAMPLE_H_
#define IEXAMPLE_H_
/* The interface */
struct MyIf
{
int (* init)(struct MyIf* obj);
int (* push)(struct MyIf* obj, int x);
void (* sort)(struct MyIf* obj);
};
/* The object, can be in different header */
struct Obj1
{
struct MyIf myinterface;
int val1;
int val2;
};
struct Obj1* newObj1();
#endif
iexample.c
#include <stdio.h>
#include <stdlib.h>
#include "iexample.h"
/* Functions here are "equivalent" to methods on the Obj1 struct */
int Obj1_init(struct Obj1* obj)
{
printf("Obj1_init()\n");
return 0;
}
int Obj1_push(struct Obj1* obj, int x)
{
printf("Obj1_push()\n");
return 0;
}
void Obj1_sort(struct Obj1* obj)
{
printf("Obj1_sort()\n");
}
struct Obj1* newObj1()
{
struct Obj1* obj = malloc(sizeof(struct Obj1));
obj->myinterface.init = Obj1_init;
obj->myinterface.push = Obj1_push;
obj->myinterface.sort = Obj1_sort;
return obj;
}
main.c
#include "iexample.h"
int main(int argc, char* argv[])
{
struct MyIf* myIf = (struct MyIf*) newObj1();
myIf->init(myIf);
myIf->push(myIf, 3);
myIf->sort(myIf);
/* ... free, return ... */
}
When I compile, as I expect, I get for assigning the pointers in newObj1(),
warning: assignment from incompatible pointer type
The code works as long as I have the "struct MyIf myinterface" to be the first member of the struct, which is by design (I like to shoot myself in the foot)
Now, although I am assigning incompatible pointer types, and the C spec says behavior is undefined, does GCC or other compilers make any design claim on how this case is handled? I can almost swear that this OUGHT TO WORK due to how struct memory is laid out, but I cannot find the proof.
Thanks
C11 standard 6.7.2.1 Structure and union specifiers:
Within a structure object, the non-bit-field members and the
units in which bit-fields reside have addresses that increase in
the order in which they are declared. A pointer to a structure
object, suitably converted, points to its initial member (or
if that member is a bit-field, then to the unit in which it
resides), and vice versa. There may be unnamed padding within
a structure object, but not at its beginning.
So it should work, as long as you access only the first structure member. However, I believe you understand that this is a pretty bad idea. Should you port this code to C++ and make some Obj1 member virtual, this will immediately fail.
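The quoted paragraph covers converting the object pointer. The warning in the question, however, comes from the function pointers: Obj1_init takes a struct Obj1 *, while the interface member expects a function taking a struct MyIf *, and calling a function through a pointer of incompatible type is undefined. A hedged sketch that keeps the same layout but avoids the mismatch gives each method the interface's exact signature and converts inside. Obj1_setup is a hypothetical helper, and the method bodies are invented so that the effect is observable.

```c
#include <stddef.h>

/* The interface, as in the question (sort omitted for brevity). */
struct MyIf {
    int (*init)(struct MyIf *obj);
    int (*push)(struct MyIf *obj, int x);
};

struct Obj1 {
    struct MyIf myinterface;          /* must stay the first member */
    int val1;
    int val2;
};

/* Each method has exactly the type the interface declares. */
static int Obj1_init(struct MyIf *self)
{
    /* Valid because self points at the initial member of an Obj1. */
    struct Obj1 *obj = (struct Obj1 *)self;
    obj->val1 = 0;
    obj->val2 = 0;
    return 0;
}

static int Obj1_push(struct MyIf *self, int x)
{
    struct Obj1 *obj = (struct Obj1 *)self;
    obj->val1 += x;                   /* running total of pushed values */
    obj->val2 += 1;                   /* number of pushes */
    return obj->val1;
}

/* Hypothetical helper that wires up the function pointers. */
void Obj1_setup(struct Obj1 *obj)
{
    obj->myinterface.init = Obj1_init;
    obj->myinterface.push = Obj1_push;
}
```

The assignments in Obj1_setup now involve identical function pointer types, so the incompatible pointer warning disappears, and the only conversion left is the one the standard text above permits.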

Resources