I'm currently a bit confused regarding the concept of information hiding of C-structs.
The background of this question is an embedded C project, and I have nearly zero knowledge of OOP.
Up until now I always declared my typedef structs inside the header file of the corresponding module.
So every module which wants to use this struct knows the struct type.
But after a MISRA-C check I discovered the medium severity warning: MISRAC2012-Dir-4.8
- The implementation of a structure is unnecessarily exposed to a translation unit.
After a bit of research I discovered the concept of information hiding of C-structs by limiting the visible access of the struct members to private scope.
I promptly tried a simple example which goes like this:
struct_test.h
//struct _structName;
typedef struct _structName structType_t;
struct_test.c
#include "struct_test.h"
typedef struct _structName
{
int varA;
int varB;
char varC;
}structType_t;
main.c
#include "struct_test.h"
structType_t myTest;
myTest.varA = 0;
myTest.varB = 1;
myTest.varC = 'c';
This yields the compiler error, that for main.c the size of myTest is unknown.
And of course it is, main.c has only knowledge that a struct of the type structType_t exists and nothing else.
So I continued my research and stumbled upon the concept of opaque pointers.
So I tried a second attempt:
struct_test.h
typedef struct _structName *myStruct_t;
struct_test.c
#include "struct_test.h"
typedef struct _structName
{
int varA;
int varB;
char varC;
}structType_t;
main.c
#include "struct_test.h"
myStruct_t myTest;
myTest->varA = 1;
And I get the compiler error: dereferencing pointer to incomplete type struct _structName
So obviously I haven't understood the basic concept of this technique.
My main point of confusion is where the data of the struct object will live.
Up until now I had the understanding that a pointer usually points to a "physical" representation of the datatype and reads/writes the content on the corresponding address.
But with the method above, I declare a pointer myTest but never set an address where it should point to.
I took the idea from this post:
What is an opaque pointer in C?
In the post it is mentioned that access is handled with set/get interface methods, so I tried adding a similar one like this:
void setVarA ( _structName *ptr, int valueA )
{
ptr->varA = valueA;
}
But this also doesn't work, because now the compiler tells me that _structName is unknown...
So can I only access the struct with the help of additional interface methods and, if yes, how can I achieve this in my simple example?
And my bigger question still remains where the object of my struct is located in memory.
I only know the pointer concept:
varA - Address: 10 - Value: 1
ptrA - Address: 22 - Value: 10
But in this example I only have
myTest - Address: xy - Value: ??
I have trouble understanding where the "physical" representation of the corresponding myTest pointer is located?
Furthermore I can not see the benefits of doing it like this in relatively small scope embedded projects where I am the producer and consumer of the modules.
Can someone explain to me whether this method is really reasonable for small to mid scale embedded projects with 1-2 developers working with the code?
Currently it seems like more effort to write all these interface pointer methods than to just declare the struct in my header file.
Thank you in advance
My main point of confusion is where the data of the struct object will live.
The point is that you do not use the struct representation (i.e. its size, fields, layout, etc.) in other translation units, but rather call functions that do the work for you. You need to use an opaque pointer for that, yes.
how can I achieve this in my simple example?
You have to put all the functions that use the struct fields (the real struct) in one file (the implementation). Then, in a header, expose only the interface (the functions that you want users to call, and those take an opaque pointer). Finally, users will use the header to call only those functions. They won't be able to call any other function and they won't be able to know what is inside the struct, so code trying to do that won't compile (that is the point!).
Furthermore I can not see the benefits of doing it like this in relatively small scope embedded projects where I am the producer and consumer of the modules.
It is a way to force modules to be independent of each other. Sometimes it is used to hide implementations from customers or to be able to guarantee ABI stability.
But yes, for internal usage, it is usually a burden (and hinders optimization, since everything becomes a black box to the compiler except if you use LTO etc.). A syntactic approach like public/private in other languages like C++ is way better for that.
However, if you are bound to follow MISRA to such a degree (i.e. if your project has to follow that rule, even if it is only advisory), there is not much you can do.
Can someone explain to me whether this method is really reasonable for small to mid scale embedded projects with 1-2 developers working with the code?
That is up to you. There are very big projects that do not follow that advice and are successful. Typically a comment for private fields, or a naming convention, is enough.
As you've deduced, when using an opaque type such as this the main source file can't access the members of the struct, and in fact doesn't know how big the struct is. Because of this, not only do you need accessor functions to read/write the fields of the struct, but you also need a function to allocate memory for the struct, since only the library source knows the definition and size of the struct.
So your header file would contain the following:
typedef struct _structName structType_t;
structType_t *init(void);
void setVarA(structType_t *ptr, int valueA );
int getVarA(structType_t *ptr);
void cleanup(structType_t *ptr);
This interface allows a user to create an instance of the struct, get and set values, and clean it up. The library source would look like this:
#include <stdlib.h> /* malloc, free */
#include "struct_test.h"
struct _structName
{
int varA;
int varB;
char varC;
};
structType_t *init(void)
{
return malloc(sizeof(structType_t ));
}
void setVarA(structType_t *ptr, int valueA )
{
ptr->varA = valueA;
}
int getVarA(structType_t *ptr)
{
return ptr->varA;
}
void cleanup(structType_t *ptr)
{
free(ptr);
}
Note that you only need to define the typedef once. This both defines the type alias and forward declares the struct. Then in the source file the actual struct definition appears without the typedef.
The init function is used by the caller to allocate space for the struct and return a pointer to it. That pointer can then be passed to the getter / setter functions.
So now your main code can use this interface like this:
#include <stdio.h> /* printf */
#include "struct_test.h"
int main(void)
{
structType_t *s = init();
setVarA(s, 5);
printf("s->a=%d\n", getVarA(s));
cleanup(s);
}
In the post it is mentioned that access is handled with set/get interface methods, so I tried adding a similar one like this:
void setVarA ( _structName *ptr, int valueA )
{
ptr->varA = valueA;
}
But this also doesn't work, because now the compiler tells me that _structName is unknown...
The type is not _structName, but struct _structName or (as defined) structType_t.
And my bigger question still remains where the object of my struct is located in memory.
With this technique, there would be a method which returns the address of such an opaque object. It could be statically or dynamically allocated. There should of course also be a method to free an object.
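As a rough sketch of the statically allocated variant (useful where malloc is off limits, as is common in MISRA projects), the module can hand out objects from a fixed pool. POOL_SIZE and the in_use flag are assumptions made for illustration, the tag is spelled structName because file-scope identifiers with a leading underscore are reserved, and header and source are shown together so the sketch compiles as one file.

```c
#include <stddef.h>
#include <stdbool.h>

/* --- what would live in struct_test.h --- */
typedef struct structName structType_t;
structType_t *init(void);
void setVarA(structType_t *ptr, int valueA);
int getVarA(const structType_t *ptr);
void cleanup(structType_t *ptr);

/* --- what would live in struct_test.c --- */
#define POOL_SIZE 4u   /* hypothetical: maximum number of live objects */

struct structName {
    int  varA;
    int  varB;
    char varC;
    bool in_use;       /* hypothetical bookkeeping flag */
};

static struct structName pool[POOL_SIZE];  /* zero-initialized static storage */

structType_t *init(void)
{
    for (size_t i = 0; i < POOL_SIZE; ++i) {
        if (!pool[i].in_use) {
            /* Reset the slot and mark it as taken. */
            pool[i] = (struct structName){ .in_use = true };
            return &pool[i];
        }
    }
    return NULL;  /* pool exhausted */
}

void setVarA(structType_t *ptr, int valueA) { ptr->varA = valueA; }

int getVarA(const structType_t *ptr) { return ptr->varA; }

void cleanup(structType_t *ptr)
{
    if (ptr != NULL) {
        ptr->in_use = false;  /* the slot can be handed out again */
    }
}
```

The caller still only ever holds a structType_t pointer; the "physical" object it points to is one element of the pool array inside the module.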
Furthermore I can not see the benefits of doing it like this in relatively small scope embedded projects where I am the producer and consumer of the modules.
I agree with you.
Related
I have been writing C for a decent amount of time, and obviously am aware that C does not have any support for explicit private and public fields within structs. However, I (believe) I have found a relatively clean method of implementing this without the use of any macros or voodoo, and I am looking to gain more insight into possible issues I may have overlooked.
The folder structure isn't all that important here but I'll list it anyway because it gives clarity as to the import names (and is also what CLion generates for me).
- example-project
- cmake-build-debug
- example-lib-name
- include
- example-lib-name
- example-header-file.h
- src
- example-lib-name
- example-source-file.c
- CMakeLists.txt
- CMakeLists.txt
- main.c
Let's say that example-header-file.h contains:
typedef struct ExampleStruct {
int data;
} ExampleStruct;
ExampleStruct* new_example_struct(int, double);
which just contains a definition for a struct and a function that returns a pointer to an ExampleStruct.
Obviously, now if I import ExampleStruct into another file, such as main.c, I will be able to create and return a pointer to an ExampleStruct by calling
ExampleStruct* new_struct = new_example_struct(<int>, <double>);,
and will be able to access the data property like: new_struct->data.
However, what if I also want private properties in this struct. For example, if I am creating a data structure, I don't want it to be easy to modify the internals of it. I.e. if I've implemented a vector struct with a length property that describes the current number of elements in the vector, I wouldn't want for people to just be able to change that value easily.
So, back to our example struct, let's assume we also want a double field in the struct, that describes some part of internal state that we want to make 'private'.
In our implementation file (example-source-file.c), let's say we have the following code:
#include <stdlib.h>
#include <stdbool.h>
typedef struct ExampleStruct {
int data;
double val;
} ExampleStruct;
ExampleStruct* new_example_struct(int data, double val) {
ExampleStruct* new_example_struct = malloc(sizeof(ExampleStruct));
new_example_struct->data=data;
new_example_struct->val=val;
return new_example_struct;
}
double get_val(ExampleStruct* e) {
return e->val;
}
This file simply implements that constructor method for getting a new pointer to an ExampleStruct that was defined in the header file. However, this file also defines its own version of ExampleStruct, that has a new member field not present in the header file's definition: double val, as well as a getter which gets that value. Now, if I import the same header file into main.c, which contains:
#include <stdio.h>
#include "example-lib-name/example-header-file.h"
int main() {
printf("Hello, World!\n");
ExampleStruct* test = new_example_struct(6, 7.2);
printf("%d\n", test->data); // <-- THIS WORKS
double x = get_val(test); // <-- THIS AND THE LINE BELOW ALSO WORK
printf("%f\n", x); //
// printf("%f\n", test->val); <-- WOULD THROW ERROR `val not present on struct!`
return 0;
}
I tested this a couple times with some different fields and have come to the conclusion that modifying this 'private' field, val, or even accessing it without the getter, would be very difficult without using pointer arithmetic dark magic, and that is the whole point.
Some things I see that may be cause for concern:
This may make code less readable in the eyes of some, but my IDE has arrow buttons that take me to and from the definition and the implementation, and even without that, a one line comment would provide more than enough documentation to point someone in the direction of where the file is.
Questions I'd like answers on:
Are there significant performance penalties I may suffer as a result of writing code this way?
Am I overlooking something that may make this whole ordeal pointless, i.e. is there a simpler way to do this or is this explicitly discouraged, and if so, what are the objective reasons behind it.
Aside: I am not trying to make C into C++, and generally favor the way C does things, but sometimes I really want some encapsulation of data.
Am I overlooking something that may make this whole ordeal pointless, i.e. is there a simpler way to do this or is this explicitly discouraged, and if so, what are the objective reasons behind it.
Yes: your approach produces undefined behavior.
C requires that
All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
(C17 6.2.7/2)
and that
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
[...]
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a
subaggregate or contained union), or
a character type.
(C17 6.5/7, a.k.a. the "Strict Aliasing Rule")
Your two definitions of struct ExampleStruct define incompatible types because they specify different numbers of members (see C17 6.2.7/1 for more details on structure type compatibility). You will definitely have problems if you pass instances by value between functions that rely on different ones of these incompatible definitions. You will have trouble if you construct arrays of them, whether dynamically, automatically, or statically, and attempt to use those across boundaries between TUs using one definition and those using another. You may have problems even if you do none of the above, because the compiler may behave unexpectedly, especially when optimizing. DO NOT DO THIS.
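To make the array problem concrete, here is a small sketch that puts the two layouts side by side. The names ExamplePublicView and ExampleRealLayout are hypothetical stand-ins, chosen only so that both definitions can coexist in one file; in the question both are called ExampleStruct.

```c
#include <stddef.h>

/* What main.c believes the struct looks like (from the header). */
struct ExamplePublicView {
    int data;
};

/* What example-source-file.c actually allocates and uses. */
struct ExampleRealLayout {
    int data;
    double val;
};

/* Byte distance between consecutive array elements under each view. */
size_t public_stride(void) { return sizeof(struct ExamplePublicView); }
size_t real_stride(void)   { return sizeof(struct ExampleRealLayout); }

/* Returns 1 when the two TUs would disagree about where element [1] starts. */
int views_disagree(void)
{
    return public_stride() != real_stride();
}
```

Code compiled against the header would step through an array using the smaller stride, while the implementation file uses the larger one, so the two sides would not even agree on where the second element begins.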
Other alternatives:
Opaque pointers. This means you do not provide any definition of struct ExampleStruct in those TUs where you want to hide any of its members. That does not prevent declaring and using pointers to such a structure, but it does prevent accessing any members, declaring new instances, or passing or receiving instances by value. Where member access is needed from TUs that do not have the structure definition, it would need to be mediated by accessor functions.
Just don't access the "private" members. Do not document them in the public documentation, and if you like, explicitly mark them (in code comments, for example) as reserved. This approach will be familiar to many C programmers, as it is used a lot for structures declared in POSIX system headers.
As long as the public has a complete definition for ExampleStruct, it can make code like:
ExampleStruct a = *new_example_struct(42, 1.234);
Then the below will certainly fail.
printf("%g\n", get_val(&a));
I recommend instead creating an opaque pointer and providing public accessor functions for the info in .data and .val.
Think of how we use FILE. FILE *f = fopen(...) and then fread(..., f), fseek(f, ...), ftell(f) and eventually fclose(f). I suggest this model instead. (Even if in some implementations FILE* is not opaque.)
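The FILE model can be sketched with nothing but standard library calls. The helper name roundtrip is hypothetical; the point is that the caller only ever passes the handle around and never touches a member of FILE.

```c
#include <stdio.h>
#include <string.h>

/* Writes msg to a temporary file and reads it back, using the FILE * handle
   only through library functions. Returns 0 on success, -1 on failure. */
int roundtrip(const char *msg, char *out, size_t out_size)
{
    FILE *f = tmpfile();               /* "constructor": the library allocates */
    if (f == NULL) {
        return -1;
    }
    int ok = (fputs(msg, f) != EOF)
          && (fseek(f, 0L, SEEK_SET) == 0)      /* reposition before reading */
          && (fgets(out, (int)out_size, f) != NULL);
    fclose(f);                         /* "destructor": the library frees */
    return ok ? 0 : -1;
}
```

An API for ExampleStruct could follow the same shape: one function that creates the object, functions that operate on the pointer, and one that releases it.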
Are there significant performance penalties I may suffer as a result of writing code this way?
Probably:
Heap allocation is expensive, and - today - usually not optimized away even when that is theoretically possible.
Dereferencing a pointer for member access is expensive; although this might get optimized away with link-time-optimization... if you're lucky.
i.e. is there a simpler way to do this
Well, you could use a slack array of the same size as your private fields, and then you wouldn't need to go through pointers all the time:
#define EXAMPLE_STRUCT_PRIVATE_DATA_SIZE sizeof(double)
typedef struct ExampleStruct {
int data;
_Alignas(max_align_t) unsigned char private_data[EXAMPLE_STRUCT_PRIVATE_DATA_SIZE];
} ExampleStruct;
This is basically a type-erasure of the private data without hiding the fact that it exists. Now, it's true that someone can overwrite the contents of this array, but it's kind of useless to do it intentionally when you "don't know" what the data means. Also, the private data in the "real" definition will need to have the same, maximal, _Alignas() as well (if you want the private data not to need to use _Alignas(), you will need to use the real alignment quantum for the type-erased version).
The above is C11. You can sort of do about the same thing by typedef'ing max_align_t yourself, then using an array of max_align_t elements for private data, with an appropriate length to cover the actual size of the private data.
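One possible variation on this idea, sketched under assumptions: instead of redefining ExampleStruct in the implementation file, the private fields live in a separate hypothetical struct example_private that is copied in and out of the byte array with memcpy, and a _Static_assert checks that the array is large enough. The function names example_init and example_val are made up for this sketch.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_PRIVATE_SIZE sizeof(double)

/* Public view: callers see that private bytes exist, not what they mean. */
typedef struct ExampleStruct {
    int data;
    _Alignas(max_align_t) unsigned char private_data[EXAMPLE_PRIVATE_SIZE];
} ExampleStruct;

/* Implementation-side layout of the private bytes (hypothetical). */
struct example_private {
    double val;
};

/* Fails to compile if the reserved space ever becomes too small. */
_Static_assert(sizeof(struct example_private) <= EXAMPLE_PRIVATE_SIZE,
               "private_data is too small for the private fields");

void example_init(ExampleStruct *e, int data, double val)
{
    struct example_private p = { .val = val };
    e->data = data;
    memcpy(e->private_data, &p, sizeof p);  /* copy in, no pointer punning */
}

double example_val(const ExampleStruct *e)
{
    struct example_private p;
    memcpy(&p, e->private_data, sizeof p);  /* copy out */
    return p.val;
}
```

Because the object has a complete type, callers can place it on the stack or in static storage, which an opaque pointer does not allow.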
An example of the use of such an approach can be found in CUDA's driver API:
Parameters for copying a 3D array: CUDA_MEMCPY3D vs
Parameters for copying a 3D array between two GPU devices: CUDA_MEMCPY3D_peer
The first structure has a pair of reserved void* fields, hiding the fact that it's really the second structure. They could have used an unsigned char array, but it so happens that the private fields are pointer-sized, and void* is also kind of opaque.
This causes undefined behaviour, as detailed in the other answers. The usual way around this is to make a nested struct.
In example.h, one defines the public-facing elements. struct example is not meant to be instantiated; in a sense, it is abstract. Only pointers obtained from one of its constructors (in this case, the only one) are valid.
struct example { int data; };
struct example *new_example(int, double);
double example_val(struct example *e);
and in example.c, instead of re-defining struct example, one has a nested struct private_example. (Such that they are related by composite aggregation.)
#include <stddef.h> /* offsetof */
#include <stdlib.h>
#include "example.h"
struct private_example {
struct example public;
double val;
};
struct example *new_example(int data, double val) {
struct private_example *const example = malloc(sizeof *example);
if(!example) return 0;
example->public.data = data;
example->val = val;
return &example->public;
}
/** This is a poor version of `container_of`. */
static struct private_example *example_upcast(struct example *example) {
return (struct private_example *)(void *)
((char *)example - offsetof(struct private_example, public));
}
double example_val(struct example *e) {
return example_upcast(e)->val;
}
Then one can use the object as in main.c. This is used frequently in Linux kernel code for container abstraction. Note that offsetof(struct private_example, public) is zero, ergo example_upcast does nothing and a cast is sufficient: ((struct private_example *)e)->val. If one builds structures in a way that always allows casting, one is limited to single inheritance.
I tried to do data encapsulation in C based on this post here https://alastairs-place.net/blog/2013/06/03/encapsulation-in-c/.
In a header file I have:
#ifndef FUNCTIONS_H
#define FUNCTIONS_H
// Pre-declaration of struct. Contains data that is hidden
typedef struct person *Person;
void getName(Person obj);
void getBirthYear(Person obj);
void getAge(Person obj);
void printFields(const Person obj);
#endif
In functions.c I have defined the structure like this:
#include "Functions.h"
enum { SIZE = 60 };
struct person
{
char name[SIZE];
int birthYear;
int age;
};
Plus, I have defined the functions as well.
In main.c I have:
#include "Functions.h"
#include <stdlib.h>
int main(void)
{
// Works because *Person makes new a pointer
Person new = malloc(sizeof new);
getName(new);
getAge(new);
getBirthYear(new);
printFields(new);
free(new);
return 0;
}
Is it true that when I use Person new, new is already a pointer because of typedef struct person *Person;?
How is it possible that the linker cannot see the body and members that I have declared in my struct person?
Is this only possible using a pointer?
Is the correct (and only) way to implement OOP principles in my case to make a different struct in functions.h like so:
typedef struct classPerson
{ // This data should be hidden
Person data;
void (*fPtrGetName)(Person obj);
void (*fPtrBirthYear)(Person obj);
void (*fPtrGetAge)(Person obj);
void (*fPtrPrintFields)(const Person obj);
} ClassPerson;
First of all, it is usually better to not hide pointers behind a typedef, but to let the caller use pointer types. This prevents all kinds of misunderstandings when reading and maintaining the code. For example void printFields(const Person obj); looks like nonsense if you don't realize that Person is a pointer type.
Have I understood correctly that when I use Person new, new is already a pointer because of typedef struct person *Person;?
Yes. You are confused because of the mentioned typedef.
How is it possible that the linker cannot see the body and members that I have declared in my struct person?
The linker can see everything that is linked, or you wouldn't end up with a working executable.
The compiler however, works on "translation units" (roughly means a .c file and all its included headers). When compiling the caller's translation unit, the compiler doesn't see functions.c, it only sees functions.h. And in functions.h, the struct declaration gives an incomplete type. Meaning "this struct definition is elsewhere".
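A compact way to see what an incomplete type allows is to put both halves into one file, as in the sketch below. The helper names handle_size, person_birth_year and sample_person are hypothetical and not part of the question's code.

```c
#include <stddef.h>

/* Everything a caller's translation unit sees: the type exists, nothing more. */
struct person;
int person_birth_year(const struct person *p);

/* Legal with an incomplete type: a pointer to it has a known size. */
size_t handle_size(void) { return sizeof(struct person *); }

/* Not legal up to here: sizeof(struct person), `struct person p;`, p->age. */

/* Normally in functions.c: from this point on the type is complete. */
struct person {
    char name[60];
    int birthYear;
    int age;
};

int person_birth_year(const struct person *p) { return p->birthYear; }

/* Only code that sees the definition can create an instance. */
const struct person *sample_person(void)
{
    static const struct person p = { "Ada", 1815, 36 };
    return &p;
}
```

In a real project the first part would sit in functions.h and the second part in functions.c, so the caller's translation unit never reaches the point where the type becomes complete.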
Is this only possible using a pointer?
Yes, it is the only way if you want to do proper OO programming in C. This concept is sometimes called opaque pointers or opaque type.
(Though you could also achieve "poor man's private encapsulation" through the static keyword. Which is usually not really recommended, since it wouldn't be thread-safe.)
Is the correct (and only) way to implement OOP principles in my case to make a different struct in functions.h like so:
Pretty much, yeah (apart from the nit-pick about the mentioned pointer typedef). Using function pointers to the public functions isn't necessary though, although that's how you implement polymorphism.
What your example lacks though is a "constructor" and "destructor". Without them the code wouldn't be meaningful. The malloc and free calls should be inside those, and not done by the caller.
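A minimal sketch of such a constructor/destructor pair might look as follows. The names person_create, person_destroy, person_name and person_birth_year are assumptions for illustration, the typedef deliberately does not hide the pointer, and header and source are shown in one block so that it compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

/* --- Functions.h (sketch) --- */
typedef struct person Person;   /* opaque; the typedef does not hide a pointer */
Person *person_create(const char *name, int birthYear);
void person_destroy(Person *p);
const char *person_name(const Person *p);
int person_birth_year(const Person *p);

/* --- functions.c (sketch) --- */
enum { SIZE = 60 };
struct person {
    char name[SIZE];
    int birthYear;
};

Person *person_create(const char *name, int birthYear)
{
    Person *p = malloc(sizeof *p);   /* the size is known only in this file */
    if (p == NULL) {
        return NULL;
    }
    strncpy(p->name, name, SIZE - 1);
    p->name[SIZE - 1] = '\0';        /* strncpy may not terminate the string */
    p->birthYear = birthYear;
    return p;
}

void person_destroy(Person *p) { free(p); }

const char *person_name(const Person *p) { return p->name; }

int person_birth_year(const Person *p) { return p->birthYear; }
```

With this shape, main.c never calls malloc(sizeof new), which in the question only allocates room for a pointer rather than for the struct.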
With or without typedef, in C you hide data by declaring incomplete types. In /usr/include/stdio.h, you'll find fread(3) takes a FILE * argument:
extern size_t fread (void *__restrict __ptr, size_t __size,
size_t __n, FILE *__restrict __stream) __wur;
and FILE is declared something like this:
struct _IO_FILE;
typedef struct _IO_FILE FILE;
Using stdio.h you cannot define a variable of type FILE, because type FILE is incomplete: it's declared, but not defined. But you can happily pass FILE * around, because all data pointers are the same size. You're just going to have to call fopen(3) to make it point to an open file.
To partially define a type, as in your case:
struct classPerson
{ // This data should be hidden
Person data;
void (*fPtrGetName)(Person obj);
...
};
is a little trickier. First of all, you should have a really good reason, namely that at least two implementations of fPtrGetName exist. Otherwise you're just building complexity on the altar of OOP.
A good example of a good reason is bind(2). You can bind a unix domain socket or a network socket, among others. Both types are represented by struct sockaddr, but that's just a stand-in type for struct sockaddr_un and struct sockaddr_in. Functions that take struct sockaddr depend on the fact that all such structures share a leading address-family member (sa_family in struct sockaddr, sun_family in struct sockaddr_un, sin_family in struct sockaddr_in), and branch accordingly. Et voila, polymorphism: one function, many types.
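The same "common leading member" trick can be sketched with made-up types instead of the socket structures. Everything here (shape, circle, rect, shape_area) is hypothetical, and unlike the sockaddr family the concrete types nest the generic struct as their first member, which is what makes the pointer conversions well defined.

```c
/* Generic stand-in type: only the leading tag is known to generic code. */
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };
struct shape { enum shape_kind kind; };

/* Concrete types start with the generic struct, so a pointer to either one
   may be converted to struct shape * and back. */
struct circle { struct shape base; double radius; };
struct rect   { struct shape base; double w, h; };

/* One function, many types: branch on the tag, then cast to the real type. */
double shape_area(const struct shape *s)
{
    switch (s->kind) {
    case SHAPE_CIRCLE: {
        const struct circle *c = (const struct circle *)s;
        return 3.141592653589793 * c->radius * c->radius;
    }
    case SHAPE_RECT: {
        const struct rect *r = (const struct rect *)s;
        return r->w * r->h;
    }
    }
    return 0.0;  /* unknown tag */
}
```

A caller would pass &my_rect.base or &my_circle.base, much as socket code passes a cast pointer to its concrete address structure.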
For an example of a struct full of function pointers, I recommend looking at SQLite. Its API is loaded with structures to isolate it from the OS and let the user define plug-ins.
BTW, if I may say so, fPtrGetName is a terrible name. It's not interesting that it's a function pointer and (controversy!) "get" is noise on a function that takes no arguments. Compare
struct classPerson sargent;
sargent.fPtrGetName();
sargent.name();
Which would you rather use? I reserve "get" (or similar) for I/O functions; at least then you're getting something, not just moving it from one pocket to another! For setting, in C++ I overload the function, so that get/set functions have the same name, but in C I wind up with e.g. set_name(const char name[]).
I have a C project that is designed to be portable to various (PC and embedded) platforms.
Application code will use various calls that will have platform-specific implementations, but share a common (generic) API to aid in portability. I'm trying to settle on the most appropriate way to declare the function prototypes and structures.
Here's what I've come up with so far:
main.c:
#include "generic.h"
int main (int argc, char *argv[]) {
int ret;
gen_t *data;
ret = foo(data);
...
}
generic.h: (platform-agnostic include)
typedef struct impl_t gen_t;
int foo (gen_t *data);
impl.h: (platform-specific declaration)
#include "generic.h"
typedef struct impl_t {
/* ... */
} gen_t;
impl.c: (platform-specific implementation)
int foo (gen_t *data) {
...
}
Build:
gcc -c -fPIC -o platform.o impl.c
gcc -o app main.c platform.o
Now, this appears to work... in that it compiles OK. However, I don't usually tag my structures since they're never accessed outside of the typedef'd alias. It's a small nit-pick, but I'm wondering if there's a way to achieve the same effect with anonymous structs?
I'm also asking for posterity, since I searched for a while and the closest answer I found was this: (Link)
In my case, that wouldn't be the right approach, as the application specifically shouldn't ever include the implementation headers directly -- the whole point is to decouple the program from the platform.
I see a couple of other less-than-ideal ways to resolve this, for example:
generic.h:
#ifdef PLATFORM_X
#include "platform_x/impl.h"
#endif
/* or */
int foo (struct impl_t *data);
Neither of these seems particularly appealing, and definitely not my style. While I don't want to swim upstream, I also don't want conflicting style when there might be a nicer way to implement exactly what I had in mind. So I think the typedef solution is on the right track, and it's just the struct tag baggage I'm left with.
Thoughts?
Your current technique is correct. Trying to use an anonymous (untagged) struct defeats what you're trying to do — you'd have to expose the details of definition of the struct everywhere, which means you no longer have an opaque data type.
In a comment, user3629249 said:
The order of the header file inclusions means there is a forward reference to the struct by the generic.h file; that is, before the struct is defined, it is used. It is unlikely this would compile.
This observation is incorrect for the headers shown in the question; it is accurate for the sample main() code (which I hadn't noticed until adding this response).
The key point is that the interface functions shown take or return pointers to the type gen_t, which in turn maps to a struct impl_t pointer. As long as the client code does not need to allocate space for the structure, or dereference a pointer to a structure to access a member of the structure, the client code does not need to know the details of the structure. It is sufficient to have the structure type declared as existing. You could use either of these to declare the existence of struct impl_t:
struct impl_t;
typedef struct impl_t gen_t;
The latter also introduces the alias gen_t for the type struct impl_t. See also Which part of the C standard allows this code to compile? and Does the C standard consider that there are one or two struct uperms entry types in this header?
The original main() program in the question was:
int main (int argc, char *argv[]) {
int ret;
gen_t data;
ret = foo(&data);
…
}
This code cannot be compiled with gen_t as an opaque (non-pointer) type. It would work OK with:
typedef struct impl_t *gen_t;
It would not compile with:
typedef struct impl_t gen_t;
because the compiler must know how big the structure is to allocate the correct space for data, but the compiler cannot know that size by definition of what an opaque type is. (See Is it a good idea to typedef pointers? for typedefing pointers to structures.)
Thus, the main() code should be more like:
#include "generic.h"
int main(int argc, char **argv)
{
gen_t *data = bar(argc, argv);
int ret = foo(data);
...
}
where (for this example) bar() is defined as extern gen_t *bar(int argc, char **argv);, so it returns a pointer to the opaque type gen_t.
Opinion is split over whether it is better to always use struct tagname or to use a typedef for the name. The Linux kernel is one substantial body of code that does not use the typedef mechanism; all structures are explicitly struct tagname. On the other hand, C++ does away with the need for the explicit typedef; writing:
struct impl_t;
in a C++ program means that the name impl_t is now the name of a type. Since opaque structure types require a tag (or you end up using void * for everything, which is bad for a whole legion of reasons, but the primary reason is that you lose all type safety using void *; remember, typedef introduces an alias for an underlying type, not a new distinct type), the way I code in C simulates C++:
typedef struct Generic Generic;
I avoid using the _t suffix on my types because POSIX reserves the _t for the implementation to use* (see also What does a type followed by _t represent?). You may be lucky and get away with it. I've worked on code bases where types like dec_t and loc_t were defined by the code base (which was not part of the implementation — where 'the implementation' means the C compiler and its supporting code, or the C library and its supporting code), and both those types caused pain for decades because some of the systems where the code was ported defined those types, as is the system's prerogative. One of the names I managed to get rid of; the other I didn't. 'Twas painful! If you must use _t (it is a convenient way to indicate that something is a type), I recommend using a distinctive prefix too: pqr_typename_t for some project pqr, for example.
* See the bottom line of the second table in The Name Space in the POSIX standard.
This is what I want to do:
1) I want a function that instantiates a data structure.
void instantiateCDB(void);
2) I also want a function that updates the data structure that is instantiated and returns a const pointer to the data structure (to make it read-only)
I know that this can be done in C++/Java. But can it also be done in C?
The program flow that I want to write is:
main(){
instantiateCDB(); // Allocates a CDB
const struct canDataBlock * cdb = getUpdateSystem();
}
// But the best function definitions that I can come up with is this.
struct canDataBlock * instantiateCDB() {
static struct canDataBlock cdb = {0};
return &cdb;
}
const struct canDataBlock * getUpdateSystem() {
struct canDataBlock * cdb = instantiateCDB();
return cdb;
}
The problem is: How do I access the data structure with write/read access instantiated in the instantiateCDB function if it would be declared void? If I am going to return the allocated data structure, the user can alter the canDataBlock thus losing its integrity. What I want to happen is only the getUpdateSystem() can change the values of the data structure instantiated by the instantiateCDB() function. How do I solve this problem? Is there another technique in C that I do not know about. If there is, please teach me. :)
OO in C can be partly simulated (the word "simulation" rather than "implementation" was chosen deliberately). For example, inheritance can be simulated by nesting the base struct as the first member of the "derived" struct. In that case it is possible to "upcast" a pointer to the derived struct to a pointer to the base struct, because C guarantees that a pointer to a struct, suitably converted, points to its first member.
Depending on the context, you may use an opaque data type to hide implementation details. To simulate private and public data, you can provide the full struct declaration, including all members, but still hide some members behind an opaque or void pointer.
struct myclass_privdata;
struct myclass { int data; /* public member */
struct myclass_privdata* data_priv; /* private simulation*/ };
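A possible way to fill in this pattern is sketched below. The field secret and the functions myclass_init, myclass_secret and myclass_deinit are hypothetical additions for illustration; only struct myclass and struct myclass_privdata come from the snippet above.

```c
#include <stdlib.h>

/* --- header (sketch) --- */
struct myclass_privdata;                 /* incomplete: layout stays hidden */
struct myclass {
    int data;                            /* public member */
    struct myclass_privdata *data_priv;  /* private simulation */
};
int myclass_init(struct myclass *obj, int data, int secret);
int myclass_secret(const struct myclass *obj);
void myclass_deinit(struct myclass *obj);

/* --- source (sketch) --- */
struct myclass_privdata {
    int secret;                          /* hypothetical private field */
};

int myclass_init(struct myclass *obj, int data, int secret)
{
    obj->data_priv = malloc(sizeof *obj->data_priv);
    if (obj->data_priv == NULL) {
        return -1;                       /* allocation failed */
    }
    obj->data = data;
    obj->data_priv->secret = secret;
    return 0;
}

int myclass_secret(const struct myclass *obj)
{
    return obj->data_priv->secret;
}

void myclass_deinit(struct myclass *obj)
{
    free(obj->data_priv);
    obj->data_priv = NULL;
}
```

Callers can declare a struct myclass themselves and read data directly, but they cannot look inside data_priv because its type is never completed in the header.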
You can also simulate member functions by using function pointers, but you still need to pass explicitly object reference (and initialize it too).
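struct S;
/* "do" is a reserved keyword in C, so the function needs another name. */
void do_something(struct S *this_p) { (void)this_p; }
struct S { void (*do_smth)(struct S *this_p); } s;
/* s is an object, not a pointer: use '.', and pass its address explicitly. */
s.do_smth = do_something;
s.do_smth(&s);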
You can probably find inspiration in the GLib library's object model. There is even a book about this topic.
However, the best advice on this topic is to stay with the language's style rather than trying to do things the language was not designed for (for example, implementing OOP in C often results in boilerplate code that adds little functional value: it makes the code look as if it were written in an OOP style when in fact it is not, and makes it look ugly to those who program in the "native" style).
Addressing your question.
1) A function returning a data instance is typically implemented as a function that returns a malloc()ed pointer
#include <stdlib.h>
struct S {int i; };
struct S* S_alloc(void) { return malloc(sizeof(struct S)); }
Your code, which returns a pointer to static data, is almost certainly trouble, as all references will refer to the same object. And judging by the void instantiateCDB(void); declaration, I doubt it really does what you want.
2) I doubt that you really "want a function that updates the data structure that is instantiated and returns a const pointer to the data structure (to make it read-only)", because the caller will still have the non-const pointer that was just passed to such a function.
There's a pattern called opaque data pointer, it might help you here. See http://en.wikipedia.org/wiki/Opaque_data_type.
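Applied to the question, the opaque pattern might look like the sketch below. The single field updateCount and the getter cdb_getUpdateCount() are placeholders invented for illustration, since the real layout of canDataBlock is not shown here. In a real project the first part would live in the header and the rest in the module's .c file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* --- what a header would expose: an incomplete type plus functions --- */
struct canDataBlock;
const struct canDataBlock *getUpdateSystem(void);
uint32_t cdb_getUpdateCount(const struct canDataBlock *b);

/* --- what only the .c file would contain --- */
struct canDataBlock {
    uint32_t updateCount;   /* placeholder field; the real layout is not shown */
};

static struct canDataBlock cdb;   /* the single, file-private instance */

/* The only function that writes to the block. */
const struct canDataBlock *getUpdateSystem(void)
{
    cdb.updateCount++;      /* stand-in for the real update logic */
    return &cdb;
}

/* Read access goes through a getter, because callers cannot see the members. */
uint32_t cdb_getUpdateCount(const struct canDataBlock *b)
{
    assert(b != NULL);
    return b->updateCount;
}
```

Because the struct is static, no allocation function is needed at all, and callers that include only the header can neither dereference the pointer nor write through it.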
Assuming I have to use C (no C++ or object oriented compilers) and I don't have dynamic memory allocation, what are some techniques I can use to implement a class, or a good approximation of a class? Is it always a good idea to isolate the "class" to a separate file? Assume that we can preallocate the memory by assuming a fixed number of instances, or even defining the reference to each object as a constant before compile time. Feel free to make assumptions about which OOP concept I will need to implement (it will vary) and suggest the best method for each.
Restrictions:
I have to use C and not an OOP language because I'm writing code for an embedded system, and the compiler and preexisting code base are in C.
There is no dynamic memory allocation because we don't have enough memory to reasonably assume we won't run out if we start dynamically allocating it.
The compilers we work with have no problems with function pointers
That depends on the exact "object-oriented" feature-set you want to have. If you need stuff like overloading and/or virtual methods, you probably need to include function pointers in structures:
typedef struct ShapeClass ShapeClass; /* forward typedef so the struct can refer to itself */
struct ShapeClass {
float (*computeArea)(const ShapeClass *shape);
};
float shape_computeArea(const ShapeClass *shape)
{
return shape->computeArea(shape);
}
This would let you implement a class, by "inheriting" the base class, and implementing a suitable function:
typedef struct {
ShapeClass shape;
float width, height;
} RectangleClass;
static float rectangle_computeArea(const ShapeClass *shape)
{
const RectangleClass *rect = (const RectangleClass *) shape;
return rect->width * rect->height;
}
This of course requires you to also implement a constructor, that makes sure the function pointer is properly set up. Normally you'd dynamically allocate memory for the instance, but you can let the caller do that, too:
void rectangle_new(RectangleClass *rect)
{
rect->width = rect->height = 0.f;
rect->shape.computeArea = rectangle_computeArea;
}
If you want several different constructors, you will have to "decorate" the function names, since you can't have more than one rectangle_new() function:
void rectangle_new_with_lengths(RectangleClass *rect, float width, float height)
{
rectangle_new(rect);
rect->width = width;
rect->height = height;
}
Here's a basic example showing usage:
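#include <stdio.h>
int main(void)
{
RectangleClass r1;
rectangle_new_with_lengths(&r1, 4.f, 5.f);
/* pass the embedded base, which has the type shape_computeArea() expects */
printf("rectangle r1's area is %f units square\n", shape_computeArea(&r1.shape));
return 0;
}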
I hope this gives you some ideas, at least. For a successful and rich object-oriented framework in C, look into glib's GObject library.
Also note that there's no explicit "class" being modelled above: each object has its own method pointers, which is a bit more flexible than you'd typically find in C++. It also costs memory. You could get away from that by stuffing the method pointers in a class structure and inventing a way for each object instance to reference a class.
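One way to do that is sketched below, under the assumption that a single table of function pointers per class is enough. The names ShapeVtbl, Shape, Rectangle and the functions are invented for this sketch and differ from the ShapeClass code above.

```c
#include <assert.h>
#include <stddef.h>

struct Shape;                       /* forward declaration */

/* One "class" record shared by every instance of that class. */
struct ShapeVtbl {
    float (*computeArea)(const struct Shape *self);
};

/* Each instance carries a single pointer to its class record. */
struct Shape {
    const struct ShapeVtbl *vtbl;
};

struct Rectangle {
    struct Shape shape;             /* base must be the first member */
    float width, height;
};

static float rectangle_area(const struct Shape *self)
{
    const struct Rectangle *r = (const struct Rectangle *)self;
    return r->width * r->height;
}

/* const, so a linker can typically place it in read-only memory. */
static const struct ShapeVtbl rectangle_vtbl = { rectangle_area };

void rectangle_init(struct Rectangle *r, float w, float h)
{
    assert(r != NULL);
    r->shape.vtbl = &rectangle_vtbl;
    r->width = w;
    r->height = h;
}

/* Dispatch through the class record. */
float shape_area(const struct Shape *s)
{
    assert(s != NULL && s->vtbl != NULL);
    return s->vtbl->computeArea(s);
}
```

Each object now costs one pointer regardless of how many methods the class has, at the price of one extra indirection per call.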
I had to do it once too for homework. I followed this approach:
Define your data members in a struct.
Define your function members that take a pointer to your struct as first argument.
Do these in one header & one c. Header for struct definition & function declarations, c for implementations.
A simple example would be this:
/// Queue.h
struct Queue
{
/// members
};
typedef struct Queue Queue;
void push(Queue* q, int element);
void pop(Queue* q);
// etc.
///
If you only want one class, use an array of structs as the "objects" data and pass pointers to them to the "member" functions. You can use typedef struct _whatever Whatever before declaring struct _whatever to hide the implementation from client code. There's no difference between such an "object" and the C standard library FILE object.
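A minimal sketch of that idea without dynamic allocation, assuming a fixed pool size. The Counter type, its functions and MAX_COUNTERS are hypothetical names. In a real project the first part would go in the header and the rest in the .c file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* --- header part: clients see only an incomplete type --- */
typedef struct Counter Counter;
Counter *counter_create(int start);
void counter_increment(Counter *c);
int counter_value(const Counter *c);

/* --- implementation part --- */
#define MAX_COUNTERS 4              /* fixed number of instances, chosen arbitrarily */

struct Counter {
    bool in_use;
    int value;
};

static Counter pool[MAX_COUNTERS];  /* all object storage, statically allocated */

/* Hands out the first free slot, or NULL when the pool is exhausted. */
Counter *counter_create(int start)
{
    for (size_t i = 0; i < MAX_COUNTERS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            pool[i].value = start;
            return &pool[i];
        }
    }
    return NULL;
}

void counter_increment(Counter *c)
{
    assert(c != NULL);
    c->value++;
}

int counter_value(const Counter *c)
{
    assert(c != NULL);
    return c->value;
}
```

Client code holds Counter pointers exactly as it would hold a FILE pointer: it can pass them around but cannot touch the members.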
If you want more than one class with inheritance and virtual functions, then it's common to have pointers to the functions as members of the struct, or a shared pointer to a table of virtual functions. The GObject library uses both this and the typedef trick, and is widely used.
There's also a book on techniques for this available online - Object Oriented Programming with ANSI C.
C Interfaces and Implementations: Techniques for Creating Reusable Software, David R. Hanson
http://www.informit.com/store/product.aspx?isbn=0201498413
This book does an excellent job of covering your question. It's in the Addison Wesley Professional Computing series.
The basic paradigm is something like this:
/* for data structure foo */
FOO *myfoo;
myfoo = foo_create(...);
foo_something(myfoo, ...);
myfoo = foo_append(myfoo, ...);
foo_delete(myfoo);
You can take a look at GObject. It's an open-source library that gives you a verbose way to build objects.
http://library.gnome.org/devel/gobject/stable/
I will give a simple example of how OOP should be done in C. I realize this thread is from 2009 but would like to add this anyway.
/// Object.h
typedef struct Object {
uuid_t uuid;
} Object;
int Object_init(Object *self);
uuid_t Object_get_uuid(Object *self);
int Object_clean(Object *self);
/// Person.h
typedef struct Person {
Object obj;
char *name;
} Person;
int Person_init(Person *self, char *name);
int Person_greet(Person *self);
int Person_clean(Person *self);
/// Object.c
#include "Object.h" // match the header's file name exactly; some file systems are case-sensitive
int Object_init(Object *self)
{
self->uuid = uuid_new();
return 0;
}
uuid_t Object_get_uuid(Object *self)
{ // Don't actually create getters in C...
return self->uuid;
}
int Object_clean(Object *self)
{
uuid_free(self->uuid);
return 0;
}
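/// Person.c
#include <stdio.h>  // printf
#include <stdlib.h> // free
#include <string.h> // strdup (POSIX; part of standard C only since C23)
#include "Person.h"
int Person_init(Person *self, char *name)
{
Object_init(&self->obj); // Or equivalently Object_init((Object *)self);
self->name = strdup(name);
return 0;
}
int Person_greet(Person *self)
{
printf("Hello, %s", self->name);
return 0;
}
int Person_clean(Person *self)
{
free(self->name);
Object_clean(&self->obj); // pass the embedded base, not the Person pointer
return 0;
}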
/// main.c
#include "Person.h"
int main(void)
{
Person p;
Person_init(&p, "John");
Person_greet(&p);
Object_get_uuid(&p.obj); // Inherited function, called on the embedded base
Person_clean(&p);
return 0;
}
The basic concept involves placing the 'inherited class' at the top of the struct. C guarantees that there is no padding before the first member, so a pointer to the struct, once cast, is also a valid pointer to the 'inherited class' at the same address. The 'inherited class' functions can then access the 'inherited values' in the same way they would access their own members.
This and some naming conventions for constructors, destructors, allocation, and deallocation functions (I recommend _init, _clean, _new, and _free) will get you a long way.
As for virtual functions, use function pointers in the struct, possibly with a Class_func(...); wrapper too.
As for (simple) templates, add a size_t parameter to determine size, require a void* pointer, or require a 'class' type with just the functionality you care about. (e.g. int GetUUID(Object *self); GetUUID(&p);)
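As an illustration of the size_t plus void* approach, here is a small type-agnostic swap. The function name and the SWAP_MAX_SIZE limit are assumptions made for this sketch, with the fixed scratch buffer standing in for dynamic allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Upper bound on element size (an arbitrary limit chosen for this sketch). */
#define SWAP_MAX_SIZE 64

/* Swap two objects of any type; `size` is the sizeof of one element. */
void generic_swap(void *a, void *b, size_t size)
{
    unsigned char tmp[SWAP_MAX_SIZE];

    assert(a != NULL && b != NULL && size <= SWAP_MAX_SIZE);
    memcpy(tmp, a, size);
    memmove(a, b, size);    /* memmove stays defined even if a == b */
    memcpy(b, tmp, size);
}
```

The caller supplies the type information the compiler would otherwise provide, for example generic_swap(&x, &y, sizeof x). Nothing checks that both arguments really have the same type, which is the usual price of this technique.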
Use a struct to simulate the data members of a class. In terms of method scope you can simulate private methods by placing the private function prototypes in the .c file and the public functions in the .h file.
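A short sketch of that split, shown as one listing. The Thermostat type, its functions and the 5 to 30 degree range are invented for illustration, and the comments mark which part would go in the .h file and which in the .c file.

```c
#include <assert.h>
#include <stddef.h>

/* --- thermostat.h (hypothetical): the public interface --- */
struct Thermostat { int target_c; };
void thermostat_set_target(struct Thermostat *t, int celsius);

/* --- thermostat.c (hypothetical) --- */

/* "Private method": `static` gives it internal linkage, so no other
   source file can call it. */
static int clamp_target(int celsius)
{
    if (celsius < 5)  return 5;
    if (celsius > 30) return 30;
    return celsius;
}

/* "Public method": declared in the header, so callable from anywhere. */
void thermostat_set_target(struct Thermostat *t, int celsius)
{
    assert(t != NULL);
    t->target_c = clamp_target(celsius);
}
```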
GTK is built entirely on C and it uses many OOP concepts. I have read through the source code of GTK and it is pretty impressive, and definitely easier to read. The basic concept is that each "class" is simply a struct and its associated static functions. The static functions all accept the "instance" struct as a parameter, do whatever they need, and return results if necessary. For example, you may have a function "GetPosition(CircleStruct obj)". The function would simply dig through the struct, extract the position numbers, probably build a new PositionStruct object, stick the x and y in the new PositionStruct, and return it. GTK even implements inheritance this way by embedding structs inside structs. Pretty clever.
#include <stdio.h>
/**
* Define Shape class
*/
typedef struct Shape Shape;
struct Shape {
/**
* Variables header...
*/
double width, height;
/**
* Functions header...
*/
double (*area)(Shape *shape);
};
/**
* Functions
*/
double calc(Shape *shape) {
return shape->width * shape->height;
}
/**
* Constructor (renamed from _Shape: identifiers that start with an
* underscore and an uppercase letter are reserved in C)
*/
Shape Shape_new(void) {
Shape s;
s.width = 1;
s.height = 1;
s.area = calc;
return s;
}
/********************************************/
int main(void) {
Shape s1 = Shape_new();
s1.width = 5.35;
s1.height = 12.5462;
printf("Hello World\n\n");
printf("User.width = %f\n", s1.width);
printf("User.height = %f\n", s1.height);
printf("User.area = %f\n\n", s1.area(&s1));
printf("Made with \xe2\x99\xa5 \n");
return 0;
}
In your case a good approximation of a class could be an ADT (abstract data type). But it still won't be the same.
My strategy is:
Define all code for the class in a separate file
Define all interfaces for the class in a separate header file
All member functions take a "ClassHandle" which stands in for the instance name (instead of o.foo(), call foo(oHandle))
The constructor is replaced with a function void ClassInit(ClassHandle h, int x, int y,...) OR ClassHandle ClassInit(int x, int y,...) depending on the memory allocation strategy
All member variables are stored as members of a static struct in the class file, encapsulating them in the file and preventing outside files from accessing them
The objects are stored in an array of the static struct above, with predefined handles (visible in the interface) or a fixed limit of objects that can be instantiated
If useful, the class can contain public functions that will loop through the array and call the functions of all the instantiated objects (RunAll() calls each Run(oHandle))
A Deinit(ClassHandle h) function frees the allocated memory (array index) in the dynamic allocation strategy
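The strategy above can be sketched roughly as follows, using the variant where the constructor returns the handle. The Motor "class", its fields and the limit of three instances are placeholders for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* --- interface: clients only ever hold a handle (an array index) --- */
typedef int MotorHandle;
#define MOTOR_INVALID (-1)
#define MOTOR_MAX 3                 /* fixed limit of instances, arbitrary here */

/* --- implementation: the state lives in a file-private array --- */
static struct {
    bool used;
    int speed;
    int ticks;
} motors[MOTOR_MAX];

static bool valid(MotorHandle h)
{
    return h >= 0 && h < MOTOR_MAX && motors[h].used;
}

MotorHandle MotorInit(int speed)    /* "constructor": claims a free slot */
{
    for (int i = 0; i < MOTOR_MAX; i++) {
        if (!motors[i].used) {
            motors[i].used = true;
            motors[i].speed = speed;
            motors[i].ticks = 0;
            return i;
        }
    }
    return MOTOR_INVALID;           /* every slot is taken */
}

void MotorDeinit(MotorHandle h)     /* "destructor": releases the slot */
{
    if (valid(h)) {
        motors[h].used = false;
    }
}

void MotorRun(MotorHandle h)
{
    assert(valid(h));
    motors[h].ticks += motors[h].speed;
}

int MotorTicks(MotorHandle h)
{
    assert(valid(h));
    return motors[h].ticks;
}

void MotorRunAll(void)              /* calls Run on every live instance */
{
    for (int i = 0; i < MOTOR_MAX; i++) {
        if (motors[i].used) {
            MotorRun(i);
        }
    }
}
```

One design consequence worth noting: a released handle can be handed out again, so a stale handle kept by old client code silently refers to the new object.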
Does anyone see any problems, holes, potential pitfalls or hidden benefits/drawbacks to either variation of this approach? If I am reinventing a design method (and I assume I must be), can you point me to the name of it?
Also see this answer and this one
It is possible. It always seems like a good idea at the time, but afterwards it becomes a maintenance nightmare. Your code becomes littered with pieces of code tying everything together. A new programmer will have lots of problems reading and understanding the code if you use function pointers, since it will not be obvious which functions are called.
Data hiding with get/set functions is easy to implement in C, but stop there. I have seen multiple attempts at this in the embedded environment and in the end it is always a maintenance problem.
Since you already have maintenance issues, I would steer clear.
My approach would be to move the struct and all primarily-associated functions to a separate source file(s) so that it can be used "portably".
Depending on your compiler, you might be able to include functions into the struct, but that's a very compiler-specific extension, and has nothing to do with the last version of the standard I routinely used :)
The first C++ compiler was actually a preprocessor that translated the C++ code into C.
So it's very possible to have classes in C.
You might try and dig up an old C++ preprocessor and see what kind of solutions it creates.
Do you want virtual methods?
If not then you just define a set of function pointers in the struct itself. If you assign all the function pointers to standard C functions then you will be able to call functions from C in very similar syntax to how you would under C++.
If you want to have virtual methods it gets more complicated. Basically you will need to implement your own VTable for each struct and assign function pointers to the VTable depending on which function is called. You would then need a set of function pointers in the struct itself that in turn call the function pointer in the VTable. This is, essentially, what C++ does.
TBH though ... if you want the latter then you are probably better off just finding a C++ compiler you can use and re-compiling the project. I have never understood the obsession with C++ not being usable in embedded. I've used it many a time and it works, is fast, and doesn't have memory problems. Sure, you have to be a bit more careful about what you do, but it's really not that complicated.
C isn't an OOP language, as you rightly point out, so there's no built-in way to write a true class. Your best bet is to look at structs and function pointers; these will let you build an approximation of a class. However, as C is procedural you might want to consider writing more C-like code (i.e. without trying to use classes).
Also, if you can use C, you can probably use C++ and get classes.