"Private" struct members in C with const

To keep code clean, some OO concepts can be useful, even in C.
I often write modules made of a pair of .h and .c files. The problem is that the user of the module has to be careful, since private members don't exist in C. Using the pimpl idiom or abstract data types is OK, but it adds code and/or files and makes the code heavier. I hate using an accessor when I don't need one.
Here is an idea that makes the compiler complain about invalid access to "private" members, with only a little extra code. The idea is to define the same structure twice, but with some extra 'const' added for the user of the module.
Of course, writing to "private" members is still possible with a cast. But the point is only to avoid mistakes by the user of the module, not to protect memory safely.
/*** 2DPoint.h module interface ***/
#ifndef H_2D_POINT
#define H_2D_POINT
/* 2D_POINT_IMPL needs to be defined in implementation files before #include */
#ifdef 2D_POINT_IMPL
#define _cst_
#else
#define _cst_ const
#endif
typedef struct 2DPoint
{
/* public members: read and write for user */
int x;
/* private members: read only for user */
_cst_ int y;
} 2DPoint;
2DPoint *new_2dPoint(void);
void delete_2dPoint(2DPoint **pt);
void set_y(2DPoint *pt, int newVal);
#endif /* H_2D_POINT */
/*** 2dPoint.c module implementation ***/
#define 2D_POINT_IMPL
#include "2dPoint.h"
#include <stdlib.h>
#include <string.h>
2DPoint *new_2dPoint(void)
{
2DPoint *pt = malloc(sizeof(2DPoint));
if (pt == NULL)
return NULL; /* allocation failed */
pt->x = 42;
pt->y = 666;
return pt;
}
void delete_2dPoint(2DPoint **pt)
{
free(*pt);
*pt = NULL;
}
void set_y(2DPoint *pt, int newVal)
{
pt->y = newVal;
}
/*** main.c user's file ***/
#include "2dPoint.h"
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
2DPoint *pt = new_2dPoint();
pt->x = 10; /* ok */
pt->y = 20; /* Invalid access, y is "private" */
set_y(pt, 30); /* accessor needed */
printf("pt.x = %d, pt.y = %d\n", pt->x, pt->y); /* no accessor needed for reading "private" members */
delete_2dPoint(&pt);
return EXIT_SUCCESS;
}
And now, here is the question: is this trick OK with the C standard?
It works fine with GCC, and the compiler doesn't complain about anything, even with some strict flags, but how can I be sure that this is really OK?

This is almost certainly undefined behavior.
Writing/modifying an object declared as const is prohibited and doing so results in UB. Furthermore, the approach you take re-declares struct 2DPoint as two technically different types, which is also not permitted.
Note that this (like undefined behavior in general) does not mean that it "certainly won't work" or "it must crash". In fact, I find it quite logical that it works: anyone who reads the source intelligently can easily work out what its purpose is and why it might be regarded as correct. However, the compiler is not intelligent. At best, it's a finite automaton with no knowledge of what the code is supposed to do; it only obeys (more or less) the syntactic and semantic rules of the language.

This violates C 2011 6.2.7 1.
6.2.7 1 requires that two definitions of the same structure in different translation units have compatible type. It is not permitted to have const in one and not the other.
In one module, you may have a reference to one of these objects, and the members appear to be const to the compiler. When the compiler writes calls to functions in other modules, it may hold values from the const members in registers or other cache or in partially or fully evaluated expressions from later in the source code than the function call. Then, when the function modifies the member and returns, the original module will not have the changed value. Worse, it may use some combination of the changed value and the old value.
This is highly improper programming.

In Bjarne Stroustrup's words: C is not designed to support OOP, although it enables OOP, which means it is possible to write OOP programs in C, but it is very hard to do so. As such, if you have to write OOP code in C, there seems nothing wrong with using this approach, but it is preferable to use a language better suited for the purpose.
By trying to write OOP code in C, you have already entered a territory where "common sense" has to be overridden, so this approach is fine as long as you take responsibility for using it properly. You also need to ensure that it is thoroughly and rigorously documented and that everyone concerned with the code is aware of it.
Edit Oh, you may have to use a cast to get around the const. I fail to recall if the C-style cast can be used like C++ const_cast.

You can use a different approach: declare two structs, one for the user without private members (in the header) and one with private members for internal use in your implementation unit. All private members should be placed after the public ones.
You always pass around the pointer to the struct and cast it to the internal-use type when needed, like this:
/* user code */
struct foo {
int public;
};
struct foo *new_foo(void); /* provided by the implementation */
int bar(void) {
struct foo *foo = new_foo();
foo->public = 10;
return foo->public;
}
/* implementation */
struct foo_internal {
int public;
int private;
};
struct foo *new_foo(void) {
struct foo_internal *foo = malloc(sizeof(*foo));
foo->public = 1;
foo->private = 2;
return (struct foo*)foo; // to suppress warning
}
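C11 allows anonymous structure members, but only for a struct specifier without a tag. Embedding a previously defined, tagged struct as an unnamed member is an extension (GCC has supported it for some time and accepts it with -fms-extensions), so if you can rely on that extension you can declare the internal structure as: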
struct foo_internal {
struct foo;
int private;
};
so no extra effort is required to keep the structure definitions in sync.

Related

Using different struct definitions to simulate public and private fields in C

I have been writing C for a decent amount of time, and obviously am aware that C does not have any support for explicit private and public fields within structs. However, I believe I have found a relatively clean method of implementing this without the use of any macros or voodoo, and I am looking to gain more insight into possible issues I may have overlooked.
The folder structure isn't all that important here but I'll list it anyway because it gives clarity as to the import names (and is also what CLion generates for me).
- example-project
  - cmake-build-debug
  - example-lib-name
    - include
      - example-lib-name
        - example-header-file.h
    - src
      - example-lib-name
        - example-source-file.c
    - CMakeLists.txt
  - CMakeLists.txt
  - main.c
Let's say that example-header-file.h contains:
typedef struct ExampleStruct {
int data;
} ExampleStruct;
ExampleStruct* new_example_struct(int, double);
which just contains a definition for a struct and a function that returns a pointer to an ExampleStruct.
Obviously, if I now include this header in another file, such as main.c, I can create an ExampleStruct and get a pointer to it by calling
ExampleStruct* new_struct = new_example_struct(<int>, <double>);
and can access the data property like this: new_struct->data.
However, what if I also want private properties in this struct? For example, if I am creating a data structure, I don't want it to be easy to modify its internals. E.g. if I've implemented a vector struct with a length property that describes the current number of elements in the vector, I wouldn't want people to be able to change that value easily.
So, back to our example struct, let's assume we also want a double field in the struct, that describes some part of internal state that we want to make 'private'.
In our implementation file (example-source-file.c), let's say we have the following code:
#include <stdlib.h>
#include <stdbool.h>
typedef struct ExampleStruct {
int data;
double val;
} ExampleStruct;
ExampleStruct* new_example_struct(int data, double val) {
ExampleStruct* example_struct = malloc(sizeof(ExampleStruct));
example_struct->data=data;
example_struct->val=val;
return example_struct;
}
double get_val(ExampleStruct* e) {
return e->val;
}
This file simply implements that constructor method for getting a new pointer to an ExampleStruct that was defined in the header file. However, this file also defines its own version of ExampleStruct, that has a new member field not present in the header file's definition: double val, as well as a getter which gets that value. Now, if I import the same header file into main.c, which contains:
#include <stdio.h>
#include "example-lib-name/example-header-file.h"
int main() {
printf("Hello, World!\n");
ExampleStruct* test = new_example_struct(6, 7.2);
printf("%d\n", test->data); // <-- THIS WORKS
double x = get_val(test); // <-- THIS AND THE LINE BELOW ALSO WORK
printf("%f\n", x); //
// printf("%f\n", test->val); <-- WOULD THROW ERROR `val not present on struct!`
return 0;
}
I tested this a couple times with some different fields and have come to the conclusion that modifying this 'private' field, val, or even accessing it without the getter, would be very difficult without using pointer arithmetic dark magic, and that is the whole point.
Some things I see that may be cause for concern:
This may make code less readable in the eyes of some, but my IDE has arrow buttons that take me to and from the definition and the implementation, and even without that, a one line comment would provide more than enough documentation to point someone in the direction of where the file is.
Questions I'd like answers on:
Are there significant performance penalties I may suffer as a result of writing code this way?
Am I overlooking something that may make this whole ordeal pointless, i.e. is there a simpler way to do this or is this explicitly discouraged, and if so, what are the objective reasons behind it.
Aside: I am not trying to make C into C++, and generally favor the way C does things, but sometimes I really want some encapsulation of data.

Am I overlooking something that may make this whole ordeal pointless, i.e. is there a simpler way to do this or is this explicitly discouraged, and if so, what are the objective reasons behind it.
Yes: your approach produces undefined behavior.
C requires that
All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
(C17 6.2.7/2)
and that
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
[...]
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
(C17 6.5/7, a.k.a. the "Strict Aliasing Rule")
Your two definitions of struct ExampleStruct define incompatible types because they specify different numbers of members (see C17 6.2.7/1 for more details on structure type compatibility). You will definitely have problems if you pass instances by value between functions that rely on different ones of these incompatible definitions. You will have trouble if you construct arrays of them, whether dynamically, automatically, or statically, and attempt to use those across boundaries between TUs (translation units) using one definition and those using another. You may have problems even if you do none of the above, because the compiler may behave unexpectedly, especially when optimizing. DO NOT DO THIS.
Other alternatives:
Opaque pointers. This means you do not provide any definition of struct ExampleStruct in those TUs where you want to hide any of its members. That does not prevent declaring and using pointers to such a structure, but it does prevent accessing any members, declaring new instances, or passing or receiving instances by value. Where member access is needed from TUs that do not have the structure definition, it would need to be mediated by accessor functions.
Just don't access the "private" members. Do not document them in the public documentation, and if you like, explicitly mark them (in code comments, for example) as reserved. This approach will be familiar to many C programmers, as it is used a lot for structures declared in POSIX system headers.
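As a rough sketch of the opaque-pointer alternative: the names below (Example, example_new, example_free, example_data, example_val) are hypothetical and not taken from the question, and the header part and the implementation part are shown in one listing only to keep the example self-contained.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what the public header would contain (hypothetical API) ---- */
typedef struct Example Example;   /* incomplete type: members are invisible */

Example *example_new(int data, double val);
void     example_free(Example *e);
int      example_data(const Example *e);
double   example_val(const Example *e);

/* ---- what only the implementation file would contain ---- */
struct Example {
    int    data;
    double val;   /* "private": no other file can name this member */
};

Example *example_new(int data, double val)
{
    Example *e = malloc(sizeof *e);
    if (e != NULL) {
        e->data = data;
        e->val  = val;
    }
    return e;   /* NULL when allocation fails */
}

void example_free(Example *e)
{
    free(e);
}

int example_data(const Example *e)
{
    assert(e != NULL);
    return e->data;
}

double example_val(const Example *e)
{
    assert(e != NULL);
    return e->val;
}
```

With only the header part visible, a caller can hold and pass an Example *, but writing e->val, declaring an Example by value, or taking sizeof(Example) does not compile, which is the protection the question was after.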

As long as the public has a complete definition for ExampleStruct, it can make code like:
ExampleStruct a = *new_example_struct(42, 1.234);
Then the below will certainly fail.
printf("%g\n", get_val(&a));
I recommend instead creating an opaque pointer and providing public accessor functions for the information in .data and .val.
Think of how we use FILE. FILE *f = fopen(...) and then fread(..., f), fseek(f, ...), ftell(f) and eventually fclose(f). I suggest this model instead. (Even if in some implementations FILE* is not opaque.)

Are there significant performance penalties I may suffer as a result of writing code this way?
Probably:
Heap allocation is expensive, and - today - usually not optimized away even when that is theoretically possible.
Dereferencing a pointer for member access is expensive; although this might get optimized away with link-time-optimization... if you're lucky.
i.e. is there a simpler way to do this
Well, you could use a slack array of the same size as your private fields, and then you wouldn't need to go through pointers all the time:
#define EXAMPLE_STRUCT_PRIVATE_DATA_SIZE sizeof(double)
typedef struct ExampleStruct {
int data;
_Alignas(max_align_t) unsigned char private_data[EXAMPLE_STRUCT_PRIVATE_DATA_SIZE];
} ExampleStruct;
This is basically a type-erasure of the private data without hiding the fact that it exists. Now, it's true that someone can overwrite the contents of this array, but it's kind of useless to do it intentionally when you "don't know" what the data means. Also, the private data in the "real" definition will need to have the same, maximal, _Alignas() as well (if you want the private data not to need _Alignas(), you will need to use the real alignment quantum for the type-erased version).
The above is C11. You can sort of do about the same thing by typedef'ing max_align_t yourself, then using an array of max_align_t elements for private data, with an appropriate length to cover the actual size of the private data.
An example of the use of such an approach can be found in CUDA's driver API:
Parameters for copying a 3D array: CUDA_MEMCPY3D vs
Parameters for copying a 3D array between two GPU devices: CUDA_MEMCPY3D_peer
The first structure has a pair of reserved void* fields, hiding the fact that it's really the second structure. They could have used an unsigned char array, but it so happens that the private fields are pointer-sized, and void* is also kind of opaque.
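A minimal sketch of the slack-array idea, under some assumptions: the names example_private, example_init and example_get_val are made up for illustration, and this variant copies the private fields in and out with memcpy rather than redefining the structure, so that every translation unit sees one single definition of ExampleStruct.

```c
#include <assert.h>
#include <stddef.h>   /* max_align_t */
#include <string.h>   /* memcpy */

/* ---- public view: the private bytes exist but are type-erased ---- */
#define EXAMPLE_PRIVATE_SIZE sizeof(double)

typedef struct ExampleStruct {
    int data;
    _Alignas(max_align_t) unsigned char private_data[EXAMPLE_PRIVATE_SIZE];
} ExampleStruct;

/* ---- implementation side: the real layout of the private bytes ---- */
struct example_private {   /* hypothetical name */
    double val;
};

_Static_assert(sizeof(struct example_private) <= EXAMPLE_PRIVATE_SIZE,
               "private_data is too small for the private fields");

void example_init(ExampleStruct *e, int data, double val)
{
    struct example_private p = { val };
    assert(e != NULL);
    e->data = data;
    memcpy(e->private_data, &p, sizeof p);   /* copy in, no pointer cast */
}

double example_get_val(const ExampleStruct *e)
{
    struct example_private p;
    assert(e != NULL);
    memcpy(&p, e->private_data, sizeof p);   /* copy out */
    return p.val;
}
```

Because the public type is complete, an instance can live on the stack or in a static array, so no heap allocation is needed; the static assertion is there to catch the case where the private fields outgrow the reserved bytes.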

This causes undefined behaviour, as detailed in the other answers. The usual way around this is to make a nested struct.
In example.h, one defines the public-facing elements. struct example is not meant to be instantiated; in a sense, it is abstract. Only pointers that are obtained from one of its constructors (in this case, the only one) are valid.
struct example { int data; };
struct example *new_example(int, double);
double example_val(struct example *e);
and in example.c, instead of re-defining struct example, one has a nested struct private_example. (Such that they are related by composite aggregation.)
#include <stddef.h> /* offsetof */
#include <stdlib.h>
#include "example.h"
struct private_example {
struct example public;
double val;
};
struct example *new_example(int data, double val) {
struct private_example *const example = malloc(sizeof *example);
if(!example) return 0;
example->public.data = data;
example->val = val;
return &example->public;
}
/** This is a poor version of `container_of`. */
static struct private_example *example_upcast(struct example *example) {
return (struct private_example *)(void *)
((char *)example - offsetof(struct private_example, public));
}
double example_val(struct example *e) {
return example_upcast(e)->val;
}
Then one can use the object as in main.c. This is used frequently in Linux kernel code for container abstraction. Note that offsetof(struct private_example, public) is zero, ergo example_upcast does nothing and a cast is sufficient: ((struct private_example *)e)->val. If one builds structures in a way that always allows casting, one is limited to single inheritance.
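To illustrate why the offsetof form is more general than a plain cast, here is a small sketch in which the public part is not the first member. All names (listener, drawable, widget, CONTAINER_OF, widget_secret_from_drawable) are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Hypothetical public "bases"; neither knows about struct widget. */
struct listener { int id; };
struct drawable { int z_order; };

/* Private aggregate: drawable is NOT the first member. */
struct widget {
    struct listener listener;
    struct drawable drawable;
    int secret;
};

/* Step back from a member's address to the enclosing structure. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)(void *)((char *)(ptr) - offsetof(type, member)))

int widget_secret_from_drawable(struct drawable *d)
{
    struct widget *w;
    assert(d != NULL);
    /* Only valid if d really points at the drawable inside a struct widget. */
    w = CONTAINER_OF(d, struct widget, drawable);
    return w->secret;
}
```

A plain cast would only recover the enclosing structure from a pointer to listener here, since that is the member at offset zero; the macro works for either embedded part, at the price of trusting the caller to pass a pointer that really lives inside a struct widget.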

Getters and setters in pure C?

Can I use getters and setters in pure C instead of using extern variables?

First of all, don't listen to anyone saying "there is no object-orientation in language x" because they have truly not understood that OO is a program design method, completely apart from language syntax.
Some languages have elegant ways to implement OO, some have not. Yet it is possible to write an object-oriented program in any language, for example in C. Similarly, your program will not automagically get a proper OO design just because you wrote it in Java, or because you used certain language keywords.
The way you implement private encapsulation in C is a bit more crude than in languages with OO support, but it goes like this:
// module.h
void set_x (int n);
int get_x (void);
// module.c
static int x; // private variable
void set_x (int n)
{
x = n;
}
int get_x (void)
{
return x;
}
// main.c
#include <stdio.h>
#include "module.h"
int main (void)
{
set_x(5);
printf("%d", get_x());
}
You can call it a "class", an "ADT" or a "code module", as you prefer.
This is how every reasonable C program out there is written, and how they have been written for the past 30-40 years or so, for as long as program design has existed. If you say there are no setters/getters in a C program, then that is because you have no experience of using C.

Yes, it's very much possible and sometimes even useful. C supports opaque types:
typedef struct Context Context;
C code compiled with only this declaration in scope cannot access any hypothetical members of the struct, and can't use values of type Context either. But it can still handle pointers to Context values, so functions like these are possible:
Context *make_context(...);
int context_get_foo(Context *);
void context_set_foo(Context *, int);
This pattern insulates the client C code from any changes to the size or internal layout of Context. Note that this is a stronger guarantee than simply declaring but not documenting the members: Even if the programmers duly ignore the undocumented members, by-value use of the struct is permitted (and will certainly slip in), and now the code has to be recompiled when the size changes. In other words, opaque types only handled through pointers give greater ABI stability.

Another approach is by using a global variable and inline functions:
// module.h
static inline void set_x (int n) {extern int x; x = n;}
static inline int get_x (void) {extern int x; return x;}
// module.c
int x; // global variable
// main.c
#include <stdio.h>
#include "module.h"
int main (void)
{
set_x(5);
printf("%d", get_x());
}
It has two advantages:
Getters and setters become easily inlineable
It becomes clear to the compiler that getters have no side effects, which allows further optimizations and produces no warnings in cases like this one:
// warning: compound statement with side effects
if(get_x() || get_y())
Of course, a "dedicated" (read: dumb) programmer can always write extern int x; in their code and use the variable directly. On the other hand, a "dedicated" programmer can also easily remove the static keyword and use it anyway...

Why does GCC compile this erroneous code?

I tried to compile something like:
struct A
{ int a;
struct B
{ int c;
};
};
Now when I compile this code the compiler gives me a warning message that:
declaration does not declare anything [enabled by default]
I know that I have not defined any instance of struct B, which means that I will not be able to access the variable c. Still, the compiler compiles this code with only a warning. What's the point? Why doesn't the compiler give a compilation error instead?
ADDED Info:
The size of the struct A is equal to the size of int on my machine!!

Because you can do this:
struct A
{ int a;
struct B
{ int c;
};
};
int main()
{
struct A a = {1};
struct B b = {2};
return a.a + b.c;
}
Note:
you need a semicolon after declaring B, which your code is missing
this isn't particularly useful, but I suppose it might serve some documentary purpose (ie,to suggest a relationship or grouping between types)
in C++, the second variable would have type A::B, but C doesn't have the same scoping rules (all structs just belong to the global struct namespace, in effect)
As to the motivation for allowing it ...
struct Outer {
struct {
int b;
} anon;
/* this ^ anonymous struct can only be declared inside Outer,
because there's no type name to declare anon with */
struct Inner {
int c;
} named;
/* this ^ struct could be declared outside, but why force it to be
different than the anonymous one? */
struct Related {
double d;
};
/* oh no we have no member declared immediately ... should we force this
declaration to be outside Outer now? */
struct Inner * (*function_pointer)(struct Related *);
/* no member but we are using it, now can it come back inside? */
struct Related excuse;
/* how about now? */
};
Once you've allowed nested type declarations like this, I doubt there's any particular motivation to require there be a member of that type right away.

It's legal (but extremely bad style) to do:
struct A {
int a;
struct B {
int c;
};
};
struct B B_instance;
struct A A_instance;
And the compiler doesn't know about the later variables that use the struct types, so it really should not error out.

Generally, a warning means the code likely does not do what you intended but is legal in the language. The compiler is saying, “This is likely not what you really wanted to do, but I must allow you to do it because the language says it is allowed.” The compiler cannot give you an error for this code because the C standard permits it, so it must be allowed (unless you specifically ask for errors for such things, as by using GCC’s -Werror option to turn warnings into errors).
The C standard does not attempt to define everything that makes sense in a program. For example, these things are legal in C:
3;
if (x) foo(); else foo();
x = 4*0;
The first statement has no side effects, and its return value is not used. But it is legal in C, since a statement may be just an expression. The second statement just calls foo(), so the if is pointless. In the third statement, multiplying by four is pointless.
It would be extremely difficult to write a C standard that prohibited all things that did not make sense. And it is certainly not worth the effort. So this is part of your answer: When the committee writing the C standard builds the language, do they want to spend a lot of time rewriting the technical specification to exclude things that do not make sense? Sometimes yes, if it seems valuable to avoid something that could cause serious bugs. But much of the time, it is just not worth their time and would complicate the specification unnecessarily.
However, compilers can recognize some of these things and warn you. This helps catch many typographical errors or other mistakes.
On the other hand, sometimes these constructions arise from unusual circumstances. For example, a program may have preprocessor statements that define struct A in different ways when building for different targets or different features. In some of those targets, it may be that the struct B member is not needed in struct A, so it is not declared, but the declaration of struct B (the type, not the object) remains present just because it was easier to write the preprocessor statements that way.
So the compiler needs to permit these things, to avoid interfering with programmers writing a wide variety of programs.

You are, in fact, declaring struct B here, but you are not declaring a variable of that type.
This is a warning, but one you should fix. Perhaps you meant:
struct A
{ int a;
struct B
{
int c;
} c;
};

Is it possible to cast pointers from a structure type to another structure type extending the first in C?

If I have structure definitions, for example, like these:
struct Base {
int foo;
};
struct Derived {
int foo; // int foo is common for both definitions
char *bar;
};
Can I do something like this?
void foobar(void *ptr) {
((struct Base *)ptr)->foo = 1;
}
struct Derived s;
foobar(&s);
In other words, can I cast the void pointer to Base * to access its foo member when its type is actually Derived *?

You should do
struct Base {
int foo;
};
struct Derived {
struct Base base;
char *bar;
};
to avoid breaking strict aliasing; it is a common misconception that C allows arbitrary casts of pointer types: although it will work as expected in most implementations, it's non-standard.
This also avoids any alignment incompatibilities due to usage of pragma directives.
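A small sketch of how the embedded-base layout is used; demo_embedded_base is a hypothetical helper added only to exercise it, and foobar is changed from the question's version to take a struct Base * instead of void *.

```c
#include <assert.h>
#include <stddef.h>

struct Base {
    int foo;
};

struct Derived {
    struct Base base;   /* the shared part is a real struct Base object */
    char *bar;
};

void foobar(struct Base *b)
{
    assert(b != NULL);
    b->foo = 1;
}

int demo_embedded_base(void)
{
    struct Derived s = { { 0 }, NULL };
    foobar(&s.base);   /* no cast: the argument already has type struct Base * */
    return s.base.foo;
}
```

Since the write goes through an lvalue of type struct Base to an actual struct Base object, no aliasing question arises, and the compiler can type-check the call.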

Many real-world C programs assume the construct you show is safe, and there is an interpretation of the C standard (specifically, of the "common initial sequence" rule, C99 §6.5.2.3 p5) under which it is conforming. Unfortunately, in the five years since I originally answered this question, all the compilers I can easily get at (viz. GCC and Clang) have converged on a different, narrower interpretation of the common initial sequence rule, under which the construct you show provokes undefined behavior. Concretely, experiment with this program:
#include <stdio.h>
#include <string.h>
typedef struct A { int x; int y; } A;
typedef struct B { int x; int y; float z; } B;
typedef struct C { A a; float z; } C;
int testAB(A *a, B *b)
{
b->x = 1;
a->x = 2;
return b->x;
}
int testAC(A *a, C *c)
{
c->a.x = 1;
a->x = 2;
return c->a.x;
}
int main(void)
{
B bee;
C cee;
int r;
memset(&bee, 0, sizeof bee);
memset(&cee, 0, sizeof cee);
r = testAB((A *)&bee, &bee);
printf("testAB: r=%d bee.x=%d\n", r, bee.x);
r = testAC(&cee.a, &cee);
printf("testAC: r=%d cee.x=%d\n", r, cee.a.x);
return 0;
}
When compiling with optimization enabled (and without -fno-strict-aliasing), both GCC and Clang will assume that the two pointer arguments to testAB cannot point to the same object, so I get output like
testAB: r=1 bee.x=2
testAC: r=2 cee.x=2
They do not make that assumption for testAC, but — having previously been under the impression that testAB was required to be compiled as if its two arguments could point to the same object — I am no longer confident enough in my own understanding of the standard to say whether or not that is guaranteed to keep working.

That will work in this particular case. The foo field is the first member of both structures and it has the same type. However, this is not true in the general case of fields within a struct (that are not the first member). Things like alignment and packing can make this break in subtle ways.

As you seem to be aiming at Object Oriented Programming in C I can suggest you to have a look at the following link:
http://www.planetpdf.com/codecuts/pdfs/ooc.pdf
It goes into detail about ways of handling oop principles in ANSI C.

In particular cases this could work, but in general - no, because of the structure alignment.
You could use different #pragmas to make (actually, attempt to) the alignment identical - and then, yes, that would work.
If you're using Microsoft Visual Studio, you might find this article useful.

There is another little thing that might be helpful or related to what you are doing ..
#define SHARED_DATA int id
/* the member structs must be defined before the union that contains them */
typedef struct window_t {
SHARED_DATA;
int something;
void* blah;
} window_t;
typedef struct list_t {
SHARED_DATA;
int size;
} list_t;
typedef struct button_t {
SHARED_DATA;
int clicked;
} button_t;
typedef union base_t {
SHARED_DATA;
window_t win;
list_t list;
button_t button;
} base_t;
Now you can put the shared properties into SHARED_DATA and handle the different types via the "superclass" packed into the union. You could use SHARED_DATA to store just a 'class identifier' or to store a pointer. Either way, it turned out handy for generic handling of event types for me at some point. I hope I'm not going too far off-topic with this.

I know this is an old question, but in my view there is more that can be said and some of the other answers are incorrect.
Firstly, this cast:
(struct Base *)ptr
... is allowed, but only if the alignment requirements are met. On many compilers your two structures will have the same alignment requirements, and it's easy to verify in any case. If you get past this hurdle, the next is that the result of the cast is mostly unspecified - that is, there's no requirement in the C standard that the pointer once cast still refers to the same object (only after casting it back to the original type will it necessarily do so).
However, in practice, compilers for common systems usually make the result of a pointer cast refer to the same object.
(Pointer casts are covered in section 6.3.2.3 of both the C99 standard and the more recent C11 standard. The rules are essentially the same in both, I believe).
Finally, you've got the so called "strict aliasing" rules to contend with (C99/C11 6.5 paragraph 7); basically, you are not allowed to access an object of one type via a pointer of another type (with certain exceptions, which don't apply in your example). See "What is the strict-aliasing rule?", or for a very in-depth discussion, read my blog post on the subject.
In conclusion, what you attempt in your code is not guaranteed to work. It might be guaranteed to always work with certain compilers (and with certain compiler options), and it might work by chance with many compilers, but it certainly invokes undefined behavior according to the C language standard.
What you could do instead is this:
*((int *)ptr) = 1;
... I.e. since you know that the first member of the structure is an int, you just cast directly to int, which bypasses the aliasing problem since both types of struct do in fact contain an int at this address. You are relying on knowing the struct layout that the compiler will use and you are still relying on the non-standard semantics of pointer casting, but in practice this is significantly less likely to give you problems.

The great/bad thing about C is that you can cast just about anything -- the problem is, it might not work. :) However, in your case, it will*, since you have two structs whose first members are both of the same type; see this program for an example. Now, if struct derived had a different type as its first element -- for example, char *bar -- then no, you'd get weird behavior.
* I should qualify that with "almost always", I suppose; there are a lot of different C compilers out there, so some may have different behavior. However, I know it'll work in GCC.

How do you implement a class in C? [closed]

Assuming I have to use C (no C++ or object oriented compilers) and I don't have dynamic memory allocation, what are some techniques I can use to implement a class, or a good approximation of a class? Is it always a good idea to isolate the "class" to a separate file? Assume that we can preallocate the memory by assuming a fixed number of instances, or even defining the reference to each object as a constant before compile time. Feel free to make assumptions about which OOP concept I will need to implement (it will vary) and suggest the best method for each.
Restrictions:
I have to use C and not an OOP because I'm writing code for an embedded system, and the compiler and preexisting code base is in C.
There is no dynamic memory allocation because we don't have enough memory to reasonably assume we won't run out if we start dynamically allocating it.
The compilers we work with have no problems with function pointers

That depends on the exact "object-oriented" feature-set you want to have. If you need stuff like overloading and/or virtual methods, you probably need to include function pointers in structures:
typedef struct ShapeClass ShapeClass; /* forward declaration so the member can name the type */
struct ShapeClass {
float (*computeArea)(const ShapeClass *shape);
};
float shape_computeArea(const ShapeClass *shape)
{
return shape->computeArea(shape);
}
This would let you implement a class, by "inheriting" the base class, and implementing a suitable function:
typedef struct {
ShapeClass shape;
float width, height;
} RectangleClass;
static float rectangle_computeArea(const ShapeClass *shape)
{
const RectangleClass *rect = (const RectangleClass *) shape;
return rect->width * rect->height;
}
This of course requires you to also implement a constructor, that makes sure the function pointer is properly set up. Normally you'd dynamically allocate memory for the instance, but you can let the caller do that, too:
void rectangle_new(RectangleClass *rect)
{
rect->width = rect->height = 0.f;
rect->shape.computeArea = rectangle_computeArea;
}
If you want several different constructors, you will have to "decorate" the function names, since you can't have more than one rectangle_new() function:
void rectangle_new_with_lengths(RectangleClass *rect, float width, float height)
{
rectangle_new(rect);
rect->width = width;
rect->height = height;
}
Here's a basic example showing usage:
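#include <stdio.h>
int main(void)
{
RectangleClass r1;
rectangle_new_with_lengths(&r1, 4.f, 5.f);
/* pass the embedded base so the pointer type matches const ShapeClass * */
printf("rectangle r1's area is %f units square\n", shape_computeArea(&r1.shape));
return 0;
}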
I hope this gives you some ideas, at least. For a successful and rich object-oriented framework in C, look into glib's GObject library.
Also note that there's no explicit "class" being modelled above, each object has its own method pointers which is a bit more flexible than you'd typically find in C++. Also, it costs memory. You could get away from that by stuffing the method pointers in a class structure, and invent a way for each object instance to reference a class.
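The last idea, a shared class structure that every instance points to, might look roughly like the sketch below. The names Shape, ShapeVTable, Rect and rect_vtable are illustrative assumptions and are not part of the code above; this is one possible layout, not the only one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: Shape, ShapeVTable, Rect are illustrative. */
typedef struct Shape Shape;

/* One table of method pointers per "class", shared by all of its instances. */
typedef struct {
    const char *name;
    float (*computeArea)(const Shape *self);
} ShapeVTable;

/* Each instance carries a single pointer to its class table. */
struct Shape {
    const ShapeVTable *vtable;
};

typedef struct {
    Shape base;           /* first member, so a Shape * can be cast back to Rect * */
    float width, height;
} Rect;

static float rect_computeArea(const Shape *self)
{
    const Rect *r = (const Rect *)self;
    return r->width * r->height;
}

/* The class structure lives in read-only storage, once for all Rect objects. */
static const ShapeVTable rect_vtable = { "Rect", rect_computeArea };

void rect_init(Rect *r, float width, float height)
{
    r->base.vtable = &rect_vtable;
    r->width = width;
    r->height = height;
}

/* Dispatch through the shared table instead of a per-object function pointer. */
float shape_computeArea(const Shape *shape)
{
    assert(shape != NULL && shape->vtable != NULL);
    return shape->vtable->computeArea(shape);
}
```

With this layout each object pays for one pointer regardless of how many methods the class has, at the cost of one extra indirection per call.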

I had to do this once for homework too. I followed this approach:
Define your data members in a struct.
Define your function members so that they take a pointer to your struct as the first argument.
Do these in one header and one .c file: the header for the struct definition and function declarations, the .c file for the implementations.
A simple example would be this:
/// Queue.h
struct Queue
{
/// members
};
typedef struct Queue Queue;
void push(Queue* q, int element);
void pop(Queue* q);
// etc.
///

If you only want one class, use an array of structs as the "objects" data and pass pointers to them to the "member" functions. You can use typedef struct _whatever Whatever before declaring struct _whatever to hide the implementation from client code. There's no difference between such an "object" and the C standard library FILE object.
If you want more than one class with inheritance and virtual functions, then it's common to have pointers to the functions as members of the struct, or a shared pointer to a table of virtual functions. The GObject library uses both this and the typedef trick, and is widely used.
There's also a book on techniques for this available online - Object Oriented Programming with ANSI C.
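As a rough sketch of the single-class case without dynamic allocation: the "objects" live in a static array, and client code only ever sees an incomplete type. Counter, COUNTER_POOL_SIZE and the counter_* functions are made-up names for illustration; in a real project the first part would sit in a header and the second in a .c file.

```c
#include <stddef.h>

/* --- would live in counter.h: clients only see the incomplete type --- */
typedef struct Counter Counter;
Counter *counter_acquire(void);
void counter_release(Counter *c);
void counter_increment(Counter *c);
int counter_value(const Counter *c);

/* --- would live in counter.c: the layout and the pool stay private --- */
#define COUNTER_POOL_SIZE 4   /* assumed fixed number of instances */

struct Counter {
    int in_use;
    int value;
};

static Counter pool[COUNTER_POOL_SIZE];

/* Hand out the first free slot, or NULL when the pool is exhausted. */
Counter *counter_acquire(void)
{
    for (size_t i = 0; i < COUNTER_POOL_SIZE; ++i) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].value = 0;
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark the slot as free again; no memory is actually released. */
void counter_release(Counter *c)
{
    if (c != NULL)
        c->in_use = 0;
}

void counter_increment(Counter *c) { c->value++; }

int counter_value(const Counter *c) { return c->value; }
```

Because the struct definition never appears in the header, client code cannot touch in_use or value directly and has to go through the functions.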

C Interfaces and Implementations: Techniques for Creating Reusable Software, David R. Hanson
http://www.informit.com/store/product.aspx?isbn=0201498413
This book does an excellent job of covering your question. It's in the Addison Wesley Professional Computing series.
The basic paradigm is something like this:
/* for data structure foo */
FOO *myfoo;
myfoo = foo_create(...);
foo_something(myfoo, ...);
myfoo = foo_append(myfoo, ...);
foo_delete(myfoo);

You can take a look at GObject. It's an open-source library that gives you a (verbose) way to build objects.
http://library.gnome.org/devel/gobject/stable/

I will give a simple example of how OOP should be done in C. I realize this thread is from 2009 but would like to add this anyway.
/// Object.h
typedef struct Object {
uuid_t uuid;
} Object;
int Object_init(Object *self);
uuid_t Object_get_uuid(Object *self);
int Object_clean(Object *self);
/// Person.h
typedef struct Person {
Object obj;
char *name;
} Person;
int Person_init(Person *self, char *name);
int Person_greet(Person *self);
int Person_clean(Person *self);
/// Object.c
#include "Object.h"
int Object_init(Object *self)
{
self->uuid = uuid_new();
return 0;
}
uuid_t Object_get_uuid(Object *self)
{ // Don't actually create getters in C...
return self->uuid;
}
int Object_clean(Object *self)
{
uuid_free(self->uuid);
return 0;
}
/// Person.c
#include "Person.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int Person_init(Person *self, char *name)
{
Object_init(&self->obj); // Or Object_init((Object *)self);
self->name = strdup(name);
return 0;
}
int Person_greet(Person *self)
{
printf("Hello, %s", self->name);
return 0;
}
int Person_clean(Person *self)
{
free(self->name);
Object_clean(&self->obj);
return 0;
}
/// main.c
#include "Person.h"
int main(void)
{
Person p;
Person_init(&p, "John");
Person_greet(&p);
Object_get_uuid(&p.obj); // Inherited function, called on the embedded Object
Person_clean(&p);
return 0;
}
The basic concept involves placing the 'inherited class' as the first member of the struct. C guarantees that a pointer to a struct, suitably converted, points to its first member, so a pointer to the struct can be cast to a pointer to the 'inherited class'. The functions of the 'inherited class' can then access the 'inherited values' in the same way they would access their members normally.
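A minimal sketch of that first-member layout, using made-up Base and Derived types rather than the Object and Person types above, so that it compiles without the uuid functions:

```c
#include <string.h>

/* Hypothetical types for illustration only. */
typedef struct {
    int id;
} Base;

typedef struct {
    Base base;        /* first member: shares the address of the whole struct */
    char name[16];
} Derived;

/* A "base class" function: it only knows about Base. */
int base_get_id(const Base *self)
{
    return self->id;
}

void derived_init(Derived *self, int id, const char *name)
{
    self->base.id = id;
    strncpy(self->name, name, sizeof self->name - 1);
    self->name[sizeof self->name - 1] = '\0';  /* strncpy may not terminate */
}

/* Upcast without any cast at all: take the address of the embedded member. */
const Base *derived_as_base(const Derived *d)
{
    return &d->base;
}
```

Both base_get_id(&d.base) and base_get_id((const Base *)&d) refer to the same object; the first form is usually preferable because the compiler still checks the types.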
This and some naming conventions for constructors, destructors, allocation, and deallocation functions (I recommend _init, _clean, _new, and _free) will get you a long way.
As for virtual functions, use function pointers in the struct, possibly with a Class_func(...) wrapper too.
As for (simple) templates, add a size_t parameter to determine size, require a void* pointer, or require a 'class' type with just the functionality you care about. (e.g. int GetUUID(Object *self); GetUUID(&p);)
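The size_t plus void* variant of a "template" could be sketched as a fixed-capacity stack that works on caller-provided storage, which also fits the no-dynamic-allocation restriction. Stack and the stack_* functions are hypothetical names, not taken from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic container: element type is described only by its size. */
typedef struct {
    unsigned char *storage;  /* caller-provided buffer */
    size_t elem_size;        /* size of one element in bytes */
    size_t capacity;         /* maximum number of elements */
    size_t count;            /* elements currently stored */
} Stack;

void stack_init(Stack *s, void *storage, size_t elem_size, size_t capacity)
{
    assert(storage != NULL && elem_size > 0);
    s->storage = storage;
    s->elem_size = elem_size;
    s->capacity = capacity;
    s->count = 0;
}

/* Copies elem_size bytes from elem; returns 0 on success, -1 when full. */
int stack_push(Stack *s, const void *elem)
{
    if (s->count == s->capacity)
        return -1;
    memcpy(s->storage + s->count * s->elem_size, elem, s->elem_size);
    s->count++;
    return 0;
}

/* Copies the top element into out; returns 0 on success, -1 when empty. */
int stack_pop(Stack *s, void *out)
{
    if (s->count == 0)
        return -1;
    s->count--;
    memcpy(out, s->storage + s->count * s->elem_size, s->elem_size);
    return 0;
}
```

The trade-off compared with real templates is that the compiler cannot check that the pointer passed to stack_push actually points to an element of the expected type.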

Use a struct to simulate the data members of a class. In terms of method scope you can simulate private methods by placing the private function prototypes in the .c file and the public functions in the .h file.
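A small sketch of that split, shown in one listing with comments marking where the header would end. The temperature_* names are invented for illustration. Marking the private function static goes one step beyond just leaving its prototype out of the header: it gives the function internal linkage, so other files cannot link against it even if they declare it themselves.

```c
/* --- would live in temperature.h: the public interface --- */
int temperature_to_fahrenheit(int celsius);
int temperature_is_freezing(int celsius);

/* --- would live in temperature.c: the implementation --- */

/* "Private" helper: static restricts it to this source file. */
static int scale_by_nine_fifths(int value)
{
    return value * 9 / 5;
}

int temperature_to_fahrenheit(int celsius)
{
    return scale_by_nine_fifths(celsius) + 32;
}

int temperature_is_freezing(int celsius)
{
    return celsius <= 0;
}
```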

GTK is built entirely on C and it uses many OOP concepts. I have read through the source code of GTK and it is pretty impressive, and definitely easier to read. The basic concept is that each "class" is simply a struct with associated static functions. The static functions all accept the "instance" struct as a parameter, do whatever they need, and return results if necessary. For example, you may have a function "GetPosition(CircleStruct obj)". The function would simply dig through the struct, extract the position numbers, probably build a new PositionStruct object, stick the x and y in the new PositionStruct, and return it. GTK even implements inheritance this way by embedding structs inside structs. Pretty clever.

#include <stdio.h>
/**
* Define Shape class
*/
typedef struct Shape Shape;
struct Shape {
/**
* Variables header...
*/
double width, height;
/**
* Functions header...
*/
double (*area)(Shape *shape);
};
/**
* Functions
*/
double calc(Shape *shape) {
return shape->width * shape->height;
}
/**
* Constructor
*/
/* named Shape_new because identifiers starting with an underscore and a capital letter are reserved in C */
Shape Shape_new(void) {
Shape s;
s.width = 1;
s.height = 1;
s.area = calc;
return s;
}
/********************************************/
int main(void) {
Shape s1 = Shape_new();
s1.width = 5.35;
s1.height = 12.5462;
printf("Hello World\n\n");
printf("User.width = %f\n", s1.width);
printf("User.height = %f\n", s1.height);
printf("User.area = %f\n\n", s1.area(&s1));
printf("Made with \xe2\x99\xa5 \n");
return 0;
}

In your case, a good approximation of a class could be an ADT (abstract data type). But it still won't be the same.

My strategy is:
Define all code for the class in a separate file
Define all interfaces for the class in a separate header file
All member functions take a "ClassHandle" which stands in for the instance name (instead of o.foo(), call foo(oHandle))
The constructor is replaced with a function void ClassInit(ClassHandle h, int x, int y,...) OR ClassHandle ClassInit(int x, int y,...) depending on the memory allocation strategy
All member variables are stored as members of a static struct in the class file, encapsulating them in the file and preventing outside files from accessing them
The objects are stored in an array of the static struct above, with predefined handles (visible in the interface) or a fixed limit of objects that can be instantiated
If useful, the class can contain public functions that loop through the array and call the functions of all the instantiated objects (RunAll() calls each Run(oHandle))
A Deinit(ClassHandle h) function frees the allocated memory (array index) in the dynamic allocation strategy
Does anyone see any problems, holes, potential pitfalls or hidden benefits/drawbacks to either variation of this approach? If I am reinventing a design method (and I assume I must be), can you point me to the name of it?
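For concreteness, the fixed-limit variation of this strategy might be sketched as follows. Motor, MOTOR_MAX and the Motor_* functions are placeholder names standing in for "Class", and the handle is assumed to be a plain array index.

```c
#include <stddef.h>

/* --- interface (would live in the header) --- */
#define MOTOR_MAX 3              /* assumed fixed limit of instances */
#define MOTOR_INVALID (-1)
typedef int MotorHandle;         /* index into the private array */

/* --- implementation (would live in the .c file) --- */
typedef struct {
    int in_use;
    int speed;
    int ticks;
} MotorData;

static MotorData motors[MOTOR_MAX];   /* invisible to other files */

static int motor_valid(MotorHandle h)
{
    return h >= 0 && h < MOTOR_MAX && motors[h].in_use;
}

/* "Constructor": claims a free slot and returns its handle. */
MotorHandle Motor_Init(int speed)
{
    for (int i = 0; i < MOTOR_MAX; ++i) {
        if (!motors[i].in_use) {
            motors[i].in_use = 1;
            motors[i].speed = speed;
            motors[i].ticks = 0;
            return i;
        }
    }
    return MOTOR_INVALID;
}

/* "Destructor": frees the array index for reuse. */
void Motor_Deinit(MotorHandle h)
{
    if (motor_valid(h))
        motors[h].in_use = 0;
}

void Motor_Run(MotorHandle h)
{
    if (motor_valid(h))
        motors[h].ticks += motors[h].speed;
}

/* Calls Run on every instantiated object. */
void Motor_RunAll(void)
{
    for (int i = 0; i < MOTOR_MAX; ++i)
        Motor_Run(i);
}

/* Returns -1 for an invalid handle (ticks are assumed non-negative). */
int Motor_GetTicks(MotorHandle h)
{
    return motor_valid(h) ? motors[h].ticks : -1;
}
```

One pitfall this makes visible: a handle that has been passed to Motor_Deinit may later refer to a different object once the slot is reused, so stale handles are not detected.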

Also see this answer and this one
It is possible. It always seems like a good idea at the time, but afterwards it becomes a maintenance nightmare. Your code becomes littered with pieces of code tying everything together. A new programmer will have lots of problems reading and understanding the code if you use function pointers, since it will not be obvious which functions are called.
Data hiding with get/set functions is easy to implement in C, but stop there. I have seen multiple attempts at this in embedded environments, and in the end it is always a maintenance problem.
Since you already have maintenance issues, I would steer clear.

My approach would be to move the struct and all primarily-associated functions to a separate source file(s) so that it can be used "portably".
Depending on your compiler, you might be able to include functions into the struct, but that's a very compiler-specific extension, and has nothing to do with the last version of the standard I routinely used :)

The first C++ compiler was actually a preprocessor which translated C++ code into C.
So it's very possible to have classes in C.
You might try and dig up an old C++ preprocessor and see what kind of solutions it creates.

Do you want virtual methods?
If not then you just define a set of function pointers in the struct itself. If you assign all the function pointers to standard C functions then you will be able to call functions from C in very similar syntax to how you would under C++.
If you want to have virtual methods it gets more complicated. Basically you will need to implement your own VTable for each struct and assign function pointers to the VTable depending on which function is called. You would then need a set of function pointers in the struct itself that in turn call the function pointer in the VTable. This is, essentially, what C++ does.
TBH though ... if you want the latter then you are probably better off just finding a C++ compiler you can use and re-compiling the project. I have never understood the obsession with C++ not being usable in embedded. I've used it many a time and it works, is fast, and doesn't have memory problems. Sure, you have to be a bit more careful about what you do, but it's really not that complicated.

C isn't an OOP language, as you rightly point out, so there's no built-in way to write a true class. Your best bet is to look at structs and function pointers; these will let you build an approximation of a class. However, as C is procedural, you might want to consider writing more C-like code (i.e. without trying to use classes).
Also, if you can use C, you can probably use C++ and get classes.
