I've seen a good deal of C libraries that do not present the objects they deal with internally as distinct types, but instead wrap them with void pointers before letting you access them. In practice, the "private" source files look something like:
typedef struct {
int value;
} object;
void * object_new(void)
{
object *o = malloc(sizeof(object));
o->value = 1;
return o;
}
int object_get(void *o)
{
return ((object *)o)->value; /* cast first: -> binds tighter than the cast */
}
void * object_free(void *o)
{
free(o);
return NULL; /* the function is declared to return void *, so return a value */
}
And in the main header you have only:
void * object_new(void);
int object_get(void *o);
void * object_free(void *o);
Now I wonder: is there a particular reason they do so? If the idea is to ensure the API user has no access to the internals of the object, isn't it sufficient to only expose the type name in the main library header, and to hide the details of the underlying structure (or whatever be the actual object) in the implementation files?
The reason to hide the types behind void pointers could be a (misguided) attempt to hide (in the sense of modular programming) the internal details. This is dangerous, as it throws any type checking the compiler might do right out the window.
Something along these lines would be better:
for-user.h:
struct internalstuff;
void somefunc(struct internalstuff *p);
for-internal-use.h:
#include "for-user.h"
struct internalstuff { ...};
implementation.c:
#include "for-internal-use.h"
void somefunc(struct internalstuff *p)
{
...
}
This way nobody will mix up internalstuff with a random string or the raw result from malloc(3) without getting at least a warning. As long as you only mention pointers to struct internalstuff in C it is fine not to have the definition of the struct at hand.
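To make the incomplete-type approach concrete, here is a minimal single-file sketch. The names (struct counter, counter_new, counter_get, counter_free, counter_demo) are made up for illustration, and the two halves would normally live in a public header and a private source file.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what would go in the public header ---- */
struct counter;                          /* incomplete type: callers never see the fields */
struct counter *counter_new(int start);
int counter_get(const struct counter *c);
void counter_free(struct counter *c);

/* ---- what would go in the private implementation file ---- */
struct counter {
    int value;
};

struct counter *counter_new(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;                            /* NULL if the allocation failed */
}

int counter_get(const struct counter *c)
{
    return c->value;
}

void counter_free(struct counter *c)
{
    free(c);
}

/* Example caller: it only ever handles pointers to struct counter. */
int counter_demo(void)
{
    struct counter *c = counter_new(41);
    if (c == NULL)
        return -1;
    int v = counter_get(c) + 1;
    /* Passing a string or some other pointer type to counter_get() would draw
       an incompatible-pointer diagnostic; a void * parameter accepts it silently. */
    counter_free(c);
    return v;
}
```

In a real project the caller would include only the header part, so it could not reach c->value even by accident.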
Something along the same lines can be done in C++ with class, and I'd be surprised if Objective-C didn't allow the same. But object-oriented languages have their own, much more flexible, tools for this. There you can define a bare-bones base class to export, while extensions of it are used internally. Take a look at a good C++ book for details (there are extensive lists here).
In a world of objects (Obj-C and C++), I believe the reason is mostly to do with inheritance. If a subclass is created from the base class, then there is no problem with the type of the return value when creating a new instance of the class. With just straight C, there does not appear to be a clear cut reason as no internal details are revealed or dependencies created.
You're correct: the idea in most of these cases is to keep the API user away from the internals of the object. The choice of type name, though, is really just a matter of style. If you were to expose the type name in the header as you suggest (which some APIs do), it would probably look something like:
typedef void* object;
From the compiler's point of view there is no real advantage or disadvantage to doing this, although it does give the API user a better understanding of what's going on:
object object_new(void);
int object_get(object o);
void object_free(object o);
Related
I've used quite a bit of JavaScript so far. If you use an object constructor in JavaScript, you have access to this inside the constructor.
So my question relates to trying to use a similar concept in C. I created a struct that I want to be able to self reference:
struct Storage {
void (*delete)();
};
So if I were to allocate a Storage class:
struct Storage *myStruct = malloc(sizeof(struct Storage));
Let's say I'm trying to delete myStruct. If I have some delete function that I point to (with myStruct->delete = deleteStructure), I would like to do something like this:
myStruct->delete();
which would then free() the struct through a self referencing variable inside of said delete function. I'm wondering if there would be a way to have the delete function look like:
void deleteStructure() {
free( /* "this" or some equivalent C self-reference */ );
}
My assumption from research so far is that this is not possible since this is usually only in object oriented programming languages. If this is not possible, I'm wondering what would be the semantically correct way to do this. I'm hoping to make the usage of this delete functionality rather simplistic from a user interface perspective. The only way I understand this to work would be passing a reference to the structure like:
void deleteStructure(struct Storage *someStructure) {
free(someStructure);
}
which would then require deletion to be done as follows:
deleteStructure(myStruct);
To sum up: is there a way to make a delete function that uses self references in C, and if not, what would be the most semantically correct way to delete a structure in the most user friendly way?
No. You cannot even define a function for a struct.
struct Storage {
void (*delete)();
};
simply stores a pointer to a void function. That could be any void function and when it is being called, it has no connection to Storage whatsoever.
Also note that in your code, every instance of the struct stores one pointer to a void function. You could initialize them so that they all point to the same function, in which case you would simply waste one pointer per instance (typically 64 bits) without any real benefit. You could also make them point to completely different functions with different semantics.
As per @UnholySheep's comment, the semantically correct way to tie a C function to a struct follows this structure:
struct Storage {
/* Some definitions here */
};
void deleteStructure(struct Storage *someStructure) {
/* free() every inner allocation owned by the struct here */
free(someStructure);
}
Here's more about passing structs by reference.
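If you still want the call to go through the struct, the usual compromise is a function pointer that takes the instance as an explicit argument. The following is only a sketch: the member is called destroy rather than delete (delete is a keyword in C++), and payload, storage_new, storage_demo and the live_storages counter are hypothetical names added for the example.

```c
#include <assert.h>
#include <stdlib.h>

struct Storage {
    int payload;                              /* hypothetical data field */
    void (*destroy)(struct Storage *self);    /* "method": receives the instance explicitly */
};

static int live_storages = 0;   /* counts live instances so the demo can check itself */

static void storage_destroy(struct Storage *self)
{
    free(self);
    live_storages--;
}

struct Storage *storage_new(int payload)
{
    struct Storage *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->payload = payload;
    s->destroy = storage_destroy;
    live_storages++;
    return s;
}

int storage_demo(void)
{
    struct Storage *s = storage_new(7);
    if (s == NULL)
        return -1;
    assert(live_storages == 1);
    s->destroy(s);          /* the object has to be named twice: C has no implicit "this" */
    return live_storages;   /* back to 0 once the instance is gone */
}
```

Whether s->destroy(s) is friendlier than deleteStructure(s) is a matter of taste; the function pointer only pays for itself when different instances need different destroy functions.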
I'm writing a C library, and have a struct that looks (roughly) like:
struct Obj {
char tag;
union {
int i;
void *v;
} val;
};
I do not want to expose the internals of this struct through the API, because users do not need to know the implementation and it could change in future versions. Users can interact with the struct via functions in the API.
I used incomplete types in the header for other, larger types in my API, which can only be accessed via pointer by the user. I do not want to restrict users to accessing Obj via pointer, as Obj will likely only be 16 bytes maximum.
I have not been able to use an incomplete type here, because I do not know of a way to expose only the size of the struct to users, without fields.
My question is:
Is there a way to expose a type with size only in C (no knowledge of the fields in the struct given to user), some other hack to accomplish what I want, or should I implement this in some completely different way?
Please comment if I haven't provided enough details or anything is unclear.
The standard pattern for this is to create a function which allocates the struct for the user:
struct Obj* obj_new(void) {
return malloc(sizeof(struct Obj));
}
Then just leave the type as incomplete in your public header.
Of course, if you really want to expose only the size, you could just create a function which returns sizeof(struct Obj). Obviously people can misuse it (e.g., hardcoding the value into their code as an "optimization" to avoid calling that function), but that's not on you. It is something that is done occasionally, usually to help facilitate inheritance.
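A rough sketch of the "expose only the size" variant follows. obj_size, obj_init_int, obj_get_int and obj_demo are invented names, and everything is shown in one file for brevity; in the real library the caller would see only the declarations in the header part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- public header: the type stays incomplete ---- */
struct Obj;
size_t obj_size(void);
void obj_init_int(struct Obj *o, int i);
int obj_get_int(const struct Obj *o);

/* ---- implementation file ---- */
struct Obj {
    char tag;
    union {
        int i;
        void *v;
    } val;
};

size_t obj_size(void)
{
    return sizeof(struct Obj);
}

void obj_init_int(struct Obj *o, int i)
{
    o->tag = 'i';       /* 'i' marks the int member as active (assumed convention) */
    o->val.i = i;
}

int obj_get_int(const struct Obj *o)
{
    assert(o->tag == 'i');
    return o->val.i;
}

/* ---- caller: chooses where the storage comes from ---- */
int obj_demo(void)
{
    struct Obj *o = malloc(obj_size());   /* malloc memory is aligned for any object type */
    if (o == NULL)
        return -1;
    obj_init_int(o, 5);
    int v = obj_get_int(o);
    free(o);
    return v;
}
```

Note that this still hands the user a pointer; a size known only at run time cannot be used to declare an ordinary local variable of type struct Obj.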
I am looking for a way to make private style typedefs that can only be accessed or manipulated from a specific set of function calls (setBit(bit_typ *const t), getBit(bit_typ *const t)). I am looking for a way to do this without using malloc, does anyone have any ideas?
EDIT:// this question is different than this one because it is looking for ways to get as close to a "private" structure whereas the other question (TL;DR is there a way to define an opaque type which can nonetheless be allocated on stack, and without breaking strict aliasing rule ?) looks for a solution to a problem related to one possible solution to my question.
One way to do it is to expose the total size of the opaque type and make the user declare objects of your opaque type as unsigned char [N] buffers. For example, let's say you have some type OpaqueType, the internals of which you want to hide from the user.
In the header file (exposed to the user) you do this
typedef unsigned char OpaqueType[16];
where 16 is the exact byte-size of the type you want to hide. In the header file you write the whole interface in terms of that type, e.g.
void set_data(OpaqueType *dst, int data);
In the implementation file you declare the actual type
typedef struct OpaqueTypeImpl
{
int data1;
double data2;
} OpaqueTypeImpl;
and implement the functions as follows
void set_data(OpaqueType *dst, int data)
{
OpaqueTypeImpl *actual_dst = (OpaqueTypeImpl *) dst;
actual_dst->data1 = data;
}
You can also add a static assertion that will make sure that sizeof(OpaqueType) is the same as sizeof(OpaqueTypeImpl).
Of course, as it has been noted in the comments below, extra steps have to be taken to ensure the proper alignment of such objects, like _Alignas in C11 or some union-based technique in "classic" C.
That way you give the user opportunity to declare non-dynamic object of OpaqueType, i.e. you don't force the user to call your function that will malloc such objects internally. And at the same time you don't expose to user anything about the inner structure of your type (besides its total size and its alignment requirement).
Note also that OpaqueType declared in that way is an array, meaning that it is not copyable (unless you use memcpy). That might be a good thing, if you want to actively prevent unrestrained user-level copying. But if you want to enable copying, you can wrap the array into a struct.
This approach is not terribly elegant, but that's probably the only way to hide implementation when you want to keep objects of your type freely user-definable.
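Putting the pieces together in C11 might look roughly like the sketch below. The size of 16 and alignment of 8 are assumptions about the target platform, the struct wrapper is used both to make the type copyable and because _Alignas cannot be applied to a typedef directly, and get_data and opaque_demo are hypothetical additions. It keeps the pointer cast from above, which compilers accept in practice but which the strict aliasing rules do not clearly bless; copying in and out with memcpy is the conservative alternative.

```c
#include <assert.h>

/* ---- public header ---- */
#define OPAQUE_SIZE 16                                /* assumed size budget */
typedef struct {
    _Alignas(8) unsigned char bytes[OPAQUE_SIZE];     /* assumed alignment requirement */
} OpaqueType;

void set_data(OpaqueType *dst, int data);
int get_data(const OpaqueType *src);

/* ---- implementation file ---- */
typedef struct OpaqueTypeImpl {
    int data1;
    double data2;
} OpaqueTypeImpl;

/* Compilation fails if the public buffer ever stops being big or aligned enough. */
_Static_assert(sizeof(OpaqueTypeImpl) <= sizeof(OpaqueType),
               "OpaqueType buffer is too small");
_Static_assert(_Alignof(OpaqueTypeImpl) <= _Alignof(OpaqueType),
               "OpaqueType buffer is under-aligned");

void set_data(OpaqueType *dst, int data)
{
    OpaqueTypeImpl *actual_dst = (OpaqueTypeImpl *) dst;
    actual_dst->data1 = data;
}

int get_data(const OpaqueType *src)
{
    const OpaqueTypeImpl *actual_src = (const OpaqueTypeImpl *) src;
    return actual_src->data1;
}

/* ---- caller: no malloc anywhere ---- */
int opaque_demo(void)
{
    OpaqueType a = { { 0 } };   /* lives on the stack */
    set_data(&a, 123);
    OpaqueType b = a;           /* the struct wrapper makes plain assignment work */
    return get_data(&b);
}
```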
I've been tinkering with some code in an effort to understand OOP using C.
I really like this style and want to use it. The code sample works great if another class creates an instance of FooOBJ.
How can FooOBJ reference itself to change its own variables?
Do I need to make a copy of foo in the constructor or something like that or am I wandering away from the right way to use this methodology?
struct fooobj {
int privateint;
char *privateString;
};
FooOBJ newFooOBJ(){
FooOBJ foo=(FooOBJ)malloc(sizeof(struct fooobj));
bzero(foo, sizeof(struct fooobj));
return foo;
}
void setFooNumber(FooOBJ foo,int num){
if(foo==NULL) return; /* you may choose to debug-print something instead */
foo->privateint=num;
}
void setmyself(int val)
{
//this->privateint = val
}
Well, any function operating on an instance of your "class" will have to take a pointer to the instance. This happens automatically and implicitly in C++, but in C you'll have to pass a "this" pointer everywhere.
What this means is that your setFooNumber has the right signature for a "member function", whereas setmyself does not.
There's a reason C++ and other OO languages have an implicit parameter to instance methods. The only way this can be done is if you explicitly pass a this pointer. A function doesn't have access to something that isn't declared in an appropriate scope: locally or globally (parameters being local).
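Applied to the code in the question, that might look like the sketch below. The typedef for FooOBJ is an assumption about what the question's header contains, and deleteFooOBJ, getFooNumber and foo_demo are hypothetical helpers added so the example is complete.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct fooobj *FooOBJ;   /* assumed: the public header hides the struct behind this */

struct fooobj {
    int privateint;
    char *privateString;
};

FooOBJ newFooOBJ(void)
{
    return calloc(1, sizeof(struct fooobj));   /* zero-filled; NULL on failure */
}

void deleteFooOBJ(FooOBJ foo)
{
    free(foo);
}

/* The instance arrives as the first parameter and plays the role of "this". */
void setmyself(FooOBJ self, int val)
{
    if (self == NULL)
        return;
    self->privateint = val;
}

int getFooNumber(FooOBJ self)
{
    return self->privateint;
}

int foo_demo(void)
{
    FooOBJ foo = newFooOBJ();
    if (foo == NULL)
        return -1;
    setmyself(foo, 9);
    int n = getFooNumber(foo);
    deleteFooOBJ(foo);
    return n;
}
```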
To understand OOP in C, you'll need to understand how to simulate pure OO code in a procedural way.
I am especially interested in objects meant to be used from within C, as opposed to implementations of objects that form the core of interpreted languages such as python.
I tend to do something like this:
struct foo_ops {
void (*blah)(struct foo *, ...);
void (*plugh)(struct foo *, ...);
};
struct foo {
struct foo_ops *ops;
/* data fields for foo go here */
};
With these structure definitions, the code implementing foo looks something like this:
static void plugh(struct foo *, ...) { ... }
static void blah(struct foo *, ...) { ... }
static struct foo_ops foo_ops = { blah, plugh };
struct foo *new_foo(...) {
struct foo *foop = malloc(sizeof(*foop));
foop->ops = &foo_ops;
/* fill in rest of *foop */
return foop;
}
Then, in code that uses foo:
struct foo *foop = new_foo(...);
foop->ops->blah(foop, ...);
foop->ops->plugh(foop, ...);
This code can be tidied up with macros or inline functions so it looks more C-like
foo_blah(foop, ...);
foo_plugh(foop, ...);
although if you stick with a reasonably short name for the "ops" field, simply writing out the code shown originally isn't particularly verbose.
This technique is entirely adequate for implementing relatively simple object-based designs in C, but it does not handle more advanced requirements such as explicitly representing classes, or method inheritance. For those, you might need something like GObject (as EFraim mentioned), but I'd suggest making sure you really need the extra features of the more complex frameworks.
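For a version of the ops-table pattern that actually compiles, here is a small sketch with two "classes" sharing one object layout. The shape, rect and triangle names are invented for the example and are not part of the code above.

```c
#include <assert.h>
#include <stdlib.h>

struct shape;   /* forward declaration so the ops table can mention it */

struct shape_ops {
    int (*area)(const struct shape *self);
    void (*destroy)(struct shape *self);
};

struct shape {
    const struct shape_ops *ops;   /* one shared table per "class" */
    int w, h;
};

static int rect_area(const struct shape *self) { return self->w * self->h; }
static int tri_area(const struct shape *self)  { return self->w * self->h / 2; }
static void shape_destroy(struct shape *self)  { free(self); }

static const struct shape_ops rect_ops = { rect_area, shape_destroy };
static const struct shape_ops tri_ops  = { tri_area, shape_destroy };

static struct shape *shape_new(const struct shape_ops *ops, int w, int h)
{
    struct shape *s = malloc(sizeof *s);
    if (s != NULL) {
        s->ops = ops;
        s->w = w;
        s->h = h;
    }
    return s;
}

struct shape *new_rect(int w, int h)     { return shape_new(&rect_ops, w, h); }
struct shape *new_triangle(int w, int h) { return shape_new(&tri_ops, w, h); }

/* Inline wrappers hide the ->ops-> indirection from callers. */
static inline int shape_area(const struct shape *s) { return s->ops->area(s); }
static inline void shape_free(struct shape *s)      { s->ops->destroy(s); }

int shapes_demo(void)
{
    struct shape *r = new_rect(4, 3);
    struct shape *t = new_triangle(4, 3);
    int total = -1;
    if (r != NULL && t != NULL)
        total = shape_area(r) + shape_area(t);   /* 12 + 6 */
    if (r != NULL)
        shape_free(r);
    if (t != NULL)
        shape_free(t);
    return total;
}
```

Because the table is shared and const, each object carries a single pointer no matter how many methods the "class" has.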
Your use of the term "objects" is a bit vague, so I'm going to assume you're asking how to use C to achieve certain aspects of Object-Oriented Programming (feel free to correct me on this assumption.)
Method Polymorphism:
Method polymorphism is typically emulated in C using function pointers. For example if I had a struct that I used to represent an image_scaler ( something that takes an image and resizes it to new dimensions ), I could do something like this:
struct image_scaler {
//member variables
int (*scale)(int, int, int*);
};
Then, I could make several image scalers as such:
struct image_scaler nn, bilinear;
nn.scale = &nearest_neighbor_scale;
bilinear.scale = &bilinear_scale;
This lets me achieve polymorphic behavior for any function that takes in an image_scaler and uses its scale method, simply by passing it a different image_scaler.
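As a runnable illustration, the sketch below shrinks the idea down to a one-dimensional row of pixels. The signature of scale, the two resampling functions and middle_pixel are all assumptions made up for the example, not a real image API.

```c
#include <assert.h>

struct image_scaler {
    /* Resamples src[0..src_len) into dst[0..dst_len); returns 0 on success, -1 on bad sizes. */
    int (*scale)(const int *src, int src_len, int *dst, int dst_len);
};

static int nearest_neighbor_scale(const int *src, int src_len, int *dst, int dst_len)
{
    if (src_len <= 0 || dst_len <= 0)
        return -1;
    for (int i = 0; i < dst_len; i++)
        dst[i] = src[i * src_len / dst_len];
    return 0;
}

static int linear_scale(const int *src, int src_len, int *dst, int dst_len)
{
    if (src_len <= 0 || dst_len <= 0)
        return -1;
    for (int i = 0; i < dst_len; i++) {
        if (src_len == 1 || dst_len == 1) {
            dst[i] = src[0];
            continue;
        }
        int den = dst_len - 1;
        int num = i * (src_len - 1);      /* source position, in units of 1/den */
        int lo = num / den;
        int frac = num % den;
        int hi = (lo + 1 < src_len) ? lo + 1 : lo;
        dst[i] = (src[lo] * (den - frac) + src[hi] * frac) / den;
    }
    return 0;
}

static const struct image_scaler nn     = { nearest_neighbor_scale };
static const struct image_scaler linear = { linear_scale };

/* Works with any scaler: stretches the row {0, 10} to three pixels and returns the middle one. */
int middle_pixel(const struct image_scaler *scaler)
{
    const int src[2] = { 0, 10 };
    int dst[3] = { 0, 0, 0 };
    if (scaler->scale(src, 2, dst, 3) != 0)
        return -1;
    return dst[1];
}
```

middle_pixel never names a concrete algorithm; the caller picks one by passing &nn or &linear.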
Inheritance
Inheritance is usually achieved as such:
struct base{
int x;
int y;
};
struct derived{
struct base base; /* must be the first member */
int z;
};
Now, I'm free to use derived's extra fields, along with all the 'inherited' fields of base. Additionally, if you have a function that only takes a struct base pointer, you can simply cast your struct derived pointer to a struct base pointer with no consequences, provided the base struct is the first member of derived.
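A minimal sketch of that cast, assuming the base struct is embedded as a named first member; base_sum and inheritance_demo are hypothetical names.

```c
#include <assert.h>

struct base {
    int x;
    int y;
};

struct derived {
    struct base base;   /* first member, so a derived pointer also points at a base */
    int z;
};

/* Knows nothing about struct derived. */
static int base_sum(const struct base *b)
{
    return b->x + b->y;
}

int inheritance_demo(void)
{
    struct derived d = { { 1, 2 }, 3 };

    /* C guarantees that a pointer to a struct, suitably converted,
       points to its first member. */
    const struct base *as_base = (const struct base *)&d;
    assert(as_base == &d.base);

    return base_sum(as_base) + d.z;   /* (1 + 2) + 3 */
}
```

Writing &d.base instead of the cast gives the same pointer and keeps the compiler's type checking, so it is usually the better choice when the derived type is known.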
Libraries such as GObject.
Basically GObject provides a common way to describe opaque values (integers, strings) and objects (by manually describing the interface as a structure of function pointers, basically corresponding to a vtable in C++). More info on the structure can be found in its reference.
You would often also hand-implement vtables as in "COM in plain C"
As you can see from browsing all the answers, there are libraries,
function pointers, means of inheritance, encapsulation, etc., all
available (C++ was originally a front-end for C).
However, I have found that a VERY important aspect to software is
readability. Have you tried to read code from 10 years ago? As a
result, I tend to take the simplest approach when doing things like
objects in C.
Ask the following:
Is this for a customer with a deadline (if so, consider OOP)?
Can I use an OOP language (often less code, faster to develop, more readable)?
Can I use a library (existing code, existing templates)?
Am I constrained by memory or CPU (for example Arduino)?
Is there another technical reason to use C?
Can I keep my C very simple and readable?
What OOP features do I really need for my project?
I usually revert to something like the GLib API which allows me to
encapsulate my code and provides a very readable interface. If more
is needed, I add function pointers for polymorphism.
class_A.h:
typedef struct _class_A {...} Class_A;
Class_A* Class_A_new();
void Class_A_empty();
...
#include "class_A.h"
Class_A* my_instance;
my_instance = Class_A_new();
my_instance->Class_A_empty(); // can override using function pointers
Look at IJG's implementation. They not only use setjmp/longjmp for exception handling, they have vtables and everything. It is a well written and small enough library for you to get a very good example.
Similar to Dale's approach but a bit more of a footgun is how PostgreSQL represents parse tree nodes, expression types, and the like internally. There are default Node and Expr structs, along the lines of
typedef struct {
NodeTag n;
} Node;
where NodeTag is a typedef for unsigned int, and there's a header file with a bunch of constants describing all the possible node types. Nodes themselves look like this:
typedef struct {
NodeTag n; /* set to FOO_NODE when the node is created; C has no default member values */
/* other members go here */
} FooNode;
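and a pointer to a FooNode can be cast to a pointer to a Node with impunity, because of a guarantee C makes about structs: a pointer to a struct, suitably converted, points to its first member. Since every node struct begins with a NodeTag, the tag can always be read first to find out what the node really is.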
Yes, this means that a FooNode can be cast to a BarNode, which you probably don't want to do. If you want proper runtime type-checking, GObject is the way to go, though be prepared to hate life while you're getting the hang of it.
(note: examples from memory, I haven't hacked on the Postgres internals in a while. The developer FAQ has more info.)
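To show the shape of the idea without claiming anything about the real Postgres headers, here is a self-contained sketch. The tag values, the extra members, and the helpers node_tag, as_foo and node_demo are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int NodeTag;
enum { FOO_NODE = 1, BAR_NODE = 2 };   /* hypothetical tag values */

typedef struct { NodeTag n; } Node;
typedef struct { NodeTag n; int count; } FooNode;
typedef struct { NodeTag n; double weight; } BarNode;

/* Every node type starts with a NodeTag, so a pointer to any node can be
   converted to a pointer to that first member and read safely. */
static NodeTag node_tag(const void *node)
{
    return *(const NodeTag *)node;
}

/* Checked downcast: returns NULL rather than reinterpreting some other node type. */
static FooNode *as_foo(Node *node)
{
    return node_tag(node) == FOO_NODE ? (FooNode *)node : NULL;
}

int node_demo(void)
{
    FooNode foo = { FOO_NODE, 3 };
    BarNode bar = { BAR_NODE, 1.5 };

    Node *a = (Node *)&foo;   /* "upcasts" to the generic node type */
    Node *b = (Node *)&bar;

    assert(as_foo(b) == NULL);   /* a BarNode is refused */

    FooNode *f = as_foo(a);
    return f != NULL ? f->count : -1;
}
```

The tag check is what keeps the footgun pointed away from you: nothing in the language stops a direct cast from a BarNode pointer to a FooNode pointer, so every downcast should go through a helper like as_foo.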