I'm designing a program in C that manipulates geometric figures and it would be very convenient if every type of figure could be manipulated by the same primitives.
How can I do this in C?
You generally do it with function pointers. In other words, simple structures that hold both the data and pointers to functions which manipulate that data. We were doing that sort of stuff years before Bjarne S came onto the scene.
So, for example, in a communications class, you would have an open, read, write and close call which would be maintained as four function pointers in the structure, alongside the data for an object, something like:
typedef struct {
int (*open)(void *self, char *fspec);
int (*close)(void *self);
int (*read)(void *self, void *buff, size_t max_sz, size_t *p_act_sz);
int (*write)(void *self, void *buff, size_t max_sz, size_t *p_act_sz);
// And the data for the object goes here.
} tCommsClass;
tCommsClass commRs232;
commRs232.open = &rs232Open;
: :
commRs232.write = &rs232Write;
tCommsClass commTcp;
commTcp.open = &tcpOpen;
: :
commTcp.write = &tcpWrite;
The initialisation of those function pointers would actually be in a "constructor" such as rs232Init(tCommsClass*), which would be responsible for setting up the default state of that particular object to match a specific class.
When you 'inherit' from that class, you just change the pointers to point to your own functions. Everyone that called those functions would do it through the function pointers, giving you your polymorphism:
int stat = (commTcp.open)(&commTcp, "bigiron.box.com:5000");
Sort of like a manually configured vtable, in C++ parlance.
You could even have virtual classes by setting the pointers to NULL. The behaviour would be slightly different from C++, inasmuch as you would probably get a core dump at run time rather than an error at compile time.
Here's a piece of sample code that demonstrates it:
#include <stdio.h>
// The top-level class.
typedef struct _tCommClass {
int (*open)(struct _tCommClass *self, char *fspec);
} tCommClass;
// Function for the TCP class.
static int tcpOpen (tCommClass *tcp, char *fspec) {
printf ("Opening TCP: %s\n", fspec);
return 0;
}
static int tcpInit (tCommClass *tcp) {
tcp->open = &tcpOpen;
return 0;
}
// Function for the HTML class.
static int htmlOpen (tCommClass *html, char *fspec) {
printf ("Opening HTML: %s\n", fspec);
return 0;
}
static int htmlInit (tCommClass *html) {
html->open = &htmlOpen;
return 0;
}
// Test program.
int main (void) {
int status;
tCommClass commTcp, commHtml;
// Same base class but initialized to different sub-classes.
tcpInit (&commTcp);
htmlInit (&commHtml);
// Called in exactly the same manner.
status = (commTcp.open)(&commTcp, "bigiron.box.com:5000");
status = (commHtml.open)(&commHtml, "http://www.microsoft.com");
return 0;
}
This produces the output:
Opening TCP: bigiron.box.com:5000
Opening HTML: http://www.microsoft.com
so you can see that the different functions are being called, depending on the sub-class.
I'm astonished that no one has mentioned GLib, GTK and the GObject system.
So instead of baking yet another OO layer on top of C, why not use something that has proven to work?
Regards
Friedrich
People have done silly things with various types of structs and relying on predictable padding - for example you can define a struct with a particular subset of another struct and it'll usually work. See below (code stolen from Wikipedia):
struct ifoo_version_42 {
long x, y, z;
char *name;
long a, b, c;
};
struct ifoo_old_stub {
long x, y;
};
void operate_on_ifoo(struct ifoo_version_42 *);
struct ifoo_old_stub s;
...
operate_on_ifoo(&s);
In this example, the ifoo_old_stub could be considered a superclass. As you can probably figure out, this relies on the fact that the same compiler will pad the two structs equivalently, and trying to access the x and y of a version-42 will work even if you pass a stub. This ought to work in the reverse as well. But AFAIK it doesn't necessarily work across compilers, so be careful if you want to send a struct of this format over the network, or save it in a file, or call a library function with one.
There's a reason polymorphism in C++ is pretty complicated to implement... (vtables, etc)
We have an anonymous type, typedefed to void *, which is the handle for an API (all code in C11). It is deliberately void * as what it is pointing to changes depending on the platform we are compiled for and we also don't want the application to try dereferencing it. Internally we know what it should be pointing to and we cast it appropriately. This is fine, the code is public, we've been using it for years, it cannot be changed.
The problem is that we now need to introduce another one of these, and we don't want the user to get the two confused, we want the compiler to throw an error if the wrong handle is passed to one of our functions. However, all of the versions of all of the C compilers I have tried so far (GCC, Clang, MSVC) don't care; they know that the underlying type is void * and so anything goes (this is with -Wall and -Werror). Putting it another way, our typedef has not achieved anything, we might as well have just used void *. I have also tried Lint and CodeChecker, who also don't seem to care (though you could probably question my configurations for these). Note that I am not able to use -Wpedantic as we include third party code where that wouldn't fly.
I have tried making the new thing a specific typedefed pointer rather than a void * but that doesn't entirely fix things as the compiler is still happy for the caller to pass that new specific typedefed pointer into the existing functions that are expecting the existing handle typedef.
Is there (a) a way to construct a new anonymous handle such that the compiler will not allow it to be passed to the existing functions or (b) a checker that we can apply to pick the problem up, at least in our own use of these APIs?
Here is some code to illustrate the problem:
#include <stdlib.h>
typedef struct {
int contents;
} existingThing_t;
typedef void *anonExistingHandle_t;
typedef struct {
char contents[10];
} newThing_t;
typedef void *anonNewHandle_t;
typedef newThing_t *newHandle_t;
static void functionExisting(anonExistingHandle_t handle)
{
existingThing_t *pThing = (existingThing_t *) handle;
// Perform the function
(void) pThing;
}
static void functionNew(anonNewHandle_t handle)
{
newThing_t *pThing = (newThing_t *) handle;
// Perform a new function
(void) pThing;
}
int main() {
anonExistingHandle_t existingHandle = NULL;
anonNewHandle_t newHandleA = NULL;
newHandle_t newHandleB = NULL;
functionExisting(existingHandle);
functionNew(newHandleA);
// These should result in a compilation error
functionExisting(newHandleA);
functionNew(existingHandle);
functionExisting(newHandleB);
return 0;
}
Is there (a) a way to construct a new anonymous handle such that the compiler will not allow it to be passed to the existing functions
Yes, use a type that can't be implicitly converted to void *. Use a structure.
typedef struct {
struct newThing_s *p;
} anonNewHandle_t;
Anyway, your design is just flawed and disables all static compiler checks. Do not use void *, instead use structures or structures with void * inside, to enable compile checks. Research how the very, very standard FILE * works. FILE is not void.
Do not use typedef pointers. They are very confusing. https://wiki.sei.cmu.edu/confluence/display/c/DCL05-C.+Use+typedefs+of+non-pointer+types+only
I suggest rewriting your library so that you do not use void * and do not use typedef pointers.
The design may look like the following:
// handle.h
struct handle_s;
typedef struct {
struct handle_s *p;
} handle_t;
handle_t handle_init(void);
void handle_deinit(handle_t t);
void handle_do_something(handle_t t);
// handle.c
struct handle_s {
int the_stuff_you_need;
};
handle_t handle_init(void) {
return (handle_t){
.p = calloc(1, sizeof(struct handle_s))
};
}
void handle_do_something(handle_t h) {
struct handle_s *t = h.p; // h is passed by value, so use '.' rather than '->'
// etc.
}
// anotherhandle.h
// similar to above
typedef struct {
struct anotherhandle_s *p;
} anotherhandle_t;
void anotherhandle_do_something(anotherhandle_t h);
// main
int main() {
handle_t h = handle_init();
handle_do_something(h);
handle_deinit(h);
anotherhandle_do_something(h); // compiler error
}
I have the following in C (not C++!):
module.c
struct Private {...};
void foo(void* private, int param) {...}
module.h
#define PRIVATE_SIZE ???;
void foo(void* private, int param);
main.c
char m1[PRIVATE_SIZE];
char m2[PRIVATE_SIZE];
int main()
{
foo(m1, 10);
foo(m2, 20);
}
How can I expose sizeof(Private) at compile time so that application can statically allocate its storage without exposing Private type?
Note, this is a very limited embedded system and dynamic allocation is not available.
You shouldn't expose the size of the struct to the caller, because that breaks the whole purpose of having private encapsulation in the first place. Allocation of your private data is no business of the caller. Also, avoid using void* because it completely lacks type safety.
This is how you write private encapsulation in C:
In module.h, forward declare an incomplete type typedef struct module module;.
In module.c, place the struct definition of this struct. It will only be visible to module.c and not to the caller. This is known as an opaque type.
The caller can only allocate pointers to this struct, never allocate objects.
Caller code might look like:
#include "module.h"
...
module* m;
result = module_init(&m);
And the module_init function acts as a "constructor", declared in module.h and defined in module.c:
bool module_init (module** obj)
{
module* m = malloc(sizeof *m);
...
m->something = ...; // init private variables if applicable
*obj = m;
return true;
}
If the caller does need to know the size of the objects, it would only be for the purpose of hard copy etc. If there's a need for that, provide a copy function which encapsulates the allocation and copy ("copy constructor"), for example:
bool module_copy (module** dst, const module* src);
Edit:
Please note that the manner of allocation is a separate issue. You don't have to use dynamic allocation for the above design. In embedded systems for example, it is common to use static memory pools instead. See Static allocation of opaque data types
You can't allocate space for a struct such as this because its size isn't known at compile time. Even if you did know the size at run time, you'd still have issues due to alignment.
There is a possible solution which involves defining a separate structure that has the same size and alignment requirements as the private struct.
For example:
module.h:
#include <inttypes.h>
struct Public {
uint64_t opaque1;
uint64_t opaque2;
uint64_t opaque3;
};
void init(struct Public *p);
module.c:
#include <assert.h>
#include <stdalign.h>
#include "module.h"
struct Private {
int a;
double b;
float c;
};
static_assert(sizeof(struct Private)==sizeof(struct Public), "sizes differ");
static_assert(alignof(struct Private)==alignof(struct Public), "alignments differ");
void init(struct Public *p)
{
struct Private *pr = (struct Private *)p;
pr->a = 2;
pr->b = 2.5;
pr->c = 2.4f;
}
The static assertions guarantee that the Public and Private structs have the same size and alignment. There is the possibility of the user writing to the "opaque" fields of the public struct, in which case you could have aliasing issues regarding effective types, but if the user can be trusted not to do that then this should work.
Another, more robust option, is if you have some idea of the maximum number of objects you want to support. If that's the case, you can have a static array of these objects in your implementation file, and the init function would return a pointer to one of the objects in this list. Then you'd have a related cleanup function that would free the instance.
For example:
module.c:
struct Private {
int a;
double b;
float c;
};
struct PrivateAllocator {
struct Private obj;
int used;
};
struct PrivateAllocator list[5] = {
{ { 0, 0, 0}, 0 },
{ { 0, 0, 0}, 0 },
{ { 0, 0, 0}, 0 },
{ { 0, 0, 0}, 0 },
{ { 0, 0, 0}, 0 }
};
struct Private *private_init()
{
int i;
for (i=0; i<5; i++) {
if (!list[i].used) {
list[i].used = 1;
return &list[i].obj;
}
}
return NULL;
}
void private_free(struct Private *p)
{
int i;
for (i=0; i<5; i++) {
if (&list[i].obj == p) {
list[i].used = 0;
return;
}
}
}
In conforming C code you can't create a static instance of an arbitrary unknown type even if you know its size at compile time (not even if you know the alignment).
Let's say you try doing it anyway. How would you do it, given the size in a macro or enum PRIVATE_SIZE?
unsigned char obj[PRIVATE_SIZE];
And then you'd pass (void*)obj to wherever it's needed, right?
Well, this breaks the aliasing rules. While you can legally access any individual char/byte in any object, you can't do it the other way around saying that those chars are not chars, they are just storage behind other types. That is, you can't legally have a short int superimposed on top of, say, obj[2] and obj[3] through smarty-pants casts (e.g. ((struct Private*)obj)->my_short = 2;). The only legal way to do something like this would be through memcpy(), e.g. memcpy(&temp, obj, sizeof temp); and then back after the modification. Or you'd need to work with individual chars of obj[].
There are two possible ways to sort of do it. One is described in another answer, basically define the instance where the type is known, but only let the outside world have a pointer to it.
Another, very similar, define it in assembly code and, again, let the outside world have a pointer to it. The "beauty" of the assembly way is that you really only need a name, an alignment and a size to allocate space for a named object.
And if you put the instances into a special data section (see the gcc's section attribute and the linker scripts), you may even have all of the instances in the same place (think, array) and even find out their cumulative size and therefore count.
Yet another thing to do while not explicitly violating any C rules is to still use this unsigned char obj[PRIVATE_SIZE] trick, but launder it by passing it unchanged through an assembly function that the C compiler can't look into, e.g. something like
// struct Private* launder(unsigned char*);
.text
.globl launder
launder:
move %first_param_reg, %return_reg
ret
But you'll really need to change unsigned char obj[PRIVATE_SIZE] to something that would have proper alignment on your architecture, e.g. double obj[PRIVATE_SIZE / sizeof(double)] (or the same with long long if you like that way better).
As for PRIVATE_SIZE, you can have a check at compile time that it matches the size of the type, e.g.
#include "mod.h" // mod.h defines PRIVATE_SIZE
struct Private { ... };
// The array size becomes -1, which cannot compile, when the sizes differ.
extern char StAtIcAsSeRt[(sizeof(struct Private) == PRIVATE_SIZE) ? 1 : -1];
How to expose C struct size without exposing its type?
If able to compromise a bit: (statically --> main() local)
with variable length arrays (C99), use a helper function and put the array in main().
module.h
size_t foo_size(void);
main.c
int main() {
char m1[foo_size()];
foo(m1, 10);
}
Additional work needed to account for alignment issues.
Consider relaxing your goal as suggested.
C99 allows you to use a variable-length array.
private.h:
#include <stddef.h> // for size_t
extern const size_t size;
private.c:
#include "private.h"
struct Private {
int x;
int y;
int z;
};
const size_t size = sizeof(struct Private);
main.c:
#include <stdio.h>
#include "private.h"
int main(void) {
char m1[size]; //variable length array
printf("Size of m1 = %zu\n", sizeof(m1));
}
Can structures contain functions?
No, but they can contain function pointers.
If your intent is to do some form of polymorphism in C then yes, it can be done:
typedef struct {
int (*open)(void *self, char *fspec);
int (*close)(void *self);
int (*read)(void *self, void *buff, size_t max_sz, size_t *p_act_sz);
int (*write)(void *self, void *buff, size_t max_sz, size_t *p_act_sz);
// And data goes here.
} tCommClass;
The typedef above was for a structure I created for a general purpose communications library. In order to initialise the variable, you would:
tCommClass *makeCommTcp (void) {
tCommClass *comm = malloc (sizeof (tCommClass));
if (comm != NULL) {
comm->open = &tcpOpen;
comm->close = &tcpClose;
comm->read = &tcpRead;
comm->write = &tcpWrite;
}
return comm;
}
tCommClass *makeCommSna (void) {
tCommClass *comm = malloc (sizeof (tCommClass));
if (comm != NULL) {
comm->open = &snaOpen;
comm->close = &snaClose;
comm->read = &snaRead;
comm->write = &snaWrite;
}
return comm;
}
tCommClass *commTcp = makeCommTcp();
tCommClass *commSna = makeCommSna();
Then, to call the functions, something like:
// Pass commTcp as first params so we have a self/this variable
// for accessing other functions and data area of object.
int stat = (commTcp->open)(commTcp, "bigiron.box.com:5000");
In this way, a single type could be used for TCP, SNA, RS232 or even carrier pigeons, with exactly the same interface.
Not in C. struct types can only contain data.
From Section 6.7.2.1 of the ISO C99 Standard.
A structure or union shall not contain a member with incomplete or function type (hence,
a structure shall not contain an instance of itself, but may contain a pointer to an instance
of itself), except that the last member of a structure with more than one named member
may have incomplete array type; such a structure (and any union containing, possibly
recursively, a member that is such a structure) shall not be a member of a structure or an
element of an array.
No, you cannot. A structure cannot contain the definition of a function, but it can contain a pointer to a function. A structure can only contain data members: ordinary data types, pointers, and pointers to functions. You can make a pointer to a function and then call the function through the structure.
#include<iostream>
#include<cstring>
using namespace std;
struct full_name
{
const char *fname;
const char *lname;
void (*show)(const char *, const char *);
};
void show(const char *a1, const char *a2)
{
cout<<a1<<"-"<<a2<<endl;
}
int main()
{
struct full_name loki;
loki.fname="Mohit";
loki.lname="Dabas";
loki.show=show;
loki.show(loki.fname,loki.lname);
return 0;
}
As far as I understood, structures in C are only allowed to contain data values, not function pointers. But the following works perfectly fine when checked with gcc.
#include <stdio.h>
struct st_func_ptr{
int data;
int (*callback) ();
};
int cb(){
printf(" Inside the call back \n");
return 0;
}
int main() {
struct st_func_ptr sfp = {10, cb};
printf("return value = %d \n",sfp.callback());
printf(" Inside main\n");
return 0;
}
So, I am confused ...
It's all right.
In the Linux kernel code, you will find many structures that contain function pointers,
such as:
/*
* The type of device, "struct device" is embedded in. A class
* or bus can contain devices of different types
* like "partitions" and "disks", "mouse" and "event".
* This identifies the device type and carries type-specific
* information, equivalent to the kobj_type of a kobject.
* If "name" is specified, the uevent will contain it in
* the DEVTYPE variable.
*/
struct device_type {
const char *name;
struct attribute_group **groups;
int (*uevent)(struct device *dev, struct kobj_uevent_env *env);
void (*release)(struct device *dev);
int (*suspend)(struct device * dev, pm_message_t state);
int (*resume)(struct device * dev);
};
Yes, it's possible to declare a function pointer as a structure member. A function definition is not allowed inside a structure, so the member has to be a function pointer.
It's based on C99 tagged structures.
Lokesh V
They can, but there is no inherent advantage in usual C programming.
In C, all functions are in the global space anyway, so you get no information hiding by tucking them into a struct. paxdiablo's example is a way to organize functions into a struct, but as you can see, you have to dereference each one anyway to use it.
The standard organizational structure of C is the file, with the interfaces in the header and the implementations in the source.
That is how libc is done and that is how almost all C libraries are done.
Modern C compilers allow you to define and implement functions in the same source file, and even implement static functions in header files. This unfortunately leads to some confusion as to what goes where, and you can get unusual solutions like cramming functions into structs, source-only programs with no headers, etc.
You lose the advantage of separating interface from implementation that way.
I am trying to understand the existing code.
When do we actually go for function pointers? specially like the one below.
struct xx
{
char *a;
void (*func)(char *a, void *b);
void *b;
};
struct xx ppp[] = { };
then check sizeof(ppp)/sizeof(*ppp);
when do we go with such kind of approach?
sizeof array / sizeof *array is a way of finding out how many elements are in an array. (Note that it must be an array rather than a pointer.) I'm not sure how that's related to your function pointer question.
Function pointers are used to store a reference to a function so that it can be called later. The key thing is that a function pointer needn't always point to the same function. (If it did, you could just refer to the function by name.)
Here's an example based on your code (although I could provide a better one if I knew what your code was supposed to do):
char *s1 = "String one";
char *s2 = "String two";
void f(char *a, void *b) {
/* Do something with a and b */
}
void g(char *a, void *b) {
/* Do something else with a and b */
}
struct xx {
char *a;
void (*func)(char *a, void *b);
void *b;
};
struct xx ppp[] = { {s1, f, NULL}, {s2, g, NULL} };
int main(int argc, char **argv) {
for (int i = 0; i < (sizeof ppp / sizeof *ppp); i++) {
ppp[i].func(ppp[i].a, ppp[i].b);
}
}
There are two major uses (that I know of) for function pointers in C.
1. Callbacks
You have some sort of event-driven framework (a GUI is one of the easiest examples), and the program wants to react to events as they happen. Now you can do that with an event pump, like
event *e;
while ((e = get_one_event()) != NULL) {
switch (e->type) {
case EVT_CLICK:
...
}
}
but that gets tiring after a while. The other major alternative is callbacks. The program defines a bunch of functions to handle different events, and then registers them with the library: "when event X happens, call function Y" -- so of course, the library is going to receive a function pointer, and call it at the relevant time.
2. Objects (function tables / vtables)
If you've done OO in most other languages, this should be fairly easy for you to picture. Imagine an object as a struct that contains its members and then a bunch of function pointers (or, maybe more likely, its members and a pointer to another struct representing its class, that contains a bunch of function pointers). The function pointers in the table are the object's methods. GLib/GObject is a big user of this technique, as is the Linux kernel (struct file_operations, struct device_driver, struct bus_type, and many many more). This lets us have an arbitrary number of objects with different behavior, without multiplying the amount of code.
When do we actually go for function pointers? specially like the one below.
You use a function pointer when you want to make something more abstract.
For example, suppose your application has a graphical toolbox with a certain number of buttons. Every button corresponds to an instance of a certain struct.
The button structure can contain a function pointer and a context:
typedef struct {
void (*press_button) (void *context);
void *context;
/* Here some other stuff */
} Button;
When the user clicks the button, the event is something like
void event_click (Button *b)
{
b->press_button(b->context);
}
The point in doing this is that you can always use the same structure for each button:
Button * create_button (void (*callback) (void *), void *context /*, other params */)
{
Button *ret = malloc(sizeof(Button));
if (ret != NULL) {
ret->press_button = callback;
ret->context = context;
/* Assign other params */
}
...
return ret;
}
So when you build your toolbox you probably do something like
Button * toolbox[N];
toolbox[0] = create_button(function1, (void *)data, ...);
toolbox[1] = create_button(function2, (void *)something, ...);
...
toolbox[N-1] = create_button(functionN, (void *)something_else, ...);
Also, when you create a function pointer, always carry some context information (like I did with the context field of the struct). This lets you avoid global variables, so you get robust and reentrant code!
Note:
This method is awesome, but if you deal with C++ you may prefer to use object orientation and replace callbacks with derivation from abstract classes. By doing this you also don't need to carry the context, since the class will do it for you.
Edit in answer of first comment:
The current code I am going through is related to file IO. setting an environment variable and creating symbolic links between files, copying data from one file to another, etc. I am not understanding why do we need to call these functions at run time using function pointers. we can as well call them directly.
In fact, you can do what you need without using function pointers. If I understand your problem correctly, you are trying to understand someone else's code, which does what you listed by means of function pointers.
Personally I don't use this feature unless I need it but if you post here some additional code maybe we can try to understand it better.
I've seen the concept of 'opaque types' thrown around a bit but I really haven't found a succinct answer as to what defines an opaque type in C and more importantly what problems they allow us to solve with their existence. Thanks
Opaque types are most commonly used in libraries. The main principle behind an opaque type in C is to use data only through a pointer to it, in order to hide how the data is implemented. Since the implementation is hidden, you can modify the library without recompiling any program which depends on it (as long as the interface is respected).
eg:
version 1:
// header file
struct s;
int s_init(struct s **x);
int s_f(struct s *x);
int s_g(struct s *x);
// source file
struct s { int x; };
int s_init(struct s **x) { *x = malloc(...); }
int s_f(..) { ... }
int s_g(..) { ... }
version 2
// header file
struct s;
int s_init(struct s **x);
int s_f(struct s *x);
int s_g(struct s *x);
// source file
struct s { int y; int x; };
int s_init(struct s **x) { *x = malloc(...); }
int s_f(..) { ... }
int s_g(..) { ... }
From your program's side, nothing changed! And as said previously, there is no need to recompile every single program which relies on it.
In my understanding, opaque types are those which allow you to hold a handle (i.e., a pointer) to a structure, but not modify or view its contents directly (if you are allowed to at all, you do so through helper functions which understand the internal structure).
Opaque types are, in part, a way to make C more object-oriented. They allow encapsulation, so that the internal details of a type can change--or be implemented differently in different platforms/situations--without the code that uses it having to change.
An opaque type is a type which is exposed in APIs via a pointer but never concretely defined.