I need to create a library in C and I am wondering how to manage objects: returning allocated (ex: fopen, opendir) or in-place initialization (ex: GNU hcreate_r).
I understand that it is mostly a question of taste, and I'm inclined to choose the allocating API because of the convenience when doing lazy initialization (by testing if the object pointer is NULL).
However, after reading Ulrich's paper (PDF), I'm wondering if this design will cause locality of reference problems, especially if I compose objects from others:
struct opaque_composite {
struct objectx *member1;
struct objecty *member2;
struct objectz *member3;
/* ... */
};
Allocation of such an object will make a cascade of other sub-allocations. Is this a problem in practice? And are there other issues that I should be aware of?
The thing to consider is whether the type of the object the function constructs is opaque. An opaque type is only forward-declared in the header file, and the only thing you can do with it is hold a pointer to it and pass that pointer to separately compiled API functions. FILE in the standard library is such an opaque type. For an opaque type, you have no option but to provide an allocation and a deallocation function, as the user has no other way to obtain a reference to an object of that type.
If the type is not opaque – that is, the definition of the struct is in the header file – it is more versatile to have a function that does only initialization – and, if required, another that does finalization – but no allocation and deallocation. The reason is that with this interface, the user can decide whether to put the objects on the stack…
struct widget w;
widget_init(&w, 42, "lorem ipsum");
// use widget…
widget_fini(&w);
…or on the heap.
struct widget * wp = malloc(sizeof(struct widget));
if (wp == NULL)
exit(1); // or do whatever
widget_init(wp, 42, "lorem ipsum");
// use widget…
widget_fini(wp);
free(wp);
If you think that this is too much typing, you – or your users themselves – can easily provide convenience functions.
static inline struct widget *
new_widget(const int n, const char *const s)
{
struct widget *wp = malloc(sizeof(struct widget));
if (wp != NULL)
widget_init(wp, n, s);
return wp;
}
static inline void
del_widget(struct widget * wp)
{
widget_fini(wp);
free(wp);
}
Going the other way round is not possible.
Interfaces should always provide the essential building blocks to compose higher-level abstractions but not make legitimate uses impossible by being overly restrictive.
Of course, this leaves us with the question when to make a type opaque. A good rule of thumb – that I have first seen in the coding standards for the Linux kernel – might be to make types opaque only if there are no data members your users could meaningfully access. I think this rule should be refined a little to take into account that non-opaque types allow for “member” functions to be provided as inline versions in the header files which might be desirable from a performance point of view. On the other hand, opaque types provide better encapsulation (especially since C has no way to restrict access to a struct's members). I would also lean towards an opaque type more easily if making it not opaque would force me to #include headers into the header file of my library because they provide definitions of the types used as members in my type. (I'm okay with #includeing <stdint.h> for uint32_t. I'm a little less easy about #includeing a large header such as <unistd.h> and I'd certainly try to avoid having to #include a header from a third-party library such as <curses.h>.)
IMO the "cascade of sub-allocations" is not a problem if you keep the object opaque so you can keep it in a consistent state. The creation and destruction routines will have some added complexity dealing with an allocation failure part way through creation, but nothing too onerous.
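As a rough sketch of what "dealing with an allocation failure part way through creation" can look like, here is one common pattern: a single cleanup path that is safe to run on a partially built object. The sub-object definitions and the names composite_create and composite_destroy are made up for illustration; a real library would call each member's own create/destroy functions instead of bare malloc/free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sub-objects, kept trivial for the example. */
struct objectx { int value; };
struct objecty { double value; };

struct opaque_composite {
    struct objectx *member1;
    struct objecty *member2;
};

/* Safe on a partially constructed object: free(NULL) is a no-op. */
void composite_destroy(struct opaque_composite *c)
{
    if (c == NULL)
        return;
    free(c->member2);
    free(c->member1);
    free(c);
}

/* Returns NULL if any allocation in the cascade fails; nothing leaks. */
struct opaque_composite *composite_create(void)
{
    struct opaque_composite *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    /* Null the members first so the cleanup path can run at any point. */
    c->member1 = NULL;
    c->member2 = NULL;

    c->member1 = malloc(sizeof *c->member1);
    if (c->member1 == NULL)
        goto fail;
    c->member2 = malloc(sizeof *c->member2);
    if (c->member2 == NULL)
        goto fail;

    c->member1->value = 0;
    c->member2->value = 0.0;
    assert(c->member1 != NULL && c->member2 != NULL);
    return c;

fail:
    composite_destroy(c);
    return NULL;
}
```

Because the type is opaque, callers only ever see a fully built object or NULL, never the intermediate states.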
Besides the option to have a static/stack-allocated copy (which I'm generally not fond of anyway), in my mind, the main advantage of a scheme like:
x = initThang(thangPtr);
is the ease of returning a variety of more specific error codes.
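To illustrate the point about error codes, here is a minimal sketch of an in-place initializer that reports why it failed. Everything in it (struct thang's members, the enum values, the capacity limit of 1024) is invented for the example and not taken from any real library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; an allocating API could only return NULL. */
enum thang_status {
    THANG_OK = 0,
    THANG_ERR_NULL,   /* caller passed a null pointer */
    THANG_ERR_RANGE   /* requested capacity is out of range */
};

struct thang {
    int capacity;
    int used;
};

/* The return value is free to carry a specific error because the
   object itself is handed in by the caller. */
enum thang_status thang_init(struct thang *t, int capacity)
{
    if (t == NULL)
        return THANG_ERR_NULL;
    if (capacity <= 0 || capacity > 1024)
        return THANG_ERR_RANGE;
    t->capacity = capacity;
    t->used = 0;
    return THANG_OK;
}

int thang_remaining(const struct thang *t)
{
    assert(t != NULL);
    return t->capacity - t->used;
}
```

An allocating function can get the same effect with an extra out-parameter for the status, at the cost of a slightly clumsier signature.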
Related
I want to create an API in C. My goal is to implement abstractions to access and mutate struct variables that are defined in the API.
API's header file:
#ifndef API_H
#define API_H
struct s_accessor {
struct s* s_ptr;
};
void api_init_func(struct s_accessor *foo);
void api_mutate_func(struct s_accessor *foo, int x);
void api_print_func(struct s_accessor *foo);
#endif
API's implementation file:
#include <stdio.h>
#include "api.h"
struct s {
int internal;
int other_stuff;
};
void api_init_func(struct s_accessor* foo) {
foo->s_ptr = NULL;
}
void api_print_func(struct s_accessor *foo)
{
printf("Value of member 'internal' = %d\n", foo->s_ptr->internal);
}
void api_mutate_func(struct s_accessor *foo, int x)
{
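/* static storage, so the pointer stays valid after this function returns;
   a plain local would leave s_ptr dangling (all handles share this one object) */
static struct s bar;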
foo->s_ptr = &bar;
foo->s_ptr->internal = x;
}
Client-side program that uses the API:
#include <stdio.h>
#include "api.h"
int main()
{
struct s_accessor foo;
api_init_func(&foo); // set s_ptr to NULL
api_mutate_func(&foo, 123); // change value of member 'internal' of an instance of struct s
api_print_func(&foo); // print member of struct s
}
I have the following questions regarding my code:
Is there a direct (non-hackish) way to hide the implementation of my API?
Is this the proper way to create abstractions for the client-side to use my API? If not, how can I improve this to make it better?
"Accessor" isn't a good terminology. This term is used in object oriented programming to denote a kind of method.
The structure type struct s_accessor is in fact something called a handle. It contains a pointer to the real object. A handle is a doubly indirect pointer: the application passes around pointers to handles, and the handles contain pointers to the objects.
An old adage says that "any problem in computer science can be solved by adding another layer of indirection", of which handles are a prime example. Handles allow objects to be moved from one address to another or to be replaced. Yet, to the application, the handle address represents the object, and so when the implementation object is relocated or replaced, as far as the application is concerned, it is still the same object.
With a handle we can do things like:
have a vector object that can grow
have OOP objects that can apparently change their class
relocate variable-length objects such as buffers and strings to compact their memory footprint
all without the object changing its memory address and thus identity. Because the handle stays the same when these changes occur, the application does not have to hunt down every copy of the object pointer and replace it with a new one; the handle effectively takes care of that in one place.
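A small sketch of the first case, a vector that grows behind a handle, may make this concrete. The names (struct vec, struct vec_handle and the vec_handle_* functions) are hypothetical, and overflow checks on the size arithmetic are left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* The real object: header and elements in one block that may move. */
struct vec {
    size_t len;
    size_t cap;
    int items[];            /* C99 flexible array member */
};

/* The handle: its address is what the application passes around. */
struct vec_handle {
    struct vec *v;
};

int vec_handle_init(struct vec_handle *h, size_t cap)
{
    assert(h != NULL && cap > 0);
    h->v = malloc(sizeof *h->v + cap * sizeof h->v->items[0]);
    if (h->v == NULL)
        return -1;
    h->v->len = 0;
    h->v->cap = cap;
    return 0;
}

int vec_handle_push(struct vec_handle *h, int x)
{
    struct vec *v = h->v;
    if (v->len == v->cap) {
        size_t new_cap = v->cap * 2;
        struct vec *grown = realloc(v, sizeof *grown + new_cap * sizeof grown->items[0]);
        if (grown == NULL)
            return -1;      /* the old block is still valid and still in h->v */
        grown->cap = new_cap;
        h->v = v = grown;   /* the object may have moved; only the handle is updated */
    }
    v->items[v->len++] = x;
    return 0;
}

int vec_handle_get(const struct vec_handle *h, size_t i)
{
    assert(i < h->v->len);
    return h->v->items[i];
}

void vec_handle_fini(struct vec_handle *h)
{
    free(h->v);
    h->v = NULL;
}
```

Any number of copies of the struct vec_handle pointer stay valid across a resize, because the only pointer that realloc can invalidate lives in exactly one place.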
In spite of all of that, handles tend to be unusual in C APIs, in particular lower-level ones. Given an API that does not use handles, you can whip up handles around it. Even if you think that the users of your object will benefit from handles, it may be good to split up the API into two: an internal one which only deals with s, and the external one with the struct s_handle.
If you're using threads, then handles require careful concurrent programming. That is to say, even though being able to change the handle-referenced object is convenient from the application's point of view, it requires synchronization. Say we have a vector object referenced by a handle. Application code is working with it, so we can't just suddenly replace the vector with a pointer to a different one (in order to resize it): another thread may be in the middle of working with the original pointer. The operations that access the vector or store values into it through the handle must be synchronized with the replacement operation. Even if all of that is done right, it's going to add a lot of overhead, so application developers may notice performance problems and ask for escape hatches in the API, such as a function to "pin" a handle so that the object cannot move while an efficient operation works directly with the s object inside it.
For that reason, I would tend to stay away from designing a handle API, and make that sort of thing the application's problem. It may well be easier for a multi-threaded application to just use a well-designed "just the s please" API correctly, than to write a completely thread-safe, robust, efficient struct s_handle layer.
Is there a direct (non-hackish) way to hide the implementation of my API?
Basically the "rule #1" of hiding the implementation of an API in C is not to allow an init operation whereby the client application declares some memory and your API initializes it. That said, it is possible like this:
typedef struct opaque opaque_t;
#ifndef OPAQUE_IMPL
struct opaque {
int dummy[42]; // big enough for all future extension
};
#endif
void opaque_init(opaque_t *o);
In this declaration, we have revealed nothing to the client, other than that our objects are buffers of memory that require int alignment, and are at least 42 int wide.
In actual fact, the objects are smaller; we have just added a reserve amount for future growth. We can make our actual object larger without having to re-compile the clients, as long as our object does not require more than sizeof(int [42]) bytes.
The reason for the #ifndef is that the implementation code will do something like this:
#define OPAQUE_IMPL // suppress the fake definition in the header
#include "opaque.h"
// actual definition
struct opaque {
int whatever;
char *name;
};
This kind of thing plays it loose with the "law" of ISO C, because effectively the client and implementation are using a different definition of the struct opaque type.
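If you go this route, the "does not require more than the reserve" condition can be checked by the compiler rather than by discipline. The sketch below assumes a C11 compiler for static_assert. Since a single translation unit cannot hold two definitions of struct opaque, it uses two stand-in names (opaque_reserved and opaque_real) that are not part of the scheme above.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Stand-in for the placeholder that clients see in opaque.h. */
struct opaque_reserved {
    int dummy[42];
};

/* Stand-in for the real definition inside the implementation file. */
struct opaque_real {
    int whatever;
    char *name;
};

/* Fails to compile as soon as the real object outgrows the reserve. */
static_assert(sizeof(struct opaque_real) <= sizeof(struct opaque_reserved),
              "struct opaque no longer fits in the space reserved for clients");

/* Bytes still available for future growth. */
size_t opaque_spare_bytes(void)
{
    return sizeof(struct opaque_reserved) - sizeof(struct opaque_real);
}
```

Alignment would need a similar check: a pointer member can require stricter alignment than int on some platforms, which an int array on its own does not guarantee.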
Allowing clients to allocate the objects themselves yields certain efficiencies, because allocating objects in automatic storage (i.e. declaring them as local variables) can place them in the stack with very little overhead compared to dynamic memory allocation.
The more common approach for opaqueness is not to provide an init operation at all, only an operation for allocating a new object and destroying it:
typedef struct opaque opaque_t; // incomplete struct
opaque_t *opaque_create(/* args .... */);
void opaque_destroy(opaque_t *o);
Now the caller knows nothing, other than that an "opaque" object is represented as a pointer, the same pointer over its entire lifetime.
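For completeness, here is roughly what sits behind those two declarations. This is a sketch: the name parameter and the opaque_name accessor are made up for the example, and the header and implementation parts are shown in one listing although they would normally be separate files.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* --- what would live in opaque.h --- */
typedef struct opaque opaque_t;             /* incomplete type */
opaque_t *opaque_create(const char *name);
void opaque_destroy(opaque_t *o);
const char *opaque_name(const opaque_t *o);

/* --- what would live in opaque.c --- */
struct opaque {
    int whatever;
    char *name;
};

opaque_t *opaque_create(const char *name)
{
    assert(name != NULL);
    opaque_t *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    size_t n = strlen(name) + 1;
    o->name = malloc(n);
    if (o->name == NULL) {
        free(o);                            /* undo the first allocation */
        return NULL;
    }
    memcpy(o->name, name, n);
    o->whatever = 0;
    return o;
}

void opaque_destroy(opaque_t *o)
{
    if (o == NULL)
        return;
    free(o->name);
    free(o);
}

/* Clients cannot reach the members, so access goes through functions. */
const char *opaque_name(const opaque_t *o)
{
    return o->name;
}
```

The layout of struct opaque can now change freely between library versions without recompiling clients, since no client code depends on its size or members.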
Total opaqueness may not be worth it for an API which is internal to an application or application framework. It's useful for an API that has external clients, like application developers in a different team or organization.
Ask yourself the question: would the client of this API, and its implementation, ever be shipped and upgraded separately? If the answer is no, then that diminishes the need for total opaqueness.
This is the right way to do abstraction and encapsulation in C applications.
Use incomplete types to hide structure details. You can define structures, unions, and enumerations without listing their members (or values, in the case of enumerations). Doing so results in an incomplete type. You can't declare variables of incomplete types, but you can work with pointers to those types.
Constness in C
In every function, especially those you expose in the API, that does not change the pointer or the structure data it points to, the parameter should be a pointer to const. This assures the API user (to some extent :-), since you can still cast the const away in C) that you are not changing the structure data. You can also protect both the data and the address by making the pointer doubly const, see below:
#ifndef API_H
#define API_H
typedef struct s_accessor s_accessor, *p_s_accessor;
void api_init_func(p_s_accessor p_foo);
void api_mutate_func(p_s_accessor p_foo, int x);
void api_print_func(const s_accessor *const p_foo);
#endif
In api.c you can complete the structure type:
struct s_accessor {
int internal;
int other_stuff;
};
All auxiliary functions should be static in api.c (limit the functions' scope to api.c only!).
Minimise the includes in api.h.
Regarding question 1, I don't think there is a way to hide the implementation details!
if I am developing a C shared library and I have my own structs. To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself? Is it a good practice? Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?
I know it goes a lot closer to C++ classes but I wish to stick to C and learn how it would be done in a procedural language as opposed to OOP.
To give an example
typedef struct tag tag;
typedef struct my_custom_struct my_custom_struct;
struct tag
{
// ...
};
struct my_custom_struct
{
tag *tags;
my_custom_struct* (*add_tag)(my_custom_struct* str, tag *tag);
};
my_custom_struct* add_tag(my_custom_struct* str, tag *tag)
{
// ...
}
where add_tag is a helper that manages to add the tag to tag list inside *str.
I saw this pattern in libjson-c like here- http://json-c.github.io/json-c/json-c-0.13.1/doc/html/structarray__list.html. There is a function pointer given inside array_list to help free it.
To make common operations on these struct instances easier for library
consumers, can I provide function pointers to such functions inside
the struct itself?
It is possible to endow your structures with members that are function pointers, pointing to function types whose parameters include pointers to your structure type, and that are intended to be used more or less like C++ instance methods, more or less as presented in the question.
Is it a good practice?
TL;DR: no.
The first problem you will run into is getting those pointer members initialized appropriately. Name correspondence notwithstanding, the function pointers in instances of your structure will not automatically be initialized to point to a particular function. Unless you make the structure type opaque, users can (and undoubtedly sometimes will) declare instances without calling whatever constructor-analog function you provide for the purpose, and then chaos will ensue.
If you do make the structure opaque (which after all isn't a bad idea), then you'll need non-member functions anyway, because your users won't be able to access the function pointers directly. Perhaps something like this:
struct my_custom_struct *my_add_tag(struct my_custom_struct *str, tag *tag) {
return str->add_tag(str, tag);
}
But if you're going to provide for that, then what's the point of the extra level of indirection? (Answer: the only good reason for that would be that in different instances, the function pointer can point to different functions.)
And similar applies if you don't make the structure opaque. Then you might suppose that users would (more) directly call
str->add_tag(str, tag);
but what exactly makes that a convenience with respect to simply
add_tag(str, tag);
?
So overall, no, I would not consider this approach a good practice in general. There are limited circumstances where it may make sense to do something along these lines, but not as a general library convention.
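For the one case conceded above, where the function pointer points to different functions in different instances, a sketch could look like the following. The shape example and all of its names are hypothetical; the point is only that the pointer carries information the caller cannot know statically.

```c
#include <assert.h>
#include <stddef.h>

struct shape {
    /* Differs per instance, which is what justifies storing it. */
    double (*area)(const struct shape *self);
    double a;
    double b;
};

static double rect_area(const struct shape *s)
{
    return s->a * s->b;
}

static double triangle_area(const struct shape *s)
{
    return 0.5 * s->a * s->b;
}

/* The constructor analogs are the only place the pointer gets set. */
void shape_init_rect(struct shape *s, double width, double height)
{
    s->area = rect_area;
    s->a = width;
    s->b = height;
}

void shape_init_triangle(struct shape *s, double base, double height)
{
    s->area = triangle_area;
    s->a = base;
    s->b = height;
}

/* Public non-member wrapper: callers never touch the pointer directly. */
double shape_area(const struct shape *s)
{
    assert(s != NULL && s->area != NULL);
    return s->area(s);
}
```

If every instance would end up pointing at the same function, as with add_tag in the question, the pointer adds indirection without adding anything else.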
Would there be issues with
respect to multithreading where a utility function is called in
parallel with different arguments and so on?
Not more so than with functions designated any other way, except if the function pointers themselves are being modified.
I know it goes a lot closer to C++ classes but I wish to stick to C
and learn how it would be done in a procedural language as opposed to
OOP.
If you want to learn C idioms and conventions then by all means do so. What you are describing is not one. C code and libraries can absolutely be designed with use of OO principles such as encapsulation, and to some extent even polymorphism, but it is not conventionally achieved via the mechanism you describe. This answer touches on some of the approaches that are used for the purpose.
Is it a good practice?
TLDR; no.
Background:
I've been programming almost exclusively in embedded C on STM32 microcontrollers for the last year and a half (as opposed to using C++ or "C+", as I'll describe below). It's been very insightful for me to have to learn C at the architectural level, like I have. I've studied C architecture pretty hard to get to where I can say I "know C". It turns out, as we all know, C and C++ are NOT the same language. At the syntax level, C is almost exactly a subset of C++ (with some key differences where C supports stuff C++ does not), hence why people (myself included before this) frequently think/thought they are pretty much the same language, but at the architectural level they are VASTLY DIFFERENT ANIMALS.
Aside:
Note that my favorite approach to embedded is to use what some colloquially know as "C+". It is basically using a C++ compiler to write C-style embedded code. You basically just write C how you'd expect to write C, except you use C++ classes to vastly simplify the (otherwise pure C) architecture. In other words, "C+" is a pseudonym used to describe using a C++ compiler to write C-like code that uses classes instead of "object-based C" architecture (which is described below). You may also use some advanced C++ concepts on occasion, like operator overloading or templates, but avoid the STL for the most part to not accidentally use dynamic allocation (behind-the-scenes and automatically, like C++ vectors do, for example) after initialization, since dynamic memory allocation/deallocation in normal run-time can quickly use up scarce RAM resources and make otherwise-deterministic code non-deterministic. So-called "C+" may also include using a mix of C (compiled with the C compiler) and C++ (compiled with the C++ compiler), linked together as required (don't forget your extern "C" usage in C header files included in your C++ code, as required).
The core Arduino source code (again, the core, not necessarily their example "sketches" or example code for beginners) does this really well, and can be used as a model of good "C+" design. <== before you attack me on this, go study the Arduino source code for dozens of hours like I have [again, NOT the example "sketches", but their actual source code, linked-to below], and drop your "arduino is for beginners" pride right now.
The AVR core (mix of C and "C+"-style C++) is here: https://github.com/arduino/ArduinoCore-avr/tree/master/cores/arduino
Some of the core libraries ("C+"-style C++) are here: https://github.com/arduino/ArduinoCore-avr/tree/master/libraries
[aside over]
Architectural C notes:
So, regarding C architecture (ie: actual C, NOT "C+"/C-style C++):
C is not an OO language, as you know, but it can be written in an "object-based" style. Notice I say "object-based", NOT "object oriented", as that's how I've heard other pedantic C programmers refer to it. I can say I write object-based C architecture, and it's actually quite interesting.
To make object-based C architecture, here's a few things to remember:
Namespaces can be done in C simply by prepending your namespace name and an underscore in front of something. That's all a namespace really is after-all. Ex: mylibraryname_foo(), mylibraryname_bar(), etc. Apply this to enums, for example, since C doesn't have "enum classes" like C++. Apply it to all C class "methods" too since C doesn't have classes. Apply to all global variables or defines as well that pertain to a particular library.
When making C "classes", you have 2 major architectural options, both of which are very valid and widely used:
Use public structs (possibly hidden in headers named "myheader_private.h" to give them a pseudo-sense of privacy)
Use opaque structs (frequently called "opaque pointers" since they are pointers to opaque structs)
When making C "classes", you have the option of wrapping up pointers to functions inside of your structs above to give it a more "C++" type feel. This is somewhat common, but in my opinion a horrible idea which makes the code nearly impossible to follow and very difficult to read, understand, and maintain.
1st option, public structs:
Make a header file with a struct definition which contains all your "class data". I recommend you do NOT include pointers to functions (will discuss later). This essentially gives you the equivalent of a "C++ class where all members are public." The downside is you don't get data hiding. The upside is you can use static memory allocation of all of your C "class objects" since your user code which includes these library headers knows the full specification and size of the struct.
2nd option: opaque structs:
In your library header file, make a forward declaration to a struct:
/// Opaque pointer (handle) to C-style "object" of "class" type mylibrarymodule:
typedef struct mylibrarymodule_s *mylibrarymodule_h;
In your library .c source file, provide the full definition of the struct mylibrarymodule_s. Since users of this library include only the header file, they do NOT get to see the full implementation or size of this opaque struct. That is what "opaque" means: "hidden". It is obfuscated, or hidden away. This essentially gives you the equivalent of a "C++ class where all members are private." The upside is you get true data hiding. The downside is you can NOT use static memory allocation for any of your C "class objects" in your user code using this library, since any user code including this library doesn't even know how big the struct is, so it cannot be statically allocated. Instead, the library must do dynamic memory allocation at program initialization, one time, which is safe even for embedded deterministic real-time safety-critical systems since you are not allocating or freeing memory during normal program execution.
For a detailed and full example of Option 2 (don't be confused: I call it "Option 1.5" in my answer linked-to here) see my other answer on opaque structs/pointers here: Opaque C structs: how should they be declared?.
Personally, I think the Option 1, with static memory allocation and "all public members", may be my preferred approach, but I am most familiar with the opaque struct Option 2 approach, since that's what the C code base I work in the most uses.
Bullet 3 above: including pointers to functions in your structs.
This can be done, and some do it, but I really hate it. Don't do it. It just makes your code so stinking hard to follow. In Eclipse, for instance, which has an excellent indexer, I can Ctrl + click on anything and it will jump to its definition. What if I want to see the implementation of a function I'm calling on a C "object"? I Ctrl + click it and it jumps to the declaration of the pointer to the function. But where's the function??? I don't know! It might take me 10 minutes of grepping and using find or search tools, digging all around the code base, to find the stinking function definition. Once I find it, I forget where I was, and I have to repeat it all over again for every single function, every single time I edit a library module using this approach. It's just bad. The opaque pointer approach above works fantastic instead, and the public struct approach would be easy too.
Now, to directly answer your questions:
To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself?
Yes you can, but it only makes calling something easier. Don't do it. Finding the function to look at its implementation becomes really hard.
Is it a good practice?
No, use Option 1 or Option 2 above instead, where you now just have to call C "namespaced" "methods" on every C "object". You must simply pass the "members of the C class" into the function as the first argument for every call instead. This means instead of in C++ where you can do:
myclass.dosomething(int a, int b);
You'll just have to do in object-based C:
// Notice that you must pass the "guts", or member data
// (`mylibrarymodule` here), of each C "class" into the namespaced
// "methods" to operate on said C "class object"!
// - Essentially you're passing around the guts (member variables)
// of the C "class" (which guts are frequently referred to as
// "private data", or just `priv` in C lingo) to each function that
// needs to operate on a C object
mylibrarymodule_dosomething(mylibrarymodule_h mylibrarymodule, int a, int b);
Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?
Yes, same as in any multithreaded situation where multiple threads are trying to access the same data. Just add a mutex to each C struct-based "object", and be sure each "method" acting on your C "objects" properly locks (takes) and unlocks (gives) the mutex as required before operating on any shared volatile members of the C "object".
Related:
Opaque C structs: how should they be declared? [use "Object-based" C architecture]
I would suggest reading the COM specification; you will gain a lot from it. All of the COM, OLE and DCOM technology is based on a simple struct that incorporates its own data and methods.
https://www.scribd.com/document/45643943/Com-Spec
Simplified further here:
http://www.voidcn.com/article/p-fixbymia-beu.html
If you want to cut to the chase, please skip down to the last two paragraphs. If you're interested in my predicament and the steps I've taken to solve it, continue reading directly below.
I am currently developing portions of a C library as part of my internship. So naturally, there are some parts of code which should not be accessible to the user while others should be. I am basically developing several architecture-optimized random number generators (RNG's)(uniform, Gaussian, and exponential distributed numbers). The latter two RNG's depend on the uniform generator , which is in a different kernel (project). So, in the case that the user wants to use more than one RNG, I want to make sure I'm not duplicating code needlessly since we are constrained with memory (no point in having the same function defined multiple times at different addresses in the code segment).
Now here's where the problem arises. The convention for all other kernels in the library is that we have two header files and two C files, one each for the natural C implementation and the optimized C version (which may use some intrinsic functions and assembly and/or have some restrictions to make it faster and better for our architecture). This is followed by another C file (a testbench) where our main function is located and it tests both implementations and compares the results. With that said, we cannot really add an additional header file for private or protected items nor can we add a global header file for all these generators.
To combat this restriction, I used extern functions and extern const int's in the C files which depend on the uniform RNG rather than #define's at the top of each C file in order to make the code more portable and easily modified in one place. This worked for the most part.
However, the tricky bit is that we are using an internal type within these kernels (which should not be seen by the user and should not be placed in the header file). Again, for portability, I would like to be able to change the definition of this typedef in one place rather than in multiple places in multiple kernels since the library may be used for another platform later on and for the algorithms to work it is critical that I use 32-bit types.
So basically I'm wondering if there's any way I can make a typedef "protected" in C. That is, I need it to be visible among all C files which need it, but invisible to the user. It can be in one of the header files, but must not be visible to the user who will be including that header file in his/her project, whatever that may be.
============================Edit================================
I should also note that the typedef I am using is an unsigned int. so
typedef unsigned int myType;
No structures involved.
============================Super Edit==========================
The use of stdint.h is also forbidden :(
I am expanding on Jens Gustedt’s answer since the OP still has questions.
First, it is unclear why you have separate header files for the two implementations (“natural C” and “optimized C”). If they implement the same API, one header should serve for either.
Jens Gustedt’s recommendation is that you declare a struct foo in the header but define it only in the C source file for the implementation and not in the header. A struct declared in this way is an incomplete type, and source code that can only see the declaration, and not the definition, cannot see what is in the type. It can, however, use pointers to the type.
The declaration of an incomplete struct may be as simple as struct foo. You can also define a type, such as typedef struct foo foo; or typedef struct foo Mytype;, and you can define a type that is a pointer to the struct, such as typedef struct foo *FooPointer;. However, these are merely for convenience. They do not alter the basic notion, that there is a struct foo that API users cannot see into but that they can have pointers to.
Inside the implementation, you would fully define the struct. If you want an unsigned int in the struct, you would use:
struct foo
{
unsigned int x;
};
In general, you define the struct foo to contain whatever data you like.
Since the API user cannot define struct foo, you must provide functions to create and destroy objects of this type as necessary. Thus, you would likely have a function declared as extern struct foo *FooAlloc(some parameters);. The function creates a struct foo object (likely by calling malloc or a related function), initializes it with data from the parameters, and returns a pointer to the object (or NULL if the creation or initialization fails). You would also have a function extern void FooFree(struct foo *p); that frees a struct foo object. You might also have functions to reset, set, or alter the state of a foo object, functions to copy foo objects, and functions to report about foo objects.
Your implementations could also define some global struct foo objects that could be visible (essentially by address only) to API users. As a matter of good design, this should be done only for certain special purposes, such as to provide instances of struct foo objects with special meanings, such as a constant object with a permanent “initial state” for copying.
Your two implementations, the “natural C” and the “optimized C” implementations may have different definitions for the struct foo, provided they are not both used in a program together. (That is, each entire program is compiled with one implementation or the other, not both. If necessary, you could mangle both into a program by using a union, but it is preferable to avoid that.)
This is not a singleton approach.
Just do
typedef struct foo foo;
These are two declarations: a forward declaration of a struct and a type alias with the same name. A forward-declared struct can be used for nothing other than defining pointers to it. This should give you enough abstraction and type safety.
In all your interfaces you'd have
extern void proc(foo* a);
and you'd have to provide functions
extern foo* foo_alloc(size_t n);
extern void foo_free(foo* a);
This would bind your users as well as your library to always use the same struct. Thereby the implementation of foo is completely hidden from the API users. You could even decide one day to use something other than a struct, since users refer to foo without the struct keyword.
Edit: Just a typedef to some kind of integer wouldn't help you much, because these are only aliases for types. All your types aliased to unsigned could be used interchangeably. One way around this would be to encapsulate them inside a struct. This would make your internal code a bit ugly, but the generated object code should be exactly the same with a good modern compiler.
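A sketch of that struct-wrapping idea, under the assumptions that a C11 compiler is available (for static_assert) and that the width is checked through <limits.h> because <stdint.h> is off limits. The type and function names (rng_seed, rng_word, rng_seed_make, rng_next) are invented for the example, and the constants are illustrative filler.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>

/* The algorithms need exactly 32 bits, so verify that once at compile time. */
static_assert(UINT_MAX == 0xFFFFFFFFu, "unsigned int must be 32 bits wide");

/* Same representation, but distinct types as far as the compiler goes. */
typedef struct { unsigned int v; } rng_seed;
typedef struct { unsigned int v; } rng_word;

rng_seed rng_seed_make(unsigned int x)
{
    rng_seed s = { x };
    return s;
}

/* Accepts only rng_seed; passing an rng_word or a bare unsigned int
   is a compile-time error instead of a silent conversion. */
rng_word rng_next(rng_seed s)
{
    rng_word w;
    w.v = s.v * 1664525u + 1013904223u;   /* unsigned arithmetic wraps modulo 2^32 */
    return w;
}
```

Porting to a platform where unsigned int is not 32 bits then means changing the member type in one place, and the static_assert flags the problem before anything runs.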
Let's say that there's an interface specification "X". X calls for XHeader.h to have a struct called X_Fiddle that guarantees the presence of members foo and bar, both of type int. However, implementation-defined members are not prohibited and are in fact encouraged if doing so increases efficiency. I need to write an implementation of X for the company I work for, and realised that having a few members specific to my implementation to store some state would be very convenient, so I wrote the following:
typedef struct X_Fiddle {
int foo; /* Guaranteed by spec */
int bar; /* Guaranteed by spec */
size_t dinky_dingbat; /* IMPLEMENTATION-SPECIFIC, DO NOT USE */
unsigned char *dingbat_data; /* IMPLEMENTATION-SPECIFIC, DO NOT USE */
} X_Fiddle;
Of course, there's nothing to tell the users that dinky_dingbat or dingbat_data should not be used since they are implementation-specific details and may change or disappear at some point in the future. Given that I can't hide the implementation by using something like opaque pointers, what should I do to make such internal members stand out (or other tricks to hide such things)? Are there any commonly used/standard ways of dealing with problems like this? Best I could think of is using a naming convention like leading underscores, but I'm not sure if the leading underscores rules apply to member variables, and I have a feeling that I'm getting mixed up with some C++ specific rules too. I also thought of naming them something like INTERNAL_dinky_dingbat or having a separate struct for internal types contained inside X_Fiddle, but I'd like to keep the extra typing to a minimum so I dislike them somewhat. Or is it perfectly acceptable just to have a plain, ordinary struct as above, where implementation-specific details are spelled out in the comments and documentation, leaving the less experienced and diligent to suffer their self-inflicted wounds when I need to change things around?
Assuming I'm starting from scratch and/or my company/team has no convention for this specific case.
Even though the PIMPL idiom is usually thought of as a C++ thing, the same technique has been used for even longer in plain C, in the form of anonymous void pointers.
If you want the structure to have private implementation-specific fields, then create a structure for those private fields, and add a void pointer field to the public structure. Then have an init or create function which allocates this private structure, and makes the void pointer field in the public structure point to that private structure.
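A minimal sketch of that layout, reusing the X_Fiddle names from the question. The functions `X_Fiddle_init`, `X_Fiddle_fini`, and `X_Fiddle_dingbat_size`, the field name `impl`, and the private struct name are all assumptions made for illustration; the spec in the question does not define them.

```c
#include <assert.h>
#include <stdlib.h>

/* Public part (would live in XHeader.h). */
typedef struct X_Fiddle {
    int foo;      /* guaranteed by spec */
    int bar;      /* guaranteed by spec */
    void *impl;   /* implementation-private data; users never dereference it */
} X_Fiddle;

int X_Fiddle_init(X_Fiddle *f, int foo, int bar);   /* 0 on success, -1 on failure */
void X_Fiddle_fini(X_Fiddle *f);
size_t X_Fiddle_dingbat_size(const X_Fiddle *f);    /* hypothetical accessor */

/* Private part (would live in the implementation file only). */
struct x_fiddle_private {
    size_t dinky_dingbat;
    unsigned char *dingbat_data;
};

int X_Fiddle_init(X_Fiddle *f, int foo, int bar)
{
    /* calloc zeroes the private fields: size 0, data pointer NULL
       on the usual platforms where all-bits-zero is a null pointer. */
    struct x_fiddle_private *priv = calloc(1, sizeof *priv);
    if (priv == NULL)
        return -1;
    f->foo = foo;
    f->bar = bar;
    f->impl = priv;
    return 0;
}

void X_Fiddle_fini(X_Fiddle *f)
{
    struct x_fiddle_private *priv = f->impl;
    if (priv != NULL) {
        free(priv->dingbat_data);
        free(priv);
    }
    f->impl = NULL;   /* makes a second fini call harmless */
}

size_t X_Fiddle_dingbat_size(const X_Fiddle *f)
{
    const struct x_fiddle_private *priv = f->impl;
    assert(priv != NULL);   /* precondition: object was initialized */
    return priv->dinky_dingbat;
}
```

The cost is one extra allocation and one extra indirection per object; in exchange, the private fields can change without touching the public struct layout beyond the single pointer.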
The GLib Object System provides an object system with data hiding and inheritance. If you cannot use it in your application, you can at least get some ideas from it.
May I know the usage and logic behind the opaque pointer concept in C?
An opaque pointer is one in which no details are revealed of the underlying data (from a dictionary definition: opaque: adjective; not able to be seen through; not transparent).
For example, you may declare in a header file (this is from some of my actual code):
typedef struct pmpi_s *pmpi;
which declares a type pmpi which is a pointer to the opaque structure struct pmpi_s, hence anything you declare as pmpi will be an opaque pointer.
Users of that declaration can freely write code like:
pmpi xyzzy = NULL;
without knowing the actual "definition" of the structure.
Then, in the code that knows about the definition (i.e., the code providing the functionality for pmpi handling), you can "define" the structure:
struct pmpi_s {
uint16_t *data; // a pointer to the actual data array of uint16_t.
size_t sz; // the allocated size of data.
size_t used; // number of segments of data in use.
int sign; // the sign of the number (-1, 0, 1).
};
and easily access the individual fields of it, something that users of the header file cannot do.
More information can be found on the Wikipedia page for opaque pointers.
The main use of it is to hide implementation details from users of your library. Encapsulation (despite what the C++ crowd will tell you) has been around for a long time :-)
You want to publish just enough details on your library for users to effectively make use of it, and no more. Publishing more gives users details that they may come to rely upon (such as the fact that the size variable sz is at a specific location in the structure), which may lead them to bypass your controls and manipulate it directly.
Then you'll find your customers complaining bitterly when you change the internals. Without that structure information, your API is limited only to what you provide and your freedom of action regarding the internals is maintained.
Opaque pointers are used in the definitions of programming interfaces (APIs).
Typically they are pointers to incomplete structure types, declared like:
typedef struct widget *widget_handle_t;
Their purpose is to provide the client program a way to hold a reference to an object managed by the API, without revealing anything about the implementation of that object, other than its address in memory (the pointer itself).
The client can pass the object around, store it in its own data structures, and compare two such pointers for equality, but it cannot dereference the pointers to peek at what is in the object.
The reason this is done is to prevent the client program from becoming dependent on those details, so that the implementation can be upgraded without having to recompile client programs.
Because the opaque pointers are typed, there is a good measure of type safety. If we have:
typedef struct widget *widget_handle_t;
typedef struct gadget *gadget_handle_t;
int api_function(widget_handle_t, gadget_handle_t);
if the client program mixes up the order of the arguments, there will be a diagnostic from the compiler, because a struct gadget * is being converted to a struct widget * without a cast.
That is the reason for declaring distinct struct types whose members the client never sees: each struct declaration with a new tag introduces a new type that is not compatible with previously declared struct types.
What does it mean for a client to become dependent? Suppose that a widget_t has width and height properties. If it isn't opaque and looks like this:
typedef struct widget {
short width;
short height;
} widget_t;
then the client can just do this to get the width and height:
int widget_area = whandle->width * whandle->height;
whereas under the opaque paradigm, it would have to use access functions (which are not inlined):
// in the header file
int widget_getwidth(widget_handle_t);
int widget_getheight(widget_handle_t);
// client code
int widget_area = widget_getwidth(whandle) * widget_getheight(whandle);
Notice how the widget authors used the short type to save space in the structure, and that has been exposed to the client of the non-opaque interface. Suppose that widgets can now have sizes that don't fit into short and the structure has to change:
typedef struct widget {
int width;
int height;
} widget_t;
Client code must be re-compiled now to pick up this new definition. Depending on the tooling and deployment workflow, there may even be a risk that this isn't done: old client code tries to use the new library and misbehaves by accessing the new structure using the old layout. That can easily happen with dynamic libraries. The library is updated, but the dependent programs are not.
The client which uses the opaque interface continues to work unmodified and so doesn't require recompiling. It just calls the new definition of the accessor functions. Those are in the widget library and correctly retrieve the new int typed values from the structure.
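For illustration, here is a single-file sketch of the library side after the change to int. The functions `widget_create` and `widget_destroy` are assumed names that do not appear above; the accessors take the handle type directly, since the handle is already a pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Public header: the client sees only this part. */
typedef struct widget *widget_handle_t;

widget_handle_t widget_create(int width, int height);  /* hypothetical; NULL on failure */
void widget_destroy(widget_handle_t w);                 /* hypothetical */
int widget_getwidth(widget_handle_t w);
int widget_getheight(widget_handle_t w);

/* Library implementation: the only place that knows the layout.
   Changing these members from short to int touches nothing above. */
struct widget {
    int width;
    int height;
};

widget_handle_t widget_create(int width, int height)
{
    widget_handle_t w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->width = width;
    w->height = height;
    return w;
}

void widget_destroy(widget_handle_t w)
{
    free(w);
}

int widget_getwidth(widget_handle_t w)
{
    assert(w != NULL);   /* precondition: a live handle */
    return w->width;
}

int widget_getheight(widget_handle_t w)
{
    assert(w != NULL);
    return w->height;
}
```

Client code such as `widget_getwidth(whandle) * widget_getheight(whandle)` is identical before and after the layout change, which is why an already compiled client keeps working against the updated library.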
Note that, historically (and still currently here and there) there has also been a lackluster practice of using the void * type as an opaque handle type:
typedef void *widget_handle_t;
typedef void *gadget_handle_t;
int api_function(widget_handle_t, gadget_handle_t);
Under this scheme, you can do this, without any diagnostic:
api_function("hello", stdout);
The Microsoft Windows API is an example of a system in which you can have it both ways. Without STRICT, various handle types like HWND (window handle) and HDC (device context) are all void *, so there is no type safety: an HWND could be passed where an HDC is expected, by mistake. (Current SDK headers turn STRICT on themselves unless NO_STRICT is defined.) If you do this:
#define STRICT
#include <windows.h>
then these handles are mapped to mutually incompatible types to catch those errors.
Opaque, as the name suggests, describes something we can’t see through; wood, for example, is opaque. An opaque pointer is a pointer to a data structure whose contents are not exposed at the point where the pointer is declared.
Example:
struct STest* pSTest;
It is safe to assign NULL to an opaque pointer.
pSTest = NULL;