Implementing shared library / module - argument struct not in headers - c

I am trying to implement a custom locking library for LVM. Setting locking_type to external (2) in lvm.conf and providing a shared library implementing the required functions seems enough and relatively simple, in theory.
Looking into this I started with the sources for LVM2, specifically the external locking mechanism implementation, which can be found here.
Basically, I figured out that I need to implement functions with the signatures described like this:
static void (*_reset_fn) (void) = NULL;
static void (*_end_fn) (void) = NULL;
static int (*_lock_fn) (struct cmd_context * cmd, const char *resource, uint32_t flags) = NULL;
static int (*_init_fn) (int type, struct dm_config_tree * cft, uint32_t *flags) = NULL;
static int (*_lock_query_fn) (const char *resource, int *mode) = NULL;
Now, everything peachy up to this point. However, looking at the _lock_fn definition, it takes a pointer to struct cmd_context as first argument. That struct can easily be found in the LVM2 sources (and it is a fairly complex one!), but it is not inside the headers exposed by the package as an API (eg. the lvm2-devel package in RHEL7). As I imagine (I am definitely not the best C programmer around), since that struct is supposed to be used by external libraries, it is mandatory that it should be in the headers.
Am I thinking this wrong or is it just a "bug" and I should discuss with LVM2 developers? Are there any workarounds, other than copy/pasting that struct and all other types that it depends on to a header file in my project? Doing this "workaround", does it break the GNU GPL license in any way?

The source file you linked indirectly includes the config.h header, which has this line:
struct cmd_context;
This tells the C compiler that there is a struct with that name. That's all the compiler needs to know to generate correct machine code.
If you were to access members of that struct or directly create an object of it, you would get errors thrown your way because the compiler doesn't know the size or members of the struct.
But as long as you use it as an opaque data type, passing a known-size pointer to it in and out of functions, you are ok. This is called forward declaration and is a primary way of achieving encapsulation in C (check stdio's FILE).
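A minimal sketch of what this looks like from the plugin side. The names my_lock_resource and my_last_cmd, the bookkeeping variables, and the 1/0 return convention are assumptions made for illustration; they are not taken from the LVM2 sources. The point is only that the pointer is stored and handed back, never dereferenced, so the forward declaration is enough.

```c
#include <stdint.h>
#include <stdio.h>

/* Forward declaration only: the plugin never sees the members. */
struct cmd_context;

/* Hypothetical bookkeeping for this sketch, not part of LVM2. */
static struct cmd_context *last_cmd;
static char last_resource[64];

/* Same parameter list as the _lock_fn pointer in the question.
 * The function name is an assumption for illustration. */
int my_lock_resource(struct cmd_context *cmd, const char *resource,
                     uint32_t flags)
{
    if (resource == NULL)
        return 0;                 /* assumed convention: 0 = failure */

    /* cmd is only stored and passed around, never dereferenced,
     * so the incomplete type is sufficient. */
    last_cmd = cmd;
    snprintf(last_resource, sizeof last_resource, "%s", resource);
    (void)flags;                  /* unused in this sketch */
    return 1;                     /* assumed convention: 1 = success */
}

/* Hands the opaque pointer back to the caller untouched. */
struct cmd_context *my_last_cmd(void)
{
    return last_cmd;
}
```

Writing cmd->anything inside my_lock_resource would fail to compile here, which is exactly the encapsulation the forward declaration provides.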

Related

Is there a way to make a struct definition "private" to a single translation unit in C?

In C, you can use the static keyword to make global variables and functions private to the file in which they're defined. The compiler won't export these symbols, and thus the linker will not allow other modules to use these definitions.
However, I'm struggling to figure out how to restrict a struct definition such that it doesn't get added as an exported symbol that could accidentally be used by another module during the linking process. I would like to restrict it to the one file in which it's defined.
Here are my attempts thus far which I've been struggling with.
// structure that is visible to other modules
struct PrivateStruct
{
int hello;
int there;
};
// this seems to throw an error
static struct PrivateStruct
{
int hello;
int there;
};
// i would ideally like to also wrap in the struct in a typedef, but this definitely doesn't work.
typedef static struct PrivateStruct
{
int hello;
int there;
} PrivateStruct;
Edit: I realize if I just define this struct in the .c file, others won't know about it. But won't it still technically be an exported symbol by the compiler? It would be nice to prevent this behavior.
I realize if I just define this struct in the .c file, others won't know about it. But won't it still technically be an exported symbol by the compiler?
No. Whether you are talking about structure tags or typedefed identifiers, these have no linkage. Always. There is no sense in which it would be reasonable to say that they are exported symbols.
This is among the reasons that header files are used in C. If you want to use a structure type in one compilation unit that is compatible with a structure type in a different compilation unit then compatible structure type declarations must appear in both. Putting the definition in a header makes it pretty easy to achieve that.
Yes, you can use pointers to the structure, and to the outside world they cannot be dereferenced. Normally, that requires two header files; the first is the public one:
header mystruct.h
This header is public, used by client code.
struct my_opaque;
/* we cannot use *p fields below, as the struct my_opaque is
* incompletely defined here. */
void function_using(struct my_opaque *p, ...);
header mystructP.h
This header is private, and includes the public header file to maintain the
API to the client code.
#include "mystruct.h" /* safety include (1) */
struct my_opaque {
/* ... */
};
implementation mystruct.c
In implementation code, we include the private header file, and so, we have full access to the structure fields.
#include "mystructP.h"
/* we have full definition, as we included mystructP.h */
void function_using(struct my_opaque *p,
...)
{
... /* we can use p->fields here */
}
(1) The safety #include lets the compiler check the private definitions against the public declarations. If you change struct my_opaque or the API functions in a way that no longer matches what the caller modules see, the compiler complains, and you are forced to recompile whenever something in that API changes.

Rebuild a dynamic library upon argument typedef change

Let's assume, I have a C structure, DynApiArg_t.
typedef struct DynApiArg_s {
uint32_t m1;
...
uint32_t mx;
} DynApiArg_t;
The pointer of this struct is passed as an arg to a function say
void DynLibApi(DynApiArg_t *arg)
{
arg->m1 = 0;
another_fn_in_the_lib(arg->mold); /* May crash here. (1) */
}
which is present in a dynamic library, libdyn.so. This API is invoked from an executable via a dlopen/dlsym procedure of invocation.
In case this dynamic library is updated to version 2, where DynApiArg_t now has new member, say m2, as below:
typedef struct DynApiArg_s {
uint32_t m1;
OldMbr_t *mold;
...
uint32_t mx;
uint32_t m2;
NewMbr *mnew;
} DynApiArg_t;
Without a complete rebuild of the executable or other libs that call this API via dlopen/dlsym, every time this API is invoked I see the process crashing when some member of the struct is dereferenced. I understand accessing m2 may be a problem. But even an access to the member mold, as shown below, causes crashes.
typedef void (*fnPtr_t)(DynApiArg_t*);
void DynApiCaller(DynApiArg_t *arg)
{
void *libhdl = dlopen("libdyn.so", RTLD_LAZY | RTLD_GLOBAL);
fnPtr_t fnptr = dlsym(libhdl, "DynLibApi");
fnptr(arg); /* actual call to the dynamically loaded API (2) */
}
In the call to the API via fnptr, at the line marked (2), when the old/existing members (present in v1 of the lib, when DynApiCaller was initially compiled) are accessed at (1), they hold garbage values or even NULL at times.
What is the right way to handle such updates without a complete recompilation of the executable every time the dependent libs are updated?
I've seen libs being named with symlinks with version numbers like libsolid.so.4. Is there something related to this versioning system that can help me? If so, can you point me to the right documentation for it, if any?
There are a number of approaches to solve this problem:
Include the API version in the dynamic library name.
Instead of dlopen("libfoo.so"), you use dlopen("libfoo.so.4"). Different major versions of the library are essentially separate, and can coexist on the same system; so, the package name for that library would be e.g. libfoo-4. You can have libfoo.so.4 and libfoo.so.5 installed at the same time. Minor versions, say libfoo-4.2, install libfoo.so.4.2, and symlink libfoo.so.4 to libfoo.so.4.2.
Initially define the structures with zero padding (required to be zero in earlier versions of the library), and have the later versions reuse the padding fields, but keeping the structures the same size.
Use versioned symbol names. This is a Linux extension, using dlvsym(). A single shared library binary can implement several versions of the same dynamic symbol.
Use resolver functions to determine the symbols at load time. This allows e.g. hardware architecture-optimized variants of functions to be selected at run time, but is less useful with a dlopen()-based approach.
Use a structure to describe the library API, and a versioned function to obtain/initialize that API.
For example, version 4 of your library could implement
struct libfoo_api {
int (*func1)(int arg1, int arg2);
double *data;
void (*func2)(void);
/* ... */
};
and only export one symbol,
int libfoo_init(struct libfoo_api *const api, const int version);
Calling that function would initialize the api structure with the symbols supported, with the assumption that the structure corresponds to the specified version. A single shared library can support multiple versions. If a version is not supported, it can return a failure.
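A minimal sketch of such an init function, assuming version 4 is the only layout this build supports. The helper functions foo_add and foo_noop, the foo_data array, and the 0/-1 return convention are hypothetical and exist only to make the example self-contained.

```c
#include <stddef.h>

struct libfoo_api {
    int   (*func1)(int arg1, int arg2);
    double *data;
    void  (*func2)(void);
};

/* Hypothetical implementations; a real library would supply its own. */
static double foo_data[4];
static int    func2_calls;
static int  foo_add(int a, int b) { return a + b; }
static void foo_noop(void)        { func2_calls++; }

/* Returns 0 on success, -1 if the requested version is unsupported
 * (the return convention is an assumption of this sketch). */
int libfoo_init(struct libfoo_api *const api, const int version)
{
    if (api == NULL)
        return -1;

    switch (version) {
    case 4:
        api->func1 = foo_add;
        api->data  = foo_data;
        api->func2 = foo_noop;
        return 0;
    default:
        return -1;   /* unknown layout: leave *api untouched */
    }
}
```

The caller would obtain libfoo_init through dlsym() and then use only the function pointers in the structure, so the library exports a single symbol regardless of how many versions it supports.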
This is especially useful for plugin-type interfaces (although then the _init function is more likely to call application-provided functionality registering functions, rather than fill in a structure), as a single file can contain optimized functionality for a number of versions, optimized for a number of compatible hardware architectures (for example, AMD/Intel architectures with different SSE/AVX/AVX2/AVX512 support).
Note that the above implementation details can be "hidden" in a header file, making actual C code using the shared library much simpler. It also helps making the same API work across a number of OSes, simply by changing the header file to use the approach that works best on that OS, while keeping the actual C interface the same.
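The zero-padding approach from the list above can be sketched as follows. The struct names, the number of reserved words, and the rule that m2 == 0 means "caller predates m2" are assumptions chosen for illustration. The compile-time checks use C11 _Static_assert.

```c
#include <stddef.h>
#include <stdint.h>

/* Version 1 layout: four reserved words that callers must zero. */
struct foo_args_v1 {
    uint32_t m1;
    uint32_t reserved[4];
};

/* Version 2 layout: the first reserved word now carries m2.
 * The total size and the offset of m1 are unchanged. */
struct foo_args_v2 {
    uint32_t m1;
    uint32_t m2;
    uint32_t reserved[3];
};

_Static_assert(sizeof(struct foo_args_v1) == sizeof(struct foo_args_v2),
               "layout size must not change between versions");
_Static_assert(offsetof(struct foo_args_v1, reserved) ==
               offsetof(struct foo_args_v2, m2),
               "m2 must reuse the first reserved word");

/* A version 2 library function. Because old callers were required to
 * zero the padding, m2 == 0 is read as "not provided". */
uint32_t foo_effective_m2(const struct foo_args_v2 *arg, uint32_t fallback)
{
    return arg->m2 != 0 ? arg->m2 : fallback;
}
```

This only works if the earlier library version really did require the reserved fields to be zero; otherwise old callers may pass arbitrary stack contents in what is now m2.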

Implementing data hiding access specifiers in C language

Is there a way to implement access specifiers like "private" and "protected" in the C language? I came across solutions on the internet about using "static" and "ifdefs" for making a function available only inside certain other functions.
Apart from these, is there any C implementation equivalent of using private and protected access specifiers in C++ classes?
C does not have access specifiers. The only way to hide something from your callers is to not provide its declaration in the header.
You can make it static in the translation unit:
myapi.h
extern int visibleVariable;
void visibleFunction();
myapi.c
int visibleVariable;
static int invisibleVariable;
void visibleFunction() {
...
}
static void invisibleFunction() {
...
}
You can also hide the definition of a struct by placing it in the implementation file. This way all fields of your struct would be private to the translation unit. The drawback to this approach is that the users of your API would be unable to declare variables of your struct's type, so they would need to deal with your struct through pointers.
C has no concept of inheritance, hence there is no equivalent of protected access.
C does not have user definable name spaces or access specifiers. Since you exclude (ab)use of preprocessor, the only way to get compiler error trying to access private parts of "classes" is to not have a .h file which exposes "private" stuff. They can still be put into "private" separate .h files (included by module's or library's own .c files, but not meant to be included from application code), or hidden behind #ifdefs (requiring special define to activate the "private" parts).
One common way to hide things is to use opaque structs AKA opaque pointers. For that approach, the code outside a module or library only has pointer to a struct, but no struct definition. And then it uses functions offered by the module to get an instance, access it, and finally release it.
With this approach, you easily get public interface: the functions you provide in the public .h file, as well as any public support structs which have definition there. The private interface is the code where the full struct definition is visible, and any functions which are not in the public .h file.
Protected access implies inheritance, which usually works very differently from C++, when implemented with C by hand, and which is too broad a subject to cover in this answer. The closest thing to this would probably be to have several .h files, which provide several levels of "public" access, and then it is responsibility of the programmer to not get into problems with them.
The good thing about this approach is that other code using the module does not need to be modified (or even recompiled) if the struct is changed. Often the struct might even be a union, and then the module's functions would branch based on the actual type, all invisible to the code using it. Another good thing is that the module can control creation of structs, so it could for example have a pool of structs and avoid using the heap, all invisible to the application code. One downside is that you can't have inline functions (because the inline function body in the .h file would need the struct definition, which we are trying to hide here), which prevents some nice compiler optimizations in cases where performance is a concern.
Example (untested code written for this answer):
module.h:
// ...other standard header file stuff ...
// forward declaration of struct
struct module_data;
// "constructor" function
struct module_data *module_initialize_data(int value);
// modification function
int module_update_data(struct module_data *data, int adjust);
// "destructor" function
void module_release(struct module_data *data);
module.c
#include <stdlib.h> // malloc, free
#include "module.h"
// struct definition only in the .c file
struct module_data {
int value;
};
struct module_data *module_initialize_data(int value) {
struct module_data *data = malloc(sizeof(*data));
if (data == NULL) return NULL; // allocation failed
data->value = value;
return data;
}
int module_update_data(struct module_data *data, int adjust) {
data->value += adjust;
return data->value;
}
void module_release(struct module_data *data) {
free(data);
}
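As a variation on the example above, here is a sketch of the pool idea mentioned earlier: the same three functions, but backed by a fixed static array instead of the heap. The in_use flag and the pool size of 4 are assumptions made for this sketch, and it is not thread-safe. Client code written against module.h would not need to change.

```c
#include <stddef.h>

/* In a real project this forward declaration and the three prototypes
 * would live in module.h. */
struct module_data;
struct module_data *module_initialize_data(int value);
int module_update_data(struct module_data *data, int adjust);
void module_release(struct module_data *data);

/* Private definition: in_use is bookkeeping added for this sketch. */
struct module_data {
    int value;
    int in_use;
};

#define MODULE_POOL_SIZE 4        /* arbitrary size for illustration */
static struct module_data pool[MODULE_POOL_SIZE];

struct module_data *module_initialize_data(int value)
{
    for (size_t i = 0; i < MODULE_POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].value = value;
            return &pool[i];
        }
    }
    return NULL;                  /* pool exhausted */
}

int module_update_data(struct module_data *data, int adjust)
{
    data->value += adjust;
    return data->value;
}

void module_release(struct module_data *data)
{
    if (data != NULL)
        data->in_use = 0;         /* slot becomes reusable */
}
```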
Relevant Wikipedia links for reference:
https://en.wikipedia.org/wiki/Opaque_pointer
https://en.wikipedia.org/wiki/Opaque_data_type

How to export/import a C struct from a DLL/ to a console application using __declspec( dllexport/import )

This is my first time dealing with DLLs. Following the MSDN documentation I created a header file fooExports.h with macros defined according to a preprocessor definition:
#ifdef FOODLL_EXPORTS
#define FOO_API __declspec( dllexport )
#else
#define FOO_API __declspec( dllimport )
#endif
My intention was to use this header both in my DLL implementation as well as in the console application. So far importing and exporting functions works just fine. The problem arises when I try to export an already defined struct that I need as parameter for one of the exported functions. For example, in the aforementioned header file I declare FOO_API void foo( FooParams *args ) and args is a struct defined as follows:
typedef struct FooParams
{
char *a;
char *b;
void *whatever; //some other type
} FooParams;
This struct has to be defined in foo.h rather than in fooExports.h. Is there any way to export this struct without taking it out of its original header file (taking into consideration that I want to keep the exports/imports centralized in fooExports.h)?
What would be a better approach to doing this? The DLL is all C as well as the client application using it.
If the only use the client will ever have for FooParams is to get pointers to it returned from DLL functions and to pass those pointers to other DLL functions, you can make it an "opaque type": Put
typedef struct FooParams FooParams;
in fooExports.h. The FOO_API macro does not belong on that declaration. An opaque type means the client code cannot:
Create any variables of type FooParams (but FooParams * ptr = NULL; is okay).
Do anything at all with any member of FooParams.
Find sizeof(FooParams) - and therefore cannot correctly malloc space for one or more FooParams objects.
You can't #define macros visible to the client which do any of the above, either. So your DLL would need to have one or more "constructor" or "factory" functions, maybe something like
FOO_API FooParams* CreateFooParams(const char * input);
It's also good practice to define a matching "destructor" function like
FOO_API void DestroyFooParams(FooParams * p);
even if the definition is as simple as { free(p); }, because there can sometimes be issues if memory allocated inside a DLL is freed by code outside it or vice versa (because not all Windows code uses identical definitions of malloc and free).
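A minimal sketch of how the constructor and destructor pair might look. What CreateFooParams does with its input (copying it into member a) and the accessor FooParamsGetA are assumptions for illustration. FOO_API is defined as empty here so the sketch compiles outside Windows; in the real project it would come from fooExports.h.

```c
#include <stdlib.h>
#include <string.h>

/* On Windows FOO_API would expand to __declspec(dllexport/dllimport);
 * it is defined empty here so the sketch compiles anywhere. */
#ifndef FOO_API
#define FOO_API
#endif

/* Public part (would go in fooExports.h). */
typedef struct FooParams FooParams;
FOO_API FooParams *CreateFooParams(const char *input);
FOO_API const char *FooParamsGetA(const FooParams *p); /* hypothetical */
FOO_API void DestroyFooParams(FooParams *p);

/* Private part (would stay inside the DLL sources). */
struct FooParams {
    char *a;
    char *b;
    void *whatever;
};

FOO_API FooParams *CreateFooParams(const char *input)
{
    if (input == NULL)
        return NULL;

    FooParams *p = calloc(1, sizeof *p);   /* b and whatever start out NULL */
    if (p == NULL)
        return NULL;

    size_t len = strlen(input) + 1;
    p->a = malloc(len);
    if (p->a == NULL) {
        free(p);
        return NULL;
    }
    memcpy(p->a, input, len);              /* private copy of the input */
    return p;
}

FOO_API const char *FooParamsGetA(const FooParams *p)
{
    return p != NULL ? p->a : NULL;
}

/* Frees with the same allocator that CreateFooParams used. */
FOO_API void DestroyFooParams(FooParams *p)
{
    if (p == NULL)
        return;
    free(p->a);
    free(p->b);
    free(p);
}
```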
If all this is too extreme, the only other option is to put or #include the struct definition in the exported header file and make it visible to clients. Without that, any attempt to do something to a FooParams, other than passing pointers around, is going to be impossible because the compiler won't know what's in a FooParams. The compiler (as opposed to the linker) takes information only from commandline arguments and #include-d files, not from libraries or DLLs.

What is an opaque pointer in C?

May I know the usage and logic behind the opaque pointer concept in C?
An opaque pointer is one in which no details are revealed of the underlying data (from a dictionary definition: opaque: adjective; not able to be seen through; not transparent).
For example, you may declare in a header file (this is from some of my actual code):
typedef struct pmpi_s *pmpi;
which declares a type pmpi which is a pointer to the opaque structure struct pmpi_s, hence anything you declare as pmpi will be an opaque pointer.
Users of that declaration can freely write code like:
pmpi xyzzy = NULL;
without knowing the actual "definition" of the structure.
Then, in the code that knows about the definition (i.e., the code providing the functionality for pmpi handling), you can "define" the structure:
struct pmpi_s {
uint16_t *data; // a pointer to the actual data array of uint16_t.
size_t sz; // the allocated size of data.
size_t used; // number of segments of data in use.
int sign; // the sign of the number (-1, 0, 1).
};
and easily access the individual fields of it, something that users of the header file cannot do.
More information can be found on the Wikipedia page for opaque pointers.
The main use of it is to hide implementation details from users of your library. Encapsulation (despite what the C++ crowd will tell you) has been around for a long time :-)
You want to publish just enough details on your library for users to effectively make use of it, and no more. Publishing more gives users details that they may come to rely upon (such as the fact that the size variable sz is at a specific location in the structure), which may lead them to bypass your controls and manipulate it directly.
Then you'll find your customers complaining bitterly when you change the internals. Without that structure information, your API is limited only to what you provide and your freedom of action regarding the internals is maintained.
Opaque pointers are used in the definitions of programming interfaces (APIs).
Typically they are pointers to incomplete structure types, declared like:
typedef struct widget *widget_handle_t;
Their purpose is to provide the client program a way to hold a reference to an object managed by the API, without revealing anything about the implementation of that object, other than its address in memory (the pointer itself).
The client can pass the object around, store it in its own data structures, and compare two such pointers for equality, but it cannot dereference the pointers to peek at what is in the object.
The reason this is done is to prevent the client program from becoming dependent on those details, so that the implementation can be upgraded without having to recompile client programs.
Because the opaque pointers are typed, there is a good measure of type safety. If we have:
typedef struct widget *widget_handle_t;
typedef struct gadget *gadget_handle_t;
int api_function(widget_handle_t, gadget_handle_t);
if the client program mixes up the order of the arguments, there will be a diagnostic from the compiler, because a struct gadget * is being converted to a struct widget * without a cast.
That is the reason why we are defining struct types that have no members; each struct declaration with a different new tag introduces a new type that is not compatible with previously declared struct types.
What does it mean for a client to become dependent? Suppose that a widget_t has width and height properties. If it isn't opaque and looks like this:
typedef struct widget {
short width;
short height;
} widget_t;
then the client can just do this to get the width and height:
int widget_area = whandle->width * whandle->height;
whereas under the opaque paradigm, it would have to use access functions (which are not inlined):
// in the header file
int widget_getwidth(widget_handle_t);
int widget_getheight(widget_handle_t);
// client code
int widget_area = widget_getwidth(whandle) * widget_getheight(whandle);
Notice how the widget authors used the short type to save space in the structure, and that has been exposed to the client of the non-opaque interface. Suppose that widgets can now have sizes that don't fit into short and the structure has to change:
typedef struct widget {
int width;
int height;
} widget_t;
Client code must be re-compiled now to pick up this new definition. Depending on the tooling and deployment workflow, there may even be a risk that this isn't done: old client code tries to use the new library and misbehaves by accessing the new structure using the old layout. That can easily happen with dynamic libraries. The library is updated, but the dependent programs are not.
The client which uses the opaque interface continues to work unmodified and so doesn't require recompiling. It just calls the new definition of the accessor functions. Those are in the widget library and correctly retrieve the new int typed values from the structure.
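To make the split concrete, here is a small sketch with the library side and the client side in one listing. The functions widget_create, widget_destroy, and widget_area are hypothetical additions so the example is self-contained; only the accessors appear in the text above.

```c
#include <stdlib.h>

/* Public interface (would go in the header). */
typedef struct widget *widget_handle_t;
widget_handle_t widget_create(int width, int height);  /* hypothetical */
int  widget_getwidth(widget_handle_t w);
int  widget_getheight(widget_handle_t w);
void widget_destroy(widget_handle_t w);                /* hypothetical */

/* Private definition (library source only). Changing the members from
 * short to int happened here and nowhere else. */
struct widget {
    int width;
    int height;
};

widget_handle_t widget_create(int width, int height)
{
    widget_handle_t w = malloc(sizeof *w);
    if (w != NULL) {
        w->width = width;
        w->height = height;
    }
    return w;
}

int widget_getwidth(widget_handle_t w)  { return w->width; }
int widget_getheight(widget_handle_t w) { return w->height; }

void widget_destroy(widget_handle_t w)  { free(w); }

/* Client code: uses only the handle and the accessors, so it does not
 * depend on the member types or layout. */
int widget_area(widget_handle_t w)
{
    return widget_getwidth(w) * widget_getheight(w);
}
```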
Note that, historically (and still currently here and there) there has also been a lackluster practice of using the void * type as an opaque handle type:
typedef void *widget_handle_t;
typedef void *gadget_handle_t;
int api_function(widget_handle_t, gadget_handle_t);
Under this scheme, you can do this, without any diagnostic:
api_function("hello", stdout);
The Microsoft Windows API is an example of a system in which you can have it both ways. By default, various handle types like HWND (window handle) and HDC (device context) are all void *. So there is no type safety; a HWND could be passed where a HDC is expected, by mistake. If you do this:
#define STRICT
#include <windows.h>
then these handles are mapped to mutually incompatible types to catch those errors.
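The general idea behind such strict handle types can be sketched with a small macro. MY_DECLARE_HANDLE and the handle names are hypothetical and written for this example; this is an illustration of the technique, not a copy of what windows.h contains.

```c
#include <stddef.h>

/* Each handle becomes a pointer to its own dummy struct type, so two
 * different handle types are not interchangeable without a cast. */
#define MY_DECLARE_HANDLE(name)          \
    struct name##_tag { int unused; };   \
    typedef struct name##_tag *name

MY_DECLARE_HANDLE(my_window_t);
MY_DECLARE_HANDLE(my_context_t);

/* Counts how many of the two handles are non-NULL. */
int use_handles(my_window_t win, my_context_t ctx)
{
    return (win != NULL) + (ctx != NULL);
}
```

With these definitions, swapping the two arguments in a call to use_handles draws an incompatible pointer type diagnostic from the compiler, which the void * version cannot provide.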
Opaque, as the name suggests, is something we can't see through. E.g. wood is opaque. An opaque pointer is a pointer which points to a data structure whose contents are not exposed at the time of its definition.
Example:
struct STest* pSTest;
It is safe to assign NULL to an opaque pointer.
pSTest = NULL;

Resources