I want to encapsulate global variables in a single "data-manager-module".
Access is only possible with functions to avoid all the ugly
global variable problems ... So the content is completely hidden from the users.
Are there any existing concepts? What could such an implementation
look like? How should the values be stored in the "data-manager-module"?
A "data manager module" doesn't make any sense. Implementing one would merely sweep a fundamentally poor program design under the carpet, hiding it instead of actually cleaning it up. The main problem with globals is not user abuse, but that they create tight coupling between the modules in your project, which makes the code hard to read and maintain and increases the chance of bugs "escalating" outside the module where the bug is located.
Every datum in your program belongs to a certain module, where a "module" consists of a h file and a corresponding c file. Call it module or class or ADT or whatever you like. Common sense and OO design both dictate that the variables need to be declared in the module where they actually belong, period.
You can either do so by declaring the variable at file scope static and then implement setter/getter functions. This is "poor man's private encapsulation" and not thread-safe, but for embedded systems it will work just fine in most cases. This is the embedded industry de facto standard of declaring such variables.
Or alternatively and more advanced, you can do true private encapsulation by declaring a struct as incomplete type in a h file, and define it in the C file. This is sometimes called "opaque type" and gives true encapsulation on object basis, meaning that you can declare multiple instances of the class. Opaque type can also be used to implement inheritance (though in rather burdensome ways).
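As a rough single-file sketch of the opaque-type approach: in a real project the first part would live in the h file and the rest in the c file. The names counter_t, counter_create and so on are made up for illustration, and the fixed-size pool is an assumption (embedded projects often avoid malloc); a heap-based create function would work the same way.

```c
#include <stddef.h>

/* ---- would live in counter.h: callers only see an incomplete type ---- */
typedef struct counter counter_t;          /* hypothetical name */

counter_t *counter_create(int start);      /* returns NULL when the pool is full */
void       counter_increment(counter_t *c);
int        counter_value(const counter_t *c);

/* ---- would live in counter.c: the only place that knows the layout ---- */
struct counter {
    int value;
    int in_use;
};

#define COUNTER_POOL_SIZE 4                /* assumed limit for the example */
static struct counter pool[COUNTER_POOL_SIZE];

counter_t *counter_create(int start)
{
    for (size_t i = 0; i < COUNTER_POOL_SIZE; ++i) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].value = start;
            return &pool[i];
        }
    }
    return NULL;
}

void counter_increment(counter_t *c) { c->value++; }

int counter_value(const counter_t *c) { return c->value; }
```

Since struct counter is only completed in the c file, code in other files can hold several counter_t pointers at once but cannot write c->value directly.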
Declare and define all the variables in a header file that is included into the manager's .c file but not into its .h
That way they will be only visible for the manager's functions.
You can keep all variables in a single source inside a struct with getters and setters
static struct all_globals{
long long myll;
/* ... */
} all_globals; /* Not _really_ global*/
long long getmyll(void){
return all_globals.myll;
}
long long setmyll(long long value){
return all_globals.myll = value;
}
Similarly, you could use an internal header file that is not exported to the user API, then strip symbols from the resulting binary/library:
/* globals.h (internal, not shipped with the public API) */
struct all_globals{
long long myll;
/* ... */
};
extern struct all_globals all_globals;
#define getmyll() (all_globals.myll)
#define setmyll(value) (all_globals.myll = (value))
/* globals.c */
#include "globals.h"
struct all_globals all_globals; /* Not _really_ global*/
This will still be technically visible to the end user with enough effort, but allows you to distinguish globals and keep them together.
How can I achieve or emulate private access behavior under ANSI C (that is only code from my software module can call functions "successfully"), when "static" declaration is not an option?
Background:
I am in a strange situation.
I have to develop a big software module, which operates as an abstraction on a huge number of inputs (some thousand) to our software, presented by access functions — I will call them signals further. The module does some similar mapping operations on most of these signals, like changing the representation from integer to float, changing physical units or remap nominal values to other ones — depending on semantics and nature of the signals. Also some validations (like input set/range checks) are done signal-wise, and some common error validation is also done for signals and groups of them (return values of the access functions).
Now our test department wants to test small groups of the functionality — the units(tm) — like for every input-signal-group, and of course testing input routines, mapping routines and regrouped output-routines individually from each other, which totally makes sense.
But due to rigid rules on what is tested, each testable unit must be represented as its own C object and tested as we will ship it. To allow this, any of the input, mapping and output routines need external interfaces. And we ship a library to be integrated in a bigger software system.
Now, every input and output routine is highly specialized (to some degree), as it calls specific access routines; every mapping routine is highly specialized (to some degree), as it supplies the (general) mapping with signal-specific magic numbers; and so on. To add to the fun, the special I/O routines (and the access routines they use) are not reentrant under all conditions. So these functions MUST always be called in a certain way by my module, and MUST NOT under any circumstances be called by any other part of the software.
How can I provide some safety and encapsulation for these functions, although a "static" declaration of the functions is not possible, because of the test procedures, and my requirements document says it is my job to ensure calling correctness.
Any ideas?
There is no way to strictly define a private / public function or variable in C. What you can do is either to "hide" some information or to define a function or a variable as static.
A way to hide the information is to have a header my_object.h like:
typedef struct my_object_struct my_object_t;
my_object_t * my_object_new(...);
void my_object_free(my_object_t * my_object);
type my_function1(my_object_t * my_object, ...);
type my_function2(my_object_t * my_object, ...);
And the implementation in a file my_object.c:
struct my_object_struct{
type some_var;
type some_var2;
...
};
struct some_hidden_state{
...
};
type my_hidden_function(my_object_t * my_object, struct some_hidden_state * state, ...){
}
my_object_t * my_object_new(...){
...
}
void my_object_free(my_object_t * my_object){
...
}
type my_function1(my_object_t * my_object, ...){
...
}
type my_function2(my_object_t * my_object, ...){
...
}
The only visible way to create a my_object_t object is to use my_object_new, and its internal variables are hidden (though not from a debugger).
This way my_hidden_function and some_hidden_state are not visible through the header, but someone could still write the correct prototype (an extern declaration) and call the function; the only way to strictly encapsulate them is to use static.
I am writing a C (shared) library. It started out as a single translation unit, in which I could define a couple of static global variables, to be hidden from external modules.
Now that the library has grown, I want to break the module into a couple of smaller source files. The problem is that now I have two options for the mentioned globals:
Have private copies at each source file and somehow sync their values via function calls - this will get very ugly very fast.
Remove the static definition, so the variables are shared across all translation units using extern - but now application code that is linked against the library can access these globals, if the required declaration is made there.
So, is there a neat way for making private global variable shared across multiple, specific translation units?
You want the visibility attribute extension of GCC.
Practically, something like:
#define MODULE_VISIBILITY __attribute__ ((visibility ("hidden")))
#define PUBLIC_VISIBILITY __attribute__ ((visibility ("default")))
(You probably want to #ifdef the above macros, using some configuration tricks à la autoconf and other autotools; on other systems you would just have empty definitions like #define PUBLIC_VISIBILITY /*empty*/ etc...)
Then, declare a variable:
int module_var MODULE_VISIBILITY;
or a function
void module_function (int) MODULE_VISIBILITY;
Then you can use module_var or call module_function inside your shared library, but not outside.
See also the -fvisibility code generation option of GCC.
BTW, you could also compile your whole library with -Dsomeglobal=alongname3419a6 and use someglobal as usual; to really find it your user would need to pass the same preprocessor definition to the compiler, and you can make the name alongname3419a6 random and improbable enough to make the collision improbable.
PS. This visibility attribute is specific to GCC-compatible compilers and, in practice, to ELF shared libraries such as those on Linux. It won't work without such a compiler or outside of shared libraries, so it is fairly Linux-specific (even if a few other systems, perhaps Solaris with GCC, have it). Clang from LLVM also supports the attribute for shared libraries (not static ones). The real hiding (across the several compilation units of a single shared library) is done mostly by the linker, because the ELF shared library format permits it.
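A minimal sketch of how the conditional macro definitions could look, assuming a GCC-compatible compiler is detected through __GNUC__ (the version check and the names module_var, module_function and library_total are illustrative, not taken from any real library):

```c
/* Hypothetical internal header content: visibility macros that expand to
   nothing on compilers without the GCC attribute syntax. */
#if defined(__GNUC__) && __GNUC__ >= 4
#  define MODULE_VISIBILITY __attribute__ ((visibility ("hidden")))
#  define PUBLIC_VISIBILITY __attribute__ ((visibility ("default")))
#else
#  define MODULE_VISIBILITY /* empty */
#  define PUBLIC_VISIBILITY /* empty */
#endif

/* Shared between the library's translation units, but not exported. */
MODULE_VISIBILITY int module_var = 0;
MODULE_VISIBILITY void module_function(int delta);

/* Part of the public API. */
PUBLIC_VISIBILITY int library_total(void);

void module_function(int delta)
{
    module_var += delta;
}

int library_total(void)
{
    module_function(2);      /* internal call, always allowed */
    return module_var;
}
```

When this is built into a shared library, an application linking against it should be able to resolve library_total but not module_var or module_function. On a compiler where the macros are empty, everything stays exported as usual.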
The easiest ("old-school") solution is to simply not declare the variable in the intended public header.
Split your library's header into "header.h" and "header-internal.h", and declare internal stuff in the latter one.
Of course, you should also take care to protect your library-global variable's name so that it doesn't collide with user code; presumably you already have a prefix that you use for the functions for this purpose.
You can also wrap the variable(s) in a struct, to make it cleaner since then only one actual symbol is globally visible.
You can obfuscate things with disguised structs, if you really want to hide the information as best as possible. e.g. in a header file,
struct data_s {
void *v;
};
And somewhere in your source:
struct data_s data;
struct gbs {
// declare all your globals here
} gbss;
and then:
data.v = &gbss;
You can then access all the globals via: ((struct gbs *)data.v)->
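Put together, the fragments above might look like the following sketch. The members of struct gbs, the GLOBALS shorthand and bump_counter are invented for the example; gbss is made static here on the assumption that nothing outside this file needs its name.

```c
/* ---- public header (hypothetical): users only see a void pointer ---- */
struct data_s {
    void *v;
};
extern struct data_s data;

/* ---- library source: the real layout stays here ---- */
struct gbs {
    int  counter;     /* example members, assumed for illustration */
    long limit;
};
static struct gbs gbss = { 0, 100 };

/* The address of a static object is a valid static initializer,
   so no separate init call is needed. */
struct data_s data = { &gbss };

/* Shorthand intended for library code only. */
#define GLOBALS ((struct gbs *)data.v)

int bump_counter(void)
{
    if (GLOBALS->counter < GLOBALS->limit)
        GLOBALS->counter++;
    return GLOBALS->counter;
}
```

A user who includes only the public header sees data.v as an untyped pointer and has no definition of struct gbs to cast it to.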
I know this is not literally what you intended, but you can keep the global variables static and divide them among multiple source files.
Put the functions that write to each static variable in the same source file as that variable, and declare them static as well.
Declare functions that read the static variable so that the other source files of the same module can read its value.
In a way, this makes it less global. If possible, the best basis for breaking big files into smaller ones is the data they operate on.
If it is not possible to do it this way, then you can move all the global variables into one source file as static and access them from the module's other source files through functions. That makes the access official, so if someone is manipulating your global variables, at least you know how.
But then it is probably better to use @unwind's method.
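A small sketch of that split, with a private writer and a reader that other files of the module may call. The file name, the function names and the validity rule are assumptions made up for the example.

```c
/* sensor.c (hypothetical file): the variable and its writer are private,
   only the reader and the entry point have external linkage. */
static int last_reading = 0;

/* Writer: static, so only code in this file can change the value. */
static void store_reading(int raw)
{
    last_reading = raw;
}

/* Reader: callable from the module's other source files. */
int sensor_last_reading(void)
{
    return last_reading;
}

/* Entry point that owns every write to the variable. */
void sensor_process_sample(int raw)
{
    if (raw >= 0)             /* assumed validity rule for the example */
        store_reading(raw);
}
```

Other source files can observe last_reading through sensor_last_reading, but every modification goes through code in this one file.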
If you want to cut to the chase, please skip down to the last two paragraphs. If you're interested in my predicament and the steps I've taken to solve it, continue reading directly below.
I am currently developing portions of a C library as part of my internship. So naturally, there are some parts of the code which should not be accessible to the user while others should be. I am basically developing several architecture-optimized random number generators (RNGs) (uniform, Gaussian, and exponentially distributed numbers). The latter two RNGs depend on the uniform generator, which is in a different kernel (project). So, in the case that the user wants to use more than one RNG, I want to make sure I'm not duplicating code needlessly, since we are constrained with memory (no point in having the same function defined multiple times at different addresses in the code segment).
Now here's where the problem arises. The convention for all other kernels in the library is that we have two header files and two C files (one each for the natural C implementation and the optimized C version, which may use some intrinsic functions and assembly and/or have some restrictions to make it faster and better for our architecture). This is followed by another C file (a testbench) where our main function is located; it tests both implementations and compares the results. With that said, we cannot really add an additional header file for private or protected items, nor can we add a global header file for all these generators.
To combat this restriction, I used extern functions and extern const int's in the C files which depend on the uniform RNG rather than #define's at the top of each C file in order to make the code more portable and easily modified in one place. This worked for the most part.
However, the tricky bit is that we are using an internal type within these kernels (which should not be seen by the user and should not be placed in the header file). Again, for portability, I would like to be able to change the definition of this typedef in one place rather than in multiple places in multiple kernels since the library may be used for another platform later on and for the algorithms to work it is critical that I use 32-bit types.
So basically I'm wondering if there's any way I can make a typedef "protected" in C. That is, I need it to be visible among all C files which need it, but invisible to the user. It can be in one of the header files, but must not be visible to the user who will be including that header file in his/her project, whatever that may be.
============================Edit================================
I should also note that the typedef I am using is an unsigned int. so
typedef unsigned int myType;
No structures involved.
============================Super Edit==========================
The use of stdint.h is also forbidden :(
I am expanding on Jens Gustedt’s answer since the OP still has questions.
First, it is unclear why you have separate header files for the two implementations (“natural C” and “optimized C”). If they implement the same API, one header should serve for either.
Jens Gustedt’s recommendation is that you declare a struct foo in the header but define it only in the C source file for the implementation and not in the header. A struct declared in this way is an incomplete type, and source code that can only see the declaration, and not the definition, cannot see what is in the type. It can, however, use pointers to the type.
The declaration of an incomplete struct may be as simple as struct foo. You can also define a type, such as typedef struct foo foo; or typedef struct foo Mytype;, and you can define a type that is a pointer to the struct, such as typedef struct foo *FooPointer;. However, these are merely for convenience. They do not alter the basic notion, that there is a struct foo that API users cannot see into but that they can have pointers to.
Inside the implementation, you would fully define the struct. If you want an unsigned int in the struct, you would use:
struct foo
{
unsigned int x;
};
In general, you define the struct foo to contain whatever data you like.
Since the API user cannot define struct foo, you must provide functions to create and destroy objects of this type as necessary. Thus, you would likely have a function declared as extern struct foo *FooAlloc(some parameters);. The function creates a struct foo object (likely by calling malloc or a related function), initializes it with data from the parameters, and returns a pointer to the object (or NULL if the creation or initialization fails). You would also have a function extern void FooFree(struct foo *p); that frees a struct foo object. You might also have functions to reset, set, or alter the state of a foo object, functions to copy foo objects, and functions to report about foo objects.
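A minimal sketch of such a pair of functions, shown in one file for brevity. The seed parameter and the FooGet accessor are assumptions added for illustration; only FooAlloc and FooFree come from the description above.

```c
#include <stdlib.h>

/* ---- header part: incomplete type plus the functions that manage it ---- */
struct foo;
extern struct foo *FooAlloc(unsigned int seed);    /* parameter is an assumption */
extern void FooFree(struct foo *p);
extern unsigned int FooGet(const struct foo *p);   /* hypothetical accessor */

/* ---- implementation part: the full definition ---- */
struct foo
{
    unsigned int x;
};

struct foo *FooAlloc(unsigned int seed)
{
    struct foo *p = malloc(sizeof *p);
    if (p != NULL)
        p->x = seed;
    return p;          /* NULL if the allocation failed */
}

void FooFree(struct foo *p)
{
    free(p);           /* free(NULL) is a no-op */
}

unsigned int FooGet(const struct foo *p)
{
    return p->x;
}
```

Code that sees only the header part can store and pass struct foo pointers, but it cannot allocate a struct foo itself or read p->x without going through the API.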
Your implementations could also define some global struct foo objects that could be visible (essentially by address only) to API users. As a matter of good design, this should be done only for certain special purposes, such as to provide instances of struct foo objects with special meanings, such as a constant object with a permanent “initial state” for copying.
Your two implementations, the “natural C” and the “optimized C” implementations may have different definitions for the struct foo, provided they are not both used in a program together. (That is, each entire program is compiled with one implementation or the other, not both. If necessary, you could mangle both into a program by using a union, but it is preferable to avoid that.)
This is not a singleton approach.
Just do
typedef struct foo foo;
These are two declarations: a forward declaration of a struct and a type alias with the same name. A forward-declared struct can be used for nothing other than defining pointers to it. This should give you enough abstraction and type safety.
In all your interfaces you'd have
extern void proc(foo* a);
and you'd have to provide functions
extern foo* foo_alloc(size_t n);
extern void foo_free(foo* a);
This would bind your users as well as your library to always use the same struct. Thereby the implementation of foo is completely hidden from the API users. You could even decide one day to use something other than a struct, since users refer to foo without the struct keyword.
Edit: Just a typedef to some kind of integer wouldn't help you much, because these are only aliases for types. All your types aliased to unsigned could be used interchangeably. One way around this would be to encapsulate them inside a struct. This would make your internal code a bit ugly, but the generated object code should be exactly the same with a good modern compiler.
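To illustrate the point about wrapping an integer in a struct, here is a small sketch with two hypothetical wrapper types. The names and the arithmetic are placeholders, not part of any real generator.

```c
/* Two wrapper types around the same underlying integer (hypothetical names). */
typedef struct { unsigned int v; } seed_t;
typedef struct { unsigned int v; } sample_t;

/* Only accepts a seed_t; passing a sample_t or a bare unsigned int
   is rejected at compile time. */
sample_t next_sample(seed_t s)
{
    sample_t out;
    out.v = s.v + 1u;      /* placeholder arithmetic, not a real RNG */
    return out;
}
```

With plain typedefs to unsigned int, mixing up the two types would compile silently; with the struct wrappers the compiler reports an incompatible type, at the cost of writing .v inside the library.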
Is it possible to declare a structure type that is only visible in the .c file which uses the structure? I know that by putting static in front of a external data object, you change the linkage of the variable to be internal. But is it possible to put static in front of the declaration of a new struct type, like the following?
static struct log{
...;
...;
};
typedef struct log log;
If it is not possible to make the structure type, say log as above, to be "private", does it mean that even though other source files do not know the existence of the name (which is log in my example) of the structure, accidental name collisions can still happen if they name some variables log (assuming I will link all object files) ?
EDIT: I am not familiar with how compiler/linker works. If there is a global variable name log, and the file that contains the global variable is linked to the sole source file in which structure log is defined, wouldn't that cause any confusion when linking, one log is a variable name while another log is a type name?
No. The only way to make a struct private is to only have its definition available in the files that use it -- don't put it in a common header file. If it's only used in one source file, then just define it in that source file, but if it's used in more than one source file, you have a tricky problem: you can define it in each source file, but that's fragile since you have to remember to change each instance of it when you make any changes; or, you can define it in a private header file, and make sure only those source files include the private header.
Name collisions in different source files are ok, as long as they don't try to interface with each other in any way. If you have a struct log defined in one file and a different definition of struct log in a different file, do not ever pass one log to the other. In C, the structure name doesn't become part of any symbol names in the object file -- in particular, there's no name mangling of function names to include the parameter types (like C++ does), since function overloading is illegal in C.
No. static is a storage-class specifier; it is not meaningful to apply it to a type outside a variable declaration.
If you don't want to define struct log in your header file, you don't have to. Simply writing the typedef as:
typedef struct log log;
is sufficient, so long as you only deal with log * pointers. However, you will need a full definition of the structure to declare a log (or take sizeof(log)), because the size of the structure depends on what it contains.
With regard to name collisions, keep in mind that structures and types are not managed by the linker. The linker only cares about globally visible symbols, such as functions and variables. That being said, you should probably apply a prefix to your type names (e.g, mylib_log_t) to avoid confusion, particularly because log is a math function in the standard library.
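On the question of a variable and a struct sharing a name: within a single source file, C keeps struct tags in a separate name space from ordinary identifiers, so the two can coexist. The sketch below uses the made-up name record rather than log, to stay clear of the standard library function.

```c
/* A struct tag and an ordinary identifier may share a name in one file,
   because tags live in their own name space. */
struct record {
    int level;
};

int record = 7;                 /* a variable, unrelated to struct record */

int record_sum(void)
{
    struct record r = { 3 };    /* "struct record" always means the type */
    return r.level + record;    /* plain "record" means the variable */
}
```

Adding typedef struct record record; to this file would break it, because a typedef name is an ordinary identifier and would then clash with the variable.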
You have a reason to write this:
static int a;
because it gives the variable internal linkage, which prevents the linker from combining it with another a defined somewhere else.
The linker has nothing to do with struct types, so there is no problem with defining them in different C files.
As long as the definitions are in different C files, there will be no name confusion.
This isn't possible in general. But I can think of a hack that might work on some compilers.
The reason why this is hard to do is because the C compiler needs to know what the structure looks like in order to generate calls to functions with instances of the structure as argument.
So, suppose that you define a library with the following header:
#include <stdint.h>
struct foo {
int32_t a, b;
};
struct foo make_foo(int arg);
struct foo do_something(struct foo p1, struct foo p2);
Then to compile a program which makes a call to do_something, your compiler usually needs to know what the structure foo is like, so that it can pass it as an argument. The compiler can do all sorts of weird things here, like passing part of the structure via registers and part via the stack, so it really needs to know what the structure looks like.
However, I believe that in some compilers it is possible to indicate that the structure should be passed entirely via the stack. For instance, the regparm(0) function attribute should work for GCC if you have i386 as your target architecture.
In that situation, it should be possible to do something like this: create a 'public version' of the header file, and in that file, instead of laying out the full struct, you create an undifferentiated version of it:
struct foo {
uint8_t contents[SIZE_OF_STRUCT_FOO];
};
where SIZE_OF_STRUCT_FOO is whatever sizeof(struct foo) returns when you define the struct in the usual way. You are then basically saying that "foo" is a struct with SIZE_OF_STRUCT_FOO bytes. Then, as long as the calling convention treats these two structs in the same way, it should work.
Is there any way to keep global variables visible only from inside a library while inaccessible from programs that access that library in C?
It's not that it is vital to keep the variable protected, but I would rather it if programs couldn't import it as it is nothing of their business.
I don't care about solutions involving macros.
If you use GCC (gcc or g++), you can use the linker facilities for that using attributes.
__attribute__((visibility("hidden"))) int whatever;
You can also mark everything as hidden and mark explicitly what is visible with this flag: -fvisibility=hidden
And then mark the visible variables with:
__attribute__((visibility("default"))) int whatever;
static int somelocalvar = 0;
That makes somelocalvar visible only from within the source file where it is declared.
Inside the library implementation, declare your variables like that:
struct my_lib_variables
{
int var1;
char var2;
};
Now in the header for end-users, declare it like that:
struct my_lib_variables;
It declares the structure as an incomplete type. People who will use the header will be able to create a pointer to the struct, but that's all. The goal is that they have to write something like that:
#include "my_lib.h"
struct my_lib_variables* p = my_lib_init();
my_lib_do_something(p);
my_lib_destroy(p);
The library code is able to modify the variables, but the user's code can't do it directly.
Or you can use global variables, but put the extern declarations inside a header which will not be used by the end-user.
You can use a different header file for exporting functionality to outside modules than the one you use for internal functionality; that way you don't have to declare globals in the public header that don't need to be accessible from outside the module.
Edit:
There are only linker problems if you define things more than once. There is no need to keep all global data in one header file; in fact, there may be good reason to split it up into several smaller pieces for maintainability and different areas of responsibility. Splitting into header files for external data and internal data is one such reason, and this should not be a problem since it is possible to include more than one header file into the same source file. And don't forget the include guards in the header files; they keep a header from being processed twice in the same source file, which avoids most redefinition errors.
#ifndef XXX_HEADER_FILE
#define XXX_HEADER_FILE
code
#endif