does the static keyword protect variables in #included files? - c

Let's say I'm writing a library of functions, and each function makes use of a global array to perform its duties. I don't want to expose that array to non library code, so I declare it as static like so:
library.h:
void function1();
void function2();
library.c:
#include "library.h"
static int arr[ARBITRARY_SIZE];
void function1() {...} // both of these
void function2() {...} // make use of arr
If I now want to use this library in my code, I would #include "library.c" at the top of my code.
If I understand correctly, #include simply copies and pastes the contents of the #included file in place. If this is the case, the user's code would itself contain the static definition of arr. Given that, how would I, as the author of the library, protect my library variables? If this is not the case, please correct me about what #include does!

The static keyword doesn't protect the memory used by a variable. A function that can see the variable can hand out a pointer to it, which makes the variable accessible outside the scope where it is defined. The calling code can then use that pointer to modify it as desired.
static serves two purposes:
inside a block in a function body, it states that although the variable is visible only inside the block where it is defined, its lifetime is the whole lifetime of the program (it is not created and destroyed each time execution enters and exits the block).
outside a block, it gives the variable file-local visibility (the name is not exposed outside the compilation unit that defines it). But that doesn't make that chunk of memory inaccessible: if you have a pointer to it, you can still modify it as you want.
#include simply inserts the contents of the included file verbatim into the compilation stream, so everything declared static in the included file is visible in the including file (after the point of inclusion), and locally in every compilation unit that also includes that file. But each of those definitions is different and independent. They don't refer to the same variable (they are local definitions in different compilation units), just as two local variables with the same name in different blocks (even nested blocks) are different variables.
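A minimal sketch of both uses of static described above, and of how a pointer gets around the name hiding. The names arr_handle, arr_first and next_id, and the value of ARBITRARY_SIZE, are made up for illustration and are not part of the question's library:

```c
#define ARBITRARY_SIZE 4   /* assumed size, for illustration only */

/* File scope: the name arr has internal linkage, so no other
   compilation unit can refer to it by name. */
static int arr[ARBITRARY_SIZE];

/* Hypothetical accessor: handing out a pointer makes the memory
   reachable from any caller, despite the static keyword. */
int *arr_handle(void)
{
    return arr;
}

/* Hypothetical reader, so the effect of an outside write can be observed. */
int arr_first(void)
{
    return arr[0];
}

/* Block scope: counter is visible only inside next_id, but it lives
   for the whole program and keeps its value between calls. */
int next_id(void)
{
    static int counter = 0;
    return ++counter;
}
```

A caller in another file cannot write arr[0] = 7 directly, but arr_handle()[0] = 7 works, which is the point about static not protecting the memory itself.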

If I now want to use this library in my code, I would #include "library.c" at the top of my code.
That will only work if you use this library in a single source file.
As soon as you add foo.c and bar.c which both #include "library.c" and try to link them together, you would get a multiply-defined function1 and function2 symbol error (because each of foo.o and bar.o will now provide its own separate definitions).
You could fix this by making the functions static as well: static void function1() { ... }, etc. but this is not how people usually use libraries, because that method causes long compile times and a larger than necessary executable. In addition, if you are using this method, you don't need the library.h file at all.
Instead, what people usually do is compile library.c into library.o, #include "library.h" at the top of their source files, then link everything together.
I don't want to expose that array to non library code, so I declare it as static like so:
That is a valid thing to do, and achieves your purpose (so long as you #include "library.h" and not library.c).
Note that using global arrays (as well as most other globals) makes code harder to reason about, and causes additional difficulties when making code thread-safe, and thus it's best to use globals very sparingly.
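As a rough sketch of what such a separately compiled library.c could look like: the question does not say what function1 and function2 do, so library_store and library_sum below are hypothetical stand-ins, and the value of ARBITRARY_SIZE is an assumption.

```c
#include <stddef.h>

#define ARBITRARY_SIZE 8   /* assumed value; the question leaves it unspecified */

/* Private to this translation unit: code that only includes
   library.h cannot name this array. */
static int arr[ARBITRARY_SIZE];

/* Hypothetical stand-in for function1: stores a value.
   Returns 0 on success, -1 if the index is out of range. */
int library_store(size_t index, int value)
{
    if (index >= ARBITRARY_SIZE)
        return -1;
    arr[index] = value;
    return 0;
}

/* Hypothetical stand-in for function2: sums the array. */
long library_sum(void)
{
    long total = 0;
    for (size_t i = 0; i < ARBITRARY_SIZE; i++)
        total += arr[i];
    return total;
}
```

With a matching library.h that declares the two functions, the build would be along the lines of cc -c library.c followed by cc main.c library.o, where main.c is whatever user code includes library.h.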

Related

Benefits and drawbacks of making all functions in main.c static?

I have heard that, when you have just 1 (main.c) file (or use a "unity build"), there are benefits to be had if you make all your functions static.
I am kind of confused why this (allegedly) isn't optimized by default, since it's not probable that you will include main.c into another file where you will use one of its functions.
I would like to know the benefits and dangers of doing this before implementing it.
Example:
main.c
static int my_func(void){ /*stuff*/ }
int main(void) {
my_func();
return 0;
}
You have various chunks of wisdom in the comments, assembled here into a Community Wiki answer.
Jonathan Leffler noted:
The primary benefit of static functions is that the compiler can (and will) aggressively inline them when it knows there is no other code that can call the function. I've had error messages from four levels of inlined function calls (three qualifying “inlined from” lines) on occasion. It's staggering what a compiler will do!
and:
FWIW: my rule of thumb is that every function should be static until it is known that it will be called from code in another file. When it is known that it will be used elsewhere, it should be declared in a header file that is included both where the function is defined and where it is used. (Similar rules apply to file scope variables — aka 'global variables'; they should be static until there's a proven need for them elsewhere, and then they should be declared in a header too.)
The main() function is always called from the startup code, so it is never static. Any function defined in the same file as an unconditionally compiled main() function cannot be reused by other programs. (Library code might contain a conditionally compiled test program for the library function(s) defined in the source file — most of my library code has #ifdef TEST / …test program… / #endif at the end.)
Eric Postpischil generalized on that:
General rule: Anytime you can write code that says the use of something is limited, do it. Value will not be modified? Make it const. Name only needs to be used in a certain section? Declare it in the innermost enclosing scope. Name does not need to be linked externally? Make it static. Every limitation both shrinks the window for a bug to be created and may remove complications that interfere with optimization.
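To make that rule concrete, here is a small sketch that applies all three limitations at once. The functions clamp and sum_clamped are invented for this example and do not come from the comments quoted above:

```c
#include <stddef.h>

/* static: the helper is only needed in this file, so it gets
   internal linkage and the compiler is free to inline it. */
static int clamp(const int value, const int low, const int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

/* const-qualified pointer: the function promises not to modify
   the caller's data. */
long sum_clamped(const int *values, size_t count, int low, int high)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        /* Declared in the innermost scope that needs it, and const
           because it is never reassigned. */
        const int v = clamp(values[i], low, high);
        total += v;
    }
    return total;
}
```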

is it possible to have only header file in C without source file

I would like to write a C library with fast access by including just header files without using compiled library. For that I have included my code directly in my header file.
The header file contains:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#ifndef INC_TEST_H_
#define INC_TEST_H_
void test(){
printf("hello\n");
}
#endif
My program doesn't compile because I have multiple definitions of the function test(). If I add a proper source file alongside my header, it works without error.
Is it possible to use only a header file, with the code included inside it, in a C app?
Including code in a header is generally a really bad idea.
If you have file1.c and file2.c, and in each of them you include your coded.h, then at the link part of the compilation, there will be 2 test functions with global scope (one in file1.c and the other one in file2.c).
You can use the word "static" in order to say that the function will be restricted so it is only visible in the .c file which includes coded.h, but then again, it's a bad idea.
Last but not least: how do you intend to make a library without a .so/.a file? This is not a library; this is copy/paste code directly in your project.
And when a bug is found in your "library", you will be left with no solution apart from correcting your code, redistributing it to every project, and recompiling every project, missing the very point of a dynamic library: the ability to "just" fix the library without touching every program using it.
If I understand what you're asking correctly, you want to create a "library" which is strictly source code that gets #included as necessary, rather than compiled separately and linked.
As you have discovered, this is not easy when you're dealing with functions - the compiler complains of multiple definitions (you will have the same problem with object definitions).
You have a couple of options at this point.
You could declare the function static:
static void test( void )
{
...
}
The static keyword limits the function's visibility to the current translation unit, so you don't run into multiple definition errors at link time. It means that each translation unit is creating its own separate "instance" of the function, leading to a bit of code bloat and slightly longer build times. If you can live with that, this is the easiest solution.
You could use a macro in place of a function:
#define TEST() (printf( "hello\n" ))
except that macros are not functions and do not behave like functions. While macro-based "libraries" do exist, they are not trivial to implement correctly and require quite a bit of thought. Remember that macro arguments are not evaluated before substitution; they're just expanded in place, which can lead to problems if you pass expressions with side effects. The classic example is:
#define SQUARE(x) ((x)*(x))
...
y = SQUARE(z++);
SQUARE(z++) expands to ((z++)*(z++)), which leads to undefined behavior.
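For comparison, a sketch of the same operation written as a function instead of a macro. This assumes a C99 or later compiler (for inline), and square is a hypothetical name. A function evaluates its argument exactly once, so the side effect is well defined:

```c
/* Hypothetical replacement for the SQUARE macro. The argument
   expression is evaluated once, before the call, so passing z++
   increments z a single time. */
static inline int square(int x)
{
    return x * x;
}
```

With z equal to 3, y = square(z++) leaves y at 9 and z at 4, with no undefined behavior.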
Separate compilation is a Good Thing, and you should not try to avoid it. Doing everything in one source file is not scalable, and leads to maintenance headaches.
My program doesn't compile because I have multiple definitions of the test() function
That is because the .h file with the function is included and compiled in multiple C source files. As a result, the linker encounters the function with global scope multiple times.
You could have defined the function as static, which means it will be visible only within the current compilation unit, so:
static void test()
{
printf("hello\n");
}
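Putting the pieces together, a header-only version of the question's file might look roughly like this. The int return value (the character count reported by printf) is an addition for illustration so that callers have something to check; the original test() returns nothing. static inline assumes C99 or later.

```c
/* Sketch of the question's header with the definition made static. */
#ifndef INC_TEST_H_
#define INC_TEST_H_

#include <stdio.h>

/* Each source file that includes this header gets its own private
   copy of test(), so the linker never sees two external definitions. */
static inline int test(void)
{
    return printf("hello\n");
}

#endif /* INC_TEST_H_ */
```

The cost is the one mentioned in the answers: every translation unit that calls test() carries its own copy of the code.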

If function declaration is not in header file, is static keyword necessary?

If a function declaration isn't in a header file (.h), but is instead only in a source file (.c), why would you need to use the static keyword? Surely, if you only declare it in a .c file, it isn't seen by other files, as you're not supposed to #include .c files, right?
I have already read quite a few questions and answers about this (eg. here and here), but can't quite get my head around it.
What static does is make it impossible to declare and call a function in other modules, whether through a header file or not.
Recall that header file inclusion in C is just textual substitution:
// bar.c
#include "header.h"
int bar()
{
return foo() + foo();
}
// header.h
int foo(void);
gets preprocessed to become
int foo(void);
int bar()
{
return foo() + foo();
}
In fact, you can do away with header.h and just write bar.c this way in the first place. Similarly, the definition for foo does not need to include the header in either case; including it just adds a check that the definition and declaration for foo are consistent.
But if you were to change the implementation of foo to
static int foo()
{
// whatever
return 42;
}
then the declaration of foo would cease to work, in modules and in header files (since header files just get substituted into modules). Or actually, the declaration still "works", but it stops referring to your foo function and the linker will complain about that when you try to call foo.
The main reason to use static is to prevent linker clashes: even if foo and bar were in the same module and nothing outside the module called foo, if it weren't static, it would still clash with any other non-static function called foo. The second reason is optimization: when a function is static, the compiler knows exactly which parts of the program call it and with what arguments, so it can perform constant-folding, dead code elimination and/or inlining.
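A small sketch of the static case kept inside one file: the prototype is also marked static, so bar can call foo before its definition, and nothing outside this file can declare and link against it.

```c
/* Prototype with internal linkage: enough for bar() to call foo()
   before its definition, and invisible to every other file. */
static int foo(void);

int bar(void)
{
    return foo() + foo();
}

static int foo(void)
{
    /* whatever */
    return 42;
}
```

A declaration such as int foo(void); in some other source file would compile there, but linking would fail with an undefined reference, because this foo is never exported.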
The static keyword reduces the visibility of a function to the file scope. That means that you can't locally declare the function in other units and use it since the linker does not add it to the global symbol table. This also means that you can use the name in other units too (you may have a static void testOutput(); in every file, that is not possible if the static is omitted.)
As a rule of thumb you should keep the visibility of symbols as limited as possible. So if you do not need the routine outside (and it is not part of some interface) then keep it static.
It allows you to have functions with identical names in different source files, since a static function has internal linkage: its name is not exported from the object file, so the linker never sees two conflicting definitions.
It helps whoever maintains the code to know that the function is not exposed as part of the interface, and is used only internally within the file (a non-static function can be used in other source files even if it's not declared in any header file, using the extern keyword).

How to define and declare global variables for use by library code?

main file (prog.c):
#include "log.c"
#include "library.c"
static char * Foo;
If some variable (char * Foo) is defined in main file (prog.c), and it is required by log.c function called from library.c, how to correctly declare Foo to be visible from log.c's namespace?
Add its declaration to some .h file that is included in both .c files. Define it in one of the files.
Of course, it can't be declared static for this to work since the static keyword is a promise that the name won't be needed outside of that particular module.
For example, in prog.h:
extern char *Foo;
in prog.c:
#include "prog.h"
#include "log.c"
#include "library.c"
char * Foo; // make sure that Foo has a definition
// some code probably wants to make Foo have a value other than NULL
in log.c:
//... other includes
#include "prog.h" // and now Foo is a known name
// some code here is using the global variable Foo
Now, for the bad news.
Doing this sort of thing creates a coupling between the prog.c and log.c modules. That coupling adds to the maintenance cost of your application as a whole. One reason is that there is no way to prevent other modules from using the global variable also. Worse, they might be using it completely by accident, because its name is insufficiently descriptive.
Worse, globals make it much more difficult to move from single-threaded programs to multi-threaded programs. Every global variable that might be accessed from more than one thread is a potential source of really hard to diagnose bugs. The cure is to guard information that must be global with synchronization objects, but overused that can result in an application where all the threads are blocked except the one that is currently using the global, making the multi-threaded application effectively single threaded.
There certainly are times when the inter-module coupling implied by global variables is acceptable. One use case is for general purpose application-wide options. For instance, if your application supports a --verbose option that makes it chatter while it works, then it makes sense for the flag that is set by the option and tested throughout the code to be a global variable.
There are certainly questions at SO that delve deeply into the pitfalls of globals and will provide guidance on their sensible use.
It is unconventional to include the library source code in your main program:
#include "log.c"
#include "library.c"
static char * Foo;
(The semi-colon is needed.)
However, given that is what you are doing, if "log.c" needs to see the declaration, you could simply do:
static char * Foo;
#include "log.c"
#include "library.c"
Now the static declaration is visible to "log.c" (and "library.c").
If you go for a more conventional setup, then you would have the code in "log.c" access a global variable declared in an appropriate header (rather than a file static variable). However, such dependencies (where a library file depends on a global variable) are a nuisance. The main program (or some piece of code) has to provide the variable definition. It would be better to have the code in "log.c" define the variable, and the (presumed) header "log.h" would declare the variable, and then the main program would set the variable accordingly. Or, better, the code in "log.c" would provide a function or several functions to manipulate the variable, and the header would declare those functions, and the main program would use them.
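A sketch of that last arrangement, where log.c owns the variable and exposes functions for it. The question does not say what Foo holds, so the prefix string and the names log_set_prefix and log_format are assumptions made up for this example:

```c
#include <stdio.h>

/* log.c (sketch): the variable stays private to this file. */
static const char *log_prefix = "";

/* Would be declared in the presumed log.h. The main program calls
   this instead of defining a global that log.c depends on. */
void log_set_prefix(const char *prefix)
{
    log_prefix = prefix ? prefix : "";
}

/* Writes "<prefix><message>" into buf and returns snprintf's result. */
int log_format(char *buf, size_t size, const char *message)
{
    return snprintf(buf, size, "%s%s", log_prefix, message);
}
```

The main program would then call log_set_prefix("prog: ") once at startup, and no global variable crosses the module boundary.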
You want extern. When you extern a variable name you're making a "promise" that the variable will exist when you link. You want to give it storage in a .c file but extern it in a header. That way it's just instantiated once, in the .c's object file. You don't want to have two different .o's using the same name to refer to different locations in memory. (As noted above it's nearly always bad form to require something like this for a library.)
So in a common header you'd have
common.h
extern Foo bar;
Then in prog.c
Foo bar;
And when you included common.h in log.c you could access bar from prog.c
Note that static is very different in C than in Java. In Java it's global to a class and available to anyone, even without an instance of the class. In C, static at file scope means that the variable is not visible outside of the compilation unit.
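The extern arrangement can be sketched in one listing, using the question's char *Foo rather than bar. The comments mark which file each part would normally live in, and log_target is a hypothetical function name:

```c
#include <stddef.h>

/* common.h (sketch): a declaration only, no storage is reserved. */
extern char *Foo;

/* prog.c (sketch): the single definition. A file-scope pointer
   without an initializer starts out as NULL. */
char *Foo;

/* log.c (sketch): sees Foo through the extern declaration above. */
const char *log_target(void)
{
    return Foo != NULL ? Foo : "(unset)";
}
```

Once prog.c assigns to Foo, every file that includes the header sees the same object, which is exactly the coupling the other answers warn about.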
The simple answer is:
static char * Foo;
#include "log.c"
#include "library.c"
Which makes Foo visible in log.c and library.c simply by virtue of the "declare before use" rule.
However what you really need to know is that this is nasty code! You have committed at least two sins: use of global variables and failure to understand the use of separate compilation and linking.

Keeping variables global to the library scope in C

Is there any way to keep global variables visible only from inside a library while inaccessible from programs that access that library in C?
It's not that it is vital to keep the variable protected, but I would rather it if programs couldn't import it as it is nothing of their business.
I don't care about solutions involving macros.
If you use GCC (gcc or g++), you can use its symbol visibility attributes for that.
__attribute__((visibility("hidden"))) int whatever;
You can also mark everything as hidden and mark explicitly what is visible with this flag: -fvisibility=hidden
And then mark the visible variables with:
__attribute__((visibility("default"))) int whatever;
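A minimal sketch of how the two attributes might be combined in a library source file. This assumes GCC or a compatible compiler such as Clang, and the names call_count and lib_next_call are invented for the example:

```c
/* Hidden: usable from any source file inside the shared library,
   but not exported to programs that link against it. */
__attribute__((visibility("hidden"))) int call_count = 0;

/* Default visibility: part of the library's exported interface. */
__attribute__((visibility("default"))) int lib_next_call(void)
{
    return ++call_count;
}
```

Unlike static, a hidden variable can still be shared between several .c files of the same library. The attribute only has an effect when the code is built into a shared object; in an ordinary executable it is accepted but changes nothing the caller can observe.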
static int somelocalvar = 0;
that makes somelocalvar visible only from within the source file where it is declared (reference and example).
Inside the library implementation, declare your variables like that:
struct my_lib_variables
{
int var1;
char var2;
};
Now in the header for end-users, declare it like that:
struct my_lib_variables;
It declares the structure as an incomplete type. People who will use the header will be able to create a pointer to the struct, but that's all. The goal is that they have to write something like that:
#include "my_lib.h"
struct my_lib_variables* p = my_lib_init();
my_lib_do_something(p);
my_lib_destroy(p);
The library code is able to modify the variables, but the user's code can't do it directly.
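A sketch of how the two halves could fit together. The answer only names the three functions, so their signatures and what my_lib_do_something actually does (here, bump var1 and return it) are assumptions for illustration:

```c
#include <stdlib.h>

/* --- what the public header (my_lib.h) would contain --- */
struct my_lib_variables;                       /* incomplete type */
struct my_lib_variables *my_lib_init(void);
int my_lib_do_something(struct my_lib_variables *p);
void my_lib_destroy(struct my_lib_variables *p);

/* --- what only the library implementation sees --- */
struct my_lib_variables
{
    int var1;
    char var2;
};

/* Allocates a zero-initialized state block; returns NULL on failure. */
struct my_lib_variables *my_lib_init(void)
{
    return calloc(1, sizeof(struct my_lib_variables));
}

/* Assumed behavior: increments var1 and returns the new value. */
int my_lib_do_something(struct my_lib_variables *p)
{
    return ++p->var1;
}

void my_lib_destroy(struct my_lib_variables *p)
{
    free(p);
}
```

User code that includes only the header part cannot write p->var1, because the compiler does not know the struct's members there.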
Or you can use global variables, but put the extern declarations inside a header which will not be used by the end-user.
You can use a different header file for exporting functionality to outside modules than the one you use for internal functionality; that way you don't have to declare globals that don't need to be accessible from outside the module.
Edit:
There are only linker problems if you define things more than once. There is no need to keep all global data in one header file; in fact, there may be good reason to split it up into several smaller pieces for maintainability and different areas of responsibility. Splitting it into header files for external data and internal data is one such reason, and this should not be a problem since it is possible to include more than one header file in the same source file. And don't forget the include guards in the header files; they keep a header's contents from being processed twice when it is included more than once in the same source file.
#ifndef XXX_HEADER_FILE
#define XXX_HEADER_FILE
code
#endif

Resources