I wonder about the use of the static keyword as scope limiting for variables in a file, in C.
The standard way to build a C program as I see it is to:
have a bunch of c files defining functions and variables, possibly scope limited with static.
have a bunch of h files declaring the functions and possibly variables of the corresponding c file, for other c files to use. Private functions and variables are not published in the h file.
every c file is compiled separately to an o file.
all o files are linked together to an application file.
I see two reasons for declaring a gobal as static, if the variable is not published in the h file anyway:
one is for readability. Inform future readers including myself that a variable is not accessed in any other file.
the second is to prevent another c file from redeclaring the variable as extern. I suppose that the linker would dislike a variable being both extern and static. (I dislike the idea of a file redeclaring a variable owned by someone else as extern, is it ok practice?)
Any other reason?
Same goes for static functions. If the prototype is not published in the h file, other files may not use the function anyway, so why define it static at all?
I can see the same two reasons, but no more.
When you talk about informing other readers, consider the compiler itself as a reader. If a variable is declared static, that can affect the degree to which optimizations kick in.
Redefining a static variable as extern is impossible, but the compiler will (as usual) give you enough rope to hang yourself.
If I write static int foo; in one file and int foo; in another, they are considered different variables, despite having the same name and type - the compiler will not complain but you will probably get very confused later trying to read and/or debug the code. (If I write extern int foo; in the second case, that will fail to link unless I declare a non-static int foo; somewhere else.)
Global variables rarely appear in header files, but when they do they should be declared extern. If not, depending on your compiler, you risk that every source file which includes that header will declare its own copy of the variable: at best this will cause a link failure (multiply-defined symbol) and at worst several confusing cases of overshadowing.
By declaring a variable static on file level (static within function has a different meaning) you forbid other units to access it, e.g. if you try to the variable use inside another unit (declared with extern), linker won't find this symbol.
When you declare a static function the call to the function is a "near call" and in theory it performs better than a "far call". You can google for more information. This is what I found with a simple google search.
If a global variable is declared static, the compiler can sometimes make better optimizations than if it were not. Because the compiler knows that the variable cannot be accessed from other source files, it can make better deductions about what your code is doing (such as "this function does not modify this variable"), which can sometimes cause it to generate faster code. Very few compilers/linkers can make these sorts of optimizations across different translation units.
If you declare a variable foo in file a.c without making it static, and a variable foo in file b.c without making it static, both are automatically extern which means the linker may complain if you initialise both, and assign the same memory location if it doesn't complain. Expect fun debugging your code.
If you write a function foo () in file a.c without making it static, and a function foo () in file b.c without making it static, the linker may complain, but if it doesn't, all calls to foo () will call the same function. Expect fun debugging your code.
My favorite usage of static is being able to store methods that I wont have to Inject or create an object to use, the way I see it is, Private Static Methods are always useful, where public static you have to put some more time in thinking of what it is your doing to avoid what crazyscot defined as, getting your self too much rope and accidentally hanging ones self!
I like to keep a folder for Helper classes for most of my projects that mainly consist of static methods to do things quickly and efficiently on the fly, no objects needed!
Related
I have heard that, when you have just 1 (main.c) file (or use a "unity build"), there are benefits to be had if you make all your functions static.
I am kind of confused why this (allegedly) isn't optimized by default, since it's not probable that you will include main.c into another file where you will use one of its functions.
I would like to know the benefits and dangers of doing this before implementing it.
Example:
main.c
static int my_func(void){ /*stuff*/ }
int main(void) {
my_func();
return 0;
}
You have various chunks of wisdom in the comments, assembled here into a Community Wiki answer.
Jonathan Leffler noted:
The primary benefit of static functions is that the compiler can (and will) aggressively inline them when it knows there is no other code that can call the function. I've had error messages from four levels of inlined function calls (three qualifying “inlined from” lines) on occasion. It's staggering what a compiler will do!
and:
FWIW: my rule of thumb is that every function should be static until it is known that it will be called from code in another file. When it is known that it will be used elsewhere, it should be declared in a header file that is included both where the function is defined and where it is used. (Similar rules apply to file scope variables — aka 'global variables'; they should be static until there's a proven need for them elsewhere, and then they should be declared in a header too.)
The main() function is always called from the startup code, so it is never static. Any function defined in the same file as an unconditionally compiled main() function cannot be reused by other programs. (Library code might contain a conditionally compiled test program for the library function(s) defined in the source file — most of my library code has #ifdef TEST / …test program… / #endif at the end.)
Eirc Postpischil generalized on that:
General rule: Anytime you can write code that says the use of something is limited, do it. Value will not be modified? Make it const. Name only needs to be used in a certain section? Declare it in the innermost enclosing scope. Name does not need to be linked externally? Make it static. Every limitation both shrinks the window for a bug to be created and may remove complications that interfere with optimization.
I'm using eclipse indigo, gcc and cdt in a project. If two functions in separate source files share names (regardless of return type or parameters), eclipse flags a redefinition error. This isn't a huge issue regarding this project given I can easily rename these functions, and I'm well aware of wrappers if it were. Although this isn't a critical issue, it does make me think I'm not understanding the c build process. What occurs during the build process in which a program structure like this would cause issue?
Here's some more info. on the situation, and where my understanding is so far -- not necessary to answer the question, although there must be a hole in my understanding.
In this case, the two functions are intended to be used only locally, as such their prototypes are not given in the .h interface, and for the sake of my point, neither are defined 'static'.
Neither of these source files are being included anywhere in the project, so they shouldn't be sharing any compilation units. With that in consideration, I would have assumed that the neither source file is aware of the presence of the other, and the compiler would have no problem indexing the two functions, as the separate files would allow for proper distinguishing between the two during linking -- so long as they weren't included in the same compilation unit.
I noticed that statically defining either instance of the function declaration removes the error. I remember reading at some point that every function not declared static is global -- although given these functions are not a part of the .h interface, the practical example in which including the .h interface doesn't allow for the including program to reference all .c functions would indicate "hiding" these functions would be of no issue.
What am I overlooking?
Some insight would be greatly appreciated, thanks!
This is the concept of "linkage". Every function and variable in C has a linkage type, one of "external", "internal", and "none". (Only variables can have no linkage.)
Functions have external linkage by default, which means that they can be called by name from any compilation unit (where "compilation unit" roughly means one source file and all the headers it includes). This can be expressed explicitly by declaring them extern, or it can be overridden by declaring them static. Functions declared static have internal linkage, meaning they can be referenced by name only from other functions in the same compilation unit.
No two external functions anywhere in the same program can have the same name, regardless of header files, but static functions in different compilation units may have the same name. A static function may have the same name as an external function, too -- then the name resolves to the static function within its compilation unit, and to the external function elsewhere. These restrictions make sense, for otherwise it would be possible for a function call to be ambiguous.
Header files don't factor into the linkage equation at all. They are primarily a vehicle for sharing declarations, but a function's linkage depends only on how it is declared, not on where.
I leave discussion of variables' linkage for another time.
It doesn't matter whether one source module includes headers for another. Header files only contain declarations for the purpose of local functions being able to find functions in other modules. It doesn't mean that functions not declared don't exist from the perspective of that module.
When everything gets linked together, anything not specifically defined to be local to one source module (i.e. static) has to have a unique name across all linked components.
remember reading at some point that every function not declared static is global
Having understood this you got the main point and reason for the behaviour observed.
.h files are not known to the linker, after pre-processing there are only translation units left (typically a .c file with all includes merged in), from which .o files are compiled.
There are no interfaces on language level in C.
Neither of these source files are being included anywhere in the project, so they shouldn't be sharing any compilation units.
Declare those functions as static. This is the only way to "hide" a function from the linker "inside" a translation unit.
C doesn't "mangle" function names the way C++ or Java do (since C doesn't support function polymorphism).
For example, in C++, the functions
void foo( void );
void foo( int x );
void foo( int x, double y );
have their names "mangled" into the unique symbols1
_Z3fooid
_Z3fooi
_Z3foov
which is how overloaded function/method calls are disambiguated at the machine level.
C doesn't do that; instead, the linker sees two different function definitions using the same symbol and yaks because it has no way to disambiguate the two.
1. This is what happens on my system, anyway
I am writing a C (shared) library. It started out as a single translation unit, in which I could define a couple of static global variables, to be hidden from external modules.
Now that the library has grown, I want to break the module into a couple of smaller source files. The problem is that now I have two options for the mentioned globals:
Have private copies at each source file and somehow sync their values via function calls - this will get very ugly very fast.
Remove the static definition, so the variables are shared across all translation units using extern - but now application code that is linked against the library can access these globals, if the required declaration is made there.
So, is there a neat way for making private global variable shared across multiple, specific translation units?
You want the visibility attribute extension of GCC.
Practically, something like:
#define MODULE_VISIBILITY __attribute__ ((visibility ("hidden")))
#define PUBLIC_VISIBILITY __attribute__ ((visibility ("default")))
(You probably want to #ifdef the above macros, using some configuration tricks à la autoconfand other autotools; on other systems you would just have empty definitions like #define PUBLIC_VISIBILITY /*empty*/ etc...)
Then, declare a variable:
int module_var MODULE_VISIBILITY;
or a function
void module_function (int) MODULE_VISIBILITY;
Then you can use module_var or call module_function inside your shared library, but not outside.
See also the -fvisibility code generation option of GCC.
BTW, you could also compile your whole library with -Dsomeglobal=alongname3419a6 and use someglobal as usual; to really find it your user would need to pass the same preprocessor definition to the compiler, and you can make the name alongname3419a6 random and improbable enough to make the collision improbable.
PS. This visibility is specific to GCC (and probably to ELF shared libraries such as those on Linux). It won't work without GCC or outside of shared libraries.... so is quite Linux specific (even if some few other systems, perhaps Solaris with GCC, have it). Probably some other compilers (clang from LLVM) might support also that on Linux for shared libraries (not static ones). Actually, the real hiding (to the several compilation units of a single shared library) is done mostly by the linker (because the ELF shared libraries permit that).
The easiest ("old-school") solution is to simply not declare the variable in the intended public header.
Split your libraries header into "header.h" and "header-internal.h", and declare internal stuff in the latter one.
Of course, you should also take care to protect your library-global variable's name so that it doesn't collide with user code; presumably you already have a prefix that you use for the functions for this purpose.
You can also wrap the variable(s) in a struct, to make it cleaner since then only one actual symbol is globally visible.
You can obfuscate things with disguised structs, if you really want to hide the information as best as possible. e.g. in a header file,
struct data_s {
void *v;
};
And somewhere in your source:
struct data_s data;
struct gbs {
// declare all your globals here
} gbss;
and then:
data.v = &gbss;
You can then access all the globals via: ((struct gbs *)data.v)->
I know that this will not be what you literally intended, but you can leave the global variables static and divide them into multiple source files.
Copy the functions that write to the corresponding static variable in the same source file also declared static.
Declare functions that read the static variable so that external source files of the same module can read it's value.
In a way making it less global. If possible, best logic for breaking big files into smaller ones, is to make that decision based on the data.
If it is not possible to do it this way than you can bump all the global variables into one source file as static and access them from the other source files of the module by functions, making it official so if someone is manipulating your global variables at least you know how.
But then it probably is better to use #unwind's method.
In our project, we have pretty big C file of around 50K lines, written in 90's.
I wanted to split the file based on the functionality. But, all the functions in this file are declared as static. So, file scoped. If I split the file, then the function in file1 cannot call function in file2 and vice-versa.
But, My TL feels like that there could be memory optimization by using static functions.
I wrote some sample code to see if the stacks are different for different threads.
It seemed like it was. Could someone please enlighten me the difference between static function and a normal one other an file scope?
In C, while defining a function, the static keyword has the following 2 major consequences :
Prevents the function name from being exported (i.e. function does NOT have external linkage). Thus, preventing linkage / direct calls from other parts of the code.
As the function is clearly marked private to the file, the compiler is in a better position to generate a complete call-graph for the function. This may result in the compiler deciding to automatically in-line the function for better performance.
All functions are implicitly declared as extern, which means they're visible across translation units. But when we use static it restricts visibility of the function to the translation unit in which it's defined. So we can say Functions that are visible only to other functions in the same file are known as static functions.
The most important difference is you cannot call the static function in any other files. i think so ,yeah?
So I'm working on a "quick and dirty" profiler for firmware- I just need to know how long some functions take. Merely printing the time it takes every time will skew the results, as logging is expensive- so I am saving a bunch of results to an array and dumping that after some time.
When working in one compilation unit (one source file), I just had a bunch of static arrays storing the results. Now I need to do this across several files. I could "copy paste" the code, but that would be just ugly (Bear with me). If I put timing code in a seperate compilation unit, make static variables, and provide accessor functions in the header file, I will be incurring the overhead of function calls every time i want to access those static variables.
Is it possible to access static variables of a compilation unit directly?
I've always tried to encapsulate data, and not use global variables, but this situation calls for it simply due to speed concerns.
I hope this makes sense! Thank you!
EDIT: Alright, so it appears what I'm asking is impossible- do any of you see alternatives that essentially allow me to directly access data of another compilation unit?
EDIT2: Thank you for the answers Pablo and Jonathan. I ended up accepting Pablo's because I didn't have clear place to get the pointer to the static data (as per Jonathan) in my situation. Thanks again!
No, it's not possible to access static variables of a compilation unit from another one. static keyword precisely prevents that from happening.
If you need to access globals of one compilation unit from another, you can do:
file1.c:
int var_from_file1 = 10;
file2.c:
extern int var_from_file1;
// you can access var_from_file1 here
If you can remove the static keyword from your declarations, you should be fine. I understand that changing existing source code is not always an option (I.E. dealing with existing legacy compiled code).
To get at the static variables in a compilation unit C1 from another unit C2, some function in C1 must make pointers to the variables available to C2, or some non-static variable must contain a pointer to the static variables.
So, you could package the 'static variables' into a single structure, and then write a function that returns a pointer to that structure; you can call that function to gain access to the static variables.
Similar rules apply to static functions; if some function (or non-static variable) in the file makes the pointers to the functions available, then the static functions can be called indirectly from outside the file.
If access via pointers doesn't count as directly, then you are snookered; static hides and you can't unhide except by removing the keyword static from the variables when the module is compiled - maybe via the C preprocessor. Beware name clashes.