The question was about plain C functions, not C++ static methods, as clarified in comments.
I understand what a static variable is, but what is a static function?
And why is it that if I declare a function, let's say void print_matrix, in let's say a.c (WITHOUT a.h) and include "a.c" - I get "print_matrix##....) already defined in a.obj", BUT if I declare it as static void print_matrix then it compiles?
UPDATE Just to clear things up - I know that including .c is bad, as many of you pointed out. I just do it to temporarily clear space in main.c until I have a better idea of how to group all those functions into proper .h and .c files. Just a temporary, quick solution.
static functions are functions that are only visible to other functions in the same file (more precisely the same translation unit).
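A minimal sketch of that rule, assuming a file called shapes.c (the file name and the functions square and sum_of_squares are made up for illustration, they are not from the question). The helper is static, so it can only be called by name from this translation unit, while the non-static function stays callable from other files:

```c
/* shapes.c (hypothetical file name) */

/* Internal linkage: callable by name only from this translation unit. */
static int square(int x)
{
    return x * x;
}

/* External linkage (the default): other files may declare and call it. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

Another file could declare int sum_of_squares(int, int); and call it. If that file declared and called square instead, the link would fail with an unresolved symbol, because square is not exported from shapes.c.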
EDIT: For those who thought that the author of the question meant a 'class method': as the question is tagged C, he means a plain old C function. For (C++/Java/...) class methods, static means that the method can be called on the class itself, with no instance of that class necessary.
There is a big difference between static functions in C and static member functions in C++. In C, a static function is not visible outside of its translation unit, which is the source file (plus everything it includes) that gets compiled into one object file. In other words, making a function static limits where it can be referred to by name. You can think of a static function as being "private" to its *.c file (although that is not strictly correct).
In C++, "static" can also apply to member functions and data members of classes. A static data member is also called a "class variable", while a non-static data member is an "instance variable". This is Smalltalk terminology. This means that there is only one copy of a static data member shared by all objects of a class, while each object has its own copy of a non-static data member. So a static data member is essentially a global variable, that is a member of a class.
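Non-static member functions can access all data members of the class: static and non-static. Static member functions have no this pointer, so they can directly access only the static data members (they can still reach non-static members through an object that is passed to them).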
One way to think about this is that in C++ static data members and static member functions do not belong to any object, but to the entire class.
Minimal runnable multi-file scope example
Here I illustrate how static affects the scope of function definitions across multiple files.
a.c
#include <stdio.h>
/* Undefined behavior: already defined in main.
* Binutils 2.24 gives an error and refuses to link.
* https://stackoverflow.com/questions/27667277/why-does-borland-compile-with-multiple-definitions-of-same-object-in-different-c
*/
/*void f() { puts("a f"); }*/
/* OK: only declared, not defined. Will use the one in main. */
void f(void);
/* OK: only visible to this file. */
static void sf() { puts("a sf"); }
void a() {
    f();
    sf();
}
main.c
#include <stdio.h>
void a(void);
void f() { puts("main f"); }
static void sf() { puts("main sf"); }
void m() {
    f();
    sf();
}
int main() {
    m();
    a();
    return 0;
}
Compile and run:
gcc -c a.c -o a.o
gcc -c main.c -o main.o
gcc -o main main.o a.o
./main
Output:
main f
main sf
main f
a sf
Interpretation
there are two separate functions sf, one for each file
there is a single shared function f
As usual, the smaller the scope, the better, so always declare functions static if you can.
In C programming, files are often used to represent "classes", and static functions represent "private" methods of the class.
A common C pattern is to pass a this struct around as the first "method" argument, which is basically what C++ does under the hood.
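Here is a rough sketch of that "file as class" pattern. Everything in it (struct counter, clamp, counter_init, counter_add) is a hypothetical name chosen for the example. The static helper plays the role of a private method, and the object is passed explicitly as the first argument:

```c
/* counter.c (hypothetical): a header would expose only the struct
 * and the two "public" functions below. */
struct counter {
    int value;
    int limit;
};

/* "Private method": static, so invisible to other translation units. */
static int clamp(int v, int limit)
{
    return v > limit ? limit : v;
}

/* "Public methods": self plays the role of C++'s implicit this. */
void counter_init(struct counter *self, int limit)
{
    self->value = 0;
    self->limit = limit;
}

int counter_add(struct counter *self, int n)
{
    self->value = clamp(self->value + n, self->limit);
    return self->value;
}
```

A caller would write struct counter c; counter_init(&c, 10); counter_add(&c, 4); and never see clamp at all.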
What standards say about it
C99 N1256 draft 6.7.1 "Storage-class specifiers" says that static is a "storage-class specifier".
6.2.2/3 "Linkages of identifiers" says static implies internal linkage:
If the declaration of a file scope identifier for an object or a function contains the storage-class specifier static, the identifier has internal linkage.
and 6.2.2/2 says that internal linkage behaves like in our example:
In the set of translation units and libraries that constitutes an entire program, each declaration of a particular identifier with external linkage denotes the same object or function. Within one translation unit, each declaration of an identifier with internal linkage denotes the same object or function.
where "translation unit" is a source file after preprocessing.
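To illustrate the "within one translation unit" part of 6.2.2/2, here is a small sketch (twice and quadruple are made-up names). The static declaration near the top and the static definition at the bottom denote the same function, so code in between can call it:

```c
/* First declaration: static gives the identifier internal linkage. */
static int twice(int x);

/* Uses the declaration above; the definition comes later in the file. */
int quadruple(int x)
{
    return twice(twice(x));
}

/* Definition: denotes the same function as the earlier declaration. */
static int twice(int x)
{
    return 2 * x;
}
```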
How GCC implements it for ELF (Linux)?
With the STB_LOCAL binding.
If we compile:
int f() { return 0; }
static int sf() { return 0; }
and dump the symbol table with:
readelf -s main.o
the output contains:
Num: Value Size Type Bind Vis Ndx Name
5: 000000000000000b 11 FUNC LOCAL DEFAULT 1 sf
9: 0000000000000000 11 FUNC GLOBAL DEFAULT 1 f
so the binding is the only significant difference between them. Value is just their offset into the .text section (where the code of functions goes), so we expect it to differ.
STB_LOCAL is documented on the ELF spec at http://www.sco.com/developers/gabi/2003-12-17/ch4.symtab.html:
STB_LOCAL Local symbols are not visible outside the object file containing their definition. Local symbols of the same name may exist in multiple files without interfering with each other
which makes it a perfect choice to represent static.
Functions without static are STB_GLOBAL, and the spec says:
When the link editor combines several relocatable object files, it does not allow multiple definitions of STB_GLOBAL symbols with the same name.
which is coherent with the link errors on multiple non static definitions.
If we crank up the optimization with -O3, the sf symbol is removed entirely from the symbol table: it cannot be used from outside anyway. TODO why keep static functions in the symbol table at all when there is no optimization? Can they be used for anything?
See also
Same for variables: https://stackoverflow.com/a/14339047/895245
extern is the opposite of static, and functions are already extern by default: How do I use extern to share variables between source files?
C++ anonymous namespaces
In C++, you might want to use anonymous namespaces instead of static, which achieves a similar effect, but further hides type definitions: Unnamed/anonymous namespaces vs. static functions
The following is about plain C functions - in a C++ class the modifier 'static' has another meaning.
If you have just one file, this modifier makes absolutely no difference. The difference comes in bigger projects with multiple files:
In C, every "module" (a combination of sample.c and sample.h) is compiled independently, and afterwards all of the compiled object files (sample.o) are linked together into an executable file by the linker.
Let's say your project has several modules and two of them have a function that is only used internally for convenience, called add(int a, int b). The compiler would easily create object files for those two modules, but the linker will report an error, because it finds two functions with the same name and does not know which one it should use (even though there is nothing to link, because they aren't used anywhere else but in their own file).
This is why you make this function, which is only used internally, a static function. In this case the compiler does not create the typical "you can link this thing" flag for the linker, so the linker does not see this function and will not generate an error.
static function definitions will mark this symbol as internal. So it will not be visible for linking from outside, but only to functions in the same compilation unit, usually the same file.
First: It's generally a bad idea to include a .cpp file in another file - it leads to problems like this :-) The normal way is to create separate compilation units, and add a header file for the included file.
Secondly:
C++ has some confusing terminology here - I didn't know about it until pointed out in comments.
a) static functions - inherited from C, and what you are talking about here. Outside any class. A static function means that it isn't visible outside the current compilation unit - so in your case a.obj has a copy and your other code has an independent copy. (Bloating the final executable with multiple copies of the code).
b) static member function - what Object Orientation terms a static method. Lives inside a class. You call this with the class rather than through an object instance.
These two different static function definitions are completely different. Be careful - here be dragons.
"What is a “static” function in C?"
Let's start at the beginning.
It's all based upon a thing called "linkage":
"An identifier declared in different scopes or in the same scope more than once can be made to refer to the same object or function by a process called linkage. There are three kinds of linkage: external, internal, and none."
Source: C18, 6.2.2/1
"In the set of translation units and libraries that constitutes an entire program, each declaration of a particular identifier with external linkage denotes the same object or function. Within one translation unit, each declaration of an identifier with internal linkage denotes the same object or function. Each declaration of an identifier with no linkage denotes a unique entity."
Source: C18, 6.2.2/2
If a function is defined without a storage-class specifier, the function has external linkage by default:
"If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern."
Source: C18, 6.2.2/5
That means that, if your program consists of several translation units/source files (.c or .cpp), the function is visible in all of them.
This can be a problem in some cases. What if you want to use, for example, two different functions (definitions) with the same function name in two different contexts (in practice, in two different files)?
In C and C++, the static storage-class specifier applied to a function at file scope (not a static member function of a class in C++ or a function within another block) comes to help here: it signifies that the respective function is only visible inside the translation unit/source file it was defined in and not in the other translation units/files.
"If the declaration of a file scope identifier for an object or a function contains the storage-class specifier static, the identifier has internal linkage."
Footnote 30 to that sentence: "A function declaration can contain the storage-class specifier static only if it is at file scope; see 6.7.1."
Source: C18, 6.2.2/3
Thus, a static function only makes sense if:
Your program consists of several translation units/source files (.c or .cpp).
and
You want to limit the scope of a function to the file in which that function is defined.
If these requirements are not both met, you don't need to worry about qualifying a function as static.
Side Notes:
As already mentioned, a static function behaves exactly the same in C and C++, as this is a feature C++ inherited from C.
It does not matter that in the C++ community there is a heated debate about the deprecation of qualifying functions as static in favor of unnamed namespaces. The debate was started by a misplaced paragraph in the C++03 standard that declared the use of static functions deprecated; this was soon revised by the committee itself and removed in C++11.
This was subject to various SO questions:
Unnamed/anonymous namespaces vs. static functions
Superiority of unnamed namespace over static?
Why an unnamed namespace is a "superior" alternative to static?
Deprecation of the static keyword... no more?
In fact, it is not deprecated per the C++ standard. Thus, the use of static functions is still legitimate. Even if unnamed namespaces have advantages, the discussion about using or not using static functions in C++ is a matter of opinion and with that not suitable for this website.
In C++ (as opposed to plain C), a static member function is one that can be called on the class itself, as opposed to an instance of the class.
For example a non-static would be:
Person* tom = new Person();
tom->setName("Tom");
This method works on an instance of the class, not the class itself. However you can have a static method that can work without having an instance. This is sometimes used in the Factory pattern:
Person* tom = Person::createNewPerson();
Minor nit: static functions are visible to a translation unit, which for most practical cases is the file the function is defined in. The error you are getting is commonly referred to as violation of the One Definition Rule.
The standard probably says something like:
"Every program shall contain exactly one definition of every noninline
function or object that is used in that program; no diagnostic
required."
That is the C way of looking at static functions. In C++ this use was at one point discouraged in favor of unnamed namespaces, but it is not deprecated in current C++ standards.
In C++, additionally, you can declare member functions static. These are mostly metafunctions i.e. they do not describe/modify a particular object's behavior/state but act on the whole class itself. Also, this means that you do not need to create an object to call a static member function. Further, this also means, you only get access to static member variables from within such a function.
I'd add to Parrot's example the Singleton pattern which is based on this sort of a static member function to get/use a single object throughout the lifetime of a program.
The answer to static function depends on the language:
1) In languages without object-oriented features, like C, it means that the function is accessible only within the file where it's defined.
2) In object-oriented languages like C++, it means that the function can be called directly on the class without creating an instance of it.
A static function is only visible in its own file.
Because of that, the compiler can actually do some optimization for you if you declare a function static.
Here is a simple example.
main.c
#include <stdio.h>
static void test()
{
ghost(); // This function is not defined anywhere.
}
int main()
{
int ret = 0;
#ifdef TEST
#else
test();
#endif
return (ret);
}
And compile with
gcc -o main main.c
You will see it fail, because ghost() is never implemented.
But what if we use the following command?
gcc -DTEST -O2 -o main main.c
It succeeds, and the program can be executed normally.
Why? There are 3 key points.
-O2 : Compiler optimization level at least 2.
-DTEST : Define TEST, so test() will not be called.
test() is declared static.
Only if all 3 of these conditions are true does the build succeed.
Because of the static declaration, the compiler knows that test() can NEVER be called from another file, so it can remove test() when compiling. Since test() is not needed, it does not matter whether ghost() is defined or implemented. Note that this relies on the compiler accepting the implicit declaration of ghost(); newer compilers (GCC 14, for example) reject it by default.
Related
void test(void){
//
}
void test(void); // <-- legal
int main(){
test();
int i = 5;
// int i; <-- not legal
return 0;
}
I understand that functions can have multiple declarations but only 1 definition,
but in my example the declaration is coming after the definition. Why would this be useful? Same cannot be done with block scoped variables.
I found this post which explains the behaviour in C++, not sure if the same applies to C:
Is a class declaration allowed after a class definition?
The underlying reason has to do with the way programs are typically compiled and linked on systems on which C is the "natural language", and the origin of the C language. The following describes conceptually how a program is generated from a collection of source files with static linking.
A program (which may or may not be written in C) consists of separate units — the C term is "translation units", which are source files — which are compiled or assembled to object files.
As a very rough picture such object files expose data objects (global variables) and executable code snippets (functions), and they are able to use such entities defined in other translation units. For the CPU, both are simply addresses. These entities have names or labels called "symbols" (function names, variable names) which an object file declares as "needed" (defined elsewhere) or "exported" (provided for use elsewhere).
On the C source code level the names of objects that are used here but defined elsewhere are made known to the compiler by "extern" declarations; this is true for functions and variables alike. The compiler conceptually generates "placeholder addresses" whenever such an object is accessed. It "publishes" the needed symbols in the object file, and the linker later replaces the symbolic placeholders with the "real" addresses of objects and executable code snippets when it creates an executable.
It does not hurt to declare the use of an external object or function multiple times. No code is generated anyway. But the definition, where actual memory is reserved for an object or executable code, can in general only occur once in a program, because it would otherwise be a duplicate code or object and create an ambiguity. Local variables don't have declarations like global variables; there is no need to declare their use far away from their definition. Their declaration is always also a definition, as in your example, therefore can only occur once in a given scope. That is not different for global variable definitions (as opposed to extern declarations) which can only occur once in the global scope.
Let's say you have these files:
// foo.h
#pragma once
void foo();
// helpers.h
#pragma once
#include "foo.h"
// ...
void bar();
// foo.c
void foo() {
// ...
}
#include "helpers.h"
// ...
Here, there is a declaration of foo after it's fully defined. Should this not compile? I think it's totally reasonable to expect #include directives to not have such effects.
I understand that functions can have multiple declarations but only 1 definition, but in my example the declaration is coming after the definition.
So?
Why would this be useful?
At minimum, it is useful for simplifying the definition of the language. Given that functions may be declared multiple times in the same scope, what purpose would be served by requiring the definition, if any, to be the last one? If multiple declaration is to be allowed at all -- and there is good reason for this -- then it is easier all around to avoid unnecessary constraints on their placement.
Same cannot be done with block scoped variables.
That's true, but for a different reason than you may suppose: every block-scope variable declaration is a definition, so multiple declarations in the same scope result in multiple definitions in the same scope, in violation of the one-definition rule.
A better comparison would be with file-scope variable declarations, which can be duplicated, in any order relative to a single definition, if present.
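As a small sketch of that comparison (total and bump are hypothetical names), both the file-scope variable and the function below are declared before and after their single definition, and this compiles without complaint:

```c
/* File-scope declarations may be repeated around one definition. */
extern int total;   /* declaration only */
int total = 3;      /* the single definition */
extern int total;   /* redeclaration after the definition: still fine */

int bump(void);     /* declaration */

int bump(void)      /* the single definition */
{
    /* At block scope, "int i = 5; int i;" would be rejected, because
     * each of those declarations is also a definition. */
    return ++total;
}

int bump(void);     /* redeclaration after the definition: legal */
```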
I am trying to implement a global singleton variable in the header-only library in C (not C++). So after searching on this forum and elsewhere, I came across a variation of Meyer's singleton that I am adapting to C here:
/* File: sing.h */
#ifndef SING_H
#define SING_H
inline int * singleton()
{
static int foo = 0;
return &foo;
}
#endif
Notice that I am returning a pointer because C lacks & referencing available in C++, so I must work around it.
OK, now I want to test it, so here is a simple test code:
/* File: side.h */
#ifndef SIDE_H
#define SIDE_H
void side();
#endif
/*File: side.c*/
#include "sing.h"
#include <stdio.h>
void side()
{
printf("%d\n",*(singleton()));
}
/*File: main.c*/
#include "sing.h"
#include "side.h"
#include <stdio.h>
int main(int argc, char * argv[])
{
/* Output default value - expected output: 0 */
printf("%d\n",*(singleton()));
*(singleton()) = 5;
/* Output modified value - expected output: 5 */
printf("%d\n",*(singleton()));
/* Output the same value from another module - expected output: 5*/
side();
return 0;
}
Compiles and runs fine in MSVC in C mode (also in C++ mode too, but that's not the topic). However, in gcc it outputs two warnings (warning: ‘foo’ is static but declared in inline function ‘singleton’ which is not static), and produces an executable which then segfaults when I attempt to run it. The warning itself kind of makes sense to me (in fact, I am surprised I don't get it in MSVC), but segfault kind of hints at the possibility that gcc never compiles foo as a static variable, making it a local variable in stack and then returns expired stack address of that variable.
I tried declaring the singleton as extern inline, it compiles and runs fine in MSVC, results in linker error in gcc (again, I don't complain about linker error, it is logical).
I also tried static inline (compiles fine in both MSVC and gcc, but predictably runs with wrong output in the third line because the side.c translation unit now has its own copy of singleton).
So, what am I doing wrong in gcc? I have neither of these problems in C++, but I can't use C++ in this case, it must be straight C solution.
I could also accept any other form of singleton implementation that works from header-only library in straight C in both gcc and MSVC.
I am trying to implement a global singleton variable in the header-only library in C (not C++).
By "global", I take you to mean "having static storage duration and external linkage". At least, that's as close as C can come. That is also as close as C can come to a "singleton" of a built-in type, so in that sense, the term "global singleton" is redundant.
Notice that I am returning a pointer because C lacks & referencing available in C++, so I must work around it.
It is correct that C does not have references, but you would not need either pointer or reference if you were not using a function to wrap access to the object. I'm not really seeing what you are trying to gain by that. You would likely find it easier to get what you are looking for without. For example, when faced with duplicate external definitions of the same variable identifier, the default behavior of all but the most recent versions of GCC was to merge them into a single variable. Although current GCC reports this situation as an error, the old behavior is still available by turning on a command-line switch.
On the other hand, your inline function approach is unlikely to work in many C implementations. Note especially that inline semantics are rather different in C than in C++, and external inline functions in particular are rarely useful in C. Consider these provisions of the C standard:
paragraph 6.7.4/3 (a language constraint):
An inline definition of a function with external linkage shall not contain a definition of a modifiable object with static or thread storage duration, and shall not contain a reference to an identifier with internal linkage.
Your example code is therefore non-conforming, and conforming compilers are required to diagnose it. They may accept your code nonetheless, but they may do anything they choose with it. It seems unreasonably hopeful to expect that you could rely on a random conforming C implementation to both accept your code for the function and compile it such that callers in different translation units could obtain pointers to the same object by calling that function.
paragraph 6.9/5:
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression [...], somewhere in the entire program there shall be exactly one external definition for the identifier [...].
Note here that although an inline definition of a function identifier with external linkage -- such as yours -- provides an external declaration of that identifier, it does not provide an external definition of it. This means that a separate external definition is required somewhere in the program (unless the function goes altogether unused). Moreover, that external definition cannot be in a translation unit that includes the inline definition. This is large among the reasons that extern inline functions are rarely useful in C.
paragraph 6.7.4/7:
For a function with external linkage, the following restrictions apply: [...] If all of the file scope declarations for a function in a translation unit include the inline function specifier without extern, then the definition in that translation unit is an inline definition. An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.
In addition to echoing part of 6.9/5, that also warns you that if you do provide an external definition of your function to go with the inline definitions, you cannot be sure which will be used to serve any particular call.
Furthermore, you cannot work around those issues by declaring the function with internal linkage, for although that would allow you to declare a static variable within, each definition of the function would be a different function. Lest there be any doubt, Footnote 140 clarifies that in that case,
Since an inline definition is distinct from the corresponding external definition and from any other corresponding inline definitions in other translation units, all corresponding objects with static storage duration are also distinct in each of the definitions.
(Emphasis added.)
So again, the approach presented in your example cannot be relied upon to work in C, though you might find that in practice, it does work with certain compilers.
If you need this to be a header-only library, then you can achieve it in a portable manner by placing an extra requirement on your users: exactly one translation unit in any program using your header library must define a special macro before including the header. For example:
/* File: sing.h */
#ifndef SING_H
#define SING_H
#ifdef SING_MASTER
int singleton = 0;
#else
extern int singleton;
#endif
#endif
With that, the one translation unit that defines SING_MASTER before including sing.h (for the first time) will provide the needed definition of singleton, whereas all other translation units will have only a declaration. Moreover, the variable will be accessible directly, without either calling a function or dereferencing a pointer.
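For illustration, this is roughly what the one "master" translation unit would look like. To keep the sketch in a single file, the text of sing.h is pasted inline where #include "sing.h" would normally go, and bump_singleton is a made-up function that just shows the variable being used directly:

```c
/* main.c (hypothetical): the one unit that owns the definition. */
#define SING_MASTER   /* must come before the first include of sing.h */

/* --- contents of sing.h, inlined so the sketch is self-contained --- */
#ifndef SING_H
#define SING_H
#ifdef SING_MASTER
int singleton = 0;      /* definition: emitted in exactly one unit */
#else
extern int singleton;   /* declaration: all other units get only this */
#endif
#endif
/* --- end of sing.h --- */

/* Code after the include uses the variable without any accessor. */
int bump_singleton(void)
{
    return ++singleton;
}
```

Every other source file would include sing.h without defining SING_MASTER and therefore see only the extern declaration, which the linker resolves to the single definition above.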
I have the following files:
main.c:
#include "ext.h"
#include "main2.h"
#include <stdio.h>
int main () {
// printf("main - internal_static_variable: %d\n", internal_static_variable);
// printf("main - internal_static_variable: %d\n", internal_static_variable);
printf("main - external_variable: %d\n", external_variable);
put_static_val(24);
put_val(42);
printf("main - internal_static_variable: %d\n", get_static_val());
printf("main - internal_variable: %d\n", get_val());
++external_variable;
print();
}
main2.h:
// main 2.h
#pragma once
void print();
main2.c:
// main2.c
#include "ext.h"
#include "main2.h"
#include <stdio.h>
void print() {
printf("main2 - external_variable: %d\n", external_variable);
printf("main2 - internal_static_variable: %d\n", get_static_val());
printf("main2 - internal_variable: %d\n", get_val());
}
ext.h:
// ext.h
#pragma once
extern int external_variable;
void put_static_val(int v);
int get_static_val();
void put_val(int v);
int get_val();
ext.c:
// ext.c
#include "ext.h"
static int internal_static_variable = 0;
int internal_variable = 1;
int external_variable = 2;
void put_static_val(int v) {
internal_static_variable = v;
}
int get_static_val() {
return internal_static_variable;
}
void put_val(int v) {
internal_variable = v;
}
int get_val() {
return internal_variable;
}
When compiled and executed, the result is the following:
main - external_variable: 2
main - internal_static_variable: 24
main - internal_variable: 42
main2 - external_variable: 3
main2 - internal_static_variable: 24
main2 - internal_variable: 42
As expected, the variables not exposed in the header file (internal_static_variable and internal_variable) are not directly accessible.
What I don't get is the meaning of static. I know it limits the scope of a variable to the compilation unit, but isn't it enough not to declare a variable in the header file to hide it?
Also, I assumed that the static variable and the not-static variable would behave differently. Specifically, internal_static_variable would not be shared by the files including it (one instance for main.c and one for main2.c), but since I change its value from main.c and I get the changed valued in main2.c, there seems not to be any difference between the two.
Could you explain it, please? Thanks
Scope and Linkage
Identifiers have two properties that are relevant here: scope and linkage.
Scope is where an identifier is visible. You apparently already know that scope is limited to the file an identifier is declared in, and it may be further limited to a block or a function (or a function prototype) depending on where the identifier is declared and the keywords (such as static or extern) used when declaring it.
Linkage is a way of making different declarations of an identifier refer to the same object. There are three types of linkage: external, internal, and none.
If an identifier has internal linkage, it is not linked with identifiers in other translation units. An object called foo in one translation unit [1] cannot be accessed by name in another translation unit. [2]
If an identifier has external linkage, it can be accessed in another translation unit by declaring an identifier with the same name and also with external linkage. When the program is linked together, identifiers with external linkage are resolved by the linker so that they refer to the same storage.
Problems With External Linkage
You can omit static and leave your identifiers with external linkage. As long as you are the only person writing your program, you can avoid problems. But this is not tidy; it leaves some things dangling, which can cause problems.
If you are writing routines to be used in other programs, leaving private identifiers with external linkage can be a problem, especially if they have simple, common names. A person who is using your routines in their own code might use the same name coincidentally, and then your two identifiers would be linked to the same object even though you need them to be different.
This can also occur intentionally. If you write a popular software package and leave private names with external linkage, some users of the package may explore what names are present and try to use them. This can result in people creating software which makes use of things in your software that were supposed to be private. Then you cannot develop new versions of the software that change the private parts without breaking existing software. That becomes a business problem. You may need to implement new algorithms inside the software package, but you do not want to break the existing source code of your customers. Declaring the names with static originally could avoid that.
How Declarations Affect Linkage
When an identifier is declared with static at file scope, it has internal linkage. Beyond that, the rules for which linkage an identifier has are a bit complicated, due in part to history of how the C language developed:
Declaring an identifier with extern gives it external linkage if no prior declaration is visible.
If there is a visible prior declaration, extern leaves the identifier with the same linkage as in the previous declaration.
A declaration of a function or an object at file scope without extern or static gives the identifier external linkage.
A declaration of an object at block scope without extern has no linkage, even if static is used.
Function parameters have no linkage.
Identifiers of things that are not objects or functions (such as type definitions) have no linkage.
Within one translation unit, each declaration of an identifier with internal linkage denotes the same object or function. Each declaration of an identifier with no linkage denotes a unique entity. (This paragraph is a direct quote from C 2011 [N1570] 6.2.2, and the other information in this answer comes from there too.)
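As a rough illustration of the rules above, the following single-file sketch annotates each declaration with the linkage it would get. All identifiers (counter, total, limit, bump, bump_twice) are made up for this example.

```c
/* Hypothetical identifiers, annotated with the linkage each one gets. */

static int counter;   /* file scope with static: internal linkage */
int total;            /* file scope, no specifier: external linkage */
extern int counter;   /* prior declaration is visible: stays internal */
extern int limit;     /* no prior declaration: external linkage
                         (declared only; never used or defined here) */

static int bump(void) /* file scope with static: internal linkage */
{
    static int calls; /* block scope: no linkage, even though static */
    int step = 1;     /* block scope: no linkage */

    calls += step;
    counter += step;
    return calls;
}

int bump_twice(void)  /* file scope, no specifier: external linkage */
{
    bump();
    return bump();
}
```

Only total and bump_twice (and limit, if it were defined somewhere) would be reachable by name from another translation unit.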
Footnotes
1 A translation unit is the combined source code resulting from all #include directives. I use the technical term “translation unit” rather than “source file” because an object called foo in one source file could be accessed in another source file by using the #include directive.
2 An object with internal linkage can still be accessed in another translation unit by using a pointer, if you pass its address from one function to another.
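Footnote 2 could be sketched like this. The names secret and secret_address are hypothetical; the point is only that internal linkage hides the name, not the storage.

```c
/* Hypothetical example: secret cannot be named from another
   translation unit, but its address can still be handed out. */
static int secret = 7;

/* External linkage: other translation units may call this and
   then read or write secret through the returned pointer. */
int *secret_address(void)
{
    return &secret;
}
```

A caller in another file that declares int *secret_address(void); could modify secret through the pointer without ever using its name.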
If you define a non-static global variable, it's still global. Even if it's not declared in a header file, it can still be declared in another translation unit.
When a variable has external linkage (the default at file scope), the object file generated by this compilation unit will carry a named reference to its location. Whenever another object file is linked with the first and refers to the same named variable but does not provide its own definition, the linker will replace all instances of its use of that variable with its location. During execution, the CPU deals with memory locations, not variable names. This is why omitting the variable from the header does not matter; global references are resolved only later, when you link the object files created from your .c source files.
Static (outside of functions) is useful in that a single library or program can contain several file-scope variables with the same name, each private to its own compilation unit. This prevents name collisions between modules that happen to use the same variable name for different purposes, each sensible in its own context. As long as the variable is only needed in the current compilation unit, you should make it static.
What I don't get is the meaning of static. I know it limits the scope
of a variable to the compilation unit, but isn't it enough not to
declare a variable in the header file to hide it?
That would not prevent the variable from being declared elsewhere and therefore becoming accessible. It is the difference between security and obscurity. Declaring it static means it cannot be accessed externally by name; simply not declaring it in a header only prevents access by those who do not know its name and data type. A more likely scenario is that your object code or library is used elsewhere and you get an accidental name clash; such bugs are often difficult to fathom.
I assumed that the static variable and the not-static variable would
behave differently. Specifically, internal_static_variable would
not be shared by the files including it (one instance for main.c and one for main2.c), but since I change its value from main.c and
I get the changed valued in main2.c, there seems not to be any
difference between the two.
Your code does not modify internal_static_variable in main.c; it modifies it only in ext.c. ext.c happens to expose internal_static_variable through an accessor function, which in your example provides minimal protection, but as a single point of write access, provides a number of advantages over direct access to the variable, such as:
It is possible to include code in the accessor to handle invalid input, for example by asserting, returning an error value, aborting, ignoring the value and not modifying the variable, or coercing it to a valid value. Such code might also be conditionally compiled so that it only performs checking in a debug build.
The accessor function provides a single point in the code to place a debugger breakpoint to trap all write accesses.
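A minimal sketch of such an accessor pair, assuming a valid range of 0 to 100 purely for illustration. The variable name internal_static_variable is from the question; the function names and the range are hypothetical.

```c
#include <stdbool.h>

/* Internal linkage: not reachable by name from other files. */
static int internal_static_variable = 0;

/* Hypothetical setter: the single point of write access.
   Rejects values outside the assumed range [0, 100] and
   leaves the variable unchanged in that case. */
bool set_internal_static_variable(int value)
{
    if (value < 0 || value > 100) {
        return false;
    }
    internal_static_variable = value; /* breakpoint here traps all writes */
    return true;
}

/* Hypothetical getter: read-only access for other files. */
int get_internal_static_variable(void)
{
    return internal_static_variable;
}
```

Other files would declare only the two functions (typically in a header) and never the variable itself.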
If a function declaration isn't in a header file (.h), but is instead only in a source file (.c), why would you need to use the static keyword? Surely, if you only declare it in a .c file, it isn't seen by other files, as you're not supposed to #include .c files, right?
I have already read quite a few questions and answers about this (eg. here and here), but can't quite get my head around it.
What static does is make it impossible to declare and call a function in other modules, whether through a header file or not.
Recall that header file inclusion in C is just textual substitution:
// bar.c
#include "header.h"
int bar()
{
    return foo() + foo();
}
// header.h
int foo(void);
gets preprocessed to become
int foo(void);
int bar()
{
    return foo() + foo();
}
In fact, you can do away with header.h and just write bar.c this way in the first place. Similarly, the definition for foo does not need to include the header in either case; including it just adds a check that the definition and declaration for foo are consistent.
But if you were to change the implementation of foo to
static int foo()
{
    // whatever
    return 42;
}
then the declaration of foo would cease to work, in modules and in header files (since header files just get substituted into modules). Or actually, the declaration still "works", but it stops referring to your foo function and the linker will complain about that when you try to call foo.
The main reason to use static is to prevent linker clashes: even if foo and bar were in the same module and nothing outside the module called foo, if it weren't static, it would still clash with any other non-static function called foo. The second reason is optimization: when a function is static, the compiler knows exactly which parts of the program call it and with what arguments, so it can perform constant-folding, dead code elimination and/or inlining.
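To make the optimization point concrete, here is a small sketch with hypothetical names (square, tile_area). Whether a given compiler actually inlines or folds the call depends on the compiler and its optimization flags; the code only shows the situation in which it is allowed to.

```c
/* Internal linkage: every caller of square is in this file,
   so the compiler can see all call sites and their arguments. */
static int square(int x)
{
    return x * x;
}

/* The only call passes a constant, so an optimizing compiler
   may replace the call with the constant result and drop the
   separate copy of square entirely. */
int tile_area(void)
{
    return square(4);
}
```

If square had external linkage, the compiler would normally still have to emit a standalone copy of it, because some other object file might call it.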
The static keyword restricts the visibility of a function to the translation unit in which it is defined. That means that you can't locally declare the function in other units and use it, since the linker does not add it to the global symbol table. It also means that you can reuse the name in other units: you may have a static void testOutput(); in every file, which is not possible if static is omitted.
As a rule of thumb you should keep the visibility of symbols as limited as possible. So if you do not need the routine outside (and it is not part of some interface) then keep it static.
It allows you to have functions with identical names in different source files. A static function has internal linkage, so its symbol is local to its object file and the linker never matches it against symbols from other object files, thus preventing multiple-definition linkage errors.
It helps whoever maintains the code to know that the function is not exposed as part of the interface, and is used only internally within the file (a non-static function can be used in other source files even if it's not declared in any header file, using the extern keyword).
I understand what static does, but not why we use it. Is it just for keeping the abstraction layer?
There are a few reasons to use static in C.
When used with functions, yes, the intention is to create an abstraction. The technical term for a C source file together with everything it includes is "translation unit." Static functions may only be called by name from within the same translation unit. They are similar to private methods in C++, liberally interpreted (in that analogy, a translation unit defines a class).
Static data at a global level is also not accessible from outside the translation unit, and this is also used for creating an abstraction. Additionally, all data with static storage duration is initialized to zero unless an explicit initializer is given; at file scope this is true with or without the static keyword.
Static at the local (block scope) variable level is used to abstract the implementation of a function that maintains state across calls while avoiding a variable at translation unit scope. Unlike automatic variables, which are uninitialized by default, such variables are initialized to zero because they have static storage duration.
The keyword static has several uses. Outside of a function, it simply limits the visibility of a function or variable to the compilation unit (.c file) the function or variable occurs in. That way the function or variable doesn't become global. This is a good thing, it promotes a kind of "need to know" principle (don't expose things that don't need to be exposed). Static variables of this type are zero initialized, but of course global variables are also zero initialized, so the static keyword is not responsible for zero initialization per se.
Variables can also be declared static inside a function. This means the variable is not automatic, i.e. it is not allocated and freed on the stack with each invocation of the function. Instead the variable is allocated in the static data area, is initialized once (to zero unless an initializer is given) and persists for the life of the program. If the function modifies it during one invocation, the new modified value will be available at the next invocation. This sounds like a good thing, but there are good reasons "auto" is the default, and "static" variables within functions should be used sparingly. Briefly, auto variables are more memory efficient, and are essential if you want your function to be thread safe.
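A short sketch of a function-local static variable, using a hypothetical next_id function. It also shows why such functions are not thread safe: every caller shares the same id.

```c
/* Hypothetical example: returns 1, 2, 3, ... on successive calls. */
unsigned next_id(void)
{
    /* Static storage duration: zero-initialized once before the
       program starts, and it keeps its value between calls.
       The name id is still visible only inside this function. */
    static unsigned id;

    id++; /* not thread safe: all callers share this one object */
    return id;
}
```

With an ordinary (automatic) variable in place of the static one, id would be uninitialized on every call and the function could not remember anything.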
static is used as both a storage class specifier and a linkage specifier. As a linkage specifier it restricts the scope of an otherwise global variable or function to a single compilation unit. This allows, for example a compilation unit to have variables and functions with the same identifier names as other compilation units but without causing a clash, since such identifiers are 'hidden' from the linker. This is useful if you are creating a library for example and need internal 'helper' functions that must not cause a conflict with user code.
As a storage class specifier applied to a local variable, it has different semantics entirely, but your question seems to imply that you are referring to static linkage.
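For the library case mentioned above, a sketch might look like this. The file name mylib.c and the functions clamp and mylib_percent are invented for the example.

```c
/* mylib.c (hypothetical library source file) */

/* Internal helper. Because it is static, user code is free to
   define its own function called clamp without a link conflict. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) {
        return lo;
    }
    if (value > hi) {
        return hi;
    }
    return value;
}

/* Public entry point: the only name this file exports. */
int mylib_percent(int value)
{
    return clamp(value, 0, 100);
}
```

Only mylib_percent would be declared in the library's header; clamp stays an implementation detail that can change between versions.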
Static functions in C
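In C, functions are global by default. Placing the “static” keyword at the start of a function declaration makes it static. For example, the function fun() below is static.
#include <stdio.h>

static int fun(void)
{
    printf("I am a static function ");
    return 0; /* the function is declared to return int */
}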
Unlike global functions in C, access to static functions is restricted to the file where they are declared. Therefore, when we want to restrict access to functions, we make them static. Another reason for making functions static can be reuse of the same function name in other files.
For example, suppose we store the following program in one file, file1.c:
/* Inside file1.c */
#include <stdio.h>

static void fun1(void)
{
    puts("fun1 called");
}
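And store the following program in another file, file2.c:
/* Inside file2.c */
#include <stdio.h>

void fun1(void); /* declares a fun1 with external linkage; the static fun1 in file1.c does not match it */

int main(void)
{
    fun1();
    getchar();
    return 0;
}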
Now, if we compile the above code with the command gcc file2.c file1.c, we get the error undefined reference to fun1. This is because fun1 is declared static in file1.c and cannot be used in file2.c. See also the explanation here, where the code comes from.