I have seen static structure declarations quite often in a driver code I have been asked to modify.
I tried looking for information as to why structs are declared static and the motivation of doing so.
Can anyone of you please help me understand this?
The static keyword in C has several effects, depending on the context it's applied to.
when applied to a variable declared inside a function, the value of that variable will be preserved between function calls.
when applied to a variable declared outside a function, or to a function, the visibility of that variable or function is limited to the "translation unit" it's declared in - ie the file itself. For variables this boils down to a kind of "locally visible global variable".
Both usages are pretty common in relatively low-level code like drivers.
The former, and the latter when applied to variables, allow functions to retain a notion of state between calls, which can be very useful, but this can also cause all kinds of nasty problems when the code is being used in any context where it is being used concurrently, either by multiple threads or by multiple callers. If you cannot guarantee that the code will strictly be called in sequence by one "user", you can pass a kind of "context" structure that's being maintained by the caller on each call.
The latter, applied to functions, allows a programmer to make the function invisible from outside of the module, and it MAY be somewhat faster with some compilers for certain architectures because the compiler knows it doesn't have to make the variable/function available outside the module - allowing the function to be inlined for example.
Something that apparently all other answers seem to miss: static is and specifies also a storage duration for an object, along with automatic (local variables) and allocated (memory returned by malloc and friends).
Objects with static storage duration are initialized before main() starts, either with the initializer specified, or, if none was given, as if 0 had been assigned to it (for structs and arrays this goes for each member and recursively).
The second property static sets for an identifier, is its linkage, which is a concept used at link time and tells the linker which identifiers refer to the same object. The static keyword makes an identifier have internal linkage, which means it cannot refer to identifiers of the same name in another translation unit.
And to be pedantic about all the sloppy answers I've read before: a static variable can not be referenced everyhere in the file it is declared. Its scope is only from its declaration (which can be between function definitions) to the end of the source file--or even smaller, to the end of the enclosing block.
struct variable
For a struct variable like static struct S s;, this has been widely discussed at: What does "static" mean in C?
struct definition: no effect:
static struct S { int i; int j; };
is the exact same as:
struct S { int i; int j; };
so never use it. GCC 4.8 raises a warning if you do it.
This is because struct definitions have no storage, and do no generate symbols in object files like variables and functions. Just try compiling and decompiling:
struct S { int i; int j; };
int i;
with:
gcc -c main.c
nm main.o
and you will see that there is no S symbol, but there is an i symbol.
The compiler simply uses definitions to calculate the offset of fields at compile time.
This is struct definitions are usually included in headers: they won't generate multiple separate data, even if included multiple times.
The same goes for enum.
C++ struct definition: deprecated in C++11
C++11 N3337 standard draft Annex C 7.1.1:
Change: In C ++, the static or extern specifiers can only be applied to names of objects or functions
Using these specifiers with type declarations is illegal in C ++. In C, these specifiers are ignored when used
on type declarations.
See also: https://stackoverflow.com/a/31201984/895245
If you declare a variable as being static, it is visible only in that translation unit (if globally declared) or retains its value from call to call (if declared inside a function).
In your case I guess it is the first case. In that case, probably the programmer didn't want the structure to be visible from other files.
The static modifier for the struct limits the scope of visibility of the structure to the current translation unit (i.e. the file).
NOTE: This answer assumes (as other responders have indicated) that your declaration is not within a function.
Related
void test(void){
//
}
void test(void); // <-- legal
int main(){
test();
int i = 5;
// int i; <-- not legal
return 0;
}
I understand that functions can have multiple declarations but only 1 definition,
but in my example the declaration is coming after the definition. Why would this be useful? Same cannot be done with block scoped variables.
I found this post which explains the behaviour in C++, not sure if the same applies to C:
Is a class declaration allowed after a class definition?
The underlying reason has to do with the way programs are typically compiled and linked on systems on which C is the "natural language", and the origin of the C language. The following describes conceptually how a program is generated from a collection of source files with static linking.
A program (which may or may not be written in C) consists of separate units — the C term is "translation units", which are source files — which are compiled or assembled to object files.
As a very rough picture such object files expose data objects (global variables) and executable code snippets (functions), and they are able to use such entities defined in other translation units. For the CPU, both are simply addresses. These entities have names or labels called "symbols" (function names, variable names) which an object file declares as "needed" (defined elsewhere) or "exported" (provided for use elsewhere).
On the C source code level the names of objects that are used here but defined elsewhere are made known to the compiler by "extern" declarations; this is true for functions and variables alike. The compiler conceptually generates "placeholder addresses" whenever such an object is accessed. It "publishes" the needed symbols in the object file, and the linker later replaces the symbolic placeholders with the "real" addresses of objects and executable code snippets when it creates an executable.
It does not hurt to declare the use of an external object or function multiple times. No code is generated anyway. But the definition, where actual memory is reserved for an object or executable code, can in general only occur once in a program, because it would otherwise be a duplicate code or object and create an ambiguity. Local variables don't have declarations like global variables; there is no need to declare their use far away from their definition. Their declaration is always also a definition, as in your example, therefore can only occur once in a given scope. That is not different for global variable definitions (as opposed to extern declarations) which can only occur once in the global scope.
Let's say you have these files:
// foo.h
#pragma once
void foo();
// helpers.h
#pragma once
#include "foo.h"
// ...
void bar();
// foo.c
void foo() {
// ...
}
#include "helpers.h"
// ...
Here, there is a declaration of foo after it's fully defined. Should this not compile? I think it's totally reasonable to expect #include directives to not have such effects.
I understand that functions can have multiple declarations but only 1 definition, but in my example the declaration is coming after the definition.
So?
Why would this be useful?
At minimum, it is useful for simplifying the definition of the language. Given that functions may be declared multiple times in the same scope, what purpose would be served by requiring the definition, if any, to be the last one? If multiple declaration is to be allowed at all -- and there is good reason for this -- then it is easier all around to avoid unnecessary constraints on their placement.
Same cannot be done with block scoped variables.
That's true, but for a different reason than you may suppose: every block-scope variable declaration is a definition, so multiple declarations in the same scope result in multiple definitions in the same scope, in violation of the one-definition rule.
A better comparison would be with file-scope variable declarations, which can be duplicated, in any order relative to a single definition, if present.
In C, if I declare a structure like so:
static struct thing {
int number;
};
and compile it (with gcc in this case), the compiler prints this warning:
warning: 'static' ignored on this declaration
[-Wmissing-declarations]
Why is this?
My intention in making the struct static would be to keep thing out of the global namespace so that another file could declare its own thing if it wanted.
You cant define the storage without defining the actual object.
static struct thing {
int number;
}obj1,obj2;
is ok and:
struct thing {
int number;
};
static struct thing x,y;
Struct tags (and typedef names) have no linkage, which means they are not shared across translation units. You might use the term "private" to describe this. It is perfectly fine for two different units to define their own struct thing.
There would only be a problem if it was attempted to make a cross-unit call of a function with external linkage that accepts a struct thing or type derived from that. You can minimize the chance of this happening by ensuring that functions with external linkage are only called via prototypes in header files (i.e. don't use local prototypes).
You can't use static on this way to control the linkage of a type like you could for a function or object, because in C types never have linkage anyway.
"Global namespace" isn't quite the term you want here. C describes names of objects and functions as having "external linkage" if the same name can be declared in different translation units to mean the same thing (like the default for functions), "internal linkage" if the same name can be redeclared within the same translation unit to mean the same thing (like declarations marked static), or "no linkage" when a declaration names a different object or function from any other declaration (like variables defined within a function body). (A translation unit, roughly speaking, is one *.c file together with the contents of the headers it includes.) But none of this applies to types.
So if you want to use a struct type that's essentially private to one source file, just define it within that source file. Then you don't need to worry about another usage of the same name colliding with yours, unless maybe somebody adds it to a header file that the source file was including.
(And just in case a C++ user comes across this Q&A, note the rules for this in C++ are very different.)
I know when a program is run, the main() function is executed first. But when does the initialization of global variables declared outside the main() happens? I mean if I declare a variable like this:
unsigned long current_time = millis();
void main() {
while () {
//some code using the current_time global variable
}
}
Here, the exact time when the global variable initializes is important. Please tell what happens in this context.
Since you didn't define the language you're talking about, I assumed it to be C++.
In computer programming, a global variable is a variable that is accessible in every scope (unless shadowed). Interaction mechanisms with global variables are called global environment (see also global state) mechanisms. The global environment paradigm is contrasted with the local environment paradigm, where all variables are local with no shared memory (and therefore all interactions can be reconducted to message passing). Wikipedia.
In principle, a variable defined outside any function (that is, global, namespace, and class static variables) is initialized before main() is invoked. Such nonlocal variables in a translation unit are initialized in their declaration order (§10.4.9). If such a variable has no explicit initializer, it is by default initialized to the default for its type (§10.4.2). The default initializer value for built-in types and enumerations is 0. [...] There is no guaranteed order of initialization of global variables in different translation units. Consequently, it is unwise to create order dependencies between initializers of global variables in different compilation units. In addition, it is not possible to catch an exception thrown by the initializer of a global variable (§14.7). It is generally best to minimize the use of global variables and in particular to limit the use of global variables requiring complicated initialization. See.
(Quick answer: The C standard doesn't support this kind of initialization; you'll have to consult your compiler's documentation.)
Now that we know the language is C, we can see what the standard has to say about it.
C99 6.7.8 paragraph 4:
All the expressions in an initializer for an object that has static
storage duration shall be constant expressions or string literals.
And the new 2011 standard (at least the draft I has) says:
All the expressions in an initializer for an object that has static
storage duration shall be constant expressions or string literals.
So initializing a static object (e.g., a global such as your current_time) with a function call is a constraint violation. A compiler can reject it, or it can accept it with a warning and do whatever it likes if it provides an language extension.
The C standard doesn't say when the initialization occurs, because it doesn't permit that kind of initialization. Basically none of your code can execute before the main() function starts executing.
Apparently your compiler permits this as an extension (assuming you've actually compiled this code). You'll have to consult your compiler's documentation to find out what the semantics are.
(Normally main is declared as int main(void) or int main(int argc, char *argv[]) or equivalent, or in some implementation-defined manner. In many cases void main() indicates a programmer who's learned C from a poorly written book, of which there are far too many. But this applies only to hosted implementations. Freestanding implementations, typically for embedded systems, can define the program's entry point any way they like. Since you're targeting the Arduino, you're probably using a freestanding implementation, and you should declare main() however the compiler's documentation tells you to.)
It appears to me that even if I refer to a function in another file with out extern declaration, gcc can still compile that unit. So I am wondering whether the extern declaration is necessary anywhere for function? I know that you need extern for variables.
functions have extern storage class specifier by default (unless they are explicitly defined as static)
extern Storage Class Specifier
If the declaration describes a function or appears outside a function and describes an object with external linkage, the keyword extern is optional. If you do not specify a storage class specifier, the function is assumed to have external linkage.
....
It is an error to include a declaration for the same function with the storage class specifier static before the declaration with no storage class specifier because of the incompatible declarations. Including the extern storage class specifier on the original declaration is valid and the function has internal linkage.
It's not necessary, but I prefer it in headers to reinforce the idea that this function is defined somewhere else.
To me, this:
int func(int i);
is a forward declaration of a function that will be needed later, while this:
extern int func(int i);
is a declaration of a function that will be used here, but defined elsewhere.
The two lines are functionally identical, but I use the extern keyword to document the difference, and for consistency with regular variables (where the difference is important, and has exactly that meaning).
You do not necessarily "need" extern for variables.
When C was invented Unix linkers were also written, and they advanced the art in unheralded but clever ways. One contribution was defining all symbols as small "common blocks". This allowed a single syntax for declarations with no required specification of which module was allocating the space. (Only one module could actually initialize the object, but no one was required to.)
There are really three considerations.
Forward declarations for prototypes. (Optional, because legacy C has to compile without them.)
Extern declarations for non-function objects (variables) in all files except one. (Needed only on non-Unix systems that also have crummy linkers. Hopefully this is rare these days.)
For functions, extern is already the assumption if no function body is present to form a definition.
As far as I remember the standard, all function declarations are considered as "extern" by default, so there is no need to specify it explicitly.
That doesn't make this keyword useless since it can also be used with variables (and it that case - it's the only solution to solve linkage problems). But with the functions - yes, it's optional.
A little more verbose answer is that it allows you to use variables compiled in another source code file, but doesn't reserve memory for the variable. So, to utilise extern, you have to have a source code file or a library unit that contains memory space for the variable on the top level (not within functions). Now, you can refer to that variable by defining an extern variable of the same name in your other source code files.
In general, the use of extern definition should be avoided. They lead easily to unmanagable code and errors that hard to locate. Of course, there are examples where other solutions would be impractical, but they are rare. For example, stdin and stdout are macros that are mapped to an extern array variable of type FILE* in stdin.h; memory space for this array is in a standard C-library unit.
I understand what static does, but not why we use it. Is it just for keeping the abstraction layer?
There are a few reasons to use static in C.
When used with functions, yes the intention is for creating abstraction. The original term for the scope of a C source code file was "translation unit." The static functions may only be reached from within the same translation unit. These static functions are similar to private methods in C++, liberally interpreted (in that analogy, a translation unit defines a class).
Static data at a global level is also not accessible from outside the translation unit, and this is also used for creating an abstraction. Additionally, all static data is initialized to zero, so static may be used to control initialization.
Static at the local ("automatic") variable level is used to abstract the implementation of the function which maintains state across calls, but avoids using a variable at translation unit scope. Again, the variables are initialized to zero due to static qualification.
The keyword static has several uses; Outside of a function it simply limits the visibility of a function or variable to the compilation unit (.c file) the function or variable occurs in. That way the function or variable doesn't become global. This is a good thing, it promotes a kind of "need to know" principle (don't expose things that don't need to be exposed). Static variables of this type are zero initialized, but of course global variables are also zero initialized, so the static keyword is not responsible for zero initialization per se.
Variables can also be declared static inside a function. This feature means the variable is not automatic, i.e. allocated and freed on the stack with each invocation of the function. Instead the variable is allocated in the static data area, it is initialized to zero and persists for the life of the program. If the function modifies it during one invocation, the new modified value will be available at the next invocation. This sounds like a good thing, but there are good reasons "auto" is the default, and "static" variables within functions should be used sparingly. Briefly, auto variables are more memory efficient, and are essential if you want your function to be thread safe.
static is used as both a storage class specifier and a linkage specifier. As a linkage specifier it restricts the scope of an otherwise global variable or function to a single compilation unit. This allows, for example a compilation unit to have variables and functions with the same identifier names as other compilation units but without causing a clash, since such identifiers are 'hidden' from the linker. This is useful if you are creating a library for example and need internal 'helper' functions that must not cause a conflict with user code.
As a storage class specifier applied to a local variable, it has different semantics entirely, but your question seems to imply that you are referring to static linkage.
Static functions in C
In C, functions are global by default. The “static” keyword before a function name makes it static. For example, below function fun() is static.
static int fun(void)
{
printf("I am a static function ");
}
Unlike global functions in C, access to static functions is restricted to the file where they are declared. Therefore, when we want to restrict access to functions, we make them static. Another reason for making functions static can be reuse of the same function name in other files.
For example, if we store following program in one file file1.c
/* Inside file1.c */
static void fun1(void)
{
puts("fun1 called");
}
And store following program in another file file2.c
/* Iinside file2.c */
int main(void)
{
fun1();
getchar();
return 0;
}
Now, if we compile the above code with command gcc file2.c file1.c, we get the error undefined reference to fun1. This is because fun1 is declared static in file1.c and cannot be used in file2.c. See also the explanation here, where the codes come from.