Is extern keyword for function necessary at all in C? - c

It appears to me that even if I refer to a function in another file without an extern declaration, gcc can still compile that unit. So I am wondering whether the extern declaration is necessary anywhere for functions? I know that you need extern for variables.

Functions have the extern storage class specifier by default (unless they are explicitly defined as static).
extern Storage Class Specifier
If the declaration describes a function or appears outside a function and describes an object with external linkage, the keyword extern is optional. If you do not specify a storage class specifier, the function is assumed to have external linkage.
....
It is an error to include a declaration for the same function with the storage class specifier static before the declaration with no storage class specifier because of the incompatible declarations. Including the extern storage class specifier on the original declaration is valid and the function has internal linkage.

It's not necessary, but I prefer it in headers to reinforce the idea that this function is defined somewhere else.
To me, this:
int func(int i);
is a forward declaration of a function that will be needed later, while this:
extern int func(int i);
is a declaration of a function that will be used here, but defined elsewhere.
The two lines are functionally identical, but I use the extern keyword to document the difference, and for consistency with regular variables (where the difference is important, and has exactly that meaning).
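A minimal sketch of that equivalence, assuming a hypothetical body for func and a made-up caller named call_func_twice: both spellings of the declaration can appear in the same file and refer to the same function.

```c
#include <assert.h>

/* Both lines declare the same function with external linkage;
   the second only spells out the default. */
int func(int i);
extern int func(int i);

/* A caller needs nothing more than a declaration in scope. */
int call_func_twice(int i)
{
    return func(func(i));
}

/* The single definition. It sits in this file only to keep the sketch
   self-contained; it could equally live in another source file. */
int func(int i)
{
    return i + 1;
}

/* Self-check, meant to be called from main(). */
void check_func_declarations(void)
{
    assert(call_func_twice(3) == 5);
}
```

If the two declarations disagreed on the return or parameter types, the compiler would reject them; the presence or absence of extern makes no difference to that check.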

You do not necessarily "need" extern for variables.
When C was invented Unix linkers were also written, and they advanced the art in unheralded but clever ways. One contribution was defining all symbols as small "common blocks". This allowed a single syntax for declarations with no required specification of which module was allocating the space. (Only one module could actually initialize the object, but no one was required to.)
There are really three considerations.
Forward declarations for prototypes. (Optional, because legacy C has to compile without them.)
Extern declarations for non-function objects (variables) in all files except one. (Needed only on non-Unix systems that also have crummy linkers. Hopefully this is rare these days.)
For functions, extern is already the assumption if no function body is present to form a definition.

As far as I remember the standard, all function declarations are considered as "extern" by default, so there is no need to specify it explicitly.
That doesn't make this keyword useless, since it can also be used with variables (and in that case it's the only way to solve linkage problems). But with functions, yes, it's optional.
A slightly more verbose answer is that it allows you to use variables defined in another source file without reserving memory for the variable. So, to use extern, you need a source file or library unit that defines the variable (reserving its memory) at the top level, not within a function. You can then refer to that variable by declaring an extern variable of the same name in your other source files.
In general, extern variables should be avoided. They easily lead to unmanageable code and errors that are hard to locate. Of course, there are cases where other solutions would be impractical, but they are rare. For example, in some C library implementations stdin and stdout are macros that map to elements of an extern array of FILE objects declared in stdio.h; the memory for this array lives in a standard C library unit.
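As a rough single-file sketch of the declaration/definition split for variables (the names shared_counter and bump_counter are made up for illustration): the extern line reserves no memory, and the later definition does.

```c
#include <assert.h>

/* Declaration only: no storage is reserved here. In a multi-file
   program this line would typically live in a header. */
extern int shared_counter;

/* Code can use the variable as soon as a declaration is visible. */
int bump_counter(void)
{
    return ++shared_counter;
}

/* The one definition that actually reserves storage. In a multi-file
   program it would sit at file scope in exactly one source file. */
int shared_counter = 0;

/* Self-check, meant to be called once from main(). */
void check_shared_counter(void)
{
    int first = bump_counter();
    int second = bump_counter();
    assert(second == first + 1);
    assert(shared_counter == second);
}
```

Removing the definition while keeping the extern declaration would still compile, but the link step would fail with an undefined reference.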

Related

Why is redeclaring functions legal in C?

void test(void){
//
}
void test(void); // <-- legal
int main(){
test();
int i = 5;
// int i; <-- not legal
return 0;
}
I understand that functions can have multiple declarations but only 1 definition,
but in my example the declaration is coming after the definition. Why would this be useful? Same cannot be done with block scoped variables.
I found this post which explains the behaviour in C++, not sure if the same applies to C:
Is a class declaration allowed after a class definition?
The underlying reason has to do with the way programs are typically compiled and linked on systems on which C is the "natural language", and the origin of the C language. The following describes conceptually how a program is generated from a collection of source files with static linking.
A program (which may or may not be written in C) consists of separate units — the C term is "translation units", which are source files — which are compiled or assembled to object files.
As a very rough picture such object files expose data objects (global variables) and executable code snippets (functions), and they are able to use such entities defined in other translation units. For the CPU, both are simply addresses. These entities have names or labels called "symbols" (function names, variable names) which an object file declares as "needed" (defined elsewhere) or "exported" (provided for use elsewhere).
On the C source code level the names of objects that are used here but defined elsewhere are made known to the compiler by "extern" declarations; this is true for functions and variables alike. The compiler conceptually generates "placeholder addresses" whenever such an object is accessed. It "publishes" the needed symbols in the object file, and the linker later replaces the symbolic placeholders with the "real" addresses of objects and executable code snippets when it creates an executable.
It does not hurt to declare the use of an external object or function multiple times. No code is generated anyway. But the definition, where actual memory is reserved for an object or executable code, can in general only occur once in a program, because it would otherwise be a duplicate code or object and create an ambiguity. Local variables don't have declarations like global variables; there is no need to declare their use far away from their definition. Their declaration is always also a definition, as in your example, therefore can only occur once in a given scope. That is not different for global variable definitions (as opposed to extern declarations) which can only occur once in the global scope.
Let's say you have these files:
// foo.h
#pragma once
void foo();
// helpers.h
#pragma once
#include "foo.h"
// ...
void bar();
// foo.c
void foo() {
// ...
}
#include "helpers.h"
// ...
Here, there is a declaration of foo after it's fully defined. Should this not compile? I think it's totally reasonable to expect #include directives to not have such effects.
I understand that functions can have multiple declarations but only 1 definition, but in my example the declaration is coming after the definition.
So?
Why would this be useful?
At minimum, it is useful for simplifying the definition of the language. Given that functions may be declared multiple times in the same scope, what purpose would be served by requiring the definition, if any, to be the last one? If multiple declaration is to be allowed at all -- and there is good reason for this -- then it is easier all around to avoid unnecessary constraints on their placement.
Same cannot be done with block scoped variables.
That's true, but for a different reason than you may suppose: every block-scope variable declaration is a definition, so multiple declarations in the same scope result in multiple definitions in the same scope, in violation of the one-definition rule.
A better comparison would be with file-scope variable declarations, which can be duplicated, in any order relative to a single definition, if present.
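To make that comparison concrete, here is a small sketch (the names limit, twice and scaled_limit are hypothetical) in which both a file-scope variable and a function are redeclared after their definitions:

```c
#include <assert.h>

extern int limit;      /* declaration before the definition */
int limit = 3;         /* the single definition */
extern int limit;      /* redeclaration after the definition: still legal */

int twice(int n);      /* declaration before the definition */
int twice(int n) { return 2 * n; }
int twice(int n);      /* redeclaration after the definition: still legal */

int scaled_limit(void)
{
    int local = twice(limit);
    /* A second "int local;" here would be an error: a block-scope
       declaration without extern is always a definition, so it would
       define local twice in the same scope. */
    return local;
}

/* Self-check, meant to be called from main(). */
void check_redeclarations(void)
{
    assert(scaled_limit() == 6);
}
```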

Global singleton in header-only C library

I am trying to implement a global singleton variable in the header-only library in C (not C++). So after searching on this forum and elsewhere, I came across a variation of Meyer's singleton that I am adapting to C here:
/* File: sing.h */
#ifndef SING_H
#define SING_H
inline int * singleton()
{
static int foo = 0;
return &foo;
}
#endif
Notice that I am returning a pointer because C lacks & referencing available in C++, so I must work around it.
OK, now I want to test it, so here is a simple test code:
/* File: side.h */
#ifndef SIDE_H
#define SIDE_H
void side();
#endif
/*File: side.c*/
#include "sing.h"
#include <stdio.h>
void side()
{
printf("%d\n",*(singleton()));
}
/*File: main.c*/
#include "sing.h"
#include "side.h"
#include <stdio.h>
int main(int argc, char * argv[])
{
/* Output default value - expected output: 0 */
printf("%d\n",*(singleton()));
*(singleton()) = 5;
/* Output modified value - expected output: 5 */
printf("%d\n",*(singleton()));
/* Output the same value from another module - expected output: 5*/
side();
return 0;
}
It compiles and runs fine in MSVC in C mode (and in C++ mode too, but that's not the topic). However, gcc emits two warnings (warning: ‘foo’ is static but declared in inline function ‘singleton’ which is not static) and produces an executable that segfaults when I attempt to run it. The warning itself kind of makes sense to me (in fact, I am surprised I don't get it in MSVC), but the segfault hints at the possibility that gcc never compiles foo as a static variable, making it a local variable on the stack and then returning the expired stack address of that variable.
I tried declaring the singleton as extern inline, it compiles and runs fine in MSVC, results in linker error in gcc (again, I don't complain about linker error, it is logical).
I also tried static inline (it compiles fine in both MSVC and gcc, but predictably runs with wrong output in the third line, because the side.c translation unit now has its own copy of singleton).
So, what am I doing wrong in gcc? I have neither of these problems in C++, but I can't use C++ in this case, it must be straight C solution.
I could also accept any other form of singleton implementation that works from header-only library in straight C in both gcc and MSVC.
I am trying to implement a global singleton variable in the header-only library in C (not C++).
By "global", I take you to mean "having static storage duration and external linkage". At least, that's as close as C can come. That is also as close as C can come to a "singleton" of a built-in type, so in that sense, the term "global singleton" is redundant.
Notice that I am returning a pointer because C lacks & referencing available in C++, so I must work around it.
It is correct that C does not have references, but you would not need either pointer or reference if you were not using a function to wrap access to the object. I'm not really seeing what you are trying to gain by that. You would likely find it easier to get what you are looking for without. For example, when faced with duplicate external definitions of the same variable identifier, the default behavior of all but the most recent versions of GCC was to merge them into a single variable. Although current GCC reports this situation as an error, the old behavior is still available by turning on a command-line switch.
On the other hand, your inline function approach is unlikely to work in many C implementations. Note especially that inline semantics are rather different in C than in C++, and external inline functions in particular are rarely useful in C. Consider these provisions of the C standard:
paragraph 6.7.4/3 (a language constraint):
An inline definition of a function with external linkage shall not contain a definition of a modifiable object with static or thread storage duration, and shall not contain a reference to an identifier with internal linkage.
Your example code is therefore non-conforming, and conforming compilers are required to diagnose it. They may accept your code nonetheless, but they may do anything they choose with it. It seems unreasonably hopeful to expect that you could rely on a random conforming C implementation to both accept your code for the function and compile it such that callers in different translation units could obtain pointers to the same object by calling that function.
paragraph 6.9/5:
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression [...], somewhere in the entire program there shall be exactly one external definition for the identifier [...].
Note here that although an inline definition of a function identifier with external linkage -- such as yours -- provides an external declaration of that identifier, it does not provide an external definition of it. This means that a separate external definition is required somewhere in the program (unless the function goes altogether unused). Moreover, that external definition cannot be in a translation unit that includes the inline definition. This is large among the reasons that extern inline functions are rarely useful in C.
paragraph 6.7.4/7:
For a function with external linkage, the following restrictions apply: [...] If all of the file scope declarations for a function in a translation unit include the inline function specifier without extern, then the definition in that translation unit is an inline definition. An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.
In addition to echoing part of 6.9/5, that also warns you that if you do provide an external definition of your function to go with the inline definitions, you cannot be sure which will be used to serve any particular call.
Furthermore, you cannot work around those issues by declaring the function with internal linkage, for although that would allow you to declare a static variable within, each definition of the function would be a different function. Lest there be any doubt, Footnote 140 clarifies that in that case,
Since an inline definition is distinct from the corresponding external definition and from any other corresponding inline definitions in other translation units, all corresponding objects with static storage duration are also distinct in each of the definitions.
(Emphasis added.)
So again, the approach presented in your example cannot be relied upon to work in C, though you might find that in practice, it does work with certain compilers.
If you need this to be a header-only library, then you can achieve it in a portable manner by placing an extra requirement on your users: exactly one translation unit in any program using your header library must define a special macro before including the header. For example:
/* File: sing.h */
#ifndef SING_H
#define SING_H
#ifdef SING_MASTER
int singleton = 0;
#else
extern int singleton;
#endif
#endif
With that, the one translation unit that defines SING_MASTER before including sing.h (for the first time) will provide the needed definition of singleton, whereas all other translation units will have only a declaration. Moreover, the variable will be accessible directly, without either calling a function or dereferencing a pointer.
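A single-file sketch of what the "master" translation unit would look like, with the header pasted inline so the example stands alone. The functions side_value and check_singleton are stand-ins invented here for the question's side() and main().

```c
#include <assert.h>

#define SING_MASTER          /* defined in exactly one .c file of the program */

/* ---- begin sing.h (pasted inline for the sketch) ---- */
#ifndef SING_H
#define SING_H
#ifdef SING_MASTER
int singleton = 0;           /* the one definition */
#else
extern int singleton;        /* every other file gets only this declaration */
#endif
#endif
/* ---- end sing.h ---- */

/* Stand-in for side() from the question: reads the shared variable. */
int side_value(void)
{
    return singleton;
}

/* Mirrors the three checks in the question's main(); call it once. */
void check_singleton(void)
{
    assert(singleton == 0);      /* default value */
    singleton = 5;
    assert(singleton == 5);      /* modified value */
    assert(side_value() == 5);   /* same object seen from another function */
}
```

In a real program, side.c would include sing.h without defining SING_MASTER and would therefore see only the extern declaration.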

C compiler ignores 'static' for declaration of struct

In C, if I declare a structure like so:
static struct thing {
int number;
};
and compile it (with gcc in this case), the compiler prints this warning:
warning: 'static' ignored on this declaration [-Wmissing-declarations]
Why is this?
My intention in making the struct static would be to keep thing out of the global namespace so that another file could declare its own thing if it wanted.
You can't specify a storage class without declaring an actual object.
static struct thing {
int number;
}obj1,obj2;
is OK, and so is:
struct thing {
int number;
};
static struct thing x,y;
Struct tags (and typedef names) have no linkage, which means they are not shared across translation units. You might use the term "private" to describe this. It is perfectly fine for two different units to define their own struct thing.
There would only be a problem if it was attempted to make a cross-unit call of a function with external linkage that accepts a struct thing or type derived from that. You can minimize the chance of this happening by ensuring that functions with external linkage are only called via prototypes in header files (i.e. don't use local prototypes).
You can't use static this way to control the linkage of a type like you could for a function or object, because in C types never have linkage anyway.
"Global namespace" isn't quite the term you want here. C describes names of objects and functions as having "external linkage" if the same name can be declared in different translation units to mean the same thing (like the default for functions), "internal linkage" if the same name can be redeclared within the same translation unit to mean the same thing (like declarations marked static), or "no linkage" when a declaration names a different object or function from any other declaration (like variables defined within a function body). (A translation unit, roughly speaking, is one *.c file together with the contents of the headers it includes.) But none of this applies to types.
So if you want to use a struct type that's essentially private to one source file, just define it within that source file. Then you don't need to worry about another usage of the same name colliding with yours, unless maybe somebody adds it to a header file that the source file was including.
(And just in case a C++ user comes across this Q&A, note the rules for this in C++ are very different.)
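For the C case, a minimal sketch of keeping a struct type private to one source file. The variable current and the accessor thing_number are hypothetical names; only the accessor has external linkage, and its prototype never mentions struct thing.

```c
#include <assert.h>

/* A struct type defined in a .c file is known only in this translation
   unit; another file may define its own, unrelated struct thing. */
struct thing {
    int number;
};

/* static applies to the object, giving it internal linkage. */
static struct thing current = { 42 };

/* The only name other files can link against. */
int thing_number(void)
{
    return current.number;
}

/* Self-check, meant to be called from main(). */
void check_thing(void)
{
    assert(thing_number() == 42);
}
```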

Confusion regarding extern with function definitions in C [duplicate]

Ok, so for a few hours now I've read a lot about what the extern keyword means. And there is one last thing that is bugging me to no end that I cannot find any info about.
As far as I understand the extern keyword basically tells the compiler that the variable or function is only a declaration and that it is defined somewhere, so it doesn't have to worry about that, the linker will handle it.
And the warning generated by the compiler (I'm using gcc 4.2.1) when typing this:
extern int var = 10;
supports this. With extern this should be a declaration only so it is not correct.
However, the thing that is confusing me is the absence of a warning or anything when typing this:
extern int func() {return 5;}
This is a definition, and it should generate the same warning, but it does not. The only explanation to this I was able to find here is that the definition overrides the extern keyword. However, following that logic why does it not override it when it is a variable definition? Or does the keyword have special meaning when used with variables?
I would be most grateful if someone explained this to me. Thank you!
The extern keyword indeed has special meaning only when it is used with variables. Using extern with function prototypes is entirely optional:
extern void foo(int bar);
is equivalent to
void foo(int bar);
When you are declaring or defining a function, you have two options:
Provide only a declaration (i.e. a prototype), or
Provide a definition, which also serves as a declaration in the absence of a prototype.
With variables, however, you have three options:
1. Provide only a declaration,
2. Provide a definition with the default initializer: int var; without the = 10 part, or
3. Provide a definition with a specific initializer: int var = 10
Since there are only two options for functions, the compiler can distinguish between them without the use of the extern keyword. Any function declaration that does not have the static keyword is considered extern by default. Therefore, the extern keyword is ignored on all function declarations and definitions.
With variables, however, the keyword is needed to distinguish between #1 and #2. When you use extern, it's #1; when you do not use extern, it's #2. When you try to add extern to #3, you get a warning, because it remains a definition and the extern is ignored.
All of this is somewhat simplified: you can provide declarations several times in the same compilation unit, and you can provide them at the global scope or at a block scope. For complete details, check section 6.7.9 5 of the C standard.
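The three variable cases and the function case can be put side by side in a short sketch. The variable names and sum_all are made up for illustration; func is the function from the question.

```c
#include <assert.h>

extern int declared_only;   /* declaration only: no storage reserved here  */
int zero_by_default;        /* definition without an initializer: starts 0 */
int ten = 10;               /* definition with a specific initializer      */

/* extern on a function definition is accepted and changes nothing. */
extern int func(void) { return 5; }

/* A declared-only variable still needs a definition somewhere in the
   program; it is placed here only to keep the sketch in one file. */
int declared_only = 7;

int sum_all(void)
{
    return declared_only + zero_by_default + ten + func();
}

/* Self-check, meant to be called from main(). */
void check_sum_all(void)
{
    assert(sum_all() == 22);   /* 7 + 0 + 10 + 5 */
}
```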
However, following that logic why does it not override it when it is a variable definition? Or does the keyword have special meaning when used with variables?
The difference between variables and functions is that
void foo();
is a function declaration, but
int i;
is a variable definition.
If you have the variable definition in multiple files, then the compiler will generate the storage for that variable multiple times (and most likely you'll get a linker error). This is not the case for functions.

Why and when to use static structures in C programming?

I have seen static structure declarations quite often in a driver code I have been asked to modify.
I tried looking for information as to why structs are declared static and the motivation of doing so.
Can anyone of you please help me understand this?
The static keyword in C has several effects, depending on the context it's applied to.
when applied to a variable declared inside a function, the value of that variable will be preserved between function calls.
when applied to a variable declared outside a function, or to a function, the visibility of that variable or function is limited to the "translation unit" it's declared in, i.e. the file itself. For variables this boils down to a kind of "locally visible global variable".
Both usages are pretty common in relatively low-level code like drivers.
The former, and the latter when applied to variables, allow functions to retain a notion of state between calls, which can be very useful, but this can also cause all kinds of nasty problems when the code is being used in any context where it is being used concurrently, either by multiple threads or by multiple callers. If you cannot guarantee that the code will strictly be called in sequence by one "user", you can pass a kind of "context" structure that's being maintained by the caller on each call.
The latter, applied to functions, allows a programmer to make the function invisible from outside of the module, and it MAY be somewhat faster with some compilers for certain architectures because the compiler knows it doesn't have to make the variable/function available outside the module - allowing the function to be inlined for example.
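A small sketch contrasting the two ways of keeping state mentioned above: a static local hidden inside a function versus a "context" structure owned by the caller. All names here (next_id_shared, id_context, next_id, demo_static_state) are hypothetical.

```c
#include <assert.h>

/* State hidden in a static local: it survives between calls, but every
   caller (and every thread) shares the same counter. The function itself
   is static, so it is invisible outside this file. */
static int next_id_shared(void)
{
    static int last_id = 0;
    return ++last_id;
}

/* Alternative: the caller owns the state and passes it in on each call. */
struct id_context {
    int last_id;
};

static int next_id(struct id_context *ctx)
{
    return ++ctx->last_id;
}

/* Exercises both variants; returns the second id drawn from context a. */
int demo_static_state(void)
{
    struct id_context a = { 0 };
    struct id_context b = { 0 };

    int s1 = next_id_shared();
    int s2 = next_id_shared();   /* the static local remembered s1 */
    int a1 = next_id(&a);
    int b1 = next_id(&b);        /* b has its own independent state */
    int a2 = next_id(&a);

    assert(s2 == s1 + 1);
    assert(a1 == 1 && b1 == 1 && a2 == 2);
    return a2;
}
```

The context version costs an extra parameter, but two callers can no longer disturb each other's state.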
Something that apparently all other answers seem to miss: static also specifies a storage duration for an object, alongside automatic (local variables) and allocated (memory returned by malloc and friends).
Objects with static storage duration are initialized before main() starts, either with the initializer specified, or, if none was given, as if 0 had been assigned to it (for structs and arrays this goes for each member and recursively).
The second property static sets for an identifier, is its linkage, which is a concept used at link time and tells the linker which identifiers refer to the same object. The static keyword makes an identifier have internal linkage, which means it cannot refer to identifiers of the same name in another translation unit.
And to be pedantic about all the sloppy answers I've read before: a static variable cannot be referenced everywhere in the file in which it is declared. Its scope runs only from its declaration (which can be between function definitions) to the end of the source file, or, even smaller, to the end of the enclosing block.
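The zero-initialization rule for static storage duration can be checked with a short sketch. The struct config type and its members are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct config {
    int  retries;
    char name[8];
    struct { double scale; const char *label; } inner;
};

/* No initializer: static storage duration means every member, including
   the nested ones, starts out as zero or a null pointer. */
static struct config defaults;

/* Returns 1 if all members still hold their initial zero values. */
int defaults_are_zero(void)
{
    return defaults.retries == 0
        && defaults.name[0] == '\0'
        && defaults.inner.scale == 0.0
        && defaults.inner.label == NULL;
}

/* Self-check, meant to be called from main(). */
void check_defaults(void)
{
    assert(defaults_are_zero());
}
```

An automatic (non-static) local of the same type declared without an initializer would have indeterminate contents instead.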
struct variable
For a struct variable like static struct S s;, this has been widely discussed at: What does "static" mean in C?
struct definition: no effect:
static struct S { int i; int j; };
is the exact same as:
struct S { int i; int j; };
so never use it. GCC 4.8 raises a warning if you do it.
This is because struct definitions have no storage and do not generate symbols in object files the way variables and functions do. Just try compiling and inspecting the symbols of:
struct S { int i; int j; };
int i;
with:
gcc -c main.c
nm main.o
and you will see that there is no S symbol, but there is an i symbol.
The compiler simply uses definitions to calculate the offset of fields at compile time.
This is why struct definitions are usually placed in headers: they won't generate multiple separate copies of data, even if included multiple times.
The same goes for enum.
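As a rough illustration of the compile-time nature of struct layout, the sketch below uses offsetof together with _Static_assert, which assumes a C11 or later compiler. The helper offset_of_j is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

struct S { int i; int j; };   /* a type: no storage, no symbol */
int i;                        /* an object: storage and a symbol named i */

/* Layout questions about struct S are answered by the compiler alone. */
_Static_assert(offsetof(struct S, i) == 0, "first member sits at offset 0");
_Static_assert(offsetof(struct S, j) >= sizeof(int), "j comes after i");
_Static_assert(sizeof(struct S) >= 2 * sizeof(int), "room for both members");

/* The offset is a constant baked into the code, not looked up at run time. */
size_t offset_of_j(void)
{
    return offsetof(struct S, j);
}

/* Self-check, meant to be called from main(). */
void check_layout(void)
{
    assert(offset_of_j() >= sizeof(int));
    assert(offset_of_j() < sizeof(struct S));
}
```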
C++ struct definition: deprecated in C++11
C++11 N3337 standard draft Annex C 7.1.1:
Change: In C++, the static or extern specifiers can only be applied to names of objects or functions. Using these specifiers with type declarations is illegal in C++. In C, these specifiers are ignored when used on type declarations.
See also: https://stackoverflow.com/a/31201984/895245
If you declare a variable as being static, it is visible only in that translation unit (if globally declared) or retains its value from call to call (if declared inside a function).
In your case I guess it is the first case. In that case, probably the programmer didn't want the structure to be visible from other files.
The static modifier for the struct limits the scope of visibility of the structure to the current translation unit (i.e. the file).
NOTE: This answer assumes (as other responders have indicated) that your declaration is not within a function.
