I am a little confused on the purpose of static functions in C, if anybody can explain that would be great! :)
I understand that static functions are used to limit the function’s visibility but why is it used?
Large programs are built from multiple subcomponents, which may be in separate groups of source files within one company or in libraries provided by third-party vendors.
When a function is not declared with static, its identifier has external linkage. This means any two instances of the identifier will be linked to refer to the same function.
Sometimes we do not want that. One person writing something for one part of the program might have called one function they use CalculateSquare because it calculates the square of a complex number, and another person writing something for another part of the program might have called a function CalculateSquare because it calculates some properties for a geometric square. If both of these identifiers have external linkage, this will generally result in a link error due to multiple definitions or, worse, linking in one definition and not the other with no error message (which can happen when one is in a library file).
When a function is declared with static, its identifier has internal linkage. This means uses of its identifier inside the same translation unit, after its declaration, will refer to that function, but uses of the same identifier in other translation units will not refer to the function in this translation unit. This allows programmers to pick their names more freely, without worrying about collisions with names other programmers use.
(There are other ways to avoid multiple definition errors. Linkers often have commands to control the publication and use of symbols in their output files, and those sometimes have to be used and may provide features that merely using static does not. However, using static is a standard and easy way to handle this in many situations.)
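To make the CalculateSquare scenario concrete, here is a minimal sketch of what one of the two files might look like. The file name complex_part.c, the struct Complex type, and the function RealPartOfSquare are made up for illustration; only the name CalculateSquare comes from the example above.

```c
/* complex_part.c (hypothetical file name) */

/* Hypothetical type, used only for this illustration. */
struct Complex { double re, im; };

/* Internal linkage: the name CalculateSquare is private to this translation
   unit, so a geometry file may define its own CalculateSquare without a clash. */
static struct Complex CalculateSquare(struct Complex z)
{
    struct Complex r = { z.re * z.re - z.im * z.im, 2.0 * z.re * z.im };
    return r;
}

/* External linkage: the only name this file publishes to the linker. */
double RealPartOfSquare(double re, double im)
{
    struct Complex z = { re, im };
    return CalculateSquare(z).re;
}
```

Other files can call RealPartOfSquare by name, but the name CalculateSquare does not link to this function from outside the file.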
As of late, I've been trying to work through opaque pointers as a programming concept and one of the main things I've had difficulties with is figuring out what is or isn't available to other files. In a previous question, I failed in trying to create an opaque pointer to a struct and even though the answer explained how to fix that, I still don't quite understand where I went wrong.
I think that if a struct is defined in file2.c, file1.c can use it if both files include header.h which includes a declaration of the struct? That doesn't entirely make sense to me. header.h is used by both files, so I can see how they would access the stuff in it, but I don't understand how they would use it to access each other.
When I started programming, I thought it was pretty straightforward: you have program files, they can't access anything in each other, and those program files can #include header files with definitions and declarations in them (e.g. file1.c has access to variables/functions/etc. defined in header.h). Turns out I was wrong and things are quite a bit more complicated.
So from what I can tell, func() defined in header.h can be used by file1.c without being declared in file1.c, if file1.c includes header.h. As opposed to var defined in header.h which needs to be declared in file1.c with the extern keyword? And I think if var is defined in file2.c, file1.c can use it if it extern declares it, even if neither file1.c nor file2.c include header.h?
I apologize if the previous paragraphs make no sense, I'm having quite a bit of difficulty with trying to describe something that confuses me. By all means, please edit this if you are able to fix mistakes or whatnot.
Books and webpages don't seem to help at all. They end up giving me misconceptions because I already don't understand something and draw the wrong conclusions, or they bring up concepts that throw me off even more.
What I'm looking for is an answer that lays this all out in front of me. For example, 'this can access this under these circumstances', 'this cannot access this'.
Functions defined in one .c file can use anything defined in another .c file except for those things which are marked as static. Functions and global variables which are marked as static cannot be accessed from other translation units.
Whether something is declared in a header file or not doesn't really matter--you can declare functions locally in the same .c file which calls them if you want.
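As a small sketch of that last point, the file below calls the standard library function atoi without including any header, by writing the declaration itself. The function name parse_port is made up for the example.

```c
/* No #include <stdlib.h> here: the declaration is written by hand instead.
   It must match the real function's type exactly. */
int atoi(const char *text);

/* Hypothetical helper that uses the locally declared function. */
int parse_port(const char *text)
{
    return atoi(text);
}
```

This links fine because atoi has external linkage. In practice, including the header is still the safer habit, since a hand-written declaration that does not match the definition is not diagnosed across files.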
Your question asks about “access” at several points, but I do not think that is what you mean to use. Any object or function can be accessed (for an object: read or written; for a function: called) from anywhere, as long as a pointer to it is provided in some way. I think what you mean to ask is which names are available.
Any declaration that is outside of a function is an external declaration. In this use of “external” in the C standard, it simply means outside of a function. (That includes a function declaration or definition; although it is declaring or defining a function, it is not inside itself or any other function declaration, so it is outside of any function.)
Any identifier for an object or function with an external declaration has either internal linkage or external linkage. If it is first declared with static, it has internal linkage (and may be later declared with extern, but that will not change the linkage). Otherwise, it has external linkage.
Any identifier with external linkage will refer to the same object or function in all translation units (provided other rules of the C standard are satisfied—a program can do various things that will result in behavior not defined by the C standard).
Thus your answer is: The name of any object or function that is (a) defined outside of any function and (b) not initially declared with static is available to be linked to from other translation units.
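A minimal sketch of these linkage rules in one file, with all names (hidden_count, shared_total, bump) invented for illustration:

```c
/* First declared with static: internal linkage. */
static int hidden_count = 0;

/* A later extern declaration is allowed and does NOT change the linkage;
   hidden_count stays internal to this translation unit. */
extern int hidden_count;

/* Defined outside any function and not declared static: external linkage,
   so other translation units can link to the name shared_total. */
int shared_total = 0;

/* External linkage as well; callable by name from other files. */
int bump(void)
{
    hidden_count++;
    shared_total += 2;
    return hidden_count;
}
```

Another file could write "extern int shared_total;" and reach this object, while "extern int hidden_count;" in another file would not link to the one here.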
Some technicalities that may be of interest:
What people think of as a variable is two things: an identifier (the name) and an object (a region of memory that stores the value).
Identifiers have scope, which is where they are visible in the source code. Identifiers declared outside functions have file scope; they are visible for the rest of the translation unit. Identifiers declared inside functions have various other types of scope: function scope, function prototype scope, and block scope.
You may sometimes see people refer to global scope or external scope, but these are misnomers; they are not terms used in the C standard.
Linkage is related to scope and is sometimes confused with it, but linkage is a different concept: Two identical identifiers declared in different places can be made to refer to the same thing. Those identifiers have different scopes, notably one having file scope in one translation unit and the other having file scope in a different translation unit. Since each translation unit is compiled separately, the compiler generates code regarding each identifier separately. When the object modules are linked, the code is bound together, causing the separate identifiers to refer to the same object or function.
Identifiers can be declared with extern inside functions, but these can only link to objects or functions defined elsewhere; external definitions cannot appear inside functions.
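Here is a short sketch of an extern declaration inside a function. The names config_level and read_level are hypothetical, and the definition is placed in the same file only to keep the example self-contained; it could equally live in another translation unit.

```c
/* External definition: it must appear outside any function. */
int config_level = 3;

int read_level(void)
{
    /* Block-scope declaration with extern: this does not create a new
       object, it only makes the identifier visible inside this function
       and links it to the definition above. */
    extern int config_level;
    return config_level;
}
```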
I'm using Eclipse Indigo, gcc and CDT in a project. If two functions in separate source files share names (regardless of return type or parameters), Eclipse flags a redefinition error. This isn't a huge issue regarding this project given I can easily rename these functions, and I'm well aware of wrappers if it were. Although this isn't a critical issue, it does make me think I'm not understanding the C build process. What occurs during the build process in which a program structure like this would cause issue?
Here's some more info. on the situation, and where my understanding is so far -- not necessary to answer the question, although there must be a hole in my understanding.
In this case, the two functions are intended to be used only locally, as such their prototypes are not given in the .h interface, and for the sake of my point, neither are defined 'static'.
Neither of these source files is being included anywhere in the project, so they shouldn't be sharing any compilation units. With that in consideration, I would have assumed that neither source file is aware of the presence of the other, and the compiler would have no problem indexing the two functions, as the separate files would allow for proper distinguishing between the two during linking -- so long as they weren't included in the same compilation unit.
I noticed that statically defining either instance of the function declaration removes the error. I remember reading at some point that every function not declared static is global -- although given these functions are not a part of the .h interface, the practical example in which including the .h interface doesn't allow for the including program to reference all .c functions would indicate "hiding" these functions would be of no issue.
What am I overlooking?
Some insight would be greatly appreciated, thanks!
This is the concept of "linkage". Every function and variable in C has a linkage type, one of "external", "internal", and "none". (Only variables can have no linkage.)
Functions have external linkage by default, which means that they can be called by name from any compilation unit (where "compilation unit" roughly means one source file and all the headers it includes). This can be expressed explicitly by declaring them extern, or it can be overridden by declaring them static. Functions declared static have internal linkage, meaning they can be referenced by name only from other functions in the same compilation unit.
No two external functions anywhere in the same program can have the same name, regardless of header files, but static functions in different compilation units may have the same name. A static function may have the same name as an external function, too -- then the name resolves to the static function within its compilation unit, and to the external function elsewhere. These restrictions make sense, for otherwise it would be possible for a function call to be ambiguous.
Header files don't factor into the linkage equation at all. They are primarily a vehicle for sharing declarations, but a function's linkage depends only on how it is declared, not on where.
I leave discussion of variables' linkage for another time.
It doesn't matter whether one source module includes headers for another. Header files only contain declarations for the purpose of local functions being able to find functions in other modules. It doesn't mean that functions not declared don't exist from the perspective of that module.
When everything gets linked together, anything not specifically defined to be local to one source module (i.e. static) has to have a unique name across all linked components.
"remember reading at some point that every function not declared static is global"
Having understood this you got the main point and reason for the behaviour observed.
.h files are not known to the linker. After preprocessing, only translation units are left (typically a .c file with all its includes merged in), and the .o files are compiled from those.
There are no interfaces on language level in C.
"Neither of these source files are being included anywhere in the project, so they shouldn't be sharing any compilation units."
Declare those functions as static. This is the only way to "hide" a function from the linker "inside" a translation unit.
C doesn't "mangle" function names the way C++ or Java do (since C doesn't support function polymorphism).
For example, in C++, the functions
void foo( void );
void foo( int x );
void foo( int x, double y );
have their names "mangled" into the unique symbols (see note 1)
_Z3foov
_Z3fooi
_Z3fooid
which is how overloaded function/method calls are disambiguated at the machine level.
C doesn't do that; instead, the linker sees two different function definitions using the same symbol and yaks because it has no way to disambiguate the two.
1. This is what happens on my system, anyway
I have tried compiling and running code in which I defined static variables with the same name in two different source files. The code compiled successfully and ran.
Now my question is that both the static variables reside in the .data/BSS section in memory. As per my understanding two different memory locations must have a separate unique name identifier. Why was this not a problem in this case?
"As per my understanding two different memory locations must have a separate unique name identifier." - it is not clear what you mean by "memory locations" in this case. Memory locations have addresses, not names. If by "memory locations" you mean "individual variables", then the above statement only applies to variables with external linkage. Variables with external linkage need externally visible names. Variables with internal linkage (static variables) don't.
In a typical implementation all static symbols are resolved internally by the compiler, at the compilation stage. They do not produce external names in object files, i.e. they are not exposed to the linker at all. In the simplest case all static variables from the same translation unit are seen by the linker as a single blob of data.
By the time different translation units are brought together for linking, all names of static variables are no longer necessary. By that time they are long forgotten. Which is why naming conflicts do not have a chance to occur.
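Two files cannot be shown in one listing, but the same independence of names is easy to see inside a single file using block-scope static variables, which have no linkage at all. This is only an analogy to the two-file case, and the function names are invented for the example.

```c
/* Two distinct objects that happen to share the identifier "counter".
   The compiler resolves each name within its own scope; neither name
   needs to be visible to the linker. */
int next_ticket(void)
{
    static int counter = 100;   /* lives for the whole program run */
    return counter++;
}

int next_order(void)
{
    static int counter = 0;     /* a different object, same name */
    return counter++;
}
```

Each function keeps its own count, so calling them alternately yields 100, 0, 101, 1, and so on.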
P.S. In C++, inline functions with external linkage are allowed to define static variables inside. To provide proper functionality, compilers typically assign external names to such static variables. C, which also supports inline functions, decided to deal with this matter differently: in C, an inline definition of a function with external linkage is simply prohibited from defining a modifiable object with static storage duration.
I know that it's poor practice to not include function prototypes, but if you don't, then the compiler will infer a prototype based on what you pass into the function when you call it (according to this answer). My question is why does the compiler infer the prototype from what you pass into the function rather than the definition of the function itself? I can imagine some kind of preprocessing step where all declared functions are identified and checked to see if a prototype exists for each one. If one doesn't have a prototype, the first line of the function is copied and stuck under the existing prototypes. Why isn't this done?
Because the C compiler was designed as a single pass compiler, where any given file does not know about the other source files that make up the project.
Although compilers have gotten more sophisticated, and may do multiple passes, the general outline of the compilation process framework remains as it was in K&R's day:
1. Pre-process each source file (macro text replacement only).
2. Compile the processed source into an object file.
3. Link the objects into an executable or library.
Inferring prototypes would have to happen in the first step, but the compiler does not know about the existence of any other objects which may contain the function definition at that time.
It might be possible to make a compiler which did what you suggest, but not without breaking the existing rules for how to infer prototypes. A change with such big consequences would make the language no longer C.
The major use for prototypes is to declare a function and inform the compiler about the number and type of arguments in cases where the definition is not visible. Since C was originally compiled single-pass, the definition is not visible when it occurs later in the translation unit, but the more important case from a modern perspective is when the definition is not visible at all, due to lying in a separate translation unit, possibly even in a library file that exists only in compiled form and where no information about the function's type is recorded.
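A minimal sketch of the single-file case, where the definition appears later in the translation unit than the call. The function names are hypothetical.

```c
/* Prototype: tells the compiler the number and types of the arguments
   before any call is compiled. */
double average(int count, double total);

double mean_of_three(double a, double b, double c)
{
    /* Without the prototype above, the definition of average would not
       yet be visible at this point in the file. */
    return average(3, a + b + c);
}

/* The definition comes after its first use. */
double average(int count, double total)
{
    return total / count;
}
```

When the definition lives in another translation unit or in a precompiled library, the prototype normally sits in a header instead, but it plays the same role.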
In our project, we have a pretty big C file of around 50K lines, written in the 90's.
I wanted to split the file based on the functionality. But, all the functions in this file are declared as static. So, file scoped. If I split the file, then the function in file1 cannot call function in file2 and vice-versa.
But my TL feels that there could be a memory optimization from using static functions.
I wrote some sample code to see if the stacks are different for different threads.
It seemed like they were. Could someone please enlighten me on the difference between a static function and a normal one, other than file scope?
In C, while defining a function, the static keyword has the following two major consequences:
1. It prevents the function name from being exported (i.e. the function does NOT have external linkage), thus preventing linkage / direct calls from other parts of the code.
2. As the function is clearly marked private to the file, the compiler is in a better position to generate a complete call-graph for the function. This may result in the compiler deciding to automatically in-line the function for better performance.
All functions are implicitly declared as extern, which means they're visible across translation units. But when we use static, it restricts visibility of the function to the translation unit in which it's defined. So we can say that functions visible only to other functions in the same file are known as static functions.
The most important difference, I think, is that you cannot call a static function by name from any other file.