How are global static/non-static variables mangled in c?

I can imagine static variables var inside a function func to be named like var#func,
what about global static and non-static variables?

Compilers don't need to give unique, externally visible names to things with internal linkage, such as static variables and functions. You can't access static objects by name from outside the translation unit, so the linker doesn't need a name for them.
Global variables with external linkage usually get little or no mangling or decoration, and whatever is applied is typically the same as what is applied to function names. A single leading underscore is not uncommon.

Adding to that, since the information given here is at least incomplete: most compilers will create a "local" symbol for static variables, and since the names of static variables in function scope are not unique, they have to mangle those names. gcc, for example, does that by appending a dot and a unique number to the name. Since a dot cannot appear in a valid identifier, this ensures that there is no name clash.
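A minimal sketch of the situation described above, assuming GCC on an ELF target; all function and variable names here are made up for illustration. Two functions each define a static variable spelled counter, next to one file-scope static and one ordinary global. Running nm on the resulting object file would be expected to list the two counters as local symbols with a numeric suffix (something like counter.0 and counter.1, though the exact suffix depends on the compiler and version), and only shared_total and the functions as global symbols.

```c
#include <assert.h>

int shared_total = 0;           /* external linkage: exported under its own name */
static int file_private = 10;   /* internal linkage: at most a local symbol */

int next_id(void)
{
    static int counter = 0;     /* first function-scope static named counter */
    return ++counter;
}

int next_ticket(void)
{
    static int counter = 100;   /* same spelling, but a separate object */
    return ++counter;
}

int add_to_total(int n)
{
    shared_total += n + file_private;
    return shared_total;
}

/* Small self-check: the two counters never interfere with each other. */
void check_counters(void)
{
    assert(next_id() == 1);
    assert(next_ticket() == 101);
    assert(next_id() == 2);
    assert(add_to_total(5) == 15);
}
```

Calling check_counters() from a main function exercises all four objects; the two counters advance independently even though they share a source-level name.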
Things become obscure when the compiler supports universal characters in identifiers. Depending on the environment, the compiler has to mangle such identifiers, since, for example, the loader might not support such characters in the symbol table.
icc chooses something like replacing each such character with _uXXXX, where XXXX is the hex representation of the character. In icc's case this results in two subtle compiler bugs. First, this mangling produces valid identifiers that the user is also allowed to use, so global symbols may clash with identifiers from the same compilation unit or even from other units. Second, icc even mixes up its own internal naming and reserves space for only one static variable; if the variables are also declared volatile, the generated code goes completely astray.

Related

Advantages of static functions in C?

I am a little confused on the purpose of static functions in C, if anybody can explain that would be great! :)
I understand that static functions are used to limit the function’s visibility but why is it used?

Large programs are built from multiple subcomponents, which may be in separate groups of source files within one company or in libraries provided by third-party vendors.
When a function is not declared with static, its identifier has external linkage. This means any two instances of the identifier will be linked to refer to the same function.
Sometimes we do not want that. One person writing something for one part of the program might have called one function they use CalculateSquare because it calculates the square of a complex number, and another person writing something for another part of the program might have called a function CalculateSquare because it calculates some properties for a geometric square. If both of these identifiers have external linkage, this will generally result in a link error due to multiple definitions or, worse, linking in one definition and not the other with no error message (which can happen when one is in a library file).
When a function is declared with static, its identifier has internal linkage. This means uses of its identifier inside the same translation unit, after its declaration, will refer to that function, but uses of the same identifier in other translation units will not refer to the function in this translation unit. This allows programmers to pick their names more freely, without worrying about collisions with names other programmers use.
(There are other ways to avoid multiple definition errors. Linkers often have commands to control the publication and use of symbols in their output files, and those sometimes have to be used and may provide features that merely using static does not. However, using static is a standard and easy way to handle this in many situations.)
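As a rough illustration of the CalculateSquare scenario, here is a sketch of what the geometry side might look like. The file name geometry.c and the function GeometrySquareArea are hypothetical, introduced only for this example. The helper is static, so a second file could define its own static CalculateSquare for complex numbers without any link-time conflict.

```c
#include <assert.h>

/* Imagine this is geometry.c (hypothetical file name). */

/* Internal linkage: invisible to other translation units, so another
   file may define its own static CalculateSquare without a link error. */
static double CalculateSquare(double side)
{
    return side * side;
}

/* External linkage: the only name this file publishes to the linker. */
double GeometrySquareArea(double side)
{
    return CalculateSquare(side);
}

/* Small self-check for the sketch. */
void check_geometry(void)
{
    assert(GeometrySquareArea(3.0) == 9.0);
}
```

Dropping the static keyword from both files' helpers would bring back the multiple-definition problem described above.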

Why Storage-Class Specifiers are used to determine two independent properties?

From Storage-class specifiers:
The storage-class specifiers determine two independent properties of the names they declare: storage duration and linkage.
So, for example, when the static keyword is used on global variables and functions (whose storage duration is static anyway) it sets their linkage to internal linkage. When used on variables inside functions (which have no linkage), it sets their storage duration to static.
My question is: why is the same specifier used for both things?

The reason is mostly historical: linkage came into the design of the C language as an afterthought. In the early versions you could redeclare global variables as many times as you wished, and the linker would merge all these declarations for you:
Ritchie's original intention had been to model C's rules on FORTRAN COMMON declarations, on the theory that any machine that could handle FORTRAN would be ready for C. In the common-block model, a public variable may be declared multiple times; identical declarations are merged by the linker. (source)
The current rule of a single definition came later, along with the extern keyword. At that point there was a body of C code significant enough to make backward compatibility important. That is probably the reason why the language designers refrained from introducing a new keyword for handling linkage, reusing static instead.

Why won't C compile if two separate source files in the same workspace share function names?

I'm using eclipse indigo, gcc and cdt in a project. If two functions in separate source files share names (regardless of return type or parameters), eclipse flags a redefinition error. This isn't a huge issue regarding this project given I can easily rename these functions, and I'm well aware of wrappers if it were. Although this isn't a critical issue, it does make me think I'm not understanding the c build process. What occurs during the build process in which a program structure like this would cause issue?
Here's some more info. on the situation, and where my understanding is so far -- not necessary to answer the question, although there must be a hole in my understanding.
In this case, the two functions are intended to be used only locally, as such their prototypes are not given in the .h interface, and for the sake of my point, neither are defined 'static'.
Neither of these source files is being included anywhere in the project, so they shouldn't be sharing any compilation units. With that in consideration, I would have assumed that neither source file is aware of the presence of the other, and the compiler would have no problem indexing the two functions, as the separate files would allow for proper distinguishing between the two during linking -- so long as they weren't included in the same compilation unit.
I noticed that statically defining either instance of the function declaration removes the error. I remember reading at some point that every function not declared static is global -- although given these functions are not a part of the .h interface, the practical example in which including the .h interface doesn't allow for the including program to reference all .c functions would indicate "hiding" these functions would be of no issue.
What am I overlooking?
Some insight would be greatly appreciated, thanks!

This is the concept of "linkage". Every function and variable in C has a linkage type, one of "external", "internal", and "none". (Only variables can have no linkage.)
Functions have external linkage by default, which means that they can be called by name from any compilation unit (where "compilation unit" roughly means one source file and all the headers it includes). This can be expressed explicitly by declaring them extern, or it can be overridden by declaring them static. Functions declared static have internal linkage, meaning they can be referenced by name only from other functions in the same compilation unit.
No two external functions anywhere in the same program can have the same name, regardless of header files, but static functions in different compilation units may have the same name. A static function may have the same name as an external function, too -- then the name resolves to the static function within its compilation unit, and to the external function elsewhere. These restrictions make sense, for otherwise it would be possible for a function call to be ambiguous.
Header files don't factor into the linkage equation at all. They are primarily a vehicle for sharing declarations, but a function's linkage depends only on how it is declared, not on where.
I leave discussion of variables' linkage for another time.
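The three linkage kinds can be seen side by side in a small sketch. The names area and clamp_to_zero are invented for this example and do not come from the question.

```c
#include <assert.h>

extern int area(int width, int height);   /* explicit extern: same as the default */

/* Internal linkage: callable by name only within this translation unit. */
static int clamp_to_zero(int value)
{
    return value < 0 ? 0 : value;
}

/* External linkage by default; this definition matches the declaration above. */
int area(int width, int height)
{
    int w = clamp_to_zero(width);    /* w and h are locals: no linkage */
    int h = clamp_to_zero(height);
    return w * h;
}

/* Small self-check for the sketch. */
void check_area(void)
{
    assert(area(3, 4) == 12);
    assert(area(-3, 4) == 0);
}
```

A second source file could define its own static clamp_to_zero, but a second non-static area anywhere in the program would produce the redefinition error from the question.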

It doesn't matter whether one source module includes headers for another. Header files only contain declarations for the purpose of local functions being able to find functions in other modules. It doesn't mean that functions not declared don't exist from the perspective of that module.
When everything gets linked together, anything not specifically defined to be local to one source module (i.e. static) has to have a unique name across all linked components.

"I remember reading at some point that every function not declared static is global"
Having understood this, you have the main point and the reason for the behaviour observed.
.h files are not known to the linker. After pre-processing there are only translation units left (typically a .c file with all includes merged in), from which .o files are compiled.
There are no interfaces at the language level in C.
"Neither of these source files are being included anywhere in the project, so they shouldn't be sharing any compilation units."
Declare those functions as static. This is the only way to "hide" a function from the linker "inside" a translation unit.

C doesn't "mangle" function names the way C++ or Java do (since C doesn't support function overloading).
For example, in C++, the functions
void foo( void );
void foo( int x );
void foo( int x, double y );
have their names "mangled" into the unique symbols [1]
_Z3foov
_Z3fooi
_Z3fooid
which is how overloaded function/method calls are disambiguated at the machine level.
C doesn't do that; instead, the linker sees two different function definitions using the same symbol and complains because it has no way to disambiguate the two.
[1] This is what happens on my system, anyway.
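Since the C compiler leaves the name essentially alone, keeping names unique is up to the programmer. One common convention, sketched here with hypothetical names and with return values added only so the functions can be checked, is to encode the distinction in the identifier by hand.

```c
#include <assert.h>

/* Hypothetical names: the hand-written suffix plays the role that the
   encoded parameter list plays in a mangled C++ symbol. */
int foo_void(void)
{
    return 0;
}

int foo_int(int x)
{
    return x;
}

double foo_int_double(int x, double y)
{
    return x + y;
}

/* Small self-check for the sketch. */
void check_foo(void)
{
    assert(foo_void() == 0);
    assert(foo_int(7) == 7);
    assert(foo_int_double(1, 0.5) == 1.5);
}
```

On a typical ELF system the symbols in the object file would be expected to match these identifiers exactly; other platforms may add a leading underscore.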

Static variable with the same name in different file [duplicate]

I have tried compiling and running code in which I defined static variables with the same name in two different source files. The code compiled successfully and ran.
Now my question is this: both static variables reside in the .data/BSS section in memory. As per my understanding, two different memory locations must have separate, unique name identifiers. Why was this not a problem in this case?

"As per my understanding two different memory locations must have a separate unique name identifier." - it is not clear what you mean by "memory locations" in this case. Memory locations have addresses, not names. If by "memory locations" you mean "individual variables", then the above statement only applies to variables with external linkage. Variables with external linkage need externally visible names. Variables with internal linkage (static variables) don't.
In a typical implementation all static symbols are resolved internally by the compiler, at the compilation stage. They do not produce external names in object files, i.e. they are not exposed to the linker at all. In the simplest case all static variables from the same translation unit are seen by the linker as a single blob of data.
By the time different translation units are brought together for linking, all names of static variables are no longer necessary. By that time they are long forgotten. Which is why naming conflicts do not have a chance to occur.
P.S. In the C++ language, inline functions with external linkage are allowed to define static variables inside. To provide proper functionality, compilers typically assign external names to such static variables. The C language, which also supports inline functions, decided to deal with this matter differently: in C, an inline definition of a function with external linkage is simply prohibited from containing a definition of a modifiable static variable.
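The C restriction is aimed at inline definitions of functions with external linkage. A function declared static inline has internal linkage, so it may still contain a static local, as in this minimal sketch (bump and bump_twice are hypothetical names). If such a function lived in a header, each translation unit including it would get its own independent copy of the variable.

```c
#include <assert.h>

/* static inline: internal linkage, so a static local is permitted. */
static inline int bump(void)
{
    static int calls = 0;   /* private to this translation unit */
    return ++calls;
}

int bump_twice(void)
{
    bump();
    return bump();
}

/* Small self-check for the sketch. */
void check_bump(void)
{
    assert(bump_twice() == 2);
    assert(bump() == 3);
}
```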

Change default from Extern to Static

I always forget to add the 'static' prefix to my variables and functions, and so GCC marks them as extern. Is it possible to change this behaviour so that it marks everything static by default? And is there a performance difference between the two types at runtime, or is it more a formality?

"Is it possible to change this behaviour so that it marks everything static by default."
Not to my knowledge.
"And is there a performance difference between the two types at runtime, or is it more a formality?"
Yes, gcc is able to perform further optimizations when objects or functions are declared static. For example, when optimization is enabled (-O1 and above), gcc will usually inline a static function that is called only once; at -O0 it generally does not inline.

First of all: The extern modifier is not default. That qualifier indicates that the item mentioned will be defined in another compilation unit, so it's only appropriate for declaring things like global variables.
There is no way to make the static modifier default, because there is no dynamic modifier which would cancel out this default. As such, there'd be no way to write working code with that default in place: every function and variable would be static, which would cause the compiler to generate an empty output file!
"Is there a performance difference between the two types at runtime, or is it more a formality?"
The compiler can perform some optimizations on static functions and variables which cannot be performed on dynamic ones. In particular, static functions and variables which are never referenced may be dropped entirely, and static functions can be inlined more aggressively.
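A small sketch of the kind of code where this matters; scale and scaled_sum are hypothetical names. Whether the helper is actually inlined or dropped depends on the compiler and the optimization flags, so the comments describe what the compiler is allowed to do, not what it is guaranteed to do.

```c
#include <assert.h>

/* Internal linkage and a single call site: the compiler can see every
   use, so it is free to inline this and emit no separate function. */
static int scale(int value)
{
    return value * 3;
}

/* External linkage: must remain callable from other translation units,
   so a standalone copy has to be kept in the object file. */
int scaled_sum(int a, int b)
{
    return scale(a + b);
}

/* Small self-check for the sketch. */
void check_scaled_sum(void)
{
    assert(scaled_sum(1, 2) == 9);
}
```

With GCC, comparing the output of nm on the object file at -O0 and at -O2 is one way to see whether a local symbol for scale survives.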
