Multiple definitions of a global variable - c

I stumbled upon this weird behaviour when compiling C programs using gcc.
Let's say we have these two simple source files:
fun.c
#include <stdio.h>
// int var = 10; Results in Gcc compiler error if global variable is initialized
int var;
void fun(void) {
printf("Fun: %d\n", var);
}
main.c
#include <stdio.h>
int var = 10;
int main(void) {
fun();
printf("Main: %d\n", var);
}
Surprisingly, when compiled as gcc main.c fun.c -o main.out this doesn't produce multiple definition linker error.
One would expect multiple definition linker error to occur just the same, regardless of global variable initialization.
Does it mean that compiler makes uninitialized global variables extern by default?

A global variable can have any number of declarations, but only one definition. An initializer makes it a definition, so it will complain about having two of those (even if they agree).

Related

At what point during compilation or linking of C code are extern variables implicitly defined?

If I have a project with the following 3 files in the same directory:
mylib.h:
int some_global;
void set_some_global(int value);
mylib.c:
#include "mylib.h"
void set_some_global(int value)
{
some_global = value;
}
main.c:
#include <stdio.h>
#include "mylib.h"
int main()
{
set_some_global(42);
printf("Some global: %d\n", some_global);
return 0;
}
and I compile with
gcc main.c mylib.c -o prog -Wall -Wpedantic
I get no errors or warnings, and the prog program prints 42 to the console.
When I first tried this, I expected there to be a "multiple definition" error or some kind of warning since some_global is not declared extern in the header file. Upon researching this issue, I discovered that in C the extern is implicit on variable declarations outside of functions (and also that the opposite is true for C++, which can be demonstrated by using g++ instead of gcc in the compilation line above). Also, if I change the line in mylib.h from a declaration to a definition (e.g. int some_global = 1;), I do get the "multiple definition" error that I expected (this is nothing shocking).
My main question is: where is the variable being defined? It appears to be implicitly defined somewhere, but at what point does either the compiler or linker realize it needs that variable defined and does so?
Also, why is it that if I explicitly declare the variable as extern in the mylib.h file, I get "undefined reference" errors unless I explicitly declare the variable in one and only one *.c? I would expect that given the reason why the code above works (that extern is implicit), that explicitly declaring extern wouldn't make a difference. Why is there a difference in behavior?
Follow up
After the answer below corrected me that the code in mylib.h is a "tentative definition" rather than a declaration, I discovered this related answer with more details on such matters:
https://stackoverflow.com/a/3095957/7007605
Your code compiles and links without error only because you use gcc which was compiled with -fcommon command line option "The -fcommon places uninitialized global variables in a common block. This allows the linker to resolve all tentative definitions of the same variable in different compilation units to the same object, or to a non-tentative definition. (...) It is mainly useful to enable legacy code to link without errors." This was default prior to version 10, but even now many toolchains are still build with this option enabled.
Never define data in the header files. Place only extern definitions of the variables in the header files.
It should be:
extern int some_global;
void set_some_global(int value);
mylib.c:
#include "mylib.h"
int some_global;
void set_some_global(int value)
{
some_global = value;
}
main.c:
#include <stdio.h>
#include "mylib.h"
int main()
{
set_some_global(42);
printf("Some global: %d\n", some_global);
return 0;
}
int some_global; is a tentative definition. In GCC before version 10, GCC produced an object file treating this as a common symbol. (This behavior is still selectable by a switch, -fcommon.) The linker coalesces multiple definitions of a common symbol to a single definition.

Use of modifier static in c

I'm a beginner learning c. I know that use of word "static" makes a c function and variable local to the source file it's declared in. But consider the following...
test.h
static int n = 2;
static void f(){
printf("%d", n);
}
main.c
#include <stdio.h>
#include "test.h"
int main()
{
printf("%d", n);
f();
return 0;
}
My expected result was that an error message will throw up, since the function f and variable n is local to test.h only? Thanks.
But instead, the output was
2
2
EDIT:
If it only works for a compilation unit, what does that mean? And how do I use static the way I intended to?
static makes your function/variable local to the compilation unit, ie the whole set of source code that is read when you compile a single .c file.
#includeing a .h file is a bit like copy/paste-ing the content of this header file in your .c file. Thus, n and f in your example are considered local to your main.c compilation unit.
Example
module.h
#ifndef MODULE_H
#define MODULE_H
int fnct(void);
#endif /* MODULE_H */
module.c
#include "module.h"
static
int
detail(void)
{
return 2;
}
int
fnct(void)
{
return 3+detail();
}
main.c
#include <stdio.h>
#include "module.h"
int
main(void)
{
printf("fnct() gives %d\n", fnct());
/* printf("detail() gives %d\n", detail()); */
/* detail cannot be called because:
. it was not declared
(rejected at compilation, or at least a warning)
. even if it were, it is static to the module.c compilation unit
(rejected at link)
*/
return 0;
}
build (compile each .c then link)
gcc -c module.c
gcc -c main.c
gcc -o prog module.o main.o
You have included test.h in main.c.
Therefore static int n and static void f() will be visible inside main.c also.
When a variable or function is declared at file scope (not inside any other { } brace pair), and they are declared static, they are local to the translation unit they reside in.
Translation unit is a formal term in C and it's slightly different from a file. A translation unit is a single c file and all the h files it includes.
So in your case, the static variable is local to the translation unit consisting of test.h and main.c. You will be able to access it in main.c, but not in foo.c.
Meaning that if you have another .c file including test.h, you'll get two instances of the same variable, with the same name. That in turn can lead to all manner of crazy bugs.
This is one of many reasons why we never define variables inside header files.
(To avoid spaghetti program design, we should not declare variables in headers either, unless they are const qualified.)

Why do I need to declare this function extern. It works without it

I am new to the concept of extern. Today at work I came across a large number of extern functions that were declared inside of header file; foo.h. Somewhere off in a mess of folders I found a foo.c file that contained the definition of said functions but it did not #include foo.h. When I got home I decided to play around with the extern storage class examples. After doing some reading in the "C book" this is what I came up with.
Here is what I DID NOT expect to work. But it did.
main.c
#include <stdio.h>
int data;
int main()
{
data = 6;
printf("%d\n", getData());
data = 8;
printf("%d\n", getData());
return 0;
}
externs.c
int getData()
{
extern int data;
return data;
}
bash
gcc main.c externs.c -o externs
I didn't think this would work because the function is not technically (or at least verbose) defined before main. Does this work because the default storage class for int getData() is extern? If so, why even bother with the following example (similar to what I saw at work)?
main2.c
#include <stdio.h>
#include "externs.h"
int data;
int main()
{
data = 6;
printf("%d\n", getData());
data = 8;
printf("%d\n", getData());
return 0;
}
externs.h
#ifndef _EXTERNSH_
#define _EXTERNSH_
extern int getData();
#endif
externs.c
int getData()
{
extern int data;
return data;
}
bash
gcc main2.c externs.c -o externs
Functions are implicitly extern. You can change this by declaring them as static.
You are right that it is redundant to say extern int getData(); compared to int getData(); but some people feel that it improves code clarity.
Note that since C99, your first example is ill-formed. It calls getData without that function having been declared. In C89 this performed an implicit int getData(); declaration, but since C99 that was removed. Use the compiler switches -std=c99 -pedantic to test this.
By default, GCC version 4.x or earlier compiles in a C89/C90 compatibility mode, which does not require functions to be declared before use. Undeclared functions are inferred to have a return type of int. By default, GCC 5.x or later compiles in C11 compatibility mode, which does require functions to be declared before use — as does C99.
Given that the code compiled without complaints, we can infer that you are using GCC 4.x.
Even so, you are not obliged to provide full prototypes, though you would be foolish not to. I routinely compile code with the options:
gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes \
-Wold-style-definition -Wold-style-declaration …
I suspect that there is some redundancy in there, but those options catch many problems before I get around to running the code.

Duplicate symbol in C using Clang

I am currently working on my first "serious" C project, a 16-bit vm. When I split up the files form one big source file into multiple source files, the linker (whether invoked through clang, gcc, cc, or ld) spits out a the error:
ld: duplicate symbol _registers in register.o and main.o for inferred
architecture x86_64
There is no declaration of registers anywhere in the main file. It is a uint16_t array if that helps. I am on Mac OS 10.7.3 using the built in compilers (not GNU gcc). Any help?
It sounds like you've defined a variable in a header then included that in two different source files.
First you have to understand the distinction between declaring something (declaring that it exists somewhere) and defining it (actually creating it). Let's say you have the following files:
header.h:
void printIt(void); // a declaration.
int xyzzy; // a definition.
main.c:
#include "header.h"
int main (void) {
xyzzy = 42;
printIt();
return 0;
}
other.c:
#include <stdio.h>
#include "header.h"
void printIt (void) { // a definition.
printf ("%d\n", xyzzy);
}
When you compile the C programs, each of the resultant object files will get a variable called xyzzy since you effectively defined it in both by including the header. That means when the linker tries to combine the two objects, it runs into a problem with multiple definitions.
The solution is to declare things in header files and define them in C files, such as with:
header.h:
void printIt(void); // a declaration.
extern int xyzzy; // a declaration.
main.c:
#include "header.h"
int xyzzy; // a definition.
int main (void) {
xyzzy = 42;
printIt();
return 0;
}
other.c:
#include <stdio.h>
#include "header.h"
void printIt (void) { // a definition.
printf ("%d\n", xyzzy);
}
That way, other.c knows that xyzzy exists, but only main.c creates it.

warning in extern declaration

I declared a variable i in temp2.h
extern i; which contains just one above line
and made another file
temp3.c
#include<stdio.h>
#include<temp2.h>
int main ()
{
extern i;
i=6;
printf("The i is %d",i);
}
When I compiled above as
cc -I ./ temp3.c I got following errors
/tmp/ccJcwZyy.o: In function `main':
temp3.c:(.text+0x6): undefined reference to `i'
temp3.c:(.text+0x10): undefined reference to `i'
collect2: ld returned 1 exit status
I had declared extern in temp3.c above as K R page 33 says as I mentioned in above post.
I tried another way for temp3.c with same header file temp2.h
#include<stdio.h>
#include<temp2.h>
int main ()
{
i=6;
printf("The i is %d",i);
}
and compiled it cc -I ./ temp3.c and got following error
/tmp/ccZZyGsL.o: In function `main':
temp3.c:(.text+0x6): undefined reference to `i'
temp3.c:(.text+0x10): undefined reference to `i'
collect2: ld returned 1 exit status
I also tried
#include<stdio.h>
#include<temp2.h>
int main ()
{
extern i=6;
printf("The i is %d",i);
}
compiled this one
cc -I ./ temp3.c
got same error as in post 1
temp3.c: In function ‘main’:
temp3.c:5: error: ‘i’ has both ‘extern’ and initializer
So I have tried at least 3 different ways to use extern but non of them worked.
When you declare a variable using extern , you are telling the compiler that the variable was defined elsewhere and the definition will be provided at the time of linking. Inclusion is a different thing altogether.
extern
An external variable must be defined, exactly once, outside of any function; this sets aside storage for it. The variable must also be declared in each function that wants to access it; this states the type of the variable. The declaration may be an explicit extern statement or may be implicit from context.
-The C Programming Language
A variable must be defined once in one of the modules(in one of the Translation Units) of the program. If there is no definition or more than one, an error is produced, possibly in the linking stage (as in example 1 and 2).
Try something like the following
a.c
int i =10; //definition
b.c
extern int i; //declaration
int main()
{
printf("%d",i);
}
Compile, link and create an executable using
gcc a.c b.c -o executable_name
or
gcc -c a.c // creates a.o
gcc -c b.c // creates b.o
Now link the object files and create an executable
gcc a.o b.o -o executable_name
extern is declaration mechanism used to tell the compiler that the variable is defined in another file.
My Suggestion is that you define a variable in a ".c" file and then add the extern declaration in the corresponding ".h" file. In this way, the variable declaration will be available to all the source files which includes this header file and also it will be easier for one to identify in which ".c" it is actually defined.
The first program said:
The variable i (implicitly of type int) is defined somewhere else - but you didn't define it anywhere.
The second program tried to use a variable for which there was no declaration at all.
The third program tried to declare a variable without an explicit type (used to be OK; not allowed in C99), and said:
The variable i is defined somewhere else, but I want to initialize it here.
You are not allowed to do that.
So, the compiler is correct in all cases.
The declaration in the header 'temp2.h' should be fixed to extern int i; first. The implicit int is long obsolete.
You could fix the first example in several ways:
#include <stdio.h>
#include <temp2.h>
int main()
{
extern int i;
i=6;
printf("The i is %d",i);
return 0;
}
int i;
This defines the variable after the function - aconventional but legitimate. It could alternatively be in a separate source file that is separately compiled and then linked with the main program.
The second example could be fixed with:
#include <stdio.h>
#include <temp2.h>
int main()
{
int i=6;
printf("The i is %d",i);
return 0;
}
It is important to note that this example now has two variables called i; the one declared in temp2.h (but not actually referenced anywhere), and the one defined in main(). Another way of fixing it is the same as in the first possible fix:
#include <stdio.h>
#include <temp2.h>
int main()
{
i=6;
printf("The i is %d",i);
return 0;
}
int i;
Again, aconventional placement, but legitimate.
The third one can be fixed by similar methods to the first two, or this variant:
#include <stdio.h>
#include <temp2.h>
int i;
int main()
{
extern int i;
i=6;
printf("The i is %d",i);
return 0;
}
This still declares i in <temp2.h> but defines it in the source file containing main() (conventionally placed - at the top of the file). The extern int i; insides main() is doubly redundant - the definition in the source file and the declaration in the header mean that it does not need to be redeclared inside main(). Note that in general, the declaration in the header and the definition in the source file are not redundant; the declaration in the header provides a consistency check with the definition, and the other files that also use the variable and the header are then assured that the definition in the file containing main() is equivalent to the definition that the other file is using too.

Resources