Linker doesn't show any error, weird - C

Suppose I have two C source files, A1.C and A2.C, with these contents:
A1.C
#include <stdio.h>

int x;

int main() {
    void f(void);
    x = 5;
    f();
    printf("%d", x);
    return 0;
}
A2.C
int x;

void f() { x = 4; }
The linker doesn't give me any errors despite the missing "extern" keyword. I have two identical symbols. Can someone explain why?

For gcc, you can use the -fno-common flag to turn this into an error.
The GCC documentation explains what's happening:

-fno-common
In C code, controls the placement of uninitialized global variables. Unix C compilers have traditionally permitted multiple definitions of such variables in different compilation units by placing the variables in a common block. This is the behavior specified by -fcommon, and is the default for GCC on most targets. On the other hand, this behavior is not required by ISO C, and on some targets may carry a speed or code size penalty on variable references. The -fno-common option specifies that the compiler should place uninitialized global variables in the data section of the object file, rather than generating them as common blocks. This has the effect that if the same variable is declared (without extern) in two different compilations, you get a multiple-definition error when you link them.
See also Tentative definitions in C99 and linking
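For example, compiling the question's two files with that flag fails at link time (a sketch; the exact message wording varies by GCC/binutils version):

$ gcc -fno-common A1.c A2.c
/usr/bin/ld: multiple definition of `x'; first defined here
collect2: error: ld returned 1 exit status

Since GCC 10, -fno-common is the default, so recent compilers reject the program without any extra flags.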

Related

Share a variable between two files in C

I'm trying to understand how to link two files that share a variable. I know that the standard way is through the extern keyword, as shown below.
File 1
#include <stdio.h>

int i;
void func1();

int main()
{
    i = 10;
    func1();
    printf("i value with method 1: %d\n", i);
}
File 2
extern int i;

void func1()
{
    i = i + 1;
}
Compile and execute => gcc *.c && ./a.out
Output => i value with method 1: 11
Another way I found is through headers, as shown below:
common.h
#ifndef __COMMON_H
#define __COMMON_H
int i;
#endif
File 1
#include <stdio.h>
#include "common.h"

void func1();

int main()
{
    i = 10;
    func1();
    printf("i value with method 2: %d\n", i);
}
File 2
#include "common.h"

void func1()
{
    i = i + 1;
}
Compile and execute => gcc *.c && ./a.out
Output => i value with method 2: 11
My doubt is: how does method 2 work?
At file scope, int i; is a special kind of declaration called a tentative definition. In spite of its name, it is not a definition. However, it can cause a definition to be created.
A clean way to declare and define an object that is used in multiple translation units is to declare it with extern in a header that is included in each unit that uses the object by name:
extern int i;
and to define the object in one translation unit:
int i = 0;
At file scope, int i = 0; is a definition; the initialization with = 0 makes it a regular definition instead of a tentative definition.
Ideally, all source code would use clean declarations and definitions. However, C was not completely planned and designed in advance. It developed through experiments and different people in different places implementing things differently. When the C committee standardized C, they had to deal with different practices and implementations. One common practice was that of declarations such as int i; in multiple units that were intended to create a single i. (This behavior was inherited from FORTRAN, which had common objects as a similar feature.)
To accommodate this, the committee described int i; at file scope as a special kind of declaration, a tentative definition. If there is a regular definition in the same translation unit that defines the same identifier, the tentative definition acts as a plain declaration, not a definition. If there is no regular definition, the compiler (or other part of C implementation) creates a definition for the identifier as if it had been initialized with zero.
The C standard leaves reconciling of multiple tentative definitions to each C implementation; it does not define the behavior when int i; is used in multiple translation units. Prior to version 10, the default behavior of GCC was to use the “common symbol” behavior; multiple tentative definitions would be reconciled to a single definition when linking. (To support this, the compiler marks tentative definitions differently from regular definitions when creating object modules, so the linker knows which is which.) In version 10, the default changed, and GCC now treats the definitions resulting from tentative definitions as regular symbols instead of common symbols.
This is why you will see some people report they get an error when linking sources with tentative definitions while you and others do not. It is simply a matter of which version of which compiler and linker they used.
You can explicitly request either behavior with the GCC switch -fcommon for the common symbol behavior or -fno-common for the regular symbol behavior.
Generally, you should use the clean method above: declare identifiers with extern in headers, and put exactly one definition of each identifier in one source file.
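As a concrete sketch of that clean method (hypothetical file names):

shared.h
#ifndef SHARED_H
#define SHARED_H

extern int i;     /* declaration only: no storage is allocated here */
void func1(void);

#endif

file1.c
#include <stdio.h>
#include "shared.h"

int i = 0;        /* the one and only definition */

int main(void)
{
    i = 10;
    func1();
    printf("i value: %d\n", i);
}

file2.c
#include "shared.h"

void func1(void)
{
    i = i + 1;
}

This links cleanly under both -fcommon and -fno-common, because there is exactly one definition of i.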

Why is there no need to add `extern` for external functions?

Below is my code:
//main.c
//I'm not using a header file here. I know it is bad practice; it is just for demo purposes.
int main()
{
    func();
    return 0;
}
//test.c
void func()
{
    ...
}
We can see that the above code compiles and can be linked by the linker, but the same thing doesn't apply to variables:
//main.c
int main()
{
    sum += 1;
    return 0;
}
//test.c
int sum = 2020;
This code won't compile and can't be linked; we have to add extern int sum; before the main function in main.c.
But why don't we need to add extern in main.c, as in:
//main.c
extern void func(); // or `void func();` since functions are by default external
// without the above line, it still compiles
int main()
{
    func();
    return 0;
}
Isn't it a little bit inconsistent?
Note: by saying "functions are by default external", my understanding is that we can save some keystrokes by not typing extern, so void func(); == extern void func();, but we still need to add void func(); before the main function in main.c, don't we?
Both programs are incorrect since C99 and may be rejected by the compiler. An identifier may not be used in an expression without previously being declared.
In C89 there was a rule that if you write something that resembles a function call, and the function name has not previously been declared, then the compiler inserts a function declaration int f();. There was no similar rule for uses of other identifiers that aren't followed by parentheses.
Some compilers (depending on compiler flags) will, even if set to C99 or later mode, issue a diagnostic and then perform the C89 behaviour anyway.
Note: your program still causes undefined behaviour in C89, because the implicit declaration is int func(); but the function definition has void func(), which is an incompatible type.
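For example, a hypothetical session with a reasonably recent gcc in C99 mode (message wording varies by version; since GCC 14 this is an error by default):

$ gcc -std=c99 main.c test.c
main.c: In function 'main':
main.c: warning: implicit declaration of function 'func' [-Wimplicit-function-declaration]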
The compiler doesn't need to know anything about a function in order to generate code to call it. In the absence of a prototype it might generate the wrong code, but it can generate something (in principle, at least; standards compliance might forbid it by default). The compiler knows the calling convention for the platform: it knows to put the function arguments onto the stack or into registers as required, to emit a symbol that the linker can later find and fix up, and so on.
But when you write sum++, the compiler has no clue, lacking a declaration, how to generate code for that. It doesn't even know what kind of thing sum is. The code needed to increment a floating-point number is completely different from that needed to increment an integer, and may be different again from that needed to increment a pointer. The compiler doesn't need to know where sum is (that's the linker's job), but it does need to know what it is in order to produce meaningful machine code.
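To see the asymmetry in one fragment (a hypothetical example in C89 style, not recommended practice):

int main(void)
{
    func();     /* C89: implicitly declared as int func(); this compiles */
    sum += 1;   /* error: 'sum' undeclared; no implicit rule for objects */
    return 0;
}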
But we don't need to add extern for the function in main.c, as extern void func(); or void func(); (functions are implicitly extern), and the code still compiles?
That's correct. Functions are by default external.
To make a function specific to a local source file (translation unit), you need to specify static for it.
Variables at file scope also default to external linkage, but to use one you need a declaration in scope. Writing int sum; in main.c would create another (tentative) definition rather than refer to the one in test.c, so you need extern int sum; to declare the variable without defining it.
There are two completely different topics here: function prototypes and linkage.
void foo(void);
provides the extern function prototype, needed by the compiler to know the number and types of the parameters and the type of the return value. The function has external linkage, i.e. it can be accessed by other compilation units.
static void foo(void);
provides the static function prototype. The function has internal linkage, i.e. it cannot be accessed by other compilation units.
By default, functions have external linkage.
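A minimal two-file sketch of the difference (hypothetical names):

/* util.c */
static void helper(void) { }  /* internal linkage: invisible to other units */
void api(void) { helper(); }  /* external linkage: callable from other units */

/* main.c */
void api(void);  /* prototype; extern is implied for functions */

int main(void)
{
    api();       /* fine: api has external linkage */
    /* helper(); would not link: helper is internal to util.c */
    return 0;
}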
For objects at file scope:
int x;
defines the object x with external linkage and type int. If you define another x in another compilation unit, the linker will complain and emit an error (by default since GCC 10; older GCCs merge them as common symbols, as described above).
extern int x;
only declares the object x without defining it. The object x has to be defined in some other compilation unit.

Why is this statement producing a linker error with gcc?

I have this extremely trivial piece of C code:
static int arr[];

int main(void) {
    *arr = 4;
    return 0;
}
I understand that the first declaration is illegal (I've declared a file-scope array with static storage duration and internal linkage but no specified size), but why does it result in a linker error?
/usr/bin/ld: /tmp/cch9lPwA.o: in function `main':
unit.c:(.text+0xd): undefined reference to `arr'
collect2: error: ld returned 1 exit status
Shouldn't the compiler be able to catch this before the linker?
It is also strange to me that, if I omit the static storage class, the compiler simply assumes the array has one element and produces no error beyond that:
int arr[];

int main(void) {
    *arr = 4;
    return 0;
}
Results in:
unit.c:5:5: warning: array 'arr' assumed to have one element
 int arr[];
Why does omitting the storage class result in different behavior here, and why does the first piece of code produce a linker error? Thanks.
Empty arrays static int arr[]; and zero-length arrays static int arr[0]; are gcc non-standard extensions.
The intention of these extensions was to act as a fix for the old "struct hack". Back in the C90 days, people wrote code such as this:
typedef struct
{
    header stuff;
    ...
    int data[1]; // the "struct hack"
} protocol;
where data would then be used as if it had a variable size beyond one element, depending on what's in the header part. Such code was buggy: it wrote data to padding bytes and invoked array out-of-bounds undefined behavior in general.
gcc fixed this problem by adding empty/zero arrays as a compiler extension, making the code behave without bugs, although it was no longer portable.
The C standard committee recognized that this gcc feature was useful, so they added flexible array members to the C language in 1999. Since then, the gcc feature is to be regarded as obsolete: the C standard flexible array member is preferred.
As recognized by the linked gcc documentation:
Declaring zero-length arrays in other contexts, including as interior members of structure objects or as non-member objects, is discouraged.
And this is what your code does.
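For comparison, the standard C99 replacement looks like this (a minimal sketch with hypothetical member names):

#include <stdlib.h>

typedef struct
{
    size_t len;   /* header stuff */
    int data[];   /* flexible array member, C99 and later */
} protocol;

/* allocate the header plus room for n trailing ints */
protocol *make_protocol(size_t n)
{
    protocol *p = malloc(sizeof *p + n * sizeof p->data[0]);
    if (p)
        p->len = n;
    return p;
}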
Note that gcc with no compiler options passed defaults to -std=gnu90 (before gcc 5.0) or -std=gnu11 (gcc 5.0 and later). This gives you all the non-standard extensions enabled, so the program compiles but does not link.
If you want standard-compliant behavior, you must compile with
gcc -std=c11 -pedantic-errors
The -pedantic-errors flag makes gcc diagnose uses of such extensions as errors, so the linker error turns into a compiler error, as expected. For an empty array as in your case, you get:
error: array size missing in 'arr'
And for a zero-length array you get:
error: ISO C forbids zero-size array 'arr' [-Wpedantic]
The reason why int arr[] works is that it is a tentative definition with external linkage (see C17 6.9.2). It is valid C and can be regarded as a forward declaration. It means that elsewhere in the code, the compiler (or rather the linker) should expect to find, for example, int arr[10], which then refers to the same variable. This way, arr can be used in the code before its size is known. (I wouldn't recommend using this language feature, as it is a form of "spaghetti programming".)
When you use static, you block the possibility of having the array size specified elsewhere, by forcing the variable to have internal linkage instead.
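To illustrate what the tentative form permits, here is a single-file sketch (hypothetical code):

int arr[];                           /* tentative definition, size not yet known */

int first(void) { return arr[0]; }   /* usable already: arr decays to int *      */

int arr[10] = { 4 };                 /* a regular definition completes the type  */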
Maybe one reason for this behavior is that the compiler treats the incomplete static array as a variable that never receives a proper definition, so it never allocates storage for it, and the linker then complains about the unresolved symbol.
If it is not static, it cannot simply be ignored, because other modules might reference it; so the linker can at least find that symbol arr.

Is it valid to treat an extern global as const when the definition is not const? [duplicate]

This question already has answers here:
C -- Accessing a non-const through const declaration (5 answers)
Closed 8 years ago.
Say I have a compilation unit file1.c, which declares a file-scope variable like so:
int my_variable = 12;
Then, in another compilation unit file2.c, I create an extern declaration for that variable, but declare it as const:
extern const int my_variable;
This will compile and work fine with gcc, using -Wall -Wextra -ansi -pedantic. However, the C89 standard says: "For two qualified types to be compatible, both shall have the identically qualified version of a compatible type." Adding const to the declaration adds a restriction rather than avoiding one. Is this safe and valid C? What would be the best practice in setting this up with header files?
It's clearly undefined as the declarations don't match. As you noted, const int and int aren't compatible types. A diagnostic is required only if they appear in the same scope.
It isn't safe in practice either; consider:
$ cat test1.c
#include <stdio.h>

extern const int n;
void foo(void);

int main(void) {
    printf("%d\n", n);
    foo();
    printf("%d\n", n);
}
$ cat test2.c
int n;

void foo(void) { ++n; }
$ gcc -std=c99 -pedantic test1.c test2.c && ./a.out
0
1
$ gcc -O1 -std=c99 -pedantic test1.c test2.c && ./a.out
0
0
Gcc assumes that n isn't changed by foo() when optimizing, because it may assume the definition of n is of a compatible type, thus const.
Chances are that you would get the expected behaviour by also volatile-qualifying n in test1.c, but as far as the C standard is concerned, this is still undefined.
The best way I can think of to prevent the user from accidentally modifying n is to declare a pointer to const, something along the lines of
int my_real_variable;
const int *const my_variable = &my_real_variable;
or perhaps some macro
#define my_variable (*(const int *)&my_variable)
With C99, my_real_variable can be avoided via a compound literal:
const int *const my_variable_ptr = &(int){ 12 };
It would be legal to cast away const here (as the int object itself isn't const), but the cast would be required, preventing accidental modification.
In this case the definition and declaration appear in separate translation units, so the compiler cannot perform any type or qualifier checks. The symbols are resolved by the linker and in this case it seems that the linker is not enforcing this qualifier matching.
If the definition and the declaration appeared in the same translation unit; for example if you placed the extern declaration in a header file and included it in file1.c, then I would imagine that the compiler would complain. By placing them in separate translation units, the compiler never sees both so cannot perform the check.
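A quick way to see the compiler catch it (a sketch with hypothetical names):

/* shared.h */
extern int my_variable;        /* the one authoritative declaration */

/* file2.c */
#include "shared.h"
extern const int my_variable;  /* gcc: error: conflicting type qualifiers for 'my_variable' */

Because both declarations now appear in the same translation unit, the compiler is required to diagnose the mismatch.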

extern function during linkage?

I have this weird thing:
In file1.c there's:
extern void foo(int x, int y);
...
int tmp = foo(1,2);
In the project I could find only this foo():
In file2.c:
int foo(int x, int y, int z)
{
    ....
}
In file2.h:
int foo(int x, int y, int z);
file2.h isn't included in file1.c (this is why whoever wrote it used extern, I guess).
This project compiles fine. I think that's because foo() in file1.c will be looked up only during linking; am I right?
But my real question is: why is the linking successful?
After all, there is no function foo with two parameters,
and I'm in C, so there's no overloading.
So what's going on?
Because there is no overloading, the C compiler does not decorate the function names. The linker finds in file1.c a reference to a function foo, and in file2.c it finds a function foo. It cannot know that their parameter lists do not match, and happily pairs them up.
Of course, when the function foo runs, the value of z is garbage and the behavior of the program becomes unpredictable from that point on.
Calling a function with the wrong number (or types) of arguments is an error.
The standard requires the implementation to detect some, but not all of them.
What the standard calls an implementation is typically a compiler with a separate linker (and some other things), where the compiler translates single translation units (that is, preprocessed source files) into object files, which later get linked together.
While the standard doesn't distinguish between them, its authors of course wrote it with this typical setup in mind.
C11 (n1570) 6.5.2.2 "Function calls", p2:
If the expression that denotes the called function has a type that includes a prototype, the number of arguments shall agree with the number of parameters. Each argument shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter.
This is in a "constraints" section, which means the implementation (in this case, the compiler) must complain and may abort translation if a "shall" requirement is violated.
In your case, there was a prototype visible, so the arguments of the function call must match that prototype.
Similar requirements apply for a function definition with a prototype declaration in scope; if your function definition doesn't match the prototype, your compiler must tell you. In other words, as long as you ensure that all calls to a function and that function's definition are in the scope of the same prototype, you are told if there is a mismatch. This can be ensured if the prototype is in a header file which is included by all files with calls to that function and by the file containing its definition. We use header files with prototypes exactly for that reason.
In the code shown, this checking is by-passed by providing a non-matching prototype and not including the header file2.h.
Ibid. p9:
If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined.
Undefined behaviour means, the compiler is free to assume it doesn't happen, and is not required to detect if it does.
And in fact, on my machine, the generated object files from file2.c (I inserted a return 0; to have some function body) don't differ if I remove one of the function arguments. This means the object file doesn't contain any information about the arguments, and thus a compiler seeing only file2.o and file1.c hasn't got any chance to detect the violation.
You've mentioned overloading, so let's compile file2.c (with two and three arguments) as C++ and look at the object files:
$ g++ file2_three_args.cpp -c
$ g++ file2_two_args.cpp -c
$ nm file2_three_args.o
00000000 T _Z3fooiii
$ nm file2_two_args.o
00000000 T _Z3fooii
Function foo has its arguments incorporated into the symbol created for it (a process called name mangling), so the object file indeed carries some information about the function types. Accordingly, we get an error at link time:
$ cat file1.cpp
extern void foo(int x, int y);
int main(void) {
foo(1,2);
}
$ g++ file2_three_args.o file1.cpp
In function `main':
file1.cpp:(.text+0x19): undefined reference to `foo(int, int)'
collect2: error: ld returned 1 exit status
This behaviour would also be allowed for a C implementation; aborting translation is a valid manifestation of undefined behaviour at compile or link time.
The way overloading in C++ is usually done actually allows such checks at link time. The facts that C doesn't have built-in support for function overloading, and that the behaviour is undefined in the cases where the compiler cannot see the type mismatch, allow symbols for functions to be generated without any type information.
First of all
extern void foo(int x, int y);
means exactly the same thing as
void foo(int x, int y);
The former is just an overly explicit way to write the same thing; extern fills no other purpose here. It is like writing auto int x; instead of int x: it means the very same thing.
In your case, the "foo" module (which you call file2) contains the function prototype as well as the definition. This is proper program design in C. What file1.c should be doing is to #include foo.h.
For reasons unknown, whoever wrote file1.c didn't do this. Instead they are just saying "elsewhere in the project there is this function; do not care about its definition, that's handled elsewhere".
This is bad programming practice. file1.c shouldn't concern itself with how things are defined elsewhere: this is spaghetti programming, which creates needless tight coupling between the caller and the module. There is also the chance that the actual function doesn't match the local prototype, in which case you would hopefully get linker errors. But there are no guarantees, as this question shows.
The code must be fixed like this:
file1.c
#include "foo.h"
...
int tmp = foo(1,2);  /* now diagnosed by the compiler: too few arguments */
foo.h
#ifndef FOO_H
#define FOO_H

int foo(int x, int y, int z);

#endif
foo.c
#include "foo.h"

int foo(int x, int y, int z)
{
    ....
}
