Compiler warning not generated for multiple definitions - c

The problem I am facing: a function with the same signature is defined in two .c files, yet this does not produce a compile-time error. I have put the declaration in a .h file, which is included in both .c files.
For example:
int add(int x, int y) { return x+y;}
The same definition is given in two .c files (say A.c and B.c), and the declaration is in one .h file which is included in both A.c and B.c. Why does this not produce a compile-time error, and how can I make it produce one?
Even the linker is not giving any error; it looks like it is taking the first definition.
I am using the MinGW GCC compiler.
I found another pattern in this. If I use an include guard in the header file:
#ifndef H_H_
#define H_H_
...
#endif
the linker gives no warning, but if I don't use it, the linker gives the warning, which is what I expect.
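For reference, here is a minimal layout that reproduces what I describe (file contents are illustrative):

/* h.h */
#ifndef H_H_
#define H_H_
int add(int x, int y);
#endif

/* A.c */
#include "h.h"
int add(int x, int y) { return x + y; }

/* B.c */
#include "h.h"
int add(int x, int y) { return x + y; }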

This situation is undefined behaviour with no diagnostic required.
Consult your linker's documentation to see if it has any options to report multiple definitions of functions.

The compiler doesn't analyze your program as a whole. It simply processes one .c file at a time. If the declaration in the .h file matches the definition in the .c file, then everything is good as far as the compiler is concerned.
The linker will detect that the function was defined twice and will generate a "duplicate symbol" error.

The compiler sees each source file separately from the others. It includes the content of the header file(s) into A.c, then generates an object file A.obj from A.c. A.obj will contain symbols for the variables and functions defined in A.c. The compiler then processes B.c on its own, without checking the content of A.c or any other source file: it includes the header file(s) into B.c, then generates B.obj, which likewise contains symbols for the variables and functions defined in B.c.
As a result, you will not get errors at compile time, because the duplicated function is not detected by the compiler. It is the linker's job to check symbol consistency and make sure there are no duplicates. The linker takes all the generated object files in order to produce an executable, and it must assign a unique memory address to each symbol. For example, if there is a point in your code (say, in the main function) where a function from A.c is called, that call is actually translated into a jump to the memory address where the function is located. Now imagine two functions with the same signature coexisting in the executable, each symbol with a different address: how could the processor figure out which function you intend to call? For that reason, if the linker finds a duplicated symbol, it signals an error.
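To make the two phases concrete, here is a sketch of the usual GCC workflow (using the .o suffix GCC produces, including on MinGW; a main.c that calls add is assumed):

$ gcc -c A.c            # compiles A.c alone; produces A.o containing a symbol for add
$ gcc -c B.c            # compiles B.c alone; produces B.o, also containing add
$ gcc main.c A.o B.o    # link step: both definitions of add meet here, and the
                        # linker would normally report a multiple-definition error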

As @Matt McNabb says: consult your linker documentation.
The only other cause I can come up with is that the linker compares the two functions at the binary level, finds they are identical, and ignores one. You can check this by slightly changing one of them, for example to 'return y + x;'.

Related

Why is it possible to redefine C library functions?

I noticed that if I write a function named getline, this function will be used if I invoke it, even if I #include <stdio.h>, but if I don't write such a function, the one from stdio.h will be used.
I expected instead to get a linker error, the same as if I had done the following:
foo.c:
int f() { return 0; }
main.c:
int f() { return 1; }
int main() { return f(); }
Compile:
$ gcc -c foo.c
$ gcc -c main.c
$ gcc foo.o main.o
/usr/bin/ld: main.o: in function `f':
main.c:(.text+0x0): multiple definition of `f'; foo.o:foo.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
The linker error makes sense to me; when the linker attempts to combine the object files into a single binary, it doesn't know how to resolve the invocation of f(); should it use foo.o's f() or main.o's f()?
But then why don't I get such a linker error when I write my own versions of getline or other C library functions?
This came up because I noticed that when compiling with -std=c99, gcc gives me an implicit-function-declaration warning for using getline. I can write an explicit function prototype, and it works correctly, but this implies that glibc's getline is being linked. So I tested what happens if I write my own getline, and if I do, the linker uses it instead and produces no error... The same appears to be true for other C library functions. Why is this? Why don't I get a linker error instead?
Linkers process library files differently than object files. The following discusses typical behavior for linkers. Details may vary with specific linkers and command-line switches or other settings.
When a linker processes an object file, it includes the entire object file in the output file it is building. As it is doing this, it builds a list of symbols that the object files use (refer to) but that are not defined yet.
A library file consists of multiple object modules inside a containing file. When a linker processes a library file, it examines each module in the library file and compares the symbols that module defines to that list of symbols that are needed but not yet defined. When it finds such a module, the linker includes that module in the output file. (The linker may also go back to earlier modules in the same library file, in case a later module uses a symbol that an earlier one defines.)
Any modules in the library file that do not provide a needed symbol are not needed in the output file, so the linker does not include them.
A consequence of this is that, if the same symbol is defined more than once in the object files, there will be multiple definitions, because both are built into the output file. However, if a symbol is defined once in the object files and once in the library, the one in the library will not be used: when the linker considers the module it is in, that symbol will not be on the list of needed symbols, and the linker will not include the module in the output file. So the output file ends up with just one definition of the symbol, the one from the object modules.
There are some complications to this. Suppose a module in a library defines both sin and cos, and an object module defines sin and uses both sin and cos. When the linker processes the object module, it will note that sin and cos are both used. The reference to sin will be satisfied by the object module, but cos is still needed. Then, when the linker processes the library, it will find cos and include that module. But that module also defines sin, so there will be two definitions of sin in the output file, and the linker will complain. So you can get multiple-definition errors from library modules this way.
Another complication is that the order of processing matters. If the linker first processes an object module that needs getline, and then a library module that defines getline, and then an object module that defines getline, the library module will be included in the output file (because getline was needed when the linker processed the library), and the object module that defines getline will also be included (because the linker includes all object files). So the output will have multiple definitions of getline, and the linker will complain. This is one reason why libraries are generally processed last, so that all object modules are processed first, and only things that are needed from libraries are taken.
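As a rough illustration of the ordering effect (the file and library names here are hypothetical; libmylib.a is assumed to contain a module defining getline):

$ gcc -c main.c getline.c          # main.o uses getline; getline.o defines it
$ gcc main.o getline.o -L. -lmylib # fine: getline is resolved from getline.o, so the
                                   # library module that defines it is never pulled in
$ gcc main.o -L. -lmylib getline.o # multiple definition: the library module is pulled
                                   # in to satisfy main.o, then getline.o duplicates it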
In spite of this linker behavior, you cannot rely on defining your own versions of standard C routines. Compilers may have built-in knowledge about how the routines are specified by the C standard, and they may replace calls to those routines with other code. If you do need to provide your own version of a standard routine, the compiler may have a switch to disable its special treatment of that routine. For example, GCC has -fno-builtin-function, where function is replaced with a particular name, to tell it to disable special knowledge of a function.
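For example, to keep GCC from applying its built-in knowledge of printf (assuming that is the function you are replacing):

$ gcc -fno-builtin-printf -c main.c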

Functions in C Headers?

I've heard that you put function prototypes in header files and then put the function definitions in a .c file. But what is this .c file? For example, if you were to include a file "foo.h", would you call the aforementioned .c file "foo.c", put the function definition in it, and place it in the same location as foo.h? And when you include foo.h, does the definition carry over from the .c file so that the function is ready to use?
No, just putting the .c file next to the .h and including the header in your code doesn't magically carry over the definitions of the functions.
You need to compile foo.c separately into an object file (on Linux, a .o file), for example with the command:
gcc -c foo.c -o foo.o
Now this foo.o needs to be linked into your actual program. This can be done by simply passing the object file while compiling:
gcc test.c foo.o -o test.out
If you do not link foo.o with your program, the linker won't be able to find the implementations of the functions defined in it and will throw a linker error like:
undefined reference to `foo_function'
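Putting it all together, a minimal sketch of the three files being discussed (the function body is invented for illustration):

/* foo.h */
#ifndef FOO_H
#define FOO_H
int foo_function(void);   /* prototype only */
#endif

/* foo.c */
#include "foo.h"
int foo_function(void) { return 42; }   /* the actual definition */

/* test.c */
#include <stdio.h>
#include "foo.h"
int main(void) {
    printf("%d\n", foo_function());
    return 0;
}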
Header files are just conventional. The C preprocessor handles the #include directives, and the compiler sees the preprocessed input (you could copy and paste a huge amount of C code to get the equivalent of #include). Actually, the preprocessor doesn't care whether you #include "foo.h" or #include "foo.c", but the latter is usually poor taste. Naming header files with a .h suffix is just a (very common) convention.
So, if you have a function definition in a header included in several source files (technically translation units), that function definition is in every such translation unit.
What happens then depends on that function definition (it should be static, or even better, static inline).
In practice, you should restrict function definitions in headers to static or static inline ones. If you don't declare static a function definition like void foo(int x) { /* something */ } in a header included in several *.c files, you'll get multiple-definition errors for foo at link time. And the main interest of putting a function definition in a header is to enable inlining (hence the inline hint); otherwise (the usual case) you don't need that, and you just put the function prototype in the header file and the function definition in one of your *.c files.
If you have short, quick-running functions, it can be wise to define them as static inline (i.e., give their bodies) in your header files (though of course that increases compilation time). Otherwise, it is not worth the burden.
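A minimal sketch of such a header-defined helper (the names are illustrative):

/* util.h */
#ifndef UTIL_H
#define UTIL_H

/* static inline: every translation unit that includes this header gets its own
   private copy, so there is no multiple-definition error at link time */
static inline int square(int x) { return x * x; }

#endif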
Some header files might have long macros (dozens of physical lines, each of them except the last ending with a backslash) that expand to function definitions. Look into sglib for an example.
Notice that inline is today (like register was in the previous decade) just a hint to the compiler (for inline expansion optimization), which is allowed to ignore it (also, many optimizing compilers are able to inline functions without any annotation, provided they know the body of the called function).
Don't forget to enable all warnings and debug info. With GCC, use gcc -Wall -Wextra -g, perhaps with -Wstrict-prototypes. You could get the included files with -H (and the preprocessed form with -C -E).
Refer to some C reference site and to the C11 standard n1570 for more details. Read the GNU cpp reference manual about preprocessing.

Does the linker refer to the main code

Let's assume I have three source files: main.c, a.c and b.c. main.c calls some (but not all) of the functions defined in a.c; none of the functions defined in b.c are called by main.c. The main function is in main.c. A makefile compiles all three source files and then links them to produce an executable file, in my case an Intel HEX file.

My question is: does the linker know in which file the main function resides, and use that to determine which parts of the object files to link together? I mean, if the linker produces the executable based only on the recipe of the rule to make the target, then no matter how many functions are called in our application code, the size of the executable will be the same, because the recipe says to link all the object files. For example, we compile the three source files and get three object files: main.o, a.o and b.o (the bigger the object files are, the bigger the executable is).

I know you would say: if you don't want anything from b.c, then do not include it in the build. But that means every time I want to change the application (include/exclude modules), I need to change the makefile too. And another thing: how does the linker know which parts of an object file to take? Does it understand the C language? I hope you understand my question; excuse my bad English.
1) Does the linker know in which file the main function resides, and does it use that to determine which parts of the object files to link together?
There may be options in your toolchain (compiler/linker) to enable this kind of optimization, i.e. removing unused functions from the link, but I have big doubts for global functions (it could be possible for static functions).
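With GCC and GNU ld, one such option pair is -ffunction-sections plus --gc-sections (a sketch; whether your embedded toolchain supports this can vary):

$ gcc -ffunction-sections -fdata-sections -c main.c a.c b.c
$ gcc main.o a.o b.o -Wl,--gc-sections -o app   # the linker discards unreferenced sections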
2) And another thing: how does the linker know which parts of the object file to take? Does it understand the C language?
The linker may detect that a function or variable is not used by the application (once again, check the available options), but that is not really the objective of this tool. However, if you compile/link some functions as library functions (see options), you can generate a "library" file and then link this library with the other object files. The functions of the library will then be included by the linker ONLY if they are used.
What I suggest: use compilation flags (#ifdef...) to include or exclude parts of code from compilation/link.
If you want only those functions in the executable that are eventually called from main, use a library of object files.
Basically the smallest unit the linker will extract from a library is the object file. Whatever symbols are in that object file will also be resolved, until all symbols are resolved.
In other words, if none of the symbols in an object file are needed, it won't end up in the result. If at least one symbol is needed, it will get linked in its entirety.
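A sketch of that approach with GNU tools (the archive name is hypothetical):

$ gcc -c main.c a.c b.c
$ ar rcs libmod.a a.o b.o      # put the optional modules into an archive
$ gcc main.o -L. -lmod -o app  # only the object files whose symbols main.o
                               # actually needs are extracted from libmod.a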
No, the linker does not understand C. Note that a lot of language compilers create object files (C++, FORTRAN, ..., and assemblers). A linker resolves symbols, which are names attached to values.
John Levine has written a book, "Linkers and Loaders", available on the 'net, which will give you an in-depth understanding of linkers, symbols, and object files.

How does the compiler know where my main function is?

I am working on a project that contains multiple modules (source files, header files, libraries). One of the files in all that soup contains my main function.
My questions are:
How does the compiler know which modules to compile and which not?
How does the compiler recognize the module with main() inside?
The compiler itself doesn't care which file contains which functions; main() is not special to it. However, in the linking stage, all the symbols from the different files (and, possibly, compilation units) are matched up. The linker has a hidden "template" with code at a fixed address that the OS will always call when you run a program. That code calls your main; hence, the linker looks for a main among all the files. If it isn't there, you get an unresolved-symbol error, exactly as if you had used a function that you forgot to implement.
The same rules as for any other function apply to main: you can have only one implementation. If two files that get linked together each define main, you get a linker error, because the linker can't decide which of them to use.
How does the compiler know which modules to compile and which not?
It does not. You tell it which ones you want to compile, typically through the compilation statement(s) present in a makefile.
How does the compiler recognize the module with main() inside?
Altogether it's a big process, already answered in this related question.
To summarize: when you compile a program against the standard C library, the entry point of your program is set to _start, which internally contains a reference to the main() function. So at compilation time there is no need to check for the presence of main(). At linking time, the linker must be able to locate exactly one instance of main() that it can link to. That way, main() serves as the entry point to your program.
So, to answer
How does the compiler know where my main function is?
It does not (and need not). That is specifically the job of the linker.
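You can see this division of labour by supplying the startup code yourself. A greatly simplified sketch, assuming GCC on Linux (not something to do in real programs):

/* start.c: build with gcc -nostartfiles start.c */
#include <stdlib.h>

int main(void) { return 0; }

void _start(void) {
    exit(main());   /* roughly what crt0 does: call main and hand its result to exit */
}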
The assembly code (often referred to as startup code by embedded people) that starts up the program specifically calls main().
The prototype for main() is included in compiler documentation.
When you compile a program, an object file is produced. The object file from your source code is then linked with a startup runtime component (usually called crt0.o[bj]) and the C library components, etc.
If main() is changed to an unrecognizable name or signature, the link step will complain about an unresolved external reference to _main or __main.

Detect undefined symbols in C header file

Suppose I coded a C library which provides a bunch of "public" functions, declared in a mylib.h header file. Those functions are supposedly implemented in (say) a mylib.c file which is compiled to (say) a static lib: mylib.c -> mylib.o -> mylib.a.
Is there some way to detect that I forgot to provide the implementation of some declared function in mylib.h? (Yes, I know about unit testing, good practices, etc - and, yes, I understand the meaning of a plain function declaration in C).
Suppose mylib.h declares void func1(); and this function is not implemented in the provided library. This will trigger an error only if the linker needs that function; otherwise it will compile fine, even without warnings, AFAIK. Is there a way (perhaps compiler-dependent) to trigger a warning for declared but unimplemented functions, or any other way to deal with this issue?
BTW: nm -u lists not all undefined declared functions, but only those "used" by the library, i.e., those functions that will trigger an error in the linking phase if not defined somewhere. (Which makes sense; the library object file knows nothing about header files, of course.)
Basically, the most reliable way is to have a program (or possibly a series of programs) which formally exercise each and every one of the functions. If one of the programs fails to link because of a missing symbol, you've goofed.
I suppose you could try to do something by editing a copy of the header into a source file (as in, file ending .c), converting the function declarations into dummy function definitions:
Original:
extern int somefunc(void);
Revised:
extern int somefunc(void){}
Then compile the modified source with minimum warnings - and ignore anything to do with "function that is supposed to return a value doesn't". Then compare the defined symbols in the object file from the revised source with the defined symbols in the library (using nm -g on Unix-like systems). Anything present in the object file that isn't present in the library is missing and should be supplied.
Note: if your header includes other headers of your own which define functions, you need to process all of those. If your header includes standard headers such as <stdio.h>, then clearly you won't be defining functions such as fopen() or printf() in the ordinary course of events. So, choose the headers you reprocess into source code carefully.
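A rough sketch of that comparison on a Unix-like system (the file names are hypothetical):

$ gcc -c mylib_stubs.c -o mylib_stubs.o   # the edited header-turned-source
$ nm -g --defined-only mylib_stubs.o | awk '{ print $3 }' | sort > declared.txt
$ nm -g --defined-only mylib.a | awk '{ print $3 }' | sort > provided.txt
$ comm -23 declared.txt provided.txt      # declared but missing from the library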
There's no easy way.
For example, you can analyse the output of clang -Xclang -ast-print-xml or gcc-xml and filter out declarations with no implementations for a given .h file.
You could grep for the signatures of the exported functions in both the .h and the .c file and compare the lists. Use wc -l to count the matches; both numbers should be equal.
Another thought that just came to my mind: it is IMHO not possible to handle this with the compiler, since it is not always the case that a function declared in mylib.h is implemented in mylib.c.
Is there some way to detect that I forgot to provide the implementation of some declared function in mylib.h?
Write the implementation first, then worry about header contents -- because that way, it can be flagged.
