So I get the point of headers vs source files. What I don't get is how the compiler knows to compile all the source files. Example:
example.h
#ifndef EXAMPLE_H
#define EXAMPLE_H
int example(int argument); // prototype
#endif
example.c
#include "example.h"
int example(int argument)
{
return argument + 1; // implementation
}
main.c
#include "example.h"
int main(void)
{
int whatever = 0;
whatever = example(whatever); // usage in program
return 0;
}
How does the compiler, compiling main.c, know the implementation of example() when nothing includes example.c?
Is this some kind of an IDE thing, where you add files to projects and stuff? Is there any way to do it "manually" as I prefer a plain text editor to quirky IDEs?
Compiling in C or C++ is actually split into two separate phases:
compiling
linking
The compiler doesn't know about the implementation of example(). It just knows that there's something called example() that will be defined at some point, so it generates code with a placeholder for the call to example().
The linker then comes along and resolves these placeholders.
To compile your code using gcc, you'd do the following:
gcc -c example.c -o example.o
gcc -c main.c -o main.o
gcc example.o main.o -o myProgram
The first 2 invocations of gcc are the compilation steps. The third invocation is the linker step.
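If you're curious, you can actually see those placeholders after the compile step with the nm tool on Unix-like systems; the exact output format varies by platform, but undefined symbols are marked U and symbols defined in the object file are marked T:
nm main.o   # expect something like "U example" (still unresolved) and "T main" (defined here)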
Yes, you have to tell the compiler (usually through a makefile if you're not using an IDE) which source files to compile into object files, and the compiler compiles each one individually. Then you give the linker the list of object files to combine into the executable. If the linker is looking for a function or class definition and can't find it, you'll get a link error.
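For example, sticking with the files from the question above, if you hand the linker only one of the two object files, the compile step still succeeds but the link fails:
gcc -c main.c -o main.o
gcc main.o -o myProgram   # example.o is missing, so this fails with something like: undefined reference to `example'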
It doesn't ... you have to tell it to.
For example, when using gcc, first you would compile the files:
gcc file1.c -c -o file1.o
gcc file2.c -c -o file2.o
Then the compiler compiles those files, assuming that the symbols you've declared (like your example function) exist somewhere and will be linked in later.
Then you link the object files together:
gcc file1.o file2.o -o executable
At this point, the linker looks at those assumptions and resolves them, i.e. checks whether the symbols are actually present. This is basically how it works.
As for your IDE question, Google "makefiles"
The compiler does not know the implementation of example() when compiling main.c - the compiler only knows the signature (how to call it), which was included from the header file. The compiler produces .o object files which are later linked by a linker to create the executable binary. The build process can be controlled by an IDE or, if you prefer, by a Makefile. Makefiles have their own syntax which takes a bit of learning, but they make the build process much clearer. There are lots of good references on the web if you search for "Makefile".
The compiler doesn't. But your build tool does. IDE or make tool. The manual way is hand-crafted Makefiles.
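As a sketch of that manual approach for the three files in the question (using the output name myProgram from the answer above), a minimal hand-written Makefile might look like this; note that each command line must be indented with a real TAB character:
myProgram: main.o example.o
	gcc main.o example.o -o myProgram

main.o: main.c example.h
	gcc -c main.c -o main.o

example.o: example.c example.h
	gcc -c example.c -o example.o

clean:
	rm -f main.o example.o myProgram
Running make then recompiles only the files whose sources or included headers have changed since the last build, and relinks myProgram.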
Related
I was trying to write a common function so that other files could reuse it. As an example, I have three files:
The first file: cat test1.h
void say();
The second file: cat test1.c
#include <stdio.h>
void say(){
printf("This is c example!");
}
The third file: cat test2.c
#include "test1.h"
int main(){
say();
}
but when I ran: gcc -g -o test2 test2.c
it threw this error:
undefined reference to `say'
Additionally, I know this would work: gcc -g -o test2 test1.c test2.c
but I don't want to do this, because another team will be using the server, and I want them to use my compiled binary directly rather than my source code. I want it to work just like the printf() function: you only need to include the header.
You can build yourself a library from the object files containing your useful functions, and store the header(s) that describe them in a convenient location. You and your colleagues then compile with the headers and link that library with any executables that use any of those functions. That's very much the same general mechanism that the C compiler uses to include the standard headers and automatically link with the standard C library.
The mechanics vary a bit depending on platform (Windows vs Unix being the primary distinction, though there are differences between Unix platforms too), and also on the type of library (static archive vs dynamic linked / loaded libraries — also known as shared objects or shared libraries).
In broad outline, for a Unix system with a static library, you'd do the following (a consolidated command example appears after these steps):
Compile library object files libfile1.o, libfile2.o, … using (for example) gcc -c libfile1.c libfile2.c.
Create an archive from the object files — using for example ar r libname.a libfile1.o libfile2.o.
Copy the headers to a standard location such as /usr/local/include.
Copy the library to a standard location such as /usr/local/lib.
You'd compile any code that uses the library functions with -I/usr/local/include (if that is not already a standard compilation option).
You'd link the programs with -L/usr/local/lib -lname (you might not need to specify -L… but you would need to specify -lname).
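Putting those steps together, a session might look roughly like this (the header names, libname.a and prog.c are placeholders continuing the outline above, not fixed conventions):
gcc -c libfile1.c libfile2.c                        # 1. compile the library sources to object files
ar r libname.a libfile1.o libfile2.o                # 2. collect them into a static archive
ranlib libname.a                                    # harmless if redundant; some systems need it to index the archive
sudo cp libfile1.h libfile2.h /usr/local/include    # 3. install the headers
sudo cp libname.a /usr/local/lib                    # 4. install the library
gcc -I/usr/local/include -c prog.c                  # compile a program that uses the library
gcc prog.o -L/usr/local/lib -lname -o prog          # link it against libname.a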
Including a header file does not make a function available. It simply informs the compiler that the function will be provided at a later time.
You should compile the file with the function into a shareable object file (or a library if there is more than one function that you want to share). Mind the switch -c which tells gcc not to build an executable file:
gcc -o test1.o test1.c -c
Similarly, compile the file containing the main function into its own object file (for example gcc -o test2.o test2.c -c). Now you or anyone else can link your object file with their main program:
gcc -o test2 test2.o test1.o
The process can be automated using make.
Other programmers can use your compiled object files (*.o) in their programs. They only need a header file with the function prototypes, extern data declarations and type definitions.
You can also wrap many object files into a library.
On many systems you can also create dynamically linked libraries (shared objects), whose code does not have to be copied into the executable.
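For instance, on Linux with gcc a shared-library version of test1.c could be built and used roughly like this (the library name libsay.so is just an illustrative choice):
gcc -fPIC -c test1.c                 # compile as position-independent code
gcc -shared -o libsay.so test1.o     # build the shared library
gcc -o test2 test2.c -L. -lsay       # compile test2.c and link it against libsay.so (test1.h is still needed at compile time)
LD_LIBRARY_PATH=. ./test2            # tell the dynamic loader where to find libsay.so at run time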
You also need to compile test1.c:
gcc -g -o test2 test1.c test2.c
Say I have a parent directory A with two subdirectories B and C.
Sub-directory C has a helper.c and helper.h as shown:
//helper.c
#include <stdio.h>
void print(){
printf("Hello, World!\n");
}
//helper.h
void print();
Now, in sub directory B, I have a main.c which just calls the print function:
//main.c
#include <stdio.h>
#include "../C/helper.h"
int main(){
print();
}
I tried the following commands for compiling main.c:
Command 1: gcc main.c //Gives undefined reference to 'print' error
Command 2: gcc main.c ../C/helper.c //Compiles successfully
Now I removed the #include "../C/helper.h" from main.c and tried Command 2 again. It still works.
So I have the following questions:
i) What difference does it make whether the helper.h file is included or helper.c?
ii) Why does command 1 fail?
iii) Is there a way to compile my C program without having to specify helper.c every time?
What happens when you execute:
Command 1: gcc main.c //Gives undefined reference to 'print' error
When you execute gcc main.c:
The compiler compiles main.c and creates an object file. This object file contains an unresolved reference to the function print(), because there is no implementation of print() in main.c.
After compilation, gcc tries to build the full executable. To do this it combines all the object files and tries to resolve all unresolved references. Since the reference to print() is unresolved and gcc can't find an implementation, it raises the error.
When you execute
Command 2: gcc main.c ../C/helper.c //Compiles successfully
gcc compiles both files. The second file, ../C/helper.c, contains the implementation of print(), so the linker can find it and resolve the reference to it in main().
i) What difference does it make whether the helper.h file is included or helper.c?
In your case helper.h contains the forward declaration of print(). This tells the compiler how to generate a call to print().
ii) Why does command 1 fail?
See above.
iii) Is there a way to compile my C program without having to specify helper.c every time?
Use the make utility. Compile helper.c into a separate object file helper.o and use it in the link command:
testprog: main.o helper.o
	gcc main.o helper.o -o testprog

main.o: main.c ../C/helper.h
	gcc -c main.c

helper.o: ../C/helper.c ../C/helper.h
	gcc -c ../C/helper.c
See make utility manual for details.
Commands should be indented by TAB.
First you need to understand that #include simply pastes the text of the file named in the #include directive into the position where the directive appears, for example:
//file1.h
void foo();
//main.c
#include "file1.h"
int main(int argc, char **argv)
{
foo();
return 0;
}
Will cause the preprocessing step to generate this unified file for compilation:
//main.c.tmp
void foo();
int main(int argc, char **argv)
{
foo();
return 0;
}
So to answer your first and second questions:
When you include a header file (or any file) that only contains declarations (i.e. function signatures) without definitions (i.e. function implementations), as in the example above, the linker will fail to find the definitions and you will get the 'undefined reference' error.
When you include a C source file (or any file) that contains definitions, these definitions are merged into your code, so the linker can find them; that's why it works.
And as for your third question:
It is bad practice to include C files directly in other C files. The common approach is to keep separate C files, with headers exposing the functionality they provide; you include the header files and link against the compiled C files. For example, in your case:
gcc main.c helper.c -o out
This will allow you to include helper.h in main.c and still have it work, because you instructed the compiler to compile both files instead of just main.c, so when linking occurs the definition of print() from helper.c is found and you will not get the undefined reference error.
That is the process in a nutshell; I've abstracted away a lot of what's going on to convey the general idea. There are good articles on the web that describe the compilation process in fair detail and give an overview of the entire build process.
I'll try to answer:
i) What difference does it make whether the helper.h file is included or helper.c?
When you include a file, you usually don't want to expose your implementation; hence it's better to include .h files, which contain only the "signatures" - the API of your implementation.
ii) Why does command 1 fail?
When you build the executable you must supply all the files (sources or objects) that implement the functions you use; otherwise the link will fail.
iii) Is there a way to compile my C program without having to specify helper.c every time?
You can use a Makefile to build your program; there are plenty of Makefile tutorials online.
i) What difference does it make whether the helper.h file is included or helper.c?
Including helper.c means that helper.c gets compiled each time as if it were part of main.c.
Including helper.h lets the compiler know what argument types the function print() takes and what it returns, so the compiler can give an error or warning if you call print() incorrectly.
ii) Why does command 1 fail?
The compiler is not being told where to find the actual code for the print function. As explained, including the .h file only helps the compiler with type checking.
iii) Is there a way to compile my C program without having to specify helper.c every time?
You can compile it once into an object file, and optionally add that object file to a static or dynamically loaded library. You still need to help the compiler find that object file or library. For example:
gcc -c helper.c
gcc main.c helper.o
The correct way to avoid recompiling modules that don't need it is to use a Makefile. A Makefile compares when each module was last compiled with when it was last modified, and that way it knows what needs to be recompiled and what doesn't.
#include "header.h"
int main(){
function();
return 0;
}
The above is a simplified form of my code. I implemented function() in the header.h file and put it in the same directory as this code.c file.
I heard that "gcc -c code.c" means "compile but don't link", but this code needs to be linked with the header.h file, so I guessed the -c option would flag an error; it didn't. However, without the -c option it does flag an error. Can anyone explain how this -c option works?
Header files have nothing to do with linking. Linking is combining multiple object files and libraries into an executable.
Header files are processed by the compiler, as part of generating an object file. Therefore, gcc -c will process header files.
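If you want to see the two stages for yourself (using the file name code.c from the question), you could run something like:
gcc -E code.c > code.i   # preprocessing only: the contents of header.h are pasted into code.i
gcc -c code.c            # compile without linking: produces code.o
gcc code.o -o code       # the separate link step that -c skips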
gcc -c compiles source files without linking.
Header files have nothing to do with the linking process; they are only used during compilation to give the compiler the various declarations and function prototypes.
Although it is bad practice to implement a function in a header file, both compilation strategies should work in this case, i.e. gcc with and without the -c flag.
/me/home/file1.c contains the function definition:
int mine(int i)
{
/* some stupidity by me */
}
I've declared this function in
/me/home/file1.h
int mine(int);
If I want to use this function mine() in /me/home/at/file2.c, all I need to do is:
file2.c
#include "../file1.h"
Is that enough? Probably not.
After doing this much, when I compile file2.c, I get undefined reference to 'mine'
You will also need to link the object file from file1. Example:
gcc -c file2.c
gcc -c ../file1.c
gcc -o program file2.o file1.o
Or you can feed all the files in simultaneously and let GCC do the work (not suggested beyond a handful of files):
gcc -o program file1.c file2.c
Don't use ../ in an #include directive. Instead, write #include "file1.h" and instruct gcc to use the parent directory as an include path:
(in the at directory):
gcc -I../ -c file2.c
After doing this much, when I compile file2.c, I get undefined reference to 'mine'
No, you don't. It's not compiling that causes those errors. It's this other thing, called "linking".
The compiler compiles one "translation unit" - the result of running the preprocessor on one source file, possibly pulling in more stuff via #include - at a time, and then the linker sticks these together to make an executable. Typically the same program serves as both the compiler and linker, with different flags, and typically you can tell it to do everything at once (and not save any temporary files for the compiled translation units). But you do need to tell it what to link, and you do need to compile everything that will be linked.
This is a question from a job interview. Let's say we have an "a.c" source file containing some function and "a.h" as its header file. We also have a main.c file which calls that function. Now let's suppose we have "a.h" and "a.o" (the object file) but a.c is unavailable. How do we call this function now?
(I was given a hint that we need to use function pointers. Another hint was to do this using preprocessor directives such as #define and #ifndef.)
Also, I would like to know how, in the .h file, we can tell whether we are linked properly to the source file.
Thank You
Just include a.h from main.c and you can use the functions declared in a.h. Then just compile it with the same compiler version that a.o was built with:
gcc -c main.c
gcc main.o a.o
To compile main.c, you only need the function declaration. You already have that in a.h. So you would write:
// main.c
#include "a.h"
int main()
{
foobar(); // Let's say this is the function from a.h
}
When compiling it, you would have to include the object file at the linking stage. So using gcc...
gcc -c main.c // Compile main.c to main.o
gcc -o main main.o a.o
No function pointers or macros needed.
The way you describe it, you only need a header file to call the function. The header file contains the prototype of the function, which allows the compiler to know what the signature of the function is.
You would then link in your object file (which contains the compiled version of the function) and everything would be OK.
I don't know why you would need function pointers or preprocessor directives. Maybe you didn't understand the question 100%?
In main.c, call the function as normal.
Then compile main.c to main.o. gcc -c main.c
Then link a.o and main.o. gcc main.o a.o
Something about this question sounds garbled. How you write the function call in main depends solely on its declaration in a.h. The presence or absence of a.c doesn't change that. Certainly nothing involving macros or function pointers.
Compiling and linking are two distinct steps; the compiler checks that you're passing the right number and types of arguments and assigning the result to the right type of object based on the function's declaration, while the linker attempts to resolve the reference to the function's implementation in the machine code.
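For example, assuming a.h declares something like int foobar(int); (the exact prototype here is hypothetical), the two kinds of errors show up at different stages:
gcc -c main.c            # compile step: error or warning here if main.c calls foobar() with the wrong arguments
gcc main.o -o main       # link step: fails with something like "undefined reference to `foobar'" if a.o is omitted
gcc main.o a.o -o main   # both steps pass: prototype satisfied at compile time, symbol resolved at link time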
The result of compiling and linking is a binary sludge that may or may not have any obvious relationship to the original source code¹. Debug versions preserve varying levels of information to support source-level debuggers, but you can pretty much rely on release versions not preserving any useful source information.
1. Every now and again someone asks for a tool to recover source code from an executable; this is often described as attempting to turn hamburger back into cows.