This is a question from a job interview. Let's say we have an "a.c" source file with some function and "a.h" as its header file. We also have a main.c file which calls that function. Now suppose we have only "a.h" and "a.o" (the object file), and a.c is unavailable. How do we call the function now?
(I was given a hint that we need to use function pointers. Another hint was to do this using preprocessor directives such as #define and #ifndef.)
I would also like to know how, in the .h file, we can tell whether we are linked properly to the source file.
Thank You
Just include a.h from main.c and you can use the functions declared in a.h. Then compile it with the same compiler version that a.o was built with:
gcc -c main.c
gcc main.o a.o
To compile main.c, you need the function declaration. You already have that in a.h. So you would write:
// main.c
#include "a.h"
int main()
{
foobar(); // Let's say this is the function from a.h
}
When compiling it, you would have to include the object file at the linking stage. So using gcc...
gcc -c main.c   # compile main.c to main.o
gcc -o main main.o a.o
No function pointers or macros needed.
The way you describe it, you only need a header file to call the function. The header file contains the prototype of the function, which allows the compiler to know what the signature of the function is.
You would then link in your object file (which contains the compiled version of function) and everything would be OK.
I don't know why you would need function pointers or preprocessor directives. Maybe you didn't understand the question 100%?
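The point above can be sketched in a single file. This is only an illustration: the three parts stand in for a.h, main.c and a.o, and the signature of foobar and the name call_foobar are assumptions, since the question gives neither.

```c
/* Stand-in for a.h: only the prototype is visible to the caller.
   The signature is hypothetical; the question does not give one. */
int foobar(int x);

/* Stand-in for main.c: this compiles using the prototype alone. */
int call_foobar(void)
{
    return foobar(20);
}

/* Stand-in for a.o: in the real scenario this definition exists only
   as compiled machine code and is supplied at link time. */
int foobar(int x)
{
    return x * 2 + 2;
}
```

In the real setup the last part is not source code at all; the linker takes it from a.o.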
In main.c, call the function as normal.
Then compile main.c to main.o. gcc -c main.c
Then link a.o and main.o. gcc main.o a.o
Something about this question sounds garbled. How you write the function call in main depends solely on its declaration in a.h. The presence or absence of a.c doesn't change that. Certainly nothing involving macros or function pointers.
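If the function-pointer hint is taken literally, a rough sketch might look like the following. The prototype of foobar and the name call_through_pointer are assumptions made for illustration. The pointer is still initialized from the name declared in the header, so the link step is unchanged.

```c
/* Hypothetical prototype, as a.h might declare it. */
int foobar(int x);

int call_through_pointer(int arg)
{
    /* The pointer refers to the same external symbol the header declares. */
    int (*fp)(int) = foobar;
    return fp(arg);
}

/* Stand-in for the definition that a.o would supply at link time. */
int foobar(int x)
{
    return x + 1;
}
```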
Compiling and linking are two distinct steps; the compiler checks that you're passing the right number and types of arguments and assigning the result to the right type of object based on the function's declaration, while the linker attempts to resolve the reference to the function's implementation in the machine code.
The result of compiling and linking is a binary sludge that may or may not have any obvious relationship to the original source code [1]. Debug versions preserve varying levels of information to support source-level debuggers, but you can pretty much rely on release versions not preserving any useful source information.
[1] Every now and again someone asks for a tool to recover source code from an executable; this is often described as attempting to turn hamburger back into cows.
Say I have a parent directory A with two subdirectories B and C.
Sub-directory C has a helper.c and helper.h as shown:
//helper.c
void print(){
printf("Hello, World!\n");
}
//helper.h
void print();
Now, in sub directory B, I have a main.c which just calls the print function:
//main.c
#include<stdio.h>
#include"../C/helper.h"
void main(){
print();
}
I tried the following commands for compiling main.c:
Command 1: gcc main.c //Gives undefined reference to 'print' error
Command 2: gcc main.c ../C/helper.c //Compiles successfully
Now I removed the #include"../C/helper.h" from main .c and tried the Command 2 again. It still works.
So I have the following questions:
i) What difference does it make whether the helper.h file is included or
helper.c?
ii) Why command 1 fails?
iii) Is there a way to compile my C program without having to specify
helper.c everytime?
What happens when you execute:
Command 1: gcc main.c //Gives undefined reference to 'print' error
When you execute gcc main.c:
The compiler compiles main.c and creates an object file. This file contains an unresolved reference to the function print(), because there is no implementation of print() in main.c.
After compilation, gcc tries to build the full executable. To do this, it combines all the object files and tries to resolve every unresolved reference. There is still an unresolved reference to print(); the linker can't find an implementation, so it raises the error.
When you execute
Command 2: gcc main.c ../C/helper.c //Compiles successfully
gcc compiles both files. The second file, ../C/helper.c, contains the implementation of print(), so the linker can find it and resolve the reference to it in main().
i) What difference does it make whether the helper.h file is included or helper.c?
In your case helper.h contains a forward declaration of the function print(). This tells the compiler how to generate a call to print().
ii) Why command 1 fails?
See above.
iii) Is there a way to compile my C program without having to specify helper.c everytime?
Use the make utility. Compile helper.c into a separate object file, helper.o, and use it in the link command:
testprog: main.o helper.o
	gcc main.o helper.o -o testprog
main.o: main.c ../C/helper.h
	gcc -c main.c
helper.o: ../C/helper.c ../C/helper.h
	gcc -c ../C/helper.c
See the make manual for details.
Command lines must be indented with a TAB.
First you need to understand that #include simply inserts the text of the named file at the position of the directive. For example:
//file1.h
void foo();
//main.c
#include "file1.h"
int main(int argc, char **argv)
{
foo();
return 0;
}
This will cause the preprocessor to generate this unified file for compilation:
//main.c.tmp
void foo();
int main(int argc, char **argv)
{
foo();
return 0;
}
So to answer your first and second questions:
When you include a header file (or any file) that contains only declarations (i.e. function signatures) without definitions (i.e. function implementations), as in the example above, compilation succeeds but the linker fails to find the definitions, and you get the 'undefined reference' error.
When you include a .c file (or any file) that contains definitions, these definitions are merged into your code and the linker has them; that's why it works.
As for your third question:
It is bad practice to include .c files directly in other .c files. The common approach is to keep separate .c files with headers exposing the functionality they provide, include the header files, and link against the compiled .c files. For example, in your case:
gcc main.c helper.c -o out
This lets you include helper.h in main.c and still have it work, because you instructed the compiler to compile both files instead of just main.c. When linking occurs, the definitions from helper.c will be found and you will not get the undefined reference error.
That is the process in a nutshell. I abstracted away a lot of what's going on to pass on the general idea. This is a nice article describing the compilation process in fair detail, and this is a nice overview of the entire process.
I'll try to answer:
i) What difference does it make whether the helper.h file is included or helper.c?
When you include a file, you don't want to expose your implementation, so it is better to include .h files, which contain only the "signatures", the API of your implementation.
ii) Why command 1 fails?
When you build, you must supply every source or object file that goes into the executable; otherwise it will not link.
iii) Is there a way to compile my C program without having to specify
helper.c everytime?
You can use a Makefile to compile your program. Maybe this tutorial can help you.
i) What difference does it make whether the helper.h file is included
or helper.c?
Including helper.c means that helper.c gets compiled each time as if it were part of main.c
Including helper.h lets the compiler know what argument types the function print() takes and returns so the compiler can give an error or warning if you call print() incorrectly
ii) Why command 1 fails?
The compiler is not being told where to find the actual code for the print function. As explained, including the .h file only helps the compiler with type checking.
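As a small sketch of what the prototype buys you, the example below uses a hypothetical function half and a hypothetical caller half_of_five (neither is part of the question). With the prototype in scope, the compiler converts the int argument to double before the call.

```c
/* Hypothetical prototype, standing in for what a header would provide. */
double half(double value);

double half_of_five(void)
{
    /* Because the prototype is visible, the int 5 is converted to double. */
    return half(5);
}

/* Stand-in for the definition in the separately compiled .c file. */
double half(double value)
{
    return value / 2.0;
}
```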
iii) Is there a way to compile my C program without having to specify
helper.c everytime?
You can compile it once into an object file and optionally you can add that obj to a static or dynamically loaded library. You still need to help the compiler find that obj or library. For example,
gcc -c helper.c
gcc main.c helper.o
The correct way to avoid recompiling modules that don't need it is to use a Makefile. The make tool compares when each module was last compiled with when its source was last modified, and that way it knows what needs to be recompiled and what doesn't.
I have a number of .c files, i.e. the implementation files say
main.c
A.c
B.c
Functions in any of the files can call any function from a different file. My question is: do I need a .h (header) file for each of A's and B's implementations, where each header file has the declarations of ALL the functions in A or B?
Also, main.c will have both A.h and B.h #included in it?
If someone can finally make it clear, also, how do I later compile and run the multiple files in the terminal.
Thanks.
Header contents
The header A.h for A.c should only contain the information that is necessary for external code that uses the facilities defined in A.c. It should not declare static functions; it should not declare static variables; it should not declare internal types (types used only in A.c). It should ensure that a file can use just #include "A.h" and then make full use of the facilities published by A.c. It should be self-contained, idempotent (so you can include it twice without any compilation errors) and minimal. You can simply check that the header is self-contained by writing #include "A.h" as the first #include line in A.c; you can check that it is idempotent by including it twice (but that's better done as a separate test). If it doesn't compile, it is not self-contained. Similarly for B.h and B.c.
For more information on headers and standards, see 'Should I use #include in headers?', which references a NASA coding standard, and 'Linking against a static library', which includes a script chkhdr that I use for testing self-containment and idempotency.
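The header guidelines above might be sketched like this, with both files shown in one listing. The names a_sum_of_squares and square and the guard macro A_H are made up for illustration. The internal helper is static and therefore does not appear in the header.

```c
/* Stand-in for A.h: declares only what other files need. */
#ifndef A_H
#define A_H
int a_sum_of_squares(int x, int y);
#endif

/* Stand-in for A.c (which would start with #include "A.h"). */

/* Internal helper: static, so it is not declared in A.h. */
static int square(int v)
{
    return v * v;
}

/* Published function: declared in A.h, defined here. */
int a_sum_of_squares(int x, int y)
{
    return square(x) + square(y);
}
```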
Linking
Note that main.o depends on main.c, A.h and B.h, but main.c itself does not depend on the headers.
When it comes to compilation, you can use:
gcc -o program main.c A.c B.c
If you need other options, add them (most flags at the start; libraries at the end, after the source code). You can also compile each file to object code separately and then link the object files together:
gcc -c main.c
gcc -c A.c
gcc -c B.c
gcc -o program main.o A.o B.o
You must provide a header file only if what is declared in one .c file is required in another .c file.
Generally speaking, you can have a header file for every source file, in which you export all the functions declared or the extern symbols.
In practice you won't always need to export every function or every variable, just the ones that are required by another source file, and you will need to include the header only in the files that require it (and in the source file paired with that header).
When trying to understand how it works, keep in mind that every source file is compiled on its own, so if it is going to use something that is not declared directly in its own source, then that thing must be declared through a header file. This way the compiler knows that everything exists and is correctly typed.
It would depend on the compiler, but assuming you are using gcc, you could use something like this:
gcc -Wall main.c A.c B.c -o myoutput
Look at http://www.network-theory.co.uk/docs/gccintro/gccintro_11.html (first google answer) for more details. You could compile it into multiple object files/ libraries:
gcc -c main.c
gcc -c A.c
gcc -c B.c
gcc -o mybin main.o A.o B.o
You want to use
gcc -g *.c -lm
It saves typing and will allow you to link all your c files in your project.
I have three files, test.c, foo.c, foo.h.
In foo.c I
#include "foo.h"
In test.c I
#include "foo.c"
Then when I compile my code, I use gcc -o test test.c, and it compiles.
However, my professor told me, I should use
#include "foo.h"
inside my test.c rather than #include "foo.c", and I should compile it this way
gcc -o test test.c foo.c
Is the second way preferred? If so, why? What's the difference between these two ways of compiling?
In most cases you should never include source files (apart from cases where you would probably want to include a piece of code generated dynamically by a separate script). Source files are to be passed directly to the compiler. Only header files should be included.
Although the way that your professor suggests is correct, the following way has more educational value in this case:
gcc -c test.c
gcc -c foo.c
gcc -o test foo.o test.o
The first two lines compile each source file to an object file, and the third line doesn't really compile but only invokes the linker to produce an executable out of the 2 object files. The idea is to make a distinction between compiling and linking, which would be performed transparently in the way your professor suggests.
The major reasons not to #include .c files in other .c files are:
Avoid duplicate definition errors: suppose foo.c defines the function foo(). You have two other files that use foo(), so you #include "foo.c" in both of them. When you try to build your project, foo.c will be translated multiple times, so the linker will see multiple definitions of the foo function, which will cause it to issue a diagnostic and halt.
Minimize build times: even if you don't introduce duplicate definition errors, you wind up recompiling the same code needlessly. Suppose you #include "foo.c" in bar.c, and you discover you need to make a one-line change in bar.c. When you rebuild, you wind up re-translating the contents of foo.c unnecessarily.
C allows you to compile your source files separately of each other, and then link the resulting object files together to build your applications or libraries. Ideally, header files should only contain non-defining object declarations, function prototype declarations, type definitions, and macro definitions.
It is common practice to #include header files instead of source files, and compile source files individually. Separation of concerns makes it easier to work with in large projects. In your example, it may be trivial, but could be confusing when you have hundreds of files to work with.
Doing it the way your professor suggests means you can compile each source separately. So, if you had a large project where the sources were thousands of lines of code, and you changed something in test.c, you can just recompile test.c instead of having to recompile foo.c along with it.
Hope this makes some sense :)
If you want to compile several files in gcc, use:
gcc f1.c f2.c ... fn.c -o output_file
Short answer:
Yes, the second way is preferred.
Long answer:
In this specific case you will get the same result.
To get a deeper understanding, you first need to know that an #include directive basically copies the contents of the included file and puts them in place of the directive.
Therefore .h files are used for forward declarations, which cause no problem when several different files include them,
while .c files contain the implementations; if two files implement the same function, you will get an error when linking them.
Let's say you had a test2.c that also included foo.c, and you tried to link it with test.c: you would have two implementations of everything in foo.c. But if you include only foo.h in all three files (foo.c, test.c and test2.c), you can still link them, because foo.h shouldn't contain any implementations.
It is not good practice to include .c files.
In your case
Include foo.h in both test.c and foo.c, but add this inside your header file:
#ifndef FOO_H
#define FOO_H
..your header code here
#endif
Writing the header this way ensures that you can include it multiple times, just to be on the safe side.
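To see why the guard matters, here is a sketch in which the contents of a hypothetical foo.h are pasted twice, as the preprocessor would do if the header were included twice. The guard name FOO_H, struct point and point_sum are all assumptions made for this example.

```c
/* First "inclusion" of the hypothetical foo.h. */
#ifndef FOO_H
#define FOO_H
struct point { int x; int y; };
int point_sum(struct point p);
#endif

/* Second "inclusion": FOO_H is already defined, so the preprocessor
   skips the body and struct point is not defined a second time. */
#ifndef FOO_H
#define FOO_H
struct point { int x; int y; };
int point_sum(struct point p);
#endif

/* Stand-in for foo.c. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

Without the #ifndef/#define pair, the second copy would redefine struct point and the file would not compile.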
As for how to split your code across files:
In foo.h
You place the shared structure definitions, extern declarations of global variables, and the function prototypes that you will use.
In foo.c
Here you define your modular functions.
In test.c
Here you generally have your main(), and you call and test the functions defined in foo.c.
You generally put all the files in the same folder and pass every .c file to the compiler, which compiles them individually; they are connected later by the linker.
gcc f1.c f2.c ... fn.c -o output_file
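For global variables in particular, one common arrangement is an extern declaration in the header and a single definition in the .c file. The sketch below shows both parts in one listing; the variable foo_call_count and the behaviour of foo are invented for illustration.

```c
/* Stand-in for foo.h */
extern int foo_call_count;   /* declaration only: reserves no storage */
int foo(void);

/* Stand-in for foo.c */
int foo_call_count = 0;      /* the one definition of the variable */

int foo(void)
{
    foo_call_count++;        /* count how many times foo() has run */
    return foo_call_count;
}
```

Any file that includes the header can read foo_call_count, while the storage lives only in foo.o, so the linker sees exactly one definition.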
/me/home/file1.c contains a function definition:
int mine(int i)
{
/* some stupidity by me */
}
I've declared this function in
/me/home/file1.h
int mine(int);
if I want to use this function mine() in /me/home/at/file2.c
To do so, all I need to do is:
file2.c
#include "../file1.h"
Is that enough? Probably not.
After doing this much, when I compile file2.c, I get undefined reference to 'mine'
You will also need to link the object file from file1. Example:
gcc -c file2.c
gcc -c ../file1.c
gcc -o program file2.o file1.o
Or you can feed all files to GCC at once and let it do the work (not suggested beyond a handful of files):
gcc -o program ../file1.c file2.c
Don't use ../ in an #include path. Instead, write #include "file1.h" and instruct gcc to use the parent directory as an include path:
(in the at directory):
gcc -I../ -c file2.c
After doing this much, when I compile file2.c, I get undefined reference to 'mine'
No, you don't. It's not compiling that causes those errors. It's this other thing, called "linking".
The compiler compiles one "translation unit" - the result of running the preprocessor on one source file, possibly pulling in more stuff via #include - at a time, and then the linker sticks these together to make an executable. Typically the same program serves as both the compiler and linker, with different flags, and typically you can tell it to do everything at once (and not save any temporary files for the compiled translation units). But you do need to tell it what to link, and you do need to compile everything that will be linked.
So I get the point of headers vs source files. What I don't get is how the compiler knows to compile all the source files. Example:
example.h
#ifndef EXAMPLE_H
#define EXAMPLE_H
int example(int argument); // prototype
#endif
example.c
#include "example.h"
int example(int argument)
{
return argument + 1; // implementation
}
main.c
#include "example.h"
main()
{
int whatever;
whatever = example(whatever); // usage in program
}
How does the compiler, compiling main.c, know the implementation of example() when nothing includes example.c?
Is this some kind of an IDE thing, where you add files to projects and stuff? Is there any way to do it "manually" as I prefer a plain text editor to quirky IDEs?
Compiling in C or C++ is actually split up into 2 separate phases.
compiling
linking
The compiler doesn't know about the implementation of example(). It just knows that there's something called example() that will be defined at some point. So it just generates code with placeholders for example().
The linker then comes along and resolves these placeholders.
To compile your code using gcc you'd do the following
gcc -c example.c -o example.o
gcc -c main.c -o main.o
gcc example.o main.o -o myProgram
The first 2 invocations of gcc are the compilation steps. The third invocation is the linker step.
Yes, you have to tell the compiler (usually through a makefile if you're not using an IDE) which source files to compile into object files, and the compiler compiles each one individually. Then you give the linker the list of object files to combine into the executable. If the linker is looking for a function or class definition and can't find it, you'll get a link error.
It doesn't ... you have to tell it to.
For example, when using gcc, first you would compile the files:
gcc file1.c -c -ofile1.o
gcc file2.c -c -ofile2.o
The compiler compiles those files, assuming that the symbols you've declared (like your example function) exist somewhere and will be linked in later.
Then you link the object files together:
gcc file1.o file2.o -oexecutable
At this point, the linker looks at those assumptions and "clarifies" them, i.e. checks whether the symbols are actually present. This is basically how it works...
As for your IDE question, Google "makefiles"
The compiler does not know the implementation of example() when compiling main.c - the compiler only knows the signature (how to call it) which was included from the header file. The compiler produces .o object files which are later linked by a linker to create the executable binary. The build process can be controlled by an IDE, or if you prefer a Makefile. Makefiles have a unique syntax which takes a bit of learning to understand but will make the build process much clearer. There are lots of good references on the web if you search for Makefile.
The compiler doesn't. But your build tool does. IDE or make tool. The manual way is hand-crafted Makefiles.