How to link two files in C - c

I am currently working on a class assignment. The assignment is to create a linked list in c. But because we it's a class assignment we have some constraints:
We have a header file that we cannot modify.
We have a c file that is the linkedlist
We have a c file that is just a main method just to test the linkedlist
the header file has a main method defined, so when I attempt to build the linkedlist it fails because there is no main method. What should I do to resolve the issue?? Import the test file (this causes another error)?

I'm assuming your three files are called header.h, main.c, and linkedlist.c
gcc main.c linkedlist.c -o executable
This will create an executable binary called "executable"
Note this also assumes you're using gcc as a compiler.

Like most languages, C supports modules. What I assume your assignment requires is compiling a module. Modules, unlike full programs, lack entry points. Roughly speaking, they are collections of functions, in the manner of a library. When compiling a module, no linking is made.
You would compile a module like this: gcc -c linkedlist.c -> this would actually produce linkedlist.o, which is a module. Try executing this linkedlist.o (after changing its mode to executable, since it won't be so by default). The reason you fail to execute this module is, partly, because it is not in the proper format to be executed. Ones of the reasons it is not so is it lacks entry point (what we know as 'main') and linkage. Your assignment seems to provide a test 'main.c', if you wanted to use it, you would only have to link the 'main.c' (actually compiled into main.o) with linkedlist.o . To actually do that, simply type in gcc -o name_of_your_program main.c linkedlist.o. In fact, what is being done here is that your compiler first compiles main.c into a main.o module, then links the 2 modules together under the name you have given it with the -o option, but the compiler is pretty smart and needs nothing explicit about the steps he needs to take. Now if you wanted to know more about this stuff, you'd have to try and learn about how compilers do what they do. Google can help you with that more than I ever could. Good luck.

Related

GCC linked library for compile

Why do we have to tell gcc which library to link against when that information is already in source file in form of #include?
For example, if I have a code which uses threads and has:
#include <pthread.h>
I still have to compile it with -pthread option in gcc:
gcc -pthread test.c
If I don't give -pthread option it will give errors finding thread function definitions.
I am using this version:
gcc --version
gcc (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4
This may be one of the most common things that trip up beginners to C.
In C there are two different steps to building a program, compilation and linking. For the purposes of your question, these steps connect your code to two different types of files, headers and libraries.
The #include <pthread.h> directive in your C code is handled by the compiler. The compiler (actually preprocessor) literally pastes in the contents of pthread.h into your code before turning your C file into an object file.
pthread.h is a header file, not a library. It contains a list of the functions that you can expect to find in the library, what arguments they take and what they return. A header can exist without a library and vice-versa. The header is a text file, often found in /usr/include on Unix-derived systems. You can open it just like any C file to read the contents.
The command line gcc -lpthread test.c does both compilation and linking. In the old days, you would first do something like cc test.c, then ld -lpthread test.o. As you can see, -lpthread is actually an option to the linker.
The linker does not know anything about text files like C code or headers. It only works with compiled object files and existing libraries. The -l flag tells it which libraries to look in to find the functions you are using.
The name of the header has nothing to do with the name of the library. Here it's really just by the accident. Most often there are many headers provided by the library.
Especially in C++ there is usually one header per class and the library usually provides classes implementations from the same namespace. In C the headers are organized that they contain some common subset of functions - math.h contains mathematical operations, stdio.h provides IO functions etc.
They are two separate things. .h files holds the declarations, sometimes the inline function also. As we all know, every functions should have an implementation/definition to work. These implementations are kept seperately. -lpthread, for example is the library which holds the implementation of the functions declared in headers in binary form.
Separating the implementation is what people want when you don't want to share your commercial code with others
So,
gcc -pthread test.c
tell gcc to look for definitions declared in pthread.h in the libpthread. -pthread is expanded to libpthread by linker automatically
there are/were compilers that you told it where the lib directory was and it simply scanned all the files hoping to find a match. then there are compilers that are the other extreme where you have to tell it everything to link in. the key here is include simply tells the compiler to look for some definitions or even simpler to include some external file into this file. this does not necessarily have any connection to a library or object, there are many includes that are not tied to such things and it is a bad assumption. next the linker is a different step and usually a different program from the compiler, so not only does the include not have a one to one relationship with an object or library, the linker is not the compiler.

Modular programming and compiling a C program in linux

So I have been studying this Modular programming that mainly compiles each file of the program at a time. Say we have FILE.c and OTHER.c that both are in the same program. To compile it, we do this in the prompt
$gcc FILE.c OTHER.c -c
Using the -c flag to compile it into .o files (FILE.o and OTHER.o) and only when that happens do we translate it (compile) to executable using
$gcc FILE.o OTHER.o -o
I know I can just do it and skip the middle part but as it shows everywhere, they do it first and then they compile it into executable, which I can't understand at all.
May I know why?
If you are working on a project with several modules, you don't want to recompile all modules if only some of them have been modified. The final linking command is however always needed. Build tools such as make is used to keep track of which modules need to be compiled or recompiled.
Doing it in two steps allows to separate more clearly the compiling and linking phases.
The output of the compiling step is object (.o) files that are machine code but missing the external references of each module (i.e. each c file); for instance file.c might use a function defined in other.c, but the compiler doesn't care about that dependency in that step;
The input of the linking step is the object files, and its output is the executable. The linking step bind together the object files by filling the blanks (i.e. resolving dependencies between objets files). That's also where you add the libraries to your executable.
This part of another answer responds to your question:
You might ask why there are separate compilation and linking steps.
First, it's probably easier to implement things that way. The compiler
does its thing, and the linker does its thing -- by keeping the
functions separate, the complexity of the program is reduced. Another
(more obvious) advantage is that this allows the creation of large
programs without having to redo the compilation step every time a file
is changed. Instead, using so called "conditional compilation", it is
necessary to compile only those source files that have changed; for
the rest, the object files are sufficient input for the linker.
Finally, this makes it simple to implement libraries of pre-compiled
code: just create object files and link them just like any other
object file. (The fact that each file is compiled separately from
information contained in other files, incidentally, is called the
"separate compilation model".)
It was too long to put in a comment, please give credit to the original answer.

Does the linker refer to the main code

Let assume I am having three source files main.c, a.c and b.c. In the main.c are called some of the functions (not all) that are defined in a.c. None of the functions defined in b.c are called (used) by main.c. In main.c is the main function. Then we have a makefile that compiles all the source files(main.c, a.c and b.c) and then links them to produce executable file, in my case intel hex file. My question is: Does the linker know in which file the main function resides and knowing that to determine what part of the object files to link together? I mean if the linker produces the exe file based only on the recipe of the rule to make the target then no matter how many functions are called in our application code the size of the executable will be the same because the recipe says to link all the object files. For example we compile the three source files and we get three object files: main.o a.o and b.o (the bigger the object files are, the bigger the exe file is). I know you would say if you dont want anything from the b.c then do not include it in the build. But it means that every time I want to change the application (include/exclide modules) I need to change the makefile too. And another thing is how the linker knows what part of the object file to take, does it understand the C language? I hope you understand my question, excuse my bad English.
1) Does the linker know in which file the main function resides and knowing that to determine what part of the object files to link together?
Maybe there are options of your toolchain (compiler/linker) to enable this kind of optimizations, I mean removing unused functions from link, but I have big doubt for global functions (could be possible for static functions).
2) And another thing is how the linker knows what part of the object file to take, does it understand the C language?
Linker may detect if a function or variable is not used by the application (once again, check the available options), but it is not really the objective of this tool. However if you compile/link some functions as library functions (see options), you can generate a "library" file and then link this library with other object files. The functions of the library will then be included by the linker ONLY if they are used.
What I suggest: use compilation flags (#ifdef...) to include or exclude parts of code from compilation/link.
If you want only those functions in the executable that are eventually called from main, use a library of object files.
Basically the smallest unit the linker will extract from a library is the object file. Whatever symbols are in that object file will also be resolved, until all symbols are resolved.
In other words, if none of the symbols in an object file are needed, it won't end up in the result. If at least one symbol is needed, it will get linked in its entirety.
No, the linker does not understand C. Note that a lot of language compilers create object files (C++, FORTRAN, ..., and assemblers). A linker resolves symbols, which are names attached to values.
John Levine has written a book, "Linkers and Loaders", available on the 'net, which will give you an in-depth understanding of linkers, symbols, and object files.

(C) How are the implementations of the stdlib functions stored and linked to in header files if the source code does not have to be provided directly?

new to using C
Header files for libraries like stdlib do not contain the actual implementation code for the functions they provide access to. I understand that the actual source text for libraries like this aren't needed to compile, but how does this work specifically? Are the implementation details for these libraries contained within the compiler?
When you use a function like printf(), including the header file essentially pastes in code for the declaration of the function, but normally the implementation code would need to be available as well.
What form is it stored in? (and where?) Is this compiler specific? Would it be possible to write custom code and reference it in this way without modifying the behavior of the compiler?
I've been searching around and found some info that is relevant but nothing specific. This could be related to not formulating the question well. Thanks.
When you link a program, the compiler will implicitly add some extra libraries to your program:
$ ls
main.c
$ cc -c main.c
$ cc main.o
$ ls
main.c main.o a.out
You can discover the extra libraries a program uses with ldd. Here, there are three libraries linked into the program, and I didn't ask for any of them:
$ ldd a.out
linux-vdso.so => (0x00...)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00...)
/lib64/ld-linux-x86-64.so.2 (0x00...)
So, what happens if we link without these libraries? That's easy enough, just use the linker (ld) directly, instead of calling it through cc. When you use ld, it doesn't give you these extra libraries, so you get an error:
$ ld main.o
Undefined symbols:
"_printf", referenced from:
_main in main.o
The implementation for printf() is stored in the standard C library, which is usually just another library on your system... the only difference is that it gets automatically included into your program when you compile C.
You can use nm to find out what symbols are in a library, so I can use it to find printf() in libc:
$ nm -D /lib/x86_64-linux-gnu/libc-2.13.so | grep printf
...
000000000004e4b0 T printf
...
So, now that we know that libc has printf(), we can use -lc to tell the linker to include libc, and that will get rid of the errors about printf() being missing:
$ ld main.o -lc
There might be some other bits missing, and that's why we use cc to link our programs instead of ld: cc gives us all the default libraries.
When you compile a file you only need to promise the compiler that you have certain functions and symbols. A function call is in the compiled into a call [some_address]
The compiler will compile each C-file into object files that just have place holders for calls to functions declared in the headers. That is [some_address] does not need to be known at this point.
A number of oject files can be collected into what is known as a library.
After that it is the linkers job to look through all object files and libraries it know of and find out what the real value of all unknown [some_address] is and translate the call to, e.g. call 0x1234 if the particular function you are calling starts at 0x1234 (or it might be a relative offset from the current program pointer.
Stdlib and other library functions are implemented in an object library. A library is a collection of code that is linked with your program. By default C programs are linked against the stdlib library, which is usually provided by the operating system. Most modern operating systems use a dynamical linker. That is, your program is not linked against the library until it is executed. When it is being loaded, the linker-loader combines your code and the library code in your program's address space. You code and then make a call to the printf() code that is located in that library.
Usually a header file contains only a function prototype while the implementation is either in a separate source file or a precompiled library in the case of stdlib (and other libraries, both shipped with a compiler or available separately) the precompiled library gets linked at the end of the compilation process. (There's also a distinction between static and dynamic libraries, but I won't go into detail about that)
The implementation of standard libraries (which are shipped with a compiler) are usually compiler specific (there is a standard describing which functions have to be in a library, but the compiler programmer can decide how exactly he implements them) and it is (in theory) possible to exchange these libraries with some of your own without modifying the behaviour of the compiler (though not recommended as you would have to rewrite the entire library in order to ensure that all functions are contained).

Two basic question about compiling and libraries

I have two semi-related questions.
My first question: I can call functions in the standard library without compiling the entire library by just:
#include <stdio.h>
How would I go about doing the same thing with my header files? Just "including" my plaintext header files obviously does not work.
#include "nameofmyheader.h"
Basically, how can I create a library that other files can call?
Second question: Suppose I have a program that is split into 50 c files and a header file. What is the proper way to compile it besides:
cc main.c 1.h 1.c 2.c 3.c 4.c 5.c 6.c 7.c /*... and so on*/
Please correct any misconceptions I am having. I'm totally lost here.
First, you're a bit confused as to what happens with an #include. You never "compile" the standard library. The standard library is already compiled and is sitting in library files (.dll and .lib files on Windows, .a and .so on Linux). What the #include does is give you the declarations needed to link to the standard library.
The first thing to understand about #include directives is that they are very low-level. If you have programmed in Java or Python, #includes are much different from imports. Imports tell the compiler at a high level "this source file requires the use of this package" and the compiler figures out how to resolve that dependency. An #include in C directive says "take the entire contents of this file and literally paste it in right here when compiling." In particular, #include <stdio.h> brings in a file that has the forward declarations for all of the I/O functions in the standard library. Then, when you compile your code, the compiler knows how to make calls to those functions and check them for type-correctness.
Once your program is compiled, it is linked to the standard library. This means that your linker (which is automatically invoked by your compiler) will either cause your executable to make use of the shared standard library (.dll or .so), or will copy the needed parts of the static standard library (.lib or .a) into your executable. In neither case does your executable "contain" any part of the standard library that you do not use.
As for creating a library, that is a bit of a complicated topic and I will leave that to others, particularly since I don't think that's what you really want to do based on the next part of your question.
A header file is not always part of a library. It seems that what you have is multiple source files, and you want to be able to use functions from one source file in another source file. You can do that without creating a library. All you need to do is put the declarations for things foo.c that you want accessible from elsewhere into foo.h. Declarations are things like function prototypes and "extern" variable declarations. For example, if foo.c contains
int some_global;
void some_function(int a, char b)
{
/* Do some computation */
}
Then in order to make these accessible from other source files, foo.h needs to contain
extern int some_global;
void some_function(int, char);
Then, you #include "foo.h" wherever you want to use some_global or some_function. Since headers can include other headers, it is usual to wrap headers in "include guards" so that declarations are not duplicated. For example, foo.h should really read:
#ifndef FOO_H
#define FOO_H
extern int some_global;
void some_function(int, char);
#endif
This means that the header will only be processed once per compilation unit (source file).
As for how to compile them, never put .h files on the compiler command line, since they should not contain any compile-able code (only declarations). In most cases it is perfectly fine to compile as
cc main.c 1.c 2.c 3.c ... [etc]
However if you have 50 source files, it is probably a lot more convenient if you use a build system. On Linux, this is a Makefile. On windows, it depends what development environment you are using. You can google for that, or ask another SO question once you specify your platform (as this question is pretty broad already).
One of the advantages of a build system is that they compile each source file independently, and then link them all together, so that when you change only one source file, only that file needs to be re-compiled (and the program re-linked) rather than having everything re-compiled including the stuff that didn't get changed. This makes a big time difference when your program gets large.
You can combine several .c files to a library. Those libraries can be linked with other .c files to become the executable.
You can use a makefile to create a big project.
The makefile has a set of rules. Each rule describes the steps needed to create one piece of the program and their dependencies with other pieces or source files.
You need to create a shared library, the standard library is a shared library that is implicitly linked in your program.
Once you have your shared library you can use the .h files and just compile the program with -lyourlib wich is implicit for the libc
Create one using:
gcc -shared test.c -o libtest.so
And then compile your program like:
gcc myprogram.c -ltest -o myprogram
For your second question I advise you to use Makefiles
http://www.gnu.org/software/make/
The standard library is already compliled and placed on your machine ready to get dynamically linked. This means that the library is dynamically loaded when needed by a program. Compare this to a static library which gets compiled INTO your program when you run the compiler/linker.
This is why you need to compile your code and not the standard library code. You could build a dynamic (shared) library yourself.
For reference, #include <stdio.h> does not IMPORT the standard library. It just allows the compile and link to see the public interface of the library (To know what functions are used, what parameters they take, what types are defined, what sizes they are, etc).
Dynamic Loading
Shared Library
You could split your files up into modules, and create shared libraries. But generally as projects get bigger you tend to need a better mechanism to build your program (and libraries). Rather than directly calling the compiler when you need to do a rebuild you should use a make program or a complete build system like the GNU Build System.
If you really want it to be as simple as just including a .h file, all of your "library" code needs to be in the .h file. However, in this scenario, someone can only include your .h file into one and only one .c file. That may be ok, depending on how someone will use your "library".

Resources