Step by step C compilation result in segfault - c

I'm trying to understand C compilation.
Given this simple C code in main.c:
int main() {
int a;
a = 42;
return 0;
}
I performed the following operations:
cpp main.c main.i
/usr/lib/gcc/x86_64-linux-gnu/9/cc1 main.i -o main.s
as -o main.o main.s
ld -o main.exe main.o
When executing main.exe, I get a segmentation fault.
How can I get correct memory addressing in this example?

When I try the sequence of commands from your question on an x86_64 Ubuntu 19.10 system, I get a warning from ld:
ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000
This is an indication that something is wrong.
The warning means that the linker did not find a symbol _start and used a default entry address instead. When you run the program, execution starts at that address with no startup code around it, so nothing prepares the environment that main expects or exits cleanly once main returns, and the program crashes.
An executable program compiled from C code doesn't contain only your code. The compiler driver instructs the linker to add the C run-time library and startup code. The startup code is responsible for initialization and for calling your main function.
Run e.g.
gcc -v -o main.exe main.o
to see what other files get added to your program. On my system this shows a few files with names starting with crt, which stands for "C runtime".
If you don't use gcc to link your program but use ld directly, you have to manually add all necessary object files in a similar way as the compiler would do automatically.
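To make the role of the startup code concrete, here is a minimal sketch of a hand-written entry point, assuming x86-64 Linux and GCC or Clang. The names user_main and NO_CRT_SKETCH are made up for this illustration, and the real crt files do considerably more (argc/argv, environment, constructors, stdio setup), so treat it as a toy rather than as what the C library actually does.

```c
/* Sketch only: x86-64 Linux, GCC or Clang.
 * user_main and NO_CRT_SKETCH are hypothetical names for this example. */

/* Stands in for the main() from the question. */
int user_main(void)
{
    int a;
    a = 42;
    (void)a;            /* silence the "set but not used" warning */
    return 0;
}

#ifdef NO_CRT_SKETCH
/* The kernel jumps here directly; nothing called us, so we must never
 * return. The attribute realigns the stack, because at process entry it
 * is not laid out the way an ordinary C function expects. */
__attribute__((force_align_arg_pointer))
void _start(void)
{
    int status = user_main();

    /* exit(status): syscall number 60 on x86-64 Linux, argument in rdi. */
    __asm__ volatile("syscall"
                     :
                     : "a"(60L), "D"((long)status)
                     : "rcx", "r11", "memory");
    __builtin_unreachable();
}
#endif
```

Built with something like gcc -DNO_CRT_SKETCH -nostdlib -static -o main.exe main.c, the linker finds _start and the program should exit with status 0 instead of crashing. Without the macro the file is ordinary C, and user_main can be called from a normally linked program.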

Related

Different behavior of undefined reference error on linux gcc during linking with object file vs static library

I have following two source codes and want to link them.
// test.c
#include <stdio.h>
void lib2();
void lib1(){
lib2();
return;
}
// main.c
#include <stdio.h>
int main() {
return 0;
}
I've used gcc -c main.c and gcc -c test.c to generate object files
$ ls *.o
main.o test.o
and I've used the ar rcs test.a test.o command to generate a static library (test.a) from the object file test.o.
Then I tried to build an executable by linking main.o with test.a or test.o. As far as I know, a static library file (.a extension) is a simple collection of object files (.o), so I expected both to give the same result, either an error or success. But they didn't.
Linking with the object file gives an undefined reference error.
$ gcc -o main main.o test.o
/usr/bin/ld: test.o: in function `lib1':
test.c:(.text+0xe): undefined reference to `lib2'
collect2: error: ld returned 1 exit status
$
but linking with the static library gives no error and the build succeeds.
$ gcc -o main main.o test.a
$
Why is this happening? And how can I get undefined reference errors even when linking with static libraries?
If your code contains a function call expression then the language standard requires that a definition of that function exists (see C11 6.9/5). If you don't provide a definition then it is undefined behaviour with no diagnostic required.
The rule was written this way so that implementation vendors aren't forced to perform analysis to determine if a function is ever called or not; for example in your library scenario the compiler isn't forced to dig around in the library if none of the rest of the code contains anything that references that library.
It's totally up to the implementation what to do, and in your case it decides to give an error in one case and not the other. To avoid this, you can provide definitions for all the functions you call.
You might be able to modify the behaviour in the first case by using linker options such as elimination of unused code sections. Another thing you can do is call lib1() from main() -- this is still not guaranteed to produce an error but is more likely to.
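To force the linker to do some extra work, use the -flto option; the error will then go away.
ld does not pull in library members that are not used; it searches a library only for symbols that are still undefined in the object files. Imagine that you have a library in which some functions require user-defined callbacks. If the linker pulled in everything, you would have to define those callbacks in every program you link against the library, even if you never use those functions.
I expected both would give same result: error or success. but it didn't.
Your expectation is incorrect. An object file named on the command line is always linked in whole, whereas a member of a .a archive is extracted only if it defines a symbol that is still undefined at that point. A good explanation of the difference between .o and .a with respect to linking is here.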
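The difference can be sketched as a toy model in C. This is not how ld is implemented; struct object, toy_link and the fixed-size tables are invented for the illustration, and the model ignores command-line ordering across several libraries. It only captures the rule described above: explicit objects are always linked, archive members only when they define something that is still undefined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS   4   /* symbol slots per object, lists are NULL-terminated */
#define MAX_LINKED 8   /* capacity of the toy link set */

/* Hypothetical, heavily simplified view of an object file. */
struct object {
    const char *name;
    const char *defines[MAX_SYMS];
    const char *needs[MAX_SYMS];
};

/* Does the NULL-terminated list contain sym? */
static bool lists(const char *const *syms, const char *sym)
{
    for (; *syms != NULL; ++syms)
        if (strcmp(*syms, sym) == 0)
            return true;
    return false;
}

/* Is sym needed by some linked object and defined by none of them? */
static bool undefined_so_far(const struct object *const *linked, size_t n,
                             const char *sym)
{
    bool needed = false;
    for (size_t i = 0; i < n; ++i) {
        if (lists(linked[i]->defines, sym))
            return false;
        if (lists(linked[i]->needs, sym))
            needed = true;
    }
    return needed;
}

/* Toy link: returns the first symbol left undefined, or NULL on success. */
const char *toy_link(const struct object *objs, size_t nobjs,
                     const struct object *archive, size_t nmembers)
{
    const struct object *linked[MAX_LINKED];
    bool taken[MAX_LINKED] = { false };
    size_t n = 0;

    /* Objects named on the command line are always linked. */
    for (size_t i = 0; i < nobjs && n < MAX_LINKED; ++i)
        linked[n++] = &objs[i];

    /* Archive members are pulled in only while they resolve something. */
    for (bool progress = true; progress; ) {
        progress = false;
        for (size_t m = 0; m < nmembers && m < MAX_LINKED && n < MAX_LINKED; ++m) {
            if (taken[m])
                continue;
            for (const char *const *def = archive[m].defines; *def != NULL; ++def) {
                if (undefined_so_far(linked, n, *def)) {
                    linked[n++] = &archive[m];
                    taken[m] = true;
                    progress = true;
                    break;
                }
            }
        }
    }

    /* Anything still needed but not defined is an "undefined reference". */
    for (size_t i = 0; i < n; ++i)
        for (const char *const *need = linked[i]->needs; *need != NULL; ++need)
            if (undefined_so_far(linked, n, *need))
                return *need;
    return NULL;
}
```

Fed with a main.o that needs nothing and a test.o that defines lib1 and needs lib2, the model reports lib2 as missing when test.o is passed as an object, and reports nothing when test.o is only an archive member. If main.o itself needed lib1, the member would be pulled in and lib2 would be reported in the archive case too, which matches the suggestion to call lib1() from main().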

Steps of compilation

I am trying to compile a simple Hello World C program step by step: first creating the preprocessed file, then the assembly file, then the object file, and finally invoking the linker to create the ELF executable.
The problem is that I am not able to execute the ELF file that is created.
I have tried the following steps:
gcc -E hello.c 1>hello.i
gcc -S hello.i
gcc -c hello.s
ld -o hello hello.o -lc
At this step I got a warning saying
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400260
But an executable with the name hello is created.
When I try to execute the output using
./hello
I am getting the error
bash: ./hello: No such file or directory
//hello.c contains following code
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
printf("\tHello World \n");
return (0);
}
I would suggest that you link your object files (however they are produced) with gcc, not ld.
gcc will call ld with the appropriate options: it adds the C runtime startup files (which provide _start), the standard library, and the path of the dynamic linker, none of which ld supplies on its own.

C compiler gcc gives linker command failed error [duplicate]

I'm getting the following error and can't for the life of me figure out what I'm doing wrong.
$ gcc main.c -o main
Undefined symbols:
"_wtf", referenced from:
_main in ccu2Qr2V.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
main.c:
#include <stdio.h>
#include "wtf.h"
int main(void){
wtf();
return 0;
}
wtf.h:
void wtf();
wtf.c:
#include <stdio.h>
void wtf(){
printf("I never see the light of day.");
}
Now, if I include the entire function in the header file instead of just the signature, it compiles fine, so I know wtf.h is being included. Why doesn't the compiler see wtf.c? Or am I missing something?
Regards.
You need to link wtf with your main. The easiest way is to compile them together; gcc will link them for you, like this:
gcc main.c wtf.c -o main
Longer way (separate compilation of wtf):
gcc -c wtf.c
gcc main.c wtf.o -o main
Even longer (separate compilation and linking)
gcc -c wtf.c
gcc -c main.c
gcc main.o wtf.o -o main
Instead of the last gcc call you can run ld directly, but then you have to pass the C runtime startup files and libraries yourself; gcc adds them automatically.
You are missing the fact that merely including a header doesn't tell the compiler anything about where the actual implementation (the definitions) of the things declared in the header are.
They could be in a C file next to the one doing the include, they could come from a pre-compiled static library, or from a dynamic library loaded by the system's dynamic linker when it reads your executable, or they could arrive at run time through explicit, programmer-controlled dynamic loading (the dlopen() family of functions on Linux, for instance).
C is not like Java, there is no implicit rule that just because a C file includes a certain header, the compiler should also do something to "magically" find the implementation of the things declared in the header. You need to tell it.
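As an illustration of the last case mentioned above, explicit run-time loading, here is a minimal sketch using dlopen() and dlsym(). It assumes a glibc-based Linux system where the math library is installed as libm.so.6; the function name cos_via_dlopen is invented for the example.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Looks up cos() at run time instead of letting the linker resolve it.
 * Returns 0 and stores cos(x) in *out on success, -1 if loading fails.
 * "libm.so.6" is an assumption about the system (glibc on Linux). */
int cos_via_dlopen(double x, double *out)
{
    void *handle = dlopen("libm.so.6", RTLD_NOW);
    if (handle == NULL)
        return -1;

    /* dlsym() returns void *; POSIX allows converting it to a function pointer. */
    double (*cos_fn)(double) = (double (*)(double))dlsym(handle, "cos");
    if (cos_fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = cos_fn(x);
    dlclose(handle);
    return 0;
}
```

Nothing in this file names libm at link time; the definition of cos is found only when the function runs. On older glibc versions the program itself may need to be linked with -ldl.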

What is wrong with my header files?

I've just completed a school assignment and I'm having a problem testing my code because I keep getting the following output after running make packetize (it's a makefile the professor gave us)
cc packetize.c -o packetize
/tmp/ccJJyqF6.o: In function `block_to_packet':
packetize.c:(.text+0xb1): undefined reference to `crc_message'
collect2: ld returned 1 exit status
make: *** [packetize] Error 1
block_to_packet is defined in a file called packetize.c, and crc_message is defined in crc16.c (both of which contain an #include "data.h" line). data.h also has the declaration of crc_message in it. All of these files are in the same directory. I've been trying to compile them for the past hour and a half and have searched Google endlessly to no avail. From what I've read it has something to do with linking, but my instructor has not taught this, so I don't know how to compile these files to test their output. Can anyone let me know what's wrong?
Your header files are absolutely OK. What you have there is a linker error: The compilation of packetize.c ran without problems, but then you're trying to link an executable file packetize (since you did not give the -c option which states "compile to object file"). And the executable would need the compiled code from crc16.c as well.
Either you have to give all sources on the compiler line:
cc packetize.c crc16.c -o myApp
Or you have to compile into individual object files, eventually linked together:
cc -c packetize.c -o packetize.o
cc -c crc16.c -o crc16.o
cc packetize.o crc16.o -o myApp
The former is what you'd do in a one-shot command line, the latter is what a Makefile usually does. (Because you do not need to recompile crc16.c if all you did was modify packetize.c. In large projects, recompiles can take significant amounts of time.)
Edit:
Tutorial time. Take note of the existence / absence of -c options in the command lines given.
Consider:
// foo.c
int foo()
{
return 42;
}
A source file defining the function foo().
// foo.h
int foo();
A header file declaring the function foo().
// main.c
#include "foo.h"
int main()
{
return foo();
}
A source file referencing foo().
In the file main.c, the include makes the compiler aware that, eventually, somewhere, there will be a definition of the function foo() declared in foo.h. All the compiler needs to know at this point is that the function will exist, that it takes no arguments, and that it returns int. That is enough to compile the source to object code:
cc -c main.c -o main.o
However, it is not enough to actually compile an executable:
cc main.c -o testproc # fail of compile-source-to-exe
ld main.o -o testproc # fail of link-object-to-exe
The compiler was promised (by the declaration) that a definition of foo() will exist, and that was enough for the compiler.
The linker, however (implicitly run by cc in the first example), needs that definition. The executable needs to execute the function foo(), but it is nowhere to be found in main.c. The reference to foo() cannot be resolved, which is what the "undefined reference" error reports.
You need to either compile both source files in one go...
cc foo.c main.c -o testproc # compile-source-to-exe
...or compile foo.c as well and provide the linker with both object files so it can resolve all references:
cc -c foo.c -o foo.o
ld foo.o main.o -o testproc # link-objects-to-exe
Post Scriptum: Calling ld directly as pictured above most likely will not work just like that. Linking needs a couple of additional parameters, which cc adds implicitly -- the C runtime support, the standard C library, stuff like that. I did not give those parameters in the examples above as they would confuse the matter and are beyond the scope of the question.
You have to compile crc16.c as well and link the two object files to build the binary. Otherwise the linker has no definition for crc_message(), which is called from packetize.c.
Try using
cc packetize.c crc16.c -o packetize
Your call to crc_message() from packetize.c will then be just fine.
As Totland writes, crc_message is defined in crc16.c, which means that packetize.c can't see the definition, no matter how many shared headers they have. You do not have a compile error but an error from the linker.
If you compile your C files to object files first and then link them all into an executable, it will work.
