Trying to understand the main function with GCC and Windows - c

They say that main() is a function like any other function, but "marked" as an entry point inside the binary, an entry point that the operating system may find (Don't know how) and start the program from there. So, I'm trying to find out more about this function. What have I done? I created a simple .C file with this code inside:
int main(int argc, char **argv) {
return (0);
}
I saved the file, installed the GCC compiler (in Windows, MingW environment) and created a batch file like this:
gcc -c test.c -nostartfiles -nodefaultlibs -nostdlib -nostdinc -o test.o
gcc -o test.exe -nostartfiles -nodefaultlibs -nostdlib -nostdinc -s -O2 test.o
#%comspec%
I did this to obtain a very simplistic compiler and linker, no library, no header, just the compiler. So, the compiling goes well but the linking stops with this error:
test.c:(.text+0xa): undefined reference to '___main'
collect2.exe: error: ld returned 1 exit status
I thought that the main function is exported by the linker but I believed that you didn't need any library with additional information about it. But it looks like it does. In my case I supposed that it must be the standard GCC library, so I downloaded the source code of it and opened this file: libgcc2.c
Now, I don't know if that is the file where the main function is constructed to be linked by GCC. In fact, I don't understand how the main function is used by GCC. Why does the linker need the gcc standard libraries? To know what about main? I hope this has made my question quite specific and clear. Thanks!

When gcc puts together all object files (test.o) and libraries to form a binary it also prepends a small object (usually crt0.o or crt1.o), which is responsible for calling your main(). You can see what gcc is doing, when you add -v on the command line:
$ gcc -v -o test.exe test.o
crt0/crt1 does some setup and then calls main(). The linker is ultimately responsible for building the executable in the format the OS expects. With -v you can also see an option naming the target format. In my case it's 64-bit Linux: -m elf_x86_64. On MinGW it will be a PE target, something like -m i386pe (32-bit) or -m i386pep (64-bit).

The error happens because you use these two options: -nodefaultlibs -nostdlib
These tell GCC that it should not link your code against libc.a/c.lib which contains the code which really calls main(). In a nutshell, every OS is slightly different and most of them don't care about C and main(). Each has their own special way to start a process and most of them are not compatible with the C API.
So the solution of the C developers was to put "glue code" into the C standard library libc.a. It provides the interface the OS expects, creates the standard C environment (setting up the memory allocation structures so that malloc() maps onto the OS's memory management functions, setting up stdio, etc.) and eventually calls main().
For C developers, this means they get a libc.a for their OS (along with the compiler binaries) and they don't need to care about how the setup works.
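Another source of confusion is the name in the error message. On 32-bit Windows, C symbols get a leading underscore, so main() becomes _main, and the ___main in your error is the C function __main(). That is not your main(): it is a helper in libgcc that GCC calls at the start of main() on some targets (MinGW among them) to run global constructors. That is why the linker asks for the GCC support library even for an empty program.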

Related

Calling functions from an external C file that has its own main()

I have two C files, program.c and tests.c, that each contain a main function.
program.c is a standalone program that compiles and runs normally on its own. But I would also like to be able to use some of its functions in tests.c (without using a common header file). Is there a way of doing this?
If I insert the prototype of the function I want from program.c into tests.c and compile with:
gcc -o program.o -c program.c
gcc -o tests.o -c tests.c
gcc -o tests tests.o program.o
I obtain an error duplicate symbol _main, which I understand since there are indeed two `main' functions.
I basically would like to be able to treat program.c both as a standalone program and as a library, similarly to what could be done in Python with if __name__ == '__main__'.
If you need two separate executables that share some functionality, you can place the common functions in a third file and compile it as a shared library (a DLL on Windows, a shared object on Linux). Such a file contains shareable, executable code without a main() function. It is linked against at build time and loaded dynamically into your executable at run time.
Here is a step by step set of instructions for shared library using GCC and Linux.
Here is a step by step example for creating DLL using GCC in windows.
So I managed to achieve what I wanted thanks to the comment from @pmg:
I compile program.c into a standalone binary (gcc -o program program.c), but I also compile it into an object file with "main" renamed (gcc -c -Dmain=mainp -o program.o program.c).
I can then use this object file (that does not contain a "main" symbol anymore) to compile tests.c: gcc -o tests tests.c program.o.
Thanks @pmg, I did not know this use of the -D option.

How to write common functions for reusing in C

I am trying to write a common function that other files can reuse. Here is the example; I have three files:
The first file: cat test1.h
void say();
The second file: cat test1.c
void say(){
printf("This is c example!");
}
The third file: cat test2.c
#include "test1.h"
void main(){
say();
}
but when I ran: gcc -g -o test2 test2.c
it threw error as:
undefined reference to `say'
Additionally: I know this would work: gcc -g -o test2 test1.c test2.c
but I don't want to do this, because the other team will use the server and I want them to use my compiled binary directly, not the source code. I hope it can work the way printf() does, where we only need to include <stdio.h>.
You can build yourself a library from the object files containing your useful functions, and store the header(s) that describe them in a convenient location. You and your colleagues then compile with the headers and link that library with any executables that use any of those functions. That's very much the same general mechanism that the C compiler uses to include the standard headers and automatically link with the standard C library.
The mechanics vary a bit depending on platform (Windows vs Unix being the primary distinction, though there are differences between Unix platforms too), and also on the type of library (static archive vs dynamic linked / loaded libraries — also known as shared objects or shared libraries).
In broad outline, for a Unix system with a static library, you'd:
Compile library object files libfile1.o, libfile2.o, … using (for example) gcc -c libfile1.c libfile2.c.
Create an archive from the object files — using for example ar r libname.a libfile1.o libfile2.o.
Copy the headers to a standard location such as /usr/local/include.
Copy the library to a standard location such as /usr/local/lib.
You'd compile any code that uses the library functions with -I/usr/local/include (if that is not already a standard compilation option).
You'd link the programs with -L/usr/local/lib -lname (you might not need to specify -L… but you would need to specify -lname).
Including a header file does not make a function available. It simply informs the compiler that the function will be provided at a later time.
You should compile the file with the function into a shareable object file (or a library if there is more than one function that you want to share). Mind the switch -c which tells gcc not to build an executable file:
gcc -o test1.o test1.c -c
Similarly, compile the main function into its own object file. Now you or anyone else can link the object file with their main program:
gcc -o test2 test2.o test1.o
The process can be automated using make.
Other programmers can use compiled object files (`*.o') in their programs. They need only to have a header file with function prototypes, extern data declarations and type definitions.
You can also wrap many object files into the library.
On many systems you can also create the dynamic linked libraries which do not have to be linked into the executable.
You also need to compile test1.c:
gcc -g -o test2 test1.c test2.c

gcc ld linker issues [Need Help] [duplicate]

I'm trying to understand C compilation in a little more depth, and so I'm compiling and linking "manually". Here is my code
int main()
{
return 0;
}
And here is what I'm putting into my console (Windows):
gcc -S main.c
as main.s -o main.o
ld main.o
And when trying to link, I get:
main.o:main.c:(.text+0x7): undefined reference to `__main'
You didn't link any of the necessary support libraries. C global objects like stdin, stdout, stderr don't just appear from nowhere. Command arguments and environment variables are pulled from the operating system. And on exit all those atexit() functions get called and the return code from main is passed to exit(return_code). Etc.
Check out the commands gcc -dumpspecs, gcc -print-libgcc-file-name. Look at all of the other libraries in that directory. You'll find a lot of those libraries and object files referenced in the output of dumpspecs. I don't know exactly when or how those spec rules are interpreted but you can probably get the idea. And I think the GCC info pages info gcc explain it in detail if you dig in far enough.
info gcc and then press 'g' and then enter 'Spec Files'
And as Jonathan Leffler said, the shortcut is to run gcc with the verbose option: gcc -v and just see what commands it used.

gcc detect duplicate symbols/functions in static libraries

Is there any way we can get gcc to detect a duplicate symbol in static libraries vs the main code (Or another static library ?)
Here's the situation:
main.c erroneously contained a function definition, e.g. with the signature uint foohash(const char*)
foo.c also contains a function definition with the signature uint foohash(const char*)
foo.c and other source files are compiled to a static util library, which the main program links in, i.e. something like:
gcc -o main main.o util.o -L ./libs -lfooutils
So, now main.o and libs/libfooutils.a both contain a foohash function. Presumably the linker found that symbol in main.o and doesn't bother looking for it elsewhere.
Is there any way we can get gcc to detect such a situation ?
Indeed as Simon Richter stated, --whole-archive option can be useful. Try to change your command-line to:
gcc -o main main.o util.o -L ./libs -Wl,--whole-archive -lfooutils -Wl,--no-whole-archive
and you'll see a multiple definition error.
gcc calls the ld program for linking. The relevant ld options are:
--no-define-common
--traditional-format
--warn-common
See the man page for ld. These should be what you need to experiment with to get the warnings sought.
Short answer: no.
GCC does not actually do anything with libraries. It is the task of ld, the linker (called automatically by GCC) to pull in symbols from libraries, and that's really a fairly dumb tool.
The linker has lots of complex jiggery pokery for combining different types of data from different sources, and supporting different file formats, and all the evil little details of binary executables, but in the end, all it really does is look for undefined symbols and find the definitions.
What you can do is a link trace (pass -t to gcc) to see what comes from where. Or else run nm on all the object files and libraries in your system, and write a script to detect duplicates.

Resources