GNU Triplet, GCC and Linux kernel compiling - c

My native gcc says, that its triplet is the following.
> gcc -dumpmachine
x86_64-suse-linux
Where cpu-vendor-os are correspondingly x86_64, suse, linux. The latter means that glibs is in use(?). When I am doing cross-compiling busybux-based system the compiler triplet is something like avr32-linux-uclibc, where os is 'linux-uclibc', meaning that uclibc is used.
The difference between 'linux-glibc' and 'linux-uclibc' is (AFAIU) in collect2 behavior and libgcc.a content. Either glibc or uclibs are silently linked to the target binary.
The questions is that how is the linux kernel been compiled by the same compilers? As soon as the kernel runs on bare-metal it must not been linked with any kind of user-space libc, and should use appropriate libgcc.a

gcc has all kind of options to control how it works.
Here's a few relevant ones:
-nostdlib to omit linking to the standard libraries and startup code
-nostdinc to omit searching for header files in the standard locations.
-ffreestanding to compile for a freestanding environment (such as a kernel)
You also do not need to use gcc for linking. You can invoke the linker directly, supply it with your own linker map, startup object code, and anything else you need.
The linux kernel build seems, for arbitrary reasons not to use -ffreestanding , it does control the linking stage though, and ensures the kernel gets linked without pulling in any userspace code.

Related

why my executable is bigger after linking

I have a small C program which I need to run on different chips.
The executable should be smaller than 32kb.
For this I have several toolchains with different compilers for arm, mips etc.
The program consists of several files which each is compiled to an object file and then linked together to an executable.
When I use the system gcc (x86) my executable is 15kb big.
With the arm toolchain the executable is 65kb big.
With another toolchain it is 47kb.
For example for arm all objects which are included in the executable are 14kb big together.
The objects are compiled with the following options:
-march=armv7-m -mtune=cortex-m3 -mthumb -msoft-float -Os
For linking the following options are used:
-s -specs=nosys.specs -march-armv7-m
The nosys.specs library is 274 bytes big.
Why is my executable still so much bigger (65kb) when my code is only 14kb and the library 274 bytes?
Update:
After suggestions from the answer I removed all malloc and printf commands from my code and removed the unused includes. Also I added the compile flags -ffunction-sections -fdata-sections and linking flag --gc-sections , but the executable is still too big.
For experimenting I created a dummy program:
int main()
{
return 1;
}
When I compile the program with different compilers I get very different executable sizes:
8.3 KB : gcc -Os
22 KB : r2-gcc -Os
40 KB : arm-gcc --specs=nosys.specs -Os
1.1 KB : avr-gcc -Os
So why is my arm-gcc executable so much bigger?
The avr-gcc executable does static linking as well, I guess.
Your x86 executable is probably being dynamically linked, so any standard library functions you use -- malloc, printf, string and math functions, etc -- are not included in the binary.
The ARM executable is being statically linked, so those functions must be included in your binary. This is why it's larger. To make it smaller, you may want to consider compiling with -ffunction-sections -fdata-sections, then linking with --gc-sections to discard any unused functions or data from your binary.
(The "nosys.specs library" is not a library. It's a configuration file. The real library files are elsewhere.)
The embedded software porting depends on target hardware and software platform.
The hardware platforms are splitted into mcu, cpus which could run linux os.
The software platform includes compiler toolchain and libraries.
It's meaningless to compare the program image size for mcu and x86 hardware platform.
But it's worth to compare the program image size on the same type of CPU using different toolchains.

linking pgi compiled library with gcc linker

I would like to know how to link a pgc++ compiled code (blabla.a) with a main code compiled with c++ or g++ GNU compiler.
For the moment linking with default gnu c++ linker gives errors like:
undefined reference to `__pgio_initu'
As the previous person already pointed out, PGI supports G++ name mangling when using the pgc++ command. Judging from this output, I'm guessing that you're linking with g++ rather than pgc++. I've had the most success when using pgc++ as the linker so that it finds the PGI libraries. If that's not an option, you can link an executable with pgc++ -dryrun to get the full link line and past the -L and -l options from there to get the same libraries.
Different C++ compilers use different name-mangling conventions
to generate the names that they expose to the linker, so the member function
name int A::foo(int) will be emitted to to the linker by compiler A as one string
of goobledegook, and by compiler B as quite a different string of goobledegook,
and the linker has no way of knowing they refer to the same function. Hence
you can't link object files produced by different C++ compilers unless they
employ the same name-mangling convention (and quite possibly not even then: name-mangling
is just one aspect of ABI compatibility.)
That being said, according to this document,
PGC++ supported name-mangling compatibility with g++ 3-and-half years ago, provided that PGI C++ compiler was invoked with precisely the command pgc++ or pgcpp --gnu. It may be that the library you are dealing with was not built in that specific way, or perhaps was built with an older PGI C++ compiler.
Anyhow, if g++ compiles the headers of your blabla.a and emits different
symbols from the ones in blabla.a, you can't link g++ code with blabla.a.
You'd need to rebuild blabla.a with g++, which perhaps is not an option.

force_load linker flag for other platforms

I need to include all symbols from a static library. "-force_load" is good when compiling with Xcode. But, for example, when using it under Ubuntu with gcc, "-force_load" is not recognized. I'm looking for alternative options that can be used under other operating systems. Thanks.
The GNU linker's option is called --whole-archive, but while -force_load applies to one library, --whole-archive applies to all libraries after it on the command line. So the usual thing is to do --whole-archive somelib.a --no-whole-archive.
Usually you don't use ld directly but instead invoke it via GCC, in which case you have to tell GCC to pass the options on to the linker: -Wl,--whole-archive,somelib.a,--no-whole-archive

Standard functions defined in header files or automatically linked?

When writing a basic c program.
#include <stdio.h>
main(){
printf("program");
}
Is the definition of printf in "stdio.h" or is the printf function automatically linked?
Usually, in stdio.h there's only the prototype; the definition should be inside a library that your object module is automatically linked against (the various msvcrt for VC++ on Windows, libcsomething for gcc on Linux).
By the way, it's <stdio.h>, not "stdio.h".
Usually they are automatically linked, but the compiler is allowed to implement them as it pleases (even by compiler magic).
The #include is still necessary, because it brings the standard functions into scope.
Stricto sensu, the compiler and the linker are different things (and I am not sure that the C standard speaks of compilation & linking, it more abstractly speaks of translation and implementation issues).
For instance, on Linux, you often use gcc to translate your hello.c source file, and gcc is a "driving program" which runs the compiler cc1, the assembler as, the linker ld etc.
On Linux, the <stdio.h> header is an ordinary file. Run gcc -v -Wall -H hello.c -o hello to understand what is happening. The -v option asks gcc to show you the actual programs (cc1 and others) that are used. The -Wall flag asks for all warnings (don't ignore them!). The -H flag asks the compiler to show you the header files which are included.
The header file /usr/include/stdio.h is #include-ing itself other headers. At some point, the declaration of printf is seen, and the compiler parses it and adjust its state accordingly.
Later, the gcc command would run the linker ld and ask it to link the standard C library (on my system /usr/lib/x86_64-linux-gnu/libc.so). This library contains the [object] code of printf
I am not sure to understand your question. Reading wikipedia's page about compilers, linkers, linux kernel, system calls should be useful.
You should not want gcc to link automagically your own additional libraries. That would be confusing. (but if you really wanted to do that with GCC, read about GCC specs file)

gcc switches - what do these do?

I am new with using gcc and so I have a couple of questions.
What do the following switches accomplish:
gcc -v -lm -lfftw3 code.c
I know that lfftw3 is an .h file used with code.c but why is it part of the command?
I couldn't find out what -lm does in my search. What does it do?
I think I found out -v causes gcc to display programs invoked by it.
-l specifies a library to include. In this case, you're including the math library (-lm) and the fftw3 library (-lffw3). The library will be somewhere in your library path, possibly /usr/lib, and will be named something like libffw3.so
From GCC's man page:
-v Print (on standard error output) the commands executed to run the
stages of compilation. Also print the version number of the
compiler driver program and of the preprocessor and the compiler
proper.
-l library
Search the library named library when linking. (The second
alternative with the library as a separate argument is only for
POSIX compliance and is not recommended.)
It makes a difference where in the command you write this option;
the linker searches and processes libraries and object files in the
order they are specified. Thus, foo.o -lz bar.o searches library z
after file foo.o but before bar.o. If bar.o refers to functions in
z, those functions may not be loaded.
The linker searches a standard list of directories for the library,
which is actually a file named liblibrary.a. The linker then uses
this file as if it had been specified precisely by name.
The directories searched include several standard system
directories plus any that you specify with -L.
Normally the files found this way are library files---archive files
whose members are object files. The linker handles an archive file
by scanning through it for members which define symbols that have
so far been referenced but not defined. But if the file that is
found is an ordinary object file, it is linked in the usual
fashion. The only difference between using an -l option and
specifying a file name is that -l surrounds library with lib and .a
and searches several directories.
libm is the library that math.h uses, so -lm includes that library. You might want to get a better grasp of the concept of linking. Basically, that switch adds a bunch of compiled code to your program.
-lm links your program with the math library.
-v is the verbose (extra ouput) flag for the compiler.
-lfftw3 links your program with fftw3 library.
You just include headers by using #include "fftw3.h". If you want to actually include the code associated to it, you need to link it. -l is for that. Linking with libraries.
arguments starting with -l specify a library which is linked into the program. Like Pablo Santa Cruz said, -lm is the standard math library, -lfftw3 is a library for fourier transformation.
Try man when you're trying to learn about a command.
From man gcc
-v Print (on standard error output) the commands executed to run the
stages of compilation. Also print the version number of the
com-
piler driver program and of the preprocessor and the compiler
proper.
As Pablo stated, -lm links your math library.
-lfftw3 links in a library used for Fourier transforms. The project page, with more info can be found here:
http://www.fftw.org/
The net gist of all these statements is that they compile your code file into a program, which will be named the default (a.out) and is dependent on function calls from the math and fourier transform libs. The -v statement just helps you keep track of the compilation process and diagnose errors should occur.
In addition to man gcc which should be the first stop for questions about any command, you can also try the almost standard --help option. Even for commands that don't support it, an unsupported option usually causes it to print an error containing usage information that should hint at a similar option. In this case, gcc will display a terse (for gcc, its only about 50 lines long) help summary listing the small number of options that are understood by the gcc program itself rather than passed on to its component programs. After the description of the --help option itself, it lists --target-help and -v --help as ways to get more information about the target architecture and the component programs.
My MinGW GCC 3.4.5 installation generates more than 1200 lines of output from gcc -v --help on Windows XP. I'm pretty sure that doesn't get much smaller in other installations.
It would also be a good idea to read the official manual for GCC. It is also helpful to read the documentation for the linker (ld) and assembler (often gas or just as, but it may be some platform specific assembler as well); aside from a platform-specific assembler, these are documented as part of the binutils collection.
General familiarity with the command line style of Unix tools is also helpful. The idea that a single-character option's value might not be delimited from the option name is a convention that goes back essentially as far as Unix does. The modern convention (promulgated by GNU) that multiple-character option names are introduced by -- instead of just - implies that -lm might be a synonym for -l m (or the pair of options -l -m in some conventions but that happens not to be the case for gcc) but it is probably not a single option named -lm. You will see a similar pattern with the -f options that control specific optimizations or the -W options that control warnings, for example.

Resources