Is it possible to generate small executables with MinGW g++? - linker

I understand that MinGW g++ produces larger executables because it statically links many things. MSVC++, on the other hand, links dynamically against DLLs from the VCRedist package, which is why it produces much smaller executables.
However, is it possible to compile with g++ in a similar manner on Windows? Not necessarily MinGW g++, but something I can use with Qt Creator (I didn't add Qt as a tag because it's irrelevant to the question).

MinGW is perfectly capable of linking to the msvcrt runtime dynamically. The only mess you're not getting rid of this way is GCC/MinGW startup code, which is not very large.
A small C++ test program (a simple iostream hello-world; note that I got the same results for a plain C printf version):
#include <iostream>
using namespace std;

int main()
{
    cout << "Hello World!" << endl;
    return 0;
}
Command lines:
g++ main.cpp -MD -Os -s -o test.exe
cl /MD /Os main.cpp /link /out:test2.exe
Executable file sizes:
GCC: 13kB
MSVC: 6kB
Although that is double the size, the difference is almost entirely startup code; for larger programs it becomes negligible.

To make a fair comparison between VC++ and MinGW using static linking, I would suggest removing the /MD compiler switch from the command line above. This makes the Visual C++ compiler link statically against the static libraries instead, and even then it will generate a much smaller executable than one compiled statically with MinGW.
That is because the linker used by the Visual C++ compiler supports function-level linking: it links in only what is needed, based on the functions your code actually references. Unreferenced or unused functions are not linked into the final executable, resulting in a much smaller statically linked binary.
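To see this in action, here is a hypothetical two-function file; unused() is never referenced, so compiling with /Gy (function-level linking) and linking with /OPT:REF lets the linker drop it from the final image:

/* demo.c -- hypothetical example */
#include <stdio.h>

/* never referenced: can be discarded by the linker under /Gy + /OPT:REF */
void unused(void)
{
    puts("this function can be dropped");
}

int main(void)
{
    puts("only main and what it calls survive");
    return 0;
}

cl /Os /Gy demo.c /link /OPT:REF /out:demo.exe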
Going back to the iostream hello-world example from the beginning, this time using static linking, the command line would be:
cl /Os main.cpp /link /out:test2.exe
Note that I have removed the /MD switch so that the compiler uses static linking instead of dynamic.
To make an even smaller statically linked executable, I suggest this command line:
cl /Ox main.cpp /link /FILEALIGN:512 /OPT:REF /OPT:ICF /INCREMENTAL:NO /out:test2.exe
If you check the resulting binary, you will notice that it is much smaller, and it is again a statically linked executable.
I got this idea from the discussion at http://www.catch22.net/tuts/minexe
Most Pascal compilers, including Delphi, have the same linking feature, known there as smart linking, and the resulting statically linked executables are much smaller than those produced by the Visual C++ compiler.
The linker used by MinGW is very dumb: it is bloat-unaware, so it links in many static libraries, including ones containing functions or routines that are never used by your source code, which leads to very bloated statically linked binaries.
I would advise dropping MinGW and using the Visual C++ compiler instead. Even the MinGW developers don't seem to care about reducing code bloat when linking statically.
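(For completeness: GCC-based toolchains can approximate function-level linking with section-based garbage collection, placing each function and data item in its own section and letting the linker discard unreferenced ones; how much this helps with MinGW's static libraries varies by toolchain:)

g++ -Os -ffunction-sections -fdata-sections main.cpp -Wl,--gc-sections -s -o test.exe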

You can use Cygwin (www.cygwin.com). It uses a runtime DLL much like MSVCRT. Then your program depends on the Cygwin runtime, of course (kind of a tautology, sorry).

Related

GCC: compiling an application without linking any library

I know how to compile a C application without linking any library using GCC in a bare-metal embedded application: I just set up the startup function(s) and, if needed, an assembly startup.s file.
However, I am not able to do the same thing on Windows (I am using MinGW32 GCC). It seems that linking with -nostdlib also removes everything that needs to run before main, so I would have to write specific startup code, but I could not find any documentation about that.
The reason I need to compile without the C standard library is that I am writing a reduced C standard library for small 32-bit microcontrollers, and I would like to test and unit-test this library using GCC under Windows. If there is a simpler alternative way, that is fine with me too.
Thanks.
I found the solution: passing -nostdlib and -lgcc together to ld (or to gcc used as the linker). This way the C standard library is not linked into the application automatically, but everything needed to start the application up is.
I also found that the order of these switches matters: depending on their order and position, the link may fail entirely, complain about a missing atexit() function, or work without any error or warning.
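For illustration, a sketch of the kind of invocation described (main.o is a hypothetical application object; as noted above, the exact order and position of the switches may need adjusting for your toolchain):

gcc -nostdlib main.o -o app.exe -lgcc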
I discovered another little complication with Eclipse-based IDEs: the Settings menu offers several different places for linker options, so to get the options in the right order I had to set them in different places.
After that I hit a new problem: I had not considered that unit-test libraries require at least a function able to write to stdout or to a file.
I found that using "" and <> in the #include directives lets me choose which modules come from my library and which from the C standard library.
So, for instance:
#include "string.h" // points to my library include
#include <stdio.h> // points to C stdlib include
lets me test all my library's string functions using the C stdlib's stdout functions.
This works with both native GCC and GCC cross-compilers.
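A minimal sketch of that arrangement, assuming the reduced library ships its own string.h providing strlen with the standard signature (compiling with -fno-builtin may be needed so GCC does not fold the call away):

/* test_string.c -- hypothetical unit test */
#include "string.h"   /* "" picks up my reduced library's header */
#include <stdio.h>    /* <> picks up the host C stdlib for output */

int main(void)
{
    char buf[8] = "abc";          /* runtime buffer: forces a real call */
    if (strlen(buf) == 3)
        printf("strlen: PASS\n");
    else
        printf("strlen: FAIL\n");
    return 0;
}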

How does a compiler find out which dynamic link library my code will use, if I only include header files, which do not describe that?

How does a compiler find out which dynamic link library my code will use, if I only include header files, which do not describe that?
#include <stdio.h>

int main(void)
{
    printf("Hello world\n");
    return 0;
}
I only include stdio.h, and my code uses the printf function.
How is this known? Header files describe prototypes, macros, and constants, but say nothing about which file implements printf. How does it work, then?
When you build a runnable executable, you don't just specify the source code, but also a list of libraries in which undefined references are looked up. With the C standard library this happens implicitly (unless you pass GCC -nostdlib), so you may not have been consciously aware of it.
The libraries are only consumed by the linker, not the compiler. The linker locates all the undefined references in the libraries. If the library is a static one, the linker just adds the actual machine code to your final executable. On the other hand, if the library is a shared one, the linker only records the name (and version?) of the library in the executable's header. It is then the job of the loader to find appropriate libraries at load time and resolve the missing dependencies on the fly.
On Linux, you can use ldd to list the load-time dependencies of a dynamically linked executable, e.g. try ldd /bin/ls. (On MacOS, you can use otool -L for the same purpose.)
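A small demonstration of explicit library linking, assuming a traditional Linux toolchain where sqrt lives in libm (newer glibc versions merge libm into libc, so the link failure may not reproduce there):

/* needs_libm.c */
#include <stdio.h>
#include <math.h>

int main(int argc, char **argv)
{
    (void)argv;
    double x = argc;             /* runtime value, so the call is not folded away */
    printf("%f\n", sqrt(x));     /* resolved from libm at link time */
    return 0;
}

gcc needs_libm.c             # fails: undefined reference to 'sqrt'
gcc needs_libm.c -lm         # works: libm named explicitly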
As others have answered, the standard c library is implicitly linked. If you are using gcc you can use the -Wl,--trace option to see what the linker is doing.
I tested your example code:
gcc -Wl,--trace main.c
Gives:
/usr/bin/ld: mode elf_x86_64
/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crt1.o
/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/4.6/crtbegin.o
/tmp/ccCjfUFN.o
-lgcc_s (/usr/lib/gcc/x86_64-linux-gnu/4.6/libgcc_s.so)
/lib/x86_64-linux-gnu/libc.so.6
(/usr/lib/x86_64-linux-gnu/libc_nonshared.a)elf-init.oS
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
-lgcc_s (/usr/lib/gcc/x86_64-linux-gnu/4.6/libgcc_s.so)
/usr/lib/gcc/x86_64-linux-gnu/4.6/crtend.o
/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crtn.o
This shows that the linker is using libc.so (and also ld-linux.so).
GCC links glibc by default; there is no need to pass an -l option for it when building your executable. That is why printf and the other functions that are part of glibc do not need to be linked explicitly.
Technically your compiler does not figure out which libraries will be used. The linker (commonly ld) does this. The header files only tell the compiler what interface your library functions use and leaves it up to the linker to figure out where they are.
A source file travels a long path before it becomes an executable. Commonly:
source.c -[preprocess]> source.i -[compile]> source.s -[assemble]> source.o -[link]> a.out
When you invoke cc source.c, all those steps are done transparently for you in one go, and the standard libraries (commonly libc.so) and the startup code (commonly crt0.o) are linked in.
Any additional libraries have to be passed as extra linker flags, e.g. -lpthread.
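The stages can also be run one at a time with GCC's stage-stopping flags:

gcc -E source.c -o source.i    # preprocess only
gcc -S source.i -o source.s    # compile to assembly
gcc -c source.s -o source.o    # assemble to an object file
gcc source.o -o a.out          # link; libc and startup code are added implicitly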
I would say that depends on the IDE, the compiler, and the system. A header file just contains interface information: the names of functions, the parameters they expect, any attributes, and so on. That is what the compiler uses to convert your code into an intermediate object file.
After that comes linking, where the code for printf is added to the executable, either from a static library or through a dynamic library.
Functions and other facilities like the STL are part of C/C++, so they are delivered either by the compiler or by the system. For example, on Solaris there is no debug version of the C library unless you are using gcc, whereas with Visual Studio you have a debug version of msvcrt.dll, and you can also link the C library statically.
In short, the code for printf and the other C library functions is added at link time.

SWIG, OpenCOBOL and mixing targets

OpenCOBOL uses intermediate C source on the way to a compiled binary, giving it access to the entire libc universe. With the goal of centrally embedding more than one SWIG wrapper:
cobc -C nextbig.cob
swig -java nextbig.i
gcc nextbig.c nextbig_wrapper.c
gcc -shared ...
and gcc builds a very nice binary; the Java side and the C output of the COBOL compiler mix nicely.
I have only tested swig -tcl, -perl, -python and -java so far (each with a different nextbig_wrapper.c, of course).
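(For reference, the per-target wrapper files can be kept apart with swig's -o option; the file names here are hypothetical:)

swig -tcl    -o nextbig_tcl_wrap.c    nextbig.i
swig -perl   -o nextbig_perl_wrap.c   nextbig.i
swig -python -o nextbig_python_wrap.c nextbig.i
swig -java   -o nextbig_java_wrap.c   nextbig.i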
How much grief would be involved in blending (for instance) swig -java and swig -python across the same nextbig.c and nextbig.i? Is there a known idiom for manually managing two or more target _wrapper.c files? Or is this a known thing not to do?

Performance difference between C program executables created by gcc and g++ compilers

Let's say I have written a program in C and compiled it with both gcc (as C) and g++ (as C++). Which executable will run faster: the one created by gcc or the one created by g++? I think using the g++ compiler will make the executable slower, but I'm not sure about it.
Let me clarify my question again because of confusion about gcc:
Let's say I compile program a.c like this in the terminal:
gcc a.c
g++ a.c
Which a.out executable will run faster?
Firstly: the question (and some of the other answers) seem to be based on the faulty premise that C is a strict subset of C++, which is not in fact the case. Compiling C as C++ is not the same as compiling it as C: it can change the meaning of your program!
C will mostly compile as C++, and will mostly give the same results, but there are some things that are explicitly defined to give different behaviour.
Here's a simple example - if this is your a.c:
#include <stdio.h>

int main(void)
{
    /* cast to int: sizeof yields size_t, which %d cannot print portably */
    printf("%d\n", (int)sizeof('x'));
    return 0;
}
then compiling as C will give one result:
$ gcc a.c
$ ./a.out
4
and compiling as C++ will give a different result (unless you're using an unusual platform where int and char are the same size):
$ g++ a.c
$ ./a.out
1
because the C specification defines a character literal to have type int, and the C++ specification defines it to have type char.
Secondly: gcc and g++ are not "the same compiler". The same back end code is used, but the C and C++ front ends are different pieces of code (gcc/c-*.c and gcc/cp/*.c in the gcc source).
Even if you stick to the parts of the language that are defined to do the same thing, there is no guarantee that the C++ front end will parse the code in exactly the same way as the C front end (i.e. giving exactly the same input to the back end), and hence no guarantee that the generated code will be identical. So it is certainly possible that one might happen to generate faster code than the other in some cases - although I would imagine that you'd need complex code to have any chance of finding a difference, as most of the optimisation and code generation magic happens in the common back end of the compiler; and the difference could be either way round.
I think they will both produce the same machine code, and therefore the same speed on your computer.
If you want to find out, you could compile both to assembly and compare the output, but I'm betting they produce the same assembly, and therefore the same machine code.
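For example (g++ compiles a .c file as C++, so the same source can be pushed through both front ends):

gcc -S a.c -o a_as_c.s      # C front end
g++ -S a.c -o a_as_cpp.s    # C++ front end
diff a_as_c.s a_as_cpp.s    # any code-generation difference shows up here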
Profile it and try it out. I'm certain it will depend on the actual code, even if it would take a really weird case to get different machine code. If your code compiles fine as C without extern "C" {} wrappers, I'm not sure how "compiling it as though it were C++" could provide any speedup, unless the particular compiler optimisations in g++ just happen to suit your particular situation a bit better...
The machine code generated should be identical. The g++ version of a.out will probably link in a couple of extra support libraries, which will make the startup of a.out slower by a few system calls.
There is no real practical difference, though. The dynamic linker on Linux will not become noticeably slower until you reach 20-40 linked libraries and thousands of symbols to resolve.
The gcc and g++ executables are just front ends; they are not the actual compilers. Both run the actual C or C++ compiler (and ld, ar, whatever is needed to produce the output you asked for) based on the file extension. So you'll get the exact same result. g++ is commonly used for C++ because it links with the standard C++ library (iostreams etc.).
If you want to compile C code as C++, either change the file extension, or pass -x c++ before the file (the -x option only affects files that follow it):
gcc -x c++ test.c -o test
http://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/G_002b_002b-and-GCC.html
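To see the libstdc++ difference concretely (hello.cpp is a hypothetical iostream hello-world):

g++ hello.cpp -o hello             # links libstdc++ automatically
gcc hello.cpp -o hello             # same C++ front end, but fails at link time
gcc hello.cpp -o hello -lstdc++    # works once the C++ runtime is named explicitly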
GCC is a compiler collection. It is mainly used to compile C, C++, Ada, Java, and many other programming languages.
G++ is part of the GNU Compiler Collection (gcc); gcc includes g++ as well. When we use gcc to compile C++, it uses g++. The output files will be different because g++ uses its own runtime library.
Edit: Okay, to clarify things, because we have a bit of confusion in naming here. GCC is the GNU Compiler Collection. It can compile Ada, C++, C, and a billion and a half other languages. It is a "back end" to the various languages' "front end" compilers like GNAT. Go read the link I made at the top of the page from GCC.GNU.Org.
GCC can also refer to the GNU C Compiler. This will compile C++ code if given the -lstdc++ flag, but normally it will choke and die because it isn't pulling in the C++ libraries.
G++, the GNU C++ Compiler, is, like the GNU C Compiler, a front end to the GNU Compiler Collection. The difference from the C compiler is that it automatically includes those libraries and makes a few other small tweaks, because it assumes it is going to be fed C++ code to compile.
This is where the confusion comes from. Does this clarify things a bit?

LLVM - linking problem

I am writing an LLVM code generator for the language Timber; the current compiler emits C code. My problem is that I need to call C functions from the generated LLVM files. For example, the compiler has a real-time garbage collector, and I need to call functions to notify it when new objects are allocated on the heap. I have no idea how to link these functions with my generated LLVM files.
Code generation is done by generating .ll files and then compiling these manually.
I'm trying to call an external function from LLVM, but I have no luck. In the examples I've found, only C standard functions like "puts" and "printf" are called, but I want to call a homemade function. I'm stuck.
I'm assuming you're writing an LLVM transformation, and you want to add calls to external functions into transformed code. If this is not the case, edit your question and include more information.
Before you can call an external function from LLVM code, you need to insert a declaration for it. For example:
virtual bool runOnModule(Module &m) {
    Constant *log_func = m.getOrInsertFunction("log_func",
                                               Type::VoidTy,
                                               PointerType::getUnqual(Type::Int8Ty),
                                               Type::Int32Ty,
                                               Type::Int32Ty,
                                               NULL);
    ...
}
The code above declares a function log_func which returns void and takes three arguments: a byte pointer (string), and two 32-bit integers. getOrInsertFunction is a method of Module.
To actually call the function, you have to insert a CallInst. There are several static Create methods for this.
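A hedged sketch using the same era's API as the declaration above (the operand names are hypothetical; newer LLVM releases have different signatures):

// Build the argument list and insert the call before instruction `insertBefore`.
std::vector<Value*> args;
args.push_back(str_ptr);   // i8*  operand
args.push_back(val_a);     // i32  operand
args.push_back(val_b);     // i32  operand
CallInst::Create(log_func, args.begin(), args.end(), "", insertBefore);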
Compile your LLVM assembly files normally with llvm-as:
llvm-as *.ll
Compile the bitcode files to .s assembly language files:
llc *.bc
GCC them in with the runtime library:
gcc *.s runtime.c -o executable
Substitute in real makefiles, shared libraries, etc. if necessary. You get the idea.
I'm interpreting your question as being "how do I implement a runtime library in C or C++ for my language that gets compiled to LLVM?"
One approach is, as detailed by Jonathan Tang, to transform the output of your compiler from LLVM IR to bitcode to assembly, and have vanilla gcc link the assembly against the runtime source (or object files).
An alternative, possibly more flexible approach is to use llvm-gcc to compile the runtime itself into LLVM bitcode, and then use llvm-ld to link the bitcode from your compiler with the bitcode of your runtime. This bitcode can then be re-optimized with opt, converted back to IR with llvm-dis, interpreted directly with lli (this will, afaik, only work if LLVM was built against libffi), or compiled to assembly with llc (and then to a native binary with vanilla gcc).
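A sketch of that second approach with the era's tools (file names are hypothetical, and llvm-ld was removed in later LLVM releases):

llvm-gcc -c -emit-llvm runtime.c -o runtime.bc   # runtime to bitcode
llvm-as main.ll -o main.bc                       # compiler output to bitcode
llvm-ld main.bc runtime.bc -o linked             # bitcode-level link (emits linked.bc)
llc linked.bc -o linked.s                        # compile bitcode to assembly
gcc linked.s -o executable                       # native link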
