LLVM: Implement linking of the object code

I am following the Kaleidoscope tutorial. Emitting object code is very simple, but now I would like to implement a linking step so that my toy programming language can compile directly to a binary (with no need to invoke clang). How can I achieve this with LLVM?

As for "no clang necessary": LLVM has a linker called LLD that is part of the LLVM project. Depending on how you installed LLVM, it may already be included in your distribution.
Check the documentation for your installed version of LLD for usage details. You can then write your make or CMake recipes around it.
As for your core question, here is the general make flow I use for my own language:
Compile source -> output.ll (LLVM assembly)
Optimize assembly -> output.oll (using opt)
Generate target assembly -> output.s
Assemble to object (using as) -> output.o
Link (I am using clang but this could be swapped with lld)
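The five steps above can be sketched as a dry run that only builds the command lines and does not execute them. This is a minimal sketch under stated assumptions: mylangc is a hypothetical name for the toy language's front end, the llc step is an assumption (the answer does not say which tool produces output.s), and -fuse-ld=lld asks the clang driver to link with LLD. Invoking ld.lld directly would additionally require passing the C runtime startup files and library paths by hand.

```python
from pathlib import Path


def build_steps(source, use_lld=False):
    """Return one argv list per step of the flow; nothing is executed."""
    stem = Path(source).stem
    ll, oll, asm, obj = (f"{stem}{ext}" for ext in (".ll", ".oll", ".s", ".o"))

    # Final link: the clang driver, optionally told to use LLD as the linker.
    link = ["clang", obj, "-o", stem]
    if use_lld:
        link.insert(1, "-fuse-ld=lld")

    return [
        ["mylangc", source, "-o", ll],                        # hypothetical front end
        ["opt", "-passes=default<O2>", "-S", ll, "-o", oll],  # optimize the IR
        ["llc", "-filetype=asm", oll, "-o", asm],             # assumed: llc emits target assembly
        ["as", asm, "-o", obj],                               # assemble to an object file
        link,
    ]


if __name__ == "__main__":
    for argv in build_steps("output.src", use_lld=True):
        print(" ".join(argv))
```

Each list can be handed to subprocess.run(argv, check=True) once the tools are on PATH.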

Static build of GMP for MSVC (Windows)

Is it possible to build GMP for MSVC on Windows?
I need a fully static solution (a static library) without any DLL dependencies, so that my final EXE doesn't depend on any external (non-system) DLLs.
I'm fine if building GMP requires Cygwin or MSYS, as long as the result can be used later in MSVC without problems. But as far as I know, Cygwin builds at least always depend on extra DLLs such as cygwin1.dll, which is not acceptable for me; a fully static library solution is needed.
I'm aware of the MPIR library, which is more Windows-friendly. But right now I specifically need a GMP solution if possible.
Of course, it would be great if all optimizations and assembly code were used when building for Windows. But if assembly is not possible, then at least the non-assembly (generic) variant of GMP is needed.
I need the 64-bit version.
Could someone post all the steps needed to produce such a static library for use with MSVC, or link to a website that has such instructions?

I managed to compile a working, fully statically linked program with GMP using MSVC on Windows.
For that I used an installation of MSYS, which is located in c:/bin/msys/ on my machine.
Inside the MSYS shell I installed the GMP packages mingw-w64-clang-x86_64-gmp and gmp-devel (pacman -S gmp-devel to install and pacman -Ss gmp to search).
In the MSVC compiler settings I added the include directory c:/bin/msys/clang64/include/.
I wrote an example program in C++ that uses GMP to implement the Trial Division, Pollard's Rho and Pollard's P-1 factoring algorithms with arbitrary-precision arithmetic. The program uses both the mpz_...() C routines and the mpz_class C++ wrapper class. Say this program is located in main.cpp.
To the linker command line I added the following libraries:
c:/bin/msys/clang64/lib/libgmp.a
c:/bin/msys/clang64/lib/libgmpxx.a
c:/bin/msys/mingw64/lib/gcc/x86_64-w64-mingw32/10.3.0/libgcc.a
c:/bin/msys/clang64/x86_64-w64-mingw32/lib/libmingwex.a
I also had to add the /FORCE flag to the linker command, because libmingwex.a has some symbols that overlap with MSVC's default libraries. Without /FORCE I got the following errors:
libucrt.lib(strnlen.obj) : error LNK2005: wcsnlen already defined in libmingwex.a(lib64_libmingwex_a-wcsnlen.o)
libucrt.lib(strnlen.obj) : error LNK2005: strnlen already defined in libmingwex.a(lib64_libmingwex_a-strnlen.o)
bin\win-msvc-m-64-release\drafts\gmp_int_msvc.exe : fatal error LNK1169: one or more multiply defined symbols found
All these steps produced a working (tested), statically linked program without any external DLL dependencies (apart from the default Windows system DLLs).
In other words, these MSYS .a libraries are compatible with MSVC and link successfully in an MSVC build.
To avoid the /FORCE linker flag, I took some extra steps. I made a copy of the c:/bin/msys/clang64/x86_64-w64-mingw32/lib/libmingwex.a library and used the c:/bin/msys/clang64/bin/objcopy.exe utility, which was probably installed together with Clang, to rename the overlapping symbols:
objcopy --redefine-sym wcsnlen=msys_wcsnlen libmingwex.a
objcopy --redefine-sym strnlen=msys_strnlen libmingwex.a
This allowed me to link the modified libmingwex.a in MSVC without using /FORCE.
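The two renames can also be combined into one objcopy invocation, since --redefine-sym may be repeated and objcopy accepts a separate output file. The sketch below only builds that command line. The function name and the output file name libmingwex_msvc.a are hypothetical, chosen for illustration.

```python
def redefine_symbols_command(archive_in, archive_out, renames):
    """Build one objcopy command that applies every old=new rename."""
    argv = ["objcopy"]
    for old, new in renames.items():
        argv += ["--redefine-sym", f"{old}={new}"]
    # Writing to a separate output file leaves the original archive untouched.
    return argv + [archive_in, archive_out]


if __name__ == "__main__":
    cmd = redefine_symbols_command(
        "libmingwex.a",
        "libmingwex_msvc.a",  # hypothetical output name
        {"wcsnlen": "msys_wcsnlen", "strnlen": "msys_strnlen"},
    )
    print(" ".join(cmd))
```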

How to Self-Host Clang?

Can anyone tell me how to compile the Clang compiler into LLVM bytecode (that is, self-host Clang)? I want to do this so that I can take the resulting LLVM bytecode and then use Emscripten to produce a C-to-JavaScript compiler.

You can get clang to emit LLVM bitcode (the question's "bytecode") by using the -emit-llvm command-line flag along with the -c flag. (If you use the -S flag instead of -c, you get the textual representation, LLVM IR assembly.) You don't need to compile clang itself into LLVM bitcode for that to work.
If you want to run clang itself inside a browser, then you will need to compile all of clang into LLVM bitcode and then link the resulting bitcode files together using llvm-link. Then you'll need to figure out how to give the compiled compiler access to the system header files it needs. I don't know whether there is a build option for all that; I haven't ever seen anything for it in the ./configure options, so I suspect not, but it's possible that it exists.
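As a rough illustration of the flags mentioned above, the following dry-run sketch builds the clang and llvm-link command lines for a set of source files. The function names, file names and the default output name combined.bc are hypothetical.

```python
from pathlib import Path


def bitcode_commands(sources, output="combined.bc"):
    """Return (per-file compile commands, one llvm-link command)."""
    compile_cmds = []
    bitcode_files = []
    for src in sources:
        bc = str(Path(src).with_suffix(".bc"))
        # -emit-llvm together with -c writes binary LLVM bitcode.
        compile_cmds.append(["clang", "-emit-llvm", "-c", src, "-o", bc])
        bitcode_files.append(bc)
    # llvm-link merges several bitcode modules into a single module.
    link_cmd = ["llvm-link", *bitcode_files, "-o", output]
    return compile_cmds, link_cmd


def textual_ir_command(src):
    """-emit-llvm together with -S writes textual LLVM IR instead."""
    return ["clang", "-emit-llvm", "-S", src, "-o", str(Path(src).with_suffix(".ll"))]


if __name__ == "__main__":
    compiles, link = bitcode_commands(["a.c", "b.c"])
    for argv in compiles + [link, textual_ir_command("a.c")]:
        print(" ".join(argv))
```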

building binutils before gcc compiler

I am trying to build a GCC cross compiler. I understand that before compiling the cross compiler I need to have the target binutils built already. Why does building the compiler need the target binutils? The compiler alone only takes high-level code and turns it into the assembly that is defined in the compiler sources, so why do I need the target binutils to compile the cross compiler? All of the cross-compiler documentation says that they need to be built before compiling the cross compiler (e.g. http://wiki.osdev.org/Building_GCC and http://www.ifp.illinois.edu/~nakazato/tips/xgcc.html).

GCC needs an assembler to transform the assembly it generates into object files (machine code), and a linker to link object files together to produce executables and shared libraries. It also needs an archiver to produce static libraries/archives.
Those three are usually provided by the binutils package (among other useful tools): the GNU assembler as, linker ld and the ar archiver.

Your key question seems to be:
Why does building the compiler need the target binutils?
As described in Building a cross compiler, part of the build process for a GNU cross-compiler is to build runtime libraries for the target using the newly-compiled cross-compiler. So the binutils for the target need to be present for that step to succeed.
It may be possible to build the cross-compiler first, using empty files for the subset of binutils components that gcc needs - such as as and ld and ar and ranlib - then build and install the target binutils components into the proper locations, then build the target runtime libraries.
But it would be less error-prone to do things the following way (and the documentation recommends this): build binutils for the target first, place the specified executables in gcc's source tree, then build the cross-compiler.
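The recommended order can be written down as a build plan. The sketch below is modeled on the steps of the OSDev wiki page the question links to. The source directory names (../binutils, ../gcc), the function name and the example prefix are placeholders, and nothing is executed. The all-target-libgcc step is the point where the freshly built cross-compiler needs the target as and ar, which is why binutils comes first.

```python
def cross_build_plan(target, prefix):
    """Return ordered (component, argv) pairs: binutils first, then GCC."""
    common = [f"--target={target}", f"--prefix={prefix}", "--disable-nls"]
    return [
        ("binutils", ["../binutils/configure", *common, "--with-sysroot", "--disable-werror"]),
        ("binutils", ["make"]),
        ("binutils", ["make", "install"]),
        ("gcc", ["../gcc/configure", *common, "--enable-languages=c", "--without-headers"]),
        ("gcc", ["make", "all-gcc"]),            # the compiler proper
        ("gcc", ["make", "all-target-libgcc"]),  # runtime library: needs target as/ar
        ("gcc", ["make", "install-gcc"]),
        ("gcc", ["make", "install-target-libgcc"]),
    ]


if __name__ == "__main__":
    for component, argv in cross_build_plan("arm-none-eabi", "/opt/cross"):
        print(f"[{component}] {' '.join(argv)}")
```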

The binutils (binary utilities) provide low-level handling of binary files, such as assembling, linking, and parsing ELF files. GCC depends on these tools to create an executable, because it generates assembly that binutils assembles into object files and links into an executable image.
ELF is the format that Linux uses for binary executable files. GCC relies on binutils to provide much of the platform-specific functionality.
Here you are cross-compiling for some architecture other than x86, so the binutils you need are specific to that target. When configuring, the host and target must differ, e.g. --host=i686-pc-linux-gnu with --target=arm-none-linux-gnueabi.
The resulting tools are therefore not the same as the binutils already installed on the host.
In addition, there are three basic terms to know:
The build machine, where the toolchain is built.
The host machine, where the toolchain will be executed.
The target machine, where the binaries created by the toolchain are executed.
So the target binutils are the tools that generate and manipulate binaries for the given target CPU architecture, not for the one the host is using.
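The build/host/target distinction can be illustrated with a small classifier. This is a sketch: the function name is hypothetical, and the labels follow the terminology used in the GCC installation notes as far as I recall it ("crossback" in particular is a rarely used term).

```python
def classify_toolchain(build, host, target):
    """Name the kind of toolchain implied by the three configure triples."""
    if build == host == target:
        return "native"
    if build == host:
        return "cross"           # runs here, produces code for another machine
    if host == target:
        return "crossed native"  # built here, runs natively somewhere else
    if build == target:
        return "crossback"       # rare: runs elsewhere, targets the build machine
    return "canadian"            # all three machines differ


if __name__ == "__main__":
    # The triples from the answer above: an x86 Linux host targeting ARM.
    print(classify_toolchain("i686-pc-linux-gnu", "i686-pc-linux-gnu", "arm-none-linux-gnueabi"))
```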

Are static c libraries created with one compiler compatible with another?

In my case I have a library built with code sourcery gcc targeting arm cortex-m4. I am trying to then link that library into a project being compiled with IAR compiler.
Is it possible to do this or does the library have to be rebuilt with the new tools? What factors affect this?

A static library is a bundle of several object files, which are always compiler-specific. So if you try to link a GCC-built library with the IAR tools, you will get an error at link time due to a mismatch between the object file formats.
You need to rebuild your library using IAR.

The IAR compiler for ARM supports the AEABI format, which allows you to compile files with one compiler and link with another.
If you have built your library using GCC with AEABI enabled, it should be possible to use the static library in a project built using the IAR tools.
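One quick sanity check is to read the EABI version that an ARM object file records in the e_flags field of its ELF header. The sketch below parses the first 40 bytes of a 32-bit ELF file. It assumes you have already extracted a member from the .a archive (for example with ar x), and the function name is hypothetical. A matching version is only a hint: details such as enum size, wchar_t size and the floating-point calling convention live in the build attributes and still have to agree.

```python
import struct

EM_ARM = 40                   # e_machine value for ARM
EF_ARM_EABIMASK = 0xFF000000  # top byte of e_flags holds the EABI version


def arm_eabi_version(header):
    """Return the EABI version stored in a 32-bit ARM ELF header."""
    if len(header) < 40 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    if header[4] != 1:
        raise ValueError("not a 32-bit ELF file")
    endian = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    if machine != EM_ARM:
        raise ValueError("not an ARM object file")
    (flags,) = struct.unpack_from(endian + "I", header, 36)
    return (flags & EF_ARM_EABIMASK) >> 24


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        print("EABI version:", arm_eabi_version(f.read(40)))
```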

how to use llvm+clang to compile for stm32

Does anyone have information on how to build an LLVM + clang toolchain using binutils and newlib, and how to use it?
host: Linux, AMD64
target: cortex-m3, stm32
c-lib: newlib
assembler: gnu as

I created a firmware framework, PolyMCU (https://github.com/labapart/polymcu), that is based on CMake and supports GCC and LLVM. Because it is based on CMake, you can build your firmware on Linux, Windows or macOS.
It also uses Newlib, so it looks like all your requirements are covered!
I also wrote a blog post comparing GCC and LLVM build sizes on ARM Cortex-M: http://labapart.com/blogs/3-the-importance-of-the-toolchain-version-in-embedded-space
The results are interesting: Clang-generated code is not much bigger than GCC's on Cortex-M...

Unfortunately, clang does not currently support flexible cross-compilation settings, so you will most probably need to invoke the necessary tools yourself with all the necessary arguments.
Start by building llvm + clang using the --target=thumbv7-eabi configure argument (note that you will need llvm + clang as of yesterday for this). You might want to specify --enable-targets=arm as well. This instructs clang to generate code for Thumb by default. After this you can invoke clang -mcpu=cortex-m3 to generate the code for you.
You will have to provide all necessary include and library paths by hand via -I, -L, etc.
If you're happy with some C++ hacking, you can write the necessary "HostInfo" so that clang invokes the right tools and provides the right paths automagically.
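For illustration, here is a dry-run sketch of a compile command for a Cortex-M3 with the include paths passed by hand. It goes beyond the answer above in one respect: it assumes a more recent clang in which the target is selected at run time with --target rather than fixed when clang is configured. The function name and all paths are hypothetical.

```python
def cortex_m3_compile(src, obj, include_dirs=(), sysroot=None):
    """Build a clang argv that compiles one file for a Cortex-M3 (Thumb)."""
    argv = ["clang", "--target=arm-none-eabi", "-mcpu=cortex-m3", "-mthumb"]
    if sysroot is not None:
        argv.append(f"--sysroot={sysroot}")  # e.g. where newlib is installed
    for directory in include_dirs:
        argv += ["-I", directory]            # include paths provided by hand
    return argv + ["-c", src, "-o", obj]


if __name__ == "__main__":
    cmd = cortex_m3_compile(
        "main.c",
        "main.o",
        include_dirs=["/opt/newlib/arm-none-eabi/include"],  # hypothetical path
    )
    print(" ".join(cmd))
```

Linking would then be done with the GNU linker from binutils or with LLD, again passing the library paths and a linker script explicitly.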
