Difference between llvm-ld and llvm-link - c

What is the difference between llvm-ld and llvm-link? I suppose llvm-ld performs link time optimization while llvm-link doesn't. Am I right?

llvm-ld is a drop-in replacement for the system linker that supports both LLVM bitcode and native code. It produces bitcode executables by default (i.e. the resulting executable invokes the bitcode interpreter), but can also be used to produce native executables.
I don't use llvm-ld directly as it's more convenient to use the llvmc and clang frontends, which invoke the appropriate programs of the LLVM toolchain as necessary (note: llvmc was marked experimental and appears to have been removed in the 3.0 release).
llvm-link is a more low-level tool which joins several bitcode files into a single one. The documentation doesn't mention if it does optimizations, but it doesn't appear to do so. The next optimization passes will be triggered on native code generation.

Related

Why specify the target architecture to the linker?

I've been working on using the Meson build system for an embedded project. Since I'm working on an embedded platform, I've written a custom linker script and also an invocation for the linker. I didn't have any problems until I tried to link in newlib to my project, when I started to have link issues. Just before I got it working, the last error was undefined reference to main which I knew was clearly in the project.
Out of happenstance, I tried adding -mcpu=cortex-m4 to my linker invocation (I am using gcc to link, I am told this is quite typical instead of directly calling ld). It worked! Now, my only question is "why"?
Perhaps I am missing something about how the linking process actually works, but considering I am just producing an ELF file, I didn't think it would be important to specify the CPU architecture to the linker. Is this a newlib thing, or has gcc just been doing magic behind the scenes for me that I haven't seen before?
For reference, here's my project (it's not complete)
In general, you should always link via the compiler driver (link forms of the gcc command), not via direct invocation of ld. If you're developing for bare metal on a particular exact target, it's possible to determine the set of linker arguments you need and use ld directly, but there's a lot that the compiler driver takes care of for you, and it's usually better to let it. (If you don't have a single fixed target, there are unlimited combinations of possibilities and no way you can reproduce all present and future ones someone may care about.)
You can still pass whatever options you like to the linker, e.g. custom linker scripts, via -Wl,... option forms.
As for why the specific target architecture ISA level could matter to linking, linking is not a dumb process of just sticking together binary chunks. Linking can involve patching up (relocations) or even generating (thunks for distant jump targets, etc.) code, in which case the linker may need to care what particular ISA level/variant it's targeting.
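To make the "patching up" step concrete, here is a toy sketch of what applying one relocation looks like. It models a 32-bit PC-relative fix-up of the kind used for an x86 call instruction; the function names and the flat byte-array "section" are made up for illustration and are not how any real linker is structured. Other ISAs encode the same displacement differently (Thumb branch instructions, for example, scatter it across several bit fields), which is one reason the linker has to know what it is targeting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian signed 32-bit value starting at p. */
int32_t read_le32(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)v;
}

/*
 * Toy 32-bit PC-relative relocation: store S + A - P at the patch site,
 * where
 *   S = final address of the target symbol,
 *   A = addend recorded by the compiler in the object file,
 *   P = final address of the bytes being patched.
 * All parameter names here are illustrative.
 */
void apply_pc32(uint8_t *section, size_t section_size, size_t offset,
                uint64_t section_addr, uint64_t symbol_addr, int64_t addend)
{
    assert(offset + 4 <= section_size);

    uint64_t place = section_addr + offset;
    uint32_t value = (uint32_t)(symbol_addr + (uint64_t)addend - place);

    /* Write the displacement in little-endian byte order. */
    for (int i = 0; i < 4; i++)
        section[offset + i] = (uint8_t)(value >> (8 * i));
}
```

For instance, with a 5-byte call instruction (opcode 0xE8 followed by four zero bytes) placed at address 0x1000, a target symbol at 0x2000 and an addend of -4, calling apply_pc32 with offset 1 writes the displacement 0xFFB, which is the distance from the end of the instruction (0x1005) to the target.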
Such linker options ensure that the appropriate standard library and start-up code are linked when these are defaulted rather than explicitly specified or overridden.
A single ARM toolchain supports a variety of ARM architecture variants and options; they may be big- or little-endian, have various instruction sets (ARM, Thumb, Thumb-2, ARM64, etc.), and have various extensions such as SIMD or DSP units. The linker requires the architecture information to select the correct library to link, for both performance and binary compatibility.

Is clang a standalone C compiler or does it need gcc?

I want to use clang on Windows to compile C code.
I'd like to know if it is in fact a standalone compiler that can do that, or are its aims somewhat different?
I've used it before, but it now appears that it was piggy-backing on top of whatever gcc compilers were lying around (mingw, for example).
If I try a fresh binary installation of clang 64-bits (and I hide my mingw/gcc directories), then it can't find stdio.h for Hello World. This is running from directly inside the bin directory (C:\clang\bin). If I unhide mingw, it will compile, but then I get errors like this (one mingw compiler is in c:\win):
c:\win\bin\ld.exe cannot find -lgcc_s
Considering clang is a 438MB installation, you'd think it would have its own include and library files! I want to use clang in place of gcc.
So, what am I doing wrong? (I've seen a few questions also about the inability to find stdio.h, but they weren't helpful. Surely clang must be able to compile Hello World by itself?!)
You are confusing compiler with linker with standard library.
Clang is a full-featured, independent compiler, but it does not provide the standard library (the library that stdio.h belongs to). Traditionally, on Unix systems, the operating system provides the standard library. Windows does not, so clang goes looking for one and, for whatever reason, finds the one from your MinGW installation. There are many free implementations of the C standard library that are compatible with Clang.
Lastly, ld.exe is the linker, and it also, traditionally, must be provided by the system. There is one linker, lld, that I believe is being developed alongside Clang, but for whatever reason, the packager of the version you downloaded just chose to configure clang to simply call ld.
Clang is a completely separate compiler (written entirely from scratch, using LLVM). You don't need GCC to use Clang, as can be shown in the case of FreeBSD (they completely replaced GCC with Clang/LLVM and don't install GCC in the base anymore for licensing reasons). There are a variety of different C compilers other than GCC, it's just that GCC is the most common.
However, no compiler provides the standard C libraries (GCC provides some weird libraries like the one you're trying to use). C libraries are provided separately, and you need to install C libraries in order to compile any significant C program. The error message saying cannot find -lgcc_s tells me that you're trying to link against some library provided by GCC. In this case, you probably want to install that library by installing GCC (but note that you don't need GCC to use Clang).
It does appear that your version of Clang has been compiled to use GNU's linker, ld, not LLVM's linker, lld. As such, you'll need GCC's linker (or you can recompile clang to use LLVM's linker, or just compile the object files and use lld separately).
I think you are missing a path variable. After install you must manually add a PATH to the Windows Environment.
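To check which compiler and which C runtime headers a build is actually picking up, a small probe based on predefined macros can help. The sketch below is only a heuristic: the function names are made up, and the macro checks (__clang__, __GNUC__, _MSC_VER, __MINGW32__, __GLIBC__) cover just the common cases discussed above. Note that clang also defines __GNUC__, and defines _MSC_VER when it targets the MSVC environment, so the order of the checks matters.

```c
#include <assert.h>
#include <stdio.h>  /* a libc header; on glibc it indirectly defines __GLIBC__ */

/* Best-effort name of the compiler front end building this file. */
const char *compiler_name(void)
{
#if defined(__clang__)
    return "clang";      /* checked first: clang also defines __GNUC__ */
#elif defined(__GNUC__)
    return "gcc";
#elif defined(_MSC_VER)
    return "msvc";
#else
    return "unknown";
#endif
}

/* Best-effort name of the C runtime whose headers were found. */
const char *runtime_name(void)
{
#if defined(__MINGW32__)
    return "mingw";
#elif defined(_MSC_VER)
    return "msvc-crt";
#elif defined(__GLIBC__)
    return "glibc";
#else
    return "unknown";
#endif
}

/* Write "compiler=... runtime=..." into buf; returns snprintf's result. */
int describe_toolchain(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "compiler=%s runtime=%s",
                    compiler_name(), runtime_name());
}
```

If a clang build on Windows reports the mingw runtime, clang is using MinGW's headers and libraries, which matches the behaviour described in the question.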

Can I compile a function with gcc and then use it with clang?

I am trying to use SSE4.2 intrinsics with clang/llvm, but the code doesn't compile: I get a "cannot select intrinsic" error from LLVM. On the other hand, the same code compiles flawlessly with gcc. So I thought, maybe I can compile that function with gcc, so as to have an object or library file, and then call that library function from my code, which is compiled by clang/llvm. Would that work?
It's possible to compile an object file with GCC on Linux and convert it to work in Visual Studio. I did this recently, running Linux in VirtualBox on Windows (see converting-c-object-file-from-linux-o-to-windows-obj), so this should be possible with Clang on Linux or Windows as well.
So not only can this be done across compilers, it can be done across platforms.
You need to get the calling conventions and the object file format correct (and, for C++, the name mangling as well). With GCC, you can tell it at compile time which calling convention/ABI to use with -mabi. Then, if going from Linux to Windows, you need an object file converter to convert from e.g. ELF on Linux to COFF on Windows. Of course, there are cases where this probably won't work (e.g. if the module relies on a system call that exists on only one platform). See the link above for more details.
For any more-or-less complicated C++ code, e.g. code that compiles to a vtable, the answer is a resounding NO. The two are NOT compatible.
To illustrate the above point, try to compile the Crypto++ library with g++ (gains about 40% speedup for AES/GCM) and then link your clang++-compiled code with it.
It may or may not work. Some elements of the ABI can be expected to be the same. For example, I believe both g++ and clang use the Itanium ABI name mangling scheme. Other elements cannot. So it depends on how complex the code you're compiling is.
Also, I would suggest opening an LLVM bug for the intrinsic that could not be selected. Clang and LLVM have a very active community, and it's possible someone will pick the bug up quickly.
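As a rough sketch of the approach in the question, the safest boundary between a gcc-built object and clang-built code is a plain C function whose signature uses only fixed-width integers and pointers. The file names, function names and the trivial byte-summing body below are hypothetical stand-ins for the real SSE4.2 routine, and both halves are shown in one listing for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * kernel.c (hypothetical file name): the part you would build with gcc:
 *     gcc -O2 -c kernel.c -o kernel.o
 * The interface is plain C with fixed-width types, so both compilers
 * should agree on how arguments and the return value are passed.
 */
uint32_t sum_bytes(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint32_t total = 0;
    for (size_t i = 0; i < len; i++)
        total += data[i];
    return total;
}

/*
 * main.c (hypothetical file name): the caller you would build with clang
 * and link against the gcc-built object:
 *     clang -O2 main.c kernel.o -o app
 * In a real split it would only see the prototype of sum_bytes().
 */
uint32_t checksum_of_text(const char *text, size_t len)
{
    return sum_bytes((const uint8_t *)text, len);
}
```

Keeping structs, C++ classes and compiler-specific vector types out of the signature avoids most of the ABI differences the answers above warn about.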

Why in Linux compiler we have to give additional arguments while compiling and running C programs?

I implemented semaphores on Linux last year, but for that I had to use -lpthread.
Now, while using the log10() function in C, I searched the Internet and saw that I have to use -lm.
I want to know why these kinds of command-line arguments are necessary on Linux. Is this rule specific to the compiler?
(With the Turbo C compiler on Windows, I never used these kinds of arguments.)
You are instructing the compiler to look for certain libraries and link against them to produce the final executable.
When you were writing your threading code, you used threading primitives. These threading primitives are implemented in a library called pthread, and -lpthread tells the linker to use that library. Without this switch the toolchain cannot produce a valid executable, because the implementation of the threading code is missing.
On the file system the libraries can be found in /usr/lib and /lib (among others). When you look in these directories you will see files that start with the lib prefix, for example libpthreadxxxxxx. You will have to do your own research to figure out what the xxxx means.
The development cycle using Unix-style tools is very granular on the surface. When you use heavyweight IDEs (read: Visual Studio for C++), the IDE implicitly links against loads of standard libraries, so often you do not need to supply the names of the libraries you commonly use. However, when you start doing more advanced programming you will probably have to install and configure your IDE to use external code libraries. If you were to use threading primitives in Visual Studio, you most likely would not have to tell the compiler where to look for them: Microsoft considers this a common library, and every new project implicitly links against it.
A little discussion on GCC
GCC is a very diverse compiler, producing code for many different usage scenarios. As such, its developers try to be neutral and not make assumptions. For example, pthread is one particular implementation of threading primitives. Even though it is now the de facto standard, on Linux at least, it is not the only one: other Unix implementations have had different implementations. When such choices exist, it is not fair for the compiler developers to link against a library implicitly. They do, however, implicitly link against standard libraries; for example, g++ is just a wrapper command around the internal compiler code. It is a C++ front end, so it implicitly links against an implementation of the C++ standard library. Similarly, the C front end links against the standard C library.
People often do not want to use a certain standard library implementation and want to use another one instead. In such cases you have to explicitly tell the compiler to use the implementation that you provide. Such use cases are very granular and are surface-level issues with g++. In Visual Studio, you would generally have to tinker a lot to make such changes, since it is not an anticipated use case.
Wikipedia will provide you with more information.
Edit: I'll fix the spelling and Grammatical issues later :D
The option -l tells gcc which libraries must be used for linking. -lpthread stands for "use the pthread library", and -lm stands for "use the m library", which is the math library. These options are specific to gcc, not to Linux.
This is because, by default, gcc only links the C library (libc), which contains the well-known functions printf, scanf, and many more.
log10 lives in a different library called libm, and thus you need to explicitly tell gcc to link that library, with -lm. The same logic applies to -lpthread.
This is purely a backwards, harmful practice. Separating parts of the standard library into separate .so files does nothing but increase load time and memory usage. Good luck getting anyone to change it though... Just accept that you have to do it (and that POSIX specifically allows, but does not require, that an implementation require -lm for using the math functions and -lpthread for using threads, etc.) and move on to more important things.
Or, go pester Drepper about it on the glibc bug tracker/mailing list. He won't change his mind, but if you enjoy flamewars you can get some kicks...

Binary compatibility between avr-gcc 3.4.0 and avr-gcc 4.3.x

I have inherited an application that links to a library which MAY HAVE been built with gcc3. Or maybe with the imagecraft compiler. That information has now vanished to the heavenly bitfield and I am left with a libXXX.a library against which to link my app. I cannot recompile the libXXX.a because it requires certain unknown headers from imagecraft and somewhere else which at a certain point may have been ubiquitous in my environment but now are nowhere to be found.
My question is this, provided that my compiling my app with avr-gcc version 3.4.0 (and linking to that "special" libXXX) resulted in a working binary image, is it reasonable to expect that I could compile all the other parts of my app with avr-gcc 4 (this action having some very nice and proven benefits), link with libXXX and still get a working program?
Essentially, it all boils down to: is avr-gcc binary compatible with "mysterious compiler X which just may have been avr-gcc 3.something"?
To be honest, I have successfully compiled the rest of my app with avr-gcc4 and linked it with the library, and verified that the result works, but what kind of side effects or quirks should I be on the lookout for?
Linking libraries from different compilers (or different compiler versions) will work reliably if both compilers use the same ABI (Application Binary Interface).
The ABI of a specific platform is typically specified by the dominant compiler for that platform, although it may instead be defined by reference to an external specification.
ABI changes are rare, especially if the platform supports third-party libraries/applications, because an ABI change means that literally everything has to be rebuilt.
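One low-tech way to look for ABI differences between two compiler versions is to build the same small probe with each and compare what it reports. The sketch below is only an assumption about what is worth checking (basic type sizes and the layout of a struct that crosses the library boundary); the struct and function names are hypothetical, and it says nothing about calling conventions or register usage, which would need to be checked separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct passed between the application and the old library. */
struct sensor_record {
    uint8_t  flags;
    uint16_t count;
    uint32_t value;
};

/* Layout facts worth comparing between the two compiler versions. */
struct layout_report {
    size_t size;        /* sizeof(struct sensor_record) */
    size_t off_flags;   /* offset of each member        */
    size_t off_count;
    size_t off_value;
    size_t int_size;    /* sizeof(int)                  */
    size_t ptr_size;    /* sizeof(void *)               */
};

/* Fill in the report for whichever compiler built this file. */
void report_layout(struct layout_report *out)
{
    assert(out != NULL);

    out->size      = sizeof(struct sensor_record);
    out->off_flags = offsetof(struct sensor_record, flags);
    out->off_count = offsetof(struct sensor_record, count);
    out->off_value = offsetof(struct sensor_record, value);
    out->int_size  = sizeof(int);
    out->ptr_size  = sizeof(void *);
}
```

If the two builds report different numbers for any field, data shared with the prebuilt library will be misinterpreted, even if the program links without errors.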
