Common linker to all languages - linker

Is a linker common to the object codes of various languages for a given ISA? or is it that various languages need separate linker for an underlying platform? I understand linker to be a system software and should be common to all?

First you need to understand that the linker links object code. This object code is machine (and usually operating system) specific. There are several different standard object code formats. A linker can not link object code from different machine architectures. And even if it could do that, it wouldn't execute. That being said, you can almost always link object code from different languages as long as the compilers run on the same machine and sometimes even the same operating system. For example, if you create a program in C and want to link with it a Pascal object file, this will usually work. The most popular object code format is called COFF object code. COFF code is almost universally the accepted standard format for object code. It doesn't matter what language compiler you use to generate the code (as long as its from the same machine architecture), most linkers will understand be able to link COFF files.

Related

C object file compatibility between computers

First I want to state for the record that this question is related to school/homework.
Let’s say computers CP1 and CP2 both share the same operating system and machine language. If a C program is compiled on CP1, in order to move it to CP2, is it necessary to transfer the source code and recompile on CP2, or simply transfer the object files.
My gut answer is that the object files should suffice. The C code is translated into assembly by the compiler and assembled into machine code by the assembler. Because the architecture shares the same machine code and operating system, I don't see a problem.
But the more I think about it, the more confused I’m starting to get.
My questions are:
a) Since its referring to object files and not executables, I’m assuming there has been no linking. Would there be any problems that surface when linking on CP2?
b) Would it matter if the code used C11 standard on CP1 but the only compiler on CP2 was C99? I'm assuming this is irrelevant once the code has been compiled/assembled.
c) The question doesn't specify shared/dynamic linked libraries. So this would only really work if the program had no dependencies on .dll/.so/ .dylib files, or else these would be required on CP2 as well.
I feel like there are so many gotchas, and considering how vague the question is I now feel that it would be safer to simply recompile.
Halp!
The answer is, it depends. When you compile a C program and move the object files to link on a different computer, it should work. But because of factors such as endianness or name mangling, your program might not work as intended, and even might crash when you try to run it.
C11 is not supported by a C99 compiler, but it does not matter if the source has been compiled and assembled.
As long as the source is compiled with the libraries on one machine, you don't need the libraries to link or run the file(s) on the other computer (static libraries only, dynamic libraries will have to be on the computer you run the application on). This said, you should make the program independent so you don't run into the same problems as before where the program doesn't work as intended or crashes.
You could get a compiler that supports EABI so you don't run into these problems. Compilers that support the EABI create object code that is compatible with code generated by other such compilers, thus allowing developers to link libraries generated with one compiler with object code generated with a different compiler.
I have tried to do this before, but not a whole lot, and not recently. Therefore, my information may not be 100% accurate.
a) I've already heard the term "object files" being used to refer to linked binaries - even though it's kinda inaccurate. So maybe they mean "binaries". I'd say linking on a different machine could be problematic if it has a different compiler -
unless object file formats are standardized, which I'm not sure about.
b) Using different standards or even compilers doesn't matter for binary code - if it's linked statically. If it relies on functions from a dynamic lib, there could be problems. Which answers c) as well: Yes, this will be a problem. The program won't start if it doesn't have all required dynamic libs in the correct version. Depends on linking mode (static vs. dynamic), again.
Q: Let’s say computers CP1 and CP2 both share the same operating system and machine language.
A: Then you can run the same .exe's on both computers
Q: If a C program is compiled on CP1, in order to move it to CP2, is it necessary to transfer the source code
A: No. You only need the source code if you want to recompile. You only need to recompile if it's a different, incompatible CPU and/or OS.
"Object files" are generally not needed at all for program execution:
http://en.wikipedia.org/wiki/Object_files
An object file is a file containing relocatable format machine code
that is usually not directly executable. Object files are produced by
an assembler, compiler, or other language translator, and used as
input to the linker.
An "executable program" might need one or more "shared libraries" (aka .dll's). In which case the same restrictions apply: the shared libraries, if not already resident, must be copied along with the .exe, and must also be compatible with the CPU and OS.
Finally, "scripts" do not need to be recompiled. You may copy the script freely from computer to computer. But each computer must have an "interpreter" to run the script: a Perl script needs a Perl interpreter, a Python script a python interpreter, and so on.

Should static libraries be always built with same compiler options as the application?

We have a reusable library which gets delivered across to multiple products. Most of the products are in VxWorks and use gcc compiler. But, each of them will be on different architectures like PPC, MIPS and in PPC itself there are more types like 8531, 8620 etc.
Currently, I am building static libs for each of these boards seperately and provide. Is there anyway that a common library can be built, which can be used across all these different architectures?
Also, currently I try to ensure that compiler options are same as that of the products. Is it necessary? Is there any information available in the internet which classifies which options are important to maintained same for static libraries and applications?
No there is no other way - you must built the libraries (static or not) for each platform.
As you probably already know static library is really just a container storing a buch of object files. Each object file contains binary code specific to platform that the library was built for (read: different set of assembly instructions).
Yes, keeping the compiler options the same when you are building a library and the binary (program) that uses it the same is a very good practice. This way you are avoiding potentially very nasty problems. Some optimization options are binary incompatible (e.g.: you may compile a function in a library with a optimization that will cause it to return (or expect) a data by register), but your main program may expect that the function returns it by address on stack - big trouble.
It depends of each option: platform and architecture options must be the same, obviously.
Another ones like optimization, debug, profiling can be different.
Imagine that a library may be provided by an external developer, so, you don't really not know how did he compiled it, only platform and architecture requirements.
Also, currently I try to ensure that compiler options are same as that of the products. Is it necessary?
'2. Necessary - no. In fact, most libraries can be considered to be standalone and not tied to any particular product (i.e. they are usable from many products). As such, per-product specific flags just don't belong into the library, or vice versa (library-implementation specific flags are not supposed to appear when compiling products' objects).

Crosscompiler Binary compatibility in C

I need to verify something for which I have doubts. If a shared library ( .dll) is written in C, with the C99 standard and compiled under a compiler. Say MinGw. Then in my experience it is binary compatible and hence useable from any other compiler. Say MS Visual Studio. I say in my experience because I have tried it successfully more than once. But I need to verify if this is a rule.
And in addition I would like to ask if it is indeed so, then why libraries written completely in C, like openCV for example don't provide compiled binaries for every different OS? I know that the obvious reason would be to set all the compile-time parameters, but other than that there is none right?
EDIT: I am adding an additional question which I see as a logical extension to the original. Isn't this how one would go and create a closed source library? Since the option of giving source goes out of the window there, giving binaries is the only choice. And in that case providing binaries for as many architectures as possible is the desired result, with C being an obvious choice for having the best portability between systems and compilers. Right?
In the specific case of C compilers (MSVC and GCC/MinGW) in the Windows world, you are correct in the assumption of binary compatibility. One can link a C interface DLL compiled by GCC to a program in Visual Studio. This is the way C99 projects like ffmpeg allow developers to write application wiht Visual Studio. One only needs to create the import library with lib.exe found in the Microsoft toolchain from the DLL. Or vice versa, using mingw.org's pexports or better, mingw-w64's gendef tool, one can create a GCC import lib for a MSVC produced DLL.
This handy interoperability breaks down when you enter the C++ interface world, where the ABI of MSVC and GCC is different and incompatible. It may work, it may not, no guarantees are made and no effort is (currently) being done in changing that. Also, debugging info is obviously different, until someone writes a debug information generator/writer in GCC that is compatible to MSVC's debugger (along with gdb support of course).
I don't think C99 specifically changes anything to function declarations or the way arguments are handled in symbol definitions, so there should be no problem here either.
Note that as Vijay said, there is still the architecture difference, so a x86 library can't be used when linking to an AMD64 library.
To also answer your additional question about closed source binaries and distributing a version for all available compilers/architectures.
This is exactly the way you would create a closed source binary. In addition to the import library, it is also very important to hide exports from the DLL, making the DLL itself useless for linking (if you don't want client code to use private functions in the library, see for example the output of dumpbin /exports on a MSOffice DLL, lots of hidden stuff there). You can achieve the same thing with GCC (I believe, never used or tried it) using things like __attribute(hidden) etc...
Some compiler specific points:
MSVC comes with four (well, actually only three remaining in newer versions) different runtime libraries through /MT, /MD, and /LD. On top of this, you would have to provide a build for each version of Visual Studio (including Service Packs) to assure compatibility. But that is closed source binary and Windows for you...
GCC does not have this problem; MinGW always links to msvcrt.dll provided by Windows (since Windows 98), equivalent with /MD (and maybe also a debug library equivalent with /MDd). But I there are two versions of MinGW (mingw.org and mingw-w64) which do not guarantee binary compatibility. THe latter is more complete as it provides 64-bit options as well as 32-bit, and provides a more complete header/library set (including a substantial part of DirectX and DDK).
The general rule is that IF your OS/CPU combination has a standard ABI, and IF that ABI is powerful enough for your language, most compilers will follow that ABI and as a result will be binary compatible, allowing you to link libraries (shared or static) compiled with different compilers to programs compiled with other compilers just fine.
The problem is that most ABIs are fairly weak -- they're designed around low-level languages like C and FORTRAN and date back to the days before object oriented languages like C++. So they tend to lack support for things like function overloading, user-defined operators, exceptions, global contructors and destructors, virtual functions, inheritance, and such that are needed by C++.
This lack was recognized when C++ was designed which is why C++ has extern "C" -- which causes the compiler to limit itself to the standard ABI for certain functions, while disabling all the extra C++ features that the ABIs generally don't support.
A shared library or dll compiled to a particular architecture can be linked to applications compiled by other compilers that target the same architecture. (By architecture, I mean a processor/OS combination). But it is not practical for a library developer to compile against all possible architectures. Moreover, when a library is distributed in source form, users can build binaries optimized to their specific requirements.

What is the C runtime library?

What actually is a C runtime library and what is it used for? I was searching, Googling like a devil, but I couldn't find anything better than Microsoft's: "The Microsoft run-time library provides routines for programming for the Microsoft Windows operating system. These routines automate many common programming tasks that are not provided by the C and C++ languages."
OK, I get that, but for example, what is in libcmt.lib? What does it do? I thought that the C standard library was a part of C compiler. So is libcmt.lib Windows' implementation of C standard library functions to work under win32?
Yes, libcmt is (one of several) implementations of the C standard library provided with Microsoft's compiler. They provide both "debug" and "release" versions of three basic types of libraries: single-threaded (always statically linked), multi-threaded statically linked, and multi-threaded dynamically linked (though, depending on the compiler version you're using, some of those may not be present).
So, in the name "libcmt", "libc" is the (more or less) traditional name for the C library. The "mt" means "multi-threaded". A "debug" version would have a "d" added to the end, giving "libcmtd".
As far as what functions it includes, the C standard (part 7, if you happen to care) defines a set of functions a conforming (hosted) implementation must supply. Most vendors (including Microsoft) add various other functions themselves (for compatibility, to provide capabilities the standard functions don't address, etc.) In most cases, it will also contain quite a few "internal" functions that are used by the compiler but not normally by the end user.
The runtime library is basically a collection of the implementations of those functions in one big file (or a few big files--e.g., on UNIX the floating point functions are traditionally stored separately from the rest). That big file is typically something on the same general order as a zip file, but without any compression, so it's basically just some little files collected together and stored together into one bigger file. The archive will usually contain at least some indexing to make it relatively fast/easy to find and extract the data from the internal files. At least at times, Microsoft has used a library format with an "extended" index the linker can use to find which functions are implemented in which of the sub-files, so it can find and link in the parts it needs faster (but that's purely an optimization, not a requirement).
If you want to get a complete list of the functions in "libcmt" (to use your example) you could open one of the Visual Studio command prompts (under "Visual Studio Tools", normally), switch to the directory where your libraries were installed, and type something like: lib -list libcmt.lib and it'll generate a (long) list of the names of all the object files in that library. Those don't always correspond directly to the names of the functions, but will generally give an idea. If you want to look at a particular object file, you can use lib -extract to extract one of those object files, then use dumpbin /symbols <object file name> to find what function(s) is/are in that particular object file.
At first, we should understand what a Runtime Library is; and think what it could mean by "Microsoft C Runtime Library".
see: http://en.wikipedia.org/wiki/Runtime_library
I have posted most of the article here because it might get updated.
When the source code of a computer program is translated into the respective target language by a compiler, it would cause an extreme enlargement of program code if each command in the program and every call to a built-in function would cause the in-place generation of the complete respective program code in the target language every time. Instead the compiler often uses compiler-specific auxiliary functions in the runtime library that are mostly not accessible to application programmers. Depending on the compiler manufacturer, the runtime library will sometimes also contain the standard library of the respective compiler or be contained in it.
Also some functions that can be performed only (or are more efficient or accurate) at runtime are implemented in the runtime library, e.g. some logic errors, array bounds checking, dynamic type checking, exception handling and possibly debugging functionality. For this reason, some programming bugs are not discovered until the program is tested in a "live" environment with real data, despite sophisticated compile-time checking and pre-release testing. In this case, the end user may encounter a runtime error message.
Usually the runtime library realizes many functions by accessing the operating system. Many programming languages have built-in functions that do not necessarily have to be realized in the compiler, but can be implemented in the runtime library. So the border between runtime library and standard library is up to the compiler manufacturer. Therefore a runtime library is always compiler-specific and platform-specific.
The concept of a runtime library should not be confused with an ordinary program library like that created by an application programmer or delivered by a third party or a dynamic library, meaning a program library linked at run time. For example, the programming language C requires only a minimal runtime library (commonly called crt0) but defines a large standard library (called C standard library) that each implementation has to deliver.
I just asked this myself and was hurting my brain for some hours. Still did not find anything that really makes a point. Everybody that does write something to a topic is not able to actually "teach". If you want to teach someone, take the most basic language a person understands, so he does not need to care about other topics when handling a topic. So I came to a conclusion for myself that seems to fit well in all this chaos.
In the programming language C, every program starts with the main() function.
Other languages might define other functions where the program starts. But a processor does not know the main(). A processor knows only predefined commands, represented by combinations of 0 and 1.
In microprocessor programming, not having an underlying operating system (Microsoft Windows, Linux, MacOS,..), you need to tell the processor explicitly where to start by setting the ProgramCounter (PC) that iterates and jumps (loops, function calls) within the commands known to the processor. You need to know how big the RAM is, you need to set the position of the program stack (local variables), as well as the position of the heap (dynamic variables) and the location of global variables (I guess it was called SSA?) within the RAM.
A single processor can only execute one program at a time.
That's where the operating system comes in. The operating system itself is a program that runs on the processor. A program that allows the execution of custom code. Runs multiple programs at a time by switching between the execution codes of the programs (which are loaded into the RAM). But the operating system IS A PROGRAM, each program is written differently. Simply putting the code of your custom program into RAM will not run it, the operating system does not know it. You need to call functions on the operating system that registers your program, tell the operating system how much memory the program needs, where the entry point into the program is located (the main() function in case of C). And this is what I guess is located within the Runtime Library, and explains why you need a special library for each operating system, cause these are just programs themselves and have different functions to do these things.
This also explains why it is NOT dynamically linked at runtime as .dll files are, even if it is called a RUNTIME Library. The Runtime Library needs to be linked statically, because it is needed at startup of your program. The Runtime Library injects/connects your custom program into/to another program (the operating system) at RUNTIME. This really causes some brain f...
Conclusion:
RUNTIME Library is a fail in naming. There might not have been a .dll (linking at runtime) in the early times and the issue of understanding the difference simply did not exist. But even if this is true, the name is badly chosen.
Better names for the Runtime Library could be: StartupLibrary/OSEntryLibrary/SystemConnectLibrary/OSConnectLibrary
Hope I got it right, up for correction/expansion.
cheers.
C is a language and in its definition, there do not need to be any functions available to you. No IO, no math routines and so on. By convention, there are a set of routines available to you that you can link into your executable, but you don't need to use them. This is, however, such a common thing to do that most linkers don't ask you to link to the C runtime libraries anymore.
There are times when you don't want them - for example, in working with embedded systems, it might be impractical to have malloc, for example. I used to work on embedding PostScript into printers and we had our own set of runtime libraries that were much happier on embedded systems, so we didn't bother with the "standard".
The runtime library is that library that is automatically compiled in for any C program you run. The version of the library you would use depends on your compiler, platform, debugging options, and multithreading options.
A good description of the different choices for runtime libraries:
http://www.davidlenihan.com/2008/01/choosing_the_correct_cc_runtim.html
It includes those functions you don't normally think of as needing a library to call:
malloc
enum, struct
abs, min
assert
Microsoft has a nice list of their runtime library functions:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/crt-alphabetical-function-reference?view=msvc-170
The exact list of functions would vary depending on compiler, so for iOS you would get other functions like dispatch_async() or NSLog().
If you use a tool like Dependency Walker on an executable compiled from C or C++ , you will see that one of the the DLLs it is dependent on is MSVCRT.DLL. This is the Microsoft C Runtime Library. If you further examine MSVCRT.DLL with DW, you will see that this is where all the functions like printf(), puts(0, gets(), atoi() etc. live.
i think Microsoft's definition really mean:
The Microsoft implementation of
standard C run-time library provides...
There are three forms of the C Run-time library provided with the Win32 SDK:
* LIBC.LIB is a statically linked library for single-threaded programs.
* LIBCMT.LIB is a statically linked library that supports multithreaded programs.
* CRTDLL.LIB is an import library for CRTDLL.DLL that also supports multithreaded programs. CRTDLL.DLL itself is part of Windows NT.
Microsoft Visual C++ 32-bit edition contains these three forms as well, however, the CRT in a DLL is named MSVCRT.LIB. The DLL is redistributable. Its name depends on the version of VC++ (ie MSVCRT10.DLL or MSVCRT20.DLL). Note however, that MSVCRT10.DLL is not supported on Win32s, while CRTDLL.LIB is supported on Win32s. MSVCRT20.DLL comes in two versions: one for Windows NT and the other for Win32s.
see: http://support.microsoft.com/?scid=kb%3Ben-us%3B94248&x=12&y=9

Minimum amount of information necessary to do (dynamic) linking?

This is a question I've come across repeatedly, usually concerning plug ins, but recently I came across it trying to hammer out some build system issues. My concern is primarily for *nix based systems, but I suppose it applies to windows as well.
The question is, what is the minimum amount of information necessary to do dynamic linking? I know linux distributions like Debian have simply an 'i686', which is enough. However, I suppose there is some implicit information here, and I probably won't be able to do dynamic linking of any shared object as long as they're compiled using -march=i686, will I?
So what must be matched correctly for me to be able to load a shared object successfully? I know that for c++ even the compiler (and sometimes version) must match due to name mangling, but I was kind of hoping this wasn't the case for c.
Any thoughts appreciated.
Edit:
Neil's answer made me realize I'm not really talking about dynamic linking, or rather, the question is two-fold,
what's needed for static linking, and
what's needed for dynamic linking
I have higher hopes for the first I guess.
Well at minimum, the code must have been compiled for the same processor family, and you need to know the names of the library and the function. On top of that, you need the same ABI. You should be aware that despite what people think, the C Standard does not specify an ABI and it is entirely possible for two C compilers (or versions of the same compiler) to adhere to the standard, run on the same platform, but have different ABIs.
As for exactly specifying architecture details - I must admit I've never done it. Are you planning on distributing binary libraries on different Linux variants?

Resources