What is a "Port" in linker? - linker

I'm reading the documentations of LLD (LLVM linker) and it mentioned "ports".
Eg.,
The ELF port is the one that will be described in this document.
The PE/COFF port is complete, including Windows debug info (PDB)
support. The WebAssembly port is still a work in progress (See
WebAssembly lld port)
Can someone explain what exactly is a linker port?

In these context the part of the linker that actually writes the output files (in the respective formats like ELF for *nix and PE/COFF for Windows).
Since the formats (and associated debuginfo) can be quite complex, this can be quite a large, and because they are also somewhat platform dependent here it is referred to as a "port" (as in porting, carrying over software from one system to the other), because adding support for the various file formats is a/the major part of adding new target.

Related

GCC preprocessor directives for Arch Linux

Does GCC (or alternatively Clang) defines any macro when it is compiled for the Arch Linux OS?
I need to check that my software restricts itself from compiling under anything but Arch Linux (the reason behind this is off-topic). I couldn't find any relevant resources on the internet.
Does anyone know how to guarantee through GCC preprocessor directives that my binaries are only compilable under Arch Linux?
Of course I can always
#ifdef __linux__
...
#endif
But this is not precise enough.
Edit: This must be done through C source code and not by any building systems, so, for example, doing this through CMake is completely discarded.
Edit 2: Users faking this behaviour is not a problem since the software is distributed to selected clients and thus, actively trying to "misuse" our source code is "their decision".
Does GCC (or alternatively Clang) defines any macro when it is compiled for the Arch Linux OS?
No. Because there's nothing inherently specific to Arch Linux on the binary level. For what it's worth, when compiling the only things you/the compiler has to care about is the target architecture (i.e. what kind of CPU it's going to run with), data type sizes and alignments and function calling conventions.
Then later on, when it's time to link the compiled translation unit objects into the final binary executable, the runtime libraries around are also of concern. Without taking special precautions you're essentially locking yourself into the specific brand of runtime libraries (glibc vs. e.g. musl; libstdc++ vs. libc++) pulled by the linker.
One can easily sidestep the later problem by linking statically, but that limits the range of system and midlevel APIs available to the program. For example on Linux a purely naively statically linked program wouldn't be able to use graphics acceleration APIs like OpenGL-3.x or Vulkan, since those rely on loading components of the GPU drivers into the process. You can however still use X11 and indirect GLX OpenGL, since those work using wire protocols going over sockets, which are implemented using direct syscalls to the kernel.
And these kernel syscalls are exactly the same on the binary level for each and every Linux kernel of every distribution out there. Although inside of the kernel there's a lot of leeway when it comes to redefining interfaces, when it comes to the interfaces toward the userland (i.e. regular programs) there's this holy, dogmatic, ironclad rule that YOU NEVER BREAK USERLAND! Kernel developers breaking this rule, intentionally or not are chewed out publicly by Linus Torvalds in his in-/famous rants.
The bottom line to this is, that there is no such thing as a "Linux distribution specific identifier on the binary level". At the end of the day, a Linux distribution is just that: A distribution of stuff. That means someone or more decided on a set of files that make up a working Linux system, wrap it up somehow and slap a name on it. That's it. There's nothing inherently specific to "Arch" Linux other than it's called "Arch" and (for the time being) relies on the pacman package manager. That's it. Everything else about "Arch", or any other Linux distribution, is just a matter of happenstance.
If you really want to sort different Linux distributions into certain bins regarding binary compatibility, then you'd have to pigeonhole the combinations of
Minimum required set of supported syscalls. This translates into minimum required kernel version.
What libc variant is being used; and potentially which version, although it's perfectly possible to link against a minimally supported set of functions, that has been around for almost "forever".
What variant of the C++ standard library the distribution decided upon. This actually also inflicts programs that might appear to be purely C, because certain system level libraries (*cough* Mesa *cough*) will internally pull a lot of C++ infrastructure (even compilers), also triggering other "fun" problems¹
I need to check that my software restricts itself from running under anything but Arch Linux (the reason behind this is off-topic). I couldn't find any relevant resources on the internet.
You couldn't find resources on the Internet, because there's nothing specific on the binary level that makes "Arch" Arch. For what it's worth right now, this instant I could create a fork of Arch, change out its choice of default XDG skeleton – so that by default user directories are populated with subdirs called leech, flicks, beats, pics – and call it "l33tz" Linux. For all intents and purposes it's no longer Arch. It does behave significantly different from the default Arch behavior, which would also be of concern to you, if you'd relied on any specific thing, and be it most minute.
Your employer doesn't seem to understand what Linux is or what distinguished distributions from each other.
Hint: It's not the binary compatibility. As a matter of fact, as long as you stay within the boring old realm of boring old glibc + libstdc++ Linux distributions are shockingly compatible with each other. There might be slight differences in where they put libraries other than libc.so, libdl.so and ld-linux[-${arch}].so, but those two usually always can be found under /lib. And once ld-linux[-${arch}].so and libdl.so take over (that means pulling in all libraries loaded at runtime) all the specifics of where shared objects and libraries are to be found are abstracted away by the dynamic linker.
1: like becoming multithreaded only after global constructors were executed and libstdc++ deciding it wants to be singlethreaded, because libpthread wasn't linked into a program that didn't create a single thread on its own. That was a really weird bug I unearthed, but yshui finally understood https://gitlab.freedesktop.org/mesa/mesa/-/issues/3199
You can list the predefined preprocessor macros with
gcc -dM -E - /dev/null
clang -dM -E - /dev/null
None of those indicate what operating system the compiler is running under. So not only you can't tell whether the program is compiled under Arch Linux, you can't even tell whether the program is compiled under Linux. The macros __linux__ and friends indicate that the program is being compiler for Linux. They are defined when cross-compiling from another system to Linux, and not defined when cross-compiling from Linux to another system.
You can artificially make your program more difficult to compile by specifying absolute paths for system headers and relying on non-portable headers (e.g. /usr/include/bits/foo.h). That can make cross-compilation or compilation for anything other than Linux practically impossible without modifying the source code. However, most Linux distributions install headers in the same location, so you're unlikely to pinpoint a specific distribution.
You're very likely asking the wrong question. Instead of asking how to restrict compilation to Arch Linux, start from why you want to restrict compilation to Arch Linux. If the answer is “because the resulting program wouldn't be what I want under another distribution”, then start from there and make sure that the difference results in a compilation error rather than incorrect execution. If the answer to “why” is something else, then you're probably looking for a technical solution to a social problem, and that rarely ends well.
No, it doesn't. And even if it did, it wouldn't stop anyone from compiling the code on an Arch Linux distro and then running it on a different Linux.
If you need to prevent your software from "from running under anything but Arch Linux", you'll need to insert a run-time check. Although, to be honest, I have no idea what that check might consist of, since linux distros are not monolithic products. The actual check would probably have to do with your reasons for imposing the restriction.

How to check if a object code is 16/32 bit?

Is there any way by which we can identify that a .obj file and .exe file is 16/32 bit?
Basically I want to create a smart linker, that will automatically identify which linker do the given file names need to be passed to.
Preferred Language: C (it can be different, if needed)
I am looking for some solution that can read the bytes of an .exe/the code of an .obj file and then determine if it's 16/32 bit. Even an algorithm would too do.
Note: I know both object code and a executable are two different entities.
All of this information is encoded in the binary object according to the relevant Application Binary Interface (ABI).
The current Linux ABI is the Executable and Linkable Format (ELF), and you can query a specific binary file using a tool such as readelf or objdump.
The current Windows ABI is the Portable Executable (PE) format. I'm not familiar with the toolset here but a quick google search suggests there are programs that function the same as readelf:
http://www.pe-explorer.com/peexplorer-tour.htm
Here's the Microsoft specification of the PE format:
https://learn.microsoft.com/en-us/windows/win32/debug/pe-format
However, neither of those formats support 16-bit binaries anymore. The older ABI format is called "a.out" for Linux, which can be read and queried with objdump (I'm not sure about readelf). The older Windows/DOS formats are called MZ and NE. Again, I'm not familiar with the tool support for these older Windows formats.
Wikipedia has a pretty comprehensive list of all the popular executable file formats that have been used, with links to more info:
https://en.wikipedia.org/wiki/Comparison_of_executable_file_formats

How to run executable file a.out created in my laptop gcc environment in other laptops?

I have written a program code in c compiled and executed in gcc compiler. I want to share the executable file of program without sharing actual source code. Is there any way to share my program without revealing actual source code so that executable file could run on other computers with gcc compilers??
Is there any way to share my program without revealing actual source code so that executable file could run on other computers with gcc compilers?
TL;DR: yes, provided a greater degree of similarity than just having GCC. One simply copies the binary file and any needed auxiliary files to a compatible system and runs it.
In more detail
It is quite common to distribute compiled binaries without source code, for execution on machines other than the ones on which those binaries were built. This mode of distribution does present potential compatibility issues (as described below), but so does source distribution. In broad terms, you simply install (copy) the binaries and any needed supporting files to suitable locations on a compatible system and execute them. This is the manner of distribution for most commercial software.
Architecture dependence
Compiled binaries are certainly specific to a particular hardware architecture, or in certain special cases to a small, predetermined set of two or more architectures (e.g. old Mac universal binaries). You will not be able to run a binary on hardware too different from what it was built for, but "architecture" is quite a different thing from CPU model.
For example, there is a very wide range of CPUs that implement the x86_64 architecture. Most programs targeting that architecture will run on any such CPU. Indeed, the x86 architecture is similar enough to x86_64 that most programs built for x86 will also run on x86_64 (but not vise versa). It is possible to introduce finer-grained hardware dependency, but you do not generally get that by default.
Operating system dependence
Furthermore, most binaries are built to run in the context of a host operating system. You will not be able to run a binary on an operating system too different from the one it was built for.
For example, Linux binaries do not run (directly) on Windows. Windows binaries do not run (directly) on OS X. Etc.
Library dependence
Additionally, a program built against shared libraries require a compatible version of each required shared library to be available in the runtime environment. That does not necessarily have to be exactly the same version against which it was built; that depends on the library, which of its functions and data are used, and whether and how those changed over time.
You can sidestep this issue by linking every needed library statically, up to and including the C standard library, or by distributing shared libraries along with your binary. It's fairly common to just live with this issue, however, and therefore to support only a subset of all possible environments with your binary distribution(s).
Other
There is a veritable universe of other potential compatibility issues, but it's unlikely that any of them would catch you by surprise with respect to a program that you wrote yourself and want to distribute. For example, if you use nVidia CUDA in your program then it might require an nVidia GPU, but such a requirement would surely be well known to you.
Executable are often specific to the environment/machine they were created on. Even if the same processor/hardware is involved, there may be dependencies on libraries that may prevent executables from just running on other machines.
A program that uses only "standard libraries" and that links all libraries statically, does not need any other dependency (in the sense that all the code it need is in the binary itself or into OS libraries that -being part of the system itself- are already on the system).
You have to link the standard library statically. Otherwise it will only work if the version of the standard library for your compiler is installed in your OS by default (which you can't rely on, in general).

xorg input driver

I need to write a xinput driver for a virtual device, e.g. http://cgit.freedesktop.org/~whot/xf86-input-random. The device is connected to a LAN. The client for this device is written in C++. Is it possible to use C++ code in this driver or must the whole project be C only?
An Xorg driver is just an ELF shared object plugin following some documented convention. In principle, how you obtain that .so is your own business (you could in theory write it manually bit by bit if you have centuries of your time to lose).
In principle, you could link the libstdc++.so to your shared object (since one can link shared objects to other ones). I guess that you would compile and link your plugin with g++ and perhaps explicitly need to link with -lstdc++
However, I guess that it might be unsafe. Perhaps C++ ABI requires some specific things to be executed by the crt0.
So you might try, but I won't be surprised if something does not work as you want (e.g. exception handling). It could depend upon the version of the C++ library and the version of the C library and the version of the compiler.... I guess that it might work better with recent g++, recent libc, recent libstdc++ ....
Read Drupper's paper: How to Write Shared Libraries
Make your driver free software, and publish very quickly its source code, so you could get some help from the Xorg community (even when your driver is incomplete). Use probably a recent Xorg....

Implementing C file streams (FILE *, fopen, fread, etc.) on embedded platform

I've been tasked with adding streams support (C89/C90) to the libraries for my company's legacy embedded C compiler. Our target hardware typically has 1MB or less of code space and does not have an operating system.
We have a lot of stream-like implementations throughout the codebase that I can use as a starting point. For example, a console that works over a TCP sockets or serial port, a web server that reads from FAT on SD card or in-memory file, and even a firmware updater that reads from many sources.
Before I go and re-invent the wheel, I'm wondering if there are existing implementations that I could either port or use as a starting point for my work. Even though we provide full source code to our customers, GPL-licensed code isn't an option since our customers don't want to release source code to their products.
Can anyone recommend a book (annotated Unix source, CompSci text) or public domain/BSD-licensed source? I'd prefer to look at an older OS targeted to a single device, as current operating systems contain a tangle of macros and layers of typedefs that make following even a simple struct definition difficult.
Take a look at P.J. Plauger's book The Standard C Library, which describes in detail one possible implementation of the complete C89 standard library.
You should be able to pull most of what you need from the source code for the GNU C standard library. It is licensed with the Lesser GPL, which means you can link to the library without affecting the license of your software (or forcing your customers to release their code). Porting this to your platform (thus keeping the LGPL-ed code in its own library) may be easier than implementing your own from scratch.
Several different projects have taken GNU GLIBC and optimized it for embedded systems. You may want to look at:
Embedded GLIBC (LGPL)
uLIBC (LGPL)
Newlib (multiple free licenses)
In particular, EGLIBC and uLIBC were designed to run properly on embedded systems that lack a MMU.
You can also have a look at BSD's implementation of libc
Alternatively there is STLSoft, who provides several libraries (including the C standard lib) under a BSD license. I can't attest to their quality since I haven't used their code myself, but it might be worth looking at if you can't work LGPL-ed code into your project.
Wouldn't *BSD (Net|Open|Free)'s libc be suitable? At least as a starting point.
Try looking at http://www.minix3.org/
Check your development tools. Some development tools come with their on source for their software libraries.
I took the source for the Compiler's printf and adapted for a debug port on an embedded system. There is less work when you have a foundation to build from.

Resources