I'm trying to build my own toolchain for an Raspberry-Pi.
I know there are plenty of prebuilt Toolchains. This work is for educational reasons.
I'm following the embedded arm linux from scratch book.
And succeeded in building a gcc and uClib so far.
I'm building for the target arm-unknown-linux-eabi.
Now that it comes to preparing a bootable filesystem i'm questioning myself about the bootloader build.
The part about the bootloader for this System seems to be incomplete.
Now I'm questioning myself how do I build a uboot for this System with my arm-unknown-linux-eabi toolchain.
Do I need to build a toolchain which doesn't depend on linux kernel calls.
My first reasearch lead me to the point that there are separate kind of tool chain
the OS dependent (linux kernel sys-calls etc...) and the ones which don't need to have a kernel underneath. Sometimes refered to as "Bare-Metal" toolchain or "standalone" toolchain.
Some sources mention that it would be possible to build an U-Boot with the linux toolchain.
If this is true why and how should this work?
And if I have to build a second toolchain for "Bare Metal" Toolchain where can I find informations about the difference between these two. Do I need another libstdc?
You can built U-Boot with the same cross-toolchain used to build the kernel - and most probably the rest of the user-space of the system.
A bootloader is - by definition - self-contained and doesn't care about your choice of C-runtime library because it doesn't use it. Therefore the issue of sys-calls doesn't come into it.
A toolchain is always going to need to be hosted by a fully functioning development system - invariably not your target system. Whatever references you see to a 'bare-metal toolchain' are not referring to the compiler's use of sys-calls (it relies heavily on the operating system for I/O). What is important when building bootloaders and kernels is that compiler and linker are configured to produce statically linked code that can run at specific memory address.
In almost all possible ways, there is no difference between the embedded and the Linux toolchain. But there is one exception.
That exception is __clear_cache - a function that can be generated by the compiler and in a "Linux"-toolchain includes a system call to synchronize instruction and data caches. (See http://blogs.arm.com/software-enablement/141-caches-and-self-modifying-code/ for more information about that bit.)
Now, unless you explicitly add a call to that function, the only way I know for it to be invoked is by writing nested functions in C (a GCC extension that should be avoided).
But it is a difference.
Related
Does GCC (or alternatively Clang) defines any macro when it is compiled for the Arch Linux OS?
I need to check that my software restricts itself from compiling under anything but Arch Linux (the reason behind this is off-topic). I couldn't find any relevant resources on the internet.
Does anyone know how to guarantee through GCC preprocessor directives that my binaries are only compilable under Arch Linux?
Of course I can always
#ifdef __linux__
...
#endif
But this is not precise enough.
Edit: This must be done through C source code and not by any building systems, so, for example, doing this through CMake is completely discarded.
Edit 2: Users faking this behaviour is not a problem since the software is distributed to selected clients and thus, actively trying to "misuse" our source code is "their decision".
Does GCC (or alternatively Clang) defines any macro when it is compiled for the Arch Linux OS?
No. Because there's nothing inherently specific to Arch Linux on the binary level. For what it's worth, when compiling the only things you/the compiler has to care about is the target architecture (i.e. what kind of CPU it's going to run with), data type sizes and alignments and function calling conventions.
Then later on, when it's time to link the compiled translation unit objects into the final binary executable, the runtime libraries around are also of concern. Without taking special precautions you're essentially locking yourself into the specific brand of runtime libraries (glibc vs. e.g. musl; libstdc++ vs. libc++) pulled by the linker.
One can easily sidestep the later problem by linking statically, but that limits the range of system and midlevel APIs available to the program. For example on Linux a purely naively statically linked program wouldn't be able to use graphics acceleration APIs like OpenGL-3.x or Vulkan, since those rely on loading components of the GPU drivers into the process. You can however still use X11 and indirect GLX OpenGL, since those work using wire protocols going over sockets, which are implemented using direct syscalls to the kernel.
And these kernel syscalls are exactly the same on the binary level for each and every Linux kernel of every distribution out there. Although inside of the kernel there's a lot of leeway when it comes to redefining interfaces, when it comes to the interfaces toward the userland (i.e. regular programs) there's this holy, dogmatic, ironclad rule that YOU NEVER BREAK USERLAND! Kernel developers breaking this rule, intentionally or not are chewed out publicly by Linus Torvalds in his in-/famous rants.
The bottom line to this is, that there is no such thing as a "Linux distribution specific identifier on the binary level". At the end of the day, a Linux distribution is just that: A distribution of stuff. That means someone or more decided on a set of files that make up a working Linux system, wrap it up somehow and slap a name on it. That's it. There's nothing inherently specific to "Arch" Linux other than it's called "Arch" and (for the time being) relies on the pacman package manager. That's it. Everything else about "Arch", or any other Linux distribution, is just a matter of happenstance.
If you really want to sort different Linux distributions into certain bins regarding binary compatibility, then you'd have to pigeonhole the combinations of
Minimum required set of supported syscalls. This translates into minimum required kernel version.
What libc variant is being used; and potentially which version, although it's perfectly possible to link against a minimally supported set of functions, that has been around for almost "forever".
What variant of the C++ standard library the distribution decided upon. This actually also inflicts programs that might appear to be purely C, because certain system level libraries (*cough* Mesa *cough*) will internally pull a lot of C++ infrastructure (even compilers), also triggering other "fun" problems¹
I need to check that my software restricts itself from running under anything but Arch Linux (the reason behind this is off-topic). I couldn't find any relevant resources on the internet.
You couldn't find resources on the Internet, because there's nothing specific on the binary level that makes "Arch" Arch. For what it's worth right now, this instant I could create a fork of Arch, change out its choice of default XDG skeleton – so that by default user directories are populated with subdirs called leech, flicks, beats, pics – and call it "l33tz" Linux. For all intents and purposes it's no longer Arch. It does behave significantly different from the default Arch behavior, which would also be of concern to you, if you'd relied on any specific thing, and be it most minute.
Your employer doesn't seem to understand what Linux is or what distinguished distributions from each other.
Hint: It's not the binary compatibility. As a matter of fact, as long as you stay within the boring old realm of boring old glibc + libstdc++ Linux distributions are shockingly compatible with each other. There might be slight differences in where they put libraries other than libc.so, libdl.so and ld-linux[-${arch}].so, but those two usually always can be found under /lib. And once ld-linux[-${arch}].so and libdl.so take over (that means pulling in all libraries loaded at runtime) all the specifics of where shared objects and libraries are to be found are abstracted away by the dynamic linker.
1: like becoming multithreaded only after global constructors were executed and libstdc++ deciding it wants to be singlethreaded, because libpthread wasn't linked into a program that didn't create a single thread on its own. That was a really weird bug I unearthed, but yshui finally understood https://gitlab.freedesktop.org/mesa/mesa/-/issues/3199
You can list the predefined preprocessor macros with
gcc -dM -E - /dev/null
clang -dM -E - /dev/null
None of those indicate what operating system the compiler is running under. So not only you can't tell whether the program is compiled under Arch Linux, you can't even tell whether the program is compiled under Linux. The macros __linux__ and friends indicate that the program is being compiler for Linux. They are defined when cross-compiling from another system to Linux, and not defined when cross-compiling from Linux to another system.
You can artificially make your program more difficult to compile by specifying absolute paths for system headers and relying on non-portable headers (e.g. /usr/include/bits/foo.h). That can make cross-compilation or compilation for anything other than Linux practically impossible without modifying the source code. However, most Linux distributions install headers in the same location, so you're unlikely to pinpoint a specific distribution.
You're very likely asking the wrong question. Instead of asking how to restrict compilation to Arch Linux, start from why you want to restrict compilation to Arch Linux. If the answer is “because the resulting program wouldn't be what I want under another distribution”, then start from there and make sure that the difference results in a compilation error rather than incorrect execution. If the answer to “why” is something else, then you're probably looking for a technical solution to a social problem, and that rarely ends well.
No, it doesn't. And even if it did, it wouldn't stop anyone from compiling the code on an Arch Linux distro and then running it on a different Linux.
If you need to prevent your software from "from running under anything but Arch Linux", you'll need to insert a run-time check. Although, to be honest, I have no idea what that check might consist of, since linux distros are not monolithic products. The actual check would probably have to do with your reasons for imposing the restriction.
As I understand FreeRTOS is merely three C files which reside somewhere on the unit. If I create some C program, to carry out some specific process. My question has three parts.
Does the C programs have to be compiled externally or is there an interpreter which processes the code when the unit is powered up?
If compiled on the unit itself is there a compiler which creates a system type file which then gets booted to start the device?
If it has to be created externally, is gcc capable of generating the, I assume, one executable?
Thanks...
Assuming that you are talking about an embedded system:
Yes, commonly you need to compile all C files into one executable externally.
Well, normally there is no compiler on such a unit. If there is one, its usage and functioning will be documented.
This depends on your target system. GCC has a lot of target architectures but you didn't mention yours. Most probably you can't use your PC's GCC because its target is x86 or x64 on Linux or Windows or Mac or such a common system. And your embedded system might be an ARM without an operating system. You need a cross-compiler which you can install.
I have written a program code in c compiled and executed in gcc compiler. I want to share the executable file of program without sharing actual source code. Is there any way to share my program without revealing actual source code so that executable file could run on other computers with gcc compilers??
Is there any way to share my program without revealing actual source code so that executable file could run on other computers with gcc compilers?
TL;DR: yes, provided a greater degree of similarity than just having GCC. One simply copies the binary file and any needed auxiliary files to a compatible system and runs it.
In more detail
It is quite common to distribute compiled binaries without source code, for execution on machines other than the ones on which those binaries were built. This mode of distribution does present potential compatibility issues (as described below), but so does source distribution. In broad terms, you simply install (copy) the binaries and any needed supporting files to suitable locations on a compatible system and execute them. This is the manner of distribution for most commercial software.
Architecture dependence
Compiled binaries are certainly specific to a particular hardware architecture, or in certain special cases to a small, predetermined set of two or more architectures (e.g. old Mac universal binaries). You will not be able to run a binary on hardware too different from what it was built for, but "architecture" is quite a different thing from CPU model.
For example, there is a very wide range of CPUs that implement the x86_64 architecture. Most programs targeting that architecture will run on any such CPU. Indeed, the x86 architecture is similar enough to x86_64 that most programs built for x86 will also run on x86_64 (but not vise versa). It is possible to introduce finer-grained hardware dependency, but you do not generally get that by default.
Operating system dependence
Furthermore, most binaries are built to run in the context of a host operating system. You will not be able to run a binary on an operating system too different from the one it was built for.
For example, Linux binaries do not run (directly) on Windows. Windows binaries do not run (directly) on OS X. Etc.
Library dependence
Additionally, a program built against shared libraries require a compatible version of each required shared library to be available in the runtime environment. That does not necessarily have to be exactly the same version against which it was built; that depends on the library, which of its functions and data are used, and whether and how those changed over time.
You can sidestep this issue by linking every needed library statically, up to and including the C standard library, or by distributing shared libraries along with your binary. It's fairly common to just live with this issue, however, and therefore to support only a subset of all possible environments with your binary distribution(s).
Other
There is a veritable universe of other potential compatibility issues, but it's unlikely that any of them would catch you by surprise with respect to a program that you wrote yourself and want to distribute. For example, if you use nVidia CUDA in your program then it might require an nVidia GPU, but such a requirement would surely be well known to you.
Executable are often specific to the environment/machine they were created on. Even if the same processor/hardware is involved, there may be dependencies on libraries that may prevent executables from just running on other machines.
A program that uses only "standard libraries" and that links all libraries statically, does not need any other dependency (in the sense that all the code it need is in the binary itself or into OS libraries that -being part of the system itself- are already on the system).
You have to link the standard library statically. Otherwise it will only work if the version of the standard library for your compiler is installed in your OS by default (which you can't rely on, in general).
I was using PIC micro controller for my projects. Now I would like to move to ARM based Controllers. I would like to start ARM using Linux (using C). But I have no idea how to start using Linux. Which compiler is best, what all things I need to study like a lot of confusions. Can you guys help me on that? My projects usually includes UART, IIC, LCD and such things. I am not using any RTOS. Can you guys help me?
Sorry for my bad English
Once you put a heavyweight OS like Linux on a device, the level of abstraction from the hardware it provides makes it largely irrelevant what the chip is. If you want to learn something about ARM specifically, using Linux is a way of avoiding exactly that!
Morover the jump from PIC to ARM + Linux is huge. Linux does not get out of bed for less that 4Mb or RAM and considerably more non-volatile storage - and that is a bare minimum. ARM chips cover a broad spectrum, with low-end parts not even capable of supporting Linux. To make Linux worthwhile you need an ARM part with MMU support, which excludes a large range of ARM7 and Cortex-M parts.
There are plenty of smaller operating systems for ARM that will allow you to perform efficient (and hard real-time) scheduling and IPC with a very small footprint. They range form simple scheduling kernels such as FreeRTOS to more complete operating systems with standard device support and networking such as eCOS. Even if you use a simple scheduler, there are plenty of libraries available to support networking, filesystems, USB etc.
The answer to your question about compiler is almost certainly GCC - thet is the compiler Linux is built with. You will need a cross-compiler to build the kernel itself, but if you do have an ARM platform with sufficient resource, once you have Linux running on it, your target can host a compiler natively.
If you truly want to use Linux on ARM against all my advice, then the lowest cost, least effort approach to doing so is perhaps to use a Raspberry Pi. It is an ARM11 based board that runs Linux out of the box, is increasingly widely supported, and can be overclocked to 900MHz
You can also try using the Beagle Bone development board. To start with it has few features like UART I2C and others also u can give a try developing the device driver modules for the hardware.
ARM Linux compilers and build toolchains are provided by many vendors. Below are your options which I know of:
1.ARM themselves in form of their product DS-5 ;
2.Codesourcery now acquired by Mentor graphics. See some instructions to obtain & install, codesourcery toolchain for ARM linux here
3.To first start programming using ARM (C , assembly ) I find this Windows-Cygwin version of ARM linux tool chain very helpfull. Here. These are prebuilt executables which work under Cygwin(A Posix shell layer) on Windows.
4.Another option would be to cross compile gcc/g++ toolchain on Linux for ARM target of your choice. Search and web will have information about how it is done. But this could be a slightly mroe involved and long-winding process.
enjoy ARM'ing.
First, you should question yourself if you really need to program assembly language, most modern compilers are hard to beat when it comes to generating optimized code.
Then if you decide you really need it, you can make life easier for your self by using inline assembler, and let the compiler write the glue code for you, as shown in this wikipedia article.
Then the compiler to use: For free compilers there are practically only two choices: either gcc or clang.
There is also a non free toolchain from arm which when i last tried, 5 years ago, produced about 30% faster code than gcc at the time. I have not used it since.
The latest version of this compiler can be found here
You can also write standalone assembler code in .s files, both gcc and clang can compile .s into .o in the same way you would compile a .c or .cpp file.
Compile
If you are using a STM32 based microcontroller you need to get CMSIS and GNU arm-non-eabi-gcc package installed. Then you need to write your own makefile to pass your c codes into arm gcc compiler.
Programming
For the programming step you need to install openocd and configure that for your specific programmer. You can find a full description on how to do that on my blog
http://bijan.binaee.com/index.php/2016/04/14/how-to-program-cortex-m-under-gnulinux-arch/ and in my GitHub repository.
IDE
I'm using vim with CTags but you can use gEdit with the Shortcut plugin if you need a simpler text editor.
I don't quite understand the compiling process of the Linux kernel when I install
a Linux system on my machine.
Here are some things that confused me:
The kernel is written in C, however how did the kernel get compiled without a compiler installed?
If the C compiler is installed on my machine before the kernel is compiled, how can the compiler itself get compiled without a compiler installed?
I was so confused for a couple of days, thanks for the response.
The first round of binaries for your Linux box were built on some other Linux box (probably).
The binaries for the first Linux system were built on some other platform.
The binaries for that computer can trace their root back to an original system that was built on yet another platform.
...
Push this far enough, and you find compilers built with more primitive tools, which were in turn built on machines other than their host.
...
Keep pushing and you find computers built so that their instructions could be entered by setting switches on the front panel of the machine.
Very cool stuff.
The rule is "build the tools to build the tools to build the tools...". Very much like the tools which run our physical environment. Also known as "pulling yourself up by the bootstraps".
I think you should distinguish between:
compile, v: To use a compiler to process source code and produce executable code [1].
and
install, v: To connect, set up or prepare something for use [2].
Compilation produces binary executables from source code. Installation merely puts those binary executables in the right place to run them later. So, installation and use do not require compilation if the binaries are available. Think about ”compile” and “install” like about “cook” and “serve”, correspondingly.
Now, your questions:
The kernel is written in C, however how did the kernel get compiled without a compiler installed?
The kernel cannot be compiled without a compiler, but it can be installed from a compiled binary.
Usually, when you install an operating system, you install an pre-compiled kernel (binary executable). It was compiled by someone else. And only if you want to compile the kernel yourself, you need the source and the compiler, and all the other tools.
Even in ”source-based” distributions like gentoo you start from running a compiled binary.
So, you can live your entire life without compiling kernels, because you have them compiled by someone else.
If the C compiler is installed on my machine before the kernel is compiled, how can the compiler itself get compiled without a compiler installed?
The compiler cannot be run if there is no kernel (OS). So one has to install a compiled kernel to run the compiler, but does not need to compile the kernel himself.
Again, the most common practice is to install compiled binaries of the compiler, and use them to compile anything else (including the compiler itself and the kernel).
Now, chicken and egg problem. The first binary is compiled by someone else... See an excellent answer by dmckee.
The term describing this phenomenon is bootstrapping, it's an interesting concept to read up on. If you think about embedded development, it becomes clear that a lot of devices, say alarm clocks, microwaves, remote controls, that require software aren't powerful enough to compile their own software. In fact, these sorts of devices typically don't have enough resources to run anything remotely as complicated as a compiler.
Their software is developed on a desktop machine and then copied once it's been compiled.
If this sort of thing interests you, an article that comes to mind off the top of my head is: Reflections on Trusting Trust (pdf), it's a classic and a fun read.
The kernel doesn't compile itself -- it's compiled by a C compiler in userspace. In most CPU architectures, the CPU has a number of bits in special registers that represent what privileges the code currently running has. In x86, these are the current privilege level bits (CPL) in the code segment (CS) register. If the CPL bits are 00, the code is said to be running in security ring 0, also known as kernel mode. If the CPL bits are 11, the code is said to be running in security ring 3, also known as user mode. The other two combinations, 01 and 10 (security rings 1 and 2 respectively) are seldom used.
The rules about what code can and can't do in user mode versus kernel mode are rather complicated, but suffice to say, user mode has severely reduced privileges.
Now, when people talk about the kernel of an operating system, they're referring to the portions of the OS's code that get to run in kernel mode with elevated privileges. Generally, the kernel authors try to keep the kernel as small as possible for security reasons, so that code which doesn't need extra privileges doesn't have them.
The C compiler is one example of such a program -- it doesn't need the extra privileges offered by kernel mode, so it runs in user mode, like most other programs.
In the case of Linux, the kernel consists of two parts: the source code of the kernel, and the compiled executable of the kernel. Any machine with a C compiler can compile the kernel from the source code into the binary image. The question, then, is what to do with that binary image.
When you install Linux on a new system, you're installing a precompiled binary image, usually from either physical media (such as a CD DVD) or from the network. The BIOS will load the (binary image of the) kernel's bootloader from the media or network, and then the bootloader will install the (binary image of the) kernel onto your hard disk. Then, when you reboot, the BIOS loads the kernel's bootloader from your hard disk, and the bootloader loads the kernel into memory, and you're off and running.
If you want to recompile your own kernel, that's a little trickier, but it can be done.
Which one was there first? the chicken or the egg?
Eggs have been around since the time of the dinosaurs..
..some confuse everything by saying chickens are actually descendants of the great beasts.. long story short: The technology (Egg) was existent prior to the Current product (Chicken)
You need a kernel to build a kernel, i.e. you build one with the other.
The first kernel can be anything you want (preferably something sensible that can create your desired end product ^__^)
This tutorial from Bran's Kernel Development teaches you to develop and build a smallish kernel which you can then test with a Virtual Machine of your choice.
Meaning: you write and compile a kernel someplace, and read it on an empty (no OS) virtual machine.
What happens with those Linux installs follows the same idea with added complexity.
It's not turtles all the way down. Just like you say, you can't compile an operating system that has never been compiled before on a system that's running that operating system. Similarly, at least the very first build of a compiler must be done on another compiler (and usually some subsequent builds too, if that first build turns out not to be able to compile its own source code just yet).
I think the very first Linux kernels were compiled on a Minix box, though I'm not certain about that. GCC was available at the time. One of the very early goals of many operating systems is to run a compiler well enough to compile their own source code. Going further, the first compiler was almost certainly written in assembly language. The first assemblers were written by those poor folks who had to write in raw machine code.
You may want to check out the Linux From Scratch project. You actually build two systems in the book: a "temporary system" that is built on a system you didn't build yourself, and then the "LFS system" that is built on your temporary system. The way the book is currently written, you actually build the temporary system on another Linux box, but in theory you could adapt it to build the temporary system on a completely different OS.
If I am understanding your question correctly. The kernel isn't "compiling itself" these days. Most Linux distributions today provide system installation through a linux live cd. The kernel is loaded from the CD into memory and operates as it would normally as if it were installed to disk. With a linux environment up and running on your system it is easy to just commit the necessary files to your disk.
If you were talking about the bootstrapping issue; dmckee summed it up pretty nice.
Just offering another possibility...