I'm compiling newlib for a bespoke PowerPC platform with no OS. Reading information on the net I realise I need to implement stub functions in a <newplatform> subdirectory of libgloss.
My confusion is as to how this is going to be picked up when I compile newlib. Is it the last part of the --target argument to configure, e.g. powerpc-ibm-<newplatform>?
If this is the case, then I guess I should use the same --target when compiling binutils and gcc?
Thank you
I ported newlib and GCC myself too, and I remember I didn't have to do much to make newlib work (porting GCC, gas and libbfd was most of the work).
I just had to tweak some files dealing with floating-point numbers, turn off some POSIX/SomeOtherStandard flags so that it stopped using some of the more sophisticated functions, and write support code for longjmp / setjmp that stores register state into the jump buffers and loads it back. But you certainly have to tell it the target using --target so it uses the right machine sub-directory and whatnot. I remember I had to add a small piece of code to config.sub to make it know about my target and print out the complete configuration triplet (cpu-manufacturer-os or similar). I just found that I had to edit a file called configure.host too, which sets some options for your target (for example, whether the operating system handles signals raised by raise, or whether newlib itself should simulate the handling).
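To make the setjmp / longjmp part concrete, here is a small sketch of the behaviour that support code has to deliver on the new target. It is plain standard C, not newlib internals; run_guarded and fail_with are names I made up for the illustration.

```c
#include <setjmp.h>

/* Jump buffer: on a new port, setjmp must save enough register state
   here (stack pointer, return address, callee-saved registers) for
   longjmp to resume execution at the setjmp call site. */
static jmp_buf recover_point;

/* Value carried across the jump; volatile so it is reliably re-read
   after longjmp returns control to run_guarded. */
static volatile int jump_code;

/* Hypothetical helper: abandon the current call chain. */
static void fail_with(int code)
{
    jump_code = code;
    longjmp(recover_point, 1);   /* never returns to the caller */
}

/* Returns 0 on the normal path, or the failure code if fail_with ran. */
int run_guarded(int code)
{
    if (setjmp(recover_point) == 0) {
        /* First pass: setjmp returned directly. */
        if (code != 0)
            fail_with(code);
        return 0;
    }
    /* Second pass: we arrived here through longjmp. */
    return jump_code;
}
```

If run_guarded(42) does not come back with 42 on the new target, the register save/restore code in the port is the first place I would look.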
I used this blog of Anthony Green as a guideline, where he describes porting of GCC, newlib and binutils. I think it's a great source when you have to do it yourself. A fun read anyway. It took a total of 2 months to compile and run some fun C programs that only need free-standing C (with dummy read/write functions that wrote into the simulator's terminal).
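For a rough idea of what such dummy functions look like, here is a minimal sketch of a heap stub and a write stub. Everything in it is an assumption for illustration: the stub_ prefix, the 4 KiB static heap and the in-memory console buffer are made up so that the code can be tried on a host machine. On a real port the functions would carry whatever names your newlib configuration expects (typically sbrk/write or _sbrk/_write), the heap bounds would come from the linker script, and the write stub would push bytes to a UART or the simulator's terminal.

```c
#include <stddef.h>

/* Hypothetical heap region; a real port would use linker-script symbols. */
static unsigned char heap_area[4096];
static size_t heap_used;

/* Hypothetical console sink standing in for a UART or simulator port. */
static char console_buf[256];
static size_t console_len;

/* sbrk-style stub: grow the heap by incr bytes and return the old break,
   or (void *)-1 on failure. This sketch never shrinks the heap. */
void *stub_sbrk(ptrdiff_t incr)
{
    void *old_break;

    if (incr < 0 || (size_t)incr > sizeof heap_area - heap_used)
        return (void *)-1;
    old_break = heap_area + heap_used;
    heap_used += (size_t)incr;
    return old_break;
}

/* write-style stub: copy bytes to the console sink, ignoring fd.
   Returns the number of bytes actually accepted. */
int stub_write(int fd, const char *buf, int len)
{
    int i;

    (void)fd;
    for (i = 0; i < len && console_len < sizeof console_buf; i++)
        console_buf[console_len++] = buf[i];
    return i;
}
```

With only these two in place, malloc and printf-style output have something to stand on; the remaining stubs (read, close, fstat and friends) can start out as functions that simply report failure.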
So I think the amount of work is certainly manageable. The one thing that nearly drove me crazy was libgloss's build scripts. I certainly was lost in that autoconf magic :) Anyway, I wish you good luck! :)
Check out Porting Newlib.
Quote:
I decided that after an incredibly difficult week of trying to get newlib ported to my own OS that I would write a tutorial that outlines the requirements for porting newlib and how to actually do it. I'm assuming you can already load binaries from somewhere and that these binaries are compiled C code. I also assume you have a syscall interface setup already. Why wait? Let's get cracking!
Related
Does GCC (or alternatively Clang) define any macro when it is compiled for the Arch Linux OS?
I need to check that my software restricts itself from compiling under anything but Arch Linux (the reason behind this is off-topic). I couldn't find any relevant resources on the internet.
Does anyone know how to guarantee through GCC preprocessor directives that my binaries are only compilable under Arch Linux?
Of course I can always
#ifdef __linux__
...
#endif
But this is not precise enough.
Edit: This must be done through C source code and not by any building systems, so, for example, doing this through CMake is completely discarded.
Edit 2: Users faking this behaviour is not a problem since the software is distributed to selected clients and thus, actively trying to "misuse" our source code is "their decision".
Does GCC (or alternatively Clang) define any macro when it is compiled for the Arch Linux OS?
No. Because there's nothing inherently specific to Arch Linux on the binary level. For what it's worth, when compiling, the only things you/the compiler have to care about are the target architecture (i.e. what kind of CPU it's going to run on), data type sizes and alignments, and function calling conventions.
Then later on, when it's time to link the compiled translation unit objects into the final binary executable, the runtime libraries around are also of concern. Without taking special precautions you're essentially locking yourself into the specific brand of runtime libraries (glibc vs. e.g. musl; libstdc++ vs. libc++) pulled by the linker.
One can easily sidestep the latter problem by linking statically, but that limits the range of system and midlevel APIs available to the program. For example, on Linux a purely naively statically linked program wouldn't be able to use graphics acceleration APIs like OpenGL-3.x or Vulkan, since those rely on loading components of the GPU drivers into the process. You can however still use X11 and indirect GLX OpenGL, since those work using wire protocols going over sockets, which are implemented using direct syscalls to the kernel.
And these kernel syscalls are exactly the same on the binary level for each and every Linux kernel of every distribution out there. Although inside of the kernel there's a lot of leeway when it comes to redefining interfaces, when it comes to the interfaces toward the userland (i.e. regular programs) there's this holy, dogmatic, ironclad rule that YOU NEVER BREAK USERLAND! Kernel developers breaking this rule, intentionally or not, are chewed out publicly by Linus Torvalds in his in-/famous rants.
The bottom line to this is, that there is no such thing as a "Linux distribution specific identifier on the binary level". At the end of the day, a Linux distribution is just that: A distribution of stuff. That means someone or more decided on a set of files that make up a working Linux system, wrap it up somehow and slap a name on it. That's it. There's nothing inherently specific to "Arch" Linux other than it's called "Arch" and (for the time being) relies on the pacman package manager. That's it. Everything else about "Arch", or any other Linux distribution, is just a matter of happenstance.
If you really want to sort different Linux distributions into certain bins regarding binary compatibility, then you'd have to pigeonhole the combinations of
Minimum required set of supported syscalls. This translates into minimum required kernel version.
What libc variant is being used; and potentially which version, although it's perfectly possible to link against a minimally supported set of functions, that has been around for almost "forever".
What variant of the C++ standard library the distribution decided upon. This actually also affects programs that might appear to be purely C, because certain system level libraries (*cough* Mesa *cough*) will internally pull in a lot of C++ infrastructure (even compilers), also triggering other "fun" problems¹
I need to check that my software restricts itself from running under anything but Arch Linux (the reason behind this is off-topic). I couldn't find any relevant resources on the internet.
You couldn't find resources on the Internet, because there's nothing specific on the binary level that makes "Arch" Arch. For what it's worth, right now, this instant, I could create a fork of Arch, change out its choice of default XDG skeleton (so that by default user directories are populated with subdirs called leech, flicks, beats, pics) and call it "l33tz" Linux. For all intents and purposes it's no longer Arch. It does behave significantly differently from the default Arch behavior, which would also be of concern to you if you relied on any specific thing, however minute.
Your employer doesn't seem to understand what Linux is or what distinguishes distributions from each other.
Hint: It's not the binary compatibility. As a matter of fact, as long as you stay within the boring old realm of boring old glibc + libstdc++, Linux distributions are shockingly compatible with each other. There might be slight differences in where they put libraries other than libc.so, libdl.so and ld-linux[-${arch}].so, but those can usually be found under /lib. And once ld-linux[-${arch}].so and libdl.so take over (that means pulling in all libraries loaded at runtime), all the specifics of where shared objects and libraries are to be found are abstracted away by the dynamic linker.
1: like becoming multithreaded only after global constructors were executed and libstdc++ deciding it wants to be singlethreaded, because libpthread wasn't linked into a program that didn't create a single thread on its own. That was a really weird bug I unearthed, but yshui finally understood https://gitlab.freedesktop.org/mesa/mesa/-/issues/3199
You can list the predefined preprocessor macros with
gcc -dM -E - </dev/null
clang -dM -E - </dev/null
None of those indicate what operating system the compiler is running under. So not only can you not tell whether the program is compiled under Arch Linux, you can't even tell whether the program is compiled under Linux. The macros __linux__ and friends indicate that the program is being compiled for Linux. They are defined when cross-compiling from another system to Linux, and not defined when cross-compiling from Linux to another system.
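As a small illustration of what those macros can and cannot tell you, here is a sketch that maps a few well-known predefined macros to a name. The function name target_os_name is made up. Note that the answer is fixed at compile time and describes the target, not the machine the compiler ran on, and that nothing in it goes finer than "Linux".

```c
/* Report the operating system the code was compiled FOR, as far as the
   compiler's predefined macros reveal it. There is no distribution-level
   macro to test, so "Linux" is as precise as this can get. */
const char *target_os_name(void)
{
#if defined(__linux__)
    return "Linux";
#elif defined(_WIN32)
    return "Windows";
#elif defined(__APPLE__) && defined(__MACH__)
    return "Darwin";
#else
    return "unknown";
#endif
}
```

Replacing a branch with an #error directive turns the same test into a compile-time refusal, but only at the granularity of "not targeting Linux".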
You can artificially make your program more difficult to compile by specifying absolute paths for system headers and relying on non-portable headers (e.g. /usr/include/bits/foo.h). That can make cross-compilation or compilation for anything other than Linux practically impossible without modifying the source code. However, most Linux distributions install headers in the same location, so you're unlikely to pinpoint a specific distribution.
You're very likely asking the wrong question. Instead of asking how to restrict compilation to Arch Linux, start from why you want to restrict compilation to Arch Linux. If the answer is “because the resulting program wouldn't be what I want under another distribution”, then start from there and make sure that the difference results in a compilation error rather than incorrect execution. If the answer to “why” is something else, then you're probably looking for a technical solution to a social problem, and that rarely ends well.
No, it doesn't. And even if it did, it wouldn't stop anyone from compiling the code on an Arch Linux distro and then running it on a different Linux.
If you need to prevent your software "from running under anything but Arch Linux", you'll need to insert a run-time check. Although, to be honest, I have no idea what that check might consist of, since Linux distros are not monolithic products. The actual check would probably have to do with your reasons for imposing the restriction.
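One heuristic such a run-time check could use, purely as a sketch: most current distributions ship an os-release file (usually /etc/os-release) with an ID= line, and Arch identifies itself there as ID=arch. The function names below are made up, the parsing is deliberately simplistic, and anyone can edit that file, so this tells you what the system claims to be and nothing more.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `line` is an os-release "ID=" entry whose value is exactly
   `want` (optionally quoted), 0 otherwise. */
int os_release_line_has_id(const char *line, const char *want)
{
    size_t n = strlen(want);

    if (strncmp(line, "ID=", 3) != 0)
        return 0;                      /* also rejects ID_LIKE=, VERSION_ID= */
    line += 3;
    if (*line == '"' || *line == '\'')
        line++;                        /* skip an opening quote */
    if (strncmp(line, want, n) != 0)
        return 0;
    line += n;
    /* The value must end here: end of string, newline or closing quote. */
    return *line == '\0' || *line == '\n' || *line == '"' || *line == '\'';
}

/* Returns 1 if the file at `path` declares ID=<want>, 0 if it does not,
   and -1 if the file cannot be opened. */
int os_release_id_is(const char *path, const char *want)
{
    char line[256];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = os_release_line_has_id(line, want);
    fclose(f);
    return found;
}
```

A program would call os_release_id_is("/etc/os-release", "arch") at startup and decide what to do with a 0 or -1. Whether that matches the actual reason for the restriction is a separate question.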
Is it possible to write code in C, build it statically into a binary such as an ELF or PE file, then remove its header and all unnecessary metadata to get a raw binary, and finally put this raw binary into the executable format of another OS, like (ELF > PE) or (PE > ELF)?!
have you done this before?
is it possible?
what are issues and concerns?
how this would be possible?!
and if not, just tell me why not?!!?!
what are my pitfalls in understanding the static build?
doesn't it mean that it removes any need for 3rd party and standard as well as os libs and headers?!
Why can't we remove the metadata of, for example, ELF and put in the metadata and other specs needed for PE?
Mention:
I said, Cross OS not Cross Hardware
[Read after reading below!]
As you see the best answer, till now (!) just keep going and learn cross platform development issues!!! How crazy is this?! thanks to philosophy!!!
I would say that it's possible, but the process would be hampered by many, many details.
ABI compatibility
The first thing to think of is Application Binary Interface compatibility. Unless you're able to call your functions the same way, the code is broken. So I guess (though I can't check at the moment) that compiling code with gcc on Linux/OS X and MinGW gcc on Windows should give the same binary code as long as no external functions are called. The problem here is that executable metadata may rely on some ABI assumptions.
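One ABI detail that is easy to check for yourself is the data model. As a sketch (the function name data_model is made up): 64-bit Linux uses LP64, where long is 8 bytes, while 64-bit Windows uses LLP64, where long stays at 4 bytes, so even code that calls nothing external can lay out its data differently on the two systems.

```c
/* Classify the data model of the platform this was compiled for,
   based only on the sizes of int, long and pointers. */
const char *data_model(void)
{
    if (sizeof(void *) == 8 && sizeof(long) == 8)
        return "LP64";    /* e.g. 64-bit Linux */
    if (sizeof(void *) == 8 && sizeof(long) == 4)
        return "LLP64";   /* e.g. 64-bit Windows */
    if (sizeof(void *) == 4 && sizeof(long) == 4 && sizeof(int) == 4)
        return "ILP32";   /* typical 32-bit targets */
    return "other";
}
```

On 32-bit x86 both systems are ILP32, so this particular difference only shows up for 64-bit builds; calling conventions are a separate matter.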
Standard libraries
That seems to be the largest hurdle, partly because the C preprocessor can inline some procedures on some platforms while leaving them to run-time on others. Also, cross-platform dynamic interoperation with standard libraries is close to impossible, though theoretically one can imagine code that uses a limited subset of the C standard library that is exposed through the same ABI on different platforms.
A static build mostly eliminates problems of interaction with other user-space code, but there is still a huge issue of interfacing with the kernel: it's int $0x80 calls on x86 Linux and a platform-specific set of syscall numbers that does not map to Windows in any direct way.
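To show how Linux-specific that kernel interface is, here is a sketch that writes to a file descriptor through the raw system call instead of the C library's write(). raw_write is a made-up name. syscall() and SYS_write are the glibc/Linux facilities for this; the number behind SYS_write differs even between Linux architectures (4 on 32-bit x86, 1 on x86-64), and on anything that is not Linux the sketch simply gives up.

```c
#if defined(__linux__)
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE        /* ask glibc to declare syscall() */
#endif
#include <unistd.h>
#include <sys/syscall.h>
#endif
#include <stddef.h>

/* Write len bytes from buf to fd using the kernel's own system call
   number. Returns the number of bytes written, or -1 on error or on a
   platform where this Linux-only mechanism does not exist. */
long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__linux__)
    return syscall(SYS_write, fd, buf, len);
#else
    (void)fd;
    (void)buf;
    (void)len;
    return -1;                 /* no equivalent mapping on this platform */
#endif
}
```

A statically linked Linux binary contains this kind of call somewhere underneath every printf, which is the part that cannot be carried over to Windows unchanged.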
OS-specific register use
As far as I know, Windows uses register %fs for storing some OS-wide exception-handling stuff, so a binary compiled on Linux should avoid cluttering it. There might be other similar issues. Also, C++ exceptions on Windows are mostly done with OS exceptions.
Virtual addresses
Again, AFAIK Windows DLLs have some predefined address they must be loaded at in the virtual address space of a process, whereas Linux uses position-independent code for shared libraries. So there might be issues with overlapping areas of an executable and ported code, unless the ported position-dependent code is recompiled to be position-independent.
So, while theoretically possible, such a transformation would be very fragile in real situations, and it's impossible to re-plant the whole static build code: some parts may be transferred intact, but must be relinked to system-specific code that interfaces properly with the other kernel.
P.S. I think Wine is a good example of running binary code on a quite different system. It tricks a Windows program into thinking it's running in a Windows environment and uses the same machine code. Most of the time that works well (if a program does not use private low-level system routines or unavailable libraries).
I inherited an old project that uses an Innovasic ia188em processor (previously AM188 from AMD). I will likely need to modify the code, and so will need to recompile. Unfortunately, I'm not sure which compiler was used previously (it compiled into a .hex file), and searching through the source code (and in particular the header files) doesn't seem to indicate it either.
I did see one program that could work, but I was wondering if anyone knew of any free programs that might do this. I saw some forums where people said they thought either an old Borland compiler or Bruce's C Compiler may work with 80188 chips (which I assume my chip falls under?), but nothing concrete. I failed to compile with Borland C++ 5 when I tried, though I admit I probably didn't have it set up correctly.
This is for an embedded board (i.e. no OS). I don't program too often, so my compiler knowledge is limited. I mostly just write simple C programs and compile with gcc under linux. Any help is appreciated.
Updated 10/8: I apologize, I was looking at both this code, and the PC side code that talks to the embedded board, and got mixed up. The code for the ia188em (embedded board) is actually C (not C++). Updated title to reflect that. I'm not sure if it makes a huge difference or not.
You'll need a 16 bit "real mode" x86 compiler. If your compiler is a DOS targeted compiler, you will need some means of generating a raw binary rather than an MS-DOS load module (.exe); this may be possible through linker options or may require a non-DOS linker.
Any build scripts or makefiles included with the project code might help you identify the toolchain used, but the likelihood is that it is no longer available, and you'll need to source "antique software".
When I used to do this sort of thing (1985 -> 1990) I used the Intel toolchain, now long obsolete and no longer available from Intel. The tools required were
iC-86 - The compiler
link-86 - the linker
loc-86 - the image locater.
There is some information on these tools at a very old site here.
Another method that was used at the time was to process the .exe file produced by a standard Microsoft real mode PC compiler (MS-Pascal was the language used on that project) into an absolutely located image that could be blown into EPROM. The tool used for the conversion was proprietary to the company, so I have no idea whether there is an equivalent available.
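For anyone unfamiliar with what "absolutely located" means on this kind of part, here is a small sketch of the real mode address arithmetic a locater has to respect. It assumes the ia188em addresses memory like a classic 80188: 20-bit physical addresses formed as segment * 16 + offset, with execution after reset starting at FFFF:0000. The function name is made up.

```c
#include <stdint.h>

/* Convert a real mode segment:offset pair to a 20-bit physical address.
   The sum wraps around at 1 MiB because only 20 address bits exist. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

For example, the reset entry FFFF:0000 comes out as physical address 0xFFFF0, which is why the EPROM image has to place a jump instruction near the very top of the address space.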
Suppose you are designing, and writing a compiler for, a new language called Foo, among whose virtues is intended to be that it's particularly good for implementing compilers. A classic approach is to write the first version of the compiler in C, and use that to write the second version in Foo, after which it becomes self-compiling.
This does mean you have to be careful to keep backup copies of the binary (as opposed to most programs where you only have to keep backup copies of the source); once the language has evolved away from the first version, if you lost all copies of the binary, you would have nothing capable of compiling the current version. So be it.
But suppose it is intended to support both Linux and Windows. As long as it is in fact running on both platforms, it can compile itself on each platform, no problem. Supposing however you lost the binary on one platform (or had reason to suspect it had been compromised by an attacker); now there is a problem. And having to safeguard the binary for every supported platform is at least one more failure point than I'm comfortable with.
One solution would be to make it a cross-compiler, such that the binary on either platform can target both platforms.
This is not quite as easy as it sounds - while there is no problem selecting the binary output format, each platform provides the system API in the form of C header files, which normally only exist on their native platform, e.g. there is no guarantee code compiled against the Windows stdio.h will work on Linux even if compiled into Linux binary format.
Perhaps that problem could be solved by downloading the Linux header files onto a Windows box and using the Windows binary to cross-compile a Linux binary.
Are there any caveats with that solution I'm missing?
Another solution might be to maintain a separate minimum bootstrap compiler in Python, that compiles Foo into portable C, accepting only that subset of the language needed by the main Foo compiler and performing minimum error checking and no optimization, the intent being that the bootstrap compiler will thus remain simple enough that maintaining it across subsequent language versions wouldn't cost very much.
Again, are there any caveats with that solution I'm missing?
What methods have people used to solve this problem in the past?
This is a problem for C compilers themselves. It's typically solved by the use of a cross-compiler, exactly as you suggest.
The process of cross-compiling a compiler is no more difficult than cross-compiling any other project: that is to say, it's trickier than you'd like, but by no means impossible.
Of course, you first need the cross-compiler itself. This probably means some major surgery to your build-configuration system, and you'll need some kind of "sysroot" taken from the target (header, libraries, anything else you'll need to reference in a build).
So, in the end it depends on how your compiler is structured. Either it's easier to re-bootstrap using historical sources, repeating each phase of language compatibility you went through in the first place (you did use source revision control, right?), or it's easier to implement a cross-compiler configuration. I can't tell you which from here.
For many years, the GCC compiler was always written only in standard-compliant C code for exactly this reason: they wanted to be able to bring it up on any OS, given only the native C compiler for that system. Only in 2012 was it decided that C++ is now sufficiently widespread that the compiler itself can be written in it. Even then, they're only permitting themselves a subset of the language. In future, if anybody wants to port GCC to a platform that does not already have C++, they will need to either use a cross-compiler, or first port GCC 4.7 (the last major C-only version) and then move to the latest.
Additionally, the GCC build process does not "trust" the compiler it was built with. When you type "make", it first builds a reduced version of itself, and it then uses that to build a full version. Finally, it uses the full version to rebuild another full version, and compares the two binaries. If the two do not match, it knows that the original compiler was buggy and introduced some bad code, and the build has failed.
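The comparison step is easy to reproduce for a Foo compiler. A minimal sketch, assuming the two builds were written to two files (files_identical is a made-up name, and GCC's own comparison is more involved than a plain byte-for-byte check):

```c
#include <stdio.h>

/* Compare two files byte for byte.
   Returns 1 if identical, 0 if they differ, -1 if either cannot be opened. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (a == NULL || b == NULL) {
        result = -1;
    } else {
        int ca, cb;
        do {
            ca = fgetc(a);
            cb = fgetc(b);
            if (ca != cb) {        /* covers differing bytes and differing lengths */
                result = 0;
                break;
            }
        } while (ca != EOF);
    }
    if (a != NULL)
        fclose(a);
    if (b != NULL)
        fclose(b);
    return result;
}
```

The check only means something if the compiler's output is deterministic, so things like embedded timestamps need to be kept out of the binaries being compared.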
I want to run a simple hello world app, written in C, on my at91sam9rl-ek.
Is it possible without an OS?
And (if it is) how do I have to compile it?
Right now I am trying to use G++ Lite to create the ARM code.
(In general, which programs can the board start without an OS: assembler, ARM code?)
Sure, no problem running without an operating system, I do that kind of thing daily...
http://sam7stuff.blogspot.com/
Your programs are, at least at first, not going to resemble desktop applications. I would avoid any C libraries, no printfs or strcmps or things like that, until you get the feel for it and find the right tools. No floating point either. Add some numbers, do some shifting, blink some LEDs.
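A first program in that spirit might look like the sketch below. The register address and pin number are placeholders I made up, not values from the at91sam9rl datasheet, and the real PIO controller may want separate set and clear registers rather than a read-modify-write like this, so check the datasheet before trying it on the board.

```c
#include <stdint.h>

/* HYPOTHETICAL values for illustration only; take the real ones from
   the datasheet of your part. */
#define LED_PORT_ADDR 0x40000000u
#define LED_PIN       3u

/* Flip one bit of a memory-mapped output register. */
void led_toggle(volatile uint32_t *port, unsigned pin)
{
    *port ^= (uint32_t)1 << pin;
}

/* Crude busy-wait; volatile keeps the compiler from removing the loop. */
void delay_loop(volatile uint32_t count)
{
    while (count--) {
        /* burn cycles */
    }
}
```

On the board, the main loop would just call led_toggle((volatile uint32_t *)LED_PORT_ADDR, LED_PIN) followed by delay_loop(100000) forever. No C library is involved, only adds, shifts and a store.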
CodeSourcery Lite is probably the fastest way to get started; the gnueabi one, I believe, is the one you want.
This WinARM site has a compiler and tons of non-OS projects for what seems like every ARM-based microcontroller.
http://www.siwawi.arubi.uni-kl.de/avr_projects/arm_projects/
Atmel is very very good about information, no doubt they have example programs you can try as well on the eval board.
Emdebian is another cross compiler that is somewhat up to date and has binaries. Building a gcc from scratch for cross compiling is not bad at all. The C library is another story though, and even the gcc library for that matter. I find it easier to do without either library.
It is possible to get a C library working and run a great many kinds of programs. It depends on what you are looking to do. Ahh, just looked at the specs, that is a pretty serious eval board, plenty of power for an operating system should you choose to run one. You can certainly run programs that use the display as a user interface, read/write SD cards, use USB, basically everything on the board, without an OS, if you choose.