GCC, ARMboot - Creating a standalone application without any library and any OS

I have an embedded hardware system which contains a bootloader based on ARMboot (which is very similar to Uboot and PPCboot).
This bootloader normally serves to load a uClinux image from flash. However, I am now trying to use it to run a standalone helloworld application, which does not require any linked library. In fact, it contains only a while(1){} loop in the main function.
My problem is that I cannot work out which GCC settings I should use to build a properly formatted standalone binary.
I currently use the following build command:
cr16-elf-gcc -o helloworld helloworld.c -nostdlib
which produces warning message:
warning: cannot find entry symbol _start; defaulting to 00000004
Then, within the bootloader, I upload the produced binary and start it at some address:
tftpboot 0xa00000 helloworld
go 0xa00004
But it doesn't work :(
The system reboots.
Normally it should just hang (because of while(1)).

I don't know that loader, but I think you should use objcopy like this to dump your executable's contents to a raw binary file. Don't jump into the ELF headers, people :)
objcopy -O binary ./a.out o.bin
Also try compiling position-independent code, and read the ld and gcc manuals.

The linker is complaining about missing startup code.
You need to provide two things: startup code and a linker command file that defines the address map of your target processor.
In your case the startup code can be as simple as "bl main", but usually it will at least initialize the stack pointer before branching to main; a minimal sketch follows below.
If you know you are loading your example into RAM, you can start your program at main directly. You'll need to determine main()'s address and use that for your "go" command.
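For illustration, here is a minimal sketch of such startup code, written as C with inline assembly for a generic ARM-style target. The stack-top address 0xa10000, the section name and the mnemonics are assumptions for this sketch, not taken from the question's toolchain; adapt them to your board.

/* start.c - hedged sketch of standalone startup code */
void main(void);

__attribute__((naked, section(".text.start")))
void _start(void)
{
    __asm__ volatile(
        "ldr sp, =0xa10000\n"   /* initialize the stack pointer (address is an assumption) */
        "bl  main\n"            /* branch to main */
        "1: b 1b\n"             /* hang here if main ever returns */
    );
}

A linker script then has to place .text.start at the load address, so that the "go" command lands on the first instruction.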

I work on ARM with no OS and no libraries all day, every day. These are my current gcc options:
arm-whatever-gcc -Wall -O2 -nostdlib -nostartfiles -ffreestanding -c hello.c -o hello.o
Then I use the linker to combine the C code with the vector tables and such. Even if the image doesn't need a vector table, using one makes it easy to put your entry point on the first instruction (see the sketch below).
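As a hedged sketch, such a table can be expressed in C. The section name .vectors, the stack address and the two-entry Cortex-M-style layout are assumptions for illustration; other ARM cores lay out their vector tables differently.

/* vectors.c - hypothetical vector table placed at the start of the image */
extern void _start(void);

__attribute__((section(".vectors"), used))
const void *const vector_table[] = {
    (void *)0xa10000,   /* initial stack pointer value (assumed address) */
    (void *)_start,     /* reset vector: the first code to run */
};

The linker script must then place the .vectors section at the very start of the image, ahead of .text.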

Any reason you can't statically link at least the standard libraries in? You would have a working program and the benefits of the standard libraries, without external dependencies.
Also, does your toolchain/IDE differentiate between a "standalone application" and a "Linux application"? The IDE for the AVR32 has that distinction and can generate either a program that runs within the embedded Linux environment or a standalone program that basically becomes the OS.

Related

Analysing stack frame of C program in Linux

Is there any option to gcc on Linux that allows debugging the stack frames of a given procedure of a program written in C?
I know I can compile my program with the -ggdb3 gcc parameter, which lets me find out what the symbols in the program are. But is there any method to find out how a procedure's arguments are passed (via the stack or in registers)?
I've got a program which overwrites the stack, causing a SIGSEGV, and I'd like to analyse it from within the same program. First I'd like to find the problematic procedure, and then I plan to locate the exact place of the error.
You have a few options. The one I prefer is to look at the actual generated code, as it tells me exactly what is being executed. You can get this by passing -S when compiling with gcc or g++, which creates a file with a .s suffix.
For example, gcc -S helloworld.c will create a file called helloworld.s which contains the assembly code.
If you don't have the source, you can use tools like objdump to turn the binary code into a disassembly.
You'll find lots of examples if you search for "gcc assembly output".
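For example, to see whether arguments travel in registers or on the stack, compile a tiny function (made up here purely for illustration) and read the output:

/* args.c - a tiny function whose calling convention we want to inspect */
int add(int a, int b)
{
    return a + b;
}

Compiling with gcc -S args.c and opening args.s shows where a and b come from: on x86-64 Linux they arrive in the edi and esi registers, while a 32-bit x86 build loads them from the stack.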

How do I link an object file generated from C code, a static library and a NASM generated object file?

I am working on a program (for real mode) that is loaded by a bootloader to an address in memory; the bootloader then jumps to it and starts executing it. The problem is that I have the project separated into two files: a.asm (16-bit asm, NASM syntax) and b.c (which I compile with gcc for DOS (DJGPP)). Also, b.c uses some functions from the Allegro library (I have it as a static library, .a).
My question is, how do I compile and link these 3 files together? My first thought was to:
Compile and assemble b.c with gcc (with the -c flag); as a result I get a b.o file
Assemble a.asm with NASM (-fbin or.. ?) and get a.o
Link b.o, a.o and allegro.a to get a pure binary (no .exe headers, no debug information etc.)
I tried the above approach, but at step 3 the linker throws an error saying that the format of a.o (the object file generated by NASM) is unrecognized; that may be because I am not invoking the right flags and options when assembling the file, or..
I would like some guidance on how to approach this problem.
Thanks.
The .o file generated by DJGPP contains 32-bit (i386) code, which cannot be called from 16-bit code directly.
Under DOS, 32-bit code is typically run by using a DOS extender, which switches to 32-bit protected mode, sets up memory mappings and DOS API translation (i.e. small trampoline functions which switch back to 16-bit real mode when calling the int 21h DOS API), and then loads and calls the 32-bit code.
Lightweight alternatives to DOS extenders for switching between 16-bit and 32-bit mode:
unreal mode with gcc -m16 (.code16gcc). See this answer and other answers for more details about gcc -m16.
The bootloader of the Syslinux project, which contains 16-bit assembly (NASM), 32-bit assembly (NASM) and 32-bit C (GCC) code, and it switches between them.
To link 16-bit and 32-bit code together, you can run objcopy -O binary func.o func.bin on the 32-bit object, and then add incbin "func.bin" to your 16-bit NASM source file. However, this breaks relocations (so you won't be able to use global variables).

How do I use the GNU linker instead of the Darwin Linker?

I'm running OS X 10.12 and I'm developing a basic text-based operating system. I have developed a boot loader and that seems to run fine. My only problem is that when I attempt to compile my kernel into pure binary, the linker won't work. I have done some research and I think this is because OS X uses the Darwin linker and not the GNU linker. Because of this, I have downloaded and installed the GNU binutils. However, it still won't work...
Here is my kernel:
void main() {
    // Create pointer to a character and point it to the first cell of video
    // memory (i.e. the top-left)
    char* video_memory = (char*) 0xb8000;
    // At that address, put an x
    *video_memory = 'x';
}
And this is what happens when I attempt to compile it:
Hazims-MacBook-Pro:32 bit root# gcc -ffreestanding -c kernel.c -o kernel.o
Hazims-MacBook-Pro:32 bit root# ld -o kernel.bin -T text 0x1000 kernel.o --oformat binary
ld: unknown option: -T
Hazims-MacBook-Pro:32 bit root#
I would love to know how to solve this issue. Thank you for your time.
-T is a GNU ld option that the Darwin linker does not understand; the usual fix is to link through the compiler driver of a GNU cross-toolchain. Have a look at this:
With these components you can now actually build the final kernel. We use the compiler as the linker, as it gives us greater control over the link process. Note that if your kernel is written in C++, you should use the C++ compiler instead.
You can then link your kernel using:
i686-elf-gcc -T linker.ld -o myos.bin -ffreestanding -O2 -nostdlib boot.o kernel.o -lgcc
Note: Some tutorials suggest linking with i686-elf-ld rather than the compiler, however this prevents the compiler from performing various tasks during linking.
The file myos.bin is now your kernel (all other files are no longer needed). Note that we are linking against libgcc, which implements various runtime routines that your cross-compiler depends on. Leaving it out will give you problems in the future. If you did not build and install libgcc as part of your cross-compiler, you should go back now and build a cross-compiler with libgcc. The compiler depends on this library and will use it regardless of whether you provide it or not.
This is all taken directly from OSDev, which documents the entire process, including a bare-bones kernel, very clearly.
You're correct in that you probably want binutils for this, especially if you're coding bare-metal; while clang purports to be a cross-compiler as-is, it's far from optimal or even usable here, for various reasons. Since I infer you're developing on ARM, you want this:
https://developer.arm.com/open-source/gnu-toolchain/gnu-rm
Aside from the fact that gcc does this markedly better than clang, there's also the issue that ld does not build on OS X from the binutils package. In some configurations it fails silently, so you may never have actually installed it despite watching libiberty etc. build; sometimes it will even go through the motions of compiling the source for that target and then simply refuse to link it. To the fellow with the lousy tone blaming the OP: if you had relevant experience, i.e. had ever built this under these conditions, you would know how obnoxious that is. It would be nice if you'd refrain from discouraging people from asking legitimate questions.
In the CXXfilt package they mumble about apple-darwin not being a target; try changing FAKE_TARGET from mn10003000-whatever (or whatever they used) to apple-rhapsody some time.
You're still in far better shape just building them from current sources if you, say, need to strip relocations from something, or want to work on restoring static linkage to the system, which is missing by default from that clang installation as well. Anyhow, it's not really that ld couldn't work with Mach-O; codewise it's all there, in fact... of that I am sure.
Regarding locating things in memory, you may want to refer to a linker script:
http://svn.screwjackllc.com/?p=noid.git;a=blob_plain;f=new_mbed_bs.link_script.ld
I have some code in there that places things directly in memory; rather than doing that on the command line, it is more reproducible to go with the linker script. It's a little complex, but what it is doing is setting up a couple of regions of memory to be used by my memory allocators. You can use malloc, but you should prefer not to use actual malloc; dynamic memory is fine when it isn't dynamic... heh.
The script also sets markers for the stack and heap locations. They are just markers, not loaded until go time; the stack and heap actually get placed by the startup code, which is in assembly and rather readable and well commented (hard to believe, I know). Neat trick: you have some persistence in volatile memory, so I set aside a very tiny bit to flip, and you can do things like have it control which bootloader to run on the next power cycle.
Again, you are 100% correct regarding the linker; it seems you are headed in the right direction. Incidentally, there are a ton of other ways to modify objects prior to loading them and to preload things in memory: check out objcopy and objdump. You can use gdb to dump S-records of structures in memory, note the address, and then, before linking but after assembly, use dd to insert the records you extracted with gdb back into the extracted sections. It is one of my favorite ways, just because it is the smartass route :D Also, if you are ever tight on memory and need to precalculate constants, it's one way to optimize things. That is actually closer to what ld is doing, just done by hand. The path of least resistance now, though, is probably the linker script; a sketch of how C code can pick up symbols the script defines is below.
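For instance, here is a hedged sketch of how startup or allocator code can pick up region markers defined in such a linker script. The symbol names _heap_start and _heap_end are assumptions; they must match whatever markers your script actually defines.

/* heap_init.c - using linker-script symbols from C */
extern unsigned char _heap_start;   /* placed by the linker script */
extern unsigned char _heap_end;     /* placed by the linker script */

static unsigned char *heap_base;
static unsigned long heap_bytes;

void heap_init(void)
{
    /* the addresses of the symbols, not their contents, mark the region */
    heap_base  = &_heap_start;
    heap_bytes = (unsigned long)(&_heap_end - &_heap_start);
}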

Modular programming and compiling a C program in Linux

So I have been studying modular programming, where each file of the program is compiled separately. Say we have FILE.c and OTHER.c that both belong to the same program. To compile them, we do this at the prompt:
$gcc FILE.c OTHER.c -c
The -c flag compiles them into object files (FILE.o and OTHER.o), and only when that has happened do we translate (link) them into an executable using:
$gcc FILE.o OTHER.o -o PROGRAM
I know I can do it in one step and skip the intermediate part, but everywhere I look, people produce the object files first and only then build the executable, which I can't understand at all.
May I know why?
If you are working on a project with several modules, you don't want to recompile all modules if only some of them have been modified. The final linking command is, however, always needed. Build tools such as make are used to keep track of which modules need to be compiled or recompiled.
Doing it in two steps lets you separate the compiling and linking phases more clearly.
The output of the compiling step is object (.o) files that contain machine code but are missing the external references of each module (i.e. each .c file); for instance, file.c might use a function defined in other.c, but the compiler doesn't care about that dependency at that step.
The input of the linking step is the object files, and its output is the executable. The linking step binds the object files together by filling in the blanks (i.e. resolving dependencies between object files). That's also where you add libraries to your executable. A minimal two-file sketch is below.
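As a minimal illustration (the file names and the square function are made up), compiling file.c succeeds even though square's body lives elsewhere; only the link step resolves the call:

/* other.c */
int square(int x)
{
    return x * x;
}

/* file.c */
#include <stdio.h>

int square(int x);   /* declaration only; the definition lives in other.c */

int main(void)
{
    printf("%d\n", square(5));   /* call resolved by the linker, not the compiler */
    return 0;
}

gcc -c file.c other.c succeeds even though neither translation unit sees the other's definitions; gcc file.o other.o -o program then fills in the call. If you later edit only file.c, you recompile just file.o before relinking.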
This part of another answer responds to your question:
You might ask why there are separate compilation and linking steps. First, it's probably easier to implement things that way. The compiler does its thing, and the linker does its thing; by keeping the functions separate, the complexity of the program is reduced. Another (more obvious) advantage is that this allows the creation of large programs without having to redo the compilation step every time a file is changed. Instead, using so-called "conditional compilation", it is necessary to compile only those source files that have changed; for the rest, the object files are sufficient input for the linker. Finally, this makes it simple to implement libraries of pre-compiled code: just create object files and link them just like any other object file. (The fact that each file is compiled separately from information contained in other files, incidentally, is called the "separate compilation model".)
It was too long to put in a comment; please give credit to the original answer.

How to write programs to implement a shared memory page in C [duplicate]

I want to use the same library functions (i.e. the OpenSSL library) in two different C programs for computation. How can I make sure that both programs use a common library, meaning that only one copy of the library is loaded into shared main memory and both programs access the library from that memory location?
For example, when the 1st program accesses the library for computation, it is loaded into the cache from main memory; when the 2nd program wants to access it later, it should access the data from the cache (already loaded by the 1st program), not from main memory again.
I am using GCC under Linux. Any explanation or pointer will be highly appreciated.
Code gets shared by the operating system, not only the code of shared libraries but also that of multiple instances of the same executable; you don't have to do anything to get this feature. It is part of the system's memory management.
Data will not get shared between the two processes. You would need threads in one process to share data. But unless you want that, just make sure both programs use exactly the same shared library file (.so file). Normally you won't have to think about that; it only might be important if two programs use different versions of a library (they would not get shared of course).
Have a look at the output of ldd /path/to/binary to see which shared libraries are used by a binary.
Read Drepper's paper How to Write Shared Libraries and the Program Library HOWTO.
To make one, compile your code as position-independent code, e.g.
gcc -c -fPIC -O -Wall src1.c -o src1.pic.o
gcc -c -fPIC -O -Wall src2.c -o src2.pic.o
then link it into a shared object
gcc -shared src1.pic.o src2.pic.o -lsome -o libfoo.so
You may link one shared library (here -lsome) into another one (libfoo.so). A hedged usage sketch follows.
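Here foo is a hypothetical function, assumed for illustration to be exported by the libfoo.so built above:

/* main.c - links against the libfoo.so built above */
extern int foo(int);   /* hypothetical symbol exported by libfoo.so */

int main(void)
{
    return foo(42);
}

Build it with gcc main.c -L. -lfoo -o main, and run it with LD_LIBRARY_PATH=. ./main so the dynamic linker can find libfoo.so.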
Internally, the dynamic linker ld-linux.so(8) uses mmap(2) (and does some relocation at dynamic link time), and what matters are inodes. The kernel uses the file system cache to avoid reading a shared library used by different processes twice. See also linuxatemyram.com
Use e.g. ldd(1), pmap(1) and proc(5). See also dlopen(3). Try
cat /proc/self/maps
to understand the virtual address space used by the process running that cat command; not everything in an ELF shared library is shared between processes, only some segments, including the text segment. A complementary sketch using dlopen(3) follows.
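As a hedged sketch, the same library can also be loaded explicitly at run time with dlopen(3); the path ./libfoo.so and the symbol name foo are assumptions carried over from the example above:

/* loadfoo.c - explicit run-time loading; build with: gcc loadfoo.c -ldl -o loadfoo */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *handle = dlopen("./libfoo.so", RTLD_NOW);  /* path is an assumption */
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    int (*foo)(int) = (int (*)(int))dlsym(handle, "foo");  /* symbol name assumed */
    if (foo)
        printf("foo(42) = %d\n", foo(42));
    dlclose(handle);
    return 0;
}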
