How to generate a C function directly to executable machine code? - c

My file is bootpack.c and it has a function void f() { while(1); } in it.
I want to compile it directly to executable machine code, so I build it like this:
gcc -c -nostdinc -fno-builtin bootpack.c
ld -nostdlib file.o -o bootpack.bin
But I find that bootpack.bin is 3.84KB. I expected only a few bytes, because it is just a loop. What is wrong, and how do I generate this file correctly?

You can use binary as output format for the GNU (BFD-based) linker:
ld -nostdlib file.o --oformat=binary -o bootpack.bin
You can then disassemble that with:
objdump -b binary -m i386 -D bootpack.bin
(substitute your target architecture in place of i386).
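For a sense of what the disassembly might show: on x86, an infinite loop can be as small as the two bytes EB FE, a short jump whose signed 8-bit displacement points back at the jump itself. The sketch below decodes that displacement. The function name short_jmp_target is made up for illustration, and an unoptimized GCC build will likely emit a function prologue before the jump, so the real file may be a few bytes longer.

```c
#include <stddef.h>
#include <stdint.h>

/* Given the offset of a 2-byte x86 short jump (opcode 0xEB followed by a
 * signed 8-bit displacement) inside a flat image, return the offset it
 * jumps to. The displacement is relative to the end of the instruction.
 * Returns -1 if there is no short jump at that offset. */
long short_jmp_target(const uint8_t *image, size_t len, size_t off)
{
    if (len < 2 || off > len - 2 || image[off] != 0xEB)
        return -1;
    return (long)off + 2 + (int8_t)image[off + 1];
}
```

With the image { 0xEB, 0xFE }, the jump at offset 0 has displacement -2 and so targets offset 0 again, which is exactly while(1);.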

The file is larger because it contains symbol table information. To reduce the size of the executable, you can use the strip command.
Run it as "strip --strip-all executable-file-name" to remove extra information such as the symbol table. The gcc option -s can also be used, and gcc has more options that can help.

Related

CMake compile NASM and C and link everything together

I'm trying to compile assembly files with NASM and C files with GCC and link all object files together. Moreover, I'd like the C preprocessor to process the assembly files as well. This is normally no problem from the command line or a simple makefile, but I've had some trouble in replicating this functionality in CMake.
The exact process, assuming three files (boot.S, kernel.c, link.ld) would look something like this:
gcc -E -P boot.S -D <...> -o boot.s
nasm -f elf32 boot.s -o boot.o
gcc -c kernel.c -o kernel.o -ffreestanding -O2 -Wall -Wextra
Now it's time to link. I want to do it like this (maybe with a few extra flags):
gcc -T link.ld -o out.bin -ffreestanding -O2 -nostdlib boot.o kernel.o -lgcc
The problems with CMake are the following:
CMake support for NASM is weird at best. When adding .S files as sources to targets they don't get recognized as assembly files and I get hit with 'cannot determine linker language for target'. I have tried adding 's S' to CMAKE_ASM_NASM_SOURCE_FILE_EXTENSIONS but it still doesn't work unless I manually set the languages with set_source_files_properties(). Moreover, as is pointed out here, CMAKE_ASM_NASM_LINK_EXECUTABLE is broken.
As far as I understand, after compiling source files to objects, CMake attempts to link them automatically. Which linker will it use to link all .o files? Will it use the linker for C? Will it use the linker for NASM? The answer is relevant, because I need to configure it with the flags I mentioned above.
What would an example CMakeLists.txt look like that replicates the process described above? Also, do I need an add_custom_command() in order to invoke just the preprocessor? Thank you.

How to disassemble .elf file to .asm file in riscv

I have generated a .elf file by using
riscv64-unknown-elf-gcc -march=rv64imac -mabi=lp64 -Tlinker.ld *.o add.o -o add.elf -static -nostartfiles -lm -lgcc
And now I want to see the stack to check the values assigned to variables used in my add.c. I believe the same can be obtained from a .dasm/.asm file. How can I generate a .asm/.dasm file from an .elf file?
Just as an extension to dratenik's answer.
I am using riscv32-unknown-elf-objdump --disassemble-all NAME.elf > NAME.disasm
This way you don't even have to go through the -S option and can just disassemble your .elf file.
Again, as dratenik noted, you need to adjust the objdump prefix to match your toolchain, i.e. your compiler prefix.
You can stop gcc at the assembly stage by adding the -S switch, the file output by -o will then be an asm source file. Or you can let gcc finish and then take the resulting binary apart with objdump -d. Of course you need to run the objdump binary from the same toolchain, not your system one.

How do I produce plain binary from object files?

How should I produce raw binary file from two object (.o) files?
I want the plain binary format produced by nasm -f bin when compiling a .asm file, but for .o files.
By a plain binary, I mean a file which contains only the instructions, not some extra information, as many executable files contain a lot of extra helpful information.
See http://www.nasm.us/doc/nasmdoc7.html for information on that.
PS: I want to make a "plain binary" to start in QEMU.
This brings back memories. I'm sure there is a better way to do this with linker scripts, but this is how I did it when I was young and stupid:
# compile some files
gcc -c -nostdlib -nostartfiles -nodefaultlibs -fno-builtin kernel.c -o kernel.o
gcc -c -nostdlib -nostartfiles -nodefaultlibs -fno-builtin io.c -o io.o
# link files and place code at known address so we can jump there from asm
ld -Ttext 0x100000 kernel.o io.o -o kernel.out
# get a flat binary
objcopy -S -O binary kernel.out kernel.bin
The file kernel.c started with
__asm__("call _kmain");
__asm__("ret");
void kmain(void) { ... }
The fun part is writing the loader in assembler.
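One way to confirm that the objcopy step really produced a flat binary, rather than another ELF file with headers and section tables, is to look at the first four bytes: every ELF file starts with 0x7F 'E' 'L' 'F'. The check below is a minimal sketch. The name looks_like_elf is hypothetical, and it assumes the first bytes of the file have already been read into a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the buffer begins with the ELF magic number
 * (0x7F 'E' 'L' 'F'), 0 otherwise. A flat binary made with
 * "objcopy -O binary" starts directly with code or data instead. */
int looks_like_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };

    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

Reading the first four bytes of kernel.out and of kernel.bin with fread and passing each to this function should give 1 for the former and 0 for the latter, assuming the toolchain's default output format is ELF.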
ld --oformat binary is a more direct option:
ld --oformat binary -o main.img -Ttext 0x7C00 main.o
The downside of this method is that I don't think it is possible to reuse the symbols to debug, as we'd want something like:
qemu-system-i386 -hda main.img -S -s &
gdb main.elf -ex 'target remote localhost:1234'
So in that case you should stick to objcopy. See also: https://stackoverflow.com/a/32960272/895245
Also make sure that you use your own clean linker script: https://stackoverflow.com/a/32594933/895245
Repository with working examples for some common cases:
boot sectors
multiboot interfacing with C
Similar question: How to generate plain binaries like nasm -f bin with the GNU GAS assembler?

Generating a.out file format with GCC

How do I generate the a.out file format with GCC on x86 architectures?
With NASM I can do this easily with the -f flag, for example:
nasm -f aout start.asm
objdump -a start.o
start.o: file format a.out-i386-linux
start.o
On Linux, compiling .c files produces an ELF object file. How can I produce a.out files with GCC?
To generate the a.out format with gcc, your linker needs to be told to do so. You can do it by passing it flags from gcc thanks to the -Wl flag.
Here is what you would do for the a.out format:
gcc -Wl,--oformat=a.out-i386-linux file.c -o file.out
You can also display all formats supported by typing:
objdump -i
According to the post Re: How can I control the gcc's output format?, you need to build gcc for a different target (i386-aout).
It sounds plausible as a.out has been deprecated for years (10+).
There are two answers to this question. One is that you'll need to compile a fresh GCC with aout as its target; it's not as simple as flipping a command-line switch. The other answer is a question: why do you actually need this? I can't immediately think of a valid reason.

Linking a C program directly with ld fails with undefined reference to `__libc_csu_fini`

I'm trying to compile a C program under Linux. However, out of curiosity, I'm trying to execute the steps by hand. I use:
the gcc frontend to produce assembler code
then run the GNU assembler to get an object file
and then link it with the C runtime to get a working executable.
Now I'm stuck with the linking part.
The program is a very basic "Hello world":
#include <stdio.h>
int main() {
printf("Hello\n");
return 0;
}
I use the following command to produce the assembly code:
gcc hello.c -S -masm=intel
I'm telling gcc to quit after compiling and dump the assembly code with Intel syntax.
Then I use the GNU assembler to produce the object file:
as -o hello.o hello.s
Then I try using ld to produce the final executable:
ld hello.o /usr/lib/libc.so /usr/lib/crt1.o -o hello
But I keep getting the following error message:
/usr/lib/crt1.o: In function `_start':
(.text+0xc): undefined reference to `__libc_csu_fini'
/usr/lib/crt1.o: In function `_start':
(.text+0x11): undefined reference to `__libc_csu_init'
The symbols __libc_csu_fini/init seem to be a part of glibc, but I can't find them anywhere! I tried linking against libc statically (against /usr/lib/libc.a) with the same result.
What could the problem be?
/usr/lib/libc.so is a linker script which tells the linker to pull in the shared library /lib/libc.so.6, and a non-shared portion, /usr/lib/libc_nonshared.a.
__libc_csu_init and __libc_csu_fini come from /usr/lib/libc_nonshared.a. They're not being found because references to symbols in non-shared libraries need to appear before the archive that defines them on the linker line. In your case, /usr/lib/crt1.o (which references them) appears after /usr/lib/libc.so (which pulls them in), so it doesn't work.
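The ordering rule can be illustrated with a toy model in C. This is a sketch of the rule as described above, not of ld's actual implementation: inputs are visited once, left to right, an object file is always loaded, and an archive is used only if it defines a symbol that is already undefined at that point. The struct and the function count_unresolved are invented for this example, and each input is simplified to define and reference at most one symbol.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

struct input {
    const char *name;     /* file name, for readability only */
    int is_archive;       /* 0 = object file, 1 = static archive */
    const char *defines;  /* symbol this input defines, or NULL */
    const char *needs;    /* symbol this input references, or NULL */
};

/* Return the index of sym in set, or -1 if it is not present. */
static int find(const char *const *set, int n, const char *sym)
{
    for (int i = 0; i < n; i++)
        if (set[i] != NULL && strcmp(set[i], sym) == 0)
            return i;
    return -1;
}

/* Count the symbols still undefined after one left-to-right pass. */
int count_unresolved(const struct input *in, size_t n)
{
    const char *defined[MAX_SYMS];
    const char *undefined[MAX_SYMS];
    int nd = 0, nu = 0, unresolved = 0;

    for (size_t i = 0; i < n && nd < MAX_SYMS && nu < MAX_SYMS; i++) {
        int wanted = in[i].defines ? find(undefined, nu, in[i].defines) : -1;

        if (in[i].is_archive && wanted < 0)
            continue;  /* nothing needs this archive yet, so it is skipped */

        if (in[i].defines) {
            if (wanted >= 0)
                undefined[wanted] = NULL;  /* pending reference satisfied */
            defined[nd++] = in[i].defines;
        }
        if (in[i].needs && find(defined, nd, in[i].needs) < 0
                        && find(undefined, nu, in[i].needs) < 0)
            undefined[nu++] = in[i].needs;
    }
    for (int i = 0; i < nu; i++)
        if (undefined[i] != NULL)
            unresolved++;
    return unresolved;
}
```

In this model, listing libc_nonshared.a (defining __libc_csu_fini) before crt1.o (referencing it) leaves one symbol unresolved, while the reverse order leaves none.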
Fixing the order on the link line will get you a bit further, but then you'll probably get a new problem, where __libc_csu_init and __libc_csu_fini (which are now found) can't find _init and _fini. In order to call C library functions, you should also link /usr/lib/crti.o (after crt1.o but before the C library) and /usr/lib/crtn.o (after the C library), which contain initialisation and finalisation code.
Adding those should give you a successfully linked executable. It still won't work, because it uses the dynamically linked C library without specifying what the dynamic linker is. You'll need to tell the linker that as well, with something like -dynamic-linker /lib/ld-linux.so.2 (for 32-bit x86 at least; the name of the standard dynamic linker varies across platforms).
If you do all that (essentially as per Rob's answer), you'll get something that works in simple cases. But you may come across further problems with more complex code, as GCC provides some of its own library routines which may be needed if your code uses certain features. These will be buried somewhere deep inside the GCC installation directories...
You can see what gcc is doing by running it with either the -v option (which will show you the commands it invokes as it runs), or the -### option (which just prints the commands it would run, with all of the arguments quoted, but doesn't actually run anything). The output will be confusing unless you know that it usually invokes ld indirectly via one of its own components, collect2 (which is used to glue in C++ constructor calls at the right point).
I found another post which contained a clue: -dynamic-linker /lib/ld-linux.so.2.
Try this:
$ gcc hello.c -S -masm=intel
$ as -o hello.o hello.s
$ ld -o hello -dynamic-linker /lib/ld-linux.so.2 /usr/lib/crt1.o /usr/lib/crti.o hello.o -lc /usr/lib/crtn.o
$ ./hello
Hello
$
Assuming that a normal invocation of gcc -o hello hello.c produces a working build, run this command:
gcc --verbose -o hello hello.c
and gcc will tell you how it's linking things. That should give you a good idea of everything that you might need to account for in your link step.
In Ubuntu 14.04 (GCC 4.8), the minimal linking command is:
ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
/usr/lib/x86_64-linux-gnu/crt1.o \
/usr/lib/x86_64-linux-gnu/crti.o \
-L/usr/lib/gcc/x86_64-linux-gnu/4.8/ \
-lc -lgcc -lgcc_s \
hello.o \
/usr/lib/x86_64-linux-gnu/crtn.o
Although they may not be necessary, you should also link to -lgcc and -lgcc_s, since GCC may emit calls to functions present in those libraries for operations which your hardware does not implement natively, e.g. long long int operations on 32-bit. See also: Do I really need libgcc?
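As one illustration of such an operation: a 64-bit division like the one below has no single-instruction form on 32-bit x86, and GCC typically compiles it into a call to the libgcc helper __udivdi3 when targeting that architecture (for example with -m32). The function name div64 is only for illustration. On x86-64 the same code compiles to an ordinary div instruction and needs no helper.

```c
#include <stdint.h>

/* 64-bit unsigned division. On a 32-bit x86 target GCC usually lowers
 * this to a call to __udivdi3, which lives in libgcc, so linking by hand
 * without -lgcc can fail with an undefined reference to that symbol. */
uint64_t div64(uint64_t numerator, uint64_t denominator)
{
    return numerator / denominator;
}
```

Compiling this with gcc -m32 -S and searching the output for __udivdi3 is a quick way to see whether your own code depends on libgcc.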
I had to add:
-L/usr/lib/gcc/x86_64-linux-gnu/4.8/ \
because the default linker script does not include that directory, and that is where libgcc.a was located.
As mentioned by Michael Burr, you can find the paths with gcc -v. More precisely, you need:
gcc -v hello_world.c |& grep 'collect2' | tr ' ' '\n'
This is how I fixed it on Ubuntu 11.10:
apt-get remove libc-dev
Say yes to remove all the packages but copy the list to reinstall after.
apt-get install libc-dev
If you're running a 64-bit OS, your glibc(-devel) may be broken. By looking at this and this you can find these 3 possible solutions:
add lib64 to LD_LIBRARY_PATH
use lc_noshared
reinstall glibc-devel
Since you are doing the link process by hand, you are forgetting to link the C run time initializer, or whatever it is called.
To avoid getting into the specifics of where and what you should link for your platform, after getting your Intel asm file, use gcc to generate (compile and link) your executable.
Simply doing gcc hello.c -o hello should work.
Take it:
$ echo 'main(){puts("ok");}' > hello.c
$ gcc -c hello.c -o hello.o
$ ld hello.o -o hello.exe /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/crtn.o \
-dynamic-linker /lib/ld-linux.so.2 -lc
$ ./hello.exe
ok
The path /usr/lib/crt*.o applies when glibc is configured with --prefix=/usr.

Resources