Linking object files of differing types - linker

I am trying to link object files that were originally created by two different assemblers. We have some legacy assembly code that was compiled into object files using an old MRI assembler for the 68332 processor. We are developing a new application with GNU Binutils m68k v2.24, and we would like to use the original object files as built by the old assembler, without change. I have configured our build environment to do this. For historic reasons, our build environment links into three output formats: S-record, IEEE, and ELF. The build succeeds without error for the S-record and IEEE formats. However, for the ELF output format, I receive the following error:
m68k-elf-ld: failed to merge target specific data of file
As a result, the ELF file is not created.
My first step was to understand what this error message means, but I was not able to find it documented. If anyone knows the GNU Binutils ld documentation well enough to point me to where this error is defined, I would appreciate it.
I have actually loaded our target and run the S-record output. It passes many of the same tests as before, so it appears to be running to some degree.
It looks like our legacy object files may be in COFF format. I would guess that this is the problem. Is there any way to convert a COFF file to ELF format?
Thanks in advance for any support.

It looks like our legacy object files may be in COFF format. I would guess that this is the problem. Is there any way to convert a COFF file to ELF format?
objcopy can be used to convert between formats. However, to do this it has to have been configured to understand both formats. You can check what formats it accepts with objcopy --info (a shortened list appears at the end of objcopy --help).
If your objcopy doesn't support the required formats, then you'll have to build binutils yourself.
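With a binutils build that understands both formats, the conversion is a single objcopy invocation. This is only a sketch: the BFD target names coff-m68k and elf32-m68k are assumptions you should verify against the output of objcopy --info, and legacy.o is a placeholder for your object file.
$ m68k-elf-objcopy -I coff-m68k -O elf32-m68k legacy.o legacy-elf.o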

Related

Rename symbols in a WebAssembly binary file

I need a way to rename certain symbols in a WebAssembly binary archive file that were compiled from C files by emscripten.
When using gcc I can use the objcopy --redefine-sym command, but that gives me objcopy: libname.bc: file format not recognized
I also tried llvm-objcopy, but that gave me llvm-objcopy: error: unsupported object file format
Running llvm-nm did work on it however.
Running file gives libname.bc: WebAssembly (wasm) binary module version 0x1 (MVP)
tl;dr: I'm not sure there is any easy way to do this today.
Renaming in source code and recompiling is the only way I can think of doing this, and you probably have some reason why you can't do that?
Support for WebAssembly in llvm-objcopy is only partial, and was only added recently: https://reviews.llvm.org/D70970. So some parts of llvm-objcopy may work with wasm, but you would need LLVM 11.
However I don't believe --redefine-sym is implemented yet, even on tip of tree.
If this were a normal WebAssembly binary you could just convert it to wat, edit it, and convert it back, but sadly wasm object files have extra custom sections that do not survive the round trip.
You have to change the names in the export section:
https://webassembly.github.io/spec/core/binary/modules.html#binary-exportsec
but if binary editing is too hard, then translate your wasm into wat with the wabt tools; you can then make the change with a text editor and convert the wat back into wasm, as sketched below
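A sketch of that round trip with the wabt tools (module.wasm, old_name, and new_name are placeholders, and as noted above the custom sections of wasm object files will not survive this):
$ wasm2wat module.wasm -o module.wat
$ sed -i 's/(export "old_name"/(export "new_name"/' module.wat
$ wat2wasm module.wat -o module.wasm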

How to check if object code is 16/32-bit?

Is there any way to identify whether a .obj file or an .exe file is 16-bit or 32-bit?
Basically I want to create a smart linker that will automatically identify which linker the given files need to be passed to.
Preferred Language: C (it can be different, if needed)
I am looking for some solution that can read the bytes of an .exe/the code of an .obj file and then determine whether it is 16-bit or 32-bit. Even an algorithm would do.
Note: I know that object code and an executable are two different entities.
All of this information is encoded in the binary object according to the relevant Application Binary Interface (ABI).
The current Linux ABI is the Executable and Linkable Format (ELF), and you can query a specific binary file using a tool such as readelf or objdump.
The current Windows ABI is the Portable Executable (PE) format. I'm not familiar with the toolset here but a quick google search suggests there are programs that function the same as readelf:
http://www.pe-explorer.com/peexplorer-tour.htm
Here's the Microsoft specification of the PE format:
https://learn.microsoft.com/en-us/windows/win32/debug/pe-format
However, neither of those formats supports 16-bit binaries anymore. The older ABI format on Linux is called "a.out", which can be read and queried with objdump (I'm not sure about readelf). The older Windows/DOS formats are called MZ and NE. Again, I'm not familiar with the tool support for these older Windows formats.
Wikipedia has a pretty comprehensive list of all the popular executable file formats that have been used, with links to more info:
https://en.wikipedia.org/wiki/Comparison_of_executable_file_formats
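As a rough sketch in C of the header sniffing described above (the magic numbers and offsets come from the published ELF and PE/MZ/NE specifications; error handling is minimal and only the common cases are distinguished):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    unsigned char buf[4096];
    size_t n;
    FILE *f;

    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    f = fopen(argv[1], "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    if (n >= 5 && memcmp(buf, "\x7f" "ELF", 4) == 0) {
        /* e_ident[EI_CLASS]: 1 = 32-bit, 2 = 64-bit */
        printf("ELF, %s\n", buf[4] == 1 ? "32-bit" : "64-bit");
    } else if (n >= 0x40 && buf[0] == 'M' && buf[1] == 'Z') {
        /* e_lfanew at offset 0x3c points at the NE/PE header, if any */
        uint32_t off = buf[0x3c] | (uint32_t)buf[0x3d] << 8 |
                       (uint32_t)buf[0x3e] << 16 | (uint32_t)buf[0x3f] << 24;
        if (off <= n - 26 && memcmp(buf + off, "PE\0\0", 4) == 0) {
            /* optional header magic: 0x10b = PE32, 0x20b = PE32+ */
            unsigned magic = buf[off + 24] | (unsigned)buf[off + 25] << 8;
            printf("PE, %s\n", magic == 0x20b ? "64-bit (PE32+)" : "32-bit (PE32)");
        } else if (off <= n - 2 && buf[off] == 'N' && buf[off + 1] == 'E') {
            printf("NE, 16-bit Windows\n");
        } else {
            printf("MZ, 16-bit DOS\n");
        }
    } else {
        printf("unrecognized format\n");
    }
    return 0;
}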

How do I compile C code to a raw os-less binary?

Considering that C is a systems programming language, how can I compile C code into raw x86 machine code that could be invoked without the presence of an operating system? (I.e., you can assume I have a boot sector that loads the raw machine code from disk into memory and then jumps directly to the first instruction.)
And now, for bonus points: Ideally, I'd like to compile using Visual Studio 2010's compiler because I've already got it. Failing that, what's the best way to accomplish the task, without having to install a bunch of dependencies or having to make large sweeping configuration changes across my entire system? I'd be compiling on Windows 7.
Usually, you don't. Instead, you compile your code normally, and then (either with the linker or some other tool) extract a raw binary from the object file.
For example, on Linux, you can use the objcopy tool to copy an object file to a raw binary file.
$ objcopy -O binary object.elf object.binary
First off, you don't use any libraries that require a system call (printf, fopen, read, etc.). Then you compile the C files normally. The major difference is the linker step: if you are used to letting the C compiler call the linker (or letting some GUI do it), you will likely need to take that over manually in some form. The specific solution depends on your tools. You will need some bootstrap code (the small amount of assembly needed to satisfy the assumptions C compilers and programmers make, and to launch the entry point of your C program), and a linker script or the right command-line options for the linker to control the address space of the binary as well as to link the objects together. Then, depending on the output format of the linker, you might have to convert the result to some other binary format (Intel hex, srec, exe, com, coff, elf, raw binary, etc.) to be compatible with wherever it is going to be loaded or run.
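As a concrete sketch with the GNU toolchain rather than Visual Studio (kernel.c, the main entry point, and the 0x7E00 load address are all placeholders you would adapt to your boot sector):
$ gcc -m32 -ffreestanding -fno-pic -c kernel.c -o kernel.o
$ ld -m elf_i386 -Ttext 0x7E00 --oformat binary -e main kernel.o -o kernel.bin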

Compiling object file from an intermediate file of gcc

By using the -fdump-tree-* flags, one can dump intermediate representations during compilation of a source file. My question is whether one can use such an intermediate file as input to gcc to get the final object file.
I'm asking this because I want to add some code to the intermediate file in the GIMPLE format (obtained by using the flag -fdump-tree-gimple). Sure, I could use hooks and add my own pass, but I don't want to get into that level of complexity yet. I just want to give gcc my modified intermediate file so it can start its compilation from there and give me the final object file. Any ideas how to achieve this?
GIMPLE is a binary internal format that is hard to dump fully and reload back correctly. By comparison, LLVM IR was designed to be dumped to and reloaded from ordinary files (the text and binary forms are fully convertible to each other). You can run the Clang frontend to emit LLVM IR, then run the opt program with one set of optimizations, then another, with LLVM IR bitcode files between the phases. You can then start code generation from the IR bitcode into native code (even, in theory, for a different platform; see the PNaCl project).
There have been some projects for dumping/reloading GCC's internal representation. I know of one created to integrate gcc with a commercial compiler tool. The author couldn't just link the commercial code with gcc, because gcc is viral (it will infect any linked code with the GPL). So the author wrote a GPL dumper/loader of GIMPLE to an external (XML) format; the proprietary tool read and translated this XML into other XML of the same format, which was then reloaded back with the GPL tool.
In newer gcc you have the option of writing a plugin, which is also viral (23.2.1) in terms of the GPL. A plugin operates on the in-memory representation of the program, so there is no problem of dumping/reloading GIMPLE via an external file.
There are some plugins which may be configured or driven by a user-supplied program, e.g. MELT (Lisp) and GCC Python (Python); a partial list of gcc plugins is available online.
There's no built-in facility to translate the text GIMPLE representation back into GCC's internal GIMPLE representation.
You'll need to use a custom front end (such as the suggested GIMPLE FE) to make sense of the dumped GIMPLE.
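For reference, the dump is produced as shown below, and GCC 7 and later do ship a limited GIMPLE front end that can parse functions marked __GIMPLE when compiled with -fgimple (a sketch; the numeric suffix of the dump file name varies between GCC versions):
$ gcc -c -fdump-tree-gimple foo.c    # writes a dump such as foo.c.004t.gimple
$ gcc -c -fgimple foo_gimple.c       # GCC 7+: parse __GIMPLE functions back in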

Instruct GDB 6.5 to use source embedded in object file

I've been trying to make GNU gdb 6.5-14 use the source code embedded in the object file when debugging, instead of scanning some directories for it.
The main reason is that I develop for an embedded platform and I cross-compile, which means that all the source is on my computer.
I read about the -ggdb3 flag, that includes a lot of extra info, including the source code. So I started compiling with that flag.
Running objdump -S src/lib/libfoo.so indeed prints out all the source code intermixed with the assembly, so I'm guessing that the file does contain that info.
The only thing is that GDB does not print it unless I run from an NFS-mounted version of my workspace that contains the source.
Does anyone know how I can instruct gdb to look in the object file for the source instead of relying on external files?
Employed Russian is correct -- gcc never embeds source code in object files.
What it does do (with any -g setting) is add paths to where the source file can be found.
GDB can use these paths to find the source files. And if you happen to set up the exact same paths on your embedded file system as the paths where you keep your source code on the host system, you can trick gdb into finding them.
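If the paths do not match, you can also point gdb at the sources explicitly. A sketch (/host/src and /build/workspace are placeholders; note that set substitute-path may require a gdb newer than 6.5):
(gdb) directory /host/src/lib
(gdb) set substitute-path /build/workspace /host/src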
Your guess about what -ggdb3 does is totally incorrect; the object files do not contain the source. You can prove that by running 'strings -a libfoo.so'.
Your best bet is to learn how to use remote debugging -- you can then use GDB from the host (which has access to all the sources), with the added advantage that you need much less memory on the target. See gdbserver in "info gdb".
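A minimal remote session might look like this (a sketch; ./app, the port, and the target address are placeholders):
$ gdbserver :2345 ./app          # on the target
$ gdb ./app                      # on the host, where the sources live
(gdb) target remote 192.168.1.10:2345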
