Is there any way to identify whether a .obj or .exe file is 16-bit or 32-bit?
Basically, I want to create a smart linker that will automatically identify which linker the given files need to be passed to.
Preferred Language: C (it can be different, if needed)
I am looking for some solution that can read the bytes of an .exe or .obj file and then determine whether it's 16- or 32-bit. Even an algorithm would do.
Note: I know that object code and an executable are two different entities.
All of this information is encoded in the binary object according to the relevant Application Binary Interface (ABI).
The current Linux ABI is the Executable and Linkable Format (ELF), and you can query a specific binary file using a tool such as readelf or objdump.
The current Windows ABI is the Portable Executable (PE) format. I'm not familiar with the toolset here, but a quick Google search suggests there are programs that function the same as readelf:
http://www.pe-explorer.com/peexplorer-tour.htm
Here's the Microsoft specification of the PE format:
https://learn.microsoft.com/en-us/windows/win32/debug/pe-format
However, neither of those formats supports 16-bit binaries anymore. The older format on Linux is called "a.out", which can be read and queried with objdump (I'm not sure about readelf). The older Windows/DOS formats are called MZ and NE. Again, I'm not familiar with the tool support for these older Windows formats.
Wikipedia has a pretty comprehensive list of all the popular executable file formats that have been used, with links to more info:
https://en.wikipedia.org/wiki/Comparison_of_executable_file_formats
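For the smart-linker use case in the question, the relevant magic numbers are small enough to sniff directly from the first bytes of the file. Here is a minimal C sketch (my own, not a full parser): it uses the ELF EI_CLASS byte, the PE optional-header magic (0x10b for PE32, 0x20b for PE32+), and treats an MZ file without a PE signature as a 16-bit DOS/NE image. It assumes a little-endian host and skips most of the validation a real linker would do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 16, 32, or 64 for recognized images, -1 otherwise.
   'buf' holds the first bytes of the file, 'len' how many were read. */
int image_bitness(const unsigned char *buf, size_t len)
{
    /* ELF: magic \x7fELF, then EI_CLASS (1 = 32-bit, 2 = 64-bit) */
    if (len >= 5 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return buf[4] == 1 ? 32 : buf[4] == 2 ? 64 : -1;

    /* MZ-based images (DOS, NE, PE) start with "MZ" */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;

    uint32_t e_lfanew;                 /* offset of the PE header, stored at 0x3C */
    memcpy(&e_lfanew, buf + 0x3c, 4);

    /* No valid PE signature: plain DOS MZ or 16-bit NE image */
    if (len < e_lfanew + 0x18 + 2 || memcmp(buf + e_lfanew, "PE\0\0", 4) != 0)
        return 16;

    uint16_t magic;                    /* optional-header magic follows the
                                          4-byte signature + 20-byte COFF header */
    memcpy(&magic, buf + e_lfanew + 0x18, 2);
    if (magic == 0x10b) return 32;     /* PE32 */
    if (magic == 0x20b) return 64;     /* PE32+ */
    return -1;
}
```

A linker driver could then dispatch to the 16-bit or 32-bit linker based on the return value.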
Related
I am trying to link object files which had originally been created by two different assemblers. We have some legacy assembly code that was compiled into object files using an old MRI assembler for the 68332 processor. We are developing a new application with the GNU Binutils m68k v2.24. We would like to use the original object files as built by the old assembler without change. I have configured our build environment to do this. For historic reasons, our build environment links into three output formats: Srecord, ieee, and ELF. When I run this, it succeeds without error for the Srecord and ieee formats. However, for the ELF output format, I receive the following errors:
m68k-elf-ld: failed to merge target specific data of file
As a result, the ELF file is not created.
I am first trying to understand what this error message means, but I have not been able to. If anyone knows the GNU Binutils ld documentation well enough to point me to where this error is defined, I would appreciate it.
I have actually loaded our target and run the Srecord output. It seems to pass many tests the same as before so it appears that it is running to some degree.
It looks like our legacy object files may be in COFF format. I would guess that this is the problem. Is there any way to convert a COFF file to ELF format?
Thanks in advance for any support.
It looks like our legacy object files may be in COFF format. I would guess that this is the problem. Is there any way to convert a COFF file to ELF format?
objcopy can be used to convert between formats. However, to do this it has to have been configured to understand both formats. You can check what formats it accepts with objcopy --info (a shortened list appears at the end of objcopy --help).
If your objcopy doesn't support the required formats, then you'll have to build binutils yourself.
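As a sketch of that check-then-convert workflow (the m68k format names below are assumptions; use the exact names printed by objcopy --info):

```shell
# List the object formats this objcopy build was configured for
objcopy --info | head -n 20

# If both formats appear in the list, the conversion looks like this
# (commented out because the legacy m68k object file here is hypothetical):
# objcopy -I coff-m68k -O elf32-m68k legacy.o legacy-elf.o
```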
Considering that C is a systems programming language, how can I compile C code into raw x86 machine code that could be invoked without the presence of an operating system? (i.e., you can assume I have a boot sector that loads the raw machine code from disk into memory and then jumps directly to the first instruction.)
And now, for bonus points: Ideally, I'd like to compile using Visual Studio 2010's compiler because I've already got it. Failing that, what's the best way to accomplish the task, without having to install a bunch of dependencies or having to make large sweeping configuration changes across my entire system? I'd be compiling on Windows 7.
Usually, you don't. Instead, you compile your code normally, and then (either with the linker or some other tool) extract a raw binary from the object file.
For example, on Linux, you can use the objcopy tool to copy an object file to a raw binary file.
$ objcopy -O binary object.elf object.binary
First off, you don't use any libraries that require a system call (printf, fopen, read, etc.). Then you compile the C files normally. The major difference is the linker step: if you are used to letting the C compiler call the linker (or letting some GUI do it), you will likely need to take over that step manually in some form. The specific solution depends on your tools. You will need some bootstrap code (the small amount of assembly that covers the assumptions of C compilers and programmers and launches the entry point of your C program), and a linker script or the right command-line options for the linker to control the address space of the binary as well as to link the objects together. Then, depending on the output format of the linker, you might have to convert the result to some other binary format (Intel hex, S-record, exe, com, COFF, ELF, raw binary, etc.) to be compatible with wherever it is going to be loaded or run.
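As a concrete illustration of the linker-script step, here is a minimal GNU ld script sketch; the load address 0x7C00 (where a PC BIOS places a boot sector) and the _start entry name are assumptions to adapt to your own bootstrap code:

```
/* boot.ld -- minimal sketch; adjust the address to wherever your
   boot sector loads the program */
ENTRY(_start)
SECTIONS
{
    . = 0x7C00;
    .text   : { *(.text*) }
    .rodata : { *(.rodata*) }
    .data   : { *(.data*) }
    .bss    : { *(.bss*) *(COMMON) }
}
```

You would then link with ld -T boot.ld and extract the raw image with objcopy -O binary, as shown earlier.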
I was wondering if there was a way to set the compiler to compile my code into a .bin file which only has the 1s and 0s, no hex code as in a .exe file. I want the code to run on the processor, not the operating system. Are there any settings for that in the Express edition? Thanks in advance.
There is nothing magic about a ".bin" file. The extension generally just indicates a binary file, but all files are binary. So you can create a ".bin" file by renaming the ".exe" file that your linker generates to ".bin".
I presume you won't be satisfied with that, so I'll elaborate a little further. The ".exe" file extension (at least on Windows, which I'll assume since you've added a Visual Studio-related tag) implies a binary file with a special format—a Portable Executable, or PE for short. This is the standard form of binary file used on Windows operating systems, both for executables and DLLs.
So a PE file is a binary (".bin") file, but an unknown binary file with a ".bin" extension is not necessarily a PE file. You could have taken some other binary file (like an image) and renamed it to have a ".bin" extension. It just contains a sequence of binary bits in no particular format. You won't be able to execute the file because it's not in the correct, recognized format. It's lacking the magic PE header that makes it executable. There's a reason that C build systems output PE files by default: that's the only type of file that's going to be of any use to you.
And like user1167662 says in his comment, there is nothing magical about hex code. Code in binary files can be represented in either hex or binary format. It's exactly the same information either way. Any good text editor (at least, one designed for programmers), can open and display the contents of a file using either representation (or ASCII, or decimal).
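To make that point concrete, here is a tiny C sketch (names are mine) that renders the same byte as hexadecimal and as binary text; the stored value is identical, only the textual representation differs:

```c
#include <stdio.h>

/* Render byte 'b' as two hex digits and as eight binary digits.
   Both strings describe exactly the same stored value. */
void show_byte(unsigned char b, char hex_out[3], char bin_out[9])
{
    sprintf(hex_out, "%02X", b);
    for (int i = 7; i >= 0; i--)
        bin_out[7 - i] = (b >> i & 1) ? '1' : '0';
    bin_out[8] = '\0';
}
```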
I want it to be as low level as possible for optimal performance.
There is nothing "lower level" about it, and you certainly won't get any optimized performance. PE files already contain native machine code that runs directly on your microprocessor. It's not interpreted like managed code would be. It contains a series of instructions in your processor's machine language. PE files just contain an additional header that allows them to be recognized and executed by the operating system. This has no effect on performance.
To build an operating system.
Now, that's a bit different… In particular, it's going to be a lot more difficult than writing a regular Windows application. You have a lot of work ahead of you, because you can't rely on the operating system to do anything to help you out. You'll need to get down-and-dirty with the underlying hardware that you're targeting—a developer's guide/manual for your CPU will be very useful.
And you'll have to get a different build environment. Visual Studio is not going to do you any good if you're not creating a PE file in the recognized format. Neither is Microsoft's C++ linker included with it, link.exe. The linker doesn't support outputting "flat" binary files (i.e., those with the PE header stripped off). You're going to need a different linker. The GCC toolset can do this. There is a Windows port; it is called MinGW.
I also recommend a book on operating system development. It's too much to cover in an answer to a Stack Overflow question. And for learning purposes, I strongly suggest playing with an architecture other than Intel's x86.
By using the -fdump-tree-* flags, one can dump an intermediate representation of a source file during compilation. My question is whether one can use that intermediate file as input to gcc to get the final object file.
I'm asking this because I want to add some code to the intermediate file of the gimple (obtained by using the flag -fdump-tree-gimple) format. Sure I can use hooks and add my own pass, but I don't want to get to that level of complexity yet. I just want to give gcc my modified intermediate file, so it can start its compilation from there and give me the final object file. Any ideas how to achieve this?
GIMPLE is an internal representation that is hard to dump fully and reload correctly. By comparison, LLVM IR was designed to be dumped and reloaded: the text and binary (bitcode) forms are fully convertible to each other. You can run the Clang frontend to emit LLVM IR, then run the opt program with one set of optimizations, then with another, with LLVM IR bitcode files between the phases. And then you can start code generation from the IR bitcode into native code (in theory, even for a different platform; see the PNaCl project).
There are some projects for dumping and reloading GCC's internal representation. I know of one such project, created to integrate gcc with a commercial compiler tool. The author couldn't simply link the proprietary code with gcc, because the GPL extends to any code linked with gcc. So the author wrote a GPL dumper/loader of GIMPLE to an external (XML) format; the proprietary tool read and translated this XML into another XML of the same format, which was then reloaded with the GPL tool.
In newer gcc you have the option of writing a plugin, which is likewise covered by the GPL. A plugin operates on the in-memory representation of the program, so there is no need to dump and reload GIMPLE via an external file.
There are some plugins which may be configured or may run user-supplied programs, e.g. MELT (Lisp) and GCC Python (Python). A list of gcc plugins is available on the GCC wiki.
There's no built-in facility to translate the text GIMPLE representation back to the original GIMPLE internal representation.
You'll need to use a custom front end (such as the suggested GIMPLE FE) to make sense of the dumped GIMPLE.
If I just want to use the gsl_histogram.h header from the GNU Scientific Library (GSL), can I copy it from an existing machine (Mac OS Snow Leopard) that has GSL installed to a different machine (Linux CentOS 5.7) that doesn't have GSL installed, and just use an #include <gsl_histogram.h> statement in my C program? Would this work?
Or, do I have to go through the full install of GSL on the Linux box, even though I only need this one library?
Just copying the header gsl_histogram.h is not enough. The header merely states the interface that the library exposes. You would also need to copy the binaries, such as the *.so and *.a files, but it's hard to tell which ones to copy. So I think you'd better just install it on your machine. It's pretty easy; just use this tutorial to find and install the GSL package.
There are surely a lot of libraries out there, but the particular one here is Gnuplot. With it you do not even need to compile any code, though you do need to read a bit of documentation. Luckily, there is already a question on Stack Overflow about how to draw a histogram with Gnuplot: Histogram using gnuplot? It is worth noting that Gnuplot is a very powerful tool, so time invested in reading its documentation will certainly pay off.
You cannot copy libraries from OS and expect them to work unchanged.
OS X uses the Mach-O object file format while modern Linux systems use the ELF object file format. The usual ld.so(8) linker/loader will not know how to load the Mach-O format object files for your executable to execute. So you would need the Apple-provided ld.so(8) -- or whatever they call their loader. (It's been a while.)
Furthermore, the object files from OS X will be linked against the Apple-supplied libc, and require the corresponding symbols from the Apple-supplied library. You would also need to provide the Apple-provided libc on the Linux system. This C library would try to make system calls using the OS X system call numbers and calling conventions. I guarantee the system call numbers have changed and almost certainly calling conventions are different.
While the Linux kernel's binfmt_misc generic object loader can be used to teach the kernel how to load different object file formats, and the kernel's personality(2) system call can be used to select between different calling conventions, system call numbers, and so on, the amount of work required to make this work is nothing short of immense: the WINE Project has been working on exactly this issue (but with the Windows format COFF and supporting libraries) since 1993.
It would be easier to run:
apt-get install libgsl0-dev
or whatever the equivalent is on your distribution of choice. If your distribution does not make it easily available, it would still be easier to compile and install the library by hand rather than try to make the OS X version work.