So straight to my question: how can I compile my ASM files with a 32-bit assembler, include them in my 64-bit project, and access the compiled code by using the ASM files' function names?
If it is a bit unclear, I can elaborate:
I'm converting a project of mine from 32-bit to 64-bit and I've run into a technical problem. My project compiles an ASM file and uses the resulting binary as input for its own operation.
When my project was 32-bit, it was quite easy. I included the ASM files in the project and added a build rule to compile them with the Microsoft Macro Assembler. I could then access the compiled code from my 32-bit project by exporting each function I wanted from the ASM into a .h header file and calling it by name (this worked because the ASM was compiled to an .obj file and the linker could resolve the symbols from the prototypes declared in the .h file).
Now, I need to convert this code to 64-bit, but I still need the ASM to be compiled as 32-bit code and still be able to do the same thing (accessing the compiled 32-bit code from my 64-bit program).
However, when I try to compile it, it obviously doesn't recognize the instructions because now the whole project is being compiled as 64-bit code.
Thanks in advance.
If I were trying to embed 32-bit code inside a 64-bit program (which is a dubious thing to do, but let's say for the sake of argument that you have a good reason and actually know what you're doing with the result) — I'd take the 32-bit code, whether written in C, assembly, or something else — and compile it as a separate project, producing a DLL as output. This involves no extra weirdness in the compile chain: It's just an ordinary 32-bit DLL.
That 32-bit DLL can then be embedded in your 64-bit application as a binary resource — just a blob of memory that you can load and access.
So then how can you actually do anything with the compiled code in that DLL? I'd use a somewhat-hacked version of Joachim Bauch's MemoryModule library to access it. MemoryModule is designed to load DLLs from a hunk of memory and provide access to their exports; it's just like the Windows API's LoadLibrary(), only from memory instead of from a file. It's designed to work with the same bitness as the calling process, but with a bit of hackery you could probably make it compile as a 64-bit library that can still parse a 32-bit DLL. The resulting usage would be pretty simple:
// Load the embedded DLL from the current module (error checking omitted).
HMODULE hmodule = GetModuleHandle(NULL);
HRSRC hresource = FindResource(hmodule, "MyLibrary.DLL", "binary");
HGLOBAL hglobal = LoadResource(hmodule, hresource);
void *data = LockResource(hglobal);
DWORD size = SizeofResource(hmodule, hresource);
// Turn the raw buffer into a "library".
HMEMORYMODULE libraryHandle = MemoryLoadLibrary(data, size);
// Get a pointer to some export within it.
FARPROC myFunction = MemoryGetProcAddress(libraryHandle, "myFunction");
That said, as I alluded to before (and others also alluded to), even if you can get pointers to the exports, you won't be able to invoke them, because the code's 32-bit and might not even be loaded at an address that exists below the 4GB mark. But if you really want to embed 32-bit code in a 64-bit application, that's how I'd go about it.
I'm currently trying to figure out how to produce equivalent assembly code from a corresponding C source file.
I've been using the C language for several years, but have little experience with assembly language.
I was able to output the assembly code using the -S option in gcc. However, the resulting assembly code contained call instructions that in turn jump to other functions such as _exp. That is not what I wanted: I need fully functional assembly code in a single file, with no dependencies on other code.
Is it possible to achieve what I'm looking for?
To better describe the problem, I'm showing you my code here:
#include <math.h>
float sigmoid(float i){
    return 1/(1+exp(-i));
}
The platform I am working on is Windows 10 64-bit, and the compiler I'm using is cl.exe from MSBuild.
My initial objective was to see, at the lowest level possible, how computers calculate mathematical functions. The level at which I decided to observe the calculation process is assembly code, and the mathematical function I chose was the sigmoid defined above.
_exp is the standard math library function double exp(double); apparently you're on a platform that prepends a leading underscore to C symbol names.
Given a .s that calls some library functions, build it the same way you would a .c file that calls library functions:
gcc foo.S -o foo -lm
You'll get a dynamic executable by default.
But if you really want all the code in one file with no external dependencies, you can link your .c into a static executable and disassemble that.
gcc -O3 -march=native foo.c -o foo -static -lm
objdump -drwC -Mintel foo > foo.s
There's no guarantee that the _exp implementation in libm.a (static library) is identical to the one you'd get in libm.so or libm.dll or whatever, because it's a different file. This is especially true for a function like memcpy where dynamic-linker tricks are often used to select an optimal version (for your CPU) at run-time.
It is not possible in general. There are exceptions, sure; I could craft one, which means other folks can too, but it wouldn't be an interesting program.
Normally your C program, your main() entry point, is only a fraction of the code. There is bootstrap code that contains the actual entry point the operating system launches; it does some things that prepare your virtual memory space so that your program can run, zeroing .bss and other such things. That is often, and arguably should be, written in assembly language (otherwise you get a chicken-and-egg problem), but it is not an assembly file you will see unless you go find the sources for the C library; you usually just get an object file as part of the toolchain, along with other compiler libraries, etc.
Then, if you make any C library calls, or write code that results in a compiler-library call (performing a divide on a platform that doesn't support divide, performing floating point on a platform that doesn't have floating-point hardware, etc.), that is another object that came from some other C or assembly source which is part of the library or compiler sources, and it is not something you will see during the compile/assemble/link (the chain in toolchain) process.
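As a small illustration of that second case, consider the function below (a hypothetical example, not from the question). On a 32-bit target without a hardware divider, such as an ARMv6-M core, the compiler lowers the division into a call to a libgcc helper, and that helper arrives as a prebuilt object at link time rather than as assembly you get to see:
unsigned int udiv(unsigned int x, unsigned int y)
{
    /* On targets without a divide instruction this compiles to something
       like "bl __aeabi_uidiv" instead of an inline divide. */
    return x / y;
}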
So, except for specially crafted trivial programs or tools built specifically for this purpose (most likely on bare-metal platforms), you will not see your whole program turned into one big assembly source file before it gets assembled and then linked.
If you are not on bare metal, there is of course also the operating-system layer, which you certainly would not get to see as part of your source code. Ultimately, the C library calls that need the system have a place where they make that transition, all compiled to object/library form before you use them, and the assembly sources for the operating-system side are part of some other source tree and build process somewhere else.
I have managed to build an RV32E cross-compiler on my Intel Ubuntu machine by using the official riscv GitHub toolchain (github.com/riscv/riscv-gnu-toolchain) with the following configuration:
./configure --prefix=/home/riscv --with-arch=rv32i --with-abi=ilp32e
The ilp32e ABI specifies soft float for RV32E. This generates a working compiler that handles my simple C source code fine. If I disassemble the resulting application, it does indeed stick to the RV32E specification: the assembly generated for my own code only uses the first 16 registers.
I use static linking, and it pulls in the expected set of soft-float routines such as __divdi3 and __mulsi3. Unfortunately, the pulled-in routines use all 32 registers, not the restricted lower 16 of RV32E. Hence, not very useful!
I cannot find where this statically linked code comes from. Is it compiled from C source and therefore built without the RV32E restriction? Or was it written as hand-coded assembly targeting the full RV32I rather than RV32E? I tried to grep around the sources but have had no luck finding anything like the actual code that gets statically linked.
Any ideas?
EDIT: I've just checked in more detail, and the compiler is not restricting itself to the first 16 registers either. It turns out that a simple test routine happens to use only the first 16, but more complex code uses the others as well. Maybe RV32E is not implemented yet?
The configure.ac file contains this code:
AS_IF([test "x$with_abi" == xdefault],
      [AS_CASE([$with_arch],
        [*rv64g* | *rv64*d*], [with_abi=lp64d],
        [*rv64*f*], [with_abi=lp64f],
        [*rv64*], [with_abi=lp64],
        [*rv32g* | *rv32*d*], [with_abi=ilp32d],
        [*rv32*f*], [with_abi=ilp32f],
        [*rv32*], [with_abi=ilp32],
        [AC_MSG_ERROR([Unknown arch])]
      )])
This seems to map your input of rv32i to the ABI ilp32, ignoring the e. So yes, it seems support for the ...e ABIs is not fully implemented yet.
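One way to check where the offending code comes from is to disassemble the static libgcc that gets linked in and grep for registers outside x0-x15 (the ABI names a6, a7, s2-s11 and t3-t6), which RV32E code must never use. A rough sketch, where the install path and target triplet are assumptions based on your --prefix:
/home/riscv/bin/riscv32-unknown-elf-objdump -d /home/riscv/lib/gcc/riscv32-unknown-elf/*/libgcc.a | grep -E '\b(a[67]|s([2-9]|1[01])|t[3-6])\b'
Any hits in routines such as __divdi3 would confirm that the library code was built (or hand-written) for full RV32I.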
I am using Dev-C++, which compiles using GCC, on Windows 8.1, 64-bit.
I noticed that all my .c files always compiled to .exe files of at least 128 kilobytes, no matter how small the source is. Even a simple "Hello, world!" was 128kb. Source files with more lines of code increased the size of the executable as I would expect, but all the files started at 128kb or more, as if that were some sort of minimum size.
I know .exe's don't actually have a minimum size like that; .kkrieger is a full first-person shooter with 3d graphics and sound that all fit inside a single 96kb executable.
Trying to get to the bottom of this, I opened up my hello_world.exe in Notepad++. Perhaps my compiler adds a lengthy header that happens to be 128kb, I thought.
Unfortunately, I don't know enough about executables to be able to make sense of it, though I did find strings like "Address %p has no image-section VirtualQuery failed for %d bytes at address %p" buried among the usual garble of characters in an .exe.
Of course, this isn't a serious problem, but I'd like to know why it's happening.
Why is this 128kb minimum happening? Does it have something to do with my 64-bit OS, or perhaps with a quirk of my compiler?
Short answer: it depends.
Long answer: it depends on what operating system you have and how it handles executables.
Most (if not all) compilers do not hand you the absolute, raw x86/ARM/other machine code on its own. Instead, they pack your compiled source code into .o (object) files, and the linker then brings the .o files and their libraries together into a standard executable format. These "executable formats" are essentially system-specific container formats that hold the machine code together with metadata (sections, imports, relocations and so on) that tells the operating system's loader how to map the program into memory and hand it to the CPU.
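As a concrete illustration of that two-step compile-then-link flow (the file names here are arbitrary):
gcc -c hello.c -o hello.o
gcc hello.o -o hello.exe
The first command produces the object file; the second lets the linker combine it with the startup code and libraries into the final executable.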
For example, I'll talk about the two most commonly used executable formats for Linux devices: ELF and ELF64 (I'll let you figure out what the namesake differences are yourself). ELF stands for Executable and Linkable Format. In every ELF-compiled program, the file starts off with a 4-byte "magic number", which is simply a hexadecimal 0x7F followed by the string "ELF" in ASCII. The next byte is set to either 1 or 2, which signifies that the program is for 32-bit or 64-bit architectures, respectively. And after that, another byte to signify the program's endianness. After that, there's a few more bytes that tell what the architecture is, and so on, until you reach a total of up to 64 bytes for the 64-bit header.
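Here is a small sketch of reading those header fields yourself; the file name is just an example, and error handling is minimal:
#include <stdio.h>

int main(void)
{
    unsigned char ident[16];
    FILE *f = fopen("a.out", "rb");   /* any ELF file will do */
    if (!f || fread(ident, 1, sizeof ident, f) != sizeof ident)
        return 1;
    printf("magic:  0x%02x %c%c%c\n", ident[0], ident[1], ident[2], ident[3]);
    printf("class:  %s\n", ident[4] == 2 ? "64-bit" : "32-bit");
    printf("endian: %s\n", ident[5] == 1 ? "little" : "big");
    fclose(f);
    return 0;
}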
However, 64 bytes is not even close to the 128K that you have stated. That's because, aside from the fact that the Windows .exe format is usually much more complex, the C++ standard library is largely at fault here. For instance, let's have a look at a common use of the C++ iostream library:
#include <iostream>
int main()
{
    std::cout<<"Hello, World!"<<std::endl;
    return 0;
}
This program may compile to an extremely large executable on a Windows system, because the moment you add iostream to your program, it pulls a large chunk of the C++ runtime (iostreams, locales and their dependencies) into the link, increasing your executable's size considerably.
So, how do we rectify this problem? Simple:
Use the C standard library implementation for C++!
#include <cstdio>
int main()
{
    printf("Hello, World!\n");
    return 0;
}
Simply using the original C standard library can decrease your size from a couple of hundred kilobytes to a handful at most. The reason is that the moment your program touches the C++ iostream machinery, G++ links in a lot of supporting runtime code, whereas the plain C printf path needs far less.
However, sometimes you absolutely need to use the C++-specific libraries. In that case, a lot of toolchains have a command-line option that essentially tells the linker "Hey, I'm only using like, 2 functions from the C++ library, you don't need the whole thing". With GCC/G++ this is the driver option -nodefaultlibs (the compiler driver then adjusts what it passes to the linker ld); I'm not entirely sure what the equivalent is on Windows, though. Of course, this can very quickly break a TON of calls and such in programs that make a lot of standard C++ calls.
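For illustration, a hypothetical link line that drops the default libraries and adds back only what is actually needed might look like this (you then become responsible for supplying everything else, including any C++ runtime support, yourself):
g++ main.o -nodefaultlibs -lc -lgcc -o main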
So, in the end, I would worry more about simply rewriting your program to use the regular C functions instead of the new-fangled C++ functions, as amazing as they are; that is, if you're worried about size.
I'm trying to learn x86. I thought this would be quite easy to start with: I'll just compile a very small program, basically containing nothing, and see what the compiler gives me. The problem is that it gives me a ton of bloat ("This program cannot be run in DOS mode" and so on): a 25KB file containing an empty main() calling one empty function.
How do I compile my code without all this bloat? (and why is it there in the first place?)
Executable formats contain a bit more than just the raw machine code for the CPU to execute. If you want only that, then the only option is (I think) a DOS .com file, which essentially is just a bunch of code loaded into memory and then jumped into. Some software (e.g. Volkov Commander) made clever use of that format to deliver quite a lot in very little executable code.
Anyway, the PE format which Windows uses contains a few things that are specially laid out:
A DOS stub saying "This program cannot be run in DOS mode" which is what you stumbled over
several sections containing things like program code, global variables, etc. that are each handled differently by the executable loader in the operating system
some other things, like import tables
You may not need some of those, but a compiler usually doesn't know you're trying to create a tiny executable. Usually nowadays the overhead is negligible.
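If your goal is just a smaller binary rather than a truly minimal one, the usual first step with GCC is to optimize for size and strip symbols (both standard options; the file names are arbitrary):
gcc -Os -s tiny.c -o tiny.exe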
There is an article out there that strives to create the tiniest possible PE file, though.
You might get better results by digging up older compilers. If you want binaries that are really bare to the bone, COM files are exactly that, so if you get hold of an old compiler that supports generating COM binaries instead of EXE you should be set. There is a long list of free compilers at http://www.thefreecountry.com/compilers/cpp.shtml; I assume Borland's Turbo C would be a good starting point.
The bloat could be the loader/startup code (the interface the operating system requires) attached by the linker. Try adding a module with only something like:
void foo(){}
and look at the disassembly (I assume that's the format the compiler 'gives you'). Of course, the details vary a lot between operating systems and compilers. There are so many!
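For example, with GCC you could compile just that module and disassemble the object file on its own (standard gcc/binutils commands; the file name is arbitrary):
gcc -c foo.c -o foo.o
objdump -d foo.o
The object file contains only foo's few bytes of code; the rest of a full executable's size comes from the startup and runtime pieces added at link time.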
Is there a way to convert a Delphi .dcu file to an .obj file so that it can be linked using a compiler like GCC? I've not used Delphi for a couple of years but would like to use it for a project again if this is possible.
Delphi can output .obj files, but they are in a 32-bit variant of Intel OMF. GCC, on the other hand, works with ELF (Linux, most Unixes), COFF (on Windows) or Mach-O (Mac).
But that alone is not enough. It's hard to write much code without using the runtime library, and the implementation of the runtime library will be dependent on low-level details of the compiler and linker architecture, for things like correct order of initialization.
Moreover, there's more to compatibility than just the object file format; code on Linux, in particular, needs to be position-independent, which means it can't use absolute values to reference global symbols, but rather must index all its global data from a register or relative to the instruction pointer, so that the code can be relocated in memory without rewriting references.
DCU files are a serialization of the Delphi symbol tables and code generated for each proc, and are thus highly dependent on the implementation details of the compiler, which changes from one version to the next.
All this is to say that it's unlikely that you'd be able to get much Delphi (dcc32) code linking into a GNU environment, unless you restricted yourself to the absolute minimum of non-managed data types (no strings, no interfaces) and procedural code (no classes, no initialization section, no data that needs initialization, etc.)
(answer to various FPC remarks, but I need more room)
For a good understanding, you have to know that a Delphi .dcu translates to two different FPC files: a .ppu file with the mentioned symbol-table stuff, which includes non-linkable code like inline functions and generic definitions, and a .o which is mingw-compatible (COFF) on Windows. Cygwin is mingw-compatible too at the linking level (but the runtime is different and scary). Anyway, mingw32/64 is our reference gcc on Windows.
The PPU has a similar version problem to Delphi's DCU, probably for the same reasons. The PPU format is different in nearly every major release (so 2.0, 2.2, 2.4), and typically changes 2-3 times a year in trunk.
So while FPC on Windows uses its own assemblers and linkers, the .o files it generates are still compatible with mingw32. In general FPC's output is very gcc-compatible, and it is often possible to link gcc static libs in directly, allowing e.g. mysql and postgres link libraries to be linked into apps with a suitable license (like e.g. the GPL). On 64-bit they should be compatible too, but this is probably less tested than win32.
The textmode IDE even links in the entire GDB debugger in library form. GDB is one of the main reasons for gcc compatibility on Windows.
While Barry's points about the runtime in general hold for FPC too, it might be slightly easier to work around this. It might only require calling certain functions to initialize the FPC RTL from your startup code, and similarly for finalization. Compile a minimal FPC program with -al and look at the resulting assembler in the .s file (most notably the calls to initializeunits and finalizeunits). Moreover, the RTL is more flexible and probably more easily cut down to a minimum.
Of course, as soon as you also require exceptions to work across gcc<->FPC boundaries you are out of luck. FPC does not use SEH, or any scheme compatible with anything else at the moment (contrary to Delphi, which uses SEH, which at least in theory should give you an advantage there, Barry?). OTOH, gcc might use its own libunwind instead of SEH.
Note that the default calling convention of FPC on x86 is the Delphi-compatible register convention, so you might need to insert proper cdecl modifiers (which should be gcc-compatible), or you can even set it for entire units at a time using {$calling cdecl}.
On *nix this is bog standard (e.g. Apache modules); I don't know many people who do this on win32, though.
About compatibility: FPC can compile packages like Indy, TeeChart, Zeos, ICS, Synapse, VST and reams more with little or no modifications. The dialect level of released versions is a mix of D7 and up, with the focus on D7. The dialect level is slowly creeping towards D2006 in trunk versions (with for-in, class abstract, etc.).
Yes. Have a look at the Project Options dialog box.
As far as I am aware, Delphi only supports the OMF object file format. You may want to try an object format converter such as Agner Fog's.
Since the DCU format is proprietary and has a tendency of changing from one version of Delphi to the next, there's probably no reliable way to convert a DCU to an OBJ. Your best bet is to build them in OBJ format in the first place, as per Andreas's answer.