Convert C code to MASM32 - c

This may seem a ridiculous question, but I really need to know an easy way to convert C code to MASM32 code (with the .if's, .while's). The code has a single function, but it uses structs (which, I believe, exist in MASM). I know there are a few questions like this here, and some on other sites too, but so far I couldn't find a solution to my specific problem (readable MASM32, not compiled, low-level, obfuscated pure assembly). Does anyone know of some sort of program that would make this miracle happen? It doesn't seem so difficult, as the macros in MASM are pretty much just an uglier version of C...

You can look for the command-line option of the Microsoft C compiler (cl) that produces an assembly listing. Most C compilers provide one. The output asm source code might need a few modifications for MASM, though.
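As a rough sketch of what that looks like in practice, here is a tiny function that uses a struct, with the listing commands in comments. The struct, the function and the file name rect.c are made up for illustration, and I'm assuming the option the answer has in mind is MSVC's /FA. Note that the listing is plain instructions; the compiler will not emit MASM's .if/.while directives, so those would still have to be written by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct and function, standing in for the asker's code. */
struct Rect {
    int width;
    int height;
};

/* Returns the area of the rectangle pointed to by r. */
int rect_area(const struct Rect *r)
{
    assert(r != NULL);
    return r->width * r->height;
}

/* Producing an assembly listing (rect.c is an assumed file name):
 *   cl /c /FA rect.c              MSVC: writes rect.asm
 *   gcc -S -masm=intel rect.c     GCC on x86: writes rect.s (GNU assembler
 *                                 directives, not MASM syntax)
 */
```

Compiling without optimization usually keeps the listing closer to the shape of the C source, which makes it easier to rewrite by hand.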

Related

How to put custom DWARF in C resulting binary?

I have two questions:
Is it possible to add custom DWARF to the resulting binary of a C program? (I explain below why I want to do this.)
How does DWARF work?
First of all, I don't understand DWARF. I tried to read some docs on dwarfstd.org, but I think they're too advanced for me. Maybe someone could give me some basic pointers to help me dig deeper (getting started is a bit difficult for me).
Why do I want to do this? I like playing around with writing my own compiler and implementing my own language. My goal is to write a compiled language, not an interpreted or JIT-compiled one. So I have several options for a backend: C, opcodes, ASM, LLVM, and maybe a lot more.
Because LLVM is a C++ library (and I have no clue about C++), I tried it a little using the C wrapper. Since I'm a newbie at C too, I didn't get it working easily (but I didn't investigate much). The problem with opcodes and ASM is that the learning curve is steeper than LLVM's, and I'm even more of a newbie on that topic.
So I would like to use C as a backend, but I see a problem: debugging info. The resulting C file would have different function names than my source language, and even different line numbers. I know that line numbers can be fixed using the #line directive in C, but that's not 100% perfect. So I'm looking for a really good solution before I start implementing something odd. I stumbled upon DWARF, and that's how I got to these questions.
If anyone knows a well-documented alternative to LLVM that would fit my requirements, you're welcome to tell me :)
My requirements for target platform are at least: x86, x64 and ARM
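For reference, a minimal sketch of the #line approach mentioned in the question. The file name demo.mylang and the line numbers are invented; a real code generator would emit the position of the source statement it is translating. After the directive, __LINE__, __FILE__ and compiler diagnostics use the remapped position, and as far as I know the line information emitted with -g follows it as well. It does nothing about the differing function names.

```c
#include <assert.h>

/* Hypothetical generated C. The directive below makes the compiler treat
   the following line as line 10 of "demo.mylang". */
#line 10 "demo.mylang"
int checked_div(int a, int b)
{
    assert(b != 0);  /* a failure is reported against demo.mylang, line 12 */
    return a / b;
}

#line 20 "demo.mylang"
int current_line(void) { return __LINE__; }  /* yields 20 */
```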

How to get a c source code from the compiled code

I have the compiled C code in text format. I need to extract the source code by decompiling the machine code. How do I do that?
"True" decompiling is, basically, impossible. Foremost, you can't "decompile" local names (in functions and source code files / modules). For those, you'll get something like, for int local variables: i1, i2... Of course, unless you also have debug information, which is not often the case.
Decompiling to "something" (which might not be very readable) is possible, but it usually relies on heuristics that recognize code patterns compilers generate, and it can be fooled into producing strange (possibly even incorrect) C code. In practice that means a decompiler usually works OK for a certain compiler with certain (default) compile options, but not so well with others.
Having said that, decompilers do exist, and you can try your luck with, say, Snowman.
As Srdjan has said, in general decompilation of a C (or C++) program is not possible. Too much information is lost during compilation. For example, a declaration such as int x is 'lost' because it does not directly produce any machine-level instruction. The compiler needs this information only for type checking.
However, it is possible to disassemble, which takes the compiled executable back up a level to assembly language. Interpreting the assembly might (will?) be difficult and certainly time-consuming. There are several disassemblers available. If you have money, IDA Pro is probably the industry standard in disassemblers and, if you are doing this type of work, well worth the several thousand dollars per license. There are also a number of open-source disassemblers available; Google can find them.
That being said, there have been efforts to create decompilers. IDA Pro has one, and you can look at http://boomerang.sourceforge.net/ in addition to Snowman, linked above.
Lastly, other languages are friendlier towards decompilation than C or C++. For example, a C# program is decompilable with tools like dotPeek or ILSpy. Similarly, for Java there are a number of tools that can convert Java bytecode back into Java source.
Please post a sample of the "compiled C code in text format."
Perhaps then it will be easier to see what you are trying to achieve.
Typically it is not practical to reverse engineer assembly language into C, because much of the human-readable information, in the form of labels and variable names, is permanently lost in the compilation process.
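To make the point about lost names concrete, here is an invented illustration (not the output of any particular tool). The first function is what an author might write; the second is the kind of thing a decompiler could plausibly hand back: the same behaviour, with placeholder names in place of the originals.

```c
/* What the author wrote. */
int sum_prices(const int *prices, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++)
        total += prices[i];
    return total;
}

/* Invented decompiler-style equivalent: the logic survives, but the
   function, parameter and local names are gone. */
int sub_401000(const int *a1, int a2)
{
    int v1 = 0;
    int v2 = 0;
    while (v2 < a2) {
        v1 += a1[v2];
        v2++;
    }
    return v1;
}
```

Both versions compute the same sum; only the first tells you what the numbers mean.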

Is it possible to get an intermediate, optimized C file using GCC?

I have some C code with a loop:
for (int i = 1; i < 1000; i += ceil(sqrt(i)))  /* starting at 0 would never advance, since ceil(sqrt(0)) is 0 */
{
    /* do stuff that could benefit from loop unrolling */
}
I intend to use a macro command to tell GCC to unroll the loop, but I'd like to make sure it will indeed unroll the loop in this case (the increment is not 1, but it could still be precomputed and unrolled).
Is it possible to get GCC to output a .c file containing the code after it's been optimized (hopefully including any optimizations done with -O that come before the assembly-level optimizations)?
I know I can confirm this using the assembly output, but I'd rather see something in C - much easier for me to read and understand.
C is a high-level, compiled language, so it is not an appropriate representation of the optimized machine code. Although you might feel that C code will be easier to understand, it lacks the absolute precision of assembly, which maps directly to machine code. For this simple example, you might have a pretty good idea of what the optimization means in terms of a high-level language, but that is not the case for optimizations in general. Viewing the assembly language shows exactly what the compiler has done.
Secondly, compilers perform optimizations on some sort of intermediate representation (IR), which is more similar to machine code than high-level code (C in this case). To output high-level code after performing optimizations would require a decompilation step. GCC is not the appropriate place to add decompilation logic for a rarely used feature like this. But, if you really want to see the optimized code in C, you could run the assembly produced by GCC through a decompiler to get high-level code back.
Short answer: GCC will not do what you want, but you can produce C code from assembly with a decompiler.
Here is a Stack Overflow thread about choosing a good C decompiler for Linux.
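If it helps, here is a small self-contained version of that loop to experiment with, plus two ways to inspect what GCC did with it. The function names and the file name loop.c are made up. The loop starts at 1, because with i = 0 the step ceil(sqrt(0)) is 0 and the loop never advances, and ceil_sqrt is an integer stand-in for ceil(sqrt(i)) so the sketch builds without libm. The -fdump-tree-optimized dump is not C, but it is a C-like view of GCC's intermediate representation after the tree-level optimizations.

```c
/* Hypothetical integer stand-in for ceil(sqrt(n)), so the sketch builds
   without linking the math library. */
static int ceil_sqrt(int n)
{
    int r = 0;
    while (r * r < n)
        r++;
    return r;
}

/* Counts how many times the loop body runs for a given bound. */
int count_iterations(int limit)
{
    int count = 0;
    for (int i = 1; i < limit; i += ceil_sqrt(i))
        count++;  /* placeholder for the real loop body */
    return count;
}

/* Inspecting the result (loop.c is an assumed file name):
 *   gcc -O2 -funroll-loops -S loop.c             assembly in loop.s
 *   gcc -O2 -fdump-tree-optimized -c loop.c      C-like IR dump written
 *                                                next to the object file
 */
```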

Turning strings into code?

So let's say I have a string containing some code in C, predictably read from a file that has other things in it besides normal C code. How would I turn this string into code usable by the program? Do I have to write an entire interpreter, or is there a library that already does this for me? The code in question may call subroutines that I declared in my actual C file, so one that only accounts for stock C commands may not work.
Whoo. With C this is actually pretty hard.
You've basically got a couple of options:
interpret the code
To do this, you'll have to write an interpreter, and interpreting C is a fairly hard problem. There have been C interpreters available in the past, but I haven't read about one recently. In any case, unless you really, really need this, writing your own interpreter is a big project.
Googling does show a couple of open-source (partial) C interpreters, like picoc.
compile and dynamically load
If you can capture the code and wrap it so it makes a syntactically complete C source file, then you can compile it into a dynamically loadable library: a DLL on Windows, or a .so on most variants of UNIX. Then you could load the result at runtime.
Now, what normally would lead someone to do this is a need to be able to express some complicated scripting functions. Have you considered the possibility of using a different language? Python, Scheme (guile) and Lua are easily available to add as a scripting language to a C application.
C has nothing of this nature. That's because C is compiled, and the compiler needs to do a lot of building of the code before the code starts running (and hence before it receives a string as input), so it can't really change on the fly that easily. Compiled languages have a rigidity to them, while interpreted languages have a flexibility.
You're thinking of Perl, Python, PHP, etc., and so-called "fourth generation languages." I'm sure there's a technical term in computer science for this flexibility, but C doesn't have it. You'll need to switch to one of these languages (and give up performance) if you have a task that requires much of this sort of string use. Check out Perl's /e flag with regexes, for instance.
In C, you'll need to design your application so you don't need to do this. This is generally quite doable: for all its non-OO-ness and other deficiencies, many huge, complex applications run on well-written C just fine.
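One common way to design around it, sketched under the assumption that the text only needs to name operations the program already implements: keep a table that maps command names to C functions and look the name up at runtime. This does not execute arbitrary C, and every name here (add, mul, run_command) is invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host functions the text is allowed to call. */
static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

struct command {
    const char *name;
    int (*fn)(int, int);
};

static const struct command commands[] = {
    { "add", add },
    { "mul", mul },
};

/* Looks up `name` and applies it to a and b.
   Returns 0 on success, -1 if the name is unknown. */
int run_command(const char *name, int a, int b, int *result)
{
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(commands[i].name, name) == 0) {
            *result = commands[i].fn(a, b);
            return 0;
        }
    }
    return -1;
}
```

A parser for the file would split each line into a name and arguments and pass them to run_command; anything beyond that (variables, control flow) is where an embedded scripting language starts to pay off.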

How do I use C libraries in assembler?

I want to know how to write a text editor in assembler. But modern operating systems require C libraries, particularly for their windowing systems. I found this page, which has helped me a lot.
But I wonder if there are details I should know. I know enough assembler to write programs that will use windows in Linux using GTK+, but I want to be able to understand what I have to send to a function for it to be a valid input, so that it will be easier to make use of all C libraries. For interfacing between C and x86 assembler, I know what can be learned from this page, and little else.
One of the most instructive ways to learn how to call C from assembler is to:
Write a C program that calls the C function of interest
Compile it, and look at the assembly listing (gcc -S)
This approach makes it easy to experiment by starting with something that is already known to work. You can change the C source and see how the generated code changes, and you can start with the generated code and modify it yourself.
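A minimal sketch of such a starting point, assuming a file name of call.c and an invented wrapper function. It calls a library function with several arguments, so the listing shows how each one is passed and how the return value comes back on your platform.

```c
#include <assert.h>
#include <stdio.h>

/* Calls a C library function with several arguments so the generated
   assembly shows how each one is passed. */
int format_sum(char *buf, size_t size, int a, int b)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%d + %d = %d", a, b, a + b);
}

/* Listing the assembly (call.c is an assumed file name):
 *   gcc -S -O0 call.c      writes call.s
 */
```

Try changing the argument types or count and regenerating the listing to see how the call sequence changes.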
Push the parameters onto the stack
Call the function
Clean up the stack
The links you have in your question show all these steps.
The OS may define the calling standard (it pretty well must define the standard for invoking system calls), in which case you need only find where that is documented and read it closely.
