c library x86/x64 assembler - c

Is there a C library for assembling a x86/x64 assembly string to opcodes?
Example code:
/* size_t assemble(char *string, int asm_flavor, char *out, size_t max_size); */
unsigned char bytes[32];
size_t size = assemble("xor eax, eax\n"
"inc eax\n"
"ret",
asm_x64, &bytes, 32);
for(int i = 0; i < size; i++) {
printf("%02x ", bytes[i]);
}
/* Output: 31 C0 40 C3 */
I have looked at asmpure, however it needs modifications to run on non-windows machines.
I actually both need an assembler and a disassembler, is there a library which provides both?

There is a library that is seemingly a ghost; its existance is widely unknown:
XED (X86 Encoder Decoder)
Intel wrote it: https://software.intel.com/sites/landingpage/pintool/docs/71313/Xed/html/
It can be downloaded with Pin: https://software.intel.com/en-us/articles/pintool-downloads

Sure - you can use llvm. Strictly speaking, it's C++, but there are C interfaces. It will handle both the assembling and disassembling you're trying to do, too.

Here you go:
http://www.gnu.org/software/lightning/manual/lightning.html
Gnu Lightning is a C library which is designed to do exactly what you want. It uses a portable assembly language though, rather than x86 specific one. The portable assembly is compiled in run time to a machine specific one in a very straightforward manner.
As an added bonus, it is much smaller and simpler to start using than LLVM (which is rather big and cumbersome).

You might want libyasm (the backend YASM uses). You can use the frontends as examples (most particularly, YASM's driver).

I'm using fasm.dll: http://board.flatassembler.net/topic.php?t=6239
Don't forget to write "use32" at the beginning of code if it's not in PE format.

Keystone seems like a great choice now, however it didn't exist when I asked this question.

Write the assembly into its own file, and then call it from your C program using extern. You have to do a little bit of makefile trickery, but otherwise it's not so bad.
Your assembly code has to follow C conventions, so it should look like
global _myfunc
_myfunc: push ebp ; create new stack frame for procedure
mov ebp,esp ;
sub esp,0x40 ; 64 bytes of local stack space
mov ebx,[ebp+8] ; first parameter to function
; some more code
leave ; return to C program's frame
ret ; exit
To get at the contents of C variables, or to declare variables which C can access, you need only declare the names as GLOBAL or EXTERN. (Again, the names require leading underscores.) Thus, a C variable declared as int i can be accessed from assembler as
extern _i
mov eax,[_i]
And to declare your own integer variable which C programs can access as extern int j, you do this (making sure you are assembling in the _DATA segment, if necessary):
global _j
_j dd 0
Your C code should look like
extern void myasmfunc(variable a);
int main(void)
{
myasmfunc(a);
}
Compile the files, then link them using
gcc mycfile.o myasmfile.o

Related

How to tell gcc to not align function parameters on the stack?

I am trying to decompile an executable for the 68000 processor into C code, replacing the original subroutines with C functions one by one.
The problem I faced is that I don't know how to make gcc use the calling convention that matches the one used in the original program. I need the parameters on the stack to be packed, not aligned.
Let's say we have the following function
int fun(char arg1, short arg2, int arg3) {
return arg1 + arg2 + arg3;
}
If we compile it with
gcc -m68000 -Os -fomit-frame-pointer -S source.c
we get the following output
fun:
move.b 7(%sp),%d0
ext.w %d0
move.w 10(%sp),%a0
lea (%a0,%d0.w),%a0
move.l %a0,%d0
add.l 12(%sp),%d0
rts
As we can see, the compiler assumed that parameters have addresses 7(%sp), 10(%sp) and 12(%sp):
but to work with the original program they need to have addresses 4(%sp), 5(%sp) and 7(%sp):
One possible solution is to write the function in the following way (the processor is big-endian):
int fun(int bytes4to7, int bytes8to11) {
char arg1 = bytes4to7>>24;
short arg2 = (bytes4to7>>8)&0xffff;
int arg3 = ((bytes4to7&0xff)<<24) | (bytes8to11>>8);
return arg1 + arg2 + arg3;
}
However, the code looks messy, and I was wondering: is there a way to both keep the code clean and achieve the desired result?
UPD: I made a mistake. The offsets I'm looking for are actually 5(%sp), 6(%sp) and 8(%sp) (the char-s should be aligned with the short-s, but the short-s and the int-s are still packed):
Hopefully, this doesn't change the essence of the question.
UPD 2: It turns out that the 68000 C Compiler by Sierra Systems gives the described offsets (as in UPD, with 2-byte alignment).
However, the question is about tweaking calling conventions in gcc (or perhaps another modern compiler).
Here's a way with a packed struct. I compiled it on an x86 with -m32 and got the desired offsets in the disassembly, so I think it should still work for an mc68000:
typedef struct {
char arg1;
short arg2;
int arg3;
} __attribute__((__packed__)) fun_t;
int
fun(fun_t fun)
{
return fun.arg1 + fun.arg2 + fun.arg3;
}
But, I think there's probably a still cleaner way. It would require knowing more about the other code that generates such a calling sequence. Do you have the source code for it?
Does the other code have to remain in asm? With the source, you could adjust the offsets in the asm code to be compatible with modern C ABI calling conventions.
I've been programming in C since 1981 and spent years doing mc68000 C and assembler code (for apps, kernel, device drivers), so I'm somewhat familiar with the problem space.
It's not a gcc 'fault', it is 68k architecture that requires stack to be always aligned on 2 bytes.
So there is simply no way to break 2-byte alignment on the hardware stack.
but to work with the original program they need to have addresses
4(%sp), 5(%sp) and 7(%sp):
Accessing word or long values off the ODD memory address will immediately trigger alignment exception on 68000.
To get integral parameters passed using 2 byte alignment instead of 4 byte alignment, you can change the default int size to be 16 bit by -mshort. You need to replace all int in your code by long (if you want them to be 32 bit wide). The crude way to do that is to also pass -Dint=long to your compiler. Obviously, you will break ABI compatibility to object files compiled with -mno-short (which appears to be the default for gcc).

How to define C functions with LuaJIT?

This:
local ffi = require "ffi"
ffi.cdef[[
int return_one_two_four(){
return 124;
}
]]
local function print124()
print(ffi.C.return_one_two_four())
end
print124()
Throws an error:
Error: main.lua:10: cannot resolve symbol 'return_one_two_four': The specified procedure could not be found.
I have a sort-of moderate grasp on C and wanted to use some of it's good sides for a few things, but I couldn't find many examples on LuaJIT's FFI library. It seems like cdef is only used for function declarations and not definitions. How can I make functions in C and then use them in Lua?
LuaJIT is a Lua compiler, but not a C compiler. You have to compile your C code into a shared library first. For example with
gcc -shared -fPIC -o libtest.so test.c
luajit test.lua
with the files test.c and test.lua as below.
test.c
int return_one_two_four(){
return 124;
}
test.lua
local ffi = require"ffi"
local ltest = ffi.load"./libtest.so"
ffi.cdef[[
int return_one_two_four();
]]
local function print124()
print(ltest.return_one_two_four())
end
print124()
Live example on Wandbox
A JIT within LuaJIT
In the comments under the question, someone mentioned a workaround to write functions in machine code and have them executed within LuaJIT on Windows. Actually, the same is possible in Linux by essentially implementing a JIT within LuaJIT. While on Windows you can just insert opcodes into a string, cast it to a function pointer and call it, the same is not possible on Linux due to page restrictions. On Linux, memory is either writeable or executable, but never both at the same time, so we have to allocate a page in read-write mode, insert the assembly and then change the mode to read-execute. To this end, simply use the Linux kernel functions to get the page size and mapped memory. However, if you make even the tiniest mistake, like a typo in one of the opcodes, the program will segfault. I'm using 64-bit assembly because I'm using a 64-bit operating system.
Important: Before executing this on your machine, check the magic numbers in <bits/mman-linux.h>. They are not the same on every system.
local ffi = require"ffi"
ffi.cdef[[
typedef unsigned char uint8_t;
typedef long int off_t;
// from <sys/mman.h>
void *mmap(void *addr, size_t length, int prot, int flags,
int fd, off_t offset);
int munmap(void *addr, size_t length);
int mprotect(void *addr, size_t len, int prot);
// from <unistd.h>
int getpagesize(void);
]]
-- magic numbers from <bits/mman-linux.h>
local PROT_READ = 0x1 -- Page can be read.
local PROT_WRITE = 0x2 -- Page can be written.
local PROT_EXEC = 0x4 -- Page can be executed.
local MAP_PRIVATE = 0x02 -- Changes are private.
local MAP_ANONYMOUS = 0x20 -- Don't use a file.
local page_size = ffi.C.getpagesize()
local prot = bit.bor(PROT_READ, PROT_WRITE)
local flags = bit.bor(MAP_ANONYMOUS, MAP_PRIVATE)
local code = ffi.new("uint8_t *", ffi.C.mmap(ffi.NULL, page_size, prot, flags, -1, 0))
local count = 0
local asmins = function(...)
for _,v in ipairs{ ... } do
assert(count < page_size)
code[count] = v
count = count + 1
end
end
asmins(0xb8, 0x7c, 0x00, 0x00, 0x00) -- mov rax, 124
asmins(0xc3) -- ret
ffi.C.mprotect(code, page_size, bit.bor(PROT_READ, PROT_EXEC))
local fun = ffi.cast("int(*)(void)", code)
print(fun())
ffi.C.munmap(code, page_size)
Live example on Wandbox
How to find opcodes
I see that this answer has attracted some interest, so I want to add something which I was having a hard time with at first, namely how to find opcodes for the instructions you want to perform. There are some resources online most notably the Intel® 64 and IA-32 Architectures Software Developer Manuals but nobody wants to go through thousands of PDF pages just to find out how to do mov rax, 124. Therefore some people have made tables which list instructions and corresponding opcodes, e.g. http://ref.x86asm.net/, but looking up opcodes in a table is cumbersome as well because even mov can have many different opcodes depending on what the target and source operands are. So what I do instead is I write a short assembly file, for example
mov rax, 124
ret
You might wonder, why there are no functions and no things like segment .text in my assembly file. Well, since I don't want to ever link it, I can just leave all of that out and save some typing. Then just assemble it using
$ nasm -felf64 -l test.lst test.s
The -felf64 option tells the assembler that I'm using 64-bit syntax, the -l test.lst option that I want to have a listing of the generated code in the file test.lst. The listing looks similar to this:
$ cat test.lst
1 00000000 B87C000000 mov rax, 124
2 00000005 C3 ret
The third column contains the opcodes I am interested in. Just split these into units of 1 byte and insert them into you program, i.e. B87C000000 becomes 0xb8, 0x7c, 0x00, 0x00, 0x00 (hexadecimal numbers are luckily case-insensitive in Lua and I like lowercase better).
Technically you can do the sorts of things you want to do without too much trouble (as long as the code is simple enough).
Using something like this:
https://github.com/nucular/tcclua
With tcc (which is very small, and you can even deploy with it easily) its quite a nice way to have the best of both worlds, all in a single package :)
LuaJIT includes a recognizer for C declarations, but it isn't a full-fledged C compiler. The purpose of its FFI system is to be able to define what C functions a particular DLL exports so that it can load that DLL (via ffi.load) and allow you to call those functions from Lua.
LuaJIT can load pre-compiled code through a DLL C-based interface, but it cannot compile C itself.

merging assembly and C in mplab

I want use a procedure written in assembly for PIC in my c code in MPLABX. is there a way I can do this. I have searched over the internet but can't find anything helpful on this.
If you are using a 16-bit PIC, see 8.3 MIXING ASSEMBLY LANGUAGE AND C VARIABLES AND FUNCTIONS in MPLAB® C30 User’s Guide.
EXAMPLE 8-2: CALLING AN ASSEMBLY FUNCTION IN C
/*
** file: call1.c
*/
extern int asmFunction(int, int);
int x;
void main(void)
{
x = asmFunction(0x100, 0x200);
}
The assembly-language function sums its two parameters and returns the result.
;
; file: call2.s
;
.global _asmFunction
_asmFunction:
add w0,w1,w0
return
.end
Parameter passing in C is detailed in Section 4.12.2 “Return Value”. In the preceding
example, the two integer arguments are passed in the W0 and W1 registers. The
integer return result is transferred via register W0. More complicated parameter lists
may require different registers and care should be taken in the hand-written assembly
to follow the guidelines.

Print out value of stack pointer

How can I print out the current value at the stack pointer in C in Linux (Debian and Ubuntu)?
I tried google but found no results.
One trick, which is not portable or really even guaranteed to work, is to simple print out the address of a local as a pointer.
void print_stack_pointer() {
void* p = NULL;
printf("%p", (void*)&p);
}
This will essentially print out the address of p which is a good approximation of the current stack pointer
There is no portable way to do that.
In GNU C, this may work for target ISAs that have a register named SP, including x86 where gcc recognizes "SP" as short for ESP or RSP.
// broken with clang, but usually works with GCC
register void *sp asm ("sp");
printf("%p", sp);
This usage of local register variables is now deprecated by GCC:
The only supported use for this feature is to specify registers for input and output operands when calling Extended asm
Defining a register variable does not reserve the register. Other than when invoking the Extended asm, the contents of the specified register are not guaranteed. For this reason, the following uses are explicitly not supported. If they appear to work, it is only happenstance, and may stop working as intended due to (seemingly) unrelated changes in surrounding code, or even minor changes in the optimization of a future version of gcc. ...
It's also broken in practice with clang where sp is treated like any other uninitialized variable.
In addition to duedl0r's answer with specifically GCC you could use __builtin_frame_address(0) which is GCC specific (but not x86 specific).
This should also work on Clang (but there are some bugs about it).
Taking the address of a local (as JaredPar answered) is also a solution.
Notice that AFAIK the C standard does not require any call stack in theory.
Remember Appel's paper: garbage collection can be faster than stack allocation; A very weird C implementation could use such a technique! But AFAIK it has never been used for C.
One could dream of a other techniques. And you could have split stacks (at least on recent GCC), in which case the very notion of stack pointer has much less sense (because then the stack is not contiguous, and could be made of many segments of a few call frames each).
On Linuxyou can use the proc pseudo-filesystem to print the stack pointer.
Have a look here, at the /proc/your-pid/stat pseudo-file, at the fields 28, 29.
startstack %lu
The address of the start (i.e., bottom) of the
stack.
kstkesp %lu
The current value of ESP (stack pointer), as found
in the kernel stack page for the process.
You just have to parse these two values!
You can also use an extended assembler instruction, for example:
#include <stdint.h>
uint64_t getsp( void )
{
uint64_t sp;
asm( "mov %%rsp, %0" : "=rm" ( sp ));
return sp;
}
For a 32 bit system, 64 has to be replaced with 32, and rsp with esp.
You have that info in the file /proc/<your-process-id>/maps, in the same line as the string [stack] appears(so it is independent of the compiler or machine). The only downside of this approach is that for that file to be read it is needed to be root.
Try lldb or gdb. For example we can set backtrace format in lldb.
settings set frame-format "frame #${frame.index}: ${ansi.fg.yellow}${frame.pc}: {pc:${frame.pc},fp:${frame.fp},sp:${frame.sp}} ${ansi.normal}{ ${module.file.basename}{\`${function.name-with-args}{${frame.no-debug}${function.pc-offset}}}}{ at ${ansi.fg.cyan}${line.file.basename}${ansi.normal}:${ansi.fg.yellow}${line.number}${ansi.normal}{:${ansi.fg.yellow}${line.column}${ansi.normal}}}{${function.is-optimized} [opt]}{${frame.is-artificial} [artificial]}\n"
So we can print the bp , sp in debug such as
frame #10: 0x208895c4: pc:0x208895c4,fp:0x01f7d458,sp:0x01f7d414 UIKit`-[UIApplication _handleDelegateCallbacksWithOptions:isSuspended:restoreState:] + 376
Look more at https://lldb.llvm.org/use/formatting.html
You can use setjmp. The exact details are implementation dependent, look in the header file.
#include <setjmp.h>
jmp_buf jmp;
setjmp(jmp);
printf("%08x\n", jmp[0].j_esp);
This is also handy when executing unknown code. You can check the sp before and after and do a longjmp to clean up.
If you are using msvc you can use the provided function _AddressOfReturnAddress()
It'll return the address of the return address, which is guaranteed to be the value of RSP at a functions' entry. Once you return from that function, the RSP value will be increased by 8 since the return address is pop'ed off.
Using that information, you can write a simple function that return the current address of the stack pointer like this:
uintptr_t GetStackPointer() {
return (uintptr_t)_AddressOfReturnAddress() + 0x8;
}
int main(int argc, const char argv[]) {
uintptr_t rsp = GetStackPointer();
printf("Stack pointer: %p\n", rsp);
}
Showcase
You may use the following:
uint32_t msp_value = __get_MSP(); // Read Main Stack pointer
By the same way if you want to get the PSP value:
uint32_t psp_value = __get_PSP(); // Read Process Stack pointer
If you want to use assembly language, you can also use MSP and PSP process:
MRS R0, MSP // Read Main Stack pointer to R0
MRS R0, PSP // Read Process Stack pointer to R0

Is it possible to access 32-bit registers in C?

Is it possible to access 32-bit registers in C ? If it is, how ? And if not, then is there any way to embed Assembly code in C ? I`m using the MinGW compiler, by the way.
Thanks in advance!
If you want to only read the register, you can simply:
register int ecx asm("ecx");
Obviously it's tied to instantiation.
Another way is using inline assembly. For example:
asm("movl %%ecx, %0;" : "=r" (value) : );
This stores the ecx value into the variable value. I've already posted a similar answer here.
Which registers do you want to access?
General purpose registers normally can not be accessed from C. You can declare register variables in a function, but that does not specify which specific registers are used. Further, most compilers ignore the register keyword and optimize the register usage automatically.
In embedded systems, it is often necessary to access peripheral registers (such as timers, DMA controllers, I/O pins). Such registers are usually memory-mapped, so they can be accessed from C...
by defining a pointer:
volatile unsigned int *control_register_ptr = (unsigned int*) 0x00000178;
or by using pre-processor:
#define control_register (*(unsigned int*) 0x00000178)
Or, you can use Assembly routine.
For using Assembly language, there are (at least) three possibilities:
A separate .asm source file that is linked with the program. The assembly routines are called from C like normal functions. This is probably the most common method and it has the advantage that hw-dependent functions are separated from the application code.
In-line assembly
Intrinsic functions that execute individual assembly language instructions. This has the advantage that the assembly language instruction can directly access any C variables.
You can embed assembly in C
http://en.wikipedia.org/wiki/Inline_assembler
example from wikipedia
extern int errno;
int funcname(int arg1, int *arg2, int arg3)
{
int res;
__asm__ volatile(
"int $0x80" /* make the request to the OS */
: "=a" (res) /* return result in eax ("a") */
"+b" (arg1), /* pass arg1 in ebx ("b") */
"+c" (arg2), /* pass arg2 in ecx ("c") */
"+d" (arg3) /* pass arg3 in edx ("d") */
: "a" (128) /* pass system call number in eax ("a") */
: "memory", "cc"); /* announce to the compiler that the memory and condition codes have been modified */
/* The operating system will return a negative value on error;
* wrappers return -1 on error and set the errno global variable */
if (-125 <= res && res < 0) {
errno = -res;
res = -1;
}
return res;
}
I don't think you can do them directly. You can do inline assembly with code like:
asm (
"movl $0, %%ebx;"
"movl $1, %%eax;"
);
If you are on a 32-bit processor and using an adequate compiler, then yes. The exact means depends on the particular system and compiler you are programming for, and of course this will make your code about as unportable as can be.
In your case using MinGW, you should look at GCC's inline assembly syntax.
You can of course. "MinGW" (gcc) allows (as other compilers) inline assembly, as other answers already show. Which assembly, it depends on the cpu of your system (prob. 99.99% that it is x86). This makes however your program not portable on other processors (not that bad if you know what you are doing and why).
The relevant page talking about assembly for gcc is here and here, and if you want, also here. Don't forget that it can't be specific since it is architecture-dependent (gcc can compile for several cpus)
there is generally no need to access the CPU registers from a program written in a high-level language: high-level languages, like C, Pascal, etc. where precisely invented in order to abstract the underlying machine and render a program more machine-independent.
i suspect you are trying to perform something but have no clue how to use a conventional way to do it.
many access to the registers are hidden in higher-level constructs or in system or library calls which lets you avoid coding the "dirty-part". tell us more about what you want to do and we may suggest you an alternative.

Resources