So I am currently learning assembly language (AT&T syntax). We all know that gcc has an option to generate assembly code from C code with -S argument. Now, I would like to look at some code, how it looks in assembly. The problem is, on laboratories we compile it with as+ld, and as for now, we cannot use C libraries. So for example we cannot use printf. We should do it by syscalls (32 bit is enough). And now I have this code in C:
#include <stdio.h>
int main()
{
int a = 5;
int b = 3;
int c = a + b;
printf("%d", c);
return 0;
}
This is simple code, so I know how it will look with syscalls. But if I have some more complicated code, I don't want to mess around and replace every call printf and modify other registers, cuz gcc generated code for printf, and I should have it with syscalls. So can I somehow make gcc generate assembly code with syscalls (for example for I/O (console, files)), not with C libs?
Under Linux there exist the macro family _syscallX to generate a syscall where the X names the number of parameters. It is marked as obsolete, but IMHO still working. E.g., the following code should work (not tested here):
_syscall3(int,syswrite,int,handle,char*,str,int len);
// ---
char str[]="Hello, world!\n";
// file handle 1 is stdout
syswrite(1,str,14);
Related
I would like to use a library written in D for a C program compilable with MinGW GCC, for Windows. Here are the codes:
dll.d
extern (C) int dsquare(int n) nothrow
{
return n * n;
}
main.c
#include <stdio.h>
int main()
{
int res = dsquare(6); // Expect '36'
printf("res = %d\n", res);
return 0;
}
There is a tutorial on D's site, but it seems to target only Linux. Indeed, no explanation is given for creating such a dynamic D library for Windows and MinGW users.
D's documentation also says that the -shared option should generate a DLL version of the D code, but in my case, it generates an executable, I don't know why.
Also, anything that seems to generate files to be linked targets MVSC formats and nothing seems to be suitable for MinGW GCC compilers.
So, how can I generate a "GCC-friend" DLL with D, so that I can link it to my C program without having to use another compiler, such as GDC or LDC, via gcc main.c -o main -ldll -L. (I guess)?
I attached link with short explanation. Link D onto C is not so straightforward as C to D. Check D.org page here:
https://dlang.org/spec/betterc.html
I am trying to decompile an executable for the 68000 processor into C code, replacing the original subroutines with C functions one by one.
The problem I faced is that I don't know how to make gcc use the calling convention that matches the one used in the original program. I need the parameters on the stack to be packed, not aligned.
Let's say we have the following function
int fun(char arg1, short arg2, int arg3) {
return arg1 + arg2 + arg3;
}
If we compile it with
gcc -m68000 -Os -fomit-frame-pointer -S source.c
we get the following output
fun:
move.b 7(%sp),%d0
ext.w %d0
move.w 10(%sp),%a0
lea (%a0,%d0.w),%a0
move.l %a0,%d0
add.l 12(%sp),%d0
rts
As we can see, the compiler assumed that parameters have addresses 7(%sp), 10(%sp) and 12(%sp):
but to work with the original program they need to have addresses 4(%sp), 5(%sp) and 7(%sp):
One possible solution is to write the function in the following way (the processor is big-endian):
int fun(int bytes4to7, int bytes8to11) {
char arg1 = bytes4to7>>24;
short arg2 = (bytes4to7>>8)&0xffff;
int arg3 = ((bytes4to7&0xff)<<24) | (bytes8to11>>8);
return arg1 + arg2 + arg3;
}
However, the code looks messy, and I was wondering: is there a way to both keep the code clean and achieve the desired result?
UPD: I made a mistake. The offsets I'm looking for are actually 5(%sp), 6(%sp) and 8(%sp) (the char-s should be aligned with the short-s, but the short-s and the int-s are still packed):
Hopefully, this doesn't change the essence of the question.
UPD 2: It turns out that the 68000 C Compiler by Sierra Systems gives the described offsets (as in UPD, with 2-byte alignment).
However, the question is about tweaking calling conventions in gcc (or perhaps another modern compiler).
Here's a way with a packed struct. I compiled it on an x86 with -m32 and got the desired offsets in the disassembly, so I think it should still work for an mc68000:
typedef struct {
char arg1;
short arg2;
int arg3;
} __attribute__((__packed__)) fun_t;
int
fun(fun_t fun)
{
return fun.arg1 + fun.arg2 + fun.arg3;
}
But, I think there's probably a still cleaner way. It would require knowing more about the other code that generates such a calling sequence. Do you have the source code for it?
Does the other code have to remain in asm? With the source, you could adjust the offsets in the asm code to be compatible with modern C ABI calling conventions.
I've been programming in C since 1981 and spent years doing mc68000 C and assembler code (for apps, kernel, device drivers), so I'm somewhat familiar with the problem space.
It's not a gcc 'fault', it is 68k architecture that requires stack to be always aligned on 2 bytes.
So there is simply no way to break 2-byte alignment on the hardware stack.
but to work with the original program they need to have addresses
4(%sp), 5(%sp) and 7(%sp):
Accessing word or long values off the ODD memory address will immediately trigger alignment exception on 68000.
To get integral parameters passed using 2 byte alignment instead of 4 byte alignment, you can change the default int size to be 16 bit by -mshort. You need to replace all int in your code by long (if you want them to be 32 bit wide). The crude way to do that is to also pass -Dint=long to your compiler. Obviously, you will break ABI compatibility to object files compiled with -mno-short (which appears to be the default for gcc).
I'm writing a basic compiler and the code generated does not work as intended.
I'm using a naive graph coloring algorithm to allocate variables in registers based on their liveness.
The problem is that the generated assembly code seems perfectly fine, but, at some point, it produces undefined behaviour.
If, instead of using registers to store variables, I just use the stack, everything works fine.
I also discovered that I can't use the %edx register around an imull instruction and I wondered if something similar is happening right now with %ebx and %ecx.
I compile the code using gcc -m32 "test.s" runtime.c -o test, where runtime.c is a helper C file containing the print and input functions.
I've also tried to remove parts of the program (every print except the last one) and then the last print will work.
If I call a single print function before the last call it won't work.
The runtime.c file:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
int input() {
int num;
char term;
scanf("%d%c", &num, &term);
return num;
}
void print_int_nl(int i) {
printf("%d\n", i);
}
Source file:
a = 10
b = input()
c = - 10
d = -input()
print a
print b
print c
print d
Generated assembly code:
https://pastebin.com/ChSRbWgt
After compiling the .s file and running it using the console (./test) it asks for 2 input (as intended).
I give it 1 and 2.
Then the output is:
10
1
-10
1415880
instead of
10
1
-10
-2
You need to observe the calling convention (see e.g. Calling conventions for different C++ compilers and operating systems by Agner Fog).
Namely, there are caller-save and callee-save registers in the convention for your C compiler.
Your generated code needs to preserve the callee-save registers in order to be able to return to its C caller.
Similarly, printf() will preserve the callee-save registers, but it can trash the caller-save registers, meaning that if your generated code calls printf(), it will need to preserve the caller-save registers across the calls to printf() or any other C function.
You need to clear the input buffer after your first input the buffer still contains the newline character i'd recommend this :
int input() {
int num;
char term;
scanf("%d %c", &num, &term);
return num;
}
Is there a C library for assembling a x86/x64 assembly string to opcodes?
Example code:
/* size_t assemble(char *string, int asm_flavor, char *out, size_t max_size); */
unsigned char bytes[32];
size_t size = assemble("xor eax, eax\n"
"inc eax\n"
"ret",
asm_x64, &bytes, 32);
for(int i = 0; i < size; i++) {
printf("%02x ", bytes[i]);
}
/* Output: 31 C0 40 C3 */
I have looked at asmpure, however it needs modifications to run on non-windows machines.
I actually both need an assembler and a disassembler, is there a library which provides both?
There is a library that is seemingly a ghost; its existance is widely unknown:
XED (X86 Encoder Decoder)
Intel wrote it: https://software.intel.com/sites/landingpage/pintool/docs/71313/Xed/html/
It can be downloaded with Pin: https://software.intel.com/en-us/articles/pintool-downloads
Sure - you can use llvm. Strictly speaking, it's C++, but there are C interfaces. It will handle both the assembling and disassembling you're trying to do, too.
Here you go:
http://www.gnu.org/software/lightning/manual/lightning.html
Gnu Lightning is a C library which is designed to do exactly what you want. It uses a portable assembly language though, rather than x86 specific one. The portable assembly is compiled in run time to a machine specific one in a very straightforward manner.
As an added bonus, it is much smaller and simpler to start using than LLVM (which is rather big and cumbersome).
You might want libyasm (the backend YASM uses). You can use the frontends as examples (most particularly, YASM's driver).
I'm using fasm.dll: http://board.flatassembler.net/topic.php?t=6239
Don't forget to write "use32" at the beginning of code if it's not in PE format.
Keystone seems like a great choice now, however it didn't exist when I asked this question.
Write the assembly into its own file, and then call it from your C program using extern. You have to do a little bit of makefile trickery, but otherwise it's not so bad.
Your assembly code has to follow C conventions, so it should look like
global _myfunc
_myfunc: push ebp ; create new stack frame for procedure
mov ebp,esp ;
sub esp,0x40 ; 64 bytes of local stack space
mov ebx,[ebp+8] ; first parameter to function
; some more code
leave ; return to C program's frame
ret ; exit
To get at the contents of C variables, or to declare variables which C can access, you need only declare the names as GLOBAL or EXTERN. (Again, the names require leading underscores.) Thus, a C variable declared as int i can be accessed from assembler as
extern _i
mov eax,[_i]
And to declare your own integer variable which C programs can access as extern int j, you do this (making sure you are assembling in the _DATA segment, if necessary):
global _j
_j dd 0
Your C code should look like
extern void myasmfunc(variable a);
int main(void)
{
myasmfunc(a);
}
Compile the files, then link them using
gcc mycfile.o myasmfile.o
I'll explain:
Let's say I'm interested in replacing the rand() function used by a certain application.
So I attach gdb to this process and make it load my custom shared library (which has a customized rand() function):
call (int) dlopen("path_to_library/asdf.so")
This would place the customized rand() function inside the process' memory. However, at this point the symbol rand will still point to the default rand() function. Is there a way to make gdb point the symbol to the new rand() function, forcing the process to use my version?
I must say I'm also not allowed to use the LD_PRELOAD (linux) nor DYLD_INSERT_LIBRARIES (mac os x) methods for this, because they allow code injection only in the beginning of the program execution.
The application that I would like to replace rand(), starts several threads and some of them start new processes, and I'm interested in injecting code on one of these new processes. As I mentioned above, GDB is great for this purpose because it allows code injection into a specific process.
I followed this post and this presentation and came up with the following set of gdb commands for OSX with x86-64 executable, which can be loaded with -x option when attaching to the process:
set $s = dyld_stub_rand
set $p = ($s+6+*(int*)($s+2))
call (void*)dlsym((void*)dlopen("myrand.dylib"), "my_rand")
set *(void**)$p = my_rand
c
The magic is in set $p = ... command. dyld_stub_rand is a 6-byte jump instruction. Jump offset is at dyld_stub_rand+2 (4 bytes). This is a $rip-relative jump, so add offset to what $rip would be at this point (right after the instruction, dyld_stub_rand+6).
This points to a symbol table entry, which should be either real rand or dynamic linker routine to load it (if it was never called). It is then replaced by my_rand.
Sometimes gdb will pick up dyld_stub_rand from libSystem or another shared library, if that happens, unload them first with remove-symbol-file before running other commands.
This question intrigued me, so I did a little research. What you are looking for is a 'dll injection'. You write a function to replace some library function, put it in a .so, and tell ld to preload your dll. I just tried it out and it worked great! I realize this doesn't really answer your question in relation to gdb, but I think it offers a viable workaround.
For a gdb-only solution, see my other solution.
// -*- compile-command: "gcc -Wall -ggdb -o test test.c"; -*-
// test.c
#include "stdio.h"
#include "stdlib.h"
int main(int argc, char** argv)
{
//should print a fairly random number...
printf("Super random number: %d\n", rand());
return 0;
}
/ -*- compile-command: "gcc -Wall -fPIC -shared my_rand.c -o my_rand.so"; -*-
//my_rand.c
int rand(void)
{
return 42;
}
compile both files, then run:
LD_PRELOAD="./my_rand.so" ./test
Super random number: 42
I have a new solution, based on the new original constraints. (I am not deleting my first answer, as others may find it useful.)
I have been doing a bunch of research, and I think it would work with a bit more fiddling.
In your .so rename your replacement rand function, e.g my_rand
Compile everything and load up gdb
Use info functions to find the address of rand in the symbol table
Use dlopen then dlsym to load the function into memory and get its address
call (int) dlopen("my_rand.so", 1) -> -val-
call (unsigned int) dlsym(-val-, "my_rand") -> my_rand_addr
-the tricky part- Find the hex code of a jumpq 0x*my_rand_addr* instruction
Use set {int}*rand_addr* = *my_rand_addr* to change symbol table instruction
Continue execution: now whenever rand is called, it will jump to my_rand instead
This is a bit complicated, and very round-about, but I'm pretty sure it would work. The only thing I haven't accomplished yet is creating the jumpq instruction code. Everything up until that point works fine.
I'm not sure how to do this in a running program, but perhaps LD_PRELOAD will work for you. If you set this environment variable to a list of shared objects, the runtime loader will load the shared object early in the process and allow the functions in it to take precedence over others.
LD_PRELOAD=path_to_library/asdf.so path/to/prog
You do have to do this before you start the process but you don't have to rebuild the program.
Several of the answers here and the code injection article you linked to in your answer cover chunks of what I consider the optimal gdb-oriented solution, but none of them pull it all together or cover all the points. The code-expression of the solution is a bit long, so here's a summary of the important steps:
Load the code to inject. Most of the answers posted here use what I consider the best approach -- call dlopen() in the inferior process to link in a shared library containing the injected code. In the article you linked to the author instead loaded a relocatable object file and hand-linked it against the inferior. This is quite frankly insane -- relocatable objects are not "ready-to-run" and include relocations even for internal references. And hand-linking is tedious and error-prone -- far simpler to let the real runtime dynamic linker do the work. This does mean getting libdl into the process in the first place, but there are many options for doing that.
Create a detour. Most of the answers posted here so far have involved locating the PLT entry for the function of interest, using that to find the matching GOT entry, then modifying the GOT entry to point to your injected function. This is fine up to a point, but certain linker features -- e.g., use of dlsym -- can circumvent the GOT and provide direct access to the function of interest. The only way to be certain of intercepting all calls to a particular function is overwrite the initial instructions of that function's code in-memory to create a "detour" redirecting execution to your injected function.
Create a trampoline (optional). Frequently when doing this sort of injection you'll want to call the original function whose invocation you are intercepting. The way to allow this with a function detour is to create a small code "trampoline" which includes the overwritten instructions of the original function then a jump to the remainder of the original. This can be complex, because any IP-relative instructions in the copied set need to be modified to account for their new addresses.
Automate it all. These steps can be tedious, even if doing some of the simpler solutions posted in other answers. The best way to ensure that the steps are done correctly every time with variable parameters (injecting different functions, etc) is to automate their execution. Starting with the 7.0 series, gdb has included the ability to write new commands in Python. This support can be used to implement a turn-key solution for injecting and detouring code in/to the inferior process.
Here's an example. I have the same a and b executables as before and an inject2.so created from the following code:
#include <unistd.h>
#include <stdio.h>
int (*rand__)(void) = NULL;
int
rand(void)
{
int result = rand__();
printf("rand invoked! result = %d\n", result);
return result % 47;
}
I can then place my Python detour command in detour.py and have the following gdb session:
(gdb) source detour.py
(gdb) exec-file a
(gdb) set follow-fork-mode child
(gdb) catch exec
Catchpoint 1 (exec)
(gdb) run
Starting program: /home/llasram/ws/detour/a
a: 1933263113
a: 831502921
[New process 8500]
b: 918844931
process 8500 is executing new program: /home/llasram/ws/detour/b
[Switching to process 8500]
Catchpoint 1 (exec'd /home/llasram/ws/detour/b), 0x00007ffff7ddfaf0 in _start ()
from /lib64/ld-linux-x86-64.so.2
(gdb) break main
Breakpoint 2 at 0x4005d0: file b.c, line 7.
(gdb) cont
Continuing.
Breakpoint 2, main (argc=1, argv=0x7fffffffdd68) at b.c:7
7 {
(gdb) detour libc.so.6:rand inject2.so:rand inject2.so:rand__
(gdb) cont
Continuing.
rand invoked! result = 392103444
b: 22
Program exited normally.
In the child process, I create a detour from the rand() function in libc.so.6 to the rand() function in inject2.so and store a pointer to a trampoline for the original rand() in the rand__ variable of inject2.so. And as expected, the injected code calls the original, displays the full result, and returns that result modulo 47.
Due to length, I'm just linking to a pastie containing the code for my detour command. This is a fairly superficial implementation (especially in terms of the trampoline generation), but it should work well in a large percentage of cases. I've tested it with gdb 7.2 (most recently released version) on Linux with both 32-bit and 64-bit executables. I haven't tested it on OS X, but any differences should be relatively minor.
For executables you can easily find the address where the function pointer is stored by using objdump. For example:
objdump -R /bin/bash | grep write
00000000006db558 R_X86_64_JUMP_SLOT fwrite
00000000006db5a0 R_X86_64_JUMP_SLOT write
Therefore, 0x6db5a0 is the adress of the pointer for write. If you change it, calls to write will be redirected to your chosen function. Loading new libraries in gdb and getting function pointers has been covered in earlier posts. The executable and every library have their own pointers. Replacing affects only the module whose pointer was changed.
For libraries, you need to find the base address of the library and add it to the address given by objdump. In Linux, /proc/<pid>/maps gives it out. I don't know whether position-independent executables with address randomization would work. maps-information might be unavailable in such cases.
As long as the function you want to replace is in a shared library, you can redirect calls to that function at runtime (during debugging) by poking at the PLT. Here is an article that might be helpful:
Shared library call redirection using ELF PLT infection
It's written from the standpoint of malware modifying a program, but a much easier procedure is adaptable to live use in the debugger. Basically you just need to find the function's entry in the PLT and overwrite the address with the address of the function you want to replace it with.
Googling for "PLT" along with terms like "ELF", "shared library", "dynamic linking", "PIC", etc. might find you more details on the subject.
You can still us LD_PRELOAD if you make the preloaded function understand the situations it's getting used in. Here is an example that will use the rand() as normal, except inside a forked process when it will always return 42. I use the dl routines to load the standard library's rand() function into a function pointer for use by the hijacked rand().
// -*- compile-command: "gcc -Wall -fPIC -shared my_rand.c -o my_rand.so -ldl"; -*-
//my_rand.c
#include <sys/types.h>
#include <unistd.h>
#include <dlfcn.h>
int pid = 0;
int (*real_rand)(void) = NULL;
void f(void) __attribute__ ((constructor));
void f(void) {
pid = getpid();
void* dl = dlopen("libc.so.6", RTLD_LAZY);
if(dl) {
real_rand = dlsym(dl, "rand");
}
}
int rand(void)
{
if(pid == getpid() && real_rand)
return real_rand();
else
return 42;
}
//test.c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char** argv)
{
printf("Super random number: %d\n", rand());
if(fork()) {
printf("original process rand: %d\n", rand());
} else {
printf("forked process rand: %d\n", rand());
}
return 0;
}
jdizzle#pudding:~$ ./test
Super random number: 1804289383
original process rand: 846930886
forked process rand: 846930886
jdizzle#pudding:~$ LD_PRELOAD="/lib/ld-linux.so.2 ./my_rand.so" ./test
Super random number: 1804289383
original process rand: 846930886
forked process rand: 42
I found this tutorial incredibly useful, and so far its the only way I managed to achieve what I was looking with GDB: Code Injection into Running Linux Application: http://www.codeproject.com/KB/DLL/code_injection.aspx
There is also a good Q&A on code injection for Mac here: http://www.mikeash.com/pyblog/friday-qa-2009-01-30-code-injection.html
I frequently use code injection as a method of mocking for automated testing of C code. If that's the sort of situation you're in -- if your use of GDB is simply because you're not interested in the parent processes, and not because you want to interactively select the processes which are of interest -- then you can still use LD_PRELOAD to achieve your solution. Your injected code just needs to determine whether it is in the parent or child processes. There are several ways you could do this, but on Linux, since your child processes exec(), the simplest is probably to look at the active executable image.
I produced two executables, one named a and the other b. Executable a prints the result of calling rand() twice, then fork()s and exec()s b twice. Executable b print the result of calling rand() once. I use LD_PRELOAD to inject the result of compiling the following code into the executables:
// -*- compile-command: "gcc -D_GNU_SOURCE=1 -Wall -std=gnu99 -O2 -pipe -fPIC -shared -o inject.so inject.c"; -*-
#include <sys/types.h>
#include <unistd.h>
#include <limits.h>
#include <stdio.h>
#include <dlfcn.h>
#define constructor __attribute__((__constructor__))
typedef int (*rand_t)(void);
typedef enum {
UNKNOWN,
PARENT,
CHILD
} state_t;
state_t state = UNKNOWN;
rand_t rand__ = NULL;
state_t
determine_state(void)
{
pid_t pid = getpid();
char linkpath[PATH_MAX] = { 0, };
char exepath[PATH_MAX] = { 0, };
ssize_t exesz = 0;
snprintf(linkpath, PATH_MAX, "/proc/%d/exe", pid);
exesz = readlink(linkpath, exepath, PATH_MAX);
if (exesz < 0)
return UNKNOWN;
switch (exepath[exesz - 1]) {
case 'a':
return PARENT;
case 'b':
return CHILD;
}
return UNKNOWN;
}
int
rand(void)
{
if (state == CHILD)
return 47;
return rand__();
}
constructor static void
inject_init(void)
{
rand__ = dlsym(RTLD_NEXT, "rand");
state = determine_state();
}
The result of running a with and without injection:
$ ./a
a: 644034683
a: 2011954203
b: 375870504
b: 1222326746
$ LD_PRELOAD=$PWD/inject.so ./a
a: 1023059566
a: 986551064
b: 47
b: 47
I'll post a gdb-oriented solution later.