Help me understand this C code (*(void(*) ()) scode) () - c

Source: http://milw0rm.org/papers/145
#include <stdio.h>
#include <stdlib.h>
int main()
{
char scode[]="\x31\xc0\xb0\x01\x31\xdb\xcd\x80";
(*(void(*) ()) scode) ();
}
This papers is tutorial about shellcode on Linux platform, however it did not explain how the following statement "(*(void(*) ()) scode) ();" works. I'm using the book "The C Language Programming Reference, 2ed by Brian.W.Kernighan, Dennis.M.Ritchie" to lookup for an answer but found no answer. May someone can point to the right directions, maybe a website, another C reference book where I can find an answer.

Its machine code (compiled assembly instructions) in scode then it casts to a callable void function pointer and calls it. GMan demonstrated an equivalent, clearer approach:
typedef void(*void_function)(void);
int main()
{
char scode[]="\x31\xc0\xb0\x01\x31\xdb\xcd\x80";
void_function f = (void_function)scode;
f(); //or (*f)();
}
scode contains x86 machine code which disassembles into (thanks Michael Berg)
31 c0 xor %eax,%eax
b0 01 mov $0x1,%al
31 db xor %ebx,%ebx
cd 80 int $0x80
This is the code for a system call in Linux (interrupt 0x80). According to the system call table, this is calling the sys_exit() system call (eax=1) with parameter 0 (in ebx). This causes the process to exit immediately, as if it called _exit(0).
Jonathan Leffler pointed out that this is most commonly used to call shellcode, "a small piece of code used as the payload in the exploitation of a software vulnerability." Thus, modern OSes take measures to prevent this.
If the stack is non-executable, this code will fail horribly. The shell code is loaded into a local variable in the stack, and then we jump to that location. If the stack is non-executable, then a CPU fault of some kind will occur as soon as the CPU tries to execute the code, and control will be shifted into the kernel's interrupt handlers. The kernel will then kill the process in an abnormal fashion. One case where the stack might be non-executable would be if you're running on a CPU that supports Physical Address Extensions, and you have the NX (non-executable) bit set in your page tables.
There may also be instruction cache issues on some CPUs -- if the instruction cache hasn't been flushed, the CPU may read stale data (instead of the shell code we explicitly loaded into the stack) and start executing random instructions.

In C:
(some_type) some_var
casts some_var to be of type some_type.
In your code sample "void(*) ()" is the some_type and is the signature for a function pointer that takes no arguments and returns nothing.
"(void(*) ()) scode" casts scode to be a function pointer.
"(*(void(*) ()) scode)" dereferences that function pointer.
And the final () calls the function defined in scode.
And the bytes in scode disassemble to the following i386 assembly:
31 c0 xor %eax,%eax
b0 01 mov $0x1,%al
31 db xor %ebx,%ebx
cd 80 int $0x80

What this code does is assign some machine code (the bytes in scode) then it converts the address of that code into a function pointer of type void function () then calls it.
In C/C++, this function's type definition is expressed:
typedef void (* basicFunctionPtr) (void);

A typedef helps:
// function that takes and returns nothing
typedef void(*generic_function)(void);
// cast to function
generic_function f = (generic_function)scode;
// call
(*f)();
// same thing written differently:
// call
f();

scode is an address. (void(*)()) casts scode to a function returning void and accepting no parameters. The leading * calls the function pointer, and the trailing () indicates that no arguments are given to the function.

To learn a lot more about shell-coding technique, look at the book:
The Shellcoder's Handbook, 2nd Edn
There are several other similar books as well - I think this is the best, but could be persuaded otherwise. You can also find numerous related resources with Google and "shellcoder's handbook" (or your search engine of choice, no doubt).

The character array contains executable code and the cast is a function cast.
(*(void(*) ()) means "cast to a function pointer that produces void, i.e. nothing. The () after the name is the function call operator.

The characters encoded in scode are the char/byte representations of some compiled assembly code. The code you have posted takes that assembly, encoded as characters for simplicity, and then calls that string as a function.
The assembly seems to translate out to:
xor %eax,
%eax mov $0x1,
%al xor %ebx,
%ebx int $0x80
Yup, that would indeed create a shell in Linux.

Related

Abnormal presence of code after __stack_chk_fail

I have an 64-bits ELF binary. I don't have its source code, don't know with which parameters it was compiled, and am not allowed to provide it here. The only relevant information I have is that the source is a .c file (so no hand-crafted assembly), compiled through a Makefile.
While reversing this binary using IDA, I stumbled upon an extremely weird construction I have never seen before and absolutely cannot explain. Here is the raw decompilation of one function with IDA syntax:
mov rax, [rsp+var_20]
xor rax, fs:28h
jnz location
add rsp, 28h
pop rbx
pop rbp
retn
location:
call __stack_chk_fail
nop dword ptr [rax]
db 2Eh
nop word ptr [rax+rax+00000000h]
...then dozens of instructions of normal and functional code
Here, we have a simple canary check, where we return if it is valid, and call __stack_chk_fail otherwise. Everything is perfectly normal. But after this check, there is still assembly, and of fully-functional code.
Looking at the manual of __stack_chk_fail, I made sure that this function does exit the program, and that there is no edge case where it could continue:
Description
The interface __stack_chk_fail() shall abort the function that called it with a message that a stack overflow has been detected. The program that called the function shall then exit.
I also tried to write this small C program, to search for a method to reproduce this:
#include <stdio.h>
#include <stdlib.h>
int foo()
{
int a = 3;
printf("%d\n", a);
return 0;
int b = 7;
printf("%d\n", b);
}
int main()
{
foo();
return 0;
}
But the code after the return is simply omitted by gcc.
It does not appear either that my binary is vulnerable to a buffer overflow that I could exploit to control rip and jump to the code after the canary check. I also inspected every call and jumps using objdump, and this code seems to never be called.
Could someone explain what is going on? How was this code generated in the first place? Is it a joke from the author of the binary?
I suspect you are looking at padding, followed by an unrelated function that IDA does not have a name for.
To test this hypothesis, I need the following additional information:
The address of the byte immediately after call __stack_chk_fail.
The next higher address that is the target of a call or jump instruction.
A raw hex dump of the bytes in between those two addresses.
The disassembly of four or five instructions starting at the next higher address that is the target of a call or jump instruction.

C - Variadic function compiles but gives segmentation fault

I have a variadic function in C to write in a log file, but as soon as it is invoked, it gives a segmentation fault in the header.
In the main process, the call has this format:
mqbLog("LOG_INFORMATION",0,0,"Connect",0,"","Parameter received");
and the function is defined this way:
void mqbLog(char *type,
int numContext,
double sec,
char *service,
int sizeData,
char *data,
char *fmt,
...
)
{
//write the log in the archive
}
It compiles OK. When I debug the process, the call to the mqbLog function is done, and it gives me the segmentation fault in the open bracket of the function, so I can ask about the function values:
(gdb) p type
$1 = 0x40205e "LOG_INFORMATION"
(gdb) p numContext
$2 = 0
(gdb) p sec
$3 = 0
(gdb) p service
$4 = 0x0
(gdb) p sizeData
$5 = 4202649
(gdb) p data
$6 = 0x0
Any ideas will be gratefully received.
Based on the gdb output, it looks like the caller didn't have a prototype for the function it was calling. As #JonathanLeffler noticed, you wrote 0 instead of 0.0, so it's passing an integer where the callee is expecting a double.
Judging from the pointer value, this is probably on x86-64 Linux with the System V calling convention, where the register assigned for an arg is determined by it being e.g. the third integer arg. (See the x86 wiki for ABI/calling convention docs).
So if the caller and callee disagree about the function signature, they will disagree about which arg goes in which register, which I think explains why gdb is showing args that don't match the caller.
In this case, the caller puts "Connect" (the address) in RCX, because it's the 4th integer/pointer arg with that implicit declaration.
The caller looks for the value of service in RDX, because its caller's 3rd integer/pointer arg.
sec is 0.0 in the callee apparently by chance. It's just using whatever was sitting in XMM0. Or maybe possibly uninitialized stack space, since the caller would have set AL=0 to indicate that no FP args were passed in registers (necessary for variadic functions only). Note al = number of fp register args includes the fixed non-variadic args when the prototype is available. Compiling your call with the prototype available includes a mov eax, 1 before the call. See the source+asm for compiling with/without the prototype on the Godbolt compiler explorer.
In a different calling convention (e.g. -m32 with stack args), things would break at least a badly because those args would be passed on the stack, but int and double are different sizes.
Writing 0.0 for the FP args would make the implicit declaration match the definition. But don't do this, it's still a terrible idea to call undeclared functions. Use -Wall to have the compiler tell you when your code does bad things.
You function might still crash; who knows what other bugs you have in code that's not shown?
When your code crashes, you should look at the asm instruction it crashed on to figure out which pointer was bad — e.g. run disas in gdb. Even if you don't understand it yourself, including that in a debugging-help question (along with register values) can help a lot.

How does the stack frame look like in my function?

I am a beginner at assembly, and I am curious to know how the stack frame looks like here, so I could access the argument by understanding and not algorithm.
P.S.: the assembly function is process
#include <stdio.h>
# define MAX_LEN 120 // Maximal line size
extern int process(char*);
int main(void) {
char buf[MAX_LEN];
int str_len = 0;
printf("Enter a string:");
fgets(buf, MAX_LEN, stdin);
str_len = process(buf);
So, I know that when I want to access the process function's argument, which is in assembly, I have to do the following:
push ebp
mov ebp, esp ; now ebp is pointing to the same address as esp
pushad
mov ebx, dword [ebp+8]
Now I also would like someone to correct me on things I think are correct:
At the start, esp is pointing to the return address of the function, and [esp+8] is the slot in the stack under it, which is the function's argument
Since the function process has one argument and no inner declarations (not sure about the declarations) then the stack frame, from high to low, is 8 bytes for the argument, 8 bytes for the return address.
Thank you.
There's no way to tell other than by means of debugger. You are using ia32 conventions (ebp, esp) instead of x64 (rbp, rsp), but expecting int / addresses to be 64 bit. It's possible, but not likely.
Compile the program (gcc -O -g foo.c), then run with gdb a.out
#include <stdio.h>
int process(char* a) { printf("%p", (void*)a); }
int main()
{
process((char *)0xabcd1234);
}
Break at process; run; disassemble; inspect registers values and dump the stack.
- break process
- run
- disassemble
- info frame
- info args
- info registers
- x/32x $sp - 16 // to dump stack +-16 bytes in both side of stack pointer
Then add more parameters, a second subroutine or local variables with known values. Single step to the printf routine. What does the stack look like there?
You can also use gdb as calculator: what is the difference in between sp and rax ?
It's print $sp - $rax if you ever want to know.
Tickle your compiler to produce assembler output (on Unixy systems usually with the -S flag). Play around with debugging/non-debugging flags, the extra hints for the debugger might help in refering back to the source. Don't give optimization flags, the reorganizing done by the compiler can lead to thorough confusion. Add a simple function calling into your code to see how it is set up and torn down too.

Print out value of stack pointer

How can I print out the current value at the stack pointer in C in Linux (Debian and Ubuntu)?
I tried google but found no results.
One trick, which is not portable or really even guaranteed to work, is to simple print out the address of a local as a pointer.
void print_stack_pointer() {
void* p = NULL;
printf("%p", (void*)&p);
}
This will essentially print out the address of p which is a good approximation of the current stack pointer
There is no portable way to do that.
In GNU C, this may work for target ISAs that have a register named SP, including x86 where gcc recognizes "SP" as short for ESP or RSP.
// broken with clang, but usually works with GCC
register void *sp asm ("sp");
printf("%p", sp);
This usage of local register variables is now deprecated by GCC:
The only supported use for this feature is to specify registers for input and output operands when calling Extended asm
Defining a register variable does not reserve the register. Other than when invoking the Extended asm, the contents of the specified register are not guaranteed. For this reason, the following uses are explicitly not supported. If they appear to work, it is only happenstance, and may stop working as intended due to (seemingly) unrelated changes in surrounding code, or even minor changes in the optimization of a future version of gcc. ...
It's also broken in practice with clang where sp is treated like any other uninitialized variable.
In addition to duedl0r's answer with specifically GCC you could use __builtin_frame_address(0) which is GCC specific (but not x86 specific).
This should also work on Clang (but there are some bugs about it).
Taking the address of a local (as JaredPar answered) is also a solution.
Notice that AFAIK the C standard does not require any call stack in theory.
Remember Appel's paper: garbage collection can be faster than stack allocation; A very weird C implementation could use such a technique! But AFAIK it has never been used for C.
One could dream of a other techniques. And you could have split stacks (at least on recent GCC), in which case the very notion of stack pointer has much less sense (because then the stack is not contiguous, and could be made of many segments of a few call frames each).
On Linuxyou can use the proc pseudo-filesystem to print the stack pointer.
Have a look here, at the /proc/your-pid/stat pseudo-file, at the fields 28, 29.
startstack %lu
The address of the start (i.e., bottom) of the
stack.
kstkesp %lu
The current value of ESP (stack pointer), as found
in the kernel stack page for the process.
You just have to parse these two values!
You can also use an extended assembler instruction, for example:
#include <stdint.h>
uint64_t getsp( void )
{
uint64_t sp;
asm( "mov %%rsp, %0" : "=rm" ( sp ));
return sp;
}
For a 32 bit system, 64 has to be replaced with 32, and rsp with esp.
You have that info in the file /proc/<your-process-id>/maps, in the same line as the string [stack] appears(so it is independent of the compiler or machine). The only downside of this approach is that for that file to be read it is needed to be root.
Try lldb or gdb. For example we can set backtrace format in lldb.
settings set frame-format "frame #${frame.index}: ${ansi.fg.yellow}${frame.pc}: {pc:${frame.pc},fp:${frame.fp},sp:${frame.sp}} ${ansi.normal}{ ${module.file.basename}{\`${function.name-with-args}{${frame.no-debug}${function.pc-offset}}}}{ at ${ansi.fg.cyan}${line.file.basename}${ansi.normal}:${ansi.fg.yellow}${line.number}${ansi.normal}{:${ansi.fg.yellow}${line.column}${ansi.normal}}}{${function.is-optimized} [opt]}{${frame.is-artificial} [artificial]}\n"
So we can print the bp , sp in debug such as
frame #10: 0x208895c4: pc:0x208895c4,fp:0x01f7d458,sp:0x01f7d414 UIKit`-[UIApplication _handleDelegateCallbacksWithOptions:isSuspended:restoreState:] + 376
Look more at https://lldb.llvm.org/use/formatting.html
You can use setjmp. The exact details are implementation dependent, look in the header file.
#include <setjmp.h>
jmp_buf jmp;
setjmp(jmp);
printf("%08x\n", jmp[0].j_esp);
This is also handy when executing unknown code. You can check the sp before and after and do a longjmp to clean up.
If you are using msvc you can use the provided function _AddressOfReturnAddress()
It'll return the address of the return address, which is guaranteed to be the value of RSP at a functions' entry. Once you return from that function, the RSP value will be increased by 8 since the return address is pop'ed off.
Using that information, you can write a simple function that return the current address of the stack pointer like this:
uintptr_t GetStackPointer() {
return (uintptr_t)_AddressOfReturnAddress() + 0x8;
}
int main(int argc, const char argv[]) {
uintptr_t rsp = GetStackPointer();
printf("Stack pointer: %p\n", rsp);
}
Showcase
You may use the following:
uint32_t msp_value = __get_MSP(); // Read Main Stack pointer
By the same way if you want to get the PSP value:
uint32_t psp_value = __get_PSP(); // Read Process Stack pointer
If you want to use assembly language, you can also use MSP and PSP process:
MRS R0, MSP // Read Main Stack pointer to R0
MRS R0, PSP // Read Process Stack pointer to R0

Function body on heap

A program has three sections: text, data and stack. The function body lives in the text section. Can we let a function body live on heap? Because we can manipulate memory on heap more freely, we may gain more freedom to manipulate functions.
In the following C code, I copy the text of hello function onto heap and then point a function pointer to it. The program compiles fine by gcc but gives "Segmentation fault" when running.
Could you tell me why?
If my program can not be repaired, could you provide a way to let a function live on heap?
Thanks!
Turing.robot
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
void
hello()
{
printf( "Hello World!\n");
}
int main(void)
{
void (*fp)();
int size = 10000; // large enough to contain hello()
char* buffer;
buffer = (char*) malloc ( size );
memcpy( buffer,(char*)hello,size );
fp = buffer;
fp();
free (buffer);
return 0;
}
My examples below are for Linux x86_64 with gcc, but similar considerations should apply on other systems.
Can we let a function body live on heap?
Yes, absolutely we can. But usually that is called JIT (Just-in-time) compilation. See this for basic idea.
Because we can manipulate memory on heap more freely, we may gain more freedom to manipulate functions.
Exactly, that's why higher level languages like JavaScript have JIT compilers.
In the following C code, I copy the text of hello function onto heap and then point a function pointer to it. The program compiles fine by gcc but gives "Segmentation fault" when running.
Actually you have multiple "Segmentation fault"s in that code.
The first one comes from this line:
int size = 10000; // large enough to contain hello()
If you see x86_64 machine code generated by gcc of your
hello function, it compiles down to mere 17 bytes:
0000000000400626 <hello>:
400626: 55 push %rbp
400627: 48 89 e5 mov %rsp,%rbp
40062a: bf 98 07 40 00 mov $0x400798,%edi
40062f: e8 9c fe ff ff call 4004d0 <puts#plt>
400634: 90 nop
400635: 5d pop %rbp
400636: c3 retq
So, when you are trying to copy 10,000 bytes, you run into a memory
that does not exist and get "Segmentation fault".
Secondly, you allocate memory with malloc, which gives you a slice of
memory that is protected by CPU against execution on Linux x86_64, so
this would give you another "Segmentation fault".
Under the hood malloc uses system calls like brk, sbrk, and mmap to allocate memory. What you need to do is allocate executable memory using mmap system call with PROT_EXEC protection.
Thirdly, when gcc compiles your hello function, you don't really know what optimisations it will use and what the resulting machine code looks like.
For example, if you see line 4 of the compiled hello function
40062f: e8 9c fe ff ff call 4004d0 <puts#plt>
gcc optimised it to use puts function instead of printf, but that is
not even the main problem.
On x86 architectures you normally call functions using call assembly
mnemonic, however, it is not a single instruction, there are actually many different machine instructions that call can compile to, see Intel manual page Vol. 2A 3-123, for reference.
In you case the compiler has chosen to use relative addressing for the call assembly instruction.
You can see that, because your call instruction has e8 opcode:
E8 - Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode.
Which basically means that instruction pointer will jump the relative amount of bytes from the current instruction pointer.
Now, when you relocate your code with memcpy to the heap, you simply copy that relative call which will now jump the instruction pointer relative from where you copied your code to into the heap, and that memory will most likely not exist and you will get another "Segmentation fault".
If my program can not be repaired, could you provide a way to let a function live on heap? Thanks!
Below is a working code, here is what I do:
Execute, printf once to make sure gcc includes it in our binary.
Copy the correct size of bytes to heap, in order to not access memory that does not exist.
Allocate executable memory with mmap and PROT_EXEC option.
Pass printf function as argument to our heap_function to make sure
that gcc uses absolute jumps for call instruction.
Here is a working code:
#include "stdio.h"
#include "string.h"
#include <stdint.h>
#include <sys/mman.h>
typedef int (*printf_t)(char* format, char* string);
typedef int (*heap_function_t)(printf_t myprintf, char* str, int a, int b);
int heap_function(printf_t myprintf, char* str, int a, int b) {
myprintf("%s", str);
return a + b;
}
int heap_function_end() {
return 0;
}
int main(void) {
// By printing something here, `gcc` will include `printf`
// function at some address (`0x4004d0` in my case) in our binary,
// with `printf_t` two argument signature.
printf("%s", "Just including printf in binary\n");
// Allocate the correct size of
// executable `PROT_EXEC` memory.
size_t size = (size_t) ((intptr_t) heap_function_end - (intptr_t) heap_function);
char* buffer = (char*) mmap(0, (size_t) size,
PROT_EXEC | PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
memcpy(buffer, (char*)heap_function, size);
// Call our function
heap_function_t fp = (heap_function_t) buffer;
int res = fp((void*) printf, "Hello world, from heap!\n", 1, 2);
printf("a + b = %i\n", res);
}
Save in main.c and run with:
gcc -o main main.c && ./main
In principle in concept it is doable. However... You are copying from "hello" which basically contains assembly instructions that possibly call or reference or jump to other addresses. Some of these addresses get fixed up when the application loads. Just copying that and calling into it would then crash. Also some systems like windows have data execution protection that would prevent code in data form being executed, as a security measure. Also, how large is "hello"? Trying to copy past the end of it would likely also crash. And you are also dependent on how the compiler implements "hallo". Needless to say, this would be very compiler and platform dependent, if it worked.
I can imagine that this might work on a very simple architecture or with a compiler designed to make it easy.
A few of the many requirements for this work:
All memory references would need to be absolute ... no pc-relative addresses, except . . .
Certain control transfers would need to be pc-relative (so your copied function's local branches work) but it would be nice if other ones would just happen to be absolute, so your module's external control transfers, like printf(), would work.
There are more requirements. Add to this the wierdness of doing this in what is likely to already be a highly complex dynamically linked environment (did you static link it?) and you simply are not ever going to get this to work.
And as Adam points out, there are security mechanisms in place, at least for the stack, to prevent dynamically constructed code from executing at all. You may need to figure out how to turn these off.
You might also be getting clobbered with the memcpy().
You might learn something by tracing this through step-by-step and watching it shoot itself in the head. If the memcpy hack is the problem, perhaps try something like:
f() {
...
}
g() {
...
}
memcpy(dst, f, (intptr_t)g - (intptr_t)f)
You program is segfaulting because you're memcpy'ing more than just "hello"; that function is not 10000 bytes long, so as soon as you get past hello itself, you segfault because you're accessing memory that doesn't belong to you.
You probably also need to use mmap() at some point to make sure the memory location you're trying to call is actually executable.
There are many systems that do what you seem to want (e.g., Java's JIT compiler creates native code in the heap and executes it), but your example will be way more complicated than that because there's no easy way to know the size of your function at runtime (and it's even harder at compile time, when the compiler hasn't yet decide what optimizations to apply). You can probably do what objdump does and read the executable at runtime to find the right "size", but I don't think that's what you're actually trying to achieve here.
After malloc you should check that the pointer is not null buffer = (char*) malloc ( size );
memcpy( buffer,(char*)hello,size ); and it might be your problem since you try to allocate a big area in memory. can you check that?
memcpy( buffer,(char*)hello,size );
hello is not a source get copied to buffer. You are cheating the compiler and it is taking it's revenge at run-time. By typecasting hello to char*, the program is making the compiler to believe it so, which is not the case actually. Never out-smart the compiler.

Resources