The program counter (PC) holds the address of either the currently executing instruction or the next instruction in line. For ARMv5, I believe it is the former.
I have encountered crashes where the PC (R15) value is zero. Can someone tell me the significance of that? And is there some way (some other register) to find the address of the instruction that was executing?
Any help would be highly appreciated.
Some code probably tried to call a null function-pointer. Check the stack to see where the call came from.
In general (in ARM terminology), it would be a prefetch abort.
This means the CPU tried to read (prefetch) an instruction from an illegal address, which caused the abort.
You can try to work out how that memory location became invalid to find out more about the cause.
I was trying to write my own bootloader for an Atmel AVR microcontroller. I referred to a code base on GitHub, and I would like to thank ZEVERO for it.
I understand the code base at a basic level, but at line 224 I found this line:
**if (pgm_read_word(0) != 0xFFFF) ((void(*)(void))0)(); //EXIT BOOTLOADER**
I understand the if condition, but I am trying to understand the statement that runs when it is true, i.e.
**((void(*)(void))0)();**
The code's author has explained this with the comment //EXIT BOOTLOADER.
My first question is: what is the meaning of this complex expression?
**((void(*)(void))0)();**
My second question is: does it exit the execution of the code on the microcontroller?
As @iBug pointed out, ((void(*)(void))0)(); invokes a function call on a NULL function pointer.
In effect, that transfers program control to memory address 0. Now, on a workstation, that would be colossal UB, most likely resulting in a segfault.
However, since the code in question is for a hardware bootloader, what happens is determined by the target platform rather than by the C standard: it (apparently) just exits the bootloader.
At the hardware level, almost everything is implementation dependent, and almost nothing is portable. You can't expect C code targeted at a specific hardware platform to be in any way representative of generally-accepted C patterns and practices.
((void(*)(void))0)(); tries to call a NULL function pointer. User programs (not bootloaders) for AVR microcontrollers usually start execution at address 0. AVR-GCC's ABI uses an all-0-bit representation of NULL function pointers, so this call will (among other things) transfer execution to the user program. Essentially, it works as a slower version of __asm__ __volatile__("jmp 0");, and assumes that the user program's startup code will reinitialize the stack pointer anyway.
Calling through a NULL function pointer is undefined behavior, so there's no guarantee that this trick will work with other compilers, later versions of GCC, or even different optimization settings.
The if (pgm_read_word(0) != 0xFFFF) check before the call is probably to determine if a user program is present: program memory words that have been erased but not written will read as 0xFFFF, while most programs start with a JMP instruction to skip over the rest of the interrupt vector table, and the first word of a JMP instruction is never 0xFFFF.
As has been pointed out before, calling this function simply results in a jump to address 0.
As the code at this address is typically not defined by your own program, but rather by the specific environment, behavior totally depends on this environment.
Your question is tagged as AVR/Atmel: on AVRs, jumping to address 0 simply results in a restart (nearly the same behavior as a hardware reset, but beware, the MCU will keep the interrupt enabled/disabled state, as opposed to a "real" reset). A "cleaner" program would probably use the watchdog timer for a "real" reset (wdt_enable() followed by an endless loop; note that wdt_reset() only restarts the watchdog counter and does not reset the MCU).
It will simply call the address 0 as if it was a function returning void and taking no arguments. Or... less simply the address that is the bit pattern of the null pointer. Or even less simply, the behaviour is undefined so it might do anything unexpected.
I am trying to understand the buffer overflow exploit and, more specifically, how it can be used to run one's own code, e.g. by starting our own malicious application or anything similar.
While I do understand the idea of the buffer overflow exploit using the gets() function (overwriting the return address with a long enough string and then jumping to the said address), there are a few things I am struggling to understand in real application, those being:
Do I put my own code into the string just behind the return address? If so, how do I know the address to jump to? And if not, where do I jump and where is the actual code located?
Is the actual payload that runs the code my own software that's running and the other program just jumps into it or are all the instructions provided in the payload? Or more specifically, what does the buffer overflow exploit implementation actually look like?
What can I do when the address (or any instruction) contains 0? gets() function stops reading when it reads 0 so how is it possible to get around this problem?
As homework, I am trying to exploit a very simple program that just asks for input with gets() (ASLR turned off) and then prints it. While I can find the memory address of the function that calls it and of the return, I just can't figure out how to actually implement the exploit.
You understand how changing the return address lets you jump to an arbitrary location.
But as you have correctly identified, you don't know where the code you want to execute has been loaded. You just copied it into a local buffer (which is most likely somewhere on the stack).
But there is something that always points to this stack: the stack pointer register. (Let's assume x64, where it is %rsp.)
Assume your custom code is at the top of the stack. (It could be at an offset, but that can be handled similarly.)
Now we need an instruction that
1. Allows us to jump to the address in %rsp
2. Is located at a fixed address.
Most binaries use some kind of shared library. On Windows you have kernel32.dll. With ASLR off, in every program that loads this library it is always mapped at the same address, so you know the exact location of every instruction in it.
All you have to do is disassemble one such library and find an instruction like
jmp *%rsp // or a sequence of instructions that lets you jump to an offset
Then the address of this instruction is what you will place where the return address is supposed to be.
The function will then return and jump to the stack (of course, you need an executable stack for this). Then it will execute your arbitrary code.
Hope that clears some confusion on how to get the exploit running.
To answer your other questions -
Yes, you can place your code in the buffer directly. Or, if you can find the exact code you want to execute (again, in a shared library), you can simply jump to that.
Strictly speaking, gets() stops only at \n (or at end of input); it copies a 0 byte like any other, although a 0 will cut the data short if it later passes through a string function such as strcpy. Either way, you can usually get away with changing your instructions a bit so that the code doesn't contain these bytes at all.
You try different instructions and check the assembled bytes.
If certain conditions are not met I want to crash my program by jumping to a random location. I also want to randomize the registers by statements like
asm("rdtsc \n");
asm ("movq %rax, %r15 \n");
...
asm ("xor %rbp, %r13 \n");
...
Is there a better/stealthier method to do this? I am concerned because rdtsc is not a frequent instruction in programs. Calling it continually generates similar results too. Besides this, can I somehow clear/randomize the stack content too?
If you just want to crash, your random choice of destination might jump somewhere legal. Just run the ud2 instruction (0F 0B), which is guaranteed to cause an invalid-instruction exception (leading to SIGILL) on every future x86 CPU. i.e. it's reserved, so no future instruction-set extension will ever use that two-byte sequence at the beginning of an instruction.
If you care about high-quality randomness to frustrate any potential backtrace or core dump, then call a random number generator to fill a buffer of random data (or just one 32bit random value which you repeat). Fill all the registers with that garbage data. In 32bit code, you could use a popa instruction to fill all the registers with that garbage data. In 64bit mode, you have to load them manually.
Then scribble over the stack with that data, so your program eventually stops with a segfault when you try to write to an unmapped address (because you've gone outside the stack area).
You could do that scribbling with a rep stosd or something.
As for "stealthier", you'll need to elaborate much more on what your threat model is and what you're trying to stop anyone from learning or doing. For example, are you defending against someone modifying your binary so that it doesn't crash this way?
In addition to Peter Cordes's suggestions, I would add that the OP wants the code responsible for this obfuscation to stay out of sight (stealthier). The instruction causing the crash needs to be somewhere else; otherwise the obfuscation code will be obvious from a crash dump, and the code will be easy to patch to remove the bomb.
A rather easy solution is to locate the RET opcode of a common library function such as read or strlen and jump there by pushing its address on the stack and executing a RET instruction. This solution is not perfect: advanced debuggers exist that store the execution trace and will be able to backtrack to the obfuscator from the crash location. To defeat that, you may prefer to enter an infinite loop instead of crashing, but that loop can be easily found and removed.
You can also embed some complex code in your app that computes for a while by executing many different functions in a random manner and use that as a honey pot to jump to from the obfuscator.
I'm sorry if this question is stupid or has been asked, but I couldn't find it.
I have a program on which I am attempting a buffer overflow. It is a simple program that uses getchar() to read input from the user. The buffer is set to size 12. I can get the program to crash by typing more than 12 x's or more than 12 \x78's, but it won't segfault if I type in hundreds of A's or \x41's.
Any help or pointing in the right direction would be greatly appreciated.
0x41414141 may be a valid address within a text page of the process. Look at the segment map of the process for details.
To eliminate guessing, look at the assembly code and then at the machine instructions of your program. Run it in a debugger and see what happens in memory. You can see at which addresses on the stack local variables are placed, and at which addresses registers, especially the instruction pointer, are saved on a function call.
Have you looked at examples like the stack overflow article on Wikipedia?
Following this question:
Good crash reporting library in c#
Is there any library like CrashRpt.dll that does the same on Linux? That is, generate a failure report including a core dump and any necessary environment and notify the developer about it?
Edit: This seems to be a duplicate of this question
See Getting stack traces on Unix systems, automatically on Stack Overflow.
Compile your code with debug symbols, enter unlimit coredumpsize in your shell (csh/tcsh; in bash use ulimit -c unlimited) and you'll get a core dump, by default in the process's working directory. Use gdb/ddd: open the program first and then open the core dump. You can check this out for additional info.
@Ionut
This handles generating the core dump, but it doesn't handle notifying the developer when other users had crashes.
Nathan, under what circumstances is a segment base non-zero? I've never seen that occur in my 5 years of Linux application development.
Thanks.
@Martin
I do architectural validation for x86, so I'm very familiar with the architecture the processor provides, but very unfamiliar with how it's used. That's what I based my comment on. If CR2 can be counted on to give the correct answer, then I stand corrected.
Nathan, I wasn't insisting that you were incorrect; I was just saying that in my (limited) experience with Linux, the segment base is always zero. Maybe that's a good question for me to ask...
Note: there are two interesting registers in an x86 seg-fault crash.
The first, EIP, specifies the code address at which the exception occurred. In RichQ's answer, he uses addr2line to show the source line that corresponds to the crash address. But EIP can be invalid; if you call a function pointer that is null, it can be 0x00000000, and if you corrupt your call stack, the return can pop any random value into EIP.
The second, CR2, specifies the data address which caused the segmentation fault. In RichQ's example, he is setting up i as a null pointer, then accessing it. In this case, CR2 would be 0x00000000. But if you change:
int j = *i;
to:
int j = i[2];
Then you are trying to access address 0x00000008 (assuming a 4-byte int), and that's what would be found in CR2.