I am wondering if there is a way to write a function in a C program that walks through another function to locate the address at which a particular instruction occurs.
For example, I want to find the address of the ret instruction in the main function.
My first thought is to write a while loop that begins at "&main" and increments the address by 1 on each iteration until the instruction at the current address is "ret", and then returns that address.
It is certainly possible to write a program that disassembles machine code. (Obviously, this is architecture-specific. A program like this works only for the architectures it is designed for.) And such a program could take the address of its main routine and examine it. (In some C implementations, a pointer to a function is not actually the address of the code of the function. However, a program designed to disassemble code would take this into account.)
This would be a task of considerable difficulty for a novice.
Your program would not increment the address by one byte between instructions. Many architectures have a fixed instruction size of four bytes, although other sizes are possible. The x86-64 architecture (known by various names) has variable instruction sizes. Disassembling it is fairly complicated. As part of the process of disassembling an instruction, you have to figure out how big it is, so you know where the next instruction is.
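To make the point about instruction boundaries concrete, here is a minimal sketch of the byte-by-byte loop from the question, run against a small hand-assembled x86 byte sequence rather than a real function. The names naive_find_ret and sample_code are made up for this illustration, and the sample assumes the x86 encodings b8 (mov eax, imm32) and c3 (near ret).

```c
#include <stddef.h>

/* Opcode of the one-byte near "ret" instruction on x86 and x86-64. */
#define X86_RET 0xC3

/*
 * Naive scan: returns the offset of the first byte equal to 0xC3,
 * or -1 if there is none. It does NOT decode instructions, so it
 * cannot tell an opcode from an operand byte.
 */
long naive_find_ret(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (code[i] == X86_RET)
            return (long)i;
    }
    return -1;
}

/*
 * Hand-assembled x86 sample (illustrative, not taken from a real main):
 *   b8 c3 00 00 00    mov eax, 0xc3
 *   c3                ret
 * The real ret is at offset 5, but the immediate operand of the
 * mov also contains the byte 0xc3, at offset 1.
 */
const unsigned char sample_code[] = { 0xB8, 0xC3, 0x00, 0x00, 0x00, 0xC3 };
```

Calling naive_find_ret(sample_code, sizeof sample_code) reports offset 1, which is the middle of the mov instruction, not the ret at offset 5. A real solution has to decode each instruction's length before stepping to the next one.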
In general, though, it is not always feasible to determine which return instruction is the one executed by main when it is done. Although functions are often written in a straightforward way, they may jump around. A function may have multiple return statements. Its code may be in multiple non-contiguous places, and it might even share code with other functions. (I do not know if this is common practice in common compilers, but it could be.) And, of course main might not ever return (and, if the compiler detects this, it might not bother writing a return instruction at all).
(Incidentally, there is a mathematical proof that it is impossible to write a program that always determines whether a program terminates or not. This is called the Halting Problem.)
I am trying to understand the buffer overflow exploit and, more specifically, how it can be used to run one's own code, e.g. by starting our own malicious application or anything similar.
While I do understand the idea of the buffer overflow exploit using the gets() function (overwriting the return address with a long enough string and then jumping to the said address), there are a few things I am struggling to understand in a real application, those being:
Do I put my own code into the string just behind the return address? If so, how do I know the address to jump to? And if not, where do I jump and where is the actual code located?
Is the actual payload that runs the code my own software that's running and the other program just jumps into it or are all the instructions provided in the payload? Or more specifically, what does the buffer overflow exploit implementation actually look like?
What can I do when the address (or any instruction) contains 0? gets() function stops reading when it reads 0 so how is it possible to get around this problem?
As homework, I am trying to exploit a very simple program that just asks for input with gets() (ASLR turned off) and then prints it. While I can find the memory address of the function which calls it and of the return, I just can't figure out how to actually implement the exploit.
You understand how changing the return address lets you jump to an arbitrary location.
But as you have correctly identified, you don't know where the code you want to execute has been loaded. You just copied it into a local buffer (which is most likely somewhere on the stack).
But there is something that always points to this stack, and that is the stack pointer register. (Let's assume x64, so it would be %rsp.)
Assume your custom code is at the top of the stack. (It could be at an offset, but that can be managed similarly.)
Now we need an instruction that
1. Allows us to jump to the address in %rsp
2. Is located at a fixed address.
So most binaries use some kind of shared libraries. On Windows you have kernel32.dll. With ASLR turned off, this library is mapped at the same address in every program that loads it, so you know the exact location of every instruction in it.
All you have to do is disassemble one such library and find an instruction like
jmp *%rsp // or a sequence of instructions that lets you jump to an offset
Then the address of this instruction is what you will place where the return address is supposed to be.
The function will then return and jump to the stack (of course, you need an executable stack for this). Then it will execute your arbitrary code.
Hope that clears some confusion on how to get the exploit running.
To answer your other questions -
Yes you can place your code in the buffer directly. Or if you can find the exact code you want to execute (again in a shared library), you can simply jump to that.
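gets() itself stops only at a newline (or end of input) and stores a 0 byte like any other, but a 0 byte still truncates the payload if the buffer later passes through string functions such as strcpy(). You can usually get around this by changing your instructions a bit so that the assembled code doesn't contain these bytes at all.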
You try different instructions and check the assembled bytes.
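Checking the assembled bytes can be automated. The following is only a sketch: first_bad_byte and the three arrays are hypothetical names, the forbidden set (0x00 and 0x0A) is an assumption about which bytes the input path cannot carry, and the two byte sequences are the standard x86 encodings of mov eax, 0 and xor eax, eax.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns the offset of the first byte of `payload` that also appears in
 * `bad` (a list of `nbad` forbidden byte values), or -1 if there is none.
 */
long first_bad_byte(const unsigned char *payload, size_t len,
                    const unsigned char *bad, size_t nbad)
{
    for (size_t i = 0; i < len; i++) {
        if (memchr(bad, payload[i], nbad) != NULL)
            return (long)i;
    }
    return -1;
}

/* Bytes to avoid in this sketch (an assumption): NUL and newline. */
const unsigned char bad_bytes[] = { 0x00, 0x0A };

/* mov eax, 0  ->  b8 00 00 00 00  (contains NUL bytes) */
const unsigned char mov_eax_zero[] = { 0xB8, 0x00, 0x00, 0x00, 0x00 };

/* xor eax, eax  ->  31 c0  (also leaves eax at zero, no NUL bytes) */
const unsigned char xor_eax_eax[] = { 0x31, 0xC0 };
```

Here first_bad_byte flags offset 1 of the mov encoding and returns -1 for the xor encoding, which is the kind of substitution the answer describes.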
I just saw the following line in some code:
#define ERR_FATAL( str, a, b, c ) {while(1) {*(unsigned int *)0 = 0xdeadbeef;} }
I know that 0xdeadbeef means error, but what does writing this value to address 0 mean?
What does address 0 represent?
The address 0x0 is recognized by the compiler as a NULL pointer and is an implementation-defined invalid memory address. On some systems it is quite literally the first addressable location in system memory, but on others it's just a placeholder in the code for some generic invalid memory address. From the point of view of the C code we don't know what address that will be, only that it's invalid to access it.
Essentially, this code snippet is trying to write the value 0xdeadbeef to an "illegal" memory address. The value itself is hex that spells out "dead beef", indicating that the program is dead beef (i.e. there is a problem); if you aren't a native English speaker I can see how this might not be so clear :). The idea is that this will trigger a segmentation fault or similar, with the intention of informing you that there is a problem by immediately terminating the program with no cleanup or other operations performed in the interim (the macro name ERR_FATAL hints at that).
Most operating systems don't give programs direct access to all the memory in the system, and the code presumes that the operating system won't let you directly access memory located at address 0x0. Given that you tagged this question with the linux tag, this is the behavior you will see (because Linux will not allow an access to that memory address). Note that if you are working on something like an embedded system where there's no such guarantee, this could cause a bunch of problems, as you might be overwriting something important.
Note that there are better ways to report problems than this, ones that don't depend on certain types of undefined behavior causing certain side effects. Using something like assert is likely a better choice. If you want to terminate the program, abort() is a better choice, as it is in the standard library and does exactly what you want. See the answer from ComicSansMS for more about why this is preferable.
Putting this value (or any value for that matter) in address 0 is supposed to terminate the program immediately.
A fatal error indicates an error situation so severe that the program cannot safely continue execution without risking further data corruption.
In particular, no pending cleanup operations are to be executed, as they could have an undesired effect. Think of flushing a buffer of corrupted data to the filesystem. Instead, the program is to terminate immediately and allow further examination of the situation via a core dump or an attached debugger.
Note that C already provides a standard library function for this purpose: abort(). Calling abort would be preferable for this purpose for a number of reasons:
Writing to address 0 is not guaranteed to terminate the program. This unnecessarily restricts portability of the code and might have devastating consequences in case the code actually gets recompiled and executed on a platform where writing to address 0 results in an actual memory store operation.
Calling abort() is more understandable to someone reading the code. Your question proves that many developers will not understand what the code in question is supposed to do. While the name of the macro and the value deadbeef give some hints, it is still unnecessarily obscure. Also, note that the name of the macro will not be visible when looking at the disassembled code in a debugger.
Calling abort() signals intent more clearly. This is not only true for the code itself, but also for the observable behavior of the binary. Assuming the operation executes as intended on a Unix machine, you would get a SIGSEGV as a result, which is a signal indicating memory corruption. abort() on the other hand causes SIGABRT which indicates an abnormal program termination. Unless the reason for the fatal error was indeed a memory corruption, throwing SIGSEGV in this case obscures why the program is failing and might be misleading to a developer trying to hunt down the error. This is particularly delicate when you think that the signal might not be caught by a debugger, but by an automated signal handler, which might then invoke unfitting code for error handling.
Therefore, if the sole intent of the macro is to signal a fatal error (as the name suggests), calling abort would be a better implementation.
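As a rough sketch of what an abort()-based version could look like: the macro below keeps the four-parameter shape from the question but ignores a, b and c, since their meaning is not shown there. The helper fatal_error_signal is a hypothetical name and assumes a POSIX system (fork and waitpid); it exists only to observe which signal ends the process.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Report the message on stderr, then terminate via abort() (raises SIGABRT). */
#define ERR_FATAL(str, a, b, c)                                        \
    do {                                                               \
        (void)(a); (void)(b); (void)(c); /* unused in this sketch */   \
        fprintf(stderr, "fatal: %s (%s:%d)\n", (str), __FILE__, __LINE__); \
        abort();                                                       \
    } while (0)

/*
 * Runs ERR_FATAL in a child process and returns the number of the
 * signal that terminated it, or -1 if it was not killed by a signal.
 */
int fatal_error_signal(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        ERR_FATAL("demo failure", 0, 0, 0);  /* never returns */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

On a typical Unix system fatal_error_signal() returns SIGABRT, so a debugger or crash reporter sees an abnormal termination rather than what looks like a memory fault.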
Dereferencing an invalid pointer (which is what the above code is trying to do) results in Undefined Behavior. As such the system could do anything - it could crash, it could do nothing, or demons can fly out of your nose. This page is an excellent page on what every C programmer should know about Undefined Behavior.
Thus the above code is the wrong way to cause a crash. As a commenter pointed out, it would be better to call abort().
While studying TSR programming I have seen some code which I cannot understand.
The example code is (in C):
(please explain the bolded sections)
#include "dos.h"
#include"stdio.h"
void interrupt our();
void interrupt (*prev)();
char far *scr=(char far*)0xB8000000L;
int ticks;
unsigned char color;
main()
{
prev=getvect(8); // <<<<<
setvect(8,our); // <<<<<
keep(0,10000); // <<<<<
}
You would partially understand this code if you read the answer I posted to your similar question on TSRs:
How to write a TSR which changes case of characters
The most important things here are
Far pointer: Since 16-bit DOS used a segment:offset addressing scheme, a normal near pointer could not access memory beyond the 64K of its own segment. You will have to read up on the details to understand it.
Video memory address: 0xB8000000 (segment B800, offset 0000) is the address for which you need a far pointer. The special thing about this address is that the bytes starting at this location (two per character cell: the character code followed by an attribute byte) are mapped directly to the text-mode screen.
So if you assign a character through such a pointer, it will be printed on the screen.
Something like
char far *p = (char far *)0xB8000000L;
*p = 'a'; // this would actually print a at the top left of the screen
Loop forward to get to the rest of the screen.
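The indexing can be sketched in portable C by writing into an ordinary array instead of real video memory. This is a simulation only: screen and put_cell are made-up names, and the 80x25 layout is an assumption matching the common color text mode. On DOS, the video argument would be the far pointer shown above.

```c
#include <stddef.h>

#define TEXT_COLS 80
#define TEXT_ROWS 25

/* Stand-in for the memory at B800:0000 (an ordinary array in this sketch). */
unsigned char screen[TEXT_ROWS * TEXT_COLS * 2];

/* Stores a character and its attribute byte in the cell at (row, col). */
void put_cell(unsigned char *video, int row, int col,
              char ch, unsigned char attr)
{
    size_t offset = ((size_t)row * TEXT_COLS + (size_t)col) * 2;
    video[offset] = (unsigned char)ch;   /* even byte: character code */
    video[offset + 1] = attr;            /* odd byte: color attribute */
}
```

For example, put_cell(screen, 1, 2, 'a', 0x07) lands at byte offset (1 * 80 + 2) * 2 = 164, with the attribute 0x07 (light gray on black) in the byte after it.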
There was a C book by Yashwant Kanetkar which had a good deal of reference material for this. I remember using it in my undergrad many years ago.
The rest of them are just API functions in dos.h. Why don't you go through their descriptions and get back here if you don't understand any?
This program installs an interrupt handler. It uses interrupt number 8, the system timer interrupt. It was common practice to use this interrupt to "continuously" do stuff on a machine running DOS.
prev=getvect(8);
This line gets the interrupt vector, that is, a pointer to the function that the system calls about 18 times per second.
setvect(8,our);
This line sets the interrupt vector, that is, tells the system to call this function, instead of the old function, 18 times per second. Note that to avoid a crash, the new function must call the old function, in addition to its main purpose (which seems to be changing the case of characters).
keep(0,10000);
This line makes the program exit with exit code 0 (a conventional value for success) and tells DOS to keep 10000 units of the program's memory resident in RAM (the underlying Int 21/AH=31h call counts the size in 16-byte paragraphs, not bytes). This is unlike normal completion of a program (exit(0)), where DOS marks all RAM previously occupied by the program as free.
A common cause of crashes in TSR programs is the absence of keep at the end. DOS releases the memory occupied by the code of the function our, and within the next 1/18 of a second, a random piece of code is executed.
See Int 21/AH=31h for more information.
Please note also that the parameter to keep should be calculated by manipulating some addresses, so that you don't take too much memory, and on the other hand, take enough memory to contain the code of the function our, which performs the stuff you need. The value 10000 is just an example.
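The save/replace/chain pattern behind getvect and setvect can be sketched in portable C with a simulated vector table. Nothing here touches real interrupts: get_vector, set_vector, install_timer_hook and fire_timer are hypothetical stand-ins, and the table is an ordinary array of function pointers.

```c
#include <stddef.h>

typedef void (*handler_t)(void);

/* Simulated interrupt vector table (an ordinary array, not the real one). */
static handler_t vector_table[256];

handler_t get_vector(int n) { return vector_table[n]; }
void set_vector(int n, handler_t h) { vector_table[n] = h; }

int old_ticks;  /* counts calls to the original handler */
int our_ticks;  /* counts calls to the replacement handler */

static handler_t prev_handler;

static void original_timer(void) { old_ticks++; }

static void our_timer(void)
{
    our_ticks++;            /* the TSR's own work would go here */
    if (prev_handler != NULL)
        prev_handler();     /* chain to the handler installed before us */
}

/* Mirrors the getvect/setvect pair from the question, on the simulated table. */
void install_timer_hook(void)
{
    set_vector(8, original_timer);  /* pretend the system handler is in place */
    prev_handler = get_vector(8);   /* like prev = getvect(8) */
    set_vector(8, our_timer);       /* like setvect(8, our) */
}

/* Simulates one timer tick by calling whatever is installed in vector 8. */
void fire_timer(void)
{
    handler_t h = get_vector(8);
    if (h != NULL)
        h();
}
```

After install_timer_hook(), every fire_timer() call increments both counters, which is the chaining behavior the answer says is needed to avoid a crash.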
I know this is a "heavier" question, but I think it's interesting too. It was part of my previous question about compiler functions, but back then I explained it very badly, and many people answered just my first question, so here it is:
So, if my knowledge is correct, modern Windows systems use paging as a way to switch tasks and ensure that each task has its proper place in memory. So, every process gets its own address space starting from 0.
When multitasking goes into effect, the kernel has to save all important registers to the task's stack (I believe), then save the current stack pointer, change the page table entry to switch to another process's physical address space, load the new process's stack pointer, pop the saved registers, and continue by jumping to the popped instruction pointer address.
Because of this nice feature (paging), every process thinks it has a nice flat memory within reach. So, there are no far jumps, far pointers, memory segments or data segments. All is nice and linear.
But when there is no more segmentation for the process, why do compilers still create variables on the stack, or, when global, in some other memory space, rather than directly in the program code?
Let me give an example. I have this C code: int a=10;
which gets translated into (Intel syntax): mov [position of a],#10
But then you actually occupy more bytes in RAM than needed, because the first few bytes are taken by the actual instruction, and after that instruction is done, there is a new byte containing the value 10.
Why, instead of this, when there is no need to switch any segment (which would slow the process down), isn't the value 10 just coded directly into the program like this:
xor eax,eax //just some instruction
10 //the value inserted into the program
call end //just some instruction
Because the compiler knows the exact position of every instruction, when operating with that variable it would just use its address.
I know that const variables do this, but they are not really variables, since you cannot change them.
I hope I explained my question well, but I am still learning English, so forgive my syntactical and even semantic errors.
EDIT:
I have read your answers, and it seems that based on those I can modify my question:
So, someone here said that a global variable is actually a value attached directly to the program. I mean, when a variable is global, is it attached to the end of the program, or just created like a local one at the time of execution, but on the heap instead of on the stack?
If it is the first case (attached to the program itself), why do local variables exist at all? I know you will tell me it's because of recursion, but that is not the case. When you call a function, you can push any memory space onto the stack, so there is no program there.
I hope you understand me: there is always inefficient use of memory when some value (even 0) is created on the stack by some instruction, because you need space in the program for that instruction and then for the actual variable. Like so: push #5 //instruction that says to create a local variable with the integer 5
And then this instruction just puts the number 5 on the stack. Please help me, I really want to know why it's this way. Thanks.
Consider:
local variables may have more than one simultaneous existence if a routine is called recursively (even indirectly, as in, say, a recursive descent parser) or from more than one thread, and these cases occur in the same memory context
marking the program memory non-writable and the stack+heap as non-executable is a small but useful defense against certain classes of attacks (stack smashing...) and is used by some OSs (I don't know if Windows does this, however)
Your proposal doesn't allow for either of these cases.
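The recursion case can be seen in a small sketch. The two functions below are made up for illustration and differ only in where saved lives: on the stack (one copy per call) or in a single fixed location, which is what storing the variable at one address in the program would amount to.

```c
/* Each call gets its own copy of `saved` on the stack. */
int sum_local(int n)
{
    int saved = n;
    if (n == 0)
        return 0;
    int rest = sum_local(n - 1);
    return saved + rest;    /* `saved` still holds this call's n */
}

/* All calls share one copy of `saved` at a fixed address. */
int sum_static(int n)
{
    static int saved;
    saved = n;
    if (n == 0)
        return 0;
    int rest = sum_static(n - 1);
    return saved + rest;    /* `saved` was overwritten by the inner calls */
}
```

sum_local(3) adds up 3 + 2 + 1 and returns 6. sum_static(3) returns 0, because the innermost call leaves saved at 0 and every outer call then reads that shared value.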
So, there is no far jumps, far pointers, memory segment or data segment. All is nice and linear.
Yes and no. Different program segments have different purposes - despite the fact that they reside within flat virtual memory. E.g. data segment is readable and writable, but you can't execute data. Code segment is readable and executable, but you can't write into it.
why does still compilers create variables on the stack, [...] than directly in program code?
Simple.
The code segment isn't writable. First, for safety reasons. Second, most CPUs do not like having the code segment written into, as it breaks many existing optimizations used to accelerate execution.
The state of a function has to be private to the function due to things like recursion and multi-threading.
isn´t just a value of 10 coded directly into program like this
Modern CPUs prefetch instructions to allow things like parallel execution and out-of-order execution. Putting garbage (to the CPU, data is garbage) into the code segment would simply diminish (or flat out cancel) the effect of these techniques. And they are responsible for the lion's share of the performance gains CPUs have shown in the past decade.
when there is no need to switch any segment
So if there is no overhead from switching segments, why then put it into the code segment? There is no problem with keeping it in the data segment.
Especially in case of read-only data segment, it makes sense to put all read-only data of the program into one place - since it can be shared by all instances of the running application, saving physical RAM.
Because the compiler knows the exact position of every instruction, when operating with that variable it would just use its address.
No, not really. Most code is relocatable or position independent. The code is patched with real memory addresses when the OS loads it into memory. In fact, special techniques are used to avoid patching the code so that the code segment, too, can be shared by all running instances of the application.
The ABI is responsible for defining what the compiler and linker are supposed to do, and how, for a program to be executable by a complying OS. I haven't seen the Windows ABI, but the ABIs used by Linux are easy to find: search for "AMD64 ABI". Even reading the Linux ABI might answer some of your questions.
What you are talking about is optimization, and that is the compiler's business. If nothing ever changes that value, and the compiler can figure that out, then the compiler is perfectly free to do just what you say (unless a is declared volatile).
Now if you are saying that you are seeing that the compiler isn't doing that, and you think it should, you'd have to talk to your compiler writer. If you are using VisualStudio, their address is One Microsoft Way, Redmond WA. Good luck knocking on doors there. :-)
Why isn't the value 10 just coded directly into the program like this:
xor eax,eax //just some instruction
10 //the value inserted into the program
call end //just some instruction
That is how global variables are stored. However, instead of being stuck in the middle of executable code (which is messy, and not even possible nowadays), they are stored just after the program code in memory (in Windows and Linux, at least), in what's called the .data section.
When it can, the compiler will move variables to the .data section to optimize performance. However, there are several reasons it might not:
Some variables cannot be made global, including instance variables for a class, parameters passed into a function (obviously), and variables used in recursive functions.
The variable still exists in memory somewhere, and still must have code to access it. Thus, memory usage will not change. In fact, on the x86 ("Intel"), according to this page the instruction to reference a local variable:
mov eax, [esp+8]
and the instruction to reference a global variable:
mov eax, [0xb3a7135]
both take 1 (one!) clock cycle.
The only advantage, then, is that if every local variable is global, you wouldn't have to make room on the stack for local variables.
Adding a variable to the .data segment may actually increase the size of the executable, since the variable is actually contained in the file itself.
As caf mentions in the comments, stack-based variables only exist while the function is running - global variables take up memory during the entire execution of the program.
I'm not quite sure what your confusion is.
int a = 10; means: make a spot in memory, and put the value 10 at that memory address.
If you just want a to be 10:
#define a 10
though more typically
#define TEN 10
Variables have storage space and can be modified. It makes no sense to stick them in the code segment, where they cannot be modified.
If you have code with int a=10 or even const int a=10, the compiler cannot in general convert code which references 'a' to use the constant 10 directly, because it has no way of knowing whether 'a' may be changed behind its back (even const variables can be changed). For example, one way 'a' can be changed without the compiler knowing is if you have a pointer which points to 'a'. Pointers are not fixed at runtime, so the compiler cannot always determine at compile time whether there will be a pointer which will point to and modify 'a'.
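A minimal sketch of the pointer case, for a non-const variable. The function names are hypothetical. Note that doing the same to an object actually defined const is undefined behavior in C, so this illustrates only the plain int a=10 situation.

```c
int a = 10;

/* Writes through a pointer; the callee has no idea which variable it points to. */
void store_through(int *p, int value)
{
    *p = value;
}

int read_a_after_store(void)
{
    int *p = &a;
    store_through(p, 20);
    return a;   /* observes the new value, 20, not the initializer 10 */
}
```

Because a can be reached through p, a use of a after the store cannot simply be replaced by the literal 10.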