I have these two programs just for understanding how pointers work. The first one is named test.c and here is the code.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *mem = malloc(sizeof(int) * 1);
*mem = 90;
//free(mem);
printf("%p", mem);
return (0);
}
So basically what I have is a program that allocates space for one integer and then prints that address to the standard output. In the commented-out line I free the allocated memory after assigning a value; I will explain why I commented it out later. Here is my second program, in a file called test1.c:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv)
{
int num = (int)strtol(argv[1], NULL, 16);
int *mem = (void *)(long)num;
printf("at test1 string> %s, changed to pointer-->%p, after being dereferenced-->%i\n",argv[1], mem, *mem);
return (0);
}
In this second program, an input is taken from the command line and then it is changed to a pointer (an address). It assumes what is passed is a hex string. What it does, is just to print the memory address passed and then try to get the value at that address.
What I did next was compile the files to test and test1 respectively using gcc (I am using Linux) and run the command ./test | xargs ./test1, which gives me the error xargs: ./test1: terminated by signal 11. I understand that this is a segmentation fault raised by test1, because if I don't dereference the pointer I don't get the error. But I don't understand why I am getting a seg fault. Even if I free the memory (uncomment the free call in the first program) I still get a segmentation fault. I was expecting to get some garbage value rather than a seg fault. I am just getting started with pointers, so I am surely missing something; I hope someone can explain or direct me to a resource.
To rephrase my question: how can a program access a specific memory address without allocating it?
You can't.
Memory addresses are divided into pages (usually 4096 bytes). When you access an address, the CPU looks up the page containing that address in the process's "page table". This table says where in physical memory (i.e. "which RAM chip") that address is.
So you cannot access addresses that aren't in your process's page table. Full stop.
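To make the page idea concrete, here is a minimal sketch of how an address splits into a page start and an offset within that page. The helper names page_base and page_offset are made up for illustration, and the 4096-byte page size in the usage note is an assumption (on POSIX systems the real value can be queried with sysconf(_SC_PAGESIZE)).

```c
#include <stdint.h>

/* Start address of the page that contains addr.
   page_size must be a power of two, e.g. 4096. */
static uintptr_t page_base(uintptr_t addr, uintptr_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Distance of addr from the start of its page. */
static uintptr_t page_offset(uintptr_t addr, uintptr_t page_size)
{
    return addr & (page_size - 1);
}
```

With 4096-byte pages, page_base(0x12345678, 4096) is 0x12345000 and page_offset(0x12345678, 4096) is 0x678. The page table is indexed by the page part; the offset is carried over unchanged.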
How do you get pages added to your page table? You ask the OS. On Windows the function is called VirtualAlloc. On Linux it's called mmap. Or, you use a function like malloc, which lets you allocate only a small part of a page (by splitting up the pages it gets from the OS).
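As a rough illustration of asking the OS directly, the sketch below requests one anonymous page with mmap, uses it, and gives it back. It assumes a Linux or similar POSIX system that supports MAP_ANONYMOUS; the function name mmap_one_page_demo is invented for this example.

```c
#define _DEFAULT_SOURCE   /* makes glibc expose MAP_ANONYMOUS in strict modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one fresh page, store a value in it, read it back, unmap it.
   Returns the value read back, or -1 if the OS refused. */
int mmap_one_page_demo(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    int *page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    page[0] = 90;          /* valid: this mapping belongs to our process */
    int value = page[0];

    /* After munmap the address is no longer mapped; touching it would fault. */
    munmap(page, (size_t)page_size);
    return value;
}
```

malloc does much the same thing behind the scenes, but hands out small pieces of the pages it obtained instead of whole pages.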
Also, every process has a different page table. So the addresses mean different things in different processes. Maybe the test process could have a page table entry for page 0x12345000, but the test1 process doesn't, because they're completely different tables. This is why it doesn't make sense to send pointers from one process to another.
In the old days of computing, there were no page tables and pointers were actual RAM addresses, but those days are long gone.
Edit: You can also ask the OS to put the same page in two different processes' page tables at the same time - this is called shared memory.
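Here is a small sketch of shared memory between a parent and a child process, assuming Linux or another POSIX system with MAP_ANONYMOUS. The function name shared_page_demo is hypothetical. The mapping is created with MAP_SHARED before fork, so both processes have the same physical page in their page tables.

```c
#define _DEFAULT_SOURCE   /* makes glibc expose MAP_ANONYMOUS in strict modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes into a shared mapping, the parent reads it afterwards.
   Returns the value the parent sees, or -1 on error. */
int shared_page_demo(void)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {            /* child process */
        *shared = 90;          /* lands in the page both processes share */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        munmap(shared, sizeof *shared);
        return -1;
    }

    int value = *shared;       /* the child's write is visible here */
    munmap(shared, sizeof *shared);
    return value;
}
```

Two unrelated programs such as test and test1 would need a named object instead (for example POSIX shm_open), since they do not share a parent mapping.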
So if I understand what you are doing, you're printing the address of a dynamically allocated block of memory from one program, cutting and pasting that as input to a second program, which then tries to access that address.
There are several reasons why this won't work.
First is what user253751 points out - addresses don't map across different processes. 0x1234 in process A maps to a different physical memory cell than 0x1234 in process B. There are ways of setting up shared memory between running processes, but it's a bit more work than this.
Secondly, you're using the wrong types. On a typical 64-bit system an int is not large enough to store a pointer value - after casting the result of strtol and assigning it to num you've almost certainly lost some digits, so casting that back to a pointer won't get you the right address.
The types intptr_t and uintptr_t in stdint.h are integer types large enough to store pointer values, but their implementation is optional, and it's still not a sure bet that strtol can accurately convert the input value (it returns a signed long, which is only 32 bits wide on some platforms).
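For the type problem alone, a more careful conversion could look like the sketch below. The function name parse_address is made up, and it assumes the platform provides uintptr_t. It only fixes the parsing; the resulting address still means nothing in a different process.

```c
#include <errno.h>
#include <inttypes.h>   /* strtoumax, uintmax_t */
#include <stdint.h>     /* uintptr_t, UINTPTR_MAX */

/* Parse a hexadecimal string (with or without a 0x prefix) into *out.
   Returns 0 on success, -1 if the text is not a complete hex number
   or the value does not fit in uintptr_t. */
int parse_address(const char *text, uintptr_t *out)
{
    char *end;

    errno = 0;
    uintmax_t value = strtoumax(text, &end, 16);

    if (end == text || *end != '\0')   /* no digits, or trailing junk */
        return -1;
    if (errno == ERANGE || value > UINTPTR_MAX)
        return -1;

    *out = (uintptr_t)value;
    return 0;
}
```

In test1.c this would replace the strtol line, followed by int *mem = (int *)addr; where addr is the parsed uintptr_t.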
While user253751 has given a good technical explanation, I want to put it in more beginner-friendly terms:
Your OS makes sure one process cannot access the memory of another one as this would mean that you could manipulate or destroy other programs or steal their data (passwords, for example).
The C language does not check pointers, so you can set them to whatever you want, but if you want to access this address, the OS is stopping you because it has security features.
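The isolation can be observed with fork, which gives the child a copy of the parent's address space. This is a sketch assuming a POSIX system; private_memory_demo is a made-up name. The variable has the same virtual address in both processes, yet the child's write never reaches the parent.

```c
#include <stddef.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child overwrites its own copy of value; the parent's copy is untouched.
   Returns the parent's value after the child has exited, or -1 on error. */
int private_memory_demo(void)
{
    volatile int value = 90;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {       /* child process */
        value = 12;       /* same virtual address, different physical memory */
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return value;         /* the parent still sees 90 */
}
```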
Related
I have two questions related to C programming and shellcoding (assembly) following below.
Question 1: Can anyone provide an answer on why putting two shellcodes in one program wouldn't work? I know it's related to the memory region but I need to know the exact reason. Program is compiled using gcc with the -zexecstack and -fno-stack-protector options.
#include <stdio.h>
#include <string.h>
main(int argc, char *argv[])
{
unsigned char shellcode[] = "\x01\x02<SHELLCODE>";
/* if the below line is uncommented it will result in a segfault */
/* unsigned char shellcode_[] = "\x01\x02<SHELLCODE>"; */
int (*ret)() = (int(*)())shellcode;
return 0;
}
So how would it be possible to divide multiple shellcodes into different memory regions and call them without them interrupting the execution flow between each other, and decide which one to call? (I mean just STORE two shellcodes, not RUN them simultaneously, if that's possible at all).
Question 2: if the shellcode has to be passed as a parameter to a function, what would be the proper way to do it?
Pseudocode:
unsigned char shellcode[] = "\x01\x02...";
void call_shellcode(unsigned char shellcode[200]);
main()
{
call_shellcode(shellcode);
}
void call_shellcode(unsigned char shellcode[200])
{
... print/call shellcode
}
UPDATE: As there seems to be some misunderstanding of the question, this is not the ACTUAL shellcode. I do know what shellcode is, how it is generated, and how it works. I have not provided actual shellcode within the C stub in order to leave it in a readable state. The value "\x01\x02" is a placeholder to convey the idea of the question and NOT any actual contents.
Your shellcode cannot possibly work for a very simple reason: it begins with \x01\x02:
unsigned char shellcode[] = "\x01\x02<SHELLCODE>";
I'm not sure why you think your shellcode has to begin with those two bytes: it really doesn't!
Those two bytes decode to add DWORD PTR [rdx],eax (or edx if running in 32-bit mode). Since you do not have any control over the value of RDX/EDX at the time your shellcode is called, it will very likely immediately cause a segmentation fault because RDX/EDX does not contain a valid (and writable) memory address.
Changing literally anything around the shellcode, in the function or outside, could cause the compiler to choose a different register allocation that leaves RDX/EDX holding a good value at runtime, so that the code doesn't crash, but that would just be a lucky coincidence. Writing and using shellcode like this is inherently undefined behavior, or at best implementation-defined (for a fixed operating system and compiler), so extra care must be taken.
So how would it be possible to divide multiple shellcodes into different memory regions and call them without them interrupting the execution flow between each other, and decide which one to call?
Well, you're not really dividing anything in "different memory regions"... whether you use one array or two or ten, they are all declared on the stack and they will be close together on the stack.
If you want to jump from one to the other, that's going to be a complex task, because in general you do not know the location of a variable on the stack beforehand, so you will have to do some math calculating your current location and then the offset from one shellcode chunk to the other, ultimately performing a relative call/jump.
If shellcode has to be passed as a parameter to a function what would be the proper way to do this?
The proper way is to mmap a region of memory that is RWX, write the shellcode into it (memcpy, read from stdin, etc.) and then pass a pointer to that memory region to the function you want. You have no guarantee that a piece of global data will be put by the compiler in an executable memory region. In fact, no modern day compiler would do that, and furthermore, no modern day kernel would map such a region as executable even if the ELF is compiled with -z execstack.
In recent kernels -z execstack is only respected for the stack itself, so passing a shellcode as function argument through a variable will only work if the variable was defined on the stack.
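A minimal sketch of the mmap approach follows, using a harmless placeholder instead of real shellcode. It assumes Linux on x86 or x86-64 and a system that permits RWX mappings (some hardened configurations refuse them). The names call_machine_code and return_42_demo are invented for this example, and the six bytes encode mov eax, 42 followed by ret.

```c
#define _DEFAULT_SOURCE   /* makes glibc expose MAP_ANONYMOUS in strict modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Copy len bytes of machine code into a fresh RWX mapping and call it as
   int (*)(void). Returns the code's result, or -1 if the mapping fails. */
int call_machine_code(const unsigned char *code, size_t len)
{
    void *region = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;
    memcpy(region, code, len);

    /* Converting an object pointer to a function pointer is not ISO C,
       but it works on POSIX systems; memcpy avoids a cast warning. */
    int (*entry)(void);
    memcpy(&entry, &region, sizeof entry);
    int result = entry();

    munmap(region, len);
    return result;
}

/* Passes a tiny code buffer as an ordinary pointer parameter. */
int return_42_demo(void)
{
#if defined(__x86_64__) || defined(__i386__)
    static const unsigned char code[] = {
        0xB8, 0x2A, 0x00, 0x00, 0x00,   /* mov eax, 42 */
        0xC3                            /* ret */
    };
    return call_machine_code(code, sizeof code);
#else
    return -1;   /* no placeholder bytes provided for this architecture */
#endif
}
```

Because the bytes are copied into their own mapping, it does not matter where the source array lives, and several code buffers can each get a separate region.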
You can't have two variables with the same name in the same scope (this part has nothing to do with what the variables are or how they are used). Simply give the second shellcode a different name.
Note I am not going to comment at all on what you are trying to do, other than that I would not think of manually created machine code as "shell code" (which I would usually think of as code intended for a command shell like bash).
I wrote this very simple program on Windows 8.1 and compiled it using gcc from Mingw. I ran it with "test.exe > t.txt" and "test.exe > t1.txt" and the outputs were different (even though it uses virtual addresses). It ran for a while and then it crashed. I decided to test this because I'm reading a book on operating systems.
Is it reading other programs' memory? Wasn't that not supposed to happen? I'm probably misunderstanding something...
#include <stdio.h>
int main(int argc, char *argv[]){
int r = 0;
int p[4] = {1,5,4,3};
for(r=0; p[r]!=1111111111111111; r++){
p[2] = p[r];
printf("%d\n", p[2]);
}
return 0;
}
Thank you.
SadSeven, I assume you are reading past the end of the array on purpose. What you are seeing is not other programs' memory; it's uninitialized memory inside your own program's memory.
Every program runs inside its own virtual memory space; the OS's virtual memory manager takes care of this. You can't access another program's memory from your program (unless you are both using shared memory, but you have to do that on purpose).
You haven't initialized anything beyond p[3]. The C language makes no guarantees about what will happen when you try to access addresses that haven't been initialized with data. You'll likely see a bunch of garbage, but what the garbage is isn't defined by the program you wrote. It could be anything.
The addresses you are accessing before the crash still belong to the current process; it is just memory above the array on the stack that your code never initialized.
The process probably crashed due to a segmentation fault, which occurs when a process tries to access memory that doesn't belong to it. This would be the point when it attempts to access outside its own memory.
The output you see is from reading its own memory. When it reaches memory that isn't assigned to the process it crashes.
Edit:
To make things harder for computer viruses, the starting address of a program can be different each time you run it. So you should expect different output if you run it several times. On Windows, the address space layout is not randomized for all programs.
Your program overruns a local (auto) variable, which means that it will walk up through the stack frame(s). The stack frame contains local variables, function arguments, saved registers, the return address of the function call, and a pointer to the end of the previous stack frame. If the variables all have the same values any difference would be explained by memory addresses being different. There may be other reasons that I'm not aware of, as I'm not an expert on the memory layout in Windows.
In your code, the for loop is erroneous.
for(r=0; p[r]!=1111111111111111; r++)
It tries to access out-of-bounds memory once r reaches 4. The result is undefined behaviour.
It may run fine till r is 1997, it may crash at r value 4, it may start playing a song from your playlist at r value 2015, even. The behaviour is not guaranteed after r value 3.
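For comparison, a bounded loop stops at the array's length instead of waiting for a sentinel value. Note also that on platforms with a 32-bit int, the constant 1111111111111111 is larger than any int can hold, so the original condition can never become false on its own. The function name sum_ints below is made up for this sketch.

```c
#include <stddef.h>

/* Add up exactly count elements; never reads past the end of the array. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

For the array in the question, sum_ints(p, sizeof p / sizeof p[0]) visits p[0] through p[3] and returns 13.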
Each process runs within its own separate virtual address space (4 GB for a 32-bit process), so an out-of-bounds read won't read from another process's memory; it will read garbage from its own address space. As for why the output is different: ASLR randomises key parts of a process's layout, so the image base and stack addresses differ for each instance of a process. Even the same program run more than once will see different addresses.
Read about ASLR at: http://en.wikipedia.org/wiki/Address_space_layout_randomization
Read about Virtual Memory at:
http://en.wikipedia.org/wiki/Virtual_memory
I found an interesting phenomenon when I execute a simple test code:
int main(){
int *p=(int *)0x12f930;
printf("%d",*p);
return 0;
}
Of course it crashed with a segmentation fault. But even when I change 0x12f930 to 0x08048001 (0x08048000+1, which should be in the text area when executing the ELF binary), it still crashes with a segfault.
then I changed my code as below:
int main()
{
int i=1;
printf("%x",&i);
return 0;
}
the output is 0xf3ee8f0c, but as I know, the address of user space should be <=0xc0000000, so I am quite confused.
Anyone can help?
First, don't ever do it, unless there's a specific need to.
But certain embedded applications and legacy systems might need explicit memory access. So here's some example code:
const unsigned addr = 0xdeadbeee;//This address is an example, which should always be >0xc000000 and const
const unsigned *ptr=(const unsigned*)addr;//Then you can assign it to a pointer after proper casting and keeping it const, unless there's a need to keep it not-const
Be careful, as you may hit unallocated memory or, worse, trash memory and even cause system instability. Also, the above code is implementation-defined and as such not portable across different systems.
If you are executing your program under an OS, you need to understand the memory addressing scheme that OS follows. In particular, some OSes assign a random starting address to the stack and/or heap in order to make memory and processes in the system harder to attack. So every time you execute the program, the process's addresses will be different.
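One way to see this randomization is to print a stack address and a heap address and run the program a few times. The sketch below uses a made-up function name, print_addresses, and formats the pointers with %p and a cast to void *, which is the portable way to print an address (passing a pointer to %x is undefined behaviour).

```c
#include <stdio.h>
#include <stdlib.h>

/* Print the address of a stack variable and of a heap block.
   Returns 0 on success, -1 if malloc fails. */
int print_addresses(void)
{
    int on_stack = 1;
    int *on_heap = malloc(sizeof *on_heap);
    if (on_heap == NULL)
        return -1;

    /* %p expects a void *, hence the casts */
    printf("stack: %p\nheap:  %p\n", (void *)&on_stack, (void *)on_heap);

    free(on_heap);
    return 0;
}
```

On a system with address space randomization enabled, the printed values will typically change from one run to the next.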
If you wish to examine a process's memory, you could refer to source of GDB and how they do it.
I'm a new C programmer, still learning the language itself.
Anyway -
I'm trying to access a specific memory address.
I've written this code:
#include <stdio.h>
int main()
{
int* p = (int*) 0x4e0f68;
*p = 12;
getchar();
}
When I try to access a specific memory address like that, the program crashes.
I don't know if this information is relevant, but I'm using Windows 7 and Linux Ubuntu.
(I've tried this code only on Windows 7).
Any explanations why the program crashes?
How can I access a specific memory address (an address which is known at compile-time, I don't mean to dynamic memory allocation)?
Thanks.
That's memory you don't own and accessing it is undefined behavior. Anything can happen, including crashing.
On most systems, you'd be able to inspect the memory (although technically still undefined behavior), but writing to it is a whole different story.
Strictly speaking you cannot create a valid pointer like this. Valid pointers must point to valid objects (either on your stack or obtained from malloc).
For most modern operating systems you have a virtual memory space that only your process can see. As you request more memory from the system (malloc, VirtualAlloc, mmap, etc) this virtual memory is mapped into real usable memory that you can safely read and write to. So you can't just take an arbitrary address and try to use it without OS cooperation.
An example for windows:
#include <windows.h>
#include <stdio.h>
int main(void)
{
SYSTEM_INFO sysinfo;
GetSystemInfo(&sysinfo);
unsigned pageSize = sysinfo.dwPageSize;
printf("page size: %u\n", pageSize);
void* target = (void*)0x4e0f68;
printf("trying to allocate exactly one page containing 0x%p...\n", target);
void* ptr = VirtualAlloc(target, pageSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE); // reserve and commit in one call
if (ptr)
printf("got: 0x%p\n", ptr); // ptr <= target < ptr+pageSize
else
printf("failed! OS won't let us use that address.\n");
return 0;
}
Note that this will give you different results on different runs. Try it more than once.
Just to clarify one phrase the OP wrote: strictly speaking, no address associated with a program (code or data) is known at compile time. Programs usually are loaded at whatever address the OS determines. The final address a program sees (for example, to read a global variable) is patched into the program code by the OS, using some sort of relocation table. DLL functions called by a program have a similar mechanism, where the IDATA section of the executable is converted into a jump table to jump to the actual address of a function in a DLL, taking the actual addresses from the DLL in memory.
That said, it is indeed possible to know in advance where a variable will be placed, if the program is linked with no relocation information. This is possible in Windows, where you can tell the linker to load the program at an absolute virtual address. The OS loader will try to load your program at that address, if possible.
However, this feature is not recommended because it can make security holes easier to exploit. If an attacker discovers a security hole in a program and tries to inject code into it, it will be easier for them if the program has all its variables and functions at fixed addresses, since the malicious code will know where to make patches to gain control of that program.
What you're getting is a segfault - when you're trying to access memory you don't have permission to access. Pointers, at least for userspace, must point to some variable, object, function, etc. You can set a pointer to a variable with the & operator - int* somePtr = &variableToPointTo, or to another pointer - int* someNewPtr = somePtr. In kernel mode (ring 0) or for OS development, you can do that, BUT IT IS NOT ADVISED TO DO SO. In MS-DOS, you could destroy your machine because there was no protection against that.
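As a small contrast to the hard-coded address in the question, this sketch stores 12 through a pointer that refers to an object the program actually owns. The function name write_through_pointer is made up for illustration.

```c
/* Write through a pointer to a real object instead of a guessed address. */
int write_through_pointer(void)
{
    int target = 0;
    int *p = &target;   /* p holds the address the system chose for target */
    *p = 12;            /* valid: target belongs to this function */
    return target;      /* now 12 */
}
```

The address still exists and can be printed with %p, but it is chosen by the compiler and OS rather than written into the source.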
I'm trying to debug a program I've written. According to the debugger a particular void * holds the value 0x804b008. I'd like to be able to dereference this value (cast it to an int * and get its value).
I'm getting a Segmentation Error with this code. (The program with the void * is still running in the background btw - it's 'paused')
#include <stdio.h>
int main() {
int* pVal = (int *)0x804b008;
printf("%d", *pVal);
}
I can see why being able to dereference any point in memory could be dangerous, so maybe that's why this isn't working.
Thank you!
Your program (running in the debugger) and this one won't be running in the same virtual memory space; accessing that pointer (even if it were a valid one) won't give you any information.
Every program running on your machine has its own logical address space. Your operating system, programming language runtime, and other factors can affect the actual literal values you see used as pointers for any given program. But one program definitely can't see into another program's memory space, barring of course software debuggers, which do some special tricks to support this behaviour.
In any case, your debugger should let you see whatever memory you want while your program is paused - assuming you have a valid address. In gdb x/x 0x804b008 would get you what you want to see.
For more information:
Wikipedia article on Virtual Memory.
gdb documentation
It's quite simple. The OS knows that address does not belong to your program, so you can't print it without bypassing memory protection.