I have this simple test C program which leaks 4 bytes of memory:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int* x = malloc(sizeof(int));
printf( "Address: %p\n", x);
return 0;
}
I compile it with gcc -o leak leak.c, and then run it:
$ leak
Address: 0x55eb2269a260
Then I create another test C program that will try to free the leaked memory:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
void *addr = (void*)0x55eb2269a260;
printf( "Trying to free address: %p\n", addr);
free(addr);
return 0;
}
I compile it with gcc -o myfree free.c and then run it:
$ myfree
Trying to free address: 0x55eb2269a260
Segmentation fault (core dumped)
What is happening here? Why is it not possible to free the leaked memory?
Assuming we are talking about Unix-like operating systems (this also applies to Windows and the majority of other modern operating systems)...
What is happening here? Why is it not possible to free the leaked memory?
First of all: every running process have it's own virtual address space (or VAS). This VAS is a way the operating system have to lay out and organize the physical memory between different processes. It ranges from 0x0 to 0xFFFFFFFF on 32 bits processors and contains all the memory of the process - its code, static data, stack, heap, etc, it is all in the process VAS. A virtual address (or VA) is a specific address inside the virtual address space.
When you allocate memory with malloc the system will search for a valid unallocated memory on the process heap and if it finds, return a pointer of it (i.e. malloc essentially returns a virtual address).
After the process ends its VAS will be automatically "freed" by the operating system, so that memory is no longer valid, or allocated in that matter. Asides that, each process have its own virtual address space. You can not directly access a process VAS (virtual address space) using a VA (virtual address) of another process - by doing that what you actually end up doing is trying to access that VA in the running process which in your example is very likely to result in an unhandled ACCESS_VIOLATION exception and crashes the process.
Each process has its own virtual memory space due to process isolation. Memory addresses in one process are not the same memory addresses in another process. You cannot just access dynamic memory by the address in another process.
Just to complete the answer of dedecos, the virtual address space of a process is always different for a different process in the same machine. This means that what for process A is at 0x55eb2269a260, in process B has no counterpart. No address at Process B matches the same memory as it does for process A. So then, it's impossible to address from B any of the addresses of process A.
The thing is that there is a memory manager at the operating system kernel that handles processes memory to be always disjoint, so even if two processes have the same variable at the same address, those variables are different (e.g. two versions of ls command executing at the same time will have the data segment at the same address, as the linker put it there) their address spaces (the same data addresses, for example) map to different memory pages for each process. This way, you can run the same program several times in parallel without clashing their data segments.
Operating systems offer ways for two processes to share a memory segment, so they can share common data, but even in that case, the addresses of both visuals of the same segment haven't to be placed at the same address (as one of those processes can have occupied the address given to the other for something else), so the conclusion is that what an address means for a process is not valid outside it.
Related
#include <stdio.h>
int main() {
char *address = (char*)0x004452FC;
volatile char value;
printf("Test 1");
while (1) {
value = *address;
printf("Test 2");
if (value == 1) {
printf("Test 3");
break;
}
}
return 0;
}
If I run this code, I never reach "Test 2" and it gives me the exitcode 322122547. I am running the program in VSC but it also didn't work in cmd with administrator permissions. The address is from a .exe game I am running and it is the number of a stage I am currently in.
The reason why you can't read from or write data to this memory address is that every process on Linux/Windows/macOS has its own virtual memory address space. This means that a memory address like 0x004452FC does not refer to an address in your physical memory, it is only virtual. Before the CPU can access the data from a virtual memory address, it must first translate it to a physical address using a page table.
Since these page tables are process-specific, the same virtual address in two processes will most likely belong to two different physical addresses. These page tables are managed by the operating system, and you can't really look up the physical address of an entry, nor would the operating system allow you to read/write to physical addresses directly (with the exception of kernel drivers). Before you can use a memory address (or more accurately, a page of memory), you must ask the operating system to create a mapping for that virtual memory address. Since you didn't ask for a mapping for 0x004452FC, you tried to access non-existent memory and the OS killed your process.
If you want to change the memory contents of other processes, you must use special APIs provided by the operating system. On Windows this would be ReadProcessMemory/WriteProcessMemory, on Linux you need to use ptrace and for macOS you need vm_read/vm_write.
Given the following C code:
void *ptr;
ptr = malloc(100);
printf("Address: %p\n", ptr);
When compiling this code using GCC 4.9 in Ubuntu 64 bit and running it the output is similar to this:
Address: 0x151ab10
The value 0x151ab10 seems a reasonable number since my machine has 8 GB of RAM, but when compiling the same code using GCC 4.9 in Mac OS X 64 bit and running it, it gives an output similar to this:
Address: 0x7fb9cb43ed30
... which is strange because 0x7fb9cb43ed30 is well above the 8 GB of RAM. Is there some kind of bit masking that one has to do in Mac OS X so that the real address of ptr can be printed out?
When processes run in general-purpose operating systems, the operating system constructs a “virtual” address space for each process, using assistance from hardware.
Whenever a process with a virtual address space accesses memory, the hardware translates the virtual address (in the process’ address space) to a physical address (in actual memory hardware), using special registers in the hardware and tables in system memory that describe how the translation should be done.1 The operating system configures the registers and tables for each process.
Commonly, the operating system, or the loader (the software that loads programs into memory for execution) assigns various ranges of the virtual address space for various purposes. It may put the stack in one place, executable code in another, general space for allocatable memory in another, and special system data in another. These addresses may come from base locations set arbitrarily by human designers or from various calculations, or combinations of those.
So seeing a large virtual address is not unusual; it is simply a location that was assigned in an imaginary address space.
Footnote
1 There are additional complications in translating virtual addresses to physical addresses. When the processor translates an address, the result may be that the desired location is not in physical memory at all. When this happens, the processor notifies the operating system. In response, the operating system can allocate some physical memory, read the necessary data from disk, update the memory map of the process so that the virtual address points to the newly allocated physical memory, and resume execution of the process. Then your process can continue as if the data were there all along. Additionally, when the system allocated physical memory, it may have had to make some memory available by writing data that was in memory to disk, and also removing it from the memory map of some process (possibly the same one). In this way, disk space becomes auxiliary memory, and processes can execute with more memory in their virtual address spaces than there is in actual physical memory.
I wrote this code so I can see the address of variable foo.
#include <stdio.h>
#include <stdlib.h>
int main(){
char* foo=(char*) malloc(1);
*foo='s';
printf(" foo addr : %p\n\n" ,&foo);
int pause;
scanf("%d",&pause);
return 0;
}
Then pause it and use the address of foo in here:
#include <stdio.h>
int main(){
char * ptr=(char *)0x7ffebbd57fc8; //this was the output from the first code
printf("\n\n\n\n%c\n\n\n",*ptr);
}
but I keep getting segmentation fault. Why is this code not working?
This is not a C question/problem but a matter of runtime support. On most OS programs run in a virtual environment, especially concerning their memory space. In such case memory is a virtual memory which means that when a program access a given address x the real (physical) memory is computed as f(x). f is a function implemented by the OS to ensure that a given process (object which represent the running of a code in the OS) have its own reserved memory separated from memory dedicated to other processes. This is called virtual memory.
Oups, your problem is not related to C language, but really depends of the OS, if any.
First let us read it from a pure C language point of view:
char * ptr=(char *)0x7ffebbd57fc8;
your are converting an unsigned integer to a char *. As you get the integer value from the other program, you can be sure that is has an acceptable range, so you indeed get a pointer pointing to that address. As it is a char * pointer, you can use it to read the byte representation of any object that will lie at that address. Still fine until there. But common systems use virtual addresses and limit each process to access only its own pages, so by default a process cannot access the memory of another process. In addition, with the common usage of virtual memory, there are no reasons that any two non kernel processes share common addresses. Exceptions for real addresses are:
real memory OS (MS/DOS and derivatives like FreeDOS, CP/M, and other anthic systems)
kernel mode: the kernel can access the whole memory of the system - who could load your program?
special functions: some OS provide special API to let one process read the memory of another one (Windows does), but it not as simple as directly reading an address...
As I assume that you are not in any of the first two cases, nothing is mapped at that address from the current process, hence the error.
On systems use virtual memory, you have a range of logical addresses that are available to user processes. These address ranges are subdivided into units called PAGES whose size depends upon the processor (512b to 1MB).
Pages are not valid until they are mapped into the process. The operating system will have system calls that allow the application to map pages. If you try to access a page that is not valid you get some kind of exception.
While the operating system only allocates memory pages, applications are used to calls, such as malloc(), that allocate memory blocks of arbitrary sizes.
Behind the scenes, malloc() is mapping pages (ie making them valid) to create a pool of memory that is uses to return small amounts of memory.
When you omit your malloc(), the memory is not being mapped.
Note that each process has its own range of logical addresses. One process's page containing 0x7ffebbd57fc8 is likely to be mapped to a different physical page frame than another process's 0x7ffebbd57fc8. If that were not the case, one user could muck with another. (There is always a range of addresses shared by all processes but this is can only be accessed in kernel mode.)
Your problem is further complicated by the fact that many systems these days randomly map processes to different locations in the logical address space. On such systems you could run your first program multiple times and get different addresses.
Why is this code not working?
You would need to call the system service on your operating system that maps memory into the process and make the page containing 0x7ffebbd57fc8 accessible.
As I know, all processes run within its own virtual address space. If a process call malloc, OS will allocate some region from the heap owned by the program, and return an address which is a virtual address not a real physical address. As the heap is owned by the program, why can't OS reclaim the memory not freed by programmer ?
Virtual address space is respectively for every program, so program have no method to destroy the data owned by other program. Am I right?
If a program access to an random address owned by other program, segment fault will occur. But why no error occurs when the program access to the address freed by itself previously?
The OS can't reclaim the memory not freed by a running program (a process) because it (the OS) doesn't know whether the process is still using that memory. This is exactly what the freeing act is there for - notifying the OS that the memory will not be used anymore. Of course, the OS can reclaim the memory of the program once it's terminated.
Yes, one process can't easily destroy (or even modify) another process's address space. Processes are isolated from one another (which is a good thing) and in order to make some interaction possible, a programmer has to resort to some means of interprocess communication (or IPC), e.g. shared memory, pipes, signals etc.
Actually, accessing previously freed region of memory can lead to a program crash. But usually detecting this kind of errors by OS is not cheap and is not, therefore, generaly done.
As the heap is owned by the program, why can't OS reclaim the memory not freed by programmer ?
Assuming virtual memory as you do here, the OS will reclaim any memory not freed by the program when it terminates. For short lived programs this is not a problem. However not all systems have virtual memory (think embedded programming on strange CPU's.) Also for programs that live for a long time, leaking memory while the program is running could be bad.
Virtual address space is respectively for every program, so program have no method to destroy the data owned by other program. Am I right?
Yes.
If a program access to an random address owned by other program, segment fault will occur. But why no error occurs when the program access to the address freed by itself previously?
This depends. What OS you're running, which allocator you're using, where the block of memory was located and its size will all matter in how the actual memory is freed or not. In short, the OS allocates fixed size pages of memory to your application. Your runtime library maps memory requested by malloc to the pages served by the OS. A page will not be released back to the OS until all the memory blocks located within it is freed. Some allocators also hold on to pages once they've been allocated in the assumption that you will need them again later. You will only get segfaults for trying to access memory in a page that does not belong to your program, either because it was never allocated in the first place, or because it has been released back to the OS.
Am I printing it wrong?
#include <stdio.h>
#include <stdlib.h>
int
main( void )
{
int * p = malloc(100000);
int * q;
printf("%p\n%p\n", (void *)p, (void *)q);
(void)getchar(); /* to run several instances at same time */
free(p);
return 0;
}
Whether I run it sequentially or in multiple terminals simultaneously, it always prints "0x60aa00000800" for p (q is different, though).
EDIT: Thanks for the answers, one of the reasons I was confused was because it used to print a different address each time. It turns out that a new compiler option I started using, -fsanitize=address, caused this change. Whoops.
The value of q is uninitialized garbage, since you never assign a value to it.
It's not surprising that you get the same address for p each time you run the program. That address is almost certainly a virtual address, so it applies only to the memory space of the currently running program (process).
Virtual address 0x60aa00000800 as seen from one program and virtual address 0x60aa00000800 as seen from another program are distinct physical addresses. The operating system maps virtual addresses to physical addresses, and vice versa, so there's no conflict. (If different programs could read and write the same physical memory, it would be a security nightmare.)
It also wouldn't be surprising if they were different each time. For example, some operating systems randomize stack addresses to prevent some code exploits. I'm not sure whether heap addresses are also randomized, but they certainly could be.
https://en.wikipedia.org/wiki/Virtual_memory
This behavior is not entirely surprising. The malloc operation is simply returning a pointer to user addressable + allocated memory in the process. It's completely reasonable for the first memory request of the same size to return the same address through different invocations of a process
The behavior for q doesn't contradict this. You have given q no value hence it gets whatever the last value written to that portion of the stack was. It's unsurprising that undefined behavior would be different through different invocations of the same process (after all, it's undefined)
Your code is fine.
The same code, and the same algorithm for obtaining memory with malloc() is run each time, so there's no reason the addresses should be different.
Some malloc implementations could randomize the start of memory allocations, yours does not.
This is because of virtual memory. The physical memory address for memory of q is different, but your operating system provides each process with a virtual view of memory, mapping different physical memory addresses to the same virtual addresses in your processes. So all processes have a similar view of the memory (and cannot see the memory of other processes)
Heap allocators are not required to provide distinct / unique addresses each time you run the program. There is no guarantee either way on this, but it's entirely reasonable for an implementation of malloc() to have deterministic behavior and give you the same pointer each time you run the program.
The stack, on the other hand, usually is (but is not required to be) located at a different address. This is a protection measure against buffer-overflow exploits. By making the stack location non-deterministic, they make it more difficult for an attacker to inject direct memory addresses of code via buffer-overflow attack.
Finally note that all pointers in a program are virtual memory addresses, not physical addresses. So even though two concurrent processes might have the same memory address in a pointer, those two processes still have distinct memory in separate areas of physical memory. The operating system takes care of this via its virtual memory manager and page translation. Each process has its own virtual address space, various pieces of which are mapped transparently by the OS to physical memory as needed.