Access (read/write) to a process's virtual memory from another application - C

I have a simple program:
#include <stdio.h>
#include <unistd.h>

int a = 5;

int main(void)
{
    while (1) {
        sleep(1);
        printf("%p %i\n", (void *)&a, a);
    }
    return 0;
}
Output (Ubuntu x64):
0x601048 5
0x601048 5
0x601048 5
0x601048 5
I was learning about pointers in C, and I already know that you can use memcpy to write data (almost) wherever you want within the virtual memory of a process. But is it possible to modify the value of int a, located at address 0x601048, from another application (which of course uses its own virtual memory)? How can this be done? I'm interested in solutions only for C.

It is not easily possible (to share virtual memory between two different processes on Linux). As a first approximation, code as if it were not possible.
And even if you did share such memory, you'll get into synchronization issues.
You really should read books like Advanced Linux Programming. They have several chapters on that issue (which is complex).
Usually, if you really want to share memory, you won't share some memory on the call stack, but you would "reserve" some memory zone to be later shared.
You could read a lot more about:
pthreads (e.g. read this pthread tutorial)
shared memory segments set up with mmap(2) using MAP_SHARED
low-level debugging facilities using ptrace(2), notably PTRACE_PEEKDATA (a short sketch follows below)
old SysV shared memory using shmat(2)
POSIX shared memory (see shm_overview(7) ...) using shm_open(3)
the /proc/ file system, proc(5), e.g. /proc/$PID/mem; I strongly suggest looking at file:///proc/self/maps in your browser first and reading more until you understand what it is showing you (you could mmap some other process's /proc/$PID/mem ...)
/dev/mem (the physical RAM) see mem(4)
loading a kernel module doing insane tricks.
I strongly advise a beginner against playing such dirty memory tricks. If you insist, be prepared to break your system, and back it up often. Don't play such tricks while you are still a Linux novice.
Often you'll need root privileges; see capabilities(7).
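For concreteness, here is a minimal sketch of the ptrace(2) route listed above. It assumes 64-bit Linux, sufficient privileges (run as root, or relax /proc/sys/kernel/yama/ptrace_scope), and that you pass the target PID, the address printed by the question's program, and the new value on the command line; real code would check every call and keep the target stopped as briefly as possible.
/*
 * poke.c - sketch only: overwrite an int in another process with ptrace(2).
 * Assumes 64-bit Linux and sufficient privileges.
 * Example: ./poke 1234 0x601048 42
 */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s <pid> <hex-address> <new-value>\n", argv[0]);
        return 1;
    }
    pid_t pid = (pid_t)strtol(argv[1], NULL, 10);
    unsigned long addr = strtoul(argv[2], NULL, 16);
    int newval = (int)strtol(argv[3], NULL, 10);

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        perror("PTRACE_ATTACH");
        return 1;
    }
    waitpid(pid, NULL, 0);                     /* wait until the target stops */

    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0) {
        perror("PTRACE_PEEKDATA");
    } else {
        printf("current int at %#lx: %d\n", addr, (int)word);
        /* Patch only the low 32 bits (the int); keep the rest of the word. */
        long patched = (word & ~0xffffffffL) | ((long)newval & 0xffffffffL);
        if (ptrace(PTRACE_POKEDATA, pid, (void *)addr, (void *)patched) == -1)
            perror("PTRACE_POKEDATA");
    }

    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return 0;
}
While the target is stopped between PTRACE_ATTACH and PTRACE_DETACH its loop pauses; after detaching it resumes and prints the new value.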

Related

How can I get a guarantee that when memory is freed, the OS will reclaim that memory for its use?

I noticed that this program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main() {
    const size_t alloc_size = 1*1024*1024;
    for (size_t i = 0; i < 3; i++) {
        printf("1\n");
        usleep(1000*1000);
        void *p[3];
        for (size_t j = 3; j--; )
            memset(p[j] = malloc(alloc_size), 0, alloc_size); // memset to fault the pages in ("de-virtualize" the memory)
        usleep(1000*1000);
        printf("2\n");
        free(p[i]);
        p[i] = NULL;
        usleep(1000*1000*4);
        printf("3\n");
        for (size_t j = 3; j--; )
            free(p[j]);
    }
}
which allocates 3 blocks of memory, 3 times, and each time frees a different one, does release the memory according to watch free -m, which means the OS reclaimed the memory for every free regardless of the block's position inside the program's address space. Can I somehow get a guarantee of this effect? Or is there already something like that (like a rule for allocations >64KB)?
The short answer is: In general, you cannot guarantee that the OS will reclaim the freed memory, but there may be an OS specific way to do it or a better way to ensure such behavior.
The long answer:
Your code has undefined behavior: there is an extra free(p[i]); after the printf("2\n"); which accesses beyond the end of the p array.
You allocate large blocks (1 MB) for which your library makes individual system calls (for example mmap on Linux systems), and free releases these blocks to the OS, hence the observed behavior.
Various OSes are likely to implement such behavior above a system-specific threshold (typically 128KB), but the C standard gives no guarantee about this, so relying on it is system specific.
Read the manual page for malloc() on your system to see if this behavior can be controlled. For example, the GNU C library on Linux lets you override the default threshold with mallopt(M_MMAP_THRESHOLD, ...) or the MALLOC_MMAP_THRESHOLD_ environment variable.
If you program for a POSIX target, you might want to use mmap() directly instead of malloc() to guarantee that the memory is returned to the system once deallocated with munmap(). Note that the block returned by mmap() will have been initialized to all bits zero before the first access, so you may skip explicit initialization to take advantage of on-demand paging, or perform explicit initialization to ensure the memory is mapped and minimize latency in later operations.
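As a minimal sketch of that mmap()/munmap() approach (POSIX/Linux; the 1 MiB size mirrors the question, and error handling is kept short):
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    const size_t alloc_size = 1 * 1024 * 1024;   /* 1 MiB, as in the question */

    /* Anonymous, private mapping: the pages belong to this process only. */
    void *p = mmap(NULL, alloc_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* The pages are already zero-filled; this memset merely faults them in
     * right away (optional). */
    memset(p, 0, alloc_size);

    /* munmap() removes the pages from the address space, so the memory is
     * returned to the OS immediately. */
    if (munmap(p, alloc_size) == -1) {
        perror("munmap");
        return 1;
    }
    return 0;
}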
On the OSes I know, and especially on linux:
no, you cannot guarantee reuse. Why would you want that? Reuse only happens when someone needs more memory pages, and Linux will then have to pick pages that aren't currently mapped to a process; if these run out, you'll get into swapping. And: you can't make your OS do something that is none of your processes' business. How it internally manages memory allocations is none of the freeing process' business. In fact, that's security-wise a good thing.
What you can do is not only freeing the memory (which might leave it allocated to your process, handled by your libc, for later mallocs), but actually giving it back (man sbrk, man munmap, have fun). That's not something you'd usually do.
Also: this is yet another instance of "help, Linux ate my RAM"... you are misinterpreting what free tells you.
For glibc malloc(), read the man 3 malloc man page.
In short, smaller allocations use memory provided by sbrk() to extend the data segment; this is not returned to the OS. Larger allocations (typically 128 KiB or more; on glibc you can change the limit with mallopt(M_MMAP_THRESHOLD, ...) or the MALLOC_MMAP_THRESHOLD_ environment variable) use mmap() to allocate anonymous memory pages (which also hold the allocator's bookkeeping), and when freed, these are usually returned to the OS immediately.
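For illustration, a glibc-specific sketch of adjusting that threshold with mallopt(3); the 64 KiB value is just an example, and setting MALLOC_MMAP_THRESHOLD_ in the environment has the same effect:
#include <malloc.h>   /* glibc: mallopt(), M_MMAP_THRESHOLD */
#include <stdlib.h>

int main(void)
{
    /* Ask glibc to service allocations of 64 KiB and larger with mmap(),
     * so that free() returns them to the OS right away. */
    mallopt(M_MMAP_THRESHOLD, 64 * 1024);

    void *p = malloc(1024 * 1024);   /* served by mmap() under the new threshold */
    free(p);                         /* the pages go back to the kernel here */
    return 0;
}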
The only case where you should worry about the process returning memory to the OS in a timely manner is a long-running process that temporarily makes a very large allocation while running on an embedded or otherwise memory-constrained device. Why? Because this has been done in C successfully for decades, and the C library and the OS kernel handle these cases just fine. It simply isn't a practical problem in normal circumstances; you only need to worry about it if you know it is one, and that happens only in very specific circumstances.
I personally do routinely use mmap(2) in Linux to map pages for huge data sets. Here, "huge" means "too large to fit in RAM and swap".
The most common case is when I have a truly huge binary data set. Then, I create a (sparse) backing file of suitable size and memory-map that file. Years ago, in another forum, I showed an example of how to do this with a terabyte data set -- yes, 1,099,511,627,776 bytes -- of which only about 250 megabytes was actually manipulated in that example, to keep the data file small. The key in this approach is to use MAP_SHARED | MAP_NORESERVE to ensure the kernel does not use swap for this dataset (it would be insufficient and fail) but uses the file backing directly. We can use madvise() to inform the kernel of our probable access patterns as an optimization, but in most cases it does not have that big an effect (the kernel heuristics do a pretty good job anyway). We can also use msync() to ensure certain parts are written to storage. (This has certain effects with respect to other processes that read the file backing the mapping, especially depending on whether they read it normally or with options like O_DIRECT; and, if the file is shared over NFS or similar, with respect to processes reading it remotely. It all gets quite complicated very quickly.)
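A rough sketch of that file-backed approach, under the stated assumptions (the file name dataset.bin and the 1 GiB size are made up for the example; real code would check every call and handle I/O errors):
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    const off_t size = 1024L * 1024L * 1024L;   /* 1 GiB backing file (sparse) */

    int fd = open("dataset.bin", O_RDWR | O_CREAT, 0644);  /* hypothetical file */
    if (fd == -1) { perror("open"); return 1; }

    /* Make the file the desired length; unwritten parts stay sparse. */
    if (ftruncate(fd, size) == -1) { perror("ftruncate"); return 1; }

    /* MAP_SHARED: modifications go to the backing file, not anonymous memory.
     * MAP_NORESERVE: don't reserve swap for the whole mapping. */
    char *data = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_NORESERVE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);                      /* the mapping keeps the file referenced */

    data[0] = 1;                    /* touch only the parts you actually need */

    msync(data, 4096, MS_SYNC);     /* optionally force a region to storage */
    munmap(data, (size_t)size);
    return 0;
}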
If you do decide to use mmap() to acquire anonymous memory pages, do note that you need to keep track of both the pointer and the length (length being a multiple of page size, sysconf(_SC_PAGESIZE)), so that you can release the mapping later using munmap(). Obviously, this is then completely separate from normal memory allocation (malloc(), calloc(), free()); but unless you try to use specific addresses, the two will not interfere with each other.
If you want memory to be reclaimed by the operating system, you need to use operating system services to allocate the memory (which will be allocated in pages). To deallocate it, you need to call the operating system services that remove those pages from your process.
Unless you write your own malloc/free that does this, you are never going to be able to accomplish your goal with off-the-shelf library functions.
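For what it's worth, a minimal sketch of such a wrapper, assuming a POSIX system: every block gets its own anonymous mapping, so my_free() (a hypothetical name) really does hand the pages back to the kernel. It is wasteful for small allocations and only 8-byte aligned, so treat it as an illustration, not a drop-in replacement.
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Each allocation gets its own mapping; the mapping length is stored in
 * front of the block so my_free() knows how much to unmap. */
static void *my_malloc(size_t size)
{
    size_t total = size + sizeof(size_t);
    void *p = mmap(NULL, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *(size_t *)p = total;                   /* remember the mapping length */
    return (char *)p + sizeof(size_t);
}

static void my_free(void *ptr)
{
    if (ptr == NULL)
        return;
    void *base = (char *)ptr - sizeof(size_t);
    munmap(base, *(size_t *)base);          /* pages go back to the kernel */
}

int main(void)
{
    char *s = my_malloc(1 * 1024 * 1024);
    if (s != NULL) {
        strcpy(s, "hello");
        printf("%s\n", s);
        my_free(s);
    }
    return 0;
}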

heap overflow affecting other programs

I was trying to create the condition for malloc to return a NULL pointer. In the program below, though I can see malloc returning NULL, once the program is forcibly terminated, I see that all the other programs become slow, and finally I had to reboot the system. So my question is whether the memory for the heap is shared with other programs? If not, other programs should not have been affected. Is the OS not allocating a certain amount of memory at the time of execution? I am using Windows 10, MinGW.
#include <stdio.h>
#include <stdlib.h>

void mallocInFunction(void)
{
    int *ptr = malloc(500);
    if (ptr == NULL)
        printf("Memory could not be allocated\n");
    else
        printf("Allocated memory successfully\n");
}

int main(void)
{
    while (1)
        mallocInFunction();
    return 0;
}
So my question is whether the memory for the heap is shared with other programs?
Physical memory (RAM) is a resource that is shared by all processes. The operating system makes decisions about how much RAM to allocate to each process and adjusts that over time.
If not, other programs should not have been affected. Is the OS not allocating a certain amount of memory at the time of execution?
At the time the program starts executing, the operating system has no idea how much memory the program will want or need. Instead, it deals with allocations as they happen. Unless configured otherwise, it will typically do everything it possibly can to allow the program's allocation to succeed because presumably there's a reason the program is doing what it's doing and the operating system won't try to second guess it.
... whether the memory for the heap is shared with other programs?
Well, the C standard doesn't exactly require a heap, but in the context of a task-switching, multi-user and multi-threaded OS, of course memory is shared between processes! The C standard doesn't require any of this, but this is all pretty common stuff:
CPU cache memory tends to be preferred for code that's executed often, though this might get swapped around quite a bit; that may or may not be swapped to a heap.
Task switching causes registers to be swapped to other forms of memory; that may or may not be swapped to a heap.
Entire pages are swapped to and from disk, so that other programs can make use of them when your OS switches execution away from your program and to the other programs, and when it's your program's turn to execute again, among other reasons. This may or may not involve manipulating the heap.
FWIW, you're referring to memory that has allocated storage duration. It's best to avoid using terms like heap and stack, as they're virtually meaningless. The memory you're referring to is on a silicon chip, regardless of whether it uses a heap or a stack.
... Is the OS not allocating a certain amount of memory at the time of execution?
Speaking of silicon chips and execution, your OS likely only has control of one processor (a silicon chip which contains some logic circuits and memory, among other things I'm sure) with which to execute many programs! To summarise this post, yes, your program is most likely sharing those silicon chips with other programs!
On a tangential note, I don't think heap overflow means what you think it means.
Your question cannot be answered in the context of C, the language. For C, there's no such thing as a heap, a process, ...
But it can be answered in the context of operating systems. Even a bit generically because many modern multitasking OSes do similar things.
Given a modern multitasking OS, it will use a virtual address space for each process. The OS manages a fixed amount of physical RAM and divides it into pages; when a process needs memory, such pages are mapped into the process's virtual address space (typically at a different virtual address than the physical one). When all memory pages are claimed by the OS itself and by the running processes, the OS will typically write some pages that are not in active use out to disk, into a swap area, in order to hand a fresh page to the next process requesting one. But when the original page is touched again (and this is typically the case with free(), see below), it must first be loaded from disk, and to have a free page for that, another page must be saved to swap space.
This is, like all disk I/O, slow, and it's probably what you see happening here.
Now to fully understand this: what does malloc() do? It typically asks the operating system to increase the memory of the calling process (and if necessary, the OS does this by mapping another page), and it uses this new memory by writing some information there about the block of memory requested (so free() can work correctly later), ultimately returning a pointer to a block that's free for the program to use. free() uses the information written by malloc(), modifies it to indicate the block is free again, and typically can't give any memory back to the OS because there are other malloc()'d blocks in the same page. It will give memory back when possible, but that's the exception in a typical scenario where dynamic allocations are heavily used.
So, the answer to your question is: Yes, the RAM is shared because there is only one set of physical RAM. The OS does the best it can to hide that fact and virtualize RAM, but if a process consumes all that is there, this will have visible effects.
malloc() is not a system call but a C library function. When a program asks for memory via malloc(), the library uses the brk()/sbrk() or mmap() system calls to allocate page(s); more details here.
Please keep in mind that the memory you get is all virtual in nature, which means that even with 3GB of physical RAM you can allocate almost unlimited memory. How does this happen? Via a concept called 'paging', where the system stores and retrieves data between secondary storage (HDD/SSD) and main memory (RAM); more details here.
With all of this, running out of memory is usually quite rare, but for a program like the one above, which is deliberately probing the system's limits, it can happen. This is nicely explained here.
Now, why do the other programs hang or become slow? Because they all share the same operating system, and the system is starving for resources. In fact, at some point the system will crash and reboot.
Hope this helps?

How to read the variable value from RAM?

I've written a program using dynamic memory allocation. I do not use the free function to free the memory, so the variable's value is still present at that address.
Now I want to reuse this value, and I want to see, from another process, all the variables' values that are present in RAM.
Is it possible?
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *fptr;
    fptr = malloc(sizeof(int));
    *fptr = 4;
    printf("%d\t%p\n", *fptr, (void *)fptr);
    while (1) {
        // deliberately an infinite loop so the process keeps running
    }
}
And I want another program to read the value, because it is still in memory since main never returns. How can I do this?
This question shows misconceptions on at least two topics.
The first is virtual address spaces and memory protection, as already addressed by RobertL in his answer. On a modern operating system, you just can't access memory belonging to another process (even if you knew the physical address, which you don't because all user space processes work on addresses in their private virtual address space). The result of trying to access something not mapped into your address space will be a segmentation fault. Read more on Wikipedia.
The second is the scope of the C standard. It doesn't know about processes. C is defined in terms of an abstract machine executing your program (and only this program). Scopes and lifetimes of variables are defined, and the respective maximums are global scope and static storage duration. So yes, your variable will continue to live as long as your program runs, but its scope will be this program.
When you understand that, you see: even on a platform using a single global address space and no memory protection at all, you could never access the variable of another program in terms of the C standard. You could probably pass a pointer value somehow to your other program and use that, maybe it would work, but it would be undefined behavior.
That's why operating systems provide means for inter process communication like shared memory (which comes close to what you seem to want) and pipes.
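As a hedged sketch of the shared-memory route (POSIX shm_open(3); the object name "/demo_shm" is made up, no synchronization is shown, and on older glibc you may need to link with -lrt): run it once as "./shm write 42" in one process to publish a value, and again without arguments from another process to read it.
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical shared object name; both processes must agree on it. */
#define SHM_NAME "/demo_shm"

int main(int argc, char *argv[])
{
    int writer = (argc > 1 && strcmp(argv[1], "write") == 0);

    int fd = shm_open(SHM_NAME, O_RDWR | O_CREAT, 0600);
    if (fd == -1) { perror("shm_open"); return 1; }
    if (ftruncate(fd, sizeof(int)) == -1) { perror("ftruncate"); return 1; }

    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED) { perror("mmap"); return 1; }

    if (writer) {
        *shared = (argc > 2) ? atoi(argv[2]) : 4;   /* publish the value */
        printf("wrote %d\n", *shared);
    } else {
        printf("read %d\n", *shared);   /* observe it from the other process */
    }

    munmap(shared, sizeof(int));
    close(fd);
    return 0;
}
The object persists until shm_unlink() is called, so the reader can run even after the writer has exited.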
When you return from main(), the process frees all acquired resources, so you can't. Have a look at this.
First of all, when you close your process, the memory you allocated with it will be freed entirely. You can circumvent this by having one of your processes write to a file (like a .dat file) and the other read from it. There are other ways, too.
But generally speaking in normal cases, when your process terminates the existing memory will be freed.
If you try accessing memory from another process from within your process, you will most likely get a segmentation fault, as most operating systems protect processes from messing with each other's memory.
It depends on the operating system. Most modern multi-tasking operating systems protect processes from each other. Hardware is set up to disallow processes from seeing other processes' memory. On Linux and Unix, for example, to communicate between programs in memory you will need to use operating system services for "inter-process" communication, such as shared memory, semaphores, or pipes.

How do I get a function to execute in a different address space? Writing a clone function

I have this code that gives me a segmentation fault. My understanding of the clone function is that the parent process has to allocate space for the child process and clone calls a function that runs in that stack space. Am I misunderstanding something or does my code just not make sense?
char *stack;
char *stackTop;

stack = malloc(STACK_SIZE);
if (stack == NULL)
    fprintf(stderr, "malloc");
stackTop = stack + STACK_SIZE;

myClone(childFunc, stackTop, CLONE_FILES, NULL);

int myClone(int (*fn)(void *), void *child_stack, int flags, void *arg) {
    int *space = memcpy(child_stack, fn, sizeof(fn));
    typedef int func(void);
    func *f = (func *)&space;
    f();
}
There are two main reasons why this wouldn't work.
Memory protection: the relevant memory pages must be executable. Data pages you got from malloc() are not. "Normal" memory-management functions can't change this. On the other hand, the existing code pages are not writable, so you can't move one piece of code onto another. This is a fundamental memory-protection mechanism. You have to either go back to DOS or use some advanced "debugging" interface.
Position-independent code: all memory addresses in your code must either be relative or be fixed up manually. It may be too tricky to do this in C.
The clone() function is a system call. It cannot be replicated by C code running within your process.
There's a fundamental misunderstanding there. You're getting a segmentation fault, which tells me you're trying to run this code in user space (in a process created by the operating system).
An address space is an abstraction available to the operating system. It typically uses hardware support (that of an MMU [memory management unit]) which provides means to use virtual addresses. These are addresses that are, when accessed, automatically translated to the real physical addresses according to some data structures that only the OS can manage.
I don't think it makes much sense to go into great detail here, you have enough key words to google for. The essence is: There is no way you can create an address space from user space code. That functionality is reserved to the OS and to do it, clone() on linux issues a syscall, invoking the OS.
Edit: concerning the stack, providing a stack means reserving space for it (by mapping an appropriate number of pages into the address space) and setting the necessary processor registers when context is switched to the process (e.g. esp/ebp on i386). This, too, is something only the operating system can do.
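For contrast, a hedged sketch of what the real clone(2) call looks like with the stack the question allocates. This is Linux-specific (_GNU_SOURCE is needed for clone()); childFunc and STACK_SIZE are the names from the question.
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)

static int childFunc(void *arg)
{
    printf("child running on its own stack, arg=%p\n", arg);
    return 0;
}

int main(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        perror("malloc");
        return 1;
    }
    char *stackTop = stack + STACK_SIZE;   /* the stack grows downward on x86 */

    /* The kernel creates the new task and switches it onto the supplied
     * stack; user code cannot do this part itself. SIGCHLD as the "exit
     * signal" lets the parent wait for the child like a normal process. */
    pid_t pid = clone(childFunc, stackTop, CLONE_FILES | SIGCHLD, NULL);
    if (pid == -1) {
        perror("clone");
        return 1;
    }
    waitpid(pid, NULL, 0);
    free(stack);
    return 0;
}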

read a variable in another program using C

I want to use one program to read a variable at a given address in another program, using C. Here is my code:
#include <stdio.h>
#include <stdlib.h>

volatile unsigned int *a;

int main() {
    scanf("%d", &a);  // input the pointer to the variable I want to read
    while (1) {
        printf("%d\n", *a);
        system("pause");
    }
}
To test this program, I used another program to change the variable. This is the code.
#include <stdio.h>

volatile unsigned int a;

int main() {
    printf("%d\n", &a);  // output the pointer to the variable
    while (1)
        scanf("%d", &a);
}
I run the second program first and then type its output (the address) into the first program. It's weird that every time I run the second program, I get the same output. And when I run the first program, I get the same value every time, even though I changed the variable in the second program. Why doesn't it work? My computer is 32-bit.
It is operating-system specific, and you should generally avoid doing that - even when possible. Prefer other inter-process communication facilities (e.g. pipes, sockets, message passing).
On most OSes, each process has its own address space in virtual memory, so a process A cannot change any data in process B. BTW, two processes can run simultaneously (on different cores) or quasi-simultaneously (with their tasks scheduled by the kernel), so carelessly sharing a variable does not make any sense.
Some OSes provide shared memory facilities, but then you should care about synchronization (e.g. with semaphores).
For Linux, read Advanced Linux Programming and shm_overview(7) & sem_overview(7)
Generally, you need to design and adapt both programs to make them communicate. For security reasons you don't want (and your OS kernel forbids) arbitrary processes to be able to glance in other processes' address space.
For example, you don't want your game software to be able to access your banking data in your browser without your consent.
Alternatively, merge the two programs into a single multi-threaded application. You'll be concerned by synchronization and probably would need to use mutexes. Read e.g. some POSIX threads tutorial.
See also MPI. BTW, you might use some database to share the common data (look into PostgreSQL, MongoDB, etc...) or adapt a client-server model.
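If you do merge the two programs into a single multi-threaded application, as suggested above, a minimal sketch with a mutex-protected variable might look like this (names are illustrative; compile with -pthread):
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* The variable both "programs" share, now inside one process. */
static unsigned int a = 0;
static pthread_mutex_t a_lock = PTHREAD_MUTEX_INITIALIZER;

static void *writer(void *arg)
{
    (void)arg;
    for (unsigned int i = 1; i <= 5; i++) {
        pthread_mutex_lock(&a_lock);
        a = i;                              /* update under the lock */
        pthread_mutex_unlock(&a_lock);
        sleep(1);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, writer, NULL) != 0) {
        fprintf(stderr, "pthread_create failed\n");
        return 1;
    }
    for (int i = 0; i < 5; i++) {
        sleep(1);
        pthread_mutex_lock(&a_lock);
        printf("a = %u\n", a);              /* read under the same lock */
        pthread_mutex_unlock(&a_lock);
    }
    pthread_join(tid, NULL);
    return 0;
}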

Resources