Reading Other Process' Memory in OS X? - c

I've been trying to understand how to read the memory of other processes on Mac OS X, but I'm not having much luck. I've seen many examples online using ptrace with PEEKDATA and such; however, the BSD ptrace doesn't have that option [man ptrace].
int pid = fork();
if (pid > 0) {
    // mess around with the child process's memory
}
How is it possible to read from and write to the memory of another process on Mac OS X?

Use task_for_pid() or other methods to obtain the target process’s task port. Thereafter, you can directly manipulate the process’s address space using vm_read(), vm_write(), and others.
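A minimal sketch of that approach (note that task_for_pid() normally requires root or special entitlements, and the function and parameter names here are just illustrative):

#include <stdio.h>
#include <sys/types.h>
#include <mach/mach.h>
#include <mach/mach_error.h>

/* Read 'len' bytes at 'remote_addr' inside process 'pid'. */
int read_remote(pid_t pid, vm_address_t remote_addr, size_t len)
{
    mach_port_t task;
    kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "task_for_pid failed: %s\n", mach_error_string(kr));
        return -1;
    }

    vm_offset_t data = 0;
    mach_msg_type_number_t count = 0;
    kr = vm_read(task, remote_addr, len, &data, &count);  /* copies the bytes into our address space */
    if (kr != KERN_SUCCESS) {
        fprintf(stderr, "vm_read failed: %s\n", mach_error_string(kr));
        return -1;
    }

    /* ... use the 'count' bytes at (const void *)data ... */
    vm_deallocate(mach_task_self(), data, count);         /* release the local copy */
    return 0;
}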

Matasano Chargen had a good post a while back on porting some debugging code to OS X, which included learning how to read and write memory in another process (among other things).
It has to be possible, otherwise GDB wouldn't work:
It turns out Apple, in their infinite wisdom, had gutted ptrace(). The OS X man page lists the following request codes:
PT_ATTACH — to pick a process to debug
PT_DENY_ATTACH — so processes can stop themselves from being debugged
[...]
No mention of reading or writing memory or registers. Which would have been discouraging if the man page had not also mentioned PT_GETREGS, PT_SETREGS, PT_GETFPREGS, and PT_SETFPREGS in the error codes section. So, I checked ptrace.h. There I found:
PT_READ_I — to read instruction words
PT_READ_D — to read data words
PT_READ_U — to read U area data if you’re old enough to remember what the U area is
[...]
There’s one problem solved. I can read and write memory for breakpoints. But I still can’t get access to registers, and I need to be able to mess with EIP.
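For reference, here is a rough sketch of driving those request codes; target_pid and addr are hypothetical, and whether these requests actually work depends on the OS X version:

#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

pid_t target_pid = /* pid of the process to inspect */ 0;
char *addr = /* address inside the target */ NULL;

ptrace(PT_ATTACH, target_pid, 0, 0);                         /* attach and stop the target */
waitpid(target_pid, NULL, 0);
int word = ptrace(PT_READ_D, target_pid, (caddr_t)addr, 0);  /* read one data word         */
ptrace(PT_WRITE_D, target_pid, (caddr_t)addr, word);         /* write it back              */
ptrace(PT_DETACH, target_pid, 0, 0);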

I know this thread is 100 years old, but for people coming here from a search engine:
xnumem does exactly what you are looking for: manipulate and read inter-process memory.
// Create new xnu_proc instance
xnu_proc *Process = new xnu_proc();
// Attach to pid (or process name)
Process->Attach(getpid());
// Manipulate memory
int i = 1337, i2 = 0;
i2 = Process->memory().Read<int>((uintptr_t)&i);
// Detach from process
Process->Detach();

If you're looking to be able to share chunks of memory between processes, you should check out shm_open(2) and mmap(2). It's pretty easy to allocate a chunk of memory in one process and pass the path (for shm_open) to another, and both can then go crazy together. This is a lot safer than poking around in another process's address space, as Chris Hanson mentions. Of course, if you don't have control over both processes, this won't do you much good.
(Be aware that the max path length for shm_open appears to be 26 bytes, although this doesn't seem to be documented anywhere.)
// Create shared memory block
void* sharedMemory = NULL;
size_t shmemSize = 123456;
const char* shmName = "mySharedMemPath";
int shFD = shm_open(shmName, (O_CREAT | O_EXCL | O_RDWR), 0600);
if (shFD >= 0) {
    if (ftruncate(shFD, shmemSize) == 0) {
        sharedMemory = mmap(NULL, shmemSize, (PROT_READ | PROT_WRITE), MAP_SHARED, shFD, 0);
        if (sharedMemory != MAP_FAILED) {
            // Initialize shared memory if needed
            // Send 'shmName' & 'shmemSize' to other process(es)
        } else { /* handle error */ }
    } else { /* handle error */ }
    close(shFD); // Note: sharedMemory still valid until munmap() called
} else { /* handle error */ }
...
Do stuff with shared memory
...
// Tear down shared memory
if (sharedMemory != NULL) munmap(sharedMemory, shmemSize);
if (shFD >= 0) shm_unlink(shmName);
// Get the shared memory block from another process
void* sharedMemory = NULL;
size_t shmemSize = 123456; // Or fetched via some other form of IPC
const char* shmName = "mySharedMemPath"; // Or fetched via some other form of IPC
int shFD = shm_open(shmName, O_RDONLY, 0600); // Can be R/W if you want
if (shFD >= 0) {
    sharedMemory = mmap(NULL, shmemSize, PROT_READ, MAP_SHARED, shFD, 0);
    if (sharedMemory != MAP_FAILED) {
        // Check shared memory for validity
    } else { /* handle error */ }
    close(shFD); // Note: sharedMemory still valid until munmap() called
} else { /* handle error */ }
...
Do stuff with shared memory
...
// Tear down shared memory
if (sharedMemory != NULL) munmap(sharedMemory, shmemSize);
// Only the creator should shm_unlink()

You want to do inter-process communication (IPC) with the shared memory method. For a summary of other common methods, see here.
It didn't take me long to find what you need in this book, which contains all the APIs that are common to all UNIXes today (there are many more than I thought). You should buy it. The book is a set of several hundred printed man pages, which are rarely installed on modern machines.
Each man page details a C function.
It didn't take me long to find shmat(), shmctl(), shmdt() and shmget() in it. I didn't search extensively; maybe there's more.
It may look a bit outdated, but: yes, this is still the base user-space API of modern UNIX OSes, going back to the old 80's.
Update: most functions described in the book are part of the POSIX C headers, so you don't need to install anything. There are a few exceptions, such as "curses", the original library.
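For reference, a minimal sketch of the System V calls named above, on the creator's side (the key, size and message are made up for illustration):

#include <sys/ipc.h>
#include <sys/shm.h>
#include <string.h>

key_t key = ftok("/tmp", 'A');                      /* both processes agree on a key       */
int shmid = shmget(key, 4096, IPC_CREAT | 0600);    /* create (or look up) a 4 KiB segment */
char *mem = shmat(shmid, NULL, 0);                  /* attach it into this process         */
strcpy(mem, "hello from the creator");
/* ... another process calls shmget(key, 4096, 0600) and shmat() to see the data ... */
shmdt(mem);                                         /* detach                              */
shmctl(shmid, IPC_RMID, NULL);                      /* remove once everyone has detached   */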

I did find a short implementation of what you need (a single source file, main.c).
It is specially designed for XNU.
It is in the top ten results of a Google search for the keywords « dump process memory os x ».
The source code is here,
but from a strict virtual-address-space point of view, you should be more interested in this question: OS X: Generate core dump without bringing down the process? (look also at this)
When you look at the gcore source code, it is quite complex to do this, since you need to deal with threads and their state...
On most Linux distributions, the gcore program is now part of the GDB package. I think the OSX version is installed with xcode/the development tools.
UPDATE: wxHexEditor is an editor which can edit devices. It can also edit process memory the same way it edits regular files. It works on all UNIX machines.

Manipulating a process's memory behind its back is a Bad Thing and is fraught with peril. That's why Mac OS X (like any Unix system) has protected memory, and keeps processes isolated from one another.
Of course it can be done: There are facilities for shared memory between processes that explicitly cooperate. There are also ways to manipulate other processes' address spaces as long as the process doing so has explicit right to do so (as granted by the security framework). But that's there for people who are writing debugging tools to use. It's not something that should be a normal — or even rare — occurrence for the vast majority of development on Mac OS X.

In general, I would recommend that you use regular open() to open a temporary file. Once it's open in both processes, you can unlink() it from the filesystem and you'll be set up much like you would be if you'd used shm_open. The procedure is extremely similar to the one specified by Scott Marcy for shm_open.
The disadvantage to this approach is that if the process that will be doing the unlink() crashes, you end up with an unused file and no process has the responsibility of cleaning it up. This disadvantage is shared with shm_open, because if nothing shm_unlinks a given name, the name remains in the shared memory space, available to be shm_opened by future processes.
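A rough sketch of the open()/unlink() variant (the path and size are placeholders; every participating process must open() the same path before it is unlinked):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int fd = open("/tmp/myscratchfile", O_CREAT | O_EXCL | O_RDWR, 0600);
ftruncate(fd, 123456);
/* ... once every participating process has open()ed the path ... */
unlink("/tmp/myscratchfile");   /* no directory entry is left behind to clean up */
void *mem = mmap(NULL, 123456, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);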

How to determine if a pointer is in rodata [duplicate]

Can I tell if a pointer is in the rodata section of an executable?
As in, editing that pointer's data would cause a runtime system trap.
Example (using a C character pointer):
void foo(char const * const string) {
    if ( in_rodata( string ) ) {
        puts("It's in rodata!");
    } else {
        puts("That ain't in rodata");
    }
}
Now I was thinking that, maybe, I could simply compare the pointer to the rodata section.
Something along the lines of:
if ( string > start_of_rodata && string < end_of_rodata ) {
    // it's in rodata!
}
Is this a feasible plan/idea?
Does anyone have an idea as to how I could do this?
(Is there any system information that one might need in order to answer this?)
I am executing the program on a Linux platform.
I doubt that it could possibly be portable
If you don't want to mess with linker scripts or platform-specific memory-map query APIs, a proxy approach is fairly portable on platforms with memory protection, if you're willing to settle for knowing whether the location is writable, read-only, or neither. The general idea is to do a test read and a test write. If the first succeeds but the second fails, it's likely .rodata or a code segment. This doesn't tell you "it's rodata for sure" - it may be a code segment, or some other read-only page, such as a read-only file memory mapping that has copy-on-write disabled. But whether that matters depends on what you had in mind for this test - what the ultimate purpose is.
Another caveat is: For this to be even remotely safe, you must suspend all other threads in the process when you do this test, as there's a chance you may corrupt some state that code executing on another thread may happen to refer to. Doing this from inside a running process may have hard-to-debug corner cases that will stop lurking and show themselves during a customer demo. So, on platforms that support this, it's always preferable to spawn another process that will suspend the first process in its entirety (all threads), probe it, write the result to the process's address space (to some result variable), resume the process and terminate itself. On some platforms, it's not possible to modify a process's address space from outside, and instead you need to suspend the process mostly or completely, inject a probe thread, suspend the remaining other threads, let the probe do its job, write an answer to some agreed-upon variable, terminate, then resume everything else from the safety of an external process.
For simplicity's sake, the below will assume that it's all done from inside the process. Even though "fully capable" self-contained examples that work cross-process would not be very long, writing this stuff is a bit tedious especially if you want it short, elegant and at least mostly correct - I imagine a really full day's worth of work. So, instead, I'll do some rough sketches and let you fill in the blanks (ha).
Windows
Structured exceptions get thrown e.g. due to protection faults or divide by zero. To perform the test, attempt a read from the address in question. If that succeeds, you know it's at least a mapped page (otherwise it'll throw an exception you can catch). Then try writing there - if that fails, then it was read-only. The code is almost boring:
static const int foo;
static int bar;
#if _WIN32
typedef struct ThreadState ThreadState;
ThreadState *suspend_other_threads(void) { ... }
void resume_other_threads(ThreadState *state) { ... }

int check_if_maybe_rodata(void *p) {
    __try {
        (void) *(volatile char *)p;
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        return false;   // not even readable, so it can't be .rodata
    }

    volatile LONG result = 0;
    ThreadState *state = suspend_other_threads();
    __try {
        InterlockedExchange(&result, 1);
        LONG saved = *(volatile LONG *)p;
        InterlockedExchange((volatile LONG *)p, saved);
        InterlockedExchange(&result, 0); // we succeeded writing there
    } __except (EXCEPTION_EXECUTE_HANDLER) {}
    resume_other_threads(state);
    return result;
}

int main() {
    assert(check_if_maybe_rodata(&foo));
    assert(!check_if_maybe_rodata(&bar));
}
#endif
Suspending the threads requires traversing the thread list, and suspending each thread that's not the current thread. The list of all suspended threads has to be created and saved, so that later the same list can be traversed to resume all the threads.
There are surely caveats, and WoW64 threads have their own API for suspension and resumption, but it's probably something that would, in controlled circumstances, work OK.
Unix
The idea is to leverage the kernel to check the pointer for us "at arm's length" so that no signal is thrown. Handling POSIX signals that result from memory protection faults requires patching the code that caused the fault, inevitably forcing you to modify the protection status of the code's memory. Not so great. Instead, pass the pointer to a syscall you know should succeed in all normal circumstances when reading from the pointed-to address - e.g. open /dev/zero, and write to that file from a buffer pointed to by the pointer. If that fails with EFAULT, it is due to buf [being] outside your accessible address space. If you can't even read from that address, it's not .rodata for sure.
Then do the converse: from an open /dev/zero, attempt a read into the address you are testing. If the read succeeds, then it wasn't read-only data. If the read fails with EFAULT, that most likely means the area in question is read-only, since reading from it (the first test) succeeded but writing to it didn't.
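A sketch of that probe (keeping in mind the caveat above about suspending other threads first, since a successful read() clobbers one byte at the target address):

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if *p looks readable but not writable (a .rodata candidate). */
int probe_readonly(void *p)
{
    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return 0;

    int readable = (write(fd, p, 1) == 1);   /* kernel reads one byte from *p      */
    int writable = (read(fd, p, 1) == 1);    /* kernel writes one zero byte to *p! */
    int write_err = writable ? 0 : errno;
    close(fd);

    return readable && !writable && write_err == EFAULT;
}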
In all cases, it'd be most preferable to use native platform APIs to test the mapping status of the page on which the address you try to access resides, or even better - to walk the sections list of the mapped executable (ELF on Linux, PE on Windows), and see exactly what went where. It's not somehow guaranteed that on all systems with memory protection the .rodata section or its equivalent will be mapped read only, thus the executable's image as-mapped into the running process is the ultimate authority. That still does not guarantee that the section is currently mapped read-only. An mprotect or a similar call could have changed it, or parts of it, to be writable, even modified them, and then perhaps changed them back to read-only. You'd then have to either checksum the section if the executable's format provides such data, or mmap the same binary somewhere else in memory and compare the sections.
But I smell a faint whiff of an XY problem: what is it that you're actually trying to do? Surely you don't just want to check whether an address is in .rodata out of curiosity. You must have some use for that information, and it is that application which would ultimately decide whether even doing this .rodata check should be on the radar. It may be, it may not be. Based on your question alone, it's a solid "who knows?"

mmap() for Remote File

Currently I am implementing a version of mmap() whose objective is to map a remote file on a client machine. For the implementation, I cannot use any built-in or third-party libraries. Having said that, I am unsure whether the implementation should be based on either of the following two options:
Load the file on the client machine after reading the file contents from the client side and use the mmap() syscall by using the file descriptor obtained from the client machine or
Allocating memory for each chunk of file data received by the client side by using sbrk()
Any suggestions will be greatly appreciated!
This is quite possible to do in Linux, and even in a thread-safe fashion for a multithreaded process, but there is one very difficult function you'd need to implement either yourself, or by using some library.
You would need to decode and emulate any memory-accessing instruction, using an interface similar to
static void emulate(mcontext_t *const context,
                    void (*fetch)(void *const data,
                                  const unsigned long addr,
                                  size_t bytes),
                    void (*store)(const unsigned long addr,
                                  const void *const data,
                                  size_t bytes));
The instruction to decode is at (void *)context->gregs[REG_IP] on x86, and at (void *)context->gregs[REG_RIP] on x86-64. The function must skip the instruction by incrementing context->gregs[REG_IP]/context->gregs[REG_RIP]/etc. by the number of bytes in the machine instruction. If you don't, SIGSEGV will just be raised again and again, with the program code stuck in that instruction!
The function must use only the fetch and store callbacks to access the memory that caused the SEGV. In your case, they would be implemented as functions that contact the remote machine, asking it to perform the desired action on the specified bytes.
Assuming you have the above three functions implemented, the rest is just about trivial. For simplicity, let's assume you have
static void *map_base;
static size_t map_size;
static void *map_ends; /* (char *)map_base + map_size */
static void sigsegv_handler(int signum, siginfo_t *info, void *context)
{
    if (info->si_addr >= map_base && info->si_addr < map_ends) {
        const int saved_errno = errno;
        emulate(&((ucontext_t *)context)->uc_mcontext,
                your_load_function, your_store_function);
        errno = saved_errno;
    } else {
        struct sigaction act;
        sigemptyset(&act.sa_mask);
        act.sa_handler = SIG_DFL;
        act.sa_flags = 0;
        if (sigaction(SIGSEGV, &act, NULL) == 0)
            raise(SIGSEGV);
        else
            raise(SIGKILL);
    }
}
static int install_sigsegv_handler(void)
{
    struct sigaction act;
    sigemptyset(&act.sa_mask);
    act.sa_sigaction = sigsegv_handler;
    act.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &act, NULL) == -1)
        return errno;
    return 0;
}
If map_size was already obtained from the remote machine (and rounded up to sysconf(_SC_PAGESIZE)), then you just need to do
if (install_sigsegv_handler()) {
    /* Failed; see errno. Abort. */
}
map_base = mmap(NULL, map_size, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, (off_t)0);
if ((void *)map_base != MAP_FAILED)
    map_ends = (void *)(map_size + (char *)map_base);
else {
    /* Failed; see errno. Abort. */
}
Now that I've scared everyone reading this out of their brains, I'm happy to also mention that there is a much easier, portable way to do this. It also tends to be more efficient.
This is not "memory mapping a remote file", but a co-operative scheme where multiple machines can share a mapping. From the user's perspective it's pretty much the same thing, but all parties using the mapping must participate in the work.
Instead of trying to catch every access to the mapped region, use page granularity and introduce the concept of page owner: each page of the mapping is accessible on at most one machine at a time, that machine owning said page.
Memory maps act on page-sized units (see sysconf(_SC_PAGESIZE)). You cannot set a specific byte or arbitrary byte range to be inaccessible or read-only -- unless it is aligned to page boundary. You can change any page to be readable and writable, readable only, or inaccessible (PROT_READ|PROT_WRITE, PROT_READ, and PROT_NONE, respectively; see mmap() and mprotect()).
The owner concept is quite simple. When a machine owns a page, it can freely read and write to the page; otherwise it cannot. (Note: If there is a backing file, updating the mapped file contents atomically is very difficult. I really recommend an approach where there is no backing file, or where the backing file is updated in page-sized chunks using fcntl()-based leases or locking.)
Simply put, each page in the mapping is PROT_READ|PROT_WRITE on exactly one machine, and PROT_NONE in all others.
When somebody tries to write to a read-only page, the SIGSEGV handler on that machine is triggered. It contacts the other machines, and requests the ownership of that particular page. The then-owner, receiving such a message, changes its mapping to PROT_NONE, and sends the page to the new owner. The new owner updates the mapping, changing the protection to PROT_READ|PROT_WRITE, and returns from the SIGSEGV handler.
A couple of notes:
If the SIGSEGV handler returns before a change occurs in the mapping, nothing bad happens. The SIGSEGV signal simply gets immediately re-raised by the same instruction.
I recommend using a separate thread for receiving pages, and updating the local contents of the mapping. Then, the SIGSEGV handler only needs to make sure it has sent a request for ownership of that page, and sched_yield(), to not spin or "twiddle its thumbs" unnecessarily.
Program execution continues when the mapping is updated for that page. send() etc. are async-signal-safe, so you can send the request from the signal handler directly -- but note that you don't want to send the request every time slice (100-1000 times a second!), just once in a while.
Remember: If the SIGSEGV signal handler does not resolve the problem, there is no harm done. The SIGSEGV just gets raised immediately again by the same instruction. However, I do warmly recommend using sched_yield(), so that other threads and processes on the machine get to use the CPU, instead of wasting CPU time raising a signal millions of times a second for nothing.
If writes are rare, but reads common, you can extend the ownership concept, to read-owner(s) and write-owner. Each page can be owned by any number of read-owners, as long as there is no write-owner. To modify the page, one needs to be write-owner, and that revokes any read-owners.
The logic is such that any thread can ask for read-ownership. If there is no write-owner, it is automatically granted; either the last write-owner or any existing read-owner will send the read-only page contents. If there is a write-owner, it must downgrade its ownership to read-owner and send the now read-only contents to the requester. To modify a page, one must already be a read-owner, and simply tells all other read-owners that it is now the write-owner.
In this case, the SIGSEGV handler is not much more complicated. If the page protections are PROT_NONE, it will ask for read ownership. If the page protections are PROT_READ, it already has read ownership, and therefore must ask to upgrade it to write-ownership. Note: using this scheme, we do not need to check the instruction whether it tried to access the memory for fetch or store -- indeed, it does not even matter. In the worst case -- write to a page not owned in any way by this thread -- SIGSEGV just gets raised twice: first to get read ownership, and second time to upgrade it to write-ownership.
Note that you cannot upgrade read-ownership to write-ownership in the SIGSEGV handler. If you did that, two threads on separate machines could upgrade their read-ownership at the same time, before the messages reach the other parties. All state changes can only occur after all necessary confirmation TCP messages have arrived.
(Since many-to-many message arbitration is quite complicated, it is almost always better to have a designated arbitrator (or "server"), which handles all the requests from each child. Page transfers can still be direct between members, although you do need to send a notification of each page transfer to the arbitrator/server, too.)
If there is no backing file -- i.e. it is MAP_ANONYMOUS -- you can replace the contents of any page atomically.
When receiving a page, you first get a new anonymous page using mmap(NULL, page, PROT_READ[|PROT_WRITE], MAP_PRIVATE|MAP_ANONYMOUS, -1, (off_t)0), and copy the new data into it. Then, you use mremap() to replace the old page with the new one. (The old page is effectively released as if munmap() was called, but this all happens atomically, so that no thread sees any intermediate state.)
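Roughly, and Linux-specific (a hedged sketch; dest must be page-aligned and page is the page size in use):

#define _GNU_SOURCE
#include <string.h>
#include <sys/types.h>
#include <sys/mman.h>

/* Atomically replace the page at 'dest' with 'new_data'. */
static int replace_page(void *dest, const void *new_data, size_t page)
{
    /* Get a fresh anonymous page and fill it with the received data. */
    void *temp = mmap(NULL, page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, (off_t)0);
    if (temp == MAP_FAILED)
        return -1;
    memcpy(temp, new_data, page);

    /* Move the new page over the old one; the old page is released as if
       munmap() had been called, with no intermediate state visible. */
    if (mremap(temp, page, page, MREMAP_MAYMOVE | MREMAP_FIXED, dest) == MAP_FAILED) {
        munmap(temp, page);
        return -1;
    }
    return 0;
}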
This way you'll be sending just page-sized chunks around. For portability, you should actually use the smallest common multiple of all the page sizes involved, so that every machine can participate regardless of their possible page size differences. (Fortunately, they're always powers of two, and very often 4096, although I do seem to recall architectures that used 512, 2048, 8192, 16384, 32768, 65536, and 2097152 -byte pages, so please do not just hard-code your page size.)
Overall, both approaches have their benefits. The first (requiring the instruction emulator) allows any number of clients to access a memory mapping on one server with no co-operation needed from any of the other mappings to the same file on the server. The second needs co-operation from all parties using the mapping, but reduces the access latencies for multiple consecutive accesses; using the read-owner/write-owner logic, you should get a very performant shared memory management.
If you have difficulty deciding between brk()/sbrk() on one hand, and mmap() on the other, I do fear both of these approaches are just too complex for you at this point. You should understand the inherent limitations of memory mapping first -- page granularity et cetera -- and perhaps even some of the cache theory (since this is essentially caching data), so that you can relatively easily manage the concepts involved.
Believe me, trying to program something you cannot really grasp at the conceptual level, leads to frustration. That said, grasping for the concepts, taking the time to learn them as you encounter them while programming, is fine; you just need to spend the time and effort.
Questions?
Here's an idea:
1. When the caller requests to "remote mmap" a region or an entire file, allocate memory for that entire size right away and return that pointer. Also store a record of the allocation internally.
2. Use SFTP or similar to open the remote file. Don't do anything with it yet; just make sure it exists and has the right size.
3. Install a signal handler for SIGSEGV.
4. Use mprotect(2) to set the entire allocated space to be inaccessible (PROT_NONE).
5. When your signal handler is called, use the siginfo_t argument's si_addr field to determine whether the segmentation fault is in the region you allocated in step 1. If not, pass the segmentation fault along; it's probably going to be fatal, as they usually are in most programs.
6. Now you know you have a region of memory which has been requested but is not yet accessible. Populate the memory by reading from the remote file opened in step 2 and return from your signal handler.
What we achieve then is something like "page faults" where we load on demand the required parts of the remote file. Of course, if you know something about the access pattern (e.g. that the entire file will always be needed in some particular order, or will be needed by multiple processes over time) you can do better, perhaps simpler things.
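A rough sketch of the handler described in steps 3-6; region, region_size and fetch_remote_page() are placeholder names for whatever you set up in steps 1-2, and the handler would be installed with sigaction() and SA_SIGINFO as shown in the previous answer:

#include <signal.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

extern char  *region;        /* from step 1 */
extern size_t region_size;   /* from step 1 */
extern void   fetch_remote_page(char *dst, size_t file_offset, size_t len);  /* uses the file from step 2 */

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *fault = (char *)info->si_addr;

    if (fault < region || fault >= region + region_size) {
        signal(SIGSEGV, SIG_DFL);   /* not our region: re-raise with the default action */
        raise(SIGSEGV);
        return;
    }

    char *page_start = (char *)((uintptr_t)fault & ~(uintptr_t)(page - 1));
    mprotect(page_start, page, PROT_READ | PROT_WRITE);                  /* make the page accessible */
    fetch_remote_page(page_start, (size_t)(page_start - region), page);  /* fill it from the file    */
}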

How does Linux Kernel assigns memory pointers when a process uses shm_open()?

I'm on Linux 2.6 and I have a weird problem. I have 3 concurrent processes (forked from the same process) which need to obtain 3 DIFFERENT shared memory segments, one for each process. Each of the processes executes this code (please note that the 'message' type is user-defined):
message *m;
int fd = shm_open("message", O_CREAT|O_RDWR, S_IRUSR|S_IWUSR);
ftruncate(fd, sizeof(message));
m = mmap(NULL, sizeof(message), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
char messagename[16];
snprintf(messagename, sizeof(messagename), "%p", m);
char path[32] = "/dev/shm/";
strcat(path, messagename);
rename("/dev/shm/message", path);
Let me explain a bit: I want every process to allocate a shared memory zone which contains a message. To make sure another process (the message receiver) can access the same shm, I then rename my shm file from "message" to a string named after the message pointer (this because the process which receives the message already knows the pointer).
When executing the program, though, I tried to print (for debugging purposes) the pointers that every process received when mmapping the fd obtained with shm_open, and I noticed that all of them got the SAME pointer. How is that possible? I thought that maybe the other processes did their shm_open() after the first one did and before it renamed the segment, so I also tried to make these lines of code atomic by using a process-shared mutex, but the problem persists.
I would really appreciate any kind of help or suggestion.
Your processes all started with identical address space layouts at the moment of forking, and then followed very similar code paths. It is therefore not surprising that they all end up with the same value of m.
However, once they became separate processes, their address spaces became independent, so having the same value of m does not imply that all of the ms are pointing to the same thing.
Furthermore, I am not sure that your idea of renaming the /dev/shm entry after creating the shared memory block is safe or portable. If you want each process's shared memory block to have a unique name, why not base the name on the process ID (which is guaranteed to be unique at a given point in time) and pass it directly to shm_open, rather than going to the bother of renaming it afterwards?
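For example (a sketch; the "/message-<pid>" naming is just an illustration):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

char shmname[32];
snprintf(shmname, sizeof shmname, "/message-%ld", (long)getpid());
int fd = shm_open(shmname, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
/* the receiver builds the same "/message-<pid>" string from the sender's PID and shm_open()s it */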
The same virtual address in different processes can (and usually does) map to different physical pages in memory. You might want to read the wikipedia article on virtual memory.
I solved a similar problem simply by making the mmap before forking. So after forking the same area is shared between all processes. I then put my semaphores and mutexes on defined positions. It works perfectly.
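A minimal sketch of that approach, using an anonymous shared mapping (the shared_state struct is just an example):

#include <sys/mman.h>
#include <unistd.h>

typedef struct { int counter; } shared_state;

int main(void)
{
    shared_state *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return 1;
    s->counter = 0;

    if (fork() == 0) {       /* child */
        s->counter = 42;     /* visible to the parent through the shared mapping */
        _exit(0);
    }
    /* parent: wait for the child, then read s->counter ... */
    return 0;
}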

Managing Shared Memory in C on OSX

I am working on a university assignment based largely around IPC and shared memory. The problem is, as a complete noob to C, I've been happily testing my app (which uses shmget and shmat, obviously) for hours. As you can probably guess, I've not been cleaning up after myself, and now I can't run my app, because (I assume) shmget can't allocate any more resources.
My question is: how can I get this resource back without restarting OSX, and is there a GUI tool or something I can use to monitor/manage this shared memory I am creating?
Perhaps a bit late, but there are cmdline tools available to do exactly this.
ipcs and ipcrm
take a look at their man pages.
Call shmdt ("shared memory detach") on the shared memory segment in each process that holds a reference to it. Unix shared memory sections are reference counted, so when the last process detaches from them, they can be destroyed with shmctl(id, IPC_RMID, NULL).
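In code, assuming shmid came from shmget() and addr from shmat():

#include <sys/shm.h>

shmdt(addr);                     /* detach this process's mapping                     */
shmctl(shmid, IPC_RMID, NULL);   /* mark for removal; destroyed after the last detach */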
From outside your application, the only option I can think of right now to clear your shared memory segments is:
for (int id = 0; id < INT_MAX; id++)
    shmctl(id, IPC_RMID, NULL);
but this is a horribly inefficient kludge. (I'm also not sure if it works; it doesn't on Linux, but Linux violates the Unix standard while MacOS X is certified against it.)

Quantify RAM, CPU use of a process in C under Linux

How do I find out how much RAM and CPU a certain process "eats" in Linux? And how do I find out all running processes (including daemons and system ones)? =)
UPD: using C language
Use top or ps.
For example, ps aux will list all processes along with their owner, state, memory used, etc.
EDIT: To do that in C under Linux, you need to read the process files in the proc filesystem. For instance, /proc/1/status contains information about the init process (which always has PID 1):
#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[512];
    unsigned long vmsize = 0;
    const char *token = "VmSize:";
    FILE *status = fopen("/proc/1/status", "r");
    if (status != NULL) {
        while (fgets(buf, sizeof(buf), status)) {
            if (strncmp(buf, token, strlen(token)) == 0) {
                sscanf(buf, "%*s %lu", &vmsize);
                printf("The INIT process' VM size is %lu kilobytes.\n", vmsize);
                break;
            }
        }
        fclose(status);
    }
    return 0;
}
Measuring how much RAM a process uses is nearly impossible. The difficulty is that each piece of RAM is not used by exactly one process, and not all the RAM a process is using is actually "owned" by it.
For example, two processes can have shared mappings of the same file, in which case any pages which are in core for the mapping, would "belong" to both processes. But what if only one of these processes was using it?
Private pages can also be copy-on-write if the process has forked, or if they have been mapped but not used yet (consider the case where a process has malloc'd a huge area but not touched most of it yet). In this case, which process "owns" those pages?
Processes can also be effectively using parts of the buffer cache and lots of other kinds of kernel buffers, which aren't "owned" by them.
There are two measurements which are available: VM size (how much memory the process has mapped right now) and resident set size (RSS). Neither of them really tells you much about how much memory a process is using, because they both count shared pages and neither counts non-mapped pages.
So is there an answer? Some of these can be measured by examining the page maps structures which are now available in /proc (/proc/pid/pagemap), but there isn't necessarily a trivial way of sharing out the "ownership" of shared pages.
See Linux's Documentation/vm/pagemap.txt for a discussion of this.
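For illustration, a hedged sketch of reading one pagemap entry (reading another process's pagemap generally requires appropriate privileges):

#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the raw 64-bit pagemap entry for one virtual address of process 'pid'
   (bit 63 = page present in RAM, bits 0-54 = page frame number). */
uint64_t pagemap_entry(pid_t pid, uintptr_t vaddr)
{
    char path[64];
    uint64_t entry = 0;
    long page = sysconf(_SC_PAGESIZE);

    snprintf(path, sizeof path, "/proc/%ld/pagemap", (long)pid);
    FILE *f = fopen(path, "rb");
    if (!f)
        return 0;

    /* One 8-byte entry per virtual page, indexed by page number. */
    if (fseeko(f, (off_t)(vaddr / (uintptr_t)page) * 8, SEEK_SET) == 0)
        fread(&entry, sizeof entry, 1, f);
    fclose(f);
    return entry;
}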
