I'm trying to write a program that reads the memory of other processes. This seems quite straightforward; however, I came across one problem. If the process I'm trying to read from has memory mapped but has not used that memory so far (e.g. the process called malloc, but never read from or wrote to that memory), the physical pages are not yet allocated. I presume this is lazy allocation.
Now, when my program tries to read that memory, it is immediately allocated. I would like to avoid that, as I noticed some programs seem to allocate (or map) very large amounts that they don't normally use. This either causes very large memory consumption after the read or, in extreme cases, the process even being killed by the OOM killer when the mapped region was bigger than the available RAM. This is of course not acceptable, as my reader should ideally be transparent and not affect the process being examined.
So far I have only found information about /proc/[pid]/pagemap, but I'm not fully sure it suits my needs. There is a 'present' bit, but I could not find information on what exactly it means and whether it is what I'm looking for.
Can someone confirm that? Or is there another way of achieving the same goal?
I want to mmap a big file into memory and parse it sequentially. As I understand it, once bytes have been lazily read into memory, they stay there. Is there a way to periodically tell the system to release the previously read contents?
This understanding is only a very superficial view.
To understand what really happens you have to take into account the difference between the virtual memory of your process and the actual real memory of the machine. Mapping a huge file means reserving space in your virtual address space. It's probably platform-dependent whether anything is actually read at this point.
When you actually access the data, the OS has to fill an actual page of memory. When you access other parts, those parts have to be brought into memory too. It's completely up to the OS when it will re-use the memory. Normally this happens when some data is accessed by you or another process and no free memory is available, but it could happen at any time. If you access it again later, it might still be in memory or will be brought back by the OS. There is no way for your process to tell the difference.
In short: You don't need to care about that. The OS manages all that in the background.
One point might be that if you map a really huge file, this takes up space in your virtual address space, which is limited. So if you deal with many huge mappings and/or huge allocations, you might want to map only parts of the file at a given time.
ADDITION: after thinking a bit about it, I came up with a reason why it might be smarter to do it blockwise-sequential. Although I doubt you will be able to measure that.
Any reasonable OS will look for a block to unload when in need in something like the following order:
1. unmapped files (not needed anymore)
2. LRU unmodified mapped file (can be retrieved from disc)
3. LRU modified mapped file (same as 2. but needs to be updated on disc before unload)
4. LRU allocated memory (needs to be written to swap)
So by unmapping blocks known to never be needed again as you go, you give the OS a hint that these should be freed earlier. This gives data that has been used less recently, but might be accessed in the future, a bigger chance to stay in memory.
The memory usage of a process can be displayed by running:
$ ps -C processname -o size
SIZE
3808
Is there any way to retrieve this information without executing ps (or any external program), or reading /proc?
On a Linux system, a process's memory usage can be queried by reading /proc/[pid]/statm, where [pid] is the PID of the process. If a process wants to query its own data, it can do so by reading /proc/self/statm instead. man 5 proc says:
/proc/[pid]/statm
Provides information about memory usage, measured in pages. The
columns are:
size total program size
(same as VmSize in /proc/[pid]/status)
resident resident set size
(same as VmRSS in /proc/[pid]/status)
share shared pages (from shared mappings)
text text (code)
lib library (unused in Linux 2.6)
data data + stack
dt dirty pages (unused in Linux 2.6)
You could just open the file with: fopen("/proc/self/statm", "r") and read the contents.
Since the file reports results in pages, you will also want to find the page size: getpagesize() returns the size of a page, in bytes.
You have a few options to find the memory usage of a program:
Run it within a profiler like Valgrind or memprof.
exec/proc_open/fork a new process to use ps, top, or pmap as you would from the command line
bundle the ps into your app and use it directly (it's open source, of course!)
Use the /proc system (which is all that ps does, anyways...)
Ask the kernel, which watches over process memory operations, to report it. The /proc filesystem is just a view into the kernel's internal data structures, so this is really already done for you.
Develop your own mechanism to compute memory usage without kernel assistance.
The earlier options are all educational from a system administration perspective, and would be the best options in a real-life situation, but the last bullet point is probably the most interesting. You'd probably want to read the source of Valgrind or memprof to see how it works, but essentially what you'd need to do is insert your mechanism between the app and the kernel and intercept any requests for memory allocation. Additionally, when the process started, you would want to initialize its memory space with a preset value like 0xDEADBEEF. Then, after the process finished, you could read through the memory space and count the occurrences of words other than your preset value, giving you an estimate of memory usage.
Of course, things are always more complicated than they seem. What about memory used by shared libraries? Pipes? Shared memory between your processes and another? System calls? Virtual memory allocated but not used? Data buffered to the disk? There's a lot of calls to be made beyond your question 'memory of process', see this post for some additional concerns.
I run my C program on SUSE Linux Enterprise; it compresses several thousand large files (between 10MB and 100MB in size), and it gets slower and slower as it runs (it's running multi-threaded with 32 threads on an Intel Sandy Bridge board). When the program completes and is run again, it's still very slow.
When I watch the program running, I see that the memory is being depleted while the program runs, which you would think is just a classic memory leak problem. But, with a normal malloc()/free() mismatch, I would expect all the memory to return when the program terminates. But, most of the memory doesn't get reclaimed when the program completes. The free or top command shows Mem: 63996M total, 63724M used, 272M free when the program is slowed down to a halt, but, after the termination, the free memory only grows back to about 3660M. When the program is rerun, the free memory is quickly used up.
The top program only shows that the program, while running, is using at most 4% or so of the memory.
I thought that it might be a memory fragmentation problem, but, I built a small test program that simulates all the memory allocation activity in the program (many randomized aspects were built in - size/quantity), and it always returns all the memory upon completion. So, I don't think that's it.
Questions:
Can there be a malloc()/free() mismatch that will lose memory permanently, i.e. even after the process completes?
What other things in a C program (not C++) can cause permanent memory loss, i.e. after the program completes, and even the terminal window closes? Only a reboot brings the memory back. I've read other posts about files not being closed causing problems, but, I don't think I have that problem.
Is it valid to be looking at top and free for the memory statistics, i.e. do they accurately describe the memory situation? They do seem to correspond to the slowness of the program.
If the program only shows a 4% memory usage, will something like valgrind find this problem?
Can there be a malloc()/free() mismatch that will lose memory permanently, i.e. even after the process completes?
No; malloc and free, and even mmap, are harmless in this respect, and when the process terminates the OS (SUSE Linux in this case) reclaims all its memory (unless it's shared with some other process that's still running).
What other things in a C program (not C++) can cause permanent memory loss, i.e. after the program completes, and even the terminal window closes? Only a reboot brings the memory back. I've read other posts about files not being closed causing problems, but, I don't think I have that problem.
Like malloc/free and mmap, files opened by the process are automatically closed by the OS.
There are a few things which can cause permanent memory leaks, like huge pages, but you would certainly know about it if you were using them. Apart from that, no.
However, if you define memory loss as memory not marked 'free' immediately, then a couple of things can happen.
Writes to disk or to an mmap may be cached for a while in RAM. The OS must keep the pages around until it syncs them back to disk.
Files READ by the process may remain in memory if the OS has nothing else to use that RAM for right now - on the reasonable assumption that it might need them soon and it's quicker to read the copy that's already in RAM. Again, if the OS or another process needs some of that RAM, it can be discarded instantly.
Note that as someone who paid for all my RAM, I would rather the OS used ALL of it ALL the time, if it helps in even the smallest way. Free RAM is wasted RAM.
The main problem with having little free RAM is when it is overcommitted, which is to say there are more processes (and the OS) asking for or using RAM right now than is available on the system. It sounds like you are using about 4 GB of RAM in your processes, which might be a problem (and remember the OS needs a good chunk too). But it sounds like you have plenty of RAM! Try running half the number of processes and see if it gets better.
Sometimes a memory leak can cause temporary overcommitment - it's a good idea to look into that. Try plotting the memory use of your program over time - if it rises continuously, then it may well be a leak.
Note that forking a process creates a copy that shares the memory the original allocated - until both are closed or one of them 'exec's. But you aren't doing that.
Is it valid to be looking at top and free for the memory statistics, i.e. do they accurately describe the memory situation? They do seem to correspond to the slowness of the program.
Yes, top and ps are perfectly reasonable ways to look at memory, in particular observe the RES field. Ignore the VIRT field for now. In addition:
To see what the whole system is doing with memory, run:
vmstat 10
while your program is running, and for a while after, and look at what happens to the ---memory--- columns.
In addition, after your process has finished, run
cat /proc/meminfo
And post the results in your question.
If the program only shows a 4% memory usage, will something like valgrind find this problem?
Probably, but it can be extremely slow, which might be impractical in this case. There are plenty of other tools which can help, such as electricfence and others, which do not slow your program down noticeably. I've even rolled my own in the past.
malloc()/free() work on the heap. This memory is guaranteed to be released to the OS when the process terminates. It is possible to leak memory even after the allocating process terminates using certain shared memory primitives (e.g. System V IPC). However, I don't think any of this is directly relevant.
Stepping back a bit, here's output from a lightly-loaded Linux server:
$ uptime
03:30:56 up 72 days, 8:42, 2 users, load average: 0.06, 0.17, 0.27
$ free -m
total used free shared buffers cached
Mem: 24104 23452 652 0 15821 978
-/+ buffers/cache: 6651 17453
Swap: 3811 5 3806
Oh no, only 652 MB free! Right? Wrong.
Whenever Linux accesses a block device (say, a hard drive), it looks for any unused memory, and stores a copy of the data there. After all, why not? The data's already in RAM, some program clearly wanted that data, and RAM that's unused can't do anyone any good. If a program comes along and asks for more memory, the cached data is discarded to make room -- until then, might as well hang onto it.
The key to this free output is not the first line, but the second. Yes, 23.4 GB of RAM is being used -- but 17.4 GB is available for programs that want it. See Help! Linux ate my RAM! for more.
I can't say why the program is getting slower, but having the "free memory" metric steadily drop down to nothing is entirely normal and not the cause.
The operating system only makes as much memory free as it absolutely needs. Making memory free is wasted effort if the memory is later used normally -- it's more efficient to just directly transition the memory from one use to another than to make the memory free just to have to make it unfree later.
The only thing the system needs free memory for is operations that require memory that can't switch used memory from one purpose to another. This is a very small set of unusual operations such as servicing network interrupts.
If you run the command sysctl vm.min_free_kbytes, the system will tell you the number of kB it needs free. It's likely less than 100MB, so having any amount more than that free is perfectly fine.
If you want more of your memory free, remove it from the computer. Otherwise, the operating system assumes that there is zero cost to using it, and thus zero benefit to making it free.
For example, consider the data you wrote to disk. The operating system could make the memory that was holding that data free. But that's a double loss. If the data you wrote to disk is later read, it will have to read it from disk rather than just grabbing it from memory. And if that memory is later needed for some other purpose, it will just have to undo all the work it went through making it free. Yuck. So if the system doesn't absolutely need free memory, it won't make it free.
My guess would be the problem is not in your program, but in the operating system. The OS keeps a cache of recently used files in memory on the assumption that you are going to access them again. It does not know with certainty what files are going to be needed, so it can end up deciding to keep the wrong ones at the expense of the ones you wish it was keeping.
It may be keeping the output files of the first run cached when you do your second run, which prevents it from effectively using the cache on the second run. You can test this theory by deleting all files from the first run (which should free them from cache) and seeing if that makes the second run go faster.
If that doesn't work, try deleting all the input files for the first run as well.
Answers
1. Yes; there is no requirement in C or C++ that memory which is never free()'d be released back to the OS.
2. Do you have memory-mapped files, open file handles for deleted files, etc.? Linux will not delete a file until all references to it are deallocated. Also, Linux will cache the file in memory in case it needs to be read again; file cache memory usage can be ignored, as the OS will deal with it.
3. No.
4. Maybe; valgrind will highlight cases where memory is not freed.
I'd like to obtain memory usage information for both per process and system wide. In Windows, it's pretty easy. GetProcessMemoryInfo and GlobalMemoryStatusEx do these jobs greatly and very easily. For example, GetProcessMemoryInfo gives "PeakWorkingSetSize" of the given process. GlobalMemoryStatusEx returns system wide available memory.
However, I need to do it on Linux. I'm trying to find Linux system APIs equivalent to GetProcessMemoryInfo and GlobalMemoryStatusEx.
I found getrusage. However, 'ru_maxrss' (resident set size) in struct rusage is just zero; it is not implemented. Also, I have no idea how to get system-wide free memory.
As a workaround for now, I'm using system("ps -p %my_pid -o vsz,rsz") and manually logging to a file. But it's dirty and not convenient for processing the data.
I'd like to know some fancy Linux APIs for this purpose.
You can see how it is done in libstatgrab.
And you can also use it directly (it's GPL).
Linux has a (modular) filesystem-interface for fetching such data from the kernel, thus being usable by nearly any language or scripting tool.
Memory can be complex. There's the program executable itself, presumably mmap()'ed in. Shared libraries. Stack utilization. Heap utilization. Portions of the software resident in RAM. Portions swapped out. Etc.
What exactly is "PeakWorkingSetSize"? It sounds like the maximum resident set size (the maximum non-swapped physical-memory RAM used by the process).
Though it could also be the total virtual memory footprint of the entire process (sum of the in-RAM and SWAPPED-out parts).
Regardless, under Linux, you can strace a process to see its kernel-level interactions. ps gets its data from the /proc/${PID}/* files.
I suggest you cat /proc/${PID}/status. The Vm* lines are quite useful.
Specifically: VmData refers to process heap utilization. VmStk refers to process stack utilization.
If you continue using "ps", you might consider popen().
I have no idea to get system-wide free memory.
There's always /usr/bin/free
Note that Linux will make use of unused memory for buffering files and caching... Thus the +/-buffers/cache line.