I am trying to determine whether a group of contiguous memory locations on my Linux system are currently holding data for a process. If they are not then I want to allocate these locations to my programme and ensure that other processes know that those memory locations are reserved for my programme.
I need to access a specific memory location, e.g. 0xF0A35194. Any old location will not do.
I have been searching for solutions but have not been brave enough to execute and test any code for fear that I will destroy my system.
I do not want to run this code in kernel-space so a user-space solution is preferable.
Any help would be greatly appreciated!
There are a lot of issues at play here. To start, each process has its own memory address space. Even if a particular memory address is allocated in one process, it won't be in a different process. Memory virtualization and paging form a complex and opaque abstraction that can't be broken from within user space.
Next, the only reason that I can imagine you want to do something like this is to go hunting for particular DMA ranges for devices. That is also not allowed from user space and there are much better ways to accomplish that.
If you can post what you are trying to achieve more directly, we can provide a better solution.
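To illustrate the per-process point, here is a minimal sketch, assuming Linux and an anonymous mapping. It asks for memory near a chosen virtual address in the calling process only. The helper name map_near is made up for illustration. Without MAP_FIXED the address is just a hint, so the kernel may return a different address, and nothing here reserves anything in other processes.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: request `len` bytes of anonymous memory near `wanted`.
 * The address is only a hint; the kernel picks another spot if the range is
 * already in use in THIS process. Returns the address actually obtained,
 * or MAP_FAILED on error. */
static void *map_near(uintptr_t wanted, size_t len)
{
    assert(len > 0);
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    void *hint = (void *)(wanted & ~(page - 1));   /* round down to a page boundary */
    return mmap(hint, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

Comparing the returned pointer with the rounded-down hint tells you whether that range was free in your own address space. It says nothing about physical memory or about any other process.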
I'm trying to write a program that reads the memory of other processes. This seems quite straightforward; however, I came across one problem. If the process I'm trying to read from has memory mapped but has not used that memory so far (e.g. the process called malloc but never read from or wrote to that memory), the physical pages are not yet allocated. I presume this is lazy allocation.
Now, when my program tries to read that memory, it's immediately allocated. I would like to avoid that, as I noticed some programs seem to allocate (or map) very large amounts that they don't normally use. This either causes very large memory consumption after the read or, in extreme cases, the process even being killed by the OOM killer because the mapped region was bigger than the available RAM. This is of course not acceptable, as my reader should ideally be transparent and not affect the process being examined.
So far I have only found information about /proc/pid/pagemap, but I'm not fully sure if it suits my needs. There is a 'present' bit, but I did not find information on what exactly it means or whether this is what I'm looking for.
Can someone confirm that? Or is there another way of achieving the same goal?
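For what it's worth, the kernel's pagemap documentation describes one 64-bit entry per virtual page, with bit 63 meaning "page present in RAM" and bit 62 meaning "page swapped". Below is a minimal sketch of checking that bit, assuming Linux and sufficient permission to read the target's pagemap. The helper name page_present is made up for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the page holding `vaddr` in process `pid`
 * is present in RAM, 0 if it is not, and -1 if pagemap could not be read. */
static int page_present(pid_t pid, uintptr_t vaddr)
{
    assert(pid > 0);
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/pagemap", (long)pid);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long page = sysconf(_SC_PAGESIZE);
    uint64_t entry = 0;
    /* One 8-byte entry per virtual page, indexed by virtual page number. */
    off_t off = (off_t)(vaddr / (uintptr_t)page) * (off_t)sizeof entry;
    ssize_t n = pread(fd, &entry, sizeof entry, off);
    close(fd);
    if (n != (ssize_t)sizeof entry)
        return -1;
    return (int)((entry >> 63) & 1);   /* bit 63: page present in RAM */
}
```

A reader could call this for each page of a mapped region and skip the pages that report 0, since reading pagemap itself does not touch the target's memory.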
I want to mmap a big file into memory and parse it sequentially. As I understand if bytes have been lazily read into memory once, they stay there. Is there a way to periodically tell the system to release the previously read contents?
This understanding is only a very superficial view.
To understand what really happens you have to take into account the difference of the virtual memory of your process and the actual real memory of the machine. Mapping a huge file means reserving space in your virtual address-space. It's probably platform-dependent if anything is already read at this point.
When you actually access the data, the OS has to fill an actual page of memory. When you access other parts, those parts have to be brought into memory too. It's completely up to the OS when it will reuse the memory. Normally this happens when you or another process access some data and no free memory is available, but it could happen at any time. If you access the data again later, it might still be in memory or it will be brought back by the OS. Your process has no way to tell the difference.
In short: You don't need to care about that. The OS manages all that in the background.
One point might be that if you map a really huge file, this takes up space in your virtual address space, which is limited. So if you deal with many huge mappings and/or huge allocations, you might want to map only parts of the file at a given time.
ADDITION: after thinking a bit about it, I came up with a reason why it might be smarter to do it blockwise-sequential. Although I doubt you will be able to measure that.
Any reasonable OS will look for a block to unload when in need in something like the following order:
1. unmapped files (not needed anymore)
2. LRU unmodified mapped file (can be retrieved from disk)
3. LRU modified mapped file (same as 2, but needs to be updated on disk before unload)
4. LRU allocated memory (needs to be written to swap)
So by unmapping blocks that you know will never be used again as you go, you give the OS a hint that these should be freed earlier. This gives data that has been used less recently, but might be accessed in the future, a better chance of staying in memory.
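The blockwise approach could look roughly like this. It is a minimal sketch, assuming a POSIX system and a window size that is a multiple of the page size. The function name sum_file_windowed is made up, and summing bytes stands in for the real parsing.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical example: sum every byte of `path`, mapping at most `window`
 * bytes at a time. `window` must be a multiple of the page size so that each
 * file offset passed to mmap stays page-aligned.
 * Returns 0 on success, -1 on error. */
static int sum_file_windowed(const char *path, size_t window, uint64_t *sum)
{
    assert(window > 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }

    *sum = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t)window) {
        size_t len = (size_t)(st.st_size - off);
        if (len > window)
            len = window;
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (p == MAP_FAILED) { close(fd); return -1; }
        for (size_t i = 0; i < len; i++)
            *sum += p[i];
        munmap(p, len);   /* this block will not be touched again */
    }
    close(fd);
    return 0;
}
```

Only one window is mapped at any time, so the virtual address space used stays bounded regardless of the file size.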
CONTEXT:
I run on an old laptop. I have only 128 MB of RAM free out of 512 MB total. No money to buy more RAM.
I use mmap to help me circumvent this issue and it works quite well.
C code.
Debian 64 bits.
PROBLEM:
Despite all my efforts, I am running out of memory pretty quickly right now, and I would like to know if I could release the mmapped regions I have read to free my RAM.
I read that madvise could help, especially the option MADV_SEQUENTIAL.
But I don't quite understand the whole picture.
THE NEED:
To be able to free mmapped memory after the region is read so that large files don't fill my whole RAM. I will not read it again soon, so it is garbage to me. It is pointless to keep it in RAM.
Update: I am not done with the file, so I don't want to call munmap. I have other stuff to do with it, but in other regions of it. Random reads.
For random read/write access to a mmap()ed file, MADV_SEQUENTIAL is probably not very useful (and may in fact cause undesired behavior). MADV_RANDOM or MADV_DONTNEED would be better options in this case. However, be aware that the kernel is free to ignore any madvise() - although in my understanding, Linux currently does not, as it tends to treat madvise() more as a command than an advisory...
Another option would be to mmap() only selected sections of the file as needed, and munmap() them as you're done with them, perhaps maintaining a pool of some small number of currently active mappings (i.e. mapping more than one region at once if needed, but still keeping it limited).
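As a rough sketch of the madvise() route, assuming Linux and a read-only, file-backed mapping (the helper name release_range is made up): the range is shrunk inward to whole pages, because madvise() requires a page-aligned start address. Be careful with anonymous or modified private mappings, where MADV_DONTNEED discards the contents.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: tell the kernel that the whole pages inside
 * [addr, addr + len) are not needed for now. Partial pages at either end
 * are left alone. Returns 0 on success (or if no whole page is covered),
 * -1 on error. */
static int release_range(void *addr, size_t len)
{
    assert(addr != NULL);
    uintptr_t page  = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = ((uintptr_t)addr + page - 1) & ~(page - 1);  /* round up */
    uintptr_t end   = ((uintptr_t)addr + len) & ~(page - 1);       /* round down */
    if (end <= start)
        return 0;
    return madvise((void *)start, end - start, MADV_DONTNEED);
}
```

For a read-only file mapping, a later access to the released range simply faults the data back in from the file, so the mapping itself stays valid for further random reads.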
Of course you must free resources when you're done with them in order not to leak them and thus run out of available space too soon.
Not sure what the question is, if you know about mmap() then surely you know about munmap() too? It's right there on the same manual page.
Short question:
Is it possible to map a buffer that has been malloc'd to have two ways (two pointers pointing to the same physical memory) of accessing the same buffer?
Or, is it possible to temporarily move a virtual memory address received by malloc? Or is it possible to point from one location in virtual space to another?
Background:
I am working with DirectFB, a surface management and 2D graphics compositing library. I am trying to enforce the locking protocol, which is to lock a surface, modify the memory only while locked (the pointer is to system memory allocated using malloc), and then unlock the surface.
I am currently trying to trace down a bug in an application that is locking a surface and then storing the pixel pointer and modifying the surface later. This means that the library does not know when it is safe to read or write to a surface. I am trying to find a way to detect that the locking protocol has been violated. What I would like is a way to invalidate the pointer passed to the user after the unlock call is made. Even better, I would like the application to seg fault if it tries to access the memory after the lock. This would stop in the debugger and give us an idea of which surface is involved, which routine is involved, who called it, etc.
Possible solutions:
1. Create a temporary buffer, pass the buffer pointer to the user, on unlock copy the pixels to the actual buffer, and delete the temporary buffer.
Pros: This is an implementable solution.
Cons: Performance is slow as it requires a copy, which is expensive; also, the memory may or may not be available. There is no way to guarantee that one temporary surface does not overlap another, which would allow an invalidated pointer to suddenly work again.
2. Make an additional map to a malloc'd surface and pass that to the user. On unlock, unmap the memory.
Pros: Very fast, no additional memory required.
Cons: Unknown if this is possible.
Gotchas: Need to set aside a reserved range of addresses that is never used by anything else (including malloc or the kernel). Also need to ensure that no two surfaces overlap, which could allow an old pointer to suddenly point to something valid and not seg fault when it should.
3. Take advantage of the fact that the library does not access the memory while locked by the user and simply move the virtual address on a lock and move it back on an unlock.
Pros: Very fast, no additional memory required.
Cons: Unknown if this is possible.
Gotchas: Same as "2" above.
Is this feasible?
Additional info:
This is using Linux 2.6, using stdlib.
The library is written in C.
The library and application run in user space.
There is a possibility of using a kernel module (to write a custom memory allocation routine), but the difficulty of writing a module in my current working climate would probably reduce the chances to near zero levels that I could actually implement this solution. But if this is the only way, it would be good to know.
The underlying processor is x86.
The function you want to create multiple mappings of a page is shm_open.
You may only be using the memory within one process, but it's still "shared memory" - that is to say, multiple virtual mappings for the same underlying physical page will exist.
However, that's not what you want to do. What you should actually do is have your locking functions use the mprotect system call to render the memory unreadable on unlock and restore the permissions on lock; any access without the lock being held will cause a segfault. Of course, this'll only work with a single simultaneous accessing thread...
Another, possibly better, way to track down the problem would be to run your application in valgrind or another memory analysis tool. This will greatly slow it down, but allows you very fine control: you can have a valgrind script that will mark/unmark memory as accessible and the tool will kick you straight into the debugger when a violation occurs. But for one-off problem solving like this, I'd say install an #ifdef DEBUG-wrapped mprotect call in your lock/unlock functions.
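A minimal sketch of the mprotect idea, assuming Linux. The struct and function names are made up for illustration and are not DirectFB's API. One detail to watch: mprotect works on whole pages and needs a page-aligned address, so the pixel buffer here comes from mmap rather than plain malloc.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical surface type for illustration only. */
struct surface {
    void  *pixels;   /* page-aligned pixel buffer */
    size_t len;      /* buffer size, rounded up to whole pages */
};

/* Allocate the buffer with no access rights: it starts out "unlocked". */
static int surface_create(struct surface *s, size_t bytes)
{
    assert(bytes > 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    s->len = (bytes + page - 1) / page * page;
    s->pixels = mmap(NULL, s->len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return s->pixels == MAP_FAILED ? -1 : 0;
}

/* Lock: make the pixels readable and writable, then hand out the pointer. */
static int surface_lock(struct surface *s, void **out)
{
    if (mprotect(s->pixels, s->len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    *out = s->pixels;
    return 0;
}

/* Unlock: revoke all access, so a stale pointer faults on its next use. */
static int surface_unlock(struct surface *s)
{
    return mprotect(s->pixels, s->len, PROT_NONE);
}
```

With this in place, an application that keeps the pointer and writes through it after surface_unlock gets a SIGSEGV at the offending instruction, which a debugger will catch. The library would have to restore access around its own reads and writes as well, and as noted above the scheme only holds up with a single accessing thread.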
I have made a program in C and wanted to see how much memory it uses. I noticed that the memory usage grows during normal use (at launch it uses about 250 KB and now it's at 1.5 MB). AFAIK, I freed all the unused memory, and after some time (hours) the app uses less memory. Could it be that the freed memory just goes from 'active' memory to 'wired' or something, so that it's released when free space is needed?
BTW, my machine runs Mac OS X, if this is important.
How do you determine the memory usage? Have you tried using valgrind to locate potential memory leaks? It's really easy. Just start your application with valgrind, run it, and look at the well-structured output.
If you're looking at the memory usage from the OS, you are likely to see this behavior. Freed memory is not automatically returned to the OS, but normally stays with the process, and can be malloced later. What you see is usually the high-water mark of memory use.
As Konrad Rudolph suggested, use something that examines the memory from inside the process to look for memory leaks.
The C library does not usually return "small" allocations to the OS. Instead it keeps the memory around for the next time you use malloc.
However, many C libraries will release large blocks, so you could try doing a malloc of several megabytes and then freeing it.
On OSX you should be able to use MallocDebug.app if you have installed the Developer Tools from OSX (as you might have trouble finding a port of valgrind for OSX).
/Developer/Applications/PerformanceTools/MallocDebug.app
I agree with what everyone has already said, but I do want to add just a few clarifying remarks specific to os x:
First, the operating system actually allocates memory using vm_allocate, which allocates entire pages at a time. Because there is a cost associated with this, as others have stated, the C library does not just deallocate the page when you return memory via free(3). Specifically, if there are other allocations within the memory page, it will not be released. Currently memory pages are 4096 bytes in Mac OS X. The number of bytes in a page can be determined programmatically with sysctl(2) or, more easily, with getpagesize(2). You can use this information to optimize your memory usage.
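As a small illustration of using the page size, here is a sketch that rounds a request up to whole pages. It queries the size with sysconf(_SC_PAGESIZE), a POSIX alternative to the calls mentioned above, and the helper name round_up_to_page is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: round `n` bytes up to a whole number of pages. */
static size_t round_up_to_page(size_t n)
{
    long page = sysconf(_SC_PAGESIZE);   /* page size in bytes */
    assert(page > 0);
    size_t p = (size_t)page;
    return (n + p - 1) / p * p;
}
```

Sizing large buffers this way keeps them from sharing a page with unrelated allocations, which is the situation that prevents a page from being handed back.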
Secondly, user-space applications do not wire memory. Generally the kernel wires memory for critical data structures. Wired memory is basically memory that can never be swapped out and will never generate a page fault. If, for some reason, a page fault is generated in a wired memory page, the kernel will panic and your computer will crash. If your application is increasing your computer's wired memory by a noticeable amount, it is a very bad sign. It generally means that your application is doing something that significantly grows kernel data structures, like allocating and not reaping hundreds of threads or child processes. (Of course, this is a general statement. In some cases this growth is expected, like when developing a virtual host or something like that.)
In addition to what the others have already written:
malloc() allocates bigger chunks from the OS and spits it out in smaller pieces as you malloc() it. When free()ing, the piece first goes into a free-list, for quick reuse by another malloc if the size fits. It may at this time be merged with another free item, to form bigger free blocks, to avoid fragmentation (a whole bunch of different algorithms exist there, from freeLists to binary-sized-fragments to hashing and what not else).
When freed pieces arrive so that multiple fragments can be joined, free() usually does this, but sometimes fragments remain, depending on the size and order of malloc() and free() calls. Also, only when a big free block has been created will it (sometimes) be returned to the OS as a block. But usually, malloc() keeps things in its pocket, depending on the allocated/free ratio (many heuristics exist, and compile-time or flag options are often available).
Notice that there is not ONE malloc/free algorithm. There is a whole bunch of different implementations (and literature). It is highly system, OS and library dependent.