This is probably more of a problem with my lack of C knowledge, but I'm hoping someone might be able to offer a possible solution. In a nutshell, I'm trying to read a struct that is stored in memory, and I have it's physical memory address. Also this is being done on a 64-bit Linux system (Debian (Wheezy) Kernel 3.6.6), and I'd like to use C as the language.
For example the current address of the struct in question is at physical address: 0x3f5e16000
Now I did initially try to access this address by using using a pointer to /dev/mem. However, I've since learned that access to any address > 1024MB is not allowed, and I get a nice error message in var/log/messages telling me all about it. At present access is being attempted from a userspace app, but I'm more than happy to look into writing a kernel module, if that is what is required.
Interesting, I've also discovered something known as 'kprobe', which supposedly allows the > 1024MB /dev/mem restriction to be bypassed. However, I don't really want to introduce any potential security issues into my system, and I'm sure there must be an easier way to accomplish this. The info on kprobe can be found here: http://www.libcrack.so/2012/09/02/bypassing-devmem_is_allowed-with-kprobes/
I've done some reading and I've found references to using mmap to map the physical address into userspace so that it can be read, but I must confess that I don't understand the implementation of this in C.
If anyone could provide some information on accessing physical memory, or either mapping data from a physical address to a userspace virtual address, I would be extremely grateful.
You'll have to forgive me if I'm a little bit vague as to exactly what I'm doing, but it's part of a project and I don't want to give too much information away, so please bear with me :) I'm not being obtuse or anything.
The structure in memory is a block of four ints and ten longs that is loaded into memory by a running kernel module.
The address that I'm using is definitely a physical address and it's set to non-paged, the kernel module performs the translations to physical and I'm not using the address-of operator.
I'm wondering if I should just rephrase the question as how to read an int from a physical location, as that is the first element of the struct. I hope that helps to clarify things!
EDIT - After doing some more reading, it appears that one possible solution to this problem is to construct a kernel module, and then use the mmap function to map the physical address to a virtual address the kernel module can then access. Can anyone offer any advice on achieving this using mmap?
I'm only going to answer this question:
I'm wondering if I should just rephrase the question as how to read an int from a physical location, as that is the first element of the struct.
No. The problem is not int vs. struct, the problem is that C in and of itself has no notion of physical memory. The OS in conjunction with the MMU makes sure that every process, including every running C program, runs in a virtual memory sandbox. The OS might offer an escape hatch into physical memory.
If you're writing a kernel module that manages some object at physical address 0x3f5e16000, then you should offer some API to get to that memory, preferably one that uses a file descriptor or some other abstraction to hide the nitty-gritty of kernel memory management from the user program it communicates with.
If you're trying to communicate with a poorly designed kernel module that expects you to access a fixed physical memory address, then ugly hacks involving /dev/mem are your share.
Related
On Python, using ctypes if applicable, how can I return the value a memory address is pointing to?
For instance- when I boot up my x86 PC, let's say the address 0xfffff800 points to a memory address of 0xffffffff
Using Python, how do I extract the value 0xfffff800 is pointing to (0xffffffff) and save it into a variable? Is this even possible? I have tried using id but I believe that is only used for local instances (if I created a variable, assigned it a value, and returned that value via id)
Thanks
From your reference to memory contents at boot time, and the idea implicit in the question that there is only one value at any given address, I take you to be talking about physical memory. In that case, no, what you ask is not possible. Only the operating system kernel (or some other program running directly on the hardware) can access physical memory in any system that presently supports Python, and to the best of my knowledge, there is no Python implementation that runs on bare metal.
Instead, the operating system affords each running process its own virtual memory space in which to run, whose contents at any given time might reside more or less anywhere in physical memory or on swap devices (disk files and / or partitions). The system takes great care to isolate processes from each other and from the underlying physical storage, so no, Python cannot access it.
Moreover, processes running on an OS cannot generally access arbitrary virtual addresses, either, regardless of the programming language in which they are written. They can access only those portions of their address spaces that the OS has mapped for them. ctypes therefore does not help you.
I'm trying to reconcile a few concepts.
I know of virtual memory is shared (mapped) between the kernel and all user processes, which I read here. I also know that when the compiler generates addresses for code + data, the kernel must load them at the correct virtual addresses for that process.
To constrain the scope of the question, I'll just mean gcc when I mention 'the compiler'.
So does the compiler need to be compliant each new release of an OS, to know not to place code or data at the high memory addresses reserved for the kernel? As in, someone writing that piece of the compiler must know those details of how the kernel plans to load the program (lest the compiler put executable code in high memory)?
Or am I confusing different concepts? I got a bit confused when going through this tutorial, especially at the very bottom where it has OS code in low memory addresses, because I thought Linux uses high memory for the kernel.
The compiler doesn't determine the address ranges in memory at which things are placed. That's handled by the OS.
When the program is first executed, the loader places the various portions of the program and its libraries in memory. For memory that's allocated dynamically, large chunks are allocated from the OS and then sometimes divided into smaller chunks.
The OS loader knows where to load things. And the OS's virtual memory allocation logic how to find safe, empty spaces in the address space the process uses.
I'm not sure what you mean by the "high memory addresses reserved for the kernel". If you're talking about a 2G/2G or 3G/1G split on a 32-bit operating system, that is a fundamental design element of those OSes that use it. It doesn't change with versions.
If you're talking about high physical memory, then no. Compilers don't care about physical memory.
Linux gives each application its own memory space, distinct from the kernel. The page table contains the translations between this memory space and physical RAM, and the kernel sets up the page table so there's no interference.
That said, the compiler usually doesn't even care where the program is loaded in memory. Why would it?
I'm currently experimenting with IPC via mmap on Unix.
So far, I'm able to map a sparse-file of 10MB into the RAM and access it reading and writing from two separated processes. Awesome :)
Now, I'm currently typecasting the address of the memory-segment returned by mmap to char*, so I can use it as a plain old cstring.
Now, my real question digs a bit further. I've quite a lot experience with higher levels of programming (ruby, java), but never did larger projects in C or ASM.
I want to use the mapped memory as an address-space for variable-allocation. I dont't wether this is possible or does make any sense at all. Im thinking of some kind of a hash-map-like data structure, that lives purely in the shared segment. This would allow some interesting experiments with IPC, even with other languages like ruby over FFI.
Now, a regular implementation of a hash would use pretty often something like malloc. But this woult allocate memory out of the shared space.
I hope, you understand my thoughts, although my english is not the best!
Thank you in advance
Jakob
By and large, you can treat the memory returned by mmap like memory returned by malloc. However, since the memory may be shared between multiple "unrelated" processes, with independent calls to mmap, the starting address for each may be different. Thus, any data structure you build inside the shared memory should not use direct pointers.
Instead of pointers, offsets from the initial map address should be used instead. The data structure would then compute the right pointer value by adding the offset to the starting address of the mmamp region.
The data structure would be built from the single call to mmap. If you need to grow the data structure, you have to extend the mmap region itself. The could be done with mremap or by manually munmap and mmap again after the backing file has been extended.
How do i find out what memory addresses are suitable for use ?
More specifically, the example of How to use a specific address is here: Pointer to a specific fixed address, but not information on Why this is a valid address for reading/writing.
I would like a way of finding out that addresses x to y are useable.
This is so i can do something similar to memory mapped IO without a specific simulator. (My linked Question relevant so i can use one set of addresses for testing on Ubuntu, and another for the actual software-on-chip)
Ubuntu specific answers please.
You can use whatever memory address malloc() returns. Moreover, you can specify how much memory you need. And with realloc() you even can change your mind afterwards.
You're mixing two independent topics here. The Question that you're linking to, is regarding a micro controller's memory mapped IO. It's referring to the ATM128, a Microcontroller from the Atmel. The OP of that question is trying to write to one of the registers of it, these registers are given specific addresses.
If you're trying to write to the address of a register, you need to understand how memory mapped IO works, you need to read the spec for the chipset/IC your working on. Asking this talking about "Ubuntu specific answers" is meaningless.
Your program running on the Ubuntu OS is running it it's own virtual address space. So asking if addresses x to y are available for use is pretty pointless... unless you're accessing hardware, there's no point in looking for a specific address, just use what the OS gives you and you'll know you're good.
Based on your edit, the fact that you're trying to do a simulation of memory mapped IO, you could do something like:
#ifdef SIMULATION
unsigned int TX_BUF_REG; // The "simulated" 32-bit register
#else
#define TX_BUF_REG 0x123456 // The actual address of the reg you're simulating
#endif
Then use accessor macro's to read or write specific bits via a mask (as is typically done):
#define WRITE_REG_BITS(reg, bits) {reg |= bits;}
...
WRITE_REG_BITS(TX_BUF_REG, SOME_MASK);
Static variables can be used in simulations this way so you don't have to worry about what addresses are "safe" to write to.
For the referenced ATMega128 microcontroller you look in the Datasheet to see which addresses are mapped to registers. On a PC with OS installed you won't have a chance to access hardware registers directly this way. At least not from userspace. Normally only device drivers (ring 0) are allowed to access hardware.
As already mentioned by others you have to use e.g. malloc() to tell the OS that you need a pointer to memory chuck that you are allowed to write to. This is because the OS manages the memory for the whole system.
As the title suggests, I have a problem of obtaining the physical address from a virtual one.
Let me explain: Given a variable declaration in process space, how can I derive it's physical address mapped by the OS?
I've stumbled upon some sys calls /asm/io.h where the virt_to_phys() function is defined; however it seems this header is outdated and I can't find a work around.
However; io.h is available at: /usr/src/linux-headers-2.6.35-28-generic/arch/x86/include/asm/. My current kernel is 2.6.35-28, but io.h isn't included in /usr/include/asm/?
So, to reiterate: I need a way to get the physical address from virtual. Preferably derived from within the application at runtime. But even a workaround of using a monitor of /proc/PID/maps will do.
Any ideas or comments would be greatly appreciated.
EDIT
After doing a bit of research on this topic I found something that helps in this regard.
It turns out this is more than doable, although requires a bit of a workaround.
Here is a link to a simple app that analyses the current mapped pages.
The file in question turns out is (a binary file) /proc/pid/pagemap (contains the physical mapping of virtual pages). Anyway, the code in that link can be modified to serve as a monitor app or something.
I needed the physical address for cache simulation purposes.
Thanks for all the help and answers!
In user code, you can't know the physical address corresponding to a virtual address. This is information is simply not exported outside the kernel. It could even change at any time, especially if the kernel decides to swap out part of your process's memory.
In /proc/$pid/maps, you have information about what the virtual addresses in your program's address space correspond to (mmapped files, heap, stack, etc.). That's all you'll get.
If you're working on kernel code (which you aren't, apparently), you can find out the physical address corresponding to a page of memory. But even then virt_to_phys isn't the whole story; I recommend reading Linux Device Drivers (especially chapters 8 and 15).
The header asm/io.h is a kernel header. It's not available when you're compiling user code because its contents just wouldn't make sense. The functions it declares aren't available in any library, only in the kernel.
As partially answered by the original poster, the Linux kernel exposes its mapping to userland through a set of files in the /proc. The documentation can be found here. Short summary:
/proc/$pid/maps provides a list of mappings with their virtual addresses and the corresponding file for mapped files.
/proc/$pid/pagemap provides more information about each mapped page, including the physical address if it exists.
As the original poster found later, this example provides a sample implementation of how to use these files.
Pass the virtual address to the kernel using systemcall/procfs and use vmalloc_to_pfn. Return the Physical address through procfs/registers.