Linux (Ubuntu), C language: Virtual to Physical Address Translation

As the title suggests, I have a problem of obtaining the physical address from a virtual one.
Let me explain: given a variable declared in process space, how can I derive its physical address as mapped by the OS?
I've stumbled upon the header asm/io.h, where the virt_to_phys() function is defined; however, it seems this header is outdated and I can't find a workaround.
io.h is available at /usr/src/linux-headers-2.6.35-28-generic/arch/x86/include/asm/. My current kernel is 2.6.35-28, yet io.h isn't included in /usr/include/asm/.
So, to reiterate: I need a way to get the physical address from a virtual one, preferably derived from within the application at runtime. But even a workaround such as monitoring /proc/PID/maps will do.
Any ideas or comments would be greatly appreciated.
EDIT
After doing a bit of research on this topic I found something that helps in this regard.
It turns out this is more than doable, although it requires a bit of a workaround.
Here is a link to a simple app that analyses the currently mapped pages.
The file in question turns out to be /proc/pid/pagemap, a binary file that contains the physical mapping of virtual pages. The code in that link can be modified to serve as a monitor app or something similar.
I needed the physical address for cache simulation purposes.
Thanks for all the help and answers!

In user code, you can't know the physical address corresponding to a virtual address. This information is simply not exported outside the kernel. It could even change at any time, especially if the kernel decides to swap out part of your process's memory.
In /proc/$pid/maps, you have information about what the virtual addresses in your program's address space correspond to (mmapped files, heap, stack, etc.). That's all you'll get.
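To make the maps format concrete, here is a minimal sketch in C that parses one line of /proc/$pid/maps and looks up the mapping containing a given address of the calling process. The names map_entry, parse_maps_line and find_mapping are made up for this illustration, and the parser assumes the usual "start-end perms offset dev inode path" layout with no spaces in the path.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One parsed line of /proc/<pid>/maps (hypothetical helper type). */
struct map_entry {
    uint64_t start;   /* first virtual address of the mapping */
    uint64_t end;     /* one past the last virtual address */
    char perms[5];    /* e.g. "r-xp" */
    char path[256];   /* backing file, "[heap]", "[stack]", or "" if anonymous */
};

/* Parse a single maps line. Returns 0 on success, -1 on a malformed line.
 * Assumption: the path contains no whitespace (it is cut at the first space). */
int parse_maps_line(const char *line, struct map_entry *out)
{
    unsigned long long offset, inode;
    unsigned int dev_major, dev_minor;
    int n;

    assert(line != NULL && out != NULL);
    out->path[0] = '\0';
    n = sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s %llx %x:%x %llu %255s",
               &out->start, &out->end, out->perms,
               &offset, &dev_major, &dev_minor, &inode, out->path);
    /* The path field is optional, so seven converted fields are enough. */
    return (n >= 7) ? 0 : -1;
}

/* Find the mapping of the calling process that contains addr.
 * Returns 0 and fills *out on success, -1 if nothing matches. */
int find_mapping(const void *addr, struct map_entry *out)
{
    FILE *f = fopen("/proc/self/maps", "r");
    uint64_t va = (uint64_t)(uintptr_t)addr;
    char line[512];
    int found = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_maps_line(line, out) == 0 &&
            va >= out->start && va < out->end) {
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling find_mapping(&some_local, &entry) from a program should report the stack mapping, which shows what maps gives you: the virtual range and what backs it, but nothing about physical frames.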
If you're working on kernel code (which you aren't, apparently), you can find out the physical address corresponding to a page of memory. But even then virt_to_phys isn't the whole story; I recommend reading Linux Device Drivers (especially chapters 8 and 15).
The header asm/io.h is a kernel header. It's not available when you're compiling user code because its contents just wouldn't make sense. The functions it declares aren't available in any library, only in the kernel.

As partially answered by the original poster, the Linux kernel exposes its mappings to userland through a set of files under /proc. The documentation can be found here. Short summary:
/proc/$pid/maps provides a list of mappings with their virtual addresses and the corresponding file for mapped files.
/proc/$pid/pagemap provides more information about each mapped page, including the physical address if it exists.
As the original poster found later, this example provides a sample implementation of how to use these files.
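Since the linked example is not reproduced here, the following is a minimal sketch in C of how a process might translate one of its own virtual addresses through /proc/self/pagemap. It relies on the documented entry layout (one 64-bit entry per virtual page, bit 63 = page present, bits 0-54 = page frame number). The function names are invented for this illustration. Note that since Linux 4.2 the PFN field reads as zero unless the caller has CAP_SYS_ADMIN, so the sketch treats a zero PFN as "not available".

```c
#define _XOPEN_SOURCE 700
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGEMAP_PRESENT  (1ULL << 63)        /* page is currently in RAM */
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: page frame number */

/* Combine one pagemap entry with the virtual address it describes.
 * Returns 0 and stores the physical address, or -1 if the page is not
 * present or the kernel hid the PFN (reported as zero). */
int pagemap_entry_to_phys(uint64_t entry, uint64_t vaddr,
                          uint64_t page_size, uint64_t *phys)
{
    uint64_t pfn = entry & PAGEMAP_PFN_MASK;

    assert(page_size != 0 && phys != NULL);
    if (!(entry & PAGEMAP_PRESENT) || pfn == 0)
        return -1;
    *phys = pfn * page_size + vaddr % page_size;
    return 0;
}

/* Translate a virtual address of the calling process (hypothetical helper).
 * The page must have been touched already, otherwise it may not be present. */
int virt_to_phys_user(const void *vaddr, uint64_t *phys)
{
    long ps = sysconf(_SC_PAGESIZE);
    uint64_t va = (uint64_t)(uintptr_t)vaddr;
    uint64_t page_size, entry;
    ssize_t n;
    int fd;

    if (ps <= 0)
        return -1;
    page_size = (uint64_t)ps;
    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;
    /* Entry i of the file describes virtual page i. */
    n = pread(fd, &entry, sizeof entry,
              (off_t)((va / page_size) * sizeof entry));
    close(fd);
    if (n != (ssize_t)sizeof entry)
        return -1;
    return pagemap_entry_to_phys(entry, va, page_size, phys);
}
```

The result is only a snapshot: as the earlier answer points out, the kernel may move or swap out the page at any time unless it is locked in memory.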

Pass the virtual address to the kernel through a system call or procfs and use vmalloc_to_pfn. Return the physical address through procfs or registers.

Related

How to get SSDT address

I'm trying to write a program in C that lists the SSDT addresses, so that if some function is hooked I would see a different address.
How do I get the address of the SSDT?
I used WinDbg and listed it with KeServiceDescriptorTable. Now, how do I get this address in C?
I searched the web for it and saw programs that used NtQuerySystemInformation with SystemModuleInformation. I didn't find any documentation for those programs, or any articles or explanations of this.
Thanks for helping
[Below is for when you're in kernel-mode].
On 32-bit environments, KeServiceDescriptorTable is exported by NTOSKRNL, and thus you can retrieve the address with MmGetSystemRoutineAddress.
On 64-bit environments, however, you'll need to locate KeServiceDescriptorTable yourself through memory scanning, because it isn't exported by NTOSKRNL. It is quite straightforward once you've found out where the table is used in the Windows kernel; check the internal system-call related routines in NTOSKRNL.
Note: you'll need to byte-shift when extracting the address on a 64-bit environment.
Now, once you have the addresses, you can do a comparison to determine if the addresses are between a specific range in memory to try and determine whether the address is not correct (e.g. if it has been manipulated). You can also perform forensics on the operands at the memory for an in-depth analysis.

Does the starting address of a section in a linker script apply only to virtual memory?

I have read the linker script.
I am confused about one thing regarding memory allocation.
When we define a section, we give the starting address where we want the file to be loaded.
1) Do the memory locations we specify, like ( . = 0x10000 ), apply to virtual memory?
In your linker script (and the resulting binary), addresses are just addresses.
Whether these are meant to be virtual or physical depends solely on your loader (which might be a tiny bootloader at early system init that doesn't know about virtual addresses, or a full-blown OS that provides a sophisticated virtual environment).
So it's the program that brings your binary into memory that decides whether addresses are interpreted virtually or physically, not the linker script.
Unless you tell us about your specific environment, we can't tell you more.

Virtual Memory and Relocatable Code

In a 32-bit system, each process virtually has 2^32 bytes of CONTIGUOUS address space. So why does the final executable code generated by a linker need to be relocatable? What is the requirement, since all addresses generated would be virtual addresses in the process's own address space, and other processes CANNOT use them?
Hence the process can be placed anywhere it wants to be. Why relocatable?
Some operating systems make the executable code relocatable (this is definitely not universal to all operating systems) to allow for address space layout randomization. This helps mitigate certain attacks.
In the past when stacks were executable a buffer overflow could be exploited by writing executable code directly on the overflowed stack or heap. As operating systems became smarter and started preventing execution of the stack and the heap, attacks became more sophisticated and started using known code sequences in memory by doing return oriented programming. The mitigation to that class of attacks was first done by randomizing the memory layout for shared libraries (since those were easier to exploit) and then when attackers switched to attacking the main executable, by randomizing the memory position of the executable. To make it possible the main executable needs to be relocatable.
Executable code does not always contain relative addresses. On Windows, for example, addressing is often absolute (e.g. for global data).
Consider two different dynamic libraries. Both were compiled for a fixed base address of 0x00100000. Your program tries to load both of them. Where is the loader to place the 2nd DLL? Its preferred base address is already used by the other DLL.
In this case relocatable code helps placing the 2nd DLL at a different address and patching its internal pointers to the new location. With fixed base addresses, loading the 2nd DLL would just fail.
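The patching step can be sketched in C like this. The image layout here is deliberately simplified and hypothetical: `fixups` is just a list of byte offsets at which the image stores a 32-bit absolute address. Real formats such as the PE base relocation table carry more detail, but the arithmetic is the same: add the difference between the actual and the preferred base to every stored pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Patch every absolute 32-bit address inside `image` after the loader had to
 * place it at actual_base instead of preferred_base.
 * `fixups` holds the byte offsets of those addresses (hypothetical format). */
void rebase_image(uint8_t *image, const size_t *fixups, size_t nfixups,
                  uint32_t preferred_base, uint32_t actual_base)
{
    /* Unsigned arithmetic wraps, so this also works when moving downwards. */
    uint32_t delta = actual_base - preferred_base;

    assert(image != NULL && (fixups != NULL || nfixups == 0));
    for (size_t i = 0; i < nfixups; i++) {
        uint32_t value;

        /* memcpy avoids unaligned access and aliasing problems. */
        memcpy(&value, image + fixups[i], sizeof value);
        value += delta;
        memcpy(image + fixups[i], &value, sizeof value);
    }
}
```

For example, a pointer to 0x00100008 inside a DLL built for base 0x00100000 becomes 0x00250008 once the DLL is loaded at 0x00250000. Without the fixup list the loader would have no way to know which bytes in the image are addresses.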
It needs to be relocatable because, in order to execute, your process needs to be put into actual main memory in a ready queue. Where in main memory it will be placed is not fixed (it is placed wherever sufficient space is available), so the actual addresses of the instructions vary from their virtual addresses.
Hence statements that make calls to functions, returns, etc. need to be updated to point to the actual addresses of those functions.

Read struct from physical memory address in C

This is probably more of a problem with my lack of C knowledge, but I'm hoping someone might be able to offer a possible solution. In a nutshell, I'm trying to read a struct that is stored in memory, and I have its physical memory address. This is being done on a 64-bit Linux system (Debian (Wheezy) Kernel 3.6.6), and I'd like to use C as the language.
For example the current address of the struct in question is at physical address: 0x3f5e16000
Now I did initially try to access this address by using a pointer to /dev/mem. However, I've since learned that access to any address > 1024MB is not allowed, and I get a nice error message in /var/log/messages telling me all about it. At present access is being attempted from a userspace app, but I'm more than happy to look into writing a kernel module, if that is what is required.
Interestingly, I've also discovered something known as 'kprobe', which supposedly allows the > 1024MB /dev/mem restriction to be bypassed. However, I don't really want to introduce any potential security issues into my system, and I'm sure there must be an easier way to accomplish this. The info on kprobe can be found here: http://www.libcrack.so/2012/09/02/bypassing-devmem_is_allowed-with-kprobes/
I've done some reading and I've found references to using mmap to map the physical address into userspace so that it can be read, but I must confess that I don't understand the implementation of this in C.
If anyone could provide some information on accessing physical memory, or on mapping data from a physical address to a userspace virtual address, I would be extremely grateful.
You'll have to forgive me if I'm a little bit vague as to exactly what I'm doing, but it's part of a project and I don't want to give too much information away, so please bear with me :) I'm not being obtuse or anything.
The structure in memory is a block of four ints and ten longs that is loaded into memory by a running kernel module.
The address that I'm using is definitely a physical address and it's set to non-paged. The kernel module performs the translation to physical, and I'm not using the address-of operator.
I'm wondering if I should just rephrase the question as how to read an int from a physical location, as that is the first element of the struct. I hope that helps to clarify things!
EDIT - After doing some more reading, it appears that one possible solution to this problem is to construct a kernel module, and then use the mmap function to map the physical address to a virtual address the kernel module can then access. Can anyone offer any advice on achieving this using mmap?
I'm only going to answer this question:
I'm wondering if I should just rephrase the question as how to read an int from a physical location, as that is the first element of the struct.
No. The problem is not int vs. struct, the problem is that C in and of itself has no notion of physical memory. The OS in conjunction with the MMU makes sure that every process, including every running C program, runs in a virtual memory sandbox. The OS might offer an escape hatch into physical memory.
If you're writing a kernel module that manages some object at physical address 0x3f5e16000, then you should offer some API to get to that memory, preferably one that uses a file descriptor or some other abstraction to hide the nitty-gritty of kernel memory management from the user program it communicates with.
If you're trying to communicate with a poorly designed kernel module that expects you to access a fixed physical memory address, then ugly hacks involving /dev/mem are your share.
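For completeness, the /dev/mem route usually looks something like the sketch below. It is a rough illustration with invented helper names, not a recommendation: it needs root, and kernels built with CONFIG_STRICT_DEVMEM refuse to map ordinary RAM this way, which matches the error described in the question. The only subtlety is that mmap() wants a page-aligned file offset, so the physical address is split into a page base and an offset within the page.

```c
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Split a physical address into a page-aligned base and the offset within
 * that page; mmap() requires the file offset to be page-aligned. */
void split_phys_addr(uint64_t phys, uint64_t page_size,
                     uint64_t *base, uint64_t *offset)
{
    assert(page_size != 0 && base != NULL && offset != NULL);
    *offset = phys % page_size;
    *base = phys - *offset;
}

/* Copy len bytes starting at physical address phys into dst via /dev/mem.
 * Returns 0 on success, -1 on failure (no permission, restricted kernel...). */
int read_phys(uint64_t phys, void *dst, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    uint64_t base, offset;
    size_t map_len;
    void *map;
    int fd;

    if (ps <= 0)
        return -1;
    split_phys_addr(phys, (uint64_t)ps, &base, &offset);
    map_len = (size_t)offset + len;

    fd = open("/dev/mem", O_RDONLY);
    if (fd < 0)
        return -1;
    map = mmap(NULL, map_len, PROT_READ, MAP_SHARED, fd, (off_t)base);
    close(fd);                     /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    memcpy(dst, (const uint8_t *)map + offset, len);
    munmap(map, map_len);
    return 0;
}
```

With a struct declared to match the module's layout (four ints followed by ten longs, according to the question), the call would be read_phys(0x3f5e16000, &data, sizeof data). A cleaner design is still the one described above: have the kernel module expose the data through a device file or its own mmap handler.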

Does a DLL always have the same Base Address?

I'm studying Windows and DLLs, and I have a question about them. :)
I made a simple program that loads my own DLL. This DLL has just a few simple functions: plus and minus.
This is the question: if I load some DLL (for example, text.dll), does this DLL always have the same Base Address, or does it change when I restart the program? And can I fix the DLL's Base Address?
When I test it, it always has the same Base Address, but I think that if I rely on this, I will have to handle the case where the DLL's Base Address is different.
The operating system will load your DLL in whatever base address it pleases. You can specify a "preferred" base address, but if that does not happen to be available, (for whatever reason, which may well be completely out of your control,) your DLL will be relocated by the operating system to whatever address the operating system sees fit.
If I load some DLL (for example, text.dll), does this DLL always have the same Base Address?
No. It is a preferred base address. If something is already loaded at that address, the loader will rebase the DLL and fix up all of the addresses.
Other things, like Address Space Layout Randomization, could cause it to be different every time the process starts.
That's a common problem with DLLs that we encountered when trying to decode stacktraces issued by GNAT runtime (Ada).
When presented with a list of addresses (traceback) when our executables crash, we are able to perform addr2line on the given addresses and rebuild the call tree without issues.
On DLLs, this isn't the case (that's why I highly doubt that this issue is ASLR-related, or else the executables would have the same random shift); vcsjones's answer explains the "why".
To work around this issue, you can write the address of a given symbol (for example, the main program) to disk. When analysing a crash, just compute the difference between the address of the symbol in the map file and the address written to disk. Apply this difference to your addresses, and you'll be able to compute the theoretical addresses, and thus the call stack.
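The correction itself is plain arithmetic. A minimal sketch in C, with a made-up function name, assuming the run-time address of the reference symbol was saved by the program and its map-file address was looked up by hand:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert run-time traceback addresses back to the addresses that appear in
 * the map file, using one reference symbol whose address is known in both. */
void unrebase_addresses(uint64_t *addrs, size_t n,
                        uint64_t symbol_runtime, uint64_t symbol_mapfile)
{
    /* How far the DLL was moved when it was loaded (wraps modulo 2^64, so
     * a move to a lower address is handled as well). */
    uint64_t shift = symbol_runtime - symbol_mapfile;

    assert(addrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        addrs[i] -= shift;
}
```

The corrected addresses can then be fed to addr2line exactly as is done for the executables.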

Resources