I know how to work with pointers. But I don't know how do this:
I have an hex address that's of course it has a any value from any app.
I know to find the address that I want.
I want to write a C Code to pass this address to a pointer variable, and then I could capture the value from that address and so on.
For example:
hex 0x00010010
int *pointer;
How can I pass this address to the pointer, what's the syntax for that?
By using int* you are assuming the data you are point to is an int (4 bytes, on most OS's).
Shouldn't we use a void* instead? Later, if you want to access it's data using a particular type, all you need to do is cast.
void *addr = (void*)0x0023FF74; // valid memory address within my app
printf("Address: %p\n", addr);
getchar();
printf("Data (4bytes): %d\n", *(int*)addr); // print the first 4 bytes of data as int
getchar();
EDIT:
Help me understand what you are trying to do, because this statement confused me:
There is app using that address, I know the value in it, but the (int)pointer don't access the value.
Are you trying to write an application that will modify the memory of another application that is running your system? Your current approach won't work, and this is why: When the Operating System loads your application into a process, it reserves a memory region to be used by the process and assigns a range of virtual addresses to this region. These virtual addresses do not map directly to memory addresses in RAM, so the OS has to keep an internal table for that.
On Windows, each process loaded receives the same range of virtual addresses, but this area is only visible to the process that is running inside it. For instance, (on Windows) processes are loaded in memory address 0x00400000, which mean each process has it's own memory address 0x00400000, and therefore you can't assign X memory address to a pointer in your application and expect Windows to magically know that you are reffering to address X that is inside another application.
What you are trying to accomplish it's called Code Injection, and there's a lot of information on the web about it.
In typical modern operating systems, you cannot access another process' (app's) address space. The addresses are virtual, so the same numerical value means different things to different processes.
If you are on a system with a true "flat" address space (where processes directly work with actual physical memory addresses) such as Amiga OS on the 680x0, you can do something like:
int *pointer = (int *) 0x00010010;
Related
Consider a sample below.
char* p = (char*)malloc(4096);
p[0] = 'a';
p[1] = 'b';
The 4KB memory is allocated by calling malloc(). OS handles the memory request by the user program in user-space. First, OS requests memory allocation to RAM, then RAM gives physical memory address to OS. Once OS receives physical address, OS maps the physical address to virtual address then OS returns the virtual address which is the address of p to user program.
I wrote some value(a and b) in virtual address and they are really written into main memory(RAM). I'm confusing that I wrote some value in virtual address, not physical address, but it is really written to main memory(RAM) even though I didn't care about them.
What happens in behind? What OS does for me? I couldn't found relevant materials in some books(OS, system programming).
Could you give some explanation? (Please omit the contents about cache for easier understanding)
A detailed answer to your question will be very long - and too long to fit here at StackOverflow.
Here is a very simplified answer to a little part of your question.
You write:
I'm confusing that I wrote some value in virtual address, not physical address, but it is really written to main memory
Seems you have a very fundamental misunderstanding here.
There is no memory directly "behind" a virtual address. Whenever you access a virtual address in your program, it is automatically translated to a physical address and the physical address is then used for access in main memory.
The translation happens in HW, i.e. inside the processor in a block called "MMU - Memory management unit" (see https://en.wikipedia.org/wiki/Memory_management_unit).
The MMU holds a small but very fast look-up table that tells how a virtual address is to be translated into a physical address. The OS configures this table but after that, the translation happens without any SW being involved and - just to repeat - it happens whenever you access a virtual memory address.
The MMU also takes some kind of process ID as input in order to do the translation. This is need because two different processes may use the same virtual address but they will need translation to two different physical addresses.
As mentioned above the MMU look-up table (TLB) is small so the MMU can't hold a all translations for a complete system. When the MMU can't do a translation, it can make an exception of some kind so that some OS software can be triggered. The OS will then re-program the MMU so that the missing translation gets into the MMU and the process execution can continue. Note: Some processors can do this in HW, i.e. without involving the OS.
You have to understand that virtual memory is virtual, and it can be more extensive than physical memory RAM, so it is mapped differently. Although they are actually the same.
Your programs use virtual memory addresses, and it is your OS who decides to save in RAM. If it fills up, then it will use some space on the hard drive to continue working.
But the hard drive is slower than the RAM, that's why your OS uses an algorithm, which could be Round-Robin, to exchange pages of memory between the hard drive and RAM, depending on the work being done, ensuring that the data that are most likely to be used are in fast memory. To swap pages back and forth, the OS does not need to modify virtual memory addresses.
Summary overlooking a lot of things
You want to understand how virtual memory works. There's lots of online resources about this, here's one I found that seems to do a fair job of trying to explain it without getting too crazy in technical details, but also doesn't gloss over important terms.
https://searchstorage.techtarget.com/definition/virtual-memory
For Linux on x86 platforms, the assembly equivalent of asking for memory is basically a call into the kernel using int 0x80 with some parameters for the call set into some registers. The interrupt is set at boot by the OS to be able to answer for the request. It is set in the IDT.
An IDT descriptor for 32 bits systems looks like:
struct IDTDescr {
uint16_t offset_1; // offset bits 0..15
uint16_t selector; // a code segment selector in GDT or LDT
uint8_t zero; // unused, set to 0
uint8_t type_attr; // type and attributes, see below
uint16_t offset_2; // offset bits 16..31
};
The offset is the address of the entry point of the handler for that interrupt. So interrupt 0x80 has an entry in the IDT. This entry points to an address for the handler(also called ISR). When you call malloc(), the compiler will compile this code to a system call. The system call returns in some register the address of the allocated memory. I'm pretty sure as well that this system call will actually use the sysenter x86 instruction to switch into kernel mode. This instruction is used alongside an MSR register to securely jump into kernel mode from user mode at the address specified in the MSR (Model Specific Register).
Once in kernel mode, all instructions can be executed and access to all hardware is unlocked. To provide with the request the OS doesn't "ask RAM for memory". RAM isn't aware of what memory the OS uses. RAM just blindly answers to asserted pins on it's DIMM and stores information. The OS just checks at boot using the ACPI tables that were built by the BIOS to determine how much RAM there is and what are the different devices that are connected to the computer to avoid writing to some MMIO (Memory Mapped IO). Once the OS knows how much RAM is available (and what parts are usable), it will use algorithms to determine what parts of available RAM every process should get.
When you compile C code, the compiler (and linker) will determine the address of everything right at compilation time. When you launch that executable the OS is aware of all memory the process will use. So it will set up the page tables for that process accordingly. When you ask for memory dynamically using malloc(), the OS determines what part of physical memory your process should get and changes (during runtime) the page tables accordingly.
As to paging itself, you can always read some articles. A short version is the 32 bits paging. In 32 bits paging you have a CR3 register for each CPU core. This register contains the physical address of the bottom of the Page Global Directory. The PGD contains the physical addresses of the bottom of several Page Tables which themselves contain the physical addresses of the bottom of several physical pages (https://wiki.osdev.org/Paging). A virtual address is split into 3 parts. The 12 bits to the right (LSB) are the offset in the physical page. The 10 bits in the middle are the offset in the page table and the 10 MSB are the offset in the PGD.
So when you write
char* p = (char*)malloc(4096);
p[0] = 'a';
p[1] = 'b';
you create a pointer of type char* and making a system call to ask for 4096 bytes of memory. The OS puts the first address of that chunk of memory into a certain conventional register (which depends on the system and OS). You should not forget that the C language is just a convention. It is up to the operating system to implement that convention by writing a compatible compiler. It means that the compiler knows what register and what interrupt number to use (for the system call) because it was specifically written for that OS. The compiler will thus take the address stored into this certain register and store it into this pointer of type char* during runtime. On the second line you are telling the compiler that you want to take the char at the first address and make it an 'a'. On the third line you make the second char a 'b'. In the end, you could write an equivalent:
char* p = (char*)malloc(4096);
*p = 'a';
*(p + 1) = 'b';
The p is a variable containing an address. The + operation on a pointer increments this address by the size of what is stored in that pointer. In this case, the pointer points to a char so the + operation increments the pointer by one char (one byte). If it was pointing to an int then it would be incremented of 4 bytes (32 bits). The size of the actual pointer depends on the system. If you have a 32 bits system then the pointer is 32 bits wide (because it contains an address). On a 64 bits system the pointer is 64 bits wide. A static memory equivalent of what you did is
char p[4096];
p[0] = 'a';
p[1] = 'b';
Now the compiler will know at compile time what memory this table will get. It is static memory. Even then, p represents a pointer to the first char of that array. It means you could write
char p[4096];
*p = 'a';
*(p + 1) = 'b';
It would have the same result.
First, OS requests memory allocation to RAM,…
The OS does not have to request memory. It has access to all of memory the moment it boots. It keeps its own database of which parts of that memory are in use for what purposes. When it wants to provide memory for a user process, it uses its own database to find some memory that is available (or does things to stop using memory for other purposes and then make it available). Once it chooses the memory to use, it updates its database to record that it is in use.
… then RAM gives physical memory address to OS.
RAM does not give addresses to the OS except that, when starting, the OS may have to interrogate the hardware to see what physical memory is available in the system.
Once OS receives physical address, OS maps the physical address to virtual address…
Virtual memory mapping is usually described as mapping virtual addresses to physical addresses. The OS has a database of the virtual memory addresses in the user process, and it has a database of physical memory. When it is fulfilling a request from the process to provide virtual memory and it decides to back that virtual memory with physical memory, the OS will inform the hardware of what mapping it choose. This depends on the hardware, but a typical method is that the OS updates some page table entries that describe what virtual addresses get translated to what physical addresses.
I wrote some value(a and b) in virtual address and they are really written into main memory(RAM).
When your process writes to virtual memory that is mapped to physical memory, the processor will take the virtual memory address, look up the mapping information in the page table entries or other database, and replace the virtual memory address with a physical memory address. Then it will write the data to that physical memory.
The question is particulary about QNX and pretty much stated in the title.
I tried logging the variable addresses after modifying them in both processes, assuming copy-on-write doesn't work anymore and they are not identical, so I expected the addresses to be different. But they are the same (virtual addresses, but still).
So how can I check that one process doesn't affect another without printing variables value, maybe there's a simpler solution?
int q;
q = 3;
...
if (pid == 0) {
// in child
q = 5;
printf("%d\n", &q);
} else {
// in parent
q = 9;
printf("%d\n", &q);
}
The virtual addresses you print will be identical — the child process is an almost exact copy of its parent. The physical addresses the programs access will be separate as soon as one process tries to modify the data in that page, but that will be completely hidden from the two processes. That's the beauty of virtual memory.
Note that you're using the wrong format for printing addresses; you should use %p and cast the address to void *:
printf("%p\n", (void *)&q);
or use <inttypes.h> and uintptr_t and PRIXPTR (or PRIdPTR is you really want the address in decimal rather than hex):
printf("0x%" PRIXPTR "\n", (uintptr_t)&q);
Print the numbers too — and do so several times in a loop with sleeps of some sort in them. You will see that despite the same logical (virtual) address, the physical addresses are different. You won't be able to find the physical address easily, though.
If you have created a new process, rather than a new thread, then it has its own process address space by definition.
Every process address space will have the same virtual address range - 0x00000000 - 0xffffffff on a 32-bit machine. Every process will have a virtual address of n, what it is used for and whether it maps to anything which physically exists will differ. Some of that address space will be used by the kernel, some might be shared (see also man mmap).
After a fork() you should not be surprised if the virtual addresses are identical in both processes (although that cannot be guaranteed for new memory operations after the fork) - it does not mean that copy-on-write is not working, that is invisible to normal code.
Pages do not necessarily reside in RAM (physical memory) but can sit in a swap or paging file (terms used vary) until required. The virtual address refers to a page table that knows where its page really lives. When copy-on-write kicks in it means a new page is created, it does not mean that the virtual address changes, it will stay the same but in the page table will refer to a different physical location.
Why would you want to know anyway? That kind of operation is in the domain of the operating system.
I wrote this code so I can see the address of variable foo.
#include <stdio.h>
#include <stdlib.h>
int main(){
char* foo=(char*) malloc(1);
*foo='s';
printf(" foo addr : %p\n\n" ,&foo);
int pause;
scanf("%d",&pause);
return 0;
}
Then pause it and use the address of foo in here:
#include <stdio.h>
int main(){
char * ptr=(char *)0x7ffebbd57fc8; //this was the output from the first code
printf("\n\n\n\n%c\n\n\n",*ptr);
}
but I keep getting segmentation fault. Why is this code not working?
This is not a C question/problem but a matter of runtime support. On most OS programs run in a virtual environment, especially concerning their memory space. In such case memory is a virtual memory which means that when a program access a given address x the real (physical) memory is computed as f(x). f is a function implemented by the OS to ensure that a given process (object which represent the running of a code in the OS) have its own reserved memory separated from memory dedicated to other processes. This is called virtual memory.
Oups, your problem is not related to C language, but really depends of the OS, if any.
First let us read it from a pure C language point of view:
char * ptr=(char *)0x7ffebbd57fc8;
your are converting an unsigned integer to a char *. As you get the integer value from the other program, you can be sure that is has an acceptable range, so you indeed get a pointer pointing to that address. As it is a char * pointer, you can use it to read the byte representation of any object that will lie at that address. Still fine until there. But common systems use virtual addresses and limit each process to access only its own pages, so by default a process cannot access the memory of another process. In addition, with the common usage of virtual memory, there are no reasons that any two non kernel processes share common addresses. Exceptions for real addresses are:
real memory OS (MS/DOS and derivatives like FreeDOS, CP/M, and other anthic systems)
kernel mode: the kernel can access the whole memory of the system - who could load your program?
special functions: some OS provide special API to let one process read the memory of another one (Windows does), but it not as simple as directly reading an address...
As I assume that you are not in any of the first two cases, nothing is mapped at that address from the current process, hence the error.
On systems use virtual memory, you have a range of logical addresses that are available to user processes. These address ranges are subdivided into units called PAGES whose size depends upon the processor (512b to 1MB).
Pages are not valid until they are mapped into the process. The operating system will have system calls that allow the application to map pages. If you try to access a page that is not valid you get some kind of exception.
While the operating system only allocates memory pages, applications are used to calls, such as malloc(), that allocate memory blocks of arbitrary sizes.
Behind the scenes, malloc() is mapping pages (ie making them valid) to create a pool of memory that is uses to return small amounts of memory.
When you omit your malloc(), the memory is not being mapped.
Note that each process has its own range of logical addresses. One process's page containing 0x7ffebbd57fc8 is likely to be mapped to a different physical page frame than another process's 0x7ffebbd57fc8. If that were not the case, one user could muck with another. (There is always a range of addresses shared by all processes but this is can only be accessed in kernel mode.)
Your problem is further complicated by the fact that many systems these days randomly map processes to different locations in the logical address space. On such systems you could run your first program multiple times and get different addresses.
Why is this code not working?
You would need to call the system service on your operating system that maps memory into the process and make the page containing 0x7ffebbd57fc8 accessible.
Am I printing it wrong?
#include <stdio.h>
#include <stdlib.h>
int
main( void )
{
int * p = malloc(100000);
int * q;
printf("%p\n%p\n", (void *)p, (void *)q);
(void)getchar(); /* to run several instances at same time */
free(p);
return 0;
}
Whether I run it sequentially or in multiple terminals simultaneously, it always prints "0x60aa00000800" for p (q is different, though).
EDIT: Thanks for the answers, one of the reasons I was confused was because it used to print a different address each time. It turns out that a new compiler option I started using, -fsanitize=address, caused this change. Whoops.
The value of q is uninitialized garbage, since you never assign a value to it.
It's not surprising that you get the same address for p each time you run the program. That address is almost certainly a virtual address, so it applies only to the memory space of the currently running program (process).
Virtual address 0x60aa00000800 as seen from one program and virtual address 0x60aa00000800 as seen from another program are distinct physical addresses. The operating system maps virtual addresses to physical addresses, and vice versa, so there's no conflict. (If different programs could read and write the same physical memory, it would be a security nightmare.)
It also wouldn't be surprising if they were different each time. For example, some operating systems randomize stack addresses to prevent some code exploits. I'm not sure whether heap addresses are also randomized, but they certainly could be.
https://en.wikipedia.org/wiki/Virtual_memory
This behavior is not entirely surprising. The malloc operation is simply returning a pointer to user addressable + allocated memory in the process. It's completely reasonable for the first memory request of the same size to return the same address through different invocations of a process
The behavior for q doesn't contradict this. You have given q no value hence it gets whatever the last value written to that portion of the stack was. It's unsurprising that undefined behavior would be different through different invocations of the same process (after all, it's undefined)
Your code is fine.
The same code, and the same algorithm for obtaining memory with malloc() is run each time, so there's no reason the addresses should be different.
Some malloc implementations could randomize the start of memory allocations, yours does not.
This is because of virtual memory. The physical memory address for memory of q is different, but your operating system provides each process with a virtual view of memory, mapping different physical memory addresses to the same virtual addresses in your processes. So all processes have a similar view of the memory (and cannot see the memory of other processes)
Heap allocators are not required to provide distinct / unique addresses each time you run the program. There is no guarantee either way on this, but it's entirely reasonable for an implementation of malloc() to have deterministic behavior and give you the same pointer each time you run the program.
The stack, on the other hand, usually is (but is not required to be) located at a different address. This is a protection measure against buffer-overflow exploits. By making the stack location non-deterministic, they make it more difficult for an attacker to inject direct memory addresses of code via buffer-overflow attack.
Finally note that all pointers in a program are virtual memory addresses, not physical addresses. So even though two concurrent processes might have the same memory address in a pointer, those two processes still have distinct memory in separate areas of physical memory. The operating system takes care of this via its virtual memory manager and page translation. Each process has its own virtual address space, various pieces of which are mapped transparently by the OS to physical memory as needed.
I have shared Memory like this
struct MEMORY {
char * type;
int number;
}
now in code I make it shared everything works probably but other process can't see what pointer points to how can I use pointer in shared memory?
You need to make sure that the shared memory is attached at the same address in the address space of all the processes. Otherwise, as you can imagine, the pointer values end up meaning different things in different processes.
What are you using for shared memory? mmap or shm? It is the first argument in the mmap call.
If you cannot assure same address space in all processes, the other way to go is to us offsets only. Each process simply offsets from the base address where the shared memory is attached.
EDIT: Ah ... maybe what you are saying is that "char* type" is some arbitrary pointer. Remember that the other processes can only see what is in the shared memory. All other memory locations (pointer values) are inaccessible. So, for this pointer to work, it needs to be to something that is in shared memory, not just any arbitrary pointer. That, and you need to assure that the shared memory is attached at the same address in all processes.