Different programs, same variables, same address in memory - c

I have two C programs.
test.c is
#include <stdlib.h>

int main ()
{
    int a;
    a = 5;
    return a;
}
test2.c is
#include <stdlib.h>

int main ()
{
    int a;
    a = 6;
    return a;
}
When I run them and I check the address in memory of the "a"s with gdb I get the same address. Why is this so?
Breakpoint 1, main () at test.c:7
7 return a;
(gdb) print &a
$1 = (int *) 0x7fffffffe1cc

Breakpoint 1, main () at test2.c:7
7 return a;
(gdb) print &a
$1 = (int *) 0x7fffffffe1cc

The address of "a" is on the stack frame for your program. This is a virtual address, independent of where in physical memory your program is actually loaded. Therefore, it would not be surprising if both (almost identical) programs used the same address.

Because each application run by the OS gets its own memory space.
The address 0x7fffffffe1cc is not a physical address. This is done for security: you cannot access another process's memory directly just like that. You also cannot access devices directly.

It is very likely that your OS is using Virtual Memory for memory management. What this means is that addresses found within a given program are not 1:1 mapped to physical memory. This allows for a number of things (including running multiple programs that require lots of memory by page swapping to disk).
Without virtual memory, if you were to allocate static int a rather than put it on the stack, the linker would do its best to choose an address for it. If you then linked another program, the linker would not know what other programs may be using that address. Running the two programs at once could make each stomp on the other's memory. With virtual memory, each program gets its own slice of memory with its own address 0x0 and its own address 0x7fffffffe1cc.
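To make the point about per-process address spaces concrete, here is a minimal sketch, assuming a POSIX system with fork(). The function name value_in_parent_after_child_write is made up for this illustration. After the fork, parent and child print the same virtual address for a, yet the child's write does not affect the parent's copy.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that overwrites a local variable, then
   returns the value the parent still sees at that same virtual address
   (or -1 if fork fails). */
int value_in_parent_after_child_write(void)
{
    volatile int a = 5;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: same virtual address as in the parent, but its own copy. */
        a = 6;
        printf("child : &a = %p, a = %d\n", (void *)&a, a);
        fflush(stdout);
        _exit(0);
    }

    /* Parent: wait for the child to finish, then look at our own copy. */
    waitpid(pid, NULL, 0);
    printf("parent: &a = %p, a = %d\n", (void *)&a, a);
    return a;
}
```

Called from main, this should print the same address twice, with the value 6 in the child and 5 in the parent, and return 5.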

Related

Is it possible for a C program to access and changing a memory address in the heap allocated to another program?

Let us suppose that I have a program like this one (I'll call it program 1):
#include <stdlib.h>
#include <stdio.h>
#define MAX 100

int main(){
    int i;
    int *v;
    v = (int *)malloc (MAX * sizeof (int));
    for(i=0;i<MAX;i++){
        v[i] = i;
    }
    printf("Address:%d\n",&v[0]);
    getchar();
    for(i=0;i<MAX;i++){
        printf("%d\n",v[i]);
    }
}
And let us suppose that I have a second program (called program 2), like this one:
#include <stdlib.h>
#include <stdio.h>

int main(){
    int address;
    int *v;
    scanf("%d",&address);
    v = address;
    printf("%d\n",*v);
    *v = 100;
}
Now, let us suppose that I run program 1 and I collect the address printed by it. The program will be blocked in the getchar() function. And, let us suppose that, while program 1 is blocked, I run program 2 and provide to the scanf the address printed by program 1. Can I access the same memory address allocated to program 1 in program 2?
Best regards.
In an older OS without memory protection (e.g. AmigaDOS or Classic MacOS) all processes would run in the same memory-space and you could do tricks like that. Of course that also meant that any buggy (or malicious) program could easily corrupt other programs, or even crash the entire OS. So modern OS's give each program its own separate virtual memory space, so even if your program 2 had the virtual address as printed by program 1, when program 2 tried to dereference that address, it would find that it pointed to a different page of physical memory (or perhaps to no physical page of memory at all, causing a segmentation fault).
Many modern OS's do provide APIs to set up shared memory regions (e.g. mmap under POSIX) so that multiple programs can access the same physical memory, and some even have APIs that let you unilaterally access the private memory of another process (e.g. ReadProcessMemory and WriteProcessMemory under Windows). However, you generally need sufficient privileges over the target process to use those APIs, and they are tricky to use safely, for obvious reasons.

Dereferencing pointer to arbitrary address gives Segmentation fault

I have written some simple C code using pointers. As per my understanding, a pointer is a variable which holds the address of another variable.
E.g.:
int x = 25; // address - 1024
int *ptr = &x;
printf("%d", *ptr); // *ptr will give value at address of x i.e, 25 at 1024 address.
However, when I try the code below, I get a segmentation fault:
#include "stdio.h"

int main()
{
    int *ptr = 25;
    printf("%d", *ptr);
    return 0;
}
What's wrong in this? Why can't a pointer variable return the value at address 25? Shouldn't I be able to read the bytes at that address?
Unless you're running on an embedded system with specific known memory locations, you can't assign an arbitrary value to a pointer and expect to be able to dereference it successfully.
Section 6.5.3.2p4 of the C standard states the following regarding the indirection operator *:
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type "pointer to type", the result has type "type". If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.
As mentioned in the passage above, the C standard only allows for pointers to point to known objects or to dynamically allocated memory (or NULL), not arbitrary memory locations. Some implementations may allow that in specific situations, but not in general.
Although the behavior of your program is undefined according to the C standard, your code is actually correct in the sense that it is doing exactly what you intend. It is attempting to read from memory address 25 and print the value at that address.
However, in most modern operating systems, such as Windows and Linux, programs use virtual memory and not physical memory. Therefore, you are most likely attempting to access a virtual memory address that is not mapped to a physical memory address. Accessing an unmapped memory location is illegal and causes a segmentation fault.
Since the memory address 0 (which is written in C as NULL) is normally reserved to specify an invalid memory address, most modern operating systems never map the first few kilobytes of virtual memory addresses to physical memory. That way, a segmentation fault will occur when an invalid NULL pointer is dereferenced (which is good, because it makes it easier to detect bugs).
For this reason, you can be reasonably certain that also the address 25 (which is very close to address 0) is never mapped to physical memory and will therefore cause a segmentation fault if you attempt to access that address.
However, most other addresses in your program's virtual memory address space will most likely have the same problem. Since the operating system tries to save physical memory if possible, it will not map more virtual memory address space to physical memory than necessary. Therefore, trying to guess valid memory addresses will fail, most of the time.
If you want to explore the virtual address space of your process to find memory addresses that you can read without a segmentation fault occurring, you can use the appropriate API supplied by your operating system. On Windows, you can use the function VirtualQuery. On Linux, you can read the pseudo-file /proc/self/maps. The ISO C standard itself does not provide any way of determining the layout of your virtual memory address space, as this is operating system specific.
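For the Linux case, a minimal sketch of such a lookup might look like this. It assumes /proc is mounted and that each line of /proc/self/maps starts with a hexadecimal "start-end" range; the function name address_is_mapped is made up for this example. It only reports whether an address falls inside a listed region and does not check the region's permissions.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if addr lies inside a region listed in
   /proc/self/maps, 0 if it does not, and -1 if the file cannot be opened. */
int address_is_mapped(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    unsigned long target = (unsigned long)(uintptr_t)addr;
    unsigned long start, end;
    char line[512];
    int found = 0;

    while (fgets(line, sizeof line, maps) != NULL) {
        /* Each line begins with "start-end" in hex; end is exclusive. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            target >= start && target < end) {
            found = 1;
            break;
        }
    }

    fclose(maps);
    return found;
}
```

On a typical Linux system, address_is_mapped on the address of a local variable should return 1 (it lies in the stack region), while address_is_mapped((const void *)25) should return 0, which matches the segmentation fault seen in the question.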
If you want to explore the virtual memory address layout of other running processes, then you can use the VirtualQueryEx function on Windows and read /proc/[pid]/maps on Linux. However, since other processes have a separate virtual memory address space, you can't access their memory directly, but must use the ReadProcessMemory and WriteProcessMemory functions on Windows and use /proc/[pid]/mem on Linux.
Disclaimer: Of course, I don't recommend messing around with the memory of other processes, unless you know exactly what you are doing.
However, as a programmer, you normally don't want to explore the virtual memory address space. Instead, you normally work with memory that has been assigned to your program by the operating system. If you want the operating system to give you some memory to play around with, which you are allowed to read from and write to at will (i.e. without segmentation faults), then you can just declare a large array of chars (bytes) as a global variable, for example char buffer[1024];. Be careful with declaring larger arrays as local variables, as this may cause a stack overflow. Alternatively, you can ask the operating system for dynamically allocated memory, for example using the malloc function.
You should consider all warnings that the compiler issues.
This statement
int *ptr = 25;
is incorrect. You are trying to assign an integer to a pointer as an address of memory. Thus in this statement
printf("%d", *ptr);
there is an attempt to access memory at address 25 that does not belong to your program.
What you mean is the following
#include <stdio.h>

int main( void )
{
    int x = 25;
    int *ptr = &x;
    printf("%d", *ptr);
    return 0;
}
Or
#include <stdio.h>
#include <stdlib.h>

int main( void )
{
    int *ptr = malloc( sizeof( int ) );
    if ( ptr == NULL ) return 1; // allocation failed
    *ptr = 25;
    printf("%d", *ptr);
    free( ptr );
    return 0;
}

Why does setting a value at an arbitrary memory location not work?

I have this code:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>

int main (int argc, char** argv) {
    *(volatile uint8_t*)0x12345678u = 1;
    int var = *(volatile uint8_t*)0x12345678;
    printf("%i", var);
    printf("%i", &var);
    return (EXIT_SUCCESS);
}
I want to see a 1 and the address of that int, which I specified previously. But when I compile it with gcc and run it in bash, only "command terminated" is shown, without any error. Does anyone know why?
PS: I am newbie to C, so just experimenting.
What you are doing:
*(volatile uint8_t*)0x12345678u = 1;
int var = *(volatile uint8_t*)0x12345678;
is totally wrong.
You have no guarantee whatsoever that an arbitrary address like 0x12345678 will be accessible, not to mention writable by your program. In other words, you cannot set a value to an arbitrary address and expect it to work. It's undefined behavior to say the least, and will most likely crash your program due to the operating system stopping you from touching memory you don't own.
The "command terminated" that you get when trying to run your program happens exactly because the operating system is preventing your program from accessing a memory location it is not allowed to access. Your program gets killed before it can do anything.
If you are on Linux, you can use the mmap function to request a memory page at an (almost) arbitrary address before accessing it (see man mmap). Here's an example program which achieves what you want:
#include <sys/mman.h>
#include <stdio.h>

#define WANTED_ADDRESS ((void *)0x12345000)
#define WANTED_OFFSET  0x678 // 0x12345000 + 0x678 = 0x12345678

int main(void) {
    // Request a memory page starting at 0x12345000 of 0x1000 (4096) bytes.
    // Without MAP_FIXED the address is only a hint to the kernel.
    unsigned char *mem = mmap(WANTED_ADDRESS, 0x1000, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    // Check whether the call failed outright (errno is set in that case).
    if (mem == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }

    // Check whether the OS granted the page at the requested address.
    if (mem != WANTED_ADDRESS) {
        fprintf(stderr, "mmap returned %p instead of the requested address\n", (void *)mem);
        return 1;
    }

    // Get a pointer inside that page.
    int *ptr = (int *)(mem + WANTED_OFFSET); // 0x12345678

    // Write to it.
    *ptr = 123;

    // Inspect the results.
    printf("Value : %d\n", *ptr);
    printf("Address: %p\n", (void *)ptr);
    return 0;
}
The operating system and loader do not automatically make every possible address available to your program. The virtual address space of your process is constructed on demand by various operations of the program loader and of services inside the process. Although every address “exists” in the sense of being a potential address of memory, what happens when a process attempts to access an address is controlled by special data structures in the system. Those data structures control whether a process can read, write, or execute various portions of memory, whether the virtual addresses are currently mapped to physical memory, and whether the virtual addresses are not currently mapped to memory but will be provided with physical memory when needed. Initially, much of a process’ address space is marked not in use (or at least implicitly marked, in that none of the explicit records for the address space apply to it).
In the executions of your program you have attempted so far, the address 0x12345678 has not been mapped and marked available to your process, so, when your process attempted to use it, the system detected a fault and terminated your process.
(Some systems randomize the layout of the address space when a program is being loaded, to make it harder for an attacker to exploit bugs in a program. Because of this, it is possible that 0x12345678 will be accessible in some executions of your program and not others.)
The quote from C11 standard 6.5.3.2p4:
4 The unary * operator denotes indirection. [...] If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.
You use the * operator on the pointer (volatile uint8_t*)0x12345678u. Is this a valid pointer? Is it an invalid pointer? What is an "invalid value" of a pointer?
The C language provides no check that tells you which particular pointer values are valid and which are not. A random pointer may just happen to be valid, but it is most probably invalid, in which case the behavior is undefined.
Dereferencing an invalid pointer is undefined behavior. Outside the scope of C, at the operating-system level, trying to access memory that you are not allowed to access on *nix systems should raise the signal SIGSEGV in your program and terminate it. Most probably this is what happens here: your program is not allowed to access the memory location at 0x12345678, and the operating system specifically protects against that.
Also note that systems use ASLR, so pointer values within your program are to some degree random. They are also not linear, i.e. *(char*)0x01 will not access the first byte of your RAM. The operating system (or, more exactly, the underlying hardware as configured by the operating system) translates pointer values in your program to physical locations in RAM using what is called virtual memory. The same pointer value may just happen to be valid on a second run of your program, but because pointers can take so many values, it most probably is not. Your operating system kills your program when it detects an invalid memory access.
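To see the signal without losing the whole program, one option is to do the risky read in a child process and inspect how it ended. This is only a sketch, assuming a POSIX system with fork and waitpid; the function name signal_from_reading is made up for this illustration.

```c
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that reads one byte at addr. Returns
   the number of the signal that killed the child, 0 if the read succeeded,
   or -1 on error. */
int signal_from_reading(uintptr_t addr)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the volatile access forces the compiler to emit the read. */
        volatile unsigned char byte = *(volatile unsigned char *)addr;
        (void)byte;
        _exit(0);
    }

    /* Parent: find out whether the child exited normally or was killed. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical Linux system, signal_from_reading(25) should return SIGSEGV, while passing the address of one of your own variables should return 0. For an address such as 0x12345678 the result depends on what happens to be mapped in that particular run.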

How large is the virtual address space of a program?

I was reading Operating Systems: Three Easy Pieces. To learn what the virtual address space of a program looks like, I ran the following code.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    printf("location of code : %p\n", (void *) main);
    printf("location of heap : %p\n", (void *) malloc(1));
    int x = 3;
    printf("location of stack : %p\n", (void *) &x);
    return x;
}
Its output is:
location of code : 0x564eac1266fa
location of heap : 0x564ead8e5670
location of stack : 0x7fffd0e77e54
Why is the code segment located at 0x564eac1266fa? What is the large (virtual) space before it used for? Why doesn't the code start at or near 0x0?
And why is the program's virtual address space so large? (Judging by the stack location, it's 48 bits wide.) What's the point of that?
The possible virtual address space organizations are defined by the hardware you are using, specifically the MMU it supports. The OS may then use any organization that the hardware can be coerced into using, but generally it just uses it directly (possibly with some subsetting), as that is most efficient.
The x86_64 architecture defines a 48-bit virtual address space [1], and most OSes reserve half of that for system use, so user programs see a 47-bit address space. Within that address space, most OSes will randomize the addresses used for any given program, so as to make exploiting bugs in the programs harder.
[1] Strictly speaking, the architecture defines a 64-bit virtual address space, but then reserves all addresses that do not have the top 17 bits all 0 or all 1.
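The "top 17 bits all 0 or all 1" rule from the footnote can be written as a small check. This is an illustrative sketch for the 48-bit layout described above; the function names is_canonical_48 and is_lower_half are invented here.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the top 17 bits (bits 47..63) of addr
   are all 0 or all 1, i.e. the address is usable under the 48-bit scheme. */
int is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47; /* the top 17 bits */
    return top == 0 || top == 0x1FFFF;
}

/* Hypothetical helper: returns 1 if addr is in the lower half, the 47-bit
   range that most OSes leave to user programs. */
int is_lower_half(uint64_t addr)
{
    return (addr >> 47) == 0;
}
```

For the stack address from the question, 0x7fffd0e77e54, both functions return 1. The address 0x0000800000000000 is one past the end of the lower half and is rejected by both, while 0xffff800000000000 is canonical but belongs to the upper half that is typically reserved for the system.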
You are barking up the wrong tree with what you are trying to do here. A process has multiple stacks, may have multiple heaps, and main might not be the start of the code. Viewing an address space as a code segment, stack segment, heap segment, ... as horrible operating systems books do is only going to get you confused.
Because of logical addressing, the memory mapped into the address space does not have to be contiguous.
Why is the code segment located at 0x564eac1266fa? What is the large (virtual) space before it used for? Why doesn't the code start at or near 0x0?
The start of the code in your process could well be at 0x564eac1266f8. The fact that you have a high address does not mean the lower addresses have been mapped into the process address space.
And why is the program's virtual address space so large? (Judging by the stack location, it's 48 bits wide.) What's the point of that?
Stacks generally start high and grow low.

Is there any way to access data of one C pointer from another C program

I have two programs,
Program A is like this,
int main(int argc, char** argv) {
    char* s = "hello";
    printf(s);
    return (EXIT_SUCCESS);
}
The base address of s is 0x80484e0 ("hello"). Now I have Program B, as below:
int main(int argc, char** argv) {
    void* p = (void*)0x80484e0;
    char* c = (char*)p;
    while(*c)
    {
        printf("%c",*c);
        c++;
    }
    return (EXIT_SUCCESS);
}
In Program B, 'p' points to the same base address as 's' in Program A, but the contents are not the same.
Even though 'p' and 's' have the same base address, their contents differ. Is that because they are running as different programs in different address spaces?
In Program B, 'p' points to the same base address as 's' in Program A, but the contents are not the same.
That's the magic of virtual addresses and separate address spaces. You need to look into "shared memory" for your platform.
Addresses used by a program are virtual. They're not the same as the physical address in RAM. The kernel does some nice (nasty) stuff with the help of the MMU and a page table and hides this from the process.
So, for example, on a 32-bit system a process thinks it's the sole user of the memory: it can use addresses from 0 to 0xffffffff, with certain restrictions.
If you happen to be on a POSIX system, you can look into mmap and shm_open.
If you're using pretty much any operating system, there'll be the concept of virtual memory. So a certain memory address in one process is not necessarily the same in the other process. Even if it did map to the same physical address by some sheer chance, then by trying to read it you would, hopefully, get a segmentation fault because you're accessing memory that the process doesn't "own".
I believe that this will never work.
In program A, "hello" is a string that comes with the executable, and loads into memory when you call that program.

Resources