Will memory addresses be the same if I run a program in a VM from two different computers? - c

I'm fairly new to C and I learned that addresses depend on a few things like the operating system and the CPU. I have a lab for one of my C courses that asks whether, if we run a program and print out the address of each variable, the variables will have the same addresses and values as another student's (exact same program). They are local variables, stored on the stack. Normally I would say no, but all of us are required to ssh to our University's lab, and our programs are run on the same machines with the same specs. This is where I'm confused: I'm pretty sure that the values will be the same, but I don't know what exactly determines these addresses. Here is a piece of code from the program:
int g2(int a, int b)
{
    int c = g1(a + 3, b - 11);
    printf("g2: %d %d %d \n", a, b, c);
    /* %p expects a void *, so each address is cast explicitly */
    printf("a's address is %p b's address is %p C's address is %p\n",
           (void *)&a, (void *)&b, (void *)&c);
    return c - b;
}
For me, a's address is 0x7ffe9bce4a0c. I'm not just looking for a homework answer; I'm asking here because none of my teammates have sent me their addresses, which we were allowed to share. I have researched it but can't find an answer that matches this situation. Any help is greatly appreciated, thank you!

"Will memory addresses be the same if I run a program in a VM from two different computers?"
No, they probably won't even be the same across runs in the same environment on the same machine. There is no guarantee that a variable will have the same address.
A modern OS assigns memory locations arbitrarily (within certain sections, of course).
There is a good reason for this: it protects against the exploitation of memory vulnerabilities that an attacker could use to harm the program or even the OS.
This technique is called Address Space Layout Randomization (ASLR).
The variables may happen to have the same address on several executions, but there is no guarantee that this will hold on the next run. In fact, if the OS uses ASLR, it is almost guaranteed that the addresses will differ.
The virtual machine should have no influence on that behavior. You may want to check the documentation for your particular virtual machine to see how it handles memory (and whether the guest supports ASLR), but it should follow the same rules.

Short answer: no.
The operating system may load the program at a different position every time.
The address that you see is not the actual address in physical memory. There is an abstract address layer, supplied by the operating system. You can read about virtual memory addresses if you would like to. You will probably learn about it in a course on operating systems.

Whether you get the same address or varying addresses depends on the operating system.
Not too many years ago, if a program printed the address of one of the local variables in its function, that address would be the same every time the program was run, as long as the function was called in the same point in program execution with the same program input and other circumstances. (Which functions are called, including recursive calls, and how much stack space they use could be affected by program input and other factors.) This was true because, when the program was loaded and initialized, its stack was always started at the same memory address.
This behavior was exploited by malicious people—if there were bugs in the program, they might be exploited, and knowing which addresses were used in the program helps some exploits. So common operating systems have changed it. Now, when a program is started, the locations of its stack and other parts of its memory layout are adjusted randomly. This is called Address Space Layout Randomization (ASLR).
So, in common modern operating systems, you will get varying addresses from run to run when printing the address of a local variable. In specialized operating systems, such as for embedded devices, you may get the same address every time.
The title of your question asks about “a VM,” presumably for virtual machine, but this is not mentioned in the body of your question. To the extent that a virtual machine implements a machine properly, it should produce identical behavior. So whether a program is running in a virtual machine or not should be irrelevant to this question.

Related

Why are virtual addresses so big in my C program?

I recently learned about virtual memory and paging and that compilers only generate virtual addresses starting at 1 and simply counting upwards. I thought I'd test this and wrote the short C program below that instantiates a global variable and prints its address, expecting a very small value, since the CPU only sees the virtual addresses, but instead I get 4247584. What is going on here, are my assumptions wrong? And if possible, what would be a program that shows virtual addresses being generated from 1 up?
My program:
#include <stdio.h>
int x = 0;
int main(){
printf("%d\n", &x);
return 0;
}
(I'm using gcc 4.8.1 on Windows 10)
The actual value of a virtual address is relatively non-essential (well, because it's virtual). There's nothing "wasted" when it doesn't start at 0. The only precondition for address values is that the program, data and all its associated shared libraries actually fit into the value-space.
For security reasons, however, it makes sense to allocate the various code and data areas of a process in virtual address space in a way non-reproducible by a potential attacker (makes code injection attacks at fixed addresses virtually[sic] impossible), that is why modern operating systems allocate virtual address space values for a program randomly.
On some operating systems like Linux you may be able to switch off virtual address space layout randomization and thus make it reproducible. Addresses will most probably still not start at zero, because libraries and startup code will most likely occupy addresses lower than your own program.

Can address of pointers in two programs be equal? [closed]

When two programs are running at the same time, and you print the address to which a pointer points, can it happen that both programs print the same value?
Yes. The program runs in a virtual memory allocated by the OS. The amount of virtual memory is determined by the processor architecture.
The address you see refers to the virtual memory address and not to the physical RAM address.
I would add that each process running on a system gets a huge address space (2^32 on a 32-bit OS and 2^64 on a 64-bit OS) allocated to it. It's on this virtual address space that a process runs.
On operating systems like Linux, a running program is called a process. Each process has its own address space and uses virtual memory. So the same address 0x12345 usually refers to different memory cells in process A and in process B.
Read Advanced Linux Programming which has some chapters explaining that (from a Linux perspective). See also fork system call, and read fork(2), mmap(2), execve(2) man pages.
Other operating systems (Windows, MacOSX) also have processes running in their own individual address space using virtual memory.
Details can be quite complex (and actually, some RAM could be shared between processes....). Read about copy on write, shared memory etc...
Read also some good book about Operating Systems, e.g. Tanenbaum's book, or Operating Systems : Three Easy Pieces (freely downloadable online).
Your question title doesn't quite match the body. The title asks:
Can address of pointers in two program be equal?
Yes, that's possible, as others have already pointed out that there's virtual memory and all sorts of other trickery going on.
Also, a NULL pointer constant is typically always the same in each instance of a program (honestly, I don't know of a platform where it would vary from run to run). So if in both programs, you print NULL, it's even expected that the results will be identical.
Now in the question, you are asking about printing those pointers, which is an entirely different thing:
When two programs are running at the same time, and you print the address to which a pointer points, can it happen that both programs print the same value?
Since this is tagged with c, I'll answer it from a C point of view:
Yes. Assuming you meant printf("%p", (void *)thePointer), it's perfectly possible. The %p conversion specifier formats the pointer in an implementation-defined manner. Also, if you are printing it as an integer after having done proper type conversion, then again, the result of the conversion is implementation-defined. So your program may always print 0xffffffff or foobar or why are you even curious of internals like a pointer's value each time you attempt to print a pointer. So yes, it's possible that the two programs will have the same output.
The C language does not specify the interaction between two different processes. There is no guarantee that pointers in two different programs will have any meaningful connection to each other.
If you specify the operating system, C compiler, and how the programs are executed an answer may be provided that will help you.
However this is not something the C language attempts to control, and is entirely up to the operating system, and hardware running the programs.
Yes, it can happen. The program runs in virtual memory. When a process starts executing, an address space is created for that process. Not just two processes: any number of processes can print the same address.
https://stackoverflow.com/a/18479996/1814023 shows what a process address space looks like. Each process gets a similar layout allocated by the OS.
If you want two processes to actually refer to the same memory, you can use shared memory between them.

Why would setting a variable to its own address give different results on different program runs?

Yesterday I came across this obfuscated C code implementing Conway's Game of Life. As a pseudorandom generator, it writes code to this effect:
int pseudoRand = (int) &pseudoRand;
According to the author's comments on the program:
This is a big number that should be different on each run, so it works nicely as a seed.
I am fairly confident that the behavior here is either implementation-defined or undefined. However, I'm not sure why this value would vary from run to run. My understanding of how most OS's work is that, due to virtual memory, the stack is initialized to the same virtual address each time the program is run, so the address should be the same each time.
Will this code actually produce different results across different runs on most operating systems? Is it OS-dependent? If so, why would the OS map the same program to different virtual addresses on each run?
Thanks!
While the assignment of addresses to objects with automatic storage is unspecified (and the conversion of an address to an integer is implementation-defined), what you're doing in your case is simply stealing the entropy the kernel assigned to the initial stack address as part of Address space layout randomization (ASLR). It's a bad idea to use this as a source of entropy which may leak out of your program, especially in applications interacting over a network with untrusted, possibly malicious remote hosts, since you're essentially revealing the random address base the kernel gave you to an attacker who might want to know it and thereby defeating the purpose of ASLR. (Even if you just use this as a seed, as long as the attacker knows the PRNG algorithm, they can reverse it to get the seed.)

How do you know the exact address of a variable?

So I'm looking through my C programming text book and I see this code.
#include <stdio.h>
int j, k;
int *ptr;
int main(void)
{
    j = 1;
    k = 2;
    ptr = &k;
    printf("\n");
    printf("j has the value %d and is stored at %p\n", j, (void *)&j);
    printf("k has the value %d and is stored at %p\n", k, (void *)&k);
    printf("ptr has the value %p and is stored at %p\n", (void *)ptr, (void *)&ptr);
    printf("The value of the integer pointed to by ptr is %d\n", *ptr);
    return 0;
}
I ran it and the output was:
j has the value 1 and is stored at 0x4030e0
k has the value 2 and is stored at 0x403100
ptr has the value 0x403100 and is stored at 0x4030f0
The value of the integer pointed to by ptr is 2
My question is: if I had not run this through a compiler, how would you know the addresses of those variables just by looking at this code? I'm just not sure how to get the actual address of a variable. Thanks!
Here's my understanding of it:
The absolute addresses of things in memory in C is unspecified. It's not standardised into the language. Because of this, you can't know the locations of things in memory by looking at just the code. (However, if you use the same compiler, code, compiler options, runtime and operating system, the addresses may be consistent.)
When you're developing applications, this is not behaviour you should rely on. You may rely on the difference between the locations of two things in some contexts, however. For example, you can determine the difference between the addresses of pointers to two array elements to determine how many elements apart they are.
By the way, if you are considering using the memory locations of variables to solve a particular problem, you may find it helpful to post a separate question asking how to do so without relying on this behaviour.
There is no other way to "know the exact address" of a variable in Standard C than to print it with "%p". The actual address is determined by many factors not under control of the programmer writing code. It's a matter of OS, the linker, the compiler, options used and probably others.
That said, in the embedded systems world, there are ways to express this variable must reside at this address, for example if registers of external devices are mapped into the address space of a running program. This usually happens in what is called a linker file or map file or by assigning an integral value to a pointer (with a cast). All of these methods are non-standard.
For everyday garden-variety programs, though, the point of writing C is that you need not, and should not, care where your variables are stored.
You can't.
Different compilers can put the variables in different places. On some machines the address is not a simple integer anyway.
The compiler only knows things like "the third integer global variable" and "the four bytes allocated 36 bytes down from the stack pointer." It refers to global vars, pointers to subroutines (functions), subroutine arguments and local vars only in relative terms. (Never mind the extra stuff for polymorphic objects in C++, yikes!) These relative references are saved in the object file (.o or .obj) as special codes and offset values.
The Linker can fill in some details. It may modify some of these sketchy location references when joining several object files. Global variable locations will share a space (the Data Section) when globals from multiple compilation units are merged; the linker decides what order they all go in, but still describing them as relative to the start of the entire set of global vars. The result is an executable file with the final opcodes, but addresses still being sketchy and based on relative offsets.
It's not until the executable is loaded that the Loader replaces all the relative addresses with actual addresses. This is possible now, because the loader (or some part of the operating system it depends on) decides where in the whole virtual address space of the process to store the program's opcodes (Text Section), global variables (BSS, Data Sections) and call stack, and other things. The loader can do the math, and write the actual address into every spot in the executable, typically as part of "load immediate" opcodes and all opcodes involving memory access.
Google "relocation table" for more. See http://www.iecc.com/linker/linker07.html (somewhat old) for a more detailed explanation for particular platforms.
In real life, it's all complicated by the fact that virtual addresses are mapped to physical addresses by a virtual memory system, using segments or some other mechanism to keep each process in a separate address space.
I would like to build on the answers already provided by pointing out that some toolchains, such as Visual Studio's, can build programs with support for Address Space Layout Randomization (ASLR), which makes programs load at a random memory address as a security measure against exploits. Given the addresses that you have in your output, I'd say that you compiled without it (programs without it start at address 0x400000, I think). My source for this information is an answer to this question.
That said, the compiler is what determines the memory addresses at which local variables will be stored. The addresses will most likely change from compiler to compiler, and probably also with each version of the source code.
Every process has its own logical address space starting from zero. Addresses your program can access are all relative to zero. The absolute address of any memory location is decided only after the process is loaded into main memory. Modern operating systems do this using dynamic relocation, so every time a process is loaded it may be placed at a different location depending on which memory is available. Letting user processes know the exact address of data stored in memory therefore makes little sense. What your code prints is a logical address, not the exact or physical address.
Continuing on the answers described above, please do not forget that processes would run in their own virtual address space (process isolation). This ensures that when your program corrupts some memory, the other running processes are not affected.
Process Isolation:
http://en.wikipedia.org/wiki/Process_isolation
Inter-Process Communication
http://en.wikipedia.org/wiki/Inter-process_communication

Declare a pointer to an integer at address 0x200 in memory

I have a couple of doubts. I remember reading somewhere that it is not possible for me to manually put a variable at a particular location in memory, but then I came across this code:
#include<stdio.h>
void main()
{
int *x;
x=0x200;
printf("Number is %lu",x); // Checkpoint1
scanf("%d",x);
printf("%d",*x);
}
Is it that we can not put it in a particular location, or we should not put it in a particular location since we will not know if it's a valid location or not?
Also, in this code, up to the first checkpoint, I get the output 512.
And then after that, a seg fault.
Can someone explain why? Is 0x200 not a valid memory location?
In the general case - the behavior you will get is undefined - everything can happen.
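On Linux, for example, part of the virtual address space is reserved for the kernel (traditionally 1 GB on 32-bit x86), and the lowest pages are normally left unmapped, so if you try to access an address such as 0x200 you will get a seg fault because that address is not mapped for your program in user mode.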
No idea how it works in windows.
Reference for linux claim:
Currently the 32 bit x86 architecture is the most popular type of
computer. In this architecture, traditionally the Linux kernel has
split the 4GB of virtual memory address space into 3GB for user
programs and 1GB for the kernel.
Adding to what @amit wrote:
On Windows it is the same. In general it is the same for all protected-mode operating systems. Since DOS and similar systems are no longer around, this holds for essentially all systems except kernel-mode code (kernel-mode drivers) and embedded systems.
The operating system manages which memory-pages you are allowed to write to and places markers that will make the cpu automatically raise access-violations if some other page is written to.
Up until the "checkpoint", you haven't accessed memory location 0x200, so everything works fine.
There is a local variable x in the function main. It is of type "pointer to int". x is assigned the value 0x200, and then that value is printed. But the target of x hasn't been accessed, so up to this point it doesn't matter whether x holds a valid memory address or not.
Then scanf tries to write to the memory address you passed in, which is the 0x200 stored in x. Then you get a seg fault, which is certainly a possible result of trying to write to an arbitrary memory address.
So what are your doubts? What makes you think that this might work, when you come across this code that clearly doesn't?
Writing to a particular memory address might work under certain conditions, but is extremely unlikely to in general. Under all modern OSes, normal programs do not have control over their memory layout. The OS decides where initial things like the program's code, stack, and globals go. The OS will probably also be using some memory space, and it is not required to tell you what it's using. Instead you ask for memory (either by making variables or by calling memory allocation routines), and you use that.
So writing to particular addresses is very likely to hit either memory that hasn't been allocated, or memory that is being used for some other purpose. Neither of those is good, even if you do manage to hit an address that is actually writable. What if you clobber some piece of data used by one of your program's other variables? Or some other part of your program clobbers the value you just wrote?
You should never be choosing a particular hard-coded memory address, you should be using an address of something you know is a variable, or an address you got from something like malloc.

Resources