Can address of pointers in two programs be equal? [closed] - c

When two programs are running at the same time, and you print the address to which the pointer points to, can it happen that both programs print the same value?

Yes. The program runs in a virtual memory allocated by the OS. The amount of virtual memory is determined by the processor architecture.
The address you see refers to the virtual memory address and not to the physical RAM address.
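I would add that each process running on a system gets its own large virtual address space: up to 2^32 bytes for a 32-bit process and, in principle, up to 2^64 bytes for a 64-bit process (current processors and operating systems typically expose less than that). The process runs in this virtual address space.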

On operating systems like Linux, a running program is called a process. Each process has its own address space and uses virtual memory. So the same address 0x12345 usually refers to different memory cells in process A and in process B.
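As a minimal sketch of this point, assuming a POSIX system: after fork(), the parent and the child print the same virtual address for a local variable, yet a write in the child does not change the parent's copy. The function name same_address_different_memory is made up for this illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's copy of `value` is unchanged after the child
   overwrote its own copy at the same virtual address, 0 if it changed,
   and -1 on error. */
int same_address_different_memory(void)
{
    int value = 1;

    fflush(stdout);              /* avoid duplicating buffered output in the child */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {              /* child: private copy of the address space */
        value = 2;
        printf("child:  &value = %p, value = %d\n", (void *)&value, value);
        fflush(stdout);
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    printf("parent: &value = %p, value = %d\n", (void *)&value, value);
    return value == 1;
}
```

Calling this function from main should print the same address twice with two different values, because each process has its own mapping behind that address.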
Read Advanced Linux Programming which has some chapters explaining that (from a Linux perspective). See also fork system call, and read fork(2), mmap(2), execve(2) man pages.
Other operating systems (Windows, MacOSX) also have processes running in their own individual address space using virtual memory.
Details can be quite complex (and in fact, some RAM can be shared between processes). Read about copy-on-write, shared memory, etc.
Read also some good book about Operating Systems, e.g. Tanenbaum's book, or Operating Systems : Three Easy Pieces (freely downloadable online).

Your question title doesn't quite match the body. The title asks:
Can address of pointers in two program be equal?
Yes, that's possible: as others have already pointed out, there's virtual memory and all sorts of other trickery going on.
Also, a NULL pointer constant is typically always the same in each instance of a program (honestly, I don't know of a platform where it would vary from run to run). So if in both programs, you print NULL, it's even expected that the results will be identical.
Now in the question, you are asking about printing those pointers, which is an entirely different thing:
When two programs are running at the same time, and you print the address to which the pointer points to, can it happen that both programs print the same value?
Since this is tagged with c, I'll answer it from a C point of view:
Yes. Assuming you meant printf("%p", (void *)thePointer), it's perfectly possible. The %p conversion specifier formats the pointer in an implementation-defined manner. Also, if you are printing it as an integer after having done proper type conversion, then again, the result of the conversion is implementation-defined. So your program may always print 0xffffffff or foobar or why are you even curious of internals like a pointer's value each time you attempt to print a pointer. So yes, it's possible that the two programs will have the same output.
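The two ways of printing a pointer mentioned above can be sketched as follows. The helper names format_with_p and format_as_integer are hypothetical; the text produced by %p is implementation-defined, and so is the integer value obtained through uintptr_t.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats p with %p. The exact text is implementation-defined. */
int format_with_p(char *buf, size_t size, const void *p)
{
    /* %p expects a void *, so convert explicitly. */
    return snprintf(buf, size, "%p", (void *)p);
}

/* Formats p as a hexadecimal integer after converting it to uintptr_t.
   The resulting number is implementation-defined as well. */
int format_as_integer(char *buf, size_t size, const void *p)
{
    return snprintf(buf, size, "0x%" PRIxPTR, (uintptr_t)p);
}
```

On common desktop platforms both helpers tend to produce the same hexadecimal digits, but nothing in the C standard promises that.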

The C language does not specify the interaction between two different processes. There is no guarantee that pointers in two different programs will have any meaningful connection to each other.
If you specify the operating system, the C compiler, and how the programs are executed, an answer can be given that will help you.
However, this is not something the C language attempts to control; it is entirely up to the operating system and the hardware running the programs.

Yes, it can happen. The program runs in virtual memory. When a process starts executing, an address space is created for it. This is not limited to two processes: any number of processes can print the same address.
https://stackoverflow.com/a/18479996/1814023 shows what a process address space looks like. Each process gets a similar layout of its own from the OS.

If you want to do this, you can use shared memory between the two processes.
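A minimal sketch of the shared-memory idea, assuming Linux or another POSIX system that supports MAP_ANONYMOUS: a page mapped with MAP_SHARED before fork() is backed by the same physical memory in parent and child, so a write in one is visible in the other. The function name is made up for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the parent reads after the child wrote 42 into a
   shared mapping, or -1 on error. */
int read_value_written_by_child(void)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {              /* child writes through the shared mapping */
        *shared = 42;
        _exit(0);
    }

    int result = -1;
    if (waitpid(pid, NULL, 0) == pid)
        result = *shared;        /* parent sees the child's write */
    munmap(shared, sizeof *shared);
    return result;
}
```

Unrelated processes would need a named object instead (for example POSIX shm_open or a memory-mapped file), which is not shown here.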

Related

Will memory addresses be the same if I run a program in a VM from two different computers?

I'm fairly new to C, and I learned that addresses depend on a few things like the operating system and the CPU. I have a lab for one of my C courses that asks whether, if we run a program and print out the address of each variable, they will have the same address and value as another student's (exact same program). They are local variables, stored on the stack. Normally I would say no, but all of us are required to ssh to our university's lab, and our programs are being run on the same machines with the same specs. This is where I'm confused: I'm pretty sure that the values will be the same, but I don't know what exactly determines these addresses. Here is a piece of code from the program:
int g2(int a, int b)
{
    int c = g1(a + 3, b - 11);
    printf("g2: %d %d %d \n", a, b, c);
    /* %p expects a void *, so each address is cast explicitly */
    printf("a's address is %p b's address is %p c's address is %p\n",
           (void *)&a, (void *)&b, (void *)&c);
    return c - b;
}
For me, a's address is 0x7ffe9bce4a0c. I'm not just looking for a homework answer; I'm asking here because none of my teammates have sent me their addresses, which we were allowed to share. I have researched it but can't find an answer that matches this sort of situation. Any help is greatly appreciated, thank you!
"Will memory addresses be the same if I run a program in a VM from two different computers?"
No, they probably won't even be the same across runs in the same environment on the same machine. There is no guarantee that a variable will have the same address.
A modern OS places memory regions at randomized locations (within certain ranges, of course).
There is a good reason for this: it protects against the exploitation of memory vulnerabilities that an attacker could use to harm the program or even the OS.
This technique is called Address Space Layout Randomization (ASLR).
The variables may happen to have the same address on several executions, but there is no guarantee that this will happen again, even on the very next run. In fact, if the OS supports ASLR, the addresses are almost guaranteed to differ.
The virtual machine should have no influence on that behavior. You may want to check the documentation of your particular virtual machine (for example, whether it supports ASLR), but it should follow the same rules.
Short answer: no.
The operating system typically loads the program at a different position every time.
The address that you see is not the actual address in physical memory. There is an abstract address layer, supplied by the operating system. You can read about virtual memory addresses if you would like to. You will probably cover it in a course on operating systems.
Whether you get the same address or varying addresses depends on the operating system.
Not too many years ago, if a program printed the address of one of the local variables in its function, that address would be the same every time the program was run, as long as the function was called in the same point in program execution with the same program input and other circumstances. (Which functions are called, including recursive calls, and how much stack space they use could be affected by program input and other factors.) This was true because, when the program was loaded and initialized, its stack was always started at the same memory address.
This behavior was exploited by malicious people—if there were bugs in the program, they might be exploited, and knowing which addresses were used in the program helps some exploits. So common operating systems have changed it. Now, when a program is started, the locations of its stack and other parts of its memory layout are adjusted randomly. This is called Address Space Layout Randomization (ASLR).
So, in common modern operating systems, you will get varying addresses from run to run when printing the address of a local variable. In specialized operating systems, such as for embedded devices, you may get the same address every time.
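On Linux specifically, the current ASLR setting can be inspected through /proc/sys/kernel/randomize_va_space. The sketch below assumes that file exists (it will not on other systems), and the function name read_aslr_mode is hypothetical.

```c
#include <stdio.h>

/* Reads the Linux ASLR setting.
   0 = randomization off, 1 = stack/mmap/libraries randomized,
   2 = additionally the heap (brk) is randomized.
   Returns -1 if the file cannot be read (for example on a non-Linux system). */
int read_aslr_mode(void)
{
    FILE *f = fopen("/proc/sys/kernel/randomize_va_space", "r");
    if (f == NULL)
        return -1;

    int mode = -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

If this returns 1 or 2, printing the address of a local variable in two consecutive runs will usually give two different values.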
The title of your question asks about “a VM,” presumably for virtual machine, but this is not mentioned in the body of your question. To the extent that a virtual machine implements a machine properly, it should produce identical behavior. So whether a program is running in a virtual machine or not should be irrelevant to this question.

Using Python to return pointer to already existing memory address

In Python, using ctypes if applicable, how can I return the value a memory address is pointing to?
For instance, when I boot up my x86 PC, let's say the address 0xfffff800 points to a memory address of 0xffffffff.
Using Python, how do I extract the value 0xfffff800 is pointing to (0xffffffff) and save it into a variable? Is this even possible? I have tried using id, but I believe that is only used for local instances (if I created a variable, assigned it a value, and returned that value via id).
Thanks
From your reference to memory contents at boot time, and the idea implicit in the question that there is only one value at any given address, I take you to be talking about physical memory. In that case, no, what you ask is not possible. Only the operating system kernel (or some other program running directly on the hardware) can access physical memory in any system that presently supports Python, and to the best of my knowledge, there is no Python implementation that runs on bare metal.
Instead, the operating system affords each running process its own virtual memory space in which to run, whose contents at any given time might reside more or less anywhere in physical memory or on swap devices (disk files and / or partitions). The system takes great care to isolate processes from each other and from the underlying physical storage, so no, Python cannot access it.
Moreover, processes running on an OS cannot generally access arbitrary virtual addresses, either, regardless of the programming language in which they are written. They can access only those portions of their address spaces that the OS has mapped for them. ctypes therefore does not help you.

How much memory takes a C program [closed]

I am developing an in-memory database as a side project, which is supposed to be lightweight. I haven't been programming in C since school, and my knowledge of computer architecture is limited...
I am wondering how can I calculate exactly how much memory my program will take and from which kind of memory (RAM, register, ... ).
The most obvious is everything I allocate through malloc. Sorry if the following questions are a bit random...
Will global variables be stored in RAM? Does the keyword static (to limit the scope) influence anything?
Are all global variables allocated at the same time, or could they be lazily allocated on first access?
Is the executable loaded in memory? Will an executable of 1 MB take 1 MB during execution?
This subject is a pretty big one so don't hesitate to point me to a book or a website. I guess it's not only about C but more about the computer architecture, the assembly code etc.
I'm assuming typical computing platforms, not embedded systems.
Will global variables be stored in RAM? Does the keyword static (to limit the scope) influence anything?
Global variables will be stored in RAM only if the operating system thinks that's the best use for RAM. Scope has no effect.
Are all global variables allocated at the same time, or could they be lazily allocated on first access?
It depends what you mean by "allocated". Typically virtual memory (address space) is allocated all at once, but physical memory (RAM) is allocated as needed.
Is the executable loaded in memory? Will an executable of 1 MB take 1 MB during execution?
It is mapped into memory at program start. It is actually loaded into physical memory as needed and evicted from physical memory as the OS deems appropriate.
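The difference between allocating address space and allocating physical memory can be observed on Linux with mincore(2). The following is a rough sketch under that assumption: it maps 64 pages, counts how many are resident, touches 8 of them, and counts again. The function names are made up for this example, and the exact counts depend on the kernel (for instance, huge pages can make more pages resident than were touched).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for mincore and MAP_ANONYMOUS */
#endif
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Counts the pages of [addr, addr + len) that are resident in RAM,
   or returns -1 on error. */
static long resident_pages(void *addr, size_t len, size_t page)
{
    size_t n = (len + page - 1) / page;
    unsigned char *vec = malloc(n);
    if (vec == NULL)
        return -1;

    long count = -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < n; i++)
            count += vec[i] & 1;     /* lowest bit set = page is resident */
    }
    free(vec);
    return count;
}

/* Maps 64 pages, then touches 8 of them. Stores the resident page counts
   before and after the touches. Returns 0 on success, -1 on error. */
int lazy_allocation_demo(long *before, long *after)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t page = (size_t)page_size;
    size_t len = 64 * page;

    void *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    *before = resident_pages(region, len, page);
    volatile char *touch = region;
    for (size_t i = 0; i < 8; i++)
        touch[i * page] = 1;         /* first write makes the OS back the page */
    *after = resident_pages(region, len, page);

    munmap(region, len);
    return (*before < 0 || *after < 0) ? -1 : 0;
}
```

On a typical Linux system the count before the writes is small or zero even though all 64 pages already have valid virtual addresses.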
I strongly suspect you are looking for simple answers to very complex questions.
Yes, but that doesn't mean they're all mapped at any given point in time.
They can't be lazily allocated, depending on what you mean by that. They will all be mapped to virtual addresses, but if the program never accesses the variables, the OS might never need to map those addresses to actual physical RAM.
It depends, but most modern desktop/server operating systems will page the code in as needed, I think.
Oops, that's an interesting question, but the answer is, as usual: it depends!
Your questions are heavily implementation-dependent. Old (now outdated) systems had the notion of overlays: parts of the code were loaded into memory only when needed. I do not think overlays are still used on modern virtual memory systems, but they could make sense on embedded systems with limited resources.
Compilers generally have options to set the size of the stack, which can be decisive for a lightweight program.
There is also an obvious dependency on the platform: on Unix/Linux, you have the ELF vs. a.out formats with different memory requirements and management; on Windows, there is still the old .com format, which can lead to really tiny executables.

Force memory allocation always to the same virtual address [duplicate]

I'm experimenting with Pin, an instrumentation tool, which I use to compute some statistics based on memory address of my variables. I want to re-run my program with the information gathered by my instrumentation tool, but for that it's crucial that virtual memory addresses remain the same through different runs.
In general, I should let the OS handle memory allocation, but in this case I need some kind of way to force it to always allocate to the same virtual address. In particular, I'm interested in a very long array, which I'm currently allocating with numa_alloc_onnode(), though I could use something else.
What would be the correct way to proceed?
Thanks
You could try mmap(2).
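A minimal sketch of the mmap(2) suggestion, assuming Linux or a similar POSIX system: the first argument is a requested address. Without MAP_FIXED the kernel treats it only as a hint, so the caller has to compare the returned address with the one requested. The helper name map_near and any concrete address passed to it are hypothetical choices, not values taken from the question.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Requests len bytes of anonymous memory at `hint` (page-aligned).
   Returns the address actually chosen by the kernel, which may differ
   from the hint, or NULL on failure. */
void *map_near(void *hint, size_t len)
{
    void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

If the returned pointer equals the hint in every run, the array sits at a stable virtual address. Adding MAP_FIXED forces the address, but it silently replaces whatever was already mapped there, so it needs more care than this sketch takes.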
The instrumented version of your program will use a different memory layout than the original program, because Pin needs memory for dynamic translation etc., which changes the layout (if I recall correctly).
With the exception of address space layout randomization, most memory allocators, loaders, and system routines for assigning virtual memory addresses will return the same results given the same calls and data (not by deliberate design for that but by natural consequence of how software works). So, you need to:
Disable address space layout randomization.
Ensure your program executes in the same way each time.
Address space layout randomization is a deliberate change to the address space layout to foil attackers: if the addresses change in each program execution, it is more difficult for attackers to use various exploits to control the code that is executed. It should be disabled only temporarily and only for debugging purposes. This answer shows one method of doing that and links to more information, but the exact method may depend on the version of Linux you are using.
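One Linux-specific way to disable randomization for a single program, rather than system-wide, is the personality(2) system call with the ADDR_NO_RANDOMIZE flag. The sketch below is an assumption about how one might use it and is not necessarily the method the linked answer describes; the function name is hypothetical. The flag is inherited across exec, so a small launcher would call this and then exec the real program. Some sandboxes (for example container seccomp profiles) reject the call, in which case it returns -1.

```c
#include <sys/personality.h>

/* Asks the kernel not to randomize the address space of programs that this
   process executes afterwards. Returns 0 on success, -1 on failure. */
int request_no_aslr_for_exec(void)
{
    /* 0xffffffff queries the current persona without changing it. */
    int current = personality(0xffffffff);
    if (current == -1)
        return -1;

    if (personality((unsigned long)current | ADDR_NO_RANDOMIZE) == -1)
        return -1;
    return 0;
}
```

A launcher built this way would typically follow a successful call with execv() of the instrumented program.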
Your program may execute differently for a variety of reasons, such as using threads or using asynchronous signals or interprocess communication. It will be up to you to control that in your program.
Generally, memory allocation is not guaranteed to be reproducible. The results you get may be on an as-is basis.

How Process Size is determined? [closed]

I am very new to these concepts, and I want to ask a question that I think is very basic, but I am confused, so I am asking it.
The question is...
How is the size of a process determined by the OS?
To be clear, suppose I have written a C program and I want to know how much memory it is going to take: how can I determine that? Secondly, I know that a process has several sections, such as the code section, data section, and BSS. Are their sizes predetermined? How are the sizes of the stack and heap determined? Do the stack and heap sizes also count when the total size of the process is calculated?
We also say that when we load the program, an address space is given to the process (that is done by base and limit registers and controlled by the MMU, I guess), and when the process tries to access a memory location that is not in its address space we get a segmentation fault. How is it possible for a process to access a memory that is not in its address space? According to my understanding when some buffer overflows happens then the address gets corrupted. When the process then tries to access the corrupted location, we get the segmentation fault. Is there any other way an address violation can occur?
Thirdly, why does the stack grow downward and the heap upward? Is this the same in all OSes? How does it affect performance? Why can't it be the other way around?
Please correct me if I am wrong in any of these statements.
Thanks
Sohrab
When a process is started, it gets its own virtual address space. The size of the virtual address space depends on your operating system. In general, 32-bit processes get 4 GiB (2^32 bytes) of addresses and 64-bit processes get 16 EiB (2^64 bytes) of addresses.
You cannot in any way access anything that is not mapped into your virtual address space as by definition anything that is not mapped there does not have an address for you. You may try to access areas of your virtual address space that are currently not mapped to anything, in which case you get a segfault exception.
Not all of the address space is mapped to something at any given time. Also, not all of it can be mapped at all (how much of it can be mapped depends on the processor and the operating system). On current-generation Intel processors up to 256 TiB of your address space may be mapped. Note that operating systems can limit that further. For example, for 32-bit processes (having up to 4 GiB of addresses) Windows by default reserves 2 GiB for the system and 2 GiB for the application (but there's a way to make it 1 GiB for the system and 3 GiB for the application).
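The sizes quoted above follow from the number of address bits: 32 bits give 4 GiB, 48 bits give 256 TiB, and 64 bits give 16 EiB. A small sketch of that arithmetic, with hypothetical helper names:

```c
/* Number of bytes addressable with `bits` address bits. A double is used
   because 2^64 does not fit in a 64-bit integer; powers of two in this
   range are represented exactly. */
double address_space_bytes(unsigned bits)
{
    double bytes = 1.0;
    for (unsigned i = 0; i < bits; i++)
        bytes *= 2.0;
    return bytes;
}

/* The same quantity expressed in GiB (2^30 bytes). */
double address_space_gib(unsigned bits)
{
    return address_space_bytes(bits) / 1073741824.0;
}
```

For example, address_space_gib(48) is 262144 GiB, which is 256 TiB.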
How much of the address space is being used and how much is mapped changes while the application runs. Operating system specific tools will let you monitor what the currently allocated memory and virtual address space is for an application that is running.
Code section, data section, BSS etc. are terms that refer to different areas of the executable file created by the linker. In general, code is separate from static immutable data, which is separate from statically allocated but mutable data. The sizes of these sections are computed by the compiler and the linker. Note that each binary file has its own sections, so any dynamically linked libraries will be mapped into the address space separately, each with its own sections mapped somewhere. Heap and stack, however, are separate from all of the above and are not part of the binary image; there is generally one heap per process and one stack (one per thread in a multithreaded process).
The size of the stack (at least the initial stack) is generally fixed. Compilers and/or linkers generally have some flags you can use to set the size of the stack that you want at runtime. Stacks generally "grow backward" because that's how the processor stack instructions work. Having stacks grow in one direction and the rest grow in the other makes it easier to organize memory in situations where you want both to be unbounded but do not know how much each can grow.
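On POSIX systems the stack size limit of the main thread can also be queried at run time with getrlimit(2), independently of any compiler or linker flag. A minimal sketch, with a made-up function name:

```c
#include <sys/resource.h>

/* Stores the soft stack size limit in *bytes and sets *unlimited to 1 if
   the limit is RLIM_INFINITY. Returns 0 on success, -1 on error. */
int stack_soft_limit(unsigned long long *bytes, int *unlimited)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    *unlimited = (rl.rlim_cur == RLIM_INFINITY);
    *bytes = (unsigned long long)rl.rlim_cur;
    return 0;
}
```

On many Linux distributions the reported soft limit is a few MiB by default, but that is a configuration detail rather than a guarantee.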
Heap, in general, refers to anything that is not pre-allocated when the process starts. At the lowest level there are several logical operations that relate to heap management (not all are implemented as I describe here in all operating systems).
While the address space is fixed, some OSs keep track of which parts of it are currently claimed by the process. Even if this is not the case, the process itself needs to keep track of it. So the lowest level operation is to actually decide that a certain region of the address space is going to be used.
The second low level operation is to instruct the OS to map that region to something. This in general can be
some memory that is not swappable
memory that is swappable and mapped to the system swap file
memory that is swappable and mapped to some other file
memory that is swappable and mapped to some other file in read only mode
the same mapping that another virtual address region is mapped to
the same mapping that another virtual address region is mapped to, but in read only mode
the same mapping that another virtual address region is mapped to, but in copy on write mode with the copied data mapped to the default swap file
There may be other combinations I forgot, but those are the main ones.
Of course the total space used really depends on how you define it. RAM currently used is different than address space currently mapped. But as I wrote above, operating system dependent tools should let you find out what is currently happening.
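On Linux, one such operating-system-dependent source is /proc/self/status, which reports among other things VmSize (address space currently mapped) and VmRSS (RAM currently used), both in kB. The sketch below assumes that file format; the helper name proc_status_kb is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Returns the value in kB of a field such as "VmSize" or "VmRSS" from
   /proc/self/status, or -1 if the file or the field is not available. */
long proc_status_kb(const char *field)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    char line[256];
    size_t n = strlen(field);
    long value = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Lines look like "VmSize:     4321 kB". */
        if (strncmp(line, field, n) == 0 && line[n] == ':') {
            if (sscanf(line + n + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(f);
    return value;
}
```

Comparing proc_status_kb("VmSize") with proc_status_kb("VmRSS") shows the gap between mapped address space and RAM actually in use.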
The sections are predetermined by the executable file.
Besides that one, there may be those of any dynamically linked libraries. While the code and constant data of a DLL is supposed to be shared across multiple processes using it and not be counted more than once, its process-specific non-constant data should be accounted for in every process.
Besides, there can be dynamically allocated memory in the process.
Further, if there are multiple threads in the process, each of them will have its own stack.
What's more, there are going to be per-thread, per-process and per-library data structures in the process itself and in the kernel on its behalf (thread-local storage, command line params, handles to various resources, structures for those resources as well and so on and so forth).
It's difficult to calculate the full process size exactly without knowing how everything is implemented. You might get a reasonable estimate, though.
W.r.t. "According to my understanding when some buffer overflows happens then the address gets corrupted": it's not necessarily true. First of all, the address of what? It depends on what happens to be in the memory near the buffer. If there's an address, it can get overwritten during a buffer overflow. But if there's another buffer nearby that contains a picture of you, the pixels of the picture can get overwritten.
You can get segmentation or page faults when trying to access memory for which you don't have necessary permissions (e.g. the kernel portion that's mapped or otherwise present in the process address space). Or it can be a read-only location. Or the location can have no mapping to the physical memory.
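The read-only case can be demonstrated safely by letting a child process take the fault. This sketch assumes Linux or a similar POSIX system and that no custom SIGSEGV handler is installed; the function name is made up for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a child that writes to a read-only page is killed by
   SIGSEGV, 0 if it is not, and -1 on error. */
int write_to_readonly_page_faults(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* The page is mapped (it has a valid address) but only readable. */
    void *mem = mmap(NULL, (size_t)page, PROT_READ,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        volatile char *ro = mem;
        ro[0] = 1;               /* permission violation: the kernel sends SIGSEGV */
        _exit(0);                /* not reached if the write faults */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(mem, (size_t)page);
    if (waited != pid)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

The parent is unaffected because the fault is delivered only to the process that performed the invalid access.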
It's hard to tell how the location and layout of the stack and heap are going to affect performance without knowing the performance of what we're talking about. You can speculate, but the speculations can turn out to be wrong.
Btw, you should really consider asking separate questions on SO for separate issues.
"How is it possible for a process to access a memory that is not in its address space?"
Given memory protection it's impossible. But it might be attempted. Consider random pointers or access beyond buffers. If you increment any pointer long enough, it almost certainly wanders into an unmapped address range. Simple example:
const char *p = "some string";
while (*p++ != 256) /* Always true: a char never equals 256. Keeps incrementing p until it segfaults. */
    ;
Simple errors like this are not unheard of, to make an understatement.
I can answer questions #2 and #3.
Answer #2
When you use pointers in C, you are really using a numerical value that is interpreted as a memory address (a logical address on a modern OS, see footnotes). You can modify this address at will. If the value points to an address that is not in your address space, you get a segmentation fault.
Consider for instance this scenario: your OS gives your process the address range from 0x01000 to 0x09000. Then
int *ptr = (int *)0x01000;
printf("%d", ptr[0]); /* prints the int stored in the first sizeof(int) bytes of your address space */
ptr = (int *)0x09100;
printf("%d", ptr[0]); /* accesses memory outside your address space: segfault */
As you pointed out, the most common causes of a segfault are dereferencing a NULL pointer (usually address 0x00, but this is implementation-dependent) or using a corrupted address.
Note that on Linux i386, the base and limit registers are not used as you may think. They are not per-process limits; instead they describe two kinds of segments: user space and kernel space.
Answer #3
The direction of stack growth depends on the hardware, not on the OS. On i386, assembly instructions like push and pop make the stack grow downwards with respect to the stack-related registers. For instance, the stack pointer automatically decreases when you do a push and increases when you do a pop. The OS cannot change this.
Footnotes
In a modern OS, a process uses so-called logical addresses. These addresses are mapped to physical addresses by the OS. To see this for yourself, compile this simple program:
#include <stdio.h>
int main(void)
{
    int a = 10;
    /* %p expects a void *, so cast the address explicitly */
    printf("%p\n", (void *)&a);
    return 0;
}
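If you run this program multiple times (even simultaneously) on a system where address space layout randomization is disabled or absent, you will see the same address printed by every instance. With ASLR enabled, the address usually changes from run to run, although two instances can still happen to print the same value. Either way, this is not the real memory address but a logical address that is mapped to a physical address when needed.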
