Can OS generate same logical Address for two different processes? - c

As far as I know, the CPU generates a logical address for each instruction at run time.
This logical address then points to the linear or virtual address of the instruction.
My questions are:
1) Can the OS generate the same logical address for two different processes?
With reference to "In virtual memory, can two different processes have the same address?": if two different processes can have the same virtual address, then it seems quite possible that their logical addresses can also be the same.
2) Just to clarify my understanding: whenever we write complex C code or a simple "hello world" program, are virtual addresses generated at build time (compile -> assemble -> link), while logical addresses are generated by the CPU at run time?
Please clarify my doubts above, and correct me if I am on the wrong track.

The logical address and the virtual address are the same thing. The CPU translates from logical/virtual addresses to physical addresses during execution.
As such, yes, it's not just possible but quite common for two processes to use the same virtual addresses. Under a 32-bit OS this happens quite routinely, simply because the address space is fairly constrained, and there's often more physical memory than address space. But to give one well-known example, the traditional load address for Windows executables is 0x400000 (I might have the wrong number of zeros on the end, but you get the idea). That means essentially every process running on Windows would typically be loaded at that same logical/virtual address.
More recently, Windows (like most other OSes) has started to randomize the layout of executable modules in memory. Since most of a 32-bit address space is often in use, this changes the relative placement of the modules (their order in memory) but means many of the same locations are used in different processes (just for different modules in each).
A 64-bit OS has a much larger address space available, so when it's placing modules at random addresses it has many more choices available. That larger number of choices means there's a much smaller chance of the same address happening to be used in more than one process. It's probably still possible, but certainly a lot less likely.
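A minimal sketch of the point above, assuming a POSIX system (it relies on `fork`, `pipe`, and `waitpid`). After `fork`, parent and child are separate processes that see the variable at the same virtual address, yet each holds its own contents there. The function name `same_address_different_contents` and the `report` struct are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* What the child reports back to the parent through the pipe. */
struct report {
    uintptr_t address; /* virtual address of `value` as seen by the child */
    int value;         /* contents of `value` in the child */
};

static int value = 1; /* after fork, each process has its own copy */

/* Returns 1 if parent and child see `value` at the same virtual address
 * while holding different contents, 0 if not, and -1 on a system error. */
int same_address_different_contents(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                      /* child process */
        close(fds[0]);
        value = 2;                       /* changes only the child's copy */
        struct report r = { (uintptr_t)&value, value };
        ssize_t n = write(fds[1], &r, sizeof r);
        _exit(n == (ssize_t)sizeof r ? 0 : 1);
    }

    /* parent process */
    close(fds[1]);
    struct report child;
    ssize_t n = read(fds[0], &child, sizeof child);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n != (ssize_t)sizeof child)
        return -1;

    /* Same virtual address in both processes, different contents. */
    return child.address == (uintptr_t)&value && child.value != value;
}
```

Calling `same_address_different_contents()` from `main` should return 1 on such a system: the two processes use the same virtual address, but the OS maps it to different physical memory once the child writes to it.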

Related

Where is the address of a variable stored in memory?

Whenever we need to find the address of a variable, we use the syntax below in C, and it prints the address of the variable. What I am trying to understand is whether the address that is returned is an actual physical memory location or whether the compiler is producing some random number. If it is either physical or random, where does that number come from, and where does it have to be stored in memory? Does the address of a memory location itself take up space in memory?
int a = 10;
printf("ADDRESS:%d",&a);
ADDRESS: 2234xxxxxxxx
This location is from the virtual address space, which is allocated to your program. In other words, this is from the virtual memory, which your OS maps to a physical memory, as and when needed.
It depends on what type of system you've got.
Low-end systems such as microcontroller applications often support only physical addresses.
Mid-range CPUs often come with an MMU (memory management unit), which allows so-called virtual memory to be placed on top of the physical memory. This means that a certain part of the code could be working from address 0 to x, though in reality those virtual addresses are just aliases for physical ones.
High-end systems like PCs typically allow only virtual memory access and deny applications direct access to physical memory. They often also use address space layout randomization (ASLR) to produce random address layouts for certain kinds of memory, in order to prevent hacks that exploit hard-coded addresses.
In any of these cases, the actual address itself does not take up space in memory.
Higher abstraction layer concepts such as file systems may, however, store addresses in look-up tables and similar structures, and then they will take up memory.
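As a small illustration of the answers above, here is a sketch of how an address is usually printed in C and of when an address does occupy memory. The `%p` conversion expects a `void *`, so it is used here instead of `%d`. The function names are invented for this example, and the printed text is implementation-defined, so no particular output is assumed.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats the address `p` into `buf`. Returns what snprintf returns: the
 * number of characters the full text needs (excluding the terminating NUL),
 * or a negative value on error. */
int format_address(const void *p, char *buf, size_t size)
{
    return snprintf(buf, size, "%p", (void *)p);
}

/* The value of &a needs no storage of its own, but a pointer variable
 * that holds it is an ordinary object and does occupy memory. */
size_t pointer_storage_bytes(void)
{
    int a = 10;
    int *where = &a;      /* `where` stores the (virtual) address of a */
    return sizeof where;  /* commonly 4 on 32-bit and 8 on 64-bit targets */
}
```

A caller could write `char buf[64]; format_address(&a, buf, sizeof buf); puts(buf);` to display the virtual address of `a`.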
… is the address that returned is actual physical memory location or compiler throwing a some random number
In general-purpose operating systems, the addresses in your C program are virtual memory addresses (see footnote 1).
if it is either physical or random, where did it get those number or where it has to be stored in memory.
The software that loads your program into memory makes the final decisions about what addresses are used (see footnote 2), and it may inform your program about those addresses in various ways, including:
It may put the start addresses of certain parts of the program in designated processor registers. For example, the start address of the read-only data of your program might be put in R17, and then your program would use R17 as a base address for accessing that data.
It may “fix up” addresses built into your program’s instructions and data. The program’s executable file may contain information about places in your program’s instructions or data that need to be updated when the virtual addresses are decided. After the instructions and data are loaded into memory, the loader will use the information in the file to find those places and update them.
With position-independent code, the program counter itself (a register in the processor that contains the address of the instruction the processor is currently executing or about to execute) provides address information.
So, when your program wants to evaluate &x, it may take the offset of x from the start of the section it is in (that offset is built into the program by the compiler and possibly updated by the linker) and add it to the base address of that section. The resulting sum is the address of x.
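The base-plus-offset arithmetic described above can be sketched in C. This is only an analogy: the struct `data_section` is a hypothetical stand-in for a program section, and `offsetof` plays the role of the offset that the compiler and linker would normally bake into the program.

```c
#include <stddef.h>

/* Hypothetical stand-in for one section of a loaded program. */
struct data_section {
    char banner[16];
    int  x;
    long counter;
};

/* Address of an object = base address of its section + its offset. */
void *address_in_section(void *section_base, size_t offset)
{
    return (char *)section_base + offset;
}
```

For a `struct data_section s`, the call `address_in_section(&s, offsetof(struct data_section, x))` yields the same address as `&s.x`, wherever `s` happens to be placed.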
actually does address of the memory location takes space in the memory?
The C standard does not require the program to use any memory for the address of x, &x. The result of &x is a value, like the result of 3*x. The only thing the compiler has to do with a value is ensure it gets used for whatever further expression it is used in. It is not required to store it in memory. However, if the program is dealing with many values in a piece of code, so there are not enough processor registers to hold them all, the compiler may choose to store values in memory temporarily.
Footnotes
1 Virtual memory is a conceptual or “imaginary” address space. Your program can execute with virtual addresses because the hardware automatically translates virtual addresses to physical addresses while it is executing the program. The operating system creates a map that tells the hardware how to translate virtual addresses to physical addresses. (The map may also tell the hardware certain virtual memory is not actually in physical memory at the moment. In this case, the hardware interrupts the program and starts an operating system routine which deals with the issue. That routine arranges for the needed data to be loaded into memory and then updates the virtual memory map to indicate that.)
2 There is usually a general scheme for how parts of the program are laid out in memory, such as starting the instructions in one area and setting up space for stack in another area. In modern systems, some randomness is intentionally added to the addresses to foil malicious people trying to take advantage of bugs in programs.

Using Python to return pointer to already existing memory address

In Python, using ctypes if applicable, how can I return the value that a memory address points to?
For instance, when I boot up my x86 PC, let's say the address 0xfffff800 points to a memory address of 0xffffffff.
Using Python, how do I extract the value that 0xfffff800 points to (0xffffffff) and save it in a variable? Is this even possible? I have tried using id, but I believe that is only usable for local instances (if I created a variable, assigned it a value, and returned that value via id).
Thanks
From your reference to memory contents at boot time, and the idea implicit in the question that there is only one value at any given address, I take you to be talking about physical memory. In that case, no, what you ask is not possible. Only the operating system kernel (or some other program running directly on the hardware) can access physical memory in any system that presently supports Python, and to the best of my knowledge, there is no Python implementation that runs on bare metal.
Instead, the operating system affords each running process its own virtual memory space in which to run, whose contents at any given time might reside more or less anywhere in physical memory or on swap devices (disk files and / or partitions). The system takes great care to isolate processes from each other and from the underlying physical storage, so no, Python cannot access it.
Moreover, processes running on an OS cannot generally access arbitrary virtual addresses, either, regardless of the programming language in which they are written. They can access only those portions of their address spaces that the OS has mapped for them. ctypes therefore does not help you.

Is virtual address process-specific?

I've been studying memory management related topics. I'm wondering, whether I've understood it correctly:
a pointer (virtual) address is process-specific
different processes can have pointers with the same addresses, but these pointers get translated to different physical addresses
Am I correct about these statements? If yes, do they apply to the x86, x86-64, ARMv7 and ARMv8 architectures?
Yes, with one caveat about this statement:
different processes can have pointers with same addresses, but these pointers get translated to different physical addresses
While this is the general case, different processes can also share mapped pages (look into shared memory). The pointers can then point to the same data, provided the pages are mapped at the same locations in each virtual address space.
Otherwise, yes, that's the correct understanding.
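A toy model may make this concrete. The sketch below assumes 4 KiB pages and a tiny four-page address space; real page tables are multi-level and managed by the OS and MMU. The tables `process_a` and `process_b`, and their frame numbers, are invented for illustration.

```c
#include <stdint.h>

#define PAGE_SIZE  4096u       /* assumed page size */
#define PAGE_COUNT 4u          /* toy address space of four pages */
#define NOT_MAPPED UINT32_MAX  /* marker for "no physical frame" */

/* One table per process: index = virtual page number,
 * value = physical frame number. */
struct page_table {
    uint32_t frame[PAGE_COUNT];
};

/* Translates a virtual address to a physical one, or returns NOT_MAPPED
 * if the page has no frame in this process. */
uint32_t translate(const struct page_table *pt, uint32_t vaddr)
{
    uint32_t page   = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;
    if (page >= PAGE_COUNT || pt->frame[page] == NOT_MAPPED)
        return NOT_MAPPED;
    return pt->frame[page] * PAGE_SIZE + offset;
}

/* Page 1 is private to each process; page 2 is shared (same frame). */
const struct page_table process_a = { { NOT_MAPPED, 7, 9, NOT_MAPPED } };
const struct page_table process_b = { { NOT_MAPPED, 3, 9, NOT_MAPPED } };
```

With these tables, virtual address 0x1234 translates to 0x7234 in process A and 0x3234 in process B, while 0x2010 translates to 0x9010 in both, which models the shared-memory case.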

Virtual/Logical Memory and Program relocation

Virtual memory, along with logical memory, helps to make sure programs do not corrupt each other's data.
Program relocation does a similar thing by making sure that multiple programs do not corrupt each other. Relocation modifies an object program so that it can be loaded at a new, alternate address.
How are virtual memory, logical memory and program relocation related? Are they similar?
If they are the same or similar, then why do we need program relocation?
Relocatable programs, or, said another way, position-independent code, are traditionally used in two circumstances:
systems without virtual memory (or with too basic virtual memory, e.g. classic Mac OS), for any code
dynamic libraries, even on systems with virtual memory, given that a dynamic library could find itself loaded at an address that is not its preferred one if other code already occupies that range in the address space of the host program.
However, today even main executable programs on systems with virtual memory tend to be position-independent (e.g. the PIE* build flag on Mac OS X) so that they can be loaded at a randomized address to protect against exploits, e.g. those using ROP**.
* Position Independent Executable
** Return-Oriented Programming
Virtual memory does not prevent programs from interfering with each other. It is logical memory that does so. Unfortunately, it is common for the two concepts to be conflated as "virtual memory."
There are two types of relocation and it is not clear which you are referring to. However, they are connected. On the other hand, the concept is not really related to virtual memory.
The first is the concept of relocatable code. This is critical for shared libraries, which usually have to be mapped to different addresses.
Relocatable code uses offsets rather than absolute addresses. When a program results in an instruction sequence something like:
JMP SOMELABEL
. . .
SOMELABEL:
The compiler or assembler encodes this as
JUMP the-number-of-bytes-to-SOMELABEL
rather than
JUMP to-the-address-of-somelabel.
By using offsets the code works the same way no matter where the JMP instruction is located.
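The arithmetic behind such an offset can be sketched as follows. The example assumes the x86 near-jump encoding (opcode E9 followed by a 32-bit displacement, five bytes in total), where the displacement is counted from the end of the jump instruction. The answer above does not name an architecture, so that choice is an assumption, as is the function name.

```c
#include <stdint.h>

#define JMP_REL32_LENGTH 5u  /* x86 near jump: opcode E9 + 32-bit displacement */

/* Displacement stored in the instruction: the distance from the end of
 * the jump instruction to the target. */
int32_t jump_displacement(uint32_t jump_address, uint32_t target_address)
{
    return (int32_t)(target_address - (jump_address + JMP_REL32_LENGTH));
}
```

Moving the jump and its target by the same amount leaves the result unchanged: `jump_displacement(0x401000, 0x401020)` and `jump_displacement(0x7f001000, 0x7f001020)` both give 0x1B, which is why the encoded bytes do not need to change when the code is loaded elsewhere.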
The second type of relocation uses the first. In the past, relocation was mostly used for libraries. Now, some OSes will load program segments at different places in memory. That is intended for security: it is designed to defeat malicious cracks that depend upon the application being loaded at a specific address.
Both of these concepts work with or without virtual memory.
Note that generally the program is not modified to relocate it. I say "generally" because an executable file will usually have some addresses that need to be fixed up at run time.
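A rough sketch of such a run-time fix-up, under heavy simplification: the `image` struct and its fields are hypothetical and do not correspond to any real executable format. The idea is that the file lists which words contain absolute addresses, and the loader adds the difference between the actual and the assumed load address to each of them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified image: loaded words plus a list of the word
 * indices that hold absolute addresses. */
struct image {
    uint32_t      preferred_base; /* load address the linker assumed */
    uint32_t     *words;          /* loaded contents */
    const size_t *fixups;         /* indices into words[] needing fix-up */
    size_t        fixup_count;
};

/* Adjusts every listed word by the difference between the actual and the
 * preferred base address. Words that are not listed stay untouched. */
void apply_fixups(struct image *img, uint32_t actual_base)
{
    uint32_t delta = actual_base - img->preferred_base;
    for (size_t i = 0; i < img->fixup_count; i++)
        img->words[img->fixups[i]] += delta;
}
```

For an image linked for base 0x400000 but loaded at 0x10000000, a listed word containing 0x400010 becomes 0x10000010, while ordinary data such as the value 42 is left alone.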

Why would setting a variable to its own address give different results on different program runs?

Yesterday I came across this obfuscated C code implementing Conway's Game of Life. As a pseudorandom generator, it uses code to this effect:
int pseudoRand = (int) &pseudoRand;
According to the author's comments on the program:
This is a big number that should be different on each run, so it works nicely as a seed.
I am fairly confident that the behavior here is either implementation-defined or undefined. However, I'm not sure why this value would vary from run to run. My understanding of how most OS's work is that, due to virtual memory, the stack is initialized to the same virtual address each time the program is run, so the address should be the same each time.
Will this code actually produce different results across different runs on most operating systems? Is it OS-dependent? If so, why would the OS map the same program to different virtual addresses on each run?
Thanks!
While the assignment of addresses to objects with automatic storage is unspecified (and the conversion of an address to an integer is implementation-defined), what you're doing in your case is simply stealing the entropy the kernel assigned to the initial stack address as part of Address space layout randomization (ASLR). It's a bad idea to use this as a source of entropy which may leak out of your program, especially in applications interacting over a network with untrusted, possibly malicious remote hosts, since you're essentially revealing the random address base the kernel gave you to an attacker who might want to know it and thereby defeating the purpose of ASLR. (Even if you just use this as a seed, as long as the attacker knows the PRNG algorithm, they can reverse it to get the seed.)
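On the implementation-defined conversion mentioned above: on common 64-bit platforms `int` is narrower than a pointer, so a cast to `int` may discard part of the address. A minimal sketch of the conversion using `uintptr_t` instead (an optional type in the C standard, though widely available) is shown here. The function name is made up, and the caveat about leaking ASLR information applies to it just the same.

```c
#include <stdint.h>

/* Converts an object address to an integer. uintptr_t is wide enough to
 * hold a converted pointer; the resulting value is implementation-defined
 * and, with ASLR enabled, typically differs from run to run. */
uintptr_t address_as_integer(const void *p)
{
    return (uintptr_t)p;
}
```

A caller would pass the address of a local variable, for example `address_as_integer(&local)`, and should treat the result as sensitive rather than as a general-purpose seed.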
