Why does ARM not support the full physical address space in LPAE mode? - arm

I know that the ARM 32-bit (AArch32) architecture supports 2 types of page tables: the short format (which has 32-bit table entries) and the long format (which has 64-bit table entries and is known as LPAE).
For the short format, they managed to support a 40-bit address space by allowing the descriptor for 16MB supersections to contain bits 32-39 of the physical address (similar to how x86 does it using PSE-36, but with 4MB pages instead).
However, what I don't understand is why the full 48-bit address space isn't supported when using the long descriptor format. It has the necessary space for the extra bits, and they are supported in AArch64, but in AArch32 mode the manual says (G5.5.2):
In both cases, if bits[47:40] of the descriptor are not zero then a translation that uses the descriptor
will generate an Address size fault, see Address size fault on page G5-9240.
What is the reason for not supporting the full 48-bit address space? Would it have used so much more silicon, considering that (almost) the same translation scheme is used in both LPAE and A64?

Related

Is compiled to binary's user mode program showing logical address or linear address in Windows

I was reading Intel 64 and IA-32 Architectures Software Developer's Manual.
I understand that before paging, the logical address first has to be converted to a linear address, and the linear address then goes through the page tables to produce the final physical address.
My question is that, if we run the C code below, the address of variable a that is printed, is it a logical address or a linear address?
I know that Windows 10 64-bit is currently using long mode and so the logical address and linear address are the same, but the question I have in mind is:
Is the address we see in a user-mode program the logical address, or is it the linear address that has already gone through the global descriptor table translation?
#include <stdio.h>

int main(void)
{
    int a = 50;
    printf("%p", &a);
    return 0;
}
Windows has not used segmented memory since it stopped being 16-bit. Or to put it a different way, the GDT just spans from 0 to the end of linear space in 32-bit Windows.
All this is irrelevant when asking about C because it has no knowledge of such details. To answer we must ignore the C abstract machine and look directly at Windows on x86.
If we imagine your code running in 16-bit Windows, taking the address of a local variable is going to give you the offset into the segment. This is 16 bits. A FAR address, on the other hand, is 32 bits of information: the segment and the offset. The Windows function lstrlen (l for long?) takes a FAR address and can compute the length of a string coming from anywhere. The strlen C function might just use a plain char* pointer, just the segment offset. Your C compiler might support different memory models (tiny, small, compact, medium, large), giving you access to more memory, perhaps even FAR pointers. Classic DOS .com files use the tiny model: there are no segments, just a maximum of 64 KiB for all code and data. Other models might separate code and data into different segments.
In 32- and 64-bit Windows the logical and linear addresses are the same, but if you have to think of it in terms of your graphic, %p is going to print the logical address. printf is not going to ask the CPU/OS to translate the address in any way.

What is the size of pointers in C on PAE system?

I know normally in a 32-bit machine the size of pointers used in regular C programs is 32-bit. What about in a x86 system with PAE?
It's still 32 bits.
PAE increases the size of physical memory addresses, which lets the operating system use more than 4GB RAM for running applications. To run an application the operating system maps the larger physical addresses to 32 bit virtual addresses. This means that the address space in each application is still limited to 4GB.
PAE changes the structure of page tables to allow them to address more than 32 bits worth of physical memory. Virtual memory addressing remains unchanged — pointers in userspace applications are still 32 bits.
Note that this means that 32-bit applications can be used unmodified on PAE systems, but can still each only use 4 GB of memory.
It is 32-bit only, because PAE is a feature that allows 32-bit central processing units (CPUs) to access a physical address space (including random access memory and memory-mapped devices) larger than 4 gigabytes.
See http://en.wikipedia.org/wiki/Physical_Address_Extension
You can access the memory through a window (address range). Each time you need something outside that range, you make a system call to map another range into the window. Consider using multiple heaps, each addressed by an offset within the window: the full pointer is then a structure of heap identifier plus window offset, 64 bits in total. Each time you need to go outside the current heap, you have to switch windows.

Does using FreeDOS allow my program to access more than 64 K of memory?

I am interested in programming in C on FreeDOS while learning some basic ASM in the process, will using FreeDOS allow my program to access more than the standard 640K of memory?
And secondly, about the ASM, I know on modern processors it is hard to program on assembly due to the complexity of the CPU architecture, but does using FreeDOS limit me to the presumably simpler 16-bit instruction set?
MS-DOS and FreeDOS can use the "HIMEM" areas. These are:
Some memory areas above 0xA000:0x0000 that are reserved for extension cards but can contain RAM instead
The memory from 0xFFFF:0x0010 to 0xFFFF:0xFFFF, which is located above 1 MB but can be accessed using 16-bit real-mode code (if the so-called A20 line is active).
The maximum memory size that can be achieved this way is about 800K.
Using XMS and EMS you can use up to 64M:
XMS will allocate memory blocks above the area that can be accessed via 16-bit real mode code. There are special functions that can copy data from that memory to the low 640K of memory and vice versa
EMS is similar; however using EMS it is possible to "map" the high memory to a low address (a feature of 32-bit CPUs) which means that you can access some memory above the 1MB area as if it was located at an address below 1MB.
Without any extender, a program can use at most 640 KB of low memory in DOS, and each object is limited to the size of a segment, 64 KB. That means you can have about 10 large arrays of 64 KB each. Of course you can have multiple arrays in a segment, but their total size must not exceed the segment size. Some compilers also handle addresses spanning multiple segments automatically, so you can use objects larger than 64 KB seamlessly; you can do the same if you're writing in assembly.
To access more memory you need an extender such as EMS or XMS. But note that the address space is still 20 bits wide. The extenders just map high memory regions into segments within the addressable space, so you can only see a small window of your data at a time.
Regarding assembly, you can use 32-bit registers in 16-bit mode; there are 66h and 67h prefixes to change the operand and address size. However, that doesn't mean writing 16-bit code is easier. In fact, it has lots of idiosyncrasies to remember, like the limited register usage in memory addressing. The 32-bit x86 instruction set is a lot cleaner, with saner addressing modes as well as a flat address space, which is a lot easier to use.

How can a process with a 32-bit address space access large amounts of memory efficiently? [duplicate]

This question already has an answer here:
moving a structure from 32-bit to 64-bit in C programming language [closed]
(1 answer)
Closed 9 years ago.
We have a process with a 32-bit address space that needs to access more memory than it can address directly. Most of the source code for this process cannot be changed. We can change the modules that are used to manage access to the data. The interfaces to those modules may include 64-bit pieces of data that identify the memory to be accessed.
We currently have an implementation in which the interface modules use interprocess communication with a 64-bit process to transfer data to and from the address space of that 64-bit process.
Is there a better way?
Very few platforms support mixing 32-bit and 64-bit code. If you need more than 2 or 3 GB of address space, your options are:
Recompile the whole application as 64-bit, or
Use memory-mapped files to page in and out large chunks of data.
Recompiling is easy. Accessing more than 2 or 3 GB of memory in a 32-bit program is hard.
Note that recompiling a 32-bit application as a 64-bit application requires no changes to your code or functionality, barring a few bugs that might turn up if your code has unportable constructs. Things like:
size_t round_to_16(size_t x)
{
    return x & ~15u; // BUG in 64-bit code: ~15u zero-extends to size_t, clearing the upper 32 bits; should be ~(size_t) 15
}
As stated in various comments, the situation is:
There is a 32-bit process of which a small portion can be altered. The rest is essentially pre-compiled and cannot be changed.
The small portion currently communicates with a 64-bit process to transfer selected data between a 64-bit address space and the address space of the 32-bit process.
You seek alternatives, presumably with higher performance.
Interprocess communication is generally fast. There may not be a faster method unless:
your system has specialized hardware for accelerating memory transfers, or
your system has means of remapping memory (more below).
Unix has calls such as shmat and mmap that allow processes to attach to shared memory segments and to map portions of their address spaces to offsets within shared memory segments. It is possible that calls such as these can support mapping portions of a 32-bit address space into large shared memory segments that exist in a large physical address space.
For example, the call mmap takes a void * parameter for the address to map in the process’ address space and an off_t parameter for the offset into a shared memory segment. Conceivably, the off_t type may be a 64-bit type even though the void * is only a 32-bit pointer. I have not investigated this.
Remapping memory is conceivably faster than transferring memory by copy operations, since it can involve simply changing the virtual address map instead of physically moving data.

What is the difference between far pointers and near pointers?

Can anybody tell me the difference between far pointers and near pointers in C?
On a 16-bit x86 segmented memory architecture, four registers are used to refer to the respective segments:
DS → data segment
CS → code segment
SS → stack segment
ES → extra segment
A logical address on this architecture is written segment:offset. Now to answer the question:
Near pointers refer (as an offset) to the current segment.
Far pointers use segment info and an offset to point across segments. So, to use them, DS or CS must be changed to the specified value, the memory will be dereferenced and then the original value of DS/CS restored. Note that pointer arithmetic on them doesn't modify the segment portion of the pointer, so overflowing the offset will just wrap it around.
And then there are huge pointers, which are normalized to have the highest possible segment for a given address (contrary to far pointers).
On 32-bit and 64-bit architectures, memory models are using segments differently, or not at all.
Since nobody mentioned DOS, let's forget about old DOS PC computers and look at this from a generic point of view. Then, very simplified, it goes like this:
Any CPU has a data bus, which determines the maximum amount of data the CPU can process in one single instruction, i.e. the size of its registers. The data bus width is expressed in bits: 8 bits, 16 bits, 64 bits, etc. This is where the term "64-bit CPU" comes from: it refers to the data bus.
Any CPU has an address bus, also with a certain bus width expressed in bits. Any memory cell in your computer that the CPU can access directly has an unique address. The address bus is large enough to cover all the addressable memory you have.
For example, if a computer has 65536 bytes of addressable memory, you can cover these with a 16 bit address bus, 2^16 = 65536.
Most often, but not always, the data bus width is as wide as the address bus width. It is nice if they are of the same size, as it keeps both the CPU instruction set and the programs written for it clearer. If the CPU needs to calculate an address, it is convenient if that address is small enough to fit inside the CPU registers (often called index registers when it comes to addresses).
The non-standard keywords far and near are used to describe pointers on systems where you need to address memory beyond the normal CPU address bus width.
For example, it might be convenient for a CPU with 16 bit data bus to also have a 16 bit address bus. But the same computer may also need more than 2^16 = 65536 bytes = 64kB of addressable memory.
The CPU will then typically have special instructions (that are slightly slower) which allow it to address memory beyond those 64 kB. For example, the CPU can divide its large memory into n pages (also sometimes called banks, segments, or other such terms, which can mean different things from one CPU to another), where every page is 64 kB. It will then have a "page" register which has to be set first, before addressing that extended memory. Similarly, it will have special instructions for calling/returning from subroutines in extended memory.
In order for a C compiler to generate the correct CPU instructions when dealing with such extended memory, the non-standard near and far keywords were invented. Non-standard as in they aren't specified by the C standard, but they are de facto industry standard and almost every compiler supports them in some manner.
far refers to memory located in extended memory, beyond the width of the address bus. Since it refers to addresses, you most often use it when declaring pointers. For example: int far *x; means "give me a pointer that points to extended memory". The compiler will then know that it should generate the special instructions needed to access such memory. Similarly, function pointers declared far will generate special instructions to jump to/return from extended memory. If you didn't use far, you would get a pointer to the normal, addressable memory, and you'd end up pointing at something entirely different.
near is mainly included for consistency with far; it refers to anything in the addressable memory and is equivalent to a regular pointer. So it is mainly a useless keyword, save for some rare cases where you want to ensure that code is placed inside the standard addressable memory. You could then explicitly label something as near. The most typical case is low-level hardware programming where you write interrupt service routines. They are called by hardware through an interrupt vector of a fixed width, the same as the address bus width, meaning that the interrupt service routine must be in the standard addressable memory.
The most famous use of far and near is perhaps the mentioned old MS DOS PC, which is nowadays regarded as quite ancient and therefore of mild interest.
But these keywords exist on more modern CPUs too! Most notably in embedded systems, where they exist for pretty much every 8- and 16-bit microcontroller family on the market, as those microcontrollers typically have an address bus width of 16 bits but sometimes more than 64 kB of memory.
Whenever you have a CPU where you need to address memory beyond the address bus width, you will need far and near. Generally, such solutions are frowned upon, though, since it is quite a pain to program for them and to always take the extended memory into account.
One of the main reasons there was a push to develop the 64-bit PC was actually that 32-bit PCs had reached the point where their memory usage was starting to hit the address bus limit: they could only address 4 GB of RAM. 2^32 = 4.29 billion bytes = 4 GB. To enable the use of more RAM, the options were either to resort to some burdensome extended-memory solution as in the DOS days, or to expand the computers, including their address bus, to 64 bits.
Far and near pointers were used in old platforms like DOS.
I don't think they're relevant on modern platforms. But you can learn about them here and here (as pointed out by other answers). Basically, a far pointer is a way to extend the addressable memory in a computer, i.e., to address more than 64K of memory on a 16-bit platform.
A pointer basically holds an address. On the 16-bit x86 segmented model, memory is addressed through segments such as the code, data, stack, and extra segments.
When the address a pointer refers to lies within the current segment, it is a near pointer, and it requires only 2 bytes for the offset.
On the other hand, when a pointer points to an address outside the current segment (that is, in another segment), it is a far pointer. It consists of 4 bytes: two for the segment and two for the offset.
In DOS, dealing with registers and segments was all about the maximum addressing capacity of the RAM.
Today it is pretty much irrelevant. All you need to read about is the difference between virtual/user space and the kernel.
Since Windows NT 4 (when Microsoft borrowed ideas from *nix), programmers started to use what are called user/kernel memory spaces, and direct access to physical controllers has been avoided since then. The problem of dealing with direct access to memory segments disappeared as well: everything became read/write through the OS.
However, if you insist on understanding and manipulating far/near pointers, look at the Linux kernel source and how it works; you will never come back, I guess.
And if you still need to use CS (Code Segment)/DS (Data Segment) in DOS. Look at these:
https://en.wikipedia.org/wiki/Intel_Memory_Model
http://www.digitalmars.com/ctg/ctgMemoryModel.html
I would like to point to the answer from Lundin; I was too lazy to answer properly. Lundin gave a very detailed and sensible explanation. Thumbs up!
