Where does ARM read program instructions from after Register 14?

Here is how I understand the basic workings of the ARM architecture:
There are 16 main registers (r0-r15), with r15 being the Program Counter (PC).
If the program counter points to a specific register, how can you have a program that runs more than ~14 lines?
Obviously this is not true, but I don't understand how you can run a big program with only that handful of registers. What am I missing?

The program counter points to memory, not another register.

Registers don't store the program code. Program code is in main memory, and the Program Counter points to the location in memory of the next instruction.
The other registers are high-speed locations for storing temporary, or frequently accessed, values during the processing of the application.
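If it helps, here is a purely illustrative C sketch of that idea (a toy fetch/execute loop; the opcodes OP_LOAD, OP_ADD, OP_HALT are made up for this example, not real ARM): the "program" sits in an array in memory and the program counter is just an index into that array.
#include <stdio.h>

enum { OP_LOAD, OP_ADD, OP_HALT };            /* made-up opcodes */

int main(void)
{
    int program[] = { OP_LOAD, 10, OP_LOAD, 20, OP_ADD, OP_HALT };  /* the "program", sitting in memory */
    int pc = 0;                               /* program counter: an index/address into memory */
    int r1 = 0, r2 = 0;                       /* two "registers" */

    while (program[pc] != OP_HALT) {
        switch (program[pc]) {
        case OP_LOAD: r2 = r1; r1 = program[pc + 1]; pc += 2; break;  /* load a literal */
        case OP_ADD:  r1 = r1 + r2; pc += 1; break;                   /* add the two registers */
        }
    }
    printf("result = %d\n", r1);              /* prints 30 */
    return 0;
}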

In the simplest form, you have program (instruction) memory, data memory, stack memory, and registers.
ARM instructions are stored in the instruction memory; they are a sequence of commands that tell the processor what to do, and they are never stored in the processor's registers. The program counter only points to an instruction, and that instruction is simply a command which, in its basic form, consists of an opcode (operation code) and operands (registers or literals).
So what happens is that the instruction is read (fetched) from the memory location pointed to by the program counter. It is not loaded into the registers but into the control unit, where it is decoded, i.e. the processor works out what operation to perform (add, sub, mov, etc.) and where to read/store its inputs and outputs.
So where are the inputs/outputs to operate on and store? ARM is a load/store architecture, which means it operates on data loaded into its registers (R1, R2, ... R7, etc.). The registers can be thought of as temporary variables where all inputs and outputs are kept. Registers are used because they are very fast and run at the speed of the processor, unlike memory, which is slower.
Now the question is: how do these registers get populated with values in the first place?
The values may live in data memory or stack memory, so there are instructions to copy them into registers, followed by instructions that operate on them and leave the result in a register, and then further instructions that copy the result back to memory. Some instructions can also load a register with a constant. For example:
// assuming R0 already holds the base address where X, Y and Z live in data memory
LDR R1, [R0]      // instruction 1: copy variable X from memory into R1
LDR R2, [R0, #4]  // instruction 2: copy variable Y from memory into R2
ADD R3, R1, R2    // add them together, result in R3
STR R3, [R0, #8]  // instruction 3: copy the result back to memory
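In C terms, the sequence above is roughly what a compiler emits for a single statement; x, y and z here are just illustrative names:
#include <stdio.h>
#include <stdint.h>

int32_t x = 2, y = 3;   /* globals: live in data memory */
int32_t z;

int main(void)
{
    z = x + y;          /* roughly: load x into a register, load y, add, store z */
    printf("z = %d\n", (int)z);
    return 0;
}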
I tried to make it as simple as possible; there are so many details to cover. Needs books :)

Related

How does the CPU read data from memory? How does the cache play an important role? [closed]

I have written a piece of code where I am allocating memory to the variable Bextradata (which is a member of a structure that is itself allocated using malloc), as follows:
Bextradata = (U8_WMC *) malloc(Size);
memset(Bextradata, 0, Size);
memcpy(Bextradata, pdata + 18, Size);
Later I read this variable in some other file, just once. So how will this variable be read from memory? Will it be placed in a cache, or will it be read from main memory?
Thanks in Advance
Before you understand the working of a CPU, you need to understand a few terms. The CPU consists of the ALU (for arithmetic and logic operations), the control unit, and a bunch of registers. The number of registers in a CPU depends on the architecture and varies. The kinds of registers present are general-purpose registers, special-purpose registers, the instruction pointer, and a few others; you can read about them. Now, when we say 32-bit processor or 64-bit processor, we're generally referring to the size of the CPU's registers.
Now let's look at the following code:
int a = 10;
int b = 20;
a = a + b;
When the above program is loaded, its instructions are stored in main memory. Every instruction of a program is stored at a location in main memory. Each location has a specific size (this depends on the architecture again, but let's assume it's one byte), and every location has an address. The size of an address of a location in RAM is equal to the size of the instruction pointer. On 64-bit systems the instruction pointer is 64 bits wide, which means it can address up to 2^64 locations. Since one location is generally one byte, the total RAM that can be addressed is, in theory, 16 exabytes for 64-bit systems (for 32-bit systems it is 2^32 bytes, about 4 GB).
Now let's look at the first instruction, a = 10. This is a store operation. A computer can perform the following basic operations: add, multiply, subtract, divide, store, jump, etc. You can read the instruction set of any processor for more on this; again, the instruction set differs from system to system. Coming back: when the program is loaded into memory, the instruction pointer points to the first address, or base address. In this case that is a = 10. The contents of this location are brought into one of the general-purpose registers of the CPU, the instruction is decoded, and the CPU sees that it is a store operation (extra bits in the instruction mark it as such). The value is then written to a location in RAM and typically ends up in the cache as well. Whether it stays in the cache is mostly decided by the cache hardware (helped by hardware prefetching); the compiler can also help by keeping frequently used variables in registers or inserting prefetch hints. In this case we can see that variable 'a' will be used again, so keeping it close to the CPU pays off. Why? Faster access. (In terms of speed, always remember: registers > cache > RAM > disk.)
After the first instruction is executed, the instruction pointer is incremented and now points to the second instruction, b = 20. The same happens with it as well.
The third is a = a + b. At the assembly level this is actually four operations: 1) fetch a, 2) fetch b, 3) add a and b, 4) store the result in a. Since the variables a and b are present in the cache, they are brought from there, added, and the result is stored back into a.
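Spelled out in C (purely illustrative; a real compiler would not emit it this way, and may even fold the constants at compile time), those four steps look like this:
#include <stdio.h>

int main(void)
{
    int a = 10;
    int b = 20;

    int fetched_a = a;                 /* 1) fetch a               */
    int fetched_b = b;                 /* 2) fetch b               */
    int sum = fetched_a + fetched_b;   /* 3) add a and b           */
    a = sum;                           /* 4) store the result in a */

    printf("a = %d\n", a);
    return 0;
}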
I hope you understood how it works.
Also, you need to know that when a program is loaded into main memory it occupies a certain space. This space is called a segment; it has a base address and a final address. You can think of the base address as the first instruction and the final address as the last one. If your program tries to dereference a pointer that points outside of its valid segments, you get the famous error: segmentation fault. For example:
int *ptr = NULL;
printf("%d\n", *ptr);   // dereferencing a NULL pointer
This will give a segmentation fault, because I am trying to dereference a pointer whose value is NULL, and NULL is not inside any valid segment of the process.

ARM Assembly loop using PC?

I am currently learning ARM assembly and I have some questions. When reading the docs, I found that register no. 15 is the program counter, which stores the address of the next instruction, and that after an instruction completes it is incremented by 4 bytes (or 2 in Thumb mode).
So, my question is: if I run an instruction that sets the PC to its own value minus 4 bytes, it would return to the previous instruction, wouldn't it? And then jump back again, over and over, so it would be an infinite loop?
Thanks, and sorry if it is an obvious question.
Regards,
Pedro.
You have to look at this on an instruction-by-instruction basis, as for some instructions modifying the PC is unpredictable; but where it is legal, modifying the program counter essentially causes a jump to the address you write into it. You don't have to worry about the "two instructions ahead" thing here (and note that it is 8 and 4 bytes, not 4 and 2, that put the read value of the PC two instructions ahead).
Yes - a jump/branch instruction is exactly what you're describing - it's an instruction which modifies the PC. If you arrange the result of the jump to put the program counter back where it was then, yes, you'll loop on the spot.
Note that this is not really the address of the next instruction but the address of the current instruction +4 (in Thumb mode) or +8 (in ARM mode). So in ARM this is two instructions later, but in Thumb it may not be (as instructions can be 16-bit or 32-bit).

what does PC have to do with load or link address?

Link address is the address where execution of a program takes place, while load address is the address in memory where the program is actually placed.
Now I'm confused: what is the value in the program counter? Is it the load address or the link address?
Link address is the address where execution of a program takes place
No, it's not.
while load address is the address in memory where the program is actually placed.
Kind of. The program usually consists of more than one instruction, so it can't be placed at a single "load address".
When people talk about load address, they usually talk about relocatable code that can be relocated (at runtime) to an arbitrary load address.
For example, let's take a program that is linked at address 0x20020, and consists of 100 4-byte instructions, which all execute sequentially (e.g. it's a sequence of ADDs followed by a single SYSCALL to exit the program).
If such a program is loaded at address 0x20020, then at runtime the program counter will have value 0x20020, then it will advance to the next instruction at 0x20024, then to 0x20028, etc. until it reaches the last instruction of the program at 0x201ac.
But if that program is loaded at address 0x80020020 (i.e. if the program is relocated by 0x80000000 from its linked-at address), then the program counter will start at 0x80020020, and the last instruction will be at 0x800201ac.
Note that on many OSes executables are not relocatable and thus have to always be loaded at the same address they were linked at (i.e. with relocation 0; in this case "link address" really is the address where execution starts), while shared libraries are almost always relocatable and are often linked at address 0 and have non-zero relocation.
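As a small illustration of load address versus link address (assuming a Linux system, a position-independent executable, and ASLR enabled, which is only one possible setup): the runtime address of a function will generally not match the address the linker assigned, and it can change from run to run.
#include <stdio.h>

int main(void)
{
    /* compare this against the address 'nm' reports for main in the binary */
    printf("main is loaded at %p\n", (void *)main);
    return 0;
}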
The two are different concepts, used in different contexts. The linker/loader is mainly responsible for code relocation and modification; the PC is a digital counter that indicates where the processor is in the program sequence (it is not the kind of address the linker/loader deals with).
Linking & Loading :-
The heart of a linker or loader's actions is relocation and code
modification. When a compiler or assembler generates an object file,
it generates the code using the unrelocated addresses of code and data
defined within the file, and usually zeros for code and data defined
elsewhere. As part of the linking process, the linker modifies the
object code to reflect the actual addresses assigned. For example,
consider this snippet of x86 code that moves the contents of variable
a to variable b using the eax register.
mov a,%eax
mov %eax,b
If a is defined in the same file at location 1234 hex and b is
imported from somewhere else, the generated object code will be:
A1 34 12 00 00 mov a,%eax
A3 00 00 00 00 mov %eax,b
Each instruction contains a one-byte operation code followed by a
four-byte address. The first instruction has a reference to 1234 (byte
reversed, since the x86 uses a right to left byte order) and the
second a reference to zero since the location of b is unknown.
Now assume that the linker links this code so that the section in
which a is located is relocated by hex 10000 bytes, and b turns out to
be at hex 9A12. The linker modifies the code to be:
A1 34 12 01 00 mov a,%eax
A3 12 9A 00 00 mov %eax,b
That is, it adds 10000 to the address in the first instruction so now
it refers to a's relocated address which is 11234, and it patches in
the address for b. These adjustments affect instructions, but any
pointers in the data part of an object file have to be adjusted as
well.
Program Counter :-
The program counter (PC) is a processor register that indicates where
a computer is in its program sequence.
In a typical central processing unit (CPU), the PC is a digital
counter (which is the origin of the term "program counter") that may
be one of many registers in the CPU hardware. The instruction cycle
begins with a fetch, in which the CPU places the value of the PC on
the address bus to send it to the memory.
The memory responds by
sending the contents of that memory location on the data bus. (This is
the stored-program computer model, in which executable instructions
are stored alongside ordinary data in memory, and handled identically
by it).
Following the fetch, the CPU proceeds to execution, taking
some action based on the memory contents that it obtained. At some
point in this cycle, the PC will be modified so that the next
instruction executed is a different one (typically, incremented so
that the next instruction is the one starting at the memory address
immediately following the last memory location of the current
instruction).
I would put the term "load address" out of your thinking. It does not really exist in a modern operating system. In ye old days of multiple programs loaded into the same address space (and each program loaded into a contiguous region of memory), the load address had significance. Now it does not. Here's why.
An executable file is typically going to define a number of different program segments. These may not be loaded contiguously in memory. For example, the linker often directs the creation of stack areas remote from other areas of the program.
The executable will indicate the location that should be the initial value of the PC. This might not be at the start of a program segment, let alone be in the first program segment.
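As one concrete example (Linux/ELF, 64-bit, assumed here purely for illustration), that initial-PC value is the e_entry field of the executable's header, which a small program can read directly; for a relocatable (PIE) executable the value is itself adjusted by the load offset at run time.
#include <elf.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <elf-file>\n", argv[0]); return 1; }

    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    Elf64_Ehdr eh;                                      /* 64-bit ELF header */
    if (fread(&eh, sizeof eh, 1, f) != 1) { fclose(f); return 1; }
    fclose(f);

    /* e_entry: the address where execution starts, i.e. the initial PC */
    printf("entry point (initial PC): 0x%llx\n", (unsigned long long)eh.e_entry);
    return 0;
}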

asm instruction alternative in c program

I am writing a user space C program to read the hard disk.
I need to convert an assembler instruction to C program code. How can this be done?
mov eax, [rsi+0x0C]
Here eax can be any variable. However, rsi is the base address register with value 0xc1617000. This value does not change.
You can assign values to pointers in C (you'll need <stdint.h> for the fixed-width types). Try this:
uint8_t *rsi = (uint8_t*)(uintptr_t) 0xc1617000; // The uintptr_t cast isn't really needed, but might help portability.
uint32_t value = *(uint32_t *)(rsi + 0x0C);
A shorter version, of course is:
uint32_t value = *(uint32_t *)0xc161700C;
Basically you interpret that constant as a pointer to uint32_t, and then dereference it.
Following http://www.cs.virginia.edu/~evans/cs216/guides/x86.html:
mov eax, [rsi+0x0C]
means
move the 4 Byte word at the address rsi+0x0C to the EAX register
that's what this line of assembler means; you say
Here eax can be any variable
Typically, EAX is the return value of some function, but I'll not go into this.
Since this is trivial:
int variable = *((unsigned int *)0xc161700C);
Notice that it's totally up to your compiler whether it actually copies that value over -- in many cases the compiler will only perform the load when the value of variable is actually used. And if you ask for the address of variable, you might get a fresh address, or the compiler might simply hand you 0xc161700C.
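A minimal sketch of how to force the access regardless (the address is the one from the question; whether your process can actually read it is a separate matter, see below):
#include <stdint.h>

int main(void)
{
    /* 'volatile' tells the compiler the load must really happen,
       even if the value is never used afterwards */
    volatile uint32_t *base = (volatile uint32_t *)(uintptr_t)0xc1617000u;
    uint32_t value = base[3];          /* offset 0x0C = 3 * sizeof(uint32_t) */
    (void)value;
    return 0;
}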
Since this is basic C, I'm not so confident I want to let you play with my hard drive ;) Note that for programs running unprivileged (in non-kernel mode), access to physical memory addresses is in general impossible.
EDIT
On Linux the program crashes when accessing the location, maybe because it's outside the bounds of the process's memory. Any idea how to access memory outside the bounds of the process's memory?
As I said here and in the comments:
If your code is running as a program (in userland), you can never access raw physical memory addresses. Your process sees its own memory with physical memory being mapped there in pages -- there's no possibility to access raw physical memory without the help of kernel mode. That is the beauty of memory mapping as done on any modern CPU: programs can't fiddle directly with hardware.
Under Linux, things might be relatively easy: open or mmap /dev/mem as root and access the right position in that file -- it's an emulation of direct access to memory as accessible by the operating system.
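A hedged sketch of that /dev/mem route (Linux only, run as root; whether the kernel lets you map this region at all depends on its configuration, e.g. CONFIG_STRICT_DEVMEM):
#include <stdint.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    const off_t phys = 0xc1617000;      /* page-aligned physical base address from the question */

    int fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    /* map one page of physical memory into this process */
    void *page = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, phys);
    if (page == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* the equivalent of: mov eax, [rsi+0x0C] */
    uint32_t value = *(volatile uint32_t *)((uint8_t *)page + 0x0C);
    printf("value = 0x%08x\n", value);

    munmap(page, 4096);
    close(fd);
    return 0;
}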
However, what you're doing is hazardous, and Linux usually already supports as much AHCI as it should -- are you sure you're using a Linux kernel from the last ten years?

Big empty space in memory?

I'm very new to embedded programming (I started yesterday, actually) and I've noticed something I think is strange. I have a very simple program that does nothing but return 0.
int main() {
    return 0;
}
When I run this in IAR Embedded Workbench I have a memory view showing me the program's memory. I've noticed that there is some content in memory, then a big block of empty space, and then content again (I suck at explaining :P so here is an image of the memory).
Please help me understand this a little better than I do now. I don't really know what to search for because I'm so new to this.
The first two lines are the 8 interrupt vectors, expressed as 32-bit instructions with the highest byte last. That is, read them in groups of 4 bytes, with the highest byte last, and then convert to an instruction via the usual method. The first few vectors, including the reset at memory location 0, turn out to be LDR instructions, which load an immediate address into the PC register. This causes the processor to jump to that address. (The reset vector is also the first instruction to run when the device is switched on.)
You can see the structure of an LDR instruction here, or at many other places via an internet search. If we write the reset vector 18 f0 95 e5 as e5 95 f0 18, then we see that the PC register is loaded with the address located at an offset of 0x20.
So the next two lines are memory locations referred to by instructions in the first two lines. The reset vector sends the PC to 0x00000080, which is where the C runtime of your program starts. (The other vectors send the PC to 0x00000170 near the end of your program. What this instruction is is left to the reader.)
Typically, the C runtime is code added to the front of your program that loads the global variables into RAM from flash, and sets the uninitialized RAM to 0. Your program starts after that.
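A rough C sketch of what that startup code typically does; the _sidata/_sdata/_edata/_sbss/_ebss symbols are conventional placeholder names that the linker configuration would have to provide, so treat this as an outline rather than the exact code IAR generates:
#include <stdint.h>

extern uint32_t _sidata;   /* start of the .data initialisers stored in flash */
extern uint32_t _sdata;    /* start of .data in RAM */
extern uint32_t _edata;    /* end of .data in RAM */
extern uint32_t _sbss;     /* start of .bss (zero-initialised data) in RAM */
extern uint32_t _ebss;     /* end of .bss in RAM */

extern int main(void);

void reset_handler(void)
{
    /* copy initialised globals from flash into RAM */
    uint32_t *src = &_sidata;
    for (uint32_t *dst = &_sdata; dst < &_edata; )
        *dst++ = *src++;

    /* zero the uninitialised (.bss) globals */
    for (uint32_t *dst = &_sbss; dst < &_ebss; )
        *dst++ = 0;

    main();                 /* only after this does your program start */
    for (;;) { }            /* trap here if main ever returns */
}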
Your original question was: why have such a big gap of unused flash? The answer is that flash memory is not really at a premium, so we can waste a little, and that having extra space there allows for forward-compatibility. If we need to increase the vector table size, then we don't need to move the code around. In fact, this interrupt model has been changed in the new ARM Cortex processors anyway.
Physical (not virtual) memory addresses map to physical circuits. The lowest addresses often map to registers, not RAM arrays. In the interest of consistency, a given address usually maps to the same functionality on different processors of the same family, and missing functionality appears as a small hole in the address mapping.
Furthermore, RAM is assigned to a contiguous address range, after all the I/O registers and housekeeping functions. This produces a big hole between all the registers and the RAM.
Alternately, as #Martin suggests, it may represent uninitialized and read-only Flash memory as -- bytes. Unlike truly unassigned addresses, access to this is unlikely to produce an exception, and you might even be able to make them "reappear" using appropriate Flash controller commands.
On a modern desktop-class machine, virtual memory hides all this from you, and even parts of the physical address map may be configurable. Many embedded-class processors allow configuration to the extent of specifying the location of the interrupt vector table.
UncleO is right but here is some additional information.
The project's linker command file (*.icf for IAR EW) determines where sections are located in memory. (Look under Project->Options->Linker->Config to identify your linker configuration file.) If you view the linker command file with a text editor you may be able to identify where it locates a section named .intvec (or similar) at address 0x00000000. And then it may locate another section (maybe .text) at address 0x00000080.
You can also see these memory sections identified in the .map file, along with their locations. (Ensure "Generate linker map file" is checked under Project->Options->Linker->List.) The map file is an output from the build, however, and it's the linker command file that determines the locations.
So that space in memory is there because the linker command file instructed it to be that way. I'm not sure whether that space is necessary but it's certainly not a problem. You might be able to experiment with the linker command file and move that second section around. But the exception table (a.k.a. interrupt vector table) must be located at 0x00000000. And you'll want to ensure that the reset vector points to the new location of the startup code if you move it.
