Why are there 6 T-states in the opcode fetch of CALL instead of 4?

My question is why are there 6T states in opcode fetch of the CALL instruction while there are 4 for other instructions in 8085 microprocessor. I have searched a lot but didn't find any satisfactory answer.
Here: http://www.edaboard.com/thread201650.html it says that it has something to do with dual addressing modes being used in case of CALL. But doesn't really explain why 6T states.
Any idea?
EDIT
This question arose when I came to know that CALL takes 18 T-states.
According to my calculations it should be: 4(for opcode fetch) + 3 + 3 (two memory reads to read the subroutine address) + 3 + 3 (for two memory writes on the stack) = 16
So, on searching the internet I got to know that the opcode fetch part in case of CALL takes 6T states instead of 4.
UPDATE
Now after reading the comments and rethinking, I got to know that PUSH takes 12 T-states normally as an instruction. We can ignore the opcode fetch part for PUSH in case of CALL as there is no explicit PUSH instruction, so now we have 8 (12 - 4). So, I feel is it because of the decrement of stack pointer? Because even in push it should have been 6 (3 + 3 for memory writes), but here it's 8 (4 + 4).

6(opcode fetch) + 3 + 3 (two memory reads to read the subroutine address) + 3 + 3 (two memory writes on the stack) = 18
So I believe what confuses you is the 6 T-states for the opcode fetch rather than the usual 4. 4 T-states are used to fetch the opcode, as in any other instruction fetch. The 2 extra T-states are used to deal with the stack pointer (SP): SP points at the last byte stored on the stack, so it has to be decremented before anything new can be written there. When a CALL is encountered, the current contents of the program counter (the address of the instruction following the CALL, i.e. the return address) are pushed onto the stack. When the subroutine returns, that address is popped back into the program counter. Thus the opcode fetch of CALL requires two more T-states than other instruction fetches.

4 T-states are used to fetch the opcode; 2 T-states are used to decrement the stack pointer (SP), because SP points at the last byte stored and must be moved before the return address is written.
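The arithmetic from the answers above can be written out as a small C sketch. The constant and function names are made up for illustration; the per-cycle counts are the ones quoted in this thread, not values taken from a datasheet.

```c
/* T-states per machine cycle for the 8085 CALL instruction,
   following the breakdown given in the answers above. */
enum {
    OPCODE_FETCH_NORMAL = 4, /* ordinary opcode fetch */
    SP_DECREMENT_EXTRA  = 2, /* extra states spent decrementing SP */
    MEMORY_READ         = 3, /* one byte of the subroutine address */
    MEMORY_WRITE        = 3  /* one byte of the return address */
};

/* Total for CALL: one 6-state fetch, two reads, two writes. */
int call_t_states(void)
{
    int fetch = OPCODE_FETCH_NORMAL + SP_DECREMENT_EXTRA;
    return fetch + 2 * MEMORY_READ + 2 * MEMORY_WRITE;
}

/* The asker's first estimate, which assumed a plain 4-state fetch. */
int naive_call_t_states(void)
{
    return OPCODE_FETCH_NORMAL + 2 * MEMORY_READ + 2 * MEMORY_WRITE;
}
```

call_t_states() gives 18 and naive_call_t_states() gives 16; the difference of 2 is exactly the stack pointer decrement folded into the opcode fetch.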

Related

Why do all C functions converted to assembler start and end with some identical operations? [duplicate]

I know that data in nested function calls goes onto the stack. There is a step-by-step method for storing and retrieving data on the stack as functions get called and return. These steps are commonly known as the prologue and the epilogue.
I tried with no success to search material on this topic. Do you guys know any resource ( site,video, article ) about how function prologue and epilogue works generally in C ? Or if you can explain would be even better.
P.S : I just want some general view, not too detailed.
There are lots of resources out there that explain this:
Function prologue (Wikipedia)
x86 Disassembly/Calling Conventions (WikiBooks)
Considerations for Writing Prolog/Epilog Code (MSDN)
to name a few.
Basically, as you somewhat described, "the stack" serves several purposes in the execution of a program:
Keeping track of where to return to, when calling a function
Storage of local variables in the context of a function call
Passing arguments from calling function to callee.
The prologue is what happens at the beginning of a function. Its responsibility is to set up the stack frame of the called function. The epilogue is the exact opposite: it is what happens last in a function, and its purpose is to restore the stack frame of the calling (parent) function.
In IA-32 (x86) cdecl, the ebp register is used by the language to keep track of the function's stack frame. The esp register is used by the processor to point to the most recent addition (the top value) on the stack. (In optimized code, using ebp as a frame pointer is optional; other ways of unwinding the stack for exceptions are possible, so there's no actual requirement to spend instructions setting it up.)
The call instruction does two things: First it pushes the return address onto the stack, then it jumps to the function being called. Immediately after the call, esp points to the return address on the stack. (So on function entry, things are set up so a ret could execute to pop that return address back into EIP. The prologue points ESP somewhere else, which is part of why we need an epilogue.)
Then the prologue is executed:
push ebp ; Save the stack-frame base pointer (of the calling function).
mov ebp, esp ; Set the stack-frame base pointer to be the current
; location on the stack.
sub esp, N ; Grow the stack by N bytes to reserve space for local variables
At this point, we have:
...
ebp + 4: Return address
ebp + 0: Calling function's old ebp value
ebp - 4: (local variables)
...
The epilog:
mov esp, ebp ; Put the stack pointer back where it was when this function
; was called.
pop ebp ; Restore the calling function's stack frame.
ret ; Return to the calling function.
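To see why the epilogue lands back on the return address, here is a toy simulation in C. It is only a sketch: the Machine struct and the function names are invented for this example, the stack is word-addressed rather than byte-addressed, and it models nothing beyond the three registers discussed above.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* A toy 32-bit machine with a word-addressed stack that grows downward. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t esp; /* index of the most recently pushed word */
    uint32_t ebp; /* frame base of the current function */
    uint32_t eip; /* address of the next instruction */
} Machine;

void push(Machine *m, uint32_t v)
{
    assert(m->esp > 0);
    m->mem[--m->esp] = v;
}

uint32_t pop(Machine *m)
{
    assert(m->esp < STACK_WORDS);
    return m->mem[m->esp++];
}

/* call: push the return address, then jump to the callee. */
void do_call(Machine *m, uint32_t target, uint32_t return_addr)
{
    push(m, return_addr);
    m->eip = target;
}

/* Prologue: push ebp; mov ebp, esp; sub esp, N (N counted in words here). */
void prologue(Machine *m, uint32_t local_words)
{
    push(m, m->ebp);
    m->ebp = m->esp;
    assert(m->esp >= local_words);
    m->esp -= local_words;
}

/* Epilogue: mov esp, ebp; pop ebp; ret. */
void epilogue(Machine *m)
{
    m->esp = m->ebp;  /* discard the locals */
    m->ebp = pop(m);  /* restore the caller's frame base */
    m->eip = pop(m);  /* ret: pop the return address into eip */
}
```

Starting from an empty stack (esp equal to STACK_WORDS), a do_call followed by prologue and epilogue leaves esp and ebp exactly as they were and puts the return address in eip, no matter how many words of locals the prologue reserved.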
C Function Call Conventions and the Stack explains well the concept of a call stack
Function prologue briefly explains the assembly code and the hows and whys.
The gen on function perilogues
I am quite late to the party & I am sure that in the last 7 years since the question was asked, you'd have gotten a way clearer understanding of things, that is of course if you chose to pursue the question any further. However, I thought I would still give a shot at especially the why part of the prolog & the epilog.
Also, the accepted answer elegantly & quite simply explains the how of the epilog & the prolog, with good references. I only intend to supplement that answer with the why (at least the logical why) part.
I will quote the below from the accepted answer & try to extend its explanation.
In IA-32 (x86) cdecl, the ebp register is used by the language to keep
track of the function's stack frame. The esp register is used by the
processor to point to the most recent addition (the top value) on the
stack.
The call instruction does two things: First it pushes the return
address onto the stack, then it jumps to the function being called.
Immediately after the call, esp points to the return address on the
stack.
The last line in the quote above says immediately after the call, esp points to the return address on the stack.
Why's that?
So let's say that our code that's getting currently executed has the following situation, as shown in the (really badly drawn) diagram below
So our next instruction to be executed is, say at the address 2. This is where the EIP is pointing. The current instruction has a function call (that would internally translate to the assembly call instruction).
Now ideally, because the EIP is pointing to the very next instruction, that would indeed be the next instruction to get executed. But since there's sort of a diversion from the current execution flow path, (that is now expected because of the call) the EIP's value would change. Why? Because now another instruction, that may be somewhere else, say at the address 1234 (or whatever), may need to get executed. But in order to complete the execution flow of the program as was intended by the programmer, after the diversion activities are done, the control must return back to the address 2 as that is what should have been executed next should the diversion have not happened. Let us call this address 2 as the return address in the context of the call that is being made.
Problem 1
So, before the diversion actually happens, the return address, 2, would need to be stored somewhere temporarily.
There could have been many choices of storing it in any of the available registers, or some memory location etc. But for (I believe good reason) it was decided that the return address would be stored onto the stack.
So what needs to be done now is increment the ESP (the stack pointer) such that the top of the stack now points at the next address on the stack. So TOS' (TOS before the increment) which was pointing to the address, say 292, now gets incremented & starts pointing to the address 293. That is where we put our return address 2. So something like this:
So it looks like now we have achieved our goal of temporarily storing the return address somewhere. We should now just go about making the diversion call. And we could. But there's a small problem. During the execution of the called function, the stack pointer, along with the other register values, could be manipulated multiple times.
Problem 2
So, although the return address of ours, is still stored on the stack, at location 293, after the called function finishes off executing, how would the execution flow know that it should now goto 293 & that's where it would find the return address?
So (I believe for good reason again) one of the ways of solving the above problem could be to store the stack address 293 (where the return address is) in a (designated) register called EBP. But then what about the contents of EBP? Would that not be overwritten? Sure, that's a valid point. So let's store the current contents of EBP on to the stack & then store this stack address into EBP. Something like this:
The stack pointer is incremented. The current value of EBP (denoted as EBP'), which is say xxx, is stored onto the top of the stack, i.e. at the address 294. Now that we have taken a backup of the current contents of EBP, we can safely put any other value onto the EBP. So we put the current address of the top of the stack, that is the address 294, in EBP.
With the above strategy in place, we solve for the Problem 2 discussed above. How? So now when the execution flow wants to know where from should it fetch the return address, it would :
first get the value from EBP out and point the ESP to that value. In our case, this would make TOS (top of stack) point to the address 294 (since that is what is stored in EBP).
Then it would restore the previous value of EBP. To do this it would simply take the value at 294 (the TOS), which is xxx (which was actually the older value of EBP), & put it back to EBP.
Then it would decrement the stack pointer to go to the next lower address in the stack which is 293 in our case. Thus finally reaching 293 (see that's what our problem 2 was). That's where it would find the return address, which is 2.
It will finally pop this 2 out into the EIP, that's the instruction that should have ideally been executed should the diversion have not happened, remember.
And the steps that we just saw being performed, with all the jugglery, to store the return address temporarily & then retrieve it are exactly what gets done by the call instruction together with the function prolog (at the start of the called function) & the epilog (just before the function's ret). The how was already answered, we just answered the why as well.
Just an end note: For the sake of brevity, I have not taken care of the fact that the stack addresses may grow the other way round.
Functions compiled with a frame pointer typically share an identical prologue (the start of the function code) and epilogue (the end of the function).
Prologue: the structure of the prologue looks like:
push ebp
mov ebp, esp
Epilogue: the structure of the epilogue looks like:
leave
ret
More in detail : what is Prologue and Epilogue

How does the CPU read data from memory? What role does the cache play? [closed]

I have written a piece of code where I am allocating memory to variable Bextradata (which is member of structure which is also allocated using malloc) as
Bextradata = (U8_WMC *) malloc(Size);
memset(Bextradata, 0,Size);
memcpy(Bextradata,pdata + 18,Size);
and later I try to read this variable in some other file, and only once. So how will this variable be read from memory? Will it be placed in a cache, or will it be read from main memory?
Thanks in Advance
Before you understand the working of a CPU, you need to understand a few terms. The CPU consists of the ALU (for arithmetic and logic operations), the control unit and a bunch of registers. The number of registers in a CPU depends on the architecture and varies. The types of registers present are general-purpose registers, special-purpose registers, the instruction pointer and a few others. You can read about them. When we say 32-bit processor or 64-bit processor, we're generally referring to the size of the CPU's general-purpose registers.
Now let's look at the following code:
int a = 10;
int b = 20;
a = a + b;
When the above program is loaded, its instructions are stored in main memory. Every instruction in a program is stored at a location in main memory. Each location has a specific size (this depends on the architecture again, but let's assume it's one byte), and every location has an address. The size of an address is equal to the size of the instruction pointer. In 64-bit systems the instruction pointer is 64 bits wide, which means it can address up to 2^64 locations (addresses 0 through 2^64-1). Since one location is generally one byte, the total addressable memory for 64-bit systems could, in theory, be 16 exabytes. (For 32-bit systems it is 2^32 bytes = 4 GB.)
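The address-space figures can be checked with a few lines of C. This is a minimal sketch assuming one byte per address; the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Highest byte address reachable with a pointer of `bits` bits. */
uint64_t highest_address(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/* Size of the address space in GiB (2^30 bytes), i.e. 2^bits / 2^30.
   Working in GiB avoids overflowing uint64_t for bits == 64. */
uint64_t address_space_gib(unsigned bits)
{
    assert(bits >= 30 && bits <= 64);
    return (uint64_t)1 << (bits - 30);
}
```

address_space_gib(32) is 4, and address_space_gib(64) is 2^34 GiB, which is the 16 exabytes mentioned above. Note that the highest address is 2^bits - 1, while the number of addressable locations is 2^bits.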
Now let's look at the first statement, a = 10. This is a store operation. A CPU can perform basic operations such as add, multiply, subtract, divide, load, store, jump, etc. You can read the instruction set of any processor for more on this; again, the instruction set differs from system to system. Coming back: when the program is loaded into memory, the instruction pointer points to its first instruction, in this case the one for a = 10. The contents of that location are fetched into the CPU, and the control unit decodes them and recognizes a store operation (the opcode bits identify it as one). The CPU then writes the value 10 to a's location in memory, and on a typical system that write goes through the cache. Whether data ends up in the cache is decided by the hardware, not by the compiler: the cache is transparent to the program and keeps recently accessed memory, and hardware prefetching can also pull in data it expects to be needed soon. What the compiler can do is keep frequently used variables such as 'a' in registers, so that they do not have to be fetched from memory at all. Why? For faster access. (In terms of speed, always remember: Registers > Cache > RAM > Disk.)
After the first instruction is executed, the instruction pointer is incremented and it now points to the second instruction, that is, b = 20. The same happens with this as well.
The third is a = a + b. For this there are actually four operations (if you look at the assembly level): 1) fetch a, 2) fetch b, 3) add a and b, 4) store the result in a. Since a and b were just written, they are most likely still in the cache (or in registers), so they are fetched from there. They are then added and the result is stored back to a.
I hope you understood how it works.
Also, you need to know that when a program is loaded into main memory, it occupies a certain space. This space is called a segment. It has a base address and a final address. You can think of the base address as the first instruction and the final address as the last instruction. If your program tries to dereference a pointer that points outside this segment, you get the famous error: segmentation fault. For example:
int *ptr = NULL;
printf("%d\n", *ptr);
This will give me a segmentation fault as I am trying to dereference a pointer that stores an address whose value is NULL and since NULL is not in the segment, it will give a seg fault.

Examining memory with x86 processor [duplicate]

I'm new to GDB and have a problem with it. I have an x86 processor, which means the eip register in my processor should hold 4 bytes. I compiled some C code and set a breakpoint on main(). Typing x/x $eip gives me back "0xd02404c7" (hexadecimal), which as far as I know is a machine-language instruction. So my question is: if this machine instruction is 4 bytes in size, the command "x/4x $eip" should display 16 bytes, and it shows me this:
0x8048426 <main+9>: 0xd02404c7 0xe8080484 0xfffffebe 0x9066c3c9
So I'm confused. If this is 16 bytes, why does it show all of it located at the same memory address, when one register in a 32-bit processor should contain only 4 bytes? Thank you.
Typing x/x $eip gives me back "0xd02404c7"(hexadecimal) which as i know is some instruction to machine language.
No, it gives you raw bytes in your code. These raw bytes can "cover" less than one, one, or several machine instructions. The shortest x86 instruction takes up just one byte. The longest instruction takes 15 bytes.
So my questions is: if This machine instruction is the size of 4 byte.
An address is 4 bytes, but the instruction itself may contain 1 to 15 bytes. You can see the relationship between bytes and instructions if you do (gdb) disas/r main
So every memory address can store 4 machine instructions?
Not at all. Every memory address corresponds to 1 byte of memory. That byte may contain an entire (single-byte) instruction, or it can be the start or middle of a multi-byte instruction, or it may not contain any instruction at all (if the address points to e.g. the .data section).
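To make the byte-per-address point concrete, the four words printed by x/4x can be expanded back into the 16 bytes they came from. The helper below is a sketch with a made-up name; it assumes the words were read on a little-endian machine, as on x86, so the least significant byte of each word sits at the lowest address.

```c
#include <stddef.h>
#include <stdint.h>

/* Expands 32-bit words, as printed by gdb's x/4x on little-endian x86,
   into the bytes as they sit in memory (lowest address first).
   `out` must have room for 4 * n_words bytes. */
void words_to_bytes(const uint32_t *words, size_t n_words, uint8_t *out)
{
    for (size_t i = 0; i < n_words; i++) {
        for (int b = 0; b < 4; b++) {
            /* byte b of word i lives at offset 4*i + b */
            out[4 * i + b] = (uint8_t)(words[i] >> (8 * b));
        }
    }
}
```

Feeding it 0xd02404c7, 0xe8080484, 0xfffffebe and 0x9066c3c9 yields c7 04 24 d0 84 04 08 e8 be fe ff ff c9 c3 66 90, one byte for each address from 0x8048426 to 0x8048435. That sequence appears to decode as a 7-byte mov, a 5-byte call, a 1-byte leave, a 1-byte ret and a 2-byte nop, so the instruction boundaries do not line up with the 4-byte words at all; disas/r main will show the actual split.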

How assembly accesses/stores variables on the stack

In assembly you can store data in registers or on the stack. Only the top of the stack can be accessed at any given moment (right?). Consider the following C code:
main(){
int x=2;
func();
}
func( int x ){
int i;
char a;
}
Upon calling func() the following is pushed onto the stack (consider a 32bit system):
variable x (4 bytes, pushed by main)
<RETURN ADDRESS> (4 bytes pushed by main?)
<BASE POINTER> (4 bytes pushed by func())
variable i (4 bytes, pushed by func())
variable a (1 byte, pushed by func())
I have the following questions:
In C code you can access the local variable from anywhere inside the function, but in assembly you can only access the top of the stack. The C code is translated into assembly (in machine code but assembly is the readable form of it). So how does assembly support the reading of variables that are not on top of the stack?
Did I leave out anything that would also be pushed to the stack in my example?
In assembly, if you push a char or an int on the stack, how can it determine whether it needs to push 4 bytes or 1 byte? Because it uses the same operation (push), right?
Thanks in advance
Gr. Maricruzz
The stack pointer at the beginning of the function is put into a register, and then the variables/arguments are accessed via this base address plus the offset for the variable.
If you want to see the code, instead of creating object files, let the compiler stop at creating assembler files. Then you can see exactly how it works. (Of course, that requires you to have a valid C program, unlike the one you have in the question now.)
The compiler generates the assembly. Each instruction set may differ, but at the end of the day the stack pointer is just a register holding an address in memory. The compiler creates the function, knows its whole scope, and knows how far down the stack each local data item is, so it generates the appropriate code for that instruction set to access those local items.
With some instruction sets you need to make a copy of the stack pointer and/or do math with the stack pointer as an operand and some other register as the result; then, based on that math (stack pointer + 8 words, for example), you access that memory address. Other instruction sets have an addressing mode where the load or store can apply an offset to the stack pointer: the math is done as part of the instruction execution, and you don't have to use an intermediate result and a register.
Only the top of the stack can be accessed at any given moment (right?)
No, generally the ISA has instructions to access other elements on the stack as well. That is, accessing elements on the stack is not limited to push and pop like operations; typically you can just mov things back and forth between a stack location and a register.
Assembly can access any memory by address (just like C).
Simple, unoptimized programs put all local variables on the stack when a function is entered, so a variable's address is the address of the stack frame plus some offset.
The program can then simply use push and pop to store additional values (e.g. intermediate results of an expression) on the top of the stack.
Summary:
There is a register (ESP on x86) pointing to the top of the stack
push moves a value onto the top of the stack and decreases this register (on x86 the stack grows toward lower addresses)
pop moves a value off the top of the stack and increases this register
mov moves a value between memory and registers and does nothing to the stack register (ESP).
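The summary above can be sketched in C as a small byte-addressed stack. Everything here (the Stack struct, push32, pop32, load32) is invented for illustration; load32 plays the role of a mov from [esp + offset], which reads a slot below the top without moving the stack pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 256

typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp; /* byte offset of the top of the stack */
} Stack;

/* push: decrease esp by 4, then store the value there. */
void push32(Stack *s, uint32_t v)
{
    assert(s->esp >= 4);
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &v, 4);
}

/* pop: load the value at esp, then increase esp by 4. */
uint32_t pop32(Stack *s)
{
    uint32_t v;
    assert(s->esp + 4 <= STACK_BYTES);
    memcpy(&v, &s->mem[s->esp], 4);
    s->esp += 4;
    return v;
}

/* mov reg, [esp + offset]: reads any slot and leaves esp untouched. */
uint32_t load32(const Stack *s, uint32_t offset)
{
    uint32_t v;
    assert(s->esp + offset + 4 <= STACK_BYTES);
    memcpy(&v, &s->mem[s->esp + offset], 4);
    return v;
}
```

After pushing 10, 20 and 30 onto a stack whose esp starts at STACK_BYTES, load32 with offset 8 returns 10 even though 30 is on top, and esp stays where it was. This is how compiled code reaches locals and arguments that are not at the top of the stack.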

C Buffer overflow - Return address not expressible in ASCII

I'm trying to overflow a buffer of 64bytes.
The buffer is being filled by a call to gets
My understanding is that I need to write a total of 65 bytes to fill the buffer, and then write another 4 bytes to fill the stack frame pointer.
The next 4 bytes should overwrite the return address.
However, the address that I wish to write is 804846A.
Is this same as 0x0804846A? If so, I'm finding it hard to enter 04 (^D)
Should this be entered in reverse order? (6A 84 04 08)?
Some initial experiments that I was running with input being ZZZZZ..(64 times)..AAAABBBB
ended up making the ebp register to be 0x42414141
The architecture in question is x86.
update: I managed to get ASCII codes 0x04 and 0x08 working. The issue seems to be with 0x84. I tried copying the symbol corresponding to 0x84 from http://www.ascii-code.com which is apparently „. However, C seems to resolve this symbol into a representation greater than 1 byte.
I also tried to use ä as mentioned in http://www.theasciicode.com.ar
This also resulted in a representation greater than 1 byte.
You seem to be depending on implementation details of a particular compiler and CPU architecture. For example:
Not all CPU architectures use a frame pointer at all.
Endianness varies across different CPUs, and this would affect whether you need to "reverse" the bytes or not.
Where the stack metainformation (the frame pointer, etc.) is located with respect to a given local variable will differ between compilers, and even between the same compiler using different optimization options.
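On the endianness point, a short C sketch shows how a 32-bit value is laid out byte by byte. The function names are made up for the example, and it only illustrates byte order; it says nothing about where a particular compiler places the frame pointer or return address.

```c
#include <stdint.h>

/* Stores `value` least significant byte first, as x86 does in memory. */
void to_little_endian(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Stores `value` most significant byte first, as a big-endian CPU would. */
void to_big_endian(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * (3 - i)));
    }
}

/* Reads 4 bytes in memory order as a little-endian 32-bit value. */
uint32_t from_little_endian(const uint8_t in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

For 0x0804846A the little-endian layout is 6A 84 04 08 and the big-endian layout is 08 04 84 6A. Going the other way, the bytes 'A' 'A' 'A' 'B' (41 41 41 42) read as little-endian give 0x42414141, which matches the ebp value observed in the question.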

Resources