for (i = 0; i < 64; i++) {
A[i] = B[i] + C[i];
}
The MIPS instructions for the above C code are:
add $t4, $zero, $zero # I1 i is initialized to 0, $t4 = 0
Loop:
add $t5, $t4, $t1 # I2 temp reg $t5 = address of b[i]
lw $t6, 0($t5) # I3 temp reg $t6 = b[i]
add $t5, $t4, $t2 # I4 temp reg $t5 = address of c[i]
lw $t7, 0($t5) # I5 temp reg $t7 = c[i]
add $t6, $t6, $t7 # I6 temp reg $t6 = b[i] + c[i]
add $t5, $t4, $t0 # I7 temp reg $t5 = address of a[i]
sw $t6, 0($t5) # I8 a[i] = b[i] + c[i]
addi $t4, $t4, 4 # I9 i = i + 1 ($t4 is a byte offset, so it advances by 4)
slti $t5, $t4, 256 # I10 $t5 = 1 if $t4 < 256, i.e. i < 64
bne $t5, $zero, Loop # I11 go to Loop if $t4 < 256
For I8, could the sw instruction not be replaced with an addi instruction? i.e addi $t5, $t6, 0
Wouldn't it achieve the same task of copying the address of $t6 into $t5? I would like to know the difference and when to use either of them. Same could be said about the lw instruction.
Also, maybe a related question, how does MIPS handle pointers?
edit: changed addi $t6, $t5, 0.
The sw instruction in MIPS stores the first argument (value in $t6) to the address in the second argument (value in $t5) offset by the constant value (0).
You're not actually trying to store the $t5 address into a register, but rather storing the value in $t6 into the memory location represented by the value of $t5.
If you like, you could consider the value in $t5 to be analogous to a C pointer. In other words, MIPS does not handle pointers vs values differently-- all that matters is where you use the values. If you use a register's value as the second argument to lw or sw, then you are effectively using that register as a pointer. If you use a register's value as the first argument to lw or sw, or in most other places, you are operating directly on the value. (Of course, just like in C pointer arithmetic, you might manipulate an address so you can store a piece of data somewhere else in memory.)
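As a rough C analogy of the loop in the question (a sketch only, assuming 32-bit int elements): the register used as the second operand of lw/sw behaves like a byte pointer, while add only computes an address or a sum in registers. The function name add_arrays_by_offset is hypothetical and not part of the original code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring I1..I11: `off` plays the role of $t4,
   a byte offset that advances by 4 for each 32-bit element. */
void add_arrays_by_offset(int32_t *a, const int32_t *b, const int32_t *c)
{
    for (uint32_t off = 0; off < 256; off += 4) {        /* I1, I9..I11 */
        int32_t x, y, sum;
        memcpy(&x, (const char *)b + off, sizeof x);    /* I2, I3: add + lw */
        memcpy(&y, (const char *)c + off, sizeof y);    /* I4, I5: add + lw */
        sum = x + y;                                    /* I6: registers only */
        memcpy((char *)a + off, &sum, sizeof sum);      /* I7, I8: add + sw */
    }
}
```

Replacing the final memcpy (the sw) with another register-only add would leave the array a untouched, which is the point of the answer above.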
For I8, could the sw instruction not be replaced with an addi instruction? i.e addi $t6, $t5, 0
No. The sw instruction stores a register's value to memory. The add just manipulates registers. And lw gets a word from memory. Loads and stores (lw, sw, and their byte and halfword variants) are the only MIPS instructions that access memory. (Other processors might and do have versions of add that access memory, but not MIPS.)
It's necessary to adjust your thinking when working in assembly language. Registers and memory are separate. In higher level languages, registers are (nearly) completely hidden. In assembly, registers are a separate resource that the programmer must manage. And they're a scarce resource. A HLL compiler would do this for you, but by programming in assembly, you have taken the job for yourself.
how does MIPS handle pointers?
In MIPS, pointers are just integers (in registers or memory) that happen to be memory addresses. The only way they're distinguished from data values is by your brain. The "pointer" is something invented by higher level language designers to relieve you the programmer of this burden. If you look closely, you'll see that $t5 actually holds a pointer. It's a memory address used by lw and sw as the address to load from or store to.
For I8, could the sw instruction not be replaced with an add instruction? Wouldn't it achieve the same task of copying the address of $t5 into $t0? I would like to know the difference and when to use either of them.
I think you are confused about what a store word actually does. In I8, the value in register $t6 is being stored to the memory address held in $t5, at offset zero. An add will overwrite whatever data is stored in the destination register with the sum of the two other registers' values.
Also, maybe a related question, how does MIPS handle pointers?
The "pointers" are just addresses in memory stored in the registers (as opposed to values).
lw and sw read/write to memory. addi and other arithmetic operations operate on registers.
Registers are like little buckets the CPU uses to store data. MIPS has 32 general-purpose registers, so a register number fits in 5 bits of an instruction.
Memory is like a vast ocean of data that requires well over 16 bits to address. So you actually have to store the address in a register.
Pointers are simply memory addresses (32 bit on a 32 bit architecture).
What is the following C code in MIPS?
f = A[B[i]]
I'm told it can be done in 6 lines but can't quite figure out how.
f is in $t0, i is in $t3, A[] is in $s0, and B[] is in $s1. All types are integer.
The best I am able to think of is
lw $t5, $t3($s0); # Doesn't work because lw syntax doesn't accept a register as an offset
lw $t6, $t5($s1);
add $t0, $t6, $zero
Obviously this is wrong. How would I go about getting the correct offset for each line?
Thanks.
There might be more efficient ways, but here's one way in 6 lines:
sll $t2,$t3,2 # t2 = i * sizeof(int)
addu $t2,$t2,$s1 # t2 = &B[i]
lw $t0,0($t2) # t0 = B[i]
sll $t0,$t0,2 # t0 *= sizeof(int)
addu $t0,$t0,$s0 # t0 = &A[B[i]] (leaves the base address in $s0 intact)
lw $t0,0($t0) # t0 = A[B[i]]
Read a MIPS instruction set reference to get more information about what individual instructions do.
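The same sequence of steps can be sketched in C with explicit byte arithmetic. This is only an illustration, assuming 32-bit int elements and non-negative indices; the function name load_a_of_b is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: each statement corresponds to one or two of the
   six instructions in the answer above. */
int32_t load_a_of_b(const int32_t *A, const int32_t *B, int32_t i)
{
    const char *p = (const char *)B + ((uint32_t)i << 2);  /* sll, addu: &B[i]    */
    int32_t idx = *(const int32_t *)p;                      /* lw: B[i]            */
    p = (const char *)A + ((uint32_t)idx << 2);             /* sll, addu: &A[B[i]] */
    return *(const int32_t *)p;                             /* lw: A[B[i]]         */
}
```

The shift by 2 is what the attempted lw $t5, $t3($s0) was missing: the index has to be turned into a byte offset and added to the base in a separate instruction before the load.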
I would like to translate the C code below into assembly language.
However, I do not see that I need to use the stack in this example.
Moreover, I'd like to know whether or not "beq" saves the address of the following instruction in $ra like "jal" does, for when the loop ends, I would like to get back to the original function foo, and continue the instructions (which here is simply returning.)
int foo(int* a, int N) {
if(N > 0)
{
for(int i = 0; i != N; i = i + 1)
{
a[i] = bar(i << 4, a[i]);
}
}
return N & 7;
}
# assume a in $a0, N in $a1
foo:
slt $t0, $zero, $a1 #put 1 in $t0 if 0 < N
li $t1,0 # use $t1 as loop counter
beq $t0, 1, loop # enter loop if 0 < N
and $v0, $a1, 7 # do bitwise and on N and 7 and save in $v0 as return value
loop:
beq $t1, $a1, exit # exit loop when i = N
sll $t3, $t1, 2 # obtain 4 * i
add $t3, $a1, $t3 # obtain address of a[i] which is address of a plus 4i
lw $t3, 0($t3) # load a[i] into $t3
sll $t4, $t1, 4 #perform i<< 4 and save in $t4
# the 2 previous load arguments for bar
jal bar # assume bar saves return value in $v2
sw $t3, 0($v1)
j loop
exit:
and $v0, $a1, 7
beq is for conditional branching, not calling — it changes the PC (conditionally) but not $ra. We use it to translate structured statements (e.g. if, for) into the if-goto style of assembly language.
However, I do not see that I need to use the stack in this example.
You must use the stack for this code because the call to bar (as in jal bar) will wipe out foo's $ra, and while bar will be able to return back to foo, foo will not be able to return to its caller. Since this requires a stack, you will need a prologue and an epilogue to allocate and release some stack space.
Your code is not properly passing parameters to bar, i << 4, for example, should be passed in $a0, while a[i] should be passed in $a1.
You do not have a return instruction in foo — it is missing a jr $ra.
If either of your beq instructions did set $ra, those wouldn't be useful points to return back to. But since you asked:
I'd like to know whether or not "beq" saves the address of the following instruction in $ra like "jal" does
If the instruction mnemonic doesn't end with al (which stands for And Link), it doesn't save a return address in $ra.
Classic MIPS has the following instructions that link, from this somewhat incomplete reference (missing nor and IDK what else).
jal target (Jump And Link)
BGEZAL $reg, target (conditional Branch if >= 0 And Link)
BLTZAL $reg, target (conditional Branch if < 0 And Link)
Note that the conditional branches are effectively branching on the sign bit of the register.
bal is an alias for bgezal $zero, target, useful for doing a PC-relative function call. (MIPS branches use a fully relative encoding for branch displacement, MIPS jumps use a region-absolute encoding that replaces the low 28 bits of PC+4. This matters for position-independent code).
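To make the two encodings concrete, here is a small C sketch of how the target address is formed in each case, assuming classic 32-bit MIPS with a 16-bit branch immediate and a 26-bit jump field. The function names are hypothetical.

```c
#include <stdint.h>

/* Conditional branch: signed 16-bit word offset, relative to PC+4. */
uint32_t branch_target(uint32_t pc, int16_t imm16)
{
    return pc + 4 + (uint32_t)((int32_t)imm16 * 4);
}

/* j / jal: the 26-bit field, shifted left by 2, replaces the low
   28 bits of PC+4; the top 4 bits of PC+4 are kept. */
uint32_t jump_target(uint32_t pc, uint32_t instr_index)
{
    return ((pc + 4) & 0xF0000000u) | ((instr_index & 0x03FFFFFFu) << 2);
}

/* Offset field for a branch located at `pc` that goes to `target`
   (assumes the distance fits in the signed 16-bit field). */
int16_t branch_offset(uint32_t pc, uint32_t target)
{
    return (int16_t)((int32_t)(target - (pc + 4)) / 4);
}
```

Because the branch form depends only on the distance between the branch and its target, the same machine code works wherever it is loaded; the jump form bakes in part of an absolute address.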
None of this is particularly relevant to your case; your foo needs to save/restore $ra on entry/before jr $ra because you need to call bar with a jal or bal. Using a linking branch as the loop branch wouldn't affect anything (except to make your code even less efficient, and make performance worse on real CPUs that do return-address prediction with a special predictor that assumes jal and jr $ra are paired properly).
Using bal / jal doesn't automatically make the thing you jump to ever return; that only happens if the target ever uses jr $ra (potentially after copying $ra somewhere else then restoring it).
I was trying to solve this and convert it to MIPS assembly code
but the answer in the book confused me. Can anyone explain how this code produces the same result as the C statement?
B[g] = A[f] + A[f+1];
I have inserted comments with what I think is happening; please correct me if I am wrong.
Assume we have variables f, g, h, i, j stored in $s0, $s1, $s2, $s3 and $s4, respectively. Assume the base addresses of arrays A and B are at $s6 and $s7.
The code:
add $t0, $s6, $s0 #This will add f bytes to the base address and it's not equal to A[f].
add $t1, $s7, $s1 #This will add g bytes to the base address and it's not equal to B[g]
lw $s0, 0($t0) #This will point to the value in (f +base address of A +0) bytes
lw $t0, 4($t0) #This will point to the value in (f +base address of A +4) bytes
add $t0, $t0, $s0
sw $t0, 0($t1)
Your compiled fragment annotated by me:
add $t0, $s6, $s0
add and store in register t0 whatever is in register s6 and s0. Since you have pointed out that f is stored in s0 and the base address of A is stored in s6, this adds the addresses in preparation for a register indirect load later on. More simply A[f] == *(A + f) in C, and this is preparing for the (A + f) de-reference later on.
add $t1, $s7, $s1
The same thing happening for B and g. Add their contents and store them in an intermediate register, to be later used as address based de-reference targets.
lw $s0, 0($t0)
This is loading to the s0 register, using what's known as the register-indirect addressing mode of the cpu, whatever is at the address pointed to by t0 plus 0 bytes. In c this is equal to s0 = *(A + f).
lw $t0, 4($t0)
The same thing as above, only that this time it loads to register t0 whatever is pointed at t0 plus 4 bytes. Equal to C t0 = *(A + f + 1).
add $t0, $t0, $s0
This is the point where it performs the addition in your code. It's equal to the C code fragment of A[f] + A[f + 1].
sw $t0, 0($t1)
This is storing the result of the previous addition to the address pointed to by t1.
~~~~~~~~~~~
If you are looking for some references for the code you have, I found both this MIPS instruction set reference useful and, of course, Matt Godbolt's interactive compiler.
If you want to see which code does what using the interactive compiler, wrap your code in a void function, select x86 clang as the compiler, and add --target=mips to the compiler options. Then enable colourise in the filter menu, and you will be able to see which C code generates which assembly code.
add $t0, $s6, $s0 # t0 = A + f
add $t1, $s7, $s1 # t1 = B + g
lw $s0, 0($t0) # s0 = *(t0 + 0) = *(A + f) = A[f]
lw $t0, 4($t0) # t0 = *(t0 + 4) = *(A + f + 1) = A[f+1]
add $t0, $t0, $s0 # t0 = t0 + s0 = A[f] + A[f+1]
sw $t0, 0($t1) # *(t1 + 0) = *(B + g) = B[g] = t0
Remember that C pointer arithmetic scales by item size, but assembly uses bytes. Hence, advancing 4 bytes is advancing 1 item in C.
On second thought, this in fact means f and g should also be scaled by 4, which they don't seem to be.
sll $t0, $s0, 2 # $t0 = f * 4
add $t0, $s6, $t0
sll $t1, $s1, 2 # $t1 = g * 4
add $t1, $s7, $t1
lw $s0, 0($t0)
lw $t0, 4($t0)
add $t0, $t0, $s0
sw $t0, 0($t1)
Yes, the book's code is missing the two sll instructions; you just need to scale both f and g by 4.
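The corrected sequence can be written out in C with byte addresses to show where the scaling happens. This is a sketch under the assumption that the arrays hold 32-bit ints; the function name store_pair_sum is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: B[g] = A[f] + A[f+1] using byte addresses,
   with f and g scaled by 4 (sizeof(int32_t)) as in the corrected code. */
void store_pair_sum(const int32_t *A, int32_t *B, uint32_t f, uint32_t g)
{
    const char *t0 = (const char *)A + (f << 2);   /* sll, add: &A[f]   */
    char *t1 = (char *)B + (g << 2);               /* sll, add: &B[g]   */
    int32_t s0 = *(const int32_t *)(t0 + 0);       /* lw 0($t0): A[f]   */
    int32_t v = *(const int32_t *)(t0 + 4);        /* lw 4($t0): A[f+1] */
    *(int32_t *)t1 = v + s0;                       /* add, sw 0($t1)    */
}
```

Note that the constant 4 in lw 4($t0) is already a byte offset, which is why only f and g need the extra shift.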
So I have an assignment problem which asks for a for loop in C to be converted into MIPS. My professor moves extremely fast so I couldn't catch half of what he said. Here is the code:
for (i=0; i<10; i++){
a[i] = b[i] + c[i];
}
This fragment is stored in memory starting from location 00000100 Hex.
Convert this code to MIPS and provide numeric offsets for each branch or jump instruction used.
I don't quite understand the use of offsets. From the lecture slides given to us, it seems load word and store word commands are used for the offsets, which are also used for the arrays but I'm not sure how to go about it. Below is something I put together based on other solutions I saw, but am of course open to changes. I'm hoping it's at least going in the right direction. Any help would be appreciated.
#t0 = i
#s0 = a
#s1 = b
#s2 = c
#t3, t4, t5, t6, t7 = free
loop:
bgt $t0,9,exit #exit before i reaches 10
addi $t3,$s1,$t0 #temp reg $t3 = address of b[i]
addi $t4,$s2,$t0 #temp reg $t4 = address of c[i]
lw $t5,0($t3) #temp reg $t5 = b[i]
lw $t6,0($t4) #temp reg $t6 = c[i]
add $t3,$t5,$t6 #temp reg $t3 = b[i] + c[i]
addi $t7,$s0,$t0 #temp reg $t7 = address of a[i]
sw $t3,0($t7) #store word a[i] = b[i] + c[i]
addi $t0,$t0,1 #increment i by 1
j loop #jump to start of loop
exit:
From what I can see, you are advancing the address by only 1 byte each iteration. In MIPS, every 4 bytes is a word, so consecutive elements must be stored at offsets 0, 4, 8, etc. Also, you need to allocate memory for your array before you start using one. Here's an example.
li $v0, 9 # create an array, start address in $v0
li $a0, 80 # allocate 80 bytes, or 20 words
syscall
move $t0, $v0 # move from $v0 (temp) to $t0
Check out this tutorial and see if it helps at all.
I'm trying to convert this C code to MIPS assembly and I am unsure if it is correct. Can someone help me, please?
Question : Assume that the values of a, b, i, and j are in registers $s0, $s1, $t0, and $t1, respectively. Also, assume that register $s2 holds the base address of the array D
C Code :
for(i=0; i<a; i++)
for(j=0; j<b; j++)
D[4*j] = i + j;
My Attempt at MIPS ASSEMBLY
add $t0, $t0, $zero # i = 0
add $t1, $t1, $zero # j = 0
L1 : slt $t2, $t0, $s0 # i<a
beq $t2, $zero, EXIT # if $t2 == 0, Exit
add $t1, $zero, $zero # j=0
addi $t0, $t0, 1 # i ++
L2 : slt $t3, $t1, $s1 # j<b
beq $t3, $zero, L1, # if $t3 == 0, goto L1
add $t4, $t0, $t1 # $t4 = i+j
muli $t5, $t1, 4 # $t5 = $t1 * 4
sll $t5, $t5, 2 # $t5 << 2
add $t5, $t5, $s2 # D + $t5
sw $t4, $t5($s2) # store word $t4 in addr $t5(D)
addi $t0, $t1, 1 # j ++
j L2 # goto L2
EXIT :
add $t0, $t0, $zero # i = 0
Nope, that leaves $t0 unmodified, holding whatever garbage it did before. Perhaps you meant to use addi $t0, $zero, 0?
Also, MIPS doesn't have 2-register addressing modes (for integer load/store), only 16-bit-constant ($reg). $t5($s2) isn't legal. You need a separate addu instruction, or better a pointer-increment.
(You should use addu instead of add for pointer math; it's not an error if address calculation crosses from the low half to high half of address space.)
In C, it's undefined behaviour for another thread to be reading an object while you're writing it, so we can optimize away the actual looping of the outer loop. Unless the type of D is _Atomic int *D or volatile int *D, but that isn't specified in the question.
The inner loop writes the same elements every time regardless of the outer loop counter, so we can optimize away the outer loop and only do the final outer iteration, with i = a-1. Unless a <= 0, then we must skip the outer loop body, i.e. do nothing.
Optimizing away all but the last store to every location is called "dead store elimination". The stores in earlier outer-loop iterations are "dead" because they're overwritten with nothing reading their value.
You normally want to put the loop condition at the bottom of the loop, so the loop branch is a bne $t0, $t1, top_of_loop for example. (MIPS has bne as a native hardware instruction; blt is only a pseudo-instruction unless the 2nd register is $zero.) So we want to optimize j<b to j!=b because we know we're counting upward.
Put a conditional branch before the loop to check if it might need to run zero times, e.g. blez $s1, after_loop to skip the inner loop body if b <= 0.
An idiomatic for(i=0 ; i<a ; i++) loop in asm looks like this in C (or some variation on this).
if(a<=0) goto end_of_loop;
int i=0;
do{ ... }while(++i != a);
Or if i isn't used inside the loop, then i=a and do{}while(--i). (i.e. add -1 and use bnez). Although MIPS can branch just as efficiently on i!=a as it can on i!=0, unlike most architectures with a FLAGS register where counting down saves a compare instruction.
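A compilable C sketch of that loop shape, next to the plain for loop it replaces. Both functions just count iterations so the two forms can be compared; the names count_for and count_rotated are hypothetical.

```c
/* Straightforward form: condition tested at the top of every iteration. */
int count_for(int a)
{
    int n = 0;
    for (int i = 0; i < a; i++)
        n++;
    return n;
}

/* Rotated form: one guard before the loop, then the condition at the
   bottom, which maps onto a single bne at the end of the loop body. */
int count_rotated(int a)
{
    int n = 0;
    int i = 0;
    if (a <= 0)
        goto end_of_loop;
    do {
        n++;
    } while (++i != a);
end_of_loop:
    return n;
}
```

The i != a test is only safe because the guard has already excluded a <= 0; without it, a negative a would make the do-while run until i wrapped around.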
D[4*j] means we stride by 16 bytes in a word array. Separately using a multiply by 4 and a shift by 2 is crazy redundant. Just keep a pointer in a separate register and increment it by 16 every iteration, like a C compiler would.
We don't know the type of D, or any of the other variables for that matter. If any of them are narrow unsigned integers, we might need to implement 8 or 16-bit truncation/wrapping.
But your implementation assumes they're all int or unsigned, so let's do that.
I'm assuming a MIPS without branch-delay slots, like MARS simulates by default.
i+j starts out (with j=0) as a-1 on the last outer-loop iteration that sets the final value. It runs up to j=b-1, so the max value is a-1 + b-1.
Simplifying the problem down to the values we need to store, and the locations we need to store them in, before writing any asm, means the asm we do write is a lot simpler and easier to debug.
You could check the validity of most of these transformations by doing them in C source and checking with a unit test in C.
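One possible shape for such a check, as a sketch: the reference loop nest from the question, the transformed version described above, and a helper that compares their results. All three function names are hypothetical, D is assumed to be an int array, and the fixed 64-element buffer means b must be at most 16.

```c
/* Reference: the loop nest exactly as written in the question. */
void fill_reference(int *D, int a, int b)
{
    for (int i = 0; i < a; i++)
        for (int j = 0; j < b; j++)
            D[4 * j] = i + j;
}

/* Transformed: only the last outer iteration, pointer increment,
   and a != test against a precomputed end value. */
void fill_optimized(int *D, int a, int b)
{
    if (a <= 0 || b <= 0)
        return;
    int tmp = a - 1;        /* i + j with i = a-1, j = 0 */
    int tmp_end = tmp + b;  /* one past the largest value stored */
    int *p = D;
    do {
        *p = tmp;
        tmp++;
        p += 4;             /* 4 ints = 16 bytes */
    } while (tmp != tmp_end);
}

/* Returns 1 if both versions leave identical buffer contents.
   Assumes b <= 16 so that 4*(b-1) stays inside the 64-element buffers. */
int same_result(int a, int b)
{
    enum { N = 64 };
    int x[N], y[N];
    for (int k = 0; k < N; k++) { x[k] = -1; y[k] = -1; }
    fill_reference(x, a, b);
    fill_optimized(y, a, b);
    for (int k = 0; k < N; k++)
        if (x[k] != y[k])
            return 0;
    return 1;
}
```

Calling same_result over a range of small a and b values, including zero and negative ones, exercises the early-out paths as well as the main loop.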
# int a: $s0
# int b: $s1
# int *D: $s2
# Pointer to D[4*j] : $t0
# int i+j : $t1
# int a-1 + b : $t2 loop bound
blez $s0, EXIT # if(a<=0) goto EXIT
blez $s1, EXIT # if(b<=0) goto EXIT
# now we know both a and b loops run at least once so there's work to do
addiu $t1, $s0, -1 # tmp = a-1 // addu because the C source doesn't do this operation, so we must not fault on signed overflow here. Although that's impossible because we already excluded negatives
addu $t2, $t1, $s1 # tmp_end = a-1 + b // one past the max we store
add $t0, $s2, $zero # p = D // to avoid destroying the D pointer? Otherwise increment it.
inner: # do {
sw $t1, ($t0) # *p = tmp // tmp holds i+j
addiu $t1, $t1, 1 # tmp++;
addiu $t0, $t0, 16 # 4*sizeof(*D) # could go in the branch-delay slot
bne $t1, $t2, inner # }while(tmp != tmp_end)
EXIT:
We could have done the increment first, before the store, and used a-2 and a+b-2 as the initializer for tmp and tmp_end. On some real pipelined/superscalar MIPS CPUs, that might be better to avoid putting the increment right before the bne that reads it. (After moving the pointer-increment into the branch-delay slot). Of course you'd actually unroll to save work, e.g. using sw $t1, 16($t0) and 32($t0) / 48($t0).
Again on a real MIPS with branch delays, you'd move some of the init of $t0..2 to fill the branch delay slots from the early-out blez instructions, because they couldn't be adjacent.
So as you can see, your version was over-complicated to say the least. Nothing in the question said we have to transliterate each C expression to asm separately, and the whole point of C is the "as-if" rule that allows optimizations like this.
This similar C code compiles and translates to MIPS:
#include <stdio.h>
main()
{
int a,b,i,j=5;
int D[50];
for(i=0; i<a; i++)
for(j=0; j<b; j++)
D[4*j] = i + j;
}
Result:
.file 1 "Ccode.c"
# -G value = 8, Cpu = 3000, ISA = 1
# GNU C version cygnus-2.7.2-970404 (mips-mips-ecoff) compiled by GNU C version cygnus-2.7.2-970404.
# options passed: -msoft-float
# options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
# -fpcc-struct-return -fcommon -fverbose-asm -fgnu-linker -msoft-float
# -meb -mcpu=3000
gcc2_compiled.:
__gnu_compiled_c:
.text
.align 2
.globl main
.ent main
main:
.frame $fp,240,$31 # vars= 216, regs= 2/0, args= 16, extra= 0
.mask 0xc0000000,-4
.fmask 0x00000000,0
subu $sp,$sp,240
sw $31,236($sp)
sw $fp,232($sp)
move $fp,$sp
jal __main
li $2,5 # 0x00000005
sw $2,28($fp)
sw $0,24($fp)
$L2:
lw $2,24($fp)
lw $3,16($fp)
slt $2,$2,$3
bne $2,$0,$L5
j $L3
$L5:
.set noreorder
nop
.set reorder
sw $0,28($fp)
$L6:
lw $2,28($fp)
lw $3,20($fp)
slt $2,$2,$3
bne $2,$0,$L9
j $L4
$L9:
lw $2,28($fp)
move $3,$2
sll $2,$3,4
addu $4,$fp,16
addu $3,$2,$4
addu $2,$3,16
lw $3,24($fp)
lw $4,28($fp)
addu $3,$3,$4
sw $3,0($2)
$L8:
lw $2,28($fp)
addu $3,$2,1
sw $3,28($fp)
j $L6
$L7:
$L4:
lw $2,24($fp)
addu $3,$2,1
sw $3,24($fp)
j $L2
$L3:
$L1:
move $sp,$fp # sp not trusted here
lw $31,236($sp)
lw $fp,232($sp)
addu $sp,$sp,240
j $31
.end main