I have found a code snippet on this website as an answer to a question. The code uses nested loops in MIPS.
Here is the code :
.data
prompt: .asciiz "Please enter the edge length of the base of right triangle: "
newLine: .asciiz "\n"
star: .asciiz "*"
.text
main:
li $v0, 4 # print the prompt
la $a0, prompt
syscall
li $v0,5 #take user input
syscall
move $s0, $v0 # move the input to $s0
li $t0, 0 # load 0 at t0
outerLoop:
beq $t0, $s0, end #(for i=0;i<=baseLength;i++)
#if t0=s0 branch to end
addi $t0, $t0, 1 # increment i
li $t1, 1 #load 1 at t1
jal changeLine #jump to changeLine
innerLoop:
bgt $t1, $t0, outerLoop #(for j=0;j<=i;j++)
# if t1=t0 branch to outerLoop
li $v0, 4 # print star
la $a0, star
syscall
addi $t1, $t1, 1 # increment j
j innerLoop # jump to innerLoop
changeLine:
li $v0, 4 # new line
la $a0, newLine
syscall
jr $ra # jump to after the call instruction
end:
li $v0, 10 # end of program
syscall
The code works perfectly, but I could not understand how the outer loop iterates even though there is no explicit instruction like j outerLoop.
Any help is appreciated.
The answer is the innerLoop's first statement:
innerLoop:
bgt $t1, $t0, outerLoop #(for j=0;j<=i;j++)
# if t1=t0 branch to outerLoop
When the inner loop has finished all of its iterations, the code branches back to the outerLoop label.
Here's the thing: the outer loop has been "optimized" and no longer directly reflects the for loop stated in the comment. The C equivalent looks more like this:
for ( i = 0; i <= baseLength; /* empty increment */ ) {
i++;
<loop-body> // includes inner loop
}
Normally a for loop such as this:
for ( i = 0; i < N; i++ ) { <loop-body> }
by definition of the for loop construct, would expand as follows:
i = 0;
while ( i < N ) {
<loop-body>
i++;
}
However, as you can see, in the code you're showing, the i++ increment has moved from after the loop-body to before the loop-body.
As a result, there is no increment code for the outer loop to perform after executing the inner loop — and thus, the inner loop's exit location can continue directly with the top of the outer loop.
However, when making these changes, we have to be careful, because loop-body now executes with a value of i that is 1 larger than in the C code this was translated from, which is now shown in the comment. If i is used inside loop-body, this is an issue.
And here i is indeed used inside the loop body, as part of the inner loop's iteration control. Because i is one larger, this inner loop will iterate one more time than the C code written in the comment.
If the "optimization" had not been applied, the inner loop would probably exit by jumping forward to an i++ increment that belongs to the outer loop, which would then j outerLoop (jump backwards) as you were expecting, to continue the outer loop.
Beginning assembly programmers are often keen to modify the C or pseudocode they're starting with during translation into assembly, but these changes are often made without realizing that the result is not logically or algorithmically equivalent. Examples include changing a while loop into a do-while loop, changing array references into pointer operations, or, as in this case, changing the order of operations. Since C can express all of those (do-while, pointers, etc.), I advocate making optimizations in C first, verifying that the behavior is equivalent, and then taking that to assembly.
Yes, there are other optimizations that are only sensible in assembly (i.e. not possible or practical in C), though I suggest leaving those for when efficiency is the primary topic and, until then, following the C starting code rather literally.
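To make the difference between the two loop shapes concrete, here is a minimal C sketch that counts stars instead of printing them. The function names are made up for illustration. The first function follows the loops as the assembly actually executes them (outer increment before the body, j starting at 1); the second follows the for loops as written in the assembly's comments. It assumes a non-negative base.

```c
/* Loops as the assembly executes them: the outer increment runs
   before the body, and the inner counter starts at 1. */
int stars_as_assembled(int base)
{
    int stars = 0;
    int i = 0;
    while (i != base) {                 /* beq $t0, $s0, end */
        i++;                            /* addi $t0, $t0, 1 (before the body) */
        for (int j = 1; j <= i; j++)    /* bgt $t1, $t0, outerLoop ends this loop */
            stars++;                    /* stands in for printing "*" */
    }
    return stars;
}

/* Loops as written in the comments next to the branches. */
int stars_as_commented(int base)
{
    int stars = 0;
    for (int i = 0; i <= base; i++)
        for (int j = 0; j <= i; j++)
            stars++;
    return stars;
}
```

For a base of 3 the first function counts 6 stars and the second counts 10, so the comments do not describe the loops that actually run.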
Related
I am trying to convert this for loop into MIPS assembly language and I am not sure exactly how to tackle it. I tried using YouTube to understand the concept and I am still struggling with it.
here is the C code:
int sum = 0;
for (int i = 1; i <= 10; i ++){
if (i & 1) sum = sum + i * i;
else sum = sum + i;
}
I tried converting it, but I am just not sure where to start.
I am hoping for MIPS code with an explanation, if possible, so I can learn from it!
I assume that you are doing the bitwise and operator with 1 to indicate if the current 'i' is odd or even. What you can do is the following :
.text
li $t0, 1 #is for counting or i in this case
li $t1, 1 #for if statements condition
li $v0, 0 #output register
forLoop:
bgt $t0, 10, end
andi $t2, $t0, 1 #and operation
beq $t2, 1, odd #if andi resulted in one do the odd computation if not it is even
add $v0, $v0, $t0
addi $t0, $t0, 1
j forLoop
odd:
mul $t2, $t0, $t0 #t2 is like a temporary register can be overwritten, square of i
add $v0, $v0, $t2
addi $t0, $t0, 1
j forLoop
end:
move $a0, $v0
li $v0, 1
syscall
li $v0, 10
syscall
This implementation is correct; however, the "right" implementation should use the stack pointer and the jump-and-link instruction to call the function. This example is only meant to show the logic behind it.
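As an intermediate step between the C source and the assembly above, the same logic can be written in C using only if and goto, so that each line maps onto roughly one instruction. This is a sketch: the function name and the limit parameter are made up for illustration, and the duplicated increment from the assembly is merged under a single label.

```c
/* Sum i*i for odd i and i for even i, for i = 1..limit,
   written in if-goto style to mirror the assembly. */
int sum_loop(int limit)
{
    int sum = 0;
    int i = 1;
for_loop:
    if (i > limit) goto end;       /* bgt $t0, 10, end */
    if ((i & 1) == 1) goto odd;    /* andi $t2, $t0, 1 then beq $t2, 1, odd */
    sum += i;                      /* even case */
    goto next;
odd:
    sum += i * i;                  /* odd case: mul then add */
next:
    i++;                           /* shared increment */
    goto for_loop;
end:
    return sum;
}
```

With a limit of 10 this returns 195 (165 from the odd squares plus 30 from the even values).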
I would like to translate the C code below into assembly language.
However, I do not see that I need to use the stack in this example.
Moreover, I'd like to know whether or not "beq" saves the address of the following instruction in $ra like "jal" does, for when the loop ends, I would like to get back to the original function foo, and continue the instructions (which here is simply returning.)
int foo(int* a, int N) {
if(N > 0)
{
for(int i = 0; i != N; i = i + 1)
{
a[i] = bar(i << 4, a[i]);
}
}
return N & 7;
}
#assume *a in $a0, N $N in $a1
foo:
slt $t0, $zero, $a1 #put 1 in $t0 if 0 < N
li $t1,0 # use $t1 as loop counter
beq $t0, 1, loop # enter loop if 0 < N
and $v0, $a1, 7 # do bitwise and on N and 7 and save in $v0 as return value
loop:
beq $t1, $a1, exit # exit loop when i = N
sll $t3, $t1, 2 # obtain 4 * i
add $t3, $a1, $t3 # obtain address of a[i] which is address of a plus 4i
lw $t3, o($t3) # load a[i] into $t3
sll $t4, $t1, 4 #perform i<< 4 and save in $t4
# the 2 previous load arguments for bar
jal bar # assume bar saves return value in $v2
sw $t3, 0($v1)
j loop
exit:
and $v0, $a1, 7
beq is for conditional branching, not calling — it changes the PC (conditionally) but not $ra. We use it to translate structured statements (e.g. if, for) into the if-goto style of assembly language.
However, I do not see that I need to use the stack in this example.
You must use the stack for this code because the call to bar (as in jal bar) will wipe out foo's $ra, and while bar will be able to return back to foo, foo will not be able to return to its caller. Since this requires a stack, you will need a prologue and an epilogue to allocate and release some stack space.
Your code is not properly passing parameters to bar: i << 4, for example, should be passed in $a0, while a[i] should be passed in $a1.
You do not have a return instruction in foo — it is missing a jr $ra.
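The if-goto form of foo can be sketched in C before writing any assembly. The body of bar below is a hypothetical stand-in (the question does not show it) so that the sketch can run; the comments note which instruction each line would roughly become.

```c
/* Hypothetical stand-in for bar; the question does not show its body. */
int bar(int x, int y)
{
    return x + y;
}

int foo(int *a, int N)
{
    int i;
    if (N <= 0) goto done;       /* skip the loop unless N > 0 */
    i = 0;
loop:
    if (i == N) goto done;       /* beq: changes the PC only, $ra is untouched */
    a[i] = bar(i << 4, a[i]);    /* jal bar: overwrites $ra, so foo must have saved it */
    i = i + 1;
    goto loop;
done:
    return N & 7;                /* and $v0, $a1, 7 followed by jr $ra */
}
```

In the assembly version, a, N and i also have to survive the call to bar, so in practice they would be kept in saved registers or on the stack rather than in $a0, $a1 and $t1.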
If either of your beq instructions did set $ra, those wouldn't be useful points to return back to. But since you asked:
I'd like to know whether or not "beq" saves the address of the following instruction in $ra like "jal" does
If the instruction mnemonic doesn't end with al (which stands for And Link), it doesn't save a return address in $ra.
Classic MIPS has the following instructions that link, from this somewhat incomplete reference (missing nor and IDK what else).
jal target (Jump And Link)
BGEZAL $reg, target (conditional Branch if >= 0 And Link)
BLTZAL $reg, target (conditional Branch if < 0 And Link)
Note that the conditional branches are effectively branching on the sign bit of the register.
bal is an alias for bgezal $zero, target, useful for doing a PC-relative function call. (MIPS branches use a fully relative encoding for branch displacement, MIPS jumps use a region-absolute encoding that replaces the low 28 bits of PC+4. This matters for position-independent code).
None of this is particularly relevant to your case; your foo needs to save/restore $ra on entry/before jr $ra because you need to call bar with a jal or bal. Using a linking branch as the loop branch wouldn't affect anything (except to make your code even less efficient, and make performance worse on real CPUs that do return-address prediction with a special predictor that assumes jal and jr $ra are paired properly).
Using bal / jal doesn't automatically make the thing you jump to ever return; that only happens if the target ever uses jr $ra (potentially after copying $ra somewhere else then restoring it).
This is my 1st effort towards learning looping in MIPS.
.data
spc: .asciiz ", "
.globl main
main:
li $t0, 0
loop:
bgt $t0, 14, exit # branch if($t0 > 14)
addi $t0, $t0, 1 # $t0++ for loop increment
# print a comma
la $a0, spc # copy spc to $a0 for printing
li $v0, 4 # syscall value for strings
syscall
# repeat loop
j loop
exit:
li $v0, 10 # syscall value for program termination
syscall
Output:
-- program is finished running (dropped off bottom) --
This program is supposed to print 15 commas in the I/O console. That isn't taking place.
What could be the issue?
Ref: MIPS assembly for a simple for loop
You assembled all your code into the .data section; you never switched back to .text.
If you're using MARS, the GUI shows no asm instructions in the disassembly (after assembling). This is why.
Apparently instead of faulting on main being in a non-executable page, MARS simply decides that the program "dropped off the bottom" as soon as you start it.
How would I convert this code into Mips?
int n = 100; int sum = 0; while (n>0) {
sum = sum + n;
n--; }
I have this so far and am not sure what to do to finish this.
.data
n: .word 100
.text
main:
la $t0, n
lw $t1, 0(t0)
li $so, 0
Loop:
bgt $t1, $zero, EXIT
add $t1, $s0, $t1
addi $t1, $t1, -1
j Loop
exit:
Change the line:
add $t1, $s0, $t1
To:
add $s0, $s0, $t1
Also, there is no need for use of the data segment. Just set $t1 using:
li $t1, 100
"Mips" isn't a language. MIPS is an Instruction Set Architecture (ISA). Are you trying to figure out how to turn the C code into MIPS assembly code?
This looks like a homework assignment from the Patterson and Hennessy textbook. If so, you should go to your professor's office hours to get help. Almost every university includes in its academic handbook a statement that it's unethical to ask for homework help online.
If your request isn't a homework assignment, then the best way to convert that C code into MIPS assembly code is with a compiler. For simple loops, the compiler will generate more effective code than you can generate by hand. For example, "gcc -march=native -O3" will generate code that optimizes for the exact CPU on which you're compiling, taking into account pipeline depth and cache latencies.
If you absolutely need to see the assembly code, use "gcc -S" to produce an assembly file.
MIPS doesn't have loops per se; instead, what you're going to do is use a jump statement with conditions and loop with that.
I think bgt $t1, $zero, EXIT is the opposite of what you want. It seems like you want to convert while(n > 0); it would help to put the code inside the while loop under its own label, then bgt $t1, $zero, . Correct me if I'm wrong.
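The point about the inverted branch can be seen by rewriting the while loop in if-goto form in C. A sketch (the function name is made up for illustration): the exit test at the top of the loop is the negation of the C condition, so n > 0 becomes "leave if n <= 0".

```c
/* while (n > 0) { sum = sum + n; n--; } in if-goto form. */
int sum_down(int n)
{
    int sum = 0;
loop:
    if (n <= 0) goto done;   /* negation of n > 0: branch out when n <= 0 */
    sum = sum + n;           /* accumulate into sum, not into n */
    n--;
    goto loop;
done:
    return sum;
}
```

Starting from n = 100 this returns 5050.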
I have a simple question for a Comp Sci class I'm taking where my task is to convert a function into MIPS assembly language. I believe I have a correct answer but I want to verify it.
This is the C function
int strlen(char *s) {
int len;
len=0;
while(*s != '\0') {
len++;
s++;
}
return len;
}
Thanks!
strlen:
add $v0, $zero, $zero # len = 0
loop: # do{
lbu $t0, 0($a0) # tmp0 = load *s
addi $a0, $a0, 1 # s++
addi $v0, $v0, 1 # len++
bne $t0, $zero, loop # }while(tmp0 != 0)
s_end:
addi $v0, $v0, -1 # undo counting of the terminating 0
jr $ra
Yeah, you have a correct asm version, and I like the fact that you do as much work as possible before testing the value of t0 to give as much time as possible for loading from memory.
(Editor's note: the add of -1 after the loop corrects for off by 1 while still allowing an efficient do{}while loop structure. This answer proposes a more literal translation from C into an if() break inside an unconditional loop.)
I think the while loop isn't right in the case of *s == 0.
It should be something like this:
...
lbu $t0, 0($a0)
loop:
beq $t0, $zero, s_end # *
...
b loop
s_end:
...
*You could use a macro instruction (beqz $t0, s_end) instead of beq instruction.
Yes, looks correct to me, and fairly efficient. Implementing a while loop with asm structured like a do{}while() is the standard and best way to loop in asm. Why are loops always compiled into "do...while" style (tail jump)?
A more direct transliteration of the C would check *s before incrementing len.
e.g. by peeling the first iteration and turning it into a load/branch that can skip the whole loop for an empty string. (And reordering the loop body, which would probably put the load close to the branch, worse for performance because of load latency.)
You could optimize away the len-- overshoot-correction after the loop: start with len=-1 instead of 0. Use li $v0, -1 which can still be implemented with a single instruction:
addiu $v0, $zero, -1
A further step of optimization is to only do the pointer increment inside the loop, and find the length at the end with len = end - start.
We can correct for the off-by-one (to not count the terminator) by offsetting the incoming pointer while we're copying it to another reg.
# char *s input in $a0, size_t length returned in $v0
strlen:
addiu $v0, $a0, 1 # char *start_1 = start + 1
loop: # do{
lbu $t0, ($a0) # char tmp0 = load *s
addiu $a0, $a0, 1 # s++
bne $t0, $zero, loop # }while(tmp0 != '\0')
s_end:
subu $v0, $a0, $v0 # size_t len = s - start_1
jr $ra
I used addiu / subu because I don't want it to fault on signed overflow of a pointer. Your version should probably use addiu as well so it works for strings up to 4GB, not just 2GB.
Untested, but we can think through the correctness:
For an empty string input (s points at a 0): when we reach the final subtract, we have v0=s+1 (from before the loop) and a0=s+1 (from the first/only iteration which falls through because it loads $t0 = 0). Subtracting these gives len=0 = strlen("")
For a length=1 string: v0=s+1, but the loop body runs twice so we have a0=s+2. len = (s+2) - (s+1) = 1.
By induction, larger lengths work too.
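The same reasoning can be checked in C, where the structure of the assembly carries over almost line for line. A sketch, with a made-up function name to avoid clashing with the library strlen:

```c
#include <stddef.h>

/* Pointer-difference strlen: only the pointer is incremented in the loop. */
size_t strlen_ptrdiff(const char *s)
{
    const char *start_1 = s + 1;    /* addiu $v0, $a0, 1 */
    char tmp;
    do {
        tmp = *s;                   /* lbu $t0, ($a0) */
        s++;                        /* addiu $a0, $a0, 1 */
    } while (tmp != '\0');          /* bne $t0, $zero, loop */
    return (size_t)(s - start_1);   /* subu $v0, $a0, $v0 */
}
```

After the loop, s points one past the terminator, which is why subtracting start + 1 rather than start gives the length without counting the terminating byte.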
For MIPS with a branch-delay slot, the addiu and subu can be reordered after bne and jr respectively, filling those branch-delay slots. (But then bne is right after the load so classic MIPS would have to stall, or even fill the load-delay slot with a nop on a MIPS I without interlocks for loads).
Of course if you actually care about real-world strlen performance for small to medium strings (not just tiny), like more than 8 or 16 bytes, use a bithack that checks whole words at once for maybe having a 0 byte.
Why does glibc's strlen need to be so complicated to run quickly?
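For reference, one commonly used form of the "does this word contain a zero byte" bithack for 32-bit words can be sketched in C as follows. This is only the per-word test under the assumption of 32-bit words, with a made-up function name; it is not a complete strlen, and a real implementation also has to handle alignment and locate the exact byte.

```c
#include <stdint.h>

/* Returns nonzero if any of the four bytes in v is zero. */
int has_zero_byte(uint32_t v)
{
    /* Subtracting 1 from each byte borrows into bit 7 only for bytes
       that were 0x00 (or had bit 7 set); & ~v filters out the latter. */
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}
```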