I have an assignment to reverse engineer the assembly to find the values of R, S, and T in the following code. Assume that R, S, and T are constants declared with #define.
long int A[R][S][T];
int store_ele(int h, int i, int j, long int *dest)
{
A[h][i][j] = *dest;
return sizeof(A);
}
When compiling this program, GCC generates the following assembly code (with -O2):
store_ele:
movslq %esi, %rsi //%rsi = h
movslq %edi, %rdi //%rdi = i
movq (%rcx), %rax //moves the value from %rcx to %rax
leaq (%rdi,%rdi,4), %rdi //%rdi = 4 * i + i
leaq (%rsi,%rsi,4), %rcx //%rcx = 4 * h + h
movslq %edx, %rdx //%rdx = j
leaq (%rcx,%rdi,4), %rcx //%rcx = 4 * %rdi + %rcx = 4 * (4 * i + i) + (4 * h + h)
addq %rcx, %rdx //adds something to j
movq %rax, A(,%rdx,8) //moves some value
movl $1120, %eax //%eax = 1120
ret //returns %eax
I want to ask if what I am understanding about the assembly is right and any tips or assistance is appreciated!
Edit: I don't know what it is called but our prof. defines movq: source, destination and other similar assembly instructions where the first argument is source and second is destination
Edit 2: Biggest issue is how do I find the values of the three constants just based on the assembly. I think
movq %rax, A(,%rdx,8) //moves some value
movl $1120, %eax //%eax = 1120
ret //returns %eax
Is going to play the main role in finding out what it does but I don't know what to do with it.
Edit 3: I don't know if I should post the answers, but in case someone has the same problem: I got T = 5, S = 4, and R = 7, where R = 1120/(T*S*8). I got T and S by matching coefficients, with the help I got from this thread.
That's x86-64 AT&T syntax (mnemonic source, dest), with the x86-64 System V ABI (first arg in rdi, see this approximate summary of that calling convention, or find links to better ABI docs (including the official standards) in the x86 tag wiki).
What you're calling "functions" are assembly instructions. Each one assembles to a single machine instruction.
Hint: your comments are wrong about which arg is which. Check the ABI for arg-passing order.
Since you know the declaration is long int A[R][S][T]:
A[h][i][j] is equivalent to *(A[h][i] + j), where A[h][i] is an array type (with size [T]). Applying this recursively, A[h][i][j] is equivalent to *(base_pointer + S*T*h + T*i + j) (where base_pointer is just a long*, for the purposes of C pointer math which implicitly scales by sizeof(long) in this case).
You seem to be on the right track working out how the LEAs are multiplying, so you can find T, then use that to find S (by dividing the factor for h).
Then to find R, look at the function return value, which is R*S*T * sizeof(long).
sizeof(long) in the x86-64 System V ABI is 8 bytes. The offset into the array is scaled by 8 bytes, too, of course, so don't forget to factor that out when getting your S and T values.
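The index formula above can be checked directly in C. This is only a sketch: it plugs in the values the asker reports in Edit 3 (R = 7, S = 4, T = 5) as a hypothesis, and flat_index and check_layout are hypothetical helper names, not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Values reported in the question's Edit 3; treated here as a hypothesis. */
#define R 7
#define S 4
#define T 5

long int A[R][S][T];

/* Element index of A[h][i][j], counted in longs from the start of A. */
static size_t flat_index(int h, int i, int j)
{
    return (size_t)S * T * h + (size_t)T * i + (size_t)j;
}

/* Aborts via assert if any element is not at the predicted byte offset. */
static void check_layout(void)
{
    for (int h = 0; h < R; h++)
        for (int i = 0; i < S; i++)
            for (int j = 0; j < T; j++) {
                size_t actual = (size_t)((char *)&A[h][i][j] - (char *)A);
                assert(actual == flat_index(h, i, j) * sizeof(long int));
            }
}
```

With an 8-byte long, sizeof(A) is 7 * 4 * 5 * 8 = 1120, which matches the movl $1120, %eax in the listing.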
To learn assembly, I am viewing the assembly generated by GCC with the -S option for some simple C programs on Linux.
I wrote a C function in foo.c:
long shift_left4_rightn(long x, long n)
{
x <<= 4;
x >>= n;
return x;
}
When I run gcc -Og -S foo.c
I get foo.s. Below is the part for this function:
shift_left4_rightn:
movq %rdi, %rax
salq $4, %rax
movl %esi, %ecx
sarq %cl, %rax
ret
The function parameter x uses the register %rdi, which is normal. What confuses me is why the other parameter n uses the register %esi instead of %rsi. What am I missing? What would happen if I replaced movl %esi, %ecx with movq %rsi, %rcx?
It's undefined behavior to shift by a count that is negative or not less than the number of bits in the value. If long is 64 bits, this means that the largest valid value of n is 63, even though it's declared as a long. So we don't need all 8 bytes of n; the low 4 bytes are enough. In fact, even %sil (1 byte) would be OK, since sarq only reads the count from %cl, but maybe there's a performance reason why it prefers %esi.
I think it would still work if you use %rsi.
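A rough way to convince yourself that the upper bytes of n never matter for valid counts is to compare the original function with a variant that keeps only the low byte. This is a sketch: shift_count_in_byte and check_counts are hypothetical names, and the test value is kept small and non-negative so that x << 4 cannot overflow.

```c
#include <assert.h>
#include <limits.h>

/* The function from the question, unchanged. */
long shift_left4_rightn(long x, long n)
{
    x <<= 4;
    x >>= n;
    return x;
}

/* Hypothetical variant: keep only the low 8 bits of n, like reading %cl. */
long shift_count_in_byte(long x, long n)
{
    unsigned char count = (unsigned char)n;
    x <<= 4;
    x >>= count;
    return x;
}

/* Aborts via assert if the two disagree for any valid count. */
static void check_counts(long x)
{
    int width = (int)(sizeof(long) * CHAR_BIT);
    for (long n = 0; n < width; n++)
        assert(shift_left4_rightn(x, n) == shift_count_in_byte(x, n));
}
```

Counts outside 0..width-1 are deliberately not tested, because the C result is undefined there.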
I'm trying to understand assembly in x86 more. I have a mystery function here that I know returns an int and takes an int argument.
So it looks like int mystery(int n){}. I can't figure out the function in C however. The assembly is:
mov %edi, %eax
lea 0x0(,%rdi, 8), %edi
sub %eax, %edi
add $0x4, %edi
callq <mystery_util>
repz retq
<mystery_util>:
mov %edi, %eax
shr %eax
and $0x1, %edi
and %edi, %eax
retq
I don't understand what the lea does here and what kind of function it could be.
The assembly code appears to be compiler generated, probably by GCC, since there is a repz retq after an unconditional branch (call). The fact that mystery_util is reached with a call rather than a tail call (jmp) suggests that the code was compiled with -O1 (higher optimization levels would likely inline the function, which didn't happen here). The lack of frame pointers and extra loads/stores indicates that it wasn't compiled with -O0.
Multiplying x by 7 is the same as multiplying x by 8 and subtracting x. That is what the following code is doing:
lea 0x0(,%rdi, 8), %edi
sub %eax, %edi
LEA can compute addresses but it can be used for simple arithmetic as well. The syntax for a memory operand is displacement(base, index, scale). Scale can be 1, 2, 4, 8. The computation is displacement + base + index * scale. In your case lea 0x0(,%rdi, 8), %edi is effectively EDI = 0x0 + RDI * 8 or EDI = RDI * 8. With the sub and the add $0x4 that follow, the full calculation is n * 7 + 4.
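To see the arithmetic without reading AT&T syntax, here is a sketch that mirrors the four instructions one by one in C. The function name times7_plus4 and the local variable names are made up for illustration; unsigned arithmetic is used so that wraparound matches the 32-bit registers.

```c
#include <assert.h>

/* Mirrors the instruction sequence: mov, lea (n*8), sub, add. */
static unsigned int times7_plus4(unsigned int n)
{
    unsigned int eax = n;        /* mov %edi, %eax            */
    unsigned int edi = n * 8u;   /* lea 0x0(,%rdi,8), %edi    */
    edi -= eax;                  /* sub %eax, %edi  -> n * 7  */
    edi += 4u;                   /* add $0x4, %edi  -> n*7+4  */
    return edi;
}

/* Aborts via assert if the sequence ever differs from 7*n + 4. */
static void check_times7(void)
{
    for (unsigned int n = 0; n < 100000u; n++)
        assert(times7_plus4(n) == 7u * n + 4u);
}
```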
The calculation for mystery_util appears to simply be
n &= (n>>1) & 1;
If I take all these factors together we have a function mystery that passes n * 7 + 4 to a function called mystery_util that returns n &= (n>>1) & 1.
Since mystery_util returns a single bit value (0 or 1) it is reasonable that bool is the return type.
I was curious if I could get a particular version of GCC with optimization level 1 (-O1) to reproduce this assembly code. I discovered that GCC 4.9.x will yield this exact assembly code for this given C program:
#include<stdbool.h>
bool mystery_util(unsigned int n)
{
n &= (n>>1) & 1;
return n;
}
bool mystery(unsigned int n)
{
return mystery_util (7*n+4);
}
The assembly output is:
mystery_util:
movl %edi, %eax
shrl %eax
andl $1, %edi
andl %edi, %eax
ret
mystery:
movl %edi, %eax
leal 0(,%rdi,8), %edi
subl %eax, %edi
addl $4, %edi
call mystery_util
rep ret
You can play with this code on godbolt.
Important Update - Version without bool
I apparently erred in interpreting the question. I assumed the person asking this question determined by themselves that the prototype for mystery was int mystery(int n). I thought I could change that. According to a related question asked on Stackoverflow a day later, it seems int mystery(int n) is given to you as the prototype as part of the assignment. This is important because it means that a modification has to be made.
The change that needs to be made is related to mystery_util. In the code to be reverse engineered are these lines:
mov %edi, %eax
shr %eax
EDI is the first parameter. SHR is a logical shift right. Compilers would only generate this if EDI was an unsigned int (or equivalent). int is a signed type and would generate SAR (arithmetic shift right). This means that the parameter for mystery_util has to be unsigned int (and it follows that the return value is likely unsigned int). That means the code would look like this:
unsigned int mystery_util(unsigned int n)
{
n &= (n>>1) & 1;
return n;
}
int mystery(int n)
{
return mystery_util (7*n+4);
}
mystery now has the prototype given by your professor (bool is removed) and we use unsigned int for the parameter and return type of mystery_util. In order to generate this code with GCC 4.9.x I found you need to use -O1 -fno-inline. This code can be found on godbolt. The assembly output is the same as the version using bool.
If you use unsigned int mystery_util(int n) you would discover that it doesn't quite output what we want:
mystery_util:
movl %edi, %eax
sarl %eax ; <------- SAR (arithmetic shift right) is not SHR
andl $1, %edi
andl %edi, %eax
ret
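The SHR versus SAR difference is visible from C as well. In this sketch the function names are hypothetical. Note that the C standard leaves right-shifting a negative signed value implementation-defined; GCC documents it as sign-extending, which is what SAR does.

```c
#include <assert.h>

/* Unsigned operand: the vacated top bit is filled with 0 (SHR). */
static unsigned int shift_unsigned(unsigned int n)
{
    return n >> 1;
}

/* Signed operand: GCC sign-extends negative values (SAR). */
static int shift_signed(int n)
{
    return n >> 1;
}

/* Aborts via assert if the portable cases give unexpected results. */
static void check_shifts(void)
{
    assert(shift_unsigned(0x80000000u) == 0x40000000u);
    assert(shift_signed(8) == 4);
}
```

With GCC on x86-64, shift_signed(-8) yields -4, while shift_unsigned applied to the same bit pattern yields a large positive number; that is the difference the two listings encode.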
The LEA is just a left-shift by 3, truncating the result to 32 bits (i.e. zero-extending EDI into RDI implicitly). x86-64 System V passes the first integer arg in RDI, so all of this is consistent with one int arg. LEA uses memory-operand syntax and machine encoding, but it's really just a shift-and-add instruction. Using it as part of a multiply by a constant is a common compiler optimization for x86.
The compiler that generated this function missed an optimization here; the first mov could have been avoided with
lea 0x0(,%rdi, 8), %eax # n << 3 = n*8
sub %edi, %eax # eax = n*7
lea 4(%rax), %edi # rdi = 4 + n*7
But instead, the compiler got stuck on generating n*7 in %edi, probably because it applied a peephole optimization for the constant multiply too late to redo register allocation.
mystery_util returns the bitwise AND of the low 2 bits of its arg, in the low bit, so a 0 or 1 integer value, which could also be a bool.
(shr with no count means a count of 1; remember that x86 has a special opcode for shifts with an implicit count of 1. 8086 only has counts of 1 or cl; immediate counts were added later as an extension and the implicit-form opcode is still shorter.)
The LEA performs an address computation, but instead of dereferencing the address, it stores the computed address into the destination register.
In AT&T syntax, lea C(b,c,d), reg means reg = C + b + c*d where C is a constant, and b,c are registers and d is a scalar from {1,2,4,8}. Hence you can see why LEA is popular for simple math operations: it does quite a bit in a single instruction. (*includes correction from prl's comment below)
There are some strange features of this assembly code: the repz prefix is only strictly defined when applied to certain instructions, and retq is not one of them (though the general behavior of the processor is to ignore it). See Michael Petch's comment below with a link for more info. The use of lea (,rdi,8), edi followed by sub eax, edi to compute arg1 * 7 also seemed strange, but makes sense once prl noted the scalar d had to be a constant power of 2. In any case, here's how I read the snippet:
mov %edi, %eax ; eax = arg1
lea 0x0(,%rdi, 8), %edi ; edi = arg1 * 8
sub %eax, %edi ; edi = (arg1 * 8) - arg1 = arg1 * 7
add $0x4, %edi ; edi = (arg1 * 7) + 4
callq <mystery_util> ; call mystery_util(arg1 * 7 + 4)
repz retq ; repz prefix on return is de facto nop.
<mystery_util>:
mov %edi, %eax ; eax = arg1
shr %eax ; eax = arg1 >> 1
and $0x1, %edi ; edi = 1 iff arg1 was odd, else 0
and %edi, %eax ; eax = 1 iff smallest 2 bits of arg1 were both 1.
retq
Note the +4 on the 4th line is entirely spurious. It cannot affect the outcome of mystery_util.
So, overall this ASM snippet computes the boolean (arg1 * 7) % 4 == 3.
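That closed form can be checked by brute force. The sketch below uses the unsigned mystery_util reconstructed earlier in this thread. The multiplication is done in unsigned arithmetic to avoid signed overflow, and check_closed_forms is a hypothetical helper. Since 7 is congruent to 3 modulo 4, the condition also reduces to n % 4 == 1.

```c
#include <assert.h>

static unsigned int mystery_util(unsigned int n)
{
    n &= (n >> 1) & 1;   /* 1 iff the low two bits are both set */
    return n;
}

static int mystery(int n)
{
    return (int)mystery_util(7u * (unsigned int)n + 4u);
}

/* Aborts via assert if mystery(n) differs from either closed form. */
static void check_closed_forms(unsigned int limit)
{
    for (unsigned int n = 0; n < limit; n++) {
        int got = mystery((int)n);
        assert(got == ((7u * n) % 4u == 3u));  /* the +4 changes nothing */
        assert(got == (n % 4u == 1u));
    }
}
```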
I am taking a MOOC course CS50 from Harvard. In one of the first lectures we learned about variables of different data types: int,char, etc.
What I understand is that command (say, within main function) int a = 5 reserves a number of bytes (4 for the most part) of memory on the stack and puts there a sequence of zeros and ones which represent 5.
The same sequence of zeros and ones also could mean a certain character. So somebody needs to keep track of the fact that the sequence of zeros and ones in the memory place reserved for a is to be read as an integer (and not as a character).
The question is who does keep track of it? The computer's memory by sticking a tag to this place in memory saying "hey, whatever you find in these 4 bytes read as an integer"? Or the C compiler, which knows (looking at the type int of a) that when my code asks it to do something (more precisely, to produce a machine code doing something) with the value of a it needs to treat this value as an integer?
I would really appreciate an answer tailored to a C beginner.
With the C language, it's the compiler.
At run-time, there's only the 32 bits = 4 bytes on the stack.
You ask "The computer's memory by sticking a tag to this place...": that's impossible (with current computer architectures; thanks for the hint from @Ivan). The memory itself is just 8 bits (each being 0 or 1) per byte. There is no place in memory that can tag a memory cell with any additional info.
There are other languages (e.g. LISP, and to some degree also Java and C#) that store an integer as a combination of the 32 bits for the number plus a few bits or bytes that contain some bit-encoded tagging that here we have an integer. So they need e.g. 6 bytes for a 32-bit integer. But with C, that's not the case. You need knowledge from the source code to correctly interpret the bits found in memory - they don't explain themselves. And there have been special architectures that supported tagging in hardware.
In C, memory is untyped; no information beyond its value is stored there. All type information is computed at compile time from the type of an expression (a variable name, a value computation, a pointer dereferencing etc.) This computation depends on the information the programmer provides through declarations (also in headers) or casts. If that information is wrong, e.g. because a function prototype's parameters are declared wrong, all bets are off. The compiler warns about or prevents mis-declarations in the same "translation unit" (file with headers), but between translation units there are no (or not many?) protections. That's one reason why C has headers: They share common type information between translation units.
C++ keeps this idea but additionally offers run time type information (as opposed to compile time type information) for polymorphic types. It's obvious that every polymorphic object must carry extra information somewhere (not necessarily close to the data though). But that is C++, not C.
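For C, a small experiment shows that the bits carry no type: the same four bytes give different values depending on which type the code reads them as. This is a sketch that assumes a 4-byte IEEE-754 float, which is what mainstream compilers use; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 4-byte float");

/* Reads the same four bytes as a float: bytes are copied, not converted. */
static float bits_as_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Ordinary conversion: the compiler emits code that produces new bits. */
static float value_as_float(uint32_t value)
{
    return (float)value;
}

/* Aborts via assert if the two readings are not what IEEE-754 predicts. */
static void check_interpretations(void)
{
    uint32_t bits = 0x40A00000u;               /* 1084227584 as an integer */
    assert(bits_as_float(bits) == 5.0f);       /* same bits, read as float */
    assert(value_as_float(bits) == 1084227584.0f);
}
```

Nothing in memory says which reading is right; only the types in the source code decide which instructions the compiler emits.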
For the main part it's the C compiler that keeps track.
During the compilation process the compiler builds up a large data structure called the parse tree. It also keeps track of all variables, functions, types, ... everything with a name (i.e. identifier); this is called the symbol table.
The nodes of both the parse tree and the symbol table have an entry in which the type is recorded. They keep track of all the types.
With mainly these two data structures in hand, the compiler can check if your code does not violate type rules. It allows the compiler to warn you if you use incompatible values or variable names.
C does allow implicit conversion between types. You can for example assign an int to a double. But in memory these are completely different bit patterns for the same value.
In earlier (higher abstraction level) phases of the compilation process, the compiler does not deal with bit patterns yet (or too much), and makes conversions and checks at a higher level.
But during the assembly code generating process, the compiler needs to finally figure it all out in bits. So for an int to double conversion:
int i = 5;
double d = i; // Conversion.
the compiler will generate code to make this conversion happen.
In C however it's very easy to make mistakes and mess things up. This is because C is not a very strongly typed language and is rather flexible. So a programmer also needs to be aware.
Because C does not keep track of types after compilation, a running program can often silently continue with the wrong data after executing one of your mistakes. And if you're 'lucky' enough that the program crashes, the error message you get is not (very) informative.
You have a stack pointer which gives an absolute offset for the topmost stack frame in memory.
For a given scope of execution, the compiler knows where each variable is located relative to this stack pointer and emits accesses to these variables as an offset from the stack pointer. So it is primarily the compiler mapping the variables, but it's the processor which is applying this mapping.
You can easily write programs which compute or remember a memory address which used to be valid, or is just outside of a valid region. The compiler doesn't stop you from doing so, only higher level languages with reference counting and strict boundary checks do at runtime.
The compiler keeps track of all type information during translation, and it will generate the proper machine code to deal with data of different types or sizes.
Let's take the following code:
#include <stdio.h>
int main( void )
{
long long x, y, z;
x = 5;
y = 6;
z = x + y;
printf( "x = %lld, y = %lld, z = %lld\n", x, y, z );
return 0;
}
After running that through gcc -S, the assignment, addition, and print statements are translated to:
movq $5, -24(%rbp)
movq $6, -16(%rbp)
movq -16(%rbp), %rax
addq -24(%rbp), %rax
movq %rax, -8(%rbp)
movq -8(%rbp), %rcx
movq -16(%rbp), %rdx
movq -24(%rbp), %rsi
movl $.LC0, %edi
movl $0, %eax
call printf
movl $0, %eax
leave
ret
movq is the mnemonic for moving values into 64-bit words ("quadwords"). %rax is a general-purpose 64-bit register that's being used as an accumulator. Don't worry too much about the rest of it for now.
Now let's see what happens when we change those long longs to shorts:
#include <stdio.h>
int main( void )
{
short x, y, z;
x = 5;
y = 6;
z = x + y;
printf( "x = %hd, y = %hd, z = %hd\n", x, y, z );
return 0;
}
Again, we run it through gcc -S to generate the assembly code, et voilà:
movw $5, -6(%rbp)
movw $6, -4(%rbp)
movzwl -6(%rbp), %edx
movzwl -4(%rbp), %eax
leal (%rdx,%rax), %eax
movw %ax, -2(%rbp)
movswl -2(%rbp),%ecx
movswl -4(%rbp),%edx
movswl -6(%rbp),%esi
movl $.LC0, %edi
movl $0, %eax
call printf
movl $0, %eax
leave
ret
Different mnemonics - instead of movq we get movw and movswl, we're using %eax, which is the lower 32 bits of %rax, etc.
Once more, this time with floating-point types:
#include <stdio.h>
int main( void )
{
double x, y, z;
x = 5;
y = 6;
z = x + y;
printf( "x = %f, y = %f, z = %f\n", x, y, z );
return 0;
}
gcc -S again:
movabsq $4617315517961601024, %rax
movq %rax, -24(%rbp)
movabsq $4618441417868443648, %rax
movq %rax, -16(%rbp)
movsd -24(%rbp), %xmm0
addsd -16(%rbp), %xmm0
movsd %xmm0, -8(%rbp)
movq -8(%rbp), %rax
movq -16(%rbp), %rdx
movq -24(%rbp), %rcx
movq %rax, -40(%rbp)
movsd -40(%rbp), %xmm2
movq %rdx, -40(%rbp)
movsd -40(%rbp), %xmm1
movq %rcx, -40(%rbp)
movsd -40(%rbp), %xmm0
movl $.LC2, %edi
movl $3, %eax
call printf
movl $0, %eax
leave
ret
New mnemonics (movsd), new registers (%xmm0).
So basically, after translation, there's no need to tag the data with type information; that type information is "baked in" to the machine code itself.
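The two large movabsq constants in the floating-point listing are a good illustration: they are simply the IEEE-754 bit patterns of 5.0 and 6.0 written as integers. A sketch that decodes them, assuming an 8-byte IEEE-754 double (bits_to_double and check_constants are hypothetical helpers):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t), "assumes an 8-byte double");

/* Reinterprets a 64-bit pattern as a double without changing any bits. */
static double bits_to_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Aborts via assert if the constants from the listing do not decode
   to the values assigned in the C source. */
static void check_constants(void)
{
    assert(bits_to_double(4617315517961601024ULL) == 5.0);  /* 0x4014000000000000 */
    assert(bits_to_double(4618441417868443648ULL) == 6.0);  /* 0x4018000000000000 */
}
```

The compiler stored them with movq because, at that point, they are just 64 bits; only the later movsd and addsd treat them as doubles.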
I am originally given the function prototype:
void decode1(int *xp, int *yp, int *zp)
Now I am told to convert the following assembly into C code:
movl 8(%ebp), %edi //line 1 ;; gets xp
movl 12(%ebp), %edx //line 2 ;; gets yp
movl 16(%ebp),%ecx //line 3 ;; gets zp
movl (%edx), %ebx //line 4 ;; gets y
movl (%ecx), %esi //line 5 ;; gets z
movl (%edi), %eax //line 6 ;; gets x
movl %eax, (%edx) //line 7 ;; stores x into yp
movl %ebx, (%ecx) //line 8 ;; stores y into zp
movl %esi, (%edi) //line 9 ;; stores z into xp
These comments were not given to me in the problem. They are what I believe the instructions are doing, but I am not 100% sure.
My question is, for lines 4-6, am I able to assume that the command
movl (%edx), %ebx
movl (%ecx), %esi
movl (%edi), %eax
just create local variables for y, z, and x?
Also, do the registers that each pointer gets stored in (i.e. edi, edx, ecx) matter, or can I use any register in any order to take the pointers off of the stack?
C code:
int tx = *xp;
int ty = *yp;
int tz = *zp;
*yp = tx;
*zp = ty;
*xp = tz;
If I wasn't given the function prototype how would I tell what type of return type is used?
Let's focus on a simpler set of instructions.
First:
movl 8(%ebp), %edi
will load into the EDI register the 4 bytes located in memory 8 bytes beyond the address held in the EBP register. This special EBP usage is a convention followed by the compiler's code generator: for each function, it saves the stack pointer ESP into the EBP register and then creates a stack frame for the function's local variables.
Now, in the EDI register, we have the first parameter passed to the function, that is a pointer to an integer, so EDI contains now the address of that integer, but not the integer itself.
movl (%edi), %eax
will get the 4 bytes pointed by the EDI register and load them into the EAX register.
Now in EAX we have the value of the integer pointed by the xp in the first parameter.
And then:
movl %eax, (%edx)
will save this integer value into the memory pointed by the content of the EDX register which was in turn loaded from EBP+12 which is the second parameter passed to the function.
So, to your first question: is this assembly code equivalent to the following?
int tx = *xp;
int ty = *yp;
int tz = *zp;
*yp = tx;
*zp = ty;
*xp = tz;
The answer is yes, but note that no tx, ty, tz local variables are actually created; the values just live in processor registers.
As for your second question, the answer is no: you can't tell the return type. Which register carries a return value is, again, a convention, and you can't infer it just by looking at the generated assembly code.
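For reference, here is the asker's body wrapped in the given prototype, as a complete function that can be compiled and tested. check_decode1 is a hypothetical helper added only to show the rotation of the three values.

```c
#include <assert.h>

/* Rotates the three values: x goes to *yp, y to *zp, z to *xp. */
void decode1(int *xp, int *yp, int *zp)
{
    int tx = *xp;   /* movl (%edi), %eax */
    int ty = *yp;   /* movl (%edx), %ebx */
    int tz = *zp;   /* movl (%ecx), %esi */

    *yp = tx;       /* movl %eax, (%edx) */
    *zp = ty;       /* movl %ebx, (%ecx) */
    *xp = tz;       /* movl %esi, (%edi) */
}

/* Aborts via assert if the values are not rotated as expected. */
static void check_decode1(void)
{
    int x = 1, y = 2, z = 3;
    decode1(&x, &y, &z);
    assert(x == 3 && y == 1 && z == 2);
}
```

All three loads happen before any store, so the result is the same even though the assembly loads y and z before x.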
Congratulations, you got everything right :)
You can use any register but some need to be preserved, that is they should be saved before use and restored afterwards. In typical calling conventions you can use eax, ecx and edx, the rest need to be preserved. The assembly you showed doesn't include code to do this, but presumably it is there.
As for the return type, that's hard to deduce. Simple types are returned in the eax register, and something is always in there. We can't tell if that's intended as a return value, or is just the remains of a local variable. That is, if your function had return tx; it could be the same assembly code. Also, we don't know the type for eax either; it could be anything that fits in there and is expected to be returned there according to the calling convention.
In many examples, when I compile a C function (such as the shell sort algorithm), the stack address (I guess it is called?) ebp-4 / -4(%ebp) / [ebp]-4 or whatever, which, as I understand it, is usually used for the first local variable, is not used in my case.
So I was wondering if someone knows what it is used for, as it is not used for any local variables, or anything else for that matter.
Furthermore, 20 is subtracted from the stack pointer to allocate stack space for the local variables, but then a value is still saved to -24(%ebp). How is that possible when room has only been made down to -20?
The C function looks like this:
void shellsort(int a[], unsigned int n) {
unsigned int gap, i, j;
for (gap = n / 2; gap > 0; gap = gap == 2 ? 1 : 5 * gap / 11) {
for (i = gap; i < n; i++) {
int tmp = a[i];
for (j = i; j >= gap && tmp < a[j - gap]; j -= gap)
a[j] = a[j - gap];
a[j] = tmp;
}
}
}
And this is my stack layout, from gcc -S on 32-bit Ubuntu:
12(%ebp) = n
8(%ebp) = a[]
-8(%ebp) = tmp
-12(%ebp) = j
-16(%ebp) = i
-20(%ebp) = gap
-24(%ebp) = (gap * 4) + gap
Thanks in advance :)
There are two parts to your question.
The first as I understand it has to do with what [EBP-4] is used for. For that, I recommend you read a summary of the x86 stack frame at What is stack frame in assembly?.
To properly answer the whole 20/24 part of your question, we need to look at the disassembled code. The following is an extract from the disassembly of the C code you provided.
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp /* (1) */
.cfi_def_cfa_register 5
pushl %ebx /* (2) */
subl $20, %esp /* (3) */
movl 12(%ebp), %eax
shrl %eax
movl %eax, -20(%ebp)
jmp .L2
.cfi_offset 3, -12
I have identified the three (3) key lines in the disassembled output above.
At (1), the base pointer is being set to the stack pointer. As per the information supplied in the link earlier, this is just part of setting up the stack frame.
At (2), we are saving EBX (a non-volatile register) to the stack. This automatically updates ESP (but not EBP), subtracting four from its current value. Note that after this operation, EBP = ESP + 4.
At (3), we are subtracting 20 from ESP. After this operation, EBP = ESP + 24.
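That is why it is safe to access [EBP-24]: it is the same address as [ESP], the lowest slot of the 20 bytes just reserved. It also means that [EBP-4] holds the saved EBX, which is why no local variable lives there.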
Hope this helps.
On the x86 ebp is typically used as the frame pointer and esp is the stack pointer. Around the frame pointer the frame pointer of the caller and the return address are saved. That could explain the gap. In my assembly actually -4(%ebp) is used, so maybe we are looking at a different ABI, but there should always be a gap for the stack administration.
About the 20 subtracted from the stack pointer while -24(%ebp) is still accessed: there is probably a push after the movl %esp, %ebp in your assembly, which would account for an additional 4 bytes.