I don't understand what happens in this Loop written in assembly language with 32-bit registers. This is the code:
void main() {
unsigned char Vet[100];
unsigned short int Mat = 8805;
unsigned short Ris;
__asm {
MOV AX, Mat
MOV BYTE PTR Vet[10], AL
MOV BYTE PTR Vet[13], 99
MOV BYTE PTR Vet[16], AH
LEA ESI, Vet
ADD ESI, 9
XOR EBX, EBX
MOV ECX, 3
L1: XOR BL, [ESI + 1]
ADD ESI, 3
LOOP L1
MOV Ris, BX
}
printf("\nRis: %d\n\n", Ris);
}
L1 set BL=65h the first time because BL starts by 0. Ok.
The second time i suppose to have 65h XOR 99h, because ESI=Vect[9+3+1].
So i expect FCh as result in BL, but compiler returns 06h!
I made test with a simple XOR between 65h and 99h, and is like i thought: FCh. The problem maybe in the Vet index, but where?
it's 99, not 99h. 65h xor 99 = 6.
Related
Sorry if I am asking a stupid question, but I can't find the answer due to clumsy search terms I guess
If I declare three variables as follows
volatile uint16_t a, b, c;
Will all three variables be declared volatile?
Or should I really not declare multiple variables in a row but instead do:
volatile uint16_t a;
volatile uint16_t b;
volatile uint16_t c;
If I declare three variables as follows
volatile uint16_t a, b, c;
Will all three variables be declared volatile?
Yes, all 3 variables will be volatile.
Or should I really not declare multiple variables in a row but instead do:
That is related to code style and personal preference. Usually declaring variables one per line is preferred, is more readable, easier to read, easier to refactor and results in more readable changes when browsing diff output of files.
We can check the assembly generated by the compiler to see if it optimizes the variables out or not.
When I check this simple program:
#include <stdio.h>
#include <stdint.h>
int main(void)
{
uint16_t a = 1, b = 1, c = 1;
printf("%hu", a);
printf("%hu", b);
printf("%hu", c);
}
The generated assembly at -O3 (link) is:
.LC0:
.string "%hu"
main:
sub rsp, 8
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
xor eax, eax
add rsp, 8
ret
It's obvious here that the variables have been optimized out and 1 is being used as a parameter instead of the variables.
When I replace the uint16_t a = 1, b = 1, c = 1; with volatile uint16_t a = 1, b = 1, c = 1;, The assembly generated (link) is:
main:
sub rsp, 24
mov edx, 1
mov ecx, 1
mov eax, 1
mov WORD PTR [rsp+10], ax
mov edi, OFFSET FLAT:.LC0
xor eax, eax
mov WORD PTR [rsp+12], dx
mov WORD PTR [rsp+14], cx
movzx esi, WORD PTR [rsp+10]
call printf
movzx esi, WORD PTR [rsp+12]
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
movzx esi, WORD PTR [rsp+14]
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
xor eax, eax
add rsp, 24
ret
Here, volatile is working like it should for all variables. The variables are created and are not optimized out.
In comparison, if we replace volatile uint16_t a = 1, b = 1, c = 1; with volatile uint16_t a = 1; uint16_t b = 1, c = 1; we see that only a is not optimized out (link):
main:
sub rsp, 24
mov eax, 1
mov edi, OFFSET FLAT:.LC0
mov WORD PTR [rsp+14], ax
movzx esi, WORD PTR [rsp+14]
xor eax, eax
call printf
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
mov esi, 1
mov edi, OFFSET FLAT:.LC0
xor eax, eax
call printf
xor eax, eax
add rsp, 24
ret
I don't understand what is the problem because the result is right, but there is something wrong in it and i don't get it.
1.This is the x86 code I have to convert to C:
%include "io.inc"
SECTION .data
mask DD 0xffff, 0xff00ff, 0xf0f0f0f, 0x33333333, 0x55555555
SECTION .text
GLOBAL CMAIN
CMAIN:
GET_UDEC 4, EAX
MOV EBX, mask
ADD EBX, 16
MOV ECX, 1
.L:
MOV ESI, DWORD [EBX]
MOV EDI, ESI
NOT EDI
MOV EDX, EAX
AND EAX, ESI
AND EDX, EDI
SHL EAX, CL
SHR EDX, CL
OR EAX, EDX
SHL ECX, 1
SUB EBX, 4
CMP EBX, mask - 4
JNE .L
PRINT_UDEC 4, EAX
NEWLINE
XOR EAX, EAX
RET
2.My converted C code, when I input 0 it output me the right answer but there is something false in my code I don't understand what is:
#include "stdio.h"
int main(void)
{
int mask [5] = {0xffff, 0xff00ff, 0xf0f0f0f, 0x33333333, 0x55555555};
int eax;
int esi;
int ebx;
int edi;
int edx;
char cl = 0;
scanf("%d",&eax);
ebx = mask[4];
ebx = ebx + 16;
int ecx = 1;
L:
esi = ebx;
edi = esi;
edi = !edi;
edx = eax;
eax = eax && esi;
edx = edx && edi;
eax = eax << cl;
edx = edx >> cl ;
eax = eax || edx;
ecx = ecx << 1;
ebx = ebx - 4;
if(ebx == mask[1]) //mask - 4
{
goto L;
}
printf("%d",eax);
return 0;
}
Assembly AND is C bitwise &, not logical &&. (Same for OR). So you want eax &= esi.
(Using &= "compound assignment" makes the C even look like x86-style 2-operand asm so I'd recommend that.)
NOT is also bitwise flip-all-the-bits, not booleanize to 0/1. In C that's edi = ~edi;
Read the manual for x86 instructions like https://www.felixcloutier.com/x86/not, and for C operators like ~ and ! to check that they are / aren't what you want. https://en.cppreference.com/w/c/language/expressions https://en.cppreference.com/w/c/language/operator_arithmetic
You should be single-stepping your C and your asm in a debugger so you notice the first divergence, and know which instruction / C statement to fix. Don't just run the whole thing and look at one number for the result! Debuggers are massively useful for asm; don't waste your time without one.
CL is the low byte of ECX, not a separate C variable. You could use a union between uint32_t and uint8_t in C, or just use eax <<= ecx&31; since you don't have anything that writes CL separately from ECX. (x86 shifts mask their count; that C statement could compile to shl eax, cl. https://www.felixcloutier.com/x86/sal:sar:shl:shr). The low 5 bits of ECX are also the low 5 bits of CL.
SHR is a logical right shift, not arithmetic, so you need to be using unsigned not int at least for the >>. But really just use it for everything.
You're handling EBX completely wrong; it's a pointer.
MOV EBX, mask
ADD EBX, 16
This is like unsigned int *ebx = mask+4;
The size of a dword is 4 bytes, but C pointer math scales by the type size, so +1 is a whole element, not 1 byte. So 16 bytes is 4 dwords = 4 unsigned int elements.
MOV ESI, DWORD [EBX]
That's a load using EBX as an address. This should be easy to see if you single-step the asm in a debugger: It's not just copying the value.
CMP EBX, mask - 4
JNE .L
This is NASM syntax; it's comparing against the address of the dword before the start of the array. It's effectively the bottom of a fairly normal do{}while loop. (Why are loops always compiled into "do...while" style (tail jump)?)
do { // .L
...
} while(ebx != &mask[-1]); // cmp/jne
It's looping from the end of the mask array, stopping when the pointer goes past the end.
Equivalently, the compare could be ebx !-= mask - 1. I wrote it with unary & (address-of) cancelling out the [] to make it clear that it's the address of what would be one element before the array.
Note that it's jumping on not equal; you had your if()goto backwards, jumping only on equality. This is a loop.
unsigned mask[] should be static because it's in section .data, not on the stack. And not const, because again it's in .data not .rodata (Linux) or .rdata (Windows))
This one doesn't affect the logic, only that detail of decompiling.
There may be other bugs; I didn't try to check everything.
if(ebx != mask[1]) //mask - 4
{
goto L;
}
//JNE IMPLIES a !=
I am having trouble finding the longest run of integers in an array in x86 assembly. I have an array I am passing in, and the size of the array, and I think I am on track with my logic, just can't figure out the last steps. Any help is appreciated. This is the code I have so far:
int Longest_Run(int nums[], int siz) {
//
int returnValue = 0;
int freq[101] = { 0 }; //if needed
int arr_length = 0;
arr_length = sizeof(nums) / sizeof(int); //WRONG
__asm {
//LONGEST RUN. Return value to caller :
MOV ECX, siz // arr_length //For looping
//LEA ESI, nums //ptr to array
mov ESI, nums //ptr to array
//LEA EDI, nums //ptr to array
MOV EBX, 0 //Frequency counter
MOV EAX, [ESI] //First array int in SI
L1:
CMP EAX, [ESI] //Compare 1st int with 1st int
JE L2
JMP L3
L4:
LOOP L1
JMP CLEAR
L2 : //If equal
INC EBX //Inc counter
ADD ESI, 4 //Inc to 2nd item in array
MOV EAX, [ESI] //Move 2nd item into AX
JMP L1
L3 :
PUSH EBX // Puts counter int on stack
MOV EBX, 0 // Clears BX to start over
ADD ESI, 4 //Inc to 3rd item in array
MOV EAX, [ESI] //Move 3rd item into AX
JMP L4
CLEAR:
POP EDX
MOV EDX, returnValue
//END
}
return returnValue;
}
Any ideas or guidance?
I got a task to write an assembly routine that can be read from C and declared as follows:
extern int solve_equation(long int a, long int b,long int c, long int *x, long int *y);
that finds a solution to the equation
a * x + b * y = c
In -2147483648 <x, y <2147483647 by checking all options.
The value returned from the routine will be 1 if a solution is found and another 0.
You must take into consideration that the results of the calculations: a * x, b * y, a * x + b * y can exceed 32 bits.
.MODEL SMALL
.DATA
C DQ ?
SUM DQ 0
MUL1 DQ ?
MUL2 DQ ?
X DD ?
Y DD ?
.CODE
.386
PUBLIC _solve_equation
_solve_equation PROC NEAR
PUSH BP
MOV BP,SP
PUSH SI
MOV X,-2147483648
MOV Y,-2147483648
MOV ECX,4294967295
FOR1:
CMP ECX,0
JE FALSE
PUSH ECX
MOV Y,-2147483648
MOV ECX,4294967295
FOR2:
MOV SUM,0
CMP ECX,0
JE SET_FOR1
MOV EAX,DWORD PTR [BP+4]
IMUL X
MOV DWORD PTR MUL1,EAX
MOV DWORD PTR MUL1+4,EDX
MOV EAX,DWORD PTR [BP+8]
IMUL Y
MOV DWORD PTR MUL2,EAX
MOV DWORD PTR MUL2+4,EDX
MOV EAX, DWORD PTR MUL1
ADD DWORD PTR SUM,EAX
MOV EAX, DWORD PTR MUL2
ADD DWORD PTR SUM,EAX
MOV EAX, DWORD PTR MUL1+4
ADD DWORD PTR SUM+4,EAX
MOV EAX, DWORD PTR MUL2+4
ADD DWORD PTR SUM+4,EAX
CMP SUM,-2147483648
JL SET_FOR2
CMP SUM,2147483647
JG SET_FOR2
MOV EAX,DWORD PTR [BP+12]
CMP DWORD PTR SUM,EAX
JE TRUE
SET_FOR2:
DEC ECX
INC Y
JMP FOR2
SET_FOR1:
POP ECX
DEC ECX
JMP FOR1
FALSE:
MOV AX,0
JMP SOF
TRUE:
MOV SI,WORD PTR [BP+16]
MOV EAX,X
MOV DWORD PTR [SI],EAX
MOV SI,WORD PTR [BP+18]
MOV EAX,Y
MOV DWORD PTR [SI],EAX
MOV AX,1
SOF:
POP SI
POP BP
RET
_solve_equation ENDP
END
Is this the right way to work with large numbers?
I get argument to operation or instruction has illegal size when I try to do:
MOV SUM,0
CMP SUM,-2147483648
CMP SUM,2147483647
main code:
int main()
{
long int x, y, flag;
flag = solve_equation(-5,4,2147483647,&x, &y);
if (flag == 1)
printf("%ld*%ld + %ld*%ld = %ld\n", -5L,x,4L,y,2147483647);
return 0;
}
output
-5*-2147483647 + 4*-2147483647 = 2147483647
I`m using dosbox 0.74 and tcc
You're using 16-bit code, so 64-bit operand-size isn't available. Your assembler magically associates a size with sum, because you defined it with sum dq 0.
So mov sum, 0 is equivalent to mov qword ptr [sum], 0, which of course won't assemble in 16 or 32-bit mode; you can only operate on up to 32 bits at once with integer operations.
(32-bit operand-size is available in 16-bit mode on 386-compatible CPUs, using the same machine encodings that allows 16-bit operand size in 32-bit mode. But 64-bit operand size is only available in 64-bit mode. Unlike 386, AMD64 didn't add any new prefixes or anything to previously-existing modes, for various reasons.)
You could zero the whole 64-bit sum with an SSE store, or even compare with SSE4.2 pcmpgtq, but that's probably not what you want.
It looks like you want to check if 64-bit sum fits in 32 bits. (i.e. if it is a sign-extended 32-bit integer).
So really you just need to check that all 32 high bits are the same and match bit 31 of the low half.
mov eax, dword ptr [sum]
cdq ; sign extend eax into edx:eax
; i.e. copy bit 31 of EAX to all bits of EDX
cmp edx, dword ptr [sum+4]
je small_sum
My assignment is to find the smallest letter in the array using assembly embedded into C. I am not sure how to access each element of the array. I tried googling and I found out that some people are doing the following:
mov ecx, arrayOfLetters
and then increment ecx to access each element. Is that right or what I wrote so is correct?
please help, I am confused.
char findMinLetter( char arrayOfLetters[], int arraySize )
{
char min;
__asm{
push eax
push ebx
push ecx
push edx
mov dl, 0x7f // initialize DL
xor ebx, ebx //EBX started off as 0
//moves letters from array to registers
mov ecx, arrayOfLetters[ebx]
mov edx, arrayOfLetters[ebx+1]
The first thing to understand is that 'arrayOfLetters' as passed to your subroutine is a pointer.
To access data (one byte at a time) from pointer (in ecx) in assembler, use:
mov al, [ecx]
mov al, [ecx+1]
... or ...
mov al, [ecx]
inc ecx
mov al, [ecx]
The next issue is how local variables are accessed: there are two main styles used and both of them use stack.
mov ecx, _localvariable_ ; this translates to either
mov ecx, [ebp + offset] ; style (1) or
mov ecx, [esp + offset] ; style (2)
If there was a assembler supporting instruction mov ecx, _localvariable [+1], that would most likely convert to:
mov ecx, [ebp + offset + 1]
And this would not access the char array[], but just some arbitrary byte in the stack.