In my .bss section I declare array db 5, 4, 3, 2, 1
I then have a pointer ptr defined as %define ptr dword[ebp-8]
I would like to inc and dec this pointer at will to move from one element in the array to another, and then I would like to have to ability to inc the value in the array that the pointer is pointing to, and have this be updated in the array!
I move through the array with a loop in the form of:
mov ptr, array ; not in loop
mov ebx, ptr
mov al, [ebx]
inc ptr
How can I increment the value and then have it saved in the array instead of just some register as if I did inc al , can I do something like inc [ptr] (This doesnt work ofcourse). Is there a better way to approach this entirely?
Thanks
Edit:
I want my array to be something like 10, 8, 6, 5, 2 i.e increment each element by however many
In my .bss section I declare array db 5, 4, 3, 2, 1
This does not make sense.
By default .bss is defined as nobits, i. e. an uninitialized section (although a modern multi-user OS will initialize it with some values [usually zero] to prevent data leaks).
You probably meant to write .data section.
nasm(1) will issue an “ignored” warning about any db in a nobits section.
I then have a pointer ptr defined as %define ptr dword[ebp‑8]
I would like to inc and dec this pointer […]
No, you have a (case-sensitive) single-line macro ptr.
Any (following) occurrence of ptr will be expanded to dword[ebp‑8].
Use nasm ‑E source.asm (preprocess only) to show this.
[…] then I would like to have to ability to inc the value in the array that the pointer is pointing to […]
Your ptr macro says it’s pointing to a dword – a 32‑bit quantity – but your array consists of db – data Byte, 8‑bit quantity – elements.
This doesn’t add up.
I want my array to be something like 10, 8, 6, 5, 2 i.e increment each element by however many
Well, x + x = 2 × x, calculating the sum of a value plus the same value is the same as multiplying by two.
Any decent compiler will optimize multiplying by a constant factor 2ⁿ as a shl x, n.
Unless you need certain flags (the resulting flags of shl and add marginally differ), you can do something like
lea ebx, [array] ; load address of start of `array`
xor eax, eax ; eax ≔ 0, index in array starting at 0
mov ecx, 5 ; number of elements
head_of_loop:
shl byte [ebx + 1 * eax], 1 ; base address + size of element * index
add eax, 1 ; eax ≔ eax + 1
loop head_of_loop ; ecx ≔ ecx − 1
; if ecx ≠ 0 then goto head_of_loop
The constant factor in [ebx + 1 * eax] can be 1, 2, 4 or 8.
Note, the loop instruction is utterly slow and was used just for the sake of simplicity.
Related
I'm in the process of learning Assembly language using NASM, and have run into a programming problem that I can't seem to figure out. The goal of the program is to solve this equation:
Picture of Equation
For those unable to see the photo, the equation says that for two arrays of length n, array a and array b, find: for i=0 to n-1, ((ai + 3) - (bi - 4))
I'm only supposed to use three general registers, and I've figured out a code sample I think could possibly work, but I keep running into comma and operand errors with lines 16 and 19. I understand that in order to iterate through the array you need to move a pointer to each index, but since both arrays are of different values (array 1 is dw and array 2 is db) I am unsure how to account for that. I'm still very new to Assembly, and any help or pointers would be appreciated.
Here is a picture of my current code:
Code Sample
segment .data
a dw 12, 14, 16 ; array of three values
b db 2, 4, 5 ; array of three values
n dw 3 ; length of both arrays
result dq 0 ; memory to result
segment .text
global main
main:
mov rax, 0
mov rbx, 0
mov rdx, 0
loop_start:
cmp rax, [n]
jge loop_end
add rbx, a[rax*4] ; adding element of a at current index to rbx
add rbx, 3 ; adding 3 to current index value of array a in rbx
add rdx, BYTE b[rax]
sub rdx, 4
sub rbx, [rdx]
add [result], rbx
xor rbx, rbx
xor rdx, rdx
add rax, 1
loop_end:
ret
You are using 16-bit and 8-bit data, but 64-bit registers. Generally speaking, the processor requires the same data size though out the operands of any single instruction.
cmp rax,[n] has varying data size, which is not allowed: rax is a 64-bit register, and [n] is a 16 bit data item. So, we can change this to cmp ax,[n], and now everything is 16-bit.
add rbx,a[rax*4] is also mixing different size operands (not allowed). rbx is 64-bits and a[] is 16-bits. You can change the register to bx and this will be allowed. But also let's note that *4 is too much it should be *2 since dw is 16-bit data (2-byte), not 32-bit (4-byte). Since you're clearing rbx, you don't need an add here you can simply mov.
add rdx, BYTE b[rax] is also mixing different sizes. rax is 64-bits wide whereas b[] is 8-bits wide. Use dl instead of rdx. There is nothing to add to with this so you should use a mov instead of add. Now that there's a value in dl, and you previously cleared rdx, you can switch to using dx (from dl) this will have the 16-bit value of b[i].
sub rbx, [rdx] has an erroneous deference. Here you just want to sub bx,dx.
You are not using the label loop_start, so there is no loop. (Add a backward branch at the end of the loop.)
...but since both arrays are of different values (array 1 is dw and array 2 is db) I am unsure how to account for that
Erik Eidt's answer explaines why you "keep running into comma and operand errors". Although you can revert to using the smaller registers (adding operand size prefixes), my answer takes a different approach.
The instruction set has the movzx (move with zero extension) and movsx (move with sign extension) instructions to deal with these varying sizes. See below how to use these.
I've applied a few changes too.
Don't miss an opportunity to simplify your calculation:
((a[i] + 3) - (b[i] - 4)) is equivalent to (a[i] - b[i] + 7)
None of these arrays is empty, so you can just put the loop condition below its body.
You can process the arrays starting at the end if it's convenient. The summation operation doesn't mind!
segment .data
a dw 12, 14, 16 ; array of three values
b db 2, 4, 5 ; array of three values
n dw 3 ; length of both arrays
result dq 0 ; memory to result
segment .text
global main
main:
movzx rcx, word [n]
loop_start:
movzx rax, word [a + rcx * 2 - 2]
movzx rbx, byte [b + rcx - 1]
lea rax, [rax + rbx + 7]
add [result], rax
dec rcx
jnz loop_start
ret
Please notice that the additional negative offsets - 2 and - 1 exist to compensate for the fact that the loop control takes on the values {3, 2, 1} when {2, 1, 0} would have been perfect. This does not introduce an extra displacement component to the instruction since the mention of the a and b arrays is in fact already the displacement.
Although this is tagged x86-64, you can write the whole thing using 32-bit registers and not require the REX prefixes. Same result.
segment .data
a dw 12, 14, 16 ; array of three values
b db 2, 4, 5 ; array of three values
n dw 3 ; length of both arrays
result dq 0 ; memory to result
segment .text
global main
main:
movzx ecx, word [n]
loop_start:
movzx eax, word [a + ecx * 2 - 2]
movzx ebx, byte [b + ecx - 1]
lea eax, [eax + ebx + 7]
add [result], eax
dec ecx
jnz loop_start
ret
I'm writing a program in masm assembly to count and return the number of times integers appear in an array. I currently have the following code that allows me to populate an array with random integers. What I am struggling with is how to implement a counter that will store each occurrence of an integer at an index in the array. for instance, if the random array was [3,4,3,3,4,5,7,8], I would want to my count array to hold [3, 2, 1, 1, 1], as there are (three 3's, two 4's, etc).
I have the bounds of the random numbers fixed at 3/8 so I know they will be within this range. My current thinking is to compare each number to 3-8 as it is added, and increment my count array respectively. My main lack of understanding is how I can increment specific indices of the array. This code is how I am producing an array of random integers, with an idea of how I can begin to count integer occurrence, but I don't know if I am going in the right direction. Any advice?
push ebp
mov ebp, esp
mov esi, [ebp + 16] ; # holds array to store count of integer occurances
mov edi, [ebp + 12] ; # holds array to be populated with random ints
mov ecx, [ebp + 8] ; value of request in ecx
MakeArray:
mov eax, UPPER ; upper boundary for random num in array
sub eax, LOWER ; lower boundary for random num in array
inc eax
call RandomRange
add eax, LOWER
cmp eax, 3 ; Where I start to compare the random numbers added
je inc_3 ; current thought is it cmp to each num 3-8
mov [edi], eax ; put random number in array
add edi, 4 ; holds address of current element, moves to next element
loop fillArrLoop
inc_3: ; if random num == 3
inc esi ; holds address of count_array, increments count_array[0] to 1?
mov [edi], eax ; put random number in array to be displayed
add edi, 4 ; holds address of current element, moves to next element
loop MakeArray
My current thinking is to compare each number to 3-8 as it is added
No, you're vastly overcomplicating this. You don't want to linear search for a j (index into the counts) such that arr[i] == j, just use j = arr[i].
The standard way to do a histogram is ++counts[ arr[i] ]. In your case, you know the possible values are 3..8, so you can map an array value to a count bucket with arr[i] - 3, so you'll operate on counts[0..5]. A memory-destination add instruction with a scaled-index addressing mode can do this in one x86 instruction, given the element value in a register.
If the possible values are not contiguous, you'd normally use a hash table to map values to count buckets. You can think about this simple case as allowing a trivial hash function.
Since you're generating the random numbers to fill arr[i] at the same time as histograming, you can combine those two tasks, and instead of subtracting 3 just don't add it yet.
; inputs: unsigned len, int *values, int *counts
; outputs: values[0..len-1] filled with random numbers, counts[] incremented
; clobbers: EAX, ECX, EDX (not the other registers)
fill_array_and_counts:
push ebp
mov ebp, esp
push esi ; Save/restore the caller's ESI.
;; Irvine32 functions like RandomRange are special and don't clobber EAX, ECX, or EDX except as return values,
;; so we can use EDX and ECX even in a loop that makes a function call.
mov edi, [ebp + 16] ; int *counts ; assumed already zeroed?
mov edx, [ebp + 12] ; int *values ; output pointers
mov ecx, [ebp + 8] ; size_t length
MakeArray: ; do{
mov eax, UPPER - LOWER + 1 ; size of random range, calculated at assemble time
call RandomRange ; eax = 0 .. eax-1
add dword ptr [edi + eax*4], 1 ; ++counts[ randval ]
add eax, LOWER ; map 0..n to LOWER..UPPER
mov [edx], eax ; *values = randval+3
add edx, 4 ; values++
dec ecx
jnz MakeArray ; }while(--ecx);
pop edi ; restore call-preserved regs
pop ebp ; including tearing down the stack frame
ret
If the caller doesn't zero the counts array for you, you should do that yourself, perhaps with rep stosd with EAX=0 as a memset of ECX dword elements, and then reload EDI and ECX from the stack args.
I'm assuming UPPER and LOWER are assemble time constants like UPPER = 8 or LOWER equ 3, since you used all-upper-case names for them, and they're not function args. If that's the case, then there's no need to do the math at runtime, just let the assembler calculate UPPER - LOWER + 1 for you.
I avoided the loop instruction because it's slow, and doesn't do anything you can't do with other simple instructions.
One standard performance trick for histograms with only a few buckets is to have multiple arrays of counts and unroll over them: Methods to vectorise histogram in SIMD?. This hides the latency of store/reload when the same counter needs to be incremented several times in a row. Your random values will generally avoid long runs of the same value, though, so worst-case performance is avoided.
There might be something to gain from AVX2 for large arrays since there are only 6 possible buckets: Micro Optimization of a 4-bucket histogram of a large array or list. (And you could generate random numbers in SIMD vectors with an AVX2 xorshift128+ PRNG if you wanted.)
If your range is fixed (3-8), you have a fixed-length array that can hold your counts:
(index0:Count of 3),(index1:Count of 4)..(index5:Count of 8s)
Once you have an element from the random array, you just take that element and put it through a switch:
cmp 3, [element]
jne compare4
mov ebx, [countsArrayAddress] ; Can replace [countsArrayAddress] with [ebp + 16]
add ebx, 0 ; First index, can comment out this line
mov ecx, [ebx]
add ecx, 1 ; Increment count
mov [ebx], ecx ; Count at the zeroth offset is now incremented
compare4:
cmp 4, [element]
jne compare5
mov ebx, [countsArrayAddress]
add ebx, 4 ; Second index (1*4)
mov ecx, [ebx]
add ecx, 1
mov [ebx], ecx
...
Is this what you mean? I come from using fasm syntax but it looks pretty similar. The above block is a bit unoptimized, but think this shows how to build the counts array. The array has a fix length, which must be allocated, either on the stack (sub rsp the correct amount) or on the heap, i.e with heapalloc/malloc calls. (Edited, see you're using 32-bit registers)
First of all, this is a homework assignment.
I have a loop to get values of two digits individually, and joining them by doing a multiplication of the first digit by 10 and adding with the second digit to get an integer.
I'm doing all this and saving in my AL register, and now I want to insert that integer into an array and then scan that array and display those numbers.
How can I insert into vector and read from vector?
My array:
section .bss
array resb 200
My digit convert:
sub byte[digit_une], 30h
sub byte[digit_two], 30h
mov al, byte[digit_one]
mov dl, 10 ;dl = 10
mul dl ;al = ax = 10 * digit_one
add al, byte[digit_two] ;al = al + digit_two = digit_one * 10 + digit_two
"arrays", "vectors", etc... all that is higher level concept. The machine has memory, which is addressable by single byte, and what kind of logic you implement with your code, that's up to you. But you should be able to think about it on both levels, as single bytes in memory, each having it's own address, and fully understand your code logic, how it will arrange usage of those bytes to form "array of something".
With your definition of .bss sector you define one symbol/label array, which is equal to the address into memory where the .bss segment starts. Then you reserve 200 bytes of space, so anything else you will add after (like another label) will start at address .bss+200.
Let's say (for example) after loading your binary into memory and jumping to entry point, the .bss is at address 0x1000.
Then
mov dword [array],0x12345678
will store 4 bytes into memory at addresses 0x1000 .. 0x1003, with particular bytes having values 78 56 34 12 (little-endian break down of that dword value).
If you will do mov dword [array+199],0x12345678, you will write value 0x78 into the last officially reserved byte by that resb 200, and remaining 3 bytes will overwrite the memory at addresses .bss+200, .bss+201 and .bss+202 (probably damaging some other data, if you will put something there, or crashing your application, if it will cross the memory page boundary, and you are at the end of available memory for your process).
As you want to store N byte values into array, the simplest logic is to store first value at address array+0, second at array+1, etc... (for dword values the most logical way is array+0, array+4, array+8, ....).
i.e. mov [array+0],al can be used to store first value. But that's not very practical, if you are reading the input in some kind of loop. Let's say you want to read at most 200 values from user, or value 99 will end sooner, then you can use indexing by register, like:
xor esi,esi ; rsi = index = 0
mov ecx,200 ; rcx = 200 (max inputs)
input_loop:
; do input into AL = 0..99 integer (preserve RSI and RCX!)
...
cmp al,99
je input_loop_terminate
mov [array+rsi], al ; store the new value into array
inc rsi ; ++index
dec rcx ; --counter
jnz input_loop ; loop until counter is zero
input_loop_terminate:
; here RSI contains number of inputted values
; and memory from address array contains byte values (w/o the 99)
I.e. for user input 32, 72, 13, 0, 16, 99 the memory at address 0x1000 will have 5 bytes modified, containing (in hexa) now: 20 48 0D 00 10 ?? ?? ?? ....
If you are somewhat skilled asm programmer, you will not only index by register, but also avoid the hardcoded array label, so you would probably do an subroutine which takes as argument target address (of array), and maximum count:
; function to read user input, rsi = array address, rcx = max count
; does modify many other registers
; returns amount of inputted values in rax
take_some_byte_values_from_user:
jrcxz .error_zero_max_count ; validate count argument
lea rdi,[rsi+rcx] ; rdi = address of first byte beyond buffer
neg rcx ; rcx = -count (!)
; ^ small trick to make counter work also as index
; the index values will be: -200, -199, -198, ...
; and that's perfect for that "address of byte beyond buffer"
.input_loop:
; do input into AL = 0..99 integer (preserve RSI, RDI and RCX!)
...
cmp al,99
je .input_loop_terminate
mov [rdi+rcx], al ; store the new value into array
inc rcx ; ++counter (and index)
jnz .input_loop ; loop until counter is zero
.input_loop_terminate:
; calculate inputted size into RAX
lea rax,[rdi+rcx] ; address beyond last written value
sub rax,rsi ; rax = count of inputted values
ret
.error_zero_max_count:
xor eax,eax ; rax = 0, zero values were read
ret
Then you can call that subroutine from main code like this:
...
mov rsi,array ; rsi = address of reserved memory for data
mov ecx,200 ; rcx = max values count
call take_some_byte_values_from_user
; keep RAX (array.length = "0..200" value) somewhere
test al,al ; as 200 was max, testing only 8 bits is OK
jz no_input_from_user ; zero values were entered
...
For word/dword/qword element arrays the x86 has scaling factor in memory operand, so you can use index value going by +1, and address value like:
mov [array+4*rsi],eax ; store dword value into "array[rsi]"
For other sized elements it's usually more efficient to have pointer instead of index, and move to next element by doing add <pointer_reg>, <size_of_element> like add rdi,96, to avoid multiplication of index value for each access.
etc... reading values back is working in the same way, but reversed operands.
btw, these example don't as much "insert" values into array, as "overwrite" it. The computer memory already exists there and has some values (.bss gets zeroed by libc or OS IIRC? Otherwise some garbage may be there), so it's just overwriting old junk values with the values from user. There's still 200 bytes of memory "reserved" by resb, and your code must keep track of real size (count of inputted values) to know, where the user input ends, and where garbage data starts (or you may eventually write the 99 value into array too, and use that as "terminator" value, then you need only address of array to scan it content, and stop when value 99 is found).
EDIT:
And just in case you are still wondering why I am sometimes using square brackets and sometimes not, this Q+A looks detailed enough and YASM syntax is same as NASM in brackets usage: Basic use of immediates (square brackets) in x86 Assembly and yasm
I'm new to Assembly, I need help with an assignment in Assembly Language Irvine32. I want to know where I'm going wrong. I believe my code is 80% right but there's something I'm not seeing or recognizing. Here's the program details.
"Write an assembly language program that has an array of words. The program loads the last element of the array into an appropriately sized register and prints it. (Do not hardcode the index of the last element.)"
INCLUDE Irvine32.inc
.data
val1 word 1,2,3,4,5,6
val2 = ($-val1)/2
.code
main PROC
mov ax, 0
mov ax, val1[val2]
Call WriteDec
Call DumpRegs
exit
main ENDP
END main
First of all, your code has a bug: val1[val2] indexes with the element count in words, not the length in bytes (unless MASM syntax is even more magical than I expect). And it reads from one past the end of the array, since the first element is at val1[0].
To find the end, you either need to know the length (explicit length, like a buffer passed to memcpy(3)), or search it for a sentinel element (implicit length, like a C string passed to strcpy(3)).
Having a function that accepts an explicit length as a parameter seems fine to me. It's obviously much more efficient than a loop scanning for a sentinel element, and the array shown doesn't include one. (See Jose's answer for a suggestion to use '$' (i.e. 36) as a sentinel value. -1 or 0 might be more sensible sentinels/terminators.)
Obviously knowing the length is much better, since there's no need for a loop scanning the whole array.
I'd only call it hard-coding if you wrote val2 = 6, or worse val2 dw 6, rather than having it calculated at assemble-time from the array. If you want to write a function that could work with non-compile-time-constant arrays, you can have it accept the length as a value in memory, instead of an immediate that will be embedded into its load instruction.
e.g.
Length as a parameter in memory
.data
array word 1,2,3,4,5,6
array_len word ($-array)/2 ; some assemblers have syntactic sugar to calc this for you, like a SIZE operator or something.
.code
main PROC ; inputs: array and array_len in static storage
; output: ax = last element of array
; clobbers: si
; mov ax, 0 ; This is useless, the next mov overwrites it.
mov si, [array_len] ; do we need to save/restore si with push/pop in this ABI?
add si,si ; multiply by 2: length in words -> length in bytes
mov ax, [array + si - 2] ; note that the -2 folds into array at assemble time, so it's just a disp16 + index addressing mode
Call WriteDec
Call DumpRegs
exit
main ENDP
END main
You could also write a function to take pointer and length args on the stack or in registers, and have main pass those args.
You could save the add (or shl) by accepting a length in bytes, or a start and one-past-the-end pointer (like C++ STL range functions that take .begin() and .end() iterators). If you have the end pointer, you don't need the start pointer at all, except to return an error if they're equal (size = 0).
Or if you were not stuck with obsolete 16bit code, you could use a scaled index in the addressing mode, like [array + esi * 2]. You include Irvine32.inc...
I think your solution to reach the last element is the most efficient (($-val1)/2), but #zx485 is right and your teacher might believe you are cheating, so, among other solutions, you can reach the last element with a loop and the pointer SI :
INCLUDE Irvine32.inc
.data
val1 word 1,2,3,4,5,6
val2 = ($-val1)/2
.code
main PROC
; mov ax, 0
; mov ax, val1[val2]
mov cx, val2-1 ;COUNTER FOR LOOP (LENGTH-1).
mov si, offset val1 ;SI POINTS TO FIRST WORD IN ARRAY.
repeat:
add si, 2 ;POINT TO NEXT WORD IN ARRAY.
loop repeat ;CX--, IF CX > 0 REPEAT.
mov ax, [ si ] ;LAST WORD!
Call WriteDec
Call DumpRegs
exit
main ENDP
END main
One shorter way would be to get rid of the loop and jump straight to the last element by using the SI pointer (and changing val2 just a little) :
INCLUDE Irvine32.inc
.data
val1 dw 1,2,3,4,5,6
val2 = ($-val1)-2 ;NOW WE GET LENGTH - 2 BYTES.
.code
main PROC
; mov ax, 0
; mov ax, val1[val2]
mov si, offset val1 ;SI POINTS TO FIRST WORD IN ARRAY.
add si, val2 ;SI POINTS TO THE LAST WORD.
mov ax, [ si ] ;LAST WORD!
Call WriteDec
Call DumpRegs
exit
main ENDP
END main
And "Yes", you can join those two lines :
mov si, offset val1 ;SI POINTS TO FIRST WORD IN ARRAY.
add si, val2 ;SI POINTS TO THE LAST WORD.
into one, I separated them to comment each other :
mov si, offset val1 + val2
If you cannot use val2 = ($-val1)/2, one option would be to choose some terminating character for the array, for example, '$', and loop until it's found:
INCLUDE Irvine32.inc
.data
val1 word 1,2,3,4,5,6,'$' ;ARRAY WITH TERMINATING CHARACTER.
;val2 = ($-val1)/2
.code
main PROC
;mov ax, 0
;mov ax, val1[val2]
mov si, offset val1 ;SI POINTS TO VAL1.
mov ax, '$' ;TERMINATING CHARACTER.
repeat:
cmp [ si ], ax
je dollar_found ;IF [ SI ] == '$'
add si, 2 ;NEXT WORD IN ARRAY.
jmp repeat
dollar_found:
sub si, 2 ;PREVIOUS WORD.
mov ax, [ si ] ;FINAL WORD!
Call WriteDec
Call DumpRegs
exit
main ENDP
END main
I'm currently going through Assembly Language for x86 Processors 6th Edition by Kip R. Irvine. It's quite enjoyable, but something is confusing me.
Early in the book, the following code is shown:
list BYTE 10,20,30,40
ListSize = ($ - list)
This made sense to me. Right after declaring an array, subtract the current location in memory with the starting location of the array to get the number of bytes used by the array.
However, the book later does:
.data
arrayB BYTE 10h,20h,30h
.code
mov esi, OFFSET arrayB
mov al,[esi]
inc esi
mov al,[esi]
inc esi
mov al,[esi]
To my understanding, OFFSET returns the location of the variable with respect to the program's segment. That address is stored in the esi register. Immediates are then used to access the value stored at the address represented in esi. Incrementing moves the address to the next byte.
So what is the difference between using OFFSET on an array and simply calling the array variable? I was previously lead to believe that simply calling the array variable would also give me its address.
.data
Number dd 3
.code
mov eax,Number
mov ebx,offset Number
EAX will read memory at a certain address and store the number 3
EBX will store that certain address.
mov ebx,offset Number
is equivalent in this case to
lea ebx,Number