I was wondering about this when using code like the following:
#include <stdio.h>
int main(){
    int b;
    scanf("%d", &b);
    if (b)
        printf("right\n");
    else
        printf("zero entered\n");
    return 0;
}
How can the compiler know that if b != 0 it should execute printf("right\n");, and if b == 0 it should execute printf("zero entered\n");?
And if I had another variable a and checked whether a > b or not, the result of the logical operation is 1 or 0; how is this value obtained? Is it a function?
In C, all non-zero values are "true", and only zero is "false".
As for the comparison a > b: depending on the types, there are instructions in the CPU running your program that do comparisons, and the compiler generates those instructions when compiling your program. For types that don't have a native comparison instruction, it's up to the compiler implementation how those should be handled.
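To see that at the C level: the result of a comparison is just an ordinary int holding 0 or 1, which you can store or print like any other value. A minimal sketch:
#include <stdio.h>

int main(void)
{
    int a = 5, b = 3;
    int r = (a > b);        /* a relational expression has type int and value 0 or 1 */
    printf("%d\n", r);      /* prints 1 */
    printf("%d\n", a < b);  /* prints 0 */
    return 0;
}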
The way the compiler deals with this is to translate it into the appropriate machine instructions.
In the case of this:
if (b)
then this is typically, on an x86 machine, translated to something like this:
test eax, eax ; AND eax with itself, discarding the result but setting the flags
jz target ; jump to target if the result was zero
The above code shows that zero is a special case in the CPU: many, if not most, instructions set internal flags when they operate on values, so that a jz (jump if zero) or jnz (jump if not zero) can be done afterwards. There are other flags as well: overflow, carry, sign and parity.
As for comparisons, for types that can be handled natively by the CPU, there are built-in instructions:
mov eax, a ; eax = a
cmp eax, b ; compare eax to b
jl target ; jump to target if less (eax < b --> a < b)
You can find more of the jump instructions here: Intel x86 JUMP quick reference.
If the types cannot be handled natively, it typically involves a function call that returns a 0/1 (or 0/N, where N can be negative) value, and the compiler then falls back to the if (b) style of instructions to handle the result of that function.
Something like this:
mov eax, a ; eax = a
mov ebx, b ; ebx = b
call function ; call comparison function, result returned in eax
test eax, eax ; set flags based on the value returned in eax
jz notequal ; jump if the function returned zero
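For a C-level picture of what such a helper might look like, here is a hypothetical comparison function for a type the CPU cannot compare directly (the struct and function names are invented for illustration):
#include <stdio.h>
#include <string.h>

/* Hypothetical 128-bit type with no native CPU comparison instruction. */
struct big128 { unsigned char bytes[16]; };

/* Returns non-zero when the two values are equal and zero otherwise, so the
   caller can fall back to the ordinary "is the result zero?" test shown above. */
static int big128_equal(const struct big128 *a, const struct big128 *b)
{
    return memcmp(a->bytes, b->bytes, sizeof a->bytes) == 0;
}

int main(void)
{
    struct big128 x = {{1}}, y = {{2}};
    printf("%d\n", big128_equal(&x, &x)); /* prints 1 */
    printf("%d\n", big128_equal(&x, &y)); /* prints 0 */
    return 0;
}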
If you want to check whether a value is true or false, compare it with zero: all integers except zero count as true, and only zero is false.
I wonder if it's faster for the processor to negate a number or to do a subtraction. For example:
Is
int a = -3;
more efficient than
int a = 0 - 3;
In other words, is a negation equivalent to subtracting from 0? Or is there a special CPU instruction that negates faster than a subtraction?
I suppose that the compiler does not optimize anything.
From the C language point of view, 0 - 3 is an integer constant expression and those are always calculated at compile-time.
Formal definition from C11 6.6/6:
An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts.
Knowing that these are calculated at compile time is important when writing readable code. For example if you want to declare a char array to hold 5 characters and the null terminator, you can write char str[5+1]; rather than 6, to get self-documenting code telling the reader that you have considered null termination.
Similarly when writing macros, you can make use of integer constant expressions to perform parts of the calculation at compile time.
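As a small sketch of that idea (the macro and constant names below are invented for illustration), the whole buffer size here is an integer constant expression, so it is folded at compile time:
#include <stdio.h>

#define HEADER_LEN  4
#define PAYLOAD_LEN 16
#define FRAME_LEN   (HEADER_LEN + PAYLOAD_LEN + 1)   /* +1 for a terminator byte */

int main(void)
{
    char frame[FRAME_LEN];          /* the size is computed entirely by the compiler */
    printf("%zu\n", sizeof frame);  /* prints 21 */
    return 0;
}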
(This answer is about negating a runtime variable, like -x or 0-x where constant-propagation doesn't result in a compile-time constant value for x. A constant like 0-3 has no runtime cost.)
I suppose that the compiler does not optimize anything.
That's not a good assumption if you're writing in C. Both are equivalent for any non-terrible compiler because of how integers work, and it would be a missed-optimization bug if one compiled to more efficient code than the other.
If you actually want to ask about asm, then how to negate efficiently depends on the ISA.
But yes, most ISAs can negate with a single instruction, usually by subtracting from an immediate or implicit zero, or from an architectural zero register.
e.g. 32-bit ARM has an rsb (reverse-subtract) instruction that can take an immediate operand. rsb rdst, rsrc, #123 does dst = 123-src. With an immediate of zero, this is just negation.
x86 has a neg instruction: neg eax is exactly equivalent to eax = 0-eax, setting flags the same way.
3-operand architectures with a zero register (hard-wired to zero) can just do something like MIPS subu $t0, $zero, $t0 to do t0 = 0 - t0. It has no need for a special instruction because the $zero register always reads as zero. Similarly AArch64 removed RSB but has a xzr / wzr 64/32-bit zero register. (Although it also has a pseudo-instruction called neg which subtracts from the zero register).
You could see most of this by using a compiler. https://godbolt.org/z/B7N8SK
But you'd have to actually compile to machine code and disassemble because gcc/clang tend to use the neg pseudo-instruction on AArch64 and RISC-V. Still, you can see ARM32's rsb r0,r0,#0 for int negate(int x){return -x;}
Both are compile time constants, and will generate the same constant initialisation in any reasonable compiler regardless of optimisation.
For example at https://godbolt.org/z/JEMWvS the following code:
void test( void )
{
    int a = -3;
}

void test2( void )
{
    int a = 0-3;
}
Compiled with gcc 9.2 x86-64 -std=c99 -O0 generates:
test:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], -3
nop
pop rbp
ret
test2:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], -3
nop
pop rbp
ret
Using -Os, the code:
void test( void )
{
    volatile int a = -3;
}

void test2( void )
{
    volatile int a = 0-3;
}
generates:
test:
mov DWORD PTR [rsp-4], -3
ret
test2:
mov DWORD PTR [rsp-4], -3
ret
The volatile is necessary to prevent the compiler from removing the unused variables.
As static data it is even simpler:
int a = -3;
int b = 0-3;
outside of a function generates no executable code, just initialised data objects (initialisation is different from assignment):
a:
.long -3
b:
.long -3
Assignment of the above statics:
a = -4 ;
b = 0-4 ;
is still a compiler evaluated constant:
mov DWORD PTR a[rip], -4
mov DWORD PTR b[rip], -4
The take-home here is:
If you are interested, try it and see (with your own compiler or Godbolt set for your compiler and/or architecture),
don't sweat the small stuff, let the compiler do its job,
constant expressions are evaluated at compile time and have no run-time impact,
writing weird code in the belief you can better the compiler is almost always pointless. Compilers work better with idiomatic code the optimiser can recognise.
It's hard to tell whether you are asking if subtraction is faster than negation in a general sense, or about this specific case of implementing negation via subtraction from zero. I'll try to answer both.
General Case
For the general case, on most modern CPUs these operations are both very fast: usually each only taking a single cycle to execute, and often having a throughput of more than one per cycle (because CPUs are superscalar). On all recent AMD and Intel CPUs that I checked, both sub and neg execute at the same speed for register and immediate arguments.
Implementing -x
As regards your specific question of implementing the -x operation, it would usually be slightly faster to implement this with a dedicated neg instruction than with a sub, because with neg you don't have to prepare a zero register first. For example, a negation function int neg(int x) { return -x; } would look something like this with the neg instruction:
neg:
mov eax, edi
neg eax
... while implementing it in terms of subtraction would look something like:
neg:
xor eax, eax
sub eax, edi
Well ... sub didn't come out looking any worse there, but that's mostly a quirk of the calling convention and the fact that x86 uses a one-operand, destructive neg: the result needs to be in eax, so in the neg case one instruction is spent just moving the value into the right register and one doing the negation. The sub version takes two instructions to perform the negation itself: one to zero a register, and one to do the subtraction. It so happens that this lets you avoid the ABI shuffling, because you get to choose the zero register as the result register.
Still, this ABI related inefficiency wouldn't persist after inlining, so we can say in some fundamental sense that neg is slightly more efficient.
Now many ISAs may not have a neg instruction at all, so the question is more or less moot. They may have a hardcoded zero register, so you'd implement negation via subtraction from this register and there is no cost to set up the zero.
I'm picking up assembly language and trying out the IMUL instruction on Ubuntu Eclipse C++, but for some reason I just can't seem to get the desired output from my code.
Required:
Multiply the negative elements of an integer array int_array by a specified integer inum
Here's my code for the above:
C code:
#include <stdio.h>
extern void multiply_function();
// Variables
int iaver, inum;
int int_ar[10] = {1,2,3,4,-9,6,7,8,9,10};
int main()
{
    inum = 2;
    multiply_function();
    for (int i = 0; i < 10; i++) {
        printf("%d ", int_ar[i]);
    }
}
ASM code:
extern int_ar
extern inum
global multiply_function
multiply_function:
enter 0,0
mov ecx, 10
mov eax, inum
multiply_loop:
cmp [int_ar +ecx*4-4], dword 0
jg .ifpositive
mov ebx, [int_ar +ecx*4-4]
imul ebx
cdq
mov [int_ar +ecx*4-4], eax
loop multiply_loop
leave
ret
.ifpositive:
loop multiply_loop
leave
ret
The Problem
For an array of {1,2,3,4,-9,6,7,8,9,10} and inum = 2, I get the output {1,2,3,4,-1210688460,6,7,8,9,10}, which hints at some sort of overflow occurring.
Is there something I'm missing or have misunderstood about how the IMUL instruction works in x86 assembly?
Expected Output
The output I expected is {1,2,3,4,-18,6,7,8,9,10}
My Thought Process
My thought process for the above task:
1) Find which elements of the array are negative; for each positive element found, do nothing and continue the loop to the next element:
cmp [int_ar +ecx*4-4], dword 0
jg .ifpositive
.ifpositive:
loop multiply_loop
leave
ret
2) Upon finding a negative element, move its value into register EBX, which will serve as SRC in the IMUL SRC instruction, then sign-extend register EAX into EDX:EAX, where the result is stored:
mov ebx, [int_ar +ecx*4-4]
imul ebx
cdq
3) Move the result into the negative element of the array by using MOV:
mov [int_ar +ecx*4-4], eax
4) Loop through to the next array element and repeat the above 1)-3)
Reason for Incorrect Values
If we look past the inefficiencies and unneeded code and deal with the real issue it comes down to this instruction:
mov eax, inum
What is inum? You created and initialized a global variable in C called inum with:
int iaver, inum;
[snip]
inum = 2;
inum as a variable is essentially a label for a memory location containing an int (a 32-bit value). In your assembly code you need to treat inum as a pointer to a value, not the value itself, so you need to change:
mov eax, inum
to:
mov eax, [inum]
What your version does is move the address of inum into EAX. Your code ended up multiplying the address of the variable by the negative numbers in your array, and that causes the erroneous values you see. The square brackets around inum tell the assembler that you want to treat inum as a memory operand, and that you want to move the 32-bit value at inum into EAX.
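As a rough C analogy of the same distinction (purely illustrative, using a pointer to stand in for the label):
#include <stdio.h>

int inum = 2;

int main(void)
{
    int *label_inum = &inum;        /* what "mov eax, inum" gives you in NASM: the address of inum */
    int value_inum  = *label_inum;  /* what "mov eax, [inum]" gives you: the 32-bit value, 2       */
    printf("%p %d\n", (void *)label_inum, value_inum);
    return 0;
}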
Calling Convention
You appear to be creating a 32-bit program and running it on 32-bit Ubuntu. I can infer the possibility of 32-bit Linux from the erroneous value of -1210688460 being returned: -1210688460 = 0xB7D65C34; divide by -9 and you get 0x804A06C. Programs on 32-bit Linux are usually loaded starting at 0x8048000.
Whether running on 32-bit Linux or 64-bit Linux, assembly code linked with 32-bit C/C++ programs need to abide by the CDECL calling convention:
cdecl
The cdecl (which stands for C declaration) is a calling convention that originates from the C programming language and is used by many C compilers for the x86 architecture. In cdecl, subroutine arguments are passed on the stack. Integer values and memory addresses are returned in the EAX register, floating point values in the ST0 x87 register. Registers EAX, ECX, and EDX are caller-saved, and the rest are callee-saved. The x87 floating point registers ST0 to ST7 must be empty (popped or freed) when calling a new function, and ST1 to ST7 must be empty on exiting a function. ST0 must also be empty when not used for returning a value.
Your code clobbers EAX, EBX, ECX, and EDX. You are free to destroy the contents of EAX, ECX, and EDX, but you must preserve EBX; if you don't, you can cause problems for the C code calling the function. After the enter 0,0 instruction you can push ebx, and just before each leave instruction you can pop ebx.
If you were to use -O1, -O2, or -O3 GCC compiler options to enable optimizations your program may not work as expected or crash altogether.
Can anyone explain what this code does? I kind of understand it, but I don't quite understand what happens when the code label appears below "loop N-Not-1". I'm not sure I understand loops correctly; I think of them as do-while loops in C++. In this case, wouldn't the loop for N-is-1 continue indefinitely? I thought this was an if-else statement and not a loop?
Write a piece of code that computes the function below:
if (N = 1) then Y = -X
else
Y=X
Assume that the value of X is in the eax register. Also assume that the value of N is in the
ebx register. The computed value of Y needs to be placed in the eax register.
Hint 1: Use a loop instruction in your code.
Hint 2: This problem can be solved using less than five instructions.
; eax = X, ebx = N
; Write your code below
mov ecx, ebx
loop N-not-1
N-is-1: neg eax
N-not-1: ; Y = eax
The loop instruction operates on the value of ECX. It decrements ECX first and checks whether it is zero: if it is not zero, it jumps to the specified address; if it is zero, execution simply falls through to the next instruction.
mov ecx, ebx ; this instruction moves the value of N to ecx
loop N-not-1 ; if N is 1 then, after the decrement, ecx is 0, so no jump is taken and execution falls through
N-is-1: neg eax ; if N is 1, eax gets negated because the loop instruction fell through
N-not-1: ; Y = eax // if N is not 1, eax remains unchanged
First of all, this code is very badly presented. I am copying it here without the clutter:
mov ecx, ebx
loop N-not-1
neg eax
N-not-1:
This code is a hack. It does not actually loop. It just makes use of the fact that the loop instruction does three things in one instruction: decrement ecx, check whether it is zero, and jump. It is equivalent to the following:
dec ebx
cmp ebx, 0
jnz N-not-1
neg eax
N-not-1:
The loop instruction behaves like the bottom of the following C do-while loop:
do {
    /* instructions */
} while (--ecx != 0);
and your code uses the loop instruction to check whether ebx (N) was 1, i.e. whether ecx becomes zero after the decrement:
ebx = N;
ecx = ebx;
ecx--;              /* this is what loop does to ecx */
if (ecx == 0) {     /* loop did not jump, so execution falls through... */
    eax = -eax;     /* ...and eax is negated; this happens only when N == 1 */
}
Y = eax;
I used C-like code, but in the real assembly there are no blocks and everything is controlled with labels.
I've been banging my head against the wall figuring this out, and this is making no sense to me...
Why does this program enter an infinite loop?!
I thought you could use test to compare two values for equality, as shown here... why doesn't it work?
int main()
{
    __asm
    {
        mov EAX, 1;
        mov EDX, EAX;
        test EAX, EDX;
    L:  jne L;
    }
}
Your expectation of what the TEST instruction does is incorrect.
The instruction is used to perform bit tests. You would typically use it to "test" if certain bits are set given a mask. It would be used in conjunction with the JZ (jump if zero) or JNZ (jump if not zero) instructions.
The test involves performing a bitwise-AND on the two operands and sets the appropriate flags (discarding the result). If none of the corresponding bits in the mask are set, then the ZF (zero flag) will be 1 (all bits are zero). If you wanted to test if any were set, you'd use the JNZ instruction. If you wanted to test if none were set, you'd use the JZ instruction.
The JE and JNE are not appropriate for this instruction because they interpret the flags differently.
You are trying to perform an equality check on some variables. You should be using the CMP instruction. You would typically use it to compare values with each other.
The comparison effectively subtracts the operands and only sets the flags (discarding the result). When equal, the difference of the two values is 0 (ZF = 1). When not equal, the difference of the two values is non-zero (ZF = 0). If you wanted to test if they were equal, you'd use the JE (jump if equal) instruction. If you wanted to test if they were not equal, you'd use the JNE (jump if not equal) instruction.
In this case, since you used TEST, the resulting flags would yield ZF = 0 (0x1 & 0x1 = 0x1, non-zero). Since ZF = 0, the JNE instruction would take the branch as you are seeing here.
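As a C-level model of the difference (just a sketch of how the zero flag comes out in each case, not what a compiler would actually emit): TEST derives the flags from a bitwise AND, while CMP derives them from a subtraction.
#include <stdio.h>

int main(void)
{
    unsigned a = 1, b = 1;
    /* TEST-style: flags come from (a & b); here ZF = 0, so JNE/JNZ would branch.  */
    printf("TEST would give ZF = %d\n", (a & b) == 0);
    /* CMP-style: flags come from (a - b); here ZF = 1, so JNE would fall through. */
    printf("CMP would give ZF = %d\n", (a - b) == 0);
    return 0;
}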
tl;dr
You need to compare the values using the CMP instruction if you are checking for equality, not TEST them.
int main()
{
    __asm
    {
        mov EAX, 1
        mov EDX, EAX
        cmp EAX, EDX
    L:  jne L ; no more infinite loop
    }
}
Just reading this (my asm is very rusty) and this:
JNE jumps when ZF (the zero flag) = 0.
TEST sets ZF = 0 if the bitwise EAX AND EDX is non-zero, and ZF = 1 if the bitwise AND is 0:
If the result of the AND is 0, the ZF is set to 1, otherwise set to 0.
Therefore it jumps, because 1 AND 1 leaves ZF = 0.
Seems logical, yet counter-intuitive.
I think #A.Webb is right - it should probably be JNZ if you're using the TEST instruction, since you're relying on the behaviour of a bitwise operation to set the zero flag, whereas the SUB performed by a CMP would set the zero flag as you need.
This is pretty simple. You obviously need to know what the instructions do and what processor state they read and write. When in doubt, get a reference manual; the Intel x86 manuals are easy to find online.
Your specific program:
mov EAX, 1;
moves the constant 1 to EAX. No other state changes occur.
mov EDX, EAX;
copies the contents of EAX into EDX, so it too contains the value 1.
test EAX, EDX;
test computes the bitwise AND of two registers (did you check the reference manuals?), throws the answer away, and sets condition code bits based on that answer. In your case, the upper 31 bits of each register are zero, and ANDing them produces zeros. The least significant bit is one in both registers, and ANDing those produces a 1. The net effect is that the 32-bit binary value one is generated and thrown away after the condition code bits are set. There is one condition code bit we care about for this program, and that's the "Z"(ero) bit, which is set if the last condition-code-setting operation produced a full zero value. This test produced one, so the Z bit is reset. I'll let you look up the other condition code bits.
L: jne L;
This is a "Jmp on Not Equal", e.g, it jmps if the Z bit is reset. For your program, Z is reset, the jmp occurs. After execution, the processor is at the same insruction, and sees another (the same jmp). The condition code bits aren't changed by a jmp instruction.
So... it goes into an infinite loop.
There are lots of synonyms for various opcodes supported by assemblers. For instance, "JZ" and "JE" are synonyms for the same instruction. Don't let the synonyms confuse.
I would like to understand how cmp and je/jg work in assembly. I saw a few examples on Google but I am still a little bit confused. Below I have shown part of the assembly code that I am trying to convert to C, and the corresponding C code. Is it implemented in the right way, or do I have a wrong understanding of how cmp works?
cmp $0x3,%eax
je A
cmp $0x3,%eax
jg B
cmp $0x1,%eax
je C
int func(int x){
    if (x == 3)
        goto A;
    if (x > 3)
        goto B;
    if (*x == 1)
        goto C;
A:
    ......
B:
    ......
C:
    ......
You understand correctly how cmp and je/jg work, but you have an error in your C code. This line:
if (*x == 1)
should be
if (x == 1)
Here is a pretty good summary of the x86 control flow instructions.
Also, there's no reason to repeat the cmp instruction for the same values. Once you've executed it, you can test the results multiple ways without repeating the comparison. So your assembly code should look like this:
cmp $0x3,%eax
je A
jg B
cmp $0x1,%eax
je C
Yes, that's correct, except that in your C code you have *x in the third comparison but x in the others, which does not make sense; there is no corresponding dereference in your assembly code.
In C the variable type (signed/unsigned) is defined when declaring the variable, e.g. int x or unsigned int x, but in assembly the distinction between signed and unsigned variables (be they in memory or in registers) is made for comparisons by using different conditional jumps (a short C illustration follows the lists below):
For signed variables:
jg ; jump if greater
jl ; jump if less
jge ; jump if greater or equal, "jnl" is synonymous
jle ; jump if less or equal, "jng" is synonymous
For unsigned variables:
ja ; jump if above
jb ; jump if below
jae ; jump if above or equal, "jnb" is synonymous
jbe ; jump if below or equal, "jna" is synonymous
Intel x86 JUMP quick reference lists all conditional jumps available in x86 assembly, together with their conditions (flags' values) and their opcodes for short and long jumps.
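As a small C sketch of why that distinction matters (the variable names are arbitrary): the same bit pattern compares differently depending on signedness, which is why the compiler picks jl/jg-style jumps for signed operands and jb/ja-style jumps for unsigned ones.
#include <stdio.h>

int main(void)
{
    int      si = -1;
    unsigned ui = 1u;

    /* Signed comparison: typically compiled with jl/jg-style jumps; -1 < 1. */
    printf("%d\n", si < (int)ui);        /* prints 1 */

    /* Unsigned comparison: typically compiled with jb/ja-style jumps;
       -1 converts to 0xFFFFFFFF, which is not below 1.                      */
    printf("%d\n", (unsigned)si < ui);   /* prints 0 */

    return 0;
}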
As you may already know, the processor keeps track of what happened during the last operations in a so-called flags register.
For example, there is a flag telling whether an operation overflowed, whether the result was zero, and so on. The cmp mnemonic tells the processor to subtract its two operands (two registers, or a register and a memory location), discard the result, and set the appropriate flags.
After that, you can jump using the conditional jumps you have seen. The processor checks the flags to see whether the values were equal (je checks the zero flag), or whether one was smaller or bigger (the carry flag for unsigned comparisons, and the sign and overflow flags for signed ones).