conditional jumps -- comparing c code to assembly - c

I am trying to compare a c function code to the equivalent of the assembly and kind of confused on the conditional jumps
I looked up jl instruction and it says jump if < but the answer to the question was >= Can someone explain why is that?

To my understanding, the condition is inverted, but the logic is the same; the C source defines
if the condition is satisfied, execute the following block
whereas the assembly source defines
if the condition is violated, skip the following block
which means that the flow of execution will be the same in both implementations.

In essence, what this assembly is doing, is executing your condition as you set it, but using negative logic.
Your condition says:
If a is smaller then b, return x. Otherwise, return y.
What the assembly code says (simplified):
Move y into the buffer for returning. Move b into a different buffer.
If a is bigger then b, jump ahead to the return step. Then y is
returned. If a is not bigger then b, continue in the program. The next
step assigns x to the return buffer. The step after that returns as
normal.
The outcome is the same, but the process is slightly different.

the assembly does, line by line (code not included, because you posted it as image):
foo:
return_value (eax) = y; // !!!
temporary_edx = b; // x86 can't compare memory with memory, so "b" goes to register
set_flags_by(a-b); // cmp does subtraction and discards result, except flags
"jump less to return" // so when a < b => return y (see first line)
return_value (eax) = x;
return
so to make that C code do the same thing, you need:
if (a >= b) { return x; } else { return y; }
BTW, see how easy it is to flip:
if (a < b) { return y; } else { return x; }
So there's no point to translate jl into "less" into C, you have to track down each branch, what really happens, and find for each branch of calculation the correct C-side calculation, and then "create" the condition in C to get the same calculation on both sides, so this task is not about "translating" the assembly, but about deciphering the asm logic + rewriting it back in C. Looks like you sort of completely missed the point and expected you can get away with some simple "match pattern" translation, while you have to work it out fully.

Related

gcc optimization: Increment on condition

I've noticed that gcc12 does not optimize these two functions to the same code (with -O3):
int x = 0;
void f(bool a)
{
if (a) {
++x;
}
}
void f2(bool a)
{
x += a;
}
Basically no transformation is done. That can be seen here: https://godbolt.org/z/1G3n4fxEK
Optimizing f to the code in f2 seems to be trivial and no jump would be needed anymore. However, I'm curious if there's a reason why this is not done by gcc? Is it somehow still slower or something? I would assume it's never slower and sometimes faster, but I might be wrong.
Thanks
Such a substitution would be incorrect in a scenario where one thread calls f(1) while another thread calls f(0). If x is never actually accessed outside the first thread, there would be no race condition in the code as written, but the substitution would create one. If x is initially 1, nothing would prevent the code from being processed as:
thread 1: read x (yields 1)
thread 2: read x (yields 1)
thread 1: write 2
thread 2: write 1
This would cause x to be left holding the value 1 when thread 2 has just written the value 2. Worse than that, if the function was invoked within a context like:
x = 1;
f(1);
if (x != 1)
launch_nuclear_missiles_if_x_is_1_and_otherwise_make_coffee();
a compiler might recognize that x will always equal 2 following the return from f(1), and thus make the function call unconditional.
To be sure, such substitution would rarely cause problems in real-world situations, but the Standard explicitly forbids transformations that could create race conditions where none would exist in the source code as written.
I would have hoped the compiler would have changed f2 to f. Reading and writing memory may require slow transactions to acquire a copy of the memory location, and update other bus controllers about the state of that location (Invalid -> Shared -> Modified).
Jumping around an update based on a register value is quite cheap; especially with the efficacy of branch predictors.
A much simpler reason why that optimization isn't done is because bool is nothing but an alias for int. Specifically, nothing stops you from passing an arbitrary integer to your function:
int v = 5;
f2(*(bool *)&v);
// x = 5 here

C/C++ programming about read/wri

I am starting to figure out some basic ideas about memory reading and writing (let's assume that the data we read or write have not been cached yet).
For the following code:
int a = 1;
It's definitely a write, since we write the value '1' to the memory place of variable 'a'.
But for the following code:
int a, b;
a = 1;
b = a;
When we execute the statement "b = a;", do we actually perform one read and one write?
To my understandings, I think it's one read and one write, since we have to load the value of 'a' first and then write the value to 'b'.
Not sure if my understandings are correct. Please help me to clarify these basic ideas.
Many thanks for the help.
Let's assume that the data we read or write have not been cached yet)
I don't see how cache is pertinent to this.
When we execute the statement "b = a;", do we actually perform one read and one write?
Correct.
However, C is not like the assembly language. C instructions don't map 1-to-1 to machine instructions. There is the as-if rule. Basically the compiler can generate whatever machine code as long as the observable behavior of the program is preserved.
For instance:
auto foo()
{
int a = 24;
int b = 11;
int c = a + b;
return c;
}
C compiler is free to compile the above to
foo():
mov eax, 35
ret
And compilers do actually do this (with optimizations enabled). As you can see there is no memory read/write. Just just a write to the eax register (where the return of the function must be put). And the value is an imediate (35).

Function optimized to infinite loop at 'gcc -O2'

Context
I was asked the following puzzle by one of my friends:
void fn(void)
{
/* write something after this comment so that the program output is 10 */
/* write something before this comment */
}
int main()
{
int i = 5;
fn();
printf("%d\n", i);
return 0;
}
I know there can be multiple solutions, some involving macro and some assuming something about the implementation and violating C.
One particular solution I was interested in is to make certain assumptions about stack and write following code: (I understand it is undefined behavior, but may work as expected on many implementations)
void fn(void)
{
/* write something after this comment so that the program output is 10 */
int a[1] = {0};
int j = 0;
while(a[j] != 5) ++j; /* Search stack until you find 5 */
a[j] = 10; /* Overwrite it with 10 */
/* write something before this comment */
}
Problem
This program worked fine in MSVC and gcc without optimization. But when I compiled it with gcc -O2 flag or tried on ideone, it loops infinitely in function fn.
My Observation
When I compiled the file with gcc -S vs gcc -S -O2 and compared, it clearly shows gcc kept an infinite loop in function fn.
Question
I understand because the code invokes undefined behavior, one can not call it a bug. But why and how does compiler analyze the behavior and leave an infinite loop at O2?
Many people commented to know the behavior if some of the variables are changed to volatile. The result as expected is:
If i or j is changed to volatile, program behavior remains same.
If array a is made volatile, program does not suffer infinite loop.
Moreover if I apply the following patch
- int a[1] = {0};
+ int aa[1] = {0};
+ int *a = aa;
The program behavior remains same (infinite loop)
If I compile the code with gcc -O2 -fdump-tree-optimized, I get the following intermediate file:
;; Function fn (fn) (executed once)
Removing basic block 3
fn ()
{
<bb 2>:
<bb 3>:
goto <bb 3>;
}
;; Function main (main) (executed once)
main ()
{
<bb 2>:
fn ();
}
Invalid sum of incoming frequencies 0, should be 10000
This verifies the assertions made after the answers below.
This is undefined behavior so the compiler can really do anything at all, we can find a similar example in GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks, where gcc takes a loop with undefined behavior and optimizes it to:
L2:
jmp .L2
The article says (emphasis mine):
Of course this is an infinite loop. Since SATD() unconditionally
executes undefined behavior (it’s a type 3 function), any
translation (or none at all) is perfectly acceptable behavior for a
correct C compiler. The undefined behavior is accessing d[16] just
before exiting the loop. In C99 it is legal to create a pointer to
an element one position past the end of the array, but that pointer
must not be dereferenced. Similarly, the array cell one element past
the end of the array must not be accessed.
which if we examine your program with godbolt we see:
fn:
.L2:
jmp .L2
The logic being used by the optimizer probably goes something like this:
All the elements of a are initialized to zero
a is never modified before or within the loop
So a[j] != 5 is always true -> infinite loop
Because of the infinite, the a[j] = 10; is unreachable and so that can be optimized away, so can a and j since they are no longer needed to determine the loop condition.
which is similar to the case in the article which given:
int d[16];
analyzes the following loop:
for (dd=d[k=0]; k<16; dd=d[++k])
like this:
upon seeing d[++k], is permitted to assume that the incremented value
of k is within the array bounds, since otherwise undefined behavior
occurs. For the code here, GCC can infer that k is in the range 0..15.
A bit later, when GCC sees k<16, it says to itself: “Aha– that
expression is always true, so we have an infinite loop.”
Perhaps an interesting secondary point, is whether an infinite loop is considered observable behavior(w.r.t. to the as-if rule) or not, which effects whether an infinite loop can also be optimized away. We can see from C Compilers Disprove Fermat’s Last Theorem that before C11 there was at least some room for interpretation:
Many knowledgeable people (including me) read this as saying that the
termination behavior of a program must not be changed. Obviously some
compiler writers disagree, or else don’t believe that it matters. The
fact that reasonable people disagree on the interpretation would seem
to indicate that the C standard is flawed.
C11 adds clarification to section 6.8.5 Iteration statements and is covered in more detail in this answer.
In the optimized version, the compiler has decided a few things:
The array a doesn't change before that test.
The array a doesn't contain a 5.
Therefore, we can rewrite the code as:
void fn(void) {
int a[1] = {0};
int j = 0;
while(true) ++j;
a[j] = 10;
}
Now, we can make further decisions:
All the code after the while loop is dead code (unreachable).
j is written but never read. So we can get rid of it.
a is never read.
At this point, your code has been reduced to:
void fn(void) {
int a[1] = {0};
while(true);
}
And we can make the note that a is now never read, so let's get rid of it as well:
void fn(void) {
while(true);
}
Now, the unoptimized code:
In unoptimized generated code, the array will remain in memory. And you'll literally walk it at runtime. And it's possible that there will be a 5 thats readable after it once you walk past the end of the array.
Which is why the unoptimized version sometimes doesn't crash and burn.
If the loop does get optimized out into an infinite loop, it could be due to static code analyzis seeing that your array is
not volatile
contains only 0
never gets written to
and thus it is not possible for it to contain the number 5. Which means an infinite loop.
Even if it didn't do this, your approach could fail easily. For example, it's possible that some compiler would optimize your code without making your loop infinite, but would stuff the contents of i into a register, making it unavailable from the stack.
As a side note, I bet what your friend actually expected was this:
void fn(void)
{
/* write something after this comment so that the program output is 10 */
printf("10\n"); /* Output 10 */
while(1); /* Endless loop, function won't return, i won't be output */
/* write something before this comment */
}
or this (if stdlib.h is included):
void fn(void)
{
/* write something after this comment so that the program output is 10 */
printf("10\n"); /* Output 10 */
exit(0); /* Exit gracefully */
/* write something before this comment */
}

if statement in C & assignment without a double call

Would it be possible to implement an if that checks for -1 and if not negative -1 than assign the value. But without having to call the function twice? or saving the return value to a local variable. I know this is possible in assembly, but is there a c implementation?
int i, x = -10;
if( func1(x) != -1) i = func1(x);
saving the return value to a local variable
In my experience, avoiding local variables is rarely worth the clarity forfeited. Most compilers (most of the time) can often avoid the corresponding load/stores and just use registers for those locals. So don't avoid it, embrace it! The maintainer's sanity that gets preserved just might be your own.
I know this is possible in assembly, but is there a c implementation?
If it turns out your case is one where assembly is actually appropriate, make a declaration in a header file and link against the assembly routine.
Suggestion:
const int x = -10;
const int y = func1(x);
const int i = y != -1
? y
: 0 /* You didn't really want an uninitialized value here, right? */ ;
It depends whether or not func1 generates any side-effects. Consider rand(), or getchar() as examples. Calling these functions twice in a row might result in different return values, because they generate side effects; rand() changes the seed, and getchar() consumes a character from stdin. That is, rand() == rand() will usually1 evaluate to false, and getchar() == getchar() can't be predicted reliably. Supposing func1 were to generate a side-effect, the return value might differ for consecutive calls with the same input, and hence func1(x) == func1(x) might evaluate to false.
If func1 doesn't generate any side-effect, and the output is consistent based solely on the input, then I fail to see why you wouldn't settle with int i = func1(x);, and base logic on whether or not i == -1. Writing the least repetitive code results in greater legibility and maintainability. If you're concerned about the efficiency of this, don't be. Your compiler is most likely smart enough to eliminate dead code, so it'll do a good job at transforming this into something fairly efficient.
1. ... at least in any sane standard library implementation.
int c;
if((c = func1(x)) != -1) i = c;
The best implementation I could think of would be:
int i = 0; // initialize to something
const int x = -10;
const int y = func1(x);
if (y != -1)
{
i = y;
}
The const would let the compiler to any optimizations that it thinks is best (perhaps inline func1). Notice that func is only called once, which is probably best. The const y would also allow y to be kept in a register (which it would need to be anyway in order to perform the if). If you wanted to give more of a suggestion, you could do:
register const int y = func1(x);
However, the compiler is not required to honor your register keyword suggestion, so its probably best to leave it out.
EDIT BASED ON INSPIRATION FROM BRIAN'S ANSWER:
int i = ((func1(x) + 1) ?:0) - 1;
BTW, I probably wouldn't suggest using this, but it does answer the question. This is based on the SO question here. To me, I'm still confused as to the why for the question, it seems like more of a puzzle or job interview question than something that would be encountered in a "real" program? I'd certainly like to hear why this would be needed.

Should I remove unnecessary `else` in `else if`?

Compare the two:
if (strstr(a, "earth")) // A1
return x;
if (strstr(a, "ear")) // A2
return y;
and
if (strstr(a, "earth")) // B1
return x;
else if (strstr(a, "ear")) // B2
return y;
Personally, I feel that else is redundant and prevent CPU from branch prediction.
In the first one, when executing A1, it's possible to pre-decode A2. And in the second one, it will not interpret B2 until B1 is evaluated to false.
I found a lot of (maybe most of?) sources using the latter form.
Though, the latter form looks better to understand, because it's not so obviously that it will call return y only if a =~ /ear(?!th)/ without the else clause.
Your compiler probably knows that both these examples mean exactly the same thing. CPU branch prediction doesn't come into it.
I usually would choose the first option for symmetry.
(The following answers the original version of the question.)
Do you realize that the two code snippets are NOT semantically equivalent???
Consider what happens if a is "earth".
The first snippet calls foo() and then bar().
The second snippet calls foo() and skips the bar() call.
And this explains why the generated machine code is different. It has to be to implement the different semantics of the respective code fragments!
Personally, I feel that else is redundant ...
Unfortunately, your feeling is incorrect.
Lesson - write your code simply and clearly and leave optimization to the compiler ... which is going to do a far more accurate job than you can achieve.
FOLLOWUP
The snippets in the updated version of the question are now semantically identical, and the else is redundant. However:
any half decent optimizing compiler will generate identical code for the two snippets, and
it is a matter of opinion (i.e. subjective) which of the snippets is easier to understand.
Use else if to state your intentions clearly. Code is meant to be read by humans.
Let the compiler optimize this, and don't worry about optimization until your code is 1) working 2) crystal clear 3) profiled (do this in that order). When doing step 3, you'll notice that the bottlenecks are not where you supposed they would be.
Any attempt to control branch prediction or whatever low level stuff is silly: compilers are very good at optimizing and they use sophisticated methods to yield a fast code on your particular machine.
Look at output from LLVM based compilers to see what I mean: sometimes you can't even remotely understand what it does.
usually it's better to use the second way if you want to test exactly the condition for a, for the exact solution, to reduce the options for the var or const "a". if you write two separate if's you can get 2 different solutions.
for example in your situation with the exact conditions you have there let's say a= -2
A: if (a < 0)
return x; // if -2 is less than 0 will return x and it stops.
else if (a < 100)
return y; //
B: if (a < 0)
return x; // -2 is less than 0 so it will return x and passes to the next if statement;
if (a < 100)
return y; // -2 is also less than 100 and it will return y too
Why not simply write
char* str;
strstr(a, "ear")
if (str != NULL)
{
foo();
if(strstr(str, "earth") != NULL)
{
bar();
}
}

Resources