Why Valac generates these (meaningless?) temp pointers in C code

I have started studying Vala, and I don't understand why, in these examples, the variable tmp1 is created when tmp0 could have been used directly.
The same goes for tmp1 and tmp3 here.
I read the documentation a bit but didn't understand why valac generates these temporary pointers.
https://wiki.gnome.org/Projects/Vala/Hacking#Documentation
I really want to understand how the Vala compiler works. For now I think it relies heavily on the optimization that happens in gcc with -O3, which apparently is enabled by default. I tried compiling with the -O3 flag and without it, and the binaries were the same size.

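The main reason is to avoid undefined behavior. In C, the order in which function arguments are evaluated is unspecified, and modifying the same variable twice without sequencing between the modifications is undefined. For example, if you have something like
int x = 1;
foo(x++, x++);
In practice you could end up calling foo(1, 2) or foo(2, 1), and the C standard guarantees neither.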
In Vala, the order is defined; it will be foo(1, 2). To do this, Vala sometimes needs to use temporary variables, so the code turns into something like:
int x = 1;
int tmp0 = x++;
int tmp1 = x++;
foo(tmp0, tmp1);
To keep the code generator simple, the temporary variables are just always generated.
Any C compiler will optimize the temporary variables away easily (you don't need -O3, -O1 is more than enough for this), so there isn't much reason to change valac to eliminate the temporary variables. The only real downside is that the generated code is a little uglier.
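A minimal sketch of the same idea in plain C, using function calls instead of x++ so that every outcome stays well defined. The names next_id, foo, call_direct and call_with_temps are made up for this illustration and are not what valac emits.

```c
static int counter = 0;

/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
static int next_id(void)
{
    return ++counter;
}

/* Hypothetical callee: packs both arguments into one number so the
   order in which they were produced is visible in the result. */
static int foo(int a, int b)
{
    return a * 10 + b;
}

/* The two next_id() calls may run in either order, so this returns
   12 with some compilers and 21 with others. */
int call_direct(void)
{
    counter = 0;
    return foo(next_id(), next_id());
}

/* The temporaries force left-to-right evaluation, so this always
   returns 12. This mirrors the shape of the code valac generates. */
int call_with_temps(void)
{
    counter = 0;
    int tmp0 = next_id();
    int tmp1 = next_id();
    return foo(tmp0, tmp1);
}
```

Only call_with_temps has a result you can rely on across compilers; call_direct is allowed to return either value.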

Related

literal division at compile time

Assume the following code:
static int array[10];
int main ()
{
    for (int i = 0; i < (sizeof(array) / sizeof(array[0])); i++)
    {
        // ...
    }
}
The result of sizeof(array) / sizeof(array[0]) should in theory be known at compile time and set to some value depending on the size of int. Even so, will the compiler perform the division at run time on every iteration of the for loop?
To avoid that, does the code need to be adjusted as:
static int array[10];
int main ()
{
    static const int size = sizeof(array) / sizeof(array[0]);
    for (int i = 0; i < size; i++)
    {
        // ...
    }
}

You should write the code in whatever way is most readable and maintainable for you. (I'm not making any claims about which one that is: it's up to you.) The two versions of the code you wrote are so similar that a good optimizing compiler should probably produce equally good code for each version.
You can click on this link to see what assembly your two different proposed codes generate in various compilers:
https://godbolt.org/z/v914qYY8E
With GCC 11.2 (targeting x86_64) and with minimal optimizations turned on (-O1), both versions of your main function produce exactly the same assembly code. With optimizations turned off (-O0), the assembly is slightly different, but the size calculation is still done at compile time for both.
Even if you doubt what I am saying, it is still better to use the more readable version as a starting point. Only change it to the less readable version if you find an actual example of a programming environment where doing that would provide a meaningful speed increase for your application. Avoid wasting time with premature optimization.

Even though, will the compiler do the manual division in run time each time the for loop iterates?
No. It's an integer constant expression which will be calculated at compile-time. Which is why you can even do this:
int some_other_array [sizeof(array) / sizeof(array[0])];
To avoid that, does the code need to be adjusted as
No.
See for yourself: https://godbolt.org/z/rqv15vW6a. Both versions produced 100% identical machine code, each one containing a mov ebx, 10 instruction with the pre-calculated value.
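One way to check this without reading assembly is to put the expression somewhere the language requires a constant. The sketch below assumes a C11 compiler (for _Static_assert); ARRAY_LEN and sum_array are names invented for this illustration.

```c
static int array[10];

/* An enumerator must be an integer constant expression, so this line
   only compiles if the division is evaluated by the compiler. */
enum { ARRAY_LEN = sizeof(array) / sizeof(array[0]) };

/* Checked during compilation; it generates no run-time code. */
_Static_assert(ARRAY_LEN == 10, "element count is known at compile time");

/* Hypothetical helper: sums the array using the compile-time length. */
int sum_array(void)
{
    int sum = 0;
    for (int i = 0; i < ARRAY_LEN; i++)
        sum += array[i];
    return sum;
}
```

If the division were not a constant expression, the enum declaration would be rejected, so a successful build already answers the question.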

Global variable reads inside tight loops in C

Say I have a tight loop in C, within which I use the value of a global variable to do some arithmetics, e.g.
double c;
// ... initialize c somehow ...
double f(double*a, int n) {
    double sum = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        sum += a[i]*c;
    }
    return sum;
}
with c the global variable. Is c "read anew from global scope" in each loop iteration? After all, it could've been changed by some other thread executing some other function, right? Hence would the code be faster by taking a local (function stack) copy of c prior to the loop and only use this copy?
double f(double*a, int n) {
    double sum = 0.0;
    int i;
    double c_cp = c;
    for (i = 0; i < n; i++) {
        sum += a[i]*c_cp;
    }
    return sum;
}
Though I haven't specified how c is initialized, let's assume it's done in some way such that the value is unknown at compile time. Also, c is really a constant throughout runtime, i.e. I as the programmer know that its value won't change. Can I let the compiler in on this information, e.g. using static double c in the global scope? Does this change the a[i]*c vs. a[i]*c_cp question?
My own research
Reading e.g. the "Global variables" section of this, it seems clear that taking a local copy of the global variable is the way to go. However, they want to update the value of the global variable, whereas I only ever want to read its value.
Using godbolt I fail to notice any real difference in the assembly for both c vs. c_cp and double c vs. static double c.

Any decently smart compiler will optimize your code so that it behaves like your second code snippet. Using static won't change much, but if you want to force a read on each iteration, use volatile.
Great point there about changes from a different thread. The compiler will maintain the integrity of your code as far as single-threaded execution goes. That means it can reorder your code, skip something, or add something, as long as the end result is still the same.
With multiple threads it is your job to ensure that things still happen in a specific order, not just that the end result is right. The way to ensure that is with memory barriers. It's a fun topic to read about, but one that is best avoided unless you're an expert.
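To make the two cases concrete, here is a small sketch. The names c_volatile, f_local_copy and f_volatile are invented for this illustration. Note that volatile only forces the loads to happen; it does not make the access safe across threads.

```c
double c;                    /* plain global, as in the question */
volatile double c_volatile;  /* hypothetical volatile counterpart */

/* Explicit local copy: c is read once, before the loop. An optimizing
   compiler is free to do the same for the plain global, because
   nothing in the loop can modify c. */
double f_local_copy(const double *a, int n)
{
    double sum = 0.0;
    double c_cp = c;
    for (int i = 0; i < n; i++)
        sum += a[i] * c_cp;
    return sum;
}

/* volatile: the compiler must load c_volatile from memory on every
   iteration and may not cache it in a register across iterations. */
double f_volatile(const double *a, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += a[i] * c_volatile;
    return sum;
}
```

Both functions compute the same sum when nothing else writes to the globals; they differ only in how often the global is loaded.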

Once everything is translated to machine code, you will get no difference whatsoever. If c is global, any access to c will reference the address of c or, most probably in a tight loop, c will be kept in a register, or in the worst case in the L1 cache.
On a Linux machine you can easily generate the assembly and examine the resultant code.
You can also run benchmarks.

Inconsistent undefined behavior

For a class, I wanted to demonstrate undefined behavior with goto to the students. I came up with the following program:
#include <stdio.h>
int main()
{
    goto x;
    for(int i = 0; i < 10; i++)
        x: printf("%d\n", i);
    return 0;
}
I would expect the compiler (gcc version 4.9.2) to warn me about the access to i being undefined behavior, but there is no warning, not even with:
gcc -std=c99 -Wall -Wextra -pedantic -O0 test.c
When running the program, i is apparently initialized to zero. To understand what is happening, I extended the code with a second variable j:
#include <stdio.h>
int main()
{
    goto x;
    for(int i = 0, j = 1; i < 10; i++)
        x: printf("%d %d\n", i, j);
    return 0;
}
Now the compiler warns me that I am accessing j without it being initialized. I understand that, but why is i not uninitialized as well?

Undefined behavior is a run-time phenomenon. Therefore it is quite rare that the compiler will be able to detect it for you. Most cases of undefined behavior are invoked when doing things beyond the scope of the compiler.
To make things even more complicated, the compiler might optimize the code. Suppose it decided to put i in a CPU register but j on the stack or vice versa. And then suppose that during debug build, it sets all stack contents to zero.
If you need to reliably detect undefined behavior, you need a static analysis tool which does checks beyond what is done by a compiler.

Now the compiler warns me that I am accessing j without it being
initialized. I understand that, but why is i not uninitialized as
well?
That's the point with undefined behavior: it sometimes works, or doesn't, or works partially, or prints garbage. The problem is that you can't know what exactly your compiler is doing under the hood to produce this, and it's not the compiler's fault for producing inconsistent results, since, as you admit yourself, the behavior is undefined.
At that point the only thing that's guaranteed is that nothing is guaranteed as to how this will play out. Different compilers may even give different results, and so may different optimization levels.
A compiler is also not required to check for this, and it's not required to handle this, so consequently compilers don't. You can't use a compiler to check for undefined behavior reliably anyway. That's what unit tests and lots of test cases or statistical analysis are for.
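For contrast, here is a sketch of a jump into a loop body that stays well defined, because the variable is declared and initialized before the goto. The function collect and its parameters are hypothetical and only serve this illustration.

```c
/* Stores the values the loop visits into out (at most cap of them)
   and returns how many were stored. */
int collect(int *out, int cap)
{
    int n = 0;
    int i = 0;      /* initialized before any jump can skip it */

    goto body;      /* still jumps into the loop body, but i is defined */
    for (; i < 10; i++) {
body:
        if (n < cap)
            out[n++] = i;
    }
    return n;
}
```

Here the goto only skips the first evaluation of the loop condition, so the body runs for i = 0 through 9 with no indeterminate value involved.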

Using "goto" to skip a variable initialization would, per the C Standard, allow a compiler to do anything it wants even on platforms where it would normally yield an Indeterminate Value which may not behave consistently but wouldn't have any other side-effects. The behavior of gcc in this case doesn't seem to have devolved as much as its behavior in case of e.g. integer overflow, but its optimizations may be somewhat interesting though benign. Given:
int test(int x)
{
    int y;
    if (x) goto SKIP;
    y=x+1;
SKIP:
    return y*2;
}
int test2(unsigned short y)
{
    int q=0;
    int i;
    for (i=0; i<=y; i++)
        q+=test(i);
    return q;
}
The compiler will observe that in all defined cases, test will return 2, and can thus eliminate the loop by generating code for test2 equivalent to:
int test2(unsigned short y)
{
    /* y+1 iterations (i runs from 0 to y inclusive), each adding 2 */
    return ((int)y + 1) << 1;
}
Such an example, however, may give the impression that compilers treat UB in a benign fashion. Unfortunately, in the case of gcc, that is no longer true in general. It used to be that on machines without hardware traps, compilers would treat uses of Indeterminate Value as simply yielding arbitrary values that may or may not behave in any consistent fashion, but without any other side-effects. I'm not sure of any cases where using goto to skip variable initialization would yet cause side-effects other than having a meaningless value in the variable, but that doesn't mean the authors of gcc won't decide to exploit that freedom in future.

Matrix not zero-filled on declaration

I was trying to debug my code in another function when I stumbled upon this "weird" behaviour.
#include <stdio.h>
#define MAX 20
int main(void) {
    int matrix[MAX][MAX] = {{0}};
    return 0;
}
If I set a breakpoint on the return 0; line and I look at the local variables with Code::Blocks the matrix is not entirely filled with zeros.
The first row is, but the rest of the array contains just random junk.
I know I can do a double for loop to initialize manually everything to zero, but wasn't the C standard supposed to fill this matrix to zero with the {{0}} initializer?
Maybe because it's been a long day and I'm tired, but I could've sworn I knew this.
I've tried to compile with the different standards (with the Code::Blocks bundled gcc compiler): -std=c89, -std=c99, -std=c11 but it's the same.
Any ideas of what's wrong? Could you explain it to me?
EDIT:
I'm specifically asking about the {{0}} initializer.
I've always thought it would fill all columns and all rows to zero.
EDIT 2:
I'm bothered specifically with Code::Blocks and its bundled GCC. Other comments say the code works on different platforms. But why wouldn't it work for me? :/
Thanks.

I've figured it out.
Even without any optimization flag on the compiler, the debugger information was just wrong.
So I printed out the values with two for loops and the matrix was initialized correctly, even though the debugger said otherwise (weird).
Thanks for the comments, though.

Your code should initialize it to zero. In fact, you can just do int matrix[MAX][MAX] = {}; (accepted by GCC as an extension, and standard C only from C23 on), and it will be initialized to 0. However, int matrix[MAX][MAX] = {{1}}; will only set matrix[0][0] to 1, and everything else to 0.
I suspect what you are observing with Code::Blocks is that the debugger (gdb?) is not quite showing you exactly where it is breaking in the code - either that or some other side-effect from the optimizer. To test that theory, add the following loop immediately after the initialization:
```
int i, j;
for (i = 0; i < MAX; i++)
    for (j = 0; j < MAX; j++)
        printf("matrix[%d][%d] = %d\n", i, j, matrix[i][j]);
```
and see if what it prints is consistent with the output of the debugger.
I am going to guess that what might be happening is that, since you are not using matrix, the optimizer might have decided not to initialize it. To verify, disassemble your main (disass main in gdb) and see if the matrix is actually being initialized.
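The initializer rules described in this answer can also be checked in code rather than by eye. This is only a sketch; count_nonzero, nonzero_after_zero_init and nonzero_after_one_init are names made up for the illustration.

```c
#define MAX 20

/* Counts the non-zero elements of a MAX x MAX matrix. */
static int count_nonzero(int m[MAX][MAX])
{
    int count = 0;
    for (int i = 0; i < MAX; i++)
        for (int j = 0; j < MAX; j++)
            if (m[i][j] != 0)
                count++;
    return count;
}

/* {{0}} sets matrix[0][0] explicitly; every other element is
   zero-initialized implicitly, so no element is non-zero. */
int nonzero_after_zero_init(void)
{
    int matrix[MAX][MAX] = {{0}};
    return count_nonzero(matrix);
}

/* {{1}} sets only matrix[0][0] to 1; the rest is still zero, so
   exactly one element is non-zero. */
int nonzero_after_one_init(void)
{
    int matrix[MAX][MAX] = {{1}};
    return count_nonzero(matrix);
}
```

Because the matrix is actually read here, the compiler cannot drop the initialization the way it might for an unused local.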

how to elegantly construct long argument lists that iterate through arrays in C

I have a C function that takes variable arguments, and I need to call it with a very long list of arguments, where the arguments all step through the elements of an array. Example:
myFunction( A[0], B[0], A[1], B[1], A[2], B[2], A[3], B[3], ..... A[N], B[N] );
where N is typically 100-200.
I would prefer not having to construct this call manually every time I make N bigger, and got to thinking, is there an elegant way to do this?
I tried something like:
i=0;
myFunction( A[i], B[i++], A[i], B[i++], A[i], B[i++], A[i], B[i++], ..... A[i], B[i++] );
but of course that fails. What is preferred about it, however, is anytime I make N larger, I can simply copy the same line over and over, instead of having to ensure each array index is correct, which is quite tedious.
Changing myFunction() is not an option.
I wish C had a way to construct function calls on the fly, like:
for( i = 0 ; i <= N ; i++ )
{
    CONSTRUCT_CALL( myFunction, A[i], B[i] );
}
which would be exactly what I want, but of course that's not an option.
Is there anything that might be easier or more elegant?
Thank you very much.

There is no standard C way of doing that (synthesizing a variadic call at runtime). But...
you can use libffi which is designed to handle such issues (so I recommend it)
you could consider GCC specific Builtins for Constructing Calls
you could have some fixed limit on the arity (e.g. 500) and have some C file generated with some (shell, awk, Python, ...) script doing a switch on the 500 cases, one for each arity.
you might consider generating some C code at runtime into _gen123.c, compiling it into a dynamically loadable plugin (e.g. forking some gcc -shared -fPIC -Wall -O _gen123.c -o _gen123.so command on Linux), then loading that plugin (with dlopen(3) on Linux or POSIX)
you might consider some just-in-time compilation library (e.g. libjit, llvm, GNU lightning, asmjit, ...)
Of course, avoid several i++ in a single call. Avoid undefined behavior, since bad things could happen.
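The code-generation options above can be sketched in C itself. The helper emit_call below is hypothetical: it only builds the source text of the call for a given N, which a build step would then write into a generated .c file and compile. It does not perform the call.

```c
#include <stdio.h>

/* Writes "myFunction( A[0], B[0], ..., A[n], B[n] );" into buf.
   Returns the length written, or -1 if buf is too small. */
int emit_call(char *buf, size_t cap, int n)
{
    size_t used = 0;
    int w = snprintf(buf, cap, "myFunction(");
    if (w < 0 || (size_t)w >= cap)
        return -1;
    used += (size_t)w;

    for (int i = 0; i <= n; i++) {
        /* Every pair except the first is preceded by a comma. */
        w = snprintf(buf + used, cap - used, "%s A[%d], B[%d]",
                     i ? "," : "", i, i);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;
        used += (size_t)w;
    }

    w = snprintf(buf + used, cap - used, " );");
    if (w < 0 || (size_t)w >= cap - used)
        return -1;
    used += (size_t)w;
    return (int)used;
}
```

Growing N then means regenerating the file instead of editing indices by hand, and each index appears as a literal, so there is no i++ inside the argument list.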

There is something very bad in your design.
Rewrite your myFunction so that it takes two arrays (A and B) and the number of elements to use.
A short example of calling such a function:
int A[100];
int B[100];
int c = myFunction(A, B, 100);
A possible implementation of myFunction:
int myFunction(int* A, int* B, int count)
{
    int result = 0;
    /* Iterate over the first count elements of both arrays. */
    for(int j = 0; j < count; j++)
        result += A[j] + B[j]*2;
    return result;
}
