How can one make Clang optimize away useless array copies


Consider the following C99 code (which uses the alloca extension):
#include <alloca.h>
#include <stdio.h>
#include <string.h>
void print_int_list(size_t size, int x[size]) {
    int y[size];  /* variable-length array holding a copy of x */
    memcpy(y, x, size * sizeof *x);
    for (size_t ii = 0; ii < size; ++ii)
        printf("%i ", y[ii]);
    printf("\n");
}
void print_int_list_2(size_t size, int x[size]) {
    for (size_t ii = 0; ii < size; ++ii)
        printf("%i ", x[ii]);
    printf("\n");
}
void print_int(int x) {
    int * restrict const y = alloca(sizeof x);
    memcpy(y, &x, sizeof x);
    printf("%d\n", *y);
}
void print_int_2(int x) {
    printf("%d\n", x);  /* x is a plain int, so no dereference */
}
In this code, print_int is optimized to be exactly the same as print_int_2 on Clang version 3.0, but the function print_int_list is not optimized into print_int_list_2. Instead the useless array copy is kept.
This sort of thing is not a problem for most people but it is for me. I intend to prototype a compiler by generating C code for use with Clang, (and later port it to LLVM directly), and I want to generate extremely stupid, simple, and obviously correct code, and let LLVM do the work of optimizing the code.
What I need to know is how one can make Clang optimize away useless array copies so that stupid code like print_int_list will get optimized into code like print_int_list_2.

First, I would go more carefully. There is a step in between the two cases that you have: arrays of fixed size. I think compilers nowadays can trace array components that are indexed with a compile-time constant.
Also don't forget that memcpy converts your arrays to pointers to the first element and then makes them void*. So it loses all information about the arrays.
So I'd
try fixed-size arrays
don't use memcpy but an assignment loop
and try to loosen the constraints from there.
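A minimal sketch of that intermediate step, assuming a hypothetical compile-time length N and a hypothetical function sum_fixed_copy (neither appears in the question). The copy is an element-wise assignment loop into a fixed-size array instead of a memcpy into a variable-length array. Whether a given Clang version actually removes the copy is something to verify in the generated assembly (for example with clang -O2 -S); the sketch does not guarantee it.

```c
#include <stddef.h>

enum { N = 4 };  /* hypothetical compile-time length */

/* Copies x into a local fixed-size array with a plain assignment loop,
   then reads only from the copy. */
int sum_fixed_copy(const int x[N])
{
    int y[N];
    for (size_t ii = 0; ii < N; ++ii)
        y[ii] = x[ii];

    int sum = 0;
    for (size_t ii = 0; ii < N; ++ii)
        sum += y[ii];
    return sum;
}
```

If the optimizer handles this form, the next experiment would be to replace N with a runtime size again and see at which step the copy reappears.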

Related

std::array equivalent in C

I'm new to C and C++, and I've read that at least in C++ it's preferable to use std::array or std::vector when using vectors and arrays, especially when passing these into a function.
In my research I found the following, which makes sense. I suppose using std::vector would fix the problem of indexing outside of the variable's scope.
void foo(int arr[10]) { arr[9] = 0; }
void bar() {
int data[] = {1, 2};
foo(data);
}
The above code is wrong but the compiler thinks everything is fine and
issues no warning about the buffer overrun.
Instead use std::array or std::vector, which have consistent value
semantics and lack any 'special' behavior that produces errors like
the above.
(answer from bames53, thanks btw!)
What I want to code is
float foo(int X, int Y, int l){
// X and Y are arrays of length l
float z[l];
for (int i = 0; i < l; i ++){
z[i] = X[i]+Y[i];
}
return z;
}
int bar(){
int l = 100;
int X[l];
int Y[l];
float z[l];
z = foo(X,Y,l);
return 0;
}
I want this to be coded in C, so my question is: is there a std::vector-like construct for C? I couldn't find anything on that.
Thanks in advance, also please excuse my coding (I'm green as grass in C and C++)

Standard C has nothing like std::vector or other container structures. All you get is built-in arrays and malloc.
I suppose using std::vector would fix the problem of indexing outside of the variable's scope.
You might think so, but you'd be wrong: Indexing outside of the bounds of a std::vector is just as bad as with a built-in array. The operator[] of std::vector doesn't do any bounds checking either (or at least it is not guaranteed to). If you want your index operations checked, you need to use arr.at(i) instead of arr[i].
Also note that code like
float z[l];
...
return z;
is wrong because there are no array values in C (or C++, for that matter). When you try to get the value of an array, you actually get a pointer to its first element. But that first element (and all other elements, and the whole array) is destroyed when the function returns, so this is a classic use-after-free bug: The caller gets a dangling pointer to an object that doesn't exist anymore.
The customary C solution is to have the caller deal with memory allocation and pass an output parameter that the function just writes to:
void foo(float *z, const int *X, const int *Y, int l){
    // X and Y are arrays of length l
    for (int i = 0; i < l; i ++){
        z[i] = X[i]+Y[i];
    }
}
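For illustration, here is a sketch of what the calling side could look like with this output-parameter style. The caller bar_last_sum and the fill values are hypothetical; the point is only that all three arrays live in the caller, so nothing dangles after foo returns.

```c
/* Same shape as foo above: the caller owns z, X and Y. */
void foo(float *z, const int *X, const int *Y, int l)
{
    for (int i = 0; i < l; i++)
        z[i] = (float)(X[i] + Y[i]);
}

/* Hypothetical caller: storage for the result is declared here and
   stays valid for as long as this function is running. */
float bar_last_sum(void)
{
    enum { L = 100 };
    int X[L], Y[L];
    float z[L];

    for (int i = 0; i < L; i++) {
        X[i] = i;
        Y[i] = 2 * i;
    }
    foo(z, X, Y, L);
    return z[L - 1];  /* a single float value is fine to return */
}
```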
That said, there are some libraries that provide dynamic data structures for C, but they necessarily look and feel very different from C++ and std::vector (e.g. I know about GLib).

Your question might be sensitive for some programmers of the language.
Using constructs of one language in another can be considered cursing, as different languages make different design decisions.
C++ and C share a huge part, in a way that C code can (without a lot of modifications) be compiled as C++. However, if you learn to master C++, you will realize that a lot of strange things happen because of how C works.
Back to the point: C++ contains a standard library with containers such as std::vector. These containers make use of several C++ constructs that aren't available in C:
RAII (the fact that a destructor gets executed when the instance goes out of scope) will prevent a leak of the allocated memory
Templates will allow type safety to not mix doubles, floats, classes ...
Function overloading will allow different signatures for the same function name (like erase)
Member functions
None of these exist in C, so in order to have a similar structure, several adaptations are required to get a data structure that behaves almost the same.
In my experience, most C projects have their own generic version of data structures, often based on void*. Often this will look similar to:
#include <stdlib.h>
typedef struct Vector
{
    void **data;    /* array of element pointers owned by the vector */
    long size;
    long capacity;
} Vector;
Vector *CreateVector(void)
{
    Vector *v = malloc(sizeof *v);
    if (v)
    {
        v->data = NULL;
        v->size = 0;
        v->capacity = 0;
    }
    return v;   /* NULL if the allocation failed */
}
void DestroyVector(Vector *v)
{
    if (!v)
        return;
    for (long i = 0; i < v->size; ++i)
        free(v->data[i]);   /* free each owned element */
    free(v->data);
    free(v);
}
// ...
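The part elided by the "// ..." typically includes an append operation. The following is a sketch of one, under the assumption that data is declared as void ** (an array of element pointers); VectorPush and its doubling growth policy are hypothetical and not part of the answer.

```c
#include <stdlib.h>

typedef struct Vector {
    void **data;      /* array of element pointers owned by the vector */
    long size;        /* number of elements in use */
    long capacity;    /* number of slots allocated */
} Vector;

/* Hypothetical helper: appends item, doubling the capacity when full.
   Returns 0 on success and -1 if realloc fails (vector left unchanged). */
int VectorPush(Vector *v, void *item)
{
    if (v->size == v->capacity) {
        long new_cap = v->capacity ? v->capacity * 2 : 4;
        void **grown = realloc(v->data, (size_t)new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        v->data = grown;
        v->capacity = new_cap;
    }
    v->data[v->size++] = item;
    return 0;
}
```

Because the elements are stored as void *, the caller has to cast them back to the right type on every read, which is exactly the type safety that C++ templates provide and C lacks.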
Alternatively, you could mix C and C++.
vector.h
typedef struct Vector
{
    void *cppVector;   /* opaque pointer to the C++ std::vector */
} Vector;
#ifdef __cplusplus
extern "C" {
#endif
Vector CreateVector(void);
void DestroyVector(Vector v);
#ifdef __cplusplus
}
#endif
vectorimplementation.cpp
#include "vector.h"
#include <cstdlib>
#include <memory>
#include <vector>
struct CDataFree
{
    void operator()(void *ptr) const { std::free(ptr); }
};
using CData = std::unique_ptr<void, CDataFree>;
Vector CreateVector()
{
    Vector v;
    v.cppVector = static_cast<void*>(std::make_unique<std::vector<CData>>().release());
    return v;
}
void DestroyVector(Vector v)
{
    auto cppV = static_cast<std::vector<CData>*>(v.cppVector);
    // Taking ownership again destroys the vector when this goes out of scope.
    auto freeAsUniquePtr = std::unique_ptr<std::vector<CData>>(cppV);
}
// ...

The closest equivalent of std::array in C is probably a preprocessor macro definition like
#define ARRAY(type,name,length) \
type name[(length)]

How to return multiple types from a function in C?

I have a function in C which calculates the mean of an array. Within the same loop, I am creating an array of t values. My current function returns the mean value. How can I modify this to return the t array also?
/* function returning the mean of an array */
double getMean(int arr[], int size) {
int i;
printf("\n");
float mean;
double sum = 0;
float t[size];/* this is static allocation */
for (i = 0; i < size; ++i) {
sum += arr[i];
t[i] = 10.5*(i) / (128.0 - 1.0);
//printf("%f\n",t[i]);
}
mean = sum/size;
return mean;
}
Thoughts:
Do I need to define a struct within the function? Does this work for type scalar and type array? Is there a cleaner way of doing this?

A C function can return only one object. So if you want to return both values, you'll have to make a structure that holds them, something like:
typedef struct X {
    double mean;
    double *newArray;
} X;
But in your case you'll also need to allocate t dynamically with malloc; otherwise the returned pointer refers to a local array that no longer exists once the function returns.
Another way would be to let the caller allocate the new array and pass it to you as a pointer. That way you still return only the mean, and fill the given array with your computed values.
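A sketch of the struct-returning variant, assuming the struct X shown above. The function name getMeanAndT and its error convention (newArray is NULL and mean is 0 on failure or for an empty input) are hypothetical choices, not something taken from the question.

```c
#include <stdlib.h>

typedef struct X {
    double mean;
    double *newArray;   /* malloc'd; the caller must free() it */
} X;

/* Hypothetical variant of getMean that returns both results in one struct. */
X getMeanAndT(const int *arr, size_t size)
{
    X result = { 0.0, NULL };
    if (size == 0)
        return result;

    result.newArray = malloc(size * sizeof *result.newArray);
    if (result.newArray == NULL)
        return result;

    double sum = 0;
    for (size_t i = 0; i < size; ++i) {
        sum += arr[i];
        result.newArray[i] = 10.5 * i / (128.0 - 1.0);
    }
    result.mean = sum / size;
    return result;
}
```

The caller reads result.mean and result.newArray[i] and then calls free(result.newArray) once it is done with the values.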

The most common approach for something like this is letting the caller provide storage for the values you want to return. You could just make t another parameter to your function for that:
double getMean(double *t, const int *arr, size_t size) {
    double sum = 0;
    for (size_t i = 0; i < size; ++i) {
        sum += arr[i];
        t[i] = 10.5*(i) / (128.0 - 1.0);
    }
    return sum/size;
}
This snippet also improves on some other aspects:
Don't use float, especially not when you intend to return a double. float typically carries only about 7 significant decimal digits, against 15 to 16 for double
Use size_t for object sizes. While int often works, size_t is guaranteed to hold any possible object size and is the safe choice
Don't mix output in functions calculating something (just a stylistic advice)
Declare variables close to where they are used first (another stylistic advice)
This is somewhat opinionated, but I changed your signature to make it explicit the function is passed pointers to arrays, not arrays. It's impossible to pass an array in C, therefore a parameter with an array type is automatically adjusted to the corresponding pointer type anyways.
As you don't intend to modify what arr points to, make it explicit by adding a const. This helps for example the compiler to catch errors if you accidentally attempt to modify this array.
You would call this code e.g. like this:
int numbers[] = {1, 2, 3, 4, 5};
double foo[5];
double mean = getMean(foo, numbers, 5);
Instead of the magic number 5, you could write e.g. sizeof numbers / sizeof *numbers.
Another approach is to dynamically allocate the array with malloc() inside your function, but this requires the caller to free() it later. Which approach is more suitable depends on the rest of your program.

Following the advice suggested by @FelixPalmen is probably the best choice. But, if there is a maximum array size that can be expected, it is also possible to wrap arrays in a struct, without needing dynamic allocation. This allows code to create new structs without the need for deallocation.
A mean_array structure can be created in the get_mean() function, assigned the correct values, and returned to the calling function. The calling function only needs to provide a mean_array structure to receive the returned value.
#include <stdio.h>
#include <assert.h>
#define MAX_ARR 100
struct mean_array {
    double mean;
    double array[MAX_ARR];
    size_t num_elems;
};
struct mean_array get_mean(int arr[], size_t arr_sz);
int main(void)
{
    int my_arr[] = { 1, 2, 3, 4, 5 };
    struct mean_array result = get_mean(my_arr, sizeof my_arr / sizeof *my_arr);
    printf("mean: %f\n", result.mean);
    for (size_t i = 0; i < result.num_elems; i++) {
        printf("%8.5f", result.array[i]);
    }
    putchar('\n');
    return 0;
}
struct mean_array get_mean(int arr[], size_t arr_sz)
{
    assert(arr_sz <= MAX_ARR);
    struct mean_array res = { .num_elems = arr_sz };
    double sum = 0;
    for (size_t i = 0; i < arr_sz; i++) {
        sum += arr[i];
        res.array[i] = 10.5 * i / (128.0 - 1.0);
    }
    res.mean = sum / arr_sz;
    return res;
}
Program output:
mean: 3.000000
0.00000 0.08268 0.16535 0.24803 0.33071
In answer to a couple of questions asked by OP in the comments:
size_t is the correct type to use for array indices, since it is guaranteed to be able to hold any array index. You can often get away with int instead; be careful with this, though, since accessing, or even forming a pointer to, the location one before the first element of an array leads to undefined behavior. In general, array indices should be non-negative. Further, size_t may be a wider type than int in some implementations; size_t is guaranteed to hold any array index, but there is no such guarantee for int.
Concerning the for loop syntax used here, e.g., for (size_t i = 0; i < sz; i++) {}: here i is declared with loop scope. That is, the lifetime of i ends when the loop body is exited. This has been possible since C99. It is good practice to limit variable scopes when possible. I default to this so that I must actively choose to make loop variables available outside of loop bodies.
If the loop-scoped variables or size_t types are causing compilation errors, I suspect that you may be compiling in C89 mode. Loop-scoped declarations were introduced in C99; size_t itself is older, but it is only available after including a header that defines it, such as <stddef.h> or <stdio.h>. If you are using gcc, older versions (for example, gcc 4.x, I believe) default to C89. You can compile with gcc -std=c99 or gcc -std=c11 to use a more recent language standard. I would recommend at least enabling warnings with: gcc -std=c99 -Wall -Wextra to catch many problems at compilation time. If you are working in Windows, you may also have similar difficulties. As I understand it, MSVC is C89 compliant, but has limited support for later C language standards.

Macro with Parameters

In the code which follows, I keep getting an error. How to modify the third line? Why's that keep happening? What's wrong?
#include <stdio.h>
#include "stdlib.h"
#define ARRAY_IDX(type, array, i) ((type *)(array+i)) // you can only modify this line!
int main(int argc, const char * argv[]) {
void *ptr = malloc(10*sizeof(int));
#ifdef ARRAY_IDX
for (int i = 0; i < 10; i++) {
ARRAY_IDX(int, ptr, i) = i * 2;
}
for (int i = 0; i < 10; i++) {
printf("%d ", ARRAY_IDX(int, ptr, i));
}
free(ptr);
#else
printf("Implement ARRAY_IDX first");
#endif
}

Looking at
ARRAY_IDX(int, ptr, i) = i * 2;
and
printf("%d ", ARRAY_IDX(int, ptr, i));
shows that the expression
ARRAY_IDX(int, whatever, whatever)
should expand into an expression of type int (and an lvalue, so that we can assign to it).
Starting off with a void * you first need to change (cast) it to a pointer that allows indexing, and since you want to index the elements of that array (not its individual bytes, which would be a violation of aliasing) you need to make it an int * first:
(int *)(ptr)
Now you have a pointer to an integer (array, hopefully). Increment it:
(int *)(ptr) + (idx)
Finally, you need an lvalue int expression. Dereference the pointer to get that:
(*((int *)(ptr) + (idx)))
Converting that to a preprocessor macro is something that should be doable, so I leave it up to you.
Note that whoever is giving you that code is - IMHO - not a teacher you should trust. This won't teach you much about correct C. It might teach you something about the preprocessor. But don't write such code. Just don't. Use correct types if possible. Check for failure of malloc.

There is nothing wrong with adding an int to a void pointer. For many years, compiler designers assumed that this was standard behavior, and it was implemented as such. It's every bit as standard as anonymous structs and unions, which compilers have had for almost 20 years and were only recently added in C11. Practically all compilers will compile this just fine without any warnings or errors, and without having to use any special compiler flags.
Your problem is, as I have pointed out, that you are assigning a value to a pointer. You need to dereference it after the cast.
#define ARRAY_IDX(type, array, i) (((type *)(array))[(i)])
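As a sketch of how a cast-then-index macro of this kind behaves in use, the following wraps the question's fill loop in a hypothetical helper, make_doubled, and adds the malloc check recommended earlier. The macro here is fully parenthesized so that expressions such as ARRAY_IDX(int, p, i + 1) expand safely.

```c
#include <stdlib.h>

/* Cast first, then index: the result is an lvalue of type `type`. */
#define ARRAY_IDX(type, array, i) (((type *)(array))[(i)])

/* Hypothetical helper: allocates n ints, stores i * 2 in each slot through
   the macro, and returns the buffer (NULL if malloc fails). */
void *make_doubled(int n)
{
    void *ptr = malloc((size_t)n * sizeof(int));
    if (ptr == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        ARRAY_IDX(int, ptr, i) = i * 2;
    return ptr;
}
```

Because the cast to int * happens before any arithmetic, this form does not rely on arithmetic on void *, which is a compiler extension rather than standard C.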

A puzzling example about the C keyword "restrict"

The example is taken from Wikipedia:
void updatePtrs(size_t *restrict ptrA, size_t *restrict ptrB, size_t *restrict val)
{
*ptrA += *val;
*ptrB += *val;
}
I call this function in the main():
int main(void)
{
    size_t i = 10;
    size_t j = 0;
    updatePtrs(&i, &j, &i);
    printf("i = %zu\n", i);  /* %zu is the conversion for size_t */
    printf("j = %zu\n", j);
    return 0;
}
According to Wikipedia's description, the val pointer is not loaded twice, so the value of j should be 10, but it is in fact 20.
Is my comprehension about this keyword not correct? Should I utilize some specific options of gcc?
Thanks in advance.

Your code causes undefined behaviour. restrict is a promise from you to the compiler that all of the pointer parameters point to different memory areas.
You break this promise by putting &i for two of the arguments.
(In fact, with restrict it is allowed to pass overlapping pointers, but only if no writes are done through any of the overlapping pointers within the function. But typically you would not bother with restrict if there is no writing happening).
FWIW, on my system with gcc 4.9.2, output is j = 20 at -O0 and j = 10 at -O1 or higher, which suggests that the compiler is indeed taking note of the restrict. Of course, since it is undefined behaviour, your results may vary.
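For contrast, here is a sketch of a call that keeps the restrict promise. The caller update_distinct is hypothetical; it copies the value into a third object v instead of passing &i twice, so the three pointers refer to three distinct objects and the result is the same at every optimization level.

```c
#include <stddef.h>

void updatePtrs(size_t *restrict ptrA, size_t *restrict ptrB,
                size_t *restrict val)
{
    *ptrA += *val;
    *ptrB += *val;
}

/* Hypothetical caller that keeps the restrict promise: i, j and v are
   three distinct objects, so no argument aliases another. */
size_t update_distinct(void)
{
    size_t i = 10;
    size_t j = 0;
    size_t v = i;           /* copy the value instead of passing &i twice */
    updatePtrs(&i, &j, &v);
    return i + j;           /* i becomes 20 and j becomes 10 */
}
```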

Convert a non-pointer variable to a pointer to an array

Ok, I understand that my title might be a bit confusing, but I'll explain. I'm working on a homework assignment in C. I'm given a .c file and need to come up with implementations for some functions.
In short, I have this as a .c file
typedef int set_t;
...
void init(set_t *a, int N); // Initialized an array to a of size N
...
int main() {
set_t a;
init(&a, 10);
}
In a couple of implementations I've come up with, I was able to create an array using a, but I keep getting segmentation faults when the program runs :-/. Is there a way to initialize a as an array without changing anything in the original .c file except for the implementation of init(set_t *a, int N)?
EDIT
Here's my current implementation of init --> it leads to a segmentation fault
void init(set_t *a, int N) {
//set_t thing[10];
*a = malloc(sizeof(set_t)*N);
for (int i = 0; i < N; i++) {
*(a + i) = i;
}
printf("value of a[2] = %d\n", a[2]);
}

As things currently stand, the requirements imposed on you are wholly unreasonable. If you are building for 32-bit only, so sizeof(int) == sizeof(int *), then you can use brutal casting to get around the constraints. The code will not work on a 64-bit machine, though (unless sizeof(int) == sizeof(int *), which isn't the case on any machine I can immediately think of).
So, the brute force and casting technique is:
void init(set_t *a, int N)
{
    assert(sizeof(set_t) == sizeof(set_t *)); // Ick, but necessary!
    set_t *base = malloc(sizeof(set_t)*N);
    if (base == 0)
        *a = 0;
    else
    {
        *a = (int)base; // Brutal; non-portable; stupid; necessary by the rules given!
        for (int i = 0; i < N; i++)
            base[i] = i;
        printf("value of a[2] = %d\n", base[2]);
        printf("value of a[2] = %d\n", ((int *)*a)[2]); // Brutal and stupid too
    }
}
Further, in the code in main(), you'll have to use ((int *)a) to make the type usable for dereferencing, etc. Without knowing about what is actually in that other code, it is impossible to be confident that anything will work. It might, but it probably won't.
At this stage, this looks like someone criminally misleading innocent novice programmers. This is not the way it should be coded at all. However, if that's what the doctor (professor) orders, then that's what you've got to do. But it is a mockery of good coding practices AFAICS and AFAIAC.

The professor realized that he had made an error in the assignment and fixed it: he changed set_t a to set_t *a.
Thanks for all your help (hope I didn't cause too many headaches!)
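The corrected assignment file is not shown, so the following is only a sketch under an assumption: main now declares set_t *a and still calls init(&a, 10), which means init would take a set_t ** parameter. With that signature no integer-to-pointer casting is needed.

```c
#include <stdlib.h>

typedef int set_t;

/* Assumed corrected signature: init receives the address of the caller's
   pointer, so it can store the malloc'd array in it. */
void init(set_t **a, int N)
{
    *a = malloc((size_t)N * sizeof **a);
    if (*a == NULL)
        return;             /* the caller must check for NULL */
    for (int i = 0; i < N; i++)
        (*a)[i] = i;
}
```

After the call, main can use a[2] directly and should free(a) when it is finished with the set.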
