Reducing If statements in C

I was trying to count how many out of my four variables were greater than 0, so I wrote these if statements to achieve my purpose. All numbers will be positive or 0:
if(a1>0){counter++;}
if(a2>0){counter++;}
if(a3>0){counter++;}
if(a4>0){counter++;}
printf("%d", counter);
It's quite obvious that I would run into some trouble if the number of variables were to increase. Is there a more efficient way of writing this?
Thanks for taking the time to help me.

If you're looking for a single statement,
counter+= (a1>0) + (a2>0) + (a3>0) + (a4>0);
Should do. If you decide to pack your data into an array,
#define SIZE(x) (sizeof(x)/sizeof*(x))
int x, a[4];
for(x=0; x<SIZE(a); x++)
counter += (a[x]>0);
Is still reasonably compact.

Fundamentally you have to tell the compiler which memory address to check, and where to put the result.
If you have lots of local variables, you probably want to consider an array or similar data structure to hold them rather than tons of separately declared variables. In that case, you could define an array to hold the counter results too, and use a loop construct.
If you do not have tons of local variables, I suspect your current approach is about as good as it gets (you could get fancy by placing a pointer to each variable in an array and then using a loop, but the array initialization would be at least as cumbersome as the current if statements).
I would not change your program logic to use arrays if named, individual variables are a more natural fit.

What if you just used an array? such as this:
int a[4];
int counter = 0; // must start at zero before counting
int i; // iterator
for(i=0;i<4;i++){
if(a[i]>0){
counter++;
}
}
printf("%d", counter);
That would be faster, shorter code, but your way is as efficient as possible if you must have separate variables.

Use an array rather than separate variables. Then just loop through the array.

If all variables are of the same type, you can store pointers to them in an array and use a loop to check and increment the counter.
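The pointer-array idea could look roughly like this. It is only a sketch: the helper name count_positive and its parameters are made up for illustration, and it assumes all the variables are plain int.

```c
#include <stddef.h>

/* Count how many of the pointed-to ints are greater than zero.
   count_positive is a hypothetical helper, not part of the question. */
int count_positive(const int *vars[], size_t n)
{
    int counter = 0;
    for (size_t i = 0; i < n; i++)
        counter += (*vars[i] > 0);   /* a comparison yields 1 or 0 */
    return counter;
}
```

With the four variables from the question you would build the table once, e.g. const int *vars[] = { &a1, &a2, &a3, &a4 };, and call count_positive(vars, 4). As noted above, setting up the table costs about as much typing as the original if statements, so this only pays off when the same set of variables is scanned in several places.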

Related

How to save memory in an array of which many elements are always 0?

I have a 2tensor in C that looks like:
int n =4;
int l =5;
int p =6;
int q=2;
I then initialize each element of T
//loop over each of the above indices
T[n][l][p][q]=...
However, many of them are zero and there are symmetries such as:
T[4][3][2][1]=-T[3][4][2][1]
How can I save memory on the elements of T which are zero? Ideally I would like to place something like NULL in those positions so they use 0 instead of 8 bytes. Also, later on in the calculation I can check if they are zero or not by checking if they are equal to NULL
How do I implicitly include those symmetries in T without using excess memory?
Edit: the symmetry can perhaps be fixed with a different implementation. But what about the zeros? Is there any implementation to not have them waste memory?
You cannot influence the size of any variable by a value you write to it.
If you want to save memory you have not only to not use it, you have to not define a variable using it.
If you do not define a variable, then you have to not use it ever.
Then you have saved memory.
This is of course obvious.
Now, how to apply that to your problem.
Allow me to simplify, for one because you did not give enough information and explanation, at least not for me to understand every detail. For another, to keep the explanation simple.
So I hope that it suffices if I solve the following problem for you, which I think is kind of the little brother of your problem.
I have a large array in C (not really large, let's say N entries, with N==20).
But for special reasons, I will never need to actually read and write any even indices, they should act as if they contain 0, but I want to save the memory used by them.
So actually I want to only use M of the entries, with M*2==N.
So instead of
int Array[N]; /* all the theoretical elements */
I define
int Array[M]; /* only the actually used elements */
Of course I cannot access any of the elements which are not needed and it will not really be necessary.
But for the logic of my program, I want to be able to program as if I could access them, while being sure that they will always read as 0 and ignore any written value.
So what I do is wrapping all accesses to the array.
int GetArray(int index)
{
if (index & 1)
{
/* odd, I need to really access the array,
but at a calculated index */
return Array[index/2];
} else
{
/* even, always 0 */
return 0;
}
}
void SetArray(int index, int value)
{
if (index & 1)
{
/* odd, I need to really access the array,
but at a calculated index */
Array[index/2] = value;
} else
{
/* even, no need to store anything, stays always "0" */
}
}
So I can read and write as if the array were twice as large, but guarantee not to ever use the faked elements.
And by mapping the indices as
actualindex = wantindex / 2
I ensure that I do not access beyond the size of the actually existing array.
Porting this concept now to the more complicated setup you have described is your job. You know all the details, you can test whether everything works.
I recommend to extend GetArray() and SetArray() by checks on the resulting index, to make sure that it is never outside of the actual array.
You can also add all kinds of self checks to verify that all your rules and expectations are met.
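Here is a minimal sketch of the wrappers with such checks added. The names N, M, Array, GetArray and SetArray are the ones used above; using assert() for the check is my assumption, and you may prefer returning an error code instead.

```c
#include <assert.h>

#define N 20          /* logical size the program pretends to have */
#define M (N / 2)     /* entries actually stored (the odd indices) */

static int Array[M];

/* Read logical element `index`; even indices always read as 0. */
int GetArray(int index)
{
    assert(index >= 0 && index < N);   /* self check on the logical index */
    if (index & 1)
        return Array[index / 2];       /* odd: 1->0, 3->1, ..., 19->9 */
    return 0;
}

/* Write logical element `index`; writes to even indices are ignored. */
void SetArray(int index, int value)
{
    assert(index >= 0 && index < N);
    if (index & 1)
        Array[index / 2] = value;
}
```

Because the logical index is checked against N, the computed index / 2 can never reach M, so the real array is never accessed out of bounds.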

Setting all the elements to a same value in C

Note: There are posts similar to this for C++ only, I didn't find any useful post in regards to C.
I want to set all elements of an array to the same value. Of course, this can be achieved simply using a for loop.
But that consumes a lot of time, because in my algorithm this setting of the array to a single value takes place many times. Is there any simple way to achieve this in C?
Use a for loop. Any decent compiler will optimize this as much as possible.
It is a near certainty that you wouldn't be able to improve substantially on the speed of your for loop. There is no magic way to set a value into multiple memory locations faster than it takes to store that value into these multiple memory locations. Regardless of whether you use the for loop or not, all the locations must be written to, which takes most of the time.
There is of course the void * memset ( void * ptr, int value, size_t num ); for values composed of identical bytes [1], but under the hood it has a loop. Perhaps the implementation could be very smart about using that loop, but so can the optimizing compiler.
[1] Although memset takes an int, it converts it to unsigned char before setting it into the memory region.
As suggested by other users, use memset if you want to initialize your array with 0 values, but don't do it if the values are not that simple.
For more complicated values, you can have a constant copy of your initial values and copy them later with memcpy:
float original_values[100]; // don't modify these values
original_values[0] = 1.2f;
original_values[1] = 10.9f;
...
float working_values[100]; // work with these values
memcpy(working_values, original_values, 100 * sizeof(float));
// do your task
working_values[0] *= working_values[1];
...
You can use memset(). It fills the number of bytes you specify with the same byte value. See the memset man page for details.
You can use memset() function
Example:
memset(<array-name>,<initialization-value>,<len>);
You can easily memset an array to 0.
If you want a different value, it all depends on the type used.
For char arrays you can memset them to any value, since a char is always exactly one byte.
For an array of structures, if all fields of a structure are to be initialized with 0 or NULL, you can memset it with 0.
You can not memset an array or array of structures to any value other than 0, because memset operates on single bytes. So if you memset an int[] with 1, you will not have an array of 1's.
To initialize an array of structures with a custom value, just fill one structure with the desired data and assign it to each element in a for loop. The compiler should do it relatively efficiently for you.
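To make the byte-wise behaviour concrete, here is a small sketch. The function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* memset writes the byte 0x01 into every byte of the array, so each int
   ends up with all of its bytes equal to 1, which is not the value 1
   (assuming sizeof(int) > 1). */
void fill_bytes_with_one(int *a, size_t n)
{
    memset(a, 1, n * sizeof *a);
}

/* A plain loop stores the intended value in every element. */
void fill_with_value(int *a, size_t n, int value)
{
    for (size_t i = 0; i < n; i++)
        a[i] = value;
}
```

With a 4-byte int, fill_bytes_with_one leaves 0x01010101 (16843009) in every element, while fill_with_value(a, n, 1) gives the array of 1's you actually wanted.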
If you are talking about initialization see this question. If you want to set the values at a later time then use memset
Well you can only set your values to zero for a particular array. Here is an example
int arr[5]={0};

Dynamic Arrays and structs

Thanks! I just had to cast the right side of the assignment to Term.
I have to make a dynamic array of polynomials that each have a dynamic array of terms. When giving the term a exponent and coefficient, I get an error "expected expression before '{' token". What am I doing incorrectly when assigning the values?
Also, is there an easy way of keeping the dynamic array of terms ordered by their exponent? I was just planning on looping through, printing the max value but would prefer to store them in order.
Thanks!
polynomialArray[index].polynomialTerm[0] = {exponent, coefficient}; // ISSUE HERE
change to
polynomialArray[index].polynomialTerm[0] = (Term){exponent, coefficient};
or assign the members one by one:
polynomialArray[index].polynomialTerm[0].exponent = exponent;
polynomialArray[index].polynomialTerm[0].coefficient = coefficient;
There's an efficiency problem here in your code:
if(index > (sizeof(polynomialArray)/sizeof(Polynomial)))
polynomialArray = (Polynomial*)realloc(polynomialArray, index * sizeof(Polynomial));
polynomialArray is a pointer, so sizeof(polynomialArray) is always the size of the pointer itself (typically 4, or 8 on a 64-bit system), not the size of the allocation. The if condition above will therefore always be true as long as index is greater than 0.
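One common way around this is to keep the allocated element count in a separate variable. The following is only a sketch: the question's real Term and Polynomial definitions are not shown, so the struct layouts, the helper name ensure_capacity, and the doubling growth policy are all assumptions.

```c
#include <stdlib.h>

/* Hypothetical types; the original definitions are not shown. */
typedef struct { int exponent; int coefficient; } Term;
typedef struct { Term *polynomialTerm; size_t termCount; } Polynomial;

/* Grow *array so that `index` is valid. *capacity holds the number of
   allocated elements, since sizeof on a pointer cannot report it.
   Returns 0 on success, -1 if realloc fails (the old block stays valid). */
int ensure_capacity(Polynomial **array, size_t *capacity, size_t index)
{
    if (index < *capacity)
        return 0;

    size_t new_capacity = *capacity ? *capacity : 4;
    while (new_capacity <= index)
        new_capacity *= 2;

    Polynomial *grown = realloc(*array, new_capacity * sizeof **array);
    if (grown == NULL)
        return -1;

    *array = grown;
    *capacity = new_capacity;
    return 0;
}
```

Assigning the result of realloc to a temporary first means the original block is not leaked if the call fails.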
If this is C99, I think you need
polynomialArray[index].polynomialTerm[0] = (Term){exponent, coefficient};
You cannot assign values like that (only during declaration).
You should assign like this:
polynomialArray[index].polynomialTerm[0].exponent = exponent;
polynomialArray[index].polynomialTerm[0].coefficient = coefficient;
About the other question, you really don't need assert here. The pointer will not be NULL if it has a value malloc allocated to it. If not, it is better to be NULL, so you can test if malloc failed.
To have it ordered, you will need to sort it using some sort algorithm. I think that if you are looking for an easy way, the way you are doing it is fine. If it is critical for it to be ordered (as in real-time applications), then you need to rethink the approach. If not, keep it and go forward!
Take care,
Beco
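If you do want the terms stored in order, the standard library's qsort is probably the least effort. This is a sketch only: the Term layout below is a guess at the asker's struct, and sort_terms and compare_terms are illustrative names.

```c
#include <stdlib.h>

/* Hypothetical layout; the original Term definition is not shown. */
typedef struct { int exponent; int coefficient; } Term;

/* qsort comparator: orders terms by descending exponent. */
static int compare_terms(const void *lhs, const void *rhs)
{
    const Term *a = lhs;
    const Term *b = rhs;
    return (a->exponent < b->exponent) - (a->exponent > b->exponent);
}

/* Sort `count` terms in place, highest exponent first. */
void sort_terms(Term *terms, size_t count)
{
    qsort(terms, count, sizeof *terms, compare_terms);
}
```

Calling sort_terms once after all terms have been added is usually simpler than keeping the array ordered on every insertion.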

Which kind of data organization using C arrays makes fastest code and why?

Given following data, what is the best way to organize an array of elements so that the fastest random access will be possible?
Each element has some int number, a name of 3 characters with '\0' at the end, and a floating point value.
I see two possible methods to organize and access such array:
First:
typedef struct { int num; char name[4]; float val; } t_Element;
t_Element array[900000000];
//random access:
num = array[i].num;
name = array[i].name;
val = array[i].val;
//sequential access:
some_cycle:
num = array[i].num;
i++;
Second:
#define NUMS 0
#define NAMES 1
#define VALS 2
#define SIZE (VALS+1)
int array[SIZE][900000000];
//random access:
num = array[NUMS][i];
name = (char*) array[NAMES][i];
val = (float) array[VALS][i];
//sequential access:
p_array_nums = &array[NUMS][i];
some_cycle:
num = *p_array_nums;
p_array_nums++;
My question is, which method is faster and why? My first thought was that the second method makes the fastest code and allows the fastest block copy, but I doubt whether it saves any significant number of CPU instructions compared to the first method.
It depends on the common access patterns. If you plan to iterate over the data, accessing every element as you go, the struct approach is better. If you plan to iterate independently over each component, then parallel arrays are better.
This is not a subtle distinction, either. With main memory typically being around two orders of magnitude slower than L1 cache, using the data structure that is appropriate for the usage pattern can possibly triple performance.
I must say, though, that your approach to implementing parallel arrays leaves much to be desired. You should simply declare three arrays instead of getting "clever" with two-dimensional arrays and casting:
int nums[900000000];
char names[900000000][4];
float vals[900000000];
Impossible to say. As with any performance related test, the answer may vary by any one or more of your OS, your CPU, your memory, your compiler etc.
So you need to test for yourself. Set your performance targets, measure, optimise, repeat.
The first one is probably faster, since memory access latency will be the dominant factor in performance. Ideally you should access memory sequentially and contiguously, to make best use of loaded cache lines and reduce cache misses.
Of course the access pattern is critical in any such discussion, which is why sometimes it's better to use SoA (structure of arrays) and other times AoS (array of structures), at least when performance is critical.
Most of the time of course you shouldn't worry about such things (premature optimisation, and all that).
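To illustrate the two layouts and the access pattern each one suits, here is a rough sketch. t_Element is the struct from the question; t_Columns and both function names are made up for illustration.

```c
#include <stddef.h>

/* Array of structures (AoS): one element's fields sit next to each other. */
typedef struct { int num; char name[4]; float val; } t_Element;

/* Structure of arrays (SoA): each field is contiguous across elements. */
typedef struct {
    int   *nums;
    char (*names)[4];
    float *vals;
} t_Columns;

/* Uses several fields of one element: with AoS they are usually
   fetched together in a single cache line. */
float aos_scaled(const t_Element *array, size_t i)
{
    return (float)array[i].num * array[i].val;
}

/* Sums one field over all elements: with SoA only the bytes that are
   actually needed travel through the cache. */
double soa_sum_vals(const t_Columns *c, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += c->vals[i];
    return sum;
}
```

Which one wins in a real program still depends on the dominant access pattern, so measure both before committing to a layout.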

Optimizing C loops

I'm new to C from many years of Matlab for numerical programming. I've developed a program to solve a large system of differential equations, but I'm pretty sure I've done something stupid as, after profiling the code, I was surprised to see three loops that were taking ~90% of the computation time, despite the fact they are performing the most trivial steps of the program.
My question is in three parts based on these expensive loops:
Initialization of an array to zero. When J is declared to be a double array are the values of the array initialized to zero? If not, is there a fast way to set all the elements to zero?
void spam(){
double J[151][151];
/* Other relevant variables declared */
calcJac(data,J,y);
/* Use J */
}
static void calcJac(UserData data, double J[151][151],N_Vector y)
{
/* The first expensive loop */
int iter, jter;
for (iter=0; iter<151; iter++) {
for (jter = 0; jter<151; jter++) {
J[iter][jter] = 0;
}
}
/* More code to populate J from data and y that runs very quickly */
}
During the course of solving I need to solve matrix equations defined by P = I - gamma*J. The construction of P is taking longer than solving the system of equations it defines, so something I'm doing is likely in error. In the relatively slow loop below, is accessing a matrix that is contained in a structure 'data' the slow component or is it something else about the loop?
for (iter = 1; iter<151; iter++) {
for(jter = 1; jter<151; jter++){
P[iter-1][jter-1] = - gamma*(data->J[iter][jter]);
}
}
Is there a best practice for matrix multiplication? In the loop below, Ith(v,iter) is a macro for getting the iter-th component of a vector held in the N_Vector structure 'v' (a data type used by the Sundials solvers). Particularly, is there a best way to get the dot product between v and the rows of J?
Jv_scratch = 0;
int iter, jter;
for (iter=1; iter<151; iter++) {
for (jter=1; jter<151; jter++) {
Jv_scratch += J[iter][jter]*Ith(v,jter);
}
Ith(Jv,iter) = Jv_scratch;
Jv_scratch = 0;
}
1) No, they're not. You can memset the array as follows:
memset( J, 0, sizeof( double ) * 151 * 151 );
or you can use an array initialiser:
double J[151][151] = { 0.0 };
2) Well you are using a fairly complex calculation to calculate the position of P and the position of J.
You may well get better performance by stepping through as pointers:
for (iter = 1; iter<151; iter++)
{
double* pP = &P[iter-1][0];     /* destination row starts at column 0 (jter-1) */
double* pJ = &data->J[iter][1]; /* source row starts at column 1 (jter) */
for(jter = 1; jter<151; jter++, pP++, pJ++ )
{
*pP = - gamma * *pJ;
}
}
This way you move much of the array index calculation outside of the inner loop.
3) The best practice is to try and move as many calculations out of the loop as possible. Much like I did on the loop above.
First, I'd advise you to split up your question into three separate questions. It's hard to answer all three; I, for example, have not worked much with numerical analysis, so I'll only answer the first one.
First, variables on the stack are not initialized for you. But there are faster ways to initialize them. In your case I'd advise using memset:
static void calcJac(UserData data, double J[151][151],N_Vector y)
{
memset((void*)J, 0, sizeof(double) * 151 * 151);
/* More code to populate J from data and y that runs very quickly */
}
memset is a fast library routine to fill a region of memory with a specific pattern of bytes. It just so happens that setting all bytes of a double to zero sets the double to zero, so take advantage of your library's fast routines (which will likely be written in assembler to take advantage of things like SSE).
Others have already answered some of your questions. On the subject of matrix multiplication: it is difficult to write a fast algorithm for this unless you know a lot about cache architecture and so on (the slowness comes from the order in which you access array elements, which can cause thousands of cache misses).
You can try Googling for terms like "matrix-multiplication", "cache", "blocking" if you want to learn about the techniques used in fast libraries. But my advice is to just use a pre-existing maths library if performance is key.
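For the matrix-vector product in the question, the basic cache-friendly pattern is to walk each row contiguously. The sketch below assumes plain 0-based C arrays rather than the Sundials N_Vector and Ith macro, and the names DIM and mat_vec are made up for illustration.

```c
#include <stddef.h>

#define DIM 151   /* matrix dimension taken from the question */

/* out = J * v for a row-major DIM x DIM matrix, indices 0..DIM-1.
   The inner loop reads one row front to back and accumulates in a
   local variable, so memory is touched in the order it is laid out. */
void mat_vec(double J[DIM][DIM], const double v[DIM], double out[DIM])
{
    for (size_t i = 0; i < DIM; i++) {
        const double *row = J[i];
        double acc = 0.0;
        for (size_t j = 0; j < DIM; j++)
            acc += row[j] * v[j];
        out[i] = acc;
    }
}
```

This is the same loop order the question already uses, so for a 151 x 151 system the remaining gains probably come from compiler optimization flags or a tuned BLAS routine rather than from rewriting the loop by hand.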
Initialization of an array to zero.
When J is declared to be a double
array are the values of the array
initialized to zero? If not, is there
a fast way to set all the elements to
zero?
It depends on where the array is allocated. If it is declared at file scope, or as static, then the C standard guarantees that all elements are set to zero. The same is guaranteed if you set the first element to a value upon initialization, ie:
double J[151][151] = {0}; /* set first element to zero */
By setting the first element to something, the C standard guarantees that all other elements in the array are set to zero, as if the array were statically allocated.
Practically for this specific case, I very much doubt it will be wise to allocate 151*151*sizeof(double) bytes on the stack no matter which system you are using. You will likely have to allocate it dynamically, and then none of the above matters. You must then use memset() to set all bytes to zero.
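A sketch of the dynamic variant follows. The typedef Row and the function name are my own, and it relies on all-bits-zero being 0.0 for double, which holds for IEEE 754 floating point.

```c
#include <stdlib.h>
#include <string.h>

#define DIM 151

typedef double Row[DIM];   /* one matrix row; hypothetical helper type */

/* Allocate a DIM x DIM matrix on the heap and zero it.
   Returns NULL if the allocation fails; release it with free(). */
Row *alloc_zeroed_matrix(void)
{
    Row *J = malloc(DIM * sizeof *J);
    if (J != NULL)
        memset(J, 0, DIM * sizeof *J);
    return J;
}
```

The returned pointer can be indexed as J[i][j] just like the stack array. calloc(DIM, sizeof(Row)) would allocate and zero in a single call.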
In the
relatively slow loop below, is
accessing a matrix that is contained
in a structure 'data' the slow
component or is it something else
about the loop?
You should ensure that the function called from it is inlined. Otherwise there isn't much else you can do to optimize the loop: what is optimal is highly system-dependent (ie how the physical cache memories are built). It is best to leave such optimization to the compiler.
You could of course obfuscate the code with manual optimization things such as counting down towards zero rather than up, or to use ++i rather than i++ etc etc. But the compiler really should be able to handle such things for you.
As for matrix multiplication, I don't know of the mathematically most efficient way, but I suspect it is of minor relevance to the efficiency of the code. The big time thief here is the double type. Unless you really have need for high accuracy, I'd consider using float or int to speed up the algorithm.

Resources