Is static const array preprocessed? - arrays

I have a lot of constants and they can be separated by groups, so I used some static const arrays of doubles.
But I need to do some calculations with this array. Therefore, I created another array that stores the calculated values - because I use them a lot.
However, I index these arrays a lot, so that is very slow too, and my code is currently O(6^n²), with n between 1 and 12.
My question is: which is faster, repeating the same calculations many times, or indexing the array that stores the calculated values?
I thought about using a lot of #defines (because I know they are handled by the preprocessor), but I can't index defines, which would make the code extremely big and unclear.
short code (example)
const static double array1[12] = {2,4,6,8,...};
const static double array2[12] = {1,2,3,4,...};
...
// in some function
{
...
double stored1[12];
for(int i = start1; i < end1; i++)
stored1[i] = array1[i] + i*array1[i-1];
for(int i = start2; i < end2; i++)
stored1[i] = array2[i] + i*array2[i-1];
// then, I'll have to index stored1 a lot of times - or create 12 auxiliary variables
// when I need to use those arrays of stored values
// I use these values in loops of loops of loops (they are some summations)
// I don't use loops, but recursive functions to make this loops, but in both
// cases, I have to index a lot of time this array (or make the same calculations)
...
}

Related

Is there an approach to traverse array randomly?

I am trying to compare linear memory access to random memory access. I am traversing an array in the order of its indices to log performance of linear memory access. However to log memory's performance with random memory access I want to traverse my array randomly i.e arr[8], arr[17], arr[34], arr[2]...
Can I use pointer chasing to achieve this while ensuring that no index are accessed twice? Is pointer chasing most optimal approach in this case?
If your goal is to show that sequential access is faster than non-sequential access, simply pointer chasing the latter is not a good way to demonstrate that. You would be comparing access via a single pointer plus a simple offset against dereferencing one or more pointers before offsetting.
To use pointer chasing, you'd have to apply it to both cases. Here's an example:
int arr[n], i;
int *unshuffled[n];
int *shuffled[n];
for(i = 0; i < n; i++) {
unshuffled[i] = arr + i;
}
/* I'll let you figure out how to randomize your indices */
shuffle(unshuffled, shuffled);
/* Do timing on these two loops */
for(i = 0; i < n; i++) {
do_stuff(*unshuffled[i]);
}
for(i = 0; i < n; i++) {
do_stuff(*shuffled[i]);
}
If you want to time the direct access better though, you could construct some simple formula for advancing the index instead of randomizing the access completely:
for(i = 0; i < n; i++) {
do_stuff(arr[i]);
}
for(i = 0; i < n; i++) {
do_stuff(arr[i / 2 + (i % 2) * (n / 2)]);
}
This will only work properly for even n as shown, but it illustrates the idea. You could go so far as to compensate for the extra flops in computing the index within do_stuff.
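The index formula above can be adjusted so that it also covers odd n. The following is a minimal sketch, not part of the original answer; the names interleaved_index and sum_interleaved are made up for illustration. It uses (n + 1) / 2 as the offset, which equals n / 2 for even n and keeps the mapping a permutation for odd n.

```c
#include <assert.h>
#include <stddef.h>

/* Maps a loop counter i in [0, n) to an interleaved index:
 * even i walk the first half of the array, odd i walk the second half.
 * The offset (n + 1) / 2 equals n / 2 for even n (as in the loop above)
 * and still yields each index exactly once when n is odd. */
static size_t interleaved_index(size_t i, size_t n)
{
    assert(i < n);
    return i / 2 + (i % 2) * ((n + 1) / 2);
}

/* Sums arr in interleaved order; stands in for the loop you would time. */
static long sum_interleaved(const int *arr, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += arr[interleaved_index(i, n)];
    return total;
}
```

For n = 5 the visiting order is 0, 3, 1, 4, 2, so every element is still touched exactly once.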
Probably the most apples-to-apples test would be to literally access the indices you want, without loops or additional computations:
do_stuff(arr[0]);
do_stuff(arr[1]);
do_stuff(arr[2]);
...
do_stuff(arr[123]);
do_stuff(arr[17]);
do_stuff(arr[566]);
...
Since I'd imagine you'd want to test with large arrays, you can write a program to generate the actual test code for you, and possibly compile and run the result.
I can tell you that for arrays in C the access time is constant regardless of the index being accessed. There will be no difference between accessing them randomly or sequentially other than the fact that randomizing will in itself introduce additional computations.
But, to really answer your question, you would probably be best off to build some kind of lookup array and shuffle it a few times and use that array to get the next index. Obviously, you would be accessing two arrays, one sequentially and another randomly, by doing so, thus making the exercise pretty much useless.
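For the pointer-chasing variant asked about in the question, one possible way to build such a lookup array so that no index is visited twice is Sattolo's algorithm, which produces a permutation consisting of a single cycle. This is a sketch under assumptions: build_single_cycle and chase_sum are illustrative names, and rand() is used only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fills next[0..n-1] with a permutation that forms one single cycle
 * (Sattolo's algorithm). rand() % k has a slight modulo bias, which is
 * usually acceptable for a benchmark sketch. */
static void build_single_cycle(size_t *next, size_t n)
{
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % (i - 1);  /* j in [0, i-2], never i-1 */
        size_t tmp = next[i - 1];
        next[i - 1] = next[j];
        next[j] = tmp;
    }
}

/* Follows the cycle from index 0 for n steps and sums the visited elements.
 * Each load of next[pos] depends on the previous one, which is the point
 * of pointer chasing. */
static long chase_sum(const int *arr, const size_t *next, size_t n)
{
    long total = 0;
    size_t pos = 0;
    for (size_t step = 0; step < n; step++) {
        total += arr[pos];
        pos = next[pos];
    }
    return total;
}
```

Because the permutation is one cycle, following next[] for n steps from any start visits every index once and ends where it began. The caveat above still applies: the next[] array is itself read during the traversal.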

C: How to generate a fixed number of objects with an array of pointers

I would like to create 11 text layers for a pebble watch face.
Without a loop the code would look something like.
static TextLayer *time_layer_a;
static TextLayer *time_layer_b;
... and so on.
How can I do this with a loop and put the pointers to the objects in a list-like structure?
list: in this case array or chain would be a better word because the collection of pointers is for a display with a fixed number of text layers. And the number of layers will not be changed during the duration of the program. In C, a list is a structure that can be dynamically resized. Using "list like" could mislead helpful people to the assumption that the sought method of chaining is expected to be dynamic. This is not correct. A structure that uses a fixed allocation of memory is preferred.
Edit: an array as suggested by John3136 worked perfectly. The array has the added benefit of generating the object pointers with its declaration. And it's a plus that John3136 gave a way to have the code automatically adjust to the size of the array. This is a useful tool to have.
Here is the code as applied to create text layers for my watch face.
declarations:
int i;
static TextLayer* layers[11];
loading method:
// by John3136
// Note the sizeof() stuff means this works unchanged even if you change
// the number of layers.
for(i = 0; i < (short)(sizeof(layers) / sizeof(layers[0])); i++) // (short) converts the unsigned integer to a signed one
{
layers[i] = text_layer_create(GRect((bounds.size.w/4)*((i + 1)%4),
(bounds.size.h/PBL_IF_ROUND_ELSE(5,4))*((i > 2)
? ((i > 6)
? 3
: 2 )
: 1),
(bounds.size.w / 4) ,(bounds.size.h/PBL_IF_ROUND_ELSE(5,4))));
}
unloading method:
for(i = 0; i < (short)(sizeof(layers) / sizeof(layers[0])); i++)
{
text_layer_destroy(layers[i]);
}
Easiest way that meets your requirements as we know them: An array of 11 pointers to TextLayers.
static TextLayer* layers[11];
You can then populate with:
int i;
// Note the sizeof() stuff means this works unchanged even if you change
// the number of layers.
for(i = 0; i < sizeof(layers) / sizeof(layers[0]); i++)
{
layers[i] = some_func_that_creates_a_layer();
}
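The sizeof idiom can also be wrapped in a macro and paired with a size_t counter, which avoids the signed/unsigned comparison without a cast. The sketch below is only an illustration: Layer is a hypothetical stand-in for TextLayer so that it compiles without the Pebble SDK, and ARRAY_LEN, create_layers and destroy_layers are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Element count of a true array (does not work on a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for TextLayer. */
typedef struct { int id; } Layer;

static Layer *layers[11];

/* Creates one Layer per slot and returns how many were created. */
static size_t create_layers(void)
{
    size_t i;
    for (i = 0; i < ARRAY_LEN(layers); i++) {
        layers[i] = malloc(sizeof *layers[i]);
        if (layers[i] == NULL)
            break;                 /* stop at the first failed allocation */
        layers[i]->id = (int)i;
    }
    return i;
}

/* Releases every slot; free(NULL) is a no-op, so partial creation is fine. */
static void destroy_layers(void)
{
    for (size_t i = 0; i < ARRAY_LEN(layers); i++) {
        free(layers[i]);
        layers[i] = NULL;
    }
}
```

Changing the 11 in the declaration is then the only edit needed to change the number of layers.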

Maintain a sorted array that a separate, iterative function can keep accessing

I'm writing code for a decision tree in C. Right now it gives me the correct result (0% training error, low test error), but it takes a long time to run.
The problem lies in how often I run qsort. My basic algorithm is this:
for every feature
sort that feature column using qsort
remove duplicate feature values in that column
for every unique feature value
split
determine entropy given that split
save the best feature to split + split value
for every training_example
if training_example's value for best feature < best split value, store in Left[]
else store in Right[]
recursively call this function, using only the Left[] training examples
recursively call this function, using only the Right[] training examples
Because the last two lines are recursive calls, and because the tree can extend for dozens and dozens of branches, the number of calls to qsort is huge (especially for my dataset that has > 1000 features).
My idea to reduce the runtime is to create a 2d array (in a separate function) where each column is a sorted feature column. Then, as long as I maintain a vector of row numbers of the training examples in Left[] and Right[] for each recursive call, I can just call this separate function, grab the rows I want in the pre-sorted feature vector, and save the cost of having to qsort each time.
I'm fairly new to C and so I'm not sure how to code this. In MatLab I can just have a global array that any function can change or access, looking for something like that in C.
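The pre-sorting idea described above might be sketched as follows. This is an illustration under assumptions, not a tested decision-tree implementation: each feature column is a plain array of double, the names presort_rows and filter_sorted are invented here, and a file-scope pointer is used because the standard qsort comparator takes no context argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Column currently being sorted; read by the comparator. */
static const double *g_column;

static int cmp_rows_by_column(const void *a, const void *b)
{
    double x = g_column[*(const size_t *)a];
    double y = g_column[*(const size_t *)b];
    return (x > y) - (x < y);
}

/* Fills order[0..n-1] with row numbers sorted by column value.
 * Intended to run once per feature, before the recursion starts. */
static void presort_rows(const double *column, size_t *order, size_t n)
{
    for (size_t i = 0; i < n; i++)
        order[i] = i;
    g_column = column;
    qsort(order, n, sizeof *order, cmp_rows_by_column);
}

/* Copies the rows flagged in in_node[] from order[] to out[], preserving
 * the sorted order, and returns how many rows were kept. */
static size_t filter_sorted(const size_t *order, size_t n,
                            const unsigned char *in_node, size_t *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (in_node[order[i]])
            out[kept++] = order[i];
    return kept;
}
```

Each recursive call would then mark its Left[] or Right[] rows in in_node[] and call filter_sorted, which is a linear pass instead of a new sort.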
Global arrays in C are totally possible. There are actually two ways of doing that. In the first case the dimensions of the array are fixed for the application:
#define NROWS 100
#define NCOLS 100
int array[NROWS][NCOLS];
int main(void)
{
int i, j;
for (i = 0; i < NROWS; i++)
for (j = 0; j < NCOLS; j++)
{
array[i][j] = i+j;
}
return 0;
}
In the second example the dimensions may depend on values from the input.
#include <stdlib.h>
int **array;
int main(void)
{
int nrows = 100;
int ncols = 100;
int i, j;
array = malloc(nrows*sizeof(*array));
for (i = 0; i < nrows; i++)
{
array[i] = malloc(ncols*sizeof(*(array[i])));
for (j = 0; j < ncols; j++)
{
array[i][j] = i+j;
}
}
}
Although access to the arrays looks deceptively similar in both examples, the implementation of the arrays is quite different. In the first example the array is located in one piece of memory, and the stride from one row to the next is the size of a whole row. In the second example each row is reached through a pointer to that row, and each row is its own piece of memory. The various rows can, however, be located in different areas of memory. In the second example rows might also have different lengths. In that case you would need to store the length of each row somewhere too.
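A third layout, not shown in the examples above, keeps run-time dimensions but stores everything in one contiguous block and computes the row-major offset by hand. The following is a minimal sketch; the Matrix type and the matrix_* function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Run-time sized matrix stored in one contiguous block, row-major. */
typedef struct {
    size_t nrows, ncols;
    int *data;
} Matrix;

/* Returns 0 on success, -1 if allocation fails.
 * Note: nrows * ncols is not checked for overflow in this sketch. */
static int matrix_init(Matrix *m, size_t nrows, size_t ncols)
{
    m->nrows = nrows;
    m->ncols = ncols;
    m->data = calloc(nrows * ncols, sizeof *m->data);
    return m->data != NULL ? 0 : -1;
}

/* Returns a pointer to element (row, col). */
static int *matrix_at(Matrix *m, size_t row, size_t col)
{
    assert(row < m->nrows && col < m->ncols);
    return &m->data[row * m->ncols + col];
}

static void matrix_free(Matrix *m)
{
    free(m->data);
    m->data = NULL;
}
```

This needs a single allocation and a single free, and consecutive rows sit next to each other in memory as in the fixed-size case, at the cost of writing *matrix_at(&m, i, j) instead of array[i][j].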
I don't fully understand what you are trying to achieve, because I'm not familiar with the terminology of decision tree, feature and the standard approaches to training sets. But you may also want to have a look at other data structures to maintain sorted data:
http://en.wikipedia.org/wiki/Red–black_tree maintains a more or less balanced and sorted tree.
AVL tree a bit slower but more balanced and sorted tree.
Trie a sorted tree on lists of elements.
Hash function to easily map a complex element to an integral value that can be used to sort the elements. Good for finding exact elements, but there is no real order in the elements itself.
P.S1: Coming from Matlab you may want to consider a different language from C to move to. C++ has standard libraries to support above data structures. Java, Python come to mind or even Haskell if you are daring. Pointer handling in C can be quite tedious and error prone.

Growing an R matrix inside a C loop

I have a routine that generates a series of data vectors, one iteration at a time. I would like to find a way to "grow" either a list or a matrix that holds these vectors. I tried to create a list,
PROTECT( myList = allocVector( VECSXP, 1 ) )
But is there a way to grow the list, by pushing a vector element in the end?
Also, I wouldn't mind using a matrix, since the vectors I generate are of the same length.
Rf_lengthgets in Rinternals.h; implemented in builtin.c:lengthgets. The returned pointer needs to be PROTECTed, so one pattern is
SEXP myList;
PROTECT_INDEX ipx;
PROTECT_WITH_INDEX(myList = allocVector( VECSXP, 1 ), &ipx);
REPROTECT(myList = Rf_lengthgets(myList, 100), ipx);
If one were growing a list based on some unknown stopping condition, the approach might be like in R, with pre-allocate and fill followed by extension; the following is pseudo-code:
const int BUF_SIZE = 100;
PROTECT_INDEX ipx;
SEXP myList, result;
int i, some_condition = 1;
PROTECT_WITH_INDEX(myList = allocVector(VECSXP, BUF_SIZE), &ipx);
for (i = 0; some_condition; ++i) {
if (Rf_length(myList) == i) {
// buffer is full: extend it by another BUF_SIZE slots
const int len = Rf_length(myList) + BUF_SIZE;
REPROTECT(myList = Rf_lengthgets(myList, len), ipx);
}
PROTECT(result = some_calculation());
SET_VECTOR_ELT(myList, i, result);
UNPROTECT(1);
// set some_condition
}
myList = Rf_lengthgets(myList, i); // trim to the i filled slots; no need to re-PROTECT; we're leaving C
UNPROTECT(1);
return myList;
This performs a deep copy of myList, so it can become expensive. In some ways, if the main objective is to evaluate some_calculation, it seems easier, and not too much less efficient, to do the pre-allocate and extend operations in an R loop, calling some_calculation and doing the assignment inside the loop.
This is IMHO a good example of where C++ beats C hands-down.
In C++, you can use a STL container (such as vector) and easily insert elements one at a time using push_back(). You never use malloc or free (or new and delete), and you never touch pointers. There is just no way to do that in C.
As well, you can make use of the Rcpp interface between R and C++ which makes getting the data you have grown in C++ over to R a lot easier.

How to do a simple random sort of values within an array

I need to sort these values randomly in an array.
int [] d = new int[26];
d[0]=1;
d[1]=5;
d[2]=10;
d[3]=25;
d[4]=50;
d[5]=75;
d[6]=100;
d[7]=200;
d[8]=300;
d[9]=400;
d[10]=500;
d[11]=750;
d[12]=1000;
d[13]=2000;
d[14]=3000;
d[15]=4000;
d[16]=5000;
d[17]=7500;
d[18]=10000;
d[19]=25000;
d[20]=50000;
d[21]=100000;
d[22]=250000;
d[23]=500000;
d[24]=750000;
d[25]=1000000;
Assuming you are writing this in C++ you can use random_shuffle function template from the Standard library. http://www.cppreference.com/wiki/algorithm/random_shuffle
If you want to write your own function, simply take two random indices and swap their values. Put that in a loop and do it as many times as you think necessary for the array to be well shuffled (I would say a number of times equal to two times the number of elements in the array).
In pseudo-code (since you haven't indicated the language)
NUMBER_OF_SHUFFLES = 2;
for(ix = 0; ix < NUMBER_OF_SHUFFLES * myArray.length; ix++)
index1 = random(myArray.length)
index2 = random(myArray.length)
temp = myArray[index1]
myArray[index1] = myArray[index2]
myArray[index2] = temp
There are more sophisticated ways of doing it as well. Check out this discussion: An Efficient way of randomizing an array - Shuffle code
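One of the more sophisticated approaches is the Fisher-Yates shuffle, which needs only one pass and gives every ordering the same chance (up to the quality of the random source). A minimal sketch in C, assuming rand() is good enough for the purpose; shuffle_ints is an illustrative name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fisher-Yates shuffle: walks from the end of the array and swaps each
 * element with a randomly chosen element at or before it.
 * rand() % i has a slight modulo bias; seed with srand() beforehand if a
 * different order is wanted on each run. */
static void shuffle_ints(int *a, size_t n)
{
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i;   /* j in [0, i-1] */
        int tmp = a[i - 1];
        a[i - 1] = a[j];
        a[j] = tmp;
    }
}
```

For the array in the question this would be called as shuffle_ints(d, 26) on an equivalent C array.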
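If it's Java, note that there is no Arrays.shuffle; Collections.shuffle works on a List, so with d declared as Integer[] instead of int[] you may use
Collections.shuffle(Arrays.asList(d));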
