How to optimize repetitive addition in C
#define NULL 0
int main()
{
int *array1=NULL,*array2=NULL;
int x =add(array1[0],array2[0]);
int y =add(array1[1],array2[7]);
int x =add(array1[2],array2[3]);
int y =add(array1[3],array2[4]);
int x =add(array1[4],array2[6]);
int y =add(array1[5],array2[1]);
int x =add(array1[6],array2[5]);
int y =add(array1[7],array2[2]);
................
................
int x =add(array1[252],array2[0]);
int y =add(array1[253],array2[7]);
int x =add(array1[254],array2[3]);
int y =add(array1[255],array2[4]);
}
Basically, the index for array1 increments by 1, from 0 up to 255,
but the index for array2 always stays within 0 to 7. I want to optimize these repeated additions. How can I do that?
What you can do is:
int j = 0,order[] = {0,7,3,4,6,1,5,2};
for(int i = 0;i <256; i +=2)
{
int x =add(array1[i],array2[order[j%8]]);
j++;
int y =add(array1[i+1],array2[order[j%8]]);
j++;
}
UPDATE
An alternative solution (if you do not want to use i += 2):
int j = 0,order[] = {0,7,3,4,6,1,5,2};
for(int i = 0;i <256; i ++)
{
int x =add(array1[i],array2[order[j%8]]);
j++;
i++;
if(i>=256) break; // guards the second access; only matters if the element count is odd
int y =add(array1[i],array2[order[j%8]]);
j++;
}
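The table-driven idea above can be condensed into a single loop body. The following is only a sketch under assumptions: `add()` is a stand-in for the function in the question (its real body is not shown), `add_all` and the `out` array are hypothetical names introduced here so that every sum is kept instead of overwriting `x` and `y`, and the index pattern is assumed to repeat every 8 entries as in the visible part of the question.

```c
#include <stddef.h>

/* Hypothetical stand-in for the question's add(); the real one is not shown. */
static int add(int a, int b)
{
    return a + b;
}

/* The 8-entry index pattern visible in the question (assumed to repeat). */
static const unsigned char order[8] = {0, 7, 3, 4, 6, 1, 5, 2};

/* Computes out[i] = add(array1[i], array2[order[i % 8]]) for i in [0, n). */
void add_all(const int *array1, const int *array2, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* i & 7 equals i % 8 because the table length is a power of two. */
        out[i] = add(array1[i], array2[order[i & 7]]);
    }
}
```

Calling `add_all(array1, array2, out, 256)` would cover the whole range from the question. If the real pattern does not repeat every 8 entries, the same loop works with a longer table indexed by `i` directly.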
edit by sam
Now I want to compare these two values, x and y, and select a value based on the comparison:
CurrentTre=256;
if (x > y)
{
*array3[0]= x;
*array4[CurrentTre +0] = 0;
}
else
{
*array3[i] = y;
*array4[CurrentTre + 0] = 1;
}
..........
..........
if (x > y)
{
*array3[0]= x;
*array4[CurrentTre +127] = 254;
}
else
{
*array3[i] = y;
*array4[CurrentTre + 127] = 255;
}
/////////////
My approach is this:
if (x > y)
{
*array3[i]= x;
*array4[int CurrentTre +i] = int number[i]<<1;
}
else
{
array3[i] = y;
array4[int CurrentTre + i] = int number[i]<<1|1;
}
} //end function main
I want to optimize the code. My optimized version is given below;
please check whether I am doing it right.
uint32 even_number[255] ={0};
uint32 loop_index1=0;
uint32 loop_index2=0;
uint16 order[256]={0,7,3,4,6,1,5,2,4,3,7,0,1,6,2,5,7,0,4,3,2,5,1,6,3,4,0
,7,6,1,5,2,4,3,7,0,1,6,2,5,0,7,3,4,5,2,6,1,3,4,0,7,6,1,5,2,7,0,4,3,2,5
,1,6,5,2,6,1,0,7,3,4,1,6,2,5,4,3,7,0,2,5,1,6,7,0,4,3,6,1,5,2,3,4,0,7,1
,6,2,5,4,3,7,0,5,2,6,1,0,7,3,4,6,1,5,2,3,4,0,7,2,5,1,6,7,0,4,3,3,4,0,7
,6,1,5,2,7,0,4,3,2,5,1,6,4,3,7,0,1,6,2,5,0,7,3,4,5,2,6,1,7,0,4,3,2,5,1
,6,3,4,0,7,6,1,5,2,0,7,3,4,5,2,6,1,4,3,7,0,1,6,2,5,6,1,5,2,3,4,0,7,2,5
,1,6,7,0,4,3,1,6,2,5,4,3,7,0,5,2,6,1,0,7,3,4,2,5,1,6,7,0,4,3,6,1,5,2,3
,4,0,7,5,2,6,1,0,7,3,4,1,6,2,5,4,3,7,0}; //all 256 values
for(loop_index1;loop_index1<256;loop_index1++)
{
m0= (CurrentState[loop_index1]+Branch[order[loop_index2]]);
loop_index2++;
loop_index1++;
if(loop_index1>=256)
break;
m1= (CurrentState[loop_index1]+Branch[order[loop_index2]]);
loop_index2++;
if (m0 > m1)
{
NextState[loop_index1]= m0;
SurvivorState[CurrentTrellis + loop_index1] =
even_number[loop_index1]<<1;
}
else
{
NextState[loop_index1] = m1;
SurvivorState[CurrentTrellis + loop_index1] =
even_number[loop_index1]<<1|1;
}
}
First step:
for (int i = 0; i < 256; i+=8) {
x = add(array1[i], array2[0]);
y = add(array1[i+1], array2[7]);
...
}
Use a pointer
int *array1 = NULL, *array2 = NULL;
int *ptr1 = array1;
int x = add(*ptr1++, array2[0]);
int y = add(*ptr1++, array2[7]);
int x = add(*ptr1++, array2[3]);
int y = add(*ptr1++, array2[4]);
int x = add(*ptr1++, array2[6]);
int y = add(*ptr1++, array2[1]);
int x = add(*ptr1++, array2[5]);
int y = add(*ptr1++, array2[2]);
................
................
int x = add(*ptr1++, array2[0]);
int y = add(*ptr1++, array2[7]);
int x = add(*ptr1++, array2[3]);
int y = add(*ptr1++, array2[4]);
But remember: premature optimization is the root of all evil.
Before optimizing, measure and make certain that the array accesses actually need optimizing.
After optimizing, measure again to make certain the optimization had an effect.
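One way to do that measuring, sketched with the standard `clock()` function. The names `time_cpu_seconds` and `count_up` are made up for this example, and `count_up` is only a placeholder workload; in practice you would pass a function that runs the loop you want to compare.

```c
#include <time.h>

/* Placeholder workload for illustration: increments the int that arg points to. */
void count_up(void *arg)
{
    ++*(int *)arg;
}

/* Hypothetical helper: runs fn(arg) reps times and returns the CPU seconds used. */
double time_cpu_seconds(void (*fn)(void *), void *arg, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        fn(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Run both the original and the "optimized" version through the same helper with a large enough `reps` that the result is well above the clock resolution, and compare the two numbers. Keep in mind that an optimizing compiler may already unroll or simplify either version.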
Related
Efficient way to find rows with same elements in a 3D matrix in C
I have a 3D matrix mat[100][100][100]. What is the efficient way to find a row with same elements that appears in mat[0][][], mat[1][][],....,mat[99][][]? A simple approach would be comparing each row of mat[0][][] to all rows of the remaining 99 matrices, but it wouldn't be very efficient(I guess). Is there a better way to do it?
To expand on the comment by @chux, the first step is to compute a hash value for each row of each matrix. That's 10000 hash values in all. The results should be stored in an array of 10000 structs.
struct info
{
    int m;          // the matrix number
    int row;        // the row number
    uint32_t hash;  // the hash value for mat[m][row]
};
static struct info hashArray[10000];
After filling in all 10000 entries of the hashArray, sort the array by hash value. Then you can simply scan the array to find any duplicate hash values. When you do find duplicates, you need to confirm by comparing the row elements.
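The hash, sort, and scan steps described above could look roughly like the sketch below. It is scaled down and makes several assumptions that are not in the answer: rows are stored in one flat table of `COLS` ints each (the question's 100x100x100 matrix would be 10000 rows of 100), the hash is FNV-1a (any reasonable hash would do), and the names `row_hash`, `by_hash`, and `count_duplicate_rows` are invented for the example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define COLS 4  /* small illustrative row width; the question uses 100 */

struct info {
    int row;        /* index of the row in the flat table */
    uint32_t hash;  /* hash value of that row */
};

/* FNV-1a over the bytes of one row (an assumed choice of hash). */
static uint32_t row_hash(const int *row)
{
    const unsigned char *p = (const unsigned char *)row;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < COLS * sizeof(int); i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* qsort comparator: orders entries by hash value. */
static int by_hash(const void *a, const void *b)
{
    uint32_t ha = ((const struct info *)a)->hash;
    uint32_t hb = ((const struct info *)b)->hash;
    return (ha > hb) - (ha < hb);
}

/* Returns how many rows repeat another row, or -1 if allocation fails. */
int count_duplicate_rows(const int rows[][COLS], int nrows)
{
    if (nrows <= 0)
        return 0;
    struct info *tab = malloc((size_t)nrows * sizeof *tab);
    if (tab == NULL)
        return -1;
    for (int r = 0; r < nrows; r++) {
        tab[r].row = r;
        tab[r].hash = row_hash(rows[r]);
    }
    qsort(tab, (size_t)nrows, sizeof *tab, by_hash);

    int dups = 0;
    int start = 0;  /* first entry of the current run of equal hashes */
    for (int i = 1; i < nrows; i++) {
        if (tab[i].hash != tab[start].hash) {
            start = i;
            continue;
        }
        /* Equal hashes only suggest a match; confirm by comparing elements. */
        for (int j = start; j < i; j++) {
            if (memcmp(rows[tab[i].row], rows[tab[j].row], sizeof rows[0]) == 0) {
                dups++;
                break;
            }
        }
    }
    free(tab);
    return dups;
}
```

The inner confirmation loop only runs within a run of equal hash values, so with a decent hash the total cost is dominated by the sort.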
I finally found some time to write the content addressable code. It turns out to be much faster than using hash tables. But the catch is that the code is way more complex and the program takes WAY more memory. My final opinion is that unless you really need the extra speed, stick with the hash table.
Some examples of test runs are given below. The argument to the program specifies the number of unique rows. The program fills the rest with randomly chosen existing rows. Then the rows are shuffled. The program looks for all duplicate rows and reports the number of duplicate rows and the time it took for both hash and content addressable tables.
bing@mint5 ~ $ cc -O2 cattest.c -o cattest
bing@mint5 ~ $ ./cattest 500
CAT Test 9500 0.0083
Hash Test 9500 0.0499
bing@mint5 ~ $ ./cattest 5000
CAT Test 5000 0.0195
Hash Test 5000 0.1477
bing@mint5 ~ $ ./cattest 9000
CAT Test 1000 0.0321
Hash Test 1000 0.1092
/* content addressable table vs hash table */
/* written by Bing H Bang */
/* I DONOT give permission to any snot-nosed students to copy my work and turn it in as their own */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <pthread.h>
#include <errno.h>
#include <string.h>
#include <sys/time.h>
#include <sys/sysinfo.h>
double etime()
{
    struct timeval tv;
    double dt, df;
    gettimeofday(&tv, NULL);
    dt = (double)(tv.tv_sec);
    df = ((double)(tv.tv_usec))/1000000.0;
    return(dt+df);
}
struct CAT_entry
{
    unsigned fval;
    unsigned short rows[10000];
    unsigned short num;
    unsigned short used;
    struct CAT_entry *next;
} *CAT[256] = {NULL};
struct CAT_entry stmem[10000];
int stidx = 0;
unsigned dat[100][10000];
char map[10000];
unsigned hasht[10000];
#define highbit (1 << ((sizeof(unsigned)*8)-1))
unsigned rotxor(unsigned sum, unsigned v)
{
    if((sum & highbit) == 0)
        return ((sum << 1) ^ v);
    else
        return (((sum << 1) | 1) ^ v);
}
unsigned compute_hash(int y)
{
    int x;
    unsigned sum = 0;
    for(x = 0; x < 100; ++x)
        sum = rotxor(sum, dat[x][y]);
    return sum;
}
void mk_hasht()
{
    int y;
    for(y = 0; y < 10000; ++y)
        hasht[y] = compute_hash(y);
}
clearmap()
{
    memset((void *)map, 0, 10000);
}
comprow(int y, int yd)
{
    int x;
    for(x = 0; x < 100; ++x)
        if(dat[x][y] != dat[x][yd])
            return 0;
    return 1;
}
struct CAT_entry ** srch_CAT(unsigned value)
{
    struct CAT_entry **p = &(CAT[value&255]);
    static struct CAT_entry *r = NULL;
    while(*p != NULL)
    {
        if((*p)->fval == value)
            break;
        if((*p)->fval > value)
            return &r;
        else
            p = &((*p)->next);
    }
    return p;
}
void add_entry(int y, unsigned value)
{
    struct CAT_entry **p = &(CAT[value&255]), *q;
    while(*p != NULL)
    {
        q = *p;
        if(q->fval == value)
        {
            q->rows[q->num] = y;
            q->num++;
            return;
        }
        if(q->fval > value)
            break;
        p = &(q->next);
    }
    q = *p;
    //*p = malloc(sizeof(struct CAT_entry));
    *p = &stmem[stidx++];
    (*p)->next = q;
    q = *p;
    q->fval = value;
    q->num = 0;
    q->used = 0;
}
void mk_CAT()
{
    int x,y;
    struct CAT_entry **p, *q;
    for(y = 0; y < 10000; y++)
        add_entry(y, dat[0][y]);
    for(x=0; x < 256; ++x)
    {
        p = &(CAT[x]);
        while(*p != NULL)
        {
            q = *p;
            if(q->num == 0)
            {
                *p = q->next;
                //free(q);
            }
            else
                p = &(q->next);
        }
    }
}
void gen_data(int npat)
{
    int x, y, rnum, limit;
    unsigned r;
    srandom(time(NULL));
    rnum = npat * 100;
    for(y = 0; y < rnum; ++y)
        dat[y%100][y/100] = random();
    for(y = npat; y < 10000; ++y)
    {
        rnum = random() % npat;
        for(x = 0; x < 100; ++x)
            dat[x][y]=dat[x][rnum];
    }
    for(y = 0; y < 10000; ++y)
    {
        rnum = random() % 10000;
        if(rnum == y)
            continue;
        for(x = 0; x < 100; ++x)
        {
            r = dat[x][y];
            dat[x][y]=dat[x][rnum];
            dat[x][rnum] = r;
        }
    }
}
int do_CATtest()
{
    int y, yd, count = 0, i;
    struct CAT_entry **p, *q;
    mk_CAT();
    clearmap();
    for(y = 0; y < 9999; ++y)
    {
        if(map[y] == 0)
        {
            map[y] = 1;
            if(*(p = srch_CAT(dat[0][y])) != NULL)
            {
                for(q = *p, i = 0; i < q->num; ++i)
                {
                    yd = q->rows[i];
                    if(map[yd] == 0)
                    {
                        if(comprow(y, yd))
                        {
                            map[yd] = 1;
                            ++count;
                            q->used++;
                        }
                    }
                }
                if(q->num <= q->used)
                    *p = q->next;
            }
        }
    }
    return count;
}
int do_hashtest()
{
    unsigned h;
    int x, y, yd, count = 0;
    mk_hasht();
    clearmap();
    for(y = 0; y < 9999; ++y)
    {
        if(map[y] != 0)
            continue;
        map[y] = 1;
        h = hasht[y];
        for(yd = y+1; yd < 10000; ++yd)
        {
            if(map[yd] != 0)
                continue;
            if(h == hasht[yd])
                if(comprow(y, yd))
                {
                    map[yd] = 1;
                    ++count;
                }
        }
    }
    return count;
}
main(int c, char *v[])
{
    int npat = 0, count;
    double t1, t2;
    if(c == 2)
        npat = atoi(v[1]);
    if(npat <= 0 || npat >= 10000)
    {
        puts("input param error");
        exit(1);
    }
    gen_data(npat);
    npat = 10000 - npat;
    t1 = etime();
    if((count = do_CATtest()) != npat)
    {
        printf("CAT test error, %d matches found, not %d", count, npat);
        exit(1);
    }
    t2 = etime();
    printf("CAT Test %d %.4f\n", npat, t2-t1);
    t1 = etime();
    if((count = do_hashtest()) != npat)
    {
        printf("hash test error, %d matches found, not %d", count, npat);
        exit(1);
    }
    t2 = etime();
    printf("Hash Test %d %.4f\n", npat, t2-t1);
}
Make a content addressable table of the first values in each row. Then go through each row, take the first value and look it up in the table. If the lookup returns multiple rows, then those rows should be checked for a match. To increase efficiency, remember which rows have already been matched, because they need not be checked again. You'll end up with a list of identical row groupings.
Stack frame size of a function
Simple question: Is there a way to determine the stack size of a function?
int stackframe_size(int run)
{
    int i;
    if(!run)
    {
        return ((int)(&i) - stackframe_size(++run));
    }
    return (int)(&i);
}
int main()
{
    int x, y;
    double d;
    char c;
    int a = 4;
    int b = 5;
    int we = 6;
    int e = 123123;
    int hmm = 34453;
    int lol = 45;
    int asd = 23;
    x = 1;
    y = g(x);
    d = f(x, y, x-y);
    c = 'a';
    printf("%d", stackframe_size(0));
}
I am running the function I obtained from another thread to find the call stack size and it always seems to return 48. Is there another way to find out, or is this the only way?
FIR filter in C?
I have a homework to implement an FIR filter in C and I wonder whether you think I understood the assignment correctly. The program I wrote that I think solves the problem is:
#include <stdio.h>
float FIRfloats[5];
void floatFIR(float newsample)
{
    int i;
    float sum=0;
    FIRfloats[0]=newsample*0.0299;
    FIRfloats[1]=FIRfloats[2]*0.4701;
    FIRfloats[2]=FIRfloats[3]*0.4701;
    FIRfloats[3]=FIRfloats[4]*0.0299;
    /* sum */
    for(i=0;i<5;i++)
    {
        sum=sum+FIRfloats[i];
    }
    printf("Sum: %f\n", sum);
}
int main ()
{
    float n=0.0f;
    while (scanf("%f", &n) > 0)
    {
        floatFIR(n);
    }
    return 0;
}
And the specification is:
Before a new sample xk arrives the old samples are shifted to the right and then each sample is scaled with a coefficient before the result yk, the total sum of all scaled samples, is calculated.
Coefficients should be c0=0.0299, c1=0.4701, c2=0.4701, c3=0.0299.
Do you think that I solved the assignment correctly? I think it seemed too easy and therefore I wonder.
I'm afraid the implementation provided in the question will not provide the correct results. In an FIR (Finite Impulse Response) filter with 4 coefficients, the output series (y) for the input series (x) is:
y[t] = c0*x[t] + c1*x[t-1] + c2*x[t-2] + c3*x[t-3]
Therefore the implementation should be similar to:
/* add includes (stdio.h and whatever else you'll need...) */
float floatFIR(float inVal, float* x, float* coef, int len)
{
    float y = 0.0;
    /* shift the old samples one place to the right, accumulating as we go */
    for (int i = (len-1) ; i > 0 ; i--)
    {
        x[i] = x[i-1];
        y = y + (coef[i] * x[i]);
    }
    x[0] = inVal;
    y = y + (coef[0] * x[0]);
    return y;
}
int main(int argc, char** argv)
{
    float coef[4] = {0.0299, 0.4701, 0.4701, 0.0299};
    float x[4] = {0, 0, 0, 0}; /* or any other initial condition */
    float y;
    float inVal;
    while (scanf("%f", &inVal) > 0)
    {
        y = floatFIR(inVal, x, coef, 4);
    }
    return 0;
}
This does the shift and the multiplication in the same loop, which does not affect the results; it is only more efficient. If you want to follow the spec exactly, you can change floatFIR like this:
float floatFIR(float inVal, float* x, float* coef, int len)
{
    float y = 0.0;
    /* first shift all old samples */
    for (int i = (len-1) ; i > 0 ; i--)
    {
        x[i] = x[i-1];
    }
    x[0] = inVal;
    /* then scale and sum */
    for (int i = 0 ; i < len ; i++)
    {
        y = y + (coef[i] * x[i]);
    }
    return y;
}
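A quick way to sanity-check any FIR implementation is to feed it a unit impulse (1, 0, 0, ...): by the formula for y[t] above, the outputs must then reproduce the coefficients c0, c1, c2, c3 in order and be zero afterwards. The sketch below assumes the shift-then-sum structure of the spec-following version; `fir_step` and `impulse_response` are names made up for this illustration.

```c
/* One filter step: shift the old samples right, insert the new one, then sum. */
float fir_step(float in, float *x, const float *coef, int len)
{
    for (int i = len - 1; i > 0; i--)
        x[i] = x[i - 1];
    x[0] = in;

    float y = 0.0f;
    for (int i = 0; i < len; i++)
        y += coef[i] * x[i];
    return y;
}

/* Clears the delay line x, feeds a unit impulse, and stores n outputs in out. */
void impulse_response(const float *coef, int len, float *x, float *out, int n)
{
    for (int i = 0; i < len; i++)
        x[i] = 0.0f;
    for (int k = 0; k < n; k++)
        out[k] = fir_step(k == 0 ? 1.0f : 0.0f, x, coef, len);
}
```

Running the same check against the code in the question shows the problem quickly, because its output for an impulse does not step through the four coefficients.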
cast error and invalid conversion error
error: cast from 'void*' to 'unsigned int' loses precision
error: invalid conversion from 'unsigned int' to 'unsigned int**'
Can you tell me how to cast this properly? I am getting the error on this line inside the main function:
color = (unsigned int)malloc(height*sizeof(unsigned int));
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
unsigned int width;
unsigned int height;
unsigned int **color = NULL;
bool file_write()
{
    FILE *fractal = fopen("mandelbrot_imageSequential.ppm","w+");
    if(fractal != NULL)
    {
        fprintf(fractal,"P6\n");
        fprintf(fractal,"# %s\n", "Mandelbrot_imageSequential.ppm");
        fprintf(fractal,"%d %d\n", height, width);
        fprintf(fractal,"40\n");
        int x = 0, y = 0;
        unsigned int R = 0, G = 0, B = 0;
        for(x = 0; x < width; ++x)
        {
            for(y = 0; y < height; ++y)
            {
                R = (color[y][x]*10);
                G = 255-((color[y][x]*10));
                B = ((color[y][x]*10)-150);
                if(R == 10) R = 11;
                if(G == 10) G = 11;
                if(B == 10) B = 11;
                putc(R, fractal);
                putc(G, fractal);
                putc(B, fractal);
            }
        }
        fclose(fractal);
    }
    return true;
}
int method(int x, int y, double min_re, double max_re, double min_im, double max_im, int max_iterations)
{
    double threshold = 4;
    double x_factor = (max_re-min_re)/(width-1);
    double y_factor = (max_im-min_im)/(height-1);
    double c_im = max_im - y*y_factor;
    double c_re = min_re + x*x_factor;
    double Z_re = c_re, Z_im = c_im;
    unsigned int col = 0;
    for(unsigned n = 0; n < max_iterations; ++n)
    {
        double Z_re2 = Z_re*Z_re, Z_im2 = Z_im*Z_im;
        if(Z_re2 + Z_im2 > threshold)
        {
            col = n;
            break;
        }
        Z_im = 2 * Z_re * Z_im + c_im;
        Z_re = Z_re2 - Z_im2 + c_re;
    }
    return col;
}
void method1(double min_re, double max_re, double min_im, double max_im, int max_iterations)
{
    for(int x = 0; x < width; x++)
    {
        for(int y = 0; y < height; ++y)
        {
            int m1 = method(x,y,min_re,max_re,min_im,max_im,max_iterations);
            if(m1)
            {
                color[x][y] = m1*50;
            }
        }
    }
}
int main(int argc, char *argv[])
{
    unsigned int max_iterations;
    int x,y;
    double threshold;
    double min_re;
    double max_re;
    double min_im;
    double max_im;
    unsigned int NUM_OF_THREADS;
    if(argc != 10)
    {
        printf("There is an error in the input given.\n");
        return 0;
    }
    else
    {
        height = atoi(argv[1]);
        width = atoi(argv[2]);
        max_iterations = atoi(argv[3]);
        min_re = atof(argv[4]);
        max_re = atof(argv[5]);
        min_im = atof(argv[6]);
        max_im = atof(argv[7]);
        threshold = atoi(argv[8]);
        NUM_OF_THREADS = atoi(argv[9]);
    }
    color = (unsigned int)malloc(height*sizeof(unsigned int));
    printf("height = %d\twidth = %d\tmaximum_iterations = %d\tminimum_x-value = %.2f\tmaximum_x-value = %.2f\tminimum_y-value = %.2f\tmaximum_y-value = %.2f\tthreshold_value = %.2f\tno. of threads = %d\t\n",height,width,max_iterations,min_re,max_re,min_im,max_im,threshold,NUM_OF_THREADS);
    for(x = 0; x < height; x++)
    {
        color[x] = (unsigned int*)malloc(width*sizeof(unsigned int));
    }
    time_t ts,te;
    time(&ts);
    method1(min_re, max_re, min_im, max_im, max_iterations);
    time(&te);
    double diff = difftime(te,ts);
    file_write();
    printf("Total Time elapsed: %f\n",diff);
    return 0;
}
Why are you casting the return value of malloc to an unsigned int? First off, don't cast the return value of malloc in C. It is pointless and can actually hide the fact that you forgot to include <stdlib.h>. C is not C++ in this regard: a void* can be implicitly converted to any pointer type in C. Secondly, malloc returns a pointer, and you have defined color as an unsigned int**, yet you attempt to assign an unsigned int as well as an unsigned int* to it. Obviously those are incompatible. Just drop the casts and use/declare the types properly.
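As a sketch of what "drop the casts" can look like in C: the helper names `alloc_grid` and `free_grid` are invented for this example, and writing `sizeof *grid` instead of spelling out the type is one common way to keep the element size in sync with the pointer type. Note that this only compiles as C; a C++ compiler would still require a cast (or, better, a different allocation style).

```c
#include <stdlib.h>

/* Hypothetical helper: allocates height rows of width zero-filled
 * unsigned ints. Returns NULL if any allocation fails. */
unsigned int **alloc_grid(size_t height, size_t width)
{
    /* Array of row pointers: element size is sizeof(unsigned int *). */
    unsigned int **grid = malloc(height * sizeof *grid);
    if (grid == NULL)
        return NULL;

    for (size_t r = 0; r < height; r++) {
        /* One row: element size is sizeof(unsigned int). */
        grid[r] = calloc(width, sizeof *grid[r]);
        if (grid[r] == NULL) {
            /* Undo the rows allocated so far before giving up. */
            while (r > 0)
                free(grid[--r]);
            free(grid);
            return NULL;
        }
    }
    return grid;
}

/* Releases a grid obtained from alloc_grid. */
void free_grid(unsigned int **grid, size_t height)
{
    if (grid == NULL)
        return;
    for (size_t r = 0; r < height; r++)
        free(grid[r]);
    free(grid);
}
```

With this, the allocation in the question would become something like `color = alloc_grid(height, width);` followed by a NULL check.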
color = (unsigned int**)malloc(height*sizeof(unsigned int*)); Shouldn't it be this?
You are trying to allocate an array of pointers dynamically, so each element must have the size of a pointer. What you need is the following:
color = (unsigned int**)malloc(height*sizeof(unsigned int*));
The rest of it is fine.
segmentation fault
I am trying to generate a Mandelbrot image with a sequential program in C++, but I get a segmentation fault at runtime. I have no idea what causes the segfault; the program compiles with no errors.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int file_write(unsigned int width, unsigned int height)
{
    unsigned int **color = NULL;
    FILE *fractal = fopen("mandelbrot_imageSequential.ppm","w+");
    if(fractal != NULL)
    {
        fprintf(fractal,"P6\n");
        fprintf(fractal,"# %s\n", "Mandelbrot_imageSequential.ppm");
        fprintf(fractal,"%d %d\n", height, width);
        fprintf(fractal,"40\n");
        int x = 0, y = 0;
        unsigned int R = 0, G = 0, B = 0;
        for(x = 0; x < width; ++x)
        {
            for(y = 0; y < height; ++y)
            {
                R = (color[y][x]*10);
                G = 255-((color[y][x]*10));
                B = ((color[y][x]*10)-150);
                if(R == 10) R = 11;
                if(G == 10) G = 11;
                if(B == 10) B = 11;
                putc(R, fractal);
                putc(G, fractal);
                putc(B, fractal);
            }
        }
        fclose(fractal);
    }
    return 0;
}
int method(int x, int y, int height, int width, double min_re, double max_re, double min_im, double max_im, int max_iterations)
{
    double threshold = 4;
    double x_factor = (max_re-min_re)/(width-1);
    double y_factor = (max_im-min_im)/(height-1);
    double c_im = max_im - y*y_factor;
    double c_re = min_re + x*x_factor;
    double Z_re = c_re, Z_im = c_im;
    unsigned int col = 0;
    for(unsigned n = 0; n < max_iterations; ++n)
    {
        double Z_re2 = Z_re*Z_re, Z_im2 = Z_im*Z_im;
        if(Z_re2 + Z_im2 > threshold)
        {
            col = n;
            break;
        }
        Z_im = 2 * Z_re * Z_im + c_im;
        Z_re = Z_re2 - Z_im2 + c_re;
    }
    return col;
}
int main(int argc, char *argv[])
{
    unsigned int width;
    unsigned int height;
    unsigned int max_iterations;
    unsigned int **color = NULL;
    int x,y;
    double threshold;
    double min_re;
    double max_re;
    double min_im;
    double max_im;
    unsigned int NUM_OF_THREADS;
    if(argc != 10)
    {
        printf("There is an error in the input given.\n");
        return 0;
    }
    else
    {
        height = atoi(argv[1]);
        width = atoi(argv[2]);
        max_iterations = atoi(argv[3]);
        min_re = atof(argv[4]);
        max_re = atof(argv[5]);
        min_im = atof(argv[6]);
        max_im = atof(argv[7]);
        threshold = atoi(argv[8]);
        NUM_OF_THREADS = atoi(argv[9]);
    }
    color = (unsigned int**)malloc(height*sizeof(unsigned int*));
    printf("height = %d\twidth = %d\tmaximum_iterations = %d\tminimum_x-value = %.2f\tmaximum_x-value = %.2f\tminimum_y-value = %.2f\tmaximum_y-value = %.2f\tthreshold_value = %.2f\tno. of threads = %d\t\n",height,width,max_iterations,min_re,max_re,min_im,max_im,threshold,NUM_OF_THREADS);
    for(x = 0; x < height; x++)
    {
        color[x] = (unsigned int*)malloc(width*sizeof(unsigned int));
    }
    time_t ts,te;
    time(&ts);
    method(x,y,height,width,min_re,max_re,min_im,max_im,max_iterations);
    time(&te);
    double diff = difftime(te,ts);
    file_write(width, height);
    printf("Total Time elapsed: %f\n",diff);
    return 0;
}
How can I correct this segmentation fault?
At least one problem is in the file_write function:
unsigned int **color = NULL;
R = (color[y][x]*10);
Here color is still NULL when it is dereferenced. I assume color should be an input parameter.
If you are on a Linux machine, do the following:
$ ulimit -c unlimited
Then run the code. Notice that a core.[pid] file is generated. Fire up gdb as follows:
$ gdb ./your_app core.[pid]
It will take you to the statement where the segfault occurred. Issue a "backtrace" command at the gdb prompt to see the call hierarchy. Remember to compile with the "-g" flag to get more verbose gdb output.
There are two major problems with your code:
1. You allocate memory for the color array but then use a different color inside file_write() which is initialized to NULL. You need to pass the first color as an argument to file_write():
int main(...)
{
    ...
    file_write(color, width, height);
    printf("Total Time elapsed: %f\n",diff);
    return 0;
}
And declare the other color as an argument to file_write():
int file_write(unsigned int **color, unsigned int width, unsigned int height)
{
    /* unsigned int **color = NULL; // Removed */
    ...
2. You're only calling method() once and not storing anything into color. You need to call it in a loop. Something similar to:
/* Untested */
for (y = 0; y < height; y++)
{
    for (x = 0; x < width; x++)
    {
        color[y][x] = method(x,y,height,width,min_re,max_re,min_im,max_im,max_iterations);
    }
}
Then, of course, you should check the return values of malloc(), fopen(), fprintf(), fclose(), ... , and check that the input variables have reasonable values and so on.
I also noticed that you're passing width and height in different order to file_write() and method(). To avoid future headaches, I would change the method() function to method(x, y, width, height) so that the horizontal and vertical arguments are passed in the same order.