I'm trying to get a generic heapsort working. It works perfectly with strings, but for some reason it fails with integer types, and I am completely lost as to why. I am sure that the comparison function is correct. The last if statement in the heapsort function works around a bug that would sometimes leave the elements at indexes 0 and 1 reversed.
#include "heapsort.h"
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <sys/types.h>
int heapsort(void *base, size_t nel, size_t width, int (*compar)(const void *, const void *))
{
int i;
int j;
printf("\n");
/*build heap*/
for (j = 1; j < nel; j++) {
i = j;
while (compar(base + i * width, base + (i-1)/2 * width) > 0) {
swap(base + i * width, base + (i-1)/2 * width);
i = (i-1)/2;
}
}
/*sort*/
for (i = 1; i < nel; i++) {
swap(base , base + (nel - i) * width);
j = 0;
while (((2 * j + 1) < nel - i) && ((2 * j + 2) < nel - i)){
if (compar(base + (j * 2 + 2) * width, base + (j * 2 + 1) * width) > 0
&& compar(base + j * width, base + (j * 2 + 2) * width) < 0)
{
swap(base + j * width, base + (2 * j + 2) * width);
j = 2 * j + 2;
}
else if (compar(base + (j * 2 + 2) * width, base + (j * 2 + 1) * width) <= 0
&& compar(base + j * width, base + (j * 2 + 1) * width) < 0)
{
swap(base + j * width, base + (2 * j + 1) * width);
j = 2 * j + 1;
}
else
break;
}
}
if (compar(base + 0 * width, base + 1 * width) > 0) {
swap(base + 0 * width, base + 1 * width);
}
return 0;
}
void swap(void *a, void *b)
{
void *temp;
printf("swapping %d and %d\n", *(int*)a, *(int*)b);
temp = *(void**)a;
*(void**)a = *(void**)b;
*(void**)b = temp;
}
int intcmp(const void * a, const void * b)
{
return *(int*)a - *(int*)b;
}
The swap function looks wrong. It is assuming the values referenced by a and b are pointers (which will work if the types are [const] char * strings) and is swapping sizeof(void *) bytes. It is likely that sizeof(int) != sizeof(void *) on your system.
You need to swap width bytes.
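As a minimal sketch of what swapping width bytes could look like: the helper name swap_bytes and its third parameter are assumptions for illustration, not part of the posted code, and the calls inside heapsort would need to pass width along.

```c
#include <stddef.h>

/* Hypothetical replacement for swap(): exchanges `width` bytes between
 * the two objects, one byte at a time, so it works for any element type. */
void swap_bytes(void *a, void *b, size_t width)
{
    unsigned char *pa = a;
    unsigned char *pb = b;

    while (width--) {
        unsigned char tmp = *pa;
        *pa++ = *pb;
        *pb++ = tmp;
    }
}
```

A call such as swap_bytes((char *)base + i * width, (char *)base + j * width, width) would then move whole elements. The cast to char * is there because arithmetic on void * is a compiler extension rather than standard C.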
In my open source project ( https://github.com/mmj-the-fighter/GraphicsLabFramework ) I am trying to add an image-smoothing box filter for an NxN kernel size. I have already implemented the algorithm for a 3x3 kernel. As you can see from the source code below, I do not process the edges of the image. Following this logic, for a 5x5 kernel I would have to skip two rows or columns at the top, right, bottom and left of the image, so the edges would not be blurred. Is there any other solution?
Here is the code:
/*applying a box filter of size 3x3*/
void blur_image(unsigned char *img, int width, int height)
{
int n = width * height;
int i, j;
int r, g, b;
int x, y;
float v = 1.0 / 9.0;
float kernel[3][3] =
{
{ v, v, v },
{ v, v, v },
{ v, v, v }
};
unsigned char* resimage = (unsigned char *)malloc(width * height * 4 * sizeof(unsigned char));
memcpy(resimage, img, width*height * 4);
for (x = 1; x < width - 1; ++x) {
for (y = 1; y < height - 1; ++y) {
float bs = 0.0;
float gs = 0.0;
float rs = 0.0;
for (i = -1; i <= 1; ++i) {
for (j = -1; j <= 1; ++j){
float weight = (float)kernel[i + 1][j + 1];
unsigned char* buffer = img + width * 4 * (y + j) + (x + i) * 4;
bs += weight * *buffer;
gs += weight * *(buffer + 1);
rs += weight * *(buffer + 2);
}
}
unsigned char* outbuffer = resimage + width * 4 * y + x * 4;
*outbuffer = bs;
*(outbuffer + 1) = gs;
*(outbuffer + 2) = rs;
*(outbuffer + 3) = 255;
}
}
memcpy(img, resimage, width*height * 4);
free(resimage);
}
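One commonly used alternative, sketched here as an assumption and not taken from the project, is to clamp out-of-range sample coordinates to the nearest edge pixel so that every pixel, including the border, gets a full NxN average. The function name box_blur_clamped and its n parameter are hypothetical, and the 4-bytes-per-pixel layout is carried over from the code above.

```c
#include <stdlib.h>
#include <string.h>

/* Clamp v into the range [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Box blur with an n x n kernel (n odd) on a 4-bytes-per-pixel image.
 * Samples that fall outside the image reuse the nearest edge pixel.
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
int box_blur_clamped(unsigned char *img, int width, int height, int n)
{
    if (img == NULL || width <= 0 || height <= 0 || n <= 0 || n % 2 == 0)
        return -1;

    size_t bytes = (size_t)width * height * 4;
    unsigned char *out = malloc(bytes);
    if (out == NULL)
        return -1;

    int half = n / 2;
    unsigned long count = (unsigned long)n * n;

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            unsigned long sum[3] = { 0, 0, 0 };
            for (int dy = -half; dy <= half; ++dy) {
                int sy = clamp_int(y + dy, 0, height - 1);
                for (int dx = -half; dx <= half; ++dx) {
                    int sx = clamp_int(x + dx, 0, width - 1);
                    const unsigned char *p = img + ((size_t)sy * width + sx) * 4;
                    sum[0] += p[0];
                    sum[1] += p[1];
                    sum[2] += p[2];
                }
            }
            unsigned char *q = out + ((size_t)y * width + x) * 4;
            q[0] = (unsigned char)(sum[0] / count);
            q[1] = (unsigned char)(sum[1] / count);
            q[2] = (unsigned char)(sum[2] / count);
            q[3] = 255; /* opaque alpha, as in the 3x3 version */
        }
    }

    memcpy(img, out, bytes);
    free(out);
    return 0;
}
```

Clamping is only one choice. Mirroring the image at the border or averaging only the samples that fall inside the image are other options, and each gives a slightly different look at the edges.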
I am working on a project in which I need to check the neighboring cells of a specific cell in a dynamically allocated 2D char array. Basically, if certain neighboring cells are 'X', for example, then the current cell becomes '-'. To allocate the 2D array, I used a single malloc call:
char *array = (char *)malloc(numRows * numCols * sizeof(char));
To access an element while using a double for loop, I use this:
for (int i = 0; i <= getNumRows(); i++)
{
for (int j = 0; j < getNumCols(); j++)
{
printf("%c ", **(array + i * getNumCols() + j));
}
printf("\n");
}
How would I access and view the neighboring cells of the current element?
The code posted to display the matrix has problems:
the outer loop should stop when i == getNumRows() and
the printf argument should use a single * dereferencing operator
Here is a modified version:
for (int i = 0; i < getNumRows(); i++) {
for (int j = 0; j < getNumCols(); j++) {
printf("%c ", *(array + i * getNumCols() + j));
}
printf("\n");
}
Which can also be rewritten to avoid recomputing the matrix sizes repeatedly:
for (int i = 0, rows = getNumRows(), cols = getNumCols(); i < rows; i++) {
for (int j = 0; j < cols; j++) {
printf("%c ", array[i * cols + j]);
}
printf("\n");
}
Accessing the neighbouring cells of cell r,c depends on how you deal with boundaries:
if boundaries should not be crossed, you must test if r and/or c are on a boundary to produce between 3 and 8 neighbours.
if boundaries wrap as a torus, you can just compute (r+/-1 + rows) % rows and (c+/-1 + cols) % cols to always produce 8 neighbours.
To simplify the first case, you can allocate the matrix with 2 extra columns and rows, with char *array = malloc(sizeof(char) * (numRows + 2) * (numCols + 2)); and use the inner space (active area) this way, noting that each row is now getNumCols() + 2 cells long:
for (int i = 1; i <= getNumRows(); i++) {
for (int j = 1; j <= getNumCols(); j++) {
printf("%c ", *(array + i * (getNumCols() + 2) + j));
}
printf("\n");
}
If you initialize the boundary rows and columns in the matrix as ' ', you can always access the 8 cells at r+/-1, c+/-1 and check for 'X' without special-casing the boundary rows and columns of the active part.
Accessing these neighbouring cells can be done according to the implementation choices:
int rows = getNumRows(), cols = getNumCols();
char *cellp = array + r * cols + c;
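// using extra rows and columns (cols must then be the padded row length,
// getNumCols() + 2, and r, c are 1-based indexes into the active area)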
char top_1 = cellp[-cols - 1];
char top_2 = cellp[-cols];
char top_3 = cellp[-cols + 1];
char mid_1 = cellp[-1];
char mid_2 = cellp[+1];
char bot_1 = cellp[+cols - 1];
char bot_2 = cellp[+cols];
char bot_3 = cellp[+cols + 1];
// using torus-like wrapping
char top_1 = array[(r + rows - 1) % rows * cols + (c + cols - 1) % cols];
char top_2 = array[(r + rows - 1) % rows * cols + c];
char top_3 = array[(r + rows - 1) % rows * cols + (c + 1) % cols];
char mid_1 = array[r * cols + (c + cols - 1) % cols];
char mid_2 = array[r * cols + (c + 1) % cols];
char bot_1 = array[(r + 1) % rows * cols + (c + cols - 1) % cols];
char bot_2 = array[(r + 1) % rows * cols + c];
char bot_3 = array[(r + 1) % rows * cols + (c + 1) % cols];
// using tests
char top_1 = (r == 0 || c == 0 ) ? 0 : cellp[-cols - 1];
char top_2 = (r == 0 ) ? 0 : cellp[-cols];
char top_3 = (r == 0 || c == cols - 1) ? 0 : cellp[-cols + 1];
char mid_1 = ( c == 0 ) ? 0 : cellp[-1];
char mid_2 = ( c == cols - 1) ? 0 : cellp[+1];
char bot_1 = (r == rows - 1 || c == 0 ) ? 0 : cellp[+cols - 1];
char bot_2 = (r == rows - 1 ) ? 0 : cellp[+cols];
char bot_3 = (r == rows - 1 || c == cols - 1) ? 0 : cellp[+cols + 1];
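The "using tests" variant can also be written as a loop over the row and column offsets. This is only a sketch: the name count_neighbours and its parameters are assumptions, and it counts matching neighbours rather than storing each one in its own variable.

```c
#include <stddef.h>

/* Count how many of the up-to-8 neighbours of cell (r, c) equal `target`
 * in a rows x cols matrix stored row by row in one flat buffer.
 * Neighbours outside the matrix are skipped (no wrapping). */
int count_neighbours(const char *array, int rows, int cols,
                     int r, int c, char target)
{
    int count = 0;

    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            int nr = r + dr;
            int nc = c + dc;

            if (dr == 0 && dc == 0)
                continue; /* the cell itself is not a neighbour */
            if (nr < 0 || nr >= rows || nc < 0 || nc >= cols)
                continue; /* outside the matrix */
            if (array[nr * cols + nc] == target)
                count++;
        }
    }
    return count;
}
```

With this helper, the rule from the question could be expressed as a comparison such as count_neighbours(array, rows, cols, r, c, 'X') >= k for whatever threshold k the project needs. Write the results into a second buffer if updated cells must not influence their neighbours within the same pass.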
I would use a pointer to the array. It makes array indexing much easier. The example below prints the neighbouring cells.
void print_n(void *arr, size_t nrows, size_t ncols, size_t col, size_t row)
{
int (*array)[nrows][ncols] = arr;
if(col) printf("Left: %d\n", (*array)[row][col - 1]);
if(col < ncols - 1) printf("Right: %d\n", (*array)[row][col + 1]);
if(row) printf("Top: %d\n", (*array)[row - 1][col]);
if(row < nrows - 1) printf("Bottom: %d\n", (*array)[row + 1][col]);
}
int main(void)
{
size_t ncols = 10, nrows = 20;
int (*array)[nrows][ncols] = malloc(sizeof(*array));
for(size_t row = 0; row < nrows; row++)
for(size_t col = 0; col < ncols; col++)
(*array)[row][col] = row * 100 + col;
print_n(array, nrows, ncols, 6, 7);
free(array);
}
https://godbolt.org/z/7Yoff5
I am writing a C program that lexicographically sorts strings entered by the user, but whenever I enter a string longer than 3 characters my code prints garbage values, and I cannot understand why.
My code:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void lexiiographic_sort(char** , int );
int main(void)
{
int n;
scanf_s("%d", &n);
char** arr;
arr = (char**)malloc(n * sizeof(char*));
for (int i = 0; i < n; i++)
{
if ((*(arr + i) = (char*)malloc(1024 * sizeof(char))) == NULL) { exit(1); }
scanf_s("%s", *(arr + i),sizeof(*(arr + i)));
if ((*(arr + i) = (char*)realloc(*(arr + i), strlen(*(arr + i)) + 1)) == NULL) { exit(1); }
printf("%s\n", *(arr + i));
}
lexiiographic_sort(arr, n);
for (int i = 0; i < n; i++)
{
printf("%s\n", *(arr + i));
}
}
void lexiiographic_sort(char** string, int size)
{
char* temp;
for (int i = 1; i < size; i++)
{
for (int j = 1; j < size; j++)
{
if ((int)(*(*(string + j - 1))) > (int)(*(*(string + j))))
{
if ((temp = (char*)malloc(strlen(*(string + i - 1)) * sizeof(char) + 1)) == NULL) { exit(1); }
temp = *(string + j - 1);
if((temp = ((char*)realloc(temp, strlen(*(string + j)) * sizeof(char) + 1))) == NULL){exit(1); }
*(string + j - 1) = *(string + j);
*(string + j) = temp;
}
}
}
}
You cannot get the size of a dynamically allocated buffer via sizeof. sizeof(*(arr + i)) is the size of a pointer, char*. It seems to be 4 in your environment, which explains the 3-character limit (one byte is reserved for the terminating null character).
You hard-coded 1024 as the buffer size, so pass that instead.
scanf_s("%s", *(arr + i),1024);
Defining a macro for the buffer size and using it in both places would improve your code further.
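A rough sketch of the macro idea follows. It uses standard sscanf with a field width instead of scanf_s, and the names WORD_MAX_LEN, WORD_BUF_SIZE, STR and parse_word are all assumptions made for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WORD_MAX_LEN  1023                /* longest word accepted */
#define WORD_BUF_SIZE (WORD_MAX_LEN + 1)  /* plus the terminating null */

/* Two-step stringification so that STR(WORD_MAX_LEN) becomes "1023". */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Reads the first whitespace-delimited word of `text` into a freshly
 * allocated buffer, then shrinks the buffer to fit the word.
 * Returns NULL if no word was found or allocation failed. */
char *parse_word(const char *text)
{
    char *buf = malloc(WORD_BUF_SIZE);
    if (buf == NULL)
        return NULL;

    /* The width comes from the macro, never from sizeof on a pointer. */
    if (sscanf(text, "%" STR(WORD_MAX_LEN) "s", buf) != 1) {
        free(buf);
        return NULL;
    }

    char *shrunk = realloc(buf, strlen(buf) + 1);
    return shrunk != NULL ? shrunk : buf;
}
```

Because both the allocation and the read limit derive from the same macro, changing the limit in one place keeps them consistent.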
I am working on a project that requires automatic vectorization of large loops. It is mandatory to use GCC to compile. A minimum case of the problem could be the following:
#define VLEN 4
#define NTHREADS 4
#define AVX512_ALIGNMENT 64
#define NUM_INTERNAL_ITERS 5
#define real double
typedef struct private_data {
/*
* Alloc enough space for private data and MEM_BLOCK_SIZE bytes of padding.
* Private data must be allocated all at once to squeeze cache performance by only
* padding once per CPU.
*/
real *contiguous_data;
/*
* Pointers to corresponding index in contiguous_data.
*/
real *array_1;
real *array_2;
} private_data_t;
private_data_t private_data[NTHREADS];
int num_iter;
void minimum_case(const int thread) {
// Reference to thread private data.
real *restrict array_1 =
__builtin_assume_aligned(private_data[thread].array_1, AVX512_ALIGNMENT);
real *restrict array_2 =
__builtin_assume_aligned(private_data[thread].array_2, AVX512_ALIGNMENT);
for (int i = 0; i < num_iter; i++) {
for (int k = 0; k < NUM_INTERNAL_ITERS; ++k) {
int array_1_entry =
(k * (NUM_INTERNAL_ITERS) * VLEN) +
i * NUM_INTERNAL_ITERS * NUM_INTERNAL_ITERS * VLEN;
int array_2_entry =
(k * (NUM_INTERNAL_ITERS) * VLEN) +
i * NUM_INTERNAL_ITERS * VLEN;
#pragma GCC unroll 1
#pragma GCC ivdep
for (int j = 0; j < VLEN; j++) {
real pivot;
int a_idx = array_1_entry + VLEN * 0 + j;
int b_idx = array_1_entry + VLEN * 1 + j;
int c_idx = array_1_entry + VLEN * 2 + j;
int d_idx = array_1_entry + VLEN * 3 + j;
int S_idx = array_2_entry + VLEN * 0 + j;
if (k == 0) {
pivot = array_1[a_idx];
// b = b / a
array_1[b_idx] /= pivot;
// c = c / a
array_1[c_idx] /= pivot;
// d = d / a
array_1[d_idx] /= pivot;
// S = S / a
array_2[S_idx] /= pivot;
}
int e_idx = array_1_entry + VLEN * 4 + j;
int f_idx = array_1_entry + VLEN * 5 + j;
int g_idx = array_1_entry + VLEN * 6 + j;
int k_idx = array_1_entry + VLEN * 7 + j;
int T_idx = array_2_entry + VLEN * 1 + j;
pivot = array_1[e_idx];
// f = f - (e * b)
array_1[f_idx] -= array_1[b_idx]
* pivot;
// g = g - (e * c)
array_1[g_idx] -= array_1[c_idx]
* pivot;
// k = k - (e * d)
array_1[k_idx] -= array_1[d_idx]
* pivot;
// T = T - (e * S)
array_2[T_idx] -= array_2[S_idx]
* pivot;
}
}
}
}
For this specific case, GCC is using 16B vectors instead of 32B ones for automatic vectorization. It is fairly easy to see that the control flow depends on a condition that can be checked outside the inner loop, but GCC is not performing any loop unswitching.
The loop unswitching can be done manually, but please note that this is a minimal case of the problem: the real loop has hundreds of lines, and manual loop unswitching would result in a lot of code redundancy. I am trying to find a way to force GCC to create different loops for different conditions that can be checked outside the inner loop.
Currently I am using GCC 9.2 with the following flags: -Ofast -march=native -std=c11 -fopenmp -ftree-vectorize -ffast-math -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fopt-info-vec-optimized
Especially if the real loop has hundreds of lines I strongly recommend factoring that out into a separate function -- which then would make manual unswitching (where necessary) not that bad.
The following should be equivalent to your code (notice, I also factored out some index calculations -- this could actually be simplified even more):
static inline void inner_loop(real *restrict array_1, real *restrict array_2,
int const first) {
#pragma GCC unroll 1
for (int j = 0; j < VLEN; j++) {
int a_idx = VLEN * 0 + j;
int b_idx = VLEN * 1 + j;
int c_idx = VLEN * 2 + j;
int d_idx = VLEN * 3 + j;
int S_idx = VLEN * 0 + j;
if (first) {
real pivot = array_1[a_idx];
array_1[b_idx] /= pivot; // b = b / a
array_1[c_idx] /= pivot; // c = c / a
array_1[d_idx] /= pivot; // d = d / a
array_2[S_idx] /= pivot; // S = S / a
}
int e_idx = VLEN * 4 + j;
int f_idx = VLEN * 5 + j;
int g_idx = VLEN * 6 + j;
int k_idx = VLEN * 7 + j;
int T_idx = VLEN * 1 + j;
real pivot = array_1[e_idx];
array_1[f_idx] -= array_1[b_idx] * pivot; // f = f - (e * b)
array_1[g_idx] -= array_1[c_idx] * pivot; // g = g - (e * c)
array_1[k_idx] -= array_1[d_idx] * pivot; // k = k - (e * d)
array_2[T_idx] -= array_2[S_idx] * pivot; // T = T - (e * S)
}
}
void minimum_case(const int thread) {
// Reference to thread private data.
real *restrict array_1 =
__builtin_assume_aligned(private_data[thread].array_1, AVX512_ALIGNMENT);
real *restrict array_2 =
__builtin_assume_aligned(private_data[thread].array_2, AVX512_ALIGNMENT);
for (int i = 0; i < num_iter; i++) {
real *array_1_i =
array_1 + i * NUM_INTERNAL_ITERS * NUM_INTERNAL_ITERS * VLEN;
real *array_2_i = array_2 + i * NUM_INTERNAL_ITERS * VLEN;
inner_loop(array_1_i, array_2_i, 1); // k == 0: divide by the pivot first
for (int k = 1; k < NUM_INTERNAL_ITERS; ++k) {
int array_1_entry = (k * (NUM_INTERNAL_ITERS)*VLEN);
int array_2_entry = (k * (NUM_INTERNAL_ITERS)*VLEN);
inner_loop(array_1_i + array_1_entry, array_2_i + array_2_entry, 0);
}
}
}
Full demo on godbolt: https://godbolt.org/z/wMgSnr
In the final array, only the first element becomes zero, and that happens only when execution returns to the top of the for loop (checked using gdb). I have marked the problem with comments at the bottom of the code. Please help me out; I have no clue what is going wrong.
#include<stdio.h>
#include<stdlib.h>
int main()
{
int a, b, c;
printf("enter the size of matrix");
scanf("%d%d",&a,&b);
printf("enter the number of rotations");
scanf("%d",&c);
int *arr = malloc (sizeof(int) * a * b);
int x = (a >= b)? a : b;
printf("enter the values of matrix");
// scanning the values
for(int i = 0; i < a; i++)
{
for(int j = 0; j < b; j++)
{
scanf("%d",(arr + i * b + j));
}
printf("\n");
}
// main code starts
for(int y = 0; y < c; y++)
{
// declared a new array
int *arr1 = malloc (sizeof(int) * a * b);
for(int k = 0; k < x / 2; k++)
{
for(int i = k; i < a - k; i++)
{
for(int j = k; j < b - k; j++)
{
if (i == k && j > k)
{
*(arr1 + i * b + j - 1) = *(arr + i * b + j);
}
else if (i == a - k - 1 && j < b - k - 1)
{
*(arr1 + i * b + j + 1) = *(arr + i * b + j);
}
else if (j == k && i < a - k - 1)
{
*(arr1 + i * b + j + b) = *(arr + i * b + j);
}
else if (j == b - k - 1 && i > k)
{
*(arr1 + i * b + j - b) = *(arr + i * b + j);
}
}
}
if (x % 2 != 0 && a == b)
*(arr1 + x / 2 * b + (b / 2)) = *(arr + x / 2 * b + (b / 2));
}
// changing the old array to new array
arr = arr1;
// first value is getting printed correctly here
printf("%d\n",*(arr));
printf("%p\n",&(*arr));
free(arr1);
}
// printing the output
for(int i = 0; i < a; i++)
{
for(int j = 0; j < b; j++)
{
printf("%d ",*(arr + i * b + j));
}
printf("\n");
}
// first value is getting printed incorrectly here, outside the loop
printf("\n%d\n",*(arr));
printf("%p",&(*arr));
}
C doesn't support array assignment. You have:
int *arr = malloc (sizeof(int) * a * b);
…
int *arr1 = malloc (sizeof(int) * a * b);
…
arr = arr1;
…
free(arr1);
The assignment means you've lost your original array (memory leak) and you then invalidate your new array with the free().
Array copying requires more code — usually a function call such as memmove() or memcpy(), possibly wrapped in a function.
For example, add #include <string.h> and use this in place of the arr = arr1; assignment:
memmove(arr, arr1, sizeof(int) * a * b);
free(arr1); // No longer needed
Alternatively:
free(arr);
arr = arr1;
This code runs cleanly under valgrind on Mac OS X 10.11.5 with GCC 6.1.0 with the 'Either' or the 'Or' options for handling the array assignments.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void dump_matrix(const char *tag, int *arr, int a, int b)
{
printf("Matrix: %s\n", tag);
for (int i = 0; i < a; i++)
{
for (int j = 0; j < b; j++)
printf(" %3d", arr[i * b + j]);
putchar('\n');
}
}
int main(void)
{
int a, b, c;
printf("enter the size of matrix: ");
scanf("%d%d", &a, &b);
printf("enter the number of rotations: ");
scanf("%d", &c);
int *arr = malloc(sizeof(int) * a * b);
int x = (a >= b) ? a : b;
printf("enter the values of matrix: ");
// scanning the values
for (int i = 0; i < a; i++)
{
for (int j = 0; j < b; j++)
{
if (scanf("%d", (arr + i * b + j)) != 1)
{
fprintf(stderr, "failed to read value arr[%d][%d]\n", i, j);
return EXIT_FAILURE;
}
}
printf("\n");
}
dump_matrix("Initial input", arr, a, b);
// main code starts
for (int y = 0; y < c; y++)
{
// declared a new array
int *arr1 = malloc(sizeof(int) * a * b);
for (int k = 0; k < x / 2; k++)
{
for (int i = k; i < a - k; i++)
{
for (int j = k; j < b - k; j++)
{
if (i == k && j > k)
{
*(arr1 + i * b + j - 1) = *(arr + i * b + j);
}
else if (i == a - k - 1 && j < b - k - 1)
{
*(arr1 + i * b + j + 1) = *(arr + i * b + j);
}
else if (j == k && i < a - k - 1)
{
*(arr1 + i * b + j + b) = *(arr + i * b + j);
}
else if (j == b - k - 1 && i > k)
{
*(arr1 + i * b + j - b) = *(arr + i * b + j);
}
}
}
if (x % 2 != 0 && a == b)
*(arr1 + x / 2 * b + (b / 2)) = *(arr + x / 2 * b + (b / 2));
}
// Changing the old array to new array
// Either:
// memmove(arr, arr1, sizeof(int) * a * b);
// free(arr1);
// Or:
free(arr);
arr = arr1;
dump_matrix("After rotation", arr, a, b);
}
dump_matrix("Finished", arr, a, b);
free(arr);
return 0;
}
Note the use of the dump_matrix() function. Writing such a function once means it can be used multiple places in the code. The tag argument simplifies the use. The 'commercial grade' variant takes a FILE *fp argument too and writes to the specified file stream.
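A sketch of what that FILE * variant might look like is below. The name dump_matrix_to is an assumption, and the layout simply mirrors the dump_matrix() function above.

```c
#include <stdio.h>

/* Hypothetical variant of dump_matrix() that writes to any stream,
 * for example stdout, stderr, or a log file opened by the caller. */
void dump_matrix_to(FILE *fp, const char *tag, const int *arr, int a, int b)
{
    fprintf(fp, "Matrix: %s\n", tag);
    for (int i = 0; i < a; i++)
    {
        for (int j = 0; j < b; j++)
            fprintf(fp, " %3d", arr[i * b + j]);
        putc('\n', fp);
    }
}
```

The existing calls would then become dump_matrix_to(stdout, "Initial input", arr, a, b); and so on, while debugging output could go to stderr without touching the function.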
Note the error checking on the main input loop scanf(). I should also have checked the two other scanf() statements. Errors are reported on standard error, of course.
Example run:
$ ./mat31
enter the size of matrix: 3 4
enter the number of rotations: 2
enter the values of matrix: 1 2 3 4 10 11 12 13 99 98 97 96
Matrix: Initial input
1 2 3 4
10 11 12 13
99 98 97 96
Matrix: After rotation
2 3 4 13
1 12 11 96
10 99 98 97
Matrix: After rotation
3 4 13 96
2 11 12 97
1 10 99 98
Matrix: Finished
3 4 13 96
2 11 12 97
1 10 99 98
$
Whether the output is what you intended is a wholly separate discussion. This is simply not abusing the memory.