General Information
NOTE: I am fairly new to both C and OpenACC.
Hi, I am trying to develop an image blurring program, but first I wanted to see whether I could parallelize the for loops and copyin/copyout my values.
The problem I am currently facing occurs when I try to copyin and copyout my data and output variables. The error looks like a buffer overflow (I googled it and that is what people said), but I am not sure how to go about fixing it. I think I am doing something wrong with the pointers, but I am not sure.
Thanks so much in advance. If you think I missed some information, please let me know and I can provide it.
Question
What is the error, actually?
How should I go about fixing the issue?
Is there anything I should look into so that I can fix this kind of issue myself in the future?
Error
FATAL ERROR: variable in data clause is partially present on the device: name=output
file:/nfs/u50/singhn8/4F03/A3/main.c ProcessImageACC line:48
output lives at 0x7ffca75f6288 size 16 not present
Present table dump for device[1]: NVIDIA Tesla GPU 1, compute capability 3.5
host:0x7fe98eaf9010 device:0xb05dc0000 size:2073600 presentcount:1 line:47 name:(null)
host:0x7fe98f0e8010 device:0xb05bc0000 size:2073600 presentcount:1 line:47 name:(null)
host:0x7ffca75f6158 device:0xb05ac0400 size:4 presentcount:1 line:47 name:filterRad
host:0x7ffca75f615c device:0xb05ac0000 size:4 presentcount:1 line:47 name:row
host:0x7ffca75f6208 device:0xb05ac0200 size:4 presentcount:1 line:47 name:col
host:0x7ffca75f6280 device:0xb05ac0600 size:16 presentcount:1 line:48 name:data
Program Definition
#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>
#include <openacc.h>
// ================================================
// ppmFile.h
// ================================================
#include <sys/types.h>
typedef struct Image
{
int width;
int height;
unsigned char *data;
} Image;
Image* ImageCreate(int width,
int height);
Image* ImageRead(char *filename);
void ImageWrite(Image *image,
char *filename);
int ImageWidth(Image *image);
int ImageHeight(Image *image);
void ImageClear(Image *image,
unsigned char red,
unsigned char green,
unsigned char blue);
void ImageSetPixel(Image *image,
int x,
int y,
int chan,
unsigned char val);
unsigned char ImageGetPixel(Image *image,
int x,
int y,
int chan);
Blur Filter Function
// ================================================
// The Blur Filter
// ================================================
void ProcessImageACC(Image **data, int filterRad, Image **output) {
int row = (*data)->height;
int col = (*data)->width;
#pragma acc data copyin(row, col, filterRad, (*data)->data[0:row * col]) copyout((*output)->data[0:row * col])
#pragma acc kernels
{
#pragma acc loop independent
for (int j = 0; j < row; j++) {
#pragma acc loop independent
for (int i = 0; i < col; i++) {
(*output)->data[j * row + i] = (*data)->data[j * row + i];
}
}
}
}
Main Function
// ================================================
// Main Program
// ================================================
int main(int argc, char *argv[]) {
// vars used for processing:
Image *data, *result;
int dataSize;
int filterRadius = atoi(argv[1]);
// ===read the data===
data = ImageRead(argv[2]);
// ===send data to nodes===
// send data size in bytes
dataSize = sizeof(unsigned char) * data->width * data->height * 3;
// ===process the image===
// allocate space to store result
result = (Image *)malloc(sizeof(Image));
result->data = (unsigned char *)malloc(dataSize);
result->width = data->width;
result->height = data->height;
// initialize all to 0
for (int i = 0; i < (result->width * result->height * 3); i++) {
result->data[i] = 0;
}
// apply the filter
ProcessImageACC(&data, filterRadius, &result);
// ===save the data back===
ImageWrite(result, argv[3]);
return 0;
}
The problem here is that, in addition to the data arrays, the output and data pointers need to be copied over as well. From the compiler feedback messages, you can see the compiler implicitly copying them over.
% pgcc -c image.c -ta=tesla:cc70 -Minfo=accel
ProcessImageACC:
46, Generating copyout(output->->data[:col*row])
Generating copyin(data->->data[:col*row],col,filterRad,row)
47, Generating implicit copyout(output[:1])
Generating implicit copyin(data[:1])
50, Loop is parallelizable
52, Loop is parallelizable
Accelerator kernel generated
Generating Tesla code
50, #pragma acc loop gang, vector(4) /* blockIdx.y threadIdx.y */
52, #pragma acc loop gang, vector(32) /* blockIdx.x threadIdx.x */
Now you might be able to get this to work by using unstructured data regions to create both the data and pointers, and then "attach" the pointers to the arrays (i.e. fill in the value of the device pointers to the address of the device data array).
Though an easier option is to create temp arrays to point to the data, and then copy the data to the device. This will also increase the performance of your code (both on the GPU and CPU) since it eliminates the extra levels of indirection.
void ProcessImageACC(Image **data, int filterRad, Image **output) {
int row = (*data)->height;
int col = (*data)->width;
unsigned char * ddata, * odata;
odata = (*output)->data;
ddata = (*data)->data;
#pragma acc data copyin(ddata[0:row * col]) copyout(odata[0:row * col])
#pragma acc kernels
{
#pragma acc loop independent
for (int j = 0; j < row; j++) {
#pragma acc loop independent
for (int i = 0; i < col; i++) {
odata[j * col + i] = ddata[j * col + i]; /* row-major: the stride is the row width (col) */
}
}
}
}
Note that scalars are firstprivate by default so there's no need to add the row, col, and filterRad variables in the data clause.
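Two details are worth checking once the copy runs. A row-major index uses the row width as its stride (j * col + i), and main allocates width * height * 3 bytes, so a shape of [0:row * col] moves only a third of that buffer. Below is a minimal sketch of a full copy through flat pointers, assuming the three channels are stored interleaved in data. The names copy_pixels, channels and stride are hypothetical, and the directives are guarded so the same file also builds as plain C without an OpenACC compiler.

```c
/* Hypothetical helper: copies a rows x cols image with `channels` bytes per
 * pixel through flat pointers. The directives are only seen by an OpenACC
 * compiler; otherwise this is an ordinary host loop. */
void copy_pixels(const unsigned char *src, unsigned char *dst,
                 int rows, int cols, int channels)
{
    int stride = cols * channels; /* bytes per image row */

#ifdef _OPENACC
#pragma acc parallel loop copyin(src[0:rows * stride]) copyout(dst[0:rows * stride])
#endif
    for (int j = 0; j < rows; j++) {
#ifdef _OPENACC
#pragma acc loop
#endif
        for (int i = 0; i < stride; i++) {
            /* row-major index: row number times bytes per row, plus offset */
            dst[j * stride + i] = src[j * stride + i];
        }
    }
}
```

With the question's types, a call might look like copy_pixels((*data)->data, (*output)->data, row, col, 3).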
Related
Hope everyone is doing well.
I am trying to write a simple Matrix Library in C by creating a Matrix struct, and then using its memory address to execute operations.
Here is my header file for the library:
/*
To compile:
g++ -c simpMat.cpp
ar rvs simpMat.a simpMat.o
g++ test_simpMat.c simpMat.a
*/
#ifndef SIMPMAT_H
#define SIMPMAT_H
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
typedef struct{
uint8_t nRows;
uint8_t nCols;
uint8_t nElements;
float **elements;
}simpMat;
/**
*#brief simpMat_Init
*#param simpMat
*#param uint8_t
*#param uint8_t
*#param uint8_t
*#param float*
*#retval NONE
*/
void simpMat_Init(simpMat *Matrix, uint8_t nRows, uint8_t nColumns, uint8_t nElements, float elements[]);
/**
*#brief simpMat_Print
*#param simpMat
*#retval NONE
*/
void simpMat_Print(simpMat *Matrix);
/**
*#brief simpMat_Delete
*#param simpMat
*#retval NONE
*/
void simpMat_Delete(simpMat *Matrix);
#endif
Here is the source file:
#include "simpMat.h"
void simpMat_Init(simpMat *Matrix, uint8_t nRows, uint8_t nColumns, uint8_t nElements, float elements[])
{
Matrix->nRows = nRows;
Matrix->nCols = nColumns;
Matrix->nElements = nElements;
Matrix->elements = (float**)malloc(nRows * sizeof(float*));
for (uint8_t i = 0; i < nRows; i++)
{
Matrix->elements[i] = (float*)malloc(nColumns * sizeof(float));
}
uint8_t count = 0;
for (uint8_t i = 0; i < nRows; i++)
{
for (uint8_t j = 0; j < nColumns; j++)
{
Matrix->elements[i][j] = elements[count];
count++;
}
}
}
void simpMat_Print(simpMat *Matrix)
{
for (uint8_t i = 0; i < Matrix->nRows; i++)
{
for (uint8_t j = 0; j < Matrix->nCols; j++)
{
printf("%d ", Matrix->elements[i][j]);
}
printf("\n");
}
}
void simpMat_Delete(simpMat *Matrix)
{
uint8_t n = Matrix->nRows;
while(n) free(Matrix->elements[--n]);
free(Matrix->elements);
}
I also wrote a small test program to see if I can successfully assign elements to the matrix; such as:
#include "simpMat.h"
#include "stdio.h"
int main()
{
simpMat Matrix1;
float toAppend[9] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
simpMat_Init(&Matrix1, 3, 2, 9, toAppend);
printf("MATRIX ELEMENTS ARE:\n");
simpMat_Print(&Matrix1);
simpMat_Delete(&Matrix1);
return 0;
}
I compiled my library and the main program with the following commands on CMD:
g++ -c simpMat.cpp
ar rvs simpMat.a simpMat.o
g++ test_simpMat.c simpMat.a
However, when I run the executable, I get the following output:
MATRIX ELEMENTS ARE:
0 0
0 0
0 0
I could not understand the reason I cannot assign values. I am fairly new to the Dynamic Memory Allocation subject and I suspect that I had a misconception about the methodology. Can you help me with that?
If you use a debugger and step through your program looking at the memory, you should see the data is actually there. Your question assumes the problem is assignment, whereas it's actually in your output. This kind of thing is most easily discoverable with a debugger.
The actual problem is your matrix elements are float. But you are using %d specifier in your printf, which is for int values. Change this to %f.
Separately, you should reconsider the purpose of the nElements parameter. You are not doing any sanity tests before copying the array (for example, ensuring rows * cols does not exceed that value). It doesn't appear to have any relation to the actual matrix and should not be stored.
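Both points can be sketched together: print with %f, and check the length of the source array before copying from it. This is only an illustration under assumptions. simpMat_InitChecked, nSource and the int return value are hypothetical and not part of the original library, and the struct is trimmed to the fields the sketch needs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

typedef struct {
    uint8_t nRows;
    uint8_t nCols;
    float **elements;
} simpMat;

/* Hypothetical checked variant of simpMat_Init: returns 0 on success and -1
 * if the source array is too short or an allocation fails. */
int simpMat_InitChecked(simpMat *m, uint8_t nRows, uint8_t nCols,
                        size_t nSource, const float source[])
{
    if ((size_t)nRows * nCols > nSource) {
        return -1; /* copying would read past the end of source */
    }
    m->nRows = nRows;
    m->nCols = nCols;
    m->elements = malloc(nRows * sizeof *m->elements);
    if (m->elements == NULL) {
        return -1;
    }
    for (size_t i = 0; i < nRows; i++) {
        m->elements[i] = malloc(nCols * sizeof **m->elements);
        if (m->elements[i] == NULL) {
            while (i > 0) free(m->elements[--i]); /* undo partial work */
            free(m->elements);
            return -1;
        }
        for (size_t j = 0; j < nCols; j++) {
            m->elements[i][j] = source[i * nCols + j];
        }
    }
    return 0;
}

/* Prints with %f, the conversion that matches a float argument
 * (which is promoted to double when passed to printf). */
void simpMat_Print(const simpMat *m)
{
    for (size_t i = 0; i < m->nRows; i++) {
        for (size_t j = 0; j < m->nCols; j++) {
            printf("%f ", m->elements[i][j]);
        }
        printf("\n");
    }
}
```

With this variant, the test program's call with 3 rows, 2 columns and a 6-element source succeeds, while asking for 3 x 3 from the same 6 values is rejected instead of reading out of bounds.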
I am trying to wrap my head around combining openacc with pointers to structs containing dynamically allocated members. The code below fails with
Failing in Thread:1
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
when compiled using nvc ("nvc 20.9-0 LLVM 64-bit target on x86-64 Linux -tp haswell"). As far as I can tell, I am following the approach suggested e.g. in the OpenACC 'getting started' guide, but presumably the pointers somehow don't stick (?) on the device. Does anyone know what goes wrong here?
#include <stdlib.h>
#include <stdio.h>
typedef struct grid
{
int N;
double *X;
} grid;
void allocate(grid* g, int N)
{
g->N = N;
g->X = (double*) malloc(sizeof(double) * g->N);
#pragma acc enter data create(g[0:1])
#pragma acc enter data create(g->X[0:N])
}
void release(grid* g)
{
#pragma acc exit data delete(g->X[0:g->N])
#pragma acc exit data delete(g[0:1])
free(g->X);
}
void fill(grid * g)
{
int i;
#pragma acc parallel loop
for (i = 0; i < g->N; i++)
{
g->X[i] = 42; // the culprit, commenting this removes the error too
}
}
int main()
{
grid g;
allocate(&g, 10);
fill(&g);
release(&g);
return 0;
}
From the compiler feedback messages you'll see something like:
fill:
32, Accelerator restriction: size of the GPU copy of g is unknown
Generating Tesla code
32, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
32, Generating implicit copyin(g) [if not already present]
37, Generating update self(g->X[:g->N])
The problem is that the compiler can't implicitly copy aggregate types with dynamic data members, so you need to add "present(g)" to indicate that g is already on the device.
Also, you'll want to copyin g (rather than create it) so that the value of N is available on the device, and there is no need to include the array shape in the exit data delete directive. For example:
% cat test.c
#include <stdlib.h>
#include <stdio.h>
typedef struct grid
{
int N;
double *X;
} grid;
void allocate(grid* g, int N)
{
g->N = N;
g->X = (double*) malloc(sizeof(double) * g->N);
#pragma acc enter data copyin(g[0:1])
#pragma acc enter data create(g->X[0:N])
}
void release(grid* g)
{
#pragma acc exit data delete(g->X)
#pragma acc exit data delete(g)
free(g->X);
}
void fill(grid * g)
{
int i;
#pragma acc parallel loop present(g)
for (i = 0; i < g->N; i++)
{
g->X[i] = 42; // no longer fails: g is present on the device
}
#pragma acc update self(g->X[:g->N])
for (i = 0; i < 4; i++)
{
printf("%d : %f \n",i,g->X[i]);
}
}
int main()
{
grid g;
allocate(&g, 10);
fill(&g);
release(&g);
return 0;
}
% nvc -acc test.c -Minfo=accel -V20.9 ; a.out
allocate:
17, Generating enter data copyin(g[:1])
Generating enter data create(g->X[:N])
release:
24, Generating exit data delete(g[:1],g->X[:1])
fill:
32, Generating present(g[:1])
Generating Tesla code
32, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
37, Generating update self(g->X[:g->N])
0 : 42.000000
1 : 42.000000
2 : 42.000000
3 : 42.000000
I'm having a bit of trouble understanding how to send a 2D array to CUDA. I have a program that parses a large file with 30 data points on each line. I read about 10 rows at a time and then build a matrix of rows and items (so in my example of 10 rows with 30 data points, it would be int list[10][30];). My goal is to send this array to my kernel and have each block process a row (I have gotten this to work perfectly in normal C, but CUDA has been a bit more challenging).
Here's what I'm doing so far, but no luck (note: sizeOfBuckets = rows, and sizeOfBucketsHoldings = items in a row... I know I should win an award for odd variable names):
int list[sizeOfBuckets][sizeOfBucketsHoldings]; //this is created at the start of the file and I can confirmed its filled with the correct data
#define sizeOfBuckets 10 //size of buckets before sending to process list
#define sizeOfBucketsHoldings 30
//Cuda part
//define device variables
int *dev_current_list[sizeOfBuckets][sizeOfBucketsHoldings];
//time to malloc the 2D array on device
size_t pitch;
cudaMallocPitch((int**)&dev_current_list, (size_t *)&pitch, sizeOfBucketsHoldings * sizeof(int), sizeOfBuckets);
//copy data from host to device
cudaMemcpy2D( dev_current_list, pitch, list, sizeOfBuckets * sizeof(int), sizeOfBuckets * sizeof(int), sizeOfBucketsHoldings * sizeof(int),cudaMemcpyHostToDevice );
process_list<<<count,1>>> (sizeOfBuckets, sizeOfBucketsHoldings, dev_current_list, pitch);
//free memory of device
cudaFree( dev_current_list );
__global__ void process_list(int sizeOfBuckets, int sizeOfBucketsHoldings, int *current_list, int pitch) {
int tid = blockIdx.x;
for (int r = 0; r < sizeOfBuckets; ++r) {
int* row = (int*)((char*)current_list + r * pitch);
for (int c = 0; c < sizeOfBucketsHoldings; ++c) {
int element = row[c];
}
}
}
The error I'm getting is:
main.cu(266): error: argument of type "int *(*)[30]" is incompatible with parameter of type "int *"
1 error detected in the compilation of "/tmp/tmpxft_00003f32_00000000-4_main.cpp1.ii".
Line 266 is the kernel call process_list<<<count,1>>> (count, countListItem, dev_current_list, pitch); I think the problem is that I am trying to declare my array in the function as int *, but how else can I declare it? In my pure C code I use int current_list[num_of_rows][num_items_in_row], which works, but I can't get the same outcome in CUDA.
My end goal is simple: I just want each block to process one row (sizeOfBuckets) and loop through all items in that row (sizeOfBucketsHoldings). I originally just did a normal cudaMalloc and cudaMemcpy, but it wasn't working, so I looked around and found out about cudaMallocPitch and cudaMemcpy2D (neither of which was in my CUDA by Example book). I have been trying to study examples, but they seem to give me the same error (I'm currently reading the CUDA C Programming Guide and found this idea on page 22, but still no luck). Any ideas, or suggestions of where to look?
Edit:
To test this, I just want to add the values of each row together (I copied the logic from the CUDA by Example array addition example).
My kernel:
__global__ void process_list(int sizeOfBuckets, int sizeOfBucketsHoldings, int *current_list, size_t pitch, int *total) {
//TODO: we need to flip the list as well
int tid = blockIdx.x;
for (int c = 0; c < sizeOfBucketsHoldings; ++c) {
total[tid] = total + current_list[tid][c];
}
}
Here's how I declare the total array in my main:
int *dev_total;
cudaMalloc( (void**)&dev_total, sizeOfBuckets * sizeof(int) );
You have some mistakes in your code.
When you copy a host array to the device, you should pass a one-dimensional host pointer. See the function signature of cudaMemcpy2D.
You don't need to declare a static 2D array for device memory. That creates a static array in host memory, which you then try to re-create as a device array. Keep in mind that the device pointer must be one-dimensional, too. See the function signature of cudaMallocPitch.
This example should help you with memory allocation:
__global__ void process_list(int sizeOfBucketsHoldings, int* total, int* current_list, int pitch)
{
int tid = blockIdx.x;
total[tid] = 0;
for (int c = 0; c < sizeOfBucketsHoldings; ++c)
{
total[tid] += *((int*)((char*)current_list + tid * pitch) + c);
}
}
int main()
{
size_t sizeOfBuckets = 10;
size_t sizeOfBucketsHoldings = 30;
size_t width = sizeOfBucketsHoldings * sizeof(int); // needs to be in bytes
size_t height = sizeOfBuckets;
int* list = new int [sizeOfBuckets * sizeOfBucketsHoldings];// one dimensional
for (int i = 0; i < sizeOfBuckets; i++)
for (int j = 0; j < sizeOfBucketsHoldings; j++)
list[i *sizeOfBucketsHoldings + j] = i;
size_t pitch_h = sizeOfBucketsHoldings * sizeof(int);// always in bytes
int* dev_current_list;
size_t pitch_d;
cudaMallocPitch((int**)&dev_current_list, &pitch_d, width, height);
int *test;
cudaMalloc((void**)&test, sizeOfBuckets * sizeof(int));
int* h_test = new int[sizeOfBuckets];
cudaMemcpy2D(dev_current_list, pitch_d, list, pitch_h, width, height, cudaMemcpyHostToDevice);
process_list<<<10, 1>>>(sizeOfBucketsHoldings, test, dev_current_list, pitch_d);
cudaDeviceSynchronize();
cudaMemcpy(h_test, test, sizeOfBuckets * sizeof(int), cudaMemcpyDeviceToHost);
for (int i = 0; i < sizeOfBuckets; i++)
printf("%d %d\n", i , h_test[i]);
return 0;
}
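To access your 2D array in the kernel, use the pattern base_addr + y * pitch_d + x, where y * pitch_d is a byte offset.
WARNING: the pitch is always in bytes. You need to cast your pointer to a byte pointer (char*) before adding y * pitch_d, then cast back to the element type before adding x, as the kernel above does.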
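The byte arithmetic can be tried on the host without a GPU. The sketch below is plain C and only emulates a pitched layout: any buffer whose rows start pitch bytes apart stands in for the memory that cudaMallocPitch would return. The names pitched_row and sum_row are hypothetical.

```c
#include <stddef.h>

/* Returns the start of row y in a pitched allocation. `pitch` is in bytes,
 * so the offset is applied to a char pointer before casting back to int*. */
static int *pitched_row(void *base, size_t pitch, size_t y)
{
    return (int *)((char *)base + y * pitch);
}

/* Sums one row of `width` ints, mirroring what each block does in the
 * kernel above. */
int sum_row(void *base, size_t pitch, size_t y, size_t width)
{
    const int *row = pitched_row(base, pitch, y);
    int total = 0;
    for (size_t x = 0; x < width; x++) {
        total += row[x];
    }
    return total;
}
```

For example, with 30 ints per row (120 bytes) stored at a pitch of 128 bytes, the last 8 bytes of each row are padding that the loop never touches, because x only runs up to width.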
I'm trying to make a struct that generates a random matrix and am getting "error: expected '=', ',', ';', 'asm' or '__attribute__' before 'matrix'" when compiling. How can I get this to work effectively and efficiently?
I guess expected errors usually are caused by typos but I don't see any.
I'm very new to C so pointers and malloc are quite foreign to me. I really appreciate your help.
/* It's called RandomMatrixMaker.c */
#include <stdio.h>
#include <stdlib.h>
typdef struct {
char* name;
int MID;
int MRows;
int MCols;
long[][]* MSpace;
} matrix;
matrix makeRIDMatrix(char* name, int MID, int MRows, int MCols) {
matrix m;
static int i, j, r;
m.name = name;
m.MID = MID;
m.MRows = MRows;
m.MCols = MCols;
for (i=0; i<m.MRows; i++) {
for (j=0; i<m.MCols; j++) {
r = random(101);
*(m.MSpace[i][j]) = r;
}
}
return m;
}
int main(void) {
makeRIDMatrix("test", 1, 10, 10);
return 0;
}
There is indeed a typo. You misspelled typedef:
typdef struct {
should be:
typedef struct {
EDIT:
Also, there's no reason to use static here:
static int i, j, r;
You can just get rid of the static modifier.
int i, j, r;
As another poster mentioned, there's a typo, but even with that corrected, it wouldn't compile, due to the definition of matrix.MSpace.
Let's begin in makeRIDMatrix(). You've declared an automatic (stack) variable of type "matrix". At the end of the function, you return that object. Whilst this is permissible, it's not advisable. If the struct is large, you will be copying a lot of data unnecessarily. Better to pass a pointer to a matrix into makeRIDMatrix(), and have makeRIDMatrix() fill in the contents.
The test in the inner loop is against i, but should be against j.
Next, let's look at the definition of "matrix". The definition of "MSpace" is a mess, and wouldn't even compile. Even if it did, because you haven't defined the length of a row, the compiler would not be able to calculate the offset to any given item in the array. You want a two-dimensional array member without giving the row length, but you can't declare that in a C struct. You can in other languages, but not C.
There's a lot more I could point out, but I'd be missing the real point. The real point is this:
C Is Not Java.
(It's also not one of the interpreted languages such as JavaScript, PHP, Python, Ruby and so on.)
You don't get dynamically-expanding arrays; you don't get automatic allocation of memory; you don't get garbage collection of unreferenced memory.
What you need is something more like this:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
typedef struct {
char* name;
int MID;
unsigned int MRows;
unsigned int MCols;
long *MSpace;
} matrix;
void makeRIDMatrix(matrix *pmx, char* name, int MID,
unsigned int MRows, unsigned int MCols) {
unsigned int i, j; /* unsigned to match MRows/MCols in the loop tests */
long *MSpace = malloc(sizeof(*MSpace)*MRows*MCols);
if (MSpace == NULL) {
return;
}
pmx->name = name;
pmx->MID = MID;
pmx->MRows = MRows;
pmx->MCols = MCols;
pmx->MSpace = MSpace;
srandom((unsigned int)time(NULL));
for (i=0; i<MRows; i++) {
for (j=0; j<MCols; j++) {
long int r = random() % 101L;
*(MSpace++) = r;
}
}
}
static inline long * item_addr(const matrix *pmx, /* static: plain inline may fail to link in C99 */
unsigned int row, unsigned int col) {
if (pmx == NULL || pmx->MSpace == NULL
|| row >= pmx->MRows || col >= pmx->MCols) {
return NULL;
}
return &(pmx->MSpace[row * pmx->MCols + col]);
}
long get_item(const matrix *pmx, unsigned int row, unsigned int col) {
long *addr = item_addr(pmx, row, col);
return addr == NULL ? 0L : *addr;
}
void set_item(matrix *pmx,
unsigned int row, unsigned int col,
long val) {
long *addr = item_addr(pmx, row, col);
if (addr != NULL) {
*addr = val;
}
}
int main(void) {
matrix m;
makeRIDMatrix(&m, "test", 1, 10, 10);
return 0;
}
Note a few things here. Firstly, for efficiency, I fill the array as if it were one-dimensional. All subsequent get/set of array items should be done through the getter/setter functions, for safety.
Secondly, a hidden nasty: makeRIDMatrix() has used malloc() to allocate the memory - but it's going to be the job of the calling function (or its successors) explicitly to free() the allocated pointer when it's finished with it.
Thirdly, I've changed the rows/cols variables to unsigned int - there's little sense in defining an array with negative indices!
Fourthly: little error checking. For example, makeRIDMatrix() neither knows nor cares whether the parameter values are sensible (e.g. the matrix pointer isn't checked for NULLness). That's an exercise for the student.
Fifthly, I've fixed your random number usage - after a fashion. Another exercise for the student: why is the way I did it not good practice?
However - all of this is moot. You need to get yourself a good C textbook, or a good online course, and work through the examples. The code you've given here shows that you're punching above your weight at the moment, and you need to develop some more C muscles before going into that ring!
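On the fifth point, two commonly cited problems with that usage are that the generator is reseeded on every call (two calls within the same second restart the same sequence) and that taking the remainder modulo 101 slightly favours the low values. The following is a minimal sketch of one alternative, not the only one. It uses the standard rand()/srand() instead of the POSIX random()/srandom() from the answer, and random_upto is a hypothetical name.

```c
#include <stdlib.h>

/* Returns a value in [0, max], each equally likely, by discarding draws
 * that fall into the incomplete final bucket of rand()'s range.
 * Assumes 0 <= max <= RAND_MAX. Seed once, for example with
 * srand((unsigned)time(NULL)) at the start of main(), not on every call. */
long random_upto(long max)
{
    unsigned long range = (unsigned long)max + 1UL;     /* number of outcomes       */
    unsigned long span = (unsigned long)RAND_MAX + 1UL; /* number of rand() results */
    unsigned long limit = span - span % range;          /* largest multiple of range */
    unsigned long r;

    do {
        r = (unsigned long)rand();
    } while (r >= limit); /* reject the biased tail and draw again */

    return (long)(r % range);
}
```

In the loop of makeRIDMatrix() this would replace random() % 101L with random_upto(100), with the seeding moved out to main().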
In relation to your question about "variable sized arrays", you could have something like:
/* can stick this into your struct, this is just an example */
size_t rows, cols;
long **matrix;
/* set the values of rows, cols */
/* create the "array" of rows (array of pointers to longs) */
matrix = (long**)malloc(rows * sizeof(long*));
/* create the array of columns (array of longs at each row) */
for (i = 0; i < rows; i++)
matrix[i] = (long*)malloc(cols * sizeof(long));
/* ... */
/* free the memory at the end */
for (i = 0; i < rows; i++)
free(matrix[i]);
free(matrix);
Then you can just access the dynamically allocated matrix similar to any other array of arrays.
ie. to set element at the first row (row 0) and fourth column (column 3) to 5:
matrix[0][3] = 5;
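A variation on the same idea, shown here only as a sketch: allocate all the longs in one block and point each row into it, which keeps the matrix[r][c] syntax but needs just two free() calls and adds the NULL checks omitted above. The names matrix_create and matrix_destroy are hypothetical, and overflow checking of rows * cols is left out.

```c
#include <stdlib.h>

/* Allocates rows x cols longs as one zero-initialised block plus a table of
 * row pointers, so m[r][c] works and cleanup is two free() calls.
 * Returns NULL if a dimension is zero or an allocation fails. */
long **matrix_create(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0) {
        return NULL;
    }
    long **m = malloc(rows * sizeof *m);
    if (m == NULL) {
        return NULL;
    }
    m[0] = calloc(rows * cols, sizeof **m);
    if (m[0] == NULL) {
        free(m);
        return NULL;
    }
    for (size_t r = 1; r < rows; r++) {
        m[r] = m[0] + r * cols; /* each row starts cols elements after the last */
    }
    return m;
}

/* Releases a matrix returned by matrix_create(); NULL is accepted. */
void matrix_destroy(long **m)
{
    if (m != NULL) {
        free(m[0]); /* the element block */
        free(m);    /* the row-pointer table */
    }
}
```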
I am writing a C-program where I need 2D-arrays (dynamically allocated) with negative indices or where the index does not start at zero. So for an array[i][j] the row-index i should take values from e.g. 1 to 3 and the column-index j should take values from e.g. -1 to 9.
For this purpose I created the following program, here the variable columns_start is set to zero, so just the row-index is shifted and this works really fine.
But when I assign other values than zero to the variable columns_start, I get the message (from valgrind) that the command "free(array[i]);" is invalid.
So my questions are:
Why it is invalid to free the memory that I allocated just before?
How do I have to modify my program to shift the column-index?
Thank you for your help.
#include <stdio.h>
#include <stdlib.h>
main()
{
int **array, **array2;
int rows_end, rows_start, columns_end, columns_start, i, j;
rows_start = 1;
rows_end = 3;
columns_start = 0;
columns_end = 9;
array = malloc((rows_end-rows_start+1) * sizeof(int *));
for(i = 0; i <= (rows_end-rows_start); i++) {
array[i] = malloc((columns_end-columns_start+1) * sizeof(int));
}
array2 = array-rows_start; //shifting row-index
for(i = rows_start; i <= rows_end; i++) {
array2[i] = array[i-rows_start]-columns_start; //shifting column-index
}
for(i = rows_start; i <= rows_end; i++) {
for(j = columns_start; j <= columns_end; j++) {
array2[i][j] = i+j; //writing stuff into array
printf("%i %i %d\n",i, j, array2[i][j]);
}
}
for(i = 0; i <= (rows_end-rows_start); i++) {
free(array[i]);
}
free(array);
}
When you shift the column indexes, you overwrite the original row pointers: in
array2[i] = array[i-rows_start]-columns_start;
array2[i] and array[i-rows_start] are the same memory cell, because array2 is initialized with array-rows_start.
So the pointers returned by malloc are no longer stored anywhere, and freeing the memory requires the reverse shift. Try the following:
free(array[i] + columns_start);
IMHO, such modification of array indexes gives no benefit, while complicating the program logic and leading to errors. Try to translate the indexes on the fly in a single loop.
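Translating on the fly could look like the sketch below: the array stays zero-based, the stored pointers are exactly what malloc returned, and only the subscripts subtract the start values. The function name make_shifted is hypothetical, and the fill value i + j is taken from the question's loop.

```c
#include <stdlib.h>

/* Builds a zero-based array for logical indices rows_start..rows_end and
 * columns_start..columns_end (both inclusive) and fills it with i + j.
 * The stored pointers are never shifted, so they can be passed to free()
 * unchanged. Returns NULL on allocation failure. */
int **make_shifted(int rows_start, int rows_end,
                   int columns_start, int columns_end)
{
    int rows = rows_end - rows_start + 1;
    int cols = columns_end - columns_start + 1;
    int **array = malloc((size_t)rows * sizeof *array);
    if (array == NULL) {
        return NULL;
    }
    for (int i = rows_start; i <= rows_end; i++) {
        int *row = malloc((size_t)cols * sizeof *row);
        if (row == NULL) {
            while (i > rows_start) free(array[--i - rows_start]);
            free(array);
            return NULL;
        }
        array[i - rows_start] = row;
        for (int j = columns_start; j <= columns_end; j++) {
            row[j - columns_start] = i + j; /* translate on the fly */
        }
    }
    return array;
}
```

Reading element (i, j) later uses the same translation, array[i - rows_start][j - columns_start], and cleanup is a plain free(array[k]) for each zero-based row k followed by free(array).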
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int a[] = { -1, 41, 42, 43 };
int *b;//you will always read the data via this pointer
b = &a[1];// 1 is becoming the "zero pivot"
printf("zero: %d\n", b[0]);
printf("-1: %d\n", b[-1]);
return EXIT_SUCCESS;
}
If you don't need just a contiguous block, then you may be better off with hash tables instead.
As far as I can see, your free and malloc look good. But your shifting doesn't make sense. Why don't you just add an offset to your index instead of using array2:
int maxNegValue = 10;
int myNegValue = -6;
array[x][myNegValue+maxNegValue] = ...;
This way, you're always in the positive range.
For malloc: you allocate (maxNegValue + maxPosValue) * sizeof(...)
OK, I understand now that you need free(array.. + offset); even when using your shifting approach, and that's probably not what you want. If you don't need a very fast implementation, I'd suggest using a struct containing the offset and an array, and then creating a function that takes this struct and x/y as arguments to access the array.
I don't know why valgrind would complain about that free statement, but there seems to be a lot of pointer juggling going on so it doesn't surprise me that you get this problem in the first place. For instance, one thing which caught my eye is:
array2 = array-rows_start;
This will make array2[0] dereference memory which you didn't allocate. I fear it's just a matter of time until you get the offset calculations wrong and run into this problem.
In one comment you wrote
but im my program I need a lot of these arrays with all different beginning indices, so I hope to find a more elegant solution instead of defining two offsets for every array.
I think I'd hide all this in a matrix helper struct (+ functions) so that you don't have to clutter your code with all the offsets. Consider this in some matrix.h header:
struct matrix; /* opaque type */
/* Allocates a matrix with the given dimensions, sample invocation might be:
*
* struct matrix *m;
* matrix_alloc( &m, -2, 14, -9, 33 );
*/
void matrix_alloc( struct matrix **m, int minRow, int maxRow, int minCol, int maxCol );
/* Releases resources allocated by the given matrix, e.g.:
*
* struct matrix *m;
* ...
* matrix_free( m );
*/
void matrix_free( struct matrix *m );
/* Get/Set the value of some element in the matrix; takes logical (potentially negative)
* coordinates and translates them to zero-based coordinates internally, e.g.:
*
* struct matrix *m;
* ...
* int val = matrix_get( m, 9, -7 );
*/
int matrix_get( struct matrix *m, int row, int col );
void matrix_set( struct matrix *m, int row, int col, int val );
And here's what an implementation might look like (this would be matrix.c):
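#include <stdlib.h>
#include "matrix.h"

struct matrix {
    int minRow, maxRow, minCol, maxCol;
    int **elem;
};

/* Parameter order matches the declaration in matrix.h: rows first, then columns. */
void matrix_alloc( struct matrix **m, int minRow, int maxRow, int minCol, int maxCol ) {
    /* both bounds are inclusive, hence the + 1 */
    int numRows = maxRow - minRow + 1;
    int numCols = maxCol - minCol + 1;
    *m = malloc( sizeof( struct matrix ) );
    (*m)->minRow = minRow; (*m)->maxRow = maxRow;
    (*m)->minCol = minCol; (*m)->maxCol = maxCol;
    (*m)->elem = malloc( numRows * sizeof( *(*m)->elem ) );
    for ( int i = 0; i < numRows; ++i )
        (*m)->elem[i] = malloc( numCols * sizeof( int ) );
    /* checking the malloc() results omitted for brevity */
}

void matrix_free( struct matrix *m ) {
    for ( int i = 0; i <= m->maxRow - m->minRow; ++i )
        free( m->elem[i] );
    free( m->elem );
    free( m );
}

/* Translate logical (row, col) to zero-based indices. */
int matrix_get( struct matrix *m, int row, int col ) {
    return m->elem[row - m->minRow][col - m->minCol];
}

void matrix_set( struct matrix *m, int row, int col, int val ) {
    m->elem[row - m->minRow][col - m->minCol] = val;
}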
This way you only need to get this stuff right once, in a central place. The rest of your program doesn't have to deal with raw arrays but rather the struct matrix type.