Scattered 2D array to contiguous 2D array transformation (in C) - c

I am trying to write a generic function in C that takes a 2D array of any type and copies it into a contiguous memory block. (I need this function for aggregate operations in MPI on my complex datatypes.)
Imagine I have the following integer array
int n = 5;
int m = 6;
int** int_array = (int**) malloc(n* sizeof(int*));
for (int i = 0; i < n; i++ )
int_array[i] = (int *) malloc(m * sizeof(int) );
With this kind of allocation every row is a separate block, so one cannot, in principle, hope to access the (i,j)-th entry of int_array using pointer arithmetic like the following:
int value = (*int_array)[i*m+j];
Therefore I implemented a function that basically allocates a new memory block and neatly orders the entries of int_array so that the above indexing should work.
void linearize(char*** array, int n, int m,unsigned int size_bytes){
char* newarray = (char*)malloc(m*n*size_bytes);
//copy array!
for (int i = 0;i<n;i++)
for(int j = 0;j<m*size_bytes;j++)
{
newarray[i*m*size_bytes+j] = (*array)[i][j];
}
//swap pointers and free old memory!
for (int i = 0;i<n;i++)
{
char * temp = (*array)[i];
(*array)[i] = newarray + i*m*size_bytes ;
free(temp);
}
}
I wanted the above function to work with any array type, hence I used char pointers to do the operations byte by byte. I tested the function and so far it works, but I am not sure about memory deallocation.
Does free(temp) free the whole memory pointed to by int_array[i], that is, the m*sizeof(int) bytes accessible from int_array[i], or only the first m bytes (since it thinks that our array is of type char rather than int)? Or simply put: "Does the linearize function induce any memory leaks?"
Thank you in advance!
*EDIT*
As suggested by Nicolas Barbey, I ran a valgrind checks for memory leaks and it found none.
So to summarize the main points that I found difficult to understand about the behaviour of the program were:
in the function linearize does the following code induce memory leaks:
char * temp = (*array)[i];
(*array)[i] = newarray + i*m*size_bytes ;
free(temp);
No. free() releases the entire block that the matching malloc() returned, whatever the type of the pointer passed to it. This is the allocator's doing, not the compiler's: malloc keeps track of the size of every block it hands out. Originally I was afraid that if array[i] is a pointer of type int* that points to a block of, say, 5 ints = 5*4 bytes, free(temp) would free only the first five bytes of that memory.
Another point: how do you free the already linearized array? That is, if you have:
// first initialize the array.
int** array = (int**)malloc(5*sizeof(int*));
for(int i = 0; i< 5;i++)
array[i] = ( int* ) malloc(5*sizeof(int));
//now a call to linearize
linearize(&array,5,5,sizeof(int));
// ... do some work with array ...
// now time to free array
free(array[0]);
free(array);
//suffices to free all memory pointed to by array[i] and as well as the memory allocated
// for the pointers.
Thanks for the discussion and the suggestions.

You need exactly one free() per malloc() to avoid memory leaks. In your case linearize allocates one new block in addition to the allocations behind int_array, so three kinds of allocation must be released: each original row int_array[i] (linearize already frees these when it swaps the pointers), the block created inside linearize (free it once, through int_array[0]), and finally int_array itself.
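To make the malloc/free accounting concrete, here is a minimal sketch, not the asker's generic function. It assumes an int-only variant, and the names linearize_int, free_scattered and linearize_demo are made up for illustration. It builds a scattered 3 x 4 array, moves it into one block, reads one element through flat indexing, and then releases everything with exactly two calls to free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical int-only variant of linearize: copies n rows of m ints into
   one block, frees each old row, and repoints rows[i] into the new block.
   Returns 0 on success, -1 if the block cannot be allocated (rows untouched). */
int linearize_int(int **rows, size_t n, size_t m)
{
    assert(rows != NULL && n > 0 && m > 0);
    int *block = malloc(n * m * sizeof *block);
    if (block == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        memcpy(block + i * m, rows[i], m * sizeof *block);
        free(rows[i]);            /* releases the whole row, all m ints */
        rows[i] = block + i * m;
    }
    return 0;
}

/* Frees a scattered array: the first rows_allocated rows, then the pointer array. */
void free_scattered(int **rows, size_t rows_allocated)
{
    for (size_t i = 0; i < rows_allocated; i++)
        free(rows[i]);
    free(rows);
}

/* Builds a scattered 3 x 4 array with a[i][j] = 10*i + j, linearizes it,
   reads element (2, 3) through flat indexing, and frees everything.
   Returns that element, or -1 on allocation failure. */
int linearize_demo(void)
{
    enum { N = 3, M = 4 };
    int **a = malloc(N * sizeof *a);
    if (a == NULL)
        return -1;
    for (size_t i = 0; i < N; i++) {
        a[i] = malloc(M * sizeof **a);
        if (a[i] == NULL) {
            free_scattered(a, i);
            return -1;
        }
        for (size_t j = 0; j < M; j++)
            a[i][j] = (int)(10 * i + j);
    }
    if (linearize_int(a, N, M) != 0) {
        free_scattered(a, N);
        return -1;
    }
    int value = a[0][2 * M + 3];  /* flat index into the single block */
    free(a[0]);                   /* the block allocated by linearize_int */
    free(a);                      /* the array of row pointers */
    return value;
}
```

After linearizing, a[1], a[2], ... point into the middle of the block, so they must not be passed to free; only a[0] is the address malloc returned.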

Here is a slightly slimmer version using actual two dimensional arrays:
void * linearize(void** array, int n, int m,unsigned int size_bytes){
char (*newarray)[m * size_bytes] = malloc(m*n*size_bytes);
//copy array!
int i;
for (i = 0;i<n;i++) {
memcpy(newarray[i], array[i], sizeof(*newarray));
free(array[i]);
}
free(array);
return newarray;
}
Use:
int (*newarray)[m] = linearize((void **)int_array, n, m, sizeof(**int_array));
int value = newarray[i][j];
// or
value = newarray[0][i*m + j];
// or
value = ((int *)newarray)[i*m + j];

Related

C , regarding pointers (or pointers to pointers?), **, and malloc

As the title says, I have a question about using * twice, as in the main function of the following code. It does run, but I don't understand why using ** is right here. What I want is an array of SPPoints of size n, where parr is the base address. Why is ** right and * wrong in this case? Thanks.
SPPoint code:
struct sp_point_t
{
double* data;
int dim;
int index;
};
SPPoint* spPointCreate(double* data, int dim, int index)
{
if (data == NULL || dim <= 0 || index < 0)
{
return NULL;
}
SPPoint* point = malloc(sizeof(*point));
if (point == NULL)
{
return NULL;
}
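point->data = malloc(dim * sizeof(*data));
if (point->data == NULL)
{
free(point); /* do not leak the struct if the data allocation fails */
return NULL;
}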
for (int i = 0; i < dim; i++)
{
point->data[i] = data[i];
}
point->dim = dim;
point->index = index;
return point;
}
And this is the main function:
int main()
{
int n, d, k;
scanf("%d %d %d", &n, &d, &k);
double* darr = malloc(d * sizeof(double));
if (darr == NULL)
{
return 0;
}
SPPoint** parr = malloc(n * sizeof(SPPoint*));
if (parr == NULL)
{
return 0;
}
for (int i = 0; i < n; i++)
{
for (int j = 0; j < d; j++)
{
scanf(" %lf", &darr[j]);
}
parr[i] = spPointCreate(darr, d, i);
}
}
When using a dynamically-allocated array, it's usually "handled" by having a pointer to the first element of the array, and also having some method of knowing the length, such as explicitly storing the length, or having an end sentinel.
So for a dynamically allocated array of SPPoint * as you have in your code, a pointer to the first one of those has type SPPoint * *
Your existing code creates an array of SPPoint *, i.e. an array of pointers. Each of those pointers points to one dynamically-allocated instance of SPPoint, i.e. you have separate allocations for each entry.
This is viable but you indicate that you instead wanted an array of SPPoint, in which case a pointer to the first element has type SPPoint *.
In order to have such an array, it is a single memory allocation. So you will need to redesign your spPointCreate function. Currently that allocates memory for and initializes only a single SPPoint. Instead you want to separate the allocation from the initialization, since you only need one allocation but you need multiple initializations. Your program logic will read something like:
Allocate one block of memory big enough for n SPPoints
Initialize each SPPoint inside the allocated space
If you have tried this but got stuck then post a new question showing your code and explaining where you got stuck.
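The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the function names spPointInit, spPointArrayCreate and spPointArrayDestroy are hypothetical (they are not in the question), SPPoint is assumed to be a typedef for struct sp_point_t, and every point is initialized from the same coordinates just to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct sp_point_t {
    double *data;
    int dim;
    int index;
} SPPoint;

/* Hypothetical initializer: fills an already allocated SPPoint with a copy
   of data. Returns 0 on success, -1 on bad arguments or allocation failure. */
int spPointInit(SPPoint *point, const double *data, int dim, int index)
{
    if (point == NULL || data == NULL || dim <= 0 || index < 0)
        return -1;
    point->data = malloc((size_t)dim * sizeof *point->data);
    if (point->data == NULL)
        return -1;
    memcpy(point->data, data, (size_t)dim * sizeof *point->data);
    point->dim = dim;
    point->index = index;
    return 0;
}

/* Hypothetical cleanup: frees the data of the first count points,
   then the single block that holds the points themselves. */
void spPointArrayDestroy(SPPoint *points, int count)
{
    assert(count >= 0);
    if (points == NULL)
        return;
    for (int i = 0; i < count; i++)
        free(points[i].data);
    free(points);
}

/* Hypothetical constructor: ONE allocation for n SPPoints, then n
   initializations. The result has type SPPoint *, not SPPoint **. */
SPPoint *spPointArrayCreate(const double *coords, int dim, int n)
{
    if (n <= 0)
        return NULL;
    SPPoint *points = malloc((size_t)n * sizeof *points);
    if (points == NULL)
        return NULL;
    for (int i = 0; i < n; i++) {
        if (spPointInit(&points[i], coords, dim, i) != 0) {
            spPointArrayDestroy(points, i);  /* undo the points built so far */
            return NULL;
        }
    }
    return points;
}
```

Elements are then accessed as points[i].dim rather than parr[i]->dim, because the array holds the structs themselves instead of pointers to them.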
An array can behave similarly to a pointer. For instance, as a function parameter int a[] is equivalent to int* a. spPointCreate returns a pointer to an SPPoint struct. An array of pointers to SPPoint can be handled through a pointer to a pointer to SPPoint. With the malloc call, you are reserving a certain amount of memory (enough to hold n pointers to SPPoint) for storing pointers to SPPoint structs.
Not all pointers are arrays, however. SPPoint** parr is acting as an array holding pointers to single structs of type SPPoint.
Arrays can behave differently from pointers, especially when used for strings.
The reason why it is advantageous to use pointers to SPPoint (as you are now) is that you can view or modify a single element without having to copy the entire struct.

C: free() for row of 2d int array makes program halt

I am relatively new to C and have coded (or more precise: copied from here and adapted) the functions below. The first one takes a numpy array and converts it to a C int array:
int **pymatrix_to_CarrayptrsInt(PyArrayObject *arrayin) {
int **result, *array, *tmpResult;
int i, n, m, j;
n = arrayin->dimensions[0];
m = arrayin->dimensions[1];
result = ptrvectorInt(n, m);
array = (int *) arrayin->data; /* pointer to arrayin data as int */
for (i = 0; i < n; i++) {
result[i] = &array[i * m];
}
return result;
}
The second one is used within the first one to allocate the necessary memory of the row vectors:
int **ptrvectorInt(long dim1, long dim2) {
int **result, i;
result = malloc(dim1 * sizeof(int*));
for (i = 0; i < dim1; i++) {
if (!(result[i] = malloc(dim2 * sizeof(int)))){
printf("In **ptrvectorInt. Allocation of memory for int array failed.");
exit(0);
}
}
return result;
}
Up to this point everything works quite fine. Now I want to free the memory occupied by the C array. I have found multiple threads about how to do it, e.g. Allocate and free 2D array in C using void, C: Correctly freeing memory of a multi-dimensional array, or how to free c 2d array. Inspired by the respective answers I wrote my freeing function:
void free_CarrayptrsInt(int **ptr, int i) {
for (i -= 1; i >= 0; i--) {
free(ptr[i]);
}
free(ptr);
}
Nonetheless, I found out that already the first call of free fails, no matter whether I let the for loop go down or up.
I looked for explanations for failing free calls: Can a call to free() in C ever fail? and free up on malloc fails. These suggest that there may already have been a problem at the memory allocation. However, my program works completely as expected, except for the memory freeing. Printing the array in question shows that everything should be fine. What could be the issue? And even more important: how can I properly free the array?
I work on a Win8 64 bit machine with Visual Studio 10 64bit compiler. I use C together with python 3.4 64bit.
Thanks for all help!
pymatrix_to_CarrayptrsInt() calls ptrvectorInt() and this allocation is made
if (!(result[i] = malloc(dim2 * sizeof(int)))){
then pymatrix_to_CarrayptrsInt() writes over that allocation with this assignment
result[i] = &array[i * m];
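causing a memory leak, because the only pointer to each malloc'd row is lost. Worse, result[i] now points into the numpy data buffer, which was not returned by malloc, so calling free(result[i]) is undefined behaviour; that is why the first free already fails. Only the pointer array result itself may be freed.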
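One possible way out, sketched here without the Python headers so that it stays self-contained: allocate only the array of row pointers and treat the rows as borrowed from the existing buffer. The names make_row_view and free_row_view are hypothetical, and a plain int buffer stands in for arrayin->data.

```c
#include <assert.h>
#include <stdlib.h>

/* Builds n row pointers into an existing contiguous n x m buffer.
   Only the pointer array is allocated; the rows are borrowed from data.
   Returns NULL if the allocation fails. */
int **make_row_view(int *data, size_t n, size_t m)
{
    assert(data != NULL);
    int **rows = malloc(n * sizeof *rows);
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        rows[i] = data + i * m;
    return rows;
}

/* Releases the view. The rows are NOT freed: they belong to whoever
   owns the underlying buffer (the numpy array in the question). */
void free_row_view(int **rows)
{
    free(rows);
}
```

With this layout the freeing function needs no loop at all, because the one malloc is matched by one free.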

Array of pointers issue

I'm having some trouble passing data from one pointer to an element of an array of pointers to a struct.
typedef struct {
float* data;
int size;
} vector;
//This function creates the vector
vector* doVector(int n, float* data){
vector * vec = (vector *) malloc(sizeof(vector));
vec->size = n;
vec->data = data;
return vec;
}
void delVector(vector* v){
free(v->data);
free(v);
}
void prVector(vector* v)
{
printf("[");
for(unsigned int i = 0; i<v->size; i++){
if(i!=v->size-1)
printf("%f,", v->data[i]);
else
printf("%f]\n", v->data[i]);
}
}
void fillVectors(float* data,int size){
vector * vectors = (vector*) malloc(size * sizeof(vector));
for(unsigned int i = 0; i < size; i++){
vectors[i] = *doVector(size,data);//This gives trouble
prVector(&vectors[i]);
}
//More stuff will be added here to work with the vectors.
for(unsigned int i = 0; i < size; i++)
delVector(&vectors[i]);//Memory leak here obv
free(vectors);// I also need to free the array
}
int main()
{
//Here receiving data from file and calling fillVectors
//Also allocating memory for data (which is send to fillvectors)
//Avoided to post because it's irrelevant and big
}
So the main idea is to create vectors with the struct. Data and size are read from a file and stored in a float array called data and an int called size. Then we call fillVectors, which calls doVector to create each vector.
Then I want to assign each vector to a position in the array (there are 3 mallocs: data, the single vector made in doVector, and the array of vectors made in fillVectors).
The problem comes when freeing these pointers: I keep getting memory leaks.
It has something to do with the malloc of the array of vectors and the vector malloc in doVector.
PS: fillVectors is only called once.
Thanks.
Simple rule: in C, if a function has to modify something, pass it a pointer to that thing. So if you want to delete a vector held through a pointer and reset that pointer, pass a pointer to the pointer:
void delVector(vector** v){
free((*v)->data);
free(*v);
*v = NULL;
}
doVector already returns a pointer, so there is no need to dereference it with an asterisk as in
vectors[i] = *doVector(size,data);
Second: you want an array of vectors? Then use an array of pointers to vectors:
vector **vectors = (vector**) malloc(size * sizeof(vector*));
for (unsigned int i = 0; i < size; i++){
vectors[i] = doVector(size, data);//store the returned pointer, no dereference
prVector(vectors[i]);//no need to use ampersand, it is already pointer
}
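And the main point: you need a deep copy of the float data inside each vector. Right now all vectors keep a pointer to the same array, the one passed as the argument. On top of that, delVector frees this data with
free(v->data);
even though the pointer was only copied, not owned, so the same array ends up being freed once per vector.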
vector* doVector(size_t n, float* data){
size_t i;
vector * vec = (vector *) malloc(sizeof(vector));
vec->size = n;
vec->data = (float*)malloc(sizeof(float) * n);
for (i = 0; i < n; i++) {
vec->data[i] = data[i];
}
//or just
//memcpy(vec->data, data, n*sizeof(float));
return vec;
}
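Putting these pieces together, a minimal sketch of the whole flow could look like this. It assumes size_t for the size member (the question uses int), takes delVector by plain pointer for brevity, and adds error checks that are not in the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    float *data;
    size_t size;   /* assumption: the question declares this as int */
} vector;

/* Creates a vector that OWNS a private copy of the n input floats.
   Returns NULL if either allocation fails. */
vector *doVector(size_t n, const float *data)
{
    assert(data != NULL && n > 0);
    vector *vec = malloc(sizeof *vec);
    if (vec == NULL)
        return NULL;
    vec->data = malloc(n * sizeof *vec->data);
    if (vec->data == NULL) {
        free(vec);
        return NULL;
    }
    memcpy(vec->data, data, n * sizeof *vec->data);
    vec->size = n;
    return vec;
}

/* Frees the vector and the copy it owns; a NULL argument is ignored. */
void delVector(vector *v)
{
    if (v != NULL) {
        free(v->data);
        free(v);
    }
}

/* Creates `size` vectors from the same input and releases them again.
   Returns 0 on success, -1 on allocation failure. The caller keeps
   ownership of data, which is never freed here. */
int fillVectors(const float *data, size_t size)
{
    vector **vectors = malloc(size * sizeof *vectors);
    if (vectors == NULL)
        return -1;
    size_t created = 0;
    while (created < size) {
        vectors[created] = doVector(size, data);
        if (vectors[created] == NULL)
            break;
        created++;
    }
    /* ... work with vectors[0 .. created-1] here ... */
    for (size_t i = 0; i < created; i++)
        delVector(vectors[i]);   /* one free pair per doVector call */
    free(vectors);               /* and one for the pointer array */
    return created == size ? 0 : -1;
}
```

Every malloc now has exactly one owner: each vector owns its struct and its copy of the data, the array owns only the pointers, and the input buffer stays with the caller.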
More questions...
I will focus on your line with the comment //This gives trouble
With function doVector you use malloc to create a vector instance somewhere in memory. Then, when dereferencing the result by doing *doVector(size, data), you take the created vector and try to assign it to vectors[i]. This copies the memory block of newly created vector into the location vectors[i], but you don't keep the pointer to the result of doVector.
Afterwards, the for loop calls delVector(&vectors[i]), which passes free a pointer into the vectors array rather than a pointer that malloc returned (for i = 0 it is the array itself), and later free(vectors) tries to free the same space again. Meanwhile, the memory allocated inside doVector is never freed, because you no longer have the pointers to the created vectors.
I would stick to Ivan Ivanov's answer for making it correct. I just wanted to point out why it doesn't work.
You should initialize every pointer that is not allocated immediately to NULL (or 0, or (void*)0). A later call to free is then safe whether or not anything was allocated.
When you allocate the struct itself, also set its internal pointer to NULL until that member is allocated too.
vector* newVector;
newVector = (void*)0; //or 0, NULL
... //Code here
newVector = malloc(sizeof(vector));
newVector->data = (void*)0;
... //More code
if(newVector){
free(newVector);
newVector = (void*)0;
}
Notes
If you must do dynamic memory allocation, do it in a format where you manage pointers with a static value.
As Chris mentions below, deleting a null ptr is already handled by delete and free, but I like to include the if statements to remind myself to set the pointer to NULL when its absolutely necessary.
Thanks again Chris :D

Am I correctly allocating memory for my pointer arrays in C?

I am trying to track down a bug in a big program. I think it is due to how I am passing arrays to my functions. Am I doing this correctly?
main(){
int *x = declarArray(x, 100);
int *y = declarArray(x, 100);
// lines of code....
x = arrayManip(x, 100);
// more code...
int i;
for(i=0; i<100; i++)
y[i] = x[i];
//more code...
free(x);
free(y);
}
This is how I manipulate arrays:
int *arrayManip(int *myarray, int length){
int i;
for(i=0; i<length; i++)
myarray[i] = i;
return array;
}
This is how I initialize the arrays:
int* declareArray(int *myarray, int length){
myarray = (int*) malloc(length*sizeof(int*));
if (myarray==NULL)
printf("Error allocating memory!\n");
int i;
for(i=0; i<length; i++)
myarray[i] = -888;
return myarray;
}
This code seems to work fine on a small scale, but maybe there is a problem once I have many more arrays of larger size that are often getting passed back and forth and copied in my program?
declarArray :
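The name is misspelled (declarArray in the call, declareArray in the definition)
The name does not describe what the function does
malloc uses sizeof(int*) instead of sizeof(int); on a typical 64-bit machine this allocates twice as much as needed, and it would allocate too little wherever pointers are smaller than int
If malloc fails, you print a message but still write through the null pointer
Passing myarray as an argument is pointless, since it is overwritten immediately
-888 is a magic number
There is no real error handling whatsoever
My advice: throw it away and start fresh.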
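A fresh start along those lines might look like the sketch below. The name create_filled_array and its signature are only a suggestion; the fill value is passed in by the caller instead of being hard-coded.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates length ints and sets each one to fill.
   Returns NULL if the allocation fails; the caller must free the result. */
int *create_filled_array(size_t length, int fill)
{
    assert(length > 0);
    int *arr = malloc(length * sizeof *arr);  /* sizeof *arr follows the element type */
    if (arr == NULL)
        return NULL;                          /* nothing is written on failure */
    for (size_t i = 0; i < length; i++)
        arr[i] = fill;
    return arr;
}
```

A call then reads int *x = create_filled_array(100, -888); followed by a NULL check, with no dummy pointer argument.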
No, as far as I understand.
You are allocating a one-dimensional array, so its elements should be integers and not pointers to integers. Instead of this:
myarray = (int*) malloc(length*sizeof(int*));
it should be :
myarray = (int*) malloc(length*sizeof(int));
Also, in arrayManip the parameter is named myarray, but the function returns array, which is not declared anywhere.
This:
myarray = (int*) malloc(length*sizeof(int*));
allocates an array of length pointers to an integer, but then puts it into a pointer to an integer (i.e. an array of integers, not pointers to integers). If you want an array of integers, you want:
myarray = (int*) malloc(length*sizeof(int));
or (if you want to zero it):
myarray = (int*) calloc(length, sizeof(int));
which does the size x length calculation itself.
To allocate a list of pointers to integers, you want:
myarray = (int**) malloc(length*sizeof(int*));
or
myarray = (int**) calloc(length, sizeof(int*));
Unless you are fantastically concerned about speed, I find using calloc() results in fewer bugs from uninitialized arrays, and makes the reason for the allocated size more obvious.
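A pointer has the machine's word size (2, 4, 8, ... bytes depending on the architecture), whatever it points to: int, double, float, ...
Allocating with sizeof(int*) for an int array only works by accident on machines where a pointer is at least as large as an int; with other data types it will lead you into errors.
You should allocate memory as
pointer = (DataType*) malloc (length * sizeof(DataType));
so that the element type is explicit and the code is clear. For reference, the prototype of malloc is
void* malloc (size_t size);
Do not use memset to assign the default value -888 to your array: memset writes a single byte value into every byte, so it can only produce patterns such as all zeros. Use a loop for -888. The prototype of memset is
void *memset(void *str, int c, size_t n)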
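A word of caution on memset for this purpose, shown as a small sketch with made-up helper names: memset converts its value argument to unsigned char and stores that byte everywhere, so it cannot produce an int such as -888. A plain loop can.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns element 0 of a small int array after memset(a, c, sizeof a). */
int first_after_memset(int c)
{
    int a[4];
    memset(a, c, sizeof a);   /* every BYTE becomes (unsigned char)c */
    return a[0];
}

/* Assigns value to each of the n ints in dst; works for any value. */
void fill_ints(int *dst, size_t n, int value)
{
    assert(dst != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = value;
}
```

Here first_after_memset(0) gives 0 because all bytes are zero, whereas first_after_memset(-888) gives an int made of repeated 0x88 bytes rather than -888.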

Is this code doing what I want it to do?

I want to create an integer pointer p, allocate memory for a 10-element array, and then fill each element with the value of 5. Here's my code:
//Allocate memory for a 10-element integer array.
int array[10];
int *p = (int *)malloc( sizeof(array) );
//Fill each element with the value of 5.
int i = 0;
printf("Size of array: %d\n", sizeof(array));
while (i < sizeof(array)){
*p = 5;
printf("Current value of array: %p\n", *p);
*p += sizeof(int);
i += sizeof(int);
}
I've added some print statements around this code, but I'm not sure if it's actually filling each element with the value of 5.
So, is my code working correctly? Thanks for your time.
First:
*p += sizeof(int);
This takes the contents of what p points to and adds the size of an integer to it. That doesn't make much sense. What you probably want is just:
p++;
This makes p point to the next object.
But the problem is that p contains your only copy of the pointer to the first object. So if you change its value, you won't be able to access the memory anymore because you won't have a pointer to it. (So you should save a copy of the original value returned from malloc somewhere. If nothing else, you'll eventually need it to pass to free.)
while (i < sizeof(array)){
This doesn't make sense. You don't want to loop a number of times equal to the number of bytes the array occupies.
Lastly, you don't need the array for anything. Just remove it and use:
int *p = malloc(10 * sizeof(int));
For C, don't cast the return value of malloc. It's not needed and can mask other problems such as failing to include the correct headers. For the while loop, just keep track of the number of elements in a separate variable.
Here's a more idiomatic way of doing things:
/* Just allocate the array into your pointer */
int arraySize = 10;
int *p = malloc(sizeof(int) * arraySize);
printf("Size of array: %d\n", arraySize);
/* Use a for loop to iterate over the array */
int i;
for (i = 0; i < arraySize; ++i)
{
p[i] = 5;
printf("Value of index %d in the array: %d\n", i, p[i]);
}
Note that you need to keep track of your array size separately, either in a variable (as I have done) or a macro (#define statement) or just with the integer literal. Using the integer literal is error-prone, however, because if you need to change the array size later, you need to change more lines of code.
sizeof applied to an array yields the number of bytes the array occupies, not the number of elements.
int *p = (int *)malloc( sizeof(array) );
If you call malloc, you must #include <stdlib.h>. Also, the cast is unnecessary and can introduce dangerous bugs, especially when paired with a missing malloc declaration.
If you increment a pointer by one, you reach the next element of the pointer's type. Therefore, you should write the bottom part as:
for (int i = 0;i < sizeof(array) / sizeof(array[0]);i++){
*p = 5;
p++;
}
*p += sizeof(int);
should be
p += 1;
since the pointer is of type int *
also the array size should be calculated like this:
sizeof (array) / sizeof (array[0]);
and indeed, the array is not needed for your code.
Nope it isn't. The following code will however. You should read up on pointer arithmetic. p + 1 is the next integer (this is one of the reasons why pointers have types). Also remember if you change the value of p it will no longer point to the beginning of your memory.
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#define LEN 10
int main(void)
{
/* Allocate memory for a 10-element integer array. */
int array[LEN];
int i;
int *p;
int *tmp;
p = malloc(sizeof(array));
assert(p != NULL);
/* Fill each element with the value of 5. */
printf("Size of array: %d bytes\n", (int)sizeof(array));
for(i = 0, tmp = p; i < LEN; tmp++, i++) *tmp = 5;
for(i = 0, tmp = p; i < LEN; i++) printf("%d\n", tmp[i]);
free(p);
return EXIT_SUCCESS;
}
//Allocate memory for a 10-element integer array.
int array[10];
int *p = (int *)malloc( sizeof(array) );
At this point you have allocated twice as much memory -- space for ten integers in the array allocated on the stack, and space for ten integers allocated on the heap. In a "real" program that needed to allocate space for ten integers and stack allocation wasn't the right thing to do, the allocation would be done like this:
int *p = malloc(10 * sizeof(int));
Note that there is no need to cast the return value from malloc(3). I expect you forgot to include the <stdlib.h> header, which would have properly prototyped the function, and given you the correct output. (Without the prototype in the header, the C compiler assumes the function would return an int, and the cast makes it treat it as a pointer instead. The cast hasn't been necessary for twenty years.)
Furthermore, be very wary of learning the habit sizeof(array). This will work in code where the array is declared in the same scope as the sizeof operator, but it will fail when used like this:
int foo(char bar[]) {
int length = sizeof(bar); /* BUG */
}
It'll look correct, but sizeof() will in fact see a char * instead of the full array. C's new Variable Length Array support is keen, but not to be mistaken for the arrays that know their size available in many other languages.
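The difference can be checked with a small sketch; the function names are invented for the illustration. The callee's parameter is written as const char *bar, which is the type that char bar[] silently becomes in a parameter list.

```c
#include <assert.h>
#include <stddef.h>

/* The array is declared in this scope, so sizeof sees its full type. */
size_t size_in_owner(void)
{
    char buf[100];
    return sizeof buf;        /* 100 */
}

/* A parameter declared as `char bar[]` has this pointer type, so sizeof
   yields the size of a pointer, whatever array the caller passed. */
size_t size_in_callee(const char *bar)
{
    return sizeof bar;        /* sizeof(char *), not the array size */
}

/* The usual fix: pass the length explicitly alongside the pointer. */
size_t count_nonzero(const char *bar, size_t length)
{
    assert(bar != NULL);
    size_t count = 0;
    for (size_t i = 0; i < length; i++)
        if (bar[i] != 0)
            count++;
    return count;
}
```

The caller, which still sees the array type, can supply the length as sizeof buf / sizeof buf[0].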
//Fill each element with the value of 5.
int i = 0;
printf("Size of array: %d\n", sizeof(array));
while (i < sizeof(array)){
*p = 5;
*p += sizeof(int);
Aha! Someone else who has the same trouble with C pointers that I did! I presume you used to write mostly assembly code and had to increment your pointers yourself? :) The compiler knows the type of objects that p points to (int *p), so it'll properly move the pointer by the correct number of bytes if you just write p++. If you swap your code to using long or long long or float or double or long double or struct very_long_integers, the compiler will always do the right thing with p++.
i += sizeof(int);
}
While that's not wrong, it would certainly be more idiomatic to re-write the last loop a little:
for (i=0; i<array_length; i++)
p[i] = 5;
Of course, you'll have to store the array length into a variable or #define it, but it's easier to do this than rely on a sometimes-finicky calculation of the array length.
Update
After reading the other (excellent) answers, I realize I forgot to mention that since p is your only reference to the array, it'd be best to not update p without storing a copy of its value somewhere. My little 'idiomatic' rewrite side-steps the issue but doesn't point out why using subscripting is more idiomatic than incrementing the pointer, and this is one reason why subscripting is preferred. I also prefer subscripting because it is often far easier to reason about code where the base of an array doesn't change. (It Depends.)
//allocate an array of 10 elements on the stack
int array[10];
//allocate an array of 10 elements on the heap. p points at them
int *p = (int *)malloc( sizeof(array) );
// i equals 0
int i = 0;
//while i is less than 40
while (i < sizeof(array)){
//the first element of the dynamic array is five
*p = 5;
// the first element of the dynamic array is nine!
*p += sizeof(int);
// increment i by 4
i += sizeof(int);
}
This sets the first element of the array to nine, 10 times. It looks like you want something more like:
//when you get something from malloc,
// make sure its type is "____ * const" so
// you don't accidentally lose it
int * const p = (int *)malloc( 10*sizeof(int) );
for (int i=0; i<10; ++i)
p[i] = 5;
A ___ * const prevents you from changing p, so that it will always point to the data that was allocated. This means free(p); will always work. If you change p, you can't release the memory, and you get a memory leak.
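As a small end-to-end sketch of that advice (the function name fill_and_sum is made up, and the NULL check is an addition): the const pointer is allocated, filled by subscript, and handed back to free unchanged.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates count ints, sets each to value, and returns their sum.
   Returns -1 if the allocation fails. */
int fill_and_sum(size_t count, int value)
{
    assert(count > 0);
    int *const p = malloc(count * sizeof *p);  /* p itself can never be reassigned */
    if (p == NULL)
        return -1;
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        p[i] = value;        /* subscripting leaves p untouched */
        sum += p[i];
    }
    /* A statement such as p++ would be rejected by the compiler here. */
    free(p);                 /* still the exact address malloc returned */
    return sum;
}
```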

Resources