Seg fault trying to assign an array of structs in C

I've been away from C for a little bit, so there are some growing pains here.
Basically I'm trying to create an array of all possible RGB values.
#include <stdio.h>
#define MAX 3
struct rgb_val {
int r;
int g;
int b;
};
int main(void) {
struct rgb_val rgb[MAX];
int index = 0;
for (int r = 0; r < MAX; r++) {
for (int g = 0; g < MAX; g++) {
for (int b = 0; b < MAX; b++) {
rgb[index].r = r;
rgb[index].g = g;
rgb[index].b = b;
index++;
}
}
}
return 0;
}

You are accessing the array out of bounds, which is undefined behavior.

Right now you are accessing elements outside your array's bounds; the array has size MAX. You need to change your loop to:
for (int index = 0; index < MAX; index++) {
rgb[index].r = r; //some r
rgb[index].g = g; //some g
rgb[index].b = b; //some b
}
This way, you will access only your array's elements.
Now, if you do want to increase the r, g and b's value by one for each struct, all you have to do is
rgb[index].r = index;
rgb[index].g = index;
rgb[index].b = index;

All right, if you want to assign all possible combinations, then you need 27 of them, not 3 (because 3 possible r/g/b values each).
So you need to have enough entries in your array:
struct rgb_val rgb[MAX * MAX * MAX];
Right now your code is accessing indexes 3, 4, ... 26 of an array that is only 3 elements long, hence UB.
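The sizing rule above can be sketched as a small helper. This is only an illustration: the function name fill_rgb and its capacity parameter are assumptions, not part of the original code. It refuses to write anything unless the buffer can hold all MAX * MAX * MAX combinations.

```c
#include <stddef.h>

#define MAX 3  /* values per channel, as in the question */

struct rgb_val {
    int r;
    int g;
    int b;
};

/*
 * Hypothetical helper: writes every (r, g, b) combination into out and
 * returns how many entries were written. Returns 0 without writing
 * anything if the buffer cannot hold all MAX * MAX * MAX combinations.
 */
size_t fill_rgb(struct rgb_val *out, size_t capacity)
{
    const size_t needed = (size_t)MAX * MAX * MAX;
    if (out == NULL || capacity < needed)
        return 0;

    size_t index = 0;
    for (int r = 0; r < MAX; r++) {
        for (int g = 0; g < MAX; g++) {
            for (int b = 0; b < MAX; b++) {
                out[index].r = r;
                out[index].g = g;
                out[index].b = b;
                index++;
            }
        }
    }
    return index;
}
```

A caller would declare struct rgb_val rgb[MAX * MAX * MAX]; and pass MAX * MAX * MAX as the capacity. For a much larger MAX, the array would likely have to be allocated with malloc() rather than placed on the stack.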

I believe you're accessing the array out of bounds because index is incremented beyond MAX.
You would be better off sizing the array for every combination:
struct rgb_val rgb[MAX * MAX * MAX]; // one entry per (r, g, b) combination
int index = 0;
for (int r = 0; r < MAX; r++) {
for (int g = 0; g < MAX; g++) {
for (int b = 0; b < MAX; b++) {
rgb[index].r = r;
rgb[index].g = g;
rgb[index].b = b;
index++;
}
}
}

Related

Can't figure out what's wrong with my code for a HackerRank problem in C

I'm sorry to ask for help with a HackerRank problem here; I know it's not really the right place, but nobody is answering me on HackerRank. Also, I'm new to C, so please don't be too rude.
Problem's description:
You are given n triangles, specifically, their sides a, b and c. Print them in the same style but sorted by their areas from the smallest one to the largest one. It is guaranteed that all the areas are different.
Link to the problem : https://www.hackerrank.com/challenges/small-triangles-large-triangles/problem
We can only edit the sort_by_area function.
First of all, I didn't calculate the triangles' area, I've just calculated the perimeter of each triangle, because the formula is simpler to read and to execute. Normally, that doesn't change anything for the result since a bigger perimeter means a bigger area. Tell me if I'm wrong.
The problem is that I get unexpected results: one line of my output contains numbers, and I have no idea where they come from. See:
Code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
typedef struct {
int a;
int b;
int c;
} triangle;
void sort_by_area(triangle *tr, int n) {
// Array for storing the perimeter.
int *size = malloc(100 * sizeof(*size));
// Adding perimeters in size array.
for (int i = 0; i < n; i++) {
size[i] = tr[i].a + tr[i].b + tr[i].c;
}
// Sort.
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
if (size[j] > size[j + 1]) {
// Sort in size array.
int temp = size[j];
size[j] = size[j + 1];
size[j + 1] = temp;
// Sort in tr array.
temp = tr[j].a;
tr[j].a = tr[j + 1].a;
tr[j + 1].a = temp;
temp = tr[j].b;
tr[j].b = tr[j + 1].b;
tr[j + 1].b = temp;
temp = tr[j].c;
tr[j].c = tr[j + 1].c;
tr[j + 1].c = temp;
}
}
}
}
int main() {
int n;
scanf("%d", &n);
triangle *tr = malloc(n * sizeof(triangle));
for (int i = 0; i < n; i++) {
scanf("%d%d%d", &tr[i].a, &tr[i].b, &tr[i].c);
}
sort_by_area(tr, n);
for (int i = 0; i < n; i++) {
printf("%d %d %d\n", tr[i].a, tr[i].b, tr[i].c);
}
return 0;
}
Input:
3
7 24 25
5 12 13
3 4 5
Output:
0 417 0 // Unexpected results on this line.
3 4 5
5 12 13
Expected output:
3 4 5
5 12 13
7 24 25
It seems that the error comes from the 7 24 25 triangle, but my code looks correct to me. Can you help me find out what's wrong? I really want to understand before moving on to another problem.
The assumption that a greater perimeter implies a greater area is incorrect. Why? Imagine an isosceles triangle with a base of 1000 units and a height of 1e-9 units. Its area is minuscule compared to that of an equilateral triangle with unit side length, yet the former has a huge perimeter (~2000 units) compared to the latter (3 units). That's just an (extreme) example to convey the flaw in your assumption.
I'd suggest you roll your own area function. The problem page even mentions Heron's formula. Since the result is only used for comparison, we don't need the exact area, just an indicative one. So something like:
double area(triangle const* tr) {
if(tr) {
double semiPerimeter = (tr->a + tr->b + tr->c)/2.0;
return semiPerimeter* (semiPerimeter - tr->a) * (semiPerimeter - tr->b) * (semiPerimeter - tr->c);
} else {
return 0;
}
}
Where we don't really need to calculate the square root since we just need to compare the areas across triangles and comparing the square of areas across triangles should be fine.
After this, it's just a matter of plugging this into whatever you did, after correcting the inner j loop to run only till n-1 (as the other answer has also explained)
void sort_by_area(triangle* tr, int n) {
/**
* Sort the array tr of length n by triangle area.
*/
double areaArr[n];
for (int i = 0; i < n; ++i) {
areaArr[i] = area(&tr[i]);
}
for (int i = 0; i < n; i++) {
for (int j = 0; j < n - 1; j++) {
if (areaArr[j] > areaArr[j + 1]) {
// Sort in area array.
double temp = areaArr[j]; // double, so the area is not truncated
areaArr[j] = areaArr[j + 1];
areaArr[j + 1] = temp;
// Sort in tr array.
triangle tmp = tr[j];
tr[j] = tr[j + 1];
tr[j + 1] = tmp;
}
}
}
}
You could directly use qsort too here since the problem doesn't prohibit using standard functions, something like:
int qsortCompare(void const* a, void const* b) {
triangle const* trA = a;
triangle const* trB = b;
if(trA && trB) {
double areaA = area(trA);
double areaB = area(trB);
return (areaA < areaB) ? -1 :
((areaA > areaB)? 1: 0);
}
return 0;
}
void sort_by_area(triangle* tr, int n) {
qsort(tr, n, sizeof(triangle), &qsortCompare);
}
Also, don't be restricted to add functions in the problem solution. The actual driver code only calls sort_by_area() but you can write other functions in the solution and call them from sort_by_area().
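The squared-area comparison described above can also be done in exact integer arithmetic, which avoids floating point entirely. The sketch below uses the identity 16 * area^2 = (a+b+c)(-a+b+c)(a-b+c)(a+b-c), which is Heron's formula multiplied out. The function names are made up for this illustration, and the two triangles are just an integer-sided counterexample to the "bigger perimeter means bigger area" assumption.

```c
typedef struct {
    int a;
    int b;
    int c;
} triangle;

/*
 * Hypothetical helper: returns 16 * area^2, computed exactly.
 * 16 * area^2 = (a+b+c)(-a+b+c)(a-b+c)(a+b-c)
 * The product fits in a long long for sides up to roughly 20000.
 */
long long area16_squared(const triangle *t)
{
    long long a = t->a, b = t->b, c = t->c;
    return (a + b + c) * (-a + b + c) * (a - b + c) * (a + b - c);
}

/* Perimeter, shown only to contrast it with the area measure. */
int perimeter(const triangle *t)
{
    return t->a + t->b + t->c;
}
```

For example, the triangle 50 50 99 has perimeter 199 and the triangle 40 40 40 has perimeter 120, yet area16_squared() gives 1950399 for the former and 7680000 for the latter, so the triangle with the larger perimeter has the smaller area.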
The inner loop should not run up to n, only up to n - 1:
for (int j = 0; j < n - 1; j++)
Otherwise, when j == n - 1, the accesses size[j + 1] and tr[j + 1] read index n, which is random junk past the end of the respective arrays.
Also, when swapping, you don't need to copy the structure members one-by-one. You can simply do:
// Sort in tr array.
triangle tmp = tr[j];
tr[j] = tr[j + 1];
tr[j + 1] = tmp;
Edit: As @CiaPan pointed out:
You have a memory leak. You need to call free() after you are done with using the malloc'd memory.
You are not allocating the right amount of memory. If you are passed more than 100 triangles, your code might behave weirdly or randomly crash.
int *size = malloc(n* sizeof(*size));
Full code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
typedef struct {
int a;
int b;
int c;
} triangle;
void sort_by_area(triangle *tr, int n) {
// Array for storing the perimeter.
int *size = malloc(n* sizeof(*size));
// Adding perimeters in size array.
for (int i = 0; i < n; i++) {
size[i] = tr[i].a + tr[i].b + tr[i].c;
}
// Sort.
for (int i = 0; i < n; i++) {
for (int j = 0; j < n - 1; j++) {
if (size[j] > size[j + 1]) {
// Sort in size array.
int temp = size[j];
size[j] = size[j + 1];
size[j + 1] = temp;
// Sort in tr array.
triangle tmp = tr[j];
tr[j] = tr[j + 1];
tr[j + 1] = tmp;
}
}
}
free(size); // release the temporary perimeter array
}
int main() {
int n;
scanf("%d", &n);
triangle *tr = malloc(n * sizeof(triangle));
for (int i = 0; i < n; i++) {
scanf("%d%d%d", &tr[i].a, &tr[i].b, &tr[i].c);
}
sort_by_area(tr, n);
for (int i = 0; i < n; i++) {
printf("%d %d %d\n", tr[i].a, tr[i].b, tr[i].c);
}
free(tr); // release the triangle array
return 0;
}

How to store multiple 2D array locations in C?

I'm trying to make a simple program, that generates random numbers into a 15x15 2D array and then works with the array later on.
This function finds the largest value in the array and then prints out the value and its location.
void maxValue(int array[ROWS][COLUMNS]){
int maxNum = array[0][0];
int rowMaxLocation;
int columnMaxLocation;
for(int c = 0; c < ROWS; c++) {
for(int d = 0; d < COLUMNS; d++) {
if(maxNum < array[c][d]) {
maxNum = array[c][d];
rowMaxLocation = c;
columnMaxLocation = d;
}
}
}
printf("Largest value in the array is %d, located in [%d][%d]", maxNum, rowMaxLocation, columnMaxLocation);
return;
}
What I can't figure out is how to store multiple locations of the same maximal value, e.g. "Largest value in the array is 99, located in [1][3] and [5][12]".
Thanks in advance.
You can follow a two-pass approach:
First pass: Find the maximum element in the array
Second pass: Find all indices where the array value is equal to the maximum element
To store multiple locations, you can keep arrays rowMaxLocations[] and columnMaxLocations[] and fill them as you scan the array during the second pass. If you don't need these values afterwards, you can skip storing them and just print the locations during the second pass.
I would recommend having a Location struct for storing location values like this:
typedef struct Location {
int row;
int column;
} Location;
Here is what this two-pass approach would look like:
void maxValue(int array[ROWS][COLUMNS]){
int maxNum = array[0][0];
// Array of locations to store max locations
Location maxLocations[ROWS * COLUMNS];
// Index of max locations array
int maxLocationIndex = 0;
// First pass; Get Max value
for(int c = 0; c < ROWS; c++) {
for(int d = 0; d < COLUMNS; d++) {
if(maxNum < array[c][d]) {
maxNum = array[c][d];
}
}
}
// Second pass; Get all values equal to max value
for (int c = 0; c < ROWS; c++) {
for (int d = 0; d < COLUMNS; d++) {
if (array[c][d] == maxNum) {
maxLocations[maxLocationIndex].row = c;
maxLocations[maxLocationIndex].column = d;
maxLocationIndex++;
}
}
}
// Print the max locations array
printf("Largest value in the array is %d\n", maxNum);
for (int i = 0; i < maxLocationIndex; i++) {
printf("[%d][%d] \n", maxLocations[i].row, maxLocations[i].column);
}
}
Update:
Thanks to a suggestion from @Ian Abbott, there is also a way to do all this in one pass. We just need to reset maxLocationIndex whenever we find a new max value:
for(int c = 0; c < ROWS; c++) {
for(int d = 0; d < COLUMNS; d++) {
if(maxNum < array[c][d]) {
maxNum = array[c][d];
// Reset Max locations index
maxLocationIndex = 0;
}
if (array[c][d] == maxNum) {
maxLocations[maxLocationIndex].row = c;
maxLocations[maxLocationIndex].column = d;
maxLocationIndex++;
}
}
}
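Put together, the one-pass variant could look roughly like the sketch below. The function name find_max_locations, its out and max_out parameters, and the small ROWS and COLUMNS values are assumptions chosen for illustration (the question uses a 15x15 array); returning the count lets the caller decide how to print the locations.

```c
#include <stddef.h>

#define ROWS 3     /* small illustrative size; the question uses 15 */
#define COLUMNS 4  /* small illustrative size; the question uses 15 */

typedef struct Location {
    int row;
    int column;
} Location;

/*
 * Hypothetical helper: stores every position of the maximum value in out
 * (which must have room for ROWS * COLUMNS entries), writes the maximum
 * to *max_out, and returns how many positions were stored.
 */
size_t find_max_locations(int array[ROWS][COLUMNS],
                          Location out[ROWS * COLUMNS], int *max_out)
{
    int maxNum = array[0][0];
    size_t count = 0;

    for (int c = 0; c < ROWS; c++) {
        for (int d = 0; d < COLUMNS; d++) {
            if (array[c][d] > maxNum) {
                maxNum = array[c][d];
                count = 0;  /* new maximum: forget earlier locations */
            }
            if (array[c][d] == maxNum) {
                out[count].row = c;
                out[count].column = d;
                count++;
            }
        }
    }
    *max_out = maxNum;
    return count;
}
```

The caller can then loop from 0 to the returned count and print each out[i].row and out[i].column.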

Speed up matrix-matrix multiplication using SSE vector instructions

I am having some trouble vectorizing some C code using SSE vector instructions. The code I have to vectorize is:
#define N 1000
void matrix_mul(int mat1[N][N], int mat2[N][N], int result[N][N])
{
int i, j, k;
for (i = 0; i < N; ++i)
{
for (j = 0; j < N; ++j)
{
for (k = 0; k < N; ++k)
{
result[i][k] += mat1[i][j] * mat2[j][k];
}
}
}
}
Here is what I got so far:
void matrix_mul_sse(int mat1[N][N], int mat2[N][N], int result[N][N])
{
int i, j, k; int* l;
__m128i v1, v2, v3;
v3 = _mm_setzero_si128();
for (i = 0; i < N; ++i)
{
for (j = 0; j < N; j += 4)
{
for (k = 0; k < N; k += 4)
{
v1 = _mm_set1_epi32(mat1[i][j]);
v2 = _mm_loadu_si128((__m128i*)&mat2[j][k]);
v3 = _mm_add_epi32(v3, _mm_mul_epi32(v1, v2));
_mm_storeu_si128((__m128i*)&result[i][k], v3);
v3 = _mm_setzero_si128();
}
}
}
}
After execution I get the wrong result. I know that the reason is the load from memory into v2. I loop through mat1 in row-major order, so I need to load mat2[0][0], mat2[1][0], mat2[2][0], mat2[3][0], ... but what is actually loaded is mat2[0][0], mat2[0][1], mat2[0][2], mat2[0][3], ... because mat2 is stored in memory in row-major order. I tried to fix this problem but without any improvement.
Can anyone help me, please?
Below is a fixed version of your implementation:
#include <smmintrin.h> // SSE4.1 header, needed for _mm_mullo_epi32
void matrix_mul_sse(int mat1[N][N], int mat2[N][N], int result[N][N])
{
int i, j, k;
__m128i v1, v2, v3, v4;
for (i = 0; i < N; ++i)
{
for (j = 0; j < N; ++j) // 'j' must be incremented by 1
{
// read mat1 here because it does not use 'k' index
v1 = _mm_set1_epi32(mat1[i][j]);
for (k = 0; k < N; k += 4)
{
v2 = _mm_loadu_si128((const __m128i*)&mat2[j][k]);
// read what's in the result array first as we will need to add it later to our calculations
v3 = _mm_loadu_si128((const __m128i*)&result[i][k]);
// use _mm_mullo_epi32 here instead _mm_mul_epi32 and add it to the previous result
v4 = _mm_add_epi32(v3, _mm_mullo_epi32(v1, v2));
// store the result
_mm_storeu_si128((__m128i*)&result[i][k], v4);
}
}
}
}
In short _mm_mullo_epi32 (requires SSE4.1) produces 4 x int32 results as opposed to _mm_mul_epi32 which does 2 x int64 results. If you cannot use SSE4.1 then have a look at the answer here for an alternative SSE2 solution.
Full description by Intel Intrinsic Guide:
_mm_mullo_epi32: Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store
the low 32 bits of the intermediate integers in dst.
_mm_mul_epi32: Multiply the low 32-bit integers from each packed 64-bit element in a and b, and store the
signed 64-bit results in dst.
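To make the difference between the two intrinsics concrete, here is a plain scalar model of what each one computes on four 32-bit lanes. It is not the intrinsics themselves: the function names model_mul_epi32 and model_mullo_epi32 are invented for this sketch, and vectors are represented as ordinary arrays so the code compiles without any SSE support.

```c
#include <stdint.h>

/*
 * Scalar model of _mm_mul_epi32: only the low 32-bit lane of each 64-bit
 * element (lanes 0 and 2) is multiplied, and each product is kept as a
 * full signed 64-bit value, so there are only two results.
 */
void model_mul_epi32(const int32_t a[4], const int32_t b[4], int64_t dst[2])
{
    dst[0] = (int64_t)a[0] * b[0];
    dst[1] = (int64_t)a[2] * b[2];
}

/*
 * Scalar model of _mm_mullo_epi32: all four lanes are multiplied and only
 * the low 32 bits of each product are kept. The multiplication is done on
 * unsigned values so that wrap-around is well defined in C.
 */
void model_mullo_epi32(const int32_t a[4], const int32_t b[4], int32_t dst[4])
{
    for (int i = 0; i < 4; i++) {
        uint32_t low = (uint32_t)a[i] * (uint32_t)b[i];
        dst[i] = (int32_t)low;
    }
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, the first model produces the two values 10 and 90 (lanes 1 and 3 are ignored), while the second produces 10, 40, 90, 160, which is what an element-wise int matrix product needs.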
I changed your code around a bit to make the addressing explicit (it helps in this case).
#define N 100
This is a stub for the vector unit's multiply-and-accumulate operation; you should be able to replace NV with whatever width your vector unit has, and put the relevant opcodes in here.
#define NV 8
int Vmacc(int *A, int *B) {
int i = 0;
int x = 0;
for (i = 0; i < NV; i++) {
x += *A++ * *B++;
}
return x;
}
This multiply has two notable variations from the norm:
1. It caches the columnar vector into a contiguous one.
2. It attempts to push slices of the multiply accumulate into a vector-like func.
Even without using the vector unit, this takes half the time of the naive version, just because of better cache/prefetch utilization.
void mm2(int *A, int *B, int n, int *C) {
int c, r;
int cache[N]; /* holds one column of B; requires n <= N */
for (c = 0; c < n; c++) {
/* cache column c: */
for (r = 0; r < n; r++) {
cache[r] = B[c + r*n];
}
for (r = 0; r < n; r++) {
int k = 0;
int x = 0;
int *Av = A + r*n;
for (k = 0; k+NV-1 < n; k += NV) {
x += Vmacc(Av+k, cache+k);
}
while (k < n) {
x += Av[k] * cache[k];
k++;
}
C[r*n + c] = x;
}
}
}

How to make strings stick together while radix sorting?

I have to write a program that sorts strings (each exactly 7 chars long) using radix sort. I already wrote a function that sorts each column separately. My problem is how to make the whole string move, not just one char. I'm struggling to see how this should work in C.
I declared two arrays, "char strings[3][8]" and "char output[3][8]", to sort 3 strings of exactly 7 chars each. For example, sorting these strings:
strcpy(strings[0], "kupbars");
strcpy(strings[1], "daparba");
strcpy(strings[2], "jykaxaw");
In output I get:
dakaaaa
juparbs
kypbxrw
Each column is sorted correctly but chars don't stick together. I tried many ways for 3 hours but nothing works.
My code looks like this:
void countingSort(char a[][8], char b[][8]) {
int c[123];
for (int pos = 6; pos >= 0; pos--) {
for (int i = 0; i < 123; i++)
c[i] = 0;
for (int i = 0; i < 3; i++)
c[(int)a[i][pos]]++;
for (int i = 1; i < 123; i++)
c[i] += c[i - 1];
for (int i = 2; i >= 0; i--) {
b[--c[(int)a[i][pos]]][pos] = a[i][pos];
}
}
}
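(The string count and length are hard-coded constants because they are easy to turn into variables later; I just focused on getting this program to work properly.)
Try changing the loop to move an entire string (for a complete radix sort, b must also be copied back into a at the end of each pass, so that the next column sees the order produced by the previous one):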
for (int i = 2; i >= 0; i--) {
int k = --c[(int)a[i][pos]];
for(int j = 0; j < 8; j++) {
b[k][j] = a[i][j];
}
}
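As a fuller illustration of moving whole strings, here is a sketch of a complete least-significant-digit radix sort for fixed-length strings. It is one possible arrangement, not the asker's final code: the name radix_sort_strings, the count parameter and the STR_LEN / STR_BUF macros are assumptions. Each pass is a stable counting sort on one column, and the scratch array is copied back so the next pass works on the reordered strings.

```c
#include <string.h>

#define STR_LEN 7              /* characters per string, as in the question */
#define STR_BUF (STR_LEN + 1)  /* room for the terminating '\0' */

/*
 * Hypothetical helper: sorts count strings of exactly STR_LEN characters.
 * a holds the input and receives the sorted result; b is scratch space
 * of the same size.
 */
void radix_sort_strings(char a[][STR_BUF], char b[][STR_BUF], int count)
{
    for (int pos = STR_LEN - 1; pos >= 0; pos--) {
        int c[256] = {0};

        /* Count how often each character occurs in this column. */
        for (int i = 0; i < count; i++)
            c[(unsigned char)a[i][pos]]++;

        /* Turn the counts into end positions. */
        for (int i = 1; i < 256; i++)
            c[i] += c[i - 1];

        /* Walk backwards to keep the pass stable; move whole strings. */
        for (int i = count - 1; i >= 0; i--) {
            int k = --c[(unsigned char)a[i][pos]];
            memcpy(b[k], a[i], STR_BUF);
        }

        /* The next pass must see the order produced by this one. */
        memcpy(a, b, (size_t)count * STR_BUF);
    }
}
```

Indexing the count array through unsigned char avoids negative indexes on platforms where char is signed.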
You could use a circular list, but that adds some overhead. I suggest using memmove() instead.
#include <string.h>
void array_move_forward(char array[3][8]) {
for (int i = 0; i < 3; i++) {
char tmp = array[i][6];
memmove(array[i] + 1, array[i], 6);
array[i][0] = tmp;
}
}
void array_move_rewind(char array[3][8]) {
for (int i = 0; i < 3; i++) {
char tmp = array[i][0];
memmove(array[i], array[i] + 1, 6);
array[i][6] = tmp;
}
}
Another solution would be to manipulate the string yourself, using an index that indicates the first letter of your string.
{
char str[7];
int i = 0;
...
int j = i;
for (int k = 0; k < 7; k++) {
char tmp = str[j++ % 7];
}
}
With that you could rotate your string just with i++ or i--.
struct my_string_radix {
char str[7];
int begin;
};

malloc 3d memory allocation size [duplicate]

I have a pointer variable int ***a in C. I'm passing it to a function as &a, i.e. by reference. In the function I receive a pointer variable of type int ****a.
I'm allocating memory like this.
*a=(int***)malloc(no1*sizeof(int**));
some loop from 0 to no1
(*a)[++l]=(int**)malloc((no1+1)*sizeof(int*));
some loop from 0 to no1
(*a)[l][h]=(int*)malloc(2*sizeof(int));
This is the only place where I allocate memory. The actual program is not given; there is no error here.
But when I'm going to do this:
(*a)[l][h][0]=no1;
It's giving me a "Segmentation Fault" error and I can't understand why.
UPDATE:
I have written a sample program that only allocates the memory. It also gives a "segmentation fault" error.
#include<stdio.h>
#include<malloc.h>
#include<stdlib.h>
void allocate(int ****a)
{
int i,j,k;
if(((*a)=(int***)malloc(5*sizeof(int**)))==NULL)
{
printf("\nError in allocation of double pointer array\n");
exit(0);
}
for(i=0;i<5;i++)if(((*a)[i]=(int**)malloc(4*sizeof(int*)))==NULL)
{
printf("\nError in allocation of single pointer array on index [%d]\n",i);
exit(0);
}
for(i=0;i<5;i++)
for(j=0;j<4;i++)
if(((*a)[i][j]=(int*)malloc(3*sizeof(int)))==NULL)
{
printf("\nError in allocation of array on index [%d][%d]\n",i,j);
exit(0);
}
for(i=0;i<5;i++)
for(j=0;j<4;i++)
for(k=0;k<3;k++)
(*a)[i][j][k]=k;
}
main()
{
int ***a;
int i,j,k;
allocate(&a);
for(i=0;i<5;i++)
for(j=0;j<4;i++)
for(k=0;k<3;k++)
printf("\na[%d][%d][%d] = %d ",i,j,k,a[i][j][k]);
}
Revised code from question
Your code has:
for(i=0;i<5;i++)
for(j=0;j<4;i++)
several times. The second loop should be incrementing j, not i. Be very careful with copy'n'paste.
This code does not crash (but does leak).
#include <stdio.h>
#include <stdlib.h>
void allocate(int ****a);
void allocate(int ****a)
{
int i,j,k;
printf("allocate: 1B\n");
if(((*a)=(int***)malloc(5*sizeof(int**)))==NULL)
{
printf("\nError in allocation of double pointer array\n");
exit(0);
}
printf("allocate: 1A\n");
printf("allocate: 2B\n");
for(i=0;i<5;i++)
if(((*a)[i]=(int**)malloc(4*sizeof(int*)))==NULL)
{
printf("\nError in allocation of single pointer array on index [%d]\n",i);
exit(0);
}
printf("allocate: 2A\n");
printf("allocate: 3B\n");
for(i=0;i<5;i++)
for(j=0;j<4;j++)
if(((*a)[i][j]=(int*)malloc(3*sizeof(int)))==NULL)
{
printf("\nError in allocation of array on index [%d][%d]\n",i,j);
exit(0);
}
printf("allocate: 3A\n");
printf("allocate: 4B\n");
for(i=0;i<5;i++)
for(j=0;j<4;j++)
for(k=0;k<3;k++)
(*a)[i][j][k]=k;
printf("allocate: 4A\n");
}
int main(void)
{
int ***a;
int i,j,k;
allocate(&a);
for(i=0;i<5;i++)
for(j=0;j<4;j++)
for(k=0;k<3;k++)
printf("a[%d][%d][%d] = %d\n",i,j,k,a[i][j][k]);
}
Previous answers
Since you've not shown us most of the code, it is hard to predict how you're mishandling it, but equally, since you are getting a core dump, you must be mishandling something.
Here is some working code — not checked with valgrind since that is not available for Mac OS X 10.8 — that seems to work. The error recovery for allocation failure is not complete, and the function to destroy the fully allocated array is also missing.
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
static int ***allocate_3d_array(int no1, int ****a)
{
*a = (int***)malloc(no1 * sizeof(int**));
if (*a == 0)
return 0;
for (int l = 0; l < no1; l++)
{
if (((*a)[l]=(int**)malloc((no1+1)*sizeof(int*))) == 0)
{
while (l > 0)
free((*a)[--l]);
return 0;
}
}
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
{
if (((*a)[l][h]=(int*)malloc(2*sizeof(int))) == 0)
{
/* Leak! */
return 0;
}
}
}
for (int l = 0; l < no1; l++)
for (int h = 0; h < no1; h++)
for (int k = 0; k < 2; k++)
(*a)[l][h][k] = 10000 * l + 100 * h + k;
return *a;
}
int main(void)
{
int no1 = 5;
int ***a = 0;
int ***b = allocate_3d_array(no1, &a);
const char *pad[] = { " ", "\n" };
assert(b == a);
if (a != 0)
{
for (int l = 0; l < no1; l++)
for (int h = 0; h < no1; h++)
for (int k = 0; k < 2; k++)
printf("a[%d][%d][%d] = %.6d%s", l, h, k, a[l][h][k], pad[k]);
// free memory - added by harpun; reformatted by Jonathan Leffler
// Would be a function normally — see version 2 code.
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
free(a[l][h]);
free(a[l]);
}
free(a);
}
return 0;
}
Sample output:
a[0][0][0] = 000000 a[0][0][1] = 000001
a[0][1][0] = 000100 a[0][1][1] = 000101
a[0][2][0] = 000200 a[0][2][1] = 000201
a[0][3][0] = 000300 a[0][3][1] = 000301
a[0][4][0] = 000400 a[0][4][1] = 000401
a[1][0][0] = 010000 a[1][0][1] = 010001
a[1][1][0] = 010100 a[1][1][1] = 010101
a[1][2][0] = 010200 a[1][2][1] = 010201
a[1][3][0] = 010300 a[1][3][1] = 010301
a[1][4][0] = 010400 a[1][4][1] = 010401
a[2][0][0] = 020000 a[2][0][1] = 020001
a[2][1][0] = 020100 a[2][1][1] = 020101
a[2][2][0] = 020200 a[2][2][1] = 020201
a[2][3][0] = 020300 a[2][3][1] = 020301
a[2][4][0] = 020400 a[2][4][1] = 020401
a[3][0][0] = 030000 a[3][0][1] = 030001
a[3][1][0] = 030100 a[3][1][1] = 030101
a[3][2][0] = 030200 a[3][2][1] = 030201
a[3][3][0] = 030300 a[3][3][1] = 030301
a[3][4][0] = 030400 a[3][4][1] = 030401
a[4][0][0] = 040000 a[4][0][1] = 040001
a[4][1][0] = 040100 a[4][1][1] = 040101
a[4][2][0] = 040200 a[4][2][1] = 040201
a[4][3][0] = 040300 a[4][3][1] = 040301
a[4][4][0] = 040400 a[4][4][1] = 040401
Compare this with what you've got. You could add many more diagnostic print messages. If this doesn't help sufficiently, create an SSCCE (Short, Self-Contained, Correct Example) analogous to this that demonstrates the problem in your code without any extraneous material.
Version 2 of the code
This is a somewhat more complex version of the code that simulates memory allocation failures after N allocations, with a test harness that runs it with every value of N from 0 up to 32 (the array itself needs only 31 allocations: 1 + 5 + 25). It also includes code to release the array (similar to, but different from, the code that was edited into my answer by harpun). The interaction at the end with the line containing the PID means that I can check memory usage with ps in another terminal window. (Otherwise, I don't like programs that do that sort of thing; I suppose I should run the ps from my program via system(), but I'm feeling lazy.)
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
static int fail_after = 0;
static int num_allocs = 0;
static void *xmalloc(size_t size)
{
if (fail_after > 0 && num_allocs++ >= fail_after)
{
fputs("Out of memory\n", stdout);
return 0;
}
return malloc(size);
}
static int ***allocate_3d_array(int no1, int ****a)
{
*a = (int***)xmalloc(no1 * sizeof(int**));
if (*a == 0)
return 0;
for (int l = 0; l < no1; l++)
{
if (((*a)[l]=(int**)xmalloc((no1+1)*sizeof(int*))) == 0)
{
for (int l1 = 0; l1 < l; l1++)
free((*a)[l1]);
free(*a);
*a = 0;
return 0;
}
}
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
{
if (((*a)[l][h]=(int*)xmalloc(2*sizeof(int))) == 0)
{
/* Release prior items in current row */
for (int h1 = 0; h1 < h; h1++)
free((*a)[l][h1]);
free((*a)[l]);
/* Release items in prior rows */
for (int l1 = 0; l1 < l; l1++)
{
for (int h1 = 0; h1 < no1; h1++)
free((*a)[l1][h1]);
free((*a)[l1]);
}
free(*a);
*a = 0;
return 0;
}
}
}
for (int l = 0; l < no1; l++)
for (int h = 0; h < no1; h++)
for (int k = 0; k < 2; k++)
(*a)[l][h][k] = 10000 * l + 100 * h + k;
return *a;
}
static void destroy_3d_array(int no1, int ***a)
{
if (a != 0)
{
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
free(a[l][h]);
free(a[l]);
}
free(a);
}
}
static void test_allocation(int no1)
{
int ***a = 0;
int ***b = allocate_3d_array(no1, &a);
const char *pad[] = { " ", "\n" };
assert(b == a);
if (a != 0)
{
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
{
for (int k = 0; k < 2; k++)
{
if (a[l][h][k] != l * 10000 + h * 100 + k)
printf("a[%d][%d][%d] = %.6d%s", l, h, k, a[l][h][k], pad[k]);
}
}
}
}
destroy_3d_array(no1, a);
}
int main(void)
{
int no1 = 5;
for (fail_after = 0; fail_after < 33; fail_after++)
{
printf("Fail after: %d\n", fail_after);
num_allocs = 0;
test_allocation(no1);
}
printf("PID %d - waiting for some data to exit:", (int)getpid());
fflush(0);
getchar();
return 0;
}
Note how painful the memory recovery is. As before, not tested with valgrind, but I take reassurance from harpun's test on the previous version.
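One common way to reduce that pain, sketched here under assumptions rather than taken from the answer's code, is to set every pointer slot to NULL before filling it. A single destroy function can then release a partially built array, because free(NULL) is defined to do nothing. The names create_3d and destroy_3d are invented for this sketch, and the shape (n x n x 2) mirrors the examples above.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: frees an n x n x 2 array, including one that was
 * only partially built. Unfilled slots are NULL and free(NULL) is a no-op.
 */
void destroy_3d(int ***a, int n)
{
    if (a == NULL)
        return;
    for (int l = 0; l < n; l++) {
        if (a[l] != NULL) {
            for (int h = 0; h < n; h++)
                free(a[l][h]);
            free(a[l]);
        }
    }
    free(a);
}

/* Hypothetical helper: returns NULL (with nothing leaked) on failure. */
int ***create_3d(int n)
{
    int ***a = malloc((size_t)n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (int l = 0; l < n; l++)
        a[l] = NULL;              /* mark every row as not yet allocated */

    for (int l = 0; l < n; l++) {
        a[l] = malloc((size_t)n * sizeof *a[l]);
        if (a[l] == NULL) {
            destroy_3d(a, n);
            return NULL;
        }
        for (int h = 0; h < n; h++)
            a[l][h] = NULL;       /* mark every leaf as not yet allocated */

        for (int h = 0; h < n; h++) {
            a[l][h] = malloc(2 * sizeof *a[l][h]);
            if (a[l][h] == NULL) {
                destroy_3d(a, n);
                return NULL;
            }
        }
    }
    return a;
}
```

The error paths collapse to a single call, at the cost of two extra initialisation loops.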
Version 3 — Clean bill of health from valgrind
This code is very similar to the test in version 2. It fixes a memory leak in the clean-up when a memory allocation fails in the leaf level allocations. The program no longer prompts for inputs (much preferable); it takes an optional single argument that is the number of allocations to fail after. Testing with valgrind showed that with an argument 0-6, there were no leaks, but with argument 7 there was a leak. It didn't take long to spot the problem and fix it. (It's easier when the machine running valgrind is available — it was powered down over the long weekend for general site electrical supply upgrade.)
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
static int fail_after = 0;
static int num_allocs = 0;
static void *xmalloc(size_t size)
{
if (fail_after > 0 && num_allocs++ >= fail_after)
{
fputs("Out of memory\n", stdout);
return 0;
}
return malloc(size);
}
static int ***allocate_3d_array(int no1, int ****a)
{
*a = (int***)xmalloc(no1 * sizeof(int**));
if (*a == 0)
return 0;
for (int l = 0; l < no1; l++)
{
if (((*a)[l]=(int**)xmalloc((no1+1)*sizeof(int*))) == 0)
{
for (int l1 = 0; l1 < l; l1++)
free((*a)[l1]);
free(*a);
*a = 0;
return 0;
}
}
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
{
if (((*a)[l][h]=(int*)xmalloc(2*sizeof(int))) == 0)
{
/* Release prior items in current (partial) row */
for (int h1 = 0; h1 < h; h1++)
free((*a)[l][h1]);
/* Release items in prior (complete) rows */
for (int l1 = 0; l1 < l; l1++)
{
for (int h1 = 0; h1 < no1; h1++)
free((*a)[l1][h1]);
}
/* Release entries in first (complete) level of array */
for (int l1 = 0; l1 < no1; l1++)
free((*a)[l1]);
free(*a);
*a = 0;
return 0;
}
}
}
for (int l = 0; l < no1; l++)
for (int h = 0; h < no1; h++)
for (int k = 0; k < 2; k++)
(*a)[l][h][k] = 10000 * l + 100 * h + k;
return *a;
}
static void destroy_3d_array(int no1, int ***a)
{
if (a != 0)
{
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
free(a[l][h]);
free(a[l]);
}
free(a);
}
}
static void test_allocation(int no1)
{
int ***a = 0;
int ***b = allocate_3d_array(no1, &a);
const char *pad[] = { " ", "\n" };
assert(b == a);
if (a != 0)
{
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
{
for (int k = 0; k < 2; k++)
{
if (a[l][h][k] != l * 10000 + h * 100 + k)
printf("a[%d][%d][%d] = %.6d%s", l, h, k, a[l][h][k], pad[k]);
}
}
}
}
destroy_3d_array(no1, a);
}
int main(int argc, char **argv)
{
int no1 = 5;
int fail_limit = 33;
if (argc == 2)
fail_limit = atoi(argv[1]);
for (fail_after = 0; fail_after < fail_limit; fail_after++)
{
printf("Fail after: %d\n", fail_after);
num_allocs = 0;
test_allocation(no1);
}
return 0;
}
Version 4 — Fewer memory allocations
Update 2014-12-20
The code above makes a lot of memory allocations, which complicates the release and error recovery. Here is an alternative version that makes just 3 allocations, one for the vector of pointers to pointers, one for the array of pointers, and one for the array of integers. It then sets the pointers to point to the correct places in memory.
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
static int fail_after = 0;
static int num_allocs = 0;
static void *xmalloc(size_t size)
{
if (fail_after > 0 && num_allocs++ >= fail_after)
{
fputs("Out of memory\n", stdout);
return 0;
}
return malloc(size);
}
static int ***allocate_3d_array(int no1, int ****a)
{
int ***d0 = (int***)xmalloc(no1 * sizeof(int**));
int **d1 = (int **)xmalloc(no1 * no1 * sizeof(int *));
int *d2 = (int *)xmalloc(no1 * no1 * 2 * sizeof(int));
if (d0 == 0 || d1 == 0 || d2 == 0)
{
free(d0);
free(d1);
free(d2);
*a = 0;
return 0;
}
for (int l = 0; l < no1; l++)
{
d0[l] = &d1[l * no1];
for (int h = 0; h < no1; h++)
{
d0[l][h] = &d2[(l * no1 + h) * 2];
for (int k = 0; k < 2; k++)
d0[l][h][k] = l * 10000 + h * 100 + k;
}
}
*a = d0;
return *a;
}
static void destroy_3d_array(int ***a)
{
if (a != 0)
{
free(a[0][0]);
free(a[0]);
free(a);
}
}
static void test_allocation(int no1)
{
int ***a = 0;
int ***b = allocate_3d_array(no1, &a);
const char *pad[] = { " ", "\n" };
assert(b == a);
if (a != 0)
{
for (int l = 0; l < no1; l++)
{
for (int h = 0; h < no1; h++)
{
for (int k = 0; k < 2; k++)
{
if (a[l][h][k] != l * 10000 + h * 100 + k)
printf("Oops: a[%d][%d][%d] = %.6d%s", l, h, k, a[l][h][k], pad[k]);
}
}
}
}
destroy_3d_array(a);
}
int main(int argc, char **argv)
{
int no1 = 5;
int fail_limit = 4;
if (argc == 2)
fail_limit = atoi(argv[1]);
for (fail_after = 0; fail_after < fail_limit; fail_after++)
{
printf("Fail after: %d\n", fail_after);
num_allocs = 0;
test_allocation(no1);
}
return 0;
}
This has a clean bill of health with GCC 4.9.1 on Mac OS X 10.10.1, checked with valgrind version valgrind-3.11.0.SVN (built from an SVN tree with some necessary fixes for Mac OS X, but not enough suppressions).
The diagnostic print (starting with 'Oops') was triggered while I developed the answer; I had my pointer calculations wrong at the time.
Sorry, but, to be blunt: this is a horrid way of handling a 3D array: a double-nested loop with a bucketload of calls to malloc(), then triple-indirection to get a value at runtime. Yeuch! :o)
The conventional way of doing this (in the HPC community) is to use a one-dimensional array and do the index computation yourself. Suppose index i iterates over nx planes in the x direction, j iterates over ny pencils in the y direction, and k iterates over nz cells in the z direction. Then a pencil has nz elements, a plane has nz*ny elements, and the whole “brick” has nz*ny*nx elements. Thus, you can iterate over the whole structure with:
for(i=0; i<nx; i++) {
for(j=0; j<ny; j++) {
for(k=0; k<nz; k++) {
printf("a(%d,%d,%d) = %d\n", i, j, k, a[(i*ny+j)*nz+k]);
}
}
}
The advantage of this construction is that you can allocate it with a single call to malloc(), rather than a boatload of nested calls:
int *a;
a = malloc(nx*ny*nz*sizeof(int));
The construction x=a[i][j][k] has three levels of indirection: you have to fetch an address from memory, a, add an offset, i, fetch that address from memory, a[i], add an offset, j, fetch that address from memory, a[i][j], add an offset, k, and (finally) fetch the data, a[i][j][k]. All those intermediate pointers are wasting cache-lines and TLB entries.
The construction x=a[(i*ny+j)*nz+k] has one level of indirection at the expense of two additional integer multiplications: compute the offset, fetch address, 'a', from memory, compute and add the offset, (i*ny+j)*nz+k, fetch the data.
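The flat layout can be wrapped in a couple of small helpers so the index arithmetic lives in one place. This is a minimal sketch; the names idx3 and grid_alloc are assumptions made for illustration, and the allocation does not guard against nx * ny * nz overflowing size_t.

```c
#include <stdlib.h>

/* Hypothetical helper: flat offset of cell (i, j, k) in an nx * ny * nz brick. */
size_t idx3(size_t i, size_t j, size_t k, size_t ny, size_t nz)
{
    return (i * ny + j) * nz + k;
}

/*
 * Hypothetical helper: allocates the whole brick with one call and
 * zero-fills it. Returns NULL on failure; release with a single free().
 */
int *grid_alloc(size_t nx, size_t ny, size_t nz)
{
    return calloc(nx * ny * nz, sizeof(int));
}
```

For a 5 x 4 x 3 brick, the last cell (4, 3, 2) maps to offset (4*4 + 3)*3 + 2 = 59, the final element of the 60-element array, and the whole structure is released with a single free(a).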
Furthermore, there is essentially no way whatsoever of improving the triple-indirection method's performance based on data-access patterns. If we were actually visiting every cell, we could do something like this to avoid some of the overhead of index computation.
ij = 0;
for(i=0; i<nx; i++) {
ii=i*ny;
for(j=0; j<ny; j++) {
ij=(ii+j)*nz;
for(k=0; k<nz; k++) {
printf("a(%d,%d,%d) = %d\n", i, j, k, a[ij+k]);
}
}
}
Depending on what you're doing, this may not be great either, and there are alternative layouts and indexing methods (such as Morton or Ahnenteufel indexing) that may be more suitable, depending on your access patterns. I'm not trying to give a complete treatise on 3D Cartesian grid representation or indexing, merely illustrate that a “three star” solution is very bad for numerous reasons.
Inside the function, where a is an int ****, (*a)[l][h][0] is the correct way to reach an element.
In the caller, where a is an int ***, use a[l][h][0] directly to assign a value to it.

Resources