Finding 2D array cell memory location - c

There is an array a[3][4] and we have to find the address of a[2][2] using row major order and 1001 as base address. I found 2 formulas do find the address:
For any array a[M][N] Row Major Order:
1) a[i][j] = Base Address+Datatype Size[(N*i)+j]
2) a[i][j] = Base Address+Datatype Size[N*(i-1)+(j-1)]
I tried both the formulas but the first one yielded the correct result but not the second one. Here is how I found the address of a[2][2] using row major order:
Using Formula 1:
a[2][2] = 1001+2[(4*2)+2]
= 1001+2[10]
= 1001+20
= 1021 (correct answer)
Using Formula 2:
a[2][2] = 1001+2[4*(2-1)+(2-1)]
= 1001+2[4+1]
= 1001+2[5]
= 1001+10
= 1011 (Wrong Answer)
Is there any error in my 2nd formula or have I done something wrong in the calculation using 2nd formula? Why aren't both the answers coming same?

as amit said, the equations are different, so they are not going to give the same result. when ever in doubt about a problem like this, try putting some numbers in the equations and try them out, so if we are trying for an array of size 4, at position 2,2
equation 1: (N * i) + j, (4 * 2) + 2 = 10
equation 2: N * (I - 1) + (j - 1) = 9
as you can see as they are not the same result, the problem is not in your code, but in the equation itself
if you are feeling brave, you can also try and prove it by induction as well
To find the memory address number, you will also need to know how much space an integer is occupying in memory as well. You can do this with sizeof(int);which will output how many bytes the integer is using on the system.
you will also need to know how an array of arrays is formatted in memory. Just like a normal array, and array of arrays is contiguous, meaning that there are no gaps in memory between the array elements. So an array a[2][2] {1,2}, {3,4} would be formatted like this. 1, 2, 3, 4.
using this you should be able to find the memory location with the following equation address location of array a[I][j] = B + W * [N * (I- Lr) + (J - Lc)] where:
B is the base address
I is the row subscript of the element you are looking for
J is the column element you are looking for
W is the size of an element
Lr is the lower limit of a row - 0 if not given
Lc is the lower limit of a column - 0 if not given
M is the number of rows of the matrix
N is the number of columns of the matrix

Related

Why is the Index NOT out of bounds although it intuitively should?

I'm relatively new to C programming and I stumbled upon a for me unexplainable behaviour while running the following code and debugging it using gdb and lldb.
In short: When swapping the indices i and j (max i != max j) when accessing a value in a two-dimensional Array inside a double nested for-loop it does not seem to matter if I access the value using array[i][j] or array[j][i].
The two loops and arrays are mostly identical.
unsigned matrix[3][1] =
{
{3},
{4},
{5}
};
//Loop1
for (size_t i = 0; i < sizeof(matrix) / sizeof(*matrix); i++)
{
for (size_t j = 0; j < sizeof(matrix[i]) / sizeof(*matrix[i]); j++)
{
matrix[i][j] <<= 1;
printf("matrix[%zu][%zu] has the value: %d\n", i, j, matrix[i][j]);
}
}
//same two dimensional array as matrix
unsigned matrix2[3][1] =
{
{3},
{4},
{5}
};
//Loop2, basically the same loop as Loop1
for (size_t i = 0; i < sizeof(matrix2) / sizeof(*matrix2); i++)
{
for (size_t j = 0; j < sizeof(matrix2[i]) / sizeof(*matrix2[i]); j++)
{
//swapped i and j here
matrix2[j][i] <<= 1;
printf("matrix2[%zu][%zu] has the value: %d\n", j, i, matrix2[j][i]);
}
}
Am I missing here something?
In both cases i is passed the value 2 at the end of the outer loop and j the value 0 at the end of the inner loop.
Intuitively, matrix[0][2] should throw an exception as each row only has one element.
I will take a slightly different approach than the other respondents.
You are technically not reading outside of the array's boundary as far as the memory layout is concerned. Looking at it from a human perspective you are (the index [0][2] doesn't exist!), but the memory layout of the array is contiguous. Each of the "rows" of the matrix are stored next to each other.
In memory, your array is stored as: | ? | 3 | 4 | 5 | ? |
So when you index to matrix[1][0] or matrix [0][1] you are accessing the same position in memory. This would not be the case if your array was larger than 1 dimension wide.
For example, replace your array with the following one and experiment. You can access integer '4' either by indexing matrix[0][2], or matrix [1][0]. The position [0][2] shouldn't exist, but it does because the memory is contiguous.
unsigned matrix[3][2] =
{
{3, 6},
{4, 8},
{5, 10}
};
Oops, matrix[0][2] should throw an exception as each row only has one element...
Some languages do warn the programmer by an exception if they try an out of bound access, but C does not. It just invokes Undefined Behaviour. On a technical point of view, it means that the compiler does not have to test the out of bound condition. On an operational point of view, it means that anything can happen, including expected behaviour... or an immediate crash... or a modification of an unrelated variable... or...
If my C skills aren't mega-rusty you're reading "unsafe memory".
Essentially your matrix is declared as a block of bytes. After that block of bytes there are more bytes. What are they? Usually more variables that are declared as your program's data. Once you reach the end of the program's data block you reach the user code memory block (encoded ASM instructions).
Most languages perform checks and throw an exception when you run out of bounds by somehow keeping track of the last index that is valid to access. C does not do that and doing such thing is your very own responsibility. If you aren't careful you might be overwriting important parts of your program's code.
There are attacks that one can perform on C programs that don't sanitize user input, like a buffer overrun; which exploits what it's been described.
Essentially if you declare a char[] of length N and store a string that comes from outside and this string happens to be of length N+X you'll be overwriting program memory (instructions).
With the right sequence of characters you can inject your very own assembly code into a running program which doesn't sanitize user input
As your array is int and all elements are of the same size, i don't see any problem as your array is stored in contiguous space in RAM and you use a special case of matrix where inverting indexes has no side effect.
In the first loop your indexes are [0][0], [1][0], [2][0]
In the second loop your indexes are [0][0], [0][1], [0][2]
now try to linear the access, as your array is saved as linear array into the RAM.
address of element = row * NCOL + col
row: is row number
NCOL: number of columns into your matrix
col : the column number
so the linear index for :
[0][2] ==> 0 * 1 + 2 = 2 /* the third element*/
[2][0] ==> 2 * 1 + 0 = 2 /* always the third element */
But if you use a matrix of n x m , n >= 1 and m > 1 and n != m.
if you inverse the indexes, the result will not be the same.
so if you take a 4 x 2 matrix
linear index of [3][1] = 3 * 2 + 1 = 7
linear index of [1][3] = 1 * 2 + 3 = 5 /* even [1][3] is out of the range of your matrix index */
[1][3] you will manipulate the element [2][1]
So be worry when manipulating matrix indexes.

Seg faulting with 4D arrays & initializing dynamic arrays

I ran into a big of a problem with a tetris program I'm writing currently in C.
I am trying to use a 4D multi-dimensional array e.g.
uint8_t shape[7][4][4][4]
but I keep getting seg faults when I try that, I've read around and it seems to be that I'm using up all the stack memory with this kind of array (all I'm doing is filling the array with 0s and 1s to depict a shape so I'm not inputting a ridiculously high number or something).
Here is a version of it (on pastebin because as you can imagine its very ugly and long).
If I make the array smaller it seems to work but I'm trying to avoid a way around it as theoretically each "shape" represents a rotation as well.
https://pastebin.com/57JVMN20
I've read that you should use dynamic arrays so they end up on the heap but then I run into the issue how someone would initialize a dynamic array in such a way as linked above. It seems like it would be a headache as I would have to go through loops and specifically handle each shape?
I would also be grateful for anybody to let me pick their brain on dynamic arrays how best to go about them and if it's even worth doing normal arrays at all.
Even though I have not understood why do you use 4D arrays to store shapes for a tetris game, and I agree with bolov's comment that such an array should not overflow the stack (7*4*4*4*1 = 448 bytes), so you should probably check other code you wrote.
Now, to your question on how to manage 4D (N-Dimensional)dynamically sized arrays. You can do this in two ways:
The first way consists in creating an array of (N-1)-Dimensional arrays. If N = 2 (a table) you end up with a "linearized" version of the table (a normal array) which dimension is equal to R * C where R is the number of rows and C the number of columns. Inductively speaking, you can do the very same thing for N-Dimensional arrays without too much effort. This method has some drawbacks though:
You need to know beforehand all the dimensions except one (the "latest") and all the dimensions are fixed. Back to the N = 2 example: if you use this method on a table of C columns and R rows, you can change the number of rows by allocating C * sizeof(<your_array_type>) more bytes at the end of the preallocated space, but not the number of columns (not without rebuilding the entire linearized array). Moreover, different rows must have the same number of columns C (you cannot have a 2D array that looks like a triangle when drawn on paper, just to get things clear).
You need to carefully manage the indicies: you cannot simply write my_array[row][column], instead you must access that array with my_array[row*C + column]. If N is not 2, then this formula gets... interesting
You can use N-1 arrays of pointers. That's my favourite solution because it does not have any of the drawbacks from the previous solution, although you need to manage pointers to pointers to pointers to .... to pointers to a type (but that's what you do when you access my_array[7][4][4][4].
Solution 1
Let's say you want to build an N-Dimensional array in C using the first solution.
You know the length of each dimension of the array up to the (N-1)-th (let's call them d_1, d_2, ..., d_(N-1)). We can build this inductively:
We know how to build a dynamic 1-dimensional array
Supposing we know how to build a (N-1)-dimensional array, we show that we can build a N-Dimensional array by putting each (N-1)-dimensional array we have available in a 1-Dimensional array, thus increasing the available dimensions by 1.
Let's also assume that the data type that the arrays must hold is called T.
Let's suppose we want to create an array with R (N-1)-dimensional arrays inside it. For that we need to know the size of each (N-1)-dimensional array, so we need to calculate it.
For N = 1 the size is just sizeof(T)
For N = 2 the size is d_1 * sizeof(T)
For N = 3 the size is d_2 * d_1 * sizeof(T)
You can easily inductively prove that the number of bytes required to store R (N-1)-dimensional arrays is R*(d_1 * d_2 * ... * d_(n-1) * sizeof(T)). And that's done.
Now, we need to access a random element inside this massive N-dimensional array. Let's say we want to access the item with indicies (i_1, i_2, ..., i_N). For this we are going to repeat the inductive reasoning:
For N = 1, the index of the i_1 element is just my_array[i_1]
For N = 2, the index of the (i_1, i_2) element can be calculated by thinking that each d_1 elements, a new array begins, so the element is my_array[i_1 * d_1 + i_2].
For N = 3, we can repeat the same process and end up having the element my_array[d_2 * ((i_1 * d_1) + i_2) + i_3]
And so on.
Solution 2
The second solution wastes a bit more memory, but it's more straightforward, both to understand and to implement.
Let's just stick to the N = 2 case so that we can think better. Imagine to have a table and to split it row by row and to place each row in its own memory slot. Now, a row is a 1-dimensional array, and to make a 2-dimensional array we only need to be able to have an ordered array with references to each row. Something like the following drawing shows (the last row is the R-th row):
+------+
| R1 -------> [1,2,3,4]
|------|
| R2 -------> [2,4,6,8]
|------|
| R3 -------> [3,6,9,12]
|------|
| .... |
|------|
| RR -------> [R, 2*R, 3*R, 4*R]
+------+
In order to do that, you need to first allocate the references array (R elements long) and then, iterate through this array and assign to each entry the pointer to a newly allocated memory area of size d_1.
We can easily extend this for N dimensions. Simply build a R dimensional array and, for each entry in this array, allocate a new 1-Dimensional array of size d_(N-1) and do the same for the newly created array until you get to the array with size d_1.
Notice how you can easily access each element by simply using the expression my_array[i_1][i_2][i_3]...[i_N].
For example, let's suppose N = 3 and T is uint8_t and that d_1, d_2 and d_3 are known (and not uninitialized) in the following code:
size_t d1 = 5, d2 = 7, d3 = 3;
int ***my_array;
my_array = malloc(d1 * sizeof(int**));
for(size_t x = 0; x<d1; x++){
my_array[x] = malloc(d2 * sizeof(int*));
for (size_t y = 0; y < d2; y++){
my_array[x][y] = malloc(d3 * sizeof(int));
}
}
//Accessing a random element
size_t x1 = 2, y1 = 6, z1 = 1;
my_array[x1][y1][z1] = 32;
I hope this helps. Please feel free to comment if you have questions.

Number of ways such that sum of k elements equal to p

Given series of integers having relation where a number is equal to sum of previous 2 numbers and starting integer is 1
Series ->1,2,3,5,8,13,21,34,55
find the number of ways such that sum of k elements equal to p.We can use an element any number of times.
p=8
k=4.
So,number of ways would be 4.Those are,
1,1,1,5
1,1,3,3
1,2,2,3
2,2,2,2
I am able to sove this question through recursion.I sense dynamic programming here but i am not getting how to do it.Can it be done in much lesser time???
EDIT I forgot to mention that the sequence of the numbers does not matter and will be counted once. for ex=3->(1,2)and(2,1).here number of ways would be 1 only.
EDIT: Poster has changed the original problem since this was posted. My algorithm still works, but maybe can be improved upon. Original problem had n arbitrary input numbers (he has now modified it to be a Fibonacci series). To apply my algorithm to the modified post, truncate the series by taking only elements less than p (assume there are n of them).
Here's an n^(k/2) algorithm. (n is the number of elements in the series)
Use a table of length p, such that table[i] contains all combinations of k/2 elements that sum to i. For example, in the example data that you provided, table[4] contains {1,3} and {2,2}.
EDIT: If the space is prohibitive, this same algorithm can be done with an ordered linked lists, where you only store the non-empty table entries. The linked list has to be both directions: forward and backwards, which makes the final step of the algorithm cleaner.
Once this table is computed, then we get all solutions by combining every table[j] with every table[p-j], whenever both are non-empty.
To get the table, initialize the entire thing to empty. Then:
For i_1 = 0 to n-1:
For i_2 = i_1 to n-1:
...
For i_k/2 = i_k/2-1 to n-1:
sum = series[i_1] + ... + series[i_k/2]
if sum <= p:
store {i_1, i_2, ... , i_k/2 } in table[sum]
This "variable number of loops" looks impossible to implement, but actually it can be done with an array of length k/2 that keeps track of where each i_` is.
Let's go back to your data and see how our table would look:
table[2] = {1,1}
table[3] = {1,2}
table[4] = {1,3} and {2,2}
table[5] = {2,3}
table[6] = {1,5}
table[7] = {2,5}
table[8] = {3,5}
Solutions are found by combining table[2] with table[6], table[3] with table[5], and table[4] with table[4]. Thus, solutions are: {1,1,1,5} {1,2,2,3}, {1,1,3,3}, {2,2,2,2}, {1,3,2,2}.
You can use dynamic programming. Let C(p, k) be the number of ways that sum k element equal to p and a be the array of elements. Then
C(p, k) = C(p - a[0], k - 1) + C(p - a[1], k - 1) + .... + C(p - a[n-1], k - 1)
Then, you can use memorization to speed up your code.
Hint:
Your problem is well-known. It is the sum set problem, a variation of knapsack problem. Check this pretty good explanation. sum-set problem

How to pack a 3 dim array/ how to map it on a 1 dim array

I have a question concerning the packing of multidimensional arrays. I'm stuck at the moment and maybe somebody can help me out since I think it is a rather trivial task. I'm programing in Fortran, but the language doens't matter here so much.
During my work I have to store informations on triples i,j,k with i <= j <= k, where i,j,k go from 1 to n.
Because the memory demands in my progam are critical, I don't want to waste memory and try to pack the array trip(i,j,k) in an one dimensional array. So far I do it with the pairs ij and map them onto a 1d array (as usual for symmetric matrices) and calculate the index in the 1d array as:
ijpair = i*(i+1)/2+j
with the maximal number of pairs
npair = n*(n+1)/2+n
To incorporate the third index I generate n copies of the pair block (for each k). The index for the triple is than:
ijktrip = npair*(k-1) + ijpair
In this way the memory demands are already reduced, but it is still not optimal and at the moment I can not figure out how to fully exploit the restrictions i <= j <= k.
Thanks for your help in advance
As you already noticed, for a specific k you have k*(k+1)/2 elements. Let's enumerate the elements in increasing order of k, then increasing order of j, and then increasing order of i.
For a given k you already used sum_{1 <= p < k} p*(p+1)/2 indices. This sum evaluates to (see Wolfram alpha)
k*(k-1)*(k+1)/6
In the k-plane you need to do a similar thing for j: for a given j you already used sum_{1 <= p < j} p = j*(j-1)/2 indices.
On the jk-line you repeat this: for a given i you already used i-1 indices.
This yields the total formula:
index(i, j, k) = k*(k-1)*(k+1)/6 + j*(j-1)/2 + i
Note that (again, see Wolfram alpha)
index(n, n, n) = n*(n-1)*(n+1)/6 + n*(n-1)/2 + n = n*(n+1)*(n+2)/6
which is exactly the number of elements you have, so no indices are wasted.

Indexing multidimensional arrays in C

I'm familiar with multidimensional arrays being accessed as such: arr[rows][cols] which makes sense to me when I imagine it as a grid or coordinate system and locating points. But I'm confused about the line below. I understand it is picking up a pointer to some struct located in the array of structures I just have a hard time imagining which location this could represent in terms of the coordinate system I'm used to...its a bitmap by the way and SOMETHING is a pixel.
//what does this line mean SOMETHING *o = original + row*cols + col;
for (row=0; row < rows; row++)
for (col=0; col < cols; col++) {
SOMETHING* o = original + row*cols + col;
SOMETHING* n = (*new) + row*cols + (cols-1-col);
*n = *o;
}
Think about how arrays are laid out in memory. A multidimensional array is just an array of arrays, so say you had an array like SOMETHING[10][10].
The memory layout would be:
[0][0], [0][1], .. [0][9], [1][0], [1][1].. [9][9]
This actually is exactly the same as allocating sizeof(SOMETHING)*100.
What the line SOMETHING* o = original + row*cols + col; is saying is "make a pointer to object of type SOMETHING".
The pointer address should be:
the memory address of original,
add row times cols to it,
which will place it at the start of a row,
then add the specific column to it
to get to the exact position of an object in the array
A two dimensional array such as:
{{00,01,02,03},
{10,11,12,13},
{20,21,22,23},
{30,31,32,33}}
Will be placed in memory in order. Just like this:
{00,01,02,03,10,11,12,13,20,21,22,23,30,31,32,33}
So, when you access the array with a[i][j], you can also access the array with
a + i *(ELEMENTS_IN_ROW) + j
There is a pointer named original which probably sits at the origin ([0][0]). You are doing simple arithmetic to point to the current co-ordinate.
Suppose it's a 5x5 array and you are now in the 3rd row, 4th column ([2][3])
To reach [2][3] from origin , you have to travel:
5 units in the first row
5 units in the second row
3 units in the third
Total of 13 units.
row*cols + col gives you 2*5 + 3 ie. 13
So if you move row*cols + col units from origin, you get to your current location.
The reality is that the 2-D grid you're imagining is actually represented as contiguous, linear memory.
So to access a coordinate at index (r,c) you need to start at the base address of the array, then skip to the r-th row by multiplying the row index (r) by the number of columns in each row--again, we're moving in a linear direction. This will bring you to the first column of the r-th row. From here you increment the pointer by c columns and you're at your destination.
So in your code, I'm assuming original is the start of your array. Therefore the line:
original + row*cols + col
is doing exactly what I described above.
row-column coordinates tend to lead to some confusion when mixed with something different... for example with the typical image coordinate system, where the origin is the top-left corner.
a matrix has coordinates indicated as (row, column). That's equivalent to (y,x) in the coordinates of the image: by incrementing rows you go down; incrementing columns, right.
In your code the 2D space, is just following the typical convention of both C (row-major) and of images (by lines). So you can read:
SOMETHING* o = original + row*cols + col;
as
SOMETHING* o = original + iy*width + ix;
note finally that this is the same of using static 2D arrys in C, so these would be equivalent:
original[NROWS][NCOLS];
...
SOMETHING *o = &original[row][col];
or
SOMETHING original[HEIGHT][WIDTH];
...
SOMETHING *o = &original[iy][ix];
with the very important limitation that in this case the dimensions must be known at compile time. In your case you are dealing with dynamic bidimensional arrays, that are not supported by the C language.
To conclude, being SOMETHING a pixel, let me guess...
struct SOMETHING {
unsigned char r;
unsigned char g;
unsigned char b;
};
it means that any element addressed in one of the two ways is just 3 bytes long, and they are just arranged contiguous.

Resources