efficient indexing of an array - c

Suppose that I have
z[7]={0, 0, 2, 0, 1, 2, 1}
meaning that the first observation is allocated to group 0, the second to group 0, the third to group 2, and so on.
I want to write efficient code that produces an array of 3 x ? such that the first row holds all the observations allocated to the first group, the second row all those allocated to the second group, and so on. Something like
0, 1, 3
4, 6
2, 5
and this must be general, maybe I could have
z={0, 0, 2, 0, 1, 2, 1, 3, 4, 2, 0, 4, 5, 5, 6, 7, 0}
so the number of columns is unknown
I did do my homework and my code is below, but there must be a better way to do it. I believe it can be done with pointers, but I really do not know how.
#include <stdio.h>
int main(){
int z[7]={0, 0, 2, 0, 1, 2, 1}, nj[3], iz[3][7], ip[3], i, j;
for(j=0; j<3; j++){
ip[j] = 0;
nj[j] = 0;
}
for(i=0; i <7; i++ ){
nj[z[i]] = nj[z[i]] + 1;
iz[z[i]][ip[z[i]]] = i;
ip[z[i]] = ip[z[i]] + 1;
}
for(j=0; j<3 ;j++){
for(i=0; i < nj[j]; i++){
printf("%d\t", iz[j][i]);
}
printf("\n");
}
return 0;
}

It seems that you have two tasks here.
1. To count the number of occurrences of each index in z and allocate a data structure of the right size and configuration.
2. To iterate over the data and copy it to the correct places.
At the moment you appear to have solved (1) naively by allocating a big, two dimensional array iz. That works fine if you know in advance the limits on how big it could be (and your machine will have enough memory), no need to fix this until later.
It is not clear to me exactly how (2) should be approached. Is the data currently going into iz guaranteed to consist of [0, 1, ... n ]?
If you don't know the limits of the size of iz in advance, then you will have to allocate a dynamic structure. I'd suggest a ragged array though this means two (or even three) passes over z.
What do I mean by a ragged array? An object like the argv argument to main, but in this case of type int **. In memory it looks like this:
+----+     +---+     +---+---+---+--
| iz |---->|   |---->|   |   |   | ...
+----+     +---+     +---+---+---+--
           |   |--
           +---+  \  +---+---+---+--
           | . |   ->|   |   |   | ...
             .       +---+---+---+--
             .
A ragged array can be accessed with iz[][] just like it was a two-dimensional array (but it is a different type of object), which is nice for your purposes because you can tune your algorithm with the code you have now, and then slap one of these in place.
How to set it up:
1. Iterate over z to find the largest number present, maxZ. The rows are numbered 0 to maxZ, so there are maxZ+1 of them.
2. Allocate an array of int* of size maxZ+2: iz = calloc(maxZ+2, sizeof(int*));
I chose calloc because it zeros the memory, which makes all those pointers NULL, but you could use malloc and NULL them yourself. Making the array one bigger than the number of rows gives us a NULL termination, which may be useful later.
3. Allocate an array of counters of size maxZ+1: int *cz = calloc(maxZ+1, sizeof(int));
4. Iterate over z, filling cz with the number of entries needed in each row.
5. For each row, allocate an array of ints: for(i=0; i<=maxZ; ++i){ iz[i] = malloc(sizeof(int)*cz[i]); }
6. Iterate over z one last time, sticking the figures into iz as you already do. You could re-use cz at this point to keep track of how many figures have already been put into each row, but you might want to allocate a separate array for that purpose so that you keep a record of how big each allocated array is.
NB: Every call to malloc or calloc ought to be accompanied by a check to ensure that the allocation worked. I've left that as an exercise for the student.
This repeated passes over z business can be avoided entirely by using dynamic arrays, but I suspect you don't need that and don't want the added complexity.


Why is the Index NOT out of bounds although it intuitively should?

I'm relatively new to C programming and I stumbled upon behaviour I can't explain while running the following code and debugging it with gdb and lldb.
In short: when accessing a value in a two-dimensional array inside a doubly nested for loop, swapping the indices i and j (max i != max j) does not seem to matter; array[i][j] and array[j][i] both work.
The two loops and arrays are mostly identical.
unsigned matrix[3][1] =
{
{3},
{4},
{5}
};
//Loop1
for (size_t i = 0; i < sizeof(matrix) / sizeof(*matrix); i++)
{
for (size_t j = 0; j < sizeof(matrix[i]) / sizeof(*matrix[i]); j++)
{
matrix[i][j] <<= 1;
printf("matrix[%zu][%zu] has the value: %d\n", i, j, matrix[i][j]);
}
}
//same two dimensional array as matrix
unsigned matrix2[3][1] =
{
{3},
{4},
{5}
};
//Loop2, basically the same loop as Loop1
for (size_t i = 0; i < sizeof(matrix2) / sizeof(*matrix2); i++)
{
for (size_t j = 0; j < sizeof(matrix2[i]) / sizeof(*matrix2[i]); j++)
{
//swapped i and j here
matrix2[j][i] <<= 1;
printf("matrix2[%zu][%zu] has the value: %d\n", j, i, matrix2[j][i]);
}
}
Am I missing something here?
In both cases i is passed the value 2 at the end of the outer loop and j the value 0 at the end of the inner loop.
Intuitively, matrix[0][2] should throw an exception as each row only has one element.
I will take a slightly different approach than the other respondents.
You are technically not reading outside of the array's boundary as far as the memory layout is concerned. Looking at it from a human perspective you are (the index [0][2] doesn't exist!), but the memory layout of the array is contiguous. Each of the "rows" of the matrix are stored next to each other.
In memory, your array is stored as: | ? | 3 | 4 | 5 | ? |
So when you index matrix[1][0] or matrix[0][1] you are accessing the same position in memory. This would not be the case if each row of your array held more than one element.
For example, replace your array with the following one and experiment. You can reach the integer '4' either by indexing matrix[0][2] or matrix[1][0]. The position [0][2] shouldn't exist, but it does because the memory is contiguous.
unsigned matrix[3][2] =
{
{3, 6},
{4, 8},
{5, 10}
};
Oops, matrix[0][2] should throw an exception as each row only has one element...
Some languages do warn the programmer with an exception if they try an out-of-bounds access, but C does not. It just invokes Undefined Behaviour. From a technical point of view, it means that the compiler does not have to test the out-of-bounds condition. From an operational point of view, it means that anything can happen, including the expected behaviour... or an immediate crash... or a modification of an unrelated variable... or...
If my C skills aren't mega-rusty you're reading "unsafe memory".
Essentially your matrix is declared as a block of bytes. After that block of bytes there are more bytes. What are they? Usually other variables of your program, or bookkeeping data such as saved return addresses on the stack; what exactly lies there depends on the compiler and the platform.
Most languages perform checks and throw an exception when you go out of bounds, by keeping track of the last index that is valid to access. C does not do that; doing such checks is your own responsibility. If you aren't careful you might overwrite important parts of your program's memory.
There are attacks on C programs that don't validate user input, such as a buffer overrun, which exploits exactly what has been described.
Essentially, if you declare a char[] of length N and store a string that comes from outside, and this string happens to be of length N+X, you'll be overwriting whatever memory follows the array.
With the right sequence of characters an attacker can redirect the execution of, or inject code into, a running program that doesn't validate user input.
As all elements of your array have the same size and the array is stored in contiguous space in RAM, I don't see any problem here: you are using a special case of a matrix where swapping the indexes has no side effect.
In the first loop your indexes are [0][0], [1][0], [2][0]
In the second loop your indexes are [0][0], [0][1], [0][2]
Now try to linearize the access, as your array is stored as a linear array in RAM.
address of element = row * NCOL + col
row: is row number
NCOL: number of columns into your matrix
col : the column number
so the linear index for :
[0][2] ==> 0 * 1 + 2 = 2 /* the third element*/
[2][0] ==> 2 * 1 + 0 = 2 /* always the third element */
But if you use an n x m matrix with m > 1, swapping the indexes will in general not give the same element.
Take a 4 x 2 matrix:
linear index of [3][1] = 3 * 2 + 1 = 7
linear index of [1][3] = 1 * 2 + 3 = 5 /* even though [1][3] is outside the index range of your matrix */
With [1][3] you will actually manipulate the element [2][1].
So be careful when manipulating matrix indexes.

How to insert an element starting the iteration from the beginning of the array in c?

I have seen elements inserted into an array by iterating from the rear end, but I wonder whether it is possible to insert while iterating from the front.
I finally figured out a way. Here is the code:
#include <stdio.h>
int main()
{
int number = 5; //element to be inserted
int array[10] = {1, 2, 3, 4, 6, 7, 8, 9};
int ele, temp;
int pos = 4; // position to insert
printf("Array before insertion:\n");
for (int i = 0; i < 10; i++)
{
printf("%d ", array[i]);
}
puts("");
for (int i = pos; i < 9; i++) // stop at 9: the body writes array[i + 1]
{
if (i == pos) // first element
{
ele = array[i + 1];
array[i + 1] = array[i];
}
else // rest of the elements
{
temp = array[i + 1];
array[i + 1] = ele;
ele = temp;
}
}
array[pos] = number; // element to be inserted
printf("Array after insertion:\n");
for (int i = 0; i < 10; i++)
{
printf("%d ", array[i]);
}
return 0;
}
The output looks like:
Array before insertion:
1 2 3 4 6 7 8 9 0 0
Array after insertion:
1 2 3 4 5 6 7 8 9 0
In C, arrays have a "native" built-in implementation based upon the address (aka pointer) of the first element and the [] operator for element addressing.
Once an array has been allocated, its actual size is not automatically handled or checked: the code needs to make sure boundaries are not trespassed.
Moreover, in C a local variable declared without an initializer has no default (aka empty) value; this includes arrays and their elements.
Also, in C there's no such thing as insertion, appending or removal of an array element. You can simply refer to the n-th (with n starting at 0) array element by using the [] operator.
So, if you have an array, you cannot insert a new item at its n-th position. You can only read or (over)write any of its items.
Any other operation, like inserting or removing, requires ad-hoc code which basically boils down to shifting the arrays elements forward (for making room for insertion) or backward (for removing one).
This is the C-language nature and should not be seen as a limitation: any other language allowing for those array operations must have a lower-level hidden implementation or a non-trivial data structure to implement the arrays.
This means, in C, that while keeping the memory usage to a bare minimum, those array operations require some time-consuming implementation, like the item-shifting one.
You can then trade off memory usage against time and gain some overall efficiency by using, for example, singly or doubly linked lists. You lose some memory to the link pointer(s) in favor of faster insertion and removal operations. This depends mostly upon the implementation goals.
Finally, to get to the original question, an actual answer requires some extra details about the memory vs time trade off that can be done to achieve the goal.
The solution shown by @Krishna Acharya is a simple shift-based solution with no boundary check. A very simple and somewhat naive implementation.
A final note. The 0s shown by Krishna's code at the end of the array are not random values: when an initializer list is shorter than the array, C sets the remaining elements to zero.
The declaration could nevertheless have been written as:
int array[10] = {1, 2, 3, 4, 6, 7, 8, 9, 0, 0};
in order to make it explicit that the last two array elements are unused and set to 0.

How to free the final element of a malloc'd array in C?

Say I initialize an array of 5 integer elements like this:
int *Q = malloc(sizeof(int) * 5);
for (int i = 0; i < 5; i++) {
Q[i] = i;
}
The array looks like: {0, 1, 2, 3, 4}.
Now if I shift everything along by 1 position:
Q++;
The array looks like: {1, 2, 3, 4, #}, where # is some garbage value.
Is there a way to free the final element so it's not stored in the array?
I tried this:
free(Q[4]);
But I know this won't work because free() can only operate on the whole chunk of memory allocated for Q.
Is there a better way to shift everything along? The resulting array should look like: {1, 2, 3, 4}.
Would it be a good idea to realloc() Q after every shift?
realloc() can change the size of an allocated chunk of memory, which will do the job for you. Note that this cannot be used to "free" arbitrary elements of an array, but only one(s) on the end.
How good an idea it is to do this depends on a number of factors, none of which you have provided.
When you do Q++ the array has not changed; it still contains the five values 0,1,2,3,4. It is just that Q is now pointing to the second element in the array.
If you want to change the size of the allocated memory then do as Scott said and realloc the block, but it is a costly way of handling heap memory.
If you just want to keep track of the number of elements in the array, let Q remain pointing at the first element and have a size variable indicating how many integers there are.
Alternatively use another data structure to hold your integers, e.g. a linked list of integers; then you can add and remove integers more easily.
Talking about the last elements of an array, you can certainly use realloc.
BTW, note that when you say
The array looks like: {1, 2, 3, 4, #}, where # is some garbage value.
you are mistaken: reading that fifth element through the shifted Q accesses memory past the end of the allocation, which is undefined behavior, as explained in this SO answer.
So a loop that shifts the values left must not do Q[4] = Q[5];
To shift around elements inside an array one can use memmove().
#include <stdio.h>
#include <string.h>
int main(void)
{
int d_init[] = {0, 1, 2, 3, 4};
size_t s = sizeof d_init/sizeof *d_init;
int d[s];
/* Fill d */
memcpy(d, d_init, s * sizeof *d);
for (size_t i = 0; i < s; ++i)
printf("%d ", d[i]);
puts("");
/* shift one to the left */
memmove(d, d + 1, (s - 1) * sizeof *d);
for (size_t i = 0; i < s; ++i)
printf("%d ", d[i]);
puts("");
/* shift two to the right */
memmove(d + 2, d, (s - 2) * sizeof *d);
for (size_t i = 0; i < s; ++i)
printf("%d ", d[i]);
puts("");
}
The snippet above would print:
0 1 2 3 4
1 2 3 4 4
1 2 1 2 3
If you're doing a Q++ you've not shifted the elements of the array; Q is simply pointing to the second element (index 1). Thus, Q[4] is reading something that doesn't belong to the array: C is permissive enough to let you do that (in most cases), but it is a mistake.
To shift elements you should either do
for (int i=0; i<4; i++)
Q[i] = Q[i+1];
or (smarter)
memmove(Q, Q+1, 4*sizeof(int));
but indeed, to have an array of size 4 you'll have to realloc.
BUT if you need to do that, maybe an array is not the data structure you should use: a linked list seems to be a better choice.

"C" i couldnt understand how i can add two matrises

I'm a beginner in C and I need help with this, please help.
{
int matris[2][2];
for(int i=0;i<2;i++)
{
for(int j=0;j<4;j++)
{
printf("Sayi giriniz: "); scanf("%d",&matris[i][j]);
}
}
for(int i=0;i<3;i++)
{
for(int j=0;j<2;j++)
{
printf("%d ",matris[i][j]);
}
printf("\n");
}
}
As a beginner, you need to realize that programming is nothing more than problem-solving, well there is the bit about expressing the answer in a programming language.
Doing matrix addition -- how do you do it, how would a mathematician define it? Arnaldo has given you the answer to this, $A + B = C$ where $c_{ij} = a_{ij} + b_{ij}$. So already, this starts to set some restrictions on the two matrices that you are working with, notably they have to have the same number of rows and columns.
Representation of matrices -- ok, now that you know how to add two matrices, you need to figure out how you are going to represent a matrix in your program. Computer memory is a one-dimensional array of storage units, so we need to map our two-dimensional structure onto this one-dimensional array. There are two ways of doing this. The first is row major, which means that we write the first row to memory, then the second row and so on. The second is column major, which means that we write the first column to memory, then the second column and so on.
Consider the following 2x3 matrix:
| a b c |
| d e f |
in row-major form, it would be laid out in memory as:
+---+---+---+---+---+---+
| a | b | c | d | e | f |
+---+---+---+---+---+---+
and in column-major form, in would be laid out in memory as:
+---+---+---+---+---+---+
| a | d | b | e | c | f |
+---+---+---+---+---+---+
Remember that computer science is zero based, so where a mathematician would designate the first element in the first row as $a_{11}$, we will be using zero-based indices, so we will designate it as $a_{00}$.
Most modern programming languages use row-major form to store two-dimensional arrays (or in this discussion matrices). So what, you might ask? Well, because we are mapping a two-dimensional array to a one-dimensional array, and all we really know about the one-dimensional array is its starting point in memory, we need to be able to change the pair (row, col) into a single index. You should convince yourself that the following equation is correct, assuming that nRow and nCol are the number of rows and columns in the matrix.
index = nCol * row + col
So, now write some code to add two matrices together. In pseudo-code form this would be:
A <-- read in first n-by-m matrix
B <-- read in second n-by-m matrix
C <-- initialize a n-by-m matrix to all zero elements.
for(r = 0; r < nRow; r++)
for(c = 0; c < nCol; c++)
C[r][c] = A[r][c] + B[r][c]
print C
It is an implementation detail to decide whether you want to use fixed-size matrices, i.e. 'A[2][2]', or a dynamically allocated matrix, i.e. 'A = malloc(nRow * nCol * sizeof(int));' (assuming we are storing integers). This will determine exactly how the addition line in the above pseudo-code would be written.
Hope this helps, and shows you how to approach problems like this.
Don't be afraid to ask additional questions if you get stuck on attempting the implementation.
Best of Luck,
T
I don't fully understand what your question is, but I can definitely show you how to add two arrays of the same length elementwise, if that's what you're looking for.
#include <stdio.h>
int main()
{
//this part is declaring the two arrays you want to add,
//and the array you want to store the result in.
int arrayA[3];
int arrayB[3];
int result[3];
//this part is just initializing the data in the array
arrayA[0] = 1;
arrayA[1] = 2;
arrayA[2] = 3;
arrayB[0] = 5;
arrayB[1] = 6;
arrayB[2] = 7;
//loops from 0 to 2, and adds the nth element of arrayA and
// arrayB to store in result
for(int n =0; n < 3; n++)
{
result[n] = arrayA[n] + arrayB[n];
}
//at this point, result is the addition of your arrays.
//You can print it, or whatever it is you wanted to do with it.
return 0;
}
There are many different things that 'adding matrices' could mean, but this is one of them.
However, the code you submitted looks like you're trying to store numbers from the keyboard into a 2D array. You're on the right track there, but in your first nested for loop the inner loop goes too far, so it goes outside the array bounds. Try something more like this:
int matris[2][2];
for(int i=0;i<2;i++)
{
//you had j<4 here. That will put you in invalid memory!
for(int j=0;j<2;j++)
{
printf("Sayi giriniz: "); scanf("%d",&matris[i][j]);
}
}
//you had i<3 here. That will also put you in invalid memory.
for(int i=0;i<2;i++)
{
for(int j=0;j<2;j++)
{
printf("%d ",matris[i][j]);
}
printf("\n");
}
I hope I've addressed whatever question you were going for.
Good luck!

How much information do array variables share?

How much information is copied/shared when I assign one array variable to another array variable?
int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
int[] b = a;
a[0] = 42;
writefln("%s %s", a[0], b[0]); // 42 42
Apparently, a and b share the same payload, because 42 is printed twice.
a ~= 10;
writefln("%s %s", a.length, b.length); // 11 10
Appending to a does not change b, so the length does not seem to be part of the payload?
b = a;
a ~= 11;
b ~= 42;
writefln("%s %s", a[11], b[11]); // 11 42
Could a conforming D implementation also print 42 42? Could b ~= 42 overwrite the 11 inside a?
When exactly are a and b detached from each other? Is D performing some COW in the background?
"Arrays" in D don't really exist.
Slices do.
Slices are just a pointer and a length. So when you assign them to each other, the pointer and the length get copied. If you modify the target data, then it'll be visible in all instances of the slices -- but if you enlarge one slice, the other one will still be using its old length.
You normally can't "shrink" the actual length of the array in memory (although you can certainly reduce the slice's length, so it 'sees' less data), so that doesn't cause issues.
Hope that explains what's going on.
Array variables in D are equivalent to
struct Array(T) {
size_t length;
T* ptr;
}
(plus the implementations for indexing and slicing).
Appending is special in that it may keep the original slice and append in place at the end. This happens only when either the capacity of the array is large enough or the realloc can expand in place.
This bookkeeping is maintained by the GC.
