How much information do array variables share? - arrays

How much information is copied/shared when I assign one array variable to another array variable?
int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
int[] b = a;
a[0] = 42;
writefln("%s %s", a[0], b[0]); // 42 42
Apparently, a and b share the same payload, because 42 is printed twice.
a ~= 10;
writefln("%s %s", a.length, b.length); // 11 10
Appending to a does not change b, so the length does not seem to be part of the payload?
b = a;
a ~= 11;
b ~= 42;
writefln("%s %s", a[11], b[11]); // 11 42
Could a conforming D implementation also print 42 42? Could b ~= 42 overwrite the 11 inside a?
When exactly are a and b detached from each other? Is D performing some COW in the background?

"Arrays" in D don't really exist.
Slices do.
Slices are just a pointer and a length. So when you assign them to each other, the pointer and the length get copied. If you modify the target data, then it'll be visible in all instances of the slices -- but if you enlarge one slice, the other one will still be using its old length.
You normally can't "shrink" the actual length of the array in memory (although you can certainly reduce the slice's length, so it 'sees' less data), so that doesn't cause issues.
Hope that explains what's going on.

array variables in D are equivalent to
struct Array(T) {
    size_t length;
    T* ptr;
}
(plus the implementations for indexing and slicing)
Appending is special in that it may keep the original slice and append in place at the end. This happens only when the capacity of the array is large enough or the realloc can expand in place.
The capacity and the used size of the underlying memory block are tracked by the GC.

Related

Can we map an array to another array in C, like mapping a file?

I have an array (1D), and other array of same size in different order (which will change according to the program situation) should also have the same value.
For example:
array1 = {1,2,3,4,5};
hence array2, should automatically have,
array2 = {4,2,3,1,5};
In other words, I want to shuffle the values according to my own index mapping. But whenever the parent array1 changes, array2 should also be updated at the corresponding indexes. Is this even possible? Is there something like array memory mapping? Looping and copying into the other array takes too long because the operation is repeated many times. I cannot use memcpy, because the order can be different. Any pointers, help, or suggestions will be appreciated.
There's no magical way to do this. What you need to do is store the actual values somewhere, and then access them through a permutation stored separately. Here's some example code that uses strings so the permutation and the values are clearly distinct:
char *strings[] = {"foo", "bar", "baz", "quux"};
size_t memory_order[] = {0, 1, 2, 3};
size_t sorted_order[] = {1, 2, 0, 3};
// Get the k'th element in the memory order:
strings[memory_order[k]];
// Get the k'th element in the sorted order:
strings[sorted_order[k]];
Not directly, no. C doesn't specify a way to do that (which makes sense to me, since most computers don't either, and C tends to be fairly close to the metal).
The typical way to solve it is to manually do the re-mapping, of course:
static const size_t map1to2[] = { 3, 1, 2, 0, 4 };
Then do the accesses to array2 through the remap:
printf("array2[3] is %d\n", array1[map1to2[3]]);
This maps the index 3 to 0, and thus prints 1.
You can use macros to make it slightly more manageable.

Memory for big array which change its data and size in Matlab (buffer)

I am trying to create a buffer (not a circular buffer) with a specified size for big arrays, such as 100x100.
I can't find any good solution on the internet, so I am trying to write it myself.
Here is the code:
In this example I will be using only small 2x2 arrays, and the buffer size is 3.
A = [1 1; 1 1];
B = [2 2; 2 2];
C = [3 3; 3 3];
buffer = [C B A];
D = [4 4; 4 4];
Now I want to push D and pop A so that the buffer looks like [D C B]
buffer = buffer(1:2,1:4); %creating [C B]
buffer = [D buffer] %creating [D C B]
Now the question:
What about the A in the memory? Is it still there or it is deleted?
If I use about 1000 arrays with size [500x500] with buffer size 3 it would be really bad if I had so much trash in memory. If it is wrong, is there any other way to write such buffer?
Your "push front" syntax buffer = [D buffer] seems fine.
As for "push back", you can either concatenate in a similar manner (buffer = [buffer D];) or you can index past the end:
buffer(:,end+1:end+size(D,2)) = D; % assuming size(D,1) equals size(buffer,1)
For "pop" you either do buffer=buffer(keepRows,keepCols); as in your example, or you can assign [] to whatever indexes you want to remove. For example, given buffer = [C B A];, "pop front" to remove C would be:
buffer(:,1:size(C,2)) = [];
You can remove any values this way, including center elements:
buffer(:,3:4) = []; % remove B from [C B A]
In this way, buffer is rewritten and the removed values are lost. However, the original variable (e.g. B) used to compose buffer remains until you clear it. Keep in mind that when you do buffer = [C B A];, it copies the contents of each variable when doing horizontal concatenation, rather than putting the array in a list.
I once did a pretty thorough comparison of array truncation performance with the v(end-N:end)=[] syntax and the v=v(1:N-1) syntax. Although that is just for a 1D vector, it might be helpful. You may also want to have a look at this article on automatic array growth performance.
I would suggest using a Java linked list.
import java.util.LinkedList
q = LinkedList();
q.add(A);
q.add(B);
q.add(C);
%get top element
o=q.pop();
%insert new
q.add(D);
Your code requires copying the whole buffer content, which will be slow for large buffers.

Array size less than the no. of elements stored in it

Is it possible to declare an array of size 1 and be able to store 5 elements in it and then retrieve them?
I tried this: I declared an array arr[1] and then stored 5 elements in it. It was actually possible to store 5 elements! How is that possible?
If this is C (or C++), you can quite easily store more elements than the array is sized for:
#include <stdio.h>
int main (void) {
    int x = 0;
    int a[1]; // so that a[0] is the only valid element
    a[1] = 7; // write beyond end of array (undefined behaviour)
    // %p expects a void *, hence the casts
    printf ("x=%d, &a[0]=%p, &a[1]=%p, &x=%p\n", x, (void *)&a[0], (void *)&a[1], (void *)&x);
    return 0;
}
Doing so, however, leads to undefined behaviour, probably overwriting some other piece of information, and is not really a good idea.
On my system, that code above prints:
x=7, &a[0]=0xbf9bb638, &a[1]=0xbf9bb63c, &x=0xbf9bb63c
despite the fact I set x to zero and never explicitly changed it. That's because writing beyond the end of the array has affected it (as you can see from the two identical addresses for a[1] and x).

efficient indexing of an array

Suppose that I have
z[7]={0, 0, 2, 0, 1, 2, 1}
This means that the first observation is allocated to group 0, the second observation to group 0, the third to group 2, and so on.
I want to write efficient code to get a 3x? array in which the first row holds all the observations allocated to the first group, the second row all the observations allocated to the second group, and so on. Something like
0, 1, 3
4, 6
2, 5
and this must be general, maybe I could have
z={0, 0, 2, 0, 1, 2, 1, 3, 4, 2, 0, 4, 5, 5, 6, 7, 0}
so the number of columns is unknown
I did do my homework and the code is attached to this message, but there must be a better way to do it. I believe it can be done with pointers, but I really do not know how.
#include <stdio.h>
int main(){
    int z[7]={0, 0, 2, 0, 1, 2, 1}, nj[3], iz[3][7], ip[3], i, j;
    for(j=0; j<3; j++){
        ip[j] = 0;
        nj[j] = 0;
    }
    for(i=0; i <7; i++ ){
        nj[z[i]] = nj[z[i]] + 1;
        iz[z[i]][ip[z[i]]] = i;
        ip[z[i]] = ip[z[i]] + 1;
    }
    for(j=0; j<3 ;j++){
        for(i=0; i < nj[j]; i++){
            printf("%d\t", iz[j][i]);
        }
        printf("\n");
    }
    return 0;
}
It seems that you have two tasks here.
1. To count the number of occurrences of each index in z and allocate a data structure of the right size and configuration.
2. To iterate over the data and copy it to the correct places.
At the moment you appear to have solved (1) naively by allocating a big, two dimensional array iz. That works fine if you know in advance the limits on how big it could be (and your machine will have enough memory), no need to fix this until later.
It is not clear to me exactly how (2) should be approached. Is the data currently going into iz guaranteed to consist of [0, 1, ... n ]?
If you don't know the limits of the size of iz in advance, then you will have to allocate a dynamic structure. I'd suggest a ragged array though this means two (or even three) passes over z.
What do I mean by a ragged array? An object like the argv argument to main, but in this case of type int **. In memory it looks like this:
+----+      +---+      +---+---+---+--
| iz |----->|   |----->|   |   |  ...
+----+      +---+      +---+---+---+--
            |   |--
            +---+  \   +---+---+---+--
            | . |   -->|   |   |  ...
              .        +---+---+---+--
              .
A ragged array can be accessed with iz[][] just like it was a two-dimensional array (but it is a different type of object), which is nice for your purposes because you can tune your algorithm with the code you have now, and then slap one of these in place.
How to set it up.
Iterate over z to find the largest number, maxZ, present. Rows 0 through maxZ are needed, so there are maxZ+1 rows.
Allocate an array of int* of size maxZ+2: iz = calloc(maxZ+2, sizeof(int*));
I chose calloc because it zeros the memory, which makes all those pointers NULL, but you could use malloc and NULL them yourself. Making the array one too big gives us a NULL termination, which may be useful later.
Allocate an array of counters of size maxZ+1: int *cz = calloc(maxZ+1, sizeof(int));
Iterate over z, filling cz with the number of entries needed in each row.
For each row, allocate an array of ints: for(i=0; i<=maxZ; ++i){ iz[i] = malloc(sizeof(int)*cz[i]); }
Iterate over z one last time, sticking the figures into iz as you already do. You could re-use cz at this point to keep track of how many figures have already been put into each row, but you might want to allocate a separate array for that purpose so that you have a record of how big each allocated array was.
NB: Every call to malloc or calloc ought to be accompanied by a check to ensure that the allocation worked. I've left that as an exercise for the student.
This repeated passes over z business can be avoided entirely by using dynamic arrays, but I suspect you don't need that and don't want the added complexity.

Assigning a value to a variable gets stored in the wrong spot?

I'm relatively new to C, and this is baffling me right now. It's part of a much larger program, but I've written this little program to depict the problem I'm having.
#include <stdio.h>
int main()
{
    signed int tcodes[3][1];
    tcodes[0][0] = 0;
    tcodes[0][1] = 1000;
    tcodes[1][0] = 1000;
    tcodes[1][1] = 0;
    tcodes[2][0] = 0;
    tcodes[2][1] = 1000;
    tcodes[3][0] = 1000;
    tcodes[3][1] = 0;
    int x, y, c;
    for(c = 0; c <= 3; c++)
    {
        printf("%d %d %d\r\n", c, tcodes[c][0], tcodes[c][1]);
        x = 20;
        y = 30;
    }
}
I'd expect this program to output:
0 0 1000
1 1000 0
2 0 1000
3 1000 0
But instead, I get:
0 0 1000
1 1000 0
2 0 20
3 20 30
It does this for any number assigned to x and y. For some reason x and y are overriding parts of the array in memory.
Can someone explain what's going on?
Thanks!
tcodes[3][0] = 1000;
tcodes[3][1] = 0;
are writing off the end of your array twice. [3] allocates slot ids 0-2 and [1] only allocates 1 actual slot [0].
Change your initialization of tcodes to signed int tcodes[4][2]; for 4 entries by 2 entries.
The other answers are right, but to help explain what's actually happening:
You have the following local declarations:
signed int tcodes[3][1];
int x, y, c;
Those get stored right next to each other in the stack frame in memory:
tcodes
x
y
c
tcodes has 3 spots, and trying to write to tcodes[n] just means finding where tcodes starts in memory and moving over to the nth spot (I'm going to ignore your second dimension since it was just 1 anyway). If you try to write to spot 3, it's going to move over 3 spots from the beginning of tcodes, even though tcodes isn't that big. Since x is located right after tcodes, in the spot tcodes[3] would be in, that memory gets overwritten and the value of x changes. tcodes[4] would overwrite y, and tcodes[5] would overwrite c. If you kept making n bigger (or negative, which is legal), you could overwrite anything you're allowed to access in memory, which can screw up your program in bad and hard-to-find ways.
Change it to this:
signed int tcodes[4][2];
If You define an array like this:
int somearr[3];
You get an array that has 3 elements. Indexes start from 0, so those elements are:
somearr[0]
somearr[1]
somearr[2]
Arrays and other variables defined inside a function, like in Your code, are allocated on the stack. It just so happens, that variables x and y are placed on the stack next to Your array. If you try to access elements
tcodes[3][0] or tcodes[3][1]
You access a part of the stack that lies beyond Your array and, as Your output shows, it's the spot where the variables x and y are placed.
In fact definition like this
signed int tcodes[3][1];
creates an array containing 3 elements, each of which is an array too - an array containing one signed int. When You write tcodes[1][1], You are accessing non-existing "second" element of your second array. The place in memory, that the compiler accesses when it interprets tcodes[1][1] overlaps with tcodes[2][0];
As you are writing beyond the array boundaries, you are writing on the memory allocated to x and y variables on stack. In this case, they happen to be same as tcodes[3][0] == x and tcodes[3][1] == y as the addresses are same.
If you are doing it in a called function and the array is passed by reference, you might end up with stack corruption.
The bottom line is that in C, arrays are 0 based.
You need to pay attention to the solution given by Robin Oster above. The other folks may be giving you "too much information". Just count the number of items in each dimension better, don't forget the zero'th item counts!

Resources