I have an integer matrix that should act like a buffer:
x = {{0, 0, 0, 0, 0}, {1, 1, 1, 1, 1}, {2, 2, 2, 2, 2}};
Now if I add a new row {3, 3, 3, 3, 3}, the new matrix should look like:
x = {{1, 1, 1, 1, 1}, {2, 2, 2, 2, 2}, {3, 3, 3, 3, 3}};
Is there a clever way of doing this without copying all elements around?
If your matrix is defined as an int ** and you separately allocate each row, then you would only have to swap the row pointers.
How about modulo operation?
If you access the elements as matrix[x + SZ * y] you could change it to:
matrix[x + SZ * ((y + FIRST_ROW) % SZ)] .
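Note that this formula assumes a square SZ x SZ matrix; in general, multiply by the row width and take the modulo by the number of rows. To implement the shift, you just put the new line {3, 3, 3..} where line {0, 0, 0} was, and increment the FIRST_ROW counter (modulo the number of rows) to point to the new starting row.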
Use a linked list.
struct node
{
int row[5];
struct node *next;
};
Appending a row is as simple as walking the list to the end, then replacing the NULL next pointer with a new node (whose next pointer is NULL).
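A minimal sketch of that append, plus dropping the oldest row so the list behaves like the buffer in the question. The function names append_row and drop_oldest are made up for illustration, and the row width of 5 is taken from the struct above:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    int row[5];
    struct node *next;
};

/* Append a copy of new_row at the end of the list.
   Returns the head of the list (the new node if the list was empty),
   or NULL if allocation fails, in which case the list is untouched. */
struct node *append_row(struct node *head, const int new_row[5])
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    memcpy(n->row, new_row, sizeof n->row);
    n->next = NULL;
    if (head == NULL)
        return n;
    struct node *tail = head;
    while (tail->next != NULL)      /* walk to the end of the list */
        tail = tail->next;
    tail->next = n;
    return head;
}

/* Free the oldest row (the head) and return the new head. */
struct node *drop_oldest(struct node *head)
{
    assert(head != NULL);
    struct node *next = head->next;
    free(head);
    return next;
}
```

Keeping a separate tail pointer would avoid the walk on every append; it is left out here to stay close to the description above.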
Can you increment x so that it points to the second row, then free the first row? Obviously, you would need to allocate a row at a time and this would not guarantee that the matrix is contiguous. If you require that, you could allocate one big block of memory to hold your matrix and then clobber the unused parts when you reach the end.
If you use an array of pointers to arrays (rather than an ordinary two-dimensional array), you can copy just the pointers to rows instead of copying all elements.
And if you're okay with overallocating the array of pointers, you could maybe add a new pointer to the end and advance the pointer to the "start" of the array. But this wouldn't be a good idea if you potentially want to do this sort of shift many times. And of course you'd want to make sure you have the original pointer somewhere so you can properly free() your resources.
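As a rough sketch of the pointer-copying idea, assuming x is an array of pointers to rows that live in separate storage (the name push_row and its parameters are hypothetical):

```c
#include <assert.h>
#include <string.h>

/* Shift the row pointers up by one and reuse the oldest row's storage
   for the new row. Only nrows pointers and one row of ints are copied. */
void push_row(int **rows, size_t nrows, size_t ncols, const int *new_row)
{
    assert(nrows > 0);
    int *recycled = rows[0];                                /* oldest row */
    memmove(rows, rows + 1, (nrows - 1) * sizeof rows[0]);  /* shift pointers */
    memcpy(recycled, new_row, ncols * sizeof recycled[0]);  /* fill in new row */
    rows[nrows - 1] = recycled;                             /* newest row last */
}
```

Because the oldest row's storage is recycled, nothing is allocated or freed per push, and x[i][j] indexing keeps working unchanged.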
I'm too lazy to write a code example, but you can use modular arithmetic to address the rows. When pushing a new row, overwrite the row at the starting offset, then increase the starting offset variable by one and take the result modulo the matrix height (if the offset can ever be decremented, add the matrix height first so the result stays non-negative). This way you get a circular matrix with no need to copy the whole matrix, and the matrix array stays compact.
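Here is one way the circular addressing could look, as a sketch only. The struct ring_matrix, the helper names, and the fixed ROWS and COLS sizes are assumptions made for the example:

```c
#include <assert.h>
#include <string.h>

#define ROWS 3   /* matrix height (assumed for this example) */
#define COLS 5   /* matrix width (assumed for this example) */

struct ring_matrix {
    int data[ROWS * COLS]; /* one compact block of ROWS rows */
    int first_row;         /* physical index of logical row 0 (the oldest) */
};

/* Address of logical row y, where 0 is the oldest row. */
int *ring_row(struct ring_matrix *m, int y)
{
    assert(y >= 0 && y < ROWS);
    return &m->data[COLS * ((y + m->first_row) % ROWS)];
}

/* Overwrite the oldest row with new_row; it then becomes the newest row. */
void ring_push(struct ring_matrix *m, const int new_row[COLS])
{
    memcpy(&m->data[COLS * m->first_row], new_row, COLS * sizeof(int));
    m->first_row = (m->first_row + 1) % ROWS;
}
```

A push copies exactly one row, and every read pays for one extra addition and modulo.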
Say I declared an array[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Later I want it to be array[8] = {2, 3, 4, 5, 6, 7, 8, 9}.
That is, I want to discard the first 2 elements so that the array starts at what is now array[2], reallocating the array from that position.
I tried:
int *array=(int*)malloc(10*sizeof(int));
...//do stuffs
array=(int*)realloc(array[2],8*sizeof(int));
It didn't work. Neither did using &array[2] or *array[2], nor creating an auxiliary array, reallocating Array to AuxArr, and then calling free(AuxArr).
Can I get a light?
You can only realloc a pointer that was returned by malloc, calloc, or realloc. So you can realloc(array, ...), but not &array[2], since that points to a location in the middle of a memory block (and array[2] itself is an int, not a pointer at all).
You may want to try memmove instead.
Edit: In response to ThingyWotsit's comment, after memmoving the data you want to the front of the array, you can then realloc to drop off the tail end.
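A small sketch of the memmove-then-realloc sequence. The helper name drop_front is made up for illustration, and the array is assumed to come from malloc:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Drop the first `drop` elements of a malloc'ed array of `n` ints.
   Returns the (possibly moved) block holding n - drop elements, or NULL if
   realloc fails, in which case `array` is still valid and already shifted. */
int *drop_front(int *array, size_t n, size_t drop)
{
    assert(array != NULL && drop < n);
    /* memmove, not memcpy, because source and destination overlap */
    memmove(array, array + drop, (n - drop) * sizeof array[0]);
    return realloc(array, (n - drop) * sizeof array[0]);
}
```

A caller would write something like `int *tmp = drop_front(array, 10, 2); if (tmp != NULL) array = tmp;` so the original pointer is not lost if realloc fails.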
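Just use array += 2 or array = &array[2]. You can't realloc() it. Keep a copy of the original pointer, though, since that is the only value you can later pass to free() or realloc().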
I have a 1D array, and another array of the same size that should hold the same values in a different order (the order will change according to the program's situation).
For example:
array1 = {1,2,3,4,5};
hence array2 should automatically have
array2 = {4,2,3,1,5};
In other words, I want to shuffle the values according to my own reference order. But whenever the parent array1 changes, array2 should also be updated at its respective indexes. Is this even possible? Array memory mapping? Looping and saving to the other array takes time, as this operation is repeated many times. I cannot use memcpy, because the order can be different. Any pointers/help/suggestions will be appreciated.
There's no magical way to do this. What you need to do is store the actual values somewhere, and then access them through a permutation stored separately. Here's some example code that uses strings so the permutation and the values are clearly distinct:
char *strings[] = {"foo", "bar", "baz", "quux"};
size_t memory_order[] = {0, 1, 2, 3};
size_t sorted_order[] = {1, 2, 0, 3};
// Get the k'th element in the memory order:
strings[memory_order[k]];
// Get the k'th element in the sorted order:
strings[sorted_order[k]];
Not directly, no. C doesn't specify a way to do that (which makes sense to me, since most computers don't either, and C tends to be fairly close to the metal).
The typical way to solve it is to manually do the re-mapping, of course:
static const size_t map1to2[] = { 3, 1, 2, 0, 4 };
Then do the accesses to array2 through the remap:
printf("array2[3] is %d\n", array1[map1to2[3]]);
This maps the index 3 to 0, and thus prints 1.
You can use macros to make it slightly more manageable.
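For instance, a small accessor can hide the remapping. This is only a sketch; the name array2_at is made up, and the values are the ones from the question:

```c
#include <assert.h>
#include <stddef.h>

#define N 5

int array1[N] = {1, 2, 3, 4, 5};
static const size_t map1to2[N] = {3, 1, 2, 0, 4};

/* Pointer to the element of array1 that position i of "array2" refers to.
   array2 never exists as separate storage, so it cannot go out of date. */
int *array2_at(size_t i)
{
    assert(i < N);
    return &array1[map1to2[i]];
}
```

With this, `*array2_at(0)` reads 4, and after `array1[3] = 40;` the same expression reads 40 without any copying. Writing through `*array2_at(i)` updates array1 as well.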
I am new to Thrust (CUDA) and I want to do some array operations, but I can't find any similar example on the internet.
I have following two arrays (2d):
a = { {1, 2, 3}, {4} }
b = { {5}, {6, 7} }
I want Thrust to compute this array:
c = { {1, 2, 3, 5}, {1, 2, 3, 6, 7}, {1, 2, 3, 5}, {1, 2, 3, 6, 7} }
I know how to do this in C/C++, but not how to tell Thrust to do it.
Here is my idea of how it might work:
Thread 1:
Take a[0] -> expand it with b.
Write it to c.
Thread 2:
Take a[1] -> expand it with b.
Write it to c.
But I have no idea how to do that. I could write the arrays a and b to a 1D array like:
thrust::device_vector<int> dev_a;
dev_a.push_back(3); // size of first array
dev_a.push_back(1);
dev_a.push_back(2);
dev_a.push_back(3);
dev_a.push_back(1); // size of second array
dev_a.push_back(4);
thrust::device_vector<int> dev_b;
dev_b.push_back(1); // size of first array
dev_b.push_back(5);
dev_b.push_back(2); // size of second array
dev_b.push_back(6);
dev_b.push_back(7);
And the pseudo-function:
struct expand
{
__host__ __device__
?? ?? (const array ai, const array *b) {
for bi in b: // each array in the 2d array
{
c.push_back(bi[0] + ai[0]); // write down the array count
for i in ai: // each element in the ai array
c.push_back(i);
for i in bi: // each element in the bi array
c.push_back(i);
}
}
};
Anyone any idea?
I doubt you will get any speed increase on the GPU for this kind of operation, since it needs a lot of memory accesses, which are slow on a GPU.
But if you want to implement it anyway:
For the reason above, I don't think Thrust will help you with a ready-to-use algorithm. This means that you need to write your own kernel; however, you can leave memory management to Thrust.
It is usually faster to create arrays in CPU memory and, when ready, copy the whole array to the GPU. (CPU<->GPU copies are faster on long contiguous pieces of data.)
Keep in mind that a GPU runs hundreds of threads in parallel. Each thread needs to know what to read and where to write.
Global memory operations are slow (300-400 clocks). Avoid having a thread read the whole array from global memory only to find out that it needed just the last few bytes.
So, here is how I see your program:
Make your arrays 1D in a CPU memory look like this:
float array1[] = { 1, 2, 3, 4};
float array2[] = { 5, 6, 7};
int arr1offsets[] = {0, 3, 3, 1}; // position of the first element and length of subarray pairs
int arr2offsets[] = {0, 1, 1, 2};
Copy your arrays and offsets to the GPU and allocate memory for the result and its offsets. I guess you'll have to compute the maximum length of one joined subarray and allocate memory for the worst case.
Run the kernel.
Collect the results
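The kernel may look like this (if I understood your idea correctly):
// Assumes MAX_SUBARRAY_LEN is a compile-time constant defined elsewhere and
// that the kernel is launched with exactly one thread per subarray pair
// (otherwise add a bounds check on idx).
__global__ void kernel(float* arr1, int* arr1offset,
                       float* arr2, int* arr2offset,
                       float* result, int* resultoffset)
{
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    // Offsets are stored as (start, length) pairs.
    int a1beg = arr1offset[idx*2];
    int a2beg = arr2offset[idx*2];
    int a1len = arr1offset[idx*2+1];
    int a2len = arr2offset[idx*2+1];
    // Each thread writes into its own fixed-size slot of the result.
    resultoffset[idx*2] = idx*MAX_SUBARRAY_LEN;
    resultoffset[idx*2+1] = a1len+a2len;
    for (int k = 0; k < a1len; ++k) result[idx*MAX_SUBARRAY_LEN+k] = arr1[a1beg+k];
    for (int k = 0; k < a2len; ++k) result[idx*MAX_SUBARRAY_LEN+a1len+k] = arr2[a2beg+k];
}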
This code is not perfect, but should do the right thing.
How do I initialize a three-dimensional char array without pointers in C and access it?
I tried the following:
char card[1][3][15]={
{"iron","man"},
{"contagious","heide"},
{"string","middle"}
};
but I am getting
**Error:too many initializers**
**Warning: Array is only partially initialized**
Let's take a simple example. You can use your own values instead of these integers:
declaration:
int arr[2][3][4] = { { {1, 2, 3, 4}, {1, 2, 3, 4}, {1, 2, 3, 4} },
{ {1, 2, 3, 4}, {1, 2, 3, 4}, {1, 2, 3, 4} } };
I hope, it is clear to you.
Considering your example itself:
I think it should be
char card[1][3][15]={ {"iron","man", "contagious"}};
What this means is that you can effectively create 3 char arrays each of length 15. Your first dimension of 1 doesn't have much effect.
So, you can make it like
char card[2][3][15]={ {"iron","man", "contagious"},
{"iron","man", "contagious"}};
To put it simply, the number of rows gives the first dimension, the number of columns in each row gives the second dimension, and the number of elements (in this case chars) in each column gives the third dimension.
So for the data in your question, you should declare the array as char card[3][2][15].
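A short sketch of that declaration with the data from the question, plus one way to access it. The helper card_name is hypothetical and only there to show the indexing:

```c
#include <assert.h>
#include <string.h>

/* 3 groups, 2 strings per group, each string up to 14 chars plus '\0'. */
char card[3][2][15] = {
    {"iron", "man"},
    {"contagious", "heide"},
    {"string", "middle"}
};

/* card[i][j] is a whole string; card[i][j][k] is one character of it. */
const char *card_name(int i, int j)
{
    assert(i >= 0 && i < 3 && j >= 0 && j < 2);
    return card[i][j];
}
```

For example, `printf("%s\n", card[1][0]);` would print the first string of the second group, and `card[2][1][0]` is the single character 'm'.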
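char card[3][2][15]={ {"iron","man"},{"contagious","heide"},{"string","middle"}
};
The dimensions have to match the nesting of the braces: three groups, two strings in each group, and up to 14 characters plus the terminator per string. With [1][3][15], each {"iron","man"} pair would be initializing a single char[15], which is what triggers the "too many initializers" error.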
Suppose that I have
z[7]={0, 0, 2, 0, 1, 2, 1}
which means that the first observation is allocated to group 0, the second observation to group 0, the third to group 2, etc.
I want to write efficient code to get an array of 3 x ? such that the first row holds all the observations allocated to the first group, the second row all the observations allocated to the second group, etc., something like
0, 1, 3
4, 6
2, 5
and this must be general, maybe I could have
z={0, 0, 2, 0, 1, 2, 1, 3, 4, 2, 0, 4, 5, 5, 6, 7, 0}
so the number of columns is unknown
I did do my homework and the code is attached to this message, but there must be a better way to do it. I believe it can be done with pointers, but I really do not know how.
#include <stdio.h>
int main(){
int z[7]={0, 0, 2, 0, 1, 2, 1}, nj[3], iz[3][7], ip[3], i, j;
for(j=0; j<3; j++){
ip[j] = 0;
nj[j] = 0;
}
for(i=0; i <7; i++ ){
nj[z[i]] = nj[z[i]] + 1;
iz[z[i]][ip[z[i]]] = i;
ip[z[i]] = ip[z[i]] + 1;
}
for(j=0; j<3 ;j++){
for(i=0; i < nj[j]; i++){
printf("%d\t", iz[j][i]);
}
printf("\n");
}
return 0;
}
It seems that you have two tasks here.
1. To count the number of occurrences of each index in z and allocate a data structure of the right size and configuration.
2. To iterate over the data and copy it to the correct places.
At the moment you appear to have solved (1) naively by allocating a big, two dimensional array iz. That works fine if you know in advance the limits on how big it could be (and your machine will have enough memory), no need to fix this until later.
It is not clear to me exactly how (2) should be approached. Is the data currently going into iz guaranteed to consist of [0, 1, ... n ]?
If you don't know the limits of the size of iz in advance, then you will have to allocate a dynamic structure. I'd suggest a ragged array though this means two (or even three) passes over z.
What do I mean by a ragged array? An object like the argv argument to main, but in this case of type int **. In memory it looks like this:
+----+ +---+ +---+---+---+--
| iz |---->| |---->| | | ...
+----+ +---+ +---+---+---+--
| |--
+---+ \ +---+---+---+--
| . | --->| | | ...
. +---+---+---+--
.
A ragged array can be accessed with iz[][] just like it was a two-dimensional array (but it is a different type of object), which is nice for your purposes because you can tune your algorithm with the code you have now, and then slap one of these in place.
How to set it up.
Iterate over z to find the largest number, maxZ, present. The groups are then numbered 0 to maxZ, so there are maxZ+1 rows.
Allocate an array of int* of size maxZ+2: iz=calloc(maxZ+2,sizeof(int*));.
I chose calloc because it zeros the memory, which makes all those pointers NULL, but you could use malloc and NULL them yourself. Making the array one too big gives us a NULL termination, which may be useful later.
Allocate an array of counters of size maxZ+1: int *cz = calloc(maxZ+1,sizeof(int));
Iterate over z, filling cz with the number of entries needed in each row.
For each row, allocate an array of ints: for(i=0; i<=maxZ; ++i){ iz[i] = malloc(sizeof(int)*cz[i]); }
Iterate over z one last time, sticking the figures into iz as you already do. You could re-use cz at this point to keep track of how many figures have already been put into each row, but you might want to allocate a separate array for that purpose so that you have a record of how big each allocated array was.
NB: Every call to malloc or calloc ought to be accompanied by a check to ensure that the allocation worked. I've left that as an exercise for the student.
This repeated passes over z business can be avoided entirely by using dynamic arrays, but I suspect you don't need that and don't want the added complexity.
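The setup steps above could be sketched roughly as follows. The function name group_indices and its parameters are made up for illustration; the sketch assumes z holds only non-negative group numbers, sizes everything for maxZ + 1 groups, and uses a separate fill array so that the row lengths in cz survive:

```c
#include <assert.h>
#include <stdlib.h>

/* Group the indices 0..n-1 of z by value. Returns a NULL-terminated ragged
   array with maxZ + 1 rows; *counts_out receives the row lengths.
   Returns NULL on allocation failure. Assumes n > 0 and every z[i] >= 0. */
int **group_indices(const int *z, size_t n, int **counts_out)
{
    assert(z != NULL && n > 0 && counts_out != NULL);

    int maxZ = z[0];                             /* pass 1: largest group */
    for (size_t i = 1; i < n; i++)
        if (z[i] > maxZ) maxZ = z[i];
    size_t groups = (size_t)maxZ + 1;

    int **iz = calloc(groups + 1, sizeof *iz);   /* extra slot stays NULL */
    int *cz = calloc(groups, sizeof *cz);        /* row lengths */
    int *fill = calloc(groups, sizeof *fill);    /* next free slot per row */
    if (!iz || !cz || !fill) goto fail;

    for (size_t i = 0; i < n; i++)               /* pass 2: count per group */
        cz[z[i]]++;
    for (size_t g = 0; g < groups; g++) {
        /* at least one element, so an empty group is not mistaken
           for the NULL terminator */
        iz[g] = malloc((cz[g] ? cz[g] : 1) * sizeof **iz);
        if (!iz[g]) goto fail;
    }
    for (size_t i = 0; i < n; i++)               /* pass 3: store indices */
        iz[z[i]][fill[z[i]]++] = (int)i;

    free(fill);
    *counts_out = cz;
    return iz;

fail:
    if (iz)
        for (size_t g = 0; g < groups; g++) free(iz[g]);
    free(iz); free(cz); free(fill);
    return NULL;
}
```

The caller owns the result: free each row (for example with `for (g = 0; iz[g] != NULL; g++) free(iz[g]);`), then iz itself and the counts array.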