How to remove elements from a static array effectively in C

In C, suppose we have an array A[8] = {3, 5, 6, 8, 2, 9, 10, 1}. How can we remove items by index range? For instance, to "remove" elements from index 3 to 6: since we have a static array, after handling we expect A[8] = {3, 5, 6, 1, 0, 0, 0, 0}.
For removing a single element at a specified index, we currently do:
for (i = index; i < size - 1; i++)
{
    A[i] = A[i + 1];
}
size = size - 1;
What's the best solution to this problem, supposing the array size can be very large?

With an array, copying is all you can do. You can copy element by element, or you can use memmove, which permits overlapping memory regions (memcpy is not safe for overlapping copies). memmove is generally more efficient than a hand-written loop (though optimizing compilers may close that gap). You can use either method to copy to a temporary space, or to copy/move in place. In all cases, you must make sure your vacated cells are cleared.
// an example that assumes you have a valid index and size
memmove(&A[index], &A[index + 1], (size - (index + 1)) * sizeof(A[0]));
A[--size] = 0;
The big-O is still O(N) (linear with respect to size). With very large arrays, the cost of deleting goes up. If that is unacceptable, you need a different data structure.
So, consider all your requirements which include cost of inserting, cost of finding/changing, and cost of deleting elements. Choose your data structure to match your requirements. When the data set is small or there are relatively few transactions, the performance differences may not matter to your requirements.
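To extend the single-element example above to the range removal the question actually asks about (indices 3 to 6, zeroing the vacated tail), a minimal sketch could look like this; the helper name remove_range is my own, not from the answer above:
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: "remove" elements [first, last] by shifting the tail
 * forward with memmove and zero-filling the vacated cells. */
static void remove_range(int *a, size_t size, size_t first, size_t last)
{
    size_t removed = last - first + 1;      /* number of cells vacated  */
    size_t trailing = size - (last + 1);    /* elements after the range */

    memmove(&a[first], &a[last + 1], trailing * sizeof a[0]);
    memset(&a[first + trailing], 0, removed * sizeof a[0]);
}

int main(void)
{
    int A[8] = {3, 5, 6, 8, 2, 9, 10, 1};

    remove_range(A, 8, 3, 6);               /* expect 3 5 6 1 0 0 0 0 */
    for (size_t i = 0; i < 8; i++)
        printf("%d ", A[i]);
    printf("\n");
    return 0;
}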

One way to remove elements from the array is to create a second array,
and then copy the elements from (start, left) and (right, end) into that second array.
Here (left, right) are the indices of the subarray you want to remove.
Another way is to overwrite elements instead of creating another array.
If you need to maintain order, care must be taken while overwriting;
otherwise, simply copy (right - left) elements from the end of the array into positions left..right and set those source elements at the end to zero.
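A rough sketch of that second, order-destroying variant; the half-open [left, right) interface and the returned logical size are my own choices for illustration:
#include <stdio.h>

/* Hypothetical sketch: fill the removed slots with elements taken from the
 * end of the array. Order is NOT preserved. Removes the half-open index
 * range [left, right), zeroes the vacated tail, returns the new logical size. */
static size_t remove_range_unordered(int *a, size_t size, size_t left, size_t right)
{
    size_t new_size = size - (right - left);   /* logical size after removal */
    size_t src = size;                         /* donors come from the back  */

    for (size_t dst = left; dst < right && dst < new_size; dst++)
        a[dst] = a[--src];                     /* src never falls inside [left, right) */

    for (size_t i = new_size; i < size; i++)   /* clear everything past the new end */
        a[i] = 0;
    return new_size;
}

int main(void)
{
    int A[8] = {3, 5, 6, 8, 2, 9, 10, 1};

    remove_range_unordered(A, 8, 3, 7);        /* drop indices 3..6 */
    for (size_t i = 0; i < 8; i++)
        printf("%d ", A[i]);                   /* 3 5 6 1 0 0 0 0 */
    printf("\n");
    return 0;
}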

You should use memcpy and memset for readability, but also because they get optimized well by the compiler. (One caveat: if the trailing segment is longer than the removed segment, the source and destination of the copy overlap, and memmove must be used instead of memcpy.)
Example:
#include <stdio.h>
#include <string.h>

void array_print (const int* arr, size_t size)
{
    for (size_t i = 0; i < size; i++)
    {
        printf("%d ", arr[i]);
    }
    printf("\n");
}

void array_remove (int* arr, size_t size, size_t index, size_t rem_size)
{
    int* begin = arr + index;                     // beginning of segment to remove
    int* end = arr + index + rem_size;            // end of segment to remove
    size_t trail_size = size - index - rem_size;  // size of the trailing items after segment

    memcpy(begin,                  // move data to beginning
           end,                    // from end of segment
           trail_size * sizeof(int));
    memset(begin + trail_size,     // from the new end of the array
           0,                      // set everything to zero
           rem_size * sizeof(int));
}

int main (void)
{
    int array [8] = {3, 5, 6, 8, 2, 9, 10, 1};
    const int size = sizeof(array) / sizeof(*array);

    array_print(array, size);
    array_remove(array, size, 3, 4); // from index 3, remove 4 items
    array_print(array, size);
    return 0;
}

Though memset, memcpy and memmove are options, if you want to implement something of your own, it could be something like this.
#include <stdio.h>
#define MAX_ARRAY_INDEX 10
static int array[10] = {1,5,1,4,8,9,6,3,7,2};
int Pull(int *arr, int from, int to, int count)
{
    int iter = 0;

    if ((from < 0) || (to < 0))
        return 0;
    /* the array has MAX_ARRAY_INDEX elements, so from/to plus count must stay in bounds */
    if ((from + count > MAX_ARRAY_INDEX) || (to + count > MAX_ARRAY_INDEX))
        return 0;

    for (iter = 0; iter < count; iter++, from++, to++)
    {
        arr[to] = arr[from];
        arr[from] = 0;
    }
    return 1;
}

int main()
{
    int i = 0;

    for (i = 0; i < 10; i++)
    {
        printf(" %d,", array[i]);
    }
    printf("\n\n");

    if (Pull(array, 6, 2, 3))
        printf("\nIndexes pulled!");
    else
        printf("\nError: invalid indexes"); /* error handling; this branch could be omitted */

    for (i = 0; i < 10; i++)
    {
        printf(" %d,", array[i]);
    }
    printf("\n\n");

    Pull(array, 9, 7, 2); /* rejected by the bounds check: 9 + 2 would run past the array */

    for (i = 0; i < 10; i++)
    {
        printf(" %d,", array[i]);
    }
    printf("\n\n");
    return 0;
}
With this code you need to take care to pull the indexes completely. Maybe instead of using a count check in the for() loop, you could run it up to MAX_ARRAY_INDEX?

You can't truly remove elements from an array unless it is dynamically allocated.
Google linked lists.
The best you can do is overwrite values to get the answer, such as (note that A.length does not exist in C, so the length is written out explicitly):
int indexStart = 3;
int indexEnd = 6;
int length = 8; /* C arrays do not carry their length */
for (int i = indexStart; i < length; i++) {
    if (indexEnd + 1 < length)
        A[i] = A[indexEnd + 1];
    else
        A[i] = 0;
    indexEnd++;
}

Related

What is this sorting algorithm?

I happened to write a simple sorting algorithm, but I am not sure what this algorithm is called.
#include<stdio.h>
#include<stdlib.h>
void IDontKnowWhatThisIs(int* arr, int size){
int* minuscount = malloc(size * sizeof(int)); //new location chooser array
int* valarr = malloc(size * sizeof(int)); //value backup array
//compare all elements: size^2
for (int i = 0; i < size; i++){
valarr[i] = arr[i];
minuscount[i] = 0;
for (int j = 0; j < size; j++){
if (i != j){
//the one with the least amount(0) is the smallest value
if (arr[i] - arr[j] > 0){
minuscount[i] += 1;
}
}
}
}
//O(size)
for (int i = 0; i < size; i++){
//place everything back in
arr[minuscount[i]] = valarr[i];
}
free(minuscount);
free(valarr);
//total time complexity: O(size^2)
}
int main(){
int arr[10] = { 50, 2, 13, 33, 62, 11, 30, 66, 1, -101 };
IDontKnowWhatThisIs(arr, 10);
for (int i = 0; i < 10; i++) printf("%d ", arr[i]);
return 0;
}
It is a simple algorithm that compares each element with every other element and counts a new location for it,
which is then copied back into the original array.
I don't think it is one of those generic n^2 algorithms (selection, bubble, insertion), but the concept is still very simple, so I am sure this algorithm already exists.
Edit: on second thought, I think this is similar to a selection sort, but unoptimized, as it does even more comparisons.
I am not aware of a name for this algorithm. It's clever, but unfortunately you need to add an extra step if you want to handle possible duplicates in the array.
For instance, if the array is: [3;4;4;1;2] then minuscount will be [2;3;3;0;1] and the two 4 will be put in the same cell in arr, resulting in the final array [1;2;3;4;2] where that final 2 is leftover from the original array.
I don't know a name either. I would call it RankSort, because it computes the rank of every element in order to permute each one to its sorted location.
This sort is not very attractive because
it takes two extra arrays, one for the ranks and one as a buffer for permutation (the buffer can be avoided by implementing the permutation in-place);
as said by others, possible equal elements require special handling, namely a lexicographical comparison on value then index. This has a cost;
it performs all N² comparisons. (This can be reduced to N(N-1)/2 by comparing each pair once and updating the rank of the larger element.)
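For what it's worth, here is a C sketch of the rank idea with the (value, index) tie-break mentioned above, so that equal elements get distinct ranks; the name rank_sort is made up for illustration:
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical rank sort: compute each element's rank, then permute.
 * Ties are broken by original index, so equal values get distinct ranks. */
static void rank_sort(int *arr, size_t n)
{
    size_t *rank = malloc(n * sizeof *rank);
    int *copy = malloc(n * sizeof *copy);
    if (rank == NULL || copy == NULL) { free(rank); free(copy); return; }

    for (size_t i = 0; i < n; i++) {
        copy[i] = arr[i];
        rank[i] = 0;
        for (size_t j = 0; j < n; j++) {
            /* count elements that come strictly before arr[i]:
             * smaller value, or equal value with smaller index */
            if (arr[j] < arr[i] || (arr[j] == arr[i] && j < i))
                rank[i]++;
        }
    }
    for (size_t i = 0; i < n; i++)
        arr[rank[i]] = copy[i];   /* the ranks form a permutation of 0..n-1 */

    free(rank);
    free(copy);
}

int main(void)
{
    int a[] = {3, 4, 4, 1, 2};
    rank_sort(a, sizeof a / sizeof a[0]);
    for (size_t i = 0; i < 5; i++)
        printf("%d ", a[i]);      /* expected: 1 2 3 4 4 */
    printf("\n");
    return 0;
}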

Deleting Elements from an Array based on their Position every 3 Elements

I created an array called elements_n which has the elements 0 to N-1 where N is 2. The below numbers are the elements of the array called elements_n:
0 1
I have another array called Arr which has the following elements:
0 1 3 1 2 4
If any of the first 3 elements of the array Arr are equal to the first element of elements_n which is 0, I would like to delete that element from the array called Arr. I then repeat the same process for the next 3 elements of the array Arr. So to explain myself better, I will use the following example:
Compare the first 3 elements of array Arr which are 0, 1, 3 to the first element of elements_n which is 0. Since Arr[0] == elements_n[0]. I delete Arr[0] from the array Arr.
Compare the next 3 elements of array Arr which are 1, 2, 4 to the second element of elements_n which is 1. Since Arr[3] == elements_n[1]. I delete Arr[3] from the array Arr. So the elements that should be left in the array Arr are:
1 3 2 4
When I implemented it myself in C with the code found below, the end result comes out as:
1 3 3 2 2 4
Rather than:
1 3 2 4
This is the code I implemented:
#include <stdio.h>
#include <stdlib.h>
#define N 2
int main() {
unsigned *elements_n = malloc(N * sizeof(unsigned));
for (int i = 0; i < N; i++) {
elements_n[i] = i; //Created an array which has the elements 0 to N-1
}
printf("\n");
unsigned Arr[6] = { 0, 1, 3, 1, 2, 4 };
unsigned position_indices[2] = { 3, 3 }; //Moving every 3 elements in the Arr array.
int count = 0;
int index = 0;
unsigned *ptr_Arr = &Arr[0];
do {
for (int i = 0; i < position_indices[count]; i++) {
if (ptr_Arr[i] == elements_n[count]) {
index = i + 1; //Index of the Arr element that has the same value as the element in the array elements_n
for (int j = index - 1; j < position_indices[count] - 1; j++) {
ptr_Arr[j] = ptr_Arr[j + 1];
}
}
}
printf("\n");
ptr_Arr += position_indices[count] - 1;
count++;
} while (count < 2);
for (int i = 0; i < 6; i++) {
printf("%d\t", Arr[i]);
}
printf("\n");
free(elements_n);
return 0;
}
You might try something like this (not tested).
#include <stdio.h>
#include <stdlib.h>
#define N 2
int main()
{
unsigned *elements_n = malloc(N * sizeof(unsigned));
for (int i = 0; i < N; i++)
{
elements_n[i] = i; //Created an array which has the elements 0 to N-1
}
unsigned Arr[6] = { 0, 1, 3, 1, 2, 4 };
int dest_index = 0;
int src_index = 0;
int count = sizeof(Arr)/sizeof(Arr[0]);
for ( ; src_index < count; src_index++)
{
int group = src_index / 3;
if (Arr[src_index] != elements_n[group])
{
Arr[dest_index++] = Arr[src_index];
}
}
for (int i = 0; i < dest_index; i++)
{
printf("%d\t", Arr[i]);
}
printf("\n");
free(elements_n);
return 0;
}
You need to keep track of how many elements you removed from the array.
My solution:
#include <stdio.h>
#include <stddef.h>
#include <assert.h>
#include <string.h>
size_t fancy_delete_3(const int elems[], size_t elemssize, int arr[], size_t arrsize)
{
assert(elems != NULL);
assert(arr != NULL);
assert(arrsize%3 == 0);
assert(elemssize*3 == arrsize);
// we need to count the removed elements, to know how much we need to shift left
size_t removed = 0;
// for each element in elems
for (size_t i = 0; i < elemssize; ++i) {
// check the three corresponding elements in arr
for (size_t j = i*3; j < (i+1)*3; ++j) {
assert(j >= removed);
const size_t pos = j - removed;
// if elems[i] matches any of the corresponding element in arr
if (elems[i] == arr[pos]) {
// remove element at position pos
assert(arrsize >= pos + 1);
// I don't think this can ever overflow
memmove(&arr[pos], &arr[pos + 1], (arrsize - pos - 1) * sizeof(int));
++removed;
// array is one element shorter, so we can just decrease the array size
assert(arrsize > 0);
--arrsize;
}
}
}
// we return the new size of the array
return arrsize;
}
#define __arraycount(x) sizeof(x)/sizeof(x[0])
int main()
{
int elements_n[] = {0,1};
int arr[] = {0,1,3, 1,2,4};
size_t newsize = fancy_delete_3(elements_n, __arraycount(elements_n), arr, __arraycount(arr));
printf("arr size=%zu {", newsize);
for (size_t i = 0; i < newsize; ++i)
printf("%d,", arr[i]);
printf("}\n");
return 0;
}
You have several related problems around how you perform deletions. In the first place, it's not clear that you understand that you cannot actually delete anything from a C array. The closest you can come is to overwrite it with something else. Often, pseudo-deletion from an array is implemented by moving each of the elements following the deleted one one position forward, and reducing the logical length of the array.* You seem to have chosen this alternative, but (problem 1) you miss maintaining or updating a logical array length.
Your problem is made a bit more complicated by the fact that you logically subdivide your array into segments, and you seem not to appreciate that your segments are variable-length in that, as described, they shrink when you delete an element. This follows from the fact that deleting an element from one group does not change the assignments of elements to other groups. You do have a mechanism in position_indices that apparently serves to track the sizes of the groups, and in that sense its name seems ill-fitting. In the same way that you need to track and update the logical length of the overall array, you'll need to track and update the lengths of the groups.
Finally, you appear to have an off-by-one error here:
for (int j = index - 1; j < position_indices[count]-1; j++)
that would be clearer if position_indices were better named (see above), but recognizing that what it actually contains is the size of each group, and that index and j represent indices within the group, it follows that the boundary condition for the iteration should instead be just j < position_indices[count]. That's moot, however, because you're going to need a somewhat different approach here anyway.
Suggestion, then:
When you delete an element from a group, move up the entire tail of the array, not just the tail of the group.
In service to that, update both group size and logical array size when you perform a deletion, remembering that that affects also where each subsequent group starts.
When you examine or output the result, remember to disregard array elements past the logical end of the array.
* "Logical array size" means the number of (leading) elements that contain meaningful data. In your case, the logical array size is initially the same as the physical array size, but each time you delete an element (and therefore move up the tail) you reduce the logical size by one.
You never really "deleted" any element, you just shifted down the second and third elements from each 3-group. And then, when printing, you iterated over the whole array.
Arrays are just contiguous blocks of memory, so you need a different strategy. You could traverse the original array with two indexes, one as the general index counter and the other as the index for the shifted slot. If the current element is different from the corresponding elements_n item, copy it to the secondary index and increment it.
In case nothing is equal to the elements_n item, you would just reassign the array elements to themselves. However, as soon as one is equal you will be shifting them, with the advantage of keeping track of the new size.
Also, calculating the corresponding elements_n item is a simple matter of dividing the current index by 3, so you don't even need an extra variable for that.
#include <stdio.h>
#define N 2
int main()
{
unsigned elements_n[N] = { 0, 1 };
unsigned Arr[N*3] = { 0, 1, 3, 1, 2, 4 };
int i, j;
for (i = 0, j = 0; i < N*3; i++)
if (Arr[i] != elements_n[i/3])
Arr[j++] = Arr[i];
for (i = 0; i < j; i++)
printf(" %d", Arr[i]);
printf("\n");
return 0;
}

Shuffle an array of strings randomly

I am writing a program that reads a text file as an input and randomly shuffles the array of strings for the user.
I have written a program that shuffles the string array randomly but I want to do it in a way that no two elements that are the same are beside each other.
Here's an example:
The original array would look like this
{1,2,3,4,5,1,2}
The shuffled array would look like this
{5,3,1,2,4,2,1}
But currently my program creates an output array of this
{5,1,1,3,2,4,2}
Here is my code that shuffles the elements randomly:
int i, j;
char s[11][100];
char line[100], t[100];
/*Open the text file*/
FILE *fp;
fp = fopen("players.txt", "r");
/*Read each line and put it into an element in an array.
Each line will be in a separate element in the array.*/
i=0;
while(fgets(line, 100, fp)!= NULL){
strcpy(s[i], line);
i++;
}
/*Generates a random number stored in j and shuffles the order of the array randomly*/
for(i=1; i<10; i++){
j = rand()%(i+1);
strcpy(t, s[j]);
strcpy(s[j], s[i]);
strcpy(s[i], t);
}
As far as I know, there is no better solution than repeatedly running the Fisher-Yates shuffle until you find an arrangement without adjacent duplicates. (That's usually called a rejection strategy.)
The amount of time this will take depends on the probability that a random shuffle has adjacent duplicates, which will be low if there are few duplicates and could be as much as 1.0 if more than half of the set is the same majority element. Since the rejection strategy never terminates if there is no possible qualifying arrangement, it could be worth the trouble to verify that a solution is possible, which means that there is no majority element. There's an O(n) algorithm for that, if necessary, but given the precise details you provided, it shouldn't be necessary (yet).
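That O(n) feasibility check is usually done with Boyer-Moore majority voting; here is a rough sketch adapted to strings, assuming (as above) that strcmp is how duplicates are detected, with the function name being my own:
#include <string.h>
#include <stddef.h>

/* Boyer-Moore voting, adapted to check whether one string occurs so often
 * (more than ceil(n/2) times) that no adjacent-duplicate-free order exists. */
static int rearrangement_impossible(const char *names[], size_t n)
{
    if (n == 0)
        return 0;

    const char *candidate = NULL;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {          /* pass 1: find the only possible candidate */
        if (count == 0) { candidate = names[i]; count = 1; }
        else if (strcmp(names[i], candidate) == 0) count++;
        else count--;
    }

    count = 0;                                /* pass 2: count the candidate exactly */
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], candidate) == 0) count++;

    return 2 * count > n + 1;                 /* more than ceil(n/2) occurrences */
}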
You can reject immediately rather than continuing to the end of the shuffle, which significantly cuts down on the cost of running the algorithm. So just use your shuffle algorithm, but restart the counter if you place an element beside one of its twins.
By the way, using strcpy to move elements around is really inefficient. Just shuffle the pointers.
Here's some code adapted from this answer. I've assumed that the duplicates are exact, for simplicity; perhaps you have some other way of telling (like looking only at the first word):
void shuffle(const char* names[], size_t n) {
for (size_t i = 0; i < n;) {
size_t j = i + rand() % (n - i);
/* Reject this shuffle if the element we're about to place
* is the same as the previous one
*/
if (i > 0 && strcmp(names[j], names[i-1]) == 0)
i = 0;
else {
/* Otherwise, place element i and move to the next one*/
const char* t = names[i];
names[i] = names[j];
names[j] = t;
++i;
}
}
}
For your use case, where you have 10 objects with frequencies 3, 3, 2, and 2, there are 605,376 valid arrangements out of 3,628,800 (10!) total arrangements, so about five of every six shuffles will be rejected before you find a valid arrangement, on average. However, the early termination means that you will do less than six times as much work as a single shuffle; empirical results indicate that it takes about 33 swaps to produce a valid shuffle of 10 objects with the above frequencies.
Note: rand()%k is not a very good way to generate a uniform distribution of integers from 0 to k-1. You'll find lots of advice about that on this site.
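As a follow-up to that note, the usual fix is rejection sampling on top of rand(); a small sketch of one such helper (not part of the answer above, and assuming RAND_MAX + 1 fits in a size_t) could be:
#include <stdlib.h>

/* Return a uniformly distributed integer in [0, k), assuming 0 < k <= RAND_MAX + 1.
 * Values from the "uneven" top of rand()'s range are rejected and redrawn. */
static size_t uniform_index(size_t k)
{
    size_t span = (size_t)RAND_MAX + 1;
    size_t limit = span - span % k;   /* largest multiple of k not exceeding span */
    size_t r;
    do {
        r = (size_t)rand();
    } while (r >= limit);
    return r % k;
}

/* usage in the shuffle above would be: size_t j = i + uniform_index(n - i); */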
import java.util.Random;
import java.util.Arrays;
public class ShuffleRand
{
static void randomize( int arr[], int n)
{
Random r = new Random();
for (int i = n-1; i > 0; i--) {
int j = r.nextInt(i+1);
int temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
System.out.println(Arrays.toString(arr));
}
public static void main(String[] args)
{
int[] arr = {1, 2, 3, 4, 5, 6, 7, 8};
int n = arr.length;
randomize (arr, n);
}
}
This function will shuffle an array of strings randomly (note that seeding with srand is better done once at program startup rather than on every call):
void shuffle(char *arr[], int size)
{
srand(time(NULL));
for (int i = 0; i < size; i++)
{
int b = rand() % size;
int a = rand() % size;
char *tmp = arr[a];
arr[a] = arr[b];
arr[b] = tmp;
}
}

Finding cyclic single transposition vector in C

I have the input as array A = [ 2,3,4,1]
The output is simply all permutations of the elements of A that can be reached by a single transposition (a single swap of two neighbouring elements). So the output is:
[3,2,4,1],[ 2,4,3,1],[2,3,1,4],[1,3,4,2]
Circular transposition is allowed, hence [2,3,4,1] ==> [1,3,4,2] is allowed and a valid output.
How to do it in C?
EDIT
In python, it would be done as follows:
def Transpose(alist):
leveloutput = []
n = len(alist)
for i in range(n):
x=alist[:]
x[i],x[(i+1)%n] = x[(i+1)%n],x[i]
leveloutput.append(x)
return leveloutput
This solution uses dynamic memory allocation; this way you can do it for an array of any size.
int *swapvalues(const int *const array, size_t size, int left, int right)
{
int *output;
int stored;
output = malloc(size * sizeof(int));
if (output == NULL) /* check for success */
return NULL;
/* copy the original values into the new array */
memcpy(output, array, size * sizeof(int));
/* swap the requested values */
stored = output[left];
output[left] = output[right];
output[right] = stored;
return output;
}
int **transpose(const int *const array, size_t size)
{
int **output;
int i;
int j;
/* generate a swapped copy of the array. */
output = malloc(size * sizeof(int *));
if (output == NULL) /* check success */
return NULL;
j = 0;
for (i = 0 ; i < size - 1 ; ++i)
{
/* allocate space for `size` ints */
output[i] = swapvalues(array, size, j, 1 + j);
if (output[i] == NULL)
goto cleanup;
/* in the next iteration swap the next two values */
j += 1;
}
/* do the same to the first and last element now */
output[i] = swapvalues(array, size, 0, size - 1);
if (output[i] == NULL)
goto cleanup;
return output;
cleanup: /* some malloc call returned NULL, clean up and exit. */
if (output == NULL)
return NULL;
for (j = i ; j >= 0 ; j--)
free(output[j]);
free(output);
return NULL;
}
int main()
{
int array[4] = {2, 3, 4, 1};
int i;
int **permutations = transpose(array, sizeof(array) / sizeof(array[0]));
if (permutations != NULL)
{
for (i = 0 ; i < 4 ; ++i)
{
int j;
fprintf(stderr, "[ ");
for (j = 0 ; j < 4 ; ++j)
{
fprintf(stderr, "%d ", permutations[i][j]);
}
fprintf(stderr, "] ");
free(permutations[i]);
}
fprintf(stderr, "\n");
}
free(permutations);
return 0;
}
Although some people think goto is evil, this is a very nice use for it. Don't use it to control the flow of your program (for instance, to create a loop); that is confusing. But for the exit point of a function that has to do several things before returning, I think it's actually a nice use. That's my opinion; for me it makes the code easier to understand, though I might be wrong.
Have a look at this code I have written, with an example:
void transpose() {
int arr[] = {3, 5, 8, 1};
int l = sizeof (arr) / sizeof (arr[0]);
int i, j, k;
for (i = 0; i < l; i++) {
j = (i + 1) % l;
int copy[l];
for (k = 0; k < l; k++)
copy[k] = arr[k];
int t = copy[i];
copy[i] = copy[j];
copy[j] = t;
printf("{%d, %d, %d, %d}\n", copy[0], copy[1], copy[2], copy[3]);
}
}
Sample Output :
{5, 3, 8, 1}
{3, 8, 5, 1}
{3, 5, 1, 8}
{1, 5, 8, 3}
A few notes:
a single memory block is preferred to, say, an array of pointers because of better locality and less heap fragmentation;
the cyclic transposition is only one, it can be done separately, thus avoiding the overhead of the modulo operator in each iteration.
Here's the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int *single_transposition(const int *a, unsigned int n) {
// Output size is known, can use a single allocation
int *out = malloc(n * n * sizeof(int));
// Perform the non-cyclic transpositions
int *dst = out;
for (int i = 0; i < n - 1; ++i) {
memcpy(dst, a, n * sizeof (int));
int t = dst[i];
dst[i] = dst[i + 1];
dst[i + 1] = t;
dst += n;
}
// Perform the cyclic transposition, no need to impose the overhead
// of the modulo operation in each of the above iterations.
memcpy(dst, a, n * sizeof (int));
int t = dst[0];
dst[0] = dst[n-1];
dst[n-1] = t;
return out;
}
int main() {
int a[] = { 2, 3, 4, 1 };
const unsigned int n = sizeof a / sizeof a[0];
int *b = single_transposition(a, n);
for (int i = 0; i < n * n; ++i)
printf("%d%c", b[i], (i % n) == n - 1 ? '\n' : ' ');
free(b);
}
There are many ways to tackle this problem, and the most important questions are: how are you going to consume the output, and how variable is the size of the array? You've already said the array is going to be very large, therefore I assume memory, not CPU, will be the biggest bottleneck here.
If the output is going to be used only a few times (especially just once), it may be best to use a functional approach: generate every transposition on the fly, and never have more than one in memory at a time. For this approach many high-level languages would work as well as (maybe sometimes even better than) C.
If size of the array is fixed, or semi-fixed (eg few sizes known at compile-time), you can define structures, using C++ templates at best.
If size is dynamic and you still want to have every transposition in memory, then you should allocate one huge memory block and treat it as a contiguous array of arrays. This is very simple and straightforward at the machine level. Unfortunately it's best tackled using pointer arithmetic, one feature of C/C++ that is renowned for being difficult to understand. (It isn't if you learn C from the basics, but people jumping down from high-level languages have a proven track record of getting it completely wrong the first time.)
The other approach is to have a big array of pointers to smaller arrays, which results in a double pointer, the **, which is even more terrifying to newcomers.
Sorry for long post which is not a real answer, but IMHO there are too many questions left open for choosing the best solution and I feel you need bit more C basic knowledge to manage them on your own.
/edit:
As other solutions are already posted, here's a solution with minimum memory footprint. This is the most limiting approach: it uses the same single buffer over and over, and you must be sure that your code is finished with the first transposition before moving on to the next one. On the bright side, it'll still work just fine where other solutions would require a terabyte of memory. It's also so undemanding that it might as well be implemented in a high-level language. I insisted on using C++ in case you would like to have more than one matrix at a time (e.g. comparing them OR running several threads concurrently).
#define NO_TRANSPOSITION -1
class Transposable1dMatrix
{
private:
int * m_pMatrix;
int m_iMatrixSize;
int m_iCurrTransposition;
//transposition N means that elements N and N+1 are swapped
//transpostion -1 means no transposition
//transposition (size-1) means cyclic transpostion
//as usual in C (size-1) is the last valid index
public:
Transposable1dMatrix(int MatrixSize)
{
m_iMatrixSize = MatrixSize;
m_pMatrix = new int[m_iMatrixSize];
m_iCurrTransposition = NO_TRANSPOSITION;
}
int* GetCurrentMatrix()
{
return m_pMatrix;
}
bool IsTransposed()
{
return m_iCurrTransposition != NO_TRANSPOSITION;
}
void ReturnToOriginal()
{
if(!IsTransposed())//already in original state, nothing to do here
return;
//apply same transpostion again to go back to original
TransposeInternal(m_iCurrTransposition);
m_iCurrTransposition = NO_TRANSPOSITION;
}
void TransposeTo(int TranspositionIndex)
{
if(IsTransposed())
ReturnToOriginal();
TransposeInternal(TranspositionIndex);
m_iCurrTransposition = TranspositionIndex;
}
private:
void TransposeInternal(int TranspositionIndex)
{
int Swap1 = TranspositionIndex;
int Swap2 = TranspositionIndex+1;
if(Swap2 == m_iMatrixSize)
Swap2 = 0;//this is the cyclic one
int tmp = m_pMatrix[Swap1];
m_pMatrix[Swap1] = m_pMatrix[Swap2];
m_pMatrix[Swap2] = tmp;
}
};
int main(void)
{
int arr[] = {2, 3, 4, 1};
int size = 4;
//allocate
Transposable1dMatrix* test = new Transposable1dMatrix(size);
//fill data
memcpy(test->GetCurrentMatrix(), arr, size * sizeof (int));
//run test
for(int x = 0; x<size;x++)
{
test->TransposeTo(x);
int* copy = test->GetCurrentMatrix();
printf("{%d, %d, %d, %d}\n", copy[0], copy[1], copy[2], copy[3]);
}
}

Algorithm: efficient way to remove duplicate integers from an array

I got this problem from an interview with Microsoft.
Given an array of random integers,
write an algorithm in C that removes
duplicated numbers and return the unique numbers in the original
array.
E.g Input: {4, 8, 4, 1, 1, 2, 9} Output: {4, 8, 1, 2, 9, ?, ?}
One caveat is that the expected algorithm should not require the array to be sorted first. And when an element has been removed, the following elements must be shifted forward as well. Anyway, the values of elements at the tail of the array, where elements were shifted forward, don't matter.
Update: The result must be returned in the original array and helper data structure (e.g. hashtable) should not be used. However, I guess order preservation is not necessary.
Update2: For those who wonder why these impractical constraints, this was an interview question and all these constraints are discussed during the thinking process to see how I can come up with different ideas.
A solution suggested by my girlfriend is a variation of merge sort. The only modification is that during the merge step, duplicated values are simply discarded. This solution would also be O(n log n). In this approach, the sorting and duplicate removal are combined. However, I'm not sure if that makes any difference.
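A sketch of what that modified merge sort could look like in C, assuming sorted output is acceptable (the update above relaxes order preservation) and using an O(n) temporary buffer:
#include <stdio.h>
#include <string.h>

/* Sorts a[0..n) and removes duplicates during the merge steps.
 * Returns the number of unique elements, which end up at the front of a.
 * tmp must have room for n ints. */
static size_t msort_unique(int *a, size_t n, int *tmp)
{
    if (n < 2)
        return n;

    size_t half = n / 2;
    size_t left_len  = msort_unique(a, half, tmp);
    size_t right_len = msort_unique(a + half, n - half, tmp);
    int *left = a, *right = a + half;

    /* merge two sorted, duplicate-free runs, keeping one copy of equal values */
    size_t i = 0, j = 0, out = 0;
    while (i < left_len && j < right_len) {
        if (left[i] < right[j])       tmp[out++] = left[i++];
        else if (right[j] < left[i])  tmp[out++] = right[j++];
        else { tmp[out++] = left[i]; i++; j++; }
    }
    while (i < left_len)  tmp[out++] = left[i++];
    while (j < right_len) tmp[out++] = right[j++];

    memcpy(a, tmp, out * sizeof *a);
    return out;
}

int main(void)
{
    int a[] = {4, 8, 4, 1, 1, 2, 9};
    int tmp[7];
    size_t n = msort_unique(a, 7, tmp);
    for (size_t i = 0; i < n; i++)
        printf("%d ", a[i]);   /* 1 2 4 8 9 */
    printf("\n");
    return 0;
}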
I've posted this once before on SO, but I'll reproduce it here because it's pretty cool. It uses hashing, building something like a hash set in place. It's guaranteed to be O(1) in auxiliary space (the recursion is a tail call), and is typically O(N) time complexity. The algorithm is as follows:
Take the first element of the array, this will be the sentinel.
Reorder the rest of the array, as much as possible, such that each element is in the position corresponding to its hash. As this step is completed, duplicates will be discovered. Set them equal to sentinel.
Move all elements for which the index is equal to the hash to the beginning of the array.
Move all elements that are equal to sentinel, except the first element of the array, to the end of the array.
What's left between the properly hashed elements and the duplicate elements will be the elements that couldn't be placed in the index corresponding to their hash because of a collision. Recurse to deal with these elements.
This can be shown to be O(N) provided no pathological scenario in the hashing: Even if there are no duplicates, approximately 2/3 of the elements will be eliminated at each recursion. Each level of recursion is O(n) where small n is the amount of elements left. The only problem is that, in practice, it's slower than a quick sort when there are few duplicates, i.e. lots of collisions. However, when there are huge amounts of duplicates, it's amazingly fast.
Edit: In current implementations of D, hash_t is 32 bits. Everything about this algorithm assumes that there will be very few, if any, hash collisions in full 32-bit space. Collisions may, however, occur frequently in the modulus space. However, this assumption will in all likelihood be true for any reasonably sized data set. If the key is less than or equal to 32 bits, it can be its own hash, meaning that a collision in full 32-bit space is impossible. If it is larger, you simply can't fit enough of them into 32-bit memory address space for it to be a problem. I assume hash_t will be increased to 64 bits in 64-bit implementations of D, where datasets can be larger. Furthermore, if this ever did prove to be a problem, one could change the hash function at each level of recursion.
Here's an implementation in the D programming language:
void uniqueInPlace(T)(ref T[] dataIn) {
uniqueInPlaceImpl(dataIn, 0);
}
void uniqueInPlaceImpl(T)(ref T[] dataIn, size_t start) {
if(dataIn.length - start < 2)
return;
invariant T sentinel = dataIn[start];
T[] data = dataIn[start + 1..$];
static hash_t getHash(T elem) {
static if(is(T == uint) || is(T == int)) {
return cast(hash_t) elem;
} else static if(__traits(compiles, elem.toHash)) {
return elem.toHash;
} else {
static auto ti = typeid(typeof(elem));
return ti.getHash(&elem);
}
}
for(size_t index = 0; index < data.length;) {
if(data[index] == sentinel) {
index++;
continue;
}
auto hash = getHash(data[index]) % data.length;
if(index == hash) {
index++;
continue;
}
if(data[index] == data[hash]) {
data[index] = sentinel;
index++;
continue;
}
if(data[hash] == sentinel) {
swap(data[hash], data[index]);
index++;
continue;
}
auto hashHash = getHash(data[hash]) % data.length;
if(hashHash != hash) {
swap(data[index], data[hash]);
if(hash < index)
index++;
} else {
index++;
}
}
size_t swapPos = 0;
foreach(i; 0..data.length) {
if(data[i] != sentinel && i == getHash(data[i]) % data.length) {
swap(data[i], data[swapPos++]);
}
}
size_t sentinelPos = data.length;
for(size_t i = swapPos; i < sentinelPos;) {
if(data[i] == sentinel) {
swap(data[i], data[--sentinelPos]);
} else {
i++;
}
}
dataIn = dataIn[0..sentinelPos + start + 1];
uniqueInPlaceImpl(dataIn, start + swapPos + 1);
}
How about:
void rmdup(int *array, int length)
{
int *current , *end = array + length - 1;
for ( current = array + 1; array < end; array++, current = array + 1 )
{
while ( current <= end )
{
if ( *current == *array )
{
*current = *end--;
}
else
{
current++;
}
}
}
}
Should be O(n^2) or less.
If you are looking for the superior O-notation, then sorting the array with an O(n log n) sort then doing a O(n) traversal may be the best route. Without sorting, you are looking at O(n^2).
Edit: if you are just doing integers, then you can also do radix sort to get O(n).
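A sketch of the sort-then-traverse route using the standard library's qsort; note that, like any sort-based approach, it does not preserve the original order:
#include <stdio.h>
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort, then keep the first instance of each run of equal values.
 * Returns the number of unique elements left at the front of the array. */
static size_t sort_unique(int *a, size_t n)
{
    if (n == 0)
        return 0;
    qsort(a, n, sizeof a[0], cmp_int);

    size_t dst = 1;
    for (size_t src = 1; src < n; src++)
        if (a[src] != a[dst - 1])
            a[dst++] = a[src];
    return dst;
}

int main(void)
{
    int a[] = {4, 8, 4, 1, 1, 2, 9};
    size_t n = sort_unique(a, sizeof a / sizeof a[0]);
    for (size_t i = 0; i < n; i++)
        printf("%d ", a[i]);   /* 1 2 4 8 9 */
    printf("\n");
    return 0;
}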
One more efficient implementation
int i, j;
/* new length of modified array */
int NewLength = 1;
for(i=1; i< Length; i++){
for(j=0; j< NewLength ; j++)
{
if(array[i] == array[j])
break;
}
/* if none of the values in array[0..NewLength-1] is the same as array[i],
then copy the current value to the corresponding new position in the array */
if (j==NewLength )
array[NewLength++] = array[i];
}
In this implementation there is no need for sorting the array.
Also if a duplicate element is found, there is no need for shifting all elements after this by one position.
The output of this code is array[] with size NewLength
Here we start from the 2nd element in the array and compare it with all the elements before it.
We keep an extra index variable 'NewLength' for modifying the input array.
The NewLength variable is initialized to 1.
The element in array[1] will be compared with array[0].
If they are different, then the value in array[NewLength] will be overwritten with array[1] and NewLength incremented.
If they are the same, NewLength will not be modified.
So if we have an array [1 2 1 3 1],
then
In the first pass of the 'j' loop, array[1] (2) will be compared with array[0]; they differ, so 2 will be written to array[NewLength] = array[1],
so the array will be [1 2] with NewLength = 2.
In the second pass of the 'j' loop, array[2] (1) will be compared with array[0] and array[1]. Here, since array[2] (1) and array[0] are the same, the loop will break,
so the array stays [1 2] with NewLength = 2,
and so on
1. Using O(1) extra space, in O(n log n) time
This is possible, for instance:
first do an in-place O(n log n) sort
then walk through the list once, writing the first instance of every value back to the beginning of the list
I believe ejel's partner is correct that the best way to do this would be an in-place merge sort with a simplified merge step, and that that is probably the intent of the question, if you were eg. writing a new library function to do this as efficiently as possible with no ability to improve the inputs, and there would be cases it would be useful to do so without a hash-table, depending on the sorts of inputs. But I haven't actually checked this.
2. Using O(lots) extra space, in O(n) time
declare a zero'd array big enough to hold all integers
walk through the array once
set the corresponding array element to 1 for each integer.
If it was already 1, skip that integer.
This only works if several questionable assumptions hold:
it's possible to zero memory cheaply, or the size of the ints are small compared to the number of them
you're happy to ask your OS for 256^sizeof(int) memory
and it will cache it for you really really efficiently if it's gigantic
It's a bad answer, but if you have LOTS of input elements, but they're all 8-bit integers (or maybe even 16-bit integers) it could be the best way (a sketch of this 8-bit case follows below, after point 4).
3. O(little)-ish extra space, O(n)-ish time
As #2, but use a hash table.
4. The clear way
If the number of elements is small, writing an appropriate algorithm is not useful if other code is quicker to write and quicker to read.
Eg. Walk through the array; for each unique element (i.e. the first element, the second element (duplicates of the first having been removed), etc.), remove all identical elements after it. O(1) extra space, O(n^2) time.
Eg. Use library functions which do this. Efficiency depends on which ones you have easily available.
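For case 2, a sketch of what that presence array looks like in the 8-bit special case mentioned above; it is a sketch of that special case only, not a general solution:
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Remove duplicates from an array of 8-bit values in O(n) time,
 * using one "seen" flag per possible value (256 bytes here). */
static size_t unique_u8(uint8_t *a, size_t n)
{
    unsigned char seen[256];
    memset(seen, 0, sizeof seen);

    size_t dst = 0;
    for (size_t src = 0; src < n; src++) {
        if (!seen[a[src]]) {
            seen[a[src]] = 1;
            a[dst++] = a[src];   /* keep the first occurrence, preserving order */
        }
    }
    return dst;
}

int main(void)
{
    uint8_t a[] = {4, 8, 4, 1, 1, 2, 9};
    size_t n = unique_u8(a, sizeof a);
    for (size_t i = 0; i < n; i++)
        printf("%d ", a[i]);     /* 4 8 1 2 9 */
    printf("\n");
    return 0;
}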
Well, its basic implementation is quite simple. Go through all elements, check whether there are duplicates among the remaining ones and shift the rest over them.
It's terribly inefficient and you could speed it up by using a helper array for the output or sorting/binary trees, but that doesn't seem to be allowed.
If you are allowed to use C++, a call to std::sort followed by a call to std::unique will give you the answer. The time complexity is O(N log N) for the sort and O(N) for the unique traversal.
And if C++ is off the table there isn't anything that keeps these same algorithms from being written in C.
You could do this in a single traversal, if you are willing to sacrifice memory. You can simply tally whether you have seen an integer or not in a hash/associative array. If you have already seen a number, remove it as you go, or better yet, move numbers you have not seen into a new array, avoiding any shifting in the original array.
In Perl:
foreach $i (@myary) {
    if(!defined $seen{$i}) {
        $seen{$i} = 1;
        push @newary, $i;
    }
}
The return value of the function should be the number of unique elements and they are all stored at the front of the array. Without this additional information, you won't even know if there were any duplicates.
Each iteration of the outer loop processes one element of the array. If it is unique, it stays in the front of the array and if it is a duplicate, it is overwritten by the last unprocessed element in the array. This solution runs in O(n^2) time.
#include <stdio.h>
#include <stdlib.h>
size_t rmdup(int *arr, size_t len)
{
size_t prev = 0;
size_t curr = 1;
size_t last = len - 1;
while (curr <= last) {
for (prev = 0; prev < curr && arr[curr] != arr[prev]; ++prev);
if (prev == curr) {
++curr;
} else {
arr[curr] = arr[last];
--last;
}
}
return curr;
}
void print_array(int *arr, size_t len)
{
printf("{");
size_t curr = 0;
for (curr = 0; curr < len; ++curr) {
if (curr > 0) printf(", ");
printf("%d", arr[curr]);
}
printf("}");
}
int main()
{
int arr[] = {4, 8, 4, 1, 1, 2, 9};
printf("Before: ");
size_t len = sizeof (arr) / sizeof (arr[0]);
print_array(arr, len);
len = rmdup(arr, len);
printf("\nAfter: ");
print_array(arr, len);
printf("\n");
return 0;
}
Here is a Java Version.
int[] removeDuplicate(int[] input){
int arrayLen = input.length;
for(int i=0;i<arrayLen;i++){
for(int j = i+1; j< arrayLen ; j++){
if(((input[i]^input[j]) == 0)){
input[j] = 0;
}
if((input[j]==0) && j<arrayLen-1){
input[j] = input[j+1];
input[j+1] = 0;
}
}
}
return input;
}
Here is my solution.
///// find duplicates in an array and remove them
void unique(int* input, int n)
{
    merge_sort(input, 0, n); /* assumes a merge_sort implementation is available */

    int prev = 0;
    for (int i = 1; i < n; i++)
    {
        if (input[i] != input[prev])
            input[++prev] = input[i];
    }
    /* the first prev + 1 elements now hold the unique values */
}
An array should obviously be "traversed" right-to-left to avoid unnecessary copying of values back and forth.
If you have unlimited memory, you can allocate a bit array with one bit per possible value (2^(8*sizeof(type)) bits) to signify whether you've already encountered the corresponding value or not.
If you don't, I can't think of anything better than traversing an array and comparing each value with values that follow it and then if duplicate is found, remove these values altogether. This is somewhere near O(n^2) (or O((n^2-n)/2)).
IBM has an article on kinda close subject.
Let's see:
O(N) pass to find min/max,
allocate a bit-array for "found",
O(N) pass swapping duplicates to the end.
This can be done in one pass with an O(N log N) algorithm and no extra storage.
Proceed from element a[1] to a[N]. At each stage i, all of the elements to the left of a[i] comprise a sorted heap of elements a[0] through a[j]. Meanwhile, a second index j, initially 0, keeps track of the size of the heap.
Examine a[i] and insert it into the heap, which now occupies elements a[0] to a[j+1]. As the element is inserted, if a duplicate element a[k] is encountered having the same value, do not insert a[i] into the heap (i.e., discard it); otherwise insert it into the heap, which now grows by one element and now comprises a[0] to a[j+1], and increment j.
Continue in this manner, incrementing i until all of the array elements have been examined and inserted into the heap, which ends up occupying a[0] to a[j]. j is the index of the last element of the heap, and the heap contains only unique element values.
int algorithm(int[] a, int n)
{
int i, j;
for (j = 0, i = 1; i < n; i++)
{
// Insert a[i] into the heap a[0...j]
if (heapInsert(a, j, a[i]))
j++;
}
return j;
}
bool heapInsert(a[], int n, int val)
{
// Insert val into heap a[0...n]
...code omitted for brevity...
if (duplicate element a[k] == val)
return false;
a[k] = val;
return true;
}
Looking at the example, this is not exactly what was asked for since the resulting array preserves the original element order. But if this requirement is relaxed, the algorithm above should do the trick.
In Java I would solve it like this. Don't know how to write this in C.
int length = array.length;
for (int i = 0; i < length; i++)
{
for (int j = i + 1; j < length; j++)
{
if (array[i] == array[j])
{
int k, l;
for (k = j + 1, l = j; k < length; k++, l++)
{
if (array[k] != array[i])
{
array[l] = array[k];
}
else
{
l--;
}
}
length = l;
}
}
}
How about the following?
int* temp = malloc(sizeof(int)*len);
int count = 0;
int x =0;
int y =0;
for(x=0;x<len;x++)
{
for(y=0;y<count;y++)
{
if(*(temp+y)==*(array+x))
{
break;
}
}
if(y==count)
{
*(temp+count) = *(array+x);
count++;
}
}
memcpy(array, temp, sizeof(int)*len);
free(temp);
I try to declare a temp array and put the elements into that before copying everything back to the original array.
After reviewing the problem, here is my Delphi way; it may help.
var
A: Array of Integer;
I,J,C,K, P: Integer;
begin
C:=10;
SetLength(A,10);
A[0]:=1; A[1]:=4; A[2]:=2; A[3]:=6; A[4]:=3; A[5]:=4;
A[6]:=3; A[7]:=4; A[8]:=2; A[9]:=5;
for I := 0 to C-1 do
begin
for J := I+1 to C-1 do
if A[I]=A[J] then
begin
for K := C-1 Downto J do
if A[J]<>A[k] then
begin
P:=A[K];
A[K]:=0;
A[J]:=P;
C:=K;
break;
end
else
begin
A[K]:=0;
C:=K;
end;
end;
end;
//truncate array
setlength(A,C);
end;
The following example should solve your problem:
def check_dump(x):
if not x in t:
t.append(x)
return True
t=[]
output = filter(check_dump, input)
print(output)
True
import java.util.ArrayList;
public class C {
public static void main(String[] args) {
int arr[] = {2,5,5,5,9,11,11,23,34,34,34,45,45};
ArrayList<Integer> arr1 = new ArrayList<Integer>();
for(int i=0;i<arr.length-1;i++){
if(arr[i] == arr[i+1]){
arr[i] = 99999;
}
}
for(int i=0;i<arr.length;i++){
if(arr[i] != 99999){
arr1.add(arr[i]);
}
}
System.out.println(arr1);
}
}
This is the naive (N*(N-1)/2) solution. It uses constant additional space and maintains the original order. It is similar to the solution by @Byju, but uses no if(){} blocks. It also avoids copying an element onto itself.
#include <stdio.h>
#include <stdlib.h>
int numbers[] = {4, 8, 4, 1, 1, 2, 9};
#define COUNT (sizeof numbers / sizeof numbers[0])
size_t undup_it(int array[], size_t len)
{
size_t src,dst;
/* an array of size=1 cannot contain duplicate values */
if (len <2) return len;
/* an array of size>1 will contain at least one unique value */
for (src=dst=1; src < len; src++) {
size_t cur;
for (cur=0; cur < dst; cur++ ) {
if (array[cur] == array[src]) break;
}
if (cur != dst) continue; /* found a duplicate */
/* array[src] must be new: add it to the list of non-duplicates */
if (dst < src) array[dst] = array[src]; /* avoid copy-to-self */
dst++;
}
return dst; /* number of valid elements in new array */
}
void print_it(int array[], size_t len)
{
size_t idx;
for (idx=0; idx < len; idx++) {
printf("%c %d", (idx) ? ',' :'{' , array[idx] );
}
printf("}\n" );
}
int main(void) {
size_t cnt = COUNT;
printf("Before undup:" );
print_it(numbers, cnt);
cnt = undup_it(numbers,cnt);
printf("After undup:" );
print_it(numbers, cnt);
return 0;
}
This can be done in a single pass, in O(N) time in the number of integers in the input list, and O(N) storage in the number of unique integers.
Walk through the list from front to back, with two pointers "dst" and "src" initialized to the first item. Start with an empty hash table of "integers seen". If the integer at src is not present in the hash, write it to the slot at dst and increment dst. Add the integer at src to the hash, then increment src. Repeat until src passes the end of the input list.
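A sketch of that dst/src walk in C, using a small open-addressing hash set since C has no built-in hash table; the table sizing and hash constant are illustrative choices, not part of the answer above:
#include <stdio.h>
#include <stdlib.h>

/* Minimal open-addressing "seen" set; the table holds at least 2n+1 slots and
 * uses linear probing. 'used' marks occupied slots so the value 0 can be stored too. */
struct intset {
    int *keys;
    unsigned char *used;
    size_t cap;
};

static int intset_add(struct intset *s, int key)   /* returns 1 if newly added */
{
    size_t h = (size_t)((unsigned)key * 2654435761u) % s->cap;
    while (s->used[h]) {
        if (s->keys[h] == key)
            return 0;                    /* already seen */
        h = (h + 1) % s->cap;
    }
    s->used[h] = 1;
    s->keys[h] = key;
    return 1;
}

static size_t unique_in_place(int *a, size_t n)
{
    struct intset s;
    s.cap = 2 * n + 1;                   /* never fills, so probing always terminates */
    s.keys = malloc(s.cap * sizeof *s.keys);
    s.used = calloc(s.cap, 1);
    if (!s.keys || !s.used) { free(s.keys); free(s.used); return n; }

    size_t dst = 0;
    for (size_t src = 0; src < n; src++)
        if (intset_add(&s, a[src]))
            a[dst++] = a[src];           /* first occurrence: keep it at dst */

    free(s.keys);
    free(s.used);
    return dst;
}

int main(void)
{
    int a[] = {4, 8, 4, 1, 1, 2, 9};
    size_t n = unique_in_place(a, sizeof a / sizeof a[0]);
    for (size_t i = 0; i < n; i++)
        printf("%d ", a[i]);             /* 4 8 1 2 9 */
    printf("\n");
    return 0;
}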
Insert all the elements into a binary tree that disregards duplicates - O(n log n). Then extract all of them back into the array by doing a traversal - O(n). I am assuming that you don't need order preservation.
Use bloom filter for hashing. This will reduce the memory overhead very significantly.
In JAVA,
Integer[] arrayInteger = {1,2,3,4,3,2,4,6,7,8,9,9,10};
String value ="";
for(Integer i:arrayInteger)
{
if(!value.contains(Integer.toString(i))){
value +=Integer.toString(i)+",";
}
}
String[] arraySplitToString = value.split(",");
Integer[] arrayIntResult = new Integer[arraySplitToString.length];
for(int i = 0 ; i < arraySplitToString.length ; i++){
arrayIntResult[i] = Integer.parseInt(arraySplitToString[i]);
}
output:
{ 1, 2, 3, 4, 6, 7, 8, 9, 10}
hope this will help
Create a BinarySearchTree which has O(n) complexity.
First, you should create an array check[m], where m is one greater than the largest value in the array you want to make duplicate-free, and set the value of every element (of the check array) to 1. Using a for loop, traverse the array with the duplicates, say its name is arr, and in the for loop write this:
{
if (check[arr[i]] != 1) {
arr[i] = 0;
}
else {
check[arr[i]] = 0;
}
}
With that, you set every duplicate equal to zero. So the only thing left to do is to traverse the arr array and print everything that's not equal to zero. The order is preserved and it takes linear time (3*n).
Given an array of n elements, write an algorithm to remove all duplicates from the array in time O(nlogn)
Algorithm delete_duplicates (a[1....n])
//Remove duplicates from the given array
//input parameters :a[1:n], an array of n elements.
{
temp[1:n]; //an array of n records
for i = 1 to n:
    temp[i].value = a[i]
    temp[i].key = i
//based on 'value', sort the array temp.
//based on 'value', delete duplicate elements from temp.
//based on 'key', sort the array temp.
//construct an array p using temp:
p[i] = temp[i].value
return p.
The order of elements is maintained in the output array using the 'key'. Considering the key is of length O(n), the time taken for sorting on the key and value is O(n log n). So the time taken to delete all duplicates from the array is O(n log n).
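A C sketch of that value/key idea (sort by value to drop duplicates, then sort the survivors back by their original index to restore the input order); the struct and function names are illustrative:
#include <stdio.h>
#include <stdlib.h>

struct entry { int value; size_t key; };   /* key = original position */

static int by_value(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    if (x->value != y->value)
        return (x->value > y->value) - (x->value < y->value);
    return (x->key > y->key) - (x->key < y->key);   /* tie-break on original index */
}

static int by_key(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    return (x->key > y->key) - (x->key < y->key);
}

/* O(n log n) duplicate removal that preserves the original order. */
static size_t unique_ordered(int *arr, size_t n)
{
    struct entry *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) return n;

    for (size_t i = 0; i < n; i++) {
        tmp[i].value = arr[i];
        tmp[i].key = i;
    }
    qsort(tmp, n, sizeof *tmp, by_value);

    size_t m = 0;                               /* keep the first occurrence of each value */
    for (size_t i = 0; i < n; i++)
        if (m == 0 || tmp[i].value != tmp[m - 1].value)
            tmp[m++] = tmp[i];

    qsort(tmp, m, sizeof *tmp, by_key);         /* restore the original order */
    for (size_t i = 0; i < m; i++)
        arr[i] = tmp[i].value;

    free(tmp);
    return m;
}

int main(void)
{
    int a[] = {4, 8, 4, 1, 1, 2, 9};
    size_t n = unique_ordered(a, sizeof a / sizeof a[0]);
    for (size_t i = 0; i < n; i++)
        printf("%d ", a[i]);                    /* 4 8 1 2 9 */
    printf("\n");
    return 0;
}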
This is what I've got. Though it misplaces the order, we can sort in ascending or descending order to fix it up.
#include <stdio.h>
int main(void){
int x,n,myvar=0;
printf("Enter a number: \t");
scanf("%d",&n);
int arr[n],changedarr[n];
for(x=0;x<n;x++){
printf("Enter a number for array[%d]: ",x);
scanf("%d",&arr[x]);
}
printf("\nOriginal Number in an array\n");
for(x=0;x<n;x++){
printf("%d\t",arr[x]);
}
int i=0,j=0;
// printf("i\tj\tarr\tchanged\n");
for (int i = 0; i < n; i++)
{
// printf("%d\t%d\t%d\t%d\n",i,j,arr[i],changedarr[i] );
for (int j = 0; j <n; j++)
{
if (i==j)
{
continue;
}
else if(arr[i]==arr[j]){
changedarr[j]=0;
}
else{
changedarr[i]=arr[i];
}
// printf("%d\t%d\t%d\t%d\n",i,j,arr[i],changedarr[i] );
}
myvar+=1;
}
// printf("\n\nmyvar=%d\n",myvar);
int count=0;
printf("\nThe unique items:\n");
for (int i = 0; i < myvar; i++)
{
if(changedarr[i]!=0){
count+=1;
printf("%d\t",changedarr[i]);
}
}
printf("\n");
}
It'd be cool if you had a good DataStructure that could quickly tell if it contains an integer. Perhaps a tree of some sort.
DataStructure elementsSeen = new DataStructure();
int elementsRemoved = 0;
for(int i=0;i<array.Length;i++){
if(elementsSeen.Contains(array[i])
elementsRemoved++;
else
array[i-elementsRemoved] = array[i];
}
array.Length = array.Length - elementsRemoved;
