Merging two arraylists without creating third one - arrays

Here is a task I was trying to solve. You must write the function
void merge(ArrayList a, ArrayList b) {
// code
}
The function receives two ArrayLists of equal size as input parameters [a1, a2, ..., an], [b1, b2, ..., bn]. After execution, the 1st ArrayList must contain the elements of both lists, alternating consistently ([a1, b1, a2, b2, ..., an, bn]). Please read the bold text twice =)
Code must work as efficiently as possible.
Here is my solution
public static void merge(ArrayList a, ArrayList b) {
ArrayList result = new ArrayList();
int i = 0;
Iterator iter1 = a.iterator();
Iterator iter2 = b.iterator();
while ((iter1.hasNext() || iter2.hasNext()) && i < (a.size() + b.size())) {
if (i % 2 ==0) {
result.add(iter1.next());
} else {
result.add(iter2.next());
}
i++;
}
a = result;
}
I know it's not perfect at all. But I can't understand how to merge into the 1st list without creating a temporary list.
Thanks in advance for taking part.

Double ArrayList a's size. Set last two elements of a to the last element of the old a and the last element of b. Keep going, backing up each time, until you reach the beginnings of a and b. You have to do it from the rear because otherwise you will write over the original a's values.
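A minimal sketch of that back-to-front idea, assuming both lists have the same size as the task states. The class name InterleaveInPlace is made up for illustration.

```java
import java.util.List;

class InterleaveInPlace {
    // Interleaves b into a so that a becomes [a1, b1, a2, b2, ..., an, bn].
    // Assumes both lists have the same size and that a supports add/set.
    static <T> void merge(List<T> a, List<T> b) {
        int n = a.size();
        if (b.size() != n) {
            throw new IllegalArgumentException("lists must have equal size");
        }
        // Grow a to 2n; the placeholder values are overwritten below.
        for (int i = 0; i < n; i++) {
            a.add(null);
        }
        // Walk from the rear so that no unread element of a is overwritten:
        // slots 2i and 2i+1 are never below index i.
        for (int i = n - 1; i >= 0; i--) {
            a.set(2 * i + 1, b.get(i));
            a.set(2 * i, a.get(i));
        }
    }
}
```

Calling InterleaveInPlace.merge(a, b) with a = [1, 2, 3] and b = [10, 20, 30] leaves a as [1, 10, 2, 20, 3, 30]; no second result list is allocated, only the extra slots in a.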

In the end I got this:
public static void merge(ArrayList<Integer> arr1, ArrayList<Integer> arr2) {
int indexForArr1 = arr1.size() - 1;
int oldSize = arr1.size();
int newSize = arr1.size() + arr2.size();
/*
decided not to create new arraylist with new size but just to fill up old one with nulls
*/
fillWithNulls(arr1, newSize);
for(int i = (newSize-1); i >= 0; i--) {
if (i%2 != 0) {
int indexForArr2 = i%oldSize;
arr1.set(i,arr2.get(indexForArr2));
oldSize--; // we reduce the size because we don't need the last element any more
} else {
arr1.set(i, arr1.get(indexForArr1));
indexForArr1--;
}
}
}
private static void fillWithNulls(ArrayList<Integer> array, int newSize) {
int delta = newSize - array.size();
for(int i = 0; i < delta; i++) {
array.add(null);
}
}
Thanks John again for bright idea!

Related

Algorithm to iterate N-dimensional array in pseudo random order

I have an array that I would like to iterate in random order. That is, I would like my iteration to visit each element only once in a seemingly random order.
Would it be possible to implement an iterator that would iterate elements like this without storing the order or other data in a lookup table first?
Would it be possible to do it for N-dimensional arrays where N>1?
UPDATE: Some of the answers mention how to do this by storing indices. A major point of this question is how to do it without storing indices or other data.
I decided to solve this, because it annoyed me to death not remembering the name of the solution that I had heard of before. I did however remember in the end; more on that at the bottom of this post.
My solution depends on the mathematical properties of some cleverly calculated numbers:
range = array size
prime = closestPrimeAfter(range)
root = closestPrimitiveRootTo(range/2)
state = root
With this setup we can calculate the following repeatedly and it will iterate all elements of the array exactly once in a seemingly random order, after which it will loop to traverse the array in the same exact order again.
state = (state * root) % prime
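A small worked sketch of that recurrence, assuming prime = 11 and root = 2 (2 is a primitive root modulo 11), so the states cover 1..10 and the indices cover an array of length 10. The class and method names are hypothetical.

```java
class PrimitiveRootWalk {
    // Returns the visiting order for an array of length prime - 1,
    // assuming root is a primitive root modulo prime.
    static int[] order(int prime, int root) {
        int[] visited = new int[prime - 1];
        long state = root;
        for (int i = 0; i < prime - 1; i++) {
            // States run over 1..prime-1, so subtract 1 to get an array index.
            visited[i] = (int) state - 1;
            state = (state * root) % prime;
        }
        return visited;
    }
}
```

With order(11, 2) the states are 2, 4, 8, 5, 10, 9, 7, 3, 6, 1, which gives the indices 1, 3, 7, 4, 9, 8, 6, 2, 5, 0: every index once, with only the current state kept in memory.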
I implemented and tested this in Java, so I decided to paste my code here for future reference.
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Random;
public class PseudoRandomSequence {
private long state;
private final long range;
private final long root;
private final long prime;
//Debugging counter
private int dropped = 0;
public PseudoRandomSequence(int r) {
range = r;
prime = closestPrimeAfter(range);
root = modPow(generator(prime), closestPrimeTo(prime / 2), prime);
reset();
System.out.println("-- r:" + range);
System.out.println(" p:" + prime);
System.out.println(" k:" + root);
System.out.println(" s:" + state);
}
// https://en.wikipedia.org/wiki/Primitive_root_modulo_n
private static long modPow(long base, long exp, long mod) {
return BigInteger.valueOf(base).modPow(BigInteger.valueOf(exp), BigInteger.valueOf(mod)).intValue();
}
//http://e-maxx-eng.github.io/algebra/primitive-root.html
private static long generator(long p) {
ArrayList<Long> fact = new ArrayList<Long>();
long phi = p - 1, n = phi;
for (long i = 2; i * i <= n; ++i) {
if (n % i == 0) {
fact.add(i);
while (n % i == 0) {
n /= i;
}
}
}
if (n > 1) fact.add(n);
for (long res = 2; res <= p; ++res) {
boolean ok = true;
for (long i = 0; i < fact.size() && ok; ++i) {
ok &= modPow(res, phi / fact.get((int) i), p) != 1;
}
if (ok) {
return res;
}
}
return -1;
}
public long get() {
return state - 1;
}
public void advance() {
//This loop simply skips all results that overshoot the range, which never happens when range + 1 is itself prime (then prime == range + 1).
dropped--;
do {
state = (state * root) % prime;
dropped++;
} while (state > range);
}
public void reset() {
state = root;
dropped = 0;
}
private static boolean isPrime(long num) {
if (num == 2) return true;
if (num % 2 == 0) return false;
for (int i = 3; i * i <= num; i += 2) {
if (num % i == 0) return false;
}
return true;
}
private static long closestPrimeAfter(long n) {
long up;
for (up = n + 1; !isPrime(up); ++up)
;
return up;
}
private static long closestPrimeBefore(long n) {
long dn;
for (dn = n - 1; !isPrime(dn); --dn)
;
return dn;
}
private static long closestPrimeTo(long n) {
final long dn = closestPrimeBefore(n);
final long up = closestPrimeAfter(n);
return (n - dn) > (up - n) ? up : dn;
}
private static boolean test(int r, int loops) {
final int array[] = new int[r];
Arrays.fill(array, 0);
System.out.println("TESTING: array size: " + r + ", loops: " + loops + "\n");
PseudoRandomSequence prs = new PseudoRandomSequence(r);
final long ct = loops * r;
//Iterate the array 'loops' times, incrementing the value for each cell for every visit.
for (int i = 0; i < ct; ++i) {
prs.advance();
final long index = prs.get();
array[(int) index]++;
}
//Verify that each cell was visited exactly 'loops' times, confirming the validity of the sequence
for (int i = 0; i < r; ++i) {
final int c = array[i];
if (loops != c) {
System.err.println("ERROR: array element #" + i + " was " + c + " instead of " + loops + " as expected\n");
return false;
}
}
//TODO: Verify the "randomness" of the sequence
System.out.println("OK: Sequence checked out with " + prs.dropped + " drops (" + prs.dropped / loops + " per loop vs. diff " + (prs.prime - r) + ") \n");
return true;
}
//Run lots of random tests
public static void main(String[] args) {
Random r = new Random();
r.setSeed(1337);
for (int i = 0; i < 100; ++i) {
PseudoRandomSequence.test(r.nextInt(1000000) + 1, r.nextInt(9) + 1);
}
}
}
As stated at the top, about 10 minutes after spending a good part of my night actually getting a result, I DID remember where I had read about the original way of doing this. It was in a small C implementation of a 2D graphics "dissolve" effect as described in Graphics Gems vol. 1, which in turn is an adaptation to 2D, with some optimizations, of a mechanism called "LFSR" (wikipedia article here, original dissolve.c source code here).
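To illustrate the LFSR mechanism, here is a minimal sketch of a 4-bit Galois LFSR used as an index generator. The tap mask 0xC (polynomial x^4 + x^3 + 1) is chosen for illustration; this is not the code from dissolve.c, and the class name is made up.

```java
import java.util.ArrayList;
import java.util.List;

class LfsrOrder {
    // One step of a 4-bit Galois LFSR with tap mask 0xC.
    // Starting from any non-zero state it cycles through all 15 non-zero values.
    static int next(int state) {
        int lsb = state & 1;
        state >>= 1;
        if (lsb == 1) {
            state ^= 0xC;
        }
        return state;
    }

    // Visiting order for an array of length n, assuming 1 <= n <= 15.
    static List<Integer> order(int n) {
        List<Integer> result = new ArrayList<>();
        int state = 1;
        do {
            if (state <= n) {
                result.add(state - 1); // states beyond the array are skipped
            }
            state = next(state);
        } while (state != 1);
        return result;
    }
}
```

As in the answer above, values that overshoot the array length are simply dropped, so a wider register is needed for larger arrays.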
You could collect all possible indices in a list and then remove a random index to visit. I know this is sort of like a lookup table, but I don't see any other option than this.
Here is an example for a one-dimensional array (adaption to multiple dimensions should be trivial):
class RandomIterator<T> {
T[] array;
List<Integer> remainingIndeces;
public RandomIterator(T[] array) {
this.array = array;
this.remainingIndeces = new ArrayList<>();
for(int i = 0;i<array.length;++i)
remainingIndeces.add(i);
}
public T next() {
return array[remainingIndeces.remove((int)(Math.random()*remainingIndeces.size()))];
}
public boolean hasNext() {
return !remainingIndeces.isEmpty();
}
}
On a side note: If this code is performance relevant, this method would perform worse by far, as the random removing from the list triggers copies if you use a list backed by an array (a linked-list won't help either, as indexed access is O(n)). I would suggest a lookup-structure (e.g. HashSet in Java) that stores all visited indices to circumvent this problem (though that's exactly what you did not want to use)
EDIT: Another approach is to copy said array and use a library function to shuffle it and then traverse it in linear order. If your array isn't that big, this seems like the most readable and performant option.
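A minimal sketch of the shuffle approach. It shuffles a list of indices rather than a copy of the array itself, which is an assumption made here so the same helper works for any element type; the seed parameter is only there to make runs repeatable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ShuffledOrder {
    // Returns the indices 0..length-1 in a shuffled order.
    static List<Integer> order(int length, long seed) {
        List<Integer> indices = new ArrayList<>(length);
        for (int i = 0; i < length; i++) {
            indices.add(i);
        }
        // Collections.shuffle runs in linear time on an ArrayList.
        Collections.shuffle(indices, new Random(seed));
        return indices;
    }
}
```

Iterating over the returned list and reading array[index] visits every element once. This does store the order, so it does not meet the "no lookup table" requirement of the question.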
You would need to create a pseudo random number generator that generates values from 0 to X-1 and takes X iterations before repeating the cycle, where X is the product of all the dimension sizes. I don't know if there is a generic solution to doing this. Wiki article for one type of random number generator:
http://en.wikipedia.org/wiki/Linear_congruential_generator
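One way this could look, as a sketch: a linear congruential generator whose modulus m is the smallest power of two that is at least X, with multiplier 5 and increment 3. Those constants are assumptions picked because they satisfy the full-period (Hull-Dobell) conditions for a power-of-two modulus; values of X or above are skipped.

```java
class LcgOrder {
    // Visits every index in [0, n) exactly once using
    // state = (5 * state + 3) mod m, where m is a power of two >= n.
    // Assumes 1 <= n <= 2^30 so that m fits in an int.
    static int[] order(int n) {
        int m = 1;
        while (m < n) {
            m <<= 1;
        }
        int[] result = new int[n];
        int count = 0;
        long state = 0;
        for (int step = 0; step < m; step++) {
            if (state < n) {
                result[count++] = (int) state; // keep only in-range values
            }
            state = (5 * state + 3) % m;
        }
        return result;
    }
}
```

The sequence is fully determined by the constants, so it looks far less random than a shuffle; it only guarantees that each index appears once per cycle.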
Yes, it is possible. Imagine a 3D array (you are not likely to use anything more than that). It is like a cube, and each point where the 3 lines connect is a cell. You can enumerate your cells 1 to N using a dictionary; you can do this initialization in loops and create a list of cells to use for the random draw.
Initialization
totalCells = ... (xMax * yMax * zMax)
i = 0
For (x = 0; x < xMax ; x++)
{
For (y = 0; y < yMax ; y++)
{
For (z = 0; z < zMax ; z++)
{
dict.Add(i, new Cell(x, y, z))
lst.Add(i)
i++
}
}
}
Now, all you have to do is iterate randomly
Do While (lst.Count > 0)
{
indexToVisit = rand.Next(0, lst.Count - 1) // random position, both bounds meant as inclusive
currentCell = dict[lst[indexToVisit]]
lst.RemoveAt(indexToVisit) // remove by position, not by value
// Do something with current cell here
. . . . . .
}
This is pseudo code, since you didn't mention language you work in
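Another way is to shuffle one index list per dimension (3 lists for 3D, and so on) and then iterate over them with nested loops. Every cell is still visited exactly once, although the order is only partly random, because cells sharing an outer index are visited together.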
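Any one-dimensional visiting order extends to N dimensions by treating the array as flat and decoding each flat index into coordinates. A minimal sketch, assuming row-major order (last dimension varies fastest); the names are hypothetical.

```java
class IndexDecoder {
    // Converts a flat index in [0, product of dims) into one coordinate
    // per dimension, with the last dimension varying fastest.
    static int[] toCoordinates(int flatIndex, int[] dims) {
        int[] coords = new int[dims.length];
        for (int d = dims.length - 1; d >= 0; d--) {
            coords[d] = flatIndex % dims[d];
            flatIndex /= dims[d];
        }
        return coords;
    }
}
```

For a 2 x 3 x 4 array, flat index 5 decodes to (0, 1, 1) and flat index 23 to (1, 2, 3).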

Visual C++ Merge Sort

I'm trying to figure out why the following code does not do a merge sort. The code compiles fine and there are no runtime errors, but the SortCollection method just returns an unsorted array. Any pointers would be greatly appreciated.
#include "stdafx.h"
#include <deque>
#include <climits>
#include <stdio.h>
using namespace System;
using namespace System::Collections;
using namespace System::Collections::Generic;
generic <typename T> where T: IComparable<T>
ref class MergeSort
{
public:
// constructor
MergeSort(){}
// SortCollection() method
array<T>^ SortCollection(array<T>^ inputArray)
{
int n = inputArray->Length;
if (n <= 1)
{
return inputArray;
}
array<T>^ array1 = gcnew array<T>(inputArray->Length / 2);
array<T>^ array2 = gcnew array<T>(inputArray->Length - array1->Length);
int array1Count = 0;
int array2Count = 0;
for (int i = 0; i < n; i++)
{
if (i < n / 2)
{
array1[array1Count] = inputArray[i];
array1Count++;
}
else
{
array2[array2Count] = inputArray[i];
array2Count++;
}
}
SortCollection(array1);
SortCollection(array2);
array<T>^ newArray = gcnew array<T>(inputArray->Length);
delete inputArray;
return Merge(newArray, array1, array2);
}
array<T>^ Merge(array<T>^ targetArray, array<T>^ array1, array<T>^ array2)
{
int n1 = array1->Length;
int n2 = array2->Length;
int x1 = 0;
int x2 = 0;
int counter = 0;
while (x1 < n1 && x2 < n2)
{
if (array1[x1]->CompareTo(array2[x2]) < 0)
{
targetArray[counter] = array1[x1];
x1 ++;
counter++;
}
else
{
targetArray[counter] = array2[x2];
x2 ++;
counter++;
}
}
while (x1 < n1)
{
targetArray[counter] = array1[x1];
counter ++;
x1 ++;
}
while (x2 < n2)
{
targetArray[counter] = array2[x2];
counter ++;
x2 ++;
}
return targetArray;
}
};
Hmm... but what are you printing/testing? The original array, or what the sort returns?
Anyway, try this:
SortCollection(array1);
SortCollection(array2);
// array<T>^ newArray = gcnew array<T>(inputArray->Length);
// delete inputArray; ---> "reuse" the input array
return Merge(inputArray, array1, array2);
EDIT:
I'm sure you know this, but you need to pay more attention to it.
A "normal" function takes arguments and returns a result without changing the arguments:
y=f(x);
You expect x to be unchanged and the result to be in y. These are the well-behaved functions. But some functions change their argument. Sometimes that is evident, as in
Destroy(x);
but often it is not evident, as in
y=sort(x);
Passing x "by value" guarantees that it is not changed, but if the function takes a kind of reference (like type^ x), it has direct access to the original variable and can change its contents. This duality is what you have in SortCollection and in Merge. You need to decide, and document, what your function returns and how it modifies the arguments it takes.
In one version (with delete) you modify the argument by deleting it! This is normally not a good idea and has to be documented very well. The sorted array is passed back as the return value (really a kind of reference too). This version modifies the argument, but the result is in the return value, and that is what you need to use/test/print.
The version without the delete modifies the argument by putting the sorted array into it. That could be exactly what you want. Either way, document it! It also returns a reference to the sorted array, which is the argument itself. This is done for convenience, but it can be dropped for better readability, returning just void.
A third variant is to leave the argument unmodified and return the sorted array.
Here you have a problem:
SortCollection(array1); // array1 is deleted??
SortCollection(array2); // you don't use any result from here?
array<T>^ newArray = gcnew array<T>(inputArray->Length);
delete inputArray; // sort, deleted input array !
return Merge(newArray, array1, array2);
Is this correct?
array1=SortCollection(array1);
array2=SortCollection(array2);
array<T>^ newArray = gcnew array<T>(inputArray->Length);
delete inputArray;
return Merge(newArray, array1, array2);
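For comparison, a minimal Java sketch of the third variant described above, which leaves its argument unmodified and returns the sorted array. The key point is the same as in the corrected snippet: the results of the recursive calls are what gets merged. It uses int[] for brevity, and the class name is made up.

```java
import java.util.Arrays;

class MergeSortSketch {
    // Returns a new sorted array; the input array is left untouched.
    static int[] sort(int[] input) {
        if (input.length <= 1) {
            return input.clone();
        }
        int mid = input.length / 2;
        // Use the arrays RETURNED by the recursive calls, not the unsorted halves.
        int[] left = sort(Arrays.copyOfRange(input, 0, mid));
        int[] right = sort(Arrays.copyOfRange(input, mid, input.length));
        return merge(left, right);
    }

    // Merges two sorted arrays into one new sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            out[k++] = left[i++];
        }
        while (j < right.length) {
            out[k++] = right[j++];
        }
        return out;
    }
}
```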

Cartesian Product of multiple array

I think it is basically an easy problem, but I'm stuck. My brain is blocked by this problem, so I hope you can help me.
I have 2 to N arrays of integers, like
{1,2,3,4,5}
{1,2,3,4,5,6}
{1,3,5}
.....
Now I want to have a list containing arrays of int[N] with every possibility, like
{1,1,1}
{1,1,3}
{1,1,5}
{1,2,1}
....
{1,3,1}
....
{2,1,1}
{2,1,3}
....
{5,6,5}
So there are 6*5*3 (90) elements in it.
Is there a simple algorithm to do it? I think the language doesn't matter, but I prefer Java.
Thanks for the help!
I am adding a valid answer with a Java implementation for the next person who has the same problem. I also made it generic, so you can build a Cartesian product of any Object type, not just ints:
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;
public class Product {
@SuppressWarnings("unchecked")
public static <T> List<T[]> getCartesianProduct(T[]... objects){
List<T[]> ret = null;
if (objects != null){
//number of input arrays; this is the length of each T[] in the returned list
int len = objects.length;
//lengths of the individual input arrays
int[] lenghtes = new int[len];
// arrayIndex
int array = 0;
// number of T[]'s that will be returned (the product of all lengths)
int lenSum = 1;
for (T[] t: objects){
lenSum *= t.length;
lenghtes[array++] = t.length;
}
//initialize the List with the correct length to avoid internal array copies
ret = new ArrayList<T[]>(lenSum);
//reusable class for instantiation of T[]
Class<T> clazz = (Class<T>) objects[0][0].getClass();
T[] tArray;
//values stores the current index into each input array
int[] values = new int[len];
for (int i = 0; i < lenSum; i++){
tArray = (T[])Array.newInstance(clazz, len);
for (int j = 0; j < len; j++){
tArray[j] = objects[j][values[j]];
}
ret.add(tArray);
//value counting:
//increment first value
values[0]++;
for (int v = 0; v < len; v++){
//check whether values[v] has reached its array's length
if (values[v] == lenghtes[v]){
//reset it to 0 and increment the next one, if not the last
values[v] = 0;
if (v+1 < len){
values[v+1]++;
}
}
}
}
}
return ret;
}
}
As I understand what you want, you need to get all permutations.
Use recursive algorithm, detailed here.
As I see it, this should work fine:
concatMap (λa -> concatMap (λb -> concatMap (λc -> [(a,b,c)]) L3) L2) L1
where concatMap (called SelectMany in C#) is defined as
concatMap f l = concat (map f l)
Here map maps a function over a list,
and concat (sometimes called flatten) takes a list of lists and turns it into a flat list.
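In Java, Stream.flatMap plays the role of concatMap. A minimal sketch for exactly three lists, using the question's example data in the usage note; the class name ProductSketch is made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ProductSketch {
    // Cartesian product of three lists; the last list varies fastest.
    static List<List<Integer>> product(List<Integer> l1, List<Integer> l2, List<Integer> l3) {
        return l1.stream()
                .flatMap(a -> l2.stream()
                        .flatMap(b -> l3.stream()
                                .map(c -> Arrays.asList(a, b, c))))
                .collect(Collectors.toList());
    }
}
```

For {1,2,3,4,5}, {1,2,3,4,5,6} and {1,3,5} this yields 90 triples, starting with [1, 1, 1], [1, 1, 3] and ending with [5, 6, 5]. The nesting depth is fixed at three here; an unknown number of lists needs recursion or a counter-based loop instead.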

Generating All Permutations of Character Combinations when # of arrays and length of each array are unknown

I'm not sure how to ask my question in a succinct way, so I'll start with examples and expand from there. I am working with VBA, but I think this problem is non language specific and would only require a bright mind that can provide a pseudo code framework. Thanks in advance for the help!
Example:
I have 3 Character Arrays Like So:
Arr_1 = [X,Y,Z]
Arr_2 = [A,B]
Arr_3 = [1,2,3,4]
I would like to generate ALL possible permutations of the character arrays like so:
XA1
XA2
XA3
XA4
XB1
XB2
XB3
XB4
YA1
YA2
.
.
.
ZB3
ZB4
This can be easily solved using 3 while loops or for loops. My question is how do I solve for this if the # of arrays is unknown and the length of each array is unknown?
So as an example with 4 character arrays:
Arr_1 = [X,Y,Z]
Arr_2 = [A,B]
Arr_3 = [1,2,3,4]
Arr_4 = [a,b]
I would need to generate:
XA1a
XA1b
XA2a
XA2b
XA3a
XA3b
XA4a
XA4b
.
.
.
ZB4a
ZB4b
So the Generalized Example would be:
Arr_1 = [...]
Arr_2 = [...]
Arr_3 = [...]
.
.
.
Arr_x = [...]
Is there a way to structure a function that will generate an unknown number of loops and loop through the length of each array to generate the permutations? Or maybe there's a better way to think about the problem?
Thanks Everyone!
Recursive solution
This is actually the easiest, most straightforward solution. The following is in Java, but it should be instructive:
public class Main {
public static void main(String[] args) {
Object[][] arrs = {
{ "X", "Y", "Z" },
{ "A", "B" },
{ "1", "2" },
};
recurse("", arrs, 0);
}
static void recurse (String s, Object[][] arrs, int k) {
if (k == arrs.length) {
System.out.println(s);
} else {
for (Object o : arrs[k]) {
recurse(s + o, arrs, k + 1);
}
}
}
}
(see full output)
Note: Java arrays are 0-based, so k goes from 0..arrs.length-1 during the recursion, until k == arrs.length when it's the end of recursion.
Non-recursive solution
It's also possible to write a non-recursive solution, but frankly this is less intuitive. This is actually very similar to base conversion, e.g. from decimal to hexadecimal; it's a generalized form where each position has its own set of values.
public class Main {
public static void main(String[] args) {
Object[][] arrs = {
{ "X", "Y", "Z" },
{ "A", "B" },
{ "1", "2" },
};
int N = 1;
for (Object[] arr : arrs) {
N = N * arr.length;
}
for (int v = 0; v < N; v++) {
System.out.println(decode(arrs, v));
}
}
static String decode(Object[][] arrs, int v) {
String s = "";
for (Object[] arr : arrs) {
int M = arr.length;
s = s + arr[v % M];
v = v / M;
}
return s;
}
}
(see full output)
This produces the tuples in a different order. If you want to generate them in the same order as the recursive solution, then you iterate through arrs "backward" during decode as follows:
static String decode(Object[][] arrs, int v) {
String s = "";
for (int i = arrs.length - 1; i >= 0; i--) {
int Ni = arrs[i].length;
s = arrs[i][v % Ni] + s;
v = v / Ni;
}
return s;
}
(see full output)
Thanks to @polygenelubricants for the excellent solution.
Here is the Javascript equivalent:
var a=['0'];
var b=['Auto', 'Home'];
var c=['Good'];
var d=['Tommy', 'Hilfiger', '*'];
var attrs = [a, b, c, d];
function recurse (s, attrs, k) {
if(k==attrs.length) {
console.log(s);
} else {
for(var i=0; i<attrs[k].length;i++) {
recurse(s+attrs[k][i], attrs, k+1);
}
}
}
recurse('', attrs, 0);
EDIT: Here's a Ruby solution. It's pretty much the same as my other solution below, but it assumes your input character arrays are words, so you can type:
% perm.rb ruby is cool
~/bin/perm.rb
#!/usr/bin/env ruby
def perm(args)
peg = Hash[args.collect {|v| [v,0]}]
nperms= 1
args.each { |a| nperms *= a.length }
perms = Array.new(nperms, "")
nperms.times do |p|
args.each { |a| perms[p] += a[peg[a]] }
args.each do |a|
peg[a] += 1
break if peg[a] < a.length
peg[a] = 0
end
end
perms
end
puts perm ARGV
OLD - I have a script to do this in MEL, (Maya's Embedded Language) - I'll try to translate to something C like, but don't expect it to run without a bit of fixing;) It works in Maya though.
First - throw all the arrays together in one long array with delimiters. (I'll leave that to you - because in my system it rips the values out of a UI). So, this means the delimiters will be taking up extra slots: To use your sample data above:
string delimitedArray[] = {"X","Y","Z","|","A","B","|","1","2","3","4","|"};
Of course you can concatenate as many arrays as you like.
string[] getPerms( string delimitedArray[]) {
string result[];
string delimiter("|");
string compactArray[]; // will be the same as delimitedArray, but without the "|" delimiters
int arraySizes[]; // will hold number of vals for each array
int offsets[]; // offsets will hold the indices where each new array starts.
int counters[]; // the values that will increment in the following loops, like pegs in each array
int nPemutations = 1;
int arrSize, offset, nArrays;
// do a prepass to find some information about the structure, and to build the compact array
for (s in delimitedArray) {
if (s == delimiter) {
nPemutations *= arrSize; // arrSize will have been counting elements
arraySizes[nArrays] = arrSize;
counters[nArrays] = 0; // reset the counter
nArrays ++; // nArrays goes up every time we find a new array
offsets.append(offset - arrSize) ; //its here, at the end of an array that we store the offset of this array
arrSize=0;
} else { // its one of the elements, not a delimiter
compactArray.append(s);
arrSize++;
offset++;
}
}
// put a bail out here if you like
if( nPemutations > 256) error("too many permutations " + nPemutations+". max is 256");
// now figure out the permutations
for (p=0;p<nPemutations;p++) {
string perm ="";
// In each array at the position of that array's counter
for (i=0;i<nArrays ;i++) {
int delimitedArrayIndex = counters[i] + offsets[i] ;
// build the string
perm += (compactArray[delimitedArrayIndex]);
}
result.append(perm);
// the interesting bit
// increment the array counters, but in fact the program
// will only get to increment a counter if the previous counter
// reached the end of its array, otherwise we break
for (i = 0; i < nArrays; ++i) {
counters[i] += 1;
if (counters[i] < arraySizes[i])
break;
counters[i] = 0;
}
}
return result;
}
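Since the listing above is not expected to run as-is, here is a minimal runnable Java sketch of the same counter ("peg") idea. It is not a translation of the MEL script: it takes the arrays directly instead of a delimited array, and it advances the last counter fastest so the output matches the order in the question. The names are made up.

```java
import java.util.ArrayList;
import java.util.List;

class OdometerSketch {
    // Builds every combination by keeping one counter per array and
    // incrementing them like the digits of an odometer.
    static List<String> combine(String[][] arrs) {
        List<String> result = new ArrayList<>();
        for (String[] arr : arrs) {
            if (arr.length == 0) {
                return result; // an empty array means there are no combinations
            }
        }
        int[] counters = new int[arrs.length];
        while (true) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < arrs.length; i++) {
                sb.append(arrs[i][counters[i]]);
            }
            result.add(sb.toString());
            // Increment the last counter; on overflow reset it and carry left.
            int pos = arrs.length - 1;
            while (pos >= 0 && ++counters[pos] == arrs[pos].length) {
                counters[pos] = 0;
                pos--;
            }
            if (pos < 0) {
                return result; // every counter wrapped around: done
            }
        }
    }
}
```

With the three arrays from the question this produces 24 strings, from XA1, XA2, ... up to ZB4, for any number of input arrays.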
If I understand the question correctly, I think you could put all your arrays into another array, thereby creating a jagged array.
Then, loop through all the arrays in your jagged array creating all the permutations you need.
Does that make sense?
It sounds like you've almost got it figured out already.
What if you put in there one more array, call it, say ArrayHolder , that holds all of your unknown number of arrays of unknown length. Then, you just need another loop, no?

Removing Duplicates in an array in C

The question is a little complex. The problem here is to get rid of duplicates and save the unique elements of an array into another array in their original sequence.
For example:
If the input entered is b a c a d t
the result should be b a c d t, in the exact order in which the input was entered.
So sorting the array and then checking won't work, since I would lose the original sequence. I was advised to use an array of indices, but I don't know how to do that. So what is your advice?
For those who are willing to answer the question I wanted to add some specific information.
char** finduni(char *words[100],int limit)
{
//
//Methods here
//
}
is my function. The array whose duplicates should be removed and stored in a different array is words[100], so the processing will be done on it. I first thought about copying all the elements of words into another array and sorting that array, but that didn't work after some tests. Just a reminder for solvers :).
Well, here is a version for char types. Note it doesn't scale.
#include <stdio.h>
#include <string.h>
void removeDuplicates(char *string)
{
unsigned char allCharacters [256] = { 0 };
size_t lookAt;
size_t writeTo = 0;
size_t length = strlen(string); // compute the length once, before the string is modified
for(lookAt = 0; lookAt < length; lookAt++)
{
unsigned char c = (unsigned char)string[lookAt]; // safe table index even if char is signed
if(allCharacters[c] == 0)
{
allCharacters[c] = 1; // mark it seen
string[writeTo++] = string[lookAt]; // copy it
}
}
string[writeTo] = '\0';
}
int main()
{
char word[] = "abbbcdefbbbghasdddaiouasdf";
removeDuplicates(word);
printf("Word is now [%s]\n", word);
return 0;
}
The following is the output:
Word is now [abcdefghsiou]
Is that something like what you want? You can modify the method if there are spaces between the letters, but if you use int, float, double or char * as the types, this method won't scale at all.
EDIT
I posted and then saw your clarification, where it's an array of char *. I'll update the method.
I hope this isn't too much code. I adapted this QuickSort algorithm and basically added index memory to it. The algorithm is O(n log n) on average, since the 3 steps below are additive and the two sorting steps dominate (quicksort's worst case is O(n^2)).
1) Sort the array of strings, reflecting every swap in the index array as well. After this stage, the i'th element of originalIndices holds the original index of the i'th element of the sorted array.
2) Remove duplicate elements in the sorted array by setting them to NULL and setting their index value to elements (the array length), which is higher than any valid index.
3) Sort the array of original indices, reflecting every swap in the array of strings. This gives back the original order of the strings, except that the duplicates are at the end and they are all NULL.
For good measure, I return the new count of elements.
Code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void sortArrayAndSetCriteria(char **arr, int elements, int *originalIndices)
{
#define MAX_LEVELS 1000
char *piv;
int beg[MAX_LEVELS], end[MAX_LEVELS], i=0, L, R;
int idx, cidx;
for(idx = 0; idx < elements; idx++)
originalIndices[idx] = idx;
beg[0] = 0;
end[0] = elements;
while (i>=0)
{
L = beg[i];
R = end[i] - 1;
if (L<R)
{
piv = arr[L];
cidx = originalIndices[L];
if (i==MAX_LEVELS-1)
return;
while (L < R)
{
while (strcmp(arr[R], piv) >= 0 && L < R) R--;
if (L < R)
{
arr[L] = arr[R];
originalIndices[L++] = originalIndices[R];
}
while (strcmp(arr[L], piv) <= 0 && L < R) L++;
if (L < R)
{
arr[R] = arr[L];
originalIndices[R--] = originalIndices[L];
}
}
arr[L] = piv;
originalIndices[L] = cidx;
beg[i + 1] = L + 1;
end[i + 1] = end[i];
end[i++] = L;
}
else
{
i--;
}
}
}
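int removeDuplicatesFromBoth(char **arr, int elements, int *originalIndices)
{
// now remove duplicates; the quicksort above is not stable, so within a run
// of equal strings keep the one with the smallest original index
int i = 1, newLimit = 1;
int keep = 0; // position of the occurrence kept for the current run
while (i < elements)
{
if(strcmp(arr[keep], arr[i]) == 0)
{
int drop = i;
if(originalIndices[i] < originalIndices[keep])
{
drop = keep; // arr[i] came first in the input, so keep it instead
keep = i;
}
arr[drop] = NULL; // free this if it was malloc'd
originalIndices[drop] = elements; // place it at the end
}
else
{
keep = i;
newLimit++;
}
i++;
}
return newLimit;
}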
void sortArrayBasedOnCriteria(char **arr, int elements, int *originalIndices)
{
#define MAX_LEVELS 1000
int piv;
int beg[MAX_LEVELS], end[MAX_LEVELS], i=0, L, R;
int idx;
char *cidx;
beg[0] = 0;
end[0] = elements;
while (i>=0)
{
L = beg[i];
R = end[i] - 1;
if (L<R)
{
piv = originalIndices[L];
cidx = arr[L];
if (i==MAX_LEVELS-1)
return;
while (L < R)
{
while (originalIndices[R] >= piv && L < R) R--;
if (L < R)
{
arr[L] = arr[R];
originalIndices[L++] = originalIndices[R];
}
while (originalIndices[L] <= piv && L < R) L++;
if (L < R)
{
arr[R] = arr[L];
originalIndices[R--] = originalIndices[L];
}
}
arr[L] = cidx;
originalIndices[L] = piv;
beg[i + 1] = L + 1;
end[i + 1] = end[i];
end[i++] = L;
}
else
{
i--;
}
}
}
int removeDuplicateStrings(char *words[], int limit)
{
int *indices = (int *)malloc(limit * sizeof(int));
int newLimit;
sortArrayAndSetCriteria(words, limit, indices);
newLimit = removeDuplicatesFromBoth(words, limit, indices);
sortArrayBasedOnCriteria(words, limit, indices);
free(indices);
return newLimit;
}
int main()
{
char *words[] = { "abc", "def", "bad", "hello", "captain", "def", "abc", "goodbye" };
int newLimit = removeDuplicateStrings(words, 8);
int i = 0;
for(i = 0; i < newLimit; i++) printf(" Word # %d = %s\n", i, words[i]);
return 0;
}
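Traverse the items in the array - O(n) operation
For each item, add it to another, sorted array
Before adding it to the sorted array, check whether the entry already exists - O(log n) operation with binary search
Overall this takes O(n log n) comparisons (inserting into a plain sorted array also shifts elements, so a balanced tree avoids that extra cost)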
I think that in C you can create a second array, then copy an element from the original array only if it is not already in the second array.
This also preserves the order of the elements.
If you read the elements one by one, you can discard a duplicate before inserting it into the original array, which could speed up the process.
As Thomas suggested in a comment, if each element of the array is guaranteed to be from a limited set of values (such as a char) you can achieve this in O(n) time.
Keep an array of 256 bool (or int if your compiler doesn't support bool) or however many different discrete values could possibly be in the array. Initialize all the values to false.
Scan the input array one-by-one.
For each element, if the corresponding value in the bool array is false, add it to the output array and set the bool array value to true. Otherwise, do nothing.
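The three steps above can be sketched as follows. The sketch is in Java rather than C, and it assumes every character code is below 256, matching the 256-entry table; the class name is hypothetical.

```java
class CharDedupe {
    // Keeps the first occurrence of each character, in input order.
    // Assumes all characters have codes below 256.
    static String removeDuplicates(String input) {
        boolean[] seen = new boolean[256]; // all false initially
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= 256) {
                throw new IllegalArgumentException("character code out of table range");
            }
            if (!seen[c]) {
                seen[c] = true;
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

For the input "bacadt" this returns "bacdt" in a single O(n) pass.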
You know how to do it for char type, right?
You can do same thing with strings, but instead of using array of bools (which is technically an implementation of "set" object), you'll have to simulate the "set"(or array of bools) with a linear array of strings you already encountered. I.e. you have an array of strings you already saw, for each new string you check if it is in array of "seen" strings, if it is, then you ignore it (not unique), if it is not in array, you add it to both array of seen strings and output. If you have a small number of different strings (below 1000), you could ignore performance optimizations, and simply compare each new string with everything you already saw before.
With large number of strings (few thousands), however, you'll need to optimize things a bit:
1) Every time you add a new string to an array of strings you already saw, sort the array with insertion sort algorithm. Don't use quickSort, because insertion sort tends to be faster when data is almost sorted.
2) When checking if string is in array, use binary search.
If number of different strings is reasonable (i.e. you don't have billions of unique strings), this approach should be fast enough.
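A minimal sketch of that optimized variant, in Java rather than C: the "seen" strings are kept in a sorted list, membership is checked with binary search, and each new string is inserted at the position the search reports (one insertion step, instead of re-sorting). The class name is made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class StringDedupe {
    // Returns the first occurrence of each word, in input order.
    static List<String> unique(List<String> words) {
        List<String> seen = new ArrayList<>();   // always kept sorted
        List<String> output = new ArrayList<>(); // keeps the input order
        for (String w : words) {
            int pos = Collections.binarySearch(seen, w);
            if (pos < 0) {
                // Not seen yet: binarySearch returns -(insertionPoint) - 1.
                seen.add(-pos - 1, w);
                output.add(w);
            }
        }
        return output;
    }
}
```

In everyday Java a LinkedHashSet would give the same result with less code; the sorted list is used here only because it mirrors the approach described above.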
