How to know if an array is "almost" sorted? - arrays

Given an array of numbers, I have to decide whether heap sort or merge sort will sort it faster, without actually running them. To do that, I am trying to find good indicators of how each algorithm will perform, such as the length of the array.
I have found that merge sort is exceptionally good at sorting almost sorted arrays. In that sense, I am trying to find a good way to estimate how "almost" sorted an array is but I am not sure how to do that.
I have thought about using the mean of the differences between consecutive elements in the array, but I am not sure if that is the best approach to this problem. For example:
public class AlmostSortedCalculator {
private static final int[] UNSORTED_ARRAY = {7, 1, 3, 9, 4, 8, 5};
private static final int[] SORTED_ARRAY = {1, 3, 4, 5, 7, 8, 9};
private static final int[] UNSORTED_ARRAY_ = {200, 20, 634, 9957, 1, 890, 555};
private static final int[] SORTED_ARRAY_ = {1, 20, 200, 555, 634, 890, 9957};
public static void main(String[] args) {
new AlmostSortedCalculator();
}
public AlmostSortedCalculator() {
calculate(SORTED_ARRAY);
calculate(UNSORTED_ARRAY);
calculate(SORTED_ARRAY_);
calculate(UNSORTED_ARRAY_);
}
private void calculate(int[] array) {
int result = 0;
for (int i = array.length - 1; i != 0; i--) {
if (i != 0) {
result += array[i] - array[i - 1];
}
}
System.out.println("The result is: " + result / array.length);
}
}
The result is: 1
The result is: 0
The result is: 1422
The result is: 50
The result of the means seems to be higher when the array is sorted but I am not sure how reliable that indicator is. I am sure there is a better approach for this, but I cannot think of any. Any suggestions?

First of all, I'd only look at the sign of the subtraction results:
/* returns the sign of the expression a - b */
int sign_of_subtraction_result(int a, int b) {
if ( a < b ) return -1;
if ( a > b ) return +1;
return 0;
}
You may also call this function compare().
Note that the usual library sorting functions use only this information and require such compare() functions.
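One way to turn the sign idea into a number is to count how many adjacent pairs are out of order. The sketch below is only an illustration in Python; the function names `descent_fraction` and `count_runs` are made up for this example and are not from the answer. It is also worth noting that the sum of consecutive differences telescopes to `array[last] - array[first]`, so the mean computed in the question depends only on the two end elements, whereas counting signs looks at every pair.

```python
def descent_fraction(values):
    """Fraction of adjacent pairs that are out of order (0.0 means sorted)."""
    if len(values) < 2:
        return 0.0
    # Only the sign of each comparison matters, not the size of the gap.
    descents = sum(1 for a, b in zip(values, values[1:]) if a > b)
    return descents / (len(values) - 1)


def count_runs(values):
    """Number of maximal non-decreasing runs (1 means already sorted)."""
    if not values:
        return 0
    return 1 + sum(1 for a, b in zip(values, values[1:]) if a > b)


print(descent_fraction([1, 3, 4, 5, 7, 8, 9]))  # sorted input
print(descent_fraction([7, 1, 3, 9, 4, 8, 5]))  # unsorted input
```

A low descent fraction (or a small number of runs) suggests the input is close to sorted; whether that actually predicts merge sort beating heap sort would still have to be measured.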

Related

Minimizing time complexity when looping over two arrays

For an intro computer science class we have to write a function that finds the intersection of two arrays, with each unique element being shown only once and without allocating more space than we need.
For example:
array A = {1, 2, 3, 3, 3, 5}
array B = {2, 2, 2, 3, 5, 6}
intersection of A and B = {2, 3, 5}
How can I accomplish this without looping over both arrays twice? As it stands I have:
//find how large of an array I'll need
for array A
    for array A
        if A[i] is already somewhere earlier in array A
            stop
        else
            loop through array B
                if A[i] is in array B, increment a counter
declare a new array of size counter
//add unique elements to the array
for array A
    for array A
        if A[i] is already somewhere earlier in array A
            stop
        else
            loop through array B
                if A[i] is in array B, add it to the new array
It seems like this would be really inefficient, since I have two nearly identical nested for loops. If I were using Python I could just append unique elements to a list, but is there a way I could do something similar in C? I could just declare an array of the maximum size I could need, but I'm trying to minimize the space complexity.
If you are familiar with sets, you can use one. The time complexity will then be O(n) on average, assuming a hash-based set.
You can read more about sets and how to implement one in C here.
Then you can do something like this (written in Java):
public int[] intersection(int[] nums1, int[] nums2) {
Set<Integer> set1 = getSet(nums1);
Set<Integer> set2 = getSet(nums2);
Set<Integer> ans = new HashSet<>();
for(Integer i: set1) {
if(set2.contains(i)) {
ans.add(i);
}
}
int[] ret = new int[ans.size()];
int count = 0;
for(Integer i: ans) {
ret[count] = i;
count++;
}
return ret;
}
public Set<Integer> getSet(int[] arr) {
Set<Integer> set = new HashSet<>();
for(int a: arr) { set.add(a); }
return set;
}
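If both arrays are already sorted in ascending order, as they are in the question's example (this is an assumption, the question does not state it), a set is not needed at all. The following Python sketch is an alternative to the answer above, not a translation of it: it walks both arrays once with two indices. The name `sorted_intersection` is hypothetical.

```python
def sorted_intersection(a, b):
    """Unique common elements of two ascending-sorted lists, in one pass."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            # Shared value: record it once, even if it repeats in both lists.
            if not result or result[-1] != a[i]:
                result.append(a[i])
            i += 1
            j += 1
    return result


print(sorted_intersection([1, 2, 3, 3, 3, 5], [2, 2, 2, 3, 5, 6]))  # [2, 3, 5]
```

In C, the same loop could be run once to count the matches and a second time to fill an array of exactly that size, which keeps the extra space to the size of the result.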

Merge duplicate longs in an array

I'm trying to merge/multiply duplicate longs in an array recursively.
So if I have something like that:
long[] arr = {3, 5, 6, 6, 7} => long[] arr = {3, 5, 36, 7}
That's what I've got:
public static long[] merge(long[] ns, int i, Merger m) {
m.merge();
if(i > ns.length) return new long[0];
if(i < 0) return merge(ns, 0, m);
else {
if(ns[i] != ns[i+1]) {
return append(merge(ns, i-1, m), ns[i+1]);
}
else {
return append(merge(ns, i-1, m), ns[i] * ns[i+1]);
}
}
public long[] append(long[] old, long newLast) {
long[] result = Arrays.copyOf(old, old.length + 1);
result[old.length] = newLast;
return result;
}
}
But it gets stuck in its recursion.
There are multiple cases that are not clear from the approach that you've taken.
What happens when there are multiple instances of the same value? Do they simply get multiplied? In your current logic, you check whether ns[i] != ns[i+1], which assumes that (a) the list is sorted, and (b) duplicates occur only in pairs.
To see why (a) is needed: your current approach would not multiply the two 6s if your input list were [3,6,5,6,7]. Is this a valid assumption to make?
To see why (b) is needed: assume your input is [1,3,5,6,6,6,7]. In this case, after multiplying the first two occurrences of 6, the resulting list would be [1,3,5,36,6,7], and your current logic would not go on to multiply 36 and 6.
Is this intended?
Before implementing a recursive solution, it would be instructive to write out the iterative implementation first. That way, the problem specification will become clearer to you.
Assuming these two assumptions hold for the specific problem you're trying to solve, the implementation below works.
(Note - this is implemented in Python. If you're looking for a Java-specific solution, you should modify your question to say so and add a Java tag to your post. Someone fluent in Java can then help you out. This solution tries to resemble your approach as closely as possible.)
def merge(ns, i, resultant_list = None):
    if resultant_list is None:
        resultant_list = []
    if i > len(ns)-1:
        return resultant_list
    else:
        if i == len(ns)-1:
            append(resultant_list, ns[i])
            return resultant_list
        elif(ns[i] != ns[i+1]):
            append(resultant_list, ns[i])
            return merge(ns, i+1, resultant_list)
        else:
            append(resultant_list, ns[i] * ns[i+1])
            return merge(ns, i+2, resultant_list)
def append(old, newLast):
    old.append(newLast)
    return old
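The iterative version suggested above might look like the following sketch. It keeps the same two assumptions (sorted input, duplicates merged pairwise from left to right), and the name `merge_adjacent_pairs` is invented for this example.

```python
def merge_adjacent_pairs(ns):
    """Multiply each adjacent equal pair, scanning left to right once."""
    result = []
    i = 0
    while i < len(ns):
        if i + 1 < len(ns) and ns[i] == ns[i + 1]:
            # Equal neighbours: emit their product and skip both.
            result.append(ns[i] * ns[i + 1])
            i += 2
        else:
            result.append(ns[i])
            i += 1
    return result


print(merge_adjacent_pairs([3, 5, 6, 6, 7]))        # [3, 5, 36, 7]
print(merge_adjacent_pairs([1, 3, 5, 6, 6, 6, 7]))  # [1, 3, 5, 36, 6, 7]
```

The second call shows case (b) from above: the third 6 is left alone, and the resulting 36 is not merged with it.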

How do I use 2-D arrays to determine whether or not something is a Magic Square? (requires much help, please)

Before I ask my question, I want to make myself clear that I am not expecting someone to do my lab for me. I am seeking genuine help to understand my lab, I am using StackOverflow because it is consistent and helpful. YES I have searched for answers to my question elsewhere. I am trying my best to format a detailed "good" question.
Okay, this is a MagicSquare program. Unlike others I have viewed online, the program we are writing is to determine whether or not a two-dimensional array is a Magic Square. After it determines whether or not it is, it will print the result. I am struggling in my class when it comes to writing a program. So if you could please help me out, I would appreciate it. (More specifics are commented within my code)
public class MagicSquare {
public static void main(String [] args){
int [] [] square3 = {{ 1, 6, 4}, {8, 2, 9} {5, 7, 3}}; // this is for the 3x3 matrix
MagicSquare case1 = new MagicSquare( square3 );
String table = case1.toString();
System.out.println(table);
// Do I stop here for the first array?
int [] [] square4 = {{16, 3, 2, 13}, {5, 10, 11, 8}, {9, 6, 7, 12}, {4, 15, 14, 1}}
//do something here for the rest of the second 2d array
The above is pre-written code, I understand that it is setting up what will be printed. Is the case1 necessary for doing this or could you use a loop instead?
if(case1.isAmagicSq( ))
System.out.println("...is a magic square..." );
else
System.out.println("...is not a magic square... " );
//repeat for the 4x4 square
}
This simply states whether or not it is a magic square.
}/* write the constructor and two instance methods here */
//1. Write a constructor to create a deep copy of the 2-D array
//formal parameter. There should be a private instance field for
//holding the deep copy of the input array
private int arrayCopy(int [] [] square3, square4){
int [] = new int[square3.length];
for(int i = 0; i < square3.length; i++)
result[i] = square3[i];
return result;
int [] = new int[square4.length];
for( int i = 0; i < square4.length; i++)
result[i] = square4[i];
return result;
}
Am I on the right track for writing a deep copy of the array? Would I use the variables above?
//2. Write the toString method to return a String with a table
//containing the magic square
//3. Determine if the 2d array is or is not a magic square
//by using an instance method
public static int count Elements( int [][] array){
int result;
for(int i = 0; i < array.length; i++) result+=array[i].length;
return result;
}
public static int[] rowI(int data[][], int i){
if(i<data.length)
return data[i];
else
return null;
}
For the last one I have started to write something that (should) count across each row to see if they add up to what a MagicSquare is supposed to be. Obviously I'm not finished, but I didn't want to continue if I'm completely incorrect...
You do not need a deep copy of any sort to validate a magic square. You should just pass the square to some method, and that will add up the numbers for each row and column. Passing the object is done by reference, and since you are not modifying it, you shouldn't have to worry about any copy function. This is the most critical part of the exercise, I would work on that first.
Everything you need to do should be in that method you defined:
public boolean isAmagicSq(){
// do your checks here
}
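As a rough illustration of the checks that could go inside such a method, here is a Python sketch (the lab itself is in Java). The answer mentions rows and columns; the two diagonal checks are added here on the assumption that the lab uses the usual definition of a magic square. The function name `is_magic_square` is hypothetical, and the sketch does not verify that the entries are the distinct numbers 1 to n*n.

```python
def is_magic_square(square):
    """True if every row, column and both diagonals share one sum."""
    n = len(square)
    if n == 0 or any(len(row) != n for row in square):
        return False  # not a square grid
    target = sum(square[0])
    rows_ok = all(sum(row) == target for row in square)
    cols_ok = all(sum(square[r][c] for r in range(n)) == target
                  for c in range(n))
    diag_ok = sum(square[i][i] for i in range(n)) == target
    anti_ok = sum(square[i][n - 1 - i] for i in range(n)) == target
    return rows_ok and cols_ok and diag_ok and anti_ok


square3 = [[1, 6, 4], [8, 2, 9], [5, 7, 3]]
square4 = [[16, 3, 2, 13], [5, 10, 11, 8], [9, 6, 7, 12], [4, 15, 14, 1]]
print(is_magic_square(square3))  # False: the rows sum to 11, 19 and 15
print(is_magic_square(square4))  # True: every line sums to 34
```

Because the function only reads the grid, no copy is needed for the check itself.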

Calculate all possibilities to get N using values from a given set [duplicate]

This question already has answers here:
Algorithm to find elements best fitting in a particular amount
(5 answers)
how do you calculate the minimum-coin change for transaction?
(3 answers)
Closed 9 years ago.
So here is the problem:
Given input = [100 80 66 25 4 2 1], I need to find the best combination to give me 50.
Looking at this, the best would be 25+25 = 50, so I need 2 elements from the array.
Other combinations include 25+4+4+4+4+4+4+1 and 25+4+4+4+4+4+2+2+1.. etc etc
I need to find all the possibilities that give me the sum I want.
EDIT: As well as the best possibility (one with least number of terms)
Here is what I have done thus far:
First, build a new array (a simple for loop that cycles through all elements and stores them in a new temp array), discarding all elements higher than my target (so for input 50, the elements 100, 80 and 66 are higher, so I discard them and my new array is [25 4 2 1]). Then, from this, I need to check combinations.
The first thing I do is a simple if statement checking if any array elements EXACTLY match the number I want. So if I want 50, I check if 50 is in the array, if not, I need to find combinations.
My problem is, I'm not entirely sure how to find every single combination. I have been struggling trying to come up with an algorithm for a while but I always just end up getting stumped.
Any help/tips would be much appreciated.
PS - we can assume the array is always sorted in order from LARGEST to SMALLEST value.
This is the kind of problem that dynamic programming is meant to solve.
Create an array with indices 1 to 50. Set each entry to -1. For each element that is in your input array, set the entry at that index to 0. Then, for each integer n = 2 to 50, find all possible ways to write n as a sum of two smaller addends. The number of additions required for n is the minimum, over those splits, of the counts for the two addends plus 1. At the end, read the entry at index 50.
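A minimal Python sketch of this kind of table-filling follows. It uses a common variant of the idea rather than the exact two-addend formulation: `best[t]` holds the fewest terms that sum to `t`, and each entry is built from `best[t - v]` for every allowed value `v`. The names `fewest_terms`, `best` and `last` are made up for this example.

```python
def fewest_terms(values, target):
    """Shortest list of values (reuse allowed) summing to target, or None."""
    INF = float("inf")
    best = [0] + [INF] * target    # best[t]: fewest terms that sum to t
    last = [None] * (target + 1)   # last[t]: a value ending an optimal sum for t
    for t in range(1, target + 1):
        for v in values:
            if 0 < v <= t and best[t - v] + 1 < best[t]:
                best[t] = best[t - v] + 1
                last[t] = v
    if best[target] == INF:
        return None  # target cannot be formed at all
    # Walk back through the table to recover one optimal combination.
    combo = []
    t = target
    while t > 0:
        combo.append(last[t])
        t -= last[t]
    return combo


print(fewest_terms([100, 80, 66, 25, 4, 2, 1], 50))  # [25, 25]
```

The table has `target + 1` entries and each is filled by one pass over the values, so the work grows with `target * len(values)`.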
Edit: Due to a misinterpretation of the question, I first answered with an efficient way to calculate the number of possibilities (instead of the possibilities themselves) to get N using values from a given set. That solution can be found at the bottom of this post as a reference for other people, but first I'll give a proper answer to your questions.
Generate all possibilities, count them and give the shortest one
When generating a solution, you consider each element from the input array and ask yourself "should I use this in my solution or not?". Since we don't know the answer until after the calculation, we'll just have to try out both using it and not using it, as can be seen in the recursion step in the code below.
Now, to avoid duplicates and misses, we need to be a bit careful with the parameters for the recursive call. If we use the current element, we should also allow it to be used in the next step, because the element may be used as many times as needed. Therefore, the first parameter in this recursive call is i. However, if we decide to not use the element, we should not allow it to be used in the next step, because that would be a duplicate of the current step. Therefore, the first parameter in this recursive call is i+1.
I added an optional bound (from "branch and bound") to the algorithm, that will stop expanding the current partial solution if it is known that this solution will never be shorter than the shortest solution found so far.
package otherproblems;
import java.util.Deque;
import java.util.LinkedList;
public class GeneratePossibilities
{
// Input
private static int n = 50;
// If the input array is sorted ascending, the shortest solution is
// likely to be found somewhere at the end.
// If the input array is sorted descending, the shortest solution is
// likely to be found somewhere in the beginning.
private static int[] input = {100, 80, 66, 25, 4, 2, 1};
// Shortest possibility
private static Deque<Integer> shortest;
// Number of possibilities
private static int numberOfPossibilities;
public static void main(String[] args)
{
calculate(0, n, new LinkedList<Integer>());
System.out.println("\nAbove you can see all " + numberOfPossibilities +
" possible solutions,\nbut this one's the shortest: " + shortest);
}
public static void calculate(int i, int left, Deque<Integer> partialSolution)
{
// If there's nothing left, we reached our target
if (left == 0)
{
System.out.println(partialSolution);
if (shortest == null || partialSolution.size() < shortest.size())
shortest = new LinkedList<Integer>(partialSolution);
numberOfPossibilities++;
return;
}
// If we overshot our target, by definition we didn't reach it
// Note that this could also be checked before making the
// recursive call, but IMHO this gives a cleaner recursion step.
if (left < 0)
return;
// If there are no values remaining, we didn't reach our target
if (i == input.length)
return;
// Uncomment the next two lines if you don't want to keep generating
// possibilities when you know it can never be a better solution than
// the one you have now.
// if (shortest != null && partialSolution.size() >= shortest.size())
// return;
// Pick value i. Note that we are allowed to pick it again,
// so the argument to calculate(...) is i, not i+1.
partialSolution.addLast(input[i]);
calculate(i, left-input[i], partialSolution);
// Don't pick value i. Note that we are not allowed to pick it after
// all, so the argument to calculate(...) is i+1, not i.
partialSolution.removeLast();
calculate(i+1, left, partialSolution);
}
}
Calculate the number of possibilities efficiently
This is a nice example of dynamic programming. What you need to do is figure out how many possibilities there are to form the number x, using value y as the last addition and using only values smaller than or equal to y. This gives you a recursive formula that you can easily translate to a solution using dynamic programming. I'm not quite sure how to write down the mathematics here, but since you weren't interested in them anyway, here's the code to solve your question :)
import java.util.Arrays;
public class Possibilities
{
public static void main(String[] args)
{
// Input
int[] input = {100, 80, 66, 25, 4, 2, 1};
int n = 50;
// Prepare input
Arrays.sort(input);
// Allocate storage space
long[][] m = new long[n+1][input.length];
for (int i = 1; i <= n; i++)
for (int j = 0; j < input.length; j++)
{
// input[j] cannot be the last value used to compose i
if (i < input[j])
m[i][j] = 0;
// If input[j] is the last value used to compose i,
// it must be the only value used in the composition.
else if (i == input[j])
m[i][j] = 1;
// If input[j] is the last value used to compose i,
// we need to know the number of possibilities in which
// i - input[j] can be composed, which is the sum of all
// entries in column m[i-input[j]].
// However, to avoid counting duplicates, we only take
// combinations that are composed of values equal or smaller
// to input[j].
else
for (int k = 0; k <= j; k++)
m[i][j] += m[i-input[j]][k];
}
// Nice output of intermediate values:
int digits = 3;
System.out.printf(" %"+digits+"s", "");
for (int i = 1; i <= n; i++)
System.out.printf(" %"+digits+"d", i);
System.out.println();
for (int j = 0; j < input.length; j++)
{
System.out.printf(" %"+digits+"d", input[j]);
for (int i = 1; i <= n; i++)
System.out.printf(" %"+digits+"d", m[i][j]);
System.out.println();
}
// Answer:
long answer = 0;
for (int i = 0; i < input.length; i++)
answer += m[n][i];
System.out.println("\nThe number of possibilities to form "+n+
" using the numbers "+Arrays.toString(input)+" is "+answer);
}
}
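For comparison, the same count can be obtained with a one-dimensional table. This Python sketch is a standard alternative and not the table layout used in the Java code above; processing one value at a time in the outer loop is what prevents the same combination from being counted in several orders. The name `count_combinations` is hypothetical, and the values are assumed to be distinct positive integers.

```python
def count_combinations(values, target):
    """Number of multisets of values (reuse allowed) that sum to target."""
    ways = [1] + [0] * target  # ways[0] = 1: the empty combination
    for v in values:
        if v <= 0:
            continue  # non-positive values are ignored in this sketch
        # Extend every combination counted so far with one more copy of v.
        for t in range(v, target + 1):
            ways[t] += ways[t - v]
    return ways[target]


print(count_combinations([1, 2], 4))     # 3: 1+1+1+1, 1+1+2, 2+2
print(count_combinations([1, 2, 5], 5))  # 4
```

Under those assumptions it should agree with the sum over the last row of the table computed above.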
This is the integer knapsack problem, which is one of the most common NP-complete problems out there; if you are into algorithm design/study check those out. To find the best I think you have no choice but to compute them all and keep the smallest one.
For the correct solution there is a recursive algorithm that is pretty simple to put together.
import org.apache.commons.lang.ArrayUtils;
import java.util.*;
public class Stuff {
private final int target;
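private final int[] steps;
// Declared so the assignment in the constructor compiles; solveForN does not read it yet.
private final Map<Integer, List<Integer>> memoize;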
public Stuff(int N, int[] steps) {
this.target = N;
this.steps = Arrays.copyOf(steps, steps.length);
Arrays.sort(this.steps);
ArrayUtils.reverse(this.steps);
this.memoize = new HashMap<Integer, List<Integer>>(N);
}
public List<Integer> solve() {
return solveForN(target);
}
private List<Integer> solveForN(int N) {
if (N == 0) {
return new ArrayList<Integer>();
} else if (N > 0) {
List<Integer> temp, min = null;
for (int i = 0; i < steps.length; i++) {
temp = solveForN(N - steps[i]);
if (temp != null) {
temp.add(steps[i]);
if (min == null || min.size() > temp.size()) {
min = temp;
}
}
}
return min;
} else {
return null;
}
}
}
It is based on the fact that to "get to N" you have to have come from N - steps[0], or N - steps[1], ...
Thus you start from your target total N and subtract one of the possible steps, and do it again until you are at 0 (return a List to specify that this is a valid path) or below (return null so that you cannot return an invalid path).
The complexity of this correct solution is exponential! Which is REALLY bad! Something like O(k^M) where M is the size of the steps array and k a constant.
To get a solution to this problem in less time than that you will have to use a heuristic (approximation) and you will always have a certain probability to have the wrong answer.
You can make your own implementation faster by memoizing the shortest combination seen so far for each target (so you do not need to recompute recur(N, _, steps) if you already did). This approach is called dynamic programming. I will let you do that on your own (very fun stuff and really not that complicated).
Constraints of this solution : You will only find the solution if you guarantee that the input array (steps) is sorted in descending order and that you go through it in that order.
Here is a link to the general Knapsack problem if you also want to look approximation solutions: http://en.wikipedia.org/wiki/Knapsack_problem
You need to solve each sub-problem and store the solution. For example:
1 can only be 1. 2 can be 2 or 1+1. 4 can be 4 or 2+2 or 2+1+1 or 1+1+1+1. So you take each sub-solution and store it, so when you see 25=4+4+4+4+4+4+1, you already know that each 4 can also be represented as one of the 3 combinations.
Then you have to sort the digits and check to avoid duplicate patterns since, for example, (2+2)+(2+2)+(2+2)+(1+1+1+1)+(1+1+1+1)+(1+1+1+1) == (2+1+1)+(2+1+1)+(2+1+1)+(2+1+1)+(2+1+1)+(2+1+1). Six 2's and twelve 1's in both cases.
Does that make sense?
Recursion should be the easiest way to solve this (Assuming you really want to find all the solutions to the problem). The nice thing about this approach is, if you want to just find the shortest solution, you can add a check on the recursion and find just that, saving time and space :)
Assuming an element i of your array is part of the solution, you can solve the subproblem of finding the elements that sums to n-i. If we add an ordering to our solution, for example the numbers in the sum must be from the greater to the smallest, we have a way to find unique solutions.
This is a recursive solution in C#; it should be easy to translate it to Java.
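public static void RecursiveSum(int n, int index, List<int> lst, List<int> solution)
{
    // A finished solution is printed once, before the loop,
    // so it is not repeated for every remaining element.
    if (n == 0)
    {
        Console.WriteLine("");
        foreach (int j in solution)
        {
            Console.Write(j + " ");
        }
        return;
    }
    for (int i = index; i < lst.Count; i++)
    {
        if (n - lst[i] >= 0)
        {
            // Copy the partial solution so sibling branches do not share it.
            List<int> tmp = new List<int>(solution);
            tmp.Add(lst[i]);
            RecursiveSum(n - lst[i], i, lst, tmp);
        }
    }
}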
You call it with
RecursiveSum(N,0,list,new List<int>());
where N is the sum you are looking for, 0 shouldn't be changed, list is your list of allowed numbers, and the last parameter shouldn't be changed either.
The problem you pose is interesting but very complex. I'd approach this by using something like OptaPlanner (formerly Drools Planner). It's difficult to describe a full solution to this problem without spending significant time, but with OptaPlanner you can also get "closest fit" type answers and can have incremental "moves" that would make solving your problem more efficient. Good luck.
This is a solution in python: Ideone link
# Python 2 syntax (print statements, integer division with /)
# Start of tsum function
def tsum(currentSum,total,input,record,n):
    if total == N :
        for i in range(0,n):
            if record[i]:
                print input[i]
        i = i+1
        for i in range(i,n):
            if record[i]:
                print input[i]
        print ""
        return
    i=currentSum
    for i in range(i,n):
        # skip values that would overshoot the target N
        if total+input[i]>N :
            continue
        # skip a repeated value unless its previous copy is in use
        if i>0 and input[i]==input[i-1] and not record[i-1] :
            continue
        record[i]=1
        tsum(i+1,total+input[i],input,record,l)
        record[i]=0
# end of function
# Below portion will be main() in Java
record = []
N = 5
input = [3, 2, 2, 1, 1]
temp = list(set(input))
newlist = input
for i in range(0, len(list(set(input)))):
    val = N/temp[i]
    for j in range(0, val-input.count(temp[i])):
        newlist.append(temp[i])
# above logic was to create a newlist/input i.e [3, 2, 2, 1, 1, 1, 1, 1]
# This new list contains the maximum number of elements <= N
# for e.g appended three 1's as sum of new three 1's + existing two 1's <= N(5) where as
# did not append another 2 as 2+2+2 > N(5) or 3 as 3+3 > N(5)
l = len(input)
for i in range(0,l):
    record.append(0)
print "all possibilities to get N using values from a given set:"
tsum(0,0,input,record,l)
OUTPUT: for set [3, 2, 2, 1, 1] taking small set and small N for demo purpose. But works well for higher N value as well.
For N = 5
all possibilities to get N using values from a given set:
3
2
3
1
1
2
2
1
2
1
1
1
1
1
1
1
1
For N = 3
all possibilities to get N using values from a given set:
3
2
1
1
1
1
Isn't this just a search problem? If so, just search breadth-first.
abstract class Numbers {
abstract int total();
public static Numbers breadthFirst(int[] numbers, int total) {
List<Numbers> stack = new LinkedList<Numbers>();
if (total == 0) { return new Empty(); }
stack.add(new Empty());
while (!stack.isEmpty()) {
Numbers nums = stack.remove(0);
for (int i : numbers) {
if (i > 0 && total - nums.total() >= i) {
Numbers more = new SomeNumbers(i, nums);
if (more.total() == total) { return more; }
stack.add(more);
}
}
}
return null; // No answer.
}
}
class Empty extends Numbers {
int total() { return 0; }
public String toString() { return "empty"; }
}
class SomeNumbers extends Numbers {
final int total;
final Numbers prev;
SomeNumbers(int n, Numbers prev) {
this.total = n + prev.total();
this.prev = prev;
}
int total() { return total; }
public String toString() {
if (prev.getClass() == Empty.class) { return "" + total; }
return prev + "," + (total - prev.total());
}
}
What about using the greedy algorithm n times (n is the number of elements in your array), each time popping the largest element off the list. E.g. (in some random pseudo-code language):
array = [70 30 25 4 2 1]
value = 50
sort(array, descending)
solutions = [] // array of arrays
while length of array is non-zero:
    tmpValue = value
    thisSolution = []
    for each i in array:
        while tmpValue >= i:
            tmpValue -= i
            thisSolution.append(i)
    solutions.append(thisSolution)
    array.pop_first() // remove the largest entry from the array
If run with the set [70 30 25 4 2 1] and 50, it should give you a solutions array like this:
[[30 4 4 4 4 4]
[30 4 4 4 4 4]
[25 25]
[4 4 4 4 4 4 4 4 4 4 4 4 2]
[2 ... ]
[1 ... ]]
Then simply pick the element from the solutions array with the smallest length.
Update: The comment is correct that this does not generate the correct answer in all cases. The reason is that greedy isn't always right. The following recursive algorithm should always work:
array = [70, 30, 25, 4, 3, 1]
def findSmallest(value, array):
    minSolution = []
    tmpArray = list(array)
    while len(tmpArray):
        elem = tmpArray.pop(0)
        tmpValue = value
        cnt = 0
        while tmpValue >= elem:
            cnt += 1
            tmpValue -= elem
            subSolution = findSmallest(tmpValue, tmpArray)
            if tmpValue == 0 or subSolution:
                if not minSolution or len(subSolution) + cnt < len(minSolution):
                    minSolution = subSolution + [elem] * cnt
    return minSolution
print findSmallest(10, array)
print findSmallest(50, array)
print findSmallest(49, array)
print findSmallest(55, array)
Prints:
[3, 3, 4]
[25, 25]
[3, 4, 4, 4, 4, 30]
[30, 25]
The invariant is that the function returns either the smallest set for the value passed in, or an empty set. It can then be used recursively with all possible values of the previous numbers in the list. Note that this is O(n!) in complexity, so it's going to be slow for large values. Also note that there are numerous optimization potentials here.
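To make the "greedy isn't always right" point concrete, the following Python sketch compares a plain largest-first greedy with a breadth-first search over the remaining amount, which is guaranteed to reach a shortest sum first. Both functions and their names (`greedy_terms`, `shortest_terms`) are written for this illustration only and are not taken from the answer's code.

```python
from collections import deque


def greedy_terms(values, target):
    """Largest-first greedy; may miss the shortest answer or fail entirely."""
    combo = []
    for v in sorted(values, reverse=True):
        while v > 0 and target >= v:
            combo.append(v)
            target -= v
    return combo if target == 0 else None


def shortest_terms(values, target):
    """Breadth-first search on the remaining amount; first hit is shortest."""
    queue = deque([(target, [])])
    seen = {target}
    while queue:
        remaining, combo = queue.popleft()
        if remaining == 0:
            return combo
        for v in values:
            nxt = remaining - v
            if v > 0 and nxt >= 0 and nxt not in seen:
                seen.add(nxt)  # each amount is expanded at most once
                queue.append((nxt, combo + [v]))
    return None


values = [70, 30, 25, 4, 3, 1]
print(greedy_terms(values, 10))    # [4, 4, 1, 1]: four terms
print(shortest_terms(values, 10))  # [4, 3, 3]: three terms
```

For the target 10, greedy commits to two 4s and then needs two 1s, while the search finds a three-term sum.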
I made a small program to help with one solution. Personally, I believe the best would be a deterministic mathematical solution, but right now I lack the caffeine to even think on how to implement it. =)
Instead, I went with a SAR approach. Stop and Reverse is a technique used on stock trading (http://daytrading.about.com/od/stou/g/SAR.htm), and is heavily used to calculate optimal curves with a minimal of inference. The Wikipedia entry for parabolical SAR goes like this:
'The Parabolic SAR is calculated almost independently for each trend
in the price. When the price is in an uptrend, the SAR emerges below
the price and converges upwards towards it. Similarly, on a
downtrend, the SAR emerges above the price and converges
downwards.'
I adapted it to your problem. I start with a random value from your series. Then the code enters a finite number of iterations.
I pick another random value from the series stack.
If the new value plus the stack sum is less than the target, then the value is added; if greater, then it is subtracted (this branch is disabled in the code below, since your problem does not allow negative terms).
I can go on for as much as I want until I satisfy the condition (stack sum = target), or abort if the cycle can't find a valid solution.
If successful, I record the stack and the number of iterations. Then I redo everything.
An EXTREMELY crude code follows. Please forgive the hastiness. Oh, and It's in C#. =)
Again, It does not guarantee that you'll obtain the optimal path; it's a brute force approach. It can be refined; detect if there's a perfect match for a target hit, for example.
public static class SAR
{
//I'm considering Optimal as the smallest signature (number of members).
// Once set, all future signatures must be same or smaller.
private static Random _seed = new Random();
private static List<int> _domain = new List<int>() { 100, 80, 66, 24, 4, 2, 1 };
public static void SetDomain(string domain)
{
_domain = domain.Split(',').ToList<string>().ConvertAll<int>(a => Convert.ToInt32(a));
_domain.Sort();
}
public static void FindOptimalSAR(int value)
{
// I'll skip some obvious tests. For example:
// If there is no odd number in domain, then
// it's impossible to find a path to an odd
// value.
//Determining a max path run. If the count goes
// over this, it's useless to continue.
int _maxCycle = 10;
//Determining a maximum number of runs.
int _maxRun = 1000000;
int _run = 0;
int _domainCount = _domain.Count;
List<int> _currentOptimalSig = new List<int>();
List<String> _currentOptimalOps = new List<string>();
do
{
List<int> currSig = new List<int>();
List<string> currOps = new List<string>();
int _cycle = 0;
int _cycleTot = 0;
bool _OptimalFound = false;
do
{
int _cursor = _seed.Next(_domainCount);
currSig.Add(_cursor);
if (_cycleTot < value)
{
currOps.Add("+");
_cycleTot += _domain[_cursor];
}
else
{
// Your situation doesn't allow for negative
// numbers. Otherwise, just enable the two following lines.
// currOps.Add("-");
// _cycleTot -= _domain[_cursor];
}
if (_cycleTot == value)
{
_OptimalFound = true;
break;
}
_cycle++;
} while (_cycle < _maxCycle);
if (_OptimalFound)
{
_maxCycle = _cycle;
_currentOptimalOps = currOps;
_currentOptimalSig = currSig;
Console.Write("Optimal found: ");
for (int i = 0; i < currSig.Count; i++)
{
Console.Write(currOps[i]);
Console.Write(_domain[currSig[i]]);
}
Console.WriteLine(".");
}
_run++;
} while (_run < _maxRun);
}
}
And this is the caller:
String _Domain = "100, 80, 66, 25, 4, 2, 1";
SAR.SetDomain(_Domain);
Console.WriteLine("SAR for Domain {" + _Domain + "}");
do
{
Console.Write("Input target value: ");
int _parm = (Convert.ToInt32(Console.ReadLine()));
SAR.FindOptimalSAR(_parm);
Console.WriteLine("Done.");
} while (true);
This is my result after 100k iterations for a few targets, given a slightly modified series (I switched 25 for 24 for testing purposes):
SAR for Domain {100, 80, 66, 24, 4, 2, 1}
Input target value: 50
Optimal found: +24+24+2.
Done.
Input target value: 29
Optimal found: +4+1+24.
Done.
Input target value: 75
Optimal found: +2+2+1+66+4.
Optimal found: +4+66+4+1.
Done.
Now with your original series:
SAR for Domain {100, 80, 66, 25, 4, 2, 1}
Input target value: 50
Optimal found: +25+25.
Done.
Input target value: 75
Optimal found: +25+25+25.
Done.
Input target value: 512
Optimal found: +80+80+66+100+1+80+25+80.
Optimal found: +66+100+80+100+100+66.
Done.
Input target value: 1024
Optimal found: +100+1+80+80+100+2+100+2+2+2+25+2+100+66+25+66+100+80+25+66.
Optimal found: +4+25+100+80+100+1+80+1+100+4+2+1+100+1+100+100+100+25+100.
Optimal found: +80+80+25+1+100+66+80+80+80+100+25+66+66+4+100+4+1+66.
Optimal found: +1+100+100+100+2+66+25+100+66+100+80+4+100+80+100.
Optimal found: +66+100+100+100+100+100+100+100+66+66+25+1+100.
Optimal found: +100+66+80+66+100+66+80+66+100+100+100+100.
Done.
Cons: It is worth mentioning again: This algorithm does not guarantee that you will find the optimal values. It makes a brute-force approximation.
Pros: Fast. 100k iterations may initially seem a lot, but the algorithm starts ignoring long paths after it detects more and more optimized paths, since it lessens the maximum allowed number of cycles.

Algorithm to find the duplicate numbers in an array ---Fastest Way

I need the fastest and simplest algorithm that finds the duplicate numbers in an array and also reports how many times each one occurs.
Eg: if the array is {2,3,4,5,2,4,6,2,4,7,3,8,2}
I should be able to know that there are four 2's, two 3's and three 4's.
Make a hash table where the key is an array item and the value is a counter of how many times that item has occurred in the array. This is an efficient way to do it, but probably not the fastest.
Something like this (in pseudo code). You will find plenty of hash map implementations for C by googling.
hash_map = create_new_hash_map()
for item in array {
if hash_map.contains_key(item){
counter = hash_map.get(item)
} else {
counter = 0
}
counter = counter + 1
hash_map.put(item, counter)
}
This can be solved elegantly using Linq:
public static void Main(string[] args)
{
List<int> list = new List<int> { 2, 3, 4, 5, 2, 4, 6, 2, 4, 7, 3, 8, 2 };
var grouping = list
.GroupBy(x => x)
.Select(x => new { Item = x.Key, Count = x.Count()});
foreach (var item in grouping)
Console.WriteLine("Item {0} has count {1}", item.Item, item.Count);
}
Internally it probably uses hashing to partition the list, but the code hides the internal details: here we are only telling it what to calculate. The compiler / runtime is free to choose how to calculate it, and optimize as it sees fit. Thanks to Linq this same code will run efficiently whether it runs on a list in memory or on a list in a database. In real code you should use this, but I guess you want to know how it works internally.
A more imperative approach that demonstrates the actual algorithm is as follows:
List<int> list = new List<int> { 2, 3, 4, 5, 2, 4, 6, 2, 4, 7, 3, 8, 2 };
Dictionary<int, int> counts = new Dictionary<int, int>();
foreach (int item in list)
{
if (!counts.ContainsKey(item))
{
counts[item] = 1;
}
else
{
counts[item]++;
}
}
foreach (KeyValuePair<int, int> item in counts)
Console.WriteLine("Item {0} has count {1}", item.Key, item.Value);
Here you can see that we iterate over the list only once, keeping a count for each item we see on the way. This would be a bad idea if the items were in a database though, so for real code, prefer to use the Linq method.
Here's a C version that reads the numbers from the command-line arguments; its running time is linear in the length of the input (beware, the number of parameters on the command line is limited...) but it should give you an idea of how to proceed:
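#include <stdio.h>
#include <stdlib.h> /* atoi */
int main ( int argc, char **argv ) {
    int dups[10] = { 0 }; /* one counter per digit 0..9 */
    int i;
    for ( i = 1 ; i < argc ; i++ ) {
        int v = atoi(argv[i]);
        if ( v >= 0 && v < 10 ) /* skip values that would index outside dups[] */
            dups[v]++;
    }
    for ( i = 0 ; i < 10 ; i++ )
        printf("%d: %d\n", i, dups[i]);
    return 0;
}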
example usage:
$ gcc -o dups dups.c
$ ./dups 0 0 3 4 5
0: 2
1: 0
2: 0
3: 1
4: 1
5: 1
6: 0
7: 0
8: 0
9: 0
caveats:
if you plan to count also the number of 10s, 11s, and so on -> the dups[] array must be bigger
left as an exercise is to implement reading from an array of integers and to determine their position
The more you tell us about the input arrays, the faster we can make the algorithm. For example, if the values are all single digits as in your example, creating an array of 10 elements (indexed 0 to 9) and incrementing the element whose index equals each number you read is likely to be faster than hashing. (I say likely to be faster because I haven't done any measurements and won't.)
I agree with most respondents that hashing is probably the right approach for the most general case, but it's always worth thinking about whether yours is a special case.
If you know the lower and upper bounds, and they are not too far apart, this would be a good place to use a Radix Sort. Since this smells of homework, I'm leaving it to the OP to read the article and implement the algorithm.
If you don't want to use a hash table or something like that, just sort the array and then count the number of occurrences. Something like the code below should work (it assumes the array is not empty):
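Arrays.sort(array);              // needs java.util.Arrays
int lastOne = array[0];          // value of the run being counted
int count = 1;                   // the first element has been seen once
for (int i = 1; i < array.length; i++)
{
    if (array[i] == lastOne)
    {
        count++;
    }
    else
    {
        System.out.println(lastOne + " has " + count + " occurrences");
        lastOne = array[i];      // start counting the next run
        count = 1;
    }
}
// the last run is never followed by a different value, so report it here
System.out.println(lastOne + " has " + count + " occurrences");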
If the range of the numbers is known and small, you could use an array to keep track of how many times you've seen each (this is a bucket sort in essence). If the range is big, you can sort the array and then count duplicates, as they will be following each other.
option 1: hash it.
option 2: sort it and then count consecutive runs.
You can use a hash table with each element value as a key and its count as the value: insert a new key with a count of 1, and add 1 to the count each time the key already exists.
Using hash tables / associative arrays / dictionaries (all the same thing but the terminology changes between programming environments) is the way to go.
As an example in Python:
numberList = [1, 2, 3, 2, 1, ...]
countDict = {}
for value in numberList:
    countDict[value] = countDict.get(value, 0) + 1
# Now countDict contains each value pointing to their count
Similar constructions exist in most programming languages.
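If you only want the values that really are duplicated, as in the question (four 2's, two 3's, three 4's), you can filter the counts afterwards. A minimal sketch using `collections.Counter` from the Python standard library; the function name `duplicate_counts` is just made up for this illustration:

```python
from collections import Counter


def duplicate_counts(values):
    """Return {value: count} for every value that occurs more than once."""
    counts = Counter(values)  # one pass over the input, hashing each item
    return {value: n for value, n in counts.items() if n > 1}


if __name__ == "__main__":
    data = [2, 3, 4, 5, 2, 4, 6, 2, 4, 7, 3, 8, 2]
    for value, n in sorted(duplicate_counts(data).items()):
        print(f"{value} occurs {n} times")
```

With the array from the question this should report 2, 3 and 4 with counts 4, 2 and 3.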
> I need the fastest and simple algorithm which finds the duplicate numbers in an array, also should be able to know the number of duplicates.
I think the fastest algorithm is to count the occurrences in an auxiliary array indexed by value:
#include <stdlib.h>
#include <stdio.h>
#include <limits.h>
#include <assert.h>
typedef int arr_t;
typedef unsigned char dup_t;
const dup_t dup_t_max=UCHAR_MAX;
dup_t *count_duplicates( arr_t *arr, arr_t min, arr_t max, size_t arr_len ){
assert( min <= max );
dup_t *dup = calloc( max-min+1, sizeof(dup[0]) );
for( size_t i=0; i<arr_len; i++ ){
assert( min <= arr[i] && arr[i] <= max && dup[ arr[i]-min ] < dup_t_max );
dup[ arr[i]-min ]++;
}
return dup;
}
int main(void){
arr_t arr[] = {2,3,4,5,2,4,6,2,4,7,3,8,2};
size_t arr_len = sizeof(arr)/sizeof(arr[0]);
arr_t min=0, max=16;
dup_t *dup = count_duplicates( arr, min, max, arr_len );
printf( " value count\n" );
printf( " -----------\n" );
for( size_t i=0; i<(size_t)(max-min+1); i++ ){
if( dup[i] ){
printf( "%5i %5i\n", (int)(i+min), (int)(dup[i]) );
}
}
free(dup);
}
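Note: You cannot use this algorithm on every array. It needs the minimum and maximum values up front, it allocates max-min+1 counters, and with dup_t defined as unsigned char each count is capped at UCHAR_MAX.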
The code below first sorts the array and then moves the unique elements to the front, keeping track of the number of occurrences of each. It's slower than using bucket sort, but more convenient.
#include <stdio.h>
#include <stdlib.h>
static int cmpi(const void *p1, const void *p2)
{
int i1 = *(const int *)p1;
int i2 = *(const int *)p2;
return (i1 > i2) - (i1 < i2);
}
size_t make_unique(int values[], size_t count, size_t *occ_nums)
{
if(!count) return 0;
qsort(values, count, sizeof *values, cmpi);
size_t top = 0;
int prev_value = values[0];
if(occ_nums) occ_nums[0] = 1;
size_t i = 1;
for(; i < count; ++i)
{
if(values[i] != prev_value)
{
++top;
values[top] = prev_value = values[i];
if(occ_nums) occ_nums[top] = 1;
}
else if(occ_nums) ++occ_nums[top]; /* occ_nums may be NULL, as in the checks above */
}
return top + 1;
}
int main(void)
{
int values[] = { 2, 3, 4, 5, 2, 4, 6, 2, 4, 7, 3, 8, 2 };
size_t occ_nums[sizeof values / sizeof *values];
size_t unique_count = make_unique(
values, sizeof values / sizeof *values, occ_nums);
size_t i = 0;
for(; i < unique_count; ++i)
{
printf("number %i occurred %u time%s\n",
values[i], (unsigned)occ_nums[i], occ_nums[i] > 1 ? "s": "");
}
}
There is an "algorithm" that I use all the time to find duplicate lines in a file in Unix:
sort file | uniq -d
If you implement the same strategy in C, then it is very difficult to beat it with a fancier strategy such as hash tables. Call a sorting algorithm, and then call your own function to detect duplicates in the sorted list. The sorting algorithm takes O(n*log(n)) time and the uniq function takes linear time. (Southern Hospitality makes a similar point, but I want to emphasize that what he calls "option 2" seems both simpler and faster than the more popular hash tables suggestion.)
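To make the sort-then-scan strategy concrete, here is a rough sketch in Python rather than C, only to keep it short. The function name `duplicates_by_sorting` is made up for this example. It reports only repeated values, like `uniq -d`, but also keeps the run length, which is closer to what `uniq -c` prints:

```python
def duplicates_by_sorting(values):
    """Sort a copy, then scan it once; return [(value, count)] for repeated values."""
    ordered = sorted(values)              # O(n log n)
    result = []
    i = 0
    while i < len(ordered):               # linear scan over runs of equal values
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        if j - i > 1:                     # keep only values seen more than once
            result.append((ordered[i], j - i))
        i = j                             # jump to the start of the next run
    return result


if __name__ == "__main__":
    print(duplicates_by_sorting([2, 3, 4, 5, 2, 4, 6, 2, 4, 7, 3, 8, 2]))
```

The scan touches each element once, so the sort dominates the running time.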
Counting sort is the answer to the above question. If you look at the algorithm for counting sort, you will find that it keeps an auxiliary array holding the count of each element i present in the original array.
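The counting step on its own might look like the sketch below. It assumes the caller knows bounds `lo` and `hi` for the values; those parameter names and the function name `count_occurrences` are mine, not part of any library:

```python
def count_occurrences(values, lo, hi):
    """Counting-sort style tally; every value must satisfy lo <= value <= hi."""
    counts = [0] * (hi - lo + 1)          # counts[k] tallies the value lo + k
    for v in values:
        if not lo <= v <= hi:
            raise ValueError(f"{v} is outside [{lo}, {hi}]")
        counts[v - lo] += 1
    return counts


if __name__ == "__main__":
    data = [2, 3, 4, 5, 2, 4, 6, 2, 4, 7, 3, 8, 2]
    lo, hi = min(data), max(data)
    for offset, n in enumerate(count_occurrences(data, lo, hi)):
        if n > 1:
            print(f"{lo + offset} occurs {n} times")
```

This is linear in the input length plus the size of the range, so it only pays off when hi - lo is reasonably small.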
Here is another solution, but it takes O(n log n) time.
Use a divide-and-conquer approach to sort the given array using merge sort.
During the combine step in merge sort, find the duplicates by comparing the elements in the two sorted sub-arrays.
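One way to read that idea, as a sketch only: run merge sort over (value, count) pairs and, whenever the merge step meets the same value at the head of both halves, fuse the two entries and add their counts. The function name `merge_count` and the pair representation are my own choices for illustration:

```python
def merge_count(values):
    """Merge sort that returns sorted (value, count) pairs with equal values fused."""
    if len(values) <= 1:
        return [(v, 1) for v in values]
    mid = len(values) // 2
    left = merge_count(values[:mid])      # each half holds distinct, sorted values
    right = merge_count(values[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            merged.append(left[i])
            i += 1
        elif left[i][0] > right[j][0]:
            merged.append(right[j])
            j += 1
        else:                             # same value in both halves: add the counts
            merged.append((left[i][0], left[i][1] + right[j][1]))
            i += 1
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


if __name__ == "__main__":
    data = [2, 3, 4, 5, 2, 4, 6, 2, 4, 7, 3, 8, 2]
    print([(v, n) for v, n in merge_count(data) if n > 1])
```

Because each half already contains every value at most once, a value can match at most one entry on the other side during a merge, so a single comparison per step is enough.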

Resources