Find the ways to add numbers up to K value which there are multiple K to satisfy - arrays

I have encountered this problem during my preparation for the interview, I can't think of an elegant way to solve it:
Given an array of positive integers and target K, asking what are the combinations can satisfy K? We assume all integers <= K
example:
given: [5,4,3,5,2,1,4,8,10]
K = 10
10
5+5 = 10
2+8 = 10
4+4+1 = 9
3 *because it can't be add to any conbination and still less than or equal to k
so one of the answers is [10],[5,5],[2,8],[4,4,1],[3]
but we want minimum numbers of combination, in this case, we get 5
My apology on the tags, I don't know which is the best way to solve it and I have to add tags to post the question

Related

Finding the Number of Quadruples

In this problem you are given a sequence of N positive integers S[1],S[2],…,S[N]. In addition you are given an integer T, and your aim is to find the number of quadruples (i,j,k,l), such that 1 <= i < j < k < l <= N, and S[i]+S[j]+S[k]+S[l]=T. That is, the number of ways of picking four numbers from the sequence summing up to T. For example, for S = [3, 1, 1, 2, 5, 10] and T = 20 the answer is 1 since (1,4,5,6) (using 1-based indexing) is the only valid quadruple as S[1] + S[4] + S[5] + S[6] = 3 + 2 + 5 + 10 = 20.
I have been trying hard to find an efficient solution for the above problem but I am unable to come up with any answer. A strategy to approach such problems along with the pseudo code (and necessary explanation) is highly appreciated.
It can be solved in O(N^2 log N) regardless what T is:
First create new array of pairs of distinct indexes and store all index pairings.
Sort new array by sum value (i.e. pair {1,2} would have value A[1]+A[2] where A is original array)
Now problem gets reduced to some variation of 2SUM problem which can easily be solved as described here with some modifications: we also need to check if all 4 indexes are unique
If T is small enough we can also do some knapsack DP to solve in O(NT):

find the number of distinct prime divisors of G using python programming? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
i have just wasted my 2 hours only for solving this programming question.if any one knows the trick of doing then please share it.question is given below.
You have given an array A having N integers. Let say G is the product of all elements of A.You have to find the number of distinct prime divisors of G.
Input Format
The first argument given is an Array A, having N integers.
Output Format
Return an Integer, i.e number of distinct prime divisors of
G.
Constraints
1 <= N <= 1e5
1 <= A[i] <= 1e5
For Example
Input:
A = [1, 2, 3, 4]
Output:
2
Explanation:
here G = 1 * 2 * 3 * 4 = 24
and distinct prime divisors of G are [2, 3]
Since this seems to be a homework question, I'll give some pushes in the right direction.
Theoretically, this is a very simple thing to do. Anyone can write code that loops through an array and multiplies its elements. It is also very easy to find pseudocode (or even real code) for factorizing a number into its prime factors.
However, this approach will not work here, since we will be dealing with HUGE numbers. The maximum value of G, given your constraints, is (10⁵)^(10⁵) = 10⁵⁰⁰⁰⁰⁰. This by far exceeds the number of electrons in the observable universe. We cannot factorize such huge numbers.
But luckily we don't need to know the value of G. We are only required to calculate it's prime factors, but we don't need to know the value of G to do so. So instead you will have to factorize the individual numbers in the array. I would recommend something like this code:
factors = set()
for num in A:
f = factorize(num) # Function that returns the set of prime factors in num
factors |= f # Add all elements in f to factors
You were only interested in the distinct prime factors, so using a set will take care of that. Just add everything, and it will automatically throw away any duplicates. factorize(x) is a function you will need to write that takes a number as argument and return the set of prime factors.
Good luck!

I really can't figure out where to start

By using 9 numbers which are 1 to 9 you should find the number of ways to get N using multiplication and addition.
For example, if 100 is given, you would answer 7.
The reason is that there are 7 possible ways.
100 = 1*2*3*4+5+6+7*8+9
100 = 1*2*3+4+5+6+7+8*9
100 = 1+2+3+4+5+6+7+8*9
100 = 12+3*4+5+6+7*8+9
100 = 1+2*3+4+5+67+8+9
100 = 1*2+34+5+6*7+8+9
100 = 12+34+5*6+7+8+9
If this question is given to you, how would you start?
Are we allowed to use parentheses? That would expand the number of possibilities by a lot.
I would try to find the first additive term, let’s say 1×23, first. There are a limited number of those, and since we can’t subtract, we know that if we get a term above our target, we can prune it from our search. That leaves us looking for the solution to 23 + f = 100, where f is another formula of exactly the same form. But that is exactly the same as solving the original problem for numbers 4–9 and target 77! So call your algorithm recursively and add the solutions for that subproblem to the solutions to the original problem. That is, if we have 23 + 4, are there any solutions to the subproblem with numbers 5–9 and n = 73? Divide and conquer.
You might benefit from a dynamic table of partial solutions, since it's possible you might get the same subproblem in different ways: 1+2+3 = 1×2×3, so solving the subproblem with numbers 4–9 and target 94 twice duplicates work.
You are probably better going from right to left than from left to right, on the principle of most-constrained first. 89, 8×9, or 78+9 leave much less room for possible solutions than 1+2+3, 1×2×3, 12×3, 12+3 or 1×23.
There are three possible operations
addition
multiplication
combine, for example combine 1 and 2 to make 12
There are 8 positions for each operator. Hence, there are a total of 3^8 = 6561 possible equations. So I would start with
for ( i = 0; i < 6561; i++ )

How do I check to see if two (or more) elements of an array/vector are the same?

For one of my homework problems, we had to write a function that creates an array containing n random numbers between 1 and 365. (Done). Then, check if any of these n birthdays are identical. Is there a shorter way to do this than doing several loops or several logical expressions?
Thank you!
CODE SO FAR, NOT DONE YET!!
function = [prob] bdayprob(N,n)
N = input('Please enter the number of experiments performed: N = ');
n = input('Please enter the sample size: n = ');
count = 0;
for(i=1:n)
x(i) = randi(365);
if(x(i)== x)
count = count + 1
end
return
If I'm interpreting your question properly, you want to check to see if generating n integers or days results in n unique numbers. Given your current knowledge in MATLAB, it's as simple as doing:
n = 30; %// Define sample size
N = 10; %// Define number of trials
%// Define logical array where each location tells you whether
%// birthdays were repeated for a trial
check = false(1, N);
%// For each trial...
for idx = 1 : N
%// Generate sample size random numbers
days = randi(365, n, 1);
%// Check to see if the total number of unique birthdays
%// are equal to the sample size
check(idx) = numel(unique(days)) == n;
end
Woah! Let's go through the code slowly shall we? We first define the sample size and the number of trials. We then specify a logical array where each location tells you whether or not there were repeated birthdays generated for that trial. Now, we start with a loop where for each trial, we generate random numbers from 1 to 365 that is of n or sample size long. We then use unique and figure out all unique integers that were generated from this random generation. If all of the birthdays are unique, then the total number of unique birthdays generated should equal the sample size. If we don't, then we have repeats. For example, if we generated a sample of [1 1 1 2 2], the output of unique would be [1 2], and the total number of unique elements is 2. Since this doesn't equal 5 or the sample size, then we know that the birthdays generated weren't unique. However, if we had [1 3 4 6 7], unique would give the same output, and since the output length is the same as the sample size, we know that all of the days are unique.
So, we check to see if this number is equal to the sample size for each iteration. If it is, then we output true. If not, we output false. When I run this code on my end, this is what I get for check. I set the sample size to 30 and the number of trials to be 10.
check =
0 0 1 1 0 0 0 0 1 0
Take note that if you increase the sample size, there is a higher probability that you will get duplicates, because randi can be considered as sampling with replacement. Therefore, the larger the sample size, the higher the chance of getting duplicate values. I made the sample size small on purpose so that we can see that it's possible to get unique days. However, if you set it to something like 100, or 200, you will most likely get check to be all false as there will most likely be duplicates per trial.
Here are some more approaches that avoid loops. Let
n = 20; %// define sample size
x = randi(365,n,1); %// generate n values between 1 and 365
Any of the following code snippets returns true (or 1) if there are two identical values in x, and false (or 0) otherwise:
Sort and then check if any two consecutive elements are the same:
result = any(diff(sort(x))==0);
Do all pairwise comparisons manually; remove self-pairs and duplicate pairs; and check if any of the remaining comparisons is true:
result = nnz(tril(bsxfun(#eq, x, x.'),-1))>0;
Compute the distance between distinct values, considering each pair just once, and then check if any distance is 0:
result = any(pdist(x(:))==0);
Find the number of occurrences of the most common value (mode):
[~, occurs] = mode(x);
result = occurs>1;
I don't know if I'm supposed to solve the problem for you, but perhaps a few hints may lead you in the right direction (besides I'm not a matlab expert so it will be in general terms):
Maybe not, but you have to ask yourself what they expect of you. The solution you propose requires you to loop through the array in two nested loops which will mean n*(n-1)/2 times through the loop (ie quadratic time complexity).
There are a number of ways you can improve the time complexity of the problem. The most straightforward would be to have a 365 element table where you can keep track if a particular number has been seen yet - which would require only a single loop (ie linear time complexity), but perhaps that's not what they're looking for either. But maybe that solution is a little bit ad-hoc? What we're basically looking for is a fast lookup if a particular number has been seen before - there exists more memory efficient structures that allows look up in O(1) time and O(log n) time (if you know these you have an arsenal of tools to use).
Then of course you could use the pidgeonhole principle to provide the answer much faster in some special cases (remember that you only asked to determine whether two or more numbers are equal or not).

Algorithm to split an array into P subarrays of balanced sum

I have an big array of length N, let's say something like:
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
I need to split this array into P subarrays (in this example, P=4 would be reasonable), such that the sum of the elements in each subarray is as close as possible to sigma, being:
sigma=(sum of all elements in original array)/P
In this example, sigma=15.
For the sake of clarity, one possible result would be:
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
(sums: 12,19,14,15)
I have written a very naive algorithm based in how I would do the divisions by hand, but I don't know how to impose the condition that a division whose sums are (14,14,14,14,19) is worse than one that is (15,14,16,14,16).
Thank you in advance.
First, let’s formalize your optimization problem by specifying the input, output, and the measure for each possible solution (I hope this is in your interest):
Given an array A of positive integers and a positive integer P, separate the array A into P non-overlapping subarrays such that the difference between the sum of each subarray and the perfect sum of the subarrays (sum(A)/P) is minimal.
Input: Array A of positive integers; P is a positive integer.
Output: Array SA of P non-negative integers representing the length of each subarray of A where the sum of these subarray lengths is equal to the length of A.
Measure: abs(sum(sa)-sum(A)/P) is minimal for each sa ∈ {sa | sa = (Ai, …, Ai+‍SAj) for i = (Σ SAj), j from 0 to P-1}.
The input and output define the set of valid solutions. The measure defines a measure to compare multiple valid solutions. And since we’re looking for a solution with the least difference to the perfect solution (minimization problem), measure should also be minimal.
With this information, it is quite easy to implement the measure function (here in Python):
def measure(a, sa):
sigma = sum(a)/len(sa)
diff = 0
i = 0
for j in xrange(0, len(sa)):
diff += abs(sum(a[i:i+sa[j]])-sigma)
i += sa[j]
return diff
print measure([2,4,6,7,6,3,3,3,4,3,4,4,4,3,3,1], [3,4,4,5]) # prints 8
Now finding an optimal solution is a little harder.
We can use the Backtracking algorithm for finding valid solutions and use the measure function to rate them. We basically try all possible combinations of P non-negative integer numbers that sum up to length(A) to represent all possible valid solutions. Although this ensures not to miss a valid solution, it is basically a brute-force approach with the benefit that we can omit some branches that cannot be any better than our yet best solution. E.g. in the example above, we wouldn’t need to test solutions with [9,…] (measure > 38) if we already have a solution with measure ≤ 38.
Following the pseudocode pattern from Wikipedia, our bt function looks as follows:
def bt(c):
global P, optimum, optimum_diff
if reject(P,c):
return
if accept(P,c):
print "%r with %d" % (c, measure(P,c))
if measure(P,c) < optimum_diff:
optimum = c
optimum_diff = measure(P,c)
return
s = first(P,c)
while s is not None:
bt(list(s))
s = next(P,s)
The global variables P, optimum, and optimum_diff represent the problem instance holding the values for A, P, and sigma, as well as the optimal solution and its measure:
class MinimalSumOfSubArraySumsProblem:
def __init__(self, a, p):
self.a = a
self.p = p
self.sigma = sum(a)/p
Next we specify the reject and accept functions that are quite straight forward:
def reject(P,c):
return optimum_diff < measure(P,c)
def accept(P,c):
return None not in c
This simply rejects any candidate whose measure is already more than our yet optimal solution. And we’re accepting any valid solution.
The measure function is also slightly changed due to the fact that c can now contain None values:
def measure(P, c):
diff = 0
i = 0
for j in xrange(0, P.p):
if c[j] is None:
break;
diff += abs(sum(P.a[i:i+c[j]])-P.sigma)
i += c[j]
return diff
The remaining two function first and next are a little more complicated:
def first(P,c):
t = 0
is_complete = True
for i in xrange(0, len(c)):
if c[i] is None:
if i+1 < len(c):
c[i] = 0
else:
c[i] = len(P.a) - t
is_complete = False
break;
else:
t += c[i]
if is_complete:
return None
return c
def next(P,s):
t = 0
for i in xrange(0, len(s)):
t += s[i]
if i+1 >= len(s) or s[i+1] is None:
if t+1 > len(P.a):
return None
else:
s[i] += 1
return s
Basically, first either replaces the next None value in the list with either 0 if it’s not the last value in the list or with the remainder to represent a valid solution (little optimization here) if it’s the last value in the list, or it return None if there is no None value in the list. next simply increments the rightmost integer by one or returns None if an increment would breach the total limit.
Now all you need is to create a problem instance, initialize the global variables and call bt with the root:
P = MinimalSumOfSubArraySumsProblem([2,4,6,7,6,3,3,3,4,3,4,4,4,3,3,1], 4)
optimum = None
optimum_diff = float("inf")
bt([None]*P.p)
If I am not mistaken here, one more approach is dynamic programming.
You can define P[ pos, n ] as the smallest possible "penalty" accumulated up to position pos if n subarrays were created. Obviously there is some position pos' such that
P[pos', n-1] + penalty(pos', pos) = P[pos, n]
You can just minimize over pos' = 1..pos.
The naive implementation will run in O(N^2 * M), where N - size of the original array and M - number of divisions.
#Gumbo 's answer is clear and actionable, but consumes lots of time when length(A) bigger than 400 and P bigger than 8. This is because that algorithm is kind of brute-forcing with benefits as he said.
In fact, a very fast solution is using dynamic programming.
Given an array A of positive integers and a positive integer P, separate the array A into P non-overlapping subarrays such that the difference between the sum of each subarray and the perfect sum of the subarrays (sum(A)/P) is minimal.
Measure: , where is sum of elements of subarray , is the average of P subarray' sums.
This can make sure the balance of sum, because it use the definition of Standard Deviation.
Persuming that array A has N elements; Q(i,j) means the minimum Measure value when split the last i elements of A into j subarrays. D(i,j) means (sum(B)-sum(A)/P)^2 when array B consists of the i~jth elements of A ( 0<=i<=j<N ).
The minimum measure of the question is to calculate Q(N,P). And we find that:
Q(N,P)=MIN{Q(N-1,P-1)+D(0,0); Q(N-2,P-1)+D(0,1); ...; Q(N-1,P-1)+D(0,N-P)}
So it like can be solved by dynamic programming.
Q(i,1) = D(N-i,N-1)
Q(i,j) = MIN{ Q(i-1,j-1)+D(N-i,N-i);
Q(i-2,j-1)+D(N-i,N-i+1);
...;
Q(j-1,j-1)+D(N-i,N-j)}
So the algorithm step is:
1. Cal j=1:
Q(1,1), Q(2,1)... Q(3,1)
2. Cal j=2:
Q(2,2) = MIN{Q(1,1)+D(N-2,N-2)};
Q(3,2) = MIN{Q(2,1)+D(N-3,N-3); Q(1,1)+D(N-3,N-2)}
Q(4,2) = MIN{Q(3,1)+D(N-4,N-4); Q(2,1)+D(N-4,N-3); Q(1,1)+D(N-4,N-2)}
... Cal j=...
P. Cal j=P:
Q(P,P), Q(P+1,P)...Q(N,P)
The final minimum Measure value is stored as Q(N,P)!
To trace each subarray's length, you can store the
MIN choice when calculate Q(i,j)=MIN{Q+D...}
space for D(i,j);
time for calculate Q(N,P)
compared to the pure brute-forcing algorithm consumes time.
Working code below (I used php language). This code decides part quantity itself;
$main = array(2,4,6,1,6,3,2,3,4,3,4,1,4,7,3,1,2,1,3,4,1,7,2,4,1,2,3,1,1,1,1,4,5,7,8,9,8,0);
$pa=0;
for($i=0;$i < count($main); $i++){
$p[]= $main[$i];
if(abs(15 - array_sum($p)) < abs(15 - (array_sum($p)+$main[$i+1])))
{
$pa=$pa+1;
$pi[] = $i+1;
$pc = count($pi);
$ba = $pi[$pc-2] ;
$part[$pa] = array_slice( $main, $ba, count($p));
unset($p);
}
}
print_r($part);
for($s=1;$s<count($part);$s++){
echo '<br>';
echo array_sum($part[$s]);
}
code will output part sums like as below
13
14
16
14
15
15
17
I'm wondering whether the following would work:
Go from the left, as soon as sum > sigma, branch into two, one including the value that pushes it over, and one that doesn't. Recursively process data to the right with rightSum = totalSum-leftSum and rightP = P-1.
So, at the start, sum = 60
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
Then for 2 4 6 7, sum = 19 > sigma, so split into:
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
2 4 6 7 6 3 3 3 4 3 4 4 4 3 3 1
Then we process 7 6 3 3 3 4 3 4 4 4 3 3 1 and 6 3 3 3 4 3 4 4 4 3 3 1 with P = 4-1 and sum = 60-12 and sum = 60-19 respectively.
This results in, I think, O(P*n).
It might be a problem when 1 or 2 values is by far the largest, but, for any value >= sigma, we can probably just put that in it's own partition (preprocessing the array to find these might be the best idea (and reduce sum appropriately)).
If it works, it should hopefully minimise sum-of-squared-error (or close to that), which seems like the desired measure.
I propose an algorithm based on backtracking. The main function chosen randomly select an element from the original array and adds it to an array partitioned. For each addition will check to obtain a better solution than the original. This will be achieved by using a function that calculates the deviation, distinguishing each adding a new element to the page. Anyway, I thought it would be good to add an original variables in loops that you can not reach desired solution will force the program ends. By desired solution I means to add all elements with respect of condition imposed by condition from if.
sum=CalculateSum(vector)
Read P
sigma=sum/P
initialize P vectors, with names vector_partition[i], i=1..P
list_vector initialize a list what pointed this P vectors
initialize a diferences_vector with dimension of P
//that can easy visualize like a vector of vectors
//construct a non-recursive backtracking algorithm
function Deviation(vector) //function for calculate deviation of elements from a vector
{
dev=0
for i=0 to Size(vector)-1 do
dev+=|vector[i+1]-vector[i]|
return dev
}
iteration=0
//fix some maximum number of iteration for while loop
Read max_iteration
//as the number of iterations will be higher the more it will get
//a more accurate solution
while(!IsEmpty(vector))
{
for i=1 to Size(list_vector) do
{
if(IsEmpty(vector)) break from while loop
initial_deviation=Deviation(list_vector[i])
el=SelectElement(vector) //you can implement that function using a randomized
//choice of element
difference_vector[i]=|sigma-CalculateSum(list_vector[i])|
PutOnBackVector(vector_list[i], el)
if(initial_deviation>Deviation(difference_vector))
ExtractFromBackVectorAndPutOnSecondVector(list_vector, vector)
}
iteration++
//prevent to enter in some infinite loop
if (iteration>max_iteration) break from while loop
}
You can change this by adding in first if some code witch increment with a amount the calculated deviation.
aditional_amount=0
iteration=0
while
{
...
if(initial_deviation>Deviation(difference_vector)+additional_amount)
ExtractFromBackVectorAndPutOnSecondVector(list_vector, vector)
if(iteration>max_iteration)
{
iteration=0
aditional_amout+=1/some_constant
}
iteration++
//delete second if from first version
}
Your problem is very similar to, or the same as, the minimum makespan scheduling problem, depending on how you define your objective. In the case that you want to minimize the maximum |sum_i - sigma|, it is exactly that problem.
As referenced in the Wikipedia article, this problem is NP-complete for p > 2. Graham's list scheduling algorithm is optimal for p <= 3, and provides an approximation ratio of 2 - 1/p. You can check out the Wikipedia article for other algorithms and their approximation.
All the algorithms given on this page are either solving for a different objective, incorrect/suboptimal, or can be used to solve any problem in NP :)
This is very similar to the case of the one-dimensional bin packing problem, see http://www.cs.sunysb.edu/~algorith/files/bin-packing.shtml. In the associated book, The Algorithm Design Manual, Skienna suggests a first-fit decreasing approach. I.e. figure out your bin size (mean = sum / N), and then allocate the largest remaining object into the first bin that has room for it. You either get to a point where you have to start over-filling a bin, or if you're lucky you get a perfect fit. As Skiena states "First-fit decreasing has an intuitive appeal to it, for we pack the bulky objects first and hope that little objects can fill up the cracks."
As a previous poster said, the problem looks like it's NP-complete, so you're not going to solve it perfectly in reasonable time, and you need to look for heuristics.
I recently needed this and did as follows;
create an initial sub-arrays array of length given sub arrays count. sub arrays should have a sum property too. ie [[sum:0],[sum:0]...[sum:0]]
sort the main array descending.
search for the sub-array with the smallest sum and insert one item from main array and increment the sub arrays sum property by the inserted item's value.
repeat item 3 up until the end of main array is reached.
return the initial array.
This is the code in JS.
function groupTasks(tasks,groupCount){
var sum = tasks.reduce((p,c) => p+c),
initial = [...Array(groupCount)].map(sa => (sa = [], sa.sum = 0, sa));
return tasks.sort((a,b) => b-a)
.reduce((groups,task) => { var group = groups.reduce((p,c) => p.sum < c.sum ? p : c);
group.push(task);
group.sum += task;
return groups;
},initial);
}
var tasks = [...Array(50)].map(_ => ~~(Math.random()*10)+1), // create an array of 100 random elements among 1 to 10
result = groupTasks(tasks,7); // distribute them into 10 sub arrays with closest sums
console.log("input array:", JSON.stringify(tasks));
console.log(result.map(r=> [JSON.stringify(r),"sum: " + r.sum]));
You can use Max Flow algorithm.

Resources