Find way to separate array so each subarrays sum is less or equal to a number - arrays

I have a mathematical/algorithmic problem here.
Given an array of numbers, find a way to separate it to 5 subarrays, so that sum of each subarrays is less than or equal to a given number. All numbers from the initial array, must go to one of the subarrays, and be part of one sum.
So the input to the algorithm would be:
d - representing the number that each subarrays sum has to be less or equal
A - representing the array of numbers that will be separated to different subarrays, and will be part of one sum
Algorithm complexity must be polynomial.
Thank you.

If by "subarray" you mean "subset" as opposed to "contiguous slice", it is impossible to find a polynomial time algorithm for this problem (unless P = NP). The Partition Problem is to partition a list of numbers into to sets such that the sum of both sets are equal. It is known to be NP-complete. The partition problem can be reduced to your problem as follows:
Suppose that x1, ..., x_n are positive numbers that you want to partition into 2 sets such that their sums are equal. Let d be this common sum (which would be the sum of the xi divided by 2). extend x_i to an array, A, of size n+3 by adding three copies of d. Clearly the only way to partition A into 5 subarrays so that the sum of each is less than or equal to d is if the sum of each actually equals d. This would in turn require 3 of the subarrays to have length 1, each consisting of the number d. The remaining 2 subarrays would be exactly a partition of the original n numbers.
On the other hand, if there are additional constraints on what the numbers are and/or the subarrays need to be, there might be a polynomial solution. But, if so, you should clearly spell out what there constraints are.

Set up of the problem:
d : the upper bound for the subarray
A : the initial array
Assuming A is not sorted.
(Heuristic)
Algorithm:
1.Sort A in ascending order using standard sorting algorithm->O(nlogn)
2.Check if the largest element of A is greater than d ->(constant)
if yes, no solution
if no, continue
3.Sum up all the element in A, denote S. Check if S/5 > d ->O(n)
if yes, no solution
if no, continue
4.Using greedy approach, create a new subarray Asi, add next biggest element aj in the sorted A to Asi so that the sum of Asi does not exceed d. Remove aj from sorted A ->O(n)
repeat step4 until either of the condition satisfied:
I.At creating subarray Asi, there are only 5-i element left
In this case, split the remaining element to individual subarray, done
II. i = 5. There are 5 subarray created.
The algorithm described above is bounded by O(nlogn) therefore in polynomial time.

Related

Sum over n-tuples with total sum equal to k

I want to sum over tuples of length n, i.e. I have a vector (m_1,...,m_n) where mi is an integer greater or equal to zero with the constraint that the sum of all vector elements is equal to k.
What is the most efficient way to implement this?
My naive approach would be to iterate through all combinations with m_i between 0 and k and check if they satisfy the criterion, but this seems inefficient.
For instance, if k=2 and n=2, then
(2,0),(1,1),(0,2) would be the possible values of m1,m2 that I would like to have. Is there a way to generate these numbers efficiently (I don't necessarily have to store them all in an array, but I want to iterate over all possible combinations)
Ok, random stuff I deleted.
If you look at FXT book/library by J.Arndt, there is on page 342 section 16.3 "Partition into m parts"
Here is algorithm and reference to the code to generate exactly m-vector of partitioning of n.
You'll probably need to modify it, he doesn't have bins with zeros, starts with ones.
And some thoughts on the matter. n is sum, and you have k bins. Start with |n|0|...|0| combination. Define operation "distribute 1" which is take one from the leftmost bin and distribute it into all other bins.
E.g. D1(|n|0|...|0|)=tuple(|n-1|1|...|0|, ..., |n-1|0|...|1|)
Then you apply D1() to the tuple, and get tuple of tuples. And so on and so forth, till first bin is exhausted.
You could think this as a tree:
root |n|0|...|0|
D1 applied once, k-1 leaves |n-1|1|...|0| ... |n-1|0|...|1|
Next tree level, D1 applied once to previous level, each node getting k-1 children.
THe only thing left is how to traverse it - DFS, BFS, or anything else from https://en.wikipedia.org/wiki/Tree_traversal

Minimum steps needed to make all elements equal by adding adjacent elements

I have an array A of size N. All elements are positive integers. In one step, I can add two adjacent elements and replace them with their sum. That said, the array size reduces by 1. Now I need to make all the elements same by performing minimum number of steps.
For example: A = [1,2,3,2,1,3].
Step 1: Merge index 0 and 1 ==> A = [3,3,2,1,3]
Step 2: Merge index 2 and 3 (of new array) ==> [3,3,3,3]
Hence number of steps are 2.
I couldn't think of a straight solution, so tried a recursive approach by merging all indices one by one and returning the min level I can get when either array size is 1 or all elements are equal.
Below is the code I tried:
# Checks if all the elements are same or not
def check(A):
if len(set(A)) == 1:
return True
return False
# Recursive function to find min steps
def min_steps(N,A,level):
# If all are equal return the level
if N == 1 or check(A):
return level
# Initialize min variable
mn = float('+inf')
# Try merging all one by one and recur
for i in range(N-1):
temp = A[:]
temp[i]+=temp[i+1]
temp.pop(i+1)
mn = min(mn, min_steps(N-1,temp, level+1))
return mn
This solution has complexity of O(N^N). I want to reduce it to polynomial time near to O(N^2) or O(N^3). Can anyone help me modify this solution or tell me if I am missing something?
Combining any k adjacent pairs of elements (even if they include elements formed from previous combining steps) leaves exactly n-k elements in total, each of which we can map back to the contiguous subarray of the original problem that constitutes the elements that were added together to form it. So, this problem is equivalent to partitioning the array into the largest possible number of contiguous subarrays such that all subarrays have the same sum: Any adjacent pair of elements within the same subarray can be combined into a single element, and this process repeated within the subarray with adjacent pairs chosen in any order, until all elements have been combined into a single element.
So, if there are n elements and they sum to T, then a simple O(nT) algorithm is:
For i from 0 to T:
Try partitioning the elements into subarrays each having sum i. This amounts to scanning along the array, greedily adding the current element to the current subarray if the sum of elements in the current subarray is strictly < i. When we reach a total of exactly i, the current subarray ends and a new subarray (initially having sum 0) begins. If adding the current element takes us above the target of i, or if we run out of elements before reaching the target, stop this scan and try the next outer loop iteration (value of i). OTOH if we get to the end, having formed k subarrays in the process, stop and report n-k as the optimal (minimum possible) number of combining moves.
A small speedup would be to only try target i values that evenly divide T.
EDIT: To improve the time complexity from O(nT) to O(n^2), it suffices to only try target i values corresponding to sums of prefixes of the array (since there must be a subarray containing the first element, and this subarray can only have such a sum).

Find pair/triplets whose sum is least away from the given value

There are 2 variations of this question.
Given 2 arrays of integers, select single element from each array so that their sum is least away (numerically) from given integer value V. Sum can be greater than V.
Given 3 arrays of integers, select single element from each array so that their sum is least away (numerically) from given integer value V. Sum can be greater than V.
I know there is a naive O(n^2) and O(n^3) solution respectively for them and I'd like to ask if there's any approach which optimises running time.
For the first case, you could sort both the arrays in O(nlogn), and then for each element x in the first array, find V-x in the second array using the binary-search/upper-bound algorithm.
Its total complexity would still be O(nlogn) which is less than O(n*n).
For the second case, you can apply a similar algorithm with O(n*n*log(n)) complexity.

For given array of pairs of two numbers, count pairs in which first element is bigger and second one is smaller

I'm trying to implement this in my program, namely we have given array of K pairs, where each pair is in form (i,j) and i<=N, j<=M, N,M<=1000, and K<=N*M.
Now for each pair we want to count pairs in which the first element is strictly bigger and the second one is strictly less, for example if our pair is: (2,3), we want to count the pair (4,1), but not the pair (1,2) because 1 is less than 2.
Is this possible to do in O(N*M) time complexity?

Find the Element Occurring b times in an an array of size n*k+b

Description
Given an Array of size (n*k+b) where n elements occur k times and one element occurs b times, in other words there are n+1 distinct Elements. Given that 0 < b < k find the element occurring b times.
My Attempted solutions
Obvious solution will be using hashing but it will not work if the numbers are very large. Complexity is O(n)
Using map to store the frequencies of each element and then traversing map to find the element occurring b times.As Map's are implemented as height balanced trees Complexity will be O(nlogn).
Both of my solution were accepted but the interviewer wanted a linear solution without using hashing and hint he gave was make the height of tree constant in tree in which you are storing frequencies, but I am not able to figure out the correct solution yet.
I want to know how to solve this problem in linear time without hashing?
EDIT:
Sample:
Input: n=2 b=2 k=3
Aarray: 2 2 2 3 3 3 1 1
Output: 1
I assume:
The elements of the array are comparable.
We know the values of n and k beforehand.
A solution O(n*k+b) is good enough.
Let the number occuring only b times be S. We are trying to find the S in an array of n*k+b size.
Recursive Step: Find the median element of the current array slice as in Quick Sort in lineer time. Let the median element be M.
After the recursive step you have an array where all elements smaller than M occur on the left of the first occurence of M. All M elements are next to each other and all element larger than M are on the right of all occurences of M.
Look at the index of the leftmost M and calculate whether S<M or S>=M. Recurse either on the left slice or the right slice.
So you are doing a quick sort but delving only one part of the divisions at any time. You will recurse O(logN) times but each time with 1/2, 1/4, 1/8, .. sizes of the original array, so the total time will still be O(n).
Clarification: Let's say n=20 and k = 10. Then, there are 21 distinct elements in the array, 20 of which occur 10 times and the last occur let's say 7 times. I find the medium element, let's say it is 1111. If the S<1111 than the index of the leftmost occurence of 1111 will be less than 11*10. If S>=1111 then the index will be equal to 11*10.
Full example: n = 4. k = 3. Array = {1,2,3,4,5,1,2,3,4,5,1,2,3,5}
After the first recursive step I find the median element is 3 and the array is something like: {1,2,1,2,1,2,3,3,3,5,4,5,5,4} There are 6 elements on the left of 3. 6 is a multiple of k=3. So each element must be occuring 3 times there. So S>=3. Recurse on the right side. And so on.
An idea using cyclic groups.
To guess i-th bit of answer, follow this procedure:
Count how many numbers in array has i-th bit set, store as cnt
If cnt % k is non-zero, then i-th bit of answer is set. Otherwise it is clear.
To guess whole number, repeat the above for every bit.
This solution is technically O((n*k+b)*log max N), where max N is maximal value in the table, but because number of bits is usually constant, this solution is linear in array size.
No hashing, memory usage is O(log k * log max N).
Example implementation:
from random import randint, shuffle
def generate_test_data(n, k, b):
k_rep = [randint(0, 1000) for i in xrange(n)]
b_rep = [randint(0, 1000)]
numbers = k_rep*k + b_rep*b
shuffle(numbers)
print "k_rep: ", k_rep
print "b_rep: ", b_rep
return numbers
def solve(data, k):
cnts = [0]*10
for number in data:
bits = [number >> b & 1 for b in xrange(10)]
cnts = [cnts[i] + bits[i] for i in xrange(10)]
return reduce(lambda a,b:2*a+(b%k>0), reversed(cnts), 0)
print "Answer: ", solve(generate_test_data(10, 15, 13), 3)
In order to have a constant height B-tree containing n distinct elements, with height h constant, you need z=n^(1/h) children per nodes: h=log_z(n), thus h=log(n)/log(z), thus log(z)=log(n)/h, thus z=e^(log(n)/h), thus z=n^(1/h).
Example, with n=1000000, h=10, z=3.98, that is z=4.
The time to reach a node in that case is O(h.log(z)). Assuming h and z to be "constant" (since N=n.k, then log(z)=log(n^(1/h))=log(N/k^(1/h))=ct by properly choosing h based on k, you can then say that O(h.log(z))=O(1)... This is a bit far-fetched, but maybe that was the kind of thing the interviewer wanted to hear?
UPDATE: this one use hashing, so it's not a good answer :(
in python this would be linear time (set will remove the duplicates):
result = (sum(set(arr))*k - sum(arr)) / (k - b)
If 'k' is even and 'b' is odd, then XOR will do. :)

Resources