LeetCode Find All Numbers Disappeared in an Array Question - arrays

The problem I am doing is stated as follows:
Given an array nums of n integers where nums[i] is in the range [1, n], return an array of all the integers in the range [1, n] that do not appear in nums.
I found a solution that takes O(n) space fairly quickly; however, this problem has a follow-up asking for a constant-space solution, and I do not understand the solution that is given. The constant-space solution is recreated here as follows:
def findDisappearedNumbers(self, nums: List[int]) -> List[int]:
    # Iterate over each of the elements in the original array
    for i in range(len(nums)):
        # Treat the value as the new index
        new_index = abs(nums[i]) - 1
        # Check the magnitude of value at this new index
        # If the magnitude is positive, make it negative
        # thus indicating that the number nums[i] has
        # appeared or has been visited.
        if nums[new_index] > 0:
            nums[new_index] *= -1
    # Response array that would contain the missing numbers
    result = []
    # Iterate over the numbers from 1 to N and add all those
    # that have positive magnitude in the array
    for i in range(1, len(nums) + 1):
        if nums[i - 1] > 0:
            result.append(i)
    return result
I don't understand how this code works. To me it seems that every element will be made negative in the first pass, and therefore no values will be appended to the answer list. I ran it through a debugger and it seems that is not what is happening. I was hoping that someone could explain to me what it is doing.

Let's take an example:
nums = [4,3,2,7,8,2,3,1]
Now let's iterate over it.
Index 0: value = 4 -> mark the element at index (value - 1) = (4 - 1) = 3 as negative, provided that element is positive.
Now, nums = [4,3,2,-7,8,2,3,1]
Do this for every index, and you will end up with:
nums = [-4,-3,-2,-7,8,2,-3,-1]
The elements at index = 4 and index = 5 are positive, because the values 5 and 6 never appear in the array, so nothing ever marks those positions.
So, the answer is [4+1, 5+1] = [5, 6].
Hope you understood this.
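To see why not every entry ends up negative, here is a small standalone trace (a sketch of mine, mirroring just the marking loop from the question) on the same example:

nums = [4, 3, 2, 7, 8, 2, 3, 1]
for i in range(len(nums)):
    new_index = abs(nums[i]) - 1      # abs() ignores any minus sign added earlier
    if nums[new_index] > 0:
        nums[new_index] *= -1         # mark "value new_index + 1 has been seen"
    print(i, nums)
# Ends at [-4, -3, -2, -7, 8, 2, -3, -1]: positions 4 and 5 stay positive because
# the values 5 and 6 never occur in the array, so nothing ever negates them.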

Related

Find Minimum Score Possible

Problem statement:
We are given three arrays A1,A2,A3 of lengths n1,n2,n3. Each array contains some (or no) natural numbers (i.e. > 0). These numbers denote program execution times.
At each step, the task is to choose the first element from any array, execute that program, and remove it from that array.
For example:
if A1=[3,2] (n1=2),
A2=[7] (n2=1),
A3=[1] (n3=1)
then we can execute programs in various orders like [1,7,3,2] or [7,1,3,2] or [3,7,1,2] or [3,1,7,2] or [3,2,1,7] etc.
Now if we take S=[1,3,2,7] as the order of execution, the waiting times of the various programs would be:
for S[0], waiting time = 0, since it is executed immediately,
for S[1], waiting time = 0+1 = 1, taking the previous time into account; similarly,
for S[2], waiting time = 0+1+3 = 4
for S[3], waiting time = 0+1+3+2 = 6
Now the score of the ordering is defined as the sum of all wait times = 0 + 1 + 4 + 6 = 11. This is the minimum score we can get from any order of execution.
Our task is to find this minimum score.
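To make the scoring concrete, here is a small helper (a sketch of mine, not part of the original question) that recomputes the score of a given execution order; it reproduces the 11 above:

def score(order):
    # each program waits for the total execution time of the programs run before it
    total_wait, elapsed = 0, 0
    for t in order:
        total_wait += elapsed
        elapsed += t
    return total_wait

print(score([1, 3, 2, 7]))  # 0 + 1 + 4 + 6 = 11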
How can we solve this problem? I tried the approach of picking the minimum of the three front elements each time, but it is not correct because it gets stuck when two or three equal elements are encountered.
One more example:
if A1=[23,10,18,43], A2=[7], A3=[13,42] minimum score would be 307.
The simplest way to solve this is with dynamic programming (which runs in cubic time).
For each array A: Suppose you take the first element from array A, i.e. A[0], as the next process. Your total cost is the wait-time contribution of A[0] (i.e., A[0] * (total_remaining_elements - 1)), plus the minimal wait time sum from A[1:] and the rest of the arrays.
Take the minimum cost over each possible first array A, and you'll get the minimum score.
Here's a Python implementation of that idea. It works with any number of arrays, not just three.
import functools
from typing import List, Tuple

def dp_solve(arrays: List[List[int]]) -> int:
    """Given list of arrays representing dependent processing times,
    return the smallest sum of wait_time_before_start for all job orders"""
    arrays = [x for x in arrays if len(x) > 0]  # Remove empty

    @functools.lru_cache(100000)
    def dp(remaining_elements: Tuple[int, ...],
           total_remaining: int) -> int:
        """Returns minimum wait time sum when suffixes of each array
        have lengths in 'remaining_elements'"""
        if total_remaining == 0:
            return 0
        rem_elements_copy = list(remaining_elements)
        best = 10 ** 20
        for i, x in enumerate(remaining_elements):
            if x == 0:
                continue
            cost_here = arrays[i][-x] * (total_remaining - 1)
            if cost_here >= best:
                continue
            rem_elements_copy[i] -= 1
            best = min(best,
                       dp(tuple(rem_elements_copy), total_remaining - 1)
                       + cost_here)
            rem_elements_copy[i] += 1
        return best

    return dp(tuple(map(len, arrays)), sum(map(len, arrays)))
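As a quick sanity check (these driver lines are mine, not part of the answer), both examples from the question give the stated minimum scores:

print(dp_solve([[3, 2], [7], [1]]))                 # 11
print(dp_solve([[23, 10, 18, 43], [7], [13, 42]]))  # 307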
Better solutions
The naive greedy strategy of 'smallest first element' doesn't work, because it can be worth it to do a longer job to get a much shorter job in the same list done, as the example of
A1 = [100, 1, 2, 3], A2 = [38], A3 = [34],
best solution = [100, 1, 2, 3, 34, 38]
by user3386109 in the comments demonstrates.
A more refined greedy strategy does work. Instead of the smallest first element, consider each possible prefix of the array. We want to pick the array with the smallest prefix, where prefixes are compared by average process time, and perform all the processes in that prefix in order.
A1 = [ 100, 1, 2, 3]
Prefix averages = [(100)/1, (100+1)/2, (100+1+2)/3, (100+1+2+3)/4]
= [ 100.0, 50.5, 34.333, 26.5]
A2=[38]
A3=[34]
Smallest prefix average in any array is 26.5, so pick
the prefix [100, 1, 2, 3] to complete first.
Then [34] is the next prefix, and [38] is the final prefix.
And here's a rough Python implementation of the greedy algorithm. This code computes subarray averages in a completely naive/brute-force way, so the algorithm is still quadratic (but an improvement over the dynamic programming method). Also, it computes 'maximum suffixes' instead of 'minimum prefixes' for ease of coding, but the two strategies are equivalent.
import math
from typing import List

def greedy_solve(arrays: List[List[int]]) -> int:
    """Given list of arrays representing dependent processing times,
    return the smallest sum of wait_time_before_start for all job orders"""

    def max_suffix_avg(arr: List[int]):
        """Given arr, return value and length of max-average suffix"""
        if len(arr) == 0:
            return (-math.inf, 0)
        best_len = 1
        best = -math.inf
        curr_sum = 0.0
        for i, x in enumerate(reversed(arr), 1):
            curr_sum += x
            new_avg = curr_sum / i
            if new_avg >= best:
                best = new_avg
                best_len = i
        return (best, best_len)

    arrays = [x for x in arrays if len(x) > 0]  # Remove empty
    total_time_sum = sum(sum(x) for x in arrays)
    my_averages = [max_suffix_avg(arr) for arr in arrays]
    total_cost = 0
    while True:
        largest_avg_idx = max(range(len(arrays)),
                              key=lambda y: my_averages[y][0])
        _, n_to_remove = my_averages[largest_avg_idx]
        if n_to_remove == 0:
            break
        for _ in range(n_to_remove):
            total_time_sum -= arrays[largest_avg_idx].pop()
            total_cost += total_time_sum
        # Recompute the changed array's avg
        my_averages[largest_avg_idx] = max_suffix_avg(arrays[largest_avg_idx])
    return total_cost
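The same sanity check (again, my own driver lines) passes for the greedy version; note that greedy_solve empties the lists it is given, so pass it fresh copies if you still need them:

print(greedy_solve([[3, 2], [7], [1]]))                 # 11
print(greedy_solve([[23, 10, 18, 43], [7], [13, 42]]))  # 307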

Finding max sum with operation limit

As an input I'm given an array of integers (all positive).
Also as an input I'm given a number of "actions". The goal is to find the max possible sum of array elements with the given number of actions.
As an "action" I can either:
Add the current element to the sum
Move to the next element
We start at position 0 in the array. Each element can be added only once.
The limitations are:
2 < array.Length < 20
0 < number of "actions" < 20
It seems to me that these limitations are essentially not important. It's possible to try every combination of "actions", but in that case the complexity would be around 2^"actions", and that is bad.
Examples:
array = [1, 4, 2], 3 actions. Output should be 5. In this case we added the zeroth element, moved to the first element, and added the first element.
array = [7, 8, 9], 2 actions. Output should be 8. In this case we moved to the first element, then added the first element.
Could anyone please explain the algorithm to solve this problem? Or at least the direction in which I should try to solve it.
Thanks in advance
Here is another DP solution using memoization. The idea is to represent the state by a pair of integers (current_index, actions_left) and map it to the maximum sum when starting from the current_index, assuming actions_left is the upper bound on actions we are allowed to take:
from functools import lru_cache

def best_sum(arr, num_actions):
    'get best sum from arr given a budget of actions limited to num_actions'

    @lru_cache(None)
    def dp(idx, num_actions_):
        '''return best sum starting at idx (inclusive)
        with number of actions = num_actions_ available'''
        # return zero if out of list elements or actions
        if idx >= len(arr) or num_actions_ <= 0:
            return 0
        # otherwise, decide if we should include current element or not
        return max(
            # if we include element at idx
            # we spend two actions: one to include the element and one to move
            # to the next element
            dp(idx + 1, num_actions_ - 2) + arr[idx],
            # if we do not include element at idx
            # we spend one action to move to the next element
            dp(idx + 1, num_actions_ - 1)
        )

    return dp(0, num_actions)
I am using Python 3.7.12.
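As a quick check (these two lines are mine, not from the answer), the examples from the question come out as expected:

print(best_sum([1, 4, 2], 3))  # 5
print(best_sum([7, 8, 9], 2))  # 8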
array = [1, 1, 1, 1, 100]
actions = 5
In an example like the one above, you just have to keep moving right and finally pick up the 100. At the beginning of the array we never know what values we are going to see further on, so a greedy approach can't work.
You have two possible actions and you have to try out both, because you don't know which one to apply when.
Below is Python code. If you are not familiar with it, treat it as pseudocode or feel free to convert it to the language of your choice. We recursively try both actions until we run out of actions or reach the end of the input array.
def getMaxSum(current_index, actions_left, current_sum):
    global max_sum  # max_sum is the module-level variable defined below
    if actions_left == 0 or current_index == len(array):
        max_sum = max(max_sum, current_sum)
        return
    if actions_left == 1:
        # Add current element to sum
        getMaxSum(current_index, actions_left - 1, current_sum + array[current_index])
    else:
        # Add current element to sum and move to the next element
        getMaxSum(current_index + 1, actions_left - 2, current_sum + array[current_index])
        # Move to the next element
        getMaxSum(current_index + 1, actions_left - 1, current_sum)

array = [7, 8, 9]
actions = 2
max_sum = 0
getMaxSum(0, actions, 0)
print(max_sum)
You will realize that there can be overlapping subproblems here, and we can avoid those repetitive computations by memoizing/caching the results of the subproblems. I leave that task to you as an exercise. Basically, this is a Dynamic Programming problem.
Hope it helped. Post in the comments if you have any doubts.
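For what it's worth, re-running the snippet above on the [1, 1, 1, 1, 100] example from the start of this answer (driver lines are mine) confirms the "keep moving right" reasoning:

array = [1, 1, 1, 1, 100]
actions = 5
max_sum = 0  # reset the global before re-running
getMaxSum(0, actions, 0)
print(max_sum)  # 100: move right four times, then add the 100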

Ruby - compare two arrays for index matches and with the remainder if included

Working on a project to recreate a game Mastermind. I need to compare two arrays, and running into some struggles.
I need to output two integers for the flow of the game to work,
the first integer is the number of correct choices where the index matches. The code I have for this appears to be working
pairs = @code.zip(guess)
correct_position_count = pairs.select { |pair| pair[0] == pair[1] }.count
Here pairs comes from zipping @code (a 4-element array) with guess (also a 4-element array).
The second part is what I am having a bit of trouble with: how to do the comparison and return the count. The second integer should cover the positions where the two arrays don't match (the above code block but with !=), and check whether the guess array, excluding any exact index matches, has any elements that are included in the code array, once again excluding the exact index matches.
Any help would be greatly appreciated!
I am not completely sure I understand your problem, but if I understood it correctly, you have two arrays: solution with the solution and guess with the player's current guess.
Now, let's assume that the solution is 1234 and that the guess is 3335.
solution = [1, 2, 3, 4]
guess = [3, 3, 3, 5]
An element-by-element comparison produces an array of booleans:
diff = guess.map.with_index { |x,i| x == solution[i] }
# = [false, false, true, false]
Now, you can easily compute the number of good digits diff.count true and the number of wrong digits diff.count false. And, in case you need the index of the false and/or true values you can do
diff.each_index.select { |i| diff[i] } # indexes with true
# = [2]
diff.each_index.select { |i| !diff[i] } # indexes with false
# = [0, 1, 3]
You can count all digit matches ignoring their positions and then subtract exact matches.
pairs = @code.zip(guess)
correct_position_count = pairs.select { |pair| pair[0] == pair[1] }.count

any_position_count = 0
code_digits = @code.clone # protect @code from being modified
guess.each do |digit|
  if code_digits.include?(digit)
    code_digits.delete_at(code_digits.find_index(digit)) # delete the found digit so it is not counted more than once
    any_position_count += 1
  end
end
inexact_position_count = any_position_count - correct_position_count

puts "The first value: #{correct_position_count}"
puts "The second value: #{inexact_position_count}"

Count elements in 1st array less than or equal to elements in 2nd array python

I have an array A of 21381120 elements ranging over [0,1]. I need to construct a new array B in which element i contains the number of elements in A less than or equal to A[i].
My attempt:
A = np.random.random(10000) # for reproducibility
g = np.sort(A)
B = [np.sum(g<=element) for element in A]
I am still using a for loop, which takes too much time. Since I have to do this several times, I was wondering if there exists a better way to do it.
EDIT
I gave an example of the array A for reproducibility. This does what is expected. But I need it to be faster (for arrays having 2e9 elements).
For instance if:
A = [0.1,0.01,0.3,0.5,1]
I expect the output to be
B = [2, 1, 3, 4, 5]
You could use binary search to speed up searching in a sorted array; numpy's binary search is np.searchsorted.
import numpy as np

A = np.random.rand(10000)  # for reproducibility
g = np.sort(A)
B = [np.searchsorted(g, element, side='right') for element in A]
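Two notes of my own on this: side='right' is what makes searchsorted count "less than or equal" rather than "strictly less than", and searchsorted also accepts a whole array of queries, so the list comprehension can be dropped. Checking against the expected output from the question:

import numpy as np

A = np.array([0.1, 0.01, 0.3, 0.5, 1.0])
g = np.sort(A)
B = np.searchsorted(g, A, side='right')  # counts elements <= each A[i]
print(B)  # [2 1 3 4 5]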
Looks like sorting is the way to go, because in a sorted array A the number of elements less than or equal to A[i] is almost always i + 1.
However, if an element is repeated, you'll have to look at the rightmost occurrence of that value to the right of A[i]:
A = [1,2,3,4,4,4,5,6]
           ^^^^^ A[3] == A[4] == A[5]
Here, the number of elements <= A[3] is 3 + <number of repeated 4's>. Maybe you could roll your own sorting algorithm that would keep track of such repetitions. Or count the repetitions before sorting the array.
Then the final formula would be:
N(<= A[k]) = k + <number of elements equal to A[k]>
So the speed of your code would mainly depend on the speed of the sorting algorithm.
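A minimal sketch of that counting idea in numpy (my own code, leaning on np.unique instead of a hand-rolled sort), checked against the small example from the question:

import numpy as np

A = np.array([0.1, 0.01, 0.3, 0.5, 1.0])
vals, inverse, counts = np.unique(A, return_inverse=True, return_counts=True)
# vals is sorted, so cumsum(counts)[r] = number of elements <= vals[r]
B = np.cumsum(counts)[inverse]
print(B)  # [2 1 3 4 5]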

Most Efficient Algorithm to Align Multiple Ordered Sequences

I have a strange feeling this is a very easy problem to solve but I'm not finding a good way of doing this without using brute force or dynamic programming. Here it goes:
Given N arrays of ordered and monotonic values, find the set of positions for each array i1, i2 ... in that minimises pair-wise difference of values at those indexes between all arrays. In other words, find the positions for all arrays whose values are closest to each other. Multiple solutions may exist and arrays may or may not be equally sized.
If A denotes the list of all arrays, the pair-wise difference is the sum of the absolute differences between the values at the chosen indexes, taken over all pairs of distinct arrays, i.e. the sum of |A_x[i_x] - A_y[i_y]| over all pairs x < y.
An example, 3 arrays a, b and c:
a = [20 29 30 32 33]
b = [28 29 30 32 33]
c = [10 12 28 31 32 33]
The best alignment for this array would be a[3] b[3] c[4] or a[4] b[4] c[5], because (32,32,32) and (33,33,33) are all equal values and have, therefore minimum pairwise difference between each other. (Assuming array index starts at 0)
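To make the objective concrete, here is a tiny helper of my own (not from the original post) that evaluates the pairwise difference for a chosen set of indexes; both alignments named above score 0:

from itertools import combinations

def pairwise_diff(arrays, indexes):
    # sum of |A_x[i_x] - A_y[i_y]| over all pairs of distinct arrays
    values = [arr[i] for arr, i in zip(arrays, indexes)]
    return sum(abs(u - v) for u, v in combinations(values, 2))

a = [20, 29, 30, 32, 33]
b = [28, 29, 30, 32, 33]
c = [10, 12, 28, 31, 32, 33]
print(pairwise_diff([a, b, c], [3, 3, 4]), pairwise_diff([a, b, c], [4, 4, 5]))  # 0 0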
This is a common problem in bioinformatics that's usually solved with Dynamic Programming, but since the sequences are ordered, I think there's a way of exploiting this notion of order. I first thought about doing this pairwise, but this does not guarantee the global optimum because the best local answer might not be the best global answer.
This is meant to be language agnostic, but I don't really mind an answer for a specific language, as long as there is no loss of generality. I know Dynamic Programming is an option here, but I have a feeling there's an easier way to do this?
The tricky thing is stepping through the arrays so that at some point you're guaranteed to be considering the set of indices that realizes the pairwise min. Using a min heap on the values doesn't work. Counterexample with 4 arrays: [0,5], [1,2], [2], [2]. We start with d(0,1,2,2) = 7 and the optimum is d(0,2,2,2) = 6, but the min heap moves us from 7 to d(5,1,2,2) = 12, then d(5,2,2,2) = 9.
I believe (but haven't proved) that if we always increment the index that improves the pairwise distance the most (or degrades it the least), we're guaranteed to visit every local min and the global min.
Assuming n total elements across k arrays:
Simple approach: we repeatedly get the pairwise distance deltas (the delta w.r.t. incrementing each index), increment the best one, and any time doing so switches us from improvement to degradation (i.e. a local minimum) we calculate the pairwise distance. All this is O(k^2) per increment, for a total running time of O((n-k) * k^2).
With O(k^2) storage, we could keep an array where (i,j) stores the pairwise distance delta achieved by incrementing the index of array i w.r.t. array j. We also store the column sums. Then on incrementing an index we can update the appropriate row & column & column sums in O(k). This gives us a running time of O((n-k)*k).
To just complete Dave's answer, here is the pseudocode of the delta algorithm:
initialise index_table to 0's, where each row i denotes the current index into the ith array
initialise delta_table with the corresponding cost of incrementing the index of the ith array while keeping the other indexes at their current values
cur_cost <- cost of current index_table
best_cost <- cur_cost
best_solutions <- list with the current index_table
while (can_at_least_one_index_increase)
    i <- index whose delta is lowest
    increment the i-th entry of the index_table
    if cost(index_table) < cur_cost
        cur_cost = cost(index_table)
        best_solutions = {} U {index_table}
    else if cost(index_table) = cur_cost
        best_solutions = best_solutions U {index_table}
    update delta_table
Important Note: During an iteration, some index_table entries might have already reached the maximum value for that array. Whenever updating the delta_table, it is necessary never to pick those indexes, otherwise this will result in an array-out-of-bounds error, a segmentation fault, or undefined behaviour. A neat trick is to simply check which indexes are already at their max and assign them a sufficiently large value, so they are never picked. If no index can increase anymore, the loop will end.
Here's an implementation in Python:
import math

def align_ordered_sequences(arrays: list):
    def get_cost(index_table):
        n = len(arrays)
        if n == 1:
            return 0
        sum = 0
        for i in range(0, n - 1):
            for j in range(i + 1, n):
                v1 = arrays[i][index_table[i]]
                v2 = arrays[j][index_table[j]]
                sum += math.sqrt((v1 - v2) ** 2)
        return sum

    def compute_delta_table(index_table):
        # Initialise the delta table: we bump each index by 1, call
        # the cost method and then revert the change, this avoids having to
        # create copies, which decreases performance unnecessarily
        delta_table = []
        for i in range(n):
            if index_table[i] + 1 >= len(arrays[i]):
                # Implementation detail: if the index is outside the bounds of
                # array i, choose a "large enough" number
                delta_table.append(999999999999999)
            else:
                index_table[i] = index_table[i] + 1
                delta_table.append(get_cost(index_table))
                index_table[i] = index_table[i] - 1
        return delta_table

    def can_at_least_one_index_increase(index_table):
        answer = False
        for i in range(len(arrays)):
            if index_table[i] < len(arrays[i]) - 1:
                answer = True
        return answer

    n = len(arrays)
    index_table = [0] * n
    delta_table = compute_delta_table(index_table)
    best_solutions = [index_table.copy()]
    cur_cost = get_cost(index_table)
    best_cost = cur_cost
    while can_at_least_one_index_increase(index_table):
        i = delta_table.index(min(delta_table))
        index_table[i] = index_table[i] + 1
        new_cost = get_cost(index_table)
        # A new best solution was found
        if new_cost < cur_cost:
            cur_cost = new_cost
            best_solutions = [index_table.copy()]
        # A new solution with the same cost was found
        elif new_cost == cur_cost:
            best_solutions.append(index_table.copy())
        # Update the delta table
        delta_table = compute_delta_table(index_table)
    return best_solutions
And here are some examples:
>>> print(align_ordered_sequences([[0,5], [1,2], [2], [2]]))
[[0, 1, 0, 0]]
>>> print(align_ordered_sequences([[3, 5, 8, 29, 40, 50], [1, 4, 14, 17, 29, 50]]))
[[3, 4], [5, 5]]
Note 2: this outputs indexes, not the actual values of each array.
