Array moves chunks of neighboring elements to a new index

The Question
The question is simple: imagine I have a list or an array (not a linked list)
list = [1, 2, ... 999999]
Now I want to move the elements from index 3000 to 600000 to index 100. The result should be easy to imagine:
[1, 2, ... 99, 100, 3000, 3001, ... 600000, 101, 102, ... 2999, 600001, 600002, ... 999999]
How do I perform those operations efficiently?
Judge My Thinking
Disclaimer: you could say this operation is exactly the same as moving [101, 2999] to index 600000, which seems like a more efficient operation. But that's not the algorithm my question is about; my question is about how to move more efficiently, so let's stick with the original question.
I can think of several ways:
1. Just do delete and insert for elements 3000 to 600000. (What's the time complexity?)
2. Save the elements in [3000, 600000] to a temporary space, then use delete/insert to move everything in [101, 2999] down to [597192, 600000] to make space, and transfer [3000, 600000] back to index 100. Temporarily holding the data from [3000, 600000] costs some memory, but does the copying make the whole operation slower? (Time complexity?)
3. An attempt to improve 2. Same idea, but the move is not done by delete/insert; instead, manually copy [101, 2999] to [597192, 600000]. (Time complexity? Does it improve speed compared to delete/insert?)
4. An attempt to improve 2 or 3. Same idea, but with no delete/insert and no bulk copy of [3000, 600000]: hold only one element at a time in temporary memory, and move/copy everything in a complicated way. (Is this faster than the others? Is it possible to implement? Can you show me the code/pseudo-code?)
Are there better ideas?
Thank you for reading and thinking.

The algorithm you are after is known as rotate. There are two common ways to implement it. Both run in O(length) time and O(1) space.
One, attributed to Dijkstra, is ultimately efficient in the sense that every element is moved exactly once. It is somewhat tricky to implement and requires a non-obvious setup. Besides, it may behave in a very cache-unfriendly manner. For details, see METHOD 3 (A Juggling Algorithm).
The other is very simple and cache-friendly, but moves each element twice. To rotate a range [l, r) around a midpoint m, do
reverse(l, m);
reverse(m, r);
reverse(l, r);
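
In Python, a minimal sketch of this reversal approach (the helper names and the toy indices below are mine; for the original question the call would be rotate(a, 100, 3000, 600001)):

def reverse(a, l, r):
    # reverse the half-open range a[l:r] in place
    r -= 1
    while l < r:
        a[l], a[r] = a[r], a[l]
        l += 1
        r -= 1

def rotate(a, l, m, r):
    # rotate a[l:r] so that a[m:r] comes first; each element moves twice
    reverse(a, l, m)
    reverse(a, m, r)
    reverse(a, l, r)

a = list(range(10))
rotate(a, 2, 5, 8)   # move the block a[5:8] so that it starts at index 2
print(a)             # [0, 1, 5, 6, 7, 2, 3, 4, 8, 9]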

I split the list at the "breakpoints" and reassembled the pieces. Creating a slice should be O(n), with n the length of the slice. The longest slice has up to len(a) elements, so storing the list pieces should be O(n), with n = len(a). Reassembling the pieces is O(n) as well, so the whole operation is O(n) in total. The memory requirement is 2*len(a), since we also store the slices, which sum up to the same length as a.
def truck(a, start, end, to):
    # cut the list at the breakpoints and reassemble the pieces:
    # exchanges the block a[start:end] with the equal-length block at index `to`
    a = list(a)
    diff = end - start
    to_left = to < start
    split_save = a[to:to + diff]   # block displaced from the target position
    split_take = a[start:end]      # block being moved
    if to_left:
        split_first = a[:to]
        split_second = a[to + diff:start]
        split_third = a[end:]
        res = split_first + split_take + split_second + split_save + split_third
    else:
        split_first = a[:start]
        split_second = a[end:to]
        split_third = a[to + diff:]
        res = split_first + split_save + split_second + split_take + split_third
    return res

print(truck(range(10), 5, 8, 2))
# [0, 1, 5, 6, 7, 2, 3, 4, 8, 9]
print(truck(range(10), 2, 5, 8))
# [0, 1, 8, 9, 5, 6, 7, 2, 3, 4]

Let [l, r] be the segment you want to move, L = r - l + 1 its length, and N the total number of elements.
1. Delete and insert at an arbitrary position in an array each cost O(N), and the delete/insert is performed O(L) times, so the total time complexity is O(NL).
2. Same as #1: it takes O(NL) because of the delete and insert.
3, 4. Copying a block takes O(L), and shifting the elements in between is O(N); overall we can simply say it takes O(N).
Now, some fancy data structures for better complexity. You can use a tree to store a linear array, specifically a self-balancing binary search tree (BBST), which takes O(log N) to insert or delete one element. Moving a segment to an arbitrary position by deleting and inserting each element individually then costs O(L log N). (std::map in C++ is built on such a structure.)
O(L log N) does not seem better. But with a BBST it can be improved to amortized O(log N): gather the elements in [l, r] into one subtree, then cut this subtree from the BBST; that is the delete. Then insert this subtree at the position you want.
For example, gather [3000, 600000] into one subtree and cut it from its parent (deleting the whole segment at once). Make this subtree the right child of the 100th element in inorder, or the left child of the 101st element in inorder (inserting the whole segment at once). The tree then contains the elements in the order you want.
A splay tree would be a good choice.
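
To make the cut-and-paste idea concrete, here is a minimal Python sketch using an implicit treap (a randomized BBST indexed by position) in place of the suggested splay tree; the O(log N) split/join bounds are the same, and all names below are mine:

import random

class Node:
    def __init__(self, val):
        self.val = val
        self.prio = random.random()   # heap priority keeps the tree balanced
        self.size = 1                 # subtree sizes give implicit positions
        self.left = self.right = None

def size(t):
    return t.size if t else 0

def update(t):
    t.size = 1 + size(t.left) + size(t.right)
    return t

def merge(a, b):
    # concatenate treaps: everything in a comes before everything in b
    if not a or not b:
        return a or b
    if a.prio > b.prio:
        a.right = merge(a.right, b)
        return update(a)
    b.left = merge(a, b.left)
    return update(b)

def split(t, k):
    # split treap t into (first k elements, the rest)
    if not t:
        return None, None
    if size(t.left) >= k:
        l, t.left = split(t.left, k)
        return l, update(t)
    t.right, r = split(t.right, k - size(t.left) - 1)
    return update(t), r

def move_segment(root, l, r, to):
    # cut the half-open segment [l, r) and reinsert it at index `to`
    # (`to` is counted in the list with the segment removed)
    left, rest = split(root, l)
    seg, right = split(rest, r - l)
    a, b = split(merge(left, right), to)
    return merge(merge(a, seg), b)

def inorder(t, out):
    if t:
        inorder(t.left, out)
        out.append(t.val)
        inorder(t.right, out)

root = None
for x in range(10):
    root = merge(root, Node(x))
root = move_segment(root, 5, 8, 2)
out = []
inorder(root, out)
print(out)   # [0, 1, 5, 6, 7, 2, 3, 4, 8, 9]

Each split/merge walks one root-to-leaf path, so move_segment costs O(log N) expected time regardless of the segment length.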

Related

Holding a sorted array - reverse sorted inputs case

I get integers from the user (one by one) and insert each into a sorted vec at its correct place by running a binary search to find the insertion index.
The problem is that when the user decides to provide reverse-sorted input (one by one), insertion becomes expensive, O(n^2), since on each insertion all of the current elements in the vec have to be shifted to the right. Is there an algorithm that can handle this in less time?
Example:
[] <- 10
[10] <- 9 // Shift x1
[9, 10] <- 8 // Shift x2
[8, 9, 10] <- 7 // Shift x3
[7, 8, 9, 10] <- 6 // Shift x4
...
The problem is that when the user decides to provide reverse-sorted input (one by one), insertion will be expensive, O(n^2), since on each insertion all of the current elements in the vec have to be shifted to the right.
The Vec implementation will shift all the contents at once (using a memcpy), so shifting 20 items versus shifting 1 doesn't really make a difference. If the collection is huge, memory traffic will start being a concern, but at small sizes you can treat it as a constant.
Is there an algorithm that can handle this in less time?
An intrinsically sorted tree-based data structure. But the Rust standard library is somewhat limited on that front, and a BTreeSet will only work if you're deduplicating anyway. I'm not sure it would beat a regular Vec, though, as it incurs a higher number of allocations.
And while a LinkedList theoretically provides O(1) insertion, Rust doesn't provide an insertion API because there's no Cursor, so you'd be paying O(n-i) to look for the insertion index, after which insert() would pay that again to traverse to the index in question and insert the new item.
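
For illustration, a minimal Python sketch of the binary-search-plus-shift pattern the question describes (bisect stands in for the manual binary search, and list.insert does the O(n) shift, much like Vec::insert):

import bisect

def insert_sorted(v, x):
    # binary search for the insertion index (O(log n)),
    # then insert, shifting the tail one slot right (O(n) worst case)
    v.insert(bisect.bisect_left(v, x), x)

v = []
for x in [10, 9, 8, 7, 6]:   # reverse-sorted input: every insert shifts everything
    insert_sorted(v, x)
print(v)                     # [6, 7, 8, 9, 10]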

Array operations for maximum sum

Given an array A consisting of N elements, our task is to find the maximal subarray sum after applying the following operation exactly once: select any subarray and set all of its elements to zero.
E.g., if the array is -1 4 -1 2, then the answer is 6, because we can choose the -1 at index 2 as a subarray and set it to 0. The resulting array after applying the operation is -1 4 0 2, and its maximum subarray sum is 4 + 0 + 2 = 6.
My approach was to find the start and end indexes of the minimum-sum subarray, set all of its elements to 0, and then find the maximum-sum subarray. But this approach is wrong.
Starting simple:
First, let us start with one part of the question: finding the maximal subarray sum.
This can be done via dynamic programming:
a = [1, 2, 3, -2, 1, -6, 3, 2, -4, 1, 2, 3]
a = [-1, -1, 1, 2, 3, 4, -6, 1, 2, 3, 4]  # second test input overwrites the first

def compute_max_sums(a):
    res = []
    currentSum = 0
    for x in a:
        if currentSum > 0:
            res.append(x + currentSum)
            currentSum += x
        else:
            res.append(x)
            currentSum = x
    return res

res = compute_max_sums(a)
print(res)
print(max(res))
Quick explanation: we iterate through the array. As long as the running sum is positive, it is worth appending the whole block to the next number. If we dip to zero or below at any point, we discard the whole "tail" sequence, since it is no longer profitable to keep it, and we start anew. At the end we have an array whose j-th element is the maximal sum of a subarray i:j with 0 <= i <= j.
The rest is just a matter of finding the maximal value in this array.
Back to the original question
Now that we have solved the simplified version, it is time to look further. We can now select a subarray to be zeroed out in order to increase the maximal sum. The naive solution would be to try every possible subarray and repeat the steps above; this would unfortunately take too long¹. Fortunately, there is a way around it: we can think of the zeroes as a bridge between two maxima.
There is one more thing to address, though. Currently, when we look at the j-th element, we only know that its tail is somewhere behind it. So if we were to take the maximum and the 2nd biggest element from the array, their subarrays could overlap, which would be a problem, since we would be counting some of the elements more than once.
Overlapping tails
How to mitigate this "overlapping tails" issue?
The solution is to compute everything once more, this time from the end to the start. This gives us two arrays: one where the j-th element has its tail i pointing towards the left end of the array (i.e. i <= j), and one where the reverse is true. Now, if we take x from the first array and y from the second array, we know that if index(x) < index(y), then their respective subarrays are non-overlapping.
We can now proceed to try every suitable x, y pair; there are O(n²) of them. Since we don't need any further computation (we already precomputed all the values), O(n²) is the final complexity of the algorithm: the preparation costs only O(n), so it imposes no additional penalty.
Here be dragons
So far the stuff we did was rather straightforward. The following section is not that complex either, but there are going to be some moving parts. Time to brush up on max heaps:
Accessing the max takes constant time.
Deleting any element is O(log(n)) if we have a reference to that element. (We can't find an element in O(log(n)); however, if we know where it is, we can swap it with the last element of the heap, delete it, and bubble the swapped element down in O(log(n)).)
Adding any element to the heap is O(log(n)) as well.
Building a heap can be done in O(n).
That being said, since we need to go from the start to the end, we build two heaps, one for each of our precomputed arrays.
We will also need a helper array that gives us quick index -> element-in-heap access, to make the delete O(log(n)).
The first heap starts empty (we are at the start of the array); the second one starts full (we have the whole array ready).
Now we can iterate over the whole array. In each step i we:
Compare max(heap1) + max(heap2) with our current best result to get the current maximum, in O(1).
Add the i-th element of the first array to the first heap, in O(log(n)).
Remove the i-th indexed element from the second heap (this is why we keep the references in a helper array), in O(log(n)).
The resulting complexity is O(n * log(n)).
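
A minimal Python sketch of this O(n log n) pipeline, with one liberty: stale heap entries are dropped lazily when they surface at the top, instead of via the helper-array delete described above (the function names are mine):

import heapq

def best_ending_here(a):
    # same recurrence as compute_max_sums above:
    # res[j] is the best subarray sum ending at j
    res, cur = [], 0
    for x in a:
        cur = x + cur if cur > 0 else x
        res.append(cur)
    return res

def max_sum_after_one_zeroing(a):
    left = best_ending_here(a)               # tails point left
    right = best_ending_here(a[::-1])[::-1]  # tails point right
    n = len(a)
    best = max(left)                         # zeroing may not help at all
    heap1 = []                               # max-heap over left[x] for x < i
    heap2 = [(-right[j], j) for j in range(n)]
    heapq.heapify(heap2)                     # max-heap over right[y] for y >= i
    removed = set()
    for i in range(n):
        while heap2 and heap2[0][1] in removed:
            heapq.heappop(heap2)             # lazily drop deleted entries
        if heap1 and heap2:
            best = max(best, -heap1[0][0] - heap2[0][0])
        heapq.heappush(heap1, (-left[i], i))
        removed.add(i)
    return best

print(max_sum_after_one_zeroing([-1, 4, -1, 2]))   # 6, matching the example above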
Update:
Just a quick illustration of the O(n²) solution, since OP nicely and politely asked. Man oh man, I'm not your bro.
Note 1: Getting the solution won't help you as much as figuring out the solution on your own.
Note 2: The fact that the following code gives the correct answer is not proof of its correctness. While I'm fairly certain that my solution should work, it is definitely worth looking into why it works (if it works) rather than looking at one example of it working.
input = [100, -50, -500, 2, 8, 13, -160, 5, -7, 100]
reverse_input = [x for x in reversed(input)]

max_sums = compute_max_sums(input)
rev_max_sums = [x for x in reversed(compute_max_sums(reverse_input))]
print(max_sums)
print(rev_max_sums)

current_max = 0
for i in range(len(max_sums)):
    if i < len(max_sums) - 1:
        for j in range(i + 1, len(rev_max_sums)):
            if max_sums[i] + rev_max_sums[j] > current_max:
                current_max = max_sums[i] + rev_max_sums[j]
print(current_max)
¹ There are n possible beginnings, n possible ends, and the complexity of the code we have is O(n), resulting in an overall complexity of O(n³). Not the end of the world, but it's not nice either.

Find the most frequent triplet in an array

We have an array of N numbers, all between 1 and k.
The problem is to find the best way of finding the most frequent triplet.
My approach to the problem:
Say the input is { 1, 2, 3, 4, 1, 2, 3, 4 }.
First, search for the count of the triplet (1, 2, 3), starting from the second element of the array and scanning to the end; now we have a count of 1. Next, start with (2, 3, 4) and search the array again.
For each triplet we scan the array and find its count, so we run over the array n - 1 times.
This way my algorithm runs in O(n * n) time. Is there a better way to solve this problem?
You can do it in O(n log n) worst-case time (and linear space): just insert all triples into a balanced binary search tree and find the maximum afterwards.
Alternatively, you can use a hash table to get O(n) expected time, which is typically faster than the search-tree approach in practice if you choose a good hash function.
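
A minimal sketch of the hash-table variant in Python, assuming (as in the question's example) that a triplet means three consecutive elements:

from collections import Counter

def most_frequent_triplet(a):
    # count every run of three consecutive elements in one O(n) pass
    counts = Counter(zip(a, a[1:], a[2:]))
    return counts.most_common(1)[0]

print(most_frequent_triplet([1, 2, 3, 4, 1, 2, 3, 4]))
# ((1, 2, 3), 2) -- (2, 3, 4) also occurs twice; the tie goes to first appearance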
Are there any memory constraints, i.e. does this run on a device with memory limitations?
If not, this could be a good solution: iterate over the array and, for each triple, build a representation object (or struct, if implemented in C#) that goes into a map as a key, with the triple's counter as the value.
If you implement the hash and equals functions appropriately, you can find the "most popular" triple whether the order of the numbers matters or not, e.g. 1,2,3 != 2,1,3 versus 1,2,3 == 2,1,3.
After iterating over the entire array, the largest value's key is your "most popular" triple. With that approach you could find the X most popular triples too. You also scan the array only once, aggregating all the triples as you go (no extra scan per triple).

Algorithm - the time complexity of deletion in an unsorted array

Suppose there is an unsorted array A that contains an element x (x is a pointer to the element), and every element has a satellite variable k. We get the following worst-case time complexities:
If we want to search for a particular k, it costs O(n).
If we want to insert an element, it costs O(1), because A just adds the element to the end.
What if we know x and want to delete it from the array A?
We have to search for x.k first to get the index of x, and then delete x via its index in A, right?
So deletion costs O(n) too, right?
Thanks
Finding the element with a given value is linear.
Since the array isn't sorted anyway, you can do the deletion itself in constant time: first swap the element you want to delete with the last element of the array, then reduce the array size by one element.
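
A minimal sketch of that swap-and-shrink delete in Python (the function name is mine):

def swap_remove(a, i):
    # O(1) delete when order doesn't matter:
    # overwrite a[i] with the last element, then drop the last slot
    a[i] = a[-1]
    a.pop()

v = [3, 5, 1, 7, 4]
swap_remove(v, 1)   # delete the value 5
print(v)            # [3, 4, 1, 7]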
Yes, that's right. Note that in an array, the deletion alone takes O(n) time, because after removing the element you need to shift all the elements to its right one place to the left. So even if you already know where x is (for example, if you only ever delete the first element), it still takes O(n) time.
The worst-case time complexity of deletion in a sorted array is O(n). If the array is not sorted but the order of the remaining elements must not be altered by the deletion, the time complexity is the same O(n); otherwise it is O(1).
Yes. It takes O(n) time to find the element you want to delete. Then, in order to delete it, you must shift all elements to its right one space to the left. This is also O(n), so the total complexity is linear.
Also, if you're talking about statically allocated arrays, insertion takes O(n) as well: you have to resize the array to accommodate the extra element. There are ways to amortize this running time to O(1), though.
Unsorted array example:

Item values:    [3, 5, 1, 7, 4]
Item addresses: [&1, &2, &3, &4, &5]

Deleting the value 5:

1. Deletion, order preserved: O(n + n) = O(2n) ~> O(n)
i) O(n) - find the position of the element (index = 1 for value 5).
ii) O(n) - after deleting the element, shift the remaining items (1, 7, 4) left so the addresses stay contiguous:

Item values:    [3, 1, 7, 4]
Item addresses: [&1, &2, &3, &4]

2. Deletion, order not preserved: O(n + 1 + 1) = O(n + 2) ~> O(n)
i) O(n) - find the position of the element (index = 1 for value 5).
ii) O(1) - swap it with the last element of the array:

Item values:    [3, 4, 1, 7, 5]
Item addresses: [&1, &2, &3, &4, &5]

iii) O(1) - delete the last element of the array:

Item values:    [3, 4, 1, 7]
Item addresses: [&1, &2, &3, &4]

Efficient sorted Cartesian product of 2 sorted arrays of integers

I need hints to design an efficient algorithm that takes the following input and produces the following output.
Input: two sorted arrays of integers, A and B, each of length n.
Output: one sorted array consisting of the products of all pairs from the Cartesian product of A and B.
For Example:
Input:
A is 1, 3, 5
B is 4, 8, 10
here n is 3.
Output:
4, 8, 10, 12, 20, 24, 30, 40, 50
Here are my attempts at solving this problem.
1) Given that the output has n^2 elements, an efficient algorithm can't do better than O(n^2) time complexity.
2) First I tried a simple but inefficient approach: generate the Cartesian product of A and B, which can be done in O(n^2) time. We need to store it so we can sort it, so the space complexity is O(n^2) too. Then we sort the n^2 elements, which can't be done in better than O(n^2 log n) without making assumptions about the input.
That leaves me with an O(n^2 log n) time and O(n^2) space algorithm. There must be a better algorithm, because I haven't made use of the sorted nature of the input arrays.
If there's a solution that's better than O(n² log n), it needs to do more than just exploit the fact that A and B are already sorted; see my answer to this question.
Srikanth wonders how this can be done in O(n) space (not counting the space for the output). This can be done by generating the lists lazily.
Suppose we have A = 6,7,8 and B = 3,4,5. First, multiply every element in A by the first element in B, and store these in a list:
6×3 = 18, 7×3 = 21, 8×3 = 24
Find the smallest element of this list (6×3), output it, and replace it with the product of the same element of A and the next element of B:
7×3 = 21, 6×4 = 24, 8×3 = 24
Find the new smallest element of this list (7×3), output it, and replace:
6×4 = 24, 8×3 = 24, 7×4 = 28
And so on. We only need O(n) space for this intermediate list, and finding the smallest element at each stage takes O(log n) time if we keep the list in a heap.
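
A minimal Python sketch of this lazy generation using heapq, assuming nonnegative values as in the examples, so that each row stream a*B[0], a*B[1], ... is nondecreasing:

import heapq

def sorted_products(A, B):
    # one candidate per element of A: (product, index in A, index in B)
    heap = [(a * B[0], i, 0) for i, a in enumerate(A)]
    heapq.heapify(heap)                  # O(n) build
    out = []
    while heap:
        prod, i, j = heapq.heappop(heap) # smallest remaining product, O(log n)
        out.append(prod)
        if j + 1 < len(B):
            heapq.heappush(heap, (A[i] * B[j + 1], i, j + 1))
    return out

print(sorted_products([1, 3, 5], [4, 8, 10]))
# [4, 8, 10, 12, 20, 24, 30, 40, 50]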
If you multiply one value of A by all values of B, the resulting list is still sorted. In your example:
A is 1, 3, 5
B is 4, 8, 10
1*(4,8,10) = 4,8,10
3*(4,8,10) = 12,24,30
Now you can merge the two lists, exactly as in merge sort: look at both list heads and put the smaller one into the result list. Here you would select 4, then 8, then 10, and so on:
result = 4,8,10,12,24,30
Now you do the same with this result list and the next remaining list, merging 4,8,10,12,24,30 with 5*(4,8,10) = 20,40,50.
Since merging is most efficient when both lists have about the same length, you can improve this scheme by dividing A into two halves, doing the merging recursively for both halves, and then merging the two results.
Note that you can save some time with the merge approach, since A isn't required to be sorted; only B needs to be sorted.
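
For comparison, the same idea fits in a few lines with the standard library's n-way merge, which keeps one head per row in an internal heap rather than merging pairwise; the same nonnegativity assumption as above applies:

from heapq import merge

def sorted_products_by_merge(A, B):
    # each row a*B is already sorted, so an n-way merge of the rows is sorted
    rows = ([a * b for b in B] for a in A)
    return list(merge(*rows))

print(sorted_products_by_merge([1, 3, 5], [4, 8, 10]))
# [4, 8, 10, 12, 20, 24, 30, 40, 50]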
