Use merge sort to sort a list with only swap, rotation and one other list

I have two lists: list A, containing a set of random numbers, and list B, which is empty. I have to sort list A.
I can perform only a limited set of operations on these two lists:
move the first element of list A or B to the beginning of the other list (B or A)
swap the first two elements of list A or B, so 32 41 8 9 becomes 41 32 8 9
make the last element of list A or B the first element of that list (rotation), so 32 41 8 9 becomes 9 32 41 8
make the first element of list A or B the last element of that list (rotation), so 32 41 8 9 becomes 41 8 9 32
I have already written an algorithm that sorts list A using list B as a stack and the set of allowed operations, but it takes too long when the list becomes large (more than 1000 elements), so it does not perform well.
So I looked into merge sort, which performs very well on the linked lists that I use.
The problem is that merge sort needs to split lists into sublists, but in my case I am only allowed two lists and the operations that I explained.
Is it possible to implement an efficient merge sort under these constraints? How would this algorithm work?
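The four allowed operations map naturally onto a double-ended queue. The following is only a minimal sketch, assuming Python's collections.deque stands in for each linked list; the function names push, swap, rotate and reverse_rotate are hypothetical labels, not taken from the question:

```python
from collections import deque

def push(src, dst):
    """Move the first element of src to the beginning of dst."""
    if src:
        dst.appendleft(src.popleft())

def swap(lst):
    """Swap the first two elements of lst."""
    if len(lst) >= 2:
        lst[0], lst[1] = lst[1], lst[0]

def rotate(lst):
    """The first element becomes the last one."""
    lst.rotate(-1)

def reverse_rotate(lst):
    """The last element becomes the first one."""
    lst.rotate(1)

a, b = deque([32, 41, 8, 9]), deque()
swap(a)            # a: 41 32 8 9
reverse_rotate(a)  # a: 9 41 32 8
push(a, b)         # a: 41 32 8, b: 9
```

Any sorting strategy under these constraints has to be expressed as a sequence of calls like these, so counting calls gives a direct measure of its cost.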

Related

What would the algorithm for this problem look like?

I have this data structure problem:
Implement an external sort method with two auxiliary files. The split of
the initial file into sections follows this strategy: 20 elements are read from the file into
an array and are sorted with the internal sort method quicksort. Then the
smallest element of the array is written to the auxiliary file and the next element is read from the
source file. If the element read is greater than the element written (it is part of the current
section), it is inserted in order into the sorted subarray; otherwise it is added to
the free positions of the array. Because the elements of the array are extracted from the head,
free positions remain at the opposite end. The section ends when the
sorted subarray is empty. To form the next section, start by sorting the array
(remember that the elements that were not part of the section were added to the array) and
then the process continues in the same way: write the smallest element to the other auxiliary
file and read an element from the source file... Once the distribution is done, the merge
phase is the same as in the direct or natural merge algorithms.
I'm thinking of this algorithm:
File origin
75
98
3
27
37
64
19
55
22
62
81
80
87
36
68
8
33
38
72
90
24
91
6
78
54
ordered array of 20 elements: 3-8-19-22-27-33-36-37-38-55-62-64-68-72-75-80-81-87-90-98
First element in the aux file 1: 3
After this point I don't know what the algorithm has to do to order the remaining 5 elements
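The distribution strategy described in the problem statement can be sketched roughly as follows. This is only an illustration under assumptions: the source file is modelled as a Python list, each section is collected into a list instead of being written to an auxiliary file, and the name distribute_sections is hypothetical.

```python
import bisect
from itertools import islice

def distribute_sections(source, capacity=20):
    """Split source into sorted sections using an array of `capacity` slots."""
    items = iter(source)
    current = sorted(islice(items, capacity))  # sorted subarray of the current section
    held = []                 # elements read that are too small for the current section
    sections, section = [], []
    done = object()           # sentinel marking the end of the source
    while current:
        smallest = current.pop(0)
        section.append(smallest)              # "write" to the auxiliary file
        nxt = next(items, done)
        if nxt is not done:
            if nxt >= smallest:
                bisect.insort(current, nxt)   # still belongs to this section
            else:
                held.append(nxt)              # has to wait for the next section
        if not current:                       # sorted subarray empty: section ends
            sections.append(section)
            section = []
            current, held = sorted(held), []
    return sections

data = [75, 98, 3, 27, 37, 64, 19, 55, 22, 62, 81, 80, 87, 36, 68,
        8, 33, 38, 72, 90, 24, 91, 6, 78, 54]
sections = distribute_sections(data)
# sections[0] starts 3, 8, 19, 22, 24, 27, ... and holds 24 elements;
# sections[1] is [6], because 6 was read after 19 had already been written.
```

With the sample file, the five remaining elements are read one at a time as elements are written: 24, 91, 78 and 54 join the current section, while 6 is held back for the next one. In the setup described above, consecutive sections would go to alternating auxiliary files.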

How to identify breaks within an array in MATLAB?

I have an array in MATLAB containing elements such as
A=[12 13 14 15 30 31 32 33 58 59 60];
How can I identify breaks in values of data? For example, the above data exhibits breaks at elements 15 and 33. The elements are arranged in ascending order and have an increment of one. How can I identify the location of breaks of this pattern in an array? I have achieved this using a for and if statement (code below). Is there a better method to do so?
count=0;
for i=1:numel(A)-1
    if(A(i+1)==A(i)+1)
        continue;
    else
        count=count+1;
        q(count)=i;
    end
end

This is a good time to use diff and find the neighbouring differences that aren't equal to 1. However, diff returns an array that is one element shorter than your input array, because it computes pairwise differences up to the last element. As such, when you find the locations that aren't equal to 1, make sure you add 1 to the locations to account for this:
>> A=[12 13 14 15 30 31 32 33 58 59 60];
>> q = find(diff(A) ~= 1) + 1
q =
5 9
This tells us that locations 5 and 9 in your array are where the jumps happen, and that's right for your example data.
However, if you want to find the locations before the jump happens, such as in your code, don't add 1 to the result:
>> q = find(diff(A) ~= 1)
q =
4 8
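For readers outside MATLAB, the same idea can be sketched in plain Python. This is an illustrative translation only: find_breaks is a hypothetical name, and the positions are returned 1-based so that they line up with the MATLAB output above.

```python
def find_breaks(values, step=1):
    """Return the 1-based positions of elements whose successor is not value + step."""
    return [i + 1 for i in range(len(values) - 1)
            if values[i + 1] - values[i] != step]

A = [12, 13, 14, 15, 30, 31, 32, 33, 58, 59, 60]
before = find_breaks(A)            # positions just before each jump: [4, 8]
after = [i + 1 for i in before]    # positions just after each jump: [5, 9]
```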

Traversing a complete binary min heap

I am not sure how to traverse the tree structure below so that the nodes are always in ascending order. Heapifying the array [9 8 7 6 5 4 3 2 1 0] results in the array [0 1 3 2 5 4 7 9 6 8] which I think corresponds to this representation:
Wanting to keep the array as is (because I want to do efficient inserts later) how can I efficiently traverse it in ascending order? (That is visiting the nodes in this order [0 1 2 3 4 5 6 7 8 9])

Just sort the array. It will still be a min-heap afterward, and no other algorithm is asymptotically more efficient.

You can't traverse the heap in the same sense that you can traverse a BST. @Dukeling is right about the BST being a better choice if sorted traversal is an important operation. However, you can use the following algorithm, which requires O(1) additional space.
Assume you have the heap in the usual array form.
Remove items one at a time in sorted order to "visit" them for the traversal.
After visiting the i'th item, put it back in the heap array at location n-i, which is currently unused by the heap (assuming zero-based array indices).
After traversal reverse the array to create a new heap.
Removing the items requires O(n log n) time. Reversing is another O(n).
If you don't need to traverse all the way, you can stop at any time and "fix up" the array by running the O(n) heapify operation. See pseudocode here for example.
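The steps above can be sketched as follows. This is a minimal illustration, assuming a min-heap stored in a Python list; sift_down and traverse_in_order are hypothetical names, and the visit callback is an assumption about how the caller wants to consume the items.

```python
def sift_down(a, start, end):
    """Restore min-heap order in a[:end], assuming only a[start] may be out of place."""
    i = start
    while True:
        child = 2 * i + 1
        if child >= end:
            break
        if child + 1 < end and a[child + 1] < a[child]:
            child += 1                      # pick the smaller child
        if a[i] <= a[child]:
            break
        a[i], a[child] = a[child], a[i]
        i = child

def traverse_in_order(heap, visit):
    """Visit all items in ascending order using O(1) extra space."""
    n = len(heap)
    for size in range(n, 0, -1):
        visit(heap[0])                      # current minimum
        # Park the visited item in the slot the shrinking heap no longer uses.
        heap[0], heap[size - 1] = heap[size - 1], heap[0]
        sift_down(heap, 0, size - 1)
    heap.reverse()                          # descending -> ascending, a valid min-heap

heap = [0, 1, 3, 2, 5, 4, 7, 9, 6, 8]
visited = []
traverse_in_order(heap, visited.append)
# visited is now 0..9 in ascending order, and heap is again a min-heap.
```

Note that the array layout after the traversal is not the original one: it ends up fully sorted, which still satisfies the heap property.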

I would actually rather suggest a self-balancing binary search tree (BST) here:
A binary search tree (BST) ... is a node-based binary tree data structure which has the following properties:
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
The left and right subtree each must also be a binary search tree.
There must be no duplicate nodes.
It's simpler and more space efficient to traverse a BST in sorted order (a so-called in-order traversal) than doing so with a heap.
A BST would support O(log n) inserts, and O(n) traversal.
If you're doing tons of inserts before you do a traversal again, it might be more efficient to just sort it into an array before traversing - the related running times would be O(1) for inserts and O(n log n) to get the sorted order - the exact point this option becomes more efficient than using a BST will need to be benchmarked.
For the sake of curiosity, here's how you traverse a heap in sorted order (if you don't want to just keep removing the minimum from the heap until it's empty, which is probably the simpler option, since removing the minimum is a standard heap operation).
From the properties of a heap, there's nothing stopping some element from being in the left subtree, the element following it in the right, the one after that in the left again, and so on. This means that you can't just completely finish the left subtree and then start on the right; you may need to keep a lot of the heap in memory as you're doing this.
The main idea is based on the fact that an element is always smaller than both its children.
Based on this, we could construct the following algorithm:
Create a heap (another one)
Insert the root of the original heap into the new heap
While the new heap has elements:
Remove minimum from the heap
Output that element
Add the children of that element in the original heap, if it has any, to the new heap
This takes O(n log n) time and O(n) space (for reference, the BST traversal takes O(log n) space), not to mention the added code complexity.
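The second-heap traversal listed above could look like this in Python. It is a sketch under assumptions: the original heap is a list in the usual array layout, heapq provides the auxiliary heap, and sorted_from_heap is a hypothetical name. Each entry in the auxiliary heap is a (value, index) pair so that the children can be located in the original array.

```python
import heapq

def sorted_from_heap(heap):
    """Yield the items of an array-based min-heap in ascending order without modifying it."""
    if not heap:
        return
    frontier = [(heap[0], 0)]               # auxiliary heap, seeded with the root
    while frontier:
        value, i = heapq.heappop(frontier)
        yield value
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(heap):
                heapq.heappush(frontier, (heap[child], child))

heap = [0, 1, 3, 2, 5, 4, 7, 9, 6, 8]
ordered = list(sorted_from_heap(heap))      # ascending order; heap itself is untouched
```

Because it is a generator, the traversal can be stopped early and only pays for the items actually consumed.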

You can use std::set, if you're OK without duplicates. A std::set keeps its elements ordered by its comparator, and its iterators traverse in that order. For int the default comparator is <, so a forward traversal goes from lowest to highest; if you traverse in reverse order with rbegin(), you go from highest to lowest. Alternatively, you can supply a custom comparator. The reverse traversal is shown here:
#include <iostream>
#include <vector>
#include <set>
using namespace std;
int main() {
    vector<int> data{ 5, 2, 1, 9, 10, 3, 4, 7, 6, 8 };
    set<int> ordered;
    for (auto n : data) {
        ordered.insert(n);
        // print in order (highest to lowest) after each insert
        for (auto it = ordered.rbegin(); it != ordered.rend(); it++) {
            cout << *it << " ";
        }
        cout << endl;
    }
    return 0;
}
Output:
5
5 2
5 2 1
9 5 2 1
10 9 5 2 1
10 9 5 3 2 1
10 9 5 4 3 2 1
10 9 7 5 4 3 2 1
10 9 7 6 5 4 3 2 1
10 9 8 7 6 5 4 3 2 1

Getting the indices of a sorted array into another array in C

I have 2 arrays say
sum:
68
78
25
45
85
Index:
0
1
2
3
4
I ran a bubble sort on sum and got:
25
45
68
78
85
Now I need the index array reordered to match the sorted sum array. So my output array should be:
2
3
0
1
4
How can I do that?

I'd recommend you sort the index array instead, and have the comparisons made using the values the indexes "point" to. So then you don't actually need to sort the "value" array just the "index" array.
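The idea of sorting the indices by the values they point to can be sketched like this (shown in Python for brevity; sorted_indices is a hypothetical name, and in C the same comparison would go inside the bubble sort or a qsort comparator):

```python
def sorted_indices(values):
    """Return the indices of values, ordered by the value each index points to."""
    return sorted(range(len(values)), key=lambda i: values[i])

sums = [68, 78, 25, 45, 85]
order = sorted_indices(sums)          # [2, 3, 0, 1, 4]
in_order = [sums[i] for i in order]   # [25, 45, 68, 78, 85]
```

The value array is never modified; the sorted view is obtained by reading it through the index array.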

So, here is a very simple solution that I never thought of; all credit goes to my project supervisor.
Sort both arrays in the same loop: whenever two elements of sum are swapped, swap the corresponding elements of the index array too. The index array then ends up in the order of the sorted sums.

Find all possible row-wise sums in a 2D array

Ideally I'm looking for a C# solution, but any help on the algorithm will do.
I have a 2-dimensional array (x,y). The number of columns (max x) varies between 2 and 10 but can be determined before the array is actually populated. The maximum number of rows (y) is fixed at 5, but each column can have a varying number of values, something like:
   1  2  3  4  5  6  7 ... 10
A  1  1  7  9  1  1
B  2  2  5  2  2
C  3  3
D  4
E  5
I need to come up with the total of all possible row-wise sums for the purpose of looking for a specific total. That is, a row-wise total could be the cells A1 + B2 + A3 + B5 + D6 + A7 (any combination of one value from each column).
This process will be repeated several hundred times with different cell values each time, so I'm looking for a somewhat elegant solution (better than what I've been able to come up with). Thanks for your help.

The Problem Size
Let's first consider the worst case:
You have 10 columns and 5 (full) rows per column. It should be clear that you can get (with an appropriate choice of numbers for each position) up to 5^10 ≈ 10^7 different results (the solution space).
For example, the following matrix will give you the worst case for 3 columns:
| 1 10 100 |
| 2 20 200 |
| 3 30 300 |
| 4 40 400 |
| 5 50 500 |
resulting in 5^3=125 different results. Each result has the form {a1 a2 a3} with ai ∈ {1, ..., 5}.
It's quite easy to show that such a matrix will always exist for any number n of columns.
Now, to get each numerical result, you will need to do n-1 sums, adding up to a problem size of O(n 5^n). So, that's the worst case and I think nothing can be done about it, because to know the possible results you NEED to effectively perform the sums.
More benign incarnations:
The problem complexity may be reduced in two ways:
Fewer numbers (i.e. not all columns are full)
Repeated results (i.e. several partial sums give the same result, and you can merge them into a single branch). More on this later.
Let's see a simplified example of the latter with three rows:
| 7 6 100 |
| 3 4 200 |
| 1 2 200 |
At first sight you will need to do 2·3^3 sums. But that's not really the case. As you add up the first two columns you don't get the expected 9 different results, but only 6 ({13,11,9,7,5,3}).
So you don't have to carry nine results over to the third column, but only 6.
Of course, that comes at the expense of deleting the repeated numbers from the list. The "Removal of Repeated Integer Elements" was posted before on SO and I'll not repeat the discussion here, but just note that a mergesort, O(m log m) in the list size (m), will remove the duplicates. If you want something easier, a double loop, O(m^2), will do.
Anyway, I'll not try to calculate the size of the (mean) problem in this way, for several reasons. One of them is that the "m" in the merge sort is not the size of the problem, but the size of the vector of results after adding up any two columns, and that operation is repeated (n-1) times ... and I really don't want to do the math :(.
The other reason is that, as I implemented the algorithm, we can use some experimental results and be spared my surely leaky theoretical considerations.
The Algorithm
With what we said before, it is clear that we should optimize for the benign cases, as the worst case is a lost one.
For doing so, we need to use lists (or variable dim vectors, or whatever can emulate those) for the columns and do a merge after every column add.
The merge may be replaced by several other algorithms (such as an insertion on a BTree) without modifying the results.
So the algorithm (procedural pseudocode) is something like:
Set result_vector to Column 1
For column i in (1 to n-1)
    Remove repeated integers in the result_vector
    Add every element of result_vector to every element of column i+1,
        giving a new result_vector
Next column
Remove repeated integers in the result_vector
Or, as you asked for it, a recursive version may work as follows:
function genResVector(a:list, b:list): returns list
local c:list
{
    Set c = CartesianProduct (a x b)
    Set c = Sum up each element {a[i],b[j]} of c
    Drop repeated elements of c
    Return(c)
}
function RecursiveAdd(a:matrix, i integer): returns list
{
    Return(genResVector[Column i from a, RecursiveAdd[a, i-1]]);
}
function RecursiveAdd(a:matrix, i==0 integer): returns list={0}
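For readers who prefer something runnable, the procedural version can be sketched in Python. This is only an illustration under assumptions: the matrix is passed as a list of columns (each a list of the values actually present), a set stands in for the "remove repeated integers" step, and row_wise_sums is a hypothetical name.

```python
def row_wise_sums(columns):
    """Return every distinct sum obtained by taking one value from each column."""
    results = {0}                                   # same role as the {0} base case
    for column in columns:
        # Add every partial result to every value of this column; the set
        # comprehension drops repeated sums before the next column is processed.
        results = {r + v for r in results for v in column}
    return sorted(results)

columns = [[7, 3, 1], [6, 4, 2], [100, 200, 200]]   # the 3-column example above
print(row_wise_sums(columns))
# 12 distinct sums: 103, 105, ..., 113 and 203, 205, ..., 213
```

Starting from {0} rather than from the first column keeps the loop uniform; an empty column yields an empty result, since no row-wise sum exists in that case.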
Algorithm Implementation (Recursive)
I chose a functional language (Mathematica); I guess it's no big deal to translate it to any procedural one.
Our program has two functions:
genResVector, which sums two lists giving all possible results with repeated elements removed, and
recursiveAdd, which recurses on the matrix columns adding up all of them.
The code is:
genResVector[x__, y__] :=   (* Header: a function that takes two lists as input *)
  Union[                    (* remove duplicates from the resulting list *)
    Apply[Plus,             (* Plus is the function applied to each combination *)
      Tuples[{x, y}], 2]    (* generate all combinations of the two lists *)
  ];
recursiveAdd[t_, i_] := genResVector[t[[i]], recursiveAdd[t, i - 1]];
                            (* recursive add function *)
recursiveAdd[t_, 0] := {0}; (* with its base case *)
Test
If we take your example list
| 1 1 7 9 1 1 |
| 2 2 5 2 2 |
| 3 3 |
| 4 |
| 5 |
And run the program the result is:
{11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27}
The maximum and minimum are very easy to verify since they correspond to taking the Min or Max from each column.
Some interesting results
Let's consider what happens when the numbers in each position of the matrix are bounded. For that we will take a full (10 x 5) matrix and populate it with random integers.
In the extreme case where the integers are only zeros or ones, we may expect two things:
A very small result set
Fast execution, since there will be a lot of duplicate intermediate results
If we increase the Range of our Random Integers we may expect increasing result sets and execution times.
Experiment 1: 5x10 matrix populated with varying range random integers
It's clear enough that for a result set near the maximum result set size (5^10 ≈ 10^7) the calculation time and the number of distinct results have an asymptote. The fact that we see increasing functions just denotes that we are still far from that point.
Moral: the smaller your elements are, the better your chances of getting the result fast. This is because you are likely to have a lot of repetitions!
Note that our MAX calculation time is near 20 secs for the worst case tested
Experiment 2: Optimizations that aren't
Having a lot of memory available, we can calculate by brute force, not removing the repeated results.
The result is interesting ... 10.6 secs! ... Wait! What happened? Our little "remove repeated integers" trick is eating up a lot of time, and when there are not many results to remove there is no gain, only a loss from trying to get rid of the repetitions.
But we may get a lot of benefit from the optimization when the maximum numbers in the matrix are well under 5·10^5. Remember that I'm doing these tests with the 5x10 matrix fully loaded.
The moral of this experiment is: the repeated integer removal algorithm is critical.
HTH!
PS: I have a few more experiments to post, if I get the time to edit them.