In a book about data structures and algorithms, there is the following implementation of insertion sort:
int insertionSort(void *data, int size, int esize, int (*compare)(const void *key1, const void *key2)) {
    int i, j;
    void *key;
    char *a = data;

    /* Allocate storage for the key element. */
    key = (char *)malloc(esize);
    if (key == NULL) {
        return -1;
    }

    /* Repeatedly insert a key element among the sorted elements. */
    for (j = 1; j < size; j++) {
        memcpy(key, &a[j*esize], esize);
        i = j - 1;

        /* Shift elements right until the insertion point is found. */
        while (i >= 0 && compare(&a[i*esize], key) > 0) {
            memcpy(&a[(i+1)*esize], &a[i*esize], esize);
            i--;
        }
        memcpy(&a[(i+1)*esize], key, esize);
    }

    free(key);
    return 0;
}
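For reference, here is a minimal sketch of how this qsort-style routine might be called (the comparator and driver are mine, not from the book):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for ints, in the style the book's code expects */
static int compare_ints(const void *key1, const void *key2) {
    int a = *(const int *)key1;
    int b = *(const int *)key2;
    return (a > b) - (a < b);
}

int main(void) {
    int data[] = {5, 2, 4, 6, 1, 3};
    insertionSort(data, 6, sizeof(int), compare_ints);
    for (int i = 0; i < 6; i++)
        printf("%d ", data[i]);   /* prints: 1 2 3 4 5 6 */
    return 0;
}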
So, in the section about its complexity, the book states the following:
The runtime complexity of insertion sort focuses on its nested loops.
With this in mind, the outer loop has a running time of T(n) = n – 1,
times some constant amount of time, where n is the number of elements
being sorted. Examining the inner loop in the worst case, we assume
that we will have to go all the way to the left end of the array
before inserting each element into the sorted set. Therefore, the
inner loop could iterate once for the first element, twice for the
second, and so forth until the outer loop terminates. The running time
of the nested loop is represented as a summation from 1 to n – 1,
which results in a running time of T(n) = (n(n + 1)/2) – n, times
some constant amount of time. (This is from the well-known formula
for summing a series from 1 to n.) Using the rules of O-notation, this simplifies to O(n^2).
So I googled the summation formula and understand why it is n(n+1)/2; there are also a bunch of YouTube videos about the formula, which I also watched.
But I could not understand the last part of T(n) in the book, which is the "-n" part. Why minus n at the end?
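For what it's worth, the −n falls out of the summation bounds: the inner loop is summed from 1 to n − 1 rather than from 1 to n, and

$$\sum_{j=1}^{n-1} j \;=\; \frac{(n-1)n}{2} \;=\; \frac{n(n+1)}{2} - n,$$

so subtracting n converts the well-known formula for the sum 1..n into the sum 1..n−1.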
I'm trying to obtain the complexity of a particular divide-and-conquer algorithm to transpose a given matrix.
From what I've been reading, I got that the recursion should start as follows:
C(1) = 1
C(n) = 4C(n/2) + O(n)
I know how to solve the recurrence, but I'm not sure it's right. Every time the function is called, the problem is divided by 2 (vars fIni and fEnd), and then another 4 calls are made. Also, at the end, swap is called with a complexity of O(n²), so I'm pretty sure I'm not taking that into account in the above recurrence.
The code, as follows:
void transposeDyC(int **m, int f, int c, int fIni, int fEnd, int cIni, int cEnd) {
    if (fIni < fEnd) {
        int fMed = (fIni + fEnd) / 2;
        int cMed = (cIni + cEnd) / 2;
        transposeDyC(m, f, c, fIni, fMed, cIni, cMed);
        transposeDyC(m, f, c, fIni, fMed, cMed+1, cEnd);
        transposeDyC(m, f, c, fMed+1, fEnd, cIni, cMed);
        transposeDyC(m, f, c, fMed+1, fEnd, cMed+1, cEnd);
        swap(m, f, c, fMed+1, cIni, fIni, cMed+1, fEnd-fMed);
    }
}
void swap(int **m, int f, int c, int fIniA, int cIniA, int fIniB, int cIniB, int dimen) {
    for (int i = 0; i <= dimen-1; i++) {
        for (int j = 0; j <= dimen-1; j++) {
            int aux = m[fIniA+i][cIniA+j];
            m[fIniA+i][cIniA+j] = m[fIniB+i][cIniB+j];
            m[fIniB+i][cIniB+j] = aux;
        }
    }
}
I'm really stuck in this complexity with recursion and divide and conquer. I don't know how to continue.
You got the recurrence wrong. It is 4C(n/2) + O(n²), because when joining the matrix back together, for a size n there are n² elements in total.
Two ways:
Master Theorem
Here we have a = 4, b = 2, c = 2, log_b(a) = 2.
Since log_b(a) == c, this falls under case 2, resulting in a complexity of O(n^c log n) = O(n² log n).
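For reference, the case being applied here is: for a recurrence of the form

$$T(n) = a\,T(n/b) + \Theta(n^c), \qquad \log_b a = c \;\Rightarrow\; T(n) = \Theta(n^c \log n).$$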
Recurrence tree visualization
If you try to unfold your recurrence, you can see that you are solving the problem of size n by breaking it down into 4 problems of size n/2, and so on down the tree.
Total work done at each level = n², since the 4^i subproblems at level i contribute (n/2^i)² each.
The total number of levels equals the number of times you have to divide the n-sized problem until you reach a problem of size 1. That is simply log₂(n).
Therefore, total work = n² · log₂(n), which is O(n² log n).
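Spelled out as a sum over the levels of the recursion tree:

$$T(n) \;=\; \sum_{i=0}^{\log_2 n - 1} 4^i \left(\frac{n}{2^i}\right)^2 \;=\; \sum_{i=0}^{\log_2 n - 1} n^2 \;=\; n^2 \log_2 n.$$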
Each recursive step reduces the number of elements by a factor of 4, so the number of levels of recursion will be on the order of O(log n). At each level, the swaps have order O(n²), so the algorithm has complexity O(n² log n).
I want to shuffle an array so that each index has the same probability of ending up at any other index (excluding itself).
I have this solution, only I find that the last 2 indexes will always be swapped with each other:
void Shuffle(int arr[], size_t n)
{
    int newIndx = 0;
    int i = 0;

    for (; i < n - 1; ++i)
    {
        newIndx = rand() % (n - 1);
        if (newIndx >= i)
        {
            ++newIndx;
        }
        swap(i, newIndx, arr);
    }
}
But in the end it might be that some indexes end up back in their original place once again.
Any thoughts?
C lang.
A permutation (shuffle) where no element is in its original place is called a derangement.
Generating a random derangement is harder than generating a random permutation, but it can still be done in linear time and space. (Generating a random permutation can be done in linear time and constant space.) Here are two possible algorithms.
The simplest solution to understand is a rejection strategy: do a Fisher-Yates shuffle, but if the shuffle attempts to put an element at its original spot, restart the shuffle. [Note 1]
Since the probability that a random shuffle is a derangement is approximately 1/e, the expected number of shuffles performed is about e (that is, 2.71828…). But since unsuccessful shuffles are restarted as soon as the first fixed point is encountered, the total number of shuffle steps is less than e times the array size. For a detailed analysis, see this paper, which proves that the expected number of random numbers needed by the algorithm is around (e−1) times the number of elements.
In order to be able to do the check and restart, you need to keep an array of indices. The following little function produces a derangement of the indices from 0 to n-1; it is necessary to then apply the permutation to the original array.
/* n must be at least 2 for this to produce meaningful results */
void derange(size_t n, int ind[]) {
    for (size_t i = 0; i < n; ++i) ind[i] = i;
    swap(ind, 0, randint(1, n));
    for (size_t i = 1; i < n; ++i) {
        int r = randint(i, n);
        swap(ind, i, r);
        if (ind[i] == i) i = 0;   /* fixed point found: restart the shuffle */
    }
}
Here are the two functions used by that code:
void swap(int arr[], size_t i, size_t j) {
    int t = arr[i]; arr[i] = arr[j]; arr[j] = t;
}

/* This is not the best possible implementation */
int randint(int low, int lim) {
    return low + rand() % (lim - low);
}
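For instance, applying the deranged indices to an actual data array might look like this (a minimal sketch; the function name is mine, not from the answer):

/* Apply the deranged indices: dst[i] = src[ind[i]]. Since ind[i] != i
 * for every i, no element of dst sits at its position in src (as long
 * as the elements of src are distinct). */
void apply_derangement(const int src[], int dst[], const int ind[], size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[ind[i]];
}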
The following function is based on the 2008 paper "Generating Random Derangements" by Conrado Martínez, Alois Panholzer and Helmut Prodinger, although I use a different mechanism to track cycles. Their algorithm uses a bit vector of size N but uses a rejection strategy in order to find an element which has not been marked. My algorithm uses an explicit vector of indices not yet operated on. The vector is also of size N, which is still O(N) space [Note 2]; since in practical applications, N will not be large, the difference is not IMHO significant. The benefit is that selecting the next element to use can be done with a single call to the random number generator. Again, this is not particularly significant since the expected number of rejections in the MP&P algorithm is very small. But it seems tidier to me.
The basis of the algorithms (both MP&P and mine) is the recursive procedure to produce a derangement. It is important to note that a derangement is necessarily the composition of some number of cycles where each cycle is of size greater than 1. (A cycle of size 1 is a fixed point.) Thus, a derangement of size N can be constructed from a smaller derangement using one of two mechanisms:
Produce a derangement of the N-1 elements other than element N, and insert N into some cycle at any point in that cycle. To do so, randomly select any element j of those N-1 elements and place N immediately after j in j's cycle. This alternative covers all possibilities where N is in a cycle of size greater than 2.
Produce a derangement of N-2 of the N-1 elements other than N, and add a cycle of size 2 consisting of N and the element not selected for the smaller derangement. This alternative covers all possibilities where N is in a cycle of size 2.
If D(n) is the number of derangements of size n, it is easy to see from the above recursion that:
D(n) = (n−1) · (D(n−1) + D(n−2))
The multiplier is n−1 in both cases: in the first alternative, it refers to the number of possible places N can be added, and in the second alternative, to the number of possible ways to select the n−2 elements of the recursive derangement.
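Starting from D(0) = 1 and D(1) = 0, the recurrence reproduces the familiar subfactorial values (the same values the subfact table in the code below is built from):

$$D(2) = 1\cdot(0+1) = 1,\qquad D(3) = 2\cdot(1+0) = 2,\qquad D(4) = 3\cdot(2+1) = 9.$$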
Therefore, if we were to recursively produce a random derangement of size N, we would randomly select one of the N-1 previous elements, and then make a random boolean decision on whether to produce alternative 1 or alternative 2, weighted by the number of possible derangements in each case.
One advantage to this algorithm is that it can derange an arbitrary vector; there is no need to apply the permuted indices to the original vector as with the rejection algorithm.
As MP&P note, the recursive algorithm can just as easily be performed iteratively. This is quite clear in the case of alternative 2, since the new 2-cycle can be generated either before or after the recursion, so it might as well be done first, and then the recursion is just a loop. But that is also true for alternative 1: we can make element N the successor in a cycle to a randomly-selected element j even before we know which cycle j will eventually be in. Looked at this way, the difference between the two alternatives reduces to whether or not element j is removed from future consideration.
As shown by the recursion, alternative 2 should be chosen with probability (n−1)·D(n−2)/D(n), which is how MP&P write their algorithm. I used the equivalent formula D(n−2) / (D(n−1) + D(n−2)), mostly because my prototype used Python (for its built-in bignum support).
Without bignums, the number of derangements and hence the probabilities need to be approximated as double, which will create a slight bias and limit the size of the array to be deranged to about 170 elements. (long double would allow slightly more.) If that is too much of a limitation, you could implement the algorithm using some bignum library. For ease of implementation, I used the Posix drand48 function to produce random doubles in the range [0.0, 1.0). That's not a great random number function, but it's probably adequate to the purpose and is available in most standard C libraries.
Since no attempt is made to verify the uniqueness of the elements in the vector to be deranged, a vector with repeated elements may produce a derangement where one or more of these elements appear to be in the original place. (It's actually a different element with the same value.)
The code:
/* Deranges the vector `arr` (of length `n`) in place, to produce
 * a permutation of the original vector where every element has
 * been moved to a new position. Returns `true` unless the derangement
 * failed because `n` was 1.
 */
bool derange(int arr[], size_t n) {
    if (n < 2) return n != 1;

    /* Compute derangement counts ("subfactorials") */
    double subfact[n];
    subfact[0] = 1;
    subfact[1] = 0;
    for (size_t i = 2; i < n; ++i)
        subfact[i] = (i - 1) * (subfact[i - 2] + subfact[i - 1]);

    /* The vector `todo` is the stack of elements which have not yet
     * been (fully) deranged; `u` is the count of elements in the stack
     */
    size_t todo[n];
    for (size_t i = 0; i < n; ++i) todo[i] = i;
    size_t u = n;

    /* While the stack is not empty, derange the element at the
     * top of the stack with some element lower down in the stack
     */
    while (u) {
        size_t i = todo[--u];        /* Pop the stack */
        size_t j = u * drand48();    /* Get a random stack index */
        swap(arr, i, todo[j]);       /* i will follow j in its cycle */
        /* If we're generating a 2-cycle, remove the element at j */
        if (drand48() * (subfact[u - 1] + subfact[u]) < subfact[u - 1])
            todo[j] = todo[--u];
    }
    return true;
}
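A possible driver for it (my sketch, assuming the swap helper shown earlier and the POSIX srand48/drand48 functions):

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <time.h>

int main(void) {
    srand48(time(NULL));   /* seed drand48 */
    int a[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    if (derange(a, 10)) {
        for (int i = 0; i < 10; ++i)
            printf("%d ", a[i]);   /* a[i] != i for every i */
        putchar('\n');
    }
    return 0;
}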
Notes
Many people get this wrong, particularly for social occasions such as "secret friend" selection (I believe this is sometimes called "the Santa game" in other parts of the world). The incorrect algorithm is to just choose a different swap if the random shuffle produces a fixed point, unless the fixed point is at the very end, in which case the shuffle is restarted. This will produce a random derangement, but the selection is biased, particularly for small vectors. See this answer for an analysis of the bias.
Even if you don't use the RAM model where all integers are considered fixed size, the space used is still linear in the size of the input in bits, since N distinct input values must have at least N log N bits. Neither this algorithm nor MP&P makes any attempt to derange lists with repeated elements, which is a much harder problem.
Your algorithm is only almost correct (which in algorithmics means unexpected results): because of a few small errors scattered through it, it will not produce the expected results.
First, rand() % N is not guaranteed to produce a uniform distribution, unless N is a divisor of the number of possible values. In any other case, you will get a slight bias. In any case, my man page for rand describes it as a bad random number generator, so you should try to use random(), or arc4random_uniform() if it is available.
Second, preventing an index from coming back to its original place is both uncommon and rather hard to achieve. The only way I can imagine is to keep an array of the numbers [0; n[ and swap it in step with the real array, so that the original index of each number is always known.
The code could become:
void Shuffle(int arr[], size_t n)
{
    int i, newIndx;
    int *indexes = malloc(n * sizeof(int));
    for (i = 0; i < n; i++) indexes[i] = i;

    for (i = 0; i < n - 1; ++i)   // beware of the inequality!
    {
        int i1;
        // search if index i is in the [i; n[ part of the current array:
        for (i1 = i; i1 < n; ++i1) {
            if (indexes[i1] == i) {   // move it to position i
                if (i1 != i) {        // nothing to do if already at i
                    swap(i, i1, arr);
                    swap(i, i1, indexes);
                }
                break;
            }
        }
        i1 = (i1 == n) ? i : i + 1;   // we will start the search at i1
                                      // to guarantee that no element keeps its place
        newIndx = i1 + arc4random_uniform(n - i1);
        /* if arc4random is not available:
        newIndx = i1 + (random() % (n - i1));
        */
        swap(i, newIndx, arr);
        swap(i, newIndx, indexes);
    }
    /* special case: a permutation of [0; n-1[ may have left the last element
     * in place; we will exchange the last element with a random one
     */
    if (indexes[n-1] == n-1) {
        newIndx = arc4random_uniform(n - 1);
        swap(n-1, newIndx, arr);
        swap(n-1, newIndx, indexes);
    }
    free(indexes);   // don't forget to free what we have malloc'ed...
}
Beware: the algorithm should be correct, but the code has not been tested and can contain typos...
For instance: I have an unsorted list A of 10 elements. I need the sublist of k consecutive elements from i through i+k-1 of the sorted version of A.
Example:
Input: A { 1, 6, 13, 2, 8, 0, 100, 3, -4, 10 }
k = 3
i = 4
Output: sublist B { 2, 3, 6 }
If i and k are specified, you can use a specialized version of quicksort where you stop the recursion on parts of the array that fall outside the i .. i+k range. If the array can be modified, perform this partial sort in place; if the array cannot be modified, you will need to make a copy.
Here is an example:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
// Partial Quick Sort using Hoare's original partition scheme
void partial_quick_sort(int *a, int lo, int hi, int c, int d) {
    if (lo < d && hi > c && hi - lo > 1) {
        int x, pivot = a[lo];
        int i = lo - 1;
        int j = hi;
        for (;;) {
            while (a[++i] < pivot)
                continue;
            while (a[--j] > pivot)
                continue;
            if (i >= j)
                break;
            x = a[i];
            a[i] = a[j];
            a[j] = x;
        }
        partial_quick_sort(a, lo, j + 1, c, d);
        partial_quick_sort(a, j + 1, hi, c, d);
    }
}

void print_array(const char *msg, int a[], int count) {
    printf("%s: ", msg);
    for (int i = 0; i < count; i++) {
        printf("%d%c", a[i], " \n"[i == count - 1]);
    }
}

int int_cmp(const void *p1, const void *p2) {
    int i1 = *(const int *)p1;
    int i2 = *(const int *)p2;
    return (i1 > i2) - (i1 < i2);
}

#define MAX 1000000

int main(void) {
    int *a = malloc(MAX * sizeof(*a));
    clock_t t;
    int i, k;

    srand((unsigned int)time(NULL));
    for (i = 0; i < MAX; i++) {
        a[i] = rand();
    }
    i = 20;
    k = 10;
    printf("extracting %d elements at %d from %d total elements\n",
           k, i, MAX);
    t = clock();
    partial_quick_sort(a, 0, MAX, i, i + k);
    t = clock() - t;
    print_array("partial qsort", a + i, k);
    printf("elapsed time: %.3fms\n", t * 1000.0 / CLOCKS_PER_SEC);
    t = clock();
    qsort(a, MAX, sizeof *a, int_cmp);
    t = clock() - t;
    print_array("complete qsort", a + i, k);
    printf("elapsed time: %.3fms\n", t * 1000.0 / CLOCKS_PER_SEC);
    return 0;
}
Running this program with an array of 1 million random integers, extracting the 10 entries of the sorted array starting at offset 20 gives this output:
extracting 10 elements at 20 from 1000000 total elements
partial qsort: 33269 38347 39390 45413 49479 50180 54389 55880 55927 62158
elapsed time: 3.408ms
complete qsort: 33269 38347 39390 45413 49479 50180 54389 55880 55927 62158
elapsed time: 149.101ms
It is indeed much faster (20x to 50x) than sorting the whole array, even with a simplistic choice of pivot. Try multiple runs and see how the timings change.
An idea could be to scan your array for numbers greater than or equal to the i-th smallest and less than or equal to the (i+k)-th smallest, and add them to another list/container.
This takes O(n) and gives you an unordered list of the numbers you need. Then you sort that list, O(n log n), and you are done.
For really big arrays the advantage of this method is that you will sort a smaller list of numbers (given that k is relatively small).
You can use Quickselect, or a heap selection algorithm to get the i+k smallest items. Quickselect works in-place, but it modifies the original array. It also won't work if the list of items is larger than will fit in memory. Quickselect is O(n), but with a fairly high constant. When the number of items you are selecting is a very small fraction of the total number of items, the heap selection algorithm is faster.
The idea behind the heap selection algorithm is that you initialize a max-heap with the first i+k items. Then, iterate through the rest of the items. If an item is smaller than the largest item on the max-heap, remove the largest item from the max-heap and replace it with the new, smaller item. When you're done, you have the smallest i+k items on the heap, with the largest of them at the top.
The code is pretty simple:
heap = new max_heap();
add first `i+k` items from a[] to heap
for all remaining items in a[]:
    if item < heap.peek():
        heap.pop()
        heap.push(item)
    end-if
end-for
// at this point the smallest i+k items are on the heap
This requires O(i+k) extra memory, and worst case running time is O(n log(i+k)). When (i+k) is less than about 2% of n, it will usually outperform Quickselect.
For much more information about this, see my blog post When theory meets practice.
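For a C rendering of that pseudocode, here is a minimal sketch (the names smallest_m and sift_down are mine, not from the answer; m stands for i+k):

#include <stddef.h>

/* Restore the max-heap property at `root` in heap[0..m-1] */
static void sift_down(int *heap, size_t m, size_t root) {
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= m) break;
        if (child + 1 < m && heap[child + 1] > heap[child]) child++;
        if (heap[root] >= heap[child]) break;
        int t = heap[root]; heap[root] = heap[child]; heap[child] = t;
        root = child;
    }
}

/* Copy the m smallest of a[0..n-1] into heap[0..m-1]. On return the
 * heap is unordered, except that heap[0] is the largest of the m. */
void smallest_m(const int *a, size_t n, int *heap, size_t m) {
    for (size_t i = 0; i < m; ++i) heap[i] = a[i];
    for (size_t i = m / 2; i-- > 0;) sift_down(heap, m, i);  /* heapify */
    for (size_t i = m; i < n; ++i) {
        if (a[i] < heap[0]) {   /* smaller than current max: replace it */
            heap[0] = a[i];
            sift_down(heap, m, 0);
        }
    }
}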
By the way, you can optimize your memory usage somewhat based on i. That is, if there are a billion items in the array and you want items 999,999,000 through 999,999,910, the standard method above would require a huge heap. But you can re-cast that problem to one in which you need to select the smallest of the last 1,000 items. Your heap then becomes a min-heap of 1,000 items. It just takes a little math to determine which way will require the smallest heap.
That doesn't help much, of course, if you want items 600,000,000 through 600,000,010, because your heap still has 400 million items in it.
It occurs to me, though, that if time isn't a huge issue, you can just build the heap in the array in-place using Floyd's algorithm, pop the first i items like you would with heap sort, and the next k items are what you're looking for. This would require constant extra space and O(n + (i+k)*log(n)) time.
Come to think of it, you could implement the heap selection logic with a heap of (i+k) items (as described above) in-place, as well. It would be a little tricky to implement, but it wouldn't require any extra space and would have the same running time O(n*log(i+k)).
Note that both would modify the original array.
One thing you could do is modify heapsort such that you first build the heap, but then pop the first i elements. The next k elements you pop from the heap will be your result. Discarding the n - i - k elements remaining lets the algorithm terminate early.
The result will be in O((i + k) log n), which is in O(n log n) but significantly faster for relatively low values of i and k.
I need to merge k (1 <= k <= 16) sorted arrays into one sorted array. This is for a homework assignment and the professor requires that this be done using an O(n) algorithm. Merging 2 arrays is no problem and I can do it easily using an O(n) algorithm, but I feel that what my professor is asking is undoable for n arrays with an O(n) algorithm.
I am using the algorithm below to split the array indices and run InsertionSort on each partition. I could save these start and end indices into a 2D array. I just don't see how the merging can be done in O(n), because it is going to require more than one loop. If it is possible, does anyone have any hints? I'm not looking for actual code, just a hint as to where I should start.
int chunkSize = round((float)arraySize / numThreads);
for (int i = 0; i < numThreads; i++) {
    int start = i * chunkSize;
    int end = start + chunkSize - 1;
    if (i == numThreads - 1) {
        end = arraySize - 1;
    }
    InsertionSort(&array[start], end - start + 1);
}
EDIT: The requirement is that the algorithm be O(n) where n is the number of elements in the array. Also, I need to solve this without using a min heap.
EDIT #2: Here is an algorithm I came up with. The problem here is that I'm not storing the result of each iteration back into the original array. I could just copy all of it back in with a loop, but that would be expensive. Is there any way I can do this, other than using something like memcpy? In the code below, indices is a 2D array [numThreads][2] where indices[i][0] is the start index and indices[i][1] is the end index of the ith sub-array.
void mergeArrays(int array[], int indices[][2], int threads, int result[]) {
    for (int i = 0; i < threads - 1; i++) {
        int resPos = 0;
        int lhsPos = 0;
        int lhsEnd = indices[i][1];
        int rhsPos = indices[i+1][0];
        int rhsEnd = indices[i+1][1];
        while (lhsPos <= lhsEnd && rhsPos <= rhsEnd) {
            if (array[lhsPos] <= array[rhsPos]) {
                result[resPos] = array[lhsPos];
                lhsPos++;
            } else {
                result[resPos] = array[rhsPos];
                rhsPos++;
            }
            resPos++;
        }
        while (lhsPos <= lhsEnd) {
            result[resPos] = array[lhsPos];
            lhsPos++;
            resPos++;
        }
        while (rhsPos <= rhsEnd) {
            result[resPos] = array[rhsPos];
            rhsPos++;
            resPos++;
        }
    }
}
You can merge K sorted arrays into one sorted array with an O(N log K) algorithm, using a priority queue with K entries, where N is the overall number of elements in all the arrays.
If K is considered a constant value (it is limited to 16 in your case), then the complexity is O(N).
Note again: N is the number of elements in my post, not the number of arrays.
It is impossible to merge the arrays in O(K): a simple copy already takes O(N).
Using the facts you provided:
(1) n is the number of arrays to merge;
(2) the arrays to be merged are already sorted;
(3) the merge needs to be of order n, that is linear in the number of arrays
(and NOT linear in the number of elements in each array, as you might mistakenly think at first sight).
Use the analogy of merging 4 sorted piles of cards, low to high, face up. You would pick the card with the lowest face value from one of the piles and put it (face down) on the merged deck, until all piles are exhausted.
For your program: keep a counter for each array, holding the number of elements already transferred to the output. This is at the same time an index to the next element in each array NOT yet merged into the output. Pick the smallest element that you find at one of these locations. You have to look up the first waiting element in all the arrays for that, so that is of order n.
Also, I don't understand why the answer from MoB got up-votes; it does not answer the question.
Here is one way to do it (pseudocode)
input array[k][n]
init indices[k] = { 0, 0, 0, ... }
init queue = { empty priority queue }
for i in 0..k-1:
    insert i into queue with priority (array[i][0])
while queue is not empty:
    let x = pop queue
    output array[x][indices[x]]
    increment indices[x]
    if indices[x] < n:
        insert x into queue with priority (array[x][indices[x]])
This can probably be simplified further in C. You would have to find a suitable queue implementation to use though as there are none in libc.
Complexity for this operation:
"while queue is not empty" => O(n)
"insert x into queue ..." => O(log k)
=> O(n log k)
Which, if you consider k = constant, is O(n).
After sorting the k sub-arrays (the method doesn't matter), the code does a k-way merge. The simplest implementation does k-1 compares to determine the smallest leading element of the k arrays, then moves that element from its sub-array to the output array and gets the next element from that sub-array. When the end of a sub-array is reached, the algorithm drops down to a (k-1)-way merge, then a (k-2)-way merge, until finally there is just one sub-array left and it is copied. This will be O(n) time, since k-1 is a constant.
The k-1 compares can be sped up by using a minimum heap (which is how some priority queues are implemented), but it's still O(n), with just a smaller constant. The heap needs to be initialized at the start, then updated each time an element is removed and a new one added.
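To make the simple k-way merge concrete, here is a rough C sketch (my names and signature, not from the answers; starts/ends hold each sub-array's inclusive bounds, like the question's indices array):

/* Merge k sorted sub-arrays of `a` into `out`, doing up to k-1
 * compares per output element to find the smallest leading item. */
void kway_merge(const int *a, const int starts[], const int ends[],
                int k, int *out) {
    int pos[16];   /* k <= 16 per the question */
    for (int i = 0; i < k; ++i) pos[i] = starts[i];
    for (;;) {
        int best = -1;
        for (int i = 0; i < k; ++i) {   /* scan the k leading elements */
            if (pos[i] <= ends[i] &&
                (best < 0 || a[pos[i]] < a[pos[best]]))
                best = i;
        }
        if (best < 0) break;   /* all sub-arrays exhausted */
        *out++ = a[pos[best]++];
    }
}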
I have a big array that contains numbers. Is there a way to find the indices of the top n values? Any lib function in C?
example:
an array: {1, 2, 6, 5, 3}
the indices of the top 2 numbers are: {2, 3}
If by top n you mean the n-th highest (or lowest) number in the array, you may want to look at the QuickSelect algorithm. Unfortunately there is no C library function I am aware of that implements it, but Wikipedia should give you a good starting point.
QuickSelect is O(n) on average; if O(n log n) plus some overhead is fine as well, you can do qsort and take the n-th element.
Edit (in response to the example): getting all the indices of the top n in a single batch is straightforward with both approaches, since QuickSelect moves them all to one side of the final pivot.
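Since there is no standard C function for it, a bare-bones QuickSelect sketch might look like this (Lomuto partition; the names are mine, and a production version would pick pivots more carefully to avoid the O(n²) worst case):

#include <stddef.h>

/* Lomuto partition: returns the final index of the pivot a[hi] */
static size_t partition(int *a, size_t lo, size_t hi) {
    int pivot = a[hi];
    size_t i = lo;
    for (size_t j = lo; j < hi; ++j) {
        if (a[j] < pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            ++i;
        }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

/* After the call, a[k] is the k-th smallest (0-based) and every
 * element before index k is <= a[k]. lo and hi are inclusive. */
void quickselect(int *a, size_t lo, size_t hi, size_t k) {
    while (lo < hi) {
        size_t p = partition(a, lo, hi);
        if (p == k) return;
        if (p < k) lo = p + 1;
        else hi = p - 1;
    }
}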
So you want the top n numbers in a big array of N numbers. There is a straightforward algorithm which is O(N*n). If n is small (as it seems to be in your case) this is good enough.
size_t top_elems(int *arr, size_t N, size_t *top, size_t n) {
    /*
       insert into top[0],...,top[n-1] the indices of the n largest
       elements of arr[0],...,arr[N-1]
    */
    size_t top_count = 0;
    size_t i;
    for (i = 0; i < N; ++i) {
        // invariant: arr[top[0]] >= arr[top[1]] >= ... >= arr[top[top_count-1]]
        // are the indices of the top_count largest values in arr[0],...,arr[i-1]
        // top_count = min(i, n);
        size_t k;
        for (k = top_count; k > 0 && arr[i] > arr[top[k-1]]; k--);
        // i should be inserted in position k
        if (k >= n) continue;   // element arr[i] is not in the top n
        // shift elements from k to top_count
        size_t j = top_count;
        if (j > n - 1) {        // top array is already full
            j = n - 1;
        } else {                // increase top array
            top_count++;
        }
        for (; j > k; j--) {
            top[j] = top[j-1];
        }
        // insert i
        top[k] = i;
    }
    return top_count;
}
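A hypothetical driver for the example in the question (my sketch, not part of the answer):

#include <stdio.h>

int main(void) {
    int arr[] = {1, 2, 6, 5, 3};
    size_t top[2];
    size_t count = top_elems(arr, 5, top, 2);
    for (size_t i = 0; i < count; ++i)
        printf("%zu ", top[i]);   /* prints: 2 3 */
    putchar('\n');
    return 0;
}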