Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
I have an array x of constant size (usually around 100-200 entries) and would like to add an array y of constant smaller size (usually between 2-10 entries) to the head, while removing the same size at the end. For example:
Buffer:
x = [6 5 4 3 2 1]
New array to add in front:
y = [8 7]
Resulting buffer:
x = [8 7 6 5 4 3]
And so on...
Note: I need to use a regular C array and need to be able to access the whole array, not only the head or tail. The array is rather small, but the function is called very often, so I am looking for a solution that does not require any excessive memory tasks. Arrays x and y are always of the same size in each step.
Is this a buffer, circular / ring buffer, queue or FIFO? I don't really know what the right search term is for this application.
Language is C.
If you require linear access to the array contents, and you want to not perform frequent memcpy operations, a possible solution for this is a flip buffer or sliding buffer.
The flip buffer is twice as large as the array needs to be (or even more, if you like), so that you can can just move a tail pointer when adding to the end without any wraparound, keeping the data linear.
When you hit the hard limit of the underlying buffer, then you do a slide operation: you move the upper half of the array to the lower half, and subtract the same delta from all the indices.
When this slide operation happens, you know that all data and indices are in the upper partition now, because the buffer, which is 2 * N, wide never contains more than N entries: it is simulating an N sized ring buffer. That is to say, a situation never arises that the tail has hit the end of the buffer, but the head is still in the lower partition (there are more than N items).
Since you'd like to add to the front, we start by filling the upper partition, and we flip in the upward direction:
[x x x x x x | 6 5 4 3 2 1 ] -- six-element queue, twelve el. buffer
H T
Add 8 7, remove 2 1:
[x x x x 8 7 | 6 5 4 3 x x ]
H T
Add 2 1 0 9, remove 6 5 4 3:
[2 1 0 9 8 7 | x x x x x x ]
H T
Head has hit -1! Flip to upper partition with memcpy, add 6 to head and tail:
[x x x x x x | 2 1 0 9 8 7 ]
H T
Note that since the two partitions don't overlap, we don't have to use memmove: we can use memcpy.
What about using memmove() from <string.h> to move slices of the array around? That's not something to exclude without having actually measured the performance. Any solution using linked lists or so might involve operations that take longer than a highly optimized memmove() (that the compiler might even inline).
It really depends on
the number of elements in the array
the size of the element type
the frequency of the operations
the time spent modifying the array with respect to everything else
What you need is index offset. So, every time you "insert" values, you simply write them where you virtual tail is and update index offset. That is how circular buffers work.
This is a duplicate. Here's a Java implementation in full on code review https://codereview.stackexchange.com/questions/64258/array-implementation-of-queue and one of the questions in C here Queue using Arrays.
Related
This was an interview question.
I was given an array of n+1 integers from the range [1,n]. The property of the array is that it has k (k>=1) duplicates, and each duplicate can appear more than twice. The task was to find an element of the array that occurs more than once in the best possible time and space complexity.
After significant struggling, I proudly came up with O(nlogn) solution that takes O(1) space. My idea was to divide range [1,n-1] into two halves and determine which of two halves contains more elements from the input array (I was using Pigeonhole principle). The algorithm continues recursively until it reaches the interval [X,X] where X occurs twice and that is a duplicate.
The interviewer was satisfied, but then he told me that there exists O(n) solution with constant space. He generously offered few hints (something related to permutations?), but I had no idea how to come up with such solution. Assuming that he wasn't lying, can anyone offer guidelines? I have searched SO and found few (easier) variations of this problem, but not this specific one. Thank you.
EDIT: In order to make things even more complicated, interviewer mentioned that the input array should not be modified.
Take the very last element (x).
Save the element at position x (y).
If x == y you found a duplicate.
Overwrite position x with x.
Assign x = y and continue with step 2.
You are basically sorting the array, it is possible because you know where the element has to be inserted. O(1) extra space and O(n) time complexity. You just have to be careful with the indices, for simplicity I assumed first index is 1 here (not 0) so we don't have to do +1 or -1.
Edit: without modifying the input array
This algorithm is based on the idea that we have to find the entry point of the permutation cycle, then we also found a duplicate (again 1-based array for simplicity):
Example:
2 3 4 1 5 4 6 7 8
Entry: 8 7 6
Permutation cycle: 4 1 2 3
As we can see the duplicate (4) is the first number of the cycle.
Finding the permutation cycle
x = last element
x = element at position x
repeat step 2. n times (in total), this guarantees that we entered the cycle
Measuring the cycle length
a = last x from above, b = last x from above, counter c = 0
a = element at position a, b = elment at position b, b = element at position b, c++ (so we make 2 steps forward with b and 1 step forward in the cycle with a)
if a == b the cycle length is c, otherwise continue with step 2.
Finding the entry point to the cycle
x = last element
x = element at position x
repeat step 2. c times (in total)
y = last element
if x == y then x is a solution (x made one full cycle and y is just about to enter the cycle)
x = element at position x, y = element at position y
repeat steps 5. and 6. until a solution was found.
The 3 major steps are all O(n) and sequential therefore the overall complexity is also O(n) and the space complexity is O(1).
Example from above:
x takes the following values: 8 7 6 4 1 2 3 4 1 2
a takes the following values: 2 3 4 1 2
b takes the following values: 2 4 2 4 2
therefore c = 4 (yes there are 5 numbers but c is only increased when making steps, not initially)
x takes the following values: 8 7 6 4 | 1 2 3 4
y takes the following values: | 8 7 6 4
x == y == 4 in the end and this is a solution!
Example 2 as requested in the comments: 3 1 4 6 1 2 5
Entering cycle: 5 1 3 4 6 2 1 3
Measuring cycle length:
a: 3 4 6 2 1 3
b: 3 6 1 4 2 3
c = 5
Finding the entry point:
x: 5 1 3 4 6 | 2 1
y: | 5 1
x == y == 1 is a solution
Here is a possible implementation:
function checkDuplicate(arr) {
console.log(arr.join(", "));
let len = arr.length
,pos = 0
,done = 0
,cur = arr[0]
;
while (done < len) {
if (pos === cur) {
cur = arr[++pos];
} else {
pos = cur;
if (arr[pos] === cur) {
console.log(`> duplicate is ${cur}`);
return cur;
}
cur = arr[pos];
}
done++;
}
console.log("> no duplicate");
return -1;
}
for (t of [
[0, 1, 2, 3]
,[0, 1, 2, 1]
,[1, 0, 2, 3]
,[1, 1, 0, 2, 4]
]) checkDuplicate(t);
It is basically the solution proposed by #maraca (typed too slowly!) It has constant space requirements (for the local variables), but apart from that only uses the original array for its storage. It should be O(n) in the worst case, because as soon as a duplicate is found, the process terminates.
If you are allowed to non-destructively modify the input vector, then it is pretty easy. Suppose we can "flag" an element in the input by negating it (which is obviously reversible). In that case, we can proceed as follows:
Note: The following assume that the vector is indexed starting at 1. Since it is probably indexed starting at 0 (in most languages), you can implement "Flag item at index i" with "Negate the item at index i-1".
Set i to 0 and do the following loop:
Increment i until item i is unflagged.
Set j to i and do the following loop:
Set j to vector[j].
if the item at j is flagged, j is a duplicate. Terminate both loops.
Flag the item at j.
If j != i, continue the inner loop.
Traverse the vector setting each element to its absolute value (i.e. unflag everything to restore the vector).
It depends what tools are you(your app) can use. Currently a lot of frameworks/libraries exists. For exmaple in case of C++ standart you can use std::map<> ,as maraca mentioned.
Or if you have time you can made your own implementation of binary tree, but you need to keep in mind that insert of elements differs in comarison with usual array. In this case you can optimise search of duplicates as it possible in your particular case.
binary tree expl. ref:
https://www.wikiwand.com/en/Binary_tree
This question already has answers here:
Finding out the duplicate element in an array
(2 answers)
Closed 6 years ago.
Can anyone could help me how to solve this code in C? I think that I have to use big O notation as a solution, but I have no idea about it.
The question: There is an array T sized N+1 where numbers from 1 to N are random. One number x is repeated twice (position is also random).
What should be the fastest way to find value of this number x?
For example:
N = 7
[6 3 5 1 3 7 4 2]
x=3
The sum of numbers 1..N is N*(N+1)/2.
So, the extra number is:
extra_number = sum(all N+1 numbers) - N*(N+1)/2
Everything is O(1) except the sum. The sum can be computed in O(N) time.
The overall algorithm is O(N).
Walk the array using the value as the next array index (minus 1), marking the ones visited with a special value (like 0 or the negation). O(n)
On average, only half the elements are visited.
v
6 3 5 1 3 7 4 2
v
. 3 5 1 3 7 4 2
v
. 3 5 1 3 7 . 2
v
. 3 5 1 . 7 . 2
v
. 3 5 . . 7 . 2
v !! all ready visited. Previous 3 is repeated.
. 3 5 . . 7 . 2
No overflow problem caused by adding up the sum. Of course the array needs to be modified (or a sibling bool array of flags is needed.)
This method works even if more than 1 value is repeated.
The algorithm given by Klaus has O(1) memory requirements, but requires to sum all the elements from the given array, which may be quite large to iterate (sum) all over them.
Another approach is to iterate over array and increment the occurence counter once per iteration, so the algorithm can be stopped instantly once it finds the duplicate, though the worst case scenario is to scan through all the elements. For example:
#define N 8
int T[N] = {6, 3, 5, 1, 3, 7, 4, 2};
int occurences[N+1] = {0};
int duplicate = -1;
for (int i = 0; i < N; i++) {
occurences[T[i]]++;
if (occurences[T[i]] == 2) {
duplicate = T[i];
break;
}
}
Note that this method is also immune to integer overflow, that is N*(N+1)/2. might be larger than integer data type can possibly hold.
This question already has answers here:
Sum of products of elements of all subarrays of length k
(2 answers)
Permutation of array
(13 answers)
Closed 7 years ago.
I will start with an example. Suppose we have an array of size 3 with elements a, b and c like: (where a, b and c are some numerical values)
|1 | 2| 3| |a | b| c|
(Assume index starts from 1 as shown in the example above)
Now all possible increasing sub-sequence of length 2 are:
12 23 13
so the sum of product of elements at those indexes is required, that is, ab+bc+ac
For length 3 we have only one increasing sub-sequence, that is, 123 so abc should be printed.
For length 4 we have no sequence so 0 is printed and the program terminates.
So output for the given array will be:
ab+bc+ac,abc,0
So for example if the elements a, b and c are 1, 2 and 3 respectively then the output should be 11,6,0
Similarly, for an array of size 4 with elements a,b,c,d the output will be:
ab+ac+ad+bc+bd+cd,abc+abd+acd+bcd,abcd,0
and so on...
Now obviously brute force will be too inefficient for large value of array size. I was wondering if there is an efficient algorithm to compute the output for an array of given size?
Edit 1: I tried finding a pattern. For example for an array of size 4:
The first value we need is :(ab+ac+bc)+d(a+b+c)= ab+ac+ad+bc+bd+cd (Take A=ab+ac+bd)
then the second value we need is:(abc) +d(A) = abc+abd+acd+bcd(B=abc)
then the third value we need is : (0) +d(B) = abcd(Let's take 0 as C)
then the fourth value we need is: +d(C) = 0
But it still requires a lot of computation and I can't figure out an efficient way to implement this.
Edit 2: My question is different then this since:
I don't need all possible permutations. I need all possible increasing sub-sequences from length 2 to n+1.
I also don't need to print all possible such sequences, I just need the value thus obtained (as explained above) and hence I am looking for some maths concept or/and some dynamic programming approach to solve this problem efficiently.
Note I am finding the set of all possible such increasing sub-sequences based on the index value and then computing based on the values at those index position as explained above.
As a post that seems to have disappeared pointed out one way is to get a recurrence relation. Let S(n,k) be the sum over increasing subsequences (of 1..n) of length k of the product of the array elements indexed by the sequence. Such a subsequence either ends in n or not; in the first case it's the concatenation of a subsequence of length k-1 of 1..n-1 and {n}; in the second case it's a subsequence of 1..n-1 of length k. Thus:
S(n,k) = S(n-1,k) + A[n] * S(n-1,k-1)
For this always to make sense we need to add:
S(n,0) = 1
S(n,m) = 0 for m>n
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
Improve this question
Problem:
Given a sorted array of integers find the most frequently occurring integer. If there are multiple integers that satisfy this condition, return any one of them.
My basic solution:
Scan through the array and keep track of how many times you've seen each integer. Since it's sorted, you know that once you see a different integer, you've gotten the frequency of the previous integer. Keep track of which integer had the highest frequency.
This is O(N) time, O(1) space solution.
I am wondering if there's a more efficient algorithm that uses some form of binary search. It will still be O(N) time, but it should be faster for the average case.
Asymptotically (big-oh wise), you cannot use binary search to improve the worst case, for the reasons the answers above mine have presented. However, here are some ideas that may or may not help you in practice.
For each integer, binary search for its last occurrence. Once you find it, you know how many times it appears in the array, and can update your counts accordingly. Then, continue your search from the position you found.
This is advantageous if you have only a few elements that repeat a lot of times, for example:
1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
Because you will only do 3 binary searches. If, however, you have many distinct elements:
1 2 3 4 5 6
Then you will do O(n) binary searches, resulting in O(n log n) complexity, so worse.
This gives you a better best case and a worse worst case than your initial algorithm.
Can we do better? We could improve the worst case by finding the last occurrence of the number at position i like this: look at 2i, then at 4i etc. as long as the value at those positions are the same. If they are not, look at (i + 2i) / 2 etc.
For example, consider the array:
i
1 2 3 4 5 6 7 ...
1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
We look at 2i = 2, it has the same value. We look at 4i = 4, same value. We look at 8i = 8, different value. We backtrack to (4 + 8) / 2 = 6. Different value. Backtrack to (4 + 6) / 2 = 5. Same value. Try (5 + 6) / 2 = 5, same value. We search no more, because our window has width 1, so we're done. Continue the search from position 6.
This should improve the best case, while keeping the worst case as fast as possible.
Asymptotically, nothing is improved. To see if it actually works better on average in practice, you'll have to test it.
Binary search, which eliminates half of the remaining candidates, probably wouldn't work. There are some techniques you could use to avoid reading every element in the array. Unless your array is extremely long or you're solving a problem for curiosity, the naive (linear scan) solution is probably good enough.
Here's why I think binary search wouldn't work: start with an array: given the value of the middle item, you do not have enough information to eliminate the lower or upper half from the search.
However, we can scan the array in multiple passes, each time checking twice as many elements. When we find two elements that are the same, make one final pass. If no other elements were repeated, you've found the longest element run (without even knowing how many of that element is in the sorted list).
Otherwise, investigate the two (or more) longer sequences to determine which is longest.
Consider a sorted list.
Index 0 1 2 3 4 5 6 7 8 9 a b c d e f
List 1 2 3 3 3 3 3 3 3 4 5 5 6 6 6 7
Pass1 1 . . . . . . 3 . . . . . . . 7
Pass2 1 . . 3 . . . 3 . . . 5 . . . 7
Pass3 1 2 . 3 . x . 3 . 4 . 5 . 6 . 7
After pass 3, we know that the run of 3's must be at least 5, while the longest run of any other number is at most 3. Therefore, 3 is the most frequently occurring number in the list.
Using the right data structures and algorithms (use binary-tree-style indexing), you can avoid reading values more than once. You can also avoid reading the 3 (marked as an x in pass 3) since you already know its value.
This solution has running time O(n/k) which degrades to O(n) for k=1 for a list with n elements and a longest run of k elements. For small k, the naive solution will perform better due to simpler logic, data structures, and higher RAM cache hits.
If you need to determine the frequency of the most common number, it would take O((n/k) log k) as indicated by David to find the first and last position of the longest run of numbers using binary search on up to n/k groups of size k.
The worst case cannot be better than O(n) time. Consider the case where each element exists once, except for one element which exists twice. In order to find that element, you'd need to look at every element in the array until you find it. This is because knowing the value of any array element does not give you any information regarding the location of the duplicate element, until it's actually found. This is in contrast to binary search, where the value of an array element allows you to rule out many other elements.
No, in the worst case we have to scan at least n - 2 elements, but see
below for an algorithm that exploits inputs with many duplicates.
Consider an adversary that, for the first n - 3 distinct probes into the
n-element array, returns m for the value at index m. Now the algorithm
knows that the array looks like
1 2 3 ... i-1 ??? i+1 ... j-1 ??? j+1 ... k-1 ??? k+1 ... n-2 n-1 n.
Depending on what the ???s are, the sole correct answer could be j-1
or j+1, so the algorithm isn’t done yet.
This example involved an array where there were very few duplicates. In
fact, we can design an algorithm that, if the most frequent element
occurs k times out of n, uses O((n/k) log k) probes into the array. For
j from ceil(log2(n)) - 1 down to 0, examine the subarray consisting of
every (2**j)th element. Stop if we find a duplicate. The cost so far
is O(n/k). Now, for each element in the subarray, use binary search to
find its extent (O(n/k) searches in subarrays of size O(k), for a total
of O((n/k) log k)).
It can be shown that all algorithms have a worst case of Omega((n/k) log
k), making this one optimal in the worst case up to constant factors.
Consider having an array filled with elements a0,a1,a2,....,a(n-1).
Consider that this array is sorted already; it will be easier to describe the problem.
Now the range of the array is defined as the biggest element - smallest element.
Say this range is some value x.
Now the problem I have is that, I want to change the elements in such a way that the range becomes less than/equal to some target value y.
I also have the additional constraint that I want to change minimal amount for each element. Consider an element a(i) that has value z. If I change it by r amount, this costsr^2.
Thus, what is an efficient algorithm to update this array to make the range less than or equal to target range y that minimizes the cost.
An example:
Array = [ 0, 3, 19, 20, 23 ] Target range is 17.
I would make the new array [ 3, 3, 19, 20, 20 ] . The cost is (3)^2 + (3)^2 = 18.
This is the minimal cost.
If you are adding/removing to some certain element a(i), you must add/remove that quantity q all at once. You can not remove 3 times 1 unit from a certain element, but must remove a quantity of 3 units once.
I think you can build two heaps from the array - one min-heap, one max-heap. Now you will take the top elements of both heaps and peek at the ones right under them and compare the differences. The one that has the bigger difference you will take and if that difference is bigger than you need, you will just take the required size and add the cost.
Now, if you had to take the whole difference and didn't achieve your goal, you will need to repeat this step. However, if you once again choose from the same heap, you have to remember to add the cost for the element you are taking out of the heap in that steps AND also for those that have been taken out of the processed heap before.
This yields an O(N*logN) algorithm, I'm not sure if it can be done faster.
Example:
Array [2,5,10,12] , I want difference 4.
First heap has 2 on top, second one 12. the 2 is 3 far from 5 and 12 is 2 far from 10 so I take the min-heap and the two will have to be changed by 3. So now we have a new situation:
[5, 10, 12]
The 12 is 2 far from 10 and we take it, subtract 2 and get new situation:
[5,10]
Now we can choose any heap, both differences are the same (the same numbers :-) ). We just need to change by 1 so we get subtract 1 from 10 and get the right result. Now, because we changed 5 to 6 we would also have to change the number that was originally 12 once more to 9 so the resulting cost:
[2 - changed to 5, 5 - unchanged, 10 - changed to 9, 12 - changed to 9].
Here is a linear-time algorithm that minimizes the piecewise quadratic objective function. Probably it can be simplified.
Let the range be [x, x + y], where x is a variable. For different choices of x, there are at most 2n + 1 possibilities for which points lie in the range, arising from 2n critical values a0 - y, a1 - y, ..., a(n-1) - y, a0, a1, ..., a(n-1). One linear-time merge yields the critical values in sorted order. For each of the 2n - 1 intervals [w, z] between critical values where the range contains at least one point, we can construct and minimize a quadratic function consisting of a sum where every point aj less than w yields a term (x - aj)^2 and every point aj greater than z + y yields a term (x + y - aj)^2. The global minimum lies at the mean of aj (for terms of the first type) or aj - y (for terms of the second type); the endpoints of the interval must be checked as well. Naively, this gives a quadratic-time algorithm.
To get down to linear time, it suffices to update the sum preceding the mean computation incrementally. Each of the critical values has an associated event indicating whether the point responsible for it is entering or leaving the interval, meaning that that point's term should enter or leave the sum.