How would I find an infinite loop in an array of pointers? - arrays

I have an array of pointers (this is algorithmic, so don't go into language specifics). Most of the time, this array points to locations outside of the array, but it degrades to a point where every pointer in the array points to another pointer in the array. Eventually, these pointers form an infinite loop.
So on the assumption that the entire array consists of pointers to another location in the array and you start at the beginning, how could you find the length of the loop with the highest efficiency in both time and space? I believe the best time efficiency would be O(n), since you have to loop over the array, and the best space efficiency would be O(1), though I have no idea how that would be achieved.
Index: 0 1 2 3 4 5 6
Value: *3 *2 *5 *4 *1 *1 *D
D is data that was being pointed to before the loop began. In this example, the cycle is 1, 2, 5 and it repeats infinitely, but indices 0, 3, and 4 are not part of the cycle.

This is an instance of the cycle-detection problem. An elegant O(n) time O(1) space solution was discovered by Robert W. Floyd in 1960; it's commonly known as the "Tortoise and Hare" algorithm because it consists of traversing the sequence with two pointers, one moving twice as fast as the other.
The idea is simple: the sequence must eventually enter a loop of some length k. At each iteration the hare moves two steps and the tortoise moves one, so the gap between them grows by one per iteration. Every k iterations, therefore, they are a multiple of k steps apart; once both are inside the cycle (which happens as soon as the tortoise arrives), being a multiple of k steps apart means they point at the same element.
If all you need to know is the length of the cycle, you wait for the hare and the tortoise to reach the same spot; then you step along the cycle, counting steps until you get back to the same spot again. In the worst case, the total number of steps will be the length of the tail plus twice the length of the cycle, which must be less than twice the number of elements.
Note: The second paragraph was edited to possibly make the idea "more obvious", whatever that might mean. A formal proof is easy and so is an implementation, so I provided neither.
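A minimal Python sketch of the two phases (the function name and the array-of-indices representation are mine, following the question's example):

```python
def cycle_length(A):
    """Length of the cycle eventually reached from index 0 in an array
    A whose entries are indices back into A (Floyd's tortoise and hare)."""
    tortoise, hare = A[0], A[A[0]]
    while tortoise != hare:          # advance until they meet inside the cycle
        tortoise = A[tortoise]
        hare = A[A[hare]]
    # count steps around the cycle starting from the meeting point
    length, hare = 1, A[tortoise]
    while hare != tortoise:
        hare = A[hare]
        length += 1
    return length

# The question's example: 0 -> 3 -> 4 -> 1 -> 2 -> 5 -> 1 -> ... (cycle 1, 2, 5)
print(cycle_length([3, 2, 5, 4, 1, 1]))  # 3
```

The worst-case step count matches the analysis above: the tortoise walks the tail plus at most one lap, and the final counting pass adds one more lap.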

Make a directed graph of the elements in the array, where node i has an edge to node j if the element at i points to the element at j, and for each node keep track of its in-degree (the number of pointers pointing to it). While building the graph, if a node's in-degree reaches 2, that node is part of the infinite cycle.
The above fails if the first element is itself part of the infinite cycle, so before the algorithm starts, add 1 to the in-degree of the first element to resolve this.

The array becomes, as you describe it, a graph (more properly a forest) where each vertex has out-degree of exactly one. The components of such a graph can only consist of chains that each possibly end in a single loop. That is, each component is shaped either like an O or like a 6. (I am assuming no pointers are null, but this is easy to deal with: you just end up with 1-shaped components containing no cycles at all.)
You can trace all these components by "visiting" and keeping track of where you've been with a "visited" hash or flags array.
Here's an algorithm.
Edit: This is just DFS of a forest, simplified for the case of one child per node, which eliminates the need for a stack (or recursion) because no backtracking is needed.
Let A[0..N-1] be the array of pointers.
Let V[0..N-1] be an array of boolean "visited" flags, initially false.
Let C[0..N-1] be an array of integer counts, initially zero.
Let S[0..N-1] be an array of "step counts" for each component trace.

longest = 0   // length of longest cycle

for i in 0..N-1, increment C[j] if A[i] points to A[j]

for each k such that C[k] = 0
    // No "in edges", so must be the start of a 6-shaped component
    s = 0
    while V[k] is false
        V[k] = true
        S[k] = s
        s = s + 1
        k = index of the array location that A[k] points to
    end
    // Loop found. Length is s - S[k]
    longest = max(longest, s - S[k])
end

// Remaining unvisited components must be of the O variety
while there exists k with V[k] false
    s = 0
    while V[k] is false
        V[k] = true
        s = s + 1
        k = index of the array location that A[k] points to
    end
    // Found loop of length s
    longest = max(longest, s)
end
Space and execution time are both proportional to size of the input array A. You can get rid of the S array if you're willing to trace 6-shaped components twice.
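A Python sketch of the same idea (the naming is mine; I also record a trace id per node, so that a 6-shaped trace that merely runs into a node visited by an *earlier* trace, as happens when two chains merge before the cycle, is not miscounted as a new loop):

```python
def longest_cycle(A):
    """Longest cycle length in an array A whose entries are indices into A."""
    N = len(A)
    C = [0] * N                 # in-degree counts
    for j in A:
        C[j] += 1
    S = [-1] * N                # step at which each node was first seen
    T = [-1] * N                # id of the trace that first saw the node
    longest = 0
    trace = 0

    def walk(k):
        nonlocal longest, trace
        s = 0
        while T[k] == -1:
            T[k] = trace
            S[k] = s
            s += 1
            k = A[k]
        if T[k] == trace:       # closed a cycle within this very trace
            longest = max(longest, s - S[k])
        trace += 1

    for k in range(N):          # 6-shaped components start at in-degree 0
        if C[k] == 0:
            walk(k)
    for k in range(N):          # whatever is left consists of pure O-shaped cycles
        if T[k] == -1:
            walk(k)
    return longest

print(longest_cycle([3, 2, 5, 4, 1, 1]))  # 3
```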
Addition I fully agree that if it's not necessary to find the cycle of maximum size, then the ancient "two pointer" algorithm for finding cycles in a linked list is superior, since it requires only constant space.

Related

Greatest element present on the right side of every element in an array

I have been given an array of n elements, and for each element I have to find the smallest element to its right that is greater than it (the current element).
For example:
Array = {8,20,9,6,15,31}
Output Array = {9,31,15,15,31,-1}
Is it possible to solve this in O(n)? I thought of traversing the array from the right (starting at n-2) and building a balanced binary search tree from the elements already seen, since searching it for the element immediately greater than the current one would be O(log n).
The time complexity would then come out to O(n log n).
Is there a better approach to this problem?
The problem you present is impossible to solve in O(n) time, since you can reduce sorting to it and thereby achieve sorting in O(n) time.
Say there exists an algorithm which solves the problem in O(n).
Let there be an element a.
The algorithm can also be used to find the smallest element to the left of and larger than a (by reversing the array before running the algorithm).
It can also be used to find the largest element to the right (or left) of and smaller than a (by negating the elements before running the algorithm).
So, after running the algorithm four times (in linear time), you know, for each element, which elements sit immediately to its right and left in sorted order. To construct the sorted array in linear time, keep the indices of the elements instead of the values: first find the smallest element by following your "larger-than" pointers in linear time, then make another pass in the other direction to actually build the array.
Others have proved that it is impossible to solve in O(n) in general.
However, it is possible to do in O(m), where m is the size of your largest element.
This means that in certain cases (e.g. if your input array is known to be a permutation of the integers 1 up to n) it is possible to do in O(n).
The code below shows the approach, built upon a standard method for computing the next greater element. (There is a good explanation of this method on GeeksforGeeks.)
def next_greater_element(A):
    """Return an array of indices to the next strictly greater element, -1 if none exists"""
    NGE = [-1] * len(A)
    stack = []
    i = 0
    while i < len(A) - 1:
        stack.append(i)
        while stack and A[stack[-1]] < A[i + 1]:
            x = stack.pop()
            NGE[x] = i + 1
        i += 1
    return NGE

def smallest_greater_element(A):
    """Return an array of the smallest greater element on the right side of each element"""
    top = max(A) + 1
    M = [-1] * top  # M will contain the index of each element, indexed by value
    for i, a in enumerate(A):
        M[a] = i
    N = next_greater_element(M)  # N contains, per value, the next value with a higher index (-1 if none)
    return [N[a] for a in A]

A = [8, 20, 9, 6, 15, 31]
print(smallest_greater_element(A))  # [9, 31, 15, 15, 31, -1]
The idea is to find the next element in size order with greater index. This next element will therefore be the smallest one appearing to the right.
This cannot be done in O(n), since we can reduce the Element Distinctness Problem (which is known to require Omega(n log n) for comparison-based algorithms) to it.
First, let's do a little expansion to the problem, that does not influence its hardness:
I have been given an array (of n elements) and I have to find the
smallest element on the right side of each element which is greater than
or equal to it (the current element).
The addition is that we allow the element to be equal to it (and to its right), not only strictly greater than it (1).
Now, given an instance of element distinctness arr, run the algorithm for this problem and check whether there is any index i such that arr[i] == res[i]; if there isn't, answer "all distinct", otherwise "not all distinct".
However, since Element Distinctness requires Omega(n log n) comparisons, this problem does as well.
(1)
One possible justification why adding equality does not make the problem harder: assuming the elements are integers, add i/(n+1) to each element arr[i]. For any two elements, if arr[i] < arr[j] then also arr[i] + i/(n+1) < arr[j] + j/(n+1); and if arr[i] = arr[j] with i < j, then arr[i] + i/(n+1) < arr[j] + j/(n+1). So the same algorithm solves the problem with equalities as well.
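To make the reduction concrete, here is a sketch where a deliberately naive O(n²) solver for the ≥-variant stands in for the hypothetical linear-time algorithm (both function names are mine):

```python
def smallest_ge_right(arr):
    """Naive O(n^2) stand-in for the >=-variant: for each element, the
    smallest element to its right that is greater than or equal to it."""
    out = []
    for i, a in enumerate(arr):
        candidates = [b for b in arr[i + 1:] if b >= a]
        out.append(min(candidates) if candidates else None)
    return out

def all_distinct(arr):
    """Element distinctness via the reduction: res[i] == arr[i] exactly
    when a duplicate of arr[i] appears somewhere to its right."""
    res = smallest_ge_right(arr)
    return all(r is None or r != a for a, r in zip(arr, res))

print(all_distinct([8, 20, 9, 6, 15, 31]))  # True
print(all_distinct([8, 20, 9, 8, 15, 31]))  # False
```

If `smallest_ge_right` could be replaced by an o(n log n) comparison-based routine, `all_distinct` would beat the Element Distinctness lower bound, which is the contradiction the answer relies on.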

quicksort, can it be made to output the first m sorted values in an N dimension array, thereby being faster than a full N sort

Quicksort is a well known algorithm, but it's complex for me to decipher the C. The inline version speeds things up a lot: http://www.corpit.ru/mjt/qsort.html
However, could it easily be converted to output the first m samples of an N-element array?
So a call that would simply stop the sort after the first m samples are sorted? I suspect not, since it quicksorts into blocks and then stitches the blocks together for the final output. If I make the initial quicksort block size equal to m then I'm in a bad place, not taking advantage of the clever stuff in qsort.
Thanks in advance
Grog
Use Quickselect, as @R.. suggested, to get the first k elements, then sort them. Running time is O(N) to get the elements and O(k log k) to sort them.
However, empirical evidence suggests that if the number of items to select (k) is less than 1% of the total number of elements (N), then using a binary heap will be faster than Quickselect followed by a sort. When I had to select 200 items from a list of 2 million, the heap selection algorithm was a lot faster. See the linked blog for details.
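A sketch of the Quickselect-then-sort approach in Python (a Hoare-style partition of my own; this is not code from the linked blog):

```python
import random

def first_k_sorted(items, k):
    """Smallest k items in sorted order: quickselect partitions the k
    smallest to the front in expected O(N), then an O(k log k) sort."""
    a = list(items)
    lo, hi = 0, len(a) - 1
    while lo < hi:
        pivot = a[random.randint(lo, hi)]
        i, j = lo, hi
        while i <= j:                      # Hoare partition around pivot
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if k - 1 <= j:                     # the k smallest lie left of j
            hi = j
        elif k - 1 >= i:                   # still need elements right of i
            lo = i
        else:                              # position k-1 is already settled
            break
    return sorted(a[:k])

print(first_k_sorted([9, 1, 8, 2, 7, 3], 3))  # [1, 2, 3]
```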
(Restating the question: given N items, find the largest m of them.)
A simple solution is a priority queue. Feed all N items into the queue, then pop the top m items off. Feeding the N items in is O(N log m). Each individual pop operation is O(log m), so removing the top m items is O(m log m).
An in-place algorithm should be relatively straightforward. We have an array of N elements. Each position in the array is numbered, from 1 to N (inclusive). For each position, take its number, divide it by two (rounding down where necessary), and define that position as its parent. Every position apart from position 1 has a parent, and most positions (not all) have two children. For example:
node position: 1 2 3 4 5 6 7 8 9 ...
parent: - 1 1 2 2 3 3 4 4 ...
We want to swap nodes until each node's value is less than (or equal to) its parent's. This guarantees that the largest value is in position 1. It is quite easy to reorder an array into this form: simply go through the positions in order from 1 to N and call this function on each once:
void fixup_position(int x) {
    if (x == 1)
        return;
    int parent_position = x / 2;  // rounds down where necessary
    if (data[x] > data[parent_position]) {
        swap(data[x], data[parent_position]);
        fixup_position(parent_position);  // note this recursive call
    }
}
for (x = 1; x <= N; ++x) {
    fixup_position(x);
}
(Yes, I'm counting the array from position one, not zero! You'll have to take this into account when implementing it for real, but it makes the logic of the priority queue easier to understand.)
The average number of recursive calls (and therefore swaps) is a constant (2, if I remember correctly). So this will be pretty quick, even with large datasets.
It's worth taking a moment to understand why this is correct. Just before calling fixup_position(x), every position up to, but not including x, are in a 'correct' state. By 'correct' I mean that they're not fully sorted, but each node is less than its parent. A new value is introduced (at position x), and will 'bubble up' through the queue. You might worry that this will invalidate other positions, and their parent-child relationship, but it won't. Only one node at a time will be in an invalid state, and it will keep bubbling up to its rightful place.
This is the O(N) step that will rearrange your array into a priority queue.
Removing the top n items. After the above method, it's clear that the biggest number will be in position 1, but what about the second-biggest, and third-biggest, and so on? What we do is we pop one value at a time from position 1 and then rearrange the data so that the next-biggest value is moved into position 1. This is slightly more complex than the fixup_position.
for (int y = 1; y <= m; ++y) {
    print the number in position 1   // it's the next biggest number
    data[1] = -10000000000000;       // a number smaller than all your data
    fixup_the_other_way(1);          // yes, this is '1', not 'y' !
}
where fixup_the_other_way is:
void fixup_the_other_way(int x) {
    int child1 = 2*x;
    int child2 = 2*x+1;
    if (child1 > N)   // doesn't have any children, we're done here
        return;
    if (child2 > N) { // has one child, at position child1
        if (data[child1] > data[x]) {
            swap(data[x], data[child1]);
            fixup_the_other_way(child1);
        }
        return;
    }
    // otherwise, two children; we must identify the biggest child
    int position_of_largest_child = (data[child1] > data[child2]) ? child1 : child2;
    if (data[position_of_largest_child] > data[x]) {
        swap(data[x], data[position_of_largest_child]);
        fixup_the_other_way(position_of_largest_child);
    }
    return;
}
This means we print out the biggest remaining item, then replace that with a really small number and force it to 'bubble down' to the bottom of our data structures.
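The two phases above, sketched in Python with the same 1-indexed layout (`top_m` and the helper names are mine, not the answer's):

```python
def top_m(values, m):
    """Largest m values: build a 1-indexed max-heap by bubbling each new
    position up, then m times pop the root and bubble a sentinel down."""
    data = [None] + list(values)
    N = len(values)

    def fixup(x):                    # bubble a new value up toward the root
        while x > 1 and data[x] > data[x // 2]:
            data[x], data[x // 2] = data[x // 2], data[x]
            x //= 2

    def fixdown(x):                  # bubble a small value down to the leaves
        while 2 * x <= N:
            c = 2 * x                # pick the bigger of the two children
            if c + 1 <= N and data[c + 1] > data[c]:
                c += 1
            if data[c] <= data[x]:
                break
            data[x], data[c] = data[c], data[x]
            x = c

    for x in range(1, N + 1):        # O(N)-ish heap construction pass
        fixup(x)
    out = []
    for _ in range(m):
        out.append(data[1])          # root holds the largest remaining value
        data[1] = float('-inf')      # sentinel smaller than all the data
        fixdown(1)
    return out

print(top_m([5, 1, 9, 3, 7], 3))  # [9, 7, 5]
```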
There are two ways to solve the problem efficiently:
1. Priority queue
Algorithm:
    Insert the first n items into a max-heap priority queue.
    For each remaining element, peek at the max element to check whether the current element is less than it;
    if it is less, delete the top element and insert the current one.
    Do these steps for all N-n remaining elements.
2. Your problem can be reduced to the selection problem:
Algorithm:
    Do randomized selection for the nth element on all N elements (O(N) in the average case).
    Sort the first n elements using qsort or any other efficient sorting algorithm.
The second algorithm gives average-case O(N + n log n) performance; the first is O(N log n) worst case.
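In Python the bounded-heap strategy is available off the shelf: `heapq.nsmallest` maintains a heap of at most n items internally, which is the O(N log n) variant of the first algorithm (the wrapper name is mine):

```python
import heapq

def first_n_sorted(items, n):
    """Smallest n items in sorted order via a bounded heap: O(N log n)."""
    return heapq.nsmallest(n, items)

print(first_n_sorted([42, 7, 19, 3, 88, 11], 3))  # [3, 7, 11]
```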

Why does linear probing work with a relatively prime step?

I was reading about linear probing in a hash table tutorial and came upon this:
The step size is almost always 1 with linear probing, but it is acceptable to use other step sizes as long as the step size is relatively prime to the table size so that every index is eventually visited. If this restriction isn't met, all of the indices may not be visited...
(The basic problem is: You need to visit every index in an array starting at an arbitrary index and skipping ahead a fixed number of indices [the skip] to the next index, wrapping to the beginning of the array if necessary with modulo.)
I understand why not all indices could be visited if the step size isn't relatively prime to the table size, but I don't understand why the converse is true: that all the indices will be visited if the step size is relatively prime to the array size.
I've observed this relatively prime property working in several examples that I've worked out by hand, but I don't understand why it works in every case.
In short, my question is: Why is every index of an array visited with a step that is relatively prime to the array size? Is there a proof of this?
Thanks!
Wikipedia about Cyclic Groups
The units of the ring Z/nZ are the numbers coprime to n.
Also:
[If two numbers are co-prime] There exist integers x and y such that ax + by = 1
So, if "a" is your step length and "b" the length of the array, you can reach any index "z" because
a(xz) + b(yz) = z
=>
a(xz) = z (mod b)
i.e. stepping "xz" times (and wrapping around the array "yz" times).
The number of steps before you return to your starting index is lcm(A,P)/P = A/gcd(A,P), where A is the array size and P is this magic coprime step.
So if gcd(A,P) != 1, the number of steps will be less than A, and some indices are never visited.
On the contrary, if gcd(A,P) == 1 (coprime), the number of steps will be A, and all indices will be visited.
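The count A/gcd(A,P) is easy to check empirically; here is a small sketch (function name is mine):

```python
from math import gcd

def probe_sequence(n, step, start=0):
    """Indices visited before a fixed-step probe returns to its start."""
    seen, i = [], start
    while True:
        seen.append(i)
        i = (i + step) % n
        if i == start:
            return seen

# gcd(10, 3) == 1: all 10 slots are visited before the cycle closes
print(len(probe_sequence(10, 3)))  # 10
# gcd(10, 4) == 2: only 10/2 = 5 slots are ever visited
print(len(probe_sequence(10, 4)))  # 5
```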

Inserting unknown number of elements into dynamic array in linear time

(This question is inspired by deque::insert() at index?, I was surprised that it wasn't covered in my algorithm lecture and that I also didn't find it mentioned in another question here and even not in Wikipedia :). I think it might be of general interest and I will answer it myself ...)
Dynamic arrays are datastructures that allow addition of elements at the end in amortized constant time O(1) (by doubling the size of the allocated memory each time it needs to grow, see Amortized time of dynamic array for a short analysis).
However, insertion of a single element in the middle of the array takes linear time O(n), since in the worst case (i.e. insertion at first position) all other elements needs to be shifted by one.
If I want to insert k elements at a specific index in the array, the naive approach of performing the insert operation k times would thus lead to a complexity of O(n*k) and, if k=O(n), to a quadratic complexity of O(n²).
If I know k in advance, the solution is quite easy: expand the array if necessary (possibly reallocating space), shift the elements starting at the insertion point by k, and simply copy the new elements.
But there might be situations, where I do not know the number of elements I want to insert in advance: For example I might get the elements from a stream-like interface, so I only get a flag when the last element is read.
Is there a way to insert multiple (k) elements, where k is not known in advance, into a dynamic array at consecutive positions in linear time?
In fact there is a way, and it is quite simple:
First, append all k elements at the end of the array. Since appending one element takes amortized O(1) time, this will be done in O(k) time.
Second, rotate the elements into place. Say you want to insert the elements at position pos, and let n be the length after appending. Then you need to rotate the subarray A[pos..n-1] by k positions to the right (or n-pos-k positions to the left, which is equivalent). Rotation can be done in linear time by use of a reverse operation, as explained in Algorithm to rotate an array in linear time. Thus the time needed for the rotation is O(n).
Therefore the total time for the algorithm is O(k)+O(n)=O(n+k). If the number of elements to be inserted is on the order of n (k=O(n)), you get O(n+n)=O(2n)=O(n) and thus linear time.
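A Python sketch of the append-then-rotate trick (`insert_from_stream` and `rev` are my names; Python lists already provide the amortized O(1) append):

```python
def insert_from_stream(arr, pos, stream):
    """Insert all items from `stream` at index `pos` in O(n + k) total:
    append them, then rotate arr[pos:] right by k via three reversals."""
    n = len(arr)
    for x in stream:                 # k amortized-O(1) appends
        arr.append(x)
    k = len(arr) - n

    def rev(lo, hi):                 # in-place reversal of arr[lo..hi]
        while lo < hi:
            arr[lo], arr[hi] = arr[hi], arr[lo]
            lo, hi = lo + 1, hi - 1

    rev(pos, n - 1)                  # reverse the displaced old suffix
    rev(n, len(arr) - 1)             # reverse the newly appended block
    rev(pos, len(arr) - 1)           # reverse the whole tail into place
    return arr

print(insert_from_stream([1, 2, 5, 6], 2, iter([3, 4])))  # [1, 2, 3, 4, 5, 6]
```

Note that k is only discovered after the stream is exhausted, which is exactly the situation the question describes.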
You could simply allocate a new array of length k+n and copy the desired elements into it linearly:

newArr = new T[k + n];
for (int i = 0; i < k + n; i++)
    newArr[i] = i < insertionIndex ? oldArr[i]
              : i < insertionIndex + k ? toInsert[i - insertionIndex]
              : oldArr[i - k];
return newArr;

Each iteration takes constant time, and it runs k+n times, thus O(k+n) (or O(n) if you so like). Note that this assumes the k elements have already been collected, since k must be known before allocating.

detecting the start of a loop in a singly linked link list?

Is there any way of finding out the start of a loop in a link list using not more than two pointers?
I do not want to visit every node, mark it as seen, and report the first node already seen. Is there any other way to do this?
Step 1: Proceed in the usual way you would use to find the loop: take two pointers, increment one in single steps and the other in two steps. If they ever meet, there is a loop.
Step 2: Freeze one pointer where it is and increment the other in single steps, counting the steps you make; when they meet again, the count gives you the length of the loop (this is the same as counting the number of elements in a circular linked list).
Step 3: Reset both pointers to the start of the list, advance one pointer by the length of the loop, and then start the second pointer. Increment both in single steps; where they meet again is the start of the loop (this is the same as finding the nth element from the end of a linked list).
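The three steps, sketched in Python (`Node` and `loop_start` are my names):

```python
class Node:
    def __init__(self, v):
        self.v, self.next = v, None

def loop_start(head):
    """Steps 1-3: detect the loop, measure its length, then walk two
    pointers `length` apart from the head until they meet."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow, fast = slow.next, fast.next.next
        if slow is fast:
            break
    else:
        return None                       # fast fell off the end: no loop
    length, p = 1, slow.next              # step 2: count the loop's nodes
    while p is not slow:
        p, length = p.next, length + 1
    lead = follow = head                  # step 3: two pointers, `length` apart
    for _ in range(length):
        lead = lead.next
    while follow is not lead:
        follow, lead = follow.next, lead.next
    return follow

# Build 1 -> 2 -> 3 -> 4 -> 5 -> back to 3
nodes = [Node(i) for i in range(1, 6)]
for a, b in zip(nodes, nodes[1:]):
    a.next = b
nodes[-1].next = nodes[2]
print(loop_start(nodes[0]).v)  # 3
```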
MATHEMATICAL PROOF + THE SOLUTION
Let 'k' be the number of steps from HEADER to BEGINLOOP.
Let 'm' be the number of steps from HEADER to MEETPOINT.
Let 'n' be the number of steps in the loop.
Also, consider two pointers 'P' and 'Q'. Q having 2x speed than P.
SIMPLE CASE: When k < n
When pointer 'P' would be at BEGINLOOP (i.e. it would have traveled 'k' steps), Q would have traveled '2k' steps. So, effectively, Q is ahead of '2k-k = k' steps from P when P enters the loop, and hence, Q is 'n-k' steps behind the BEGINLOOP now.
When P has moved from BEGINLOOP to MEETPOINT, it has traveled 'm-k' steps. In that time, Q has traveled '2(m-k)' steps. Since they met, and Q started 'n-k' steps behind BEGINLOOP, effectively,
'2(m-k) - (n-k)' should be equal to '(m-k)'
So,
=> 2m - 2k - n + k = m - k
=> 2m - n = m
=> n = m
THAT MEANS, P and Q meet after a number of steps equal to the length of the loop (or a multiple of it in general; see the case below). Now, at the MEETPOINT, both P and Q are 'n-(m-k)' steps behind BEGINLOOP, i.e., 'k' steps behind, since n=m.
So, if we start P from HEADER again, and Q from the MEETPOINT but this time with the pace equal to P, P and Q will now be meeting at BEGINLOOP only.
GENERAL CASE: Say, k = nX + Y, Y < n
(Hence, k%n = Y)
When pointer 'P' would be at BEGINLOOP (i.e. it would have traveled 'k' steps), Q would have traveled '2k' steps. So, effectively, Q is ahead of '2k-k = k' steps from P when P enters the loop. But, please note 'k' is greater than 'n', which means Q would have made multiple rounds of the loop. So, effectively, Q is 'n-(k%n)' steps behind the BEGINLOOP now.
When P has moved from BEGINLOOP to MEETPOINT, it has traveled 'm-k' steps. (Hence, MEETPOINT is '(m-k)%n' steps ahead of BEGINLOOP.) In that time, Q has traveled '2(m-k)' steps. Since they met, and Q started 'n-(k%n)' steps behind BEGINLOOP, the new position of Q (which is '(2(m-k) - (n-k%n))%n' from BEGINLOOP) must equal the new position of P (which is '(m-k)%n' from BEGINLOOP).
So,
=> (2(m - k) - (n - k%n))%n = (m - k)%n
=> (2(m - k) - (n - k%n))%n = m%n - k%n
=> (2(m - k) - (n - Y))%n = m%n - Y (as k%n = Y)
=> 2m%n - 2k%n - n%n + Y%n = m%n - Y
=> 2m%n - 2Y - 0 + Y = m%n - Y (as k%n = Y, and Y%n = Y since Y < n)
=> 2m%n - Y = m%n - Y
=> m%n = 0
=> 'm' should be a multiple of 'n'
First we try to find out whether there is any loop in the list. If a loop exists, we then try to find its starting point. For this we use two pointers, slowPtr and fastPtr. In the detection phase (checking for a loop), fastPtr moves two steps at a time while slowPtr moves one step at a time.
slowPtr 1 2 3 4 5 6 7
fastPtr 1 3 5 7 9 5 7
It is clear that if there is any loop in the list, they will meet at some point (Point 7 in the above diagram), because the fastPtr pointer is running twice as fast as the other one.
Now, we come to second problem of finding starting point of loop.
Suppose they meet at Point 7 (as mentioned in the above diagram). Then slowPtr comes out of the loop and is placed at the beginning of the list (Point 1), while fastPtr stays at the meeting point (Point 7). Now we compare both pointers' values; if they are the same, that is the starting point of the loop; otherwise we move both one step ahead (here fastPtr also moves one step at a time) and compare again until we find the same point.
slowPtr 1 2 3 4
fastPtr 7 8 9 4
Now one question comes to mind: how is this possible? So here is a good mathematical proof.
Suppose:
m => length from starting of list to starting of loop (i.e 1-2-3-4)
l => length of loop (i.e. 4-5-6-7-8-9)
k => length between starting of loop to meeting point (i.e. 4-5-6-7)
Total distance traveled by slowPtr = m + p(l) +k
where p => number of repetition of circle covered by slowPtr
Total distance traveled by fastPtr = m + q(l) + k
where q => number of repetition of circle covered by fastPtr
Since,
fastPtr running twice faster than slowPtr
Hence,
Total distance traveled by fastPtr = 2 X Total distance traveled by slowPtr
i.e
m + q(l) + k = 2 * (m + p(l) + k)
or, m + k = q(l) - 2p(l)
or, m + k = (q-2p) l
or, m = (q-2p) l - k
So,
if slowPtr starts from the beginning of the list and travels "m" steps, it will reach Point 4 (i.e. 1-2-3-4),
and
if fastPtr starts from Point 7 and travels "(q-2p) l - k" steps, it will also reach Point 4 (i.e. 7-8-9-4),
because "(q-2p) l" is a whole number of complete trips around the circle.
Proceed in the usual way you would use to find the loop: take two pointers, increment one in single steps (slowPointer) and the other in two steps (fastPointer). If they ever meet, there is a loop.
As you might have already realized, the meeting point is k steps before the head of the loop,
where k is the size of the non-looped part of the list.
Now move slow to the head of the list
and keep fast at the collision point.
Each of them is k steps from the loop start (slow from the start of the list, whereas fast is k steps before the head of the loop; draw the picture to get clarity).
Now move them at the same speed: they must meet at the loop start.
For example:
slow = head;
while (slow != fast)
{
    slow = slow.next;
    fast = fast.next;
}
This is code to find the start of a loop in a linked list:

public static void findStartOfLoop(Node n) {
    Node fast, slow;
    fast = slow = n;
    do {
        fast = fast.next.next;
        slow = slow.next;
    } while (fast != slow);
    fast = n;
    do {
        fast = fast.next;
        slow = slow.next;
    } while (fast != slow);
    System.out.println(" Start of Loop : " + fast.v);
}
There are two ways to find loops in a linked list.
1. Use two pointers, one advancing one step at a time and the other two steps. If there is a loop, at some point both pointers hold the same value and never reach null; if there is no loop, one pointer reaches null and the two never hold the same value. With this approach we learn that there is a loop in the list, but not where the loop starts. It is not the most efficient way either.
2. Use a hash set keyed so that values are unique, throwing an exception if we try to insert a duplicate. Then travel through the list, pushing each node's address into the set. If the pointer reaches null with no exception from the set, there is no cycle in the list. If we get an exception, there is a cycle, and the offending node is where the cycle starts.
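The hash-based variant (number 2) in Python; a set plus a membership test replaces the "exception on duplicate insert" (`Node` and `find_cycle_start` are my names):

```python
class Node:
    def __init__(self, v):
        self.v, self.next = v, None

def find_cycle_start(head):
    """Store each visited node in a set; the first node seen twice is
    the start of the cycle (O(n) time, O(n) extra space)."""
    seen = set()
    node = head
    while node is not None:
        if node in seen:
            return node
        seen.add(node)
        node = node.next
    return None                  # reached null: no cycle

# Build 1 -> 2 -> 3 -> back to 2
n1, n2, n3 = Node(1), Node(2), Node(3)
n1.next, n2.next, n3.next = n2, n3, n2
print(find_cycle_start(n1).v)  # 2
```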
Well, I tried an approach using one pointer, and it worked on the several data sets I tested. It relies on the assumption that memory for the nodes of a linked list is allocated at increasing addresses: while traversing the list from its head, if the address of a node is larger than the address of the node it points to, you can conclude there is a loop, and the pointed-to node is the beginning of the loop.
The best answer I have found was here:
tianrunhe: find-loop-starting-point-in-a-circular-linked-list
'm' being the distance between HEAD and START_LOOP
'L' being the loop length
'd' being the distance between MEETING_POINT and START_LOOP
p1 moving at V, and p2 moving at 2*V
when the 2 pointers meet: distance run is m + n*L - d = 2*(m + L - d)
=> which means (not mathematically demonstrated here) that if p1 starts from HEAD and p2 starts from MEETING_POINT, and they move at the same pace, they will meet at START_LOOP.
Proceed in the usual way you would use to find the loop: take two pointers, increment one in single steps and the other in two steps. If they ever meet, there is a loop.
Keep one of the pointers fixed and get the total number of nodes in the loop, say L.
Now, from that point in the loop (advance the second pointer to the next node in the loop), reverse the linked list and count the number of nodes traversed, say X.
Using the same second pointer (the loop is now broken), traverse the linked list from the same point and count the number of nodes remaining, say Y.
The loop begins after ((X+Y)-L)/2 nodes; that is, it starts at the (((X+Y)-L)/2 + 1)th node.
void loopstartpoint(Node head) {
    Node slow = head.next;
    Node fast = head.next.next;
    while (fast != null && fast.next != null) {
        slow = slow.next;
        fast = fast.next.next;
        if (slow == fast) {
            System.out.println("Detected loop ");
            break;
        }
    }
    slow = head;
    while (slow != fast) {
        slow = slow.next;
        fast = fast.next;
    }
    System.out.println("Starting position of loop is " + slow.data);
}
Take two pointers, one fast and one slow. The slow pointer moves one node at a time, the fast pointer two nodes at a time; ultimately they'll meet, and that's how you find the loop.
Now comes the fun part: how do you find the start of the loop? That's simple as well. Let me first ask you another fun question: how would you find the (n-x)th node of the list in one pass? The simple answer: take two pointers, one at the head and one x steps ahead of the head, move them at the same speed, and when the second pointer hits the end, the first one will be at n-x.
As many other people have mathematically proved in this thread, if one pointer moves at twice the speed of the other, the distance from the start of the list to the point where they meet is a multiple of the length of the loop.
Why is this the case??
As fast pointer is moving at twice the speed of slow pointer, can we agree that:
Distance travelled by fast pointer = 2 * (Distance travelled
by slow pointer)
now:
If m is the length of the part of the list before the loop (the part with no cycle),
n is the actual length of the loop,
x is the number of trips around the loop the fast pointer made,
y is the number of trips around the loop the slow pointer made,
and k is the distance from the start of the loop to the meeting point:
Note that k is the only extra stretch in either pointer's path beyond its whole trips around the loop: apart from this k, everything each pointer travelled consists of complete trips around the loop plus the initial distance from the head to the start of the loop. Hence each has travelled m + n*(number of trips it made) + k by the time they meet. So we can say that:
(m + n*x + k) = 2(m + n*y + k)
When you solve this mathematically you'll discover that m+k is a
multiple of the length of the loop, n. That is, [m + k = (x-2y)*n].
So, if you maintain a distance that is a multiple of the loop length and keep moving, you'll eventually meet at the start of the loop. Why can't they meet somewhere else? Well, fast is already at k and slow is at the head; if they both travel m more steps then, since k+m is a multiple of n, fast will be at the start of the loop. And if slow travels m steps, it will also be at the start of the loop, since m is the distance from the head to the start of the loop, as we originally assumed.
Hence, it is mathematically proved that if, after detecting the loop, you set the slow pointer back to the head and make both pointers travel at the same speed, the distance each has to travel is m.
public class Solution {
    public ListNode detectCycle(ListNode head) {
        if (head == null || head.next == null) return null;
        ListNode slow = head;
        ListNode fast = head;
        while (fast.next != null && fast.next.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (fast == slow) {
                slow = head;
                while (slow != fast) {
                    slow = slow.next;
                    fast = fast.next;
                }
                return slow;
            }
        }
        return null;
    }
}
Pythonic code solution based on @hrishikeshmishra's solution:

def check_loop_index(head):
    if head == None or head.next == None:
        return -1
    slow = head.next
    if head.next.next == None:
        return -1
    fast = head.next.next
    # searching whether a loop exists
    while fast and fast.next:
        if slow == fast:
            break
        slow = slow.next
        fast = fast.next.next
    # checking if there is a loop
    if slow != fast:
        return -1
    # resetting slow to head and creating an index
    index = 0
    slow = head
    # incrementing slow and fast by 1 step and incrementing index; when they
    # meet, that's the index of the node where the loop starts
    while slow != fast:
        slow = slow.next
        fast = fast.next
        index += 1
    return index
Detect the loop:
copy each element's address into a set as you traverse; the first duplicate found is the start of the loop.
I have heard this exact question as an interview quiz question.
The most elegant solution is:
Put both pointers at the first element (call them A and B).
Then keep looping:
    Advance A to the next element
    Advance A to the next element again
    Advance B to the next element
Every time you update a pointer, check whether A and B are identical.
If at some point pointers A and B are identical, then you have a loop.
The problem with this approach is that pointer A may end up moving around the loop twice, but no more than twice.
If you want to actually find the element that has two pointers pointing to it, that is more difficult. I'd go out on a limb and say it's impossible to do with just two pointers, unless you are willing to repeat following the linked list a large number of times.
The most efficient way of doing it with more memory would be to put the pointers to the elements in an array, sort it, and then look for a repeat.
