Midpoint of a linked list in a single traversal? - c

I'm trying to find the point in a singly linked list where a loop begins.
My idea was to take two pointers, *slow and *fast, with one moving at twice the speed of the other.
If the list has a loop, then at some point
5-6-7-8
| |
1-2-3-4-7-7
slow=fast
Is there another, more elegant solution that traverses the list only once?

Your idea of using two walkers, one at twice the speed of the other, would work. The more fundamental question it raises, however, is whether you are picking an appropriate data structure. You should ask yourself whether you really need to find the midpoint and, if so, what other structures might be better suited to doing this in O(1) (constant) time. An array would certainly give you much better performance for finding the midpoint of a collection, but other operations on it are slower. Without knowing the rest of the context I can't make any other suggestion, but I would suggest reviewing your requirements.

I am assuming this was some kind of interview question.
If your list has a loop, then to do it in a single traversal, you will need to mark the nodes as visited as your fast walker goes through the list. When the fast walker encounters NULL or an already visited node, the iteration can end, and your slow walker is at the midpoint.
There are many ways to mark the node as visited, but an external map or set could be used. If you mark the node directly in the node itself, this would necessitate another traversal to clean up the mark.
Edit: So this is not about finding the midpoint, but about loop detection without revisiting already visited nodes. Marking works for that as well. Just traverse the list and mark the nodes. If you hit NULL, no loop. If you hit a visited node, there is a loop. If the mark includes a counter as well, you even know where the loop starts.
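A minimal sketch of the marking idea in C, assuming the mark can live in the node itself. The node layout, the `mark` field, and the function names are hypothetical, not something from the question. The mark stores the 1-based visit order, so the first already-marked node we hit tells us where the loop starts. As noted above, marking inside the node means a second pass is needed to clear the marks.

```c
#include <stddef.h>

/* Hypothetical node layout: mark is 0 for "not visited",
 * otherwise the 1-based order in which the node was visited. */
struct node {
    int value;
    struct node *next;
    int mark;
};

/* Walks the list once, marking nodes as it goes.
 * Returns the 0-based position of the node where the loop begins,
 * or -1 if the walk reaches NULL (no loop). */
int loop_start_index(struct node *head)
{
    int counter = 0;
    for (struct node *cur = head; cur != NULL; cur = cur->next) {
        if (cur->mark != 0)
            return cur->mark - 1;   /* already visited: the loop starts here */
        cur->mark = ++counter;
    }
    return -1;
}

/* Cleanup pass: clears marks until it reaches NULL or a node that is
 * already clear, so it also terminates on a list with a loop. */
void clear_marks(struct node *head)
{
    for (struct node *cur = head; cur != NULL && cur->mark != 0; cur = cur->next)
        cur->mark = 0;
}
```

An external set keyed by node address would avoid the cleanup pass, at the cost of extra memory and a hash set implementation, which C does not provide out of the box.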

I'm assuming that this singly linked list ends with NULL. In that case, the slow pointer and fast pointer approach will work. Because the fast pointer moves at twice the speed of the slow one, when the fast pointer reaches the end of the list, the slow pointer will be at the middle of it.
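For the NULL-terminated case, the two-pointer walk could look roughly like this in C. The node layout and the name `find_middle` are just for illustration. For a list with an even number of nodes, this version returns the second of the two middle nodes.

```c
#include <stddef.h>

/* Hypothetical node layout for illustration. */
struct node {
    int value;
    struct node *next;
};

/* Returns the middle node of a NULL-terminated list, or NULL if the
 * list is empty. The fast pointer advances two nodes per step and the
 * slow pointer one, so slow is at the middle when fast runs out. */
struct node *find_middle(struct node *head)
{
    struct node *slow = head;
    struct node *fast = head;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    return slow;
}
```

Note that this loops forever if the list contains a cycle, since fast never reaches NULL.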

Related

Search in a flat-array-based doubly linked list when the element is between head and tail

I have a doubly linked list implemented using arrays, and I maintain sorted order while inserting elements, something like this. I treat new values (nodes) as array indices and store them directly at their location. The linked list is quite long. Everything works as required, but is there any algorithm or idea that can help me reduce the number of iterations when a node falls between head and tail?
That is
if (new_node > head && new_node < tail) {
search from head()
}
So, to reduce iterations, I added a search from tail() after finding out whether the new node is closer to the head or the tail. Then I added a mid node that sits somewhere between head and tail. But I am still not able to reduce the iterations when a node needs to be linked. Will knowing the range of values help? What else can be done to reduce the number of iterations while inserting (given the sorted nature)?
I hope I have explained this properly.
Here is a SO answer discussing the difference between array and linked list: insert/delete functions.
tl;dr: It is a limitation of doubly linked lists, even if you wrap one inside an array.
Understanding how a linked list and an array work underneath will help you see why the issue you are trying to solve is inherent to the structure. Unless you use a different structure, you will typically have the issues you are describing.
I would do the following:
1. Consider your problem: what do you CARE about?
   - Searching?
   - Insert/Delete?
   - Memory size?
2. Then decide on a data structure that best solves your problem:
   - Array
   - Linked List
   - Trees
   - Hash

Is this a good way to speed up array processing?

So, today I woke up with this single idea.
Just suppose you have a long list of things, an array, and you have to check each one of them to find the one that matches what you're looking for. To do this, you could use a for loop. Now, imagine that the one you're looking for is almost at the end of the list, but you don't know it. In that case, assuming the order in which you check the elements doesn't matter, it would be more convenient to start from the last element rather than the first, just to save some time and maybe memory. But then, what if your element is almost at the beginning?
That's when I thought: what if I could start checking the elements from both ends of the list at the same time?
So, after several tries, I came up with this raw sample code (which is written in js) that, in my opinion, would solve what we were defining above:
function fx(list) {
  var len = list.length;
  // To save some time, as we were saying, we can first check whether the array is shorter than we were expecting
  if (len === 1) {
    // If it has a single element, we just process that element
    /*
    ...
    list[0]
    ...
    */
    return;
  } else {
    // So, here's the thing: the number of iterations won't be the length of the list but just half of it (rounded up).
    for (var i = 0, j = len - 1; i <= j; i++, j--) {
      // In each iteration we process the i-th element from the front and the i-th from the back, until we reach the middle or find the one we're looking for, whichever happens first
      /*
      ...
      list[i]
      list[j]
      ...
      */
    }
  }
  return;
}
Anyway, I'm still not totally sure whether this would really speed up the process, make it slower, or make no difference at all. That's why I need your help, guys.
In your own experience, what do you think? Is this really a good way to make this kind of process faster? If it is or it isn't, why? Is there a way to improve it?
Thanks, guys.
Your proposed algorithm is good if you know that the item is likely to be at the beginning or end but not in the middle, bad if it's likely to be in the middle, and merely overcomplicated if it's equally likely to be anywhere in the list.
In general, if you have an unsorted list of n items then you potentially have to check all of them, and that will always take time which is at least proportional to n (this is roughly what the notation “O(n)” means) — there are no ways around this, other than starting with a sorted or partly-sorted list.
In your scheme, the loop runs for only n/2 iterations, but it does about twice as much work in each iteration as an ordinary linear search (from one end to the other) would, so it's roughly equal in total cost.
If you do have a partly-sorted list (that is, you have some information about where the item is more likely to be), then starting with the most likely locations first is a fine strategy. (Assuming you're not frequently looking for items which aren't in the list at all, in which case nothing helps you.)
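To make the cost argument concrete, here is a rough C sketch that counts element comparisons for an ordinary linear search and for the two-ended variant. The function names and the comparison counter are my own, added purely for illustration. On an eight-element array, an item at the last index is found after 2 comparisons instead of 8, an item at index 3 takes 7 instead of 4, and a missing item costs 8 either way.

```c
#include <stddef.h>

/* Ordinary front-to-back scan. Stores the number of element
 * comparisons in *comparisons; returns the index found, or -1. */
long linear_search(const int *a, size_t n, int target, size_t *comparisons)
{
    *comparisons = 0;
    for (size_t i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == target)
            return (long)i;
    }
    return -1;
}

/* Two-ended scan: each iteration checks one element from the front
 * and one from the back, so there are about n/2 iterations but still
 * up to n comparisons in total. */
long two_ended_search(const int *a, size_t n, int target, size_t *comparisons)
{
    *comparisons = 0;
    if (n == 0)
        return -1;
    size_t lo = 0, hi = n - 1;
    while (lo <= hi) {
        (*comparisons)++;
        if (a[lo] == target)
            return (long)lo;
        if (lo != hi) {             /* avoid checking the middle element twice */
            (*comparisons)++;
            if (a[hi] == target)
                return (long)hi;
        }
        lo++;
        if (hi == 0)                /* guard against size_t wrap-around */
            break;
        hi--;
    }
    return -1;
}
```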
If you work from both ends, then you'll get the worst performance when the item you're looking for is near the middle. No matter what you do, sequential searching is O(n).
If you want to speed up searching a list, you need to use a better data structure, such as a sorted list, hash table, or B-tree.
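As one example of the "better data structure" route, a sorted array can be searched in O(log n) with the standard library's `bsearch`, after a one-time `qsort`. This is only a sketch in C with `int` elements assumed, and the helper names are hypothetical. Whether the up-front sort pays off depends on how many lookups you do.

```c
#include <stdlib.h>

/* Comparison callback shared by qsort and bsearch. */
static int compare_ints(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);   /* avoids overflow from a - b */
}

/* One-time O(n log n) sort. */
void sort_ints(int *a, size_t n)
{
    qsort(a, n, sizeof a[0], compare_ints);
}

/* O(log n) membership test on an array already sorted by sort_ints.
 * Returns 1 if target is present, 0 otherwise. */
int contains_sorted(const int *a, size_t n, int target)
{
    return bsearch(&target, a, n, sizeof a[0], compare_ints) != NULL;
}
```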

Trimming time off an insertion sort

I need to frequently add and remove elements to and from a large number of sorted lists.
The lists are bounded in size, and relatively small -- around 100 elements or less, with each element being on the order of 32 bytes or thereabouts.
If inserting into a full list, the last element can be discarded.
I've done some simple profiling, and found that storing these lists in arrays and using memmove to move everything after the insertion point backwards works surprisingly well; better even than using a linked list of a fixed size and linking the tail into the insertion point. (Never underestimate the power of spatial locality, I guess.)
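For reference, the plain array-plus-memmove baseline described here could be sketched like this in C. The struct, the capacity macro, and the function names are hypothetical, and `int` elements stand in for the roughly 32-byte records to keep the sketch short. Inserting into a full list drops the last element.

```c
#include <stddef.h>
#include <string.h>

#define LIST_CAPACITY 100   /* hypothetical upper bound on list size */

struct sorted_list {
    int items[LIST_CAPACITY];   /* kept in ascending order */
    size_t count;
};

/* Inserts value at its sorted position. If the list is full, the last
 * element is discarded (or the new value itself, if it sorts last). */
void sorted_list_insert(struct sorted_list *list, int value)
{
    size_t pos = 0;
    while (pos < list->count && list->items[pos] <= value)
        pos++;                  /* linear scan; the lists are small */
    if (pos == LIST_CAPACITY)
        return;                 /* full, and value would be the one dropped */

    /* Index of the slot that receives the shifted tail. */
    size_t last = (list->count < LIST_CAPACITY) ? list->count : LIST_CAPACITY - 1;
    memmove(&list->items[pos + 1], &list->items[pos],
            (last - pos) * sizeof list->items[0]);
    list->items[pos] = value;
    if (list->count < LIST_CAPACITY)
        list->count++;
}

/* Removes the element at index pos by shifting the tail forward. */
void sorted_list_remove_at(struct sorted_list *list, size_t pos)
{
    if (pos >= list->count)
        return;
    memmove(&list->items[pos], &list->items[pos + 1],
            (list->count - pos - 1) * sizeof list->items[0]);
    list->count--;
}
```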
But I think I can do even better. I suspect that most of my operations will be near the top of the lists; that means I'm going to memmove the vast majority of the list every time I insert or remove. So what if I were to implement this as a sort of ring buffer, and if an operation is closer to the top than the bottom, I shift the topmost items backwards, with the old tail getting overwritten by the head? This should theoretically involve cheaper calls to memmove.
But I'm completely brain farting on implementing this elegantly. The list can now wrap around, with the head being at position k and the tail being at position (k-1)%n if the list is full. So there's the possibility of doing three operations (k is the head, m is the insertion point, n is the max list size).
memmove elements k through n-1 back one
memcpy element 0 to location n-1
memmove elements 1 through m-1 back one
I don't know if that'll be faster than one bigger memmove, but we'll see.
Anyway, I just have a gut feeling that there's a very clever and clean way to implement this through modulo arithmetic and ternary operators. But I can't think of a way to do this without multiple nested "if" statements.
If we're inserting or removing.
If we're closer to the front or the back.
If the list "wraps around" the end of the array.
If the item being inserted is in the same segment as the head, or if it needs to go in the wrapped around portion.
If the head is at element 0.
I'm sure that too much branching will doom any improvements I make with smaller memmoves. Is there a clean solution here that I just am not seeing?

Linked List - Appending node: loop or pointer?

I am writing a linked list datatype and as such I currently have the standard head pointer which references the first item, and then a next pointer for each element that points to the following one such that the final element has next = NULL.
I am just curious what the pros/cons or best practices are for keeping track of the last node. I could have a 'tail' pointer which always points to the last node making it easy to append, or I could loop over the list starting from the head pointer to find the last node when I want to append. Which method is better?
It is usually a good idea to store the tail. If we think about the complexity of adding an item at the end (if this is an operation you commonly do) it will be O(n) time to search for the tail, or O(1) if you store it.
Another option you can consider is to make your list doubly linked. This way when you want to delete the end of the list, by storing tail you can delete nodes in O(1) time. But this will incur an extra pointer to be stored per element of your list (not expensive, but it adds up, and should be a consideration for memory constrained systems).
In the end, it is all about the operations you need to do. If you never add or delete or operate from the end of your list, there is no reason to do this. I recommend analyzing the complexity of your most common operations and base your decision on that.
Depends on how often you need to find the last node, but in general it is best to have a tail pointer.
There's very little cost to just keeping and updating a tail pointer, but you have to remember to update it! If you can keep it updated, then it will make append operations much faster (O(1) instead of O(n)). So, if you usually add elements to the end of the list, then you should absolutely create and maintain a tail pointer.
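A small sketch of what maintaining the tail pointer looks like in C. The struct and function names are made up for the example. The only bookkeeping is updating `tail` on every append and handling the empty-list case.

```c
#include <stdlib.h>

/* Hypothetical node and list types for illustration. */
struct node {
    int value;
    struct node *next;
};

struct list {
    struct node *head;
    struct node *tail;   /* always the last node, or NULL when empty */
};

/* Appends in O(1) using the stored tail.
 * Returns 0 on success, -1 if allocation fails. */
int list_append(struct list *list, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = NULL;
    if (list->tail == NULL)
        list->head = n;          /* first node: head and tail coincide */
    else
        list->tail->next = n;
    list->tail = n;              /* the step that is easy to forget */
    return 0;
}

/* Frees every node and resets the list to empty. */
void list_free(struct list *list)
{
    struct node *cur = list->head;
    while (cur != NULL) {
        struct node *next = cur->next;
        free(cur);
        cur = next;
    }
    list->head = NULL;
    list->tail = NULL;
}
```

Any operation that can remove the last node would also have to update `tail`, which in a singly linked list means finding the previous node.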
If you have a doubly linked list, where every element contains a pointer both to the next and prev elements, then a tail pointer is almost universally used.
On the other hand, if this is a sorted list, then you won't be appending to the end, so the tail pointer would never be used. Still, keeping the pointer around is a good idea, just in case you decide you need it in the future.

Find the last node of a circular linked list whose size is unknown and the last node points to any other node except first node of the linked list

How can I find the last node of a circular linked list whose size I don't know, where the last node points to some node other than the first node of the list?
One algorithm that can be used for this is Floyd's cycle-finding algorithm.
Also, see this question.
By definition, if a node does not point to the first node of a circular linked list,
it is not the last node.
Can you elaborate here?
A strange list... why would you ever need something like this? But anyway...
You can simply iterate over all nodes, and stop as soon as the next node would be one you have already visited. The current node will then be your answer.
You need some way to keep track of which nodes have been visited. Add a boolean flag to each node, or use some kind of set data type with fast insertion and lookup (e.g. a hash set).
Maybe add a field to each node that tells you whether you are at the end? I don't think that would be a problem.
Otherwise, you can remember the nodes you have already visited. When the next node has already been visited, you are at the end.
The Floyd cycle algorithm won't give the last element of the list. It will only tell if there is a cycle or not.
The last element can be defined as follows: while traversing the list sequentially from the first element, every element up to and including the last one has not been seen before (by pointer value). The element after the last one is the first element that has already been seen in this scan.
An easy solution is to flag visited elements so an element already seen is easily detected. The flag may be intrusive, i.e. by changing a bit in the element, or external by using a hash table to store pointer values.
Since we need to be able to test if an element has already been visited, I don't see another solution.
I can elaborate on how to use Floyd's algorithm to solve this problem, but I don't understand the explanation for one step:
1. Have two pointers traverse the linked list, pointer 1 advancing 1 node per iteration and pointer 2 advancing 2 nodes per iteration.
2. When the pointers meet, we are inside the cycle, and pointer 1 has not yet completed a full lap of it (if the cycle has length c, pointer 2, moving at twice the speed, goes around the cycle twice in the time pointer 1 goes around once, so it catches pointer 1 before that lap is finished).
3. Because they meet before pointer 1 has fully traversed the cycle, the meeting point is d nodes from the start of the list to the start of the cycle, plus k nodes into the cycle (pos = d + k).
4. If we move pointer 1 back to position 0 and advance both pointers again (this time both at a rate of 1 node per iteration), they will meet at the start of the cycle.
5. Since we know the start of the cycle, finding the end is trivial: keep following next pointers around the cycle until the next node is the start.
I don't fully understand why step 4 is true but I had a friend explain the solution to me.
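As far as I can tell, step 4 works for the following reason. Say there are d nodes before the cycle starts, the cycle has length c, and the pointers meet k nodes into the cycle. If the slow pointer has taken s steps when they meet, the fast pointer has taken 2s, and since both are on the same node the difference s must be a whole number of laps, i.e. a multiple of c. The slow pointer sits k nodes into the cycle, so s is d + k plus some whole number of laps, which means d + k is also a multiple of c. Walking d more steps from the meeting point therefore lands exactly on the start of the cycle, and d steps is also what a pointer restarted from the head needs to get there.

Here is a sketch of the whole procedure in C. The node layout and the name `find_last_node` are hypothetical. For a NULL-terminated list it simply returns the final node.

```c
#include <stddef.h>

/* Hypothetical node layout for illustration. */
struct node {
    int value;
    struct node *next;
};

/* Returns the node whose next pointer closes the cycle, or the final
 * node if the list ends in NULL. Returns NULL for an empty list. */
struct node *find_last_node(struct node *head)
{
    struct node *slow = head;
    struct node *fast = head;

    /* Phase 1: advance at speeds 1 and 2 until they meet or fast runs out. */
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            break;
    }
    if (fast == NULL || fast->next == NULL) {
        /* No cycle: the last node is the one whose next is NULL. */
        struct node *last = head;
        while (last != NULL && last->next != NULL)
            last = last->next;
        return last;
    }

    /* Phase 2: restart one pointer at the head, move both at speed 1;
     * they meet at the first node of the cycle. */
    slow = head;
    while (slow != fast) {
        slow = slow->next;
        fast = fast->next;
    }

    /* Phase 3: walk around the cycle to the node that points back
     * to the cycle start. */
    struct node *last = slow;
    while (last->next != slow)
        last = last->next;
    return last;
}
```

This uses no extra memory and no per-node flag, but it does pass over parts of the list more than once.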
