Merged linked list in C

This question was asked to me in an interview:
We are given the heads of two linked lists, where the second list is merged into the first one at some point.
How can we identify the merging point, and what is the complexity of finding that point?
Could anybody please help?

O(n)
search = list1->header;
if (mixed->header == list1->header) search = list2->header;
node = mixed->header;
while (node->next != search) node = node->next;
Edit: new names for the variables and a few comments

/* search is what we want to find. Here it's the head of `list2` */
search = list2->header;
/* unless the merging put `list2` first; then we want to search for `list1` */
if (mixed->header == list2->header) search = list1->header;
/* assume (wrongly) that the header of the mixed list is the merge point */
mergepoint = mixed->header;
/* traverse the mixed list until we find the pointer we're searching for */
while (mergepoint->next != search) mergepoint = mergepoint->next;
/* mergepoint now points to the merge point */

Update: This assumes the Y-shaped joining of two linked lists as described better in Steve Jessop's post. But I think the description of the problem is sufficiently ambiguous that various interpretations are possible, of which this is only one.
This can be done with a single pass through one list plus a partial pass through the other. In other words, it's O(n).
Here's my proposed algorithm:
Create a hashmap. (Yes, this is busywork in C if you don't have a library handy for it).
The keys will be pointers to the items in List1 (i.e. the head pointer and each link).
The values will be integers denoting the position, i.e. distance from the head of List1.
Run through List1, keeping track of the position, and hash all your pointers and positions.
Run through List2, keeping track of the position, and find the first pointer that occurs in the hashmap.
At this point, you'll know the position in List2 of the first node common to both lists.
The hashmap entry will also contain the position in List1 of that same node.
That will nicely identify your merge point.
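As a concrete illustration of those steps, here is one minimal sketch in C. The open-addressing table, its fixed size, the pointer hash, and the struct node layout are all illustrative assumptions, not part of the answer above; the table must stay comfortably larger than List1.

#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 4096                 /* assume > 2x the length of List1 */

struct node { int data; struct node *next; };

static struct node *keys[TABLE_SIZE];   /* hashed node pointers */
static int positions[TABLE_SIZE];       /* position of each node in List1 */

static size_t slot(struct node *p)
{
    return ((uintptr_t)p >> 4) % TABLE_SIZE;   /* crude pointer hash */
}

static void put(struct node *p, int pos)
{
    size_t i = slot(p);
    while (keys[i] != NULL)             /* linear probing */
        i = (i + 1) % TABLE_SIZE;
    keys[i] = p;
    positions[i] = pos;
}

static int get(struct node *p)          /* position in List1, or -1 */
{
    size_t i = slot(p);
    while (keys[i] != NULL) {
        if (keys[i] == p)
            return positions[i];
        i = (i + 1) % TABLE_SIZE;
    }
    return -1;
}

/* Hash every node of list1 with its position, then scan list2 for
   the first node that is already in the table. */
int find_merge_positions(struct node *list1, struct node *list2,
                         int *pos1, int *pos2)
{
    int pos = 0;
    for (struct node *p = list1; p; p = p->next)
        put(p, pos++);
    pos = 0;
    for (struct node *p = list2; p; p = p->next, pos++) {
        int hit = get(p);
        if (hit >= 0) {
            *pos1 = hit;                /* merge point's position in List1 */
            *pos2 = pos;                /* merge point's position in List2 */
            return 1;
        }
    }
    return 0;                           /* the lists never meet */
}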

Do you mean you have a Y-shape, like this:
list1: A -> B -> C -> D -> E -> F
list2: X -> Y -> Z -> E -> F
Where A .. Z are singly-linked list nodes. We want to find the "merge point" E, which is defined to be the first node appearing in both lists. Is that correct?
If so, then I would attach the last node of list2 (F) to the first node of list2 (X). This turns list2 into a loop:
list2 : X -> Y -> Z -> E -> F -> X -> ...
But more importantly:
list1 : A -> B -> C -> D -> E -> F -> X -> Y -> Z -> E -> ...
This reduces the question to a previously-solved problem: finding the start of a loop in a linked list (Floyd's cycle-finding algorithm), which can be solved in O(n) time and O(1) additional storage.
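As a sketch, the reduction might look like this in C. The node type and names are assumptions, and it assumes the two lists really do share a tail; otherwise the fast pointer would run off the end of the list.

struct node { int data; struct node *next; };

struct node *find_merge_point(struct node *list1, struct node *list2)
{
    /* 1. Make list2 circular: attach its last node to its head. */
    struct node *tail = list2;
    while (tail->next)
        tail = tail->next;
    tail->next = list2;

    /* 2. Floyd's tortoise and hare, started from list1, meets
          somewhere inside the loop... */
    struct node *slow = list1, *fast = list1;
    do {
        slow = slow->next;
        fast = fast->next->next;
    } while (slow != fast);

    /* 3. ...and restarting one pointer from list1's head makes them
          meet again exactly at the loop's entry: the merge point E. */
    slow = list1;
    while (slow != fast) {
        slow = slow->next;
        fast = fast->next;
    }

    tail->next = NULL;   /* undo the temporary modification of list2 */
    return slow;
}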
But reading your question, another possibility is that by "merge" you mean "insert". So you have two lists like this:
list1: A -> B -> C
list2: D -> E -> F
and then another completely separate list:
list3: A -> B -> D -> E -> F -> C
where this time, A .. F are the values contained in the list, not the nodes themselves.
If the values are all different, you just need to search list3 for D (or for the later of D and A, if you don't know which list it was that was copied into the other). Which seems like a pointless question. If values can be repeated, then you have to check for the full sequence of list2 inside list3. But just because you find "DEF" doesn't mean that's where list2 was inserted - maybe "DEF" already occurred several times in list1 beforehand, and you've just found the first of those. For instance if I insert "DEF" into "ABCDEF", and the result is "ABCDEFDEF", then did I insert at index 3 or at index 6? There's no way to tell, so the question can't be answered.
So, in conclusion, I don't understand the question. But I might have answered it anyway.

If the question means list2 is contained in list1 (that is, list2 points somewhere into the middle of list1), then it is easy - just walk list1 and compare pointers until you reach list2's head.
However, such an interpretation does not make much sense, because by inserting list2 into list1 (like 1 1 2 2 1), you would also modify list2 - the last 1 becomes part of list2.
So I will assume the question is about the Y shape:
list1: A -> B -> C -> D -> E -> F
list2: X -> Y -> Z -> E -> F
This can be solved using a hashtable, as Carl suggested.
A solution without a hashtable would be this:
Walk list1 and disconnect all its pointers as you go
Walk list2. When it ends, you've reached the junction point
Repair the pointers in list1
Disconnecting and repairing pointers in list1 can be done easily using recursion:
Disconnect(node)
{
    if (node->next == NULL)
        walk list2 to its end; that end is the solution, remember it
    else
    {
        tmp = node->next;
        node->next = NULL;
        Disconnect(tmp);
        node->next = tmp; // repair
    }
}
Now call Disconnect(list1).
That is, recurse down list1 and disconnect the pointers as you go. When you reach the end, execute step 2 (walk list2 to find the junction), then repair the pointers while returning from the recursion.
This solution modifies list1 temporarily, so it is not thread-safe; you should use a lock around the Disconnect(list1) call.

//try this code for merge
void mergeNode(){
    node *copy, *current, *current1;
    merge = NULL;
    current = head;
    current1 = head1;

    /* copy the first list node by node */
    while (current != NULL) {
        if (merge == NULL) {
            node *tmp = (node*)malloc(sizeof(node));
            tmp->data = current->data;
            tmp->link = NULL;
            merge = tmp;
        } else {
            /* walk to the tail of the merged list and append a copy */
            copy = merge;
            while (copy->link != NULL)
                copy = copy->link;
            node *tmp = (node*)malloc(sizeof(node));
            tmp->data = current->data;
            tmp->link = copy->link;
            copy->link = tmp;
        }
        current = current->link;
    }

    /* append copies of the second list the same way */
    while (current1 != NULL) {
        copy = merge;
        while (copy->link != NULL)
            copy = copy->link;
        node *tmp = (node*)malloc(sizeof(node));
        tmp->data = current1->data;
        tmp->link = copy->link;
        copy->link = tmp;
        current1 = current1->link;
    }
    display(merge);
}

Sorry if my answer seems too simple, but if you have two linked lists which are identified by a header and you join them, so that
A -> B -> C -> D is the first list, and
1 -> 2 -> 3 -> 4 is the second, then suppose
A -> B -> C -> 1 -> 2 -> 3 -> 4 -> D is the result,
then to find the merging point you need to go through the final list until you find the second header (the 1). That takes O(n1) in the worst case, where n1 is the number of elements of the first list (the worst case happens when the second list is merged at the end).
That's how I would read the question. The reference to the C language probably means that you have no 'object' or pre-packaged data structure, unless specified.
[update] As Sebastian suggested, if the two lists above contain equal elements my solution won't work. I suspect that this is where the C language comes into play: you can search for the address of the first element of the second list (its head). Thus the duplicates objection won't hold.
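A minimal sketch of that pointer search, assuming a node type with a next field and head1/head2 as the two original heads:

node *p = head1;               /* head1 is also the head of the merged list */
while (p->next != NULL && p->next != head2)
    p = p->next;
/* if p->next == head2, p is the last node before the merge point;
   otherwise head2 never appears and p is simply the last node */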

Well, there are several approaches to solving this problem.
Note that I am only discussing the approaches [corner cases may need to be handled separately], starting from brute force and moving towards the best one.
Considering N: number of nodes in first linked list
M: number of nodes in second linked list
Approach 1:
Compare each node of the first linked list with every node of the second list. Stop when you find a matching node; this is the merging point.
while(head1)
{
    cur2 = head2;
    while(cur2)
    {
        if(cur2 == head1)
            return cur2;
        cur2 = cur2->next;
    }
    head1 = head1->next;
}
Time Complexity: O(N*M)
Space Complexity: O(1)
Approach 2:
Maintain two stacks. Push all the nodes of the first linked list onto the first stack. Repeat the same for the second linked list.
Keep popping nodes from both stacks as long as the popped nodes match. The last matching node is the merging point.
Time Complexity: O(N+M)
Space Complexity: O(N+M)
Approach 3:
Make use of a hash table. Insert all the nodes of the first linked list into the hash.
Search for the first matching node of the second list in the hash.
This is the merging point.
Time Complexity: O(N+M)
Space Complexity: O(N)
Note that the space complexity may vary depending upon the hash function used [talking about C, where you are supposed to implement your own hash function].
Approach 4:
Insert all the nodes of the first linked list [by nodes, I mean addresses] into an array.
Sort the array with some O(N log N) sorting algorithm [merge sort would do].
Now find the first node of the second linked list that occurs in the array [binary-search for each node's address].
Time Complexity: O(N log N)
Space Complexity: O(N)
Note that this approach may be better than Approach 3 [in terms of space] as it doesn't use a hash.
Approach 5:
1. Take an array of size M+N.
2. Insert each node from the first linked list, followed by inserting each node from the second linked list.
3. Search for the first repeating element [with a hash set this can be found in one scan, in O(M+N) time].
Time Complexity: O(N+M)
Space Complexity: O(N+M)
Approach 6: [A better approach]
1. Modify the first linked list & make it circular.
2. Now, starting from the head of the second linked list, find the start of the loop using Floyd's cycle-finding algorithm (the tortoise and hare; not to be confused with Floyd-Warshall).
3. Remove the loop [easily done, as we know the last node].
Time Complexity: O(N+M)
Space Complexity: O(1)
Approach 7: [Probably the best one]
1. Count the number of nodes in the first linked list [say c1].
2. Count the number of nodes in the second linked list [say c2].
3. Find the difference [let's say c1 > c2]: diff = c1 - c2.
4. Take two pointers p1 & p2, p1 pointing to the head of the first linked list & p2 pointing to the head of the second linked list.
5. Move p1 forward diff times.
6. Move both p1 & p2 one node at a time until both point to the same node.
7. p1 (or p2) indicates the merging point.
Time Complexity: O(N+M)
Space Complexity: O(1)
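A minimal sketch of this approach in C, assuming a struct node with a next field:

struct node *merge_point(struct node *head1, struct node *head2)
{
    int c1 = 0, c2 = 0;
    for (struct node *p = head1; p; p = p->next) c1++;
    for (struct node *p = head2; p; p = p->next) c2++;

    /* advance the pointer into the longer list by the difference */
    struct node *p1 = head1, *p2 = head2;
    for (; c1 > c2; c1--) p1 = p1->next;
    for (; c2 > c1; c2--) p2 = p2->next;

    /* both pointers are now the same distance from the end */
    while (p1 != p2) {
        p1 = p1->next;
        p2 = p2->next;
    }
    return p1;   /* the merging point, or NULL if the lists never meet */
}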

The trivial solution is obviously O(N+M). Hmm... what could be better? You can go from the start to the end of a list, or vice versa. With threads you could traverse in both directions at the same time, which would be at most a constant factor quicker.

Related

Deduplication optimization

The problem is as follows. I want a function that, given a list and a maximum number of occurrences x, deletes all elements of the list that appear x or more times.
I found a pretty straightforward solution, which is to run the check for each of the elements. That said, repeating the find and delete functions many times seems computationally suboptimal to me.
I was wondering whether you could provide a better algorithm (I excluded allocating memory for an array spanning the min to the max value... just too much for the task; say you have a few very big numbers and your memory won't cover it).
My code follows.
typedef struct n_s
{
    int val;
    struct n_s *next;
} n_t;

// deletes all elements equal to del in list with head h
n_t *delete(n_t *h, int del);

// returns the first occurrence of find in list with head h, otherwise gives NULL
n_t *find(n_t *h, int find);

n_t *
delFromList(n_t *h, int x)
{
    int val;
    n_t *el, *posInter;

    // empty list case
    if (h == NULL)
        return NULL;

    // first element
    val = h->val;
    if ((posInter = find(h->next, val))
            && (find(posInter->next, val)))
        h = delete(h, val);

    // loop from second element
    el = h;
    while (el->next)
    {
        val = el->next->val;
        // check whether you want to delete the next one,
        // and then if you do so, check again on the "new" next one
        if ((posInter = find(el->next->next, val))
                && (find(posInter->next, val)))
            el->next = delete(el->next, val);
        // in case you did not delete the next node, you can move on
        else
            el = el->next;
    }
    return h;
}
I know that the el->next->next may look confusing, but I find it less intuitive to use variables such as "next", "past"... so, sorry for your headache.
One option for an algorithm with improved performance is:
Define a data structure D with two members, one for the value of a list element and one to count the number of times it appears.
Initialize an empty balanced tree ordered by value.
Iterate through the list. For each item, look its value up in the tree. If it is not present, insert a D structure into the tree with its value member copied from the list element and its count set to one. If it is present, increment its count. If the count equals or exceeds the threshold, remove the item from the list.
(Note that when a count first reaches the threshold, the earlier occurrences of that value have already been passed; a second pass over the list using the final counts removes those too, without changing the asymptotic cost.)
Lookups and insertions in a balanced tree are O(log n). A list of n items requires n of them, and deletions from a linked list are O(1). So the total time is O(n log n).
Use a counting map to count the number of times each element appears. The keys are the elements, and the values are the counts.
Then go through your list a second time, deleting anything whose count meets your threshold.
O(n) time, O(n) extra space.
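A sketch of the counting-map idea in C, reusing the question's n_t type. The chained hash table, its bucket count, and the hash constant are illustrative choices, and the table entries are leaked for brevity:

#include <stdlib.h>

#define BUCKETS 1024

typedef struct count_entry
{
    int val, count;
    struct count_entry *next;
} count_entry;

static count_entry *buckets[BUCKETS];

// returns a pointer to the counter for val, creating it at 0 if absent
static int *count_of(int val)
{
    unsigned b = ((unsigned)val * 2654435761u) % BUCKETS;
    for (count_entry *e = buckets[b]; e; e = e->next)
        if (e->val == val)
            return &e->count;
    count_entry *e = malloc(sizeof *e);
    e->val = val;
    e->count = 0;
    e->next = buckets[b];
    buckets[b] = e;
    return &e->count;
}

// first pass counts every value; second pass unlinks nodes whose
// value appears x or more times -- O(n) expected time overall
n_t *delFromListCounting(n_t *h, int x)
{
    for (n_t *p = h; p; p = p->next)
        (*count_of(p->val))++;

    while (h && *count_of(h->val) >= x)   // strip deleted values off the head
    {
        n_t *dead = h;
        h = h->next;
        free(dead);
    }
    for (n_t *p = h; p && p->next; )
    {
        if (*count_of(p->next->val) >= x)
        {
            n_t *dead = p->next;
            p->next = dead->next;
            free(dead);
        }
        else
            p = p->next;
    }
    return h;
}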

Efficiency of an unsorted vs sorted linked list in C

For a programming project, I created two linked list programs: an unsorted linked list and a sorted linked list. The unsorted linked list program adds values to the end of the list as long as the value is not found in the list. If the value is found in the list, the node containing the value is removed. The only difference in the sorted linked-list program is that if a value is not found in the list, instead of just adding the value to the end, the program looks for the proper place to insert the value so that the repository is consistently maintained in sorted order.
I have a "stepcounter" variable that increments each time a pointer in my program is reassigned to point to a different node, even during traversal of the list. I output this variable to the screen to get an idea of the efficiency of my program. What's strange is that if I run the same operations on the sorted list and on the unsorted list, the number of steps, or effort, of the unsorted list is MORE than that of the sorted list. This seems very counter-intuitive to me, but I looked through my code and I'm pretty sure I incremented the counter in all the same places, so I can't come up with an explanation for why the unsorted list operations would take more steps than the sorted ones. Is there something I'm missing?
If you are really keeping track of pointer assignments, then a walk like

while (p && (p->value != input) && (p->next != NULL))
    p = updatePointer(p->next);

(assuming updatePointer takes care of your counting) performs one of those assignments for each node you examine.
To know whether an item is in the unsorted list you have to look at every node in the list. (That is, you have to use the code above.)
To do the same thing on the sorted list you only have to keep looking until you pass the place where the item in question would have been. This implies code like

while (p && (p->value < input) && (p->next != NULL)) {
    p = updatePointer(p->next);
}
if (p->value == input) //...

Assuming randomly distributed (i.e. unordered) input, you expect the second case to require about half as many comparisons.
Suppose you have 1000 data items to insert into both lists, in pure random order, with values from 1 up to 1000.
Additionally, suppose both lists are already filled with 500 data items, in pure random order for the unsorted list and in sorted order for the sorted list.
For the unsorted list you have to check every item to find possible duplicates, which means one pointer step forward for each visited node.
For the sorted list you only have to search forward until the first element with a greater value appears in the list.
The chance of a hit is 50% when 1000 elements are inserted into a list already filled with 500 items spanning the total value range of 1 to 1000.
So 50% of all operations are replacements rather than inserts, and each of those makes the unsorted list check more items than the sorted list.
Insertion itself is cheaper in the unsorted list (1 step instead of 4 steps).

Find the common nodes of two intersecting linked lists

I was asked in an interview: if we have two linked lists which intersect in more than one node, how can we find the common nodes where the linked lists meet? Also, find the solution with minimum complexity.
e.g.
Linked List 1 = 11->12->13->14->15->16->17->54
Linked List 2 = 23->24->13->26->14->15->35->16->45
I answered that we can store the addresses of one linked list's nodes in a hashmap and compare every node's address in the second list against the hashmap. This way we can achieve O(n) complexity. But the interviewer was not satisfied.
Please suggest any better solution. Thanks in advance.
It can be achieved in a better way: if two linked lists intersect at some node, first traverse both lists to find the length of each. Then advance the pointer of the longer list by the difference between the two lengths, and finally move both pointers simultaneously; the node at which the pointers become equal is the intersection point.
Given two singly linked lists, find if they are intersecting. Do this in a single iteration.
a. Traverse list1 and find its last element.
b. Traverse list2 and find its last element.
c. Check if the last element of list1 == the last element of list2; if they are equal the lists intersect, else they don't.
Here we have parsed each list only once :-)
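A minimal sketch of that tail check in C (the node type is assumed); it works because two lists that intersect share their entire tail:

struct node { int data; struct node *next; };

int lists_intersect(struct node *l1, struct node *l2)
{
    if (l1 == NULL || l2 == NULL)
        return 0;
    while (l1->next) l1 = l1->next;   /* last node of list1 */
    while (l2->next) l2 = l2->next;   /* last node of list2 */
    return l1 == l2;                  /* same last node <=> intersecting */
}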
Also, find the intersecting node in O(n) time and O(1) space.
Here we are asked to do it in O(1) space, so we need to use only one variable :-)
a. Create a variable (int) diff = 0.
b. Parse list1 and increment diff for each node.
c. Parse list2 and decrement diff for each node.
d. If diff > 0, list1 is bigger, so advance the pointer of list1 diff times;
else list2 is bigger, so advance the pointer of list2 abs(diff) times.
e. Now move both pointers together, checking whether they are equal, until we reach the end.
If the values are integers and you have unlimited memory, you can do the following:
Traverse each list once and find the global maximum value MAX
Allocate a boolean array A of size MAX + 1
Traverse one list; for each value X in the list set A[X] = true
Traverse the second list; for each value Y in the list, if A[Y] = true then Y is a list intersection
This runs in O(N) time (which I believe you can't do better, as the lists are not sorted)
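A sketch of this in C, assuming non-negative int values in a data field. Note that it reports common values, which coincide with common nodes only when values are unique:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct node { int data; struct node *next; };

void print_common_values(struct node *l1, struct node *l2)
{
    /* pass 1: find the global maximum value MAX */
    int max = 0;
    for (struct node *p = l1; p; p = p->next)
        if (p->data > max) max = p->data;
    for (struct node *p = l2; p; p = p->next)
        if (p->data > max) max = p->data;

    /* pass 2: flag every value occurring in the first list */
    bool *seen = calloc(max + 1, sizeof *seen);
    for (struct node *p = l1; p; p = p->next)
        seen[p->data] = true;

    /* pass 3: report values of the second list that were flagged */
    for (struct node *p = l2; p; p = p->next)
        if (seen[p->data])
            printf("%d is a common value\n", p->data);
    free(seen);
}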
My suggested solution:
Create a hashmap.
Iterate the first list, and for each element do:
hashMap.put({value}, "firstList");
In O(n) you get a map of the elements.
Iterate the second list, and for each element ask:
hashMap.containsKey({number}) ?
If so, intersectionCounter++;
The best is O(N), I think.
This is what I would do.

Splitting a linked list in two parts

Given a list, split it into two sublists — one for the front half, and one for the back half. If the number of elements is odd, the extra element should go in the front list. So FrontBackSplit() on the list {2, 3, 5, 7, 11} should yield the two lists {2, 3, 5} and {7, 11}.
My code is this:
void FrontBackSplit(Node *head, Node **front, Node **back) {
    if (!head) return; // Handle empty list

    Node *front_last_node;
    Node *slow = head;
    Node *fast = head;

    while (fast) {
        front_last_node = slow;
        slow = slow->next;
        fast = (fast->next) ? fast->next->next : NULL;
    }

    front_last_node->next = NULL; // ends the front sublist
    *front = head;
    *back = slow;
}
The problem is that I am not getting the best run-time, and sometimes not the expected output.
Generally, your code works well for even-sized lists. Consider a list of 4 elements A -> B -> C -> D -> NULL and take a look at your algorithm trace.
Start:        A (slow, fast, head)       B                     C         D    NULL
Iteration 1:  A (front_last_node, head)  B (slow)              C (fast)  D    NULL
Iteration 2:  A (head)                   B (front_last_node)   C (slow)  D    NULL (fast)
Then you erase the link B->C and return two lists: A -> B and C -> D. This is exactly the wanted behavior of this function, isn't it?
There is another way:
Use a loop to calculate how many elements (n) are in the linked list.
If n is even, the first list runs from the head (the address of the first element) to element i = n/2, and the second list runs from element i = n/2 + 1 to the last one.
Otherwise, the first list runs from the head to element i = (n+1)/2, and the second runs from element i = (n+1)/2 + 1 to the last one.
To make it easy, while running the loop that counts the elements, use variables to keep track of the last and the middle elements, so it is easy to use them when needed.
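A minimal sketch of this counting approach in C, reusing the question's Node type; cutting after element (n+1)/2 keeps the extra element in the front list:

void FrontBackSplitByCount(Node *head, Node **front, Node **back) {
    int n = 0;
    for (Node *p = head; p; p = p->next)
        n++;

    *front = head;
    if (n < 2) {
        *back = NULL;
        return;
    }

    Node *p = head;
    for (int i = 1; i < (n + 1) / 2; i++)   /* stop at the front's last node */
        p = p->next;
    *back = p->next;
    p->next = NULL;                         /* ends the front sublist */
}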

finding longest path in an adjacency list

I have an adjacency list I have created for a given graph with nodes and weighted edges. I am trying to figure out what the best way would be to find the longest path within the graph. I have a topological sort method, which I've heard can be useful, but I am unsure how to implement it to find the longest path. So is there a way to accomplish this using topological sort, or is there a more efficient method?
Here is an example of my output for the adjacency list (the value in parentheses is the cost of the edge to the node after the arrow, i.e. (cost)->node):
Node 0 (4)->1(9)->2
Node 1 (10)->3
Node 2 (8)->3
Node 3
Node 4 (3)->8(3)->7
Node 5 (2)->8(4)->7(2)->0
Node 6 (2)->7(1)->0
Node 7 (5)->9(6)->1(4)->2
Node 8 (6)->9(5)->1
Node 9 (7)->3
Node 10 (12)->4(11)->5(1)->6
Bryan already answered your question above, but I thought I could go in more depth.
First, as he pointed out, this problem is only easily solvable if there are no cycles. If there are cycles you run into the situation where you have infinitely long paths. In that case, you might define a longest path to be any path with no repeated nodes. Unfortunately, this problem can be shown to be NP-Hard. So instead, we'll focus on the problem which it seems like you actually need to solve (since you mentioned the topological sort)--longest path in a Directed Acyclic Graph (DAG). We'll also assume that we have two nodes s and t that are our start and end nodes. The problem is a bit uglier otherwise unless you can make certain assumptions about your graph. If you understand the text below, and such assumptions in your graphs are correct, then perhaps you can remove the s and t restrictions (otherwise, you'll have to run it on every pair of vertices in your graph! Slow...)
The first step in the algorithm is to topologically order the vertices. Intuitively this makes sense. Say you order them from left to right (i.e. the leftmost node will have no incoming edges). The longest path from s to t will generally start from the left and end on the right. It's also impossible for the path to ever go in the left direction. This gives you a sequential ordering to generate the longest path--start at the left and move right.
The next step is to sequentially go left to right and define the longest path for each node. For any node that has no incoming edges, the longest path to that node is 0 (this is true by definition). For any node with incoming edges, recursively define the longest path to that node to be the maximum over all incoming edges + the longest path to get to the "incoming" neighbor (note that this number might be negative, if, for example, all of the incoming edges are negative!). Intuitively this makes sense, but the proof is also trivial:
Suppose our algorithm claims that the longest path to some node v is d but the actual longest path is some d' > d. Pick the "least" such node v (we use the ordering as defined by the topological sort. In other words, we pick the "left-most" node that our algorithm failed at. This is important so that we can assume that our algorithm has correctly determined the longest path for any nodes to the "left" of v). Define the length of the hypothetical longest path to be d' = d_1 + e where d_1 is the length of the hypothetical path up to a node v_prev with edge e to v (note the sloppy naming. The edge e also has weight e). We can define it as such because any path to v must go through one of its neighbors which have an edge going to v (since you can't get to v without getting there via some edge that goes to it). Then d_1 must be the longest path to v_prev (else, contradiction. There is a longer path which contradicts our choice of v as the "least" such node!) and our algorithm would choose the path containing d_1 + e as desired.
To generate the actual path you can figure out which edge was used. Say you've reconstructed the path up to some vertex v which has longest path length d. Then go over all incoming vertices and find the one with longest path length d' = d - e where e is the weight of the edge going into v. You could also just keep track of the parents' of nodes as you go through the algorithm. That is, when you find the longest path to v, set its parent to whichever adjacent node was chosen. You can use simple contradiction to show why either method generates the longest path.
Finally some pseudocode (sorry, it's basically in C#. This is a lot messier to code in C without custom classes and I haven't coded C in a while).
public List<Node> FindLongestPath(Graph graph, Node start, Node end)
{
    var longestPathLengths = new Dictionary<Node, int>();
    var orderedNodes = graph.Nodes.TopologicallySort();

    // Remove any nodes that are topologically less than start.
    // They cannot be in a path from start to end by definition
    while (orderedNodes.Pop() != start);

    // Push it back onto the top of the stack
    orderedNodes.Push(start);

    // Do algorithm until we process the end node
    while (true)
    {
        var node = orderedNodes.Pop();
        if (node.IncomingEdges.Count() == 0)
        {
            longestPathLengths.Add(node, 0);
        }
        else
        {
            var longestPathLength = int.MinValue;
            foreach (var incomingEdge in node.IncomingEdges)
            {
                var currPathLength = longestPathLengths[incomingEdge.Parent] +
                                     incomingEdge.Weight;
                if (currPathLength > longestPathLength)
                {
                    longestPathLength = currPathLength;
                }
            }
            longestPathLengths.Add(node, longestPathLength);
        }
        if (node == end)
        {
            break;
        }
    }

    // Reconstruct path. Go backwards until we hit start
    var current = end;
    var longestPath = new List<Node> { end };
    while (current != start)
    {
        foreach (var incomingEdge in current.IncomingEdges)
        {
            if (longestPathLengths[incomingEdge.Parent] ==
                longestPathLengths[current] - incomingEdge.Weight)
            {
                longestPath.Insert(0, incomingEdge.Parent);
                current = incomingEdge.Parent;
                break;
            }
        }
    }
    return longestPath;
}
Note that this implementation is not particularly efficient, but hopefully it's clear! You can optimize in a lot of small ways that should be obvious as you think through the code/implementation. Generally, if you store more stuff in memory, it'll run faster. The way you structure your Graph is also critical. For instance, it didn't seem like you had an IncomingEdges property for your nodes. But without that, finding the incoming edges for each node is a pain (and is not performant!). In my opinion, graph algorithms are conceptually different from, say, algorithms on strings and arrays, because the implementation matters so much! If you read the wiki entries on graph algorithms you'll find they often give three or four different runtimes based on different implementations (with different data structures). Keep this in mind if you care about speed.
Assuming your graph has no cycles (otherwise the longest path becomes a vague concept), you can indeed use a topological sort. Walk the topological order, and for each node compute its longest distance from a source node by looking at all its predecessors and adding the weight of the connecting edge to their distance. Then choose the predecessor that gives you the longest distance for this node. The topological sort guarantees that all of a node's predecessors have their distances already correctly determined.
If, in addition to the length of the longest path, you also want the path itself, then start at the node that gave the longest length and look at all its predecessors to find the one that produced this length. Repeat this process until you have found a source node of the graph.
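For comparison with the C# pseudocode above, here is a compact C sketch of the same dynamic program. It assumes the topological order has already been computed into an array; the fixed-size globals and the edge struct are illustrative, not from the question:

#define MAXN 128

typedef struct edge { int to, weight; struct edge *next; } edge;

edge *adj[MAXN];   /* adjacency lists, one per node */
int  topo[MAXN];   /* node ids in topological order (precomputed) */
int  dist[MAXN];   /* longest distance from any source to each node */
int  pred[MAXN];   /* predecessor of each node on that longest path */

void longest_paths(int n)
{
    for (int i = 0; i < n; i++) {
        dist[i] = 0;     /* a node with no predecessors has distance 0 */
        pred[i] = -1;
    }

    /* relax each node's outgoing edges in topological order; every
       predecessor of a node is final before the node is processed */
    for (int i = 0; i < n; i++) {
        int u = topo[i];
        for (edge *e = adj[u]; e; e = e->next) {
            if (dist[u] + e->weight > dist[e->to]) {
                dist[e->to] = dist[u] + e->weight;
                pred[e->to] = u;
            }
        }
    }

    /* the longest path in the graph ends at the node with maximal
       dist[]; follow pred[] backwards to recover the path itself */
}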
