Counting the number of shortest paths in a directed graph - C

My homework is counting the number of shortest paths from S to every other vertex in a directed graph, using the C language.
The graph is given as a txt file like this:
3 // number of vertices in G
{2,3},{1},{} // the first {} lists the neighbors of V1, the second of V2, and so on
and I have to print an array with the number of shortest paths from S.
The algorithm I use is like BFS with some additions:
numOfShortest(G, s)
  for each vertex x in V - {s}
    do color[x] = white, d[x] = 0, f[x] = 0
  color[s] = gray, d[s] = 0, f[s] = 1
  let Q be a queue, enqueue(Q, s)
  while Q is not empty
    do u = dequeue(Q)
       for each vertex v in N(u) // for every neighbor of u
         do if color[v] = white
              then color[v] = gray, d[v] = d[u] + 1
                   f[v] = f[v] + f[u] // v has at least the same number of paths as u
                   enqueue(Q, v)
            else if color[v] = gray
              then if d[u] < d[v] // in BFS this means d[v] = d[u] + 1
                     then f[v] = f[v] + f[u]
       color[u] = black // when finished with every N(u)
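A minimal C sketch of this pseudocode (the -1-terminated neighbor arrays and the plain array queue are assumptions of this sketch; the planned dynamic-array Vertex struct and linked-list queue can be swapped in):

```c
#include <assert.h>
#include <stdlib.h>

enum { WHITE, GRAY, BLACK };

/* Counts the shortest paths from s to every vertex.
   adj[u] is a -1-terminated array of u's neighbors (an assumption of this
   sketch; the question plans a dynamic array inside a Vertex struct). */
void num_of_shortest(int n, const int *const *adj, int s, int *d, long long *f)
{
    int *color = malloc(n * sizeof *color);
    int *queue = malloc(n * sizeof *queue); /* array queue; each vertex enters once */
    int head = 0, tail = 0;

    for (int x = 0; x < n; x++) { color[x] = WHITE; d[x] = 0; f[x] = 0; }
    color[s] = GRAY; d[s] = 0; f[s] = 1;
    queue[tail++] = s;

    while (head < tail) {
        int u = queue[head++];
        for (const int *p = adj[u]; *p != -1; p++) {
            int v = *p;
            if (color[v] == WHITE) {
                color[v] = GRAY;
                d[v] = d[u] + 1;
                f[v] += f[u];                 /* v inherits u's path count */
                queue[tail++] = v;
            } else if (color[v] == GRAY && d[v] == d[u] + 1) {
                f[v] += f[u];                 /* more equally short paths */
            }
        }
        color[u] = BLACK;                     /* finished with every N(u) */
    }
    free(color);
    free(queue);
}
```

On the diamond graph 0→1, 0→2, 1→3, 2→3 this yields f[3] = 2, since two distinct shortest paths of length 2 reach vertex 3.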
Now I have to take a few things into account (correct me if I'm wrong):
implement a queue using a linked list
make a struct called Vertex for each v which contains its neighbors (using a dynamic array)
somehow scan the neighbors written in the file into the neighbors array of the struct Vertex
Perhaps I took the preparations too far and there is a simpler way to do it; things got a bit messy in my mind.
Thanks to whoever can help.

You should start by having a look at Dijkstra's algorithm to get the shortest path from one vertex S to every other vertex in the graph (though for an unweighted graph like this one, plain BFS already gives the shortest distances).
Then mixing it with a BFS-like algorithm will help you count the paths you want.

You can use a 2D array to store the entire graph.
Let int a[N][N] be the 2D array.
As this is your assignment I am not going to give you the code, but I can give you a way to store the graph.
First assign a[i][j] = 0 for every pair; this means j is not a neighbor of i.
Then read the number of nodes into a variable and use a loop to take the neighbors in sequence, saving them in the array. For your input file:
NumOfNode = 3
a[1][2] = 1;
a[1][3] = 1;
a[2][1] = 1;
After that, in your algorithm, if a[i][j] is 1 then there is an edge from i to j; if a[i][j] is 0 then there is no edge from i to j.
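For instance, reading one neighbor line of the file format from the question into such a matrix might look like this (a sketch; the function name parse_neighbors and the 0-based matrix indexing are my own choices):

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Fills the n*n matrix a from a line like "{2,3},{1},{}".
   Vertices are 1-based in the file; the matrix is 0-based here
   (a choice made for this sketch). Returns 0 on success, -1 on bad input. */
int parse_neighbors(const char *line, int n, int a[][n])
{
    memset(a, 0, sizeof(int) * n * n);
    int i = 0;                           /* current source vertex */
    const char *p = line;
    while (*p != '\0' && i < n) {
        if (*p == '{') {
            p++;                         /* step inside the braces */
            while (*p != '\0' && *p != '}') {
                if (isdigit((unsigned char)*p)) {
                    char *endp;
                    int j = (int)strtol(p, &endp, 10);
                    p = endp;
                    if (j < 1 || j > n) return -1;
                    a[i][j - 1] = 1;     /* mark j as a neighbor of vertex i+1 */
                } else {
                    p++;                 /* skip commas inside the braces */
                }
            }
            if (*p != '}') return -1;    /* unterminated brace group */
            i++;                         /* closing brace: next vertex */
        }
        p++;
    }
    return i == n ? 0 : -1;
}
```

After this, a[i][j] == 1 exactly when the file listed j+1 as a neighbor of vertex i+1.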

Related

Given a DAG, the length of the longest path and the node in which it ends, how do I retrace my steps so I can print each node of the longest path?

I'm working on a problem of finding the most parallelepipeds that can be stored into each other given a list of parallelepipeds.
My approach was to represent the graph with an adjacency list, do a topological sort and then for each node in the topological array "unrelax" the edges, giving me the longest path.
Below is the code but I don't think it matters for the question.
typedef struct Edge {
    int src;            /* source node */
    int dst;            /* destination node */
    struct Edge *next;
} Edge;

int maxend; /* node in which the longest path ends */
int mp;     /* length of the longest path */

for (int i = 0; i < G.n; i++)
{
    int j = TA[i];                  /* TA is the topologically sorted array */
    if (G.edges[j] != NULL)
    {
        if (DTA[j] == -1) DTA[j] = 0;
        Edge *tmp = G.edges[j];
        while (tmp != NULL)
        {
            /* DTA is the array that keeps track of the maximum distance
               of each node in TA */
            if (DTA[tmp->src] >= DTA[tmp->dst]) {
                DTA[tmp->dst] = DTA[tmp->src] + 1;
                if (DTA[tmp->dst] > mp) {
                    mp = DTA[tmp->dst];
                    maxend = tmp->dst;
                }
            }
            tmp = tmp->next;
        }
    }
}
In the end I have the length of the longest path and the node in which that path ends, but how do I efficiently recreate the path?
If parallelepiped A contains parallelepiped B and parallelepiped B contains parallelepiped C, then A contains C as well, which means that each edge has a weight of 1 and the vertex where the longest path starts has the furthest node of the path in its adjacency list.
I've thought of 3 solutions but none of them look great.
Iterate over the edges of each vertex that has weight 0 (so no predecessors) and, if there is a choice, avoid choosing the edge that connects it with the furthest node (as said before, the shortest path between the starting node and the ending node would be 1).
In the array that tracks the maximum distance of each node in the topologically sorted array: start from the index representing the furthest node we found and see if the previous node has a compatible distance (that is, a distance one less than the furthest node's). If it does, check its adjacency list to see if the furthest node is in it (because if the furthest node has a distance of 10, there could be several nodes with a distance of 9 that are unconnected to it). Repeat until we reach the root of the path.
Most promising candidate so far: create an array of pointers that keeps track of the "maximum" parent of each node. In the code above, every time a node has its maximum distance changed, it means the new parent reached it over a longer distance than the previous parent did, so we can update the maximum parent associated with the current node.
Edit: I ended up just allocating a new array, and every time I updated the weight of a node ( DTA[tmp->src] >= DTA[tmp->dst] ) I also stored the number of the source vertex in the cell of the destination vertex.
I am assuming the graph edge u <- v indicates that box u is big enough to contain v.
I suggest you dump the topological sort. Instead:
SET weight of every edge to -1
LOOP
    LOOP over leaf nodes ( out degree zero, box too small to contain others )
        Run Dijkstra algorithm ( gives longest path, with predecessors )
        Save length of longest path, and path itself
    SAVE longest path
    REMOVE nodes on longest path from graph
    IF all nodes gone from graph
        OUTPUT saved longest paths ( lists of nested boxes )
        STOP
This is called a "greedy" algorithm. It is not guaranteed to give the optimal result, but it is fast and simple, always gives a reasonable result, and often gives the optimal one.
I think this solves it, unless there's something I don't understand.
The highest-weighted path in a DAG is equivalent to the lowest-weighted path if you make the edge weights negative. Note that Dijkstra's algorithm itself requires non-negative weights, but in a DAG you can relax the edges in topological order instead.
A longest path between two given vertices s and t in a weighted graph
G is the same thing as a shortest path in a graph −G derived from G by
changing every weight to its negation.
This might even be a special case of Dijkstra that is simpler... not sure.
To retrieve the longest path, you start at the end and go backwards:
Start at the vertex V_max with the greatest DTA value
Find the edges that end at V_max (edge->dst == V_max)
Among their sources, find a vertex Src_max whose DTA value is one less than the max (DTA[Src_max] == DTA[V_max] - 1)
Repeat this recursively until there are no more source vertices
To make that a little more efficient, you can reverse the endpoints of the edges on the way down and then follow the path back to the start. That way each reverse step is O(1).
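Putting a predecessor array together with the relaxation loop from the question, a sketch might look like this (the 5-vertex DAG, adj, and TA below are made up for illustration):

```c
#include <assert.h>

#define N 5

/* A made-up example DAG: adjacency lists as -1-terminated arrays. */
static const int adj[N][N] = {
    {1, 2, -1},  /* 0 -> 1, 0 -> 2 */
    {3, -1},     /* 1 -> 3 */
    {3, -1},     /* 2 -> 3 */
    {4, -1},     /* 3 -> 4 */
    {-1}         /* 4 has no outgoing edges */
};

/* TA: a topological order of the DAG above (assumed precomputed). */
static const int TA[N] = {0, 1, 2, 3, 4};

/* Relaxes the edges in topological order, recording each node's best
   predecessor. Returns the longest-path length; fills dta, pred and *end. */
int longest_path(int dta[N], int pred[N], int *end)
{
    int mp = 0;
    *end = TA[0];
    for (int i = 0; i < N; i++) { dta[i] = 0; pred[i] = -1; }
    for (int i = 0; i < N; i++) {
        int u = TA[i];
        for (int k = 0; adj[u][k] != -1; k++) {
            int v = adj[u][k];
            if (dta[u] + 1 > dta[v]) {
                dta[v] = dta[u] + 1;
                pred[v] = u;                 /* remember the best parent */
                if (dta[v] > mp) { mp = dta[v]; *end = v; }
            }
        }
    }
    return mp;
}
```

The path is then read off backwards: for (int v = end; v != -1; v = pred[v]) visits 4, 3, 1, 0 here.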
I think option 3 is the most promising. You can search for the longest path with DFS starting from all the root vertices (those without incoming edges), increasing the 'max distance' for each vertex encountered.
This is quite a simple solution, but it may traverse some paths more than once. For example, for edges (a,f), (b,c), (c,f), (d,e), (e,c)
a------------f
/
b----c--/
/
d--e--/
(all directed rightwards)
the starting vertices are a, b, and d; the edge (c,f) will be traversed twice and the distance of vertex f will be updated three times. If we append the rest of the alphabet to f in a simple chain,
a------------f-----g-- - - ---y---z
/
b----c--/
/
d--e--/
the whole chain from f to z will probably be traversed three times, too.
You can avoid this by separating the phases and modifying the graph between them: after finding all the starting vertices (a, b, d), increment the distance of each vertex reachable from them (f, c, e), then remove the starting vertices and their edges from the graph - and re-iterate as long as some edges remain.
This will transform the example graph after the first step like this:
f-----g-- - - ---y---z
/
c--/
/
e--/
and we can see that all the junction vertices (c and f) will wait until the longest path to them is found before letting the analysis continue past them.
That needs iterative searching for starting vertices, which may be time-consuming unless you do some preprocessing (for example, counting the incoming edges of each vertex and storing the vertices in some sorted data structure, like an integer-indexed multimap or a simple min-heap).
The question remains open whether the whole overhead of truncating the graph and rescanning it for new root vertices is a net gain compared with traversing some final parts of common paths multiple times in your particular graph...

BFS for m-ary tree - C

I'm taking a course in C this term, and I have an assignment which deals mainly with pointers - building an m-ary tree.
Some description:
We receive as command-line arguments: the file name of a text file and two numbers which represent the keys of two vertices in the graph (I will explain later what we have to do with these two vertices).
The first line of the text is the total number of vertices in the graph; the next line could, for example, contain numbers like "2 5", which means that vertices 2 and 5 are children of the vertex with key 0; the next line may contain "6 0", which says that the vertex with key 1 is the father of vertices 6 and 0, and so on...
If some line contains only '-', then that vertex is a leaf.
This part actually deals with parsing and defining a suitable structure for a vertex, and I have already done that (but I have to take care of corner cases later on...).
Now my problem begins. We have to: find the number of edges in the tree in O(1) time complexity; find the root in O(n) time complexity (where n is the number of vertices); find the simple shortest path between the two vertices we received on the command line (I think it can also be done with BFS) in O(n^2) time complexity; find the minimal and maximal heights of the tree; and find the diameter of the tree in O(n^2) time complexity.
To implement it, we have to use BFS and we can use their implementation of queue.
Here is my vertex struct:
typedef struct Vertex {
size_t key;
unsigned int amountOfNeighbors; // The current amount of neighbors
unsigned int capacity; // The capacity of the neighbors (It's updating during run-time)
struct Vertex* parent;
struct Vertex** neighbors; // The possible parent and children of a vertex
} Vertex;
I have gone through the pseudo-code of BFS and it uses the idea of the next and previous vertices of a vertex - a concept which is not used in my implementation, and I really don't know how to work it into my code properly...
Secondly, I have no idea how to calculate the number of edges in the tree in O(1) - it seems impossible, as it requires going through all the vertices at least once, which is O(n)...
So I actually need help adjusting the BFS algorithm to my needs, and finding a way to calculate the number of edges in constant time complexity.
Thanks in advance!
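For what it's worth, a BFS over a Vertex struct like the one above can be sketched with a distance array doing double duty as the color array (the plain array queue below is a stand-in for the required linked-list queue). On the edge count: a tree with n vertices always has exactly n-1 edges, which is where the O(1) answer comes from once n is read from the first line.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Vertex {
    size_t key;
    unsigned int amountOfNeighbors; /* current number of neighbors */
    unsigned int capacity;          /* capacity of the neighbors array */
    struct Vertex *parent;
    struct Vertex **neighbors;      /* the possible parent and children */
} Vertex;

/* BFS from root, filling dist[] (indexed by key) with edge distances.
   dist doubles as the color array: -1 means white (unvisited). */
void bfs(Vertex *root, size_t n, int *dist)
{
    Vertex **queue = malloc(n * sizeof *queue); /* stand-in array queue */
    size_t head = 0, tail = 0;

    for (size_t i = 0; i < n; i++) dist[i] = -1;
    dist[root->key] = 0;
    queue[tail++] = root;

    while (head < tail) {
        Vertex *u = queue[head++];
        for (unsigned int i = 0; i < u->amountOfNeighbors; i++) {
            Vertex *v = u->neighbors[i];
            if (dist[v->key] == -1) {       /* not visited yet */
                dist[v->key] = dist[u->key] + 1;
                queue[tail++] = v;
            }
        }
    }
    free(queue);
}
```

Because the neighbors array mixes the parent and the children, the visited check is what stops BFS from walking back up the tree; the shortest path between the two command-line vertices then falls out of one BFS from either of them.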

Optimizing a method to find the most traversed edge given an adjacency graph and several traversals

I am given N vertices of a tree and its corresponding adjacency graph represented as an N by N array, adjGraph[N][N]. For example, if (1,3) is an edge, then adjGraph[0][2] == 1. Otherwise, adjGraph[i][j] == 0 for (i,j)s that are not edges.
I'm given a series of inputs in the form of:
1 5
which denote that a path has been traversed starting from vertex 1 to vertex 5. I wish to find the edge that was traversed the most times, along with the number of times it was traversed. To do this, I have another N by N array, numPass[N][N], whose elements I first initialize to 0, then increment by 1 every time I identify a path that includes the corresponding edge. For example, if path (2,4) included edges (2,3) and (3,4), I would increment numPass[1][2] and numPass[2][3] by 1 each.
As I understand it, the main issue to tackle is that the inputs only give information of the starting vertex and ending vertex, and it is up to me to figure out which edges connect the two. Since the given graph is a tree, any path between two vertices is unique. Therefore, I assumed that given the index of the ending vertex for any input path, I would be able to recursively backtrack which edges were connected.
The following is the function code that I have tried to implement with that idea in mind:
/* find the (unique) path of edges from vertex x to vertex y
   and increment the edges crossed during such a path */
void findPath(int x, int y, int N, int adjGraph[][N], int numPass[][N]) {
    int temp;
    /* if the path is a single edge, the case is trivial */
    if (adjGraph[x][y] == 1) {
        numPass[x][y] += 1;
        return;
    }
    /* otherwise, find the path by backtracking from y */
backtrack:
    for (temp = y - 1; temp >= 0; temp--) { /* scan for a predecessor of y */
        if (adjGraph[temp][y] == 1) {
            numPass[temp][y] += 1;
            break;
        }
    }
    if (adjGraph[x][temp] == 1) {
        numPass[x][temp] += 1;
        return;
    } else {
        y = temp;
        goto backtrack;
    }
}
However, the problem is that while my code works fine for small inputs, it runs out of memory for large inputs, since I have a memory limit of 128MB and a time limit of 1 second. The inputs range up to 222222 vertices and 222222 input paths.
How could I optimize my method to handle such large inputs?
Get rid of the adjacency matrix (it uses O(N^2) space). Use adjacency lists instead.
Use a more efficient algorithm. Let's make the tree rooted. For a path from a to b we can add 1 to a and b and subtract 2 from their lca (it is easy to see that this way a one is added to the edges on this path and only to them).
After processing all the paths, the number of paths going through an edge is just a sum over the subtree below it.
If we use an efficient algorithm to compute the lca, this solution works in O(N + Q * log N), where Q is the number of paths. That looks good enough for these constraints (we could actually do even better by using more complex and more efficient algorithms for finding the lca, but I don't think it's necessary here).
Note: lca means lowest common ancestor.
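A sketch of this technique on a tiny hand-built tree (the parent/depth arrays, which would normally come from one DFS, and the naive O(depth) lca are simplifications for illustration; an O(log N) lca would replace it for large inputs):

```c
#include <assert.h>

#define N 5

/* A made-up example tree, rooted at 0:
         0
        / \
       1   2
      / \
     3   4
   parent[] and depth[] would normally be filled by one DFS/BFS. */
static const int parent[N] = {-1, 0, 0, 1, 1};
static const int depth[N]  = { 0, 1, 1, 2, 2};

static int diff[N]; /* +1 at both endpoints of a path, -2 at their lca */

/* Naive lca: lift the deeper endpoint, then walk both up together. */
static int lca(int a, int b)
{
    while (depth[a] > depth[b]) a = parent[a];
    while (depth[b] > depth[a]) b = parent[b];
    while (a != b) { a = parent[a]; b = parent[b]; }
    return a;
}

/* Record one traversed path from a to b. */
static void add_path(int a, int b)
{
    diff[a] += 1;
    diff[b] += 1;
    diff[lca(a, b)] -= 2;
}

/* After all paths: the number of paths using edge (v, parent[v]) is the
   sum of diff over v's subtree. Processing vertices from deepest to
   shallowest pushes each subtree sum up into the parent. */
static void accumulate(int edge_count[N])
{
    int sum[N];
    for (int v = 0; v < N; v++) sum[v] = diff[v];
    for (int d = N - 1; d > 0; d--)
        for (int v = 0; v < N; v++)
            if (depth[v] == d) {
                edge_count[v] = sum[v];  /* count for edge (v, parent[v]) */
                sum[parent[v]] += sum[v];
            }
    edge_count[0] = 0;                   /* the root has no parent edge */
}
```

The most traversed edge is then just the maximum of edge_count, found in one O(N) pass.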

Count number of paths of all length in a DAG

Suppose you have an unweighted DAG and two vertices, start s and end t. The problem is to count how many paths there are from s to t of length 1, 2, 3, ..., N-1, where N is the number of vertices in the DAG.
My approach:
Build a matrix d of size N*N, where d[u][k] is the number of ways to reach u from s in exactly k steps, and set d[s][0] = 1
Find a topological sorting TS of the DAG
Now, for every vertex u in TS
clone the array d[u] as a
shift every element in a right by 1 (ie, insert 0 on left, discard rightmost element)
for every adjacent vertex v of u, add array a to array d[v]
The answer is d[t]
This seems to work in O((V+E)*V). I'm wondering if there is a more efficient O(V+E) way?
The optimal algorithm is quite likely O(VE). However, a simpler implementation is possible using BFS, allowing vertices to be visited multiple times (in most practical cases this will use less memory than O(V^2)).
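The clone-and-shift step can equivalently be written edge by edge, which makes the O((V+E)*V) bound visible. A C sketch on a made-up 4-vertex DAG (the topological order TS is assumed precomputed):

```c
#include <assert.h>
#include <string.h>

#define N 4

/* A made-up example DAG: 0->1, 0->2, 0->3, 1->3, 2->3,
   with adjacency lists as -1-terminated arrays. */
static const int adj[N][N] = { {1, 2, 3, -1}, {3, -1}, {3, -1}, {-1} };
static const int TS[N] = {0, 1, 2, 3};  /* a topological order */

/* d[u][k] = number of paths from s to u using exactly k edges.
   The clone-shift-add of the question becomes: every edge (u,v)
   adds d[u][k] into d[v][k+1]. */
void count_paths(int s, long long d[N][N])
{
    memset(d, 0, sizeof(long long) * N * N);
    d[s][0] = 1;
    for (int i = 0; i < N; i++) {
        int u = TS[i];
        for (int j = 0; adj[u][j] != -1; j++) {
            int v = adj[u][j];
            for (int k = 0; k + 1 < N; k++)
                d[v][k + 1] += d[u][k];   /* extend every path by one edge */
        }
    }
}
```

Here d[3][1] = 1 (the direct edge 0->3) and d[3][2] = 2 (via vertex 1 and via vertex 2), matching the three paths from s to t.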

Dividing a graph into three parts such that the maximum of the sums of the weights of the three parts is minimized

I want to divide a graph with N weighted vertices and N-1 edges into three parts such that the maximum of the sums of the weights of the vertices in each part is minimized. This is the actual problem I am trying to solve: http://www.iarcs.org.in/inoi/contests/jan2006/Advanced-1.php
I considered the following method
/* Edges are stored in an array E, and also in an adjacency matrix for depth-first search.
   Every edge in E has two attributes a and b, which are the endpoints of the edge */
min-max = infinity
for i -> 0 to length(E):
    for j -> i+1 to length(E):
        /* Remove the edges E[i] and E[j], then call depth-first search on the
           endpoints of both; the depth-first search returns the sum of the weights
           of the vertices it visits, and we keep track of the maximum weight
           returned by dfs */
        Adjacency-matrix[E[i].a][E[i].b] = 0;
        Adjacency-matrix[E[j].a][E[j].b] = 0;
        max = 0
        temp = dfs(E[i].a)
        if temp > max then max = temp
        temp = dfs(E[i].b)
        if temp > max then max = temp
        temp = dfs(E[j].a)
        if temp > max then max = temp
        temp = dfs(E[j].b)
        if temp > max then max = temp
        if max < min-max
            min-max = max
        Adjacency-matrix[E[i].a][E[i].b] = 1;
        Adjacency-matrix[E[j].a][E[j].b] = 1;
        /* The depth-first search is called four times, but one call will terminate
           immediately if we keep track of the visited vertices, because there are
           only three components */
/* After the outer loop terminates, what we have in min-max will be the answer */
The above algorithm takes O(n^3) time: as the number of edges is n-1, the nested loops run O(n^2) times, and each dfs visits every vertex only once, which takes O(n) time.
But the problem is that n can be up to 3000, and O(n^3) time is too slow for this problem. Is there any other method that will solve the question in the link faster than n^3?
EDIT: I implemented @BorisStrandjev's algorithm in C. It gave a correct answer for the test input in the question, but for all other test inputs it gives a wrong answer. Here is a link to my code on ideone: http://ideone.com/67GSa2. The output here should be 390 but the program prints 395.
I am trying to find whether I have made any mistake in my code, but I don't see one. Can anyone please help me? The answers my code gives are very close to the correct answer, so is there anything more to the algorithm?
EDIT 2: In the following graph-
@BorisStrandjev, your algorithm will choose i as 1 and j as 2 in one of the iterations, but then the third part (3,4) is invalid.
EDIT 3
I finally found the mistake in my code: instead of V[i] storing the sum of i and all its descendants, it stored V[i] and its ancestors; otherwise it would have solved the above example correctly. Thanks to all of you for your help.
Yes, there is a faster method.
I will need a few auxiliary matrices, and I will leave their creation and correct initialization to you.
First of all, plant the tree - that is, make the graph directed. Calculate an array VAL[i] for each vertex - the number of passengers for the vertex and all its descendants (remember we planted the tree, so now this makes sense). Also calculate a boolean matrix desc[i][j] that will be true if vertex i is a descendant of vertex j. Then do the following:
best_val = n
for i in 1...n
    for j in i + 1...n
        val_of_split = 0
        val_of_split_i = VAL[i]
        val_of_split_j = VAL[j]
        if desc[i][j] val_of_split_j -= VAL[i] // subtract all the nodes that go to i
        if desc[j][i] val_of_split_i -= VAL[j]
        val_of_split = max(val_of_split, val_of_split_i)
        val_of_split = max(val_of_split, val_of_split_j)
        val_of_split = max(val_of_split, n - val_of_split_i - val_of_split_j)
        best_val = min(best_val, val_of_split)
After the execution of this cycle the answer will be in best_val. The algorithm is clearly O(n^2); you just need to figure out how to calculate desc[i][j] and VAL[i] within that complexity, but it is not so complex a task - I think you can figure it out yourself.
EDIT: Here I will include pseudocode for the whole problem. I deliberately did not include the code before the OP had tried and solved the problem by himself:
int p[n] :=                      // initialized from the input - price of the node itself
adjacency_list neighbors :=      // initialized to store the graph adjacency list
int VAL[n] := { 0 }              // the price of a node and all its descendants
bool desc[n][n] := { false }     // desc[i][j] - whether i is a descendant of j
bool visited[n] := { false }     // whether the dfs visited the node already
stack parents := { empty-stack } // the stack of nodes visited during dfs

dfs ( currentVertex ) {
    VAL[currentVertex] = p[currentVertex]
    parents.push(currentVertex)
    visited[currentVertex] = true
    for vertex : parents         // a slightly extended stack definition supporting iteration
        desc[currentVertex][vertex] = true
    for vertex : neighbors[currentVertex]
        if visited[vertex] continue
        dfs (vertex)
        VAL[currentVertex] += VAL[vertex]
    parents.pop
}

calculate_best ( )
    dfs(0)
    best_val = n
    for i in 0...(n - 1)
        for j in i + 1...(n - 1)
            val_of_split = 0
            val_of_split_i = VAL[i]
            val_of_split_j = VAL[j]
            if desc[i][j] val_of_split_j -= VAL[i]
            if desc[j][i] val_of_split_i -= VAL[j]
            val_of_split = max(val_of_split, val_of_split_i)
            val_of_split = max(val_of_split, val_of_split_j)
            val_of_split = max(val_of_split, n - val_of_split_i - val_of_split_j)
            best_val = min(best_val, val_of_split)
    return best_val
And the best split will be {descendants of i} \ {descendants of j}, {descendants of j} \ {descendants of i}, and {all nodes} \ ({descendants of i} U {descendants of j}).
You can use a combination of Binary Search & DFS to solve this problem.
Here's how I would proceed:
Calculate the total weight of the graph, and also find the heaviest edge in the graph. Let them be Sum and MaxEdge respectively.
Now run a binary search over the range [MaxEdge, Sum].
In each search iteration, middle = (start + end) / 2. Now pick a start node and perform a DFS such that the sum of the edges traversed in the sub-graph is as close to middle as possible, but keep this sum less than middle. This will be one sub-graph. In the same iteration, pick another node which is unmarked by the previous DFS and perform another DFS in the same way. Likewise, do it once more, because we need to break the graph into 3 parts.
The minimum weight amongst the 3 sub-graphs calculated above is the solution from this iteration.
Keep running this binary search until its start variable exceeds its end variable.
The max of all the mins obtained in step 4 is your answer.
You can do extra book-keeping in order to get the 3-sub-graphs.
Order of complexity: N log(Sum), where Sum is the total weight of the graph.
I just noticed that you have talked about weighted vertices, not edges. In that case, just treat edges as vertices in my solution. It should still work.
EDIT 4: THIS WON'T WORK!!!
If you process the nodes in the link in the order 3,4,5,6,1,2, after processing 6, (I think) you'll have the following sets: {{3,4},{5},{6}}, {{3,4,5},{6}}, {{3,4,5,6}}, with no simple way to split them up again.
I'm just leaving this answer here in case anyone else was thinking of a DP algorithm.
It might work to look at all the already processed neighbours in the DP algorithm.
I'm thinking of a Dynamic Programming algorithm, where the matrix is (item x number of sets):
n = number of sets
k = number of vertices
// row 0 represents 0 elements included
A[0, 0] = 0
for (s = 1:n)
    A[0, s] = INFINITY
for (i = 1:k)
    for (s = 0:n)
        B = A[i-1, s] with i inserted into the minimum one of its neighbouring sets
        A[i, s] = min(A[i-1, s-1], B)   // A[i-1, s-1] = INFINITY if s-1 < 0
EDIT: Explanation of DP:
This is a reasonably basic Dynamic Programming algorithm. If you need a better explanation, you should read up on it some more, it's a very powerful tool.
A is a matrix. The row i represents a graph with all vertices up to i included. The column c represents the solution with number of sets = c.
So A[2,3] would give the best result for a graph containing item 0, item 1, and item 2, with 3 sets, thus each item in its own set.
You then start at item 0, calculate the row for each number of sets (the only valid one is number of sets = 1), then do item 1 with the above formula, then item 2, etc.
A[a, b] is then the optimal solution with all vertices up to a included and b number of sets. So you'll just return A[k, n] (the one that has all vertices included and the target number of sets).
EDIT 2: Complexity
O(k*n*b) where b is the branching factor of a node (assuming you use an adjacency list).
Since n = 3, this is O(3*k*b) = O(k*b).
EDIT 3: Deciding which neighbouring set a vertex should be added to
Keep n arrays of k elements each in a union find structure, with each set pointing to the sum for that set. For each new row, to determine which sets a vertex can be added to, we use its adjacency list and look-up the set and value of each of its neighbours. Once we find the best option, we can just add that element to the applicable set and increment its sum by the added element's value.
You'll notice the algorithm only looks back 1 row, so we only need to keep track of the last row (rather than storing the whole matrix), and we can modify the previous row's n arrays rather than copying them.