How to compute the cost of an mst graph. - c

I'm working in C, using the igraph library. I need to get the minimum spanning tree of a given graph stores in a igraph_graph_t type (g). Also I have a igraph_vector containing the weight of each edge (w). The following is my call:
igraph_minimum_spanning_tree_prim(&g, &mst, &w)
How I can get the weight of each edge in the mst graph? All I need is the cost of mst.
Thanks, Guillermo.

I think you should take the result graph(mst) and sum the weight of the edges.

Related

Calculate Heuristics for graph

I have these Information for a graph:
number of vertices
edges with weight
and Nothing more. and I want to calculate a heuristic for each vertices so I can use a path finding algorithm like A Star. What can I do? If you know more than one solution please mention all of them.
Please note that I can't calculate Manhattan distance and use it as a heuristic because I don't have any location Information .
You can still use A* but you will have to "cheat" a little. A* is an extension of Dijkstra's Shortest Path algorithm. The extension is that A* uses a distance heuristic to keep the search going in the right direction. Given a graph with just weight edges you calculate the shortest sum of weights from node n to the goal node and use that as the heuristic distance. So the distance from START to Target is 8, the distance from A to TARGET is 7, distance(B, TARGET) = 2 and so forth. Since this is a heuristic you could also use other measurements. If you set the distances to 0 then this becomes Dijkstra's algorithm. If you set the distances high then this turns into the Greedy-Best-First algorithm.

creating a cost function in jgrapht

jgrapht supports the idea of putting a wehight(a cost) on an edge/vertex between two nodes. This can be achieved using the class DefaultWeightedEdge.
In my graph I do have the requirement to not find the shortest path but the cheapest one. The cheapest path might be longer/have more hops nodes to travel then the shortest path.
Therefor, one can use the DijkstraShortestPath algorithm to achieve this.
However, my use case is a bit more complex: It needs to also evaluate costs on actions that need to be executed when arriving at a node.
Let's say, you have a graph like a chess board(8x8 fields, each field beeing a node). All the edges have a weight of 1. To move in a car from left bottom to the diagonal corner(right upper), there are many paths with the cost of 16. You can take a diagonal path in a zic zac style, or you can first travel all nodes to the right and then all nodes upwards.
The difference is: When taking a zic zac, you need to rotate yourself in the direction of moving. You rotate 16 times.
When moving first all to the right and then upwards, you need to rotate only once (maybe twice, depending on your start orientation).
So the zic zac path is, from a Djikstra point of view, perfect. From a logical point of view, it's the worst.
Long story short: How can I put some costs on a node or edge depending on the previous edge/node in that path? I did not find anything related in the source code of jgrapht.
Or is there a better algorithm to use?
This is not a JGraphT issue but a graph algorithm issue. You need to think about how to encode this problem and formalize that in more detail
Incorporating weights on vertices is in general easy. Say that every vertex represents visiting a customer, which takes a_i time. This can be encoded in the graph by adding a_i/2 to the cost of every incoming arc in node i, as well as a_i/2 to the cost of every outgoing arc.
A cost function where the cost of traveling from j to k dependents on the arc (i,j) you used to travel to j is more complicated.
Approach a.: Use a dynamic programming (labeling) algorithm. This is perhaps the easiest. You can define your cost function as a recursive function, where the cost of traversing an arc depends on the cost of the previous arc.
Approach b.: With some tricks you may be able to encode the costs in the graph by adding extra nodes to it. Here's an example:
Given a graph with vertices {a,b,c,d,e}, with arcs: (a,e), (e,b), (c,e), (e,d). This graph represents a crossroad with vertex e being in the middle. Going from a->e->b (straight) is free, however, a turn from a->e->d takes additional time. Similar for c->e->d (straight) is free and c->e->b (turning) should be penalized.
Decouple vertex e in 4 new vertices: e1,e2,e3,e4.
Add the following arcs:
(a,e1), (e3,b), (c,e2), (e4,d), (e2, e3), (e1, e3), (e1, e4), (e2, e4).
(e1,e4) and (e2,e3) can have a positive weight to penalize turning.

What's the main difference between dijkstra's algorithm and Prim's algorithm? [duplicate]

What is the exact difference between Dijkstra's and Prim's algorithms? I know Prim's will give a MST but the tree generated by Dijkstra will also be a MST. Then what is the exact difference?
Prim's algorithm constructs a minimum spanning tree for the graph, which is a tree that connects all nodes in the graph and has the least total cost among all trees that connect all the nodes. However, the length of a path between any two nodes in the MST might not be the shortest path between those two nodes in the original graph. MSTs are useful, for example, if you wanted to physically wire up the nodes in the graph to provide electricity to them at the least total cost. It doesn't matter that the path length between two nodes might not be optimal, since all you care about is the fact that they're connected.
Dijkstra's algorithm constructs a shortest path tree starting from some source node. A shortest path tree is a tree that connects all nodes in the graph back to the source node and has the property that the length of any path from the source node to any other node in the graph is minimized. This is useful, for example, if you wanted to build a road network that made it as efficient as possible for everyone to get to some major important landmark. However, the shortest path tree is not guaranteed to be a minimum spanning tree, and the sum of the costs on the edges of a shortest-path tree can be much larger than the cost of an MST.
Another important difference concerns what types of graphs the algorithms work on. Prim's algorithm works on undirected graphs only, since the concept of an MST assumes that graphs are inherently undirected. (There is something called a "minimum spanning arborescence" for directed graphs, but algorithms to find them are much more complicated). Dijkstra's algorithm will work fine on directed graphs, since shortest path trees can indeed be directed. Additionally, Dijkstra's algorithm does not necessarily yield the correct solution in graphs containing negative edge weights, while Prim's algorithm can handle this.
Dijkstra's algorithm doesn't create a MST, it finds the shortest path.
Consider this graph
5 5
s *-----*-----* t
\ /
-------
9
The shortest path is 9, while the MST is a different 'path' at 10.
Prim and Dijkstra algorithms are almost the same, except for the "relax function".
Prim:
MST-PRIM (G, w, r) {
for each key ∈ G.V
u.key = ∞
u.parent = NIL
r.key = 0
Q = G.V
while (Q ≠ ø)
u = Extract-Min(Q)
for each v ∈ G.Adj[u]
if (v ∈ Q)
alt = w(u,v) <== relax function, Pay attention here
if alt < v.key
v.parent = u
v.key = alt
}
Dijkstra:
Dijkstra (G, w, r) {
for each key ∈ G.V
u.key = ∞
u.parent = NIL
r.key = 0
Q = G.V
while (Q ≠ ø)
u = Extract-Min(Q)
for each v ∈ G.Adj[u]
if (v ∈ Q)
alt = w(u,v) + u.key <== relax function, Pay attention here
if alt < v.key
v.parent = u
v.key = alt
}
The only difference is pointed out by the arrow, which is the relax function.
The Prim, which searches for the minimum spanning tree, only cares about the minimum of the total edges cover all the vertices. The relax function is alt = w(u,v)
The Dijkstra, which searches for the minimum path length, so it cares about the edge accumulation. The relax function is alt = w(u,v) + u.key
Dijsktra's algorithm finds the minimum distance from node i to all nodes (you specify i). So in return you get the minimum distance tree from node i.
Prims algorithm gets you the minimum spaning tree for a given graph. A tree that connects all nodes while the sum of all costs is the minimum possible.
So with Dijkstra you can go from the selected node to any other with the minimum cost, you don't get this with Prim's
The only difference I see is that Prim's algorithm stores a minimum cost edge whereas Dijkstra's algorithm stores the total cost from a source vertex to the current vertex.
Dijkstra gives you a way from the source node to the destination node such that the cost is minimum. However Prim's algorithm gives you a minimum spanning tree such that all nodes are connected and the total cost is minimum.
In simple words:
So, if you want to deploy a train to connecte several cities, you would use Prim's algo. But if you want to go from one city to other saving as much time as possible, you'd use Dijkstra's algo.
Both can be implemented using exactly same generic algorithm as follows:
Inputs:
G: Graph
s: Starting vertex (any for Prim, source for Dijkstra)
f: a function that takes vertices u and v, returns a number
Generic(G, s, f)
Q = Enqueue all V with key = infinity, parent = null
s.key = 0
While Q is not empty
u = dequeue Q
For each v in adj(u)
if v is in Q and v.key > f(u,v)
v.key = f(u,v)
v.parent = u
For Prim, pass f = w(u, v) and for Dijkstra pass f = u.key + w(u, v).
Another interesting thing is that above Generic can also implement Breadth First Search (BFS) although it would be overkill because expensive priority queue is not really required. To turn above Generic algorithm in to BFS, pass f = u.key + 1 which is same as enforcing all weights to 1 (i.e. BFS gives minimum number of edges required to traverse from point A to B).
Intuition
Here's one good way to think about above generic algorithm: We start with two buckets A and B. Initially, put all your vertices in B so the bucket A is empty. Then we move one vertex from B to A. Now look at all the edges from vertices in A that crosses over to the vertices in B. We chose the one edge using some criteria from these cross-over edges and move corresponding vertex from B to A. Repeat this process until B is empty.
A brute force way to implement this idea would be to maintain a priority queue of the edges for the vertices in A that crosses over to B. Obviously that would be troublesome if graph was not sparse. So question would be can we instead maintain priority queue of vertices? This in fact we can as our decision finally is which vertex to pick from B.
Historical Context
It's interesting that the generic version of the technique behind both algorithms is conceptually as old as 1930 even when electronic computers weren't around.
The story starts with Otakar Borůvka who needed an algorithm for a family friend trying to figure out how to connect cities in the country of Moravia (now part of the Czech Republic) with minimal cost electric lines. He published his algorithm in 1926 in a mathematics related journal, as Computer Science didn't existed then. This came to the attention to Vojtěch Jarník who thought of an improvement on Borůvka's algorithm and published it in 1930. He in fact discovered the same algorithm that we now know as Prim's algorithm who re-discovered it in 1957.
Independent of all these, in 1956 Dijkstra needed to write a program to demonstrate the capabilities of a new computer his institute had developed. He thought it would be cool to have computer find connections to travel between two cities of the Netherlands. He designed the algorithm in 20 minutes. He created a graph of 64 cities with some simplifications (because his computer was 6-bit) and wrote code for this 1956 computer. However he didn't published his algorithm because primarily there were no computer science journals and he thought this may not be very important. The next year he learned about the problem of connecting terminals of new computers such that the length of wires was minimized. He thought about this problem and re-discovered Jarník/Prim's algorithm which again uses the same technique as the shortest path algorithm he had discovered a year before. He mentioned that both of his algorithms were designed without using pen or paper. In 1959 he published both algorithms in a paper that is just 2 and a half page long.
Dijkstra finds the shortest path between it's beginning node
and every other node. So in return you get the minimum distance tree from beginning node i.e. you can reach every other node as efficiently as possible.
Prims algorithm gets you the MST for a given graph i.e. a tree that connects all nodes while the sum of all costs is the minimum possible.
To make a story short with a realistic example:
Dijkstra wants to know the shortest path to each destination point by saving traveling time and fuel.
Prim wants to know how to efficiently deploy a train rail system i.e. saving material costs.
Directly from Dijkstra's Algorithm's wikipedia article:
The process that underlies Dijkstra's algorithm is similar to the greedy process used in Prim's algorithm. Prim's purpose is to find a minimum spanning tree that connects all nodes in the graph; Dijkstra is concerned with only two nodes. Prim's does not evaluate the total weight of the path from the starting node, only the individual path.
Here's what clicked for me: think about which vertex the algorithm takes next:
Prim's algorithm takes next the vertex that's closest to the tree, i.e. closest to some vertex anywhere on the tree.
Dijkstra's algorithm takes next the vertex that is closest to the source.
Source: R. Sedgewick's lecture on Dijkstra's algorithm, Algorithms, Part II: https://coursera.org/share/a551af98e24292b6445c82a2a5f16b18
I was bothered with the same question lately, and I think I might share my understanding...
I think the key difference between these two algorithms (Dijkstra and Prim) roots in the problem they are designed to solve, namely, shortest path between two nodes and minimal spanning tree (MST). The formal is to find the shortest path between say, node s and t, and a rational requirement is to visit each edge of the graph at most once. However, it does NOT require us to visit all the node. The latter (MST) is to get us visit ALL the node (at most once), and with the same rational requirement of visiting each edge at most once too.
That being said, Dijkstra allows us to "take shortcut" so long I can get from s to t, without worrying the consequence - once I get to t, I am done! Although there is also a path from s to t in the MST, but this s-t path is created with considerations of all the rest nodes, therefore, this path can be longer than the s-t path found by the Dijstra's algorithm. Below is a quick example with 3 nodes:
2 2
(s) o ----- o ----- o (t)
| |
-----------------
3
Let's say each of the top edges has the cost of 2, and the bottom edge has cost of 3, then Dijktra will tell us to the take the bottom path, since we don't care about the middle node. On the other hand, Prim will return us a MST with the top 2 edges, discarding the bottom edge.
Such difference is also reflected from the subtle difference in the implementations: in Dijkstra's algorithm, one needs to have a book keeping step (for every node) to update the shortest path from s, after absorbing a new node, whereas in Prim's algorithm, there is no such need.
The simplest explanation is in Prims you don't specify the Starting Node, but in dijsktra you (Need to have a starting node) have to find shortest path from the given node to all other nodes.
The key difference between the basic algorithms lies in their different edge-selection criteria. Generally, they both use a priority queue for selecting next nodes, but have different criteria to select the adjacent nodes of current processing nodes: Prim's Algorithm requires the next adjacent nodes must be also kept in the queue, while Dijkstra's Algorithm does not:
def dijkstra(g, s):
q <- make_priority_queue(VERTEX.distance)
for each vertex v in g.vertex:
v.distance <- infinite
v.predecessor ~> nil
q.add(v)
s.distance <- 0
while not q.is_empty:
u <- q.extract_min()
for each adjacent vertex v of u:
...
def prim(g, s):
q <- make_priority_queue(VERTEX.distance)
for each vertex v in g.vertex:
v.distance <- infinite
v.predecessor ~> nil
q.add(v)
s.distance <- 0
while not q.is_empty:
u <- q.extract_min()
for each adjacent vertex v of u:
if v in q and weight(u, v) < v.distance:// <-------selection--------
...
The calculations of vertex.distance are the second different point.
Dijkstras algorithm is used only to find shortest path.
In Minimum Spanning tree(Prim's or Kruskal's algorithm) you get minimum egdes with minimum edge value.
For example:- Consider a situation where you wan't to create a huge network for which u will be requiring a large number of wires so these counting of wire can be done using Minimum Spanning Tree(Prim's or Kruskal's algorithm) (i.e it will give you minimum number of wires to create huge wired network connection with minimum cost).
Whereas "Dijkstras algorithm" will be used to get the shortest path between two nodes while connecting any nodes with each other.
Dijkstra's algorithm is a single source shortest path problem between node i and j, but Prim's algorithm a minimal spanning tree problem. These algorithm use programming concept named 'greedy algorithm'
If you check these notion, please visit
Greedy algorithm lecture note : http://jeffe.cs.illinois.edu/teaching/algorithms/notes/07-greedy.pdf
Minimum spanning tree : http://jeffe.cs.illinois.edu/teaching/algorithms/notes/20-mst.pdf
Single source shortest path : http://jeffe.cs.illinois.edu/teaching/algorithms/notes/21-sssp.pdf
#templatetypedef has covered difference between MST and shortest path. I've covered the algorithm difference in another So answer by demonstrating that both can be implemented using same generic algorithm that takes one more parameter as input: function f(u,v). The difference between Prim and Dijkstra's algorithm is simply which f(u,v) you use.
At the code level, the other difference is the API.
You initialize Prim with a source vertex, s, i.e., Prim.new(s); s can be any vertex, and regardless of s, the end result, which are the edges of the minimum spanning tree (MST) are the same. To get the MST edges, we call the method edges().
You initialize Dijkstra with a source vertex, s, i.e., Dijkstra.new(s) that you want to get shortest path/distance to all other vertices. The end results, which are the shortest path/distance from s to all other vertices; are different depending on the s. To get the shortest paths/distances from s to any vertex, v, we call the methods distanceTo(v) and pathTo(v) respectively.
They both create trees with the greedy method.
With Prim's algorithm we find minimum cost spanning tree. The goal is to find minimum cost to cover all nodes.
with Dijkstra we find Single Source Shortest Path. The goal is find the shortest path from the source to every other node
Prim’s algorithm works exactly as Dijkstra’s, except
It does not keep track of the distance from the source.
Storing the edge that connected the front of the visited vertices to the next closest vertex.
The vertex used as “source” for Prim’s algorithm is
going to be the root of the MST.

Dividing a graph to triple-vertices graphs such the sum of weights of edges is minimized

I want to divide a graph to subgraphs that each subgraph made from maximum 3 vertices and sum of weight of edges is minimized, the main graph is complete ( have all possible edges ), and edges are weighted.
the main problem that i want to solve is finding close three threes points on a map.
I'm sure this problem is NP-Complete. It's called the minimum k-cut problem.
Try to take a look at this article. It talks about approximation algorithms to solve such problems.

Suggestions of the easiest algorithms for some Graph operations

The deadline for this project is closing in very quickly and I don't have much time to deal with what it's left. So, instead of looking for the best (and probably more complicated/time consuming) algorithms, I'm looking for the easiest algorithms to implement a few operations on a Graph structure.
The operations I'll need to do is as follows:
List all users in the graph network given a distance X
List all users in the graph network given a distance X and the type of relation
Calculate the shortest path between 2 users on the graph network given a type of relation
Calculate the maximum distance between 2 users on the graph network
Calculate the most distant connected users on the graph network
A few notes about my Graph implementation:
The edge node has 2 properties, one is of type char and another int. They represent the type of relation and weight, respectively.
The Graph is implemented with linked lists, for both the vertices and edges. I mean, each vertex points to the next one and each vertex also points to the head of a different linked list, the edges for that specific vertex.
What I know about what I need to do:
I don't know if this is the easiest as I said above, but for the shortest path between 2 users, I believe the Dijkstra algorithm is what people seem to recommend pretty often so I think I'm going with that.
I've been searching and searching and I'm finding it hard to implement this algorithm, does anyone know of any tutorial or something easy to understand so I can implement this algorithm myself? If possible, with C source code examples, it would help a lot. I see many examples with math notations but that just confuses me even more.
Do you think it would help if I "converted" the graph to an adjacency matrix to represent the links weight and relation type? Would it be easier to perform the algorithm on that instead of the linked lists? I could easily implement a function to do that conversion when needed. I'm saying this because I got the feeling it would be easier after reading a couple of pages about the subject, but I could be wrong.
I don't have any ideas about the other 4 operations, suggestions?
List all users in the graph network given a distance X
A distance X from what? from a starting node or a distance X between themselves? Can you give an example? This may or may not be as simple as doing a BF search or running Dijkstra.
Assuming you start at a certain node and want to list all nodes that have distances X to the starting node, just run BFS from the starting node. When you are about to insert a new node in the queue, check if the distance from the starting node to the node you want to insert the new node from + the weight of the edge from the node you want to insert the new node from to the new node is <= X. If it's strictly lower, insert the new node and if it is equal just print the new node (and only insert it if you can also have 0 as an edge weight).
List all users in the graph network given a distance X and the type of relation
See above. Just factor in the type of relation into the BFS: if the type of the parent is different than that of the node you are trying to insert into the queue, don't insert it.
Calculate the shortest path between 2 users on the graph network given a type of relation
The algorithm depends on a number of factors:
How often will you need to calculate this?
How many nodes do you have?
Since you want easy, the easiest are Roy-Floyd and Dijkstra's.
Using Roy-Floyd is cubic in the number of nodes, so inefficient. Only use this if you can afford to run it once and then answer each query in O(1). Use this if you can afford to keep an adjacency matrix in memory.
Dijkstra's is quadratic in the number of nodes if you want to keep it simple, but you'll have to run it each time you want to calculate the distance between two nodes. If you want to use Dijkstra's, use an adjacency list.
Here are C implementations: Roy-Floyd and Dijkstra_1, Dijkstra_2. You can find a lot on google with "<algorithm name> c implementation".
Edit: Roy-Floyd is out of the question for 18 000 nodes, as is an adjacency matrix. It would take way too much time to build and way too much memory. Your best bet is to either use Dijkstra's algorithm for each query, but preferably implementing Dijkstra using a heap - in the links I provided, use a heap to find the minimum. If you run the classical Dijkstra on each query, that could also take a very long time.
Another option is to use the Bellman-Ford algorithm on each query, which will give you O(Nodes*Edges) runtime per query. However, this is a big overestimate IF you don't implement it as Wikipedia tells you to. Instead, use a queue similar to the one used in BFS. Whenever a node updates its distance from the source, insert that node back into the queue. This will be very fast in practice, and will also work for negative weights. I suggest you use either this or the Dijkstra with heap, since classical Dijkstra might take a long time on 18 000 nodes.
Calculate the maximum distance between 2 users on the graph network
The simplest way is to use backtracking: try all possibilities and keep the longest path found. This is NP-complete, so polynomial solutions don't exist.
This is really bad if you have 18 000 nodes, I don't know any algorithm (simple or otherwise) that will work reasonably fast for so many nodes. Consider approximating it using greedy algorithms. Or maybe your graph has certain properties that you could take advantage of. For example, is it a DAG (Directed Acyclic Graph)?
Calculate the most distant connected users on the graph network
Meaning you want to find the diameter of the graph. The simplest way to do this is to find the distances between each two nodes (all pairs shortest paths - either run Roy-Floyd or Dijkstra between each two nodes and pick the two with the maximum distance).
Again, this is very hard to do fast with your number of nodes and edges. I'm afraid you're out of luck on these last two questions, unless your graph has special properties that can be exploited.
Do you think it would help if I "converted" the graph to an adjacency matrix to represent the links weight and relation type? Would it be easier to perform the algorithm on that instead of the linked lists? I could easily implement a function to do that conversion when needed. I'm saying this because I got the feeling it would be easier after reading a couple of pages about the subject, but I could be wrong.
No, adjacency matrix and Roy-Floyd are a very bad idea unless your application targets supercomputers.
This assumes O(E log V) is an acceptable running time, if you're doing something online, this might not be, and it would require some higher powered machinery.
List all users in the graph network given a distance X
Djikstra's algorithm is good for this, for one time use. You can save the result for future use, with a linear scan through all the vertices (or better yet, sort and binary search).
List all users in the graph network given a distance X and the type of relation
Might be nearly the same as above -- just use some function where the weight would be infinity if it is not of the correct relation.
Calculate the shortest path between 2 users on the graph network given a type of relation
Same as above, essentially, just determine early if you match the two users. (Alternatively, you can "meet in the middle", and terminate early if you find someone on both shortest path spanning tree)
Calculate the maximum distance between 2 users on the graph network
Longest path is an NP-complete problem.
Calculate the most distant connected users on the graph network
This is the diameter of the graph, which you can read about on Math World.
As for the adjacency list vs adjacency matrix question, it depends on how densely populated your graph is. Also, if you want to cache results, then the matrix might be the way to go.
The simplest algorithm to compute shortest path between two nodes is Floyd-Warshall. It's just triple-nested for loops; that's it.
It computes ALL-pairs shortest path in O(N^3), so it may do more work than necessary, and will take a while if N is huge.

Resources