I know this may sound naive, but can someone please explain how to implement graphs in C? I have read the theory, but I am not able to get off the blocks with graph programming.
I would really appreciate it if someone could explain how to create a graph using adjacency lists and an adjacency matrix, and how to perform breadth-first search and depth-first search in C, with some explanation.
Before anything else: this is not homework. I really want to learn graphs but can't afford a tutor.
I assume that a graph here is a collection of vertices and edges. For the adjacency list representation you need an array of pointers to structures. Each structure holds at least a value, which is the node number, and a pointer to another structure. To insert a new edge, go to the appropriate index of the array and push the node at the beginning of that list. This takes O(1) time per insertion. My implementation might help you understand how it really works. If you have good C skills, it shouldn't take long to understand the code.
// Graph implementation by adjacency list
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_SIZE 1000
typedef struct node{
int number;
struct node * next;
} Node;
// U is starting node, V is ending node
void addNode (Node *G[], int U, int V, int is_directed)
{
Node * newnode = (Node *)malloc(sizeof(Node));
newnode->number = V;
newnode->next = G[U];
G[U] = newnode;
// Despite the parameter's name: pass 0 for a directed edge, nonzero for an undirected one
if (is_directed)
{
Node * newnode = (Node *)malloc(sizeof(Node));
newnode->number = U;
newnode->next = G[V];
G[V] = newnode;
}
}
void printgraph(Node *G[], int num_nodes)
{
int I;
for (I=0; I<=num_nodes; I++)
{
Node *dum = G[I];
printf("%d : ",I);
while (dum != NULL)
{
printf("%d, ",dum->number);
dum =dum->next;
}
printf("\n");
}
}
void dfs (Node *G[], int num_nodes, int start_node)
{
int stack[MAX_SIZE];
int color[num_nodes+1];
memset (color, 0, sizeof(color));
int top = -1;
stack[top+1] = start_node;
top++;
color[start_node] = 1; // mark the start node so it is never pushed again
while (top != -1)
{
int current = stack[top];
printf("%d ",current);
top--;
Node *tmp = G[current];
while (tmp != NULL)
{
if (color[tmp->number] == 0)
{
stack[top+1] = tmp->number;
top++;
color[tmp->number] = 1;
}
tmp = tmp->next;
}
}
}
void bfs (Node *G[], int num_nodes, int start_node)
{
int queue[MAX_SIZE];
int color[num_nodes+1];
memset (color, 0, sizeof (color));
int front=-1, rear=-1;
queue[rear+1] = start_node;
rear++;
color[start_node] = 1; // mark the start node so it is never queued again
printf("\n\n");
while (front != rear)
{
front++;
int current = queue[front];
printf("%d ",current);
Node *tmp = G[current];
while (tmp != NULL)
{
if (color[tmp->number] == 0)
{
queue[rear+1] = tmp->number;
rear++;
color[tmp->number] = 1;
}
tmp = tmp->next;
}
}
}
int main(int argc, char **argv)
{
int num_nodes;
// For Demo take num_nodes = 4
scanf("%d",&num_nodes);
Node *G[num_nodes+1];
int I;
for (I=0; I<num_nodes+1 ;I++ )
G[I] = NULL;
addNode (G, 0, 2, 0);
addNode (G, 0, 1, 0);
addNode (G, 1, 3, 0);
addNode (G, 2, 4, 0);
addNode (G, 2, 1, 0);
printgraph( G, num_nodes);
printf("DFS on graph\n");
dfs(G, num_nodes, 0);
printf("\n\nBFS on graph\n");
bfs(G, num_nodes, 0);
return 0;
}
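The question also asks for an adjacency-matrix version, which the listing above does not cover. The following is only a minimal sketch and not part of the original answer: the names `MGraph`, `mgraph_init`, `mgraph_add_edge` and `mgraph_bfs`, as well as the fixed `MAX_V` limit, are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_V 16   /* assumed upper bound on the number of vertices */

/* adj[u][v] == 1 means there is an edge from u to v */
typedef struct {
    int n;                    /* number of vertices in use */
    int adj[MAX_V][MAX_V];
} MGraph;

void mgraph_init(MGraph *g, int n)
{
    assert(n >= 0 && n <= MAX_V);
    g->n = n;
    memset(g->adj, 0, sizeof g->adj);
}

/* Adds edge u -> v; also adds v -> u when undirected is nonzero */
void mgraph_add_edge(MGraph *g, int u, int v, int undirected)
{
    assert(u >= 0 && u < g->n && v >= 0 && v < g->n);
    g->adj[u][v] = 1;
    if (undirected)
        g->adj[v][u] = 1;
}

/* Breadth-first search from start. Writes the visit order into order[]
   (capacity at least g->n) and returns how many vertices were reached. */
int mgraph_bfs(const MGraph *g, int start, int order[])
{
    int visited[MAX_V] = {0};
    int queue[MAX_V];
    int head = 0, tail = 0, count = 0;

    assert(start >= 0 && start < g->n);
    visited[start] = 1;
    queue[tail++] = start;
    while (head < tail) {
        int u = queue[head++];
        order[count++] = u;
        for (int v = 0; v < g->n; v++) {
            if (g->adj[u][v] && !visited[v]) {
                visited[v] = 1;   /* mark on enqueue so no vertex is queued twice */
                queue[tail++] = v;
            }
        }
    }
    return count;
}
```

The trade-off compared with the adjacency list is that checking whether an edge exists is O(1), while memory use and neighbour scans are proportional to the number of vertices regardless of how few edges there are. With the same five directed edges as in `main` above, `mgraph_bfs` starting at 0 visits the vertices in the order 0, 1, 2, 3, 4, because neighbours are scanned by ascending index.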
Well, a really naive and basic answer would be that a graph can be represented in C using data structures that contain pointers to other such data structures. Graphs are really just doubly linked lists that can have multiple links from a single node. If you haven't digested linked lists and doubly linked lists, that'd be a good place to start.
So let's say you have an adjacency list, {a,b},{b,c},{d},{b,e}. First off, you parse that and make a list of all your unique items. (A regular linked list, array, whatever, it's just a temporary structure to help you. You could bypass that, do it on the fly, and probably reap a speedup, but this is simple.) Walking through that list, you generate a node for each item. For each node, you go through the adjacency list again and create an edge whenever the node sees itself in an entry. Each edge is a pointer inside the node pointing to another node.
In the end you have a regular list of all your nodes, so you don't lose that lone 'd' node hanging out by itself. You also have a graph of all your nodes, so you know their relationship to each other.
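As a rough sketch of the construction described above (taking the "on the fly" shortcut rather than two passes): every name here (`GNode`, `NodeList`, `find_or_add`, `link_nodes`, `build_graph`) is hypothetical, labels are assumed to be single characters, edges are assumed to be undirected, and the fixed limits are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 26   /* assumed limit: one node per lowercase letter */
#define MAX_EDGES 8    /* assumed limit on neighbours per node */

typedef struct GNode {
    char label;
    int degree;                       /* number of neighbours in use */
    struct GNode *edges[MAX_EDGES];   /* pointers to neighbouring nodes */
} GNode;

/* The "regular list" of all nodes, so lone nodes are not lost */
typedef struct {
    GNode nodes[MAX_NODES];
    int count;
} NodeList;

/* Returns the node with this label, creating it the first time it is seen */
GNode *find_or_add(NodeList *list, char label)
{
    for (int i = 0; i < list->count; i++)
        if (list->nodes[i].label == label)
            return &list->nodes[i];
    assert(list->count < MAX_NODES);
    GNode *n = &list->nodes[list->count++];
    n->label = label;
    n->degree = 0;
    return n;
}

/* Links a and b in both directions (assumption: undirected edges) */
void link_nodes(GNode *a, GNode *b)
{
    assert(a->degree < MAX_EDGES && b->degree < MAX_EDGES);
    a->edges[a->degree++] = b;
    b->edges[b->degree++] = a;
}

/* Each entry is "xy" (an edge between x and y) or "x" (a lone node) */
void build_graph(NodeList *list, const char *entries[], int n)
{
    list->count = 0;
    for (int i = 0; i < n; i++) {
        GNode *a = find_or_add(list, entries[i][0]);
        if (entries[i][1] != '\0')
            link_nodes(a, find_or_add(list, entries[i][1]));
    }
}
```

Calling `build_graph` with the entries `"ab"`, `"bc"`, `"d"`, `"be"` yields five nodes; `b` ends up with three edge pointers (to `a`, `c` and `e`), while `d` stays in the node list with no edges.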
Search
Searching across graphs is a pretty basic idea. Start in a node, compare, move to one of its neighbors and do it again. There are a lot of pitfalls though, like getting into an endless loop and not knowing when to stop.
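One common way to avoid the endless loop is to remember which nodes have already been expanded. The sketch below is an assumption-laden illustration, not code from this answer: it uses a small adjacency matrix, and `NUM_V` and `reachable` are made-up names.

```c
#include <assert.h>

#define NUM_V 4   /* assumed number of vertices in this toy example */

/* Returns 1 if target can be reached from current, 0 otherwise.
   visited[] must be all zeros before the first call. */
int reachable(int adj[NUM_V][NUM_V], int current, int target, int visited[NUM_V])
{
    assert(current >= 0 && current < NUM_V);
    if (current == target)
        return 1;                 /* compare: found it, so stop */
    visited[current] = 1;         /* never expand this vertex again */
    for (int next = 0; next < NUM_V; next++) {
        if (adj[current][next] && !visited[next]) {
            if (reachable(adj, next, target, visited))
                return 1;
        }
    }
    return 0;                     /* every neighbour explored: stop */
}
```

Without the `visited` check, two vertices joined by an edge in both directions would keep calling each other forever. With it, each vertex is expanded at most once, which also answers the "when to stop" question: the search ends when the target is found or when no unvisited neighbour remains.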
You'll have to ask more specific questions if you want a better explanation than what you can find online already.
Related
So I have this function which does a depth-first search traversal of a graph and prints out the nodes traversed. But instead of printing the nodes, I want this function to return the node it newly moved to, one node per call of the function.
When I call the function with the start node 0, it should return the next node in the traversal (the order of the nodes when traversing is printed out when running the program, along with the graph's adjacency list), which is 1, but it is returning 3.
The traversal order is: 1 2 3 2 4 5 4 6
Below is what I have tried:
int DFS(struct Graph* graph, int vertex) {
struct node* adjList = graph->adjLists[vertex];
struct node* temp = adjList;
int back = 0;
graph->visited[vertex] = 1;
while (temp != NULL) {
int connectedVertex = temp->vertex;
if (graph->visited[connectedVertex] == 0) {
if ( back==1 ) {
return vertex;
printf("node: %d\n", vertex);
}
printf("node 1: %d\n", connectedVertex);
return DFS(graph, connectedVertex);
back = 1;
}
temp = temp->next;
}
return vertex;
}
And here is the function without the return statements (originally a void function):
void DFS(struct Graph* graph, int vertex) {
struct node* adjList = graph->adjLists[vertex];
struct node* temp = adjList;
int back = 0; // = "We have already expanded vertex"
graph->visited[vertex] = 1;
while (temp != NULL) {
int connectedVertex = temp->vertex;
if (graph->visited[connectedVertex] == 0) {
if ( back==1 ) // Report the revisited node
printf("node: %d\n", vertex);
printf("node: %d\n", connectedVertex);
DFS(graph, connectedVertex);
back = 1; // Tag this node as having been expanded.
}
temp = temp->next;
}
}
And here is my full program:
// DFS algorithm in C
#include <stdio.h>
#include <stdlib.h>
struct node {
int vertex;
struct node* next;
};
struct node* createNode(int v);
struct Graph {
int numVertices;
int* visited;
struct node** adjLists;
};
void DFS(struct Graph* graph, int vertex) {
struct node* adjList = graph->adjLists[vertex];
struct node* temp = adjList;
int back = 0; // = "We have already expanded vertex"
graph->visited[vertex] = 1;
while (temp != NULL) {
int connectedVertex = temp->vertex;
if (graph->visited[connectedVertex] == 0) {
if ( back==1 ) // Report the revisited node
printf("node: %d\n", vertex);
printf("node: %d\n", connectedVertex);
DFS(graph, connectedVertex);
back = 1; // Tag this node as having been expanded.
}
temp = temp->next;
}
}
// Create a node
struct node* createNode(int v) {
struct node* newNode = malloc(sizeof(struct node));
newNode->vertex = v;
newNode->next = NULL;
return newNode;
}
// Create graph
struct Graph* createGraph(int vertices) {
struct Graph* graph = malloc(sizeof(struct Graph));
graph->numVertices = vertices;
graph->adjLists = malloc(vertices * sizeof(struct node*));
graph->visited = malloc(vertices * sizeof(int));
int i;
for (i = 0; i < vertices; i++) {
graph->adjLists[i] = NULL;
graph->visited[i] = 0;
}
return graph;
}
void sortedInsert(struct node** head_ref,
struct node* new_node)
{
struct node* current;
/* Special case for the head end */
if (*head_ref == NULL
|| (*head_ref)->vertex
>= new_node->vertex) {
new_node->next = *head_ref;
*head_ref = new_node;
}
else {
/* Locate the node before
the point of insertion */
current = *head_ref;
while (current->next != NULL
&& current->next->vertex < new_node->vertex) {
current = current->next;
}
new_node->next = current->next;
current->next = new_node;
}
}
// Add edge
void addEdge(struct Graph* graph, int src, int dest) {
// Add edge from src to dest
sortedInsert(&graph->adjLists[src], createNode(dest));
// Add edge from dest to src
sortedInsert(&graph->adjLists[dest], createNode(src));
}
// Print the graph
void printGraph(struct Graph* graph) {
int v;
for (v = 0; v < graph->numVertices; v++) {
struct node* temp = graph->adjLists[v];
printf("\n Adjacency list of vertex %d\n ", v);
while (temp) {
printf("%d -> ", temp->vertex);
temp = temp->next;
}
printf("\n");
}
}
int main() {
struct Graph* graph = createGraph(7);
addEdge(graph, 0, 1);
addEdge(graph, 0, 3);
addEdge(graph, 1, 2);
addEdge(graph, 2, 3);
addEdge(graph, 2, 4);
addEdge(graph, 4, 5);
addEdge(graph, 4, 6);
printGraph(graph);
DFS(graph, 0);
return 0;
}
Help would be much appreciated.
"I want this function to return the node it newly moved to, return one node per call of the function."
This is a bad idea, because your function is recursive.
To get the nodes traversed in order, add the visited node index to a global data structure.
Note: Recursion is the correct way to go here. Returning the node visited from the recursive function will not work.
Allow me to describe how I would do this:
When searching through graphs, the concept of a "visitor" is useful. A visitor is a function that the search code calls each time it reaches a new node. It makes writing the search code slightly more complex, but you need only do it once. Once it is written, you can adapt the algorithm to serve different purposes without disturbing your carefully tested and optimized search code. In this case, all the visitor needs to do is record the node indexes as they are visited.
Note that once you have the visitor written, you can easily change the searching algorithm (say, from depth first to breadth first) without writing a new visitor.
Your visitor can look like this in C++:
#include <vector>
/// in global namespace
std::vector<int> gvIndexNodeVisitedOrder; // no parentheses: with "()" this would declare a function
void visitor( int indexNode )
{
gvIndexNodeVisitedOrder.push_back( indexNode );
}
The searching code looks like:
void depthFirstRecurse(
int v )
{
// Call the visitor
visitor(v);
// remember this node has been visited
gvNodeVisited[v] = 1;
// look at adjacent nodes
for (int w : adjacent(v)) {
// check node has not been visited
if (!gvNodeVisited[w])
{
// continue search from new node
depthFirstRecurse(w);
}
}
}
Note: I have placed stuff in the global namespace because the original question is tagged C. In reality I would use a C++ class and make the "globals" private attributes and methods of the class.
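Since the question is tagged C, here is a hedged sketch of the same visitor idea using a function pointer instead of C++ globals. It is not the code from this answer: `VGraph`, `Visitor`, `VisitTrace`, `depth_first` and `record_visit` are hypothetical names, and an adjacency matrix with a fixed `MAX_V` is assumed to keep the example short.

```c
#include <assert.h>

#define MAX_V 8   /* assumed limit on the number of vertices */

/* A visitor is any function with this shape; context carries its state */
typedef void (*Visitor)(int vertex, void *context);

typedef struct {
    int n;                     /* number of vertices in use */
    int adj[MAX_V][MAX_V];     /* adj[v][w] != 0 means v and w are adjacent */
    int visited[MAX_V];
} VGraph;

/* Recursive depth-first search that reports each new vertex to the visitor */
void depth_first(VGraph *g, int v, Visitor visit, void *context)
{
    visit(v, context);         /* call the visitor */
    g->visited[v] = 1;         /* remember this vertex has been visited */
    for (int w = 0; w < g->n; w++)
        if (g->adj[v][w] && !g->visited[w])
            depth_first(g, w, visit, context);
}

/* One possible visitor: record the order in which vertices are reached */
typedef struct {
    int order[MAX_V];
    int count;
} VisitTrace;

void record_visit(int vertex, void *context)
{
    VisitTrace *trace = context;
    assert(trace->count < MAX_V);
    trace->order[trace->count++] = vertex;
}
```

Passing the state through `context` avoids globals entirely: the caller declares a `VisitTrace`, calls `depth_first(&g, 0, record_visit, &trace)`, and afterwards reads `trace.order[0 .. trace.count-1]`. Swapping in a different visitor does not touch the search code.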
I added these two functions:
// Adds a node to a list of nodes
void addNode(struct node** nodeList, int vertex){
struct node *temp = *nodeList;
if(*nodeList == NULL){
*nodeList = createNode(vertex);
}else{
while(temp->next != NULL){
temp = temp->next;
}
temp->next = createNode(vertex);
}
}
// Prints a list of nodes
void printNodeList(struct node* nodeList) {
struct node* temp = nodeList;
while(temp != NULL){
printf("%d", temp->vertex);
if(temp->next != NULL){
printf(" -> ");
}
temp = temp->next;
}
printf("\n");
}
and modified DFS and main as follows:
// added last argument
void DFS(struct Graph* graph, int vertex, struct node** nodeList) {
struct node* adjList = graph->adjLists[vertex];
struct node* temp = adjList;
graph->visited[vertex] = 1;
addNode(nodeList, vertex); // added this
while (temp != NULL) {
int connectedVertex = temp->vertex;
if (graph->visited[connectedVertex] == 0) {
printf("node: %d\n", connectedVertex);
DFS(graph, connectedVertex, nodeList);
addNode(nodeList, vertex); // added this
}
temp = temp->next;
}
}
int main() {
struct node* nodeList = NULL;
struct Graph* graph = createGraph(7);
addEdge(graph, 0, 1);
addEdge(graph, 0, 3);
addEdge(graph, 1, 2);
addEdge(graph, 2, 3);
addEdge(graph, 2, 4);
addEdge(graph, 4, 5);
addEdge(graph, 4, 6);
printGraph(graph);
DFS(graph, 0, &nodeList);
printNodeList(nodeList);
return 0;
}
If I had to define the traversal order of the nodes, in your example graph it would not be 1 -> 2 -> 3 -> 2 -> 4 -> 5 -> 4 -> 6 but rather 0 -> 1 -> 2 -> 3 -> 2 -> 4 -> 5 -> 4 -> 6 -> 4 -> 2 -> 1 -> 0, since I think that any time you "land" on a different node (either because you call DFS or because DFS gives back control to the caller), that "counts" in the path you're following to search the graph, and this continues until you're back in the main function, at which point you've finished searching. Therefore I implemented that in the DFS function above, but if you need the order you mentioned, just add addNode(nodeList, vertex); below your printf statements and you should get it.
Since the function is recursive, you can't really use the return statement to return the visited nodes, because what you want at the end is a list of elements and not just a single value. For instance, in your code you defined the return type of DFS as int, which means that the function can only give you back a number, but when you call DFS in your main function you expect it to give you back a list of the nodes that got visited. You may be able to figure something out by returning a pointer to a data structure, or by returning an int (the visited vertex) and calling something like addNode(list, DFS(g, vertex)), but you would still need to pass the list to DFS (otherwise you won't be able to call addNode(list,...) inside of it), so you would get addNode(list, DFS(g, vertex, list)). Therefore I don't think you would gain any advantage from it, but I don't know.
What I did is define a list in the main function and pass it to the recursive function (which does not need to return anything), which is then able to add the visited node to it when necessary. The first call to addNode(nodeList, vertex) happens only once per vertex, since you never call DFS more than once for any vertex, while the second happens every time you come back to a vertex after having searched into one of its neighbors.
So I need to do a depth first search traversal of a given graph, however if a node in the graph has multiple adjacent neighbours, I need to choose the node with the lowest value to go to. So I implemented the following recursive depth first search function:
void DFS(struct Graph *graph, int vertex) {
struct node *adjList = graph->adjLists[vertex];
struct node *temp = adjList;
graph->visited[vertex] = 1;
printf("Visited %d \n", vertex);
int neighbouring_nodes[graph->numVertices];
while (temp != NULL) {
int count = 0;
struct node *temp_cpy = temp;
while (temp_cpy != NULL) {
neighbouring_nodes[count] = temp_cpy->vertex;
count++;
temp_cpy = temp_cpy->next;
}
int smallest_node = neighbouring_nodes[0];
for (int i = 0; i < count; i++) {
if (neighbouring_nodes[i] < smallest_node) {
smallest_node = neighbouring_nodes[i];
}
}
if (graph->visited[smallest_node] == 0) {
DFS(graph, smallest_node);
} else if (graph->visited[smallest_node] == 1 && count == 1) {
//if the node is visited but is it the only neighbour
DFS(graph, smallest_node);
}
temp = temp->next;
}
}
But when I run my program, it results in an infinite loop. I think I know why I am getting an infinite loop, it might be because there is never a return condition, so the recursive function just keeps running?
Is this type of depth first search possible with a recursive function? If yes, where am I going wrong? If no, how would I do it iteratively?
Help would be much appreciated.
Below is my full program without the DFS function:
// DFS algorithm in C
#include <stdio.h>
#include <stdlib.h>
struct node {
int vertex;
struct node *next;
};
struct node *createNode(int v);
struct Graph {
int numVertices;
int *visited;
struct node **adjLists;
};
// Create a node
struct node *createNode(int v) {
struct node *newNode = malloc(sizeof(struct node));
newNode->vertex = v;
newNode->next = NULL;
return newNode;
}
// Create graph
struct Graph *createGraph(int vertices) {
struct Graph *graph = malloc(sizeof(struct Graph));
graph->numVertices = vertices;
graph->adjLists = malloc(vertices * sizeof(struct node*));
graph->visited = malloc(vertices * sizeof(int));
int i;
for (i = 0; i < vertices; i++) {
graph->adjLists[i] = NULL;
graph->visited[i] = 0;
}
return graph;
}
// Add edge
void addEdge(struct Graph *graph, int src, int dest) {
// Add edge from src to dest
struct node *newNode = createNode(dest);
newNode->next = graph->adjLists[src];
graph->adjLists[src] = newNode;
// Add edge from dest to src
newNode = createNode(src);
newNode->next = graph->adjLists[dest];
graph->adjLists[dest] = newNode;
}
// Print the graph
void printGraph(struct Graph *graph) {
int v;
for (v = 0; v < graph->numVertices; v++) {
struct node *temp = graph->adjLists[v];
printf("\n Adjacency list of vertex %d\n ", v);
while (temp) {
printf("%d -> ", temp->vertex);
temp = temp->next;
}
printf("\n");
}
}
int main() {
struct Graph *graph = createGraph(4);
addEdge(graph, 0, 1);
addEdge(graph, 0, 2);
addEdge(graph, 1, 2);
addEdge(graph, 2, 3);
printGraph(graph);
DFS(graph, 2);
return 0;
}
"if a node in the graph has multiple adjacent neighbours, I need to choose the node with the lowest value to go to."
I assume the 'value' of a node is an attribute of the node object?
Most implementations of DFS will first look at the node with the lowest index in the data structure containing the node objects. So, if you first sort the nodes in your data structure into ascending value order, then the DFS will do what you want without needing to change the DFS code.
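A minimal sketch of that idea, assuming the "value" of a node is simply its vertex number as in the question's program: if each adjacency list is kept in ascending order at insertion time, an unmodified recursive DFS visits the lowest-numbered unvisited neighbour first. The names `anode`, `insert_ascending` and `dfs_ordered` are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct anode {
    int vertex;
    struct anode *next;
};

/* Inserts vertex v into the list at *head, keeping ascending order */
void insert_ascending(struct anode **head, int v)
{
    struct anode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->vertex = v;
    /* advance to the first link whose node is not smaller than v */
    while (*head != NULL && (*head)->vertex < v)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Plain recursive DFS; order[] receives vertices as they are visited */
void dfs_ordered(struct anode *adj[], int visited[], int v,
                 int order[], int *count)
{
    visited[v] = 1;
    order[(*count)++] = v;
    for (struct anode *p = adj[v]; p != NULL; p = p->next)
        if (!visited[p->vertex])
            dfs_ordered(adj, visited, p->vertex, order, count);
}
```

For the graph in the question (edges 0-1, 0-2, 1-2, 2-3, each inserted in both directions) a search started at vertex 2 visits 2, 0, 1, 3. The sketch never frees the list nodes; a real program would add a cleanup function.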
Here is what I came up with:
void DFS(struct Graph* graph, int vertex) {
struct node* temp = graph->adjLists[vertex];
graph->visited[vertex] = 1;
printf("Visited %d \n", vertex);
int neighbouring_nodes[graph->numVertices];
int count = 0;
while(temp != NULL) {
neighbouring_nodes[count] = temp->vertex;
count++;
temp = temp->next;
}
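if (count == 0) return; // vertex has no neighbours, so neighbouring_nodes[0] would be uninitialized
int smallest_node = neighbouring_nodes[0];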
// Need to search (at most) in every neighbouring node
for (int i = 0; i < count; i++) {
// Go through all nodes in neighbouring_nodes array in order
// to find the smallest unvisited one, if it exists
for (int j = 0; j < count; j++){
// if current smallest_node has already been visited and
// neighbouring_nodes[j] is unvisited, assign it to smallest_node
if (graph->visited[smallest_node] == 1 && graph->visited[neighbouring_nodes[j]] == 0){
smallest_node = neighbouring_nodes[j];
}
// if neighbouring_nodes[j] is smaller than smallest_node,
// assign it to smallest_node
if (graph->visited[neighbouring_nodes[j]] == 0 && neighbouring_nodes[j] < smallest_node){
smallest_node = neighbouring_nodes[j];
}
}
if (graph->visited[smallest_node] == 0){
// calls DFS on the smallest unvisited neighboring node, if it exists
DFS(graph, smallest_node);
}else{
// otherwise (all neighboring nodes already visited)
// return control to the caller function
return;
}
}
}
I'm not 100% sure I understood what you wanted to do with the while (temp != NULL) and while (temp_cpy != NULL) loops, and I couldn't really figure out a way to use this approach, especially in your particular case, in which you want to visit the neighboring nodes in ascending order.
Let's assume a simple graph like 6->0->1. Calling DFS(g, 0) will get temp to point to 6->1->NULL (it could also be 1->6->NULL, depending on how you construct the graph), then smallest_node will be 1, so node 1 will be visited, and temp = temp->next will "assign" 1->NULL to temp. Back at the beginning of the loop, temp_cpy will now "be equal" to temp, hence 1->NULL. Node 6 is not on the list anymore even though it was not visited, while the already visited node 1 is still there. Also, count is now equal to 1, so the condition (graph->visited[smallest_node] == 1 && count == 1) is met and DFS(g, 1) is called, which should not happen since node 1 was already visited. The infinite loop arises from this, since the previous condition is always met when temp has one (already visited) element left ([some value]->NULL). Once you reach that point you always call DFS(g, [some value]), and this will never give back control, since before reaching the temp = temp->next statement (which should assign NULL to temp, hence ending the while loop), DFS(g, [some other value]) is called again, which at some point will again call DFS(g, [some value]), and so forth.
As mentioned, one problem your original code has is that you call the DFS function also for an already visited vertex, and this should never be the case. When you encounter an already visited neighboring vertex, you want either to check the next or, if there are no unvisited neighboring vertices left, to give back control to the caller function. Therefore the last if else statement should not be there. The second problem is that smallest_node is selected in the wrong way. This is because temp_cpy, as explained above, is not constructed in such a way that it necessarily contains all unvisited neighboring nodes and also because you're actually looking for the smallest element in this list, regardless if it has already been visited or not (again because of the assumption that temp_cpy contains only all unvisited nodes). In fact you should be looking for the "smallest unvisited node" rather than the "smallest node".
In my code I go through all neighboring nodes with two for loops, find the smallest unvisited one and call DFS(g, [smallest unvisited node]) and once there are no unvisited neighbors left, return control back to the caller function.
I hope this is somewhat understandable, and I also hope I'm not missing something about what you had in mind with your implementation, in which case I would be very much interested in some explanations!
Here is a simpler version of the DFS in which neighboring nodes are checked and eventually visited in the order they're presented in the adjList. In this case I think the while (temp != NULL)/temp = temp->next approach makes sense:
void DFS(struct Graph *graph, int vertex) {
struct node *temp = graph->adjLists[vertex];
graph->visited[vertex] = 1;
printf("Visited %d \n", vertex);
// for vertex search in every neighboring node
while (temp != NULL) {
// if neighboring vertex temp->vertex not visited, then search there
if (graph->visited[temp->vertex] == 0) {
DFS(graph, temp->vertex);
// if already visited, go to the next vertex on the neighbors list
}else{
temp = temp->next;
}
}
// when searched in all neighboring vertexes return control to caller
return;
}
I am trying to write a function that grows a linked list while keeping its nodes in ascending order. I have been stuck for a while and have made little progress. I believe my insertLLInOrder is correct and it's just createlinkedList that is messing it up.
Sometimes my output comes out fully and other times it only prints out some of the list.
Anything helps!
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
typedef struct node {
int data;
struct node *next;
} node;
node *createlinkedList(int num);
node *insertLLInOrder(node * h, node * n);
void display(node * head);
int randomVal(int min, int max);
int
main()
{
int usernum = 0;
node *HEAD = NULL;
printf("How many Nodes do you want? ");
scanf("%d", &usernum);
srand(time(0));
HEAD = createlinkedList(usernum);
display(HEAD);
return 0;
}
node *
createlinkedList(int num)
{
int i;
int n = num;
node *head = NULL;
node *newNode;
node *temp;
for (i = 0; i < n; i++) {
newNode = (node *) malloc(sizeof(node));
newNode->next = NULL;
newNode->data = randomVal(1, 9);
temp = insertLLInOrder(head, newNode);
head = temp;
}
return head;
}
int
randomVal(int min, int max)
{
return min + (rand() % (max - min)) + 1;
}
node *
insertLLInOrder(node * h, node * n)
{
//h is the head pointer, n is the pointer to new node
node *ptr = h;
node *previous = NULL;
while ((ptr != NULL) && (ptr->data < n->data)) {
previous = ptr; // remember previous node
ptr = ptr->next; // check for the next node
}
if (previous == NULL) {
//h is an empty list initially
n->next = NULL;
return n; // return the pointer of the new node
}
else {
//if there are nodes in the linked list
// previous will point to the node that has largest value, but smaller than new node
n->next = previous->next; // insert new node between previous, and previous->next
previous->next = n;
return h; // return old head pointer
}
}
void
display(node * head)
{
node *p = head;
while (p != NULL) {
printf("%d, ", p->data);
p = p->next;
}
}
In your insertLLInOrder(), if the while loop leaves previous == NULL, it means that you must insert at the head of the list, which is not what you are doing.
Just change n->next = NULL; to n->next = h; and it should improve behavior.
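For reference, here is the function with that one-line change applied, as a sketch (the `assert` is an addition of mine, not part of the original code). When `previous` is still NULL, the new node goes in front of whatever `h` currently is, which covers both the empty list and the "smaller than the current head" case.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* h is the current head, n the node to insert; returns the new head */
node *insertLLInOrder(node *h, node *n)
{
    node *ptr = h;
    node *previous = NULL;

    assert(n != NULL);
    while (ptr != NULL && ptr->data < n->data) {
        previous = ptr;        /* remember previous node */
        ptr = ptr->next;       /* check the next node */
    }
    if (previous == NULL) {
        /* empty list, or n is not larger than the current head:
           n becomes the new head and the old list hangs off it */
        n->next = h;
        return n;
    }
    /* previous holds the largest value that is smaller than n->data */
    n->next = previous->next;
    previous->next = n;
    return h;
}
```

With the original `n->next = NULL;`, inserting a value smaller than the head silently dropped the rest of the list, which matches the symptom of only part of the list being printed on some runs.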
Taking a step back and perspective
This is a very simple error, but it is made harder to spot because of the way you wrote your code.
The bug in itself is not very interesting, but it can help to get a higher perspective on why it happened and how to avoid such bugs.
And, no, running a debugger is not very helpful for such cases!
Having to run a debugger happens sometimes, but it merely means that you have lost control of your program. It's like a parachute: it can be a safety measure for a pilot, but if he has to use it, it also means that the pilot lost control and his plane is crashing.
Do you know the story of the Three Ninjas Programmers?
The three Ninjas
The chief of the ninjas orders three Ninjas to show him their training level. There is a Noob, a Beginner and a Senior. He asks them to reach a small cabin on the other side of a field, take some object inside and come back.
The first Ninja is a noob. He runs and jumps across the field at full speed, but soon enough he steps on a (plaster) mine. He goes back to the start line and confesses his failure, which is obvious because his previously black shirt is now covered in white plaster.
The second Ninja shows some practice. You can tell he failed like the Noob on a previous try and that now he is wary. He is very slow and very careful. He sneaks across the field, watching closely everywhere at each step. He gets quite close to the cabin, and everybody believes he will succeed, but eventually he is also blown up by a mine at the last second. He also goes back disappointed to the starting point, but he somehow believes it will be hard for the third Ninja to do any better.
The third Ninja is a Senior. He walks calmly across the field in a straight line, enters the cabin, and comes back without any visible trouble, still merely walking across the field.
When he gets back to the starting point the other two Ninjas are stunned and ask him eagerly:
- How did you avoid the mines?
- Obviously, I didn't put any mines on my path in the first place; why did you put mines in yours?
Back to the code
So, what could be done differently when writing such a program?
First, using random values in code is a bad idea. The consequence is that the code's behavior can't be repeated from one run to the next.
It is also important that the code clearly separates user inputs (data) from the code manipulating that data.
In this case, it means that the createlinkedList() function should probably have another signature, something like node *createlinkedList(int num, int data[]), where the data[] array contains the values to sort. It is still possible to fill the input data with random values if that is what we want.
That way, we can easily create test sets and unit tests, like in the code below:
Home made unit tests suite
#include <stdio.h>
#include <stdlib.h>
typedef struct node {
int data;
struct node *next;
} node;
node *createlinkedList(int num, int * data);
node *insertLLInOrder(node * h, node * n);
/* No need to have a test framework to write unit tests */
/* Check_LL is some helper function comparing a linked list with test data from an array */
int check_LL(node * head, int num, int * data)
{
node *p = head;
int n = 0;
for (; n < num ; n++){
if (!p){return 0;}
if (p->data != data[n]){return 0;}
p = p->next;
}
return p == NULL;
}
void test_single_node()
{
printf("Running Test %s: ", __FUNCTION__);
int input_data[1] = {1};
int expected[1] = {1};
node * HEAD = createlinkedList(1, input_data);
printf("%s\n", check_LL(HEAD, 1, expected)?"PASSED":"FAILED");
}
void test_insert_after()
{
printf("Running Test %s: ", __FUNCTION__);
int input_data[2] = {1, 2};
int expected[2] = {1, 2};
node * HEAD = createlinkedList(2, input_data);
printf("%s\n", check_LL(HEAD, 2, expected)?"PASSED":"FAILED");
}
void test_insert_before()
{
printf("Running Test %s: ", __FUNCTION__);
int input_data[2] = {2, 1};
int expected[2] = {1, 2};
node * HEAD = createlinkedList(2, input_data);
printf("%s\n", check_LL(HEAD, 2, expected)?"PASSED":"FAILED");
}
/* We could leave test code in program and have a --test command line option to call the code */
int
main()
{
test_single_node();
test_insert_after();
test_insert_before();
}
node *
createlinkedList(int num, int * data)
{
int i;
node *head = NULL;
for (i = 0; i < num; i++) {
node * newNode = (node *) malloc(sizeof(node));
newNode->next = NULL;
newNode->data = data[i];
head = insertLLInOrder(head, newNode);
}
return head;
}
node *
insertLLInOrder(node * h, node * n)
{
//h is the head pointer, n is the pointer to new node
node *ptr = h;
node *previous = NULL;
while ((ptr != NULL) && (ptr->data < n->data)) {
previous = ptr; // remember previous node
ptr = ptr->next; // check for the next node
}
if (previous == NULL) {
//h is an empty list initially
n->next = NULL;
return n; // return the pointer of the new node
}
else {
//if there are nodes in the linked list
// previous will point to the node that has largest value, but smaller than new node
n->next = previous->next; // insert new node between previous, and previous->next
previous->next = n;
return h; // return old head pointer
}
}
As you can see, the third test spots the bug.
Of course, you could use some available third party Unit Test library, but the most important point is not the test library, but to write the tests.
Another point is that really you should interleave writing tests and writing implementation code.
This typically helps for writing good code and is what people call TDD. But my answer is probably already long enough, so I won't elaborate here on TDD.
Below is my code that provides a simple interface to linked lists in C, so that they can behave similarly to ArrayLists in Java.
My question is this:
My code will only work for people who want to have a linked list of ints and nothing else.
I understand that they can use the int to hold the address of their data type via pointers.
However, I want something more versatile.
Can I use void* instead of the int in the node struct, so that users can provide int, double, char, etc.?
Code:
#include <stdio.h>
#include <stdlib.h>
int AL_appened(int val);
struct Tuple AL_find(int val);
int AL_remove(int val);
int AL_setup();
int AL_len();
typedef struct Node Node;
typedef struct Tuple Tuple;
struct Node{
int val;
Node *next;
};
struct Tuple{
int index;
int val;
};
Node *root, *curr;
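int AL_appened(int val) {
Node *tmp;
tmp = (Node *)malloc(sizeof (Node));
if (tmp == NULL) return -1; // allocation failed
tmp->val = val;
tmp->next = NULL; // the new node is the tail, so terminate the list here
curr->next = tmp;
curr = tmp;
root->val++;
return 0;
}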
struct Tuple AL_find(int val){
if(root->next) {
curr = root->next;
int count = 0;
while (curr->next){
if (curr->val == val){
Tuple r = {count+=1, val};
return r;
}
count++;
curr = curr->next;
}
if (curr->val == val){
Tuple r = {count+=1, val};
return r;
}
}
Tuple r = {-1, -1};
return r;
}
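int AL_remove(int val){
Node *prev = root;
// walk until prev->next is the node holding val, or the list ends
while (prev->next && prev->next->val != val){
prev = prev->next;
}
if (!prev->next) return -1; // val is not in the list
Node *victim = prev->next;
prev->next = victim->next; // unlink the node, then free that same node
if (victim == curr) curr = prev; // keep curr off freed memory
free(victim);
root->val--;
return 1;
}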
int AL_setup(){
root = (Node *)malloc(sizeof(Node));
root->val = 0;
root->next = 0;
curr = root;
return 0;
}
int AL_len(){
return root->val;
}
void printAll(){
curr=root->next;
while (curr->next != NULL) {
printf("%d\n",curr->val);
curr=curr->next;
}
printf("%d\n",curr->val);
}
int main(){
AL_setup(); //setup the root, we will use root to keep track of the number of links
AL_appened(1); // append 1 so it should look like root>1
AL_appened(2); // append 2 so it should look like root>1>2
AL_appened(3);
AL_appened(4);
printf("%d\n", AL_len()); // print len of list
Tuple results = AL_find(4); // find 4 in list
printf("%d %d\n", results.index, results.val); // return the index and the number found
AL_remove(3);
Tuple results2 = AL_find(4);
printf("%d %d\n", results2.index, results2.val);
results2 = AL_find(4);
printf("%d %d\n", results2.index, results2.val);
printAll(); // print entire list
return 0;
}
Well, you could have void* as the value type, but that would require an extra memory allocation per-node. Another approach I've seen is to put only the link in the struct, and have the user declare their own node type (with the link first), and cast to the generic node type. I've even seen linked list libraries put in preprocessor macros, such that DEFINE_LINKED_LIST(Foo) makes a FooList, a FooNode, an appendFoo function, etc. Finally, you could put all the types you think you might want into a union for the value.
Ultimately, there's no especially clean, pretty way of doing this; C is not well-suited to polymorphism. You'll need to decide which imperfect option you like best.
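As a rough sketch of the first option (a void* payload, with one extra allocation per node), something like the following could work. The names VNode, VList, vlist_init, vlist_append, vlist_get, and vlist_free are made up for this illustration and are not part of the question's code; the list stores a private copy of each value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic node: data points to a private copy of the caller's value. */
typedef struct VNode {
    void *data;
    struct VNode *next;
} VNode;

typedef struct {
    VNode *head, *tail;
    size_t elem_size;   /* size in bytes of every stored value */
    size_t len;
} VList;

void vlist_init(VList *l, size_t elem_size) {
    assert(l != NULL && elem_size > 0);
    l->head = l->tail = NULL;
    l->elem_size = elem_size;
    l->len = 0;
}

/* Copies *value into a new node at the end; returns 1 on success, 0 on out-of-memory. */
int vlist_append(VList *l, const void *value) {
    VNode *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->data = malloc(l->elem_size);   /* the extra allocation per node */
    if (n->data == NULL) { free(n); return 0; }
    memcpy(n->data, value, l->elem_size);
    n->next = NULL;
    if (l->tail) l->tail->next = n; else l->head = n;
    l->tail = n;
    l->len++;
    return 1;
}

/* Returns a pointer to the stored value at index i, or NULL if i is out of range. */
void *vlist_get(const VList *l, size_t i) {
    const VNode *n = l->head;
    while (n && i--) n = n->next;
    return n ? n->data : NULL;
}

void vlist_free(VList *l) {
    VNode *n = l->head;
    while (n) {
        VNode *next = n->next;
        free(n->data);
        free(n);
        n = next;
    }
    l->head = l->tail = NULL;
    l->len = 0;
}
```

A caller would initialise with vlist_init(&l, sizeof(double)), pass the address of each value to vlist_append, and read back with a cast such as *(double *)vlist_get(&l, 0). That cast is exactly where the compiler stops checking types, which is the price of this approach.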
The Linux kernel developers had a similar problem, and they came up with a general solution to it. In short, their approach is to define a list_head structure that is embedded in whatever user structure needs to be part of a linked list. Since the inclusion is the other way around compared to your code, the linked list implementation 1. does not constrain the user structure in any way, and 2. allows a user structure to be a member of more than one linked list. This is a very flexible design, and as it's GPL'd, you can use it in any GPL'd code.
(This answer to another question on SO points to a user space adaptation of the kernel lists. I have not tested it myself, so use it at your own risk. It looks sane, though.)
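To make the idea concrete, here is a simplified user-space sketch of an embedded link. It is not the kernel's own header: the names ListLink, LINK_ENTRY, list_init, list_add_tail, list_del, and struct person are invented for illustration, and offsetof plays the role that the kernel's container_of macro plays there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical embedded link, loosely modelled on the idea behind list_head. */
typedef struct ListLink {
    struct ListLink *prev, *next;
} ListLink;

/* Recover the enclosing structure from a pointer to its embedded link. */
#define LINK_ENTRY(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head that points to itself (circular, no NULLs). */
void list_init(ListLink *head) {
    assert(head != NULL);
    head->prev = head->next = head;
}

/* Insert 'item' just before 'head', i.e. at the end of the list. */
void list_add_tail(ListLink *head, ListLink *item) {
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Unlink 'item'; the list never owns the memory, so nothing is freed. */
void list_del(ListLink *item) {
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->prev = item->next = item;
}

/* A user structure that can sit on two independent lists at once. */
struct person {
    const char *name;
    int age;
    ListLink by_arrival;   /* link for one list */
    ListLink by_team;      /* link for another list */
};

/* Sum the ages of everyone on the list threaded through 'by_arrival'. */
int total_age(const ListLink *head) {
    int sum = 0;
    const ListLink *p;
    for (p = head->next; p != head; p = p->next)
        sum += LINK_ENTRY(p, struct person, by_arrival)->age;
    return sum;
}
```

Because the link lives inside struct person, adding a person to a list needs no allocation at all, and the same person can be threaded onto a second list through by_team without the list code knowing anything about the structure.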
I'm trying to implement a graph to store a list of data from a text file such as the following:
0,1 (node 0 links to 1)
0,2 (node 0 links to 2)
1,2 (node 1 links to 2)
2,1 (node 2 links to 1)
Anyway, I run into trouble when it comes to defining the structures. I'm torn between using a matrix or adjacency lists, but I think I will go with lists; I am just not sure how to define the structures. Should I use variable-sized arrays, linked lists, or something else? Which way would be the easiest?
struct grph{
};
struct node{
//ID of the node
int id;
};
Second, how do I store the data in this graph? This is where I have the most trouble. I thought it would be easy, like a linked list where you just keep adding a node to the end. The difference here is that each node can point to many different nodes or to none at all. How do I link the graph structure with all the linked node structures?
When using linked lists, for example, how would I store what node 0 connects to in the example above? I understand you use a matrix or a list/array, but I'm seriously confused because of the lack of examples of such implementations in C. The examples I did find left me more confused than I was before.
This is just an example:
struct node{
int id;
struct node **out;
int num_out;
/* optional: if you want doubly links */
struct node **in;
int num_in;
};
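/* quick access to a node given an id */
struct node *node_list;
/* connect 'from' to 'to' (needs <stdlib.h> for realloc) */
void link(struct node *graph, int from, int to) {
struct node *nfrom = &node_list[from],
*nto = &node_list[to];
/* grow the outgoing array by one slot; keep the old array if realloc fails */
struct node **grown = realloc(nfrom->out,
sizeof(struct node*) * (nfrom->num_out + 1));
(void)graph; /* unused here: nodes are looked up through node_list */
if (grown == NULL)
return;
nfrom->out = grown;
nfrom->out[nfrom->num_out++] = nto;
/* also do similar to nto->in if you want doubly links */
}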
In answer to your first question: adjacency matrix vs adjacency lists? If you expect your graph to be dense, i.e. most nodes are adjacent to most other nodes, then go for the matrix, as most operations are much easier on matrices. If you really need a transitive closure, then matrices are probably better too, as closures tend to be dense. Otherwise adjacency lists are faster and smaller.
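For comparison with the list-based structures, a dense graph stored as a matrix might be sketched like this. The fixed bound MAX_NODES and the names Matrix, matrix_init, matrix_link, matrix_has_edge, and matrix_out_degree are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 8   /* assumed upper bound, chosen only for this example */

/* adj[u][v] is 1 when there is an edge u -> v, 0 otherwise. */
typedef struct {
    int n;                                   /* number of nodes in use */
    unsigned char adj[MAX_NODES][MAX_NODES];
} Matrix;

void matrix_init(Matrix *m, int n) {
    assert(m != NULL && n >= 0 && n <= MAX_NODES);
    m->n = n;
    memset(m->adj, 0, sizeof m->adj);
}

/* Add the directed edge from -> to; returns 0 if either id is out of range. */
int matrix_link(Matrix *m, int from, int to) {
    if (from < 0 || from >= m->n || to < 0 || to >= m->n)
        return 0;
    m->adj[from][to] = 1;
    return 1;
}

/* An edge test is a single array lookup, which is the main appeal of a matrix. */
int matrix_has_edge(const Matrix *m, int from, int to) {
    return m->adj[from][to];
}

/* Counting successors scans a whole row, however few edges the node has. */
int matrix_out_degree(const Matrix *m, int node) {
    int v, count = 0;
    for (v = 0; v < m->n; v++)
        count += m->adj[node][v];
    return count;
}
```

The storage is n*n cells no matter how many edges exist, which is why the matrix only pays off when the graph is dense.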
A graph would look as follows:
typedef struct node * node_p;
typedef struct edge * edge_p;
typedef struct edge
{ node_p source, target;
/* Add any data in the edges */
} edge;
typedef struct node
{ edge_p * pred, * succ;
node_p next;
/* Add any data in the nodes */
} node;
typedef struct graph
{ node_p N;
} graph;
The N field of graph would start a linked list of the nodes of the graph using the next field of node to link the list. The pred and succ can be arrays allocated using malloc and realloc for the successor and predecessor edges in the graph (arrays of pointers to edges and NULL terminated). Even though keeping both successor and predecessors may seem redundant, you will find that most graph algorithms like to be able to walk both ways. The source and target field of an edge point back to the nodes. If you don't expect to store data in the edges, then you could let the pred and succ arrays point back directly to the nodes and forget about the edge type.
Don't try to use realloc on N in the graph because all the addresses of the nodes may change and these are heavily used in the remainder of the graph.
P.S.: Personally I prefer circular linked lists over NULL-terminated linked lists, because the code for most, if not all, operations is much simpler. In that case graph would contain a (dummy) node instead of a pointer.
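The NULL-terminated pred and succ arrays described above could be maintained along these lines. This is only a sketch: the helper names edge_array_push, graph_add_node, and graph_add_edge are hypothetical, and the id field stands in for "any data in the nodes". Each node gets its own malloc, so node addresses stay stable as the graph grows.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node *node_p;
typedef struct edge *edge_p;
typedef struct edge { node_p source, target; } edge;
typedef struct node { edge_p *pred, *succ; node_p next; int id; } node;
typedef struct graph { node_p N; } graph;

/* Append e to a NULL-terminated array of edge pointers; returns 0 on failure. */
static int edge_array_push(edge_p **arr, edge_p e) {
    size_t n = 0;
    edge_p *grown;
    if (*arr)
        while ((*arr)[n]) n++;                      /* count existing entries */
    grown = realloc(*arr, (n + 2) * sizeof **arr);  /* new entry + terminator */
    if (grown == NULL) return 0;
    grown[n] = e;
    grown[n + 1] = NULL;
    *arr = grown;
    return 1;
}

/* Allocate a node on its own and push it on the front of the list g->N. */
node_p graph_add_node(graph *g, int id) {
    node_p n = malloc(sizeof *n);
    assert(g != NULL);
    if (n == NULL) return NULL;
    n->pred = n->succ = NULL;
    n->id = id;
    n->next = g->N;
    g->N = n;
    return n;
}

/* Create the edge s -> t and register it with both endpoints. */
edge_p graph_add_edge(node_p s, node_p t) {
    edge_p e = malloc(sizeof *e);
    if (e == NULL) return NULL;
    e->source = s;
    e->target = t;
    if (!edge_array_push(&s->succ, e)) { free(e); return NULL; }
    if (!edge_array_push(&t->pred, e)) {
        size_t n = 0;
        while (s->succ[n] != e) n++;
        s->succ[n] = NULL;   /* e was the last entry, so this undoes the push */
        free(e);
        return NULL;
    }
    return e;
}
```

Only the small pointer arrays are ever passed to realloc; the nodes and edges themselves never move, so the source, target, and next pointers remain valid.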
You could do something like this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct
{
void* pElements;
size_t ElementSize;
size_t Count; // how many elements exist
size_t TotalCount; // for how many elements space allocated
} tArray;
void ArrayInit(tArray* pArray, size_t ElementSize)
{
pArray->pElements = NULL;
pArray->ElementSize = ElementSize;
pArray->TotalCount = pArray->Count = 0;
}
void ArrayDestroy(tArray* pArray)
{
free(pArray->pElements);
ArrayInit(pArray, 0);
}
int ArrayGrowByOne(tArray* pArray)
{
if (pArray->Count == pArray->TotalCount) // used up all allocated space
{
size_t newTotalCount, newTotalSize;
void* p;
if (pArray->TotalCount == 0)
{
newTotalCount = 1;
}
else
{
newTotalCount = 2 * pArray->TotalCount; // double the allocated count
if (newTotalCount / 2 != pArray->TotalCount) // count overflow
return 0;
}
newTotalSize = newTotalCount * pArray->ElementSize;
if (newTotalSize / pArray->ElementSize != newTotalCount) // size overflow
return 0;
p = realloc(pArray->pElements, newTotalSize);
if (p == NULL) // out of memory
return 0;
pArray->pElements = p;
pArray->TotalCount = newTotalCount;
}
pArray->Count++;
return 1;
}
int ArrayInsertElement(tArray* pArray, size_t pos, void* pElement)
{
if (pos > pArray->Count) // bad position
return 0;
if (!ArrayGrowByOne(pArray)) // couldn't grow
return 0;
if (pos < pArray->Count - 1)
memmove((char*)pArray->pElements + (pos + 1) * pArray->ElementSize,
(char*)pArray->pElements + pos * pArray->ElementSize,
(pArray->Count - 1 - pos) * pArray->ElementSize);
memcpy((char*)pArray->pElements + pos * pArray->ElementSize,
pElement,
pArray->ElementSize);
return 1;
}
typedef struct
{
int Id;
int Data;
tArray LinksTo; // links from this node to other nodes (array of Id's)
tArray LinksFrom; // back links from other nodes to this node (array of Id's)
} tNode;
typedef struct
{
tArray Nodes;
} tGraph;
void GraphInit(tGraph* pGraph)
{
ArrayInit(&pGraph->Nodes, sizeof(tNode));
}
void GraphPrintNodes(tGraph* pGraph)
{
size_t i, j;
if (pGraph->Nodes.Count == 0)
{
printf("Empty graph.\n");
}
for (i = 0; i < pGraph->Nodes.Count; i++)
{
tNode* pNode = (tNode*)pGraph->Nodes.pElements + i;
printf("Node %d:\n Data: %d\n", pNode->Id, pNode->Data);
if (pNode->LinksTo.Count)
{
printf(" Links to:\n");
for (j = 0; j < pNode->LinksTo.Count; j++)
{
int* p = (int*)pNode->LinksTo.pElements + j;
printf(" Node %d\n", *p);
}
}
}
}
void GraphDestroy(tGraph* pGraph)
{
size_t i;
for (i = 0; i < pGraph->Nodes.Count; i++)
{
tNode* pNode = (tNode*)pGraph->Nodes.pElements + i;
ArrayDestroy(&pNode->LinksTo);
ArrayDestroy(&pNode->LinksFrom);
}
ArrayDestroy(&pGraph->Nodes);
}
int NodeIdComparator(const void* p1, const void* p2)
{
const tNode* pa = p1;
const tNode* pb = p2;
if (pa->Id < pb->Id)
return -1;
if (pa->Id > pb->Id)
return 1;
return 0;
}
int IntComparator(const void* p1, const void* p2)
{
const int* pa = p1;
const int* pb = p2;
if (*pa < *pb)
return -1;
if (*pa > *pb)
return 1;
return 0;
}
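size_t GraphFindNodeIndexById(tGraph* pGraph, int Id)
{
tNode key; // the comparator reads both arguments as tNode, so the key must be one too
tNode* pNode;
if (pGraph->Nodes.Count == 0) // nothing to search; also avoids passing a NULL base to bsearch
return (size_t)-1;
key.Id = Id;
pNode = bsearch(&key,
pGraph->Nodes.pElements,
pGraph->Nodes.Count,
pGraph->Nodes.ElementSize,
&NodeIdComparator);
if (pNode == NULL)
return (size_t)-1;
return pNode - (tNode*)pGraph->Nodes.pElements;
}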
int GraphInsertNode(tGraph* pGraph, int Id, int Data)
{
size_t idx = GraphFindNodeIndexById(pGraph, Id);
tNode node;
if (idx != (size_t)-1) // node with this Id already exist
return 0;
node.Id = Id;
node.Data = Data;
ArrayInit(&node.LinksTo, sizeof(int));
ArrayInit(&node.LinksFrom, sizeof(int));
if (!ArrayInsertElement(&pGraph->Nodes, pGraph->Nodes.Count, &node))
return 0;
qsort(pGraph->Nodes.pElements,
pGraph->Nodes.Count,
pGraph->Nodes.ElementSize,
&NodeIdComparator); // maintain order for binary search
return 1;
}
int GraphLinkNodes(tGraph* pGraph, int IdFrom, int IdTo)
{
size_t idxFrom = GraphFindNodeIndexById(pGraph, IdFrom);
size_t idxTo = GraphFindNodeIndexById(pGraph, IdTo);
tNode *pFrom, *pTo;
if (idxFrom == (size_t)-1 || idxTo == (size_t)-1) // one or both nodes don't exist
return 0;
pFrom = (tNode*)pGraph->Nodes.pElements + idxFrom;
pTo = (tNode*)pGraph->Nodes.pElements + idxTo;
// link IdFrom -> IdTo
if (pFrom->LinksTo.Count == 0 || // empty array: skip bsearch on a NULL base
bsearch(&IdTo,
pFrom->LinksTo.pElements,
pFrom->LinksTo.Count,
pFrom->LinksTo.ElementSize,
&IntComparator) == NULL) // IdFrom doesn't link to IdTo yet
{
if (!ArrayInsertElement(&pFrom->LinksTo, pFrom->LinksTo.Count, &IdTo))
return 0;
qsort(pFrom->LinksTo.pElements,
pFrom->LinksTo.Count,
pFrom->LinksTo.ElementSize,
&IntComparator); // maintain order for binary search
}
// back link IdFrom <- IdTo
if (pTo->LinksFrom.Count == 0 || // empty array: skip bsearch on a NULL base
bsearch(&IdFrom,
pTo->LinksFrom.pElements,
pTo->LinksFrom.Count,
pTo->LinksFrom.ElementSize,
&IntComparator) == NULL) // IdTo has no back link from IdFrom yet
{
if (!ArrayInsertElement(&pTo->LinksFrom, pTo->LinksFrom.Count, &IdFrom))
return 0;
qsort(pTo->LinksFrom.pElements,
pTo->LinksFrom.Count,
pTo->LinksFrom.ElementSize,
&IntComparator); // maintain order for binary search
}
return 1;
}
int main(void)
{
tGraph g;
printf("\nCreating empty graph...\n");
GraphInit(&g);
GraphPrintNodes(&g);
printf("\nInserting nodes...\n");
GraphInsertNode(&g, 0, 0);
GraphInsertNode(&g, 1, 101);
GraphInsertNode(&g, 2, 202);
GraphPrintNodes(&g);
printf("\nLinking nodes...\n");
GraphLinkNodes(&g, 0, 1);
GraphLinkNodes(&g, 0, 2);
GraphLinkNodes(&g, 1, 2);
GraphLinkNodes(&g, 2, 1);
GraphPrintNodes(&g);
printf("\nDestroying graph...\n");
GraphDestroy(&g);
GraphPrintNodes(&g);
// repeat
printf("\nLet's repeat...\n");
printf("\nCreating empty graph...\n");
GraphInit(&g);
GraphPrintNodes(&g);
printf("\nInserting nodes...\n");
GraphInsertNode(&g, 1, 111);
GraphInsertNode(&g, 2, 222);
GraphInsertNode(&g, 3, 333);
GraphPrintNodes(&g);
printf("\nLinking nodes...\n");
GraphLinkNodes(&g, 1, 2);
GraphLinkNodes(&g, 2, 3);
GraphLinkNodes(&g, 3, 1);
GraphPrintNodes(&g);
printf("\nDestroying graph...\n");
GraphDestroy(&g);
GraphPrintNodes(&g);
return 0;
}
Output (ideone):
Creating empty graph...
Empty graph.
Inserting nodes...
Node 0:
Data: 0
Node 1:
Data: 101
Node 2:
Data: 202
Linking nodes...
Node 0:
Data: 0
Links to:
Node 1
Node 2
Node 1:
Data: 101
Links to:
Node 2
Node 2:
Data: 202
Links to:
Node 1
Destroying graph...
Empty graph.
Let's repeat...
Creating empty graph...
Empty graph.
Inserting nodes...
Node 1:
Data: 111
Node 2:
Data: 222
Node 3:
Data: 333
Linking nodes...
Node 1:
Data: 111
Links to:
Node 2
Node 2:
Data: 222
Links to:
Node 3
Node 3:
Data: 333
Links to:
Node 1
Destroying graph...
Empty graph.
This looks a lot like what I work on: social networking.
You could define the nodes and the links separately. In C, you could define them as:
struct graph_node{
int id;
struct node_following *following;
struct graph_node *next_node;
};
struct node_following{
int id;
struct node_following *next_node;
};
For your example, the result is:
root -> node0 -> node1 -> node2
The content of root might be: id = -1; following=NULL; next_node= node0
The content of node0 might be: id = 0; next_node = node1; following points to a list of node_following entries:
following -> {1, address of next node} -> {2, NULL}
The content of node1 might be: id = 1; next_node = node2; following points to a list of node_following entries:
following -> {2, NULL}
The content of node2 might be: id = 2; next_node = NULL; following points to a list of node_following entries:
following -> {1, NULL}
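The layout walked through above could be built with helpers along these lines. This is a sketch under assumptions: find_node, add_node, and add_link are hypothetical names, and the dummy root is expected to be set up by the caller with id = -1 and both pointers NULL.

```c
#include <assert.h>
#include <stdlib.h>

struct node_following {
    int id;
    struct node_following *next_node;
};

struct graph_node {
    int id;
    struct node_following *following;
    struct graph_node *next_node;
};

/* Find the node with the given id in the list that starts after the dummy root. */
struct graph_node *find_node(struct graph_node *root, int id) {
    struct graph_node *n;
    assert(root != NULL);
    for (n = root->next_node; n != NULL; n = n->next_node)
        if (n->id == id) return n;
    return NULL;
}

/* Append a node with this id to the end of the node list; NULL on out-of-memory. */
struct graph_node *add_node(struct graph_node *root, int id) {
    struct graph_node *tail = root, *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->id = id;
    n->following = NULL;
    n->next_node = NULL;
    while (tail->next_node) tail = tail->next_node;
    tail->next_node = n;
    return n;
}

/* Record "from links to to" at the end of from's following list.
   Returns 1 on success, 0 if 'from' does not exist or memory runs out. */
int add_link(struct graph_node *root, int from, int to) {
    struct graph_node *n = find_node(root, from);
    struct node_following **slot, *f;
    if (n == NULL) return 0;
    f = malloc(sizeof *f);
    if (f == NULL) return 0;
    f->id = to;
    f->next_node = NULL;
    for (slot = &n->following; *slot; slot = &(*slot)->next_node)
        ;                       /* walk to the terminating NULL pointer */
    *slot = f;
    return 1;
}
```

Reading the input file then comes down to calling add_node once per node id and add_link(&root, from, to) once per "from,to" line.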
Essentially, it is a question of how to store a two-dimensional matrix. If the matrix is sparse, use linked lists. Otherwise, a bitmap is a better solution.