I want to implement a two-dimensional grid graph in C. Is it better to start with a single node and keep adding nodes as they are required, or to form the whole graph at once? A code snippet would be great.
First of all, the Stack Overflow community doesn't like being asked for whole solutions to problems; it's always better to propose something of your own and show that you have thought the problem over. Your problem statement is not specific. Do you know the number of vertices at compile time? If so, a simple two-dimensional array would be enough:
int connections[100][100];
If not, but the number of vertices is still constant once known at runtime, maybe you should consider using dynamic allocation:
int **connections = malloc(sizeof(int *) * numberOfVertices);
for (int i = 0; i < numberOfVertices; i++) {
    connections[i] = malloc(sizeof(int) * numberOfVertices);
}
These two solutions of course cause some problems. For example, they are very memory-expensive, especially when the graph is sparse. This is why you can always use a dedicated struct instead.
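For instance, a minimal sketch of such a dedicated struct, here a growable edge list whose memory is proportional to the number of edges rather than to numberOfVertices squared; the names (Edge, Graph, graph_add_edge) are illustrative, not from any standard library:

#include <stdlib.h>

/* One directed edge between two vertices, identified by index. */
struct Edge {
    int from;
    int to;
};

/* A growable edge list. */
struct Graph {
    struct Edge *edges;
    size_t count;
    size_t capacity;
};

/* Append an edge, doubling the backing array when it fills up.
 * Returns 0 on success, -1 on allocation failure. */
int graph_add_edge(struct Graph *g, int from, int to) {
    if (g->count == g->capacity) {
        size_t newcap = g->capacity ? g->capacity * 2 : 16;
        struct Edge *tmp = realloc(g->edges, newcap * sizeof *tmp);
        if (!tmp) return -1;
        g->edges = tmp;
        g->capacity = newcap;
    }
    g->edges[g->count++] = (struct Edge){ from, to };
    return 0;
}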
If you are thinking of adding nodes as you go, then you should use an adjacency list (a linked list of linked lists) to represent your graph, especially since adding as you go means you don't know whether your graph will be sparse. But if you already know the size of your graph, then use a square array (n by n, where n is the number of nodes).
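A minimal sketch of that linked-list-of-linked-lists layout (all names are illustrative):

#include <stdlib.h>

/* One entry in a node's neighbour list. */
struct AdjNode {
    int id;                      /* index of the neighbouring node */
    struct AdjNode *next;
};

/* One node of the graph, carrying its own neighbour list. */
struct GraphNode {
    int id;
    struct AdjNode *neighbours;
    struct GraphNode *next;      /* next node in the graph's node list */
};

/* Prepend a new node to the graph; returns it, or NULL on failure. */
struct GraphNode *graph_add_node(struct GraphNode **head, int id) {
    struct GraphNode *n = malloc(sizeof *n);
    if (!n) return NULL;
    n->id = id;
    n->neighbours = NULL;
    n->next = *head;
    *head = n;
    return n;
}

/* Record that `node` is connected to the node with index `id`. */
int graph_add_neighbour(struct GraphNode *node, int id) {
    struct AdjNode *a = malloc(sizeof *a);
    if (!a) return -1;
    a->id = id;
    a->next = node->neighbours;
    node->neighbours = a;
    return 0;
}

Adding a node as you go is then one graph_add_node call, plus one graph_add_neighbour call per connection.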
I've found answers to similar problems, but none of them exactly described my problem.
So, at the risk of being down-voted to hell, I was wondering if there is a standard method to solve my problem. Furthermore, there's a chance that I'm asking the wrong question; maybe the problem can be solved more efficiently another way.
So here's some background:
I'm looping through a list of particles. Each particle has a list of its neighboring particles. Now I need to create a list of unique particle pairs of mutual neighbours.
Each particle can be identified by an integer number.
Should I just build a list of all the pairs, including duplicates, and use some kind of sort and comparator to eliminate the duplicates, or should I try to avoid adding duplicates to my list in the first place?
Performance is really important to me. I guess most of the loops can be vectorized and threaded. On average each particle has around 15 neighbours, and I expect that there will be at most 1e6 particles.
I do have some ideas, but I'm not an experienced coder, and I don't want to waste a week benchmarking every single method in different situations just to find out that there's already a standard method for my problem.
Any suggestions?
BTW: I'm using C.
Some pseudo-code:
for i in nparticles
    particle = particles[i];   // just an array containing the "index" of each particle
    // each particle has a neighbor list
    for k in neighlist[i]      // looping through all the neighbors
        // k is the index of the neighbor of particle "i"
        if the pair (i,k) or (k,i) is not already in the pair list, add it; otherwise don't
Sorting the elements on each iteration is not a good idea, since a comparison sort is O(n log n).
The next best thing would be to store the items in a search tree; better yet, a binary search tree; and better yet, a self-balancing binary search tree. You can find implementations on GitHub.
An even better solution would give an access time of O(1). You can achieve this in two different ways. One is a simple identity array, where each slot holds either a pointer to the item with that id, or a flag saying that the id is empty. This is very fast but wasteful: you'll need O(N) memory.
The best solution, in my opinion, is to use a set or a hash map, which are basically the same thing, since a set can be implemented using a hash map.
Here is a GitHub project with a C hash-map implementation.
And a Stack Overflow answer to a similar question.
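As a concrete illustration of the set idea for this particular problem, here is a minimal sketch, assuming particle indices fit in 32 bits; the key-packing trick, the open-addressing table, and all names (pair_key, PairSet, pairset_insert) are my own illustrative choices, not a standard API:

#include <stdint.h>
#include <stdlib.h>

/* Canonical key for an unordered pair: smaller index in the high half.
 * Key 0 would only arise for the self-pair (0,0), which never occurs
 * in a neighbour list, so 0 can safely mark an empty slot. */
static uint64_t pair_key(uint32_t a, uint32_t b) {
    return a < b ? ((uint64_t)a << 32) | b
                 : ((uint64_t)b << 32) | a;
}

/* 64-bit mix (MurmurHash3-finalizer style) so nearby indices spread out. */
static size_t hash64(uint64_t x) {
    x ^= x >> 33;
    x *= 0xFF51AFD7ED558CCDu;
    x ^= x >> 33;
    return (size_t)x;
}

struct PairSet {
    uint64_t *slots;   /* calloc'ed, so all slots start empty (0) */
    size_t capacity;   /* power of two, well above the expected pair count */
};

/* Insert a pair key with linear probing; returns 1 if the pair is new,
 * 0 if it was already present. Resizing is omitted to keep this short. */
static int pairset_insert(struct PairSet *s, uint64_t key) {
    size_t mask = s->capacity - 1;
    size_t i = hash64(key) & mask;
    while (s->slots[i] != 0) {
        if (s->slots[i] == key) return 0;   /* duplicate pair */
        i = (i + 1) & mask;
    }
    s->slots[i] = key;
    return 1;
}

In the loop from the question, you would call pairset_insert(&set, pair_key(i, k)) and append the pair to the output list only when it returns 1. With around 15 neighbours per particle and at most 1e6 particles, there are roughly 7.5e6 unique pairs at most, so a calloc'ed table of 2^24 slots (128 MiB) keeps the load factor below 0.5.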
I would like to allocate memory using pointers in a loop in C. I have found answers to similar questions, but not to this specific problem.
For example
char name_bin[50];
for (int i = 0; i < NB; i++) {
    sprintf(name_bin, "bin_vector%d", i);
    double *name_bin = (double *) malloc(sizeof(double) * NGMAX);
}
I can't seem to find a way of doing this. I want NB arrays of size NGMAX, all with different names: bin_vector0 and so on.
I was told it was possible and so any help would be very welcome.
You can't create a new variable with a name derived at runtime. In C, the way to accomplish that is to use a table that associates a string with an object. How the table is implemented is up to you, depending on your requirements and the amount of complexity you are willing to deal with. For a small number of names, a linked list may be sufficient. For very large numbers of names, you will likely want a tree or a hash table.
However, your particular problem can be resolved with a simple array.
double *bin_vector[NB];
Then, you can refer to the 6th bin with bin_vector[5].
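Putting the pieces together, a short sketch that reuses the question's NB and NGMAX (the placeholder values are mine):

#include <stdlib.h>

#define NB    10     /* number of arrays   (placeholder value) */
#define NGMAX 1000   /* size of each array (placeholder value) */

int main(void) {
    double *bin_vector[NB];

    /* One allocation per would-be "bin_vector%d"; the index replaces the name. */
    for (int i = 0; i < NB; i++) {
        bin_vector[i] = malloc(sizeof(double) * NGMAX);
        if (bin_vector[i] == NULL)
            return 1;   /* allocation failed */
    }

    bin_vector[5][0] = 3.14;   /* what "bin_vector5" would have been */

    for (int i = 0; i < NB; i++)
        free(bin_vector[i]);
    return 0;
}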
I want to implement a graph in C. I am confused about how I should store each node. I was first thinking of using a linked list, but then how can I store the nodes connected to a given node?
Any ideas on which data structure I should use and how to use it?
There are some well-known ways to do that.
One is to use a two-dimensional array of size [n][n], where n is the number of nodes, and then set graph[a][b] = 1 if there is a link from a to b. This method is generally fast but uses a lot of memory, especially when there are many nodes and not so many links.
Another way is to make a list (or an array, for that matter) of all nodes, and have each of them point to a dynamic array or to a list of the nodes it is linked to.
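A minimal sketch of that second layout, with each node owning a growable array of neighbour indices (the names and the doubling strategy are illustrative):

#include <stdlib.h>

struct Node {
    int *neighbours;   /* indices of the nodes this one links to */
    int count;
    int capacity;
};

/* Record a link from node `n` to the node with index `to`.
 * Returns 0 on success, -1 on allocation failure. */
int node_link(struct Node *n, int to) {
    if (n->count == n->capacity) {
        int newcap = n->capacity ? n->capacity * 2 : 4;
        int *tmp = realloc(n->neighbours, newcap * sizeof *tmp);
        if (!tmp) return -1;
        n->neighbours = tmp;
        n->capacity = newcap;
    }
    n->neighbours[n->count++] = to;
    return 0;
}

The whole graph is then just a zero-initialized struct Node nodes[n], with memory growing only where links actually exist.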
The data structure that helps when your graph is sparse, that is, when you have few connections (edges) between the vertices, is an adjacency list (a linked list of linked lists).
If your graph is dense, that is, when your vertices have lots of edges between them, use an adjacency matrix: an n-by-n two-dimensional array.
I'm writing a program for a numerical simulation in C. Part of the simulation is a set of spatially fixed nodes that each have some float value with respect to every other node. It is like a directed graph. However, if two nodes are too far away (farther than some cut-off length a), this value is 0.
To represent all these "correlations" or float values, I tried to use a 2D array, but since I have 100,000 and more nodes, that would correspond to 40 GB of memory or so.
Now I am trying to think of different solutions to that problem. I don't want to save all these values to the hard disk, and I also don't want to calculate them on the fly. One idea was some sort of sparse matrix, like the ones you can use in Matlab.
Do you have any other ideas for how to store these values?
I am new to C, so please don't expect too much experience.
Thanks and best regards,
Jan Oliver
The average number of nodes within the cutoff distance of a given node determines your memory requirement and tells you whether you need to page to disk. The solution taking the least memory is probably a hash table that maps a pair of nodes to a distance. Since the distance is the same each way, you only need to enter it into the hash table once per pair: put the two node numbers in numerical order and then combine them to form a hash key. You could use the POSIX hsearch/hcreate/hdestroy functions for the hash table, although they are less than ideal.
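For illustration, a sketch of that scheme with the POSIX functions just mentioned; hsearch keys must be strings, so the ordered pair is formatted into one (the initial hcreate call and error handling are trimmed):

#include <search.h>   /* hcreate, hsearch, ENTRY */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Store the distance for the unordered node pair (a, b).
 * Assumes hcreate(expected_number_of_pairs) was called once at startup. */
void store_distance(int a, int b, double dist) {
    if (a > b) { int t = a; a = b; b = t; }    /* numerical order */

    char buf[32];
    snprintf(buf, sizeof buf, "%d:%d", a, b);  /* combined hash key */

    double *d = malloc(sizeof *d);
    *d = dist;

    /* ENTER returns the existing entry if the pair is already stored. */
    ENTRY e = { .key = strdup(buf), .data = d };
    hsearch(e, ENTER);
}

Lookups use hsearch with FIND on a key built the same way. As noted, these functions are less than ideal: there is a single global table, and entries cannot be removed individually (only hdestroy for the whole table).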
A sparse matrix approach sounds ideal for this. The Wikipedia article on sparse matrices discusses several approaches to implementation.
A sparse adjacency matrix is one idea, or you could use an adjacency list, allowing you to store only the edges that are closer than your cutoff value.
You could also hold a list for each node that contains the other nodes this node is related to. You would then have 2*k list entries overall, where k is the number of non-zero values in the virtual matrix.
Implementing the whole system as a combination of hashes/sets/maps is still expected to be acceptable with regard to speed/performance compared to a "real" matrix allowing random access.
edit: This solution is one possible form of an implementation of a sparse matrix. (See also Jim Balter's note below. Thank you, Jim.)
You should indeed use sparse matrices if possible. In SciPy, we have support for sparse matrices, so you can prototype in Python, although, to be honest, the sparse support still has some rough edges.
If you have access to Matlab, it will definitely be better at the moment.
Without sparse matrices, you could think about using memmap-based arrays so that you don't need 40 GB of RAM. That will still be slow, and it only really makes sense if you have a low degree of sparsity (say, if 10-20% of your 100000x100000 matrix has items in it, then full arrays will actually be faster, and may even take less space, than sparse matrices).
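For the C program in the question, the closest analogue of a memmap-based array is mmap-ing a file; a rough POSIX sketch, assuming a 64-bit system and a filesystem with sparse-file support (the file name is a placeholder):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define N 100000ULL   /* nodes; the N*N floats live in the file, not in RAM */

int main(void) {
    size_t bytes = (size_t)(N * N) * sizeof(float);   /* ~40 GB */

    int fd = open("correlations.bin", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, bytes) != 0) { perror("ftruncate"); return 1; }

    /* The OS pages chunks of the file in and out on demand. */
    float *m = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (m == MAP_FAILED) { perror("mmap"); return 1; }

    m[3 * N + 7] = 0.25f;   /* correlation between nodes 3 and 7 */

    munmap(m, bytes);
    close(fd);
    return 0;
}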
The first question is: "How do I make a simple sparse array in C (with one dimension only)?" {With my own hands, without libraries.}
And the last one: "Can I allocate only parts of an array?" Something like declaring a pointer (double *array;), then using malloc to allocate some memory for it, and then freeing the indices we don't want. Can I do it?
Thanks so much!
No, you can't do it.
What you can do is allocate blocks, but you need to design it carefully.
Probably the best optimization is to use ranges of cells, so you can keep a linked list (or a map) of the allocated ranges:
struct SparseBlock
{
    void *blockData;
    int beginIndex;
    int endIndex;
    struct SparseBlock *next;
};
Obviously, if endIndex - beginIndex = 0, you have a single cell (isolated inside the array); otherwise you have a block of cells, and you can allocate the right amount of memory for it.
This approach is simple for immutable sparse vectors; otherwise you should either take care of restructuring the blocks whenever a hole is filled or generated, or just store single cells.
In addition, you have to decide how to index these blocks. You can keep them ordered in a linked list, or you can use a map to get constant O(1) time when retrieving the n-th block (of course, you will then have to insert many equal keys for the same block if it covers a range, or reduce each index to the nearest lower index available).
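For the ordered-linked-list variant, retrieval can be a simple walk; a sketch assuming blockData points to doubles:

#include <stddef.h>

/* Return a pointer to cell `index`, or NULL if it falls in a hole.
 * Assumes the list is sorted by beginIndex, as described above. */
double *sparse_get(struct SparseBlock *head, int index) {
    for (struct SparseBlock *b = head; b != NULL; b = b->next) {
        if (index < b->beginIndex)
            return NULL;   /* walked past it: the index is in a hole */
        if (index <= b->endIndex)
            return (double *)b->blockData + (index - b->beginIndex);
    }
    return NULL;
}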
Solutions are many, just express your creativity! :)
It is not uncommon to implement these as linked structures of one kind or another. In one dimension you can simply generate a linked list of occupied regions, and I've discussed a two-dimensional implementation in another context before.
You do lose O(1) access time this way, but the win on space can be considerable if the structure really is sparse.