Create a regular graph through one vertex deletion - c

The problem: given an undirected graph stored as an adjacency list, I'm looking for an algorithm that transforms it into a regular graph (every vertex has the same degree) by deleting exactly one vertex.
For example:

Iterate over all vertices and partition them by degree.
If all vertices have the same degree, it is only possible if there is a vertex of degree n - 1 (or if the graph has no edges at all).
If they partition into 2 different degree sets: let's call X the set with the lower degree and Y the one with the higher degree, and let dg(X) and dg(Y) be the degrees of their vertices.
If one of the partitions has only 1 vertex and its degree is either 0 or the number of vertices in the other set, remove it.
If dg(Y) - dg(X) > 1, it's not possible.
If dg(Y) - dg(X) = 1 and |Y| = dg(X), check whether a vertex from X is connected to all vertices in Y, and remove it.
If dg(Y) - dg(X) = 1 and |X| = dg(Y), check whether a vertex from Y is connected to all vertices in X, and remove it.
Any other case is not possible with 2 partitions.
If they partition into 3 sets:
One of them must have only 1 vertex, and that vertex has to be connected to all vertices of the higher-degree set among the other two, and to none of the remaining set. The degree difference between those two sets must also be 1.
In any other case, it's not possible.
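As a cross-check for the case analysis above, here is a minimal brute-force sketch in Python. It does not implement the partition rules; it simply tries every vertex and tests whether the remaining degrees are all equal. The function name `find_vertex_to_delete` and the dict-of-sets adjacency representation are assumptions made for illustration.

```python
from collections import Counter


def find_vertex_to_delete(adj):
    """Return a vertex whose removal leaves a regular graph, or None.

    adj is assumed to map each vertex to the set of its neighbours
    (undirected graph, no self-loops).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    for v in adj:
        # After deleting v, each neighbour of v loses one edge and
        # every other vertex keeps its degree.
        remaining = Counter(
            degree[u] - (1 if v in adj[u] else 0)
            for u in adj
            if u != v
        )
        # Regular means at most one distinct degree value is left.
        if len(remaining) <= 1:
            return v
    return None
```

This runs in O(n^2) time for n vertices, which may be acceptable as a reference implementation to test a faster degree-partition approach against.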

Related

read a graph by vertices not as an edge list in R

To explain: I have an undirected graph stored in a text file as edges, where each line consists of two values representing an edge, like:
5 10
1000 2
212 420
.
.
.
Normally, when a graph is read from a file in R (using igraph), it is read as edges: E(g) gives the edges of the graph "g", V(g) gives its vertices, and E(g)[i] gives a particular edge i (i.e. both of its vertices).
My question: is there a similar way to get only one vertex of an edge, rather than both?
For example, if I need the second vertex of the third edge, what do I need to type?
Also, is there something in igraph to read the graph as vertices rather than as edges from the beginning? For example, reading the graph as a table with two columns, so that each edge can be read as X[i][1], X[i][2].
I need this because I want to loop over all vertices and pick them out of their edges separately, and I think that is possible if each vertex is labeled like an element in a table.
Many thanks in advance for any help
If you have a two-column table of vertices, you can use graph_from_data_frame to convert it into a graph. To get the vertices of a particular edge, you can use ends.
# Load igraph (provides graph_from_data_frame and ends)
library(igraph)
# DATA
set.seed(2)
m = cbind(FROM = sample(LETTERS[1:5], 10, TRUE), TO = sample(LETTERS[6:10], 10, TRUE))
# Convert to graph
g = graph_from_data_frame(m, directed = FALSE)
# plot(g)
# Second vertex on third edge
ends(graph = g, es = 3)[2]
#[1] "I"

How to find the biggest subset of an array given some constraints?

There is an array A[1........N]. How do I find the largest subset of the array such that the product of any two distinct elements of the subset is not a perfect cube? The upper bound for N is 100000.
Example:
For A = 1 2 4 8, the answer will be {1, 2} or {1, 4} or {8, 2} or {8, 4}.
1 and 8 cannot come together in the solution.
Similarly 2 and 4.
My approaches:
Check all the subsets of the given array and return the subset of maximum length which satisfies the constraint. This takes O(N*N*2^N).
Create a graph out of the given array. Two nodes in the graph are connected if their product is a perfect cube. The main task is then to remove the minimum number of nodes such that there are no edges left in the graph (when we remove a node, all the edges associated with that node disappear). Here the main issue is space (the representation of the graph). In the worst case the size of the graph will be O(N*N).
Please help.
Explanation
Consider the factorization of each number as follows:
A[i] = x^3.y^2.z
i.e. we first find the largest cube that divides A[i] (and call it x^3), then the largest square that divides what remains (and call it y^2), then call whatever is left over z.
The product of A[i] with another A[j]=X^3.Y^2.Z will be a cube if and only if Y=z and Z=y.
Therefore, if you consider groups of numbers with the same value of y^2.z, these groups form into pairs, where for each pair you cannot take an element from both linked groups.
Clearly the best choice is to take all the elements from whichever group is the larger one in each pair.
There is one special case, where y^2.z is equal to 1. In this case, any number in the group is already a perfect cube and cannot be paired with another number from the same group. Therefore you can add just 1 number from the set of perfect cubes.
Example
Suppose our array was (expressed as a prime factorization):
A[0] = 2^3
A[1] = 3^3
A[2] = 2^2.3.5^3
A[3] = 2^2.3.7^3
A[4] = 2.3^2.13^3
We first assign these into groups:
Value 1 = Group A (2^3, 3^3)
Value 2^2.3 = Group B (2^2.3.5^3, 2^2.3.7^3)
Value 2.3^2 = Group C (2.3^2.13^3)
Group A is paired with itself, while group B is paired with group C.
Therefore we can take one element from group A, and the whole of group B, for a total of 3 elements in the final subset.
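The grouping idea above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the values are positive integers small enough for trial-division factoring, and the helper names `cube_free_signature` and `largest_subset_size` are made up for illustration. Each number is reduced to its "signature" (prime exponents taken mod 3, which corresponds to y^2.z) together with the complementary signature that would complete it to a cube.

```python
from collections import Counter


def cube_free_signature(n):
    """Return (sig, comp): n with prime exponents reduced mod 3, and the
    smallest value whose product with sig is a perfect cube."""
    sig, comp = 1, 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            e %= 3
            if e:
                sig *= p ** e
                comp *= p ** (3 - e)
        p += 1
    if n > 1:  # one leftover prime with exponent 1
        sig *= n
        comp *= n * n
    return sig, comp


def largest_subset_size(values):
    """Size of the largest subset with no pairwise product being a cube."""
    groups = Counter()
    complement = {}
    for v in values:
        sig, comp = cube_free_signature(v)
        groups[sig] += 1
        complement[sig] = comp
    total = 0
    for sig, count in groups.items():
        comp = complement[sig]
        if sig == 1:
            total += 1  # at most one perfect cube can be kept
        elif comp not in groups:
            total += count  # no conflicting group exists
        elif sig < comp:
            # Count each pair of linked groups once, keeping the larger.
            total += max(count, groups[comp])
    return total
```

For the example array above (8, 27, 1500, 4116, 39546) this groups the values under signatures 1, 12 and 18, and keeps one cube plus the two elements with signature 12.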
You can formulate it as a largest clique problem.
Create a graph with each number as a vertex and connect two vertices if their product is not a cube.
Now find the largest clique in the graph. See https://en.wikipedia.org/wiki/Clique_problem#Finding_maximum_cliques_in_arbitrary_graphs

Mapping out subregions of neighbors in a matrix and calculating center of mass for each sub-region

I have been presented with a problem which involves the following:
Given an MxN matrix with some values.
Given a threshold value T
Problems
Identify sub-regions in the matrix.
A sub-region is a region of cells in the matrix which are considered neighbors and where the cell values in that sub-region are greater than T. Two cells C1 and C2 are considered to be neighbors if they're adjacent. They are also neighbors if they're diagonally adjacent.
Calculate the "center of mass" for each sub-region, defined to be the average position (x,y) of the cells in the sub-region. Each cell's location is weighted by its value.
My approach
1. Search through the MxN matrix to qualify cells and add them as nodes to a linked list (their value must be greater than the threshold value).
2. Pull a node from the linked list and put it in a "tree". That node will be the parent node. Search through the linked list of remaining nodes to find its "nearest neighbors", based on the definition of a neighbor. Each neighbor is placed in the "tree" as a child node. Now, for each child node, search through the linked list of remaining nodes to find its neighbors. Continue with this until done. The final tree will be a representation of a sub-region.
3. Go to 2 (and create a new tree) if the linked list is not empty.
After that, calculating the "center of mass" for each tree will be easy.
Does this seem to be the right approach, or is there a better, more efficient one?
Hoping for some feedback.
Thanks.
EDIT
I should probably mention that the matrix is "placed" in a (x,y) coordinate system so that the lower-left corner cell (row M, column 0) corresponds to the (x,y) coordinate (0,0) and the top right corner cell (row 0, column N) corresponds to the (x,y) coordinate (N,M)
If you have the Image Processing Toolbox, you could do that with regionprops.
Assuming M is your matrix and T the threshold value:
subregions = regionprops(M > T, 'Centroid');
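Now the centroid of each subregion i is in subregions(i).Centroid. Note that 'Centroid' is unweighted; to weight each cell by its value, regionprops(M > T, M, 'WeightedCentroid') can be used instead.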
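Without the toolbox, the same result can be obtained with a flood fill over the thresholded cells. The following Python sketch is one possible way to do it, not the toolbox's implementation: it labels 8-connected regions with a breadth-first search and accumulates value-weighted sums as it goes. The function name `weighted_centroids` is hypothetical, the matrix is assumed to be a non-empty list of equal-length rows, and the threshold is assumed to be non-negative so that every region has positive total mass. Centroids are returned as (row, column); converting to the question's (x, y) system is left out.

```python
from collections import deque


def weighted_centroids(matrix, threshold):
    """Return one value-weighted (row, col) centroid per 8-connected
    region of cells whose value exceeds threshold."""
    rows, cols = len(matrix), len(matrix[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or matrix[r0][c0] <= threshold:
                continue
            # Start a new region and flood-fill it.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            mass = row_sum = col_sum = 0.0
            while queue:
                r, c = queue.popleft()
                w = matrix[r][c]
                mass += w
                row_sum += w * r
                col_sum += w * c
                # Visit the 8 surrounding cells (diagonals included).
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and not seen[nr][nc]
                                and matrix[nr][nc] > threshold):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            centroids.append((row_sum / mass, col_sum / mass))
    return centroids
```

Each cell is visited once, so the whole pass is linear in the number of cells, and no linked list or tree of candidates needs to be searched repeatedly.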

Random walks in directed graphs/networks

I have a weighted graph with (in practice) up to 50,000 vertices. Given a vertex, I want to randomly choose an adjacent vertex based on the relative weights of all adjacent edges.
How should I store this graph in memory so that making the selection is efficient? What is the best algorithm? It could be as simple as a key-value store for each vertex, but that might not lend itself to the most efficient algorithm. I'll also need to be able to update the network.
Note that I'd like to take only one "step" at a time.
More Formally: Given a weighted, directed, and potentially complete graph, let W(a,b) be the weight of edge a->b and let Wa be the sum of all edges from a. Given an input vertex v, I want to choose a vertex randomly where the likelihood of choosing vertex x is W(v,x) / Wv
Example:
Say W(v,a) = 2, W(v,b) = 1, W(v,c) = 1.
Given input v, the function should return a with probability 0.5, and b and c with probability 0.25 each.
If you are concerned about the performance of generating the random walk, you may use the alias method to build a data structure that fits your requirement of choosing a random outgoing edge quite well. The only overhead is that you have to assign each directed edge a probability weight and a so-called alias edge.
So for each node you have a vector of outgoing edges, each with its weight and its alias edge. You can then choose random edges in constant time (only the generation of the data structure is linear in the total number of edges, or in the number of edges of a node when only that node is rebuilt). In the example, an edge is denoted by ->[NODE], and node v corresponds to the example given above:
Node v
->a (p=1, alias= ...)
->b (p=3/4, alias= ->a)
->c (p=3/4, alias= ->a)
Node a
->c (p=1/2, alias= ->b)
->b (p=1, alias= ...)
...
If you want to choose an outgoing edge (i.e. the next node) you just have to generate a single random number r uniform from interval [0,1).
You then get no=floor(N[v] * r) and pv=frac(N[v] * r) where N[v] is the number of outgoing edges. I.e. you pick each edge with the exact same probability (namely 1/3 in the example of node v).
Then you compare the assigned probability p of this edge with the generated value pv. If pv is less you keep the edge selected before, otherwise you choose its alias edge.
If for example we have r=0.6 from our random number generator we have
no = floor(0.6*3) = 1
pv = frac(0.6*3) = 0.8
Therefore we choose the second outgoing edge (note the index starts with zero) which is
->b (p=3/4, alias= ->a)
and switch to the alias edge ->a since p=3/4 < pv.
For the example of node v we therefore
choose edge b with probability 1/3*3/4 (i.e. whenever no=1 and pv<3/4)
choose edge c with probability 1/3*3/4 (i.e. whenever no=2 and pv<3/4)
choose edge a with probability 1/3 + 1/3*1/4 + 1/3*1/4 (i.e. whenever no=0 or pv>=3/4)
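The table construction and the sampling step described above can be sketched in Python like this. It is a minimal sketch of the alias method (in the style usually attributed to Vose), with hypothetical function names `build_alias_table` and `sample`; the weights are assumed to be positive, and the optional `r` parameter exists only so the worked example can be reproduced deterministically.

```python
import random


def build_alias_table(weights):
    """Build (prob, alias) lists for the outgoing edges of one node."""
    n = len(weights)
    total = sum(weights)
    # Scale so that the average slot probability is exactly 1.
    prob = [w * n / total for w in weights]
    alias = list(range(n))
    small = [i for i, p in enumerate(prob) if p < 1.0]
    large = [i for i, p in enumerate(prob) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        alias[s] = l                 # slot s overflows into edge l
        prob[l] -= 1.0 - prob[s]     # l donated the missing part of s
        (small if prob[l] < 1.0 else large).append(l)
    for i in small + large:          # leftovers are full slots
        prob[i] = 1.0
    return prob, alias


def sample(prob, alias, r=None):
    """Pick an edge index in O(1) from a single uniform number r in [0, 1)."""
    if r is None:
        r = random.random()
    scaled = r * len(prob)
    slot = int(scaled)               # which slot (floor)
    frac = scaled - slot             # position inside the slot
    return slot if frac < prob[slot] else alias[slot]
```

With weights 2, 1, 1 for edges a, b, c this yields the same table as for node v above, and `sample(prob, alias, 0.6)` lands in the slot of b and then switches to its alias a. Updating an edge weight requires rebuilding the table of that one node, which is linear in its number of outgoing edges.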
In theory, a very efficient option that also supports updates is to store, for each node, the moral equivalent of a balanced binary tree (a red-black tree, B-tree, or skip list all fit) of the connected nodes and their weights, along with the total weight on each side. Then you can pick a random number from 0 to 1, multiply it by the total weight of the connected nodes, and do a binary search to find the matching node.
However, traversing a binary tree like that involves a lot of branches, which tend to create pipeline stalls, and those are very expensive. So in practice, if you're programming in an efficient language (e.g. C++) and you have fewer than a couple of hundred connected edges per node, a linear list of edges (with a pre-computed sum) that you walk in a loop may prove to be faster.
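The linear-list variant is short enough to sketch directly. The following Python function is an illustration only (the name `pick_neighbor` and the list-of-pairs layout are assumptions): it walks the edge list, subtracting weights from a random target until the target drops below zero.

```python
import random


def pick_neighbor(edges, total, r=None):
    """Pick a neighbour with probability weight / total.

    edges is a list of (neighbor, weight) pairs and total is their
    pre-computed weight sum; r is a uniform number in [0, 1).
    """
    if r is None:
        r = random.random()
    target = r * total
    for neighbor, weight in edges:
        target -= weight
        if target < 0:
            return neighbor
    # Guard against floating-point round-off at the very end.
    return edges[-1][0]
```

Updating a weight here only means changing one pair and adjusting `total`, which is why this layout can be attractive when the network changes often.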

Dividing circle into pieces by choosing points on circumference?

On a circle, N arbitrary points are chosen on its circumference. The complete graph formed with those N points would divide the area of the circle into many pieces.
What is the maximum number of pieces of area that the circle will get divided into when the points are chosen along its circumference?
Examples:
2 points => 2 pieces
4 points => 8 pieces
Any ideas how to go about this?
This is known as Moser's circle problem.
The solution is:
f(n) = (n choose 4) + (n choose 2) + 1
i.e.
f(n) = (n^4 - 6n^3 + 23n^2 - 18n + 24) / 24
The proof is quite simple:
Consider each intersection inside the circle. It must be defined by the intersection of two lines, and each line has two endpoints, so every intersection inside the circle corresponds to a unique set of 4 points on the circumference. Therefore, there are at most n choose 4 inner vertices, and obviously there are n vertices on the circumference.
Now, how many edges does each vertex touch? Well, it's a complete graph, so each vertex on the outside touches n - 1 chords plus the 2 circle arcs next to it, i.e. n + 1 edges, and of course each vertex on the inside touches 4 edges. So the number of edges is given by (n(n + 1) + 4(n choose 4))/2 (we divide by two because otherwise each edge would be counted twice, once by each of its two vertices).
The final step is to use Euler's formula for the number of faces in a graph, i.e.: v - e + f = 1 (the Euler characteristic is 1 in our case, because the region outside the circle is not counted).
Solving for f gives f = (n choose 4) + (n choose 2) + 1, which is the formula above :-)
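The Euler-formula argument can be checked numerically against the known closed form for Moser's circle problem, (n choose 4) + (n choose 2) + 1. The Python sketch below counts vertices and edges for points in general position, with the n circle arcs counted as edges as well; the function names are made up for illustration.

```python
from math import comb


def moser_regions(n):
    """Closed form: maximum number of regions for n points."""
    return comb(n, 4) + comb(n, 2) + 1


def regions_via_euler(n):
    """Same count via v - e + f = 1 (outer face excluded)."""
    v = n + comb(n, 4)                      # boundary + inner vertices
    e = (n * (n + 1) + 4 * comb(n, 4)) // 2  # chord pieces + n arcs
    return 1 - v + e
```

Both functions agree with the examples in the question: 2 points give 2 pieces and 4 points give 8 pieces.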
