How to create star shaped graph on AgensGraph? - agens-graph

I want to create a star-shaped graph using a single Cypher CREATE statement.
How can I create a star-shaped graph in AgensGraph?

Bind the central vertex to a variable in the first Cypher pattern, then reuse that variable in the following patterns.
agens=# create (n:v{id:1})-[:e]->(:v{id:2}),
(n)-[:e]->(:v{id:3}),
(n)-[:e]->(:v{id:4});
GRAPH WRITE (INSERT VERTEX 4, INSERT EDGE 3)
agens=# match p = (n:v{id:1})-[:e]->() return p;
p
-----------------------------------------------------
[v[3.1]{"id": 1},e[4.1][3.1,3.2]{},v[3.2]{"id": 2}]
[v[3.1]{"id": 1},e[4.2][3.1,3.3]{},v[3.3]{"id": 3}]
[v[3.1]{"id": 1},e[4.3][3.1,3.4]{},v[3.4]{"id": 4}]
(3 rows)
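For stars with more leaves, the statement text can be generated rather than typed by hand. The following is a minimal sketch in Python that only builds the Cypher string; the function name star_create_statement and its parameters are hypothetical, and sending the statement to AgensGraph is left out.

```python
def star_create_statement(center_id, leaf_ids, vlabel="v", elabel="e"):
    """Build a single Cypher CREATE statement for a star graph.

    The first pattern binds the center vertex to the variable `n`;
    every later pattern reuses `n`, so the center is created only once.
    """
    if not leaf_ids:
        raise ValueError("a star needs at least one leaf")
    patterns = []
    for i, leaf in enumerate(leaf_ids):
        # Only the first pattern declares the center; the rest refer to it.
        center = f"(n:{vlabel}{{id:{center_id}}})" if i == 0 else "(n)"
        patterns.append(f"{center}-[:{elabel}]->(:{vlabel}{{id:{leaf}}})")
    return "create " + ",\n".join(patterns) + ";"


stmt = star_create_statement(1, [2, 3, 4])
print(stmt)
```

With center 1 and leaves 2, 3, 4 this prints the same three-pattern statement used above.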

Related

How to remove "duplicate" edges in 2-D array in Numpy?

I'm working with a (15000, 2) array in NumPy from which I plan to build an adjacency matrix. Each row represents an edge from node i to node j; in other words, the first element of each row shares an edge with the second element. For example, [24, 79] represents an edge between nodes 24 and 79.
If a row [79, 24] also exists, I would like to remove it altogether, because [24, 79] already covers that edge.
Is there a way to remove these "duplicate" connections so that the array lists each edge in only one direction? I'm doing this step before I symmetrize the matrix by adding it to its transpose.
You can do that by sorting the items in each row, which makes duplicate edges (i.e. duplicate rows) easy to detect. The latter can be done using np.unique. Here is an example:
import numpy as np
v = np.random.randint(0, 1_000, size=(15000, 2)) # len: 15000
result = np.unique(np.sort(v, axis=1), axis=0) # e.g. len: 14790 (varies between runs)
result contains the set of unique undirected edges where, for each row, the smaller ID is the first item. The computation runs in O(n log n) time.
Note that the return_index and return_inverse parameters of np.unique can be used to track the index of the source row in the original, unsorted array. Also note that using a (2, 15_000) array is likely to be faster because the operation is more SIMD-friendly and cache-friendly.
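To make the behaviour concrete, here is a small sketch on a hand-written edge array (the values are made up for illustration). It shows the row-wise sort, the deduplication, and how return_index points back to the first source row of each kept edge.

```python
import numpy as np

# Hypothetical edge list containing reversed duplicates.
edges = np.array([[24, 79], [79, 24], [1, 2], [2, 1], [5, 3]])

# Put the smaller node ID first in every row so that [79, 24]
# and [24, 79] become identical rows.
canonical = np.sort(edges, axis=1)

# Deduplicate rows; first_rows holds, for each unique edge, the index
# of its first occurrence in `canonical` (and therefore in `edges`).
unique_edges, first_rows = np.unique(canonical, axis=0, return_index=True)

print(unique_edges)  # [[ 1  2] [ 3  5] [24 79]]
print(first_rows)    # [2 4 0]
```

Note that the output rows are sorted lexicographically, not kept in input order; indexing edges[np.sort(first_rows)] would recover the kept edges in their original order and orientation.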

Three Way Quicksort for arrays with many duplicates: Partition Placement

I'm working on a variation of the Quick Sort Algorithm designed to handle arrays where there may be many duplicate elements. The basic idea is to divide the array into 3 partitions first: All Elements below the Pivot Values (with the initial pivot value being chosen at random); All Elements Equal to the Pivot Value; and All Elements Greater than the Pivot Value.
I need some advice regarding the best way to arrange the partitions.
What is the best way to arrange the Partition in a Three Way Quick Sort?
The first way I might go about it is to just keep the Pivot Partition on the left, which would make it easy to define the boundaries when I return them to the larger Quick Sort function I plan to nest the Partition function within. But that makes the subsequent recursive calls to sort the Above and Below Partitions a little tricky, since their elements would all be lumped together in one large partition to the right of the Pivot Partition (instead of being neatly organized into an Above and a Below Partition). I could use a for loop to move each of these elements above or below the Pivot Partition, but I suspect that would hurt the efficiency of the algorithm. After doing this, I could make two recursive calls to Quick Sort: once on the Below Partition, and again on the Above Partition.
OR I could modify Partition to insert "Below" elements to the left of the Pivot Partition, and insert "Above" elements to the right. This reduces the need for linear scans over the array, but it means I would have to update the left and right bounds of the partition as the Partition function operates over the array.
I believe the second choice is the better one, but I want to see if anyone has any other ideas.
For reference, the initial array might look something like this:
array = [2, 2, 1, 9, 2]
Assuming the Pivot is randomly chosen as the value 2, then after Partition, it could look either like this:
array = [2, 2, 2, 9, 1]
Or like this if I insert above and below the partition during the Partition Function:
array = [1, 2, 2, 2, 9]
And the "shell code" I'm supposed to build this function around looks like this:
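import random

def randomized_quick_sort(a, l, r):
    if l >= r:
        return
    # Pick a random pivot and move it to the front of the range.
    k = random.randint(l, r)
    a[l], a[k] = a[k], a[l]
    # partition3 must return the inclusive bounds of the "equal" block.
    left_part_bound, right_part_bound = partition3(a, l, r)
    randomized_quick_sort(a, l, left_part_bound - 1)
    randomized_quick_sort(a, right_part_bound + 1, r)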
*The end result doesn't need to look like this (I just need to be able to output the right result and be able to resolve within a time limit to demonstrate minimal efficiency), but it shows why I think I may need to create Above and Below partitions as I'm creating the Pivot Partition.
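The second option described above (growing the Below region on the left and the Above region on the right while scanning) corresponds to the well-known Dutch national flag partition. The following is one possible sketch of partition3 under that scheme, not necessarily what the assignment intends; it assumes the pivot has already been swapped to a[l], as the shell code does, and returns the inclusive bounds of the block equal to the pivot.

```python
import random


def partition3(a, l, r):
    """Three-way partition of a[l..r] around the pivot a[l].

    Afterwards: a[l..lt-1] < pivot, a[lt..gt] == pivot, a[gt+1..r] > pivot.
    Returns (lt, gt), the inclusive bounds of the "equal" block.
    """
    pivot = a[l]
    lt, i, gt = l, l, r
    while i <= gt:
        if a[i] < pivot:
            # Move the smaller element into the Below region.
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            # Move the larger element into the Above region; do not
            # advance i, because the swapped-in element is unexamined.
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    return lt, gt


def randomized_quick_sort(a, l, r):
    if l >= r:
        return
    k = random.randint(l, r)
    a[l], a[k] = a[k], a[l]
    left_part_bound, right_part_bound = partition3(a, l, r)
    randomized_quick_sort(a, l, left_part_bound - 1)
    randomized_quick_sort(a, right_part_bound + 1, r)


array = [2, 2, 1, 9, 2]
bounds = partition3(array, 0, len(array) - 1)  # pivot is a[0] == 2
print(array, bounds)  # [1, 2, 2, 2, 9] (1, 3)

data = [5, 3, 5, 1, 5, 3, 9, 1]
randomized_quick_sort(data, 0, len(data) - 1)
print(data)  # [1, 1, 3, 3, 5, 5, 5, 9]
```

Each element is examined once, so the partition is a single linear pass, and the two recursive calls skip the whole block of pivot duplicates.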

How to avoid duplicate vertex node on AgensGraph?

I want to create an edge between two vertices.
agens=# create (:v1{id:1}), (:v1{id:2});
GRAPH WRITE (INSERT VERTEX 2, INSERT EDGE 0)
agens=# create (:v1{id:1})-[:e1{id:3}]->(:v1{id:2});
GRAPH WRITE (INSERT VERTEX 2, INSERT EDGE 1)
agens=# match (n:v1) return n;
n
------------------
v1[3.1]{"id": 1}
v1[3.2]{"id": 2}
v1[3.3]{"id": 1}
v1[3.4]{"id": 2}
(4 rows)
But now there are duplicated vertices.
How can I avoid duplicate vertices in AgensGraph?
First, find the existing vertices using MATCH.
Then add the edge between the matched vertices.
agens=# create (:v1{id:1}), (:v1{id:2});
GRAPH WRITE (INSERT VERTEX 2, INSERT EDGE 0)
agens=# match (n1:v1{id:1}), (n2:v1{id:2}) create (n1)-[:e1{id:3}]->(n2);
GRAPH WRITE (INSERT VERTEX 0, INSERT EDGE 1)
agens=# match (n:v1) return n;
n
------------------
v1[3.1]{"id": 1}
v1[3.2]{"id": 2}
(2 rows)

read a graph by vertices not as an edge list in R

To explain: I have an undirected graph stored in a text file as an edge list, where each line consists of two values representing an edge, like:
5 10
1000 2
212 420
.
.
.
Normally, when a graph is read from a file in R (using igraph), it is read as edges. To access the edges of the graph "g" we write E(g), to access the vertices of "g" we write V(g), and to access a certain edge i (i.e. both of its vertices) we write E(g)[i].
My question: is there a similar way to access only one vertex of an edge, rather than both of them?
For example, if I need the second vertex of the third edge, what do I need to type?
Also, from the beginning, is there something in igraph to read the graph as vertices and not as edges? For example, reading the graph as a table with two columns, so that each edge can be read as X[i][1], X[i][2].
I need this because I want to loop over all vertices and pick them separately from the edge, and I think that is possible if each vertex is labeled like an element in a table.
Many thanks in advance for any help
If you have a two-column table of vertices, you can use graph_from_data_frame to convert it into a graph. To get the vertices of a particular edge, you can use ends.
library(igraph)
#DATA
set.seed(2)
m = cbind(FROM = sample(LETTERS[1:5], 10, TRUE), TO = sample(LETTERS[6:10], 10, TRUE))
#Convert to graph
g = graph_from_data_frame(m, directed = FALSE)
#plot(g)
#Second vertex on third edge
ends(graph = g, es = 3)[2]
#[1] "I"
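The "two-column table" idea from the question is not specific to R or igraph. As a language-neutral illustration, here is a minimal Python sketch that parses the sample edge list shown in the question into a table of (first, second) pairs; the variable names are hypothetical, and it does not use igraph at all.

```python
# The three sample edges from the question, as they would appear in the file.
edge_text = """5 10
1000 2
212 420"""

# One (first_vertex, second_vertex) tuple per non-empty line.
edges = [
    tuple(int(token) for token in line.split())
    for line in edge_text.splitlines()
    if line.strip()
]

# Second vertex of the third edge (Python indices start at 0).
third_edge_second_vertex = edges[2][1]
print(third_edge_second_vertex)  # 420

# All distinct vertices, for looping over them independently of the edges.
vertices = sorted({v for edge in edges for v in edge})
print(vertices)  # [2, 5, 10, 212, 420, 1000]
```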

How to stop traversal based on inbound edge

I have a graph in ArangoDB with two vertices collections, P and F, and one edge collection with two types of edges: fp and hp.
Note that the image above has been simplified - the "F" nodes connect themselves to other F nodes via more "fp" edges. In other words, I do not know in advance (for example) if "F4" has an inbound "fp" edge or more.
I would like to use one AQL query to traverse the graph, starting from node PA, but stopping at any vertex that does not have an inbound "hp" edge. The documentation indicates that the way to stop a traversal is to use a query like:
FOR v, e, p IN 0..100 OUTBOUND 'P/PA' GRAPH 'test'
FILTER p.<vertices|edges>... <condition>
RETURN v
And I would like to see:
PA, PC, PE, PF
But I don't know how to achieve that without first populating a new property on the "P" nodes indicating that they have an inbound "fp" edge. If I did that, this would then be possible:
FOR v, e, p IN 0..100 OUTBOUND 'P/PA' GRAPH 'test'
FILTER p.vertices[*].hasFP ALL == true
RETURN v
Is there a way to achieve this without the pre-processing step, in one AQL query?
You can do this with the following query. It starts a traversal at your starting point P/PA and checks whether there is no connecting edge (needed for the starting point itself) or whether the type of the edge is hp. Then, for every vertex found that is directly or indirectly connected with P/PA, it starts another traversal with a depth of 1 and direction INBOUND. There it filters for edge type equal to fp and returns the vertex from the surrounding traversal. You need DISTINCT in this RETURN, otherwise your starting point will be returned twice.
FOR v, e, p IN 0..100 OUTBOUND 'P/PA' GRAPH 'test'
  FILTER e == null
    OR e.type == 'hp'
  FOR v1, e1, p1 IN 1 INBOUND v._id GRAPH 'test'
    FILTER e1.type == 'fp'
    RETURN DISTINCT v
If I understand the question correctly, the answer is: use a sub-query to check the condition. Unfortunately, as of the current version (3.2), AQL does not support pruning, so this approach may be too slow.
In brief, the sub-query would look like this:
(FOR v0 IN p.vertices
  LET hasP = (FOR v1, e1 IN 1..1 INBOUND v0 GRAPH "test"
    FILTER e1.type == "fp"
    COLLECT WITH COUNT INTO c
    RETURN c>0)
  FILTER hasP
  COLLECT WITH COUNT INTO c
  RETURN c == LENGTH(p.vertices) )
