Performant query to find all reachable nodes - graph-databases

I'm studying AWS Neptune with Gremlin to build a permission system.
This system would have basically 3 types of vertices: Users, Permissions and Groups.
A group has 0..n permissions
A user can have 0..n groups
A user can be directly connected to 0..n permissions
A user can be connected to another user, in which case it inherits that user's permissions
A group can be inside of another group, that is inside of another group.... so on.
I'm looking for a performant query to find all permissions for a given user.
This graph may get really huge, so to stress-test it I have built a graph with 17 million user vertices, created 10 random edges for each of them, and then created a few permissions.
Then the query I was using to get all permissions is obviously running forever... n_n'
What I'm trying is simply:
g.V('u01')
.repeat(out())
.until(hasLabel('Permission'))
.simplePath()
Is there a better query to achieve it? Or maybe even a better modeling for this scenario?
I was thinking that maybe my 10 random edges have created a lot of cycles and connections that "make no sense", and that's why the query is slow. Does that make sense?
Thanks in advance!

You are probably running in circles. You should write it like this:
g.V('u01')
.repeat(out().simplePath())
.until(hasLabel('Permission'))
It is also preferable to use specific edge labels in the out() step, to avoid traversing irrelevant paths.
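To illustrate why cycles hurt here, below is a minimal Python sketch of the same search on an in-memory adjacency map. It is not Gremlin and says nothing about how Neptune executes the traversal; the graph data and the `find_permissions` helper are hypothetical. It uses a global visited set, which is stricter than simplePath() (that step only forbids repeats within a single path) and is closer in spirit to deduplicating vertices inside the loop.

```python
from collections import deque

def find_permissions(edges, labels, start):
    """Return the set of Permission vertices reachable from start.

    edges:  dict mapping a vertex id to the ids it points to (hypothetical data)
    labels: dict mapping a vertex id to its label
    """
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        vertex = queue.popleft()
        if labels.get(vertex) == "Permission":
            found.add(vertex)
            continue  # like until(): stop expanding once a permission is reached
        for neighbour in edges.get(vertex, ()):
            if neighbour not in seen:  # never expand the same vertex twice
                seen.add(neighbour)
                queue.append(neighbour)
    return found

# Tiny example graph containing two cycles.
edges = {
    "u01": ["u02", "g1", "p1"],
    "u02": ["u01", "g2"],   # cycle back to u01
    "g1": ["g2", "p2"],
    "g2": ["g1", "p3"],     # cycle between the two groups
}
labels = {
    "u01": "User", "u02": "User",
    "g1": "Group", "g2": "Group",
    "p1": "Permission", "p2": "Permission", "p3": "Permission",
}

print(sorted(find_permissions(edges, labels, "u01")))  # ['p1', 'p2', 'p3']
```

Because each vertex is expanded at most once, the work is bounded by the number of reachable vertices and edges rather than by the number of distinct paths, which is what explodes on a densely connected random graph.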

Related

Using a graph database to store and retrieve sorted users by personality scores

I am interested in storing a set of users that have personality scores.
I would like to get them to be more connected (closer?) to each other based on formulas that are applied to their scores. The more similar the users are, the more connected or closer to each other they are (like in a cluster). The closer nodes are to one another, the more similar they are.
I currently do this over multiple steps (some in SQL and other in code) from a relational database.
Most posts and documentation out there seem to focus on how to get started and what the high-level advantages are compared to relational databases.
I am wondering if Graph databases are better suited for this and would do most of the heavy lifting out of the box or more natively. Any details are greatly appreciated.
You could consider modeling it like this:
Here a vertex type/label named Score_range is introduced, together with the label User (with property score).
User vertices are connected to Score_range vertices: for example, a User with score 101 is connected to Score_range(vertexID=100), which stands for [100, 110).
Thus, vertices with closer scores are more connected/clustered in this graph, and your application needs to update the connections in the graph database whenever scores are recalculated or changed.
Then, whether you run a clustering algorithm (e.g. Louvain) on the whole graph or a graph query to find a path between any two user nodes (e.g. FIND PATH in Nebula Graph, an open-source distributed graph database that speaks openCypher), the closeness will be reflected.
But since this connection/closeness is actually numerical/sortable, I think simply handling this closeness relationship may not need a graph database, given the context you have provided.
PS. I drew a picture of a graph in the above schema:
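As a rough sketch of the bucketing described above, the Score_range vertex for a user can be derived from the score with integer division. The function name, the user names, and the assumption of integer scores with a bucket width of 10 (taken from the [100, 110) example) are illustrative only.

```python
def score_range_id(score, width=10):
    """Return the lower bound of the half-open bucket [lower, lower + width)."""
    return (score // width) * width

# Hypothetical users and their integer scores.
users = {"alice": 101, "bob": 108, "carol": 117}

# Each user would get one edge to the Score_range vertex with this id.
user_to_range = {name: score_range_id(score) for name, score in users.items()}

print(user_to_range)  # {'alice': 100, 'bob': 100, 'carol': 110}
```

Users that land in the same bucket share a neighbour, which is what makes them appear close in the graph; when a score changes, only that user's single edge needs to be moved.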

Dynamic distribution groups: Which DDGs is user part of?

I know how to get all the members of a dynamic distribution group: I can take the distribution group, get the AD filters from msExchDynamicDLFilter and msExchQueryFilter properties and query the AD for the users who match that filter.
Now, how do I go the other way? E.g. show which dynamic distribution groups a user is part of? Is there any better way than taking all the hundreds of dynamic distribution groups from AD, resolving each of them, one after the other, and looking whether the user is in the resolved list?
No, there is not. DDGs were created for Exchange, so they are resolved upon message submission.
That being said, you can only retrieve all DDGs, run their queries, and check which ones yield the desired user.
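The brute-force check described above could be sketched as follows. The `resolve` callable is a hypothetical stand-in for whatever code runs a group's filter against AD; here it is replaced by a dictionary lookup so the example is self-contained, and all names, filters, and DNs are made up.

```python
def groups_containing(user_dn, ddg_filters, resolve):
    """Return the names of the dynamic groups whose resolved members include user_dn.

    ddg_filters: dict mapping a group name to its LDAP filter string
    resolve:     callable taking a filter and returning a set of member DNs
                 (hypothetical; in practice this would query AD)
    """
    return [name for name, ldap_filter in ddg_filters.items()
            if user_dn in resolve(ldap_filter)]

# Stand-in data and resolver, for illustration only.
ddg_filters = {
    "all-sales": "(department=Sales)",
    "all-london": "(l=London)",
    "all-it": "(department=IT)",
}
fake_directory = {
    "(department=Sales)": {"CN=alice", "CN=bob"},
    "(l=London)": {"CN=alice"},
    "(department=IT)": {"CN=carol"},
}

print(groups_containing("CN=alice", ddg_filters, fake_directory.__getitem__))
# ['all-sales', 'all-london']
```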

LDAP Query to find all groups with more than one parent

Is there a way to construct an LDAP search string that would return all groups that have more than one parent group? I have searched and searched Google, and perhaps this can't be done, or perhaps I am just not looking for the right thing, but it seems like I should be able to do this.
What I am trying to solve:
We have a batch application that maintains an organizational hierarchy of groups. In our hierarchy, a group can only have one parent "org" group. It can have any number of non-Org parent groups, just not Organizational unit groups. Orgs are identified by a CN that consists of 8 separate numbers within a very specific range, let's say 1000 to 1001 for the sake of argument, where 10000000 is the "Base" Org unit. An Org can only be a child of one other Org, but can have other parent groups that are not Org units.
The problem is that someone, in their infinite wisdom, has gone out and broken the cardinal rule that an Org group should have one and only one Org parent. Now I have to update the batch program to handle and correct it. But first, I need to know how to find these.
My thought is something like this:
(&(objectClass=group)(count(members) > 2))
Where count is some aggregate function that returns the number of members a group might have. Or, maybe some way to return all groups that have more than one memberOf?
LDAP has no aggregate function to determine the number of members. Some LDAP implementations may have added features for aggregation, but AFAIK, Microsoft Active Directory has not.
You could move your baseDN to a higher point to encompass all the possible OUs in which there are groups, or even to the root.
As you tagged the question as Microsoft Active Directory, you may then need to chase referrals.
I was not able to determine if Microsoft Active Directory supports extensible matching for DNs which would allow matching only within two or more containers. If Microsoft Active Directory does, then a filter similar to: (&(|(ou:dn:=groups)(ou:dn:=groups2))(objectclass=groups)) might work.
-jim
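Since the filter itself cannot count, one option is to fetch each group's memberOf values and count the Org parents client-side. The sketch below assumes the entries have already been retrieved into a dictionary; the DNs are hypothetical, and the 8-digit CN pattern is an assumption based on the question's description of Org names.

```python
import re

# Assumption: an Org group is one whose CN is exactly 8 digits.
ORG_CN = re.compile(r"^CN=\d{8},", re.IGNORECASE)

def org_parents(member_of):
    """Keep only the parent DNs that look like Org groups."""
    return [dn for dn in member_of if ORG_CN.match(dn)]

def orgs_with_multiple_org_parents(groups):
    """groups: dict mapping a group DN to the list of DNs in its memberOf attribute."""
    violations = {}
    for dn, member_of in groups.items():
        parents = org_parents(member_of)
        if ORG_CN.match(dn) and len(parents) > 1:
            violations[dn] = parents
    return violations

# Hypothetical directory content.
base = "OU=Groups,DC=example,DC=com"
groups = {
    f"CN=10000001,{base}": [f"CN=10000000,{base}", f"CN=Staff,{base}"],
    f"CN=10000002,{base}": [f"CN=10000000,{base}", f"CN=10000001,{base}"],
    f"CN=Staff,{base}": [],
}

for dn, parents in orgs_with_multiple_org_parents(groups).items():
    print(dn, "->", parents)
```

Only the second group is reported, because the first has a single Org parent plus one non-Org parent, which the rules allow.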

Second Order Relationship in Graph Database

I'm creating an app which is quite relationship heavy. One of the features of the site is a recommendation feature, where users can rate things for others. For this, it seems like a Graph DB would be ideal so I am planning on using Neo4j, alongside Ruby.
This all seems fairly straightforward; however, I would like to include a feature where users can rate a specific relationship. For example, a user could recommend a hotdog in a specific restaurant, etc. The only way I can really think of doing this with a Graph DB is to either add a 'joining node' between the two nodes, connecting all three, or add lists of properties to the relationship (i.e. adding hotdog_5 to the user-restaurant relationship). Obviously the rating could just be added to the hotdog-restaurant relationship, but then you wouldn't be able to trace the users that rated it, to prevent them from rating more than once.
Any thoughts on the problem would be appreciated.
You may want to retrieve all the comments from a user, or the comments about hotdogs in all restaurants, or all the comments about all types of food in a restaurant, so I would recommend doing it like this:
1. user-[:write]->comment
2. comment-[:about]->hotdog
3. comment-[:concern]->restaurant
4. restaurant-[:serve]->hotdog
I'm not sure about the last one; it may be useless given 2 and 3. It depends a lot on the queries you'll run.

Is it possible to LDAP query users common to a set of groups

I need a list of all the users common to a known collection of groups, using a single LDAP query of our Active Directory. It would seem, from our reading so far, that this is not possible, but I thought it best to ask the hive mind.
Try this:
(&(objectCategory=Person)
(&
(memberOf=CN=group1,dc=company,dc=local)
(memberOf=CN=group2,dc=company,dc=local)
(memberOf=CN=group3,dc=company,dc=local)
)
)
This is similar to my question, except there I wanted all users who were NOT members of groups. You'll need to delete all the whitespace for most query tools to work.
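Because the whitespace has to be removed anyway, it can be simpler to generate the filter as a single line. A minimal sketch, using the group DNs from the filter above; the helper name is made up, and it does not escape special characters in the DNs (parentheses, asterisks, and backslashes need escaping in LDAP filters).

```python
def common_members_filter(group_dns):
    """Build a one-line filter matching persons that are direct members of every group."""
    clauses = "".join(f"(memberOf={dn})" for dn in group_dns)
    return f"(&(objectCategory=Person)(&{clauses}))"

group_dns = [
    "CN=group1,dc=company,dc=local",
    "CN=group2,dc=company,dc=local",
    "CN=group3,dc=company,dc=local",
]

print(common_members_filter(group_dns))
```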
Yes, it's possible with an attribute scoped query. It requires W2K3 AD or later, but will give you all of the users that have a particular attribute, i.e. membership in a group or, in your case, multiple groups (the intersection of groups). One of the best examples is from Joe Kaplan and Ryan Dunn's book "The .NET Developer's Guide to Directory Services Programming"; for AD work it's hard to beat. Look at page 179 for a good walkthrough.
Caveat: At this point you are past trivial searches in AD, and a number of things become important, such as the search root, the scope, and the effect of searching through a potentially HUGE set of data for the items you want. Looking through 50 or 60K users to find the members of a group does have an effect on performance, so be prepared to use paged results or similar in case the dataset is large. Kaplan and Dunn do an excellent job of down-to-earth work to get you where you need to be. That said, I have used attribute scoped queries on two AD projects with great success. Being able to retrieve the data from AD without recursive queries is VERY worthwhile, and I found that it is fast as long as I control the size of my dataset.
It's not possible in a single query if your groups contain nested groups.
You would need to write some code that recursively resolves the group members and does the logical equivalent of an "inner join" on the results, producing a list of users that are common to all the original groups.
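A minimal sketch of that recursive resolution and "inner join", assuming the direct membership of every group has already been loaded into a dictionary. The data and function names are hypothetical, and any member that is not itself a key in the dictionary is treated as a user.

```python
def resolve_users(group, members, seen=None):
    """Return all users in group, following nested groups recursively.

    members: dict mapping a group name to its direct members (users or groups)
    seen:    groups already expanded in this resolution, to survive cycles
    """
    if seen is None:
        seen = set()
    if group in seen:
        return set()
    seen.add(group)
    users = set()
    for member in members.get(group, ()):
        if member in members:   # the member is itself a group
            users |= resolve_users(member, members, seen)
        else:                   # the member is a user
            users.add(member)
    return users

def common_users(groups, members):
    """Intersect the fully resolved memberships of all the given groups."""
    resolved = [resolve_users(group, members) for group in groups]
    return set.intersection(*resolved) if resolved else set()

# Hypothetical directory with a nested group and a membership cycle.
members = {
    "group1": ["alice", "bob", "nested"],
    "nested": ["carol", "group1"],   # cycle back to group1
    "group2": ["alice", "carol"],
    "group3": ["nested", "alice"],
}

print(sorted(common_users(["group1", "group2", "group3"], members)))
# ['alice', 'carol']
```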
