I have a requirement where I need to run query like below and fetch 2-3 attributes for all entities satisfying this query. The number of distinguishedName would be around 100 in a single query. As I see in the microsoft documentation, that distinguishedName is not indexed, I suspect that this might cause performance issues.
Does anybody know if this would indeed cause performance issues? Apart from the below ldap filter, I would obviously have to use SUBTREE scope.
(|(distinguishedName=<DN 1 goes here>)(distinguishedName=<DN 2 goes here>))
Edit 1:
I tried this in my test Active Directory which has ~6k entries.
Internal event: A client issued a search operation with the following options.
Starting node:
DC=example,DC=com
Filter:
( |
(distinguishedName=CN=user-1,CN=large-test,CN=Users,DC=example,DC=com)
(distinguishedName=CN=user-2,CN=large-test,CN=Users,DC=example,DC=com)
(distinguishedName=CN=user-3,CN=large-test,CN=Users,DC=example,DC=com)
(distinguishedName=CN=group1,CN=large-test,CN=Groups,DC=example,DC=com)
)
Search scope:
subtree
Attribute selection:
mail,objectClass
Server controls:
Visited entries:
4
Returned entries:
4
Used indexes:
idx_distinguishedName:4:N;idx_distinguishedName:1:N;idx_distinguishedName:1:N;idx_distinguishedName:1:N;
Pages referenced:
123
Pages read from disk:
0
From the results it looks like it only visited 4 entries that I searched for using some indexes. I checked with schema snap-in tool (just to be sure) and it doesn't show indexes on distinguishedName. Not sure how it's using these indexes though.
Microsoft Active Directory stores the group memberships at the entry level, so you could use this to fetch the email attribute.
E.g.
ldapsearch .... -b SEARCH_BASE "(|(memberOf=GROUP_DN_1)(memberOf=GROUP_DN_2)...)" mail
Related
I'm trying to select any children that are a single level below with ltree.
For example, if I had Car.Ford, the query would grab any child with a path such as Car.Ford.Fiesta, Car.Ford.Fusion, Car.Ford.Mustang.
How can I build this query using ltree, if possible, specifically using Elixir?
Right now I'm using
from c in query, where: fragment("path <# ?", c.path)
But it returns all entries with the path in it.
Figured this out.
The documentation on Postgres states that the {} in an lquery limits the number of labels it will match, the documentation from the developers clarifies that this is actually to limit the number of levels to search.
'My.Example.*{1}'
That will match anything one level below a path starting with My.Example
Say you have an LDAP with the following structure:
dc=corp,dc=com
|--dc=security
|--ou=users
|--ou=corporate
| |--ou=it
| |--it-user1
| |--it-user2
|--user1
|--user2
|--user3
I need a search query that will look at all entries under the users ou, including those under corporate and it.
Currently I am trying the following:
uid=it-user2,ou=users,dc=security,dc=corp,dc=com
The scope of the search is set as subtree. I was under the impression that the subtree scope would cause the LDAP to search recursively through the entire tree, but that does not seem to be the case. However, if I add the full path into the search as I have below, the entry is found.
uid=it-user2,ou=it,ou=corporate,ou=users,dc=security,dc=corp,dc=com
Could someone give me an idea of where I am going wrong? Thanks.
You need to set your search context (i.e., the search base) to where your object/entry is stored. Based on your example, the search context is ou=users,dc=security,dc=corp,dc=com. When you set the search scope to subtree, it should find the entry or entries that match your critera (i.e., search filter). For example,
ldapsearch -h SERVER -b ou=users,dc=security,dc=corp,dc=com -s sub "(uid=it-user2)"
Of course, with the 'subtree' search scope, you could even set the search context to a higher level container (e.g., dc=security,dc=corp,dc=com). Your entry would still be found as long as it matches the criteria specified by your filter. Since you're searching for all entries under the ou=users container, your query would probably look like this:
ldapsearch -h SERVER -b ou=users,dc=security,dc=corp,dc=com -s sub "(uid=*)"
or
ldapsearch -h SERVER -b ou=users,dc=security,dc=corp,dc=com -s sub "(objectclass=*)"
I fought this for hours - CN=Users LDAP Directory Entry in .Net - not working with OU=Users
This may seem silly and stupid, but the default tree setup in Active Directory is not OU=Users,dc=domain,dc=com but rather CN=Users,dc=domain,dc=com (Note the CN= not the OU= for Users.)
uid=it-user2,ou=users,dc=security,dc=corp,dc=com does not exist. The LDAP client must provide a base object to the search request which exists.
see also
LDAP: Search Best Practices
We have an LDAP login problem of a specific user and I'm suspecting that this is due to cyclic groups assignment in LDAP, i.e. the user is assigned to groups A,B,C,D. Group A contains sub-groups E,F,G and group E contains group A again.
If I query for the user I can see that he has been assigned with 50+ groups and each group may contain more groups and each of those may contain more....
My question is if there's a query I can run to get the nested groups inside those main groups all the way down instead of going each group and do it manually?
The server is AD
To find all the groups that "user1" is a member of (adaptation of this answer see AD search filter):
Set the base to the groups container DN; for example root DN (dc=dom,dc=fr)
Set the scope to subtree
Use the following filter : (member:1.2.840.113556.1.4.1941:=cn=user1,cn=users,DC=x)
Example with LDIFDE.EXE (native command line AD search on windows) :
ldifde -f t.txt -d "DC=dom,DC=fr" -r "(member:1.2.840.113556.1.4.1941:=CN=jblanc,OU=MonOu,DC=dom,DC=fr)"
Remark : as far as I remember there is a small syntax difference with in brackets user DN name. '1.2.840.113556.1.4.1941' is not working in W2K3 SP1, it begins to work with SP2. I presume it's the same with W2K3 R2. I test here with W2K8R2.
With Apache Directory Studio :
Result :
How can I use a a search filter to display users of a specific group?
I've tried the following:
(&
(objectCategory=user)
(memberOf=MyCustomGroup)
)
and this:
(&
(objectCategory=user)
(memberOf=cn=SingleSignOn,ou=Groups,dc=tis,dc=eg,dc=ddd,DC=com)
)
but neither display users of a specific group.
memberOf (in AD) is stored as a list of distinguishedNames. Your filter needs to be something like:
(&(objectCategory=user)(memberOf=cn=MyCustomGroup,ou=ouOfGroup,dc=subdomain,dc=domain,dc=com))
If you don't yet have the distinguished name, you can search for it with:
(&(objectCategory=group)(cn=myCustomGroup))
and return the attribute distinguishedName. Case may matter.
For Active Directory users, an alternative way to do this would be -- assuming all your groups are stored in OU=Groups,DC=CorpDir,DC=QA,DC=CorpName -- to use the query (&(objectCategory=group)(CN=GroupCN)). This will work well for all groups with less than 1500 members. If you want to list all members of a large AD group, the same query will work, but you'll have to use ranged retrieval to fetch all the members, 1500 records at a time.
The key to performing ranged retrievals is to specify the range in the attributes using this syntax: attribute;range=low-high. So to fetch all members of an AD Group with 3000 members, first run the above query asking for the member;range=0-1499 attribute to be returned, then for the member;range=1500-2999 attribute.
If the DC is Win2k3 SP2 or above, you can use something like:
(&(objectCategory=user)(memberOf:1.2.840.113556.1.4.1941:=CN=GroupOne,OU=Security Groups,OU=Groups,DC=example,DC=com))
to get the nested group membership.
Source: https://ldapwiki.com/wiki/Active%20Directory%20Group%20Related%20Searches
And the more complex query if you need to search in a several groups:
(&(objectCategory=user)(|(memberOf=CN=GroupOne,OU=Security Groups,OU=Groups,DC=example,DC=com)(memberOf=CN=GroupTwo,OU=Security Groups,OU=Groups,DC=example,DC=com)(memberOf=CN=GroupThree,OU=Security Groups,OU=Groups,DC=example,DC=com)))
The same example with recursion:
(&(objectCategory=user)(|(memberOf:1.2.840.113556.1.4.1941:=CN=GroupOne,OU=Security Groups,OU=Groups,DC=example,DC=com)(memberOf:1.2.840.113556.1.4.1941:=CN=GroupTwo,OU=Security Groups,OU=Groups,DC=example,DC=com)(memberOf:1.2.840.113556.1.4.1941:=CN=GroupThree,OU=Security Groups,OU=Groups,DC=example,DC=com)))
Consider an e-commerce application with multiple stores. Each store owner can edit the item catalog of his store.
My current database schema is as follows:
item_names: id | name | description | picture | common(BOOL)
items: id | item_name_id | picture | price | description | picture
item_synonyms: id | item_name_id | name | error(BOOL)
Notes: error indicates a wrong spelling (eg. "Ericson"). description and picture of the item_names table are "globals" that can optionally be overridden by "local" description and picture fields of the items table (in case the store owner wants to supply a different picture for an item). common helps separate unique item names ("Jimmy Joe's Cheese Pizza" from "Cheese Pizza")
I think the bright side of this schema is:
Optimized searching & Handling Synonyms: I can query the item_names & item_synonyms tables using name LIKE %QUERY% and obtain the list of item_name_ids that need to be joined with the items table. (Examples of synonyms: "Sony Ericsson", "Sony Ericson", "X10", "X 10")
Autocompletion: Again, a simple query to the item_names table. I can avoid the usage of DISTINCT and it minimizes number of variations ("Sony Ericsson Xperia™ X10", "Sony Ericsson - Xperia X10", "Xperia X10, Sony Ericsson")
The down side would be:
Overhead: When inserting an item, I query item_names to see if this name already exists. If not, I create a new entry. When deleting an item, I count the number of entries with the same name. If this is the only item with that name, I delete the entry from the item_names table (just to keep things clean; accounts for possible erroneous submissions). And updating is the combination of both.
Weird Item Names: Store owners sometimes use sentences like "Harry Potter 1, 2 Books + CDs + Magic Hat". There's something off about having so much overhead to accommodate cases like this. This would perhaps be the prime reason I'm tempted to go for a schema like this:
items: id | name | picture | price | description | picture
(... with item_names and item_synonyms as utility tables that I could query)
Is there a better schema you would suggested?
Should item names be normalized for autocomplete? Is this probably what Facebook does for "School", "City" entries?
Is the first schema or the second better/optimal for search?
Thanks in advance!
References: (1) Is normalizing a person's name going too far?, (2) Avoiding DISTINCT
EDIT: In the event of 2 items being entered with similar names, an Admin who sees this simply clicks "Make Synonym" which will convert one of the names into the synonym of the other. I don't require a way to automatically detect if an entered name is the synonym of the other. I'm hoping the autocomplete will take care of 95% of such cases. As the table set increases in size, the need to "Make Synonym" will decrease. Hope that clears the confusion.
UPDATE: To those who would like to know what I went ahead with... I've gone with the second schema but removed the item_names and item_synonyms tables in hopes that Solr will provide me with the ability to perform all the remaining tasks I need:
items: id | name | picture | price | description | picture
Thanks everyone for the help!
The requirements you state in your comment ("Optimized searching", "Handling Synonyms" and "Autocomplete") are not things that are generally associated with an RDBMS. It sounds like what you're trying to solve is a searching problem, not a data storage and normalization problem. You might want to start looking at some search architectures like Solr
Excerpted from the solr feature list:
Faceted Searching based on unique field values, explicit queries, or date ranges
Spelling suggestions for user queries
More Like This suggestions for given document
Auto-suggest functionality
Performance Optimizations
If there were more attributes exposed for mapping, I would suggest using a fast search index system. No need to set aliases up as the records are added, the attributes simply get indexed and each search issued returns matches with a relevance score. Take the top X% as valid matches and display those.
Creating and storing aliases seems like a brute-force, labor intensive approach that probably won't be able to adjust to the needs of your users.
Just an idea.
One thing that comes to my mind is sorting the characters in the name and synonym throwing away all white space. This is similar to the solution of finding all anagrams for a word. The end result is ability to quickly find similar entries. As you pointed out, all synonyms should converge into one single term, or name. The search is performed against synonyms using again sorted input string.