ldap query syntax to exclude alpha characters

I'm trying to adjust this LDAP query so that only employeeIDs consisting solely of digits are included and everything else is skipped. I believe (!employeeID=\00) matches any ID that is not blank or null, but how do I test for alphabetic versus purely numeric characters in AD? Thanks
-LdapFilter "(&(&(objectCategory=person)(objectClass=user)
(!userAccountControl:1.2.840.113556.1.4.803:=2))(&(objectCategory=person)(objectClass=user)
(!objectClass=inetOrgPerson))(sAMAccountName=s0*)(!sAMAccountName=*-d)(!sAMAccountName=*-e)
(!sAMAccountName=*-a)(!Name=Test*)(!Name=v-*)(!employeeID=\00))"

I don't think we can use the search filter to filter out non-digit values.
I found no filter for this in the search filter syntax documentation:
http://msdn.microsoft.com/en-us/library/aa746475%28v=vs.85%29.aspx
Instead, you can use a simpler filter to get all users, add the attributes you need (e.g. employeeID, sAMAccountName, etc.) to the properties-to-load list, and then filter the results on the client side.
Besides, a filter like attr=*sth will be slow. The index only helps with equality (attr=sth) and starts-with (attr=sth*) matches.
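The client-side step could look roughly like the sketch below. It assumes the search results have already been loaded into plain dictionaries; the `results` list and the function name `numeric_employee_ids` are made up for illustration and are not part of any directory library.

```python
def numeric_employee_ids(entries):
    """Keep only entries whose employeeID consists solely of ASCII digits."""
    kept = []
    for entry in entries:
        # Treat a missing or None attribute as an empty string.
        employee_id = entry.get("employeeID") or ""
        # str.isdigit() alone also accepts characters such as superscript
        # digits, so require ASCII as well. An empty string fails isdigit().
        if employee_id.isascii() and employee_id.isdigit():
            kept.append(entry)
    return kept


# Hypothetical search results, shaped the way a directory library might return them.
results = [
    {"sAMAccountName": "s0123", "employeeID": "004521"},
    {"sAMAccountName": "s0456", "employeeID": "A1234"},
    {"sAMAccountName": "s0789", "employeeID": None},
    {"sAMAccountName": "s0999", "employeeID": "77"},
]
numeric = numeric_employee_ids(results)
```

With this sample data, only the entries for s0123 and s0999 remain.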

How to return only a single regex match group in snowflake?

I have a regex that has multiple match groups.
How in snowflake do I specify which match group to return?
I'm using REGEXP_SUBSTR but happy to use alternatives if they work better.

TL;DR: You can't do exactly that, but you can use the 'e' option and non-capturing groups with (?:re).
To clarify, it seems Neil is asking for something that would return 'word' for
select regexp_substr('bird is the word','(bird) (is) (the) (word)',1,4)
Unfortunately, I don't think Snowflake supports exactly this functionality today. There is an 'e' (extract) parameter to REGEXP_SUBSTR, which allows you to extract a group only, but it always extracts the first group. The reason for that is that the occurrence parameter today means occurrence of the entire regexp in the string. Example
select regexp_substr('bird is cows are','([a-z]*) (is|are)',1,2,'e');
-> cows
You can achieve what you want by not using grouping for the groups before what you want, e.g.
select regexp_substr('bird is the word','bird (is) (the) (word)',1,1,'e');
-> is
select regexp_substr('bird is the word','bird is the (word)',1,1,'e');
-> word
However, that doesn't work if you want to use grouping for expressing alternatives, e.g.
select regexp_substr('cow is the word','(bird|cow) is the (word)',1,1,'e');
-> cow
Still, I see there would be value in providing an option to extract a particular group number, will raise it with Snowflake development :)

Snowflake has a regexp_substr_all function that accepts a group number as its last argument:
select regexp_substr_all('bird is the word','(bird) (is) (the) (word)',1,1,'c',4)[0];

Active Directory Query using LDAP Query in custom search

Some of the users in the domain I'm working on have no manager assigned or no Job title so I tried to create a new query with this LDAP query in the definequery>customsearch>advanced tab:
(&(objectCategory=user)(objectClass=user))(|(!manager=*)(!title=*)
This returns zero results even though I know they exist. Using the Custom Search creates the same search string and also returns zero results. I tried this, based on research elsewhere, which also returns zero results.
(&(objectCategory=person)(objectClass=user))(|(!manager=*)(!title=*)
What am I doing wrong?
Also, I want to search only in specific folders and their subfolders. Should I prepend this:
(|(OU=Innsbruck)(OU=Totnes)(OU=Dueren))
where these are immediately below the domain and each location has its own sub folders of Computers, Groups, Users.

Your query is just invalid. That window doesn't tell you that - it just gives zero results.
You're missing closing parentheses, and you need to put the OR condition inside the AND condition. You also need to use (objectCategory=person), not (objectCategory=user). Keep (objectCategory=person) alongside (objectClass=user): on its own, (objectClass=user) also matches computer objects, because the computer class derives from user.
This is what it should look like:
(&(objectCategory=person)(objectClass=user)(|(!manager=*)(!title=*)))
I will usually paste my query into Notepad++, which highlights matching parentheses, so it's easy to find missing ones. Or you can break it up over multiple lines to make it easier to read and easier to spot errors:
(&
  (objectCategory=person)
  (objectClass=user)
  (|
    (!manager=*)
    (!title=*)
  )
)
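If no editor with bracket matching is at hand, a few lines of code can do the same check. The following is only a sketch: it counts parentheses and does not validate the rest of the filter syntax. It relies on the fact that literal parentheses inside values must be escaped as \28 and \29 in a filter string, so every "(" or ")" it sees is structural. The function name `check_parentheses` is invented for this example.

```python
def check_parentheses(ldap_filter):
    """Return None if the parentheses balance, else a short description of the problem."""
    depth = 0
    for position, char in enumerate(ldap_filter):
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis with nothing open to match it.
                return f"unexpected ')' at position {position}"
    if depth > 0:
        return f"{depth} unclosed '('"
    return None


broken = "(&(objectCategory=user)(objectClass=user))(|(!manager=*)(!title=*)"
fixed = "(&(objectCategory=person)(objectClass=user)(|(!manager=*)(!title=*)))"
print(check_parentheses(broken))
print(check_parentheses(fixed))
```

A balanced count does not prove the filter is valid. The broken query above also closes the AND group too early, and this check only reports the missing closing parenthesis.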
Regardless of how you search (through the Users and Computers UI or through code) you can only search one OU at a time. There is no OU attribute or any other attribute that you can use in a query to limit to specific OUs.
In the UI, you can click 'Browse' in the top right to pick the OU you want to search.
If you were doing this in code, you could do a couple of things to limit it to specific OUs:
1. Search each OU separately (you can optionally set the Search Scope to not search sub-OUs if you want), or
2. Search the whole domain, then look at the distinguishedName attribute of each result and discard the results from OUs you don't want.
Option #2 will probably perform faster since it makes fewer network requests.
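A rough sketch of the second option, assuming the distinguishedName values have already been fetched as strings. The domain DC=example,DC=com and the sample entries are placeholders, and the comparison is a plain case-insensitive suffix check, so DNs with unusual spacing or escaping would need a proper DN parser.

```python
def in_allowed_ous(distinguished_name, allowed_ou_dns):
    """True if the DN sits in, or anywhere below, one of the allowed OUs."""
    dn = distinguished_name.lower()
    # An object below OU=X has a DN that ends with ",OU=X,<domain>".
    return any(dn.endswith("," + ou.lower()) for ou in allowed_ou_dns)


# Placeholder domain; the OU names come from the question.
allowed = [
    "OU=Innsbruck,DC=example,DC=com",
    "OU=Totnes,DC=example,DC=com",
    "OU=Dueren,DC=example,DC=com",
]

# Hypothetical distinguishedName values from a domain-wide search.
found = [
    "CN=Jane Doe,OU=Users,OU=Innsbruck,DC=example,DC=com",
    "CN=John Roe,OU=Users,OU=London,DC=example,DC=com",
    "CN=Max Muster,OU=Users,OU=Dueren,DC=example,DC=com",
]
kept = [dn for dn in found if in_allowed_ous(dn, allowed)]
```

Here `kept` holds the Innsbruck and Dueren entries and drops the London one.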

It seems to me that the filter is not compliant with RFC 4515: LDAP String Representation of Search Filters.
Maybe AD and the tool you are using accept it, but NOT filters should be in the form (!(manager=*)):
(&(objectCategory=person)(objectClass=user)(|(!(manager=*))(!(title=*))))

LDAP Groupfilter with range

I have the following group filter:
(name=*)(member;range=0-1)
The reason I use "range" is that I have groups in AD with more than 1500 users.
With the test filter above I am trying to find the first two users.
With this filter the result is always 0. I tried different filter variants, unfortunately without success.
If I just use (name=*), then I can find all the members.
Does anyone have an idea what could be wrong?
Thanks!

Your LDAP filter is not valid. When you have more than one condition, you need to add either an "and" (&) or "or" (|) operator.
But also, the "range" is not valid in the LDAP filter itself. It belongs in the list of attributes to return. How that's done exactly depends on which programming language you are using to make the query. If you show the rest of your code, I can help there.
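Whatever the language, the idea is the same: keep (name=*) as the filter and put the range option on the attribute name you ask the server to return. The sketch below shows only the string handling, not the search call, since the question does not say which library is in use. It assumes Active Directory's ranged retrieval convention, where the server answers with an attribute named like member;range=0-1499 and marks the final chunk with * as the upper bound. The helper names are invented.

```python
import re


def range_attribute(name, start, end=None):
    """Build the attribute name to request, e.g. 'member;range=0-1'."""
    upper = "*" if end is None else str(end)
    return f"{name};range={start}-{upper}"


_RANGE = re.compile(r"^[^;]+;range=(\d+)-(\d+|\*)$", re.IGNORECASE)


def next_start(returned_attribute):
    """Return the start index for the next request, or None after the last chunk."""
    match = _RANGE.match(returned_attribute)
    if match is None:
        raise ValueError(f"not a ranged attribute: {returned_attribute!r}")
    upper = match.group(2)
    # The server reports '*' as the upper bound when nothing is left.
    return None if upper == "*" else int(upper) + 1


first_request = range_attribute("member", 0, 1)   # for the attribute list, not the filter
following = next_start("member;range=0-1499")
finished = next_start("member;range=1500-*")
```

A caller would loop, requesting range_attribute("member", start) until next_start returns None.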

SQL Server validating postcodes

I have a table containing postcodes but there is no validation built in to the entry form so there is no consistency in the way they are stored in the database, sample below:
ID      Postcode
001742  B5
001745
001746
001748  DY3
001750
001751
001768  B276LL
001774  B339HY
001776  B339QY
001780  WR51DD
I want to use these postcodes to map the distance from a central point, but before I can do that I need to put them into a valid format and filter out any blanks or incomplete postcodes.
I had considered using
left(postcode,3) + ' ' + right(postcode,3)
To correct the formatting but this wouldn't work for postcodes like 'M6 8HD'
My aim is to get the list of postcodes in a valid format, but I don't know how to account for different lengths of postcode. Is there a way to do this in SQL Server?

As discussed in the comments, sometimes looking at a problem the other way around presents a far simpler solution.
You have a list of arbitrary input provided by users, which frequently doesn't contain the correct spacing. You also have a list of valid postcodes which are correctly spaced.
You're trying to solve the problem of finding the correct place to insert spaces into your arbitrary inputs to make them match the list of valid codes, and this is extremely difficult to do in practice.
However, performing the opposite task - removing the spaces from the valid postcodes - is remarkably easy to do. So that is what I'd suggest doing.
In our most recent round of data modelling, we have modelled addresses with two postcode columns - PostCode containing the postcode as provided from whatever sources, and PostCodeNoSpace, a computed column which strips whitespace characters from PostCode. We use the latter column for e.g. searches based on user input. You may want to do something similar with your list of Valid postcodes, if you're keeping it around permanently - so that you can perform easy matches/lookups and then translate those matches back into a version that has spaces - which is actually a solution to the original question posed!
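The matching idea is independent of SQL Server, so here it is as a small sketch outside the database. The `valid_postcodes` list is a stand-in for the real reference list and its entries are illustrative only; the function names are invented.

```python
def strip_spaces(postcode):
    """Remove all whitespace and upper-case, e.g. 'b27 6ll' -> 'B276LL'."""
    return "".join(postcode.split()).upper()


def build_lookup(valid_postcodes):
    """Map the space-free form of each valid postcode to its spaced form."""
    return {strip_spaces(code): code for code in valid_postcodes}


def normalise(raw, lookup):
    """Return the correctly spaced postcode, or None if the input is blank or unknown."""
    return lookup.get(strip_spaces(raw or ""))


# Stand-in for the list of valid, correctly spaced postcodes.
valid_postcodes = ["B27 6LL", "B33 9HY", "B33 9QY", "WR5 1DD", "M6 8HD"]
lookup = build_lookup(valid_postcodes)

raw_inputs = ["B5", "", "DY3", "B276LL", "WR51DD", "m68hd"]
cleaned = [normalise(raw, lookup) for raw in raw_inputs]
```

Blank and incomplete inputs such as "B5" and "DY3" come back as None, which also takes care of filtering them out.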

Finding alphabetical position in a large list

I have an as400 table containing roughly 1 million rows of full names / company names which I would like to convert to use another datastore while still matching the speed of the original.
Currently, a user enters the search and almost instantaneously gets the alphabetical position of the search term in the table and a page of matches. The user can then paginate either up or down through the records very quickly.
There is almost no updating of the data and approximately 50 inserts per week. I'm thinking that any database can maintain an alphabetical index of the names, but I'm unsure of how to quickly find the position of the search within the dataset. Any suggestions are greatly appreciated.

This sounds just like a regular pagination of results, except that instead of going to a specific page based on a page number or offset being requested, it goes to a specific page based on where the user's search fits in the results alphabetically.
Let's say you want to fetch 10 rows after this position, and 10 rows before.
If the user searches for 'Smith', you could do two selects such that:
SELECT
    name
FROM
    companies
WHERE
    name < 'Smith'
ORDER BY
    name DESC
LIMIT 10
and then
SELECT
    name
FROM
    companies
WHERE
    name >= 'Smith'
ORDER BY
    name
LIMIT 10
You could do a UNION to fetch that in one query, the above is just simplified.
The term the user searched for would fit half way through these results. If there are any exact matches, then the first exact match will be positioned such that it is eleventh.
Note that if the user searches for 'aaaaaaaa' then they'll probably just get the 10 first results with nothing before it, and for 'zzzzzzzz' they may get just the 10 last results.
I'm assuming that the SQL engine in question allows >= and < comparisons between strings (and can optimise them using indexes), but I haven't tested this, so it may not be possible. If, like MySQL, it supports internationalized collations, then you could even have the ordering done correctly for non-ASCII characters.
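To try the two queries without the real table, the sketch below runs them against an in-memory SQLite database. SQLite stands in for whichever datastore is chosen, the sample names are invented, and the page size is reduced to 2 to keep the output short.

```python
import sqlite3


def page_around(conn, term, page_size=10):
    """Return up to page_size names before the term and page_size names from it onward."""
    before = conn.execute(
        "SELECT name FROM companies WHERE name < ? ORDER BY name DESC LIMIT ?",
        (term, page_size),
    ).fetchall()
    after = conn.execute(
        "SELECT name FROM companies WHERE name >= ? ORDER BY name LIMIT ?",
        (term, page_size),
    ).fetchall()
    # The first query returns rows nearest-first, so reverse them for display.
    return [row[0] for row in reversed(before)] + [row[0] for row in after]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT)")
conn.execute("CREATE INDEX idx_companies_name ON companies (name)")
names = ["Adams", "Baker", "Clark", "Jones", "Smith", "Smithson", "Taylor", "Young"]
conn.executemany("INSERT INTO companies (name) VALUES (?)", [(n,) for n in names])

page = page_around(conn, "Smith", page_size=2)
print(page)
```

Paging up or down works the same way: reuse the first or last name of the current page as the next search term.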

If by "the position of the search" you mean the number of the record if they were enumerated alphabetically, you may want to try something like:
select count(*) from companies where name < 'Smith'
Most databases ought to optimize that reasonably well (but try it--theories you read on the web don't trump empirical data).
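One quick way to try this is the sketch below, again using an in-memory SQLite database with invented sample names as a stand-in for the real table. The position it returns is zero-based, i.e. the number of names that sort before the search term.

```python
import sqlite3


def alphabetical_position(conn, term):
    """Count the names that sort strictly before the search term."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM companies WHERE name < ?", (term,)
    ).fetchone()
    return count


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT)")
conn.execute("CREATE INDEX idx_companies_name ON companies (name)")
names = ["Adams", "Baker", "Clark", "Jones", "Smith", "Smithson", "Taylor", "Young"]
conn.executemany("INSERT INTO companies (name) VALUES (?)", [(n,) for n in names])

position = alphabetical_position(conn, "Smith")
print(position)
```

On a table of about a million rows it is worth timing this against the real data, as the answer suggests.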

Just to add to the ordering suggestions:
Add an index to the name if this is your standard means of data retrieval.
You can paginate efficiently by combining LIMIT and OFFSET.