Nested search using LDAP_MATCHING_RULE_IN_CHAIN

I am using LDAP_MATCHING_RULE_IN_CHAIN to retrieve all nested groups that a user is a member of. However, I am facing a performance issue because the user is located in a deeply nested domain forest.
As a result, I am getting too many group entries. To improve performance, is it possible to restrict the number of entries based on nesting depth? Say, I would like to fetch all groups the user is a member of down to a nesting depth of 3-4.
Server used: Active Directory (2003/2008).
Please advise.

The search happens recursively, so the only way to make it faster is to limit where you are searching from, i.e. the search base: make it more specific if you can. There is no point searching places where you are sure there will be no results.
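For reference, a minimal sketch of the filter and search base involved. The DN, OU names, and server details below are placeholders, not values from the question:

```python
# Build an LDAP query that walks group membership transitively via
# LDAP_MATCHING_RULE_IN_CHAIN (OID 1.2.840.113556.1.4.1941).
# The user DN below is a hypothetical example.
user_dn = "CN=JSmith,OU=Sales,DC=corp,DC=example,DC=com"

# Narrowing the base from the forest root to a specific OU is the main
# performance lever; AD exposes no depth-limit control for this rule.
search_base = "OU=SalesGroups,DC=corp,DC=example,DC=com"

ldap_filter = f"(member:1.2.840.113556.1.4.1941:={user_dn})"
print(search_base)
print(ldap_filter)
```

Pass `ldap_filter` with the narrowed `search_base` to whatever LDAP client you use; the matching rule itself always expands the full chain, so the base is the only knob.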

Azure Form Recognizer - prebuilt-invoice doesn't recognize currency and other custom fields

Azure Form Recognizer's prebuilt-invoice model doesn't recognize the currency and some of my other custom fields from my invoice PDF. The General Document model gets me all key-values, but with its key-values I would need to write an algorithm to categorize the invoice-related fields, which prebuilt-invoice already does.
I need all key-values from the prebuilt-invoice API so I can find the missing elements myself.
Has anybody faced this? How did you overcome it? One way I can think of is to call both APIs for the same document, but that hurts performance and increases cost.
Any suggestions?
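One way to sketch the two-call approach mentioned above. The dict shapes below are hypothetical stand-ins for fields extracted from the two API responses, not the SDK's actual return types:

```python
def merge_fields(invoice_fields, general_kv):
    """Prefer the categorized prebuilt-invoice fields; fall back to
    General Document key-values for anything the invoice model missed.
    Both arguments are plain {name: value} dicts extracted from the
    respective API responses (hypothetical shapes)."""
    merged = dict(general_kv)        # start with everything General found
    merged.update(invoice_fields)    # invoice-specific fields win
    missing = {k: v for k, v in general_kv.items() if k not in invoice_fields}
    return merged, missing

# Example values, made up for illustration:
invoice = {"InvoiceTotal": "118.00", "VendorName": "Contoso"}
general = {"InvoiceTotal": "118.00", "Currency": "EUR", "PONumber": "4711"}
merged, missing = merge_fields(invoice, general)
print(missing)  # the fields only General Document recognized
```

This doesn't avoid the double cost, but it at least isolates exactly which key-values prebuilt-invoice dropped, so you can decide whether the second call is worth it per document type.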

Is it possible to get a list of similar and/or identical documents?

This is a general question on which I would like input from the search community, so I don't have a piece of code to share just yet.
The objective is, for a single document, to get a list of similar and/or identical documents indexed by Azure Search - is that possible?
So given document_id = 1, how do I get a list of the documents most similar to the specified id in the index? Ideally the outcome would be a list of documents ordered by a match score of 0-100, where 100 (%) would be an identical match.
I am considering taking the content of a given document and submitting it as part of the search, but that doesn't seem very elegant; it is also error prone in terms of constructing the query, and the size of a document can be significant.
Thank you in advance for any suggestions or comments.
You could try using the preview feature "moreLikeThis" -> https://learn.microsoft.com/en-us/azure/search/search-more-like-this
I believe that's the closest Azure Search has to offer to what you want.
Edit 1: Be advised that this feature has limitations, such as no support for complex types. Make sure it meets your requirements before taking a production dependency.
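The shape of the REST call can be sketched as below. The service name, index name, and preview api-version are placeholders; check the linked docs for the currently supported preview versions:

```python
# Sketch of the query-string shape for the preview moreLikeThis feature.
from urllib.parse import urlencode

service = "myservice"   # placeholder search service name
index = "myindex"       # placeholder index name
params = {
    "api-version": "2020-06-30-Preview",  # moreLikeThis is preview-only
    "moreLikeThis": "1",                  # key of the source document
    "searchFields": "content",            # which fields drive similarity
}
url = (f"https://{service}.search.windows.net"
       f"/indexes/{index}/docs?{urlencode(params)}")
print(url)
```

The response is an ordinary search result ranked by similarity score; there is no built-in 0-100 normalization, so you would scale the scores yourself if you need that presentation.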

Solr - Get multiple next "cursorMark" values to allow real pagination

I'm developing an application whose database is managed by Solr v8.1.
I need to create a pagination system, and I have read that cursors are advised for this type of operation.
The problem is: if I want to create a pagination system that shows the end-user more than one next page, how can I do this?
Normally Solr returns only one nextCursorMark value, but what about the next 2/3/4 or more pages? Is that possible? Is the same behaviour possible for previous cursors?
Checking the documentation, it seems that continuing to fetch using the next cursor is mandatory, but I don't think this is a smart solution.
Thanks
It sounds like what you want is regular pagination, if those are important features. CursorMarks are (very) useful for certain use cases, but might not give you any additional performance in your case.
You can, however, use cursorMarks, but a cursorMark won't tell you how far into a result set you've come or how many rows are left - just how many rows there are in total (you can still keep track of this manually in your UI). The cursorMark only tells Solr "this is the last entry I showed, so start returning values from here". This is useful for deep pagination across a cluster with many nodes, as it greatly reduces the number of results that must be fetched from each node in the cluster.
If you decide to use a cursorMark, keep track of the current offset, the page size and the page number in your URL. You won't be able to let people skip directly to page X, but you can at least show how many results remain (this is the same strategy as applied by Gmail).
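A sketch of that bookkeeping, assuming a `solr_select` helper (hypothetical) that wraps your HTTP call to `/select` and returns the parsed JSON:

```python
# Page through results with cursorMark, tracking page number and offset
# client-side, since the cursor itself carries no position information.
def page_through(solr_select, page_size=10):
    cursor = "*"                    # initial cursor, per the Solr docs
    page, offset = 1, 0
    while True:
        resp = solr_select({
            "q": "*:*",
            "rows": page_size,
            "sort": "score desc, id asc",  # sort must include the uniqueKey
            "cursorMark": cursor,
        })
        docs = resp["response"]["docs"]
        yield page, offset, docs
        next_cursor = resp["nextCursorMark"]
        if next_cursor == cursor:   # unchanged cursor means end of results
            return
        cursor, page, offset = next_cursor, page + 1, offset + len(docs)
```

Solr signals the end of the result set by returning the same cursorMark you sent, which is what the equality check relies on.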

Advanced database queries - appengine datastore

I have a fairly simple application (like a CRM) which has a lot of contacts and associated tags.
A user can search using many criteria (search-items), such as:
updated_time in last 10 days
tags in xxx
tags not in xxx
first_name starts with xxx
first_name not in 'Smith'
I understand indexing and how filters (not in) cannot work on more than one property.
For me, since most of the time reporting is done in a cron, I can iterate through all records and process them. However, I would like to know the most optimized route for doing it.
I am hoping that instead of querying 'ALL', I can get close to a query which can run within the App Engine design limits and then manually match the rest of the items in the query.
One way of doing it is to start with the first search-item and get a count, then add the next search-item and get a count again. At the point where it bails out, I process those records against the rest of the search-items manually.
The questions are:
Is there a way to know beforehand whether a query is valid programmatically, without doing a count?
How do you determine the best set of search-items which do not collide (e.g. not-in does not work on many filters)?
The only way I see is to get all the equality filters as one query, take the first inequality or 'in' filter, execute it, and just iterate over the resulting entities.
Is there a library which can help me? ;)
I understand indexing and how filters (not in) cannot work on more than one property.
This is not strictly true. You may create a "composite index", which allows you to filter on multiple fields. These consume additional storage.
You can also generate your own equivalent of a composite index by creating your own "composite field" to query against.
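A sketch of such a hand-rolled composite field. The property names, separator, and example values are arbitrary choices for illustration:

```python
# Concatenate the values you want to filter on into one indexed
# property, then query it with a single equality filter instead of
# a multi-property composite index.
def composite_value(*parts):
    # Use a separator that cannot appear in the values themselves.
    return "|".join(str(p).lower() for p in parts)

# Stored on the entity at write time, e.g. for (tag, first-name initial):
entity = {
    "tags": ["customer"],
    "first_name": "Smith",
    "tag_name_idx": composite_value("customer", "Smith"[:1]),
}
# At query time, build the same value and do one equality filter:
needle = composite_value("customer", "s")
print(entity["tag_name_idx"] == needle)  # True
```

The trade-off is that the field only answers the exact combination it encodes; each combination of properties you want to filter on needs its own composite field.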
Is there a way to know beforehand whether a query is valid programmatically, without doing a count?
I'm not sure I understand what kind of validity you're referring to.
How do you determine the best set of search-items which do not collide (e.g. not-in does not work on many filters)?
A "not in" filter is not trivial. One way is to maintain two arrays (repeated fields): one with all the tags an entity has, and one with all the tags it does not have. This lets you easily find entities with or without a given tag. The only issue is that once you create a new tag, you have to sweep across the entities, adding a "not in" entry to every entity that lacks it.
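The dual-list idea can be sketched with plain dicts standing in for datastore entities (the tag names are made up):

```python
# Keep both the tags an entity has and the tags it lacks, so a
# "not in" query becomes a plain membership filter on the second list.
ALL_TAGS = {"vip", "lead", "churned"}

def with_tag_lists(tags):
    tags = set(tags)
    return {"tags": sorted(tags), "not_tags": sorted(ALL_TAGS - tags)}

entities = [with_tag_lists(t) for t in (["vip"], ["lead", "churned"], [])]

# "tags not in ('vip')" is now: not_tags contains 'vip'
no_vip = [e for e in entities if "vip" in e["not_tags"]]

# The sweep cost: introducing a new tag means updating every entity.
ALL_TAGS.add("trial")
for e in entities:
    if "trial" not in e["tags"]:
        e["not_tags"].append("trial")
```

In the real datastore, `not_tags` would be a repeated property, so the membership check becomes an ordinary indexed equality filter.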

Placement of sorted entry in Google App Engine Datastore

I'd like to determine what place a particular entry is in, but the appropriate GQL query escapes me. Ideally I'd like to know the following details, which seem like they should be known by the Datastore; I just can't figure out how to determine them. Can someone help?
the placement of a particular entry (in a given sorting, i.e. by a particular property)
the total number of entries that exist (w/o retrieving them, just the count)
the next entry in the list (I figure as long as I can get the placement, I can make the right query to get the next one by simply getting 2 and taking the latter)
If you're using Python, check out the google-app-engine-ranklist project, which implements a rank list in the App Engine datastore.
GQL is very limited, and really only exists to give people stuck in an SQL mindset a slightly easier transition to using the App Engine datastore. You can't do any of the things you want to do with GQL syntax.
Assuming you're using python, the second can be done by calling the .count() method of a db.Query or db.GqlQuery object, with the caveat that you must specify the maximum number to count as the parameter to count(), and that this maximum cannot be larger than 1000.
You can't find a particular entry in the result set without fetching all of them and looking for it. The last then becomes trivial, since you've already been fetching all of the entities and you just need to fetch the next one.
None of this is going to be efficient; the datastore isn't designed to do this sort of stuff.
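For completeness, the brute-force approach described above can be sketched like this, where `query_all_sorted` is a hypothetical stand-in for a query ordered by the property of interest:

```python
# Fetch the results in sorted order and scan for the target to derive
# placement, total count, and the next entry in one pass.
def placement_info(query_all_sorted, target_key):
    rows = list(query_all_sorted())  # O(n) fetch; fine in a cron, not a request
    total = len(rows)
    for i, row in enumerate(rows):
        if row["key"] == target_key:
            nxt = rows[i + 1] if i + 1 < total else None
            return {"place": i + 1, "total": total, "next": nxt}
    return None  # target not in the result set

# Example data, already in descending score order:
data = [{"key": k, "score": s} for k, s in
        (("a", 90), ("b", 75), ("c", 60))]
info = placement_info(lambda: data, "b")
print(info["place"], info["total"])  # 2 3
```

This makes the inefficiency concrete: the entire sorted result set must be materialized just to learn one entry's position, which is exactly why the ranklist project maintains its own auxiliary structure instead.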
