JavaMail SearchTerm does not work for MessageNumberTerm - jakarta-mail

MessageNumberTerm is not one of the SearchTerm types handled by the IMAP SearchSequence class. How can I fetch a limited number of messages from an IMAPFolder with getSortedMessages()?

Related

Determine page number when using Solr cursorMark

I am building an application with Solr that should give users the first N results of a search, using cursorMark to paginate through R rows at a time.
The problem is that with the client+server relationship, the client knows the page number and the cursorMark, but the server is only told the cursorMark. It's also not safe to trust a page number from the client.
Is there any way I'd be able to determine the offset from a given cursorMark server-side without also storing a list of page number + cursorMark combinations for every search?
For example, I'd like to be able to reject a request after using a cursorMark that would yield results > 10000 for a given search.
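One possible approach, not taken from the thread, is to have the server sign the page number together with the cursorMark it hands out, so that a page number echoed back by the client can be trusted without storing anything per search. The sketch below uses only the Python standard library. The secret key, the page size, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-server-side-secret"  # hypothetical; never sent to the client
ROWS_PER_PAGE = 50     # "R" in the question (assumed value)
MAX_RESULTS = 10000    # limit mentioned in the question


def sign_page(query: str, cursor_mark: str, page: int) -> str:
    """Return an HMAC binding a zero-based page number to a query and cursorMark."""
    message = "\x1f".join([query, cursor_mark, str(page)]).encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def check_request(query: str, cursor_mark: str, page: int, signature: str) -> bool:
    """Accept the request only if the signature is valid and the page is within the limit."""
    expected = sign_page(query, cursor_mark, page)
    if not hmac.compare_digest(expected, signature):
        return False  # page number or cursorMark was tampered with
    # Page `page` covers results page*R + 1 .. (page + 1)*R.
    return (page + 1) * ROWS_PER_PAGE <= MAX_RESULTS
```

The server would return the next cursorMark, the next page number, and the signature with each response, and the client would send all three back with the following request.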

Azure search SDK .NET searching for Phone number fields and formats

I am struggling to search for things like emails and phone numbers.
For instance, we have a string field in the index called PhoneNumber, and it might contain "(555) 555 -5555", but when I search for 555-5555 or 555- I get no results. Another is CustomerEmail with "abs#gmail.com": searching for abs# gives no results. I have looked at tokens and analyzers but have not got them to work. Are there any examples of querying for emails, phone numbers, or special characters with the .NET SDK, for instance on GitHub?
So that a user can type in, say, '(816)555-5555' or '816 555-5555' and it would find 81655555555 in the index?
What analyzer configurations have you tried? It all comes down to how your documents and query terms are processed, which you can learn about in the following article: How full text search works in Azure Search.
With the standard analyzer, the phone number (555) 555-5555 is tokenized to: 555, 555, 5555.
Searching for 555-5555 should return it as your query becomes 555 OR 5555.
Same for the email address: abs#gmail.com is tokenized to abs, gmail.com. The # character is removed from your query term, so your query becomes abs, which should find a match.
Could you share an example of your queries and index configuration?
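To make the tokenization described above easier to picture, here is a rough Python approximation. It is not the real analyzer: the regular expression is an assumption that only mimics the behavior stated in this answer (split on punctuation and whitespace, keep a dotted host such as gmail.com together). The digits_only helper illustrates one possible way to compare differently formatted phone numbers, and is also an assumption rather than an Azure Search feature.

```python
import re

# Runs of letters/digits, optionally joined by single dots (so "gmail.com" stays whole).
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def rough_tokens(text: str) -> list[str]:
    """Approximate the tokens described in the answer; not the actual analyzer."""
    return [token.lower() for token in TOKEN_PATTERN.findall(text)]


def digits_only(text: str) -> str:
    """Strip everything except digits, e.g. to fill a normalized phone field."""
    return re.sub(r"\D", "", text)


print(rough_tokens("(555) 555 -5555"))   # tokens of the indexed phone number
print(rough_tokens("555-5555"))          # tokens of the query
print(rough_tokens("abs#gmail.com"))     # tokens of the email address
print(digits_only("(816)555-5555") == digits_only("816 555-5555"))
```

If the query tokens overlap with the indexed tokens, a match would be expected, which is why the actual queries and index configuration matter here.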

Azure Search: boost results that contains word

I have an airports database in Azure Search. When searching, I would like to boost airports that contain the word "international" in the airport name.
Given two results that have the same score, I would like to boost the one that has the word "international" in the airport name, using just Azure Search (i.e. if possible, without any code to manipulate the results after retrieving them).
I tried term boosting, but it returns a list of airports that have "international" in their names, which is not what I want.
I looked at the scoring functions, but none of them seems to suit my needs.
In essence, I do not want to "match" results that contain the word "international"; I want to "boost" results that contain it after the user keys in the query text.
If you want results containing a term to score higher, but you don't want to require matching documents to contain the term, you can use OR as well as AND. For example, if the user typed "Dallas", your query could look like this:
Dallas OR (Dallas AND airportName:international)
If you further want to control the impact that the term international has on the score, you can use term boosting.
You might find this article on how Azure Search processes queries to be helpful.
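As a small illustration of the query shape above, the following Python sketch builds the string from the user's text. The function name and its parameters are hypothetical, the `^` suffix is the Lucene term-boost syntax, and the sketch assumes the request is sent with the full Lucene query syntax enabled. It does not escape special characters in the user's input.

```python
def build_boosted_query(user_text, field="airportName", term="international", boost=None):
    """Build 'X OR (X AND field:term)' so that the term raises the score without being required."""
    base = user_text.strip()
    if " " in base:
        base = f"({base})"  # keep multi-word input grouped as one clause
    boosted_term = f"{field}:{term}"
    if boost is not None:
        boosted_term += f"^{boost}"  # optional term boosting
    return f"{base} OR ({base} AND {boosted_term})"


print(build_boosted_query("Dallas"))
print(build_boosted_query("Dallas", boost=2))
```

The first call reproduces the query shown in the answer; the second adds a boost factor to the optional clause.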

Language support for Google Search API

I'd like to perform a partial text / phrase search against a Datastore record field using Ruby.
I've figured out how to do it with a range constraint, using >= the search text and < the search text followed by "\ufffd", but that only works from the beginning of the field.
This works; querying for "Ener" returns "Energizer AA Batteries" but querying for "AA" does not return the same.
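The behavior can be reproduced outside Datastore with plain string comparison. The sketch below is only an illustration in Python (the function names are made up) of why a range bounded by the prefix and the prefix followed by "\ufffd" matches the start of a value but not text in the middle of it.

```python
def prefix_range(prefix: str) -> tuple[str, str]:
    """Return the (inclusive lower, exclusive upper) bounds used for a prefix query."""
    return prefix, prefix + "\ufffd"


def matches_prefix_range(value: str, prefix: str) -> bool:
    """True when value sorts inside the prefix range, i.e. value starts with prefix."""
    low, high = prefix_range(prefix)
    return low <= value < high


name = "Energizer AA Batteries"
print(matches_prefix_range(name, "Ener"))  # value sorts between "Ener" and "Ener\ufffd"
print(matches_prefix_range(name, "AA"))    # value sorts after "AA\ufffd", so no match
```

Because the comparison is on the whole field value, only a tokenized index (or a separately stored list of words) can match "AA" here.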
The docs for the Python Search API describe the ability to manually create indexes that allow for both atomic and partial word searches.
https://cloud.google.com/appengine/docs/standard/python/search/ says:
Tokenizing string fields
"When an HTML or text field is indexed, its contents are tokenized. The string is split into tokens wherever whitespace or special characters (punctuation marks, hash sign, etc.) appear. The index will include an entry for each token. This enables you to search for keywords and phrases comprising only part of a field's value. For instance, a search for "dark" will match a document with a text field containing the string "it was a dark and stormy night", and a search for "time" will match a document with a text field containing the string "this is a real-time system"."
In the docs for Ruby and PHP, I cannot find such an API reference to enable me to do the same thing. Is it possible to perform this type of query in Ruby / PHP with Cloud Datastore?
If so, can you point me to the docs? If not, is there a workaround: for example, creating indexes with the Python Search API and then configuring the PHP/Ruby client to execute its queries against those indexes?

How can I sort appengine search index results by relevance?

I'm working on a project that uses Google App Engine's text search API to allow users to search for documents that include a words field. I'm sorting using a MatchScorer, which according to the documentation "assigns a score based on term frequency in a document".
When a user enters a query like "business promo", I convert this into a query string that looks like words:business OR words:promo. I would have expected that this would return documents that contain both the words "business" and "promo" before documents that only contain one of the words (since the documentation says it assigns a score based on term frequency in the document). However, I frequently see results that contain only one of the words before documents that contain both.
I've also tried querying using the RescoringMatchScorer, but see the same problem using this scorer.
I've thought about doing separate queries - ones that AND the search terms and ones that OR the search terms - but this would require many queries if the user enters more than two search terms. For example, if I searched for "advanced business solutions", I'd need queries like this to cover all the bases:
words:advanced AND words:business AND words:solutions
words:advanced AND words:business
words:advanced AND words:solutions
words:business AND words:solutions
words:advanced OR words:business OR words:solutions
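To show how quickly this list grows, here is a Python sketch that generates the same set of queries for any number of terms. The function name and the default field name are taken from the example above or made up for illustration; nothing here calls the App Engine API.

```python
from itertools import combinations


def fallback_queries(terms, field="words"):
    """List AND queries from most to fewest terms, followed by one OR query over all terms."""
    clauses = [f"{field}:{term}" for term in terms]
    queries = []
    # Every combination of 2..n terms, largest combinations first.
    for size in range(len(clauses), 1, -1):
        for combo in combinations(clauses, size):
            queries.append(" AND ".join(combo))
    # A final catch-all query that matches any single term.
    queries.append(" OR ".join(clauses))
    return queries


for query in fallback_queries(["advanced", "business", "solutions"]):
    print(query)
```

For n terms this produces 2^n - n queries (5 for three terms, 12 for four), which is why the approach does not scale well.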
Does anyone have any hints on how to perform searches that return more relevant results (i.e. more search term matches) before less relevant results?
Perhaps it depends on how you interpret the phrase "term frequency". I think you're interpreting it to mean "how many of my search terms appear in the document". But it could also mean "how many times (any of) the search terms appear in each document", and indeed, at least according to some simple experiments I've done, the latter seems to be the actual behavior.
For example, a document that contains the word "business" 20 times and never mentions the word "promo" would be scored higher than a document that contains "business" and "promo" only once each. Does that jibe with the behavior you're seeing?
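The two interpretations can be compared with a small Python sketch. This is not the MatchScorer implementation, only an illustration with made-up function names and documents. The last step shows one possible client-side re-ranking that puts documents matching more distinct terms first.

```python
import re


def total_occurrences(text, terms):
    """Count how many times any of the search terms occurs in the text."""
    words = re.findall(r"\w+", text.lower())
    return sum(words.count(term) for term in terms)


def distinct_terms_matched(text, terms):
    """Count how many of the search terms occur at least once in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return sum(1 for term in terms if term in words)


terms = ["business", "promo"]
docs = {
    "repeats_one_term": "business " * 20,
    "has_both_terms": "business promo",
}

for name, text in docs.items():
    print(name, total_occurrences(text, terms), distinct_terms_matched(text, terms))

# Re-rank: most distinct terms first, then most occurrences.
ranked = sorted(
    docs,
    key=lambda name: (distinct_terms_matched(docs[name], terms),
                      total_occurrences(docs[name], terms)),
    reverse=True,
)
print(ranked)
```

Under the occurrence-count interpretation the first document wins, while under the distinct-term interpretation the second one does, which matches the ordering described in the question.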
