How to search within a specific document - azure-cognitive-search

My use case is that I would like to use Azure Search hit highlighting to check (and visualize) whether a set of keywords exists for a specific key id (e.g. my_id).
Example query:
/docs?api-version=2020-06-30&search="search term 1" OR "search term 2" OR "search term 3"
AND my_id = "some id"
(As opposed to searching across the entire database and getting multiple document hits).
But no matter what variation I try, I can't seem to limit the search scope to just a specific document.
Any help or guidance would be much appreciated!
UPDATE 13-OCT-2020
The solution came down to two issues:
1. Postman was encoding the = in the filter parameter.
2. The GUID had to be surrounded by single quotes.
Thanks to #ramero-MSFT for the assistance!

You can use a filter to limit the set of documents that the search runs against: https://learn.microsoft.com/en-us/azure/search/search-filters
Make sure the "key" field is marked as filterable in your index.
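As a rough sketch of what such a request could look like, the Python below builds a GET URL that combines the search terms with an OData filter on the key field. The service name, index name, and highlighted field are placeholders I made up, and queryType=full is my assumption so that OR is treated as an operator. Note the single quotes around the key value and that $filter is left unencoded.

```python
from urllib.parse import quote, urlencode


def odata_string(value):
    """Wrap a value in single quotes as an OData string literal.

    Embedded single quotes are escaped by doubling them.
    """
    return "'" + value.replace("'", "''") + "'"


def build_search_url(service, index, terms, key_field, key_value, highlight_field):
    """Build a search URL restricted to one document via $filter."""
    # Each term becomes a quoted phrase; the phrases are joined with OR.
    search = " OR ".join('"{}"'.format(t) for t in terms)
    params = {
        "api-version": "2020-06-30",
        "queryType": "full",  # assumption: full Lucene syntax for the OR operator
        "search": search,
        "$filter": "{} eq {}".format(key_field, odata_string(key_value)),
        "highlight": highlight_field,
    }
    # safe="$" keeps the leading $ of $filter literal; values are percent-encoded.
    query = urlencode(params, safe="$", quote_via=quote)
    return "https://{}.search.windows.net/indexes/{}/docs?{}".format(service, index, query)


# Hypothetical names, for illustration only.
url = build_search_url(
    "my-service", "my-index",
    ["search term 1", "search term 2"],
    "my_id", "some id", "content",
)
```

The request still needs an api-key header when it is sent; that part is omitted here.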

Related

Solr Custom Boosting if a specific field matches the query

We are trying to implement a very interesting search logic with custom boosting and I am wondering if Solr can support this.
We have the following fields in our index:
Name
Description
Keywords (array)
Each keyword will have an amount(int value) paired to it.
A search is run across Name, description and keywords field. If a keyword matches the search text, the corresponding index must be boosted based on the amount of the matching keyword only.
I've read through Solr DisMax and they can only boost a field using a fixed amount.
My scenario will be to boost the result by X amount based on matching keywords only.
Thanks in advance
The only viable solution I see to this problem (assuming, of course, that you DO NOT know the number of keywords in advance) would be to run the query as a filter query (to skip the scoring stage), fetch all matching documents (a bit problematic), and then sort them on your side, using the matched term to build a Java Comparator.
Problems may arise when you get a particularly large number of documents, but you could probably sidestep this issue with pagination.
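A minimal sketch of that client-side sort, written in Python rather than as a Java Comparator. The document shape (a "keywords" mapping of keyword to amount) and the whitespace-based matching are assumptions made for illustration, not something Solr returns in this form by default.

```python
def matched_amount(doc, search_text):
    """Return the largest amount among keywords that occur in the search text."""
    terms = set(search_text.lower().split())
    amounts = [
        amount
        for keyword, amount in doc["keywords"].items()
        if keyword.lower() in terms
    ]
    # Documents with no matching keyword get no boost.
    return max(amounts, default=0)


def sort_by_keyword_amount(docs, search_text):
    """Sort documents so that the highest matching keyword amount comes first."""
    return sorted(docs, key=lambda d: matched_amount(d, search_text), reverse=True)


# Hypothetical documents as they might look after fetching from Solr.
docs = [
    {"name": "A", "keywords": {"solr": 5, "lucene": 50}},
    {"name": "B", "keywords": {"solr": 20}},
    {"name": "C", "keywords": {"search": 100}},
]
ranked = sort_by_keyword_amount(docs, "solr boosting")
```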
If you don't have too many different amounts, maybe you can try this at index time:
Store the keywords in different fields (dynamic fields -> boost-*) based on their amount:
boost-1 = keyword1,keyword4,keyword6
boost-10 = keyword2
boost-100 = keyword5
You can then search across all your boost fields with (e)dismax and boost each dynamic field by its amount in your (e)dismax configuration (boost-1^1,boost-10^10,boost-100^100).
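The index-time grouping and the matching qf string could be generated along these lines. This is only a sketch: the helper names are made up, and it assumes a dynamic field pattern boost-* exists in the schema, as the answer suggests.

```python
from collections import defaultdict


def keywords_to_boost_fields(keyword_amounts):
    """Group keywords into boost-<amount> fields for indexing."""
    fields = defaultdict(list)
    for keyword, amount in keyword_amounts.items():
        fields["boost-{}".format(amount)].append(keyword)
    return dict(fields)


def build_qf(amounts):
    """Build the (e)dismax qf value, boosting each field by its own amount."""
    return " ".join("boost-{0}^{0}".format(a) for a in sorted(set(amounts)))


keyword_amounts = {
    "keyword1": 1, "keyword2": 10, "keyword4": 1,
    "keyword5": 100, "keyword6": 1,
}
doc_fields = keywords_to_boost_fields(keyword_amounts)

# Query parameters for an edismax request over the boost fields.
params = {
    "defType": "edismax",
    "q": "keyword5",
    "qf": build_qf(keyword_amounts.values()),
}
```

The name and description fields would be added to qf with whatever fixed weights suit the application.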

APACHE SOLR: Best Way to have multiple OR Conditions for same field

I am new to Solr. We have CRM data for contacts and companies, numbering in the millions, and we have switched to Solr for fast search results.
PROBLEM: We have large inclusion and exclusion lists with names of companies or contacts.
Ex: Include or Exclude : "company A" & "Company B" & "Company C" .... & "Company n" where assume n = 10000;
What would be the best way to do this kind of a query using SOLR.
WHAT I HAVE TRIED:
Setting "q" ==> field_name: ("companyA" OR "companyB" ..... OR "Company n");
This only works for lists of up to roughly 400 entries.
I look forward to any assistance on this.
You can increase the max number of boolean clauses: See here: http://wiki.apache.org/solr/SolrConfigXml
Performance Hint: In your case I would think about packaging the inclusion and exclusion lists into a filter and letting the results cache for reuse.
This can happen for multiple reasons:
1. Check how you are querying Solr. Is it a GET or a POST? With GET, all parameters are passed as part of the URL, i.e. http://<host:>q=field_name:(....). URL length is limited in practice: Internet Explorer, for example, capped URLs at roughly 2,000 characters, and servers impose their own limits. If your programmatically built URL exceeds the limit, either change the query model or make it a POST call.
2. If #1 doesn't apply to your case, check for the maxBooleanClauses tag in solrconfig.xml. If it is missing, add it as described in the Solr wiki:
http://wiki.apache.org/solr/SolrConfigXml#The_Query_Section
You can increase the value of maxBooleanClauses in solrconfig.xml to the desired level. The default value is 1024.
Shishir
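The suggestions above (put the list in a filter query and send it as a POST) can be combined roughly as follows. This is a sketch only: the select URL and core name are hypothetical, and the request is built but not sent. A list of 10,000 names still needs maxBooleanClauses raised accordingly.

```python
from urllib.parse import urlencode
from urllib.request import Request


def quote_phrase(value):
    """Quote a name as a Solr phrase, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_filter(field, names, exclude=False):
    """Build field:("a" OR "b" ...), prefixed with - for an exclusion list."""
    clause = "{}:({})".format(field, " OR ".join(quote_phrase(n) for n in names))
    return "-" + clause if exclude else clause


def build_request(select_url, field, include_names):
    """Build a POST request so the long list travels in the body, not the URL."""
    body = urlencode({"q": "*:*", "fq": build_filter(field, include_names)})
    return Request(
        select_url,
        data=body.encode("utf-8"),  # supplying data makes urllib issue a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical endpoint; pass the request to urllib.request.urlopen to send it.
req = build_request(
    "http://localhost:8983/solr/contacts/select",
    "field_name",
    ["companyA", "companyB"],
)
```

Because the list sits in fq rather than q, Solr can cache the filter and reuse it across queries.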

'FieldBoost' causes CFIndex to fail

I read in the CF10 docs that the attribute 'FieldBoost' has been added to CFIndex in order to specify which fields should have more importance in Solr's scoring.
However, it seems that not only does it not work as intended, it in fact causes the whole indexing operation to fail completely!
I've seen other posts on the Adobe forums mentioning exactly the same issue, but no replies or resolution available.
I'm running CF10 Update 11.
The following code works and indexes 14,000 records:
<cfindex collection = "MyCollection"
    action = "refresh"
    type = "custom"
    query = "Local.MyContent"
    key = "ID"
    title = "Name"
    body = "Name,Description"
>
However, if I add the FieldBoost value, there are no errors and the index operation appears to run correctly, but the collection then contains zero records:
<cfindex collection = "MyCollection"
    action = "refresh"
    type = "custom"
    query = "Local.MyContent"
    key = "itemID"
    title = "Name"
    body = "Name,Description"
    fieldBoost = "title"
>
Has anyone had this working?
From the comments...
I found this bug which I believe is similar to your situation (although it was reported on a Mac platform).
Although it is not documented very well, you need to include the weight with the fieldboost attribute. In ColdFusion's implementation, you specify the weight by appending it to the name of the field you want boosted, separated by a : (colon). The attribute should look something like this:
fieldboost="title:6"
I was able to find a little bit of documentation on this attribute in the Adobe ColdFusion 10 Beta documentation (on page 106 of that document specifically). Here is an excerpt from that document:
Improving search result rankings
The following attributes in cfindex help you improve the search result rankings:
fieldBoost: Boost specific fields while indexing.
fieldBoost enhances the score of the fields and thereby the ranking in the search results. Multiple fields can be boosted by specifying the values as a comma-separated list.
docBoost: Boost entire document while indexing.
docBoost enhances the score of the documents and thereby the ranking in the search results
And the following code is the example they used to show the fieldboost attribute (notice that they are boosting two fields, separated by a comma):
<cfindex collection="autocommit_check" action="update" type="file"
key="#Expandpath(".")#/_boost1.txt" first_t="fieldboost" second_t="secondfield"
fieldboost="first_t:1,second_t:2" docboost="6" autocommit="true">
Also check this related question for a way to boost fields during the search - CF10 Fieldboost on cfindex has no effect

Need help in constructing code for GAE Search Api

I have created a Document, added it to an Index, and used the GAE Search API to search for text successfully. Please find the sample code below.
search.Document(
fields=[search.TextField(name='id', value=id),
search.TextField(name='search', value=searchT)])
options = search.QueryOptions(returned_fields=['id'])
results = search.Index(name=_D_INDEX_NAME).search(search.Query(searchTxt, options=options))
Now I am unable to understand how to achieve the things mentioned below. Some sample code would be really appreciated.
To search for plural variants of an exact query, use the ~ operator:
~"car" # searches for "car" and "cars"
To build queries that reference specific fields, use both field and value in your query, separated by a colon:
field:value
field:"value as a string"
When you add a document, you specify its schema by defining the fields of the document. In your case id and search.
To search for a term that only appears in a specific field you use the notation field:term
search.Index(name=_D_INDEX_NAME).search('search:programming')
For searching plural variants of a term you use the operator ~
search.Index(name=_D_INDEX_NAME).search('~car')
Note however that this won't work in the dev_appserver.
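To keep query strings consistent, the two notations can be wrapped in small helpers. This is a sketch with made-up function names; it only builds the strings, so it runs without the App Engine SDK, and it does not handle quotes embedded in the value.

```python
def field_query(field, value):
    """Build a field-restricted query: field:value or field:"value as a string"."""
    if " " in value:
        return '{}:"{}"'.format(field, value)
    return "{}:{}".format(field, value)


def stem_query(term):
    """Build a query that also matches plural variants, using the ~ operator."""
    return "~{}".format(term)


# The resulting strings are what would be passed to
# search.Index(name=_D_INDEX_NAME).search(...)
q1 = field_query("search", "programming")
q2 = field_query("search", "value as a string")
q3 = stem_query("car")
```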

Can I use Solr term component with filtering on non-term fields

http://localhost:8080/search/terms?terms.prefix=ab&terms.fl=text&terms.sort=count
I have the above terms query, which works as I expect: it returns all the terms from the "text" field that have a certain prefix, sorted by count.
I want to return only the terms from documents where another field, "language", is "en". Can I add such a filter to a terms query?
Unfortunately, you can't filter while accessing the indexed terms of a field through the TermsComponent. That's one of the limitations you face when building auto-suggestions, for example. If that is what you are doing, one approach that does support filtering is based on faceting with the prefix parameter, as explained here.
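A sketch of the facet-based alternative, expressed as the request parameters it would use. The /search/select path is an assumption modeled on the terms URL in the question; the field names come from the question.

```python
from urllib.parse import urlencode


def build_facet_prefix_url(select_url, field, prefix, filter_query):
    """Build a facet request returning terms of `field` that start with `prefix`,
    counted only over documents matching `filter_query`."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),          # only the facet counts are needed, not documents
        ("fq", filter_query),   # the filter that TermsComponent cannot apply
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),
        ("facet.sort", "count"),
    ]
    return select_url + "?" + urlencode(params)


# Hypothetical handler path on the same host as the terms request.
url = build_facet_prefix_url(
    "http://localhost:8080/search/select", "text", "ab", "language:en"
)
```

The counts returned this way are document counts within the filtered set, which is usually what an auto-suggest list needs.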

Resources