Solr auto-suggestion with multiple where clauses


I'm not sure whether this is the right place to ask, but I want to understand whether auto-suggestion is a suitable option for location-based search, as I have a specific requirement: from a specified geo location, I want to search for providers (doctors with a specialty, or hospitals) using auto-suggestion.
As part of the suggestion request, I need to pass the geo location along with the search key. The search key would be a doctor's name, a doctor's specialty, a hospital name or a hospital address, and the suggester should return results ordered by geo distance, ascending.
The weight would be calculated as the inverse of the distance.
I posted an earlier question here (solr autosuggestion with tokenization); this post is related to it.
Regards
Venkata Madhu

If you want to add more logic to the suggestions you show, it is probably a good idea to use normal queries instead of the suggest component.
For instance, take a look at this repo: it is a (somewhat outdated) example of using a normal Solr core to store suggestions and run suggest-like queries. That means you can do partial-match queries on that index and add whatever custom scoring logic you want. Keep in mind that it doesn't need to be a separate core; you could just copy data from your existing fields into a separate field used only for generating suggestions.
In this case, you only need to add to or edit the score function to include your own logic (geodist), or even do a hard sort on the distance.
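As a rough illustration of the two options above (hard sort on distance versus folding distance into the score), here is a minimal Python sketch that only builds the request parameters. The field names `suggest_text` and `location` are hypothetical, and it assumes `suggest_text` is analyzed for prefix matching (for example with edge n-grams); `sfield`, `pt`, `geodist()`, `recip()` and the edismax `boost` parameter are standard Solr features.

```python
from urllib.parse import urlencode


def build_suggest_params(prefix, lat, lon, hard_sort=True, rows=10):
    """Build Solr parameters for a suggest-like query ordered or boosted by distance."""
    params = {
        "defType": "edismax",
        "q": prefix,
        "qf": "suggest_text",          # hypothetical field with prefix-friendly analysis
        "sfield": "location",          # hypothetical spatial field
        "pt": f"{lat},{lon}",          # the user's position as "lat,lon"
        "fl": "id,name,dist:geodist()",
        "rows": rows,
    }
    if hard_sort:
        # Nearest first, regardless of text relevance.
        params["sort"] = "geodist() asc"
    else:
        # Multiply the text score by 1 / (distance + 1), so closer providers score higher.
        params["boost"] = "recip(geodist(),1,1,1)"
    return params


params = build_suggest_params("cardio", 12.97, 77.59)
query_string = urlencode(params)  # append to the /select URL of the suggestions core
```

With `hard_sort=False` the inverse-distance weighting from the question is applied as a multiplicative boost instead of replacing the relevance order.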

Related

Azure Search - Hierarchical facets guidance

I'm developing a project where I want to have hierarchical facets.
I have an index with a complex structure, like:
Index
-field1
-List of othercomplexfield
and othercomplexfield contains another list with anothercomplexfield inside.
I'd like to give users the ability to:
See the facets of field1.
When one is selected, select one of the values of a certain field of "othercomplexfield" while filtering by the selected field1.
I can do that.
I'd then like to give the user the possibility to select one of the possible values of "anothercomplexfield" while filtering by field1 AND by the selected othercomplexfield.
The difficulty here is that I don't want every possible facet value, but only the ones CONTAINED by the othercomplexfield that I'm filtering for.
So far I have had to do this inside C#, and I have not found a way to write a query that makes Azure Search return the distinct values I want.
Has anyone had a similar problem?
Did I explain the problem well enough?
I found no clear guidance online. Everything is easy if you only have level 1 facets, but when you get into nested objects it's not that clear anymore.

I'm not sure I fully understand the context of your question. What I can tell you is that filters only apply at the document level and not at the complex collection level. What I mean by that is that if a filter matches an item in a complex collection, the entire document will be returned, not just the item in the complex collection that matched. The same is true for facets--facets will count all documents in the result set that match the filter and can't be scoped down just to parts of documents. With that, it seems like having this logic in your application like you mentioned might be the best approach for your current index schema.
We do have this old blog post that talks about one way to implement hierarchical facets with Azure Cognitive Search which may give you some other ideas on how you could implement the functionality you're looking for: https://learn.microsoft.com/en-us/archive/blogs/onsearch/multi-level-taxonomy-facets-in-azure-search
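To make the "keep this logic in your application" option concrete, here is a minimal sketch (in Python rather than the asker's C#) that computes facet counts for the innermost level, scoped to the selected parent item. The sub-field names `name` and `value` are hypothetical; `field1`, `othercomplexfield` and `anothercomplexfield` come from the question.

```python
from collections import Counter


def scoped_facet_counts(docs, field1_value, other_name):
    """Count inner values only under the selected othercomplexfield item."""
    counts = Counter()
    for doc in docs:
        if doc.get("field1") != field1_value:
            continue
        seen = set()  # count each value once per document, like a facet would
        for item in doc.get("othercomplexfield", []):
            if item.get("name") != other_name:
                continue
            for inner in item.get("anothercomplexfield", []):
                seen.add(inner["value"])
        counts.update(seen)
    return counts


docs = [
    {"field1": "A", "othercomplexfield": [
        {"name": "x", "anothercomplexfield": [{"value": "red"}, {"value": "blue"}]},
        {"name": "y", "anothercomplexfield": [{"value": "green"}]},
    ]},
    {"field1": "A", "othercomplexfield": [
        {"name": "x", "anothercomplexfield": [{"value": "red"}]},
    ]},
    {"field1": "B", "othercomplexfield": [
        {"name": "x", "anothercomplexfield": [{"value": "black"}]},
    ]},
]

facets = scoped_facet_counts(docs, "A", "x")
```

Here "green" is excluded even though its document matches field1 = "A", because it sits under a different othercomplexfield item. That is the scoping the service-side facets cannot provide.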

Is it possible to get a list of similar and/or identical documents?

This is a general question on which I'd like to get some input from the search community, so I don't have a piece of code to share just yet.
The objective is, for a single document, to get a list of similar and/or identical documents indexed by Azure Search. Is that possible?
So, given document_id = 1, how do I get a list of the documents in the index most similar to the specified id? Ideally the outcome would be a list of documents ordered by a match of 0-100, where 100 (%) would be an identical match.
I am considering taking the content of a given document and submitting that as part of the search, but that doesn't seem very elegant. It is also error-prone in terms of constructing the query, and the size of a document can be significant.
Thank you in advance for any suggestions or comments.

You could try using the preview feature "moreLikeThis" -> https://learn.microsoft.com/en-us/azure/search/search-more-like-this
I believe that's the closest Azure Search has to offer to what you want.
Edit 1: Be advised that this feature has limitations like non-support for complex types. Make sure it meets your requirements before taking a production dependency.
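For orientation, a moreLikeThis request is an ordinary document query that carries the key of the source document. The sketch below only builds the URL; the service name, index name, `content` field and the preview `api-version` string are assumptions to check against the linked page, and the request itself would also need an `api-key` header.

```python
from urllib.parse import urlencode


def more_like_this_url(service, index, doc_key, search_fields, api_version):
    """Build a GET URL asking for documents similar to the one with key doc_key."""
    query = urlencode({
        "moreLikeThis": doc_key,
        "searchFields": ",".join(search_fields),
        "api-version": api_version,  # must be a preview version that supports the feature
    })
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{query}"


url = more_like_this_url(
    service="my-service",           # hypothetical service name
    index="my-index",               # hypothetical index name
    doc_key="1",
    search_fields=["content"],      # hypothetical searchable field
    api_version="2020-06-30-Preview",
)
```

Results come back ranked by the usual search score rather than a 0-100 percentage, so any percentage-style display would have to be derived on the client.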

How to boost the score in Azure Search for unstructured blob data?

I am using Azure Search with default indexing on imported unstructured data (PDF, DOC, text, image files, etc.).
I didn't create any scoring profile on the default available fields.
Almost every setting in the portal is the default. If I search for any text through the Search explorer, I get a JSON result with a very low search score.
I read about score boosting using a scoring profile. However, the terms I want to find can be in any document, at any place, so how can I decide which field to weight more?
How can I generate more custom fields from these input files? Do I need to write a document parser?
I am using SDK 4.0 and C# in my bot.
Please suggest.

To use a scoring profile, the fields you are trying to boost need to be part of the index definition; otherwise the scoring mechanism won't know about them.
You mentioned using unstructured data as your source. I assume this means your data does not have any stable or predictable structure. If that's the case, then you probably won't be able to update your index definition to match exactly the structure of every document, since different documents will likely have a different and unpredictable structure. If you know what fields you want to boost, and you know how to retrieve those fields from your documents, then you could update your index definition with only the fields you care about, and then use the "merge" document API to populate those fields for each document.
https://learn.microsoft.com/en-us/rest/api/searchservice/addupdate-or-delete-documents
This would require you to retrieve all documents from the index, parse the data to extract the field you want to boost, and then use the merge API to update the index data with the data you extracted. Once you have this, you will be able to use that field as part of a scoring profile.
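The merge step described above could look roughly like this. The sketch only builds the JSON body for the documents endpoint linked above; the key field `id` and the new field `keyTopics` are hypothetical, and the extraction itself (your parser) is assumed to have already produced the `extracted` mapping.

```python
import json


def build_merge_batch(extracted, key_field="id", boost_field="keyTopics"):
    """Turn {document key: extracted value} into a merge batch for the index."""
    return {
        "value": [
            {
                "@search.action": "merge",  # update only the listed fields
                key_field: doc_key,         # must match the index's key field
                boost_field: value,         # the field the scoring profile will weight
            }
            for doc_key, value in extracted.items()
        ]
    }


# Hypothetical output of a custom parser run over two indexed documents.
extracted = {"doc-1": "invoice payment terms", "doc-2": "maintenance schedule"}
batch = build_merge_batch(extracted)
body = json.dumps(batch)  # POST this to the index's docs/index endpoint
```

Because the action is "merge", fields not named in the batch keep their existing values, so the content extracted by the indexer stays in place.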

Best way to perform search on several fields

I have to perform a search on several fields, e.g. ProductName, ProductDescription, FeedBackOfProduct, etc.
Currently I have 2 approaches:
1. Copy all these searchable fields into one copy field and perform the search on that field.
But the problem here is: how can I boost a particular field, say only ProductName?
2. Or search by field name and boost each field accordingly:
ProductName:"Test"^50.0 ProductDescription:"Easy To Handle"~100^70.0
Please tell me which would be the best approach.
Thanks in advance.

With the 2nd option (search by field with boost) you have more control over how documents are scored. As you note, you do not have this control with the 1st option. Either way is a valid approach, and which one you want to use will depend on your use case.
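If you go with the 2nd option, the query string can be assembled from a small list of per-field clauses. This is a minimal sketch, with a hypothetical helper name, that reproduces the query from the question; it only escapes backslashes and double quotes inside phrases, so it is not a complete query sanitizer.

```python
def boosted_query(clauses):
    """Build a Lucene/Solr query from (field, phrase, boost, slop) tuples."""
    parts = []
    for field, phrase, boost, slop in clauses:
        # Escape characters that would break out of the quoted phrase.
        escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
        term = f'{field}:"{escaped}"'
        if slop is not None:
            term += f"~{slop}"   # phrase slop (proximity)
        term += f"^{boost}"      # per-clause boost
        parts.append(term)
    return " ".join(parts)


query = boosted_query([
    ("ProductName", "Test", 50.0, None),
    ("ProductDescription", "Easy To Handle", 70.0, 100),
])
```

Keeping the boosts in data rather than in a hand-written string makes it easier to tune them per use case, which is the control the 1st option lacks.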

How to add a non-indexed field to Solr results

I have a chemical search application where we execute a molecular search using a standard molecule-matching engine and retrieve the IDs of the chemical structures, along with each hit's score (similarity value), from the engine.
My application then invokes Solr with the list of IDs retrieved from the engine. I want to add the hit's score to the results.
1. Can I simply add this calculated field to SOLR's results? How?
2. Could I implement a SIMILARITY function to supply it as the score instead of the score created by Lucene?
3. I want to order the results by the score. The molecule search can drive this, so can I tell Solr to retain the order of the IDs passed in the search query?
We are using Solr 3.5. It is part of a stack provided by our vendor, and we cannot just upgrade it.
I'm thinking of implementing a custom search handler to do the molecule pre-search and then search Solr with the output.
I am very new to Solr, and any help would be appreciated.

If you send IDs into Solr and then sort by those same IDs, what do you actually need Solr for? Or are you sub-selecting from those IDs afterwards using a Solr query?
In any case, if your implementation allows you to change solrconfig.xml, you should be able to sneak in a custom Request Handler, which should allow you to build your pre- and post-processing. Here is one somewhat relevant article.
Regarding custom similarity, I am not sure you mean what you think you mean (a custom Request Handler is a higher-level intercept). However, if you do mean it, the Wiki discusses what is possible before and after Solr 4.
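If changing solrconfig.xml turns out not to be an option, the same post-processing can be done outside Solr, in the calling application. This is only a sketch of that alternative, not something the article or Wiki above describes: it assumes the engine's hits are available as an id-to-similarity mapping and that the Solr documents carry the same id in a field named `id`.

```python
def attach_engine_scores(solr_docs, engine_hits, id_field="id"):
    """Add the engine's similarity to each Solr doc and order by it, best first."""
    ranked = []
    for doc in solr_docs:
        doc_id = doc[id_field]
        if doc_id in engine_hits:
            # Copy the doc so the original Solr response is left untouched.
            ranked.append({**doc, "similarity": engine_hits[doc_id]})
    ranked.sort(key=lambda d: d["similarity"], reverse=True)
    return ranked


# Hypothetical data: similarity values from the molecule engine, docs from Solr.
engine_hits = {"mol-7": 0.93, "mol-2": 0.81, "mol-5": 0.42}
solr_docs = [
    {"id": "mol-2", "name": "compound B"},
    {"id": "mol-5", "name": "compound C"},
    {"id": "mol-7", "name": "compound A"},
]

results = attach_engine_scores(solr_docs, engine_hits)
```

This keeps Solr as a plain lookup and filtering layer, while the ordering and the extra field come entirely from the molecule engine.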