Is it possible to change document ranking in Lucene index? - sql-server

When a user searches for a specific term or phrase, I would like Lucene to return a certain document as the first result.
My documents are rows in a SQL Server database.
Thank you in advance.

If you know which document you want to show first, you don't need to bother with Lucene specifics to achieve what you want.
Simply show it first, and when you iterate your search results skip it, since you know you already displayed it.
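A minimal sketch of that idea, assuming each search hit is a dict with an "id" key (the field name and the sample data are hypothetical, not part of any Lucene API):

```python
def pinned_first(pinned, results, key=lambda doc: doc["id"]):
    """Yield the pinned document first, then the remaining hits in their original order."""
    yield pinned
    pinned_key = key(pinned)
    for doc in results:
        # Skip the pinned document if the search also returned it.
        if key(doc) == pinned_key:
            continue
        yield doc


# Hypothetical hits, in the order the search engine ranked them.
hits = [{"id": 7, "title": "B"}, {"id": 3, "title": "A"}, {"id": 9, "title": "C"}]
pinned = {"id": 3, "title": "A"}
ordered = list(pinned_first(pinned, hits))
```

The ranking of every other hit is left untouched; only the display order changes.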

Related

Azure Search - Hierarchical facets guidance

I'm developing a project where I want to have hierarchical facets.
I have an index with a complex structure, like:
Index
-field1
-List
And othercomplexfield contains another list with anothercomplexfield inside.
I'd like to be able to give to users the possibility to:
Have the facets of field1.
When one is selected, I'd like to give the user the possibility to select one of the values of a certain field of "othercomplexfield" while filtering by the selected field1.
I can do that.
I'd then like to give the user the possibility to select one of the possible values of "anothercomplexfield" while filtering by field1 AND by the selected othercomplexfield.
The difficulty here is that I don't want every possible facet value, but only the ones CONTAINED by the othercomplexfield that I'm filtering for.
So far I have had to do this in C#; I did not find a way to write a query that makes Azure Search return the distinct values that I want.
Has anyone had a similar problem?
Did I explain the problem well enough?
I saw no clear guidance online, everything is easy if you only have level 1 facets but when you get into nested objects it's not that clear anymore.
I'm not sure I fully understand the context of your question. What I can tell you is that filters only apply at the document level and not at the complex collection level. What I mean by that is that if a filter matches an item in a complex collection, the entire document will be returned, not just the item in the complex collection that matched. The same is true for facets--facets will count all documents in the result set that match the filter and can't be scoped down just to parts of documents. With that, it seems like having this logic in your application like you mentioned might be the best approach for your current index schema.
We do have this old blog post that talks about one way to implement hierarchical facets with Azure Cognitive Search which may give you some other ideas on how you could implement the functionality you're looking for: https://learn.microsoft.com/en-us/archive/blogs/onsearch/multi-level-taxonomy-facets-in-azure-search
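As a rough sketch of the application-side approach mentioned above: once the service has returned the documents matching the field1 filter, the scoped facet values can be counted in code. The document shape, the "name" and "value" keys, and the sample data are assumptions made for illustration only.

```python
from collections import Counter


def nested_facet_counts(documents, selected_other):
    """Count anothercomplexfield values, but only inside the selected othercomplexfield items."""
    counts = Counter()
    for doc in documents:
        seen = set()
        for other in doc.get("othercomplexfield", []):
            if other.get("name") != selected_other:
                continue  # ignore items of the collection that did not match
            for inner in other.get("anothercomplexfield", []):
                seen.add(inner["value"])
        counts.update(seen)  # count each value at most once per document
    return counts


# Hypothetical documents, already filtered by field1 on the service side.
docs = [
    {"field1": "a", "othercomplexfield": [
        {"name": "x", "anothercomplexfield": [{"value": "red"}, {"value": "blue"}]},
        {"name": "y", "anothercomplexfield": [{"value": "green"}]},
    ]},
    {"field1": "a", "othercomplexfield": [
        {"name": "x", "anothercomplexfield": [{"value": "red"}]},
    ]},
]
facets = nested_facet_counts(docs, "x")
```

Here "green" is excluded because it sits under a different othercomplexfield item, which is the scoping that document-level facets cannot provide.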

Is it possible to get a list of similar and/or identical documents?

This is a general question on which I would like some input from the search community, so I don't have a piece of code to share just yet.
The objective is for a single document to get a list of similar and/or identical documents indexed by Azure Search - is that possible?
So given a document_id = 1, how do I get a list of the documents in the index most similar to that id? Ideally the outcome would be a list of documents ordered by a match score of 0-100, where 100 (%) would be an identical match.
I considered taking the content of a given document and submitting it as the search query, but that does not seem very elegant; it is also error-prone in terms of constructing the query, and a document can be very large.
Thank you in advance for any suggestions or comments.
You could try using the preview feature "moreLikeThis" -> https://learn.microsoft.com/en-us/azure/search/search-more-like-this
I believe that's the closest Azure Search has to offer to what you want.
Edit 1: Be advised that this feature has limitations like non-support for complex types. Make sure it meets your requirements before taking a production dependency.
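For illustration, a request URL for that feature could be built as below. This is a sketch only: the service name, index name, and field names are hypothetical, and the preview api-version string is an assumption that should be checked against the linked documentation. The request would also need an api-key header, which is not shown.

```python
from urllib.parse import urlencode


def more_like_this_url(service, index, doc_key, search_fields=None,
                       api_version="2020-06-30-Preview"):
    """Build a GET URL asking for documents similar to the one with key doc_key."""
    params = {"moreLikeThis": doc_key, "api-version": api_version}
    if search_fields:
        # Optionally restrict the similarity comparison to specific fields.
        params["searchFields"] = ",".join(search_fields)
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{urlencode(params)}"


# Hypothetical service and index names.
url = more_like_this_url("myservice", "docs-index", "1")
```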

Creating Index in Cloudant

Scenario.
I have a document in the database which has thousands of items in 'productList' as below.
here
All the objects in the 'productList' array have the same shape and the same fields, with different values.
Now I want to search in the following way.
When a user types 'c' against the 'Ingrediants' field, the list will show all 'Ingrediants' values starting with the letter 'c'.
When a user types 'A' against the 'brandName' field, the list will show all 'brandName' values starting with the letter 'A'.
Please give an example of how to search this way, whether by:
creating an index (json, text),
creating a Search index (design document), or
using views, etc.
Note: I don't want to create an index at run time (the index could be defined in the Cloudant dashboard); I just want to query it from the application using this library.
I have read the documentation and understand the concepts.
Now, I want to implement it with the best approach.
I will use this approach to handle all such scenarios in future.
Sorry if the question is stupid :)
thanks.
CouchDB isn't designed to do exactly what you're asking. You'd need one index for Ingredient, and another for Brand Name - and it isn't particularly performant to do both at once. The best approach I think would be to check out the Mango query feature http://docs.couchdb.org/en/2.0.0/api/database/find.html, try the queries you're interested in and then add indexes as required (it has the explain plan to help make this more efficient).
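As a starting point for experimenting with Mango, the request body for a prefix search could look like the sketch below. It assumes the items live in a 'productList' array inside each document, as described in the question. Two caveats apply: a selector like this returns whole matching documents rather than individual array items, and `$regex` conditions generally cannot be served from an index. The same body can be sent to `_explain` to see which index, if any, would be used.

```python
import json
import re


def prefix_selector(field, prefix, limit=25):
    """Build a Mango _find body matching documents whose productList has an item
    where `field` starts with `prefix`."""
    return {
        "selector": {
            "productList": {
                "$elemMatch": {field: {"$regex": "^" + re.escape(prefix)}}
            }
        },
        "limit": limit,  # hypothetical page size
    }


body = prefix_selector("brandName", "A")
payload = json.dumps(body)  # JSON to POST to /{db}/_find or /{db}/_explain
```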

Solr AND operator

I have a problem getting the right results with my SOLR query. Basically, let's say I want all documents in English containing the string "toto".
http://127.0.0.1:8080/solr-webservice/query/?q=iso_lang_cd:en&ctnt_val:*toto*
The problem is that this query sends me all documents in English AND all documents containing toto.
What I need is to get all documents that are in English AND contain toto. How could I achieve this? I'd think this is the standard use of the AND operator...
Actually, OR is the default query operator for Solr, and your query is not formatted in such a way as to force an AND operation. To get the AND behavior, you could specify your query in one of the following formats:
+iso_lang_cd:en +ctnt_val:*toto*
iso_lang_cd:en && ctnt_val:*toto*
Or you can optionally pass the q.op=AND to force an AND operation. Additionally, you might want to consider using Filter Queries, where you could filter on the language. There are some performance improvements with using filter queries, but please refer to the documentation for more details.
q=ctnt_val:*toto*&fq=iso_lang_cd:en
Please see The Standard Query Parser for more details and a good overview of querying.
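When building such a request in code, it helps to URL-encode each parameter, since a raw `&` in a URL separates parameters and characters like `+` have special meaning there. A small sketch using the base URL from the question (the helper name is hypothetical):

```python
from urllib.parse import urlencode


def solr_query_url(base_url, query, filter_queries=(), default_operator=None):
    """Build a Solr request URL with a main query, optional filter queries, and optional q.op."""
    params = [("q", query)]
    params += [("fq", fq) for fq in filter_queries]  # one fq parameter per filter
    if default_operator:
        params.append(("q.op", default_operator))
    return f"{base_url}?{urlencode(params)}"


base = "http://127.0.0.1:8080/solr-webservice/query/"
# Main query on the content, filter query on the language.
filtered_url = solr_query_url(base, "ctnt_val:*toto*", ["iso_lang_cd:en"])
# Alternative: both clauses in q, with AND forced as the default operator.
and_url = solr_query_url(base, "iso_lang_cd:en ctnt_val:*toto*", default_operator="AND")
```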

Can Solr be used to match a document against keywords without storing the document?

I'm not entirely sure on the vocabulary, but what I'd like to do is send a document (or just a string really) and a bunch of keywords to a Solr server (using Solrnet), and have a return that tells me if the document is a match for the keywords or not, without having the document being stored or indexed to the server.
Is this possible, and if so, how do I do it?
If not, any suggestions of a better way? The idea is to check if a document is a match before storing it. Could it work to store it first with just a soft commit, and if it is not a match delete it again? How would this affect the index?
Index a document - send it to Solr to be tokenized and analyzed and the resulting strings stored
Store a document - send it to Solr to be stored as-is, without any modifications
So if you want a document to be searchable you need to index it first.
If you want a document (fields) to be retrievable in its original form, you need to store a document.
What exactly are you trying to accomplish? Avoid duplicate documents? Can you expand a little bit on your case...
