I am indexing documents in Solr that have a Title and a Text. I don't want to create a separate Title field in the document schema. Instead, I want to index the title by putting it inside the text in some way, so that title words are given more importance in scoring.
e.g. Title : Olympics 2012, Text : In December 2012, Olympics were held in......
I want to put the Title words in the Text itself, so the document above would have just one field called Text with the Title words inside it.
e.g. Text : Olympics 2012 In December 2012, Olympics were held in......
In the above, title words are not given any special importance. Is there a way to give title words a little more importance than the other words in the Text field while indexing/scoring?
"giving title words a little extra importance than other words in Text field while indexing/scoring"
I don't think you need to copy the title into the text field in order to boost title matches over text matches. Assuming you have indexed both fields as full text, consider using an edismax query and providing the qf (Query Fields) parameter as
qf=title^10 text
which indicates that matches in title are weighted 10 times as heavily as matches in text.
The following is an example query, in case it helps:
http://localhost:8983/solr/select/?q=Olympics&defType=edismax&qf=title^10.0+text
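As a minimal sketch, the same request can be assembled in Python with the standard library. The host, port, and handler path are copied from the example URL above; the helper name build_edismax_url is only illustrative, and nothing here actually sends the request.

```python
from urllib.parse import urlencode


def build_edismax_url(base_url, query, field_boosts):
    """Return a Solr select URL that searches several fields with edismax.

    field_boosts maps a field name to a boost factor, or to None for no boost.
    """
    # qf is a space-separated list such as "title^10.0 text"
    qf = " ".join(
        name if boost is None else f"{name}^{boost}"
        for name, boost in field_boosts.items()
    )
    params = {"q": query, "defType": "edismax", "qf": qf}
    # urlencode escapes "^" as %5E and the space as "+"
    return f"{base_url}?{urlencode(params)}"


url = build_edismax_url(
    "http://localhost:8983/solr/select/",
    "Olympics",
    {"title": 10.0, "text": None},
)
print(url)
```

The "^" is percent-encoded here, which is the safer form when the URL is built by a program rather than typed into a browser.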
I have a utility which takes a document and processes it and generates a unique set of tags which are used to index the document sentence by sentence.
I am writing these new tags into a different field than the content field.
I was wondering if there is any way to ask highlighting to perform a match against the special tags but to show the highlight of the original content.
Alternately, I could embed the source sentence in part of my overall index and set some file marker so that if the highlighter matched the section with the index it could return the text in the tag.
Any ideas on the best way to do this?
I'm trying to build a query that meets the following requirement:
Find a set of words that appear in a group of fields. For example, I want to find the documents that have the words soccer, ball and goalkeeper in one or both of the fields 'sport_name' and 'description'.
The problem I'm having is that I need to treat both fields as only one for getting results like:
{
"sport_name":"soccer",
"description": "...played with a ball... positions are goalkeeper"
}
Each word may appear in either field, but all of the words need to appear somewhere in the "concatenated bigger field".
Is there a way to do this during query time?
Thanks!!
You can do this by using the edismax query parser (defType=edismax), setting q.op=AND (since all the terms have to be present) and using qf=sport_name description to tell Solr to search for the given terms in both fields.
You can also use qf=sport_name^2 description to say that you want to weigh hits in the sport_name field twice as much as hits in the description field. So if there was a sport named something with ball, that hit would contribute more to the score than if the same content were present in the description field.
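To make the semantics concrete, here is a small sketch in Python. It builds the request parameters described above and, separately, mimics the matching rule in plain Python: every term must occur in at least one of the listed fields. The function names are illustrative, and the in-memory check (simple lowercase word matching) only models the intended behaviour; it is not how Solr analyzes or scores text.

```python
import re


def edismax_and_params(terms, fields):
    """Parameters for an edismax query where all terms are required."""
    return {
        "defType": "edismax",
        "q": " ".join(terms),
        "q.op": "AND",
        "qf": " ".join(fields),
    }


def matches_all_terms(doc, terms, fields):
    """Toy model: each term must appear in at least one of the fields."""
    words_per_field = [
        set(re.findall(r"\w+", doc.get(field, "").lower())) for field in fields
    ]
    return all(
        any(term.lower() in words for words in words_per_field)
        for term in terms
    )


doc = {
    "sport_name": "soccer",
    "description": "...played with a ball... positions are goalkeeper",
}
terms = ["soccer", "ball", "goalkeeper"]
fields = ["sport_name", "description"]

params = edismax_and_params(terms, fields)
print(params)
print(matches_all_terms(doc, terms, fields))            # True
print(matches_all_terms(doc, terms + ["net"], fields))  # False
```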
How can I specify during the index creation that one field should receive more relevance than another field?
Example: I have documents with a title and a description field and want the content of the title field to be more important during query time.
doc1: title:"Hello, world", description:"Just a greeting"
doc2: title:"Greetings", description:"Hello, everybody. Hello, hello"
index("default", doc.title);
index("default", doc.description);
A search for the term "hello" should return doc1 with a higher relevance than doc2, because the word "hello" is present in the title field, even though doc2 contains the word 3 times.
How can this be accomplished?
You can specify a boost at query time, provided you index the fields separately:
index("title", doc.title);
index("description", doc.description);
Then at query time you can specify that the title field gets more weight than the description field:
q=(title:hello)^100 OR (description:hello)
where ^100 indicates that this term is boosted. See https://docs.cloudant.com/search.html#query-syntax
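The effect of the boost on the two example documents can be sketched in Python. The helper names are made up for illustration, and toy_score is a deliberately simplified boost-weighted term count, not the scoring formula Lucene or Cloudant actually uses; it only shows why a large boost on title lets doc1 outrank doc2.

```python
import re


def boosted_query(term, boosts):
    """Build a query such as '(title:hello)^100 OR (description:hello)'."""
    clauses = []
    for field, boost in boosts.items():
        clause = f"({field}:{term})"
        if boost != 1:
            clause += f"^{boost}"
        clauses.append(clause)
    return " OR ".join(clauses)


def toy_score(doc, term, boosts):
    """Boost-weighted count of term occurrences (NOT Lucene's real scoring)."""
    total = 0
    for field, boost in boosts.items():
        words = re.findall(r"\w+", doc.get(field, "").lower())
        total += boost * words.count(term.lower())
    return total


boosts = {"title": 100, "description": 1}
doc1 = {"title": "Hello, world", "description": "Just a greeting"}
doc2 = {"title": "Greetings", "description": "Hello, everybody. Hello, hello"}

print(boosted_query("hello", boosts))
print(toy_score(doc1, "hello", boosts), toy_score(doc2, "hello", boosts))
```

Under this toy model doc1 scores 100 and doc2 scores 3, so the single title match outweighs the three description matches.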
I was reading the documentation and saw this:
For example, suppose an index contains two fields, title and text, and that text is the default field. If you want to find a document called "The Right Way" which contains the text "don't go this way," you could include ... the following terms in your search query:
title:"Do it right" AND go
Since text is the default field, the field indicator is not required; hence the ... query above omits it.
The field is only valid for the term that it directly precedes, so the query title:Do it right will find only "Do" in the title field. It will find "it" and "right" in the default field (in this case the text field).
This seems strange to me, how would I search for the phrase "Do it right" in the title?
I continued reading and found the answer in the section titled "Grouping Clauses within a Field". You can use this:
title:("Do it right") AND go
I have two fields
text field: all important fields like category, product name and brand are copied into it.
attributes field: all attributes are copied into this field.
I have a single search query e.g. "50 mm diameter drill"
I want to search this string in both fields. I am assuming that this will match all products that have drill in the text field.
I want to narrow down the results when the attributes field matches any of the terms 50, mm or diameter.
If nothing matches in the attributes field, I want to return all documents that match the text field.
Edit: I don't want any docs that don't match the text field.
If the search also matches the attributes field and docs are found, I want to return only those docs.
If none are found, I want to return all docs that match the text field.
This is getting a bit tricky and a lot of things depend on your field processing requirements.
You will need to combine field weighting, to rank matches in the attributes field higher, with the edismax minimum match parameter (mm).
Minimum match lets you configure how many of the query terms must match for a document to be returned. This helps weed out documents that only hit on one term in one field.
Lastly, if you really want to apply your own logic here, you can prefix a field clause with + to make it mandatory. For example, +attributes:drill will only return items that have drill in the attributes field.
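If the strict "attributes first, otherwise text only" behaviour is really required, one option is to run two queries from the client. The following Python sketch assumes a hypothetical run_query callable that takes a query string and returns a list of documents; it is not part of any Solr client library. It also assumes the user query contains no characters that need escaping in the query syntax.

```python
def search_with_fallback(run_query, user_query):
    """Prefer documents that also match in attributes; otherwise fall back.

    run_query is a hypothetical callable: it takes a query string and
    returns a list of matching documents.
    """
    # The text field must always match.
    text_clause = f"+text:({user_query})"
    # First attempt: additionally require a match in the attributes field.
    strict = f"{text_clause} +attributes:({user_query})"
    results = run_query(strict)
    if results:
        return results
    # Nothing matched in attributes: return everything matching text.
    return run_query(text_clause)


calls = []


def fake_backend(query):
    # Stand-in for a real search call, used only to exercise the control flow.
    calls.append(query)
    return [] if "+attributes:" in query else ["doc-a", "doc-b"]


results = search_with_fallback(fake_backend, "50 mm diameter drill")
print(calls)
print(results)
```

This costs a second round trip whenever the first query comes back empty, which is the trade-off against the single boosted query.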
Whether "drill" will match depends on how your fields are processed, but probably, yes. The easiest way to do this is to not limit by "if not matched here, do this ..", but to score matches in the attributes field higher. You can do this by using qf (if using (e)dismax) together with their weights, such as attributes^20 text which will score any match in attributes 20 times more than a match in text. Any search matching documents with the correct term in attributes will then be scored higher than those just matching in text.
You can also do something similar in the q parameter, where you can weight each term separately: text:drill OR attributes:drill^20.
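A per-term query of that shape can be generated with a few lines of Python. This is only a sketch: per_term_query is an illustrative name, and the terms are assumed to need no escaping.

```python
def per_term_query(terms, boosts):
    """OR together field:term clauses, adding ^boost where a boost is given."""
    clauses = []
    for term in terms:
        for field, boost in boosts.items():
            clause = f"{field}:{term}"
            if boost is not None:
                clause += f"^{boost}"
            clauses.append(clause)
    return " OR ".join(clauses)


query = per_term_query(["drill"], {"text": None, "attributes": 20})
print(query)
```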