I have four types of documents in my Solr index: general documents, news, books and poetry. I want to query Solr so that it returns 10 results: 7 from general documents, 1 from news, 1 from books and 1 from poetry. I am using Solr in cloud mode.
Is this possible? If yes, how? If not, why not?
Basically, I want to get all of these in one query.
You can use result grouping. Based on the result grouping documentation, your request should be
http://localhost:8983/solr/techproducts/select?wt=json&indent=true&
q=your_query_string&group=true&group.field=type&group.limit=7
Each group then contains up to seven documents. On the client side, take seven items from the group for the general type and one item from each of the other groups.
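The client-side trimming step could be sketched as follows. This is only an illustration: the type values ("general", "news", "books", "poetry") and the helper names are assumptions, and the response shape assumes the default grouped JSON format (grouped -> field -> groups -> doclist -> docs).

```python
from urllib.parse import urlencode

# Hypothetical quota per document type, taken from the question.
QUOTAS = {"general": 7, "news": 1, "books": 1, "poetry": 1}


def grouped_query_url(base_url, query, field="type", limit=7):
    """Build a result-grouping request URL (parameter names as in the answer)."""
    params = {
        "q": query,
        "wt": "json",
        "group": "true",
        "group.field": field,
        "group.limit": limit,  # maximum number of docs returned per group
    }
    return base_url + "?" + urlencode(params)


def pick_by_quota(response, field="type", quotas=QUOTAS):
    """Trim each group's doclist to its quota and flatten the result."""
    picked = []
    for group in response["grouped"][field]["groups"]:
        quota = quotas.get(group["groupValue"], 0)
        picked.extend(group["doclist"]["docs"][:quota])
    return picked
```

If a group holds fewer documents than its quota, the result is simply shorter than 10; filling the gap from another group would need extra logic.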
Suppose I want to create a recommendation system to suggest people you should connect with based off of certain attributes that I know about you and attributes I have about other people that are stored in a Solr index. Is it possible to query the index with a list of attributes (along with boosts for each attribute) and have Solr return scored results even if some of my fields return no matches? The way that I understand that Solr works is that if one of your fields doesn't contain a match in any documents found in your index, you get zero results for the entire query (even if other fields in the query matched) - is that right? What I would hope is that I could query the index and get a list of results back in order of a score given based on how many (and which) fields matched to something, even if some fields have no matches, for example:
Say that there are 2 people documents stored in the index as follows (figuratively):
Person 1:
Industry: Manufacturing
City: Oakland
Person 2:
Industry: Manufacturing
City: San Jose
And say that I perform a pseudo-Solr query that basically says "Search for everyone whose industry is equal to manufacturing and whose city is equal to Oakland". What I would like is to receive both results back in the result set, even though one of the "Persons" does not reside in Oakland. I just want that person to come back as a result with a lower score than Person1. Is this possible? What might a solr query look like to handle this? Assume that I have many more than 2 attributes for each person (so saying that I can use "And" and "Or" in my solr query isn't really feasible.. or is it?) Thanks in advance for your helpful input! (PS I'm using Solr 3.6)
You mention using the AND operator, which is likely your problem.
The default behavior of Lucene, and Solr, query syntax is exactly what you are asking for. A query like:
industry:manufacturing city:oakland
will match documents containing either term, with a scoring preference for those that match both. See the Lucene query syntax documentation.
You can also use the bq (boost query) parameter of the dismax/edismax query parsers. It does not affect matching; it only affects the scores.
http://localhost:8983/solr/persons/select?q=industry:manufacturing&bq=City:Oakland^2
Play with the boosting factor at the end to get the right balance between the matching score and the boosting score.
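With many attributes, the query string can be generated rather than written by hand. A minimal sketch, assuming the default operator is OR so that space-separated clauses are optional; the function name and the boost values are made up for illustration, and values containing double quotes are not escaped here.

```python
def boosted_or_query(attributes):
    """Build space-separated field:"value"^boost clauses.

    `attributes` maps a field name to a (value, boost) pair.
    """
    clauses = []
    for field, (value, boost) in attributes.items():
        # Quote the value so multi-word values such as "San Jose" stay one phrase.
        clauses.append(f'{field}:"{value}"^{boost}')
    return " ".join(clauses)


q = boosted_or_query({
    "industry": ("manufacturing", 1),
    "city": ("Oakland", 2),
})
print(q)
```

The resulting string would be passed as the q parameter; documents matching more (or more heavily boosted) clauses should score higher.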
I am struggling with a little problem where I have to display relevant information about the resultset returned from SolR but can't figure out how to calculate it without iterating the results (bad).
Basically I am storing my documents with a state field and while the search is supposed to return all documents, the UI has to show "Found 15 entities, 5 are in state A, 3 in state B and 8 in C".
At the moment I am using a rather brittle approach of running the query 3 times with additional scoping by type, but I'd rather get that information from the one query I am displaying. (There have been some edge cases where the numbers don't add up and since SolR can return facets I guess there has to be a way to use that functionality in this case)
I am using SolR 3.5 from Rails with the sunspot gem
As you mention yourself, you can use facets for this by setting
facet=true&facet.field=state
I'm not familiar with the sunspot gem, but judging by its documentation you can request
facets like this (assuming Entity is your searchable):
search = Entity.search do
  facet :state
end
This should return the states of all entities returned by your query with the number of entities in this state. The Sunspot documentation tells me you can read these facets in the following way:
search.facet(:state).rows.each do |facet|
  puts "State #{facet.value} has #{facet.count} entities"
end
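If you read the raw Solr response instead of going through Sunspot, the counts can be extracted as sketched below. This assumes wt=json with the default flat list layout for facet fields (value, count, value, count, ...); the function name is made up for illustration.

```python
def facet_counts(response, field):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    # Even positions hold the values, odd positions hold the counts.
    return dict(zip(flat[0::2], flat[1::2]))


def summary(response, field="state"):
    """Format a 'Found N entities, ...' line from one response."""
    counts = facet_counts(response, field)
    parts = ", ".join(f"{n} in state {value}" for value, n in counts.items())
    return f"Found {response['response']['numFound']} entities: {parts}"
```

Because the total and the per-state counts come from the same response, they are computed over the same result set.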
Essentially there are three main sets of functions you can use to garner stats from solr.
The first is faceting:
http://wiki.apache.org/solr/SimpleFacetParameters
There is also grouping (field collapsing):
https://wiki.apache.org/solr/FieldCollapsing
And the stats package:
https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
Note that stats, faceting and grouping may eventually be replaced by the analytics package (OLAP), which is targeted for Solr 5.0.0:
https://issues.apache.org/jira/browse/SOLR-5302
Good luck.
I'm unclear on this point from the documentation. Is it possible to give Solr X document IDs and tell it that I want documents similar to those?
Example:
The user is browsing 5 different articles
I send Solr the IDs of these 5 articles so I can present the user other similar articles
I am not clear about sending the document IDs, nor whether MoreLikeThis can operate on multiple documents as in this example.
You can try passing multiple IDs in the query, q=id:(document_id1 OR document_id2 OR document_id3),
e.g.
http://localhost:8080/solr/select/?qt=mlt&q=id:(document_id1 OR document_id2 OR document_id3)&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10
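Since the query contains spaces, colons and parentheses, it is safer to build the URL programmatically so that it gets encoded. A small sketch using only the parameters from the URL above; the function name, document IDs and field names are placeholders.

```python
from urllib.parse import urlencode


def mlt_url(base_url, doc_ids, mlt_fields, rows=10):
    """Build a MoreLikeThis request seeded with several document IDs."""
    id_query = "id:(" + " OR ".join(doc_ids) + ")"
    params = {
        "qt": "mlt",
        "q": id_query,
        "mlt.fl": ",".join(mlt_fields),  # fields used for similarity
        "fl": "id",
        "rows": rows,
    }
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return base_url + "?" + urlencode(params)


print(mlt_url("http://localhost:8080/solr/select/", ["doc1", "doc2"], ["title", "body"]))
```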
Imagine an index like the following:
id partno name description
1 1000.001 Apple iPod iPod by Apple
2 1000.123 Apple iPhone The iPhone
When the user searches for "Apple" both documents would be returned. Now I'd like to give the user the possibility to narrow down the results by limiting the search to one or more fields that have documents containing the term "Apple" within those fields.
So, ideally, the user would see something like this in the filter section of the ui after his first query:
Filter by field
name (2)
description (1)
When the user applies the filter for field "description", only documents which contain the term "Apple" within the field "description" would be returned. So the result set of that second request would be the iPod document only. For that I'd use a query like ?q=Apple&qf=description (I'm using the Extended DisMax Query Parser)
How can I accomplish that with Solr?
I already experimented with faceting, grouping and highlighting components, but did not really come to a decent solution to this.
[Update]
Just to make that clear again: The main problem here is to get the information needed for displaying the "Filter by field" section. This includes the names of the fields and the hits per field. Sending a second request with one of those filters applied already works.
Solr just plain Doesn't Do This. If you absolutely need it, I'd try the multiple-requests solution and benchmark it. Solr tends to be a lot faster than what people put in front of it, so a few extra requests might not be that big of a deal.
you could achieve this with two different search requests/queries:
name:apple -> 2 hits
description:apple -> 1 hit
EDIT:
You also could implement your own SearchComponent that executes multiple queries in the background and put it in the SearchHandler processing chain so you only will need a single query in the frontend.
If you want the term to be searched over the same fields every time, you have 2 options that don't break the "single query" requirement:
1) copyField: at index time, you group all the fields that should match together. With just one copyField your problem doesn't exist; if you need more than one, you're back in the same spot.
2) You could filter the query each time by dynamically adding the "fq" parameter at the end:
http://<your_url_and_stuff>/?q=Apple&fq=name:Apple ...
This works if you'll always be searching on the same two fields (or if you can set them up before querying); otherwise you'll always need at least a second query.
Since I said "you have 2 options" but you actually have 3 (and I rushed my answer), here's the third:
3) The dismax plugin, which its documentation describes like this:
The DisMaxQParserPlugin is designed to process simple user entered phrases
(without heavy syntax) and search for the individual words across several fields
using different weighting (boosts) based on the significance of each field.
So, if you can use it, you may want to take a look at it and start from the qf parameter (that is what option number 2 was meant to be about, but I changed it in favor of fq... don't ask me why...)
SolrFaceting should solve your problem.
Have a look at the Examples.
This can be achieved with Solr faceting, but it's not neat. For example, I can issue this query:
/select?q=*:*&rows=0&facet=true&facet.query=title:donkey&facet.query=text:donkey&wt=json
to find the number of documents containing donkey in the title and text fields. I may get this response:
{
"responseHeader":{"status":0,"QTime":1,"params":{"facet":"true","facet.query":["title:donkey","text:donkey"],"q":"*:*","wt":"json","rows":"0"}},
"response":{"numFound":3365840,"start":0,"docs":[]},
"facet_counts":{
"facet_queries":{
"title:donkey":127,
"text:donkey":4108
},
"facet_fields":{},
"facet_dates":{},
"facet_ranges":{}
}
}
Since you also want the documents back for the field-disjunctive query, something like the following works:
/select?q=donkey&defType=edismax&qf=text+title&rows=10&facet=true&facet.query=title:donkey&facet.query=text:donkey&wt=json
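To produce the "Filter by field" section from such a response, one facet.query per candidate field can be generated and the counts read back. A rough sketch, assuming the list of searchable fields is known up front; the function names are made up, and the search term is assumed to be a single word with no special query characters.

```python
from urllib.parse import urlencode


def field_filter_url(base_url, term, fields, rows=10):
    """Search all fields with edismax and add one facet.query per field."""
    params = [
        ("q", term),
        ("defType", "edismax"),
        ("qf", " ".join(fields)),
        ("rows", rows),
        ("facet", "true"),
        ("wt", "json"),
    ]
    params += [("facet.query", f"{field}:{term}") for field in fields]
    return base_url + "?" + urlencode(params)


def hits_per_field(response):
    """Map field name -> hit count, skipping fields with no hits."""
    counts = {}
    for query, count in response["facet_counts"]["facet_queries"].items():
        field, _, _term = query.partition(":")
        if count > 0:
            counts[field] = count
    return counts
```

Applied to the sample response above, hits_per_field would give 127 for title and 4108 for text.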
I am using solr 3.3.0 working out of the box using the example folder
solrQueryParser defaultOperator = "OR"
My problem is that Solr doesn't seem to be returning good results when I search for a multiple word phrase.
The following search returns no results.
http://localhost:8080/solr/select/?q=roof+fixing
However, when I search for roof or fixing, they both return a few good results.
http://localhost:8080/solr/select/?q=roof returns 4 results
http://localhost:8080/solr/select/?q=fixing returns 3 results
On the query for "roof fixing", I expect solr to return 7 results. The 4 records for roof and 3 records for fixing.
Is any special configuration necessary for that to happen?
You just expressed your query incorrectly.
Try the following query from the Admin page:
(roof OR fixing)
Or, if you want to find that in a particular field:
fieldname:(roof OR fixing)
When you give Solr a query like "roof fixing" you are effectively asking for all documents which have "roof" AND "fixing" in the default field (or the default dismax set of fields). The only way to change the meaning is to rewrite the query that your users type in. That's what we do, but on a larger scale. We have a front-end interface that provides a whole bunch of options and generates a Solr query from them. People can enter a search term in a specific field, and if there is more than one word and it's not quoted, we add the AND. Then we OR together all of the fields that are filled in. Some fields are special and have a MIN and a MAX version, which we turn into a range query :[0 TO 125000]. And there are some dropdowns that support multiple selections, which we also turn into an OR, e.g. State:("WA" OR "CA" OR "OR" OR "NV")
Solr won't necessarily return 7 results for "roof OR fixing" as one result could include both "roof" and "fixing". Suppose "roof" has 3 results, "fixing" has 4, but both "roof" and "fixing" appear in 2 results. You will get only 5 results on a search for "roof OR fixing" as Solr will not return duplicate results.
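The arithmetic is just set union. A tiny illustration with made-up document IDs, using the numbers from the example above:

```python
# Hypothetical result sets: 3 hits for "roof", 4 for "fixing", 2 in common.
roof = {"doc1", "doc2", "doc3"}
fixing = {"doc2", "doc3", "doc4", "doc5"}

either = roof | fixing  # what "roof OR fixing" would match
both = roof & fixing    # documents containing both terms

# |A or B| = |A| + |B| - |A and B|
assert len(either) == len(roof) + len(fixing) - len(both)
print(len(either))  # 5
```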
Have you tried using a url-encoded space ("%20") instead of the "+" sign? If the default operator is OR you should not need to include that operator.