Being new to Solr (3.6.1 is used on the project that I am working on), I am trying to understand how logical grouping can limit the data returned.
Working with the test data and schema supplied as part of the Solr download, when I run a query like id:1 and id:2, it returns 2 documents, as expected from the data,
but in the next case
(id:1 and popularity:0) and (id:2 and popularity:7)
I would assume that I would get only 1 document back, as there is no document that has a popularity of 0, and yet all 5 documents are returned (I only loaded 5).
In the last case, where I have int1 and (id:2 and popularity:7), I get three documents. Based on the tests I do (through the admin web page), and and or seem to return the same number of results. What am I missing?
After additional research, it turns out that the parser (at least the one used for the admin window) is case sensitive about operators. A lowercase and is not recognised as an operator, so the clauses end up combined with the default operator, which is normally defined as OR. ANDed clauses must therefore use uppercase AND, not and, for the correct results to be returned.
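To make the point above concrete, here is a minimal Python sketch that upper-cases bare and/or/not before a query string is sent to Solr. The helper name uppercase_operators is made up for illustration, and the approach is deliberately naive: it would also rewrite those words inside quoted phrases, so treat it as a starting point rather than a complete fix.

```python
import re

# Bare boolean operators, matched case-insensitively as whole words.
_OPERATORS = re.compile(r"\b(and|or|not)\b", re.IGNORECASE)


def uppercase_operators(query: str) -> str:
    """Upper-case and/or/not so the parser treats them as operators."""
    return _OPERATORS.sub(lambda m: m.group(1).upper(), query)


if __name__ == "__main__":
    print(uppercase_operators("(id:1 and popularity:0) and (id:2 and popularity:7)"))
```

Field names such as popularity are left alone because the pattern only matches whole words.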
We have a customer web application which uses Apache Solr APIs internally. We do not have access to the Solr UI at the customer site.
Due to recent changes in Solr at the customer's end, whenever we try to search the data like
NAME:PRANAV AND AGE:1, Solr does not show any results (it shows numFound:0),
whereas when we search
NAME=PRANAV AND AGE:1, it gives the result (it shows a numFound value greater than 0).
So string searches work with = and numeric searches work with :.
But when we search
NAME=PRA*, we do not get any result in Solr (it shows numFound:0).
Can someone please advise what should be changed on the Solr side to correct the searches?
We want wildcard searches (*, ?) to work, and string searches should also work with : instead of =.
I have indexed the following record in my collection
{
  "app_name":"atm inspection",
  "appversion":1,
  "id":"app_1427_version_2449",
  "icon":"/images/media/default_icons/app.png",
  "type":"app",
  "app_id":1427,
  "account_id":556,
  "app_description":"inspection",
  "_version_":1599625614495580160
}
and it works fine as long as I search with matching case. For example, if I write the following Solr query to search for records whose app_name contains atm, Solr returns the record above, which is the correct behaviour.
http://localhost:8983/solr/NewAxoSolrCollectionLocal/select?fq=app_name:*atm\ *&q=*:*
However, if I execute the following Solr query to search for records whose app_name contains ATM
http://localhost:8983/solr/NewAxoSolrCollectionLocal/select?fq=app_name:*ATM\ *&q=*:*
Solr does not return the record above, because ATM != atm.
Can someone please help me with a Solr query that searches records case insensitively?
Your help is greatly appreciated.
You can't. The field type string requires an exact match (it's a single, unprocessed token being stored for the field value).
The way to do it is to use a TextField with an associated Tokenizer and a LowerCaseFilter. If you use a KeywordTokenizer, the whole value will be kept intact as a single token (so it won't get split as you'd usually expect with a tokenizer), and since it's a TextField it can have an analysis chain associated, allowing you to add a LowerCaseFilter.
The LowerCaseFilter is multiterm aware as far as I remember, but remember that wildcard queries will usually not have any filters applied. You should therefore lowercase the value yourself before creating your query (even if it will probably work in this simple case).
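As a rough illustration of lowercasing the value yourself, the Python sketch below builds the same kind of "contains" filter used in the question. It assumes app_name has been re-indexed as a lowercased text field as described above; contains_filter is a hypothetical helper name, and only spaces are escaped here, so other query-syntax characters would need handling in real use.

```python
from urllib.parse import urlencode


def contains_filter(field: str, text: str) -> str:
    """Build a '*text*' wildcard filter with the text lower-cased client-side."""
    # Escape spaces so multi-word input stays a single wildcard term.
    escaped = text.lower().replace(" ", "\\ ")
    return f"{field}:*{escaped}*"


if __name__ == "__main__":
    # Query-string parameters for the /select handler.
    params = urlencode({"q": "*:*", "fq": contains_filter("app_name", "ATM ")})
    print(params)
```

With this in place the user can type ATM, Atm or atm and the filter sent to Solr is always the lowercase form.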
I am struggling with a little problem where I have to display relevant information about the result set returned from Solr, but I can't figure out how to calculate it without iterating over the results (bad).
Basically I am storing my documents with a state field and while the search is supposed to return all documents, the UI has to show "Found 15 entities, 5 are in state A, 3 in state B and 8 in C".
At the moment I am using a rather brittle approach of running the query 3 times with additional scoping by type, but I'd rather get that information from the one query I am displaying. (There have been some edge cases where the numbers don't add up, and since Solr can return facets I guess there has to be a way to use that functionality in this case.)
I am using Solr 3.5 from Rails with the sunspot gem.
As you mention yourself, you can use facets for this by setting
facet=true&facet.field=state
I'm not familiar with the sunspot gem, but judging by the documentation you can use
facets like this (assuming Entity is your searchable):
search = Entity.search do
  facet :state
end
This should return the states of all entities returned by your query with the number of entities in this state. The Sunspot documentation tells me you can read these facets in the following way:
search.facet(:state).rows.each do |facet|
puts "State #{facet.value} has #{facet.count} entities"
end
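If you call Solr directly with facet=true&facet.field=state and wt=json, the counts come back under facet_counts.facet_fields. The Python sketch below shows one way to turn them into the per-state numbers for the UI. It assumes Solr's default flat JSON layout (value, count, value, count, ...); the sample response is made up for illustration.

```python
def facet_counts(response: dict, field: str) -> dict:
    """Return {value: count} for one facet field of a Solr JSON response."""
    flat = response["facet_counts"]["facet_fields"][field]
    # Default layout is a flat list: [value, count, value, count, ...]
    return dict(zip(flat[0::2], flat[1::2]))


def summary(response: dict, field: str) -> str:
    """Format a short 'Found N entities, ...' line from one response."""
    counts = facet_counts(response, field)
    parts = ", ".join(f"{n} in state {v}" for v, n in sorted(counts.items()))
    return f"Found {response['response']['numFound']} entities, {parts}"


if __name__ == "__main__":
    # Hypothetical response, trimmed to the parts used here.
    sample = {
        "response": {"numFound": 16, "docs": []},
        "facet_counts": {"facet_fields": {"state": ["C", 8, "A", 5, "B", 3]}},
    }
    print(summary(sample, "state"))
```

Because the totals and the per-state counts come from the same request, they cannot drift apart the way three separate queries can.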
Essentially, there are three main sets of functions you can use to gather stats from Solr.
The first is faceting:
http://wiki.apache.org/solr/SimpleFacetParameters
There is also grouping (field collapsing):
https://wiki.apache.org/solr/FieldCollapsing
And the stats package:
https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
Note that stats, facets and grouping may eventually be replaced by the analytics package known as OLAP, which is targeted for Solr 5.0.0:
https://issues.apache.org/jira/browse/SOLR-5302
Good luck.
Imagine an index like the following:
id | partno   | name         | description
1  | 1000.001 | Apple iPod   | iPod by Apple
2  | 1000.123 | Apple iPhone | The iPhone
When the user searches for "Apple" both documents would be returned. Now I'd like to give the user the possibility to narrow down the results by limiting the search to one or more fields that have documents containing the term "Apple" within those fields.
So, ideally, the user would see something like this in the filter section of the ui after his first query:
Filter by field
name (2)
description (1)
When the user applies the filter for field "description", only documents which contain the term "Apple" within the field "description" would be returned. So the result set of that second request would be the iPod document only. For that I'd use a query like ?q=Apple&qf=description (I'm using the Extended DisMax Query Parser)
How can I accomplish that with Solr?
I already experimented with faceting, grouping and highlighting components, but did not really come to a decent solution to this.
[Update]
Just to make that clear again: The main problem here is to get the information needed for displaying the "Filter by field" section. This includes the names of the fields and the hits per field. Sending a second request with one of those filters applied already works.
Solr just plain Doesn't Do This. If you absolutely need it, I'd try the multiple-requests solution and benchmark it. Solr tends to be a lot faster than what people put in front of it, so a couple of requests might not be that big of a deal.
You could achieve this with two different search requests/queries:
name:apple -> 2 hits
description:apple -> 1 hit
EDIT:
You could also implement your own SearchComponent that executes multiple queries in the background and put it in the SearchHandler processing chain, so you will only need a single query in the frontend.
If you want the term to be searched over the same fields every time, you have 2 options that don't break the "single query" requirement:
1) copyField: at index time you group all the fields that should match together. With just one copyField your problem doesn't exist; if you need more than one, you're back at the same spot.
2) You could filter the query each time by dynamically adding the "fq" parameter at the end
http://<your_url_and_stuff>/?q=Apple&fq=name:Apple ...
This works if you'll always be searching on the same two fields (or you can set them up before querying); otherwise you'll always need at least a second query.
Since I said "you have 2 options" but you actually have 3 (and I rushed my answer), here's the third:
3) The dismax plugin, which its documentation describes like this:
The DisMaxQParserPlugin is designed to process simple user entered phrases
(without heavy syntax) and search for the individual words across several fields
using different weighting (boosts) based on the significance of each field.
So, if you can use it, you may want to give it a look and start from the qf parameter (that is what option number 2 was meant to be about, but I changed it in favor of fq... don't ask me why...)
SolrFaceting should solve your problem.
Have a look at the Examples.
This can be achieved with Solr faceting, but it's not neat. For example, I can issue this query:
/select?q=*:*&rows=0&facet=true&facet.query=title:donkey&facet.query=text:donkey&wt=json
to find the number of documents containing donkey in the title and text fields. I may get this response:
{
"responseHeader":{"status":0,"QTime":1,"params":{"facet":"true","facet.query":["title:donkey","text:donkey"],"q":"*:*","wt":"json","rows":"0"}},
"response":{"numFound":3365840,"start":0,"docs":[]},
"facet_counts":{
"facet_queries":{
"title:donkey":127,
"text:donkey":4108
},
"facet_fields":{},
"facet_dates":{},
"facet_ranges":{}
}
}
Since you also want the documents back for the field-disjunctive query, something like the following works:
/select?q=donkey&defType=edismax&qf=text+title&rows=10&facet=true&facet.query=title:donkey&facet.query=text:donkey&wt=json
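For the "Filter by field" box, the same idea can be wrapped in two small helpers: one that builds the request parameters with one facet.query per field, and one that reads the per-field hit counts back out of facet_queries. This is only a sketch in Python; the function names are invented here, and the term is inserted unescaped, so it assumes a simple single-word search.

```python
from urllib.parse import urlencode


def field_hit_params(term: str, fields: list) -> list:
    """Request parameters: an edismax search plus one facet.query per field."""
    params = [
        ("q", term),
        ("defType", "edismax"),
        ("qf", " ".join(fields)),
        ("rows", "10"),
        ("facet", "true"),
        ("wt", "json"),
    ]
    params += [("facet.query", f"{field}:{term}") for field in fields]
    return params


def hits_per_field(response: dict, term: str, fields: list) -> dict:
    """Map each field to the count reported for its facet.query."""
    counts = response["facet_counts"]["facet_queries"]
    return {field: counts.get(f"{field}:{term}", 0) for field in fields}


if __name__ == "__main__":
    print("/select?" + urlencode(field_hit_params("donkey", ["text", "title"])))
```

Fields whose count is 0 can simply be left out of the filter section.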
I am using solr 3.3.0 working out of the box using the example folder
solrQueryParser defaultOperator = "OR"
My problem is that Solr doesn't seem to return good results when I search for a multiple-word phrase.
The following search returns no results.
http://localhost:8080/solr/select/?q=roof+fixing
However, when I search for roof or fixing, they both return a few good results.
http://localhost:8080/solr/select/?q=roof returns 4 results
http://localhost:8080/solr/select/?q=fixing returns 3 results
On the query for "roof fixing", I expect Solr to return 7 results: the 4 records for roof and the 3 records for fixing.
Is any special configuration necessary for that to happen?
You just expressed your query incorrectly.
Try the following query from the Admin page:
(roof OR fixing)
Or, if you want to find that in a particular field:
fieldname:(roof OR fixing)
When you give SOLR a query like "roof fixing" you are effectively asking for all documents which have "roof" AND "fixing" in the default field (or the default dismax set of fields). The only way to change the meaning is to rewrite the query that your users type in. That's what we do, but on a larger scale. We have a front end interface that provides a whole bunch of options and generates a SOLR query from them. People can enter a search term in a specific field, and if there is more than one word and it's not quoted, we add the AND. Then we OR together all of the fields that are filled in. Some fields are special and have a MIN and a MAX version, which we turn into a range query :[0 TO 125000]. And there are some dropdowns that support multiple selections, which we also turn into an OR, e.g. State:("WA" OR "CA" OR "OR" OR "NV")
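A stripped-down Python sketch of that kind of query rewriting might look like the following. It is not our actual code; the function names and the title and price fields are made up for illustration, and it assumes non-empty input with no special characters to escape.

```python
def field_clause(field: str, text: str) -> str:
    """AND together the words of an unquoted multi-word term."""
    text = text.strip()
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return f"{field}:{text}"  # quoted phrase: pass through unchanged
    words = text.split()
    if len(words) == 1:
        return f"{field}:{words[0]}"
    return f"{field}:({' AND '.join(words)})"


def range_clause(field: str, low, high) -> str:
    """Turn a MIN/MAX pair into a range query."""
    return f"{field}:[{low} TO {high}]"


def choice_clause(field: str, choices: list) -> str:
    """OR together the selections of a multi-select dropdown."""
    return f"{field}:(" + " OR ".join(f'"{c}"' for c in choices) + ")"


def build_query(clauses: list) -> str:
    """OR together the clauses of all fields that were filled in."""
    return " OR ".join(clauses)


if __name__ == "__main__":
    print(build_query([
        field_clause("title", "roof fixing"),
        range_clause("price", 0, 125000),
        choice_clause("State", ["WA", "CA", "OR", "NV"]),
    ]))
```

Quoting the dropdown values matters for a case like "OR" (Oregon), which would otherwise be read as an operator.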
Solr won't necessarily return 7 results for "roof OR fixing" as one result could include both "roof" and "fixing". Suppose "roof" has 3 results, "fixing" has 4, but both "roof" and "fixing" appear in 2 results. You will get only 5 results on a search for "roof OR fixing" as Solr will not return duplicate results.
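The arithmetic is just a set union. Here is a tiny Python illustration with made-up document ids matching the numbers above (3 hits for roof, 4 for fixing, 2 shared):

```python
# Hypothetical document ids matched by each term.
roof = {1, 2, 3}
fixing = {2, 3, 4, 5}

both = roof & fixing      # documents containing both words
either = roof | fixing    # what "roof OR fixing" returns, without duplicates

if __name__ == "__main__":
    print(len(roof), len(fixing), len(both), len(either))
```

In other words, the OR count is the two individual counts added together minus the overlap.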
Have you tried using a url-encoded space ("%20") instead of the "+" sign? If the default operator is OR you should not need to include that operator.