Sub queries with SOLR - solr

I am trying to use a subquery in SOLR (like in SQL), that is, to use the output of one query as the input to another. Is there a way to implement this using SOLR?
Basically, I want to get a set of records (let's say the top 300) from SOLR and then apply some filter to the results returned.
Is there any way to implement it in SOLR?
Thanks in Advance!!!

Yes, of course. Filter queries are designed for exactly that. Say a query with q=the returns 3000 docs: you can refine these further by supplying fq=lang:en, and then take the top 300 of the documents in English that match 'the'.
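As a rough illustration of the answer above, here is a minimal Python sketch that builds such a request URL. The endpoint (localhost:8983, core "mycore") is a placeholder, not something from the question, and rows=300 is just one way to cap the result at the top 300.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the URL of your own Solr core.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"

def filtered_query_url(query, filters, rows):
    """Build a select URL with one q parameter and one fq per filter."""
    params = [("q", query)]
    params += [("fq", f) for f in filters]  # each fq narrows the result set further
    params.append(("rows", str(rows)))      # cap the number of documents returned
    return SOLR_SELECT + "?" + urlencode(params)

url = filtered_query_url("the", ["lang:en"], 300)
print(url)
```

Several fq parameters can be passed in one request; each one restricts the set matched by q.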

Related

Migrating SOLR fq to Elasticsearch

I am currently migrating a SOLR app to Elasticsearch and have become stuck on a particular query. The ElasticSearch documentation is rather vague on how to achieve my desired result.
Currently I am trying to convert tagged "fq's" (filter queries) from SOLR into Elasticsearch. I need Elasticsearch to return facets (now known as aggregations) based on my query and filters, but also to show aggregation counts for the other options in a search.
Although this sounds complicated it is achieved in SOLR simply by adding an "fq" parameter and tagging the filter as follows:
q=mainquery&fq=status:public&fq={!tag=dt}doctype:pdf&facet=on&facet.field={!ex=dt}doctype
From the main SOLR help docs this will filter on "doctype:pdf" but also include counts for other doc types in the facet output - again this works fine for me, I am simply trying to recreate this in Elasticsearch.
So far I have tried a "post_filter", which does the job until I wish to apply more than one filter (again, something SOLR handles with no problems). You can see an example of how this works and how I want to achieve it at:
https://www.jobsinhealthcare.co.uk/search?latitude=&longitude=&title=&location=&radius=5&type=&salary=0&frequency=year&since=&jobtype=&keywords=&company=&sort=Most+recent&filter[contract_type_estr][33d5667c]=Temporary&filter[job_type_estr][5d370027]=Part+time&filter[job_type_estr][4b45bd05]=Full+time
In the filters/facets on the right of the results you can select multiple "contract type" and/or "job type" and/or "location" values and still be shown the facet counts for unselected queries/filters. Please note that Hourly Salary, Annual Salary and Date Added do NOT have this functionality; this is by design.
Any pointers as to how I should be structuring my query would be greatly appreciated.
I think what you need is a global aggregation (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-global-aggregation.html). Inside the top-level aggregation you should use a filter aggregation (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html) as a sub-aggregation to filter only "status:public".
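A minimal sketch of what such a request body might look like, written as a Python dict. The field names "status" and "doctype" come from the SOLR example in the question; the full-text field "content" and the aggregation names are made up for illustration. Because a global aggregation ignores the search query as well as its filters, the filter aggregation in this sketch repeats the main query next to status:public and leaves out only the doctype filter.

```python
import json

def build_search_body(main_query, selected_doctype):
    """Hits are filtered by doctype; the doctype facet counts are not."""
    # "content" is a hypothetical full-text field, not taken from the question.
    base_must = [{"match": {"content": main_query}}]
    base_filter = [{"term": {"status": "public"}}]
    doctype_filter = {"term": {"doctype": selected_doctype}}
    return {
        "query": {
            "bool": {"must": base_must, "filter": base_filter + [doctype_filter]}
        },
        "aggs": {
            "all_docs": {
                "global": {},  # escapes the scope of the search query
                "aggs": {
                    "public_matches": {
                        # re-apply everything except the doctype filter
                        "filter": {
                            "bool": {"must": base_must, "filter": base_filter}
                        },
                        "aggs": {"doctype": {"terms": {"field": "doctype"}}},
                    }
                },
            }
        },
    }

body = build_search_body("mainquery", "pdf")
print(json.dumps(body, indent=2))
```

With several facets, each one would get its own filter aggregation that omits only that facet's own filter, which mirrors what the {!tag}/{!ex} pair does in SOLR.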

Solr AND operator

I have a problem getting the right results with my SOLR query. Basically, let's say I want all documents in English containing the string "toto".
http://127.0.0.1:8080/solr-webservice/query/?q=iso_lang_cd:en&ctnt_val:*toto*
The problem is that this query sends me all documents in English AND all documents containing toto.
What I need is to get all documents that are in English AND contain toto. How could I achieve this? I'd think this is the standard use of the AND operator...
Actually, OR is the default query operator for Solr, and your query is not formatted in such a way as to force an AND operation. To get the AND behavior you could specify your query in one of the following formats:
+iso_lang_cd:en +ctnt_val:*toto*
iso_lang_cd:en && ctnt_val:*toto*
Or you can optionally pass the q.op=AND to force an AND operation. Additionally, you might want to consider using Filter Queries, where you could filter on the language. There are some performance improvements with using filter queries, but please refer to the documentation for more details.
q=ctnt_val:*toto*&fq=iso_lang_cd:en
Please see The Standard Query Parser for more details and a good overview of querying.
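When these queries are sent over HTTP, characters such as +, & and spaces have a special meaning in a URL, so the parameter values need to be percent-encoded. A small Python sketch of the three variants above, using the endpoint from the question (the helper name solr_url is just for illustration):

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint as given in the question.
BASE = "http://127.0.0.1:8080/solr-webservice/query/"

def solr_url(params):
    """Percent-encode the parameters so '+', '&' and spaces survive the URL."""
    return BASE + "?" + urlencode(params)

# 1. Mark both clauses as required with '+'.
required_terms = solr_url({"q": "+iso_lang_cd:en +ctnt_val:*toto*"})

# 2. Switch the default operator to AND for this request.
default_and = solr_url({"q": "iso_lang_cd:en ctnt_val:*toto*", "q.op": "AND"})

# 3. Keep the text match in q and move the language restriction to a filter query.
with_filter = solr_url({"q": "ctnt_val:*toto*", "fq": "iso_lang_cd:en"})

for url in (required_terms, default_and, with_filter):
    print(url)
    print("  decoded:", parse_qs(urlsplit(url).query))
```

The decoded output shows each query string arriving as a single q value rather than being split at the & character.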

Apply Solr filter query to only part of the search results

I have a Solr solution working which requires two queries, but I'm looking for a way to do it in a single query. My idea is that if I can figure out a way to do this, I won't have to incur the overhead of twice the load on the Solr cluster.
The details: I'm running a simple query like "q=camera" with a filter query of, say, "fq=type:digital". The second query is identical to the first, but the filter is the inverse, like "fq=-type:digital". I'm imagining that if there's a way to run a single query while applying the first filter to get the first set of topDocs, then generate a second set with the second filter, the results could be merged and returned (it doesn't matter if sorting re-sorts and mixes the two sets).
I experimented with partitioning the data by marking a specific field during indexing, into two different groups and then using Solr "grouping" queries, but the response time for these wasn't acceptable in my setup.
I'm looking for suggestions on the most Solr-congruent approach to experiment with: tuning to improve the performance of the two-query solution, or investigating a kind of custom Solr post-filter (I read Yonik's 2/2012 blog post).
I have to implement this in Solr 3.5, although if there's a slam dunk solution in 4.0 I'll eventually be able to move to that.
I can think of two alternative approaches:
Instead of filtering the results, use a higher boost so that all the results for type:digital come out on top and the rest of the documents follow. No need for separate queries. The boost can be changed according to the type value.
The other approach is not to display the results for types other than digital. However, you can display the facets for the other types, with their counts, so users know whether the other types exist for the search term. You can read up on tagging and excluding filters.
Result grouping might give you what you want. Just group by that parameter and specify a sufficiently large number of top documents in each group.
But I would test whether its performance is any better than two queries, since the documentation mentions performance in its limitations section.
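For reference, the result-grouping suggestion could be expressed with request parameters along these lines. This is only a sketch: the per-group limit of 10 is arbitrary, and as the question notes, grouping may not be fast enough in every setup.

```python
from urllib.parse import urlencode

def grouped_params(query, group_field, per_group):
    """Request parameters for grouping results on one field."""
    return {
        "q": query,
        "group": "true",                 # turn result grouping on
        "group.field": group_field,      # one group per distinct field value
        "group.limit": str(per_group),   # top documents to return in each group
    }

params = grouped_params("camera", "type", 10)
print(urlencode(params))
```

Each value of the type field then comes back as its own group, so the digital and non-digital documents arrive in a single response.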

How can I elevate the top 10,000 records using elevate.xml in Solr?

I have 100,000 records. I need to elevate the top 10,000 records using elevate.xml. How can I do this?
As explained here, the QueryElevationComponent is used to configure the top results for specific queries. I may be wrong, but I would guess that you want to give a higher weight to your 10,000 records irrespective of the query. You could use index time boosts as explained here. Alternately, you could add a field with some special value (like boost:true) to these special records and use a boost query (bq) as explained here.
This is possible after customizing the QueryElevationComponent Java file. It's working now!
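The boost-query alternative from the first answer might look roughly like this as request parameters. It assumes the special records were indexed with a field boost:true as described there, that the request goes through a parser that supports bq (edismax here), and that a boost factor of 10 is a reasonable starting point; all three are assumptions to adjust.

```python
from urllib.parse import urlencode

def boosted_params(user_query, boost_factor=10):
    """Add a boost query that favors documents flagged with boost:true."""
    return {
        "defType": "edismax",                  # bq is a DisMax/eDisMax parameter
        "q": user_query,
        "bq": f"boost:true^{boost_factor}",    # flagged documents score higher
    }

params = boosted_params("camera")
print(urlencode(params))
```

Unlike elevation, this only raises the score of the flagged records; a strongly matching unflagged document can still rank above them unless the boost factor is large enough.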

Wildcard to select all items in Solr

I'm currently using Local Solr for geo searching. It takes lat and long parameters as well as a search query. I want to create nearby functionality, where I provide a location but not a search query. Is there a way to provide a wildcard query that matches all elements and then order by distance? Or is the best approach to create another field and place the same value in it for all documents?
Thanks.
You can use the query *:* to match all values in all fields.
See http://wiki.apache.org/solr/FAQ#How_can_I_delete_all_documents_from_my_index.3F for an example on how to query all documents using the *:* wildcard.
See also http://wiki.apache.org/solr/SolrQuerySyntax for general Solr syntax help.
You may use Solr spatial search to sort by distance, and your query can be *:* if you want to pull all documents from your index.
e.g. ?q=*:*&sfield=search_field&pt=22.12,-55.56&sort=geodist() asc
http://wiki.apache.org/solr/SpatialSearch
