How can I perform a distributed search inside Solr - solr

I am running a distributed search in Solr across 4 different Solr servers on different machines. I have written a class that extends Query, and I build the Solr query with SolrJ. When I send the query, it sometimes returns a correct result and sometimes an incorrect one. The result is incorrect only when some shards throw a query parsing exception. So my question is: can I perform a distributed search inside Solr? An outline of the class from which I make the distributed search is given below.
public class CutomClass extends Query {
    // some other code....
    public Weight createWeight(IndexSearcher searcher1) throws IOException {
        SolrQuery query = new SolrQuery();
        query.setQuery("*:*");
        query.add(ShardParams.SHARDS, getShards);
        query.setStart(0);
        query.setRows(0);
        query.set("sort", "score desc");
        query.setFacet(true);
        query.addFacetField("CLIENT");
        query.setFacetMinCount(1);
        QueryResponse queryResponse = solrServer.query(query, SolrRequest.METHOD.POST);
    }
    // some other code....
}
Sometimes some shards throw the following parsing exception, and the result comes back incorrect.
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: org.apache.solr.search.SyntaxError: org.apache.lucene.queryParser.ParseException: Cannot parse ':': Encountered "" at line 1, column 0.

Yes, you can perform a distributed search in Solr. If you are using a single collection (say collection1) and using the following Solr URL to create the solrServer object, then you are doing a distributed search by default.
http://localhost:8983/solr/collection1/select?
This URL lets you query across all the shards in collection1, whether they are on the same machine or on different machines. But if you have separate collections and want to search across them, please follow http://wiki.apache.org/solr/SolrCloud.
To know more about distributed search, please look into https://wiki.apache.org/solr/DistributedSearch.
You can simply comment out the following line.
query.add(ShardParams.SHARDS, getShards);
And yes, *:* is not the problematic part of your program, as stated by D. Kasipovic. A query can contain ":"; you need to provide the query in the form field:text. The problem occurs when the text part itself contains ":". In that case you need to escape the ":" as \:.
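The escaping described above can be sketched in plain Java. SolrJ ships ClientUtils.escapeQueryChars for this purpose; the class below is only a hand-rolled stand-in, and both the QueryEscaper name and the exact list of special characters are assumptions made for illustration.

```java
final class QueryEscaper {
    // Characters given special meaning by the Lucene/Solr query syntax
    // (assumed list for this sketch; check it against your Solr version).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    /** Prefixes every special character, and any whitespace, with a backslash. */
    static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Only the text part should be escaped, not the whole query: something like "field:" + QueryEscaper.escape(text) keeps the separator between field and text intact.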
Please modify the URL to do a distributed search and then let us know what happens.

Related

Synonyms in Solr Query Results

Whenever I query a string in Solr, I want to get the synonyms of the field value, if any exist, as part of the query result. Is it possible to do that in Solr?
There is no direct way to fetch the synonyms used in the search results. You can get close by looking at how Solr parsed your query via the debugQuery=true parameter and reading the parsedquery value in the response, but it would not be straightforward. For example, if you search for "tv" on a text field that uses synonyms you will get something like this:
$ curl "localhost:8983/solr/your-core/select?q=tv&debugQuery=true"
{
...
"parsedquery":"SynonymQuery(Synonym(_text_:television _text_:televisions _text_:tv _text_:tvs))",
...
Another approach would be to load the synonyms.txt file that Solr uses into your application and do the mapping yourself. Again, not straightforward.
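A rough sketch of that second approach, assuming the file uses the usual synonyms.txt layout (comma-separated groups of equivalent terms, "=>" for explicit mappings, "#" for comments). The SynonymLookup class is hypothetical, and it ignores multi-word matching and analyzer settings, so it only approximates what Solr does at query time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class SynonymLookup {
    /** Builds a term -> synonyms map from the lines of a synonyms.txt file. */
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> synonyms = new HashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] sides = line.split("=>");
            List<String> sources = splitTerms(sides[0]);
            // Without "=>", every term in the group maps to the whole group.
            List<String> targets = sides.length > 1 ? splitTerms(sides[1]) : sources;
            for (String source : sources) {
                synonyms.computeIfAbsent(source, k -> new ArrayList<>()).addAll(targets);
            }
        }
        return synonyms;
    }

    private static List<String> splitTerms(String part) {
        List<String> terms = new ArrayList<>();
        for (String t : part.split(",")) {
            // Lowercasing is an assumption (it mimics ignoreCase="true").
            String term = t.trim().toLowerCase(Locale.ROOT);
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }
}
```

The lines could come from Files.readAllLines on the same synonyms.txt that the field type references; looking up a matched field value in the resulting map then gives the synonyms to attach to the result.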

How to get a Query object from solr query string

I have Solr query strings available from the log, and the intent is to analyze each query to find out the number of fqs, terms, etc. Is there any API or parser available in Solr/Lucene to parse the entire query string and get the terms used, filters used, languages used, fields used, etc.? I looked at the QueryParser provided by Lucene, but it doesn't seem to help.
Example simple query string:
q=*:*&facet.field=Language&facet=true&f.Language.facet.limit=101&rows=0&sort=score desc,DefaultRelevance desc&fl=xxNonexx&bmf=50&wt=xml
You can use the SolrRequestParsers.parseQueryString() method to convert the string into Solr Params. Here's a link to documentation for it.
Below is an example.
String queryString = "q=*:*&facet.field=Language&facet=true&f.Language.facet.limit=101&rows=0&sort=score desc,DefaultRelevance desc&fl=xxNonexx&bmf=50&wt=xml";
MultiMapSolrParams solrParams = SolrRequestParsers.parseQueryString(queryString);
The code resides in the solr-core library, so you may need to add it.
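If pulling in solr-core is not an option, the same kind of analysis can be approximated with the JDK alone. This is a minimal sketch rather than a replacement for SolrRequestParsers: the LoggedQueryParser class and its count helper are hypothetical names, and it assumes the logged string is a plain "&"-separated list of name=value pairs.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LoggedQueryParser {
    /** Splits a logged query string into parameter name -> list of values. */
    static Map<String, List<String>> parse(String queryString) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Repeated parameters such as fq accumulate in the same list.
            params.computeIfAbsent(decode(name), k -> new ArrayList<>()).add(decode(value));
        }
        return params;
    }

    /** Number of times a parameter (for example "fq") occurs. */
    static int count(Map<String, List<String>> params, String name) {
        return params.getOrDefault(name, List.of()).size();
    }

    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }
}
```

With the parameters in a map, counting filters is count(params, "fq") and the faceted fields are params.get("facet.field"). Extracting the individual terms inside q still needs a real query parser.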
I think you're not really looking for a parser but for a way to debug your query. Fortunately Solr has a debug parameter that you can use for such purpose as explained here. For instance you can add to your query:
q=*:*&facet.field=Language&facet=true&f.Language.facet.limit=101&rows=0&sort=score desc,DefaultRelevance desc&fl=xxNonexx&bmf=50&debug=true&wt=xml

Use solrj to look for results within only specific documents

I have a Solr server and query it using SolrJ. Suppose I want to search for results within only specific documents, and I have the IDs of the documents I want to look in. How do I configure the query to only return results from a specified list of documents?
// IDs of the documents I want to look in
List<String> documentList = ...;
this.query = new SolrQuery();
this.query.setFields("id", "score");
this.query.addSort("score", SolrQuery.ORDER.desc);
this.query.addSort("id", SolrQuery.ORDER.desc);
this.query.setQuery(searchString);
What do I need to do so that all of the documents returned by the query are documents whose id is in the list of acceptable documents?
I've not used solrj much, but it should be as easy as adding a filter query (I'm assuming you don't want whether a document is acceptable to affect the score) with the document list, e.g.:
String filterQuery = "id:(1 OR 2 OR 3)";
this.query.addFilterQuery(filterQuery);
So you'll want to convert documentList into a string delimited by OR (and yes, I believe it does have to be uppercase).
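Converting the list might look like the sketch below. The IdFilter class is a hypothetical helper, and it assumes the IDs contain no whitespace or query-syntax characters; IDs that do would need escaping or quoting first.

```java
import java.util.List;

final class IdFilter {
    /** Builds a filter query such as id:(1 OR 2 OR 3) from a list of document IDs. */
    static String build(List<String> documentIds) {
        if (documentIds.isEmpty()) {
            // An empty group "id:()" is not a valid query, so fail early.
            throw new IllegalArgumentException("at least one document ID is required");
        }
        return "id:(" + String.join(" OR ", documentIds) + ")";
    }
}
```

It would then be used as this.query.addFilterQuery(IdFilter.build(documentList)).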
If the number of acceptable documents is really large, you'll have to change your Solr configuration to allow a greater number of boolean clauses in a query (the maxBooleanClauses setting in solrconfig.xml; the default is 1024, and I've used 32768 with no problems).

Solrj Api Use in Java code

Can anybody please help? I am new to Solr. My project uses the SolrJ API to access Solr from Java code. I don't understand the different steps in querying with Solr and SolrJ. I got the following code from the net. Can anyone please explain what these statements do?
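import java.net.MalformedURLException;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;

public class SolrJSearcher {
    public static void main(String[] args) throws MalformedURLException, SolrServerException {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery();
        query.setQuery("sony digital camera");
        query.addFilterQuery("cat:electronics","store:amazon.com");
        query.setFields("id","price","merchant","cat","store");
        query.setStart(0);
        query.set("defType", "edismax");
        QueryResponse response = solr.query(query);
        SolrDocumentList results = response.getResults();
        for (int i = 0; i < results.size(); ++i) {
            System.out.println(results.get(i));
        }
    }
}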
You'll have to read up on Solr concepts to actually use SolrJ for anything useful, so that you're able to tell what the different parts of the API are. I'm not going to go into any detail here, and you should really research a concept more before posting a very broad and basic question. I'll answer it anyway, for anyone stumbling across this post from the Internet or anyone who needs to reference it from another post.
setQuery - The actual query to send to Solr. This is what usually goes in the q parameter when reading the Solr documentation. The format of the query depends on which query parser you're using (which is edismax here, I'll get back to that). Lucene query syntax in general is field:value.
addFilterQuery - Filter the search result by the values supplied. This is what you'll see in the fq parameter in the Solr docs. A filter query doesn't affect scoring; it just filters the search result returned by Solr by removing any documents that don't match the filter query.
setFields - Which fields to return from the index. If you don't need all the fields, you can cut down the size of the response from Solr by just requesting the fields you need.
setStart - The offset of the query result, which document hit to start retrieving data from. Useful for pagination.
set - Set any parameter that isn't available through dedicated methods. Here the parameter defType is set, which tells Solr which query parser to use. edismax is one such query parser, that accepts queries in a natural format like you'd expect most people to be familiar with from general search engines.
query - Performs the actual query on the Solr server and retrieves the result. The response is returned, and then used to get the list of documents in the result (getResults).
The results are then printed out one by one.
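To make the mapping from these methods to Solr request parameters concrete, here is a plain-Java sketch that builds roughly the request the code above describes. It does not use SolrJ; the SelectUrlSketch class is hypothetical, and the real client may order or transmit the parameters differently.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SelectUrlSketch {
    /** Builds a /select URL carrying the same parameters as the SolrQuery above. */
    static String build(String baseUrl) {
        List<String> params = new ArrayList<>();
        add(params, "q", "sony digital camera");           // setQuery
        add(params, "fq", "cat:electronics");              // addFilterQuery: one fq per filter
        add(params, "fq", "store:amazon.com");
        add(params, "fl", "id,price,merchant,cat,store");  // setFields
        add(params, "start", "0");                         // setStart
        add(params, "defType", "edismax");                 // set("defType", ...)
        return baseUrl + "/select?" + String.join("&", params);
    }

    private static void add(List<String> params, String name, String value) {
        // URL-encode the value so spaces, colons and commas survive transport.
        params.add(name + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8));
    }
}
```

Pasting the resulting URL into a browser is a quick way to compare what Solr returns with what the SolrJ code prints.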

Solr Index appears to be valid - but returns no results

Solr newbie here.
I have created a Solr index and written a whole bunch of docs into it. I can see from the Solr admin page that the docs exist and the schema is fine as well.
But when I perform a search using a test keyword I do not get any results back.
On entering *:* into the query (in the Solr admin page) I get all the results.
However, when I enter any other query (e.g. a term or phrase) I get no results.
I have verified that the field being queried is indexed and contains the values I am searching for.
So I am confused about what I am doing wrong.
Probably you don't have a <defaultSearchField> correctly set up. See this question.
Another possibility: your field is of type string instead of text. String fields, in contrast to text fields, are not analyzed, but stored and indexed verbatim.
I had the same issue with a new setup of Solr 8. The accepted answer is not valid anymore, because the <defaultSearchField> configuration will be deprecated.
As I found no answer to why Solr does not return results from any fields despite being indexed, I consulted the query documentation. What I found is the DisMax query parser:
The DisMax query parser is designed to process simple phrases (without complex syntax) entered by users and to search for individual terms across several fields using different weighting (boosts) based on the significance of each field. Additional options enable users to influence the score based on rules specific to each use case (independent of user input).
In contrast, the default Lucene parser only speaks about searching one field. So I gave DisMax a try and it worked very well!
Query example:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video
You can also specify which fields to search exactly to prevent unwanted side effects. Multiple fields are separated by spaces which translate to + in URLs:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features+text
Last but not least, give the fields a weight:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features^20.0+text^0.3
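If the weights live in application code, the qf value can be assembled from a map of field names to boosts. A small sketch; the QfBuilder class is a hypothetical helper, and the result still has to be URL-encoded when placed in a URL.

```java
import java.util.Map;
import java.util.StringJoiner;

final class QfBuilder {
    /** Joins field^boost pairs with spaces, in the iteration order of the map. */
    static String build(Map<String, Double> boosts) {
        StringJoiner qf = new StringJoiner(" ");
        for (Map.Entry<String, Double> entry : boosts.entrySet()) {
            qf.add(entry.getKey() + "^" + entry.getValue());
        }
        return qf.toString();
    }
}
```

Passing a LinkedHashMap keeps the fields in insertion order, which makes the generated parameter predictable.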
If you are using pysolr like I do, you can add those parameters to your search request like this:
results = solr.search('search term', **{
    'defType': 'dismax',
    'qf': 'features text'
})
In my case the problem was the format of the query. It seems that my setup, by default, was looking for an exact match against the entire value of the field. So, to get results when searching for sit, I had to query *sit*, i.e. use wildcards.
With Solr 4, I had to solve this as per Mauricio's answer by setting type="text_en" on the field.
With Solr 6, use text_general.

Resources