Spring Data Solr phrase search and Sorting - solr

I am looking for help with phrase search and result ordering in Spring Data Solr.
When I search for "Spring Data", it should return only the documents that contain the phrase "Spring Data".
> Ex:- Should return --- Spring Data is good
> Should return --- 123Spring Data123 is good
> Should not return --- Spring and Data
Also, the results should be sorted by the number of fields matched.
When I get the search input from the user, I search as shown below.
public interface SolrSecurityRepository extends
        SolrCrudRepository<SolrSecurity, String> {

    @Query(value = "subject:*?0* OR content:*?0*")
    List<SolrSecurity> find(String searchStr, Pageable pageable);
}
I am not sure how to achieve an exact phrase match.
Also, is there any way to enable debug output to see the query generated by Spring Data Solr?
Thanks,
Baskar.S

Spring Data Solr relies on the behavior of the Standard Query Parser. By default it does nothing to enforce a sort order, which means documents are sorted by score. For an exact phrase search you have to put the search terms in quotes. To look at the score assigned to each document, add a read-only score property to the entity and change the annotated query to return the given list of fields.
class SomeDocument {
    @Id String id;
    @Indexed(readonly=true) Float score;
    // ...
}

@Query(value = "subject:*?0* OR content:*?0*", fields={"*","score"})
List<SomeDocument> find(String searchStr, Pageable pageable);
Or simply use @Score (currently only available in 1.4.RC1) within the document.
If you want to print out the query before execution please enable logging for SolrTemplate by adding <logger name="org.springframework.data.solr.core.SolrTemplate" level="debug" /> to logback.xml.
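To make the "quote the terms" advice concrete, here is a minimal sketch in plain Java of what a quoted phrase query string looks like for the two fields from the question. The class and method names (PhraseQueries, quotePhrase, anyFieldPhrase) are made up for illustration and are not part of Spring Data Solr or SolrJ.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not a Spring Data Solr or SolrJ API.
final class PhraseQueries {

    private PhraseQueries() {}

    // Wraps the input in double quotes so the standard query parser treats it
    // as a single phrase. Backslashes and embedded quotes are escaped first.
    static String quotePhrase(String phrase) {
        String escaped = phrase.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Builds field1:"phrase" OR field2:"phrase" ... for the given fields.
    static String anyFieldPhrase(List<String> fields, String phrase) {
        String quoted = quotePhrase(phrase);
        return fields.stream()
                .map(field -> field + ":" + quoted)
                .collect(Collectors.joining(" OR "));
    }
}
```

Calling PhraseQueries.anyFieldPhrase(List.of("subject", "content"), "Spring Data") yields subject:"Spring Data" OR content:"Spring Data". Keep in mind that a phrase query matches on indexed tokens, so whether something like "123Spring Data123" is found depends on how the field is analyzed.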

Related

Augment Query with ranking and query features powered externally?

I am submitting http://localhost:8080/search/?query=honda car to a Vespa application, with "honda car" as an unstructured query on an automobile database.
I have an external engine that provides query features and ranking features (weights) based on the query. I am well aware of query profiles, but if I instead want to augment the query with &feature1="value1"&feature2="value2", how can I do that with Searchers or any other component?
The Query class in Vespa has a method yqlRepresentation(). Is it called for every unstructured query? In other words, does an unstructured query get converted to YQL before it hits the index?
To pass features for ranking with the query, write a Searcher that adds them to Query.getRanking().getFeatures():
public class FeatureAdder extends Searcher {

    @Override
    public Result search(Query query, Execution execution) {
        Map<String, Double> features = lookUpFeaturesFromExternalStore();
        features.forEach((name, value) -> query.getRanking().getFeatures().put("query(" + name + ")",
                String.valueOf(value)));
        return execution.search(query);
    }
}
Now you can access these values in ranking expressions using "query(name)".
> Does an unstructured query get converted to YQL and then hit the index?
YQL is just an external representation language. The query (unstructured or not) is parsed into the object representation under Query.getModel().getQueryTree(). That is what you should work on if you want to modify the query programmatically (from a Searcher).

Use solrj to look for results within only specific documents

I have a Solr server and query it using SolrJ. Suppose I want to search only within specific documents, and I have the IDs of the documents I want to look in. How do I configure the query to return results only from a specified list of documents?
List<String> documentList = ...; // collection of Strings of the
// ID's of the documents I want
// to look for
this.query = new SolrQuery();
this.query.setFields("id", "score");
this.query.addSort("score", SolrQuery.ORDER.desc);
this.query.addSort("id", SolrQuery.ORDER.desc);
this.query.setQuery(searchString);
What do I need to do so that every document returned by the query has an id in the list of acceptable documents?
I've not used solrj much, but it should be as easy as adding a filter query (I'm assuming you don't want whether a document is acceptable to affect the score) with the document list, e.g.:
String filterQuery = "id:(1 OR 2 OR 3)";
this.query.addFilterQuery(filterQuery);
So you'll want to convert documentList into a string delimited by OR (and yes, it does have to be uppercase, since the standard query parser only recognizes boolean operators in uppercase).
If the number of acceptable documents is really large, you'll have to raise maxBooleanClauses in your Solr configuration to allow more boolean clauses per query (the default is 1024; I've used 32768 with no problems).
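As a rough sketch of that conversion, the helper below turns a list of IDs into an id:(... OR ...) filter string using only the JDK. The class name IdFilters and the choice to quote every ID are my own assumptions, not something SolrJ provides; quoting is there so that IDs containing characters like ":" don't break the query.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of SolrJ.
final class IdFilters {

    private IdFilters() {}

    // Builds a filter query such as id:("1" OR "2" OR "3").
    static String buildIdFilter(List<String> ids) {
        if (ids.isEmpty()) {
            // An empty id:() clause is not a valid query, so fail early.
            throw new IllegalArgumentException("at least one id is required");
        }
        return ids.stream()
                .map(IdFilters::quote)
                .collect(Collectors.joining(" OR ", "id:(", ")"));
    }

    // Quotes one ID, escaping backslashes and embedded double quotes.
    private static String quote(String id) {
        return "\"" + id.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

With the code from the question this would be used as this.query.addFilterQuery(IdFilters.buildIdFilter(documentList));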

Solr AnalyticsQuery API returns analytics by documents that don't match query

I have a core named 'documents' in Solr, with fields such as 'id', 'url', 'text' and 'domain'.
I also have a ResourceAnalyticsCollector that counts how many documents belong to each resource.
An example of the resource analytics result:
resources:{
example.com: 456
example2.com: 123
...
}
I first noticed the problem when the query was restricted to one domain but the analytics returned results for several domains.
Example:
Solr query: domain:example.com
Number of documents returned by this query: 1000 (all of these documents belong to example.com)
Analytics result:
resources:{
example.com: 700
example2.com: 100
example3.com: 100
example4.com: 100
}
I looked at all the documents through the /select search handler, and all of them belonged to the example.com domain.
But when I looked at the documents seen by the analytics, I found many that don't match the query, even though the number of documents is the same.
Here is my analytics module:
public class ResourceAnalyticsCollector extends DelegatingCollector {

    public ResourceAnalyticsCollector(ResponseBuilder rb, IndexSearcher searcher) {
        this.rb = rb;
        this.searcher = searcher;
    }

    @Override
    public void collect(int docNum) throws IOException {
        Document doc;
        doc = searcher.doc(docNum);
        //Output document id for logs
        String docId = doc.get(AnalyticsConstants.ID_SOLR_FIELD);
        System.out.println("Doc id = " + docId);
        documentList.add(doc);
        delegate.collect(docNum);
    }

    @Override
    public void finish() {
        rb.rsp.add(TOTAL_RESULT_FIELD, this.getAnalyticsContext(documentList));
    }
}
I think this is a bug in Solr, but if someone could help me with this problem, it would be great!
I have created a mini version of my analytics, together with the core, in a file named "Analytics_API_problem.rar".
You can download it from this link
I guess you need to use a query like &fq={!myanalytic param1=a param2=b cost=101}

Solrj Api Use in Java code

Can anybody please help? I am new to Solr. My project uses the SolrJ API to access Solr from Java code. I don't understand the different steps involved in querying Solr with SolrJ. I got the following code from the net. Can anyone please explain what these statements do?
public class SolrJSearcher {
    public static void main(String[] args) throws MalformedURLException, SolrServerException {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery();
        query.setQuery("sony digital camera");
        query.addFilterQuery("cat:electronics","store:amazon.com");
        query.setFields("id","price","merchant","cat","store");
        query.setStart(0);
        query.set("defType", "edismax");
        QueryResponse response = solr.query(query);
        SolrDocumentList results = response.getResults();
        for (int i = 0; i < results.size(); ++i) {
            System.out.println(results.get(i));
        }
    }
}
You'll have to read up on Solr concepts to actually use SolrJ for anything useful, so that you can tell what the different parts of the API are. I'm not going to go into any detail here, and you should really research a concept more before posting a very broad and basic question. I'll answer it anyway for future reference, for anyone stumbling across this post from the Internet, or in case anyone needs to reference it from another post.
setQuery - The actual query to send to Solr. This is what usually goes in the q parameter when reading the Solr documentation. The format of the query depends on which query parser you're using (which is edismax here, I'll get back to that). Lucene query syntax in general is field:value.
addFilterQuery - Filter the search result by the values supplied. This is what you'll see in the fq parameter in the Solr docs. A filter query doesn't affect scoring; it just filters the search result returned by Solr by removing any documents that don't match the filter query.
setFields - Which fields to return from the index. If you don't need all the fields, you can cut down the size of the response from Solr by just requesting the fields you need.
setStart - The offset of the query result, which document hit to start retrieving data from. Useful for pagination.
set - Set any parameter that isn't available through dedicated methods. Here the parameter defType is set, which tells Solr which query parser to use. edismax is one such query parser, that accepts queries in a natural format like you'd expect most people to be familiar with from general search engines.
query - Performs the actual query on the Solr server and retrieves the result. The response is returned and then used to get the list of documents in the result (getResults).
The results are then printed out one by one.
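To tie the method names above to the parameter names used in the Solr documentation, here is a small sketch in plain Java that builds the request parameters the code in the question roughly corresponds to (q, fq, fl, start, defType). It does not use SolrJ at all; the class SolrParamSketch is hypothetical, and the exact wire format SolrJ produces may differ in ordering or encoding details. URLEncoder.encode(String, Charset) needs Java 10 or newer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the request parameters; not a SolrJ class.
final class SolrParamSketch {

    private final List<String> pairs = new ArrayList<>();

    // Adds one URL-encoded name=value pair; a name may be repeated (e.g. fq).
    SolrParamSketch add(String name, String value) {
        pairs.add(URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8));
        return this;
    }

    String toQueryString() {
        return String.join("&", pairs);
    }

    // Mirrors the calls made in the question's SolrJSearcher.
    static String exampleFromQuestion() {
        return new SolrParamSketch()
                .add("q", "sony digital camera")           // setQuery
                .add("fq", "cat:electronics")              // addFilterQuery (1st)
                .add("fq", "store:amazon.com")             // addFilterQuery (2nd)
                .add("fl", "id,price,merchant,cat,store")  // setFields
                .add("start", "0")                         // setStart
                .add("defType", "edismax")                 // set("defType", ...)
                .toQueryString();
    }
}
```

Appending the resulting string to a /select URL on your Solr core is a quick way to try the same query in a browser and compare it with what SolrJ returns.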

How Can I perform Distributed search inside solr

I am doing a distributed search inside Solr across 4 different Solr servers on different machines. I have a class that extends Query, and I would like to perform the distributed search from it. I have created a Solr query using SolrJ, but when I send it to Solr, it sometimes gives me a correct result and sometimes an incorrect one. The result is incorrect only when some shards throw a query parsing exception. So my question is: can I perform a distributed search inside Solr? An outline of the class from which I make the distributed search is given below.
public class CutomClass extends Query {
    // some other code....
    public Weight createWeight(IndexSearcher searcher1) throws IOException {
        SolrQuery query = new SolrQuery();
        query.setQuery("*:*");
        query.add(ShardParams.SHARDS, getShards);
        query.setStart(0);
        query.setRows(0);
        query.set("sort", "score desc");
        query.setFacet(true);
        query.addFacetField("CLIENT");
        query.setFacetMinCount(1);
        QueryResponse queryResponse = solrServer.query(query, SolrRequest.METHOD.POST);
    }
    // some other code....
}
Sometimes it gives the following parsing exception on some shards, and the result is incorrect.
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: org.apache.solr.search.SyntaxError: org.apache.lucene.queryParser.ParseException: Cannot parse ':': Encountered "" at line 1, column 0.
Yes, you can perform a distributed search in Solr. If you are using a single collection (say collection1) and using the following Solr URL to create the solrServer object, then you are doing a distributed search by default.
http://localhost:8983/solr/collection1/select?
This URL lets you query across all the shards in collection1, whether they are on the same machine or on different machines. But if you have separate collections and want to search across them, please follow http://wiki.apache.org/solr/SolrCloud.
To learn more about distributed search, see https://wiki.apache.org/solr/DistributedSearch.
You can simply comment out the following line.
query.add(ShardParams.SHARDS, getShards);
And yes, *:* is not the problematic part of your program, as stated by D. Kasipovic. A query can contain ":". You need to provide a query of the form field:text. The problem occurs when the text part itself contains ":"; in that case you need to escape it as \:.
Please modify the URL to do a distributed search and then let us know what happens.
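For the escaping mentioned above, here is a minimal sketch in plain Java that puts a backslash in front of ":" and the other characters the query parser treats specially. The class QueryEscaper is hypothetical and its character list is my own approximation; if SolrJ is on the classpath, its ClientUtils.escapeQueryChars utility would normally be the better choice.

```java
// Hypothetical helper for illustration only; SolrJ's ClientUtils.escapeQueryChars
// serves the same purpose and should be preferred in real code.
final class QueryEscaper {

    // Characters given special meaning by the query parser (approximate list).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    private QueryEscaper() {}

    // Prefixes every special character and every whitespace character with a backslash.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Builds field:text with only the text part escaped.
    static String fieldQuery(String field, String text) {
        return field + ":" + escape(text);
    }
}
```

For example, QueryEscaper.fieldQuery("title", "10:30") gives title:10\:30, which can then be passed to query.setQuery(...). Only escape the user-supplied text part; escaping a whole query such as *:* would turn its operators into literal characters.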

Resources