Solr: Subquery concept

http://xx.xx.xx.xx:8983/solr/collection1/select?q=_query_:{! v=cars rows=10 df=content_urdu fl=score,*}&wt=json&indent=true&rows=30&sort=pr desc
Can someone please explain what the above query does, so I can clear up my understanding? Is the text inside the curly brackets a sub-query? How will it be executed?

_query_ gives you the flexibility to use a different query parser instead of the default one selected by your request handler (the select handler in your example).
Everything inside the braces is a parameter for the query parser, and everything after the closing brace is the query string passed to that parser; the whole expression should be enclosed in quotes. In the example below, the edismax and surround parsers work together with AND between them, so each clause acts as a filter on the other. It is the same as putting them in fq parameters, but it helps when generating dynamic queries where the operator may be OR instead of AND. This feature leverages the multiple query parsers in Solr and Lucene and can be used together with faceting to get the desired results.
_query_:"{!edismax rows=10 df=content_urdu } source_type:\"custom\"" AND
_query_:"{!surround maxBasicQueries=10000} content:5N(tru*,(equi* OR and*))"
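As a rough illustration of the quoting involved, here is a minimal Java sketch that assembles such a nested query string. The class and method names (NestedQueryExample, nested) are made up for this example and are not part of Solr; the sketch only escapes backslashes and double quotes inside the sub-query and does no validation. The resulting string still has to be URL-encoded before it is sent as the q parameter.

```java
class NestedQueryExample {

    // Wraps a sub-query in _query_:"..." and escapes backslashes and
    // double quotes so the inner query survives inside the outer quotes.
    // Hypothetical helper, not a Solr API.
    static String nested(String localParams, String subQuery) {
        String body = "{!" + localParams + "}" + subQuery;
        String escaped = body.replace("\\", "\\\\").replace("\"", "\\\"");
        return "_query_:\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // Two nested clauses handled by different parsers, joined with AND.
        String q = nested("edismax df=content_urdu", "source_type:\"custom\"")
                + " AND "
                + nested("surround maxBasicQueries=10000", "content:5N(tru*,(equi* OR and*))");
        System.out.println(q);
    }
}
```

Replacing " AND " with " OR " is the case where this form is more convenient than separate fq parameters.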

Related

Applying boosts inside a SpanQuery

Is there a way to apply a boost inside a SpanOrQuery? (Solr 7.1, lucene of same version.)
Example of query structure generated by edismax query parser (inside the usual +BooleanQuery / DisjunctionMaxQuery for multiple fields):
SpanNearQuery
  List of SpanOrQuery
    List of SpanTermQuery or SpanNearQuery <-- I want to boost those terms
I want to apply a boost to each clause inside the SpanOrQuery. So I tried (via custom query parser extending edismax):
SpanNearQuery
  List of SpanOrQuery
    List of SpanBoostQuery, each wrapping:
      a SpanTermQuery or a SpanNearQuery.
The query executes successfully, but the boost seems to be ignored.
Here is the use case:
The input is a sentence, sent as a phrase query (quoted, with slop) to evaluate proximity. I am using the edismax query parser and a SynonymGraphFilter, and I want to apply a different boost to each synonym, so the boost information is attached to each term in the synonym file (e.g. 0.7_foo). For multi-term synonyms (expanded version), edismax generates a SpanQuery, which sounds to me like a graph search. So far so good.
Where it fails is when I insert a SpanBoostQuery to wrap each clause inside the SpanOrQuery (via a custom query parser extending edismax, extracting boosts from the term text). The query still returns results, but the boost seems to be simply ignored.
Is that a misuse? A bug? Any advice on how I can fix it or work around it, please?
To get similar behavior, I replaced the SpanQuery with a list of PhraseQueries, one for each possible combination of terms in the SpanQuery. This can result in a lot of phrases, performance seems to suffer badly, and we lose a lot of information.
Thanks a lot!
Edit:
The tree above describes the query generated by my custom query parser (I checked the resulting Query by debugging it). For anyone who wants to know how this is implemented: I first let edismax return a parsed query, then I walk the tree recursively, rebuilding a new Query, until I reach a leaf (SpanTermQuery), where I apply code like the following to wrap the term in a boost (SpanBoostQuery is provided by Lucene).
if (initialQuery instanceof SpanTermQuery) {
    SpanTermQuery q = (SpanTermQuery) initialQuery;
    // parse and extract term/boost from q.getTerm(),
    // e.g. "0.7_foo" -> {term: foo, boost: 0.7}
    q = new SpanTermQuery(term);
    if (boost >= 0 && boost != 1) {
        return new SpanBoostQuery(q, boost);
    } else {
        return q;
    }
}
I am aware this is not an optimal solution, but so far it works apart from the missing boost. Advice is welcome, but that would be a separate topic and is not as important to me as my question above :)
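The term/boost extraction mentioned in the code comment above could look roughly like the sketch below. It is only an assumption about that step: the names BoostedTermParser, BoostedTerm and parse are hypothetical, the "<boost>_<term>" format is inferred from the 0.7_foo example, and no Lucene classes are used so that the snippet runs on its own.

```java
class BoostedTermParser {

    // Result of splitting e.g. "0.7_foo" into term "foo" and boost 0.7.
    static final class BoostedTerm {
        final String term;
        final float boost;

        BoostedTerm(String term, float boost) {
            this.term = term;
            this.boost = boost;
        }
    }

    // Hypothetical helper: text without a numeric "<boost>_" prefix is
    // returned unchanged with a neutral boost of 1.
    static BoostedTerm parse(String text) {
        int sep = text.indexOf('_');
        if (sep > 0) {
            try {
                float boost = Float.parseFloat(text.substring(0, sep));
                return new BoostedTerm(text.substring(sep + 1), boost);
            } catch (NumberFormatException e) {
                // Prefix is not a number, so treat the whole text as the term.
            }
        }
        return new BoostedTerm(text, 1.0f);
    }

    public static void main(String[] args) {
        BoostedTerm t = parse("0.7_foo");
        System.out.println(t.term + " boost=" + t.boost);
    }
}
```

The returned term and boost would then feed the new SpanTermQuery(term) and the boost check in the snippet above.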

Solr OR between Block Join expressions

Does anyone know how to write a Solr query that combines a block join expression and a "regular" expression with OR? For example, like this:
{!parent which="content_t:customer"}(content_t:order AND orderDate_dt:[* TO 2016-01-13T00:00:00Z])
OR
customerType=VIP
It seems to me that I have tried every combination and alternative with the "q" and "fq" parameters, but there is always something that gives me a query error. By default, fq parameters are combined with AND, and I guess q.op does not change that?
I have also tried to create a corresponding NOT AND query, which logically is the same. This works for queries that do not contain block join expressions. E.g., this is not valid:
fq=-{!parent which="content_t:customer"}(content_t:order AND orderDate_dt:[* TO 2016-01-13T00:00:00Z])
but this is:
-(customerType=VIP)
We are using Solr 5.2.1. Can someone please help?

Difference between NGramFilterFactory and EdgeNGramFilterFactory

I am a beginner in Solr. In my project, both NGramFilterFactory and EdgeNGramFilterFactory are used on the same field. My understanding from the documentation is that EdgeNGramFilterFactory is used for "starts with" queries, while NGramFilterFactory is suitable for "contains" queries.
I indexed a small dataset with both configurations (one using only NGramFilterFactory, the other using both NGramFilterFactory and EdgeNGramFilterFactory), but I did not see any difference in the output.
If my understanding is correct, the output of EdgeNGramFilterFactory is in a way a subset of the output of NGramFilterFactory. If this is true, is there any benefit in using both types of filters on the same field?
You should not use both filters on the same field; they will completely mess up your matching. If you need to match in the middle of a token, use NGrams. If you only need to match from the start, use EdgeNGrams. Never both together.
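To see the difference concretely, here is a small plain-Java sketch that mimics the two kinds of grams for a single token. It is an illustration only, not Lucene's implementation; the class and method names are invented, and minGram/maxGram stand in for the minGramSize/maxGramSize filter attributes.

```java
import java.util.ArrayList;
import java.util.List;

class NGramDemo {

    // Substrings of length minGram..maxGram starting at every offset,
    // which is what makes "contains" matching possible.
    static List<String> nGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < token.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= token.length(); len++) {
                grams.add(token.substring(start, start + len));
            }
        }
        return grams;
    }

    // Only substrings anchored at offset 0, which gives "starts with" matching.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int len = minGram; len <= maxGram && len <= token.length(); len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(nGrams("solr", 2, 3));     // [so, sol, ol, olr, lr]
        System.out.println(edgeNGrams("solr", 2, 3)); // [so, sol]
    }
}
```

With the same size limits, every edge gram already appears among the n-grams, so adding the edge filter on top of the n-gram filter contributes no new terms for "starts with" matching.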

SOLR AND clause in q object

I need to query a Solr index using something like a WHERE clause.
When I try it with an AND clause, it's not working.
NOT WORKING
category:2 AND -document_type:category
WORKING
category:2 -document_type:category
Which is the correct syntax?
You could try passing -(document_type:category) as a filter query (fq=-(document_type:category)). It should work that way, and it is also faster because of the optimized caches for filter queries.
I also tested your specific case, and it works for me in a regular query as well.

Difference between q and fq in Solr

Can someone please give me a decent explanation of the difference between q and fq in a Solr query, covering points such as:
Do they have the same syntax?
Do they return the same results?
When to use which one, and why?
Any other differences?
Standard Solr queries use the "q" parameter in a request. Filter queries use the "fq" parameter.
The primary difference is that filter queries do not affect relevance scores; the query functions purely as a filter (docset intersection, essentially).
The q parameter takes your query and executes it against the index. Then you can use filter queries (more than one, if needed) to filter the results.
For example, your query can look like this:
q=author:shakespeare
This will match the documents that have 'shakespeare' in the 'author' field. Then you can use filter queries like this:
fq=title:hamlet
fq=type:play
Those will filter the results based on the other fields. You can even filter on the same field.
The query syntax is similar for both the q and fq parameters.
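Put together as a request, the example above might be assembled like this. This is a minimal sketch: FilterQueryUrl and its methods are illustrative names, and the host and collection in the base URL are placeholders. Each filter becomes its own fq parameter, and each value is URL-encoded.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

class FilterQueryUrl {

    // Builds the parameter string: one q, then one fq parameter per filter.
    static String params(String q, List<String> filters) {
        StringBuilder sb = new StringBuilder("q=").append(enc(q));
        for (String fq : filters) {
            sb.append("&fq=").append(enc(fq));
        }
        return sb.toString();
    }

    // URL-encodes a single parameter value.
    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder host and collection name.
        String base = "http://localhost:8983/solr/collection1/select?";
        System.out.println(base + params("author:shakespeare", List.of("title:hamlet", "type:play")));
    }
}
```

Only the q clause contributes to the score; the two fq clauses just narrow the result set.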
