It seems there is no way to construct query with OR condition. Has anyone hit this issue or know when this will be done or any workaround.
What I want to achive something like this with OR:
query = datastore.query(kind='Article',
filters=[('url', '=', 'url1'),
('url', '=', 'url2')]
)
But this filter works as AND not OR.
OR is not a supported query construct in Google Cloud Datastore.
The current way to achieve this is to construct multiple queries client-side and combine the result sets.
For reference, you should read through the Datastore Queries documentation:
The Datastore currently only supports combining filters with the AND operator. However it's relatively straightforward to create your own OR query by issuing multiple queries and combining the results:
Python runtime supports "IN" query filter.
Note, however, that this is just a convenience: under the hood, "IN" query is translated into a series of independent queries each looking for one value on the list.
Related
From reading the docs the search +term +another_term should return the same documents as term AND another_term. But I'm getting different results. Someone suggested that one of the terms is actually acting as an OR. But I thought the search queries were baked into SOLR.
Where in the Solr config would I check for this?
If you enable the debug flat in the admin UI when you run those two queries, it will show you what they get translated to on the lowest level after the Query Parser, etc. You can compare and see if something is different.
Sometimes I don't need just the top X results from a SOLR query, but all results (running into millions). This is easily achievable by searching once with 0 rows as a request parameter, and then re-execute the search with the numFound from the result as number of rows(*)
Of course we can sort the results by e.g. "id asc" to remove relevancy ranking, however, I would like to be able to disable the entire scoring calculation for these queries, as they probably are quite computational intensive and we just don't need them in these cases.
My question:
Is there a way to make SOLR work in boolean mode and effectively run faster on these often slow queries, when all we need is just all results?
(*) I actually usually simply do a paged query where a script walks through the pages (multi threaded), to prevent timeouts on large result sets, yet keep it fast as possible, but this is not important for the question.
This looks like a related question, but apparently the user asked the wrong question and was only after retrieving all results: Solr remove ranking or modify ranking feature; This question is not answered there.
Use filters instead of queries; there is no score calculation for filters.
There is a couple of things to be aware of
Solr deep paging allows you to export large number of results much quicker
Using an export format such as CSV could be faster than using an XML format just due to the formatting and it being more compact
And, as already mentioned, if you are exporting all, put your queries into FilterQuery with caching off
For very complex queries, if you can split it into several steps, you can actually assign different weights to the filters and have them execute in sequence. This allows to use cheap first filter that gets rid of most of the results and only then apply more expensive, more precise, filters
I have a problem getting the right results with my SOLR query. Basically, let's say I want all documents in English containing the string "toto".
http://127.0.0.1:8080/solr-webservice/query/?q=iso_lang_cd:en&ctnt_val:*toto*
The problem is that this query sends me all documents in English AND all documents containing toto.
What I need is to get all documents that are in English AND contain toto. How could I achieve this? I'd think this is the standard use of the AND operator...
Actually OR is the default query operator for Solr and your query is not formatted in such a away as to force an AND operation. In order to achieve the AND behavior you could specify your query in one of the following formats:
+iso_lang_cd:en +ctnt_val:*toto*
iso_lang_cd:en && ctnt_val:*toto*
Or you can optionally pass the q.op=AND to force an AND operation. Additionally, you might want to consider using Filter Queries, where you could filter on the language. There are some performance improvements with using filter queries, but please refer to the documentation for more details.
q=ctnt_val:*toto*&qf=iso_lang_cd:en
Please see The Standard Query Parser for more details and a good overview of querying.
I have a Solr solution working which requires two queries, but I'm looking for a way to do it in a single query. My idea is that if I can figure out a way to do this, I wont have to incur the overhead of twice the load on the Solr cluster.
The details: I'm running a simple query like "q=camera" with a query filter of say "fq=type:digital". The second query is identical to the first, but the filter is the inverse, like "fq=-type:digital" I'm imagining that if there's a way to run a single query while applying the first filter to get the first set of topDocs, then generate a second set with the second filter the results could be merged and returned ( it doesn't matter if sorting resorts and mixes the two sets).
I experimented with partitioning the data by marking a specific field during indexing, into two different groups and then using Solr "grouping" queries, but the response time for these wasn't acceptable in my setup.
I'm looking for suggestions the most Solr congruent approach to experiment with: tuning to improve the two-query solution performance, or investigating a kind of custom Solr post-filter ( I read Yonik's 2/2012 blog post ).
I have to implement this in Solr 3.5, although if there's a slam dunk solution in 4.0 I'll eventually be able to move to that.
I can think of two alternate approaches :-
Instead of filter the results, use a variable higher boost so that all the results for type:digital come on top and rest of the documents would follow. No need for separate queries. The boost can be changes as per the type value.
Other approach is not to display the results for type other then digital. However, you can display the facets for the other types with the counts for the same for users to know if the other types exist for the search term. You can check on tagging and excluding filters
Result grouping might give you what you want. Just group by that parameter and specify sufficient top number of documents in each group.
But I would test whether its performance is any better than two queries. Just because it mentions performance in limitations section.
I have been using some filters in multi-faceting queries in Solr. Right now the filters are using only value but now I have to expand it to multi-values and I think I have to use OR for that. I haven't done any performance checking but I am wondering if there is a way to stop my filter queries from being stored to FilterCache? I don't want to cache results from filter queries with more than two values. Ideally I guess I have to rely on caching algorithm doing a good job but I am just wondering.
Taken from here.
To tell Solr not to cache a filter, we use the same powerful local params DSL that adds metadata to query parameters and is used to specify different types of query syntaxes and query parsers. For a normal query that does not have any localParam metadata, simply prepend a local param of cache=false.