Speeding up SOLR search - solr

SOLR search responses are extremely slow with Apache Solr/Lucene 3.6.
Some performance-tuning techniques I'm experimenting with are:
SOLR Pagination
mergeFactor currently set to 10 in solrconfig.xml
SOLR Facet queries
filterCache in solrconfig.xml set to size 512, using solr.FastLRUCache with autowarmCount=0
queryResultCache in solrconfig.xml set to size 512 with autowarmCount=0
newSearcher, firstSearcher, and useColdSearcher
single segment index for 100,000 documents
single machine SOLR server for 100,000 documents
How can I optimize items 1-7 to speed up the SOLR search response for a term/query?
Are there any other optimization parameters to consider that are not mentioned above?

You can also check the following resources:
SolrPerformanceFactors
ImproveSearchingSpeed
ImproveIndexingSpeed
SolrCaching
The Seven Deadly Sins of Solr

Related

How to split a running SolrCloud shard based on a field?

I have a SolrCloud 7.6.0 cluster running with 1 shard. The shard is growing quite fast, so I'm going to split it into separate shards based on the "brand" field. For example:
Docs in current shards (Shard1, Collection1):
doc1: {"id":"doc1","brand":"brand1"}
doc2: {"id":"doc2","brand":"brand1"}
doc3: {"id":"doc3","brand":"brand2"}
doc4: {"id":"doc4","brand":"brand2"}
How could I split them into shard1_0 (doc1, doc2), shard1_1 (doc3, doc4)?

Solr Performance - how to OR a filter query with a regular query

Trying to OR main query (q) and filter query (fq) using Solr 4.7.
We have a high frequency term in the index
'field1:value' - 5 million documents
'field2:value' - 500 documents which should be ranked higher
When searching by
q=field1:value OR field2:value
The query takes very long (more than 2 seconds)
When searching by
q=*:*&fq=field1:value OR field2:value
The query runs rather fast, but I have no way to get the field2:value at the top of the list.
Currently post re-ranking is not an option.
I understand that the filter query (fq) is fast because no scoring is involved. (This is not a frequent query, so no caching is necessary.)
I tried wrapping the TermQuery in a ConstantScoreQuery inside a QueryParser plugin.
But it seems to perform just the same as an ordinary TermQuery.
What I am looking for is a way to run a filter query as an OR, that is, something like q=field2:value&or_fq=field2:value
Or, alternatively, a way to create a 'real non-scoring' TermQuery within the main query.
Could you please help me? Thanks.
Use *:* as the query, then apply the filter you already have, field1:value OR field2:value, as fq. You can then use bq or boost with field2:value so that documents with hits in field2 score higher.
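To make the suggestion above concrete, here is a minimal sketch that only builds the request URL (it does not contact Solr). The host, core name and the boost value of 10 are assumptions for illustration, and defType=edismax is assumed because bq is a parameter of the dismax/edismax parsers; adjust to whatever parser your handler actually uses.

```python
from urllib.parse import urlencode


def build_boosted_filter_url(base_url, common_clause, rare_clause, boost=10):
    """Return a /select URL that filters with an OR and boosts the rare clause."""
    params = [
        ("q", "*:*"),                                 # match-all main query
        ("fq", f"{common_clause} OR {rare_clause}"),  # unscored OR filter
        ("defType", "edismax"),                       # assumption: bq needs (e)dismax
        ("bq", f"{rare_clause}^{boost}"),             # lift docs matching the rare clause
        ("wt", "json"),
    ]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and core name, for illustration only.
url = build_boosted_filter_url(
    "http://localhost:8983/solr/collection1", "field1:value", "field2:value"
)
print(url)
```

The filter decides which documents are returned without scoring them, while bq only affects ordering, so the 500 field2 documents should sort ahead of the rest when results are ordered by score.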

Getting a certain number of docs from Solr

I need to get only the first n documents sorted by the prevId field from Solr (not all the docs, just the number given by rows). It seems to perform poorly, and moreover it returns the wrong number of found docs. Is there any way to do this from the SOLR GUI or with a raw request?
numFound is the total number of documents that matches your query in the index (which in this case is all the documents in the index), it's not the number of documents returned.
You can enable docValues on your field if sorting is slow for that field - but caching usually helps a lot when doing multiple sorts (as long as your index hasn't been modified in between). That being said, your query took 285ms on the Solr side, so maybe the slowness you're experiencing comes from somewhere else than Solr?
Different output formats (&wt=json etc.) might also be more efficient to deserialize in your language of choice (and to display in your browser, which does a lot of syntax highlighting for XML).
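The difference between numFound and the number of returned documents can be seen in a small sketch like the one below. It builds the raw request parameters and parses a hand-written sample in the shape of Solr's JSON response; the sample values (other than the 285 ms QTime mentioned above) and the ascending sort order are assumptions, not real output.

```python
import json
from urllib.parse import urlencode


def first_n_params(n, sort_field="prevId"):
    """Query string asking for the first n docs sorted by sort_field."""
    return urlencode(
        {"q": "*:*", "sort": f"{sort_field} asc", "rows": n, "wt": "json"}
    )


def summarize(response_text):
    """Return (total matches in the index, docs actually returned)."""
    body = json.loads(response_text)["response"]
    return body["numFound"], len(body["docs"])


# Hypothetical response: 1000 docs match, but rows=2 limits what comes back.
sample = (
    '{"responseHeader":{"status":0,"QTime":285},'
    '"response":{"numFound":1000,"start":0,'
    '"docs":[{"id":"a"},{"id":"b"}]}}'
)

print(first_n_params(2))
total, returned = summarize(sample)
print(total, returned)
```

Here rows controls how many documents are returned, while numFound keeps reporting every match for the query.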

Solr Query Performance

We are using Solr 4.3 for search functionality. We have configured 2 shards and 2 replicas.
We have a total of 132905 Solr documents in the index.
Our search query is very long and takes around 3 seconds (from the Solr Admin Console) for the query below.
id:(FOLD5002861 FOLD5002890 FOLD5219963 FOLD4105003 FOLD4105005 FOLD4105006 FOLD4105007 FOLD4105008 FOLD4105009 FOLD4105010 FOLD4105011 FOLD4105012 FOLD4105013 FOLD4105014 FOLD4105018 FOLD4105019 FOLD4105020 FOLD4105021 FOLD4105022 FOLD4105023 FOLD4105024 FOLD4105025 FOLD4105026 FOLD4105027 FOLD5220166 FOLD5220168 FOLD5220169 FOLD5220170 FOLD5220171 FOLD5220172 FOLD5220173 FOLD5220174 FOLD5220175 FOLD5220176 FOLD5220177 FOLD5220178 FOLD5220179 FOLD5220180 FOLD5220181 FOLD4100876 FOLD4100877 FOLD4100878 FOLD4100879 FOLD4100880 FOLD4100881 FOLD4655426 FOLD4655428 FOLD4655429 FOLD4655430 FOLD4655431 FOLD4655432 FOLD4655433 FOLD4655434 FOLD4655435 FOLD4655436 FOLD4655437 FOLD4655438 FOLD4655439 FOLD4655483 FOLD4655487 FOLD4655523 FOLD4655874 FOLD4655884 FOLD4655856 FOLD4655858 FOLD4655859 FOLD4655860 FOLD4655861 FOLD4655862 FOLD4655863 FOLD4655864 FOLD4655865 FOLD4655866 FOLD4655867 FOLD4655868 FOLD4655869 FOLD4655870 FOLD4655871 FOLD4655872 FOLD4655882 FOLD4655892 FOLD4649510 FOLD4649512 FOLD4649513 FOLD4649514 FOLD4649515...50000 times)
We want to trace where the time is being spent. We tried the debugQuery option in the Solr Admin Console but did not get useful information.
Is there any way to improve the query? How can we track detailed timing?
If you transform this query into a filter query, it will give better performance most of the time, because the filter is applied on top of a query (that is, on a subset of your index, and the filter can be cached) rather than on the entire index.
The best option would be to have another piece of query to run alongside it and then apply this as a filter on top of that, but if you don't have one, running a query for field:[* TO *] should still give better performance.
Have a look at this question, which explains the difference between a filter query and a normal query:
SOLR filter-query vs main-query
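As a rough sketch of the filter-query approach described above, the id list can be moved from q into fq. The helper name, the match-all default for q, and sending the parameters as a form-encoded POST body (to avoid URL length limits with a very long list) are assumptions for illustration; only the first three ids from the question are used here.

```python
from urllib.parse import urlencode


def ids_as_filter(ids, main_query="*:*", rows=10):
    """Build a form-encoded body that applies the id list as a filter query."""
    fq = "id:(" + " ".join(ids) + ")"  # same syntax as the original q, moved to fq
    return urlencode({"q": main_query, "fq": fq, "rows": rows, "wt": "json"})


# First three ids taken from the question; the real list is far longer.
body = ids_as_filter(["FOLD5002861", "FOLD5002890", "FOLD5219963"])
print(body)
```

If another query clause is available, pass it as main_query so the filter is applied on top of a smaller result set.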
Take a look at some factors that affect Solr performance.
Try to optimize your index with:
curl http://<solr_host>:<port>/solr/<core_name>/update -F stream.body=' <optimize />'

Different scores from Solr 1 vs Solr 4 Dismax Handler

I've migrated my Solr 1.4 index to Solr 4.0 using this method, and I've kept my solrconfig.xml and schema.xml as unchanged as possible while still being functional.
I'm using the DisjunctionMaxQuery (dismax / solr.DisMaxRequestHandler) requestHandler and comparing my search results between Solr 1.4 and Solr 4. Using ?debugQuery=on in the URL, I can see that the parsedQuery portion is virtually the same between Solr versions, yet the generated scores are different. (The explain portion is different, but the calculation is long and obtuse.)
Example query: q=foo
Example response:
Solr 1.4:
title: "foo (32-bit)"
score: 3.8850176
Solr 4.0:
title: "foo (32-bit)"
score: 2.1525226
Despite having the same request handler and identical indices, what would be causing this significant difference in scores?
If the explain portion is different, then it's using different calculations to compute the scores, so they are going to be different. Scores are pretty arbitrary anyway and are basically only useful for comparison within the result set of a single query; in other words, it doesn't make sense to compare scores from one query to the scores of another query. The same probably applies to different versions of Solr, especially if the way the calculations are done is different.

Resources