Is it a good idea to stop Solr replication for all search machines until the new Solr configurations are deployed on all nodes? - solr

This question is about a legacy Solr setup (non-cloud mode).
Let's consider one hypothetical example. Say we have one index machine and 2 search machines.
We have some Solr schema and config changes that we want to deploy to all the machines.
We do a round-robin deployment: deploy to the index machine first, then deploy to one search machine at a time. For this whole deployment, we disable replication from the index machine to the search machines. Can we do better, so that replication is not stopped for the entire deployment process?
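One variation to consider, as a rough sketch only: instead of disabling replication for everyone, pause polling only on the search machine that is currently being redeployed, using the ReplicationHandler's `disablepoll` and `enablepoll` HTTP commands. The host names, the core name `mycore`, and the "deploy" step are hypothetical placeholders. Whether a search machine that still has the old schema can safely pull an index built with the new schema depends on the change itself, so treat this as an idea to test rather than a recommendation. The code only builds the list of steps; it does not contact any server.

```python
from urllib.parse import urlencode


def replication_url(base_url, core, command):
    """Build a ReplicationHandler URL, e.g. .../replication?command=disablepoll."""
    query = urlencode({"command": command})
    return f"{base_url.rstrip('/')}/solr/{core}/replication?{query}"


def rolling_plan(index_host, search_hosts, core):
    """Return the ordered steps for a rolling deployment.

    Polling is paused only on the search machine being updated, so the
    other search machines keep replicating in the meantime.
    """
    steps = [("deploy", index_host)]  # index machine first, as in the question
    for host in search_hosts:
        steps.append(("GET", replication_url(host, core, "disablepoll")))
        steps.append(("deploy", host))
        steps.append(("GET", replication_url(host, core, "enablepoll")))
    return steps


if __name__ == "__main__":
    # Hypothetical hosts and core name, for illustration only.
    plan = rolling_plan(
        "http://index1:8983",
        ["http://search1:8983", "http://search2:8983"],
        "mycore",
    )
    for action, target in plan:
        print(action, target)
```

Each "GET" step would be issued against the search machine itself, since polling is a slave-side setting.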

Related

Run SolrCloud on different machine

I am trying to configure SolrCloud on more than one server/machine so that if one server fails, another replica can serve the request.
I can successfully run SolrCloud on a single machine with two nodes on different ports. I am referring to this link.
But how can I run it on different machines? What configuration do I need to achieve this?
Any help is appreciated.
You provide Solr with the address to the Zookeeper ensemble you'll be using to distribute the workload. It's highly recommended to run Zookeeper by itself instead of using the one bundled with Solr. In the last example in the link you've provided, you can see the -z parameter that provides the connection list of zookeeper instances available. Solr uses Zookeeper to keep its cluster state and available servers synced across instances.
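As a small illustration of the -z parameter, here is a sketch that assembles the Zookeeper connection list and the matching start command for each Solr machine. The host names zk1, zk2, zk3 and the optional /solr chroot are made up for the example; port 2181 is Zookeeper's default client port.

```python
def zk_connect_string(hosts, port=2181, chroot=""):
    """Join Zookeeper hosts into 'host1:2181,host2:2181[/chroot]'."""
    ensemble = ",".join(f"{host}:{port}" for host in hosts)
    return ensemble + chroot


def solr_start_command(zk_hosts, solr_bin="bin/solr", chroot=""):
    """Return the argv for starting Solr in cloud mode against an external ensemble."""
    return [solr_bin, "start", "-c", "-z", zk_connect_string(zk_hosts, chroot=chroot)]


if __name__ == "__main__":
    # Hypothetical ensemble; run the same command on every Solr machine.
    print(" ".join(solr_start_command(["zk1", "zk2", "zk3"])))
```

Every Solr machine gets the same connection string, which is how the nodes find each other.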

Sitecore 8.1 index rebuild strategy for SOLR search provider

I just read through the index update strategies document below but couldn't get a clear answer on which strategy is best for a Solr search implementation:
https://doc.sitecore.net/sitecore_experience_platform/search_and_indexing/index_update_strategies
We have set up master and slave Solr endpoints, where the master is used for create/update and the slave for reads only.
I'd appreciate it if you could suggest the indexing strategy to use for:
Content Authoring
Content Delivery
The solution is hosted in Azure Web Apps, and content delivery can be scaled up or down from 1 to N instances at any time.
I'm planning to configure the following:
Only CA has an OnPublishEndAsync strategy.
The CDs will not have any indexing strategy.
I'd appreciate it if you could suggest a solution that has worked for you. Also, how do we disable an indexing strategy?
Thanks.
Usually, when you use replication in Solr (master + slave Solr servers), it should be configured like this:
Content Authoring (CM server):
connects to the Solr master server.
runs the syncMaster strategy for the master database and onPublishEndAsync for the web database.
Content Delivery (CD servers):
connects to the Solr slave server (or to a load balancer if there are multiple Solr slave servers).
has all indexing strategies set to manual; CD servers should NEVER update the Solr slave servers.
With this setup, CD servers can always get results from Solr, even while a full index rebuild is in progress (the rebuild happens on the master Solr server, and the data is copied to the slaves after it finishes).
You should think about having 2 Solr slave servers with a load balancer in front of them. If you do this:
If the Solr master is down for some reason, the slaves still answer requests from the CD boxes. You can safely restart the master and reindex; the only thing you lose is that search results on CD are not 100% up to date for some time.
If one of the Solr slave servers is down, the second slave still answers requests, and the load balancer should redirect all traffic to the slave that is working.
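To make the load balancer behaviour concrete, here is a toy sketch of round-robin selection that skips a slave marked as down. It is not how any particular load balancer is implemented; the class name SlavePool and the slave URLs are invented for the example, and a real setup would rely on the balancer's own health checks.

```python
from itertools import cycle


class SlavePool:
    """Round-robin over Solr slave URLs, skipping those marked as down."""

    def __init__(self, urls):
        self._urls = list(urls)
        self._healthy = set(self._urls)
        self._ring = cycle(self._urls)

    def mark_down(self, url):
        self._healthy.discard(url)

    def mark_up(self, url):
        if url in self._urls:
            self._healthy.add(url)

    def next_url(self):
        # Try each slave at most once per call.
        for _ in range(len(self._urls)):
            url = next(self._ring)
            if url in self._healthy:
                return url
        raise RuntimeError("no healthy Solr slave available")


if __name__ == "__main__":
    # Hypothetical slave URLs.
    pool = SlavePool(["http://slave1:8983/solr", "http://slave2:8983/solr"])
    pool.mark_down("http://slave1:8983/solr")
    print(pool.next_url())  # all traffic now goes to slave2
```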

Solr and Zookeeper with a single node

I have SolrCloud running on my local machine as a single node with the internal Zookeeper (i.e. the Zookeeper that is embedded in Solr).
My question is: when I move Solr to the production environment, is it recommended to run Zookeeper as an isolated/separate/external instance, or is it better to use the internal Zookeeper instance that comes with Solr?
Using Solr's internal ZooKeeper is discouraged for production environments. This is even stated in the SolrCloud documentation:
Although Solr comes bundled with Apache ZooKeeper, you should consider yourself discouraged from using this internal ZooKeeper in production, because shutting down a redundant Solr instance will also shut down its ZooKeeper server, which might not be quite so redundant. Because a ZooKeeper ensemble must have a quorum of more than half its servers running at any given time, this can be a problem.
The solution to this problem is to set up an external ZooKeeper ensemble. You should create this ensemble on separate machines so that if any of the Solr machines goes down, it will not impact ZooKeeper or the rest of the Solr instances. I know you are currently going with one Solr instance.
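The quorum rule quoted above ("more than half its servers") can be worked out with a few lines of arithmetic. This is just the majority calculation, not anything taken from ZooKeeper's code; the function names are made up for the example.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that is more than half of the ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble must have at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a quorum is still running."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, "servers: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

By this arithmetic a single ZooKeeper tolerates no failures, and an ensemble of 3 tolerates one, which is why odd-sized ensembles are the usual choice: going from 3 to 4 servers does not add any tolerance.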
As mentioned, for production it is not a good idea to use the Zookeeper embedded in Solr, but for development it is entirely OK and very practical. For that, you just need to add these lines to your /etc/default/solr.in.sh file:
SOLR_MODE=solrcloud
ZK_CREATE_CHROOT=true
As an alternative, you can also start Solr manually with the command $SOLR_HOME_DIR/bin/solr start -c
Tested with Apache Solr 9 on a Debian-based Linux.

How to setup Solr Replication with two search servers?

Hi, I'm developing a Rails project with Sunspot Solr and want to configure Solr replication.
My environment: rails 3.2.1, ruby 2.1.2, sunspot 2.1.0, Solr 4.1.6.
Why replication: I need a more stable system. The search server often goes down for maintenance, and the web application stops working in production. So I am thinking about running 2 identical search servers instead of one, to make the system more stable: if one server is down, the other will continue working.
I cannot find any good tutorial that is simple, easy to understand, and described in detail...
I'm trying to set up replication on two servers, but I do not fully understand how replication works internally:
synchronizing data between the two servers (is this automatic?)
balancing search requests between the two servers
when one server suddenly stops working, the other should become the master (is this automatic?)
are there replication features other than those listed?
The answer to this is similar to:
How to setup Solr Cloud with two search servers?
What is the difference between Solr Replication and Solr Cloud?
Can we close this as duplicate?

How to setup solr cloud with 2 shards 1 leader and 1 replica and with zookeeper on different machines?

I'm still confused about setting up a SolrCloud cluster. The ones in the tutorial are set up on localhost, bound to different ports. But I want to know what it would look like using different machines. What do I need? Do I need to extract the downloaded Solr on each machine? Should I set up Zookeeper first and set the configuration? Should Zookeeper be installed on a different machine that is not a Solr server?
This tutorial is a lot closer to what you need:
http://solr.pl/en/2013/03/11/solrcloud-howto-2/
If you don't want to run a separate Zookeeper, you can run the embedded Zookeeper on one of your Solr instances by passing -DzkRun on that instance, and -DzkHost on the other instances to point to the first one.
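As a rough sketch of that arrangement: the embedded Zookeeper listens on the Solr port plus 1000 (so 9983 for a Solr on 8983), and that is the address the other instances pass as zkHost. Note that the property is spelled zkRun, with a capital R, in Solr's documentation. The host names solr1 and solr2 and the helper function names are invented for the example.

```python
def embedded_zk_address(host, solr_port=8983):
    """Address of the Zookeeper embedded in the Solr instance on host."""
    return f"{host}:{solr_port + 1000}"


def jvm_flags(first_host, this_host, solr_port=8983):
    """JVM system properties for one Solr instance.

    The first instance runs the embedded Zookeeper; every other
    instance points at it through zkHost.
    """
    if this_host == first_host:
        return ["-DzkRun"]
    return [f"-DzkHost={embedded_zk_address(first_host, solr_port)}"]


if __name__ == "__main__":
    # Hypothetical two-machine setup.
    for host in ("solr1", "solr2"):
        print(host, jvm_flags("solr1", host))
```

Keep in mind that if the first instance goes down, its embedded Zookeeper goes down with it, so this is more suited to testing than to production.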
