How to set up Vespa nodes for high availability - vespa

I have to deploy the Vespa database and the requirement is that it should be highly available. I have 3 instances available for the database. I successfully deployed it with 1 instance running a config node, a container node, and a content node, and the other 2 instances running a container node and a config node.
Redundancy was set to 3 and searchable copies to 3 as well.
The issue is that if the instance with the config node stops for any reason, the whole system becomes unavailable.
To avoid this, I tried to have a config node on all the instances, but I wasn't able to deploy it successfully.
Link that I referred:
https://docs.vespa.ai/en/reference/services-admin.html
<admin version="2.0">
  <adminserver hostalias="admin0"/>
  <configservers>
    <configserver hostalias="admin0"/>
    <configserver hostalias="admin1"/>
    <configserver hostalias="admin2"/>
  </configservers>
</admin>
What is it that I am doing wrong here?
Also, is there a better way to ensure availability in Vespa? Please do let me know.
Thanks in advance.

Please note that the documents are stored on the content nodes, so you need three of those, too. So maybe just set up config, container and content on all nodes?
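For illustration, a minimal services.xml along those lines might look roughly like the sketch below (the host aliases node0-node2, the cluster ids, and the document type are placeholders that must match your own hosts.xml and schema; on every host, the VESPA_CONFIGSERVERS environment variable should also list all three config servers):
<?xml version="1.0" encoding="utf-8" ?>
<services version="1.0">
  <admin version="2.0">
    <adminserver hostalias="node0"/>
    <configservers>
      <configserver hostalias="node0"/>
      <configserver hostalias="node1"/>
      <configserver hostalias="node2"/>
    </configservers>
    <!-- one cluster controller per host so cluster state management also survives a node loss -->
    <cluster-controllers>
      <cluster-controller hostalias="node0"/>
      <cluster-controller hostalias="node1"/>
      <cluster-controller hostalias="node2"/>
    </cluster-controllers>
  </admin>
  <container id="default" version="1.0">
    <document-api/>
    <search/>
    <nodes>
      <node hostalias="node0"/>
      <node hostalias="node1"/>
      <node hostalias="node2"/>
    </nodes>
  </container>
  <content id="mydocs" version="1.0">
    <redundancy>3</redundancy>
    <documents>
      <document type="mydoc" mode="index"/>
    </documents>
    <nodes>
      <node hostalias="node0" distribution-key="0"/>
      <node hostalias="node1" distribution-key="1"/>
      <node hostalias="node2" distribution-key="2"/>
    </nodes>
  </content>
</services>
With three config servers the config cluster can tolerate losing one node (a ZooKeeper quorum needs 2 of 3), and with redundancy 3 every document has a copy on every content node.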

Related

Change "Solr Cluster" in Lucidworks Fusion 4

I am running Fusion 4.2.4 with external Zookeeper (3.5.6) and Solr (7.7.2). I have been running a local set of servers and am trying to move to AWS instances. All of the configuration from my local Zookeepers has been duplicated to the AWS instances so they should be functionally equivalent.
I am to the point where I want to shut down the old (local) Zookeeper instances and just use the ones running in AWS. I have changed the configuration for Solr and Fusion (fusion.properties) so that they only use the AWS instances.
The problem I have is that Fusion's solr cluster (System->Solr Clusters) associated with all of my collections is still set to the old Zookeepers :9983,:9983,:9983 so if I turn off all of the old instances of Zookeeper my queries through Fusion's Query API no longer work. When I try to change the "Connect String" for that cluster it fails because the cluster is currently in use by collections. I am able to create a new cluster but there is no way that I can see to associate the new cluster with any of my collections. In a test environment set up similar to production, I have changed the searchClusterId for a specific collection using Fusion's Collections API however after doing so the queries still fail when I turn off all of the "old" Zookeeper instances. It seems like this is the way to go so I'm surprised that it doesn't seem to work.
So far, Lucidworks's support has not been able to provide a solution - I am open to suggestions.
This is what I came up with to solve this problem.
I created a test environment with an AWS Fusion UI/API/etc., local Solr, AWS Solr, local ZK, and AWS ZK.
1. Configure Fusion and Solr to only have the AWS ZK configured
2. Configure the two ZKs to be an ensemble
3. Create a new Solr Cluster in Fusion containing only the AWS ZK
4. For each collection in Solr (a rough sketch of these calls follows below):
   a. GET the JSON from <fusion_url>:8764/api/collections/<collection>
   b. Edit the JSON to change "searchClusterId" to the new cluster defined in Fusion
   c. PUT the new JSON to <fusion_url>:8764/api/collections/<collection>
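A rough sketch of step 4 with curl (the collection name, cluster id, and credentials are placeholders; the API path is the one shown above):
# step 4a: fetch the collection definition
curl -u admin:password -o my_collection.json \
  "http://<fusion_url>:8764/api/collections/my_collection"
# step 4b: edit my_collection.json so that "searchClusterId" points at the new cluster,
# e.g. "searchClusterId": "aws-solr-cluster"
# step 4c: push the modified definition back
curl -u admin:password -X PUT -H "Content-Type: application/json" \
  -d @my_collection.json \
  "http://<fusion_url>:8764/api/collections/my_collection"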
After doing all of this, I was able to change the “default” Solr cluster in the Fusion Admin UI to confirm that no collections were using it (I wasn’t sure if anything would use the ‘default’ cluster so I thought it would be wise to not take the chance).
I was able to then stop the local ZK, put the AWS ZK in standalone mode, and restart Fusion. Everything seems to have started without issues.
I am not sure that this is the best way to do it, but it solved the problem as far as I could determine.

Solr AutoScaling - Add replicas on new nodes

Using Solr version 7.3.1
Starting with 3 nodes:
I have created a collection like this:
wget "localhost:8983/solr/admin/collections?action=CREATE&autoAddReplicas=true&collection.configName=my_col_config&maxShardsPerNode=1&name=my_col&numShards=1&replicationFactor=3&router.name=compositeId&wt=json" -O /dev/null
In this way I have a replica on each node.
GOAL:
Each shard should add a replica to new nodes joining the cluster.
When a node is shut down, it should just go away.
Only one replica for each shard on each node.
I know that it should be possible with the new AutoScaling API, but I am having a hard time finding the right syntax. The API is very new and all I can find is the documentation. It's not bad, but I am missing some more examples.
This is how it looks today. There are many small shards, each with a replication factor that matches the number of nodes. Right now there are 3 nodes.
This video was uploaded yesterday (2018-06-13), and around 30 minutes into the video there is an example of the Solr HttpTriggerListener that can be used to call any kind of service, for example an AWS Lambda, to add new nodes.
The short answer is that your goals are not achievable today (as of Solr 7.4).
The NodeAddedTrigger only moves replicas from other nodes to the new node in an attempt to balance the cluster. It does not support adding new replicas. I have opened SOLR-12715 to add this feature.
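For reference, such a trigger is registered through the Autoscaling API roughly as follows (the trigger name and waitFor value are arbitrary; on 7.3/7.4 it will only move existing replicas to the new node, not add new ones):
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/admin/autoscaling' -d '{
    "set-trigger": {
      "name": "node_added_trigger",
      "event": "nodeAdded",
      "waitFor": "10s",
      "enabled": true
    }
  }'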
Similarly, the NodeLostTrigger adds new replicas on other nodes to replace the ones on the lost node. It, too, has no support for merely deleting replicas from cluster state. I have opened SOLR-12716 to address that issue. I hope to release both the enhancements in Solr 7.5.
As for the third goal:
Only one replica for each shard on each node.
To achieve this, a policy rule given in the "Limit Replica Placement" example should suffice. However, looking at the screenshot you've posted, you actually mean a (collection, shard) pair, which is unsupported today. You'd need a policy rule like the following (which does not work because collection:#EACH is not supported):
{"replica": "<2", "collection": "#EACH", "shard": "#EACH", "node": "#ANY"}
I have opened SOLR-12717 to add this feature.
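In the meantime, the supported "Limit Replica Placement" form of that rule (at most one replica of each shard per node, without the per-collection qualifier) can be set as a named policy, roughly like this (the policy name is a placeholder; a collection then opts in with policy=limit-replica-placement at creation time):
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/admin/autoscaling' -d '{
    "set-policy": {
      "limit-replica-placement": [
        {"replica": "<2", "shard": "#EACH", "node": "#ANY"}
      ]
    }
  }'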
Thank you for these excellent use-cases. I'd recommend asking questions such as these on the solr-user mailing list, because not a lot of Solr developers frequent Stack Overflow. I could only find this question because it was posted on the docker-solr project.

SOLR DIH cluster environment

I have the Solr Cloud environment configured, up and running, no issues at all. But now I need to run a delta import in a loop: every time the import process finishes, start another one.
Considerations:
Same DIH configuration on all nodes.
The 3 Solr nodes are running behind a load balancer (the command can be executed on any of the nodes).
I don't want to execute the importer on a second node if it is already running on another node.
I would like to run the DIH as soon as the last execution finishes, right away.
If one node goes down during an import, I would like to be able to say "this is taking too long, let's just start another import process" (and if there is a way to identify the node where the process was running when it went down, so I can check it and save that information to find out the reasons, that would be great).
I have so many events going on in the database every minute that I really need all these events (DB records) in Solr (documents kept up to date).
Options and thoughts
I'm thinking of using JBoss EAP 5.1 to run an external app with the TimerService; I have a cluster and I can ensure this will run forever, asking for status and restarting the DIH process in a loop.
I have been taking a look at and testing the DIH EventListener:
<dataConfig>
  <document onImportEnd="com.me.MyNotificationService">
    ....
  </document>
</dataConfig>
com.me.MyNotificationService can let me know when the process finishes, but I still don't know how to connect it to the "run Solr import" app, since it will live in a library running outside my JBoss AS container (and again, if the Solr node goes down I lose the notification as well).
Is there a way to ensure this loop won't be broken? If all of this were managed by the Solr cluster (taking care of situations like a node going down in the middle of an import), I would forget about that external "run Solr import" app, but I really don't think that's possible.
It would be really useful to be able to tell the Solr cluster "execute this import process on this node (let's say node 2)" and then be notified when it finishes, or to have a way to ask for status on that specific node 2 even if I'm sending the request to node 1 because of the load balancer.
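For what it's worth, DIH commands are plain HTTP calls against the /dataimport handler, so starting and polling an import on one specific node directly (bypassing the load balancer) looks roughly like this (host, port, and core name are assumptions; /dataimport is the default handler path):
# start a delta import on node 2 only
curl "http://node2:8983/solr/my_core/dataimport?command=delta-import&wt=json"
# ask that same node for the import status
curl "http://node2:8983/solr/my_core/dataimport?command=status&wt=json"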
Any recommendation and thoughts will be more than welcome.
Thanks.

Is there any way to remove dead replicas in solrcloud?

I am using Solr 4.5. After several tests I have noticed that a lot of dead (non-existing) replicas are shown in my SolrCloud graph as gone (black). Is there any way to force Solr to forget about these gone replicas?
I think that manually modifying the /clusterstate.json node in ZooKeeper might help, but I have not tried it yet.
The simplest way I found is in fact editing /clusterstate.json in zookeeper, and removing dead replicas info from it.
I don't know if there is any way to do some sort of global cleanup... but:
There is an API to remove some specific replica:
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-DeleteaReplica
As well as one for removing an (INACTIVE) shard with all its replicas (4.4+):
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-DeleteaShard
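For example, the calls look roughly like this (collection, shard, and replica names are placeholders to be taken from your cluster state):
curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=my_collection&shard=shard1&replica=core_node3"
curl "http://localhost:8983/solr/admin/collections?action=DELETESHARD&collection=my_collection&shard=shard1"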
And, if this is something related to production and not only for testing purpose - you may also want to look at this upcoming change from 4.6 related to registering the replica that was previously removed - https://issues.apache.org/jira/browse/SOLR-5311

Apache Solr setup for two different projects

I just started using Apache Solr with its data import functionality on my project, following the steps in http://tudip.blogspot.in/2013/02/install-apache-solr-on-ubuntu.html
But now I have to make two different instances of my project on the same server, with different databases but with the same Solr configuration for both projects. How can I do that?
Please help if anyone can.
Probably the closest you can get is having two different Solr cores. They will run under the same server but will have different configuration (which you can copy paste).
When you say "different databases", do you mean you want to copy from several databases into one join collection/core? If so, you just define multiple entities and possibly multiple datasources in your DataImportHandler config and run either all of them or individually. See the other question for some tips.
If, on the other hand, you mean different Solr cores/collections, then you just want to run Solr with multiple cores. It is very easy: you just need a solr.xml file above your collection level, as described on the Solr wiki. Once you get the basics, you may want to look at sharing instance directories and having separate data directories to avoid changing the same config twice (instanceDir vs. dataDir settings on each core).
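As an illustration, a legacy-style solr.xml with two cores that share one instance directory but keep separate data directories might look roughly like this (core names and paths are placeholders):
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- both cores reuse the same config (instanceDir) but index into separate dataDirs -->
    <core name="project1" instanceDir="shared_core" dataDir="/var/solr/data/project1"/>
    <core name="project2" instanceDir="shared_core" dataDir="/var/solr/data/project2"/>
  </cores>
</solr>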
