DC/OS: modifying Cluster name post installation - mesosphere

I forgot to update the cluster name (cluster_name) in my boot node's genconf/config.yaml before deploying the DC/OS cluster. Is there a configuration/properties file on the nodes (or something in dcos-cli or etcd) that I need to change to update the cluster name string that appears in the DC/OS UI? I'd appreciate any help.
version: DC/OS 1.8
nodes running on CoreOS
size: 3 masters and 11 agents

The cluster name that appears in the DC/OS interface is taken from the Mesos cluster name. According to this configuration generation file, that name is set through an environment variable, so it should be possible to change its value. You will then have to restart the Mesos masters one by one.
Important note: I have not had the chance to test this. If you are in a production environment, I strongly recommend against trying it.
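To check which masters have picked up the new name during such a rolling restart, one option is to read the "cluster" field that a Mesos master reports in its state document. This is a minimal sketch, not a tested DC/OS procedure: it assumes each master's /state endpoint is reachable without authentication, and the function names and example URL are hypothetical.

```python
import json
import urllib.request


def cluster_name(state: dict):
    """Return the cluster name from a parsed Mesos master state document, or None."""
    return state.get("cluster")


def masters_not_updated(states: dict, expected: str) -> list:
    """Given {master: parsed state}, list the masters not yet reporting `expected`."""
    return sorted(m for m, st in states.items() if cluster_name(st) != expected)


def fetch_state(master_url: str) -> dict:
    # Assumption: the endpoint needs no auth, e.g. http://<master>:5050/state
    with urllib.request.urlopen(f"{master_url}/state") as resp:
        return json.load(resp)
```

After restarting each master, fetch its state and confirm that masters_not_updated no longer lists it before moving on to the next one.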

Related

Change "Solr Cluster" in Lucidworks Fusion 4

I am running Fusion 4.2.4 with external Zookeeper (3.5.6) and Solr (7.7.2). I have been running a local set of servers and am trying to move to AWS instances. All of the configuration from my local Zookeepers has been duplicated to the AWS instances so they should be functionally equivalent.
I am at the point where I want to shut down the old (local) Zookeeper instances and just use the ones running in AWS. I have changed the configuration for Solr and Fusion (fusion.properties) so that they only use the AWS instances.
The problem is that Fusion's Solr cluster (System->Solr Clusters) associated with all of my collections is still set to the old Zookeepers :9983,:9983,:9983, so if I turn off all of the old Zookeeper instances, my queries through Fusion's Query API no longer work. When I try to change the "Connect String" for that cluster, it fails because the cluster is currently in use by collections. I am able to create a new cluster, but I can see no way to associate the new cluster with any of my collections. In a test environment set up like production, I changed the searchClusterId for a specific collection using Fusion's Collections API; however, the queries still fail when I turn off all of the "old" Zookeeper instances. This seems like the right approach, so I'm surprised that it doesn't work.
So far, Lucidworks's support has not been able to provide a solution - I am open to suggestions.
This is what I came up with to solve this problem.
I created a test environment with an AWS Fusion UI/API/etc., local Solr, AWS Solr, local ZK, and AWS ZK.
1. Configure Fusion and Solr to only have the AWS ZK configured
2. Configure the two ZKs to be an ensemble
3. Create a new Solr Cluster in Fusion containing only the AWS ZK
4. For each collection in Solr:
   a. GET the json from <fusion_url>:8764/api/collections/<collection>
   b. Edit the json to change “searchClusterId” to the new cluster defined in Fusion
   c. PUT the new json to <fusion_url>:8764/api/collections/<collection>
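The per-collection loop in step 4 could be scripted roughly as follows. This is a minimal sketch, not a tested Fusion client: the port, the /api/collections/<collection> path, and the searchClusterId field come from the steps above, while the function names, the use of HTTP basic auth, and every value in the usage comment are assumptions.

```python
import base64
import json
import urllib.request


def retarget(collection_json: dict, new_cluster_id: str) -> dict:
    """Return a copy of the collection definition pointing at new_cluster_id."""
    updated = dict(collection_json)
    updated["searchClusterId"] = new_cluster_id
    return updated


def _request(url, user, password, method="GET", body=None):
    # Assumption: the endpoint accepts HTTP basic auth.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}",
               "Content-Type": "application/json"}
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def move_collection(fusion_url, collection, new_cluster_id, user, password):
    url = f"{fusion_url}:8764/api/collections/{collection}"
    current = _request(url, user, password)                    # step a: GET
    updated = retarget(current, new_cluster_id)                # step b: edit
    _request(url, user, password, method="PUT", body=updated)  # step c: PUT
    return updated


# Hypothetical usage, once per collection:
# move_collection("http://fusion.example.com", "my_collection",
#                 "aws-cluster", "admin", "password")
```

Keeping the edit in a separate pure function (retarget) makes it easy to inspect the changed JSON before anything is sent back to Fusion.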
After doing all of this, I was able to change the “default” Solr cluster in the Fusion Admin UI to confirm that no collections were using it (I wasn’t sure if anything would use the ‘default’ cluster so I thought it would be wise to not take the chance).
I was able to then stop the local ZK, put the AWS ZK in standalone mode, and restart Fusion. Everything seems to have started without issues.
I am not sure that this is the best way to do it, but it solved the problem as far as I could determine.

Change from local to external host

I am running yugabyte using yb-ctl create, with --rf 3 to create a 3-node cluster. How can I make it listen on the external IP address instead of localhost, and run on three different IPs?
yb-ctl only works for local deployments for quick debugging or testing. To bring up yugabyte on three separate hosts, you can follow the instructions at https://docs.yugabyte.com/latest/deploy/manual-deployment/. The commands there are for 4 different hosts but it should be very similar for 3 hosts.
Indeed, yb-ctl is for local clusters on a single node and is not meant to be used for multi-node deployments. In addition to the manual install option, there are a number of orchestrated multi-node deployment options available:
Terraform on any cloud
CloudFormation in AWS, Deployment Manager in GCP, and ARM templates in Azure
If Kubernetes is of interest, that's another easy way to deploy, using Operators or Helm charts.

Flink EMR Installation

I am new to Flink and am trying to deploy it on an EMR cluster. I have used a 3-node cluster (1 master and 2 slaves) and have not changed anything from the default configuration.
I am curious to understand the following points:
How do the master and slaves communicate with each other, given that I have not listed any IPs in conf/slaves on the master node?
I can see a Flink library on the master node (path: /usr/lib/flink) but cannot find one on the slave nodes. How is my code getting executed on the slave nodes?
I will change some settings in conf/flink-conf.yaml according to my requirements, if needed. Do I need to make any other changes on the master or slave nodes apart from this?
See the Running flink-crawler in EMR wiki page for details on how we run a Flink streaming job on top of EMR. Note that in this mode Flink is running via YARN, thus the Flink conf/slaves file isn't being used. You should also take a look at the YARN Setup documentation to better understand how Flink runs on top of YARN.

How to install Flink on Mesos cluster without DC/OS?

I am a newbie in Apache Flink, and our team is trying to set up an Apache Flink cluster on Apache Mesos. We have already installed Apache Mesos and Marathon with 3 master nodes and 3 slaves, and now we are trying to install Apache Flink without DC/OS as described here: https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/mesos.html#mesos-without-dcos.
I have a couple of questions:
Do we need to download Flink on all the nodes (master and slaves) and configure mesos.master on all nodes?
Or shall we download Flink on only one master node and configure mesos.master there?
If Flink needs to be downloaded on all the nodes, what should the location of the Flink directory be, or is there a script where I can specify that?
Is running "mesos-appmaster.sh" on the master node also responsible for running Flink libraries and classes on the slaves?
Thanks
Do we need to download Flink on all the nodes (master and slaves) and configure mesos.master on all nodes?
No, you don't. Actually, it depends on how you want to run Flink. In your setup the most convenient way would be to run it with Marathon and download the binaries during deployment. See this
Or shall we download Flink on only one master node and configure mesos.master there?
It's up to you. You can run Flink on a dedicated server or let Marathon do it for you. If you already have Marathon, it's easier to run Flink with Marathon. On the other hand, for debugging and proof-of-concept work I'd recommend the standalone version, where you can quickly change the configuration on a local machine and see how it works. Creating Docker images or binaries, publishing them to a repository, and finally deploying Flink on Marathon adds overhead that will slow you down during development but will keep you safer in production. Flink on Mesos does not by itself restart a failed master process, so Marathon is required to provide basic High Availability (HA) support (launching a new instance of Flink when an agent crashes).
If Flink needs to be downloaded on all the nodes, what should the location of the Flink directory be, or is there a script where I can specify that?
Flink does not have to be downloaded on all nodes. It can be downloaded when needed at deployment.
Is running "mesos-appmaster.sh" on the master node also responsible for running Flink libraries and classes on the slaves?
Yes. Flink acts as a Mesos scheduler, which means that it starts tasks and executors on Mesos when needed.
Even when not using DC/OS, feel free to look at the Apache Flink DC/OS package. At its core, it is a Marathon app definition that you can deploy on pure Marathon/Mesos. The Flink package (as of today) does not require any DC/OS-specific features.
The DC/OS example might also provide useful information.

Flink add Task/JobManagers to cluster

The procedure for adding new Task/JobManagers to an existing running cluster can be found here (https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/cluster_setup.html#adding-jobmanagertaskmanager-instances-to-a-cluster).
However, if we shut down the cluster and start it again, the information about the added hosts will be lost.
When adding a new host to the cluster, is it safe practice to also update and save the "masters" and "slaves" configuration files on all nodes at the same time?
Yes, it is absolutely safe. The masters and slaves files are read only by the startup scripts.
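Recording a newly added host can be as simple as appending it to the "slaves" (or "masters") file if it is not already listed. This is a minimal sketch, assuming a plain one-host-per-line file; the function name and the path in the usage comment are hypothetical, and copying the file to the other nodes is left out.

```python
from pathlib import Path


def add_host(hosts_file: str, host: str) -> bool:
    """Append host to a one-host-per-line file; return True if it was added."""
    path = Path(hosts_file)
    lines = path.read_text().splitlines() if path.exists() else []
    if host in (line.strip() for line in lines):
        return False  # already listed, nothing to do
    lines.append(host)
    path.write_text("\n".join(lines) + "\n")
    return True


# Hypothetical usage on each node:
# add_host("/opt/flink/conf/slaves", "worker-4.example.com")
```

Because the call is idempotent, it can be run on every node each time a host is added without creating duplicate entries.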
