Autostart zookeeper instances - solr

Hello,
I run a 2-machine setup with 5 ZooKeeper instances on it. I know that normally a minimum of 3 machines is required to run a small ZooKeeper quorum, but for now I need to start with these 2 machines. Now I want to create a script which autostarts all the ZooKeeper instances automatically in case of crashes or reboots. Ultimately I want to build a stable environment which automatically recovers the following services:
solr
solrcloud
zookeeper
shardallocation
Does somebody have any experience with this?

You require a good monitoring system for this. A simpler solution would be to write cron jobs for all these boxes. These cron jobs would run curl or wget commands and check the output. If the output of the command is not as expected, restart your services. Also add the services to your startup with /etc/init.d so the services start on reboot.
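As a rough illustration, a minimal sketch of such a cron-driven check for the ZooKeeper instances (the ports, paths and per-instance init script names are assumptions, not from this thread); a healthy ZooKeeper answers the four-letter command "ruok" with "imok":
#!/bin/bash
# check_zk.sh -- restart any local ZooKeeper instance that stops answering "ruok"
for port in 2181 2182 2183 2184 2185; do
  if [ "$(echo ruok | nc -w 2 localhost $port)" != "imok" ]; then
    /etc/init.d/zookeeper-$port restart   # hypothetical per-instance init script
  fi
done
A crontab entry such as */2 * * * * /usr/local/bin/check_zk.sh would run it every two minutes, and an equivalent curl check against the Solr admin URL would cover the Solr side.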

Related

How to run kubectl exec on scripts that never end

I have an mssql pod whose metrics I need to export using sql_exporter.
I was able to set up this whole thing manually fine:
download the binary
install the package
run ./sql_exporter on the pod to start listening on port for metrics
I tried to automate this using kubectl exec -it ... and was able to do steps 1 and 2. When I try to do step 3 with kubectl exec -it "$mssql_pod_name" -- bash -c ./sql_exporter, the script just hangs, which I understand, as the server is going to keep listening forever, but this blocks the rest of my installation scripts.
I0722 21:26:54.299112 435 main.go:52] Starting SQL exporter (version=0.5, branch=master, revision=fc5ed07ee38c5b90bab285392c43edfe32d271c5) (go=go1.11.3, user=root#f24ba5099571, date=20190114-09:24:06)
I0722 21:26:54.299534 435 config.go:18] Loading configuration from sql_exporter.yml
I0722 21:26:54.300102 435 config.go:131] Loaded collector "mssql_standard" from mssql_standard.collector.yml
I0722 21:26:54.300207 435 main.go:67] Listening on :9399
<nothing else, never ends>
Any tips on just silencing this and letting it run in the background (I cannot Ctrl-C, as that will stop the port listening)? Or is there a better way to automate the plugin install upon pod deployment? Thank you.
To answer your question:
This answer should help you. You should (!?) be able to use ./sql_exporter & to run the process in the background (when not using --stdin --tty). If that doesn't work, you can try nohup as described by the same answer.
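For example, a minimal sketch of what that could look like in the installation script (the pod name variable comes from the question; the working directory and log file name are assumptions):
kubectl exec "$mssql_pod_name" -- bash -c "cd /path/to/sql_exporter && nohup ./sql_exporter > sql_exporter.log 2>&1 &"
Without --stdin/--tty the exec call should return as soon as the shell exits, while the exporter keeps listening on :9399 inside the container.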
To recommend a better approach:
Using kubectl exec is not a good way to program a Kubernetes cluster.
kubectl exec is best used for debugging rather than deploying solutions to a cluster.
I assume someone has created a Kubernetes Deployment (or similar) for Microsoft SQL Server. You now want to complement that Deployment with the exporter.
You have options:
Augment the existing Deployment and add the sql_exporter as a sidecar (another container) in the Pod that includes the Microsoft SQL Server container. The exporter accesses the SQL Server via localhost. This is a common pattern when deploying functionality that complements an application (e.g. logging, monitoring).
Create a new Deployment (or similar) for the sql_exporter and run it as a standalone Service. Configure it to scrape one or more Microsoft SQL Server instances.
Both these approaches:
take more work, but they're "more Kubernetes" solutions and provide better repeatability, auditability, etc.
require that you create a container for sql_exporter (although I assume the exporter's authors already provide this).

Apache Solr 6.6.2 in SolrCloud mode, not working for new replica

I have installed Apache Solr 6.6.2 on 2 different systems, and I have to run Solr in Cloud mode, which I have done successfully. Now I want to create one shard with 2 replicas. For that I have run the following command:
bin/solr create -c myCollection -d use_configs -n conf1 -replicationFactor 2
At the time of the above command's execution, there was only one node live, so it created one replica and all index data resided in the corresponding Solr home. When I started the second Solr (on a separate machine), it replicated the index to the second machine as well (as expected due to the replication factor of 2). But after that I had to replace the second machine with a new one. I did the same setup and ran the command below on the new machine:
bin/solr start -cloud -s tmp/solr -p 7900 -z zk-ip:2181
Solr on the new machine starts successfully, but it does not replicate the index to this new machine. Is there any configuration I missed on this new system?
Also, the admin dashboard shows that only one replica is available, on the first system, but there is no indication of the second system. Why is Solr behaving like this? I think that if I add a new system, the index should be replicated to it, as I set the replication factor to 2 at the time of creating the shard.
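One likely explanation, offered here as an assumption rather than something stated in the thread: in Solr 6.x the replicationFactor only drives placement at collection-creation time, and starting a fresh node does not automatically pull a replica onto it. A minimal sketch of adding the replica to the new node by hand via the Collections API (the shard name, host and port are guesses based on the question):
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=myCollection&shard=shard1&node=new-host:7900_solr"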

Apache Flink: multiple task managers in local mode

In local mode, can I start another task manager?
Although I tried ./taskmanager.sh start, the number of task managers
on the web dashboard didn't change.
The command ./taskmanager.sh start -m localhost:6123 didn't work either.
What should I do?
Is it impossible to start multiple task managers in local mode?
To start another taskmanager, you should run the following inside the flink binary directory:
bin/taskmanager.sh start
This should update the number of taskmanagers on the web dashboard and give you output like this:
[INFO] 1 instance(s) of taskmanager are already running on my-localhost.
Starting taskmanager daemon on host my-localhost.
From my understanding, you want to set up a standalone cluster on your local machine. If that is the case, you could simply edit $FLINK_DIST/conf/workers, in which each line represents a TM host. By default, there is only one TM on localhost. In your case, you could add another line 'localhost' to it. Then execute $FLINK_DIST/bin/start-cluster.sh, and you will see a standalone cluster with two TMs on your local machine.
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/cluster_setup.html#configuring-flink
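A minimal sketch of that approach on Linux/macOS (assuming $FLINK_DIST points at the Flink distribution and the cluster is currently stopped):
echo "localhost" >> $FLINK_DIST/conf/workers   # add a second TaskManager entry for the local host
$FLINK_DIST/bin/start-cluster.sh               # starts the JobManager plus two TaskManagers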
This option won't work on Windows.
https://lists.apache.org/thread.html/r7693d0c06ac5ced9a34597c662bcf37b34ef8e799c32cc0edee373b2%40%3Cdev.flink.apache.org%3E

How to create solr cloud as a windows service

We are using SolrCloud as a search service, and currently we run it from the Windows command prompt, but I don't know how we can set up SolrCloud as a Windows service in a production environment.
I referred to the document below for this:
http://www.norconex.com/how-to-run-solr5-as-a-service-on-windows/
but it is not working as expected for SolrCloud.
Can anybody please help me with this?
Thanks,
Santosh
You need to give some information on what is not working as expected.
The link shown looks like it will get you some of the way, but you obviously need to run a couple of instances, give them different home locations, and probably set up dependencies between the services to ensure that the one running ZooKeeper starts first. You should already have all of this from running it on the command line, so you should only need to put the corresponding parameters into the corresponding fields in the GUI.
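For example, a hedged sketch of the command-line equivalent with a service wrapper such as NSSM (service names, paths and ports below are illustrative, not taken from this thread):
nssm install SolrNode1 "C:\solr-5.2.1\bin\solr.cmd" start -f -c -p 8983 -z localhost:2181 -s "C:\solr-5.2.1\example\cloud\node1\solr"
nssm install SolrNode2 "C:\solr-5.2.1\bin\solr.cmd" start -f -c -p 8984 -z localhost:2181 -s "C:\solr-5.2.1\example\cloud\node2\solr"
sc config SolrNode2 depend= SolrNode1
The -f flag keeps each Solr node in the foreground so the wrapper can supervise it, and the depend= setting asks Windows to start the first node before the second, mirroring the start-order dependency mentioned above.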
Sorry for the late reply, I was on vacation.
I can run SolrCloud as a Windows service, but for that I first need to create a complete SolrCloud instance setup using the command:
solr -e cloud -z localhost:2181
Then I need to stop all the running Solr ports in the command prompt, by closing the command prompt.
Then I configure an individual Windows service for each running Solr port, as below:
restart -c -f -p 8984 -z 0.0.0.0:2181 -s "C:/solr-5.2.1/example/cloud/node1/solr"
and so on for every port.
This way I can configure each running Solr port as a Windows service.
But I want to know: is there any command in SolrCloud which will create the SolrCloud setup and run all the Solr ports under a single Windows service instance?
A command like "solr -f -e cloud -z localhost:2181 -noprompt" does that, but it runs only on the one default port, i.e. 8983, and I want to configure the SolrCloud setup for at least two ports, just like "solr -e cloud -z localhost:2181 -noprompt" configures by default in solr.cmd.
Thanks,
Santosh

Running batch file remotely using Hudson

What is the simplest way to schedule a batch file to run on a remote machine using Hudson (latest and greatest version)? I was exploring the master/slave setup. I created a dumb slave, but I am not sure what the parameters should be so that I can trigger the batch file on the remote slave machine.
Basically, I am trying to run 2 different batch files on two different remote machines sequentially, triggered from my machine (the master). The step-by-step guide on the Hudson website is a dead link. There are similar questions posted on SO, but it does not quite work for me when I use the parameters they mention.
If anyone has done something similar please suggest ways to make this work.
(I know how to set up jobs, add a step to run a batch file, etc.; what I am having trouble configuring is doing this on a remote machine using Hudson's built-in features.)
UPDATE
Thank you all for the suggestions. Quick update on this:
What I wanted to get done is partially working; below are the steps I followed to get to it:
Created a new node from Manage Nodes -> New Node -> set # of executors to 1, set Remote FS root to '/var/hudson', set Launch method to JNLP, set the slave name, and saved.
Once the slave was set up (from the master machine), I logged into the slave's physical machine, downloaded the _slave.jar from http://masterserver:port/jnlpJars/slave.jar, and ran the following from the command line at the download location: java -jar _slave.jar -jnlpUrl http://masterserver:port/computer/slavename/slave-agent.jnlp. The connection was made successfully.
Checked 'Restrict where this project can be run' in the master job configuration, and set the parameter to the slave name.
Used "Add Build Step" to add my batch job script.
What I am still missing now is a way to connect to 2 slaves from one job in sequence; is that possible?
It is fairly easy and straightforward. Let's assume you already have a slave running. Then you configure the job as if you were locally on the target box. The setting for 'Restrict where this project can be run' needs to be the node that you want to run on. This is all for the job configuration.
For the slave configuration read the following pages.
Installing Hudson as a Windows service
Distributed builds
On Windows I prefer to run the slave as a service and let the remote machine manage the startup and shutdown of the slave. The only disadvantage with this is that you need to upgrade the client every time you update the server: just get the new client.jar from the server after the upgrade and put it on the slave. Then restart the slave and you are done.
I had trouble using the install-as-a-service option for the slave, even though I did it as a local administrator. I then used srvany to wrap the jar into a service. Here is a blog about it. The command that you need to wrap you will get from your Hudson server on the slave page. For all of this to work, you should set up the slave management as JNLP.
If you have an SSH server on your target machine, you can use the SSH slave settings. These work for me like a charm. I use them with my Unix slaves. So far the SSH option with Unix is less of a hassle than the Windows service clients.
I had some similar trouble with slave setup and wrote up this blog post - I was running on Linux rather than Windows, but hopefully this will help.
I don't know how to use built-in Hudson features for this job, but in one of my project builds, I run a batch file that in turn uses PsTools
to run the job on a remote server. I found PsTools extremely easy to use (download, unpack and run the command with the right parameters), hence I opted to use this.
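A minimal sketch of that PsTools approach (machine names, credentials and batch file paths are placeholders, not from the thread); psexec blocks until the remote command finishes, so the two machines run in sequence:
psexec \\machine1 -u DOMAIN\builduser -p secret C:\scripts\first.bat
psexec \\machine2 -u DOMAIN\builduser -p secret C:\scripts\second.bat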