Is it possible to run a script when an Icinga2 host has a critical service? - nagios

Let's say a given host, FooHost, is running Apache2. Icinga2 runs a check via an SSH command and discovers that Apache2 is not running, which triggers a critical alert.
Is it possible to have Icinga2 execute a script on this event? In this example, I would like to write a script that does an SSH remote execution of systemctl restart apache2.
Alternatively, we could write a watchdog script and deploy it on all servers, but it makes much more sense to write it on the Icinga2 box and use SSH remote execution, because that allows central control.
I see no reason to require an engineer to log in to fix this unless the restart also failed.

You can use Event Commands (like Event Handlers in Icinga 1.x / Nagios) to achieve this.
The documentation shows the following example, which uses a custom shell script to perform the restart operation:
object EventCommand "restart_service" {
  command = [ PluginDir + "/restart_service" ]

  arguments = {
    "-s" = "$service.state$"
    "-t" = "$service.state_type$"
    "-a" = "$service.check_attempt$"
    "-S" = "$restart_service$"
  }

  vars.restart_service = "$procs_command$"
}

object Service "Process httpd" {
  check_command = "procs"
  event_command = "restart_service"
  max_check_attempts = 4

  host_name = "icinga2-client1.localdomain"
  command_endpoint = "icinga2-client1.localdomain"

  vars.procs_command = "httpd"
  vars.procs_warning = "1:10"
  vars.procs_critical = "1:"
}
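The restart_service script itself is not shown in that snippet; a minimal sketch of what such an event handler could look like (the systemctl call and any sudo setup are assumptions, not part of the Icinga2 example) is:

#!/usr/bin/env bash
# Hypothetical event handler matching the EventCommand arguments above.
# It only acts once the problem has become a HARD CRITICAL state, which is
# the classic event-handler pattern.

while getopts "s:t:a:S:" opt; do
  case "$opt" in
    s) SERVICE_STATE="$OPTARG" ;;    # OK / WARNING / CRITICAL / UNKNOWN
    t) STATE_TYPE="$OPTARG" ;;       # SOFT or HARD
    a) CHECK_ATTEMPT="$OPTARG" ;;    # current check attempt number
    S) SERVICE_NAME="$OPTARG" ;;     # e.g. httpd or apache2
  esac
done

if [ "$SERVICE_STATE" = "CRITICAL" ] && [ "$STATE_TYPE" = "HARD" ]; then
  echo "Restarting $SERVICE_NAME (check attempt $CHECK_ATTEMPT)"
  sudo systemctl restart "$SERVICE_NAME"
fi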

As far as I know this is not possible, but you can use a Nagios NRPE command or a cron job which executes a command like this:
pgrep apache2 || /bin/systemctl restart apache2 > /dev/null 2>&1
or
/bin/systemctl status apache2 || /bin/systemctl restart apache2
which means that if the apache2 service is not running, it will be restarted.
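If you go the cron route, a minimal crontab sketch (assuming root's crontab and a systemd-managed apache2 unit) would be:

# check every 5 minutes; restart apache2 only if no matching process is found
*/5 * * * * pgrep apache2 > /dev/null || /bin/systemctl restart apache2 > /dev/null 2>&1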

Related

Connect to docker sqlserver via ssh

I've created a Docker container that contains an MSSQL database. On the command line, ip a gives an IP address for the container; however, trying to ssh into it with username@docker_ip_address yields ssh: connect to host ip_address port 22: Connection refused. So I'm wondering whether I'm even able to ssh into the container, so I don't always have to use the docker tool docker exec ...., and if so, how would I go about doing that?
To SSH into the container you need to fulfill the following:
An SSH server (OpenSSH) should be installed within the container and the ssh service should be running.
Port 22 should be published from the container (when you run the container). More info here: Publish ports on Docker
The docker ps command should show port 22 as mapped.
Hope the above information helps you understand the situation.
If your container contains a database server, the normal way to interact with it will be through an SQL client that connects to it; Google suggests SQL Server Management Studio, and connector libraries exist for popular languages. I'm not clear what you would do given a shell in the container, and my main recommendation here would be to focus on working with the server in the normal way.
Docker containers normally run a single process, and that's normally the main server process. In this case, the container runs only SQL Server. As some other answers here suggest, you'd need to significantly rearchitect the container to even have it be possible to run an ssh daemon, at which point you need to worry about a bunch of other things like ssh host keys and user accounts and passwords that a typical Docker image doesn't think about at all.
Also note that the Docker-internal IP address (what you got from ip addr; what docker inspect might tell you) is essentially useless. There are always better ways to reach a container (using inter-container DNS to communicate between containers; using the host's IP address or DNS name to reach published ports from the same or other hosts).
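As a sketch of that normal path (the image tag, container name, and sa password below are placeholders; the official mcr.microsoft.com/mssql/server image is assumed):

# start the container with the SQL Server port published
docker run -d --name mssql -p 1433:1433 \
  -e ACCEPT_EULA=Y -e SA_PASSWORD='YourStrong!Passw0rd' \
  mcr.microsoft.com/mssql/server:2019-latest

# connect from the host with an SQL client instead of ssh
sqlcmd -S localhost,1433 -U sa -P 'YourStrong!Passw0rd'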
Basically, alter your Dockerfile to something like the following, which installs openssh-server, replaces the prohibitive default config, and starts the service:
# FROM a-image-with-mssql
# set a root password so we can log in over ssh (change "toor" to something sensible)
RUN echo "root:toor" | chpasswd
# install the OpenSSH server
RUN apt-get update && apt-get install -y openssh-server
COPY entrypoint.sh .
# fetch an sshd_config that permits root login with a password (downloaded to /)
RUN cd /; wget https://gist.githubusercontent.com/spekulant/e04521d6c6e1ccffbd3455c673518c5b/raw/1e4f6f2cb32caf3a4a9f73b02efdcbd5dde4ba7a/sshd_config
RUN rm /etc/ssh/sshd_config; cp /sshd_config /etc/ssh/
ENTRYPOINT ["./entrypoint.sh"]
# further commands
Now you've got yourself an image with an SSH server inside. All you have to do is start the service. You can't do RUN service ssh start because it won't work (Docker specifics; refer to the documentation). You have to use an entrypoint like the following:
#!/bin/bash
set -e
# start the SSH daemon, then hand control to the container's main command
service ssh start
exec "$@"
Put it in a file named entrypoint.sh next to your Dockerfile, and remember to make it executable with chmod 755 entrypoint.sh. One thing to mention: you still wouldn't be able to ssh into the container, because the default SSH server configuration doesn't allow logging in to the root account with a password. So either change the config yourself and provide it to the image, or trust me and use the file I created (inspect it via the link in the Dockerfile; nothing malicious there, only a change from prohibit-password to yes).
Fortunately for us, the official MSSQL images are based on Ubuntu, so all the commands above fit the environment perfectly.
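Roughly, usage would then look like this (the host ports are arbitrary choices):

docker build -t mssql-ssh .
docker run -d -p 2222:22 -p 1433:1433 mssql-ssh
ssh root@localhost -p 2222    # password "toor", as set in the Dockerfile above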
Edit
Be sure to ask if something is unclear or I'm jumping too fast.

Setup Nagios check_clamd

I'm getting a (No output returned from plugin) from a host and cannot understand why:
Service on monitor server:
# Check Clamd availability
define service {
    hostgroup_name          clamd-servers
    service_description     ClamAV Daemon
    check_command           check_nrpe!check_clamd
    use                     generic-service
    notification_interval   0    ; set > 0 if you want to be renotified
}
Hosts on monitor:
# Clamd Servers
define hostgroup {
    hostgroup_name  clamd-servers
    alias           ClamAV servers
    members         fsmvps
}
nrpe_local.fcfg on host fsmvps
command[check_clamd]=/usr/lib/nagios/plugins/check_clamd -H /var/run/clamav/clamd.ctl
Running the command /usr/lib/nagios/plugins/check_clamd -H /var/run/clamav/clamd.ctl on the host will produce the following output as clam is up and running:
CLAMD OK - 0.000 second response time on socket /var/run/clamav/clamd.ctl [PONG]|time=0.000219s;;;0.000000;10.000000
I'm clueless at the moment as to why no output is returned, as I'm a beginner with Nagios.
Perhaps your NRPE service wasn't set up right (sometimes it complains about SSL).
Running something like the following on your monitor server (as the nagios user) might help diagnose things:
/usr/lib/nagios/plugins/check_nrpe -H fsmvps -c check_clamd
It might be:
Permissions (can the nagios user on fsmvps read /var/run/clamav/clamd.ctl? see the check below)
check_nrpe needs the -n flag or a different port
You've not restarted NRPE on the fsmvps server after editing its config
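For the permissions point, a quick way to check is to run the plugin on fsmvps as the NRPE daemon's user (often nagios or nrpe, depending on the package):

sudo -u nagios /usr/lib/nagios/plugins/check_clamd -H /var/run/clamav/clamd.ctl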

Add a plugin from Nagios Exchange to Nagios 3.x

I just finished installing Nagios 3 on an Ubuntu server and I'm not sure how I can add a third-party plugin to it.
The plugin is available: Here
Thanks in advance for your help
You didn't mention any information about the server that you want to monitor with Nagios.
I'm going to assume it's an Ubuntu Linux server and it's not the same server as the machine you installed Nagios on.
On the server to be monitored:
Ensure that NRPE (Nagios Remote Plugin Executor) is installed. Here's a link to instructions for installing NRPE on the Ubuntu operating system.
http://tecadmin.net/install-nrpe-on-ubuntu/
After you install NRPE on the server to be monitored, it's very important that you edit the nrpe.cfg file (most likely found at /etc/nagios/nrpe.cfg, but this can differ based on your installation method).
You need to modify the allowed_hosts configuration line to include the IP address of your Nagios server. If you don't, NRPE will refuse connection attempts from Nagios and you won't be able to run your Nagios plugin or report results back to Nagios.
Be sure to restart NRPE after you've modified nrpe.cfg.
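For example (the service name varies by distribution; nagios-nrpe-server is the usual name on Ubuntu):

sudo service nagios-nrpe-server restart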
Next you'll need to download the Nagios plugin to the server being monitored. For example:
wget --directory-prefix=/usr/lib/nagios/plugins/ https://github.com/thehunmonkgroup/nagios-plugin-file-ages-in-dirs/archive/v1.1.tar.gz
cd to your nagios plugins directory and extract the tar-gzipped archive you just downloaded:
cd /usr/lib/nagios/plugins/
tar zxvf v1.1.tar.gz
ls -al /usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs
Be sure to give the nagios plugin script execute permissions:
chmod a+x /usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs
With the nagios plugin now residing on your server to be monitored, you will need to define some command definitions on that same server.
First you need to find the path that NRPE will search for new command definitions that you manually add to the system.
To do this, grep your nrpe.cfg file for the term "include_dir".
For example:
grep include_dir /etc/nagios/nrpe.cfg
include_dir=/etc/nrpe.d/
If no result for "include_dir" is returned from your grep, add the above "include_dir" configuration to your nrpe.cfg file. Ensure that the /etc/nrpe.d/ folder is created.
Create a new file in your include_dir named check_file_ages_in_dirs.cfg. Add to check_file_ages_in_dirs.cfg a command definition for check_file_ages_in_dirs pointing to the path of your Nagios plugin and including the arguments necessary to execute it.
For example:
echo "command[check_file_ages_in_dirs]=/usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs -d \"/tmp\" -w 24 -c 48" >> /etc/nrpe.d/check_file_ages_in_dirs.cfg
cat /etc/nrpe.d/check_file_ages_in_dirs.cfg
command[check_file_ages_in_dirs]=/usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs -d "/tmp" -w 24 -c 48
For the above, I hard-coded warning and critical thresholds of 24 hours and 48 hours. I've also hard-coded the directory to check as "/tmp".
Attempt to execute the nagios plugin script locally to confirm it's working correctly:
/usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs -d "/tmp" -w 24 -c 48
OK: 1 dir(s) -- /tmp: 1 files
Ensure the nrpe user has read permissions on your check_file_ages_in_dirs.cfg file:
chmod a+r /etc/nrpe.d/check_file_ages_in_dirs.cfg
Restart your nrpe service, as per the instructions in http://tecadmin.net/install-nrpe-on-ubuntu/
You also need to ensure that if you have any firewall rules in place, they allow tcp traffic to port 5666.
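For example, if the monitored host uses ufw (adjust for iptables or whatever firewall you run):

sudo ufw allow 5666/tcp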
On your Nagios server:
From your Nagios server, you'll need to manually run check_nrpe against your host to be monitored so as to verify correct functioning of the Nagios plugin and correct NRPE configuration.
Find the location of your check_nrpe file. On my installation, it's located at /usr/local/nagios/libexec/check_nrpe, but this could be different for your installation.
find / -name "check_nrpe" -type f
/usr/local/nagios/libexec/check_nrpe
If you don't have check_nrpe, you'll need to install it on your Nagios server.
apt-get install nagios-nrpe-plugin
First execute check_nrpe against your server to be monitored with no remote command arguments. This is just to confirm that NRPE is running on your remote server and it's correctly configured to allow connections from your Nagios server.
Note: For this example I'll pretend the IP address of the host I want to monitor is 10.0.0.1. Replace this with the IP address of the host you want to monitor.
/usr/local/nagios/libexec/check_nrpe -H 10.0.0.1
NRPE v2.14
The check_nrpe command above should return the version number of the NRPE agent running on the remote host if it's configured correctly.
Next attempt to manually invoke the Nagios plugin via NRPE:
/usr/local/nagios/libexec/check_nrpe -H 10.0.0.1 -c check_file_ages_in_dirs
OK: 1 dir(s) -- /tmp: 1 files
If you get output similar to the above, then it's time to move on to defining hosts, services, and commands on your Nagios server.
It would be cleaner to define separate configuration files for host, service, and command definitions. But that's outside of the scope of this post.
For now, we'll define these things in the default Nagios configuration file (nagios.cfg).
First locate your nagios.cfg file:
find / -name "nagios.cfg" -type f
/usr/local/nagios/etc/nagios.cfg
Edit the nagios.cfg file.
Add a host definition for the server you wish to monitor:
define host {
    host_name               Remote-Host
    alias                   Remote-Host
    address                 10.0.0.1
    use                     linux-server
    contact_groups          admins
    notification_interval   0
    notification_period     24x7
    notifications_enabled   1
    register                1
}
Add a command definition for the remote execution of check_file_ages_in_dirs:
define command {
    command_name    check_file_ages_in_dirs
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_file_ages_in_dirs
    register        1
}
Add a service definition that will reference the check_file_ages_in_dirs command:
define service {
    service_description     check_file_ages_in_dirs
    use                     generic-service
    check_command           check_file_ages_in_dirs
    host_name               Remote-Host
    contact_groups          admins
    notification_interval   0
    notification_period     24x7
    notifications_enabled   1
    flap_detection_enabled  1
    register                1
}
Save and exit your nagios.cfg file.
Validate your Nagios configuration file:
nagios -v /usr/local/nagios/etc/nagios.cfg
If no errors are reported, restart your Nagios service.
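For example (the service name depends on how Nagios was installed):

sudo service nagios restart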
Check the Nagios Web UI, and you should see your check_file_ages_in_dirs service monitoring your remote host.

How to copy a file from an SSH remote host to the Jenkins server

We are using a Jenkins server for our daily build process and execute some bash scripts on remote hosts over SSH. These scripts generate HTML log files on the remote hosts.
We are using the Copy To Slave plugin to copy files to slave machines and the Publish Over SSH plugin to manage SSH sessions in the build process.
Now the question: we want to copy some files (the scripts' log files) from the remote SSH host to the Jenkins server.
What is the best option for this (a plugin would be preferable, if one exists)?
EDIT:
sshpass is an option, but I'm looking for a plugin or a better way to do the job.
Use the sshpass command to send the file in
Build Environment -> Execute shell script on remote host using ssh ->
Post build script
Sample command:
sshpass -p "password" scp path/of/file <new_server_ip>:/path/of/file
This skips the password prompt for the scp command and supplies the password to scp.
I think you can generate an SSH key pair and pass it to the slave as a parameter with, for example, the Config File Provider Plugin.
Then just use scp to retrieve the files, using this key pair for authentication, as in the sketch below.
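A rough sketch of that build step, assuming the Config File Provider plugin exposes the private key at a path held in a variable such as $SSH_KEY_FILE (a hypothetical name):

# copy the remote log files back into the Jenkins workspace using the provided key
scp -i "$SSH_KEY_FILE" -o StrictHostKeyChecking=no \
    user@remote-host:/path/of/file "$WORKSPACE/"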
Obviously way too late, but in case you're already using publish-over-ssh, want to avoid duplicating the credentials, and have a shared library, you can use this piece of Groovy to obtain the host configuration:
import jenkins.plugins.publish_over_ssh.*

@NonCPS
def getSSHHost(name) {
    def found = null
    Jenkins.instance.getDescriptorByType(BapSshPublisherPlugin.Descriptor.class).each {
        it.hostConfigurations.each { host ->
            if (host.name == name) {
                found = host
            }
        }
    }
    found
}
As mentioned, this either requires a Global Shared Library (so that your code is trusted) or (probably) a number of admin approvals, sorry for that.
This returns a BapSshHostConfiguration.
For a password connection you can do:
def sshHost = getSSHHost('Configuration Name')
def host = [host: sshHost.hostname, user: sshHost.username, password: sshHost.password]
sshHost = null
sh("""
set +x
sshpass -p "${host.password}" scp -o StrictHostKeyChecking=no ${host.user}@${host.host}:filename.extension .
set -x
""")
This copies the file to your local work directory.
Probably not the best code ever, but I'm not a groovy specialist. It works and that is enough for me. (the set +x is to avoid it echoing the command in the log, showing the password). Getting rid of anything Non-CPS (sshHost = null) before you perform a CPS call saves you a lot of headaches :)
Since it took me quite a while to figure out I wanted to share this for whoever comes next.

How to bind an app to a user when they connect to the server with a terminal

My app has now been packaged as a product, which will be sold with a PC with Linux installed. I will create a new user for the customers, but I want to bind an interface-like app to that user, so that when my customers log in via terminals the selected app runs automatically, and when the connection ends, the app quits in the same way.
I know this can probably be implemented programmatically, but...
Do you have any suggestions?
Thanks, all help appreciated.
As mentioned by AProgrammer, you can run your app as the user's shell or from the profile, as in this example:
# ~/.profile: executed by the command interpreter for login shells.
# This file is not read by bash(1), if ~/.bash_profile or ~/.bash_login
# exists.
# see /usr/share/doc/bash/examples/startup-files for examples.
# the files are located in the bash-doc package.
# the default umask is set in /etc/profile; for setting the umask
# for ssh logins, install and configure the libpam-umask package.
#umask 022
# if running bash
if [ -n "$BASH_VERSION" ]; then
    # include .bashrc if it exists
    if [ -f "$HOME/.bashrc" ]; then
        . "$HOME/.bashrc"
    fi
fi

# set PATH so it includes user's private bin if it exists
if [ -d "$HOME/bin" ] ; then
    PATH="$HOME/bin:$PATH"
fi

# run your app here
exec myapp
If you have your app started by xinetd then you can have it start up on connect. On disconnect your app will be sent a SIGHUP, so you can catch that and shut down.
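A hedged sketch of such an xinetd entry (the service name, port, user, and path are placeholders):

# /etc/xinetd.d/myapp
service myapp
{
    type        = UNLISTED
    port        = 4000
    socket_type = stream
    protocol    = tcp
    wait        = no
    user        = customer
    server      = /usr/local/bin/myapp
    disable     = no
}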
The program executed at terminal login is the user's shell, as determined by a field in /etc/passwd. You could either set your program as the shell, or arrange for your program to be executed by the shell's start-up scripts (~/.profile, ~/.cshrc, depending on the shell).
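For the login-shell route, a minimal sketch (assuming the app is installed at /usr/local/bin/myapp and the customer account is called customer):

# make the app the user's login shell
sudo usermod -s /usr/local/bin/myapp customer
# some tools (e.g. chsh) only accept shells listed in /etc/shells
echo /usr/local/bin/myapp | sudo tee -a /etc/shells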
