I've got two NRPE checks in Nagios that show up as undefined in the Nagios web interface.
Which is odd, because:
1) I have the checks defined in the config file for this host on the Nagios server. In /usr/local/nagios/etc/objects/web2.cfg I have these two definitions:
# Define a service to check Cassandra reads on the web2 machine.
define service{
use generic-service
host_name web2
service_description Check NRPE Cassandra JMX
contact_groups linux-admins
check_command check_nrpe!check_cassandra_jmx
notifications_enabled 1
}
# Define a service to check Cassandra reads on the web2 machine.
define service{
use generic-service
host_name web2
service_description Check NRPE Cassandra Heap
contact_groups linux-admins
check_command check_nrpe!check_cassandra_heap
notifications_enabled 1
}
2) I have the checks defined in the nrpe.cfg on the host I'm checking:
[root@web2:~] #egrep "heap|jmx" /usr/local/nagios/etc/nrpe.cfg
command[check_cassandra_jmx]=/usr/local/nagios/libexec/check_jmx -U service:jmx:rmi:///jndi/rmi://beta.jokefire.com:7199/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used -I HeapMemoryUsage -J used -vvvv -w 10737418240 -c 20401094656
command[check_cassandra_heap]=/usr/local/nagios/libexec/cassandra.pl
3) If I go back to the Nagios host and run the commands via check_nrpe on the command line, they succeed:
[root@monitor1:~] #/usr/local/nagios/libexec/check_nrpe -H web2.mydomain.com -c check_cassandra_jmx
JMX OK HeapMemoryUsage.used=142913536{committed=526385152;init=536870912;max=526385152;used=142913536}
[root@monitor1:~] #/usr/local/nagios/libexec/check_nrpe -H web2.mydomain.com -c check_cassandra_heap
CASSANDRA OK - | heap_mem=27.46
Other checks for this host are working fine in the web interface. Does anyone out there have some ideas about what's wrong in this case?
Thanks!
First of all, sorry for my English.
I am trying to configure a check in Nagios that monitors log files and notifies me when it finds a string like "Exception" or "Error".
I use Nagios with Centreon.
When I execute my command:
$USER1$/check_log -F path/for/log.files -q /Exception/
Nagios returns: "Log check error: Log file path/for/log.files does not exist!"
When I check the path on my server, the file exists, and everyone (owner, group, and other) can read it. So the problem doesn't seem to come from permissions.
The monitored client runs CentOS. I have already installed the NRPE client and configured the allowed hosts, etc.
I looked everywhere for someone with the same error but found nothing.
If someone can help me, that would be great!
If you need more information to help me, please don't hesitate to ask; I'm not sure I've explained my problem well.
Regards.
On the Nagios server side, you must define something like this:
define service {
service_description service_name
host_name your_remote_hostname
use your_template
check_command ext_check!check_log!-f path/for/log.files -g /Exception/
}
In commands.cfg, or a similar file, on the Nagios server:
define command{
command_name ext_check
command_line $USER1$/check_nrpe -t 30 -H $HOSTADDRESS$ -p 5666 -c $ARG1$ -a $ARG2$
}
On the client (monitored) host, you must edit the nrpe.cfg file:
command[check_log]=/opt/nagios/plugins/check_log $ARG1$
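After that, restart the NRPE service and reload the Nagios configuration. Note that NRPE only accepts arguments passed with check_nrpe -a when dont_blame_nrpe=1 is set in nrpe.cfg.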
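To see how the pieces of the check_command line up with the macros, here is a minimal shell sketch of how a value is split on "!" into the command name, $ARG1$, $ARG2$, and so on. The split_check_command function is a hypothetical helper written for illustration; it is not part of Nagios, and the real macro processing in Nagios handles more cases than this.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): split a check_command value
# on "!" the way Nagios assigns the command name and the $ARGn$ macros.
split_check_command() {
    remaining=$1
    index=0
    while [ -n "$remaining" ]; do
        # Everything before the first "!" is the current field.
        field=${remaining%%!*}
        if [ "$index" -eq 0 ]; then
            printf 'command_name=%s\n' "$field"
        else
            printf 'ARG%d=%s\n' "$index" "$field"
        fi
        # Drop the field just printed, or stop if no "!" is left.
        case $remaining in
            *'!'*) remaining=${remaining#*!} ;;
            *) remaining= ;;
        esac
        index=$((index + 1))
    done
}

split_check_command 'ext_check!check_log!-f path/for/log.files -g /Exception/'
# command_name=ext_check
# ARG1=check_log
# ARG2=-f path/for/log.files -g /Exception/
```

With the ext_check definition above, $ARG1$ becomes the NRPE command name given to -c, and $ARG2$ becomes the argument string given to -a, which the client substitutes for $ARG1$ in its own nrpe.cfg entry.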
I'm getting a (No output returned from plugin) from a host and cannot understand why:
Service on monitor server:
# Check Clamd availability
define service {
hostgroup_name clamd-servers
service_description ClamAV Daemon
check_command check_nrpe!check_clamd
use generic-service
notification_interval 0 ; set > 0 if you want to be renotified
}
Hosts on monitor:
# Clamd Servers
define hostgroup {
hostgroup_name clamd-servers
alias ClamAV servers
members fsmvps
}
nrpe_local.cfg on host fsmvps:
command[check_clamd]=/usr/lib/nagios/plugins/check_clamd -H /var/run/clamav/clamd.ctl
Running the command /usr/lib/nagios/plugins/check_clamd -H /var/run/clamav/clamd.ctl on the host will produce the following output as clam is up and running:
CLAMD OK - 0.000 second response time on socket /var/run/clamav/clamd.ctl [PONG]|time=0.000219s;;;0.000000;10.000000
I'm clueless at the moment as to why no output is returned, as I'm a beginner with Nagios.
Perhaps your NRPE service wasn't set up right (sometimes it complains about SSL).
Running something like this on your monitor server (as the nagios user):
/usr/lib/nagios/plugins/check_nrpe -H fsmvps -c check_clamd
might help diagnose things.
It might be one of the following:
Permissions (can the nagios user on fsmvps read /var/run/clamav/clamd.ctl?).
check_nrpe needs the -n flag or a different port.
You haven't restarted NRPE on the fsmvps server after editing its config.
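The permissions point can be checked directly on fsmvps. The following is a rough sketch, assuming the socket path from the question; check_socket_access is a hypothetical helper, and it tests access for whichever user runs it, so it would need to be run as the user NRPE runs as (for example through sudo -u nagios, assuming that account name).

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can reach a
# Unix socket file. Connecting generally needs read and write access.
check_socket_access() {
    path=$1
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        return 2
    elif [ ! -S "$path" ]; then
        echo "not a socket: $path"
        return 2
    elif [ ! -r "$path" ] || [ ! -w "$path" ]; then
        echo "no read/write access for $(id -un): $path"
        return 2
    fi
    echo "accessible: $path"
}

# Example (run as the NRPE user on the monitored host):
#   check_socket_access /var/run/clamav/clamd.ctl
```

If this reports a problem for the nagios user but not for root, the plugin working in a root shell and failing under NRPE would be consistent with that.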
When I run the command below on the Nagios client machine, it works fine.
**/usr/lib/nagios/plugins/check_ssh -H 127.0.0.1 -p 22**
*SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.6 (protocol 2.0) | time=0.004430s;;;0.000000;10.000000*
When I run it from the Nagios server, I get the error below.
ubuntu@Nagios-server:/usr/lib/nagios/plugins$ ./check_nrpe -H <CLIENT-IP> -c check_ssh -n -H <CLIENT-IP> -p 2
CHECK_NRPE: Error receiving data from daemon.
Below is the service definition:
define service {
host qa-ad-useast-1.dpclk.com
use generic-service
check_command check_nrpe!check_ssh
service_description SSH Status
contact_groups admins
notifications_enabled 1
}
And the command entry in nrpe.cfg is:
command[check_ssh]=/usr/lib/nagios/plugins/check_ssh -H $ARG1$
Is anything wrong in the service definition, or in how the command passes the arguments?
This looks like it could be one of a few things:
Double-check your only_from settings on the client (in the xinetd configuration).
If you're passing arguments via NRPE, you need to enable dont_blame_nrpe.
Ensure that your NRPE process is running as the nagios user and that the plugin directory has execute rights for the nagios user.
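For the dont_blame_nrpe point, a quick check on the client could look like the sketch below. nrpe_args_allowed is a hypothetical helper, and the path in the usage comment is an assumption that depends on how NRPE was installed. Keep in mind that NRPE also has to be built with command-argument support (--enable-command-args) for the setting to have any effect.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the last uncommented
# dont_blame_nrpe line in the given config file sets the value to 1.
nrpe_args_allowed() {
    config_file=$1
    value=$(sed -n 's/^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*\([0-9]*\).*/\1/p' "$config_file" | tail -n 1)
    [ "$value" = "1" ]
}

# Example (the path is an assumption; adjust it to your installation):
#   if nrpe_args_allowed /etc/nagios/nrpe.cfg; then
#       echo "NRPE accepts command arguments"
#   else
#       echo "set dont_blame_nrpe=1 and restart NRPE"
#   fi
```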
I just finished installing Nagios 3 on Ubuntu Server, and I'm not sure how to add a third-party plugin to it.
The plugin is available here.
Thanks in advance for your help
You didn't mention any information about the server that you want to monitor with Nagios.
I'm going to assume it's an Ubuntu Linux server and it's not the same server as the machine you installed Nagios on.
On the server to be monitored:
Ensure that NRPE (Nagios Remote Plugin Executor) is installed. Here's a link to instructions for installing NRPE on the Ubuntu operating system.
http://tecadmin.net/install-nrpe-on-ubuntu/
After you install NRPE on the server to be monitored, it's very important that you edit the nrpe.cfg file (most likely found at /etc/nagios/nrpe.cfg, but this can differ based on your installation method).
You need to modify the allowed_hosts configuration line to include the IP address of your Nagios server. If you don't, NRPE will refuse connection attempts from Nagios and you won't be able to run your Nagios plugin or report results back to Nagios.
Be sure to restart NRPE after you've modified nrpe.cfg.
Next you'll need to download the Nagios plugin to the server being monitored. For example:
wget --directory-prefix=/usr/lib/nagios/plugins/ https://github.com/thehunmonkgroup/nagios-plugin-file-ages-in-dirs/archive/v1.1.tar.gz
Change to your Nagios plugins directory and extract the tar-gzipped archive you just downloaded:
cd /usr/lib/nagios/plugins/
tar zxvf v1.1.tar.gz
ls -al /usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs
Be sure to give the nagios plugin script execute permissions:
chmod a+x /usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs
With the nagios plugin now residing on your server to be monitored, you will need to define some command definitions on that same server.
First you need to find the path that NRPE will search for new command definitions that you manually add to the system.
To do this, grep your nrpe.cfg file for the term "include_dir".
For example:
grep include_dir /etc/nagios/nrpe.cfg
include_dir=/etc/nrpe.d/
If no result for "include_dir" is returned from your grep, add the above "include_dir" configuration to your nrpe.cfg file. Ensure that the /etc/nrpe.d/ folder is created.
Create a new file in your include_dir named check_file_ages_in_dirs.cfg. Add to check_file_ages_in_dirs.cfg a command definition for check_file_ages_in_dirs pointing to the path of your Nagios plugin and including the arguments necessary to execute it.
For example:
echo "command[check_file_ages_in_dirs]=/usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs -d \"/tmp\" -w 24 -c 48" >> /etc/nrpe.d/check_file_ages_in_dirs.cfg
cat /etc/nrpe.d/check_file_ages_in_dirs.cfg
command[check_file_ages_in_dirs]=/usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs -d "/tmp" -w 24 -c 48
For the above, I hard-coded the warning and critical thresholds of 24 hours and 48 hours. I've also hard-coded the directory to check as "/tmp"
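To confirm that the definition is where NRPE can find it, you can list the command names defined across your NRPE configuration files. This is a small sketch; list_nrpe_commands is a hypothetical helper, and the paths in the usage comment are the ones assumed in this answer.

```shell
#!/bin/sh
# Hypothetical helper: print the name inside command[...] for every
# uncommented command definition in the given NRPE config files.
list_nrpe_commands() {
    sed -n 's/^[[:space:]]*command\[\([^]]*\)][[:space:]]*=.*/\1/p' "$@"
}

# Example (paths as assumed in this answer):
#   list_nrpe_commands /etc/nagios/nrpe.cfg /etc/nrpe.d/*.cfg
```

If check_file_ages_in_dirs does not appear in the output, NRPE will not know the command either.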
Attempt to execute the nagios plugin script locally to confirm it's working correctly:
/usr/lib/nagios/plugins/nagios-plugin-file-ages-in-dirs-1.1/check_file_ages_in_dirs -d "/tmp" -w 24 -c 48
OK: 1 dir(s) -- /tmp: 1 files
Ensure the nrpe user has read permissions on your check_file_ages_in_dirs.cfg file:
chmod a+r /etc/nrpe.d/check_file_ages_in_dirs.cfg
Restart your nrpe service, as per the instructions in http://tecadmin.net/install-nrpe-on-ubuntu/
You also need to ensure that if you have any firewall rules in place, they allow tcp traffic to port 5666.
On your Nagios server:
From your Nagios server, you'll need to manually run check_nrpe against your host to be monitored so as to verify correct functioning of the Nagios plugin and correct NRPE configuration.
Find the location of your check_nrpe file. On my installation, it's located at /usr/local/nagios/libexec/check_nrpe, but this could be different for your installation.
find / -name "check_nrpe" -type f
/usr/local/nagios/libexec/check_nrpe
If you don't have check_nrpe, you'll need to install it on your Nagios server.
apt-get install nagios-nrpe-plugin
First execute check_nrpe against your server to be monitored with no remote command arguments. This is just to confirm that NRPE is running on your remote server and it's correctly configured to allow connections from your Nagios server.
Note: For this example I'll pretend the IP address of the host I want to monitor is 10.0.0.1. Replace this with the IP address of the host you want to monitor.
/usr/local/nagios/libexec/check_nrpe -H 10.0.0.1
NRPE v2.14
The check_nrpe command above should return the version number of the NRPE agent running on the remote host if it's configured correctly.
Next attempt to manually invoke the Nagios plugin via NRPE:
/usr/local/nagios/libexec/check_nrpe -H 10.0.0.1 -c check_file_ages_in_dirs
OK: 1 dir(s) -- /tmp: 1 files
If you get output similar to the above, then it's time to move on to defining hosts, services, and commands on your Nagios server.
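When testing by hand like this, the exit status matters as much as the printed text, because Nagios derives the service state from it. The sketch below maps the standard plugin exit codes to state names; nagios_state_name is a hypothetical helper, and it treats anything other than 0, 1, and 2 as UNKNOWN.

```shell
#!/bin/sh
# Hypothetical helper: translate a plugin exit status into the state
# name Nagios uses (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
nagios_state_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Example (IP address and path as used in this answer):
#   /usr/local/nagios/libexec/check_nrpe -H 10.0.0.1 -c check_file_ages_in_dirs
#   nagios_state_name "$?"
```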
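It would be cleaner to define separate configuration files for host, service, and command definitions, but that's outside the scope of this post.
For now, we'll define these things in the default Nagios configuration file (nagios.cfg). If your Nagios version rejects object definitions there, put them in an object file that nagios.cfg already loads through a cfg_file= or cfg_dir= line instead.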
First locate your nagios.cfg file:
find / -name "nagios.cfg" -type f
/usr/local/nagios/etc/nagios.cfg
Edit the nagios.cfg file.
Add a host definition for the server you wish to monitor:
define host {
host_name Remote-Host
alias Remote-Host
address 10.0.0.1
use linux-server
contact_groups admins
notification_interval 0
notification_period 24x7
notifications_enabled 1
register 1
}
Add a command definition for the remote execution of check_file_ages_in_dirs:
define command {
command_name check_file_ages_in_dirs
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_file_ages_in_dirs
register 1
}
Add a service definition that will reference the check_file_ages_in_dirs command:
define service {
service_description check_file_ages_in_dirs
use generic-service
check_command check_file_ages_in_dirs
host_name Remote-Host
contact_groups admins
notification_interval 0
notification_period 24x7
notifications_enabled 1
flap_detection_enabled 1
register 1
}
Save and exit your nagios.cfg file.
Validate your Nagios configuration file:
nagios -v /usr/local/nagios/etc/nagios.cfg
If no errors are reported, restart your Nagios service.
Check the Nagios Web UI, and you should see your check_file_ages_in_dirs service monitoring your remote host.
I am looking for a little help configuring Nagios for NRPE. I am quite new to Linux and am having some trouble getting this working.
I am running Ubuntu 11.10 with Nagios Core 3.3.1, Nagios plugins 1.4.15, and NRPE 2.13.
Currently I am trying to get the Nagios Exchange plugin check_be.exe to work with Nagios. I followed the check_be.txt for the setup on my nagios server and windows backup exec server.
Currently, if I run
root@PERSES:/usr/local/nagios/libexec# ./check_nrpe -H 192.168.1.10 -t200 -c check_be
I get
Job: Daily Backup, Success, Date:17/4/2012
From Nagios, all I get is "no output returned from plugin".
Windows.cfg has the following entry
# Service for Backup Exec agent
define service {
use template-backupexec
service_description BackupExec - Daily DAT backup ; specific display name, if you need
host_name cmbssrv.cmbs.local
}
Templates.cfg has this entry – I have tried to modify it to avoid the socket timeout
define service{
name template-backupexec
use generic-service
service_description BackupExec Job Check ; default display name in Nagios
check_command check_nrpe! -t 240 -c check_be ; same name as in the nsclient++ nsc.ini command defini$
normal_check_interval 60 ; your check intervals here
retry_check_interval 60
register 0 ; this is a template
}
Commands.cfg:
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDESS$ -p 5666 -v $ARG1$
}
Any ideas would be greatly appreciated
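This looks wrong:
check_command check_nrpe! -t 240 -c check_be
I think those extra arguments need to be in the define command block.
Also change the name of the check_command. It is easy to confuse the check_nrpe executable (which you run in a terminal) with a Nagios check_command of the same name (which the shell knows nothing about).
Separately, the command_line in your commands.cfg spells the macro $HOSTADDESS$; the standard Nagios macro is $HOSTADDRESS$, so check_nrpe is probably not getting a usable -H value.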
Here's a working example much like what you are doing.
On the main nagios machine:
define command {
command_name check_nrpe_cart
command_line /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -p 6565 -c $ARG1$
}
define service{
use clientcritical
host_name cartbox
service_description email
normal_check_interval 15
check_command check_nrpe_cart!check_postfix
}
On cartbox, in /etc/nagios/nrpe_local.cfg:
command[check_postfix]=/usr/lib/nagios/plugins/check_procs -w 1:1 -c 1:1 -C master
You should read the PDF at the following link; it covers most NRPE/Nagios communication problems.
http://assets.nagios.com/downloads/nagiosxi/docs/NRPE_Troubleshooting_and_Common_Solutions.pdf