I'm trying to send out a different style of notification for a specific service in Nagios, namely, when a user account gets locked out from AD. I don't need all the excess information associated with the usual emails (e.g. Host the service pertains to, IP address, service status, etc), as all the information I need is given in the SNMP trap sent from Windows in $SERVICEOUTPUT. However, I can't just change the notify-service-by-email command because I need to use the full output for all the other services.
I need to find a way to either:
Send out an email notification customized to this service:
define command{
command_name notify-service-by-email
command_line $~LongOutputCommand~$
}
define command{
command_name notify-lockouts-by-email
command_line $~ShortOutputCommand~$
}
define service{
service_description Account Lockouts
service_notification notify-lockouts-by-email
...
}
Execute an if statement inside the command_line section of the Nagios command:
define command{
command_name notify-service-by-email
command_line if [ "$SERVICEDESC" == "Account Lockouts" ]; then $~ShortOutputCommand~$; else $~LongOutputCommand~$; fi
}
I don't believe it's possible for Nagios to do the first way because of the way it is programmed, but no matter how I try the second way it doesn't process it as a proper command ("if not recognized", etc).
You cannot put "script" syntax in a command_line definition. Think of command_line as a handler that calls a script: the logic (and an if statement is logic) must be moved into the script you are calling. Inside the script, use the if statement on $1, the positional variable for the first argument passed to the script. So if you pass $SERVICEDESC$ to the script as the first argument, inside the script it is referenced as $1, and you can branch on whether $1 is equal to "Account Lockouts".
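As a minimal sketch of that idea (the macro-to-argument mapping, function name, and mail invocation are assumptions, not your actual setup), the notification script could choose the email body from its first argument:

```shell
# build_body picks the short or long email body for a notification,
# based on the service description in $1 and the service output in $2.
build_body() {
    if [ "$1" = "Account Lockouts" ]; then
        # short form: just the SNMP trap text from $SERVICEOUTPUT$
        printf '%s\n' "$2"
    else
        # long form: stand-in for the usual full notification details
        printf 'Service: %s\nOutput: %s\n' "$1" "$2"
    fi
}
```

The single command definition would then call the script with the macros as arguments, e.g. command_line /usr/local/bin/notify_wrapper.sh "$SERVICEDESC$" "$SERVICEOUTPUT$" "$CONTACTEMAIL$", and the script would pipe the chosen body to mail.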
I have the following service definition:
define service{
use my-service ; Name of service template to use
host_name dra
service_description https://www.example.com
check_command check_http!-I my.ip.address --ssl -H www.example.com
notifications_enabled 1
retry_check_interval 2
normal_check_interval 5
contact_groups myadmins
}
The service check keeps failing with
Name or service not known
HTTP CRITICAL - Unable to open TCP socket
However, if I run http_check from the command line, I get a 200 OK result:
/usr/lib/nagios/plugins/check_http -I my.ip.address --ssl -H www.example.com -v
.....
HTTP OK: HTTP/1.1 200 OK - 9176 bytes in 0.074 second response time |time=0.073543s;;;0.000000 size=9176B;;;0
Note also that the URL in question works just fine from a browser, the certificate is valid, etc. I also use the exact same service definition for a bunch of other sites, and they all work fine. The only thing I can think of is that this remote host is running on DigitalOcean and has a "Floating IP" assigned to it. I tried replacing my.ip.address above (and also in the host definition of the nagios config file) with either the Floating IP or the "standard" IP assigned to the host, and it makes no difference.
How is it possible that the same command would fail when run by nagios, but succeed when run manually?
The answer to my question is: don't use check_http. Instead,
use check_https_hostname, and
make sure that the host_name stanza is the actual hostname,
which requires matching the host_name stanzas in all the service and host definitions in the same cfg file.
So:
define service{
use my-service ; Name of service template to use
host_name www.example.com
service_description https://www.example.com
check_command check_https_hostname
notifications_enabled 1
retry_check_interval 2
normal_check_interval 5
contact_groups myadmins
}
Here is why: it becomes clear from the definitions of check_http and check_https_hostname, which in my installation are in /etc/nagios-plugins/config/http.cfg.
# 'check_http' command definition
define command{
command_name check_http
command_line /usr/lib/nagios/plugins/check_http -H '$HOSTADDRESS$' -I '$HOSTADDRESS$' '$ARG1$'
}
# 'check_https_hostname' command definition
define command{
command_name check_https_hostname
command_line /usr/lib/nagios/plugins/check_http --ssl -H '$HOSTNAME$' -I '$HOSTADDRESS$' '$ARG1$'
}
You will notice that the -H and -I arguments in check_http get the same value $HOSTADDRESS$, while in check_https_hostname they get $HOSTNAME$ and $HOSTADDRESS$, respectively.
The fact that I built my original command as check_http!-I my.ip.address --ssl -H www.example.com did not really matter. In the end, the /usr/lib/nagios/plugins/check_http command got two values for -I and two for -H, and the second pair was being ignored.
This broke "thanks" to Cloudflare: the IP address dynamically assigned by Cloudflare to my www.example.com was not the same as the actual host IP address that I had specified in my host definition.
Finally, I wanted to mention that what helped me figure this out was setting
debug_level=-1
debug_verbosity=1
in my /etc/nagios3/nagios.cfg file and then looking through /var/log/nagios3/nagios.debug.
Also, check out all the different variants of the check_http commands in /etc/nagios-plugins/config/http.cfg. There are some very useful ones.
I have one host on nagios defined like that:
define host {
host_name my-host
address ip
display_name my-host
hostgroups windows,windows-process-count
use windows-server
_PROCESSNAME my-process1.exe
_PROCESSCOUNT 1
}
On this host I check only that my-process1.exe is up,
but I need to check more processes (my-process1, my-process2, etc.).
I would like to check them by defining something like this:
define host {
host_name my-host
address ip
display_name my-host
hostgroups windows,windows-process-count
use windows-server
_PROCESSNAME my-process1.exe
_PROCESSCOUNT 1
_PROCESSNAME2 my-process2.exe
_PROCESSCOUNT2 1
_PROCESSNAME3 my-process3.exe
_PROCESSCOUNT3 4
... and so on for each of the x processes I must monitor on this server
}
But this way I must define x services, x hostgroups and x commands.
This is very cumbersome and not very elegant.
What is the best way to achieve this?
Unfortunately, I don't think there is an elegant way to do it the way you would like. I have always worked with Nagios using a service-oriented approach: I define monitoring for one service or process, then link all the hosts or hostgroups that run that process and need monitoring, even if it is a single server. I have found that to be the most reliable, tidy and sustainable way.
If you can afford a generic alert when any of the processes fails, you could prepare a custom command that checks all of them in one separate script, though I would not like to see it presented like that in my dashboard.
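For illustration, such an aggregate check could be sketched as a single plugin that loops over the processes (the check_nt path, the CHECK_CMD override, and the function name are assumptions):

```shell
# check_procs_aggregate: one Nagios result covering several processes.
# Usage: check_procs_aggregate <host> <proc1> [<proc2> ...]
check_procs_aggregate() {
    # check_nt path is an assumption; override with CHECK_CMD for testing
    local cmd="${CHECK_CMD:-/usr/lib/nagios/plugins/check_nt}"
    local host="$1"; shift
    local failed=()
    local proc
    for proc in "$@"; do
        # collect every process whose check does not exit 0
        "$cmd" -H "$host" -v PROCSTATE -d SHOWALL -l "$proc" >/dev/null 2>&1 \
            || failed+=("$proc")
    done
    if [ "${#failed[@]}" -eq 0 ]; then
        echo "PROCS OK - all processes running"
        return 0
    fi
    echo "PROCS CRITICAL - not running: ${failed[*]}"
    return 2   # Nagios CRITICAL exit status
}
```

The downside, as noted, is that one service line now hides which process actually failed until you read the output text.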
I know it is what you want to avoid, but if I were you, considering you have a single server on which to monitor these processes, I would generate a separate service file with a script, something like:
#!/bin/bash
srvCfg="/etc/nagios3/conf.d/host1procs.cfg" # I am using Nagios over Debian
server="host1"
processes=("process1.exe" "process2.exe")
srvGroup="customservicegroup"
for proc in "${processes[@]}"; do
echo "define service{" >> "$srvCfg"
echo " use generic-service" >> "$srvCfg"
echo " host_name $server" >> "$srvCfg"
echo " servicegroups $srvGroup" >> "$srvCfg"
echo " service_description Process monitoring for $proc" >> "$srvCfg"
echo " check_command check_nt!PROCSTATE!-d SHOWALL -l $proc" >> "$srvCfg"
echo "}" >> "$srvCfg"
done
I assumed that your example is just an example and that the real process names are not actually iterable, so the list has to be spelled out. That script will produce a file like:
define service{
use generic-service
host_name host1
servicegroups customservicegroup
service_description Process monitoring for process1.exe
check_command check_nt!PROCSTATE!-d SHOWALL -l process1.exe
}
define service{
use generic-service
host_name host1
servicegroups customservicegroup
service_description Process monitoring for process2.exe
check_command check_nt!PROCSTATE!-d SHOWALL -l process2.exe
}
You will have to define the servicegroup if you want all of the services to be in it automatically; if not, take out the servicegroups line.
I know it is not the answer you are looking for, but I hope it helps.
I currently have the following two services defined as below:
define service {
use my-webapp-service
hostgroup_name all
service_description System check - PING
check_command check_ping!100.0,20%!500.0,60%
}
define service {
use my-webapp-service
hostgroup_name all
service_description System check - Swap Usage
check_command check_nrpe!check_swap
check_interval 1
}
What I want is for the output string to be:
System check - PING - "Actual hostname where this alarm got fired off"
System check - Swap Usage - "Actual hostname where this alarm got fired off"
I think this should be possible, but I just don't know how to make it happen.
I would sincerely appreciate your guidance on that.
Many thanks
Output is handled by the plugin scripts. By default a script doesn't return the hostname, because it is not necessary.
If you want to add the hostname to the output, you must edit the existing scripts or create a new one.
Here is basic info how create script for Nagios - http://www.yourownlinux.com/2014/06/how-to-create-nagios-plugin-using-bash-script.html
For your needs you must add $HOSTNAME to the echo. For instance:
echo "$HOSTNAME - WARNING - $output"
If you want the script that is executing to be aware of the hostname, you'll need to pass the hostname as an argument to the Nagios command. That also means that the script will need to accept the hostname as an argument. Take for example:
define service {
use my-webapp-service
hostgroup_name all
service_description System check - PING
check_command check_ping!100.0,20%!500.0,60%
}
check_ping probably looks something like:
define command {
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
The problem here is that the executable at $USER1$/check_ping doesn't know that you want to pass the host's name as an argument. So you'll need to make a wrapper script. I'm not going to write the script for you, but to give you a hint, the command definition would look something like:
define command {
command_name check_ping_print_hostname
command_line $USER1$/my_check_ping_wrapper.sh --hostname $HOSTNAME$ -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
And then the script at $USER1$/my_check_ping_wrapper.sh is obviously going to need to grab that --hostname argument, pass the other arguments directly to check_ping, wait for the output, and then amend the output with the information given in the --hostname arg.
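Not to write the whole thing for you, but the shape of such a wrapper might look like this rough sketch (the plugin path, the CHECK_PING override, and the function name are assumptions):

```shell
# check_ping_with_hostname: strip a --hostname argument, run the real
# check_ping with the remaining arguments, and prepend the host name
# to the plugin output. Exit status is passed through unchanged.
check_ping_with_hostname() {
    # real plugin path is an assumption; override with CHECK_PING for testing
    local plugin="${CHECK_PING:-/usr/lib/nagios/plugins/check_ping}"
    local hostname="" output status
    local args=()
    while [ $# -gt 0 ]; do
        case "$1" in
            --hostname) hostname="$2"; shift 2 ;;   # consume our extra flag
            *)          args+=("$1"); shift ;;      # keep everything else
        esac
    done
    output=$("$plugin" "${args[@]}")
    status=$?
    echo "$hostname - $output"
    return "$status"
}
```

The wrapper keeps check_ping's exit code so Nagios still sees the correct OK/WARNING/CRITICAL state; only the text is amended.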
Hope this helps!
I have a nagios system, working well and i wanted to check a specific url with check_http.
The command is defined:
define command{
command_name check_http_with_folder
command_line $USER1$/check_http -H $HOSTADRESS$ -u http://$HOSTADRESS$$ARG1$
}
and I call it correctly ... but it throws me a
"Name or service not known"
When I call it from the command line on my Nagios machine, it works well and I get a status result of 200, so all okay.
The problem now is that I want the Nagios command to work instead of throwing an error.
Any ideas?
P.S. The problem is only in the part with the -u xxx param; without it (the normal check_http command without -u), it all works well.
You've misspelled $HOSTADDRESS$ in your command definition. It needs 2 D's. Also, you might want to ensure there is a slash between $HOSTADDRESS$ and $ARG1$ in the value you pass to your -u argument, or make sure that $ARG1$ has a preceding slash.
Building on Joe's observations...
Note the corrections:
define command{
command_name check_http_with_folder
command_line $USER1$/check_http -H $HOSTADDRESS$ -u $ARG1$
}
Then $HOSTADDRESS$ should be just that, for example www.example.com. And $ARG1$ should be only the path on the host, for example /blog/index.php. check_http will build them into an actual HTTP request.
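Put together, a service using the corrected command would pass only the path as the argument; for example (host name and path here are illustrative):

```
define service{
use                 generic-service
host_name           www.example.com
service_description HTTP folder check
check_command       check_http_with_folder!/blog/index.php
}
```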
I have a program in C/GTK which is opened with gksu. The problem is that when I get the environment variable $HOME with getenv("HOME"), it returns root's home directory, obviously. I would like to know if there is a way to find out which user executed gksu, or a way to get their environment variables.
Thanks in advance!
See the man page. Use gksu -k command... to preserve the environment (in particular, PATH and HOME).
Or, like Lewis Richard Phillip C indicated, you can use gksu env PATH="$PATH" HOME="$HOME" command... to reset the environment variables for the command. (The logic is that the parent shell, the one run with user privileges, substitutes the variables, and env re-sets them when superuser privileges have been attained.)
If your application should only be run with root privileges, you can write a launcher script -- just like many other applications do. The script itself is basically
#!/bin/sh
exec gksu -k /path/to/your/application "$@"
or
#!/bin/sh
exec gksu env PATH="$PATH" HOME="$HOME" /path/to/your/application "$@"
The script is installed in /usr/bin, and your application as /usr/bin/yourapp-bin or /usr/lib/yourapp/yourapp. The exec means that the command replaces the shell; i.e. nothing after the exec command will ever be executed (unless the application or command cannot be executed at all) -- and most importantly, there will not be an extra shell in memory while your application is being executed.
While Linux and other POSIX-like systems do have a notion of effective identity (defining the operations an application may do) and real identity (defining the user that is doing the operation), gksu modifies all identities. In particular, while getuid() returns the real user ID for example for set-UID binaries, it will return zero ("root") when gksu is used.
Therefore, the above launch script is the recommended method to solve your problem. It is also a common one; run
file -L /usr/bin/* /usr/sbin/* | sed -ne '/shell/ s|:.*$||p' | xargs -r grep -lie launcher -e '^exec /'
to see which commands are (or declare themselves to be) launcher scripts on your system, or
file -L /bin/* /sbin/* /usr/bin/* /usr/sbin/* | sed -ne '/shell/ s|:.*$||p' | xargs -r grep -lie gksu
to see which ones use gksu explicitly. There is no harm in adopting a known good approach.. ;)
You could assign the values of these environment variables to ordinary variables, then pass them to the command after gksu, binding the executions together with && so that you effectively run with cloned environment variables.
A better question is why do this at all? I realize you want to keep some folders, but I am not sure why, since any files created as root would have to be globally writable (probably via umask), or you would have to manually revise permissions or change ownership. This is such a bad idea!
Please check out https://superuser.com/questions/232231/how-do-i-make-sudo-preserve-my-environment-variables, http://www.cyberciti.biz/faq/linux-unix-shell-export-command/ & https://serverfault.com/questions/62178/how-to-specify-roots-environment-variable