Hi, I have just installed a clean copy of Nagios and Check_MK, but I don't understand how they work together. Nagios uses NRPE to connect to clients and perform checks, which means that some Nagios plugins have to sit on the client and return results when they are called. But how does Check_MK tie into Nagios? Does it use check_mk_agent to replace all the Nagios plugins and perform its own checks? Also, does the Nagios configuration have to be fully set up with all the clients in place and then ported to the Check_MK interface (WATO), or can clients be added to Check_MK without being present in the Nagios configuration? This is where my confusion lies, and I can't find a concrete answer to this question anywhere. Please help.
Check_MK uses the Nagios core for these tasks:
Manage Check results
Triggering of alarms
Manage planned downtimes
Test host availability
Detect network failures
As you can see at the bottom of this page: http://mathias-kettner.com/checkmk_monitoring_system.html
Check_MK needs both a client-side monitoring agent and a server-side monitoring system.
The server-side monitoring system calls the agent on the host and passes the check results to the monitoring core (usually Nagios, but there is also a new core built just for Check_MK).
What makes Check_MK different from other agent-based checks (like NRPE) is that the results for all checks are sent to the monitoring system in one package. If you run the agent on a host in a shell, it will return something like this:
➜ ~ check_mk_agent
<<<df>>>
/dev/mapper/MyStorage-rootvol ext4 15350768 13206900 1341052 91% /
dev devtmpfs 4022348 0 4022348 0% /dev
plus many more lines ....
So the server side part of Check_MK splits these packages into single checks so that the Nagios core can handle them.
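As a rough illustration of that splitting step, here is a minimal Python sketch. It is not Check_MK's actual code: the function names `split_agent_output` and `df_usage` are made up for this example, and it only assumes what the sample above shows, namely that each section starts with a `<<<name>>>` header line.

```python
import re

# A section header looks like <<<df>>>. Real agents may also add options
# inside the brackets; this sketch keeps them as part of the section name.
SECTION_HEADER = re.compile(r"^<<<([^>]+)>>>$")


def split_agent_output(text):
    """Group agent output lines by their <<<section>>> header (hypothetical helper)."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        match = SECTION_HEADER.match(stripped)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif current is not None and stripped:
            # Store each data line as a list of whitespace-separated fields.
            sections[current].append(stripped.split())
    return sections


def df_usage(sections):
    """Map mount point -> used percentage, based on the df section."""
    usage = {}
    for fields in sections.get("df", []):
        # Assumed field order: device, fstype, size, used, avail, percent, mount point
        usage[fields[-1]] = int(fields[-2].rstrip("%"))
    return usage


SAMPLE = """<<<df>>>
/dev/mapper/MyStorage-rootvol ext4 15350768 13206900 1341052 91% /
dev devtmpfs 4022348 0 4022348 0% /dev
"""

if __name__ == "__main__":
    print(df_usage(split_agent_output(SAMPLE)))
```

Each entry in the resulting dictionary corresponds to what would become one service check in the core (one filesystem check per mount point).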
So Check_MK won't replace your existing checks; it doesn't care about them. It will just add more.
You don't necessarily need WATO to configure Check_MK. WATO is just an interface for the configuration, which can also be done with plain text files. You should start with WATO and take a look at the configuration it generates.
I am very familiar with using Nagios with NRDP, which I use for handling traps from remote servers, but I am unable to understand what NCPA is. Can anyone explain it to me? What is NCPA actually required for?
I have seen in the Nagios agent comparison linked below that NCPA is rated the best among the agents, ahead of NRDS, NSClient and NRPE.
I am unable to understand what NCPA is from the official definitions quoted below.
NRDP
Nagios Remote Data Processor (NRDP) is a flexible data transport mechanism and processor for Nagios. It is designed with a simple and powerful architecture that allows for it to be easily extended and customized to fit individual users' needs. It uses standard ports and protocols (HTTP(S) and XML) and can be implemented as a replacement for NSCA.
NCPA
NCPA is a cross-platform monitoring agent that runs on Windows, Linux/Unix, and Mac OS/X machines. Its features include both active and passive checks, remote management, and a local monitoring interface.
You should compare NCPA with NSClient++: both are agents that run on servers and execute checks actively or passively over different protocols, such as NRPE, NSCA and NRDP.
Agents: NSClient++, NCPA
Protocols - Active:
NRPE => https://docs.nsclient.org/reference/windows/NSClientServer/
Protocols - Passive:
NSCA => https://docs.nsclient.org/reference/client/NSCAClient/
NRDP => https://docs.nsclient.org/reference/client/NRDPClient/
FYI, IMHO NSClient++ is much better than NCPA, as it has, among other features, integrated real-time event log monitoring.
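To make the passive side more concrete, here is a hedged Python sketch of what a passive check result sent over NRDP might look like. The form field names (`token`, `cmd=submitcheck`, `XMLDATA`) and the XML layout follow my understanding of the NRDP conventions and should be verified against the NRDP version you run. The function name `build_nrdp_payload`, the token and the host name are hypothetical. The sketch only builds the request body and sends nothing.

```python
from urllib.parse import parse_qs, urlencode
from xml.etree import ElementTree as ET


def build_nrdp_payload(token, hostname, servicename, state, output):
    """Build a form-encoded body for a passive service check result.

    Assumption: the NRDP server accepts 'token', 'cmd' and 'XMLDATA' fields;
    check this against your NRDP installation before relying on it.
    """
    root = ET.Element("checkresults")
    # checktype="1" is assumed to mark the result as passive.
    result = ET.SubElement(root, "checkresult", {"type": "service", "checktype": "1"})
    ET.SubElement(result, "hostname").text = hostname
    ET.SubElement(result, "servicename").text = servicename
    ET.SubElement(result, "state").text = str(state)  # 0=OK, 1=WARNING, 2=CRITICAL
    ET.SubElement(result, "output").text = output
    xml_data = ET.tostring(root, encoding="unicode")
    return urlencode({"token": token, "cmd": "submitcheck", "XMLDATA": xml_data})


if __name__ == "__main__":
    body = build_nrdp_payload("secret-token", "web01", "CPU Load", 0, "OK - load 0.12")
    # Decode again just to show what the server would receive.
    print(parse_qs(body)["XMLDATA"][0])
```

An agent such as NCPA or NSClient++ would do the equivalent of POSTing this body to the NRDP URL on the Nagios server, whereas with an active protocol like NRPE the server connects to the agent and asks for the result.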
In other socket applications you can't open a port that is already in use, but bluetoothd seems to accept several listening GATT servers running in parallel. How is that possible?
I am trying to set up a GATT server using BlueZ 5.35 on a Raspberry Pi (Jessie). I have written an application that starts the GATT server much like the example btgatt-server.c, using an L2CAP socket. It exposes a custom characteristic that a client application can connect to and use. I have also enabled advertising using HCI commands (it is enabled just after the listen() call on the socket).
I have set the application to start automatically from rc.local. My problem is that after a reboot I sometimes don't see my own characteristics but get a completely different list of services/characteristics. If I don't start my own application and only enable advertising (sudo hciconfig hci0 leadv), I see the same list, so a GATT server seems to be running by default.
What mechanism in BlueZ decides whether my services/characteristics or the other ones (loaded by default plugins, I guess) are visible? They are never combined and visible at the same time, and I don't see any error messages during my application's startup, even when I can't see the characteristics from the client and accept() never returns anything. How can I make sure my characteristics are always visible?
Has anyone here pulled server hardware data from Nagios to build an inventory? Basically, I am trying to create an inventory of existing servers and hardware components monitored with Nagios (i.e. hostname, CPU, memory, HDD, etc.). There are other ways to do it, but maybe there is a plugin or a way to pull it directly from Nagios?
Thanks
You could run this on the command line against your configuration files:
grep "host_name" /usr/local/nagios/etc/objects/hosts.cfg >>output.log
That will append every line containing host_name from that configuration file to output.log, giving you a list of the configured hosts.
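If you want more than the bare host_name lines, a small script can collect every directive from each host definition. This is only a sketch: it assumes the usual `define host { ... }` object syntax, the function name `parse_hosts` and the sample hosts are hypothetical, and it can only report what is written in the Nagios configuration (names, addresses, templates), not hardware details such as CPU or memory.

```python
import re

# Matches "define host {" ... "}" blocks; "define hostgroup" is not matched
# because the brace must follow "host" directly (optionally after whitespace).
DEFINE_HOST = re.compile(r"define\s+host\s*\{(.*?)\}", re.DOTALL)


def parse_hosts(config_text):
    """Return one dict of directives per host definition (hypothetical helper)."""
    hosts = []
    for body in DEFINE_HOST.findall(config_text):
        directives = {}
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop trailing ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            if len(parts) == 2:
                directives[parts[0]] = parts[1].strip()
        if "host_name" in directives:  # skip templates without a host_name
            hosts.append(directives)
    return hosts


# Made-up sample configuration, for illustration only.
SAMPLE = """
define host {
    use        linux-server
    host_name  web01        ; hypothetical host
    alias      Web server
    address    192.0.2.10
}
define host{
    host_name  db01
    address    192.0.2.11
}
"""

if __name__ == "__main__":
    for host in parse_hosts(SAMPLE):
        print(host["host_name"], host.get("address", "-"))
```

To run it against a real installation you would read the file used in the grep command above and pass its contents to `parse_hosts`.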
There is a new mk_inventory plugin for Nagios which will pull hardware info from the machine.
I want to create a metric that shows the current CPU usage on a host.
The metric I want is like the one in Ganglia (gweb).
How can I build that?
If the host is Linux/UNIX you could install NRPE on the monitored host, then use check_nrpe to remotely run the check_cpu plugin.
If the host is Windows, you could install NC_Net on the monitored host, then use check_nt to query the CPU usage.
If the host is SNMP-capable, you could use check_snmp to query a CPU OID, either 1.3.6.1.4.1.2021.11.11.0 (NET-SNMP, which reports the idle percentage) or 1.3.6.1.4.1.9.2.1.58.0 (Cisco devices).
If the host is a VMware guest, then you need to query the VirtualCentre. Check on monitoringexchange.org for the check_vmware plugin.
This will allow Nagios to alert based on CPU usage thresholds. To obtain graphs and so on (as in Ganglia) you will need to add something like pnp4nagios to graph the perfstats.
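For illustration, here is a minimal Python sketch of what a CPU check plugin does: it turns two samples of CPU counters into a usage percentage, compares it against thresholds, and produces a status line with performance data that a grapher can pick up. This is not the check_cpu plugin mentioned above. The function names and the default thresholds are my own choices, and `read_cpu_sample` assumes a Linux `/proc/stat` layout.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def read_cpu_sample(path="/proc/stat"):
    """Read the aggregate 'cpu' counters (Linux only; not used in the demo below)."""
    with open(path) as handle:
        fields = handle.readline().split()
    return [int(value) for value in fields[1:]]


def cpu_usage_percent(before, after):
    """Usage between two samples; fields are user, nice, system, idle, iowait, ..."""
    total = sum(after) - sum(before)
    idle = (after[3] + after[4]) - (before[3] + before[4])
    if total <= 0:
        return 0.0
    return 100.0 * (total - idle) / total


def evaluate(usage, warn=80.0, crit=90.0):
    """Return (exit_code, status line with perfdata) for a usage value."""
    if usage >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif usage >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Perfdata format after the pipe: label=value[unit];warn;crit;min;max
    message = "CPU %s - usage %.1f%% | cpu=%.1f%%;%.0f;%.0f;0;100" % (
        label, usage, usage, warn, crit)
    return code, message


if __name__ == "__main__":
    # Fixed, made-up samples so the demo does not depend on the local machine.
    before = [100, 0, 50, 800, 50]
    after = [150, 0, 70, 900, 80]
    code, message = evaluate(cpu_usage_percent(before, after))
    print(message)
    raise SystemExit(code)
```

A real plugin would call `read_cpu_sample()` twice with a short sleep in between. The part after the `|` is the performance data that a tool like pnp4nagios turns into graphs.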
At some point my site, running on Apache2 with mod_wsgi, just stops processing requests. The connection to the server is maintained and the client waits for a response, but Apache never returns one. The server is at 0% CPU at this time, and nothing is being processed. I think Apache just puts the requests in a queue and never takes them out again.
Running apache2ctl graceful does not resolve the problem. Only apache2ctl restart does.
My site consists of 4 instances of a Pyramid WSGI application and 2 instances of Zope 3. It runs normally and does not have any speed problems that I am aware of.
versions:
Ubuntu 10.04
apache2 2.2.14-5ubuntu8.9
libapache2-mod-wsgi 2.8-2ubuntu1
Sounds like you are using embedded mode to run the multiple applications and you are using third party C extensions that have problems in sub interpreters, resulting in potential deadlock. Else your code is internally deadlocking or blocking on external services and never returning, causing exhaustion of available processes/threads.
For a start, you should look at using daemon mode, delegating each web application to a distinct daemon process group and then forcing each to run in the main interpreter.
See:
http://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide#Delegation_To_Daemon_Process
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API
Otherwise use debugging tips described in:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques
for getting stack traces about what application is doing.
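As a rough idea of what such a stack dump involves, here is a generic Python sketch that collects the current stack of every thread in the process. It is not the code from the mod_wsgi wiki page. The function name `dump_all_stacks` is hypothetical, and how you trigger it inside a hung application (a signal handler, a monitor thread, a special URL) is left open.

```python
import sys
import threading
import traceback


def dump_all_stacks():
    """Return a text dump of the current stack of every Python thread."""
    names = {thread.ident: thread.name for thread in threading.enumerate()}
    lines = []
    # sys._current_frames() maps thread id -> topmost frame of that thread.
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        for entry in traceback.format_stack(frame):
            lines.append(entry.rstrip("\n"))
    return "\n".join(lines)


if __name__ == "__main__":
    print(dump_all_stacks())
```

If every request thread shows the same frame waiting on a lock or a socket, that points at the deadlock or blocked external service described above.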