CHECK_NRPE: Error receiving data from daemon - nagios

I have configured a Nagios server (version 4.2.0) on a CentOS machine, and I am trying to add a RHEL 6.6 machine to it. While configuring and testing NRPE, one step checks the configuration with /usr/local/nagios/libexec/check_nrpe -H localhost
At this step I get the error below:
CHECK_NRPE: Error - Could not complete SSL handshake.
So, I used the -n option: /usr/local/nagios/libexec/check_nrpe -n -H localhost
And it shows a new error as below:
CHECK_NRPE: Error receiving data from daemon.
System logs just say:
Aug 31 14:31:10 xxxxx xinetd[18730]: START: nrpe pid=18781 from=::1
Aug 31 14:31:10 xxxxx xinetd[18781]: FAIL: nrpe address from=::1
Aug 31 14:31:10 xxxxx xinetd[18730]: EXIT: nrpe status=0 pid=18781 duration=0(sec)
Any idea on why this shows up?

Check /usr/local/nagios/var/nagios.log for errors. The most likely cause is a problem in nrpe.cfg, usually a syntax error.
Check your command definitions in nrpe.cfg as well.
Also make sure port 5666 is open.
If you are running NRPE under xinetd, check the allowed hosts entry (only_from) in /etc/xinetd.d/nrpe.
This problem generally arises when NRPE cannot read its configuration properly.
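To tell "port 5666 is closed" apart from "the daemon accepted the connection and then rejected the client", it can help to probe the port on every address the host name resolves to. The following is a minimal Python sketch and not part of NRPE or the Nagios plugins; probe_port is a hypothetical helper name, and it only tests whether a TCP connection is accepted, not whether NRPE answers.

```python
import socket

def probe_port(host, port, timeout=3.0):
    """Try a TCP connect to every address that `host` resolves to.

    Returns a list of (family_name, address, reachable) tuples.
    probe_port is a hypothetical helper, not an NRPE tool.
    """
    results = []
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        name = "IPv6" if family == socket.AF_INET6 else "IPv4"
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
            reachable = True
        except OSError:
            # Covers refused connections, timeouts and missing IPv6 support.
            reachable = False
        results.append((name, sockaddr[0], reachable))
    return results

# Example call (5666 is the NRPE port mentioned above):
#   for name, addr, ok in probe_port("localhost", 5666):
#       print(name, addr, "open" if ok else "closed or filtered")
```

If localhost resolves to both ::1 and 127.0.0.1, the result lists both, which shows which address family a client is likely to try first. Under xinetd the connection is accepted before the address check runs, so "reachable" here does not mean the client is allowed.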

Check the allowed_hosts parameter in the nrpe.cfg file. You need to allow localhost too.

CHECK_NRPE: Error - Could not complete SSL handshake.
Solution: add the client host to allowed_hosts in the nrpe.cfg file.
Remove -n from the command if SSL is enabled.


Amazon DocumentDB fails to connect with error "SSL peer certificate validation failed"

I am trying to connect to our AWS DocumentDB, but it fails with the following error:
2019-12-04T17:46:52.551-0800 W CONTROL [main] Option: ssl is deprecated. Please use tls instead.
2019-12-04T17:46:52.551-0800 W CONTROL [main] Option: sslCAFile is deprecated. Please use tlsCAFile instead.
2019-12-04T17:46:52.551-0800 W CONTROL [main] Option: sslAllowInvalidHostnames is deprecated. Please use tlsAllowInvalidHostnames instead.
MongoDB shell version v4.2.1
connecting to: mongodb://insights-db-2019-08-12-18-32-13.cih94xwdmniv.us-west-2.docdb.amazonaws.com:27017/?compressors=disabled&gssapiServiceName=mongodb
2019-12-04T17:46:52.684-0800 E NETWORK [js] SSL peer certificate validation failed: Certificate trust failure: CSSMERR_CSP_UNSUPPORTED_KEY_SIZE; connection rejected
2019-12-04T17:46:52.685-0800 E QUERY [js] Error: couldn't connect to server insights-db-2019-08-12-18-32-13.cih94xwdmniv.us-west-2.docdb.amazonaws.com:27017, connection attempt failed: SSLHandshakeFailed: SSL peer certificate validation failed: Certificate trust failure: CSSMERR_CSP_UNSUPPORTED_KEY_SIZE; connection rejected :
connect@src/mongo/shell/mongo.js:341:17
@(connect):2:6
2019-12-04T17:46:52.687-0800 F - [main] exception: connect failed
2019-12-04T17:46:52.687-0800 E - [main] exiting with code 1
The command I use:
mongo --ssl --host MY_DOCUMENT_DB_HOST_AND_PORT --sslCAFile MY_KEY_PATH --username MY_USERNAME --password MY_PASSWORD
Troubleshooting steps I have already tried:
Sent the exact same command and key to another Mac OS X machine on the same network --> worked fine
Uninstalled and reinstalled my mongo app mongodb-community@4.2
Try adding the rds-combined-ca-bundle.pem certificate to your Mac. I had a very similar error when trying to connect to DocumentDB through a forwarded port on localhost. The command I ran is:
sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain rds-combined-ca-bundle.pem
I got this command from this answer
For those hitting this issue post 2020, see the last reply in this thread: https://forums.aws.amazon.com/message.jspa?messageID=936916
Mac OS X Catalina has updated the requirements for trusted certificates. Trusted certificates must now be valid for 825 days or fewer (see https://support.apple.com/en-us/HT210176). Amazon DocumentDB instance certificates are valid for over four years, longer than the Mac OS X maximum. In order to connect directly to an Amazon DocumentDB cluster from a computer running Mac OS X Catalina, you must allow invalid certificates when creating the TLS connection. In this case, invalid certificates mean that the validity period is longer than 825 days. You should understand the risks before allowing invalid certificates when connecting to your Amazon DocumentDB cluster.
To connect to an Amazon DocumentDB cluster from OS X Catalina using the AWS CLI, use the tlsAllowInvalidCertificates parameter.
mongo --tls --host <hostname> --username <username> --password <password> --port 27017 --tlsAllowInvalidCertificates
Basically, just ignore invalid certificates.
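To make the 825-day rule concrete, here is a small Python sketch that computes a certificate's validity period from its notBefore and notAfter fields and compares it with the limit quoted above. The function names and the example dates in the comments are hypothetical; the date strings use the text format that Python's ssl module reports for peer certificates.

```python
import ssl

MAX_TRUSTED_DAYS = 825  # limit quoted above for macOS Catalina

def validity_days(not_before, not_after):
    """Length of a certificate's validity period in days.

    Both arguments use the "notBefore"/"notAfter" text format,
    e.g. "Jan  1 00:00:00 2020 GMT".
    """
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    return (end - start) / 86400

def exceeds_limit(not_before, not_after, limit=MAX_TRUSTED_DAYS):
    """True if the validity period is longer than `limit` days."""
    return validity_days(not_before, not_after) > limit

# Example with made-up dates (not taken from a real DocumentDB certificate):
#   exceeds_limit("Jan  1 00:00:00 2020 GMT", "Jan  1 00:00:00 2024 GMT")
```

A certificate valid for four years spans roughly 1460 days, well over the limit, which is why the client rejects it unless invalid certificates are allowed.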

SQL Server service breaks after adding SSL certificates in Linux

I have set up a SQL Server database server on my Ubuntu 16 machine. To secure it over the host network, I am adding an SSL certificate to it.
I followed the steps described at this link: ssl-encryption-mssql
But after I restart the SQL Server service, it fails with the exit status below:
code=exited, status=1/FAILURE
I also checked the logs using journalctl -u mssql-server.service -b, but they are not helpful at all. For reference, I am adding a screenshot of the journalctl output below:
My /var/opt/mssql/mssql.conf looks like this after following the steps from the official documentation:
[sqlagent]
enabled = false
[EULA]
accepteula = Y
[network]
tlscert = /etc/ssl/certs/cert.pem
tlskey = /etc/ssl/private/privkey.pem
tlsprotocols = 1.2
forceencryption = 1
EDIT-1: I checked /var/log/syslog further and found the following entry:
Error: 49940, Severity: 16, State: 1. Unable to open one or more of the user-specified certificate file(s). Verify that the certificate file(s) exist with read permissions for the user and group running SQL Server.
I also found this question, which seems similar. I tried the approach suggested by Charles, but it does not seem to work. I am also using Let's Encrypt certificates.
EDIT-2: It is not a licensed version. Could this be the reason?
How can I resolve this error?
I just faced the same problem even though I followed the steps in the Microsoft documentation. The actual problem seems to be the permissions on the folder paths where the certificate files are located.
You can verify whether the mssql user is able to read the certificate files by running openssl commands as that user.
This command does a basic check of whether the certificates are valid:
sudo su - mssql -c "openssl verify -verbose -CAfile /etc/ssl/certs/mssql_ca.pem /etc/ssl/certs/cert.pem"
If you want to see whether the combination of certificate and key actually works, you can start an OpenSSL server as the mssql user and then connect to it with an OpenSSL client:
sudo su - mssql -c "openssl s_server -accept 8443 -cert /etc/ssl/certs/cert.pem -key /etc/ssl/private/privkeyrsa.pem -CAfile /etc/ssl/certs/mssql_ca.pem"
openssl s_client -connect localhost:8443
One small correction to the documentation (I am using a CA-provided certificate): I had to convert the key file format. This may not be required in your case.
openssl rsa -in /etc/ssl/private/key.pem -out /etc/ssl/private/privkeyrsa.pem
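Since the answer points at permissions on the folders leading to the certificate files, it can help to list the mode and owner of every directory on the way to a file. Below is a minimal Python sketch of that idea; describe_path is a hypothetical helper name, and it only reports what it finds rather than deciding whether the mssql user has access.

```python
import os
import stat

def describe_path(path):
    """List every component from the filesystem root down to `path`.

    Returns (component, mode_string, uid, gid) tuples. A user needs
    execute (search) permission on each directory in the chain and
    read permission on the file itself.
    """
    path = os.path.abspath(path)
    parts = []
    current = path
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:      # reached the root
            break
        current = parent
    rows = []
    for component in reversed(parts):
        st = os.stat(component)
        rows.append((component, stat.filemode(st.st_mode), st.st_uid, st.st_gid))
    return rows

# Example call, using the key path from the question's mssql.conf:
#   for row in describe_path("/etc/ssl/private/privkey.pem"):
#       print(*row)
```

This is similar in spirit to running namei -l on the file. A directory whose mode string lacks x for the relevant user, group or others would explain an "unable to open" error even when the file itself is readable.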

[Error]: Error - Could not complete SSL handshake. ON LOCALHOST

All of the questions about this error show people running check_nrpe -H [some_remote_ip], in contrast to an error-free run on localhost.
I, however, can't even get this to run on localhost:
$> ./check_nrpe -H localhost
CHECK_NRPE: Error - Could not complete SSL handshake.
The service does appear to be up and running:
$> sudo netstat -apn | grep :5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 5847/nrpe
tcp6 0 0 :::5666 :::* LISTEN 10216/nrpe
And the daemon returns no errors
$> sudo service nagios-nrpe-server status
* nagios-nrpe is running
My nrpe.cfg file has allowed_hosts set correctly:
allowed_hosts=127.0.0.1,10.0.1.2,0.0.0.0
Contents of /var/log/syslog with debugging turned on:
Nov 1 22:54:44 <MYHOST> nrpe[11156]: Connection from ::1 port 6601
Nov 1 22:54:44 <MYHOST> nrpe[11156]: Host ::1 is not allowed to talk to us!
Nov 1 22:54:44 <MYHOST> nrpe[11156]: Connection from ::1 closed.
Does anyone have any idea what's going on? This seems almost nonsensical. Thanks!
Note that my example may be different than yours.
First change to the folder containing your nrpe binary and run:
./nrpe --version
The output from that command will look something like this:
NRPE - Nagios Remote Plugin Executor
Copyright (c) 1999-2008 Ethan Galstad (nagios@nagios.org)
Version: nrpe-3.0
Last Modified: 07-12-2016
License: GPL v2 with exemptions (-l for more info)
SSL/TLS Available, OpenSSL 0.9.6 or higher required
Notice that the last line tells you that SSL is indeed supported by this build of NRPE. If it is not there, you will have to install a version that was compiled with SSL support (which may mean compiling one yourself, depending on where you got it). The docs for the source code are pretty clear on how this is done.
If you DO have the SSL line above, look at the required version on that line and check that at least that version is installed on your system. I used this command:
rpm -qa | grep openssl
And received output looking like this:
libopenssl1_0_0-32bit-1.0.1k-2.39.1.x86_64
openssl-1.0.1k-2.39.1.x86_64
Both openssl and libopenssl are required for NRPE's SSL support to function correctly. If they are not installed, I strongly recommend using your system's package installer (apt-get, yum, zypper, ...) to fetch and install them. If they are already installed but you still have errors, then you likely have a configuration issue in:
/etc/ssl/openssl.cnf
Fixing that is well outside the scope and space available here. I recommend upgrading both packages from a working online package repository. These packages always include a default configuration, which should work fine with NRPE, assuming the version is equal to or higher than the required one.
I think that check_nrpe is trying to use IPv6.
The IPv6 localhost address is ::1, so adding it to the allowed_hosts= line in nrpe.cfg and restarting nrpe should fix this for you.
Alternatively, as another responder replied, you can add -4 to your check_nrpe command to force it to stick to IPv4.
I was having the same issue and it's only when I saw the ::1 in the question it dawned on me what was happening.
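The mismatch can be reproduced offline with a rough Python sketch that reads the allowed_hosts line and checks a client address against it. This only approximates the idea and is not NRPE's actual matching code; parse_allowed_hosts and is_listed are hypothetical names, and host names in the list are simply skipped.

```python
import ipaddress

def parse_allowed_hosts(cfg_text):
    """Return the entries of the allowed_hosts= line in nrpe.cfg text."""
    for line in cfg_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "allowed_hosts":
            return [item.strip() for item in value.split(",") if item.strip()]
    return []

def is_listed(client_ip, entries):
    """Literal match of client_ip against addresses or CIDR networks.

    Simplification: host names are skipped and IPv4-mapped IPv6
    addresses are not unwrapped.
    """
    client = ipaddress.ip_address(client_ip)
    for entry in entries:
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # not an IP address or CIDR block, e.g. a host name
        if client.version == network.version and client in network:
            return True
    return False

# Example, using the allowed_hosts line from the question:
#   entries = parse_allowed_hosts("allowed_hosts=127.0.0.1,10.0.1.2,0.0.0.0")
#   is_listed("::1", entries)
```

With the question's list, 127.0.0.1 matches but ::1 does not, which is consistent with the "Host ::1 is not allowed to talk to us!" log line.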
I am not sure if it is still relevant, but I had the same issue and discovered someone had changed the /etc/hosts.allow file, blocking the access. Somehow this results in the following errors:
Client: Connection refused by TCP wrapper
Server: Error: (nerrs = 0)(!log_opts) Could not complete SSL handshake with <Client IP> : rc=-1 SSL-error=5
Changing the /etc/hosts.allow file solved the issue.

MongoDB, issues with configuring and starting

I am new to MongoDB and I am trying to get it configured and running on my Ubuntu server. When I enter this command in my terminal
sudo service mongod start
I get the following output
start: Job is already running: mongod
So, when I try to enter the shell with
mongo
I get the following output
2015-02-24T14:54:39.557-0800 warning: Failed to connect to 127.0.0.1:27017, reason: errno:111 Connection refused
2015-02-24T14:54:39.559-0800 Error: couldn't connect to server 127.0.0.1:27017 (127.0.0.1), connection attempt failed at src/mongo/shell/mongo.js:146
I know I'm not working locally, so I headed over to the mongod.conf file and changed the following
port = 5000
# Listen to local interface only. Comment out to listen on all interfaces.
bind_ip = 10.0.1.51
Here bind_ip is now the address of my Ubuntu server and the port is 5000, as shown. I then restart the service with
sudo service mongod restart
which outputs
mongod start/running, process 1755
And now I try to re-enter the shell with
mongo
and I still get the same error messages
MongoDB shell version: 2.6.7
connecting to: test
2015-02-24T15:01:26.229-0800 warning: Failed to connect to 127.0.0.1:27017, reason: errno:111 Connection refused
2015-02-24T15:01:26.230-0800 Error: couldn't connect to server 127.0.0.1:27017 (127.0.0.1), connection attempt failed at src/mongo/shell/mongo.js:146
exception: connect failed
Can someone help me out with this issue? I've been going through the forums and nothing appears to be working. Thanks.
If anyone is having the same trouble: I looked into mongod --help and found the following solutions
mongod --smallfiles
or
mongod --nojournal
Hope this helps someone.
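The thread does not confirm the root cause, but one detail is visible in the question: the shell output still shows 127.0.0.1:27017, while the edited mongod.conf sets a different address and port. The Python sketch below reads the old-style key = value settings and builds a matching shell command line using the mongo shell's --host and --port options. The function names are hypothetical, and the fallback values are the shell's default target as seen in the question's output.

```python
def read_listen_settings(conf_text, default_host="127.0.0.1", default_port=27017):
    """Extract bind_ip and port from old-style `key = value` mongod.conf text."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    # bind_ip may hold a comma-separated list; use the first address.
    host = settings.get("bind_ip", default_host).split(",")[0].strip()
    port = int(settings.get("port", default_port))
    return host, port

def shell_command(conf_text):
    """Build a mongo shell invocation that targets the configured address."""
    host, port = read_listen_settings(conf_text)
    return f"mongo --host {host} --port {port}"

# Example, using the settings from the question:
#   shell_command("port = 5000\nbind_ip = 10.0.1.51\n")
```

Comparing the resulting command with the plain mongo call in the question shows whether the shell and the server are pointed at the same address and port.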

Nagios: CRITICAL - Socket timeout after 10 seconds

I've been running nagios for about two years, but recently this problem started appearing with one of my services.
I'm getting
CRITICAL - Socket timeout after 10 seconds
for a check_http -H my.host.com -f follow -u /abc/def check, which used to work fine. No other services report this problem. The remote site is up and healthy, and a wget http://my.host.com/abc/def from the Nagios server downloads the response just fine. Also, check_http -H my.host.com -f follow works, i.e. things break only when I use the -u argument. Passing a different user agent string made no difference, and increasing the timeout did not help. I tried with -v, but all I get is:
GET /abc/def HTTP/1.0
User-Agent: check_http/v1861 (nagios-plugins 1.4.11)
Connection: close
Host: my.host.com
CRITICAL - Socket timeout after 10 seconds
... which does not tell me what's going wrong.
Any ideas how I could resolve this?
Thanks!
Try using the -N option of check_http.
I ran into similar problems, and in my case the web server didn't terminate the connection after sending the response (https was working, http wasn't). check_http tries to read from the open socket until the server closes the connection. If that doesn't happen then the timeout occurs.
The -N option tells check_http to receive only the header, but not the content of the page / document.
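The difference between waiting for the server to close the connection and stopping after the headers can be illustrated with a short Python sketch. This is not how check_http is implemented; fetch_status_line is a hypothetical function that sends the same kind of HTTP/1.0 request and returns as soon as the blank line ending the header block arrives.

```python
import socket

def fetch_status_line(host, port, path="/", timeout=10.0):
    """Send an HTTP/1.0 GET and return the status line once the header
    block is complete, without waiting for the server to close the
    connection."""
    request = (f"GET {path} HTTP/1.0\r\n"
               f"Host: {host}\r\n"
               "Connection: close\r\n\r\n").encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        data = b""
        # Stop at the blank line that terminates the headers.
        while b"\r\n\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:          # server closed before headers finished
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("iso-8859-1")

# Example call, using the host and path from the question:
#   fetch_status_line("my.host.com", 80, "/abc/def")
```

A reader that instead loops until recv returns an empty chunk would hang until the timeout whenever the server keeps the socket open, which is the behaviour described above.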
I tracked my issue down to an issue with the security providers configured in the most recent version of OpenSUSE.
From a summary of other web pages, it appears to be caused by an attempt to use the TLSv2 protocol (presumably TLS 1.2), which either does not work correctly or is missing something in the default configuration that would allow it to work.
To overcome the problem I commented out the security provider in question from the JRE security configuration file.
#security.provider.10=sun.security.pkcs11.SunPKCS11
The security.provider. value may be different in your configuration, but essentially the SunPKCS11 provider is at issue.
This configuration is normally found in
$JAVA_HOME/lib/security/java.security
of the JRE that you are using.
Fixed with this command definition in nrpe.cfg (on Debian 6.0 Squeeze using nagios-nrpe-server):
command[check_http]=/usr/lib/nagios/plugins/check_http -H localhost -p 8080 -N -u /login?from=%2F
For whoever is interested, I stumbled in this problem too and the problem ended up being in mod_itk on the web server.
A patch is available, even if it seems it's not included in the current CentOS or Debian packages:
https://lists.err.no/pipermail/mpm-itk/2015-September/000925.html
In my case, the /etc/postfix/main.cf file was not configured properly.
My mail relay server was not defined, and the relay settings were also very restrictive.
I had to add:
relayhost = mailrelay.ext.example.com
smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination
