Nagios emails not received even when notification is fired poperly - nagios

After enabling detailed debugging, I can see that Nagios is firing notifications properly.. Here is what I see in nagios.logs
[1430915423] SERVICE ALERT: test;Check node port;CRITICAL;HARD;4;Connection refused
[1430915423] SERVICE NOTIFICATION: abhishek;test;Check node port;CRITICAL;notify-service-by-email;Connection refused
[1430915423] SERVICE NOTIFICATION: root;test;Check node port;CRITICAL;notify-service-by-email;Connection refused
However, I do not receive emails at the specified contact.. I am using SSMTP..
It is working fine as well.. This command works -
ssmtp abc#xxx.com
Therefore, either 2 things can happen -
notify-service-by-email
is not working OR some security check is filtering out such emails (this should not happen as I am sending emails from my email address).. Can any one suggest how to debug this..?
EDIT - Here is my notify-service-by-email command -
define command{
command_name notify-service-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}

Finally found the issue..
sSMTP was working properly.. Tested it with this command -
ssmtp -s abcd#xxx.com
Enabled DEBUG logs to find out that /etc/ssmtp/ssmtp.conf did not have sufficient permission..
The file was owned by root instead of nagios user
Hope this helps someone..

Related

Broadcast an Intent from Historical Broadcast

The story behind it: I'm trying to command an Android box (with a proprietary launcher) that also manages TV channels. To enter the channel section it is not sufficient to type a number, but a specific key must be pressed. I want to find a way to replicate that key, and then use this command on Home Assistant. I could try with a bluetooth sniffer, but it will be after the failure of my actual attempt.
I ran this adb command after pressing the specific key for TV channels:
adb.exe shell dumpsys activity broadcasts history
And the last broadcast in history is this (timvision is the name of the box):
Historical Broadcast foreground #0:
BroadcastRecord{560a0f4 u0 android.intent.action.GLOBAL_BUTTON} to user 0
Intent { act=android.intent.action.GLOBAL_BUTTON flg=0x10000010 (has extras) }
targetComp: {timvision.launcher/timvision.launcher.TimVisionKeyReceiver}
extras: Bundle[{android.intent.extra.KEY_EVENT=KeyEvent { action=ACTION_UP, keyCode=KEYCODE_LAST_CHANNEL, scanCode=377, metaState=0, flags=0x8, repeatCount=0, eventTime=10092564, downTime=10092515, deviceId=27, source=0x301, displayId=-1 }}]
caller=android 3514:system/1000 pid=3514 uid=1000
enqueueClockTime=2022-04-16 14:43:54.577 dispatchClockTime=2022-04-16 14:43:54.578
dispatchTime=-3s691ms (+1ms since enq) finishTime=-3s502ms (+189ms since disp)
resultTo=null resultCode=0 resultData=null
nextReceiver=1 receiver=null
Deliver +189ms #0: (manifest)
priority=0 preferredOrder=0 match=0x0 specificIndex=-1 isDefault=false
ActivityInfo:
name=timvision.launcher.TimVisionKeyReceiver
packageName=timvision.launcher
enabled=true exported=true directBootAware=false
resizeMode=RESIZE_MODE_RESIZEABLE
Is possible to replicate this broadcast ? I tried this (with extra_key values too) but seems is not allowed:
adb.exe shell am broadcast -a android.intent.action.GLOBAL_BUTTON -n timvision.launcher/timvision.launcher.TimVisionKeyReceiver
Error:
java.lang.SecurityException: Permission Denial: not allowed to send broadcast android.intent.action.GLOBAL_BUTTON from pid=32487, uid=2000
Alternatives or ideas are welcome.
Thanks
It is not the solution to the question but it is the solution to my problem. So I post it.
The key to enter in the TV channels section is listed on the official Android documentation and is KEYCODE_LAST_CHANNEL with code 229.
On Home Assistant the service to launch is this:
service: androidtv.adb_command
data:
command: input keyevent 229
target:
entity_id: media_player.your_android_tv_entity

[flink]Task manager initialization failed

I am new to flink. I am trying to run the flink example on my local PC(windows).
However, after I run the start-cluster.bat, I login to the dashboard, it shows the task manager is 0.
I checked the log and seems it fails to initialize:
2020-02-21 23:03:14,202 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager initialization failed.
org.apache.flink.configuration.IllegalConfigurationException: Failed to create TaskExecutorResourceSpec
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpec.FromConfig(TaskExecutorResourceUtils.java:72)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManager(TaskManagerRunner.java:356)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.<init>(TaskManagerRunner.java:152)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:308)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerSecurely$2(TaskManagerRunner.java:322)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerSecurely(TaskManagerRunner.java:321)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:287)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The required configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) is not set
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkConfigOptionIsSet(TaskExecutorResourceUtils.java:90)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.lambda$checkTaskExecutorResourceConfigSet$0(TaskExecutorResourceUtils.java:84)
at java.util.Arrays$ArrayList.forEach(Arrays.java:3880)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkTaskExecutorResourceConfigSet(TaskExecutorResourceUtils.java:84)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:70)
... 7 more
2020-02-21 23:03:14,217 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
Basically, it looks like a required option 'taskmanager.cpu.cores' is not set. However, I can't find this property in flink-conf.yaml and in the document(https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html) either.
I am using flink 1.10.0. Any help would be highly appreciated!
That configuration option is intended for internal use only -- it shouldn't be user configured, which is why it isn't documented.
The windows start-cluster.bat is failing because of a bug introduced in Flink 1.10. See https://jira.apache.org/jira/browse/FLINK-15925.
One workaround is to use the bash script, start-cluster.sh, instead.
See also this mailing list thread: https://lists.apache.org/thread.html/r7693d0c06ac5ced9a34597c662bcf37b34ef8e799c32cc0edee373b2%40%3Cdev.flink.apache.org%3E

Mongoose opening multiple unwanted TCP sockets on reconnect

Wanting to test a mongoDB server up/down procedure connected to Node/Mongoose, we found out that Mongoose can sometimes open hundreds of TCP sockets (which is not necessary and potentially blocking for the user who is limited to a certain amount of sockets). This occurs in the following case and environment :
Node supervised with PM2 and MongoDB surevised with daemontools
At normal and clean startup :
$ netstat -alpet | grep mongo
tcp 0 0 *:27017 *:* LISTEN mongo 65910844 22930/mongod
tcp 0 0 localhost.localdomain:27017 localhost.localdomain:54595 ESTABLISHED mongo 6591110422930/mongod
The last "ESTABLISHED" line repeated 5 times since the option (poolSize: 5) is specified in Mongoose ("mongo" is the user running mongod under daemontools)
When we have the Node procedure :
mongoose.connection.on('disconnected', function () {
var options = {server: { auto_reconnect:true, poolSize: 5 ,socketOptions: { connectTimeoutMS: 5000 } }
}
console.log('Mongoose default connection disconnected ' + mongoose.connection.readyState);
mongoose.connect( dbURI, options );
});
and we bring down the MongoDB by daemontools (mongodbdaemon is a simple $mongod command) :
svc -d /service/mongodbdaemon
there is of course no mongod running in the system (tested by the netstat command ) and the web server pages called which are using mongoose announce what is normal :
{"name":"MongoError","message":"topology was destroyed"}
The problem occurrs at this stage. Since the time we bring down MongoDB, Mongoose accumulates all the connect() calls in the 'disconnected' event handler. This means that the longer we wait before bringing up MongoDB, the more TCP connections will be opened.
So bringing up MongoDB by
svc -u /service/mongodbdaemon
gives the following :
$ netstat -alpet | grep mongo | wc -l
850 'ESTABLISHED' TCP connections to mongod !
If we bring down again mongod, the hundreds of connections remain in the TIME_WAIT state until Linux cleans the socket pool.
Questions
Can we check if a MongoDB instance is available before connecting to it ?
Can we configure Mongoose not to accumulate reconnecting() tries every millisecond or so ?
Is there a buffer for pending connection operations (as there is for mongoose.insert[...]) that we can access or clean manually ?
Problem reproductible on a CentOS 6.7 / mongoDB 3.0.6 / mongoose 4.1.8 / node 4.0.0
Edit :
From the official mongoose site where I posted this question after posting it here, I received an answer : "using auto_reconnect : true, on the initial connect() operation (which is set by default) there is no reason to reconnect() in a disconnect event callback".
This is true and it works jute fine, but the question is now why does this happen and how to avoid it (it is serious enough on the Linux system level to be an issue in mongoose).
Thanks !

about qpid exchange,queue

every one:
i am new to qpid and encounter some problem. the exchange created by me can’t route message to the queue, as follows:
first i create a durbale queue “test-queue-1” in the qpid use the quid-config command:
qpid-config add queue test-queue-1 --durable
next i create a durable direct exchange “test-exchange-1" in the qpid also use the qpid-config command:
qpid-config add exchange direct test-exchange-1 --durable
the last, in bind them as follow command:
qpid-config bind test-exchange-1 test-queue-1 test-queue-1
everything seems ok in the qpid-tool:
Object Summary:
ID Created Destroyed Index
========================================================================================
128 12:28:28 - org.apache.qpid.broker:queue:qmfc-v2-hb-iZ23c6sri0pZ.12680.1
129 12:28:28 - org.apache.qpid.broker:queue:qmfc-v2-iZ23c6sri0pZ.12680.1
130 12:28:28 - org.apache.qpid.broker:queue:qmfc-v2-ui-iZ23c6sri0pZ.12680.1
131 12:28:28 - org.apache.qpid.broker:queue:reply-iZ23c6sri0pZ.12680.1
132 12:24:17 - org.apache.qpid.broker:queue:test-queue-1
133 12:28:28 - org.apache.qpid.broker:queue:topic-iZ23c6sri0pZ.12680.1
116 12:27:20 -
and
org.apache.qpid.broker:binding:org.apache.qpid.broker:exchange:test-exchange-1,org.apache.qpid.broker:queue:test-queue-1,test-queue-1
now i am ready to test them, start the recv/send demo program:
[devel#iZ23c6sri0pZ build]$ ./recv amqp://127.0.0.1/test-queue-1
send the message:
[devel#iZ23c6sri0pZ build]$ ./send -a amqp://127.0.0.1/test-exchange-1 hi,everyone
but the "recv program” can’t recv any message.
if i send message like this :
[devel#iZ23c6sri0pZ build]$ ./send -a amqp://127.0.0.1/test-queue-1 hi,everyone
the “recv program” can recv the message:
Address: amqp://127.0.0.1/test-queue-1
Subject: Hello Subject
Content: "hi,everyone"
who can tell me why?i read the amqp protocol, maybe the routing-key in the message don’t match the binding-key, but if this, how could i set the routing-key?
my recv/send writed by proton-c , version 0.8. qpidd is 0.32 version.
When you send a message to a qpid direct exchange, it gets routed to a bound queue based on the routing-key of the message. In proton-c you can set the routing key by setting the message-subject using the function
PN_EXTERN int pn_message_set_subject (pn_message_t* msg,const char* subject)
Unfortunately this is not implemented in the example send.c that is shipped with proton-c v0.8 You can insert the following line somewhere around here and rebuild your send executable
pn_message_set_subject(message, "my-routing-key");
You can also with some effort, add a new command-line option to accept and use a routing-key from ./send
The java example implements a -s option to set the message subject.
I too think it's a binding issue.
Try binding with following,
qpid-config bind test-exchange-1 test-queue-1 test-exchange-1
#Feng Fang: "test-exchange-1" is a routing key which is you are using while sending a message. If that does not give a try with "test-exchange-1/test-exchange-1"
Keep rest as-is and give a try.
I hope this helps!

Implement NOT logic (Negative logic) in nagios alarm

I am newbie in network monitoring field and I have just started my work on nagios. So I have some basic doubts related to nagios.
we have a localhost.cfg at /usr/local/nagios/etc/objects/localhost.cfg
define service{
use local-service ; Name of service template to use
host_name blah-16.10
service_description Sample Check
check_command check_http_services!-H mydomain.com -u "/sample_url" --string "foo bar" -t 60
}
My questions:
1.) I know this script checks "http service" for the url "www.mydomain.com/sample_url" and find the text "foo bar" on that web page.
but I do not know the meaning/usage of the options (-H, -u, -t 60, --string)
i have googled but I can not find proper documentation where I can find the meaning of these parameters. Can anyone please suggest some link/urls for this?
2.) I want to implement kind of negative logic in my alarm. For example: I want to raise the alarm only when I find "status closed"`string on my web page (www.mydomain.com/sample_url)
How can I achieve this in nagios?
Note: During my searching, I found all those examples which worked like "If 'sample string' found within specific time then 'No Alarm'. If 'sample string' not found in specific time, then only 'Raise Alarm'".
But i need exact opposite.

Resources