How to handle SSH failures in Expect script - loops

I have an Expect script that logs in to a list of devices and runs a series of commands.
Everything works fine except when one of the hosts is/becomes unreachable and the script just exits. Is there a way to get it to skip the unreachable host & move on to the remaining devices?
Here's the main body of my script.
foreach host $hosts {
    spawn -noecho /usr/bin/ssh user@$host
    set timeout 10
    expect {
        "assword:" { send [string trimright "$pwd" "\r"] }
        "No route*" { puts "Host error -> $expect_out(buffer)"; exit }
        "Could not resolve*" { puts "Host error -> $expect_out(buffer)"; exit }
    }
    expect "#"
    send "term len 0\r"
    expect "#"
    send "show version\r"
    expect "#"
    send "exit\r"
    expect eof
}
And here's what I get:
.
. <output of reachable device - R1>
.
Connection to R1 closed by remote host.
Connection to R1 closed.
ssh: Could not resolve hostname R2: Name or service not known
Host error -> ssh: Could not resolve hostname R1: Name or service not known

Given that Expect is essentially an extension of the Tcl language, your question really boils down to "how do I end a loop iteration early in Tcl?".
The answer is to use the continue command instead of the exit command in the two error branches, so that the foreach loop moves on to the next host.

Related

Check database connectivity

I'm writing a Unix script to check database connectivity on a server. When the database connection fails or is slow to establish, I want the output to be "Not connected". When it connects, the output should be "Connected". It is an Oracle database.
When the connection is delayed, my code does not work and the script hangs. What changes should I make so that it handles both conditions (an error connecting to the database, and a delay in connecting)?
if sqlplus $DB_USER/$DB_PASS@$DB_INSTANCE < /dev/null | grep 'Connected to'; then
    echo "Connectivity is OK"
else
    echo "No Connectivity"
fi
The first thing to add to your code is a timeout. Checking database connectivity is not easy, and problems can occur in any of the layers your connection passes through. A timeout lets you break out of a hanging session and go on to report that the connection failed.
googleFu gave me a few nice examples:
Timeout a command in bash without unnecessary delay
If you are using Linux, you can use the timeout command to do what you want. So the following will have three outcomes, setting the variable RC as follows:
"Connected to" successful: RC set to 0
"Connected to" not found: RC set to 1
sqlplus command timed out after 5 minutes: RC set to 124
WAIT_MINUTES=5
# Run sqlplus under a time limit; timeout exits with 124 if the limit is hit
SP_OUTPUT=$(timeout ${WAIT_MINUTES}m sqlplus $DB_USER/$DB_PASS@$DB_INSTANCE < /dev/null)
CMD_RC=$?
if [ $CMD_RC -eq 124 ]
then
    ERR_MSG="Connection attempt timed out after $WAIT_MINUTES minutes"
    RC=$CMD_RC
else
    # sqlplus finished in time; check its output for the success banner
    echo "$SP_OUTPUT" | grep -q 'Connected to'
    GREP_RC=$?
    if [ $GREP_RC -eq 0 ]
    then
        echo "Connectivity is OK"
        RC=0
    else
        ERR_MSG="Connectivity or user information is bad"
        RC=1
    fi
fi
if [ $RC -gt 0 ]
then
    # Add code to send email with subject of $ERR_MSG and body of $SP_OUTPUT
    echo "Need to email someone about $ERR_MSG"
fi
exit $RC
I'm sure there are several improvements to this, but this will get you started.
Briefly, we use the timeout command to wait the specified time for the sqlplus command to run. I separated out the grep as a separate command to allow the use of timeout and to allow more flexibility in checking additional text messages.
There are several examples on StackOverflow on sending email from a Linux script.
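The timeout-then-grep pattern above can also be wrapped in a small function, which makes it easy to try out without a database. This is only a sketch: check_connect is a hypothetical helper name, and echo and sleep stand in for the real sqlplus call. It assumes GNU timeout is available, as on most Linux systems.

```shell
#!/bin/bash
# Sketch only: check_connect is a hypothetical helper, not part of any tool.
# Usage: check_connect SECONDS PATTERN COMMAND [ARGS...]
# Returns 0 if PATTERN appears in the output, 1 if it does not,
# and 124 if the command was killed by timeout.
check_connect() {
    local limit=$1 pattern=$2
    shift 2
    local output rc
    output=$(timeout "$limit" "$@" < /dev/null 2>&1)
    rc=$?
    if [ "$rc" -eq 124 ]; then
        return 124
    fi
    if printf '%s\n' "$output" | grep -q -- "$pattern"; then
        return 0
    fi
    return 1
}

# Stand-in commands instead of a real sqlplus session:
check_connect 2 'Connected to' echo 'Connected to: Oracle Database'; rc_ok=$?
check_connect 2 'Connected to' echo 'ORA-12541: TNS:no listener';    rc_bad=$?
check_connect 1 'Connected to' sleep 3;                               rc_slow=$?
echo "ok=$rc_ok bad=$rc_bad slow=$rc_slow"
```

With a real database, the call would look something like check_connect 300 'Connected to' sqlplus "$DB_USER/$DB_PASS@$DB_INSTANCE", using the variables from the question.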

Execute a shell command in C but not through web page

I have a program written in C, Java, and HTML5.
Its service file runs it as the root user, on an Arch Linux system.
I want to take the IP addresses that the SMB lookup finds and show them on an HTML page I have.
The part of the code I am interested in is this:
sds ns_server_list(sds buffer, sds method, int request_id)
{
    FILE *fp = popen("/usr/bin/nmblookup -S \'*\' | grep \"<00>\" | awk \'{print $1}\'", "r");
    // returns three lines per server found - 1st line ip address 2nd line name 3rd line workgroup
    if (fp == NULL)
    {
        LOG_ERROR("Failed to get server list");
        buffer = jsonrpc_respond_message(buffer, method, request_id, "Failed to get server list", true);
    }
If I compile the program and run it from a shell (./test), it returns the results in the HTML page just fine.
If I run it through the program's web interface, started from the service file, it logs these SMB errors in journalctl -xe:
WARNING: no network interfaces found
name_query failed to find name *
ERROR Executing syscmd "/usr/bin/nmblookup -S '*'" failed
SMB.conf
[global]
server string = SMB Server
security = user
guest account = root
map to guest = bad user
log level = 0
load printers = No
syslog = 0
directory mask = 0775
create mask = 0775
browseable = yes
#veto files = /._*/.DS_Store/
interfaces = eth0 192.168.2.0/24
Summarized...
If I run the program through the service file without an IP address in smb.conf, I get the errors in journalctl. If I put the IP address in, everything is OK, but I do not want it to work this way (a dynamic IP and eth0 alone do not work; same errors).
If I run the compiled executable from a shell (./test), everything works fine without an IP address in smb.conf.
Why is that?
Thank you.

Catch invalid password on Sudo

Is there a way to trap/catch an invalid password when you use sudo? Basically I want to return a specific exit code if the sudo password is invalid. I don't want to avoid sudo or get around it, I just want to close/exit a script in a manner of my choosing.
Based on the man page of sudo(8), there is no easy way to determine the exact reason for a failure:
Exit Value
Upon successful execution of a program, the exit status from sudo will
simply be the exit status of the program that was executed.
Otherwise, sudo exits with a value of 1 if there is a
configuration/permission problem or if sudo cannot execute the given
command. In the latter case the error string is printed to the
standard error. If sudo cannot stat(2) one or more entries in the
user's PATH, an error is printed on stderr. (If the directory does not
exist or if it is not really a directory, the entry is ignored and no
error is printed.) This should not happen under normal circumstances.
The most common reason for stat(2) to return ''permission denied'' is
if you are running an automounter and one of the directories in your
PATH is on a machine that is currently unreachable.
The only approach that comes to my mind is an "ugly" one: parse what sudo writes to stderr to determine the reason for the error:
#!/bin/bash
tmpfile=$(mktemp)
sudo echo "dummy" 2>"$tmpfile"
if [ $? -eq 1 ]; then
    if grep -q -x "sudo.*incorrect password attempts" "$tmpfile"; then
        # exit due to failed password attempts
        echo "too many failed password attempts"
    else
        # other reason, for instance configuration
        echo "other reason"
    fi
fi
rm "$tmpfile"
Note, however, that this approach is neither upgrade-safe nor language-independent: if a patch to sudo changes the text shown to the user after a wrong password, or the user logs on in a different language, this code will not handle it properly.
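One way to make the text matching a little easier to test and adapt is to isolate it in a function that maps the stderr text to an exit code. The following is a minimal sketch, not a robust solution: classify_sudo_stderr is a hypothetical helper, the codes 10 and 11 are arbitrary choices, and the second sample message is made up for illustration. Running sudo with LC_ALL=C should keep its own messages untranslated, but treat that as an assumption to verify on your system.

```shell
#!/bin/bash
# Sketch only: classify_sudo_stderr is a hypothetical helper.
# Returns 10 for a failed-password message, 11 for anything else
# (both codes are arbitrary and chosen for this example).
classify_sudo_stderr() {
    case $1 in
        # Matches both "1 incorrect password attempt" and "3 incorrect password attempts"
        *"incorrect password attempt"*) return 10 ;;
        *) return 11 ;;
    esac
}

# Intended use (not run here, because it needs a terminal and a password):
#   err=$(LC_ALL=C sudo true 2>&1 >/dev/null)
#   if [ $? -eq 1 ]; then classify_sudo_stderr "$err"; exit $?; fi

# Sample strings instead of a real sudo run:
classify_sudo_stderr 'sudo: 3 incorrect password attempts'; rc_pw=$?
classify_sudo_stderr 'sudo: some other problem';            rc_other=$?
echo "password=$rc_pw other=$rc_other"
```

The same caveat as above applies: the match still depends on the wording sudo uses, so it can break after an upgrade.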

How can i catch error message from command line TCL

I'm writing a Tcl script in ICC and trying to capture the error message when I submit a run to Sun Grid Engine.
For example, I have the below line.
sh /usr/bin/xterm -e "cd DM ; mqsub -int -parallel 200 cal -cal -t 200 CAL_header | tee S.log ; touch .S_finished" &
Since I don't have 200 free CPUs, if I execute this command line in a Linux shell I get the message below:
"Your "qrsh" request could not be scheduled, try again later."
How can I catch this error message in ICC when the command ends with &?
Thanks
I assume you are executing the shell command with exec in Tcl.
In that case, you can use the catch command to capture the error message.
if { [catch {exec <your_shell_program_command_here>} result] } {
    puts "Following problem happened : $result"
    exit 1
}
Syntax :
catch script ?varName?
Quoting the below from the man page
If script raises an error, catch will return a non-zero integer value
corresponding to the exceptional return code returned by evaluation of
script. Tcl defines the normal return code from script evaluation to
be zero (0), or TCL_OK. Tcl also defines four exceptional return
codes: 1 (TCL_ERROR), 2 (TCL_RETURN), 3 (TCL_BREAK), and 4
(TCL_CONTINUE). Errors during evaluation of a script are indicated by
a return code of TCL_ERROR.
Do you mean Synopsys IC Compiler by the term ICC?
If there is an error regarding the queue when launching a job for any EDA tool, prefer launching it the following way:
{launch command for job (qsub *switches* ) } > & log &
This will eliminate the troubleshooting difficulties, because the scheduler's messages end up in the log file.
Sorry for posting this question as an answer, but I am unable to comment.
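The log-redirection suggestion above uses csh-style syntax. As a rough sketch of the same idea in bash, the job's output can be sent to a log file and the log searched afterwards. Here launch_job is a hypothetical stand-in for the real qsub/mqsub command; it only prints the scheduler message quoted in the question.

```shell
#!/bin/bash
# Sketch only: launch_job is a hypothetical stand-in for the real
# qsub/mqsub command; it just prints the message from the question.
launch_job() {
    echo 'Your "qrsh" request could not be scheduled, try again later.' >&2
    return 1
}

log=$(mktemp)
launch_job > "$log" 2>&1 &    # bash spelling of csh's ">& log &"
wait $!                       # collect the background job's exit status
job_rc=$?

# Search the log for the scheduler's complaint
if grep -q 'could not be scheduled' "$log"; then
    sched_error=yes
else
    sched_error=no
fi
echo "job_rc=$job_rc sched_error=$sched_error"
rm -f "$log"
```

A script that cannot wait for the job could instead poll for a marker file such as the .S_finished file the original command touches, and read the log once it appears.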

Batch with No args runs as job (scheduled task) with no errors, Batch with 1 arg fails with Access Denied. Why?

Here's a trivial batch:
@echo off
if not .%1==.-b goto else
echo Running with -b flag ON
goto endif
:else
echo Running with NO flags
:endif
Now, trying to run this from a scheduled task on a Windows Server 2003...
If the task is run like: "C:\Test\test.bat" then the log (Schedlgu.txt) says:
"Test Job.job" (test.bat)
Started 7/14/2010 10:27:19 AM
"Test Job.job" (test.bat)
Finished 7/14/2010 10:27:19 AM
Result: The task completed with an exit code of (0).
However, when running like: "C:\Test\test.bat -b" then:
"Test Job.job" (test.bat -b) 7/14/2010 10:28:02 AM ** ERROR **
Unable to start task.
The specific error is:
0x80070005: Access is denied.
Try using the Task page Browse button to locate the application.
The task is running under the Admin account (of the domain). I have also granted full access to this user to the local cmd.exe
Any thoughts why the task fails when running a batch with one argument?
Thx
Run the task with parameters like this:
"C:\Test\test.bat" -b
Note the different quoting!
The first string inside quotes is always treated as the file name, so "C:\Test\test.bat -b" is looked up as a single file that does not exist, hence the error message you see.
