While loop in snakemake; checkpoints


One of the tools I run from Snakemake occasionally fails on the same input: in roughly one run out of N it exits with a non-zero status and writes no output. When that happens, I want to rerun the rule until it succeeds, a kind of "while loop". I know that this sort of logic, which cannot be expressed in an explicit DAG, goes against the idea of Snakemake, but with the introduction of checkpoints I believe a solution can be found.
Thanks!

You cannot "restart the rule", but you can run the same command multiple times.
Here is a recipe for handling the exit code yourself:
shell:
    """
    set +e
    somecommand ...
    exitcode=$?
    if [ $exitcode -eq 1 ]
    then
        exit 1
    else
        exit 0
    fi
    """
The same approach should work with the while loop in bash:
shell:
    """
    set +e
    somecommand ...
    exitcode=$?
    while [ $exitcode -ne 0 ]
    do
        somecommand ...
        exitcode=$?
    done
    """

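The loop above never gives up, so a tool that fails every time would hang the rule. A minimal sketch of a bounded variant in plain bash follows; `retry` is a hypothetical helper name and the attempt limit is an arbitrary assumption. Note that inside a Snakemake `shell:` string, literal curly braces have to be doubled.

```shell
#!/usr/bin/env bash
# retry MAX CMD [ARGS...]: run CMD until it exits 0, at most MAX times.
# "retry" is a hypothetical helper, not something Snakemake provides.
retry() {
    local max=$1 attempt=1
    shift
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
    done
}
```

It would be called as `retry 5 somecommand ...`; the rule still fails if every attempt fails, which keeps the error visible.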
Related

Randomly generating invoice IDs - moving text database into script file?

I've come up with the following bash script to randomly generate invoice numbers, preventing duplications by logging all generated numbers to a text file "database".
To my surprise the script actually works, and it seems robust (although I'd be glad to have any flaws pointed out to me at this early stage rather than later on).
What I'm now wondering is whether it's at all possible to move the "database" of generated numbers into the script file itself. This would allow me to rely on and keep track of just the one file rather than two separate ones.
Is this at all possible, and if so, how? If it isn't a good idea, what valid reasons are there not to do so?
#!/usr/bin/env bash
generate_num() {
#num=$(head /dev/urandom | tr -dc '[:digit:]' | cut -c 1-5) [Original method, no longer used]
num=$(shuf -i 10000-99999 -n 1)
}
read -p "Are you sure you want to generate a new invoice ID? [Y/n] " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]
then
generate_num && echo Generating a random invoice ID and checking it against the database...
sleep 2
while grep -xq "$num" "ID_database"
do
echo Invoice ID \#$num already exists in the database...
sleep 2
generate_num && echo Generating new random invoice ID and checking against database...
sleep 2
done
while [[ ${#num} -gt 5 ]]
do
echo Invoice ID \#$num is more than 5 digits...
sleep 2
generate_num && echo Generating new random invoice ID and checking against database...
sleep 2
done
echo Generated random invoice ID \#$num
sleep 1
echo Invoice ID \#$num does not exist in database...
sleep 2
echo $num >> "ID_database" && echo Successfully added Invoice ID \#$num to the database.
else
echo "Exiting..."
fi

I do not recommend this because:
These things are fragile. One bad edit and your invoice database is corrupt.
It makes version control a pain. Each new version of the script should preferably be checked in. You could add logic to make sure that "$mydir" is an empty directory when you run the script (except for "$myname", .git and other git-related files) then run git -C "$mydir" init if "$mydir"/.git doesn't exist. Then for each database update, git -C "$mydir" add "$myname" and git -C "$mydir" commit -m "$num". It's just an idea to explore...
Locking - File locking could ensure that no two users run the script at the same time, but it adds complexity, so I didn't bother. If you consider that a risk, you need to add it.
... but you want a self-modifying script, so here goes.
This adds a new invoice number to its internal database each time you run it. I've explained what goes on in comments. If you copy the script, the last line should read __INVOICES__ (followed by a newline).
As always when dealing with things like this, remember to make a backup before making changes :-)
As it's currently written, you can only add one invoice per run. It shouldn't be hard to move things around (you need a new tempfile) to get it to add more than one if you need that.
#!/bin/bash
set -e # exit on error - important for this type of script
#------------------------------------------------------------------------------
myname="$0"
mydir=$(dirname "$myname")
if [[ ! -w $myname ]]; then
echo "ERROR: You don't have permission to update $myname" >&2
exit 1
fi
# create a tempfile to be able to update the database in the file later
#
# set -e makes the script end if this fails:
temp=$(mktemp -p "$mydir")
trap 'rm -f "$temp"' EXIT # remove the tempfile if we die for some reason
# read current database from the file
readarray -t ID_database < <(sed '0,/^__INVOICES__$/d' "$0")
#declare -p ID_database >&2 # debug
#------------------------------------------------------------------------------
# a function to check if a number is already in the db
is_it_taken() {
local num=$1
# return 1 (true, yes it's taken) if the regex found a match
[[ ! " ${ID_database[@]} " =~ " ${num} " ]]
}
generate_num() {
local num
(exit 1) # set $? to 1
# loop until $? becomes 0
while (( $? )); do
num=$(shuf -i 10000-99999 -n 1)
is_it_taken "$num"
done
# we found a free number
echo $num
}
add_to_db() {
local num=$1
# add to db in memory
ID_database+=($num)
# add to db in file:
# copy the script to the tempfile
cp -pf "$myname" "$temp"
# add the new number
echo $num >> "$temp"
# move the tempfile into place
mv "$temp" "$myname"
}
#------------------------------------------------------------------------------
num=$(generate_num)
add_to_db $num
# your business logic goes here:
echo "All current invoices:"
for invoice in "${ID_database[@]}"
do
echo ">$invoice<"
done
#------------------------------------------------------------------------------
# leave the rest untouched:
exit
__INVOICES__
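The script above does not handle the locking caveat from the list of drawbacks. A minimal sketch of one way to add it, assuming a hypothetical helper `with_lock` and a lock directory of your choosing; it relies on `mkdir` being atomic, and a stale lock left behind by a killed process would have to be removed by hand:

```shell
#!/usr/bin/env bash
# with_lock LOCKDIR CMD [ARGS...]: run CMD only while holding LOCKDIR.
# mkdir either creates the directory or fails, so only one caller wins.
with_lock() {
    local lockdir=$1 status=0
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another instance holds $lockdir" >&2
        return 1
    fi
    "$@" || status=$?     # remember the command's exit status
    rmdir "$lockdir"      # release the lock
    return "$status"
}
```

In the script this could wrap the update step, for example `with_lock "$mydir/.invoice.lock" add_to_db "$num"`, where the lock path is again only an assumption.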

Edited
To answer the question you asked -
Make sure your file ends with an explicit exit statement.
Without some sort of branching it won't execute past that, so unless there is a gross parsing error anything below could be used as storage space. Just
echo $num >> $0
If you write your records directly onto the bottom of the script, the script grows, but relatively harmlessly. Just make sure your grep pattern doesn't grab any lines of code, though grep -E '^[0-9]{5}$' seems pretty safe.
This is only ever going to give you a maximum of ~90k IDs, and it spends unneeded time and cycles on redundancy checking. Is there a limit on the length of the value?
If you can assure there won't be more than one invoice processed per second,
date +%s >> "ID_database" # the UNIX epoch, seconds since 00:00:00 01/01/1970
If you need more precision than that,
date +%Y%m%d%H%M%S%N
will output Year month day hour minute second nanoseconds, which is both immediate and "pretty safe".
date +%s%N # epoch with nanoseconds
is shorter, but doesn't have the convenient side effect of automatically giving you the date and time of invoice creation.
If you absolutely need to guarantee uniqueness and nanoseconds isn't good enough, use a lock of some sort, and maybe a more fine-grained language.
On the other hand, if minutes are unique enough, you could use
date +%y%m%d%H%M
You get the idea.
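As a rough illustration of the timestamp approach, here is a minimal sketch that wraps the nanosecond format in a hypothetical helper `new_invoice_id` and reads the creation date back out of the ID. It assumes GNU date, since `%N` is not available everywhere.

```shell
#!/usr/bin/env bash
# new_invoice_id: hypothetical helper; prints a 23-digit timestamp ID
# (YYYYmmddHHMMSS followed by nanoseconds; %N assumes GNU date).
new_invoice_id() {
    date +%Y%m%d%H%M%S%N
}

id=$(new_invoice_id)
# The creation date can be read back from the first eight digits.
echo "invoice $id created on ${id:0:4}-${id:4:2}-${id:6:2}"
```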

How can I do a function that outputs the not of another function in bash shell?

I have an existing function is_active_instance, which determines whether a database instance is running (true) or not. I am working on a new function called is_inactive_instance, which needs to return true if is_active_instance returns false.
How can I call is_active_instance from is_inactive_instance and negate its result, so that true is returned to the main program?
I already tried calling is_active_instance with ! to invert the result of the original function.
is_active_instance(){
dbservice=""
if is_mysql_db
then
dbservice="mysqld"
elif is_mariadb_db
then
dbservice="mysqld"
elif is_postgre_db
then
dbservice='postgresql'
fi
[ $(ps -ef | grep -v grep | grep $dbservice* | wc -l) > 0 ]
}
is_inactive_instance(){
[ [ ! is_active_instance ] ]
}
if is_active_instance
then
echo "Is active instance"
elif is_inactive_instance
then
echo "Is inactive instance"
else
echo "Other result"
fi
In Main body I will need to be able to detect if the instance is running, stopped or other for my purposes.

Don't use any [s:
is_inactive_instance(){
    ! is_active_instance
}
Also see Comparing numbers in Bash for how to make your is_active_instance work.
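For context, inside `[ ]` the `>` in the question's last line is a redirection to a file named `0`, not a comparison, so the test only checks that the count is a non-empty string. A minimal sketch of the numeric form follows; `count_is_positive`, `has_process` and `is_inactive` are hypothetical names, and the `ps | grep` pipeline is taken from the question.

```shell
#!/usr/bin/env bash
# count_is_positive N: succeed when N is an integer greater than zero.
count_is_positive() {
    [ "$1" -gt 0 ]
}

# has_process PATTERN: hypothetical helper; counts matching lines of ps output.
has_process() {
    local count
    # grep -c prints 0 and exits 1 when nothing matches, hence "|| true"
    count=$(ps -ef | grep -v grep | grep -c -- "$1" || true)
    count_is_positive "$count"
}

# Negating a function needs no brackets at all.
is_inactive() {
    ! has_process "$1"
}
```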

Here is an example of how to do this in Bash. Your code is on the right track but needs syntax changes to work in Bash.
Instead of checking for a NOT, you would check for "Yes" or "No"; you may change the outputs to zeroes and ones if you wish and test for those.
Copy the code between CODE STARTS and CODE ENDS into ./active_instance.sh.
Type the line below and press RETURN.
chmod 755 ./active_instance.sh
CODE STARTS HERE ==================
#!/usr/bin/env bash
for some_db in mariadb mysqld postgres oracle sybase
do
echo -n "Checking ${some_db}..."
set `ps -ef|grep -v grep|grep ${some_db}|wc`
if test ${1} -gt 0
then
echo "Yes"
else
echo "No"
fi
done
CODE ENDS HERE ==================
To run, type the line below and press RETURN.
./active_instance.sh
Sample execution:
./active_instance.sh
Checking mariadb...Yes
Checking mysqld...Yes
Checking postgres...Yes
Checking oracle...No
Checking sybase...No

repeat pipe command until first command succeeds and second command fails

I am trying to figure out how to get my bash script to work. I have the following command:
curl http://192.168.1.2/api/queue | grep -q test
I need it to repeat until the first command in the pipeline succeeds (meaning the server responds) and the second command fails (meaning the pattern is not found or the queue is empty). I looked at using $PIPESTATUS but can't get it to work in a loop the way I want. I've tried all kinds of variations without success. This is what I am currently trying:
while [[ "${PIPESTATUS[0]}" -eq 0 && "${PIPESTATUS[1]}" -eq 1 ]]
do curl http://192.168.1.2 | grep -q regular
echo "The exit status of first command ${PIPESTATUS[0]}, and the second command ${PIPESTATUS[1]}"
sleep 5
done

Although it's not really clear what kind of output is returned by the curl call, maybe you are looking for something like this:
# feed the loop via process substitution so that err and line survive it
while read -r line; do
    echo "$line" | grep -q regular || { err="true"; break; }
done < <(curl --silent http://192.168.1.2)
if [ -z "$err" ]; then
    echo "..All lines OK"
else
    echo "..Abend on line: '$line'" >&2
fi

Figured it out. Just had to re-conceptualize it. I couldn't figure it out strictly with a while or until loop but creating an infinite loop and breaking out of it when the condition is met worked.
while true
do
    curl http://192.168.1.2/api/queue | grep -q test
    case ${PIPESTATUS[*]} in
        "0 1")
            echo "Item is no longer in the queue."
            break;;
        "0 0")
            echo "Item is still in the queue. Trying again in 5 minutes."
            sleep 5m;;
        "7 1")
            echo "Server is unreachable. Trying again in 5 minutes."
            sleep 5m;;
    esac
done
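One detail worth spelling out: `PIPESTATUS` is overwritten by every command, so it is safest to copy it right after the pipeline. A minimal sketch, using `true | false` as a stand-in for the `curl | grep` pipeline and a hypothetical helper `classify`; the catch-all branch is an addition that covers exit codes the loop above does not list.

```shell
#!/usr/bin/env bash
# classify FIRST SECOND: map the two exit codes of a pipeline to an action.
classify() {
    case "$1 $2" in
        "0 1") echo done ;;      # server answered, pattern absent
        "0 0") echo waiting ;;   # server answered, pattern still present
        *)     echo retry ;;     # anything else, e.g. curl could not connect
    esac
}

true | false                     # stand-in for: curl ... | grep -q test
status=("${PIPESTATUS[@]}")      # copy before any other command runs
classify "${status[@]}"          # prints: done
```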

Unix Script exiting when select query returns no rows

Below is the part of my script.
The "set -e" will make the script exit whenever a negative return code comes from any of the commands.
But, when the below select statement returns no rows from the table, the script exits there itself (echo "Get Eg names end" is not executed). Which means the below command is giving negative return code.
db2 -x "SELECT EG_NAME FROM MS.CFG_CACHE_REFRESH WHERE
EG_RELOAD_UPD_BK2_TS < M_TABLE_UPD_TS AND CURRENT_TIME >= EG_RELOAD_START_TIME AND
CURRENT_TIME <= EG_RELOAD_END_TIME" \
> /home/DummyUser/gm4/logs/MP_CACHE_REFRESH_TEMP_BK2.txt
If the select statement returns some rows, it works fine. The script doesn't exit and runs till the end.
My requirement is to exit if a genuine error occurs, like unable to connect database, invalid syntax etc.
If no rows are returned from the table, it should not be considered as an error.
Why am I getting a -ve return code for the select query not returning rows and how can I handle this?
Below is the part of the script:
#!/bin/ksh
set -e
brokername=$1
if [ "$#" -ne "1" ]
then
echo "Invalid arguments supplied"
echo "call the script using command:"
echo "MP_CACHE_REFRESH_BRK2.ksh <BrokerName>"
exit -1
fi
touch /home/DummyUser/gm4/logs/MP_CACHE_REFRESH_TEMP_BK2.txt
chmod 777 /home/DummyUser/gm4/logs/MP_CACHE_REFRESH_TEMP_BK2.txt
db2 CONNECT TO MSAPP USER DummyUser using paasss
db2 -x "SELECT EG_NAME FROM MS.CFG_CACHE_REFRESH WHERE EG_RELOAD_UPD_BK2_TS < M_TABLE_UPD_TS AND CURRENT_TIME >= EG_RELOAD_START_TIME AND CURRENT_TIME <= EG_RELOAD_END_TIME" > /home/DummyUser/gm4/logs/MP_CACHE_REFRESH_TEMP_BK2.txt
echo "Get Eg names end"

The problem is not negative return codes, it is any return code that is != 0. DB2 exits with:
- 0, success
- 1, no row found
- 2, warning (for example using existing index instead of creating a new one)
- 4, error (for example object not found)
- 8, system error (os related)
Unless you wrap db2 and return 0 when $? is less than 4, I don't see how you are going to succeed. Here is an example of how to deal with db2 exit codes (with set -e removed):
db2 -x "SELECT EG_NAME FROM MS.CFG_CACHE_REF ..."
rc=$?
if [ $rc -ge 4 ]; then
...
EDIT: Alternative with sql stmts in separate file
Keeping all the sql in a separate file - say myfile.sql - you can do something like from your sh:
db2 -l myfile.log +c -s -tf myfile.sql
rc=$?
if [ $rc -ge 4 ]; then
echo "Error"
db2 rollback
exit 1
elif [ $rc -ge 1 ]; then
echo "Warning"
fi
db2 commit
exit 0
-s terminates execution on error ( -ge 4 ). You can find out what caused the problem by tailing the log file, tail -10 myfile.log. Beware that certain operations such as reorg will commit the ongoing transaction.
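The wrapping idea mentioned earlier can also be sketched so that `set -e` stays in place. This is a minimal sketch with a hypothetical helper `tolerate_below`; it is plain POSIX shell, so it should also run under ksh, but it has not been tried against a real db2 installation.

```shell
# tolerate_below LIMIT CMD [ARGS...]: run CMD; report success when its exit
# status is below LIMIT, otherwise pass the status through unchanged.
tolerate_below() {
    limit=$1
    shift
    rc=0
    "$@" || rc=$?          # "||" keeps set -e from aborting here
    if [ "$rc" -lt "$limit" ]; then
        return 0
    fi
    return "$rc"
}
```

With this in place, the call from the question would become `tolerate_below 4 db2 -x "SELECT ..." > file`, so that "no rows" (1) and warnings (2) no longer stop the script.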

Flow Control and Return Values in a BashScript

I'm quite new to bash scripting and I've been working on a small bash file that can do a few things for me. First of all, I'm calling this from a C function using system() and I'd like the script to return a value (1 for error, 0 for OK). Is this possible?
int kinit() {
int err = system("/home/bluemoon/Desktop/GSSExample/kinitScript.sh");
}
Second, using Zenity, I managed to create a pop-up window to enter a user name and password. Now, depending on what the user does, different things should happen. If he closes the window or clicks "Cancel", nothing should happen. If he clicks "OK", then I should check the input (for empty text boxes or something).
Assuming correct input, I will use Expect to run the kinit program (a prompt related to Kerberos) and log in. If the password is correct, the prompt closes and the script should return 0. If it's not, the prompt will show something like "Kinit: user not found". In any case of error (closing the window, clicking "Cancel" or wrong credentials) I want the script to return 1, and to return 0 on success.
#!/bin/bash
noText=""
ENTRY=`zenity --password --username`
case $? in
0)
USER=`echo $ENTRY | cut -d'|' -f1`
PW=`echo $ENTRY | cut -d'|' -f2`
if [ "$USER"!="$noText" -a "$PW"!="$noText" ]
then
/usr/bin/expect -c 'spawn kinit '`echo $USER`'; expect "Password for '`echo $USER`':" { send "'`echo $PW`'\r" }; interact'
fi
;;
1)
echo "Stop login.";;
-1)
echo "An unexpected error has occurred.";;
esac
My if isn't working properly: the expect command is always run. Cancelling or closing the Zenity window also always leads to case "0". I've also tried to return a variable, but it says I can only return vars from inside functions?
Well, if anyone could give me some pointers, I'd appreciate it.
Dave

i'd like the script to return a value
Sure, just use exit in appropriate places
exit: exit [n]
Exit the shell.
Exits the shell with a status of N. If N is omitted, the exit status
is that of the last command executed.
My if isn't working properly, the expect command is always run.
if [ "$USER" != "$noText" -a "$PW" != "$noText" ]
#           ^  ^                  ^  ^
#           |  |                  |  |
#           \----- notice spaces ----/
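Putting both points together, here is a minimal sketch of the validation step with an explicit status. `check_credentials` is a hypothetical helper, the `entry` value stands in for the zenity output, and the zenity and expect calls are left out. In the real script the final line would be `exit "$status"`; on the C side, the value returned by system() is a wait status, so the script's exit code is read with WEXITSTATUS.

```shell
#!/bin/bash
# check_credentials USER PW: succeed only when both values are non-empty.
check_credentials() {
    [ -n "$1" ] && [ -n "$2" ]
}

# Split "user|password", the format the question parses with cut.
entry='alice|secret'          # stand-in for the zenity output
user=${entry%%|*}             # text before the first "|"
pw=${entry#*|}                # text after the first "|"

if check_credentials "$user" "$pw"; then
    status=0
else
    status=1
fi
echo "would exit with status $status"
```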
