I currently have the script below, which checks whether the corresponding services are running on my server using some internal logic. Code snippet:
MY_SERVER_ID=X22  # stored somewhere in the profile of the servers
if [ "$MY_SERVER_ID" = "X11" -o "$MY_SERVER_ID" = "X22" -o "$MY_SERVER_ID" = "X33" ]
then
##do xx
echo " Service 1 : Running "
fi
if [ "$MY_SERVER_ID" = "X11" -o "$MY_SERVER_ID" = "X22" ]
then
##do xx
echo " Service 2 : Running "
fi
Since not all servers run all the services, the if conditions become unreadable and unnecessarily complex. I currently have 10+ servers and 8+ services, with different services running on different servers. In the future, a service may also start running on a node it wasn't running on before, in which case I have to go and change the script again.
I understand that for any change I will have to update the script on all the servers, but I would like to make the process less painful than it currently is.
I could define an array at the start of the script that indicates whether a given service runs on a particular node, something I picked up from another question on Stack Overflow.
I know multi-dimensional arrays are easy to implement in C, but as I am quite new to shell scripting, I would like to know whether there is any way to make my script more readable and easily editable.
Since all your if checks are against the same variable, you can simplify them with a case statement:
case "$MY_SERVER_ID" in
X11|X22|X33)
echo "Service 1 running" ;;
esac
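A plain ;; stops at the first matching pattern, so you would still need one case per service. With bash 4.0+ the ;;& terminator keeps testing the remaining patterns, which folds all the service checks into a single case; a sketch:
case "$MY_SERVER_ID" in
    X11|X22|X33) echo " Service 1 : Running " ;;&   # ;;& continues testing the patterns below
    X11|X22)     echo " Service 2 : Running " ;;
esac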
bash doesn't have multi-dimensional arrays, but you could use a space-delimited string as a substitute for a second dimension, as long as the values are simple.
declare -A all_services
all_services=([X11]="1 2" [X22]="1 2" [X33]="1")
services=${all_services[$MY_SERVER_ID]}
for i in $services; do
    echo "Service $i running"
done
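If you still need a per-service check rather than the loop (as in the original if blocks), a word-boundary match on the space-delimited list works; a small sketch:
# Pad with spaces so "2" can't accidentally match "12" or "22"
if [[ " ${all_services[$MY_SERVER_ID]} " == *" 2 "* ]]; then
    echo " Service 2 : Running "
fi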
Another possibility is storing the data in a JSON file and using the jq utility to parse it and extract values. This more elaborate version is left as an exercise for the reader (I'm not really very experienced with this tool).
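That said, a minimal sketch of the jq idea, assuming a services.json laid out as in the comment below (the file layout is my assumption, and jq must be installed):
# services.json (assumed layout):
#   { "X11": [1, 2], "X22": [1, 2], "X33": [1] }
for i in $(jq -r --arg id "$MY_SERVER_ID" '.[$id][]?' services.json); do
    echo "Service $i running"
done
The []? form quietly yields nothing when the server id has no entry in the file.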
Related
I'm trying to write a script that lets me send commands to multiple services based on the passed params.
Currently I'm declaring associative arrays like
declare -A services0=(
[name]='service-a'
[port]='1234'
[type]='type-x'
[...]='...'
)
declare -A services1=(
[name]='service-b'
[port]='1234'
[type]='type-y'
[...]='...'
)
declare -A services?=(...
declare -n services
For a single param that works fine, because then I do something like
for services in ${!services@}; do
if [[ "$PARAM_TYPE" == ${services[type]} ]]; then
do something...
fi
done
But now I want to add a bunch of new params.
My idea was to filter the associative arrays per param, like
#step1 - filter associative array to objects where attribute-a is param-a if it is set
for services in ${!services@}; do
if [[ "$PARAM_TYPE" != ${services[type]} ]]; then
echo "Removing ${services[name]}"
unset -v 'services[$services]'
fi
#step2 - filter whats left after the previous step where attribute-b is param-b if it is set
#step3 - repeat for all possible params...
#step_last - iterate over the filtered array and execute commands
But I'm struggling to remove the objects from the associative arrays I want to filter; they are not getting removed. When I execute
echo "services: ${!services@}"
output: 'services: services0 services1 services2 ...'
I get the same as if I hadn't filtered at all.
Also, when I iterate over the array again, like
for services in ${!services@}; do
echo "executing commands for ${services[name]}";
done
I get this warning twice:
warning: services : circular name reference
once for the echo line and once for the for line.
Am I on the right track here? What am I missing to make this work, or is this a completely wrong approach?
Thanks, Bernd
It seems you want to unset an entire array, say services3.
But your unset line says to unset the variable services.
The fact that it names some subkey of that variable is irrelevant, since services is not an array. So you may as well be writing
unset 'services'
which will unset the for-loop variable services, which will then simply be set again on the next iteration of the loop.
What you want is
unset "$services"
which probably would have been clearer if the variable had been named properly: services is never multiple services, only one at a time. That would make this a simple matter of the uninterpreted string 'service' versus the bash-interpreted "$service" being fed to the unset builtin.
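For what it's worth, a minimal sketch of the filtering loop with a distinctly named loop variable (declare -n needs bash 4.3+; the array contents here are illustrative):
declare -A services0=([name]='service-a' [type]='type-x')
declare -A services1=([name]='service-b' [type]='type-y')
PARAM_TYPE='type-x'

declare -n ref                      # ref gets the nameref attribute
for ref in "${!services@}"; do      # ref now references services0, services1, ... in turn
    if [[ "$PARAM_TYPE" != "${ref[type]}" ]]; then
        echo "Removing ${ref[name]}"
        unset "${!ref}"             # ${!ref} is the name ref points at, e.g. services1
    fi
done
unset -n ref
Naming the loop variable something outside the services prefix also avoids the circular name reference warning, which shows up because ${!services@} includes the nameref services itself.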
I am using Slurm scripts to run job arrays for MATLAB computing on a cluster. Each script uses an array to loop over a MATLAB parameter.
1) Is it possible to create a shell script that loops over another variable?
2) Can I pass variables to a Slurm script?
For example, my slurm files currently look like
#!/bin/bash
#SBATCH --array=1-128
...
matlab -nodesktop -r "frame=[${SLURM_ARRAY_TASK_ID}]; filename=['Person24']; myfunction(frame, filename);";
I frequently need to run this array to process a number of different files. This means I will submit the job (sbatch exampleScript.slurm), edit the file, update 'Person24' to 'Person25', and then resubmit the job. This is pretty inefficient when I have a large number of files to process.
Could I make a shell script that would pass a variable to the slurm script? For example, something like this:
Shell Script (myshell.sh)
#!/bin/bash
for ((FNUM=24; FNUM<=30; FNUM+=1));
do
sbatch myscript.slurm >> SOMEHOW PASS ${FNUM} HERE (?)
done
Slurm script (myscript.slurm)
#!/bin/bash
#SBATCH --array=1-128
...
matlab -nodesktop -nodisplay -r "frame=[${SLURM_ARRAY_TASK_ID}]; filename=[${FNUM}]; myfunction(frame, filename);";
where I could efficiently submit all of the jobs using something like
sbatch myshell.sh
Thank you!
In order to avoid possible name collisions with shell and environment variables, it is a good habit to always use lowercase or mixed-case variable names in your Bash scripts.
You were almost there. You just need to pass the variable as an argument to the second script and then pick it up there from the positional parameters. In this case it looks like you're only passing one argument, so $1 is fine to use. In other cases with a fixed number of parameters you could also use $2, $3, etc.; with a variable number of arguments, "$@" would be more appropriate.
Shell Script (myshell.sh)
#!/bin/bash
for ((fnum=24; fnum<=30; fnum+=1))
do
sbatch myscript.slurm "$fnum"
done
Slurm script (myscript.slurm)
#!/bin/bash
#SBATCH --array=1-128
fnum=$1
...
matlab -nodesktop -nodisplay -r "frame=[${SLURM_ARRAY_TASK_ID}]; filename=[${fnum}]; myfunction(frame, filename);";
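Note that SLURM_ARRAY_TASK_ID stays uppercase, since Slurm sets it in the environment. Also, in the original command filename was a quoted MATLAB string (['Person24']), while ${fnum} above is spliced in bare. If the goal is to rebuild names like Person24, one way is to keep the MATLAB quotes around the spliced shell variable (a sketch, assuming fnum holds the number):
matlab -nodesktop -nodisplay -r "frame=[${SLURM_ARRAY_TASK_ID}]; filename=['Person${fnum}']; myfunction(frame, filename);"
Alternatively, sbatch can inject variables into the job's environment with --export (e.g. sbatch --export=ALL,fnum=$fnum myscript.slurm), in which case no positional parameter is needed.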
For handling various timeout conditions this might work:
A=$(sbatch --parsable --wait a.slurm)  # --wait makes sbatch exit with the job's exit code, which the case below inspects
case $? in
9|64|130|131|137|140)
echo "some sort of timeout occurred"
B=$(sbatch --parsable --dependency=afternotok:$A a.slurm)
;;
*)
echo "some other exit condition occurred"
;;
esac
You will just need to decide what conditions you want to handle and how you want to handle them. I have listed all the ones that seem to involve timeouts.
I want to check whether the following records exist, using two arrays. I'm not sure this is the best way of going about it, but from the logic it looks like it should be possible with the code below:
Domain_checking () {
array=(
grafana
kibana
prometheus
alertmanager
)
array2=(
Name
NXDOMIAN
)
for index in ${!array[*]}; do
echo "checking that ${array[$index]} exists in the domain domain.co.uk"
DOMAIN_CHECK=$(nslookup ${array[$index]}.domain.co.uk | grep {array2[$index]})
if [[ $DOMAIN_CHECK == *'Name'* ]]; then
echo "The A record for ${array[$index]}.domain.co.uk exists"
elif [[ $DOMAIN_CHECK == *'NXDOMIAN'* ]]; then
echo "The A record for ${array[$index]}.domain.co.uk dose not exist"
fi
done
}
Domain_checking
When the code above is run, the loop does start, and in the echo statement I can see the values from both arrays when I add ${array2[$index]} to it.
But the array values are not present in DOMAIN_CHECK, and I'm not sure why, since the for loop does iterate.
So I would expect DOMAIN_CHECK to have some sort of value and hit the if statement, but for some reason that doesn't seem to be the case. Why is that?
It appears you're only using nslookup to see whether the domain exists, rather than extracting specific information from its output. You can simplify by checking the exit code instead of using grep:
Domain_checking () {
array=(
grafana
kibana
prometheus
alertmanager
)
for domain in "${array[@]}"
do
    if nslookup "${domain}.domain.co.uk" >/dev/null 2>&1 ; then
        echo "$domain exists"
    else
        echo "$domain does not exist"
    fi
done
}
Domain_checking
If the domain record exists, nslookup will return 0 and the if condition will be satisfied. Anything else indicates a failure, and control falls through to the else.
You use $index as an index into both arrays, but $array2 has matching entries only for the first two indices. That's why the later entries aren't showing up, and also why grep is missing its required argument.
Thinking through your logic, I don't see any reason not to remove the second array completely and hard-code Name for the grep.
Come to think of it, the first array isn't helping much either. You could simplify the code by iterating over the names themselves rather than their array indices.
domain=some.thing
names="kibana prometheus graphite"
for name in $names; do
nslookup $name.$domain ....
done
I have a C program that I want to run without having to manually type commands into it. There are 4 commands (5 if you count the one to exit the program) that I want fed to the program, and I don't know where to start. I have seen things like
./a.out <<<'name'
to pass in a single string, but that doesn't quite work for me.
Another issue that makes this harder is that one of the commands produces output, and that output needs to be part of a later command. If I had access to the source code I could just brute-force in some loops and counters, so I am trying to get hold of it, but for now I am stuck working without it. I was thinking there might be a way to do this with bash scripts, but I don't know what that would be.
In simple cases, a bash script is a possibility: run the executable in a coproc (requires bash 4). A short example:
#!/bin/bash
coproc ./parrot                # start ./parrot; its stdin/stdout are wired to COPROC
echo aaa >&${COPROC[1]}        # write a line to the program's stdin
read result <&${COPROC[0]}     # read one line from the program's stdout
echo $result
echo exit >&${COPROC[1]}       # tell it to quit
with parrot (a test executable):
#!/bin/bash
while true; do
    read var
    if [ "$var" = "exit" ]; then exit 0; fi
    echo $var
done
For more serious scenarios, use expect.
I currently have an R script that performs a population-genetic simulation and then writes a table of results to a text file. I would like to run multiple instances of this script in parallel as an array job (my university's cluster uses SGE), and when it's all done I will have result files corresponding to each job (Results_1.txt, Results_2.txt, etc.).
I spent the better part of the afternoon reading and trying to figure out how to do this, but haven't really found anything along the lines of what I'm trying to do. I was wondering if someone could provide an example, or point me in the direction of something I could read to help with this.
To boil down mithrado's answer to the bare essentials:
Create a job script, pop_gen.bash, that may (or may not) take the SGE task id as an input argument and stores its results in a file identified by that same task id:
#!/bin/bash
Rscript pop_gen.R ${SGE_TASK_ID} > Results_${SGE_TASK_ID}.txt
Submit this script as a job array, e.g. 1000 jobs:
qsub -t 1-1000 pop_gen.bash
Grid Engine will execute pop_gen.bash 1000 times, each time setting the SGE_TASK_ID environment variable to a value in the range 1-1000.
Additionally, as mentioned above, by passing SGE_TASK_ID as a command-line argument to pop_gen.R, you can use it to name the output file:
args <- commandArgs(trailingOnly = TRUE)
out.file <- paste("Results_", args[1], ".txt", sep="")
# d <- "some data frame"
write.table(d, file=out.file)
HTH
I am not used to doing this in R, but I've been using the same approach in Python. Imagine that you have a script genetic_simulation.r with three parameters:
--gene_id, --khmer_len and --output_file.
You will have one CSV file, genetic_sim_parms.csv, with n rows:
first_gene,10,/result/first_gene.txt
...
nth_gene,6,/result/nth_gene.txt
An important detail is the first line of your genetic_simulation.r: it needs to tell the cluster which interpreter to use. You might need to tweak its path depending on your setup, but it will look like:
#!/path/to/Rscript --vanilla
And finally, you will need an array-job bash script:
#!/bin/bash
#$ -t 1-N    <- replace N with the number of rows in genetic_sim_parms.csv
#$ -N genetic_simulation.r
echo "Starting on : $(date)"
echo "Running on node : $(hostname)"
echo "Current directory : $(pwd)"
echo "Current job ID : $JOB_ID"
echo "Current job name : $JOB_NAME"
echo "Task index number : $SGE_TASK_ID"
ID=$(awk -F, -v "line=$SGE_TASK_ID" 'NR==line {print $1}' genetic_sim_parms.csv)
LEN=$(awk -F, -v "line=$SGE_TASK_ID" 'NR==line {print $2}' genetic_sim_parms.csv)
OUTPUT=$(awk -F, -v "line=$SGE_TASK_ID" 'NR==line {print $3}' genetic_sim_parms.csv)
echo "id is: $ID"
Rscript genetic_simulation.r --gene_id "$ID" --khmer_len "$LEN" --output_file "$OUTPUT"
echo "Finished on : $(date)"
Hope this helps!