Bash script for deleting multiple users based upon date and UID - arrays

I am working on a bash script that deletes users based upon two things: the date the account was created and the user ID. If the date is before the date given in the terminal and the user ID is greater than 1000, the user should be deleted from the system. I have some code written out, and it has given me a lot of issues because the file that takes in the information with the username, date created, and ID needs to be split or "cut". Is there a better way to go about this than the way I did, breaking the pieces into an array?
#!/bin/bash
if [ $# -eq 0 ] ; then
echo $0 "<date(year/month/day) filename>"
exit 1
fi
dateForDeletion=$1
listOfUsers=$2
for i in `cat $listOfUsers` | while IFS='/t' read username date; do
${array[0]}="$username"
${array[1]}="$date"
uid=`id -u ${array[0]}`
if [$dateForDeletion < $date] && [uid > 1000] ; then
`sudo userdel $username`
fi
done

You really should start smaller and build up, testing each statement individually before adding more. Here are some issues with your code:
You're using a strange amalgamation of a for and while loop.
IFS="/t" appears to try to set IFS to a tab. This should be IFS=$'\t'
${array[0]}="$username" is not a valid assignment. You should use array[0]="$username", though I'm not sure why you're assigning it to an array in the first place.
[$dateForDeletion < $date] is not a valid condition/command. It should be [[ $dateForDeletion < $date ]] (assuming the dates are yyyy-mm-dd format or something that can be compared as strings).
[uid > 1000] is not a valid condition/command. It should be [ "$uid" -gt 1000 ]
`sudo userdel $username` should not have backticks around it.
Here's how your loop should look:
while IFS=$'\t' read -r username date
do
uid=$(id -u "$username")
if [[ $dateForDeletion < $date && $uid -gt 1000 ]]
then
sudo userdel "$username"
fi
done < "$listOfUsers"

Related

Bash: Is there a way to add UUIDs and their Mount Points into an array?

I'm trying to write a backup script that takes a specific list of disks' UUIDs, mounts them to specified points, rsyncs the data to a specified end point, and then does a bunch of other conditional checks when that's done. But since there will be a lot of checking after rsync, it would be nice to set each disk's UUID and associated/desired mount point as strings, to save hardcoding them throughout the script; also, if the UUIDs change in future (if a drive is updated or swapped out), this will make the script easier to maintain...
I've been looking at arrays in bash, to make the list of wanted disks, but have questions about how to make this possible, as my experience with arrays is non-existent!
The list of disks---in order of priority wanted to backup---are:
# sdc1 2.7T UUID 8C1CC0C19D012E29 /media/user/Documents/
# sdb1 1.8T UUID 39CD106C6FDA5907 /media/user/Photos/
# sdd1 3.7T UUID 5104D5B708E102C0 /media/user/Video/
... and notice I want to go sdc1, sdb1, sdd1 and so on, (i.e. custom order).
Is it possible to create this list, in order of priority, so it's something like this?
DisksToBackup
└─1
└─UUID => '8C1CC0C19D012E29'
└─MountPoint => '/media/user/Documents/'
└─2
└─UUID => '39CD106C6FDA5907'
└─MountPoint => '/media/user/Photos/'
└─3
└─UUID => '5104D5B708E102C0'
└─MountPoint => '/media/user/Video/'
OR some obviously better idea than this...
And then how to actually use this?
Let's say, for example, how to go through our list and mount each disk (I know this is incorrect syntax, but again, I know nothing about arrays):
mount --uuid $DisksToBackup[*][UUID] $DisksToBackup[*][MountPoint]?
Update: Using Linux Mint 19.3
Output of bash --version gives: GNU bash, version 4.4.20(1)
Bash starting with version 4 provides associative arrays, but only with a single dimension. Multiple dimensions you would have to simulate using composite keys such as '0-uuid', as shown in the following interactive bash examples (remove the leading $ and > and the bash output when putting these into a script).
$ declare -A disks
$ disks=([0-uuid]=8C1CC0C19D012E29 [0-mount]=/media/user/Documents/
> [1-uuid]=39CD106C6FDA5907 [1-mount]=/media/user/Photos/)
$ echo ${disks[0-uuid]}
8C1CC0C19D012E29
$ echo ${disks[*]}
/media/user/Documents/ 39CD106C6FDA5907 8C1CC0C19D012E29 /media/user/Photos/
$ echo ${!disks[*]}
0-mount 0-uuid 1-uuid 1-mount
However, there is no ordering for the keys (the order of the keys differs from the order in which we defined them). You may want to use a second array as in the following example which allows you to break down the multiple dimensions as well:
$ disks_order=(0 1)
$ for i in ${disks_order[*]}; do
> echo "${disks[$i-uuid]} ${disks[$i-mount]}"
> done
8C1CC0C19D012E29 /media/user/Documents/
39CD106C6FDA5907 /media/user/Photos/
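To tie this back to the original mount question, the same ordering array can drive the mounts (a sketch; util-linux mount accepts --uuid, and whether you need sudo depends on your setup):

$ for i in ${disks_order[*]}; do
>   sudo mount --uuid "${disks[$i-uuid]}" "${disks[$i-mount]}"
> done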
In case you use bash version 3, you need to simulate the associative array using other means. See the question on associative arrays in bash 3, or simply represent your structure in a simple array such as the following, which makes everything more readable anyway:
$ disks=(8C1CC0C19D012E29=/media/user/Documents/
> 39CD106C6FDA5907=/media/user/Photos/)
$ for disk in "${disks[#]}"; do
> uuid="${disk%=*}"
> path="${disk##*=}"
> echo "$uuid $path"
> done
8C1CC0C19D012E29 /media/user/Documents/
39CD106C6FDA5907 /media/user/Photos/
The %=* is a fancy way of saying: remove everything after (and including) the = sign. And ##*= removes everything before (and including) the = sign.
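For instance, a throwaway demonstration of the two expansions on a single entry:

$ disk='8C1CC0C19D012E29=/media/user/Documents/'
$ echo "${disk%=*}"
8C1CC0C19D012E29
$ echo "${disk##*=}"
/media/user/Documents/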
As an example of how one could read this data into a series of arrays:
#!/usr/bin/env bash
i=0
declare -g -A "disk$i"
declare -n currDisk="disk$i"
while IFS= read -r line || (( ${#currDisk[@]} )); do : "line=$line"
if [[ $line ]]; then
if [[ $line = *=* ]]; then
currDisk[${line%%=*}]=${line#*=}
else
printf 'WARNING: Ignoring unrecognized line: %q\n' "$line" >&2
fi
else
if (( ${#currDisk[@]} )); then
declare -p "disk$i" >&2 # for debugging/demo: print out what we created
(( ++i ))
unset -n currDisk
declare -g -A "disk$i=( )"
declare -n currDisk="disk$i"
fi
fi
done < <(blkid -o export)
This gives you something like:
declare -g -A disk0=( [PARTLABEL]="primary" [UUID]="1111-2222-3333" [TYPE]=btrfs ...)
declare -g -A disk1=( [PARTLABEL]="esp" [LABEL]="boot" [TYPE]="vfat" ...)
...so you can write code iterating over them doing whatever search/comparison/etc you want. For example:
for _diskVar in "${!disk#}"; do # iterates over variable names starting with "disk"
declare -n _currDisk="$_diskVar" # refer to each such variable as _currDisk in turn
# replace the below with your actual application logic, whatever that is
if [[ ${_currDisk[LABEL]} = "something" ]] && [[ ${_currDisk[TYPE]} = "something_else" ]]; then
echo "Found ${_currDisk[DEVNAME]}"
fi
unset -n _currDisk # clear the nameref when done with it
done

Randomly generating invoice IDs - moving text database into script file?

I've come up with the following bash script to randomly generate invoice numbers, preventing duplications by logging all generated numbers to a text file "database".
To my surprise the script actually works, and it seems robust (although I'd be glad to have any flaws pointed out to me at this early stage rather than later on).
What I'm now wondering is whether it's at all possible to move the "database" of generated numbers into the script file itself. This would allow me to rely on and keep track of just the one file rather than two separate ones.
Is this at all possible, and if so, how? If it isn't a good idea, what valid reasons are there not to do so?
#!/usr/bin/env bash
generate_num() {
#num=$(head /dev/urandom | tr -dc '[:digit:]' | cut -c 1-5) [Original method, no longer used]
num=$(shuf -i 10000-99999 -n 1)
}
read -p "Are you sure you want to generate a new invoice ID? [Y/n] " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]
then
generate_num && echo Generating a random invoice ID and checking it against the database...
sleep 2
while grep -xq "$num" "ID_database"
do
echo Invoice ID \#$num already exists in the database...
sleep 2
generate_num && echo Generating new random invoice ID and checking against database...
sleep 2
done
while [[ ${#num} -gt 5 ]]
do
echo Invoice ID \#$num is more than 5 digits...
sleep 2
generate_num && echo Generating new random invoice ID and checking against database...
sleep 2
done
echo Generated random invoice ID \#$num
sleep 1
echo Invoice ID \#$num does not exist in database...
sleep 2
echo $num >> "ID_database" && echo Successfully added Invoice ID \#$num to the database.
else
echo "Exiting..."
fi
I do not recommend this because:
These things are fragile. One bad edit and your invoice database is corrupt.
It makes version control a pain. Each new version of the script should preferably be checked in. You could add logic to make sure that "$mydir" is an empty directory when you run the script (except for "$myname", .git and other git-related files) then run git -C "$mydir" init if "$mydir"/.git doesn't exist. Then for each database update, git -C "$mydir" add "$myname" and git -C "$mydir" commit -m "$num". It's just an idea to explore...
Locking - It's possible to do file locking to make sure that not two users run the script at the same time, but it adds to the complexity so I didn't bother. If you feel that's a risk, you need to add that.
... but you want a self-modifying script, so here goes.
This just adds a new invoice number to its internal database each time you run it. I've explained what goes on in comments. The last line should read __INVOICES__ (+ a newline) if you copy the script.
As always when dealing with things like this, remember to make a backup before making changes :-)
As it's currently written, you can only add one invoice per run. It shouldn't be hard to move things around (you need a new tempfile) to get it to add more than one if you need that.
#!/bin/bash
set -e # exit on error - important for this type of script
#------------------------------------------------------------------------------
myname="$0"
mydir=$(dirname "$myname")
if [[ ! -w $myname ]]; then
echo "ERROR: You don't have permission to update $myname" >&2
exit 1
fi
# create a tempfile to be able to update the database in the file later
#
# set -e makes the script end if this fails:
temp=$(mktemp -p "$mydir")
trap "{ rm -f "$temp"; }" EXIT # remove the tempfile if we die for some reason
# read current database from the file
readarray -t ID_database <<< "$(sed '0,/^__INVOICES__$/d' "$0")"
#declare -p ID_database >&2 # debug
#------------------------------------------------------------------------------
# a function to check if a number is already in the db
is_it_taken() {
local num=$1
# return 1 (true, yes it's taken) if the regex found a match
[[ ! " ${ID_database[#]} " =~ " ${num} " ]]
}
generate_num() {
local num
(exit 1) # set $? to 1
# loop until $? becomes 0
while (( $? )); do
num=$(shuf -i 10000-99999 -n 1)
is_it_taken "$num"
done
# we found a free number
echo $num
}
add_to_db() {
local num=$1
# add to db in memory
ID_database+=($num)
# add to db in file:
# copy the script to the tempfile
cp -pf "$myname" "$temp"
# add the new number
echo $num >> "$temp"
# move the tempfile into place
mv "$temp" "$myname"
}
#------------------------------------------------------------------------------
num=$(generate_num)
add_to_db $num
# your business logic goes here:
echo "All current invoices:"
for invoice in "${ID_database[@]}"
do
echo ">$invoice<"
done
#------------------------------------------------------------------------------
# leave the rest untouched:
exit
__INVOICES__
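After a couple of runs, the tail of the script file itself would look something like this (the two invoice numbers below are made up; yours will be whatever shuf produced):

exit
__INVOICES__
48213
90157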
Edited
To answer the question you asked -
Make sure your file ends with an explicit exit statement.
Without some sort of branching it won't execute past that, so unless there is a gross parsing error anything below could be used as storage space. Just
echo $num >> $0
If you write your records directly onto the bottom of the script, the script grows, but ...relatively harmlessly. Just make sure your grep pattern doesn't grab any lines of code, though grep -E '^[0-9]{5}$' seems pretty safe.
This is only ever going to give you a max of ~90k IDs, and spends unneeded time and cycles on redundancy checking. Is there a limit on the length of the value?
If you can assure there won't be more than one invoice processed per second,
date +%s >> "ID_database" # the UNIX epoch, seconds since 00:00:00 01/01/1970
If you need more precision than that,
date +%Y%m%d%H%M%S%N
will output Year month day hour minute second nanoseconds, which is both immediate and "pretty safe".
date +%s%N # epoch with nanoseconds
is shorter, but doesn't have the convenient side effect of automatically giving you the date and time of invoice creation.
If you absolutely need to guarantee uniqueness and nanoseconds isn't good enough, use a lock of some sort, and maybe a more fine-grained language.
On the other hand, if minutes are unique enough, you could use
date +%y%m%d%H%M
You get the idea.
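For instance, a minimal sketch of the timestamp approach, reusing the ID_database file name from the question:

#!/usr/bin/env bash
# Timestamp-based invoice ID: unique as long as no two invoices are
# created in the same nanosecond, and it encodes the creation time.
num=$(date +%Y%m%d%H%M%S%N)
echo "$num" >> "ID_database"
echo "Generated invoice ID #$num"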

Bash manipulate and sort file content with arrays via loop

Purpose
Create a bash script which loops through certain commands and saves the output of each command (they print only numbers) into a file (I guess the best way is to save them in a file?), with the date (Unix time) next to each output, so that the next time the script runs and loops through again, it can use these stored values to see if there hasn't been any change in a command's output within the last hour.
Example output
# ./script
command1 123123
command2 123123
Important notes
There are around 200 commands which the script will loop through.
There'll be new commands in the future, so the script will have to check whether each command exists in the saved file. If it's already present, only compare it within the last hour to see if the number has changed since the file was last saved. If it doesn't exist, save it into the file so we can use it for comparison next time.
The order in which the script runs the commands might differ as commands are added/removed/changed. So if it's only like this for now:
# ./script
command1 123123
command2 123123
and you add a 3rd command in the future, the order might change (it is also not certain what kind of pattern it follows), for example:
# ./script
command1 123123
command3 123123
command2 123123
so we can't, for example, read it line by line; in this case, I believe the best way is to compare them by the command* names.
Structure for stored values
My presumed structure for the stored values is like this (we don't have to stick with this one, though):
command1 123123 unixtime
command2 123123 unixtime
About the said commands
The things I called commands are basically applications living in /usr/local/bin/ which can be invoked by running their names directly in the shell, like command1 getnumber, and it will print you the number.
Since the commands are located in /usr/local/bin/ and follow a similar pattern, I'm first looping through /usr/local/bin/ for command*. See below.
commands=`find /usr/local/bin/ -name 'command*'`
for i in $commands; do
echo "$i" "`$i getnumber`"
done
so this will loop through all files that start with command and run command* getnumber for each one, which will print out the numbers we need.
Now we need to store these values in a file to compare them next time we run the command.
Catch:
We may even run the script every few minutes, but we only need a report if the values (numbers) haven't changed in the last hour.
The script will list the numbers every time you run it, and we may add styling to those that haven't changed in the last hour to make them pop out to the eye, maybe like adding a red color to them?
Attempt number #1
So this is my first attempt building this script. Here's what it looks like;
#!/bin/bash
commands=`find /usr/local/bin/ -name 'command*'`
date=`date +%s`
while read -r command number unixtime; do
for i in $commands; do
current_block_count=`$i getnumber`
if [[ $command = $i ]]; then
echo "$i exists in the file, checking the number changes within last hour" # just for debugging, will be removed in production
if (( ($date-$unixtime)/60000 > 60 )); then
if (( $number >= $current_number_count )); then
echo "There isn't a change within the last hour, this is a problem!" # just for debugging, will be removed in production
echo -e "$i" "`$i getnumber`" "/" "$number" "\e[31m< No change within last hour."
else
echo "$i" "`$i getnumber`"
echo "There's a change within the last hour, we're good." # just for debugging, will be removed in production
# find the line number of $i so we can change it with the new output
line_number=`grep -Fn '$i' outputs.log`
new_output=`$i getnumber`
sed -i "$line_numbers/.*/$new_output/" outputs.log
fi
else
echo "$i" "`$i getnumber`"
# find the line number of $i so we can change it with the new output
line_number=`grep -Fn '$i' outputs.log`
output_check="$i getnumber; date +%s"
new_output=`eval ${output_check}`
sed -i "$line_numbers/.*/$new_output/" outputs.log
fi
else
echo "$i does not exists in the file, adding it now" # just for debugging, will be removed in production
echo "$i" "`$i getnumber`" "`date +%s`" >> outputs.log
fi
done
done < outputs.log
Which was quite the disaster; eventually, it did nothing when I ran it.
Attempt number #2
This time, I've tried another approach nesting for loop outside of the while loop.
#!/bin/bash
commands=`find /usr/local/bin/ -name 'command*'`
date=`date +%s`
for i in $commands; do
echo "${i}" "`$i getnumber`"
name=${i}
number=`$i getnumber`
unixtime=$date
echo "$name" "$number" "$unixtime" # just for debugging, will be removed in production
while read -r command number unixtime; do
if ! [ -z ${name+x} ]; then
echo "$name" "$number" "$unix" >> outputs.log
else
if [[ $name = $i ]]; then
if (( ($date-$unixtime)/60000 > 60 )); then
if (( $number >= $current_number_count )); then
echo "There isn't a change within the last hour, this is a problem!" # just for debugging, will be removed in production
echo -e "$i" "`$i getnumber`" "/" "$number" "\e[31m< No change within last hour."
else
echo "$i" "`$i getnumber`"
echo "There's a change within the last hour, we're good." # just for debugging, will be removed in production
# find the line number of $i so we can change it with the new output
line_number=`grep -Fn '$i' outputs.log`
new_output=`$i getnumber`
sed -i "$line_numbers/.*/$new_output/" outputs.log
fi
else
echo "$i" "`$i getnumber`"
# find the line number of $i so we can change it with the new output
line_number=`grep -Fn '$i' outputs.log`
output_check="$i getnumber; date +%s"
new_output=`eval ${output_check}`
sed -i "$line_numbers/.*/$new_output/" outputs.log
fi
else
echo "$i does not exists in the file, adding it now" # just for debugging, will be removed in production
echo "$i" "`$i getnumber`" "`date +%s`" >> outputs.log
fi
fi
done < outputs.log
done
Unfortunately, no luck for me, again.
Can someone give me a helping hand?
Additional notes #2
So basically, you run the script the first time, outputs.log is empty, so you write the outputs of the commands into outputs.log.
Say 10 minutes have passed and you run the script again; since only 10 minutes have passed, not more than an hour, the script won't check whether the numbers have changed. It won't manipulate the stored values, but it will still display the output of each command every time you run it (their present outputs, not the stored values).
In this 10-minute timeframe, for example, new commands might have been added, so the script will check on every run that each command's output is stored, just to deal with new commands.
Now, let's say 1.2 hours have passed and you decide to run the script again; this time the script will check whether the numbers have changed after more than an hour, and report, saying: Hey! More than an hour has passed and those numbers still haven't changed, there might be a problem!
Simple explanation
You have 100 commands to run; your script will loop through each of them and do the following for each:
Run the script whenever you want
On each run, check if outputs.log contains the command
If outputs.log contains the command for the current loop iteration, check its last stored date ($unixtime).
If the last stored date is more than an hour old, compare the number from the current run against the stored value.
If the numbers haven't changed for more than an hour, print the command in red text color.
If the numbers have changed, run the command as usual without any warning.
If the last stored date is less than an hour old, run the command as usual.
If outputs.log doesn't contain the command, simply store it in the file so it can be used for comparison on the next runs.
The following uses a sqlite database to store results, instead of a flat file, which makes querying the history of previous runs easy:
#!/bin/sh
database=tracker.db
if [ ! -e "$database" ]; then
sqlite3 -batch "$database" <<EOF
CREATE TABLE IF NOT EXISTS outputs(command TEXT
, output INTEGER
, ts INTEGER NOT NULL DEFAULT (strftime('%s', 'now')));
CREATE INDEX IF NOT EXISTS outputs_idx ON outputs(command, ts);
EOF
fi
for cmd in /usr/local/bin/command*; do
f=$(basename "$cmd")
o=$("$cmd")
echo "$f $o"
sqlite3 -batch "$database" <<EOF
INSERT INTO outputs(command, output) VALUES ('$f', $o);
SELECT command || ' has unchanged output!'
FROM outputs
WHERE command = '$f' AND ts >= strftime('%s', 'now', '-1 hour')
GROUP BY command
HAVING count(DISTINCT output) = 1 AND count(*) > 1;
EOF
done
It lists commands that have had every run in the last hour produce the same output (and skips commands that have only run once). If instead you're interested in cases where the most recent output of each command is the same as the previous run in that hour timeframe, replace the sqlite3 invocation in the loop with:
sqlite3 -batch "$database" <<EOF
INSERT INTO outputs(command, output) VALUES ('$f', $o);
WITH prevs AS
(SELECT command
, output
, row_number() OVER w AS rn
, lead(output, 1) OVER w AS prev
FROM outputs
WHERE command = '$f' AND ts >= strftime('%s', 'now', '-1 hour')
WINDOW w AS (ORDER BY ts DESC))
SELECT command || ' has unchanged output!'
FROM prevs
WHERE output = prev AND rn = 1;
EOF
(This requires the sqlite3 shell from release 3.25 or newer because it uses features introduced then.)
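If you later want to inspect the stored history by hand, a query along these lines would do (a sketch; tracker.db and the outputs table are as defined above):

sqlite3 tracker.db "SELECT command, output, datetime(ts, 'unixepoch')
                    FROM outputs ORDER BY ts DESC LIMIT 10;"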

Multi-dimensional arrays in Bash

I am planning a script to manage some pieces of my Linux systems and am at the point of deciding if I want to use bash or python.
I would prefer to do this as a Bash script simply because the commands are easier, but the real deciding factor is configuration. I need to be able to store a multi-dimensional array in the configuration file to tell the script what to do with itself. Storing simple key=value pairs in config files is easy enough with bash, but the only way I can think of to do a multi-dimensional array is a two layer parsing engine, something like
array=&d1|v1;v2;v3&d2|v1;v2;v3
but the marshal/unmarshal code could get to be a bear and it's far from user-friendly for the next poor sap that has to administer this. If I can't do this easily in bash I will simply write the configs to an XML file and write the script in python.
Is there an easy way to do this in bash?
thanks everyone.
Bash does not support multidimensional arrays, nor hashes, and it seems that you want a hash whose values are arrays. This solution is not very beautiful; a solution with an XML file would be better:
array=('d1=(v1 v2 v3)' 'd2=(v1 v2 v3)')
for elt in "${array[#]}";do eval $elt;done
echo "d1 ${#d1[#]} ${d1[#]}"
echo "d2 ${#d2[#]} ${d2[#]}"
EDIT: this answer is quite old; since bash 4 supports hash tables, see also this answer for a solution without eval.
Bash doesn't have multi-dimensional arrays. But you can simulate a somewhat similar effect with associative arrays. The following is an example of an associative array pretending to be used as a multi-dimensional array:
declare -A arr
arr[0,0]=0
arr[0,1]=1
arr[1,0]=2
arr[1,1]=3
echo "${arr[0,0]} ${arr[0,1]}" # will print 0 1
If you don't declare the array as associative (with -A), the above won't work. For example, if you omit the declare -A arr line, the echo will print 2 3 instead of 0 1, because 0,0, 1,0 and such will be taken as arithmetic expressions and evaluated to 0 (the value to the right of the comma operator).
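A quick illustration of that comma-operator effect (same assignments as above, just without declare -A):

unset arr
arr[0,1]=1   # index (0,1) evaluates to 1, so this is arr[1]=1
arr[1,0]=2   # index (1,0) evaluates to 0, so this is arr[0]=2
echo "${arr[0]} ${arr[1]}"   # prints: 2 1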
This works thanks to 1. "indirect expansion" with !, which adds one layer of indirection, and 2. "substring expansion", which behaves differently with arrays and can be used to "slice" them as described at https://stackoverflow.com/a/1336245/317623
# Define each array and then add it to the main one
SUB_0=("name0" "value 0")
SUB_1=("name1" "value;1")
MAIN_ARRAY=(
SUB_0[@]
SUB_1[@]
)
# Loop and print it. Using offset and length to extract values
COUNT=${#MAIN_ARRAY[@]}
for ((i=0; i<$COUNT; i++))
do
NAME=${!MAIN_ARRAY[i]:0:1}
VALUE=${!MAIN_ARRAY[i]:1:1}
echo "NAME ${NAME}"
echo "VALUE ${VALUE}"
done
It's based off of this answer here
If you want to use a bash script and keep it easy to read, I recommend putting the data in structured JSON and then using the lightweight tool jq in your bash command to iterate through the array. For example, with the following dataset:
[
{"specialId":"123",
"specialName":"First"},
{"specialId":"456",
"specialName":"Second"},
{"specialId":"789",
"specialName":"Third"}
]
You can iterate through this data with a bash script and jq like this:
function loopOverArray(){
jq -c '.[]' testing.json | while read -r i; do
# Do stuff here
echo "$i"
done
}
loopOverArray
Outputs:
{"specialId":"123","specialName":"First"}
{"specialId":"456","specialName":"Second"}
{"specialId":"789","specialName":"Third"}
Independent of the shell being used (sh, ksh, bash, ...), the following approach works pretty well for n-dimensional arrays (the sample covers a 2-dimensional array).
In the sample, the line separator (1st dimension) is the space character. For introducing a field separator (2nd dimension), the standard unix tool tr is used. Additional separators for additional dimensions can be used in the same way.
Of course the performance of this approach is not very good, but if performance is not a criterion, this approach is quite generic and can solve many problems:
array2d="1.1:1.2:1.3 2.1:2.2 3.1:3.2:3.3:3.4"
function process2ndDimension {
for dimension2 in $*
do
echo -n $dimension2 " "
done
echo
}
function process1stDimension {
for dimension1 in $array2d
do
process2ndDimension `echo $dimension1 | tr : " "`
done
}
process1stDimension
The output of that sample looks like this:
1.1 1.2 1.3
2.1 2.2
3.1 3.2 3.3 3.4
After a lot of trial and error I actually find that the best, clearest and easiest multidimensional array in bash is to use a regular var. Yep.
Advantages: you don't have to loop through a big array; you can just echo "$var" and use grep/awk/sed. It's easy and clear, and you can have as many columns as you like.
Example:
$ var=$(echo -e 'kris hansen oslo\nthomas jonson peru\nbibi abu johnsonville\njohnny lipp peru')
$ echo "$var"
kris hansen oslo
thomas jonson peru
bibi abu johnsonville
johnny lipp peru
If you want to find everyone in peru
$ echo "$var" | grep peru
thomas jonson peru
johnny lipp peru
Only grep(sed) in the third field
$ echo "$var" | sed -n -E '/(.+) (.+) peru/p'
thomas jonson peru
johnny lipp peru
If you only want x field
$ echo "$var" | awk '{print $2}'
hansen
johnson
abu
johnny
Everyone in peru that's called thomas and just return his lastname
$ echo "$var" |grep peru|grep thomas|awk '{print $2}'
jonson
Any query you can think of... supereasy.
To change an item:
$ var=$(echo "$var"|sed "s/thomas/pete/")
To delete a row that contains "x"
$ var=$(echo "$var"|sed "/thomas/d")
To change another field in the same row based on a value from another item
$ var=$(echo "$var"|sed -E "s/(thomas) (.+) (.+)/\1 test \3/")
$ echo "$var"
kris hansen oslo
thomas test peru
bibi abu johnsonville
johnny lipp peru
Of course looping works too if you want to do that
$ for i in "$var"; do echo "$i"; done
kris hansen oslo
thomas jonson peru
bibi abu johnsonville
johnny lipp peru
The only gotcha I've found with this is that you must always quote the var (in the example, both var and i) or things will look like this:
$ for i in "$var"; do echo $i; done
kris hansen oslo thomas jonson peru bibi abu johnsonville johnny lipp peru
and someone will undoubtedly say it won't work if you have spaces in your input; however, that can be fixed by using another delimiter in your input, e.g. using a UTF-8 char now to emphasize that you can choose something your input won't contain (but you can choose whatever, of course):
$ var=$(echo -e 'field one☥field two hello☥field three yes moin\nfield 1☥field 2☥field 3 dsdds aq')
$ for i in "$var"; do echo "$i"; done
field one☥field two hello☥field three yes moin
field 1☥field 2☥field 3 dsdds aq
$ echo "$var" | awk -F '☥' '{print $3}'
field three yes moin
field 3 dsdds aq
$ var=$(echo "$var"|sed -E "s/(field one)☥(.+)☥(.+)/\1☥test☥\3/")
$ echo "$var"
field one☥test☥field three yes moin
field 1☥field 2☥field 3 dsdds aq
If you want to store newlines in your input, you could convert the newline to something else before input and convert it back again on output(or don't use bash...). Enjoy!
I am posting the following because it is a very simple and clear way to mimic (at least to some extent) the behavior of a two-dimensional array in Bash. It uses a here-file (see the Bash manual) and read (a Bash builtin command):
## Store the "two-dimensional data" in a file ($$ is just the process ID of the shell, to make sure the filename is unique)
cat > physicists.$$ <<EOF
Wolfgang Pauli 1900
Werner Heisenberg 1901
Albert Einstein 1879
Niels Bohr 1885
EOF
nbPhysicists=$(wc -l physicists.$$ | cut -sf 1 -d ' ') # Number of lines of the here-file specifying the physicists.
## Extract the needed data
declare -a person # Create an indexed array (necessary for the read command).
while read -ra person; do
firstName=${person[0]}
familyName=${person[1]}
birthYear=${person[2]}
echo "Physicist ${firstName} ${familyName} was born in ${birthYear}"
# Do whatever you need with data
done < physicists.$$
## Remove the temporary file
rm physicists.$$
Output:
Physicist Wolfgang Pauli was born in 1900
Physicist Werner Heisenberg was born in 1901
Physicist Albert Einstein was born in 1879
Physicist Niels Bohr was born in 1885
The way it works:
The lines in the temporary file created play the role of one-dimensional vectors, where the blank spaces (or whatever separation character you choose; see the description of the read command in the Bash manual) separate the elements of these vectors.
Then, using the read command with its -a option, we loop over each line of the file (until we reach end of file). For each line, we can assign the desired fields (= words) to an array, which we declared just before the loop. The -r option to the read command prevents backslashes from acting as escape characters, in case we typed backslashes in the here-document physicists.$$.
In conclusion a file is created as a 2D-array, and its elements are extracted using a loop over each line, and using the ability of the read command to assign words to the elements of an (indexed) array.
Slight improvement:
In the above code, the file physicists.$$ is given as input to the while loop, so that it is in fact passed to the read command. However, I found that this causes problems when I have another command asking for input inside the while loop. For example, the select command waits for standard input, and if placed inside the while loop, it will take input from physicists.$$, instead of prompting in the command-line for user input.
To correct this, I use the -u option of read, which allows reading from a file descriptor. We only have to create a file descriptor (with the exec command) corresponding to physicists.$$ and give it to the -u option of read, as in the following code:
## Store the "two-dimensional data" in a file ($$ is just the process ID of the shell, to make sure the filename is unique)
cat > physicists.$$ <<EOF
Wolfgang Pauli 1900
Werner Heisenberg 1901
Albert Einstein 1879
Niels Bohr 1885
EOF
nbPhysicists=$(wc -l physicists.$$ | cut -sf 1 -d ' ') # Number of lines of the here-file specifying the physicists.
exec {id_file}<./physicists.$$ # Create a file descriptor stored in 'id_file'.
## Extract the needed data
declare -a person # Create an indexed array (necessary for the read command).
while read -ra person -u "${id_file}"; do
firstName=${person[0]}
familyName=${person[1]}
birthYear=${person[2]}
echo "Physicist ${firstName} ${familyName} was born in ${birthYear}"
# Do whatever you need with data
done
## Close the file descriptor
exec {id_file}<&-
## Remove the temporary file
rm physicists.$$
Notice that the file descriptor is closed at the end.
Bash does not support multidimensional arrays, but we can simulate them using associative arrays. Here the indexes are the keys used to retrieve the values. Associative arrays are available in bash version 4.
#!/bin/bash
declare -A arr2d
rows=3
columns=2
for ((i=0;i<rows;i++)); do
for ((j=0;j<columns;j++)); do
arr2d[$i,$j]=$i
done
done
for ((i=0;i<rows;i++)); do
for ((j=0;j<columns;j++)); do
echo ${arr2d[$i,$j]}
done
done
Expanding on Paul's answer - here's my version of working with associative sub-arrays in bash:
declare -A SUB_1=(["name1key"]="name1val" ["name2key"]="name2val")
declare -A SUB_2=(["name3key"]="name3val" ["name4key"]="name4val")
STRING_1="string1val"
STRING_2="string2val"
MAIN_ARRAY=(
"${SUB_1[*]}"
"${SUB_2[*]}"
"${STRING_1}"
"${STRING_2}"
)
echo "COUNT: " ${#MAIN_ARRAY[#]}
for key in ${!MAIN_ARRAY[#]}; do
IFS=' ' read -a val <<< ${MAIN_ARRAY[$key]}
echo "VALUE: " ${val[#]}
if [[ ${#val[#]} -gt 1 ]]; then
for subkey in ${!val[#]}; do
subval=${val[$subkey]}
echo "SUBVALUE: " ${subval}
done
fi
done
It works with mixed values in the main array - strings/arrays/assoc. arrays
The key here is to wrap the subarrays in double quotes and use * instead of @ when storing a subarray inside the main array, so it gets stored as a single, space-separated string: "${SUB_1[*]}"
Then it makes it easy to parse an array out of that when looping through values with IFS=' ' read -a val <<< ${MAIN_ARRAY[$key]}
The code above outputs:
COUNT: 4
VALUE: name1val name2val
SUBVALUE: name1val
SUBVALUE: name2val
VALUE: name4val name3val
SUBVALUE: name4val
SUBVALUE: name3val
VALUE: string1val
VALUE: string2val
Lots of answers found here for creating multidimensional arrays in bash.
And without exception, all are obtuse and difficult to use.
If MD arrays are a required criterion, it is time to make a decision:
Use a language that supports MD arrays
My preference is Perl. Most would probably choose Python.
Either works.
Store the data elsewhere
JSON and jq have already been suggested. XML has also been suggested, though for your use JSON and jq would likely be simpler.
It would seem though that Bash may not be the best choice for what you need to do.
Sometimes the correct question is not "How do I do X in tool Y?", but rather "Which tool would be best to do X?"
I do this using associative arrays since bash 4 and setting IFS to a value that can be defined manually.
The purpose of this approach is to have arrays as values of associative array keys.
In order to set IFS back to default just unset it.
unset IFS
This is an example:
#!/bin/bash
set -euo pipefail
# used as value in asscciative array
test=(
"x3:x4:x5"
)
# associative array
declare -A wow=(
["1"]=$test
["2"]=$test
)
echo "default IFS"
for w in ${wow[@]}; do
echo " $w"
done
IFS=:
echo "IFS=:"
for w in ${wow[@]}; do
for t in $w; do
echo " $t"
done
done
echo -e "\n or\n"
for w in ${!wow[@]}
do
echo " $w"
for t in ${wow[$w]}
do
echo " $t"
done
done
unset IFS
unset w
unset t
unset wow
unset test
The output of the script above is:
default IFS
x3:x4:x5
x3:x4:x5
IFS=:
x3
x4
x5
x3
x4
x5
or
1
x3
x4
x5
2
x3
x4
x5
I've got a pretty simple yet smart workaround:
Just define the array with variables in its name. For example:
for (( i=0 ; i<$(($maxvalue + 1)) ; i++ ))
do
for (( j=0 ; j<$(($maxargument + 1)) ; j++ ))
do
declare -a array$i[$j]=((Your rule))
done
done
Don't know whether this helps since it's not exactly what you asked for, but it works for me. (The same could be achieved just with variables without the array)
echo "Enter no of terms"
read count
for i in $(seq 1 $count)
do
t=` expr $i - 1 `
for j in $(seq $t -1 0)
do
echo -n " "
done
j=` expr $count + 1 `
x=` expr $j - $i `
for k in $(seq 1 $x)
do
echo -n "* "
done
echo ""
done

BASH store values in an array and check difference of each value

[CentOS, BASH, cron] Is there a method to declare variables that would persist even when the system restarts?
The scenario is to snmpwalk interface I/O errors and store the values in an array. A cron job to snmpwalk again, say 5 mins later, would get another set of values. I would like to compare them with the previous corresponding value for each interface. If the difference exceeds the threshold (50), an alert would be generated.
So the question is: how to store an array variable so that it isn't lost when the system restarts? And how to check the difference of each value in two arrays?
UPDATE Mar 16, 2012 I attach my final script here for your reference.
#!/bin/bash
# This script is to monitor interface Input/Output Errors of Cisco devices, by snmpwalk the error values every 5 mins, and send email alert if incremental value exceeds threshold (e.g. 500).
# Author: Wu Yajun | Created: 12Mar2012 | Updated: 16Mar2012
##########################################################################
DIR="$( cd "$( dirname "$0" )" && pwd )"
host=device.ip.addr.here
# Check and initiate .log file storing previous values, create .tmp file storing current values.
test -e $DIR/host1_ifInErrors.log || snmpwalk -c public -v 1 $host IF-MIB::ifInErrors > $DIR/host1_ifInErrors.log
snmpwalk -c public -v 1 $host IF-MIB::ifInErrors > $DIR/host1_ifInErrors.tmp
# Compare differences of the error values, and alert if diff exceeds threshold.
# To exclude checking some interfaces, e.g. Fa0/6, Fa0/10, Fa0/11, change the below "for loop" to style as:
# for i in {1..6} {8..10} {13..26}
totalIfNumber=$(echo $(wc -l $DIR/host1_ifInErrors.tmp) | sed 's/ \/root.*$//g')
for (( i=1; i<=$totalIfNumber; i++))
do
currentValue=$(cat $DIR/host1_ifInErrors.tmp | sed -n ''$i'p' | sed 's/^.*Counter32: //g')
previousValue=$(cat $DIR/host1_ifInErrors.log | sed -n ''$i'p' | sed 's/^.*Counter32: //g')
diff=$(($currentValue-$previousValue))
[ $diff -ge 500 ] && (ifName=$(echo $(snmpwalk -c public -v 1 $host IF-MIB::ifName.$i) | sed 's/^.*STRING: //g') ; echo "ATTENTION - Input Error detected from host1 interface $ifName" | mutt -s "ATTENTION - Input Error detected from host1 interface $ifName" <email address here>)
done
# Store current values for next time checking.
snmpwalk -c public -v 1 $host IF-MIB::ifInErrors > $DIR/host1_ifInErrors.log
Save the variables in a file. Add a date stamp:
echo "$(date)#... variables here ...." >> "$file"
Read the last values from the file:
tail -1 "$file" | cut "-d#" -f2 | read ... variables here ....
That also gives you a nice log file where you can monitor the changes. I suggest to always append to the file, so you can easily see when the service is down/didn't run for some reason.
To check for changes, you can use a simple if:
if [[ "...old values..." != "...new values..." ]]; then
send mail
fi
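For the per-value threshold check the question asks about, a small sketch along these lines could work (the file name, threshold, and sample data are all made up; the here-string avoids the subshell you would get from piping into read):

#!/bin/bash
file=/var/tmp/ioerrors.log
threshold=50

# Current values, e.g. parsed out of snmpwalk output (placeholder data here).
new=(12 340 7)

# Load the previous run's values from the last line of the log.
# On the very first run, old stays empty and missing values count as 0.
read -r -a old <<< "$(tail -1 "$file" 2>/dev/null | cut -d'#' -f2)"

for i in "${!new[@]}"; do
    diff=$(( new[i] - old[i] ))
    if (( diff >= threshold )); then
        echo "interface $i: error count jumped by $diff"
    fi
done

# Append the current values with a date stamp, as suggested above.
echo "$(date)#${new[*]}" >> "$file"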
