BASH - find regex in array, print found array items - arrays

I have tested the regex here:
http://regexr.com/3bchs
I can't get the loop to print only the array items that match the regex.
files=(`ls $BACKUPDIR`)
daterange='(2015\-06\-)[0-9]+\s?'
for i in "${files[@]}"
do
if [[ "$files[$i]" =~ $daterange ]];
then
echo $i
fi
done
Input: 2015-06-06 2015-06-13 2015-06-20 2015-06-27 2015-07-04 2015-07-11
Output:
2015-06-06
2015-06-13
2015-06-20
2015-06-27
2015-07-04
2015-07-11

By running bash -vx <script> I found out that the loop was comparing the wrong string. I needed to change $files[$i] to $i, because $i already holds the array element, not an index.
$files[$i] = 2015-06-06[2015-06-06]
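To see where that odd string comes from, here is a minimal sketch (the sample values and the names unbraced and braced are only illustrative). Without braces, bash expands $files to the first element and treats the bracketed part as literal text:

```shell
#!/bin/bash
# Hypothetical sample data mirroring the directory names above.
files=(2015-06-06 2015-06-13 2015-07-04)
i=${files[1]}

# Without braces, $files is shorthand for ${files[0]}; "[$i]" stays literal text.
unbraced="$files[$i]"

# With braces, the subscript is honoured (here a numeric index).
braced="${files[1]}"

echo "$unbraced"
echo "$braced"
```

The first echo shows the first element with the loop value glued on in brackets, which is why the comparison never looked at the element I expected.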
I have further improved my answer thanks to Etan Reisner's comment, by not parsing the output of ls.
Reference: https://stackoverflow.com/a/11584156/3371795
#!/bin/bash
# Enable special handling to prevent expansion to a
# literal '/example/backups/*' when no matches are found.
shopt -s nullglob
# Variables
YEAR=`date +%Y`
MONTH=`date +%m`
DIR=(/home/user/backups/backup.weekly/*)
# REGEX - Get current month
DATE_RANGE='('$YEAR'\-'$MONTH'\-)[0-9]+\s?'
# Loop through dir
for i in "${DIR[@]}";
do
# Compare dir name with date range
if [[ "$i" =~ $DATE_RANGE ]];
then
# Tests whether "$i" is a directory, but the result is never used,
# so the script works the same without it.
[[ -d "$i" ]]
# Echo found dirs
echo "$i"
fi
done
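If the matches are needed later rather than just printed, the same loop can collect them into a second array. This is only a sketch: filter_by_regex and matched are made-up names, and the anchored regex assumes bare directory names, not full paths like the glob above produces.

```shell
#!/bin/bash
# filter_by_regex REGEX ITEM... : hypothetical helper that fills the global
# array "matched" with the items that match REGEX.
filter_by_regex() {
    local regex=$1 item
    shift
    matched=()
    for item in "$@"; do
        # Keep $regex unquoted so it is treated as a pattern, not a literal string.
        if [[ $item =~ $regex ]]; then
            matched+=("$item")
        fi
    done
}

dirs=(2015-06-06 2015-06-13 2015-06-20 2015-06-27 2015-07-04 2015-07-11)
filter_by_regex '^2015-06-[0-9]+$' "${dirs[@]}"
printf '%s\n' "${matched[@]}"
```

With the sample input from the question, only the four June entries end up in matched.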

Related

Appending to an array in bash, why this isn't working?

From a list of files whose names are correctly matched by a regex, I'm trying to check whether each matched value is already in my array and, if not, append it.
Unfortunately, this code, which I built up from some Stack Overflow posts, doesn't work: nothing happens, the =~ doesn't seem to set BASH_REMATCH, and it doesn't output anything.
sample_array=() #creating the array
for context_files in data/*.txt.gz # checking all the different samples id we have
do
[[ $context_files =~ SL[0-9]{6} ]]
echo 'context file:' "$context_files"
echo 'rematch:' "${BASH_REMATCH[0]}"
if ! [[ " ${sample_array[*]} " =~ (^|[[:space:]])"${BASH_REMATCH[0]}"($|[[:space:]]) ]]; then
echo 'condition matched'
echo 'rematch:' "${BASH_REMATCH[0]}"
sample_array+=(" ${BASH_REMATCH[0]} ")
fi
done
echo "${sample_array[*]}"
Replacing this code with
sample_array=() #creating the array
for context_files in data/*.txt.gz # checking all the different samples id we have
do
[[ $context_files =~ SL[0-9]{6} ]]
echo 'context file:' "$context_files"
echo 'rematch:' "${BASH_REMATCH[0]}"
if ! [[ " ${sample_array[*]} " == "${BASH_REMATCH[0]}" ]]; then
echo 'condition matched'
echo 'rematch:' "${BASH_REMATCH[0]}"
sample_array+=(" ${BASH_REMATCH[0]} ")
fi
done
echo "${sample_array[*]}"
adds every value this time, duplicates included.
Output:
A B A B A B
I probably don't get something in how the if is managed and/or how the regex lookup in a bash array is to be made but I'd gladly get some help!
The second match is negated, so in order to enter the then part, the match needs to fail. A failed match resets $BASH_REMATCH.
#! /bin/bash
sample_array=()
for context_files in data/SL{111111,222222,333333,111111,222222}.txt.gz ; do
[[ $context_files =~ SL[0-9]{6} ]]
match=${BASH_REMATCH[0]}
echo 'context file:' "$context_files"
echo 'rematch:' "$match"
if ! [[ " ${sample_array[*]} " =~ (^|[[:space:]])"$match"($|[[:space:]]) ]]; then
echo 'condition matched'
echo 'rematch:' "$match"
sample_array+=(" $match ")
fi
done
echo "${sample_array[*]}"
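The reset can be checked in isolation. A minimal sketch, with made-up input strings and variable names:

```shell
#!/bin/bash
# A successful match fills BASH_REMATCH.
if [[ "data/SL123456.txt.gz" =~ SL[0-9]{6} ]]; then
    after_success=${BASH_REMATCH[0]}
fi

# A failed match leaves BASH_REMATCH empty, so the earlier result is gone.
if ! [[ "no sample id here" =~ SL[0-9]{6} ]]; then
    after_failure=${BASH_REMATCH[0]}
fi

echo "after success: '$after_success'"
echo "after failure: '$after_failure'"
```

That is why the script above copies the match into its own variable before running the second, negated test.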
Here is a completely alternative solution in bash-style like John Kugelman suggested:
printf %s\\n data/*.txt.gz | grep -Eo 'SL[0-9]{6}' | sort -u
If you need the results in an array, use mapfile:
mapfile -t array < <(printf %s\\n data/*.txt.gz | grep -Eo 'SL[0-9]{6}' | sort -u)

Using mapfile to save output to associative arrays

While practicing bash, I tried writing a script that searches the home directory for duplicate files and deletes them. Here's what my script looks like now.
#!/bin/bash
# create-list: create a list of regular files in a directory
declare -A arr1 sumray origray
if [[ -d "$HOME/$1" && -n "$1" ]]; then
echo "$1 is a directory"
else
echo "Usage: create-list Directory | options" >&2
exit 1
fi
for i in $HOME/$1/*; do
[[ -f $i ]] || continue
arr1[$i]="$i"
done
for i in "${arr1[@]}"; do
Name=$(sed 's/[][?*]/\\&/g' <<< "$i")
dupe=$(find ~ -name "${Name##*/}" ! -wholename "$Name")
if [[ $(find ~ -name "${Name##*/}" ! -wholename "$Name") ]]; then
mapfile -t sumray["$i"] < <(find ~ -name "${Name##*/}" ! -wholename "$Name")
origray[$i]=$(md5sum "$i" | cut -c 1-32)
fi
done
for i in "${!sumray[@]}"; do
poten=$(md5sum "$i" | cut -c 1-32)
for i in "${!origray[@]}"; do
if [[ "$poten" = "${origray[$i]}" ]]; then
echo "${sumray[$i]} is a duplicate of $i"
fi
done
done
Originally, where mapfile -t sumray["$i"] < <(find ~ -name "${Name##*/}" ! -wholename "$Name") is now, my line was the following:
sumray["$i"]=$(find ~ -name "${Name##*/}" ! -wholename "$Name")
This saved the output of find to the array. But I had an issue. If a single file had multiple duplicates, then all locations found by find would be saved to a single value. I figured I could use the mapfile command to fix this, but now it's not saving anything to my array at all. Does it have to do with the fact that I'm using an associative array? Or did I just mess up elsewhere?
I'm not sure if I'm allowed to answer my own question, but I figured that I should post how I solved my problem.
As it turns out, the mapfile command does not work on associative arrays at all. So my fix was to save the output of find to a text file and then store that information in an indexed array. I tested this a few times and I haven't seemed to encounter any errors yet.
Here's my finished script.
#!/bin/bash
# create-list: create a list of regular files in a directory
declare -A arr1 origray
declare indexray
#Verify that Parameter is a directory.
if [[ -d "$HOME/$1/" && -n "$1" ]]; then
echo "Searching for duplicates of files in $1"
else
echo "Usage: create-list Directory | options" >&2
exit 1
fi
#create list of files in specified directory
for i in $HOME/${1%/}/*; do
[[ -f $i ]] || continue
arr1[$i]="$i"
done
#search for all duplicate files in the home directory
#by name
#find checksum of files in specified directory
for i in "${arr1[@]}"; do
Name=$(sed 's/[][?*]/\\&/g' <<< "$i")
if [[ $(find ~ -name "${Name##*/}" ! -wholename "$Name") ]]; then
find ~ -name "${Name##*/}" ! -wholename "$Name" >> temp.txt
origray[$i]=$(md5sum "$i" | cut -c 1-32)
fi
done
#create list of duplicate file locations.
if [[ -f temp.txt ]]; then
mapfile -t indexray < temp.txt
else
echo "No duplicates were found."
exit 0
fi
#compare similarly named files by checksum and delete duplicates
count=0
for i in "${!indexray[@]}"; do
poten=$(md5sum "${indexray[$i]}" | cut -c 1-32)
# use a separate loop variable so the outer index $i is not overwritten
for orig in "${!origray[@]}"; do
if [[ "$poten" = "${origray[$orig]}" ]]; then
echo "${indexray[$count]} is a duplicate of a file in $1."
fi
done
count=$((count+1))
done
rm temp.txt
This is kind of sloppy, but it does what it's supposed to do. md5sum may not be the optimal way to check for duplicate files, but it works. All I have to do is replace echo "${indexray[$count]} is a duplicate of a file in $1." with rm -i "${indexray[$count]}" and it's good to go.
So my next question would have to be...why doesn't mapfile work with associative arrays?
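As far as I can tell, part of the reason is that mapfile reads lines into an indexed array as a whole, and every element of a bash array (indexed or associative) is a single string, so one key cannot hold a list of its own. One possible workaround, sketched here with made-up paths and names (dupes, copies), is to store the list as a newline-separated string under the key and split it with mapfile only when it is needed. It assumes no file name contains a newline.

```shell
#!/bin/bash
# Hypothetical data: one original file with two same-named copies,
# stored as a single newline-separated string under the original's path.
declare -A dupes
dupes["/home/user/docs/a.txt"]=$'/home/user/old/a.txt\n/home/user/tmp/a.txt'

for orig in "${!dupes[@]}"; do
    # Split the stored string into a plain indexed array, one line per element.
    mapfile -t copies <<< "${dupes[$orig]}"
    for copy in "${copies[@]}"; do
        echo "$copy may duplicate $orig"
    done
done
```

This would avoid the temporary file, at the cost of joining and splitting the list by hand.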

echo a variable and then grep to see if value exist in a file is not returning anything. Unix Shell Scripting

I'm trying to figure out how to determine, using grep, whether a variable contains a value from a file. My attempt is not returning anything, so let me explain it.
I have my code that is this:
MyFiles="MyFile-I-20160606_141_Employees.txt"
DirFiles="/dev/fs/C/Users/salasfri/Desktop/MyFiles.txt"
for OutFile in $(cat $DirFiles); do
if [[ $( echo $MyFiles | grep -c $OutFile ) -gt 0 ]]; then
print "The file $OutFile exist!!"
fi
done
and the file in /dev/fs/C/Users/salasfri/Desktop/MyFiles.txt contains the following values:
MyFile-I-*_141_Employees.txt
MyFile-I-*_141_Products.txt
MyFile-I-*_141_Deparments.txt
The idea is to verify whether the value of the variable "MyFiles" is found in the MyFiles.txt file. As you can see, the entries use the pattern "*" in place of the date, because the date changes.
This solution is not returning any count of files. Is there something I'm doing wrong?
You can try to change the searchstring before searching.
An example with three teststrings:
for teststring in MyFile-I-20160606_141_Employees.txt MyFile-I-20160606_142_Employees.txt MyFile-I-20160606_141_Others.txt
do
grepstr=$(sed 's/[0-9]\{8\}_/*_/' <<< "${teststring}")
fgrep "${grepstr}" "${DirFiles}"
found=$(fgrep "${grepstr}" "${DirFiles}")
if [ $? -eq 0 ]; then
echo "${found} matches ${teststring}."
fi
done
In your case you can make the code shorter with
fgrep -q "$(sed 's/[0-9]\{8\}_/*_/' <<< "${MyFiles}")" $DirFiles &&
echo "The file $(sed 's/[0-9]\{8\}_/*_/' <<< "${MyFiles}") exist!!"
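Your patterns are glob-style patterns, not regular expressions. Read as a regular expression, the pattern abc-*_X.txt will not match the string abc-1234_X.txt, because * there means "zero or more of the preceding character" rather than "any string".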
You want to use a shell construct that does glob matching.
MyFiles="MyFile-I-20160606_141_Employees.txt"
sed 's/\r$//' "/dev/fs/C/Users/salasfri/Desktop/MyFiles.txt" \
| while IFS= read -r Pattern; do
if [[ $MyFiles == $Pattern ]]; then
print "$MyFiles matches pattern $Pattern"
break
fi
done
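A case statement does the same kind of glob matching and may be easier to reuse. This is a sketch for bash only: matches_any is a made-up helper name, and the patterns are passed as arguments instead of being read from the file.

```shell
#!/bin/bash
# matches_any NAME PATTERN... : hypothetical helper; succeeds if NAME matches
# at least one glob-style PATTERN.
matches_any() {
    local name=$1 pattern
    shift
    for pattern in "$@"; do
        # An unquoted $pattern in a case arm is matched as a glob, not a literal.
        case $name in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

patterns=('MyFile-I-*_141_Employees.txt' 'MyFile-I-*_141_Products.txt')
if matches_any "MyFile-I-20160606_141_Employees.txt" "${patterns[@]}"; then
    echo "match"
fi
```

Quoting $pattern inside the case arm would turn it back into a literal comparison, so it is left unquoted on purpose.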

Check multiple var if exists in array with grep

I am using this code to check whether one $var exists in an array:
if echo ${myArr[@]} | grep -qw $myVar; then echo "Var exists on array"; fi
How could I check more than one $var at once? Something like grep -qw $var1,$var2; then ... fi
Thank you in Advance.
if echo ${myArr[@]} | grep -qw -e "$myVar" -e "$otherVar"
then
echo "Var exists on array"
fi
From the man-page:
-e PATTERN, --regexp=PATTERN
Use PATTERN as the pattern.
This can be used to specify multiple search patterns, or to protect a pattern beginning with a hyphen (-). (-e is specified by POSIX.)
But if you want to use arrays like this you might as well use the bash built-in associative arrays.
To implement AND logic:
myVar1=home1
myVar2=home2
myArr[0]=home1
myArr[1]=home2
if echo ${myArr[@]} | grep -qw -e "$myVar1.*$myVar2" -e "$myVar2.*$myVar1"
then
echo "Var exists on array"
fi
# using associative arrays
declare -A assoc
assoc[home1]=1
assoc[home2]=1
if [[ ${assoc[$myVar1]} && ${assoc[$myVar2]} ]]; then
echo "Var exists on array"
fi
Actually you don't need grep for this, Bash is perfectly capable of doing Extended Regular Expressions itself (Bash 3.0 or later).
pattern="$var1|$var2|$var3"
for element in "${myArr[@]}"
do
if [[ $element =~ $pattern ]]
then
echo "$pattern exists in array"
break
fi
done
Something quadratic, but aware of spaces:
myArr=(aa "bb c" ddd)
has_values(){
for e in "${myArr[@]}" ; do
for f ; do
if [ "$e" = "$f" ]; then return 0 ; fi
done
done
return 1
}
if has_values "ee" "bb  c" ; then echo yes ; else echo "no" ; fi
This example prints no, because "bb c" (one space) != "bb  c" (two spaces).
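For the AND case (every value must be present), a space-aware variant can index the array once in an associative array and then look each value up. A rough sketch, assuming bash 4 or later and non-empty elements; has_all and seen are made-up names:

```shell
#!/bin/bash
myArr=(aa "bb c" ddd)

# has_all VALUE... : hypothetical helper; succeeds only if every VALUE is an
# element of the global array myArr (exact string comparison).
has_all() {
    local -A seen=()
    local e
    # Index the array once; the keys are the elements themselves.
    for e in "${myArr[@]}"; do
        seen[$e]=1
    done
    # Fail as soon as one requested value is missing.
    for e in "$@"; do
        [[ -n ${seen[$e]:-} ]] || return 1
    done
    return 0
}

if has_all "aa" "bb c"; then echo yes; else echo no; fi
if has_all "aa" "zz"; then echo yes; else echo no; fi
```

The first call should report yes and the second no, since "zz" is not in the array.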

Is there a way to search an entire array inside of an argument?

I've posted my code below. I'm wondering whether I can search a whole array for a match, or whether there's a way to search a Unix file, inside a single test.
#!/bin/bash
# store words in file
cat $1 | ispell -l > file
# move words in file into array
array=($(< file))
# remove temp file
rm file
# move already checked words into array
checked=($(< .spelled))
# print out words & ask for corrections
for ((i=0; i<${#array[@]}; i++ ))
do
if [[ ! ${array[i]} = ${checked[@]} ]]; then
read -p "' ${array[i]} ' is mispelled. Press "Enter" to keep
this spelling, or type a correction here: " input
if [[ ! $input = "" ]]; then
correction[i]=$input
else
echo ${array[i]} >> .spelled
fi
fi
done
echo "MISPELLED: CORRECTIONS:"
for ((i=0; i<${#correction[@]}; i++ ))
do
echo ${array[i]} ${correction[i]}
done
Otherwise, I would need to write a for loop to check each array index, and then somehow decide whether to go through the loop and print/take input.
The usual shell incantation to do this is:
ispell -l < "$1" | while IFS= read -r ln
do
# the loop's stdin is the pipe, so read the user's answer from the terminal
read -r -p "$ln is misspelled. Enter correction: " corrected </dev/tty
if [ -n "$corrected" ] ; then
ln=$corrected
fi
echo "$ln"
done >correctedwords.txt
The while;do;done is kind of like a function and you can pipe data into and out of it.
P.S. I didn't test the above code so there may be syntax errors
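On the original question of searching a whole array in one condition: as far as I know there is no built-in test for that, but a small function hides the loop. A sketch with made-up sample words standing in for the ispell output and the .spelled file; in_array and unchecked are hypothetical names:

```shell
#!/bin/bash
# in_array NEEDLE ITEM... : hypothetical helper; succeeds if NEEDLE equals one ITEM.
in_array() {
    local needle=$1 item
    shift
    for item in "$@"; do
        # Quote the right-hand side so it is compared literally, not as a glob.
        [[ $item == "$needle" ]] && return 0
    done
    return 1
}

# Hypothetical sample data.
array=(teh recieve bash)
checked=(bash)

unchecked=()
for word in "${array[@]}"; do
    if ! in_array "$word" "${checked[@]}"; then
        unchecked+=("$word")
        echo "'$word' still needs a decision"
    fi
done
```

The if line then reads almost like the single test the question was after.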
