Bash: rm with an array of filenames - arrays

So I'm working on making an advanced delete script. The idea is the user inputs a grep regex for what needs to be deleted, and the script does an rm operation for all of it. Basically eliminates the need to write all the code directly in the command line each time.
Here is my script so far:
#!/bin/bash
# Script to delete files passed to it
if [ $# -ne 1 ]; then
echo "Error! Script needs to be run with a single argument that is the regex for the files to delete"
exit 1
fi
IFS=$'\n'
files=$(ls -a | grep $1 | awk '{print "\"" $0 "\"" }')
## TODO ensure directory support
echo "This script will delete the following files:"
for f in $files; do
echo " $f"
done
valid=false
while ! $valid ; do
read -p "Do you want to proceed? (y/n): "
case $REPLY in
y)
valid=true
echo "Deleting, please wait"
echo $files
rm ${files}
;;
n)
valid=true
;;
*)
echo "Invalid input, please try again"
;;
esac
done
exit 0
My problem is when I actually do the "rm" operation. I keep getting errors saying No such file or directory.
This is the directory I'm working with:
drwxr-xr-x 6 user staff 204 May 9 11:39 .
drwx------+ 51 user staff 1734 May 9 09:38 ..
-rw-r--r-- 1 user staff 10 May 9 11:39 temp two.txt
-rw-r--r-- 1 user staff 6 May 9 11:38 temp1.txt
-rw-r--r-- 1 user staff 6 May 9 11:38 temp2.txt
-rw-r--r-- 1 user staff 10 May 9 11:38 temp3.txt
I'm calling the script like this: easydelete.sh '^tem'
Here is the output:
This script will delete the following files:
"temp two.txt"
"temp1.txt"
"temp2.txt"
"temp3.txt"
Do you want to proceed? (y/n): y
Deleting, please wait
"temp two.txt" "temp1.txt" "temp2.txt" "temp3.txt"
rm: "temp two.txt": No such file or directory
rm: "temp1.txt": No such file or directory
rm: "temp2.txt": No such file or directory
rm: "temp3.txt": No such file or directory
If I try and directly delete one of these files, it works fine. If I even pass that whole string that prints out before I call "rm", it works fine. But when I do it with the array, it fails.
I know I'm handling the array wrong, just not sure exactly what I'm doing wrong. Any help would be appreciated. Thanks.

Consider instead:
# put all filenames containing $1 as literal text in an array
#files=( *"$1"* )
# ...or, use a grep with GNU extensions to filter contents into an array:
# this passes filenames around with NUL delimiters for safety
#files=( )
#while IFS= read -r -d '' f; do
# files+=( "$f" )
#done < <(printf '%s\0' * | egrep --null --null-data -e "$1")
# ...or, evaluate all files against $1, as regex, and add them to the array if they match:
files=( )
for f in *; do
[[ $f =~ $1 ]] && files+=( "$f" )
done
# check that the first entry in that array actually exists
[[ -e $files || -L $files ]] || {
echo "No files containing $1 found; exiting" >&2
exit 1
}
# warn the user
echo "This script will delete the following files:" >&2
printf ' %q\n' "${files[@]}" >&2
# prompt the user
valid=0
while (( ! valid )); do
read -p "Do you want to proceed? (y/n): "
case $REPLY in
y) valid=1; echo "Deleting; please wait" >&2; rm -f "${files[@]}" ;;
n) valid=1 ;;
esac
done
I'll go into the details below:
files has to be explicitly created as an array to actually be an array -- otherwise, it's just a string with a bunch of files in it.
This is an array:
files=( "first file" "second file" )
This is not an array (and, in fact, could be a single filename):
files='"first file" "second file"'
A proper bash array is expanded with "${arrayname[@]}" to get all contents, or "$arrayname" to get only the first entry.
[[ -e $files || -L $files ]]
...thus checks the existence (whether as a file or a symlink) of the first entry in the array -- which is sufficient to tell if the glob expression did in fact expand, or if it matched nothing.
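To make the array-versus-string difference concrete, here is a minimal sketch. The helper count_args is a made-up name used only to show how many separate arguments a command such as rm would receive, and the default IFS is assumed:

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it reports how many arguments it got.
count_args() { echo "$#"; }

files=( "first file" "second file" )        # a real array with two entries
notarray='"first file" "second file"'       # one string; the quotes are data

count_args "${files[@]}"   # 2: each entry stays one argument
count_args ${files[@]}     # 4: unquoted, every entry is split on whitespace
count_args $notarray       # 4: split on whitespace, quote characters kept
count_args "$files"        # 1: only the first entry
```

The third call is what happened in the question: rm received words such as `"temp1.txt"` with the quote characters included, and no file has that name.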
A boolean is better represented with numeric values than a string containing true or false: Running if $valid has potential to perform arbitrary activity if the contents of valid could ever be set to a user-controlled value, whereas if (( valid )) -- checking whether $valid is a positive numeric value (true) or otherwise (false) -- has far less room for side effects in presence of bugs elsewhere.
There's no need to loop over array entries to print them in a list: printf "$format_string" "${array[@]}" reuses the format string as many times as needed to consume all of its arguments (here, the array expansion). Moreover, using %q in your format string will quote nonprintable values, whitespace, newlines, &c. in a format that's consumable by both human readers and the shell; without it, a file created with touch $'evil\n - hiding' will appear to be two list entries when in fact it is only one.
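A small sketch of that last point, using an in-memory array instead of real files (the array name names is made up for the example):

```bash
#!/usr/bin/env bash
# Two entries; the second contains an embedded newline.
names=( "plain.txt" $'evil\n - hiding' )

echo "With %s (looks like three entries):"
printf ' %s\n' "${names[@]}"

echo "With %q (one line per entry, shell-quoted):"
printf ' %q\n' "${names[@]}"
```

With %s the embedded newline splits the second entry across two lines of output; with %q each array entry occupies exactly one line.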

Related

Issue using diff with array and value quoted SHELL [duplicate]

This question already has answers here:
How can I store the "find" command results as an array in Bash
(8 answers)
Closed 2 months ago.
Hi guys, I'm having an issue while using diff.
In my script I'm trying to compare all files in one directory to all files in two other directories,
using diff to check whether the files are the same.
Here is my script :
`
#!/bin/bash
files1=()
files2=()
# Directories to compare. Adding quotes at the begining and at the end of each files found in content1 & content3
content2=$(find /data/logs -name "*.log" -type f)
content1=$(find /data/other/logs1 -type f | sed 's/^/"/g' | sed 's/$/"/g')
content3=$(find /data/other/logs2 -type f | sed 's/^/"/g' | sed 's/$/"/g')
# ADDING CONTENT INTO FILES1 & FILES2 ARRAY
while read -r line; do
files1+=("$line")
done <<< "$content1"
# content1 and content3 goes into the same array
while read -r line3;do
files1+=("$line3")
done <<< "$content3"
while read -r line2; do
files2+=("$line2")
done <<< "$content2"
# Here i'm trying to compare 1 by 1 the files in files2 to all files1
for ((i=0; i<${#files2[@]}; i++))
do
for ((j=0; j<${#files1[@]}; j++))
do
if [[ -n ${files2[$i]} ]];then
diff -s "${files2[$i]}" "${files1[$j]}" > /dev/null
if [[ $? == 0 ]]; then
echo ${files1[$j]} "est identique a" ${files2[$i]}
unset 'files2[$i]'
break
fi
fi
done
done
#SHOW THE FILES WHO DIDN'T MATCHED
echo ${files2[@]}
`
I'm getting the following error when I try to diff:
diff: "/data/content3/other/log2/perso log/somelog.log": No such file or directory
But when i'm doing
ll "/data/content3/other/log2/perso log/somelog.log"
-rw-rw-r-- 2 lopom lopom 551M 30 oct. 18:53 '/data/content3/other/logs2/perso log/somelog.log'
So the file exists.
I need those quotes because sometimes there are spaces in the path.
Does someone know how to fix that?
Thanks.
I already tried replacing the double quotes with single quotes, but that didn't fix it.
First, don't do this -
content2=$(find /data/logs -name "*.log" -type f)
content1=$(find /data/other/logs1 -type f | sed 's/^/"/g' | sed 's/$/"/g')
content3=$(find /data/other/logs2 -type f | sed 's/^/"/g' | sed 's/$/"/g')
don't stack all these into single vars. This is asking for ten kinds of obscure trouble. More importantly, those sed calls are embedding the quotation marks into the data as part of the filenames, which is probably what's causing diff to crash, because there are no actual files with the quotes in the name.
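The effect of those embedded quotes can be reproduced in isolation. This is only a sketch: the temporary directory and the variable names plain and quoted are invented for the demonstration.

```bash
#!/usr/bin/env bash
# Create a file whose name contains a space, in a throwaway directory.
demo=$(mktemp -d)
plain="$demo/perso log.log"
touch "$plain"

# What the sed calls produce: the same path with literal " characters added.
quoted="\"$plain\""

[[ -e $plain ]]  && echo "found:   $plain"
[[ -e $quoted ]] || echo "missing: $quoted"

rm -rf "$demo"
```

Quoting the expansion ("$plain") is what protects the space; quote characters stored inside the value are just part of the name being looked up.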
Also, if you are throwing away the output and just using diff to check the files are identical, try cmp instead. The -s is silent, and it's a lot faster since it exits at the first differing byte without reading the rest of both files and generating a report. If there are a lot of files, this will add up.
If the logs are the only things in the directories, and you don't have to scan subdirectories, and the filename can't appear in both /data/other/logs1 AND /data/other/logs2, but you're pretty sure it will be in at least one of them... then simplify:
for f in /data/logs/*.log # I'll assume these are all files...
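do t=( /data/other/logs[12]/"${f#/data/logs/}" ) # always just one? "$t" is the first match
if cmp -s "$f" "$t" # cmp -s *has* no output
then echo "$t est identique a $f" # files are same
elif [[ -e "$t" ]] # check t exists
then echo "$t diffère de $f" # maybe ls -l "$f" "$t" ?
else echo "$t n'existe pas" # report it does not
fi
done
This needs no find, no sed calls, etc. The only array is t, and it is there just so the glob expands: a plain t=... assignment would store the pattern literally, and the quoted "$t" would then never match a file.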
If you do need to read subdirectories, use shopt to handle it with globs so that you don't have to worry about parsing odd characters with read. (c.f. https://mywiki.wooledge.org/ParsingLs for some reasons.)
shopt -s globstar
for f in /data/logs/**/*.log # globstar makes ** match at arbitrary depth
do for t in /data/other/logs[12]/**/"${f#/data/logs/}" # if >1 possible hit
do if cmp -s "$f" "$t"
then echo "$t est identique a $f"
elif [[ -e "$t" ]]
then echo "$t diffère de $f"
else echo "$t n'existe pas" # $t will be the glob, one iteration
fi
done
done

Renaming Files with "Invalid Characters"

I have a script that removes invalid characters from file names because of OneDrive restrictions. It works except for files that have * { } in them. I have used the * { } but that is not working (it ignores those files). The script is below. Not sure what I am doing wrong here.
#Renames FOLDERS with space at the end
IFS=$'\n'
for file in $(find -d . -name "* ")
do
target_name=$(echo "$file" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
if [ "$file" != "$target_name" ]; then
if [ -e $target_name ]; then
echo "WARNING: $target_name already exists, file not renamed"
else
echo "Move $file to $target_name"
mv "$file" "$target_name"
fi
fi
done
#end Folder rename
#Renames FILES
declare -a arrayGrep=(\? \* \, \# \; \: \& \# \+ \< \> \% \$ \~ \% \: \< \> )
echo "array: ${arrayGrep[@]}"
for i in "${arrayGrep[@]}"
do
for file in $(find . | grep $i )
do
target_name=$(echo "$file" | sed 's/\'$i'/-/g' )
if [ "$file" != "$target_name" ]; then
if [ -e $target_name ]; then
echo "WARNING: $target_name already exists, file not renamed"
else
echo "Move $file to $target_name"
mv "$file" "$target_name"
fi
fi
done
done
There are some issues with your code:
You've got some duplicates in arrayGrep
You have some quoting issues: $i in the grep and sed commands must be protected from the shell
some of the disallowed characters, even if quoted to protect from the shell, could be mis-parsed by grep as meta-characters
You loop rather a lot when all substitutions could happen simultaneously
for file in $(find ...) has to read the entire list into memory, which might be problematic for large lists. It also breaks with some filenames due to word-splitting. Piping into read is better
find can do path filtering without grep (at least for this set of disallowed characters)
sed is fine to use but tr is a little neater
A possible rewrite is:
badchars='?*,#;:&#+<>%$~'
find . -name "*[$badchars]*" | while IFS= read -r file
do
target_name=$(echo "$file" | tr "$badchars" - )
if [ "$file" != "$target_name" ]; then
if [ -e "$target_name" ]; then
echo "WARNING: $target_name already exists, file not renamed"
else
echo "Move $file to $target_name"
mv "$file" "$target_name"
fi
fi
done
If using bash, you can even do parameter expansion directly,
although you have to embed the list:
target_name="${file//[?*,#;:&#+<>%$~]/-}"
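Putting these points together, here is a hedged sketch rather than a drop-in replacement. It assumes bash 4 or later and a find that supports -mindepth and -print0 (GNU and BSD both do). The function name sanitize_tree and the variable pat are invented, and the character list is the one above with the duplicate removed. Unlike the tr version, it rewrites only the last path component and walks depth-first, so a renamed directory never invalidates paths that are still waiting to be processed.

```bash
#!/usr/bin/env bash
# sanitize_tree ROOT: replace disallowed characters in names below ROOT with "-".
sanitize_tree() {
    local root=$1 path dir base target
    local pat='[?*,#;:&+<>%$~]'          # bracket expression of disallowed chars

    # -depth lists a directory's contents before the directory itself;
    # -print0 with read -d '' keeps names containing newlines intact.
    while IFS= read -r -d '' path; do
        dir=${path%/*}                   # everything before the last slash
        base=${path##*/}                 # last path component only
        target=$dir/${base//$pat/-}      # unquoted $pat is used as a pattern
        if [[ -e $target ]]; then
            echo "WARNING: $target already exists, file not renamed" >&2
        else
            echo "Move $path to $target"
            mv -- "$path" "$target"
        fi
    done < <(find "$root" -mindepth 1 -depth -name "*${pat}*" -print0)
}
```

Usage would be `sanitize_tree .` from the top of the tree to clean up.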
an enhancement idea
If you choose a suitable character (eg. =), you can rename filenames reversibly (but only if the length of the new name wouldn't exceed the maximum allowed length for a filename). One possible algorithm:
replace all = in the filename with =3D
replace all the other disallowed characters with =hh where hh is the appropriate ASCII code in hex
You can reverse the renaming by:
replace all =hh (except =3D) with the character corresponding to ASCII code hh
replace all =3D with =
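The scheme above can be sketched in bash as follows. The function names encode_name and decode_name are made up, the character list is an assumption taken from the rewrite earlier in this answer, and the code works in a single pass per character instead of the two separate replacement passes listed above (the result should be the same). It handles a single name, not a full path, and does not check the length limit mentioned above.

```bash
#!/usr/bin/env bash
badchars='?*,#;:&+<>%$~'   # assumed set of disallowed characters

# encode_name NAME: write NAME with "=" and disallowed chars as =HH (hex).
encode_name() {
    local in=$1 out='' c i
    for (( i = 0; i < ${#in}; i++ )); do
        c=${in:i:1}
        if [[ $c == '=' || $badchars == *"$c"* ]]; then
            printf -v c '=%02X' "'$c"    # "'x" gives the character's code
        fi
        out+=$c
    done
    printf '%s\n' "$out"
}

# decode_name NAME: turn every =HH back into the character it stands for.
decode_name() {
    local in=$1 out='' c i
    for (( i = 0; i < ${#in}; i++ )); do
        c=${in:i:1}
        if [[ $c == '=' && ${in:i+1:2} =~ ^[0-9A-F]{2}$ ]]; then
            printf -v c '%b' "\\x${in:i+1:2}"
            (( i += 2 ))                 # skip the two hex digits
        fi
        out+=$c
    done
    printf '%s\n' "$out"
}
```

For example, `encode_name 'a=b?c*.txt'` yields `a=3Db=3Fc=2A.txt`, and passing that to decode_name gives the original name back.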

Check if each element of an array is present in a string in bash, ignoring certain characters and order

On the web I found answers to find if an element of array is present in the string. But I want to find if each element in the array is present in the string.
eg. str1 = "This_is_a_big_sentence"
Initially str2 was like
str2 = "Sentence_This_big"
Now I wanted to search if string str1 contains "sentence"&"this"&"big" (All 3, ignore alphabetic order and case)
So I used arr=(${str2//_/ })
How do i proceed now, I know comm command finds intersection, but it needs a sorted list, also I need to ignore _ underscores.
I get my str2 by finding the extension of a particular type of file using the command
for i in `ls snooze.*`; do echo $i | cut -d "." -f2
# Till here i get str2 and need to check as mentioned above. Not sure how to do this, i tried putting str2 as array and now just need to check if all elements of my array occur in str1 (ignore case,order)
Any help would be highly appreciated. I did try to use This link
Now I wanted to search if string a contains "sentence"&"this"&"big"
(All 3, ignore alphabetic order and case)
Here is one approach:
#!/bin/bash
str1="This_is_a_big_sentence"
str2="Sentence_This_big"
if ! grep -qvwFf <(sed 's/_/\n/g' <<<${str1,,}) <(sed 's/_/\n/g' <<<${str2,,})
then
echo "All words present"
else
echo "Some words missing"
fi
How it works
${str1,,} returns the string str1 with all capitals replaced by lower case.
sed 's/_/\n/g' <<<${str1,,} returns the string str1, all converted to lower case and with underlines replaced by new lines so that each word is on a new line.
<(sed 's/_/\n/g' <<<${str1,,}) returns a file-like object containing all the words in str1, each word lower case and on a separate line.
The creation of file-like objects is called process substitution. It allows us, in this case, to treat the output of a shell command as if it were a file to read.
<(sed 's/_/\n/g' <<<${str2,,}) does the same for str2.
Assuming that file1 and file2 each have one word per line, grep -vwFf file1 file2 removes from file2 every occurrence of a word in file1. If there are no words left, that means that every word in file2 appears in file1.
By adding the option -q, grep will return no output but will set an exit code that we can use in our if statement.
In the actual command, file1 and file2 are replaced by our file-like objects.
The remaining grep options can be understood as follows:
-w tells grep to look for whole words only.
-F tells grep to look for fixed strings, not regular expressions.
-f tells grep to look for the patterns to match in the file (or file-like object) which follows.
-v tells grep to remove (the default is to keep) the words which match.
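The same check can be wrapped in a reusable function. This is a sketch under a few assumptions: bash 4 or later for ${var,,}, a made-up function name all_words_present, and tr swapped in for sed so that it does not rely on sed understanding \n in the replacement text.

```bash
#!/usr/bin/env bash
# all_words_present HAYSTACK NEEDLES
# Succeeds when every "_"-separated word of NEEDLES occurs as a whole word
# in HAYSTACK, ignoring case and order.
all_words_present() {
    local haystack=${1,,} needles=${2,,}
    # grep -v prints needle words that match no haystack word; -q turns
    # "something was printed" into exit status 0, which "!" then inverts.
    ! grep -qvwFf <(tr '_' '\n' <<<"$haystack") <(tr '_' '\n' <<<"$needles")
}

if all_words_present "This_is_a_big_sentence" "Sentence_This_big"; then
    echo "All words present"
else
    echo "Some words missing"
fi
```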
Here is an awk solution to check existence of all the words from a string in another string:
str1="This_is_a_big_sentence"
str2="Sentence_This_big"
awk -v RS=_ 'FNR==NR{a[tolower($1)]; next} {delete a[tolower($1)]} END{print (length(a)) ? "Not all words" : "All words"}' <(echo "$str2") <(echo "$str1")
With indentation:
awk -v RS=_ 'FNR==NR {
a[tolower($1)];
next
}
{ delete a[tolower($1)] }
END {
print (length(a)) ? "Not all words" : "All words"
}' <(echo "$str2") <(echo "$str1")
Explanation:
-v RS=_ We use record separator as _
FNR==NR - Execute this block for str2
a[tolower($1)]; next - Populate an array a with each lowercase word as key
{delete a[tolower($1)]} - For each word in str1 delete key in array a
END - If length of array a is still not 0 then there are some words left.
Here's another solution:
#!/bin/bash
str1="This_is_a_big_sentence"
str2="sentence_This_big"
var=0
var2=0
while read in
do
if [ $(echo $str1 | grep -ioE $in) ]
then
var=$((var+1))
fi
var2=$((var2+1))
done < <(echo $str2 | sed -e 's/\(.*\)/\L\1/' -e 's/_/\n/g')
if [[ $var -eq $var2 && $var -ne 0 ]]
then
echo "matched"
else
echo "not matched"
fi
What this script does is make str2 all lower case with sed -e 's/\(.*\)/\L\1/', which substitutes every character with its lower-case form, and then replace underscores _ with newlines \n using the sed expression sed -e 's/_/\n/g', which is another substitution.
Now the individual words are fed into a while loop that compares str1 with the word that was fed in. Every time there's a match, we increment var, and every time we iterate through the while loop, we increment var2. If var == var2, then all the words of str2 were found in str1. Hope that helps.
Here's an approach.
if [ "$(echo "This_BIG_senTence" | grep -ioE 'this|big|sentence' | wc -l)" == "3" ]; then echo "matched"; fi
How it works.
grep options -i makes the grep case insensitive, -E for extended regular expressions, and -o separates the matches by line. Now that it is separated by line use wc with -l for line count. Since we had 3 conditions we check if it equals 3. Grep will return the lines where the match occurred, so if you are only working with a string, the example above will return the string for each condition, in this case 3, so there won't be any problems.
Note you can also create a grep chain and see if its empty.
if [ $(echo "This_BIG_SenTence" | grep -i this | grep -i big | grep -i sentence) ]; then echo matched; else echo not_matched; fi
Now I know what you mean. Try this:
#!/bin/bash
# add 4 non-matching examples
> snooze.foo_bar
> snooze.bar_go
> snooze.go_foo
> snooze.no_match
# add 3 matching examples
> snooze.foo_bar_go
> snooze.goXX_XXfoo_XXbarXX
> snooze.bar_go_foo_Ok
str1=("foo" "bar" "go")
for i in `ls snooze.*`; do
str2=${i#snooze.}
j=0
found=1
while [[ $j -lt ${#str1[@]} ]]; do
if ! echo $str2 | eval grep \${str1[$j]} >& /dev/null; then
found=0
break
fi
((j++))
done
if [[ $found -ne 0 ]]; then
echo Match found: $str2
fi
done
Resulting print of this script:
Match found: bar_go_foo_Ok
Match found: foo_bar_go
Match found: goXX_XXfoo_XXbarXX
alternatively, the if..grep line above can be replaced by
if [[ ! $str2 =~ `eval echo \${str1[$j]}` ]]; then
utilizing bash's regular expression match.
Note: I am not too careful about special characters in the search string, such as "\" or " " (space), which may cause problem.
--- Some explanations ---
In the if .. grep line, $j is first evaluated to the running index, from 0 to the number of elements in $str1 minus 1. Then, eval will re-evaluate the whole grep command again, causing ${str1[jjj]} to be re-evaluated (Here, jjj is the already evaluated index)
The strategy is to set found=1 (found by default), and then when any grep fails, we set found to 0 and break the inner j-loop.
Everything else should be straightforward.
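For comparison, the same found-by-default strategy can be written without eval or grep by indexing the array directly. This is only a sketch: the function name contains_all is invented, and it does plain case-sensitive substring matching, which sidesteps the special-character concern in the note above because the words are never interpreted as patterns.

```bash
#!/usr/bin/env bash
# contains_all STRING WORD...
# Succeeds only if every WORD occurs somewhere in STRING.
contains_all() {
    local haystack=$1 word
    shift
    for word in "$@"; do
        # Quoting "$word" makes it a literal substring, not a glob pattern.
        [[ $haystack == *"$word"* ]] || return 1
    done
    return 0
}

str1=( "foo" "bar" "go" )
for str2 in foo_bar goXX_XXfoo_XXbarXX bar_go_foo_Ok; do
    if contains_all "$str2" "${str1[@]}"; then
        echo "Match found: $str2"
    fi
done
```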

Store the output of find command in an array [duplicate]

This question already has answers here:
How can I store the "find" command results as an array in Bash
(8 answers)
Closed 4 years ago.
How do I put the result of find $1 into an array?
In for loop:
for /f "delims=/" %%G in ('find $1') do %%G | cut -d\/ -f6-
I want to cry.
In bash:
file_list=()
while IFS= read -d $'\0' -r file ; do
file_list=("${file_list[@]}" "$file")
done < <(find "$1" -print0)
echo "${file_list[@]}"
file_list is now an array containing the results of find "$1"
What's special about "field 6"? It's not clear what you were attempting to do with your cut command.
Do you want to cut each file after the 6th directory?
for file in "${file_list[@]}" ; do
echo "$file" | cut -d/ -f6-
done
But why "field 6"? Can I presume that you actually want to return just the last element of the path?
for file in "${file_list[@]}" ; do
echo "${file##*/}"
done
Or even
echo "${file_list[@]##*/}"
Which will give you the last path element for each path in the array. You could even do something with the result
for file in "${file_list[@]##*/}" ; do
echo "$file"
done
Explanation of the bash program elements:
(One should probably use the builtin readarray instead)
find "$1" -print0
Find stuff and 'print the full file name on the standard output, followed by a null character'. This is important as we will split that output by the null character later.
<(find "$1" -print0)
"Process Substitution" : The output of the find subprocess is read in via a FIFO (i.e. the output of the find subprocess behaves like a file here)
while ...
done < <(find "$1" -print0)
The output of the find subprocess is read by the while command via <
IFS= read -d $'\0' -r file
This is the while condition:
read
Read one line of input (from the find command). The return value of read is 0 unless EOF is encountered, at which point while exits.
-d $'\0'
...taking as delimiter the null character (see QUOTING in bash manpage). Which is done because we used the null character using -print0 earlier.
-r
backslash is not considered an escape character as it may be part of the filename
file
Result (first word actually, which is unique here) is put into variable file
IFS=
The command is run with IFS (the special variable which contains the characters on which read splits input into words) set to the empty string, because we don't want to split the input or trim whitespace from it.
And inside the loop:
file_list=("${file_list[@]}" "$file")
Inside the loop, the file_list array is just grown by $file, suitably quoted.
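As for the readarray remark above, a sketch of that variant follows. It assumes bash 4.4 or later, which is when readarray (also spelled mapfile) gained the -d option; the temporary directory exists only to make the example self-contained.

```bash
#!/usr/bin/env bash
# Throwaway directory with awkward names, including an embedded newline.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/with space.txt" "$demo/"$'new\nline.txt'

# -d '' splits the input on NUL bytes, -t drops the delimiter from each entry.
readarray -d '' -t file_list < <(find "$demo" -type f -print0)

echo "Number of entries: ${#file_list[@]}"
printf '%q\n' "${file_list[@]##*/}"   # last path element of each entry

rm -rf "$demo"
```

This replaces the whole while loop with one command while keeping the NUL-delimited handling that makes the loop safe.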
arrayname=( $(find $1) )
I don't understand your loop question? If you look how to work with that array then in bash you can loop through all array elements like this:
for element in $(seq 0 $((${#arrayname[@]} - 1)))
do
echo "${arrayname[$element]}"
done
This is probably not 100% foolproof, but it will probably work 99% of the time (I used the GNU utilities; the BSD utilities won't work without modifications; also, this was done using an ext4 filesystem):
declare -a BASH_ARRAY_VARIABLE=$(find <path> <other options> -print0 | sed -e 's/\x0$//' | awk -F'\0' 'BEGIN { printf "("; } { for (i = 1; i <= NF; i++) { printf "%c"gensub(/"/, "\\\\\"", "g", $i)"%c ", 34, 34; } } END { printf ")"; }')
Then you would iterate over it like so:
for FIND_PATH in "${BASH_ARRAY_VARIABLE[@]}"; do echo "$FIND_PATH"; done
Make sure to enclose $FIND_PATH inside double-quotes when working with the path.
Here's a simpler pipeless version, based on the version of user2618594
declare -a names=$(echo "("; find <path> <other options> -printf '"%p" '; echo ")")
for nm in "${names[@]}"
do
echo "$nm"
done
To loop through a find, you can simply use find:
for file in "`find "$1"`"; do
echo "$file" | cut -d/ -f6-
done
It was what I got from your question.

Is there a way to search an entire array inside of an argument?

Posted my code below, wondering if I can search one array for a match... or if theres a way I can search a unix file inside of an argument.
#!/bin/bash
# store words in file
cat $1 | ispell -l > file
# move words in file into array
array=($(< file))
# remove temp file
rm file
# move already checked words into array
checked=($(< .spelled))
# print out words & ask for corrections
for ((i=0; i<${#array[@]}; i++ ))
do
if [[ ! ${array[i]} = ${checked[@]} ]]; then
read -p "' ${array[i]} ' is mispelled. Press "Enter" to keep
this spelling, or type a correction here: " input
if [[ ! $input = "" ]]; then
correction[i]=$input
else
echo ${array[i]} >> .spelled
fi
fi
done
echo "MISPELLED: CORRECTIONS:"
for ((i=0; i<${#correction[@]}; i++ ))
do
echo ${array[i]} ${correction[i]}
done
otherwise, i would need to write a for loop to check each array indice, and then somehow make a decision statement whether to go through the loop and print/take input
The usual shell incantation to do this is:
cat "$1" | ispell -l | while read -r ln
do
# the loop's stdin is the pipe, so the user's answer is read from the terminal
read -p "$ln is misspelled. Enter correction: " corrected </dev/tty
if [ -n "$corrected" ] ; then
ln=$corrected
fi
echo "$ln"
done >correctedwords.txt
The while;do;done is kind of like a function and you can pipe data into and out of it.
P.S. I didn't test the above code so there may be syntax errors
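On the original question of testing whether a word is already in the checked array: bash has no single operator for that, so the usual approach is the loop the question hoped to avoid, wrapped in a small helper. This is a sketch with a made-up function name in_array and invented sample words.

```bash
#!/usr/bin/env bash
# in_array NEEDLE ELEMENT...
# Succeeds if NEEDLE is exactly equal to one of the remaining arguments.
in_array() {
    local needle=$1 element
    shift
    for element in "$@"; do
        [[ $element == "$needle" ]] && return 0
    done
    return 1
}

checked=( "teh" "recieve" )     # sample data standing in for .spelled
for word in teh wierd; do
    if in_array "$word" "${checked[@]}"; then
        echo "$word was already checked"
    else
        echo "$word still needs checking"
    fi
done
```

In the question's script, the condition `[[ ! ${array[i]} = ${checked[@]} ]]` could then become `! in_array "${array[i]}" "${checked[@]}"`.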

Resources