Bash - Removing files using Arrays

I have one flat directory. I have 2 arrays. Array1 stores the contents of the directory (all .PNG files). Array2 has six files. These six files are the same as the six files within Array1. How do I use Array2 to remove the 6 files in the directory? Two arrays are as follows:
array1= (`ls ${files}*.PNG`)
array2= $(find . ! -name 'PHOTO*')
Tried using a for loop but not sure how to proceed:
for files in $array2;do
rm -f files $array1

No space is allowed after the = in an assignment.
Don't parse ls. Your code will not work for files whose names contain whitespace.
array1=( "${files}"*.PNG )
Your array2 isn't an array; it's a string consisting of a sequence of file names separated by whitespace.
array2=( $(find . ! -name 'PHOTO*') )
Also, using find in a command substitution like this can fail for the same reason as parsing ls: file names that contain whitespace are split into several words. Use an extended pattern instead (activated by running shopt -s extglob):
array2=( !(PHOTO*) )
To iterate over the files in an array, you first need to expand the array into a sequence of words, one element per word:
for files in "${array2[@]}"; do
rm -f "$files"
done

To run rm only once... (assuming no spaces in filenames)
unset rm_tmpfiles
if [ ${#tmpfiles[@]} -gt 0 ]; then
for tmpfile in "${tmpfiles[@]}"; do
rm_tmpfiles+="$tmpfile "
done
rm -f $rm_tmpfiles
fi
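The no-spaces restriction above comes from joining the names into one string. A minimal sketch of an alternative, assuming the files are already in an array: expanding the quoted array passes every element to a single rm call as its own word. The scratch directory and file names below are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)
touch "$workdir/plain.tmp" "$workdir/name with spaces.tmp" "$workdir/keep.txt"

# Collect the files to delete; the glob yields one array element per file.
tmpfiles=( "$workdir"/*.tmp )

# One rm call for all elements; the quotes keep each name as a single word,
# and -- stops option parsing in case a name starts with a dash.
if [ "${#tmpfiles[@]}" -gt 0 ]; then
    rm -f -- "${tmpfiles[@]}"
fi

# List what is left in the scratch directory, then clean up.
remaining=( "$workdir"/* )
printf '%s\n' "${remaining[@]##*/}"
rm -rf "$workdir"
```

Only keep.txt should be left, since both .tmp files are removed in the same rm invocation.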

Related

Populate an array with list of directories existing in a given path in Bash

I have a directory path where there are multiple files and directories.
I want to use a basic bash script to create an array containing only the directories.
Suppose I have a directory path:
/my/directory/path/
$ls /my/directory/path/
a.txt dirX b.txt dirY dirZ
Now I want to populate array named arr[] with only directories, i.e. dirX, dirY and dirZ.
I found one post, but it's not that relevant to my requirement.
Any help will be appreciated!
Try this:
#!/bin/bash
arr=(/my/directory/path/*/) # This creates an array of the full paths to all subdirs
arr=("${arr[@]%/}") # This removes the trailing slash on each item
arr=("${arr[@]##*/}") # This removes the path prefix, leaving just the dir names
Unlike the ls-based answer, this will not get confused by directory names that contain spaces, wildcards, etc.
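To see the glob-based approach cope with an awkward name, here is a small sketch that runs against a throwaway directory. The directory layout (including "dir Y") is invented for the demonstration, and nullglob is added as an assumption so that a path without subdirectories gives an empty array.

```bash
#!/usr/bin/env bash
base=$(mktemp -d)          # hypothetical stand-in for /my/directory/path
mkdir "$base/dirX" "$base/dir Y" "$base/dirZ"
touch "$base/a.txt" "$base/b.txt"

shopt -s nullglob          # with no subdirectories the array stays empty
arr=( "$base"/*/ )         # trailing slash: only directories match
arr=( "${arr[@]%/}" )      # drop the trailing slash from each element
arr=( "${arr[@]##*/}" )    # drop the leading path from each element
shopt -u nullglob

# Print each element in brackets so embedded spaces are visible.
printf '<%s>\n' "${arr[@]}"
rm -rf "$base"
```

The array should hold three elements, with "dir Y" kept as a single element; the order depends on the locale's collation.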
Try:
shopt -s nullglob # Globs that match nothing expand to nothing
shopt -s dotglob # Expanded globs include names that start with '.'
arr=()
for dir in /my/directory/path/*/ ; do
dir2=${dir%/} # Remove the trailing /
dir3=${dir2##*/} # Remove everything up to, and including, the last /
arr+=( "$dir3" )
done
Try:
baseDir="/my/directory/path/"
readarray -d '' arr < <(find "${baseDir}" -mindepth 1 -maxdepth 1 -type d -print0)
Here the find command outputs all directories within the baseDir, then the readarray command puts these into an array named arr.
You can then work over the array with:
for directory in "${arr[@]}"; do
echo "${directory}"
done
Note: This only works with bash version 4.4-alpha and above. (See this answer for more.)
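For scripts that may also run on an older bash, one possible sketch is to check BASH_VERSINFO and fall back to a read loop. The temporary directory and its contents are assumptions for the demonstration; -mindepth and -maxdepth are used as in the command above.

```bash
#!/usr/bin/env bash
baseDir=$(mktemp -d)   # hypothetical stand-in for /my/directory/path/
mkdir "$baseDir/dirX" "$baseDir/dir Y"
touch "$baseDir/a.txt"

arr=()
if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 4) )); then
    # bash 4.4+: readarray can split its input on NUL bytes.
    readarray -d '' arr < <(find "$baseDir" -mindepth 1 -maxdepth 1 -type d -print0)
else
    # Older bash: read one NUL-terminated path per iteration.
    while IFS= read -r -d '' directory; do
        arr+=( "$directory" )
    done < <(find "$baseDir" -mindepth 1 -maxdepth 1 -type d -print0)
fi

echo "directories found: ${#arr[@]}"
rm -rf "$baseDir"
```

Either branch should report two directories and ignore a.txt.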

Array of all files in a directory, except one

Trying to figure out how to include all .txt files except one called manifest.txt.
FILES=(path/to/*.txt)
You can use extended glob patterns for this:
shopt -s extglob
files=(path/to/!(manifest).txt)
The !(pattern-list) pattern matches "anything except one of the given patterns".
Note that this exactly excludes manifest.txt and nothing else; mmanifest.txt, for example, would still go in to the array.
As a side note: a glob that matches nothing at all expands to itself (see the manual and this question). This behaviour can be changed using the nullglob (expand to empty string) and failglob (print error message) shell options.
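A short sketch of both points, run in a scratch directory with invented file names: the extended glob skips only manifest.txt, and with nullglob set an unmatched pattern leaves the array empty instead of holding the literal pattern. Note that shopt -s extglob has to run on a line before the one that uses !(...).

```bash
#!/usr/bin/env bash
dir=$(mktemp -d)   # hypothetical scratch directory
touch "$dir/a.txt" "$dir/manifest.txt" "$dir/mmanifest.txt"

shopt -s extglob nullglob
files=( "$dir"/!(manifest).txt )   # every .txt file except manifest.txt
none=( "$dir"/*.csv )              # no match: empty array because of nullglob
shopt -u nullglob

printf '%s\n' "${files[@]##*/}"
echo "unmatched count: ${#none[@]}"
rm -rf "$dir"
```

Here files should contain a.txt and mmanifest.txt, and none should have zero elements.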
You can build the array one file at a time, avoiding the file you do not want :
declare -a files=()
for file in /path/to/files/*
do
! [[ -e "$file" ]] || [[ "$file" = */manifest.txt ]] || files+=("$file")
done
Please note that globbing in the for statement does not cause problems with whitespace (even newlines) in filenames.
EDIT
I added a test for file existence to handle the case where the glob fails and the nullglob option is not set.
I think this is best handled with an associative array even if just one element.
Consider:
$ touch f{1..6}.txt manifest.txt
$ ls *.txt
f1.txt f3.txt f5.txt manifest.txt
f2.txt f4.txt f6.txt
You can create an associative array for the names you wish to exclude:
declare -A exclude
for f in f1.txt f5.txt manifest.txt; do
exclude[$f]=1
done
Then add files to an array that are not in the associative array:
files=()
for fn in *.txt; do
[[ ${exclude[$fn]} ]] && continue
files+=("$fn")
done
$ echo "${files[@]}"
f2.txt f3.txt f4.txt f6.txt
This approach allows any number of exclusions from the list of files.
FILES=($(ls /path/to/*.txt | grep -v '/manifest\.txt$'))

Read filenames with embedded whitespace into an array in a shell script

Basically, I'm searching with the find command for a file with a multi-word name that is present in many directories, and the output is stored in a variable vari:
vari = `find -name "multi word file.xml"`
When I try to delete the files using a for loop to iterate through them,
for file in ${vari[@]}
the execution fails with:
rm: cannot remove `/abc/xyz/multi':: No such file or directory
Could you guys please help me with this scenario??
If you really need to capture all file paths in an array up front (assumes bash, primarily due to use of arrays and process substitution (<(...))[1]; a POSIX-compliant solution would be more cumbersome[2]; also note that this is a line-based solution, so it won't handle filenames with embedded newlines correctly, but that's very rare in practice):
# Read matches into array `vari` - safely: no word splitting, no
# globbing. The only caveat is that filenames with *embedded* newlines
# won't be handled correctly, but that's rarely a concern.
# bash 4+:
readarray -t vari < <(find . -name "multi word file.xml")
# bash 3:
IFS=$'\n' read -r -d '' -a vari < <(find . -name "multi word file.xml")
# Invoke `rm` with all array elements:
rm "${vari[@]}" # !! The double quotes are crucial.
Otherwise, let find perform the deletion directly (these solutions also handle filenames with embedded newlines correctly):
find . -name "multi word file.xml" -delete
# If your `find` implementation doesn't support `-delete`:
find . -name "multi word file.xml" -exec rm {} +
As for what you tried:
vari=`find -name "multi word file.xml"` (I've removed the spaces around =, which would result in a syntax error) does not create an array; such a command substitution returns the stdout output from the enclosed command as a single string (with trailing newlines stripped).
By enclosing the command substitution in ( ... ), you could create an array:
vari=( `find -name "multi word file.xml"` ),
but that would perform word splitting on the find's output and not properly preserve filenames with spaces.
While this could be addressed with IFS=$'\n' so as to only split at line boundaries, the resulting tokens are still subject to pathname expansion (globbing), which can inadvertently alter the file paths.
While this could also be addressed with a shell option, you now have 2 settings you need to perform ahead of time and restore to their original value; thus, using readarray or read as demonstrated above is the simpler choice.
Even if you did manage to collect the file paths correctly in $vari as an array, referencing that array as ${vari[@]} - without double quotes - would break, because the resulting strings are again subject to word splitting, and also pathname expansion (globbing).
To safely expand an array to its elements without any interpretation of its elements, double-quote it: "${vari[@]}"
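The effect of the quotes can be made visible by counting the words a command receives. This is only an illustrative sketch: count_args is a hypothetical helper, the two paths are made up, and the default IFS is assumed.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many words it received.
count_args() { echo "$#"; }

vari=( "./abc/multi word file.xml" "./xyz/multi word file.xml" )

unquoted=$(count_args ${vari[@]})    # word splitting: each path breaks into 3 words
quoted=$(count_args "${vari[@]}")    # one word per array element

echo "unquoted: $unquoted, quoted: $quoted"   # prints "unquoted: 6, quoted: 2"
```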
[1]
Process substitution rather than a pipeline is used so as to ensure that readarray / read is executed in the current shell rather than in a subshell.
As eckes points out in a comment, if you were to try find ... | IFS=$'\n' read ... instead, read would run in a subshell, which means that the variables it creates will disappear (go out of scope) when the command returns and cannot be used later.
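The difference can be sketched with two loops that fill an array, one fed by a pipeline and one by process substitution. The array names and input lines are invented, and bash's default settings are assumed (the lastpipe option, when it takes effect, changes the pipeline case).

```bash
#!/usr/bin/env bash
lines_pipe=()
lines_procsub=()

# Pipeline: the loop runs in a subshell, so its additions are lost afterwards.
printf '%s\n' "first file" "second file" | while IFS= read -r line; do
    lines_pipe+=( "$line" )
done

# Process substitution: the loop runs in the current shell.
while IFS= read -r line; do
    lines_procsub+=( "$line" )
done < <(printf '%s\n' "first file" "second file")

echo "pipeline: ${#lines_pipe[@]}, process substitution: ${#lines_procsub[@]}"
```

With default settings the pipeline variant leaves its array empty, while the process substitution variant ends up with two elements.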
[2]
The POSIX shell spec. supports neither arrays nor process substitution (nor readarray, nor any read options other than -r); you'd have to implement line-by-line processing as follows:
while IFS='
' read -r vari; do
printf '%s\n' "$vari" # process each path here
done <<EOF
$(find . -name "multi word file.xml")
EOF
Note the required actual newline between IFS=' and ' in order to assign a newline, given that the $'\n' syntax is not available.
Here are a few approaches:
# change the input field separator to a newline to ignore spaces
IFS=$'\n'
for file in $(find . -name '* *.xml'); do
ls "$file"
done
# pipe find result lines to a while loop
IFS=
find . -name '* *.xml' | while read -r file; do
ls "$file"
done
# feed the while loop with process substitution
IFS=
while read -r file; do
ls "$file"
done < <(find . -name '* *.xml')
When you're satisfied with the results, replace ls with rm.
These are all line-based solutions. The test environment at the bottom is a case that they do not handle.
As already written, the file could be removed with this tested command:
$ find . -name "multi word file".xml -exec rm {} +
I did not manage to use rm command with a variable when the path or filename contains \n.
Test environment:
$ mkdir "$(printf "\1\2\3\4\5\6\7\10\11\12\13\14\15\16\17\20\21\22\23\24\25\26\27\30\31\32\33\34\35\36\37\40\41\42\43\44\45\46\47testdir" "")"
$ touch "multi word file".xml
$ mv *xml *testdir/
$ touch "2nd multi word file".xml ; mv *xml *testdir
$ ls -b
\001\002\003\004\005\006\a\b\t\n\v\f\r\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\ !"#$%&'testdir
$ ls -b *testdir
2nd\ multi\ word\ file.xml multi\ word\ file.xml
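One approach that appears to cope with a newline in the path is to have find emit NUL-terminated names and read them into an array with read -d ''. The sketch below uses a simplified, made-up directory name (one embedded newline rather than the full control-character set above) and assumes a find that supports -print0.

```bash
#!/usr/bin/env bash
top=$(mktemp -d)                         # hypothetical scratch area
mkdir "$top/line one"$'\n'"line two"     # directory name with an embedded newline
touch "$top/line one"$'\n'"line two/multi word file.xml"

vari=()
while IFS= read -r -d '' path; do        # NUL is the one byte a path cannot contain
    vari+=( "$path" )
done < <(find "$top" -name 'multi word file.xml' -print0)

found=${#vari[@]}
rm -- "${vari[@]}"                       # quoted expansion: one argument per path
left=$(find "$top" -name '*.xml' | wc -l)
echo "found: $found, left after rm: $((left))"
rm -rf "$top"
```

The file is found once and no .xml file remains after rm, which suggests that the path survives the round trip through the array.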

How do I store the output from a find command in an array? + bash

I have the following find command with the following output:
$ find -name '*.jpg'
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
./public_html/github/screencasts-gh-pages/introToBackbone/presentation/images/telescope.jpg
./public_html/github/StarCraft-master/img/Maps/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Snapshot.jpg
./public_html/github/StarCraft-master/img/Maps/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Original/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Bg/GameLose.jpg
./public_html/github/StarCraft-master/img/Bg/GameWin.jpg
./public_html/github/StarCraft-master/img/Bg/GameStart.jpg
./public_html/github/StarCraft-master/img/Bg/GamePlay.jpg
./public_html/github/StarCraft-master/img/Demo/Demo.jpg
./public_html/github/flot/examples/image/hs-2004-27-a-large-web.jpg
./public_html/github/minicourse-ajax-project/other/GameLose.jpg
How do I store this output in an array? I want it to handle filenames with spaces
I have tried arrayname=($(find -name '*.jpg')), but this seems to store just the first element:
$ arrayname=($(find -name '*.jpg'))
$ echo "$arrayname"
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
$
I have tried here but again this just stores the 1st element
Other similar Qs
How do I capture the output from the ls or find command to store all file names in an array?
How do i store the output of a bash command in a variable?
If you know with certainty that your filenames will not contain newlines, then
mapfile -t arrayname < <(find ...)
If you want to be able to handle any file name:
arrayname=()
while IFS= read -d '' -r filename; do
arrayname+=("$filename")
done < <(find ... -print0)
echo "$arrayname" will only show the first element of the array. It is equivalent to echo "${arrayname[0]}". To dump an array:
printf "%s\n" "${arrayname[@]}"
# ............^^^^^^^^^^^^^^^^^ must use exactly this form, with the quotes.
arrayname=($(find ...)) is still wrong. It will store the file ./file with spaces.txt as 3 separate elements in the array.
If you have a sufficiently recent version of bash, you can save yourself a lot of trouble by just using a ** glob.
shopt -s globstar
files=(**/*.jpg)
The first line enables the feature. Once enabled, ** in a glob pattern will match any number (including 0) of directories in the path.
Using the glob in the array definition makes sure that whitespace is handled correctly.
To view an array in a form which could be used to define the array, use the -p (print) option to the declare builtin:
declare -p files
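A small sketch that puts the two pieces together on a throwaway tree; the file names are invented, and bash 4.0 or later is assumed for globstar (the stock bash on some systems, such as macOS, is older).

```bash
#!/usr/bin/env bash
root=$(mktemp -d)   # hypothetical scratch tree
mkdir -p "$root/img/Maps"
touch "$root/top.jpg" "$root/img/Maps/(6)Thin Ice.jpg"

shopt -s globstar nullglob       # globstar needs bash 4.0 or later
files=( "$root"/**/*.jpg )       # ** also matches zero directories, so top.jpg is included
shopt -u globstar nullglob

declare -p files                 # shows the array in reusable form, one quoted element per file
rm -rf "$root"
```

The declare -p output makes it easy to confirm that the name containing a space is a single element.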

Bash script - how to fill array?

Let's say I have this directory structure:
DIRECTORY:
.........a
.........b
.........c
.........d
What I want to do is: I want to store elements of a directory in an array
something like : array = ls /home/user/DIRECTORY
so that array[0] contains name of first file (that is 'a')
array[1] == 'b' etc.
Thanks for help
You can't simply do array = ls /home/user/DIRECTORY, because - even with proper syntax - it wouldn't give you an array, but a string that you would have to parse, and Parsing ls is punishable by law. You can, however, use built-in Bash constructs to achieve what you want :
#!/usr/bin/env bash
readonly YOUR_DIR="/home/daniel"
if [[ ! -d $YOUR_DIR ]]; then
echo >&2 "$YOUR_DIR does not exist or is not a directory"
exit 1
fi
OLD_PWD=$PWD
cd "$YOUR_DIR"
i=0
for file in *
do
if [[ -f $file ]]; then
array[$i]=$file
i=$(($i+1))
fi
done
cd "$OLD_PWD"
exit 0
This small script saves the names of all the regular files (which means no directories, sockets, and such; note that -f also accepts symlinks that point to regular files) that can be found in $YOUR_DIR to the array called array.
Hope this helps.
Option 1, a manual loop:
dirtolist=/home/user/DIRECTORY
shopt -s nullglob # In case there aren't any files
contentsarray=()
for filepath in "$dirtolist"/*; do
contentsarray+=("$(basename "$filepath")")
done
shopt -u nullglob # Optional, restore default behavior for unmatched file globs
Option 2, using bash array trickery:
dirtolist=/home/user/DIRECTORY
shopt -s nullglob
contentspaths=("$dirtolist"/*) # This makes an array of paths to the files
contentsarray=("${contentspaths[@]##*/}") # This strips off the path portions, leaving just the filenames
shopt -u nullglob # Optional, restore default behavior for unmatched file globs
array=($(ls /home/user/DIRECTORY))
Then
echo ${array[0]}
will print the name of the first file in that directory.
