Array of all files in a directory, except one - arrays

Trying to figure out how to include all .txt files except one called manifest.txt.
FILES=(path/to/*.txt)

You can use extended glob patterns for this:
shopt -s extglob
files=(path/to/!(manifest).txt)
The !(pattern-list) pattern matches "anything except one of the given patterns".
Note that this excludes exactly manifest.txt and nothing else; mmanifest.txt, for example, would still go into the array.
As a side note: a glob that matches nothing at all expands to itself (see the Filename Expansion section of the manual). This behaviour can be changed with the nullglob (the unmatched pattern expands to nothing) and failglob (an error message is printed) shell options.
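As a quick illustration of that side note, assuming path/to contains no .md files at all:
files=(path/to/*.md)
echo "${#files[@]}"    # prints 1: the unmatched pattern is kept literally as the only element
shopt -s nullglob
files=(path/to/*.md)
echo "${#files[@]}"    # prints 0: with nullglob the unmatched pattern expands to nothing
shopt -u nullglob
With failglob set instead, the assignment itself would fail with an error message.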

You can build the array one file at a time, avoiding the file you do not want:
declare -a files=()
for file in /path/to/files/*
do
    ! [[ -e "$file" ]] || [[ "$file" = */manifest.txt ]] || files+=("$file")
done
Please note that globbing in the for statement does not cause problems with whitespace (even newlines) in filenames.
EDIT
I added a test for file existence to handle the case where the glob fails and the nullglob option is not set.
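To inspect the result, the array can be printed one entry per line; the quoting is what keeps names containing whitespace intact:
printf '%s\n' "${files[@]}"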

I think this is best handled with an associative array, even if there is only one element to exclude.
Consider:
$ touch f{1..6}.txt manifest.txt
$ ls *.txt
f1.txt f3.txt f5.txt manifest.txt
f2.txt f4.txt f6.txt
You can create an associative array for the names you wish to exclude:
declare -A exclude
for f in f1.txt f5.txt manifest.txt; do
    exclude[$f]=1
done
Then add files to an array that are not in the associative array:
files=()
for fn in *.txt; do
    [[ ${exclude[$fn]} ]] && continue
    files+=("$fn")
done
$ echo "${files[@]}"
f2.txt f3.txt f4.txt f6.txt
This approach allows any number of exclusions from the list of files.
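If this is needed in more than one place, the same idea can be wrapped in a small helper; the function name collect_txt_except below is made up for illustration:
collect_txt_except() {
    # Usage: collect_txt_except name1 [name2 ...]; fills the global array "files"
    local -A exclude=()
    local f
    for f in "$@"; do
        exclude[$f]=1
    done
    files=()
    for f in *.txt; do
        [[ ${exclude[$f]} ]] && continue
        files+=("$f")
    done
}

collect_txt_except manifest.txt f1.txt
printf '%s\n' "${files[@]}"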

FILES=($(ls /path/to/*.txt | grep -v '/manifest\.txt$'))
Note that ls prints the full path here, so the pattern has to match the trailing path component; this approach also still breaks on names containing whitespace.

Related

Creating an array from JSON data of file/directory locations; bash not registering if element is a directory

I have a JSON file which holds data like this: "path/to/git/directory/location": "path/to/local/location". A minimum example of the file might be this:
{
    "${HOME}/dotfiles/.bashrc": "${HOME}/.bashrc",
    "${HOME}/dotfiles/.atom/": "${HOME}/.atom/"
}
I have a script that systematically reads the above JSON (called locations.json) and creates an array, and then prints elements of the array that are directories. MWE:
#!/usr/bin/env bash
unset sysarray
declare -A sysarray
while IFS=: read -r field data
do
    sysarray["${field}"]="${data}"
done <<< $(sed '/^[{}]$/d;s/\s*"/"/g;s/,$//' locations.json)
for file in "${sysarray[@]}"
do
    if [ -d "${file}" ]
    then
        echo "${file}"
    fi
done
However, this does not print the directory (i.e., ${HOME}/.atom).
I don't understand why this is happening, because
I have tried creating an array manually (i.e., not from a JSON) and checking if its elements are directories, and that works fine.
I have tried echoing each element in the array into a temporary file and reading each line in the file to see if it was a product of how the information was stored in the array, but no luck.
I have tried adding | tr -d "[:blank:]" | tr -d '\"' after using sed on the JSON (to see if it was a product of unintended whitespace or quotes), but no luck.
I have tried simply running [ -d "${HOME}/.atom/" ] && echo '.atom is a directory', and that works (so indeed it is a directory). I'm unsure what might be causing this.
Help on this would be great!
You could use a tool such as jq to process JSON files properly; it will deal with any valid JSON.
#!/usr/bin/env bash
unset sysarray
declare -A sysarray
while IFS="=" read -r field data
do
    sysarray["${field}"]=$(eval echo "${data}")
done <<< $(jq -r 'keys[] as $k | "\($k)=\(.[$k])"' locations.json)
for file in "${sysarray[@]}"
do
    if [ -d "${file}" ]
    then
        echo "${file}"
    fi
done
Another problem is that, once the extra quote signs are properly processed, we are left with a literal ${HOME} that is not expanded. The only solution I came up with is using eval to force the expansion. It is not the nicest way, but right now I cannot find a better one.
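If eval feels too risky, one possible alternative, assuming envsubst (from GNU gettext) is installed, is to let it expand the ${HOME} references instead:
# Hypothetical replacement for the eval line above; only exported variables are substituted.
sysarray["${field}"]=$(printf '%s' "${data}" | envsubst)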

How do I store the output from a find command in an array? - bash

I have the following find command with the following output:
$ find -name '*.jpg'
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
./public_html/github/screencasts-gh-pages/introToBackbone/presentation/images/telescope.jpg
./public_html/github/StarCraft-master/img/Maps/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Snapshot.jpg
./public_html/github/StarCraft-master/img/Maps/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Original/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Bg/GameLose.jpg
./public_html/github/StarCraft-master/img/Bg/GameWin.jpg
./public_html/github/StarCraft-master/img/Bg/GameStart.jpg
./public_html/github/StarCraft-master/img/Bg/GamePlay.jpg
./public_html/github/StarCraft-master/img/Demo/Demo.jpg
./public_html/github/flot/examples/image/hs-2004-27-a-large-web.jpg
./public_html/github/minicourse-ajax-project/other/GameLose.jpg
How do I store this output in an array? I want it to handle filenames with spaces.
I have tried arrayname=($(find -name '*.jpg')) but this seems to store just the first element:
$ arrayname=($(find -name '*.jpg'))
$ echo "$arrayname"
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
$
I have tried the approach from a similar question, but again this just stores the 1st element.
Other similar Qs
How do I capture the output from the ls or find command to store all file names in an array?
How do i store the output of a bash command in a variable?
If you know with certainty that your filenames will not contain newlines, then
mapfile -t arrayname < <(find ...)
If you want to be able to handle any file
arrayname=()
while IFS= read -d '' -r filename; do
    arrayname+=("$filename")
done < <(find ... -print0)
echo "$arrayname" will only show the first element of the array. It is equivalent to echo "${arrayname[0]}". To dump an array:
printf "%s\n" "${arrayname[@]}"
# ............^^^^^^^^^^^^^^^^^ must use exactly this form, with the quotes.
arrayname=($(find ...)) is still wrong. It will store the file ./file with spaces.txt as 3 separate elements in the array.
If you have a sufficiently recent version of bash, you can save yourself a lot of trouble by just using a ** glob.
shopt -s globstar
files=(**/*.jpg)
The first line enables the feature. Once enabled, ** in a glob pattern will match any number (including 0) of directories in the path.
Using the glob in the array definition makes sure that whitespace is handled correctly.
To view an array in a form which could be used to define the array, use the -p (print) option to the declare builtin:
declare -p files
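For example, with a hypothetical directory tree, the result could look like this (output illustrative only):
$ shopt -s globstar
$ files=(**/*.jpg)
$ declare -p files
declare -a files=([0]="img/Bg/GameWin.jpg" [1]="img/Maps/Snapshot.jpg")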

Saving egrep output containing '*' to bash array

I would like to save egrep output to a bash array:
arr=( $(egrep -Rn 'regex') )
If there so happens to be a '*' in the egrep result, it appears like bash is expanding the '*' to be all files in current directory. And the expansion plus results of egrep are then saved into arr.
How do I fix this? I want the '*' in the grep results to be unaltered.
To use the idiom attempted in the question "correctly" might look something like this:
# DON'T DO THIS.
set -f # turn off globbing
IFS=$'\n' # word-split only on newlines
arr=( $(...) ) # populate array
unset IFS # restore IFS's default behaviour (assuming it was not customised beforehand)
set +f # turn globbing back on
Obviously, there's a lot of room to get this wrong and leave your shell in a state other than the way it started (What if your script had a different initial IFS value? What if this code is sourced from a script that wants globbing to be disabled to work correctly?). Don't do it.
One approach, compatible with bash 3.x, is to use read -a (reading into an array) with IFS (used to separate fields) containing a newline, and -d (used to separate records) set to a NUL:
IFS=$'\n' read -r -d '' -a arr < <(egrep -Rn 'regex' && printf '\0')
The trailing NUL added to the input is present to ensure that read exits successfully; otherwise, this could trigger an abrupt exit if using set -e.
A longer but more explicit approach is to do the iteration yourself:
arr=( )
while IFS= read -r; do
    arr+=( "$REPLY" )
done < <(egrep -Rn 'regex')
Another, using bash 4.x features (readarray, AKA mapfile):
readarray -t arr < <(egrep -Rn 'regex')
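Whichever variant populates arr, iterate over it with quotes so that entries containing spaces or glob characters stay intact:
for line in "${arr[@]}"; do
    printf 'match: %s\n' "$line"
done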

How can I store the "find" command results as an array in Bash

I am trying to save the result from find as arrays.
Here is my code:
#!/bin/bash
echo "input : "
read input
echo "searching file with this pattern '${input}' under present directory"
array=`find . -name ${input}`
len=${#array[*]}
echo "found : ${len}"
i=0
while [ $i -lt $len ]
do
    echo ${array[$i]}
    let i++
done
I get 2 .txt files under current directory.
So I expect '2' as result of ${len}. However, it prints 1.
The reason is that it takes the entire result of find as one element.
How can I fix this?
P.S
I found several solutions on Stack Overflow about a similar problem. However, they are a little bit different, so I can't apply them in my case. I need to store the results in a variable before the loop. Thanks again.
Update 2020 for Linux Users:
If you have an up-to-date version of bash (4.4-alpha or better), as you probably do if you are on Linux, then you should be using Benjamin W.'s answer.
If you are on Mac OS, which (last I checked) still used bash 3.2, or are otherwise using an older bash, then continue on to the next section.
Answer for bash 4.3 or earlier
Here is one solution for getting the output of find into a bash array:
array=()
while IFS= read -r -d $'\0'; do
    array+=("$REPLY")
done < <(find . -name "${input}" -print0)
This is tricky because, in general, file names can have spaces, new lines, and other script-hostile characters. The only way to use find and have the file names safely separated from each other is to use -print0 which prints the file names separated with a null character. This would not be much of an inconvenience if bash's readarray/mapfile functions supported null-separated strings but they don't. Bash's read does and that leads us to the loop above.
[This answer was originally written in 2014. If you have a recent version of bash, please see the update below.]
How it works
The first line creates an empty array: array=()
Every time that the read statement is executed, a null-separated file name is read from standard input. The -r option tells read to leave backslash characters alone. The -d $'\0' tells read that the input will be null-separated. Since we omit the name to read, the shell puts the input into the default name: REPLY.
The array+=("$REPLY") statement appends the new file name to the array array.
The final line combines redirection and command substitution to provide the output of find to the standard input of the while loop.
Why use process substitution?
If we didn't use process substitution, the loop could be written as:
array=()
find . -name "${input}" -print0 >tmpfile
while IFS= read -r -d $'\0'; do
    array+=("$REPLY")
done <tmpfile
rm -f tmpfile
In the above the output of find is stored in a temporary file and that file is used as standard input to the while loop. The idea of process substitution is to make such temporary files unnecessary. So, instead of having the while loop get its stdin from tmpfile, we can have it get its stdin from <(find . -name ${input} -print0).
Process substitution is widely useful. In many places where a command wants to read from a file, you can specify process substitution, <(...), instead of a file name. There is an analogous form, >(...), that can be used in place of a file name where the command wants to write to the file.
Like arrays, process substitution is a feature of bash and other advanced shells. It is not part of the POSIX standard.
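Two generic illustrations of the idea, unrelated to find (a.txt, b.txt, and some_command are placeholders):
diff <(sort a.txt) <(sort b.txt)                      # compare two command outputs as if they were files
some_command | tee >(gzip > out.gz) >/dev/null        # >(...) stands in for an output file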
Alternative: lastpipe
If desired, lastpipe can be used instead of process substitution (hat tip: Caesar):
set +m
shopt -s lastpipe
array=()
find . -name "${input}" -print0 | while IFS= read -r -d $'\0'; do array+=("$REPLY"); done; declare -p array
shopt -s lastpipe tells bash to run the last command in the pipeline in the current shell (not the background). This way, the array remains in existence after the pipeline completes. Because lastpipe only takes effect if job control is turned off, we run set +m. (In a script, as opposed to the command line, job control is off by default.)
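A minimal sketch of the subshell problem that lastpipe works around (run as a script, where job control is already off):
arr=()
printf 'a\nb\n' | while read -r line; do arr+=("$line"); done
echo "${#arr[@]}"    # prints 0: the loop ran in a subshell, so arr was lost

shopt -s lastpipe
arr=()
printf 'a\nb\n' | while read -r line; do arr+=("$line"); done
echo "${#arr[@]}"    # prints 2: the loop now runs in the current shell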
Additional notes
The following command creates a shell variable, not a shell array:
array=`find . -name "${input}"`
If you wanted to create an array, you would need to put parens around the output of find. So, naively, one could:
array=(`find . -name "${input}"`) # don't do this
The problem is that the shell performs word splitting on the results of find so that the elements of the array are not guaranteed to be what you want.
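To see the problem, consider a hypothetical file name containing spaces in an otherwise empty directory:
touch './file with spaces.jpg'
array=(`find . -name '*.jpg'`)   # word splitting breaks the name apart
declare -p array                 # shows "./file", "with", "spaces.jpg" as three separate elements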
Update 2019
Starting with version 4.4-alpha, bash now supports a -d option so that the above loop is no longer necessary. Instead, one can use:
mapfile -d $'\0' array < <(find . -name "${input}" -print0)
For more information on this, please see (and upvote) Benjamin W.'s answer.
Bash 4.4 introduced a -d option to readarray/mapfile, so this can now be solved with
readarray -d '' array < <(find . -name "$input" -print0)
for a method that works with arbitrary filenames including blanks, newlines, and globbing characters. This requires that your find supports -print0, as for example GNU find does.
From the manual (omitting other options):
mapfile [-d delim] [array]
-d
The first character of delim is used to terminate each input line, rather than newline. If delim is the empty string, mapfile will terminate a line when it reads a NUL character.
And readarray is just a synonym of mapfile.
The following appears to work for both Bash and Z Shell on macOS.
#! /bin/sh
IFS=$'\n'
paths=($(find . -name "foo"))
unset IFS
printf "%s\n" "${paths[@]}"
If you are using bash 4 or later, you can replace your use of find with
shopt -s globstar nullglob
array=( **/*"$input"* )
The ** pattern enabled by globstar matches 0 or more directories, allowing the pattern to match to an arbitrary depth in the current directory. Without the nullglob option, the pattern (after parameter expansion) is treated literally, so with no matches you would have an array with a single string rather than an empty array.
Add the dotglob option to the first line as well if you want to traverse hidden directories (like .ssh) and match hidden files (like .bashrc) as well.
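For instance, a variant that also picks up hidden files and directories:
shopt -s globstar nullglob dotglob
array=( **/*"$input"* )
declare -p array    # entries under hidden directories such as .config now appear as well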
You can try something like:
array=(`find . -type f | sort -r | head -2`)
and to print the array values, you can try something like echo "${array[*]}".
None of these solutions suited me because I didn't feel like learning readarray and mapfile. Here is what I came up with.
#!/bin/bash
echo "input : "
read input
echo "searching file with this pattern '${input}' under present directory"
# The only change is here. Append to array for each non-empty line.
array=()
while read line; do
    [[ ! -z "$line" ]] && array+=("$line")
done <<< "$(find . -name "${input}" -print)"
len=${#array[@]}
echo "found : ${len}"
i=0
while [ $i -lt $len ]
do
    echo ${array[$i]}
    let i++
done
You could do it like this:
#!/bin/bash
echo "input : "
read input
echo "searching file with this pattern '${input}' under present directory"
array=(`find . -name '*'${input}'*'`)
for i in "${array[@]}"
do :
    echo $i
done
In bash, $(<any_shell_cmd>) runs a command and captures its output. Combining this with read, an IFS set to a newline (note the $'\n' ANSI-C quoting; a plain '\n' would be the two characters backslash and n), and an empty -d delimiter so that read does not stop at the first line, converts the output into an array:
IFS=$'\n' read -r -d '' -a txt_files <<< "$(find /path/to/dir -name "*.txt")"

Bash script - how to fill array?

Let's say I have this directory structure:
DIRECTORY:
.........a
.........b
.........c
.........d
What I want to do is: I want to store the elements of a directory in an array,
something like: array = ls /home/user/DIRECTORY
so that array[0] contains the name of the first file (that is, 'a'),
array[1] == 'b', etc.
Thanks for the help.
You can't simply do array = ls /home/user/DIRECTORY, because - even with proper syntax - it wouldn't give you an array, but a string that you would have to parse, and Parsing ls is punishable by law. You can, however, use built-in Bash constructs to achieve what you want:
#!/usr/bin/env bash
readonly YOUR_DIR="/home/daniel"
if [[ ! -d $YOUR_DIR ]]; then
    echo >&2 "$YOUR_DIR does not exist or is not a directory"
    exit 1
fi
OLD_PWD=$PWD
cd "$YOUR_DIR"
i=0
for file in *
do
    if [[ -f $file ]]; then
        array[$i]=$file
        i=$(($i+1))
    fi
done
cd "$OLD_PWD"
exit 0
This small script saves the names of all the regular files (which means no directories, links, sockets, and such) that can be found in $YOUR_DIR to the array called array.
Hope this helps.
Option 1, a manual loop:
dirtolist=/home/user/DIRECTORY
shopt -s nullglob # In case there aren't any files
contentsarray=()
for filepath in "$dirtolist"/*; do
    contentsarray+=("$(basename "$filepath")")
done
shopt -u nullglob # Optional, restore default behavior for unmatched file globs
Option 2, using bash array trickery:
dirtolist=/home/user/DIRECTORY
shopt -s nullglob
contentspaths=("$dirtolist"/*) # This makes an array of paths to the files
contentsarray=("${contentspaths[@]##*/}") # This strips off the path portions, leaving just the filenames
shopt -u nullglob # Optional, restore default behavior for unmatched file globs
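The ##*/ expansion in the last step removes everything up to the final slash in each element; a small standalone illustration (paths here are made up):
paths=( "/home/user/DIRECTORY/a" "/home/user/DIRECTORY/b" )
names=( "${paths[@]##*/}" )
printf '%s\n' "${names[@]}"    # prints a and b, one per line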
array=($(ls /home/user/DIRECTORY))
Then
echo ${array[0]}
will equal to the first file in that directory.
