Splitting strings in nested loops and iterating through results bash [duplicate] - arrays

I am trying to list subfolders and save them as a list in a variable:
DIRS=($(find . -maxdepth 1 -mindepth 1 -type d ))
After that I want to do some work in each subfolder, e.g.
for item in $DIRS
do
echo $item
done
But
echo $DIRS
prints only the first item (subfolder). Could someone point out the error or propose another solution?

The following creates a bash array but lists only one of three subdirectories:
$ dirs=($(find . -maxdepth 1 -mindepth 1 -type d ))
$ echo $dirs
./subdir2
To see all the directories, you must use the [@] or [*] subscript form:
$ echo "${dirs[@]}"
./subdir2 ./subdir1 ./subdir3
Or, using it in a loop:
$ for item in "${dirs[@]}"; do echo "$item"; done
./subdir2
./subdir1
./subdir3
Avoiding problems from word splitting
Note that, in the code above, the shell performs word splitting before the array is created. Thus, this approach will fail if any subdirectories have whitespace in their names.
The following will successfully create an array of subdirectory names even if the names have spaces, tabs or newlines in them:
dirs=(*/)
If you need to use find and you want it to be safe for difficult file names, then use find's -exec option.
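As a rough sketch of the options just described, the following compares a glob, a NUL-delimited find loop, and find's -exec. The temporary directory and the names subdir1 and "sub dir2" are made up for illustration, and -print0, -mindepth and -maxdepth are assumed to be available (they are in GNU and BSD find).

```bash
#!/usr/bin/env bash
# Work in a throwaway directory so the example does not touch real files.
tmp=$(mktemp -d)
mkdir "$tmp/subdir1" "$tmp/sub dir2"

# Option 1: a glob. The trailing slash restricts matches to directories
# (including symlinks to directories).
shopt -s nullglob                 # no matches gives an empty array
glob_dirs=("$tmp"/*/)
shopt -u nullglob

# Option 2: find with NUL-terminated output, read one name at a time.
find_dirs=()
while IFS= read -r -d '' d; do
    find_dirs+=("$d")
done < <(find "$tmp" -mindepth 1 -maxdepth 1 -type d -print0)

# Option 3: let find run the command itself, with no array at all.
find "$tmp" -mindepth 1 -maxdepth 1 -type d -exec printf 'found: %s\n' {} \;

printf '%d via glob, %d via find\n' "${#glob_dirs[@]}" "${#find_dirs[@]}"
rm -rf "$tmp"
```

Both arrays end up with one element per directory, including the one whose name contains a space.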
Documentation
The form $dirs returns just the first element of the array. This is documented in man bash:
Referencing an array variable without a subscript is equivalent to referencing the array with a subscript of 0.
The use of [@] and [*] is also documented in man bash:
Any element of an array may be referenced using ${name[subscript]}. The braces are required to avoid conflicts with pathname expansion. If subscript is @ or *, the word expands to all members of name. These subscripts differ only when the word appears within double quotes. If the word is double-quoted, ${name[*]} expands to a single word with the value of each array member separated by the first character of the IFS special variable, and ${name[@]} expands each element of name to a separate word.
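The difference between these forms is easier to see with a small experiment. This is only a sketch with made-up sample data; one element deliberately contains a space.

```bash
#!/usr/bin/env bash
# Hypothetical sample data; "./sub dir2" contains a space on purpose.
dirs=("./subdir1" "./sub dir2" "./subdir3")

first=$dirs                       # no subscript: same as ${dirs[0]}
count=${#dirs[@]}                 # number of elements

# "${dirs[*]}" joins all elements into ONE word, using the first character of IFS.
joined=$(IFS=,; printf '%s' "${dirs[*]}")

# "${dirs[@]}" expands to one word PER element, so the space is preserved.
at_words=0
for item in "${dirs[@]}"; do
    at_words=$((at_words + 1))
done

# Unquoted, the elements are word-split again, so "./sub dir2" becomes two words.
unquoted_words=0
for item in ${dirs[@]}; do
    unquoted_words=$((unquoted_words + 1))
done

printf 'first=%s count=%d joined=%s quoted=%d unquoted=%d\n' \
    "$first" "$count" "$joined" "$at_words" "$unquoted_words"
```

The quoted loop sees three items while the unquoted loop sees four, which is why "${dirs[@]}" is the form to use when iterating.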

Related

bash store output of command in array

I'm trying to check whether the output of the following command is stored as just one element in the array array_a:
array_a=$(find /path/dir1 -maxdepth 1 -name file_orders?.csv)
echo $array_a
/path/dir1/file_orders1.csv /path/dir1/file_orders2.csv
echo ${#array_a[@]}
1
So it tells me there's just one element, but obviously there are 2.
If I type echo ${array_a[0]} it doesn't return anything. It's as if the variable array_a isn't an array at all. How can I force it to store the elements in an array?
You are lacking the parentheses which define an array. But the fundamental problem is that the unquoted output of a command substitution is split on whitespace, so if any matching file name contains a space, it will produce more than one element in the resulting array.
With -maxdepth 1 anyway, just use the shell's globbing facilities instead; you don't need find at all.
array_a=(/path/dir1/file_orders?.csv)
Also pay attention to quotes when using the array.
echo "${array_a[@]}"
Without the quotes, the whitespace splitting will happen again.
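To illustrate why the count in the question was 1, here is a small sketch that contrasts a plain string assignment with a glob-based array. The temporary directory and file names are invented for the example.

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d)
touch "$tmp/file_orders1.csv" "$tmp/file_orders2.csv"

# Without parentheses this is a plain string: both paths, separated by a newline.
scalar=$(find "$tmp" -maxdepth 1 -name 'file_orders?.csv')

# With parentheses and a glob: one array element per matching file.
array_a=("$tmp"/file_orders?.csv)

scalar_count=${#scalar[@]}    # bash treats a scalar like a one-element array
array_count=${#array_a[@]}

printf 'scalar: %d element(s), array: %d element(s)\n' "$scalar_count" "$array_count"
rm -rf "$tmp"
```

Note that the -name pattern is quoted here so that the shell passes it to find unexpanded.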

Bash: Iterate over variable names

I'm writing a bash script to analyse some files. In a first iteration I create associative arrays with word counts for each category that is analysed. These categories are not known in advance so the names of these associative arrays are variable but all with the same prefix count_$category. The associative arrays have a word as key and its occurrence count in that category as value.
After all files are analysed, I have to summarise the results for each category. I can iterate over the variable names using ${!count_*} but how can I access the associative arrays behind those variable names? For each associative array (each count_* variable) I should iterate over the words and their counts.
I have already tried indirect access like this, but it doesn't work:
for categorycount in ${!count_*} # categorycount now holds the name of the associative array variable for each category
do
array=${!categorycount}
for word in ${!array[@]}
do
echo "$word occurred ${array[$word]} times"
done
done
The modern (bash 4.3+) approach uses "namerefs", a facility borrowed from ksh:
for _count_var in "${!count_@}"; do
    declare -n count=$_count_var # make count an alias for $_count_var
    for key in "${!count[@]}"; do # iterate over keys, via same
        echo "$key occurred ${count[$key]} times" # extract value, likewise
    done
    unset -n count # clear that alias
done
declare -n count=$_count_var allows "${count[foo]}" to be used to look up item foo in the associative array whose name is stored in _count_var; similarly, count[foo]=bar will assign to that same item. unset -n count then removes this mapping.
Prior to bash 4.3:
for _count_var in "${!count_@}"; do
    printf -v cmd '_count_keys=( "${!%q[@]}" )' "$_count_var" && eval "$cmd"
    for key in "${_count_keys[@]}"; do
        var="$_count_var[$key]"
        echo "$key occurred ${!var} times"
    done
done
Note the use of %q, rather than substituting a variable name directly into a string, to generate the command to eval. Even though in this case we're probably safe (because the set of possible variable names is restricted), following this practice reduces the amount of context that needs to be considered to determine whether an indirect expansion is secure.
In both cases, note that internal variables (_count_var, _count_keys, etc) use names that don't match the count_* pattern.
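For a self-contained run of the nameref approach, the sketch below sets up two hypothetical categories (fruit and animal, with invented counts) and sums the counts per category. It assumes bash 4.3 or later, and that no other variables starting with count_ exist in the environment.

```bash
#!/usr/bin/env bash
# Requires bash 4.3+ for declare -n. Categories and words are made up.
declare -A count_fruit=([apple]=3 [pear]=1)
declare -A count_animal=([cat]=2)

declare -A totals=()              # name deliberately does not start with count_
for _count_var in "${!count_@}"; do
    declare -n count=$_count_var  # count now aliases the current category array
    category=${_count_var#count_} # strip the prefix to get the category name
    sum=0
    for key in "${!count[@]}"; do
        sum=$((sum + ${count[$key]}))
    done
    totals[$category]=$sum
    unset -n count                # drop the alias, not the underlying array
done

for category in "${!totals[@]}"; do
    echo "$category: ${totals[$category]} words in total"
done
```

Using unset -n rather than plain unset matters here: plain unset on a nameref would remove the array it points to.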

Saving directory content to an array (bash) [duplicate]

This question already has answers here:
How do you store a list of directories into an array in Bash (and then print them out)?
I need to save the contents of two directories in arrays to compare them later. This is the solution I wrote:
DirContent()
{
#past '$1' directorys to 'directorys'
local DIRECTORYS=`ls -l --time-style="long-iso" $1 | egrep '^d' | awk '{print $8}'`
local CONTENT
local i
for DIR in $DIRECTORYS
do
i=+1
CONTENT[i]=${DIR}
done
echo $CONTENT
}
Then when I try to print this array I get empty output, although neither directory is empty. Please tell me what I am doing wrong here.
Thanks, Siery.
The core of this question is answered in the one I marked as a duplicate. Here are a few more pointers:
All uppercase variable names are discouraged as they are more likely to clash with environment variables.
You assign to DIRECTORYS (should probably be "directories") the output of a complicated command, which suffers from a few deficiencies:
Instead of backticks as in var=`command`, the syntax var=$(command) is preferred.
egrep is deprecated and grep -E is preferred.
The grep and awk commands could be combined to awk '/^d/ { print $8 }'.
There are better ways to get directories, for example find, but the output of find shouldn't be parsed either.
You shouldn't process the output of ls programmatically: filenames can contain spaces, newlines, other special characters...
DIRECTORYS is now just one long string, and you rely on word splitting to iterate over it. Again, spaces in filenames will trip you up.
DIR isn't declared local.
To increase i, you'd use (( ++i )).
CONTENT[i]=${DIR} is actually okay: the i is automatically expanded here and doesn't have to be prepended by a $. Normally you'd want to quote your variables like "$dir", but in this case we happen to know that it won't be split any further as it already is the result of word splitting.
Array indices start at zero and you're skipping zero. You should increase the counter after the assignment.
Instead of using a counter, you can just append to an array with content+=("$dir").
To print the contents of an array, you'd use echo "${CONTENT[@]}".
But really, what you should do instead of all this: a call DirContent some_directory is equivalent to echo some_directory/*/, and if you want that in an array, you'd just use
arr=(some_directory/*/)
instead of the whole function – this even works for weird filenames. And is much, much shorter.
If you have hidden directories (names starting with .), you can use shopt -s dotglob to include them as well.
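If a function is still wanted, the pointers above could be combined roughly as follows. This is a sketch, not the original poster's code: the function name dir_content and the global array content are hypothetical, and the test directory is created only for the demonstration.

```bash
#!/usr/bin/env bash
# dir_content DIR: store the subdirectories of DIR in the global array "content".
dir_content() {
    local dir
    content=()
    for dir in "$1"/*/; do
        [[ -d $dir ]] || continue     # skips the unexpanded pattern when nothing matches
        content+=("${dir%/}")         # append, dropping the trailing slash
    done
}

# Demonstration with throwaway directories, one of which contains a space.
tmp=$(mktemp -d)
mkdir "$tmp/a b" "$tmp/c"
touch "$tmp/file.txt"

dir_content "$tmp"
printf '<%s>\n' "${content[@]}"
rm -rf "$tmp"
```

The function fills a global array because a bash function cannot return an array directly; echoing the elements and capturing them again would reintroduce the word-splitting problem.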
You can try
for((i=0;i<${#CONTENT[*]};i++))
do
echo "${CONTENT[$i]}"
done
instead of echo $CONTENT.
These changes are also required in your code above:
((i+=1))
CONTENT[$i]=${DIR}
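To see why the original loop printed nothing, the following sketch reproduces the i=+1 assignment on made-up input and compares it with appending via +=. The word list is an assumption used only for the demonstration.

```bash
#!/usr/bin/env bash
words="alpha beta gamma"          # hypothetical input; relies on word splitting

broken=()
i=0
for w in $words; do
    i=+1                          # assigns the string "+1"; it does not increment
    broken[i]=$w                  # the subscript evaluates to 1 every time
done

fixed=()
for w in $words; do
    fixed+=("$w")                 # append; no counter needed
done

echo "broken indices: ${!broken[*]}"
echo "broken without subscript: '${broken-}'"   # element 0 was never set
echo "fixed: ${fixed[*]}"
```

Every iteration overwrites index 1, and echo $CONTENT only ever looks at index 0, so the output is empty.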

How do you assign lines of output containing spaces to a bash array [duplicate]

This question already has answers here:
Capturing output of find . -print0 into a bash array
I would like to take the output of something that returns lines of elements possibly containing spaces to a bash array, with each line getting its own array element.
So for example:
find . -name \*.jpg
... may return a list of filenames. I want each filename to be assigned to an array. The simple solution doesn't work in general, because if there are spaces in filenames, the words get their own array element.
For example, start with this list of files in a directory:
FILE1.jpg
FILE2.jpg
FILE WITH SPACES.jpg
Try:
FILES=( $(find . -name \*.jpg) )
And you get (<> added for emphasis of individual elements):
$ for f in "${FILES[@]}"; do echo "<$f>"; done
<./FILE>
<WITH>
<SPACES.jpg>
<./FILE1.jpg>
<./FILE2.jpg>
This is not likely what you want.
How do you assign lines to array elements regardless of the lines containing spaces?
Set IFS before making the assignment. This allows bash to ignore the spaces by using only "\n" as the delimiter:
IFS=$'\n'
FILES=( $(find . -name \*.jpg) )
Now you get the result:
for f in "${FILES[@]}"; do echo "<$f>"; done
<./FILE WITH SPACES.jpg>
<./FILE1.jpg>
<./FILE2.jpg>
Note that how you access the array is important as well. This is covered in a similar question: BASH array with spaces in elements
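An alternative that avoids changing IFS globally is the mapfile builtin (also spelled readarray). The sketch below assumes bash 4.0+ for mapfile and bash 4.4+ for its -d option, plus a find that supports -print0; the files are created in a temporary directory purely for illustration.

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d)
touch "$tmp/FILE1.jpg" "$tmp/FILE WITH SPACES.jpg"

# One element per line. Breaks only on names that themselves contain newlines.
mapfile -t by_line < <(find "$tmp" -name '*.jpg')

# One element per NUL-terminated name. Safe for any file name.
mapfile -d '' -t by_nul < <(find "$tmp" -name '*.jpg' -print0)

for f in "${by_nul[@]}"; do
    echo "<${f##*/}>"
done
rm -rf "$tmp"
```

Unlike the unquoted $(...) form, mapfile performs neither word splitting nor pathname expansion on the lines it reads.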

Bash substring expansion on array

I have a set of files with a given suffix. For instance, I have a set of pdf files with suffix .pdf. I would like to obtain the names of the files without the suffix using substring expansion.
For a single file I can use:
file="test.pdf"
echo ${file:0:-4}
To do this operation for all files, I now tried:
files=( $(ls *.pdf) )
ff=( "${files[@]:0: -4}" )
echo ${ff[@]}
I now get an error saying: substring expression < 0.
( I would like to avoid using a for loop )
Use parameter expansions to remove the .pdf part like so:
shopt -s nullglob
files=( *.pdf )
echo "${files[@]%.pdf}"
The shopt -s nullglob is always a good idea when using globs: it will make the glob expand to nothing if there are no matches.
"${files[@]%.pdf}" will expand to the array elements with the trailing .pdf removed from each. You can, if you wish, put this in another array like so:
files_noext=( "${files[@]%.pdf}" )
All this is 100% safe regarding funny symbols in filenames (spaces, newlines, etc.), except for the echo part for files named -n.pdf, -e.pdf and -E.pdf... but the echo was just here for demonstration purposes. Your files=( $(ls *.pdf) ) is really, really bad! Never parse the output of ls.
To answer your comment: substring expansions don't work on each field of the array. Taken from the reference manual linked above:
${parameter:offset}
${parameter:offset:length}
If offset evaluates to a number less than zero, the value is used as an offset from the end of the value of parameter. If length evaluates to a number less than zero, and parameter is not @ and not an indexed or associative array, it is interpreted as an offset from the end of the value of parameter rather than a number of characters, and the expansion is the characters between the two offsets. If parameter is @, the result is length positional parameters beginning at offset. If parameter is an indexed array name subscripted by @ or *, the result is the length members of the array beginning with ${parameter[offset]}. A negative offset is taken relative to one greater than the maximum index of the specified array. Substring expansion applied to an associative array produces undefined results.
So, e.g.,
$ array=( zero one two three four five six seven eight )
$ echo "${array[@]:3:2}"
three four
$
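The three behaviours discussed in this answer can be compared side by side. This is a minimal sketch with invented file names; the negative length in the last expansion assumes bash 4.2 or later.

```bash
#!/usr/bin/env bash
files=("report.pdf" "my notes.pdf" "a.b.pdf")   # hypothetical names

# Suffix removal is applied to EACH element.
noext=("${files[@]%.pdf}")

# Substring syntax on an array selects ELEMENTS: 2 of them, starting at index 1.
slice=("${files[@]:1:2}")

# On a scalar, the same syntax selects CHARACTERS; -4 drops the last four.
name=${files[0]}
trimmed=${name:0:-4}

printf '<%s>\n' "${noext[@]}"
printf 'slice has %d elements, trimmed=%s\n' "${#slice[@]}" "$trimmed"
```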
