Bash array with spaces in elements - arrays

I'm trying to construct an array in bash of the filenames from my camera:
FILES=(2011-09-04 21.43.02.jpg
2011-09-05 10.23.14.jpg
2011-09-09 12.31.16.jpg
2011-09-11 08.43.12.jpg)
As you can see, there is a space in the middle of each filename.
I've tried wrapping each name in quotes, and escaping the space with a backslash, neither of which works.
When I try to access the array elements, it continues to treat the space as the element delimiter.
How can I properly capture the filenames with a space inside the name?

I think the issue might be partly with how you're accessing the elements. If I do a simple for elem in $FILES, I experience the same issue as you. However, if I access the array through its indices, like so, it works if I add the elements either numerically or with escapes:
for ((i = 0; i < ${#FILES[@]}; i++))
do
echo "${FILES[$i]}"
done
Any of these declarations of $FILES should work:
FILES=(2011-09-04\ 21.43.02.jpg
2011-09-05\ 10.23.14.jpg
2011-09-09\ 12.31.16.jpg
2011-09-11\ 08.43.12.jpg)
or
FILES=("2011-09-04 21.43.02.jpg"
"2011-09-05 10.23.14.jpg"
"2011-09-09 12.31.16.jpg"
"2011-09-11 08.43.12.jpg")
or
FILES[0]="2011-09-04 21.43.02.jpg"
FILES[1]="2011-09-05 10.23.14.jpg"
FILES[2]="2011-09-09 12.31.16.jpg"
FILES[3]="2011-09-11 08.43.12.jpg"

There must be something wrong with the way you access the array's items. Here's how it's done:
for elem in "${files[@]}"
...
From the bash manpage:
Any element of an array may be referenced using ${name[subscript]}. ... If subscript is @ or *, the word expands to all members of name. These subscripts differ only when the word appears within double quotes. If the word is double-quoted, ${name[*]} expands to a single word with the value of each array member separated by the first character of the IFS special variable, and ${name[@]} expands each element of name to a separate word.
Of course, you should also use double quotes when accessing a single member
cp "${files[0]}" /tmp

You can set IFS so that the space is no longer treated as an element delimiter when the array is expanded unquoted.
FILES=("2011-09-04 21.43.02.jpg"
"2011-09-05 10.23.14.jpg"
"2011-09-09 12.31.16.jpg"
"2011-09-11 08.43.12.jpg")
IFS=""
for jpg in ${FILES[*]}
do
echo "${jpg}"
done
If you want to split on . instead, just set IFS="."
Hope it helps you:)

I agree with others that it's likely how you're accessing the elements that is the problem. Quoting the file names in the array assignment is correct:
FILES=(
"2011-09-04 21.43.02.jpg"
"2011-09-05 10.23.14.jpg"
"2011-09-09 12.31.16.jpg"
"2011-09-11 08.43.12.jpg"
)
for f in "${FILES[@]}"
do
echo "$f"
done
Using double quotes around any array of the form "${FILES[@]}" splits the array into one word per array element. It doesn't do any word-splitting beyond that.
Using "${FILES[*]}" also has a special meaning, but it joins the array elements with the first character of $IFS, resulting in one word, which is probably not what you want.
Using a bare ${array[@]} or ${array[*]} subjects the result of that expansion to further word-splitting, so you'll end up with words split on spaces (and anything else in $IFS) instead of one word per array element.
Using a C-style for loop is also fine and avoids worrying about word-splitting if you're not clear on it:
for (( i = 0; i < ${#FILES[@]}; i++ ))
do
echo "${FILES[$i]}"
done
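The difference between the quoted and unquoted forms described above can be checked by counting the words each expansion produces. This is only a minimal sketch: count_args is a hypothetical helper introduced here, and the default IFS is assumed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

FILES=("2011-09-04 21.43.02.jpg" "2011-09-05 10.23.14.jpg")

quoted_at=$(count_args "${FILES[@]}")    # one word per element
quoted_star=$(count_args "${FILES[*]}")  # all elements joined into one word
bare_at=$(count_args ${FILES[@]})        # elements re-split on whitespace

echo "quoted @: $quoted_at, quoted *: $quoted_star, unquoted @: $bare_at"
```

With two file names that each contain one space, the three counts should come out as 2, 1 and 4.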

If you had your array like this:
#!/bin/bash
Unix[0]='Debian'
Unix[1]="Red Hat"
Unix[2]='Ubuntu'
Unix[3]='Suse'
for i in $(echo ${Unix[@]});
do echo $i;
done
You would get:
Debian
Red
Hat
Ubuntu
Suse
The loop splits the values on spaces and treats each piece as an individual item, even if you surround them with quotes, because the unquoted $(echo ...) output is word-split again.
To get around this, instead of looping over the elements in the array, you loop over the indexes and use each index to fetch the full string that's wrapped in quotes.
It must be wrapped in quotes!
#!/bin/bash
Unix[0]='Debian'
Unix[1]='Red Hat'
Unix[2]='Ubuntu'
Unix[3]='Suse'
for i in $(echo ${!Unix[@]});
do echo "${Unix[$i]}";
done
Then you'll get:
Debian
Red Hat
Ubuntu
Suse
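The same index-based loop can also be written without the echo subshell, by quoting the index expansion directly. This is a sketch rather than the answer's own code; the listing variable is an illustrative name used only to capture the output.

```bash
#!/usr/bin/env bash
Unix=('Debian' 'Red Hat' 'Ubuntu' 'Suse')

# "${!Unix[@]}" expands to the indices 0 1 2 3, one word each.
listing=$(
    for i in "${!Unix[@]}"; do
        printf '%s: %s\n' "$i" "${Unix[$i]}"
    done
)
echo "$listing"
```

Quoting "${Unix[$i]}" is what keeps "Red Hat" together as a single word.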

This was already answered above, but that answer was a bit terse and the man page excerpt is a bit cryptic. I wanted to provide a fully worked example to demonstrate how this works in practice.
If not quoted, an array just expands to strings separated by spaces, so that
for file in ${FILES[@]}; do
expands to
for file in 2011-09-04 21.43.02.jpg 2011-09-05 10.23.14.jpg 2011-09-09 12.31.16.jpg 2011-09-11 08.43.12.jpg ; do
But if you quote the expansion, bash treats each element as a separate word, as if each one had been double-quoted, so that:
for file in "${FILES[@]}"; do
expands to
for file in "2011-09-04 21.43.02.jpg" "2011-09-05 10.23.14.jpg" "2011-09-09 12.31.16.jpg" "2011-09-11 08.43.12.jpg" ; do
The simple rule of thumb is to always use [@] instead of [*] and quote array expansions if you want spaces preserved.
To elaborate on this a little further, the man page in the other answer is explaining that if unquoted, ${array[*]} and ${array[@]} behave the same way (just as $* and $@ do), but they are different when quoted. So, given
array=(a b c)
Then ${array[*]} and ${array[@]} both expand to
a b c
and "${array[*]}" expands to
"a b c"
and "${array[@]}" expands to
"a" "b" "c"

Not exactly an answer to the quoting/escaping problem of the original question but probably something that would actually have been more useful for the op:
unset FILES
for f in 2011-*.jpg; do FILES+=("$f"); done
echo "${FILES[@]}"
Where of course the expression would have to be adapted to the specific requirement (e.g. *.jpg for all or 2011-09-11*.jpg for only the pictures of a certain day).
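A glob can also be assigned to the array directly, without the loop. The following is a minimal sketch under assumptions: demo_dir is a throwaway directory created just for the example, and nullglob is switched on so that a pattern with no matches yields an empty array.

```bash
#!/usr/bin/env bash
# Work in a throwaway directory so the example is self-contained.
demo_dir=$(mktemp -d)
touch "$demo_dir/2011-09-04 21.43.02.jpg" "$demo_dir/2011-09-05 10.23.14.jpg"

shopt -s nullglob                 # an unmatched pattern expands to nothing
FILES=("$demo_dir"/2011-*.jpg)    # one element per matching file, spaces intact
shopt -u nullglob

echo "${#FILES[@]} files"
for f in "${FILES[@]}"; do
    echo "${f##*/}"               # strip the directory part for display
done
rm -r "$demo_dir"
```

Pathname expansion is not followed by word splitting, so each matching file name becomes exactly one element.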

For those who prefer to set the array in one line instead of using a for loop:
changing IFS temporarily to a newline could save you from escaping.
OLD_IFS="$IFS"
IFS=$'\n'
array=( $(ls *.jpg) ) #save the hassle to construct filename
IFS="$OLD_IFS"

Escaping works.
#!/bin/bash
FILES=(2011-09-04\ 21.43.02.jpg
2011-09-05\ 10.23.14.jpg
2011-09-09\ 12.31.16.jpg
2011-09-11\ 08.43.12.jpg)
echo ${FILES[0]}
echo ${FILES[1]}
echo ${FILES[2]}
echo ${FILES[3]}
Output:
$ ./test.sh
2011-09-04 21.43.02.jpg
2011-09-05 10.23.14.jpg
2011-09-09 12.31.16.jpg
2011-09-11 08.43.12.jpg
Quoting the strings also produces the same output.

#! /bin/bash
renditions=(
"640x360 80k 60k"
"1280x720 320k 128k"
"1280x720 320k 128k"
)
for z in "${renditions[@]}"; do
echo "$z"
done
OUTPUT
640x360 80k 60k
1280x720 320k 128k
1280x720 320k 128k

Another solution is using a "while" loop instead of a "for" loop:
index=0
while [ ${index} -lt ${#Array[@]} ]
do
echo "${Array[${index}]}"
index=$(( $index + 1 ))
done

If you aren't stuck on using bash, different handling of spaces in file names is one of the benefits of the fish shell. Consider a directory which contains two files: "a b.txt" and "b c.txt". Here's a reasonable guess at processing a list of files generated from another command with bash, but it fails because of the spaces in the file names, the same problem you experienced:
# bash
$ for f in $(ls *.txt); { echo $f; }
a
b.txt
b
c.txt
With fish, the syntax is nearly identical, but the result is what you'd expect:
# fish
for f in (ls *.txt); echo $f; end
a b.txt
b c.txt
It works differently because fish splits the output of commands on newlines, not spaces.
If you have a case where you do want to split on spaces instead of newlines, fish has a very readable syntax for that:
for f in (ls *.txt | string split " "); echo $f; end

If the elements of FILES come from another file whose file names are line-separated like this:
2011-09-04 21.43.02.jpg
2011-09-05 10.23.14.jpg
2011-09-09 12.31.16.jpg
2011-09-11 08.43.12.jpg
then try this so that the whitespaces in the file names aren't regarded as delimiters:
while read -r line; do
FILES+=("$line")
done < ./files.txt
If they come from another command, you need to rewrite the last line like this:
while read -r line; do
FILES+=("$line")
done < <(./output-files.sh)

I used to reset the IFS value and rollback when done.
# backup IFS value
O_IFS=$IFS
# reset IFS value
IFS=""
FILES=(
"2011-09-04 21.43.02.jpg"
"2011-09-05 10.23.14.jpg"
"2011-09-09 12.31.16.jpg"
"2011-09-11 08.43.12.jpg"
)
for file in ${FILES[@]}; do
echo ${file}
done
# rollback IFS value
IFS=${O_IFS}
Possible output from the loop:
2011-09-04 21.43.02.jpg
2011-09-05 10.23.14.jpg
2011-09-09 12.31.16.jpg
2011-09-11 08.43.12.jpg

Related

bash outputting empty array as empty string

This has been asked several times, with several accepted answers, but, on my trials none of the answers seem to work... I have two arrays, each of which represent the parameter list for a command. As such, I want to quote the strings properly to use with eval:
bash-4.2> ARRAY0=()
bash-4.2> ARRAY3=("ONE" "TWO WITH SPACE" "THREE")
bash-4.2> echo cmd $opt_arg $(printf "%q " "${ARRAY0[@]}")
cmd ''
bash-4.2> echo cmd $opt_arg $(printf "%q " "${ARRAY3[@]}")
cmd ONE TWO\ WITH\ SPACE THREE
bash#
Where $opt_arg may or may not be populated. The problem is that in the first case, where the array is empty, it outputs '' as a parameter, even though the array is empty. This kills my command, as it's expecting zero arguments. I've not found a neat solution (I can do an if [[ ${#ARRAY0[@]} ]] around it, but that's rather ugly...). Is there a neat way to do this?
The idiom I use for this is to always check array length:
(( ${#array[@]} )) && printf '%q ' "${array[@]}"
That said, in the present case, you can avoid the zero-argument case simply by having your cmd in the list, ensuring that printf always has at least one non-format-string argument:
printf '%q ' cmd "${array[@]}"
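The length check above can be wrapped in a small function so that an empty array produces no output at all. This is a sketch only; quote_args is a hypothetical helper name, not something from the question or from bash itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the shell-quoted form of its arguments,
# or nothing at all when called without arguments.
quote_args() {
    (( $# )) && printf '%q ' "$@"
    return 0
}

ARRAY0=()
ARRAY3=("ONE" "TWO WITH SPACE" "THREE")

empty=$(quote_args "${ARRAY0[@]}")
full=$(quote_args "${ARRAY3[@]}")

echo "empty: [$empty]"
echo "full:  [$full]"
```

Without the (( $# )) guard, printf would still run its format string once with an empty argument, which is where the stray '' comes from.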
Why do you need printf? Just "${ARRAY0[@]}" should be fine.
for i in cmd $opt_arg "${ARRAY0[@]}"; do echo "[[[$i]]]"; done
[[[cmd]]]
for i in cmd $opt_arg "${ARRAY3[@]}"; do echo "[[[$i]]]"; done
[[[cmd]]]
[[[ONE]]]
[[[TWO WITH SPACE]]]
[[[THREE]]]

Bash, split words into letters and save to array

I'm struggling with a project. I am supposed to write a bash script which will work like the tr command. To begin with, I would like to save all of the command's arguments into separate arrays. If an argument is a word, I would like each character in a separate array field, e.g.
tr_mine AB DC
I would like to have two arrays: a[0] = A, a[1] = B and b[0]=C b[1]=D.
I found a way, but it's not working:
IFS="" read -r -a array <<< "$a"
No sed, no awk, all bash internals.
Assuming that words are always separated with blanks (space and/or tabs),
also assuming that words are given as arguments, and writing for bash only:
#!/bin/bash
blank=$'[ \t]'
varname='A'
n=1
while IFS='' read -r -d '' -N 1 c ; do
if [[ $c =~ $blank ]]; then n=$((n+1)); continue; fi
eval ${varname}${n}'+=("'"$c"'")'
done <<<"$@"
last=$(eval echo \${#${varname}${n}[@]}) ### Find last character index.
unset "${varname}${n}[$last-1]" ### Remove last (trailing) newline.
for ((j=1;j<=$n;j++)); do
k="A$j[@]"
printf '<%s> ' "${!k}"; echo
done
That will set each array A1, A2, A3, etc. ... to the letters of each word.
The value at the end of the first loop of $n is the count of words processed.
Printing may be a little tricky, that is why the code to access each letter is given above.
Applied to your sample text:
$ script.sh AB DC
<A> <B>
<D> <C>
The script is setting two (array) vars A1 and A2.
And each letter is one array element: A1[0] = A, A1[1] = B and A2[0]=D, A2[1]=C.
You need to set a variable ($k) to the array element to access.
For example, to echo fourth letter (0 based) of second word (1 based) you need to do (that may be changed if needed):
k="A2[3]"; echo "${!k}" ### Indirect addressing.
The script will work as this:
$ script.sh ABCD efghi
<A> <B> <C> <D>
<e> <f> <g> <h> <i>
Caveat: Characters will be split even if quoted. However, quoted arguments is the correct way to use this script to avoid the effect of shell metacharacters ( |,&,;,(,),<,>,space,tab ). Of course, spaces (even if repeated) will split words as defined by the variable $blank:
$ script.sh $'qwer;rttt fgf\ngfg'
<q> <w> <e> <r> <;> <r> <t> <t> <t>
<>
<>
<>
<f> <g> <f> <
> <g> <f> <g>
As the script will accept and correctly process embedded newlines we need to use: unset "${varname}${n}[$last-1]" to remove the last trailing "newline". If that is not desired, quote the line.
Security Note: The eval is not much of a problem here as it is only processing one character at a time. It would be difficult to create an attack based on just one character. Anyway, the usual warning is valid: Always sanitize your input before using this script. Also, most (not quoted) metacharacters of bash will break this script.
$ script.sh qwer(rttt fgfgfg
bash: syntax error near unexpected token `('
I would strongly suggest to do this in another language if possible, it will be a lot easier.
Now, the closest I come up with is:
#!/bin/bash
sentence="AC DC"
words=`echo "$sentence" | tr " " "\n"`
# final array
declare -A result
# word count
wc=0
for i in $words; do
# letter count in the word
lc=0
for l in `echo "$i" | grep -o .`; do
result["w$wc-l$lc"]=$l
lc=$(($lc+1))
done
wc=$(($wc+1))
done
rLen=${#result[@]}
echo "Result Length $rLen"
for i in "${!result[@]}"
do
echo "$i => ${result[$i]}"
done
The above prints:
Result Length 4
w1-l1 => C
w1-l0 => D
w0-l0 => A
w0-l1 => C
Explanation:
Dynamic variables are not supported in bash (ie create variables using variables) so I am using an associative array instead (result)
Arrays in bash are single dimension. To fake a 2D array I use the indexes: w for words and l for letters. This will make further processing a pain...
Associative arrays are not ordered thus results appear in random order when printing
${!result[@]} is used instead of ${result[@]}. The first iterates keys while the second iterates values
I know this is not exactly what you ask for, but I hope it will point you to the right direction
Try this :
sentence="$@"
read -r -a words <<< "$sentence"
for word in ${words[@]}; do
inc=$(( i++ ))
read -r -a l${inc} <<< $(sed 's/./& /g' <<< $word)
done
echo ${words[1]} # print "CD"
echo ${l1[1]} # print "D"
The first read reads all words, the internal one is for letters.
The sed command add a space after each letters to make the string splittable by read -a. You can also use this sed command to remove unwanted characters from words (eg commas) before splitting.
If special characters are allowed in words, you can use a simple grep instead of the sed command (as suggested in http://www.unixcl.com/2009/07/split-string-to-characters-in-bash.html) :
read -r -a l${inc} <<< $(grep -o . <<< $word)
The word array is words.
The letters arrays are named l# where # is an increment added for each word read.
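For the per-word letter arrays asked about in the question, substring expansion alone is enough, with no eval and no external tools. The following is a minimal sketch; split_letters and the global letters array are hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fills the global array `letters` with the characters
# of its first argument, using only substring expansion.
split_letters() {
    local word=$1 i
    letters=()
    for (( i = 0; i < ${#word}; i++ )); do
        letters+=("${word:i:1}")
    done
}

split_letters "AB"; a=("${letters[@]}")
split_letters "DC"; b=("${letters[@]}")

echo "a: ${a[0]} ${a[1]}"
echo "b: ${b[0]} ${b[1]}"
```

Because each character is appended as a quoted word, spaces and glob characters inside a word end up as ordinary elements.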

How can I handle an array where elements contain spaces in Bash?

Let's say I have a file named tmp.out that contains the following:
c:\My files\testing\more files\stuff\test.exe
c:\testing\files here\less files\less stuff\mytest.exe
I want to put the contents of that file into an array and I do it like so:
ARRAY=( `cat tmp.out` )
I then run this through a for loop like so
for i in ${ARRAY[@]};do echo ${i}; done
But the output ends up like this:
c:\My
files\testing\more
files\stuff\test.exe
c:\testing\files
here\less
files\less
stuff\mytest.exe
and I want the output to be:
c:\My files\testing\more files\stuff\test.exe
c:\testing\files here\less files\less stuff\mytest.exe
How can I resolve this?
In order to iterate over the values in an array, you need to quote the array expansion to avoid word splitting:
for i in "${values[@]}"; do
Of course, you should also quote the use of the value:
echo "${i}"
done
That doesn't answer the question of how to get the lines of a file into an array in the first place. If you have bash 4.0, you can use the mapfile builtin:
mapfile -t values < tmp.out
Otherwise, you'd need to temporarily change the value of IFS to a single newline, or use a loop over the read builtin.
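Put together, the mapfile approach might look like the sketch below. It assumes bash 4.0 or newer, and tmp_out is a temporary stand-in for the tmp.out file from the question.

```bash
#!/usr/bin/env bash
# Recreate the two-line input in a temporary file (stand-in for tmp.out).
tmp_out=$(mktemp)
printf '%s\n' 'c:\My files\testing\more files\stuff\test.exe' \
              'c:\testing\files here\less files\less stuff\mytest.exe' > "$tmp_out"

mapfile -t values < "$tmp_out"   # -t strips the trailing newline of each line

for i in "${values[@]}"; do
    printf '%s\n' "$i"
done
rm "$tmp_out"
```

mapfile does no word splitting and no backslash processing, so each line of the file becomes one array element unchanged.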
You can use the IFS variable, the Internal Field Separator. Set it to the empty string for the read command so that each line is read whole, with its spaces and any leading or trailing whitespace intact:
while IFS= read -r line ; do
ARRAY+=("$line")
done < tmp.out
-r is needed to keep the literal backslashes.
Another simple way to control word-splitting is by controlling the Internal Field Separator (IFS):
#!/bin/bash
oifs="$IFS" ## save original IFS
IFS=$'\n' ## set IFS to break on newline
array=( $( <dat/2lines.txt ) ) ## read lines into array
IFS="$oifs" ## restore original IFS
for ((i = 0; i < ${#array[@]}; i++)) do
printf "array[$i] : '%s'\n" "${array[i]}"
done
Input
$ cat dat/2lines.txt
c:\My files\testing\more files\stuff\test.exe
c:\testing\files here\less files\less stuff\mytest.exe
Output
$ bash arrayss.sh
array[0] : 'c:\My files\testing\more files\stuff\test.exe'
array[1] : 'c:\testing\files here\less files\less stuff\mytest.exe'

Reading a space-delimited string into an array in Bash

I have a variable which contains a space-delimited string:
line="1 1.50 string"
I want to split that string with space as a delimiter and store the result in an array, so that the following:
echo ${arr[0]}
echo ${arr[1]}
echo ${arr[2]}
outputs
1
1.50
string
Somewhere I found a solution which doesn't work:
arr=$(echo ${line})
If I run the echo statements above after this, I get:
1 1.50 string
[empty line]
[empty line]
I also tried
IFS=" "
arr=$(echo ${line})
with the same result. Can someone help, please?
In order to convert a string into an array, create an array from the string, letting the string get split naturally according to the IFS (Internal Field Separator) variable, which is the space char by default:
arr=($line)
or pass the string to the stdin of the read command using the herestring (<<<) operator:
read -a arr <<< "$line"
For the first example, it is crucial not to use quotes around $line since that is what allows the string to get split into multiple elements.
See also: https://github.com/koalaman/shellcheck/wiki/SC2206
In arr=( $line ), the word splitting is followed by globbing:
wildcards (*, ? and []) will be expanded to matching filenames.
The correct solution is only slightly more complex:
IFS=' ' read -a arr <<< "$line"
No globbing problem; the split character is set in $IFS, variables quoted.
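A quick way to see the difference is to put a wildcard into the string. This is a sketch only; the sample value of line is made up for the illustration, and -r is added so that backslashes would be kept literally as well.

```bash
#!/usr/bin/env bash
line="1 * string"

# read splits on IFS but never performs pathname expansion,
# so the literal * survives as its own element.
IFS=' ' read -r -a arr <<< "$line"

echo "${#arr[@]} elements"
printf '[%s]\n' "${arr[@]}"
```

With arr=($line) instead, the * would be replaced by the names of the files in the current directory whenever any exist.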
Try this:
arr=(`echo ${line}`);
If you need parameter expansion, then try:
eval "arr=($line)"
For example, take the following code.
line='a b "c d" "*" *'
eval "arr=($line)"
for s in "${arr[@]}"; do
echo "$s"
done
If the current directory contained the files a.txt, b.txt and c.txt, then executing the code would produce the following output.
a
b
c d
*
a.txt
b.txt
c.txt
line="1 1.50 string"
arr=$(echo "$line" | tr " " "\n")
for x in $arr
do
echo "> [$x]"
done

Store the output of find command in an array [duplicate]

How do I put the result of find $1 into an array?
In for loop:
for /f "delims=/" %%G in ('find $1') do %%G | cut -d\/ -f6-
I want to cry.
In bash:
file_list=()
while IFS= read -d $'\0' -r file ; do
file_list=("${file_list[@]}" "$file")
done < <(find "$1" -print0)
echo "${file_list[@]}"
file_list is now an array containing the results of find "$1".
What's special about "field 6"? It's not clear what you were attempting to do with your cut command.
Do you want to cut each file after the 6th directory?
for file in "${file_list[@]}" ; do
echo "$file" | cut -d/ -f6-
done
But why "field 6"? Can I presume that you actually want to return just the last element of the path?
for file in "${file_list[@]}" ; do
echo "${file##*/}"
done
Or even
echo "${file_list[@]##*/}"
Which will give you the last path element for each path in the array. You could even do something with the result
for file in "${file_list[@]##*/}" ; do
echo "$file"
done
Explanation of the bash program elements:
(One should probably use the builtin readarray instead)
find "$1" -print0
Find stuff and 'print the full file name on the standard output, followed by a null character'. This is important as we will split that output by the null character later.
<(find "$1" -print0)
"Process Substitution" : The output of the find subprocess is read in via a FIFO (i.e. the output of the find subprocess behaves like a file here)
while ...
done < <(find "$1" -print0)
The output of the find subprocess is read by the while command via <
IFS= read -d $'\0' -r file
This is the while condition:
read
Read one line of input (from the find command). The return value of read is 0 unless EOF is encountered, at which point while exits.
-d $'\0'
...taking the null character as the delimiter (see QUOTING in the bash manpage). This is done because find separated the file names with null characters via -print0 earlier.
-r
backslash is not considered an escape character as it may be part of the filename
file
Result (first word actually, which is unique here) is put into variable file
IFS=
The command is run with IFS, the special variable which contains the characters on which read splits input into words, set to the empty string, because we don't want to split.
And inside the loop:
file_list=("${file_list[@]}" "$file")
Inside the loop, the file_list array is just grown by $file, suitably quoted.
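As the note about readarray suggests, the whole loop can be replaced by a single builtin call. The sketch below assumes bash 4.4 or newer, where mapfile (also available as readarray) accepts -d; demo_dir is a throwaway directory created only for the example.

```bash
#!/usr/bin/env bash
# Throwaway directory with an awkward file name (space and newline).
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with space"$'\n'"and newline.txt"

# -d '' makes mapfile split on NUL bytes; requires bash 4.4 or newer.
mapfile -d '' -t file_list < <(find "$demo_dir" -type f -print0)

echo "${#file_list[@]} files found"
rm -r "$demo_dir"
```

Splitting on NUL means that even a file name containing a newline ends up as a single array element.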
arrayname=( $(find $1) )
I don't understand your loop question? If you look how to work with that array then in bash you can loop through all array elements like this:
for element in $(seq 0 $((${#arrayname[@]} - 1)))
do
echo "${arrayname[$element]}"
done
This is probably not 100% foolproof, but it will probably work 99% of the time (I used the GNU utilities; the BSD utilities won't work without modifications; also, this was done using an ext4 filesystem):
declare -a BASH_ARRAY_VARIABLE=$(find <path> <other options> -print0 | sed -e 's/\x0$//' | awk -F'\0' 'BEGIN { printf "("; } { for (i = 1; i <= NF; i++) { printf "%c"gensub(/"/, "\\\\\"", "g", $i)"%c ", 34, 34; } } END { printf ")"; }')
Then you would iterate over it like so:
for FIND_PATH in "${BASH_ARRAY_VARIABLE[@]}"; do echo "$FIND_PATH"; done
Make sure to enclose $FIND_PATH inside double-quotes when working with the path.
Here's a simpler pipeless version, based on the version of user2618594
declare -a names=$(echo "("; find <path> <other options> -printf '"%p" '; echo ")")
for nm in "${names[@]}"
do
echo "$nm"
done
To loop through a find, you can simply use find:
for file in "`find "$1"`"; do
echo "$file" | cut -d/ -f6-
done
It was what I got from your question.

Resources