Reading a space-delimited string into an array in Bash - arrays

I have a variable which contains a space-delimited string:
line="1 1.50 string"
I want to split that string with space as a delimiter and store the result in an array, so that the following:
echo ${arr[0]}
echo ${arr[1]}
echo ${arr[2]}
outputs
1
1.50
string
Somewhere I found a solution which doesn't work:
arr=$(echo ${line})
If I run the echo statements above after this, I get:
1 1.50 string
[empty line]
[empty line]
I also tried
IFS=" "
arr=$(echo ${line})
with the same result. Can someone help, please?

In order to convert a string into an array, create an array from the string, letting the string get split naturally according to the IFS (Internal Field Separator) variable, which is the space char by default:
arr=($line)
or pass the string to the stdin of the read command using the herestring (<<<) operator:
read -a arr <<< "$line"
For the first example, it is crucial not to use quotes around $line since that is what allows the string to get split into multiple elements.
See also: https://github.com/koalaman/shellcheck/wiki/SC2206

In: arr=( $line ). The "split" comes associated with "glob".
Wildcards (*,? and []) will be expanded to matching filenames.
The correct solution is only slightly more complex:
IFS=' ' read -a arr <<< "$line"
No globbing problem; the split character is set in $IFS, variables quoted.

Try this:
arr=(`echo ${line}`);

If you need parameter expansion, then try:
eval "arr=($line)"
For example, take the following code.
line='a b "c d" "*" *'
eval "arr=($line)"
for s in "${arr[#]}"; do
echo "$s"
done
If the current directory contained the files a.txt, b.txt and c.txt, then executing the code would produce the following output.
a
b
c d
*
a.txt
b.txt
c.txt

line="1 1.50 string"
arr=$( $line | tr " " "\n")
for x in $arr
do
echo "> [$x]"
done

Related

how to parse json in bash script as an array? array value should have both key:value format

my json.file looks like
{
"price1" : "120.10",
"price2" : "110.30",
"price3" : "244.45"
}
I have used below sed command in my bash script to declare array that reads from json
array=( $(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json.file) )
this gave me output for echo ${array[*]} values
120.10 110.10 244.45
I am looking for my array values to include the key name as well (key:value).
my desired output should be key:value format
price1:120.10 price2:110.10 price3:244.45
can someone please help or guide me?
Parsing json with sed is probably a bad idea. But let's assume it is a super-simple and regular kind of json format. You were not too far from a working solution:
$ array=($(sed -nr 's/.*"(.*)".*"(.*)".*/\1:\2/p' json.file))
$ echo ${array[*]}
price1:120.10 price2:110.30 price3:244.45
The p flag of the substitute command tells sed to print the result if there was a match. It is good to know. If you have only two quoted strings per line of interest it should work. The regular expression .*"(.*)".*"(.*)".* matches anything - .* - followed by a double quote, anything again, another double quote... The parentheses - (.*) - do not change what is matched. It just records the matched part between parentheses in one of nine buffers that can be used in the replacement string - \1:\2. So here, the replacement string corresponds to:
<first recorded match>:<second recorded match>
If you want to be more specific about the matching lines you can. For instance:
sed -nr 's/^ "(.*)" : "(.*)",?$/\1:\2/p' json.file
Also specifies that there are 2 leading spaces, that the colon is preceded and followed by one single space and that there is a optional comma after the last quote and before the end of line.
But there is also the much simpler:
sed -nr 's/[[:space:]",]//gp' json.file
that just removes all spaces, double quotes and commas, printing only the matching lines. And guess what? It is what you (apparently) want.
Anyway, remember that if your json files are more complex than what you show sed is definitely not the right tool.
Your solution just needs to extract the extra data:
array=( $(sed -n '/{/,/}/{s/"\([^"]*\)"[^:]*:[^"]*"\([^"]*\).*/\1:\2/p;}' <<< "$j") )
echo ${array[#]}
price1:120.10 price2:110.30 price3:244.45
Simpler looking solutions may work too, but I'm assuming that you wrote your pattern like that deliberately.
The key trick here is to enclose part of your pattern with escaped brackets \(...\) and then have a numbered substitution for each in the output part \1, \2, etc.
json='{"price1":"120.10","price2":"110.30","price3":"244.45"}'
array=($(jq -r 'to_entries[] | ( "\(.key):\(.value)")' <<< "$json"))
echo ${array[#]}
printf '%s\n' "${array[#]}"
declare -p array
echo
array=($(jq -r 'to_entries[] | ( "\(.key):\(.value)") | #sh' <<< "$json"))
echo ${array[#]}
printf '%s\n' "${array[#]}"
declare -p array
Output:
price1:120.10 price2:110.30 price3:244.45
price1:120.10
price2:110.30
price3:244.45
declare -a array='([0]="price1:120.10" [1]="price2:110.30" [2]="price3:244.45")'
'price1:120.10' 'price2:110.30' 'price3:244.45'
'price1:120.10'
'price2:110.30'
'price3:244.45'
declare -a array='([0]="'\''price1:120.10'\''" [1]="'\''price2:110.30'\''" [2]="'\''price3:244.45'\''")'
For bash scripting, you might want to use an Associative array:
declare -A prices
while IFS=$'\t' read -r key value; do
prices[$key]=$value
done < <(
jq -r 'to_entries[] | [.key, .value] | #tsv' json.file
)
Then, inspect the array
declare -p prices
declare -A prices='([price1]="120.10" [price3]="244.45" [price2]="110.30" )'
or iterate over it
for key in "${!prices[#]}"; do
printf '%s => %s\n' "$key" "${prices[$key]}"
done
price1 => 120.10
price3 => 244.45
price2 => 110.30

Passing and Parsing an Array from one shell script to another

Due to the limitation of 9 parameters in a script, my objective is to pass about 30 strings bundled in an array from calling script (scriptA) to called script (scriptB).
My scriptA looks something like this...
#!/bin/bash
declare -a arr=( ab "c d" 123 "string with spaces" 456 )
. ./scriptB.sh "Task Name" "${arr[#]}"
My scriptB looks something like this...
#!/bin/bash
arg1="$1"
shift
arg2=("$#")
read -a arr1 <<< "$#"
j=0
for i in "${arr1[#]}"; do
#echo ${arr1[j]}
((j++))
case "$j" in
"1")
param1="${i//(}"
echo "$j=$param1"
;;
"2")
param2="${i}"
echo "$j=$param2"
;;
"3")
param3="${i}"
echo "$j=$param3"
;;
"4")
param4="${i}"
echo "$j=$param4"
;;
"5")
param5="${i//)}"
echo "$j=$param5"
;;
esac
done
OUTPUT:
1=ab
2=c
3=d
4=123
5=string
Problem:
1. I see parenthesis ( and ) gets added to the string which I have to strip them out
2. I see an array element (with spaces) though quoted under double quotes get to interpreted as separate elements by spaces.
read -a arr1 <<< "$#"
is wrong. The "$#" here is equal to "$*", and then read will split the input on whitespaces (spaces, tabs and newlines) and also interpret \ slashes and assign the result to array arr1. Remember to use read -r.
Do:
arr1=("$#")
to assign to an array. Then you could print with:
for ((i=1;i<${#arr1};++i)); do
printf "%d=%s\n" "$i" "${arr1[$i]}"
done
of 9 parameters in a script, my objective is to pass about 30 strings bundled in an array from calling script (scriptA)
Ok. But "${arr[#]}" is passing multiple arguments anyway. If you want to pass array as string, pass it as a string (note that eval is evil):
arr=( ab "c d" 123 "string with spaces" 456 )
./scriptB.sh "Task Name" "$(declare -p arr)"
# Then inside scriptB.sh, re-evaulate parameter 2:
eval "$2" # assigns to arr
Note that the scriptB.sh is sourced in your example, so passing arguments.... makes no sense anyway.
I see an array element (with spaces) though quoted under double quotes get to interpreted as separate elements by spaces
Yes, because you interpreted the content with read, which splits the input on characters in IFS, which by default is set to space, tab and newline. You could print arguments on separate lines and change IFS accordingly:
IFS=$'\n' read -r -a arr1 < <(printf "%s\n" "$#")
or even use a zero terminated string:
mapfile -t -d '' arr1 < <(printf "%s\0" "$#")
but those are just fancy and useless ways of writing arr1=("$#").
Note that in your code snipped, arg2 is an array.

Bash, split words into letters and save to array

I'm struggling with a project. I am supposed to write a bash script which will work like tr command. At the beginning I would like to save all commands arguments into separated arrays. And in case if an argument is a word I would like to have each char in separated array field,eg.
tr_mine AB DC
I would like to have two arrays: a[0] = A, a[1] = B and b[0]=C b[1]=D.
I found a way, but it's not working:
IFS="" read -r -a array <<< "$a"
No sed, no awk, all bash internals.
Assuming that words are always separated with blanks (space and/or tabs),
also assuming that words are given as arguments, and writing for bash only:
#!/bin/bash
blank=$'[ \t]'
varname='A'
n=1
while IFS='' read -r -d '' -N 1 c ; do
if [[ $c =~ $blank ]]; then n=$((n+1)); continue; fi
eval ${varname}${n}'+=("'"$c"'")'
done <<<"$#"
last=$(eval echo \${#${varname}${n}[#]}) ### Find last character index.
unset "${varname}${n}[$last-1]" ### Remove last (trailing) newline.
for ((j=1;j<=$n;j++)); do
k="A$j[#]"
printf '<%s> ' "${!k}"; echo
done
That will set each array A1, A2, A3, etc. ... to the letters of each word.
The value at the end of the first loop of $n is the count of words processed.
Printing may be a little tricky, that is why the code to access each letter is given above.
Applied to your sample text:
$ script.sh AB DC
<A> <B>
<D> <C>
The script is setting two (array) vars A1 and A2.
And each letter is one array element: A1[0] = A, A1[1] = B and A2[0]=C, A2[1]=D.
You need to set a variable ($k) to the array element to access.
For example, to echo fourth letter (0 based) of second word (1 based) you need to do (that may be changed if needed):
k="A2[3]"; echo "${!k}" ### Indirect addressing.
The script will work as this:
$ script.sh ABCD efghi
<A> <B> <C> <D>
<e> <f> <g> <h> <i>
Caveat: Characters will be split even if quoted. However, quoted arguments is the correct way to use this script to avoid the effect of shell metacharacters ( |,&,;,(,),<,>,space,tab ). Of course, spaces (even if repeated) will split words as defined by the variable $blank:
$ script.sh $'qwer;rttt fgf\ngfg'
<q> <w> <e> <r> <;> <r> <t> <t> <t>
<>
<>
<>
<f> <g> <f> <
> <g> <f> <g>
As the script will accept and correctly process embebed newlines we need to use: unset "${varname}${n}[$last-1]" to remove the last trailing "newline". If that is not desired, quote the line.
Security Note: The eval is not much of a problem here as it is only processing one character at a time. It would be difficult to create an attack based on just one character. Anyway, the usual warning is valid: Always sanitize your input before using this script. Also, most (not quoted) metacharacters of bash will break this script.
$ script.sh qwer(rttt fgfgfg
bash: syntax error near unexpected token `('
I would strongly suggest to do this in another language if possible, it will be a lot easier.
Now, the closest I come up with is:
#!/bin/bash
sentence="AC DC"
words=`echo "$sentence" | tr " " "\n"`
# final array
declare -A result
# word count
wc=0
for i in $words; do
# letter count in the word
lc=0
for l in `echo "$i" | grep -o .`; do
result["w$wc-l$lc"]=$l
lc=$(($lc+1))
done
wc=$(($wc+1))
done
rLen=${#result[#]}
echo "Result Length $rLen"
for i in "${!result[#]}"
do
echo "$i => ${result[$i]}"
done
The above prints:
Result Length 4
w1-l1 => C
w1-l0 => D
w0-l0 => A
w0-l1 => C
Explanation:
Dynamic variables are not supported in bash (ie create variables using variables) so I am using an associative array instead (result)
Arrays in bash are single dimension. To fake a 2D array I use the indexes: w for words and l for letters. This will make further processing a pain...
Associative arrays are not ordered thus results appear in random order when printing
${!result[#]} is used instead of ${result[#]}. The first iterates keys while the second iterates values
I know this is not exactly what you ask for, but I hope it will point you to the right direction
Try this :
sentence="$#"
read -r -a words <<< "$sentence"
for word in ${words[#]}; do
inc=$(( i++ ))
read -r -a l${inc} <<< $(sed 's/./& /g' <<< $word)
done
echo ${words[1]} # print "CD"
echo ${l1[1]} # print "D"
The first read reads all words, the internal one is for letters.
The sed command add a space after each letters to make the string splittable by read -a. You can also use this sed command to remove unwanted characters from words (eg commas) before splitting.
If special characters are allowed in words, you can use a simple grep instead of the sed command (as suggested in http://www.unixcl.com/2009/07/split-string-to-characters-in-bash.html) :
read -r -a l${inc} <<< $(grep -o . <<< $word)
The word array is ${w}.
The letters arrays are named l# where # is an increment added for each word read.

Space Delimiter Arrays Shell Script

Array is taking "space" as default delimiter:
str="HI I GOT;IT"
arr2=$(echo $str | tr ";" " ")
for x in $arr2
do
echo " $x"
done
Output:
HI
I
GOT
IT
I want the output to be:
HI I GOT
IT
You haven't said which shell this is, but it looks like bash, so I'll go ith that. This is a job for IFS, which determines how bash splits words. Here we set it to ; for a single command, to split up your string.
You also need to iterate over the array properly (using quotes and [#]) so that it's not split again by bash at this point.
str="HI I GOT;IT"
IFS=\; arr=($str)
for x in "${arr[#]}"
do
echo "$x"
done

Bash array with spaces in elements

I'm trying to construct an array in bash of the filenames from my camera:
FILES=(2011-09-04 21.43.02.jpg
2011-09-05 10.23.14.jpg
2011-09-09 12.31.16.jpg
2011-09-11 08.43.12.jpg)
As you can see, there is a space in the middle of each filename.
I've tried wrapping each name in quotes, and escaping the space with a backslash, neither of which works.
When I try to access the array elements, it continues to treat the space as the elementdelimiter.
How can I properly capture the filenames with a space inside the name?
I think the issue might be partly with how you're accessing the elements. If I do a simple for elem in $FILES, I experience the same issue as you. However, if I access the array through its indices, like so, it works if I add the elements either numerically or with escapes:
for ((i = 0; i < ${#FILES[#]}; i++))
do
echo "${FILES[$i]}"
done
Any of these declarations of $FILES should work:
FILES=(2011-09-04\ 21.43.02.jpg
2011-09-05\ 10.23.14.jpg
2011-09-09\ 12.31.16.jpg
2011-09-11\ 08.43.12.jpg)
or
FILES=("2011-09-04 21.43.02.jpg"
"2011-09-05 10.23.14.jpg"
"2011-09-09 12.31.16.jpg"
"2011-09-11 08.43.12.jpg")
or
FILES[0]="2011-09-04 21.43.02.jpg"
FILES[1]="2011-09-05 10.23.14.jpg"
FILES[2]="2011-09-09 12.31.16.jpg"
FILES[3]="2011-09-11 08.43.12.jpg"
There must be something wrong with the way you access the array's items. Here's how it's done:
for elem in "${files[#]}"
...
From the bash manpage:
Any element of an array may be referenced using ${name[subscript]}. ... If subscript is # or *, the word expands to all members of name. These subscripts differ only when the word appears within double quotes. If the word is double-quoted, ${name[*]} expands to a single word with the value of each array member separated by the first character of the IFS special variable, and ${name[#]} expands each element of name to a separate word.
Of course, you should also use double quotes when accessing a single member
cp "${files[0]}" /tmp
You need to use IFS to stop space as element delimiter.
FILES=("2011-09-04 21.43.02.jpg"
"2011-09-05 10.23.14.jpg"
"2011-09-09 12.31.16.jpg"
"2011-09-11 08.43.12.jpg")
IFS=""
for jpg in ${FILES[*]}
do
echo "${jpg}"
done
If you want to separate on basis of . then just do IFS="."
Hope it helps you:)
I agree with others that it's likely how you're accessing the elements that is the problem. Quoting the file names in the array assignment is correct:
FILES=(
"2011-09-04 21.43.02.jpg"
"2011-09-05 10.23.14.jpg"
"2011-09-09 12.31.16.jpg"
"2011-09-11 08.43.12.jpg"
)
for f in "${FILES[#]}"
do
echo "$f"
done
Using double quotes around any array of the form "${FILES[#]}" splits the array into one word per array element. It doesn't do any word-splitting beyond that.
Using "${FILES[*]}" also has a special meaning, but it joins the array elements with the first character of $IFS, resulting in one word, which is probably not what you want.
Using a bare ${array[#]} or ${array[*]} subjects the result of that expansion to further word-splitting, so you'll end up with words split on spaces (and anything else in $IFS) instead of one word per array element.
Using a C-style for loop is also fine and avoids worrying about word-splitting if you're not clear on it:
for (( i = 0; i < ${#FILES[#]}; i++ ))
do
echo "${FILES[$i]}"
done
If you had your array like this:
#!/bin/bash
Unix[0]='Debian'
Unix[1]="Red Hat"
Unix[2]='Ubuntu'
Unix[3]='Suse'
for i in $(echo ${Unix[#]});
do echo $i;
done
You would get:
Debian
Red
Hat
Ubuntu
Suse
I don't know why but the loop breaks down the spaces and puts them as an individual item, even you surround it with quotes.
To get around this, instead of calling the elements in the array, you call the indexes, which takes the full string thats wrapped in quotes.
It must be wrapped in quotes!
#!/bin/bash
Unix[0]='Debian'
Unix[1]='Red Hat'
Unix[2]='Ubuntu'
Unix[3]='Suse'
for i in $(echo ${!Unix[#]});
do echo ${Unix[$i]};
done
Then you'll get:
Debian
Red Hat
Ubuntu
Suse
This was already answered above, but that answer was a bit terse and the man page excerpt is a bit cryptic. I wanted to provide a fully worked example to demonstrate how this works in practice.
If not quoted, an array just expands to strings separated by spaces, so that
for file in ${FILES[#]}; do
expands to
for file in 2011-09-04 21.43.02.jpg 2011-09-05 10.23.14.jpg 2011-09-09 12.31.16.jpg 2011-09-11 08.43.12.jpg ; do
But if you quote the expansion, bash adds double quotes around each term, so that:
for file in "${FILES[#]}"; do
expands to
for file in "2011-09-04 21.43.02.jpg" "2011-09-05 10.23.14.jpg" "2011-09-09 12.31.16.jpg" "2011-09-11 08.43.12.jpg" ; do
The simple rule of thumb is to always use [#] instead of [*] and quote array expansions if you want spaces preserved.
To elaborate on this a little further, the man page in the other answer is explaining that if unquoted, $* an $# behave the same way, but they are different when quoted. So, given
array=(a b c)
Then $* and $# both expand to
a b c
and "$*" expands to
"a b c"
and "$#" expands to
"a" "b" "c"
Not exactly an answer to the quoting/escaping problem of the original question but probably something that would actually have been more useful for the op:
unset FILES
for f in 2011-*.jpg; do FILES+=("$f"); done
echo "${FILES[#]}"
Where of course the expression would have to be adopted to the specific requirement (e.g. *.jpg for all or 2001-09-11*.jpg for only the pictures of a certain day).
For those who prefer set array in oneline mode, instead of using for loop
Changing IFS temporarily to new line could save you from escaping.
OLD_IFS="$IFS"
IFS=$'\n'
array=( $(ls *.jpg) ) #save the hassle to construct filename
IFS="$OLD_IFS"
Escaping works.
#!/bin/bash
FILES=(2011-09-04\ 21.43.02.jpg
2011-09-05\ 10.23.14.jpg
2011-09-09\ 12.31.16.jpg
2011-09-11\ 08.43.12.jpg)
echo ${FILES[0]}
echo ${FILES[1]}
echo ${FILES[2]}
echo ${FILES[3]}
Output:
$ ./test.sh
2011-09-04 21.43.02.jpg
2011-09-05 10.23.14.jpg
2011-09-09 12.31.16.jpg
2011-09-11 08.43.12.jpg
Quoting the strings also produces the same output.
#! /bin/bash
renditions=(
"640x360 80k 60k"
"1280x720 320k 128k"
"1280x720 320k 128k"
)
for z in "${renditions[#]}"; do
echo "$z"
done
OUTPUT
640x360 80k 60k
1280x720 320k 128k
1280x720 320k 128k
`
Another solution is using a "while" loop instead a "for" loop:
index=0
while [ ${index} -lt ${#Array[#]} ]
do
echo ${Array[${index}]}
index=$(( $index + 1 ))
done
If you aren't stuck on using bash, different handling of spaces in file names is one of the benefits of the fish shell. Consider a directory which contains two files: "a b.txt" and "b c.txt". Here's a reasonable guess at processing a list of files generated from another command with bash, but it fails due to spaces in file names you experienced:
# bash
$ for f in $(ls *.txt); { echo $f; }
a
b.txt
b
c.txt
With fish, the syntax is nearly identical, but the result is what you'd expect:
# fish
for f in (ls *.txt); echo $f; end
a b.txt
b c.txt
It works differently because fish splits the output of commands on newlines, not spaces.
If you have a case where you do want to split on spaces instead of newlines, fish has a very readable syntax for that:
for f in (ls *.txt | string split " "); echo $f; end
If the elements of FILES come from another file whose file names are line-separated like this:
2011-09-04 21.43.02.jpg
2011-09-05 10.23.14.jpg
2011-09-09 12.31.16.jpg
2011-09-11 08.43.12.jpg
then try this so that the whitespaces in the file names aren't regarded as delimiters:
while read -r line; do
FILES+=("$line")
done < ./files.txt
If they come from another command, you need to rewrite the last line like this:
while read -r line; do
FILES+=("$line")
done < <(./output-files.sh)
I used to reset the IFS value and rollback when done.
# backup IFS value
O_IFS=$IFS
# reset IFS value
IFS=""
FILES=(
"2011-09-04 21.43.02.jpg"
"2011-09-05 10.23.14.jpg"
"2011-09-09 12.31.16.jpg"
"2011-09-11 08.43.12.jpg"
)
for file in ${FILES[#]}; do
echo ${file}
done
# rollback IFS value
IFS=${O_IFS}
Possible output from the loop:
2011-09-04 21.43.02.jpg
2011-09-05 10.23.14.jpg
2011-09-09 12.31.16.jpg
2011-09-11 08.43.12.jpg

Resources