Issue storing awk output in bash array

I have below file:
Site is facebook.
Site is microsoft.
Site is google.
And below script:
#!/bin/bash
#tried arr=$(awk {'print'} test) which gives array length as 1
arr=($(awk {'print'} test))
echo "Length ::: ${#arr[@]}"
The expected output is 3, but I get an array length of 9. This is just an excerpt from a larger script, and I need to use awk here.
Please let me know where the issue is.

This is the correct way to build a shell array of one entry per line from output of awk (requires bash 4+):
readarray -t arr < <(awk '1' file)
declare -p arr
declare -a arr=([0]="Site is facebook." [1]="Site is microsoft." [2]="Site is google.")
When you use this code:
arr=($(awk '1' file))
Then the shell splits the awk output on its default delimiters (space, tab, and newline) and assigns each word to a separate array entry, which is why the length comes out as 9.
Having said that, bear in mind that awk can do most of the text processing the shell can do, and it is usually better to process your data in awk itself.
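A minimal sketch of the difference between the two forms, assuming a temporary file that holds the three lines from the question (the file and the variable names words and lines are made up for illustration):

```bash
#!/bin/bash
# Hypothetical temp file standing in for the question's input file.
tmp=$(mktemp)
printf '%s\n' 'Site is facebook.' 'Site is microsoft.' 'Site is google.' > "$tmp"

# Unquoted command substitution: the output is split on spaces, tabs and newlines.
words=($(awk '1' "$tmp"))

# readarray -t: one element per line, trailing newline removed (bash 4+).
readarray -t lines < <(awk '1' "$tmp")

echo "words: ${#words[@]}"   # words: 9
echo "lines: ${#lines[@]}"   # lines: 3
rm -f "$tmp"
```

Three lines of three words each explain the 9 seen in the question.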

Related

Trouble with AWK'd command output and bash array

I am attempting to get a list of running VirtualBox VMs (the UUIDs) and put them into an array. The command below produces the output below:
$ VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}'
f93c17ca-ab1b-4ba2-95e5-a1b0c8d70d2a
46b285c3-cabd-4fbb-92fe-c7940e0c6a3f
83f4789a-b55b-4a50-a52f-dbd929bdfe12
4d1589ba-9153-489a-947a-df3cf4f81c69
I would like to take those UUIDs and put them into an array (possibly even an associative array for later use, but a simple array for now is sufficient)
If I do the following:
array1="( $(VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}') )"
the commands
array1_len=${#array1[@]}
echo $array1_len
output "1", meaning there is only one element. If I print the elements:
echo ${array1[*]}
I get a single line containing all the UUIDs:
( f93c17ca-ab1b-4ba2-95e5-a1b0c8d70d2a 46b285c3-cabd-4fbb-92fe-c7940e0c6a3f 83f4789a-b55b-4a50-a52f-dbd929bdfe12 4d1589ba-9153-489a-947a-df3cf4f81c69 )
I did some research (Bash Guide/Arrays) on how to tackle this and found this approach using process substitution and redirection, but it produces an empty array:
while read -r -d '\0'; do
array2+=("$REPLY")
done < <(VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}')
I'm obviously missing something. I've looked at several similar questions on this site, such as:
Reading output of command into array in Bash
AWK output to bash Array
Creating an Array in Bash with Quoted Entries from Command Output
Unfortunately, none have helped. I would appreciate any assistance in figuring out how to take the output and assign it to an array.
I am running this on macOS 10.11.6 (El Capitan) with bash version 3.2.57.

Since you're on a Mac:
brew install bash
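Then, with this bash as your shell, feed the output to readarray through process substitution (with a plain pipe, readarray would run in a subshell and the array would be lost when the pipeline ends):
readarray -t array1 < <(VBoxManage list runningvms | awk -F '[{}]' '{print $(NF-1)}')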
Of the -t option, the man page says:
-t Remove a trailing delim (default newline) from each line read.
If the bash 4 solution is admissible, then the advice given e.g. by gniourf_gniourf in "Reading output of command into array in Bash" is still sound.
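For the stock bash 3.2, where readarray is not available, a plain while read loop is one option. The sketch below is an illustration only: list_vms is a hypothetical stand-in for VBoxManage list runningvms, and the VM names in it are made up. As far as I can tell, the loop in the question stays empty because read uses only the first character of the -d argument, so -d '\0' waits for a backslash that never arrives, and read fails before the loop body runs.

```bash
#!/bin/bash
# Hypothetical stand-in for `VBoxManage list runningvms`; the VM names are invented.
list_vms() {
  printf '%s\n' \
    '"vm one" {f93c17ca-ab1b-4ba2-95e5-a1b0c8d70d2a}' \
    '"vm two" {46b285c3-cabd-4fbb-92fe-c7940e0c6a3f}'
}

array1=()
# read stops at each newline by default, so no -d option is needed.
while IFS= read -r uuid; do
  array1+=("$uuid")
done < <(list_vms | awk -F '[{}]' '{print $(NF-1)}')

echo "${#array1[@]}"            # number of UUIDs collected
printf '%s\n' "${array1[@]}"    # one UUID per line
```

Because the loop reads from a process substitution rather than a pipe, array1 is filled in the current shell and is still set after the loop.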

Creating an array from a text file in Bash

A script takes a URL, parses it for the required fields, and redirects its output to be saved in a file, file.txt. The output is saved on a new line each time a field has been found.
file.txt
A Cat
A Dog
A Mouse
etc...
I want to take file.txt and create an array from it in a new script, where every line gets to be its own string variable in the array. So far I have tried:
#!/bin/bash
filename=file.txt
declare -a myArray
myArray=(`cat "$filename"`)
for (( i = 0 ; i < 9 ; i++))
do
echo "Element [$i]: ${myArray[$i]}"
done
When I run this script, whitespace results in words getting split and instead of getting
Desired output
Element [0]: A Cat
Element [1]: A Dog
etc...
I end up getting this:
Actual output
Element [0]: A
Element [1]: Cat
Element [2]: A
Element [3]: Dog
etc...
How can I adjust the script above so that the entire string on each line corresponds one-to-one with an element of the array?

Use the mapfile command:
mapfile -t myArray < file.txt
The error is using for -- the idiomatic way to loop over lines of a file is:
while IFS= read -r line; do echo ">>$line<<"; done < file.txt
See BashFAQ/005 for more details.

mapfile and readarray (which are synonymous) are available in Bash version 4 and above. If you have an older version of Bash, you can use a loop to read the file into an array:
arr=()
while IFS= read -r line; do
arr+=("$line")
done < file
In case the file has an incomplete (missing newline) last line, you could use this alternative:
arr=()
while IFS= read -r line || [[ "$line" ]]; do
arr+=("$line")
done < file
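A small sketch of why the extra test matters, assuming a temporary file whose last line has no trailing newline (the array names plain and safe are invented for the comparison):

```bash
#!/bin/bash
# Hypothetical input: three lines, the last one without a trailing newline.
tmp=$(mktemp)
printf 'A Cat\nA Dog\nA Mouse' > "$tmp"

plain=()
while IFS= read -r line; do
  plain+=("$line")
done < "$tmp"

safe=()
# read fails on the unterminated last line but still fills $line,
# so the extra test lets the loop body run one more time.
while IFS= read -r line || [[ "$line" ]]; do
  safe+=("$line")
done < "$tmp"

echo "plain: ${#plain[@]}"   # plain: 2
echo "safe: ${#safe[@]}"     # safe: 3
rm -f "$tmp"
```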
Related:
Need alternative to readarray/mapfile for script on older version of Bash

You can do this too:
oldIFS="$IFS"
IFS=$'\n' arr=($(<file))
IFS="$oldIFS"
echo "${arr[1]}" # It will print `A Dog`.
Note:
Filename expansion still occurs. For example, if there's a line with a literal * it will expand to all the files in current folder. So use it only if your file is free of this kind of scenario.
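One way to keep this approach while avoiding filename expansion is to switch globbing off with set -f around the assignment. The sketch below assumes a scratch directory with a few made-up files so that the effect of a literal * is visible:

```bash
#!/bin/bash
# Hypothetical scratch directory; file names are invented for the demo.
orig=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
touch one.txt two.txt
printf '%s\n' 'A Cat' '*' 'A Dog' > list

oldIFS=$IFS
IFS=$'\n'
globbed=($(<list))   # the '*' line expands to: list one.txt two.txt
set -f               # disable filename expansion
literal=($(<list))   # the '*' line stays a literal '*'
set +f
IFS=$oldIFS

echo "globbed: ${#globbed[@]}"   # globbed: 5
echo "literal: ${#literal[@]}"   # literal: 3
cd "$orig" && rm -rf "$dir"
```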

Use mapfile or read -a
Always check your code using shellcheck. It will often give you the correct answer. In this case SC2207 covers reading a file that either has space separated or newline separated values into an array.
Don't do this
array=( $(mycommand) )
Files with values separated by newlines
mapfile -t array < <(mycommand)
Files with values separated by spaces
IFS=" " read -r -a array <<< "$(mycommand)"
The shellcheck page will give you the rationale why this is considered best practice.
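Both recommended forms can be tried with stand-in commands. In this sketch lines_cmd and words_cmd are hypothetical functions playing the role of mycommand:

```bash
#!/bin/bash
# Hypothetical commands standing in for "mycommand".
lines_cmd() { printf '%s\n' 'first item' 'second item'; }
words_cmd() { printf 'alpha beta gamma\n'; }

# Newline-separated values: one element per line (bash 4+).
mapfile -t by_line < <(lines_cmd)

# Space-separated values on a single line.
IFS=" " read -r -a by_space <<< "$(words_cmd)"

echo "${#by_line[@]} ${#by_space[@]}"   # 2 3
echo "${by_line[0]}"                    # first item
```

Neither form performs filename expansion on the values, which is one of the problems with the unquoted array=( $(mycommand) ) form.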

You can simply read each line from the file and assign it to an array.
#!/bin/bash
i=0
# IFS= and -r keep surrounding whitespace and backslashes intact
while IFS= read -r line
do
arr[$i]="$line"
i=$((i+1))
done < file.txt

This answer says to use
mapfile -t myArray < file.txt
I made a shim for mapfile if you want to use mapfile on bash < 4.x for whatever reason. It uses the existing mapfile command if you are on bash >= 4.x
Currently, only options -d and -t work. But that should be enough for that command above. I've only tested on macOS. On macOS Sierra 10.12.6, the system bash is 3.2.57(1)-release. So the shim can come in handy. You can also just update your bash with homebrew, build bash yourself, etc.
It uses this technique to set variables up one call stack.

Make sure to set the Internal Field Separator (IFS)
variable to $'\n' so that the shell does not put each word
into a new array entry.
#!/bin/bash
# move all 2020 - 2022 movies to /backup/movies
# put list into file 1 line per dir
# dirs are "movie name (year)/"
ls | egrep '202[0-2]' > 2020_movies.txt
OLDIFS=${IFS}
IFS=$'\n' # fix separator
declare -a MOVIES # array for dir names
MOVIES=( $( cat "${1}" ) ) # load into array; the list file is passed as $1
for M in "${MOVIES[@]}" ; do
echo "[${M}]"
if [ -d "${M}" ] ; then # if dir name
mv -v "$M" /backup/movies/
fi
done
IFS=${OLDIFS} # restore standard separators
# not essential as IFS reverts when script ends
#END
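As an alternative to the IFS technique above, and not what this answer uses, a glob can fill the array directly, which avoids both the list file and any changes to IFS. This is only a sketch under stated assumptions: the source and destination are temporary directories, and the movie names are invented.

```bash
#!/bin/bash
# Hypothetical source and destination directories with made-up movie names.
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/Movie A (2021)" "$src/Movie B (2019)" "$src/Movie C (2022)"

# The trailing slash makes the glob match directories only.
movies=("$src"/*'('202[0-2]')'/)

moved=()
for m in "${movies[@]}"; do
  [ -d "$m" ] || continue            # skips the literal pattern if nothing matched
  mv -v "${m%/}" "$dest"/
  moved+=("$(basename "${m%/}")")
done

printf '[%s]\n' "${moved[@]}"
rm -rf "$src" "$dest"
```

Each directory name stays a single array element regardless of the spaces in it, because glob results are never word-split.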

How to store elements with whitespace in an array?

Just wondering, assuming I am storing my data in a file called BookDB.txt in the following format :
C++ for dummies:Jared:10.52:5:6
Java for dummies:David:10.65:4:6
whereby each field is separated by the delimiter ":".
How would I preserve whitespace in the first field and have an array with the following contents : ('C++ for dummies' 'Java for dummies')?
Any help is very much appreciated!

Ploutox's solution is almost correct, but without setting IFS, you will not get the array that you seek, with two elements in this case.
Note: He corrected his solution after this post.
IFS=$'\n'; arr=( $(awk -F':' '{print $1 }' Input.txt ) )
echo "${#arr[@]}"
echo "${arr[0]}"
echo "${arr[1]}"
Output:
2
C++ for dummies
Java for dummies

Just use a while loop:
#!/bin/bash
# create and populate the array
a=()
while IFS=':' read -r field _
do
a+=("$field")
done < file
# print the array contents
printf "%s\n" "${a[@]}"
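The same loop can pick up more than one field. The sketch below assumes a temporary file with the two records from the question; the array names titles and authors are made up:

```bash
#!/bin/bash
# Hypothetical temp file holding the two records from the question.
tmp=$(mktemp)
printf '%s\n' \
  'C++ for dummies:Jared:10.52:5:6' \
  'Java for dummies:David:10.65:4:6' > "$tmp"

titles=()
authors=()
# With IFS=':' read splits on colons only, so spaces inside a field survive.
# The trailing _ swallows the remaining fields.
while IFS=':' read -r title author _; do
  titles+=("$title")
  authors+=("$author")
done < "$tmp"

printf '%s\n' "${titles[@]}"
echo "${#titles[@]}"   # 2
rm -f "$tmp"
```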

I totally misunderstood your question on my first attempt to answer. awk seems better suited to your need, though. You can get what you want with simple scripting:
IFS=$'\n'; MYARRAY=($(awk -F ":" '{print $1}' myfile))
The -F flag sets : as the field separator.
echo "${MYARRAY[0]}" will print:
C++ for dummies

$ yes sed -i "s/:/\'\'/" BookDB.txt | head -n100 | bash
This command will work. It is a Linux command; run it in a shell from the same directory as BookDB.txt. Note that sed -i rewrites BookDB.txt in place.

Splitting string separated by comma into array values in shell script?

My data set(data.txt) looks like this [imageID,sessionID,height1,height2,x,y,crop]:
1,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
2,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
3,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
4,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,950
This is the set of values I wish to use. I'm new to shell scripting :) I read the file line by line like this:
cat $FILENAME | while read LINE
do
string=($LINE)
# PROCESSING THE STRING
done
Now, in the code above, after getting the string, I wish to do the following :
1. Split the string into comma separated values.
2. Store these variables into arrays like imageID[],sessionID[].
I need to access these values for doing image processing using imagemagick.
However, I'm not able to perform the above steps correctly.

set -A doesn't work for me (it is ksh syntax, which bash does not support, including the older bash on OS X).
Posting an alternate solution using read -a in case someone needs it:
# init all your individual arrays here
imageId=(); sessionId=();
while IFS=, read -ra arr; do
imageId+=("${arr[0]}")
sessionId+=("${arr[1]}")
done < input.csv
# Print your arrays
echo "${imageId[@]}"
echo "${sessionId[@]}"
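If every column is needed, read can assign the fields to named variables directly instead of going through an intermediate array. A minimal sketch, assuming a temporary file with two of the question's rows (the short variable names are invented):

```bash
#!/bin/bash
# Hypothetical temp file with two rows from the question's data set.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
4,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,950
EOF

imageID=(); sessionID=(); crop=()
# One variable per column: imageID,sessionID,height1,height2,x,y,crop
while IFS=, read -r id session h1 h2 x y c; do
  imageID+=("$id")
  sessionID+=("$session")
  crop+=("$c")
done < "$tmp"

# Walk the parallel arrays by index.
for i in "${!imageID[@]}"; do
  echo "image ${imageID[i]} crop ${crop[i]}"
done
rm -f "$tmp"
```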

oIFS="$IFS"; IFS=','
set -A str $string
IFS="$oIFS"
echo "${str[0]}";
echo "${str[1]}";
echo "${str[2]}";
You can split and store the values like this. Note that set -A is ksh syntax and is not available in bash.
Have a look here for more on Unix arrays.
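In bash, a rough equivalent of the set -A snippet is read -a with a here-string. This sketch uses the first row of the question's data as the sample string:

```bash
#!/bin/bash
# Sample line taken from the question's data set.
string='1,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0'

# IFS is changed only for this one read command, so nothing needs restoring.
IFS=',' read -r -a str <<< "$string"

echo "${str[0]}"     # 1
echo "${str[2]}"     # 1296
echo "${#str[@]}"    # 7
```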

Read lines from a file into a Bash array [duplicate]

I am trying to read a file containing lines into a Bash array.
I have tried the following so far:
Attempt1
a=( $( cat /path/to/filename ) )
Attempt2
index=0
while read line ; do
MYARRAY[$index]="$line"
index=$(($index+1))
done < /path/to/filename
Both attempts only return a one element array containing the first line of the file. What am I doing wrong?
I am running bash 4.1.5

The readarray command (also spelled mapfile) was introduced in bash 4.0.
readarray -t a < /path/to/filename

Latest revision, based on BinaryZebra's comment and tested here. Adding command eval keeps the array assignment in the current execution environment, while the variable assignments placed before it last only for the duration of the eval.
Use an $IFS that has no spaces or tabs, just newlines/CR:
$ IFS=$'\r\n' GLOBIGNORE='*' command eval 'XYZ=($(cat /etc/passwd))'
$ echo "${XYZ[5]}"
sync:x:5:0:sync:/sbin:/bin/sync
Also note that you may be setting the array just fine but reading it wrong - be sure to use both double-quotes "" and braces {} as in the example above
Edit:
Please note the many warnings about my answer in comments about possible glob expansion, specifically gniourf-gniourf's comments about my prior attempts to work around
With all those warnings in mind I'm still leaving this answer here (yes, bash 4 has been out for many years but I recall that some macs only 2/3 years old have pre-4 as default shell)
Other notes:
Can also follow drizzt's suggestion below and replace a forked subshell+cat with
$(</etc/passwd)
The other option I sometimes use is just set IFS into XIFS, then restore after. See also Sorpigal's answer which does not need to bother with this

The simplest way to read each line of a file into a bash array is this:
IFS=$'\n' read -d '' -r -a lines < /etc/passwd
Now just index in to the array lines to retrieve each line, e.g.
printf "line 1: %s\n" "${lines[0]}"
printf "line 5: %s\n" "${lines[4]}"
# all lines
echo "${lines[@]}"
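Two details of this idiom are worth checking before relying on it: read returns a non-zero status because it reaches end of file without finding a NUL delimiter, and empty lines are dropped because newline counts as IFS whitespace. A small sketch comparing it with mapfile, using a made-up three-line input:

```bash
#!/bin/bash
# Hypothetical input with an empty line in the middle.
tmp=$(mktemp)
printf 'one\n\nthree\n' > "$tmp"

status=0
IFS=$'\n' read -d '' -r -a lines < "$tmp" || status=$?

mapfile -t kept < "$tmp"   # bash 4+, keeps the empty line

echo "read -a: ${#lines[@]} elements, exit status $status"   # 2 elements, status 1
echo "mapfile: ${#kept[@]} elements"                         # 3 elements
rm -f "$tmp"
```

The non-zero status matters mainly in scripts that run under set -e.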

An alternative, if the file contains one string per line and the strings contain no spaces:
fileItemString=$(cat filename |tr "\n" " ")
fileItemArray=($fileItemString)
Check:
Print the whole array:
echo "${fileItemArray[*]}"
Length=${#fileItemArray[@]}

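Your first attempt was close. Here is a simple approach using your idea. Note that the for loop splits on any whitespace, so it yields one item per line only when the lines contain no spaces or tabs.
file="somefileondisk"
lines=`cat "$file"`
for line in $lines; do
echo "$line"
done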

#!/bin/bash
IFS=$'\n' read -d '' -r -a inlines < testinput
IFS=$'\n' read -d '' -r -a outlines < testoutput
counter=0
cat testinput | while read line;
do
echo "$((${inlines[$counter]}-${outlines[$counter]}))"
counter=$(($counter+1))
done
# OR Do like this
counter=0
readarray a < testinput
readarray b < testoutput
cat testinput | while read myline;
do
echo value is: $((${a[$counter]}-${b[$counter]}))
counter=$(($counter+1))
done
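Once both files are in arrays, the loop can iterate over the array indices instead of piping the file through cat again. A pipeline also runs the while loop in a subshell, so variables changed inside it are not visible afterwards. The sketch below assumes two small temporary files of integers; the file contents and the diffs array are invented for illustration:

```bash
#!/bin/bash
# Hypothetical input files with one integer per line.
infile=$(mktemp)
outfile=$(mktemp)
printf '%s\n' 10 20 30 > "$infile"
printf '%s\n' 1 2 3 > "$outfile"

mapfile -t a < "$infile"
mapfile -t b < "$outfile"

diffs=()
# "${!a[@]}" expands to the indices of a, so no separate counter is needed.
for i in "${!a[@]}"; do
  diffs+=("$(( a[i] - b[i] ))")
done

printf '%s\n' "${diffs[@]}"   # 9, 18, 27 on separate lines
rm -f "$infile" "$outfile"
```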