Bash 4: read a file into an associative array

I am able to read a file into a regular array with a single statement:
local -a ary
readarray -t ary < "$fileName"
What I can't get working is reading a file into an associative array.
I have control over how the file is created, so I would like to do this as simply as possible, ideally without loops.
The file content would look like this, to be read in as key/value pairs:
keyname=valueInfo
But I am willing to replace = with another delimiter if it cuts down on code, especially if it keeps things to a single line as above.
And ...
So would it be possible to read such a file into an associative array using something like an until or from marker, i.e. read into the array until it hits a given word, or would I have to do this as part of a loop?
This would let me keep many similar values in the same file but read them into separate arrays.
I looked at mapfile as well, but it does the same as readarray.
Finally ...
I am creating an options list to select from, as below:
local -a arr=("${!1}")
select option in ${arr[*]}; do
    echo ${option}
    break
done
This works fine; however, the list shown is not sorted. I would like to have it sorted if possible.
I hope it is OK to put all 3 questions into 1, as they are similar and all about arrays.
Thank you.

First thing: associative arrays are declared with -A, not -a:
local -A ary
And if you want to declare a variable with global scope, use declare outside of a function:
declare -A ary
Or, inside a function, use declare -g if BASH_VERSION >= 4.2.
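For example (a minimal sketch; the function name loadConfig is only illustrative):
loadConfig() {
    declare -gA ary    # -g gives the array global scope even inside a function (bash >= 4.2)
    ary[example]=value
}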
If your lines have the form keyname=valueInfo, you can still use readarray and then process the lines like this:
readarray -t lines < "$fileName"
for line in "${lines[@]}"; do
    key=${line%%=*}
    value=${line#*=}
    ary[$key]=$value    ## Or simply: ary[${line%%=*}]=${line#*=}
done
Using a while read loop can also be an option:
while IFS= read -r line; do
    ary[${line%%=*}]=${line#*=}
done < "$fileName"
Or:
while IFS='=' read -r key value; do
    ary[$key]=$value
done < "$fileName"
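To tie this back to the third part of the question (a sorted select menu), here is a minimal sketch; opts.txt is a hypothetical file of key=value lines, and the keys are sorted with sort before being fed to select:
declare -A ary
while IFS='=' read -r key value; do
    ary[$key]=$value
done < opts.txt

# build a sorted list of keys for the menu
mapfile -t sortedKeys < <(printf '%s\n' "${!ary[@]}" | sort)

select option in "${sortedKeys[@]}"; do
    echo "$option -> ${ary[$option]}"
    break
done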

Related

Using array from an output file in bash shell

I need to use an array to set variable values, for further manipulation, from an output file.
Scenario:
1. Fetch the list from the database.
2. Trim the column using sed into a file named x.txt (keeping only the specific value that is required).
3. The file x.txt then contains:
10000
20000
30000
4. I need to set variables and assign the above values to them:
A=10000
B=20000
C=30000
5. I can then use the variables A, B, C for further manipulation.
Please let me know how to define an array and assign values to it from the output file.
Thanks.
In bash (starting from version 4.x) you can use the mapfile command:
mapfile -t myArray < file.txt
see https://stackoverflow.com/a/30988704/10622916
or another answer for older bash versions: https://stackoverflow.com/a/46225812/10622916
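To get the individual variables the question asks for, you can then pick elements out of the array by index (a small sketch following the question's names and the file x.txt):
mapfile -t myArray < x.txt
A=${myArray[0]}   # 10000
B=${myArray[1]}   # 20000
C=${myArray[2]}   # 30000
echo "$A $B $C"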
I am not a big proponent of using arrays in bash (if your code is complex enough to need an array, it's complex enough to need a more robust language), but you can do:
$ unset a
$ unset i
$ declare -a a
$ while read line; do a[$((i++))]="$line"; done < x.txt
(I've left the interactive prompt in place. Remove the leading $
if you want to put this in a script.)

When do I set IFS to a newline in Bash?

I thought setting IFS to $'\n' would help me in reading an entire file into an array, as in:
IFS=$'\n' read -r -a array < file
However, the above command only reads the first line of the file into the first element of the array, and nothing else.
Even this reads only the first line into the array:
string=$'one\ntwo\nthree'
IFS=$'\n' read -r -a array <<< "$string"
I came across other posts on this site that talk about either using mapfile -t or a read loop to read a file into an array.
Now my question is: when do I use IFS=$'\n' at all?
You are a bit confused as to what IFS is. IFS is the Internal Field Separator, used by bash to perform word splitting after expansion. The default value is $' \t\n' (space, tab, newline).
By reassigning IFS=$'\n', you are removing the space and tab and telling bash to split words only on newline characters (your thinking is correct). That has the effect of allowing a line containing spaces to be read into a single array element without quoting.
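A quick way to see the difference (a throwaway example, run in a subshell so the IFS change does not leak):
(
    s=$'one two\nthree four'
    a=( $s )            # default IFS: splits on spaces too
    echo "${#a[@]}"     # 4
    IFS=$'\n'
    b=( $s )            # newline-only IFS: spaces preserved within elements
    echo "${#b[@]}"     # 2
)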
Where your implementation fails is in your read -r -a array < file. The -a causes words in the line to be assigned to sequential array indexes. However, you have told bash to only break on a newline (which is the whole line). Since you only call read once, only one array index is filled.
You can either do:
while IFS=$'\n' read -r line; do
    array+=( $line )
done < "$filename"
(which you could do without changing IFS if you simply quoted "$line")
Or using IFS=$'\n', you could do
IFS=$'\n'
array=( $(<filename) )
or finally, you could use readarray, which does not involve IFS at all:
readarray array <filename
Try them and let me know if you have questions.
Your second try almost works, but you have to tell read that it should not just read until newline (the default behaviour), but for example until the null string:
$ IFS=$'\n' read -a arr -d '' <<< $'a b c\nd e f\ng h i'
$ declare -p arr
declare -a arr='([0]="a b c" [1]="d e f" [2]="g h i")'
But as you pointed out, mapfile/readarray is the way to go if you have it (requires Bash 4.0 or newer):
$ mapfile -t arr <<< $'a b c\nd e f\ng h i'
$ declare -p arr
declare -a arr='([0]="a b c" [1]="d e f" [2]="g h i")'
The -t option removes the newlines from each element.
As for when you'd want to use IFS=$'\n':
As just shown: if you want to read a file into an array, one line per element, your Bash is older than 4.0, and you don't want to use a loop
Some people promote using an IFS without a space to avoid unexpected side effects from word splitting; the proper approach in my opinion, though, is to understand word splitting and make sure to avoid it with proper quoting as desired.
I've seen IFS=$'\n' used in tab completion scripts, for example the one for cd in bash-completion: this script fiddles with paths and replaces colons with newlines, to then split them up using that IFS.

Creating an array from a text file in Bash

A script takes a URL, parses it for the required fields, and redirects its output to be saved in a file, file.txt. The output is saved on a new line each time a field has been found.
file.txt
A Cat
A Dog
A Mouse
etc...
I want to take file.txt and create an array from it in a new script, where every line gets to be its own string variable in the array. So far I have tried:
#!/bin/bash
filename=file.txt
declare -a myArray
myArray=(`cat "$filename"`)
for (( i = 0 ; i < 9 ; i++))
do
    echo "Element [$i]: ${myArray[$i]}"
done
When I run this script, whitespace results in the words getting split, and instead of getting
Desired output
Element [0]: A Cat
Element [1]: A Dog
etc...
I end up getting this:
Actual output
Element [0]: A
Element [1]: Cat
Element [2]: A
Element [3]: Dog
etc...
How can I adjust the loop above so that the entire string on each line corresponds one-to-one with each element of the array?
Use the mapfile command:
mapfile -t myArray < file.txt
The error is using for -- the idiomatic way to loop over lines of a file is:
while IFS= read -r line; do echo ">>$line<<"; done < file.txt
See BashFAQ/005 for more details.
mapfile and readarray (which are synonymous) are available in Bash version 4 and above. If you have an older version of Bash, you can use a loop to read the file into an array:
arr=()
while IFS= read -r line; do
    arr+=("$line")
done < file
In case the file has an incomplete (missing newline) last line, you could use this alternative:
arr=()
while IFS= read -r line || [[ "$line" ]]; do
    arr+=("$line")
done < file
Related:
Need alternative to readarray/mapfile for script on older version of Bash
You can do this too:
oldIFS="$IFS"
IFS=$'\n' arr=($(<file))
IFS="$oldIFS"
echo "${arr[1]}" # It will print `A Dog`.
Note:
Filename expansion still occurs. For example, if there's a line with a literal * it will expand to all the files in the current folder, so use this only if your file is free of that kind of content.
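If you do want to use this approach, one way to guard against that (a small sketch) is to turn globbing off around the assignment:
oldIFS="$IFS"
set -f                      # disable globbing so a literal * stays literal
IFS=$'\n' arr=($(<file))
set +f                      # re-enable globbing
IFS="$oldIFS"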
Use mapfile or read -a
Always check your code using shellcheck. It will often give you the correct answer. In this case SC2207 covers reading a file that either has space separated or newline separated values into an array.
Don't do this
array=( $(mycommand) )
Files with values separated by newlines
mapfile -t array < <(mycommand)
Files with values separated by spaces
IFS=" " read -r -a array <<< "$(mycommand)"
The shellcheck page will give you the rationale why this is considered best practice.
You can simply read each line from the file and assign it to an array.
#!/bin/bash
i=0
while read line
do
    arr[$i]="$line"
    i=$((i+1))
done < file.txt
This answer says to use
mapfile -t myArray < file.txt
I made a shim for mapfile if you want to use mapfile on bash < 4.x for whatever reason. It uses the existing mapfile command if you are on bash >= 4.x
Currently, only options -d and -t work. But that should be enough for that command above. I've only tested on macOS. On macOS Sierra 10.12.6, the system bash is 3.2.57(1)-release. So the shim can come in handy. You can also just update your bash with homebrew, build bash yourself, etc.
It uses this technique to set variables up one call stack.
Make sure to set the internal field separator (IFS) variable to $'\n' so that it does not put each word into a new array entry.
#!/bin/bash
# move all 2020 - 2022 movies to /backup/movies
# put list into file, 1 line per dir
# dirs are "movie name (year)/"
ls | egrep '202[0-2]' > 2020_movies.txt
OLDIFS=${IFS}
IFS=$'\n'                   # fix separator
declare -a MOVIES           # array for dir names
MOVIES=( $( cat "${1}" ) )  # load into array
for M in "${MOVIES[@]}" ; do
    echo "[${M}]"
    if [ -d "${M}" ] ; then # if dir name
        mv -v "$M" /backup/movies/
    fi
done
IFS=${OLDIFS}               # restore standard separators
# not essential, as IFS reverts when the script ends
#END
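For what it's worth, on bash 4+ the same load can be done without touching IFS at all, as other answers here show (a sketch of the equivalent move loop):
mapfile -t MOVIES < "${1}"   # no IFS fiddling needed
for M in "${MOVIES[@]}"; do
    [ -d "${M}" ] && mv -v "$M" /backup/movies/
done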

Splitting string separated by comma into array values in shell script?

My data set (data.txt) looks like this [imageID,sessionID,height1,height2,x,y,crop]:
1,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
2,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
3,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,0
4,0c66824bfbba50ee715658c4e1aeacf6fda7e7ff,1296,4234,194,1536,950
These are the values I wish to use. I'm new to shell scripting :) I read the file line by line like this:
cat $FILENAME | while read LINE
do
    string=($LINE)
    # PROCESSING THE STRING
done
Now, in the code above, after getting the string, I wish to do the following:
1. Split the string into comma-separated values.
2. Store these values in arrays like imageID[], sessionID[].
I need to access these values to do image processing using ImageMagick.
However, I'm not able to perform the above steps correctly.
set -A didn't work for me (it is ksh syntax, not available in bash).
Posting an alternate solution using read -a in case someone needs it:
# init all your individual arrays here
imageId=(); sessionId=()
while IFS=, read -ra arr; do
    imageId+=("${arr[0]}")
    sessionId+=("${arr[1]}")
done < input.csv
# Print your arrays
echo "${imageId[@]}"
echo "${sessionId[@]}"
oIFS="$IFS"; IFS=','
set -A str $string
IFS="$oIFS"
echo "${str[0]}";
echo "${str[1]}";
echo "${str[2]}";
You can split and store like this.
Have a look here for more on Unix arrays.

Bash: read lines into an array *without* touching IFS

I'm trying to read the lines of output from a subshell into an array, and I'm not willing to set IFS because it's global. I don't want one part of the script to affect the following parts, because that's poor practice and I refuse to do it. Reverting IFS after the command is not an option because it's too much trouble to keep the reversion in the right place after editing the script. How can I explain to bash that I want each array element to contain an entire line, without having to set any global variables that will destroy future commands?
Here's an example showing the unwanted stickiness of IFS:
lines=($(egrep "^-o" speccmds.cmd))
echo "${#lines[@]} lines without IFS"
IFS=$'\r\n' lines=($(egrep "^-o" speccmds.cmd))
echo "${#lines[@]} lines with IFS"
lines=($(egrep "^-o" speccmds.cmd))
echo "${#lines[@]} lines without IFS?"
The output is:
42 lines without IFS
6 lines with IFS
6 lines without IFS?
This question is probably based on a misconception.
IFS=foo read does not change IFS outside of the read operation itself.
Thus, this would have side effects, and should be avoided:
IFS=
declare -a array
while read -r; do
    array+=( "$REPLY" )
done < <(your-subshell-here)
...but this is perfectly side-effect free:
declare -a array
while IFS= read -r; do
    array+=( "$REPLY" )
done < <(your-subshell-here)
With bash 4.0 or newer, there's also the option of readarray or mapfile (synonyms for the same operation):
mapfile -t array < <(your-subshell-here)
In examples later added to your question, you have code along the lines of:
lines=($(egrep "^-o" speccmds.cmd))
The better way to write this is:
mapfile -t lines < <(egrep "^-o" speccmds.cmd)
Are you trying to store the lines of the output in an array, or the words of each line?
lines
mapfile -t arrayname < <(your subshell)
This does not use IFS at all.
words
(your subshell) | while IFS=: read -ra words; do ...
The form var=value command args... puts the var variable into the environment of the command, and does not affect the current shell's environment.
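A quick way to convince yourself of that:
IFS=: read -ra parts <<< "a:b:c"
echo "${parts[1]}"       # b
printf '%q\n' "$IFS"     # $' \t\n' -- IFS in the current shell is unchanged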
