Creating an array from a text file in Bash - arrays

A script takes a URL, parses it for the required fields, and redirects its output to be saved in a file, file.txt. The output is saved on a new line each time a field has been found.
file.txt
A Cat
A Dog
A Mouse
etc...
I want to take file.txt and create an array from it in a new script, where every line gets to be its own string variable in the array. So far I have tried:
#!/bin/bash
filename=file.txt
declare -a myArray
myArray=(`cat "$filename"`)
for (( i = 0 ; i < 9 ; i++))
do
echo "Element [$i]: ${myArray[$i]}"
done
When I run this script, whitespace results in words getting split and instead of getting
Desired output
Element [0]: A Cat
Element [1]: A Dog
etc...
I end up getting this:
Actual output
Element [0]: A
Element [1]: Cat
Element [2]: A
Element [3]: Dog
etc...
How can I adjust the loop below such that the entire string on each line will correspond one-to-one with each variable in the array?

Use the mapfile command:
mapfile -t myArray < file.txt
The error is using for -- the idiomatic way to loop over lines of a file is:
while IFS= read -r line; do echo ">>$line<<"; done < file.txt
See BashFAQ/005 for more details.

mapfile and readarray (which are synonymous) are available in Bash version 4 and above. If you have an older version of Bash, you can use a loop to read the file into an array:
arr=()
while IFS= read -r line; do
arr+=("$line")
done < file
In case the file has an incomplete (missing newline) last line, you could use this alternative:
arr=()
while IFS= read -r line || [[ "$line" ]]; do
arr+=("$line")
done < file
Related:
Need alternative to readarray/mapfile for script on older version of Bash

You can do this too:
oldIFS="$IFS"
IFS=$'\n' arr=($(<file))
IFS="$oldIFS"
echo "${arr[1]}" # It will print `A Dog`.
Note:
Filename expansion still occurs. For example, if there's a line with a literal * it will expand to all the files in current folder. So use it only if your file is free of this kind of scenario.

Use mapfile or read -a
Always check your code using shellcheck. It will often give you the correct answer. In this case SC2207 covers reading a file that either has space separated or newline separated values into an array.
Don't do this
array=( $(mycommand) )
Files with values separated by newlines
mapfile -t array < <(mycommand)
Files with values separated by spaces
IFS=" " read -r -a array <<< "$(mycommand)"
The shellcheck page will give you the rationale why this is considered best practice.

You can simply read each line from the file and assign it to an array.
#!/bin/bash
i=0
while read line
do
arr[$i]="$line"
i=$((i+1))
done < file.txt

This answer says to use
mapfile -t myArray < file.txt
I made a shim for mapfile if you want to use mapfile on bash < 4.x for whatever reason. It uses the existing mapfile command if you are on bash >= 4.x
Currently, only options -d and -t work. But that should be enough for that command above. I've only tested on macOS. On macOS Sierra 10.12.6, the system bash is 3.2.57(1)-release. So the shim can come in handy. You can also just update your bash with homebrew, build bash yourself, etc.
It uses this technique to set variables up one call stack.

Make sure set the Internal File Separator (IFS)
variable to $'\n' so that it does not put each word
into a new array entry.
#!/bin/bash
# move all 2020 - 2022 movies to /backup/movies
# put list into file 1 line per dir
# dirs are "movie name (year)/"
ls | egrep 202[0-2] > 2020_movies.txt
OLDIFS=${IFS}
IFS=$'\n' #fix separator
declare -a MOVIES # array for dir names
MOVIES=( $( cat "${1}" ) ) // load into array
for M in ${MOVIES[#]} ; do
echo "[${M}]"
if [ -d "${M}" ] ; then # if dir name
mv -v "$M" /backup/movies/
fi
done
IFS=${OLDIFS} # restore standard separators
# not essential as IFS reverts when script ends
#END

Related

Creating an array of Strings from Grep Command

I'm pretty new to Linux and I've been trying some learning recently. One thing I'm struggling is Within a log file I would like to grep for all the unique IDs that exist and store them in an array.
The format of the ids are like so id=12345678,
I'm struggling though to get these in to an array. So far I've tried a range of things, the below however
a=($ (grep -HR1 `id=^[0-9]' logfile))
echo ${#a[#]}
but the echo count is always returned as 0. So it is clear the populating of the array is not working. Have explored other pages online, but nothing seems to have a clear explanation of what I am looking for exactly.
a=($(grep -Eow 'id=[0-9]+' logfile))
a=("${a[#]#id=}")
printf '%s\n' "${a[#]}"
It's safe to split an unquoted command substitution here, as we aren't printing pathname expansion characters (*?[]), or whitespace (other than the new lines which delimit the list).
If this were not the case, mapfile -t a <(grep ...) is a good alternative.
-E is extended regex (for +)
-o prints only matching text
-w matches a whole word only
${a[#]#id=} strips the id suffix from each array element
Here is an example
my_array=()
while IFS= read -r line; do
my_array+=( "$line" )
done < <( ls )
echo ${#my_array[#]}
printf '%s\n' "${my_array[#]}"
It prints out 14 and then the names of the 14 files in the same folder. Just substitute your command instead of ls and you started.
Suggesting readarray command to make sure it array reads full lines.
readarray -t my_array < <(grep -HR1 'id=^[0-9]' logfile)
printf "%s\n" "${my_array[#]}"

Why doesn't this split string expression give me an array?

I am wondering why this array expression in Bash doesn't give me an array. It just gives me the first element in the string:
IFS='\n' read -r -a POSSIBLE_ENCODINGS <<< $(iconv -l)
I want to try out all available encodings to see how reading different file encodings for a script in R works, and I am using this Bash-script to create text-files with all possible encodings:
#!/bin/bash
IFS='\n' read -r -a POSSIBLE_ENCODINGS <<< $(iconv -l)
echo "${POSSIBLE_ENCODINGS[#]}"
for CURRENT_ENCODING in "${POSSIBLE_ENCODINGS[#]}"
do
TRIMMED=$(echo $CURRENT_ENCODING | sed 's:/*$::')
iconv --verbose --from-code=UTF-8 --to-code="$TRIMMED" --output=encoded-${TRIMMED}.txt first_file.txt
echo "Current encoding: ${TRIMMED}"
echo "Output file:encoded-${TRIMMED}.txt"
done
EDIT: Code edited according to answers below:
#!/bin/bash
readarray -t possibleEncodings <<< "$(iconv -l)"
echo "${possibleEncodings[#]}"
for currentEncoding in "${possibleEncodings[#]}"
do
trimmedEncoding=$(echo $currentEncoding | sed 's:/*$::')
echo "Trimmed encoding: ${trimmedEncoding}"
iconv --verbose --from-code=UTF-8 --to-code="$trimmedEncoding" --output=encoded-${trimmedEncoding}.txt first_file.txt
echo "Current encoding: ${trimmedEncoding}"
echo "Output file:encoded-${trimmedEncoding}.txt"
done
You could just readarray/mapfile instead which are tailor made for reading multi-line output into an array.
mapfile -t possibleEncodings < <(iconv -l)
The here-strings are useless, when you can just run the command in a process-substitution model. The <() puts the command output as if it appears on a file for mapfile to read from.
As for why your original attempt didn't work, you are just doing the read call once, but there is still strings to read in the subsequent lines. You either need to read till EOF in a loop or use the mapfile as above which does the job for you.
As a side-note always use lowercase letters for user defined variable/array and function names. This lets you distinguish your variables from the shell's own environment variables which are upper-cased.
because read reads only one line, following while can be used
arr=()
while read -r line; do
arr+=( "$line" )
done <<< "$(iconv -l)"
otherwise, there is also readarray builtin
readarray -t arr <<< "$(iconv -l)"

How do I store the output from a find command in an array? + bash

I have the following find command with the following output:
$ find -name '*.jpg'
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
./public_html/github/screencasts-gh-pages/introToBackbone/presentation/images/telescope.jpg
./public_html/github/StarCraft-master/img/Maps/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Snapshot.jpg
./public_html/github/StarCraft-master/img/Maps/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(6)Thin Ice.jpg
./public_html/github/StarCraft-master/img/Maps/Original/Map_Grass.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)TheHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Volcanis.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(3)Trench wars.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)BigGameHunters.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(8)Turbo.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Blood Bath.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(2)Switchback.jpg
./public_html/github/StarCraft-master/img/Maps/Original/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Maps/(4)Orbital Relay.jpg
./public_html/github/StarCraft-master/img/Bg/GameLose.jpg
./public_html/github/StarCraft-master/img/Bg/GameWin.jpg
./public_html/github/StarCraft-master/img/Bg/GameStart.jpg
./public_html/github/StarCraft-master/img/Bg/GamePlay.jpg
./public_html/github/StarCraft-master/img/Demo/Demo.jpg
./public_html/github/flot/examples/image/hs-2004-27-a-large-web.jpg
./public_html/github/minicourse-ajax-project/other/GameLose.jpg
How do I store this output in an array? I want it to handle filenames with spaces
I have tried this arrayname=($(find -name '*.jpg')) but this just stores the first element. # I am doing the following which seems to be just the first element?
$ arrayname=($(find -name '*.jpg'))
$ echo "$arrayname"
./public_html/github/screencasts-gh-pages/reactiveDataVis/presentation/images/telescope.jpg
$
I have tried here but again this just stores the 1st element
Other similar Qs
How do I capture the output from the ls or find command to store all file names in an array?
How do i store the output of a bash command in a variable?
If you know with certainty that your filenames will not contain newlines, then
mapfile -t arrayname < <(find ...)
If you want to be able to handle any file
arrayname=()
while IFS= read -d '' -r filename; do
arrayname+=("$filename")
done < <(find ... -print0)
echo "$arrayname" will only show the first element of the array. It is equivalent to echo "${arrayname[0]}". To dump an array:
printf "%s\n" "${arrayname[#]}"
# ............^^^^^^^^^^^^^^^^^ must use exactly this form, with the quotes.
arrayname=($(find ...)) is still wrong. It will store the file ./file with spaces.txt as 3 separate elements in the array.
If you have a sufficiently recent version of bash, you can save yourself a lot of trouble by just using a ** glob.
shopt -s globstar
files=(**/*.jpg)
The first line enables the feature. Once enabled, ** in a glob pattern will match any number (including 0) of directories in the path.
Using the glob in the array definition makes sure that whitespace is handled correctly.
To view an array in a form which could be used to define the array, use the -p (print) option to the declare builtin:
declare -p files

Read file into array with empty lines

I'm using this code to load file into array in bash:
IFS=$'\n' read -d '' -r -a LINES < "$PAR1"
But unfortunately this code skips empty lines.
I tried the next code:
IFS=$'\n' read -r -a LINES < "$PAR1"
But this variant only loads one line.
How do I load file to array in bash, without skipping empty lines?
P.S. I check the number of loaded lines by the next command:
echo ${#LINES[#]}
You can use mapfile available in BASH 4+
mapfile -t lines < "$PAR1"
To avoid doing anything fancy, and stay compatible with all versions of bash in common use (as of this writing, Apple is shipping bash 3.2.x to avoid needing to comply with the GPLv3):
lines=( )
while IFS= read -r line; do
lines+=( "$line" )
done
See also BashFAQ #001.

How to write an array ignoring space characters in shell scripting?

I have a text file which consists of say ..following information say test.text:
an
apple of
one's eye
I want to read these lines in an array using shell scripting by doing a cat test.text. I have tried using a=(`cat test.text`), but that doesn't work as it considers space as a delimiter. I need the values as a[0]=an , a[1]=apple of , a[2]=one's eye. I don't want to use IFS. Need help, thanks in advance..!!
In bash 4 or later
readarray a < test.text
This will include an empty element for each blank line, so you might want to remove the empty lines from the input file first.
In earlier versions, you'll need to build the array manually.
a=()
while read; do a+=("$REPLY"); done < test.text
One of various options you have is to use read with bash. Set IFS to the newline and line separator to NUL
IFS=$'\n' read -d $'\0' -a a < test.txt
Plain sh
IFS='
'
set -- $(< test.txt)
unset IFS
echo "$1"
echo "$2"
echo "$#"
bash
IFS=$'\n' a=($(< test.txt))
echo "${a[0]}"
echo "${a[1]}"
echo "${a[#]}"
I'm inclined to say these are the best of the available solutions because they do not involve looping.
Let's say:
cat file
an
apple of
one's eye
Use this while loop:
arr=()
while read -r l; do
[[ -n "$l" ]] && arr+=("$l")
done < file
TEST
set | grep arr
arr=([0]="an" [1]="apple of" [2]="one's eye")

Resources