Read lines from a file into a Bash array [duplicate] - arrays

This question already has answers here:
Creating an array from a text file in Bash
(7 answers)
Closed 7 years ago.
I am trying to read a file containing lines into a Bash array.
I have tried the following so far:
Attempt1
a=( $( cat /path/to/filename ) )
Attempt2
index=0
while read line ; do
MYARRAY[$index]="$line"
index=$(($index+1))
done < /path/to/filename
Both attempts only return a one element array containing the first line of the file. What am I doing wrong?
I am running bash 4.1.5

The readarray command (also spelled mapfile) was introduced in bash 4.0.
readarray -t a < /path/to/filename
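As a minimal sketch of how this behaves, the snippet below writes a small throwaway file and loads it with readarray -t. The temporary file and its contents are made up for illustration; only readarray -t comes from the answer above.

```shell
#!/usr/bin/env bash
# Create a throwaway sample file; its contents are invented for this example.
tmpfile=$(mktemp)
printf '%s\n' 'first line' 'second  line' '* not a glob' > "$tmpfile"

# -t removes the trailing newline from every element.
# Lines are stored verbatim: no word splitting, no pathname expansion.
readarray -t a < "$tmpfile"

echo "count: ${#a[@]}"
printf '[%s]\n' "${a[@]}"

rm -f "$tmpfile"
```

Each line ends up as exactly one element, including the inner whitespace and the literal asterisk.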

Latest revision based on BinaryZebra's comment
and tested here. Prefixing eval with command keeps the array assignment in the current execution environment, while the variable assignments placed before it (IFS and GLOBIGNORE) last only for the duration of the eval.
Use an $IFS that has no spaces or tabs, just newlines/CR
$ IFS=$'\r\n' GLOBIGNORE='*' command eval 'XYZ=($(cat /etc/passwd))'
$ echo "${XYZ[5]}"
sync:x:5:0:sync:/sbin:/bin/sync
Also note that you may be setting the array just fine but reading it wrong - be sure to use both double-quotes "" and braces {} as in the example above
Edit:
Please note the many warnings about my answer in comments about possible glob expansion, specifically gniourf-gniourf's comments about my prior attempts to work around
With all those warnings in mind I'm still leaving this answer here (yes, bash 4 has been out for many years but I recall that some macs only 2/3 years old have pre-4 as default shell)
Other notes:
Can also follow drizzt's suggestion below and replace a forked subshell+cat with
$(</etc/passwd)
The other option I sometimes use is to save IFS into XIFS, then restore it afterwards. See also Sorpigal's answer, which does not need to bother with this

The simplest way to read each line of a file into a bash array is this:
IFS=$'\n' read -d '' -r -a lines < /etc/passwd
Now just index in to the array lines to retrieve each line, e.g.
printf "line 1: %s\n" "${lines[0]}"
printf "line 5: %s\n" "${lines[4]}"
# all lines
echo "${lines[@]}"
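Two details of this approach may be worth knowing, shown in the hedged sketch below: read returns a non-zero status because it reaches end-of-file before finding the NUL delimiter, and empty lines are dropped because newline is treated as IFS whitespace. The sample file and its contents are assumptions made up for the example.

```shell
#!/usr/bin/env bash
# Sample file with an empty line in the middle (contents are illustrative).
sample=$(mktemp)
printf '%s\n' 'alpha beta' '' 'gamma' > "$sample"

# read hits end-of-file before it sees a NUL delimiter, so it returns non-zero
# even though the array is filled; "|| true" keeps that from tripping set -e.
IFS=$'\n' read -d '' -r -a lines < "$sample" || true

printf 'line 1: %s\n' "${lines[0]}"
echo "total: ${#lines[@]}"   # the empty line is not stored

rm -f "$sample"
```

If empty lines must be preserved, readarray -t or a while read loop is the safer choice.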

An alternative, if the file contains one string per line and none of the strings contain spaces:
fileItemString=$(cat filename |tr "\n" " ")
fileItemArray=($fileItemString)
Check:
Print whole Array:
echo "${fileItemArray[*]}"
Length=${#fileItemArray[@]}

Your first attempt was close. Here is the simplistic approach using your idea.
file="somefileondisk"
lines=`cat $file`
for line in $lines; do
echo "$line"
done

#!/bin/bash
IFS=$'\n' read -d '' -r -a inlines < testinput
IFS=$'\n' read -d '' -r -a outlines < testoutput
counter=0
cat testinput | while read line;
do
echo "$((${inlines[$counter]}-${outlines[$counter]}))"
counter=$(($counter+1))
done
# OR Do like this
counter=0
readarray a < testinput
readarray b < testoutput
cat testinput | while read myline;
do
echo value is: $((${a[$counter]}-${b[$counter]}))
counter=$(($counter+1))
done

Related

Creating an array of Strings from Grep Command

I'm pretty new to Linux and I've been trying to learn recently. One thing I'm struggling with: within a log file, I would like to grep for all the unique IDs that exist and store them in an array.
The IDs are formatted like so: id=12345678,
I'm struggling to get these into an array, though. So far I've tried a range of things, including the following:
a=($ (grep -HR1 `id=^[0-9]' logfile))
echo ${#a[@]}
but the echo count is always returned as 0. So it is clear the populating of the array is not working. Have explored other pages online, but nothing seems to have a clear explanation of what I am looking for exactly.
a=($(grep -Eow 'id=[0-9]+' logfile))
a=("${a[@]#id=}")
printf '%s\n' "${a[@]}"
It's safe to split an unquoted command substitution here, as we aren't printing pathname expansion characters (*?[]), or whitespace (other than the new lines which delimit the list).
If this were not the case, mapfile -t a < <(grep ...) is a good alternative.
-E is extended regex (for +)
-o prints only matching text
-w matches a whole word only
${a[@]#id=} strips the id= prefix from each array element
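Since the question asks for unique IDs, one possible sketch combines the mapfile alternative with sort -u. The log lines below are invented for the example, and the array name ids is an assumption, not something from the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical log content, invented for this example.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
start id=12345678, status=ok
retry id=12345678, status=fail
start id=87654321, status=ok
EOF

# grep -o prints each match on its own line; sort -u drops duplicates.
mapfile -t ids < <(grep -Eo 'id=[0-9]+' "$logfile" | sort -u)

# Strip the leading "id=" from every element.
ids=("${ids[@]#id=}")

echo "${#ids[@]} unique ids"
printf '%s\n' "${ids[@]}"

rm -f "$logfile"
```

Because mapfile reads whole lines, this form does not depend on word splitting or on the value of IFS.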
Here is an example
my_array=()
while IFS= read -r line; do
my_array+=( "$line" )
done < <( ls )
echo ${#my_array[@]}
printf '%s\n' "${my_array[@]}"
It prints out 14 and then the names of the 14 files in the same folder. Just substitute your command for ls and you are set.
I suggest the readarray command, which makes sure each array element holds a full line.
readarray -t my_array < <(grep -HR1 'id=^[0-9]' logfile)
printf "%s\n" "${my_array[@]}"

Bash: Read args from stdin into array

Problem Description
Given a plaintext file args.in containing one line of command line arguments, read them into an array.
Problem Formulation
We have 4 files:
args.in:
"ab" c
refimpl.sh:
read -r line
bash -c "bash showargs.sh $line"
arrayimpl.sh:
arr=()
# BEGIN-------------------------
# Input comes from stdin.
# You need to set arr here.
# END---------------------------
echo "${#arr[@]}"
for i in "${arr[@]}"; do
echo "$i"
done
showargs.sh:
echo "$#"
for i in "$@"; do
echo "$i"
done
Put them into the same folder. We want you to implement arrayimpl.sh so that
bash refimpl.sh < args.in
and
bash arrayimpl.sh < args.in
give the same output.
Your solution should only contain a single file arrayimpl.sh.
Output Example
2
ab
c
This problem is a better formulation of this but not a dup of this. Some solutions work there but not here. For example, when we have the following input:
args.in:
"a\"b" c
There is no known solution yet.
The expected solution for this assignment is something equivalent to:
eval "arr=( $(cat) )"
This evaluates input as shell words, which is what refimpl.sh also does.
This is for toy problems and homework assignments only. Real software should not use executable code as a data format.

Creating an array from a text file in Bash

A script takes a URL, parses it for the required fields, and redirects its output to be saved in a file, file.txt. The output is saved on a new line each time a field has been found.
file.txt
A Cat
A Dog
A Mouse
etc...
I want to take file.txt and create an array from it in a new script, where every line gets to be its own string variable in the array. So far I have tried:
#!/bin/bash
filename=file.txt
declare -a myArray
myArray=(`cat "$filename"`)
for (( i = 0 ; i < 9 ; i++))
do
echo "Element [$i]: ${myArray[$i]}"
done
When I run this script, whitespace results in words getting split and instead of getting
Desired output
Element [0]: A Cat
Element [1]: A Dog
etc...
I end up getting this:
Actual output
Element [0]: A
Element [1]: Cat
Element [2]: A
Element [3]: Dog
etc...
How can I adjust the loop below such that the entire string on each line will correspond one-to-one with each variable in the array?
Use the mapfile command:
mapfile -t myArray < file.txt
The error is using for -- the idiomatic way to loop over lines of a file is:
while IFS= read -r line; do echo ">>$line<<"; done < file.txt
See BashFAQ/005 for more details.
mapfile and readarray (which are synonymous) are available in Bash version 4 and above. If you have an older version of Bash, you can use a loop to read the file into an array:
arr=()
while IFS= read -r line; do
arr+=("$line")
done < file
In case the file has an incomplete (missing newline) last line, you could use this alternative:
arr=()
while IFS= read -r line || [[ "$line" ]]; do
arr+=("$line")
done < file
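To see the difference between the two loops, here is a small sketch that runs both against a file whose last line has no trailing newline. The sample file and the array names plain and guarded are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sample file whose last line lacks a trailing newline (contents are illustrative).
file=$(mktemp)
printf 'A Cat\nA Dog\nA Mouse' > "$file"

plain=()
while IFS= read -r line; do
    plain+=("$line")
done < "$file"

guarded=()
# read fails at end-of-file but still fills "line", so test it before giving up.
while IFS= read -r line || [[ "$line" ]]; do
    guarded+=("$line")
done < "$file"

echo "plain loop:   ${#plain[@]} elements"    # the unterminated last line is dropped
echo "guarded loop: ${#guarded[@]} elements"

rm -f "$file"
```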
Related:
Need alternative to readarray/mapfile for script on older version of Bash
You can do this too:
oldIFS="$IFS"
IFS=$'\n' arr=($(<file))
IFS="$oldIFS"
echo "${arr[1]}" # It will print `A Dog`.
Note:
Filename expansion still occurs. For example, if there's a line with a literal * it will expand to all the files in current folder. So use it only if your file is free of this kind of scenario.
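The filename expansion problem can be reproduced, and avoided with set -f, as in the sketch below. The scratch directory, the file names and the array names unsafe and safe are all made up for the example; set -f is one possible workaround, not part of the answer above.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the glob has something to match (all names are made up).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch one.txt two.txt
printf '%s\n' 'A Cat' '*' > list

oldIFS=$IFS
IFS=$'\n'
unsafe=($(<list))        # the "*" line expands to the files in the directory
set -f                   # disable pathname expansion
safe=($(<list))          # the "*" stays literal
set +f
IFS=$oldIFS

echo "unsafe: ${#unsafe[@]} elements"
echo "safe:   ${#safe[@]} elements"

cd / && rm -rf "$dir"
```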
Use mapfile or read -a
Always check your code using shellcheck. It will often give you the correct answer. In this case SC2207 covers reading a file that either has space separated or newline separated values into an array.
Don't do this
array=( $(mycommand) )
Files with values separated by newlines
mapfile -t array < <(mycommand)
Files with values separated by spaces
IFS=" " read -r -a array <<< "$(mycommand)"
The shellcheck page will give you the rationale why this is considered best practice.
You can simply read each line from the file and assign it to an array.
#!/bin/bash
i=0
while IFS= read -r line
do
arr[$i]="$line"
i=$((i+1))
done < file.txt
This answer says to use
mapfile -t myArray < file.txt
I made a shim for mapfile if you want to use mapfile on bash < 4.x for whatever reason. It uses the existing mapfile command if you are on bash >= 4.x
Currently, only options -d and -t work. But that should be enough for that command above. I've only tested on macOS. On macOS Sierra 10.12.6, the system bash is 3.2.57(1)-release. So the shim can come in handy. You can also just update your bash with homebrew, build bash yourself, etc.
It uses this technique to set variables up one call stack.
Make sure to set the Internal Field Separator (IFS)
variable to $'\n' so that it does not put each word
into a new array entry.
#!/bin/bash
# move all 2020 - 2022 movies to /backup/movies
# put list into file 1 line per dir
# dirs are "movie name (year)/"
ls | egrep 202[0-2] > 2020_movies.txt
OLDIFS=${IFS}
IFS=$'\n' #fix separator
declare -a MOVIES # array for dir names
MOVIES=( $( cat "${1}" ) ) # load into array
for M in "${MOVIES[@]}" ; do
echo "[${M}]"
if [ -d "${M}" ] ; then # if dir name
mv -v "$M" /backup/movies/
fi
done
IFS=${OLDIFS} # restore standard separators
# not essential as IFS reverts when script ends
#END

declare global array in shell [duplicate]

This question already has an answer here:
How to use global arrays in bash?
(1 answer)
Closed 6 years ago.
Here is the code I use to sort the files into arrays, but because of the pipe the loop runs in a subshell, so I am not able to access the arrays normal, executable and directory afterwards. Nothing is printed after the #//////// line and I don't know what is happening there. Please help me with this.
i=0
j=0
k=0
normal[0]=
executable[0]=
directory[0]=
ls | while read line
do
if [ -f $line ];then
#echo "this is normal file>> $line"
normal[i]=$line
i=$((i+1))
fi
if [ -x $line ];then
#echo "this is executable file>> $line"
executable[j]=$line
j=$((j+1))
fi
if [ -d $line ];then
#echo "this is directory>> $line"
directory[k]=$line
k=$((k+1))
fi
done
#//////////////////////////////////////
echo "normal files are"
for k in "${normal[@]}"
do
echo "$k"
done
echo "executable files are"
for k in "${executable[@]}"
do
echo "$k"
done
echo "directories are"
for k in "${directory[@]}"
do
echo "$k"
done
There are several flaws to your script :
Your if tests are safer written with [[ than with [: in bash, [[ does not word-split or glob-expand unquoted variables, so file names containing spaces are handled correctly (more info : here). If you want to keep [ or are not using bash, you will have to quote your line variable, i.e. write all your tests like this : if [ -f "$line" ];then
Don't use ls to list the current directory as it misbehaves in some cases. A glob would be more suited in your case (more info: here)
If you want to avoid using a pipe, use a for loop instead. Replace ls | while read line with for line in $(ls) or, to take my previous point into account, for line in *
After doing that, I tested your script and it worked perfectly fine. You should note that some folders will be listed under both "executable files" and "directories", due to them having +x rights (I don't know if this is the behaviour you wanted).
As a side note, you don't need to declare variables in bash before using them. Your first 6 lines are thus unnecessary. Variables i, j, k are not necessary either, as you can append to an array with the following syntax : normal+=("$line").
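Putting those points together, a rewritten loop might look like the sketch below. It runs in a scratch directory with made-up entries so the result is predictable; the directory and file names are assumptions, while the array names come from the question.

```shell
#!/usr/bin/env bash
# Scratch directory with made-up entries so the result is predictable.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch 'plain file.txt' run.sh
chmod +x run.sh
mkdir subdir

normal=() executable=() directory=()
for line in *; do                       # glob instead of ls: no pipe, no subshell
    [[ -f $line ]] && normal+=("$line")
    [[ -x $line ]] && executable+=("$line")
    [[ -d $line ]] && directory+=("$line")
done

printf 'normal: %s\n' "${normal[@]}"
printf 'executable: %s\n' "${executable[@]}"
printf 'directory: %s\n' "${directory[@]}"

cd / && rm -rf "$dir"
```

Because the loop runs in the current shell, the three arrays are still populated after done, and subdir shows up under both executable and directory as noted above.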
The simplest thing to do is to keep the subshell open until you no longer need the arrays. In other words:
ls | { while read line; do
...
echo "directories: ${directory[@]}" | tr ' ' \\n
}
In other words, add an open brace before the while and a closing brace at the end of the script.

Bash: read lines into an array *without* touching IFS

I'm trying to read the lines of output from a subshell into an array, and I'm not willing to set IFS because it's global. I don't want one part of the script to affect the following parts, because that's poor practice and I refuse to do it. Reverting IFS after the command is not an option because it's too much trouble to keep the reversion in the right place after editing the script. How can I explain to bash that I want each array element to contain an entire line, without having to set any global variables that will destroy future commands?
Here's an example showing the unwanted stickiness of IFS:
lines=($(egrep "^-o" speccmds.cmd))
echo "${#lines[@]} lines without IFS"
IFS=$'\r\n' lines=($(egrep "^-o" speccmds.cmd))
echo "${#lines[@]} lines with IFS"
lines=($(egrep "^-o" speccmds.cmd))
echo "${#lines[@]} lines without IFS?"
The output is:
42 lines without IFS
6 lines with IFS
6 lines without IFS?
This question is probably based on a misconception.
IFS=foo read does not change IFS outside of the read operation itself.
Thus, this would have side effects, and should be avoided:
IFS=
declare -a array
while read -r; do
array+=( "$REPLY" )
done < <(your-subshell-here)
...but this is perfectly side-effect free:
declare -a array
while IFS= read -r; do
array+=( "$REPLY" )
done < <(your-subshell-here)
With bash 4.0 or newer, there's also the option of readarray or mapfile (synonyms for the same operation):
mapfile -t array < <(your-subshell-here)
In examples later added to your question, you have code along the lines of:
lines=($(egrep "^-o" speccmds.cmd))
The better way to write this is:
mapfile -t lines < <(egrep "^-o" speccmds.cmd)
Are you trying to store the lines of the output in an array, or the words of each line?
lines
mapfile -t arrayname < <(your subshell)
This does not use IFS at all.
words
(your subshell) | while IFS=: read -ra words; do ...
The form var=value command args... puts the var variable into the environment of the command, and does not affect the current shell's environment.
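A quick way to check this for yourself is the sketch below, which compares IFS before and after a prefixed read. The sample input string and the variable name before are made up for the example.

```shell
#!/usr/bin/env bash
before=$IFS

# The assignment prefix applies only to this one read command.
IFS=: read -ra words <<< 'root:x:0:0'

if [[ $IFS == "$before" ]]; then
    echo "IFS unchanged after read"
fi
echo "${#words[@]} fields, first is ${words[0]}"
```

By contrast, IFS=: lines=(...) consists of two plain assignments with no command after them, so both take effect in the current shell, which would explain the stickiness seen in the question.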

Resources