Append data read from a file onto a bash array [duplicate] - arrays

Bash allows to use: cat <(echo "$FILECONTENT")
Bash also allow to use: while read i; do echo $i; done </etc/passwd
to combine previous two this can be used: echo $FILECONTENT | while read i; do echo $i; done
The problem with last one is that it creates sub-shell and after the while loop ends variable i cannot be accessed any more.
My question is:
How to achieve something like this: while read i; do echo $i; done <(echo "$FILECONTENT") or in other words: How can I be sure that i survives while loop?
Please note that I am aware of enclosing while statement into {} but this does not solves the problem (imagine that you want use the while loop in function and return i variable)

The correct notation for Process Substitution is:
while read i; do echo $i; done < <(echo "$FILECONTENT")
The last value of i assigned in the loop is then available when the loop terminates.
An alternative is:
echo $FILECONTENT |
{
while read i; do echo $i; done
...do other things using $i here...
}
The braces are an I/O grouping operation and do not themselves create a subshell. In this context, they are part of a pipeline and are therefore run as a subshell, but it is because of the |, not the { ... }. You mention this in the question. AFAIK, you can do a return from within these inside a function.
Bash also provides the shopt builtin and one of its many options is:
lastpipe
If set, and job control is not active, the shell runs the last command of a pipeline not executed in the background in the current shell environment.
Thus, using something like this in a script makes the modfied sum available after the loop:
FILECONTENT="12 Name
13 Number
14 Information"
shopt -s lastpipe # Comment this out to see the alternative behaviour
sum=0
echo "$FILECONTENT" |
while read number name; do ((sum+=$number)); done
echo $sum
Doing this at the command line usually runs foul of 'job control is not active' (that is, at the command line, job control is active). Testing this without using a script failed.
Also, as noted by Gareth Rees in his answer, you can sometimes use a here string:
while read i; do echo $i; done <<< "$FILECONTENT"
This doesn't require shopt; you may be able to save a process using it.

Jonathan Leffler explains how to do what you want using process substitution, but another possibility is to use a here string:
while read i; do echo "$i"; done <<<"$FILECONTENT"
This saves a process.

This function makes duplicates $NUM times of jpg files (bash)
function makeDups() {
NUM=$1
echo "Making $1 duplicates for $(ls -1 *.jpg|wc -l) files"
ls -1 *.jpg|sort|while read f
do
COUNT=0
while [ "$COUNT" -le "$NUM" ]
do
cp $f ${f//sm/${COUNT}sm}
((COUNT++))
done
done
}

Related

Append to an array variable from a pipeline command

I am writing a bash function to get all git repositories, but I have met a problem when I want to store all the git repository pathnames to the array patharray. Here is the code:
gitrepo() {
local opt
declare -a patharray
locate -b '\.git' | \
while read pathname
do
pathname="$(dirname ${pathname})"
if [[ "${pathname}" != *.* ]]; then
# Note: how to add an element to an existing Bash Array
patharray=("${patharray[#]}" '\n' "${pathname}")
# echo -e ${patharray[#]}
fi
done
echo -e ${patharray[#]}
}
I want to save all the repository paths to the patharray array, but I can't get it outside the pipeline which is comprised of locate and while command.
But I can get the array in the pipeline command, the commented command # echo -e ${patharray[#]} works well if uncommented, so how can I solve the problem?
And I have tried the export command, however it seems that it can't pass the patharray to the pipeline.
Bash runs all commands of a pipeline in separate SubShells. When a subshell containing a while loop ends, all changes you made to the patharray variable are lost.
You can simply group the while loop and the echo statement together so they are both contained within the same subshell:
gitrepo() {
local pathname dir
local -a patharray
locate -b '\.git' | { # the grouping begins here
while read pathname; do
pathname=$(dirname "$pathname")
if [[ "$pathname" != *.* ]]; then
patharray+=( "$pathname" ) # add the element to the array
fi
done
printf "%s\n" "${patharray[#]}" # all those quotes are needed
} # the grouping ends here
}
Alternately, you can structure your code to not need a pipe: use ProcessSubstitution
( Also see the Bash manual for details - man bash | less +/Process\ Substitution):
gitrepo() {
local pathname dir
local -a patharray
while read pathname; do
pathname=$(dirname "$pathname")
if [[ "$pathname" != *.* ]]; then
patharray+=( "$pathname" ) # add the element to the array
fi
done < <(locate -b '\.git')
printf "%s\n" "${patharray[#]}" # all those quotes are needed
}
First of all, appending to an array variable is better done with array[${#array[*]}]="value" or array+=("value1" "value2" "etc") unless you wish to transform the entire array (which you don't).
Now, since pipeline commands are run in subprocesses, changes made to a variable inside a pipeline command will not propagate to outside it. There are a few options to get around this (most are listed in Greg's BashFAQ/024):
pass the result through stdout instead
the simplest; you'll need to do that anyway to get the value from the function (although there are ways to return a proper variable)
any special characters in paths can be handled reliably by using \0 as a separator (see Capturing output of find . -print0 into a bash array for reading \0-separated lists)
locate -b0 '\.git' | while read -r -d '' pathname; do dirname -z "$pathname"; done
or simply
locate -b0 '\.git' | xargs -0 dirname -z
avoid running the loop in a subprocess
avoid pipeline at all
temporary file/FIFO (bad: requires manual cleanup, accessible to others)
temporary variable (mediocre: unnecessary memory overhead)
process substitution (a special, syntax-supported case of FIFO, doesn't require manual cleanup; code adapted from Greg's BashFAQ/020):
i=0 #`unset i` will error on `i' usage if the `nounset` option is set
while IFS= read -r -d $'\0' file; do
patharray[i++]="$(dirname "$file")" # or however you want to process each file
done < <(locate -b0 '\.git')
use the lastpipe option (new in Bash 4.2) - doesn't run the last command of a pipeline in a subprocess (mediocre: has global effect)

Saving in arrays and comparing an argument to an array in bash

I do not know why this code stopped working
I tested it a couple of times and it was running great
what I am trying to do hear is place first and second in 2 different arrays
and then comparing argument $2 ==> $comment to the array varA and if it is in the array i do not want to store it in the text file $file
comment=$2
dueD=$3
x=0
hasData()
{
declare -a varA varB
cat $file | while IFS=$'\t' read -r num first second;do
varA+=("$first")
varB+=("$second")
done
if [[ ${varA[#]} == ~$comment ]]; then
echo "already in the Todo list"
else
x=$(cat $file | wc -l)
x=$(($x+1))
echo -e "$x\t$comment\t$dueD" >> $file
fi
I think I am storing the values wrong in the array because when I try
echo ${varA[#]}
nothing gets printed
more over I think my if statement is not accurate enough since this is the 4th time I edit it and it works but after a while it no longer works
need assistance kindly
Your pipeline creates a sub-shell. Therefore your assignments to varA and varB happen in the sub-shell and are lost as soon as the sub-shell exits. See How can I read a file (data stream, variable) line-by-line (and/or field-by-field)? for how to do this without a sub-shell. – Etan Reisner
Look at the solutions there. See how they don't use a pipe? That's the solution: Don't use a pipe. Use one of the other input redirection options. – Etan Reisner

declare global array in shell [duplicate]

This question already has an answer here:
How to use global arrays in bash?
(1 answer)
Closed 6 years ago.
Here is the code which I need to separate the files in array, but using the PIPE it is generating subshell so am not able to get access to arrays normal, executable and directory.and its not printing anything or don't know what is happening after #////////.Please help me regarding this.
i=0
j=0
k=0
normal[0]=
executable[0]=
directory[0]=
ls | while read line
do
if [ -f $line ];then
#echo "this is normal file>> $line"
normal[i]=$line
i=$((i+1))
fi
if [ -x $line ];then
#echo "this is executable file>> $line"
executable[j]=$line
j=$((j+1))
fi
if [ -d $line ];then
#echo "this is directory>> $line"
directory[k]=$line
k=$((k+1))
fi
done
#//////////////////////////////////////
echo "normal files are"
for k in "${normal[#]}"
do
echo "$k"
done
echo "executable files are"
for k in "${executable[#]}"
do
echo "$k"
done
echo "directories are"
for k in "${directory[#]}"
do
echo "$k"
done
There are several flaws to your script :
Your if tests should be written with [[, not [, which is for binary comparison (more info : here). If you want to keep [ or are not using bash, you will have to quote your line variable, i.e. write all your tests like this : if [ -f "$line" ];then
Don't use ls to list the current directory as it misbehaves in some cases. A glob would be more suited in your case (more info: here)
If you want to avoid using a pipe, use a for loop instead. Replace ls | while read line with for line in $(ls) or, to take my previous point in acount, for line in *
After doing that, I tested your script and it worked perfectly fine. You should note that some folders will be listed under both under "executable files" and "directories", due to them having +x rights (I don't know if this is the behaviour you wanted).
As a side note, you don't need to declare variables in bash before using them. Your first 6 lines are thus un-necessary. Variables i,j,k are not necessary as well as you can dynamicaly increment an array with the following syntax : normal+=("$line").
The simplest thing to do is to keep the subshell open until you no longer need the arrays. In other words:
ls | { while read line; do
...
echo "directories: ${directory[#]}" | tr ' ' \\n
}
In other words, add an open brace before the while and a closing brace at the end of the script.

How can I store the "find" command results as an array in Bash

I am trying to save the result from find as arrays.
Here is my code:
#!/bin/bash
echo "input : "
read input
echo "searching file with this pattern '${input}' under present directory"
array=`find . -name ${input}`
len=${#array[*]}
echo "found : ${len}"
i=0
while [ $i -lt $len ]
do
echo ${array[$i]}
let i++
done
I get 2 .txt files under current directory.
So I expect '2' as result of ${len}. However, it prints 1.
The reason is that it takes all result of find as one elements.
How can I fix this?
P.S
I found several solutions on StackOverFlow about a similar problem. However, they are a little bit different so I can't apply in my case. I need to store the results in a variable before the loop. Thanks again.
Update 2020 for Linux Users:
If you have an up-to-date version of bash (4.4-alpha or better), as you probably do if you are on Linux, then you should be using Benjamin W.'s answer.
If you are on Mac OS, which —last I checked— still used bash 3.2, or are otherwise using an older bash, then continue on to the next section.
Answer for bash 4.3 or earlier
Here is one solution for getting the output of find into a bash array:
array=()
while IFS= read -r -d $'\0'; do
array+=("$REPLY")
done < <(find . -name "${input}" -print0)
This is tricky because, in general, file names can have spaces, new lines, and other script-hostile characters. The only way to use find and have the file names safely separated from each other is to use -print0 which prints the file names separated with a null character. This would not be much of an inconvenience if bash's readarray/mapfile functions supported null-separated strings but they don't. Bash's read does and that leads us to the loop above.
[This answer was originally written in 2014. If you have a recent version of bash, please see the update below.]
How it works
The first line creates an empty array: array=()
Every time that the read statement is executed, a null-separated file name is read from standard input. The -r option tells read to leave backslash characters alone. The -d $'\0' tells read that the input will be null-separated. Since we omit the name to read, the shell puts the input into the default name: REPLY.
The array+=("$REPLY") statement appends the new file name to the array array.
The final line combines redirection and command substitution to provide the output of find to the standard input of the while loop.
Why use process substitution?
If we didn't use process substitution, the loop could be written as:
array=()
find . -name "${input}" -print0 >tmpfile
while IFS= read -r -d $'\0'; do
array+=("$REPLY")
done <tmpfile
rm -f tmpfile
In the above the output of find is stored in a temporary file and that file is used as standard input to the while loop. The idea of process substitution is to make such temporary files unnecessary. So, instead of having the while loop get its stdin from tmpfile, we can have it get its stdin from <(find . -name ${input} -print0).
Process substitution is widely useful. In many places where a command wants to read from a file, you can specify process substitution, <(...), instead of a file name. There is an analogous form, >(...), that can be used in place of a file name where the command wants to write to the file.
Like arrays, process substitution is a feature of bash and other advanced shells. It is not part of the POSIX standard.
Alternative: lastpipe
If desired, lastpipe can be used instead of process substitution (hat tip: Caesar):
set +m
shopt -s lastpipe
array=()
find . -name "${input}" -print0 | while IFS= read -r -d $'\0'; do array+=("$REPLY"); done; declare -p array
shopt -s lastpipe tells bash to run the last command in the pipeline in the current shell (not the background). This way, the array remains in existence after the pipeline completes. Because lastpipe only takes effect if job control is turned off, we run set +m. (In a script, as opposed to the command line, job control is off by default.)
Additional notes
The following command creates a shell variable, not a shell array:
array=`find . -name "${input}"`
If you wanted to create an array, you would need to put parens around the output of find. So, naively, one could:
array=(`find . -name "${input}"`) # don't do this
The problem is that the shell performs word splitting on the results of find so that the elements of the array are not guaranteed to be what you want.
Update 2019
Starting with version 4.4-alpha, bash now supports a -d option so that the above loop is no longer necessary. Instead, one can use:
mapfile -d $'\0' array < <(find . -name "${input}" -print0)
For more information on this, please see (and upvote) Benjamin W.'s answer.
Bash 4.4 introduced a -d option to readarray/mapfile, so this can now be solved with
readarray -d '' array < <(find . -name "$input" -print0)
for a method that works with arbitrary filenames including blanks, newlines, and globbing characters. This requires that your find supports -print0, as for example GNU find does.
From the manual (omitting other options):
mapfile [-d delim] [array]
-d
The first character of delim is used to terminate each input line, rather than newline. If delim is the empty string, mapfile will terminate a line when it reads a NUL character.
And readarray is just a synonym of mapfile.
The following appears to work for both Bash and Z Shell on macOS.
#! /bin/sh
IFS=$'\n'
paths=($(find . -name "foo"))
unset IFS
printf "%s\n" "${paths[#]}"
If you are using bash 4 or later, you can replace your use of find with
shopt -s globstar nullglob
array=( **/*"$input"* )
The ** pattern enabled by globstar matches 0 or more directories, allowing the pattern to match to an arbitrary depth in the current directory. Without the nullglob option, the pattern (after parameter expansion) is treated literally, so with no matches you would have an array with a single string rather than an empty array.
Add the dotglob option to the first line as well if you want to traverse hidden directories (like .ssh) and match hidden files (like .bashrc) as well.
you can try something like
array=(`find . -type f | sort -r | head -2`) , and in order to print the array values , you can try something like echo "${array[*]}"
None of these solutions suited me because I didn't feel like learning readarray and mapfile. Here is what I came up with.
#!/bin/bash
echo "input : "
read input
echo "searching file with this pattern '${input}' under present directory"
# The only change is here. Append to array for each non-empty line.
array=()
while read line; do
[[ ! -z "$line" ]] && array+=("$line")
done; <<< $(find . -name ${input} -print)
len=${#array[#]}
echo "found : ${len}"
i=0
while [ $i -lt $len ]
do
echo ${array[$i]}
let i++
done
You could do like this:
#!/bin/bash
echo "input : "
read input
echo "searching file with this pattern '${input}' under present directory"
array=(`find . -name '*'${input}'*'`)
for i in "${array[#]}"
do :
echo $i
done
In bash, $(<any_shell_cmd>) helps to run a command and capture the output. Passing this to IFS with \n as delimiter helps to convert that to an array.
IFS='\n' read -r -a txt_files <<< $(find /path/to/dir -name "*.txt")

Problem with appending to bash arrays

I'm trying to create an alias that will get all "Modified" files and run the php syntax check on them...
function gitphpcheck () {
filearray=()
git diff --name-status | while read line; do
if [[ $line =~ ^M ]]
then
filename="`echo $line | awk '{ print $2 }'`"
echo "$filename" # correct output
filearray+=($filename)
fi
done
echo "--------------FILES"
echo ${filearray[#]}
# will do php check here, but echo of array is blank
}
As Wrikken says, the while body runs in a subshell, so all changes to the filearray array will disappear when the subshell ends. A couple of different solutions come to mind:
Process substitution (less readable but does not require a subshell)
while read line; do
:
done < <(git diff --name-status)
echo "${filearray[#]}"
Use the modified variable in the subshell using command grouping
git diff --name-status | {
while read line; do
:
done
echo "${filearray[#]}"
}
# filearray is empty here
You've piped | things to while, which is essentially another process, so the filearray variable is a different one (not the same scope).

Resources