Variable in bash array name - arrays

I have this code:
#!/bin/sh
...
for cnumber in `seq $CNUMBER`; do
declare -a CAT$cnumber
let i=0
while IFS=$'\n' read -r line_data; do
CAT$cnumber[i]="${line_data}"
((++i))
done < input_file_$cnumber
done
It works if I use array name without variable $cnumber.
But I want to create multiple arrays (CAT0, CAT1, CAT2 etc.) and to read lines:
from file 'input_file_0' to array 'CAT0'
from file 'input_file_1' to array 'CAT1'
from file 'input_file_2' to array 'CAT2'
etc.
What syntax to use $cnumber variable in array name (CAT1) and in input file_name?

for cnumber in `seq $CNUMBER`; do
declare -a CAT$cnumber
let i=0
while IFS=$'\n' read -r line_data; do
eval CAT$cnumber[i]='"${line_data}"'
((++i))
done < input_file_$cnumber
done
Mainly, that adds the word "eval" which makes bash evaluate the rest of the line. Before that, bash expands variables, thus CAT$number will be something like CAT1 when the line is evaluated. Keep in mind that "${line_date}" would be subject to variable expansion before eval evaluates the line if it would not be protected by single quotes. That might have unexpected effects if the $line_data would contain blank spaces. See this simplified example:
a=b
l="hello date"
eval $a="$l" # executes "date", has no other effect
echo $b # prints an empty line
eval $a='"$l"' # sets b to "hello date"
echo $b # prints that: hello date
In reply to the comment of Etan Reisner below, I add another solution that avoids "eval" and instead uses references, which are available in bash version 4.3 or higher. In that case, using references is preferable for the reason Etan pointed out, and also, for my opinion, because it is more natural:
for cnumber in `seq $CNUMBER`; do
declare -a CAT$cnumber # be sure the array is declared before ...
declare -n ref=CAT$cnumber # ... you declare ref to reference the array
let i=0
while IFS=$'\n' read -r line_data; do
ref[$i]="${line_data}"
((++i))
done < input_file_$cnumber
done

First, your shebang needs to be
#!/bin/bash
since declare is a bash built-in.
Next, use declare to define the array elements
for cnumber in `seq $CNUMBER`; do
declare -a CAT$cnumber
let i=0
while IFS=$'\n' read -r line_data; do
declare "CAT$cnumber[$i]=${line_data}"
((++i))
done < input_file_$cnumber
done

Related

BASH - Create arrays from lines of csv file, where first entry is array name

I'm learning to script in Bash.
I have an CSV file, which contains next lines:
numbers,one,two,three,four,five
colors,red,blue,green,yellow,white
custom-1,a,b,c,d,e
custom+2,t,y,w,x,z
Need to create arrays from this, where first entry is array name, eg.
number=(one,two,three,four,five)
colors=(red,blue,green,yellow,white)
custom-1=(a,b,c,d,e)
custom+2=(t,y,w,x,z)
Here is my script:
IFS=","
while read NAME VALUES ; do
declare -a $NAME
arrays+=($NAME)
IFS=',' read -r -a $NAME <<< "${VALUES[0]}"
done < file.csv
When I try with csv file, containing only two first string (numbers and colors), code works well. And if i try to with number, colors, custom-1, custom-2, there is error during reading csv:
./script.sh: line 5: declare: `custom-1': not a valid identifier
./script.sh: line 7: read: `custom+2': not a valid identifier
because bash does not allow special characters in variable names, as far as I understand. Is there any way to avoid this?
As you cannot use the first column of your CSV file as bash array names one option would be to generate valid names using a counter (e.g. arrayN). If you want to access your data using the values of this first column you would also need to store them somewhere with the corresponding counter value. An associative array (declare -A names=()) would be perfect. Last but not least, the namerefs (declare -n arr=..., available starting with bash 4.3) will be convenient to store and access your data. Example:
declare -i cnt=1
declare -A names=()
while IFS=',' read -r -a line; do
names["${line[0]}"]="$cnt"
declare -n arr="array$cnt"
unset line[0]
declare -a arr=( "${line[#]}" )
((cnt++))
done < foo.csv
Now, to access the values corresponding to, let's say, entry custom+2, first get the corresponding counter value, declare a nameref pointing to the corresponding array and voilà:
$ cnt="${names[custom+2]}"
$ declare -n arr="array$cnt"
$ echo "${arr[#]}"
t y w x z
Let's declare a function for easier access:
getdata () {
local -i cnt="${names[$1]}"
local -n arr="array$cnt"
[ -z "$2" ] && echo "${arr[#]}" || echo "${arr[$2]}"
}
And then:
$ getdata "custom+2"
t y w x z
$ getdata "colors"
red blue green yellow white
$ getdata "colors" 3
yellow

How do I convert CSV data into an associative array using Bash 4?

The file /tmp/file.csv contains the following:
name,age,gender
bob,21,m
jane,32,f
The CSV file will always have headers.. but might contain a different number of fields:
id,title,url,description
1,foo name,foo.io,a cool foo site
2,bar title,http://bar.io,a great bar site
3,baz heading,https://baz.io,some description
In either case, I want to convert my CSV data into an array of associative arrays..
What I need
So, I want a Bash 4.3 function that takes CSV as piped input and sends the array to stdout:
/tmp/file.csv:
name,age,gender
bob,21,m
jane,32,f
It needs to be used in my templating system, like this:
{{foo | csv_to_array | foo2}}
^ this is a fixed API, I must use that syntax .. foo2 must receive the array as standard input.
The csv_to_array func must do it's thing, so that afterwards I can do this:
$ declare -p row1; declare -p row2; declare -p new_array;
and it would give me this:
declare -A row1=([gender]="m" [name]="bob" [age]="21" )
declare -A row2=([gender]="f" [name]="jane" [age]="32" )
declare -a new_array=([0]="row1" [1]="row2")
..Once I have this array structure (an indexed array of associative array names), I have a shell-based templating system to access them, like so:
{{#new_array}}
Hi {{item.name}}, you are {{item.age}} years old.
{{/new_array}}
But I'm struggling to generate the arrays I need..
Things I tried:
I have already tried using this as a starting point to get the array structure I need:
while IFS=',' read -r -a my_array; do
echo ${my_array[0]} ${my_array[1]} ${my_array[2]}
done <<< $(cat /tmp/file.csv)
(from Shell: CSV to array)
..and also this:
cat /tmp/file.csv | while read line; do
line=( ${line//,/ } )
echo "0: ${line[0]}, 1: ${line[1]}, all: ${line[#]}"
done
(from https://www.reddit.com/r/commandline/comments/1kym4i/bash_create_array_from_one_line_in_csv/cbu9o2o/)
but I didn't really make any progress in getting what I want out the other end...
EDIT:
Accepted the 2nd answer, but I had to hack the library I am using to make either solution work..
I'll be happy to look at other answers, which do not export the declare commands as strings, to be run in the current env, but instead somehow hoist the resultant arrays of the declare commands to the current env (the current env is wherever the function is run from).
Example:
$ cat file.csv | csv_to_array
$ declare -p row2 # gives the data
So, to be clear, if the above ^ works in a terminal, it'll work in the library I'm using without the hacks I had to add (which involved grepping STDIN for ^declare -a and using source <(cat); eval $STDIN... in other functions)...
See my comments on the 2nd answer for more info.
The approach is straightforward:
Read the column headers into an array
Read the file line by line, in each line …
Create a new associative array and register its name in the array of array names
Read the fields and assign them according to the column headers
In the last step we cannot use read -a, mapfile, or things like these since they only create regular arrays with numbers as indices, but we want an associative array instead, so we have to create the array manually.
However, the implementation is a bit convoluted because of bash's quirks.
The following function parses stdin and creates arrays accordingly.
I took the liberty to rename your array new_array to rowNames.
#! /bin/bash
csvToArrays() {
IFS=, read -ra header
rowIndex=0
while IFS= read -r line; do
((rowIndex++))
rowName="row$rowIndex"
declare -Ag "$rowName"
IFS=, read -ra fields <<< "$line"
fieldIndex=0
for field in "${fields[#]}"; do
printf -v quotedFieldHeader %q "${header[fieldIndex++]}"
printf -v "$rowName[$quotedFieldHeader]" %s "$field"
done
rowNames+=("$rowName")
done
declare -p "${rowNames[#]}" rowNames
}
Calling the function in a pipe has no effect. Bash executes the commands in a pipe in a subshell, therefore you won't have access to the arrays created by someCommand | csvToArrays. Instead, call the function as either one of the following
csvToArrays < <(someCommand) # when input comes from a command, except "cat file"
csvToArrays < someFile # when input comes from a file
Bash scripts like these tend to be very slow. That's the reason why I didn't bother to extract printf -v quotedFieldHeader … from the inner loop even though it will do the same work over and over again.
I think the whole templating thing and everything related would be way easier to program and faster to execute in languages like python, perl, or something like that.
The following script:
csv_to_array() {
local -a values
local -a headers
local counter
IFS=, read -r -a headers
declare -a new_array=()
counter=1
while IFS=, read -r -a values; do
new_array+=( row$counter )
declare -A "row$counter=($(
paste -d '' <(
printf "[%s]=\n" "${headers[#]}"
) <(
printf "%q\n" "${values[#]}"
)
))"
(( counter++ ))
done
declare -p new_array ${!row*}
}
foo2() {
source <(cat)
declare -p new_array ${!row*} |
sed 's/^/foo2: /'
}
echo "==> TEST 1 <=="
cat <<EOF |
id,title,url,description
1,foo name,foo.io,a cool foo site
2,bar title,http://bar.io,a great bar site
3,baz heading,https://baz.io,some description
EOF
csv_to_array |
foo2
echo "==> TEST 2 <=="
cat <<EOF |
name,age,gender
bob,21,m
jane,32,f
EOF
csv_to_array |
foo2
will output:
==> TEST 1 <==
foo2: declare -a new_array=([0]="row1" [1]="row2" [2]="row3")
foo2: declare -A row1=([url]="foo.io" [description]="a cool foo site" [id]="1" [title]="foo name" )
foo2: declare -A row2=([url]="http://bar.io" [description]="a great bar site" [id]="2" [title]="bar title" )
foo2: declare -A row3=([url]="https://baz.io" [description]="some description" [id]="3" [title]="baz heading" )
==> TEST 2 <==
foo2: declare -a new_array=([0]="row1" [1]="row2")
foo2: declare -A row1=([gender]="m" [name]="bob" [age]="21" )
foo2: declare -A row2=([gender]="f" [name]="jane" [age]="32" )
The output comes from foo2 function.
The csv_to_array function first reads the headaers. Then for each read line it adds new element into new_array array and also creates a new associative array with the name row$index with elements created from joining the headers names with values read from the line. On the end the output from declare -p is outputted from the function.
The foo2 function sources the standard input, so the arrays come into scope for it. It outputs then those values again, prepending each line with foo2:.

Bash, wihle read line by line, split strings on line divided by ",", store to array

I need to read file line by line, and every line split by ",", and store to array.
File source_file.
usl-coop,/root
usl-dev,/bin
Script.
i=1
while read -r line; do
IFS="," read -ra para_$i <<< $line
echo ${para_$i[#]}
((i++))
done < source_file
Expected output.
para_1[0]=usl-coop
para_1[1]=/root
para_2[0]=usl-dev
para_2[1]=/bin
Script will out error about echo.
./sofimon.sh: line 21: ${para_$i[#]}: bad substitution
When I echo array one by one field, for example
echo para_1[0]
it shows, that variables are stored.
But I need use it with variable within, something like this.
${para_$i[1]}
Is possible to do this?
Thanks.
S.
There is a trick to simulate 2D arrays using associative arrays. It works nice and I think is the most flexible and extensible:
declare -A para
i=1
while IFS=, read -r -a line; do
for j in ${!line[#]}; do
para[$i,$j]="${line[$j]}"
done
((i++)) ||:
done < source_file
declare -p para
will output:
declare -A para=([1,0]="usl-coop" [1,1]="/root" [2,1]="/bin" [2,0]="usl-dev" )
Without modifying your script that much you could use indirect variable expansion. It's sometimes used in simpler scripts:
i=1
while IFS="," read -r -a para_$i; do
n="para_$i[#]"
echo "${!n}"
((i++)) ||:
done < source_file
declare -p ${!para_*}
or basically the same with a nameref a named reference to another variable (side note: see how [#] needs to be part of the variable in indirect expansion, but not in named reference):
i=1
while IFS="," read -r -a para_$i; do
declare -n n
n="para_$i"
echo "${n[#]}"
((i++)) ||:
done < source_file
declare -p ${!para_*}
both scripts above will output the same:
usl-coop /root
usl-dev /bin
declare -a para_1=([0]="usl-coop" [1]="/root")
declare -a para_2=([0]="usl-dev" [1]="/bin")
That said, I think you shouldn't read your file into memory at all. It's just a bad design. Shell and bash is build around passing your files with pipes, streams, fifos, redirections, process substitutions, etc. without ever saving/copying/storing the file. If you have a file to parse, you should stream it to another process, parse and save the result, without ever storing the whole input in memory. If you want some data to find inside a file, use grep or awk.
Here is a short awk script that do the task.
awk 'BEGIN{FS=",";of="para_%d[%d]=%s\n"}{printf(of, NR, 0, $1);printf(of, NR, 1, $2)}' input.txt
Provide the desired output.
Explanation:
BEGIN{
FS=","; # set field seperator to `,`
of="para_%d[%d]=%s\n" # define common printf output format
}
{ # for each input line
printf(of, NR, 0, $1); # output for current line, [0], left field
printf(of, NR, 1, $2) # output for current line, [1], right field
}

How to copy an array in Bash?

I have an array of applications, initialized like this:
depends=$(cat ~/Depends.txt)
When I try to parse the list and copy it to a new array using,
for i in "${depends[#]}"; do
if [ $i #isn't installed ]; then
newDepends+=("$i")
fi
done
What happens is that only the first element of depends winds up on newDepends.
for i in "${newDepends[#]}"; do
echo $i
done
^^ This would output just one thing. So I'm trying to figure out why my for loop is is only moving the first element. The whole list is originally on depends, so it's not that, but I'm all out of ideas.
a=(foo bar "foo 1" "bar two") #create an array
b=("${a[#]}") #copy the array in another one
for value in "${b[#]}" ; do #print the new array
echo "$value"
done
The simplest way to copy a non-associative array in bash is to:
arrayClone=("${oldArray[#]}")
or to add elements to a preexistent array:
someArray+=("${oldArray[#]}")
Newlines/spaces/IFS in the elements will be preserved.
For copying associative arrays, Isaac's solutions work great.
The solutions given in the other answers won't work for associative arrays, or for arrays with non-contiguous indices. Here are is a more general solution:
declare -A arr=([this]=hello [\'that\']=world [theother]='and "goodbye"!')
temp=$(declare -p arr)
eval "${temp/arr=/newarr=}"
diff <(echo "$temp") <(declare -p newarr | sed 's/newarr=/arr=/')
# no output
And another:
declare -A arr=([this]=hello [\'that\']=world [theother]='and "goodbye"!')
declare -A newarr
for idx in "${!arr[#]}"; do
newarr[$idx]=${arr[$idx]}
done
diff <(echo "$temp") <(declare -p newarr | sed 's/newarr=/arr=/')
# no output
Try this: arrayClone=("${oldArray[#]}")
This works easily.
array_copy() {
set -- "$(declare -p $1)" "$2"
eval "$2=${1#*=}"
}
# Usage examples:
these=(apple banana catalog dormant eagle fruit goose hat icicle)
array_copy these those
declare -p those
declare -A src dest
source=(["It's a 15\" spike"]="and it's 1\" thick" [foo]=bar [baz]=qux)
array_copy src dest
declare -p dest
Note: when copying associative arrays, the destination must already exist as an associative array. If not, array_copy() will create it as a standard array and try to interpret the key names from the associative source as arithmetic variable names, with ugly results.
Isaac Schwabacher's solution is more robust in this regard, but it can't be tidily wrapped up in a function because its eval step evaluates an entire declare statement and bash treats those as equivalent to local when they're inside a function. This could be worked around by wedging the -g option into the evaluated declare but that might give the destination array more scope than it's supposed to have. Better, I think, to have array_copy() perform only the actual copy into an explicitly scoped destination.
You can copy an array by inserting the elements of the first array into the copy by specifying the index:
#!/bin/bash
array=( One Two Three Go! );
array_copy( );
let j=0;
for (( i=0; i<${#array[#]}; i++)
do
if [[ $i -ne 1 ]]; then # change the test here to your 'isn't installed' test
array_copy[$j]="${array[$i]}
let i+=1;
fi
done
for k in "${array_copy[#]}"; do
echo $k
done
The output of this would be:
One
Three
Go!
A useful document on bash arrays is on TLDP.
Problem is to copy array in function to be visible in parent code. This solution works for indexed arrays and if before copying are predefined as declare -A ARRAY, works also for associative arrays.
function array_copy
# $1 original array name
# $2 new array name with the same content
{
local INDEX
eval "
for INDEX in \"\${!$1[#]}\"
do
$2[\"\$INDEX\"]=\"\${$1[\$INDEX]}\"
done
"
}
Starting with Bash 4.3, you can do this
$ alpha=(bravo charlie 'delta 3' '' foxtrot)
$ declare -n golf=alpha
$ echo "${golf[2]}"
delta 3
Managed to copy an array into another.
firstArray=()
secondArray=()
firstArray+=("Element1")
firstArray+=("Element2")
secondArray+=("${firstArray[#]}")
for element in "${secondArray[#]}"; do
echo "${element}"
done
I've found that this works for me (mostly :)) ...
eval $(declare -p base | sed "s,base,target,")
extending the sed command to edit any switches as necessary e.g. if the new structure has to be writeable, to edit out read-only (-r).
I've discovered what was wrong.. My if isn't installed test is two for loops that remove excess characters from file names, and spits them out if they exist on a certain web server. What it wasn't doing was removing a trailing hyphen. So, when it tested it online for availability, they were parsed out. Because "file" exists, but "file-" doesn't.

How to return an array in bash without using globals?

I have a function that creates an array and I want to return the array to the caller:
create_array() {
local my_list=("a", "b", "c")
echo "${my_list[#]}"
}
my_algorithm() {
local result=$(create_array)
}
With this, I only get an expanded string. How can I "return" my_list without using anything global?
With Bash version 4.3 and above, you can make use of a nameref so that the caller can pass in the array name and the callee can use a nameref to populate the named array, indirectly.
#!/usr/bin/env bash
create_array() {
local -n arr=$1 # use nameref for indirection
arr=(one "two three" four)
}
use_array() {
local my_array
create_array my_array # call function to populate the array
echo "inside use_array"
declare -p my_array # test the array
}
use_array # call the main function
Produces the output:
inside use_array
declare -a my_array=([0]="one" [1]="two three" [2]="four")
You could make the function update an existing array as well:
update_array() {
local -n arr=$1 # use nameref for indirection
arr+=("two three" four) # update the array
}
use_array() {
local my_array=(one)
update_array my_array # call function to update the array
}
This is a more elegant and efficient approach since we don't need command substitution $() to grab the standard output of the function being called. It also helps if the function were to return more than one output - we can simply use as many namerefs as the number of outputs.
Here is what the Bash Manual says about nameref:
A variable can be assigned the nameref attribute using the -n option
to the declare or local builtin commands (see Bash Builtins) to create
a nameref, or a reference to another variable. This allows variables
to be manipulated indirectly. Whenever the nameref variable is
referenced, assigned to, unset, or has its attributes modified (other
than using or changing the nameref attribute itself), the operation is
actually performed on the variable specified by the nameref variable’s
value. A nameref is commonly used within shell functions to refer to a
variable whose name is passed as an argument to the function. For
instance, if a variable name is passed to a shell function as its
first argument, running
declare -n ref=$1 inside the function creates a nameref variable ref
whose value is the variable name passed as the first argument.
References and assignments to ref, and changes to its attributes, are
treated as references, assignments, and attribute modifications to the
variable whose name was passed as $1.
What's wrong with globals?
Returning arrays is really not practical. There are lots of pitfalls.
That said, here's one technique that works if it's OK that the variable have the same name:
$ f () { local a; a=(abc 'def ghi' jkl); declare -p a; }
$ g () { local a; eval $(f); declare -p a; }
$ f; declare -p a; echo; g; declare -p a
declare -a a='([0]="abc" [1]="def ghi" [2]="jkl")'
-bash: declare: a: not found
declare -a a='([0]="abc" [1]="def ghi" [2]="jkl")'
-bash: declare: a: not found
The declare -p commands (except for the one in f() are used to display the state of the array for demonstration purposes. In f() it's used as the mechanism to return the array.
If you need the array to have a different name, you can do something like this:
$ g () { local b r; r=$(f); r="declare -a b=${r#*=}"; eval "$r"; declare -p a; declare -p b; }
$ f; declare -p a; echo; g; declare -p a
declare -a a='([0]="abc" [1]="def ghi" [2]="jkl")'
-bash: declare: a: not found
-bash: declare: a: not found
declare -a b='([0]="abc" [1]="def ghi" [2]="jkl")'
-bash: declare: a: not found
Bash can't pass around data structures as return values. A return value must be a numeric exit status between 0-255. However, you can certainly use command or process substitution to pass commands to an eval statement if you're so inclined.
This is rarely worth the trouble, IMHO. If you must pass data structures around in Bash, use a global variable--that's what they're for. If you don't want to do that for some reason, though, think in terms of positional parameters.
Your example could easily be rewritten to use positional parameters instead of global variables:
use_array () {
for idx in "$#"; do
echo "$idx"
done
}
create_array () {
local array=("a" "b" "c")
use_array "${array[#]}"
}
This all creates a certain amount of unnecessary complexity, though. Bash functions generally work best when you treat them more like procedures with side effects, and call them in sequence.
# Gather values and store them in FOO.
get_values_for_array () { :; }
# Do something with the values in FOO.
process_global_array_variable () { :; }
# Call your functions.
get_values_for_array
process_global_array_variable
If all you're worried about is polluting your global namespace, you can also use the unset builtin to remove a global variable after you're done with it. Using your original example, let my_list be global (by removing the local keyword) and add unset my_list to the end of my_algorithm to clean up after yourself.
You were not so far out with your original solution. You had a couple of problems, you used a comma as a separator, and you failed to capture the returned items into a list, try this:
my_algorithm() {
local result=( $(create_array) )
}
create_array() {
local my_list=("a" "b" "c")
echo "${my_list[#]}"
}
Considering the comments about embedded spaces, a few tweaks using IFS can solve that:
my_algorithm() {
oldIFS="$IFS"
IFS=','
local result=( $(create_array) )
IFS="$oldIFS"
echo "Should be 'c d': ${result[1]}"
}
create_array() {
IFS=','
local my_list=("a b" "c d" "e f")
echo "${my_list[*]}"
}
Use the technique developed by Matt McClure:
http://notes-matthewlmcclure.blogspot.com/2009/12/return-array-from-bash-function-v-2.html
Avoiding global variables means you can use the function in a pipe. Here is an example:
#!/bin/bash
makeJunk()
{
echo 'this is junk'
echo '#more junk and "b#d" characters!'
echo '!#$^%^&(*)_^&% ^$##:"<>?/.,\\"'"'"
}
processJunk()
{
local -a arr=()
# read each input and add it to arr
while read -r line
do
arr+=('"'"$line"'" is junk')
done;
# output the array as a string in the "declare" representation
declare -p arr | sed -e 's/^declare -a [^=]*=//'
}
# processJunk returns the array in a flattened string ready for "declare"
# Note that because of the pipe processJunk cannot return anything using
# a global variable
returned_string="$(makeJunk | processJunk)"
# convert the returned string to an array named returned_array
# declare correctly manages spaces and bad characters
eval "declare -a returned_array=${returned_string}"
for junk in "${returned_array[#]}"
do
echo "$junk"
done
Output is:
"this is junk" is junk
"#more junk and "b#d" characters!" is junk
"!#$^%^&(*)_^&% ^$##:"<>?/.,\\"'" is junk
A pure bash, minimal and robust solution based on the 'declare -p' builtin — without insane global variables
This approach involves the following three steps:
Convert the array with 'declare -p' and save the output in a variable.
myVar="$( declare -p myArray )"
The output of the declare -p statement can be used to recreate the array.
For instance the output of declare -p myVar might look like this:
declare -a myVar='([0]="1st field" [1]="2nd field" [2]="3rd field")'
Use the echo builtin to pass the variable to a function or to pass it back from there.
In order to preserve whitspaces in array fields when echoing the variable, IFS is temporarly set to a control character (e.g. a vertical tab).
Only the right-hand-side of the declare statement in the variable is to be echoed - this can be achieved by parameter expansion of the form ${parameter#word}. As for the example above: ${myVar#*=}
Finally, recreate the array where it is passed to using the eval and the 'declare -a' builtins.
Example 1 - return an array from a function
#!/bin/bash
# Example 1 - return an array from a function
function my-fun () {
# set up a new array with 3 fields - note the whitespaces in the
# 2nd (2 spaces) and 3rd (2 tabs) field
local myFunArray=( "1st field" "2nd field" "3rd field" )
# show its contents on stderr (must not be output to stdout!)
echo "now in $FUNCNAME () - showing contents of myFunArray" >&2
echo "by the help of the 'declare -p' builtin:" >&2
declare -p myFunArray >&2
# return the array
local myVar="$( declare -p myFunArray )"
local IFS=$'\v';
echo "${myVar#*=}"
# if the function would continue at this point, then IFS should be
# restored to its default value: <space><tab><newline>
IFS=' '$'\t'$'\n';
}
# main
# call the function and recreate the array that was originally
# set up in the function
eval declare -a myMainArray="$( my-fun )"
# show the array contents
echo ""
echo "now in main part of the script - showing contents of myMainArray"
echo "by the help of the 'declare -p' builtin:"
declare -p myMainArray
# end-of-file
Output of Example 1:
now in my-fun () - showing contents of myFunArray
by the help of the 'declare -p' builtin:
declare -a myFunArray='([0]="1st field" [1]="2nd field" [2]="3rd field")'
now in main part of the script - showing contents of myMainArray
by the help of the 'declare -p' builtin:
declare -a myMainArray='([0]="1st field" [1]="2nd field" [2]="3rd field")'
Example 2 - pass an array to a function
#!/bin/bash
# Example 2 - pass an array to a function
function my-fun () {
# recreate the array that was originally set up in the main part of
# the script
eval declare -a myFunArray="$( echo "$1" )"
# note that myFunArray is local - from the bash(1) man page: when used
# in a function, declare makes each name local, as with the local
# command, unless the ‘-g’ option is used.
# IFS has been changed in the main part of this script - now that we
# have recreated the array it's better to restore it to the its (local)
# default value: <space><tab><newline>
local IFS=' '$'\t'$'\n';
# show contents of the array
echo ""
echo "now in $FUNCNAME () - showing contents of myFunArray"
echo "by the help of the 'declare -p' builtin:"
declare -p myFunArray
}
# main
# set up a new array with 3 fields - note the whitespaces in the
# 2nd (2 spaces) and 3rd (2 tabs) field
myMainArray=( "1st field" "2nd field" "3rd field" )
# show the array contents
echo "now in the main part of the script - showing contents of myMainArray"
echo "by the help of the 'declare -p' builtin:"
declare -p myMainArray
# call the function and pass the array to it
myVar="$( declare -p myMainArray )"
IFS=$'\v';
my-fun $( echo "${myVar#*=}" )
# if the script would continue at this point, then IFS should be restored
# to its default value: <space><tab><newline>
IFS=' '$'\t'$'\n';
# end-of-file
Output of Example 2:
now in the main part of the script - showing contents of myMainArray
by the help of the 'declare -p' builtin:
declare -a myMainArray='([0]="1st field" [1]="2nd field" [2]="3rd field")'
now in my-fun () - showing contents of myFunArray
by the help of the 'declare -p' builtin:
declare -a myFunArray='([0]="1st field" [1]="2nd field" [2]="3rd field")'
I recently discovered a quirk in BASH in that a function has direct access to the variables declared in the functions higher in the call stack. I've only just started to contemplate how to exploit this feature (it promises both benefits and dangers), but one obvious application is a solution to the spirit of this problem.
I would also prefer to get a return value rather than using a global variable when delegating the creation of an array. There are several reasons for my preference, among which are to avoid possibly disturbing an preexisting value and to avoid leaving a value that may be invalid when later accessed. While there are workarounds to these problems, the easiest is have the variable go out of scope when the code is finished with it.
My solution ensures that the array is available when needed and discarded when the function returns, and leaves undisturbed a global variable with the same name.
#!/bin/bash
myarr=(global array elements)
get_an_array()
{
myarr=( $( date +"%Y %m %d" ) )
}
request_array()
{
declare -a myarr
get_an_array "myarr"
echo "New contents of local variable myarr:"
printf "%s\n" "${myarr[#]}"
}
echo "Original contents of global variable myarr:"
printf "%s\n" "${myarr[#]}"
echo
request_array
echo
echo "Confirm the global myarr was not touched:"
printf "%s\n" "${myarr[#]}"
Here is the output of this code:
When function request_array calls get_an_array, get_an_array can directly set the myarr variable that is local to request_array. Since myarr is created with declare, it is local to request_array and thus goes out of scope when request_array returns.
Although this solution does not literally return a value, I suggest that taken as a whole, it satisfies the promises of a true function return value.
Useful example: return an array from function
function Query() {
local _tmp=`echo -n "$*" | mysql 2>> zz.err`;
echo -e "$_tmp";
}
function StrToArray() {
IFS=$'\t'; set $1; for item; do echo $item; done; IFS=$oIFS;
}
sql="SELECT codi, bloc, requisit FROM requisits ORDER BY codi";
qry=$(Query $sql0);
IFS=$'\n';
for row in $qry; do
r=( $(StrToArray $row) );
echo ${r[0]} - ${r[1]} - ${r[2]};
done
I tried various implementations, and none preserved arrays that had elements with spaces ... because they all had to use echo.
# These implementations only work if no array items contain spaces.
use_array() { eval echo '(' \"\${${1}\[\#\]}\" ')'; }
use_array() { local _array="${1}[#]"; echo '(' "${!_array}" ')'; }
Solution
Then I came across Dennis Williamson's answer. I incorporated his method into the following functions so they can a) accept an arbitrary array and b) be used to pass, duplicate and append arrays.
# Print array definition to use with assignments, for loops, etc.
# varname: the name of an array variable.
use_array() {
local r=$( declare -p $1 )
r=${r#declare\ -a\ *=}
# Strip keys so printed definition will be a simple list (like when using
# "${array[#]}"). One side effect of having keys in the definition is
# that when appending arrays (i.e. `a1+=$( use_array a2 )`), values at
# matching indices merge instead of pushing all items onto array.
echo ${r//\[[0-9]\]=}
}
# Same as use_array() but preserves keys.
use_array_assoc() {
local r=$( declare -p $1 )
echo ${r#declare\ -a\ *=}
}
Then, other functions can return an array using catchable output or indirect arguments.
# catchable output
return_array_by_printing() {
local returnme=( "one" "two" "two and a half" )
use_array returnme
}
eval test1=$( return_array_by_printing )
# indirect argument
return_array_to_referenced_variable() {
local returnme=( "one" "two" "two and a half" )
eval $1=$( use_array returnme )
}
return_array_to_referenced_variable test2
# Now both test1 and test2 are arrays with three elements
I needed a similar functionality recently, so the following is a mix of the suggestions made by RashaMatt and Steve Zobell.
echo each array/list element as separate line from within a function
use mapfile to read all array/list elements echoed by a function.
As far as I can see, strings are kept intact and whitespaces are preserved.
#!bin/bash
function create-array() {
local somearray=("aaa" "bbb ccc" "d" "e f g h")
for elem in "${somearray[#]}"
do
echo "${elem}"
done
}
mapfile -t resa <<< "$(create-array)"
# quick output check
declare -p resa
Some more variations…
#!/bin/bash
function create-array-from-ls() {
local somearray=("$(ls -1)")
for elem in "${somearray[#]}"
do
echo "${elem}"
done
}
function create-array-from-args() {
local somearray=("$#")
for elem in "${somearray[#]}"
do
echo "${elem}"
done
}
mapfile -t resb <<< "$(create-array-from-ls)"
mapfile -t resc <<< "$(create-array-from-args 'xxx' 'yy zz' 't s u' )"
sentenceA="create array from this sentence"
sentenceB="keep this sentence"
mapfile -t resd <<< "$(create-array-from-args ${sentenceA} )"
mapfile -t rese <<< "$(create-array-from-args "$sentenceB" )"
mapfile -t resf <<< "$(create-array-from-args "$sentenceB" "and" "this words" )"
# quick output check
declare -p resb
declare -p resc
declare -p resd
declare -p rese
declare -p resf
Here is a solution with no external array references and no IFS manipulation:
# add one level of single quotes to args, eval to remove
squote () {
local a=("$#")
a=("${a[#]//\'/\'\\\'\'}") # "'" => "'\''"
a=("${a[#]/#/\'}") # add "'" prefix to each word
a=("${a[#]/%/\'}") # add "'" suffix to each word
echo "${a[#]}"
}
create_array () {
local my_list=(a "b 'c'" "\\\"d
")
squote "${my_list[#]}"
}
my_algorithm () {
eval "local result=($(create_array))"
# result=([0]="a" [1]="b 'c'" [2]=$'\\"d\n')
}
[Note: the following was rejected as an edit of this answer for reasons that make no sense to me (since the edit was not intended to address the author of the post!), so I'm taking the suggestion to make it a separate answer.]
A simpler implementation of Steve Zobell's adaptation of Matt McClure's technique uses the bash built-in (since version == 4) readarray as suggested by RastaMatt to create a representation of an array that can be converted into an array at runtime. (Note that both readarray and mapfile name the same code.) It still avoids globals (allowing use of the function in a pipe), and still handles nasty characters.
For some more-fully-developed (e.g., more modularization) but still-kinda-toy examples, see bash_pass_arrays_between_functions. Following are a few easily-executable examples, provided here to avoid moderators b!tching about external links.
Cut the following block and paste it into a bash terminal to create /tmp/source.sh and /tmp/junk1.sh:
FP='/tmp/source.sh' # path to file to be created for `source`ing
cat << 'EOF' > "${FP}" # suppress interpretation of variables in heredoc
function make_junk {
echo 'this is junk'
echo '#more junk and "b#d" characters!'
echo '!#$^%^&(*)_^&% ^$##:"<>?/.,\\"'"'"
}
### Use 'readarray' (aka 'mapfile', bash built-in) to read lines into an array.
### Handles blank lines, whitespace and even nastier characters.
function lines_to_array_representation {
local -a arr=()
readarray -t arr
# output array as string using 'declare's representation (minus header)
declare -p arr | sed -e 's/^declare -a [^=]*=//'
}
EOF
FP1='/tmp/junk1.sh' # path to script to run
cat << 'EOF' > "${FP1}" # suppress interpretation of variables in heredoc
#!/usr/bin/env bash
source '/tmp/source.sh' # to reuse its functions
returned_string="$(make_junk | lines_to_array_representation)"
eval "declare -a returned_array=${returned_string}"
for elem in "${returned_array[#]}" ; do
echo "${elem}"
done
EOF
chmod u+x "${FP1}"
# newline here ... just hit Enter ...
Run /tmp/junk1.sh: output should be
this is junk
#more junk and "b#d" characters!
!#$^%^&(*)_^&% ^$##:"<>?/.,\\"'
Note lines_to_array_representation also handles blank lines. Try pasting the following block into your bash terminal:
FP2='/tmp/junk2.sh' # path to script to run
cat << 'EOF' > "${FP2}" # suppress interpretation of variables in heredoc
#!/usr/bin/env bash
source '/tmp/source.sh' # to reuse its functions
echo '`bash --version` the normal way:'
echo '--------------------------------'
bash --version
echo # newline
echo '`bash --version` via `lines_to_array_representation`:'
echo '-----------------------------------------------------'
bash_version="$(bash --version | lines_to_array_representation)"
eval "declare -a returned_array=${bash_version}"
for elem in "${returned_array[#]}" ; do
echo "${elem}"
done
echo # newline
echo 'But are they *really* the same? Ask `diff`:'
echo '-------------------------------------------'
echo 'You already know how to capture normal output (from `bash --version`):'
declare -r PATH_TO_NORMAL_OUTPUT="$(mktemp)"
bash --version > "${PATH_TO_NORMAL_OUTPUT}"
echo "normal output captured to file # ${PATH_TO_NORMAL_OUTPUT}"
ls -al "${PATH_TO_NORMAL_OUTPUT}"
echo # newline
echo 'Capturing L2AR takes a bit more work, but is not onerous.'
echo "Look # contents of the file you're about to run to see how it's done."
declare -r RAW_L2AR_OUTPUT="$(bash --version | lines_to_array_representation)"
declare -r PATH_TO_COOKED_L2AR_OUTPUT="$(mktemp)"
eval "declare -a returned_array=${RAW_L2AR_OUTPUT}"
for elem in "${returned_array[#]}" ; do
echo "${elem}" >> "${PATH_TO_COOKED_L2AR_OUTPUT}"
done
echo "output from lines_to_array_representation captured to file # ${PATH_TO_COOKED_L2AR_OUTPUT}"
ls -al "${PATH_TO_COOKED_L2AR_OUTPUT}"
echo # newline
echo 'So are they really the same? Per'
echo "\`diff -uwB "${PATH_TO_NORMAL_OUTPUT}" "${PATH_TO_COOKED_L2AR_OUTPUT}" | wc -l\`"
diff -uwB "${PATH_TO_NORMAL_OUTPUT}" "${PATH_TO_COOKED_L2AR_OUTPUT}" | wc -l
echo '... they are the same!'
EOF
chmod u+x "${FP2}"
# newline here ... just hit Enter ...
Run /tmp/junk2.sh # commandline. Your output should be similar to mine:
`bash --version` the normal way:
--------------------------------
GNU bash, version 4.3.30(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
`bash --version` via `lines_to_array_representation`:
-----------------------------------------------------
GNU bash, version 4.3.30(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
But are they *really* the same? Ask `diff`:
-------------------------------------------
You already know how to capture normal output (from `bash --version`):
normal output captured to file # /tmp/tmp.Ni1bgyPPEw
-rw------- 1 me me 308 Jun 18 16:27 /tmp/tmp.Ni1bgyPPEw
Capturing L2AR takes a bit more work, but is not onerous.
Look # contents of the file you're about to run to see how it's done.
output from lines_to_array_representation captured to file # /tmp/tmp.1D6O2vckGz
-rw------- 1 me me 308 Jun 18 16:27 /tmp/tmp.1D6O2vckGz
So are they really the same? Per
`diff -uwB /tmp/tmp.Ni1bgyPPEw /tmp/tmp.1D6O2vckGz | wc -l`
0
... they are the same!
There's no need to use eval or to change IFS to \n. There are at least 2 good ways to do this.
1) Using echo and mapfile
You can simply echo each item of the array in the function, then use mapfile to turn it into an array:
outputArray()
{
for i
{
echo "$i"
}
}
declare -a arr=( 'qq' 'www' 'ee rr' )
mapfile -t array < <(outputArray "${arr[#]}")
for i in "${array[#]}"
do
echo "i=$i"
done
To make it work using pipes, add (( $# == 0 )) && readarray -t temp && set "${temp[#]}" && unset temp to the top of output array. It converts stdin to parameters.
2) Using declare -p and sed
This can also be done using declare -p and sed instead of mapfile.
outputArray()
{
(( $# == 0 )) && readarray -t temp && set "${temp[#]}" && unset temp
for i; { echo "$i"; }
}
returnArray()
{
local -a arr=()
(( $# == 0 )) && readarray -t arr || for i; { arr+=("$i"); }
declare -p arr | sed -e 's/^declare -a [^=]*=//'
}
declare -a arr=( 'qq' 'www' 'ee rr' )
declare -a array=$(returnArray "${arr[#]}")
for i in "${array[#]}"
do
echo "i=$i"
done
declare -a array=$(outputArray "${arr[#]}" | returnArray)
echo
for i in "${array[#]}"
do
echo "i=$i"
done
declare -a array < <(outputArray "${arr[#]}" | returnArray)
echo
for i in "${array[#]}"
do
echo "i=$i"
done
This can also be done by simply passing array variable to the function and assign array values to this var then use this var outside of function. For example.
create_array() {
local __resultArgArray=$1
local my_list=("a" "b" "c")
eval $__resultArgArray="("${my_list[#]}")"
}
my_algorithm() {
create_array result
echo "Total elements in the array: ${#result[#]}"
for i in "${result[#]}"
do
echo $i
done
}
my_algorithm
The easest way y found
my_function()
{
array=(one two three)
echo ${array[#]}
}
result=($(my_function))
echo ${result[0]}
echo ${result[1]}
echo ${result[2]}
You can also use the declare -p method more easily by taking advantage of declare -a's double-evaluation when the value is a string (no true parens outside the string):
# return_array_value returns the value of array whose name is passed in.
# It turns the array into a declaration statement, then echos the value
# part of that statement with parentheses intact. You can use that
# result in a "declare -a" statement to create your own array with the
# same value. Also works for associative arrays with "declare -A".
return_array_value () {
declare Array_name=$1 # namespace locals with caps to prevent name collision
declare Result
Result=$(declare -p $Array_name) # dehydrate the array into a declaration
echo "${Result#*=}" # trim "declare -a ...=" from the front
}
# now use it. test for robustness by skipping an index and putting a
# space in an entry.
declare -a src=([0]=one [2]="two three")
declare -a dst="$(return_array_value src)" # rehydrate with double-eval
declare -p dst
> declare -a dst=([0]="one" [2]="two three") # result matches original
Verifying the result, declare -p dst yields declare -a dst=([0]="one" [2]="two three")", demonstrating that this method correctly deals with both sparse arrays as well as entries with an IFS character (space).
The first thing is to dehydrate the source array by using declare -p to generate a valid bash declaration of it. Because the declaration is a full statement, including "declare" and the variable name, we strip that part from the front with ${Result#*=}, leaving the parentheses with the indices and values inside: ([0]="one" [2]="two three").
It then rehydrates the array by feeding that value to your own declare statement, one where you choose the array name. It relies on the fact that the right side of the dst array declaration is a string with parentheses that are inside the string, rather than true parentheses in the declare itself, e.g. not declare -a dst=( "true parens outside string" ). This triggers declare to evaluate the string twice, once into a valid statement with parentheses (and quotes in the value preserved), and another for the actual assignment. I.e. it evaluates first to declare -a dst=([0]="one" [2]="two three"), then evaluates that as a statement.
Note that this double evaluation behavior is specific to the -a and -A options of declare.
Oh, and this method works with associative arrays as well, just change -a to -A.
Because this method relies on stdout, it works across subshell boundaries like pipelines, as others have noted.
I discuss this method in more detail in my blog post
If your source data is formatted with each list element on a separate line, then the mapfile builtin is a simple and elegant way to read a list into an array:
$ list=$(ls -1 /usr/local) # one item per line
$ mapfile -t arrayVar <<<"$list" # -t trims trailing newlines
$ declare -p arrayVar | sed 's#\[#\n[#g'
declare -a arrayVar='(
[0]="bin"
[1]="etc"
[2]="games"
[3]="include"
[4]="lib"
[5]="man"
[6]="sbin"
[7]="share"
[8]="src")'
Note that, as with the read builtin, you would not ordinarily* use mapfile in a pipeline (or subshell) because the assigned array variable would be unavailable to subsequent statements (* unless bash job control is disabled and shopt -s lastpipe is set).
$ help mapfile
mapfile: mapfile [-n count] [-O origin] [-s count] [-t] [-u fd] [-C callback] [-c quantum] [array]
Read lines from the standard input into an indexed array variable.
Read lines from the standard input into the indexed array variable ARRAY, or
from file descriptor FD if the -u option is supplied. The variable MAPFILE
is the default ARRAY.
Options:
-n count Copy at most COUNT lines. If COUNT is 0, all lines are copied.
-O origin Begin assigning to ARRAY at index ORIGIN. The default index is 0.
-s count Discard the first COUNT lines read.
-t Remove a trailing newline from each line read.
-u fd Read lines from file descriptor FD instead of the standard input.
-C callback Evaluate CALLBACK each time QUANTUM lines are read.
-c quantum Specify the number of lines read between each call to CALLBACK.
Arguments:
ARRAY Array variable name to use for file data.
If -C is supplied without -c, the default quantum is 5000. When
CALLBACK is evaluated, it is supplied the index of the next array
element to be assigned and the line to be assigned to that element
as additional arguments.
If not supplied with an explicit origin, mapfile will clear ARRAY before
assigning to it.
Exit Status:
Returns success unless an invalid option is given or ARRAY is readonly or
not an indexed array.
You can try this
my_algorithm() {
create_array list
for element in "${list[#]}"
do
echo "${element}"
done
}
create_array() {
local my_list=("1st one" "2nd two" "3rd three")
eval "${1}=()"
for element in "${my_list[#]}"
do
eval "${1}+=(\"${element}\")"
done
}
my_algorithm
The output is
1st one
2nd two
3rd three
I'd suggest piping to a code block to set values of an array. The strategy is POSIX compatible, so you get both Bash and Zsh, and doesn't run the risk of side effects like the posted solutions.
i=0 # index for our new array
declare -a arr # our new array
# pipe from a function that produces output by line
ls -l | { while read data; do i=$i+1; arr[$i]="$data"; done }
# example of reading that new array
for row in "${arr[#]}"; do echo "$row"; done
This will work for zsh and bash, and won't be affected by spaces or special characters. In the case of the OP, the output is transformed by echo, so it is not actually outputting an array, but printing it (as others mentioned shell functions return status not values). We can change it to a pipeline ready mechanism:
create_array() {
local my_list=("a", "b", "c")
for row in "${my_list[#]}"; do
echo "$row"
done
}
my_algorithm() {
i=0
declare -a result
create_array | { while read data; do i=$i+1; result[$i]="$data"; done }
}
If so inclined, one could remove the create_array pipeline process from my_algorithm and chain the two functions together
create_array | my_algorithm
A modern Bash implementation using #Q to safely output array elements:
#!/usr/bin/env bash
return_array_elements() {
local -a foo_array=('1st one' '2nd two' '3rd three')
printf '%s\n' "${foo_array[#]#Q}"
}
use_array_elements() {
local -a bar_array="($(return_array_elements))"
# Display declareation of bar_array
# which is local to this function, but whose elements
# hahaves been returned by the return_array_elements function
declare -p bar_array
}
use_array_elements
Output:
declare -a bar_array=([0]="1st one" [1]="2nd two" [2]="3rd three")
While the declare -p approach is elegant indeed, you can still create a global array using declare -g within a function and have it visible outside the scope of the function:
create_array() {
declare -ag result=("a", "b", "c")
}
my_algorithm() {
create_array
echo "${result[#]}"
}

Resources