Bash: Iterate over variable names - arrays

I'm writing a bash script to analyse some files. In a first iteration I create associative arrays with word counts for each category that is analysed. These categories are not known in advance so the names of these associative arrays are variable but all with the same prefix count_$category. The associative arrays have a word as key and its occurrence count in that category as value.
After all files are analysed, I have to summarise the results for each category. I can iterate over the variable names using ${count_*} but how can I access the associative arrays behind those variable names? For each associative array (each count_* variable) I should iterate over the words and their counts.
I have already tried with indirect access like this but it doesn't work:
for categorycount in ${count_*} # categorycount now holds the name of the associative array variable for each category
do
array=${!categorycount}
for word in ${!array[#]}
do
echo "$word occurred ${array[$word]} times"
done
done

The modern (bash 4.3+) approach uses "namevars", a facility borrowed from ksh:
for _count_var in "${!count_#}"; do
declare -n count=$_count_var # make count an alias for $_count_var
for key in "${!count[#]}"; do # iterate over keys, via same
echo "$key occurred ${count[$key]} times" # extract value, likewise
done
unset -n count # clear that alias
done
declare -n count=$count_var allows "${count[foo]}" to be used to look up item foo in the associative array named count_var; similarly, count[foo]=bar will assign to that same item. unset -n count then removes this mapping.
Prior to bash 4.3:
for _count_var in "${!count_#}"; do
printf -v cmd '_count_keys=( "${!%q[#]}" )' "$_count_var" && eval "$cmd"
for key in "${_count_keys[#]}"; do
var="$_count_var[$key]"
echo "$key occurred ${!var} times"
done
done
Note the use of %q, rather than substituting a variable name directly into a string, to generate the command to eval. Even though in this case we're probably safe (because the set of possible variable names is restricted), following this practice reduces the amount of context that needs to be considered to determine whether an indirect expansion is secure.
In both cases, note that internal variables (_count_var, _count_keys, etc) use names that don't match the count_* pattern.

Related

Splitting strings in nested loops and iterating through results bash [duplicate]

I try to list subfolders and save them as list to a variable
DIRS=($(find . -maxdepth 1 -mindepth 1 -type d ))
After that I want to do some work in subfolders. i.e.
for item in $DIRS
do
echo $item
But after
echo $DIRS
it gives only first item (subfolder). Could someone point me to error or propose another solution?
The following creates a bash array but lists only one of three subdirectories:
$ dirs=($(find . -maxdepth 1 -mindepth 1 -type d ))
$ echo $dirs
./subdir2
To see all the directories, you must use the [#] or [*] subscript form:
$ echo "${dirs[#]}"
./subdir2 ./subdir1 ./subdir3
Or, using it in a loop:
$ for item in "${dirs[#]}"; do echo $item; done
./subdir2
./subdir1
./subdir3
Avoiding problems from word splitting
Note that, in the code above, the shell performs word splitting before the array is created. Thus, this approach will fail if any subdirectories have whitespace in their names.
The following will successfully create an array of subdirectory names even if the names have spaces, tabs or newlines in them:
dirs=(*/)
If you need to use find and you want it to be safe for difficult file names, then use find's --exec option.
Documentation
The form $dirs returns just the first element of the array. This is documented in man bash:
Referencing an array variable without a subscript is equivalent to referencing the array with a subscript of 0.
The use of [#] and [*] is also documented in man bash:
Any element of an array may be referenced using ${name[subscript]}. The braces are required to avoid conflicts with pathname expansion. If subscript is # or , the word expands to all members of name. These subscripts differ only when the word appears within double quotes. If the word is double-quoted, ${name[]} expands to a single word with the value of each array member separated by the first character of the IFS special variable, and ${name[#]} expands each element of name to a separate word.

How to set arrays with variables with loop in bash [duplicate]

I am confused about a bash script.
I have the following code:
function grep_search() {
magic_way_to_define_magic_variable_$1=`ls | tail -1`
echo $magic_variable_$1
}
I want to be able to create a variable name containing the first argument of the command and bearing the value of e.g. the last line of ls.
So to illustrate what I want:
$ ls | tail -1
stack-overflow.txt
$ grep_search() open_box
stack-overflow.txt
So, how should I define/declare $magic_way_to_define_magic_variable_$1 and how should I call it within the script?
I have tried eval, ${...}, \$${...}, but I am still confused.
I've been looking for better way of doing it recently. Associative array sounded like overkill for me. Look what I found:
suffix=bzz
declare prefix_$suffix=mystr
...and then...
varname=prefix_$suffix
echo ${!varname}
From the docs:
The ‘$’ character introduces parameter expansion, command substitution, or arithmetic expansion. ...
The basic form of parameter expansion is ${parameter}. The value of parameter is substituted. ...
If the first character of parameter is an exclamation point (!), and parameter is not a nameref, it introduces a level of indirection. Bash uses the value formed by expanding the rest of parameter as the new parameter; this is then expanded and that value is used in the rest of the expansion, rather than the expansion of the original parameter. This is known as indirect expansion. The value is subject to tilde expansion, parameter expansion, command substitution, and arithmetic expansion. ...
Use an associative array, with command names as keys.
# Requires bash 4, though
declare -A magic_variable=()
function grep_search() {
magic_variable[$1]=$( ls | tail -1 )
echo ${magic_variable[$1]}
}
If you can't use associative arrays (e.g., you must support bash 3), you can use declare to create dynamic variable names:
declare "magic_variable_$1=$(ls | tail -1)"
and use indirect parameter expansion to access the value.
var="magic_variable_$1"
echo "${!var}"
See BashFAQ: Indirection - Evaluating indirect/reference variables.
Beyond associative arrays, there are several ways of achieving dynamic variables in Bash. Note that all these techniques present risks, which are discussed at the end of this answer.
In the following examples I will assume that i=37 and that you want to alias the variable named var_37 whose initial value is lolilol.
Method 1. Using a “pointer” variable
You can simply store the name of the variable in an indirection variable, not unlike a C pointer. Bash then has a syntax for reading the aliased variable: ${!name} expands to the value of the variable whose name is the value of the variable name. You can think of it as a two-stage expansion: ${!name} expands to $var_37, which expands to lolilol.
name="var_$i"
echo "$name" # outputs “var_37”
echo "${!name}" # outputs “lolilol”
echo "${!name%lol}" # outputs “loli”
# etc.
Unfortunately, there is no counterpart syntax for modifying the aliased variable. Instead, you can achieve assignment with one of the following tricks.
1a. Assigning with eval
eval is evil, but is also the simplest and most portable way of achieving our goal. You have to carefully escape the right-hand side of the assignment, as it will be evaluated twice. An easy and systematic way of doing this is to evaluate the right-hand side beforehand (or to use printf %q).
And you should check manually that the left-hand side is a valid variable name, or a name with index (what if it was evil_code # ?). By contrast, all other methods below enforce it automatically.
# check that name is a valid variable name:
# note: this code does not support variable_name[index]
shopt -s globasciiranges
[[ "$name" == [a-zA-Z_]*([a-zA-Z_0-9]) ]] || exit
value='babibab'
eval "$name"='$value' # carefully escape the right-hand side!
echo "$var_37" # outputs “babibab”
Downsides:
does not check the validity of the variable name.
eval is evil.
eval is evil.
eval is evil.
1b. Assigning with read
The read builtin lets you assign values to a variable of which you give the name, a fact which can be exploited in conjunction with here-strings:
IFS= read -r -d '' "$name" <<< 'babibab'
echo "$var_37" # outputs “babibab\n”
The IFS part and the option -r make sure that the value is assigned as-is, while the option -d '' allows to assign multi-line values. Because of this last option, the command returns with an non-zero exit code.
Note that, since we are using a here-string, a newline character is appended to the value.
Downsides:
somewhat obscure;
returns with a non-zero exit code;
appends a newline to the value.
1c. Assigning with printf
Since Bash 3.1 (released 2005), the printf builtin can also assign its result to a variable whose name is given. By contrast with the previous solutions, it just works, no extra effort is needed to escape things, to prevent splitting and so on.
printf -v "$name" '%s' 'babibab'
echo "$var_37" # outputs “babibab”
Downsides:
Less portable (but, well).
Method 2. Using a “reference” variable
Since Bash 4.3 (released 2014), the declare builtin has an option -n for creating a variable which is a “name reference” to another variable, much like C++ references. Just as in Method 1, the reference stores the name of the aliased variable, but each time the reference is accessed (either for reading or assigning), Bash automatically resolves the indirection.
In addition, Bash has a special and very confusing syntax for getting the value of the reference itself, judge by yourself: ${!ref}.
declare -n ref="var_$i"
echo "${!ref}" # outputs “var_37”
echo "$ref" # outputs “lolilol”
ref='babibab'
echo "$var_37" # outputs “babibab”
This does not avoid the pitfalls explained below, but at least it makes the syntax straightforward.
Downsides:
Not portable.
Risks
All these aliasing techniques present several risks. The first one is executing arbitrary code each time you resolve the indirection (either for reading or for assigning). Indeed, instead of a scalar variable name, like var_37, you may as well alias an array subscript, like arr[42]. But Bash evaluates the contents of the square brackets each time it is needed, so aliasing arr[$(do_evil)] will have unexpected effects… As a consequence, only use these techniques when you control the provenance of the alias.
function guillemots {
declare -n var="$1"
var="«${var}»"
}
arr=( aaa bbb ccc )
guillemots 'arr[1]' # modifies the second cell of the array, as expected
guillemots 'arr[$(date>>date.out)1]' # writes twice into date.out
# (once when expanding var, once when assigning to it)
The second risk is creating a cyclic alias. As Bash variables are identified by their name and not by their scope, you may inadvertently create an alias to itself (while thinking it would alias a variable from an enclosing scope). This may happen in particular when using common variable names (like var). As a consequence, only use these techniques when you control the name of the aliased variable.
function guillemots {
# var is intended to be local to the function,
# aliasing a variable which comes from outside
declare -n var="$1"
var="«${var}»"
}
var='lolilol'
guillemots var # Bash warnings: “var: circular name reference”
echo "$var" # outputs anything!
Source:
BashFaq/006: How can I use variable variables (indirect variables, pointers, references) or associative arrays?
BashFAQ/048: eval command and security issues
Example below returns value of $name_of_var
var=name_of_var
echo $(eval echo "\$$var")
Use declare
There is no need on using prefixes like on other answers, neither arrays. Use just declare, double quotes, and parameter expansion.
I often use the following trick to parse argument lists contanining one to n arguments formatted as key=value otherkey=othervalue etc=etc, Like:
# brace expansion just to exemplify
for variable in {one=foo,two=bar,ninja=tip}
do
declare "${variable%=*}=${variable#*=}"
done
echo $one $two $ninja
# foo bar tip
But expanding the argv list like
for v in "$#"; do declare "${v%=*}=${v#*=}"; done
Extra tips
# parse argv's leading key=value parameters
for v in "$#"; do
case "$v" in ?*=?*) declare "${v%=*}=${v#*=}";; *) break;; esac
done
# consume argv's leading key=value parameters
while test $# -gt 0; do
case "$1" in ?*=?*) declare "${1%=*}=${1#*=}";; *) break;; esac
shift
done
Combining two highly rated answers here into a complete example that is hopefully useful and self-explanatory:
#!/bin/bash
intro="You know what,"
pet1="cat"
pet2="chicken"
pet3="cow"
pet4="dog"
pet5="pig"
# Setting and reading dynamic variables
for i in {1..5}; do
pet="pet$i"
declare "sentence$i=$intro I have a pet ${!pet} at home"
done
# Just reading dynamic variables
for i in {1..5}; do
sentence="sentence$i"
echo "${!sentence}"
done
echo
echo "Again, but reading regular variables:"
echo $sentence1
echo $sentence2
echo $sentence3
echo $sentence4
echo $sentence5
Output:
You know what, I have a pet cat at home
You know what, I have a pet chicken at home
You know what, I have a pet cow at home
You know what, I have a pet dog at home
You know what, I have a pet pig at home
Again, but reading regular variables:
You know what, I have a pet cat at home
You know what, I have a pet chicken at home
You know what, I have a pet cow at home
You know what, I have a pet dog at home
You know what, I have a pet pig at home
This will work too
my_country_code="green"
x="country"
eval z='$'my_"$x"_code
echo $z ## o/p: green
In your case
eval final_val='$'magic_way_to_define_magic_variable_"$1"
echo $final_val
This should work:
function grep_search() {
declare magic_variable_$1="$(ls | tail -1)"
echo "$(tmpvar=magic_variable_$1 && echo ${!tmpvar})"
}
grep_search var # calling grep_search with argument "var"
An extra method that doesn't rely on which shell/bash version you have is by using envsubst. For example:
newvar=$(echo '$magic_variable_'"${dynamic_part}" | envsubst)
For zsh (newers mac os versions), you should use
real_var="holaaaa"
aux_var="real_var"
echo ${(P)aux_var}
holaaaa
Instead of "!"
As per BashFAQ/006, you can use read with here string syntax for assigning indirect variables:
function grep_search() {
read "$1" <<<$(ls | tail -1);
}
Usage:
$ grep_search open_box
$ echo $open_box
stack-overflow.txt
Even though it's an old question, I still had some hard time with fetching dynamic variables names, while avoiding the eval (evil) command.
Solved it with declare -n which creates a reference to a dynamic value, this is especially useful in CI/CD processes, where the required secret names of the CI/CD service are not known until runtime. Here's how:
# Bash v4.3+
# -----------------------------------------------------------
# Secerts in CI/CD service, injected as environment variables
# AWS_ACCESS_KEY_ID_DEV, AWS_SECRET_ACCESS_KEY_DEV
# AWS_ACCESS_KEY_ID_STG, AWS_SECRET_ACCESS_KEY_STG
# -----------------------------------------------------------
# Environment variables injected by CI/CD service
# BRANCH_NAME="DEV"
# -----------------------------------------------------------
declare -n _AWS_ACCESS_KEY_ID_REF=AWS_ACCESS_KEY_ID_${BRANCH_NAME}
declare -n _AWS_SECRET_ACCESS_KEY_REF=AWS_SECRET_ACCESS_KEY_${BRANCH_NAME}
export AWS_ACCESS_KEY_ID=${_AWS_ACCESS_KEY_ID_REF}
export AWS_SECRET_ACCESS_KEY=${_AWS_SECRET_ACCESS_KEY_REF}
echo $AWS_ACCESS_KEY_ID $AWS_SECRET_ACCESS_KEY
aws s3 ls
Wow, most of the syntax is horrible! Here is one solution with some simpler syntax if you need to indirectly reference arrays:
#!/bin/bash
foo_1=(fff ddd) ;
foo_2=(ggg ccc) ;
for i in 1 2 ;
do
eval mine=( \${foo_$i[#]} ) ;
echo ${mine[#]}" " ;
done ;
For simpler use cases I recommend the syntax described in the Advanced Bash-Scripting Guide.
KISS approach:
a=1
c="bam"
let "$c$a"=4
echo $bam1
results in 4
I want to be able to create a variable name containing the first argument of the command
script.sh file:
#!/usr/bin/env bash
function grep_search() {
eval $1=$(ls | tail -1)
}
Test:
$ source script.sh
$ grep_search open_box
$ echo $open_box
script.sh
As per help eval:
Execute arguments as a shell command.
You may also use Bash ${!var} indirect expansion, as already mentioned, however it doesn't support retrieving of array indices.
For further read or examples, check BashFAQ/006 about Indirection.
We are not aware of any trick that can duplicate that functionality in POSIX or Bourne shells without eval, which can be difficult to do securely. So, consider this a use at your own risk hack.
However, you should re-consider using indirection as per the following notes.
Normally, in bash scripting, you won't need indirect references at all. Generally, people look at this for a solution when they don't understand or know about Bash Arrays or haven't fully considered other Bash features such as functions.
Putting variable names or any other bash syntax inside parameters is frequently done incorrectly and in inappropriate situations to solve problems that have better solutions. It violates the separation between code and data, and as such puts you on a slippery slope toward bugs and security issues. Indirection can make your code less transparent and harder to follow.
For indexed arrays, you can reference them like so:
foo=(a b c)
bar=(d e f)
for arr_var in 'foo' 'bar'; do
declare -a 'arr=("${'"$arr_var"'[#]}")'
# do something with $arr
echo "\$$arr_var contains:"
for char in "${arr[#]}"; do
echo "$char"
done
done
Associative arrays can be referenced similarly but need the -A switch on declare instead of -a.
POSIX compliant answer
For this solution you'll need to have r/w permissions to the /tmp folder.
We create a temporary file holding our variables and leverage the -a flag of the set built-in:
$ man set
...
-a Each variable or function that is created or modified is given the export attribute and marked for export to the environment of subsequent commands.
Therefore, if we create a file holding our dynamic variables, we can use set to bring them to life inside our script.
The implementation
#!/bin/sh
# Give the temp file a unique name so you don't mess with any other files in there
ENV_FILE="/tmp/$(date +%s)"
MY_KEY=foo
MY_VALUE=bar
echo "$MY_KEY=$MY_VALUE" >> "$ENV_FILE"
# Now that our env file is created and populated, we can use "set"
set -a; . "$ENV_FILE"; set +a
rm "$ENV_FILE"
echo "$foo"
# Output is "bar" (without quotes)
Explaining the steps above:
# Enables the -a behavior
set -a
# Sources the env file
. "$ENV_FILE"
# Disables the -a behavior
set +a
While I think declare -n is still the best way to do it there is another way nobody mentioned it, very useful in CI/CD
function dynamic(){
export a_$1="bla"
}
dynamic 2
echo $a_2
This function will not support spaces so dynamic "2 3" will return an error.
for varname=$prefix_suffix format, just use:
varname=${prefix}_suffix

Iterating over list of arrays in bash

I need to iterate over couple of key-value arrays (associative arrays) in bash. Here's my last attempt:
declare -A ARR1
ARR1[foo]=bar
declare -A ARR2
ARR2[foo]=baz
arrays=(ARR1 ARR2)
for idx in "${arrays[#]}"; do
echo ${${idx}[foo]};
done
which is wrong of course (syntax error), but at this moment, I have no other ideas how to deal with it. Whereas in the following example there are no errors, but the output is just a name of an array.
for idx in "${array[#]}"; do
echo "${idx[foo]}";
done
------- OUTPUT -------
ARR1
ARR2
EDIT
Ok, I was able to do it by using eval:
eval echo \${${idx}[foo]};
However, I read that using eval in bash scripts is not such a good idea. Is there any better way?
Bash 4.3-alpha introduced the nameref attribute, which could be used in this case:
declare -A arr1=([foo]=bar)
declare -A arr2=([foo]=baz)
arrays=(arr1 arr2)
for idx in "${arrays[#]}"; do
declare -n temp="$idx"
echo "${temp[foo]}"
done
gives the output
bar
baz
As pointed out by kojiro in his comment, storing the array names in an array to iterate over is not actually required as long as the names have a shared prefix.
arrays=(arr1 arr2)
for idx in "${arrays[#]}"; do
could be replaced by
for idx in "${!arr#}"; do
Notice that despite the exclamation mark, this has nothing to do with indirect expansion.
Relevant excerpts from the reference manual
Section "Shell Parameters":
A variable can be assigned the nameref attribute using the -n
option to the declare or local builtin commands (see Bash
Builtins)
to create a nameref, or a reference to another variable. This allows
variables to be manipulated indirectly. Whenever the nameref variable
is referenced or assigned to, the operation is actually performed on
the variable specified by the nameref variable's value. A nameref is
commonly used within shell functions to refer to a variable whose name
is passed as an argument to the function. For instance, if a variable
name is passed to a shell function as its first argument, running
declare -n ref=$1
inside the function creates a nameref variable ref whose value is
the variable name passed as the first argument. References and
assignments to ref are treated as references and assignments to the
variable whose name was passed as $1.
If the control variable in a for loop has the nameref attribute, the
list of words can be a list of shell variables, and a name reference
will be established for each word in the list, in turn, when the loop
is executed. Array variables cannot be given the -n attribute.
However, nameref variables can reference array variables and
subscripted array variables. Namerefs can be unset using the -n
option to the unset builtin (see Bourne Shell
Builtins).
Otherwise, if unset is executed with the name of a nameref variable
as an argument, the variable referenced by the nameref variable will
be unset.
Section "Shell Parameter Expansion":
${!prefix*}${!prefix#}
Expands to the names of variables whose names begin with prefix,
separated by the first character of the IFS special variable. When
# is used and the expansion appears within double quotes, each
variable name expands to a separate word.
You can do this like with indirect substitution, but ideally you don't want to use bash for something like this if it's anything but a one-off script.
#!/bin/bash
set -ex
declare -A ARR1
ARR1[foo]=bar
declare -A ARR2
ARR2[foo]=baz
arrays=(ARR1 ARR2)
for idx in "${arrays[#]}"; do
x="${idx}[foo]" # Store as a string in a variable
echo ${!x} # Reference using indirection.
done
See http://mywiki.wooledge.org/BashFAQ/006 for more info.

Storing Bash associative arrays

I want to store (and retrieve, of course) Bash's associative arrays and am looking for a simple way to do that.
I know that it is possible to do it using a look over all keys:
for key in "${!arr[#]}"
do
echo "$key ${arr[$key]}"
done
Retrieving it could also be done in a loop:
declare -A arr
while read key value
do
arr[$key]=$value
done < store
But I also see that set will print a version of the array in this style:
arr=([key1]="value1" [key2]="value2" )
(Unfortunately along with all other shell variables.)
Is there a simpler way for storing and retrieving an associative array than my proposed loop?
To save to a file:
declare -p arr > saved.sh
(You can also use typeset instead of declare if you prefer.)
To load from the file:
source saved.sh

how to enumerate multiple arrays using same base name

I am trying to create multiple arrays holding random lists of file names referencing the number of elements in another array. How can I append a $cntr var (beginning with cntr=0) to the end of the new array names so they are directly referenced with elements in other array?
Wow I hope that reads somewhat sensible. Here is what I got going on so far that I hope helps make better sense of what I mean:
function fGenRanList() {
cntr=0
while [[ "$cntr" -lt "${#mTypeAr[#]}" ]] ; do
n="${nAr[$cntr]}" ; echo "\$n: $n"
tracks${cntr}=() ; echo "\$tracks${cntr}: $tracks${cntr}"
while ((n > 0)) && IFS= read -rd $'\0' ; do
tracks${cntr}+=("$REPLY")
((n--))
done < <(sort -zuR <(find "${dirAr[$cntr]}" -type f \( -name '*.mp3' -o -name '*.ogg' \) -print0))
((cntr++))
done
}
error I get is:
/home/user/bin/ranSong_multDirs.sh: line 95: syntax error near unexpected token `"$REPLY"'
/home/user/bin/ranSong_multDirs.sh: line 95: ` tracks${cntr}+=("$REPLY")'
But I first commentted out the echo statements from the tracks${cntr}=() array initialization to get rid of a similar error, but unsure whether or not track${cntr} gets initialized in the first place.
By the end I should end up with as many track(n) arrays as there are elements in ${#mTypeAr[#]}, using the numeric var stored in array ${nAr[$cntr]} to determine how many elements each track array will contain.
Maybe I am making things more difficult than need be, trying to implement arrays into older scripts I have both in order to make them a little more efficient, but I guess am driven primarily to get a better handle on using BASH arrays to store vars for similar but multiple processes which I seem to do often in my scripts.
Change this line, which is not valid bash syntax,
tracks${cntr}+=("$REPLY")
to
declare "tracks${cntr}+=($REPLY)"
Rather than having a syntactic assignment, the declare command takes a string that *look*s like an assignment as an argument; that argument is processed by the shell first, so if cntr is currently 3 and $REPLY is foo, the actual assignment performed is
tracks3+=(foo)
The declare command gives you a level of indirection in making parameter assignments.

Resources