Elegant use of arrays in ksh - arrays

I'm trying to build a sort of property set in ksh.
I thought the easiest way to do so was using arrays, but the syntax is killing me.
What I want is to:
Build an arbitrarily sized array in a config file, where each item has a name and a property.
Iterate over each item in that list and get that property.
In theory, what I wish I could do is something like:
MONITORINGSYS={
SYS1={NAME="GENERATOR" MONITORFUNC="getGeneratorStatus"}
SYS2={NAME="COOLER" MONITORFUNC="getCoolerStatus"}
}
Then later on, be able to do something like:
for CURSYS in $MONITORINGSYS
do
CSYSNAME=$CURSYS.NAME
CSYSFUNC=$CURSYS.MONITORFUNC
REPORT="$REPORT\n$CSYSNAME"
CSYSSTATUS=CSYSFUNC $(date)
REPORT="$REPORT\t$CSYSSTATUS"
done
echo $REPORT
Well, that's not real programming, but I guess you got the point..
How do I do that?
[EDIT]
I do not mean that I want to use associative arrays. I only put it this way to make my question clearer. That is, it would not be a problem if the loop were something like:
for CURSYS in $MONITORINGSYS
do
CSYSNAME=${CURSYS[0]}
CSYSFUNC=${CURSYS[1]}
REPORT="$REPORT\n$CSYSNAME"
CSYSSTATUS=CSYSFUNC $(date)
REPORT="$REPORT\t$CSYSSTATUS"
done
echo $REPORT
Same applies to the config file.. I'm just looking for a syntax that makes it minimally readable.
cheers

Not exactly sure what you want... Kornshell can handle both associative and indexed arrays.
However, Kornshell arrays are one dimensional. It might be possible to use indirection to emulate a two dimensional array via the use of $() and eval. I did this a couple of times in the older Perl 4.x and Perl 3.x, but it's a pain. If you want multidimensional arrays, use Python or Perl.
The only thing is that you must declare arrays via the typeset command:
$ typeset -A foohash #foohash is an associative array
$ typeset -a foolist #foolist is an integer indexed array.
Maybe your script can look something like this
typeset -a sysname
typeset -a sysfunc
# No spaces are allowed around '=' in an assignment.
sysname[1]="GENERATOR"
sysname[2]="COOLER"
sysfunc[1]="getGeneratorStatus"
sysfunc[2]="getCoolerStatus"
for CURSYS in {1..2}
do
CSYSNAME="${sysname[$CURSYS]}"
CSYSFUNC="${sysfunc[$CURSYS]}"
REPORT="$REPORT\n$CSYSNAME"
# Run the function whose name is stored in CSYSFUNC, passing the date.
CSYSSTATUS=$(eval "$CSYSFUNC $(date)")
REPORT="$REPORT\t$CSYSSTATUS"
done
echo "$REPORT"
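The parallel-array idea above can be sketched as a complete script. This is only a minimal sketch, written for bash (the constructs used should also be accepted by ksh93); the two status functions are hypothetical stand-ins, since the question never shows their bodies, and the lowercase variable names are my own.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the real monitoring functions.
getGeneratorStatus() { echo "OK"; }
getCoolerStatus()    { echo "WARN"; }

# Two parallel indexed arrays: element i of each describes system i.
sysname=(GENERATOR COOLER)
sysfunc=(getGeneratorStatus getCoolerStatus)

report=""
for i in "${!sysname[@]}"; do
    # Call the function whose name is stored in sysfunc[i]; no eval needed.
    status=$("${sysfunc[$i]}" "$(date)")
    report+="${sysname[$i]}"$'\t'"${status}"$'\n'
done
printf '%s' "$report"
```

Expanding the function name directly avoids eval, and adding a third system only means appending one element to each array.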

ksh93 now has compound variables which can contain a mixture of indexed and associative arrays. No need to declare it as ksh will work it out itself.
#!/bin/ksh
MONITORINGSYS=(
[SYS1]=(NAME="GENERATOR" MONITORFUNC="getGeneratorStatus")
[SYS2]=(NAME="COOLER" MONITORFUNC="getCoolerStatus")
)
echo MONITORING REPORT
echo "-----------------"
for sys in ${!MONITORINGSYS[*]}; do
echo "System: $sys"
echo "Name: ${MONITORINGSYS[$sys].NAME}"
echo "Generator: ${MONITORINGSYS[$sys].MONITORFUNC}"
echo
done
Output:
MONITORING REPORT
-----------------
System: SYS1
Name: GENERATOR
Generator: getGeneratorStatus
System: SYS2
Name: COOLER
Generator: getCoolerStatus

Related

Bash Arrays - Zip array of variable names with corresponding values

As I am teaching myself Bash programming, I came across an interesting use case, where I want to take a list of variables that exist in the environment, and put them into an array. Then, I want to output a list of the variable names and their values, and store that output in an array, one entry per variable.
I'm only about 2 weeks into Bash shell scripting in any "real" way, and I am educating myself on arrays. A common function in other programming languages is the ability to "zip" two arrays, e.g. as is done in Python. Another common feature in any programming language is indirection, e.g. via pointers. This is largely academic, to teach myself through a somewhat challenging example, but I think it has widespread use, if for no other reason than debugging, keeping track of overall system state, etc.
What I want is for the following input... :
VAR_ONE="LIGHT RED"
VAR_TWO="DARK GREEN"
VAR_THREE="BLUE"
VARIABLE_ARRAY=(VAR_ONE VAR_TWO VAR_THREE)
... to be converted into the following output (as an array, one element per line):
VAR_ONE: LIGHT RED
VAR_TWO: DARK GREEN
VAR_THREE: BLUE
Constraints:
Assume that I do not have control of all of the variables, so I cannot just sidestep the problem e.g. by using an associative array from the get-go. (i.e. please do not recommend avoiding the need for indirect reference lookups altogether by never having a discrete variable named "VAR_ONE"). But a solution that stores the result in an associative array is fine.
Assume that variable names will never contain spaces, but their values might.
The final output should not contain separate elements just because the input variables had values containing spaces.
What I've read about so far:
I've read some StackOverflow posts like this one, that deal with using indirect references to arrays themselves (e.g. if you have three arrays and want to choose which one to pull from based on an "array choice" variable): How to iterate over an array using indirect reference?
I've also found one single post that deals with "zipping" arrays in Bash in the manner I'm talking about, where you pair-up e.g. the 1st element from array1 and array2, then pair up the 2nd elements, etc.: Iterate over two arrays simultaneously in bash
...but I haven't found anything that quite discusses this unique use-case...
QUESTION:
How should I make an array containing a list of variable names and their values (colon-separated), given an array containing a list of variable names only. I'm not "failing to come up with any way to do it" but I want to find the "preferred" way to do this in Bash, considering performance, security, and being concise/understandable.
EDIT: I'll post what I've come up with thus far as an answer to this post... but not mark it as answered, since I want to also hear some unbiased recommendations...
OP starts with:
VAR_ONE="LIGHT RED"
VAR_TWO="DARK GREEN"
VAR_THREE="BLUE"
VARIABLE_ARRAY=(VAR_ONE VAR_TWO VAR_THREE)
OP has provided an answer with 4 sets of code:
# first 3 sets of code generate:
$ typeset -p outputValues
declare -a outputValues=([0]="VAR_ONE: LIGHT RED" [1]="VAR_TWO: DARK GREEN" [2]="VAR_THREE: BLUE")
# the 4th set of code generates the following where the data values are truncated at the first space:
$ typeset -p outputValues
declare -a outputValues=([0]="VAR_ONE: LIGHT" [1]="VAR_TWO: DARK" [2]="VAR_THREE: BLUE")
NOTES:
I'm assuming the output from the 4th set of code is wrong so will be ignoring this one
OP's code samples touch on a couple of ideas I'm going to make use of (below): namerefs (declare -n <variable_name>) and indirect variable references (${!<variable_name>})
For readability (and maintainability by others) I'd probably avoid the various eval and expansion ideas and instead opt for using bash namerefs (declare -n); a quick example:
$ x=5
$ echo "${x}"
5
$ y=x
$ echo "${y}"
x
$ declare -n y="x" # nameref => y=(value of x)
$ echo "${y}"
5
Pulling this into the original issue we get:
unset outputValues
declare -a outputValues # optional; declare 'normal' array
for var_name in "${VARIABLE_ARRAY[@]}"
do
declare -n data_value="${var_name}"
outputValues+=("${var_name}: ${data_value}")
done
Which gives us:
$ typeset -p outputValues
declare -a outputValues=([0]="VAR_ONE: LIGHT RED" [1]="VAR_TWO: DARK GREEN" [2]="VAR_THREE: BLUE")
While this generates the same results (as OP's first 3 sets of code) there is (for me) the nagging question of how is this new array going to be used?
If the sole objective is to print this pre-formatted data to stdout ... ok, though why bother with a new array when the same can be done with the current array and nameref's?
If the objective is to access this array as sets of variable name/value pairs for processing purposes, then the current structure is going to be hard(er) to work with, eg, each array 'value' will need to be parsed/split based on the delimiter :<space> in order to access the actual variable names and values.
In this scenario I'd opt for using an associative array, eg:
unset outputValues
declare -A outputValues # required; declare associative array
for var_name in "${VARIABLE_ARRAY[@]}"
do
declare -n data_value="${var_name}"
outputValues[${var_name}]="${data_value}"
done
Which gives us:
$ typeset -p outputValues
declare -A outputValues=([VAR_ONE]="LIGHT RED" [VAR_THREE]="BLUE" [VAR_TWO]="DARK GREEN" )
NOTES:
again, why bother with a new array when the same can be done with the current array and nameref's?
if the variable $data_value is to be re-used in follow-on code as a 'normal' variable it will be necessary to remove the nameref attribute (unset -n data_value)
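The last note can be illustrated with a short sketch. It assumes bash 4.3 or newer; the helper names after_write and after_unset exist only for this demonstration.

```shell
#!/usr/bin/env bash
# Requires bash 4.3 or newer for namerefs.
VAR_ONE="LIGHT RED"

declare -n data_value=VAR_ONE   # data_value is now an alias for VAR_ONE
data_value="changed"            # writes through the nameref into VAR_ONE
after_write=$VAR_ONE

unset -n data_value             # removes the nameref itself, not VAR_ONE
data_value="scratch"            # now an ordinary, independent variable
after_unset=$VAR_ONE

printf '%s\n' "$after_write" "$after_unset"
```

Both printed lines read "changed": the first assignment went through the nameref, while the second, made after unset -n, left VAR_ONE alone.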
With an associative array (index=variable name / array element=variable value) it becomes easier to reference the variable name/value pairs, eg:
$ myvar=VAR_ONE
$ echo "${myvar}: ${outputValues[${myvar}]}"
VAR_ONE: LIGHT RED
$ for var_name in "${!outputValues[@]}"; do echo "${var_name}: ${outputValues[${var_name}]}"; done
VAR_ONE: LIGHT RED
VAR_THREE: BLUE
VAR_TWO: DARK GREEN
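The loop over the keys above prints them in the associative array's internal order, not the order of VARIABLE_ARRAY. If the original order matters, one option (a sketch, not part of the answer as written) is to drive the loop from the indexed array and use the associative array only for lookups; the array name ordered is made up for this example.

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or newer for associative arrays.
VARIABLE_ARRAY=(VAR_ONE VAR_TWO VAR_THREE)
declare -A outputValues=(
    [VAR_ONE]="LIGHT RED"
    [VAR_TWO]="DARK GREEN"
    [VAR_THREE]="BLUE"
)

ordered=()
for var_name in "${VARIABLE_ARRAY[@]}"; do
    # The indexed array fixes the order; the associative array supplies values.
    ordered+=("${var_name}: ${outputValues[$var_name]}")
done
printf '%s\n' "${ordered[@]}"
```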
In older versions of bash (before nameref's were available), and still available in newer versions of bash, there's the option of using indirect variable references;
$ x=5
$ echo "${x}"
5
$ unset -n y # make sure 'y' has not been previously defined as a nameref
$ y=x
$ echo "${y}"
x
$ echo "${!y}"
5
Pulling this into the associative array approach:
unset -n var_name # make sure var_name not previously defined as a nameref
unset outputValues
declare -A outputValues # required; declare associative array
for var_name in "${VARIABLE_ARRAY[@]}"
do
outputValues[${var_name}]="${!var_name}"
done
Which gives us:
$ typeset -p outputValues
declare -A outputValues=([VAR_ONE]="LIGHT RED" [VAR_THREE]="BLUE" [VAR_TWO]="DARK GREEN" )
NOTE: While this requires less coding in the for loop, if you forget to unset -n the variable (var_name in this case) then you'll end up with the wrong results if var_name was previously defined as a nameref; perhaps a minor issue but it requires the coder to know of, and code for, this particular issue ... a bit too esoteric (for my taste) so I prefer to stick with namerefs ... ymmv ...
I've come up with a handful of possible solutions in the last couple days, each one with their own pro's and con's. I won't mark this as the answer for awhile though, since I'm interested in hearing unbiased recommendations.
My brainstorming solutions thus far:
OPTION #1 - FOR-LOOP:
alias PrintCommandValues='unset outputValues
for var in ${VARIABLE_ARRAY[@]}
do outputValues+=("${var}: ${!var}")
done; printf "%s\n\n" "${outputValues[@]}"'
PrintCommandValues
Pros: Traditional, easy to understand
Cons: A little verbose. I'm not sure about Bash, but I've been doing a lot of Mathematica programming (imperative-style), where such loops are notably slower. Anybody know if that's true for Bash?
OPTION #2 - EVAL:
i=0; outputValues=("${VARIABLE_ARRAY[@]}")
eval declare "${VARIABLE_ARRAY[@]/#/outputValues[i++]+=:\\ $}"
printf "%s\n\n" "${outputValues[@]}"
Pros: Shorter than the for-loop, and still easy to understand.
Cons: I'm no expert, but I've read a lot of warnings to avoid eval whenever possible, due to security issues. Probably not something I'll concern myself a ton over when I'm mostly writing scripts for "handy utility purposes" for my personal machine only, but...
OPTION #3 - QUOTED DECLARE WITH PARENTHESIS:
i=0; declare -a outputValues="(${VARIABLE_ARRAY[@]/%/'\:\ "${!VARIABLE_ARRAY[i++]}"'})"
printf "%s\n\n" "${outputValues[@]}"
Pros: Super-concise. I just plain stumbled onto this syntax -- I haven't found it mentioned anywhere on the web. Apparently, using declare in Bash (I use version 4.4.20(1)), if (and ONLY if) you place array-style (...) brackets after the equals-sign, and quote it, you get one more "round" of expansion/dereferencing, similar to eval. I happened to be toying with this post, and found the part about the "extra expansion" by accident.
For example, compare these two tests:
varName=varOne; varOne=something
declare test1=\$$varName
declare -a test2="(\$$varName)"
declare -p test1 test2
Output:
declare -- test1="\$varOne"
declare -a test2=([0]="something")
Pretty neat, I think...
Anyways, the cons for this method are... I've never seen it documented officially or unofficially anywhere, so... portability...?
Alternative for this option:
i=0; declare -a LABELED_VARIABLE_ARRAY="(${VARIABLE_ARRAY[@]/%/'\:\ \$"${VARIABLE_ARRAY[i++]}"'})"
declare -a outputValues=("${LABELED_VARIABLE_ARRAY[@]@P}")
printf "%s\n\n" "${outputValues[@]}"
JUST FOR FUN - BRACE EXPANSION:
unset outputValues; OLDIFS=$IFS; IFS=; i=0; j=0
declare -n nameCursor=outputValues[i++]; declare -n valueCursor=outputValues[j++]
declare {nameCursor+=,valueCursor+=": "$}{VAR_ONE,VAR_TWO,VAR_THREE}
printf "%s\n\n" "${outputValues[@]}"
IFS=$OLDIFS
Pros: ??? Maybe speed?
Cons: Pretty verbose, not very easy to understand
Anyways, those are all of my methods... Are any of them reasonable, or would you do something different altogether?

Parsing a settingsfile in bashscript where some special settings are arrays

I am still quite new to bash scripting and I am somehow stuck.
I am looking for a clean and easy way to parse a settingsfile, where some special (and known) settings are arrays.
So the settings file looks like this.
foo=(1 2 3 4)
bar="foobar"
The best solution I came up with so far is:
#!/bin/bash
while IFS== read -r k v; do
if [ "$k" = "foo" ]
then
IFS=' ' read -r -a $k <<< "$v"
else
declare "$k"="$(echo $v | tr -d '""')"
fi
done < settings.txt
But I am obviously mixing up array types. As far as I understood and tried out, the bar="foobar" part actually declares an array, which can be accessed with echo ${bar[0]} as well as echo $bar. So I thought this would be an indexed array, but the error log clearly states something different:
cannot convert associative to indexed array
I would be glad if somebody could explain to me how to find a proper solution.
Is it safe for you to just source the file?
. settings.txt
That will insert all the lines of the file as if they were lines of your current script. Obviously, there are security concerns if the file isn't as secure as the script file itself.
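Where sourcing is not acceptable because of those security concerns, the file can instead be parsed against a list of known keys. The following is a minimal sketch under stated assumptions: the settings file is a temporary one created just for the demo, only the keys foo and bar from the question are accepted, array elements contain no spaces, and values carry at most one pair of surrounding double quotes.

```shell
#!/usr/bin/env bash
# Hypothetical settings file, created only for this demo.
settings_file=$(mktemp)
printf '%s\n' 'foo=(1 2 3 4)' 'bar="foobar"' > "$settings_file"

declare -a foo=()
bar=""

while IFS='=' read -r key value; do
    case "$key" in
        foo)
            # Known array setting: strip the parentheses, then split on spaces.
            value=${value#\(}
            value=${value%\)}
            IFS=' ' read -r -a foo <<< "$value"
            ;;
        bar)
            # Known scalar setting: strip one pair of surrounding double quotes.
            value=${value#\"}
            value=${value%\"}
            bar=$value
            ;;
        # Any other key is ignored, so the file cannot set arbitrary variables.
    esac
done < "$settings_file"
rm -f "$settings_file"

printf 'foo has %d elements, foo[1]=%s, bar=%s\n' "${#foo[@]}" "${foo[1]}" "$bar"
```

Because nothing from the file is ever executed or used as a variable name, a malformed or hostile line can at worst produce an odd value for foo or bar.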

How can I reference an existing bash array using a 2nd variable containing the name of the array?

My closest most helpful matches when I searched for an answer ahead of posting:
Iterate over array in shell whose name is stored in a variable
How to use an argument/parameter name as a variable in a bash script
How to iterate over an array using indirect reference?
My attempt with partial success:
#!/bin/bash
declare -a large_furry_mammals
declare -a array_reference
# I tried both declaring array_reference as an array and
# not declaring it as an array. no change in behavior.
large_furry_mammals=(horse zebra gorilla)
size=large
category=mammals
tmp="${size}_furry_${category}"
eval array_reference='$'$tmp
echo tmp=$tmp
echo array_reference[0]=${array_reference[0]}
echo array_reference[1]=${array_reference[1]}
Output
tmp=large_furry_mammals
array_reference[0]=horse
array_reference[1]=
Expectation
I would have expected to get the output zebra when I echoed array_reference[1].
...but I'm missing some subtlety...
Why can I not access elements of the index array beyond index 0?
This suggests that array_reference is not actually being treated as an array.
I'm not looking to make a copy of the array. I want to reference (what will be) a static array based on a variable pointing to that array, i.e., ${size}_furry_${category} -> large_furry_mammals.
I've been successful with the general idea here using the links I've posted, but only as long as it's not an array. When it's an array, it's falling down for me.
Addendum Dec 5, 2018
bash 4.3 is not available in this case. @benjamin's answer does work under 4.3.
I'll be needing to loop over the resulting array variable's contents. This kinda dumb example I gave involving mammals was just to describe the concept. There's actually a real world case around this. I have set of static reference arrays and an input string would be parsed to select which array was relevant and then I will loop over the array that was selected. I could do a case statement but with more than 100 reference arrays that would be the direct but overly verbose way to do it.
This pseudo code is probably better example of what I'm going after.
m1_array=(x a r d)
m2_array=(q 3 fg d)
m3_array=(c e p)
Based on some logic...select which array prefix you need.
x=m1
for each element in ${x}_array
do
some-task
done
I'm doing some testing with @eduardo's solution to see if I can adapt the way he references the variables to get to my endgame.
** Addendum #2 December 14, 2018 **
Solution
I found it! Working with @eduardo's example I came up with the following:
#!/bin/bash
declare -a large_furry_mammals
#declare -a array_reference
large_furry_mammals=(horse zebra gorilla)
size=large
category=mammals
tmp="${size}_furry_${category}[@]"
for element in "${!tmp}"
do
echo $element
done
Here is what execution looks like. We successfully iterate over the elements of the array string that was built dynamically.
./example3b.sh
horse
zebra
gorilla
Thank you everyone.
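The same indirect expansion can be applied to the pseudo-code from the first addendum. This is a sketch only: the prefix is hard-coded to m2 where the real script would derive it from its input, and the names ref and collected are invented for illustration.

```shell
#!/usr/bin/env bash
m1_array=(x a r d)
m2_array=(q 3 fg d)
m3_array=(c e p)

# Stand-in for "some logic" that selects the array prefix.
x=m2

# The "[@]" suffix must be part of the string that is expanded indirectly.
ref="${x}_array[@]"

collected=()
for element in "${!ref}"; do
    collected+=("$element")   # stand-in for "some-task"
done
printf '%s\n' "${collected[@]}"
```

This form needs no namerefs, so it also suits bash versions older than 4.3.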
If you have Bash 4.3 or newer, you can use namerefs:
large_furry_mammals=(horse zebra gorilla)
size=large
category=mammals
declare -n array_reference=${size}_furry_$category
printf '%s\n' "${array_reference[@]}"
with output
horse
zebra
gorilla
This is a reference, so changes are reflected in both large_furry_mammals and array_reference:
$ array_reference[0]='donkey'
$ large_furry_mammals[3]='llama'
$ printf '%s\n' "${array_reference[@]}"
donkey
zebra
gorilla
llama
$ printf '%s\n' "${large_furry_mammals[@]}"
donkey
zebra
gorilla
llama
declare -a large_furry_mammals
declare -a array_reference
large_furry_mammals=(horse zebra gorilla)
size=large
category=mammals
echo ${large_furry_mammals[@]}
tmp="${size}_furry_${category}"
array_reference=${tmp}"[1]"
eval ${array_reference}='bear'
echo tmp=$tmp
echo ${large_furry_mammals[@]}
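For the assignment step, eval can be avoided with printf -v, which accepts an array element as its target. This is an alternative sketch rather than the answer's own method, and it assumes bash 4.1 or newer, which is where I understand that behaviour was introduced.

```shell
#!/usr/bin/env bash
large_furry_mammals=(horse zebra gorilla)
size=large
category=mammals
tmp="${size}_furry_${category}"

# Assign to element 1 of the array whose name is held in tmp, without eval.
printf -v "${tmp}[1]" '%s' 'bear'

printf '%s\n' "${large_furry_mammals[@]}"
```

The value is passed as a printf argument rather than re-parsed as shell code, so it may contain spaces or quotes without further escaping.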

Practical use of bash array

After reading up on how to initialize arrays in Bash, and seeing some basic examples put forward in blogs, there remain some uncertainties about their practical use. An interesting example perhaps would be to sort in ascending order: take a list of countries from A to Z in random order, one for each letter.
But in the real world, how is a Bash array applied? What is it applied to? What is the common use case for arrays? This is one area I am hoping to be familiar with. Any champions in the use of bash arrays? Please provide your example.
There are a few cases where I like to use arrays in Bash.
When I need to store a collection of strings that may contain spaces or $IFS characters.
declare -a MYARRAY=(
"This is a sentence."
"I like turtles."
"This is a test."
)
for item in "${MYARRAY[@]}"; do
echo "$item" $(echo "$item" | wc -w) words.
done
This is a sentence. 4 words.
I like turtles. 3 words.
This is a test. 4 words.
When I want to store key/value pairs, for example, short names mapped to long descriptions.
declare -A NEWARRAY=(
["sentence"]="This is a sentence."
["turtles"]="I like turtles."
["test"]="This is a test."
)
echo ${NEWARRAY["turtles"]}
echo ${NEWARRAY["test"]}
I like turtles.
This is a test.
Even if we're just storing single "word" items or numbers, arrays make it easy to count and slice our data.
# Count items in array.
$ echo "${#MYARRAY[@]}"
3
# Show indexes of array.
$ echo "${!MYARRAY[@]}"
0 1 2
# Show indexes/keys of associative array.
$ echo "${!NEWARRAY[@]}"
turtles test sentence
# Show only the second through third elements in the array.
$ echo "${MYARRAY[@]:1:2}"
I like turtles. This is a test.
Read more about Bash arrays here. Note that only Bash 4.0+ supports every operation I've listed (associative arrays, for example), but the link shows which versions introduced what.
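Building on the first case (strings that may contain spaces), another use that comes up often is assembling a command's arguments in an array before running it. The sketch below is my own illustration rather than part of the answer above; the temporary directory, the file names, and the want_txt_only flag are all made up for the demo.

```shell
#!/usr/bin/env bash
# Throwaway directory with one file whose name contains a space.
search_dir=$(mktemp -d)
touch "$search_dir/my notes.txt" "$search_dir/data.csv"

# Start with the fixed arguments, then append optional ones.
find_args=("$search_dir" -type f)
want_txt_only=1
if (( want_txt_only )); then
    find_args+=(-name '*.txt')
fi

# Each array element is passed to find as exactly one argument.
matches=$(find "${find_args[@]}")
printf '%s\n' "$matches"

rm -rf "$search_dir"
```

Keeping the arguments in an array means the quoted pattern and any paths with spaces survive intact, which is hard to get right with a single flat string.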

Storing Bash associative arrays

I want to store (and retrieve, of course) Bash's associative arrays and am looking for a simple way to do that.
I know that it is possible to do it using a loop over all keys:
for key in "${!arr[@]}"
do
echo "$key ${arr[$key]}"
done
Retrieving it could also be done in a loop:
declare -A arr
while read key value
do
arr[$key]=$value
done < store
But I also see that set will print a version of the array in this style:
arr=([key1]="value1" [key2]="value2" )
(Unfortunately along with all other shell variables.)
Is there a simpler way for storing and retrieving an associative array than my proposed loop?
To save to a file:
declare -p arr > saved.sh
(You can also use typeset instead of declare if you prefer.)
To load from the file:
source saved.sh
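Put together, a full round trip might look like the sketch below. It assumes bash 4.0 or newer and uses a temporary file in place of saved.sh; the sample keys and values are invented. One caveat worth knowing: if the source step runs inside a function, the declare statement in the saved file creates a variable local to that function.

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or newer for associative arrays.
declare -A arr=([key1]="value one" [key2]="value2")

store=$(mktemp)            # stands in for saved.sh
declare -p arr > "$store"  # writes a re-usable "declare -A arr=(...)" line

unset arr                  # simulate a fresh shell without the array
source "$store"            # re-creates arr, including values with spaces
rm -f "$store"

printf '%s\n' "${arr[key1]}"
```

As with any sourced file, the saved file is executed as shell code, so it should only be loaded from a location that is as trusted as the script itself.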
