Bash associative array with list as value - arrays

I have to work with the output of a Java tool, which returns a map data structure that looks like HashMap<String, ArrayList<String>>. I have to work with Bash, and I tried to declare it as an associative array, which is very similar to a map. The declaration of the associative array in Bash should be on one line; I tried to do this as follows.
ARRAY=(["sem1"]=("first name" "second name") ["sem2"]=("third name") ["sem3]=OTHER_LITS)
But this creates the following error:
bash: syntax error near unexpected token `('
I can define this line by line, but I want to have it on one line. How can I define an associative array in Bash in only one line?

BTW, associative array, dictionary, and map are all names for the same abstract data type (let's call it a dictionary).
Here is a solution for storing arrays as values in a Bash dictionary (Bash 4+).
Note that an unquoted expansion in Bash is split on whitespace, so a space-delimited string can act as a list of elements (as long as no element contains a space itself). So we can write a quoted list:
"firstname middlename secondname"
as a value of the s1 key in our X dictionary:
declare -A X=(
['s1']="firstname middlename secondname"
['s2']="surname nickname"
['s3']="other"
)
Now we can get the value of the s1 key as array:
declare -a names=(${X[s1]})
Variable names now contains array:
> echo $names
firstname
> echo ${names[1]}
middlename
> echo ${#names[@]}
3
Finally, your question part where the strings with spaces were shown:
"first name", "second name"
Let's do a trick - represent a space as a special symbol sequence (it could be just one symbol), for example, double underscores:
"first__name", "second__name"
Declare our dictionary again, but with "escaped" spaces inside array elements:
declare -A X=(
['s1']="first__name middle__name second__name"
['s2']="surname nickname"
['s3']="other"
)
In this case after we get the value of the s1 key as array:
declare -a names=(${X[s1]})
We need to post-process the array elements to turn the __ placeholders back into actual spaces. To do this, we simply use Bash's string replacement expansion:
> echo ${names/__/ }
first name
> echo ${names[1]/__/ }
middle name
> echo ${#names[@]}
3
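A variation on the same idea, sketched here as an assumption rather than as part of the answer above: instead of encoding spaces as __, join the elements with a delimiter that never occurs in the data and split on it with read -a. The names X and names follow the answer; the | delimiter is a hypothetical choice.

```bash
#!/usr/bin/env bash
# Requires Bash 4+ for associative arrays.
# Assumption: '|' never appears inside an element.
declare -A X=(
  ['s1']="first name|middle name|second name"
  ['s2']="surname|nickname"
)

# Split the stored string on '|' only, so spaces inside elements survive.
IFS='|' read -r -a names <<< "${X[s1]}"

printf '%s\n' "${names[0]}"    # first name
printf '%s\n' "${#names[@]}"   # 3
```

Because IFS is set only for the read command, the rest of the script keeps the default word splitting.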

In the absence of multi-dimensional array support in Bash, you can use this workaround with an associative array. Each key in the associative array is the string concatenation map-index,array-list-index:
# use one line declaration
declare -A array=([sem1,0]="first name" [sem1,1]="second name" [sem2,0]="third name" [sem3,0]="foo bar")
# loop through the map array
for i in "${!array[@]}"; do echo "$i => ${array[$i]}"; done
sem2,0 => third name
sem1,0 => first name
sem1,1 => second name
sem3,0 => foo bar
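Since the keys come back in no particular order, it can help to walk one map entry's list by counting the second index upward. The following is a minimal sketch under that assumption; list_values is a hypothetical helper name, and it assumes the list indices for a key are contiguous starting at 0.

```bash
#!/usr/bin/env bash
# Requires Bash 4+ for associative arrays.
declare -A array=([sem1,0]="first name" [sem1,1]="second name" [sem2,0]="third name" [sem3,0]="foo bar")

# Hypothetical helper: print every value stored under "<key>,0", "<key>,1", ...
list_values() {
  local key=$1 i=0
  # ${array[...]+x} expands to "x" only if that composite key exists.
  while [[ -n ${array[$key,$i]+x} ]]; do
    printf '%s\n' "${array[$key,$i]}"
    i=$((i + 1))
  done
}

list_values sem1
```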

A more ergonomic solution that doesn't force the manipulation of the keys.
# your data with spaces
array=(1 '2 with space' 3 "4 with space and ' symbol")
declare -p array
# quote it with " and store it, your data can't contain double quote
declare -A associative=([x]=x [array]=$(printf '"%s" ' "${array[@]}"))
declare -p associative
# get your data in another array
eval deserialized_array=(${associative[array]})
declare -p deserialized_array
echo ${deserialized_array[3]}
# or let bash handle everything
# note: array contain data with double quote character
array=(1 '2 with space' 3 "4 with space and ' \" symbol")
declare -A associative=([x]=x [array]=$(declare -p array))
declare -p associative
array=() # make sure data is gone
# get the data in the same array
eval ${associative[array]}
echo ${array[3]}
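A middle ground between the two variants above, offered as a sketch rather than as the answer's method: printf '%q' asks Bash to shell-quote each element, so the stored string can hold both single and double quotes and still be restored into an array of your choosing. The name restored is hypothetical.

```bash
#!/usr/bin/env bash
# Requires Bash 4+ for associative arrays.
array=(1 '2 with space' 3 "4 with ' and \" symbols")

declare -A associative=()
# %q emits each element in a form that is safe to re-read as shell input.
associative[array]=$(printf '%q ' "${array[@]}")

# eval is acceptable here only because the string was produced by %q.
eval "restored=( ${associative[array]} )"
declare -p restored
```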

Related

BASH - Create arrays from lines of csv file, where first entry is array name

I'm learning to script in Bash.
I have a CSV file, which contains the following lines:
numbers,one,two,three,four,five
colors,red,blue,green,yellow,white
custom-1,a,b,c,d,e
custom+2,t,y,w,x,z
I need to create arrays from this, where the first entry is the array name, e.g.
numbers=(one,two,three,four,five)
colors=(red,blue,green,yellow,white)
custom-1=(a,b,c,d,e)
custom+2=(t,y,w,x,z)
Here is my script:
IFS=","
while read NAME VALUES ; do
declare -a $NAME
arrays+=($NAME)
IFS=',' read -r -a $NAME <<< "${VALUES[0]}"
done < file.csv
When I try a CSV file containing only the first two lines (numbers and colors), the code works well. But if I try with numbers, colors, custom-1, and custom+2, there is an error while reading the CSV:
./script.sh: line 5: declare: `custom-1': not a valid identifier
./script.sh: line 7: read: `custom+2': not a valid identifier
because bash does not allow special characters in variable names, as far as I understand. Is there any way to avoid this?
As you cannot use the first column of your CSV file as bash array names one option would be to generate valid names using a counter (e.g. arrayN). If you want to access your data using the values of this first column you would also need to store them somewhere with the corresponding counter value. An associative array (declare -A names=()) would be perfect. Last but not least, the namerefs (declare -n arr=..., available starting with bash 4.3) will be convenient to store and access your data. Example:
declare -i cnt=1
declare -A names=()
while IFS=',' read -r -a line; do
names["${line[0]}"]="$cnt"
declare -n arr="array$cnt"
unset line[0]
declare -a arr=( "${line[@]}" )
((cnt++))
done < foo.csv
Now, to access the values corresponding to, let's say, entry custom+2, first get the corresponding counter value, declare a nameref pointing to the corresponding array and voilà:
$ cnt="${names[custom+2]}"
$ declare -n arr="array$cnt"
$ echo "${arr[@]}"
t y w x z
Let's declare a function for easier access:
getdata () {
local -i cnt="${names[$1]}"
local -n arr="array$cnt"
[ -z "$2" ] && echo "${arr[@]}" || echo "${arr[$2]}"
}
And then:
$ getdata "custom+2"
t y w x z
$ getdata "colors"
red blue green yellow white
$ getdata "colors" 3
yellow
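To see up front which first-column values Bash would reject as variable names, a check along these lines may help. It is only a sketch: is_valid_identifier is a hypothetical helper, and the pattern encodes the usual rule that a name consists of letters, digits, and underscores and does not start with a digit.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if $1 is usable as a Bash variable name.
is_valid_identifier() {
  [[ $1 =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]
}

for name in numbers colors custom-1 custom+2; do
  if is_valid_identifier "$name"; then
    echo "$name: valid"
  else
    echo "$name: not a valid identifier"
  fi
done
```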

How to split the string into two separate variables in shell

I have an array of strings with the following numbers in it separated by a space.
S0 [
42 4677
S10 [
4719 1266
6020 3618
9667 8463
I want to separate each value as an integer and put them in a separate array like
array2 = [S0, 42, 4677, S10, 4719, 1266, 6020, 3618, 9667, 8463].
What I have done is loop through the array like the solution given here
How to split one string into multiple strings separated by at least one space in bash shell?
for each element of the array but it doesn't seem to work.
for each in ${my_array[1]}
do
echo $each
done
always gives me the output 42 4677 as a single value.
EDITED
for each in ${my_array[@]}
do
echo $each
done
declare -a array2
for each in ${my_array[@]}
do
array2=$each
done
echo ${array2[1]}
declare -p my_array
Now the first for loop works correctly but now when I assign the values of my_array to array2 in the second loop, only the last value of the last string is assigned to it i.e. 8463.
The contents of declare -p my_array are
declare -a my_array='([0]="S0 [" [1]="42 4677" [2]="S10 [" [3]="4719 1266" [4]="6020 3618" [5]="9667 8463")'
You can use:
declare -a my_array='([0]="S0 [" [1]="42 4677" [2]="S10 [" [3]="4719 1266" [4]="6020 3618" [5]="9667 8463")'
array2=()
for i in ${my_array[@]}; do
[[ $i == *[0-9]* ]] && array2+=($i)
done
# check content of array2
declare -p array2
declare -a array2=([0]="S0" [1]="42" [2]="4677" [3]="S10" [4]="4719" [5]="1266" [6]="6020" [7]="3618" [8]="9667" [9]="8463")
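The same result can be reached without relying on unquoted expansion of the whole array, which would also trigger pathname expansion on elements that happen to look like glob patterns. The sketch below splits each element with read -a instead; the names element, parts, and part are hypothetical.

```bash
#!/usr/bin/env bash
my_array=("S0 [" "42 4677" "S10 [" "4719 1266" "6020 3618" "9667 8463")

array2=()
for element in "${my_array[@]}"; do
  # Split one element on whitespace into a temporary array.
  read -r -a parts <<< "$element"
  for part in "${parts[@]}"; do
    # Keep only the pieces that contain a digit, which drops the lone "[".
    if [[ $part == *[0-9]* ]]; then
      array2+=("$part")
    fi
  done
done

declare -p array2
```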

Bash, split words into letters and save to array

I'm struggling with a project. I am supposed to write a Bash script that works like the tr command. To begin with, I would like to save all of the command's arguments into separate arrays. If an argument is a word, I would like to have each character in a separate array field, e.g.
tr_mine AB DC
I would like to have two arrays: a[0]=A, a[1]=B and b[0]=D, b[1]=C.
I found a way, but it's not working:
IFS="" read -r -a array <<< "$a"
No sed, no awk, all bash internals.
Assuming that words are always separated with blanks (space and/or tabs),
also assuming that words are given as arguments, and writing for bash only:
#!/bin/bash
blank=$'[ \t]'
varname='A'
n=1
while IFS='' read -r -d '' -N 1 c ; do
if [[ $c =~ $blank ]]; then n=$((n+1)); continue; fi
eval ${varname}${n}'+=("'"$c"'")'
done <<<"$@"
last=$(eval echo \${#${varname}${n}[@]}) ### Find last character index.
unset "${varname}${n}[$last-1]" ### Remove last (trailing) newline.
for ((j=1;j<=$n;j++)); do
k="A$j[@]"
printf '<%s> ' "${!k}"; echo
done
That will set each array A1, A2, A3, etc. ... to the letters of each word.
The value at the end of the first loop of $n is the count of words processed.
Printing may be a little tricky, that is why the code to access each letter is given above.
Applied to your sample text:
$ script.sh AB DC
<A> <B>
<D> <C>
The script is setting two (array) vars A1 and A2.
And each letter is one array element: A1[0]=A, A1[1]=B and A2[0]=D, A2[1]=C.
You need to set a variable ($k) to the array element to access.
For example, to echo fourth letter (0 based) of second word (1 based) you need to do (that may be changed if needed):
k="A2[3]"; echo "${!k}" ### Indirect addressing.
The script will work as this:
$ script.sh ABCD efghi
<A> <B> <C> <D>
<e> <f> <g> <h> <i>
Caveat: Characters will be split even if quoted. However, quoted arguments is the correct way to use this script to avoid the effect of shell metacharacters ( |,&,;,(,),<,>,space,tab ). Of course, spaces (even if repeated) will split words as defined by the variable $blank:
$ script.sh $'qwer;rttt fgf\ngfg'
<q> <w> <e> <r> <;> <r> <t> <t> <t>
<>
<>
<>
<f> <g> <f> <
> <g> <f> <g>
As the script accepts and correctly processes embedded newlines, we need to use unset "${varname}${n}[$last-1]" to remove the trailing newline. If that is not desired, quote the line.
Security Note: The eval is not much of a problem here as it is only processing one character at a time. It would be difficult to create an attack based on just one character. Anyway, the usual warning is valid: Always sanitize your input before using this script. Also, most (not quoted) metacharacters of bash will break this script.
$ script.sh qwer(rttt fgfgfg
bash: syntax error near unexpected token `('
I would strongly suggest to do this in another language if possible, it will be a lot easier.
Now, the closest I come up with is:
#!/bin/bash
sentence="AC DC"
words=`echo "$sentence" | tr " " "\n"`
# final array
declare -A result
# word count
wc=0
for i in $words; do
# letter count in the word
lc=0
for l in `echo "$i" | grep -o .`; do
result["w$wc-l$lc"]=$l
lc=$(($lc+1))
done
wc=$(($wc+1))
done
rLen=${#result[@]}
echo "Result Length $rLen"
for i in "${!result[@]}"
do
echo "$i => ${result[$i]}"
done
The above prints:
Result Length 4
w1-l1 => C
w1-l0 => D
w0-l0 => A
w0-l1 => C
Explanation:
Dynamic variables are not supported in bash (ie create variables using variables) so I am using an associative array instead (result)
Arrays in bash are single dimension. To fake a 2D array I use the indexes: w for words and l for letters. This will make further processing a pain...
Associative arrays are not ordered thus results appear in random order when printing
${!result[@]} is used instead of ${result[@]}. The first iterates keys while the second iterates values
I know this is not exactly what you ask for, but I hope it will point you to the right direction
Try this :
sentence="$@"
read -r -a words <<< "$sentence"
for word in ${words[@]}; do
inc=$(( i++ ))
read -r -a l${inc} <<< $(sed 's/./& /g' <<< $word)
done
echo ${words[1]} # print "CD"
echo ${l1[1]} # print "D"
The first read reads all words, the internal one is for letters.
The sed command add a space after each letters to make the string splittable by read -a. You can also use this sed command to remove unwanted characters from words (eg commas) before splitting.
If special characters are allowed in words, you can use a simple grep instead of the sed command (as suggested in http://www.unixcl.com/2009/07/split-string-to-characters-in-bash.html) :
read -r -a l${inc} <<< $(grep -o . <<< $word)
The word array is ${words[@]}.
The letters arrays are named l# where # is an increment added for each word read.
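For comparison, here is a sketch that uses only Bash built-ins: substring expansion ${word:i:1} picks out one character at a time, and a nameref (Bash 4.3+) writes into an array whose name the caller supplies. split_letters and out are hypothetical names; the caller's array must not itself be called out, or Bash reports a circular name reference.

```bash
#!/usr/bin/env bash
# Requires Bash 4.3+ for namerefs (local -n).
# Hypothetical helper: split_letters <array-name> <word>
split_letters() {
  local -n out=$1    # nameref to the caller's array
  local word=$2 i
  out=()
  for ((i = 0; i < ${#word}; i++)); do
    out+=("${word:i:1}")   # one character per element
  done
}

split_letters a "AB"
split_letters b "DC"
declare -p a b
```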

Copying a Bash array fails

Assigning arrays to variables in Bash script seems rather complicated:
a=("a" "b" "c")
b=$a
echo ${a[0]}
echo ${a[1]}
echo ${b[0]}
echo ${b[1]}
leads to
a
b
a
instead of
a
b
a
b
Why? How can I fix it?
If you want to copy a variable that holds an array to another name, you do it like this:
a=('a' 'b' 'c')
b=( "${a[@]}" )
Why?
If a is an array, $a expands to the first element in the array. That is why b in your example only has one value. In bash, variables that refer to arrays aren't assignable like pointers would be in C++ or Java. Instead variables expand (as in Parameter Expansion) into strings and those strings are copied and associated with the variable being assigned.
How can I fix it?
To copy a sparse array that contains values with spaces, the array must be copied one element at a time by the indices - which can be obtained with ${!a[@]}.
declare -a b=()
for i in ${!a[@]}; do
b[$i]="${a[$i]}"
done
From the bash man page:
It is possible to obtain the keys (indices) of an array as well as the values.
${!name[@]} and ${!name[*]} expand to the indices assigned in array variable name.
The treatment when in double quotes is similar to the expansion of the special
parameters @ and * within double quotes.
Here's a script you can test on your own:
#!/bin/bash
declare -a a=();
a[1]='red hat'
a[3]='fedora core'
declare -a b=();
# Copy method that works for sparse arrays with spaces in the values.
for i in ${!a[@]}; do
b[$i]="${a[$i]}"
done
# does not preserve the indices, but as LeVar Burton says ...
#b=("${a[@]}")
echo a indicies: ${!a[@]}
echo b indicies: ${!b[@]}
echo "values in b:"
for u in "${b[@]}"; do
echo $u
done
Prints:
a indicies: 1 3
b indicies: 1 3 # or 0 1 with line uncommented
values in b:
red hat
fedora core
This also works for associative arrays in bash 4, if you use declare -A (with capital A instead of lower case) when declaring the arrays.
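A minimal sketch of that associative-array case, assuming Bash 4 or newer; the names src and dst are hypothetical. The keys are quoted in the loop so that keys containing spaces are copied intact.

```bash
#!/usr/bin/env bash
# Requires Bash 4+ for associative arrays.
declare -A src=([first]="red hat" [second os]="fedora core")
declare -A dst=()

# Copy key by key; quoting "${!src[@]}" keeps keys with spaces whole.
for key in "${!src[@]}"; do
  dst["$key"]=${src["$key"]}
done

declare -p dst
```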

How to pass an associative array as argument to a function in Bash?

How do you pass an associative array as an argument to a function? Is this possible in Bash?
The code below is not working as expected:
function iterateArray
{
local ADATA="${@}" # associative array
for key in "${!ADATA[@]}"
do
echo "key - ${key}"
echo "value: ${ADATA[$key]}"
done
}
Passing associative arrays to a function like normal arrays does not work:
iterateArray "$A_DATA"
or
iterateArray "$A_DATA[@]"
If you're using Bash 4.3 or newer, the cleanest way is to pass the associative array by name reference and then access it inside your function using a name reference with local -n. For example:
function foo {
local -n data_ref=$1
echo ${data_ref[a]} ${data_ref[b]}
}
declare -A data
data[a]="Fred Flintstone"
data[b]="Barney Rubble"
foo data
You don't have to use the _ref suffix; that's just what I picked here. You can call the reference anything you want so long as it's different from the original variable name (otherwise you'll get a "circular name reference" error).
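A name reference also lets the function change the caller's array, not just read it. The sketch below assumes Bash 4.3+; set_entry and map_ref are hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash
# Requires Bash 4.3+ for namerefs (local -n).
# Hypothetical helper: set_entry <array-name> <key> <value>
set_entry() {
  local -n map_ref=$1     # refers to the caller's associative array
  map_ref["$2"]=$3        # the write lands in the caller's array
}

declare -A data=()
set_entry data a "Fred Flintstone"
set_entry data b "Barney Rubble"
echo "${data[a]} / ${data[b]}"
```

The array must be declared with declare -A before the call; otherwise the assignment through the reference would create an indexed array instead.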
I had exactly the same problem last week and thought about it for quite a while.
It seems that associative arrays can't be serialized or copied. There's a good Bash FAQ entry on associative arrays which explains them in detail. The last section gave me the following idea, which works for me:
function print_array {
# eval string into a new associative array
eval "declare -A func_assoc_array="${1#*=}
# proof that array was successfully created
declare -p func_assoc_array
}
# declare an associative array
declare -A assoc_array=(["key1"]="value1" ["key2"]="value2")
# show associative array definition
declare -p assoc_array
# pass associative array in string form to function
print_array "$(declare -p assoc_array)"
Based on
Florian Feldhaus's solution:
# Bash 4+ only
function printAssocArray # ( assocArrayName )
{
var=$(declare -p "$1")
eval "declare -A _arr="${var#*=}
for k in "${!_arr[@]}"; do
echo "$k: ${_arr[$k]}"
done
}
declare -A conf
conf[pou]=789
conf[mail]="ab\npo"
conf[doo]=456
printAssocArray "conf"
The output will be:
doo: 456
pou: 789
mail: ab\npo
Update, to fully answer the question, here is an small section from my library:
Iterating an associative array by reference
shopt -s expand_aliases
alias array.getbyref='e="$( declare -p ${1} )"; eval "declare -A E=${e#*=}"'
alias array.foreach='array.keys ${1}; for key in "${KEYS[@]}"'
function array.print {
array.getbyref
array.foreach
do
echo "$key: ${E[$key]}"
done
}
function array.keys {
array.getbyref
KEYS=(${!E[@]})
}
# Example usage:
declare -A A=([one]=1 [two]=2 [three]=3)
array.print A
This was a development of my earlier work, which I will leave below.
@ffeldhaus - nice response, I took it and ran with it:
t()
{
e="$( declare -p $1 )"
eval "declare -A E=${e#*=}"
declare -p E
}
declare -A A='([a]="1" [b]="2" [c]="3" )'
echo -n original declaration:; declare -p A
echo -n running function tst:
t A
# Output:
# original declaration:declare -A A='([a]="1" [b]="2" [c]="3" )'
# running function tst:declare -A E='([a]="1" [b]="2" [c]="3" )'
You can only pass associative arrays by name.
It's better (more efficient) to pass regular arrays by name also.
Here is a solution I came up with today using eval echo ... to do the indirection:
print_assoc_array() {
local arr_keys="\${!$1[@]}" # \$ means we only substitute the $1
local arr_val="\${$1[\"\$k\"]}"
for k in $(eval echo $arr_keys); do #use eval echo to do the next substitution
printf "%s: %s\n" "$k" "$(eval echo $arr_val)"
done
}
declare -A my_arr
my_arr[abc]="123"
my_arr[def]="456"
print_assoc_array my_arr
Outputs on bash 4.3:
def: 456
abc: 123
yo:
#!/bin/bash
declare -A dict
dict=(
[ke]="va"
[ys]="lu"
[ye]="es"
)
fun() {
for i in $@; do
echo $i
done
}
fun ${dict[@]} # || ${dict[key]} || ${!dict[@]} || ${dict[$1]}
eZ
Here's another way: you can manually serialize the associative array as you pass it to a function, then deserialize it back into a new associative array inside the function:
1. Manual passing (via serialization/deserialization) of the associative array
Here's a full, runnable example from my eRCaGuy_hello_world repo:
array_pass_as_bash_parameter_2_associative.sh:
# Print an associative array using manual serialization/deserialization
# Usage:
# # General form:
# print_associative_array array_length array_keys array_values
#
# # Example:
# # length indices (keys) values
# print_associative_array "${#array1[@]}" "${!array1[@]}" "${array1[@]}"
print_associative_array() {
i=1
# read 1st argument, the array length
array_len="${@:$i:1}"
((i++))
# read all key:value pairs into a new associative array
declare -A array
for (( i_key="$i"; i_key<$(($i + "$array_len")); i_key++ )); do
i_value=$(($i_key + $array_len))
key="${@:$i_key:1}"
value="${@:$i_value:1}"
array["$key"]="$value"
done
# print the array by iterating through all of the keys now
for key in "${!array[@]}"; do
value="${array["$key"]}"
echo " $key: $value"
done
}
# Let's create and load up an associative array and print it
declare -A array1
array1["a"]="cat"
array1["b"]="dog"
array1["c"]="mouse"
# length indices (keys) values
print_associative_array "${#array1[@]}" "${!array1[@]}" "${array1[@]}"
Sample output:
a: cat
b: dog
c: mouse
Explanation:
For a given function named print_associative_array, here is the general form:
# general form
print_associative_array array_length array_keys array_values
For an array named array1, here is how to obtain the array length, indices (keys), and values:
array length: "${#array1[@]}"
all of the array indices (keys in this case, since it's an associative array): "${!array1[@]}"
all of the array values: "${array1[@]}"
So, an example call to print_associative_array would look like this:
# example call
# length indices (keys) values
print_associative_array "${#array1[@]}" "${!array1[@]}" "${array1[@]}"
Putting the length of the array first is essential, as it allows us to parse the incoming serialized array as it arrives into the print_associative_array function inside the magic @ array of all incoming arguments.
To parse the @ array, we'll use array slicing, which is described as follows (this snippet is copy-pasted from my answer here):
# array slicing basic format 1: grab a certain length starting at a certain
# index
echo "${@:2:5}"
# │ │
# │ └────> slice length
# └──────> slice starting index (zero-based)
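# A small runnable sketch of the slicing form above. slice_args is a
# hypothetical helper, not part of the repo mentioned in this answer.

```bash
#!/usr/bin/env bash
# Hypothetical helper: slice_args <start> <length> <args...>
# Prints <length> of the remaining arguments, starting at position <start>
# (position 1 is the first remaining argument; position 0 would be $0).
slice_args() {
  local start=$1 length=$2
  shift 2
  printf '%s\n' "${@:start:length}"
}

slice_args 2 3 a b c d e f g   # prints b, c and d on separate lines
```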
2. [Better technique than above!] Pass the array by reference
...as @Todd Lehman explains in his answer here
# Print an associative array by passing the array by reference
# Usage:
# # General form:
# print_associative_array2 array
# # Example
# print_associative_array2 array1
print_associative_array2() {
# declare a local **reference variable** (hence `-n`) named `array_reference`
# which is a reference to the value stored in the first parameter
# passed in
local -n array_reference="$1"
# print the array by iterating through all of the keys now
for key in "${!array_reference[@]}"; do
value="${array_reference["$key"]}"
echo " $key: $value"
done
}
echo 'print_associative_array2 array1'
print_associative_array2 array1
echo ""
echo "OR (same thing--quotes don't matter in this case):"
echo 'print_associative_array2 "array1"'
print_associative_array2 "array1"
Sample output:
print_associative_array2 array1
a: cat
b: dog
c: mouse
OR (same thing--quotes don't matter in this case):
print_associative_array2 "array1"
a: cat
b: dog
c: mouse
See also:
[my answer] a more-extensive demo of me serializing/deserializing a regular "indexed" bash array in order to pass one or more of them as parameters to a function: Passing arrays as parameters in bash
[my answer] a demo of me passing a regular "indexed" bash array by reference: Passing arrays as parameters in bash
[my answer] array slicing: Unix & Linux: Bash: slice of positional parameters
[my question] Why do the man bash pages state the declare and local -n attribute "cannot be applied to array variables", and yet it can?
Excellent. The simple solution described by @Todd Lehman solved my associative array passing problem. I had to pass 3 parameters to a function: an integer, an associative array, and an indexed array.
A while ago I'd read that since arrays in bash are not first-class entities, arrays could not be passed as arguments to functions. Clearly that's not the whole truth after all. I've just implemented a solution where a function handles those parameters, something like this...
function serve_quiz_question() {
local index="$1"; shift
local -n answers_ref=$1; shift
local questions=( "$@" )
current_question="${questions[$index]}"
echo "current_question: $current_question"
#...
current_answer="${answers_ref[$current_question]}"
echo "current_answer: $current_answer"
}
declare -A answers
answers[braveheart]="scotland"
answers[mr robot]="new york"
answers[tron]="vancouver"
answers[devs]="california"
# integers would actually be assigned to index \
# by iterating over a random sequence, not shown here.
index=2
declare -a questions=( "braveheart" "devs" "mr robot" "tron" )
serve_quiz_question "$index" answers "${questions[@]}"
As the local variables get assigned, I had to shift the positional parameters away, in order to end with the ( "$@" ) assigning what's left to the indexed questions array.
The indexed array is needed so that we can reliably iterate over all the questions, either in random or ordered sequence. Associative arrays are not ordered data structures, so are not meant for any sort of predictable iteration.
Output:
current_question: mr robot
current_answer: new york
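If all that is needed is a repeatable order, not a specific one, another option is to sort the keys before iterating. This is a sketch under the assumption that no key contains a newline; sorted_keys is a hypothetical name.

```bash
#!/usr/bin/env bash
# Requires Bash 4+ (associative arrays, mapfile).
declare -A answers=(
  [braveheart]="scotland"
  ["mr robot"]="new york"
  [tron]="vancouver"
  [devs]="california"
)

# One key per line through sort, then back into an indexed array.
mapfile -t sorted_keys < <(printf '%s\n' "${!answers[@]}" | sort)

for key in "${sorted_keys[@]}"; do
  printf '%s: %s\n' "$key" "${answers[$key]}"
done
```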
From the best Bash guide ever:
declare -A fullNames
fullNames=( ["lhunath"]="Maarten Billemont" ["greycat"]="Greg Wooledge" )
for user in "${!fullNames[@]}"
do
echo "User: $user, full name: ${fullNames[$user]}."
done
I think the issue in your case is that $@ is not an associative array: "@: Expands to all the words of all the positional parameters. If double quoted, it expands to a list of all the positional parameters as individual words."
