Bash reading from a file to an associative array

I'm trying to write a script in bash using an associative array.
I have a file called data:
a,b,c,d,e,f
g,h,i,j,k,l
The following script:
oldIFS=${IFS}
IFS=","
declare -A assoc
while read -a array
do
assoc["${array[0]}"]="${array[@]}"
done
for key in ${!assoc[@]}
do
echo "${key} ---> ${assoc[${key}]}"
done
IFS=${oldIFS}
gives me
a ---> a b c d e f
g ---> g h i j k l
I need my output to be:
a b ---> c d e f
g h ---> i j k l

oldIFS=${IFS}
IFS=","
declare -A assoc
while read -r -a array
do
assoc["${array[0]} ${array[1]}"]="${array[@]:2}"
done < data
for key in "${!assoc[@]}"
do
echo "${key} ---> ${assoc[${key}]}"
done
IFS=${oldIFS}
data:
a,b,c,d,e,f
g,h,i,j,k,l
Output:
a b ---> c d e f
g h ---> i j k l
This uses substring expansion on the array, ${array[@]:2}, which yields the elements from index 2 onward as the value stored in assoc. I also added -r to read so that a backslash is not treated as an escape character.
Improved based on @gniourf_gniourf's suggestions:
declare -A assoc
while IFS=, read -r -a array
do
((${#array[@]} >= 2)) || continue
assoc["${array[@]:0:2}"]="${array[@]:2}"
done < data
for key in "${!assoc[@]}"
do
echo "${key} ---> ${assoc[${key}]}"
done
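For reference, here is a self-contained sketch of the same idea that can be pasted into a bash 4+ shell. It assumes the sample data shown above and feeds it through a here-document instead of the data file; the names fields, key, and value are illustrative and not part of the original answer. Because IFS=, is set only for the read command, no save and restore of IFS is needed.

```shell
#!/usr/bin/env bash
declare -A assoc

while IFS=, read -r -a fields; do
    # Skip lines that do not have at least the two key fields.
    (( ${#fields[@]} >= 2 )) || continue
    key="${fields[0]} ${fields[1]}"   # first two fields form the key
    value="${fields[*]:2}"            # remaining fields, joined with spaces
    assoc["$key"]=$value
done <<'EOF'
a,b,c,d,e,f
g,h,i,j,k,l
EOF

# Iteration order of an associative array is unspecified, so sort for stable output.
for key in "${!assoc[@]}"; do
    printf '%s ---> %s\n' "$key" "${assoc[$key]}"
done | sort
```

With the sample data this should print the two lines the question asks for.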

Related

Split two numbers in two arrays

I need to split pairs of numbers that come from a text file in this form:
Num1:Num2
Num3:Num4
I want to store Num1 in array X and Num2 in array Y, then Num3 in array X and Num4 in array Y.
With bash:
mapfile -t X < <(cut -d : -f 1 file) # read only first column
mapfile -t Y < <(cut -d : -f 2 file) # read only second column
declare -p X Y
Output:
declare -a X='([0]="num1" [1]="num3")'
declare -a Y='([0]="num2" [1]="num4")'
Disadvantage: The file is read twice.
You could perform the following steps:
Create destination arrays empty
Read file line by line, with a classic while read ... < file loop
Split each line on :, again using read
Append values to arrays
For example:
arr_x=()
arr_y=()
while IFS= read -r line || [ -n "$line" ]; do
IFS=: read -r x y <<< "$line"
arr_x+=("$x")
arr_y+=("$y")
done < data.txt
echo "content of arr_x:"
for v in "${arr_x[@]}"; do
echo "$v"
done
echo "content of arr_y:"
for v in "${arr_y[@]}"; do
echo "$v"
done
Here is a quick bash solution:
c=0
while IFS=: read -r a b; do
x[$c]="$a"
y[$c]="$b"
c=$((c+1))
done < input.txt
We redirect input.txt into a while loop with the Internal Field Separator (IFS) set to :, reading the first number of each line into $a and the second into $b. Then we add them to the arrays as you specified, using the counter $c to track the current index in both arrays.
Using the =~ operator, which stores the captured pair of numbers in the array BASH_REMATCH:
$ cat file
123:456
789:012
$ while read -r line
do
[[ $line =~ ([^:]*):(.*) ]] && echo "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
# use the numbers here, because BASH_REMATCH is overwritten by the next match
done < file
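The loop above only echoes the captured numbers. As a minimal sketch of how the same match could fill the two arrays from the question in a single pass, the captures can be appended to X and Y. The variable re and the here-document input are illustrative choices, not part of the original answer.

```shell
#!/usr/bin/env bash
X=()
Y=()
re='^([^:]*):(.*)$'   # text before the first colon, then the rest of the line

# The || test also handles a final line that lacks a trailing newline.
while IFS= read -r line || [[ -n $line ]]; do
    if [[ $line =~ $re ]]; then
        X+=("${BASH_REMATCH[1]}")   # first number goes to X
        Y+=("${BASH_REMATCH[2]}")   # second number goes to Y
    fi
done <<'EOF'
123:456
789:012
EOF

declare -p X Y
```

Lines that do not contain a colon are skipped, so both arrays stay the same length.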

Fill array with permutations (a-z) using bash

I am looking to populate an array using {a..z}. The end result should be every letter from a-z stored in an array that can be referenced later.
code:
#!/bin/bash
#proof of concept
#echo {a..z}
#a b c d e f g h i j k l m n o p q r s t u v w x y z
#attempt 1
CHARSET=({a..z})
printf "${CHARSET[@]}"
#result: a
#attempt 2
CHARSET=({a..z})
for i in CHARSET ; do
echo "$1"
done
exit
#result a
Ultimately I am trying to test every permutation of a-z up to 4 characters long without making an intermediate file to read from e.g.
#!/bin/bash
for i in {a..z}; do
for j in {a..z}; do
for k in {a..z}; do
for l in {a..z}; do
echo $i >>test.txt #1 letter
echo $i$j >>test.txt #2 letters
echo $i$j$k >>test.txt #3 letters
echo $i$j$k$l >>test.txt #4 letters
done;done;done;done
test.txt
a
aa
aaa
aaaa
...........
z
zz
zzz
zzzz
I was hoping to be able to store a-z in an array then use that array each time to increase the letter count up to four. Or is there a much simpler way to succeed here? (Without creating the intermediate file as given in the example above)
You can append multiple brace expansions to combinatorially combine them:
for word in {a..z}{a..z}{a..z}{a..z}
do
echo "$word"
done
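The brace expansion above produces only the four-letter words. As a sketch of the array-based approach the question asks about, the function below builds every word of length 1 up to a limit from an array, without an intermediate file. gen_words and its max_len argument are hypothetical names chosen for this example, and the function keeps one full length level in memory at a time.

```shell
#!/usr/bin/env bash
charset=({a..z})   # "${charset[@]}" expands to all 26 letters

# Print every word of length 1..max_len built from charset, shortest first.
gen_words() {
    local max_len=$1 len prefix c
    local -a prev=("")   # words of the previous length (start with the empty word)
    local -a next=()
    for ((len = 1; len <= max_len; len++)); do
        next=()
        for prefix in "${prev[@]}"; do
            for c in "${charset[@]}"; do
                next+=("$prefix$c")
            done
        done
        printf '%s\n' "${next[@]}"
        prev=("${next[@]}")
    done
}

gen_words 2 | tail -n 1   # prints zz
```

Calling gen_words 4 would cover the lengths from the question, and the output can be piped straight into whatever consumes the words.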

Associative array in bash to store all lines start with X

I have a file, passed to the script as $1, with lines like these:
X B C D E
X G H I J
X L M N
Y G
Z B
Y L
In each line that starts with X, the key is the 2nd element and the values are the remaining elements.
I am reading the file line by line and creating an associative array for each line.
while read LINE
do
INPUT=$(echo $LINE |awk '{print $1}')
if [[ "$INPUT" = X ]]
then
key_name=$(echo $LINE | awk '{print $2}')
declare -A dependencies
value_names=($(echo $LINE|awk '{$1=$2=""; print $0}'))
dependencies[key_name]=value_names
echo -e "\nvalues of $key_name are ${key_name[*]}\n"
sleep 1
fi
done < $1
So I am losing the stored value on each line read.
But I need to store all the lines starting with X in the associative array, because I need to look up the key for later lines. For example, when a line starts with Y and contains G, I need to find the values stored in the associative array under key G.
Can anyone suggest how to store all lines starting with X in a single associative array while reading the file line by line? Or is there a better approach?
Here from the sample input given, the output will be in 3 lines:
H I J
C D E
M N
Here X, Y, and Z identify the line type and determine what to do with the remaining characters: if X, store the rest as a key-value pair; if Y or Z, extract the values from the associative array.
Using GNU awk for gensub():
$ gawk '{ if (/^X/) a[$2] = gensub(/(\S+\s+){2}/,"",""); else print a[$2] }' file
H I J
C D E
M N
The above implicitly loops through every line in the input file and when it finds a line that starts with X (/^X/) it removes the first 2 non-space-then-space pairs (gensub(/(\S+\s+){2}/,"","")) and stores the result in associative array a indexed by the original 2nd field (a[$2] = ...), so for example for input line X B C D E it saves a["B"] = "C D E". If the line did not start with X (else) then it prints the array indexed by the 2nd field in the current line, so for input line Z B it will execute print a["B"] and so output C D E.
With an old version of gawk (run gawk --version and check for version before 4.0) you might need:
$ gawk --re-interval '{ if (/^X/) a[$2] = gensub(/([^[:space:]]+[[:space:]]+){2}/,"",""); else print a[$2] }' file
but if so you're missing a lot of very useful functionality, so get a newer gawk!
The declaration should go outside the loop. The variable interpolations need a dollar sign in front. The rest is just refactoring.
declare -A dependencies
awk '$1=="X"{$1=""; print }' "$1" |
{ while read -r key value;
do
dependencies["$key"]="$value"
echo -e "\nvalues of $key are ${dependencies[$key]}\n"
#sleep 1
done
:
# do stuff with "${dependencies[@]}"
}
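For comparison, the whole task can also be sketched in pure bash (4.0 or later for declare -A) with a single associative array declared before the loop. This is a minimal sketch rather than either answerer's code: the variable names tag, rest, and output are illustrative, and the sample input from the question is supplied through a here-document rather than "$1".

```shell
#!/usr/bin/env bash
declare -A dependencies   # declared once, outside the loop
output=()

# read splits each line into the tag, the key, and everything after the key.
while read -r tag key rest; do
    [[ -n $key ]] || continue
    if [[ $tag == X ]]; then
        dependencies["$key"]=$rest                 # X line: remember the values
    else
        output+=("${dependencies[$key]}")          # other line: look the key up
    fi
done <<'EOF'
X B C D E
X G H I J
X L M N
Y G
Z B
Y L
EOF

printf '%s\n' "${output[@]}"
```

Because the loop reads from a redirection instead of a pipe, it runs in the current shell, so dependencies is still populated after the loop ends.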

Retrieve index of array element matching string in Bash

I have sound files for each of the 88 keys on a piano keyboard.
p-book:OUT pi$ ls
Piano.ff.A0.aiff Piano.ff.Bb7.aiff Piano.ff.Eb1.aiff
Piano.ff.A1.aiff Piano.ff.C1.aiff Piano.ff.Eb2.aiff
Piano.ff.A2.aiff Piano.ff.C2.aiff Piano.ff.Eb3.aiff
Piano.ff.A3.aiff Piano.ff.C3.aiff Piano.ff.Eb4.aiff
Piano.ff.A4.aiff Piano.ff.C4.aiff Piano.ff.Eb5.aiff
Piano.ff.A5.aiff Piano.ff.C5.aiff Piano.ff.Eb6.aiff
Piano.ff.A6.aiff Piano.ff.C6.aiff Piano.ff.Eb7.aiff
Piano.ff.A7.aiff Piano.ff.C7.aiff Piano.ff.F1.aiff
Piano.ff.Ab1.aiff Piano.ff.C8.aiff Piano.ff.F2.aiff
Piano.ff.Ab2.aiff Piano.ff.D1.aiff Piano.ff.F3.aiff
Piano.ff.Ab3.aiff Piano.ff.D2.aiff Piano.ff.F4.aiff
Piano.ff.Ab4.aiff Piano.ff.D3.aiff Piano.ff.F5.aiff
Piano.ff.Ab5.aiff Piano.ff.D4.aiff Piano.ff.F6.aiff
Piano.ff.Ab6.aiff Piano.ff.D5.aiff Piano.ff.F7.aiff
Piano.ff.Ab7.aiff Piano.ff.D6.aiff Piano.ff.G1.aiff
Piano.ff.B0.aiff Piano.ff.D7.aiff Piano.ff.G2.aiff
Piano.ff.B1.aiff Piano.ff.Db1.aiff Piano.ff.G3.aiff
Piano.ff.B2.aiff Piano.ff.Db2.aiff Piano.ff.G4.aiff
Piano.ff.B3.aiff Piano.ff.Db3.aiff Piano.ff.G5.aiff
Piano.ff.B4.aiff Piano.ff.Db4.aiff Piano.ff.G6.aiff
Piano.ff.B5.aiff Piano.ff.Db5.aiff Piano.ff.G7.aiff
Piano.ff.B6.aiff Piano.ff.Db6.aiff Piano.ff.Gb1.aiff
Piano.ff.B7.aiff Piano.ff.Db7.aiff Piano.ff.Gb2.aiff
Piano.ff.Bb0.aiff Piano.ff.E1.aiff Piano.ff.Gb3.aiff
Piano.ff.Bb1.aiff Piano.ff.E2.aiff Piano.ff.Gb4.aiff
Piano.ff.Bb2.aiff Piano.ff.E3.aiff Piano.ff.Gb5.aiff
Piano.ff.Bb3.aiff Piano.ff.E4.aiff Piano.ff.Gb6.aiff
Piano.ff.Bb4.aiff Piano.ff.E5.aiff Piano.ff.Gb7.aiff
Piano.ff.Bb5.aiff Piano.ff.E6.aiff
Piano.ff.Bb6.aiff Piano.ff.E7.aiff
I wish to rename them to their MIDI note number:
Piano.ff.A0.aiff -> 21.aiff
Piano.ff.Bb0.aiff -> 22.aiff
Piano.ff.B0.aiff -> 23.aiff
Piano.ff.C1.aiff -> 24.aiff
:
(21 is the MIDI number for the lowest note on a piano)
While 88 is probably more a 'do it by hand' size, I'm curious whether it can be automated in a few lines of Bash
If:
'C' ~ 0
'Db' ~ 1
'D' ~ 2
:
'B' ~ 11
Then I could do:
MidiNote = NumberForPitchclass( pitchclassstring ) + 12 * octave
But does Bash have the apparatus for this operation?
If you have bash 4, using associative arrays would be the way to go:
noteNames=(C Db D Eb E F Gb G Ab A Bb B)
declare -A noteNumbers
for (( i=0; i<${#noteNames[@]}; ++i )); do
noteNumbers[${noteNames[i]}]=$i
done
for f in *.aiff; do
note="${f#Piano.ff.}"
note="${note%.aiff}"
name="${note%%[0-9]*}"
octave="${note#$name}"
if [ ! -n "${noteNumbers[$name]}" ]; then
echo >&2 "$0: not renaming $f - note not found"
else
let midiNote=${noteNumbers[$name]}+12*octave
mv "$f" "$midiNote.aiff"
fi
done
If you don't have bash 4, you can do it more manually by looping through the notes for every file instead of just once at the beginning:
noteNames=(C Db D Eb E F Gb G Ab A Bb B)
for f in *.aiff; do
note="${f#Piano.ff.}"
note="${note%.aiff}"
name="${note%%[0-9]*}"
octave="${note#$name}"
for (( base=0; base<${#noteNames[@]}; ++base )); do
if [[ "${noteNames[base]}" == $name ]]; then
break
fi
done
if (( base >= ${#noteNames[@]} )); then
echo >&2 "$0: not renaming $f - note not found"
else
let midiNote=base+12*octave
mv "$f" "$midiNote.aiff"
fi
done
However, that gives A0 the number 9, where you said it should be 21, so you apparently need to add 12 to those numbers.
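Putting the offset together with the lookup table, the calculation can be sketched as a small function. midi_note is a hypothetical helper name, and the formula assumes the numbering given in the question, where A0 maps to 21 and C1 to 24, which works out to pitch class + 12 * (octave + 1).

```shell
#!/usr/bin/env bash
declare -A note_numbers=(
    [C]=0 [Db]=1 [D]=2 [Eb]=3 [E]=4 [F]=5
    [Gb]=6 [G]=7 [Ab]=8 [A]=9 [Bb]=10 [B]=11
)

# Print the MIDI number for a note such as A0 or Bb3; return 1 if it cannot be parsed.
midi_note() {
    local note=$1
    local name=${note%%[0-9]*}     # leading letters, e.g. Bb
    local octave=${note#"$name"}   # trailing digits, e.g. 3
    [[ -n $name && $octave =~ ^[0-9]+$ ]] || return 1
    [[ -n ${note_numbers[$name]+set} ]] || return 1
    # 10# forces base 10 so an octave with a leading zero is not read as octal.
    echo $(( ${note_numbers[$name]} + 12 * (10#$octave + 1) ))
}

midi_note A0    # prints 21
midi_note C8    # prints 108
```

Inside the rename loop, the note extracted from each file name could be passed to this function and the result used as the new file name.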

How to convert command output to an array line by line in bash?

I'm trying to convert the output of a command like echo -e "a b\nc\nd e" to an array.
X=( $(echo -e "a b\nc\nd e") )
This splits the input on every newline and whitespace character:
$ echo ${#X[@]}
> 5
for i in ${X[@]} ; do echo $i ; done
a
b
c
d
e
The result should be:
for i in ${X[@]} ; do echo $i ; done
a b
c
d e
You need to change your Internal Field Separator variable (IFS) to a newline first.
$ IFS=$'\n'; arr=( $(echo -e "a b\nc\nd e") ); for i in ${arr[@]} ; do echo $i ; done
a b
c
d e
readarray -t ARRAY < <(COMMAND)
Set IFS to a newline. By default, it is space, tab, and newline.
[jaypal:~] while IFS=$'\n' read -a arry; do
echo ${arry[0]};
done < <(echo -e "a b\nc\nd e")
a b
c
d e
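A short sketch of the readarray approach, which avoids changing IFS globally and keeps each line intact as long as the array is expanded inside quotes. The array name lines is illustrative, and printf stands in for the command whose output is being captured. readarray (also spelled mapfile) requires bash 4.0 or later.

```shell
#!/usr/bin/env bash
# -t strips the trailing newline from each line read.
readarray -t lines < <(printf 'a b\nc\nd e\n')

echo "${#lines[@]}"    # prints 3

# Quote the expansion so elements containing spaces are not split again.
for line in "${lines[@]}"; do
    echo "$line"
done
```

The quoting matters: an unquoted ${lines[@]} would be split on whitespace again and give five words instead of three lines.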

Resources