Shell script - awk extract block from file into array

I'm currently writing a shell script that reads a Vagrantfile and bootstraps it (in a nutshell ;)).
But I'm hitting a wall with the following piece of code:
TEST=()
while read result; do
TEST+=(`echo ${result}`)
done <<< `awk '/config.vm.define[ \s]\"[a-z]*\"[ \s]do[ \s]\|[a-zA-Z_]*\|/, /end/ { print }' Vagrantfile`
echo "${TEST[1]}"
When I run this awk pattern against a Vagrantfile with two machines defined (config.vm.define), both blocks are found.
The output:
config.vm.define "web" do |web|
web.vm.box = "CentOs"
web.vm.box_url = "http://developer.nrel.gov/downloads/vagrant-boxes/CentOS-6.4-x86_64-v20130731.box"
web.vm.hostname = 'dev.local'
web.vm.network :forwarded_port, guest: 90, host: 9090
web.vm.network :private_network, ip: "22.22.22.11"
web.vm.provision :puppet do |puppet|
puppet.manifests_path = "puppet/manifests"
puppet.manifest_file = "web.pp"
puppet.module_path = "puppet/modules"
puppet.options = ["--verbose", "--hiera_config /vagrant/hiera.yaml", "--parser future"]
end
config.vm.define "db" do |db_mysql|
db_mysql.vm.box = "CentOs"
db_mysql.vm.box_url = "http://developer.nrel.gov/downloads/vagrant-boxes/CentOS-6.4-x86_64-v20130731.box"
db_mysql.vm.hostname = 'db.mysql.local'
db_mysql.vm.network :private_network, ip: "22.22.22.22"
db_mysql.vm.network :forwarded_port, guest: 3306, host: 3306
db_mysql.vm.provision :puppet do |puppet|
puppet.manifests_path = "puppet/manifests"
puppet.manifest_file = "db.pp"
puppet.module_path = "puppet/modules"
puppet.options = ["--verbose", "--hiera_config /vagrant/hiera.yaml", "--parser future"]
end
But I can't seem to get them into an array nicely. What I want is for the TEST array to contain two elements, each holding one machine's config.vm.define block as its value.
E.g.
TEST[0] = 'config.vm.define "web" do |web|
.... [REST OF THE BLOCK CONTENT] ...
end'
TEST[1] = 'config.vm.define "db" do |db_mysql|
.... [REST OF THE BLOCK CONTENT] ...
end'
echo "${TEST[1]}" outputs nothing, while echo "${TEST[0]}" returns the whole block as shown above.
I played with IFS / RS / FS but I can't seem to get the output I want.

A solution might be to write the two blocks to two separate files (blk1 and blk2) as:
awk '
/config.vm.define[[:space:]]\"[a-z]*\"[[:space:]]do[[:space:]]\|[a-zA-Z_]*\|/{f=1; i++}
f{print $0 > "blk"i}
/end/ {f=0}' Vagrantfile
and then later read these two files into the bash array as
IFS= TEST=( $(cat <"blk1") $(cat <"blk2") )
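If more than two blocks may be present, a glob loop over the generated files generalizes this. A minimal sketch, assuming the blk* files were produced by the awk command above:
TEST=()
for f in blk*; do
TEST+=("$(<"$f")")   # $(<file) expands to the file's entire contents
done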
Note:
The regex escape \s seems to work only in newer versions of gawk (it works with version 4.1, but not with version 3.1.8).
For gawk version 3.1.8, use [[:space:]] instead.
For gawk version 4.1, the regex \s does not work inside brackets ([\s]). Use either config.vm.define[[:space:]] or config.vm.define\s.
Update
An alternative could be to insert an artificial separator between the blocks, for instance the string ###. Then you could do
IFS= TEST=()
while IFS= read -r -d '#' line ; do
TEST+=($line)
done < <(awk '
/config.vm.define[[:space:]]\"[a-z]*\"[[:space:]]do[[:space:]]\|[a-zA-Z_]*\|/{f=1; i++}
f{print }
/end/ {f=0; print "###"}' Vagrantfile)
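An artificial separator can still collide with file content; a NUL byte cannot. Here is a sketch of the same idea, assuming GNU awk (where printf "%c", 0 emits a literal NUL):
TEST=()
while IFS= read -r -d '' block; do
TEST+=("$block")
done < <(awk '
/config.vm.define[[:space:]]"[a-z]*"[[:space:]]do[[:space:]]\|[a-zA-Z_]*\|/{f=1}
f{print}
f && /end/ {f=0; printf "%c", 0}' Vagrantfile)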

Related

Cannot send command output into an array

I want to place each sysctl -a output line into an array:
TAB=($(sysctl -a))
It does not work; the resulting array contains the output split on any whitespace, instead of only on newlines:
[..]
NONE
net.netfilter.nf_log.5
=
NONE
net.netfilter.nf_log.6
=
NONE
net.netfilter.nf_log.7
=
NONE
net.netfilter.nf_log.8
[..]
I try:
while read -r line
do
TAB+=("${line}")
done< <(sysctl -a)
That does not work either (same issue).
I try:
while IFS=$'\n' read -r line
do
TAB+=("${line}")
done< <(sysctl -a)
But still same output, same issue.
What's the correct method to have each line correctly placed in the array?
One way - probably the easiest - is to use readarray (bash 4 needed).
readarray -t TAB < <(sysctl -a)
Test:
$ echo ${TAB[0]}
abi.vsyscall32 = 1
$ echo ${TAB[1]}
crypto.fips_enabled = 0
You were close:
IFS=$'\n' TAB=($(sysctl -a))
# Usage example:
for x in "${TAB[@]}"
do
echo $x
done
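One caveat: the assignment is unquoted, so pathname expansion still applies and a line containing * or ? could be replaced by matching filenames. A minimal guard (sketch):
set -f                        # disable globbing around the unquoted expansion
IFS=$'\n' TAB=( $(sysctl -a) )
set +f                        # restore globbing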

Put lines of a text file in an array in bash

I'm taking over a bash script from a colleague that reads a file, processes it, and prints another file based on each line, all inside a while loop at the moment.
I now need to append some features to it. The one I'm having issues with right now is reading a file and putting each line into an array, where the 2nd column of a line can be empty, e.g.:
For a text file with \t as separator:
A\tB\tC
A\t\tC
For a CSV file same but with , as separator:
A,B,C
A,,C
Which should then give
["A","B","C"] or ["A", "", "C"]
The code I took over is as follow:
while IFS=$'\t\r' read -r -a col; do
# Process the array, put that into a file
lp -d $printer $file_to_print
done < $input_file
This works if B is filled, but B sometimes needs to be empty now; when the input file leaves it empty, the created array (and thus the output file to print) just skips this empty cell (the array is then ["A","C"]).
I tried writing the whole block in awk, but that brought its own set of problems, making it difficult to call the lp command to print.
So my question is, how can I preserve the empty cell from the line into my bash array, so that I can call on it later and use it?
Thank you very much. I know this might be quite confusing, so please ask and I'll clarify.
Edit: After a request, here's the awk code I've tried. The issue here is that it only prints the last print request, even though I know it loops over the whole file and the lp command is still inside the loop.
awk 'BEGIN {
inputfile="'"${optfile}"'"
outputfile="'"${file_loc}"'"
printer="'"${printer}"'"
while (getline < inputfile){
print "'"${prefix}"'" > outputfile
split($0,ft,"'"${IFSseps}"'");
if (length(ft[2]) == 0){
print "CODEPAGE 1252\nTEXT 465,191,\"ROMAN.TTF\",180,7,7,\""ft[1]"\"" >> outputfile
size_changer = 0
} else {
print "CODEPAGE 1252\nTEXT 465,191,\"ROMAN.TTF\",180,7,7,\""ft[1]"_"ft[2]"\"" >> outputfile
size_changer = 1
}
if ( split($0,ft,"'"${IFSseps}"'") > 6)
maxcounter = 6;
else
maxcounter = split($0,ft,"'"${IFSseps}"'");
for (i = 3; i <= maxcounter; i++){
x=191-(i-2)*33
print "CODEPAGE 1252\nTEXT 465,"x",\"ROMAN.TTF\",180,7,7,\""ft[i]"\"" >> outputfile
}
print "PRINT ""'"${copies}"'"",1" >> outputfile
close(outputfile)
"'"`lp -d ${printer} ${file_loc}`"'"
}
close("'"${file_loc}"'");
}'
EDIT2: Continuing to try to find a solution to it, I tried following code without success. This is weird, as just doing printf without putting it in an array keeps the formatting intact.
$ cat testinput | tr '\t' '>'
A>B>C
A>>C
# Should normally be empty on the second output line
$ while read line; do IFS=$'\t' read -ra col < <(printf "$line"); echo ${col[1]}; done < testinput
B
C
For tab, it's complicated.
From 3.5.7 Word Splitting in the manual:
A sequence of IFS whitespace characters is also treated as a delimiter.
Since tab is an "IFS whitespace character", sequences of tabs are treated as a single delimiter
IFS=$'\t' read -ra ary <<<$'A\t\tC'
declare -p ary
declare -a ary=([0]="A" [1]="C")
What you can do is translate tabs to a non-whitespace character, assuming it does not clash with the actual data in the fields:
line=$'A\t\tC'
IFS=, read -ra ary <<<"${line//$'\t'/,}"
declare -p ary
declare -a ary=([0]="A" [1]="" [2]="C")
To avoid the risk of colliding with commas in the data, we can use an unusual ASCII character: FS, octal 034
line=$'A\t\tC'
printf -v FS '\034'
IFS="$FS" read -ra ary <<<"${line//$'\t'/"$FS"}"
# or, without the placeholder variable
IFS=$'\034' read -ra ary <<<"${line//$'\t'/$'\034'}"
declare -p ary
declare -a ary=([0]="A" [1]="" [2]="C")
One bash example using parameter expansion where we convert the delimiter into a \n and let mapfile read in each line as a new array entry ...
For tab-delimited data:
for line in $'A\tB\tC' $'A\t\tC'
do
mapfile -t array <<< "${line//$'\t'/$'\n'}"
echo "############# ${line}"
typeset -p array
done
############# A B C
declare -a array=([0]="A" [1]="B" [2]="C")
############# A C
declare -a array=([0]="A" [1]="" [2]="C")
NOTE: The $'...' construct ensures the \t is treated as a single <tab> character as opposed to the two literal characters \ + t.
For comma-delimited data:
for line in 'A,B,C' 'A,,C'
do
mapfile -t array <<< "${line//,/$'\n'}"
echo "############# ${line}"
typeset -p array
done
############# A,B,C
declare -a array=([0]="A" [1]="B" [2]="C")
############# A,,C
declare -a array=([0]="A" [1]="" [2]="C")
NOTE: This obviously (?) assumes the desired data does not contain a comma (,).
It may just be your # Process the array, put that into a file part.
IFS=, read -ra ray <<< "A,,C"
for e in "${ray[@]}"; do o="$o\"$e\","; done
echo "[${o%,}]"
["A","","C"]
See @Glenn's excellent answer regarding tabs.
My simple data file:
$: cat x # tab delimited, empty field 2 of line 2
a b c
d f
My test:
while IFS=$'\001' read -r a b c; do
echo "a:[$a] b:[$b] c:[$c]"
done < <(tr "\t" "\001"<x)
a:[a] b:[b] c:[c]
a:[d] b:[] c:[f]
Note that I used ^A (a 001 byte) but you might be able to use something as simple as a comma or pipe (|) character. Choose based on your data.

How to process all or selected rows in a csv file where column headers and order are dynamic?

I'd like to either process one row of a csv file or the whole file.
The variables are set by the header row, which may be in any order.
There may be up to 12 columns, but only 3 or 4 variables are needed.
The source files might be in either format, and all I want from both is lastname and country. I know of many different ways and tools to do it if the columns were fixed and always in the same order. But they're not.
examplesource.csv:
firstname,lastname,country
Linus,Torvalds,Finland
Linus,van Pelt,USA
examplesource2.csv:
lastname,age,country
Torvalds,66,Finland
van Pelt,7,USA
I have cobbled together something from various Stackoverflow postings which looks a bit voodoo but seems fairly robust. I say "voodoo" because shellcheck complains that, for example, "firstname is referenced but not assigned". And yet it prints it.
#!/bin/bash
#set the field separator to newline
IFS=$'\n'
#split/transpose the first-line column titles to rows
COLUMNAMES=$(head -n1 examplesource.csv | tr ',' '\n')
#set an array and read the columns into it
columns=()
for line in $COLUMNAMES; do
columns+=("$line")
done
#reset the field separator
IFS=","
#using -p here to debug in output
declare -ap columns
#read from line 2 onwards
sed 1d examplesource.csv | while read "${columns[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
In the case of looping through everything, it works perfectly for my needs and I can process within the "while read" loop. But to make it cleaner, I'd rather pass the current element(?) to an external function to process (not just echo).
And if I only wanted the array (current row) belonging to "Torvalds", I cannot find how to access that or even get its current index, e.g.: "if $wantedname && $lastname == $wantedname then call function with currentrow only otherwise loop all rows and call function".
I know there aren't multidimensional associative arrays in bash from reading
Multidimensional associative arrays in Bash and I've tried to understand arrays from
https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
Is it clear what I'm trying to achieve in a bash-only manner and does the question make sense?
Many thanks.
Let's shorten your script. Don't read the source twice (first with head, then with sed); you can do it in one pass. Also, the whole array-reading loop can be shortened to just IFS=',' COLUMNAMES=($(head -n1 source.csv)). Here's a shorter version:
#!/bin/bash
cat examplesource.csv |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
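With examplesource.csv as input, a run (the script name shorter.sh is just a placeholder) should print:
$ bash shorter.sh
Linus Torvalds is from Finland
Linus van Pelt is from USA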
If you want to parse both files at the same time, i.e. join them, nothing simpler ;). First, let's number lines in the first file using nl -w1 -s,. Then we use join to join the files on the name of the people. Remember that join input needs to be sort-ed using proper fields. Then we sort the output with sort using the number from the first file. After that we can read all the data just like that:
# join the files, using `,` as the separator
# on the 3rd field from the first file and the first field from the second file
# the output should be first the fields from the first file, then the second file
# the country (field 1.4) is duplicated in 2.3, so just omitting it.
join -t, -13 -21 -o 1.1,1.2,1.3,2.2,2.3 <(
# number the lines in the first file
<examplesource.csv nl -w1 -s, |
# there is one field more, sort using the 3rd field
sort -t, -k3
) <(
# sort the second file using the first field
<examplesource2.csv sort -t, -k1
) |
# sort the output using the numbers from the first file
sort -t, -k1 -n |
# well, remove the numbers
cut -d, -f2- |
# just a normal read follows
{
# read the headers
IFS=, read -r -a names
while IFS=, read -r "${names[@]}"; do
# finally our output!
echo "${firstname} ${lastname} is from ${country} and is so many ${age} years old!"
done
}
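Given examplesource.csv and examplesource2.csv, this pipeline should print something like:
Linus Torvalds is from Finland and is so many 66 years old!
Linus van Pelt is from USA and is so many 7 years old!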
Tested on tutorialspoint.
GNU Awk has multidimensional arrays. It also has array sorting mechanisms, which I have not used here. Please comment if you are interested in pursuing this solution further. The following depends on consistent key names and line numbers across input files, but can handle an arbitrary number of fields and input files.
$ gawk -V |gawk NR==1
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)
$ gawk -F, '
FNR == 1 {for(f=1;f<=NF;f++) Key[f]=$f}
FNR != 1 {for(f=1;f<=NF;f++) People[FNR][Key[f]]=$f}
END {
for(Person in People) {
for(attribute in People[Person])
output = output FS People[Person][attribute]
print substr(output,2)
output=""
}
}
' file*
66,Finland,Linus,Torvalds
7,USA,Linus,van Pelt
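For reference, the sorting mechanism mentioned above is PROCINFO["sorted_in"]; a minimal sketch (GNU Awk 4.0 or newer) that makes the attribute order deterministic instead of unspecified:
$ gawk -F, '
FNR == 1 {for(f=1;f<=NF;f++) Key[f]=$f}
FNR != 1 {for(f=1;f<=NF;f++) People[FNR][Key[f]]=$f}
END {
PROCINFO["sorted_in"] = "@ind_str_asc"   # iterate each array by index, ascending
for(Person in People) {
for(attribute in People[Person])
output = output FS People[Person][attribute]
print substr(output,2)
output=""
}
}
' file*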
A bash solution takes a bit more work than an awk solution, but if this is an exercise in what bash provides, it has all you need: determine which column holds the last name from the first line of input, then output the lastname field from the remaining lines.
An easy approach is simply to read each line into a normal array and then loop over the elements of the first line to locate the column "lastname" appears in, saving that column in a variable. You can then read each of the remaining lines the same way and output the lastname field by outputting the element at the saved column.
A short example would be:
#!/bin/bash
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{ col=i; break; } ## if found, save column index and break loop
done
fi
[ "$cnt" -gt '0' ] && ## if not headder row
echo "line $cnt lastname: ${arr[col]}" ## output lastname variable
((cnt++)) ## increment linecount
done < "$1"
Example Use/Output
Using your two files data files, the output would be:
$ bash readcsv.sh ex1.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
$ bash readcsv.sh ex2.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
A similar implementation using awk would be:
awk -F, '
NR == 1 {
for (i = 1; i <= NF; i++)
if ($i == "lastname") {
col = i
break
}
}
NR > 1 {
print "lastname: ", $col
}
' ex1.csv
Example Use/Output
$ awk -F, 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "lastname") { col = i; break } } NR > 1 { print "lastname: ", $col }' ex1.csv
lastname: Torvalds
lastname: van Pelt
(output is the same for either file)
Thank you all. I've taken a couple of bits from two answers
I used the answer from David to find the number of the row, then I used the elegantly simple solution from Kamil to loop through what I need.
The result is exactly what I wanted. Thank you all.
$ readexample.sh examplesource.csv "Torvalds"
Everyone
Linus Torvalds is from Finland
Linus van Pelt is from USA
now just Torvalds
Linus Torvalds is from Finland
And this is the code - now that you know what I want it to do, if anyone can see any dangers or improvements, please let me know as I'm always learning. Thanks.
#!/bin/bash
FILENAME="$1"
WANTED="$2"
printDetails() {
SINGLEROW="$1"
[[ ! -z "$SINGLEROW" ]] && opt=("--expression" "1p" "--expression" "${SINGLEROW}p") || opt=("--expression" "1p" "--expression" "2,199p")
sed -n "${opt[#]}" "$FILENAME" |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
}
findRow() {
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{
col=i
break
} ## if found, save column index and break loop
done
fi
[ "$cnt" -gt '0' ] && ## if not headder row
if [ "${arr[col]}" == "$1" ]; then
echo "$cnt" ## output lastname variable
fi
((cnt++)) ## increment linecount
done <"$FILENAME"
}
echo "Everyone"
printDetails
if [ ! -z "${WANTED}" ]; then
echo -e "\nnow just ${WANTED}"
row=$(findRow "${WANTED}")
printDetails "$((row + 1))"
fi
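One small improvement worth considering: the 2,199p expression silently caps the "all rows" branch at 199 lines. sed accepts $ as the last-line address, so the cap can be removed (a sketch; the single quotes keep the shell from expanding $p):
opt=("--expression" "1p" "--expression" '2,$p')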

Using array inside awk in shell script

I am very new to Unix shell scripting and trying to gain some knowledge. Please check my requirement and my approach.
I have an input file with this data:
ABC = A:3 E:3 PS:6
PQR = B:5 S:5 AS:2 N:2
I am trying to parse the data and get the result as
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
The values can be added horizontally and vertically so I am trying to use an array. I am trying something like this:
myarr=(main.conf | awk -F"=" 'NR!=1 {print $1}'))
echo ${myarr[1]}
# Or loop through every element in the array
for i in "${myarr[#]}"
do
:
echo $i
done
or
awk -F"=" 'NR!=1 {
print $1"\n"
STR=$2
IFS=':' read -r -a array <<< "$STR"
for i in "${!array[#]}"
do
echo "$i=>${array[i]}"
done
}' main.conf
But when I add this code to a .sh file and try to run it, I get syntax errors:
$ awk -F"=" 'NR!=1 {
> print $1"\n"
> STR=$2
> FS= read -r -a array <<< "$STR"
> for i in "${!array[@]}"
> do
> echo "$i=>${array[i]}"
> done
>
> }' main.conf
awk: cmd. line:4: FS= read -r -a array <<< "$STR"
awk: cmd. line:4: ^ syntax error
awk: cmd. line:5: for i in "${!array[@]}"
awk: cmd. line:5: ^ syntax error
awk: cmd. line:8: done
awk: cmd. line:8: ^ syntax error
How can I complete the above expectations?
This is the awk code to do what you want:
$ cat tst.awk
BEGIN { FS="[ =:]+"; OFS="=" }
{
print $1
for (i=2;i<NF;i+=2) {
print $i, $(i+1)
}
print ""
}
and this is the shell script (yes, all a shell script does to manipulate text is call awk):
$ awk -f tst.awk file
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
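Since the question mentions that the values can be added horizontally and vertically, here is a hedged sketch in the same style; the file name sums.awk and the reading of "added" as per-line and per-key totals are my assumptions:
$ cat sums.awk
BEGIN { FS="[ =:]+" }
{
rowsum = 0
for (i=2; i<NF; i+=2) {
rowsum += $(i+1)        # horizontal: total of the values on this line
colsum[$i] += $(i+1)    # vertical: running total per key across lines
}
print $1 " total: " rowsum
}
END {
for (k in colsum)       # per-key totals, in unspecified order
print k " total: " colsum[k]
}
$ awk -f sums.awk file
ABC total: 12
PQR total: 14
... (per-key totals follow)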
A UNIX shell is an environment from which to call UNIX tools (find, sort, sed, grep, awk, tr, cut, etc.). It has its own language for manipulating (e.g. creating/destroying) files and processes and sequencing calls to tools but it is NOT intended to be used to manipulate text. The guys who invented shell also invented awk for shell to call to manipulate text.
Read https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice and the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
First off, a command that does what you want:
$ sed 's/ = /\n/;y/: /=\n/' main.conf
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2
This replaces, on each line, the first (and only) occurrence of = with a newline (the s command), then turns all : into = and all spaces into newlines (the y command). Notice that
this works only because there is a space at the end of the first line (otherwise it would be a bit more involved to get the empty line between the blocks) and
this works only with GNU sed because it substitutes newlines; see this fantastic answer for all the details and how to get it to work with BSD sed.
As for what you tried, there is almost too much wrong with it to try and fix it piece by piece: from the wild mixing of awk and Bash to syntax errors all over the place. I recommend you read good tutorials for both, for example:
The BashGuide
Effective AWK Programming
A Bash solution
Here is a way to solve the same in Bash; I didn't use any arrays.
#!/bin/bash
# Read line by line into the 'line' variable. Setting 'IFS' to the empty string
# preserves leading and trailing whitespace; '-r' prevents interpretation of
# backslash escapes
while IFS= read -r line; do
# Three parameter expansions:
# Replace ' = ' by newline (escape backslash)
line="${line/ = /\\n}"
# Replace ':' by '='
line="${line//:/=}"
# Replace spaces by newlines (escape backslash)
line="${line// /\\n}"
# Print the modified input line; '%b' expands backslash escapes
printf "%b" "$line"
done < "$1"
Output:
$ ./SO.sh main.conf
ABC
A=3
E=3
PS=6
PQR
B=5
S=5
AS=2
N=2

Accessing a JSON object in Bash - associative array / list / another model

I have a Bash script which gets data in JSON, I want to be able to convert the JSON into an accessible structure - array / list / or other model which would be easy to parse the nested data.
Example:
{
"SALUTATION": "Hello world",
"SOMETHING": "bla bla bla Mr. Freeman"
}
I want to get the value like the following: echo ${arr[SOMETHING]}
[ A different approach is fine as well. ]
If you want key and value, and based on How do i convert a json object to key=value format in JQ, you can do:
$ jq -r "to_entries|map(\"\(.key)=\(.value|tostring)\")|.[]" file
SALUTATION=Hello world
SOMETHING=bla bla bla Mr. Freeman
In a more general way, you can store the values into an array (myarray[key] = value) by feeding the while loop from jq's output, using the while ... do; ... done < <(command) syntax:
declare -A myarray
while IFS="=" read -r key value
do
myarray[$key]="$value"
done < <(jq -r 'to_entries|map("\(.key)=\(.value)")|.[]' file)
And then you can loop through the values like this:
for key in "${!myarray[@]}"
do
echo "$key = ${myarray[$key]}"
done
For this given input, it returns:
SALUTATION = Hello world
SOMETHING = bla bla bla Mr. Freeman
Although this question is answered, the posted answer didn't fully satisfy my requirements. Here is a little write-up that'll help any bash newcomers.
Foreknowledge
A basic associative array declaration
#!/bin/bash
declare -A associativeArray=([key1]=val1 [key2]=val2)
You can also use quotes (', ") around the declaration, its keys, and
values.
#!/bin/bash
declare -A 'associativeArray=([key1]=val1 [key2]=val2)'
And you can delimit each [key]=value pair via space or newline.
#!/bin/bash
declare -A associativeArray=([key1]=value1
['key2']=value2 [key3]='value3'
['key4']='value2' ["key5"]="value3"
["key6"]='value4'
['key7']="value5"
)
Depending on your quote variation, you may need to escape your string.
Using Indirection to access both key and value in an associative array
example () {
local -A associativeArray=([key1]=val1 [key2]=val2)
# print associative array
local key value
for key in "${!associativeArray[@]}"; do
value="${associativeArray["$key"]}"
printf '%s = %s\n' "$key" "$value"
done
}
Running the example function
$ example
key2 = val2
key1 = val1
Knowing the aforementioned tidbits allows you to derive the following snippets:
The following examples will all have the result as the example above
String evaluation
#!/usr/bin/env bash
example () {
local arrayAsString='associativeArray=([key1]=val1 [key2]=val2)'
local -A "$arrayAsString"
# print associative array
}
Piping your JSON into JQ
#!/usr/bin/env bash
# Note: usage of single quotes instead of double quotes for the jq
# filter. The former is preferred to avoid issues with shell
# substitution of quoted strings.
example () {
# Given the following JSON
local json='{ "key1": "val1", "key2": "val2" }'
# filter using `map` && `reduce`
local filter='to_entries | map("[\(.key)]=\(.value)") |
reduce .[] as $item ("associativeArray=("; . + ($item|@sh) + " ") + ")"'
# Declare and assign separately to avoid masking return values.
local arrayAsString;
# Note: no encompassing quotation (")
arrayAsString=$(jq --raw-output "${filter}" <<< "$json")
local -A "$arrayAsString"
# print associative array
}
jq -n / --null-input option + --argfile && redirection
#!/usr/bin/env bash
example () {
# /path/to/file.json contains the same json as the first two examples
local filter filename='/path/to/file.json'
# including bash variable name in reduction
filter='to_entries | map("[\(.key | @sh)]=\(.value | @sh) ")
| "associativeArray=(" + add + ")"'
# using --argfile && --null-input
local -A "$(jq --raw-output --null-input --argfile file "$filename" \
"\$filename | ${filter}")"
# or for a more traceable declaration (using shellcheck or other) this
# variation moves the variable name outside of the string
# map definition && reduce replacement
filter='[to_entries[]|"["+(.key|@sh)+"]="+(.value|@sh)]|"("+join(" ")+")"'
# input redirection && --join-output
local -A associativeArray=$(jq --join-output "${filter}" < "${filename}")
# print associative array
}
Reviewing previous answers
@Ján Lalinský
To load JSON object into a bash associative array efficiently
(without using loops in bash), one can use tool 'jq', as follows.
# first, load the json text into a variable:
json='{"SALUTATION": "Hello world", "SOMETHING": "bla bla bla Mr. Freeman"}'
# then, prepare associative array, I use 'aa':
unset aa
declare -A aa
# use jq to produce text defining name:value pairs in the bash format
# using #sh to properly escape the values
aacontent=$(jq -r '. | to_entries | .[] | "[\"" + .key + "\"]=" + (.value | @sh)' <<< "$json")
# string containing whole definition of aa in bash
aadef="aa=($aacontent)"
# load the definition (because values may contain LF characters, aadef must be in double quotes)
eval "$aadef"
# now we can access the values like this: echo "${aa[SOMETHING]}"
Warning: this uses eval, which is dangerous if the json input is from unknown source (may contain malicious shell commands that eval may execute).
This could be reduced to the following
example () {
local json='{ "key1": "val1", "key2": "val2" }'
local -A associativeArray="($(jq -r '. | to_entries | .[] |
"[\"" + .key + "\"]=" + (.value | #sh)' <<< "$json"))"
# print associative array
}
@fedorqui
If you want key and value, and based on How do i convert a json object to key=value format in JQ, you can do:
$ jq -r "to_entries|map(\"\(.key)=\(.value|tostring)\")|.[]" file
SALUTATION=Hello world
SOMETHING=bla bla bla Mr. Freeman
In a more general way, you can store the values into an array myarray[key] = value like this, just by providing jq to the while with the while ... do; ... done < <(command) syntax:
declare -A myarray
while IFS="=" read -r key value
do
myarray[$key]="$value"
done < <(jq -r "to_entries|map(\"\(.key)=\(.value)\")|.[]" file)
And then you can loop through the values like this:
for key in "${!myarray[@]}"
do
echo "$key = ${myarray[$key]}"
done
For this given input, it returns:
SALUTATION = Hello world
SOMETHING = bla bla bla Mr. Freeman
The main difference between this solution and my own is looping through the
array in bash or in jq.
Each solution is valid and, depending on your use case, one may be more useful than the other.
Context: This answer was written to be responsive to a question title which no longer exists.
The OP's question actually describes objects, vs arrays.
To be sure that we help other people coming in who are actually looking for help with JSON arrays, though, it's worth covering them explicitly.
For the safe-ish case where strings can't contain newlines (and when bash 4.0 or newer is in use), this works:
str='["Hello world", "bla bla bla Mr. Freeman"]'
readarray -t array <<<"$(jq -r '.[]' <<<"$str")"
To support older versions of bash, and strings with newlines, we get a bit fancier, using a NUL-delimited stream to read from jq:
str='["Hello world", "bla bla bla Mr. Freeman", "this is\ntwo lines"]'
array=( )
while IFS= read -r -d '' line; do
array+=( "$line" )
done < <(jq -j '.[] | (. + "\u0000")' <<<"$str")
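A quick sanity check of the result (the exact quoting in declare -p output varies a little across bash versions):
declare -p array
# declare -a array=([0]="Hello world" [1]="bla bla bla Mr. Freeman" [2]=$'this is\ntwo lines')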
This is how it can be done recursively:
#!/bin/bash
SOURCE="$PWD"
SETTINGS_FILE="$SOURCE/settings.json"
SETTINGS_JSON=`cat "$SETTINGS_FILE"`
declare -A SETTINGS
function get_settings() {
local PARAMS="$#"
local JSON=`jq -r "to_entries|map(\"\(.key)=\(.value|tostring)\")|.[]" <<< "$1"`
local KEYS=''
if [ $# -gt 1 ]; then
KEYS="$2"
fi
while read -r PAIR; do
local KEY=''
if [ -z "$PAIR" ]; then
break
fi
IFS== read PAIR_KEY PAIR_VALUE <<< "$PAIR"
if [ -z "$KEYS" ]; then
KEY="$PAIR_KEY"
else
KEY="$KEYS:$PAIR_KEY"
fi
if jq -e . >/dev/null 2>&1 <<< "$PAIR_VALUE"; then
get_settings "$PAIR_VALUE" "$KEY"
else
SETTINGS["$KEY"]="$PAIR_VALUE"
fi
done <<< "$JSON"
}
To call it:
get_settings "$SETTINGS_JSON"
The array will be accessed like so:
${SETTINGS[grandparent:parent:child]}
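For instance, with a hypothetical settings.json of {"db": {"mysql": {"host": "localhost"}}}, the lookup would be:
get_settings "$SETTINGS_JSON"
echo "${SETTINGS[db:mysql:host]}"   # prints: localhost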
To load JSON object into a bash associative array efficiently (without using loops in bash), one can use tool 'jq', as follows.
# first, load the json text into a variable:
json='{"SALUTATION": "Hello world", "SOMETHING": "bla bla bla Mr. Freeman"}'
# then, prepare associative array, I use 'aa':
unset aa
declare -A aa
# use jq to produce text defining name:value pairs in the bash format
# using #sh to properly escape the values
aacontent=$(jq -r '. | to_entries | .[] | "[\"" + .key + "\"]=" + (.value | @sh)' <<< "$json")
# string containing whole definition of aa in bash
aadef="aa=($aacontent)"
# load the definition (because values may contain LF characters, aadef must be in double quotes)
eval "$aadef"
# now we can access the values like this: echo "${aa[SOMETHING]}"
Warning: this uses eval, which is dangerous if the json input is from unknown source (may contain malicious shell commands that eval may execute).
Building on @HelpNeeder's solution (nice one btw):
His solution wasn't really working with integers, so I made some additions: an extended set of condition checks, meaning some performance is sacrificed.
This version works with integers and also floating point values.
SOURCE="$PWD"
SETTINGS_FILE="./test2.json"
SETTINGS_JSON=`cat "$SETTINGS_FILE"`
declare -A SETTINGS
get_settings() {
local PARAMS="$#"
local JSON=`jq -r "to_entries|map(\"\(.key)=\(.value|tostring)\")|.[]" <<< "$1"`
local KEYS=''
if [ $# -gt 1 ]; then
KEYS="$2"
fi
while read -r PAIR; do
local KEY=''
if [ -z "$PAIR" ]; then
break
fi
IFS== read PAIR_KEY PAIR_VALUE <<< "$PAIR"
if [ -z "$KEYS" ]; then
KEY="$PAIR_KEY"
else
KEY="$KEYS:$PAIR_KEY"
fi
res=$(jq -e . 2>/dev/null <<< "$PAIR_VALUE")
exitCode=$?
check=`echo "$PAIR_VALUE" | grep -E ^\-?[0-9]*\.?[0-9]+$`
# if [ "${res}" ] && [ $exitCode -eq "0" ] && [[ ! "${PAIR_VALUE}" == ?(-)+([0-9]) ]] ALTERNATIVE, works only for integer (not floating point)
if [ "${res}" ] && [ $exitCode -eq "0" ] && [[ "$check" == '' ]]
then
get_settings "$PAIR_VALUE" "$KEY"
else
SETTINGS["$KEY"]="$PAIR_VALUE"
fi
done <<< "$JSON"
}
get_settings "$SETTINGS_JSON"
Solution: use jq (it's a lightweight and flexible command-line JSON processor).
In bash I'd rather assign the JSON object to a variable and use jq to access and parse the right result from it. It's more convenient than parsing this structure with arrays, and it comes out of the box with multiple functionalities and features such as accessing nested and complex objects, select methods, built-in operators and functions, regex support, comparisons, etc.
example:
example='{"SALUTATION": "Hello world","SOMETHING": "bla bla bla Mr. Freeman"}'
echo $example | jq .SOMETHING
# output:
"bla bla bla Mr. Freeman"
