Bash arrays: Need help with white space - arrays

I'm trying to create an array in bash from a file with the following sample format:
data, data, interesting
data, data, more interesting
The way I'm populating the arrays:
read -r -a DATA1 <<< $(cat $FILE | awk -F, '{ print $1 }')
read -r -a DATA2 <<< $(cat $FILE | awk -F, '{ print $2 }')
read -r -a DATA3 <<< $(cat $FILE | awk -F, '{ print $3 }')
When I examine the array DATA3, there are 3 elements:
interesting
more
interesting
I need it to show only 2 elements like:
interesting
more interesting
How can I preserve the white space in field 3 so when I call that element from the array, it appears as "more interesting"? Is there a better way to handle this?

The key is to use the IFS (internal field separators) variable together with read.
read can read words directly into an array using the -a option. Set IFS=, and it will split at comma:
while read f; do
echo "$f" | (IFS=, && read -a arr && echo ${arr[2]})
done
will echo
interesting
more interesting
You can also read directly into variable names:
IFS=, && read f1 f2 f2
EDIT: I can recommend reading Advanced Bash Scripting Guide.

Use cut. Example:
read -r -a DATA1 <<< $(cut -d, -f 1 $FILE)
read -r -a DATA2 <<< $(cut -d, -f 2 $FILE)
read -r -a DATA3 <<< $(cut -d, -f 3 $FILE)

IFS=','
while read -a array
do
echo "${array[2]}"
done < data.csv

>> Is there a better way to handle this? - Tee Bone
$ awk -F", " '{print $3}' file
interesting
more interesting

Related

How to parse and convert string list to JSON string array in shell command?

How to parse and convert string list to JSON string array in shell command?
'["test1","test2","test3"]'
to
test1
test2
test3
I tried like below:
string=$1
array=${string#"["}
array=${array%"]"}
IFS=',' read -a array <<< $array;
echo "${array[#]}"
Any other optimized way?
As bash and jq are tagged, this solution relies on both (without summoning eval). The input string is expected to be in $string, the output array is generated into ${array[#]}. It is robust wrt spaces, newlines, quotes, etc. as it uses NUL as delimiter.
mapfile -d '' array < <(jq -j '.[] + "\u0000"' <<< "$string")
Testing
string='["has spaces\tand tabs","has a\nnewline","has \"quotes\""]'
mapfile -d '' array < <(jq -j '.[] + "\u0000"' <<< "$string")
printf '==>%s<==\n' "${array[#]}"
==>has spaces and tabs<==
==>has a
newline<==
==>has "quotes"<==
eval "array=($( jq -r 'map( #sh ) | join(" ")' <<<"$json" ))"

Problem with Splitting Up a String and Putting it Into an Array in BASH on a Mac

I have been trying to split up a string and putting it into an Array in Bash on my Mac without success.
Here is my sample code:
#!/bin/bash
declare -a allDisks
allDisksString="`ls /dev/disk* | grep -e 'disk[0-9]s.*' | awk '{ print $NF }'`"
#allDisksString="/dev/disk0s1 /dev/disk1s1"
echo allDisksString is $allDisksString
IFS=' ' read -ra allDisks <<< "$allDisksString"
echo allDIsks is "$allDisks"
echo The second item in allDisks is "${allDisks[1]}"
for disk in "${allDisks[#]}"
do
printf "Loop $disk\n"
done
And below is the output:
allDisksString is /dev/disk0s1 /dev/disk0s2 /dev/disk0s3 /dev/disk0s4 /dev/disk1s1
allDIsks is /dev/disk0s1
The second item in allDisks is
Loop /dev/disk0s1
Interesting if I execute the following in the Mac Terminal:
ls /dev/disk* | grep -e 'disk[0-9]s.*' | awk '{ print $NF }'
I get the following output
/dev/disk0s1
/dev/disk0s2
/dev/disk0s3
/dev/disk0s4
/dev/disk1s1
So I have also tried setting IFS to IFS=$'\n' without any success.
So no luck in getting a list of my drives into an array.
Any ideas on what I am doing wrong?
You're making this much more complicated than it needs to be. You don't need to use ls, you can just use a wildcard to match the device names you want, and put that in an array assignment.
#!/bin/bash
declare -a allDisks
allDisks=(/dev/disk[0-9]s*)
echo allDIsks is "$allDisks"
echo The second item in allDisks is "${allDisks[1]}"
for disk in "${allDisks[#]}"
do
printf "Loop $disk\n"
done
read only reads one line.
Use an assignment instead. When assigning to an array, you need to use parentheses after the = sign:
#!/bin/bash
disks=( $(ls /dev/disk* | grep -e 'disk[0-9]s.*' | awk '{ print $NF }') )
echo ${disks[1]}

Assigning an Array Parsed With jq to Bash Script Array

I parsed a json file with jq like this :
# cat test.json | jq '.logs' | jq '.[]' | jq '._id' | jq -s
It returns an array like this : [34,235,436,546,.....]
Using bash script i described an array :
# declare -a msgIds = ...
This array uses () instead of [] so when I pass the array given above to this array it won't work.
([324,32,45..]) this causes problem. If i remove the jq -s, an array forms with only 1 member in it.
Is there a way to solve this issue?
We can solve this problem by two ways. They are:
Input string:
// test.json
{
"keys": ["key1","key2","key3"]
}
Approach 1:
1) Use jq -r (output raw strings, not JSON texts) .
KEYS=$(jq -r '.keys' test.json)
echo $KEYS
# Output: [ "key1", "key2", "key3" ]
2) Use #sh (Converts input string to a series of space-separated strings). It removes square brackets[], comma(,) from the string.
KEYS=$(<test.json jq -r '.keys | #sh')
echo $KEYS
# Output: 'key1' 'key2' 'key3'
3) Using tr to remove single quotes from the string output. To delete specific characters use the -d option in tr.
KEYS=$((<test.json jq -r '.keys | #sh')| tr -d \')
echo $KEYS
# Output: key1 key2 key3
4) We can convert the comma-separated string to the array by placing our string output in a round bracket().
It also called compound Assignment, where we declare the array with a bunch of values.
ARRAYNAME=(value1 value2 .... valueN)
#!/bin/bash
KEYS=($((<test.json jq -r '.keys | #sh') | tr -d \'\"))
echo "Array size: " ${#KEYS[#]}
echo "Array elements: "${KEYS[#]}
# Output:
# Array size: 3
# Array elements: key1 key2 key3
Approach 2:
1) Use jq -r to get the string output, then use tr to delete characters like square brackets, double quotes and comma.
#!/bin/bash
KEYS=$(jq -r '.keys' test.json | tr -d '[],"')
echo $KEYS
# Output: key1 key2 key3
2) Then we can convert the comma-separated string to the array by placing our string output in a round bracket().
#!/bin/bash
KEYS=($(jq -r '.keys' test.json | tr -d '[]," '))
echo "Array size: " ${#KEYS[#]}
echo "Array elements: "${KEYS[#]}
# Output:
# Array size: 3
# Array elements: key1 key2 key3
To correctly parse values that have spaces, newlines (or any other arbitrary characters) just use jq's #sh filter and bash's declare -a. (No need for a while read loop or any other pre-processing)
// foo.json
{"data": ["A B", "C'D", ""]}
str=$(jq -r '.data | #sh' foo.json)
declare -a arr="($str)" # must be quoted like this
$ declare -p arr
declare -a arr=([0]="A B" [1]="C'D" [2]="")
The reason that this works correctly is that #sh will produce a space-separated list of shell-quoted words:
$ echo "$str"
'A B' 'C'\''D' ''
and this is exactly the format that declare expects for an array definition.
Use jq -r to output a string "raw", without JSON formatting, and use the #sh formatter to format your results as a string for shell consumption. Per the jq docs:
#sh:
The input is escaped suitable for use in a command-line for a POSIX shell. If the input is an array, the output will be a series of space-separated strings.
So can do e.g.
msgids=($(<test.json jq -r '.logs[]._id | #sh'))
and get the result you want.
From the jq FAQ (https://github.com/stedolan/jq/wiki/FAQ):
𝑸: How can a stream of JSON texts produced by jq be converted into a bash array of corresponding values?
A: One option would be to use mapfile (aka readarray), for example:
mapfile -t array <<< $(jq -c '.[]' input.json)
An alternative that might be indicative of what to do in other shells is to use read -r within a while loop. The following bash script populates an array, x, with JSON texts. The key points are the use of the -c option, and the use of the bash idiom while read -r value; do ... done < <(jq .......):
#!/bin/bash
x=()
while read -r value
do
x+=("$value")
done < <(jq -c '.[]' input.json)
++ To resolve this, we can use a very simple approach:
++ Since I am not aware of you input file, I am creating a file input.json with the following contents:
input.json:
{
"keys": ["key1","key2","key3"]
}
++ Use jq to get the value from the above file input.json:
Command: cat input.json | jq -r '.keys | #sh'
Output: 'key1' 'key2' 'key3'
Explanation: | #sh removes [ and "
++ To remove ' ' as well we use tr
command: cat input.json | jq -r '.keys | #sh' | tr -d \'
Explanation: use tr delete -d to remove '
++ To store this in a bash array we use () with `` and print it:
command:
KEYS=(`cat input.json | jq -r '.keys | #sh' | tr -d \'`)
To print all the entries of the array: echo "${KEYS[*]}"

create arrays from for loop output

I'm trying to understand what I'm doing wrong here, but can't seem to determine the cause. I would like to create a set of arrays from an output for a for loop in bash. Below is the code I have so far:
for i in `onedatastore list | grep pure02 | awk '{print $1}'`;
do
arr${i}=($(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:)) ;
echo "Output of arr${i}: ${arr${i}[#]}" ;
done
The output for the condition is as such:
107
108
109
What I want to do is based on these unique IDs is create arrays:
arr107
arr108
arr109
The arrays will have data like such in each:
[oneadmin#opennebula/]$ arr107=($(onedatastore show 107 | sed 's/[A-Z]://' | cut -f2 -d\:))
[oneadmin#opennebula/]$ echo ${arr107[#]}
DATASTORE 107 INFORMATION 107 pure02_vm_datastore_1 oneadmin oneadmin 0 IMAGE vcenter vcenter /var/lib/one//datastores/107 FILE READY DATASTORE CAPACITY 60T 21.9T 38.1T - PERMISSIONS um- u-- --- DATASTORE TEMPLATE CLONE_TARGET="NONE" DISK_TYPE="FILE" DS_MAD="vcenter" LN_TARGET="NONE" RESTRICTED_DIRS="/" SAFE_DIRS="/var/tmp" TM_MAD="vcenter" VCENTER_CLUSTER="CLUSTER01" IMAGES
When I try this in the script section though I get output errors as such:
./test.sh: line 6: syntax error near unexpected token `$(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:)'
I can't seem to figure out the syntax to use on this scenario.
In the end what I want to do is be able to compare different datastores and based on which on has more free space, deploy VMs to it.
Hope someone can help. Thanks
You can use the eval (potentially unsafe) and declare (safer) commands:
for i in $(onedatastore list | grep pure02 | awk '{print $1}');
do
declare "arr$i=($(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:))"
eval echo 'Output of arr$i: ${arr'"$i"'[#]}'
done
readarray or mapfile, added in bash 4.0, will read directly into an array:
while IFS= read -r i <&3; do
readarray -t "arr$i" < <(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d:)
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
Better, back through bash 3.x, one can use read -a to read to an array:
shopt -s pipefail # cause pipelines to fail if any element does
while IFS= read -r i <&3; do
IFS=$'\n' read -r -d '' -a "arr$i" \
< <(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d: && printf '\0')
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
Alternately, one can use namevars to create an alias for an array with an arbitrarily-named array in bash 4.3:
while IFS= read -r i <&3; do
declare -a "arr$i"
declare -n arr="arr$i"
# this is buggy: expands globs, string-splits on all characters in IFS, etc
# ...but, well, it's what the OP is asking for...
arr=( $(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d:) )
done 3< <(onedatastore list | awk '/pure02/ {print $1}')

Iterating over lines (w/ numbers) read from a file to an array in bash

I'm trying to write a small script that will take the 4th columns of a file and store it in an array then do a little comparison. If the element in the array is greater than 0 and less than 500 I have to increment the counter. However when I run the script the counter always shows 0. Here's my script
#!/bin/bash
mapfile -t my_array < <(cat file1.txt | awk '{ print $4 }' > test.txt)
COUNTER=0
for i in ${my_array[#]}; do
if [["${my_array[$i]}" -gt 0 -a "${my_array[$i]}" -lt 500 ]]
then
COUNTER=$((COUNTER + 1))
fi
printf "%s\t%s\n" "%i" "${my_array[$i]}"//just to test if the mapfile command is working
done
echo $COUNTER
output:
./script1.bash
0
#!/bin/bash
mapfile -t my_array < <(awk '{ print $4 }' file1.txt | tee test.txt)
COUNTER=0
for idx in "${!my_array[#]}"; do
value=${my_array[$idx]}
if (( value > 0 )) && (( value < 500 )); then
COUNTER=$((COUNTER + 1))
fi
printf "%s\t%s\n" "$idx" "$value"
done
echo "$COUNTER"
The use of cat here is needless: It added nothing but inefficiency (requiring an extra process to be started, and forcing awk to read from a pipe rather than direct from a file).
mapfile had nothing to read because the output of awk was redirected to test.txt. If you want it to go to both a file and stdout, then you need to use tee.
-a is not valid in [[ ]]; use && instead there. However, since you're doing only arithmetic, (( )) is more appropriate. Incidentally, -a is officially marked obsolescent even for [ ] and test; see the current POSIX standard.
${my_array[#]} iterates over values. If you want to iterate over indexes, you need ${!my_array[#]} instead.
Whitespace is mandatory in separating command names. [["$foo" is a different command from [[, unless $foo is empty or starts with a character in $IFS.
If you redirect the output to a file: > test.txt then there is no output in "standard output" because it is consumed by the file. So, first, you need to remove that redirection. You may use:
mapfile -t my_array < <(cat file1.txt | awk '{ print $4 }' )
But since awk could perfectly well read a file, this is better:
mapfile -t my_array < <(awk '{ print $4 }' file1.txt)
And since you are using awk, it could do the comparison to 0 and 500 and output the whole count.
counter=$(awk '{if($4>0 && $4<500){c++}}END{print c}' file1.txt)
echo "$counter"
Simpler, faster.
That will also avoid some simple mistakes in your script, like missing an space in the […] construct:
if [[ "${my … # NOT "if [["${my …"
And some missing quotes:
for i in "${my_array[#]}" # NOT for i in ${my_array[#]}
In general, it is a good idea to check your script with ShellCheck.net to remove some simple mistakes.

Resources