put input with spaces as a single element in array in bash - arrays

I have a file from which I extract the first three columns using the cut command and write them into an array.
When I check the length of the array , it is giving me four. I need the array to have only 3 elements.
I think it's taking space as the delimiter for array elements.
aaa|111|ADAM|1222|aauu
aaa|222|MIKE ALLEN|5678|gggg
aaa|333|JOE|1222|eeeee
target=($(cut -d '|' -f1-3 sample_file2.txt| sort -u ))

In bash 4 or later, use readarray with process substitution to populate the array. As is, your code cannot distinguish between the whitespace separating each line in the output from the whitespace occurring in "Mike Allen". The readarray command puts each line of the input into a separate array element.
readarray -t target < <(cut -d '|' -f1-3 sample_file2.txt| sort -u)
Prior to bash 4, you need a loop to read each line individually to assign to the array.
while IFS='' read -r line; do
target+=("$line")
done < <(cut -d '|' -f1-3 sample_file2.txt | sort -u)

This should work:
IFS=$'\n' target=($(cut -d '|' -f1-3 sample_file2.txt| sort -u ))
Example:
#!/bin/bash
IFS=$'\n' target=($(cut -d '|' -f1-3 sample_file2.txt| sort -u ))
echo ${#target[#]}
echo "${target[1]}"
Output:
3
aaa|222|MIKE ALLEN

As an alternative, using the infamous eval,
eval target=($(cut -sd '|' -f1-3 sample_file2.txt | sort -u | \
xargs -d\\n printf "'%s'\n"))

Related

How to split string to array with specific word in bash

I have a string after I do a command:
[username#hostname ~/script]$ gsql ls | grep "Graph graph_name"
- Graph graph_name(Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e)
Then I do
IFS=", " read -r -a vertices <<< "$(gsql use graph ifgl ls | grep "Graph ifgl(" | cut -d "(" -f2 | cut -d ")" -f1)" to make the string splitted and append to array. But, what I want is to split it by delimiter ", " then append each word that contain ":v" to an array, its mean word that contain ":e" will excluded.
How to do it? without do a looping
Like this, using grep
mapfile -t array < <(gsql ls | grep "Graph graph_name" | grep -oP '\b\w+:v')
The regular expression matches as follows:
Node
Explanation
\b
the boundary between a word char (\w) and something that is not a word char
\w+
word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
:v
':v'
This bash script should work:
declare arr as array variable
arr=()
# use ", " as delimiter to parse the input fed through process substituion
while read -r -d ', ' val || [[ -n $val ]]; do
val="${val%)}"
val="${val#*\(}"
[[ $val == *:v ]] && arr+=("$val")
done < <(gsql ls | grep "Graph graph_name")
# check array content
declare -p arr
Output:
declare -a arr='([0]="Vertice_1:v" [1]="Vertice_2:v" [2]="Vertice_3:v" [3]="Vertice_4:v")'
Since there is a condition per element the logical way is to use a loop. There may be ways to do it, but here is a solution with a for loop:
#!/bin/bash
input="Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e"
input="${input//,/ }" #replace , with SPACE (bash array uses space as separator)
inputarray=($input)
outputarray=()
for item in "${inputarray[#]}"; do
if [[ $item =~ ":v" ]]; then
outputarray+=($item) #append the item to the output array
fi
done
echo "${outputarray[#]}"
will give output: Vertice_1:v Vertice_2:v Vertice_3:v Vertice_4:v
since the elements don't have space in them this works

How to parse and convert string list to JSON string array in shell command?

How to parse and convert string list to JSON string array in shell command?
'["test1","test2","test3"]'
to
test1
test2
test3
I tried like below:
string=$1
array=${string#"["}
array=${array%"]"}
IFS=',' read -a array <<< $array;
echo "${array[#]}"
Any other optimized way?
As bash and jq are tagged, this solution relies on both (without summoning eval). The input string is expected to be in $string, the output array is generated into ${array[#]}. It is robust wrt spaces, newlines, quotes, etc. as it uses NUL as delimiter.
mapfile -d '' array < <(jq -j '.[] + "\u0000"' <<< "$string")
Testing
string='["has spaces\tand tabs","has a\nnewline","has \"quotes\""]'
mapfile -d '' array < <(jq -j '.[] + "\u0000"' <<< "$string")
printf '==>%s<==\n' "${array[#]}"
==>has spaces and tabs<==
==>has a
newline<==
==>has "quotes"<==
eval "array=($( jq -r 'map( #sh ) | join(" ")' <<<"$json" ))"

Bash. Split text to array by delimiter [duplicate]

This question already has answers here:
How to split a string into an array in Bash?
(24 answers)
Closed 1 year ago.
Can somebody help me out. I want to split TEXT(variable with \n) into array in bash.
Ok, I have some text-variable:
variable='13423exa*lkco3nr*sw
kjenve*kejnv'
I want to split it in array.
If variable did not have new line in it, I will do it by:
IFS='*' read -a array <<< "$variable"
I assumed the third element should be:
echo "${array[2]}"
>sw
>kjenve
But with new line it is not working. Please give me right direction.
Use readarray.
$ variable='13423exa*lkco3nr*sw
kjenve*kejnv'
$ readarray -d '*' -t arr < <(printf "%s" "$variable")
$ declare -p arr
declare -a arr=([0]="13423exa" [1]="lkco3nr" [2]=$'sw\nkjenve' [3]="kejnv")
mapfile: -d: invavlid option
Update bash, then use readarray.
If not, replace separator with zero byte and read it element by element with read -d ''.
arr=()
while IFS= read -d '' -r e || [[ -n "$e" ]]; do
arr+=("$e")
done < <(printf "%s" "$variable" | tr '*' '\0');
declare -p arr
declare -a arr=([0]="13423exa" [1]="lkco3nr" [2]=$'sw\nkjenve' [3]="kejnv")
You can use the readarray command and use it like in the following example:
readarray -d ':' -t my_array <<< "a:b:c:d:"
for (( i = 0; i < ${#my_array[*]}; i++ )); do
echo "${my_array[i]}"
done
Where the -d parameter defines the delimiter and -t ask to remove last delimiter.
Use a ending character different than new line
end=.
read -a array -d "$end" <<< "$v$end"
Of course this solution suppose there is at least one charecter not used in your input variable.

Using "comm" to find matches between two arrays

I have two arrays, I am trying to find matching values using comm. Array1 contains some additional information in each element that I strip out for the comparison. However, I would like to keep that information after the comparison is complete.
For example:
Array1=("abc",123,"hello" "def",456,"world")
Array2=("abc")
declare -a Array1
declare -a Array2
I then compare the two arrays:
oldIFS=$IFS IFS=$'\n\t'
array3=($(comm -12 <(echo "${Array1[*]}" | awk -F "," {'print $1'} | sort) <(echo "${Array2[*]}" | sort)))
IFS=$oldIFS
Which finds the match of abc:
echo ${test3[0]}
abc
However what I want is remaining values from array1 that were not part of my comm statement.
abc,123,hello
EDIT: For more clarification
The arrays in this example are populated with dummy data.
My real example is pulling information from server logs which I am saving into array1. array1 contains (userIDs,hostIPs,count) that I want to cross reference against a list of userID's (array2). My goal is to find out what userIDs exsist in array1 and array2 and save those ID's with the additional information from array1 (hostIPs,count) into array3
array1 is populated from a variable that is is the results of a curl command that generates a splunk search. The data returned looks like this:
"uniqueID=<ID>","<IP>","<hostname>",1
I save the results of the splunk report as $splunk, and then decalare array1 with the results of $splunk - the header information since the results come back in csv format
array1=( $(echo $splunk | sed 's/ /\n/g' | sed 1d) )
array2 is generated from a master file that I have stored locally. That contains all the application ID's in our ecosystem. For example
uid=<ID>
I cat the contents of the master file into array2
array2=( $(cat master.txt) )
I then want to find what IDs from array1 exsist in array2 and save that as array3. This requires some massaging of the data in array1 to make it match the format of array2.
oldIFS=$IFS IFS=$'\n\t'
array3=($(comm -12 <(echo "${array1[*]}" | sed 's/ /\n/g' | awk -F "\"," {'print $1'} | sed 's/\"//g' | sed 's/|/ /g' | awk -F$'=' -v OFS=$'=' '{ $1 = "uid" }1' | grep -i "OU=People" | sed 's/OU/ou/g' | sort) <(echo "${array2[*]}" | sort)))
IFS=$oldIFS
array 3 will then contain lines that match in both arrays
uid=<ID>
uid=<ID>
However I am looking for something more along the line of
"uid=<ID>","<IP>","<hostname>",1
"uid=<ID>","<IP>","<hostname>",1
I would do it like this:
join -t, \
<(printf '%s\n' "${Array1[#]}" | sort -t, -k1,1) \
<(printf '%s\n' "${Array2[#]}" | sort)
Use the join command with , as the field delimiter. The first "file" is the first array, one element per line, sorted on the first field (comma delimited); the second "file" is the second array, one element per line, sorted.
The output will be every line where the first element of the first file matches the element from the second file; for the example input it's
abc,123,hello
This makes only one assumption, namely that no array element contains a newline. To make it more robust (assuming GNU Coreutils), we can use NUL as the delimiter:
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
This prints the output separated by NUL as well; to read the result into an array, we can use readarray:
readarray -d '' -t Array3 < <(
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
)
readarray -d requires Bash 4.4 or newer. For older Bash, you can use a loop:
while IFS= read -r -d '' element; do
Array3+=("$element")
done < <(
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
)
I don't know how to do this with comm, but I do have a solution for you with sed and grep. The following commands match on the regex uid=X,, where the string/array is in the form of uid=x or (uid=x uid=y) respectively.
# Array 2 (B) is a string
$ A=("uid=1,10.10.10.1,server1,1" "uid=2,10.10.10.2,server2,1")
$ B="uid=1"
$ echo ${A[#]} | grep -oE "([^ ]*${B},[^ ]*)"
uid=1,10.10.10.1,server1,1
# Array 2 (D) is an array
$ C=(${A[#]} "uid=3,10.10.10.3,server3,1" "uid=4,10.10.10.4,server4,1")
$ D=(${B} "uid=3")
$ echo ${C[*]} | grep -oE "([^ ]*($(echo ${D[#]} | sed 's/ /,|/g'))[^ ]*)"
uid=1,10.10.10.1,server1,1
uid=3,10.10.10.3,server3,1
# Content of arrays
$ echo ${A[#]}
uid=1,10.10.10.1,server1,1 uid=2,10.10.10.2,server2,1
$ echo ${B}
uid=1
$ echo ${C[#]}
uid=1,10.10.10.1,server1,1 uid=2,10.10.10.2,server2,1 uid=3,10.10.10.3,server3,1 uid=4,10.10.10.4,server4,1
$ echo ${D[#]}
uid=1 uid=3

Getting the output of a cat command into an array

How can I get just the filenames into an array using the cat command?
How I've been trying:
array=()
while IFS= read -r -d $'\0'; do
array+=("$REPLY")
done < <(cat /proc/swaps | grep "swap")
This either grabs all the information from the output into an array, or just doesn't work. How can I successfully get my expected output of [/swapfile, /dev/hda1, /some/other/swap] into an array form using the cat command?
readarray array < <(awk '/swap/{print $1}' /proc/swaps)
Bash introduced readarray in version 4 which can take the place of the while read loop. readarray is the solution you want.
here is the syntax
readarray variable < inputfile
echo "${variable[0]}" ' to print the first element in array

Resources