How to split string to array with specific word in bash - arrays

I have a string after I do a command:
[username#hostname ~/script]$ gsql ls | grep "Graph graph_name"
- Graph graph_name(Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e)
Then I do
IFS=", " read -r -a vertices <<< "$(gsql use graph ifgl ls | grep "Graph ifgl(" | cut -d "(" -f2 | cut -d ")" -f1)" to make the string splitted and append to array. But, what I want is to split it by delimiter ", " then append each word that contain ":v" to an array, its mean word that contain ":e" will excluded.
How to do it? without do a looping

Like this, using grep
mapfile -t array < <(gsql ls | grep "Graph graph_name" | grep -oP '\b\w+:v')
The regular expression matches as follows:
Node
Explanation
\b
the boundary between a word char (\w) and something that is not a word char
\w+
word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
:v
':v'

This bash script should work:
declare arr as array variable
arr=()
# use ", " as delimiter to parse the input fed through process substituion
while read -r -d ', ' val || [[ -n $val ]]; do
val="${val%)}"
val="${val#*\(}"
[[ $val == *:v ]] && arr+=("$val")
done < <(gsql ls | grep "Graph graph_name")
# check array content
declare -p arr
Output:
declare -a arr='([0]="Vertice_1:v" [1]="Vertice_2:v" [2]="Vertice_3:v" [3]="Vertice_4:v")'

Since there is a condition per element the logical way is to use a loop. There may be ways to do it, but here is a solution with a for loop:
#!/bin/bash
input="Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e"
input="${input//,/ }" #replace , with SPACE (bash array uses space as separator)
inputarray=($input)
outputarray=()
for item in "${inputarray[#]}"; do
if [[ $item =~ ":v" ]]; then
outputarray+=($item) #append the item to the output array
fi
done
echo "${outputarray[#]}"
will give output: Vertice_1:v Vertice_2:v Vertice_3:v Vertice_4:v
since the elements don't have space in them this works

Related

How to parse and convert string list to JSON string array in shell command?

How to parse and convert string list to JSON string array in shell command?
'["test1","test2","test3"]'
to
test1
test2
test3
I tried like below:
string=$1
array=${string#"["}
array=${array%"]"}
IFS=',' read -a array <<< $array;
echo "${array[#]}"
Any other optimized way?
As bash and jq are tagged, this solution relies on both (without summoning eval). The input string is expected to be in $string, the output array is generated into ${array[#]}. It is robust wrt spaces, newlines, quotes, etc. as it uses NUL as delimiter.
mapfile -d '' array < <(jq -j '.[] + "\u0000"' <<< "$string")
Testing
string='["has spaces\tand tabs","has a\nnewline","has \"quotes\""]'
mapfile -d '' array < <(jq -j '.[] + "\u0000"' <<< "$string")
printf '==>%s<==\n' "${array[#]}"
==>has spaces and tabs<==
==>has a
newline<==
==>has "quotes"<==
eval "array=($( jq -r 'map( #sh ) | join(" ")' <<<"$json" ))"

How to replace a word from a file with elements in an array

I am trying to replace all the word "null" to elements in array. The problem is that after replacing one word of "null", I would like to replace the next "null" with next element in the array.
I am not very good with bash and I feel like this is quite a basic question.
Here is what I have so far:
for m in $(cat finalfile.csv)
do
if [ "$m" = "null" ]
then
m=cwearray[$counter]
let counter++
fi
done
This doesn't replace anything in the finalfile.csv.
For example if the file has:
"value1","value2","null","value3"\n
"value1","value2","null","value3"...
and the array has ["foo","bar"]
I would like it to be:
"value1","value2","foo","value3"\n
"value1","value2","bar","value3"...
can be done with bash, even with multiple nulls per line:
$ cat finalfile.csv
"value1","value2","null","null"
"value1","value2","null","value3"
$ cwearray=( foo bar baz )
$ idx=0
$ while read -r line; do
while [[ $line == *null* ]]; do
line=${line/null/${cwearray[idx++]}}
# ...............^^^^^^^^^^^^^^^^^^
# replace the _first_ "null" with the _next_ array element
done
echo "$line"
done < finalfile.csv > updatedfinalfile.csv
$ cat updatedfinalfile.csv
"value1","value2","foo","bar"
"value1","value2","baz","value3"
It's easier in Perl where you can increase the index directly in the replacement part of a substitution:
printf '%s\n' 1,2,3,null null,2,3,4 null,null,null,null \
| perl -pe 'BEGIN { #cwe = qw( A B C D E F ) }
s/(?:^|(?<=,))null(?=,|$)/$cwe[$i++]/g'
Update: It seems you've updated your question with a sample input. If nulls are double quoted, it gets even easier, as there's no need to check whether they're surrounded with commas or beginning/end of the line.
perl -pe 'BEGIN{ #cwe = qw( foo bar ) }
s/"null"/"$cwe[$i++]"/g'
An awk solution :
declare -a cwearray
cwearray=(foo bar)
awk -F, 'NR==FNR{repl[NR]=$0; next}{for(i=1;i<=NF;i++){if($i=="\"null\""){$i="\""repl[++counter]"\""}}}1' OFS="," <(for i in "${cwearray[#]}"; do echo "$i"; done) <file>
Read the file line by line. If a line contains null, then use sed to replace all occurrences of null with the corresponding value, retrieved via array index.
#!/bin/bash
file="finalfile.csv"
counter=0
array=(
"foo"
"bar"
)
while read -r line; do
item="${array[$counter]}"
echo "$line" | sed "s/null/$item/g"
((counter++))
done < "$file"

Picking valid IPs from an array of strings

in my usecase i'm filtering certain IPv4s from the list and putting them into array for further tasks:
readarray -t firstarray < <(grep -ni '^ser*' IPbook.file | cut -f 2 -d "-")
As a result the output is:
10.8.61.10
10.0.10.15
172.0.20.30
678.0.0.10
As you see the last row is not an IP, therefore i faced an urge to add some regex check on the FIRSTARRAY.
I do not want to save a collateral files to work with them, so i'm looking for some "on-the-fly" option to regex the firstarray. I tried the following:
for X in "${FIRSTARRAY[#]}"; do
readarray -t SECONDARRAY < <(grep -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' "$X")
done
But in the output I see that system thinks that $X is a file/dir, and didn't process the value, even though it clearly sees it:
line ABC: 172.0.20.30: No such file or directory
line ABC: 678.0.0.10: No such file or directory
What am I doing wrong and what would be the best approach to proceed?
You are passing "$X" as an argument to grep and hence it is being treated as a file. Use herestring <<< instead:
for X in "${firstarray[#]}"; do
readarray -t secondarray < <(grep -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' <<< "$X")
done
You are better off writing a function to validate the IP instead of relying on just a regex match:
#!/bin/bash
validate_ip() {
local arr element
IFS=. read -r -a arr <<< "$1" # convert ip string to array
[[ ${#arr[#]} != 4 ]] && return 1 # doesn't have four parts
for element in "${arr[#]}"; do
[[ $element =~ ^[0-9]+$ ]] || return 1 # non numeric characters found
[[ $element =~ ^0[1-9]+$ ]] || return 1 # 0 not allowed in leading position if followed by other digits, to prevent it from being interpreted as on octal number
((element < 0 || element > 255)) && return 1 # number out of range
done
return 0
}
And then loop through your array:
for x in "${firstarray[#]}"; do
validate_ip "$x" && secondarray+=("$x") # add to second array if element is a valid IP
done
The problem is, that you passing an argument to the grep command and it expects reading standard input instead.
You can use your regex to filter the IP addresses right in the first command:
readarray -t firstarray < <(grep -ni '^ser*' IPbook.file | cut -f 2 -d "-" | grep -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' )
Then you have only IP addresses in firstarray.

Comment out items that do not match pattern in array

I have a log file I am trying to comment out lines that do not match my array. I did successfully learn how to create an array and I can echo out the array items but I am having trouble taking anything that doesn't match my array and adding something in front of it. Here is my code, if you have suggestions on another path or ways I can make it better:
for itsSaturday in $(find "$LOCATION" -mindepth 1 -maxdepth 1 -name "*.log" ); do
TEMPFILE="$itsSaturday.$$"
declare -a someArray=( "breakfast" "scrambled eggs" "Bloody Mary" )
theCall='some_additional_text_'
commentOn="## You_need_"
for arrayItem in "${someArray[#]}"; do
merged="$theCall$arrayItem"
if ! grep -q "$merged" "$itsSaturday"; then
sed -e '/$merged/! s:$commentOn$theCall::g' "$itsSaturday" > $TEMPFILE && mv $TEMPFILE "$itsSaturday"
fi
done
done
file:
some_additional_text_breakfast
some_additional_text_bacon
some_additional_text_scrambled eggs
some_additional_text_Bloody Mary
some_additional_text_orange juice
some_additional_text_breakfast
file into:
some_additional_text_breakfast
## You_need_some_additional_text_bacon
some_additional_text_scrambled eggs
some_additional_text_Bloody Mary
## You_need_some_additional_text_orange juice
some_additional_text_breakfast
How can I add a variable before items that do not match my array?
I don't like doing this using bash and sed, but I think the following might be enough:
#! /bin/bash
declare -a someArray=( "breakfast" "scrambled eggs" "Bloody Mary" )
theCall='some_additional_text_'
commentOn="## You_need_"
OIFS="$IFS"
IFS='|' mergedLines="${someArray[*]/#/$theCall}"
IFS="$OIFS"
for i in *.txt
do
TEMPFILE="$i.$$"
sed -r "/$mergedLines/!s/^/$commentOn/" "$i" >> "$TEMPFILE"
done
I shifted the array and other constants out of the loop.
"${someArray[*]/#/$theCall}" uses bash string substitution to append the contents of $theCall to every element in the array.
IFS='|' mergedLines="${someArray[*]} is a convenient trick to combine the elements of an array into a pipe-separated string.
Combined, (2) and (3) get me
some_additional_text_breakfast|some_additional_text_scrambled eggs|some_additional_text_Bloody Mary
in mergedLines.
Then it's just a matter of using extended regular expressions in sed (for |) and replacing non-matching lines.
Your sed pattern used single quotes, so the variables within were not expanded.
Try replacing the inner for-loop with:
PROG=$(printf '%s\n' "${COMMENT[#]}" | while read comment ; do
/bin/echo -n '$0 !~ /'"$comment"'$/ && '
done
echo '1 { printf commentOn } ; { print }')
awk -v commentOn="$commentOn" "$PROG" $itsSaturday > $TEMPFILE && mv $TEMPFILE $itsSaturday
On each file, this creates an awk program that does the work.

Check if each element of an array is present in a string in bash, ignoring certain characters and order

On the web I found answers to find if an element of array is present in the string. But I want to find if each element in the array is present in the string.
eg. str1 = "This_is_a_big_sentence"
Initially str2 was like
str2 = "Sentence_This_big"
Now I wanted to search if string str1 contains "sentence"&"this"&"big" (All 3, ignore alphabetic order and case)
So I used arr=(${str2//_/ })
How do i proceed now, I know comm command finds intersection, but it needs a sorted list, also I need to ignore _ underscores.
I get my str2 by finding the extension of a particular type of file using the command
for i in `ls snooze.*`; do echo $i | cut -d "." -f2
# Till here i get str2 and need to check as mentioned above. Not sure how to do this, i tried putting str2 as array and now just need to check if all elements of my array occur in str1 (ignore case,order)
Any help would be highly appreciated. I did try to use This link
Now I wanted to search if string a contains "sentence"&"this"&"big"
(All 3, ignore alphabatic order and case)
Here is one approach:
#!/bin/bash
str1="This_is_a_big_sentence"
str2="Sentence_This_big"
if ! grep -qvwFf <(sed 's/_/\n/g' <<<${str1,,}) <(sed 's/_/\n/g' <<<${str2,,})
then
echo "All words present"
else
echo "Some words missing"
fi
How it works
${str1,,} returns the string str1 with all capitals replaced by lower case.
sed 's/_/\n/g' <<<${str1,,} returns the string str1, all converted to lower case and with underlines replaced by new lines so that each word is on a new line.
<(sed 's/_/\n/g' <<<${str1,,}) returns a file-like object containing all the words in str1, each word lower case and on a separate line.
The creation of file-like objects is called process substitution. It allows us, in this case, to treat the output of a shell command as if it were a file to read.
<(sed 's/_/\n/g' <<<${str2,,}) does the same for str2.
Assuming that file1 and file2 each have one word per line, grep -vwFf file1 file2 removes from file2 every occurrence of a word in file2. If there are no words left, that means that every word in file2 appears in file1.
By adding the option -q, grep will return no output but will set an exit code that we can use in our if statement.
In the actual command, file1 and file2 are replaced by our file-like objects.
The remaining grep options can be understood as follows:
-w tells grep to look for whole words only.
-F tells grep to look for fixed strings, not regular expressions.
-f tells grep to look for the patterns to match in the file (or file-like object) which follows.
-v tells grep to remove (the default is to keep) the words which match.
Here is an awk solution to check existence of all the words from a string in another string:
str1="This_is_a_big_sentence"
str2="Sentence_This_big"
awk -v RS=_ 'FNR==NR{a[tolower($1)]; next} {delete a[tolower($1)]} END{print (length(a)) ? "Not all words" : "All words"}' <(echo "$str2") <(echo "$str1")
With indentation:
awk -v RS=_ 'FNR==NR {
a[tolower($1)];
next
}
{ delete a[tolower($1)] }
END {
print (length(a)) ? "Not all words" : "All words"
}' <(echo "$str2") <(echo "$str1")
Explanation:
-v RS=_ We use record separator as _
FNR==NR - Execute this block for str2
a[tolower($1)]; next - Populate an array a with each lowercase word as key
{delete a[tolower($1)]} - For each word in str1 delete key in array a
END - If length of array a is still not 0 then there are some words left.
Here's another solution:
#!/bin/bash
str1="This_is_a_big_sentence"
str2="sentence_This_big"
var=0
var2=0
while read in
do
if [ $(echo $str1 | grep -ioE $in) ]
then
var=$((var+1))
fi
var2=$((var2+1))
done < <(echo $str2 | sed -e 's/\(.*\)/\L\1/' -e 's/_/\n/g')
if [[ $var -eq $var2 && $var -ne 0 ]]
then
echo "matched"
else
echo "not matched"
What this script does make str2 all lower case with sed -e 's/\(.*\)/\L\1/' which is a substitution of any character with its lower case, then replace underscores _ with return lines \n with the following sed expression: sed -e 's/_/\n/g', which is another substitution.
Now the individual words are fed into a while loop that compares str1 with the word that was fed in. Every time there's a match, increment var and every time we iterate though the while, we increment var2. If var == var2, then all the words of str2 were found in str1. Hope that helps.
Here's an approach.
if [ "$(echo "This_BIG_senTence" | grep -ioE 'this|big|sentence' | wc -l)" == "3" ]; then echo "matched"; fi
How it works.
grep options -i makes the grep case insensitive, -E for extended regular expressions, and -o separates the matches by line. Now that it is separated by line use wc with -l for line count. Since we had 3 conditions we check if it equals 3. Grep will return the lines where the match occurred, so if you are only working with a string, the example above will return the string for each condition, in this case 3, so there won't be any problems.
Note you can also create a grep chain and see if its empty.
if [ $(echo "This_BIG_SenTence" | grep -i this | grep -i big | grep -i sentence) ]; then echo matched; else echo not_matched; fi
Now I know what you mean. Try this:
#!/bin/bash
# add 4 non-matching examples
> snooze.foo_bar
> snooze.bar_go
> snooze.go_foo
> snooze.no_match
# add 3 matching examples
> snooze.foo_bar_go
> snooze.goXX_XXfoo_XXbarXX
> snooze.bar_go_foo_Ok
str1=("foo" "bar" "go")
for i in `ls snooze.*`; do
str2=${i#snooze.}
j=0
found=1
while [[ $j -lt ${#str1[#]} ]]; do
if ! echo $str2 | eval grep \${str1[$j]} >& /dev/null; then
found=0
break
fi
((j++))
done
if [[ $found -ne 0 ]]; then
echo Match found: $str2
fi
done
Resulting print of this script:
Match found: bar_go_foo_Ok
Match found: foo_bar_go
Match found: goXX_XXfoo_XXbarXX
alternatively, the if..grep line above can be replaced by
if [[ ! $str2 =~ `eval echo \${str1[$j]}` ]]; then
utilizing bash's regular expression match.
Note: I am not too careful about special characters in the search string, such as "\" or " " (space), which may cause problem.
--- Some explanations ---
In the if .. grep line, $j is first evaluated to the running index, from 0 to the number of elements in $str1 minus 1. Then, eval will re-evaluate the whole grep command again, causing ${str1[jjj]} to be re-evaluated (Here, jjj is the already evaluated index)
The strategy is to set found=1 (found by default), and then when any grep fails, we set found to 0 and break the inner j-loop.
Everything else should be straightforward.

Resources