Picking valid IPs from an array of strings - arrays

In my use case I'm filtering certain IPv4 addresses from a list and putting them into an array for further tasks:
readarray -t firstarray < <(grep -ni '^ser*' IPbook.file | cut -f 2 -d "-")
As a result the output is:
10.8.61.10
10.0.10.15
172.0.20.30
678.0.0.10
As you can see, the last row is not a valid IP, so I want to add a regex check on firstarray.
I do not want to save intermediate files, so I'm looking for an "on-the-fly" way to apply the regex to firstarray. I tried the following:
for X in "${firstarray[@]}"; do
readarray -t SECONDARRAY < <(grep -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' "$X")
done
But in the output I see that system thinks that $X is a file/dir, and didn't process the value, even though it clearly sees it:
line ABC: 172.0.20.30: No such file or directory
line ABC: 678.0.0.10: No such file or directory
What am I doing wrong and what would be the best approach to proceed?

You are passing "$X" as an argument to grep and hence it is being treated as a file. Use herestring <<< instead:
secondarray=()
for X in "${firstarray[@]}"; do
  # append on a match; calling readarray here would overwrite secondarray on every pass
  grep -qE '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' <<< "$X" && secondarray+=("$X")
done
You are better off writing a function to validate the IP instead of relying on just a regex match:
#!/bin/bash
validate_ip() {
  local arr element
  IFS=. read -r -a arr <<< "$1" # convert ip string to array
  [[ ${#arr[@]} != 4 ]] && return 1 # doesn't have four parts
  for element in "${arr[@]}"; do
    [[ $element =~ ^[0-9]+$ ]] || return 1 # non-numeric characters found
    [[ $element =~ ^0[0-9]+$ ]] && return 1 # 0 not allowed in leading position if followed by other digits, to prevent it from being interpreted as an octal number
    ((element < 0 || element > 255)) && return 1 # number out of range
  done
  return 0
}
And then loop through your array:
for x in "${firstarray[@]}"; do
validate_ip "$x" && secondarray+=("$x") # add to second array if element is a valid IP
done
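As a rough, self-contained illustration of the same validate-then-append idea, the sketch below uses a single regex with capture groups instead of splitting on dots. The helper name is_ipv4 and the array names candidates and valid are hypothetical; the sample addresses are the ones from the question. Unlike the function above, this variant accepts leading zeros (it reads "010" as decimal 10), so treat it as an alternative under that assumption rather than a drop-in replacement.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for dotted-quad strings whose octets are 0-255.
is_ipv4() {
  local re='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$'
  [[ $1 =~ $re ]] || return 1              # wrong shape (not four 1-3 digit groups)
  local octet
  for octet in "${BASH_REMATCH[@]:1}"; do  # the four captured groups
    # 10# forces base 10, so "08" is not rejected as a malformed octal number
    (( 10#$octet <= 255 )) || return 1
  done
}

candidates=(10.8.61.10 10.0.10.15 172.0.20.30 678.0.0.10)
valid=()
for ip in "${candidates[@]}"; do
  is_ipv4 "$ip" && valid+=("$ip")          # keep only the addresses that pass
done
printf '%s\n' "${valid[@]}"
```

Only the first three addresses end up in valid, because 678 is out of range.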

The problem is that you are passing the value to grep as an argument, so grep treats it as a file name. To match against the value itself, grep has to read it from standard input instead.
You can use your regex to filter the IP addresses right in the first command:
readarray -t firstarray < <(grep -ni '^ser*' IPbook.file | cut -f 2 -d "-" | grep -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' )
Then you have only IP addresses in firstarray.
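A variation on the same single-pipeline idea, sketched here with printf standing in for the grep/cut stage on IPbook.file so that it runs on its own. It assumes each candidate sits alone on its line, which lets grep -x require the whole line to match. The variable names octet and ips are made up for the example.

```bash
#!/usr/bin/env bash
# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'

# printf stands in for: grep -ni '^ser*' IPbook.file | cut -f 2 -d "-"
readarray -t ips < <(
  printf '%s\n' 10.8.61.10 10.0.10.15 172.0.20.30 678.0.0.10 |
    grep -Ex "$octet(\.$octet){3}"   # -x: the pattern must match the entire line
)
declare -p ips
```

Anchoring with -x avoids relying on \b, which can match in the middle of a longer dotted string.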

Related

How to split string to array with specific word in bash

I have a string after I do a command:
[username@hostname ~/script]$ gsql ls | grep "Graph graph_name"
- Graph graph_name(Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e)
Then I run
IFS=", " read -r -a vertices <<< "$(gsql use graph ifgl ls | grep "Graph ifgl(" | cut -d "(" -f2 | cut -d ")" -f1)"
to split the string and store the pieces in an array. What I actually want is to split it on the delimiter ", " and then keep only the words that contain ":v", which means the words that contain ":e" are excluded.
How can I do this without a loop?
Like this, using grep
mapfile -t array < <(gsql ls | grep "Graph graph_name" | grep -oP '\b\w+:v')
The regular expression matches as follows:
Node    Explanation
\b      the boundary between a word char (\w) and something that is not a word char
\w+     word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
:v      ':v'
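If grep -P is not available (it is a GNU extension), roughly the same extraction can be sketched with an extended regex. The sample line below is copied from the question and fed in with a here-string in place of the gsql command; the array name vertices is taken from the question. Note that this pattern, like the one above, would also match the start of a longer suffix such as ":vertex".

```bash
#!/usr/bin/env bash
# Sample line copied from the question; in practice it would come from gsql.
line='- Graph graph_name(Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e)'

# -o prints each match on its own line; [[:alnum:]_]+ plays the role of \w+
mapfile -t vertices < <(grep -oE '[[:alnum:]_]+:v' <<< "$line")
declare -p vertices
```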
This bash script should work:
# declare arr as an array variable
arr=()
# read -d uses only the first character of its argument, so "," is the delimiter;
# the input is fed through process substitution
while read -r -d ', ' val || [[ -n $val ]]; do
val="${val%)}"
val="${val#*\(}"
[[ $val == *:v ]] && arr+=("$val")
done < <(gsql ls | grep "Graph graph_name")
# check array content
declare -p arr
Output:
declare -a arr='([0]="Vertice_1:v" [1]="Vertice_2:v" [2]="Vertice_3:v" [3]="Vertice_4:v")'
Since there is a condition per element, the logical way is to use a loop. There may be ways to do it without one, but here is a solution with a for loop:
#!/bin/bash
input="Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e"
input="${input//,/ }" # replace , with a space (the unquoted expansion below splits on whitespace)
inputarray=($input)
outputarray=()
for item in "${inputarray[@]}"; do
  if [[ $item =~ ":v" ]]; then
    outputarray+=("$item") # append the item to the output array
  fi
done
echo "${outputarray[@]}"
This gives the output: Vertice_1:v Vertice_2:v Vertice_3:v Vertice_4:v
This works because the elements contain no spaces.

Bash. Split text to array by delimiter [duplicate]

Can somebody help me out? I want to split a text variable (one that contains \n) into an array in bash.
I have a text variable:
variable='13423exa*lkco3nr*sw
kjenve*kejnv'
I want to split it into an array.
If the variable did not contain a newline, I would do it with:
IFS='*' read -a array <<< "$variable"
I expected the third element to be:
echo "${array[2]}"
>sw
>kjenve
But with a newline it does not work. Please point me in the right direction.
Use readarray.
$ variable='13423exa*lkco3nr*sw
kjenve*kejnv'
$ readarray -d '*' -t arr < <(printf "%s" "$variable")
$ declare -p arr
declare -a arr=([0]="13423exa" [1]="lkco3nr" [2]=$'sw\nkjenve' [3]="kejnv")
If you get the error "mapfile: -d: invalid option", update bash (the -d option needs bash 4.4 or newer), then use readarray.
If you cannot update, replace the separator with a zero byte and read element by element with read -d ''.
arr=()
while IFS= read -d '' -r e || [[ -n "$e" ]]; do
arr+=("$e")
done < <(printf "%s" "$variable" | tr '*' '\0');
declare -p arr
declare -a arr=([0]="13423exa" [1]="lkco3nr" [2]=$'sw\nkjenve' [3]="kejnv")
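The two approaches can be combined into one sketch that picks a method based on the running bash version. This assumes readarray -d is available from bash 4.4 onward and uses the built-in BASH_VERSINFO array for the check; the variable names follow the answer above.

```bash
#!/usr/bin/env bash
variable=$'13423exa*lkco3nr*sw\nkjenve*kejnv'
arr=()

if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 4) )); then
  # bash 4.4+: split directly on '*'
  readarray -d '*' -t arr < <(printf '%s' "$variable")
else
  # older bash: turn '*' into NUL bytes and read NUL-delimited records
  while IFS= read -r -d '' e || [[ -n $e ]]; do
    arr+=("$e")
  done < <(printf '%s' "$variable" | tr '*' '\0')
fi
declare -p arr
```

Either branch keeps the embedded newline inside the third element.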
You can use the readarray command as in the following example:
readarray -d ':' -t my_array < <(printf '%s' "a:b:c:d:")
for (( i = 0; i < ${#my_array[*]}; i++ )); do
  echo "${my_array[i]}"
done
Here the -d option sets the delimiter and -t removes the trailing delimiter from each element. printf is used instead of a here-string because <<< appends a newline, which would end up as an extra final element.
Use an ending character other than a newline:
end=.
IFS='*' read -r -a array -d "$end" <<< "$variable$end"
Of course, this solution assumes there is at least one character that does not occur in your input variable.

How to create multiple arrays from a text file and loop through the values of each array

I have a text file with the following:
Paige
Buckley
Govan
Mayer
King

Harrison
Atkins
Reinhardt
Wilson

Vaughan
Sergovia
Tarrega
My goal is to create an array for each set of names, then iterate through the first array of values, then the second, and lastly the third. Each set is separated by a blank line in the text file. Help with code or logic is much appreciated!
So far I have the following. I am unsure of the logic moving forward when I reach a blank line. My research also suggests that I can use readarray -d.
#!/bin/bash
my_array=()
while IFS= read -r line || [[ "$line" ]]; do
if [[ $line -eq "" ]];
.
.
.
arr+=("$line") # i know this adds the value to the array
done < "$1"
printf '%s\n' "${my_array[@]}"
desired output:
array1 = (Paige Buckley Govan Mayer King)
array2 = (Harrison Atkins Reinhardt Wilson)
array3 = (Vaughan Sergovia Tarrega)
# then loop through each array one after the other.
Bash has no array-of-arrays, so you have to represent it in another way.
You could keep the newlines and have an array of newline-separated elements:
array=()
elem=""
while IFS= read -r line; do
if [[ "$line" != "" ]]; then
elem+="${elem:+$'\n'}$line" # accumulate lines in elem
else
array+=("$elem") # flush elem as array element
elem=""
fi
done
if [[ -n "$elem" ]]; then
array+=("$elem") # flush the last elem
fi
# iterate over array
for ((i=0;i<${#array[@]};++i)); do
# each array element is newline separated items
readarray -t elem <<<"${array[i]}"
printf 'array%d = (%s)\n' "$i" "${elem[*]}"
done
You could simplify the loop by using some unique character and sed, for example:
readarray -d '#' -t array < <(sed -z 's/\n\n/#/g' file)
But overall, this awk generates the same output:
awk -v RS= -v FS='\n' '{
printf "array%d = (", NR;
for (i=1;i<=NF;++i) printf "%s%s", $i, i==NF?"":" ";
printf ")\n"
}'
Using nameref :
#!/usr/bin/env bash
declare -a array1 array2 array3
declare -n array=array$((n=1))
while IFS= read -r line; do
test "$line" = "" && declare -n array=array$((n=n+1)) || array+=("$line")
done < "$1"
declare -p array1 array2 array3
Called with :
bash test.sh data
# result
declare -a array1=([0]="Paige" [1]="Buckley" [2]="Govan" [3]="Mayer" [4]="King")
declare -a array2=([0]="Harrison" [1]="Atkins" [2]="Reinhardt" [3]="Wilson")
declare -a array3=([0]="Vaughan" [1]="Sergovia" [2]="Tarrega")
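To then walk the groups one after the other, as the question asks, one possible sketch re-points a nameref at each array in turn. The arrays are filled in literally here from the result shown above so that the block runs on its own; group_name, group, person and the visited log are hypothetical names used only for illustration. Namerefs need bash 4.3 or newer.

```bash
#!/usr/bin/env bash
# Sample groups, copied from the result shown above.
array1=(Paige Buckley Govan Mayer King)
array2=(Harrison Atkins Reinhardt Wilson)
array3=(Vaughan Sergovia Tarrega)

visited=()   # hypothetical log of "array:name" pairs, kept only for illustration
for group_name in array1 array2 array3; do
  declare -n group=$group_name          # point the nameref at the current array
  for person in "${group[@]}"; do
    visited+=("$group_name:$person")
    printf '%s -> %s\n' "$group_name" "$person"
  done
done
```

If the number of groups is not known in advance, the list in the outer loop would have to be built while reading the file.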
Assumptions:
blank lines are truly blank (i.e., no need to worry about any white space on said lines)
could have consecutive blank lines
names could have embedded white space
the number of groups could vary and won't always be 3 (as with the sample data provided in the question)
OP is open to using a (simulated) 2-dimensional array as opposed to a (variable) number of 1-dimensional arrays
My data file:
$ cat names.dat
<<< leading blank lines
Paige
Buckley
Govan
Mayer
King Kong

<<< consecutive blank lines
Harrison
Atkins
Reinhardt
Wilson

Larry
Moe
Curly
Shemp

Vaughan
Sergovia
Tarrega
<<< trailing blank lines
One idea that uses a couple arrays:
array #1: associative array - the previously mentioned (simulated) 2-dimensional array with the index - [x,y] - where x is a unique identifier for a group of names and y is a unique identifier for a name within a group
array #2: 1-dimensional array to keep track of max(y) for each group x
Loading the arrays:
unset names max_y # make sure array names are not already in use
declare -A names # declare associative array
x=1 # init group counter
y=0 # init name counter
max_y=() # initialize the max(y) array
inc= # clear increment flag
while read -r name
do
if [[ "${name}" = '' ]] # if we found a blank line ...
then
[[ "${y}" -eq 0 ]] && # if this is a leading blank line then ...
continue # ignore and skip to the next line
inc=y # set flag to increment 'x'
else
[[ "${inc}" = 'y' ]] && # if increment flag is set ...
max_y[${x}]="${y}" && # make note of max(y) for this 'x'
((x++)) && # increment 'x' (group counter)
y=0 && # reset 'y'
inc= # clear increment flag
((y++)) # increment 'y' (name counter)
names[${x},${y}]="${name}" # save the name
fi
done < names.dat
max_y[${x}]="${y}" # make note of the last max(y) value
Contents of the array:
$ typeset -p names
declare -A names=([1,5]="King Kong" [1,4]="Mayer" [1,1]="Paige" [1,3]="Govan" [1,2]="Buckley" [3,4]="Shemp" [3,3]="Curly" [3,2]="Moe" [3,1]="Larry" [2,4]="Wilson" [2,2]="Atkins" [2,3]="Reinhardt" [2,1]="Harrison" [4,1]="Vaughan" [4,2]="Sergovia" [4,3]="Tarrega" )
$ for (( i=1; i<=${x}; i++ ))
do
for (( j=1; j<=${max_y[${i}]}; j++ ))
do
echo "names[${i},${j}] : ${names[${i},${j}]}"
done
echo ""
done
names[1,1] : Paige
names[1,2] : Buckley
names[1,3] : Govan
names[1,4] : Mayer
names[1,5] : King Kong
names[2,1] : Harrison
names[2,2] : Atkins
names[2,3] : Reinhardt
names[2,4] : Wilson
names[3,1] : Larry
names[3,2] : Moe
names[3,3] : Curly
names[3,4] : Shemp
names[4,1] : Vaughan
names[4,2] : Sergovia
names[4,3] : Tarrega

Split two numbers in two arrays

I need to split pairs of numbers that come from a text file in this form:
Num1:Num2
Num3:Num4
and store Num1 in array X and Num2 in array Y, then Num3 in array X and Num4 in array Y.
With bash:
mapfile -t X < <(cut -d : -f 1 file) # read only first column
mapfile -t Y < <(cut -d : -f 2 file) # read only second column
declare -p X Y
Output:
declare -a X='([0]="num1" [1]="num3")'
declare -a Y='([0]="num2" [1]="num4")'
Disadvantage: The file is read twice.
You could perform the following steps:
Create destination arrays empty
Read file line by line, with a classic while read ... < file loop
Split each line on :, again using read
Append values to arrays
For example:
arr_x=()
arr_y=()
while IFS= read -r line || [ -n "$line" ]; do
  IFS=: read -r x y <<< "$line"
  arr_x+=("$x")
  arr_y+=("$y")
done < data.txt
echo "content of arr_x:"
for v in "${arr_x[@]}"; do
  echo "$v"
done
echo "content of arr_y:"
for v in "${arr_y[@]}"; do
  echo "$v"
done
Here is a quick bash solution:
c=0
while IFS=: read -r a b; do
  x[$c]="$a"
  y[$c]="$b"
  c=$((c+1))
done < input.txt
We feed input.txt to a while loop with the internal field separator (IFS) set to ":", and read the first number of each line as $a and the second as $b. Then we store them in the arrays as you specified, using the counter $c as the index into both arrays.
Using the =~ operator, which stores the captured pair of numbers in the array BASH_REMATCH:
$ cat file
123:456
789:012
$ while read -r line
do
[[ $line =~ ([^:]*):(.*) ]] && echo "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
# do something else with numbers as they will be replaced on the next iteration
done < file
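Because BASH_REMATCH is overwritten on every match, the captured values have to be copied out inside the loop if they are needed later. A minimal sketch that appends them to the arrays X and Y named in the question, with the sample file contents supplied through a here-document so the block is self-contained:

```bash
#!/usr/bin/env bash
X=()
Y=()
while IFS= read -r line; do
  # group 1: everything before the first ':', group 2: everything after it
  if [[ $line =~ ^([^:]*):(.*)$ ]]; then
    X+=("${BASH_REMATCH[1]}")
    Y+=("${BASH_REMATCH[2]}")
  fi
done <<'EOF'
123:456
789:012
EOF
declare -p X Y
```

The values are stored as strings, so "012" keeps its leading zero.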

Iterating over lines (w/ numbers) read from a file to an array in bash

I'm trying to write a small script that takes the 4th column of a file, stores it in an array, and then does a little comparison. If an element in the array is greater than 0 and less than 500, I have to increment a counter. However, when I run the script the counter always shows 0. Here's my script:
#!/bin/bash
mapfile -t my_array < <(cat file1.txt | awk '{ print $4 }' > test.txt)
COUNTER=0
for i in ${my_array[@]}; do
if [["${my_array[$i]}" -gt 0 -a "${my_array[$i]}" -lt 500 ]]
then
COUNTER=$((COUNTER + 1))
fi
printf "%s\t%s\n" "%i" "${my_array[$i]}"//just to test if the mapfile command is working
done
echo $COUNTER
output:
./script1.bash
0
#!/bin/bash
mapfile -t my_array < <(awk '{ print $4 }' file1.txt | tee test.txt)
COUNTER=0
for idx in "${!my_array[@]}"; do
value=${my_array[$idx]}
if (( value > 0 )) && (( value < 500 )); then
COUNTER=$((COUNTER + 1))
fi
printf "%s\t%s\n" "$idx" "$value"
done
echo "$COUNTER"
The use of cat here is needless: It added nothing but inefficiency (requiring an extra process to be started, and forcing awk to read from a pipe rather than direct from a file).
mapfile had nothing to read because the output of awk was redirected to test.txt. If you want it to go to both a file and stdout, then you need to use tee.
-a is not valid in [[ ]]; use && instead there. However, since you're doing only arithmetic, (( )) is more appropriate. Incidentally, -a is officially marked obsolescent even for [ ] and test; see the current POSIX standard.
${my_array[@]} iterates over values. If you want to iterate over indexes, you need ${!my_array[@]} instead.
Whitespace is mandatory in separating command names. [["$foo" is a different command from [[, unless $foo is empty or starts with a character in $IFS.
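The difference between iterating over values and over indexes, together with the (( )) comparison, can be seen in a small stand-alone sketch. The sample numbers are made up and stand in for the 4th column of file1.txt; keys and count are illustrative names.

```bash
#!/usr/bin/env bash
my_array=(120 0 730 45)        # hypothetical sample values
keys=("${!my_array[@]}")       # the indexes: 0 1 2 3

count=0
for idx in "${!my_array[@]}"; do
  value=${my_array[idx]}
  # arithmetic context: no $ needed, and && replaces the invalid -a
  if (( value > 0 && value < 500 )); then
    count=$((count + 1))
  fi
done

echo "indexes: ${keys[*]}"
echo "values:  ${my_array[*]}"
echo "count:   $count"
```

With these values, 120 and 45 fall inside the range, so the count is 2.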
If you redirect the output to a file: > test.txt then there is no output in "standard output" because it is consumed by the file. So, first, you need to remove that redirection. You may use:
mapfile -t my_array < <(cat file1.txt | awk '{ print $4 }' )
But since awk could perfectly well read a file, this is better:
mapfile -t my_array < <(awk '{ print $4 }' file1.txt)
And since you are using awk, it could do the comparison to 0 and 500 and output the whole count.
counter=$(awk '{if($4>0 && $4<500){c++}}END{print c+0}' file1.txt) # c+0 prints 0 instead of an empty string when nothing matches
echo "$counter"
Simpler, faster.
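To try the awk-only count without the original file1.txt, the sketch below writes a few made-up rows to a temporary file first. The row contents and the tmp variable are assumptions for illustration; only the fourth field matters.

```bash
#!/usr/bin/env bash
tmp=$(mktemp)                  # temporary stand-in for file1.txt
printf '%s\n' \
  'a b c 10'  \
  'a b c 0'   \
  'a b c 499' \
  'a b c 500' \
  'a b c 750' > "$tmp"

# count rows whose 4th field is strictly between 0 and 500
counter=$(awk '$4 > 0 && $4 < 500 { c++ } END { print c + 0 }' "$tmp")
rm -f "$tmp"
echo "$counter"
```

Both bounds are exclusive, so the rows with 0 and 500 are not counted.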
That will also avoid some simple mistakes in your script, like a missing space in the [[ … ]] construct:
if [[ "${my … # NOT "if [["${my …"
And some missing quotes:
for i in "${my_array[@]}" # NOT for i in ${my_array[@]}
In general, it is a good idea to check your script with ShellCheck.net to remove some simple mistakes.

Resources