how to capture only digits in array using if statement in bash - arrays

len=${#newarray[*]}
for ((i=0;i<${len};i++)); do
if [[ "${newarray[$i]}" =~ ^[[:digit:]]+$ ]]; then
echo "${newarray[$i]}"
fi
done
This is my code. I need to get only the all-digit entries from the array, and their count as well, but it shows only the alphabetical values.
Given that getdata.txt contains:
cat
dog
1234
pl345
567ab
12234

Having:
value=(cat dog 1234 pl345 567ab 12234)
You could print "only numbers":
$ printf "%s\n" "${value[@]}" | grep -o "[0-9]*"
1234
345
567
12234
or maybe you want "only lines that contain only numbers":
$ printf "%s\n" "${value[@]}" | grep -x "[0-9]*"
1234
12234
or maybe you want "only lines with a number":
$ printf "%s\n" "${value[@]}" | grep "[0-9]"
1234
pl345
567ab
12234

Your if/else is backwards.
value=(cat dog 1234 pl345 567ab 12234)
len=${#value[*]}
for ((i=0;i<${len};i++)); do
if [[ "${value[$i]}" =~ ^[[:digit:]]+$ ]]; then
echo "${value[$i]}"
fi
done
This outputs:
1234
12234
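The question also asks for the count of matching elements, which the loop above does not report. A minimal sketch, assuming the same value array; the name digits is made up for illustration and is not from the original post:

```shell
#!/bin/bash
value=(cat dog 1234 pl345 567ab 12234)
digits=()   # hypothetical name: collects the all-digit elements

for v in "${value[@]}"; do
  # keep the element only if it consists entirely of digits
  if [[ $v =~ ^[[:digit:]]+$ ]]; then
    digits+=("$v")
  fi
done

printf '%s\n' "${digits[@]}"
echo "count: ${#digits[@]}"
```

Collecting the matches in a second array keeps the count available as ${#digits[@]} without a separate counter variable.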

There is a non-regex way of doing it in bash using shopt:
# enable extended glob
shopt -s extglob
value=(cat dog 1234 pl345 567ab 12234)
# loop through the array and print an element only if it consists of 1+ digits
for v in "${value[@]}"; do [[ $v == +([0-9]) ]] && echo "$v"; done
1234
12234

You probably should use two nested loops:
1. iterate over all elements of an array
2. iterate over all chars in the element, check if the char is a digit
(2) would probably be simpler with grep, but since we're speaking of a bash implementation, I came up with this:
#!/bin/bash
a=(cat dog 1234 pl345 567ab 12234) # input array
for s in "${a[@]}"; do # iterate over all elements of our array
res=''
for ((i=0;i<${#s};i++)); do # iterate over all chars of element $s
c="${s:$i:1}" # $c is now char at $i position inside $s
case "$c" in
[0-9]) res="${res}${c}" ;; # append digit to res
esac
done
[ -n "$res" ] && echo "$res" # if $s contains at least 1 digit, print it
# you may add $res to another array here if needed
done

Simply
tr -dc '0-9\n' <getdata.txt

You can use:
$ grep -P '^[0-9]+$' getdata.txt

Related

How to split string to array with specific word in bash

I have a string after I do a command:
[username@hostname ~/script]$ gsql ls | grep "Graph graph_name"
- Graph graph_name(Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e)
Then I run
IFS=", " read -r -a vertices <<< "$(gsql use graph ifgl ls | grep "Graph ifgl(" | cut -d "(" -f2 | cut -d ")" -f1)"
to split the string and store the pieces in an array. But what I want is to split it by the delimiter ", " and then append only the words that contain ":v" to the array, which means words that contain ":e" are excluded.
How can I do this without a loop?
Like this, using grep
mapfile -t array < <(gsql ls | grep "Graph graph_name" | grep -oP '\b\w+:v')
The regular expression matches as follows:
Node    Explanation
\b      the boundary between a word char (\w) and something that is not a word char
\w+     word characters (a-z, A-Z, 0-9, _) (1 or more times, matching as many as possible)
:v      ':v'
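To try the pattern without access to gsql, the sample line from the question can be fed in directly. This is only a sketch: the variable names line and vertices are chosen here for illustration, and it assumes GNU grep (for -P) and bash 4 or later (for mapfile).

```shell
#!/bin/bash
# sample line copied from the question; stands in for the gsql output
line='- Graph graph_name(Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e)'

# grep -o prints each match on its own line;
# mapfile -t stores one line per array element
mapfile -t vertices < <(grep -oP '\b\w+:v' <<< "$line")

declare -p vertices
```

Because each match arrives on its own line, no IFS manipulation is needed to fill the array.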
This bash script should work:
# declare arr as an array variable
arr=()
# use "," as the delimiter (read -d honors only the first character of its argument)
# to parse the input fed through process substitution
while read -r -d ',' val || [[ -n $val ]]; do
val="${val%)}"
val="${val#*\(}"
[[ $val == *:v ]] && arr+=("$val")
done < <(gsql ls | grep "Graph graph_name")
# check array content
declare -p arr
Output:
declare -a arr='([0]="Vertice_1:v" [1]="Vertice_2:v" [2]="Vertice_3:v" [3]="Vertice_4:v")'
Since there is a condition per element, the logical way is to use a loop. There may be other ways to do it, but here is a solution with a for loop:
#!/bin/bash
input="Vertice_1:v, Vertice_2:v, Vertice_3:v, Vertice_4:v, Edge_1:e, Edge_2:e, Edge_3:e, Edge_4:e, Edge_5:e"
input="${input//,/ }" # replace "," with a space (the unquoted expansion below splits on whitespace)
inputarray=($input)
outputarray=()
for item in "${inputarray[@]}"; do
if [[ $item =~ ":v" ]]; then
outputarray+=("$item") # append the item to the output array
fi
done
echo "${outputarray[@]}"
This gives the output: Vertice_1:v Vertice_2:v Vertice_3:v Vertice_4:v
This works because the elements don't contain spaces.

bash string to array and array to string

How do we convert a string into N character array and back to string with spaces?
And how do we remove the spaces?
e.g. 123456789 into 2's should give 12 34 56 78 9
Sounds like you don't need the array at all and your final goal is just to insert spaces between groups of two characters. If that's the case you can use
sed 's/../& /g' <<< "your string here"
This will transform your example input 123456789 into the expected output 12 34 56 78 9.
Of course you can assign the result to a variable as usual:
yourVariable="$(sed 's/../& /g' <<< "your string here")"
if needed, how do we remove the spaces?
I'm not sure which spaces you mean. If you are talking about the final result, wouldn't it be easier to use the original input instead of processing the output again?
Anyway, you can remove all spaces from any string using tr -d ' ' <<< "your string" or the parameter substitution ${yourVariable// /}.
$ str='123456789'
$ arr=( $(printf '%s\n' "$str" | sed 's/../& /g') )
$ declare -p arr
declare -a arr=([0]="12" [1]="34" [2]="56" [3]="78" [4]="9")
$ str="${arr[*]}"
$ echo "$str"
12 34 56 78 9
$ str="${str// }"
$ echo "$str"
123456789
$ str=$(printf "%s" "${arr[@]}")
$ echo "$str"
123456789
If you need to split a string into array, you can use IFS variable:
IFS=' '
arr=( )
read -r -a arr <<< "string morestring another"
unset IFS
To remove spaces from string you can use different approaches, one of them is:
str="123 123 12312312"
echo "${str// /}"
# output: 12312312312312
The following demonstrates what you asked about.
Convert a string into N character array:
string="0123456789abcdefghijklmnopqrstuvwxyz"
number_max_of_char_per_set=4
increment_by=$number_max_of_char_per_set
# step through the string in chunks; stopping before ${#string} avoids an empty trailing element
for ((i = 0; i < ${#string}; i += increment_by))
do array[$i]=${string:$i:number_max_of_char_per_set}
done
... back to string with spaces:
string_new=${array[@]}
echo "zero element of array is [${array[0]}]"
echo "entire array is [${array[@]}]"
echo $string_new
remove the spaces:
string_new=${string_new//[[:space:]]/}
echo $string_new

How to replace a word from a file with elements in an array

I am trying to replace every occurrence of the word "null" with elements of an array. The problem is that after replacing one "null", I would like to replace the next "null" with the next element in the array.
I am not very good with bash and I feel like this is quite a basic question.
Here is what I have so far:
for m in $(cat finalfile.csv)
do
if [ "$m" = "null" ]
then
m=cwearray[$counter]
let counter++
fi
done
This doesn't replace anything in the finalfile.csv.
For example if the file has:
"value1","value2","null","value3"\n
"value1","value2","null","value3"...
and the array has ["foo","bar"]
I would like it to be:
"value1","value2","foo","value3"\n
"value1","value2","bar","value3"...
This can be done with bash, even with multiple nulls per line:
$ cat finalfile.csv
"value1","value2","null","null"
"value1","value2","null","value3"
$ cwearray=( foo bar baz )
$ idx=0
$ while read -r line; do
while [[ $line == *null* ]]; do
line=${line/null/${cwearray[idx++]}}
# ...............^^^^^^^^^^^^^^^^^^
# replace the _first_ "null" with the _next_ array element
done
echo "$line"
done < finalfile.csv > updatedfinalfile.csv
$ cat updatedfinalfile.csv
"value1","value2","foo","bar"
"value1","value2","baz","value3"
It's easier in Perl where you can increase the index directly in the replacement part of a substitution:
printf '%s\n' 1,2,3,null null,2,3,4 null,null,null,null \
| perl -pe 'BEGIN { @cwe = qw( A B C D E F ) }
s/(?:^|(?<=,))null(?=,|$)/$cwe[$i++]/g'
Update: It seems you've updated your question with a sample input. If nulls are double quoted, it gets even easier, as there's no need to check whether they're surrounded with commas or beginning/end of the line.
perl -pe 'BEGIN{ @cwe = qw( foo bar ) }
s/"null"/"$cwe[$i++]"/g'
An awk solution :
declare -a cwearray
cwearray=(foo bar)
awk -F, 'NR==FNR{repl[NR]=$0; next}{for(i=1;i<=NF;i++){if($i=="\"null\""){$i="\""repl[++counter]"\""}}}1' OFS="," <(for i in "${cwearray[@]}"; do echo "$i"; done) finalfile.csv
Read the file line by line. If a line contains null, then use sed to replace all occurrences of null with the corresponding value, retrieved via array index.
#!/bin/bash
file="finalfile.csv"
counter=0
array=(
"foo"
"bar"
)
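while IFS= read -r line; do
if [[ $line == *null* ]]; then
# take the next array element only when the line actually contains null
item="${array[$counter]}"
line=$(sed "s/null/$item/g" <<< "$line")
((counter++))
fi
echo "$line"
done < "$file"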

What is the fastest way to delete an element of an array?

Giving this array:
arr=(hello asd asd1 asd22 asd333)
I want to delete a specific item by its value, for example asd. I did this:
IFS=' '
echo "${arr[@]/asd/}"
But it returns the following:
hello 1 22 333
So I did this function:
function remove_item() {
local item_search="$1"
shift
local arr_tmp=("${@}")
if [ ${#arr_tmp[@]} -eq 0 ]; then
return
fi
local index=0
for item in ${arr_tmp[@]}; do
if [ "$item" = "$item_search" ]; then
unset arr_tmp[$index]
break
fi
let index++
done
echo "${arr_tmp[*]}"
}
arr=(hello asd asd1 asd22 asd333)
remove_item 'asd' "${arr[@]}"
Prints the desired output:
hello asd1 asd22 asd333
But I have to use it with very long arrays, and I have to call it a lot of times, and its performance sucks.
Do you have any better alternative? Any tip, trick, or advice will be appreciated.
You could use a loop to iterate over the array and remove the element that matches the specified value:
for i in "${!arr[@]}"; do
[[ "${arr[i]}" == "asd" ]] && unset arr[i]
done
If you know that the array would have at most one matching element, you could even break out of the loop:
[[ "${arr[i]}" == "asd" ]] && unset arr[i] && break
(the trailing && break causes the loop to stop as soon as the element is found)
As an example:
$ arr=(asd asd1 asd22 asd333)
$ for i in "${!arr[@]}"; do [[ "${arr[i]}" == "asd" ]] && unset arr[i]; done
$ echo "${arr[@]}"
asd1 asd22 asd333
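Note that unset removes the element but does not renumber the remaining ones, so the array is left with a gap at the removed index. If later code relies on contiguous indices, reassigning the array compacts it. A minimal sketch under that assumption, using the array from the question:

```shell
#!/bin/bash
arr=(hello asd asd1 asd22 asd333)

for i in "${!arr[@]}"; do
  # quote the argument so unset receives arr[N] literally (no glob expansion)
  [[ ${arr[i]} == "asd" ]] && unset "arr[$i]" && break
done

echo "indices before: ${!arr[*]}"   # index 1 is now missing
arr=("${arr[@]}")                   # rebuild with contiguous indices
echo "indices after:  ${!arr[*]}"
```

The rebuild copies every remaining element, so for very long arrays it is probably best done once after all removals rather than after each one.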
Probably @devnull's answer is fastest. But it might possibly be faster not to use a loop and instead let grep do the work. It's not very pretty, though:
$ arr=(hello asd asd1 asd22 asd333)
$ remove="asd"
$ i=$(paste -d: <(printf "%s\n" "${!arr[@]}") <(printf "%s\n" "${arr[@]}") | grep -m1 -w -E "^[[:digit:]]+:${remove}$")
$ unset arr[${i%:*}]
$

bash array leave elements containing string

I will be brief. What I have is
array=( one.a two.b tree.c four.b five_b_abc)
I want this
array=( two.b four.b five_b_abc )
From here I found this
# replace any array item matching "b*" with "foo"
array=( foo bar baz )
array=( "${array[@]/%b*/foo}" )
echo "${orig[@]}"$'\n'"${array[@]}"
However, this does not work:
array2=( ${array[@]//%^.p/})
result: array2=array
This deletes all elements containing p:
array2=(${array[@]/*p*/})
result: array2=( one.a tree.c )
I need an idea of how to add ^p (all except p) and get my solution:
array2=(${array[@]/*^p*/})
It's a rather large array, about 10k elements, and I need this to be as fast as possible, so please no loop solutions.
Edit: added timing comparisons (at end) and got rid of the tr
It is possible to replace the content of array elements using bash parameter expansion, i.e. ${var[@]...}, but you cannot actually delete the elements. The best you can get with parameter expansion is to make null ("") those elements you do not want.
Instead, you can use printf, sed, and IFS. This has the advantage of allowing you to use full regular expression syntax (not just shell globbing expressions), and it is much faster than using a loop.
This example keeps only the array elements that contain c.
Note: This method caters for spaces in data. This is achieved via IFS=$'\n'
IFS=$'\n'; a=($(printf '%s\n' "${a[@]}" |sed '/c/!d'))
Here it is again with before/after dumps:
#!/bin/bash
a=(one.ac two.b tree.c four.b "five b abcdefg" )
echo "======== Original array ===="
printf '%s\n' "${a[@]}"
echo "======== Array containing only the matched elements 'c' ===="
IFS=$'\n'; a=($(printf '%s\n' "${a[@]}" |sed '/c/!d'))
printf '%s\n' "${a[@]}"
echo "========"
Output:
======== Original array ====
one.ac
two.b
tree.c
four.b
five b abcdefg
======== Array containing only the matched elements 'c' ====
one.ac
tree.c
five b abcdefg
========
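Assigning IFS as above changes it for the rest of the script, and the unquoted $(...) is still subject to glob expansion. One possible alternative, assuming bash 4 or later, is mapfile, which reads one line per element without touching IFS. This is a sketch of that variation, not part of the original answer or its timings:

```shell
#!/bin/bash
a=(one.ac two.b tree.c four.b "five b abcdefg")

# sed deletes every line that does not contain "c";
# mapfile -t stores each remaining line as one array element
mapfile -t a < <(printf '%s\n' "${a[@]}" | sed '/c/!d')

printf '%s\n' "${a[@]}"
```

Like the printf/sed version, this assumes that no element contains a newline.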
For general reference: testing an array containing 10k elements, selecting 5k:
a=( a\ \ \ \ b{0..9999} )
The printf method took: 0m0.226s (results in sequential index values)
The first loop method: 0m4.007s (it leaves gaps in index values)
The second loop method: 0m7.862s (results in sequential index values)
printf method:
IFS=$'\n'; a=($(printf '%s\n' "${a[@]}" |sed '/.*[5-9]...$/!d'))
1st loop method:
iz=${#a[@]}; j=0
for ((i=0; i<iz; i++)) ;do
[[ ! "${a[i]}" =~ .*[5-9]...$ ]] && unset a[$i]
done
2nd loop method:
iz=${#a[@]}; j=0
for ((i=0; i<iz; i++)) ;do
if [[ ! "${a[i]}" =~ .*[5-9]...$ ]] ;then
unset a[$i]
else
a[$j]="${a[i]}=$i=$j"; ((j!=i)) && unset a[$i]; ((j+=1));
fi
done
You can try this:
array2=(`echo ${array[@]} | sed 's/ /\n/g' | grep b`)
If you set IFS to $'\n', then you could use the echo command on the array with an '*' as the subscript, instead of printf with an '@':
IFS=$'\n'; a=($(echo "${a[*]}" | sed '/.*[5-9]...$/!d'))
The '*' subscript concatenates the array elements together, separated by the IFS.
(I have only recently learned this myself, by the way.)
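The joining behaviour of the '*' subscript can be checked in isolation: inside double quotes, the elements are joined with the first character of IFS. A small sketch with a comma as separator; the variable names joined and old_ifs are chosen here for illustration:

```shell
#!/bin/bash
a=(one two "three four")

old_ifs=$IFS        # remember the current IFS so it can be restored
IFS=,
joined="${a[*]}"    # elements joined with the first character of IFS
IFS=$old_ifs

echo "$joined"
```

Restoring IFS afterwards keeps the change from affecting word splitting in the rest of the script.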
I know that it is a bash question, but I used a python script to make it one liner and easy to read.
$ array=( one.a two.b tree.c four.b five_b_abc)
$ printf '%s\n' "${array[@]}" | python3 -c "import sys; print(' '.join([ s.strip() for s in sys.stdin if 'b' in s ]));"
two.b four.b five_b_abc
