Assume that the PIDs of the active processes on my machine are 1000 and 2000.
I am trying to make an array in Linux such that
The command echo ${Pid_Current[0]} gives 1000 in output
The command echo ${Pid_Current[1]} gives 2000 in output
Here is my code:
declare -a Pid_Current
Pid_Current=$(ps -aF | tail -n +2 | awk '{print $2}')
However, instead of the desired output I explained above, I receive the following output:
echo ${Pid_Current[0]} gives 1000 2000 in output
echo ${Pid_Current[1]} gives nothing in output
Would you please advise me what part of my code is incorrect?
In bash, array assignment is done by enclosing the expression in parentheses, so to use array assignment you need to write:
Pid_Current=($(ps -aF | tail -n +2 | awk '{print $2}'))
Without the parentheses, the whole result of the expression is assigned to Pid_Current[0] as a single string.
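A minimal sketch of the difference, assuming bash 4 or later for mapfile. The function list_pids is a hypothetical stand-in for the ps pipeline, so the values are predictable:

```bash
#!/bin/bash
# list_pids is a hypothetical stand-in for: ps -aF | tail -n +2 | awk '{print $2}'
list_pids() { printf '1000\n2000\n'; }

scalar=$(list_pids)      # plain assignment: one string holding both lines
array=($(list_pids))     # parentheses: the output is word-split into elements

echo "scalar is one value of ${#scalar} characters"
echo "array has ${#array[@]} elements: ${array[0]} and ${array[1]}"

# mapfile (bash 4+) stores one line per element and avoids word splitting
# and glob expansion of the command output.
mapfile -t pids < <(list_pids)
echo "mapfile stored ${#pids[@]} elements"
```

The unquoted $(...) inside the parentheses relies on word splitting, which is fine for PIDs; mapfile is the safer choice when lines may contain spaces.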
I am trying to read a file line by line and find the average of the numbers in each line. I am getting the error: expr: non-numeric argument
I have narrowed the problem down to sum=`expr $sum + $i`, but I'm not sure why the code doesn't work.
while read -a rows
do
for i in "${rows[@]}"
do
sum=`expr $sum + $i`
total=`expr $total + 1`
done
average=`expr $sum / $total`
done < $fileName
The file looks like this (the numbers are separated by tabs):
1 1 1 1 1
9 3 4 5 5
6 7 8 9 7
3 6 8 9 1
3 4 2 1 4
6 4 4 7 7
With some minor corrections, your code runs well:
while read -a rows
do
total=0
sum=0
for i in "${rows[@]}"
do
sum=`expr $sum + $i`
total=`expr $total + 1`
done
average=`expr $sum / $total`
echo $average
done <filename
With the sample input file, the output produced is:
1
5
7
5
2
5
Note that the answers are what they are because expr only does integer arithmetic.
Using sed to preprocess for expr
The above code could be rewritten as:
$ while read -r row; do expr '(' $(sed 's/[[:space:]][[:space:]]*/ + /g' <<<"$row") ')' / $(wc -w <<<"$row"); done < filename
1
5
7
5
2
5
Using bash's builtin arithmetic capability
expr is archaic. In modern bash:
while read -a rows
do
total=0
sum=0
for i in "${rows[@]}"
do
((sum += i))
((total++))
done
echo $((sum/total))
done <filename
Using awk for floating point math
Because awk does floating point math, it can provide more accurate results:
$ awk '{s=0; for (i=1;i<=NF;i++)s+=$i; print s/NF;}' filename
1
5.2
7.4
5.4
2.8
5.6
Some variations on the same trick of using the IFS variable.
#!/bin/bash
while read line; do
set -- $line
echo $(( ( $(IFS=+; echo "$*") ) / $# ))
done < rows
echo
while read -a line; do
echo $(( ( $(IFS=+; echo "${line[*]}") ) / ${#line[*]} ))
done < rows
echo
saved_ifs="$IFS"
while read -a line; do
IFS=+
echo $(( ( ${line[*]} ) / ${#line[*]} ))
IFS="$saved_ifs"
done < rows
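The trick in these variations is that "$*" (and "${line[*]}") joins its words with the first character of IFS. A small sketch of just that step, where join_sum_expr is a hypothetical helper name:

```bash
#!/bin/bash
# join_sum_expr is a hypothetical helper: it joins its arguments with '+'.
join_sum_expr() {
  local IFS=+        # "$*" joins the arguments with the first character of IFS
  echo "$*"
}

expr_text=$(join_sum_expr 9 3 4 5 5)
echo "$expr_text"                      # the text handed to arithmetic expansion
echo $(( ($expr_text) / 5 ))           # integer average of the five values
```

Declaring IFS as local keeps the change inside the function, which avoids the save-and-restore step used in the last variation.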
Others have already pointed out that expr is integer-only, and recommended writing your script in awk instead of shell.
Your system may have a number of tools on it that support arbitrary-precision math, or floats. Two common calculators in shell are bc which follows standard "order of operations", and dc which uses "reverse polish notation".
Either one of these can easily be fed your data such that per-line averages can be produced. For example, using bc:
#!/bin/sh
while read line; do
set - ${line}
c=$#
string=""
for n in $*; do
string+="${string:++}$1"
shift
done
average=$(printf 'scale=4\n(%s) / %d\n' $string $c | bc)
printf "%s // avg=%s\n" "$line" "$average"
done
Of course, the only bc-specific part of this is the format for the notation and the bc itself in the third last line. The same basic thing using dc might look like this:
#!/bin/sh
while read line; do
set - ${line}
c=$#
string="0"
for n in $*; do
string+=" $1 + "
shift
done
average=$(dc -e "4k $string $c / p")
printf "%s // %s\n" "$line" "$average"
done
Note that my shell supports appending to strings with +=. If yours does not, you can adjust this as you see fit.
In both of these examples, we're printing our output to four decimal places -- with scale=4 in bc, or 4k in dc. We are processing standard input, so if you named these scripts "calc", you might run them with command lines like:
$ ./calc < inputfile.txt
The set command at the beginning of the loop turns the $line variable into positional parameters, like $1, $2, etc. We then process each positional parameter in the for loop, appending everything to a string which will later get fed to the calculator.
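As an illustration of that step only, here is a small sketch. The function describe_line is a hypothetical name, and it assumes whitespace-separated fields with no glob characters such as *:

```bash
#!/bin/bash
# describe_line is a hypothetical helper showing what "set" does to a line.
describe_line() {
  # Unquoted on purpose: the shell splits $1 on whitespace, and each word
  # becomes a positional parameter ($1, $2, ...) of this function.
  set -- $1
  echo "$# fields, first=$1, last=${!#}"
}

describe_line "9 3 4 5 5"
```

Using set -- rather than set - is a small safety measure: it also behaves correctly when the first word starts with a dash.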
Also, you can fake it.
That is, while bash doesn't support floating point numbers, it DOES support multiplication and string manipulation. The following uses NO external tools, yet appears to present decimal averages of your input.
#!/bin/bash
declare -i total
while read line; do
set - ${line}
c=$#
total=0
for n in $*; do
total+="$1"
shift
done
# Move the decimal point over prior to our division...
average=$(($total * 1000 / $c))
# Re-insert the decimal point via string manipulation
average="${average:0:$((${#average} - 3))}.${average:$((${#average} - 3))}"
printf "%s // %0.3f\n" "$line" "$average"
done
The important bits here are:
* declare -i, which tells bash to treat $total as an integer, so that += adds to it rather than appending as if it were a string,
* the two average= assignments, the first of which multiplies $total by 1000, and the second of which splits the result at the thousands column, and
* printf whose format enforces three decimal places of precision in its output.
Of course, input still needs to be integers.
YMMV. I'm not saying this is how you should solve this, just that it's an option. :)
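One variation on the same idea, as a sketch: split the scaled result with integer division and remainder instead of string slicing, which also copes with averages below 1. The name fixed_avg is hypothetical, and the function assumes non-negative integer arguments:

```bash
#!/bin/bash
# fixed_avg is a hypothetical helper: average of non-negative integers,
# printed with three decimal places using integer arithmetic only.
fixed_avg() {
  local sum=0 n scaled
  for n in "$@"; do
    (( sum += n ))
  done
  scaled=$(( sum * 1000 / $# ))        # average scaled by 1000, truncated
  # Whole part, then the remainder zero-padded to three digits.
  printf '%d.%03d\n' $(( scaled / 1000 )) $(( scaled % 1000 ))
}

fixed_avg 9 3 4 5 5
fixed_avg 0 0 1
```

The result is truncated rather than rounded, so the last digit can differ from what awk or bc would print.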
This is a pretty old post, but it came up at the top of my Google search, so I thought I'd share what I came up with:
while read line; do
# Convert each line to an array
ARR=( $line )
# Append each value in the array with a '+' and calculate the sum
# (this causes the last value to have a trailing '+', so it is added to '0')
ARR_SUM=$( echo "${ARR[@]/%/+} 0" | bc -l)
# Divide the sum by the total number of elements in the array
echo "$(( ${ARR_SUM} / ${#ARR[@]} ))"
done < "$filename"
Here is the complete code. In BER_SB, the values of K and SB passed to the rand-src command and the value of sigma passed to the transmit command are calculated in main. The values written to the BER array by BER_SB are used further in main.
BER_SB()
{
s=$1
mkdir "$1"
cp ex-ldpc36-5000a.pchk ex-ldpc36-5000a.gen "$1"
cd "$1"
rand-src ex-ldpc36-5000a.src $s "$K"x"$SB"
encode ex-ldpc36-5000a.pchk ex-ldpc36-5000a.gen ex-ldpc36-5000a.src ex-ldpc36-5000a.enc
transmit ex-ldpc36-5000a.enc ex-ldpc36-5000a.rec 1 awgn $sigma
decode ex-ldpc36-5000a.pchk ex-ldpc36-5000a.rec ex-ldpc36-5000a.dec awgn $sigma prprp 250
BER="$(verify ex-ldpc36-5000a.pchk ex-ldpc36-5000a.dec ex-ldpc36-5000a.gen ex-ldpc36-5000a.src)"
echo $BER
}
export BER
export -f BER_SB
K=5000 # No of Message Bits
N=10000 # No of encoded bits
R=$(echo "scale=3; $K/$N" | bc) # Code Rate
# Creation of Parity Check and Generator files
make-ldpc ex-ldpc36-5000a.pchk $K $N 2 evenboth 3 no4cycle
make-gen ex-ldpc36-5000a.pchk ex-ldpc36-5000a.gen dense
# Creation of file to write BER values
echo>/media/BER/BER_LDPC36_5000_E.txt -n
S=1; # Variable to control no of blocks of source messages
for Eb_No in 0.5 1.0
do
B=$(echo "10^($S+1)" | bc)
# No of Blocks are increased for higher Eb/No values
S=$(($S+1))
# As we have four cores in our PC so we will divide number of source blocks into four subblocks to process these in parallel
SB=$(echo "$B/4" | bc)
# Calculation of Noise Variance from Eb/No values
tmp=$(echo "scale=3; e(($Eb_No/10)*l(10))" | bc -l)
sigma=$(echo "scale=3; sqrt(1/(2*$R*$tmp))" | bc)
# Calling the function to process each subblock
parallel BER_SB ::: 1 2 3 4
BER_T= Here I want to process values of BER variables returned by BER_SB function
done
It is not very clear what you want done. From what you write it seems you want the same 3 lines run 4 times in parallel. That is easily done:
runone() {
mkdir "$1"
cd "$1"
rand-src ex-ldpc36-5000a.src 0 5000 1000
encode ex-ldpc36-5000a.pchk ex-ldpc36-5000a.gen ex-ldpc36-5000a.src ex-ldpc36-5000a.enc
transmit ex-ldpc36-5000a.enc ex-ldpc36-5000a.rec 1 awgn .80
}
export -f runone
parallel runone ::: 1 2 3 4
But that does not use the '1 2 3 4' for anything. If you want the '1 2 3 4' used for anything you will need to describe better what you really want.
Edit:
It is unclear whether you have:
Read the examples: LESS=+/EXAMPLE: man parallel
Walked through the tutorial: man parallel_tutorial
Watched the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
and whether I can assume that the material covered in those is known to you.
In your code you use BER[1]..BER[4], but they are not initialized. You also use BER[x] in the function. Maybe you forgot that a sub-shell cannot pass values in an array back to its parent?
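A minimal sketch of that limitation in plain bash, without parallel. The function compute_ber is a hypothetical stand-in for BER_SB, and the values it prints are placeholders, not real results:

```bash
#!/bin/bash
results=()

# An assignment inside ( ... ) runs in a child process and is lost
# when the child exits.
( results[1]="0.01" )
echo "after subshell: ${#results[@]} elements"

# compute_ber is a hypothetical stand-in for BER_SB: it prints its result
# on stdout instead of writing to a shared array.
compute_ber() { echo "block $1: 0.0$1"; }

# The parent collects what the children print (mapfile needs bash 4+).
mapfile -t results < <(for n in 1 2 3 4; do compute_ber "$n"; done)
echo "after capture: ${#results[@]} elements"
```

The same pattern applies to jobs started by parallel: each job runs in its own process, so its output has to come back through stdout or a file.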
If I were you I would move all the computation in the function and call the function with all needed parameters instead of passing them as environment variables. Something like:
parallel BER_SB ::: 1 2 3 4 ::: 0.5 1.0 ::: $S > computed.out
post process computed.out >>/media/BER/BER_LDPC36_5000_E.txt
To keep the arguments in computed.out you can use --tag. That may make it easier to postprocess.
I have the input below and would like to compute the geometric average of the rows whose “Cpd_number” and ”ID3” are the same. The files contain a lot of data, so we might need arrays to do the trick. However, as an awk beginner, I am not sure how to start. Could anyone kindly offer some hints?
input:
“ID1”,“Cpd_number”, “ID2”,”ID3”,”activity”
“95”,“123”,”4”,”5”,”10”
“95”, “123”,”4”,”5”,”100”
“95”, “123”,”4”,”5”,”1”
“95”, “123”,”4”,”6”,”10”
“95”, “123”,”4”,”6”,”100”
“95”, “456”,”4”,”6”,”10”
“95”, “456”,”4”,”6”,”100”
Three lines of “95”,“123”,”4”,”5” should do a geometric average
Two lines of “95”, “123”,”4”,”6” should do a geometric average
Two lines of “95”, “456”,”4”,”6” should do a geometric average
Here is the desired output:
“ID1”,“Cpd_number”, “ID2”,”ID3”,”activity”
“95”,“123”,”4”,”5”,”10”
“95”, “123”,”4”,”6”,”31.62”
“95”, “456”,”4”,”6”,”31.62”
Some info about geometric mean:
http://en.wikipedia.org/wiki/Geometric_mean
This script computes a geometric mean
#!/usr/bin/awk -f
{
b = $1; # value of 1st column
C += log(b);
D++;
}
END {
print "Geometric mean : ",exp(C/D);
}
Having this file:
$ cat infile
"ID1","Cpd_number","ID2","ID3","activity"
"95","123","4","5","10"
"95","123","4","5","100"
"95","123","4","5","1"
"95","123","4","6","10"
"95","123","4","6","100"
"95","456","4","6","10"
"95","456","4","6","100"
This piece:
awk -F\" 'BEGIN{print} # Print headers
last != $4""$8 && last{ # ONLY when the last key "Cpd_number + ID3"
print line,exp(C/D) # differs from the current one, print line + average
C=D=0} # reset accumulators
{ # This block processes each line of infile
C += log($(NF-1)+0) # C accumulates the logs
D++ # D counter
$(NF-1)="" # Get rid of the activity column in order to print the line
line=$0 # line is the current line without activity
last=$4""$8} # Store the key in order to track switching
END{ # This block triggers after the complete file is read
# to print the last average, which cannot be triggered during
# the previous block
print line,exp(C/D)}' infile
Will output:
ID1 , Cpd_number , ID2 , ID3 , 0
95 , 123 , 4 , 5 , 10
95 , 123 , 4 , 6 , 31.6228
95 , 456 , 4 , 6 , 31.6228
Still some work left for formatting.
NOTE: char " is used instead of “ and ”
EDIT: NF is the number of fields in the current line, so $(NF-1) is the next-to-last field:
$ awk -F\" 'BEGIN{getline}{print $(NF-1)}' infile
10
100
1
10
100
10
100
So in log($(NF-1)+0) we apply the log function to that value (adding 0 forces it to be treated as a number).
D++ is just a counter.
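For the remaining formatting, one possible approach is to collect the sums in awk arrays keyed on Cpd_number and ID3 and print quoted fields with printf. This is only a sketch: geo_mean_by_key is a hypothetical name, and it assumes straight double quotes with no spaces around the commas, as in infile above. Unlike the one-pass version, it does not need rows with the same key to be adjacent:

```bash
#!/bin/bash
# geo_mean_by_key is a hypothetical helper; $1 is the quoted CSV file.
geo_mean_by_key() {
  awk -F'"' '
    NR == 1 { print; next }            # header passes through unchanged
    {
      key = $4 SUBSEP $8               # group on Cpd_number and ID3
      if (!(key in count)) {           # remember first-seen order and IDs
        order[++groups] = key
        id1[key] = $2
        id2[key] = $6
      }
      logsum[key] += log($10)          # sum of the logs of activity
      count[key]++
    }
    END {
      for (g = 1; g <= groups; g++) {
        key = order[g]
        split(key, part, SUBSEP)       # part[1]=Cpd_number, part[2]=ID3
        printf "\"%s\",\"%s\",\"%s\",\"%s\",\"%.2f\"\n",
               id1[key], part[1], id2[key], part[2],
               exp(logsum[key] / count[key])
      }
    }' "$1"
}
```

Called as geo_mean_by_key infile, it prints the header followed by one line per key, with the mean rounded to two decimals.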
Why use awk? You can do it in bash, with either bc or calc to handle the floating point math. You can download calc at http://www.isthe.com/chongo/src/calc/ (2.12.4.13-11 is latest). There are rpms, binary and source tarballs available. It is far superior to bc in my opinion. The routine is fairly simple. You need to remove the extraneous " quotes from your datafile first, leaving a plain csv file. That helps. See the sed command in the comments below. Note, the geometric mean below is the nth root of the product of the n activity values in each group. If you need a different mean, just adjust the code below:
#!/bin/bash
##
## You must strip all quotes from data before processing, or write more code to do
## it here. Just do "$ sed 's/\"//g' < datafile > newdatafile" Then use
## newdatafile as command line argument to this program
##
## Additionally, this script uses 'calc' for floating point math. go download it
## from: http://www.isthe.com/chongo/src/calc/ (2.12.4.13-11 is latest). You can also
## use bc if you like, but why, calc is so much better.
##
## test to make sure file passed as argument is readable
test -r "$1" || { echo "error: invalid input, usage: ${0//*\//} filename"; exit 1; }
## function to strip extraneous whitespace from input
trimWS() {
[[ -z $1 ]] && return 1
strln="${#1}"
[[ strln -lt 2 ]] && return 1
trimSTR=$1
trimSTR="${trimSTR#"${trimSTR%%[![:space:]]*}"}" # remove leading whitespace characters
trimSTR="${trimSTR%"${trimSTR##*[![:space:]]}"}" # remove trailing whitespace characters
echo $trimSTR
return 0
}
let cnt=0
let oldsum=0 # holds value to compare against new Cpd_number & ID3
product=1 # initialize product to 1
pcnt=0 # initialize the number of values in product
IFS=$',\n' # Internal Field Separator, set to break on ',' or newline
while read newid1 newcpd newid2 newid3 newact || test -n "$act"; do
cpd=`trimWS $cpd` # trimWS from cpd (only one that needed it)
# if first iteration, just output first row
test "$cnt" -eq 0 && echo " $newid1 $newcpd $newid2 $newid3 $newact"
# after first iteration, test oldsum -ne sum, if so do geometric mean
# and reset product and counters
if test "$cnt" -gt 0 ; then
sum=$((newcpd+newid3)) # calculate sum to test against oldsum
if test "$oldsum" -ne "$sum" && test "$cnt" -gt 1; then
# geometric mean (nth root of product)
# mean=`calc -p "root ($product, $pcnt)"` # using calc
mean=`echo "scale=6; e( l($product) / $pcnt)" | bc -l` # using bc
echo " $id1 $cpd $id2 $id3 average: $mean"
pcnt=0
product=1
fi
# update last values to new values
oldsum=$sum
id1="$newid1"
cpd="$newcpd"
id2="$newid2"
id3="$newid3"
act="$newact"
((product*=act)) # accumulate product
((pcnt+=1))
fi
((cnt+=1))
done < "$1"
output:
# output using calc
ID1 Cpd_number ID2 ID3 activity
95 123 4 5 average: 10
95 123 4 6 average: 31.62277660168379331999
95 456 4 6 average: 31.62277660168379331999
# output using bc
ID1 Cpd_number ID2 ID3 activity
95 123 4 5 average: 9.999999
95 123 4 6 average: 31.622756
95 456 4 6 average: 31.622756
The updated script calculates the proper mean. It is a bit more involved due to having to keep old/new values to test for the change in cpd & id3. This may be where awk is the simpler way to go. But if you need more flexibility later, bash may be the answer.
My bash script needs to read values from a properties file and assign them to a number of arrays. The number of arrays is controlled via configuration as well. My current code is as follows:
limit=$(sed '/^\#/d' $propertiesFile | grep 'limit' | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
for (( i = 1 ; i <= $limit ; i++ ))
do
#properties that define values to be assigned to the arrays are labeled myprop## (e.g. myprop01, myprop02):
lookupProperty=myprop$(printf "%.2d" "$i")
#the following line reads the value of the lookupProperty, which is a set of space-delimited strings, and assigns it to the myArray# (myArray1, myArray2, etc):
myArray$i=($(sed '/^\#/d' $propertiesFile | grep $lookupProperty | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'))
done
When I attempt to execute the above code, the following error message is displayed:
syntax error near unexpected token `$(sed '/^\#/d' $propertiesFile | grep $lookupProperty | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')'
I am quite sure the issue is in the way I am declaring the "myArray$i" arrays. However, any different approach I tried produced either the same errors or incomplete results.
Any ideas/suggestions?
You are right that bash does not recognize the construct myArray$i=(some array values) as an array variable assignment. One work-around is:
read -a myArray$i <<<"a b c"
The read -a varname command reads an array from stdin, which is provided by the "here" string <<<"a b c", and assigns it to varname where varname can be constructs like myArray$i. So, in your case, the command might look like:
read -a myArray$i <<<"$(sed '/^\#/d' $propertiesFile | grep $lookupProperty | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')"
The above allows assignment. The next issue is how to read out variables like myArray$i. One solution is to name the variable indirectly like this:
var="myArray$i[2]" ; echo ${!var}
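Putting both halves together, a short sketch with a fixed index and a literal here-string in place of the properties lookup (both are assumptions for illustration):

```bash
#!/bin/bash
i=1
read -r -a "myArray$i" <<<"a b c"     # creates myArray1=(a b c)

one="myArray$i[1]"                     # the string myArray1[1]
echo "${!one}"                         # indirect lookup of a single element

all="myArray$i[@]"                     # the string myArray1[@]
for item in "${!all}"; do              # indirect lookup of every element
  echo "item: $item"
done
```

In bash 4.3 and later, a nameref (declare -n) is another way to refer to an array whose name is built at run time.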