The below script:
#!/bin/bash
otscurrent="
AAA,33854,4528,38382,12
BBB,83917,12296,96213,13
CCC,20399,5396,25795,21
DDD,27198,4884,32082,15
EEE,2472,981,3453,28
FFF,3207,851,4058,21
GGG,30621,4595,35216,13
HHH,8450,1504,9954,15
III,4963,2157,7120,30
JJJ,51,59,110,54
KKK,87,123,210,59
LLL,573,144,717,20
MMM,617,1841,2458,75
NNN,234,76,310,25
OOO,12433,1908,14341,13
PPP,10627,1428,12055,12
QQQ,510,514,1024,50
RRR,1361,687,2048,34
SSS,1,24,25,96
TTT,0,5,5,100
UUU,294,1606,1900,85
"
IFS="," array1=(${otscurrent})
echo ${array1[4]}
Prints:
$ ./test.sh
12
BBB
I'm trying to get it to print just 12, and I'm not even sure how to make it print just row 5, column 4.
The variable is an output of a sqlquery that has been parsed with several sed commands to change the formatting to csv.
otscurrent="$(sqlplus64 user/password@dbserverip/db as sysdba @query.sql |
sed '1,11d; /^-/d; s/[[:space:]]\{1,\}/,/g; $d' |
sed '$d'|sed '$d'|sed '$d' | sed '$d' |
sed 's/Used,MB/Used MB/g' |
sed 's/Free,MB/Free MB/g' |
sed 's/Total,MB/Total MB/g' |
sed 's/Pct.,Free/Pct. Free/g' |
sed '1b;/^Name/d' |
sed '/^$/d'
)"
Ultimately I would like to be able to call on a row and column and run statements on the values.
Initially I was piping that into:
awk -F "," 'NR>1{ if($5 < 10) { printf "%-30s%-10s%-10s%-10s%-10s\n", $1,$2,$3,$4,$5"%"; } else { echo "Nothing to do" } }')"
Which works, but I couldn't run commands from the if/else, or at least I didn't know how.
If you have bash 4.0 or newer, an associative array is an appropriate way to store data in this kind of form.
otscurrent=${otscurrent#$'\n'} # strip leading newline present in your sample data
declare -A data=( )
row=0
while IFS=, read -r -a line; do
  for idx in "${!line[@]}"; do
    data["$row,$idx"]=${line[$idx]}
  done
  (( row += 1 ))
done <<<"$otscurrent"
This lets you access each individual item:
echo "${data[0,0]}" # first field of first line
echo "${data[9,0]}" # first field of tenth line
echo "${data[9,1]}" # second field of tenth line
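Since the stated goal is to run statements on the values, the same structure can be walked row by row. The following is a minimal sketch, not part of the original answer: the three-row sample, the `threshold` value, and the `flagged` array are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Three sample rows in the same CSV layout as the question.
otscurrent='AAA,33854,4528,38382,12
MMM,617,1841,2458,75
TTT,0,5,5,100'

declare -A data=( )
row=0
while IFS=, read -r -a line; do
  for idx in "${!line[@]}"; do
    data["$row,$idx"]=${line[$idx]}
  done
  (( row += 1 ))
done <<<"$otscurrent"

threshold=50   # hypothetical cut-off, not taken from the question
flagged=( )
for (( r = 0; r < row; r++ )); do
  # Column 4 (zero-based) is the last, percentage-like field.
  if (( ${data[$r,4]} > threshold )); then
    flagged+=( "${data[$r,0]}" )   # any shell command could run here instead
  fi
done
printf '%s\n' "${flagged[@]}"   # prints MMM, then TTT
```

Because the test is an ordinary shell `if`, the branch body can call any command, which is what the awk version in the question could not do directly.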
"I'm trying to get it to just print 12..."
The issue is that IFS="," splits on commas and there is no comma between 12 and BBB. If you want those to be separate elements, add a newline to IFS. Thus, replace:
IFS="," array1=(${otscurrent})
With:
IFS=$',\n' array1=(${otscurrent})
Output:
$ bash test.sh
12
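With every field in one flat array, a row and column can be turned into an index by arithmetic. This is a rough sketch under the assumption that every row has exactly five fields; `cell`, `ncols`, and `old_ifs` are names invented here, and IFS is restored afterwards because an assignment of this form changes it for the rest of the script.

```shell
#!/usr/bin/env bash
otscurrent='
AAA,33854,4528,38382,12
BBB,83917,12296,96213,13
CCC,20399,5396,25795,21'

# Split on commas and newlines, then put IFS back.
old_ifs=$IFS
IFS=$',\n'
array1=( $otscurrent )
IFS=$old_ifs

ncols=5   # assumed: every row has exactly five fields

# cell ROW COL, both one-based (hypothetical helper)
cell() {
  local i=$(( ($1 - 1) * ncols + ($2 - 1) ))
  echo "${array1[$i]}"
}

cell 1 5   # prints 12
cell 2 1   # prints BBB
cell 3 4   # prints 25795
```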
All you need to print the value of the 4th column on the 5th row is:
$ awk -F, 'NR==5{print $4}' <<< "$otscurrent"
3453
and just remember that in awk row (record) and column (field) numbers start at 1, not 0. Some more examples:
$ awk -F, 'NR==1{print $5}' <<< "$otscurrent"
12
$ awk -F, 'NR==2{print $1}' <<< "$otscurrent"
BBB
$ awk -F, '$5 > 50' <<< "$otscurrent"
JJJ,51,59,110,54
KKK,87,123,210,59
MMM,617,1841,2458,75
SSS,1,24,25,96
TTT,0,5,5,100
UUU,294,1606,1900,85
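To run shell commands on the rows awk selects, one option is to let awk do the filtering and read its output in a shell loop. This is only a sketch: the field names `name`, `used`, `free`, `total`, and `pct` are guesses based on the column headings in the question, and the `alerts` array stands in for whatever command should run.

```shell
#!/usr/bin/env bash
otscurrent='JJJ,51,59,110,54
LLL,573,144,717,20
TTT,0,5,5,100'

alerts=( )
# awk selects the rows; the shell loop acts on each selected row.
while IFS=, read -r name used free total pct; do
  alerts+=( "$name is at $pct%" )   # replace with any command
done < <(awk -F, '$5 > 50' <<<"$otscurrent")

printf '%s\n' "${alerts[@]}"
# prints:
# JJJ is at 54%
# TTT is at 100%
```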
If you'd like to avoid all of the complexity and simply parse your SQL output to produce what you want without 20 sed commands in between, post a new question showing the raw sqlplus output as the input and what you want finally output and someone will post a brief, clear, simple, efficient awk script to do it all at one time, or maybe 2 commands if you still want an intermediate CSV for some reason.
I'm looking for a way to find non-repeated elements in an array in bash.
Simple example:
joined_arrays=(CVE-2015-4840 CVE-2015-4840 CVE-2015-4860 CVE-2015-4860 CVE-2016-3598)
<magic>
non_repeated=(CVE-2016-3598)
To give context, the goal here is to end up with an array of all package update CVEs that aren't generally available via 'yum update' on a host due to being excluded. The way I came up with doing such a thing is to populate 3 preliminary arrays:
available_updates=() #just what 'yum update' would provide
all_updates=() #including excluded ones
joined_updates=() # contents of both prior arrays
Then apply logic to joined_updates=() that would return only elements that are included exactly once. Any element with two occurrences is one that can be updated normally and doesn't need to end up in the 'excluded_updates=()' array.
Hopefully this makes sense. As I was typing it out I'm wondering if it might be simpler to just remove all elements found in available_updates=() from all_updates=(), leaving the remaining ones as the excluded updates.
Thanks!
One pure-bash approach is to store a counter in an associative array, and then look for items where the counter is exactly one:
declare -A seen=( ) # create an associative array (requires bash 4)
for item in "${joined_arrays[@]}"; do # iterate over original items
  (( seen[$item] += 1 )) # increment value associated with item
done
declare -a non_repeated=( )
for item in "${!seen[@]}"; do # iterate over keys
  if (( ${seen[$item]} == 1 )); then # if counter for that key is 1...
    non_repeated+=( "$item" ) # ...add that item to the output array.
  fi
done
declare -p non_repeated # print result
Another, terser (but buggier -- doesn't work with values containing newline literals) approach is to take advantage of standard text manipulation tools:
non_repeated=( ) # setup
# use uniq -c to count; filter for results with a count of 1
while read -r count value; do
(( count == 1 )) && non_repeated+=( "$value" )
done < <(printf '%s\n' "${joined_arrays[@]}" | sort | uniq -c)
declare -p non_repeated # print result
...or, even terser (and buggier, requiring that the array value split into exactly one field in awk):
readarray -t non_repeated \
  < <(printf '%s\n' "${joined_arrays[@]}" | sort | uniq -c | awk '$1 == 1 { print $2; }')
To crib an answer I really should have come up with myself from @Aaron (who deserves an upvote from anyone using this; do note that it retains the doesn't-work-with-values-with-newlines bug), one can also use uniq -u:
readarray -t non_repeated < <(printf '%s\n' "${joined_arrays[@]}" | sort | uniq -u)
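If values may contain newlines, a NUL-delimited pipeline is one way around the limitation. This sketch assumes GNU sort and uniq (for their -z options) and bash 4.4 or newer (for readarray -d ''); none of that comes from the original answers, and the sample values are made up.

```shell
#!/usr/bin/env bash
# Two copies of a value containing a newline, plus one unique value.
joined_arrays=( $'a\nb' $'a\nb' 'CVE-2016-3598' )

# NUL-terminate every element so embedded newlines survive the pipeline.
readarray -d '' -t non_repeated \
  < <(printf '%s\0' "${joined_arrays[@]}" | sort -z | uniq -zu)

declare -p non_repeated   # only CVE-2016-3598 remains
```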
I would rely on uniq.
Its -u option is made for exactly this case: it outputs only the lines that occur once. It requires its input to be a sorted, linefeed-separated list of tokens, hence the printf and sort:
$ my_test_array=( 1 2 3 2 1 0 )
$ printf '%s\n' "${my_test_array[@]}" | sort | uniq -u
0
3
Here is a single awk based solution that doesn't require sort:
arr=( 1 2 3 2 1 0 )
printf '%s\n' "${arr[@]}" |
awk '{++fq[$0]} END{for(i in fq) if (fq[i]==1) print i}'
0
3
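The simpler idea raised at the end of the question, removing everything in available_updates from all_updates, can be sketched with an associative array used as a lookup set. The array contents below are sample values taken from the question's example, and the `available` lookup table is a name introduced here for illustration.

```shell
#!/usr/bin/env bash
all_updates=(CVE-2015-4840 CVE-2015-4860 CVE-2016-3598)
available_updates=(CVE-2015-4840 CVE-2015-4860)

# Build a lookup set of everything 'yum update' would provide (bash 4+).
declare -A available=( )
for item in "${available_updates[@]}"; do
  available[$item]=1
done

# Keep only the items that are not in the lookup set.
excluded_updates=( )
for item in "${all_updates[@]}"; do
  [[ ${available[$item]+set} ]] || excluded_updates+=( "$item" )
done

declare -p excluded_updates   # contains only CVE-2016-3598
```

This skips the joined array entirely and keeps all_updates in its original order.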
var1=$(echo $getDate | awk '{print $1} {print $2}')
var2=$(echo $getDate | awk '{print $3} {print $4}')
var3=$(echo $getDate | awk '{print $5} {print $6}')
Instead of repeating like the code above, I need to:
loop the same command
increment the values ({print $1} {print $2})
store the value in an array
I was doing something like below but I am stuck maybe someone can help me please:
COMMAND=`find $locationA -type f | wc -l`
getDate=$(find $locationA -type f | xargs ls -lrt | awk '{print $6} {print $7}')
a=1
b=2
for i in $COMMAND
do
i=$(echo $getDate | awk '{print $a} {print $b}')
myarray+=('$i')
a=$((a+1))
b=$((b+1))
done
PS - using ksh
Problem: $COMMAND stores the number of files found in $locationA. I need to loop through the amount of files found and store their dates in an array.
I don't get the meaning of your example code (what is the 'for' loop supposed to do? What is the content of the variable COMMAND?), but in your question you ask to store something in an array, while in the code you wish to simplify, you don't use an array, but simple variables (var1, var2, ....).
If I understand your requirement correctly, your variable getDate contains a string of several words, which are separated by spaces, and you want to assign the first two words to var1, the following two words to var2, and so on. Is this correct?
Now the edited code is at least a bit clearer, though I still don't understand, why you use i as a loop variable, and overwrite it in the first statement inside the loop.
However, a few comments:
If you push '$i' into your array, you will get a literal '$' sign followed by the letter 'i'. To add the value of a variable i containing two numbers, you need double quotes ("$i").
I don't understand why you want to loop over the content of the variable COMMAND. This variable will always hold a single number, which means that the loop will be executed exactly once.
You could use a counting loop, incrementing loop variable by 2 on each iteration. You would have to precalculate the number of iterations beforehand.
Perhaps an easier alternative, which would work in bash or in zsh (I did not try other shells), is to first turn your variable into an array,
tmparr=($(echo $getDate|fmt -w 1))
and then use a loop to collect pairs of its elements:
myarray=()
for ((i=0; i<${#tmparr[*]}; i+=2))
do
myarray+=("${tmparr[$i]} ${tmparr[$((i+1))]}")
done
${myarray[0]} will hold a string consisting of the first two words from getDate, etc.
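Another way to collect pairs, without fmt or index arithmetic, is to load the words into the positional parameters and shift two at a time. This is a sketch only: the sample value of getDate is an assumed format, and the `set --` line overwrites the script's own arguments, so it is best kept inside a function in real use.

```shell
getDate='Jan 12 Feb 3 Mar 27'   # sample value, assumed format

set -- $getDate                 # unquoted on purpose: split into words
myarray=()
while [ $# -ge 2 ]; do
  myarray+=("$1 $2")            # store one "month day" pair per element
  shift 2
done

printf '%s\n' "${myarray[@]}"
# prints:
# Jan 12
# Feb 3
# Mar 27
```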
This one should work on zsh, at least with newer versions:
myarray=()
echo $getDate|fmt -w 1|paste -s -d " \n"|while read s; do myarray+=("$s"); done
This leaves the first pair in ${myarray[1]}, etc.
It doesn't work with bash (and old zsh versions), because these shells would execute the body of the loop in a subshell.
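In bash, the subshell problem can be sidestepped by feeding the loop from a process substitution, so the loop body runs in the current shell. A minimal sketch, assuming a sample getDate value and using printf in place of fmt to put one word per line:

```shell
#!/usr/bin/env bash
getDate='Jan 12 Feb 3'          # sample value, assumed format

myarray=()
# The loop runs in the current shell, so myarray keeps its contents.
while read -r s; do
  myarray+=("$s")
done < <(printf '%s\n' $getDate | paste -s -d ' \n' -)

printf '%s\n' "${myarray[@]}"
# prints:
# Jan 12
# Feb 3
```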
ADDED:
On a second thought, in zsh this one would be simpler:
myarray=("${(f)$(echo $getDate|fmt -w 1|paste -s -d ' \n')}")
I'm writing a script in Bash and I have a problem summing the elements of an array. I add the results of du for two paths to the array, and I want to get the sum of its elements.
use=()
i=0
for d in '$PATH1' '$PATH2'
do
usagebck=$(du $d | awk '{print awk $1}')
use[i]=$usagebck
sum=0
for j in $use
do
sum=$($sum + ${use[$i]})
done
i=$((i+1))
done
echo ${use[*]}
If your du has option -s:
use=()
sum=0
for d in "$PATH1" "$PATH2"
do
usagebck="$(du -s "$d" | awk 'END{print $1}')"
use+=($usagebck)
((sum+=$usagebck))
done
echo ${use[*]}
echo $sum
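If the array is already filled and only the total is needed, a separate loop over its elements does the job. This sketch uses made-up sizes in place of real du output:

```shell
#!/usr/bin/env bash
use=(1200 340 56)   # sample sizes standing in for du output

sum=0
for size in "${use[@]}"; do
  (( sum += size ))   # arithmetic context: no $ needed on the names
done

echo "$sum"   # prints 1596
```

The loop in the question went wrong in two places: `$use` expands to the first element only, and `$($sum + ...)` is command substitution, not arithmetic.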
First, take a look at the options to du. The -c option, available in both GNU and BSD du, prints a grand total. The -a option, also available in both, reports on every file in a directory rather than only on the directories.
Since you're already using awk, why not do everything in awk?
$ du -ms $PATH1 $PATH2 |
awk 'BEGIN {sum = 0}
END {print "Total: " sum }
{
sum+=$1
print $0
}'
du -ms prints one summarized total per argument (-s), in megabytes (-m)
BEGIN is executed before the main awk program. Here I'm initializing sum. This isn't necessary because variables are assumed to equal zero when created.
END is executed after the main awk program. Here, I'm specifying that I want sum printed.
Between the { ... } is the main Awk program. Two lines. The first line adds Column 1 (the size of the file) to sum. The second line prints out the entire line.
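The same program can be tried without touching the disk by feeding it canned lines. In this sketch the two size/path pairs are invented sample input, the BEGIN block is left out because awk starts sum at zero anyway, and `total_line` is a name introduced here to capture the last line of output.

```shell
# Two fake "size<TAB>path" lines, shaped like du -ms output.
total_line=$(printf '%s\t%s\n' 12 /path/one 30 /path/two |
  awk '{ sum += $1; print } END { print "Total: " sum }' |
  tail -n 1)

echo "$total_line"   # prints Total: 42
```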
I need to know how many processes are running for a specific task (e.g. number of Apache tomcats) and if it's 1, then print the PID. Otherwise print out a message.
I need this in a BASH script, now when I perform something like:
result=`ps aux | grep tomcat | awk '{print $2}' | wc -l`
The number of items is assigned to result. Hurrah! But I don't have the PID(s). However when I attempt to perform this as an intermediary step (without the wc), I encounter problems. So if I do this:
result=`ps aux | grep tomcat | awk '{print $2}'`
Any attempts I make to modify the variable result just don't seem to work. I've tried set and tr (replace blanks with line-breaks), but I just cannot get the right result. Ideally I'd like the variable result to be an array with the PIDs as individual elements. Then I can see size, elements, easily.
Can anyone suggest what I am doing wrong?
Thanks,
Phil
Update:
I ended up using the following syntax:
pids=(`ps aux | grep "${searchStr}"| grep -v grep | awk '{print $2}'`)
number=${#pids[@]}
The key was putting the brackets around the back-ticked commands. Now the variable pids is an array and can be asked for length and elements.
Thanks to both choroba and Dimitre for their suggestions and help.
pids=($(
ps -eo pid,command |
sed -n '/[t]omcat/{s/^ *\([0-9]\+\).*/\1/;p}'
))
number=${#pids[@]}
pids=( ... ) creates an array.
$( ... ) returns its output as a string (similar to backquote).
Then, sed is called on the list of all the processes: for lines containing tomcat (the [t] prevents the sed command itself from being included), only the pid is preserved and printed.
You may need to adjust the pgrep command (you may need or may not need the -f option).
_pids=(
$( pgrep -f tomcat )
)
(( ${#_pids[@]} == 1 )) &&
echo ${_pids[0]} ||
echo message
If you want to print the number of pids (with a message):
_pids=(
$( pgrep -f tomcat )
)
(( ${#_pids[@]} == 1 )) &&
echo ${_pids[0]} ||
echo "${#_pids[@]} running"
It should be noted that the pgrep utility and the syntax used are not standard.
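The same check can also be written with an explicit if/else, which avoids the `A && B || C` idiom running C whenever B fails. This is a sketch with hard-coded sample PIDs rather than a live ps or pgrep call; `report_pids` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# report_pids PID... : print the PID if there is exactly one, else a count.
report_pids() {
  if (( $# == 1 )); then
    echo "$1"
  else
    echo "$# processes running"
  fi
}

pids=(4242)                 # sample data, not real process IDs
report_pids "${pids[@]}"    # prints 4242

pids=(4242 4243)
report_pids "${pids[@]}"    # prints 2 processes running
```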