Bash: delete duplicates ("doublons") within an array

I have a file containing several fields. I am trying to delete duplicates ("doublons", e.g. the same attribute with two different dates) within the same field. For example, from this:
Andro manual gene 1 100 . + . ID=truc;Name=truc;modified=13-09-1993;added=13-09-1993;modified=13-09-1997
Andro manual mRNA 1 100 . + . ID=truc-mRNA;Name=truc-mRNA;modified=13-09-1993;added=13-09-1993;modified=13-09-1997
We can see that modified=13-09-1993 and modified=13-09-1997 are duplicates, so I want to obtain this:
Andro manual gene 1 100 . + . ID=truc;Name=truc;added=13-09-1993;modified=13-09-1997
Andro manual mRNA 1 100 . + . ID=truc-mRNA;Name=truc-mRNA;added=13-09-1993;modified=13-09-1997
I want to keep the latest occurrence of a given attribute and delete the oldest one. A row will contain the same attribute at most twice.
I've tried this code (which is now working):
INPUT=$1
ID=$2
ALL_FEATURES=()
CONTIG_FEATURES=$(grep $ID $INPUT)
while read LINE; do
FEATURES=$(echo -e "$LINE" | cut -f 9)
#For each line, store all attributes from every line in an array
IFS=';' read -r -a ARRAY <<< "$FEATURES"
#Once the array is created, loop in array to look for doublons
for INDEX in "${!ARRAY[@]}"
do
ELEMENT=${ARRAY[INDEX]}
#If we are not at the end of the array, compare actual element and next element
ACTUAL=$ELEMENT
for INDEX2 in "${!ARRAY[@]}"
do
NEXT="${ARRAY[INDEX2]}"
ATTRIBUTE1=$(echo -e "$ACTUAL" | cut -d'=' -f1)
ATTRIBUTE2=$(echo -e "$NEXT" | cut -d'=' -f1)
echo "Comparing element number $INDEX ($ATTRIBUTE1) with element number $INDEX2 ($ATTRIBUTE2) ..."
if [[ $ATTRIBUTE1 = $ATTRIBUTE2 ]] && [[ $INDEX -ne $INDEX2 ]]
then
echo "Deleting features..."
#Delete actual element, because next element will be more recent
NEW=()
for VAL in "${ARRAY[@]}"
do
[[ $VAL != "${ARRAY[INDEX]}" ]] && NEW+=("$VAL")
done
ARRAY=("${NEW[@]}")
unset NEW
fi
done
done
#Rewriting array into string separated by ;
FEATURES2=$( IFS=$';'; echo "${ARRAY[*]}" )
sed -i "s/$FEATURES/$FEATURES2/g" $INPUT
done < <(echo -e "$CONTIG_FEATURES")
I need advice because I think my array approach may not be a clever one, but I want a bash solution in any case. If anyone has bash advice or shortcuts, any suggestion to improve my bash understanding will be appreciated.
I'm sorry if I forgot any details, thanks for your help.
Roxane
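For what it's worth, the keep-the-last-occurrence logic can also be done in pure bash with an associative array (bash 4+). This is a sketch, not taken from the question's script; the function name and variable names are mine:

```shell
#!/usr/bin/env bash
# Sketch: keep only the LAST occurrence of each key in a ;-separated
# key=value attribute string. Requires bash 4+ (associative arrays).
dedupe_attrs() {
    local features=$1
    declare -A seen=()      # keys already kept
    local -a keep=()        # surviving attributes, in original order
    local -a attrs
    IFS=';' read -r -a attrs <<< "$features"
    # Walk backwards so the last occurrence of each key wins.
    for (( i=${#attrs[@]}-1; i>=0; i-- )); do
        local key=${attrs[i]%%=*}
        if [[ -z ${seen[$key]+x} ]]; then
            seen[$key]=1
            keep=( "${attrs[i]}" "${keep[@]}" )   # prepend to preserve order
        fi
    done
    ( IFS=';'; printf '%s\n' "${keep[*]}" )       # subshell keeps IFS local
}

dedupe_attrs "ID=truc;Name=truc;modified=13-09-1993;added=13-09-1993;modified=13-09-1997"
# -> ID=truc;Name=truc;added=13-09-1993;modified=13-09-1997
```

This relies on "modified" dates appearing in chronological order within a row, as in the sample data, so keeping the last occurrence keeps the newest date.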

In awk:
$ awk '
{
n=split($NF,a,";") # split the last field by ;
for(i=n;i>=1;i--) { # iterate them backwards to keep the last "doublon"
split(a[i],b,"=") # split key=value at =
if(b[1] in c==0) { # if key not in c hash
d=a[i] (d==""?"":";") d # append key=value to d with ;
c[b[1]] # hash key into c
}
}
$NF=d                # set the last field to d
delete c # clear c for next record
d=""                 # ditto
}
1 # output
' file
Andro manual gene 1 100 . + . ID=truc;Name=truc;added=13-09-1993;modified=13-09-1997
Andro manual mRNA 1 100 . + . ID=truc-mRNA;Name=truc-mRNA;added=13-09-1993;modified=13-09-1997

The following awk could also help with the same:
awk -F';' '{
for(i=NF;i>0;i--){
split($i, array,"=");
if(++a[array[1]]>1){
$i="\b"
}
};
delete a
}
1
' OFS=";" Input_file
Output will be as follows.
Andro manual gene 1 100 . + . ID=truc;Name=truc;added=13-09-1993;modified=13-09-1997
Andro manual mRNA 1 100 . + . ID=truc-mRNA;Name=truc-mRNA;added=13-09-1993;modified=13-09-1997

How to perform a string operation on an input string and get the desired output

I have to write a shell script that accepts the product order details as string input and displays the total order amount and the number of orders under each category.
INPUT:
101:Redmi:Mobile:15000#102:Samsung:TV:20000#103:OnePlus:Mobile:35000#104:HP:Laptop:65000#105:Samsung:Mobile:10000#106:Samsung:TV:30000
OUTPUT:
Mobile:60000:3#TV:50000:2#Laptop:65000:1
I have to achieve this using only the sort, tr, cut and grep commands; no sed or awk should be used.
Here is my solution (it can be improved); I will explain it:
x="101:Redmi:Mobile:15000#102:Samsung:TV:20000#103:OnePlus:Mobile:35000#104:HP:Laptop:65000#105:Samsung:Mobile:10000#106:Samsung:TV:30000"
echo $x > origin.txt
cat origin.txt | grep -E -o ':[a-zA-Z]+:[0-9]+' | cut -c 2- > temp.txt
categories=()
quantity=()
items=()
while IFS= read -r line
do
category=$(echo $line | grep -E -o '[a-zA-Z]+')
amount=($(( $(echo $line | grep -E -o '[0-9]+') )))
if [ "0" = "${#categories[@]}" ]
then
# Add new element at the end of the array
categories+=( $category )
quantity+=( $amount )
items+=(1)
else
let in=0
let index=0
let i=0
# Iterate the loop to read and print each array element
for value in "${categories[@]}"
do
if [ $category = $value ]
then
let in=$in+1
let index=$i
fi
let i=$i+1
done
if [ $in = 0 ]
then
categories+=( $category )
quantity+=( $amount )
items+=(1)
else
let sum=$amount+${quantity[$index]}
quantity[$index]=$sum
let newitems=${items[$index]}+1
items[$index]=$newitems
fi
fi
done < temp.txt
let j=0
for value in "${categories[@]}"
do
echo -n $value
echo -n ':'
echo -n ${quantity[$j]}
echo -n ':'
echo -n ${items[$j]}
let k=$j+1
if [ $k != ${#categories[@]} ]
then
echo -n '#'
fi
let j=$j+1
done
First I save the string in x and then in origin.txt; with cat, grep and cut I obtain the string in this format inside temp.txt:
Mobile:15000 \n
TV:20000 \n
Mobile:35000 \n
Laptop:65000 \n
Mobile:10000 \n
TV:30000
I create three arrays: categories to store the category names, quantity to store the amount for each category, and items for the number of objects in each category.
Then I read temp.txt line by line; with a simple regex I can get the category name and the amount on that line.
amount=($(( $(echo $line | grep -E -o '[0-9]+') )))
This converts the string into an integer.
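Since every bash variable is a string, it is the $(( )) arithmetic expansion that forces numeric evaluation. A minimal illustration (values are just examples):

```shell
s="15000"              # what grep -E -o '[0-9]+' yields: a string
n=$(( s ))             # arithmetic context evaluates the string as a number
echo $(( n + 5000 ))   # -> 20000
```

Note the extra outer parentheses in the original line additionally wrap the value in a one-element array, which is not actually needed here.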
The first if checks whether the array is empty; if it is, we append the category, the amount, and 1 object.
If it is not empty, we need three variables:
in, to check whether the incoming category is already in the array
index, to store the index of the existing category
i, to count the loop iterations
If the category already exists in the array, $in is set to 1 or higher.
If not, $in keeps its default value 0.
Next, if in equals 0, the category is new and we append to the three arrays.
If in is not 0, we use the index to add the new amount to the stored amount.
Finally, when the main loop finishes, I print the data in the format you mentioned.
The variable k is there to avoid printing the last "#"; we compare k with the size of the array. Note that arrays are 0-indexed.
The input is:
x="101:Redmi:Mobile:15000#102:Samsung:TV:20000#103:OnePlus:Mobile:35000#104:HP:Laptop:65000#105:Samsung:Mobile:10000#106:Samsung:TV:30000"
First, split records on lines:
tr '#' '\n' <<< "$x"
This gives:
101:Redmi:Mobile:15000
102:Samsung:TV:20000
103:OnePlus:Mobile:35000
104:HP:Laptop:65000
105:Samsung:Mobile:10000
106:Samsung:TV:30000
Better to keep only relevant data and store it in a variable:
data="$(tr '#' '\n' <<< "$x" | cut -d: -f3,4)"
echo "$data"
This gives:
Mobile:15000
TV:20000
Mobile:35000
Laptop:65000
Mobile:10000
TV:30000
Count how many items in each category. sed is used for formatting:
categories="$(cut -d: -f1 <<< "$data" | sort | uniq -c | sed 's/^ *//;s/ /:/')"
echo "$categories"
This gives:
1:Laptop
3:Mobile
2:TV
Then loop over categories to extract and compute total amount for each:
for i in $categories
do
count=${i%:*}
item=${i/*:}
amount=$(($(grep $item <<< "$data"| cut -d: -f2 | tr '\n' '+')0))
echo $item $amount $count
done
This gives:
Laptop 65000 1
Mobile 60000 3
TV 50000 2
Finally, a little bit of formatting after the done:
<...> done | tr ' \n' ':#' | sed 's/#$//' ; echo
This gives:
Laptop:65000:1#Mobile:60000:3#TV:50000:2

How to create multiple arrays from a text file and loop through the values of each array

I have a text file with the following:
Paige
Buckley
Govan
Mayer
King
Harrison
Atkins
Reinhardt
Wilson
Vaughan
Sergovia
Tarrega
My goal is to create an array for each set of names, then iterate through the first array of values, then the second, and lastly the third. Each set is separated by a blank line in the text file. Help with the code or logic is much appreciated!
So far I have the following. I am unsure of the logic moving forward when I reach a line break. My research also suggests that I can use readarray -d.
#!/bin/bash
my_array=()
while IFS= read -r line || [[ "$line" ]]; do
if [[ $line == "" ]];
.
.
.
arr+=("$line") # i know this adds the value to the array
done < "$1"
printf '%s\n' "${my_array[@]}"
desired output:
array1 = (Paige Buckley Govan Mayer King)
array2 = (Harrison Atkins Reinhardt Wilson)
array3 = (Vaughan Sergovia Tarrega)
#then loop through the each array one after the other.
Bash has no arrays-of-arrays, so you have to represent the structure in another way.
You could leave the newlines and have an array of newline separated elements:
array=()
elem=""
while IFS= read -r line; do
if [[ "$line" != "" ]]; then
elem+="${elem:+$'\n'}$line" # accumulate lines in elem
else
array+=("$elem") # flush elem as array element
elem=""
fi
done
if [[ -n "$elem" ]]; then
array+=("$elem") # flush the last elem
fi
# iterate over array
for ((i=0;i<${#array[@]};++i)); do
# each array element is newline separated items
readarray -t elem <<<"${array[i]}"
printf 'array%d = (%s)\n' "$i" "${elem[*]}"
done
You could simplify the loop using some unique character and sed, for example:
readarray -d '#' -t array < <(sed -z 's/\n\n/#/g' file)
But overall, this awk generates the same output:
awk -v RS= -v FS='\n' '{
printf "array%d = (", NR;
for (i=1;i<=NF;++i) printf "%s%s", $i, i==NF?"":" ";
printf ")\n"
}'
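For example, with the names saved to a file, the awk above can be invoked like this (the path is illustrative; note the group numbering starts at 1 here because it comes from NR, while the bash loop above starts at 0):

```shell
# Build the sample file from the question (path is illustrative).
printf '%s\n' Paige Buckley Govan Mayer King '' \
    Harrison Atkins Reinhardt Wilson '' \
    Vaughan Sergovia Tarrega > /tmp/names.txt

# RS= enables paragraph mode: records are separated by blank lines,
# and with FS='\n' each line of a group becomes one field.
awk -v RS= -v FS='\n' '{
    printf "array%d = (", NR
    for (i = 1; i <= NF; ++i) printf "%s%s", $i, (i == NF ? "" : " ")
    printf ")\n"
}' /tmp/names.txt
# -> array1 = (Paige Buckley Govan Mayer King)
#    array2 = (Harrison Atkins Reinhardt Wilson)
#    array3 = (Vaughan Sergovia Tarrega)
```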
Using a nameref:
#!/usr/bin/env bash
declare -a array1 array2 array3
declare -n array=array$((n=1))
while IFS= read -r line; do
test "$line" = "" && declare -n array=array$((n=n+1)) || array+=("$line")
done < "$1"
declare -p array1 array2 array3
Called with :
bash test.sh data
# result
declare -a array1=([0]="Paige" [1]="Buckley" [2]="Govan" [3]="Mayer" [4]="King")
declare -a array2=([0]="Harrison" [1]="Atkins" [2]="Reinhardt" [3]="Wilson")
declare -a array3=([0]="Vaughan" [1]="Sergovia" [2]="Tarrega")
Assumptions:
blank lines are truly blank (ie, no need to worry about any white space on said lines)
could have consecutive blank lines
names could have embedded white space
the number of groups could vary and won't always be 3 (as with the sample data provided in the question)
OP is open to using a (simulated) 2-dimensional array as opposed to a (variable) number of 1-dimensional arrays
My data file:
$ cat names.dat
<<< leading blank lines
Paige
Buckley
Govan
Mayer
King Kong
<<< consecutive blank lines
Harrison
Atkins
Reinhardt
Wilson
Larry
Moe
Curly
Shemp
Vaughan
Sergovia
Tarrega
<<< trailing blank lines
One idea that uses a couple arrays:
array #1: associative array - the previously mentioned (simulated) 2-dimensional array with the index - [x,y] - where x is a unique identifier for a group of names and y is a unique identifier for a name within a group
array #2: 1-dimensional array to keep track of max(y) for each group x
Loading the arrays:
unset names max_y # make sure array names are not already in use
declare -A names # declare associative array
x=1 # init group counter
y=0 # init name counter
max_y=() # initialize the max(y) array
inc= # clear increment flag
while read -r name
do
if [[ "${name}" = '' ]] # if we found a blank line ...
then
[[ "${y}" -eq 0 ]] && # if this is a leading blank line then ...
continue # ignore and skip to the next line
inc=y # set flag to increment 'x'
else
[[ "${inc}" = 'y' ]] && # if increment flag is set ...
max_y[${x}]="${y}" && # make note of max(y) for this 'x'
((x++)) && # increment 'x' (group counter)
y=0 && # reset 'y'
inc= # clear increment flag
((y++)) # increment 'y' (name counter)
names[${x},${y}]="${name}" # save the name
fi
done < names.dat
max_y[${x}]="${y}" # make note of the last max(y) value
Contents of the array:
$ typeset -p names
declare -A names=([1,5]="King Kong" [1,4]="Mayer" [1,1]="Paige" [1,3]="Govan" [1,2]="Buckley" [3,4]="Shemp" [3,3]="Curly" [3,2]="Moe" [3,1]="Larry" [2,4]="Wilson" [2,2]="Atkins" [2,3]="Reinhardt" [2,1]="Harrison" [4,1]="Vaughan" [4,2]="Sergovia" [4,3]="Tarrega" )
$ for (( i=1; i<=${x}; i++ ))
do
for (( j=1; j<=${max_y[${i}]}; j++ ))
do
echo "names[${i},${j}] : ${names[${i},${j}]}"
done
echo ""
done
names[1,1] : Paige
names[1,2] : Buckley
names[1,3] : Govan
names[1,4] : Mayer
names[1,5] : King Kong
names[2,1] : Harrison
names[2,2] : Atkins
names[2,3] : Reinhardt
names[2,4] : Wilson
names[3,1] : Larry
names[3,2] : Moe
names[3,3] : Curly
names[3,4] : Shemp
names[4,1] : Vaughan
names[4,2] : Sergovia
names[4,3] : Tarrega

How to process all or selected rows in a csv file where column headers and order are dynamic?

I'd like to either process one row of a csv file or the whole file.
The variables are set by the header row, which may be in any order.
There may be up to 12 columns, but only 3 or 4 variables are needed.
The source files might be in either format, and all I want from both is lastname and country. I know of many different ways and tools to do it if the columns were fixed and always in the same order. But they're not.
examplesource.csv:
firstname,lastname,country
Linus,Torvalds,Finland
Linus,van Pelt,USA
examplesource2.csv:
lastname,age,country
Torvalds,66,Finland
van Pelt,7,USA
I have cobbled together something from various Stackoverflow postings which looks a bit voodoo but seems fairly robust. I say "voodoo" because shellcheck complains that, for example, "firstname is referenced but not assigned". And yet it prints it.
#!/bin/bash
#set the field separator to newline
IFS=$'\n'
#split/transpose the first-line column titles to rows
COLUMNAMES=$(head -n1 examplesource.csv | tr ',' '\n')
#set an array and read the columns into it
columns=()
for line in $COLUMNAMES; do
columns+=("$line")
done
#reset the field separator
IFS=","
#using -p here to debug in output
declare -ap columns
#read from line 2 onwards
sed 1d examplesource.csv | while read "${columns[#]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
In the case of looping through everything, it works perfectly for my needs and I can process within the "while read" loop. But to make it cleaner, I'd rather pass the current element(?) to an external function to process (not just echo).
And if I only wanted the array (current row) belonging to "Torvalds", I cannot find how to access that or even get its current index, eg: "if $wantedname && $lastname == $wantedname then call function with currentrow only otherwise loop all rows and call function".
I know there aren't multidimensional associative arrays in bash from reading
Multidimensional associative arrays in Bash and I've tried to understand arrays from
https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
Is it clear what I'm trying to achieve in a bash-only manner and does the question make sense?
Many thanks.
Let's shorten your function. Don't read the source twice (first with head, then with sed); you can do it once. Also, the whole array reading can be shortened to just IFS=',' COLUMNAMES=($(head -n1 source.csv)). Here's a shorter version:
#!/bin/bash
cat examplesource.csv |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[#]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
If you want to parse both files at the same time, i.e. join them, nothing simpler ;). First, let's number the lines in the first file using nl -w1 -s,. Then we use join to join the files on the person's name. Remember that join input needs to be sorted on the join fields. Then we sort the output with sort using the line number from the first file. After that we can read all the data just like before:
# join the files, using `,` as the separator
# on the 3rd field from the first file and the first field from the second file
# the output should be first the fields from the first file, then the second file
# the country (field 1.4) is duplicated in 2.3, so just omitting it.
join -t, -13 -21 -o 1.1,1.2,1.3,2.2,2.3 <(
# number the lines in the first file
<examplesource.csv nl -w1 -s, |
# there is one field more, sort using the 3rd field
sort -t, -k3
) <(
# sort the second file using the first field
<examplesource2.csv sort -t, -k1
) |
# sort the output using the numbers from the first file
sort -t, -k1 -n |
# well, remove the numbers
cut -d, -f2- |
# just a normal read follows
{
# read the headers
IFS=, read -r -a names
while IFS=, read -r "${names[#]}"; do
# finally out output!
echo "${firstname} ${lastname} is from ${country} and is so many ${age} years old!"
done
}
Tested on tutorialspoint.
GNU Awk has multidimensional arrays. It also has array sorting mechanisms, which I have not used here. Please comment if you are interested in pursuing this solution further. The following depends on consistent key names and line numbers across input files, but can handle an arbitrary number of fields and input files.
$ gawk -V |gawk NR==1
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)
$ gawk -F, '
FNR == 1 {for(f=1;f<=NF;f++) Key[f]=$f}
FNR != 1 {for(f=1;f<=NF;f++) People[FNR][Key[f]]=$f}
END {
for(Person in People) {
for(attribute in People[Person])
output = output FS People[Person][attribute]
print substr(output,2)
output=""
}
}
' file*
66,Finland,Linus,Torvalds
7,USA,Linus,van Pelt
A bash solution takes a bit more work than an awk solution, but if this is an exercise in what bash provides, bash gives you all you need: determine the column holding the last name from the first line of input, then output the lastname field from the remaining lines.
An easy approach is simply to read each line into a normal array, then loop over the elements of the first line to locate the column "lastname" appears in, saving that column in a variable. You can then read each of the remaining lines the same way and output the lastname field by printing the element at the saved column.
A short example would be:
#!/bin/bash
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{ col=i; break; } ## if found, save the column index and break
done
fi
[ "$cnt" -gt '0' ] && ## if not header row
echo "line $cnt lastname: ${arr[col]}" ## output lastname variable
((cnt++)) ## increment linecount
done < "$1"
Example Use/Output
Using your two files data files, the output would be:
$ bash readcsv.sh ex1.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
$ bash readcsv.sh ex2.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
A similar implementation using awk would be:
awk -F, -v col=1 '
NR == 1 {
for (i = 1; i <= NF; i++)
if ($i == "lastname") { col = i; break }
}
NR > 1 {
print "lastname: ", $col
}
' ex1.csv
Example Use/Output
$ awk -F, -v col=1 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "lastname") { col = i; break } } NR > 1 { print "lastname: ", $col }' ex1.csv
lastname: Torvalds
lastname: van Pelt
(output is the same for either file)
Thank you all. I've taken a couple of bits from two answers:
I used the answer from David to find the number of the row, then I used the elegantly simple solution from Kamil to loop through what I need.
The result is exactly what I wanted. Thank you all.
$ readexample.sh examplesource.csv "Torvalds"
Everyone
Linus Torvalds is from Finland
Linus van Pelt is from USA
now just Torvalds
Linus Torvalds is from Finland
And this is the code - now that you know what I want it to do, if anyone can see any dangers or improvements, please let me know as I'm always learning. Thanks.
#!/bin/bash
FILENAME="$1"
WANTED="$2"
printDetails() {
SINGLEROW="$1"
[[ ! -z "$SINGLEROW" ]] && opt=("--expression" "1p" "--expression" "${SINGLEROW}p") || opt=("--expression" "1p" "--expression" "2,199p")
sed -n "${opt[#]}" "$FILENAME" |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[#]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
}
findRow() {
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{
col=i
break
} ## if found, save the column index and break
done
fi
[ "$cnt" -gt '0' ] && ## if not header row
if [ "${arr[col]}" == "$1" ]; then
echo "$cnt" ## output the row number
fi
((cnt++)) ## increment linecount
done <"$FILENAME"
}
echo "Everyone"
printDetails
if [ ! -z "${WANTED}" ]; then
echo -e "\nnow just ${WANTED}"
row=$(findRow "${WANTED}")
printDetails "$((row + 1))"
fi

How does bash array slicing work when start index is not provided?

I'm looking at a script, and I'm having trouble determining what is going on.
Here is an example:
# Command to get the last 4 occurrences of a pattern in a file
lsCommand="ls /my/directory | grep -i my_pattern | tail -4"
# Load the results of that command into an array
dirArray=($(echo $(eval $lsCommand) | tr ' ' '\n'))
# What is this doing?
yesterdaysFileArray=($(echo ${x[@]::$((${#x[@]} / 2))} | tr ' ' '\n'))
There is a lot going on here. I understand how arrays work, but I don't know how $x is getting referenced if it was never declared.
I see that the $((${#x[@]} / 2)) is taking the number of elements and dividing it in half, and the tr is used to create the array. But what else is going on?
I think the last line is a bash array-slice pattern of the form ${array[@]:1:2}, where ${array[@]} expands to the contents of the array and :1:2 takes a slice of length 2 starting at index 1.
So in your case the start index is empty (it defaults to 0) because you haven't specified one, and the length is half the element count of the array.
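A quick way to see the slice behavior (the array contents here are just illustrative):

```shell
x=(a b c d e f)
# Omitting the start index defaults it to 0, so this takes the
# first half of the array: length = ${#x[@]} / 2 = 3.
echo "${x[@]::${#x[@]} / 2}"   # -> a b c
# Equivalent explicit form:
echo "${x[@]:0:3}"             # -> a b c
```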
But there is a much better way to do this in bash. Don't use eval; use the shell's built-in globbing support instead:
cd /my/directory
fileArray=()
for file in *my_pattern*; do
[[ -f "$file" ]] || { printf '%s\n' 'no file found'; exit 1; }
fileArray+=( "$file" )
done
and do
printf '%s\n' "${fileArray[@]::${#fileArray[@]}/2}"

Append elements of an array to the end of a line

First let me say I followed questions on stackoverflow.com that relate to my question and it seems the rules are not applying. Let me show you.
The following script:
#!/bin/bash
OUTPUT_DIR=/share/es-ops/Build_Farm_Reports/WorkSpace_Reports
TODAY=`date +"%m-%d-%y"`
HOSTNAME=`hostname`
WORKSPACES=( "bob" "mel" "sideshow-ws2" )
if ! [ -f $OUTPUT_DIR/$HOSTNAME.csv ] && [ $HOSTNAME == "sideshow" ]; then
echo "$TODAY","$HOSTNAME" > $OUTPUT_DIR/$HOSTNAME.csv
echo "${WORKSPACES[0]}," >> $OUTPUT_DIR/$HOSTNAME.csv
sed -i "/^'"${WORKSPACES[0]}"'/$/'"${WORKSPACES[1]}"'/" $OUTPUT_DIR/$HOSTNAME.csv
sed -i "/^'"${WORKSPACES[1]}"'/$/${WORKSPACES[2]}"'/" $OUTPUT_DIR/$HOSTNAME.csv
fi
I want the output to look like:
09-20-14,sideshow
bob,mel,sideshow-ws2
The sed statements are supposed to append successive array elements to the preceding ones on the same line. Now I know there's a simpler way to do this, like:
echo "${WORKSPACES[0]},${WORKSPACES[1]},${WORKSPACES[2]}" >> $OUTPUT_DIR/$HOSTNAME.csv
But let's say I had 30 elements in the array and I wanted to append them one after the other on the same line? Can you show me how to loop through the elements of an array and append them one after the other on the same line?
Also let's say I had the output of a command like:
df -m /export/ws/$ws | awk '{if (NR!=1) {print $3}}'
and I wanted to append that to the end of the same line.
But when I run it I get:
+ OUTPUT_DIR=/share/es-ops/Build_Farm_Reports/WorkSpace_Reports
++ date +%m-%d-%y
+ TODAY=09-20-14
++ hostname
+ HOSTNAME=sideshow
+ WORKSPACES=("bob" "mel" "sideshow-ws2")
+ '[' -f /share/es-ops/Build_Farm_Reports/WorkSpace_Reports/sideshow.csv ']'
And the file right now looks like:
09-20-14,sideshow
bob,
I am happy to report that user syme solved this (see below) but then I realized I need the date in the first column:
09-7-14,bob,mel,sideshow-ws2
Can I do this using syme's for loop?
Okay user syme solved this too he said "Just add $TODAY to the for loop" like this:
for v in "$TODAY" "${WORKSPACES[@]}"
Okay now the output looks like this I changed the elements in the array btw:
sideshow
09-20-14,bob_avail,bob_used,mel_avail,mel_used,sideshow-ws2_avail,sideshow-ws2_used
Below that, the next line will start with a , in the first column (skipping the date), and then:
df -m /export/ws/$v | awk '{if (NR!=1) {print $3}}'
which equals the value of available space on bob in the first iteration
and then:
df -m /export/ws/$v | awk '{if (NR!=1) {print $2}}'
which equals the value of used space on bob in the 2nd iteration
and then we just move on to the next value in ${WORKSPACES[@]}
which will be mel and do the available and used as we did with bob or $v above.
I know you geniuses on here will make child's play out of this.
I solved my own last question on this thread:
WORKSPACES2=( "bob" "mel" "sideshow-ws2" )
separator="," # leading comma so the first column (the date) is skipped
for v in "${WORKSPACES2[@]}"
do
available=`df -m /export/ws/$v | awk '{if (NR!=1) {print $3}}'`
used=`df -m /export/ws/$v | awk '{if (NR!=1) {print $2}}'`
echo -n "$separator$available$separator$used" >> $OUTPUT_DIR/$HOSTNAME.csv # append, concatenated, the separator and the value to the file
done
produces:
sideshow
09-20-14,bob_avail,bob_used,mel_avail,mel_used,sideshow-ws2_avail,sideshow-ws2_used
,470400,1032124,661826,1032124,43443,1032108
echo -n prints the text without a trailing linebreak.
To loop over the values of the array, you can use a for-loop:
echo "$TODAY,$HOSTNAME" > $OUTPUT_DIR/$HOSTNAME.csv # with a linebreak
separator="" # defined empty for the first value
for v in "${WORKSPACES[@]}"
do
echo -n "$separator$v" >> $OUTPUT_DIR/$HOSTNAME.csv # append, concatenated, the separator and the value to the file
separator="," # comma for the next values
done
echo >> $OUTPUT_DIR/$HOSTNAME.csv # add a linebreak (if you want it)
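As an aside, when no per-element work is needed inside the loop, the separator bookkeeping can be skipped entirely by joining with IFS. A sketch (the subshell keeps the IFS change local):

```shell
WORKSPACES=( "bob" "mel" "sideshow-ws2" )
# "${WORKSPACES[*]}" joins the elements with the first character of IFS.
joined=$( IFS=','; echo "${WORKSPACES[*]}" )
echo "$joined"   # -> bob,mel,sideshow-ws2
```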