Bash trim some of the text [duplicate] - unix-text-processing

I have an output that looks like this
root@machine:path# someapp report | grep Lost
Lost Workers: 0
How can I grep only the digit at the end?
Thanks

Like this:
someapp report | grep -i lost | tr -s ' ' | cut -d' ' -f4
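Run the app, pipe its stdout through tr -s ' ' to squeeze each run of spaces into a single space, then use cut with a space delimiter to select the 4th field. The field number is 4 when the line starts with a space, as in the test below, because the leading space produces an empty first field; without leading whitespace the digit would be field 3.
Test: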
echo " Lost Workers: 0" | tr -s ' ' | cut -d' ' -f4

Combine search and parse with sed:
echo ' Lost Workers: 0' | sed -n '/Lost/ s/.*[[:blank:]]//p'

Pipelines that look like grep | cut ... or grep | tr | cut or similar are almost always better off using awk:
$ printf 'foo\nLost Workers: 0\nbar\n' | awk '/Lost/{print $NF}'
0
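
The three answers above can be compared side by side. This is a minimal sketch, assuming the report line has a leading space as in the cut test above; the variable names are only illustrative.

```shell
#!/usr/bin/env bash
# Sample line with a leading space (an assumption about the real output).
line=' Lost Workers: 0'

# tr squeezes runs of spaces; the leading space makes field 1 empty, so the digit is field 4.
via_cut=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f4)

# sed deletes everything up to and including the last blank on lines matching "Lost".
via_sed=$(printf '%s\n' "$line" | sed -n '/Lost/ s/.*[[:blank:]]//p')

# awk prints the last whitespace-separated field of lines matching "Lost".
via_awk=$(printf '%s\n' "$line" | awk '/Lost/{print $NF}')

printf 'cut=%s sed=%s awk=%s\n' "$via_cut" "$via_sed" "$via_awk"
# prints: cut=0 sed=0 awk=0
```

The sed and awk variants do not depend on how many spaces precede or separate the words, which is why they tend to be less fragile than counting fields with cut.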

Related

Shell Remove lower versions from array

I have the following array:
ARRAYNAME=(value_1.21.zip value_1.22.zip valueN_0.51.zip valueN_0.52.zip valueM_3.52)
I want to remove the lower versions of the same element and to have the following array:
ARRAYNAME=(value_1.22.zip valueN_0.52.zip valueM_3.52)
At the moment I am using this approach to remove duplicate elements:
ARRAYNAMESORT=$(tr ' ' '\n' <<< "${ARRAYNAME[@]}" | sort -u | tr '\n' ' ')
but I am stuck on removing the lower versions. Does anyone have an idea how to achieve this?
Based on the text structure [Name]_[version].zip:
ARRAYNAME=($(printf '%s\n' "${ARRAYNAME[@]}" | awk '{print $1,$1}' | cut -d'_' -f2- | sort -n | sed 1d | awk '{print $2}' | paste -s))
Explanation:
print all array elements, one per line: printf '%s\n' "${ARRAYNAME[@]}"
duplicate the name into two columns: awk '{print $1,$1}'
remove the text left of the underscore from the first column: cut -d'_' -f2-
sort numerically, then remove the smallest entry, which is on the first line: sort -n | sed 1d
take the second column and join the lines into one: awk '{print $2}' | paste -s
Note that this drops only the single lowest version in the whole list. It does not compare versions per name, so with the sample array value_1.21.zip is still kept.
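
To keep only the highest version for each name, the entries can be grouped by the part before the underscore. The following is a minimal sketch, assuming GNU sort (for the V version-sort key modifier) and bash 4.0 or later (for mapfile); the LATEST array name is made up for illustration.

```shell
#!/usr/bin/env bash
ARRAYNAME=(value_1.21.zip value_1.22.zip valueN_0.51.zip valueN_0.52.zip valueM_3.52)

# Sort by name (field 1), then by version (field 2) in descending version order,
# so the newest entry of each name comes first. awk then keeps the first line per name.
mapfile -t LATEST < <(
  printf '%s\n' "${ARRAYNAME[@]}" |
    LC_ALL=C sort -t_ -k1,1 -k2,2Vr |
    awk -F_ '!seen[$1]++'
)

printf '%s\n' "${LATEST[@]}"
```

The result is ordered by name rather than by original position. Like the pipeline above, this assumes element names contain no whitespace and exactly one underscore separating name and version.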

Putting multiline awk command in a for loop not printing the variable

I have a command as follows, which takes lines from the Allergens file, based on lines from the IDs file.
awk '
FNR==NR{
a[$1]
next
}
/^Query/ || $2 in a
' IDs C100_Allergens | grep -B 1 'Hit: ' | grep -v '^--' > C100_Allergens_matches.txt
However, I have numerous sample_Allergens files, and want to run it in a loop as such, where list is a file with different sample names:
for i in `cat list`
do
awk '
FNR==NR{
a[$1]
next
}
/^Query/ || $2 in a
' IDs "$i"_Allergens | grep -B 1 'Hit: ' | grep -v '^--' > "$i"_Allergens_matches.txt
done
I tried this loop, including using the variable flag for awk, i.e. -v i="$i":
for i in `cat list`
do
awk -v i="$i" '
FNR==NR{
a[$1]
next
}
/^Query/ || $2 in a
' IDs "$i"_Allergens | grep -B 1 'Hit: ' | grep -v '^--' > "$i"_Allergens_matches.txt
done
I only keep getting empty files. Thanks in advance for your help!
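
The two-file awk idiom used in the question (FNR==NR loads the first file into an array, then the remaining pattern filters the second file) can be tried on a tiny sample before looping over real data. This is only a sketch: the file contents below are hypothetical and merely mimic the Query/Hit layout implied by the grep patterns.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)

# Hypothetical inputs: two wanted IDs and a small Allergens-style file.
printf 'id2\nid3\n' > "$tmp/IDs"
printf 'Query: q1\nHit: id1 x\nQuery: q2\nHit: id2 y\n' > "$tmp/sample_Allergens"

# While reading the first file (FNR==NR), remember each ID as an array key.
# For the second file, print Query lines and lines whose 2nd field is a known ID.
matches=$(awk 'FNR==NR { a[$1]; next } /^Query/ || $2 in a' \
            "$tmp/IDs" "$tmp/sample_Allergens" |
          grep -B 1 'Hit: ' | grep -v '^--')

printf '%s\n' "$matches"
rm -rf "$tmp"
```

Running the pipeline by hand on one sample like this shows whether the awk stage or the grep stage is the one producing no output.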

create arrays from for loop output

I'm trying to understand what I'm doing wrong here, but can't seem to determine the cause. I would like to create a set of arrays from the output of a for loop in bash. Below is the code I have so far:
for i in `onedatastore list | grep pure02 | awk '{print $1}'`;
do
arr${i}=($(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:)) ;
echo "Output of arr${i}: ${arr${i}[@]}" ;
done
The output for the condition is as such:
107
108
109
What I want to do is create arrays based on these unique IDs:
arr107
arr108
arr109
Each array will hold data like this:
[oneadmin@opennebula/]$ arr107=($(onedatastore show 107 | sed 's/[A-Z]://' | cut -f2 -d\:))
[oneadmin@opennebula/]$ echo ${arr107[@]}
DATASTORE 107 INFORMATION 107 pure02_vm_datastore_1 oneadmin oneadmin 0 IMAGE vcenter vcenter /var/lib/one//datastores/107 FILE READY DATASTORE CAPACITY 60T 21.9T 38.1T - PERMISSIONS um- u-- --- DATASTORE TEMPLATE CLONE_TARGET="NONE" DISK_TYPE="FILE" DS_MAD="vcenter" LN_TARGET="NONE" RESTRICTED_DIRS="/" SAFE_DIRS="/var/tmp" TM_MAD="vcenter" VCENTER_CLUSTER="CLUSTER01" IMAGES
When I try this in the script section though I get output errors as such:
./test.sh: line 6: syntax error near unexpected token `$(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:)'
I can't seem to figure out the syntax to use on this scenario.
In the end, what I want to do is compare different datastores and, based on which one has more free space, deploy VMs to it.
Hope someone can help. Thanks
You can use the eval (potentially unsafe) and declare (safer) commands:
for i in $(onedatastore list | grep pure02 | awk '{print $1}');
do
declare "arr$i=($(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:))"
eval echo 'Output of arr$i: ${arr'"$i"'[@]}'
done
readarray or mapfile, added in bash 4.0, will read directly into an array:
while IFS= read -r i <&3; do
readarray -t "arr$i" < <(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d:)
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
For compatibility back to bash 3.x, one can use read -a to read into an array:
set -o pipefail # cause pipelines to fail if any element does
while IFS= read -r i <&3; do
IFS=$'\n' read -r -d '' -a "arr$i" \
< <(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d: && printf '\0')
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
Alternatively, one can use namerefs (declare -n, added in bash 4.3) to create an alias for an arbitrarily named array:
while IFS= read -r i <&3; do
declare -a "arr$i"
declare -n arr="arr$i"
# this is buggy: expands globs, string-splits on all characters in IFS, etc
# ...but, well, it's what the OP is asking for...
arr=( $(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d:) )
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
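
The readarray and nameref approaches above can be tried without OpenNebula installed. This is a minimal sketch under that assumption: fake_show is a hypothetical stand-in for onedatastore show, and its two-line output is invented purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `onedatastore show ID`; the real output is much longer.
fake_show() {
  printf 'ID:%s\nNAME:ds_%s\n' "$1" "$1"
}

for id in 107 108; do
  # readarray (bash 4.0+) fills the array whose name is given as its last argument,
  # storing one line of input per element.
  readarray -t "arr$id" < <(fake_show "$id" | cut -f2 -d:)
done

# A nameref (bash 4.3+) gives read access to a dynamically named array.
declare -n current="arr108"
printf '%s elements, first: %s\n' "${#current[@]}" "${current[0]}"
# prints: 2 elements, first: 108
```

Because readarray stores whole lines, values containing spaces or glob characters stay intact, unlike the unquoted arr=( $(...) ) form.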

Submitting a job on PBS with while loop, changing output file names

I have a script (let's call it the original) that parses a file (~1000 lines) line by line and generates the arguments for a C++ program.
#!/bin/bash
i=0   # no spaces around '=' in an assignment
while IFS='' read -r line || [[ -n "$line" ]]; do
    # Split the first 32 characters of the line into sixteen 2-character arguments.
    args=()
    for (( pos = 0; pos < 32; pos += 2 )); do
        args+=( "${line:pos:2}" )
    done
    ./a.out "${args[@]}" > "some-folder/output_${i}.txt"
    # Assumed intent: one numbered output file per input line.
    i=$((i + 1))
done < test_10.txt
I want to schedule this job in a batch, so that each run is queued and run on a separate core.
I checked the PBS and qsub writing styles. I could write a PBS file (a simple one, without all the options for now; let's call it callPBS.PBS):
#!/bin/bash
cd $PBS_O_WORKDIR
qsub ./a.out -F "my arguments"
exit 0
I can call this file instead of the ./a.out -------- line in the original script. But how do I pass "my arguments"? The problem is that they are not fixed.
Secondly I know qsub takes -o as an option for output file. But I want my output file name to be changed. I can pass that as an argument again, but how?
Can I do this in my original script:
callPBS.pbs > $(echo some-folder/output_${i}.txt)
I am sorry if I am missing something here. I am trying to use all that I know!
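
One way to explore this is to build the qsub command line per input line and print it before submitting anything. The following is a dry-run sketch only: job.pbs is a hypothetical job script that would run ./a.out "$@", build_qsub_cmd is a made-up helper, and -F is assumed to be available (it is a TORQUE qsub option, so other PBS variants may not accept it).

```shell
#!/usr/bin/env bash
# Print the qsub command that would be submitted for one input line.
build_qsub_cmd() {
  local index=$1 line=$2 args=() pos
  # Split the line into 2-character arguments, as the original script does.
  for (( pos = 0; pos < ${#line}; pos += 2 )); do
    args+=( "${line:pos:2}" )
  done
  # -o names the job's stdout file; -F passes one string of arguments to the job script.
  printf 'qsub -o some-folder/output_%s.txt -F "%s" job.pbs\n' "$index" "${args[*]}"
}

build_qsub_cmd 0 'aabbcc'
# prints: qsub -o some-folder/output_0.txt -F "aa bb cc" job.pbs
```

In this arrangement the loop stays in the submitting script and each line becomes its own job, so the output name is set with -o at submission time rather than by shell redirection.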

Assign values to dynamic arrays

My bash script needs to read values from a properties file and assign them to a number of arrays. The number of arrays is controlled via configuration as well. My current code is as follows:
limit=$(sed '/^\#/d' $propertiesFile | grep 'limit' | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
for (( i = 1 ; i <= $limit ; i++ ))
do
#properties that define values to be assigned to the arrays are labeled myprop## (e.g. myprop01, myprop02):
lookupProperty=myprop$(printf "%.2d" "$i")
#the following line reads the value of the lookupProperty, which is a set of space-delimited strings, and assigns it to the myArray# (myArray1, myArray2, etc):
myArray$i=($(sed '/^\#/d' $propertiesFile | grep $lookupProperty | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'))
done
When I attempt to execute the above code, the following error message is displayed:
syntax error near unexpected token `$(sed '/^\#/d' $propertiesFile | grep $lookupProperty | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')'
I am quite sure the issue is in the way I am declaring the "myArray$i" arrays. However, any different approach I tried produced either the same errors or incomplete results.
Any ideas/suggestions?
You are right that bash does not recognize the construct myArray$i=(some array values) as an array variable assignment. One work-around is:
read -a myArray$i <<<"a b c"
The read -a varname command reads an array from stdin, which is provided by the "here" string <<<"a b c", and assigns it to varname where varname can be constructs like myArray$i. So, in your case, the command might look like:
read -a myArray$i <<<"$(sed '/^\#/d' $propertiesFile | grep $lookupProperty | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')"
The above allows assignment. The next issue is how to read out variables like myArray$i. One solution is to name the variable indirectly like this:
var="myArray$i[2]" ; echo ${!var}
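
The assignment and the indirect read can be put together in a short round trip. This is a minimal sketch with a fixed index and a literal here-string standing in for the properties lookup; the variable all is introduced only for illustration.

```shell
#!/usr/bin/env bash
i=1

# read -a splits the here-string on whitespace and stores the words in myArray1.
read -r -a "myArray$i" <<<"a b c"

# Indirect expansion: build the reference as a string, then dereference it with ${!var}.
var="myArray${i}[2]"
echo "${!var}"            # prints: c

# The same mechanism works for the whole array.
all="myArray${i}[@]"
printf '%s\n' "${!all}"   # prints a, b and c on separate lines
```

Writing the index as ${i} inside the string makes it clear where the variable name ends and the literal [2] subscript begins.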
