create arrays from for loop output

I'm trying to understand what I'm doing wrong here, but can't seem to determine the cause. I would like to create a set of arrays from the output of a for loop in bash. Below is the code I have so far:
for i in `onedatastore list | grep pure02 | awk '{print $1}'`;
do
arr${i}=($(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:)) ;
echo "Output of arr${i}: ${arr${i}[#]}" ;
done
The output of the command in the for condition is:
107
108
109
Based on these unique IDs, I want to create arrays:
arr107
arr108
arr109
Each array will contain data like this:
[oneadmin@opennebula /]$ arr107=($(onedatastore show 107 | sed 's/[A-Z]://' | cut -f2 -d\:))
[oneadmin@opennebula /]$ echo ${arr107[@]}
DATASTORE 107 INFORMATION 107 pure02_vm_datastore_1 oneadmin oneadmin 0 IMAGE vcenter vcenter /var/lib/one//datastores/107 FILE READY DATASTORE CAPACITY 60T 21.9T 38.1T - PERMISSIONS um- u-- --- DATASTORE TEMPLATE CLONE_TARGET="NONE" DISK_TYPE="FILE" DS_MAD="vcenter" LN_TARGET="NONE" RESTRICTED_DIRS="/" SAFE_DIRS="/var/tmp" TM_MAD="vcenter" VCENTER_CLUSTER="CLUSTER01" IMAGES
When I try this in the script, though, I get errors like this:
./test.sh: line 6: syntax error near unexpected token `$(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:)'
I can't seem to figure out the syntax to use in this scenario.
In the end, what I want to do is be able to compare different datastores and, based on which one has more free space, deploy VMs to it.
Hope someone can help. Thanks

You can use the eval (potentially unsafe) and declare (safer) commands:
for i in $(onedatastore list | grep pure02 | awk '{print $1}');
do
declare "arr$i=($(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:))"
eval echo 'Output of arr$i: ${arr'"$i"'[@]}'
done

readarray or mapfile, added in bash 4.0, will read directly into an array:
while IFS= read -r i <&3; do
readarray -t "arr$i" < <(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d:)
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
Better, and portable back through bash 3.x, one can use read -a to read into an array:
set -o pipefail # cause pipelines to fail if any element does
while IFS= read -r i <&3; do
IFS=$'\n' read -r -d '' -a "arr$i" \
< <(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d: && printf '\0')
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
Alternately, in bash 4.3 or newer, one can use namerefs (declare -n) to create an alias for an arbitrarily-named array:
while IFS= read -r i <&3; do
declare -a "arr$i"
declare -n arr="arr$i"
# this is buggy: expands globs, string-splits on all characters in IFS, etc
# ...but, well, it's what the OP is asking for...
arr=( $(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d:) )
done 3< <(onedatastore list | awk '/pure02/ {print $1}')


Using "comm" to find matches between two arrays

I have two arrays and am trying to find matching values using comm. Array1 contains some additional information in each element that I strip out for the comparison; however, I would like to keep that information after the comparison is complete.
For example:
Array1=("abc",123,"hello" "def",456,"world")
Array2=("abc")
declare -a Array1
declare -a Array2
I then compare the two arrays:
oldIFS=$IFS IFS=$'\n\t'
array3=($(comm -12 <(echo "${Array1[*]}" | awk -F "," {'print $1'} | sort) <(echo "${Array2[*]}" | sort)))
IFS=$oldIFS
Which finds the match of abc:
echo ${array3[0]}
abc
However, what I want is the remaining values from Array1 that were not part of my comm statement:
abc,123,hello
EDIT: For more clarification
The arrays in this example are populated with dummy data.
My real example is pulling information from server logs which I am saving into array1. array1 contains (userIDs, hostIPs, count) that I want to cross-reference against a list of userIDs (array2). My goal is to find out what userIDs exist in both array1 and array2 and save those IDs, with the additional information from array1 (hostIPs, count), into array3.
array1 is populated from a variable holding the results of a curl command that runs a Splunk search. The data returned looks like this:
"uniqueID=<ID>","<IP>","<hostname>",1
I save the results of the Splunk report as $splunk, and then declare array1 with the results of $splunk minus the header information, since the results come back in CSV format:
array1=( $(echo $splunk | sed 's/ /\n/g' | sed 1d) )
array2 is generated from a master file that I have stored locally, which contains all the application IDs in our ecosystem. For example:
uid=<ID>
I cat the contents of the master file into array2
array2=( $(cat master.txt) )
I then want to find what IDs from array1 exist in array2 and save that as array3. This requires some massaging of the data in array1 to make it match the format of array2.
oldIFS=$IFS IFS=$'\n\t'
array3=($(comm -12 <(echo "${array1[*]}" | sed 's/ /\n/g' | awk -F "\"," {'print $1'} | sed 's/\"//g' | sed 's/|/ /g' | awk -F$'=' -v OFS=$'=' '{ $1 = "uid" }1' | grep -i "OU=People" | sed 's/OU/ou/g' | sort) <(echo "${array2[*]}" | sort)))
IFS=$oldIFS
array3 will then contain the lines that match in both arrays:
uid=<ID>
uid=<ID>
However, I am looking for something more along the lines of:
"uid=<ID>","<IP>","<hostname>",1
"uid=<ID>","<IP>","<hostname>",1
I would do it like this:
join -t, \
<(printf '%s\n' "${Array1[#]}" | sort -t, -k1,1) \
<(printf '%s\n' "${Array2[#]}" | sort)
Use the join command with , as the field delimiter. The first "file" is the first array, one element per line, sorted on the first field (comma delimited); the second "file" is the second array, one element per line, sorted.
The output will be every line where the first element of the first file matches the element from the second file; for the example input it's
abc,123,hello
This makes only one assumption, namely that no array element contains a newline. To make it more robust (assuming GNU Coreutils), we can use NUL as the delimiter:
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
This prints the output separated by NUL as well; to read the result into an array, we can use readarray:
readarray -d '' -t Array3 < <(
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
)
readarray -d requires Bash 4.4 or newer. For older Bash, you can use a loop:
while IFS= read -r -d '' element; do
Array3+=("$element")
done < <(
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
)
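As a usage sketch (assuming the "uid","IP","hostname",count layout described in the question), the matched entries can then be split back into fields:
for entry in "${Array3[@]}"; do
    IFS=, read -r uid ip host count <<< "${entry//\"/}"   # drop the literal quotes first
    printf '%s seen %s times on %s (%s)\n' "$uid" "$count" "$host" "$ip"
done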
I don't know how to do this with comm, but I do have a solution for you with sed and grep. The following commands match on the regex uid=X,, where the string/array is in the form of uid=x or (uid=x uid=y) respectively.
# Array 2 (B) is a string
$ A=("uid=1,10.10.10.1,server1,1" "uid=2,10.10.10.2,server2,1")
$ B="uid=1"
$ echo ${A[@]} | grep -oE "([^ ]*${B},[^ ]*)"
uid=1,10.10.10.1,server1,1
# Array 2 (D) is an array
$ C=(${A[@]} "uid=3,10.10.10.3,server3,1" "uid=4,10.10.10.4,server4,1")
$ D=(${B} "uid=3")
$ echo ${C[*]} | grep -oE "([^ ]*($(echo ${D[@]} | sed 's/ /,|/g'))[^ ]*)"
uid=1,10.10.10.1,server1,1
uid=3,10.10.10.3,server3,1
# Content of arrays
$ echo ${A[@]}
uid=1,10.10.10.1,server1,1 uid=2,10.10.10.2,server2,1
$ echo ${B}
uid=1
$ echo ${C[@]}
uid=1,10.10.10.1,server1,1 uid=2,10.10.10.2,server2,1 uid=3,10.10.10.3,server3,1 uid=4,10.10.10.4,server4,1
$ echo ${D[@]}
uid=1 uid=3
uid=1 uid=3

jq: 1 compile error jq: error: schedule/0 is not defined at <top-level>, line 1: .Christchurch.bus-schedule.from["Weekday"] |= . + ["1646"]

I have a React application which uses a .sh file to generate part of the application, but I have come across an error which prevents the application from reading the data files.
I've looked at versioning and at possible syntax changes in the .sh file, which is emap.sh.
This is the .sh file:
#!/bin/bash
object='{}'
find 'public/data' -type f -name '*.csv' | while read -r filename;
do
filename=$(echo "$filename" | sed 's/public\/data\///' | sed 's/ /\//')
city=$(echo "$filename" | cut -d'/' -f1)
medium=$(echo "$filename" | cut -d'/' -f2)
direction=$(echo "$filename" | cut -d'/' -f3)
date=$(echo "$filename" | cut -d'/' -f4)
time=$(echo "$filename" | cut -d'/' -f5| head -c-5)
object=$(echo $object | jq -c ".$city.$medium.$direction[\"$date\"] |= . + [\"$time\"]")
echo $object > public/available.json
done
It should succeed so that when we run yarn, the application shows that it has read the data; instead we get a page with no data in the area that uses the information to be processed.
The shell variables should be passed to jq in a more robust manner, e.g. along these lines:
jq -c --arg city "$city" \
--arg medium "$medium" \
--arg direction "$direction" \
--arg date "$date" \
--arg time "$time" \
'.[$city][$medium][$direction][$date] += [$time]'
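Plugged back into the loop, the assignment could then look like this (a sketch only, reusing the same shell variables as the original script and a here-string instead of echo):
object=$(jq -c --arg city "$city" --arg medium "$medium" --arg direction "$direction" \
            --arg date "$date" --arg time "$time" \
            '.[$city][$medium][$direction][$date] += [$time]' <<< "$object")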
As @OguzIsmail points out, though, you would probably be better off avoiding all the messiness by doing everything with just find and jq.
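A minimal sketch of that approach (untested; it assumes the same public/data layout the original script parses, with the first space in the relative path standing in for a directory separator, exactly as the sed in the loop does):
find public/data -type f -name '*.csv' |
  jq -Rn '
    reduce inputs as $path ({};
      ($path
       | ltrimstr("public/data/")   # drop the prefix
       | rtrimstr(".csv")           # drop the extension
       | sub(" "; "/")              # first space -> separator, as in the original sed
       | split("/")) as [$city, $medium, $direction, $date, $time]
      | .[$city][$medium][$direction][$date] += [$time])' > public/available.json
Because the keys are passed as strings rather than spliced into the filter text, names containing hyphens (bus-schedule) no longer break compilation.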

Submitting a job on PBS with while loop, changing output file names

I have a (let's call it original) script to parse a file (~1000 lines) line by line and generate arguments to execute a C++ program.
#!/bin/bash
i=0
while IFS='' read -r line || [[ -n "$line" ]]; do
a="$line" | cut -c1-2
b="$line" | cut -c3-4
c="$line" | cut -c5-6
d="$line" | cut -c7-8
e="$line" | cut -c9-10
f="$line" | cut -c11-12
g="$line" | cut -c13-14
h="$line" | cut -c15-16
i="$line" | cut -c17-18
j="$line" | cut -c19-20
k="$line" | cut -c21-22
l="$line" | cut -c23-24
m="$line" | cut -c25-26
n="$line" | cut -c27-28
o="$line" | cut -c29-30
p="$line" | cut -c31-32
./a.out "$a" "$b" "$c" "$d" "$e" "$f" "$g" "$h" "$i" "$j" "$k" "$l" "$m" "$n" "$o" "$p" > $(echo some-folder/output_${i}.txt)
done < test_10.txt
I want to schedule this job as a batch, so that each run is queued and run on a separate core.
I looked at how PBS and qsub scripts are written, and I could write a PBS file (a simple one, without all the options for now; let's call it callPBS.pbs):
#!/bin/bash
cd $PBS_O_WORKDIR
qsub ./a.out -F "my arguments"
exit 0
I can call this file instead of ./a.out in the original script, BUT how do I pass "my arguments"? The problem is that they are not fixed.
Secondly, I know qsub takes -o as an option for the output file, but I want the output file name to change on every run. I can pass that as an argument again, but how?
Can I do this in my original script:
callPBS.pbs > $(echo some-folder/output_${i}.txt)
I am sorry if I am missing something here. I am trying to use all that I know!
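One common pattern, as a hedged sketch only (it assumes a Torque/PBS Pro style qsub and a hypothetical LINE environment variable, neither of which comes from the question): submit one job per input line, hand the line to the job with qsub -v, and pick the output file with -o, so no shell redirection is needed:
i=0
while IFS='' read -r line || [[ -n "$line" ]]; do
    qsub -v "LINE=$line" -o "some-folder/output_${i}.txt" -j oe callPBS.pbs
    i=$((i + 1))
done < test_10.txt
Inside callPBS.pbs the job can then rebuild the sixteen 2-character arguments itself, e.g. for ((pos=0; pos<32; pos+=2)); do args+=("${LINE:pos:2}"); done followed by ./a.out "${args[@]}", writing to stdout, which -o has already pointed at the right file.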

BASH script: how to code if (array=nothing) then echo "no element in array" in a bash script

In a bash script, I define an array:
array=$(awk '{print $4}' /var/log/httpd/sample | uniq -c | cut -d[ -f1)
Now, I want to translate this into bash code:
"if there is NOT any element in the array, i.e. array=nothing, then echo "nothing in array"."
Can you help me do that? Thanks a lot.
Besides, I want to delete access_log's content periodically, every 5 minutes (/var/log/httpd/access_log). Please tell me how to do that too.
Saying:
array=$(awk '{print $4}' /var/log/httpd/sample | uniq -c | cut -d[ -f1)
does not define an array. This simply puts the result of the command into the variable array.
If you wanted to define an array, you'd say:
array=( $(awk '{print $4}' /var/log/httpd/sample | uniq -c | cut -d[ -f1) )
You can get the count of the elements in the array by saying echo "${#array[@]}".
For checking whether the array contains an element or not, you can say:
(( "${#array[#]}" )) || echo "Nothing in array"

Bash: Script for finding files by mime-type

First, I am not experienced in scripting, so be gentle with me.
Anyway, I tried making a script for finding files by MIME type (audio, video, text, etc.), and here's the poor result I came up with:
#!/bin/bash
FINDPATH="$1"
FILETYPE="$2"
locate $FINDPATH* | while read FILEPROCESS
do
if file -bi "$FILEPROCESS" | grep -q "$FILETYPE"
then
echo $FILEPROCESS
fi
done
It works, but as you could guess, the performance is not so good.
So, can you guys help me make it better? Also, I don't want to rely on file extensions.
Update:
Here's what I am using now
#!/bin/bash
FINDPATH="$1"
find "$FINDPATH" -type f | file -i -F "::" -f - | awk -v FILETYPE="$2" -F"::" '$2 ~ FILETYPE { print $1 }'
Forking (exec) is expensive. This runs the file command only once, so it is fast:
find . -print | file -if - | grep "what you want" | awk -F: '{print $1}'
or
locate what.want | file -if -
Check man file:
-i   # print MIME types
-f - # read filenames from stdin
#!/bin/bash
find "$1" | file -if - | grep "$2" | awk -F: '{print $1}'
#!/usr/bin/env bash
mimetypes=$(sed -E 's/\/.*//g; /^$/d; /^#/d' /etc/mime.types | uniq)
display_help(){
echo "Usage: ${0##*/} [mimetype]"
echo "Available mimetypes:"
echo "$mimetypes"
exit 2
}
[[ $# -lt 1 ]] && display_help
ext=$(sed -E "/^${1}/!d; s/^[^ \t]+[ \t]*//g; /^$/d; s/ /\n/g" /etc/mime.types | sed -Ez 's/\n$//; s/\n/\\|/g; s/(.*)/\.*\\.\\(\1\\)\n/')
find "$PWD" -type f -regex "$ext"
