I have two arrays, I am trying to find matching values using comm. Array1 contains some additional information in each element that I strip out for the comparison. However, I would like to keep that information after the comparison is complete.
For example:
Array1=("abc",123,"hello" "def",456,"world")
Array2=("abc")
declare -a Array1
declare -a Array2
I then compare the two arrays:
oldIFS=$IFS IFS=$'\n\t'
array3=($(comm -12 <(echo "${Array1[*]}" | awk -F "," {'print $1'} | sort) <(echo "${Array2[*]}" | sort)))
IFS=$oldIFS
Which finds the match of abc:
echo ${test3[0]}
abc
However what I want is remaining values from array1 that were not part of my comm statement.
abc,123,hello
EDIT: For more clarification
The arrays in this example are populated with dummy data.
My real example is pulling information from server logs which I am saving into array1. array1 contains (userIDs,hostIPs,count) that I want to cross reference against a list of userID's (array2). My goal is to find out what userIDs exsist in array1 and array2 and save those ID's with the additional information from array1 (hostIPs,count) into array3
array1 is populated from a variable that is is the results of a curl command that generates a splunk search. The data returned looks like this:
"uniqueID=<ID>","<IP>","<hostname>",1
I save the results of the splunk report as $splunk, and then decalare array1 with the results of $splunk - the header information since the results come back in csv format
array1=( $(echo $splunk | sed 's/ /\n/g' | sed 1d) )
array2 is generated from a master file that I have stored locally. That contains all the application ID's in our ecosystem. For example
uid=<ID>
I cat the contents of the master file into array2
array2=( $(cat master.txt) )
I then want to find what IDs from array1 exsist in array2 and save that as array3. This requires some massaging of the data in array1 to make it match the format of array2.
oldIFS=$IFS IFS=$'\n\t'
array3=($(comm -12 <(echo "${array1[*]}" | sed 's/ /\n/g' | awk -F "\"," {'print $1'} | sed 's/\"//g' | sed 's/|/ /g' | awk -F$'=' -v OFS=$'=' '{ $1 = "uid" }1' | grep -i "OU=People" | sed 's/OU/ou/g' | sort) <(echo "${array2[*]}" | sort)))
IFS=$oldIFS
array 3 will then contain lines that match in both arrays
uid=<ID>
uid=<ID>
However I am looking for something more along the line of
"uid=<ID>","<IP>","<hostname>",1
"uid=<ID>","<IP>","<hostname>",1
I would do it like this:
join -t, \
<(printf '%s\n' "${Array1[#]}" | sort -t, -k1,1) \
<(printf '%s\n' "${Array2[#]}" | sort)
Use the join command with , as the field delimiter. The first "file" is the first array, one element per line, sorted on the first field (comma delimited); the second "file" is the second array, one element per line, sorted.
The output will be every line where the first element of the first file matches the element from the second file; for the example input it's
abc,123,hello
This makes only one assumption, namely that no array element contains a newline. To make it more robust (assuming GNU Coreutils), we can use NUL as the delimiter:
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
This prints the output separated by NUL as well; to read the result into an array, we can use readarray:
readarray -d '' -t Array3 < <(
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
)
readarray -d requires Bash 4.4 or newer. For older Bash, you can use a loop:
while IFS= read -r -d '' element; do
Array3+=("$element")
done < <(
join -z -t, \
<(printf '%s\0' "${Array1[#]}" | sort -z -t, -k1,1) \
<(printf '%s\0' "${Array2[#]}" | sort -z)
)
I don't know how to do this with comm, but I do have a solution for you with sed and grep. The following commands match on the regex uid=X,, where the string/array is in the form of uid=x or (uid=x uid=y) respectively.
# Array 2 (B) is a string
$ A=("uid=1,10.10.10.1,server1,1" "uid=2,10.10.10.2,server2,1")
$ B="uid=1"
$ echo ${A[#]} | grep -oE "([^ ]*${B},[^ ]*)"
uid=1,10.10.10.1,server1,1
# Array 2 (D) is an array
$ C=(${A[#]} "uid=3,10.10.10.3,server3,1" "uid=4,10.10.10.4,server4,1")
$ D=(${B} "uid=3")
$ echo ${C[*]} | grep -oE "([^ ]*($(echo ${D[#]} | sed 's/ /,|/g'))[^ ]*)"
uid=1,10.10.10.1,server1,1
uid=3,10.10.10.3,server3,1
# Content of arrays
$ echo ${A[#]}
uid=1,10.10.10.1,server1,1 uid=2,10.10.10.2,server2,1
$ echo ${B}
uid=1
$ echo ${C[#]}
uid=1,10.10.10.1,server1,1 uid=2,10.10.10.2,server2,1 uid=3,10.10.10.3,server3,1 uid=4,10.10.10.4,server4,1
$ echo ${D[#]}
uid=1 uid=3
I'm trying to understand what I'm doing wrong here, but can't seem to determine the cause. I would like to create a set of arrays from an output for a for loop in bash. Below is the code I have so far:
for i in `onedatastore list | grep pure02 | awk '{print $1}'`;
do
arr${i}=($(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:)) ;
echo "Output of arr${i}: ${arr${i}[#]}" ;
done
The output for the condition is as such:
107
108
109
What I want to do is based on these unique IDs is create arrays:
arr107
arr108
arr109
The arrays will have data like such in each:
[oneadmin#opennebula/]$ arr107=($(onedatastore show 107 | sed 's/[A-Z]://' | cut -f2 -d\:))
[oneadmin#opennebula/]$ echo ${arr107[#]}
DATASTORE 107 INFORMATION 107 pure02_vm_datastore_1 oneadmin oneadmin 0 IMAGE vcenter vcenter /var/lib/one//datastores/107 FILE READY DATASTORE CAPACITY 60T 21.9T 38.1T - PERMISSIONS um- u-- --- DATASTORE TEMPLATE CLONE_TARGET="NONE" DISK_TYPE="FILE" DS_MAD="vcenter" LN_TARGET="NONE" RESTRICTED_DIRS="/" SAFE_DIRS="/var/tmp" TM_MAD="vcenter" VCENTER_CLUSTER="CLUSTER01" IMAGES
When I try this in the script section though I get output errors as such:
./test.sh: line 6: syntax error near unexpected token `$(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:)'
I can't seem to figure out the syntax to use on this scenario.
In the end what I want to do is be able to compare different datastores and based on which on has more free space, deploy VMs to it.
Hope someone can help. Thanks
You can use the eval (potentially unsafe) and declare (safer) commands:
for i in $(onedatastore list | grep pure02 | awk '{print $1}');
do
declare "arr$i=($(onedatastore show ${i} | sed 's/[A-Z]://' | cut -f2 -d\:))"
eval echo 'Output of arr$i: ${arr'"$i"'[#]}'
done
readarray or mapfile, added in bash 4.0, will read directly into an array:
while IFS= read -r i <&3; do
readarray -t "arr$i" < <(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d:)
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
Better, back through bash 3.x, one can use read -a to read to an array:
shopt -s pipefail # cause pipelines to fail if any element does
while IFS= read -r i <&3; do
IFS=$'\n' read -r -d '' -a "arr$i" \
< <(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d: && printf '\0')
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
Alternately, one can use namevars to create an alias for an array with an arbitrarily-named array in bash 4.3:
while IFS= read -r i <&3; do
declare -a "arr$i"
declare -n arr="arr$i"
# this is buggy: expands globs, string-splits on all characters in IFS, etc
# ...but, well, it's what the OP is asking for...
arr=( $(onedatastore show "$i" | sed 's/[A-Z]://' | cut -f2 -d:) )
done 3< <(onedatastore list | awk '/pure02/ {print $1}')
I have an Array like this:
ARRAY=(one two three four five)
And I want to Ilterate this Array in a for loop. But when I read the Array I want to change the output. Like this:
on
tw
thre
fou
fiv
So my question is, how do I do that? I got something like that:
for (( i=0; i<${ARRAYLENGTH}; i++ ));
do
echo "$({ARRAY[$i]} | rev | cut -c 2- | rev)"
done
But It doesn't Work. It Interpretes my pipe argumentes as an echo output.
What can I do?
Try this, I think this should work.
pipea[0]="awk -F[=,] '{print \$2}' | sed '/^\s*$/d'"
pipea[1]="awk -F[=,] '{print \$2}' | sed '/^\s*$/d' | cut -d ' ' -f2"
pipea[2]="awk -F[=,] '{print \$2}' | sed '/^\s*$/d' | cut -d ' ' -f1"
for (( i=0; i<${int}; i++ ));
do
echo "
dn: cn=$(${ldapquery[$i]} | eval ${pipea[0]}),ou=mydomain,dc=saturday,dc=int
objectClass: inetOrgPerson
objectClass: top
cn: $(${ldapquery[$i]} | eval ${pipea[0]})
sn: $(${ldapquery[$i]} | eval ${pipea[1]})
givenName= $(${ldapquery[$i]} | eval ${pipea[2]})
telephoneNumber $(${ldapquery[$i]} | eval ${pipea[2]})"
done
I need to assign two columns (column 2 and 6) in a text file to a variable as an array. so that i can call them back element by element.
the code below puts the whole column as one string element in k and l.
and the echo command will not return anything as everything is stored in the first element
k=`(cat mydata.txt | awk '{print $2}')`
l=`(cat mydata.txt | awk '{print $6}')`
echo ${k[2]}
echo ${l[2]}
below is an example of data set i used.
60594412 56137844 48552535 44214019 44121294 28652826 21975449 21718959 18208824 18004925 13299946 12969796 11990006 10435260 9992615 9975420 9223972 8918246 8730367 7723045 7316105 6772270 6301570 5662296 4653831 3769516 3343899 2639162 2393169 1992206 1838674 1681498 1563810 1389679 1267762 1253490 1205487 940968 718249 702722 655069 649121 619911 437735 284727 264334 252627 233213 185924 177421 160412 156581 143128 107247 87194 81369 74594 74185
Swap the order of the parentheses and backticks.
k=(`cat mydata.txt | awk '{print $2}'`)
l=(`cat mydata.txt | awk '{print $6}'`)
Stylewise, you could get rid of the useless use of cat and also change the backticks to $(...), which is generally preferable.
k=( $(awk '{print $2}' mydata.txt) )
l=( $(awk '{print $6}' mydata.txt) )
in bash script, I define a array:
array=$(awk '{print $4}' /var/log/httpd/sample | uniq -c | cut -d[ -f1)
Now, I want to translate this content to code in bash script:
"if there is NOT any element in array, it means array=nothing, then echo "nothing in array".
help me to do that??? Thanks a lot
*besides, I want to delete access_log's content periodically every 5min (/var/log/httpd/access_log). Please tell me how to do that??*
Saying:
array=$(awk '{print $4}' /var/log/httpd/sample | uniq -c | cut -d[ -f1)
does not define an array. This simply puts the result of the command into the variable array.
If you wanted to define an array, you'd say:
array=( $(awk '{print $4}' /var/log/httpd/sample | uniq -c | cut -d[ -f1) )
You can get the count of the elements in the array by saying echo "${#foo[#]}".
For checking whether the array contains an element or not, you can say:
(( "${#array[#]}" )) || echo "Nothing in array"