So I am quite struggling with arrays in shell scripting, especially dealing with sorting the key values. Here's what I have
declare -A array
Array[0]=(0)
Array[1]=(4)
Array[2]=(6)
Array[3]=(1)
So in each array we have (0,4,6,1), if we sort them to the largest to the smallest, it would be (6,4,1,0). Now, I wonder if I could sort the key of the value, and put them in a new array like this(sort of like ranking them):
newArray[0]=(2) # 2 which was the key for 6
newArray[1]=(1) # 1 which was the key for 4
newArray[2]=(3) # 3 which was the key for 1
newArray[3]=(0) # 0 which was the key for 0
I've tried some solutions but they are so much hard coded and not working for some situations. Any helps would be appreciated.
Create a tuple of index+value.
Sort over value.
Remove values.
Read into an array.
array=(0 4 6 1)
tmp=$(
# for every index in the array
for ((i = 0; i < ${#array[#]}; ++i)); do
# output the index, space, an array value on every line
echo "$i ${array[i]}"
done |
# sort lines using Key as second column Numeric Reverse
sort -k2nr |
# using space as Delimiter, extract first Field from each line
cut -d' ' -f1
)
# Load tmp into an array separated by newlines.
readarray -t newArray <<<"$tmp"
# output
declare -p newArray
outputs:
declare -a newArray=([0]="2" [1]="1" [2]="3" [3]="0")
Consider the associative array below:
declare -A shapingTimes
shapingTimes=([0-start]=15 [0-stop]=21 [0-anotherkey]=foo)
shapingTimes+=([1-start]=4 [1-stop]=6 [1-anotherkey]=bar)
shapingTimes+=([2-start]=9 [2-stop]=11 [2-anotherkey]=blah)
Is there a way to find the total number of keys used per entry in an array? (Is this per 'index' in an array?)
For example, how to count: [start], [stop], [anotherkey] as = 3 keys?
At the moment I'm using the hardcoded value (3) from this code I found (as below) that does the job fine, but I'm wondering if this can be achieved dynamically?
totalshapingTimes=$((${#shapingTimes[*]} / 3))
I've found these variables that return various array aspects, but not the total number of keys.
echo "All of the items in the array:" ${shapingTimes[*]}
echo "All of the indexes in the array:" ${!shapingTimes[*]}
echo "Total number of items in the array:" ${#shapingTimes[*]}
echo "Total number of KEYS in each array entry:" #???
Desired output:
All of the items in the array: 21 6 11 blah 15 4 bar 9 foo
All of the indexes in the array: 0-stop 1-stop 2-stop 2-anotherkey 0-start 1-start 1-anotherkey 2-start 0-anotherkey
Total number of items in the array: 9
Total number of KEYS in each array entry: 3
declare -A shapingTimes
shapingTimes=([0-start]=15 [0-stop]=21 [0-anotherkey]=foo)
shapingTimes+=([1-start]=4 [1-stop]=6 [1-anotherkey]=bar)
shapingTimes+=([2-start]=9 [2-stop]=11 [2-anotherkey]=blah)
# output all keys
for i in "${!shapingTimes[#]}"; do echo $i; done
Output:
1-start
2-stop
1-stop
0-start
2-start
2-anotherkey
1-anotherkey
0-stop
0-anotherkey
# Leading numbers and "-" removed:
for i in "${!shapingTimes[#]}"; do echo ${i/#[0-9]*-/}; done
Output:
start
stop
stop
start
start
anotherkey
anotherkey
stop
anotherkey
# put shortend keys in new associative array
declare -A hash
for i in "${!shapingTimes[#]}"; do hash[${i/#[0-9]*-/}]=""; done
echo "${#hash[#]}"
Output:
3
Once you key your array names with -, there isn't a direct way to identify the count of occurrences of the string past the - character.
One way would be to use additional tools to identify the count. The "${!shapingTimes[#]}" prints all the keys of the array and sort -ut- k2 goes a unique sort based on the 2nd field following the - delimiter, which is piped to wc -l to get the line count.
printf '%s\n' "${!shapingTimes[#]}" | sort -ut- -k2 | wc -l
3
Same as #Cyrus's solution, but with no loop and no sub-shell:
#!/usr/bin/env bash
declare -A shapingTimes=(
[0-start]=15 [0-stop]=21 [0-anotherkey]=foo
[1-start]=4 [1-stop]=6 [1-anotherkey]=bar
[2-start]=9 [2-stop]=11 [2-anotherkey]=blah
)
# Populate simple array with keys only
declare -a shapingTimesKeys=("${!shapingTimes[#]}")
# Create associative array entries declaration string
# populated by prefix stripped keys
printf -v _str '[%q]= ' "${shapingTimesKeys[#]/#[0-9]*-/}"
# Declare the stripped keys associative array using
# the string populated just above
declare -A shapingTimesHash="($_str)"
printf 'Found %d keys:\n' "${#shapingTimesHash[#]}"
printf '%q\n' "${!shapingTimesHash[#]}"
In my shell code, I have an indexed array, containing the names of associative arrays:
declare -A assoc1=([name]=aaa [age]=20)
declare -A assoc2=([name]=bbb [age]=40)
declare -A assoc3=([name]=ccc [age]=25)
indexed_array=(assoc1 assoc2 assoc3)
So, using the above, ${indexed_array[#]} equals assoc1 assoc2 assoc3.
I want a sort_array function that can re-sort the values in indexed_array, so that the associative array with the highest age (assoc2) is listed first or last, like so:
new_indexed_array=( $(echo ${indexed_array[#]} | sort_by 'age' 'desc') )
After that I should get re-ordered contents in the new array:
declare -p new_indexed_array
# gives "assoc2 assoc3 assoc1"
I have some boilerplate code to get into the array values, but wasn't able to get any further in sorting the arrays..
function sort_by {
# for each hash in the given array
get_stdin # (custom func, sets $STDIN)
for hash in ${STDIN[#]}
do
# get the hash keys
hash_keys="$(eval "echo \${!$hash[#]}")"
# for each key
for hashkey in $hash_keys
do
# reset
return_the_array=false
# if $hashkey matches the key given
if [ "$hashkey" = "$1" ];then
# check the value of this one if highest/lowest
# (compared to previous ones)
# and then return if yes/mo (asc/desc)
fi
# if $return_the_array = true, then we found the right key and
# it's higher/lower
if [ "$return_the_array" = true ];then
# do stuff
fi
done
done
}
If you have Bash 4.3 or newer, you can use namerefs for this, as follows:
sort_by() {
local arr field sort_params elem
declare -n arr=$1
field=$2
# Build array with sort parameters
[[ $3 == 'desc' ]] && sort_params+=('-r')
[[ $field == 'age' ]] && sort_params+=('-n')
# Schwartzian transform
for elem in "${arr[#]}"; do
declare -n ref=$elem
printf '%s\t%s\n' "${ref["$field"]}" "$elem"
done | sort "${sort_params[#]}" | cut -f2
}
declare -A assoc1=([name]=aaa [age]=20)
declare -A assoc2=([name]=bbb [age]=40)
declare -A assoc3=([name]=ccc [age]=25)
indexed_array=(assoc1 assoc2 assoc3)
readarray -t byage < <(sort_by indexed_array age desc)
declare -p byage
readarray -t byname < <(sort_by indexed_array name asc)
declare -p byname
The calling syntax is a bit different:
sort_by ARRAYNAME FIELD SORTORDER
and the output is one element per line, so to read it back into an array, we have to use something like readarray (see examples at the end).
First, we use a nameref to assign the arrayname to arr:
declare -n arr=$1
arr now behaves as if it were the actual array.
Then, we build an array with the parameters for sort: if the third parameter is desc, we use -r, and if the field is age, we use -n. This could be made a bit smarter and check if the field contains numeric values or not, and set -n accordingly.
We then iterate over the elements of arr, the elements of which are the names of the associative arrays. In the loop, we assign the names to ref:
declare -n ref=$elem
ref now behaves like the actual associative array.
To sort, we use a Schwartzian transform (decorate – sort – undecorate) by printing lines with the chosen field name and then the array name; for example, for age, we'd get
20 assoc1
40 assoc2
25 assoc3
This is piped to sort with the proper parameters, and with cut -f2 we remove the sort field again.
The output for the examples looks like this:
declare -a byage=([0]="assoc2" [1]="assoc3" [2]="assoc1")
declare -a byname=([0]="assoc1" [1]="assoc2" [2]="assoc3")
Notice that declare -n declares local parameters in the function, so they don't pollute the global namespace.
Just for reference, I am using a modified version of the accepted answer:
function sort_by {
local field sort_params elem
field=$1
# Build array with sort parameters
[[ $2 == 'desc' ]] && sort_params+=('-r')
[[ $field == 'age' ]] && sort_params+=('-n')
# Schwartzian transform,
# get piped array contents from $(cat)
while read -r elem; do
declare -n ref=$elem
printf '%s\t%s\n' "${ref["$field"]}" "$elem"
done | sort "${sort_params[#]}" | cut -f2 | tr '\n' ' '
}
The difference between this function and the accepted answer, is that this function above expects to receive the contents of the indexed array as piped input, rather than taking the array name as the first parameter..
Therefore, my function can be called like so:
echo ${someIndexedArray[#]} | sort_by 'age' 'desc'
(And this func outputs everything on the same line, as opposed to newlines)
The benefit of this (for me) is that it now works in the CMS I am using, and also ,the sort_by function does not need to know the name of the array - which would not be passed by the CMS.
I need to remove an element from an array in bash shell.
Generally I'd simply do:
array=("${(#)array:#<element to remove>}")
Unfortunately the element I want to remove is a variable so I can't use the previous command.
Down here an example:
array+=(pluto)
array+=(pippo)
delete=(pluto)
array( ${array[#]/$delete} ) -> but clearly doesn't work because of {}
Any idea?
The following works as you would like in bash and zsh:
$ array=(pluto pippo)
$ delete=pluto
$ echo ${array[#]/$delete}
pippo
$ array=( "${array[#]/$delete}" ) #Quotes when working with strings
If need to delete more than one element:
...
$ delete=(pluto pippo)
for del in ${delete[#]}
do
array=("${array[#]/$del}") #Quotes when working with strings
done
Caveat
This technique actually removes prefixes matching $delete from the elements, not necessarily whole elements.
Update
To really remove an exact item, you need to walk through the array, comparing the target to each element, and using unset to delete an exact match.
array=(pluto pippo bob)
delete=(pippo)
for target in "${delete[#]}"; do
for i in "${!array[#]}"; do
if [[ ${array[i]} = $target ]]; then
unset 'array[i]'
fi
done
done
Note that if you do this, and one or more elements is removed, the indices will no longer be a continuous sequence of integers.
$ declare -p array
declare -a array=([0]="pluto" [2]="bob")
The simple fact is, arrays were not designed for use as mutable data structures. They are primarily used for storing lists of items in a single variable without needing to waste a character as a delimiter (e.g., to store a list of strings which can contain whitespace).
If gaps are a problem, then you need to rebuild the array to fill the gaps:
for i in "${!array[#]}"; do
new_array+=( "${array[i]}" )
done
array=("${new_array[#]}")
unset new_array
You could build up a new array without the undesired element, then assign it back to the old array. This works in bash:
array=(pluto pippo)
new_array=()
for value in "${array[#]}"
do
[[ $value != pluto ]] && new_array+=($value)
done
array=("${new_array[#]}")
unset new_array
This yields:
echo "${array[#]}"
pippo
This is the most direct way to unset a value if you know it's position.
$ array=(one two three)
$ echo ${#array[#]}
3
$ unset 'array[1]'
$ echo ${array[#]}
one three
$ echo ${#array[#]}
2
This answer is specific to the case of deleting multiple values from large arrays, where performance is important.
The most voted solutions are (1) pattern substitution on an array, or (2) iterating over the array elements. The first is fast, but can only deal with elements that have distinct prefix, the second has O(n*k), n=array size, k=elements to remove. Associative array are relative new feature, and might not have been common when the question was originally posted.
For the exact match case, with large n and k, possible to improve performance from O(nk) to O(n+klog(k)). In practice, O(n) assuming k much lower than n. Most of the speed up is based on using associative array to identify items to be removed.
Performance (n-array size, k-values to delete). Performance measure seconds of user time
N K New(seconds) Current(seconds) Speedup
1000 10 0.005 0.033 6X
10000 10 0.070 0.348 5X
10000 20 0.070 0.656 9X
10000 1 0.043 0.050 -7%
As expected, the current solution is linear to N*K, and the fast solution is practically linear to K, with much lower constant. The fast solution is slightly slower vs the current solution when k=1, due to additional setup.
The 'Fast' solution: array=list of input, delete=list of values to remove.
declare -A delk
for del in "${delete[#]}" ; do delk[$del]=1 ; done
# Tag items to remove, based on
for k in "${!array[#]}" ; do
[ "${delk[${array[$k]}]-}" ] && unset 'array[k]'
done
# Compaction
array=("${array[#]}")
Benchmarked against current solution, from the most-voted answer.
for target in "${delete[#]}"; do
for i in "${!array[#]}"; do
if [[ ${array[i]} = $target ]]; then
unset 'array[i]'
fi
done
done
array=("${array[#]}")
Here's a one-line solution with mapfile:
$ mapfile -d $'\0' -t arr < <(printf '%s\0' "${arr[#]}" | grep -Pzv "<regexp>")
Example:
$ arr=("Adam" "Bob" "Claire"$'\n'"Smith" "David" "Eve" "Fred")
$ echo "Size: ${#arr[*]} Contents: ${arr[*]}"
Size: 6 Contents: Adam Bob Claire
Smith David Eve Fred
$ mapfile -d $'\0' -t arr < <(printf '%s\0' "${arr[#]}" | grep -Pzv "^Claire\nSmith$")
$ echo "Size: ${#arr[*]} Contents: ${arr[*]}"
Size: 5 Contents: Adam Bob David Eve Fred
This method allows for great flexibility by modifying/exchanging the grep command and doesn't leave any empty strings in the array.
Partial answer only
To delete the first item in the array
unset 'array[0]'
To delete the last item in the array
unset 'array[-1]'
To expand on the above answers, the following can be used to remove multiple elements from an array, without partial matching:
ARRAY=(one two onetwo three four threefour "one six")
TO_REMOVE=(one four)
TEMP_ARRAY=()
for pkg in "${ARRAY[#]}"; do
for remove in "${TO_REMOVE[#]}"; do
KEEP=true
if [[ ${pkg} == ${remove} ]]; then
KEEP=false
break
fi
done
if ${KEEP}; then
TEMP_ARRAY+=(${pkg})
fi
done
ARRAY=("${TEMP_ARRAY[#]}")
unset TEMP_ARRAY
This will result in an array containing:
(two onetwo three threefour "one six")
Here's a (probably very bash-specific) little function involving bash variable indirection and unset; it's a general solution that does not involve text substitution or discarding empty elements and has no problems with quoting/whitespace etc.
delete_ary_elmt() {
local word=$1 # the element to search for & delete
local aryref="$2[#]" # a necessary step since '${!$2[#]}' is a syntax error
local arycopy=("${!aryref}") # create a copy of the input array
local status=1
for (( i = ${#arycopy[#]} - 1; i >= 0; i-- )); do # iterate over indices backwards
elmt=${arycopy[$i]}
[[ $elmt == $word ]] && unset "$2[$i]" && status=0 # unset matching elmts in orig. ary
done
return $status # return 0 if something was deleted; 1 if not
}
array=(a 0 0 b 0 0 0 c 0 d e 0 0 0)
delete_ary_elmt 0 array
for e in "${array[#]}"; do
echo "$e"
done
# prints "a" "b" "c" "d" in lines
Use it like delete_ary_elmt ELEMENT ARRAYNAME without any $ sigil. Switch the == $word for == $word* for prefix matches; use ${elmt,,} == ${word,,} for case-insensitive matches; etc., whatever bash [[ supports.
It works by determining the indices of the input array and iterating over them backwards (so deleting elements doesn't screw up iteration order). To get the indices you need to access the input array by name, which can be done via bash variable indirection x=1; varname=x; echo ${!varname} # prints "1".
You can't access arrays by name like aryname=a; echo "${$aryname[#]}, this gives you an error. You can't do aryname=a; echo "${!aryname[#]}", this gives you the indices of the variable aryname (although it is not an array). What DOES work is aryref="a[#]"; echo "${!aryref}", which will print the elements of the array a, preserving shell-word quoting and whitespace exactly like echo "${a[#]}". But this only works for printing the elements of an array, not for printing its length or indices (aryref="!a[#]" or aryref="#a[#]" or "${!!aryref}" or "${#!aryref}", they all fail).
So I copy the original array by its name via bash indirection and get the indices from the copy. To iterate over the indices in reverse I use a C-style for loop. I could also do it by accessing the indices via ${!arycopy[#]} and reversing them with tac, which is a cat that turns around the input line order.
A function solution without variable indirection would probably have to involve eval, which may or may not be safe to use in that situation (I can't tell).
Using unset
To remove an element at particular index, we can use unset and then do copy to another array. Only just unset is not required in this case. Because unset does not remove the element it just sets null string to the particular index in array.
declare -a arr=('aa' 'bb' 'cc' 'dd' 'ee')
unset 'arr[1]'
declare -a arr2=()
i=0
for element in "${arr[#]}"
do
arr2[$i]=$element
((++i))
done
echo "${arr[#]}"
echo "1st val is ${arr[1]}, 2nd val is ${arr[2]}"
echo "${arr2[#]}"
echo "1st val is ${arr2[1]}, 2nd val is ${arr2[2]}"
Output is
aa cc dd ee
1st val is , 2nd val is cc
aa cc dd ee
1st val is cc, 2nd val is dd
Using :<idx>
We can remove some set of elements using :<idx> also. For example if we want to remove 1st element we can use :1 as mentioned below.
declare -a arr=('aa' 'bb' 'cc' 'dd' 'ee')
arr2=("${arr[#]:1}")
echo "${arr2[#]}"
echo "1st val is ${arr2[1]}, 2nd val is ${arr2[2]}"
Output is
bb cc dd ee
1st val is cc, 2nd val is dd
http://wiki.bash-hackers.org/syntax/pe#substring_removal
${PARAMETER#PATTERN} # remove from beginning
${PARAMETER##PATTERN} # remove from the beginning, greedy match
${PARAMETER%PATTERN} # remove from the end
${PARAMETER%%PATTERN} # remove from the end, greedy match
In order to do a full remove element, you have to do an unset command with an if statement. If you don't care about removing prefixes from other variables or about supporting whitespace in the array, then you can just drop the quotes and forget about for loops.
See example below for a few different ways to clean up an array.
options=("foo" "bar" "foo" "foobar" "foo bar" "bars" "bar")
# remove bar from the start of each element
options=("${options[#]/#"bar"}")
# options=("foo" "" "foo" "foobar" "foo bar" "s" "")
# remove the complete string "foo" in a for loop
count=${#options[#]}
for ((i = 0; i < count; i++)); do
if [ "${options[i]}" = "foo" ] ; then
unset 'options[i]'
fi
done
# options=( "" "foobar" "foo bar" "s" "")
# remove empty options
# note the count variable can't be recalculated easily on a sparse array
for ((i = 0; i < count; i++)); do
# echo "Element $i: '${options[i]}'"
if [ -z "${options[i]}" ] ; then
unset 'options[i]'
fi
done
# options=("foobar" "foo bar" "s")
# list them with select
echo "Choose an option:"
PS3='Option? '
select i in "${options[#]}" Quit
do
case $i in
Quit) break ;;
*) echo "You selected \"$i\"" ;;
esac
done
Output
Choose an option:
1) foobar
2) foo bar
3) s
4) Quit
Option?
Hope that helps.
There is also this syntax, e.g. if you want to delete the 2nd element :
array=("${array[#]:0:1}" "${array[#]:2}")
which is in fact the concatenation of 2 tabs. The first from the index 0 to the index 1 (exclusive) and the 2nd from the index 2 to the end.
POSIX shell script does not have arrays.
So most probably you are using a specific dialect such as bash, korn shells or zsh.
Therefore, your question as of now cannot be answered.
Maybe this works for you:
unset array[$delete]
What I do is:
array="$(echo $array | tr ' ' '\n' | sed "/itemtodelete/d")"
BAM, that item is removed.
This is a quick-and-dirty solution that will work in simple cases but will break if (a) there are regex special characters in $delete, or (b) there are any spaces at all in any items. Starting with:
array+=(pluto)
array+=(pippo)
delete=(pluto)
Delete all entries exactly matching $delete:
array=(`echo $array | fmt -1 | grep -v "^${delete}$" | fmt -999999`)
resulting in
echo $array -> pippo, and making sure it's an array:
echo $array[1] -> pippo
fmt is a little obscure: fmt -1 wraps at the first column (to put each item on its own line. That's where the problem arises with items in spaces.) fmt -999999 unwraps it back to one line, putting back the spaces between items. There are other ways to do that, such as xargs.
Addendum: If you want to delete just the first match, use sed, as described here:
array=(`echo $array | fmt -1 | sed "0,/^${delete}$/{//d;}" | fmt -999999`)
Actually, I just noticed that the shell syntax somewhat has a behavior built-in that allows for easy reconstruction of the array when, as posed in the question, an item should be removed.
# let's set up an array of items to consume:
x=()
for (( i=0; i<10; i++ )); do
x+=("$i")
done
# here, we consume that array:
while (( ${#x[#]} )); do
i=$(( $RANDOM % ${#x[#]} ))
echo "${x[i]} / ${x[#]}"
x=("${x[#]:0:i}" "${x[#]:i+1}")
done
Notice how we constructed the array using bash's x+=() syntax?
You could actually add more than one item with that, the content of a whole other array at once.
In ZSH this is dead easy (note this uses more bash compatible syntax than necessary where possible for ease of understanding):
# I always include an edge case to make sure each element
# is not being word split.
start=(one two three 'four 4' five)
work=(${(#)start})
idx=2
val=${work[idx]}
# How to remove a single element easily.
# Also works for associative arrays (at least in zsh)
work[$idx]=()
echo "Array size went down by one: "
[[ $#work -eq $(($#start - 1)) ]] && echo "OK"
echo "Array item "$val" is now gone: "
[[ -z ${work[(r)$val]} ]] && echo OK
echo "Array contents are as expected: "
wanted=("${start[#]:0:1}" "${start[#]:2}")
[[ "${(j.:.)wanted[#]}" == "${(j.:.)work[#]}" ]] && echo "OK"
echo "-- array contents: start --"
print -l -r -- "-- $#start elements" ${(#)start}
echo "-- array contents: work --"
print -l -r -- "-- $#work elements" "${work[#]}"
Results:
Array size went down by one:
OK
Array item two is now gone:
OK
Array contents are as expected:
OK
-- array contents: start --
-- 5 elements
one
two
three
four 4
five
-- array contents: work --
-- 4 elements
one
three
four 4
five
To avoid conflicts with array index using unset - see https://stackoverflow.com/a/49626928/3223785 and https://stackoverflow.com/a/47798640/3223785 for more information - reassign the array to itself: ARRAY_VAR=(${ARRAY_VAR[#]}).
#!/bin/bash
ARRAY_VAR=(0 1 2 3 4 5 6 7 8 9)
unset ARRAY_VAR[5]
unset ARRAY_VAR[4]
ARRAY_VAR=(${ARRAY_VAR[#]})
echo ${ARRAY_VAR[#]}
A_LENGTH=${#ARRAY_VAR[*]}
for (( i=0; i<=$(( $A_LENGTH -1 )); i++ )) ; do
echo ""
echo "INDEX - $i"
echo "VALUE - ${ARRAY_VAR[$i]}"
done
exit 0
[Ref.: https://tecadmin.net/working-with-array-bash-script/ ]
How about something like:
array=(one two three)
array_t=" ${array[#]} "
delete=one
array=(${array_t// $delete / })
unset array_t
#/bin/bash
echo "# define array with six elements"
arr=(zero one two three 'four 4' five)
echo "# unset by index: 0"
unset -v 'arr[0]'
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
arr_delete_by_content() { # value to delete
for i in ${!arr[*]}; do
[ "${arr[$i]}" = "$1" ] && unset -v 'arr[$i]'
done
}
echo "# unset in global variable where value: three"
arr_delete_by_content three
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
echo "# rearrange indices"
arr=( "${arr[#]}" )
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
delete_value() { # value arrayelements..., returns array decl.
local e val=$1; new=(); shift
for e in "${#}"; do [ "$val" != "$e" ] && new+=("$e"); done
declare -p new|sed 's,^[^=]*=,,'
}
echo "# new array without value: two"
declare -a arr="$(delete_value two "${arr[#]}")"
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
delete_values() { # arraydecl values..., returns array decl. (keeps indices)
declare -a arr="$1"; local i v; shift
for v in "${#}"; do
for i in ${!arr[*]}; do
[ "$v" = "${arr[$i]}" ] && unset -v 'arr[$i]'
done
done
declare -p arr|sed 's,^[^=]*=,,'
}
echo "# new array without values: one five (keep indices)"
declare -a arr="$(delete_values "$(declare -p arr|sed 's,^[^=]*=,,')" one five)"
for i in ${!arr[*]}; do echo "arr[$i]=${arr[$i]}"; done
# new array without multiple values and rearranged indices is left to the reader
I'm looking for a way to find non-repeated elements in an array in bash.
Simple example:
joined_arrays=(CVE-2015-4840 CVE-2015-4840 CVE-2015-4860 CVE-2015-4860 CVE-2016-3598)
<magic>
non_repeated=(CVE-2016-3598)
To give context, the goal here is to end up with an array of all package update CVEs that aren't generally available via 'yum update' on a host due to being excluded. The way I came up with doing such a thing is to populate 3 preliminary arrays:
available_updates=() #just what 'yum update' would provide
all_updates=() #including excluded ones
joined_updates=() # contents of both prior arrays
Then apply logic to joined_updates=() that would return only elements that are included exactly once. Any element with two occurrences is one that can be updated normally and doesn't need to end up in the 'excluded_updates=()' array.
Hopefully this makes sense. As I was typing it out I'm wondering if it might be simpler to just remove all elements found in available_updates=() from all_updates=(), leaving the remaining ones as the excluded updates.
Thanks!
One pure-bash approach is to store a counter in an associative array, and then look for items where the counter is exactly one:
declare -A seen=( ) # create an associative array (requires bash 4)
for item in "${joined_arrays[#]}"; do # iterate over original items
(( seen[$item] += 1 )) # increment value associated with item
done
declare -a non_repeated=( )
for item in "${!seen[#]}"; do # iterate over keys
if (( ${seen[$item]} == 1 )); then # if counter for that key is 1...
non_repeated+=( "$item" ) # ...add that item to the output array.
done
declare -p non_repeated # print result
Another, terser (but buggier -- doesn't work with values containing newline literals) approach is to take advantage of standard text manipulation tools:
non_repeated=( ) # setup
# use uniq -c to count; filter for results with a count of 1
while read -r count value; do
(( count == 1 )) && non_repeated+=( "$value" )
done < <(printf '%s\n' "${joined_arrays[#]}" | sort | uniq -c)
declare -p non_repeated # print result
...or, even terser (and buggier, requiring that the array value split into exactly one field in awk):
readarray -t non_repeated \
< <(printf '%s\n' "${joined_arrays[#]}" | sort | uniq -c | awk '$1 == 1 { print $2; }'
To crib an answer I really should have come up myself from #Aaron (who deserves an upvote from anyone using this; do note that it retains the doesn't-work-with-values-with-newlines bug), one can also use uniq -u:
readarray -t non_repeated < <(printf '%s\n' "${joined_arrays[#]}" | sort | uniq -u)
I would rely on uniq.
It's -u option is made for this exact case, outputting only the uniques occurrences. It relies on the input to be a sorted linefeed-separated list of tokens, hence the need for IFS and sort :
$ my_test_array=( 1 2 3 2 1 0 )
$ printf '%s\n' "${my_test_array[#]}" | sort | uniq -u
0
3
Here is a single awk based solution that doesn't require sort:
arr=( 1 2 3 2 1 0 )
printf '%s\n' "${arr[#]}" |
awk '{++fq[$0]} END{for(i in fq) if (fq[i]==1) print i}'
0
3