Comparing two string elements of an array

a=0
b=1
for a in ${my_array[@]}
do
for b in ${my_array[@]}
do
if [ " ${my_array[a]} " = " ${my_array[b]} " ]
then
continue
((a++))
fi
done
((b++))
done
Hi. I want to compare two strings that are in the same array. If they are the same, I want to print only one of them. How can I do that? I wrote some code with two variables, a and b. a starts at 0 and refers to the first element of the array; b starts at 1 and refers to the element at index 1. I want to compare the two elements and, if they are the same string, print just one of them, so I use "continue". I think my code is correct, but it doesn't work. There is a mistake that I can't see. Can you help me?
For example, it runs like this:
Enter words :
erica 17
sally 16
john 18
henry 17
john 18
jessica 19
As you can see, there are two "john 18" entries. I don't want both of them. My program should check whether two strings are the same and, if they are, keep only one of them.

In the [ ... ] test of an if statement, "=" compares strings (it does not assign), and "==" is a bash synonym for it.

If I understand correctly, you want to remove duplicate elements from an array. If so, the Stack Overflow question "How can I get unique values from an array in Bash?" appears to answer it with the following one-liner:
echo "${my_array[@]}" | tr ' ' '\n' | sort | uniq
Unfortunately, since your array elements contain spaces, this will not work as expected: echo flattens the array into one space-separated string, and tr then puts every word on its own line. The fix is to use printf and drop tr:
printf '%s\n' "${my_array[@]}" | sort | uniq -c
(The -c option also prefixes each line with its count; drop it to print only the elements.)
Note that this changes the order of the elements relative to the original array. I hope that is acceptable.
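If the original order matters, one option is to track already-seen elements in an associative array. The following is a minimal bash sketch, assuming bash 4 or later; the names seen and unique are illustrative and not from the question.

```shell
#!/usr/bin/env bash
# Sketch: remove duplicates while keeping first-seen order (needs bash 4+).
my_array=("erica 17" "sally 16" "john 18" "henry 17" "john 18" "jessica 19")

declare -A seen=()   # illustrative helper: elements already emitted
unique=()            # illustrative result array

for item in "${my_array[@]}"; do
    # Keep the element only the first time it is encountered.
    if [[ -z ${seen[$item]:-} ]]; then
        seen[$item]=1
        unique+=("$item")
    fi
done

printf '%s\n' "${unique[@]}"
```

Quoting "${my_array[@]}" keeps elements such as "john 18" intact instead of splitting them on the space.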

You can use sort and uniq to get your desired output:
echo "${my_array[@]}" | tr ' ' '\n' | sort | uniq -c | tr '\n' ' '
And if your input contains spaces, you can use this:
printf '%s\n' "${my_array[@]}" | sort | uniq -c
That way you get the number of times each string occurred, but each string is printed only once.

Related

How to create a for loop that reads each character in ksh

I'm unable to find any for loop syntax that works with ksh for me. I want to create a program that essentially adds up the numbers assigned to letters when a user enters a word. The for loop should read the first character, look up the number value of that letter, and add it to a variable that starts at 0. Then the next letter's value is added, and so on until there are no more letters, leaving the variable holding the total.
I understand I most likely need an array that specifies each letter's (a-z) value (1-26), which I am also finding difficult to figure out. Or, worst case, I figure out the for loop and then write about 26 if statements saying something like: if the letter equals c, add 3 to the variable.
So far I have this (which is pretty bare bones):
#!/bin/ksh
typeset -A my_array
my_array[a]=1
my_array[b]=2
my_array[c]=3
echo "Enter a word: \c"
read work
for (( i=0; i<${work}; i++ )); do
echo "${work:$i:1}"
done
Pretty sure this for loop is bash and not ksh. And the array returns an error of typeset: bad option(s) (I understand I haven't specified the array in the for loop).
I want the array letters (a, b, c, etc) to correspond to a value such as a = 1, b = 2, c = 3 and so on. For example the word is 'abc' and so it would add 1 + 2 + 3 and the final output will be 6.
You were missing the pound in ${#work}, which expands to 'length of ${work}'.
#!/bin/ksh
typeset -A my_array
my_array[a]=1
my_array[b]=2
my_array[c]=3
read 'work?enter a word: '
for (( i=0; i<${#work}; i++ )); do
c=${work:$i:1} w=${my_array[${work:$i:1}]}
echo "$c: $w"
done
ksh2020 supports read -p PROMPT syntax, which would make this script 100% compatible with bash. ksh93 does not. You could also use printf 'enter a word: ', which works in all three (ksh2020, ksh93, bash) and zsh.
Both ksh2020 and ksh93 understand read var?prompt which I used above.
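The loop above prints each letter next to its value but does not add the values up. A minimal sketch of the summing step follows, written for bash and assuming lowercase ASCII input; it derives each value from the character code instead of a lookup table, and total and code are illustrative names.

```shell
#!/usr/bin/env bash
# Sketch: sum letter values (a=1 ... z=26) of a word.
work="abc"   # assumed sample input; a real script would use read
total=0

for (( i = 0; i < ${#work}; i++ )); do
    c=${work:i:1}
    # printf with a leading quote yields the character's numeric code.
    code=$(printf '%d' "'$c")
    # Only lowercase a-z count; 'a' has code 97, so subtract 96.
    if (( code >= 97 && code <= 122 )); then
        (( total += code - 96 ))
    fi
done

echo "$total"
```

For the word abc this adds 1 + 2 + 3, matching the example in the question.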
First sanitize the input in work; you only want the letters a to z.
charwork=$(tr -cd "[a-z]" <<< "$work")
Next you can fill 2 arrays with corresponding values
a=( {a..z} )
b=( {1..26} )
Using these arrays you can make a file with sed commands
for ((i=0;i<26;i++)); do echo "s/${a[i]}/${b[i]}+/g"; done > replaceletters.sed
# test it
echo "abcz" | sed -f replaceletters.sed
# result: 1+2+3+26+
Before you can pipe this into bc, use sed to remove the last + character.
Append s/+$/\n/ to replaceletters.sed and bc can calculate it.
Now you can use sed for replacing letters by digits and insert + signs.
Combining the steps, you have
a=( {a..z} )
b=( {1..26} )
tr -cd "[a-z]" <<< "$work" |
sed -f <(
for ((i=0;i<26;i++)); do
echo "s/${a[i]}/${b[i]}+/g"
done
echo 's/+$/\n/'
) | bc
In the loop you can use $i and avoid array b, but remember that the array starts at index 0, so a[5] (the letter f) corresponds to 6.

bash shell pull all values from array in random order. Pull each value only once

I need to pull the values from an array in a random order. It shouldn't pull the same value twice.
R=$(($RANDOM%5))
mixarray=("I" "like" "to" "play" "games")
echo ${mixarray[$R]}
I'm not sure what to do after the code above. I thought of putting the first pulled value into another array, and then nesting it all in a loop that checks that second array so it doesn't pull the same value twice from the first array. After many attempts, I just can't get the syntax right.
The output should be something like:
to
like
I
play
games
Thanks,
Would you please try the following:
#!/bin/bash
mixarray=("I" "like" "to" "play" "games")
mapfile -t result < <(for (( i = 0; i < ${#mixarray[@]}; i++ )); do
echo -e "${RANDOM}\t${i}"
done | sort -nk1,1 | cut -f2 | while IFS= read -r n; do
echo "${mixarray[$n]}"
done)
echo "${result[*]}"
First, the for loop prints a random number and an index (starting at 0) side by side,
once for each element of mixarray.
The output will look like:
13959 0
6416 1
6038 2
492 3
19893 4
Then the table is sorted with the 1st field:
492 3
6038 2
6416 1
13959 0
19893 4
Now the indices in the 2nd field are in random order. The field is extracted with
the cut command.
Next, the while loop prints the elements of mixarray in that randomized index order.
Finally, mapfile stores the output in the array result, which is then printed.
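An alternative that avoids the external sort and cut commands is a Fisher-Yates shuffle done directly on the array. This is a minimal sketch rather than part of the answer above; shuffled, j and tmp are illustrative names, and RANDOM % (i + 1) carries a slight modulo bias that is usually acceptable for small arrays.

```shell
#!/usr/bin/env bash
# Sketch: Fisher-Yates shuffle of a bash array.
mixarray=("I" "like" "to" "play" "games")
shuffled=("${mixarray[@]}")   # work on a copy so the original stays intact

for (( i = ${#shuffled[@]} - 1; i > 0; i-- )); do
    j=$(( RANDOM % (i + 1) ))   # random index in 0..i
    # Swap elements i and j.
    tmp=${shuffled[i]}
    shuffled[i]=${shuffled[j]}
    shuffled[j]=$tmp
done

printf '%s\n' "${shuffled[@]}"
```

Because elements are only swapped, each value appears exactly once in the output.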

Strange behaviour while subtracting 2 string arrays

I am subtracting array1 from array2
My 2 arrays are
array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)
array2=(apps argocd cache core default kube-system kube-public kube-node-lease monitoring)
And the way I'm subtracting them is
for i in "${array2[@]}"; do
array1=(${array1[@]//$i})
done
echo ${array1[@]}
Now my expected result should be
dev-monitoring-busk test-ci-cd
But my actual result is
dev--busk test-ci-cd
Although the subtraction looks good, it is also deleting the string monitoring from dev-monitoring-busk. I don't understand why. Can someone point out what's wrong here?
I know that there are other solutions out there for a diff between 2 arrays like
echo ${Array1[@]} ${Array2[@]} | tr ' ' '\n' | sort | uniq -u
But this is more of a diff and not a subtraction. So this does not work for me.
Bit of a kludge but it works ...
use comm to find those items unique to a (sorted) data set
use tr to convert between spaces (' ' == array element separator) and newlines ('\n' ; comm works on individual lines)
echo "${array1[@]}" | tr ' ' '\n' | sort : convert an array's elements into separate lines and sort
comm -23 (sorted data set #1) (sorted data set #2) : compare sorted data sets and return the rows that only exist in data set #1
Pulling this all together gives us:
$ array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)
$ array2=(apps argocd cache core default kube-system kube-public kube-node-lease monitoring)
# find rows that only exist in array1
$ comm -23 <(echo "${array1[@]}" | tr ' ' '\n' | sort) <(echo "${array2[@]}" | tr ' ' '\n' | sort)
dev-monitoring-busk
test-ci-cd
# same thing but this time replace newlines with spaces (ie, pull all items onto a single line of output):
$ comm -23 <(echo "${array1[@]}" | tr ' ' '\n' | sort) <(echo "${array2[@]}" | tr ' ' '\n' | sort) | tr '\n' ' '
dev-monitoring-busk test-ci-cd
NOTEs about comm:
- takes 2 sorted data sets as input
- generates 3 columns of output:
- (output column #1) rows only in data set #1
- (output column #2) rows only in data set #2
- (output column #3) rows in both data sets #1 and #2
- `comm -xy` => discard output columns 'x' and 'y'
- `comm -12` => discard output columns #1 and #2 => only show lines common to both data sets (output column #3)
- `comm -23` => discard output columns #2 and #3 => only show lines that exist in data set #1 (output column #1)
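If the elements might contain spaces, the echo | tr step would split them. A minimal sketch of the same comm approach using printf to emit one element per line, and mapfile (bash 4+) to collect the result; only_in_1 is an illustrative name.

```shell
#!/usr/bin/env bash
# Sketch: array1 minus array2 via comm, one element per line.
array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)
array2=(apps argocd cache core default kube-system kube-public kube-node-lease monitoring)

# printf '%s\n' prints each element on its own line, so no tr is needed.
mapfile -t only_in_1 < <(
    comm -23 <(printf '%s\n' "${array1[@]}" | sort) \
             <(printf '%s\n' "${array2[@]}" | sort)
)

printf '%s\n' "${only_in_1[@]}"
```

As with the original commands, the result comes back in sorted order rather than in array1's original order.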
If I'm understanding correctly, what you want is not to subtract array1
from array2, but to subtract array2 from array1.
As others have pointed out, bash pattern substitution on an array works on substrings
of each element, not on whole elements, so it cannot be used to remove elements.
Instead you can make use of an associative array if your bash version >= 4.2.
Please try the following:
declare -a array1=(apps argocd cache core dev-monitoring-busk test-ci-cd)
declare -a array2=(apps argocd cache core default kube-system kube-public kube-node-lease monitoring)
declare -A mark
declare -a ans
for e in "${array2[@]}"; do
mark[$e]=1
done
for e in "${array1[@]}"; do
[[ ${mark[$e]} ]] || ans+=( "$e" )
done
echo "${ans[@]}"
It first iterates over array2 and marks its elements by using
an associative array mark.
It then iterates over array1 and adds each element to the answer
if it has not been marked.
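The reason the question's loop turned dev-monitoring-busk into dev--busk can be reproduced in isolation. In this small sketch (stripped is an illustrative name), the pattern substitution deletes every match inside each element rather than dropping whole elements:

```shell
#!/usr/bin/env bash
# Sketch: ${array[@]//pattern} edits substrings of every element.
array1=(apps dev-monitoring-busk)

# Deletes the text "monitoring" wherever it occurs inside an element.
stripped=("${array1[@]//monitoring}")

printf '%s\n' "${stripped[@]}"
```

The second element keeps its surrounding text and loses only the matched part, which is why a whole-element lookup such as the mark array is needed for subtraction.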

Count unique values in a bash array

I have an array ${sorted[@]}. How can I count the frequency of occurrence of the elements of the array?
e.g:
Array values:
bob
jane
bob
peter
Results:
bob 2
jane 1
peter 1
The command
(IFS=$'\n'; sort <<< "${array[*]}") | uniq -c
Explanation
Counting occurrences of unique lines is done with the idiom sort file | uniq -c.
Instead of using a file, we can also feed strings from the command line to sort using the here string operator <<<.
Lastly, we have to convert the array entries to lines inside a single string. With ${array[*]} the array is expanded to one single string where the array elements are separated by $IFS.
With IFS=$'\n' we set the $IFS variable to the newline character for this command exclusively. The $'...' is called ANSI-C Quoting and allows us to express the newline character as \n.
The subshell (...) is there to keep the change of $IFS local. After the command $IFS will have the same value as before.
Example
array=(fire air fire earth water air air)
(IFS=$'\n'; sort <<< "${array[*]}") | uniq -c
prints
3 air
1 earth
2 fire
1 water
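The same counts can also be collected without sort and uniq by using an associative array as a tally. This is a minimal sketch assuming bash 4 or later; count is an illustrative name, and the order in which the keys are printed is unspecified.

```shell
#!/usr/bin/env bash
# Sketch: count occurrences with an associative array (needs bash 4+).
array=(fire air fire earth water air air)
declare -A count=()

for item in "${array[@]}"; do
    # Start from 0 for unseen elements, then add one.
    count[$item]=$(( ${count[$item]:-0} + 1 ))
done

# Key order of an associative array is unspecified.
for key in "${!count[@]}"; do
    printf '%s %d\n' "$key" "${count[$key]}"
done
```

Pipe the final loop through sort if a stable output order is needed.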

Bash function with array won't work

I am trying to write a function in bash but it won't work. The function is as follows, it gets a file in the format of:
1 2 first 3
4 5 second 6
...
I'm trying to access only the strings in the 3rd word in every line and to fill the array "arr" with them, without repeating identical strings.
When I activated the "echo" command right after the for loop, it printed only the first string in every iteration (in the above case "first").
Thank you!
function storeDevNames {
n=0
b=0
while read line; do
line=$line
tempArr=( $line )
name=${tempArr[2]}
for i in $arr ; do
#echo ${arr[i]}
if [ "${arr[i]}" == "$name" ]; then
b=1
break
fi
done
if [ "$b" -eq 0 ]; then
arr[n]=$name
n=$(($n+1))
fi
b=0
done < $1
}
The following line seems suspicious
for i in $arr ; do
I changed it as follows and it works for me:
#! /bin/bash
function storeDevNames {
n=0
b=0
while read line; do
# line=$line # ?!
tempArr=( $line )
name=${tempArr[2]}
for i in "${arr[@]}" ; do
if [ "$i" == "$name" ]; then
b=1
break
fi
done
if [ "$b" -eq 0 ]; then
arr[n]=$name
(( n++ ))
fi
b=0
done
}
storeDevNames < <(cat <<EOF
1 2 first 3
4 5 second 6
7 8 first 9
10 11 third 12
13 14 second 15
EOF
)
echo "${arr[@]}"
You can replace all of your read block with:
arr=( $(awk '{print $3}' <"$1" | sort | uniq) )
This will fill arr with only unique names from the 3rd word such as first, second, ... This will reduce the entire function to:
function storeDevNames {
arr=( $(awk '{print $3}' <"$1" | sort | uniq) )
}
Note: this will provide a list of all unique device names in sorted order. Removing duplicates this way also destroys the original order. If you need to preserve the original order except where duplicates are removed, see 4ae1e1's alternative.
You're using the wrong tool. awk is designed for this kind of job.
awk '{ if (!seen[$3]++) print $3 }' <"$1"
This one-liner prints the third column of each line, removing duplicates along the way while preserving the order of lines (only the first occurrence of each unique string is printed). sort | uniq, on the other hand, breaks the original order of lines. This one-liner is also faster than using sort | uniq (for large files, which doesn't seem to be applicable in OP's case), since this one-liner linearly scans the file once, while sort is obviously much more expensive.
As an example, for an input file with contents
1 2 first 3
4 5 second 6
7 8 third 9
10 11 second 12
13 14 fourth 15
the above awk one-liner gives you
first
second
third
fourth
To put the results in an array:
arr=( $(awk '{ if (!seen[$3]++) print $3 }' <"$1") )
Then echo ${arr[@]} will give you first second third fourth.
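The arr=( $(...) ) form relies on word splitting and is also subject to glob expansion. A hedged variant, assuming bash 4 or later, reads the awk output line by line with mapfile instead; the temporary sample file here only stands in for the real input file.

```shell
#!/usr/bin/env bash
# Sketch: read unique third-column values into an array, one per line.
storeDevNames() {
    # $1 is the input file; arr is set as a global, as in the question.
    mapfile -t arr < <(awk '!seen[$3]++ { print $3 }' "$1")
}

# Usage with a temporary sample file (contents mirror the example above).
sample=$(mktemp)
printf '%s\n' '1 2 first 3' '4 5 second 6' '7 8 third 9' \
              '10 11 second 12' '13 14 fourth 15' > "$sample"
storeDevNames "$sample"
rm -f "$sample"

printf '%s\n' "${arr[@]}"
```

The awk pattern !seen[$3]++ is the condensed form of the if statement used above: it is true only the first time a given third field appears.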

Resources