Comparison of two arrays perl - arrays

I am new to Perl world, and I have a script that compares two arrays.
I use List::MoreUtils (each_arrayref) to do the comparison.
I have two questions:
1) Is there a way to compare two chunks of arrays (like natatime but for two arrayrefs) instead of comparing single element at a time as in each_arrayref?
The elements should be from the same index from each array.
The data structure is something like this:
{
atr => [qw/ a b c d /],
ats => [qw/ a b c d /],
att => [qw/ a b c d /],
}
This is what I have got so far.
my #lists = keys %{$hash};
for (my $i = 0; $i <= #lists; $i++) {
my $list_one = $lists[$i];
my $one = $hash->{$list_one};
for (my $j = 0 ; $j <= #lists ; $j++) {
my $list_two = $lists[$j];
my $two = $hash->{$list_two};
my ($overlapping, $mismatch, $identity);
my $match = 0;
my $non_match = 0;
my $count_ac_calls = 0;
my $each_array = each_arrayref($one, $two);
while (my ($call_one, $call_two) = $each_array->()) {
if ((defined $call_one) && (defined $call_two)) {
if ($call_one eq $call_two) {
$match++;
}
if ($call_one ne $call_two) {
$non_match++;
}
}
} #end of while loop $each_array->()
print "$list_one,$list_two,$match,$non_match";
} #end of for j loop
} #end of for i loop
I would like to compare atr->ats, atr->att, ats->att. But with my current code, I get repetitions of comparison like ats->atr att->atr,att->ats.
2) How can I avoid those?

I'm not clear what your first question means. Do you want an iterator that, say, returns (('a','b','c'),('a','b','c'))
instead of
('a','a')? If so then there isn't one available in a library, but it wouldn't be hard to write your own.
As for the second, the usual way to avoid items being compared with themselves is to change the inner loop to start after the current value of the first. Like so
for my $i (0..$#lists) {
for my $j ($i+1..$#lists) {
}
}
This works because A eq B is generally the same as B eq A, so there is no point in comparing an entry with one earlier in the list because the inverse comparison has already been made.
Note that it is much better Perl to write for loops this way than the messy C-style syntax. You also have a couple of bugs in
for (my $i = 0 ; $i <= #lists ; $i++) { ... }
because the maximum index of #lists is one less than the scalar value of #lists - usually coded as $#lists. The same problem exists in your loop for $j.
Update
Here is a refactoring of your program, written to include the ideas I have described and to be more Perlish. I hope it is useful to you.
use strict;
use warnings;
use List::MoreUtils 'each_arrayref';
my $hash = {
atr => [qw/ a b c d /],
ats => [qw/ a b c d /],
att => [qw/ a b c d /],
};
my #keys = keys %{$hash};
for my $i (0 .. $#keys) {
my $key1 = $keys[$i];
my $list1 = $hash->{$key1};
for my $j ($i+1 .. $#keys) {
my $key2 = $keys[$j];
my $list2 = $hash->{$key2};
my ($match, $non_match) = (0, 0);
my $iter = each_arrayref($list1, $list2);
while (my ($call1, $call2) = $iter->()) {
if (defined $call1 and defined $call2) {
($call1 eq $call2 ? $match : $non_match)++;
}
}
print "$key1, $key2, $match, $non_match\n";
}
}

One option is to use Array::Compare to return the number of different array elements. Also, Math::Combinatorics is used to obtain only unique comparisons.
use strict;
use warnings;
use Array::Compare;
use Math::Combinatorics;
my %hash = (
'atr' => [ 'a', 'b', 'c', 'd' ],
'ats' => [ 'a', 'b', 'c', 'd' ],
'att' => [ 'a', 'c', 'c', 'd' ],
);
my $comp = Array::Compare->new( DefFull => 1 );
my $combinat = Math::Combinatorics->new(
count => 2,
data => [ keys %hash ],
);
while ( my ($key1, $key2) = $combinat->next_combination ) {
my $diff = $comp->compare( \#{ $hash{$key1} }, \#{ $hash{$key2} } );
print "$key1,$key2," . ( #{ $hash{$key1} } - $diff ) . ",$diff\n";
}
Output:
ats,att,3,1
ats,atr,4,0
att,atr,3,1

You're not really taking advantage of the features Perl has to offer. Rather than use an error prone C-style loop, just use for my $var (LIST). You can also skip redundant list checking by skipping the self-checks, too. I've taken your script, made some alterations, and I'm sure you'll find it a bit easier to read.
use v5.16;
use warnings;
use List::MoreUtils qw{each_arrayref};
my $hash = {
'atr' => [
'a',
'b',
'c',
'd'
],
'ats'=>[
'a',
'b',
'c',
'd'
],
'att' => [
'a',
'c',
'c',
'd'
],
};
for my $list_one (keys $hash) {
my $one = $hash->{$list_one};
for my $list_two (keys $hash) {
next if $list_one ~~ $list_two;
my $two = $hash->{$list_two};
my ($match, $non_match);
$match = $non_match = 0;
my $each_array = each_arrayref($one, $two);
while (my ($call_one, $call_two) = $each_array->()) {
if($call_one && $call_two) {
if($call_one eq $call_two) {
$match++;
}
else {
$non_match++;
}
}
}
print "$list_one,$list_two,$match,$non_match\n";
}
}
You'll want to evaluate one at a time anyway so that you can add in some extra bits like the index location. (Yes, you could use the C-style loop, but that'd be a bit more difficult to read.)

Related

Perl remove same value back to back with splice

I am trying to remove, the same values twice in an array, it is located back to back, this is my code
#{$tmp_h->{'a'}} = qw/A B B C/;
print Dumper ($tmp_h);
my $j = 0;
foreach my $cur (#{$tmp_h->{'a'}}) {
if ($cur eq 'B') {
splice(#{$tmp_h->{'a'}}, $j, 1);
}
$j++;
}
print Dumper $tmp_h;
However what got is,
$VAR1 = {
'a' => [
'A',
'B',
'B',
'C'
]
};
$VAR1 = {
'a' => [
'A',
'B',
'C'
]
};
I am expecting both 'B' to be removed in this case, what could possibly went wrong?
That code is removing from an array while iterating over it, pulling the carpet from underneath itself; is that necessary?
Instead, iterate and put elements on another array if the adjacent ones aren't equal. So iterate over the index, looking up an element and the next (or previous) one.†
I presume that B is just an example while in fact it can be any value, equal to its adjacent one.
It's interesting that regex can help too, with its simple way to find repeated patterns using backreferences
my #ary = qw(a b b c d d e f f f g);
my $str_ary = join '', #ary;
$str_ary =~ s/(.)\g{-1}//g;
my #new_ary = split //, $str_ary;
say "#new_ary"; #--> a c e f g
This removes pairs of adjacent values, so if there is an odd number of equal adjacent values it leaves the odd one (f above). As a curiosity note that it can be written in one statement
my #new_ary = split //, join('', #ary) =~ s/(.)\g{-1}//gr;
The join-ed array, forming a string, is bound to the substitution operator where /r modifier is crucial, for allowing this and returning the changed string which is then split back into a list.
To change an array in place have it assign to itself.‡
But single-letter elements are only an example, likely. With multiple characters in elements we can't join them by empty string because we wouldn't know how to split that back into an array; we have to join by something that can't be in any one element, clearly a tricky proposition. A reasonable take is a line-feed, as one can expect to know whether elements are/not multiline strings
my #ary = qw(aa no no way bah bah bah go);
my $str_ary = join "\n", #ary ;
$str_ary =~ s/([^\n]+)\n\g{-1}//g;
my #new = grep { $_ } split /\n/, $str_ary;
say "#new"; #--> aa way bah go
This would still have edge cases with interesting elements, like spaces and empty strings (but then any approach would).
† For example
use warnings;
use strict;
use feature 'say';
my #ary = qw(a b b c d d e f f f g);
my #new_ary;
my $i = 0;
while (++$i <= $#ary) {
if ($ary[$i] ne $ary[$i-1]) {
push #new_ary, $ary[$i-1]
}
else { ++$i }
}
push #new_ary, $ary[-1] if $ary[-1] ne $ary[-2];
say "#new_ary"; #--> a c e f g
‡ Done for the arrayref in the question
#{ $hr->{a} } = qw/A B B C/;
#{$hr->{a}} = split //, join('', #{$hr->{a}}) =~ s/(.)\g{-1}//gr;
say "#{$hr->{a}}"; #--> A C
The Perl documentation tells you in perlsyn under Foreach Loops:
If any part of LIST is an array, foreach will get very confused if you
add or remove elements within the loop body, for example with splice. So
don't do that.
You can iterate over the indices instead, but don't forget to not increment the index when removing a value:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my $tmp_h = {a => [qw[ A B B C ]]};
print Dumper($tmp_h);
my $j = 0;
while ($j <= $#{ $tmp_h->{a} }) {
my $cur = $tmp_h->{a}[$j];
if ($cur eq 'B') {
splice #{ $tmp_h->{a} }, $j, 1;
} else {
++$j;
}
}
print Dumper($tmp_h);
Or start from the right so you don't have to worry:
my $j = $#{ $tmp_h->{a} };
while ($j-- >= 0) {
my $cur = $tmp_h->{a}[$j];
splice #{ $tmp_h->{a} }, $j, 1 if $cur eq 'B';
}
But the most straight forward way is to use grep:
#{ $tmp_h->{a} } = grep $_ ne 'B', #{ $tmp_h->{a} };

How to sort an hash and assign a variable to the keypair from an array in perl?

I have an array like this
my #arr =('1','apple','2','orange','1','orange','3','berry','2','berry','1','berry');
my %hash;
my $var =1;
Now how can i sort and assign a variable to the pair?
The desired output is
$hash{1}{apple} =>1;
$hash{1}{orange} =>1;
$hash{1}{berry} =>1;
$hash{2}{orange} =>1;
$hash{2}{berry} =>1;
$hash{3}{berry} =>1;
You need to iterate your array and take two values out per iteration. One way to do this is with a while loop. This will consume the array, so if you want to keep it you might want to make a copy.
use strict;
use warnings;
use Data::Printer;
my #arr = (
'1', 'apple', '2', 'orange', '1', 'orange',
'3', 'berry', '2', 'berry', '1', 'berry',
);
my %hash;
my $var = 1;
while ( my $first_key = shift #arr ) {
my $second_key = shift #arr;
$hash{$first_key}->{$second_key} = $var;
}
p %hash;
This outputs
{
1 {
apple 1,
berry 1,
orange 1
},
2 {
berry 1,
orange 1
},
3 {
berry 1
}
}
An alternative is to use a C-style for loop. This does not change the array.
for (my $i = 0; $i <= $#arr; $i+=2) {
$hash{ $arr[$i] }->{ $arr[$i + 1] } = $var;
}
Or you could use List::Util's pairs function to get two out at the same time.
use List::Util 'pairs';
foreach my $pair ( pairs #arr ) {
my ( $first_key, $second_key ) = #$pair;
$hash{$first_key}->{$second_key} = $var;
}
It's normally expected that you at least spend a few hours trying to write a solution yourself. We will happily help you if you've made a decent attempt of your own but have run out of ideas, but it doesn't go down well if you appear to have dumped your problem on us and are waiting for an answer to pop up while you drink a mug of coffee. You've been told about this before, and only one of your post has a net positive vote. You need to work on that.
Are you certain that you really want a hash of hashes? This is very reminiscent of your previous question How to find if the value exists in hash without using key in perl? where we pretty much established that it was the wrong choice.
The only non-obvious part is extracting the values from the array in pairs, I and I have used C-style for loop to achieve this.
I have used Data::Dumper only to show the resulting hash of hashes.
use strict;
use warnings 'all';
my #arr = qw/ 1 apple 2 orange 1 orange 3 berry 2 berry 1 berry /;
my %hash;
for ( my $i = 0; $i < $#arr; $i += 2 ) {
$hash{$arr[$i]}{$arr[$i+1]} = 1;
}
use Data::Dumper;
print Dumper \%hash;
output
$VAR1 = {
'2' => {
'berry' => 1,
'orange' => 1
},
'3' => {
'berry' => 1
},
'1' => {
'berry' => 1,
'orange' => 1,
'apple' => 1
}
};
Update
Here's an example of generating the keys as I described in the comments. It's almost identical to the solution above, but the resulting hash contents are different.
use strict;
use warnings 'all';
my #arr = qw/ 1 apple 2 orange 1 orange 3 berry 2 berry 1 berry /;
my %hash;
for ( my $i = 0; $i < $#arr; $i += 2 ) {
$hash{"#arr[$i,$i+1]"} = 1;
}
use Data::Dumper;
print Dumper \%hash;
output
$VAR1 = {
'2 berry' => 1,
'1 apple' => 1,
'3 berry' => 1,
'1 orange' => 1,
'1 berry' => 1,
'2 orange' => 1
};
Take the values from the array two at a time (key/value), put them into a hash, then assign the variable as the value.
use Data::Dumper;
sub SortAndAssign {
my ($args) = #_;
my #arr = #{$args->{ARRAY}};
my $var = $args->{VARIABLE};
my %hash;
my $i = 0;
my $size = scalar(#arr);
while ($i < $size) {
# alternating key/value pairs (really a hash)
my $key = $arr[$i++]; # e.g. 1
my $value = $arr[$i++]; # e.g. apple
$hash{$key}{$value} = $var; # e.g. hash->1->apple = 1
}
return %hash;
}
sub ShowSortAndAssign {
my #arr =('1','apple','2','orange','1','orange','3','berry','2','berry','1','berry');
my $var = 1;
my %hash = SortAndAssign({
ARRAY => \#arr,
VARIABLE => $var,
});
print Dumper(\%hash);
print "first apple is " . $hash{1}{apple};
}
sub _Main {
ShowSortAndAssign();
}
_Main();

Perl - Building array of arrays leads to overflow

I am trying to analyze a hashmap, for duplicate values and getting their keys in arrays. These arrays will be in an array of arrays. I am a newbie, by the way. However it never stops running when I start it. Where am I wrong?
while (($k,$v)=each %hashmap){
$hasduplicate = 0;
delete $hashmap{$k};
#dups = ();
while (($k1,$v1) = each %hashmap){
if ($v1 eq $v) {
$hasduplicate = 1;
push #dups, $k1;
delete $hashmap{$k1};
}
}
if ($hasduplicate){
push (#dups, $k);
push #dupsarray, [#dups];}
}
Each hash has just one iterator aligned to itself in Perl (see each). Therefore, running each for the same hash in a loop that calls each is not doing what you think.
If you want to see what's going on, try adding the following line at the start of the outer loop:
warn $k;
You are missing several dollar signs before variable names. For example, you probably want to delete $hashmap{$k} instead of $hashmap{k}, which is equivalent to $hashmap{'k'}.
To output an array of arrays, you have to dereference the inner arrays:
print map "#$_\n", #dupsarray;
BTW, I would use a hash of arrays to solve your task. Here's how:
my %dups;
while (my ($k, $v) = each %hashmap) {
push #{ $dups{$v} }, $k;
}
for my $k (grep #{ $dups{$_} } > 1, keys %dups) {
print "$k: #{ $dups{$k} }\n";
}
The problem is that there can be only one each sequence per hash, as there is only a single index to keep track of the next key/value pair.
In addition, you are using k and k1 in a few places where you mean $k and $k1. You must always use strict and use warnings at the top of every Perl program. This would have alerted you to the problem.
You can get around this problem by using for my $k1 (keys %hashmap) { ... } for the inside loop. This will create a separate list of keys to assign to $k1 in turn so that there is no multiple use of the iterator.
This modification of your code does what I think you want.
use strict;
use warnings;
my %hashmap = (
a => 'a',
b => 'b',
c => 'a',
d => 'c',
);
my #dupsarray;
while (my ($k, $v) = each %hashmap) {
my $hasduplicate = 0;
delete $hashmap{$k};
my #dups;
for my $k1 (keys %hashmap) {
my $v1 = $hashmap{$k1};
if ($v1 eq $v) {
$hasduplicate = 1;
push #dups, $k1;
delete $hashmap{$k1};
}
}
if ($hasduplicate) {
push(#dups, $k);
push #dupsarray, [#dups];
}
}
use Data::Dump;
dd \#dupsarray;
output
[["a", "c"]]
A much simpler method is to create an inverted hash where the keys and values of the original hash are swapped. Then just pick out the values of the inverted hash that have more than one element. This program demonstrates
use strict;
use warnings;
my %hashmap = (
a => 'a',
b => 'b',
c => 'a',
d => 'c',
);
my #dupsarray = do {
my %inverted;
while (my ($k, $v) = each %hashmap) {
push #{ $inverted{$v} }, $k;
}
grep { #$_ > 1 } values %inverted;
};
use Data::Dump;
dd \#dupsarray;
output
[["c", "a"]]

searching two array string for equal words

I am a beginner in Perl. I have two string arrays array1 and array2. I want to check the each and every element in 2nd array. if there is i want to give a relative value one to that particular element in the 2nd array. the relative values are store in an array.I try it out but it wont work and git gives a warning like" Use of uninitialized value in string eq at pjt.pl line 52, line 3".
while($i <= (scalar #resultarray-1))
{
while ($j <= (scalar #inputsymbl-1))
{
if ($resultarray[$i] eq $inputsymbl[$j])
{
$rel[$j]=1;
$i=$i+1;
$j=0;
}
else
{
$j=$j+1;
}
}
if($j==(scalar #inputsymbl))
{
$i=$i+1;
$j=0;
}
}
try this:
my $i = 0;
my $j = 0;
## walk each array element
foreach(#resultarray) {
my $result = $_;
foreach(#inputsymbl) {
my $symbl = $_;
if ($result eq $symbl) {
$rel[$j] = 1;
$i++;
} else {
$j++;
}
}
if ($j == (scalar #inputsymbl - 1)) {
$i++;
$j = 0;
}
}
provide more informations if you need detailed help.
From your question and code, it appears that you want to flag the indexes, by using a third array, of the two array's elements that are equal. By doing this, however, you're creating a sparse array. Also, if the two arrays don't have the same number of elements, a "Use of uninitialized value in string eq..." warning will eventually occur. Given these issues, consider using the smaller index of the two arrays (done using the ternary operator below) and pushing the indexes of the equal elements onto the third array:
use strict;
use warnings;
use Data::Dumper;
my #results;
my #arr1 = qw/A B C D E F G H I J/;
my #arr2 = qw/A D C H E K L H N J P Q R S T/;
# Equal: ^ ^ ^ ^ ^
# Index: 0 2 4 7 9
for my $i ( 0 .. ( $#arr1 <= $#arr2 ? $#arr1 : $#arr2 ) ) {
push #results, $i if $arr1[$i] eq $arr2[$i];
}
print Dumper \#results;
Output:
$VAR1 = [
0,
2,
4,
7,
9
];
Hope this helps!

Is there a way I can replace an element of an array based on its value and not element number with two new elements in the array?

I'm programming in Perl, and I'm in a situation where I have an array such as #contents=(A,S,D,F,M,E) and I want to replace the element M with two new elements X and Y, such as #contents would equal (A,S,D,F,X,Y,E).
You can use map.
#contents = map { $_ eq 'M' ? ('X','Y') : $_ } #contents;
Or you can use splice:
for (0 .. $#contents) {
if ($contents[$_] eq 'M') {
splice #contents, $_, 1, 'X', 'Y';
}
}
You can also simplify further by using keys #contents as the list of indexes, if you are using perl version 5.12 and up.
The command you are looking for is splice.
#!/usr/bin/perl -wT
use strict;
my #contents = qw(A S D F M E);
my $match = 'M';
my #replace = qw(X Y);
my $arrlen = #contents;
for (my $i = 0; $i < $arrlen; $i++)
{
if ($contents[$i] eq $match)
{
splice (#contents, $i, 1, #replace);
last;
}
}
print "$_\n" foreach (#contents);

Resources