Iterate over first at most n elements of an perl array - arrays

What is the most pleasant idiomatic way of writing
for (take(100,#array)) {...}
given that there is no take (which takes the first n elements of a list, but fewer if there are not n elements)?
Things I considered:
for (#array[0..99]) {...}
but that fails ungraceful if #array has fewer than 100 elements
for (#array[0..min(99,$#array)]) {...}
but min is not a standard function in Perl
for (splice #array,0,100) {...}
bu that changes the array.

for (#array[0..min(99,$#array)]) {...}
but min is not a standard function in Perl
min is a standard function in the module List::Util, which is part of core as of 5.7.3.
use List::Util qw(min);
for (#array[0..min(99,$#array)]) { # generator in 5.8.8+
...
}
Note that perl 5.8.8 onward, and perhaps earlier, is smart enough to understand that expression as a generator rather than a slice. That is, elements 0 through $terminus are fetched one at a time from #array instead of an anonymous slice being taken and copied.

You want the CPAN module List::Slice
use List::Slice 'head';
foreach my $elem ( head 100, #things ) { ... }

You've indicated you've found the following the cleanest:
take(100, #array)
So to answer your question as to what's the cleanest, that is! I don't see why you're trying to find an alternative.

How about using map:
my #array = qw ( 1 2 3 4 );
print join "\n", map { $_ // () } #array[0..10];
This takes 10 elements from a list, but applies the 'defined' test to it - and if it's not defined, returns an empty list.
So you could:
for ( map { $_ // () } #array[0..100] ) {
#do something
}
Note - // is a defined-or operator, only available from perl 5.10+. You could use instead a defined ternary:
print join "\n", map { defined ? $_ : () } #array[0..10];

You could add an additional check inside of your loop to break once the end is reached.
my #arr = (1 .. 90);
for ( #arr[0..99]) {
last unless defined $_;
say;
}
But that would not work on arrays that have undef values in between, like these:
my #foo = (1, 2, undef, 4);
my #bar;
$bar[2] = 'foo'; # (undef, undef, 'foo')

The other responses have this covered but, just for thoroughness, there are a couple of "pumpkin perl" gather/take implementations on CPAN :-)
List::Gather
Syntax::Keyword::Gather
There is also Damian Conway's Perl6::Gather which is pretty much the same but requires Perl6::Export.
They let you work with lists in the way you want. e.g. to "take" half the alphabet:
perl -E 'use List::Gather; #lpha = ("a" .. "z");
#half = gather { for (#lpha){ take $_ if gathered < 13 } } ; say #half'
abcdefghijklm
or less if we aren't halfway there yet :
perl -E 'use List::Gather; #lpha = ("a" .. "c");
#half = gather { for (#lpha) { take $_ if gathered < 13 } } ; say #half'
abc
With List::Gather the gather block can take a loop (because of lexical scoping inside gather{} ??) and the topic $_ is required inside the block:
perl -E 'use List::Gather; #lpha = ("a" .. "g");
#half = gather for (#lpha) { take $_ if gathered < 13 }; say #half'
With Syntax::Keyword::Gather you do that inside the gather{} block (which is also possible with List::Gather):
perl -E 'use Syntax::Keyword::Gather; #lpha = ("a".."g");
#half = gather { for (#lpha){ take if gathered < 13 } }; say #half'
I find gather/take to be a nice alternative way to work with lists. Whether it is nice enough to ship with perl one day - say in List::Util - is an interesting implicit part of your question ;-) but they are on CPAN.
Postscript
To address some of the concerns about defined-ness raised by #simbabque, #zaid and #Joachim Breitner more checks can be added to the take() routine.
Here I use Ingy's boolean:
perl -E 'use boolean; use List::Gather;
#lpha = ("a" .. "g", "", undef, undef, "x", "0", "z");
#half = gather { for (#lpha){ take $_ if boolean($_) && gathered < 13 }};
use DDP; p #half;'
Output:
[
[0] "a",
[1] "b",
[2] "c",
[3] "d",
[4] "e",
[5] "f",
[6] "g",
[7] "x",
[8] "z"
]

I think you should use iterator pattern, i.e.
my $iterator = create_iterator(100);
while (my $element = $iterator->()) {
...;
}
There limit might be either embedded into iterator creation, i.e.
sub create_iterator {
my $limit = shift;
my #data = (0 x 1000);
my $i = 0;
return sub {
return $data[$i++] if ($i < #data);
}
}
PS. There is a limitation, that undef cannot be part of #data

Related

Index of element of one array in another array avoiding list::utils

Aim: To match two array and find the index position of matched element.
Tricky part: As per the example given below, the elements of each array are not exactly similar to each other but the comprise of some part and I want to match that.
Sample of how the arrays are:
array1=('adam west', 'daric dalon','tom helic','todd nick','riley remer');
array2=('adam west 12', 'daric dalon mr.','tom helic (fads)','todd nick (456)','riley remer','john steve','dim madz 12');
I have tried using List::MoreUtils qw(first_index); method but this method isn't giving what I want.
The following I have tried:
Try 1
for my $name (#Names) {
$count;
for $matchList (#org_name) {
if ( index( $matchList, $name ) != -1 ) {
push (#matched,$matchList);
$count++;
}#print $org_name[$count];
}
} print "Those which are matched #matched\n";
Try 2
The next method I have copied from online but not very much useful.
use List::MoreUtils qw(first_index);
#indexes;
foreach my $place (#allNames) {
push #indexes, first_index { $_ eq $place } #org_name;
}
use Data::Dumper qw(Dumper);
print Dumper \#indexes;
I know this is going to be a very easy thing which I am missing out. But please let me know what I can do.
I think you need something like this:
use Data::Dumper
my #indexes = ();
#create hash of "first_index" elemets in #org_name
my %tmp = map { $org_name[$_] => $_ } reverse 0..$#org_name;
for my $t (#allNames) {
if(exists($tmp{$t})) {
push #indexes, $tmp{$t};
}
}
print Dumper #indexes;
if ( index( $matchList, $name ) != -1 ) {
You haven't explained what you mean by "match two arrays". I think you might be looking for elements that are in both arrays. But the line above isn't testing for equality of two strings, it is testing if $name appears as a substring of $matchList. Is that what you want?
If you want to test for equality, you should use this instead:
if ($matchList eq $name) {
But the substring check should still work if your strings are equal. If they aren't matching, then it sounds like your arrays don't contain what you think they contain. Perhaps one array contains elements that still have newlines attached - or something like that.
If you're actually trying to find the elements that appear in both arrays, then (as it often the case) the Perl FAQ will be useful. You're looking for the intersection of two sets and perlfaq4 contains this:
How do I compute the difference of two arrays? How do I compute the
intersection of two arrays?
Use a hash. Here's code to do both and
more. It assumes that each element is unique in a given array:
my (#union, #intersection, #difference);
my %count = ();
foreach my $element (#array1, #array2) { $count{$element}++ }
foreach my $element (keys %count) {
push #union, $element;
push #{ $count{$element} > 1 ? \#intersection : \#difference }, $element;
}
Note that this is the symmetric difference, that is, all elements in either A or in B but not in both. Think of it as an xor operation.
Update: Having seen what you want, most of what I said above is completely off-topic. Please try to be clearer when asking questions.
What you actually want is something like this:
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my #arr1 = ('adam west', 'daric dalon', 'tom helic','todd nick', 'riley remer');
my #arr2 = ('adam west 12', 'daric dalon mr.','tom helic (fads)',
'todd nick (456)', 'riley remer', 'john steve', 'dim madz 12');
# Match elements from array1 ...
for my $e1 (#arr1) {
# ... against indexes in array2
for my $i2 (0 .. $#arr2) {
if ($arr2[$i2] =~ /^\Q$e1/) {
say "$e1 -> $i2";
}
}
}
Or you can use List::MoreUtils::firstidx like this:
#!/usr/bin/perl
use strict;
use warnings;
use List::MoreUtils 'firstidx';
use feature 'say';
my #arr1 = ('adam west', 'daric dalon', 'tom helic','todd nick', 'riley remer');
my #arr2 = ('adam west 12', 'daric dalon mr.','tom helic (fads)',
'todd nick (456)', 'riley remer', 'john steve', 'dim madz 12');
# Match elements from array1 ...
for my $e1 (#arr1) {
# ... against indexes in array2
if ((my $i2 = firstidx { /^\Q$e1/ } #arr2) == -1) {
say "$e1 not found in \#arr2";
} else {
say "$e1 -> $i2";
}
}

Perl: Removing array items and resizing the array

I’m trying to filter an array of terms using another array in Perl. I have Perl 5.18.2 on OS X, though the behavior is the same if I use 5.010. Here’s my basic setup:
#!/usr/bin/perl
#use strict;
my #terms = ('alpha','beta test','gamma','delta quadrant','epsilon',
'zeta','eta','theta chi','one iota','kappa');
my #filters = ('beta','gamma','epsilon','iota');
foreach $filter (#filters) {
for my $ind (0 .. $#terms) {
if (grep { /$filter/ } $terms[$ind]) {
splice #terms,$ind,1;
}
}
}
This works to pull out the lines that match the various search terms, but the array length doesn’t change. If I write out the resulting #terms array, I get:
[alpha]
[delta quadrant]
[zeta]
[eta]
[theta chi]
[kappa]
[]
[]
[]
[]
As you might expect from that, printing scalar(#terms) gets a result of 10.
What I want is a resulting array of length 6, without the four blank items at the end. How do I get that result? And why isn’t the array shrinking, given that the perldoc page about splice says, “The array grows or shrinks as necessary.”?
(I’m not very fluent in Perl, so if you’re thinking “Why don’t you just...?”, it’s almost certainly because I don’t know about it or didn’t understand it when I heard about it.)
You can always regenerate the array minus things you don't want. grep acts as a filter allowing you to decide which elements you want and which you don't:
#!/usr/bin/perl
use strict;
my #terms = ('alpha','beta test','gamma','delta quadrant','epsilon',
'zeta','eta','theta chi','one iota','kappa');
my #filters = ('beta','gamma','epsilon','iota');
my %filter_exclusion = map { $_ => 1 } #filters;
my #filtered = grep { !$filter_exclusion{$_} } #terms;
print join(',', #filtered) . "\n";
It's pretty easy if you have a simple structure like %filter_exclusion on hand.
Update: If you want to allow arbitrary substring matches:
my $filter_exclusion = join '|', map quotemeta, #filters;
my #filtered = grep { !/$filter_exclusion/ } #terms;
To see what's going on, print the contents of the array in each step: When you splice the array, it shrinks, but your loop iterates over 0 .. $#terms, so at the end of the loop, $ind will point behind the end of the array. When you use grep { ... } $array[ $too_large ], Perl needs to alias the non-existent element to $_ inside the grep block, so it creates an undef element in the array.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my #terms = ('alpha', 'beta test', 'gamma', 'delta quadrant', 'epsilon',
'zeta', 'eta', 'theta chi', 'one iota', 'kappa');
my #filters = qw( beta gamma epsilon iota );
for my $filter (#filters) {
say $filter;
for my $ind (0 .. $#terms) {
if (grep { do {
no warnings 'uninitialized';
/$filter/
} } $terms[$ind]
) {
splice #terms, $ind, 1;
}
say "\t$ind\t", join ' ', map $_ || '-', #terms;
}
}
If you used $terms[$ind] =~ /$filter/ instead of grep, you'd still get uninitialized warnings, but as there's no need to alias the element, it won't be created.

Perl formatting array output.

I have a small program that I am trying to format the output.
The results get loaded in to an array - I am just having trouble formating the
printing out the array into a certain format.
#!/usr/bin/perl
use strict ;
use warnings ;
my #first_array ;
my #second_array ;
my #cartesian ;
while (<>) {
my $first_input = $_ ;
#first_array = split(' ', $first_input) ;
last ;
}
while (<>) {
my $second_input = $_ ;
#second_array = split(' ', $second_input) ;
last ;
}
while(my $first=shift(#first_array)) {
push(#cartesian, $first) ;
my $second = shift(#second_array) ;
push(#cartesian, $second ) ;
}
print "This is the merged array: #cartesian\n" ;
When I enter this in, I get this:
$ ./double_while2.pl
1 2 3
mon tue wed
This is the merged array 1 mon 2 tue 3 wed
what I want to print out is :
"1", "mon",
"2", "tue" ,
"3", "wed",
or alternately:
1 => "mon",
2 => "tue",
3 => "wed,
May I suggest a hash, since you are pairing things
my %cartesian;
#cartesian{ #first_array } = #second_array;
print "$_ => $cartesian{$_}\n" for sort keys %cartesian;
A hash slice is used above. See Slices in perldata
The arrays that you build had better pair up just right, or there will be errors.
If the goal is to build a data structure that pairs up elements, that can probably be done directly, without arrays. More information would help to comment on that.
Try to use hash instead.
for my $i(0..$#first_array){
$hash{$first_array[$i]} = $second_array[$i];
}
or else, you want format without using hashes, try as follows
for (my $i = 0; $i < $#cartesion/2; $i++) {
my $j = ($cartesion/2) + $i;
print "$cartesion[$i] $cartesion[$j] \n";
}
From your question and your code, I suppose that you are a lovely new 'victim' to perl ~
To merge two arrays with same lengh, I suggeest using 'map' to simplify your code:
my #cartesian = map {$first_array[$_], $second_array[$_]} 0..$#first_array;
and to format print style , you can define a subroutine to meet your different requirements:
sub format_my_array{
my $array_ref = shift;
my $sep = shift;
print $array_ref->[$_],$sep,$array_ref->[$_+1],"\n" for grep {! ($_%2)} 0..$#$array_ref;
}
Now, you can try calling your subroutine:
format_my_array(\#cartesian, " => ");
or
format_my_array(\#cartesian, " , ");
Now, you get what you want~
You may have noticed that some intermediate concepts are used in this answer, don't doute , that's exactly what I'm trying to introduce you to ~
May you the great happiness in learning perl ~
The trick is to go with Perl's strengths instead of fighting against them:
#!/usr/bin/perl
use strict;
use warnings;
# For say()
use 5.010;
my #first_array = split ' ', <>;
my #second_array = split ' ', <>;
if (#first_array != #second_array) {
die "Arrays must be the same length\n";
}
my #cartesian = map { $first_array[$_], $second_array[$_] } 0 .. $#first_array;
for (0 .. $#cartesian / 2) {
say "$cartesian[$_*2] => $cartesian[$_*2+1]";
}
But, it gets much easier still if you use a hash instead of an array for your merged data.
#!/usr/bin/perl
use strict;
use warnings;
# For say()
use 5.010;
my #first_array = split ' ', <>;
my #second_array = split ' ', <>;
if (#first_array != #second_array) {
die "Arrays must be the same length\n";
}
my %cartesian;
#cartesian{#first_array} = #second_array;
for (sort keys %cartesian) {
say "$_ => $cartesian{$_}";
}

Perl: correctly print array of arrays (dereference)

Hey fellow perl monks,
I'm still wrapping my head around how to correctly dereference. (I read the similar posts prior to posting, but unfortunately am still a bit cloudy on the concept.)
I have the following array, which internally is composed of two arrays. (BTW, I am using strict and warning pragmas.)
use strict; use warnings;
my #a1; my #a2;
where:
#a1 = ( "1MB", "2MB", ... )
and..
#a2 = ( "/home", "/home/debug", ... )
Both #a1 & #a2 are arrays which contain 51 rows. So, I populate these into my 2nd array.
my #b;
push (#b, [ #a1, #a2 ]);
However, when I try to print the results of #b:
sub newl { print "\n"; print "\n"; }
my $an1; my #an1;
$an1 = $#a1;
#an1 = ( 0, 1..$an1 );
for my $i (#an1) { print #b[$i]; &newl; }
I see references to the arrays:
ARRAY(0x81c0a10)
.
ARRAY(0x81c0a50)
.
.
.
How do I properly print this array? I know I need to dereference the array, I'm not sure how to go about doing this. I tried populating my array as such:
push (#b, [ \#a1, \#a2 ]);
Which produces the same results. I also tried:
for my $i (#an1) { print #{$b[$i]}; &newl; }
Which unfortunately errors due to having 0 as an array reference?
Can't use string ("0") as an ARRAY ref while "strict refs" in use at p_disk_ex6.pl line 42.
Any suggestions are greatly appreciated!
A short example program, which might help you:
use strict;
use warnings;
my #a1 = qw(1MB 2MB 10MB 7MB);
my #a2 = qw(/foo /bar /flub /blub);
my #b = (\#a1, \#a2);
# equivalent long version:
# my #b = ();
# $b[0] = \#a1;
# $b[1] = \#a2;
for (my $i = 0; $i <= $#a2; $i++) {
print "a1[$i]: $b[0][$i]\n";
print "a2[$i]: $b[1][$i]\n";
print "\n";
}
In your example you were pushin an anoymous arrayref [] into #b. Therefore $b[0] contained the arrayref.
my #b;
push (#b, [ \#a1, \#a2 ]);
# this corresponds to:
# $b[0][0] = \#a1;
# $b[0][1] = \#a2;
In the example where you wrote [#a1, #a2] you were creating an array_ref which contained the joined arrays #a1 and #a2 (first all elements of #a1, and then all elements of #a2):
my #b;
push(#b , [#a1, #a2]);
# $b[0] = ['1MB' , '2MB', '10Mb', '7MB', '/foo', '/bar', '/flub', '/blub']
Even Simply this also works
use strict;
use warnings;
my #a1 = qw(1MB 2MB 10MB 7MB);
my #a2 = qw(/foo /bar /flub /blub);
my #b = (#a1, #a2);
print "#b";
If you want a general solution that doesn't assume how many elements there are in each of the sub-arrays, and which also allows arbitrary levels of nesting, you're better off using packages that someone else has already written for displaying recursive data structures. A particularly prevalent one is YAML, which you can install if you don't already have it by running cpan:
$ cpan
Terminal does not support AddHistory.
cpan shell -- CPAN exploration and modules installation (v1.9800)
Enter 'h' for help.
cpan[1]> install YAML
Then you can display arbitrary data structures easily. To demonstrate with a simple example:
use YAML;
my #a1 = qw(1MB 2MB 10MB 7MB);
my #a2 = qw(/foo /bar /flub /blub);
my #b = (\#a1, \#a2);
print Dump(\#b);
results in the output
---
-
- 1MB
- 2MB
- 10MB
- 7MB
-
- /foo
- /bar
- /flub
- /blub
For a slightly more complicated example
my #b = (\#a1, \#a2,
{ a => 0, b => 1 } );
gives
---
-
- 1MB
- 2MB
- 10MB
- 7MB
-
- /foo
- /bar
- /flub
- /blub
- a: 0
b: 1
To read this, the three "-" characters in column 1 indicate an array with three elements.
The first two elements have four sub elements each (the lines with "-" in column 3). The
third outer element is a hash reference, since it is made up of "key: value" pairs.
A nice feature about YAML is that you can use it to dump any recursive data structure into a file, except those with subroutine references, and then read it back later using Load.
If you really have to roll your own display routine, that is certainly possible, but you'll have a much easier time if you write it recursively. You can check whether your argument is an array reference or a hash reference (or a scalar reference) by using ref:
my #a1 = qw(1MB 2MB 10MB 7MB);
my #a2 = qw(/foo /bar /flub /blub);
my #b = (\#a1, \#a2,
{ a => 0, b => 1 } );
print_recursive(\#b);
print "\n";
sub print_recursive {
my ($obj) = #_;
if (ref($obj) eq 'ARRAY') {
print "[ ";
for (my $i=0; $i < #$obj; $i++) {
print_recursive($obj->[$i]);
print ", " if $i < $#$obj;
}
print " ]";
}
elsif (ref($obj) eq 'HASH') {
print "{ ";
my #keys = sort keys %$obj;
for (my $i=0; $i < #keys; $i++) {
print "$keys[$i] => ";
print_recursive($obj->{$keys[$i]});
print ", " if $i < $#keys;
}
print " }";
}
else {
print $obj;
}
}
which produces the output
[ [ 1MB, 2MB, 10MB, 7MB ], [ /foo, /bar, /flub, /blub ], { a => 0, b => 1 } ]
I have not written my example code to worry about pretty-printing, nor does it
handle scalar, subroutine, or blessed object references, but it should give you the idea of how you can write a fairly general recursive data structure dumper.

In Perl, is there a built in way to compare two arrays for equality?

I have two arrays of strings that I would like to compare for equality:
my #array1 = ("part1", "part2", "part3", "part4");
my #array2 = ("part1", "PART2", "part3", "part4");
Is there a built-in way to compare arrays like there is for scalars?
I tried:
if (#array1 == #array2) {...}
but it just evaluated each array in scalar context, and so compared the length of each array.
I can roll my own function to do it, but it seems like such a low-level operation that there should be a built-in way to do it. Is there?
Edit: sadly, I don't have access to 5.10+ or optional components.
There is the new smart match operator:
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
my #x = (1, 2, 3);
my #y = qw(1 2 3);
say "[#x] and [#y] match" if #x ~~ #y;
Regarding Array::Compare:
Internally the comparator compares the two arrays by using join to turn both arrays into strings and comparing the strings using eq.
I guess that is a valid method, but so long as we are using string comparisons, I would much rather use something like:
#!/usr/bin/perl
use strict;
use warnings;
use List::AllUtils qw( each_arrayref );
my #x = qw(1 2 3);
my #y = (1, 2, 3);
print "[#x] and [#y] match\n" if elementwise_eq( \(#x, #y) );
sub elementwise_eq {
my ($xref, $yref) = #_;
return unless #$xref == #$yref;
my $it = each_arrayref($xref, $yref);
while ( my ($x, $y) = $it->() ) {
return unless $x eq $y;
}
return 1;
}
If the arrays you are comparing are large, joining them is going to do a lot of work and consume a lot of memory than just comparing each element one by one.
Update: Of course, one should test such statements. Simple benchmarks:
#!/usr/bin/perl
use strict;
use warnings;
use Array::Compare;
use Benchmark qw( cmpthese );
use List::AllUtils qw( each_arrayref );
my #x = 1 .. 1_000;
my #y = map { "$_" } 1 .. 1_000;
my $comp = Array::Compare->new;
cmpthese -5, {
iterator => sub { my $r = elementwise_eq(\(#x, #y)) },
array_comp => sub { my $r = $comp->compare(\(#x, #y)) },
};
This is the worst case scenario where elementwise_eq has to go through each and every element in both arrays 1_000 times and it shows:
Rate iterator array_comp
iterator 246/s -- -75%
array_comp 1002/s 308% --
On the other hand, the best case scenario is:
my #x = map { rand } 1 .. 1_000;
my #y = map { rand } 1 .. 1_000;
Rate array_comp iterator
array_comp 919/s -- -98%
iterator 52600/s 5622% --
iterator performance drops quite quickly, however:
my #x = 1 .. 20, map { rand } 1 .. 1_000;
my #y = 1 .. 20, map { rand } 1 .. 1_000;
Rate iterator array_comp
iterator 10014/s -- -23%
array_comp 13071/s 31% --
I did not look at memory utilization.
There's Test::More's is_deeply() function, which will also display exactly where the structures differ, or Test::Deep's eq_deeply(), which doesn't require a test harness (and just returns true or false).
Not built-in, but there is Array::Compare.
This is one of the operations that's left out of the Perl core for what I believe are didactic reasons -- that is, if you're trying to do it, there's probably something wrong. The most illustrative example of this, I think, is the absence of a core read_entire_file function; basically, providing that function in the core would lead people to think it's a good idea to do that, but instead, Perl is designed in a way that gently nudges you toward processing files line-at-a-time, which is generally far more efficient and otherwise a better idea, but novice programmers are rarely comfortable with it and they need some encouragement to get there.
The same applies here: there is probably a much better way to make the determination you're trying to accomplish by comparing two arrays. Not necessarily, but probably. So Perl is nudging you to think about other ways of accomplishing your goal.
Perl 5.10 gives you the smart match operator.
use 5.010;
if( #array1 ~~ #array2 )
{
say "The arrays are the same";
}
Otherwise, as you said, you'll have top roll your own.
So long as you are using perl 5.10 or newer, you can use the smart match operator.
if (#array1 ~~ #array2) {...}
Simpler solution is faster:
#!/usr/bin/perl
use strict;
use warnings;
use Array::Compare;
use Benchmark qw( cmpthese );
use List::AllUtils qw( each_arrayref );
my #x = 1 .. 1_000;
my #y = map { "$_" } 1 .. 1_000;
my $comp = Array::Compare->new;
cmpthese -2, {
iterator => sub { my $r = elementwise_eq(\(#x, #y)) },
my_comp => sub { my $r = my_comp(\(#x, #y)) },
array_comp => sub { my $r = $comp->compare(\(#x, #y)) },
};
#x = 1 .. 20, map { rand } 1 .. 1_000;
#y = 1 .. 20, map { rand } 1 .. 1_000;
cmpthese -2, {
iterator => sub { my $r = elementwise_eq(\(#x, #y)) },
my_comp => sub { my $r = my_comp(\(#x, #y)) },
array_comp => sub { my $r = $comp->compare(\(#x, #y)) },
};
sub elementwise_eq {
my ($xref, $yref) = #_;
return unless #$xref == #$yref;
my $it = each_arrayref($xref, $yref);
while ( my ($x, $y) = $it->() ) {
return unless $x eq $y;
}
return 1;
}
sub my_comp {
my ($xref, $yref) = #_;
return unless #$xref == #$yref;
my $i;
for my $e (#$xref) {
return unless $e eq $yref->[$i++];
}
return 1;
}
And result in perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-gnu-thread-multi:
Rate iterator array_comp my_comp
iterator 1544/s -- -67% -80%
array_comp 4697/s 204% -- -41%
my_comp 7914/s 413% 68% --
Rate iterator array_comp my_comp
iterator 63846/s -- -1% -75%
array_comp 64246/s 1% -- -75%
my_comp 252629/s 296% 293% --
This question has turned into a very useful resource. ++ for the benchmarks and discussion.
As others have pointed out smart match feature had issues and is being phased out in its current form. There are alternatives that are "less smart" (and so avoid the issues) and that are small, fairly fast and don't have too many non CORE dependencies.
Smart::Match
match::simple (and match::smart)
Scalar::In
You can find links to some pretty good discussions about the history of the future of ~~ by looking at a couple of blog posts by #brian d foy, and the p5p mail archive threads from 2011 and 2012 from #rjbs.
Comparing arrays can be simple and fun!
use v5.20;
use match::smart;
my #x = (1, 2, 3);
my #y = qw(4 5 6);
my #z = qw(4 5 6);
say \#x |M| \#y ? "[\#x] and [\#y] match": "no match";
say \#y |M| \#z ? "[\#y] and [\#z] match": "no match";
__END__
#y and #z match, #x and #y do not
... especially fun if the array is simple. But an array can be a complicated thing, and sometimes you want different kinds of information from the results of the comparison. For that, Array::Compare can make fine tuned comparison easier.
Data::Cmp is another recent option. The cmp_data() function operates similarly to the cmp operator (see perlop for cmp usage).
Example:
use 5.10;
use Data::Cmp qw/cmp_data/;
my #array1 = ("part1", "part2", "part3", "part4");
my #array2 = ("part1", "PART2", "part3", "part4");
my #array3 = ("part1", "PART2", "part3", "part4");
# sample usage
say "1 & 2 are different" if cmp_data(\#array1, \#array2) ;
sat "2 & 3 are the same" unless cmp_data(\#array2, \#array3) ;
It's also possible to compare hashes and more complicated nested data structures (within reason). For a single module with no non-core dependencies, Data::Cmp is pretty "smart" ;-) ... errm I mean "useful".
One could use grep function in scalar context (http://perldoc.perl.org/functions/grep.html#grep-BLOCK-LIST)
($#array1 == $#array2) && (0 == (grep { $array1[ $_ ] ne $array2[ $_ ] } 0..$#array1))
My core-only solution with List::Util::all:
use List::Util qw(all);
if (#array1 == #array2 && all { $array1[$_] eq $array2[$_] } 0..$#array1) {
print "matched\n";
}
As a subroutine:
# call me like string_array_equals([#array1], [#array2])
sub string_array_equals {
my ($array1, $array2) = #_;
#$array1 == #$array2 and
all { $array1->[$_] eq $array2->[$_] } 0..$#$array1;
}
If you want a custom comparison:
# call me like array_equals { $a eq $b } [#array1], [#array2]
sub array_equals(&$$) {
my ($compare, $array1, $array2) = #_;
#$array1 == #$array2 and
all {
local $a = $array1->[$_];
local $b = $array2->[$_];
$compare->($a, $b);
} 0..$#$array1;
}
At this point, all doesn't save much space and you could just do a for:
# call me like array_equals { $a eq $b } [#array1], [#array2]
sub array_equals(&$$) {
my ($compare, $array1, $array2) = #_;
#$array1 == #$array2 or return 0;
for (0..$#$array1) {
local $a = $array1->[$_];
local $b = $array2->[$_];
$compare->($a, $b) or return 0;
}
1;
}
Edit: List::Util::first works as a substitute on older perls (< v5.20).
use List::Util qw(first);
if (#array1 == #array2 && !defined first { $array1[$_] ne $array2[$_] } 0..$#array1) {
print "matched\n";
}
If casing is the only difference, you can simply use:
if (lc "#array1" eq lc "#array2") {...}
Whereas "#array1" returns the same as join ( " ", #array1 )
If order and duplicate values do not matter but only values equality (i.e. set comparison), you could use Set::Scalar.
It overloads common operators such as == or !=.
my #array1 = ("part1", "part2", "part3", "part4");
my #array2 = ("part1", "PART2", "part3", "part4");
if ( Set::Scalar->new(#array1) == Set::Scalar->new(#array2) ) {...}
Alternatively, there's also Algorithm::Diff and List::Compare.
For checking equality of two arrays try this.
In given code, if %eq_or_not has any value then both arrays are not equal otherwise they are equal.
my #array1 = ("part1", "part2", "part3", "part4");
my #array2 = ("part1", "PART2", "part3", "part4");
my %eq_or_not;
#eq_or_not{ #array1 } = undef;
delete #eq_or_not{ #array2 };
if (join(",",sort #a) eq join(",",sort #b))
if performance concern can be ignored, as mentioned several times in the threads here
If the only criteria is "are they equivalent or not?", and not the more complex question, "are they equivalent or not, and if they differ, how?" there are much quicker/uglier ways to do it. For example, smash the entirety of each array into two scalars and compare those.
For example
my #array1 = ("part1", "part2", "part3", "part4");
my #array2 = ("part1", "PART2", "part3", "part4");
my $smash1 = join("", #array1);
my $smash2 = join("", #array2);
if ($smash1 eq $smash2)
{
# equal
}
else
{
#unequal
}
Yes, I probably just made Larry Wall cry.

Resources