Access array elements of passed hash - arrays

I have got two hashes with a foldername as key and its respective files as an array. But I cannot acces the array elements of the passed hash in the getMissingFiles sub (see comments for error message).
Hashes to be compared:
# contains all files
%folderWithFiles1 =
(
foldername1 => [ qw(a b c d e f g h i j k l m n o p) ],
foldername2 => [ qw(a b c d e f g h i j k l m n ) ],
)
%folderWithFiles2 =
(
foldername1 => [ qw(a b d e h i l m n p) ],
foldername2 => [ qw(a d f g h j m ) ],
)
Compare subroutine (get missing files from hash2 that are not in hash1):
sub getMissingFiles()
{
my ($hash1, $hash2) = shift; # is it working?
#my $hash1 = shift; # or do you have to do it separately?
#my $hash2 = shift;
my $flag = 0;
my #missingFiles;
foreach my $folder (sort(keys %{$hash1}))# (sort(keys %hash1)) not possible?
{
for (my $i = 0; $i < #$hash1{$folder}; $i++)
{
foreach my $folder2 (sort(keys %{$hash2}))
{
foreach my $file2 (#$hash2{$folder2})
{
if ($hash1{$folder}[$i] == $file2) # Error: Global symbol "%hash1" requires explicit package name
{
$flag = 1;
last;
}
}
if (0 == $flag)
{
push(#missingFiles, $hash1{$folder}[$i]); # Error: Global symbol "%hash1" requires explicit package name
}
else
{
$flag = 0;
}
}
}
}
return #missingFiles;
}
Calling function:
#missingFiles = &getMissingFiles(\%hash1, \%hash2);
Is: "my ($hash1, $hash2) = shift;" correct or do you have to do it separately?
Why is "foreach my $folder (sort(keys %hash1))" not possible?
Is there a more efficient way than using 4 loops?

In getMissingFiles(), just as you dereference $hash1 and $hash2 to get the keys, you also need to dereference them to get the values:
#folder_files = #{ $hash1->{$folder1} };
or alternatively,
#folder_files = #{ $$hash1{$folder} };
And you can do this to get individual files:
$file = $hash1->{$folder}[$i];

That call syntax isn't quite right - you want
my ($hash1, $hash2) = #_;
or perhaps
my $hash1 = shift;
my $hash2 = shift;
The shift function will only give you the first value, so you need to call twice as you suggest, or access the parameter list #_ if you want to pluck more than value in one go.

Related

Is there a built in Perl Function for finding duplicate subarrays(exact order) in an array?

Lets say the array is (1,2,3,4,5,6,7,8,9),
Another subarray is (2,3,4)
Is there a function to check if the subarray pattern(full exact order) exists within array?
In this case, it would return any indicator(index) that shows it exists.
Also would need to work for duplicates if there are multiple subarrays existing in the array like (4,2,3,4,2,3,4).
If it happens to match multiple times for example:
Array = (2,3,2,3,2,2,3,2)
Sub Array = (2,3,2)
Would just return starting index of matches in order: 0,2,5
Or if it removes, would result in (3,2)
Edit: Elements don't have to be num
There's no built-in method, but it's easy to write:
#!/usr/bin/env perl
use warnings;
use strict;
use feature qw/say/;
# Takes two arrayrefs of numbers.
#
# Returns the first index in the first one where the second list appears, or
# -1 if not found.
sub find_sublist(++) {
my ($haystack, $needle) = #_;
my $nlen = #$needle;
my $hlen = #$haystack;
return -1 if $hlen == 0 || $nlen == 0;
HAYSTACK_POS:
for (my $n = 0; $n <= $hlen - $nlen; $n++) {
for (my $m = 0; $m < $nlen; $m++) {
if ($haystack->[$n + $m] != $needle->[$m]) {
next HAYSTACK_POS;
}
}
return $n;
}
return -1;
}
# Takes two arrayrefs of numbers.
#
# Returns a list of the starting indexes of the first list
# of every run of the second list. Returns an empty list if
# there are no matches.
sub find_sublists(++) {
my ($haystack, $needle) = #_;
my $nlen = #$needle;
my $hlen = #$haystack;
my #positions;
return #positions if $hlen == 0 || $nlen == 0;
HAYSTACK_POS:
for (my $n = 0; $n <= $hlen - $nlen; $n++) {
for (my $m = 0; $m < $nlen; $m++) {
if ($haystack->[$n + $m] != $needle->[$m]) {
next HAYSTACK_POS;
}
}
push #positions, $n;
}
return #positions;
}
# Takes two arrayrefs of numbers.
#
# Returns a new list that is the first one with every non-overlapping run of
# the second second list removed.
sub remove_sublists(++) {
my #haystack = #{$_[0]};
my $needle = $_[1];
while ((my $pos = find_sublist #haystack, $needle) != -1) {
splice #haystack, $pos, #$needle;
}
return #haystack;
}
my #list1 = (1,2,3,4,5,6,7,8,9);
my #list2 = (4,2,3,4,2,3,4);
my #list3 = (2,3,2,3,2,2,3,2);
say find_sublist(#list1, [2, 3, 4]); # Returns 1
say find_sublist([2,9,3,4], [2,3,4]); # Returns -1
my #positions = find_sublists(#list2, [2,3,4]); # 1,4
say join(",", #positions);
#positions = find_sublists(#list3, [2,3,2]); # 0,2,5
say join(",", #positions);
say join(",", remove_sublists(#list1, [2,3,4])); # 1,5,6,7,8,9
say join(",", remove_sublists(#list3, [2,3,2])); # 3,2
If the inputs are numbers representable by your perl's integers (as shown), you can use
# Indexes
my $pattern = pack "W*", #pattern;
my $array = pack "W*", #array;
my #indexes;
push #indexes, $-[0] while $array =~ /\Q$pattern/g;
# Removal
my $pattern = pack "W*", #pattern;
my $array = pack "W*", #array;
$array =~ s/\Q$pattern//g;
#array = unpack "W*", $array;
How it handles overlaps:
/---\ /---\ Removed
2,3,2 from 2,3,2,3,2,2,3,2
\---/ Not removed
Note that this also works if you can map the inputs to numbers.
my ( %map_f, #map_r );
for ( #array, #pattern ) {
if ( !exists{ $map{ $_ } } ) {
$map_f{ $_ } = #map_r;
push #map_r, $_;
}
}
my $pattern = pack "W*", #map_f{ #pattern };
my $array = pack "W*", #map_f{ #array };
$array =~ s/\Q$pattern//g;
#array = #map_r[ unpack "W*", $array ];
It's not the best algorithm, but it should be very fast by moving the work from Perl to the regex engine.

Perl remove same value back to back with splice

I am trying to remove, the same values twice in an array, it is located back to back, this is my code
#{$tmp_h->{'a'}} = qw/A B B C/;
print Dumper ($tmp_h);
my $j = 0;
foreach my $cur (#{$tmp_h->{'a'}}) {
if ($cur eq 'B') {
splice(#{$tmp_h->{'a'}}, $j, 1);
}
$j++;
}
print Dumper $tmp_h;
However what got is,
$VAR1 = {
'a' => [
'A',
'B',
'B',
'C'
]
};
$VAR1 = {
'a' => [
'A',
'B',
'C'
]
};
I am expecting both 'B' to be removed in this case, what could possibly went wrong?
That code is removing from an array while iterating over it, pulling the carpet from underneath itself; is that necessary?
Instead, iterate and put elements on another array if the adjacent ones aren't equal. So iterate over the index, looking up an element and the next (or previous) one.†
I presume that B is just an example while in fact it can be any value, equal to its adjacent one.
It's interesting that regex can help too, with its simple way to find repeated patterns using backreferences
my #ary = qw(a b b c d d e f f f g);
my $str_ary = join '', #ary;
$str_ary =~ s/(.)\g{-1}//g;
my #new_ary = split //, $str_ary;
say "#new_ary"; #--> a c e f g
This removes pairs of adjacent values, so if there is an odd number of equal adjacent values it leaves the odd one (f above). As a curiosity note that it can be written in one statement
my #new_ary = split //, join('', #ary) =~ s/(.)\g{-1}//gr;
The join-ed array, forming a string, is bound to the substitution operator where /r modifier is crucial, for allowing this and returning the changed string which is then split back into a list.
To change an array in place have it assign to itself.‡
But single-letter elements are only an example, likely. With multiple characters in elements we can't join them by empty string because we wouldn't know how to split that back into an array; we have to join by something that can't be in any one element, clearly a tricky proposition. A reasonable take is a line-feed, as one can expect to know whether elements are/not multiline strings
my #ary = qw(aa no no way bah bah bah go);
my $str_ary = join "\n", #ary ;
$str_ary =~ s/([^\n]+)\n\g{-1}//g;
my #new = grep { $_ } split /\n/, $str_ary;
say "#new"; #--> aa way bah go
This would still have edge cases with interesting elements, like spaces and empty strings (but then any approach would).
† For example
use warnings;
use strict;
use feature 'say';
my #ary = qw(a b b c d d e f f f g);
my #new_ary;
my $i = 0;
while (++$i <= $#ary) {
if ($ary[$i] ne $ary[$i-1]) {
push #new_ary, $ary[$i-1]
}
else { ++$i }
}
push #new_ary, $ary[-1] if $ary[-1] ne $ary[-2];
say "#new_ary"; #--> a c e f g
‡ Done for the arrayref in the question
#{ $hr->{a} } = qw/A B B C/;
#{$hr->{a}} = split //, join('', #{$hr->{a}}) =~ s/(.)\g{-1}//gr;
say "#{$hr->{a}}"; #--> A C
The Perl documentation tells you in perlsyn under Foreach Loops:
If any part of LIST is an array, foreach will get very confused if you
add or remove elements within the loop body, for example with splice. So
don't do that.
You can iterate over the indices instead, but don't forget to not increment the index when removing a value:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my $tmp_h = {a => [qw[ A B B C ]]};
print Dumper($tmp_h);
my $j = 0;
while ($j <= $#{ $tmp_h->{a} }) {
my $cur = $tmp_h->{a}[$j];
if ($cur eq 'B') {
splice #{ $tmp_h->{a} }, $j, 1;
} else {
++$j;
}
}
print Dumper($tmp_h);
Or start from the right so you don't have to worry:
my $j = $#{ $tmp_h->{a} };
while ($j-- >= 0) {
my $cur = $tmp_h->{a}[$j];
splice #{ $tmp_h->{a} }, $j, 1 if $cur eq 'B';
}
But the most straight forward way is to use grep:
#{ $tmp_h->{a} } = grep $_ ne 'B', #{ $tmp_h->{a} };

searching two array string for equal words

I am a beginner in Perl. I have two string arrays array1 and array2. I want to check the each and every element in 2nd array. if there is i want to give a relative value one to that particular element in the 2nd array. the relative values are store in an array.I try it out but it wont work and git gives a warning like" Use of uninitialized value in string eq at pjt.pl line 52, line 3".
while($i <= (scalar #resultarray-1))
{
while ($j <= (scalar #inputsymbl-1))
{
if ($resultarray[$i] eq $inputsymbl[$j])
{
$rel[$j]=1;
$i=$i+1;
$j=0;
}
else
{
$j=$j+1;
}
}
if($j==(scalar #inputsymbl))
{
$i=$i+1;
$j=0;
}
}
try this:
my $i = 0;
my $j = 0;
## walk each array element
foreach(#resultarray) {
my $result = $_;
foreach(#inputsymbl) {
my $symbl = $_;
if ($result eq $symbl) {
$rel[$j] = 1;
$i++;
} else {
$j++;
}
}
if ($j == (scalar #inputsymbl - 1)) {
$i++;
$j = 0;
}
}
provide more informations if you need detailed help.
From your question and code, it appears that you want to flag the indexes, by using a third array, of the two array's elements that are equal. By doing this, however, you're creating a sparse array. Also, if the two arrays don't have the same number of elements, a "Use of uninitialized value in string eq..." warning will eventually occur. Given these issues, consider using the smaller index of the two arrays (done using the ternary operator below) and pushing the indexes of the equal elements onto the third array:
use strict;
use warnings;
use Data::Dumper;
my #results;
my #arr1 = qw/A B C D E F G H I J/;
my #arr2 = qw/A D C H E K L H N J P Q R S T/;
# Equal: ^ ^ ^ ^ ^
# Index: 0 2 4 7 9
for my $i ( 0 .. ( $#arr1 <= $#arr2 ? $#arr1 : $#arr2 ) ) {
push #results, $i if $arr1[$i] eq $arr2[$i];
}
print Dumper \#results;
Output:
$VAR1 = [
0,
2,
4,
7,
9
];
Hope this helps!

Comparison of two arrays perl

I am new to Perl world, and I have a script that compares two arrays.
I use List::MoreUtils (each_arrayref) to do the comparison.
I have two questions:
1) Is there a way to compare two chunks of arrays (like natatime but for two arrayrefs) instead of comparing single element at a time as in each_arrayref?
The elements should be from the same index from each array.
The data structure is something like this:
{
atr => [qw/ a b c d /],
ats => [qw/ a b c d /],
att => [qw/ a b c d /],
}
This is what I have got so far.
my #lists = keys %{$hash};
for (my $i = 0; $i <= #lists; $i++) {
my $list_one = $lists[$i];
my $one = $hash->{$list_one};
for (my $j = 0 ; $j <= #lists ; $j++) {
my $list_two = $lists[$j];
my $two = $hash->{$list_two};
my ($overlapping, $mismatch, $identity);
my $match = 0;
my $non_match = 0;
my $count_ac_calls = 0;
my $each_array = each_arrayref($one, $two);
while (my ($call_one, $call_two) = $each_array->()) {
if ((defined $call_one) && (defined $call_two)) {
if ($call_one eq $call_two) {
$match++;
}
if ($call_one ne $call_two) {
$non_match++;
}
}
} #end of while loop $each_array->()
print "$list_one,$list_two,$match,$non_match";
} #end of for j loop
} #end of for i loop
I would like to compare atr->ats, atr->att, ats->att. But with my current code, I get repetitions of comparison like ats->atr att->atr,att->ats.
2) How can I avoid those?
I'm not clear what your first question means. Do you want an iterator that, say, returns (('a','b','c'),('a','b','c'))
instead of
('a','a')? If so then there isn't one available in a library, but it wouldn't be hard to write your own.
As for the second, the usual way to avoid items being compared with themselves is to change the inner loop to start after the current value of the first. Like so
for my $i (0..$#lists) {
for my $j ($i+1..$#lists) {
}
}
This works because A eq B is generally the same as B eq A, so there is no point in comparing an entry with one earlier in the list because the inverse comparison has already been made.
Note that it is much better Perl to write for loops this way than the messy C-style syntax. You also have a couple of bugs in
for (my $i = 0 ; $i <= #lists ; $i++) { ... }
because the maximum index of #lists is one less than the scalar value of #lists - usually coded as $#lists. The same problem exists in your loop for $j.
Update
Here is a refactoring of your program, written to include the ideas I have described and to be more Perlish. I hope it is useful to you.
use strict;
use warnings;
use List::MoreUtils 'each_arrayref';
my $hash = {
atr => [qw/ a b c d /],
ats => [qw/ a b c d /],
att => [qw/ a b c d /],
};
my #keys = keys %{$hash};
for my $i (0 .. $#keys) {
my $key1 = $keys[$i];
my $list1 = $hash->{$key1};
for my $j ($i+1 .. $#keys) {
my $key2 = $keys[$j];
my $list2 = $hash->{$key2};
my ($match, $non_match) = (0, 0);
my $iter = each_arrayref($list1, $list2);
while (my ($call1, $call2) = $iter->()) {
if (defined $call1 and defined $call2) {
($call1 eq $call2 ? $match : $non_match)++;
}
}
print "$key1, $key2, $match, $non_match\n";
}
}
One option is to use Array::Compare to return the number of different array elements. Also, Math::Combinatorics is used to obtain only unique comparisons.
use strict;
use warnings;
use Array::Compare;
use Math::Combinatorics;
my %hash = (
'atr' => [ 'a', 'b', 'c', 'd' ],
'ats' => [ 'a', 'b', 'c', 'd' ],
'att' => [ 'a', 'c', 'c', 'd' ],
);
my $comp = Array::Compare->new( DefFull => 1 );
my $combinat = Math::Combinatorics->new(
count => 2,
data => [ keys %hash ],
);
while ( my ($key1, $key2) = $combinat->next_combination ) {
my $diff = $comp->compare( \#{ $hash{$key1} }, \#{ $hash{$key2} } );
print "$key1,$key2," . ( #{ $hash{$key1} } - $diff ) . ",$diff\n";
}
Output:
ats,att,3,1
ats,atr,4,0
att,atr,3,1
You're not really taking advantage of the features Perl has to offer. Rather than use an error prone C-style loop, just use for my $var (LIST). You can also skip redundant list checking by skipping the self-checks, too. I've taken your script, made some alterations, and I'm sure you'll find it a bit easier to read.
use v5.16;
use warnings;
use List::MoreUtils qw{each_arrayref};
my $hash = {
'atr' => [
'a',
'b',
'c',
'd'
],
'ats'=>[
'a',
'b',
'c',
'd'
],
'att' => [
'a',
'c',
'c',
'd'
],
};
for my $list_one (keys $hash) {
my $one = $hash->{$list_one};
for my $list_two (keys $hash) {
next if $list_one ~~ $list_two;
my $two = $hash->{$list_two};
my ($match, $non_match);
$match = $non_match = 0;
my $each_array = each_arrayref($one, $two);
while (my ($call_one, $call_two) = $each_array->()) {
if($call_one && $call_two) {
if($call_one eq $call_two) {
$match++;
}
else {
$non_match++;
}
}
}
print "$list_one,$list_two,$match,$non_match\n";
}
}
You'll want to evaluate one at a time anyway so that you can add in some extra bits like the index location. (Yes, you could use the C-style loop, but that'd be a bit more difficult to read.)

How can I pass an array to a function in Perl?

Question 1:
I want to pass an array to a function. But the passed argument is changed in the function. Is it called by value?
Question 2:
#my ($name, $num, #array)= #_; <=1 )
my $name = shift; <=2 )
my $num = shift;
my #array = shift;
Case 1 and 2 has different output. Why did it occur?
#!/usr/bin/perl
use strict;
my #test1;
push #test1, ['a', 1];
push #test1, ['b', 1];
push #test1, ['c', 1];
push #test1, ['d', 1];
push #test1, ['e', 1];
for (my $i=0; $i< scalar(#test1); $i++) {
print "out1: $test1[$i][0] $test1[$i][1]\n";
}
test_func("test_func", 10, #test1);
sub test_func {
#my ($name, $num, #array)= #_; <=1)
my $name = shift; <=2)
my $num = shift;
my #array = shift;
print "$name\n";
print "$num\n";
for (my $i=0; $i< scalar(#test1); $i++) {
print "$array[$i][0] $array[$i][1]\n";
}
for (my $i=0; $i< scalar(#test1); $i++) {
if ($array[$i][0] eq 'a') {
$array[$i][0] = 'z';
}
}
for (my $i=0; $i< scalar(#test1); $i++) {
print "change: $array[$i][0] $array[$i][1]\n";
}
}
for (my $i=0; $i< scalar(#test1); $i++) {
print "out2: $test1[$i][0] $test1[$i][1]\n";
}
#
Below is the test output.
out1: a 1
out1: b 1
out1: c 1
out1: d 1
out1: e 1
test_func
10
a 1
b 1
c 1
d 1
e 1
change: z 1
change: b 1
change: c 1
change: d 1
change: e 1
out2: z 1 <= Why did it change?
out2: b 1
out2: c 1
out2: d 1
out2: e 1
I want to pass an array to a function [...] has different output. Why did it occur?
You cannot pass an array to a function sub. Subs can only take a list of scalars as arguments.
test_func("test_func", 10, #test1);
is the same as
test_func("test_func", 10, $test1[0], $test1[1], $test1[2], $test1[3], $test1[4]);
You are creating a new array in test_func when you do
my ($name, $num, #array) = #_;
shift returns the first element of #_, which is necessarily a scalar. #_ is an array, and elements of arrays are scalars. The equivalent would be
my $name = shift(#_);
my $num = shift(#_);
my #array = splice(#_);
To pass an array to a sub, one would normally pass a reference to it.
test_func("test_func", 10, \#test1);
my ($name, $num, $array) = #_;
my $name = shift;
my $num = shift;
my $array = shift;
say "#$array";
But the passed argument is changed in the function. Is it called by value?
Perl never passes by value. It always passes by reference. If you change any element of #_, it will change the corresponding argument in the caller.
$ perl -E'sub f { $_[0] = "def"; } my $x = "abc"; f($x); say $x;'
def
But that's not the issue. You don't change any elements of #_. What you are doing is changing the single array referenced by both $test[0] and $array[0].
This is what you are doing:
my $ref1 = [ 'a', 1 ]; # aka $test1[0]
my $ref2 = $ref1; # aka $array[0]
$ref2->[0] = 'z'; # Changes the single array (not $ref1 or $ref2).
It's short for
my #anon = ( 'a', 1 );
my $ref1 = \#anon; # aka $test1[0]
my $ref2 = $ref1; # aka $array[0]
$ref2->[0] = 'z'; # Changes #anon (not $ref1 or $ref2).
Storable's dclone can be used to make a "deep copy" of an array.
my $ref1 = [ 'a', 1 ];
my $ref2 = dclone($ref1); # clones the reference, the array, 'a' and 1.
$ref1->[0] = 'y'; # Changes the original array
$ref2->[0] = 'z'; # Changes the new array

Resources