Perl - Building array of arrays leads to overflow - arrays

I am trying to analyze a hashmap, for duplicate values and getting their keys in arrays. These arrays will be in an array of arrays. I am a newbie, by the way. However it never stops running when I start it. Where am I wrong?
while (($k,$v)=each %hashmap){
$hasduplicate = 0;
delete $hashmap{$k};
#dups = ();
while (($k1,$v1) = each %hashmap){
if ($v1 eq $v) {
$hasduplicate = 1;
push #dups, $k1;
delete $hashmap{$k1};
}
}
if ($hasduplicate){
push (#dups, $k);
push #dupsarray, [#dups];}
}

Each hash has just one iterator aligned to itself in Perl (see each). Therefore, running each for the same hash in a loop that calls each is not doing what you think.
If you want to see what's going on, try adding the following line at the start of the outer loop:
warn $k;
You are missing several dollar signs before variable names. For example, you probably want to delete $hashmap{$k} instead of $hashmap{k}, which is equivalent to $hashmap{'k'}.
To output an array of arrays, you have to dereference the inner arrays:
print map "#$_\n", #dupsarray;
BTW, I would use a hash of arrays to solve your task. Here's how:
my %dups;
while (my ($k, $v) = each %hashmap) {
push #{ $dups{$v} }, $k;
}
for my $k (grep #{ $dups{$_} } > 1, keys %dups) {
print "$k: #{ $dups{$k} }\n";
}

The problem is that there can be only one each sequence per hash, as there is only a single index to keep track of the next key/value pair.
In addition, you are using k and k1 in a few places where you mean $k and $k1. You must always use strict and use warnings at the top of every Perl program. This would have alerted you to the problem.
You can get around this problem by using for my $k1 (keys %hashmap) { ... } for the inside loop. This will create a separate list of keys to assign to $k1 in turn so that there is no multiple use of the iterator.
This modification of your code does what I think you want.
use strict;
use warnings;
my %hashmap = (
a => 'a',
b => 'b',
c => 'a',
d => 'c',
);
my #dupsarray;
while (my ($k, $v) = each %hashmap) {
my $hasduplicate = 0;
delete $hashmap{$k};
my #dups;
for my $k1 (keys %hashmap) {
my $v1 = $hashmap{$k1};
if ($v1 eq $v) {
$hasduplicate = 1;
push #dups, $k1;
delete $hashmap{$k1};
}
}
if ($hasduplicate) {
push(#dups, $k);
push #dupsarray, [#dups];
}
}
use Data::Dump;
dd \#dupsarray;
output
[["a", "c"]]
A much simpler method is to create an inverted hash where the keys and values of the original hash are swapped. Then just pick out the values of the inverted hash that have more than one element. This program demonstrates
use strict;
use warnings;
my %hashmap = (
a => 'a',
b => 'b',
c => 'a',
d => 'c',
);
my #dupsarray = do {
my %inverted;
while (my ($k, $v) = each %hashmap) {
push #{ $inverted{$v} }, $k;
}
grep { #$_ > 1 } values %inverted;
};
use Data::Dump;
dd \#dupsarray;
output
[["c", "a"]]

Related

Match array based on existing array

I have two arrays,
my #test = ('a','b','c','d','e',f,'g','h');
my #test2 = ('h','b','d');
I'm attempting to loop through the array #test, and match elements against those in #test2, deleting those elements that do not exist.
I have the following code:
foreach my $header (#test) {
if( exists $test2[$header]){
# do nothing
}
else {
delete $test[$header];
}
}
So, I want array #test to look like this (ignore the fact this could be sorted alphabetically):
my #test = ('b','d','h');
However currently my array remains the same after the foreach loop, can anyone suggest why?
You're misunderstanding what 'header' is set to. It's set to (an alias of) the value in the array.
So
foreach my $header (#test) {
print $header,"\n";
}
Will give you a, b, c etc.
However, then you're trying to access $test2['a'] which isn't valid, because it should be numeric.
So actually, a good case study in why you should use strict and warnings because this would have told you the problem:
Argument "a" isn't numeric in array or hash lookup at
Argument "b" isn't numeric in array or hash lookup at
etc.
So it's not actually doing anything.
You shouldn't use delete in this way either though, because you're deleting from a list whilst iterating it. That's not good, even without the fact that it's nonsense to use a letter as an array index.
You could do it like this instead though:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my #test = ('a','b','c','d','e','f','g','h');
my #test2 = ('h','b','d');
my %is_in_test2 = map { $_ => 1 } #test2;
#test = grep { $is_in_test2{$_} } #test;
print Dumper \#test;
If you do want to iterate by index, you can do it like this:
for my $index ( 0..$#test ) {
print "$index => $test[$index]\n";
}
But I would still suggest that deleting whilst iterating isn't a great plan, because modifying a thing you're iterating it ( and aiming to resize the array) is a good way to end up with strange bugs.
So whilst you could:
for my $index ( 0..$#test ) {
print "$index => $test[$index]\n";
delete $test[$index] if not $is_in_test2{$test[$index]};
}
print Dumper \#test;
What you'll end up with is:
$VAR1 = [
undef,
'b',
undef,
'd',
undef,
undef,
undef,
'h'
];
For the more general case there is a a FAQ: How do I computer the difference/intersection of two arrays
use strict;
use warnings;
my #test = ('a','b','c','d','e','f','g','h');
my #test2 = ('h','b','d');
foreach my $header (#test) {
foreach my $header2 (#test2) {
if( $header eq $header2) {
print "$header,"; #prints only existing values
}
else {
#doNothing
}
}
}
Note: This method is not feasible when you are dealing an array with huge elements.

Perl: Load file into hash

I'm struggling to understand logic behind hashes in Perl. Task is to load file in to hash and assign values to keys which are created using this file.
File contains alphabet with each letter on its own line:
a
b
c
d
e
and etc,.
When using array instead of hash, logic is simple: load file into array and then print each element with corresponding number using some counter ($counter++).
But now my question is, how can I read file into my hash, assign automatically generated values and sort it in that way where output is printed like this:
a:1
b:2
c:3
I've tried to first create array and then link it to hash using
%hash = #array
but it makes my hash non-sortable.
There are a number of ways to approach this. The most direct would be to load the data into the hash as you read through the file.
my %hash;
while(<>)
{
chomp;
$hash{$_} = $.; #Use the line number as your autogenerated counter.
}
You can also perform simliar logic if you already have a populated array.
for (0..$#array)
{
$hash{$array[$_]} = $_;
}
Although, if you are in that situation, map is the perlier way of doing things.
%hash = map { $array[$_] => $_ } #array;
Think of a hash as a set of pairs (key, value), where the keys must be unique. You want to read the file one line at a time, and add a pair to the hash:
$record = <$file_handle>;
$hash{$record} = $counter++;
Of course, you could read the entire file into an array at once and then assign to your hash. But the solution is not:
#records = <$file_handle>;
%hash = #records;
... as you found out. If you think in terms of (key, value) pairs, you will see that the above is equivalent to:
$hash{a} = 'b';
$hash{c} = 'd';
$hash{e} = 'f';
...
and so on. You still are going to need a loop, either an explicit one like this:
foreach my $rec (#records)
{
$hash{$rec} = $counter++;
}
or an implicit one like one of these:
%hash = map {$_ => $counter++} #records;
# or:
$hash{$_} = $counter++ for #records;
This code should generate the proper output, where my-text-file is the path to your data file:
my %hash;
my $counter = 0;
open(FILE, "my-text-file");
while (<FILE>) {
chomp;
$counter++;
$hash{$_} = $counter;
}
# Now to sort
foreach $key (sort(keys(%hash))) {
print $key . ":" . $hash{$key} . "\n";
}
I assume you want to sort the hash aplhabetically. keys(%hash) and values(%hash) return the keys and values of %hash as an array, respectively. Run the program on this file:
f
a
b
d
e
c
And we get:
a:2
b:3
c:6
d:4
e:5
f:1
I hope this helps you.

Auto increment numeric key values in a perl hash?

I have a perl script in which I am reading files from a given directory, and then placing those files into an array. I then want to be able to move those array elements into a perl hash, with the array elements being the hash value, and automatically assigning numeric keys to each hash value.
Here's the code:
# Open the current users directory and get all the builds. If you can open the dir
# then die.
opendir(D, "$userBuildLocation") || die "Can't opedir $userBuildLocation: $!\n";
# Put build files into an array.
my #builds = readdir(D);
closedir(D);
print join("\n", #builds, "\n");
This print out:
test.dlp
test1.dlp
I want to take those value and insert them into a hash that looks just like this:
my %hash (
1 => test.dlp
2 => test1.dlp
);
I want the numbered keys to be auto incrementing based on how many files I may find in a given directory.
I'm just not sure how to get the auto-incrementing keys to be set to unique numeric values for each item in the hash.
I am not sure to understand the need, but this should do
my $i = 0;
my %hash = map { ++$i => $_ } #builds;
another way to do it
my $i = 0;
for( #builds ) {
$hash{++$i} = $_;
}
The most straightforward and boring way:
my %hash;
for (my $i=0; $i<#builds; ++$i) {
$hash{$i+1} = $builds[$i];
}
or if you prefer:
foreach my $i (0 .. $#builds) {
$hash{$i+1} = $builds[$i];
}
I like this approach:
#hash{1..#builds} = #builds;
Another:
my %hash = map { $_+1, $builds[$_] } 0..$#builds;
or:
my %hash = map { $_, $builds[$_-1] } 1..#builds;

Finding common elements in arrays

I have a hash whose values are arrays. I need to find the common elements of those arrays,
ie. the elements that are present in all the arrays. So I extracted the values of the hash into
a multidimensional array whose each row corresponds to an array in the hash. Then I took the first row
of this matrix into another array (#arr1) and iterated through it to find if there was any element
in arr1 that was also in the rest of the rows of the matrix. If such an element is found, it is
pushed onto another array that contains the final list of all the elements. The code is as follows
(I hope it is clear enough):
sub construct_arr(my %records) {
my $len = keys %records;
my #matrix;
my $i = 0;
# Extract the values of the hash into a matrix
foreach my $key (keys %records) {
$matrix[$i] = $records{$key};
$i++;
}
my #arr1 = $matrix[0];
my #final;
# Iterate through each element of arr1
for my $j (0..$#{$arr1[0]}) {
my $count = 1;
# Iterate through each row of the matrix, starting from the second
for ( my $i = 1; $i < $len ; $i++ ) {
my $flag = 0;
# Iterate through each element of the row
for my $k (0..$#{$matrix[$i]}) {
if ($arr1[0][$j] eq $matrix[$i][$k]) {
$flag = 1;
$count++;
}
}
# On finding the first instance of the element in a row, go to the next row
if (!$flag == 1) {
last;
}
}
# If element is in all the rows, push it on to the final array
if ($count == $len) {
push(#final, $arr1[0][$j]);
}
}
return #final;
}
I know that the above works, but I would like to know if there is any other (perlish) way to do this.
I am starting to learn perl and I am very interested in knowing things that could make my work easier
in perl as compared to other languages. If my code is the best that can be done, please let me know that
too. Any guidance would be appreciated. Thanks!
Take a look at Chris Charley's link for calculating the intersection of arrays.
Hashes are the clear way to go for problems like this. Together with map and grep a solution can be reduced to just a few lines.
This program uses sundar's data for want of anything better, and seems to do what you need.
use strict;
use warnings;
my %records = (
a => [ qw/ A B C / ],
b => [ qw/ C D E A / ],
c => [ qw/ A C E / ],
);
print "$_\n" for construct_arr(\%records);
sub construct_arr {
my $records = shift;
my %seen;
$seen{$_}++ for map #$_, values %$records;
grep $seen{$_} == keys %$records, keys %seen;
}
output
A
C
Edit
I thought it may help to see a more Perlish, tidied version of your own solution.
use strict;
use warnings;
my %records = (
a => [ qw/ A B C / ],
b => [ qw/ C D E A / ],
c => [ qw/ A C E / ],
);
print "$_\n" for construct_arr(\%records);
sub construct_arr {
my $records = shift;
my #matrix = values %$records;
my #final;
# iterate through each element the first row
for my $i ( 0 .. $#{$matrix[0]} ) {
my $count = 1;
# look for this value in all the rest of the rows, dropping
# out to the next row as soon as a match is found
ROW:
for my $j ( 1 .. $#matrix ) {
for my $k (0 .. $#{$matrix[$j]}) {
next unless $matrix[0][$i] eq $matrix[$j][$k];
$count++;
next ROW;
}
}
# If element is in all the rows, push it on to the final array
push #final, $matrix[0][$i] if $count == #matrix;
}
return #final;
}
The output is the same as for my own program, but the functionality is slightly different as mine assumes the values in each row are unique. If the sama value appears more than once my solution will break (the same applies to sundar's). Please let me know if that is acceptable.
Although the poster explained there aren't duplicates within a single array, here is my attempt which handles that case too (notice the slightly modified test data - "5" should not be printed):
#!/usr/bin/env perl
use warnings;
use strict;
my %records = (
a => [1, 2, 3],
b => [3, 4, 5, 1],
c => [1, 3, 5, 5]
);
my %seen;
while (my ($key, $vals) = each %records) {
$seen{$_}{$key} = 1 for #$vals;
}
print "$_\n" for grep { keys %{$seen{$_}} == keys %records } keys %seen;
You can find the size of the hash easily using scalar(keys %hash);
Here's an example code that does what you need:
#!/usr/bin/perl
use strict;
use warnings;
my %records = ( a => [1, 2, 3],
b => [3, 4, 5, 1],
c => [1, 3, 5]
);
my %count;
foreach my $arr_ref (values %records) {
foreach my $elem (#$arr_ref) {
$count{$elem}++;
}
}
my #intersection;
my $num_arrays = scalar(keys %records);
foreach my $elem (keys %count) {
#If all the arrays contained this element,
#allowing for multiple entries per array
if ($count{$elem} >= $num_arrays) {
push #intersection, $elem;
}
}
Feel free to comment if you need any clarification in this code. And the second foreach that constructs the #intersection array is written this way only for clarity - if you're learning Perl, I'd suggest you study and rewrite it using the map construct, since that's arguably more idiomatic Perl.

How can I append a string to an array stored in an object, in perl?

I want to use an obj->log($msg) function to store strings, and then later dump them
UPDATED:
use warnings;
use strict;
use diagnostics;
package obj;
sub log{
my($self,$log_info) = #_;
push(#{$self->{log}},$log_info);
}
sub showlog{
my($self) = #_;
my $counter = 0;
my #log= #{$self->{log}};
print '<div id="log" class="_view _debug"><h1>Log Data</h1>';
foreach my $i (#log) {
$counter++;
print '<div class="item',(($counter%2)?' alt':''),'"><em class="tag">',$counter,'</em><pre>';
print(Dumper($i));
print $i;
print '</pre></div>';
}
print '</div>';
}
but I can't figure out how to append new items to $self->{log} -- I come from PHP land so this is a bit painful.
It would be nice if I could have log be any sort of data, and just dump it out, array, hash or scalar... is there a way to do this?
Hashes values must be scalars. If you want to store multiple values in hash value, you'll need to find a way of placing multiple values in a scalar. A way that would work well here involves storing a reference to an array in the hash value:
$self->{log} = []; # Optional because of autovivification.
Then place the multiple values in the referenced array:
push #{ $self->{log} }, $log_info;
One would iterate over the values as follows:
for my $entry (#{ $self->{log} }) {
If you want to use index for $i % 2, you can use:
my $log = $self->{log};
for my $i (0..$#$log) {
my $entry = $log->[$i];
...
}
Note: Since 5.14.0,
push $self->{log}, ...;
is mostly equivalent to
push #{ $self->{log} }, ...;
Perl 5.14 is pretty new, so you might want to stay away from that for unless you're just coding for yourself.
It appears as though $self->{log} is an array reference. As of Perl 5.14.0, it's safe to push onto array references, but otherwise, you'll want to de-reference the array reference and do: push(#{$self->{log}},$log_info);. Similarly, you'll want to dereference when assigning to #log by doing my #log=#{$self->{log}} (and by the way, you have use strict; and use warnings; at the top of your script, right?).
Take a look at perldoc perlreftut and perldoc perlref.
You'll probably also find Data::Dumper to be handy for printing out complicated data structures. If you want to know everything that's in your log, you can just have use Data::Dumper; at the beginning of your script and then do print Dumper($self->{log}) . "\n"; (no need to de-reference!).
How about this:
package Logger;
# create a logger object
sub new {
my ($class) = #_;
my #messages;
my $self = {
log => \#messages,
};
bless $self, $class;
return $self;
}
# add a message to the log
sub append {
my $self = shift;
my $message = shift;
push ( #{$self->{log}}, $message );
}
# print the log
sub printlog {
my $self = shift;
foreach my $msg ( #{$self->{log}} ){
print "$msg\n";
}
}
You could useautobox::Core.
use autobox::Core;
...
$self->{log}->push( $log_info );
You could use Moose.
package Logger;
use Moose; ### Enables strict/warnings
has '_log_ref' => => (
traits => ['Array'],
is => 'ro',
isa => 'ArrayRef[Str]',
default => sub { [] },
handles => {
_log_contents => 'elements',
log => 'push',
},
);
sub showlog {
my($self) = #_;
my $counter = 0;
print '<div id="log" class="_view _debug"><h1>Log Data</h1>';
for my $i ($self->_log_contents()) {
$counter++;
print '<div class="item',(($counter%2)?' alt':''),'"><em class="tag">',$counter,'</em><pre>';
print(Dumper($i));
print $i;
print '</pre></div>';
}
print '</div>';
}

Resources