Perl - Identify all elements in a hash that have the highest value


I wrote some Perl code that searches for the highest key-value pair in a hash in which the keys are text and the values are numbers:
my $o_val = 0;   # FOR TRACKING HIGHEST VALUE ENCOUNTERED IN THE LOOP
my $o_key;       # FOR TRACKING CORRESPONDING KEY TO THE HIGHEST VALUE
while ( my ($key, $val) = each(%NG) ) {
    if ( $val > $o_val ) {
        $o_val = $val;
        $o_key = $key;
    }
}
print "$o_key\n";
The problem is that it does nothing to account for the possibility of a tie for highest value. How can I capture all the key-value pairs that tied for highest value if my measuring variable can potentially change with each iteration of the loop?
I have the idea that I could write another while loop to run through the same hash after $o_val had been established as the highest value then push each key paired to $o_val into another array, something like:
my @highest; # ARRAY OF HIGHEST-VALUE KEYS
while ( my ($key, $val) = each(%NG) ) {
    if ( $val == $o_val ) { push(@highest, $key); }
}
but this seems kinda inefficient. I'm hoping there are some Perl commands I'm not aware of that will let me identify the highest value in the hash without looping through it so I can just use the second piece of code in my question.

You can use an array to keep the keys for the highest value:
if ( $val > $o_val ) {
    $o_val = $val;
    @o_keys = ($key);
}
elsif ($val == $o_val) {
    push(@o_keys, $key);
}
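For reference, the fragment above can be dropped into the asker's loop. The following is a minimal self-contained sketch, not the answerer's exact code: the function name keys_with_max and the sample data are made up for illustration, and the running maximum starts undefined rather than at 0 so that hashes holding only negative values also work.

```perl
use strict;
use warnings;

# Hypothetical helper: one pass over the hash, returning the highest value
# followed by the (sorted) keys that hold it.
sub keys_with_max {
    my ($h) = @_;
    my ($max, @keys);
    while ( my ($key, $val) = each %$h ) {
        if ( !defined $max || $val > $max ) {
            $max  = $val;       # new highest value: restart the list
            @keys = ($key);
        }
        elsif ( $val == $max ) {
            push @keys, $key;   # tie with the current highest value
        }
    }
    return ($max, sort @keys);
}

my %NG = ( apple => 3, pear => 7, plum => 7, fig => 1 );   # made-up sample data
my ($max, @winners) = keys_with_max(\%NG);
print "$max: @winners\n";   # prints "7: pear plum"
```

The keys are sorted only to make the output deterministic; drop the sort if order does not matter.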

#!/usr/bin/env perl
use strict;
use warnings;
my %h = map { $_ => int(rand 10) } 'a' .. 'z';
# see what we've got
my @k = sort { $h{$b} <=> $h{$a} } keys %h;
print "$_ => $h{$_}\n" for @k;
# initialize $max with a value from %h
my ($max) = values %h;
# keys with highest values
my @argmax;
while (my ($k, $v) = each %h) {
    next if $v < $max;
    if ($v > $max) {
        $max = $v;
        @argmax = ($k);
    }
    else {
        push @argmax, $k;
    }
}
print "@argmax\n";
print "@h{ @argmax }\n";
Of course, if you don't care much about memory, or if you are golfing, or japhing, you could write something like:
my %v;
push @{ $v{$h{$_}} }, $_ for keys %h;
my ($max) = sort { $b <=> $a } keys %v;
print "@{ $v{$max} } => $max\n";
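The question also asks whether the highest value can be found without writing the first loop by hand. One option, sketched here as an assumption rather than as part of either answer, is max from the core List::Util module followed by a grep for ties. It still visits every value internally, so it is shorter to write but not asymptotically cheaper. The name argmax_keys is hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: all keys whose value equals the maximum value.
sub argmax_keys {
    my ($h) = @_;
    return unless %$h;                  # empty hash: empty list
    my $max  = max(values %$h);         # highest value in the hash
    my @keys = grep { $h->{$_} == $max } keys %$h;
    return sort @keys;                  # sorted for stable output
}

my %NG = ( apple => 3, pear => 7, plum => 7, fig => 1 );   # made-up sample data
print join(",", argmax_keys(\%NG)), "\n";   # prints "pear,plum"
```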

Related

Is there a built in Perl Function for finding duplicate subarrays(exact order) in an array?

Lets say the array is (1,2,3,4,5,6,7,8,9),
Another subarray is (2,3,4)
Is there a function to check if the subarray pattern(full exact order) exists within array?
In this case, it would return any indicator(index) that shows it exists.
Also would need to work for duplicates if there are multiple subarrays existing in the array like (4,2,3,4,2,3,4).
If it happens to match multiple times for example:
Array = (2,3,2,3,2,2,3,2)
Sub Array = (2,3,2)
Would just return starting index of matches in order: 0,2,5
Or if it removes, would result in (3,2)
Edit: Elements don't have to be num

There's no built-in method, but it's easy to write:
#!/usr/bin/env perl
use warnings;
use strict;
use feature qw/say/;
# Takes two arrayrefs of numbers.
#
# Returns the first index in the first one where the second list appears, or
# -1 if not found.
sub find_sublist(++) {
    my ($haystack, $needle) = @_;
    my $nlen = @$needle;
    my $hlen = @$haystack;
    return -1 if $hlen == 0 || $nlen == 0;
    HAYSTACK_POS:
    for (my $n = 0; $n <= $hlen - $nlen; $n++) {
        for (my $m = 0; $m < $nlen; $m++) {
            if ($haystack->[$n + $m] != $needle->[$m]) {
                next HAYSTACK_POS;
            }
        }
        return $n;
    }
    return -1;
}
# Takes two arrayrefs of numbers.
#
# Returns a list of the starting indexes in the first list
# of every run of the second list. Returns an empty list if
# there are no matches.
sub find_sublists(++) {
    my ($haystack, $needle) = @_;
    my $nlen = @$needle;
    my $hlen = @$haystack;
    my @positions;
    return @positions if $hlen == 0 || $nlen == 0;
    HAYSTACK_POS:
    for (my $n = 0; $n <= $hlen - $nlen; $n++) {
        for (my $m = 0; $m < $nlen; $m++) {
            if ($haystack->[$n + $m] != $needle->[$m]) {
                next HAYSTACK_POS;
            }
        }
        push @positions, $n;
    }
    return @positions;
}
# Takes two arrayrefs of numbers.
#
# Returns a new list that is the first one with every non-overlapping run of
# the second list removed.
sub remove_sublists(++) {
    my @haystack = @{$_[0]};
    my $needle = $_[1];
    while ((my $pos = find_sublist @haystack, $needle) != -1) {
        splice @haystack, $pos, @$needle;
    }
    return @haystack;
}
my @list1 = (1,2,3,4,5,6,7,8,9);
my @list2 = (4,2,3,4,2,3,4);
my @list3 = (2,3,2,3,2,2,3,2);
say find_sublist(@list1, [2, 3, 4]); # Returns 1
say find_sublist([2,9,3,4], [2,3,4]); # Returns -1
my @positions = find_sublists(@list2, [2,3,4]); # 1,4
say join(",", @positions);
@positions = find_sublists(@list3, [2,3,2]); # 0,2,5
say join(",", @positions);
say join(",", remove_sublists(@list1, [2,3,4])); # 1,5,6,7,8,9
say join(",", remove_sublists(@list3, [2,3,2])); # 3,2
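The question's edit notes that elements need not be numbers, whereas the subs above compare with !=. A minimal sketch of a string-comparing variant follows. The name find_sublists_str is made up here, and it takes plain array references instead of using a prototype.

```perl
use strict;
use warnings;

# Hypothetical variant of find_sublists that compares elements as strings
# (ne instead of !=), so it also works for non-numeric elements.
sub find_sublists_str {
    my ($haystack, $needle) = @_;
    my @positions;
    return @positions unless @$needle && @$haystack >= @$needle;
    START:
    for my $n (0 .. @$haystack - @$needle) {
        for my $m (0 .. $#$needle) {
            next START if $haystack->[$n + $m] ne $needle->[$m];
        }
        push @positions, $n;    # every element matched at offset $n
    }
    return @positions;
}

print join(",", find_sublists_str([qw(a b a b a a b a)], [qw(a b a)])), "\n";   # prints "0,2,5"
```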

If the inputs are numbers representable by your perl's integers (as shown), you can use
# Indexes
my $pattern = pack "W*", @pattern;
my $array = pack "W*", @array;
my @indexes;
push @indexes, $-[0] while $array =~ /\Q$pattern/g;
# Removal
my $pattern = pack "W*", @pattern;
my $array = pack "W*", @array;
$array =~ s/\Q$pattern//g;
@array = unpack "W*", $array;
How it handles overlaps:
           /---\     /---\      Removed
2,3,2 from 2,3,2,3,2,2,3,2
               \---/            Not removed
Note that this also works if you can map the inputs to numbers.
my ( %map_f, @map_r );
for ( @array, @pattern ) {
    if ( !exists( $map_f{ $_ } ) ) {
        $map_f{ $_ } = @map_r;
        push @map_r, $_;
    }
}
my $pattern = pack "W*", @map_f{ @pattern };
my $array = pack "W*", @map_f{ @array };
$array =~ s/\Q$pattern//g;
@array = @map_r[ unpack "W*", $array ];
It's not the best algorithm, but it should be very fast by moving the work from Perl to the regex engine.
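A plain /g match only reports non-overlapping occurrences, so for the question's example it would find indexes 0 and 5 but not 2. If overlapping matches are wanted, one possible adjustment (an assumption on my part, not something the answer states) is to wrap the packed pattern in a zero-width lookahead. The sketch below assumes the elements are non-negative integers small enough to be used as character code points; the name overlapping_indexes is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: start indexes of every occurrence of @$pattern in
# @$array, overlapping occurrences included.
sub overlapping_indexes {
    my ($array, $pattern) = @_;
    return unless @$pattern;            # an empty pattern would match everywhere
    my $p = pack "W*", @$pattern;       # one character per element
    my $s = pack "W*", @$array;
    my @indexes;
    # The lookahead consumes nothing, so the next search resumes one
    # character later and overlapping matches are not skipped.
    push @indexes, $-[0] while $s =~ /(?=\Q$p\E)/g;
    return @indexes;
}

print join(",", overlapping_indexes([2,3,2,3,2,2,3,2], [2,3,2])), "\n";   # prints "0,2,5"
```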

Create Multi Dimensional Hash Using Array Elements

I need to create multidimensional hashes with varying depth using array elements as keys. Pseudocode attempt:
Example line:
Statement Coverage for instance Hierarchical/path/with/block/and/module ...
if (/Statement Coverage for instance (.*?) /)
{
    @array = split /\//, $1;
}
for (my $eye = 1; $eye <= $#array; $eye++)
{
    A miracle happens to %hash!
}
$hash{"path"}{"with"}{"block"}{"and"} now has a value of "module". Remember, the number of keys can vary. Any ideas?

That's what Data::Diver does for you:
use Data::Diver qw( DiveVal );
use Data::Dumper;
my @array = split /\//, $1;
DiveVal(\ my %hash, @array[ 0 .. $#array - 1 ]) = $array[-1];
print Dumper(\%hash);
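If installing a CPAN module is not an option, the same effect can be had with core Perl by walking a reference down the hash and creating levels as needed. This is a minimal sketch under that assumption; set_nested is a made-up name, and the code requires Perl 5.10 or later for the //= operator.

```perl
use strict;
use warnings;

# Hypothetical helper: store $value under the nested key path @keys,
# creating intermediate hashes on the way down.
# Dies under strict refs if an intermediate key already holds a plain scalar.
sub set_nested {
    my ($root, $value, @keys) = @_;
    return $root unless @keys;
    my $last = pop @keys;
    my $node = $root;
    $node = $node->{$_} //= {} for @keys;   # descend, autocreating each level
    $node->{$last} = $value;
    return $root;
}

my %hash;
my @parts = split m{/}, 'Hierarchical/path/with/block/and/module';
shift @parts;                  # the question's loop skips the first element
my $value = pop @parts;        # the last element becomes the stored value
set_nested(\%hash, $value, @parts);
print "$hash{path}{with}{block}{and}\n";   # prints "module"
```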

See my code below. It builds the desired hash recursively.
But I think you are taking the wrong approach. I obviously don't know exactly what you are trying to achieve, but it seems to me that you should use a tree data structure instead of a multidimensional hash.
use strict;
use warnings;
use v5.10;
use Data::Dumper;
my @data = (
    'some/path/test',
    'some/path/deeper/test',
    'another/deeper/path/test',
);
my $resultHr = {};
foreach my $path (@data) {
    my @elems = split /\//, $path;
    buildHash($resultHr, @elems);
}
say Dumper($resultHr);
sub buildValue {
    my $n = shift;
    if (@_) {
        return {$n => buildValue(@_)};
    }
    else {
        return $n;
    }
}
sub buildHash {
    my $hr = shift;
    my $k = shift;
    return unless $k;
    if (exists $hr->{$k} && ref $hr->{$k}) {
        buildHash($hr->{$k}, @_);
    }
    else {
        $hr->{$k} = buildValue(@_);
    }
}

Perl Compare hash of arrays with another array

I am trying to compare a complete array against each value of a hash (each value is itself an array) and, if a match is found, push that hash key onto a new array.
The code below works when the hash value is not an array, but how can I do the comparison when it is an array?
%hash=(
    storeA=>['milk','eggs'],
    storeB=>['milk','fruits','eggs','vegetables'],
    storeC=>['milk','fruits','eggs'],
);
@array = (
    'fruits',
    'milk',
    'eggs'
);
Code to compare
use strict;
use warnings;
use Data::Dumper;
foreach my $thing (@array) {
    foreach ((my $key, my $value) = each %hash) {
        if ($value eq $thing) {
            push @new_array, $key;
        }
    }
}
print Dumper(\@new_array);
Expected Output
@new_array = (
    storeB,
    storeC
);

You could also use a combination of all and any from List::Util:
use List::Util qw(all any);
while ((my $key, my $value) = each %hash) {
    if ( all { my $temp = $_; any { $_ eq $temp } @$value } @array ) {
        push @new_array, $key;
    }
}
So here you are looking for the case where all the elements of @array exist in the given array from the hash.

I would build a hash out of each store's stock array. It's a wasteful method, but not egregiously so as long as the real data isn't enormous.
Like this. The inner grep statement counts the number of items in @list that are available at this store and compares it to the number of items in the list, returning true if everything is in stock.
If this is a real situation (I suspect it's homework) then for all practical purposes that I can think of, the %stocks hash should contain hashes of the items available at each store.
use strict;
use warnings 'all';
my %stocks = (
    storeA => [ qw/ milk eggs / ],
    storeB => [ qw/ milk fruits eggs vegetables / ],
    storeC => [ qw/ milk fruits eggs / ],
);
my @list = qw/ fruits milk eggs /;
my @stores = grep {
    my %stock = map { $_ => 1 } @{ $stocks{$_} };
    grep($stock{$_}, @list) == @list;
} keys %stocks;
use Data::Dump;
dd \@stores;
output
["storeB", "storeC"]
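The answer's closing suggestion, that %stocks should hold hashes rather than arrays, might look like the sketch below. This is an illustration under that assumption, not the answerer's code: the name stores_with_all is made up, and all comes from List::Util (version 1.33 or later).

```perl
use strict;
use warnings;
use List::Util qw(all);

# Stock kept as a hash of hashes, so each membership test is one exists() lookup.
my %stocks = (
    storeA => { map { $_ => 1 } qw( milk eggs ) },
    storeB => { map { $_ => 1 } qw( milk fruits eggs vegetables ) },
    storeC => { map { $_ => 1 } qw( milk fruits eggs ) },
);

# Hypothetical helper: names of the stores that stock every wanted item.
sub stores_with_all {
    my ($stocks, @wanted) = @_;
    my @stores = grep {
        my $stock = $stocks->{$_};              # this store's item hash
        all { exists $stock->{$_} } @wanted;    # true only if nothing is missing
    } keys %$stocks;
    return sort @stores;
}

print join(",", stores_with_all(\%stocks, qw( fruits milk eggs ))), "\n";   # prints "storeB,storeC"
```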

Find the intersection of the two sets; if the number of its elements equals the number of elements in the array, you want to store the key:
#!/usr/bin/perl
use warnings;
use strict;
sub intersect {
    my ($arr1, $arr2) = @_;
    my %intersection;
    $intersection{$_}{0}++ for @$arr1;
    $intersection{$_}{1}++ for @$arr2;
    return grep 2 == keys %{ $intersection{$_} }, keys %intersection
}
my %hash = (
    storeA => [qw[ milk eggs ]],
    storeB => [qw[ milk fruits eggs vegetables ]],
    storeC => [qw[ milk fruits eggs ]],
);
my @array = qw( fruits milk eggs );
my @new_array;
while (my ($store, $arr) = each %hash) { # while, not for!
    push @new_array, $store if @array == intersect(\@array, $arr);
}
use Data::Dumper;
print Dumper(\@new_array);

Simply try this. I used one small trick here: grep filters the elements of an array.
I created the variable $joined_array, which contains the elements of @array joined with |, and then passed that variable into the grep.
The trick is that when an array is compared with a scalar, the array is evaluated in scalar context, so the comparison uses its number of elements.
my @array = qw(one two three);
if (@array == 3)
{
    print "Hi\n";
}
Here the condition is evaluated internally as 3 == 3.
I used the same logic here.
use warnings;
use strict;
my %hash=(
    "storeA"=>['milk','eggs'],
    "storeB"=>['milk','fruits','eggs','vegetables'],
    "storeC"=>['milk','fruits','eggs'],
);
my @array = (
    'fruits',
    'milk',
    'eggs'
);
my @new_array;
my $joined_array = join("|",@array);
foreach (keys %hash)
{
    # (?:...) groups the alternation so that \b applies to every alternative
    push(@new_array,$_) if ((grep{ /\b(?:$joined_array)\b/ } @{$hash{$_}}) >= scalar @array);
}
print "$_\n" for @new_array;

Go through all stores (keys) and for each check whether all array elems are in the key's array-ref.
use strict;
use warnings;
my %inventory = (
    storeA => ['milk','eggs'],
    storeB => ['milk','fruits','eggs','vegetables'],
    storeC => ['milk','fruits','eggs'],
);
my @items = ('fruits', 'milk', 'eggs');
my @found;
foreach my $store (keys %inventory) {
    push @found, $store
        if @items == grep { in_store($_, $inventory{$store}) } @items
}
sub in_store { for (@{$_[1]}) { return 1 if $_[0] eq $_ }; return 0; }
print "@found\n"; # prints: storeB storeC
The grep block checks for each item whether it is available in the store; if the number of items that pass equals the number of items (the array size), that store has all of them and is added. Note that a subroutine returns the value of its last evaluated expression, so the return keyword in the final return 0 is optional (a bare 0 would do). It was added for clarity.

Access the key value from an associative array

I have the associative array %cart_item, within this is a series of associative arrays. I need to access the value of the keys within %cart_item. I have the following code which iterates on each array key. (I do the equivalent of php's continue if the value is 'meta')
my $key_value;
for (keys %cart_item) {
    next if (/^meta$/ || /^\s*$/);
}
I need to do something like this though (although this isn't valid), setting the value of the keys in the loop:
my $key_value;
for $i (keys %cart_item) {
    next if (/^meta$/ || /^\s*$/);
    $key_value = $i;
    # do stuff
}
Could anyone suggest a solution here? Apologies if this is obvious, I'm a Perl newbie. Thanks

I think you are asking for
for my $key (keys %cart_item) {
    next if $key =~ /^meta$/ || $key =~ /^\s*$/;
    my $val = $cart_item{$key};
    ...
}

If you're just looking for the value that goes with the key, you can get both at the same time with each:
while (my ($key, $val) = each %cart_item) {
    next if $key eq 'meta' || $key =~ /^\s*$/;
    ...
}
That's the equivalent of PHP's foreach ($cart_item as $key => $val).
I also changed the "meta" check to use simple string equality; no need to use a regular expression for an exact match.

Your original code has
for ( keys %cart_item ) {
    next if (/^meta$/ || /^\s*$/);
}
which works fine because the for has no loop control variable so it defaults to Perl's "pronoun" it variable $_. In addition, your regex pattern matches have no object so they also default to $_
Written fully, this would be
for $_ ( keys %cart_item ) {
    next if ( $_ =~ /^meta$/ || $_ =~ /^\s*$/);
}
but we don't have to write all of that. Some people hate it; others like me think it's absolute genius
Your non-working code
my $key_value;
for $i (keys %cart_item) {
    next if (/^meta$/ || /^\s*$/);
    $key_value = $i;
    # do stuff
}
does use a loop control variable $i (bad name for a hash key, by the way). That's all fine except that your regex matches still test $_, which this loop no longer sets. Match against $i explicitly instead:
my $key_value;
for my $i (keys %cart_item) {
    next if $i =~ /^meta$/ or $i =~ /^\s*$/;
    $key_value = $i;
    # do stuff
}
or, better still, stick with $_ and write this
for ( keys %cart_item ) {
    next if /^meta$/ or /^\s*$/;
    my $key_value = $_;
    # do stuff
}
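Since the question describes %cart_item as a hash whose values are themselves hashes, it may help to see the inner values being read as well. The sketch below is only an illustration: the cart layout, the inner keys qty and price, and the helper name item_keys are all made up, as the question does not show the real structure.

```perl
use strict;
use warnings;

# Hypothetical helper: the cart's keys minus 'meta' and blank keys, sorted.
sub item_keys {
    my ($cart) = @_;
    my @keys = grep { $_ ne 'meta' && !/^\s*$/ } keys %$cart;
    return sort @keys;
}

# Made-up cart; the real inner keys are not shown in the question.
my %cart_item = (
    meta => { currency => 'USD' },
    sku1 => { qty => 2, price => 5 },
    sku2 => { qty => 1, price => 12 },
);

for my $key ( item_keys(\%cart_item) ) {
    my $item = $cart_item{$key};    # the value is a reference to the inner hash
    print "$key: $item->{qty} x $item->{price}\n";
}
```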

ID tracking while swapping and sorting other two arrays in perl

#! /usr/bin/perl
use strict;
my (@data,$data,@data1,@diff,$diff,$tempS,$tempE, @ID,@Seq,@Start,@End, @data2);
#my $file=<>;
open(FILE, "< ./out.txt");
while (<FILE>){
    chomp $_;
    #next if ($line =~/Measurement count:/ or $line =~/^\s+/) ;
    #push @data, [split ("\t", $line)] ;
    my @data = split('\t');
    push(@ID, $data[0]);
    push(@Seq, $data[1]);
    push(@Start, $data[2]);
    push(@End, $data[3]);
    # push @$data, [split ("\t", $line)] ;
}
close(FILE);
my %hash = map { my $key = "$ID[$_]"; $key => [ $Start[$_], $End[$_] ] } (0..$#ID);
for my $key ( %hash ) {
    print "Key: $key contains: ";
    for my $value ($hash{$key} ) {
        print " $hash{$key}[0] ";
    }
    print "\n";
}
for (my $j=0; $j <=$#Start ; $j++)
{
    if ($Start[$j] > $End[$j])
    {
        $tempS=$Start[$j];
        $Start[$j]=$End[$j];
        $End[$j]=$tempS;
    }
    print"$tempS\t$Start[$j]\t$End[$j]\n";
}
my @sortStart = sort { $a <=> $b } @Start;
my @sortEnd = sort { $a <=> $b } @End;
#open(OUT,">>./trial.txt");
for(my $i=1521;$i>=0;$i--)
{
    print "hey";
    my $diff = $sortStart[$i] - $sortStart[$i-1];
    print "$ID[$i]\t$diff\n";
}
I have three arrays of the same length: ID with IDs (strings), and Start and End with integer values (all read from a file).
I want to loop through all these arrays while keeping track of the IDs. First I swap the elements of Start and End wherever Start > End; then I have to sort these two arrays for further processing (I subtract consecutive elements, e.g. Start[0] - Start[1], for each item in Start). Sorting may change the order of the values, and since each ID is unique to its Start and End elements, how can I keep track of my IDs while sorting?
The three arrays under consideration are ID, Start and End.
Here is a small chunk of my input data:
DQ704383 191990066 191990037
DQ698580 191911184 191911214
DQ724878 191905507 191905532
DQ715191 191822657 191822686
DQ722467 191653368 191653339
DQ707634 191622552 191622581
DQ715636 191539187 191539157
DQ692360 191388765 191388796
DQ722377 191083572 191083599
DQ697520 189463214 189463185
DQ709562 187245165 187245192
DQ540163 182491372 182491400
DQ720940 180753033 180753060
DQ707760 178340696 178340726
DQ725442 178286164 178286134
DQ711885 178250090 178250119
DQ718075 171329314 171329344
DQ705091 171062479 171062503
The columns above are ID, Start and End respectively. Where Start > End, I swapped the values between those two arrays only. After swapping, the descending order may no longer hold, but I want both arrays in descending order, together with their corresponding IDs, for the subtraction explained above.

Don't use different arrays, use a hash to keep the related pieces of information together.
#!/usr/bin/perl
use warnings;
use strict;
use enum qw( START END );
my %hash;
while (<>) {
    my ($id, $start, $end) = split;
    $hash{$id} = [ $start < $end ? ($start, $end)
                                 : ($end, $start) ];
}
my @by_start = sort { $hash{$a}[START] <=> $hash{$b}[START] } keys %hash;
my @by_end = sort { $hash{$a}[END] <=> $hash{$b}[END] } keys %hash;
use Test::More;
is_deeply(\@by_start, \@by_end, 'same');
done_testing();
Moreover, in the data sample you provided, the order of the IDs is the same regardless of which column you sort by.
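To carry the IDs through to the subtraction step the asker describes, one option is to keep each line as a single record and sort the records rather than separate arrays. The following is a sketch under that assumption, using only core Perl. The names parse_records and start_gaps are made up, and the three sample lines are taken from the question's data.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "ID start end" lines into records,
# swapping where needed so that start <= end.
sub parse_records {
    my @records;
    for my $line (@_) {
        my ($id, $start, $end) = split ' ', $line;
        next unless defined $end;                       # skip short or blank lines
        ($start, $end) = ($end, $start) if $start > $end;
        push @records, { id => $id, start => $start, end => $end };
    }
    return @records;
}

# Hypothetical helper: sort by start, descending, and pair each ID with the
# difference between its start and the next record's start.
sub start_gaps {
    my @sorted = sort { $b->{start} <=> $a->{start} } @_;
    return map { [ $sorted[$_]{id}, $sorted[$_]{start} - $sorted[$_ + 1]{start} ] }
           0 .. $#sorted - 1;
}

my @records = parse_records(
    'DQ704383 191990066 191990037',
    'DQ698580 191911184 191911214',
    'DQ724878 191905507 191905532',
);
print "$_->[0]\t$_->[1]\n" for start_gaps(@records);
```

Because the ID travels inside the record, no separate bookkeeping is needed when the sort reorders the rows.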