How to create hash with duplicate keys - arrays

Now I am modifying the code a little.
I am using this code to create a hash with duplicate keys. It is giving a syntax error.
use strict;
use warnings;
use Data::Dumper;
my $s = "12 A P1
23 B P5
24 C P2
15 D P1
06 E P5";
my $hash;
my @a = split(/\n/, $s);
foreach (@a)
{
my $c = (split)[2];
my $d = (split)[1];
my $e = (split)[0];
push(@{$hash->{$c}}, $d);
}
print Dumper($hash );
I am getting this output:
$VAR1 = {
'P5' => [
'B',
'E'
],
'P2' => [
'C'
],
'P1' => [
'A',
'D'
]
};
But I want output like this:
$VAR1 = {
'P5' => {
'E' => '06',
'B' => '23'
},
'P2' => {
'C' => '24'
},
'P1' => {
'A' => '12',
'D' => '15'
}
};
How can I do that?

Your hash declaration is incorrect, it should be:
my %hash = ();
or simply:
my %hash;
Then the rest of your code is both too complex and incorrect.
foreach (@a) {
my ($k, $v) = (split);
push @{$hash{$k}}, $v;
}
should be enough. See Autovivification for why this works.
With your code, the first time you see a key, you set $hash{$k} to be a scalar. You can't then push things to that key - it needs to be an array to begin with.
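As a rough illustration of the autovivification point above, here is a minimal sketch. The helper name group_pairs and the sample data are made up for this example; they are not from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: group values under their key.
# Repeated keys simply grow the same array.
sub group_pairs {
    my @pairs = @_;    # flat list: key, value, key, value, ...
    my %grouped;
    while (@pairs) {
        my ($k, $v) = splice @pairs, 0, 2;
        # $grouped{$k} does not exist the first time a key is seen; using it
        # as an array reference makes Perl create the anonymous array for us.
        push @{ $grouped{$k} }, $v;
    }
    return \%grouped;
}

my $grouped = group_pairs(qw(P1 26 P5 23 P2 24 P1 15 P5 06));
print "$_: @{ $grouped->{$_} }\n" for sort keys %$grouped;
```

No key is ever initialised by hand here, which is the whole point: the first push on a missing key creates the array.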
The if (-e $hash{$c}) test is wrong. -e is a file existence test. If you want to know if a hash key exists, use:
if (exists $hash{$c}) { ... }
And print %hash; won't do what you expect (and print %{$hash}; is invalid). You'll get a prettier display if you do:
use Data::Dumper;
print Dumper(\%hash);
(Great debugging tool, this Data::Dumper.)

Perl is telling you exactly what is wrong. You have used the strict pragma, so using the %hash variable without declaring it is a compile-time error. While the string %hash does not appear in your code, the string $hash{...} does, on each of the problem lines. This is the syntax to access an element of %hash, which is why strict is complaining.
You have declared the variable $hash, so accessing an element of the contained hash reference is written $$hash{...} or $hash->{...}. Fix the problem lines to access the correct variable and the code will compile.

%hash is a hash, and $hash is a scalar (holding a hash reference, like \%hash). They are two different variables.
To refer to the hash whose reference is stored in the scalar variable $hash, you have to use either $hash->{$c} or $$hash{$c}.
See References quick reference
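To make the distinction concrete, here is a small sketch with both variables declared side by side. The values are invented for illustration only.

```perl
use strict;
use warnings;

my %hash = (P1 => 'A');      # a hash
my $hash = { P1 => 'D' };    # a scalar holding a reference to a different hash

print $hash{P1},   "\n";     # element of %hash
print $hash->{P1}, "\n";     # element of the hash that $hash points to
print $$hash{P1},  "\n";     # the same element, written the older way
```

The first print reads from %hash and the other two read through the reference in $hash, so they can give different values even though the names look alike.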
update:
#!/usr/bin/perl --
use strict; use warnings;
use Data::Dumper;
my $s = "P1 26
P5 23
P2 24
P1 15
P5 06 ";
my $hash = {};
for my $line ( split /[\r\n]+/, $s ) {
my( $c, $d ) = split ' ', $line;
push @{ $hash->{$c} }, $d;
}
print Dumper( $hash );
__END__
$VAR1 = {
'P5' => [
'23',
'06'
],
'P2' => [
'24'
],
'P1' => [
'26',
'15'
]
};

See the working code, the fixed errors (comments in the code), and the resulting output:
use strict;
use warnings;
my $s = "P1 26
P5 23
P2 24
P1 15
P5 06 ";
my %hash; #my $hash ={};
#my $arr = [];
my @a = split(/\n/, $s);
foreach (@a)
{
my $d = (split)[1];
my $c = (split)[0];
push(@{$hash{$c}}, $d); #if ...
}
while (my ($key, $value) = each(%hash)) #print %{$hash};
{
print "$key @{$value}\n";
}
#Output:
#P5 23 06
#P2 24
#P1 26 15

(Strange. Out of all the answers posted so far, none has actually answered the question...)
The code below produces the result asked for. The fundamental bit which seems to be missing from the original code is the two-level hash.
As an aside, there seems to be no reason for the outer hash to be a hashref and not a hash, so I made it a hash. Also you can pick out the split into variables in one line.
use strict;
use warnings;
use Data::Dumper;
my $s = "12 A P1
23 B P5
24 C P2
15 D P1
06 E P5";
my %hash;
my @a = split(/\n/, $s);
foreach (@a)
{
my ($e, $d, $c) = (split);
$hash{$c}{$d} = $e;
}
print Dumper(\%hash);
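Once the two-level hash exists, reading it back is a matter of two nested loops. The sketch below assumes the structure produced above; the flatten helper is a hypothetical name used only for this example.

```perl
use strict;
use warnings;

# The structure the code above builds from the sample input.
my %hash = (
    P1 => { A => '12', D => '15' },
    P2 => { C => '24' },
    P5 => { B => '23', E => '06' },
);

# Hypothetical helper: turn the nested hash back into "number letter group" lines.
sub flatten {
    my ($h) = @_;
    my @lines;
    for my $group (sort keys %$h) {
        for my $letter (sort keys %{ $h->{$group} }) {
            push @lines, "$h->{$group}{$letter} $letter $group";
        }
    }
    return @lines;
}

print "$_\n" for flatten(\%hash);
```

Sorting the keys at both levels is only there to make the output order predictable, since Perl hashes are unordered.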

Related

How to sort an hash and assign a variable to the keypair from an array in perl?

I have an array like this
my @arr =('1','apple','2','orange','1','orange','3','berry','2','berry','1','berry');
my %hash;
my $var =1;
Now how can I sort and assign a variable to the pair?
The desired output is
$hash{1}{apple} =>1;
$hash{1}{orange} =>1;
$hash{1}{berry} =>1;
$hash{2}{orange} =>1;
$hash{2}{berry} =>1;
$hash{3}{berry} =>1;
You need to iterate your array and take two values out per iteration. One way to do this is with a while loop. This will consume the array, so if you want to keep it you might want to make a copy.
use strict;
use warnings;
use Data::Printer;
my @arr = (
'1', 'apple', '2', 'orange', '1', 'orange',
'3', 'berry', '2', 'berry', '1', 'berry',
);
my %hash;
my $var = 1;
while ( my $first_key = shift @arr ) {
my $second_key = shift @arr;
$hash{$first_key}->{$second_key} = $var;
}
p %hash;
This outputs
{
1 {
apple 1,
berry 1,
orange 1
},
2 {
berry 1,
orange 1
},
3 {
berry 1
}
}
An alternative is to use a C-style for loop. This does not change the array.
for (my $i = 0; $i <= $#arr; $i+=2) {
$hash{ $arr[$i] }->{ $arr[$i + 1] } = $var;
}
Or you could use List::Util's pairs function to get two out at the same time.
use List::Util 'pairs';
foreach my $pair ( pairs @arr ) {
my ( $first_key, $second_key ) = @$pair;
$hash{$first_key}->{$second_key} = $var;
}
It's normally expected that you at least spend a few hours trying to write a solution yourself. We will happily help you if you've made a decent attempt of your own but have run out of ideas, but it doesn't go down well if you appear to have dumped your problem on us and are waiting for an answer to pop up while you drink a mug of coffee. You've been told about this before, and only one of your posts has a net positive vote. You need to work on that.
Are you certain that you really want a hash of hashes? This is very reminiscent of your previous question How to find if the value exists in hash without using key in perl? where we pretty much established that it was the wrong choice.
The only non-obvious part is extracting the values from the array in pairs, and I have used a C-style for loop to achieve this.
I have used Data::Dumper only to show the resulting hash of hashes.
use strict;
use warnings 'all';
my @arr = qw/ 1 apple 2 orange 1 orange 3 berry 2 berry 1 berry /;
my %hash;
for ( my $i = 0; $i < $#arr; $i += 2 ) {
$hash{$arr[$i]}{$arr[$i+1]} = 1;
}
use Data::Dumper;
print Dumper \%hash;
output
$VAR1 = {
'2' => {
'berry' => 1,
'orange' => 1
},
'3' => {
'berry' => 1
},
'1' => {
'berry' => 1,
'orange' => 1,
'apple' => 1
}
};
Update
Here's an example of generating the keys as I described in the comments. It's almost identical to the solution above, but the resulting hash contents are different.
use strict;
use warnings 'all';
my @arr = qw/ 1 apple 2 orange 1 orange 3 berry 2 berry 1 berry /;
my %hash;
for ( my $i = 0; $i < $#arr; $i += 2 ) {
$hash{"@arr[$i,$i+1]"} = 1;
}
use Data::Dumper;
print Dumper \%hash;
output
$VAR1 = {
'2 berry' => 1,
'1 apple' => 1,
'3 berry' => 1,
'1 orange' => 1,
'1 berry' => 1,
'2 orange' => 1
};
Take the values from the array two at a time (key/value), put them into a hash, then assign the variable as the value.
use Data::Dumper;
sub SortAndAssign {
my ($args) = @_;
my @arr = @{$args->{ARRAY}};
my $var = $args->{VARIABLE};
my %hash;
my $i = 0;
my $size = scalar(@arr);
while ($i < $size) {
# alternating key/value pairs (really a hash)
my $key = $arr[$i++]; # e.g. 1
my $value = $arr[$i++]; # e.g. apple
$hash{$key}{$value} = $var; # e.g. hash->1->apple = 1
}
return %hash;
}
sub ShowSortAndAssign {
my @arr =('1','apple','2','orange','1','orange','3','berry','2','berry','1','berry');
my $var = 1;
my %hash = SortAndAssign({
ARRAY => \@arr,
VARIABLE => $var,
});
print Dumper(\%hash);
print "first apple is " . $hash{1}{apple};
}
sub _Main {
ShowSortAndAssign();
}
_Main();

How to assign array variable as a hash?

I'm trying to print this hash. key1 is $array[0], key2 is $array[2], and $sum[0] is the value. But the hash does not work. What am I doing wrong?
@array=(10,45,20);
@sum=($array[0]+$array[1]+$array[2]);
%hash;
$hash{$array[0]}{$array[2]}=$sum[0]
At the end of the hashes I want to print 10 : 75 to the screen.
You've set
$hash{$array[0]}{$array[2]} = $sum[0]
which with the given values is
$hash{10}{20} = 75
If you want to print 10 : 75 from the hash then you need to write
printf "%d : %d\n",10, $hash{10}{20}
And while I'm sure you want something more general than that, you really haven't given enough information
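One possible generalisation, sketched under the assumption that you want every outer key printed next to the value stored beneath it, whatever the inner key happens to be. The helper name outer_key_lines is hypothetical.

```perl
use strict;
use warnings;

my @array = (10, 45, 20);
my $sum   = $array[0] + $array[1] + $array[2];

my %hash;
$hash{ $array[0] }{ $array[2] } = $sum;

# Hypothetical helper: build "outer key : value" for every stored entry.
sub outer_key_lines {
    my ($h) = @_;
    my @lines;
    for my $outer (sort { $a <=> $b } keys %$h) {
        for my $inner (sort { $a <=> $b } keys %{ $h->{$outer} }) {
            push @lines, "$outer : $h->{$outer}{$inner}";
        }
    }
    return @lines;
}

print "$_\n" for outer_key_lines(\%hash);    # prints "10 : 75"
```

If the inner key is never needed for output, that is usually a hint that a single-level hash keyed on $array[0] would be simpler.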
From the description you gave to @ikegami «my program will accept ...» I created a file that would have the data:
data_1.txt:
john 10 45 20
alex 30 15 12
pete 23 45 10 21
will 06 56
bob 8 12 3
lazy
note that only the first two lines actually match the description, I'll come back to that later.
sum.pl:
use strict;
use warnings;
use List::Util 'sum';
# get the two filenames it should work with
#
my $filename_1 = shift;
my $filename_2 = shift;
# be sure we read a file for most modern systems, UTF-8
#
open( my $file1, '<:encoding(UTF-8)', $filename_1)
or die "Can't open file: $filename_1";
# create the (empty) data structure
#
my %sums_and_names;
#
# the % in perl means you are talking about a hash,
# use a sensible name instead of 'hash'
# read line by line
while ( my $line = <$file1> ) {
chomp $line; # get rid of the line endings
my ($name, @grades) = split ' ', $line;
#
# this is not strictly doing what you asked for, just more flexible
#
# split on ' ', a space character, splits on any amount of (white) space
# your task said that there is one space.
# strictly, you should split on / /, the regular expression
#
# the first part will go into the variable $name, the rest in array @grades
# strictly you have only three grades so the following would do
# my ($name, $grade_1, $grade_2, $grade_3) = split / /, $line;
my $sum = sum(@grades) // 'no grades';
#
# since we now can handle any number of grades, not just three, we could
# have no grades at all and thus result in `undef`
#
# using the function sum0 would return the value 0 instead
#
# you'll get away with the `undef` using in a hash assignment,
# it will turn it into an empty string `''`
=pod
$sums_and_names{foo}{bar} = [ 'baz', 'qux' ];
=cut
#
# here is where your task doesn't make sense
# i am guessing:
#
$sums_and_names{$sum}{$name} = \@grades;
#
# at least we have all the data from filename_1, and the sum of the grades
}
# please decide on what you want to print
use Data::Dumper;
print Dumper \%sums_and_names;
and running perl sum.pl data_1.txt data_2.txt will give you something like
output:
$VAR1 = {
'no grades' => {
'lazy' => []
},
'23' => {
'bob' => [
'8',
'12',
'3'
]
},
'57' => {
'alex' => [
'30',
'15',
'12'
]
},
'62' => {
'will' => [
'06',
'56'
]
},
'75' => {
'john' => [
'10',
'45',
'20'
]
},
'99' => {
'pete' => [
'23',
'45',
'10',
'21'
]
}
};
please note, strictly the block inside the while loop could have been written as:
chomp $line;
my ($name, $grade_1, $grade_2, $grade_3) = split / /, $line;
my $sum = $grade_1 + $grade_2 + $grade_3;
$sums_and_names{$sum}{$name} = [ $grade_1, $grade_2, $grade_3 ];
but I quote from @Borodin:
And while I'm sure you want something more general than that, you really haven't given enough information
Always use use strict; use warnings qw( all );!!!
There's only one sum (at a time), so don't need an array.
There's no need for a hash of hash; a simple hash will do.
Fixed:
use strict;
use warnings qw( all );
use List::Util qw( sum );
my %hash;
while (...) {
my @nums = ...;
$hash{ $nums[0] } = sum(@nums);
}
for (sort { $a <=> $b } keys(%hash)) {
print("$_: $hash{$_}\n");
}

Retrieve unique values from column based on value from other column

I have a table like this
symbol length id
A 10 id_1
A 15 id_2
A 15 id_3
B 20 id_4
B 25 id_5
... ... ...
I want to print the following in a new table
symbol length id
A 15 id_2; id_3
B 25 id_5
... ... ...
So I want to loop through the symbol column. When there are duplicate values in this column, I want to print the line where the numeric length value is the greatest (example: symbol B). When the greatest length values are equal, I want to merge the values in the id column (example: symbol A) and print this new line.
How should I do this in perl?
The tool in perl for coalescing duplicates is a hash. Hashes are key-value pairs, but the useful part is - the value can be an array (reference).
I'd be suggesting something like this:
#!/usr/bin/perl
use strict;
use warnings;
my %length_of;
my %ids_of;
my $heading_row = <DATA>;
while (<DATA>) {
my ( $symbol, $length, $id ) = split;
if ( not defined $length_of{$symbol} or $length_of{$symbol} < $length ) {
$length_of{$symbol} = $length;
}
push( @{ $ids_of{$symbol}{$length} }, $id );
}
print join( "\t", "symbol", "length", "ids" ), "\n";
foreach my $symbol ( sort keys %ids_of ) {
my $length = $length_of{$symbol};
print join( "\t",
$symbol,
$length,
join( "; ", @{ $ids_of{$symbol}{$length} } ) ),
"\n";
}
__DATA__
symbol length id
A 10 id_1
A 15 id_2
A 15 id_3
B 20 id_4
B 25 id_5
What this is doing is - iterating your data, and saving the highest length value (in %length_of). It's also stashing each of the ids - by symbol and length (in %ids_of). It keeps them all, so this might not be very efficient if you've a lot of data.
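If memory is a concern, one alternative is to keep only the ids that belong to the current maximum for each symbol and discard the rest as you go. The following is a sketch of that idea, not a drop-in replacement; the sub name longest_per_symbol and the row layout [symbol, length, id] are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: per symbol, keep only the greatest length seen so far
# and the ids that share that length.
sub longest_per_symbol {
    my @rows = @_;    # each row is [ symbol, length, id ]
    my %best;         # symbol => { length => N, ids => [...] }
    for my $row (@rows) {
        my ($symbol, $length, $id) = @$row;
        my $entry = $best{$symbol};
        if (!$entry or $length > $entry->{length}) {
            # first row for this symbol, or a new maximum: start over
            $best{$symbol} = { length => $length, ids => [$id] };
        }
        elsif ($length == $entry->{length}) {
            # a tie with the current maximum: merge the id
            push @{ $entry->{ids} }, $id;
        }
    }
    return \%best;
}

my $best = longest_per_symbol(
    [ A => 10, 'id_1' ], [ A => 15, 'id_2' ], [ A => 15, 'id_3' ],
    [ B => 20, 'id_4' ], [ B => 25, 'id_5' ],
);
for my $symbol (sort keys %$best) {
    print join("\t", $symbol, $best->{$symbol}{length},
        join('; ', @{ $best->{$symbol}{ids} })), "\n";
}
```

This does not require the input to be sorted, and it stores at most one list of ids per symbol.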
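Just remember the last symbol and length and accumulate the ids. This assumes the input is sorted by symbol and, within each symbol, by ascending length:
#! /usr/bin/perl
use warnings;
use strict;
my ($last_l, $last_s, @i);
sub out {
print "$last_s\t$last_l\t", join(";", @i), "\n"
}
while (<>) {
next if $. == 1; # skip the header row
my ($s, $l, $i) = split;
if (defined $last_s and $s ne $last_s) {
out();
@i = (); # new symbol: start a fresh list of ids
}
@i = () if defined $last_l and $last_l < $l; # greater length: drop the ids of shorter rows
push @i, $i;
$last_s = $s;
$last_l = $l;
}
out() if defined $last_s;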
This approach builds a hash of hashes of arrays by using the values from the symbol and length columns as keys and adding the values from the id column as array references. For the simple dataset you provided such a complex data structure is not really needed, but the approach shown below might be more flexible in cases where data is not sorted.
I use the max function from List::Util (which is part of the core distribution) to get the maximum length value for each symbol, and Data::Dumper to help visualize things.
use Data::Dumper ;
use List::Util 'max';
use v5.16;
my (%hash, @lines) ;
while ( <DATA>) {
chomp ;
next if $. == 1 ;
push @lines, [ split ] ;
}
for (@lines) {
push @{ $hash{ $_->[0] }{ $_->[1] } }, $_->[2] ;
}
say "This is your %hash:\n", Dumper \%hash;
for my $symbol ( keys %hash ) {
my $max = max ( keys %{ $hash{$symbol} } ) ;
say "$symbol \t", "$max \t", join "; ", @{ $hash{$symbol}{$max} };
}
__DATA__
symbol length id
A 10 id_1
A 15 id_2
A 15 id_3
B 20 id_4
B 25 id_5
Output:
This is your %hash:
$VAR1 = {
'A' => {
'10' => [
'id_1'
],
'15' => [
'id_2',
'id_3'
]
},
'B' => {
'25' => [
'id_5'
],
'20' => [
'id_4'
]
}
};
A 15 id_2; id_3
B 25 id_5

How to build a Perl multidimensional array or hash?

I have a set of CSV values like this:
device name, CPU value, frequency of CPU value, CPU in percentage
For example
router1,5,10,4
router1,5,1,5
router2,5,10,4
router2,5,2,5
router3,4,5,6
router3,7,6,5
I need to form a data structure like this:
array = {
router1 => [5,10,4],[5,1,5],
router2 => [5,10,4],[5,2,5],
router3 => [4,5,6],[7,6,5]
}
I need help in forming this data structure in Perl.
I have tried visualizing how to do this but am unable to do so. I would appreciate any help on this.
The end goal for me is to convert this into a JSON object.
This should get you started. It uses the DATA file handle so that I could embed the data in the program itself. I have used to_json from the JSON module to format the hash as JSON data. The statement $_ += 0 for @values converts the contents of @values from strings to numbers, to avoid quotation marks in the resultant JSON data.
use strict;
use warnings;
use JSON;
my %data;
while (<DATA>) {
chomp;
my ($device, @values) = split /,/;
$_ += 0 for @values;
push @{ $data{$device} }, \@values;
}
print to_json(\%data, { pretty => 1, canonical => 1 });
__DATA__
router1,5,10,4
router1,5,1,5
router2,5,10,4
router2,5,2,5
router3,4,5,6
router3,7,6,5
output
{
"router1" : [
[
5,
10,
4
],
[
5,
1,
5
]
],
"router2" : [
[
5,
10,
4
],
[
5,
2,
5
]
],
"router3" : [
[
4,
5,
6
],
[
7,
6,
5
]
]
}
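Here is a simple solution that prints the data in roughly the layout shown in the question. Note that it builds strings rather than a nested data structure, so the output is not valid JSON.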
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;
my %hash;
while (my $line = <DATA>) {
chomp $line;
my ($device, @cpu_values) = split(/,/, $line);
my $cpu_token = join(",", @cpu_values);
$hash{$device} .= '[' . $cpu_token . '], ';
}
my @devices = keys %hash;
print "array = { \n";
foreach (sort @devices) {
print "$_ => [$hash{$_}]\n";
}
print "}\n";
__DATA__
router1,5,10,4
router1,5,1,5
router2,5,10,4
router2,5,2,5
router3,4,5,6
router3,7,6,5
In Perl you need to use references in the way of anonymous arrays and hashes to make multidimensional arrays, arrays of arrays, hashes containing hashes and anywhere in between. perlreftut should cover how to accomplish what you are trying to do. Here is an example I wrote the other day that could help explain as well:
print "\nFun with multidimensional arrays\n";
my @myMultiArray = ([1,2,3],[1,2,3],[1,2,3]);
for my $a (@myMultiArray){
for my $b (@{$a}){
print "$b\n";
}
}
print "\nFun with multidimensional arrays containing hashes\nwhich contains an anonymous array\n";
my @myArrayFullOfHashes = (
{'this-key'=>'this-value','that-key'=>'that-value'},
{'this-array'=>[1,2,3], 'this-sub' => sub {return 'hi'}},
);
for my $a (@myArrayFullOfHashes){
for my $b (keys %{$a}){
if (ref $a->{$b} eq 'ARRAY'){
for my $c (@{$a->{$b}}){
print "$b.$c => $c\n";
}
} elsif ($a->{$b} =~ /^CODE/){
print "$b => ". $a->{$b}() . "\n";
} else {
print "$b => $a->{$b}\n";
}
}
}

How can I store captures from a Perl regular expression into separate variables?

I have a regex:
/abc(def)ghi(jkl)mno(pqr)/igs
How would I capture the results of each set of parentheses into 3 different variables, one for each set? Right now I am using one array to capture all the results; they come out sequentially, but then I have to parse them and the list could be huge.
@results = ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs);
Your question is a bit ambiguous to me, but I think you want to do something like this:
my (@first, @second, @third);
# match in scalar context so that /g advances one match per iteration;
# a list-context match here would restart from the beginning every time
while( $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
push @first, $1;
push @second, $2;
push @third, $3;
}
Starting with 5.10, you can use named capture buffers as well:
#!/usr/bin/perl
use strict; use warnings;
my %data;
my $s = 'abcdefghijklmnopqr';
if ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/x ) {
push @{ $data{$_} }, $+{$_} for keys %+;
}
use Data::Dumper;
print Dumper \%data;
Output:
$VAR1 = {
'first' => [
'def'
],
'second' => [
'jkl'
],
'third' => [
'pqr'
]
};
For earlier versions, you can use the following which avoids having to add a line for each captured buffer:
#!/usr/bin/perl
use strict; use warnings;
my $s = 'abcdefghijklmnopqr';
my @arrays = \ my(@first, @second, @third);
if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
push @{ $arrays[$_] }, $captured[$_] for 0 .. $#arrays;
}
use Data::Dumper;
print Dumper @arrays;
Output:
$VAR1 = [
'def'
];
$VAR2 = [
'jkl'
];
$VAR3 = [
'pqr'
];
But I like keeping related data in a single data structure, so it is best to go back to using a hash. This does require an auxiliary array, however:
my %data;
my @keys = qw( first second third );
if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
push @{ $data{$keys[$_]} }, $captured[$_] for 0 .. $#keys;
}
Or, if the names of the variables really are first, second etc, or if the names of the buffers don't matter but only order does, you can use:
my @data;
if ( my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
push @{ $data[$_] }, $captured[$_] for 0 .. $#captured;
}
An alternate way of doing it would look like ghostdog74's answer, but using an array that stores hash references:
my @results;
while( $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
my ($key1, $key2, $key3) = ($1, $2, $3);
push @results, {
key1 => $key1,
key2 => $key2,
key3 => $key3,
};
}
# do something with it
foreach my $result (@results) {
print "$result->{key1}, $result->{key2}, $result->{key3}\n";
}
with the main advantage here of using a single data structure, AND having a nice readable loop.
@OP, when parentheses are captured, you can use the variables $1, $2, ... These are backreferences.
$string="zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss";
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/isg) {
print "$1 $2 $3\n";
}
output
$ perl perl.pl
def jkl pqr
def jkl pqr
You could have three different regex's each focusing on specific groups. Obviously, you would like to just assign different groups to different arrays in the regex, but I think your only option is to split the regex up.
You can write a regex containing named capture groups. You do this with the ?<myvar> construct at the beginning of the capture group:
/(?<myvar>[0-9]+)/
You may then refer to those named capture groups using a $+{myvar} form.
Here is a contrived example:
perl -ne 'print "$+{myvar}\n" if /^systemd-(?<myvar>[^:]+)/' /etc/passwd
Given a typical password file, it pulls out the systemd users and returns the names less the systemd prefix. It uses a capture group named myvar. This is just an example thrown together to illustrate the use of capture group variables.
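Named captures also combine with a /g loop, which comes close to what the original question asked for: one list per group, filled across every match. This is a sketch only. The test string and the capture names first, second and third are assumptions, and named captures need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Made-up input containing two matches, one of them in upper case.
my $string = 'zzzabcdefghijklmnopqrsssszzzABCDEFGHIJKLMNOPQRssss';

my %found;    # capture name => list of matched texts
while ($string =~ /abc(?<first>def)ghi(?<second>jkl)mno(?<third>pqr)/ig) {
    # %+ holds the named captures of the match that just succeeded
    push @{ $found{$_} }, $+{$_} for keys %+;
}

print "$_: @{ $found{$_} }\n" for sort keys %found;
```

Each key of %found ends up holding every value its group matched, in the order the matches occurred in the string.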

Resources