Retrieve unique values from column based on value from other column - arrays

I have a table like this
symbol length id
A 10 id_1
A 15 id_2
A 15 id_3
B 20 id_4
B 25 id_5
... ... ...
I want to print the following in a new table
symbol length id
A 15 id_2; id_3
B 25 id_5
... ... ...
So I want to loop through the symbol column. When there are duplicate values in this column, I want to print the line where the numeric length-value is the greatest (example: symbol B). When the greatest length values are equal, I want to merge the values in the idcolumn (example: symbol A) and print this new line.
How should I do this in perl?

The tool in perl for coalescing duplicates is a hash. Hashes are key-value pairs, but the useful part is - the value can be an array (reference).
I'd be suggesting something like this:
#!/usr/bin/perl
use strict;
use warnings;
my %length_of;
my %ids_of;
my $heading_row = <DATA>;
while (<DATA>) {
my ( $symbol, $length, $id ) = split;
if ( not defined $length_of{$symbol} or $length_of{$symbol} < $length ) {
$length_of{$symbol} = $length;
}
push( #{ $ids_of{$symbol}{$length} }, $id );
}
print join( "\t", "symbol", "length", "ids" ), "\n";
foreach my $symbol ( sort keys %ids_of ) {
my $length = $length_of{$symbol};
print join( "\t",
$symbol,
$length,
join( "; ", #{ $ids_of{$symbol}{$length} } ) ),
"\n";
}
__DATA__
symbol length id
A 10 id_1
A 15 id_2
A 15 id_3
B 20 id_4
B 25 id_5
What this is doing is - iterating your data, and saving the highest length value (in %length_of). It's also stashing each of the ids - by symbol and length (in %ids_of). It keeps them all, so this might not be very efficient if you've a lot of data.

Just remember the last symbol and length and accumulate the ids:
#! /usr/bin/perl
use warnings;
use strict;
my ($last_l, $last_s, #i);
sub out {
print "$last_s\t$last_l\t", join(";", #i), "\n"
}
while (<>) {
my ($s, $l, $i) = split;
out() if $last_s and $s ne $last_s;
undef #i if $last_l < $l;
push #i, $i;
$last_s = $s;
$last_l = $l;
}
out();

This approach builds a hash of hashes of arrays by using the values from the symbol and length columns as keys and adding the values from the id column as array references. For the simple dataset you provided such a complex data structure is not really needed, but the approach shown below might be more flexible in cases where data is not sorted.
I use the max function from (List::Util , which is part of the core distribution) to get the maximum length value for each symbol, and Data::Dumper to help visualize things.
use Data::Dumper ;
use List::Util 'max';
use v5.16;
my (%hash, #lines) ;
while ( <DATA>) {
chomp ;
next if $. == 1 ;
push #lines, [ split ] ;
}
for (#lines) {
push #{ $hash{ $_->[0] }{ $_->[1] } }, $_->[2] ;
}
say "This is your %hash:\n", Dumper \%hash;
for my $symbol ( keys %hash ) {
my $max = max ( keys $hash{$symbol} ) ;
say "$symbol \t", "$max \t", join "; ", #{ $hash{$symbol}{$max} };
}
__DATA__
symbol length id
A 10 id_1
A 15 id_2
A 15 id_3
B 20 id_4
B 25 id_5
Output:
This is your %hash:
$VAR1 = {
'A' => {
'10' => [
'id_1'
],
'15' => [
'id_2',
'id_3'
]
},
'B' => {
'25' => [
'id_5'
],
'20' => [
'id_4'
]
}
};
A 15 id_2; id_3
B 25 id_5

Related

Read space delimited text file into array of hashes [Perl]

I have text file that matches the following format:
1 4730 1031782 init
4 0 6 events
2190 450 0 top
21413 5928 1 sshd
22355 1970 2009 find
And I need to read it into a data structure in perl that will allow me to sort and print according to any of those columns.
From left to right the columns are process_id, memory_size, cpu_time and program_name.
How can I read a text file with formatting like that in a way that allows me to sort the data structure and print it according to the sort?
My attempt so far:
my %tasks;
sub open_file{
if (open (my $input, "task_file" || die "$!\n")){
print "Success!\n";
while( my $line = <$input> ) {
chomp($line);
($process_id, $memory_size, $cpu_time, $program_name) = split( /\s/, $line, 4);
$tasks{$process_id} = $process_id;
$tasks{$memory_size} = $memory_size;
$tasks{$cpu_time} = $cpu_time;
$tasks{$program_name} = $program_name;
print "$tasks{$process_id} $tasks{$memory_size} $tasks{$cpu_time} $tasks{$program_name}\n";
}
This does print the output correctly, however I can't figure out how to then sort my resulting %tasks hash by a specific column (i.e. process_id, or any other column) and print the whole data structure in a sorted format.
You're storing the values under keys that are equal to the values. Use Data::Dumper to inspect the structure:
use Data::Dumper;
# ...
print Dumper(\%tasks);
You can store the pids in a hash of hashes, using the value of each column as the inner key.
#!/usr/bin/perl
use strict;
use warnings;
use feature qw{ say };
my #COLUMNS = qw( memory cpu program );
my %sort_strings = ( program => sub { $a cmp $b } );
my (%process_details, %sort);
while (<DATA>) {
my ($process_id, $memory_size, $cpu_time, $program_name) = split;
$process_details{$process_id} = { memory => $memory_size,
cpu => $cpu_time,
program => $program_name };
undef $sort{memory}{$memory_size}{$process_id};
undef $sort{cpu}{$cpu_time}{$process_id};
undef $sort{program}{$program_name}{$process_id};
}
say 'By pid:';
say join ', ', $_, #{ $process_details{$_} }{#COLUMNS}
for sort { $a <=> $b } keys %process_details;
for my $column (#COLUMNS) {
say "\nBy $column:";
my $cmp = $sort_strings{$column} || sub { $a <=> $b };
for my $value (sort $cmp keys %{ $sort{$column} }
) {
my #pids = keys %{ $sort{$column}{$value} };
say join ', ', $_, #{ $process_details{$_} }{#COLUMNS}
for #pids;
}
}
__DATA__
1 4730 1031782 init
4 0 6 events
2190 450 0 top
21413 5928 1 sshd
22355 1970 2009 find
But if the data aren't really large and the sorting isn't time critical, just sorting the whole array of arrays by a given column is much easier to write and read:
#!/usr/bin/perl
use strict;
use feature qw{ say };
use warnings;
use enum qw( PID MEMORY CPU PROGRAM );
my #COLUMN_NAMES = qw( pid memory cpu program );
my %sort_strings = ((PROGRAM) => 1);
my #tasks;
push #tasks, [ split ] while <DATA>;
for my $column_index (0 .. $#COLUMN_NAMES) {
say "\nBy $COLUMN_NAMES[$column_index]:";
my $sort = $sort_strings{$column_index}
? sub { $a->[$column_index] cmp $b->[$column_index] }
: sub { $a->[$column_index] <=> $b->[$column_index] };
say "#$_" for sort $sort #tasks;
}
__DATA__
...
You need to install the enum distribution.
I can't figure out how to then sort my resulting %tasks hash by a specific column
You can't sort a hash. You need to convert each of your input rows in a hash (which you're doing successfully) and then store all of those hashes in an array. You can then print the contents of the array in a sorted order.
This seems to do what you want:
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my #cols = qw[process_id memory_size cpu_time program_name];
#ARGV or die "Usage: $0 [sort_order]\n";
my $sort = lc shift;
if (! grep { $_ eq $sort } #cols ) {
die "$sort is not a valid sort order.\n"
. "Valid sort orders are: ", join('/', #cols), "\n";
}
my #data;
while (<DATA>) {
chomp;
my %rec;
#rec{#cols} = split;
push #data, \%rec;
}
if ($sort eq $cols[-1]) {
# Do a string sort
for (sort { $a->{$sort} cmp $b->{$sort} } #data) {
say join ' ', #{$_}{#cols};
}
} else {
# Do a numeric sort
for (sort { $a->{$sort} <=> $b->{$sort} } #data) {
say join ' ', #{$_}{#cols};
}
}
__DATA__
1 4730 1031782 init
4 0 6 events
2190 450 0 top
21413 5928 1 sshd
22355 1970 2009 find
I've used the built-in DATA filehandle to make the code simpler. You would need to replace that with some code to read from an external file.
I've used a hash slice to simplify reading the data into a hash.
The column that you want to sort by is passed into the program as a command-line argument.
Note that you have to sort the last column (the program name) using string comparison and all other columns using numeric comparison.
This decides how to sort using the first argument the script receives.
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
open my $fh, '<', 'task_file';
my #tasks;
my %sort_by = (
process_id=>0,
memory_size=>1,
cpu_time=>2,
program_name=>3
);
my $sort_by = defined $sort_by{defined $ARGV[0]?$ARGV[0]:0} ? $sort_by{$ARGV[0]} : 0;
while (<$fh>) {
push #tasks, [split /\s+/, $_];
}
#tasks = sort {
if ($b->[$sort_by] =~ /^[0-9]+$/ ) {
$b->[$sort_by] <=> $a->[$sort_by];
} else {
$a->[$sort_by] cmp $b->[$sort_by];
}
} #tasks;
for (#tasks) {
say join ' ', #{$_};
}

Add all values in array for each ID in Perl

I have this table:
NAME |12/31/2016|VALUE
AAA |1/31/2017 |10
AAA |2/1/2017 |20
AAA |2/2/2017 |30
AAA |2/3/2017 |40
AAA |2/4/2017 |50
NAME |2/9/2017 |VALUE
BBB |2/10/2017 |20
BBB |2/11/2017 |30
BBB |2/12/2017 |40
BBB |2/13/2017 |50
BBB |2/14/2017 |60
and this would be my desired output:
NAME |DATE |VALUE
AAA |12/31/2016 |150
AAA |1/31/2017 |140
AAA |2/1/2017 |120
NAME |DATE |VALUE
BBB |2/9/2017 |200
BBB |2/10/2017 |180
BBB |2/11/2017 |150
What I want to do is, for each of the valid symbols, (AAA, BBB) I want to have three rows.
For the first row of each column, I want all the values added,
For example, row 1 value for AAA:
10+20+30+40+50 = 150
then for row 2 I want to just add from the second value to the last.
For example row 2 value for AAA
20+30+40+50 = 140
and so on same goes for BBB.
I want to shift the dates down so that 12/31/2016 would match AAA, then get the first three dates for each row.
I currently have this code. but this doesn't do much. it just gives me a bunch of numbers.
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
use Data::Dumper;
sub uniq {
my %seen;
grep !$seen{$_}++, #_;
}
my %cashflow;
my %fields = (
ID => 0,
DATES => 1,
VALUE => 2,
);
my #total;
my #IDs;
my #uniqueIDs;
my #dates;
my #add;
my $i = 0;
my #values;
my $counter = 3;
open( FILE, "try.CSV" );
while ( my $line = <FILE> ) {
chomp( $line );
my #lineVals = split( /\|/, $line );
if ( $lineVals[ $fields{ID} ] !~ /^SYMBOL$/i ) {
push #IDs, $lineVals[ $fields{ID} ];
}
#uniqueIDs = uniq( #IDs );
#push all CASH FLOW AMOUNTS to #cashflow
if ( looks_like_number( $lineVals[ $fields{VALUE} ] ) ) {
$lineVals[ $fields{VALUE} ] =~ s/\r//;
push #total, $lineVals[ $fields{VALUE} ];
}
if ( $lineVals[ $fields{DATES} ] =~ /(\d{1,2})\/(\d{1,2})\/(\d{4})/ ) {
$lineVals[ $fields{DATES} ] = sprintf( '%04d%02d%02d', $3, $2, $1 );
}
$cashflow{ uc $lineVals[ $fields{ID} ] }{DATES} = $lineVals[ $fields{DATES} ];
$cashflow{ uc $lineVals[ $fields{ID} ] }{VALUE} = $lineVals[ $fields{VALUE} ];
foreach my $ID ( #uniqueIDs ) {
foreach my $symb ( keys %cashflow ) {
if ( $ID = $symb ) {
if ( looks_like_number( $lineVals[ $fields{VALUE} ] ) ) {
$lineVals[ $fields{VALUE} ] =~ s/\r//;
push #total, $lineVals[ $fields{VALUE} ];
my $i = 0;
my $grand = 0;
foreach my $val ( #total ) {
while ( $i < $counter ) {
$grand += $val;
print "$grand \n";
$i++;
}
shift #total;
}
}
}
}
}
}
close FILE;
I'm really stuck with this. I don't know what to do with the problem.
A possible solution:
#!perl
use strict;
use warnings;
sub trim {
my ($str) = #_;
s!\A\s+!!, s!\s+\z!! for $str;
$str
}
my $file = 'try.CSV';
open my $fh, '<', $file or die "$0: $file: $!\n";
my ($group_name, #dates, #values);
my $sum = 0;
my $print_group = sub {
return if !defined $group_name;
my $format = " %-6s|%-11s|%s\n";
printf $format, 'NAME', 'DATE', 'VALUE';
for my $date (#dates) {
printf $format, $group_name, $date, $sum;
$sum -= shift #values if #values;
}
};
while (my $line = readline $fh) {
my ($name, $date, $value) = map trim($_), split /\|/, $line;
if ($name eq 'NAME') {
$print_group->();
$group_name = undef;
#dates = $date;
#values = ();
$sum = 0;
next;
}
$group_name ||= $name;
push #dates, $date if #dates < 3;
push #values, $value if #values < 2;
$sum += $value;
}
$print_group->();
Let's go over it.
sub trim {
my ($str) = #_;
s!\A\s+!!, s!\s+\z!! for $str;
$str
}
A helper function for removing leading/trailing whitespace from a string. We're using ! as the s delimiter here because / breaks SO's syntax highlighting. Shrug.
my $file = 'try.CSV';
open my $fh, '<', $file or die "$0: $file: $!\n";
Open our input file. Note: We use a lexical variable ($fh) instead of a bareword filehandle, and we use 3-argument open. This is strongly recommended. We also check open's return value and produce a nice error message in case of failure, including both the name of the file that couldn't be opened ($file) and the reason for failing ($!).
my ($group_name, #dates, #values);
my $sum = 0;
We set up some state variables that we want to preserve across loop iterations. $group_name is the name of the group we're currently processing, #dates is the saved dates we've seen so far, #values is the saved values we've seen so far. $sum is a running sum of all the values in the current group, and it starts at 0.
my $print_group = sub {
return if !defined $group_name;
my $format = " %-6s|%-11s|%s\n";
printf $format, 'NAME', 'DATE', 'VALUE';
for my $date (#dates) {
printf $format, $group_name, $date, $sum;
$sum -= shift #values if #values;
}
};
A helper function for printing the output for a single group. If $group_name isn't set, we haven't processed any input for the current group yet, so we do nothing and return. Otherwise we print a NAME | DATE | VALUE header, followed by a row of data for each element in #dates. For each $date we output the current group name (e.g. AAA), $date, and the sum of values (all nicely formatted using printf). Initially $sum is the sum of all group values, but after the first iteration we start subtracting the values from #values: If the list of values in the input was x1, x2, x3, x4, ..., then $sum is initially x1 + x2 + x3 + x4 + ..., and that's what's printed in the first line of output. After that we subtract x1, so the next line gets x1 + x2 + x3 + x4 + ... - x1, which is x2 + x3 + x4 + .... After that we subtract x2, so the third row of data gets x3 + x4 + ....
while (my $line = readline $fh) {
my ($name, $date, $value) = map trim($_), split /\|/, $line;
Our main loop. We read a line of input, split it on |, and trim each field.
if ($name eq 'NAME') {
$print_group->();
$group_name = undef;
#dates = $date;
#values = ();
$sum = 0;
next;
}
If $name is 'NAME', this is the start of a new group. Print the output for the current group if any ($print_group->() does nothing if there is no current group), then reset our state variables back to initial values, except for #dates, which is filled with the $date value from the header row. Then start the next iteration of the loop because we're done with this line.
$group_name ||= $name;
push #dates, $date if #dates < 3;
push #values, $value if #values < 2;
$sum += $value;
If we get here, this line is not the start of a new group. We set $group_name if it hasn't been set yet. We add $date to our list of saved dates (but we only need 3 dates, so do nothing if we already have 3). We add $value to our list of saved values (but we only need 2 of them). Finally we add $value to our total $sum within the group.
}
$print_group->();
At the end of the loop we've also just finished processing a group, so we need to call $print_group here as well.
This will do as you ask. It reads the whole data file into an array of arrays and manipulates that array before printing it. The blocks are processed backwards from the end so that the other blocks remain in place when the trailing lines are deleted
This program expects the path to the input file as a parameter on the command line and writes the result to STDOUT
use strict;
use warnings 'all';
my #data = map [ /[^|\s]+/g ], <>;
# Make a list of the indices of all the header rows
my #headers = grep { $data[$_][0] eq 'NAME' } 0 .. $#data;
# Make a list of the indices of the first
# and last lines of all the data blocks
my #blocks = map {
[
$headers[$_] + 1,
$_ == $#headers ? $#data : $headers[$_+1] - 1
]
} 0 .. $#headers;
# Shift the second column down
# Replace the col2 header with 'DATE'
#
$data[$_][1] = $data[$_-1][1] for reverse 1 .. $#data;
$data[$_][1] = 'DATE' for #headers;
# Edit each block of data
#
for my $block ( reverse #blocks ) {
my ( $beg, $end ) = #$block;
# Calculate the block total
my $total = 0;
for ( $beg ... $end ) {
$total += $data[$_][2];
}
# Calculate the first three data values
for my $i ( $beg .. $beg + 2 ) {
my $next = $total - $data[$i][2];
$data[$i][2] = $total;
$total = $next;
}
# Remove everything except those three lines
splice #data, $beg+3, $end-$beg-2;
}
print join('|', #$_), "\n" for #data;
output
NAME|DATE|VALUE
AAA|12/31/2016|150
AAA|1/31/2017|140
AAA|2/1/2017|120
NAME|DATE|VALUE
BBB|2/9/2017|200
BBB|2/10/2017|180
BBB|2/11/2017|150

Print hash of arrays as colums to file

Is there a way to print a hash of arrays to a file with the keys to the hash as header and the values of the arrays as tab ( or anything else ) delimited columns ?
My last try was something like this:
foreach my $key(sort keys %outHash){
my $temp1 = join ("\n",#{$outHash{$key}});
my $temp2 = $key."\n".$temp1;
print OUTPUT "$temp2\t";
print OUTPUT "\n";
}
Which produces just horrible output.
Any help would be greatly appreciated !
Thank you.
If I understand you correctly - you have a hash of arrays, that you want to print in columns.
To do that:
#!/usr/bin/env perl
use strict;
use warnings;
my %test = (
a => [ 1, 2, 3 ],
b => [ 4, 5, 6, 7 ],
);
my #header = sort keys %test;
print join (",", #header), "\n";
while ( map {#$_} values %test ) {
my #row;
push( #row, shift #{ $test{$_} } // '' ) for #header;
print join (",", #row ), "\n";
}
But you'd probably better off with a different data structure, that's got your data organise row-wise, and iterate row by row using a hash slice instead.
You can try this:
use feature qw(say);
use strict;
use warnings;
my %out_hash = (
a => [1,2,3],
b => [3,4,5],
);
my #keys = sort keys %out_hash;
say join "\t", #keys;
my $N = 3;
for my $i (0 .. ($N-1)) {
my #row;
for my $key (#keys) {
push #row, $out_hash{$key}->[$i];
}
say join "\t", #row;
}
Output:
a b
1 3
2 4
3 5

Perl Compare hash of arrays with another array

I am trying to compare all the array values (complete array) with a hash's value(which is array) and if the match founds,then push the key of hash to new array.
The below code compare if the hash value is not array but how can I compare if its array?
%hash=(
storeA=>['milk','eggs'],
storeB=>['milk','fruits','eggs','vegetables'],
storeC=>['milk','fruits','eggs'],
);
#array = (
'fruits',
'milk',
'eggs'
);
Code to compare
use strict;
use warnings;
use Data::Dumper;
foreach my $thing (#array) {
foreach ((my $key, my $value) = each %hash) {
if ($value eq $thing) {
push #new_array, $key;
}
}
}
print Dumper(\#new_array);
Expected Output
#new_array = (
storeB,
storeC
);
You could also use a combination of all and any form List::Util :
while ((my $key, my $value) = each %hash) {
if ( all { my $temp = $_; any { $_ eq $temp } #$value } #array ) {
push #new_array, $key;
}
}
So here you are looking for the case where all the elements of #array exists in the given array from the hash.
I would build a hash out of each store's stock array. It's a wasteful method, but not egregiously so as long as the real data isn't enormous
Like this. The inner grep statement counts the number of items in #list that are available at this store and compares it to the number of items in the list, returning true if everything is in stock
If this is a real situation (I suspect it's homework) then for all practical purposes that I can think of, the %stocks hash should contain hashes of the items available at each store
use strict;
use warnings 'all';
my %stocks = (
storeA => [ qw/ milk eggs / ],
storeB => [ qw/ milk fruits eggs vegetables / ],
storeC => [ qw/ milk fruits eggs / ],
);
my #list = qw/ fruits milk eggs /;
my #stores = grep {
my %stock = map { $_ => 1 } #{ $stocks{$_} };
grep($stock{$_}, #list) == #list;
} keys %stocks;
use Data::Dump;
dd \#stores;
output
["storeB", "storeC"]
Find the intersection of the two sets, if the number of its elements is the number of the elements in the array, you want to store the key:
#!/usr/bin/perl
use warnings;
use strict;
sub intersect {
my ($arr1, $arr2) = #_;
my %intersection;
$intersection{$_}{0}++ for #$arr1;
$intersection{$_}{1}++ for #$arr2;
return grep 2 == keys %{ $intersection{$_} }, keys %intersection
}
my %hash = (
storeA => [qw[ milk eggs ]],
storeB => [qw[ milk fruits eggs vegetables ]],
storeC => [qw[ milk fruits eggs ]],
);
my #array = qw( fruits milk eggs );
my #new_array;
while (my ($store, $arr) = each %hash) { # while, not for!
push #new_array, $store if #array == intersect(\#array, $arr);
}
use Data::Dumper;
print Dumper(\#new_array);
Simply try this. One small trick i done here. grep was use to filter the element from an array.
I created the variable $joined_array which contain the | separated #array data. Then i pass the variable into the grep.
And the trick is, when the array is compare with a scalar data, the comparison is behave, the total number of an array element with scalar data.
my #array = qw(one two three);
if(#array == 3)
{
print "Hi\n";
}
Here condition is internally run as 3 == 3.
That the same logic i done here.
use warnings;
use strict;
my %hash=(
"storeA"=>['milk','eggs'],
"storeB"=>['milk','fruits','eggs','vegetables'],
"storeC"=>['milk','fruits','eggs'],
);
my #array = (
'fruits',
'milk',
'eggs'
);
my #new_array;
my $joined_array = join("|",#array);
foreach (keys %hash)
{
push(#new_array,$_) if ((grep{ /\b$joined_array\b/ } #{$hash{$_}}) >= scalar #array);
}
print "$_\n" for #new_array
Go through all stores (keys) and for each check whether all array elems are in the key's array-ref.
use strict;
use warnings;
my %inventory = (
storeA => ['milk','eggs'],
storeB => ['milk','fruits','eggs','vegetables'],
storeC => ['milk','fruits','eggs'],
);
my #items = ('fruits', 'milk', 'eggs');
my #found;
foreach my $store (keys %inventory) {
push #found, $store
if #items == grep { in_store($_, $inventory{$store}) } #items
}
sub in_store { for (#{$_[1]}) { return 1 if $_[0] eq $_ }; return 0; }
print "#found\n"; # prints: storeB storeC
The grep block checks for each item whether it is (available) in the store, and if the number of those that pass is equal to the number of items (array size) that store has all items and is added. Note that a subroutine returns the last value evaluated without an explicit return, so the final return is not needed. It was added for clarity.

How to create hash with duplicate keys

Now i am modifying the code a little
I am using the code for creating hash haivng duplicate keys. Its giving the syntax error.
use strict;
use warnings;
my $s = "12 A P1
23 B P5
24 C P2
15 D P1
06 E P5";
my $hash;
my #a = split(/\n/, $s);
foreach (#a)
{
my $c = (split)[2];
my $d = (split)[1];
my $e = (split)[0];
push(#{$hash->{$c}}, $d);
}
print Dumper($hash );
i am getting the output as
$VAR1 = {
'P5' => [
'B',
'E'
],
'P2' => [
'C'
],
'P1' => [
'A',
'D'
]
};
But i want the output like
$VAR1 = {
'P5' => {
'E' => '06',
'B' => '23'
},
'P2' => {
'C' => '24'
},
'P1' => {
'A' => '12',
'D' => '15'
}
};
How to do that
You hash declaration is incorrect, it should be:
my %hash = ();
or simply:
my %hash;
Then the rest of your code is both too complex and incorrect.
foreach (#a) {
my ($k, $v) = (split);
push #{$hash{$k}}, $v;
}
should be enough. See Autovivification for why this works.
With your code, the first time you see a key, you set $hash{$k} to be a scalar. You can't then push things to that key - it needs to be an array to begin with.
The if (-e $hash{$c}) test is wrong. -e is a file existence test. If you want to know if a hash key exists, use:
if (exists $hash{$c}) { ... }
And print %hash; won't do what you expect (and print %{$hash}; is invalid). You'll get a prettier display if you do:
use Data::Dumper;
print Dumper(\%hash);
(Great debugging too, this Data::Dumper.)
Perl is telling you exactly what is wrong. You have used the strict pragma, so using the %hash variable without declaring it is a syntax error. While the string %hash does not appear in your code, the string $hash{...} does, on each of the problem lines. This is the syntax to access an element of the %hash, which is why strict is complaining.
You have declared the variable $hash, so accessing an element of the contained hash reference is written $$hash{...} or $hash->{...}. Fix the problem lines to access the correct variable and the code will compile.
%hash is a hash, and $hash is a scalar (a hash reference, like \%hash ), they are two different variables
To refer to $hash, to refer to the hash whose reference is stored in the scalar variable $hash, you either have to use $hash->{$c} or $$hash{$c}
See References quick reference
update:
#!/usr/bin/perl --
use strict; use warnings;
use Data::Dumper;
my $s = "P1 26
P5 23
P2 24
P1 15
P5 06 ";
my $hash = {};
for my $line ( split /[\r\n]+/, $s ) {
my( $c, $d ) = split ' ', $line;
push #{ $hash->{$c} }, $d;
}
print Dumper( $hash );
__END__
$VAR1 = {
'P5' => [
'23',
'06'
],
'P2' => [
'24'
],
'P1' => [
'26',
'15'
]
};
See the working code, the fixed errors (comments in the code), and the resulting output:
use strict;
use warnings;
my $s = "P1 26
P5 23
P2 24
P1 15
P5 06 ";
my %hash; #my $hash ={};
#my $arr = [];
my #a = split(/\n/, $s);
foreach (#a)
{
my $d = (split)[1];
my $c = (split)[0];
push(#{$hash{$c}}, $d); #if ...
}
while (my ($key, $value) = each(%hash)) #print %{$hash};
{
print "$key #{$value}\n";
}
#Output:
#P5 23 06
#P2 24
#P1 26 15
(Strange. Out of all the answers posted so far, none has actually answered the question...)
The code below produces the result asked for. The fundamental bit which seems to be missing from the original code is the two-level hash.
As an aside, there seems to be no reason for the outer hash to be a hashref and not a hash, so I made it a hash. Also you can pick out the split into variables in one line.
use strict;
use warnings;
use Data::Dumper;
my $s = "12 A P1
23 B P5
24 C P2
15 D P1
06 E P5";
my %hash;
my #a = split(/\n/, $s);
foreach (#a)
{
my ($e, $d, $c) = (split);
$hash{$c}{$d} = $e;
}
print Dumper(\%hash);

Resources