How do I consolidate a hash in Perl? - arrays

I have an array of hash references. The hashes contain 2 keys, USER and PAGES. The goal here is to go through the array of hash references and keep a running total of the pages that the user printed on a printer (this comes from the event logs). I pulled the data from an Excel spreadsheet and used regexes to pull the username and pages. There are 182 rows in the spreadsheet and each row contains a username and the number of pages they printed on that job. Currently the script can print each print job (all 182) with the username and the pages they printed but I want to consolidate this down so it will show: username 266 (i.e. just show the username once, and the total number of pages they printed for the whole spreadsheet).
Here is my attempt at going through the array of hash references, seeing if the user already exists and if so, += the number of pages for that user into a new array of hash references (a smaller one). If not, then add the user to the new hash ref array:
my $criteria = "USER";
my @sorted_users = sort { $a->{$criteria} cmp $b->{$criteria} } @user_array_of_hash_refs;
my @hash_ref_arr;
my $hash_ref = \@hash_ref_arr;
foreach my $index (@sorted_users)
{
my %hash = (USER=>"",PAGES=>"");
if(exists $index{$index->{USER}})
{
$hash{PAGES}+=$index->{PAGES};
}
else
{
$hash{USER}=$index->{USER};
$hash{PAGES}=$index->{PAGES};
}
push(@hash_ref_arr,{%hash});
}
But it gives me an error:
Global symbol "%index" requires explicit package name at ...
Maybe my logic isn't the best on this. Should I use arrays instead? It seems as though a hash is the best thing here, given the nature of my data. I just don't know how to go about slimming the array of hash refs down to just get a username and the total pages they printed (I know I seem redundant but I'm just trying to be clear). Thank you.

my %totals;
$totals{$_->{USER}} += $_->{PAGES} for @user_array_of_hash_refs;
And then, to get the data out:
print "$_ : $totals{$_}\n" for keys %totals;
You could sort by usage too:
print "$_ : $totals{$_}\n" for sort { $totals{$a} <=> $totals{$b} } keys %totals;
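If the consolidated result needs to be an array of hash references again, as the question describes, a minimal sketch could look like the following. The sample rows and the name @consolidated are assumptions made for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the spreadsheet rows.
my @user_array_of_hash_refs = (
    { USER => 'tom',  PAGES => 5 },
    { USER => 'mary', PAGES => 2 },
    { USER => 'tom',  PAGES => 3 },
);

# Sum the pages per user.
my %totals;
$totals{ $_->{USER} } += $_->{PAGES} for @user_array_of_hash_refs;

# Turn the totals back into one hash reference per user, sorted by name.
# The leading + tells Perl that the inner braces build an anonymous hash.
my @consolidated = map { +{ USER => $_, PAGES => $totals{$_} } } sort keys %totals;

print "$_->{USER} : $_->{PAGES}\n" for @consolidated;
```

The hash does the counting, and the array of hash references is only rebuilt at the end, once every row has been seen.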

As mkb mentioned, the error is in the following line:
if(exists $index{$index->{USER}})
However, after reading your code, your logic is faulty. Simply correcting the syntax error will not provide your desired results.
I would recommend skipping the temporary hash within the loop. Just work with a results hash directly.
For example:
#!/usr/bin/perl
use strict;
use warnings;
my @test_data = (
{ USER => "tom", PAGES => "5" },
{ USER => "mary", PAGES => "2" },
{ USER => "jane", PAGES => "3" },
{ USER => "tom", PAGES => "3" }
);
my $criteria = "USER";
my @sorted_users = sort { $a->{$criteria} cmp $b->{$criteria} } @test_data;
my %totals;
for my $index (@sorted_users) {
if (not exists $totals{$index->{USER}}) {
# initialize total for this user
$totals{$index->{USER}} = 0;
}
# add to user's running total
$totals{$index->{USER}} += $index->{PAGES}
}
print "$_: $totals{$_}\n" for keys %totals;
This produces the following output:
$ ./test.pl
jane: 3
tom: 8
mary: 2

The error comes from this line:
if(exists $index{$index->{USER}})
The $ sigil in Perl 5 with {} after the name means that you are getting a scalar value out of a hash. There is no hash declared by the name %index. I think that you probably just need to add a -> operator so the problem line becomes:
if(exists $index->{$index->{USER}})
but not having the data makes me unsure.
Also, good on you for using use strict or you would be instantiating the %index hash silently and wondering why your results didn't make any sense.
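To make the sigil rule above concrete, here is a small sketch that contrasts an element of a real hash with an element reached through a hash reference. The names %index_hash and $index_ref are made up for this illustration.

```perl
use strict;
use warnings;

my %index_hash = ( tom => 5 );    # a real hash (hypothetical name)
my $index_ref  = { tom => 5 };    # a scalar holding a hash reference

# Element of the hash %index_hash: $ sigil, braces, no arrow.
my $from_hash = $index_hash{tom};

# Element of the hash that $index_ref points to: the arrow is required.
my $from_ref = $index_ref->{tom};

print "$from_hash $from_ref\n";
```

Writing $index_ref{tom} without the arrow would refer to a hash named %index_ref, which does not exist here, so use strict would reject it with the same kind of "Global symbol" error as in the question.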

my %total;
for my $name_pages_pair (@sorted_users) {
$total{$name_pages_pair->{USER}} += $name_pages_pair->{PAGES};
}
for my $username (sort keys %total) {
printf "%20s %6u\n", $username, $total{$username};
}

Related

Perl ... create horizontal children of a %hash using @array items

I've been banging my head on this awhile and searched many ways. I'm sure this is going to boil down to being really basic.
I have data in an @array that I want to move to a tree in a %hash.
This might be something more appropriate to JSON? But I haven't delved into it before and I don't need to save out/restore this information.
Desire:
Create a dependent tree of USB devices that can nest under each other that can track the end point (deviceC) through a hub (deviceB) and finally the root (deviceA).
Example:
Simplified (I hope ... this isn't from the actual longer script):
I want to convert an array in this format:
my @array = ['deviceA','deviceB','deviceC'];
to multidimensional hashes equal to:
my %hash = ('deviceA' => { 'deviceB' => { 'deviceC' => '' } } )
that would dump like:
$VAR1 = {
'deviceA' => {
'deviceB' => {
'deviceC' => ''
}
}
};
For just looking at a single device this isn't necessary, but I'm building out an IOMMU -> PCI Device -> USB map that contains many devices.
NOTES:
I'm trying to avoid installing CPAN modules so the script is portable to similar systems (Proxmox VE)
The last device (deviceC above) has no children
value '' is fine
undef would probably work
mixing the types would work but I need to know how to set that
I will never need to modify or manipulate the hash once created
I don't know the right way to recurse the @array to populate the %hash children.
I want the data horizontal for each USB device
I'd switch to an Object/package but each device can have a different set of children (or none) making it infeasible to know Object names
Some USB devices have no children (root hubs) ... similar to %hash = ('deviceA' => '')
Some have 1 child that is the final device ... similar to %hash = ('deviceA' => { 'deviceB' =>'' } )
Some have multiple steps between the root via additional hub(s) ... similar to %hash = ('deviceA' => { 'deviceB' => { 'deviceC' => '' } } ) or more
Starting point :
This is basic and incomplete but will run:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
# data in from parsing usb device path:
my @array = ['deviceA','deviceB','deviceC'];
# needs to be converted to:
my %hash = ('deviceA' => { 'deviceB' => { 'deviceC' => '' } } );
print "\n\%hash:\n" . Dumper \%hash;
Pseudo-code
This section is NOT working code in any form. I'm just trying to make a note of what I'm thinking. I know the format is wrong, I've tried multiple ways to create this and I'd look even dumber showing all of my attempts :)
I'm very new to refs and I'm not going to try and get that right here. The idea below is:
For each item in @array:
Create a way (either a ref or a copy of the current hash) that can be used next iteration to place the next child
Attach item as a child of the previous iteration with an empty value (that can be appended if there is further iteration)
my @array = ['deviceA','deviceB','deviceC'];
my %hash = {};
my %trackref;
for (@array) {
%trackref = %hash; # a copy of the existing that won't change when %hash updates
$hash{last_child} ::append_child:: $_;
}
You're actually pretty close, but it seems that you need to understand references a bit better. perldoc perlref is probably a good starting point to understand references.
A few mistakes in your code, before looking at the solution:
my @array = [ ... ];: [] creates an arrayref, not an array, which means that @array actually stores a single scalar item: a reference to another array. Use () to initialize an array: my @array = ( ... );.
my %hash = {};: similarly, {} creates a hashref, not a hash. Which means that this line stores a single hashref in %hash, which will cause this warning: Reference found where even-sized list expected at hash.pl line (because a hash contains key-value pairs and you only provided a key). Use () for a simple (i.e., not a hashref) hash. In this case however, you don't need to initialize %hash: my %hash; and my %hash = () do the same thing (that is, create an empty hash).
%trackref = %hash; copies the content of %hash into %trackref. Which means that, contrary to what the name "trackref" implies, %trackref doesn't contain a reference to anything, but a copy of %hash. Use \%hash to create a reference to %hash.
Note that if you already have a hashref, then assigning it to another variable copies the reference. For instance, if you do my $hash1 = {}; my $hash2 = $hash1, then both $hash1 and $hash2 reference the same hash.
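The difference between copying a hash and taking a reference to it can be seen in a short sketch. The variable names %copy and $ref are chosen for this illustration only.

```perl
use strict;
use warnings;

my %hash = ( deviceA => 1 );

my %copy = %hash;     # independent copy of the keys and values
my $ref  = \%hash;    # reference to the very same hash

# Change the original after the copy and the reference were made.
$hash{deviceB} = 2;

my $in_copy = exists $copy{deviceB}  ? 'yes' : 'no';
my $via_ref = exists $ref->{deviceB} ? 'yes' : 'no';

print "copy sees deviceB: $in_copy\n";
print "ref sees deviceB: $via_ref\n";
```

The copy is frozen at the moment of assignment, while the reference always shows the current state of %hash, which is what a tracking variable needs.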
So, fixing those issues in your attempt, we get:
my @array = ('deviceA','deviceB','deviceC');
my %hash;
my $trackref = \%hash;
for my $usb (@array) {
$trackref->{$usb} = {};
$trackref = $trackref->{$usb};
}
print Dumper \%hash;
Which outputs:
$VAR1 = {
'deviceA' => {
'deviceB' => {
'deviceC' => {}
}
}
};
The main change that I did was to replace your $hash{last_child} ::append_child:: $_; by $trackref->{$_} = {};. But the idea remains the same: Attach item as a child of the previous iteration with an empty value to reuse your words.
To help you understand the code a bit better, let's see what happens in the loop step by step:
Before the first iteration, %hash is empty and $trackref references %hash.
In the first iteration, we put deviceA => {} in $trackref (or, more pedantically, we associate {} with the key deviceA in $trackref). Since $trackref references %hash, this puts deviceA => {} in %hash. Then, we store in $trackref this new {} that we just created, which means that $trackref now references $hash{deviceA}.
In the second iteration, we put deviceB => {} in $trackref. $trackref references $hash{deviceA} (which we created in the previous iteration), which means that %hash is now (deviceA => { deviceB => {} }). We then store in $trackref the new {}.
And so on...
You'll note that in the innermost hash, {} is associated with the key deviceC. When iterating over the hash, you can thus know if you are at the end by doing something like if (%$hash) (instead of just if ($hash), which would be enough if this last {} were undef or ''). Let me know if that's an issue: we can add a bit of code to convert this {} into undef (alternatively, you can do it yourself, it will be a good exercise to get used to references)
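One possible way to end the chain with '' instead of {}, as the question asked, is to treat the last array element differently. This is only a sketch of that idea, using the more descriptive names @usb_devices and %usb_devices_tree as stand-ins for @array and %hash.

```perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);

my @usb_devices = ('deviceA', 'deviceB', 'deviceC');
my %usb_devices_tree;
my $trackref = \%usb_devices_tree;

for my $i (0 .. $#usb_devices) {
    my $usb = $usb_devices[$i];
    if ($i == $#usb_devices) {
        $trackref->{$usb} = '';           # last device: no children
    }
    else {
        $trackref->{$usb} = {};           # intermediate device: may get children
        $trackref = $trackref->{$usb};    # descend one level
    }
}

print Dumper \%usb_devices_tree;
```

With a single-element array the loop goes straight to the first branch, which gives the ('deviceA' => '') shape described for root hubs.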
Minor remark: @array and %hash are poor array and hash names, because the @ already indicates an array, and % already indicates a hash. It's possible that you used those names just for this small example for your question, in which case, no problem. However, if you use those names in your actual code, consider changing them to something more explicit... @usb_devices and %usb_devices_tree maybe?

How can I organize data in rows and columns with Perl?

The base problem is that I have lots of datapoints with normalized names that are just dumped from the server into a file, but I need to organize these datapoints into a file with rows and columns automatically, according to the data they contain (indicated in their normalized names).
The original file with all the datapoints comes as follows (these are not the original datapoint tags but rather simplified ones):
temp_r301
airflow_r301
temp_r345
airflow_r345
solar_w
solar_e
...
As you can see, they all come as one column, so there is one tag per row.
And I want to organize them so that for each state ("temp" as in temperature), I have the corresponding information in the same row, such as:
temp_r301 301 airflow_r301 solar_w solar_e #airflow in 301 and general solar radiation affect temperature (state) in room 301
temp_r345 345 airflow_r345 solar_w solar_e #airflow in 345 and general solar radiation affect temperature (state) in room 345
Of course the length of the array can vary, so the idea is to make an algorithm that detects the length and organizes the data accordingly. Also, I am aware I will have to use regular expressions to find the matches and define which datapoints are states and which ones are inputs, as well as to know the room to which they belong.
So far I have tried the following:
use strict;
use warnings;
use diagnostics;
my @transpose = ();
my @sorted = ();
push(@sorted, [qw(temp_r301 temp_r345)]);
push(@sorted, [qw(301 345)]);
push(@sorted, [qw(airflow_r301 airflow_r345 solar_w solar_e)]);
for my $sorted (@sorted) {
for my $column (0 .. $#sorted) {
push(@{$transpose[$column]}, $sorted->[$column]);
}
}
for my $new_row (@transpose) {
for my $new_col (@{$new_row}) {
print "$new_col ";
}
print "\n";
}
But this only works fine if all the arrays have the same length (which is not the case here).
I also discovered a loop that can be used to store data into matrix form (array of arrays), but still, I can't seem to find a solution to write in the matrix the data from different arrays:
use strict;
use warnings;
use diagnostics;
use feature 'say';
my @states = qw(temp_r301 temp_r345);
my @zones = qw(301 345);
my @inputs = qw(airflow_r301 airflow_r345 solar_w solar_e);
my @matrix = ();
for my $x (0 .. $#states) {
for my $y (0 .. $#inputs) {
$matrix[$x][$y] = $states[$x]; #of course this only copies the states array and
} #repeats it for each created array
}
for my $aref (@matrix) { #print array of arrays
say "[ @$aref ],";
}
So, knowing that I have all the data dumped into an input file, what would be the best way to sort that data into a matrix? Is there any loop I should give more attention to? Should I be working with arrays?
Details of this problem are still unclear, while explanations did help. So here is what I'll assume.
I take data to have a piece of information per line. Some contain a tag (description) followed by the room number, and I assume format tag_rN, identifying a room number that the tag applies to.
As for others, that don't have the room number, additional processing is needed to decide where that information belongs. The question puts forth only an example of tags that apply to all rooms, related to solar radiation that affects them (see comments), so that's all that's processed.
The fact that some of the data does not neatly classify with a room is what makes organization of the parsed data non-trivial. Since no details are given I merely split it into two hashes, one by room number and another one whose structure will depend on specifics.
use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd);
my $file = shift // die "Usage: $0 file\n";
open my $fh, '<', $file or die "Can't open $file: $!";
my (%room, %other);
while (<$fh>) {
chomp;
if ( my ($tag, $room_num) = /([^_]+)_r([0-9]+)/ ) {
$room{$room_num}{$tag} = $_; # have room number
}
else { # more processing needed
my ($tag, $value) = parse_line($_);
push @{ $other{$tag} }, $value;
}
}
dd \%room; dd \%other; say '';
# Print in CSV format. Header first
my @tags = ( keys %{ $room{ (keys %room)[0] } }, keys %other );
say join ',', 'room', @tags;
foreach my $rnum (keys %room) {
say join ',',
$rnum, map { $room{$rnum}{$_} // join ' ', @{$other{$_}} } @tags;
}
sub parse_line {
my ($line) = @_;
my ($tag, $value);
if ($line =~ /solar_w|solar_e/) { # example from sample data
$tag = 'solar';
$value = $line;
}
else { } # other possibilities
return $tag, $value;
}
The data with the room number is sorted out by the identifying description ("tag") as a key, with the line being its value. Each such key-value pair is in a hashref assigned to each room number.
The data without the room number is parsed in a separate sub, with just some token code since no details are given. Then that is stored in another hash, for easier manipulation (since it's not tied to any one room).
How tags are extracted from data is a bit arbitrary, since it's not specified in the question.
All this is combined into a CSV format. The above, with the input file from the question and the explanation in comments that the solar radiation from both west and east affects all rooms, prints:
{
301 => { airflow => "airflow_r301", temp => "temp_r301" },
345 => { airflow => "airflow_r345", temp => "temp_r345" },
}
{ solar => ["solar_w", "solar_e"] }
room,airflow,temp,solar
345,airflow_r345,temp_r345,solar_w solar_e
301,airflow_r301,temp_r301,solar_w solar_e
Comment out the line with dd ... (from Data::Dump) to remove the initial diagnostic prints. Then the last few lines are the CSV that would go into some file etc.
Some data may be missing for some rooms, and there is yet more data which may not classify so uniformly. Then the fields for those headers will be merrily empty in some rows, as desired.
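As a sketch of how the CSV step might cope with a room that lacks one of the tags, the following uses in-memory data instead of a file. The contents of %room and %other are hypothetical, and building the header from every room (rather than from the first one) is an assumption of this example, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical parsed data: room 345 has no airflow tag.
my %room = (
    301 => { temp => 'temp_r301', airflow => 'airflow_r301' },
    345 => { temp => 'temp_r345' },
);
my %other = ( solar => [ 'solar_w', 'solar_e' ] );

# Collect every tag seen in any room, so the header does not depend on one room.
my %seen;
$seen{$_} = 1 for map { keys %$_ } values %room;
my @tags = ( sort(keys %seen), sort keys %other );

# Header first, then one row per room in numeric order.
my @rows = join ',', 'room', @tags;
for my $rnum (sort { $a <=> $b } keys %room) {
    push @rows, join ',', $rnum, map {
        # Room-specific value if present, else the shared values, else empty.
        $room{$rnum}{$_} // join ' ', @{ $other{$_} // [] }
    } @tags;
}

print "$_\n" for @rows;
```

The `// []` fallback keeps the dereference safe when a tag exists in neither hash for that room, so the field simply comes out empty.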

Merge Perl hashes into one array and loop through it

I'm creating a Perl plugin for cPanel which has to get all domains in the account of a user and display it in a HTML select field. Originally, I'm a PHP developer, so I'm having a hard time understanding some of the logic of Perl. I do know that cPanel plugins can also be written in PHP, but for this plugin I'm limited to Perl.
This is how I get the data from cPanel:
my @user_domains = $cpliveapi->uapi('DomainInfo', 'list_domains');
@user_domains = $user_domains[0]{cpanelresult}{result}{data};
This is what it looks like using print Dumper @user_domains:
$VAR1 = {
'addon_domains' => ['domain1.com', 'domain2.com', 'domain3.com'],
'parked_domains' => ['parked1.com', 'parked2.com', 'parked3.com'],
'main_domain' => 'main-domain.com',
'sub_domains' => ['sub1.main-domain.com', 'sub2.main-domain.com']
};
I want the data to look like this (thanks @simbabque):
@domains = qw(domain1.com domain2.com domain3.com main-domain.com parked1.com parked2.com parked3.com);
So, I want to exclude sub_domains and merge the others in 1 single-dimensional array so I can loop through them with a single loop. I've struggled the past few days with what sounds like an extremely simple task, but I just can't wrap my head around it.
You need something like this
If you find you have a copy of List::Util that doesn't include uniq then you can either upgrade the module or use this definition
sub uniq {
my %seen;
grep { not $seen{$_}++ } @_;
}
From your dump, the uapi call is returning a reference to a hash. That goes into $cp_response and then drilling down into the structure fetches the data hash reference into $data
delete removes the subdomain information from the hash.
The lists you want are the values of the hash to which $data refers, so I extract those. Those values are references to arrays of strings if there is more than one domain in the list, or simple strings if there is only one
The map converts all the domain names to a single list by dereferencing array references, or passing strings straight through. That is what the ref() ? @$_ : $_ is doing. Finally uniq removes multiple occurrences of the same name
use List::Util 'uniq';
my $cp_response = $cpliveapi->uapi('DomainInfo', 'list_domains');
my $data = $cp_response->{cpanelresult}{result}{data};
delete $data->{sub_domains};
my @domains = uniq map { ref() ? @$_ : $_ } values %$data;
output
parked1.com
parked2.com
parked3.com
domain1.com
domain2.com
domain3.com
main-domain.com
That isn't doing what you think it's doing. {} is the anonymous hash constructor, so you're making a 1 element array, with a hash in it.
You probably want:
use Data::Dumper;
my %user_domains = (
'addon_domains' => ['domain1.com', 'domain2.com', 'domain3.com'],
'parked_domains' => ['parked1.com', 'parked2.com', 'parked3.com'],
'main_domain' => 'main-domain.com',
'sub_domains' => ['sub1.main-domain.com', 'sub2.main-domain.com'],
);
print Dumper \%user_domains;
At which point you can iterate through the array elements with a double loop:
foreach my $key ( keys %user_domains ) {
if ( not ref $user_domains{$key} ) {
print $user_domains{$key},"\n";
next;
}
foreach my $domain ( @{$user_domains{$key}} ) {
print $domain,"\n";
}
}
Or if you really want to 'flatten' your hash:
my @flatten = map { ref $_ ? @$_ : $_ } values %user_domains;
print Dumper \@flatten;
(You need the ref test, because without it, the non-array main-domain won't work properly)
So for the sake of consistency, you might be better off with:
my %user_domains = (
'addon_domains' => ['domain1.com', 'domain2.com', 'domain3.com'],
'parked_domains' => ['parked1.com', 'parked2.com', 'parked3.com'],
'main_domain' => ['main-domain.com'],
'sub_domains' => ['sub1.main-domain.com', 'sub2.main-domain.com'],
);
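With every value stored as an array reference, the ref test is no longer needed. A minimal sketch, assuming a shortened version of the hash above and a hypothetical @wanted list of keys:

```perl
use strict;
use warnings;

my %user_domains = (
    addon_domains  => [ 'domain1.com', 'domain2.com' ],
    parked_domains => [ 'parked1.com' ],
    main_domain    => [ 'main-domain.com' ],
    sub_domains    => [ 'sub1.main-domain.com' ],
);

# Pick the wanted keys explicitly (a hash slice), which both skips
# sub_domains and fixes the order of the result.
my @wanted  = qw(main_domain addon_domains parked_domains);
my @domains = map { @$_ } @user_domains{@wanted};

print "$_\n" for @domains;
```

Naming the keys also avoids depending on the unordered result of values, so the select field would list the domains in the same order on every run.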

Directly access nested JSON data in perl?

I'm not familiar with hash/reference syntax with Perl and it makes my eyes hurt trying.
I have the following JSON:
{
"Arg":"Custom_Light state alias protocol",
"Results": [
{
"Name":"Custom_Light",
"Internals": { },
"Readings": {
"protocol": { "Value":"V3", "Time":"2017-01-14 18:49:18" },
"state": { "Value":"off", "Time":"2017-03-05 10:39:50" }
},
"Attributes": { "alias": "Kitchen light" }
} ],
"totalResultsReturned":1
}
How do I directly get the Reading > Protocol Value and Reading > state Value as well as the Attributes > Alias?
I am using the default JSON encoder/decoder and it works splendid. Using Dumper($json) I get all the JSON, but I have no clue how to directly access it without using foreach with all the arrays within arrays in this.
I have tried the following:
my $json = from_json( $readout, { utf8 => 1 } );
print "No. Entries:", scalar(keys($json)); #works, returns 3
my @results = %$json{Results};
Dumper(@results[1]); #I get the Results array
From here it already is ugly. What's that %$ doing there? I thought I could do something like print ${ $json->{'Results'}->[1] }{'Readings'}; but that leads me nowhere.
Give me wisdom. How do I access the Protocol value directly? How do I access the state value directly? And finally, how to get to the alias Attribute?
I don't know what I'm doing but I'm getting somewhere with my $test = %{${%$json{Results}}[0]}{Name}; #I get "Custom_Light", nice. Is this the way to go with a gazillion of weird % and $ just randomly thrown in?
You want
$json->{Results}[0]{Readings}{protocol}{Value}
$json->{Results}[0]{Readings}{state}{Value}
$json->{Results}[0]{Attributes}{alias}
However, since the Results item is an array, you are likely to want to iterate over all of its elements, although in this case there is only one element
I find it useful to extract one level of reference at a time into temporary variables. It would look like this
my $results = $json->{Results};
for my $result ( @$results ) {
my $readings = $result->{Readings};
my $attributes = $result->{Attributes};
printf "Protocol: %s\n", $readings->{protocol}{Value};
printf "State: %s\n", $readings->{state}{Value};
printf "Alias: %s\n", $attributes->{alias};
print "\n";
}
Have a look at perlreftut, perldsc, and perlref, it will help you understand how to access deeply nested structures in Perl.
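For a version that can be run on its own, here is a sketch that decodes a trimmed-down copy of the JSON from the question and reads the three values. It uses decode_json from the core JSON::PP module rather than the from_json call in the question; that substitution is an assumption made so the example needs no CPAN install.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);    # core module since Perl 5.14

# Trimmed-down copy of the JSON from the question.
my $readout = <<'END_JSON';
{
  "Results": [
    {
      "Name": "Custom_Light",
      "Readings": {
        "protocol": { "Value": "V3" },
        "state": { "Value": "off" }
      },
      "Attributes": { "alias": "Kitchen light" }
    }
  ],
  "totalResultsReturned": 1
}
END_JSON

my $json = decode_json($readout);

# Each {...} or [...] step follows one level of the structure.
my $protocol = $json->{Results}[0]{Readings}{protocol}{Value};
my $state    = $json->{Results}[0]{Readings}{state}{Value};
my $alias    = $json->{Results}[0]{Attributes}{alias};

print "$protocol $state $alias\n";
```

Only the first step needs the arrow; between adjacent subscripts Perl lets you leave it out.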
print "No. Entries:", scalar(keys($json)); #works, returns 3
Actually, this will no longer work. Using keys on a scalar was an experimental feature added in Perl 5.14 that allowed each, keys, push, pop, shift, splice, unshift, and values to be called with a scalar argument. This experiment was considered unsuccessful, and was removed in 5.23. See also Experimental values on scalar is now forbidden. So, you should dereference the hash reference $json before applying keys:
print "No. Entries:", scalar keys %$json;
As described in perlref, %$ref dereferences the hash reference $ref. Next, let's look at this line:
my @results = %$json{Results};
This is a key/value hash slice: it produces the list ( 'Results', $json->{Results} ), which is then assigned to @results. That is why $results[1] gives you the Results array, i.e. the same thing as $json->{Results}.
But this is obscure coding, and probably not intended as well. So to return to your question, to get the Value field you could write:
my $value = $json->{Results}[0]{Readings}{state}{Value};
And to get the alias field:
my $alias = $json->{Results}[0]{Attributes}{alias};
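The difference between the %$json{Results} form from the question and a plain element lookup can be checked with a small sketch. The data here is a hand-built stand-in for the decoded JSON, and the name @pair is made up for the example.

```perl
use v5.20;    # key/value hash slices (%$ref{...}) need Perl 5.20 or later
use warnings;

my $json = { Results => [ { Name => 'Custom_Light' } ], totalResultsReturned => 1 };

# Key/value slice: returns the key AND its value as a flat list.
my @pair = %$json{Results};

# Plain element lookup: returns just the value (an array reference here).
my $results = $json->{Results};

say scalar @pair;                                       # number of items in the slice
say $pair[0];                                           # the key name
say $pair[1] == $results ? 'same array' : 'differs';    # compare the two references
say $results->[0]{Name};
```

The slice carries the key along as element 0, which is why the array in the question only shows up at index 1.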

How to get first n values from perl Hash of arrays

Experts,
I have a hash of array in perl which I want to print the first 2 values.
my %dramatis_personae = (
humans => [ 'hamnet', 'shakespeare', 'robyn', ],
faeries => [ 'oberon', 'titania', 'puck', ],
other => [ 'morpheus, lord of dreams' ],
);
foreach my $group (keys %dramatis_personae) {
foreach (@{$dramatis_personae{$group}}[0..1]) { print "\t$_\n";}
}
The output I get is
"hamnet
shakespeare
oberon
titania
morpheus
lord of dreams"
which is basically first two array values for each key. But I am looking to have the output as:
hamnet
shakespeare
Please advise how I can get this result. Thanks!
Keys of hashes are not ordered, so you should specify the key ordering yourself. Then you can concatenate the arrays for each key in that order and take the first two values from the resulting array. Is that what you want?
print "\t$_\n" foreach (map {(@{$dramatis_personae{$_}})} qw/humans faeries other/)[0..1];
Hashes are unordered, so what you requested to achieve is impossible. Unless you have some knowledge about the keys and the order they should be in, the closest you can get is something that can produce any of the following:
'hamnet', 'shakespeare'
'oberon', 'titania'
'morpheus, lord of dreams', 'hamnet'
'morpheus, lord of dreams', 'oberon'
The following is an implementation that does just that:
my $to_fetch = 2;
my @fetched = ( map @$_, values %dramatis_personae )[0..$to_fetch-1];
The following is a more efficient version for larger structures. It also handles insufficient data better:
my $to_fetch = 2;
my @fetched;
for my $group (values(%dramatis_personae)) {
if (@$group > $to_fetch) {
push @fetched, @$group[0..$to_fetch-1];
$to_fetch = 0;
last;
} else {
push @fetched, @$group;
$to_fetch -= @$group;
}
}
die("Insufficient data\n") if $to_fetch;
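If any repeatable order is acceptable, sorting the keys is one way to make the result the same on every run. This is a sketch under that assumption; alphabetical order is an arbitrary choice here and puts faeries first, not humans.

```perl
use strict;
use warnings;

my %dramatis_personae = (
    humans  => [ 'hamnet', 'shakespeare', 'robyn' ],
    faeries => [ 'oberon', 'titania', 'puck' ],
    other   => [ 'morpheus, lord of dreams' ],
);

my $to_fetch = 2;

# Sorting the keys makes the traversal order repeatable between runs.
my @all = map { @{ $dramatis_personae{$_} } } sort keys %dramatis_personae;

# Clamp the slice so it never reaches past the end of the data.
my $last    = $to_fetch < @all ? $to_fetch - 1 : $#all;
my @fetched = @all[ 0 .. $last ];

print "$_\n" for @fetched;
```

To get hamnet and shakespeare specifically, the key order has to be given explicitly, for example with a fixed list of group names instead of sort keys.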

Resources