Strange behaviour of Perl's push function - arrays

I am writing a dedicated ICS (iCalendar file) parser.
I pass an array to a subroutine. All variables are single values apart from $notdates which is a comma-separated list of dates.
#entryl = ($dtstart, $dtend, $attendee, $lastmod, $uid, $notdates);
&entrytoarray(#entryl);
sub entrytoarray {
# print Dumper #_;
my $shiftdur = (&stamptoepoc($_[1]) - &stamptoepoc($_[0])) / 60 / 60;
my $attendee = $_[2];
my $deleted = $_[5];
$attendee =~ /ATTENDEE;USER-KEY=([^;]*);CN=([^;]*);.*:(.*)/;
my %ehash = (
"STARTDATE" , &stamptodate($_[0]),
"ENDDATE" , &stamptodate($_[1]),
"STARTSTAMP" , $_[0],
"ENDSTAMP" , $_[1],
"USERKEY" , $1,
"CN" , $2,
"EMAIL" , $3,
"LASTMOD" , $_[3],
"UID" , $_[4],
"DURATION" , $shiftdur
);
# Only keep current data
my $fdays = 4;
my $tdays = 7;
chomp(my $curstamp = `TZ="UTC" date -d "$fdays days" +"%Y%m%d%H%M00"`);
chomp(my $stpstamp = `TZ="UTC" date -d "$tdays days" +"%Y%m%d%H%M00"`);
if (($_[0] > $curstamp) && ($_[1] < $stpstamp)) {
if (defined($deleted)) {
my #deleted = split /,/, $deleted;
foreach (#deleted) {
if ($_ ne $_[0]) {
push(#entry, \%ehash);
}
}
}
else {
push(#entry, \%ehash);
}
}
print Dumper #entry;
This works mostly as expected:
$VAR1 = {
'DURATION' => '5',
'STARTSTAMP' => '20141122230000',
'UID' => '20141114T010539Z--1092363482',
'LASTMOD' => '20141118214419',
'STARTDATE' => '2014-11-22 23:00:00',
'EMAIL' => 'xxxxxxxxxxxxx',
'ENDDATE' => '2014-11-23 04:00:00',
'CN' => 'xxxxxxxxxxx',
'ENDSTAMP' => '20141123040000',
'KEY' => 'xxxxxxxxxxxxxxxxxx'
};
$VAR2 = {
'EMAIL' => 'xxxxxxxxxxxxx',
'ENDDATE' => '2014-11-23 23:00:00',
'ENDSTAMP' => '20141123230000',
'KEY' => 'xxxxxxxxxxx',
'CN' => 'xxxxxxxxxxxxxx',
'STARTDATE' => '2014-11-23 19:00:00',
'LASTMOD' => '20141118205901',
'UID' => '20141114T010456Z--1092363482',
'DURATION' => '4',
'STARTSTAMP' => '20141123190000'
};
$VAR3 = $VAR2;
Where is the $VAR3 = $VAR2 coming from?

My guess is that this section is the culprit:
foreach (#deleted) {
if ($_ ne $_[0]) {
push(#entry, \%ehash);
}
}
If you have several values in the array, the if-statement can be true twice, and thus push a value twice. Unless this is wanted behaviour, I would make sure that only one value is pushed. You can do this by using grep instead:
if (grep { $_ ne $_[0] } #deleted) {
push #entry, \%ehash;
}
Note that this replaces the foreach loop.

Your array #entry contains hash references. Data::Dumper is saying that the first and second elements of the array refer to two different hashes, while the third refers to the same hash as the second.
You don't show where #entry comes from, but I would expect all three elements to be references to %ehash.
The problem is that, if you keep pushing a reference to %ehash onto #entry, they all point to the same data item, and the intermediate states of the hash won't be recorded.
Unless you mean entrytoarray to push only one copy of %ehash (in which case there's a separate problem that we can't see) you need to fix it by either writing
push #entry, { %ehash }
which copies the hash and returns a reference to the copy, or you can declare and populate %ehash inside the foreach loop, which will create a new hash each time around the loop.

Related

Perl nested hashes matching and merging

I have a file that is read and split into %ojects, the %objects are populated as shown below.
$VAR1 = 'cars';
$VAR2 = {
'car1' => {
'info1' => '"fast"',
'info2' => 'boring'
},
'car2' => {
'info1' => '"slow"',
'info2' => 'boring info'
},
'car3' => {
'info1' => '"unique"',
'info2' => 'useless info'
}
};
$VAR3 = 'age';
$VAR4 = {
'new' => {
'info3' => 'rust',
'info4' => '"car1"'
},
'old' => {
'info3' => 'shiny',
'info4' => '"car2" "car3"'
}
}
};
My goal is to insert data like "car1 fast rust, car2 slow shiny, car3 unique shiny" in a DB but I can't get e.g. "rust to match based on info4 in age" ..
my $key = cars;
my $key2 = age;
foreach my $obj (keys %{$objects{$key}}) { # for every car
#info1s = $objects{$type}{$obj}{'info1'} =~ m/"(.*?)"/g; # added to clean up all info1
foreach my $infos ($info1s) {
dbh execute insert $obj $infos # this gives me "car1 fast, car2 slow, car3 unique"
}
...
Can somebody please point me in the right direction to fetch and store info4 with related info1/info2?
Thanks!
I take the objective to be as follows.
Get values for (info4) keys in $VAR4, at the deepest-level hashref, and find them as top-level keys in $VAR2 hashref. Then associate with them both a value from a (info3) key, their "sibling" in their own $VAR4's deepest level hashref, as well as the value of a key (info1) from $VAR2.
One can traverse the structure by hand for this purpose, specially if it's always with the same two levels as shown, but it's easier and better with libraries. I use Data::Leaf::Walker to get leaves (deepest values) and key-paths to them, and Data::Diver to get values for known paths.
use warnings;
use strict;
use feature 'say';
use Data::Dump;
use Data::Leaf::Walker;
use Data::Diver qw(Dive);
my $hr1 = {
'car1' => { 'info1' => 'fast', 'info2' => 'boring' },
'car2' => { 'info1' => 'slow', 'info2' => 'boring info' },
'car3' => { 'info1' => 'unique', 'info2' => 'useless info' }
};
my $hr2 = {
'new' => { 'info3' => 'rust', 'info4' => 'car1' },
'old' => { 'info3' => 'shiny', 'info4' => 'car2 car3' }
};
my $walker = Data::Leaf::Walker->new($hr2);
my %res;
while ( my ($path, $value) = $walker->each ) {
next if $path->[-1] ne 'info4';
# Some "values" have multiple needed values separated by space
for my $val (split ' ', $value) {
# Get from 'info4' path the one to its sibling, 'info3'
my #sibling_path = ( #{$path}[0..$#$path-1], 'info3' );
# Collect results: values of `info3` and `info1`
push #{$res{$val}},
Dive( $hr2, #sibling_path ),
Dive( $hr1, ($val, 'info1') );
}
}
dd \%res;
This assumes a few things and takes some shortcuts, for simplicity.
For one, I use explicit infoN keys from the questions, and the two-level structure. If data is, or can be different, this shouldn't be hard to adjust.
Next, this assumes that a value like car1 always exists as a key in the other hashref. Add an exists check for that key if it is possible that it doesn't exist as a key.
I've removed some extra quotes from data. (If that's for database entry do that when constructing the statement. If data comes in with such extra quotes it should be easy to adjust the code to take them into account.)
The above program prints
{
car1 => ["rust", "fast"],
car2 => ["shiny", "slow"],
car3 => ["shiny", "unique"],
}
(I use Data::Dump to display complex data structure, for its simplicity and default compact output.)

Why assigning list containing array to array itself causes recursion in perl?

I'm using DBIX::Class and generating conditions for search like that:
my #array;
push #array, { condition1 => 'value1' };
push #array, [ { condition2 => 'value2' }, { condition3 => 'value3' } ];
All this conditions must be checked using AND operator, that's why I wrote this:
#array = ( -and => #array );
After running code with such conditions process on my virtual machine started to use up to 8 Gb memory. I thought that it was recursion problems and I didn't mistake. I checked logs and saw records about deep recursion but I couldn't find anything about my case in internet.
Is there problems with assigning list containing array to array itself?
Or maybe it is a problem with DBIX::Class (SQL::Abstract)? Why it causes deep recursion?
Update. This is the real code from project:
sub faq {
my ( $self ) = #_;
my #cond;
if ( $self->param('faq_type') ) {
push #cond,
{
'me.faq_type' => $self->param('faq_type'),
};
}
if ( my $search = $self->param('search') ) {
push #cond,
[
'me.title' => { ilike => "%$search%" },
'me.text' => { ilike => "%$search%" },
];
}
#cond = ( -and => #cond );
my %attr = (
join => 'page_category',
rows => $self->param('limit'),
offset => $self->param('offset'),
order_by => { -desc => 'id' },
result_class => 'BUX::Util::HashRefInflator',
'+select' => [ qw( page_category.name ) ],
'+as' => [ qw( category_name ) ],
);
my #pages = BUX::DB->rs('Page')->search( \#cond, \%attr )->all;
my $total_count = BUX::DB->rs('Page')->count( \#cond );
return $self->render(json => {
pages => \#pages,
count => $total_count
});
}
And log records:
Deep recursion on subroutine "SQL::Abstract::_SWITCH_refkind" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 719.
Deep recursion on subroutine "SQL::Abstract::_recurse_where" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 546.
Deep recursion on subroutine "SQL::Abstract::_where_ARRAYREF" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 687.
Deep recursion on subroutine "SQL::Abstract::_where_HASHREF" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 493.
Deep recursion on subroutine "SQL::Abstract::_where_unary_op" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 596.
Deep recursion on subroutine "SQL::Abstract::_where_op_ANDOR" at /opt/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/SQL/Abstract.pm line 645.
P.S. BUX::DB is the subclass of DBIx::Class and rs is a shortcut for resultset.
When specifying several conditions that should all be met to search
with DBIx::Class, the usual way to do this is by passing a hashref
with the column names as keys and the conditions as values.
While it is possible to instead specify an arrayref of hashrefs with the '-and' keyword, this is most often unnecessary - especially if you only have one condition to specify!
NOTE: I am not certain { -and => #cond } does what you want, have you tried replacing it with { -and => \#cond } ( Note the arrayref) ? This could be the reason why SQL::Abstract gets confused, though I'm unsure how that would end up being a recursion.
SECOND NOTE: I find #cond = ( -and => \#cond ) confusing and it may cause trouble. I would suggest working with a hashref passed into search, as it should be called with, and setting the -and key instead, by adapting my first example.
This is how I would specify the conditions:
my $cond;
if ( my $faq_type = $self->param('faq_type') ){
$cond->{'me.faq_type'} = $faq_type;
}
if ( my $search = $self->param('search') ){
$cond->{-or} = [
{ $cond->{'me.title'} = { ilike => '%$search%' }, },
{ $cond->{'me.text' } = { ilike => '%$search%' }, },
];
}
An alternative to consider, would be to first specify the 'faq_type' search and store the resulting rs, to then refine it further as necessary, this seems more in line with the spirit of DBIx::Class to me:
my $pages_rs = BUX::DB->rs('Page');
if ( my $faq_type = $self->param('faq_type') ){
$pages_rs = $pages_rs->search({ 'me.faq_type' => $faq_type });
}
if ( my $search = $self->param('search') ){
$pages_rs = $pages_rs->search({
-or => [
'me.title' => { ilike => "%$search%" },
'me.text' => { ilike => "%$search%" },
];
});
}
my %attr = (
join => 'page_category',
rows => $self->param('limit'),
offset => $self->param('offset'),
order_by => { -desc => 'id' },
result_class => 'BUX::Util::HashRefInflator',
'+select' => [ qw( page_category.name ) ],
'+as' => [ qw( category_name ) ],
);
$pages_rs = $pages_rs->search( undef, \%attr );
my #pages = $pages_rs->all; # This executes the query
Please keep in mind this is untested as I currently don't have an easy way of verifying this. If this does not help, feel free to comment and I'll try and fix whatever may be off.
EDIT: to not leave something in that is wrong, I've removed the (irrelevant) page count I put in.

SilverStripe 3.1 loop associative array

In SilverStripe 3.1 I have a function that loops through an array and outputs its contents.
The output it gives me is:
Layout: John
Strategy: John
Management: Martin
In this example John has more than one job.
I would like to group the jobs if a person has more than one job.
This is my desired Output:
Layout and Strategy: John
Management: Martin
//$InfoFieldArray = array('Layout' => 'John', 'Strategy' => 'John', 'Management' => 'Martin');
public function createInfoFields($InfoFieldArray){
$Info = ArrayList::create();
foreach($InfoFieldArray as $key => $value ){
$fields = new ArrayData(array('FieldName' => $key, 'Value' => $value));
$Info->push($fields);
}
return $Info;
}
How do I alter my function to achieve my desired output?
One possible solution to that is by restructuring the data before adding it to the ArrayList.
public function createInfoFields($InfoFieldArray)
{
$info = array();
foreach ($InfoFieldArray as $job => $person)
{
if (!isset($info[$person]))
{
$info[$person] = array();
}
$info[$person][] = $job;
}
$result = ArrayList::create();
foreach ($info as $person => $jobs)
{
$fields = new ArrayData(array('FieldName' => implode(' and ', $jobs), 'Value' => $person));
$result->push($fields);
}
return $result;
}
What I have done is go over the array of jobs and the person assigned and flipped it the other way around, so I have an array of people with a list of jobs. This allows me to then just call implode in PHP, joining the various jobs by the word and.
There are some potential drawbacks, if there are two people named "John", they will be treated as one as I am using the name as the array key.
Also, if there are three jobs for a person, it will list it like "Layout and Strategy and Management". To avoid that, we need to modify the second foreach loop in my code to something like this:
foreach ($info as $person => $jobs)
{
$jobString = null;
if (count($jobs) > 1)
{
$jobString = implode(', ', array_slice($jobs, 0, -1)) . ' and ' . array_pop($jobs);
}
else
{
$jobString = $jobs[0];
}
$fields = new ArrayData(array('FieldName' => $jobString, 'Value' => $person));
$result->push($fields);
}
When there is more than 1 job for a person, we want to implode (glue together) the array pieces for the $jobs array however we don't want the last element at this point. Once array is glued together, we append with with and along with the last item.

Perl: Comparing 2 hash of arrays with another array

I have written the code below in Perl but it's not giving the desirable output. I am dealing with the comparison between one array and two hash of arrays.
Given sample input files:
1) file1.txt
A6416 A2318
A84665 A88
2) hashone.pl
%hash1=(
A6416=>['E65559', 'C11162.1', 'c002gnj.3',],
A88=>['E77522', 'M001103', 'C1613.1', 'c001hyf.2',],
A84665=>['E138347', 'M032578', 'C7275.1', 'c009xpt.3',],
A2318=>['E128591', 'C43644.1', 'C47705.1', 'c003vnz.4',],
);
3) hashtwo.pl
%hash2=(
15580=>['C7275.1', 'E138347', 'M032578', 'c001jnm.3', 'c009xpt.2'],
3178=>['C1613.1', 'E77522','M001103', 'c001hyf.2', 'c001hyg.2'],
24406=>['C11162.1', 'E65559', 'M003010', 'c002gnj.2'],
12352=>['C43644.1', 'C47705.1', 'E128591','M001458', 'c003vnz.3'],
);
My aim is to achieve the task described:
From file1.txt, I have to locate the corresponding ID in %hash1. For instance,A6416 (file1.txt) is the key in %hash1. Next, I have to find the values of A6416 ['E65559', 'C11162.1', 'c002gnj.3',] in %hash2. If majority (more than 50%) of the values are found in %hash2, I replace A6416 with corresponding key from %hash2.
Example:
A6416 A2318
A84665 A88
Output:
24406 12352
15580 3178
Please note that the keys for %hash1 and %hash2 are different (they don't overlap). But the values are the same (they overlap).
#!/usr/bin/perl -w
use strict;
use warnings;
open FH, "file1.txt" || die "Error\n";
my %hash1 = do 'hashone.pl';
my %hash2 = do 'hashtwo.pl';
chomp(my #array=<FH>);
foreach my $amp (#array)
{
if ($amp =~ /(\d+)(\s?)/)
{
if (exists ($hash1{$1}))
{
for my $key (keys %hash2)
{
for my $i ( 0 .. $#{ $hash2{$key} } )
{
if ((#{$hash1{$1}}) eq ($hash2{$key}[$i]))
{
print "$key";
}
}
}
}
}
}
close FH;
1;
Any guidance on this problem is highly appreciated. Thank you!
I think you should invert %hash2 into this structure:
$hash2{'C7275.1'} = $hash2{'E138347'} = $hash2{'M032578'}
= $hash2{'c001jnm.3'} = $hash2{'c009xpt.2'} = 15580;
$hash2{'C1613.1'} = $hash2{'E77522'} = $hash2{'M001103'}
= $hash2{'c001hyf.2'} = $hash2{'c001hyg.2'} = 3178;
$hash2{'C11162.1'} = $hash2{'E65559'}
= $hash2{'M003010'} = $hash2{'c002gnj.2'} = 24406;
$hash2{'C43644.1'} = $hash2{'C47705.1'} = $hash2{'E128591'}
= $hash2{'M001458'} = $hash2{'c003vnz.3'} = 3178;
So that you can perform these look-ups much more effectively, rather than having to iterate over every element of every element of %hash2.
Building on the responses from ruakh and zock here is the code you need to build the look-up table for hash2
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash2=(
15580=>['C7275.1', 'E138347', 'M032578', 'c001jnm.3', 'c009xpt.2'],
3178=>['C1613.1', 'E77522','M001103', 'c001hyf.2', 'c001hyg.2'],
24406=>['C11162.1', 'E65559', 'M003010', 'c002gnj.2'],
12352=>['C43644.1', 'C47705.1', 'E128591','M001458', 'c003vnz.3'],
);
# Build LUT for hash2
my %hash2_lut;
foreach my $key (keys %hash2)
{
foreach my $val (#{$hash2{$key}})
{
$hash2_lut{$val} = $key
}
}
print Dumper(\%hash2_lut);
Please select ruakh's post as the answer, just trying to clarify the code for you. Use Data::Dumper...it is your friend.
Here is the output:
$VAR1 = {
'C47705.1' => '12352',
'M032578' => '15580',
'E138347' => '15580',
'E77522' => '3178',
'C7275.1' => '15580',
'c001jnm.3' => '15580',
'E65559' => '24406',
'C1613.1' => '3178',
'M001458' => '12352',
'c002gnj.2' => '24406',
'c009xpt.2' => '15580',
'c001hyf.2' => '3178',
'C43644.1' => '12352',
'E128591' => '12352',
'c001hyg.2' => '3178',
'M003010' => '24406',
'c003vnz.3' => '12352',
'C11162.1' => '24406',
'M001103' => '3178'
};

Search for hash in an array by value

I have a function which extracts Excel data into an array of hashes like so:
sub set_exceldata {
my $excel_file_or = '.\Excel\ORDERS.csv';
if (-e $excel_file_or) {
open (EXCEL_OR, $excel_file_or) || die("\n can't open $excel_file_or: $!\n");
while () {
chomp;
my ( $id, $date, $product, $batchid, $address, $cost ) = split ",";
my %a = ( id => $id
, date => $date
, product => $product
, batchid => $batchid
, address => $address
, cost => $cost
);
push ( #array_data_or, \%a );
}
close EXCEL_OR;
}
}
Populating the array of hashes is fine. However, the difficult part is searching for a particular item (hash) in the array. I can't seem to locate items that might have an id or 21, or a batchid of 15, or a cost > $20 etc.
How would I go about implementing such a search facility?
Thanks to all,
With the power of grep
my #matching_items = grep {
$_->{id} == 21
} #array_data_or;
If you know there will be only one item returned you can just do this:
my ($item) = grep {
$_->{id} == 21
} #array_data_or;
(Untested, and I haven't written one of these in a while, but this should work)
If you're sure that the search always returns only one occurence or if you're interested in only the first match then you could use the 'first' subroutine found in List::Util
use List::Util;
my %matching_hash = %{ first { $_->{id} == 21 } #array_data_or };
I enclosed the subroutine call in the %{ } block to ensure that the RHS evaluates to a hash.

Resources