Self-deleting array elements (once they become undefined) - arrays

I have a Perl script generating an array of weak references to objects. Once one of these objects goes out of scope, the reference to it in the array will become undefined.
Example (pseudocode):
# Imagine an array of weak references to objects
my @array = ( $obj1_ref, $obj2_ref, $obj3_ref );
# Some other code here causes the last strong reference
# of $obj2_ref to go out of scope.
# We now have the following array
@array = ( $obj1_ref, undef, $obj3_ref )
Is there a way to make the undefined reference automatically remove itself from the array once it becomes undefined?
I want @array = ( $obj1_ref, $obj3_ref ).
EDIT:
I tried this solution and it didn't work:
#!/usr/bin/perl
use strict;
use warnings;
{
package Object;
sub new { my $class = shift; bless({ @_ }, $class) }
}
{
use Scalar::Util qw(weaken);
use Data::Dumper;
my $object = Object->new();
my $array;
$array = sub { \@_ }->( grep defined, @$array );
{
my $object = Object->new();
@$array = ('test1', $object, 'test3');
weaken($array->[1]);
print Dumper($array);
}
print Dumper($array);
}
Output:
$VAR1 = [
'test1',
bless( {}, 'Object' ),
'test3'
];
$VAR1 = [
'test1',
undef,
'test3'
];
The undef is not removed from the array automatically.
Am I missing something?
EDIT 2:
I also tried removing undefined values from the array in the DESTROY method of the object, but that doesn't appear to work either. It appears that since the object is still technically not "destroyed" yet, the weak references are still defined until the DESTROY method is completed...

No, there isn't, short of using a magical (e.g. tied) array.
If you have a reference to an array instead of an array, you can use the following to filter out the undefined element efficiently without "hardening" any of the references.
$array = sub { \@_ }->( grep defined, @$array );
This doesn't copy the values at all, in fact. Only "C pointers" get copied.
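A minimal sketch of that filter in use, assuming plain hash references stand in for the objects (the names $kept and $temporary are made up for illustration). The filter has to run after the referenced object has gone away, and isweak from Scalar::Util is used to check that the surviving slot is still a weak reference:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $kept  = { name => 'kept' };      # strong reference held outside the array
my $array = [];

{
    my $temporary = { name => 'temporary' };
    @$array = ( $kept, $temporary );
    weaken($_) for @$array;          # every slot is now a weak reference
}
# $temporary is gone, so the second slot has become undef.
my $before = grep { !defined $_ } @$array;

# Build a new array from aliases of the defined elements.
$array = sub { \@_ }->( grep defined, @$array );

printf "undef slots before: %d, elements after: %d, still weak: %s\n",
    $before, scalar @$array, ( isweak( $array->[0] ) ? 'yes' : 'no' );
```

Running the filter before the array is populated, as in the question's EDIT, has nothing to remove; it needs to be called each time the array should be cleaned.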

Perl won't do this for you automatically. You have a couple of options. The first is to clean it yourself whenever you use it:
my @clean = grep { defined $_ } @dirty;
Or you could create a tie'd array and add that functionality to the FETCH* and POP hooks.
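One caveat with the grep approach when the array holds weak references: assigning the result to a new array copies the values, and a copy of a weak reference is an ordinary strong reference. A small sketch, assuming a plain hash reference as the object, that weakens the copies again:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $object = { id => 1 };            # strong reference that keeps the object alive
my @dirty  = ( $object, undef );
weaken( $dirty[0] );

# grep returns the surviving values; the list assignment copies them,
# and a copy of a weak reference is strong.
my @clean = grep { defined $_ } @dirty;
my $copy_was_strong = !isweak( $clean[0] );

# Weaken the copies again if the cleaned array must not keep objects alive.
weaken($_) for @clean;

print "copy was strong: ", ( $copy_was_strong ? 'yes' : 'no' ), "\n";
print "weak again: ",      ( isweak( $clean[0] ) ? 'yes' : 'no' ), "\n";
```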

Related

Perl: Removing array items and resizing the array

I’m trying to filter an array of terms using another array in Perl. I have Perl 5.18.2 on OS X, though the behavior is the same if I use 5.010. Here’s my basic setup:
#!/usr/bin/perl
#use strict;
my @terms = ('alpha','beta test','gamma','delta quadrant','epsilon',
'zeta','eta','theta chi','one iota','kappa');
my @filters = ('beta','gamma','epsilon','iota');
foreach $filter (@filters) {
for my $ind (0 .. $#terms) {
if (grep { /$filter/ } $terms[$ind]) {
splice @terms,$ind,1;
}
}
}
This works to pull out the lines that match the various search terms, but the array length doesn’t change. If I write out the resulting @terms array, I get:
[alpha]
[delta quadrant]
[zeta]
[eta]
[theta chi]
[kappa]
[]
[]
[]
[]
As you might expect from that, printing scalar(@terms) gets a result of 10.
What I want is a resulting array of length 6, without the four blank items at the end. How do I get that result? And why isn’t the array shrinking, given that the perldoc page about splice says, “The array grows or shrinks as necessary.”?
(I’m not very fluent in Perl, so if you’re thinking “Why don’t you just...?”, it’s almost certainly because I don’t know about it or didn’t understand it when I heard about it.)
You can always regenerate the array minus things you don't want. grep acts as a filter allowing you to decide which elements you want and which you don't:
#!/usr/bin/perl
use strict;
my @terms = ('alpha','beta test','gamma','delta quadrant','epsilon',
'zeta','eta','theta chi','one iota','kappa');
my @filters = ('beta','gamma','epsilon','iota');
my %filter_exclusion = map { $_ => 1 } @filters;
my @filtered = grep { !$filter_exclusion{$_} } @terms;
print join(',', @filtered) . "\n";
It's pretty easy if you have a simple structure like %filter_exclusion on hand.
Update: If you want to allow arbitrary substring matches:
my $filter_exclusion = join '|', map quotemeta, @filters;
my @filtered = grep { !/$filter_exclusion/ } @terms;
To see what's going on, print the contents of the array in each step: When you splice the array, it shrinks, but your loop iterates over 0 .. $#terms, so at the end of the loop, $ind will point behind the end of the array. When you use grep { ... } $array[ $too_large ], Perl needs to alias the non-existent element to $_ inside the grep block, so it creates an undef element in the array.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my @terms = ('alpha', 'beta test', 'gamma', 'delta quadrant', 'epsilon',
'zeta', 'eta', 'theta chi', 'one iota', 'kappa');
my @filters = qw( beta gamma epsilon iota );
for my $filter (@filters) {
say $filter;
for my $ind (0 .. $#terms) {
if (grep { do {
no warnings 'uninitialized';
/$filter/
} } $terms[$ind]
) {
splice @terms, $ind, 1;
}
say "\t$ind\t", join ' ', map $_ || '-', @terms;
}
}
If you used $terms[$ind] =~ /$filter/ instead of grep, you'd still get uninitialized warnings, but as there's no need to alias the element, it won't be created.
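If removing elements in place with splice is still preferred, one common way around the index problem described above is to walk the indices from the end of the array towards the start. This is a sketch rather than the only fix; \Q...\E is added on the assumption that the filters are literal strings, not patterns:

```perl
use strict;
use warnings;

my @terms   = ( 'alpha', 'beta test', 'gamma', 'delta quadrant', 'epsilon',
                'zeta', 'eta', 'theta chi', 'one iota', 'kappa' );
my @filters = qw( beta gamma epsilon iota );

for my $filter (@filters) {
    # Walk from the last index down, so a splice never shifts
    # an element that has not been examined yet.
    for my $ind ( reverse 0 .. $#terms ) {
        splice @terms, $ind, 1 if $terms[$ind] =~ /\Q$filter\E/;
    }
}

print scalar(@terms), ": ", join( ', ', @terms ), "\n";
```

Because every index is visited before anything below it is removed, the loop never reads past the end of the array and no undef elements are created.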

Passing an Array of an Array to a subroutine using perl

Ok, so I got an array of an array (AoA) and I need to pass it to a subroutine, and then access it. This works… but is it strictly correct and indeed is there a better way that I should be doing this?
#!/usr/bin/perl
$a = "deep";
push(@b,$a);
push(@c,\@b);
print "c = @{$c[0]}\n";
&test(\@c);
sub test
{
$d = $_[0];
print "sub c = @{$$d[0]}\n";
}
Thanks
The better way to do it is to add use strict; and use warnings; and to declare your variables before using them.
It is also not good practice to name your variables a or b. Give them meaningful names, especially because $a and $b are special package variables that Perl uses inside sort blocks (as in sort { $a <=> $b } @list).
use strict; # Always!
use warnings; # Always!
sub test {
my ($d) = @_;
print "$d->[0][0]\n";
# Or to print all the deep elements.
for my $d2 (@$d) {
for (@$d2) {
print "$_\n";
}
}
}
{
my $a = "deep"; # or: my @c = ( [ "deep" ] );
my @b;
my @c;
push(@b,$a);
push(@c,\@b);
test(\@c); # Can't pass arrays per se, just refs. You had this right.
}
Still needs better names. In particular, $a and $b should be avoided as they can interfere with sort.

Get a random element from an array in a hash

I have seen a lot of results on google on how to get a random array index, but I have not been able to apply it to this scenario.
Consider the following:
my %hash;
my @array = {"foo", "bar", "poo"};
$hash->{mykey} = @array;
How would I get a random element from the array inside $hash->{mykey}? Something like the following code which does not work:
my $element = $hash->{mykey}[rand($hash->{mykey})];
EDIT: So the answers below are extremely informative for this. Compounding my issue in particular was that I was using the threads module, and completely forgot to share the arrays that I was appending to the hash elements! Due to this, the answers were not working for me right away.
After fixing that oversight, the solutions below worked perfectly.
Three errors.
1. The following creates an array with one element, a reference to a hash:
my @array = {"foo", "bar", "poo"};
You surely meant to use
my @array = ("foo", "bar", "poo");
2.
$hash->{mykey} = @array;
is the same thing as
$hash->{mykey} = 3;
You can't store arrays in scalars, but you can store a reference to one.
$hash->{mykey} = \@array;
3. It would be
rand(@a) # rand conveniently imposes a scalar context.
for an array, so it's
rand(@{ $ref })
for a reference to an array. That means you want the following:
my $element = $hash->{mykey}[ rand(@{ $hash->{mykey} }) ];
Or you can break it down into two lines.
my $array = $hash->{mykey};
my $element = $array->[ rand(@$array) ];
All together, we have the following:
my @array = ( "foo", "bar", "poo" );
my $hash = { mykey => \@array };
my $element = $hash->{mykey}[ rand(@{ $hash->{mykey} }) ];
I think that your first problem is the construction of your data structure:
#always
use strict;
use warnings;
my %hash;
my @array = ("foo", "bar", "poo");
$hash{mykey} = \@array;
You should probably read perldoc perlreftut to get comfortable with Perl's semantics relating to nested data structures (references).
At this point you can create the structure all at once, which is probably what you mean:
#always
use strict;
use warnings;
my %hash = (
mykey => ["foo", "bar", "poo"],
);
To find the length you just use the regular Perl mechanics for getting the length of the array:
my $length = @{ $hash{mykey} };
and then the random element
my $elem = $hash{mykey}[rand $length];
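If the lookup is needed in several places, the two steps can be wrapped in a small helper. This is only a sketch; random_element is a made-up name, and returning undef for an empty array is an assumption about what the caller wants:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a random element of the array behind $aref,
# or undef if that array is empty.
sub random_element {
    my ($aref) = @_;
    return @$aref ? $aref->[ rand @$aref ] : undef;
}

my %hash = ( mykey => [ "foo", "bar", "poo" ] );
my $element = random_element( $hash{mykey} );
print "$element\n";
```

The fractional value returned by rand is truncated to an integer when it is used as an array index, so no explicit int() is needed.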

In Perl how can I read a file of unknown length into multiple hashes to be stored in an array for later use?

I have a config file that looks a bit like this:
add
    1
    2
concatenate
    foo
    bar
    blat
What I'm trying to do is turn this into hashes like %hash = (name=>"add", args=> [1,2]) etc, and push the hash references into a single array. Looping through the file and creating each hash seems straightforward enough, except I get stuck when it comes to naming these hashes to push their references into the array. The config file is going to change all the time and have a variable number of different name/arg combinations to store. Is there a way to iterate through hash names so I can push them into an array one at a time?
So far it looks like this:
my %temphash = (name=>'add', args=>[1,2]);
push (@array, \%temphash);
Can I make that %temphash into something generated on the fly and push it before moving on to the next one?
Edit: Context
The plan is to use those 'name' keys to call subroutines. So something like this could work:
my %subhash = (add=>\&addNumbers, concatenate=>\&concat);
Except the list of subroutines I'm going to need to call are in the config file and I won't know what they are until I start reading from it. Even if I include the names of the subroutines right there in the config file, how do I iterate through them and add them as elements to that hash?
Well, you can simply use curly brackets to make an anonymous hash:
push @array, { name => 'add', args => [1,2] };
You can create the same effect by utilising the lexical scope of the my declaration. E.g.:
my @array;
while ( ... ) {
...
my %hash = ( ... );
push @array, \%hash;
}
If I'm correctly understanding what you're asking, then you can write:
push @array, { name=>'add', args=>[1,2] };
where { ... } is a reference to an anonymous hash.
That said, I'm a bit surprised that you want an array of hashes, when each hash has just a name and args. Why not have a single hash mapping from names to args? :
%array = ( add => [ 1, 2 ], concatenate => [ 'foo', 'bar', 'baz' ] );
Something like this will do what you need
use strict;
use warnings;
open my $fh, '<', 'data_file' or die $!;
my $item;
my @data;
while (<$fh>) {
chomp;
next unless /^(\s*)(.+?)\s*$/;
if ($1) {
push @{ $item->{args} }, $2;
}
else {
push @data, $item if $item;
$item = { name => $2, args => [] };
}
}
push @data, $item if $item;
use Data::Dump;
dd \@data;
output
[
{ args => [1, 2], name => "add" },
{ args => ["foo", "bar", "blat"], name => "concatenate" },
]
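For the edit in the question (using the names to call subroutines), a dispatch table keyed by name is one way to connect a structure like the one above to code. The sketch below is an assumption about how that might look; the handler bodies are invented for illustration, and a name without a handler is treated as a fatal error:

```perl
use strict;
use warnings;

# Hypothetical handlers; the keys must match the 'name' values in the config.
my %dispatch = (
    add         => sub { my $sum = 0; $sum += $_ for @_; return $sum },
    concatenate => sub { return join '', @_ },
);

# Same shape as the parsed data shown above.
my @data = (
    { name => 'add',         args => [ 1, 2 ] },
    { name => 'concatenate', args => [ 'foo', 'bar', 'blat' ] },
);

my @results;
for my $item (@data) {
    my $handler = $dispatch{ $item->{name} }
        or die "No handler for '$item->{name}'\n";
    push @results, $handler->( @{ $item->{args} } );
}

print "$_\n" for @results;
```

Only the names that actually have code behind them need an entry in %dispatch; the config file decides which ones are called and with which arguments.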

Perl: mapping to lists' first element

Task: build a hash using map, where the keys are the elements of a given array @a and the values are the first elements of the list returned by some function f($element_of_a):
my @a = (1, 2, 3);
my %h = map {$_ => (f($_))[0]} @a;
All is fine until f() returns an empty list (that is perfectly correct for f(), and in that case I'd like to assign undef). The error can be reproduced with the following code:
my %h = map {$_ => ()[0]} @a;
The message reads "Odd number of elements in hash assignment". When I rewrite the code as:
my @a = (1, 2, 3);
my $s = ()[0];
my %h = map {$_ => $s} @a;
or
my @a = (1, 2, 3);
my %h = map {$_ => undef} @a;
Perl does not complain at all.
So how should I get the first element of the list returned by f() when the returned list is empty?
Perl version is 5.12.3
Thanks.
I've just played around a bit, and it seems that ()[0], in list context, is interpreted as an empty list rather than as an undef scalar. For example, this:
my @arr = ()[0];
my $size = @arr;
print "$size\n";
prints 0. So $_ => ()[0] is roughly equivalent to just $_.
To fix it, you can use the scalar function to force scalar context:
my %h = map {$_ => scalar((f($_))[0])} @a;
or you can append an explicit undef to the end of the list:
my %h = map {$_ => (f($_), undef)[0]} @a;
or you can wrap your function's return value in a true array (rather than just a flat list):
my %h = map {$_ => [f($_)]->[0]} @a;
(I like that last option best, personally.)
The special behavior of a slice of an empty list is documented under “Slices” in perldata:
A slice of an empty list is still an empty list. […] This makes it easy to write loops that terminate when a null list is returned:
while ( ($home, $user) = (getpwent)[7,0]) {
printf "%-8s %s\n", $user, $home;
}
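To make the difference concrete, here is a small sketch with a stand-in f() (made up for this example) that returns an empty list for one input. It compares the plain list slice with the anonymous-array variant from the answer above:

```perl
use strict;
use warnings;

# Hypothetical stand-in for f(): returns an empty list for 2.
sub f {
    my ($n) = @_;
    return $n == 2 ? () : ( $n * 10, $n * 100 );
}

my @a = ( 1, 2, 3 );

# The plain slice contributes nothing for 2, leaving an odd-length list.
my @broken = map { $_ => ( f($_) )[0] } @a;

# Indexing into an anonymous array always yields exactly one value.
my %h = map { $_ => [ f($_) ]->[0] } @a;

printf "broken list has %d elements\n", scalar @broken;
printf "%s => %s\n", $_, $h{$_} // 'undef' for sort keys %h;
```

The first map produces five values instead of six, which is what triggers the "Odd number of elements" warning when the result is assigned to a hash; the second gives key 2 an undef value.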
I second Jonathan Leffler's suggestion - the best thing to do would be to solve the problem from the root if at all possible:
sub f {
# ... process @result
return @result ? $result[0] : undef ;
}
The explicit undef is necessary for the empty list problem to be circumvented.
First of all, many thanks to everyone who replied! I feel I should now provide the details of the real task.
I'm parsing an XML file containing a set of elements, each of which looks like this:
<element>
<attr_1>value_1</attr_1>
<attr_2>value_2</attr_2>
<attr_3></attr_3>
</element>
My goal is to create a Perl hash for each element that contains the following keys and values:
('attr_1' => 'value_1',
'attr_2' => 'value_2',
'attr_3' => undef)
Let's take a closer look at the <attr_1> element. The XML::DOM::Parser CPAN module that I use for parsing creates an object of class XML::DOM::Element for each of them; let's call the reference to it $attr. The element name is easy to get with $attr->getNodeName, but to access the text enclosed in the <attr_1> tags one first has to fetch all of <attr_1>'s child nodes:
my @child_ref = $attr->getChildNodes;
For the <attr_1> and <attr_2> elements, ->getChildNodes returns a list containing exactly one reference (to an object of class XML::DOM::Text), while for <attr_3> it returns an empty list. For <attr_1> and <attr_2> I should get the value with $child_ref[0]->getNodeValue, while for <attr_3> I should place undef into the resulting hash since there are no text nodes.
So you see that the implementation of f (the ->getChildNodes method in real life) is not under my control :-) The code I ended up writing is as follows (the subroutine is given a list of XML::DOM::Element references for the elements <attr_1>, <attr_2>, and <attr_3>):
sub attrs_hash(@)
{
my @keys = map {$_->getNodeName} @_; # got ('attr_1', 'attr_2', 'attr_3')
my @child_refs = map {[$_->getChildNodes]} @_; # got 3 refs to list of XML::DOM::Text objects
my @values = map {@$_ ? $_->[0]->getNodeValue : undef} @child_refs; # got ('value_1', 'value_2', undef)
my %hash;
@hash{@keys} = @values;
%hash;
}

Resources