Perl subroutines and arrays - arrays

I'm just starting out with Perl (about 15 minutes ago) using an online tutorial. I made a small subroutine to test a few Perl features and would like to know whether it is possible to determine at runtime if the parameters passed to the sub call are arrays or scalars. Let's use the sub I wrote as an example:
#!/usr/bin/perl
sub somme_prod {
    if (scalar(@_) > 1) {
        $facteur = shift(@_);
        foreach my $nb (@_) {
            $resultat += $nb
        }
        return ($resultat * $facteur);
    }
    else {
        return "ERREUR";
    }
}
print somme_prod(2, 2, 3, 7);
This is a basic sum-product subroutine which does exactly what its name says. Now, would it be possible to modify this subroutine to allow for a mix of arrays and scalars like this ?
somme_prod(2, (2,3), 7);
somme_prod(2, (2,3,7));
#...
Also, any comment on the style of Perl coding demonstrated here is much welcome. I have a background of amateur C++ coding so I may not be thinking in Perl.
Edit: I'm so sorry. I actually tried it after posting and it seems that Perl does process my sub as I want it to. Now I guess my question would be more "how does Perl know how to process this" ?
Edited code for a more Perl-ish version.

Yes; in Perl you can create references to arrays (or hashes, or anything else) to stuff several values into a single parameter.
For example:
somme_prod(2, [2, 3], 7);
...would resolve to:
sub somme_prod {
    foreach my $arg (@_) {
        if (ref($arg) eq 'ARRAY') {
            my @values = @$arg; # dereference, e.g. [2, 3] -> (2, 3)
            . . .
        } else {
            # single value, e.g. "2" or "7"
        }
    }
}
You can read the page perldoc perlref to learn all about references.
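As a minimal sketch of how the elided branch could be filled in for the sum-product case: the name somme_prod_refs is hypothetical, and treating the first argument as the factor is an assumption carried over from the question rather than something this answer specifies.

```perl
use strict;
use warnings;

# Hypothetical variant of the question's sub: the first argument is the
# factor, the remaining arguments may be plain numbers or array references.
sub somme_prod_refs {
    my ($facteur, @args) = @_;
    return unless @args;                 # nothing to sum: signal an error

    my $somme = 0;
    foreach my $arg (@args) {
        if (ref($arg) eq 'ARRAY') {
            $somme += $_ for @$arg;      # dereference and add each element
        } else {
            $somme += $arg;              # single value
        }
    }
    return $somme * $facteur;
}

print somme_prod_refs(2, [2, 3], 7), "\n";   # prints 24
```

Only one level of nesting is handled here; an array reference inside an array reference would need a recursive call.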

Perl handles lists and arrays differently, and a useful document for you to read is What is the difference between a list and an array?
Perl will always flatten nested lists (and so arrays within lists) so
my @data1 = (2, (2, 3), 7);
or
my @data2 = (2, 3);
my @data1 = (2, @data2, 7);
is equivalent to
my @data1 = (2, 2, 3, 7);
As Kevin says, if you want nested arrays you have to place an array reference in the place where the sublist appears. Because a reference is a scalar it won't get expanded.
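To make the difference concrete, here is a small sketch (the variable names are made up for illustration) that counts the elements in each case:

```perl
use strict;
use warnings;

my @inner  = (2, 3);
my @flat   = (2, @inner, 7);     # @inner is flattened: four elements
my @nested = (2, \@inner, 7);    # the reference is one scalar: three elements

print scalar(@flat), "\n";       # prints 4
print scalar(@nested), "\n";     # prints 3
print "@{ $nested[1] }\n";       # prints 2 3
```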
Your subroutine is fine, but using some de-facto standards would help others to follow your program. Firstly the convention is that a subroutine will return undef if there is an error, so that you can write
sous_routine($p1, $p2) or die "Erreur";
In this case the possibility that zero is a valid result spoils this, but it is still best to stick to the rules. A plain return without a parameter indicates an error: it yields undef in scalar context and an empty list in list context.
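A short sketch of why the bare return is usually preferred over return undef (the sub names bare and explicit are only for illustration): in list context the two behave differently.

```perl
use strict;
use warnings;

sub bare     { return }          # empty list in list context, undef in scalar
sub explicit { return undef }    # always a single undef value

my @from_bare     = bare();
my @from_explicit = explicit();

print scalar(@from_bare), "\n";       # prints 0
print scalar(@from_explicit), "\n";   # prints 1
```

An array holding one undef is true in a boolean test, so return undef can make a failed call look like a success when the result is assigned to an array.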
A little bit of tidying up and using unless and for as statement modifiers gives this
sub somme_prod {
    return unless @_ > 1;
    my $facteur = shift;
    my $somme = 0;
    $somme += $_ for @_;
    return $somme * $facteur;
}
print somme_prod(2, 2, 3, 7);

You've known Perl for 15 minutes? Forget about references for now.
Basically, everything passed to a subroutine ends up in a single array. In fact, it's stored in an array called @_.
#!/usr/bin/env perl
use strict; #ALWAYS USE!
use warnings; #ALWAYS USE!
my @array = qw(member1 member2 member3 member4);
foo(@array, 'scalar', 'scalar', 'scalar');
sub foo {
    print "My input is " . join (":", @_) . "\n";
}
This will print:
My input is member1:member2:member3:member4:scalar:scalar:scalar
There is no way to tell which entries are from an array and which are from a scalar. As far as your subroutine is concerned, they're all members of the array @_.
By the way, Perl comes with a command called perldoc. When someone says to see perlref, you can type in perldoc perlref at the command line and see the documentation. You can also go to the site http://perldoc.perl.org which will also contain the same information found in the perldoc command.
Now about references....
A data element of a Perl array or the value of a hash can only contain a single value. That could be a string, it could be a real number, it could be an integer, and it could be a reference to another Perl data structure. That's where all the fun and money is.
For example, the same subroutine foo above could have taken this information:
foo(\@array, 'scalar', 'scalar', 'scalar'); #Note the backslash!
In this case, you're not passing the values of @array into foo. Instead, a reference to the array is passed as the first data element of @_. If you attempted to print out $_[0], you'd get something like ARRAY(0x6e43832), which says the data element is an array reference and where the array is located in memory.
Now, you can use the ref function to see whether a piece of data is a reference and the type of reference it is:
sub foo {
    foreach my $item (@_) {
        if (ref $item eq 'ARRAY') {
            print "This element is a reference to an array\n";
        }
        elsif (ref $item eq 'HASH') {
            print "This element is a reference to a hash\n";
        }
        elsif (ref $item) { #Mysterious Moe Reference
            print "This element is a reference to a " . lc (ref $item) . "\n";
        }
        else {
            print "This element is a scalar and its value is '$item'\n";
        }
    }
}
Of course, your reference to an array might be an array that contains references to hashes that contain references to arrays and so on. There's a module that comes with Perl called Data::Dumper (you can use perldoc to see information about it) that will print out the entire data structure.
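For instance, a minimal sketch of dumping a nested structure (the hash %nested and its contents are invented for illustration):

```perl
use strict;
use warnings;
use Data::Dumper;

my @array  = qw(member1 member2);
my %nested = (
    list  => \@array,               # reference to an array
    inner => { key => 'value' },    # reference to an anonymous hash
);

# Dumper takes references and returns the structure as a string of Perl code.
my $dump = Dumper(\%nested);
print $dump;
```

Hash keys may come out in a different order from run to run, because Perl hashes are unordered.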
This is how object-oriented Perl works, so it's really quite common to have references to other Perl data structures embedded in a piece of Perl data.
Right now, just get used to basic Perl and how it works. Then, start looking at the various tutorials about Perl references in Perldoc.

Related

Finding index of the lowest value in array

I have 3 arrays @energy, @es_energy and @hb_energy, each of which is indexed by the same term [$k].
I want to find the lowest value in @energy and then, using that index value, look for the corresponding values in the other arrays.
Currently I am using my $n = nmin_by { $energy[$_] } 0 .. $#energy;
And then $n is used to output from the other arrays. However, I don't want to use nmin_by as it requires an extra library to download for the software package I am using (loads of admin issues).
Any suggestions?
Use List::Util::reduce
use warnings;
use strict;
use feature 'say';
use List::Util qw(reduce);
my @ary = (12, 3, 1, 23);
my $min_idx = reduce { $ary[$a] < $ary[$b] ? $a : $b } 0..$#ary;
say $min_idx;
Put this in a sub so that the implementation is out of sight while the name clarifies the purpose
use Carp;
sub get_min_idx {
    my $ra = shift;
    croak "Sub expects array reference" if ref $ra ne 'ARRAY';
    return reduce { $ra->[$a] < $ra->[$b] ? $a : $b } 0..$#$ra;
}
my $min_idx = get_min_idx(\@ary);
Tuck it away in a module and you can also change how it works with minimal intrusion.
The error message can be elaborated (tell the user what has been passed, for instance) and checks added; for one, given the numeric < comparison, the sub needs an array containing only numbers.
Syntax clarification: the index of the last element of an arrayref $rary is $#$rary (while the index of the last element of an array @ary is $#ary).
Pick your subroutine name carefully; having a good name helps a lot.
Thanks to Borodin for commenting on the need for this.
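List::Util ships with Perl itself, so reduce should not need a download. If even that is unwanted, a plain loop does the same job. The sketch below uses made-up sample values for the three arrays from the question, and the sub name min_index is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the question's arrays.
my @energy    = (-3.2, -7.5, -1.0);
my @es_energy = (0.4, 0.9, 0.1);
my @hb_energy = (1.1, 2.2, 3.3);

# Return the index of the smallest number, or nothing for an empty array.
sub min_index {
    my ($values) = @_;
    return unless @$values;
    my $best = 0;
    for my $i (1 .. $#$values) {
        $best = $i if $values->[$i] < $values->[$best];
    }
    return $best;
}

my $n = min_index(\@energy);
print "$n $energy[$n] $es_energy[$n] $hb_energy[$n]\n";   # prints 1 -7.5 0.9 2.2
```

On ties this keeps the first lowest index, because the comparison is a strict <.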

De-reference x number of times for x number of data structures

I've come across an obstacle in one of my perl scripts that I've managed to solve, but I don't really understand why it works the way it works. I've been scouring the internet but I haven't found a proper explanation.
I have a subroutine that returns a reference to a hash of arrays. The hash keys are simple strings, and the values are references to arrays.
I print out the elements of the array associated with each key, like this
for my $job_name (keys %$build_numbers) {
    print "$job_name => ";
    my @array = @{@$build_numbers{$job_name}}; # line 3
    for my $item ( @array ) {
        print "$item \n";
    }
}
While I am able to print out the keys & values, I don't really understand the syntax behind line 3.
Our data structure is as follows:
Reference to a hash whose values are references to the populated arrays.
To extract the elements of the array, we have to:
- dereference the hash reference so we can access the keys
- dereference the array reference associated to a key to extract elements.
Final question being:
When dealing with perl hashes of hashes of arrays etc; to extract the elements at the "bottom" of the respective data structure "tree" we have to dereference each level in turn to reach the original data structures, until we obtain our desired level of elements?
Hopefully somebody could help out by clarifying.
Line 3 is taking a slice of your hash reference, but it's a very strange way to do what you're trying to do because a) you normally wouldn't slice a single element and b) there's cleaner and more obvious syntax that would make your code easier to read.
If your data looks something like this:
my $data = {
foo => [0 .. 9],
bar => ['A' .. 'F'],
};
Then the correct version of your example would be:
for my $key (keys(%$data)) {
    print "$key => ";
    for my $val (@{$data->{$key}}) {
        print "$val ";
    }
    print "\n";
}
Which produces:
bar => A B C D E F
foo => 0 1 2 3 4 5 6 7 8 9
If I understand your second question, the answer is that you can access precise locations of complex data structures if you use the correct syntax. For example:
print "$data->{bar}->[4]\n";
Will print E.
Additional recommended reading: perlref, perlreftut, and perldsc
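To illustrate the slice point, here is a small sketch reusing the $data structure from above (the names $via_slice and $via_arrow are invented). It shows that the one-key hash slice and the plain element access reach the same array reference:

```perl
use strict;
use warnings;

my $data = {
    foo => [0 .. 9],
    bar => ['A' .. 'F'],
};

my ($via_slice) = @$data{'bar'};   # hash slice with one key: a one-element list
my $via_arrow   = $data->{bar};    # ordinary single-element access

# Comparing references numerically compares their addresses.
print "same reference\n" if $via_slice == $via_arrow;

# The arrow between adjacent subscripts is optional.
print "$data->{bar}[4]\n";         # prints E
```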
Working with data structures can be hard depending on how it was made.
I am not sure if your "job" data structure is exactly this but:
#!/usr/bin/env perl
use strict;
use warnings;
use diagnostics;
my $hash_ref = {
job_one => [ 'one', 'two'],
job_two => [ '1','2'],
};
foreach my $job ( keys %{$hash_ref} ){
    print " Job => $job\n";
    my @array = @{$hash_ref->{$job}};
    foreach my $item ( @array )
    {
        print "Job: $job Item $item\n";
    }
}
You have a hash reference; you iterate over its keys, and each value is an array reference. But each item of such an array could itself be another reference or a simple scalar.
Basically, you can work through the reference directly or dereference it into a copy, as in the loop above.
There is a piece of documentation you can check for more details here.
So answering your question:
Final question being: - When dealing with perl hashes of hashes of
arrays etc; to extract the elements at the "bottom" of the respective
data structure "tree" we have to dereference each level in turn to
reach the original data structures, until we obtain our desired level
of elements?
It depends on how your data structure was made and if you already know what you are looking for it would be simple to get the value for example:
my %city_codes = (
    a => 1, b => 2,
);
my $value = $city_codes{a};
Complex data structures come with complex code.

Modifications to array also change other array

I have two global multidimensional arrays @p and @p0e in Perl. This is part of a genetic algorithm where I want to save certain keys from @p to @p0e. Modifications are then made to @p. There are several subroutines that make modifications to @p, but there's a certain subroutine where on occasion (not on every iteration) a modification to @p also leads to @p0e being modified (it receives the same keys) although @p0e should not be affected.
# this is the sub where part of @p is copied to @p0e
sub saveElite {
    @p0e = (); my $i = 0;
    foreach my $r (sort({$a<=>$b} keys %{ $f{"rank"} })) {
        if ($i<$elN) {
            $p0e[$i] = $p[$f{"rank"}{$r}]; # save chromosome
        }
        else {last;}
        $i++;
    }
}
# this is the sub that then sometimes changes @p0e
sub mutation {
    for (my $i=0; $i<@p; $i++) {
        for (my $j=0; $j<@{$p[$i]}; $j++) {
            if (rand(1)<=$mut) { # mutation
                $p[$i][$j] = mutate($p[$i][$j]);
            }
        }
    }
}
I thought maybe I'd somehow created a reference to the original array rather than a copy, but because this unexpected behaviour doesn't happen on every iteration this shouldn't be the case.
$j = $f{"rank"}{$r};
$p0e[$i] = $p[$j];
$p[$j] is an array reference, which you can think of as pointing to a particular list of data at a particular memory address. The assignment to $p0e[$i] tells Perl to let the $i-th row of @p0e refer to that same block of memory. So when you later make a change to $p0e[$i][$k], you'll find the value of $p[$j][$k] has changed too.
To fix this, you'll want to assign a copy of $p[$j]. Here is one way you can do that:
$p0e[$i] = [ @{$p[$j]} ];
@{$p[$j]} dereferences the array reference and [...] creates a new reference for it, so after this statement $p0e[$i] will have the same contents with the same values as $p[$j] but point to a different block of memory.
I think your problem will probably be this:
$p0e[$i] = $p[$f{"rank"}{$r}]; # save chromosome
Because it looks like @p is a multi-dimensional array.
The problem is - the way perl 'does' multi dimensional arrays is via arrays of references. So if you copy an inner array, you do so by reference.
E.g.:
#!c:\Strawberry\perl\bin
use strict;
use warnings;
use Data::Dumper;
my @list = ( [ 1, 2, 3 ],
             [ 4, 5, 6 ],
             [ 7, 8, 9 ], );
print Dumper \@list;
my @other_list;
push ( @other_list, @list[0,1] ); #make a sub list of two rows;
print Dumper \@other_list;
### all looks good.
## but if we:
print "List:\n";
print join ("\n",@list),"\n";
print "Other List:\n";
print join ("\n", @other_list),"\n";
$list[1][1] = 9;
print Dumper \@other_list;
You will see that by changing an element in @list we also modify @other_list - and if we just print them we get:
List:
ARRAY(0x2ea384)
ARRAY(0x12cef34)
ARRAY(0x12cf024)
Other List:
ARRAY(0x2ea384)
ARRAY(0x12cef34)
Note the duplicate numbers - that means you have the same reference.
The easiest way of working around this is by using [] judiciously:
push ( @other_list, [@{$list[0]}], [@{$list[1]}] ); #make a sub list of two rows;
This will then insert anonymous arrays (new ones) containing the dereferenced elements of the list.
Whilst we're at it though - please turn on strict and warnings. They will save you a lot of pain in the long run.
That's because it's an array of arrays. The first-level array stores only references to the inner arrays, so if you modify an inner array, the change shows up through both outer arrays: they both refer to the same inner array. Make a deep copy instead of a shallow one.
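A sketch of the three kinds of copy side by side, using a small stand-in for @p (the names @shallow, @rows and @deep are invented). dclone comes from the core Storable module:

```perl
use strict;
use warnings;
use Storable qw(dclone);

my @p = ([1, 2, 3], [4, 5, 6]);

my @shallow = @p;                  # copies the row references only
my @rows    = map { [@$_] } @p;    # copies each row; enough for two levels
my @deep    = @{ dclone(\@p) };    # recursive copy, whatever the depth

$p[0][0] = 99;                     # modify the original afterwards

print "$shallow[0][0] $rows[0][0] $deep[0][0]\n";   # prints 99 1 1
```

For a plain two-dimensional array of numbers the map version is sufficient; dclone is the safer choice when the rows themselves contain further references.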

Perl - How do I update (and access) an array stored in array stored in a hash?

Perhaps I have made this more complicated than I need it to be but I am currently trying to store an array that contains, among other things, an array inside a hash in Perl.
i.e. hash -> array -> array
use strict;
my %DEVICE_INFORMATION = {}; #global hash
sub someFunction() {
    my $key = 'name';
    my @storage = ();
    #assume file was properly opened here for the foreach-loop
    foreach my $line (<DATA>) {
        if(conditional) {
            my @ports = ();
            $storage[0] = 'banana';
            $storage[1] = \@ports;
            $storage[2] = '0';
            $DEVICE_INFORMATION{$key} = \@storage;
        }
        elsif(conditional) {
            push @{$DEVICE_INFORMATION{$key}[1]}, 5;
        }
    }#end foreach
} #end someFunction
This is a simplified version of the code I am writing. I have a subroutine that I call in the main. It parses a very specifically designed file. That file guarantees that the if statement fires before subsequent elsif statement.
I think the push call in the elsif statement is not working properly - i.e. 5 is not being stored in the @ports array that should exist in the @storage array that should be returned when I hash the key into DEVICE_INFORMATION.
In the main I try to print out each element of the @storage array to check that things are running smoothly.
#main execution
&someFunction();
print $DEVICE_INFORMATION{'name'}[0];
print $DEVICE_INFORMATION{'name'}[1];
print $DEVICE_INFORMATION{'name'}[2];
The output for this ends up being... banana ARRAY(blahblah) 0
If I change the print statement for the middle call to:
print @{$DEVICE_INFORMATION{'name'}[1]};
Or to:
print @{$DEVICE_INFORMATION{'name'}[1]}[0];
The output changes to banana [blankspace] 0
Please advise on how I can properly update the @ports array while it is stored inside the @storage array that has been hash'd into DEVICE_INFORMATION and then how I can access the elements of @ports. Many thanks!
P.S. I apologize for the length of this post. It is my first question on stackoverflow.
I was going to tell you that Data::Dumper can help you sort out Perl data structures, but Data::Dumper can also tell you about your first problem:
Here's what happens when you assign open-curly + close-curly ( '{}' ) to a hash:
use Data::Dumper ();
my %DEVICE_INFORMATION = {}; #global hash
print Data::Dumper->Dump( [ \%DEVICE_INFORMATION ], [ '*DEVICE_INFORMATION' ] );
Here's the output:
%DEVICE_INFORMATION = (
'HASH(0x3edd2c)' => undef
);
What you did is assign the stringified hash reference as a key, paired with the list element that comes after it. Since nothing comes after it, this is implied:
my %DEVICE_INFORMATION = {} => ();
So Perl assigned it a value of undef.
When you assign to a hash, you assign a list. A literal empty hash is not a list, it's a hash reference. What you wanted to do for an empty hash--and what is totally unnecessary--is this:
my %DEVICE_INFORMATION = ();
And that's unnecessary because it is exactly the same thing as:
my %DEVICE_INFORMATION;
You're declaring a hash, and that statement fully identifies it as a hash. And Perl is not going to guess what you want in it, so it's an empty hash from the get-go.
Finally, my advice on using Data::Dumper. If you started your hash off right, and did the following:
my %DEVICE_INFORMATION; # = {}; #global hash
my @ports = ( 1, 2, 3 );
# notice that I just skipped the interim structure of @storage
# and assigned it as a literal
# * Perl has one of the best literal data structure languages out there.
$DEVICE_INFORMATION{name} = [ 'banana', \@ports, '0' ];
print Data::Dumper->Dump(
    [ \%DEVICE_INFORMATION ]
  , [ '*DEVICE_INFORMATION' ]
);
What you see is:
%DEVICE_INFORMATION = (
'name' => [
'banana',
[
1,
2,
3
],
'0'
]
);
So, you can better see how it's all getting stored, which levels you have to dereference, and how to get the information you want out of it.
By the way, Data::Dumper delivers 100% runnable Perl code, and shows you how you can specify the same structure as a literal. One caveat, you would have to declare the variable first, using strict (which you should always use anyway).
You update @ports properly.
Your print statement accesses $storage[1] (the reference to @ports) in the wrong way.
You can use the same syntax you used in push.
print $DEVICE_INFORMATION{'name'}[0], ";",
    join( ':', @{$DEVICE_INFORMATION{'name'}[1]}), ";",
    $DEVICE_INFORMATION{'name'}[2], "\n";
print "Number of ports: ", scalar(@{$DEVICE_INFORMATION{'name'}[1]}), "\n";
print "First port: ", $DEVICE_INFORMATION{'name'}[1][0]//'', "\n";
# X//'' -> X or '' if X is undef
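Putting the pieces together, a self-contained sketch of the update and the access (the file parsing is left out, and the pushed values 5 and 8 are made up):

```perl
use strict;
use warnings;

my %DEVICE_INFORMATION;            # an empty hash needs no initializer
my @ports;

# hash -> array -> array, built as a literal
$DEVICE_INFORMATION{name} = [ 'banana', \@ports, '0' ];

# Pushing through the stored reference modifies @ports itself.
push @{ $DEVICE_INFORMATION{name}[1] }, 5;
push @{ $DEVICE_INFORMATION{name}[1] }, 8;

print "$DEVICE_INFORMATION{name}[1][0]\n";   # prints 5
print scalar(@ports), "\n";                  # prints 2
```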

Perl array initialized incorrectly

Just starting with Perl, and I used the wrong pair of parentheses for this code example:
@arr = {"a", "b", "c"};
print "$_\n" foreach @arr;
I mean, I can of course see WHAT it does when I run it, but I don't understand WHY it does that. Shouldn't that simply fail with a syntax error?
You have accidentally created an anonymous hash, which is a hash that has no identifier and is accessed only by reference.
The array @arr is set to a single element which is a reference to that hash. Properly written it would look like this
use strict;
use warnings;
my @arr = (
    {
        a => "b",
        c => undef,
    }
);
print "$_\n" foreach @arr;
which is why you got the output
HASH(0x3fd36c)
(or something similar) because that is how Perl will represent a hash reference as a string.
If you want to experiment, then you can print the value of the first hash element by using $arr[0] as a hash reference (the array has only a single element at index zero) and accessing the value of the element with key a with print $arr[0]->{a}, "\n".
Note that, because hashes have to have a multiple of two values (a set of key/value pairs) the hash in your own code is implicitly expanded to four values by adding an undef to the end.
It is vital that you add use strict and use warnings to the top of every Perl program you write. In this case the latter would have raised the warning
Odd number of elements in anonymous hash
It can be surprising how few things are syntax errors in Perl. In this case, you've stumbled onto the syntax for creating an anonymous hash reference: { LIST }. Perl doesn't require the use of parentheses (the () kind) when initializing an array. Arrays hold ordered lists of scalar (single) values, and references are scalars, so perl happily initializes your array with that reference as the only element.
It's not what you wanted, but it's perfectly valid Perl.
The () override the precedence of the operators and thus change the behavior of the expression. @a = (1, 2, 3) does what you'd expect, but @a = 1, 2, 3 means (@a = 1), 2, 3, so it only assigns 1 to @a.
{ LIST } is the hash constructor. It creates a hash, assigns the result of LIST to it, and returns a reference to it.
my $h = {"a", "b", "c", "d"};
say $h->{a}; # b
A list of one hash ref is just as legit as any other list, so no, it shouldn't be a syntax error.
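A quick sketch comparing the three kinds of brackets on the right-hand side of an array assignment (the variable names are invented; the hash constructor gets an even number of items here to avoid the warning):

```perl
use strict;
use warnings;

my @list = ("a", "b", "c");        # parentheses: a three-element list
my @aref = ["a", "b", "c"];        # square brackets: one array reference
my @href = {"a", "b", "c", "d"};   # curly braces: one hash reference

print scalar(@list), "\n";         # prints 3
print ref($aref[0]), "\n";         # prints ARRAY
print ref($href[0]), "\n";         # prints HASH
```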

Resources