How to iterate through multiple Perl arrays

I am hoping to write a loop that lets me use fewer lines of code to make changes to a settings file with Perl. Currently my code reads an XML file, locates a setting ID, and replaces the setting value in that ID with a new one. The current request involves a lot of changes to the settings file and the code is very long. I have put my values in one array and my setting IDs in another, like this:
@GreetGoalDP1 = (3, 5, 7, 10);
@GreetSIDSunDP1 = ('//xsd:Settings/xsd:Setting[@SID="7012"]/xsd:Value',
'//xsd:Settings/xsd:Setting[@SID="7013"]/xsd:Value',
'//xsd:Settings/xsd:Setting[@SID="7014"]/xsd:Value',
'//xsd:Settings/xsd:Setting[@SID="7015"]/xsd:Value');
and run the following.
my($matchSunDP1G1) = $xpc->findnodes($GreetSIDSunDP1[0]);
$matchSunDP1G1->removeChildNodes();
$matchSunDP1G1->appendText($GreetGoalDP1[0]);
#GreetB
my($matchSunDP1G2) = $xpc->findnodes($GreetSIDSunDP1[1]);
$matchSunDP1G2->removeChildNodes();
$matchSunDP1G2->appendText($GreetGoalDP1[1]);
#GreetC
my($matchSunDP1G3) = $xpc->findnodes($GreetSIDSunDP1[2]);
$matchSunDP1G3->removeChildNodes();
$matchSunDP1G3->appendText($GreetGoalDP1[2]);
#GreetD
my($matchSunDP1G4) = $xpc->findnodes($GreetSIDSunDP1[3]);
$matchSunDP1G4->removeChildNodes();
$matchSunDP1G4->appendText($GreetGoalDP1[3]);
I would like to run these changes through a loop just using the array [0] - [3] until completed as I have to do this same set of 4 multiple times. I am not too familiar with looping arrays. Is this something I can do in Perl? If so, what would be the most efficient way to do so?

A simple take
use warnings;
use strict;
...
for my $i (0..$#GreetGoalDP1) {
my ($matchSunDP1G) = $xpc->findnodes( $GreetSIDSunDP1[$i] );
$matchSunDP1G->removeChildNodes();
$matchSunDP1G->appendText( $GreetGoalDP1[$i] );
}
I take it that you don't need all those individual $matchSunDP1G1 etc. It's assumed that the two arrays always have the same length, and that their elements are needed in pairs at the same indices.
The syntax $#aryname is the last index in the array @aryname, and .. is the range operator, so 0 .. $#GreetGoalDP1 for your example is the list 0,1,2,3.
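As a minimal, self-contained sketch of the index-based loop, the following pairs two same-length arrays by index. The names @ids and @values and their contents are made up for illustration; they stand in for the XPath and goal arrays, and no XML is touched.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two parallel arrays
my @ids    = ('7012', '7013', '7014', '7015');
my @values = (3, 5, 7, 10);

my @pairs;
# $#ids is the last index (3 here), so 0 .. $#ids is 0, 1, 2, 3
for my $i (0 .. $#ids) {
    push @pairs, "$ids[$i]=$values[$i]";
}

print join(', ', @pairs), "\n";
```

The same index $i reads from both arrays, which is why the two arrays must line up element for element.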
Then there are libraries that help with use of multiple arrays in parallel, that can be particularly useful when things get messier or more complicated. An example of using an iterator
use List::MoreUtils qw(each_array);
my $it = each_array @GreetSIDSunDP1, @GreetGoalDP1;
while ( my ($sidsun, $goal) = $it->() ) {
my ($matchSunDP1G) = $xpc->findnodes($sidsun);
$matchSunDP1G -> removeChildNodes();
$matchSunDP1G -> appendText( $goal );
}
If the lists are uneven in size the iterator keeps going through the length of the longer one. After the shorter one gets exhausted its would-be value is undef.

The following code sample shows how you could use a hash to pair each value with its XPath. Note that hash keys must be unique and that each returns the pairs in no particular order.
my %hash = (
    3 => '//xsd:Settings/xsd:Setting[@SID="7012"]/xsd:Value',
    5 => '//xsd:Settings/xsd:Setting[@SID="7013"]/xsd:Value',
    7 => '//xsd:Settings/xsd:Setting[@SID="7014"]/xsd:Value',
    10 => '//xsd:Settings/xsd:Setting[@SID="7015"]/xsd:Value',
);
while( my($k,$v) = each %hash ) {
    # list assignment takes the first matching node
    my ($match) = $xpc->findnodes($v);
    $match->removeChildNodes();
    $match->appendText($k);
}
Reference: hash, hash operations

Yet Another Way, using zip from the core List::Util module:
#!/usr/bin/env perl
use warnings;
use strict;
use List::Util qw/zip/;
...;
my @GreetGoalDP1 = (3, 5, 7, 10);
my @GreetSIDSunDP1 = ('//xsd:Settings/xsd:Setting[@SID="7012"]/xsd:Value',
'//xsd:Settings/xsd:Setting[@SID="7013"]/xsd:Value',
'//xsd:Settings/xsd:Setting[@SID="7014"]/xsd:Value',
'//xsd:Settings/xsd:Setting[@SID="7015"]/xsd:Value');
foreach my $pair (zip \@GreetSIDSunDP1, \@GreetGoalDP1) {
my ($matchSunDP1G1) = $xpc->findnodes($pair->[0]);
$matchSunDP1G1->removeChildNodes();
$matchSunDP1G1->appendText($pair->[1]);
}

Related

Finding index of the lowest value in array

I have 3 arrays @energy, @es_energy and @hb_energy, each of which has been indexed with the same term [$k].
I want to find the lowest value in @energy and then use that index to look up the corresponding values in the other arrays.
Currently I am using my $n = nmin_by { $energy[$_] } 0 .. $#energy;
And then $n is used to output from the other arrays. However, I don't want to use nmin_by as it requires an extra library to download for the software package I am using (loads of admin issues).
Any suggestions?
Use List::Util::reduce
use warnings;
use strict;
use feature 'say';
use List::Util qw(reduce);
my @ary = (12, 3, 1, 23);
my $min_idx = reduce { $ary[$a] < $ary[$b] ? $a : $b } 0..$#ary;
say $min_idx;
Put this in a sub so that the implementation is out of sight while the name clarifies the purpose
use Carp;
sub get_min_idx {
my $ra = shift;
croak "Sub expects array reference" if ref $ra ne 'ARRAY';
return reduce { $ra->[$a] < $ra->[$b] ? $a : $b } 0..$#$ra;
}
my $min_idx = get_min_idx(\@ary);
Tuck it away in a module and you can also change how it works with minimal intrusion.
The error message can be elaborated (tell the user what has been passed, for instance) and checks added; for one, given the numeric < comparison the sub needs an array with only numbers.
Syntax clarification: the index of the last element of an arrayref $rary is $#$rary (while the index of the last element of an array @ary is $#ary).
Pick your subroutine name carefully; having a good name helps a lot.
Thanks to Borodin for commenting on the need for this.
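To connect this back to the question, here is a small sketch that uses the index found by reduce to read the matching entries of the other arrays. The numbers are made-up sample data; only the array names come from the question.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Made-up sample data; the three arrays share the same indices
my @energy    = (-12.5, -40.2, -7.1);
my @es_energy = (1.1, 2.2, 3.3);
my @hb_energy = (0.4, 0.5, 0.6);

# Index of the smallest value in @energy
my $n = reduce { $energy[$a] < $energy[$b] ? $a : $b } 0 .. $#energy;

# The same index selects the corresponding values in the other arrays
printf "index %d: energy %s, es %s, hb %s\n",
    $n, $energy[$n], $es_energy[$n], $hb_energy[$n];
```

List::Util ships with Perl, so this avoids the extra download that nmin_by would need.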

De-reference x number of times for x number of data structures

I've come across an obstacle in one of my perl scripts that I've managed to solve, but I don't really understand why it works the way it works. I've been scouring the internet but I haven't found a proper explanation.
I have a subroutine that returns a reference to a hash of arrays. The hash keys are simple strings, and the values are references to arrays.
I print out the elements of the array associated with each key, like this
for my $job_name (keys %$build_numbers) {
print "$job_name => ";
my @array = @{@$build_numbers{$job_name}}; # line 3
for my $item ( @array ) {
print "$item \n";
}
}
While I am able to print out the keys & values, I don't really understand the syntax behind line 3.
Our data structure is as follows:
Reference to a hash whose values are references to the populated arrays.
To extract the elements of the array, we have to:
- dereference the hash reference so we can access the keys
- dereference the array reference associated to a key to extract elements.
Final question being:
When dealing with perl hashes of hashes of arrays etc; to extract the elements at the "bottom" of the respective data structure "tree" we have to dereference each level in turn to reach the original data structures, until we obtain our desired level of elements?
Hopefully somebody could help out by clarifying.
Line 3 is taking a slice of your hash reference, but it's a very strange way to do what you're trying to do because a) you normally wouldn't slice a single element and b) there's cleaner and more obvious syntax that would make your code easier to read.
If your data looks something like this:
my $data = {
foo => [0 .. 9],
bar => ['A' .. 'F'],
};
Then the correct version of your example would be:
for my $key (keys(%$data)) {
print "$key => ";
for my $val (@{$data->{$key}}) {
print "$val ";
}
print "\n";
}
Which produces:
bar => A B C D E F
foo => 0 1 2 3 4 5 6 7 8 9
If I understand your second question, the answer is that you can access precise locations of complex data structures if you use the correct syntax. For example:
print "$data->{bar}->[4]\n";
Will print E.
Additional recommended reading: perlref, perlreftut, and perldsc
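As a rough illustration of why the slice form still works, the sketch below fetches the same array reference both ways and compares the results. The data matches the example above; the variable names $slice_ref and $arrow_ref are invented for this sketch.

```perl
use strict;
use warnings;

# Same shape as the example data above
my $data = {
    foo => [0 .. 9],
    bar => ['A' .. 'F'],
};

# @$data{'bar'} is a hash slice with one key: it returns a list of one
# element, the array reference stored under 'bar'
my ($slice_ref) = @$data{'bar'};

# $data->{bar} fetches the same single value directly
my $arrow_ref = $data->{bar};

# Both hold the same reference, so both dereference to the same array
my @via_slice = @{$slice_ref};
my @via_arrow = @{$arrow_ref};

print "same reference\n" if $slice_ref == $arrow_ref;
print "@via_arrow\n";
```

The arrow form says "one element" directly, which is why it is usually preferred over a one-key slice.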
Working with data structures can be hard depending on how it was made.
I am not sure if your "job" data structure is exactly this but:
#!/usr/bin/env perl
use strict;
use warnings;
use diagnostics;
my $hash_ref = {
job_one => [ 'one', 'two'],
job_two => [ '1','2'],
};
foreach my $job ( keys %{$hash_ref} ){
print " Job => $job\n";
my @array = @{$hash_ref->{$job}};
foreach my $item ( @array )
{
print "Job: $job Item $item\n";
}
}
You have a hash reference whose values are array references, and you iterate over its keys. Each item of such an array could itself be another reference or a simple scalar.
Basically you can work with the ref or undo the ref like you did in the first loop.
There is a piece of documentation you can check for more details here.
So answering your question:
Final question being: - When dealing with perl hashes of hashes of
arrays etc; to extract the elements at the "bottom" of the respective
data structure "tree" we have to dereference each level in turn to
reach the original data structures, until we obtain our desired level
of elements?
It depends on how your data structure was built. If you already know what you are looking for, getting the value is simple, for example:
my %city_codes = (
a => 1, b => 2,
);
my $value = $city_codes{a};
Complex data structures come with complex code.

Assign multiple local vars to array entries in perl

In Perl, I've always been confused about how to cleanly assign multiple local variables from array entries.
I use the following syntax in subs all the time, so I'm somewhat familiar with it:
my ($var1, $var2) = @_
but other variations of this confuse me. For instance, I have the following code that works:
for my $ctr (0 .. $#matchingLines) {
my $lineNo = $matchingLines[$ctr][0];
my $text = $matchingLines[$ctr][1];
Where "@matchingLines" is an array of two-element arrays.
I wish I could convert the last two lines to the obvious:
my ($lineNo, $text) = $matchingLines[$ctr];
That of course does not work. I've tried numerous variations, but I can't find anything that works.
Just dereference the array ref:
my ( $lineNo, $text ) = @{ $matchingLines[$ctr] };
Check out Perl Data Structures Cookbook for additional examples.
It sounds like you have an array of arrays. This means that the inner arrays will be array references. If you want to assign them to variables then you need to dereference them.
use strict;
use warnings;
my @matchingLines = (['y','z'],['a','b']);
for my $ctr (0 .. $#matchingLines) {
my ($lineNo, $text) = @{$matchingLines[$ctr]};
print "#Array index: $ctr - lineno=$lineNo - text=$text\n"
}
This produces the output
#Array index: 0 - lineno=y - text=z
#Array index: 1 - lineno=a - text=b
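When the index itself is not needed, one possible variation is to loop over the inner references directly. This is a sketch with invented sample rows, not the asker's real data.

```perl
use strict;
use warnings;

# Invented sample rows: [line number, text]
my @matchingLines = ([10, 'first match'], [42, 'second match']);

my @report;
for my $entry (@matchingLines) {
    # Each $entry is an array reference; @$entry expands it into a list
    my ($lineNo, $text) = @$entry;
    push @report, "line $lineNo, text $text";
}

print "$_\n" for @report;
```

Assigning $matchingLines[$ctr] without the dereference would put the array reference itself into $lineNo and leave $text undefined, which is why the plain assignment in the question does not work.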

Count Perl array size

I'm trying to print out the size of my array. I've followed a few other questions like this one on Stack Overflow. However, I never get the result I want.
All I wish for in this example is for the value of 3 to be printed as I have three indexes. All I get, from both print methods is 0.
my @arr;
$arr{1} = 1;
$arr{2} = 2;
$arr{3} = 3;
my $size = @arr;
print $size; # Prints 0
print scalar @arr; # Prints 0
What am I doing wrong, and how do I get the total size of an array when declared and populated this way?
First off:
my @arr;
$arr{1} = 1;
$arr{2} = 2;
$arr{3} = 3;
is nonsense. {} is for hash keys, so you are referring to %arr, not @arr. use strict; and use warnings; would have told you this, and that is just one small part of why they're considered mandatory.
To count the elements in an array, merely access it in a scalar context.
print scalar @arr;
if ( $num_elements < @arr ) { do_something(); }
But you would need to change your code to
my @arr;
$arr[1] = 1;
$arr[2] = 2;
$arr[3] = 3;
And note - the first element of your array $arr[0] would be undefined.
$VAR1 = [
undef,
1,
2,
3
];
As a result, you would get a result of 4. To get the desired 'count of elements' you would need to filter the undefined items, with something like grep:
print scalar grep {defined} @arr;
This will take @arr, filter it with grep (returning 3 elements) and then take the scalar value - the count of elements, in this case 3.
But normally - you wouldn't do this. It's only necessary because you're trying to insert values into specific 'slots' in your array.
What you would do more commonly, is use either a direct assignment:
my @arr = ( 1, 2, 3 );
Or:
push ( @arr, 1 );
push ( @arr, 2 );
push ( @arr, 3 );
Which inserts the values at the end of the array. You would - if explicitly iterating - go from 0..$#arr but you rarely need to do this when you can do:
foreach my $element ( @arr ) {
print $element,"\n";
}
Or you can do it with a hash:
my %arr;
$arr{1} = 1;
$arr{2} = 2;
$arr{3} = 3;
This turns your array into a set of (unordered) key-value pairs, which you can access with keys %arr and do exactly the same:
print scalar keys %arr;
if ( $elements < keys %arr ) { do_something(); }
In this latter case, your hash will be:
$VAR1 = {
'1' => 1,
'3' => 3,
'2' => 2
};
I would suggest this is bad practice - if you have ordered values, the tool for the job is the array. If you have 'key' values, a hash is probably the tool for the job still - such as a 'request ID' or similar. You can typically tell the difference by looking at how you access the data, and whether there are any gaps (including from zero).
So to answer your question as asked:
my $size = @arr;
print $size; # prints 0
print scalar @arr; # prints 0
These don't work, because you never insert any values into @arr. But you do have a hash called %arr which you created implicitly. (And again - use strict; and use warnings; would have told you this).
You are initializing a hash, not an array.
To get the "size" of your hash you can write.
my $size = keys %arr;
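To put the counting idioms side by side, here is a short sketch. The variable names are chosen for this example only, and the hash is named %h so it is not confused with the array.

```perl
use strict;
use warnings;

my @arr = (1, 2, 3);
my %h   = (1 => 1, 2 => 2, 3 => 3);

my $array_size = @arr;      # array in scalar context: element count
my $last_index = $#arr;     # last index, one less than the count
my $hash_size  = keys %h;   # keys in scalar context: number of keys

print "$array_size $last_index $hash_size\n";
```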
I just thought there should be an illustration of your code run with USUW (use strict/use warnings) and what it adds to the troubleshooting process:
use strict;
use warnings;
my @arr;
...
And when you run it:
Global symbol "%arr" requires explicit package name (did you forget to declare "my %arr"?) at - line 9.
Global symbol "%arr" requires explicit package name (did you forget to declare "my %arr"?) at - line 10.
Global symbol "%arr" requires explicit package name (did you forget to declare "my %arr"?) at - line 11.
Execution of - aborted due to compilation errors.
So USUW.
You may be thinking that you are instantiating an element of @arr when you are typing in the following code:
$arr{1} = 1;
However, you are instantiating a hash doing that. This tells me that you are not using strict or you would have an error. Instead, change to brackets, like this:
$arr[1] = 1;

When is it better to use an array instead of a hash in Perl?

Say you have an array @a = qw/ a b c d/;
and a hash %a = ('a' => 1, 'b' => 1, 'c' => 1, 'd' => 1);
Is there any situation where creating the array version is better than creating the hash (other than when you have to iterate over all the values as in something like
for (@a){
....
in which case you would have to use keys %a if you went with the hash)? Because testing whether a specific value is in a hash is always more efficient than doing so in an array, correct?
Arrays are indexed by numbers.
Hashes are keyed by strings.
All indexes up to the highest index exist in an array.
Hashes are sparsely indexed. (e.g. "a" and "c" can exist without "b".)
There are many emergent properties. Primarily,
Arrays can be used to store ordered lists.
It would be ugly and inefficient to use hashes that way.
It's not possible to delete an element from an array unless it's the highest indexed element.
You can delete from an ordered list implemented using an array, though it is inefficient to remove elements other than the first or last.
It's possible to delete an element from a hash, and it's efficient.
Arrays are ordered lists of values. They can contain duplicate values.
@array = qw(a b c a);
Hashes are a mapping between a key (which must be unique) and a value (which can be duplicated). Hashes are (effectively) unordered, which means that keys come out in apparently random order rather than the order in which they are entered.
%hash = (a => 1, b => 2, c => 3);
Hashes can also be used as sets when only the key matters. Sets are unordered and contain only unique "values" (the hash's keys).
%set = (a => undef, b => undef, c => undef);
Which one to use depends on your data and algorithm. Use an array when order matters (particularly if you can't sort to derive the order) or if duplicate values are possible. Use a set (i.e. use a hash as a set) when values must be unique and you don't care about order. Use a hash when uniqueness matters, order doesn't (or is easily sortable), and look-ups are based on arbitrary values rather than integers.
You can combine arrays and hashes (via references) to create arbitrarily complex data structures.
@aoa = ([1, 2, 3], [4, 5, 6]); # array of arrays ("2D" array)
%hoh = (a => { x => 1 }, b => { x => 2 }); # hash of hashes
@aoh = ({a => 1, b => 2}, {a => 3, b => 4}); # array of hashes
%hoa = (a => [1, 2], b => [3, 4]); # hash of arrays
...etc.
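For the membership test raised in the question, a minimal sketch of the two approaches might look like this. The list contents and the names %set, $in_array and $in_set are illustrative.

```perl
use strict;
use warnings;

my @list = qw(a b c d);

# Build a set: each array element becomes a hash key
my %set = map { $_ => 1 } @list;

# The array test scans the elements; the hash test looks the key up directly
my $in_array = grep { $_ eq 'c' } @list;   # number of matching elements
my $in_set   = exists $set{c};

print "array: ", ($in_array ? 'yes' : 'no'),
      ", set: ", ($in_set ? 'yes' : 'no'), "\n";
```

Building the set costs one pass over the array, so it tends to pay off when the same list is tested many times.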
This is about using numbers as hash keys. It doesn't answer the question directly as it doesn't compare the facilities that arrays provide, but I thought it would be a good place to put the information.
Suppose a hash with ten elements is built using code like this
use strict;
use warnings;
my %hash;
my $n = 1000;
for (1 .. 10) {
$hash{$n} = 1;
$n *= 1000;
}
and then we query it, looking for keys that are powers of ten. Of course the easiest way to multiply an integer by ten is to add a zero, so it is fine to write
my $m = '1';
for (1 .. 100) {
print $m, "\n" if $hash{$m};
$m .= 0;
}
which has the output
1000
1000000
1000000000
1000000000000
1000000000000000
1000000000000000000
We entered ten elements but this shows only six. What has happened? Let's take a look at what's in the hash.
use Data::Dump;
dd \%hash;
and this outputs
{
"1000" => 1,
"1000000" => 1,
"1000000000" => 1,
"1000000000000" => 1,
"1000000000000000" => 1,
"1000000000000000000" => 1,
"1e+021" => 1,
"1e+024" => 1,
"1e+027" => 1,
"1e+030" => 1,
}
so the hash doesn't use the keys that we imagined. It stringifies the numbers in a way that it would be foolish to try to emulate.
For a slightly more practical example, say we had some circles and wanted to collect them into sets by area. The obvious thing is to use the area as a hash key, as in this program, which creates 100,000 circles with random integer radii below 18 million.
use strict;
use warnings;
use 5.010;
package Circle;
use Math::Trig 'pi';
sub new {
my $class = shift;
my $self = { radius => shift };
bless $self, $class;
}
sub area {
my $self = shift;
my $radius = $self->{radius};
pi * $radius * $radius;
}
package main;
my %circles;
for (1 .. 100_000) {
my $circle = Circle->new(int rand 18_000_000);
push @{ $circles{$circle->area} }, $circle;
}
Now let's see how many of those hash keys use scientific notation
say scalar grep /e/, keys %circles;
which says (randomly, of course)
861
so there really isn't a tidy way of knowing what string perl will use if we specify a number as a hash key.
In Perl an @array is an ordered list of values ($v1, $v2, ...) accessed by an integer index (positive or negative),
while a %hash is an unordered list of 'key => value' pairs (k1 => $v1, k2 => $v2, ...) accessed by a string.
There are modules on CPAN that implement ordered hashes, such as Hash::Ordered and Tie::IxHash.
You might want to use an array when you have ordered items, especially a large number of them, for
which using a %hash and sorting the keys and/or the values would be inefficient.

Resources