Ways to Flatten A Perl Array in Scalar Context - arrays

I recently started learning perl and have a question that I'm not finding a clear answer to on the Internet. say I have something like this,
#arr = (1, 2, 3);
$scal = "#arr"
# $scal is now 123.
Is the use of quotes the only way to flatten the array so that each element is stored in the scalar value? It seems improbable but I haven't found any other ways of doing this. Thanks in advance.

The join function is commonly used to "flatten" lists. Lets you specify what you want between each element in the resulting string.
$scal = join(",", #arr);
# $scal is no "1,2,3"

In your example, you're interpolating an array in a double-quoted string. What happens in those circumstances is is controlled by Perl's $" variable. From perldoc perlvar:
$LIST_SEPARATOR
$"
When an array or an array slice is interpolated into a double-quoted string or a similar context such as /.../ , its elements are separated by this value. Default is a space. For example, this:
print "The array is: #array\n";
is equivalent to this:
print "The array is: " . join($", #array) . "\n";
Mnemonic: works in double-quoted context.
The default value for $" is a space. You can obviously change the value of $".
{
local $" = ':',
my #arr = (1, 2, 3);
my $scalar = "#arr"; # $scalar contains '1:2:3'
}
As with any of Perl's special variables, it's always best to localise any changes within a code block.

You could also use join without any seperator
my $scalar = join( '' , #array ) ;
There is more than one way to do it.

in the spirit of TIMTOWTDI:
my $scal;
$scal .= $_ foreach #arr;

Read section Context in perldata. Perl has two major contexts: scalar and list.
For example:
#a = (1, 1, 1); # list context
print #a; # list context
$count = #a; # scalar context, returns the number of elements in #a
etc.

Related

How can I create a variable array name in Perl?

Array #p is a multiline array, e.g. $p[1] is the second line.
This code will explain what I want:
$size=#p; # line number of array #p
for($i=0; $i<$size; $i++)
{
#p{$i}= split(/ +/,$p[$i]);
}
I want the result should be like this:
#p0 = $p[0] first line of array #p goes to array #p0;
#p1 = $p[1] second line of array #p goes to array #p1;
...
...
and so on.
But above code does not work, how can I do it?
It is a bad idea to dynamically generate variable names.
I suggest the best solution here is to convert each line in your #p array to an array of fields.
Lets suppose you have a better name for #p, say #lines. Then you can write
my #lines = map [ split ], <$fh>;
to read in all the lines from the file handle $fh and split them on whitespace. The first field of the first line is then $lines[0][0]. The third field of the first line is $lines[0][2] etc.
First, the syntax #p{$i} accesses the entry with the key $i in a hash %p, and returns it in list context. I don't think you meant that. use strict; use warnings; to get warned about undeclared variables.
You can declare variables with my, e.g. my #p; or my $size = #p;
Creating variable names on the fly is possible, but a bad practice. The good thing is that we don't need to: Perl has references. A reference to an array allows us to nest arrays, e.g.
my #AoA = (
[1, 2, 3],
["a", "b"],
);
say $AoA[0][1]; # 2
say $AoA[1][0]; # a
We can create an array reference by using brackets, e.g. [ #array ], or via the reference operator \:
my #inner_array = (1 .. 3);
my #other_inner = ("a", "b");
my #AoA = (\#inner_array, \#other_array);
But careful: the array references still point to the same array as the original names, thus
push #other_inner, "c";
also updates the entry in #AoA:
say $AoA[1][2]; # c
Translated to your problem this means that you want:
my #pn;
for (#p) {
push #pn, [ split /[ ]+/ ];
}
There are many other ways to express this, e.g.
my #pn = map [ split /[ ]+/ ], #p;
or
my #pn;
for my $i ( 0 .. $#p ) {
$pn[$i] = [ split /[ ]+/, $p[$i] ];
}
To learn more about references, read
perlreftut,
perldsc, and
perlref.

How to retrieve an array from a hash that has been passed to a subroutine in perl

I am trying to write a subroutine that takes in a hash of arrays as an argument. However, when I try to retrieve one of the arrays, I seem to get the size of the array instead of the array itself.
my(%hash) = ( );
$hash{"aaa"} = ["blue", 1];
_subfoo("test", %hash);
sub _subfoo {
my($test ,%aa) = #_;
foreach my $name (keys %aa) {
my #array = #{$aa{$name}};
print $name. " is ". #array ."\n";
}
}
This returns 2 instead of (blue, 1) as I expected. Is there some other way to handle arrays in hashes when in a subroutine?
Apologies if this is too simple for stack overflow, first time poster, and new to programming.
You're putting your #array array into a scalar context right here:
print $name. " is ". #array ."\n";
An array in scalar context gives you the number of elements in the array and #array happens to have 2 elements. Try one of these instead:
print $name . " is " . join(', ', #array) . "\n";
print $name, " is ", #array, "\n";
print "$name is #array\n";
and you'll see the elements of your #array. Using join lets you paste the elements together as you please; the second one evaluates #array in list context and will mash the values together without separating them; the third interpolates #array by joining its elements together with $" (which is a single space by default).
As mu is too short has mentioned, you used the array in scalar context, and therefore it returned its length instead of its elements. I had some other pointers about your code.
Passing arguments by reference is sometimes a good idea when some of those arguments are arrays or hashes. The reason for this is that arrays and hashes are expanded into lists before being passed to the subroutine, which makes something like this impossible:
foo(#bar, #baz);
sub foo { # This will not work
my (#array1, #array2) = #_; # All the arguments will end up in #array1
...
}
This will work, however:
foo(\#bar, \#baz);
sub foo {
my ($aref1, $aref2) = #_;
...
}
You may find that in your case, each is a nice function for your purposes, as it will make dereferencing the array a bit neater.
foo("test", \%hash); # note the backslash to pass by reference
sub foo {
my ($test, $aa) = #_; # note use of scalar $aa to store the reference
while (my ($key, $value) = each %$aa)) { # note dereferencing of $aa
print "$key is #$value\n"; # ...and $value
}
}

Perl: mapping to lists' first element

Task: to build hash using map, where keys are the elements of the given array #a, and values are the first elements of the list returned by some function f($element_of_a):
my #a = (1, 2, 3);
my %h = map {$_ => (f($_))[0]} #a;
All the okay until f() returns an empty list (that's absolutely correct for f(), and in that case I'd like to assign undef). The error could be reproduced with the following code:
my %h = map {$_ => ()[0]} #a;
the error itself sounds like "Odd number of elements in hash assignment". When I rewrite the code such that:
my #a = (1, 2, 3);
my $s = ()[0];
my %h = map {$_ => $s} #a;
or
my #a = (1, 2, 3);
my %h = map {$_ => undef} #a;
Perl does not complain at all.
So how should I resolve this — get first elements of list returned by f(), when the returned list is empty?
Perl version is 5.12.3
Thanks.
I've just played around a bit, and it seems that ()[0], in list context, is interpreted as an empty list rather than as an undef scalar. For example, this:
my #arr = ()[0];
my $size = #arr;
print "$size\n";
prints 0. So $_ => ()[0] is roughly equivalent to just $_.
To fix it, you can use the scalar function to force scalar context:
my %h = map {$_ => scalar((f($_))[0])} #a;
or you can append an explicit undef to the end of the list:
my %h = map {$_ => (f($_), undef)[0]} #a;
or you can wrap your function's return value in a true array (rather than just a flat list):
my %h = map {$_ => [f($_)]->[0]} #a;
(I like that last option best, personally.)
The special behavior of a slice of an empty list is documented under “Slices” in perldata:
A slice of an empty list is still an empty list. […] This makes it easy to write loops that terminate when a null list is returned:
while ( ($home, $user) = (getpwent)[7,0]) {
printf "%-8s %s\n", $user, $home;
}
I second Jonathan Leffler's suggestion - the best thing to do would be to solve the problem from the root if at all possible:
sub f {
# ... process #result
return #result ? $result[0] : undef ;
}
The explicit undef is necessary for the empty list problem to be circumvented.
At first, much thanks for all repliers! Now I'm feeling that I should provide the actual details of the real task.
I'm parsing a XML file containing the set of element each looks like that:
<element>
<attr_1>value_1</attr_1>
<attr_2>value_2</attr_2>
<attr_3></attr_3>
</element>
My goal is to create Perl hash for element that contains the following keys and values:
('attr_1' => 'value_1',
'attr_2' => 'value_2',
'attr_3' => undef)
Let's have a closer look to <attr_1> element. XML::DOM::Parser CPAN module that I use for parsing creates for them an object of class XML::DOM::Element, let's give the name $attr for their reference. The name of element is got easy by $attr->getNodeName, but for accessing the text enclosed in <attr_1> tags one has to receive all the <attr_1>'s child elements at first:
my #child_ref = $attr->getChildNodes;
For <attr_1> and <attr_2> elements ->getChildNodes returns a list containing exactly one reference (to object of XML::DOM::Text class), while for <attr_3> it returns an empty list. For the <attr_1> and <attr_2> I should get value by $child_ref[0]->getNodeValue, while for <attr_3> I should place undef into the resulting hash since no text elements there.
So you see that f function's (method ->getChildNodes in real life) implementation could not be controlled :-) The resulting code that I have wrote is (the subroutine is provided with list of XML::DOM::Element references for elements <attr_1>, <attr_2>, and <attr_3>):
sub attrs_hash(#)
{
my #keys = map {$_->getNodeName} #_; # got ('attr_1', 'attr_2', 'attr_3')
my #child_refs = map {[$_->getChildNodes]} #_; # got 3 refs to list of XML::DOM::Text objects
my #values = map {#$_ ? $_->[0]->getNodeValue : undef} #child_refs; # got ('value_1', 'value_2', undef)
my %hash;
#hash{#keys} = #values;
%hash;
}

In Perl, what is the difference between #array[1] and $array[1]?

I have been studying array slices and frankly do not see the difference between choosing #array[1] and $array[1]. Is there a difference?
#!/usr/bin/perl
#array = (1,3);
print "\nPrinting out full array..\#array\n";
print "#array\n";
print "\n";
print "Printing out \#array[1]\n";
print "#array[1]\n";
print "Printing out \$array[1]\n";
print "$array[1]\n";
print "\n\n";
This is a FAQ.
The two forms are likely to work the same way in many contexts (but not all, as you can see from the example in the FAQ), but $array[1] expresses the intent more clearly.
$ and # are called sigils: hear what the sigils tell you
When you see $ you are dealing with a single thing.
When you see # you have a list of things.
#array[ 1 ] is a slice, a list with selected elements from the #array list. You have put only an element in this slice, the second element of #array, but it's a list anyway.
$array[ 1 ] is the second element of the list, a single value, a scalar.
#hash{ 'one', 'two' } is another kind of slice: this time we sliced an hash ( %hash ), obtaining a list with values corresponding to keys one and two.
Where's the difference? Although the difference is thin, you should avoid single element slices when you want a single value. Remember that on the right hand side of an expression, single element slices behave like scalars, but when they are on the left hand side of an expression they become a list assignment: they give list context to the expression.
If still you aren't feeling comfortable with the difference between scalar and list context, please don't use single element slices in order to avoid unexpected results in certain situations ( for instance when using the line input operator < > ).
Check the docs and happy Perl.
You already have your answer from Keith and Marco: # means list, $ means scalar (single value). I have more to add that does not fit in a comment.
You do not notice the difference because you are using a slice of one element. However, try this:
use warnings;
use strict;
use v5.10; # to enable say
my #array = (1 .. 10);
say "#array[2 .. 4]"; # [2,3,4] == "3 4 5"
say "#array[1,4,7]"; # == "2 5 8"
Here's some pointers to your code.
use strict;
use warnings;
There really is no excuse for not using these two. There is hardly any circumstance where it is better -- or easier -- to not use them. This holds especially true when you are experimenting and trying to learn perl.
If you had been using warnings, you would have gotten this additional information:
Scalar value #array[1] better written as $array[1]
Which basically is exactly the answer you have been given here, albeit said in the perl compiler's terse language.
Also, using print the way you do it is hard on the eyes. Make use of perl's flexibility:
#!/usr/bin/perl
use strict;
use warnings;
my #array = (1 .. 10);
print "
Printing out full array..\#array
#array
Printing out \#array[1]
#array[1]
Printing out \$array[1]
$array[1]
";
As you can see, this will print out newlines in a WYSIWYG fashion.
There are two differences:
The «1» is evaluated in list context in the case of the array slice (#foo[1]), but in scalar context in the case of the array indexing ($foo[1]). This has no functional effect.
#foo[1] gives a warning.
Good question - here's some differences:
$a[0] = localtime; print #a; # this prints: Tue Nov 1 18:51:13 2011
#a[0] = localtime; print #a; # this prints the amount of seconds, e.g. 13
$a[0] = grep{}, qw(6 2 8); print #a; # this prints: 3
#a[0] = grep{}, qw(6 2 8); print #a; # this prints: 6
$a[0] = reverse ("a", "b"); print #a; # this prints: ba
#a[0] = reverse ("a", "b"); print #a; # this prints: b
$a[0] = ("a", "b"); print #a; # this prints: b
#a[0] = ("a", "b"); print #a; # this prints: a
It doesn't matter if the slice is [1]. You can create more examples by finding functions that give different results when evaluated in different contexts.
Explanation of the latter (4th) example:
The $ in $a[0] creates a scalar context. The context affects the way the comma operator works, creating different results:
"Binary "," is the comma operator. In scalar context it evaluates its left argument, throws that value away, then evaluates its right argument and returns that value."
http://perldoc.perl.org/perlop.html#Comma-Operator
Hence: b
To understand the line beginning with: #a[0] consider:
($a, $b, $c) = ("a", "b", "c", "d"); # here the "d" is discarded
print $a, $b, $c; # this prints: abc
($a, $b) = ("a", "b", "c"); # here the "c" is discarded
($a) = ("a", "b"); # here the "b" is discarded
($a[0]) = ("a", "b"); # here the "b" is discarded
It would appear that parentheses at the beginning of the line create a list context. This is pretty much what it says at: http://docstore.mik.ua/orelly/perl4/prog/ch02_07.htm
"Assignment to a list of scalars also provides a list context to the righthand side, even if there's only one element in the list"
# at the beginnning of a line also creates a list context:
#a[0] = ("a", "b") means evaluate the RHS in the same way as above i.e. discard the "b"

How to print 'AND' between array elements?

If I have an array with name like below.
How do I print "Hi joe and jack and john"?
The algorithm should also work, when there is only one name in the array.
#!/usr/bin/perl
use warnings;
use strict;
my #a = qw /joe jack john/;
my $mesg = "Hi ";
foreach my $name (#a) {
if ($#a == 0) {
$mesg .= $name;
} else {
$mesg .= " and " . $name;
}
}
print $mesg;
Usually we use an array join method to accomplish this. Here pseudo code:
#array = qw[name1 name2 name2];
print "Hey ", join(" and ", #array), ".";
Untested:
{ local $, = " and "; print "Hi "; print #a; }
Just use the special variable $".
$"="and"; #" (this last double quote is to help the syntax coloring)
$mesg="Hi #a";
To collect the perldoc perlvar answers, you may do one of (at least) two things.
1) Set $" (list separator):
When an array or an array slice is interpolated into a double-quoted
string or a similar context such as /.../ , its elements are separated
by this value. Default is a space. For example, this:
print "The array is: #array\n";
is equivalent to this:
print "The array is: " . join($", #array) . "\n";
=> $" affects the behavior of the interpolation of the array into a string
2) Set $, (output field separator):
The output field separator for the print operator. If defined, this
value is printed between each of print's arguments. Default is undef.
Mnemonic: what is printed when there is a "," in your print statement.
=> $, affects the behavior of the print statement.
Either will work, and either may be used with local to set the value of the special variable only within an enclosing scope. I guess the difference is that with $" you are not limited to the print command:
my #z = qw/ a b c /;
local $" = " and ";
my $line = "#z";
print $line;
here the "magic" happens on the 3rd line not at the print command.
In truth though, using join is the most readable, and unless you use a small enclosing block, a future reader might not notice the setting of a magic variable (say nearer the top) and never see that the behavior is not what is expected vs normal performance. I would save these tricks for small one-offs and one-liners and use the readable join for production code.

Resources