About accessing the Array of Arrays - arrays

I once read the following example about an "array of arrays" (AoA), which is a two-dimensional array.
The following code segment is claimed to print the whole thing with refs:
for $aref ( @AoA ) {
    print "\t [ @$aref ],\n";
}
And the following code segment is claimed to print the whole thing with indices:
for $i ( 0 .. $#AoA ) {
    print "\t [ @{$AoA[$i]} ],\n";
}
What does $aref stand for here? How should I understand @$aref and @{$AoA[$i]}? Thanks.

$aref stands for "array reference", i.e. a reference to an array.
my $my_aref = \@somearray;
You can get back to the array from an array reference with the following syntax:
@{$my_aref}
@{$my_aref} is @somearray. (It's not a copy, it really is the same array.)
In the second example, $AoA[$i] is an array reference, and you dereference it with the same syntax: @{$AoA[$i]}.
See perlreftut for more explanations and examples.

An "array of arrays" isn't actually an array of arrays. It's really an array of array references. Each element in the base array is a reference to another array. Thus, when you cycle through the elements in the base array, you get back array references. These are what get assigned to $aref in the first loop. They are then dereferenced by prepending the @ sigil, so @$aref is the array referenced by the $aref array reference.
The same sort of thing works for the second loop. $AoA[$i] is the $i-th element of the @AoA array, which is an array reference. Dereferencing it by prepending the @ sigil (and adding {} for clarity, and possibly for precedence) means @{$AoA[$i]} is the array referenced by the $AoA[$i] array reference.
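As a minimal sketch of the two loops, assuming a small two-row @AoA invented for illustration, the following collects the same bracketed strings by reference and by index instead of printing them directly:

```perl
use strict;
use warnings;

# Hypothetical sample data: two rows of three elements each.
my @AoA = ( [ 'a', 'b', 'c' ], [ 'd', 'e', 'f' ] );

# By reference: each element of @AoA is an array reference.
my @by_ref;
for my $aref (@AoA) {
    push @by_ref, "[ @$aref ]";
}

# By index: $AoA[$i] is the same array reference, reached through its index.
my @by_index;
for my $i ( 0 .. $#AoA ) {
    push @by_index, "[ @{$AoA[$i]} ]";
}

print "$_\n" for @by_ref;    # prints [ a b c ] and then [ d e f ]
```

Both loops visit the same references, so @by_ref and @by_index end up with identical contents.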

Perl doesn't have multidimensional arrays. One places arrays into other arrays to achieve the same result.
Well, almost. The values of arrays (and hashes) are scalars, so one cannot place an array into another array. What one does instead is place a reference to an array.
In other words, "array of arrays" is short for "array of references to arrays". Each value of @AoA is a reference to another array, giving the "illusion" of a two-dimensional array.
The references come from the use of [ ] or equivalent. [ ] creates an anonymous array, then creates a reference to that array, then returns the reference.
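Common ways of building an AoA:
# As a literal list of anonymous arrays
my @AoA = (
    [ 'a', 'b', 'c' ],
    [ 'd', 'e', 'f' ],
);
# By pushing one row at a time
my @AoA;
push @AoA, [ 'a', 'b', 'c' ];
push @AoA, [ 'd', 'e', 'f' ];
# By assigning to individual elements
my @AoA;
$AoA[$y][$x] = $n;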
Keep in mind that
$AoA[$y][$x] = $n;
is short for
$AoA[$y]->[$x] = $n;
and it's equivalent to the following thanks to autovivification:
( $AoA[$y] //= [] )->[$x] = $n;
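Autovivification can be observed directly. The sketch below assumes an empty @AoA and arbitrary indices chosen for illustration; it assigns to one inner element and then inspects what perl created along the way:

```perl
use strict;
use warnings;

my @AoA;

# Assigning to an inner element makes perl create the row on demand.
$AoA[1][2] = 'x';

my $rows      = scalar @AoA;             # 2: indexes 0 and 1 now exist
my $row0_set  = defined $AoA[0] ? 1 : 0; # 0: row 0 was never touched
my $row1_type = ref $AoA[1];             # 'ARRAY': row 1 was autovivified
my $row1_cols = scalar @{ $AoA[1] };     # 3: indexes 0 .. 2

print "rows=$rows row0_set=$row0_set row1=$row1_type cols=$row1_cols\n";
```

Only the row that was actually assigned to becomes an array reference; the skipped row stays undef.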

The whole mystery of multi-dimensional structures in Perl is quite easy to understand once you realize that there are only three types of variables to deal with: scalars, arrays and hashes.
A scalar is a single value. It can contain just about anything, but only one thing at a time.
An array contains a number of scalar values, ordered by a fixed numerical index.
A hash contains scalar values, indexed by keys made of strings.
All arrays, hashes and scalars act this way. Multi-dimensional arrays are no different from single-dimensional ones.
This is also expressed very succinctly in perldata:
All data in Perl is a scalar, an array of scalars, or a hash of
scalars. A scalar may contain one single value in any of three
different flavors: a number, a string, or a reference. In general,
conversion from one form to another is transparent. Although a scalar
may not directly hold multiple values, it may contain a reference to
an array or hash which in turn contains multiple values.
For example:
my @array = (1, 2, 3);
Here, $array[0] contains 1, $array[1] contains 2, etc. Just like you would expect.
my @aoa = ( [ 1, 2, 3 ], [ 'a', 'b', 'c' ] );
Here, $aoa[0] contains an array reference. If you print it out, it will say something like ARRAY(0x398a84). Don't worry! That's still a scalar value. How do we know this? Because arrays can only contain scalar values.
When we do something like
for $aref ( @aoa ) {
    print $aref; # prints ARRAY(0x398a84) or similar
}
It's no different from doing
for $number ( @array ) {
    print $number;
}
$aref and $number are scalar values. So far, so good. Take a moment and lock this knowledge down: arrays can only contain scalar values.
Now, the next part is simply knowing how to deal with references. This is documented in perlref and perlreftut.
A reference is a scalar value. It's an address to a location in memory. This location contains some data. In order to access the actual data, we need to dereference the reference.
As a simple example:
my @data = (1, 2, 3);
my $aref = \@data; # The backslash in front of the sigil creates a reference
print $aref; # prints something like ARRAY(0xa4b6a4)
print @$aref; # prints 123
Adding a sigil in front of the reference tells perl to dereference the scalar value into the type of data the sigil represents. In this case, an array. If you choose the wrong sigil for the type of reference, perl will give an error such as:
Not a HASH reference
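To see that error without killing the program, one option is to wrap the wrong dereference in eval. This is only a sketch; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $aref = [ 1, 2, 3 ];

# Correct sigil: @ dereferences the array reference.
my $count = scalar @$aref;

# Wrong sigil: % asks for a hash, so perl dies at run time.
# eval catches the exception and leaves the message in $@.
my $ok    = eval { my @keys = keys %$aref; 1 };
my $error = $ok ? '' : $@;

print "count: $count\n";
print "error: $error";
```

The message in $error starts with "Not a HASH reference", followed by the file and line where the dereference was attempted.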
In the example above, we have a reference to a specific, named location. Both @$aref and @data access the same values. If we change a value in one, both are affected, because the address of the memory location is identical. Let's try it:
my @data = (1, 2, 3);
my $aref = \@data;
$$aref[1] = 'a'; # dereference to a scalar value by $ sigil
# $aref->[1] = 'a' # does the same thing, a different way
print @data; # prints 1a3
print @$aref; # prints 1a3
We can also have anonymous data. If we were only interested in building an array of arrays, we'd have no interest in @data, and could skip it by doing this:
my $aref = [ 1, 2, 3 ];
The brackets around the list of numbers create an anonymous array. $aref still contains the same type of data: A reference. But in this case, $aref is the only way we have of accessing the data contained at the memory location. Now, let's build some more scalar values like this:
my $aref1 = [ 1, 2, 3 ];
my $aref2 = [ 'a', 'b', 'c' ];
my $aref3 = [ 'x', 'y', 'z' ];
We now have three scalar variables that contain references to anonymous arrays. What if we put these in an array?
my @aoa = ($aref1, $aref2, $aref3);
If we wanted to access the array behind $aref1, we could do print @$aref1, but we could also do
print @{$aoa[0]};
In this case, we need to use the extended form of dereferencing: @{ ... }. Because perl does not like ambiguity, it requires us to distinguish between @{$aoa[0]} (take the reference in $aoa[0] and dereference it as an array) and @{$aoa}[0] (take the reference in $aoa, dereference it as an array, and take that array's first value).
Above, we could have used @{$aref}, as it is identical to @$aref.
So, if we are only interested in building an array of arrays, we are not really interested in the $aref1 scalars either. So let's cut them out of the process:
my @aoa = ( [ 1, 2, 3 ], [ 'a', 'b', 'c' ], [ 'x', 'y', 'z' ]);
Tada! That's an array of arrays.
Now, we can backtrack. To access the values inside this array, we can do
for my $scalar ( @aoa ) {
    print @$scalar; # prints 123abcxyz
}
This time, I used a different variable name, just to make a point. This loop takes each value from @aoa -- which still is only a scalar value -- dereferences it as an array, and prints it.
Or we can access @aoa via its indexes
for my $i ( 0 .. $#aoa ) {
    print @{$aoa[$i]};
}
And that's all there is to it!
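Single elements can be reached with the same dereferencing rules. The following sketch reuses the three-row @aoa from above; the other variable names are chosen here for illustration:

```perl
use strict;
use warnings;

my @aoa = ( [ 1, 2, 3 ], [ 'a', 'b', 'c' ], [ 'x', 'y', 'z' ] );

# Three spellings of "row 1, column 2":
my $arrow  = $aoa[1]->[2];      # explicit arrow dereference
my $short  = $aoa[1][2];        # the arrow is optional between subscripts
my $braces = ${ $aoa[1] }[2];   # block form with a $ sigil for one element

# Walk every element with a nested loop.
my @flat;
for my $row (@aoa) {
    for my $value (@$row) {
        push @flat, $value;
    }
}

print join( '', @flat ), "\n";  # prints 123abcxyz
```

All three spellings return the same element, 'c'.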

Related

Assigning to a slice of a 3D array using the range operator

I have a 3 dimensional array. I want to set three elements of it like this:
$array[$x][$y][0 .. 2] = (0, 1, 2);
but perl tells me:
Useless use of a constant (1) in void context
With an array sigil instead:
@array[$x][$y][0 .. 2] = (0, 1, 2);
perl tells me:
syntax error near "]["
presumably meaning that it expects me to give it two indices and then assign to the third dimension as a separate array? However, on this page, under Example: Assignment Using Array Slices, it suggests that it is possible to assign to a slice using the range operator where it says:
@array1[1..3] = @array2[23..25];
How can I assign to a slice of the array like this, or do I have to assign each index individually?
You need to dereference the inner array:
@{ $array[$x][$y] }[ 0 .. 2 ] = (0, 1, 2);
$array[$x][$y][0..2] isn't a slice; it's just an element lookup.
When you attempted to change it into a slice, you sliced the wrong array. You sliced @array instead of @{ $array[$x][$y] }.
The key here is to realize that there's no such thing as 3d arrays in Perl. What you have is an array of references to arrays of references to arrays, which is colloquially called an array of arrays of arrays, and often abbreviated to AoAoA.
Array slices have the following syntax:
@NAME[LIST]
@BLOCK[LIST]
@$REF[LIST]
EXPR->@[LIST]    (see note 1)
The first syntax can't be used since the array to slice doesn't have a name. You could use any of the others:
@{ $array[$x][$y] }[0..2] = 0..2;
my $ref = $array[$x][$y]; @$ref[0..2] = 0..2;
$array[$x][$y]->@[0..2] = 0..2;    # see note 1
See Dereferencing Syntax.
Note 1: Requires Perl 5.24+. Available in Perl 5.20+ by adding both use feature qw( postderef ); and no warnings qw( experimental::postderef );.
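Here is a small sketch of the block and reference forms, assuming arbitrary values for $x and $y and a four-element inner array made up for illustration, so that it is visible that only the sliced elements change:

```perl
use strict;
use warnings;

my ( $x, $y ) = ( 1, 2 );    # hypothetical indices
my @array;
$array[$x][$y] = [ 'a', 'b', 'c', 'd' ];

# Block form: slice the inner array that $array[$x][$y] refers to.
@{ $array[$x][$y] }[ 0 .. 2 ] = ( 0, 1, 2 );
my $after_block = join ' ', @{ $array[$x][$y] };

# Reference form: copy the reference first, then slice through it.
my $ref = $array[$x][$y];
@$ref[ 0 .. 2 ] = ( 10, 11, 12 );
my $after_ref = join ' ', @{ $array[$x][$y] };

print "$after_block\n";    # prints 0 1 2 d
print "$after_ref\n";      # prints 10 11 12 d
```

Because $ref holds the same reference as $array[$x][$y], the second assignment shows up in the original structure too.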

Perl array initialized incorrectly

Just starting with Perl, and I used the wrong pair of parentheses for this code example:
@arr = {"a", "b", "c"};
print "$_\n" foreach @arr;
I mean, I can of course see WHAT it does when I run it, but I don't understand WHY it does that. Shouldn't that simply fail with a syntax error?
You have accidentally created an anonymous hash, which is a hash that has no identifier and is accessed only by reference.
The array @arr is set to a single element which is a reference to that hash. Properly written it would look like this
use strict;
use warnings;
my @arr = (
    {
        a => "b",
        c => undef,
    }
);
print "$_\n" foreach @arr;
which is why you got the output
HASH(0x3fd36c)
(or something similar) because that is how Perl will represent a hash reference as a string.
If you want to experiment, then you can print the value of the first hash element by using $arr[0] as a hash reference (the array has only a single element at index zero) and accessing the value of the element with key a with print $arr[0]->{a}, "\n".
Note that, because hashes have to have a multiple of two values (a set of key/value pairs) the hash in your own code is implicitly expanded to four values by adding an undef to the end.
It is vital that you add use strict and use warnings to the top of every Perl program you write. In this case the latter would have raised the warning
Odd number of elements in anonymous hash
It can be surprising how few things are syntax errors in Perl. In this case, you've stumbled onto the syntax for creating an anonymous hash reference: { LIST }. Perl doesn't require the use of parentheses (the () kind) when initializing an array [1]. Arrays hold ordered lists of scalar (single) values, and references are scalars, so perl happily initializes your array with that reference as the only element.
It's not what you wanted, but it's perfectly valid Perl.
[1] The () override the precedence of the operators and thus change the behavior of the expression. @a = (1, 2, 3) does what you'd expect, but @a = 1, 2, 3 means (@a = 1), 2, 3, so it only assigns 1 to @a.
{ LIST } is the hash constructor. It creates a hash, assigns the result of LIST to it, and returns a reference to it.
my $h = {"a", "b", "c", "d"};
say $h->{a}; # b
A list of one hash ref is just as legit as any other list, so no, it shouldn't be a syntax error.
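The difference between the two kinds of brackets can be checked with a short sketch. It uses four values rather than the question's three so that no "Odd number of elements" warning is raised; the variable names are illustrative:

```perl
use strict;
use warnings;

# Curly braces: one element, a reference to an anonymous hash.
my @arr = { "a", "b", "c", "d" };

# Parentheses: a plain list of four scalars.
my @list = ( "a", "b", "c", "d" );

my $arr_size  = scalar @arr;     # 1
my $list_size = scalar @list;    # 4
my $kind      = ref $arr[0];     # 'HASH'
my $value_a   = $arr[0]{a};      # 'b'
my $value_c   = $arr[0]->{c};    # 'd'

print "arr=$arr_size list=$list_size kind=$kind a=$value_a c=$value_c\n";
```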

What is the difference between lists and arrays?

On this page, it shows how to initialize an array, and if you scroll down a bit, under the section called "The Lists" it "explains" what lists are and how they're different from arrays.
Except it uses an example that's just exactly the same as declaring an array, and doesn't explain it whatsoever.
What is the difference?
Take a look at perldoc -q "list and an array". The biggest difference is that an array is a variable, but all of Perl's data types (scalar, array and hash) can provide a list, which is simply an ordered set of scalars.
Consider this code
use strict;
use warnings;
my $scalar = 'text';
my @array = (1, 2, 3);
my %hash = (key1 => 'val1', key2 => 'val2');
test();
test($scalar);
test(@array);
test(%hash);
sub test { printf "( %s )\n", join ', ', @_ }
which outputs this
( )
( text )
( 1, 2, 3 )
( key2, val2, key1, val1 )
A Perl subroutine takes a list as its parameters. In the first case the list is empty; in the second it has a single element ( $scalar); in the third the list is the same size as @array and contains ( $array[0], $array[1], $array[2], ...), and in the last it is twice as big as the number of elements in %hash, and contains ( 'key1', $hash{key1}, 'key2', $hash{key2}, ...).
Clearly that list can be provided in several ways, including a mix of scalar variables, scalar constants, and the result of subroutine calls, such as
test($scalar, $array[1], $hash{key2}, 99, {aa => 1, bb => 2}, \*STDOUT, test2())
and I hope it is clear that such a list is very different from an array.
Would it help to think of arrays as list variables? There is rarely a problem distinguishing between scalar literals and scalar variables. For instance:
my $str = 'string';
my $num = 99;
it is clear that 'string' and 99 are literals while $str and $num are variables. And the distinction is the same here:
my @numbers = (1, 2, 3, 4);
my @strings = qw/ aa bb cc dd /;
where (1, 2, 3, 4) and qw/ aa bb cc dd / are list literals, while @numbers and @strings are variables.
Actually, this question is quite well answered in Perl's FAQ. Lists are one of the ways to organize data in Perl source code. Arrays are one way of storing data; hashes are another.
The difference is quite obvious here:
my @arr = (4, 3, 2, 1);
my $arr_count = @arr;
my $list_count = (4, 3, 2, 1);
print $arr_count, "\n"; # 4
print $list_count; # 1
At first sight, there are two identical lists here. Note, though, that only the one that is assigned to the @arr variable is correctly counted by scalar assignment. $list_count stores 1, the result of evaluating an expression with the comma operator (which basically gives us the last expression in that enumeration, namely 1).
Note that there's a slight (but very important) difference between list operators/functions and array ones: the former are kind-of omnivores, as they don't change the source data, the latter are not. For example, you can safely slice and join your list, like this:
print join ':', (4,2,3,1)[1,2];
... but trying to 'pop' it will give you quite a telling message:
pop (4, 3, 2, 1);
### Type of arg 1 to pop must be array (not list)...
An array is a type of variable. It contains 0 or more scalars at monotonically increasing indexes. For example, the following creates array @a:
my @a;
Being variables, you can manipulate arrays. You can add elements, change the values of elements, etc.
"List" means many things. The two primary uses for it are to refer to list values and instances of the list operator.
A list value is an ordered collection of zero or more scalars on the stack. For example, the sub in the following code returns a list to be assigned to @a (an array).
my @a = f();
List values can't be manipulated; they are absorbed in whole by any operator to which they are passed. They are just how values are passed between subs and operators.
The list operator (,) is an N-ary operator* that evaluates each of its operands in turn. In list context, the list operator returns a list consisting of an amalgamation of the lists returned by its operands. For example, the list operator in the following snippet returns a list value consisting of all the elements of arrays @a and @b:
my @c = ( @a, @b );
(By the way, parens don't create lists. They're just there to override precedence.)
You cannot manipulate a list operator since it's code.
* — The docs say it's a binary operator (at least in scalar context), but it's not true.
Simple demonstration of difference.
sub getarray { my @x = (2,4,6); return @x; }
sub getlist { return (2,4,6); }
Now, if you do something like this:
my @a = getarray();
my @b = getlist();
Then @a and @b will both contain the same value - the list (2,4,6). However, if you do this:
my $a = getarray();
my $b = getlist();
Then $a will contain the value 3, while $b will contain the value 6.
So yes, you can say that arrays are variables containing list values, but that doesn't tell the whole story, because arrays and literal lists behave quite differently at times.
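The scalar-context behaviour described above can be checked with a short sketch. The subs are the ones from this answer; the result variables are named here for illustration and deliberately avoid $a and $b, which sort uses:

```perl
use strict;
use warnings;

sub getarray { my @x = ( 2, 4, 6 ); return @x; }
sub getlist  { return ( 2, 4, 6 ); }

# List context: both subs hand back the same three values.
my @from_array = getarray();
my @from_list  = getlist();

# Scalar context: the array gives its count, the comma operator its last value.
my $count = getarray();    # 3
my $last  = getlist();     # 6

# Assigning to an empty list first counts the values of getlist() as well.
my $list_count = () = getlist();    # 3

print "count=$count last=$last list_count=$list_count\n";
```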
Lists are comma-separated values (csv) or expressions (cse). Arrays (and hashes) are containers.
One can initialize an array or hash by a list:
@a = ("profession", "driver", "salary", "2000");
%h = ("profession", "driver", "salary", "2000");
One can return a list:
sub f {
return "moscow", "tel-aviv", "madrid";
}
($t1, $t2, $t3) = f();
print "$t1 $t2 $t3\n";
($t1, $t2, $t3) is a list of scalar containers $t1, $t2, $t3.
Lists are a form of writing perl expressions (part of syntax) while arrays are data structures (memory locations).
The difference between lists and arrays confuses many. Perl itself got it wrong by misnaming its builtin function wantarray(): "This function should have been named wantlist() instead." There is an answer in perlfaq4, "What is the difference between a list and an array?", but it did not end my confusion.
I now believe these to be true:
An array in scalar context becomes a count of its elements.
The comma operator in scalar context returns the last element.
You can't make a reference to a list; \(2, 4, 6) returns a list of references to the scalars in the list. You can use [2, 4, 6] to make a reference to an anonymous array.
You can index a list (to get its nth element) without making an array if you make a list slice, so (2, 4, 6)[1] is 4.
But what if I want to count the elements in a list, or get the last element of an array? Should I convert between arrays and lists somehow?
You can always convert a list to an array with [...] syntax. One way to count the elements in a list is to make an anonymous array, then immediately dereference it in scalar context, like so:
sub list { return qw(carrot stick); }
my $count = @{[list()]};
print "Count: $count\n"; # Count: 2
Another way is to use the list assignment operator, like so:
sub list { return qw(carrot stick); }
my $count = (()=list());
print "Count: $count\n"; # Count: 2
There is no array in this code, but the list assignment operator returns the number of things being assigned. I assign them to an empty list of variables. In code golf, I write ()=$str=~/reg/g to count the regular expression matches in some string.
You need not convert an array to a list, because an array in list context is already a list. If you want the last element of an array, just say $array[-1].
The comma operator would return the last element of a list, but I can't use it to get the last element of an array. If I say ((),@array) in scalar context, then @array is in scalar context and I get the count.
You need not make an array to index a list. You can make an anonymous array, as in [list()]->[1], or you can make a list slice, as in (list())[1]. I had trouble with list slices because they have different syntax. A list slice needs parentheses! In C or Python or Ruby, func()[1] would do the array index on the function's return value, but in Perl, func()[1] is a syntax error. You must say (func())[1].
For example, I want to print the 3rd highest number in array. Because I'm lazy, I sort the array and take the 3rd last element:
my @array = (112, 101, 114, 108, 32, 104, 97, 99, 107);
print +(sort { $a <=> $b } @array)[-3], "\n"; # prints 108
The unary + prevents the print() function stealing my parentheses.
You can use a list slice on an array, as in (@array)[1]. This works because an array in list context yields a list. The difference between lists and arrays is that arrays can do $array[1].
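The conversions discussed in this answer fit in one short sketch. The list() sub and the number array are the ones used above; the other names are illustrative:

```perl
use strict;
use warnings;

sub list { return qw(carrot stick); }

# Counting a list: via an anonymous array, or via list assignment.
my $count_anon   = @{ [ list() ] };    # 2
my $count_assign = () = list();        # 2

# Indexing a list: via a list slice, or via an anonymous array.
my $slice  = ( list() )[1];            # 'stick'
my $anon   = [ list() ]->[1];          # 'stick'

# Last element of an array, and the 3rd highest number via a list slice.
my @array = ( 112, 101, 114, 108, 32, 104, 97, 99, 107 );
my $last  = $array[-1];                                # 107
my $third = ( sort { $a <=> $b } @array )[-3];         # 108

print "$count_anon $count_assign $slice $anon $last $third\n";
```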
An Array vs. a List
A list is a different kind of data structure from an array.
The biggest difference is direct access versus sequential access. Arrays allow both direct and sequential access, while lists allow only sequential access. This follows from the way these data structures are stored in memory.
In addition, a list doesn't support a numeric index the way an array does, and its elements don't need to be allocated next to each other in memory as an array's do.
Arrays
An array is an ordered collection of items, where each item inside the array has an index.
See my answer about sigils and context.
The main difference is this:
Arrays have a scalar-context value equal to the count of their elements.
Lists have a scalar-context value equal to the LAST element in the list.
So you need to know about the goat-operator: =()=.
Usage?
perl -e '$s =()= qw(a b c); print $s' # uh? 3? (3 elements: the list assignment is counted)
perl -e '$s = qw(a b cLastElementThatYouSee); print $s' # uh? cLastElementThatYouSee? (scalar context, last element returned)
As you see, =()= forces list context on the right-hand side, and the assignment then returns the number of elements.

Array assignment in Perl

What is the difference between
myArr1 => \@existingarray
and
myArr2 => [
    @existingarray
]
I am assigning @existingarray to an element in a hash.
I mean, what exactly happens internally? Is it that the first one points to the same array, and the second one creates a new array with the elements of @existingarray?
Thanks in advance
Yes, the first one takes a reference, the second one does a copy and then takes a reference.
[ ... ] is the anonymous array constructor, and turns the list inside into an arrayref.
So with @a = (1, 2, 3),
[ @a ]
is the same as
[ 1, 2, 3 ]
(the array is flattened to a list) or
do {
    my @b = @a;
    \@b;
}
In effect, the elements get copied.
Also,
my ($ref1, $ref2) = (\@a, [@a]);
print "$ref1 and $ref2 are " . ($ref1 eq $ref2 ? "equal" : "not equal") . "\n";
would confirm that they are not the same. And if we do
$ref1->[0] = 'a';
$ref2->[0] = 'b';
then $a[0] would equal a and not b.
You can use
perl -e 'my @a=(1); my $ra=\@a; my $rca=[@a]; $ra->[0]=2; print @a, @{$ra}, @{$rca};'
221
to see that your assumption that [@existingarray] creates a reference to a copy of @existingarray is correct (and that myArray* isn't Perl).
WRT amon's revising my perl -e "..." (fails under bash) to perl -e '...' (fails under cmd.exe): Use the quotes that work for your shell.
The square brackets make a reference to a new array with a copy of what's in @existingarray at the time of the assignment.
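Put back into the hash from the question, the difference looks like this. It is a sketch only: the hash name %h and the sample values are made up for illustration.

```perl
use strict;
use warnings;

my @existingarray = ( 1, 2, 3 );

my %h = (
    myArr1 => \@existingarray,       # reference to the original array
    myArr2 => [ @existingarray ],    # reference to a new array holding copies
);

# A later change to the original is visible through myArr1 only.
push @existingarray, 4;

my $size1 = scalar @{ $h{myArr1} };    # 4
my $size2 = scalar @{ $h{myArr2} };    # 3

# Comparing references numerically compares their addresses.
my $same1 = ( $h{myArr1} == \@existingarray ) ? 1 : 0;    # 1
my $same2 = ( $h{myArr2} == \@existingarray ) ? 1 : 0;    # 0

print "size1=$size1 size2=$size2 same1=$same1 same2=$same2\n";
```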

Filling hash of multi-dimensional arrays in perl

Given three scalars, what is the perl syntax to fill a hash in which one of the scalars is the key, another determines which of two arrays is filled, and the third is appended to one of the arrays? For example:
my $weekday = "Monday";
my $kind = "Good";
my $event = "Birthday";
and given only the scalars and not their particular values, obtained inside a loop, I want a hash like:
my %Weekdays = (
    'Monday' => [
        ["Birthday", "Holiday"], # The Good array
        ["Exam", "Workday"]      # The Bad array
    ],
    'Saturday' => [
        ["RoadTrip", "Concert", "Movie"],
        ["Yardwork", "VisitMIL"]
    ],
);
I know how to append a value to an array in a hash, such as when the value for a key is a single array:
push( @{ $Weekdays{$weekday} }, $event);
Used in a loop, that could give me:
%Weekdays = (
    'Monday' => [
        'Birthday',
        'Holiday',
        'Exam',
        'Workday'
    ]
);
I suppose the hash key is the particular weekday, and the value should be a two dimensional array. I don't know the perl syntax to, say, push Birthday into the hash as element [0][0] of the weekday array, and the next time through the loop, push another event in as [0][1] or [1][0]. Similarly, I don't know the syntax to access same.
Using your variables, I'd write it like this:
push @{ $Weekdays{ $weekday }[ $kind eq 'Good' ? 0 : 1 ] }, $event;
However, I'd probably just make the Good/Bad specifiers keys as well. And given my druthers:
use autobox::Core;
( $Weekdays{ $weekday }{ $kind } ||= [] )->push( $event );
Note that the way I've written it here, neither expression cares whether or not an array exists before we start.
Is there some reason that
push @{ $Weekdays{Monday}[0] }, "whatever";
isn’t working for you?
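Used in a loop, the push from the first answer could look like the sketch below. The @input rows are invented sample data standing in for whatever the real loop reads:

```perl
use strict;
use warnings;

# Hypothetical input: one [ weekday, kind, event ] triple per iteration.
my @input = (
    [ 'Monday',   'Good', 'Birthday' ],
    [ 'Monday',   'Bad',  'Exam' ],
    [ 'Monday',   'Good', 'Holiday' ],
    [ 'Saturday', 'Bad',  'Yardwork' ],
);

my %Weekdays;
for my $row (@input) {
    my ( $weekday, $kind, $event ) = @$row;

    # Index 0 holds the Good array, index 1 the Bad array.
    # Missing levels are autovivified by the push.
    push @{ $Weekdays{$weekday}[ $kind eq 'Good' ? 0 : 1 ] }, $event;
}

# Reading back: one element, or a whole inner array.
my $first_good  = $Weekdays{Monday}[0][0];                  # 'Birthday'
my $monday_good = join ', ', @{ $Weekdays{Monday}[0] };     # 'Birthday, Holiday'
my $monday_bad  = join ', ', @{ $Weekdays{Monday}[1] };     # 'Exam'

print "$first_good | $monday_good | $monday_bad\n";
```

Note that Saturday has no Good events in this sample, so $Weekdays{Saturday}[0] stays undef; dereferencing it for reading under strict would die, so check it with defined first.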

Resources