Filling hash of multi-dimensional arrays in perl - arrays

Given three scalars, what is the perl syntax to fill a hash in which one of the scalars is the key, another determines which of two arrays is filled, and the third is appended to one of the arrays? For example:
my $weekday = "Monday";
my $kind = "Good";
my $event = "Birthday";
and given only the scalars and not their particular values, obtained inside a loop, I want a hash like:
my %Weekdays = (
    'Monday' => [
        ["Birthday", "Holiday"], # The Good array
        ["Exam", "Workday"] # The Bad array
    ],
    'Saturday' => [
        ["RoadTrip", "Concert", "Movie"],
        ["Yardwork", "VisitMIL"]
    ]
);
I know how to append a value to an array in a hash, such as when the value for each key is a single array:
push( @{ $Weekdays{$weekday} }, $event);
Used in a loop, that could give me:
%Weekdays = (
    'Monday' => [
        'Birthday',
        'Holiday',
        'Exam',
        'Workday'
    ]
);
I suppose the hash key is the particular weekday, and the value should be a two dimensional array. I don't know the perl syntax to, say, push Birthday into the hash as element [0][0] of the weekday array, and the next time through the loop, push another event in as [0][1] or [1][0]. Similarly, I don't know the syntax to access same.

Using your variables, I'd write it like this:
push @{ $Weekdays{ $weekday }[ $kind eq 'Good' ? 0 : 1 ] }, $event;
However, I'd probably just make the Good/Bad specifiers keys as well. And given my druthers:
use autobox::Core;
( $Weekdays{ $weekday }{ $kind } ||= [] )->push( $event );
Note that the way I've written it here, neither expression cares whether or not an array exists before we start.
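For comparison, here is a minimal sketch of the first approach in plain Perl, with no autobox dependency. The @records list and the %slot_for lookup are assumptions made up for illustration; in the question the three scalars come from a loop.

```perl
use strict;
use warnings;

# Hypothetical input: [weekday, kind, event] triples standing in for the loop.
my @records = (
    [ 'Monday',   'Good', 'Birthday' ],
    [ 'Monday',   'Bad',  'Exam'     ],
    [ 'Monday',   'Good', 'Holiday'  ],
    [ 'Saturday', 'Bad',  'Yardwork' ],
);

# Map each kind to its slot: 0 is the Good array, 1 is the Bad array.
my %slot_for = ( Good => 0, Bad => 1 );

my %Weekdays;
for my $record (@records) {
    my ( $weekday, $kind, $event ) = @$record;

    # Autovivification creates the outer and inner arrays on first use.
    push @{ $Weekdays{$weekday}[ $slot_for{$kind} ] }, $event;
}

# Access a single element, or a whole inner array.
print "First good Monday event: $Weekdays{Monday}[0][0]\n";
print "Bad Monday events: @{ $Weekdays{Monday}[1] }\n";
```

Reading an element back uses the same subscripts as the push, so $Weekdays{Monday}[0][1] is the second Good event for Monday.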

Is there some reason that
push @{ $Weekdays{Monday}[0] }, "whatever";
isn’t working for you?

Related

Ruby join two arrays by key value

I have two arrays, the first array contains field name, type and id.
arr1 = [
{
"n" => "cat",
"t" => 0,
"id" => "42WTd5"
},
{
"n" => "dog",
"t" => 0,
"id" => "1tM5T0"
}
]
Second array contains field, id and value.
arr2 = [
{
"42WTd5"=>"meow",
"1tM5T0"=>"woof"
}
]
How can I join them by id to produce the following result.
cat: meow
dog: woof
Any help is appreciated.
I think you want your result to be a Hash, in which case this would do the job:
def match_animals_to_sounds(animal_array, sound_array)
sounds = sound_array.first
animal_array.map { |animal| [animal['n'], sounds[animal['id']]] }.to_h
end
>> match_animals_to_sounds(arr1, arr2)
=> {"cat"=>"meow", "dog"=>"woof"}
Your arr2 is unusual in that it is an Array of a single element. I'm just calling #first on it to pull out the Hash inside. If you expect some version of this Array to have more than one element in the future, you'll need to rethink the first line of this method.
The second line is standard Ruby Array manipulation. The first part maps each animal to a new Array of two-element Arrays containing each animal's name and sound. At the end, #to_h converts this array of two-element arrays to a single Hash, which is much more useful than an array of strings. I don't know what you intended in your question, but this is probably what you want.
If you prefer to work with Symbols, you can change the second line of the method to:
animal_array.map { |animal| [animal['n'].to_sym, sounds[animal['id']].to_sym] }.to_h
In which case you will get:
>> match_animals_to_sounds(arr1, arr2)
=> {:cat=>:meow, :dog=>:woof}
This is a way to do it.
sounds = arr2[0]
results = arr1.map do |animal|
"#{animal["n"]}: #{sounds[animal["id"]]}"
end
puts results
# => cat: meow
# => dog: woof
Seems like the second array should just be a hash instead. There's no point creating an array if there's only one element in it and that number won't change.
pointless one-liner (don't use this)
puts arr1.map { |x| "#{x["n"]}: #{arr2[0][x["id"]]}" }
You can also get the joined result with the following code, which returns an array of single-pair hashes:
arr1.collect{ |a| {a["n"] => arr2[0][a["id"]]} }

De-reference x number of times for x number of data structures

I've come across an obstacle in one of my perl scripts that I've managed to solve, but I don't really understand why it works the way it works. I've been scouring the internet but I haven't found a proper explanation.
I have a subroutine that returns a reference to a hash of arrays. The hash keys are simple strings, and the values are references to arrays.
I print out the elements of the array associated with each key, like this
for my $job_name (keys %$build_numbers) {
    print "$job_name => ";
    my @array = @{@$build_numbers{$job_name}}; # line 3
    for my $item ( @array ) {
        print "$item \n";
    }
}
While I am able to print out the keys & values, I don't really understand the syntax behind line 3.
Our data structure is as follows:
Reference to a hash whose values are references to the populated arrays.
To extract the elements of the array, we have to:
- dereference the hash reference so we can access the keys
- dereference the array reference associated to a key to extract elements.
Final question being:
When dealing with perl hashes of hashes of arrays etc; to extract the elements at the "bottom" of the respective data structure "tree" we have to dereference each level in turn to reach the original data structures, until we obtain our desired level of elements?
Hopefully somebody could help out by clarifying.
Line 3 is taking a slice of your hash reference, but it's a very strange way to do what you're trying to do because a) you normally wouldn't slice a single element and b) there's cleaner and more obvious syntax that would make your code easier to read.
If your data looks something like this:
my $data = {
foo => [0 .. 9],
bar => ['A' .. 'F'],
};
Then the correct version of your example would be:
for my $key (keys(%$data)) {
print "$key => ";
for my $val (@{$data->{$key}}) {
print "$val ";
}
print "\n";
}
Which produces:
bar => A B C D E F
foo => 0 1 2 3 4 5 6 7 8 9
If I understand your second question, the answer is that you can access precise locations of complex data structures if you use the correct syntax. For example:
print "$data->{bar}->[4]\n";
Will print E.
Additional recommended reading: perlref, perlreftut, and perldsc
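To illustrate the point about deeper structures, here is a small sketch with a hash of hashes of arrays. The $builds data and its key names are invented for the example, not taken from the question.

```perl
use strict;
use warnings;

# Hypothetical three-level structure: hash -> hash -> array.
my $builds = {
    linux   => { passed => [ 101, 102 ], failed => [ 103 ] },
    windows => { passed => [ 201 ],      failed => [] },
};

# Reach one element directly by chaining subscripts; the arrow is
# optional between adjacent subscripts.
my $second_linux_pass = $builds->{linux}{passed}[1];    # 102

# Dereference a whole level only when all of its elements are needed.
my @linux_passed = @{ $builds->{linux}{passed} };

for my $platform ( sort keys %$builds ) {
    for my $status ( sort keys %{ $builds->{$platform} } ) {
        my $numbers = $builds->{$platform}{$status};    # still a reference
        print "$platform/$status: @$numbers\n";
    }
}
```

So you only dereference explicitly at the level where you want the whole array or hash; a single element can be reached with subscripts alone.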
Working with data structures can be hard depending on how it was made.
I am not sure if your "job" data structure is exactly this but:
#!/usr/bin/env perl
use strict;
use warnings;
use diagnostics;
my $hash_ref = {
job_one => [ 'one', 'two'],
job_two => [ '1','2'],
};
foreach my $job ( keys %{$hash_ref} ){
print " Job => $job\n";
my @array = @{$hash_ref->{$job}};
foreach my $item ( @array )
{
print "Job: $job Item $item\n";
}
}
You have a hash reference whose values are array references, and you iterate over its keys. Each item of those arrays could itself be another reference or a simple scalar.
Basically you can work with the ref or undo the ref like you did in the first loop.
There is a piece of documentation you can check for more details here.
So answering your question:
Final question being: - When dealing with perl hashes of hashes of
arrays etc; to extract the elements at the "bottom" of the respective
data structure "tree" we have to dereference each level in turn to
reach the original data structures, until we obtain our desired level
of elements?
It depends on how your data structure was made and if you already know what you are looking for it would be simple to get the value for example:
%city_codes = (
a => 1, b => 2,
);
my $value = $city_codes{a};
Complex data structures come with complex code.

Perl - How do I update (and access) an array stored in array stored in a hash?

Perhaps I have made this more complicated than I need it to be but I am currently trying to store an array that contains, among other things, an array inside a hash in Perl.
i.e. hash -> array -> array
use strict;
my %DEVICE_INFORMATION = {}; #global hash
sub someFunction() {
my $key = 'name';
my @storage = ();
#assume file was properly opened here for the foreach-loop
foreach my $line (<DATA>) {
if(conditional) {
my @ports = ();
$storage[0] = 'banana';
$storage[1] = \@ports;
$storage[2] = '0';
$DEVICE_INFORMATION{$key} = \@storage;
}
elsif(conditional) {
push #{$DEVICE_INFORMATION{$key}[1]}, 5;
}
}#end foreach
} #end someFunction
This is a simplified version of the code I am writing. I have a subroutine that I call in the main. It parses a very specifically designed file. That file guarantees that the if statement fires before subsequent elsif statement.
I think the push call in the elsif statement is not working properly - i.e. 5 is not being stored in the @ports array that should exist in the @storage array that should be returned when I hash the key into DEVICE_INFORMATION.
In the main I try and print out each element of the @storage array to check that things are running smoothly.
#main execution
&someFunction();
print $DEVICE_INFORMATION{'name'}[0];
print $DEVICE_INFORMATION{'name'}[1];
print $DEVICE_INFORMATION{'name'}[2];
The output for this ends up being... banana ARRAY(blahblah) 0
If I change the print statement for the middle call to:
print @{$DEVICE_INFORMATION{'name'}[1]};
Or to:
print @{$DEVICE_INFORMATION{'name'}[1]}[0];
The output changes to banana [blankspace] 0
Please advise on how I can properly update the @ports array while it is stored inside the @storage array that has been hash'd into DEVICE_INFORMATION and then how I can access the elements of @ports. Many thanks!
P.S. I apologize for the length of this post. It is my first question on stackoverflow.
I was going to tell you that Data::Dumper can help you sort out Perl data structures, but Data::Dumper can also tell you about your first problem:
Here's what happens when you assign open-curly + close-curly ( '{}' ) to a hash:
use Data::Dumper ();
my %DEVICE_INFORMATION = {}; #global hash
print Data::Dumper->Dump( [ \%DEVICE_INFORMATION ], [ '*DEVICE_INFORMATION' ] );
Here's the output:
%DEVICE_INFORMATION = (
'HASH(0x3edd2c)' => undef
);
What you did was assign a one-element list to the hash: Perl stringified the hash reference and used it as a key, with nothing after it in the list to serve as the value. The implied assignment is:
my %DEVICE_INFORMATION = {} => ();
So Perl assigned it a value of undef.
When you assign to a hash, you assign a list. A literal empty hash is not a list, it's a hash reference. What you wanted to do for an empty hash--and what is totally unnecessary--is this:
my %DEVICE_INFORMATION = ();
And that's unnecessary because it is exactly the same thing as:
my %DEVICE_INFORMATION;
You're declaring a hash, and that statement fully identifies it as a hash. And Perl is not going to guess what you want in it, so it's an empty hash from the get-go.
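A short sketch of the distinction, using made-up variable names: parentheses (or nothing at all) give you an empty hash, while curly braces give you a reference, which belongs in a scalar.

```perl
use strict;
use warnings;

my %empty_list = ();    # explicit empty list
my %declared;           # plain declaration, also empty
my $hash_ref = {};      # a scalar holding a reference to an empty hash

print scalar( keys %empty_list ), "\n";    # 0
print scalar( keys %declared ),   "\n";    # 0

# The reference is used with the arrow, or dereferenced with %$hash_ref.
$hash_ref->{name} = 'banana';
print scalar( keys %$hash_ref ), "\n";     # 1
```

With warnings enabled, assigning {} directly to a hash should also produce a "Reference found where even-sized list expected" warning, which is a useful hint that the braces are wrong.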
Finally, my advice on using Data::Dumper. If you started your hash off right, and did the following:
my %DEVICE_INFORMATION; # = {}; #global hash
my @ports = ( 1, 2, 3 );
# notice that I just skipped the interim structure of @storage
# and assigned it as a literal
# * Perl has one of the best literal data structure languages out there.
$DEVICE_INFORMATION{name} = [ 'banana', \@ports, '0' ];
print Data::Dumper->Dump(
[ \%DEVICE_INFORMATION ]
, [ '*DEVICE_INFORMATION' ]
);
What you see is:
%DEVICE_INFORMATION = (
'name' => [
'banana',
[
1,
2,
3
],
'0'
]
);
So, you can better see how it's all getting stored, which levels you have to dereference, and how to get the information you want out of it.
By the way, Data::Dumper delivers 100% runnable Perl code, and shows you how you can specify the same structure as a literal. One caveat, you would have to declare the variable first, using strict (which you should always use anyway).
You update @ports properly.
Your print statement accesses $storage[1] (the reference to @ports) in the wrong way.
You may use syntax you have used in push.
print $DEVICE_INFORMATION{'name'}[0], ";",
join( ':', @{$DEVICE_INFORMATION{'name'}[1]}), ";",
$DEVICE_INFORMATION{'name'}[2], "\n";
print "Number of ports: ", scalar(@{$DEVICE_INFORMATION{'name'}[1]}), "\n";
print "First port: ", $DEVICE_INFORMATION{'name'}[1][0]//'', "\n";
# X//'' -> X or '' if X is undef
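Putting the pieces together, a minimal end-to-end sketch might look like this. The @lines input and the two regular expressions are assumptions standing in for the real file and the unnamed conditionals in the question.

```perl
use strict;
use warnings;

my %DEVICE_INFORMATION;    # declared empty, no "= {}"

# Hypothetical input lines standing in for the parsed file.
my @lines = ( 'device name', 'port 5', 'port 7' );

my $key = 'name';
for my $line (@lines) {
    if ( $line =~ /^device/ ) {
        # Fresh anonymous arrays replace the named @storage and @ports.
        $DEVICE_INFORMATION{$key} = [ 'banana', [], '0' ];
    }
    elsif ( $line =~ /^port (\d+)/ ) {
        # Same push syntax as in the question.
        push @{ $DEVICE_INFORMATION{$key}[1] }, $1;
    }
}

my @ports = @{ $DEVICE_INFORMATION{$key}[1] };
print "Ports: @ports\n";                                  # Ports: 5 7
print "First port: $DEVICE_INFORMATION{$key}[1][0]\n";    # First port: 5
```

Using anonymous arrays inside the if branch also means each device gets its own storage, rather than every key sharing one @storage declared outside the loop.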

Loop 1st element of 2d List

I have a list like this
[[hash,hash,hash],useless,useless,useless]
I want to take the first element of hashes and loop through it - i try this:
my @list = get_list_somehow();
print Dumper($list[0][0]); print Dumper($list[0][1]);print Dumper($list[0][2]);
and i am able to access the elements fine manually, but when i try this
my @list = get_list_somehow()[0];
print Dumper($list[0]); print Dumper($list[1]);print Dumper($list[2]);
foreach(@list){
do_something_with($_);
}
only $list[0] returns a value (the first hash, everything else is undefined)
You are taking a subscript [0] of the return value of get_list_somehow() (although technically, you need parentheses there). What you need to do is to dereference the first element in that list. So:
my @list = get_list_somehow();
my $first = $list[0]; # take first element
my @newlist = @$first; # dereference array ref
Of course, this is cumbersome and verbose, and if you just want to print the array with Data::Dumper you can just do:
print Dumper $list[0];
Or if you just want the first array, you can do it in one step. Although this looks complicated and messy:
my @list = @{ (get_list_somehow())[0] };
The @{ ... } will expand an array reference inside it, which is what hopefully is returned from your subscript of the list from get_list_somehow().
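Here is a runnable sketch of that one-step version. The body of get_list_somehow() is a guess at what the real function returns, based on the shape described in the question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for get_list_somehow(): returns a list whose first
# element is a reference to an array of hash references.
sub get_list_somehow {
    return ( [ { id => 1 }, { id => 2 }, { id => 3 } ], 'useless', 'useless' );
}

# (...)[0] is a list slice taking the first returned value;
# the surrounding array dereference then expands that array reference.
my @hashes = @{ ( get_list_somehow() )[0] };

for my $hash (@hashes) {
    print "id = $hash->{id}\n";
}
```

The parentheses around the function call are what make the [0] subscript legal; without them Perl reports a syntax error.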
I'm taking that your sample data looks like this:
my @data = (
    {
        one => 1,
        two => 2,
        three => 3,
    },
    "value",
    "value",
    "value",
);
That is, the first element of @data, $data[0], is your hash. Is that correct?
Your hash is a hash reference. That is the $data[0] points to the memory location where that hash is stored.
To get the hash itself, it must be dereferenced:
my %hash = %{ $data[0] }; # Dereference the hash in $data[0]
for my $key ( keys %hash ) {
say qq( \$hash{$key} = "$hash{$key}".);
}
I could have done the dereferencing in one step...
for my $key ( keys %{ $data[0] } ) {
say qq(\$data[0]->{$key} = ") . $data[0]->{$key} . qq(".);
}
Take a look at the Perl Reference Tutorial for information on how to work with references.
I'm guessing a bit on your data structure here:
my $list = [
    [ { a => 1,
        b => 2,
        c => 3, },
      { d => 4, },
      { e => 5, }
    ], undef, undef, undef,
];
Then we get the 0th (first) element of the top-level array reference, which is another array reference, and then the 0th (first) element of THAT array reference, which is the first hash reference:
my $hr = $list->[0][0];
And iterate over the hash keys. That could also be written as one step: keys %{ $list->[0][0] }. It's a bit easier to see what's going on when broken out into two steps.
for my $key (keys %$hr) {
printf "%s => %s\n", $key, $hr->{$key};
}
Which outputs:
c => 3
a => 1
b => 2

About accessing the Array of Arrays

I once read the following example about "array of arrays". AOA is a two dimensional array
The following code segment is claimed to print the whole thing with refs
for $aref ( @AoA ) {
    print "\t [ @$aref ],\n";
}
And the following code segment is claimed to print the whole thing with indices
for $i ( 0 .. $#AoA ) {
    print "\t [ @{$AoA[$i]} ],\n";
}
What does $aref stand for here? How should I understand @$aref and @{$AoA[$i]}? Thanks.
$aref stands for "array reference", i.e. a reference for an array.
my $my_aref = \@somearray;
You can make an array from an array reference with the following syntax:
@{$my_aref}
@{$my_aref} is @somearray. (It's not a copy, it really is the same array.)
In the second example, $AoA[$i] is an array reference, and you dereference it with the same syntax: @{$AoA[$i]}.
See perlreftut for more explanations and examples.
An "array of arrays" isn't actually an array of arrays. It's more an array of array references. Each element in the base array is a reference to another array. Thus, when you want to cycle through the elements in the base array, you get back array references. These are what get assigned to $aref in the first loop. They are then de-referenced by pre-pending with the @ symbol, so @$aref is the array referenced by the $aref array reference.
Same sort of thing works for the second loop. $AoA[$i] is the $i-th element of the @AoA array, which is an array reference. De-referencing it by pre-pending it with the @ symbol (and adding {} for clarity, and possibly for precedence) means @{$AoA[$i]} is the array referenced by the $AoA[$i] array reference.
Perl doesn't have multidimensional arrays. One places arrays into other arrays to achieve the same result.
Well, almost. Array (and hash) values are scalars, so one cannot place an array into another array. What one does instead is place a reference to an array.
In other words, "array of arrays" is short for "array of references to arrays". Each value of @AoA is a reference to another array, giving the "illusion" of a two-dimensional array.
The references come from the use of [ ] or an equivalent. [ ] creates an anonymous array, then creates a reference to that array, then returns the reference. That's where the reference comes from.
Common ways of building an AoA:
my @AoA = (
[ 'a', 'b', 'c' ],
[ 'd', 'e', 'f' ],
);
my @AoA;
push @AoA, [ 'a', 'b', 'c' ];
push @AoA, [ 'd', 'e', 'f' ];
my @AoA;
$AoA[$y][$x] = $n;
Keep in mind that
$AoA[$y][$x] = $n;
is short for
$AoA[$y]->[$x] = $n;
and it's equivalent to the following thanks to autovivification:
( $AoA[$y] //= [] )->[$x] = $n;
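A quick way to see autovivification at work is to assign into a row that does not exist yet and then inspect what Perl created. This is only an illustrative sketch; the indices are arbitrary.

```perl
use strict;
use warnings;

my @AoA;
$AoA[1][2] = 'x';    # creates $AoA[1] as an array reference on the fly

print ref( $AoA[1] ), "\n";            # ARRAY
print scalar( @{ $AoA[1] } ), "\n";    # 3, slots 0 and 1 are undef

# Row 0 was never touched, so it is still undef.
print defined( $AoA[0] ) ? "defined" : "undef", "\n";    # undef
```

Only the row that was actually assigned to gets an inner array; earlier rows stay undef until something stores into them.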
The whole mystery with multi-dimension structures in perl is quite easy to understand once you realize that there are only three types of variables to deal with. Scalars, arrays and hashes.
A scalar is a single value. It can contain just about anything, but only one thing at a time.
An array contains a number of scalar values, ordered by a fixed numerical index.
A hash contains scalar values, indexed by keys made of strings.
And all arrays, hashes or scalars act this way. Multi-dimension arrays are no different from single dimension.
This is also expressed very succinctly in perldata:
All data in Perl is a scalar, an array of scalars, or a hash of
scalars. A scalar may contain one single value in any of three
different flavors: a number, a string, or a reference. In general,
conversion from one form to another is transparent. Although a scalar
may not directly hold multiple values, it may contain a reference to
an array or hash which in turn contains multiple values.
For example:
my @array = (1, 2, 3);
Here, $array[0] contains 1, $array[1] contains 2, etc. Just like you would expect.
my @aoa = ( [ 1, 2, 3 ], [ 'a', 'b', 'c' ] );
Here, $aoa[0] contains an array reference. If you print it out, it will say something like ARRAY(0x398a84). Don't worry! That's still a scalar value. How do we know this? Because arrays can only contain scalar values.
When we do something like
for $aref ( @AoA ) {
print $aref; # prints ARRAY(0x398a84) or similar
}
It's no different from doing
for $number ( @array ) {
print $number;
}
$aref and $number are scalar values. So far, so good. Take a moment and lock this knowledge down: Arrays can only contain scalar values.
Now, the next part is simply knowing how to deal with references. This is documented in perlref and perlreftut.
A reference is a scalar value. It's an address to a location in memory. This location contains some data. In order to access the actual data, we need to dereference the reference.
As a simple example:
my @data = (1, 2, 3);
my $aref = \@data; # The backslash in front of the sigil creates a reference
print $aref; # prints something like ARRAY(0xa4b6a4)
print @$aref; # prints 123
Adding a sigil in front of the reference tells perl to dereference the scalar value into the type of data the sigil represents. In this case, an array. If you choose the wrong sigil for the type of reference, perl will give an error such as:
Not a HASH reference
In the example above, we have a reference to a specific, named location. Both @$aref and @data access the same values. If we change a value in one, both are affected, because the address to the memory location is identical. Let's try it:
my @data = (1, 2, 3);
my $aref = \@data;
$$aref[1] = 'a'; # dereference to a scalar value by $ sigil
# $aref->[1] = 'a' # does the same thing, a different way
print @data; # prints 1a3
print @$aref; # prints 1a3
We can also have anonymous data. If we were only interested in building an array of arrays, we'd have no interest in @data, and could skip it by doing this:
my $aref = [ 1, 2, 3 ];
The brackets around the list of numbers create an anonymous array. $aref still contains the same type of data: A reference. But in this case, $aref is the only way we have of accessing the data contained at the memory location. Now, let's build some more scalar values like this:
my $aref1 = [ 1, 2, 3 ];
my $aref2 = [ 'a', 'b', 'c' ];
my $aref3 = [ 'x', 'y', 'z' ];
We now have three scalar variables that contain references to anonymous arrays. What if we put these in an array?
my @aoa = ($aref1, $aref2, $aref3);
If we wanted to access $aref1, we could do print @$aref1, but we could also do
print @{$aoa[0]};
In this case, we need to use the extended form of dereferencing: @{ ... }. Because perl does not like ambiguity, it requires us to distinguish between @{$aoa[0]} (take the reference in $aoa[0] and dereference as an array) and @{$aoa}[0] (take the reference in $aoa and dereference as an array, and take that array's first value).
Above, we could have used @{$aref}, as it is identical to @$aref.
So, if we are only interested in building an array of arrays, we are not really interested in the $aref1 scalars either. So let's cut them out of the process:
my @aoa = ( [ 1, 2, 3 ], [ 'a', 'b', 'c' ], [ 'x', 'y', 'z' ]);
Tada! That's an array of arrays.
Now, we can backtrack. To access the values inside this array, we can do
for my $scalar ( @aoa ) {
    print @$scalar; # prints 123abcxyz
}
This time, I used a different variable name, just to make a point. This loop takes each value from @aoa -- which still is only a scalar value -- dereferences it as an array, and prints it.
Or we can access @aoa via its indexes:
for my $i ( 0 .. $#aoa ) {
    print @{$aoa[$i]};
}
And that's all there is to it!
