Create a multidimensional key of hash from array? - arrays

I want to create a multidimensional %hash from the @array.
Suppose @array is like
my @array=(1,2,3,4,5);
I want to assign the last value of @array as the final value in the multidimensional %hash, i.e.
%hash=(
1=>{
2=>
{
3=>
{
4=>5
}
}
}
)
Which means $hash{1}{2}{3}{4}=5;
I want to do it in something like:
for my $i (0..$#array){
#push $i as key until second last element and assign last element as value
}
Note: The @array may be of any size. I just want to assign the last element of @array as the value for the keys formed by the elements before it in %hash.

First, use pop to separate the value to assign from the keys. Then, you can use either of the following:
use Data::Diver qw( DiveVal );
my %hash;
DiveVal(\%hash, map \$_, @keys) = $val;
or
sub dive_val :lvalue {
my $p = \shift;
$p = \( $$p->{$_} ) for @_;
$$p
}
my %hash;
dive_val(\%hash, @keys) = $val;
dive_val works by having $p reference the next value to dereference and/or modify.
Pre-loop: $p references $hash (the anon scalar referencing %hash)
After loop pass 0: $p references $hash->{1}
After loop pass 1: $p references $hash->{1}{2}
After loop pass 2: $p references $hash->{1}{2}{3}
After loop pass 3: $p references $hash->{1}{2}{3}{4}
The extra level of indirection has many benefits.
It removes the need to treat the last key specially.
It removes the need to create the hash before it's dereferenced.
It removes the need for the root to be a reference to a hash. Instead, any scalar can be the root, even an undefined one.
It makes it easy to extend dive_val to support mixed array/hash structures.
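Putting the pieces together, a minimal end-to-end sketch might look like the following. It reuses dive_val as defined above; copying @array into @keys before the pop is an assumption made here so that the original array is left untouched.

```perl
use strict;
use warnings;

# Walk a reference-to-a-scalar down the key path; autovivification
# creates the intermediate hashes as they are dereferenced.
sub dive_val :lvalue {
    my $p = \shift;
    $p = \( $$p->{$_} ) for @_;
    $$p
}

my @array = (1, 2, 3, 4, 5);

my @keys = @array;      # copy, so @array itself is not modified
my $val  = pop @keys;   # the last element becomes the value

my %hash;
dive_val(\%hash, @keys) = $val;

print $hash{1}{2}{3}{4}, "\n";   # prints 5
```

The same call works for any array with at least two elements, since every element before the last is simply one more key on the path.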

Related

How do I copy list elements to hash keys in perl?

I have found a couple of ways to copy the elements of a list to the keys of a hash, but could somebody please explain how this works?
#!/usr/bin/perl
use v5.34.0;
my @arry = qw( ray bill lois shirly missy hank );
my %hash;
$hash{$_}++ for @arry; # What is happening here?
foreach (keys %hash) {
say "$_ => " . $hash{$_};
}
The output is what I expected. I don't know how the assignment is being made.
hank => 1
shirly => 1
missy => 1
bill => 1
lois => 1
ray => 1
$hash{$_}++ for @array;
Can also be written
for (@array) {
$hash{$_}++;
}
Or more explicitly
for my $key (@array) {
$hash{$key}++;
}
$_ is "the default input and pattern-searching space"-variable. Often in Perl functions, you can leave out naming an explicit variable to use, and it will default to using $_. for is an example of that. You can also write an explicit variable name, that might feel more informative for your code:
for my $word (@words)
Or idiomatically:
for my $key (keys %hash) # using $key variable name for hash keys
You should also be aware that for and foreach are exactly identical in Perl: they are synonyms for the same keyword. Hence, I always use for because it is shorter.
The second part of the code is the assignment, using the auto-increment operator ++
It is appended to a variable on the LHS and increments its value by 1. E.g.
$_++ means $_ = $_ + 1
$hash{$_}++ means $hash{$_} = $hash{$_} + 1
...etc
It also has a certain Perl magic included, which you can read more about in the documentation. In this case, it means that it can increment even undefined variables without issuing a warning about it. This is ideal when it comes to initializing hash keys, which do not exist beforehand.
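As a small illustration of that difference, the sketch below increments a missing hash key in two ways and counts the warnings raised. The hash names and the warning handler are made up for this example.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my (%plus_plus, %plus_one);
$plus_plus{key}++;                      # undef is treated as 0, silently
$plus_one{key} = $plus_one{key} + 1;    # same result, but warns about undef

print "values: $plus_plus{key}, $plus_one{key}\n";
print "warnings captured: ", scalar @warnings, "\n";
```

Both hashes end up holding 1, but only the explicit addition triggers an "uninitialized value" warning.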
Your code will initialize a hash key for each word in your @arry list, and also count the occurrences of each word, which happens to be 1 in this case. This is relevant to point out because hash keys are unique: your array may be bigger than the list of keys in the hash, since duplicate words map to the same key.
my @words = qw(foo bar bar baaz);
my %hash1;
for my $key (@words) {
$hash1{$key} = 1; # initialize each word
}
# %hash1 = ( foo => 1, bar => 1, baaz => 1 );
# note -^^
my %hash2; # new hash
for my $key (@words) {
$hash2{$key}++; # use auto-increment: words are counted
}
# %hash2 = ( foo => 1, bar => 2, baaz => 1);
# note -^^
Here is another one
my %hash = map { $_ => 1 } @ary;
Explanation: map takes one element of the input array at a time and for each prepares a list, here of two items: the element itself ($_, also quoted because of =>) and a 1. Such a list of pairs then populates a hash, as a list of even length can be assigned to a hash, whereby each two successive elements form a key-value pair.
Note: This does not account for possible multiple occurrences of the same element in the array; it only builds an existence-check structure (whether an element is in the array or not).
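To make the contrast concrete, here is a short sketch that builds both structures from the same list. The names %seen and %count are chosen for this example only.

```perl
use strict;
use warnings;

my @ary = qw(foo bar bar baaz);

my %seen = map { $_ => 1 } @ary;   # existence check only
my %count;
$count{$_}++ for @ary;             # number of occurrences

print "bar is present\n" if $seen{bar};
print "bar occurs $count{bar} times\n";
print "qux is absent\n" unless exists $seen{qux};
```

With map the duplicate "bar" still yields a value of 1, while the increment loop records that it appeared twice.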
$hash{$_}++ for @arry; # What is happening here?
It is iterating over the array, and for each element, it's assigning it as a key to the hash, and incrementing the value of that key by one. You could also write it like this:
my %hash;
my @array = (1, 2, 2, 3);
for my $element (@array) {
$hash{$element}++;
}
The result would be:
$VAR1 = {
'2' => 2,
'1' => 1,
'3' => 1
};
$hash{$_}++ for @arry; # What is happening here?
Read perlsyn, specifically simple statements and statement modifiers:
Simple Statements
The only kind of simple statement is an expression evaluated for its side-effects. Every simple statement must be terminated with a semicolon, unless it is the final statement in a block, in which case the semicolon is optional. But put the semicolon in anyway if the block takes up more than one line, because you may eventually add another line. Note that there are operators like eval {}, sub {}, and do {} that look like compound statements, but aren't--they're just TERMs in an expression--and thus need an explicit termination when used as the last item in a statement.
Statement Modifiers
Any simple statement may optionally be followed by a SINGLE modifier, just before the terminating semicolon (or block ending). The possible modifiers are:
if EXPR
unless EXPR
while EXPR
until EXPR
for LIST
foreach LIST
when EXPR
[...]
The for(each) modifier is an iterator: it executes the statement once for each item in the LIST (with $_ aliased to each item in turn). There is no syntax to specify a C-style for loop or a lexically scoped iteration variable in this form.
print "Hello $_!\n" for qw(world Dolly nurse);

How to create variable array name in foreach loop for perl

I would like to create an array inside a foreach loop whose name changes on each iteration
our $j = 1;
foreach $key ( sort keys %hash ){
@array1 = $hash{$key};
$j++;
}
How do I change the array name with $j? For every key, the array name should change: @array1, @array2, @array3....
That would require symbolic references and you don't want to be doing that.
It is a dangerous feature which is actually needed and used only very rarely for very specific reasons. For all other purposes there are other, better, ways.
Instead, use anonymous arrays (or array references) stored in a data structure, with an array
my @data;
foreach my $key (sort keys %hash) {
push @data, [ ... ]; # (populate with $hash data)
}
or a hash
my %data;
foreach my $key (sort keys %hash) {
my $name = ...; # work out a suitable key-name
$data{$name} = [ ... ]; # populate with $hash data
}
I don't know what to put in anonymous arrays [ ... ], or what good names for keys ($name) are, since it's not stated what is in the hash.
It is conceivable that your hash values themselves are in fact arrayrefs, in which case
my @data;
foreach my $key (sort keys %hash) {
push @data, $hash{$key};
}
seems to fit the question but is really just
my @data = map { $hash{$_} } sort keys %hash;
or, if you don't need a predictable order based on keys
my @data = values %hash;
But I presume that there is more to do with hash's data before it is stored in arrays.
Then you can refer to the individual array(ref)s by index (or by name in the case of a hash).
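As a rough sketch of both options, assume the hash values are already array references (the question does not say so). The sample data and the generated names array1, array2, ... are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: each hash value is already an array reference.
my %hash = (
    alpha => [ 1, 2, 3 ],
    beta  => [ 4, 5 ],
);

# Array of arrayrefs, ordered by key.
my @data = map { $hash{$_} } sort keys %hash;

# Hash of arrayrefs, keyed by made-up names array1, array2, ...
my %named;
my $j = 1;
$named{ "array" . $j++ } = $hash{$_} for sort keys %hash;

print "$data[1][0]\n";               # first element of the second array
print "@{ $named{array1} }\n";       # the whole first array
print scalar @{ $data[0] }, "\n";    # element count of the first array
```

Either structure gives the effect of "numbered arrays" without symbolic references: $data[0] or $named{array1} plays the role that @array1 had in the question.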
for(my $i=0;$i<100;$i++){
my @arr_i=($i);
print(@arr_i,"\n");
}

perl: using push() on an array inside a hash

Is it possible to use Perl's push() function on an array inside a hash?
Below is what I believe to be the offending part of a larger program that I am working on.
my %domains = ();
open (TABLE, "placeholder.foo") || die "cannot read domtblout file\n";
while ($line = <TABLE>)
{
if (!($line =~ /^#/))
{
@split_line = split(/\t/, $line); # splits on tabs - some entries contain whitespace
if ($split_line[13] >= $domain_cutoff)
{
push($domains{$split_line[0]}[0], $split_line[19]); # adds "env from" coordinate to array
push($domains{$split_line[0]}[1], $split_line[20]); # adds "env to" coordinate to array
# %domains is a hash, but $domains{identifier}[0] and $domains{$identifier}[1] are both arrays
# this way, all domains from one sequence are stored with the same hash key, but can easily be processed iteratively
}
}
}
Later I try to interact with these arrays using
for ($i = 0, $i <= $domains{$identifier}[0], $i++)
{
$from = $domains{$identifier}[0][$i];
$to = $domains{$identifier}[1][$i];
$length = ($to - $from);
$tmp_seq =~ /.{$from}(.{$length})/;
print("$header"."$1");
}
but it appears as if the arrays I created are empty.
If $domains{$identifier}[0] is an array, then why can I not use the push statement to add an element to it?
$domains{identifier}[0] is not an array.
$domains{identifier}[0] is an array element, a scalar.
$domains{identifier}[0] is a reference to an array.
If it's
@array
when you have an array, it's
@{ ... }
when you have a reference to an array, so
push(@{ $domains{ $split_line[0] }[0] }, $split_line[19]);
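The following self-contained sketch shows the dereferenced push and a matching read-back loop. The @records list is hypothetical sample data standing in for the parsed table, not the asker's real input.

```perl
use strict;
use warnings;

my %domains;

# Hypothetical (identifier, from, to) records standing in for the parsed table.
my @records = ( [ 'seqA', 10, 50 ], [ 'seqA', 80, 120 ], [ 'seqB', 5, 30 ] );

for my $rec (@records) {
    my ($id, $from, $to) = @$rec;
    push @{ $domains{$id}[0] }, $from;   # autovivifies the inner array
    push @{ $domains{$id}[1] }, $to;
}

for my $id (sort keys %domains) {
    my ($froms, $tos) = @{ $domains{$id} };
    for my $i (0 .. $#$froms) {          # last index of the "from" array
        print "$id: $froms->[$i]-$tos->[$i]\n";
    }
}
```

Note that the read-back loop iterates up to the last index of the referenced array ($#$froms) rather than comparing against the reference itself.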
References:
Mini-Tutorial: Dereferencing Syntax
References quick reference
perlref
perlreftut
perldsc
perllol

Explanation on data structures

I once read the following Perl code involving iterations.
for my $j (0 .. $#{$dat[$Row]})
{
$vectors{ $dat[$Row][$j] } = $j;
}
What does
$vectors{ $dat[$Row][$j] }
stand for?
Is that equivalent to $vectors->$dat[$Row][$j] ?
what does $vectors{ $dat[$Row][$j] } stand for?
$dat[$Row] is a reference to an array. $dat[$Row][$j] is apparently an element in that array. Whatever value is contained in it, becomes a hash key in %vectors, which gets the value $j.
Is that equivalent $vectors->$dat[$Row][$j]
No, that would be referring to the variable $vectors, not %vectors.
A more readable way to write this might be:
my $aref = $dat[$Row];
for my $index (keys @$aref) {
my $key = $aref->[$index];
$vectors{$key} = $index;
}
Which also exemplifies the use of ->, to dereference a reference.
%vectors is a hash, @dat a multidimensional array (an array of array references), and $Row and $j two scalars. So you're setting the key given by $dat[$Row][$j] in the %vectors hash to $j.
$vectors{ $dat[$Row][$j] }
is short for
$vectors{ $dat[$Row]->[$j] }
If you spell it out,
# $Row is a row index.
# $j is a column index.
# (How inconsistent!)
my $row = $dat[$Row]; # A ref to an array.
my $key = $row->[$j]; # A value from the table.
$vectors{$key}
%vectors is a hash.
$vectors{$k} is the value in the hash for key $k
$dat[$Row][$j] is an element of a 2-D array (column $j, row $Row)
So the loop is creating a hash where the key is the contents and the value is the column index.
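A small runnable sketch of that loop follows. The contents of @dat are invented for the example; the original question does not show the data.

```perl
use strict;
use warnings;

# Hypothetical table: row 0 holds column names.
my @dat = ( [ qw(id name size) ], [ 1, 'disk', 500 ] );
my $Row = 0;

my %vectors;
for my $j (0 .. $#{ $dat[$Row] }) {
    $vectors{ $dat[$Row][$j] } = $j;   # cell contents => column index
}

print "$_ => $vectors{$_}\n" for sort keys %vectors;
```

The resulting hash works as a reverse lookup: given a value from the chosen row, it returns the column where that value sits.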

How can I extract an array from a two-dimensional array in Perl?

I have once again forgotten how to get $_ to represent an array when it is in a loop of a two dimensional array.
foreach(@TWO_DIM_ARRAY){
my @ARRAY = $_;
}
That's the intention, but that doesn't work. What's the correct way to do this?
The line my @ARRAY = @$_; (instead of = $_;) is what you're looking for, but unless you explicitly want to make a copy of the referenced array, I would use @$_ directly.
Well, actually I wouldn't use $_ at all, especially since you're likely to want to iterate through @$_, and then you use implicit $_ in the inner loop too, and then you could have a mess figuring out which $_ is which, or if that's even legal. Which may have been why you were copying into @ARRAY in the first place.
Anyway, here's what I would do:
for my $array_ref (@TWO_DIM_ARRAY) {
# You can iterate through the array:
for my $element (@$array_ref) {
# do whatever to $element
}
# Or you can access the array directly using arrow notation:
$array_ref->[0] = 1;
}
for (@TWO_DIM_ARRAY) {
my @arr = @$_;
}
The $_ will be array references (not arrays), so you need to dereference it as:
my @ARRAY = @$_;
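To show the difference between copying a row and working through the reference, here is a brief sketch with made-up data:

```perl
use strict;
use warnings;

my @TWO_DIM_ARRAY = ( [ 1, 2 ], [ 3, 4 ] );

for my $row (@TWO_DIM_ARRAY) {
    my @copy = @$row;    # independent copy of the row
    $copy[0] = 99;       # does not touch @TWO_DIM_ARRAY
    $row->[1] *= 10;     # modifies the original through the reference
}

print join(",", @$_), "\n" for @TWO_DIM_ARRAY;
```

Changes made to @copy are lost when the loop body ends, while changes made through $row persist in the two-dimensional array.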
