perl: using push() on an array inside a hash

Is it possible to use Perl's push() function on an array inside a hash?
Below is what I believe to be the offending part of a larger program that I am working on.
my %domains = ();
open (TABLE, "placeholder.foo") || die "cannot read domtblout file\n";
while ($line = <TABLE>)
{
if (!($line =~ /^#/))
{
@split_line = split(/\t/, $line); # splits on tabs - some entries contain whitespace
if ($split_line[13] >= $domain_cutoff)
{
push($domains{$split_line[0]}[0], $split_line[19]); # adds "env from" coordinate to array
push($domains{$split_line[0]}[1], $split_line[20]); # adds "env to" coordinate to array
# %domains is a hash, but $domains{identifier}[0] and $domains{$identifier}[1] are both arrays
# this way, all domains from one sequence are stored with the same hash key, but can easily be processed iteratively
}
}
}
Later I try to interact with these arrays using
for ($i = 0, $i <= $domains{$identifier}[0], $i++)
{
$from = $domains{$identifier}[0][$i];
$to = $domains{$identifier}[1][$i];
$length = ($to - $from);
$tmp_seq =~ /.{$from}(.{$length})/;
print("$header"."$1");
}
but it appears as if the arrays I created are empty.
If $domains{$identifier}[0] is an array, then why can I not use the push statement to add an element to it?

$domains{identifier}[0] is not an array.
$domains{identifier}[0] is an array element, a scalar.
$domains{identifier}[0] is a reference to an array.
If it's
@array
when you have an array, it's
@{ ... }
when you have a reference to an array, so
push(@{ $domains{ $split_line[0] }[0] }, $split_line[19]);
References:
Mini-Tutorial: Dereferencing Syntax
References quick reference
perlref
perlreftut
perldsc
perllol
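As a minimal sketch of that fix in context, the snippet below fills two coordinate arrays per hash key and then walks them by index. The sample rows and the names $id, $from and $to are made up for illustration; the inner arrays do not need to be created beforehand, because push autovivifies them.

```perl
use strict;
use warnings;

# Hypothetical sample data: [identifier, "env from", "env to"].
my @rows = (
    [ 'seqA', 10, 50 ],
    [ 'seqA', 70, 120 ],
    [ 'seqB', 5,  30 ],
);

my %domains;
for my $row (@rows) {
    my ($id, $from, $to) = @$row;
    # @{ ... } dereferences the array reference stored in the hash.
    # The inner arrays spring into existence on the first push.
    push @{ $domains{$id}[0] }, $from;
    push @{ $domains{$id}[1] }, $to;
}

# $#{ ... } is the last index of the referenced array.
for my $i (0 .. $#{ $domains{seqA}[0] }) {
    printf "seqA: %d-%d\n", $domains{seqA}[0][$i], $domains{seqA}[1][$i];
}
```

Note that the loop bound uses $#{ ... } rather than the reference itself, since comparing a counter against an array reference does not compare it against the array's length.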

Related

Perl: Load file into hash

I'm struggling to understand logic behind hashes in Perl. Task is to load file in to hash and assign values to keys which are created using this file.
File contains alphabet with each letter on its own line:
a
b
c
d
e
and so on.
When using array instead of hash, logic is simple: load file into array and then print each element with corresponding number using some counter ($counter++).
But now my question is, how can I read file into my hash, assign automatically generated values and sort it in that way where output is printed like this:
a:1
b:2
c:3
I've tried to first create array and then link it to hash using
%hash = @array
but it makes my hash non-sortable.
There are a number of ways to approach this. The most direct would be to load the data into the hash as you read through the file.
my %hash;
while(<>)
{
chomp;
$hash{$_} = $.; #Use the line number as your autogenerated counter.
}
You can also perform similar logic if you already have a populated array.
for (0..$#array)
{
$hash{$array[$_]} = $_;
}
Although, if you are in that situation, map is the perlier way of doing things.
%hash = map { $array[$_] => $_ } 0 .. $#array;
Think of a hash as a set of pairs (key, value), where the keys must be unique. You want to read the file one line at a time, and add a pair to the hash:
$record = <$file_handle>;
$hash{$record} = $counter++;
Of course, you could read the entire file into an array at once and then assign to your hash. But the solution is not:
@records = <$file_handle>;
%hash = @records;
... as you found out. If you think in terms of (key, value) pairs, you will see that the above is equivalent to:
$hash{a} = 'b';
$hash{c} = 'd';
$hash{e} = 'f';
...
and so on. You still are going to need a loop, either an explicit one like this:
foreach my $rec (@records)
{
$hash{$rec} = $counter++;
}
or an implicit one like one of these:
%hash = map {$_ => $counter++} @records;
# or:
$hash{$_} = $counter++ for @records;
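To make the difference concrete, here is a small sketch with a hard-coded list standing in for the file contents (the variable names %paired and %numbered are illustrative only). Assigning an array straight to a hash consumes it two items at a time, while map pairs every record with a counter.

```perl
use strict;
use warnings;

my @records = qw(a b c d e f);

# List assignment takes the elements pairwise: key, value, key, value...
my %paired = @records;              # a => 'b', c => 'd', e => 'f'

# Pairing each record with a running counter gives the intended mapping.
my $counter = 1;
my %numbered = map { $_ => $counter++ } @records;

# Hashes are unordered, so sort the keys when printing.
printf "%s:%d\n", $_, $numbered{$_} for sort keys %numbered;
```

When reading real lines from a file, chomp them first so the trailing newline does not become part of each key.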
This code should generate the proper output, where my-text-file is the path to your data file:
my %hash;
my $counter = 0;
open(FILE, "<", "my-text-file") or die "cannot open my-text-file: $!\n";
while (<FILE>) {
chomp;
$counter++;
$hash{$_} = $counter;
}
# Now to sort
foreach $key (sort(keys(%hash))) {
print $key . ":" . $hash{$key} . "\n";
}
I assume you want to sort the hash alphabetically. keys(%hash) and values(%hash) return the keys and values of %hash as lists, respectively. Run the program on this file:
f
a
b
d
e
c
And we get:
a:2
b:3
c:6
d:4
e:5
f:1
I hope this helps you.

How to pass an array as value to a single key in a hash of perl?

my %hash;
my @chain;
foreach (my $i=0; $i<=7; $i++)
{
foreach (my $j=0; $j<=($#output); $j++)
{
if ($output[$j] =~ /chain1/)
{
push (@array, $output[$j]);
}
}
$hash{$chain[$i]} = [ @array ];
}
print "$hash{$chain[0]}\n";
The problem is that I am not able to assign the arrays to unique keys in the hash. When I print them, all the keys print the same output.
You keep adding to the same array.
for (...) {
my @array; # <-- Add here
for (...) {
...
push @array, $output[$j];
...
}
$hash{$chain[$i]} = \@array; # <-- No need to copy elements anymore.
}
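A runnable sketch of that pattern follows. The contents of @chain and @output are invented here, and matching each line against the current chain name is an assumption (the question's code matches the fixed pattern /chain1/). The point it illustrates is that declaring the array inside the outer loop gives every key its own array.

```perl
use strict;
use warnings;

# Hypothetical inputs standing in for @chain and @output.
my @chain  = qw(chain1 chain2);
my @output = ('chain1 x', 'chain2 y', 'chain1 z');

my %hash;
for my $name (@chain) {
    my @matches;                    # fresh array on every iteration
    for my $line (@output) {
        push @matches, $line if $line =~ /\Q$name\E/;
    }
    $hash{$name} = \@matches;       # each key gets its own array
}

# Dereference with @{ ... } to print the elements.
printf "%s: %s\n", $_, join(', ', @{ $hash{$_} }) for sort keys %hash;
```

Printing $hash{$chain[0]} directly, as in the question, shows the reference itself (something like ARRAY(0x...)) rather than its elements, which is why the last line dereferences it.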
Perl hashes are designed to hold only scalar values. A hash entry has a key, and its value can be a reference to an array (a reference is a scalar). But if the array's contents need to be modified, you can instead concatenate them into a string with some delimiter and store that string in the hash.
Hope this Helps.

Looping through 2D array in Perl?

If I have a 2D array, how can I access an entire subarray inside a loop? Right now I have
foreach my $row(@data){
foreach my $ind(@$row){
#perform operations on specific index
}
}
but ideally I'm looking for something along the lines of
foreach my $row(@data){
#read row data like $row[0], which if it has the data I'm looking for
#I can go ahead and access $row[3] while in the same row..
}
I'm fairly new to Perl so might just not understand something yet, but I keep getting "Global symbol "@row" requires explicit package name" when trying to use it the way I want to.
You're close. $row is an array reference and you access its elements with the dereference operator ->[...]:
foreach my $row (@data) {
if ($row->[0] == 42) { ... }
}
$row[0] refers to an element of the array variable @row, which is a completely different (and probably undefined -- thus the Global symbol ... error message) variable than $row.
If $row in your code sample is supposed to be a sub-array, or an array reference, you will have to use the indirect notation to access its elements, like $row->[0], $row->[1], etc.
The reason for your error is because $row[0] actually implies the existence of an array #row, which is probably not present in your script.
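As an illustrative sketch (the table contents and the @hits array are invented for the example), this checks the first column of each row and, when it matches, reads another column of the same row through the reference:

```perl
use strict;
use warnings;

# Hypothetical table: each row is an array reference.
my @data = (
    [ 42, 'a', 'b', 'first'  ],
    [  7, 'c', 'd', 'second' ],
    [ 42, 'e', 'f', 'third'  ],
);

my @hits;
for my $row (@data) {
    # $row->[0] reads element 0 of the array that $row points to.
    next unless $row->[0] == 42;
    push @hits, $row->[3];          # another column of the same row
}

print "@hits\n";    # prints "first third"
```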
You could also try this...
my #ary = ( [12,13,14,15],
[57,58,59,60,61,101],
[67,68,69],
[77,78,79,80,81,301,302,303]);
for (my $f = 0 ; $f < @ary ; $f++) {
for (my $s = 0 ; $s < @{$ary[$f]} ; $s++ ) {
print "$f , $s , $ary[$f][$s]\n";
}
print "\n";
}

Explanation on data structures

I once read the following Perl code involving iterations.
for my $j (0 .. $#{$dat[$Row]})
{
$vectors{ $dat[$Row][$j] } = $j;
}
What does
$vectors{ $dat[$Row][$j] }
stand for?
Is that equivalent to $vectors->$dat[$Row][$j] ?
what does $vectors{ $dat[$Row][$j] } stand for?
$dat[$Row] is a reference to an array. $dat[$Row][$j] is apparently an element in that array. Whatever value is contained in it, becomes a hash key in %vectors, which gets the value $j.
Is that equivalent $vectors->$dat[$Row][$j]
No, that would be referring to the variable $vectors, not %vectors.
A more readable way to write this might be:
my $aref = $dat[$Row];
for my $index (keys @$aref) {
my $key = $aref->[$index];
$vectors{$key} = $index;
}
Which also exemplifies the use of ->, to dereference a reference.
%vectors is a hash, @dat is a multidimensional array (an array of array references), and $Row and $j are two scalars. So you're setting the key given by $dat[$Row][$j] in the %vectors hash to $j.
$vectors{ $dat[$Row][$j] }
is short for
$vectors{ $dat[$Row]->[$j] }
If you spell it out,
# $Row is a row index.
# $j is a column index.
# (How inconsistent!)
my $row = $dat[$Row]; # A ref to an array.
my $key = $row->[$j]; # A value from the table.
$vectors{$key}
%vectors is a hash.
$vectors{$k} is the value in the hash for key $k
$dat[$Row][$j] is an element of a 2-D array (column $j, row $Row)
So the loop is creating a hash where the key is the contents and the value is the column index.
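One plausible use of such a hash, sketched here with made-up data in which row 0 holds column headers, is looking up a column index by name:

```perl
use strict;
use warnings;

# Hypothetical data: row 0 holds the column headers.
my @dat = ( [ qw(id name score) ], [ 1, 'ann', 90 ] );
my $Row = 0;

my %vectors;
for my $j (0 .. $#{ $dat[$Row] }) {
    $vectors{ $dat[$Row][$j] } = $j;    # header name => column index
}

# Use the hash to find a column by name in another row.
print $dat[1][ $vectors{score} ], "\n";   # prints 90
```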

How can I extract an array from a two-dimensional array in Perl?

I have once again forgotten how to get $_ to represent an array when it is in a loop of a two dimensional array.
foreach(@TWO_DIM_ARRAY){
my @ARRAY = $_;
}
That's the intention, but that doesn't work. What's the correct way to do this?
The line my @ARRAY = @$_; (instead of = $_;) is what you're looking for, but unless you explicitly want to make a copy of the referenced array, I would use @$_ directly.
Well, actually I wouldn't use $_ at all, especially since you're likely to want to iterate through @$_, and then you use implicit $_ in the inner loop too, and then you could have a mess figuring out which $_ is which, or if that's even legal. Which may have been why you were copying into @ARRAY in the first place.
Anyway, here's what I would do:
for my $array_ref (@TWO_DIM_ARRAY) {
# You can iterate through the array:
for my $element (@$array_ref) {
# do whatever to $element
}
# Or you can access the array directly using arrow notation:
$array_ref->[0] = 1;
}
for (@TWO_DIM_ARRAY) {
my @arr = @$_;
}
The $_ will be array references (not arrays), so you need to dereference it as:
my @ARRAY = @$_;
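The difference between copying the row and working through the reference can be seen in a short sketch (the data and the names $ref and @copy are chosen for illustration): changes to the copy leave the original untouched, while changes made through the reference do not.

```perl
use strict;
use warnings;

my @TWO_DIM_ARRAY = ( [ 1, 2 ], [ 3, 4 ] );

for my $ref (@TWO_DIM_ARRAY) {
    my @copy = @$ref;       # shallow copy of the row
    $copy[0] = 'changed';   # affects only the copy
    $ref->[1] *= 10;        # modifies the original row in place
}

# First column is untouched, second column was multiplied by 10.
print "@$_\n" for @TWO_DIM_ARRAY;
```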
