How to create multiple arrays at once in perl - arrays

I'm trying to create 23 arrays without typing out @array1, @array2, and so on, and load them each with the variables from the array @r if the $chrid matches the array number (if $chrid=1 it should be placed in @array1). How can I achieve this?
Here is what I have so far:
#!/usr/bin/perl
use warnings;
use strict;
my @chr;
my $input;
open ($input, "$ARGV[0]") || die;
while (<$input>) {
    my @r = split(/\t/);
    my $snps = $r[0];
    my $pval = $r[1];
    my $pmid = $r[2];
    my $chrpos = $r[3];
    my $chrid = $r[4];
    for ($chrid) {
        push (@chr, $chrid);
    }
}
close $input;

You can use an array of arrays, where each subarray is stored at a sequentially increasing index in your array of arrays. Here is what that could look like, although it is still unclear to me what data you want to store:
use warnings;
use strict;
my @chr;
open my $input_fh, '<', $ARGV[0]
    or die "Unable to open $ARGV[0] for reading: $!";
while (<$input_fh>) {
    chomp; # strip the newline so it does not end up in $chrid
    # you can unpack your data in a single statement
    my ($snps, $pval, $pmid, $chrpos, $chrid) = split /\t/;
    # unclear what you actually want to store
    push @{ $chr[$chrid] }, ( $snps, $pval, $pmid, $chrpos, $chrid );
}
close $input_fh;
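To read the data back, each slot of @chr can be dereferenced like any other array. The following is a minimal sketch, assuming five tab-separated fields with the chromosome ID last. The sample rows, and the choice to store one array reference per input row instead of a flat list, are assumptions for illustration and not taken from the question:

```perl
use strict;
use warnings;

# Hypothetical sample rows: snp, pval, pmid, chrpos, chrid (tab-separated).
my @rows = (
    "rs1\t0.01\t111\t1000\t1",
    "rs2\t0.05\t222\t2000\t2",
    "rs3\t0.02\t333\t3000\t1",
);

my @chr;    # $chr[$chrid] holds an array ref of rows for that chromosome
for my $row (@rows) {
    my ($snps, $pval, $pmid, $chrpos, $chrid) = split /\t/, $row;
    # Store each row as its own array ref so the fields stay grouped.
    push @{ $chr[$chrid] }, [ $snps, $pval, $pmid, $chrpos ];
}

# Index 0 is unused here, so skip slots that were never filled.
for my $chrid (grep { defined $chr[$_] } 0 .. $#chr) {
    my @snp_names = map { $_->[0] } @{ $chr[$chrid] };
    print "chr $chrid: @snp_names\n";
}
```

Grouping each row in its own reference keeps the fields of one record together, which a flat push would not.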

Related

How to load a CSV file into a perl hash and access each element

I have a CSV file with the following information separated by commas ...
Owner,Running,Passing,Failing,Model
D42,21,54,543,Yes
T43,54,76,75,No
Y65,76,43,765,Yes
I want to open this CSV file and place its contents inside a Perl hash in my program. I am also interested in the code needed to print a specific element inside the hash. For example, how I will print the "Passing" count for the "Owner" Y65.
The code I currently have:
my $file = "path/to/file";
open my $f, '<', $file or die "can't open $file: $!";
while (my $line = <$f>) {
    # Inside here I am trying to take the contents of this file and place them into a hash. I have tried numerous ways of doing this, but none have seemed to work. I am leaving this blank because I do not want to bog down the visibility of my code for those who are kind enough to help and take a look. Thanks.
}
As well as placing the CSV file inside a hash, I also need to understand the syntax to print and navigate through specific elements. Thank you very much in advance.
Here is an example of how to put the data into a hash %owners and later (after having read the file) extract a "passing count" for a particular owner. I am using the Text::CSV module to parse the lines of the file.
use feature qw(say);
use open qw(:std :utf8); # Assume UTF-8 files and terminal output
use strict;
use warnings qw(FATAL utf8);
use Text::CSV;
my $csv = Text::CSV->new ( )
or die "Cannot use CSV: " . Text::CSV->error_diag ();
my $fn = 'test.csv';
open my $fh, "<", $fn
or die "Could not open file '$fn': $!";
my %owners;
my $header = $csv->getline( $fh ); # TODO: add error checking
while ( my $row = $csv->getline( $fh ) ) {
next if @$row == 0; # TODO: more error checking
my ($owner, @values) = @$row;
$owners{$owner} = \@values;
}
close $fh;
my $key = 'Y65';
my $index = 1;
say "Passing count for $key = ", $owners{$key}->[$index];
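If looking up values by column name is preferable to remembering index positions, each row can be stored as a hash instead. This is a sketch under assumptions: the data is read from an in-memory string rather than test.csv, and a plain split on commas stands in for Text::CSV, which only works because this sample has no quoted fields or embedded commas:

```perl
use strict;
use warnings;

my $data = <<'END';
Owner,Running,Passing,Failing,Model
D42,21,54,543,Yes
T43,54,76,75,No
Y65,76,43,765,Yes
END

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

chomp(my $header_line = <$fh>);
my @columns = split /,/, $header_line;

my %owners;    # owner => { column name => value }
while (my $line = <$fh>) {
    chomp $line;
    next unless length $line;
    my %record;
    @record{@columns} = split /,/, $line;    # hash slice pairs names with values
    $owners{ $record{Owner} } = \%record;
}
close $fh;

print "Passing count for Y65 = $owners{Y65}{Passing}\n";
```

For real-world CSV with quoting, the Text::CSV parsing shown above remains the safer choice; only the storage step would change.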
Since it's not really clear what "load a CSV file into a perl hash" means (Nor does it really make sense. An array of hashes, one per row, maybe, if you don't care about keeping the ordering of fields, but just a hash? What are the keys supposed to be?), let's focus on the rest of your question, in particular
how I will print the "Passing" count for the "Owner" Y65.
There are a few other CSV modules that might be of interest that are much easier to use than Text::CSV:
Tie::CSV_File lets you access a CSV file like a 2D array. $foo[0][0] is the first field of the first row of the tied file.
So:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use Tie::CSV_File;
my $csv = "data.csv";
tie my @data, "Tie::CSV_File", $csv or die "Unable to tie $csv!";
for my $row (@data) {
    say $row->[2] and last if $row->[0] eq "Y65";
}
DBD::CSV lets you treat a CSV file like a table in a database you can run SQL queries on.
So:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use DBI;
my $csv = "data.csv";
my $dbh = DBI->connect("dbi:CSV:", undef, undef,
{ csv_tables => { data => { f_file => $csv } } })
or die $DBI::errstr;
my $owner = "Y65";
my $p = $dbh->selectrow_arrayref("SELECT Passing FROM data WHERE Owner = ?",
{}, $owner);
say $p->[0] if defined $p;
Text::AutoCSV has a bunch of handy functions for working with CSV files.
So:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use Text::AutoCSV;
my $csv = "data.csv";
my $acsv = Text::AutoCSV->new(in_file => $csv) or die "Unable to open $csv!";
my $row = $acsv->search_1hr("OWNER", "Y65");
say $row->{"PASSING"} if defined $row;
This last one is probably closest to what I think you think you want.

creating hash from array in perl

I have an array that I want to convert into a hash table. Basically, I want @array[0] to be the keys of the hash, and @array[1] to be the values of the hash. Is there an easy way to do this in Perl? The code I have so far is as follows:
#!/usr/bin/perl
use warnings;
use strict;
use diagnostics;
unless( open(INFILE, "<", 'scratch/Drosophila/fb_synonym_fb_2014_05.tsv')) {
    die "Cannot open file for reading: ", $!;
}
while(<INFILE>) {
    my @values = split();
    #convert values[0] to keys, values[1] to values
}
the file is available for download here
@array[0] (an array slice, used to return multiple elements) is a bad way of writing $array[0] (an array lookup, used to return a single element). use warnings; would have told you this.
To set a hash element, one uses
$hash{$key} = $val;
So the code becomes
my %hash;
while (<>) {
chomp;
my @fields = split /\t/;
$hash{ $fields[0] } = $fields[1];
}
Better yet,
my %hash;
while (<>) {
chomp;
my ($key, $val) = split /\t/;
$hash{$key} = $val;
}
The name of the file implies the fields are tab-separated, not whitespace separated, so I switched
split ' '
to
split /\t/
This required the addition of chomp.
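The reason chomp becomes necessary is that split ' ' treats all whitespace, including the trailing newline, as a separator, whereas split /\t/ leaves the newline attached to the last field. A small illustration with a made-up input line (the record content is an assumption, not taken from the linked file):

```perl
use strict;
use warnings;

my $line = "FBgn0000001\tsymbol one\n";    # hypothetical tab-separated record

# split ' ' breaks on any whitespace, so the newline disappears but the
# space inside the second field also splits it in two.
my @by_space = split ' ', $line;

# split /\t/ keeps the second field whole but leaves the newline on it.
my @by_tab = split /\t/, $line;

# chomp first, then split on tabs: two clean fields.
chomp(my $clean = $line);
my @fields = split /\t/, $clean;

printf "split ' ': %d fields\n", scalar @by_space;
printf "split /\\t/ without chomp ends in newline: %s\n",
    ($by_tab[-1] =~ /\n\z/ ? 'yes' : 'no');
printf "chomp + split /\\t/: %d fields, last is '%s'\n",
    scalar @fields, $fields[-1];
```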

Compare two hashes in perl and list which records are extra?

I have two text files that contain user records. I have to compare these two files and figure out which users are missing from File1. And delete these Orphans from file2.
#!/usr/local/bin/perl -w
use strict;
use warnings;
use autodie;
use Text::Diff;
use List::Compare;
use Data::Dumper;
my $Users1 = "Users1.txt";
my $Users2 ="Users2.txt";
my %hash1;
my %hash2;
my %new_hash;
my @sorted_1;
my @sorted_2;
my @list_keys1;
my @list_keys2;
open(my $fh1, '<:encoding(UTF-8)', $Users1) or die "Could not open the file!";
while(my $record1 = <$fh1>)
{
    chomp $record1;
    my @list1 = split( '/', $record1);
    foreach my $item(@list1)
    {
        $new_hash{$list1[1]} = $list1[0];
        $hash1{$list1[1]} = $list1[0];
    }
    while ( my ($key, $value) = each(%hash1) ) {
        push (@list_keys1, $key);
        @sorted_1 = sort @list_keys1;
    }
}
print "\t\tHash values for USERS1:\n";
print Dumper \%hash1;
open(my $fh2, '<:encoding(UTF-8)', $Users2) or die "Could not open the file!";
while(my $record2 = <$fh2>)
{
    chomp $record2;
    my @list2 = split( '/', $record2);
    foreach my $item(@list2)
    {
        $hash2{$list2[1]} = $list2[0];
    }
    while ( my ($key, $value) = each(%hash2) )
    {
        push (@list_keys2, $key);
        @sorted_2 = sort @list_keys2;
    }
}
print "\n\n\t\tHash values for Users2:\n";
print Dumper \%hash2;
@hash1{@list_keys1} = 1;
@hash2{@list_keys2} = 1;
foreach(keys %hash2)
{
    print "\nThis user does not exist(to be deleted): $_\n" unless exists $hash1{$_};
}
foreach (keys %hash1)
{
    print "\nNew User (to be added):$_\n" unless exists $hash2{$_};
}
close ($fh1);
close ($fh2);
Questions:
I am not able to sort the user IDs (strings) alphabetically (here, user IDs are random strings of length 7). Are there any limitations when it comes to sorting arrays or hashes in Perl?
I am not able to compare two hashes and get the differences. What would be the most efficient way to do that?
Are there any additional libraries that I need to install in order to handle this part of code?
Sample records from file:
File1:
ASIA/ASEDF46
INDIA/PSDfT5V
CHINA/FSDfT5V
INDIA/AA44TYB
USA/BBRTT67
File 2:
INDIA/PSDfT5V
CHINA/FSDfT5V
INDIA/AA44TYB
USA/BBRTT67
UK/ZK9EELO
use strict;
use warnings;
use autodie;
open my $in, '<', 'in.txt';
open my $in2, '<', 'in_2.txt';
my (%data1, %data2);
while(<$in>){
chomp;
my @split = split/\//;
$data1{$split[0]} = $split[1];
}
while(<$in2>){
chomp;
my @split = split/\//;
$data2{$split[0]} = $split[1];
}
foreach(sort keys %data1){
print "User: $_ Value: $data1{$_}\n" if $data2{$_};
}
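Note that the question asks for the differences between the two files and keys its records by user ID, the part after the slash. A hedged sketch of that comparison follows, using the sample records from the question as in-memory lists (reading them from Users1.txt and Users2.txt is left out); the variable names are illustrative:

```perl
use strict;
use warnings;

my @file1 = qw(ASIA/ASEDF46 INDIA/PSDfT5V CHINA/FSDfT5V INDIA/AA44TYB USA/BBRTT67);
my @file2 = qw(INDIA/PSDfT5V CHINA/FSDfT5V INDIA/AA44TYB USA/BBRTT67 UK/ZK9EELO);

# Key each hash by user ID (the part after the slash); the value is the region.
my %users1 = map { my ($region, $id) = split m{/}; ($id => $region) } @file1;
my %users2 = map { my ($region, $id) = split m{/}; ($id => $region) } @file2;

# IDs present in one hash but not the other, sorted alphabetically.
my @only_in_2 = sort { $a cmp $b } grep { !exists $users1{$_} } keys %users2;
my @only_in_1 = sort { $a cmp $b } grep { !exists $users2{$_} } keys %users1;

print "In file 2 but missing from file 1 (orphans): @only_in_2\n";
print "In file 1 but missing from file 2: @only_in_1\n";
```

Only core Perl is needed for this; sort with cmp handles string IDs of any length, and exists on a hash is a constant-time membership test.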

Printing an array variable displaying all its elements using main:: .

The @main::match_to_array prints out only the last element in the array @match_to_array, not the whole array.
I did my code with reference to this SO link.
The input HTML consists of
dmit@sp.com
ems@es.com
dew@es.com
dmit@sp.com
erg@es.com
#!/usr/bin/perl -w
use strict;
use warnings;
use Cwd;
sub extractMail {
    my $perl_path = cwd;
    # Full HTML.htm
    if(-e 'test.html') {
        open(OPENFILE, "$perl_path/test.html") or die "Unable to open file";
    }
    my @email = <OPENFILE>;
    close OPENFILE;
    foreach my $email (@email){
        if ($email =~ /regex to match data/){
            my $match = "$1\n";
            our @match_to_array = split ("\n",$match);
        } # end of if statement
    } # end of foreach
} # end of subroutine extractMail
for (my $a = 1;$a<=1;$a++){
    &extractMail;
    print @main::match_to_array;
}
You have misunderstood the post. The point is to declare the variable at the right place. In this case, you should probably return the value from the subroutine. Moreover, by assigning to an array
@match_to_array = split /\n/, $match;
you are overwriting the previous contents of the array. Use push instead.
Untested:
#!/usr/bin/perl -w
use strict;
use warnings;
use Cwd;
sub extractMail {
    my $perl_path = cwd;
    my $path = "$perl_path/test.html";
    # Open outside any inner block so the handle stays in scope for the loop below.
    open my $OPENFILE, '<', $path or die "Unable to open $path: $!";
    my @match_to_array;
    while (my $email = <$OPENFILE>) {
        if ($email =~ /regex to match data/) {
            my $match = "$1\n";
            push @match_to_array, split /\n/, $match;
        }
    }
    close $OPENFILE;
    return @match_to_array;
}
for my $i (1 .. 1) {
    my @match_to_array = extractMail();
    print "@match_to_array\n";
}
my @email = <OPENFILE>;
close OPENFILE;
This might be the problem: after those lines, @email contains one element, namely "dmit@sp.com ems@es.com ...".
Afterwards you're doing this:
foreach my $email (@email)
This will loop once, with
$email = "dmit@sp.com ems@es.com ..."
Then your regex removes everything but "dmit@sp.com", which leads you to the conclusion that only one element of your list is processed.
Try reading up on split to generate an array out of your space-separated list.

2d arr explicitpackage

I've looked through several threads on websites including this one to try and understand why I am getting an undeclared variable error for my usage of my $line. Each element of the @lines array is an array of strings.
The error is in line 25 and 27 with the $line[$count] statement
use strict;
use warnings;
my @lines;
my @sizes;
# read input from stdin file into 2d array
while(<>)
{
    push(@lines, my @tokens = split(/\s+/, $_));
}
# search through each array for largest sizes in
# corresponding elements
for (my $count = 0; $count <= 5; $count++)
{
    push(@sizes, 0);
    foreach my $line (@lines)
    {
        if(length($line[$count])>$sizes[$count])
        {
            $sizes[$count] = length($line[$count]);
        }
    }
}
I can post the full code if it is necessary, but I am pretty sure the error must be in here somewhere.
The problem is here:
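push(@lines, my @tokens = split(/\s+/, $_));
Pushing one array onto another just appends all of its elements to the first array, so you are building one really long one-dimensional array. That is also why $line[$count] fails under strict: it refers to an array named @line, which was never declared, rather than to the scalar $line.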
To fix this, use brackets to make an array reference:
push @lines, [ split(/\s+/, $_) ]; #No need for a temp variable.
Also, to access the array reference, you have to de-reference it. Both of these syntaxes are options:
${$line}[$count];
$line->[$count];
I think the second syntax is more readable.
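Putting both fixes together, the width search from the question might look like the following sketch. The sample input lines are made up, and the loop runs over however many columns each row has rather than the fixed 0 to 5 of the original:

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for <>.
my @input = ("alpha bb ccc\n", "d eeeee ff\n");

my @lines;    # array of array refs: one ref per input line
push @lines, [ split ' ', $_ ] for @input;

my @sizes;    # longest token seen in each column
for my $line (@lines) {
    for my $count (0 .. $#{$line}) {
        my $len = length $line->[$count];    # dereference with ->
        $sizes[$count] = $len
            if !defined $sizes[$count] || $len > $sizes[$count];
    }
}

print "Column widths: @sizes\n";
```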
Update: Also, you could simplify your code if you keep track of the longest lengths while you go through the file:
use strict;
use warnings;
use List::Util qw/max/;
my @lines;
my @sizes = (0)x6;
while(<>)
{
    push @lines, [ my @tokens = split ];
    @sizes = map { max ( length($tokens[$_]), $sizes[$_] ) } 0..$#tokens;
}
Note: The Data::Dumper core module is an invaluable tool when working with complex data structures in Perl.
use Data::Dumper;
print Dumper @lines;
This will print out the complete structure of whatever variable you give it. That way you can see if you actually created what you thought you did.

Resources