Perl push function gives index values instead of array elements - arrays

I am reading a text file named mention-freq, which has data in the following format:
1
1
13
2
I want to read the lines and store the values in an array like this: @a=(1, 1, 13, 2). The Perl push function gives the index values/line numbers, i.e., 1,2,3,4, instead of my desired output. Could you please point out the error? Here is what I have done:
use strict;
use warnings;
open(FH, "<mention-freq") || die "$!";
my @a;
my $line;
while ($line = <FH>)
{
$line =~ s/\n//;
push @a, $line;
print @a."\n";
}
close FH;

The bug is that you are printing the concatenation of @a and a newline. When you concatenate, that forces scalar context. The scalar sense of an array is not its contents but rather its element count.
You just want
print "@a\n";
instead.
Also, while it will not affect your code here, the normal way to remove the record terminator read in by the <> readline operator is using chomp:
chomp $line;
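As a minimal sketch of the context difference described above (the array contents come from the question; the names $as_count and $as_list are only illustrative):

```perl
use strict;
use warnings;

my @a = (1, 1, 13, 2);

# Concatenation imposes scalar context, so @a yields its element count.
my $as_count = @a . "\n";

# Interpolation in a double-quoted string joins the elements with $"
# (a single space by default).
my $as_list = "@a\n";

print $as_count;   # 4
print $as_list;    # 1 1 13 2
```

Inside the question's loop the count grows by one on every push, which is why 1, 2, 3, 4 appeared.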

Related

How to read a .txt file and store it into an array

I know this is a fairly simple question, but I cannot figure out how to store all of the values in my array the way I want to.
Here is a small portion of what the .txt file looks like:
0 A R N D
A 2 -2 0 0
R -2 6 0 -1
N 0 0 2 2
D 0 -1 2 4
Each value is delimited by either two spaces - if the next value is positive - or a space and a '-' - if the next value is negative
Here is the code:
use strict;
use warnings;
open my $infile, '<', 'PAM250.txt' or die $!;
my $line;
my @array;
while($line = <$infile>)
{
$line =~ /^$/ and die "Blank line detected at $.\n";
$line =~ /^#/ and next; #skips the commented lines at the beginning
@array = $line;
print "@array"; #Prints the array after each line is read
};
print "\n\n@array"; #only prints the last line of the array ?
I understand that @array only holds the last line that was passed to it. Is there a way where I can get @array to hold all of the lines?
You are looking for push.
push @array, $line;
You undoubtedly want to precede this with chomp to snip any newlines, first.
If the file is small compared to the available memory of your machine, you can simply use the method below to read the contents of the file into an array:
open my $infile, '<', 'PAM250.txt' or die $!;
my @array = <$infile>;
close $infile;
If you are going to read a very large file, it is better to read it line by line as you are doing, but use push to add each line to the end of the array.
push(@array,$line);
I suggest you also read about the other array-manipulating functions in Perl.
It's unclear what you want to achieve.
Is every line an element of your array?
Is every line an array in your array and your "words" are the elements of this array?
Anyhow.
Here is how you can achieve both:
use strict;
use warnings;
use Data::Dumper;
# Read all lines into your array, after removing the \n
my @array= map { chomp; $_ } <>;
# show it
print Dumper \@array;
# Make each line an array so that you have an array of arrays
$_= [ split ] foreach @array;
# show it
print Dumper \@array;
Try this...
sub room
{
my $result = "";
open(FILE, '<', $_[0]) or die "Cannot open $_[0]: $!";
while (<FILE>) { $result .= $_; }
close(FILE);
return $result;
}
This gives you the basic functionality without much ceremony. The suggestion before this one risks failing on large files. Call it as you like...
my @array = &room('/etc/passwd');
print room('/etc/passwd');
You can shorten or rename it as you find convenient.
To the jokers nearby: this way the push is replaced by something simpler. A text file contains line breaks. The usual chomp-and-push loop removes the line break and pushes just the line, whereas here the result is a single string that still contains the line breaks.

perl split delimiter from file line by line

I have a text file named 'dataexample' with multiple lines like this:
a|30|40
b|50|70
then I split the delimiter with this code:
open(FILE, 'dataexample') or die "File not exist";
while(<FILE>){
my @record = split(/\|/, $_);
print "$record[0]";
}
close FILE;
when I print "$record[0]" , this is what I got:
ab
what I expect :
a 30 40
so when I do print "$record[0][0]" I expect the output to be: a
Where I got it wrong?
Your loop while ( <FILE> ) { ... } reads a single line at a time from the file handle and puts it into $_
my @record = split(/\|/, $_) splits that line on pipe characters |, so since the first line is "a|30|40\n", @record will now be 'a', '30', "40\n". The newline read from the file remains, and you should use chomp to remove it if you don't want it there
So now $record[0] is a, which you print, and then go on to read the next line in the file, setting @record to 'b', '50', "70\n" this time. Now $record[0] is b, which you also print, showing ab on the console
You've now reached the end of the file, so the while loop terminates
It sounds like you're expecting a two-dimensional array. You can do that by pushing each array onto a main array each time you read a record, like this
use strict;
use warnings 'all';
open my $fh, '<', 'dataexample' or die qq{Unable to open "dataexample" for input: $!};
my @data;
while ( <$fh> ) {
chomp;
my @record = split /\|/;
push @data, \@record;
}
print "@{$data[0]}\n";
print "$data[0][0]\n";
output
a 30 40
a
Or, more concisely, like this, which produces exactly the same result but may be a little advanced for you
use strict;
use warnings 'all';
open my $fh, '<', 'dataexample' or die qq{Unable to open "dataexample" for input: $!};
my @data = map { chomp; [ split /\|/ ] } <$fh>;
print "@{$data[0]}\n";
print "$data[0][0]\n";
Some points to know about your own code
You must always use strict and use warnings 'all' at the top of every Perl program you write. It's a measure that will uncover many simple mistakes that you may not otherwise notice
You should use lexical filehandles together with the three-parameter form of open. And an open may fail for many reasons other than the file not existing, so you should include the built-in $! variable in your die string to say why it failed
Don't forget to chomp each record read from a file unless you want to keep the trailing newline or it doesn't matter to you
You will be able to write more concise code if you get used to using the default variable $_. For instance, the second parameter to split is $_ by default, so split(/\|/, $_) may be written as just split /\|/
You can use Data::Dumper to display the contents of your variables, which will help you to debug your code. Data::Dump is superior, but it isn't a core module so you will probably have to install it before you can use it in your code
You have to use
print "$record[1]";
print "$record[2]";
As they are stored in consecutive index values.
or
If you want to print the entire thing you can just do
print "@record\n";
You are printing the value at the first index in the array each time through the loop, and without the new line. So you get the first value from each line, right next to each other on the same line, thus ab.
Print the whole array, inside double quotes, with the newline. Here is your program, changed a bit:
use strict;
use warnings;
my $file = 'dataexample';
open my $fh, '<', $file or die "Error opening $file: $!";
while (<$fh>) {
chomp;
my @record = split(/\|/, $_);
print "@record\n";
}
close $fh;
With the quotes the elements are printed with spaces added between them so you get
a 30 40
b 50 70
If you print without quotes the elements get printed without extra spaces, so
this
print @record, "\n";
over the whole loop prints
a3040
b5070
If you don't have the new line "\n" either, it is all printed on one line so this
print @record;
altogether prints
a3040b5070
As for $record[0][0], this is not valid for the array you have. This would print from a two-dimensional array. Take, for example
my @data = ( [1.1, 2.2], [10, 20] );
This array @data has at its first index a reference to an array -- more precisely, an anonymous array [1.1, 2.2]. Its second element is an anonymous array [10, 20]. So $data[0][0] is: the first element of @data (so the first of the two anonymous arrays inside), and then the first element of that array, thus 1.1. Likewise $data[1][1] is 20.
Thanks to Sobrique for the comment.
But you don't have this in your program. When you split data into an array
while(<FILE>){
my @record = split(/\|/, $_);
# ...
}
it creates a new array named @record every time through the loop. So @record is a normal array, not two-dimensional. For that the syntax $record[0][0] doesn't mean much.
I think you're trying to create a 2d array, whereby each element contains all the pipe delimited items from each line of your input:
my @record;
while(<DATA>){
chomp;
my @split = split(/\|/);
push @record, [@split];
}
print "@{$record[0]}\n";
a 30 40
$record[0] has the contents of column 1 - 'a' on the first iteration of the loop, 'b' on the second. $record[1] has column 2 and so on. You put the print statement, print "$record[0]", in the loop, so you get 'a' printed in the first iteration and 'b' in the second.
To get what you wanted you need to replace your print statement with:
print join " ", @record, "\n";

Creating multidimensional array while reading a file - Perl

I'm totally new to Perl, and I've been assigned some task... I have to read a tab separated file, and then do some operations with the data in a DB. The .tsv file is like this:
ID Name Date
155 Pedro 1988-05-05
522 Mengano 2002-08-02
So far I thought that creating a multidimensional array with the data of the file will be a good solution to handle this data later. So I read the file line by line, skip the item title columns and save the values in an array. However, I'm having difficulties creating this multidimensional array... this is what I've done so far:
#Read file from path
my @array;
my $fh = path($filename)->openr_utf8;
while (my $line = <$fh>) {
chomp $line;
# skip comments and blank lines and title line
next if $line =~ /^\#/ || $line =~ /^\s*$/ || $line =~ /^\+/ || $line =~ /ID/;
#split each line into array
my @aux_line = split(/\s+/, $line);
push @array, @{ $aux_line };
}
Obviously, the last line is not working... how can I create an array of arrays this way? I'm a little bit lost with references... And can somebody think of a better way to store the data we read from the file? Thank you!
You can also do this with map:
use Data::Dumper;
my @stuff = map {[split]} <$fh>;
print Dumper \@stuff;
(with maybe a grep to skip comments)
But it may suit your use case better to use an array of hashes :
my @stuff ;
chomp(my @header = split ' ', <$fh>);
while ( <$fh>) {
my %this_row;
@this_row{@header} = split;
push ( @stuff, \%this_row) ;
}
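To illustrate what the array of hashes buys you, here is a small sketch. It assumes whitespace-separated fields as in the sample data; the literal header and row strings are stand-ins for lines read from the file.

```perl
use strict;
use warnings;

# Hypothetical header and rows, standing in for lines read from the .tsv file.
my @header = split ' ', "ID Name Date\n";

my @stuff;
for my $line ("155 Pedro 1988-05-05\n", "522 Mengano 2002-08-02\n") {
    my %this_row;
    # Hash slice: each header name becomes a key for the matching field.
    @this_row{@header} = split ' ', $line;
    push @stuff, \%this_row;
}

# Fields can now be read by column name instead of by position.
print "$stuff[0]{Name}: $stuff[0]{Date}\n";
print "$stuff[1]{Name}: $stuff[1]{ID}\n";
```

Splitting on ' ' also discards the trailing newline, so no separate chomp is needed in this sketch.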
First, use strict and use warnings. That would instantly alert you that your attempt to get an array reference, @{ $aux_line }, actually tries to access a completely different variable, the scalar $aux_line (Perl allows variables of different types to have the same name).
After that just change your last line to:
push @array, \@aux_line;
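A short sketch of that fix in context, assuming tab-separated fields (the question's code splits on /\s+/ instead); the two literal rows are stand-ins for lines read from the file:

```perl
use strict;
use warnings;

# Hypothetical rows standing in for lines read from the .tsv file.
my @lines = ("155\tPedro\t1988-05-05", "522\tMengano\t2002-08-02");

my @array;
for my $line (@lines) {
    # "my" creates a fresh array on every iteration, so each stored
    # reference points at its own row rather than at a shared array.
    my @aux_line = split /\t/, $line;
    push @array, \@aux_line;
}

print scalar(@array), " rows\n";   # 2 rows
print "$array[1][1]\n";            # Mengano
```

Had @aux_line been declared outside the loop, every element of @array would refer to the same array and show only the last row.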

Proper way to string interpolate an array element within printf

While working on Schwartz's Learning Perl, I came across an exercise where I am supposed to accept a number of user input strings, where the first input is to be the width determining right justified output for the other strings.
In other words, inputs:
10
apple
boy
Output should be:
        10
     apple
       boy
where output is right justified by 10.
I tried using arrays to approach the problem:
#!/usr/bin/perl
use strict;
use warnings;
my @array;
while (<>) {
chomp($_);
push @array, $_ ;
}
while (@array) {
printf ("%$array[0]s \n", shift @array);
}
But after formatting and printing '10' correctly, I get errors:
$ perl test.pl
10
apple
boy
10
Invalid conversion in printf: "%a" at test.pl line 11, <> line 3.
%apples
Argument "boy" isn't numeric in printf at test.pl line 11, <> line 3.
0oys
I tried a variety of methods to force interpolation of the array element by enclosing it in braces, but all these have resulted in errors. What's the proper way to string interpolate array elements within printf (if that's the right term)?
Here's a more Perlish way to write it that avoids having to perform an explicit shift. It's a lot more do-what-I-mean since the format control variable is not part of @array from the start:
use strict;
use warnings;
my ( $length, @array ) = <>;
chomp( $length, @array );
printf "%${length}s\n", $_ for ( $length, @array );
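For what it's worth, the errors in the question arise because the width is shifted off the array after the first pass, so on the next iteration $array[0] is 'apple' and the format string becomes "%apples". One possible alternative that avoids interpolating into the format at all is the * width flag, which takes the field width from the argument list. In this sketch the hard-coded @input stands in for the lines read from <>:

```perl
use strict;
use warnings;

# Hypothetical input standing in for the chomped lines read from <>.
my @input = (10, 'apple', 'boy');

my ($width, @words) = @input;

# "%*s" consumes two arguments: the field width, then the string.
my @out = map { sprintf "%*s", $width, $_ } $width, @words;

print "$_\n" for @out;
```

Each element of @out is padded on the left to $width characters.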

perl - cutting many strings with given array of numbers

dear my fellow perl masters in the world~!
I need your help.
I have a string file A and a number file B like this:
File A:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
...and so on till 200.
File B:
3, 6, 2, 5, 6, 1, ... 2
(total 200 numbers in an array)
then, with the numbers in file B, I would like to cut each string from the start position to the number of characters in File B.
E.g. as File B starts with 3, 6, 2 ...
File A will be
AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
like this.
So. this is my code so far...
use strict;
if (@ARGV != 2) {
print "Invalid usage\n";
print "Usage: perl program.pl [num_list] [string_file]\n";
exit(0);
}
my $numbers=$ARGV[0];
my $strings=$ARGV[1];
my $i;
open(LIST,$number);
open(DATA,$strings);
my @list = <LIST>;
my $list_size = scalar @sp_list;
for ($i=0;$i<=$list_size;$i++) {
print $i,"\n";
#while (my $line = <DATA>) {
}
close(LIST);
close(DATA);
As there are 200 strings and 200 numbers, I turned the array into a scalar value (its size) so that I can work through every number and every string.
I'm working on this, and I know I'm supposed to use the pos function, but I do not know how to match each number with each string. Should I read the strings first with while? Or use for, so I know how many times I have to run this to achieve the result?
Your help will be much appreciated!
Thank you.
I will be working on it, too. Need your feedback.
It is good that you use strict, and you should also use warnings. Further things to note:
You should check the return value of each open call to make sure it did not fail. You should also use the three-argument form of open and a lexical file handle, especially when handling command-line arguments, which can pose a security risk.
open my $listfh, "<", $file or die $!;
You may wish to use a safety precaution
use ARGV::readonly;
You can easily make the list of numbers with a map statement. Assuming the numbers are in a comma separated list:
my @list = map split(/\s*,\s*/), <$listfh>;
This will split the input line(s) on comma and strip excess whitespace.
When reading your input file, you do not need to use a counter variable. You can simply do
open my $inputfh, "<", $file or die $!;
while (<$inputfh>) {
my $length = shift @list; # these are your numbers
chomp; # remove newline
my $string = substr($_, 0, -$length); # negative length on substr
print "$string\n";
}
The negative length on substr makes it leave that many characters off the end of the string.
Here is a one-liner in action that demonstrates these principles:
perl -lwe '$f = pop; # save file name for later
@nums = map split(/\s*,\s*/), <>; # process first file
push @ARGV, $f; # put back file name
while (<>) {
my $len = shift @nums;
chomp;
print substr($_,0,-$len);
}' fileb.txt filea.txt
Output:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
DDDDDDDDDDDDDDDDDDDDDDDDDDD
EEEEEEEEEEEEEEEEEEEEEEEEEE
Note the use of implicit open of file name arguments by manipulating @ARGV. Also note that the -l switch handles the newlines.
Here is my suggestion. It uses autodie so that there is no need to explicitly check the status of open calls, and temporarily undefines $/ - the input record separator - so that all of the num_list file is read in one go. You aren't clear whether this file will always contain just a single line, in which case you can omit local $/.
The numbers are extracted from the text using the regular expression /\d+/g, which returns all the strings of digits in the input as a list.
The second parameter to substr is the start position of the substring you want, and using a negative number counts from the end of the string instead of the beginning. The third parameter is the number of characters in the substring, and the fourth is a string to replace that substring in the target variable. So substr $data, -$n, $n, '' replaces the substring of length $n starting $n characters from the end with an empty string - i.e. it deletes it.
Note that if it is your intention to remove the given number of characters from the beginning of the string, then you would write substr $data, 0, $n, '' instead.
use strict;
use warnings;
use autodie;
unless (@ARGV == 2) {
print "Usage: perl program.pl [num_list] [string_file]\n";
exit;
}
my @numbers;
{
open my $listfh, '<', $ARGV[0];
local $/;
my $numbers = <$listfh>;
@numbers = $numbers =~ /\d+/g;
};
open my $datafh, '<', $ARGV[1];
for my $i (0 .. $#numbers) {
print "$i\n";
my $n = $numbers[$i];
my $data = <$datafh>;
chomp $data;
substr $data, -$n, $n, '';
print "$data\n";
}
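The four-argument substr described above can be tried in isolation. This is just a sketch with a made-up ten-character string and $n = 3:

```perl
use strict;
use warnings;

my $n = 3;

# Delete the last $n characters: start $n from the end, take $n
# characters, and replace them with the empty string.
my $from_end = 'ABCDEFGHIJ';
substr $from_end, -$n, $n, '';

# Delete the first $n characters instead.
my $from_start = 'ABCDEFGHIJ';
substr $from_start, 0, $n, '';

print "$from_end\n";     # ABCDEFG
print "$from_start\n";   # DEFGHIJ
```

Note that both calls modify the variable in place; the return value of substr (the removed text) is simply discarded here.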
Here is how I would do it. substr is the function to remove a part of a string. From your example, it is not clear whether you want to remove the characters at the beginning or at the end. Both alternatives are shown here:
#!/usr/bin/perl
use warnings;
use strict;
if (@ARGV != 2) {
die "Invalid usage\n"
. "Usage: perl program.pl [num_list] [string_file]\n";
}
my ($number_f, $string_f) = @ARGV;
open my $LIST, '<', $number_f or die "Cannot open $number_f: $!";
my @numbers = split /, */, <$LIST>;
close $LIST;
open my $DATA, '<', $string_f or die "Cannot open $string_f: $!";
while (my $string = <$DATA>) {
substr $string, 0, shift @numbers, q(); # Replace the first n characters with an empty string.
# To remove the trailing portion, replace the previous line with the following:
# my $n = shift @numbers;
# substr $string, -$n-1, $n, q();
print $string;
}
You were not checking the return value of open. Try to remember to always do that.
Do not declare variables far before you are going to use them ($i here).
Do not use C-style for loops if you do not have to. They are prone to fence post errors.
You can use substr():
use strict;
use warnings;
if (@ARGV != 2) {
print "Invalid usage\n";
print "Usage: perl program.pl [num_list] [string_file]\n";
exit(0);
}
my $numbers=$ARGV[0];
my $strings=$ARGV[1];
open my $list, '<', $numbers or die "Can't open $numbers: $!";
open my $data, '<', $strings or die "Can't open $strings: $!";
chomp(my $numlist = <$list>);
my @numbers = split /\s*,\s*/,$numlist;
for my $chop_length (@numbers)
{
# chomp first so the trailing newline is not counted in the length
chomp(my $line = <$data> // die "not enough data in $strings");
print substr($line,0,length($line)-$chop_length)."\n";
}
Your specs say you want "... to cut each string from the start position to the number of characters in File B." I agree with choroba that it's not perfectly clear whether characters from the start or the end of the string are to be cut. However, I tend to think that you want to remove characters from the beginning when you say, "... from the start position ...", but a string like ABCDEFGHIJKLMNOPQRSTUVWXYZ012345 would help clarify this issue.
This option is not as well self-documenting as the other solutions, but a discussion of it will follow:
use strict;
use warnings;
@ARGV == 2 or die "Usage: perl program.pl [num_list] [string_file]\n";
open my $fh, '<', pop or die "Cannot open string file: $!";
chomp( my @str = <$fh> );
local $/ = ', ';
while (<>) {
chomp;
print +( substr $str[ $. - 1 ], $_ ) . "\n";
}
Strings:
ABCDEFGHIJKLMNOPQRSTUVWXYZ012345
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
Numbers:
3, 6, 2, 5, 6
Output:
DEFGHIJKLMNOPQRSTUVWXYZ012345
BBBBBBBBBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
DDDDDDDDDDDDDDDDDDDDDDDDDDD
EEEEEEEEEEEEEEEEEEEEEEEEEE
The strings' file name is popped off @ARGV (since an explicit argument for pop is not used) and passed to open to read the strings into @str. The record separator is set to ', ' so chomp leaves only the number. The current line number in $. is used as part of the index to the corresponding @str element, and the remaining characters in the string from n on are printed.
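The record-separator trick can be seen on its own in the sketch below. It reads from an in-memory filehandle, which is an assumption made to keep the example self-contained; the answer reads the real numbers file through <>.

```perl
use strict;
use warnings;

# Hypothetical in-memory "file" standing in for the numbers file.
my $numbers = "3, 6, 2, 5, 6\n";
open my $fh, '<', \$numbers or die "Cannot open in-memory file: $!";

my @seen;
{
    local $/ = ', ';            # records now end at ", " instead of "\n"
    while (my $rec = <$fh>) {
        chomp $rec;             # chomp removes the current $/, i.e. ", "
        push @seen, "$.:$rec";  # $. counts records, not text lines
    }
}

print join('|', @seen), "\n";
```

The last record does not end in ', ', so chomp leaves its newline in place.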
