Why is Perl omitting the @ strings - arrays

#!/usr/bin/perl
$test ="@test=hello"; #(Actually I am reading this text from a file and assigning it to a variable)
print "$test";
After executing the above code, Perl outputs the following.
Output:
=hello instead of @test=hello.
Can anyone please explain this? I believe Perl is treating it as an empty array, but how can I avoid this when reading from a file? Please clear up my misconception.
Thanks,

Perl interpolates variables in strings delimited by double quotes.
@test is treated as an array.
Unless you have created this array explicitly, you should get an error when you try to do that. If you don't, then you must have forgotten to use strict; and use warnings;!
Use single-quoted strings if you don't want to interpolate variables.
#!/usr/bin/env perl
use strict;
use warnings;
use v5.10;
my $test = '@test=hello';
say $test;
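Since the question mentions that the text actually comes from a file, it may help to see that interpolation only applies to string literals in the source code, not to data read at run time. This is a minimal sketch; the in-memory filehandle and the name $file_contents are assumptions standing in for the real input file.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for the real input file (an assumption
# for this sketch); single quotes keep the literal @ in the sample data.
my $file_contents = '@test=hello' . "\n";
open my $fh, '<', \$file_contents or die "Can't open in-memory file: $!";

my $test = <$fh>;   # data read at run time is never interpolated
chomp $test;
close $fh;

print "$test\n";    # interpolates $test only; its contents are left alone
```

So if the value really is read from a file, the @ should survive; the problem only appears when the text is typed into a double-quoted literal.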

Related

How to do a CSV record split [duplicate]

This question already has answers here:
How do I efficiently parse a CSV file in Perl?
(6 answers)
Closed 7 years ago.
I need to get a specific field from a CSV file and put it in an array. I am not sure how to do this. This is what I have tried so far.
#!/usr/bin/perl
use strict;
use warnings;
my @array = <>;
my @fields = split ",", @array;
print @fields[2];
This is an example of the CSV file
9988,Kathleen,Brown,kbrownc@goo.gl,OH,Female,Italian
9989,Antonio,Ford,afordb@bigcartel.com,IL,Male,
9990,Diana,Banks,dbanksa@jalbum.net,MA,Female,English
If there is any chance that your CSV file contains quoted fields (so that each field may itself contain a comma) then you should use Text::CSV to handle the data properly. However, for simple data like that in your question, it is fine to use just split.
Your code would look something like this. Note that it is usually unnecessary to read an entire file into memory, and line-by-line processing is more memory-efficient. It also tends to focus the programmer's attention on a single line and hence improve the resulting design.
use strict;
use warnings;
my @names;
while ( <> ) {
    chomp;
    my @fields = split /,/;
    push @names, $fields[2];
}
print "$_\n" for @names;
output
Brown
Ford
Banks
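To illustrate the caveat about quoted fields, here is a small sketch of what a plain split does when a field contains a comma. The record is hypothetical and not taken from the file in the question.

```perl
use strict;
use warnings;

# Hypothetical record: the third field is quoted and contains a comma.
my $line = '9991,John,"Brown, Jr.",jbrown@example.com,OH';

my @fields = split /,/, $line;

print scalar(@fields), " fields\n";   # one more than the record really has
print "$fields[2]\n";                 # only the first half of the quoted name
```

The record has five fields, but split returns six and cuts the quoted name in two, which is the situation where Text::CSV is the safer choice.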
Update
If you are comfortable with map then you may prefer this. It is much more concise, but suffers from the same inefficiency as your own code in that it reads the whole file into memory at once (although it discards it again immediately). Unless the file is enormous that shouldn't be a problem.
use strict;
use warnings;
my @names = map { chomp; ( split /,/ )[2]; } <>;
print "$_\n" for @names;
There is a Perl module, Text::CSV, that handles CSV and other delimiter-separated files. You can install it by running:
$ sudo cpan Text::CSV
It parses comma-delimited data by default, and you can specify any other separator character.
After installing the module, this is a quick script to achieve your task. I created a text file with your data called test.csv.
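#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
my $csv = Text::CSV->new;
# Lexical handle and three-argument open; DATA is a special handle in Perl
open my $fh, '<', 'test.csv' or die "Can't open test.csv: $!";
while (my $line = <$fh>) {
    chomp $line;
    $csv->parse($line) or next;   # skip lines that fail to parse
    my @fields = $csv->fields();
    print "$fields[2]\n";
}
close $fh;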
You can see other features of the Text::CSV module by reviewing the documentation:
$ perldoc Text::CSV

using perl array as input to bash bedtools command

I'm wondering if it is possible to use a perl array as the input to a program called bedtools ( http://bedtools.readthedocs.org/en/latest/ )
The array is itself generated by bedtools via the backticks method in perl. When I try to use the perl array in another bedtools bash command it complains that the argument list is too long because it seems to treat each word or number in the array as a separate argument.
Example code:
my @constit_super = `bedtools intersect -wa -a $enhancers -b $super_enhancer`;
that works fine and can be viewed by:
print @constit_super
which looks like this onscreen:
chr10 73629894 73634938
chr10 73636240 73639574
chr10 73639726 73657218
but then if I try to use this array in bedtools again, e.g.
my $bedtools = `bedtools merge -i @constit_super`;
then I get this error message:
Can't exec "/bin/sh": Argument list too long
Is there any way to use this Perl array in bedtools?
Many thanks
27/9/14 Thanks for the info on doing it via a file. However, sorry to be a pain, but I would really like to do this without writing a file if possible.
I haven't tested this but I think it would work.
bedtools is expecting one argument with the -i flag, the name of a .bed file. This was in the docs. You need to write your array to a file and then input it into the bedtools merge command.
open(my $fh, '>', "input.bed") or die $!;
print $fh join("", @constit_super);
close $fh;
Then you can sort it with this command from the docs:
`sort -k1,1 -k2,2n input.bed > input.sorted.bed`;
Finally, you can run your merge command.
my $bedtools = `bedtools merge -i input.sorted.bed`;
Hopefully this sets you on the right track.
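If the objection to a file is mainly about managing a fixed name like input.bed, one option is to let File::Temp pick a unique name and clean it up afterwards. This is only a sketch: the array contents are stand-ins for real bedtools output, and the bedtools call is left as a comment because it was not run here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for the lines returned by the first bedtools call.
my @constit_super = (
    "chr10\t73629894\t73634938\n",
    "chr10\t73636240\t73639574\n",
);

# tempfile() returns an open handle plus a unique file name;
# UNLINK => 1 removes the file when the program exits.
my ( $tmp_fh, $tmp_name ) = tempfile( SUFFIX => '.bed', UNLINK => 1 );
print {$tmp_fh} @constit_super;
close $tmp_fh or die "Can't close $tmp_name: $!";

# $tmp_name can now be passed as the single -i argument, for example:
# my $merged = `bedtools merge -i $tmp_name`;
print "wrote ", scalar(@constit_super), " lines to $tmp_name\n";
```

This still writes a file, but the script never has to name it or delete it by hand.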

Output from bash command not storing in array

My code below is very simple. All I'm doing is grepping a file using an IP address regex, and I want to store any IP addresses that match in @array2.
I know my regex works, because I've tested it on a Linux command line and it works exactly how I want, but when it's integrated into the script, it just returns blank. There should be 700 IPs stored in the array.
#!/usr/bin/perl
use warnings;
use strict;
my @array2 = `grep -Eo "\"\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\"" test1.txt`;
print @array2;
Backticks `` behave like a double quoted string by default.
Therefore you need to escape your backslashes:
my @array2 = `grep -Eo "\\"\\b[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\"" test1.txt`;
Alternatively, you can use a single quoted version of qx to avoid any interpolation:
my @array2 = qx'grep -Eo "\"\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\"" test1.txt';
However, the method I'd recommend is to not shell out at all, but instead do this logic in perl:
my @array2 = do {
    open my $fh, '<', 'test1.txt' or die "Can't open file: $!";
    grep /\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\b/, <$fh>;
};
I really wouldn't mix bash and perl. It's just asking for pain. Perl can do it all natively.
Something like:
open (my $input_fh, "<", "test.txt" ) or die $!;
my @results = grep ( /\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/, <$input_fh> );
This does, however, require slurping the file into memory, which isn't optimal. I'd generally use a while loop instead.
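A while-loop version might look like the sketch below. Note that Perl's grep returns whole matching lines, whereas grep -o prints only the matched text, so this version collects the matches themselves with a /g match in list context. The helper name extract_ips and the sample data are made up for illustration, and an in-memory filehandle stands in for test1.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: return every dotted-quad-looking token from a filehandle.
sub extract_ips {
    my ($fh) = @_;
    my @ips;
    while ( my $line = <$fh> ) {
        # /g in list context returns all matches on the line, like grep -o
        push @ips, $line =~ /\b[0-9]{1,3}(?:\.[0-9]{1,3}){3}\b/g;
    }
    return @ips;
}

# In-memory filehandle stands in for test1.txt
my $sample = "host 10.0.0.1 up\nno address here\n192.168.1.20 and 8.8.8.8\n";
open my $fh, '<', \$sample or die "Can't open in-memory file: $!";
my @ips = extract_ips($fh);
close $fh;
print "$_\n" for @ips;
```

Only one line is held in memory at a time, and a line containing several addresses contributes all of them.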
The text inside the backticks undergoes double-quotish substitution. You will need to double your backslashes.
Running grep from inside Perl is dubious, anyway; just slurp in the text file and use Perl to find the matches.
The easiest way to retrieve the output from an external command is to use open():
open(FH, 'grep -Eo \"\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\" test1.txt'."|") or die "Can't run grep: $!";
my @array2=<FH>;
close (FH);
..though I think Sobrique's idea is the best answer here.

Perl, Pattern Matching each element ($line) in an array

I have a simple enough problem, I think. I recently ran a script which extracted specific information from the string in each element of an array. I have written this before and it worked well. However, when trying a very simple version of it right now, it will not present any data, only the same "uninitialized value" warning! I am getting really frustrated, as my previous code works. I am clearly doing something STUPID and would love some help!
#!/usr/bin/env perl
use strict;
use warnings;
my @histone;
my $line;
my $idea;
my $file="demo_site.txt";
open(IN, "<$file")||die"\ncannot be opened\n";
@histone=<IN>;
print @histone;
foreach $line(@histone)
{
$line=~ m/([a-zA-Z0-9]+)\t[0-9]+\t[0-9]+\t/;
print$1."\n";
print$2."\n";
print$3."\n";
}
The infile "demo_site.txt" takes the format of a tab delimited .txt file:
chr9 1234 5678 . 200 . 14.0 -1
This file has multiple lines as above and I wish to extract the first three items of data so the output looks as follows.
chr9
1234
5678
Cheers!
You don't really need a regular expression since it's tab delimited.
foreach $line(@histone)
{
    my @line_data = split(/\t/, $line);
    print $line_data[0]."\n";
    print $line_data[1]."\n";
    print $line_data[2]."\n";
}
Edit:
If you want to assign the values to specific named variables, assign the result of split to a list of variables.
($varA, $varB, $varC .... ) = split(/\t/,$line)
The actual problem here is that you're trying to print the values of $1, $2 and $3, but you only have one set of capturing parentheses in your regex, so only $1 gets a value. $2 and $3 will remain undefined and hence give you that warning when you try to print them.
The solution is to add two more sets of capturing parentheses. I expect you want something like this:
$line=~ m/([a-zA-Z0-9]+)\t([0-9]+)\t([0-9]+)\t/;
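As a small runnable sketch of the three-capture version, here it is applied to the sample line from the question, with a check that the match succeeded before $1, $2 and $3 are used. The leading ^ anchor and the @parts array are additions for illustration.

```perl
use strict;
use warnings;

my $line = "chr9\t1234\t5678\t.\t200\t.\t14.0\t-1\n";

# Only use $1..$3 when the match succeeded; otherwise they may be
# undefined or left over from an earlier successful match.
my @parts;
if ( $line =~ m/^([a-zA-Z0-9]+)\t([0-9]+)\t([0-9]+)\t/ ) {
    @parts = ( $1, $2, $3 );
    print "$_\n" for @parts;
}
else {
    warn "Line did not match: $line";
}
```

Testing the match result also makes it obvious when a line in the file does not have the expected layout.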
Let's assume that file.txt has what you want (file.txt is the same as demo_site.txt):
chr9 1234 5678 . 200 . 14.0 -1
You can use something simple:
perl -lane '$" = "\n"; print "@F[0..2]"' file.txt 1>output.txt
One-liners in Perl are powerful, and you don't need to write scripts for simple tasks ;)
Just open a terminal sometimes ;)
P.S.:
This is not a very good one-liner, I know, but it does what it must.
$line=~ m/([a-zA-Z0-9]+)\t[0-9]+\t[0-9]+\t/)
First of all, the parens are not balanced.
Second, I haven't checked this, but don't you need a set of parens for each capture?
Third, as misplacedme said split() is definitely the way to go. ;)
If I may self-promote, you can use Tie::Array::CSV to give direct read-write access to the file as a Perl array of arrayrefs.
use strict;
use warnings;
use Tie::Array::CSV;
tie my @file, 'Tie::Array::CSV', 'demo_site.txt', sep_char => "\t";
print $file[0][0]; # first line before first tab
$file[2][1] = 10; # set the third line between the first and second tabs

Cannot print out the array when reading it from an existing file

I have a data file, with each line holding one number. I am trying to read this file into an array. Here is the script I wrote:
#!/usr/bin/perl -w
$file1 = '/home/usr1/test.list';
open(FILEC, $file1);
my @cArray = FILEC;
close FILEC;
print @cArray;
But after executing this script, nothing was printed out. I have checked the input, test.list, which is at the correct location. What may be the reason?
You're missing the <> (readline) operator:
my @cArray = <FILEC>;
ought to help.
FatalError is correct, you need a readline operator. You can read more about <> in perldoc perlop and more about the readline function in perldoc -f readline.
Once you have that knowledge, you can see why the following could also work (though perhaps not recommended for readability). Also I will use Data::Dumper to print a better representation of @cArray.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
@ARGV = qw( /home/usr1/test.list );
# or remove previous line and call script as
# script.pl /home/usr1/test.list
my @cArray = <>;
print Dumper \@cArray;
Some further notes: a more modern version of your script would:
use the three argument form of open
check that open succeeds
use a lexical rather than bareword handle
use strict as well as use warnings (rather than -w)
giving
#!/usr/bin/env perl
use strict;
use warnings;
my $file1 = '/home/usr1/test.list';
open(my $handle, '<', $file1)
or die "Could not open $file1: $!";
my @cArray = <$handle>;
print @cArray;
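Since each line of the file holds one number, you will probably want the trailing newlines removed before using the values. A minimal sketch, with in-memory data standing in for /home/usr1/test.list and a made-up $total just to show the values being used as numbers:

```perl
use strict;
use warnings;

# In-memory data stands in for /home/usr1/test.list (one number per line).
my $data = "3\n14\n15\n";
open my $handle, '<', \$data or die "Could not open in-memory data: $!";

chomp( my @cArray = <$handle> );   # read every line, then strip the newlines
close $handle;

my $total = 0;
$total += $_ for @cArray;
print scalar(@cArray), " numbers, total $total\n";
```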

Resources