comparing two filename arrays for differences

Below is my attempt at loading all the filenames in a text file into an array and comparing that array to the filenames in a separate directory. I would like to identify the filenames that are in the directory but not in the file, so I can then process those files. I am able to load the contents of both sources successfully, but the compare operation is outputting all the files, not just the difference.
Thank you in advance for the assistance.
use File::Copy;
use Net::SMTP;
use POSIX;
use constant DATETIME => strftime("%Y%m%d", localtime);
use Array::Utils qw(:all);
use strict;
use warnings;
my $currentdate = DATETIME;
my $count;
my $ErrorMsg = "";
my $MailMsg = "";
my $MstrTransferLogFile = ">>//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Mstr_Transfer_Log.txt";
my $DailyLogFile = ">//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Daily_Transfer_Log_" . DATETIME . ".txt";
my $InputDir = "//CFVFTP/Users/ssi/Transfer_Logs/folder1/";
my $MoveDir = "//CFVFTP/Users/ssi/Transfer_Logs/folder2/";
my $filetouse;
my @filetouse;
my $diff;
my $file1;
my $file2;
my %diff;
open (MSTRTRANSFERLOGFILE, $MstrTransferLogFile) or $ErrorMsg = $ErrorMsg . "ERROR: Could not open master transfer log file!\n";
open (DAILYLOGFILE, $DailyLogFile) or $ErrorMsg = $ErrorMsg . "ERROR: Could not open daily log file!\n";
#insert all files in master transfer log into array for cross reference
open (FH, "<//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Mstr_Transfer_Log.txt") or $ErrorMsg = $ErrorMsg . "ERROR: Could not open master log file!\n";
my @master = <FH>;
close FH;
print "filenames in text file:\n";
foreach $file1 (@master) { print "$file1\n"; }
print "\n";
#insert all 835 files in Input directory into array for cross reference
opendir (DIR, $InputDir) or $ErrorMsg = $ErrorMsg . "ERROR: Could not open input directory $InputDir!\n";
my @list = grep { $_ ne '.' && $_ ne '..' && /\.835$/ } readdir DIR;
close(DIR);
print "filenames in folder\n";
foreach $file2 (@list) { print "$file2\n"; }
print "\n";
#get all the files in the Input directory that are NOT in the master transfer log and place into @filetouse array
@diff{ @master } = ();
@filetouse = grep !exists($diff{$_}), @list;
print "difference:\n";
foreach my $file3 (@filetouse) { print "$file3\n"; }
print DAILYLOGFILE "$ErrorMsg\n";
print DAILYLOGFILE "$MailMsg\n";
close(MSTRTRANSFERLOGFILE);
close(DAILYLOGFILE);
this is what the output looks like:
filenames in text file:
160411h00448car0007.835
filenames in folder
160411h00448car0007.835
160411h00448car0008.835
160418h00001com0001.835
difference:
160411h00448car0007.835
160411h00448car0008.835
160418h00001com0001.835

This should help you to do what you need. It stores the names of all of the files in INPUT_DIR as keys in hash %files, and then deletes all the names found in LOG_FILE. The remainder are printed
This program uses autodie so that the success of IO operations needn't be checked explicitly. It was first available in Perl 5 core in v5.10.1
use strict;
use warnings 'all';
use v5.10.1;
use autodie;
use feature 'say';
use constant LOG_FILE => '//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Mstr_Transfer_Log.txt';
use constant INPUT_DIR => undef;
chdir INPUT_DIR;
my %files = do {
opendir my $dh, '.';
my @files = grep -f, readdir $dh;
map { $_ => 1 } @files;
};
my @logged_files = do {
open my $fh, '<', LOG_FILE;
<$fh>;
};
chomp @logged_files;
delete @files{@logged_files};
say for sort keys %files;
Update
After a lot of attrition I found this underneath your original code
use strict;
use warnings 'all';
use v5.10.1;
use autodie;
use feature 'say';
use Time::Piece 'localtime';
use constant DATETIME => localtime()->ymd('');
use constant XFR_LOG => '//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Mstr_Transfer_Log.txt';
use constant DAILY_LOG => '//CFVFTP/Users/ssi/Transfer_Logs/Artiva/ARTIVA_Daily_Transfer_Log_' . DATETIME . '.txt';
use constant INPUT_DIR => '//CFVFTP/Users/ssi/Transfer_Logs/folder1/';
use constant MOVE_DIR => '//CFVFTP/Users/ssi/Transfer_Logs/folder2/';
chdir INPUT_DIR;
my @master = do {
open my $fh, '<', XFR_LOG;
<$fh>;
};
chomp @master;
my @list = do {
opendir my $dh, '.';
grep -f, readdir $dh;
};
my %diff;
@diff{ @master } = ();
my @filetouse = grep { not exists $diff{$_} } @list;
As you can see, it's very similar to my solution. Here are some notes about your original
Always use lexical file handles. With open FH, ... the file handle is global and will never be closed unless you do it explicitly or until the program terminates. Instead, open my $fh, ... leaves perl to close the file handle at the end of the current block
Always use the three-parameter form of open, so that the open mode is separate from the file name, and never put an open mode as part of a file name. You opened the same file twice: once as $MstrTransferLogFile which begins with >> and once explicitly because you needed read access
It is very rare for a program to be able to recover from an IO operation error. Unless you are writing fail-safe software, a failure to open or read from a file or directory means the program won't be able to fulfill its purpose. That means there's little reason to accumulate a list of error messages -- the code should just die when it can't succeed
The output from readdir is very messy if you need to process directories because it includes the pseudo-directories . and ... But if you only want files then a simple grep -f, readdir $dh will throw those out for you
The block form of grep is often more readable, and not is much more visible than !. So grep !exists($diff{$_}), @list is clearer as grep { not exists $diff{$_} } @list
Unless your code is really weird, comments usually just add more noise and confusion and obscure the structure. Make your code look like what it does, so you don't have to explain it
Oh, and don't throw in all the things you might need at the start "just in case". Write your code as if it was all there and the compiler will tell you what's missing
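A tiny self-contained sketch of the first two notes, lexical handles and three-argument open, may help; the file name here is made up purely for the demo:

```perl
use strict;
use warnings;

my $log_file = 'demo_transfer_log.txt';    # hypothetical name, demo only

{
    # Three-argument open with a lexical handle: the mode is its own
    # argument, never part of the file name
    open my $out, '>', $log_file or die "Could not write '$log_file': $!";
    print $out "160411h00448car0007.835\n";
}    # $out goes out of scope here and the file is closed automatically

open my $in, '<', $log_file or die "Could not read '$log_file': $!";
chomp( my @lines = <$in> );
close $in;
unlink $log_file;

print "$lines[0]\n";    # prints 160411h00448car0007.835
```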
I hope that helps

First, use a hash to store your already-processed files. Then it's just a matter of checking if a file exists in the hash.
(I've changed some variable names to make the answer a bit clearer.)
foreach my $file (@dir_list) {
push @to_process, $file unless ($already_processed{$file});
}
(Which could be a one-liner, but get it working in its most expanded form first.)
If you insist on your array, this looks much less efficient
foreach my $file (@dir_list) {
push @to_process, $file unless (grep (/^$file$/, @already_processed));
}
(Again could be a one-liner, but...)
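For completeness, here is what those one-liners might look like, with made-up sample data standing in for the real lists:

```perl
use strict;
use warnings;

my %already_processed = map { $_ => 1 } qw(a.835 b.835);    # sample data
my @already_processed = keys %already_processed;
my @dir_list          = qw(a.835 b.835 c.835 d.835);

# hash version: one O(1) lookup per file
my @to_process = grep { !$already_processed{$_} } @dir_list;

# array version: rescans @already_processed for every file
my @to_process_slow =
    grep { my $f = $_; !grep { $_ eq $f } @already_processed } @dir_list;

print "@to_process\n";    # prints: c.835 d.835
```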

Regexp to Compare partial filenames then moving to another directory perl

I am working on a script to compare non-running files within a dir to running files from a command. I have to use Regex to strip the front half of the filenames from the dir then regex to strip the filenames out of a command which then records the unmatched names into an array.
The part I cannot figure out is how I can move the filenames from the old dir into a new directory for future deletion.
In order to move the files I will need to enclose them in wildcards (*) due to the random numbers in front of the filenames and the extension.
example filenames before and after:
within dir:
13209811124300209156562070_cake_872_trucks.rts
within command:
{"file 872","cake_872_trucks.rts",running}
in @events array:
cake_872_trucks
My code:
#!/usr/bin/perl -w
use strict;
use warnings;
use File::Copy qw(move);
use Data::Dumper;
use List::Util 'max';
my $orig_dir = "/var/user/data/";
my $dest_dir = "/var/user/data/DeleteMe/";
my $dir = "/var/user/data";
opendir(DIR, $dir) or die "Could not open $dir: $!\n";
my @allfiles = readdir DIR;
close DIR;
my %files;
foreach my $allfiles (@allfiles) {
$allfiles =~ m/^(13{2}638752056463{2}635181_|1[0-9]{22}_|1[0-9]{23}_|1[0-9]{24}_|1[0-9]{25}_)([0-9a-z]{4}_8[0-9a-z]{2}_[0-9a-z]{2}[a-z][0-9a-z]0[0-9]\.rts|[a-z][0-9a-z]{3}_[0-9a-z]{4}_8[0-9a-z]{2}_[0-9a-z]{2}[a-z]{2}0[0-9]\.rts|[a-z]{2}[0-9a-z][0-9]\N[0-9a-z]\N[0-9]\N[0-9]\N[0-9a-z]{4}\N[0-9]\.rts|[a-z]{2}[0-9a-z]{2}\N{2}[0-9a-z]{2}\N{2}[0-9][0-9a-z]{2}\N[0-9]{2}\.rts|S0{2}2_86F_JATD_01ZF\.rts)$/im;
$files{$2} = [$1];
}
my @stripfiles = keys %files;
my $cmd = "*****";
my @runEvents = `$cmd`;
chomp @runEvents;
foreach my $running (@runEvents) {
$running =~ s/^\{"blah 8[0-9a-z]{2}","(?<field2>CBE1_D{3}1_8EC_J6TG0{2}\.rts|[0-9a-z]{4}_8[0-9a-z]{2}_[0-9a-z]{2}[a-z][0-9a-z]0[0-9]\.rts|[a-z]{2}[0-9a-z]{2}\N{2}[0-9a-z]{2}\N{2}[0-9][0-9a-z]{2}\N[0-9]{2}\.rts)(?:",\{239,20,93,5\},310{2},20{3},run{2}ing\}|",\{239,20,93,5\},310{2},[0-9]{2}0{3},run{2}ing\}|",\{239,20,93,5\},310{2},[0-9]{3}0{4},run{2}ing\}|",\{239,20,93,5\},3[0-9]0{2},[0-9]{2}0{4},run{2}ing\})$/$+{field2}/img;
}
my @events = grep {my $x = $_; not grep {$x =~/\Q$_/i}@runEvents}@stripfiles;
foreach my $name (@events) {
my ($randnum, $fnames) = @{ $files{$name} };
my $combined = $randnum . $fnames;
print "Move $file from $orig_dir to $dest_dir";
move ("$orig_dir/$files{$name}", $dest_dir)
or warn "Can't move $file: $!";
}
#print scalar(grep $_, @stripfiles), "\n";
#returned 1626
#print scalar(grep $_, @runEvents), "\n";
#returned 102
#print scalar(grep $_, @allfiles), "\n";
#returned 1906
Once you are parsing filenames with regex there is no reason not to be able to capture all parts so that you can later reconstitute needed parts of the filename.
I assume that that overly long (and incomplete) regex does what it is meant to.
I am not sure how the files to move relate to the original files in @allfiles, since those are fetched from /var/user/data while your moving attempt uses /home/user/RunBackup. So the code snippets below are more generic.
If what gets moved are precisely the files from @allfiles then just keep the file name
my %files;
foreach my $oldfile (@allfiles) {
$oldfile =~ m/...(...).../; # your regex, but capture the name
$files{$1} = $oldfile;
}
where by /...(...).../ I mean to indicate that you use your regex, but to which you add parenthesis around the part of the pattern that matches the name itself.
Then you can later retrieve the filename from the "name" of interest (cake_872_trucks).
If, however, the filename components may be needed to patch a different (while related) filename then capture and store the individual components
my %files;
foreach my $oldfile (@allfiles) {
$oldfile =~ m/(...)(...)(...)/; # your regex, just with capture groups
$files{$2} = [$1, $3]; # add to %files: name => [number, ext]
}
The regex only matches (why change names in @allfiles with s///?), and captures.
The first set of parentheses captures that long leading factor (number) into $1, the second one gets the name (cake_872_trucks) into $2, and the third one has the extension, in $3.
So you end up with a hash with keys that are names of interest, with their values being arrayrefs with all other needed components of the filename. Please adjust as needed as I don't know what that regex does and may have missed some parts.
Now once you go through @events you can rebuild the name
use File::Copy qw(move);
foreach my $name (@events) {
my ($num, $ext) = @{ $files{$name} };
my $file = $num . $name . $ext;
say "Move $file from $orig_dir to $dest_dir";
move("$orig_dir/$file", $dest_dir) or warn "Can't move $file: $!";
}
But if the files to move are indeed from @allfiles (as would be the case in this example) then use the first version above to store filenames as values in %files and now retrieve them
foreach my $name (@events) {
move ("$orig_dir/$files{$name}", $dest_dir)
or warn "Can't move $files{$name}: $!";
}
I use the core module File::Copy, instead of going out to the system for the move command.
You can also rebuild the name by going through the directory again, now with the names of interest in hand. But that'd be very expensive since you have to try to match every name in @events against every file read from the directory (O(mn) complexity).
What you asked about can be accomplished with glob (and note File::Glob's version)
my #files = glob "$dir/*${name}*";
but you'd have to do this for every $name -- a huge and needless waste of resources.
If that regex really must spell out specific numbers, here is a way to organize it for easier digestion (and debugging!): break it into reasonable parts, with a separate variable for each.
Ideally each part of the alternation would be one variable
my $p1 = qr/.../;
my $p2 = qr/.../;
...
my $re_alt = join '|', $p1, $p2, ...;
my $re_other = qr/.../;
$var =~ m/^($re_alt)($re_other)(.*)$/; # adjust anchors, captures, etc
where the qr operator builds a regex pattern.
Adjust those capturing parentheses, anchors, etc. to your actual needs. Breaking it up so that the regex is sensibly split into variables will go a long way for readability, and thus correctness.
Assuming that there is a good reason to seek those specific numbers in filenames, this is also a good way to document any such fixed factors.
I guess you need something like this:
my $path = '/home/user/RunBackup/';
my @files = map {$path."*$_*"} @events;
system(join " ", "mv", #files, "/home/user/RunBackup/files/");
If there are lots of files you might need to move them one by one:
system(join " ", "mv", $_, "/home/user/RunBackup/files/") for #files;

How to load a CSV file into a perl hash and access each element

I have a CSV file with the following information separated by commas ...
Owner,Running,Passing,Failing,Model
D42,21,54,543,Yes
T43,54,76,75,No
Y65,76,43,765,Yes
I want to open this CSV file and place its contents inside of a Perl hash in my program. I am also interested in the code needed to print a specific element inside of the hash. For example, how I will print the "Passing" count for the "Owner" Y65.
The code I currently have:
$file = "path/to/file";
open my $f, '<', $file or die "can't open $file";
while (my $line = <$f>) {
#inside here I am trying to take the contents of this file and place it into a hash. I have tried numerous ways of trying this but none have seemed to work. I am leaving this blank because I do not want to bog down the visibility of my code for those who are kind enough to help and take a look. Thanks.
}
As well as placing the CSV file inside of a hash I also need to understand the syntax to print and navigate through specific elements. Thank you very much in advance.
Here is an example of how to put the data into a hash %owners and later (after having read the file) extract a "passing count" for a particular owner. I am using the Text::CSV module to parse the lines of the file.
use feature qw(say);
use open qw(:std :utf8); # Assume UTF-8 files and terminal output
use strict;
use warnings qw(FATAL utf8);
use Text::CSV;
my $csv = Text::CSV->new ( )
or die "Cannot use CSV: " . Text::CSV->error_diag ();
my $fn = 'test.csv';
open my $fh, "<", $fn
or die "Could not open file '$fn': $!";
my %owners;
my $header = $csv->getline( $fh ); # TODO: add error checking
while ( my $row = $csv->getline( $fh ) ) {
next if @$row == 0; # TODO: more error checking
my ($owner, @values) = @$row;
$owners{$owner} = \@values;
}
close $fh;
my $key = 'Y65';
my $index = 1;
say "Passing count for $key = ", $owners{$key}->[$index];
Since it's not really clear what "load a CSV file into a perl hash" means (Nor does it really make sense. An array of hashes, one per row, maybe, if you don't care about keeping the ordering of fields, but just a hash? What are the keys supposed to be?), let's focus on the rest of your question, in particular
how I will print the "Passing" count for the "Owner" Y65.
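Before reaching for a module, here is one plain-Perl reading of that question: a hash of hashes keyed on Owner and then on column name. It naively splits on commas, so treat it only as a sketch for data with no quoted or embedded commas:

```perl
use strict;
use warnings;

# the sample rows from the question, standing in for the file contents
my @csv = (
    "Owner,Running,Passing,Failing,Model",
    "D42,21,54,543,Yes",
    "T43,54,76,75,No",
    "Y65,76,43,765,Yes",
);

my @header = split /,/, shift @csv;

my %data;
for my $line (@csv) {
    my %row;
    @row{@header} = split /,/, $line;    # hash slice: column name => value
    $data{ $row{Owner} } = \%row;
}

print $data{Y65}{Passing}, "\n";    # prints 43
```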
There are a few other CSV modules that might be of interest that are much easier to use than Text::CSV:
Tie::CSV_File lets you access a CSV file like a 2D array. $foo[0][0] is the first field of the first row of the tied file.
So:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use Tie::CSV_File;
my $csv = "data.csv";
tie my #data, "Tie::CSV_File", $csv or die "Unable to tie $csv!";
for my $row (@data) {
say $row->[2] and last if $row->[0] eq "Y65";
}
DBD::CSV lets you treat a CSV file like a table in a database you can run SQL queries on.
So:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use DBI;
my $csv = "data.csv";
my $dbh = DBI->connect("dbi:CSV:", undef, undef,
{ csv_tables => { data => { f_file => $csv } } })
or die $DBI::errstr;
my $owner = "Y65";
my $p = $dbh->selectrow_arrayref("SELECT Passing FROM data WHERE Owner = ?",
{}, $owner);
say $p->[0] if defined $p;
Text::AutoCSV has a bunch of handy functions for working with CSV files.
So:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use Text::AutoCSV;
my $csv = "data.csv";
my $acsv = Text::AutoCSV->new(in_file => $csv) or die "Unable to open $csv!";
my $row = $acsv->search_1hr("OWNER", "Y65");
say $row->{"PASSING"} if defined $row;
This last one is probably closest to what I think you think you want.

Perl, matching files of a directory, using an array with part of the these file names

So, I have this directory with files named like this:
HG00119.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam.bai
HG00119.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam
HG00117.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam
HG00117.mapped.illumina.mosaik.GBR.exome.20110411.bam.bai
NA20828.mapped.illumina.mosaik.TSI.exome.20110411.bam_herc2_phase1.bam
NA20828.mapped.ILLUMINA.bwa.TSI.low_coverage.20130415.bam_herc2_data.bam
And I have a input.txt file that contains in each line.
NA20828
HG00119
As you can see, the input.txt file has the beginning of the name of the files inside the directory.
What I want to do is to filter the files in the directory that have the name (in this case just the beginning), inside the input.txt.
I don't know if I was clear, but here is the code I've done so far.
use strict;
use warnings;
my @lines;
my @files = glob("*.mapped*");
open (my $input,'<','input.txt') or die $!;
while (my $line = <$input>) {
push (@lines, $line);
}
close $input;
I used the glob to filter only the files with mapped in the name, since I have other files there that I don't want to look for.
I tried some foreach loops, tried grep and regex also, and I'm pretty sure that I was going in the right way, and I think my mistake might be about scope.
I would appreciate any help guys! thanks!
OK, first off - your while loop is redundant. If you read from a filehandle in a list context, it reads the whole thing.
my @lines = <$input>;
will do the same as your while loop.
Now, for your patterns - you're matching one list against another list, but partial matches.
chomp ( @lines );
foreach my $file ( @files ) {
foreach my $line ( @lines ) {
if ( $file =~ m/$line/ ) { print "$file matches $line"; }
}
}
(And yes, something like grep or map can do this, but I always find those two make my head hurt - they're neater, but they're implicitly looping so you don't really gain much algorithmic efficiency).
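For the record, the grep equivalent of that double loop could look like this; the sample lists stand in for the real @files and @lines:

```perl
use strict;
use warnings;

my @lines = qw(NA20828 HG00119);    # prefixes from input.txt, already chomped
my @files = qw(
    HG00119.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam.bai
    HG00117.mapped.illumina.mosaik.GBR.exome.20110411.bam.bai
    NA20828.mapped.illumina.mosaik.TSI.exome.20110411.bam_herc2_phase1.bam
);

# keep every file whose name starts with one of the prefixes
my @matched = grep {
    my $file = $_;
    grep { $file =~ /^\Q$_/ } @lines;
} @files;

print scalar(@matched), "\n";    # prints 2
```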
You can build a regular expression from the contents of input.txt like this
my @lines = do {
open my $fh, '<', 'input.txt' or die $!;
<$fh>;
};
chomp @lines;
my $re = join '|', @lines;
and then find the required files using
my @files = grep /^(?:$re)/, glob '*.mapped*';
Note that, if the list in input.txt contains any regex metacharacters, such as ., *, + etc. you will need to escape them, probably using quotemeta like this
my $re = join '|', map quotemeta, @lines;
and it may be best to do this anyway unless you are certain that there will never ever be such characters in the file.

Comparing two arrays in Perl

I know this has been asked before, and I know there are functions to make this easy in Perl. But what I want is advice on my specific code. I want to go through each line of text which I've read from a file, and compare it to the same line from another file, printing them if they are different.
I've tried as many variations of this as I could think of, and none work. This specific code which I'm posting thinks every element in the array is different from the one in the other array.
use 5.18.2;
use strict;
use utf8;
printf "This program only compares two files.\n"
. "Here are the differences between "
. $ARGV[0] . " and " . $ARGV[1] . ":\n";
open FIRST_FH, '<', $ARGV[0];
chomp(my @file1 = <FIRST_FH>);
close FIRST_FH;
open SECOND_FH, '<', $ARGV[1];
chomp(my @file2 = <SECOND_FH>);
close SECOND_FH;
for(my $i=0; $i < scalar @file1; ++$i){
my $string = $file2[$i];
unless($_ =~ /$string/){
print "Difference found: #file1[$i], #file2[$i]\n";
}
}
use utf8; just instructs the interpreter to read your source file as UTF-8. Use the open pragma to set the default IO layers to UTF-8 (or manually specify '<:encoding(UTF-8)' as the second argument to open).
Don't use printf when print will suffice (it usually does, due to interpolation). In this particular instance, I find a heredoc to be most readable.
It's inefficient to read both files into memory. Iterate over them lazily by taking one line at a time in a while loop.
Always check if open failed and include $! in the error message. Alternatively, use autodie;, which handles this for you. Also, use lexical filehandles; they'll automatically close when they go out of scope, and won't clash with other barewords (e.g. subroutines and built-ins).
Keeping in mind these suggestions, the new code would look like:
#!/usr/bin/perl
use 5.18.2; # Implicitly loads strict
use warnings;
use open qw(:encoding(utf8) :std);
print <<"EOT";
This program only compares 2 files.
Here are the differences between
$ARGV[0] and $ARGV[1]:
EOT
open(my $file1, '<', shift) or die $!;
open(my $file2, '<', shift) or die $!;
while (my $f1_line = <$file1>, my $f2_line = <$file2>)
{
if ($f1_line ne $f2_line)
{
print $f1_line, $f2_line;
}
}
But this is still a naive algorithm; if one file has a line removed, all subsequent lines will differ between files. To properly achieve a diff-like comparison, you'll need an implementation of an algorithm that finds the longest common subsequence. Consider using the CPAN module Algorithm::Diff.
Why are you comparing using $_? Which you haven't defined anywhere?
my $string = $file2[$i];
unless($_ =~ /$string/){
Simply compare the lines using eq or ne:
if ( $file1[$i] ne $file2[$i] ) {
However, I would recommend that you make a lot of stylistic changes to your script, starting with doing line by line processing instead of slurping in the files. The following is how I would completely rewrite it:
use 5.18.2;
use strict;
use warnings;
use autodie;
use utf8;
my ( $file1, $file2 ) = @ARGV;
open my $fh1, '<', $file1;
open my $fh2, '<', $file2;
while ( !eof($fh1) && !eof($fh2) ) {
chomp( my $line1 = <$fh1> );
chomp( my $line2 = <$fh2> );
if ( $line1 ne $line2 ) {
warn "Difference found on line $.:\n $line1\n $line2\n";
}
}
warn "Still more data in $file1\n" if !eof $fh1;
warn "Still more data in $file2\n" if !eof $fh2;

Why does my Perl script delete the contents of whole file when I try to edit the file?

I am running the following code to open a file (test) and edit (search and replace) it. The program seems to open the file, but instead of replacing, it deletes everything in the file. I am not sure why that is happening. Can anyone help me out here?
#!use/bin/perl
use strict;
use warnings;
my $line = $ARGV[0];
my $find = '\s{6}seqfile\s=\sinfile';
my $replace = '\s{6}seqfile\s=\sinfil2';
open (FILE, ">/home/shubhi/Desktop/pamlrun/test") || die "cant open file \n";
my @body = <FILE>;
foreach $line(#body)
{
(s/$find/$replace/g);
{
print FILE "$line";
}
}
close(FILE);
print "reached here\n";
exit;
open(FILE, ">filename") opens the file in replace mode, writing over whatever there was previously.
Also, you cannot have regexp syntax in the replacement text $replace; it will be inserted literally.
If I understood your intent correctly, this could be replaced with a one-liner
perl -pi -e 's/(\s{6}seqfile\s=\sinfil)e/${1}2/' /home/shubhi/Desktop/pamlrun/test
Your open() is opening a file handle to write to your file. Replace
open (FILE, ">/home/shubhi/Desktop/pamlrun/test") || die "cant open file \n";
with
open (FILE, "/home/shubhi/Desktop/pamlrun/test") || die "cant open file \n";
In the code you've posted, it immediately creates this file for output only. You should open the file, read/process the contents, and then write it out. If the file is sizable, then write to a new file as you read the old one, and replace the old one upon (successful) completion.
It looks like you want in-place editing magic. The easiest way to get this is to use $^I with the magic of #ARGV plus <> (look for null filehandle in the I/O Operators section):
#!/usr/bin/perl
use strict;
use warnings;
my $find = qr/\s{6}seqfile\s=\sinfile/;
my $replace = ' seqfile = infil2';
@ARGV = ("/home/shubhi/Desktop/pamlrun/test");
$^I = ".bak"; #safe the old files as file.bak
while (<>) {
s/$find/$replace/g;
print;
}
Also, given the nature of your regex, it looks like you probably want [ ] (match a space) or \t (match a tab) not \s. \s will match tabs, spaces, and other whitespace characters.
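A two-line demonstration of that difference (the tab-indented string is hypothetical, mimicking what the real file might contain):

```perl
use strict;
use warnings;

my $tabbed = "\t" x 6 . "seqfile = infile";    # six tabs, not six spaces

my $s_matches     = $tabbed =~ /^\s{6}seqfile/  ? 1 : 0;    # \s matches tabs too
my $space_matches = $tabbed =~ /^[ ]{6}seqfile/ ? 1 : 0;    # [ ] matches only spaces

print "\\s{6}: $s_matches, [ ]{6}: $space_matches\n";    # prints \s{6}: 1, [ ]{6}: 0
```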
You can also use Tie::File, but it doesn't seem to provide a backup capability:
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
my $find = qr/\s{6}seqfile\s=\sinfile/;
my $replace = ' seqfile = infil2';
tie my #lines, "Tie::File", "testdata"
or die "could not open file: $!";
for my $line (@lines) {
$line =~ s/$find/$replace/g;
}
Of course, you could roll your own backups with File::Copy:
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
use File::Copy;
my $find = qr/\s{6}seqfile\s=\sinfile/;
my $replace = ' seqfile = infil2';
copy "testdata", "testdata.bak"
or die "could not backup file: $!";
tie my #lines, "Tie::File", "testdata"
or die "could not open file: $!";
for my $line (@lines) {
$line =~ s/$find/$replace/g;
}
I would also be remiss if I did not point out that this is basically a one-liner:
perl -pi.bak -e 's/\s{6}seqfile\s=\sinfile/ seqfile = infil2/' testdata
This can be shortened further with Perl 5.10 by taking advantage of \K (zero-width positive look-behind):
perl -pi.bak -e 's/\s{6}seqfile\s=\s\Kinfile/infil2/' testdata
open for write = '>'
open for read = '<'