loop goes to next occurence after one match - arrays

I have an array that I am using to match against another table. When I execute it, it grabs only the first occurrence. For example, if company1 is in my array, it will grab only the first instance of company1 and then go to the next search term, say company2. If there had been a company1.0 after company1, only company1 would be spit out. I want it to spit out company1 etc.\t company1.0 and so on on the same line, as there will be multiple matches between the two lists.
Here is my code:
my #attendees = ('company');
foreach $fbm (#attendees) {
open(RFILE, '<', "file.txt")
or die "no such file posf: $!";
while ( $line = <RFILE> )
{
if ($line =~ /$fbm/i)
{
print $fbm."\t". $line;
last;
}
if (eof(RFILE))
{
print "posf"."\n";
}
}
}
print STDERR "\n\nFINISHED!!";
My Input:
company1
company1.0
company1 also begins with 1 but different ending
company1 can i have this one too?
My output:company1
Desired output: company1\tcompany1.0\tcompany1 also begins with 1 but different ending\tcompany1 can i have this one too?

my #attendees = ('company');
my #whatever;
open ( my $fh, '<', "file.txt")
or die "could not open file.txt: $!";
while ( <$fh> ) {
chomp $_;
push #whatever, $_;
}
foreach my $attendee ( #attendees ) {
foreach my $thing ( #whatever ) {
if ($thing =~ /$attendee/i) {
print $attendee, "\t", $thing, "\n";
}
}
}
print STDERR "FINISHED!\n";
Perhaps this does what you want it to, but I must admit that I'm not quite sure.

Related

Building a Perl program to acess a set of data

I have to write a Perl program that is able to read the social security baby names for a specific year, and have a person be able to enter a name, and be told weather it is male or female, how many people were born with that name, and where it falls in rank.
So far I have been able to separate the data into two separate arrays based on gender, but have to no idea where to go from here.
#!/usr/local/bin/perl
use strict;
use warnings;
open (FILE, "ssbn1898.txt");
print <FILE>;
close (FILE);
my #M_array;
my #F_array;
open (my $input, "<", 'ssbn1898.txt');
while ( <$input> ) {
chomp;
my ( $name, $gender ) = split ( /,/ );
if ( $gender = "M" ) {
push ( #M_array, $name );
}
else {
push ( #F_array, $name );
}
}
close ( $input );
print 'M: ' . join("\t", #M_array) . "\n";
print 'F: ' . join("\t", #F_array) . "\n";
http://www.ssa.gov/cgi-bin/popularnames.cgi
This is the data I am working with.
So far you have 2 arrays for each gender. Now the pending tasks are
Input: Allow users to input a name
Output: Male or Female
Output: Number of people with that name
To take input from user you could do
my $userinput = <STDIN>;
Then you might need to chomp the $userinput and then check whether the $userinput is in array 1 or array 2. For that you will have to use grep or loops. Doing that you'd be able to find whether name belongs to array with Male names or Females.
if (grep /$userinput/, #male_names) {
print "found $userinput in male list\n";
}
Using loop you could do something like below to find total number of people with that name:
foreach (#male_names){
$counter++ if $userinput eq $_;
}
PS: grep returns list, so if you use a scalar you could find number of matches, so you don't have to go for loops.
#!/usr/bin/perl
use strict;
use warnings;
my #male_names = qw(Raj Rohan John Jim Tony Raj Rohan Jim Jim);
my #female_names = qw(Natasha Neha Neha Jasmine Rita Rosy);
my $matches;
my $userinput = <STDIN>;
chomp ($userinput);
if ( $matches = grep /$userinput/, #male_names ) {
print "found $userinput in male list, count is $matches \n";
}
elsif ( $matches = grep /$userinput/, #female_names ) {
print "found $userinput in female list, count is $matches \n";
}
else{
print "Did not find name";
}

I need to create an out files out of my hash keys and store a file list to the files based on my keys

I have two files the first one has a number range and a version name, the number range is retrieved from the second file which is consist of a list number. From the second file I am looking for the numbers start in position 11 for 9 char then compare it with my first file "the range file" then print to the screen the name of the version and how many matches.
My first file looks like this
imb,folded ,655575645,827544086
imb,selfmail ,827549192,827572977
My second file looks like this
0026110795165557564528452972062
0026110795165557648628452974959
0026110795182749420290503162401
0026110795182749566690703875348
0026110795182750564290503365856
0026110795182751155490713282618
0026110795182751819190503415474
0026110795182752054790503331977
0026110795182752888194578410931
0026110795182753115893308242647
0026110795182753522398248322033
0026110795182753601890723246006
0026110795182754156995403760702
0026110795182754174597213102232
0026110795182754408698248770395
0026110795182754919290713221614
0026110795182755128698248922635
0026110795182755566790713334451
0026110795182755669490713213633
0026110795182755806390507009696
0026110795182756204890713212248
0026110795182756217690713273839
0026110795182756259998248961157
0026110795182756309595403769515
0026110795182756708894578164887
0026110795182756829090713282238
0026110795182757082791367220156
0026110795182757130090713274108
0026110795182757297798248934527
0026110795182757370277063564556
My output now looks like this
folded IMB Count: 15
No Matched IMB Count: 1
selfmail IMB Count: 14
I need to create files with a name based on the version name in my first array, then to print to each files the original value for what it match. For instance folded has 15 match I need to print the original number from the file list to a file with a name of folded.txt
my code is
#!/usr/bin/perl
use warnings
use strict
use feature qw{ say };
sub trimspaces {
my #argsarray = #_;
$argsarray[0] =~ s/^\s+//;
$argsarray[0] =~ s/\s+$//;
return $argsarray[0];
}
open(INPUT , "< D:\\Home\\emahou\\imbfilelist.txt") or die $!;
open(INPUT2 , "< D:\\Home\\emahou\\imbrange.txt") or die $!;
my $n;
my #fh;
my $value;
my #ranges;
my $isMatch;
my $printed;
my $fVersion;
my %versionHash=();
while (<INPUT2>) {
chomp;
my ($version, $from, $to) = (split /,/)[ 1, 2, 3 ];
push #ranges, [ $from, $to, trimspaces($version)];
if (!exists $versionHash{trimspaces($version)})
{
$versionHash{trimspaces($version)}=0;
}
}
$versionHash{"No Matched"}=0;
close INPUT2;
while (<INPUT>) {
$isMatch=0;
$n = substr($_,12-1,9);
for my $r (#ranges) {
if ( $n >= $r->[0] && $n <= $r->[1]) {
$fVersion=$r->[2];
if (exists $versionHash{$fVersion}) {
$versionHash{$fVersion}++;
}
$isMatch=1;
last;
}
}
if (!$isMatch) {
$versionHash{"No Matched"}++;
}
}
foreach my $key (keys %versionHash) {
print STDOUT "$key IMB Count: " . $versionHash{$key} . "\n";
}
close INPUT;
This seems to do as you ask
It works by building a hash %filelist with keys from the second column of imbfilelist.txt and values from, to, fh (the output file handle) and count (the number of records that matched this range
Then the imbrange.txt is read a line at a time, the nine-digit code extracted, and compared with the from and to values of each element of the %filelist hash. If a match is found then the line is printed to the corresponding file handle and the counter is incremented. If the code from this line doesn't match any of the ranges then $none_matched is incremented for output in the summary
use strict;
use warnings;
use 5.010;
use autodie;
chdir 'D:\Home\emahou';
# Build a hash of `version` strings with their `from` and `to` values
open my $fh, '<', 'imbfilelist.txt';
my %filelist;
while ( <$fh> ) {
chomp;
my ($version, $from, $to) = (split /\s*,\s*/)[1,2,3];
$filelist{$version} = { from => $from, to => $to };
}
# Open an output file for each item and set the count to zero
while ( my ($version, $info) = each %filelist ) {
open $info->{fh}, '>', "$version.txt";
$info->{count} = 0;
}
# Filter the data in the range file, printing to the
# appropriate output file and keeping count
open $fh, '<', 'imbrange.txt';
my $none_matched = 0;
while ( my $line = <$fh> ) {
next unless $line =~ /\S/;
chomp $line;
my $code = substr $line, 11, 9;
my $matched = 0;
while ( my ($version, $info) = each %filelist ) {
next unless $code >= $info->{from} and $code <= $info->{to};
print { $info->{fh} } $line, "\n";
++$info->{count};
++$matched;
}
++$none_matched unless $matched;
}
close $_->{fh} for values %filelist;
# Print the summary
while ( my ($version, $info) = each %filelist ) {
print "$version IMB Count: $info->{count}\n"
}
print "None matched IMB Count: $none_matched\n"
output
selfmail IMB Count: 14
folded IMB Count: 15
None matched IMB Count: 1
folded.txt
0026110795165557564528452972062
0026110795165557648628452974959
0026110795182749420290503162401
0026110795182749566690703875348
0026110795182750564290503365856
0026110795182751155490713282618
0026110795182751819190503415474
0026110795182752054790503331977
0026110795182752888194578410931
0026110795182753115893308242647
0026110795182753522398248322033
0026110795182753601890723246006
0026110795182754156995403760702
0026110795182754174597213102232
0026110795182754408698248770395
selfmail.txt
0026110795182754919290713221614
0026110795182755128698248922635
0026110795182755566790713334451
0026110795182755669490713213633
0026110795182755806390507009696
0026110795182756204890713212248
0026110795182756217690713273839
0026110795182756259998248961157
0026110795182756309595403769515
0026110795182756708894578164887
0026110795182756829090713282238
0026110795182757082791367220156
0026110795182757130090713274108
0026110795182757297798248934527

Why is my script only accessing the first element in array?

Below is my script.
I have attempted many print statements to work out why it is only accessing the first array element. The pattern match works. The array holds a minimum 40 elements. I have checked and it is full.
I have printed each line, and each line prints.
my $index = 0;
open(FILE, "$file") or die "\nNot opening $file for reading\n\n";
open(OUT, ">$final") or die "Did not open $final\n";
while (<FILE>) {
foreach my $barcode (#barcode) {
my #line = <FILE>;
foreach $_ (#line) {
if ($_ =~ /Barcode([0-9]*)\t$barcode[$index]\t$otherarray[$index]/) {
my $bar = $1;
$_ =~ s/.*//;
print OUT ">Barcode$bar"."_"."$barcode[$index]\t$otherarray[$index]";
}
print OUT $_;
}
$index++;
}
}
Okay, lets say the input was:
File:
Barcode001 001 abc
Barcode002 002 def
Barcode003 003 ghi
#barcode holds:
001
002
003
#otherarray holds:
abc
def
ghi
The output result for this script is currently printing only:
Barcode001_001 abc
It should be printing:
>Barcode001_001 abc
>Barcode002_002 def
>Barcode003_003 ghi
Where it should be printing a whole load up to ~40 lines.
Any ideas? There must be something wrong with the way I am accessing the array elements? Or incrementing? Hoping this isn't something too silly!
Thanks in advance.
It needs the index because I am trying to match arrays in parallel, as they are ordered. Each line needs to match the corresponding indices of the arrays to each line in the file.
It's a little hard to answer with certainty without more information about the contents of #barcode and FILE, but there is something odd in your code which makes me think that it might be the problem.
The construct while (<FILE>) { ... } will, until end of file, read a line from FILE into $_ and then execute the contents of the loop. In your code, you also read all the lines from FILE from within the loop that iterates over #barcode. I think it is likely that you intended to check each line from FILE against all the elements of #barcode, which would make the loop look like the following:
while (my $line = <FILE>) {
foreach my $barcode (#barcode) {
if ($line =~ /Barcode([0-9]*)\t$barcode/) {
my $bar = $1;
print OUT ">Barcode$bar"."_"."$barcode\n";
}
else {
print OUT $line;
}
}
}
I've taken the liberty of doing a bit of code tidying, but I may have made some unwarranted assumptions.
Your core problem in the above is - in your first iteration you slurp all of your file into #lines. But because it's lexically scoped to the loop, it disappears when that loop completes.
Furthermore:
I would strongly suggest that you don't use $_ like that.
$_ is a special variable that's set implicitly in loops. I'd strongly suggest that you need to replace that with something that isn't a special variable, because that's a sure way to cause yourself pain.
turn on use strict; and use warnings;
use 3 argument open with a lexical filehandle.
perltidy your code, so the bracketing looks right.
you've a search and replace pattern on $_ that's emptying it completely, but then you're trying to print it. You may well not be printing what you think you're printing.
You're accessing <FILE> outside and inside your loop. This will cause you problems.
Barcode([0-9]*) - with a '*' there you're saying 'zero or more' is valid. You may want to consider \d+ - one or more digits.
referencing multiple arrays by index is messy. I'd suggest coalescing them into a hash lookup (lookup by key - barcode)
This line:
my #line = <FILE>;
reads your whole file into #line. But you do this inside the while loop that iterates... each line in <FILE>. Don't do that, it's horrible.
Is this something like what you wanted?
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my #barcode = qw (
001
002
003
);
my #otherarray = qw (
abc
def
ghi
);
my %lookup;
#lookup{#barcode} = #otherarray;
print Dumper \%lookup;
#commented because I don't have your source data
#my $file = "input_file_name";
#my $output = "output_file_name";
#open( my $input, "<", $file ) or die "\nNot opening $file for reading\n\n";
#open( my $output, ">", $final ) or die "Did not open $final\n";
#while ( my $line = <$input> )
while ( my $line = <DATA> ) {
foreach my $barcode (#barcode) {
if ( my ($bar) = ( $line =~ /Barcode(\d+)\s+$barcode/ ) ) {
print ">Barcode$bar" . "_" . "$barcode $lookup{$barcode}\n";
#print {$output} ">Barcode$bar" . "_" . "$lookup{$barcode}\n";
}
}
}
__DATA__
Barcode001 001
Barcode002 002
Barcode003 003
Prints:
$VAR1 = {
'001' => 'abc',
'002' => 'def',
'003' => 'ghi'
};
>Barcode001_001 abc
>Barcode002_002 def
>Barcode003_003 ghi
It turns out it was a simple issue as I had suspected being a Monday. I had a colleague go through it with me, and it was the placing of the index:
#my $index = 0; #This means the index is iterated through,
#but for each barcode for one line, then it continues
#counting up and misses the other values, therefore
#repeatedly printing just the first element of the array.
open(FILE, "$file") or die "\nNot opening $file for reading\n\n";
open(OUT, ">$final") or die "Did not open $final\n";
while (<FILE>) {
$index = 0; #New placement of $index for initialising.
foreach my $barcode (#barcode) {
my #line = <FILE>;
foreach $_ (#line) {
if ($_ =~ /Barcode([0-9]*)\t$barcode[$index]\t$otherarray[$index]/) {
my $bar = $1;
$_ =~ s/.*//;
print OUT ">Barcode$bar"."_"."$barcode[$index]\t$otherarray[$index]";
}
print OUT $_;
$index++; #Increment here
}
#$index++;
}
}
Thanks to everyone for their responses, for my original and poorly worded question they would have worked and may be more efficient, but for the purpose of the script and my edited question, it needs to be this way.

skipping a line in an array, Perl

I'm relatively new to Perl and I've come across this project that I'm having a bit of a hard time with.
The object of the project is to compare two csv files, one of which would contain:
$name, $model, $version
and the other which would contain:
$name2,$disk,$storage
in the end the RESULT file will contain that matched lines and put together the information like so:
$name, $model, $version, $disk,$storage.
I've managed to do this, but my problem is that when one of the elements in missing the program breaks. When it encounters a line in the file missing an element it stops at that line. How can I fix this problem? any suggestions or a way as to how I can perhaps make it skip that line and continue on?
Here's my code:
open( TESTING, '>testing.csv' ); # Names will be printed to this during testing. only .net ending names should appear
open( MISSING, '>Missing.csv' ); # Lines with missing name feilds will appear here.
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#my (#array) =<FILE>;
my #hostname; #stores names
#close FILE;
#***** TESTING TO SEE IF ANY OF THE LISTED ITEMS BEGIN WITH A COMMA AND DO NOT HAVE A NAME.
#***** THESE OBJECTS ARE PLACED INTO THE MISSING ARRAY AND THEN PRINTED OUT IN A SEPERATE
#***** FILE.
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#test
if ( open( FILE, "file.txt" ) ) {
}
else {
die " Cannot open file 1!\n:$!";
}
$count = 0;
$x = 0;
while (<FILE>) {
( $name, $model, $version ) = split(","); #parsing
#print $name;
chomp( $name, $model, $version );
if ( ( $name =~ /^\s*$/ )
&& ( $model =~ /^\s*$/ )
&& ( $version =~ /^\s*$/ ) ) #if all of the fields are blank ( just a blank space)
{
#do nothing at all
}
elsif ( $name =~ /^\s*$/ ) { #if name is a blank
$name =~ s/^\s*/missing/g;
print MISSING "$name,$model,$version\n";
#$hostname[$count]=$name;
#$count++;
}
elsif ( $model =~ /^\s*$/ ) { #if model is blank
$model =~ s/^\s*/missing/g;
print MISSING"$name,$model,$version\n";
}
elsif ( $version =~ /^\s*$/ ) { #if version is blank
$version =~ s/^\s*/missing/g;
print MISSING "$name,$model,$version\n";
}
# Searches for .net to appear in field "$name" if match, it places it into hostname array.
if ( $name =~ /.net/ ) {
$hostname[$count] = $name;
$count++;
}
#searches for a comma in the name feild, puts that into an array and prints the line into the missing file.
#probably won't have to use this, as I've found a better method to test all of the feilds ( $name,$model,$version)
#and put those into the missing file. Hopefully it works.
#foreach $line (#array)
#{
#if($line =~ /^\,+/)
#{
#$line =~s/^\,*/missing,/g;
#$missing[$x]=$line;
#$x++;
#}
#}
}
close FILE;
for my $hostname (#hostname) {
print TESTING $hostname . "\n";
}
#for my $missing(#missing)
#{
# print MISSING $missing;
#}
if ( open( FILE2, "file2.txt" ) ) { #Run this if the open succeeds
#open outfile and print starting header
open( RESULT, '>resultfile.csv' );
print RESULT ("name,Model,version,Disk, storage\n");
}
else {
die " Cannot open file 2!\n:$!";
}
$count = 0;
while ( $hostname[$count] ne "" ) {
while (<FILE>) {
( $name, $model, $version ) = split(","); #parsing
#print $name,"\n";
if ( $name eq $hostname[$count] ) # I think this is the problem area.
{
print $name, "\n", $hostname[$count], "\n";
#print RESULT"$name,$model,$version,";
#open (FILE2,'C:\Users\hp-laptop\Desktop\file2.txt');
#test
if ( open( FILE2, "file2.txt" ) ) {
}
else {
die " Cannot open file 2!\n:$!";
}
while (<FILE2>) {
chomp;
( $name2, $Dcount, $vname ) = split(","); #parsing
if ( $name eq $name2 ) {
chomp($version);
print RESULT"$name,$model,$version,$Dcount,$vname\n";
}
}
}
$count++;
}
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#test
if ( open( FILE, "file.txt" ) ) {
}
else {
die " Cannot open file 1!\n:$!";
}
}
close FILE;
close RESULT;
close FILE2;
I think you want next, which lets you finish the current iteration immediately and start the next one:
while (<FILE>) {
( $name, $model, $version ) = split(",");
next unless( $name && $model && $version );
...;
}
The condition that you use depends on what values you'll accept. In my examples, I'm assuming that all values need to true. If they need to just not be the empty string, maybe you check the length instead:
while (<FILE>) {
( $name, $model, $version ) = split(",");
next unless( length($name) && length($model) && length($version) );
...;
}
If you know how to validate each field, you might have subroutines for those:
while (<FILE>) {
( $name, $model, $version ) = split(",");
next unless( length($name) && is_valid_model($model) && length($version) );
...;
}
sub is_valid_model { ... }
Now you just need to decide how to integrate that into what you are already doing.
You should start by adding use strict and use warnings to the top of your program, and declaring all variables with my at their point of first use. That will reveal a lot of simple mistakes that are otherwise difficult to spot.
You should also use the three-parameter for of open and lexical filehandles, and the Perl idiom for checking exceptions on opening files is to add or die to an open call. if statements with an empty block for the success path waste space and become unreadable. An open call should look like this
open my $fh, '>', 'myfile' or die "Unable to open file: $!";
Finally, it is much safer to use a Perl module when you are handling CSV files as there are a lot of pitfalls in using a simple split /,/. The Text::CSV module has done all the work for you and is available on CPAN.
You problem is that, having read to the end of the first file, you don't rewind or reopen it before reading from the same handle again in the second nested loop. That means no more data will be read from that file and the program will behave as if it is empty.
It is a bad strategy to read through the same file hundreds of times just to pair up coresponding records. If file is of a reasonable size you should build a data structure in memory to hold the information. A Perl hash is ideal as it allows you to look up the data corresponding to a given name instantly.
I have written a revision of your code that demonstrates these points. It would be awkward for me to test the code as I have no sample data, but if you continue to have problems please let us know.
use strict;
use warnings;
use Text::CSV;
my $csv = Text::CSV->new;
my %data;
# Read the name, model and version from the first file. Write any records
# that don't have the full three fields to the "MISSING" file
#
open my $f1, '<', 'file.txt' or die qq(Cannot open file 1: $!);
open my $missing, '>', 'Missing.csv'
or die qq(Unable to open "MISSING" file for output: $!);
# Lines with missing name fields will appear here.
while ( my $line = csv->getline($f1) ) {
my $name = $line->[0];
if (grep $_, #$line < 3) {
$csv->print($missing, $line);
}
else {
$data{$name} = $line if $name =~ /\.net$/i;
}
}
close $missing;
# Put a list of .net names found into the testing file
#
open my $testing, '>', 'testing.csv'
or die qq(Unable to open "TESTING" file for output: $!);
# Names will be printed to this during testing. Only ".net" ending names should appear
print $testing "$_\n" for sort keys %data;
close $testing;
# Read the name, disk and storage from the second file and check that the line
# contains all three fields. Remove the name field from the start and append
# to the data record with the matching name if it exists.
#
open my $f2, '<', 'file2.txt' or die qq(Cannot open file 2: $!);
while ( my $line = $csv->getline($f2) ) {
next unless grep $_, #$line >= 3;
my $name = shift #$line;
next unless $name =~ /\.net$/i;
my $record = $data{$name};
push #$record, #$line if $record;
}
# Print the completed hash. Send each record to the result output if it
# has the required five fields
#
open my $result, '>', 'resultfile.csv' or die qq(Cannot open results file: $!);
$csv->print($result, qw( name Model version Disk storage ));
for my $name (sort keys %data) {
my $line = $data{$name};
if (grep $_, #$line >= 5) {
$csv->print($result, $data{$name});
}
}

Selectively pushing elements from array A to array C which are not present in array B

I am trying to produce an array called #names which contains the names of people who are present in allnames.txt but not in somenames.txt. My code is as follows:
if(open(SKIPLIST, "somenames.txt")) {
#some = <SKIPLIST>;
}
close(SKIPLIST);
if(open(TESTLIST, "allnames.txt")) {
#all = <TESTLIST>;
}
close(TESTLIST);
foreach $name (#all) {
$name =~ s/[\n\r]//mg;
if (grep {$_ eq $name} #some) {
#Do nothing
}
else {
push(#names, $name);
}
}
print "Leftover: #names";
Contents of allnames.txt:
adam
jake
john
troy
Contents of somenames.txt:
adam
john
Actual output:
Leftover: adam jake troy
Expected output:
Leftover: jake troy
Can anyone explain why 'adam' is still getting pushed?
"adam" is included in the results because your #some array only contains "adam\n". To fix it, just do
chomp #some, #all;
or, if you want to be paranoid about DOS line breaks,
s/[\r\n]+$// for #some, #all;
before your main loop. Then you also won't need the line
$name =~ s/[\n\r]//mg;
inside the loop.
Also, if you want your code to be fast, you should really use a hash instead of the #some array, like this:
my %some;
if (open SKIPLIST, "somenames.txt") {
while (my $name = <SKIPLIST>) {
chomp $name;
undef $some{$name}; # create the key $name in the hash %some
}
close SKIPLIST;
}
my #names;
if (open TESTLIST, "allnames.txt") {
while (my $name = <TESTLIST>) {
chomp $name;
push #names, $name unless exists $some{$name};
}
close TESTLIST;
}
print "Leftover: #names\n";
The problem is some of your elements have trailing and/or leading whitespace (\n or \r) and some don't. The best way to fix it is to clean them right after reading the file:
if(open(SKIPLIST, "somenames.txt")) {
#some = <SKIPLIST>;
foreach (#some) { $_ =~ s/[\n\r]//mg; }
}
close(SKIPLIST);
if(open(TESTLIST, "allnames.txt")) {
#all = <TESTLIST>;
foreach (#all) { $_ =~ s/[\n\r]//mg; }
}
close(TESTLIST);
foreach $name (#all) {
if (grep {$_ eq $name} #some) {
#Do nothing
}
else {
push(#names, $name);
}
}
print "Leftover: #names";
The problem is that you're removing the newlines from what you get from TESTLIST, but not from what you get from SKIPLIST.
I'd use a hash instead of grep for quick lookups, so my code would more like
my %some;
while (<SKIPLIST>) {
s/\s+\z//;
++$some{$_};
}
my #names;
while (<TESTLIST>) {
s/\s+\z//;
push #names, $_ if !$some{$_};
}
Or if you want something in a functional programming style,
use List::MoreUtils qw( apply );
my %some = map { $_ => 1 } apply { s/\s+\z//; } <SKIPLIST>;
my #names = grep !$some{$_}, apply { s/\s+\z//; } <TESTLIST>;
If you have duplicate names and you want to get duplicate names, change !$some{$_} to !$some{$_}++ (in either snippet).
There is no need to write loops to iterate over the two sets of names. Use of map and a hash slice makes it much clearer what is going on.
use strict;
use warnings;
my $fh;
open $fh, '<', 'somenames.txt' or die $!;
chomp(my #some = <$fh>);
open $fh, '<', 'allnames.txt' or die $!;
chomp(my #all = <$fh>);
my %diff = map(($_ => 1), #all);
delete #diff{#some};
print join(' ', "Leftover:", keys %diff), "\n";

Resources