Building a Perl program to acess a set of data - arrays

I have to write a Perl program that is able to read the social security baby names for a specific year, and have a person be able to enter a name, and be told weather it is male or female, how many people were born with that name, and where it falls in rank.
So far I have been able to separate the data into two separate arrays based on gender, but have to no idea where to go from here.
#!/usr/local/bin/perl
use strict;
use warnings;
open (FILE, "ssbn1898.txt");
print <FILE>;
close (FILE);
my #M_array;
my #F_array;
open (my $input, "<", 'ssbn1898.txt');
while ( <$input> ) {
chomp;
my ( $name, $gender ) = split ( /,/ );
if ( $gender = "M" ) {
push ( #M_array, $name );
}
else {
push ( #F_array, $name );
}
}
close ( $input );
print 'M: ' . join("\t", #M_array) . "\n";
print 'F: ' . join("\t", #F_array) . "\n";
http://www.ssa.gov/cgi-bin/popularnames.cgi
This is the data I am working with.

So far you have 2 arrays for each gender. Now the pending tasks are
Input: Allow users to input a name
Output: Male or Female
Output: Number of people with that name
To take input from user you could do
my $userinput = <STDIN>;
Then you might need to chomp the $userinput and then check whether the $userinput is in array 1 or array 2. For that you will have to use grep or loops. Doing that you'd be able to find whether name belongs to array with Male names or Females.
if (grep /$userinput/, #male_names) {
print "found $userinput in male list\n";
}
Using loop you could do something like below to find total number of people with that name:
foreach (#male_names){
$counter++ if $userinput eq $_;
}
PS: grep returns list, so if you use a scalar you could find number of matches, so you don't have to go for loops.
#!/usr/bin/perl
use strict;
use warnings;
my #male_names = qw(Raj Rohan John Jim Tony Raj Rohan Jim Jim);
my #female_names = qw(Natasha Neha Neha Jasmine Rita Rosy);
my $matches;
my $userinput = <STDIN>;
chomp ($userinput);
if ( $matches = grep /$userinput/, #male_names ) {
print "found $userinput in male list, count is $matches \n";
}
elsif ( $matches = grep /$userinput/, #female_names ) {
print "found $userinput in female list, count is $matches \n";
}
else{
print "Did not find name";
}

Related

What is the proper way to search array using Smart Match?

I'm new to programming much less Perl; I'm having difficulty with searching an array I've made from an external text file. I'm looking for a simple way to check if the user entry is located in the array. I've used the Smart Match function before but never in an "if" statement and can't seem to get it to work. Am I implementing this function wrong, or is there an easier way to check if the user's string is in the array?
#!/usr/bin/perl
use 5.010;
#Inventory editing script - Jason Black
#-------------------------------------------------------------------------------
print "1. Add Items\n";
print "2. Search Items\n";
print "Please enter your choice: ";
chomp ($userChoice = <STDIN>); #Stores user input in $userChoice
if($userChoice == 1){
$message = "Please enter in format 'code|title|price|item-count'\n";
&ChoiceOne;
}
elsif($userChoice == 2){
$message = "Enter search terms\n";
&ChoiceTwo;
}
sub ChoiceOne{
print "$message\n";
chomp($userAddition = <STDIN>); #Stores input in $userAddition
$string1 = "$userAddition";
open (FILE, "FinalProjData.txt") or die ("File not found"); #"FILE" can be named anything
#array = <FILE>;
if ( /$string1/ ~~ #array){
print "This entry already exists. Would you like to replace? Y/N \n";
chomp($userDecision = <STDIN>); #Stores input in $userDecision
if ($userDecision eq "Y"){
$string1 =~ s/$userAddition/$userAddition/ig;
print "Item has been overwritten\n";}
elsif($userDecision eq "N"){
print FILE "$string1\n";
print "Entry has been added to end of file.\n";}
else{
print "Invalid Input";
exit;}
}
else {
print FILE "$string1\n";
print "Item has been added.\n";}
close(FILE);
exit;
}#end sub ChoiceOne
sub ChoiceTwo{
print "$message\n";
}
If you want to avoid using smartmatch alltogether:
if ( grep { /$string1/ } #array ) {
To actually match the $string1, however, it needs to be escaped, so that | doesn't mean or:
if ( grep { /\Q$string\E/ } #array ) {
or just a simple string compare:
if ( grep { $_ eq $string } #array ) {

Why is my script only accessing the first element in array?

Below is my script.
I have attempted many print statements to work out why it is only accessing the first array element. The pattern match works. The array holds a minimum 40 elements. I have checked and it is full.
I have printed each line, and each line prints.
my $index = 0;
open(FILE, "$file") or die "\nNot opening $file for reading\n\n";
open(OUT, ">$final") or die "Did not open $final\n";
while (<FILE>) {
foreach my $barcode (#barcode) {
my #line = <FILE>;
foreach $_ (#line) {
if ($_ =~ /Barcode([0-9]*)\t$barcode[$index]\t$otherarray[$index]/) {
my $bar = $1;
$_ =~ s/.*//;
print OUT ">Barcode$bar"."_"."$barcode[$index]\t$otherarray[$index]";
}
print OUT $_;
}
$index++;
}
}
Okay, lets say the input was:
File:
Barcode001 001 abc
Barcode002 002 def
Barcode003 003 ghi
#barcode holds:
001
002
003
#otherarray holds:
abc
def
ghi
The output result for this script is currently printing only:
Barcode001_001 abc
It should be printing:
>Barcode001_001 abc
>Barcode002_002 def
>Barcode003_003 ghi
Where it should be printing a whole load up to ~40 lines.
Any ideas? There must be something wrong with the way I am accessing the array elements? Or incrementing? Hoping this isn't something too silly!
Thanks in advance.
It needs the index because I am trying to match arrays in parallel, as they are ordered. Each line needs to match the corresponding indices of the arrays to each line in the file.
It's a little hard to answer with certainty without more information about the contents of #barcode and FILE, but there is something odd in your code which makes me think that it might be the problem.
The construct while (<FILE>) { ... } will, until end of file, read a line from FILE into $_ and then execute the contents of the loop. In your code, you also read all the lines from FILE from within the loop that iterates over #barcode. I think it is likely that you intended to check each line from FILE against all the elements of #barcode, which would make the loop look like the following:
while (my $line = <FILE>) {
foreach my $barcode (#barcode) {
if ($line =~ /Barcode([0-9]*)\t$barcode/) {
my $bar = $1;
print OUT ">Barcode$bar"."_"."$barcode\n";
}
else {
print OUT $line;
}
}
}
I've taken the liberty of doing a bit of code tidying, but I may have made some unwarranted assumptions.
Your core problem in the above is - in your first iteration you slurp all of your file into #lines. But because it's lexically scoped to the loop, it disappears when that loop completes.
Furthermore:
I would strongly suggest that you don't use $_ like that.
$_ is a special variable that's set implicitly in loops. I'd strongly suggest that you need to replace that with something that isn't a special variable, because that's a sure way to cause yourself pain.
turn on use strict; and use warnings;
use 3 argument open with a lexical filehandle.
perltidy your code, so the bracketing looks right.
you've a search and replace pattern on $_ that's emptying it completely, but then you're trying to print it. You may well not be printing what you think you're printing.
You're accessing <FILE> outside and inside your loop. This will cause you problems.
Barcode([0-9]*) - with a '*' there you're saying 'zero or more' is valid. You may want to consider \d+ - one or more digits.
referencing multiple arrays by index is messy. I'd suggest coalescing them into a hash lookup (lookup by key - barcode)
This line:
my #line = <FILE>;
reads your whole file into #line. But you do this inside the while loop that iterates... each line in <FILE>. Don't do that, it's horrible.
Is this something like what you wanted?
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my #barcode = qw (
001
002
003
);
my #otherarray = qw (
abc
def
ghi
);
my %lookup;
#lookup{#barcode} = #otherarray;
print Dumper \%lookup;
#commented because I don't have your source data
#my $file = "input_file_name";
#my $output = "output_file_name";
#open( my $input, "<", $file ) or die "\nNot opening $file for reading\n\n";
#open( my $output, ">", $final ) or die "Did not open $final\n";
#while ( my $line = <$input> )
while ( my $line = <DATA> ) {
foreach my $barcode (#barcode) {
if ( my ($bar) = ( $line =~ /Barcode(\d+)\s+$barcode/ ) ) {
print ">Barcode$bar" . "_" . "$barcode $lookup{$barcode}\n";
#print {$output} ">Barcode$bar" . "_" . "$lookup{$barcode}\n";
}
}
}
__DATA__
Barcode001 001
Barcode002 002
Barcode003 003
Prints:
$VAR1 = {
'001' => 'abc',
'002' => 'def',
'003' => 'ghi'
};
>Barcode001_001 abc
>Barcode002_002 def
>Barcode003_003 ghi
It turns out it was a simple issue as I had suspected being a Monday. I had a colleague go through it with me, and it was the placing of the index:
#my $index = 0; #This means the index is iterated through,
#but for each barcode for one line, then it continues
#counting up and misses the other values, therefore
#repeatedly printing just the first element of the array.
open(FILE, "$file") or die "\nNot opening $file for reading\n\n";
open(OUT, ">$final") or die "Did not open $final\n";
while (<FILE>) {
$index = 0; #New placement of $index for initialising.
foreach my $barcode (#barcode) {
my #line = <FILE>;
foreach $_ (#line) {
if ($_ =~ /Barcode([0-9]*)\t$barcode[$index]\t$otherarray[$index]/) {
my $bar = $1;
$_ =~ s/.*//;
print OUT ">Barcode$bar"."_"."$barcode[$index]\t$otherarray[$index]";
}
print OUT $_;
$index++; #Increment here
}
#$index++;
}
}
Thanks to everyone for their responses, for my original and poorly worded question they would have worked and may be more efficient, but for the purpose of the script and my edited question, it needs to be this way.

perl to add the results of 2 arrays from 2 differnt files together

I run a report between 2 csv files, the last bit i wish to do check is to add matching elemants of the 2 arrays (built up of unique values and occurances) together. but i can't work out how to do a for each matching name in each both arrays add together, to get the output as below.
INPUT:
jon 22
james 12
ken 22
jack 33
jim 11
harry 7
dave 9
grant 12
matt 74
malc 12
INPUT1:
jon 2
james 1
ken 8
jack 5
jim 1
harry 51
dave 22
Desired Output:
jon 24
james 13
ken 30
jack 38
jim 12
harry 58
dave 31
grant 12
matt 74
malc 12
code i have so to create oput from INPUT and INPUT1
my %seen;
seek INPUT, 0, 0;
while (<INPUT>)
{
chomp;
my $line = $_;
my #elements = split (",", $line);
my $col_name = $elements[1];
#print " $col_name \n" if !
$seen{$col_name}++;
}
while ( my ( $col_name, $times_seen ) = each %seen ) {
my $loc_total = $times_seen * $dd;
print "\n";
print " $col_name \t\t : = $loc_total";
printf OUTPUT "%-34s = %15s\n", $col_name , " $loc_total ";
}
############## ###################
my %seen2;
seek INPUT1, 0, 0;
while (<INPUT1>)
{
chomp;
my $line = $_;
my #elements1 = split (",", $line);
my $col_name = $elements1[1];
my $col_type = $elements1[5];
$seen2{$col_name}++ if $col_type eq "YES";
}
while ( my ( $col_name, $times_seen2 ) = each %seen2 ) {
my $loc_total = $times_seen2 ;
print "\n $col_name \t\t= $loc_total";
printf OUTPUT "%-34s = %15s\n", $col_name , $times_seen2 ;
}
close INPUT;
Instead of using %seen, store the running total in the hash directly:
#!/usr/bin/perl
use warnings;
use strict;
my %count;
for my $file ('INPUT', 'INPUT1') {
open my $IN, '<', $file or die "$file: $!";
while (<$IN>) {
my ($name, $num) = split;
$count{$name} += $num;
}
}
for my $name (sort { $count{$b} <=> $count{$a} } keys %count) {
print "$name\t$count{$name}\n";
}
First, I'll assume that the input files are actual CSV files -- whereas your examples are just whitespace delimited. In other words:
jon,22
james,12
ken,22
jack,33
jim,11
harry,7
dave,9
grant,12
matt,74
malc,12
and
jon,2
james,1
ken,8
jack,5
jim,1
harry,51
dave,22
ASSUMING I'm correct, then your while loops will do the trick, with a couple of tweaks:
The first element of your #elements arrays have index 0, not 1. So the "key" here is at $elements[0], and the "value" is at $elements[1]. So you'd have something like:
my $col_name = $elements[0];
my $col_value = $elements[1];
Instead of incrementing %seen, it seems more useful to add the value, like so:
$seen{ $col_name } += $col_value;
In your while loop which iterates over INPUT1, do the same thing done in the first loop to extract data; also, don't use %seen2; instead, simply add to %seen as above:
my $col_name = $elements1[0];
my $col_value = $elements1[1];
$seen{$col_name} += $col_value;
Your totals will then be stored in %seen, so your final while loop is slightly modified:
while ( my ( $col_name, $times_seen2 ) = each %seen ) { # instead of %seen2
If your two processing loops are identical (and I see it's possible that they're not), then I'd suggest factoring them into a common subroutine. But that's a different matter.
The following could easily be adapted to just take file names from the command line instead.
Maintains the order of the keys in your file:
use strict;
use warnings;
use autodie;
my #names;
my %total;
local #ARGV = qw(INPUT INPUT1);
while (<>) {
my ($name, $val) = split;
push #names, $name if ! exists $total{$name};
$total{$name} += $val;
}
for (#names) {
print "$_ $total{$_}\n";
}

Perl read a file and an array and find common words

This is kind of a small issue and I hope you are able to help me. My code is probably rubbish. For an example, I have a file in which the only statement is John is the uncle of Sam. My Perl script should copy the file contents into an array. User should be able to input different names and search if those names are mentioned in the file. There should be an array with relationships like "uncle aunt, mother, father etc" in the program.
#use warnings;
use Array::Utils qw(:all);
print "Please enter the name of the file\n";
my $c = <STDIN>;
open(NEW,$c) or die "The file cannot be opened";
#d = <NEW>;
print #d, "\n";
#g = qw(aunt uncle father);
chomp #d;
chomp #g;
my $e;
my $f;
print "Please enter the name of the first person\n";
my $a = <STDIN>;
print "Please enter the name of the second person\n";
my $b = <STDIN>;
my #isect = intersect(#g, #d);
print #isect;
foreach(#d)
{
if ($a == $_)
{
$e = $a;
}
else
{
print "The first person is not mentioned in the article";
exit();
}
if ($b == $_)
{
$f = $b;
}
else
{
print "The second person is not mentioned in the article";
exit();
}
}
print $e;
print $f;
close(NEW);
This is something that I have done so far, the intersection is not giving the word uncle which is the word common in both arrays. The program is taking any random name and printing them. It is not saying that the name doesn't exist in the file when I enter a different name other than John and Sam
There are several problems:
You do not chomp $c. The filename contains a newline at the end.
You use the 2-argument form of open, but do not test the second argument. This is a security problem: do you know what happens if the user input contains > or |?
You use == to compare strings. String equality is tested with eq, though, == tests numbers.
Moreover, you do not want to know whether "Sam" equals to "John is the uncle of Sam". You want to know whether it is a part of it. You might need to use index or regular expressions to find out.
Do not use $a as the name of a variable, it is special (see perlvar).
Do not try to compare strings with ==! Use eq (equals) instead. Also you didnt chomp your input $a$b`. I think this is what you're trying to do:
#!/usr/bin/perl
use strict;
use warnings;
print "Please enter the name of the file\n";
my $c = <STDIN>;
open(NEW,$c) or die "The file cannot be opened";
my #d = <NEW>;
chomp #d;
my $e;
my $f;
print "Please enter the name of the first person\n";
my $aa = <STDIN>;
print "Please enter the name of the second person\n";
my $bb = <STDIN>;
chomp $aa;
chomp $bb;
my $pattern_a = quotemeta $aa;
my $pattern_b = quotemeta $bb;
foreach (#d){
if ($_ =~ /$pattern_a/){
$e = $aa;
}
elsif ($_ =~ /$pattern_b/){
$f = $bb;
}
}
close(NEW);
unless ($e){
print "First person not mentionend\n";
}
unless ($f){
print "Second person not mentioned\n";
}

loop goes to next occurence after one match

I have an array that I am using to match against another table. When I execute it, it grabs only the first occurrence. For example, if company1 is in my array, it will grab only the first instance of company1 and then go to the next search term, say company2. If there had been a company1.0 after company1, only company1 would be spit out. I want it to spit out company1 etc.\t company1.0 and so on on the same line, as there will be multiple matches between the two lists.
Here is my code:
my #attendees = ('company');
foreach $fbm (#attendees) {
open(RFILE, '<', "file.txt")
or die "no such file posf: $!";
while ( $line = <RFILE> )
{
if ($line =~ /$fbm/i)
{
print $fbm."\t". $line;
last;
}
if (eof(RFILE))
{
print "posf"."\n";
}
}
}
print STDERR "\n\nFINISHED!!";
My Input:
company1
company1.0
company1 also begins with 1 but different ending
company1 can i have this one too?
My output:company1
Desired output: company1\tcompany1.0\tcompany1 also begins with 1 but different ending\tcompany1 can i have this one too?
my #attendees = ('company');
my #whatever;
open ( my $fh, '<', "file.txt")
or die "could not open file.txt: $!";
while ( <$fh> ) {
chomp $_;
push #whatever, $_;
}
foreach my $attendee ( #attendees ) {
foreach my $thing ( #whatever ) {
if ($thing =~ /$attendee/i) {
print $attendee, "\t", $thing, "\n";
}
}
}
print STDERR "FINISHED!\n";
Perhaps this does what you want it to, but I must admit that I'm not quite sure.

Resources