Perl while loop won't exit when blank line is entered - arrays

I am trying to print the sum, maximum, and minimum values of list of numbers but I am struggling to get it working.
When I press Enter the loop should exit but the program keeps running
use strict;
use warnings;
my #items;
my ( $sum, $max, $min );
while ( chomp( my $num = <STDIN> ) ) {
last if ( $num eq '\n' );
$max ||= $num;
$min ||= $num;
$sum += $num;
$max = $num if ( $num > $max );
$min = $num if ( $num < $min );
push( #items, $num );
}
printf( "Entered numbers are: %s \n", join( ', ', #items ) );
print( "Sum of all numbers is : ", $sum );
print "\n";
print( "Minimum number is : ", $min );
print "\n";
print( "Maximum number is : ", $max )

You can't use chomp inside a while condition like this
while (chomp(my $num = <STDIN>)) { ... }
because the while loop needs to terminate when the <> returns undef at end of file. So you must put the chomp as the first statement of the loop
The simplest way to exit a loop like this is to check whether the input contains any non-space characters using the regular expression /\S/
The check
last if ( $num eq '\n' )
won't work because you have used chomp to remove the newline from the input. Also, if you use single quotes '\n' is the two-character string \ followed by n. You need double quotes like "\n" to create a newline
When a scalar variable is first declared it has the value undef, so you can avoid any clumsy initialisation by testing for this and updating $min and $max unless the previous value is already defined and higher (or lower) than the new value
I would rewrite your program like this.
use strict;
use warnings;
my #items;
my ($sum, $max, $min);
while (my $num = <STDIN>) {
chomp $num;
last unless $num =~ /\S/;
$max = $num unless defined $max and $max >= $num;
$min = $num unless defined $min and $min <= $num;
$sum += $num;
push #items, $num;
}
print 'Entered numbers are: ', join(', ', #items), "\n";
print "Sum of all numbers is: $sum\n";
print "Minimum number is: $min\n";
print "Maximum number is: $max\n";

If you want to detect Enter, you can't chomp the input before. Also, '\n' is not the enter, you have to use double quotes "\n" to enable special characters.
The while (<>) loop is usually ended by Ctrl+D.
Moreover, you can find a module to do the work for you.
#!/usr/bin/perl
use warnings;
use strict;
use List::Util qw{ max min sum };
my #items = <>;
chomp #items;
my $max = max(#items);
my $min = min(#items);
my $sum = sum(#items);
print "Entered numbers are: #items\n";
print "Sum of all numbers is: $sum\n";
print "Minimum number is: $min\n";
print "Maximum number is: $max";

Related

loop through elements of array to find character perl

I have a perl array where I only want to loop through elements 2-8.
The elements are only meant to contain numbers, so if any of those elements contain a letter, I want to set an error flag = 1, as well as some other variables as seen.
The reason I have 2 error flag variables is due to scope rules within the loop.
fields is an array, I created by splitting another irrelevant array by the " " key.
So, when I try to print error_line2, error_fname2 from outside the loop, I get this:
Use of uninitialized value $error_flag2 in numeric eq (==)
I don't know why, because I've initialized the value within the loop and created the variable outside the loop.
Not sure if I'm even looping to find characters correctly, so then it's not setting the error_flag2 = 1.
Example line:
bob hankerman 2039 3232 23 232 645 64x3 324
since element 7 has the letter 'x' , I want the flag to be set to 1.
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
my $players_file = $ARGV[0];
my #players_array;
open (my $file, "<", "$players_file")
or die "Failed to open file: $!\n";
while(<$file>) {
chomp;
push #players_array, $_;
}
close $file;
#print join "\n", #players_array;
my $num_of_players = #players_array;
my $error_flag;
my $error_line;
my $error_fname;
my $error_lname;
my $error_flag2=1;
my $error_line2;
my $error_fname2;
my $error_lname2;
my $i;
foreach my $player(#players_array){
my #fields = split " ", $player;
my $size2 = #fields;
for($i=2; $i<9; $i++){
print "$fields[$i] \n";
if (grep $_ =~ /^[a-zA-Z]+$/){
my $errorflag2 = 1;
$error_flag2 = $errorflag2;
my $errorline2 = $player +1;
$error_line2 = $errorline2;
my $errorfname2 = $fields[0];
$error_fname2 = $errorfname2;
}
}
if ($size2 == "9" ) {
my $firstname = $fields[0];
my $lastname = $fields[1];
my $batting_average = ($fields[4]+$fields[5]+$fields[6]+$fields[7]) / $fields[3];
my $slugging = ($fields[4]+($fields[5]*2)+($fields[6]*3)+($fields[7]*4)) / $fields[3];
my $on_base_percent = ($fields[4]+$fields[5]+$fields[6]+$fields[7] +$fields[8]) / $fields[2];
print "$firstname ";
print "$lastname ";
print "$batting_average ";
print "$slugging ";
print "$on_base_percent\n ";
}
else {
my $errorflag = 1;
$error_flag = $errorflag;
my $errorline = $player +1;
$error_line = $errorline;
my $errorfname = $fields[0];
$error_fname = $errorfname;
my $errorlname = $fields[1];
$error_lname = $errorlname;
}
}
if ($error_flag == "1"){
print "\n Line $error_line : ";
print "$error_fname, ";
print "$error_lname :";
print "Line contains not enough data.\n";
}
if ($error_flag2 == "1"){
print "\n Line $error_line2 : ";
print "$error_fname2, ";
print "Line contains bad data.\n";
}
OK, so the problem you've got here is that you're thinking of grep in Unix terms - a text based thing. It doesn't work like that in perl - it operates on a list.
Fortunately, this is pretty easy to handle in your case, because you can split your line into words.
Without your source data, this is hopefully a proof of concept:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
while ( <DATA> ) {
#split the current line on whitespace into an array.
#first two elements get assigned to firstname lastname, and then the rest
#goes into #values
my ( $firstname, $lastname, #values ) = split; #works on $_ implicitly.
#check every element in #values, and test the regex 'non-digit' against it.
my #errors = grep { /\D/ } #values;
#output any matches e.g. things that contained 'non-digits' anywhere.
print Dumper \#errors;
#an array in a scalar context evaluates as the number of elements.
#we need to use "scalar" here because print accepts list arguments.
print "There were ", scalar #errors, " errors\n";
}
__DATA__
bob hankerman 2039 3232 23 232 645 64x3 324
Or reducing down the logic:
#!/usr/bin/perl
use strict;
use warnings;
while ( <DATA> ) {
#note - we don't need to explicity specify 'scalar' here,
#because assigning it to a scalar does that automatically.
#(split) splits the current line, and [2..8] skips the first two.
my $count_of_errors = grep { /\D/ } (split)[2..8];
print $count_of_errors;
}
__DATA__
bob hankerman 2039 3232 23 232 645 64x3 324
First : You don't need to use "GREP", Simply you can match the string with "=~" in perl and you can print matched value with $&.
Second : You should use $_ if and only if there is not other variable used in the loop. There is already $i used in the loop, you can write the loop as :
for my $i (2..9) {
print "$i\n";
}
or
foreach(2..9) {
print "$_\n";
}

Simplify long if-else chain with regex by comparing with array

for (my $j = 0; $j <100000; $j++){
my $outcome = rand();
for (my $k = 0; $k < #cum_sum; $k++){
if ($cum_sum[$k] >= $outcome){
if ($keys[$k] =~ m/\"|\/|\<|\>|\\|\`|\~|\#|\#|\$|\%|\^|\*|[0-9]/) {
print $out "";
}
if ($keys[$k] =~ m/\s/){
print $out " ";
}
elsif ($keys[$k] =~ m/\&/){
print $out " and ";
}
elsif ($keys[$k] =~ m/\!/){
print $out "! ";
}
elsif ($keys[$k] =~ m/\:/){
print $out ": ";
}
elsif ($keys[$k] =~ m/\'/){
print $out "' ";
}
elsif ($keys[$k] =~ m/\./){
print $out ". ";
}
elsif ($keys[$k] =~ m/\;/){
print $out "; ";
}
elsif ($keys[$k] =~ m/\?/){
print $out "? ";
}
elsif ($keys[$k] =~ m/\,/){
print $out ", ";
}
else {
print $out "$keys[$k]";
}
last;
}
}
# print "$outcome\n";
}
I mostly need help with simplifying the long chain of elsif statements I have.
The logic in the outer for loops works.
#keys is an array of two character (digrams) strings.
I am trying to make the elsif statements more efficient by comparing the digrams from #key to an array of the punctuation marks #punctuation = qw(! : " ' ; ? , .)
Then, if the digram does contain one of the punctuation marks in the punctuation array, the digram gets changed to "punctuation_mark " e.g. "a!" -> "! "
The end result would be that I do not use regexes for any of #punctuation.
However, I am not sure on how to implement this change.
Thank you!
In order to simplify your code, it may have sense to use the Tie::RegexpHash CPAN module. The main idea is build a hash with regexpr as keys, so you easily find the related values by matching:
use Tie::RegexpHash;
my $rehash = Tie::RegexpHash->new();
$rehash->add( qr/\s/, " " );
$rehash->add( qr/\&/, " and " );
#...
my $value = $rehash->match( "&" ); # $value <-- " and "
Use alternation, just like in the first regex, but capture the match and use $1 to replace the word
my $re_punct = join '|', map { quotemeta } qw(& ! : ' . ; ? ,); #'
for my $j (0..99_999) {
my $outcome = rand();
for my $k (0..$#cum_sum) {
...
if ($keys[$k] =~ /($re_punct)/) {
if ($1 eq '&') { $keys[$k] = " and " }
else { $keys[$k] = "$1 " }
}
...
}
}
Comments
quotemeta escapes by \ all "ASCII non-"word" characters"
syntax: $#ary is the index of the last element in #ary, just right for looping over array index
for my $i ($beg .. $end) is much clearer than the equivalent C-style for loop†
The presented logic leaves a question: what if both characters are punctuation?
Note on your idea to do it "By Comparing With Array"
You could use List::MoreUtils::first_value, for instance. It would go like
use List::MoreUtils qw(first_value);
my #punc = map { quotemeta } qw(& ! : ' . ; ? ,); #'
foreach my $word (#words) {
if (my $match = first_value { $word =~ /$_/ } #punct) {
$word = $match;
}
}
The first_value (or firstval) returns the first element of #punct for which the block returns true, and undef if none do. The $word aliases the currently processed element of #words and changing it changes the array element; so you get your replacement.
However, you still have to deal with regex and escape (at least some of) punctuation. Thus I see no advantage in going to this trouble; the "straight-up" regex is far clearer in this case.
† Even compiled languages evolve this way. The C++11 introduced the range-based for loop
for (auto var: container) ... // (really, const auto&), or auto&, or auto&&
and a standard reference linked above says
Used as a more readable equivalent to the traditional for loop [...]
In Perl this is how things are done; just use it.
So, looking at it you have two cases:
One where you replace a set of values, with 'the value and a space'. And another where you replace with something different.
So how about creating a lookup table for each, and processing just two regexes:
#!/usr/bin/env perl.
use strict;
use warnings;
use Data::Dumper;
my %replace = (
'"' => "",
'/' => "",
'&' => " and ",
);
my #add_space = ( ',', '?', ';', '.', "'", ':', '!' );
my $search = join ( "|", map { quotemeta } keys %replace );
$search = qr/($search)/;
my $add_space_after = join "|", map {quotemeta} #add_space;
$add_space_after = qr/($add_space_after)/;
while ( <DATA> ) {
s/\s/ /g;
s/$search/$replace{$1}/;
s/$add_space_after/$1 /;
print;
}
__DATA__
Work:Work
cookies&milk;wordhere
why?are;you/so "sad"
This gives us what you want I think, and - hopefully - keeps the code pretty conscise.
Of importance is the quotemeta function here, because it ensures your metachars are escaped before regexing them.
Note - the only one of your examples this doesn't handle is the \s to " ". But that's IMO probably best to write separately for clarity, as obviously youc can't quotemeta it.

Pulling out potentially overlapping subsets of elements in array to make smaller arrays

My input file looks like below (real one is much larger):
rs3683945_mark 0
rs6336442_mark 1E-07
rs31328150_impute 0.444121193
rs3658242_mark 0.444121293
rs39342374_impute 0.444121393
IMP!1! 1
rs3677817_mark 1.986015679
IMP!2! 2
SNP117_impute 2.685815665
IMP!3! 3
SNP3_1_impute 3.643119709
SNP1_impute 3.643119809
rs13475706_mark 3.643119909
13 lines, two elements each line. First element is a name. Each name ends either with a "tag" _mark or impute, or there is no tag. The point of the tag is to distinguish between types of names, which form the basis of my search for subsets within the entire list.
The subsets begin with a _mark name that immediately precedes an instance of an _impute name. The subsets end with the very next instance of _mark. All names in between, which will necessarily not have any such tag, also go into a subset, which I'd like to collect into an array and send off to a subroutine to process (details of that not important). Please note, the positions with IMP in the name are not the same as those actually tagged with a _impute.
For example, with the above, the first useable subset is:
rs6336442_mark 1E-07
rs31328150_impute 0.444121193
rs3658242_mark 0.444121293
The second useable subset is:
rs3658242_mark 0.444121293
rs39342374_impute 0.444121393
IMP!1! 1
rs3677817_mark 1.986015679
and so on... EDIT: Note that last _mark name of the first set is the first _mark name of the second.
My code for this:
#!/usr/bin/perl
use strict; use warnings;
my $usage = "usage: merge_impute.pl {genotype file} {distances file} \n";
die $usage unless #ARGV == 2;
my $genotypes = $ARGV[0];
open (FILE, "<$genotypes");
my #genotypes = <FILE>;
close FILE;
my $distances = $ARGV[1];
open (DISTS, "<$distances");
my #distances = <DISTS>;
close DISTS;
my #workingset = ();
#print scalar #distances;
for ( my $i = 0; $i < scalar #distances; $i++ ){
chomp $distances[$i];
#print "$distances[$i]\n";
if ( $distances[$i] =~ m/impute/ ){
push ( #workingset,$distances[$i-1],$distances[$i],$distances[$i+1]);
}
print "i=$i: #workingset\n";
# at this point send off to sub routine
#workingset=();
}
As you can see, the if loop is only set up to find subsets that contain only one _impute name. How can I modify the code so that a subset will "fill up" with as many names as required until we arrive at the next _mark name?
EDIT: Perhaps instead of the for loop, I could something like...
push (#workingset, $distances[0], $distances[1] );
until ( $distance[ ??? ] =~ m/_mark/ ){
push ( #workingset, $distance[ ??? ] );
}
But what could $distances[ ??? ] be?
EDIT: Or an alternative for loop...
push (#workingset, $distances[0] );
for ( my $i = 1; $i < scalar #distances - 1 ; $i++ ){
until ( $distances[ $i ] =~ m/_mark/ ){
push ( #workingset, $distances[ $i ] );
# send #workingset to sub routine
#clear workingset
#workingset = ();
}
}
Though this isn't working.
I also tried...
push (#workingset, $distances[0] );
for ( my $i = 1; $i < scalar #distances - 1 ; $i++ ){
until ( $distances[ $i ] =~ m/_mark/ ){
push ( #workingset, $distances[ $i ] );
next if $distances[ $i+1 ] !~ /_mark/;
}
# send #workingset to sub routine here
print "i=$i, #workingset\n\n";
#clear workingset
#workingset = ();
}
I don't have a lot of time right now but I'll hopefully have some time in the morning to check back. Here's a quick example on how you could do it (it is meant to be simple and easy to understand, not fancy). Hopefully it helps you get on the right track for parsing the data.
use strict;
use warnings;
my $first_mark;
my #workingset = ();
my $second_mark;
while (<DATA>){
chomp;
if ( /_mark/ and scalar #workingset == 0 ) {
$first_mark = $_;
} elsif ( /IMP|_impute/ and defined $first_mark) {
push #workingset, $_;
} elsif ( /_mark/ and defined $first_mark) {
$second_mark = $_;
print "Found valid set: ";
print "$first_mark," . join(",", #workingset) . ",$second_mark\n";
#workingset = ();
$first_mark = $second_mark;
undef $second_mark;
}
}
__DATA__
rs3683945_mark 0
rs6336442_mark 1E-07
rs31328150_impute 0.444121193
rs3658242_mark 0.444121293
rs39342374_impute 0.444121393
IMP!1! 1
rs3677817_mark 1.986015679
IMP!2! 2
SNP117_impute 2.685815665
IMP!3! 3
SNP3_1_impute 3.643119709
SNP1_impute 3.643119809
rs13475706_mark 3.643119909
Output:
Found valid set: rs6336442_mark 1E-07,rs31328150_impute 0.444121193,rs3658242_mark 0.444121293
Found valid set: rs3658242_mark 0.444121293,rs39342374_impute 0.444121393,IMP!1! 1,rs3677817_mark 1.986015679
Found valid set: rs3677817_mark 1.986015679,IMP!2! 2,SNP117_impute 2.685815665,IMP!3! 3,SNP3_1_impute 3.643119709,SNP1_impute 3.643119809,rs13475706_mark 3.643119909

Compare Arrays and Delete Arrays

I have below three sets ( arrays ) I need to perform an operation like this ( (A-B)UC ) on.
Can someone have the logic of this in Perl?
Here is my code I can able check for is B subset of A or not but I could not able to do "A-B":
my #array = (MAJOR,MINOR,MM,DD,YY);
my #exclude = (MM,MINOR,YY);
my #include = (LICENSE,VALID);
foreach (#exclude) {
if ( $_ ~~ #array ) {
print "\n $_ is defined in variables and it will be excluded \n";
#array = grep {!/\$_/} #array;
print "#array \n";
}
else {
print "\n $_ is not defined under variables please check the files \n";
exit 100;
}
}
foreach (#array){
print "$_ \n";
}
I suspect something is wrong in my logic with grep operation i.e. delete operation.
One problem with the grep is that $_ in the outer loop is redefined inside the grep block to each element of #array. You need to have different names. Also, your regex was lacking anchors; however, instead of a regex, just use string inequality. Try this:
my #array = qw(MAJOR MINOR MM DD YY);
my #exclude = qw(MM MINOR YY);
my #include = qw(LICENSE VALID);
foreach my $e (#exclude) {
if ( $e ~~ #array ) {
print "\n $e is defined in variables and it will be excluded \n";
#array = grep {$e ne $_} #array;
print "#array \n";
} else {
print "\n $e is not defined under variables please check the files \n";
exit 100;
}
}
use strict and warnings to alert you to many pitfalls in perl.
A hash is the most natural way to represent a set in perl.
use strict;
use warnings;
my #array = ('MAJOR','MINOR','MM','DD','YY');
my #exclude = ('MM','MINOR','YY');
my #include = ('LICENSE','VALID');
my %set;
# add #array to set
#set{#array} = ();
# remove #exclude
delete #set{#exclude};
# add #include
#set{#include} = ();
# array of elements resulting
my #result = sort keys %set;
You could use a set to do those kind of operations. I used a non-standard module Set::Scalar to help me with it:
#!/usr/bin/env perl
use warnings;
use strict;
use Set::Scalar;
my #array = qw(MAJOR MINOR MM DD YY);
my #exclude = qw(MM MINOR YY);
my #include = qw(LICENSE VALID);
my $array_set = Set::Scalar->new(#array);
my $exclude_set = Set::Scalar->new(#exclude);
my $include_set = Set::Scalar->new(#include);
my $result = $array_set->difference($exclude_set)->union($include_set);
use Data::Dumper;
print Dumper #$result;
Run it like:
perl script.pl
That yields:
$VAR1 = 'VALID';
$VAR2 = 'MAJOR';
$VAR3 = 'DD';
$VAR4 = 'LICENSE';

regex matching I think

First sorry if I should have added this to my earlier question today, but I now have the below code and am having problems getting things to add up to 100...
use strict;
use warnings;
my #arr = map {int( rand(49) + 1) } ( 1..100 ); # build an array of 100 random numbers between 1 and 49
my #count2;
foreach my $i (1..49) {
my #count = join(',', #arr) =~ m/,$i,/g; # ???
my $count1 = scalar(#count); # I want this $count1 to be the number of times each of the numbers($i) was found within the string/array.
# push(#count2, $count1 ." times for ". $i); # pushing a "number then text and a number / scalar, string, scalar" to an array.
push(#count2, [$count1, $i]);
}
#sort #count2 and print the top 7
my #sorted = sort { $b->[0] <=> $a->[0] } #count2;
my $sum = 0;
foreach my $i (0..$#sorted) { # (0..6)
printf "%d times for %d\n", $sorted[$i][0], $sorted[$i][1];
$sum += $sorted[$i][0]; # try to add up/sum all numbers in the first coloum to make sure they == 100
}
print "Generated $sum random numbers.\n"; # doesn't add up to 100, I think it is because of the regex and because the first number doesn't have a "," in front of it
# seem to be always 96 or 97, 93...
Replace these two lines:
my #count = join(',', #arr) =~ m/,$i,/g; # ???
my $count1 = scalar(#count); # I want this $count1 to be the number of times each of the numbers($i) was found within the string/array.
with this:
my $count1 = grep { $i == $_ } #arr;
grep will return a list of elements where only the expression in {} evaluates to true. This is less error-prone and much more efficient than joining the entire array and using a a regex. Also note that scalar is not necessary since the variable $count1 is scalar, so perl will return the result of grep in scalar context.
You can also get rid of this line:
push(#count2, $count1 ." times for ". $i); # pushing a "number then text and a number / scalar, string, scalar" to an array.
since you are already printing the same information in your last foreach loop.
#!/usr/bin/perl
use strict; use warnings;
use YAML;
my #arr;
$#arr = 99;
my %counts;
for my $i (0 .. 99) {
my $n = int(rand(49) + 1);
$arr[ $i ] = $n;
++$counts{ $n };
}
my #result = map [$_, $counts{$_}],
sort {$counts{$a} <=> $counts{$b} }
keys %counts;
my $sum;
$sum += $_->[1] for #result;
print "Number of draws: $sum\n";
You can probably reuse some well-tested code from List::MoreUtils.
use List::MoreUtils qw/ indexes /;
...
foreach my $i (1..49) {
my #indexes = indexes { $_ == $i } #arr;
my $count1 = scalar( #indexes );
push( #count2, [ $count1, $i ] );
}
If you don't need the warns in the sum loop, then I'd recommend using sum from List:Util.
use List::Util qw/ sum /;
...
my $sum = sum map { $_->[0] } #sorted;
If you insist on the loop, rewrite it as:
foreach my $i ( #sorted ) {

Resources