Compare Arrays and Delete Arrays - arrays

I have below three sets ( arrays ) I need to perform an operation like this ( (A-B)UC ) on.
Can someone have the logic of this in Perl?
Here is my code I can able check for is B subset of A or not but I could not able to do "A-B":
my #array = (MAJOR,MINOR,MM,DD,YY);
my #exclude = (MM,MINOR,YY);
my #include = (LICENSE,VALID);
foreach (#exclude) {
if ( $_ ~~ #array ) {
print "\n $_ is defined in variables and it will be excluded \n";
#array = grep {!/\$_/} #array;
print "#array \n";
}
else {
print "\n $_ is not defined under variables please check the files \n";
exit 100;
}
}
foreach (#array){
print "$_ \n";
}
I suspect something is wrong in my logic with grep operation i.e. delete operation.

One problem with the grep is that $_ in the outer loop is redefined inside the grep block to each element of #array. You need to have different names. Also, your regex was lacking anchors; however, instead of a regex, just use string inequality. Try this:
my #array = qw(MAJOR MINOR MM DD YY);
my #exclude = qw(MM MINOR YY);
my #include = qw(LICENSE VALID);
foreach my $e (#exclude) {
if ( $e ~~ #array ) {
print "\n $e is defined in variables and it will be excluded \n";
#array = grep {$e ne $_} #array;
print "#array \n";
} else {
print "\n $e is not defined under variables please check the files \n";
exit 100;
}
}

use strict and warnings to alert you to many pitfalls in perl.
A hash is the most natural way to represent a set in perl.
use strict;
use warnings;
my #array = ('MAJOR','MINOR','MM','DD','YY');
my #exclude = ('MM','MINOR','YY');
my #include = ('LICENSE','VALID');
my %set;
# add #array to set
#set{#array} = ();
# remove #exclude
delete #set{#exclude};
# add #include
#set{#include} = ();
# array of elements resulting
my #result = sort keys %set;

You could use a set to do those kind of operations. I used a non-standard module Set::Scalar to help me with it:
#!/usr/bin/env perl
use warnings;
use strict;
use Set::Scalar;
my #array = qw(MAJOR MINOR MM DD YY);
my #exclude = qw(MM MINOR YY);
my #include = qw(LICENSE VALID);
my $array_set = Set::Scalar->new(#array);
my $exclude_set = Set::Scalar->new(#exclude);
my $include_set = Set::Scalar->new(#include);
my $result = $array_set->difference($exclude_set)->union($include_set);
use Data::Dumper;
print Dumper #$result;
Run it like:
perl script.pl
That yields:
$VAR1 = 'VALID';
$VAR2 = 'MAJOR';
$VAR3 = 'DD';
$VAR4 = 'LICENSE';

Related

Perl output successive string from array

I have an array I can print out as "abcd" however I am trying to print it as "a>ab>abc>abcd". I can't figure out the nested loop I need within the foreach loop I have. What loop do I need within it to print it this way?
my $str = "a>b>c>d";
my #words = split />/, $str;
foreach my $i (0 .. $#words) {
print $words[$i], "\n";
}
Thank you.
You had the right idea, but instead of printing the word at position i, you want to print all the words between positions 0 and i (inclusive). Also, your input can contain multiple strings, so loop over them.
use warnings;
while (my $str = <>) { # read lines from stdin or named files
chomp($str); # remove any trailing line separator
my #words = split />/, $str; # break string into array of words
foreach my $i (0 .. $#words) {
print join '', #words[0 .. $i]; # build the term from the first n words
print '>' if $i < $#words; # print separator between terms (but not at end)
}
print "\n";
}
There are many other ways to write it, but hopefully this way helps you understand what's happening and why. Good luck!
one liner:
perl -e '#a=qw(a b c d); for(#a) {$s.=($h.=$_).">"} $s=substr($s,0,-1);print $s'
I would do it like this:
#!/usr/bin/perl
use strict;
use warnings;
my $str = "a>b>c>d>e>f>g";
my #words = split />/, $str;
$" = '';
my #new_words;
push #new_words, "#words[0 .. $_]" for 0 .. $#words;
print join '>', #new_words;
A few things to explain.
Perl will expand array variables in a double-quoted string. So something like this:
#array = ('x', 'y', 'z');
print "#array";
will print x y z. Notice there are spaces between the elements. The string that is inserted between the elements is controlled by the $" variable. So by setting that variable to an empty string we can remove the spaces, so:
$" = '';
#array = ('x', 'y', 'z');
print "#array";
will print xyz.
The most complex line is:
push #new_words, "#words[0 .. $_]" for 0 .. $#words;
That's just a compact way to write:
for (0 .. $#words) {
my $new_word = "#words[0 .. $_]";
push #new_words, $new_word;
}
We iterate across the integers from zero to the last index in #words. Each time around the loop, we use an array slice to get a list of elements from the array, convert that to a string (by putting it in double-quotes) and then push that string onto #new_words.
This is what I ended up with, It's the only way I could understand and get the output I was looking for.
use strict;
use warnings;
my $str = "a>b>c>d>e>f>g";
my #words = split />/, $str;
my $j = $#words;
my $i = 0;
my #newtax;
while($i <= $#words){
foreach my $i (0 .. $#words - $j){
push (#new, $words[$i]);
}
if($i < $#words){
push(#new, ">");
}
$j--;
$i++;
}
print #new;
This output "a>ab>abc>abcd>abcde>abcdef>abcdefg"

loop through elements of array to find character perl

I have a perl array where I only want to loop through elements 2-8.
The elements are only meant to contain numbers, so if any of those elements contain a letter, I want to set an error flag = 1, as well as some other variables as seen.
The reason I have 2 error flag variables is due to scope rules within the loop.
fields is an array, I created by splitting another irrelevant array by the " " key.
So, when I try to print error_line2, error_fname2 from outside the loop, I get this:
Use of uninitialized value $error_flag2 in numeric eq (==)
I don't know why, because I've initialized the value within the loop and created the variable outside the loop.
Not sure if I'm even looping to find characters correctly, so then it's not setting the error_flag2 = 1.
Example line:
bob hankerman 2039 3232 23 232 645 64x3 324
since element 7 has the letter 'x' , I want the flag to be set to 1.
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
my $players_file = $ARGV[0];
my #players_array;
open (my $file, "<", "$players_file")
or die "Failed to open file: $!\n";
while(<$file>) {
chomp;
push #players_array, $_;
}
close $file;
#print join "\n", #players_array;
my $num_of_players = #players_array;
my $error_flag;
my $error_line;
my $error_fname;
my $error_lname;
my $error_flag2=1;
my $error_line2;
my $error_fname2;
my $error_lname2;
my $i;
foreach my $player(#players_array){
my #fields = split " ", $player;
my $size2 = #fields;
for($i=2; $i<9; $i++){
print "$fields[$i] \n";
if (grep $_ =~ /^[a-zA-Z]+$/){
my $errorflag2 = 1;
$error_flag2 = $errorflag2;
my $errorline2 = $player +1;
$error_line2 = $errorline2;
my $errorfname2 = $fields[0];
$error_fname2 = $errorfname2;
}
}
if ($size2 == "9" ) {
my $firstname = $fields[0];
my $lastname = $fields[1];
my $batting_average = ($fields[4]+$fields[5]+$fields[6]+$fields[7]) / $fields[3];
my $slugging = ($fields[4]+($fields[5]*2)+($fields[6]*3)+($fields[7]*4)) / $fields[3];
my $on_base_percent = ($fields[4]+$fields[5]+$fields[6]+$fields[7] +$fields[8]) / $fields[2];
print "$firstname ";
print "$lastname ";
print "$batting_average ";
print "$slugging ";
print "$on_base_percent\n ";
}
else {
my $errorflag = 1;
$error_flag = $errorflag;
my $errorline = $player +1;
$error_line = $errorline;
my $errorfname = $fields[0];
$error_fname = $errorfname;
my $errorlname = $fields[1];
$error_lname = $errorlname;
}
}
if ($error_flag == "1"){
print "\n Line $error_line : ";
print "$error_fname, ";
print "$error_lname :";
print "Line contains not enough data.\n";
}
if ($error_flag2 == "1"){
print "\n Line $error_line2 : ";
print "$error_fname2, ";
print "Line contains bad data.\n";
}
OK, so the problem you've got here is that you're thinking of grep in Unix terms - a text based thing. It doesn't work like that in perl - it operates on a list.
Fortunately, this is pretty easy to handle in your case, because you can split your line into words.
Without your source data, this is hopefully a proof of concept:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
while ( <DATA> ) {
#split the current line on whitespace into an array.
#first two elements get assigned to firstname lastname, and then the rest
#goes into #values
my ( $firstname, $lastname, #values ) = split; #works on $_ implicitly.
#check every element in #values, and test the regex 'non-digit' against it.
my #errors = grep { /\D/ } #values;
#output any matches e.g. things that contained 'non-digits' anywhere.
print Dumper \#errors;
#an array in a scalar context evaluates as the number of elements.
#we need to use "scalar" here because print accepts list arguments.
print "There were ", scalar #errors, " errors\n";
}
__DATA__
bob hankerman 2039 3232 23 232 645 64x3 324
Or reducing down the logic:
#!/usr/bin/perl
use strict;
use warnings;
while ( <DATA> ) {
#note - we don't need to explicity specify 'scalar' here,
#because assigning it to a scalar does that automatically.
#(split) splits the current line, and [2..8] skips the first two.
my $count_of_errors = grep { /\D/ } (split)[2..8];
print $count_of_errors;
}
__DATA__
bob hankerman 2039 3232 23 232 645 64x3 324
First : You don't need to use "GREP", Simply you can match the string with "=~" in perl and you can print matched value with $&.
Second : You should use $_ if and only if there is not other variable used in the loop. There is already $i used in the loop, you can write the loop as :
for my $i (2..9) {
print "$i\n";
}
or
foreach(2..9) {
print "$_\n";
}

Perl: Inserting values into specific columns of CSV file

I have CSV data of the form:
S.No,Label,Customer1,Customer2,Customer3...
1,label1,Y,N,Y
2,label2,N,Y,N
...
I need to reproduce the "label" to the left of "customer" columns marked with Y - and have nothing ("") to the left of columns marked with N.
Expected output:
S.No,Label,Customer1,Customer1,Customer2,Customer2,Customer3,Customer3...
1,label1,label1,Y,"",N,label1,Y
2,label2,"",N,label2,Y,"",N
When opened using Excel, it would look like this:
S.No Label Customer1 Customer1 Customer2 Customer2 Customer3 Customer3...
1 label1 label1 Y N label1 Y
2 label2 N label2 Y N
The two leftmost columns, referring to S.No and the original "Label" column, are constant.
What is the simplest way to do this? I tried the following code:
use strict;
use warnings;
my $nonIncludesFile = "nonIncludes.csv";
open(my $xfh, "+>", $nonIncludesFile) or warn "Unable to open $nonIncludesFile, $!";
chomp( my $header = <$xfh> );
my #names = split ",", $header;
my #names1;
my #fields;
my #fields1;
for(my $j=0; $j< scalar(#names); $j++)
{
$names1[$j] = $names[$j];
}
while(<$xfh>)
{
my $nonIncLine = $_;
$nonIncLine = chomp($nonIncLine);
#fields = split ",", $nonIncLine;
next if $. == 1; #skip the first line
for(my $i = 0; $i < scalar(#fields) -2; $i++) #Number of "customers" = scalar(#fields) -2
{
$fields1[0] = $fields[0];
$fields1[1] = $fields[1];
if('Y' eq $fields[ $i + 2 ])
{
$fields1[$i+2] = 'Y';
substr(#fields1, $i + 1, 0, $fields[1]); #insert the label to the left - HERE
}
else
{
$fields1[$i+2] = 'N';
substr(#fields1, $i + 1, 0, "");
}
}
}
print $xfh #names1;
print $xfh #fields1;
close($xfh);
This however complains of "substr outside of string" at the line marked by "HERE".
What am I doing wrong? And is there any simpler (and better) way to do this?
Something like this maybe?
#!/usr/bin/perl
use strict;
use warnings;
#read the header row
chomp( my ( $sn, $label, #customers ) = split( /,/, <DATA> ) );
#double the 'customers' column headings (one is suffixed "_label")
print join( ",", $sn, $label, map { $_ . "_label", $_ } #customers ), "\n";
#iterate data
while (<DATA>) {
#strip trailing linefeed
chomp;
#extract fields with split - note breaks if you've quoted commas inline.
my ( $sn, $label, #row ) = split /,/;
print "$sn,$label,";
#iterate Y/N values, and either prints "Y" + label, or anything else + blank.
foreach my $value (#row) {
print join( ",", $value eq "Y" ? $label : "", $value ),",";
}
print "\n";
}
__DATA__
S.No,Label,Customer1,Customer2,Customer3
1,label1,Y,N,Y
2,label2,N,Y,N
Assumes you don't have any fruity special characters (e.g. commas) in the fields, because it'll break if you do, and you might want to consider Text::CSV instead.
It is always much better to post some usable test data than write a something like this question
However, it looks like your data has no quoted fields or escaped characters, so it looks like you can just use split and join to process the CSV data
Here's a sample Perl program that fulfils your requirement. The example output uses your data as it is. Each line of data has to be processed backwards so that the insertions don't affect the indices of elements that are yet to be processed
use strict;
use warnings 'all';
use feature 'say';
while ( <DATA> ) {
chomp;
my #fields = split /,/;
for ( my $i = $#fields; $i > 1; --$i ) {
my $newval =
$. == 1 ? $fields[$i] :
lc $fields[$i] eq 'y' ? $fields[1] :
'';
splice #fields, $i, 0, $newval;
}
say join ',', #fields;
}
__DATA__
S.No,Label,Customer1,Customer2,Customer3...
1,label1,Y,N,Y
2,label2,N,Y,N
output
S.No,Label,Customer1,Customer1,Customer2,Customer2,Customer3...,Customer3...
1,label1,label1,Y,,N,label1,Y
2,label2,,N,label2,Y,,N

Comparing two strings line by line in Perl

I am looking for code in Perl similar to
my #lines1 = split /\n/, $str1;
my #lines2 = split /\n/, $str2;
for (int $i=0; $i<lines1.length; $i++)
{
if (lines1[$i] ~= lines2[$i])
print "difference in line $i \n";
}
To compare two strings line by line and show the lines at which there is any difference.
I know what I have written is mixture of C/Perl/Pseudo-code. How do I write it in the way that it works on Perl?
What you have written is sort of ok, except you cannot use that notation in Perl lines1.length, int $i, and ~= is not an operator, you mean =~, but that is the wrong tool here. Also if must have a block { } after it.
What you want is simply $i < #lines1 to get the array size, my $i to declare a lexical variable, and eq for string comparison. Along with if ( ... ) { ... }.
Technically you can use the binding operator to perform a string comparison, for example:
"foo" =~ "foobar"
But it is not a good idea when comparing literal strings, because you can get partial matches, and you need to escape meta characters. Therefore it is easier to just use eq.
Using C-style for loops is valid, but the more Perl-ish way is to use this notation:
for my $i (0 .. $#lines1)
Which will iterate over the range 0 to the max index of the array.
Perl allows you to open filehandles on strings by using a reference to the scalar variable that holds the string:
open my $string1_fh, '<', \$string1 or die '...';
open my $string2_fh, '<', \$string2 or die '...';
while( my $line1 = <$string1_fh> ) {
my $line2 = <$string2_fh>;
....
}
But, depending on what you mean by difference (does that include insertion or deletion of lines?), you might want something different.
There are several modules on CPAN that you can inspect for ideas, such as Test::LongString or Algorithm::Diff.
my #lines1 = split(/^/, $str1);
my #lines2 = split(/^/, $str2);
# splits at start of line
# use /\n/ if you want to ignore newline and trailing spaces
for ($i=0; $i < #lines1; $i++) {
print "difference in line $i \n" if (lines1[$i] ne lines2[$i]);
}
Comparing Arrays is a way easier if you create a Hashmap out of it...
#Searching the difference
#isect = ();
#diff = ();
%count = ();
foreach $item ( #array1, #array2 ) { $count{$item}++; }
foreach $item ( keys %count ) {
if ( $count{$item} == 2 ) {
push #isect, $item;
}
else {
push #diff, $item;
}
}
#Output
print "Different= #diff\n\n";
print "\nA Array = #array1\n";
print "\nB Array = #array2\n";
print "\nIntersect Array = #isect\n";
Even after spliting you could compare them as Array.

regex matching I think

First sorry if I should have added this to my earlier question today, but I now have the below code and am having problems getting things to add up to 100...
use strict;
use warnings;
my #arr = map {int( rand(49) + 1) } ( 1..100 ); # build an array of 100 random numbers between 1 and 49
my #count2;
foreach my $i (1..49) {
my #count = join(',', #arr) =~ m/,$i,/g; # ???
my $count1 = scalar(#count); # I want this $count1 to be the number of times each of the numbers($i) was found within the string/array.
# push(#count2, $count1 ." times for ". $i); # pushing a "number then text and a number / scalar, string, scalar" to an array.
push(#count2, [$count1, $i]);
}
#sort #count2 and print the top 7
my #sorted = sort { $b->[0] <=> $a->[0] } #count2;
my $sum = 0;
foreach my $i (0..$#sorted) { # (0..6)
printf "%d times for %d\n", $sorted[$i][0], $sorted[$i][1];
$sum += $sorted[$i][0]; # try to add up/sum all numbers in the first coloum to make sure they == 100
}
print "Generated $sum random numbers.\n"; # doesn't add up to 100, I think it is because of the regex and because the first number doesn't have a "," in front of it
# seem to be always 96 or 97, 93...
Replace these two lines:
my #count = join(',', #arr) =~ m/,$i,/g; # ???
my $count1 = scalar(#count); # I want this $count1 to be the number of times each of the numbers($i) was found within the string/array.
with this:
my $count1 = grep { $i == $_ } #arr;
grep will return a list of elements where only the expression in {} evaluates to true. This is less error-prone and much more efficient than joining the entire array and using a a regex. Also note that scalar is not necessary since the variable $count1 is scalar, so perl will return the result of grep in scalar context.
You can also get rid of this line:
push(#count2, $count1 ." times for ". $i); # pushing a "number then text and a number / scalar, string, scalar" to an array.
since you are already printing the same information in your last foreach loop.
#!/usr/bin/perl
use strict; use warnings;
use YAML;
my #arr;
$#arr = 99;
my %counts;
for my $i (0 .. 99) {
my $n = int(rand(49) + 1);
$arr[ $i ] = $n;
++$counts{ $n };
}
my #result = map [$_, $counts{$_}],
sort {$counts{$a} <=> $counts{$b} }
keys %counts;
my $sum;
$sum += $_->[1] for #result;
print "Number of draws: $sum\n";
You can probably reuse some well-tested code from List::MoreUtils.
use List::MoreUtils qw/ indexes /;
...
foreach my $i (1..49) {
my #indexes = indexes { $_ == $i } #arr;
my $count1 = scalar( #indexes );
push( #count2, [ $count1, $i ] );
}
If you don't need the warns in the sum loop, then I'd recommend using sum from List:Util.
use List::Util qw/ sum /;
...
my $sum = sum map { $_->[0] } #sorted;
If you insist on the loop, rewrite it as:
foreach my $i ( #sorted ) {

Resources