Perl: Removing array items and resizing the array - arrays

I’m trying to filter an array of terms using another array in Perl. I have Perl 5.18.2 on OS X, though the behavior is the same if I use 5.010. Here’s my basic setup:
#!/usr/bin/perl
#use strict;
my #terms = ('alpha','beta test','gamma','delta quadrant','epsilon',
'zeta','eta','theta chi','one iota','kappa');
my #filters = ('beta','gamma','epsilon','iota');
foreach $filter (#filters) {
for my $ind (0 .. $#terms) {
if (grep { /$filter/ } $terms[$ind]) {
splice #terms,$ind,1;
}
}
}
This works to pull out the lines that match the various search terms, but the array length doesn’t change. If I write out the resulting #terms array, I get:
[alpha]
[delta quadrant]
[zeta]
[eta]
[theta chi]
[kappa]
[]
[]
[]
[]
As you might expect from that, printing scalar(#terms) gets a result of 10.
What I want is a resulting array of length 6, without the four blank items at the end. How do I get that result? And why isn’t the array shrinking, given that the perldoc page about splice says, “The array grows or shrinks as necessary.”?
(I’m not very fluent in Perl, so if you’re thinking “Why don’t you just...?”, it’s almost certainly because I don’t know about it or didn’t understand it when I heard about it.)

You can always regenerate the array minus things you don't want. grep acts as a filter allowing you to decide which elements you want and which you don't:
#!/usr/bin/perl
use strict;
my #terms = ('alpha','beta test','gamma','delta quadrant','epsilon',
'zeta','eta','theta chi','one iota','kappa');
my #filters = ('beta','gamma','epsilon','iota');
my %filter_exclusion = map { $_ => 1 } #filters;
my #filtered = grep { !$filter_exclusion{$_} } #terms;
print join(',', #filtered) . "\n";
It's pretty easy if you have a simple structure like %filter_exclusion on hand.
Update: If you want to allow arbitrary substring matches:
my $filter_exclusion = join '|', map quotemeta, #filters;
my #filtered = grep { !/$filter_exclusion/ } #terms;

To see what's going on, print the contents of the array in each step: When you splice the array, it shrinks, but your loop iterates over 0 .. $#terms, so at the end of the loop, $ind will point behind the end of the array. When you use grep { ... } $array[ $too_large ], Perl needs to alias the non-existent element to $_ inside the grep block, so it creates an undef element in the array.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my #terms = ('alpha', 'beta test', 'gamma', 'delta quadrant', 'epsilon',
'zeta', 'eta', 'theta chi', 'one iota', 'kappa');
my #filters = qw( beta gamma epsilon iota );
for my $filter (#filters) {
say $filter;
for my $ind (0 .. $#terms) {
if (grep { do {
no warnings 'uninitialized';
/$filter/
} } $terms[$ind]
) {
splice #terms, $ind, 1;
}
say "\t$ind\t", join ' ', map $_ || '-', #terms;
}
}
If you used $terms[$ind] =~ /$filter/ instead of grep, you'd still get uninitialized warnings, but as there's no need to alias the element, it won't be created.

Related

compare an array of string with another array of strings in perl

I want to compare an array of string with another array of strings; if it matches, print matched.
Example:
#array = ("R-ID 1.0001", "RA-ID 61.02154", "TCA-ID 49.021456","RCID 61.02154","RB-ID 61.02154");
#var = ("TCA-ID 49", "R-ID 1");
for (my $x = 0; $x <= 4; $x++)
{
$array[$x] =~ /(.+?)\./;
if( ($var[0] eq $1) or ($var[1] eq $1) )
{
print "\n deleted rows are :#array\n";
}
else
{
print "printed rows are : #array \n";
push(#Matrix, \#array);
}
Then I need to compare #var with the #array; if it is matched, print the matched pattern.
Here the entire logic is in a hireartical for loop which gives a new #array in each iteration. so every time this logic is executed #array has different strings.
Then comes with #var it is user input field, this #var can be of any size. So in order to run the logic according to these constraints, I need to iterate the condition inside the if loop when the user input #var size is 3 for example.
So the goal is to match and delete the user input stings using the above mentioned logic. But unfortunately tis logic is not working. Could you please help me out in this issue.
The builtin grep keyword is a good place to start.
my $count = grep { $_ eq $var } #array;
This returns a count of items ($_) in the array which are equal (eq) to $var.
If you needed case-insensitive matching, you could use lc (or in Perl 5.16 or above, fc) to do that:
my $count = grep { lc($_) eq lc($var) } #array;
Now, a disadvantage to grep is that it is counting the matches. So after if finds the first match, it will keep on going until the end of the array. You don't seem to want that, but just want to know if any item in the array matches, in which case keeping on going might be slower than you need if it's a big array with thousands of elements.
So instead, use any from the List::Util module (which is bundled with Perl).
use List::Util qw( any );
my $matched = any { $_ eq $var } #array;
This will match as soon as it finds the first matching element, and skip searching the rest of the array.
Here is a couple of versions that allows multiple strings to be matched. Not clear what form $var takes when you want to store multiple, so assuming they are in an array #var for now.
The key point is this one is the use of the lookup hash to to the matching.
use strict;
use warnings;
my #var = ("TCA-ID 49", "RA-ID 61");
my #array = ("R-ID 1", "RA-ID 61", "TCA-ID 49");
# create a lookup for the strings to match
my %lookup = map { $_ => 1} #var ;
for my $entry (#array)
{
print "$entry\n"
if $lookup{$entry} ;
}
running gives
RA-ID 61
TCA-ID 49
Next, using a regular expression to do the matching
use strict;
use warnings;
my #var = ("TCA-ID 49", "RA-ID 61");
my #array = ("R-ID 1", "RA-ID 61", "TCA-ID 49");
my $re = join "|", map { quotemeta } #var;
print "$_\n" for grep { /^($re)$/ } #array ;
output is the same

Extract number from array in Perl

I have a array which have certain elements. Each element have two char "BC" followed by a number
e.g - "BC6"
I want to extract the number which is present and store in a different array.
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
my #band = ("BC1", "BC3");
foreach my $elem(#band)
{
my #chars = split("", $elem);
foreach my $ele (#chars) {
looks_like_number($ele) ? 'push #band_array, $ele' : '';
}
}
After execution #band_array should contain (1,3)
Can someone please tell what I'm doing wrong? I am new to perl and still learning
To do this with a regular expression, you need a very simple pattern. /BC(\d)/ should be enough. The BC is literal. The () are a capture group. They save the match inside into a variable. The first group relates to $1 in Perl. The \d is a character group for digits. That's 0-9 (and others, but that's not relevant here).
In your program, it would look like this.
use strict;
use warnings;
use Data::Dumper;
my #band = ('BC1', 'BC2');
my #numbers;
foreach my $elem (#band) {
if ($elem =~ m/BC(\d)/) {
push #numbers, $1;
}
}
print Dumper #numbers;
This program prints:
$VAR1 = '1';
$VAR2 = '2';
Note that your code had several syntax errors. The main one is that you were using #band = [ ... ], which gives you an array that contains one array reference. But your program assumed there were strings in that array.
Just incase your naming contains characters other than BC this will exctract all numeric values from your list.
use strict;
use warnings;
my #band = ("AB1", "BC2", "CD3");
foreach my $str(#band) {
$str =~ s/[^0-9]//g;
print $str;
}
First, your array is an anonymous array reference; use () for a regular array.
Then, i would use grep to filter out the values into a new array
use strict;
use warnings;
my #band = ("BC1", "BC3");
my #band_array = grep {s/BC(\d+)/$1/} #band;
$"=" , "; # make printing of array nicer
print "#band_array\n"; # print array
grep works by passing each element of an array in the code in { } , just like a sub routine. $_ for each value in the array is passed. If the code returns true then the value of $_ after the passing placed in the new array.
In this case the s/// regex returns true if a substitution is made e.g., the regex must match. Here is link for more info on grep

Match array based on existing array

I have two arrays,
my #test = ('a','b','c','d','e',f,'g','h');
my #test2 = ('h','b','d');
I'm attempting to loop through the array #test, and match elements against those in #test2, deleting those elements that do not exist.
I have the following code:
foreach my $header (#test) {
if( exists $test2[$header]){
# do nothing
}
else {
delete $test[$header];
}
}
So, I want array #test to look like this (ignore the fact this could be sorted alphabetically):
my #test = ('b','d','h');
However currently my array remains the same after the foreach loop, can anyone suggest why?
You're misunderstanding what 'header' is set to. It's set to (an alias of) the value in the array.
So
foreach my $header (#test) {
print $header,"\n";
}
Will give you a, b, c etc.
However, then you're trying to access $test2['a'] which isn't valid, because it should be numeric.
So actually, a good case study in why you should use strict and warnings because this would have told you the problem:
Argument "a" isn't numeric in array or hash lookup at
Argument "b" isn't numeric in array or hash lookup at
etc.
So it's not actually doing anything.
You shouldn't use delete in this way either though, because you're deleting from a list whilst iterating it. That's not good, even without the fact that it's nonsense to use a letter as an array index.
You could do it like this instead though:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my #test = ('a','b','c','d','e','f','g','h');
my #test2 = ('h','b','d');
my %is_in_test2 = map { $_ => 1 } #test2;
#test = grep { $is_in_test2{$_} } #test;
print Dumper \#test;
If you do want to iterate by index, you can do it like this:
for my $index ( 0..$#test ) {
print "$index => $test[$index]\n";
}
But I would still suggest that deleting whilst iterating isn't a great plan, because modifying a thing you're iterating it ( and aiming to resize the array) is a good way to end up with strange bugs.
So whilst you could:
for my $index ( 0..$#test ) {
print "$index => $test[$index]\n";
delete $test[$index] if not $is_in_test2{$test[$index]};
}
print Dumper \#test;
What you'll end up with is:
$VAR1 = [
undef,
'b',
undef,
'd',
undef,
undef,
undef,
'h'
];
For the more general case there is a a FAQ: How do I computer the difference/intersection of two arrays
use strict;
use warnings;
my #test = ('a','b','c','d','e','f','g','h');
my #test2 = ('h','b','d');
foreach my $header (#test) {
foreach my $header2 (#test2) {
if( $header eq $header2) {
print "$header,"; #prints only existing values
}
else {
#doNothing
}
}
}
Note: This method is not feasible when you are dealing an array with huge elements.

Perl array element comparing

I am new in Perl programming. I am trying to compare the two arrays each element. So here is my code:
#!/usr/bin/perl
use strict;
use warnings;
use v5.10.1;
my #x = ("tom","john","michell");
my #y = ("tom","john","michell","robert","ricky");
if (#x ~~ #y)
{
say "elements matched";
}
else
{
say "no elements matched";
}
When I run this I get the output
no elements matched
So I want to compare both array elements in deep and the element do not matches, those elements I want to store it in a new array. As I can now compare the only matched elements but I can't store it in a new array.
How can I store those unmatched elements in a new array?
Please someone can help me and advice.
I'd avoid smart matching in Perl - e.g. see here
If you're trying to compare the contents of $y[0] with $x[0] then this is one way to go, which puts all non-matches in an new array #keep:
use strict;
use warnings;
use feature qw/say/;
my #x = qw(tom john michell);
my #y = qw(tom john michell robert ricky);
my #keep;
for (my $i = 0; $i <$#y; $i++) {
unless ($y[$i] eq $x[$i]){
push #keep, $y[$i];
}
}
say for #keep;
Or, if you simply want to see if one name exists in the other array (and aren't interested in directly comparing elements), use two hashes:
my (%x, %y);
$x{$_}++ for #x;
$y{$_}++ for #y;
foreach (keys %y){
say if not exists $x{$_};
}
It would be well worth your while spending some time reading the Perl FAQ.
Perl FAQ 4 concerns Data Manipulation and includes the following question and answer:
How do I compute the difference of two arrays? How do I compute
the intersection of two arrays?
Use a hash. Here's code to do both and more. It assumes that each
element is unique in a given array:
my (#union, #intersection, #difference);
my %count = ();
foreach my $element (#array1, #array2) { $count{$element}++ }
foreach my $element (keys %count) {
push #union, $element;
push #{ $count{$element} > 1 ? \#intersection : \#difference }, $element;
}
Note that this is the symmetric difference, that is, all elements
in either A or in B but not in both. Think of it as an xor
operation.

2d arr explicitpackage

I've looked through several threads on websites including this one to try and understand why I am getting an undeclared variable error for my usage of my $line . Each element of the #lines array is an array of strings.
The error is in line 25 and 27 with the $line[$count] statement
use strict;
use warnings;
my #lines;
my #sizes;
# read input from stdin file into 2d array
while(<>)
{
push(#lines, my #tokens = split(/\s+/, $_));
}
# search through each array for largest sizes in
# corresponding elements
for (my $count = 0; $count <= 5; $count++)
{
push(#sizes, 0);
foreach my $line (#lines)
{
if(length($line[$count])>$sizes[$count])
{
$sizes[$count] = length($line[$count]);
}
}
}
I can post the full code if it is necessary, but I am pretty sure the error must be in here somewhere.
The problem is here:
push(#lines, my #tokens = split(/\s+/, $_));
Pushing one array into another just adds all elements to the first array. So you are making a really long one dimensional array.
To fix this, use brackets to make an array reference:
push #lines, [ split(/\s+/, $_) ]; #No need for a temp variable.
Also, to access the array reference, you have to de-reference it. Both of these syntaxes are options:
${$line}[$count];
$line->[$count];
I think the second syntax is more readable.
Update: Also, you could simplify your code if you keep track of the longest lengths while you go through the file:
use strict;
use warnings;
use List::Util qw/max/;
my #lines;
my #sizes = (0)x6;
while(<>)
{
push #lines, [ my #tokens = split ];
#sizes = map { max ( length($tokens[$_]), $sizes[$_] ) } 0..$#tokens;
}
Note: The Data::Dumper core module is an invaluable tool when working with complex data structures in Perl.
use Data::Dumper;
print Dumper #lines;
This will print out the complete structure of whatever variable you give it. That way you can see if you actually created what you thought you did.

Resources