Match array based on existing array - arrays

I have two arrays,
my #test = ('a','b','c','d','e',f,'g','h');
my #test2 = ('h','b','d');
I'm attempting to loop through the array #test, and match elements against those in #test2, deleting those elements that do not exist.
I have the following code:
foreach my $header (#test) {
if( exists $test2[$header]){
# do nothing
}
else {
delete $test[$header];
}
}
So, I want array #test to look like this (ignore the fact this could be sorted alphabetically):
my #test = ('b','d','h');
However currently my array remains the same after the foreach loop, can anyone suggest why?

You're misunderstanding what 'header' is set to. It's set to (an alias of) the value in the array.
So
foreach my $header (#test) {
print $header,"\n";
}
Will give you a, b, c etc.
However, then you're trying to access $test2['a'] which isn't valid, because it should be numeric.
So actually, a good case study in why you should use strict and warnings because this would have told you the problem:
Argument "a" isn't numeric in array or hash lookup at
Argument "b" isn't numeric in array or hash lookup at
etc.
So it's not actually doing anything.
You shouldn't use delete in this way either though, because you're deleting from a list whilst iterating it. That's not good, even without the fact that it's nonsense to use a letter as an array index.
You could do it like this instead though:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my #test = ('a','b','c','d','e','f','g','h');
my #test2 = ('h','b','d');
my %is_in_test2 = map { $_ => 1 } #test2;
#test = grep { $is_in_test2{$_} } #test;
print Dumper \#test;
If you do want to iterate by index, you can do it like this:
for my $index ( 0..$#test ) {
print "$index => $test[$index]\n";
}
But I would still suggest that deleting whilst iterating isn't a great plan, because modifying a thing you're iterating it ( and aiming to resize the array) is a good way to end up with strange bugs.
So whilst you could:
for my $index ( 0..$#test ) {
print "$index => $test[$index]\n";
delete $test[$index] if not $is_in_test2{$test[$index]};
}
print Dumper \#test;
What you'll end up with is:
$VAR1 = [
undef,
'b',
undef,
'd',
undef,
undef,
undef,
'h'
];
For the more general case there is a a FAQ: How do I computer the difference/intersection of two arrays

use strict;
use warnings;
my #test = ('a','b','c','d','e','f','g','h');
my #test2 = ('h','b','d');
foreach my $header (#test) {
foreach my $header2 (#test2) {
if( $header eq $header2) {
print "$header,"; #prints only existing values
}
else {
#doNothing
}
}
}
Note: This method is not feasible when you are dealing an array with huge elements.

Related

difference between the following two codes of [ and ( in perl?

When I want to assign input file to array, I am getting this error.
while (<>) {
my #tmp = split;
push my #arr,[#tmp];
print "#arr\n";
}
output: ARRAY(0x7f0b00)
ARRAY(0x7fb2f0)
If I change [ to ( then I am getting the required output.
while (<>) {
my #tmp = split;
push my #arr,(#tmp);
print "#arr\n";
output: hello, testing the perl
check the arrays.
What is the deference between (#tmp) and [#tmp]?
Normal parentheses () have no special function besides changing precedence. They are commonly used to confine a list, e.g. my #arr = (1,2,3) Square brackets return an array reference. In your case, you would be constructing a two-dimensional array. (You would, if your code was not broken).
Your code should perhaps be written like this. Note that you need to declare the array outside the loop block, otherwise it will not keep values from previous iterations. Note also that you do not need to use a #tmp array, you can simply put the split inside the push.
my #arr; # declare #arr outside the loop block
while (<>) {
push #arr, [ split ]; # stores array reference in #arr
}
for my $aref (#arr) {
print "#$aref"; # print your values
}
This array would have the structure:
$arr[0] = [ "hello,", "testing", "the", "perl" ];
$arr[1] = [ "check", "the", "arrays." ];
This is a good idea if you for example want to keep lines of input from being mixed up. Otherwise all values end up in the same level of the array.

Perl: Removing array items and resizing the array

I’m trying to filter an array of terms using another array in Perl. I have Perl 5.18.2 on OS X, though the behavior is the same if I use 5.010. Here’s my basic setup:
#!/usr/bin/perl
#use strict;
my #terms = ('alpha','beta test','gamma','delta quadrant','epsilon',
'zeta','eta','theta chi','one iota','kappa');
my #filters = ('beta','gamma','epsilon','iota');
foreach $filter (#filters) {
for my $ind (0 .. $#terms) {
if (grep { /$filter/ } $terms[$ind]) {
splice #terms,$ind,1;
}
}
}
This works to pull out the lines that match the various search terms, but the array length doesn’t change. If I write out the resulting #terms array, I get:
[alpha]
[delta quadrant]
[zeta]
[eta]
[theta chi]
[kappa]
[]
[]
[]
[]
As you might expect from that, printing scalar(#terms) gets a result of 10.
What I want is a resulting array of length 6, without the four blank items at the end. How do I get that result? And why isn’t the array shrinking, given that the perldoc page about splice says, “The array grows or shrinks as necessary.”?
(I’m not very fluent in Perl, so if you’re thinking “Why don’t you just...?”, it’s almost certainly because I don’t know about it or didn’t understand it when I heard about it.)
You can always regenerate the array minus things you don't want. grep acts as a filter allowing you to decide which elements you want and which you don't:
#!/usr/bin/perl
use strict;
my #terms = ('alpha','beta test','gamma','delta quadrant','epsilon',
'zeta','eta','theta chi','one iota','kappa');
my #filters = ('beta','gamma','epsilon','iota');
my %filter_exclusion = map { $_ => 1 } #filters;
my #filtered = grep { !$filter_exclusion{$_} } #terms;
print join(',', #filtered) . "\n";
It's pretty easy if you have a simple structure like %filter_exclusion on hand.
Update: If you want to allow arbitrary substring matches:
my $filter_exclusion = join '|', map quotemeta, #filters;
my #filtered = grep { !/$filter_exclusion/ } #terms;
To see what's going on, print the contents of the array in each step: When you splice the array, it shrinks, but your loop iterates over 0 .. $#terms, so at the end of the loop, $ind will point behind the end of the array. When you use grep { ... } $array[ $too_large ], Perl needs to alias the non-existent element to $_ inside the grep block, so it creates an undef element in the array.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my #terms = ('alpha', 'beta test', 'gamma', 'delta quadrant', 'epsilon',
'zeta', 'eta', 'theta chi', 'one iota', 'kappa');
my #filters = qw( beta gamma epsilon iota );
for my $filter (#filters) {
say $filter;
for my $ind (0 .. $#terms) {
if (grep { do {
no warnings 'uninitialized';
/$filter/
} } $terms[$ind]
) {
splice #terms, $ind, 1;
}
say "\t$ind\t", join ' ', map $_ || '-', #terms;
}
}
If you used $terms[$ind] =~ /$filter/ instead of grep, you'd still get uninitialized warnings, but as there's no need to alias the element, it won't be created.

Perl array element comparing

I am new in Perl programming. I am trying to compare the two arrays each element. So here is my code:
#!/usr/bin/perl
use strict;
use warnings;
use v5.10.1;
my #x = ("tom","john","michell");
my #y = ("tom","john","michell","robert","ricky");
if (#x ~~ #y)
{
say "elements matched";
}
else
{
say "no elements matched";
}
When I run this I get the output
no elements matched
So I want to compare both array elements in deep and the element do not matches, those elements I want to store it in a new array. As I can now compare the only matched elements but I can't store it in a new array.
How can I store those unmatched elements in a new array?
Please someone can help me and advice.
I'd avoid smart matching in Perl - e.g. see here
If you're trying to compare the contents of $y[0] with $x[0] then this is one way to go, which puts all non-matches in an new array #keep:
use strict;
use warnings;
use feature qw/say/;
my #x = qw(tom john michell);
my #y = qw(tom john michell robert ricky);
my #keep;
for (my $i = 0; $i <$#y; $i++) {
unless ($y[$i] eq $x[$i]){
push #keep, $y[$i];
}
}
say for #keep;
Or, if you simply want to see if one name exists in the other array (and aren't interested in directly comparing elements), use two hashes:
my (%x, %y);
$x{$_}++ for #x;
$y{$_}++ for #y;
foreach (keys %y){
say if not exists $x{$_};
}
It would be well worth your while spending some time reading the Perl FAQ.
Perl FAQ 4 concerns Data Manipulation and includes the following question and answer:
How do I compute the difference of two arrays? How do I compute
the intersection of two arrays?
Use a hash. Here's code to do both and more. It assumes that each
element is unique in a given array:
my (#union, #intersection, #difference);
my %count = ();
foreach my $element (#array1, #array2) { $count{$element}++ }
foreach my $element (keys %count) {
push #union, $element;
push #{ $count{$element} > 1 ? \#intersection : \#difference }, $element;
}
Note that this is the symmetric difference, that is, all elements
in either A or in B but not in both. Think of it as an xor
operation.

More clarification about the usage of split in Perl

I have this following input file:
test.csv
done_cfg,,,,
port<0>,clk_in,subcktA,instA,
port<1>,,,,
I want to store the elements of each CSV column into an array, but I always get error when I try to fetch those "null" elements in the csv when I run the script. Here's my code:
# ... assuming file was correctly opened and stored into
# ... a variable named $map_in
my $counter = 0;
while($map_in){
chomp;
#hold_csv = split(',',$_);
$entry1[$counter] = $hold_csv[0];
$entry2[$counter] = $hold_csv[1];
$entry3[$counter] = $hold_csv[2];
$entry4[$counter] = $hold_csv[3];
$counter++;
}
print "$entry1[0]\n$entry2[0]\n$entry3[0]\n$entry3[0]"; #test printing
I always got use of uninitialized value error whenever i fetch empty CSV cells
Can you help me locate the error in my code ('cause I know I have somewhat missed something on my code)?
Thanks.
This looks like CSV. So the tool for the job is really Text::CSV.
I will also suggest - having 4 different arrays with numbered names says to me that you're probably wanting a multi-dimensional data structure in the first place.
So I'd be doing something like:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Text::CSV;
my $csv = Text::CSV->new( { binary => 1 } );
open( my $input, "<", "input.csv" ) or die $!;
my #results;
while ( my $row = $csv->getline($input) ) {
push ( #results, \#$row );
}
print join ( ",", #{$results[0]} ),"\n";
print Dumper \#results;
close($input);
If you really want separate arrays, I'd suggest naming them something different, but you could do it like this:
push ( #array1, $$row[0] ); #note - double $, because we dereference
I will note - there's an error in your code - I doubt:
while($map_in){
is doing what you think it is.
When you're assigning $entryN, define a default value:
$entry1[$counter] = $hold_csv[0] || '';
same for other #entry
I think there is a typo in while($map_in) { it should be while (#map_in) {.

Perl - Building array of arrays leads to overflow

I am trying to analyze a hashmap, for duplicate values and getting their keys in arrays. These arrays will be in an array of arrays. I am a newbie, by the way. However it never stops running when I start it. Where am I wrong?
while (($k,$v)=each %hashmap){
$hasduplicate = 0;
delete $hashmap{$k};
#dups = ();
while (($k1,$v1) = each %hashmap){
if ($v1 eq $v) {
$hasduplicate = 1;
push #dups, $k1;
delete $hashmap{$k1};
}
}
if ($hasduplicate){
push (#dups, $k);
push #dupsarray, [#dups];}
}
Each hash has just one iterator aligned to itself in Perl (see each). Therefore, running each for the same hash in a loop that calls each is not doing what you think.
If you want to see what's going on, try adding the following line at the start of the outer loop:
warn $k;
You are missing several dollar signs before variable names. For example, you probably want to delete $hashmap{$k} instead of $hashmap{k}, which is equivalent to $hashmap{'k'}.
To output an array of arrays, you have to dereference the inner arrays:
print map "#$_\n", #dupsarray;
BTW, I would use a hash of arrays to solve your task. Here's how:
my %dups;
while (my ($k, $v) = each %hashmap) {
push #{ $dups{$v} }, $k;
}
for my $k (grep #{ $dups{$_} } > 1, keys %dups) {
print "$k: #{ $dups{$k} }\n";
}
The problem is that there can be only one each sequence per hash, as there is only a single index to keep track of the next key/value pair.
In addition, you are using k and k1 in a few places where you mean $k and $k1. You must always use strict and use warnings at the top of every Perl program. This would have alerted you to the problem.
You can get around this problem by using for my $k1 (keys %hashmap) { ... } for the inside loop. This will create a separate list of keys to assign to $k1 in turn so that there is no multiple use of the iterator.
This modification of your code does what I think you want.
use strict;
use warnings;
my %hashmap = (
a => 'a',
b => 'b',
c => 'a',
d => 'c',
);
my #dupsarray;
while (my ($k, $v) = each %hashmap) {
my $hasduplicate = 0;
delete $hashmap{$k};
my #dups;
for my $k1 (keys %hashmap) {
my $v1 = $hashmap{$k1};
if ($v1 eq $v) {
$hasduplicate = 1;
push #dups, $k1;
delete $hashmap{$k1};
}
}
if ($hasduplicate) {
push(#dups, $k);
push #dupsarray, [#dups];
}
}
use Data::Dump;
dd \#dupsarray;
output
[["a", "c"]]
A much simpler method is to create an inverted hash where the keys and values of the original hash are swapped. Then just pick out the values of the inverted hash that have more than one element. This program demonstrates
use strict;
use warnings;
my %hashmap = (
a => 'a',
b => 'b',
c => 'a',
d => 'c',
);
my #dupsarray = do {
my %inverted;
while (my ($k, $v) = each %hashmap) {
push #{ $inverted{$v} }, $k;
}
grep { #$_ > 1 } values %inverted;
};
use Data::Dump;
dd \#dupsarray;
output
[["c", "a"]]

Resources