Check words and synonyms - arrays

I have an array with some words, and another array with words and synonyms. I'd like to create a third array when I find a matchin word between first and second array. I tried with grep but I'm not able to write the code in a proper way in order to get what I want.
The problem is that elements in array 1 can be found in array 2 at the beginning but also at the end or in the middle.
Maybe it's easier with an exemple:
#array1 = qw(chose, abstraction);
#array2 = (
"inspirer respirer",
"incapable",
"abstraction",
"abaxial",
"cause,chose,objet",
"ventral",
"chose,objet"
);
The result it should be
#array3 = ("abstraction", "cause,chose,objet", "chose,objet");
Is it right to use "grep"?
I'm not able to write a right syntax to solve the problem..
Thank you

You can construct a regular expression from the array1, then filter the array2 using it:
#!/usr/bin/perl
use warnings;
use strict;
my #array1 = qw(chose, abstraction);
my #array2 = (
"inspirer respirer",
"incapable",
"abstraction",
"abaxial",
"cause,chose,objet",
"ventral",
"chose,objet"
);
my $regex = join '|', map quotemeta $_, #array1; # quotemeta needed for special characters.
$regex = qr/$regex/;
my #array3 = grep /$regex/, #array2;
print "$_\n" for #array3;

I know you have an answer but here is a fun way I thought of.
So, I guess it is like an inverted index.
You take each set of synonyms and make them into an array. Then take each element of that array and put it into a hash as the keys with the value being a reference to the array.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my #array1 = qw(chose abstraction);
my #array2 = ("inspirer respirer",
"incapable",
"abstraction",
"abaxial",
"cause,chose,objet",
"ventral",
"chose,objet"
);
my #array;
push #array, map { /,|\s/ ? [split(/,|\s/, $_)]:[$_] } #array2;
my %construct;
while(my $array_ref = shift(#array)){
for(#{ $array_ref }){
push #{ $construct{$_} }, $array_ref;
}
}
my #array3 = map { s/,//; (#{ $construct{$_} }) } #array1;
print join(', ', #{ $_ }), "\n" for (#array3);
EDIT:
Missed apart of the answer before, this one should be complete.

Related

compare an array of string with another array of strings in perl

I want to compare an array of string with another array of strings; if it matches, print matched.
Example:
#array = ("R-ID 1.0001", "RA-ID 61.02154", "TCA-ID 49.021456","RCID 61.02154","RB-ID 61.02154");
#var = ("TCA-ID 49", "R-ID 1");
for (my $x = 0; $x <= 4; $x++)
{
$array[$x] =~ /(.+?)\./;
if( ($var[0] eq $1) or ($var[1] eq $1) )
{
print "\n deleted rows are :#array\n";
}
else
{
print "printed rows are : #array \n";
push(#Matrix, \#array);
}
Then I need to compare #var with the #array; if it is matched, print the matched pattern.
Here the entire logic is in a hireartical for loop which gives a new #array in each iteration. so every time this logic is executed #array has different strings.
Then comes with #var it is user input field, this #var can be of any size. So in order to run the logic according to these constraints, I need to iterate the condition inside the if loop when the user input #var size is 3 for example.
So the goal is to match and delete the user input stings using the above mentioned logic. But unfortunately tis logic is not working. Could you please help me out in this issue.
The builtin grep keyword is a good place to start.
my $count = grep { $_ eq $var } #array;
This returns a count of items ($_) in the array which are equal (eq) to $var.
If you needed case-insensitive matching, you could use lc (or in Perl 5.16 or above, fc) to do that:
my $count = grep { lc($_) eq lc($var) } #array;
Now, a disadvantage to grep is that it is counting the matches. So after if finds the first match, it will keep on going until the end of the array. You don't seem to want that, but just want to know if any item in the array matches, in which case keeping on going might be slower than you need if it's a big array with thousands of elements.
So instead, use any from the List::Util module (which is bundled with Perl).
use List::Util qw( any );
my $matched = any { $_ eq $var } #array;
This will match as soon as it finds the first matching element, and skip searching the rest of the array.
Here is a couple of versions that allows multiple strings to be matched. Not clear what form $var takes when you want to store multiple, so assuming they are in an array #var for now.
The key point is this one is the use of the lookup hash to to the matching.
use strict;
use warnings;
my #var = ("TCA-ID 49", "RA-ID 61");
my #array = ("R-ID 1", "RA-ID 61", "TCA-ID 49");
# create a lookup for the strings to match
my %lookup = map { $_ => 1} #var ;
for my $entry (#array)
{
print "$entry\n"
if $lookup{$entry} ;
}
running gives
RA-ID 61
TCA-ID 49
Next, using a regular expression to do the matching
use strict;
use warnings;
my #var = ("TCA-ID 49", "RA-ID 61");
my #array = ("R-ID 1", "RA-ID 61", "TCA-ID 49");
my $re = join "|", map { quotemeta } #var;
print "$_\n" for grep { /^($re)$/ } #array ;
output is the same

Hash key is storing only the last element of loop

I am trying to store the array values in the hash, but the hash key is storing only the last value of array in the for loop.
My expected output is, 'STORE' key should have all the array elements.
I knew there are few other ways to store the array values in the hash, but I curious why the below script doesn't work.
use strict;
use warnings;
use Data::Dumper;
my #array = (1,2,3);
my %record;
for my $array(#array) {
$record{"STORE"} = $array;
}
print Dumper \%record;
The hash has only the last value from the array because you keep overwriting the value in the for loop.
One way to store all values from the array is:
use strict;
use warnings;
use Data::Dumper;
my #array = (1,2,3);
my %record;
for my $array (#array) {
push #{ $record{"STORE"} }, $array;
}
print Dumper \%record;
This stores the array as a reference.
$VAR1 = {
'STORE' => [
1,
2,
3
]
};
Another way to store the whole array is to assign it to an array reference:
my #array = (1,2,3);
my %record;
$record{"STORE"} = [#array];
print Dumper \%record;
Refer to perldsc

Extract number from array in Perl

I have a array which have certain elements. Each element have two char "BC" followed by a number
e.g - "BC6"
I want to extract the number which is present and store in a different array.
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
my #band = ("BC1", "BC3");
foreach my $elem(#band)
{
my #chars = split("", $elem);
foreach my $ele (#chars) {
looks_like_number($ele) ? 'push #band_array, $ele' : '';
}
}
After execution #band_array should contain (1,3)
Can someone please tell what I'm doing wrong? I am new to perl and still learning
To do this with a regular expression, you need a very simple pattern. /BC(\d)/ should be enough. The BC is literal. The () are a capture group. They save the match inside into a variable. The first group relates to $1 in Perl. The \d is a character group for digits. That's 0-9 (and others, but that's not relevant here).
In your program, it would look like this.
use strict;
use warnings;
use Data::Dumper;
my #band = ('BC1', 'BC2');
my #numbers;
foreach my $elem (#band) {
if ($elem =~ m/BC(\d)/) {
push #numbers, $1;
}
}
print Dumper #numbers;
This program prints:
$VAR1 = '1';
$VAR2 = '2';
Note that your code had several syntax errors. The main one is that you were using #band = [ ... ], which gives you an array that contains one array reference. But your program assumed there were strings in that array.
Just incase your naming contains characters other than BC this will exctract all numeric values from your list.
use strict;
use warnings;
my #band = ("AB1", "BC2", "CD3");
foreach my $str(#band) {
$str =~ s/[^0-9]//g;
print $str;
}
First, your array is an anonymous array reference; use () for a regular array.
Then, i would use grep to filter out the values into a new array
use strict;
use warnings;
my #band = ("BC1", "BC3");
my #band_array = grep {s/BC(\d+)/$1/} #band;
$"=" , "; # make printing of array nicer
print "#band_array\n"; # print array
grep works by passing each element of an array in the code in { } , just like a sub routine. $_ for each value in the array is passed. If the code returns true then the value of $_ after the passing placed in the new array.
In this case the s/// regex returns true if a substitution is made e.g., the regex must match. Here is link for more info on grep

Why does my first hash value disappear in Perl?

Why does the hash remove the first value apple:2 when I print the output?
use warnings;
use strict;
use Data::Dumper;
my #array = ("apple:2", "pie:4", "cake:2");
my %wordcount;
our $curword;
our $curnum;
foreach (#array) {
($curword, $curnum) = split(":",$_);
$wordcount{$curnum}=$curword;
}
print Dumper (\%wordcount);
Perl hash can only have unique keys, so
$wordcount{2} = "apple";
is later overwritten by
$wordcount{2} = "cake";
What you probably wanted to do was:
use warnings;
use strict;
use Data::Dumper;
my #array = ("apple:2", "pie:4", "cake:2");
my %wordcount;
for my $entry (#array) {
my ($word, $num) = split /:/, $entry;
push #{$wordcount{$num}}, $word;
}
print Dumper (\%wordcount);
This way, each entry in %wordcount relates a word count to an array of the words which appear that many times (assuming the :n in the notation indicates the count).
It is OK to be a beginner, but it is not OK to assume other people can read your mind.
Also, don't use global variables (our) when lexically scoped (my) will do.

how to arrange array values in ascending order in perl

I need to arrange the array values in ascending order in perl, i used sort compare option for below values but not working, kindly help as soon as possible
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
note: i want to sort the array values not array index
Thanks in advance
How about using schwartzian transform, doc here and here:
my #unsorted = qw(
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
);
my #sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { (my $t=$_)=~s/\D+//g; [$_, $t] }
#unsorted;
dump#sorted;
output:
(
"p1.txt",
"p2.txt",
"p3.txt",
"p4.txt",
"p5.txt",
"p6.txt",
"p7.txt",
"p8.txt",
"p9.txt",
"p10.txt",
"p11.txt",
"p12.txt",
"p13.txt",
"p14.txt",
"p15.txt",
"p16.txt",
"p17.txt",
"p18.txt",
"p19.txt",
"p20.txt",
"p21.txt",
"p22.txt",
"p23.txt",
"p24.txt",
)
Consider using Sort::Naturally for this task:
use strict;
use warnings;
use Sort::Naturally qw/nsort/;
chomp( my #data = <DATA> );
print "$_\n" for nsort #data;
__DATA__
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
Partial output:
p1.txt
p2.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
p10.txt
p11.txt
p12.txt
...
p22.txt
p23.txt
p24.txt
Hope this helps!
You need to create your own sort algorithm and pass that to sort:
sub custom_sort
{
$a =~ /^p(\d+)\.txt$/; #capture the number in $a
my $intA = $1;
$b =~ /^p(\d+)\.txt$/; #capture the number in $b
my $intB = $1;
return ($intA <=> $intB); #compare the numbers and return
}
And call:
#sortedArray = sort custom_sort #array;
See: http://perldoc.perl.org/functions/sort.html and http://perldoc.perl.org/perlop.html#Equality-Operators
Easiest would be to use the nsort_by function from List::UtilsBy; this sorts a list by the numbers returned from its code block. You would then invoke this with a code block to extract the number from the filename:
use List::UtilsBy qw( nsort_by );
my #sorted = nsort_by { /^p(\d+)\.txt$/ and $1 } #array;

Resources