Perl Array of regular expressions -- matching elements of an array - arrays

I am trying to match and remove elements from an array called #array. The elements to be removed must match the patterns stored inside an array called #del_pattern
my #del_pattern = ('input', 'output', 'wire', 'reg', '\b;\b', '\b,\b');
my #array = (['input', 'port_a', ','],
['output', '[31:0]', 'port_b,', 'port_c', ',']);
To remove the patterns contained in #del_pattern from #array, I loop through all the elements in #del_pattern and exclude them using grep.
## delete the patterns found in #del_pattern array
foreach $item (#del_pattern) {
foreach $i (#array) {
#$i = grep(!/$item/, #$i);
}
}
However, I have been unable to remove ',' from #array. If I use ',' instead of '\b,\b' in #del_pattern, element port_b, gets removed from the #array as well, which is not an intended outcome. I am only interested in removing elements that contain only ','.

You are using the wrong regular expression. I updated the code and tried it and its working fine. PFB the update code:
my #del_pattern = ('input', 'output', 'wire', 'reg', '\b;\b', '^,$');
my #array = (['input', 'port_a', ','],
['output', '[31:0]', 'port_b,', 'port_c', ',']);
## delete the patterns found in #del_pattern array
foreach my $item (#del_pattern) {
foreach my $i (#array) {
#$i = grep(!/$item/, #$i);
}
}
The only change made is in the Regex '\b,\b' to '^,$'. I don't have much info on \b but the regex I am suggesting is doing what you intend.

You want
^,\z
An explanation of what \b doesn't match at all follows.
\b
defines the boundary of a "word". It is equivalent to
(?<=\w)(?!\w) | (?<!\w)(?=\w)
so
\b,\b
is equivalent to
(?: (?<=\w)(?!\w) | (?<!\w)(?=\w) ) , (?: (?<=\w)(?!\w) | (?<!\w)(?=\w) )
Since comma is a non-word character, that simplifies to
(?<=\w),(?=\w)
So
'a,b' =~ /\b,\b/ # Match
',' =~ /\b,\b/ # No match

This works, but not very nice from the code.
This code snippet also removes the ','
my $elm = [];
sub extract {
my $elm = shift;
foreach my $del (#del_pattern) {
$elm =~ s/$del//g;
if ( $elm ) {
return $elm;
}
}
}
foreach my $item (#array) {
foreach my $i (#$item) {
my $extract = extract($i);
if ($extract) {
push(#$elm, $extract);
}
}
}
print Dumper($elm);
Why your #array has array? Why not one big array?

Related

Perl : matching the contents of a file with the contents of an array

I have an array #arr1 where each element is of the form #define A B.
I have another file, f1 with contents:
#define,x,y
#define,p,q
and so on. I need to check if the second value of every line (y, q etc) matches the first value in any element of the array. Example: say the array has an element #define abc 123 and the file has a line #define,hij,abc.
When such a match occurs, I need to add the line #define hij 123 to the array.
while(<$fhDef>) #Reading the file
{
chomp;
$_ =~ tr/\r//d;
if(/#define,(\w+),(\w+)/)
{
my $newLabel = $1;
my $oldLabel = $2;
push #oldLabels, $oldLabel;
push #newLabels, $newLabel;
}
}
foreach my $x(#tempX) #Reading the array
{
chomp $x;
if($x =~ /#define\h{1}\w+\h*0x(\w+)\h*/)
{
my $addr = $1;
unless(grep { $x =~ /$_/ } #oldLabels)
{
next;
}
my $index = grep { $oldLabels[$_] eq $_ } 0..$#oldLabels;
my $new1 = $newLabels[$index];
my $headerLabel1 = $headerLabel."X_".$new1;
chomp $headerLabel1;
my $headerLine = "#define ".$headerLabel1."0x".$addr;
push #tempX, $headerLine;
}
}
This just hangs. No doubt I'm missing something right in front of me, but what??
The canonical way is to use a hash. Hash the array, using the first argument as the key. Then walk the file and check for existence of the key in the hash. I used a HoA (hash of arrays) to handle multiple values for each key (see the last two lines).
#! /usr/bin/perl
use warnings;
use strict;
my #arr1 = ( '#define y x',
'#define abc 123',
);
my %hash;
for (#arr1) {
my ($arg1, $arg2) = (split ' ')[1, 2];
push #{ $hash{$arg1} }, $arg2;
}
while (<DATA>) {
chomp;
my ($arg1, $arg2) = (split /,/)[1, 2];
if ($hash{$arg2}) {
print "#define $arg1 $_\n" for #{ $hash{$arg2} };
}
}
__DATA__
#define,x,y
#define,p,q
#define,hij,abc
#define,klm,abc
As the other answer said, it's better to use a hash. Also, keep in mind that you're doing a
foreach my $x(#tempX)
but you're also doing a
push #tempX, $headerLine;
which means that you're modifying the array on which you're iterating. This is not just bad practice, this also means that you're most likely going to have an infinite loop because of it.

Comparing two strings line by line in Perl

I am looking for code in Perl similar to
my #lines1 = split /\n/, $str1;
my #lines2 = split /\n/, $str2;
for (int $i=0; $i<lines1.length; $i++)
{
if (lines1[$i] ~= lines2[$i])
print "difference in line $i \n";
}
To compare two strings line by line and show the lines at which there is any difference.
I know what I have written is mixture of C/Perl/Pseudo-code. How do I write it in the way that it works on Perl?
What you have written is sort of ok, except you cannot use that notation in Perl lines1.length, int $i, and ~= is not an operator, you mean =~, but that is the wrong tool here. Also if must have a block { } after it.
What you want is simply $i < #lines1 to get the array size, my $i to declare a lexical variable, and eq for string comparison. Along with if ( ... ) { ... }.
Technically you can use the binding operator to perform a string comparison, for example:
"foo" =~ "foobar"
But it is not a good idea when comparing literal strings, because you can get partial matches, and you need to escape meta characters. Therefore it is easier to just use eq.
Using C-style for loops is valid, but the more Perl-ish way is to use this notation:
for my $i (0 .. $#lines1)
Which will iterate over the range 0 to the max index of the array.
Perl allows you to open filehandles on strings by using a reference to the scalar variable that holds the string:
open my $string1_fh, '<', \$string1 or die '...';
open my $string2_fh, '<', \$string2 or die '...';
while( my $line1 = <$string1_fh> ) {
my $line2 = <$string2_fh>;
....
}
But, depending on what you mean by difference (does that include insertion or deletion of lines?), you might want something different.
There are several modules on CPAN that you can inspect for ideas, such as Test::LongString or Algorithm::Diff.
my #lines1 = split(/^/, $str1);
my #lines2 = split(/^/, $str2);
# splits at start of line
# use /\n/ if you want to ignore newline and trailing spaces
for ($i=0; $i < #lines1; $i++) {
print "difference in line $i \n" if (lines1[$i] ne lines2[$i]);
}
Comparing Arrays is a way easier if you create a Hashmap out of it...
#Searching the difference
#isect = ();
#diff = ();
%count = ();
foreach $item ( #array1, #array2 ) { $count{$item}++; }
foreach $item ( keys %count ) {
if ( $count{$item} == 2 ) {
push #isect, $item;
}
else {
push #diff, $item;
}
}
#Output
print "Different= #diff\n\n";
print "\nA Array = #array1\n";
print "\nB Array = #array2\n";
print "\nIntersect Array = #isect\n";
Even after spliting you could compare them as Array.

PowerShell foreach how to skip last element

I have the follow array with the elements
$list = "A","B","C","1","2","3"
using foreach I can view all items in the array.
foreach ( $item in $list ) { $item }
I would like to print all elements in the array but the last one. As I need to add ; at the end.
How can I go about doing this?
Is this what you're looking for?
$List = "A","B","C","1","2","3";
($List[0..($List.Length-2)] -join '') + ';';
Result
ABC12;
This can also be done in as one-liner:
-join $List -replace [Regex]'.$',';'
First -join sticks all the elements in the array together. Then -replace & regex replace the last element with ;.
Regex ('.$')
. = Matches any character except line breaks.
$ = Matches the end of the string.

How can I loop over an array from the first occurrence of an element with a specific value using perl?

I have an array like ("valueA", "valueB", "valueC", "valueD") etc. I want to loop over the values of the array starting from (for example) the first instance of "valueC". Everything in the array before the first instance of the value "valueC" should be ignored; so in this case only "valueC" and "valueD" would be handled by the loop.
I can just put a conditional inside my loop, but is there a neater way to express the idea using perl?
my $seen;
for ( grep $seen ||= ($_ eq "valueC"), #array ) {
...
}
I think you also need to check if the "valueC" exist inside the array.
Hope this helps.
use strict;
use warnings;
use List::Util qw(first);
my #array = qw(valueA valueB valueC valueD);
my $starting_element = 'valueC';
# make sure that the starting element exist inside the array
# first search for the first occurrence of the $stating_element
# dies if not found
my $starting_index = first { $array[$_] eq $starting_element } 0 .. $#array
or die "element \"$starting_element\" does not exist inside the array";
# your loop
for my $index ($starting_index .. $#array) {
print $array[$index]."\n";
}
my $seen;
for ( #array ) {
$seen++ if /valueC/;
next unless $seen;
...
}
But that $seen is a little ungainly. The flip-flop operator looks tidier IMO:
for ( #array ) {
next unless /^valueC$/ .. /\0/;
# or /^valueC$/ .. '' !~ /^$;
# or $_ eq 'valueC' .. /\0/;
...
}
Or simply (building on ikegami's suggestion):
for ( grep { /^valueC$/ .. /(*FAIL)/ } #array ) { ... }
use List::MoreUtils qw( first_index );
foreach my $item ( #array[ ( first_index { $_ eq 'ValueC' } #array ) .. $#array ] ){
# process $item
}
my $start = 0;
++$start while $start < #array && $array[$start] ne 'valueC';
followed by either
for (#array[$start..$#array]) {
say;
}
or
for my $i ($start..$#array) {
say $array[$i];
}
TIMTOWTDI, but I think that:
foreach my $item (#list) {
next if !$seen && ($item ne 'valueC');
$seen++;
...
}
is both readable, correct and and terse enough. All the /valueC/ solution will process anything after "DooDadvalueCFuBAr", not what the OP asked. And, no you need no flipflop/range operator, and checking for the existence beforehand is really strange, besides requiring a possibly noncore package to perform a rather trivial task.The grep solution is really making my head spin, besides creating and tossing a temp array as a side effect.
If you want to get fancy and avoid ''ifs':
foreach my $item (#list) {
$seen || ($item eq 'valueC') || next;
$seen++;
...
}
Just don't write home about it. :-)

How to create hash of array in Perl

I have a data like this
Group AT1G01040-TAIR-G
LOC_Os03g02970 69%
Group AT1G01050-TAIR-G
LOC_Os10g26600 85%
LOC_Os10g26633 35%
Group AT1G01090-TAIR-G
LOC_Os04g02900 74%
How can create the data structure that looks like this:
print Dumper \%big;
$VAR = { "Group AT1G01040-TAIR-G" => ['LOC_Os03g02970 69%'],
"Group AT1G01050-TAIR-G" => ['LOC_Os10g26600 85%','LOC_Os10g26633 35%'],
"Group AT1G01090-TAIR-G" => ['LOC_Os04g02900 74%']};
This is my attempt, but fail:
my %big;
while ( <> ) {
chomp;
my $line = $_;
my $head = "";
my #temp;
if ( $line =~ /^Group/ ) {
$head = $line;
$head =~ s/[\r\s]+//g;
#temp = ();
}
elsif ($line =~ /^\t/){
my $cont = $line;
$cont =~ s/[\t\r]+//g;
push #temp, $cont;
push #{$big{$head}},#temp;
};
}
Here's how I'd do it:
my %big;
my $currentGroup;
while (my $line = <> ) {
chomp $line;
if ( $line =~ /^Group/ ) {
$big{$line} = $currentGroup = [];
}
elsif ($line =~ s/^\t+//) {
push #$currentGroup, $line;
}
}
You should probably add some additional error checking to this, e.g. an else clause to warn about lines that don't match either regex. Also, check to see if $currentGroup is undef before pushing (in case the first line begins with a tab instead of "Group").
The biggest problem with your original code is that you're declaring and initializing $head and #temp inside the loop, which means they got reset on every line. Variables that need to persist across lines have to be declared outside the loop, as I've done with $currentGroup.
I'm not quite sure what you're intending to accomplish with the s/[\r\s]+//g; bit. \r is included in \s, so that means the same as s/\s+//g; (which would strip all whitespace), but your desired result hash includes whitespace in your keys. If you want to strip trailing whitespace, you need to include an anchor: s/\s+\z//.
Well, I don't want to give you an answer, so I'll just tell you to look at:
perlref
perlreftut
Well, there ya go :-).
Your pushing arrays to your hash item. You should just be pushing the values. (You don't need #temp at all.)
push #{$big{$head}}, $cont;
Also $head must be declared outside your loop, otherwise it looses its value after each iteration.

Resources