Getting indices of matching parentheses - arrays

Hi I am trying to print indices of the following pattern of brackets:
((((((...)))(((...))))))
as follows:
0 23
1 22
2 21
3 11
4 10
5 9
12 20
13 19
14 18
I tried to achieve this using this perl code as given below:
#!/usr/bin/perl
use strict;
use warnings;
my $string = '((((((...)))(((...))))))';
my #myarray = split('', $string);
my #stack;
my #stack1;
while (my ($index, $element) = each(#myarray))
{
if ($element eq '(')
{
push(#stack, $index);
}
if ($element eq ')')
{
push(#stack1, $index);
}
}
print "$stack[$_]-$stack1[$_]\n" for (0 .. $#stack);
But the above code is giving me following output which is not the required output:
0-9
1-10
2-11
3-18
4-19
5-20
12-21
13-22
14-23
Is there any way I can achieve this?

Push to the stack on the left hand side parenthesis, pop on the right hand side.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my $string = '((((((...)))(((...))))))';
my #output;
my #stack;
my $pos = 0;
for my $char (split //, $string) {
if ($char eq '(') {
push #stack, $pos;
} elsif ($char eq ')') {
push #output, [ pop #stack, $pos ];
}
++$pos;
}
say "#$_" for sort { $a->[0] <=> $b->[0] } #output;

Fun question! The accepted answer is fine, but here's another way to do it, because it's always educational to see another solution.
#!/usr/bin/perl
use strict;
use warnings;
my $string = '((((((...)))(((...))))))';
my (#out, #match);
while ($string =~ m/([()])/g) {
my $p = pos($string) - 1;
if ($1 eq '(') {
push #out, [$p];
push #match, $#out;
}
else {
die "mismatched paren at char $p\n"
unless #match;
$out[pop #match][1] = $p;
}
}
for (#out) { print "#$_\n" }
exit(0);
Output is exactly as per your desired output. Dies on mismatched parentheses (which choroba's code could also do with an appropriate test in the elsif block). Unmatched parentheses result in lines with no second number, and they will also be residual in #match after the while loop.
I've opted to use Perl's pattern matching a little instead of breaking the string into individual characters and iterating over them all. Instead, I match on each open or close parenthesis in turn, using the "g" modifier. Thus, the loop only iterates over characters of interest. The pos() function on $string returns the point after the last match, so I need to subtract one to get zero-based output.
The other key difference is that I accumulate in #out, and track the corresponding close by noting the last index of #out, pushing it on #match. I then pop off #match as I find closing parentheses, and add the second element to the sub-array in #out at that position. This eliminates the need for sorting the final result, as #out is already in order of opening parentheses.

Related

compare an array of string with another array of strings in perl

I want to compare an array of string with another array of strings; if it matches, print matched.
Example:
#array = ("R-ID 1.0001", "RA-ID 61.02154", "TCA-ID 49.021456","RCID 61.02154","RB-ID 61.02154");
#var = ("TCA-ID 49", "R-ID 1");
for (my $x = 0; $x <= 4; $x++)
{
$array[$x] =~ /(.+?)\./;
if( ($var[0] eq $1) or ($var[1] eq $1) )
{
print "\n deleted rows are :#array\n";
}
else
{
print "printed rows are : #array \n";
push(#Matrix, \#array);
}
Then I need to compare #var with the #array; if it is matched, print the matched pattern.
Here the entire logic is in a hireartical for loop which gives a new #array in each iteration. so every time this logic is executed #array has different strings.
Then comes with #var it is user input field, this #var can be of any size. So in order to run the logic according to these constraints, I need to iterate the condition inside the if loop when the user input #var size is 3 for example.
So the goal is to match and delete the user input stings using the above mentioned logic. But unfortunately tis logic is not working. Could you please help me out in this issue.
The builtin grep keyword is a good place to start.
my $count = grep { $_ eq $var } #array;
This returns a count of items ($_) in the array which are equal (eq) to $var.
If you needed case-insensitive matching, you could use lc (or in Perl 5.16 or above, fc) to do that:
my $count = grep { lc($_) eq lc($var) } #array;
Now, a disadvantage to grep is that it is counting the matches. So after if finds the first match, it will keep on going until the end of the array. You don't seem to want that, but just want to know if any item in the array matches, in which case keeping on going might be slower than you need if it's a big array with thousands of elements.
So instead, use any from the List::Util module (which is bundled with Perl).
use List::Util qw( any );
my $matched = any { $_ eq $var } #array;
This will match as soon as it finds the first matching element, and skip searching the rest of the array.
Here is a couple of versions that allows multiple strings to be matched. Not clear what form $var takes when you want to store multiple, so assuming they are in an array #var for now.
The key point is this one is the use of the lookup hash to to the matching.
use strict;
use warnings;
my #var = ("TCA-ID 49", "RA-ID 61");
my #array = ("R-ID 1", "RA-ID 61", "TCA-ID 49");
# create a lookup for the strings to match
my %lookup = map { $_ => 1} #var ;
for my $entry (#array)
{
print "$entry\n"
if $lookup{$entry} ;
}
running gives
RA-ID 61
TCA-ID 49
Next, using a regular expression to do the matching
use strict;
use warnings;
my #var = ("TCA-ID 49", "RA-ID 61");
my #array = ("R-ID 1", "RA-ID 61", "TCA-ID 49");
my $re = join "|", map { quotemeta } #var;
print "$_\n" for grep { /^($re)$/ } #array ;
output is the same

Perl output successive string from array

I have an array I can print out as "abcd" however I am trying to print it as "a>ab>abc>abcd". I can't figure out the nested loop I need within the foreach loop I have. What loop do I need within it to print it this way?
my $str = "a>b>c>d";
my #words = split />/, $str;
foreach my $i (0 .. $#words) {
print $words[$i], "\n";
}
Thank you.
You had the right idea, but instead of printing the word at position i, you want to print all the words between positions 0 and i (inclusive). Also, your input can contain multiple strings, so loop over them.
use warnings;
while (my $str = <>) { # read lines from stdin or named files
chomp($str); # remove any trailing line separator
my #words = split />/, $str; # break string into array of words
foreach my $i (0 .. $#words) {
print join '', #words[0 .. $i]; # build the term from the first n words
print '>' if $i < $#words; # print separator between terms (but not at end)
}
print "\n";
}
There are many other ways to write it, but hopefully this way helps you understand what's happening and why. Good luck!
one liner:
perl -e '#a=qw(a b c d); for(#a) {$s.=($h.=$_).">"} $s=substr($s,0,-1);print $s'
I would do it like this:
#!/usr/bin/perl
use strict;
use warnings;
my $str = "a>b>c>d>e>f>g";
my #words = split />/, $str;
$" = '';
my #new_words;
push #new_words, "#words[0 .. $_]" for 0 .. $#words;
print join '>', #new_words;
A few things to explain.
Perl will expand array variables in a double-quoted string. So something like this:
#array = ('x', 'y', 'z');
print "#array";
will print x y z. Notice there are spaces between the elements. The string that is inserted between the elements is controlled by the $" variable. So by setting that variable to an empty string we can remove the spaces, so:
$" = '';
#array = ('x', 'y', 'z');
print "#array";
will print xyz.
The most complex line is:
push #new_words, "#words[0 .. $_]" for 0 .. $#words;
That's just a compact way to write:
for (0 .. $#words) {
my $new_word = "#words[0 .. $_]";
push #new_words, $new_word;
}
We iterate across the integers from zero to the last index in #words. Each time around the loop, we use an array slice to get a list of elements from the array, convert that to a string (by putting it in double-quotes) and then push that string onto #new_words.
This is what I ended up with, It's the only way I could understand and get the output I was looking for.
use strict;
use warnings;
my $str = "a>b>c>d>e>f>g";
my #words = split />/, $str;
my $j = $#words;
my $i = 0;
my #newtax;
while($i <= $#words){
foreach my $i (0 .. $#words - $j){
push (#new, $words[$i]);
}
if($i < $#words){
push(#new, ">");
}
$j--;
$i++;
}
print #new;
This output "a>ab>abc>abcd>abcde>abcdef>abcdefg"

Empty array in a perl while loop, should have input

Was working on this script when I came across a weird anomaly. When I go to print #extract after declaring it, it prints correctly the following:
------MMMMMMMMMMMMMMMMMMMMMMMMMM-M-MMMMMMMM
------SSSSSSSSSSSSSSSSSSSSSSSSSS-S-SSSSSDTA
------TIIIIIIIIIIIIITIIIVVIIIIII-I-IIIIITTT
Now the weird part, when I then try to print or return #extract (or $column) inside of the while loop, it comes up empty, thus rendering the rest of the script useless. I've never come across this before up until now, haven't been able to find any documentation or people with similar problems as mine. Below is the code, I marked with #<------ where the problems are and are not, to see if anyone can have any idea what is going on? Thank you kindly.
P.S. I am utilizing perl version 5.12.2
use strict;
use warnings;
#use diagnostics;
#use feature qw(say);
open (S, "Val nuc align.txt") || die "cannot open FASTA file to read: $!";
open (OUTPUT, ">output.txt");
my #extract;
my $sum = 0;
my #lines = <S>;
my #seq = ();
my $start = 0; #amino acid column start
my $end = 10; #amino acid column end
#Removing of the sequence tag until amino acid sequence composition (from >gi to )).
foreach my $line (#lines) {
$line =~ s/\n//g;
if ($line =~ />/g) {
$line =~ s/>.*\]/>/g;
push #seq, $line;
}
else {
push #seq, $line;
}
}
my $seq = join ('', #seq);
my #seq_prot = join "\n", split '>', $seq;
#seq_prot = grep {/[A-Z]/} #seq_prot;
#number of sequences
print OUTPUT "Number of sequences:", scalar (grep {defined} #seq_prot), "\n";
#selection of amino acid sequence. From $start to $end.
my #vertical_array;
while ( my $line = <#seq_prot> ) {
chomp $line;
my #split_line = split //, $line;
for my $index ( $start..$end ) { #AA position, extracts whole columns
$vertical_array[$index] .= $split_line[$index];
}
}
# Print out your vertical lines
for my $line ( #vertical_array ) {
my $extract = say OUTPUT for unpack "(a200)*", $line; #split at end of each column
#extract = grep {defined} $extract;
}
print OUTPUT #extract; #<--------------- This prints correctly the input
#Count selected amino acids excluding '-'.
my %counter;
while (my $column = #extract) {
print #extract; #<------------------------ Empty print, no input found
}
Update: Found the main problem to be with the unpack command, I thought I could utilize it to split my columns of my input at X elements (43 in this case). While this works, the minute I change $start to another number that is not 0 (say 200), the code brings up errors. Probably has something to do with the number of column elements does not match the lines. Will keep updated.
Write your last while loop the same way as your previous for loop. The assignment
my $column = #extract
is in scalar context, which does not give you the same result as:
for my $column (#extract)
Instead, it will give you the number of elements in the array. Try this second option and it should work.
However, I still have a concern, because in fact, if #extract had anything in it, you would obtain an infinite loop. Is there any code that you did not include between your two commented lines?

How can I print only the even numbered lines of an array in perl?

I am new to Perl and am trying to write a script that will only print the even numbered lines of an array. I have tried multiple different methods of finding the size to use as the condition for my while loop, but I always end up getting an infinite loop of the first line without the program terminating. The array being input is a text file, input with the form "program.pl < foo.txt". Have I made a logic or syntax error?
#input = <STDIN>;
$i = $1;
$size = $#input + $1;
while ($size >= $i) {
print "$input[$i]";
$i = ($i + $2);
}
Don't call your problem with
program.pl < foo.txt
Instead, just pass 'foo.txt' as a parameter:
program.pl foo.txt
Inside your script, rely on default reading from <> and the line number variable $.:
use strict;
use warnings;
while (<>) {
next if $. % 2; # Skip odd numbers.
print;
}
Assuming you already have an array with all of your input, in your example #input, you can get all of the even index entries into another array using an Array Slice like so:
my #input_even_entries_only = #input[grep { $_ % 2 == 0 } 0..$#input];
The expression inside the square brackets evaluates to all of the even numbers between 0 and $#input.
You can then use a regular for/foreach loop to go through the resulting array:
for my $val (#input_even_entries_only) {
print "$val";
}
If you are trying to print lines of an array indexed at even numbers then, try this:
use strict;
use warnings;
my #input = <DATA>;
for(my $i=0; $i<=$#input; $i+=2) {
print $input[$i];
}
__DATA__
1
2
3
4
5
6
Output:
1
3
5
I've no idea what you are doing with the $1 and $2 variables. Did you think they were just numbers?
When you use a variable that has not been assigned a value, it is undefined, which will be converted to 0 when used in numerical context. If you do not use use warnings, this is done silently, and will be rather confusing.
Other than that, your code is not too far off. It should be something like:
use strict;
use warnings;
my #input = <>; # <> is more flexible and does the same thing
my $i = 1;
while ($i <= $#input) {
print $input[$i];
$i += 2;
}
Though of course, storing the entire file in an array is not necessary, and most often you should just loop over it instead. Like Miller has shown in his answer, which is probably the solution I would suggest. Using a for loop like JS shows is an excellent way to control the loop.

Perl Script -: Useless use of array element in void context at letter_counter.pl lin 38 and 44

this is first my perl script
http://bpaste.net/show/171137/
#!/usr/bin/perl
#This program will take a user's input and then count how many letters there are. Whereupon it will count the number of unique letters before printing all the data
#back to the user.
use strict;
use warnings;
#======================================================================================================================
# This section is to collect and spit back the input to the user.
#======================================================================================================================
print "\n\nHello, please enter a word, a phrase, or a sentence. Press Enter when you are done.\n";
my $input = <>; #Collecting the input from the user.
chomp $input; #Chomping, or removing, the \n from the end of the input.
print "\nYou typed -:[$input]\n";
#======================================================================================================================
#This section will find how many unique characters there are.
#======================================================================================================================
my #uniqueArray;
my #stringArray = split(// , $input);
my $x = 0;
my $string_max_index = $#stringArray;
for($stringArray[$x];$stringArray[$string_max_index];$x++)
{
my $found = 0;
my $test = $stringArray[$x];
my $y = 0;
for($uniqueArray[$y];$uniqueArray[$#uniqueArray];$y++)
{
if($test eq $uniqueArray[$y])
{
$found=1;
}
}
if($found eq 1)
{
$uniqueArray[$#uniqueArray] = $stringArray[$x];
}
}
#======================================================================================================================
# This section will determine how many ascii characters are in the $input variable and output the results of this
# program.
#======================================================================================================================
my $numOfLet = 0;
while ( $input ne "" )
{
$numOfLet = $numOfLet + 1;
chop $input
}
print "Total Characters -: $numOfLet";
print "Total of Unique Characters -: $#uniqueArray \n\n\n";
exit;
I was able to get rid of all the errors except for these two,
Useless use of array element in void context at letter_counter.pl line 38
Useless use of array element in void context at letter_counter.pl line 44
What is confusing me is that There is nothing at those lines, just the closing brackets for my for loop, which leads me to believe that the issue is an element I called in each for loop.
The initialization block of your for loop is the immediate culprit. Adjusting to something like this resolves the warning:
for(;$stringArray[$string_max_index];$x++)
Otherwise you're accessing a value, but doing... nothing with it? That's what the warning is for.
I spot a few other problems, though:
Your for loops are... a little funny, I don't know how else to put that.
Array length is usually easiest to read with the scalar keyword.
Adding members to an array is usually done with the push keyword.
Using the above in combination:
for(my $x = 0; $x < scalar #stringArray;$x++)
{
my $found = 0;
my $test = $stringArray[$x];
my $y = 0;
for (my $y = 0; !$found && $y < scalar #uniqueArray;$y++)
{
if($test eq $uniqueArray[$y])
{
$found=1;
}
}
unless ($found)
{
push #uniqueArray, $stringArray[$x];
}
}
If the above for loops don't look sensible to you, now is a good time to look up some tutorials.
This could be simplified with foreach loops:
foreach my $letter (#stringArray) {
...
}
Or with grep searches:
my $found = grep { $_ eq $letter } #uniqueArray;
But, in the particular case of counting unique values, it's often simplest to assign to a hash:
my %uniques;
$uniques{$_} = 1 for #stringArray;
my $num_uniques = scalar keys %uniques;
Combining all of that:
my #letters = split(//, $input); #split input into array of chars
my %uniques; #declare empty hash
$uniques{$_} = 1 for #letters; #set hash key for each char
my $num_letters = scalar #letters; #count entries in letter list
my $num_uniques = scalar keys %uniques; #count unique keys in letter hash
Exercise for the reader: adjust the above code so that it counts the number of times each character is used.
That's because #uniqueArray is empty...
Given this short example:
use strict;
use warnings;
my #arr;
my $t = 0;
for ($arr[$t]; $arr[$#arr]; $t++ ) {
print "no\n";
}
__OUTPUT__
Useless use of array element in void context at t.pl line 11.
You declare my #uniqueArray; at line 21 and never do anything with it...
Which also means how will this ever match at line 34?
if($test eq $uniqueArray[$y])
Again, #uniqueArray is an empty array.
To fix your script (although please look at rutter's hash suggestion), you can do the following. Remove:
my $x = 0;
my $y = 0;
Instead of using C-style loops, replace with the following:
for my $x (0 .. $string_max_index )
for my $y (0 .. $#uniqueArray)
Lastly, use the following:
if(!$found)
{
push #uniqueArray, $stringArray[$x];
}
Hope this helps!

Resources