Perl find similar elements from two arrays - arrays

I would like to retrieve the elements of the @amplicon_exon array that contain (are like) an element of the @failedamplicons array. Each element in @failedamplicons is unique and can match only a single element of @amplicon_exon. I've tried two nested for loops, but I get repeated values. Is there a better way of finding and retrieving similar values from the two arrays?
@failedamplicons example:
OCP1_FGFR3_8.87
OCP1_AR_14.89
@amplicon_exon example:
TEST_Focus_ERBB2_2:22:ERBB2:GENE_ID=ERBB2;PURPOSE=CNV,Hotspot;CNV_ID=ERBB2;CNV_HS=1
OCP1_FGFR3_8:intron:FGFR3:GENE_ID=FGFR3;PURPOSE=CNV;CNV_ID=FGFR3;CNV_HS=1
OCP1_CDK6_14:intron:CDK6:GENE_ID=CDK6;PURPOSE=CNV;CNV_ID=CDK6;CNV_HS=1
Here is the code with the two for loops:
my $i = 0;
my $j = 0;
for ( $i = 0; $i < @amplicon_exon; $i++ ) {
    for ( $j = 0; $j < @failedamplicons; $j++ ) {
        my $fail_amp = ( split /\./, $failedamplicons[$j] )[0];
        #print "the failed amp before match is $fail_amp\n";
        if ( index( $amplicon_exon[$i], $fail_amp ) != -1 ) {
            #print "the amplicon exon that matches $amplicon_exon[$i] and sample is $sample_id\n";
            print "the failed amp that matches $fail_amp and sample is $sample_id\n";
            my @parts = split /:/, $amplicon_exon[$i];
            my $exon_amp = $parts[1];
            next unless $parts[3] =~ /Hotspot/;    #includes only Hotspot amplicons
            my $gene_res = $parts[2];
            my $depth = ( split /\./, $failedamplicons[$j] )[1];
            my @total_amps = (
                $run_name, $sample_id, $gene_res, $depth, $fail_amp, $run_date, $matrix_status
            );
            my $lines = join "\t", @total_amps;
            push( @finallines, $lines );
        }
    }
}

split and grep are your friends, as is the idiomatic approach to iterating over a list. Iterate over the first array and extract just the part you want to match on (split the element on the . character and take only the first field). Then grep the second array with a regex that anchors that part at the beginning of the element and requires it to be followed by a colon:
for my $elem (@failedamplicons) {
    my $to_match = (split /\./, $elem)[0];
    if (my ($matched) = grep { $_ =~ /^\Q$to_match\E:/ } @amplicon_exon) {
        print "$matched\n";
    }
}
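If both arrays are large, the grep rescans @amplicon_exon for every failed amplicon. A minimal sketch of an alternative, assuming the amplicon ID is always the text before the first colon in @amplicon_exon and before the first dot in @failedamplicons (as in the samples above): index the annotation lines in a hash once, then look up each failed amplicon by exact key. The helper name match_failed and its return shape are hypothetical and not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: pairs each failed amplicon with its annotation line.
# Returns a list of [ $id, $depth, $annotation ] array refs.
sub match_failed {
    my ($failed, $annotations) = @_;

    # Index the annotation lines by the ID that precedes the first colon.
    my %by_id;
    for my $line (@$annotations) {
        my ($id) = split /:/, $line, 2;
        $by_id{$id} = $line;
    }

    my @matches;
    for my $elem (@$failed) {
        # "OCP1_FGFR3_8.87" is the ID "OCP1_FGFR3_8" plus the depth "87".
        my ($id, $depth) = split /\./, $elem, 2;
        push @matches, [ $id, $depth, $by_id{$id} ] if exists $by_id{$id};
    }
    return @matches;
}

my @failedamplicons = ('OCP1_FGFR3_8.87', 'OCP1_AR_14.89');
my @amplicon_exon   = (
    'TEST_Focus_ERBB2_2:22:ERBB2:GENE_ID=ERBB2;PURPOSE=CNV,Hotspot;CNV_ID=ERBB2;CNV_HS=1',
    'OCP1_FGFR3_8:intron:FGFR3:GENE_ID=FGFR3;PURPOSE=CNV;CNV_ID=FGFR3;CNV_HS=1',
    'OCP1_CDK6_14:intron:CDK6:GENE_ID=CDK6;PURPOSE=CNV;CNV_ID=CDK6;CNV_HS=1',
);

for my $match (match_failed(\@failedamplicons, \@amplicon_exon)) {
    my ($id, $depth, $annotation) = @$match;
    print "$id (depth $depth): $annotation\n";
}
```

Because the lookup is an exact key match, an ID can never match a longer ID that merely contains it, which is one way repeated hits can arise with index().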

Related

Is there a built in Perl Function for finding duplicate subarrays(exact order) in an array?

Let's say the array is (1,2,3,4,5,6,7,8,9),
and another subarray is (2,3,4).
Is there a function to check whether the subarray pattern (in full, exact order) exists within the array?
In this case, it would return some indicator (an index) showing that it exists.
It would also need to work when the subarray occurs multiple times in the array, as in (4,2,3,4,2,3,4).
If it happens to match multiple times, for example:
Array = (2,3,2,3,2,2,3,2)
Sub Array = (2,3,2)
it would just return the starting indexes of the matches in order: 0,2,5
Or, if it removes them, the result would be (3,2).
Edit: Elements don't have to be numeric.
There's no built-in method, but it's easy to write:
#!/usr/bin/env perl
use warnings;
use strict;
use feature qw/say/;

# Takes two arrayrefs of numbers.
#
# Returns the first index in the first one where the second list appears, or
# -1 if not found.
sub find_sublist(++) {
    my ($haystack, $needle) = @_;
    my $nlen = @$needle;
    my $hlen = @$haystack;
    return -1 if $hlen == 0 || $nlen == 0;
  HAYSTACK_POS:
    for (my $n = 0; $n <= $hlen - $nlen; $n++) {
        for (my $m = 0; $m < $nlen; $m++) {
            if ($haystack->[$n + $m] != $needle->[$m]) {
                next HAYSTACK_POS;
            }
        }
        return $n;
    }
    return -1;
}

# Takes two arrayrefs of numbers.
#
# Returns a list of the starting indexes in the first list
# of every run of the second list. Returns an empty list if
# there are no matches.
sub find_sublists(++) {
    my ($haystack, $needle) = @_;
    my $nlen = @$needle;
    my $hlen = @$haystack;
    my @positions;
    return @positions if $hlen == 0 || $nlen == 0;
  HAYSTACK_POS:
    for (my $n = 0; $n <= $hlen - $nlen; $n++) {
        for (my $m = 0; $m < $nlen; $m++) {
            if ($haystack->[$n + $m] != $needle->[$m]) {
                next HAYSTACK_POS;
            }
        }
        push @positions, $n;
    }
    return @positions;
}

# Takes two arrayrefs of numbers.
#
# Returns a new list that is the first one with every non-overlapping run of
# the second list removed.
sub remove_sublists(++) {
    my @haystack = @{$_[0]};
    my $needle = $_[1];
    while ((my $pos = find_sublist @haystack, $needle) != -1) {
        splice @haystack, $pos, @$needle;
    }
    return @haystack;
}

my @list1 = (1,2,3,4,5,6,7,8,9);
my @list2 = (4,2,3,4,2,3,4);
my @list3 = (2,3,2,3,2,2,3,2);
say find_sublist(@list1, [2, 3, 4]); # Returns 1
say find_sublist([2,9,3,4], [2,3,4]); # Returns -1
my @positions = find_sublists(@list2, [2,3,4]); # 1,4
say join(",", @positions);
@positions = find_sublists(@list3, [2,3,2]); # 0,2,5
say join(",", @positions);
say join(",", remove_sublists(@list1, [2,3,4])); # 1,5,6,7,8,9
say join(",", remove_sublists(@list3, [2,3,2])); # 3,2
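The functions above compare elements with !=, so they only suit numbers, while the question's edit says the elements need not be numeric. A minimal sketch of a string-safe variant, assuming plain array references are passed and that comparing with ne is acceptable for the data (so 1 and 1.0 would count as different). The name find_sublists_str is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical variant of find_sublists that compares elements as strings.
# Takes two array refs; returns the start index of every run of @$needle
# in @$haystack, overlapping runs included.
sub find_sublists_str {
    my ($haystack, $needle) = @_;
    my @positions;
    return @positions unless @$haystack && @$needle;

  HAYSTACK_POS:
    for my $n (0 .. @$haystack - @$needle) {
        for my $m (0 .. $#$needle) {
            # ne is the string inequality operator
            next HAYSTACK_POS if $haystack->[$n + $m] ne $needle->[$m];
        }
        push @positions, $n;
    }
    return @positions;
}

print join(",", find_sublists_str([qw(a b a b a a b a)], [qw(a b a)])), "\n";
```

When the needle is longer than the haystack, the range 0 .. (negative number) is empty, so the loop body never runs and an empty list is returned.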
If the inputs are numbers representable by your perl's integers (as shown), you can use
# Indexes
my $pattern = pack "W*", @pattern;
my $array = pack "W*", @array;
my @indexes;
push @indexes, $-[0] while $array =~ /\Q$pattern/g;
# Removal
my $pattern = pack "W*", @pattern;
my $array = pack "W*", @array;
$array =~ s/\Q$pattern//g;
@array = unpack "W*", $array;
How it handles overlaps:
           /---\     /---\  Removed
2,3,2 from 2,3,2,3,2,2,3,2
               \---/        Not removed
Note that this also works if you can map the inputs to numbers.
my ( %map_f, @map_r );
for ( @array, @pattern ) {
    if ( !exists $map_f{ $_ } ) {
        $map_f{ $_ } = @map_r;    # next free number for this value
        push @map_r, $_;
    }
}
my $pattern = pack "W*", @map_f{ @pattern };
my $array = pack "W*", @map_f{ @array };
$array =~ s/\Q$pattern//g;
@array = @map_r[ unpack "W*", $array ];
It's not the best algorithm, but it should be very fast by moving the work from Perl to the regex engine.
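One caveat on the index search: a plain /g match resumes after the end of the previous match, so it reports only non-overlapping occurrences, while the question asks for 0,2,5. A minimal sketch that wraps the pattern in a zero-width lookahead so overlapping starts are reported too, under the same assumption that every element is a non-negative integer accepted by pack "W". The name overlapping_indexes is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the start index of every occurrence of
# @$needle in @$haystack, overlapping occurrences included.
# Assumes all elements are non-negative integers that pack "W" accepts.
sub overlapping_indexes {
    my ($haystack, $needle) = @_;
    return () unless @$haystack && @$needle;

    my $pattern = pack "W*", @$needle;
    my $array   = pack "W*", @$haystack;

    my @indexes;
    # The lookahead matches without consuming any characters, so the
    # regex engine retries from the very next position each time.
    push @indexes, $-[0] while $array =~ /(?=\Q$pattern\E)/g;
    return @indexes;
}

print join(",", overlapping_indexes([2,3,2,3,2,2,3,2], [2,3,2])), "\n";
```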

Comparing two strings line by line in Perl

I am looking for code in Perl similar to
my @lines1 = split /\n/, $str1;
my @lines2 = split /\n/, $str2;
for (int $i=0; $i<lines1.length; $i++)
{
if (lines1[$i] ~= lines2[$i])
print "difference in line $i \n";
}
to compare two strings line by line and show the lines at which there is any difference.
I know that what I have written is a mixture of C, Perl, and pseudocode. How do I write it so that it works in Perl?
What you have written is close, but some of the notation is not Perl: lines1.length and int $i do not exist, and ~= is not an operator. You mean =~, but that is the wrong tool here. Also, if must be followed by a block { }.
What you want is simply $i < @lines1 to get the array size, my $i to declare a lexical variable, and eq for string comparison (or ne to test for a difference), along with if ( ... ) { ... }.
Technically you can use the binding operator to perform a string comparison, for example:
"foo" =~ "foobar"
But it is not a good idea when comparing literal strings, because you can get partial matches, and you need to escape meta characters. Therefore it is easier to just use eq.
Using C-style for loops is valid, but the more Perl-ish way is to use this notation:
for my $i (0 .. $#lines1)
Which will iterate over the range 0 to the max index of the array.
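Putting those pieces together, a minimal sketch of the whole comparison. It assumes that a line present in only one of the two strings should also count as a difference, which the question does not specify. The helper name differing_lines is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the 0-based indexes of the lines that differ.
sub differing_lines {
    my ($str1, $str2) = @_;
    my @lines1 = split /\n/, $str1;
    my @lines2 = split /\n/, $str2;

    # Walk to the end of the longer list so that extra lines are reported.
    my $last = $#lines1 > $#lines2 ? $#lines1 : $#lines2;

    my @diff;
    for my $i (0 .. $last) {
        my ($l1, $l2) = ($lines1[$i], $lines2[$i]);
        # A missing line on either side is a difference; otherwise compare
        # the two lines as strings with ne.
        push @diff, $i if !defined $l1 || !defined $l2 || $l1 ne $l2;
    }
    return @diff;
}

print "difference in line $_\n" for differing_lines("a\nb\nc", "a\nx\nc\nd");
```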
Perl allows you to open filehandles on strings by using a reference to the scalar variable that holds the string:
open my $string1_fh, '<', \$string1 or die '...';
open my $string2_fh, '<', \$string2 or die '...';
while( my $line1 = <$string1_fh> ) {
my $line2 = <$string2_fh>;
....
}
But, depending on what you mean by difference (does that include insertion or deletion of lines?), you might want something different.
There are several modules on CPAN that you can inspect for ideas, such as Test::LongString or Algorithm::Diff.
my @lines1 = split(/^/, $str1);
my @lines2 = split(/^/, $str2);
# splits at start of line, so each element keeps its newline
# use /\n/ instead if you want the newlines removed
for (my $i = 0; $i < @lines1; $i++) {
    print "difference in line $i \n" if $lines1[$i] ne $lines2[$i];
}
Comparing arrays is much easier if you build a hash from them...
# Searching for the difference (assumes no element repeats within one array)
my @isect = ();
my @diff = ();
my %count = ();
foreach my $item ( @array1, @array2 ) { $count{$item}++; }
foreach my $item ( keys %count ) {
    if ( $count{$item} == 2 ) {
        push @isect, $item;
    }
    else {
        push @diff, $item;
    }
}
# Output
print "Different= @diff\n\n";
print "\nA Array = @array1\n";
print "\nB Array = @array2\n";
print "\nIntersect Array = @isect\n";
Even after splitting, you can compare the results as arrays.

Transform/pivot array in perl

I'm stuck writing Perl code that transforms a 2D array.
The first column of the array is always a date.
The second column of the array is the key to sort by.
The data is located in the array "data" and is ordered by date and then by key.
My situation should be understandable from the tables below. The unique values from the second column are selected and later become part of the column headers (green table).
It should work with any number of columns or dates/keys.
Structure before
Structure after
My code:
#creates filtered array of all the unique dates and its count
my @date = @{ $data->[0] };
my @filtDate = uniq @date;
my $countFiltDate = scalar @filtDate;
#unique list of keys
my @klic = @{ $data->[1] };
my @filtKlic = uniq @klic;
#orders filtered keys
@filtKlic = sort @filtKlic;
my $countFiltKlic = scalar @filtKlic;
#count of columns
my $columnsCount = scalar @{$data};
#test code - counts how many new number columns to make.
my $columnsCountAfter = ( $columnsCount - 2 ) * $countFiltKlic;
#inserts filtered dates into first column
my $dataGraph;
for ( my $i = 0; $i < $countFiltDate; $i++ ) {
$dataGraph->[0]->[$i] = $filtDate[$i];
}
#biggest loop with number of dates
for ( my $k = 0; $k < $countFiltDate; $k++ ) {
my $l;
my $c;
#columns count, index $i
for ( my $i = 0; $i < $columnsCount - 2; $i++ ) {
#loop over the different keys, index $j
for ( my $j = 0; $j < $countFiltKlic; $j++ ) {
$l++; #row in the first table
#EVERYTHING after this part is written horribly.
# I'm trying to make it work even
#if key values are missing.
for ( my $m = 0; $m < 5; $m++ ) {
if ( $data->[1]->[ $l - 1 ] eq $filtKlic[$j] ) {
print " [" . $data->[1]->[ ( $l - 1 ) ] . ',' . $filtKlic[$j] . "]";
$dataGraph->[ $l + $c ]->[$k] = $data->[ $i + 2 ]->[ ( ( $k * $countFiltKlic ) + $j ) ];
#print " [".$data->[1]->[$j].','.($filtKlic[($j)])."]-";
print " [" . ( $i + 2 ) . ',' . ( ( $k * $countFiltKlic ) + $j ) . "]-";
print " [" . ( $l + $c ) . ',' . $k . "]<br>";
$m = 5; #just random number... i don't want to get infinite loops during testing
} else {
if ( $m = 5 ) {
$l--;
$c++;
}
$j++;
}
}
}
}
}
my @nameStlpceKlic;
$nameStlpceKlic[0] = "date";
my $o;
for ( my $i = 0; $i < $columnsCount - 2; $i++ ) {
foreach (@filtKlic) {
my $o;
$o++;
$nameStlpceKlic[$o] = $filtKlic[ ( $o - 1 ) ] . $i;
}
}
I have 2 problems:
How do I make sure that this works even if some of the keys are missing on some dates?
How do I write it properly? My code is too clumsy.
Here is my general approach for solving this kind of problem.
In the second table, you're grouping your data by the date, then displaying the values for number1 and the values for number2. This should give you a hint as to how you want to organise your data structure and what you need to index for printing.
Your current data is (I assume) stored in an array of arrays. I was too lazy to copy the values, so I made my own AoA with made up values. I've sprinkled comments through the code so you can see how I worked on this.
my $arr = [
['date','key','number1','number2'],
['22.12.2013','1a','1a1-34567','1a2-1234567'],
['22.12.2013','2b','2b1-3249871','2b2-4597134'],
['22.12.2013','3c','3c1-1234567',''],
['22.12.2013','4d','4c1-3249871','4c2-4597134'],
['22.13.2013','1a','1a1-34567','1a2-1234567'],
['22.13.2013','2b','','2b2-4597134'],
['22.13.2013','3c','3c1-1234567','3c2-1234567'],
['22.13.2013','4d','4c1-3249871','4c2-4597134'],
];
# first, remove the first row, which contains the column headers.
my $col_h = shift @$arr;
my $data;
my $key_list;
foreach (@$arr) {
    my %hash;
    # use a hash slice with the column header array as keys
    # and the array as the values
    @hash{@$col_h} = @$_;
    # store this hash in a data hash indexed by date then key
    $data->{ $hash{date} }{ $hash{key} } = \%hash;
    # compile a separate hash with the keys in it
    $key_list->{ $hash{key} }++;
}
# make a sorted list of keys, ready for printing
my @key_list = sort keys %$key_list;
# remove the first two elements from the column headers ('date' and 'key')
splice(@$col_h, 0, 2);
# print out the header row for the table (I'm doing a simple tab-delim'd table)
print STDERR "Date\t";
# for each NUMBER from NUMBER1 ... NUMBERn
foreach my $c (@$col_h) {
    # print "keyID NUMBERn"
    map { print STDERR "$_ $c\t" } @key_list;
}
print STDERR "\n";
# Now print out the data itself. Sort by date...
foreach my $date (sort keys %$data) {
    print STDERR "$date\t";
    # for each NUMBER1 ... NUMBERn
    foreach my $header (@$col_h) {
        foreach my $key (@key_list) {
            ## print out the value OR - if there is no value
            print STDERR ( $data->{$date}{$key}{$header} || "-" ) . "\t";
        }
    }
    print STDERR "\n"; # end of the table row
}
Output (with tabs expanded for display purposes):
Date        1a number1   2b number1   3c number1   4d number1   1a number2   2b number2   3c number2   4d number2
22.12.2013  1a1-34567    2b1-3249871  3c1-1234567  4c1-3249871  1a2-1234567  2b2-4597134  -            4c2-4597134
22.13.2013  1a1-34567    -            3c1-1234567  4c1-3249871  1a2-1234567  2b2-4597134  3c2-1234567  4c2-4597134
I was able to put together code that works, using the great answer from "i alarmed alien".
The first difference is that my data is formatted as an array of arrays in a transposed way.
$arr1 = [ '2013-12-22', '2013-12-22' ];
$arr2 = [ 'Number1','Number2'];
$arr3 = [ '2328942', '679204'];
$arr4 = [ '1450398', '436713'];
Also, the transformed data should be saved in an array. I've written this piece of code. (It's far from perfect; if there are any suggestions on how to improve it further, I'd be happy to hear them.)
####################
#transpose data
my $datas = $args{DATA};
my $headers = $args{HEADERS};
my @rows = ();
my @transposed = ();
for my $row (@$datas) {
    for my $column (0 .. $#{$row}) {
        push(@{$transposed[$column]}, $row->[$column]);
    }
}
#################################
my @arr = @transposed;
# first, define headers.
my $col_h = $args{HEADERS};
my $data;
my $key_list;
foreach (@arr) {
    my %hash;
    # use a hash slice with the column header array as keys
    # and the array as the values
    @hash{@$col_h} = @$_;
    # store this hash in a data hash indexed by date then key
    $data->{ $hash{date} }{ $hash{key} } = \%hash;
    # compile a separate hash with the keys in it
    $key_list->{ $hash{key} }++;
}
# make a sorted list of keys, ready for printing
my @key_list = sort keys %$key_list;
# remove the first two elements from the column headers ('date' and 'key')
splice(@$col_h, 0, 2);
my @output;
my @header;
# print out the header row for the table (I'm doing a simple tab-delim'd table)
#print STDERR "Date\t";
push(@header, "Date\t");
# for each NUMBER from NUMBER1 ... NUMBERn
foreach my $c (@$col_h) {
    # print "keyID NUMBERn"
    map { push (@header,"$_ $c\t" )} @key_list;
    #map { print STDERR "$_ $c\t" } @key_list;
}
#print STDERR "<br>";
push (@output,\@header );
my $row;
my $column;
# Now print out the data itself. Sort by date...
foreach my $date (sort keys %$data) {
    #print STDERR "$date\t";
    $row++;
    my @line;
    push(@line, "$date");
    # for each NUMBER1 ... NUMBERn
    foreach my $header (@$col_h) {
        foreach my $key (@key_list) {
            ## print out the value OR - if there is no value
            $column++;
            push (@line,( $data->{$date}{$key}{$header} || "-" ) . "\t");
            #print STDERR ( $data->{$date}{$key}{$header} || "-" ) . "\t";
        }
    }
    print STDERR "<br>"; # end of the table row
    $column = 0;
    push (@output,\@line );
}
my $x = 1;
return @output;
}
This code works, but it's a little ugly. Please let me know if there is a cleaner or better way to do this.
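One possible tidier shape, offered as a sketch rather than a drop-in replacement: pivot the column-oriented data directly, without transposing it first. It assumes the first column holds dates that sort correctly as strings (as the ISO-style dates in the sample do), the second column holds the keys, and every further column holds values. The sub name pivot and the value-column labels passed to it are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: pivots column-oriented data.
# $columns: array ref of columns (dates, keys, then one or more value columns)
# $names:   array ref with one label per value column
# Returns an array ref of rows; the first row is the header.
sub pivot {
    my ($columns, $names) = @_;
    my ($dates, $keys, @values) = @$columns;

    # Store every value under date -> key -> value-column index.
    my (%cell, %seen_key);
    for my $i (0 .. $#$dates) {
        $seen_key{ $keys->[$i] } = 1;
        $cell{ $dates->[$i] }{ $keys->[$i] }[$_] = $values[$_][$i] for 0 .. $#values;
    }
    my @key_list = sort keys %seen_key;

    # Header: "Date", then every key once per value column.
    my @header = ('Date');
    for my $name (@$names) {
        push @header, map { $_ . ' ' . $name } @key_list;
    }

    my @rows = (\@header);
    for my $date (sort keys %cell) {
        my @row = ($date);
        for my $v (0 .. $#values) {
            for my $key (@key_list) {
                my $value = $cell{$date}{$key}[$v];
                # Only undef or an empty string counts as missing, so a
                # legitimate 0 is kept.
                push @row, (defined $value && length $value) ? $value : '-';
            }
        }
        push @rows, \@row;
    }
    return \@rows;
}

my $rows = pivot(
    [
        [ '2013-12-22', '2013-12-22' ],
        [ 'Number1',    'Number2' ],
        [ '2328942',    '679204' ],
        [ '1450398',    '436713' ],
    ],
    [ 'value1', 'value2' ],
);
print join("\t", @$_), "\n" for @$rows;
```

Keeping the tab separators out of the stored cells and adding them only in the final join means the returned rows can be reused for other output formats.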

Pulling out potentially overlapping subsets of elements in array to make smaller arrays

My input file looks like below (real one is much larger):
rs3683945_mark 0
rs6336442_mark 1E-07
rs31328150_impute 0.444121193
rs3658242_mark 0.444121293
rs39342374_impute 0.444121393
IMP!1! 1
rs3677817_mark 1.986015679
IMP!2! 2
SNP117_impute 2.685815665
IMP!3! 3
SNP3_1_impute 3.643119709
SNP1_impute 3.643119809
rs13475706_mark 3.643119909
13 lines, two elements on each line. The first element is a name. Each name either ends with a "tag", _mark or _impute, or has no tag. The point of the tag is to distinguish between types of names, which form the basis of my search for subsets within the entire list.
The subsets begin with a _mark name that immediately precedes an instance of an _impute name. The subsets end with the very next instance of _mark. All names in between, which will necessarily not have any such tag, also go into a subset, which I'd like to collect into an array and send off to a subroutine to process (details of that not important). Please note, the positions with IMP in the name are not the same as those actually tagged with a _impute.
For example, with the above, the first useable subset is:
rs6336442_mark 1E-07
rs31328150_impute 0.444121193
rs3658242_mark 0.444121293
The second useable subset is:
rs3658242_mark 0.444121293
rs39342374_impute 0.444121393
IMP!1! 1
rs3677817_mark 1.986015679
and so on... EDIT: Note that last _mark name of the first set is the first _mark name of the second.
My code for this:
#!/usr/bin/perl
use strict; use warnings;
my $usage = "usage: merge_impute.pl {genotype file} {distances file} \n";
die $usage unless @ARGV == 2;
my $genotypes = $ARGV[0];
open (FILE, "<$genotypes");
my @genotypes = <FILE>;
close FILE;
my $distances = $ARGV[1];
open (DISTS, "<$distances");
my @distances = <DISTS>;
close DISTS;
my @workingset = ();
#print scalar @distances;
for ( my $i = 0; $i < scalar @distances; $i++ ){
    chomp $distances[$i];
    #print "$distances[$i]\n";
    if ( $distances[$i] =~ m/impute/ ){
        push ( @workingset,$distances[$i-1],$distances[$i],$distances[$i+1]);
    }
    print "i=$i: @workingset\n";
    # at this point send off to sub routine
    @workingset=();
}
As you can see, the if block is only set up to find subsets that contain exactly one _impute name. How can I modify the code so that a subset will "fill up" with as many names as required until we arrive at the next _mark name?
EDIT: Perhaps instead of the for loop, I could do something like...
push (@workingset, $distances[0], $distances[1] );
until ( $distances[ ??? ] =~ m/_mark/ ){
    push ( @workingset, $distances[ ??? ] );
}
But what could $distances[ ??? ] be?
EDIT: Or an alternative for loop...
push (@workingset, $distances[0] );
for ( my $i = 1; $i < scalar @distances - 1 ; $i++ ){
    until ( $distances[ $i ] =~ m/_mark/ ){
        push ( @workingset, $distances[ $i ] );
        # send @workingset to sub routine
        #clear workingset
        @workingset = ();
    }
}
Though this isn't working.
I also tried...
push (@workingset, $distances[0] );
for ( my $i = 1; $i < scalar @distances - 1 ; $i++ ){
    until ( $distances[ $i ] =~ m/_mark/ ){
        push ( @workingset, $distances[ $i ] );
        next if $distances[ $i+1 ] !~ /_mark/;
    }
    # send @workingset to sub routine here
    print "i=$i, @workingset\n\n";
    #clear workingset
    @workingset = ();
}
I don't have a lot of time right now but I'll hopefully have some time in the morning to check back. Here's a quick example on how you could do it (it is meant to be simple and easy to understand, not fancy). Hopefully it helps you get on the right track for parsing the data.
use strict;
use warnings;
my $first_mark;
my @workingset = ();
my $second_mark;
while (<DATA>){
    chomp;
    if ( /_mark/ and scalar @workingset == 0 ) {
        $first_mark = $_;
    } elsif ( /IMP|_impute/ and defined $first_mark) {
        push @workingset, $_;
    } elsif ( /_mark/ and defined $first_mark) {
        $second_mark = $_;
        print "Found valid set: ";
        print "$first_mark," . join(",", @workingset) . ",$second_mark\n";
        @workingset = ();
        $first_mark = $second_mark;
        undef $second_mark;
    }
}
__DATA__
rs3683945_mark 0
rs6336442_mark 1E-07
rs31328150_impute 0.444121193
rs3658242_mark 0.444121293
rs39342374_impute 0.444121393
IMP!1! 1
rs3677817_mark 1.986015679
IMP!2! 2
SNP117_impute 2.685815665
IMP!3! 3
SNP3_1_impute 3.643119709
SNP1_impute 3.643119809
rs13475706_mark 3.643119909
Output:
Found valid set: rs6336442_mark 1E-07,rs31328150_impute 0.444121193,rs3658242_mark 0.444121293
Found valid set: rs3658242_mark 0.444121293,rs39342374_impute 0.444121393,IMP!1! 1,rs3677817_mark 1.986015679
Found valid set: rs3677817_mark 1.986015679,IMP!2! 2,SNP117_impute 2.685815665,IMP!3! 3,SNP3_1_impute 3.643119709,SNP1_impute 3.643119809,rs13475706_mark 3.643119909
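Since the question asks for each subset as an array that can be handed to a subroutine, here is a minimal sketch of the same idea that collects array references instead of printing. It follows the logic of the answer above: every line between two consecutive _mark lines joins the set, and a set is kept only when something sits between its two marks. The helper name collect_subsets is hypothetical, and the input is assumed to be already chomped.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of array refs, one per subset.
# Each subset runs from one _mark line to the next _mark line, and the
# closing mark of one subset is reused as the opening mark of the next.
sub collect_subsets {
    my @lines = @_;
    my (@subsets, @current);
    for my $line (@lines) {
        if ($line =~ /_mark/) {
            # Close the open set only if something sits between the marks.
            push @subsets, [ @current, $line ] if @current > 1;
            @current = ($line);      # this mark may open the next set
        }
        elsif (@current) {
            push @current, $line;    # _impute or untagged line inside a set
        }
    }
    return @subsets;
}

my @lines = (
    'rs3683945_mark 0',
    'rs6336442_mark 1E-07',
    'rs31328150_impute 0.444121193',
    'rs3658242_mark 0.444121293',
    'rs39342374_impute 0.444121393',
    'IMP!1! 1',
    'rs3677817_mark 1.986015679',
);

# Each $set is an array ref that could be passed on to a processing sub.
for my $set (collect_subsets(@lines)) {
    print "Found valid set: ", join(",", @$set), "\n";
}
```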

Perl script: Useless use of array element in void context at letter_counter.pl lines 38 and 44

This is my first Perl script:
http://bpaste.net/show/171137/
#!/usr/bin/perl
#This program will take a user's input and then count how many letters there are. Whereupon it will count the number of unique letters before printing all the data
#back to the user.
use strict;
use warnings;
#======================================================================================================================
# This section is to collect and spit back the input to the user.
#======================================================================================================================
print "\n\nHello, please enter a word, a phrase, or a sentence. Press Enter when you are done.\n";
my $input = <>; #Collecting the input from the user.
chomp $input; #Chomping, or removing, the \n from the end of the input.
print "\nYou typed -:[$input]\n";
#======================================================================================================================
#This section will find how many unique characters there are.
#======================================================================================================================
my @uniqueArray;
my @stringArray = split(// , $input);
my $x = 0;
my $string_max_index = $#stringArray;
for($stringArray[$x];$stringArray[$string_max_index];$x++)
{
my $found = 0;
my $test = $stringArray[$x];
my $y = 0;
for($uniqueArray[$y];$uniqueArray[$#uniqueArray];$y++)
{
if($test eq $uniqueArray[$y])
{
$found=1;
}
}
if($found eq 1)
{
$uniqueArray[$#uniqueArray] = $stringArray[$x];
}
}
#======================================================================================================================
# This section will determine how many ascii characters are in the $input variable and output the results of this
# program.
#======================================================================================================================
my $numOfLet = 0;
while ( $input ne "" )
{
$numOfLet = $numOfLet + 1;
chop $input
}
print "Total Characters -: $numOfLet";
print "Total of Unique Characters -: $#uniqueArray \n\n\n";
exit;
I was able to get rid of all the errors except for these two,
Useless use of array element in void context at letter_counter.pl line 38
Useless use of array element in void context at letter_counter.pl line 44
What confuses me is that there is nothing on those lines, just the closing brackets of my for loops, which leads me to believe that the issue is an element I referenced in each for loop.
The initialization block of your for loop is the immediate culprit. Adjusting to something like this resolves the warning:
for(;$stringArray[$string_max_index];$x++)
Otherwise you're accessing a value, but doing... nothing with it? That's what the warning is for.
I spot a few other problems, though:
Your for loops are... a little funny, I don't know how else to put that.
Array length is usually easiest to read with the scalar keyword.
Adding members to an array is usually done with the push keyword.
Using the above in combination:
for (my $x = 0; $x < scalar @stringArray; $x++)
{
    my $found = 0;
    my $test = $stringArray[$x];
    for (my $y = 0; !$found && $y < scalar @uniqueArray; $y++)
    {
        if ($test eq $uniqueArray[$y])
        {
            $found = 1;
        }
    }
    unless ($found)
    {
        push @uniqueArray, $stringArray[$x];
    }
}
If the above for loops don't look sensible to you, now is a good time to look up some tutorials.
This could be simplified with foreach loops:
foreach my $letter (@stringArray) {
...
}
Or with grep searches:
my $found = grep { $_ eq $letter } @uniqueArray;
But, in the particular case of counting unique values, it's often simplest to assign to a hash:
my %uniques;
$uniques{$_} = 1 for @stringArray;
my $num_uniques = scalar keys %uniques;
Combining all of that:
my @letters = split(//, $input);        #split input into array of chars
my %uniques;                            #declare empty hash
$uniques{$_} = 1 for @letters;          #set hash key for each char
my $num_letters = scalar @letters;      #count entries in letter list
my $num_uniques = scalar keys %uniques; #count unique keys in letter hash
Exercise for the reader: adjust the above code so that it counts the number of times each character is used.
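For readers who want to check their solution, one possible approach is to increment the hash value instead of setting it to 1. This is a minimal sketch; the helper name char_counts is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref that maps each character of
# $input to the number of times it occurs.
sub char_counts {
    my ($input) = @_;
    my %count;
    $count{$_}++ for split //, $input;    # one increment per character
    return \%count;
}

my $counts = char_counts("hello");
printf "%s: %d\n", $_, $counts->{$_} for sort keys %$counts;
```

The number of unique characters is still scalar keys %$counts, so the same hash answers both questions.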
That's because @uniqueArray is empty...
Given this short example:
use strict;
use warnings;
my @arr;
my $t = 0;
for ($arr[$t]; $arr[$#arr]; $t++ ) {
    print "no\n";
}
__OUTPUT__
Useless use of array element in void context at t.pl line 11.
You declare my @uniqueArray; at line 21 and never do anything with it...
That also raises the question of how this could ever match at line 34:
if($test eq $uniqueArray[$y])
Again, @uniqueArray is an empty array.
To fix your script (although please look at rutter's hash suggestion), you can do the following. Remove:
my $x = 0;
my $y = 0;
Instead of using C-style loops, replace with the following:
for my $x (0 .. $string_max_index )
for my $y (0 .. $#uniqueArray)
Lastly, use the following:
if (!$found)
{
    push @uniqueArray, $stringArray[$x];
}
Hope this helps!
