How to find the index of a string in an array in Perl without iterating

I need to find a value in an array without iterating through the whole array.
I read an array of strings from a file, and I need to get the index of a given value in that array. I have tried this code, but it doesn't work:
my @array = <$file>;
my $search = "SomeValue";
my $index = first { $array[$_] eq $search } 0 .. $#array;
print "index of $search = $index\n";
Please suggest how I can get the index of the value or, better, all indexes if the value occurs more than once.
Thanks in advance.

What does "it doesn't work" mean?
The code you have will work fine, except that an element in the array is going to be "SomeValue\n", not "SomeValue". You can remove the newlines with chomp(@array) or include a newline in your $search string.
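A minimal sketch of that fix, assuming an in-memory filehandle in place of the real file (the sample data is illustrative, not from the question):

```perl
use strict;
use warnings;
use List::Util qw(first);

# Illustrative data: an in-memory filehandle stands in for the real file.
my $data = "Alpha\nSomeValue\nBeta\n";
open my $file, '<', \$data or die "Cannot open in-memory file: $!";

my @array = <$file>;
chomp(@array);    # strip the trailing newline from every element
close $file;

my $search = "SomeValue";
# first returns the first index for which the block is true, or undef.
my $index = first { $array[$_] eq $search } 0 .. $#array;

print defined $index
    ? "index of $search = $index\n"
    : "$search not found\n";
```

Checking defined rather than truth matters here, because a match at index 0 is a false value in Perl.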

Your initial question: "I need to find value in array without iterating through whole array."
You can't. It is impossible to check every element of an array without checking every element of an array. The very best you can do is stop looking once you've found it, but your question indicates there may be multiple matches.
There are various options that will do this for you, like first from List::Util, or grep. But they are still doing a loop; they're just hiding it behind the scenes.
The reason first doesn't work for you is probably that you need to load it from List::Util first. Alternatively, you forgot to chomp, which means your list includes line feeds where your search pattern doesn't.
Anyway - in the interests of actually giving something that'll do the job:
while ( my $line = <$file> ) {
    chomp ( $line );
    # could use regular expression based matching for e.g. substrings.
    if ( $line eq $search ) { print "Match on line $.\n"; last; }
}
If you want every match, omit the last;
Alternatively - you can match with:
if ( $line =~ m/\Q$search\E/ ) {
This does a substring match (which in turn means the line feeds are irrelevant).
So you can do this instead:
while ( <$file> ) {
    print "Match on line $.\n" if m/\Q$search\E/;
}
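If the array is already in memory and every matching index is wanted, grep over the index range is one option. This is only a sketch with made-up data, and it still visits every element:

```perl
use strict;
use warnings;

# Illustrative data; in the question these lines would come from <$file>.
my @array = ("Alpha\n", "SomeValue\n", "Beta\n", "SomeValue\n");
chomp(@array);

my $search = "SomeValue";
# grep still loops over the whole index range; it just hides the loop.
my @indexes = grep { $array[$_] eq $search } 0 .. $#array;

print "indexes of $search: @indexes\n";
```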

Related

Apply a substitution over every element of an array

I have an array of file names. Names are of the format company_ID_timestamp.
How do I apply a substitution on the array without running a loop?
for ( my $i = 0; $i < scalar @todayFiles; $i++ ) {
    $todayFiles[$i] =~ s/_20[0-9]{10}//;
}
Unless you want an ugly hack, you're going to want some kind of a loop, even if it's hidden with a map, or a for statement modifier.
s/_20[0-9]{10}// for @todayFiles;
The following works in Perl v5.14 and up (because of the /r modifier). This one makes sense if you don't want to modify the original array:
my @otherArray = map { s/_20[0-9]{10}//r } @todayFiles;
And here's a shorter/better way to write that C-style loop you showed:
for my $filename (@todayFiles) {
    $filename =~ s/_20[0-9]{10}//;
}
The latter one works because the for aka foreach loop actually aliases the variable $filename to the elements of the array being iterated over.
To apply a substitution to every element of an array, there is no way but to iterate over those elements.
That said, you can tidy up your code a lot by using for as a statement modifier and employing the $_ default variable:
s/_20[0-9]{10}// for @todayFiles;
This still iterates over the entire array, but the code is a lot tighter.
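To see the copying and in-place variants side by side, here is a small sketch. The file names are hypothetical examples of the company_ID_timestamp format:

```perl
use strict;
use warnings;

# Hypothetical file names in the company_ID_timestamp format.
my @todayFiles = ('acme_17_201501011230', 'initech_42_201501020945');

# Non-destructive copy: /r (Perl 5.14+) returns the modified string
# and leaves the original element untouched.
my @stripped = map { s/_20[0-9]{10}//r } @todayFiles;

# In-place edit: $_ is aliased to each element in turn.
s/_20[0-9]{10}// for @todayFiles;

print "$_\n" for @stripped;
```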

How do I modify elements in a Perl array inside a foreach loop?

My goal with this piece of code is to sanitize an array of elements (a list of URLs, some with special characters like %) so that I can eventually compare it to another file of URLs and output which ones match. The list of URLs is from a .csv file with the first field being the URL that I want (with some other entries that I skip over with a quick if() statement).
foreach my $var (@input_1) {
    # Skip anything that doesn't start with http:
    if ((/^[#U]/ ) || !(/^h/)) {
        next;
    }
    # Split the .csv into the relevant field:
    my @fields = split /\s?\|\s?/, $_;
    $var = uri_unescape($fields[0]);
}
My delimiter is a | in the csv. In its current setup, and also when I change the $_ to $var, it only returns blank lines. When I remove the $var declaration at the beginning of the loop and use $_, it will output the URLs in the correct format. But in that case, how can I assign the output to the same element in the array? Would this require a second array to output the value to?
I'm relatively new to Perl, so I'm sure there is some stuff that I'm missing. I have no clue at this moment why having $var in the foreach declaration breaks the parsing of the @fields line, but removing it and using $_ doesn't. Reading the perlsyn documentation did not help as much as I would have liked. Any help appreciated!
/^h/ is not bound to anything, so the match happens against $_. If you want to match $var, you have to bind it:
if ($var =~ /^[#U]/ || $var !~ /^h/) {
The two matches joined with || could probably be combined into a single regular expression with an alternation:
next if $var =~ /^(?: [#U] | [^h] | $ )/x;
That is, the line has to start with #, U, or something other than h, or be empty.
You can populate a new array with the results by using push:
push @results, $var;
Also note that if your data can contain | quoted or escaped (or newlines etc.), you should use Text::CSV instead of split.
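Putting the bound matches and the push together might look like the sketch below. The input lines are made up, and unescape_percent is a hypothetical stand-in for uri_unescape from URI::Escape so that the example needs no non-core modules:

```perl
use strict;
use warnings;

# Hypothetical stand-in for URI::Escape's uri_unescape: decodes %XX only.
sub unescape_percent {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Illustrative input lines, pipe-delimited as in the question.
my @input_1 = (
    "# comment line",
    "URL | owner",
    "http://example.com/a%20b | alice",
    "",
    "http://example.com/c%2Fd | bob",
);

my @results;
for my $var (@input_1) {
    # Both matches are bound to $var, so $_ is never consulted.
    next if $var =~ /^[#U]/ || $var !~ /^h/;

    my @fields = split /\s?\|\s?/, $var;
    push @results, unescape_percent($fields[0]);
}

print "$_\n" for @results;
```

Assigning to $var instead of pushing would overwrite the original element, since $var aliases it; the separate @results array keeps the input intact.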

Perl matching multidimensional array elements

I'm not getting any output. Can anyone see where the issue lies: in the matching or in how I access the array?
(The two subarrays in the multidimensional array have the same length.)
# Multidimensional array:
# @idarray = FASTA IDs, @seqarray = "ATTGTTGGT"-style sequences
@ordarray = (\@idarray, \@seqarray);
# This access works
print $ordarray[0][0], "\n";
print $ordarray[1][0], "\n", "\n";
# Ordarray output = "TTGTGGCACATAATTTGTTTAATCCAGAT....."
The user inputs a search string; the loop iterates over the sequence dimension and counts the number of matches. It then prints the number of matches and the corresponding ID from the ID dimension.
# The user's search string
$sestri = <>;
for ($r = 0; $r < @idarray; $r++) {
    if ($sestri =~ $ordarray[1][$r]) {
        print $ordarray[0][$r], "\n";
        $counts = () = $ordarray[0][$r] =~ /$sestri/g;
        print "number of counts: ", $counts;
    }
}
I think the problem lies with this:
$sestri = <>;
That may well not be doing what you intended. Your comment says it is the user's search string, but that's not what that operator does.
What it does is open the filename you specified on the command line (or read standard input if no filename was given) and return the first line.
I would suggest that if you want to grab a search string from the command line, you do it via @ARGV.
E.g.
my ( $sestri ) = @ARGV; # will give first word.
However, please please please switch on use strict and use warnings. You should always do this prior to posting on a forum for assistance.
I would also question quite why you need a two dimensional array with two elements in it though. It seems unnecessary.
Why not instead make a hash, and key your "fasta ids" to the sequence?
E.g.
my %id_of;
@id_of{@seqarray} = @idarray;
my %seq_of;
@seq_of{@idarray} = @seqarray;
I think this would suit your code a bit better, because then you don't have to worry about the array indices at all.
use strict;
use warnings;
# @idarray and @seqarray are assumed to be declared and populated earlier.
my ($sestri) = @ARGV;
my %id_of;
@id_of{@seqarray} = @idarray;
foreach my $sequence ( keys %id_of ) {
    ## NB - this is a pattern match, and will be 'true'
    ## if $sestri is a substring of $sequence
    if ( $sequence =~ m/$sestri/ ) {
        print $id_of{$sequence}, "\n";
        my $count = () = $sequence =~ m/$sestri/g;
        print "number of counts: ", $count, "\n";
    }
}
I've rewritten it a bit, because I don't entirely understand what your code is doing. It looks like it's substring matching in @seqarray but then returning the count of matching elements in @idarray. I don't think that makes sense, but if it does, then amend according to your needs.
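For a version that runs on its own, here is a sketch with made-up IDs and sequences. It keys the hash by ID rather than by sequence, on the assumption that IDs are unique while sequences may repeat, and it hard-codes the search string that would normally come from @ARGV:

```perl
use strict;
use warnings;

# Illustrative data standing in for the parsed FASTA file.
my @idarray  = ('seq1', 'seq2', 'seq3');
my @seqarray = ('ATTGTTGGT', 'CCCGGG', 'TTGTTG');

# Hash slice: each ID becomes a key pointing at its sequence.
my %seq_of;
@seq_of{@idarray} = @seqarray;

my $sestri = 'TTG';    # assumed search string; normally taken from @ARGV

my %count_for;
for my $id (sort keys %seq_of) {
    # \Q...\E treats the search string literally; assigning the match
    # to () forces list context so every (non-overlapping) match counts.
    my $count = () = $seq_of{$id} =~ /\Q$sestri\E/g;
    next unless $count;
    $count_for{$id} = $count;
    print "$id: number of counts: $count\n";
}
```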

Replacing array elements in Perl

I'm trying to replace an element in my array and my code doesn't seem to work.
my @wholeloop = (split //, $loop);
for my $i (0 .. $#wholeloop) {
    if ( $wholeloop[$i] eq "i" ) {
        $wholeloop[$i] =~ htmlinsert($offset);
        $offset++
    }
}
I've read about the problems of modifying an array while iterating through it, so maybe there is a better solution. I'm trying to replace specific occurrences of a character in a string, and an array seemed like a reasonable tool to use.
Typically, when iterating in a loop, you don't need to do it via:
for ( 0 .. $#array ) {
because
for ( @array ) {
will do the same thing, with the added advantage of $_ being an alias to each array element.
for my $element ( @wholeloop ) {
    if ( $element eq "i" ) {
        $element = htmlinsert($offset++);
    }
}
$element is an alias, so if you change it, you change the array. ($_ will do the same, but I dislike using it when I don't have to, because I think it makes for less clear code. This is a matter of style rather than a technical one.)
However for searching and replacing an element in a string – like you're doing – then you're probably better off using one of the other things perl does really well – regular expressions and pattern replacement. I can't give an example easily though, without knowing what htmlinsert returns.
Something like this, though:
$loop =~ s/i/newvalue/g;
Will replace all instances of i with a new value.
=~ is Perl's "match regular expression" operator, so unless htmlinsert() returns a regex, it's probably not what you meant to do. You probably want to use =.
A more Perlish way to do this, though, might be to use the map function. map takes a block and an array and runs the block with each element of the array in $_, returning all the values returned by that block. For example:
my @wholeloop = map {
    $_ eq "i" ? htmlinsert($offset++) : $_;
} split //, $loop;
(The ? and : perform an "if/else" in a single line; they're borrowed from C. map is borrowed from functional programming languages.)
Perhaps you should use foreach. It is the most suitable for what you are trying to do here:
my @array;
foreach ( @array ) {
    $_ =~ s/PATTERN/REPLACEMENT/;    # whatever your replacement is
}
Now, like Sobrique said, unless htmlinsert returns a RegEx value, that isn't going to work. Also, if you could give us context for "$offset", and what its purpose is, that would be really helpful.
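If the replacement has to be computed per match, the /e modifier lets the substitution call a function directly on the string. In the sketch below, htmlinsert is a hypothetical stand-in, since the question does not show what the real one returns:

```perl
use strict;
use warnings;

# Hypothetical htmlinsert: the real one is not shown in the question,
# so this stand-in just wraps the offset in a tag.
sub htmlinsert {
    my ($offset) = @_;
    return "<span id=\"$offset\">i</span>";
}

my $loop   = 'it is';    # illustrative input
my $offset = 0;

# /e evaluates the replacement as Perl code for each match, so the
# counter advances once per replaced character; /g replaces them all.
$loop =~ s/i/htmlinsert($offset++)/ge;

print "$loop\n";
```

The substitution does not rescan text it has already inserted, so an "i" inside the replacement is not matched again.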

Why can't I seem to remove undefined element from an array in Perl?

I have an array of strings that comes from data in a hash table. I am trying to remove any (apparently) empty elements, but for some reason there seems to be an obstinate element that refuses to go.
I am doing:
# Get list array from hash first, then
@list = grep { $_ ne ' ' } @list;
@list = uniq @list;
return sort #list;
At the grep line I get the Use of uninitialized value in string ne... message with the rest of the array printed correctly below.
I've tried doing it the 'long' way:
foreach (@list) {
    if ($_ ne ' ') {
        push @new_list, $_;
    }
}
But this produces exactly the same result. I tried using defined with the expected result (nothing).
I could sort the array beforehand and delete the first element, but that seems very risky as I cannot guarantee that the data set will always have blank elements. It also seems excessive to resort to regular expressions, but perhaps I'm wrong. I'm sure I'm missing something ridiculously simple, as usual.
Elements can't be empty. You're trying to remove undefined elements. But you're not checking whether the element is undefined; you're checking whether it is a string consisting of a single space. You want:
@list = grep defined, @list;
My answer assumes that you do not want strings that are either empty (meaning undefined or have a length of 0) or consist solely of spaces.
Your grep line only tests for strings that equal exactly one space. However, the warning implies that at least one array element is indeed undefined. Comparing an undefined value with eq will only yield true for an empty string, not for a single space.
So in order to remove all entries that are either undefined or consist only of spaces you could do something like this:
@list = grep { defined && m/[^\s]/ } @list;
Note that a string consisting only of spaces is still defined (and true) in Perl. Therefore a simple grep defined, @list will not throw out the entries that consist solely of spaces.
It looks like you want to keep only the elements that contain a non-space character. To do this and also reject undefined elements, you can simply write
@list = uniq grep { defined and /\S/ } @list;
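A self-contained sketch of the whole clean-up, using made-up data. It swaps uniq for the %seen idiom so that it does not depend on which module or version provides uniq:

```perl
use strict;
use warnings;

# Illustrative list containing undef, empty, blank and duplicate entries.
my @list = ('pear', undef, '', '   ', 'apple', 'pear');

# Keep only defined strings that contain a non-space character.
my @kept = grep { defined($_) && /\S/ } @list;

# Drop duplicates, keeping the first occurrence (stand-in for uniq).
my %seen;
my @unique = grep { !$seen{$_}++ } @kept;

my @clean = sort @unique;
print join(', ', @clean), "\n";
```

Testing defined before the regex is what avoids the "Use of uninitialized value" warning.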

Resources