Apply a substitution over every element of an array - arrays

I have an array of file names. Names are of the format company_ID_timestamp.
How do I apply a substitution on the array without running a loop?
for ( my $i=0; $i < scalar @todayFiles; $i++ ) {
$todayFiles[$i] =~ s/_20[0-9]{10}//;
}

Unless you want an ugly hack, you're going to want some kind of a loop, even if it's hidden with a map, or a for statement modifier.
s/_20[0-9]{10}// for @todayFiles;
The following works in Perl v5.14 and up (because of the /r modifier). This one makes sense if you don't want to modify the original array:
my @otherArray = map { s/_20[0-9]{10}//r } @todayFiles;
And here's a shorter/better way to write that C-style loop you showed:
for my $filename (@todayFiles) {
$filename =~ s/_20[0-9]{10}//;
}
The latter one works because the for aka foreach loop actually aliases the variable $filename to the elements of the array being iterated over.
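A minimal sketch of the difference, using made-up file names (they are assumptions, not from the question): map with /r returns modified copies and leaves the source array alone, while the for loop edits the array it iterates over through the alias.

```perl
use strict;
use warnings;
use 5.014;    # the /r modifier needs Perl 5.14 or later

# Hypothetical file names in the company_ID_timestamp format.
my @todayFiles = ('acme_17_201501011230', 'initech_42_201501020915');

# map with /r returns modified copies; @todayFiles is left as it was.
my @stripped = map { s/_20[0-9]{10}//r } @todayFiles;

# for aliases $filename to each element, so this edits @inPlace itself.
my @inPlace = @todayFiles;
for my $filename (@inPlace) {
    $filename =~ s/_20[0-9]{10}//;
}

print "$_\n" for @stripped;    # acme_17, then initech_42
```

After this runs, @stripped and @inPlace hold the shortened names, and @todayFiles still holds the originals.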

To apply a substitution to every element of an array, there is no way but to iterate over those elements.
That said, you can tidy up your code a lot by using for as a statement modifier and employing the $_ default variable:
s/_20[0-9]{10}// for @todayFiles;
This still iterates over the entire array, but the code is a lot tighter.

Related

Perl: grep from multiple arrays at once

I have multiple arrays (~32). I want to remove all blank elements from them. How can it be done in a short way (may be via one foreach loop or 1-2 command lines)?
I tried the below, but it's not working:
my @refreshArrayList=("@list1", "@list2", "@list3","@list4", "@list5", "@list6" , "@list7");
foreach my $i (@refreshArrayList) {
$i = grep (!/^\s*$/, $i);
}
Let's say, @list1 = ("abc","def","","ghi"); @list2 = ("qwe","","rty","uy", "iop"), and similarly for other arrays. Now, I want to remove all blank elements from all the arrays.
Desired Output shall be: @list1 = ("abc","def","ghi"); @list2 = ("qwe","rty","uy", "iop") ### All blank elements are removed from all the arrays.
How can it be done?
You can create a list of list references and then iterate over these, like
for my $list (\@list1, \@list2, \@list3) {
@$list = grep (!/^\s*$/, @$list);
}
Of course, you could create this list of list references also dynamically, i.e.
my @list_of_lists;
push @list_of_lists, \@list1;
push @list_of_lists, \@list2;
...
for my $list (@list_of_lists) {
@$list = grep (!/^\s*$/, @$list);
}
@$_ = grep /\S/, @$_ for @AoA; # @AoA = (\@ary1, \@ary2, ...)
Explanation
First, this uses the statement modifier, "inverting" the usual for loop syntax into the form STMT for LIST
The for(each) modifier is an iterator: it executes the statement once for each item in the LIST (with $_ aliased to each item in turn).
It is mostly equivalent to a "normal" for loop, with the notable difference being that no scope is set up, so there is none to tear down either, which adds a small measure of efficiency.† We can have only one statement; but then again, that can be a do block. Having no scope means that we cannot declare lexical variables for the statement (unless a do block is used).
So the statement is @$_ = grep /\S/, @$_, executed for each element of the list.
In a for(each) loop, the variable that is set to each element in turn as the list is iterated over ("topicalizer") is an alias to those elements. So changing it changes elements. From perlsyn
If VAR is omitted, $_ is set to each value.
If any element of LIST is an lvalue, you can modify it by modifying VAR inside the loop.
In our case $_ is always an array reference, and then the underlying array is rewritten by dereferencing it (@$_) and assigning to that the output list of grep, which consists only of elements that have at least one non-space character (/\S/).
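As a small self-contained sketch of the above, with two sample arrays modeled on the question (the whitespace-only element is added for illustration):

```perl
use strict;
use warnings;

my @list1 = ('abc', 'def', '', 'ghi');
my @list2 = ('qwe', '', 'rty', 'uy', '   ', 'iop');

# $_ is aliased to each array reference in turn; @$_ is the array behind it.
# grep keeps only elements containing at least one non-whitespace character.
@$_ = grep { /\S/ } @$_ for (\@list1, \@list2);

print "list1: @list1\n";    # list1: abc def ghi
print "list2: @list2\n";    # list2: qwe rty uy iop
```

Because the loop works on references, the original arrays themselves are rewritten; no copies need to be assigned back.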
† I ran a three-way benchmark, of the statement-modifier loop against a "normal" loop with and without a topical variable.
For adding 100e6 numbers I get 8-11% speedup (on both a desktop and a server) and with a more involved calculation ($r = ($r + $_ ) / sqrt($_)) it's 4-5%.
A side observation: In both cases the full for loop without a variable (using default $_ for the topicalizer) is 1-2% faster than the one with a lexical topical variable set.

Powershell - print the list length respectively

I want to write two things in Powershell.
For example;
We have a one list:
$a=@('ab','bc','cd','dc')
I want to write:
1 >> ab
2 >> bc
3 >> cd
4 >> dc
I want this to be dynamic based on the length of the list.
Thanks for helping.
Use a for loop so you can keep track of the index:
for( $i = 0; $i -lt $a.Count; $i++ ){
"$($i + 1) >> $($a[$i])"
}
To explain how this works:
The for loop is defined with three sections, separated by a semi-colon ;.
The first section declares variables, in this case we define $i = 0. This will be our index reference.
The second section is the condition for the loop to continue. As long as $i is less than $a.Count, the loop will continue. We don't want to go past the length of the list or you will get undesired behavior.
The third section is what happens at the end of each iteration of the loop. In this case we want to increase our counter $i by 1 each time ($i++ is shorthand for "increment $i by 1")
There is more nuance to this notation than I've included but it has no bearing on how the loop works. You can read more here on Unary Operators.
For the loop body itself, I'll explain the string
Returning an object without assigning to a variable, such as this string, is effectively the same thing as using Write-Output.
In most cases, Write-Output is actually optional (and often is not what you want for displaying text on the screen). My answer here goes into more detail about the different Write- cmdlets, output streams, and redirection.
$() is the sub-expression operator, and is used to return expressions for use within a parent expression. In this case we return the result of $i + 1 which gets inserted into the final string.
It is unique in that it can be used directly within strings unlike the similar-but-distinct array sub-expression operator and grouping operator.
Without the subexpression operator, you would get something like 0 + 1 as it will insert the value of $i but will render the + 1 literally.
After the >> we use another sub-expression to insert the value of the $ith index of $a into the string.
While simple variable expansion would insert the .ToString() value of array $a into the final string, referencing the index of the array must be done within a sub-expression or the [] will get rendered literally.
Your solution using a foreach and doing $a.IndexOf($number) within the loop does work, but while $a.IndexOf($number) works to get the current index, .IndexOf(object) works by iterating over the array until it finds the matching object reference, then returns the index. For large arrays this will take longer and longer with each iteration. The for loop does not have this restriction.
Consider the following example with a much larger array:
# Array of numbers 1 through 65535
$a = 1..65535
# Use the for loop to output "Iteration INDEXVALUE"
# Runs in 106 ms on my system
Measure-Command { for( $i = 0; $i -lt $a.Count; $i++ ) { "Iteration $($a[$i])" } }
# Use a foreach loop to do the same but obtain the index with .IndexOf(object)
# Runs in 6720 ms on my system
Measure-Command { foreach( $i in $a ){ "Iteration $($a.IndexOf($i))" } }
Another thing to watch out for is that while you can change properties and execute methods on collection elements, you can't change the element values of a non-concurrent collection (any collection not in the System.Collections.Concurrent namespace) when its enumerator is in use. While invisible, foreach (and relatedly ForEach-Object) implicitly invokes the collection's .GetEnumerator() method for the loop. This won't throw an error like in other .NET languages, but IMO it should. It will appear to accept a new value for the collection but once you exit the loop the value remains unchanged.
This isn't to say the foreach loop should never be used or that you did anything "wrong", but I feel these nuances should be made known before you do find yourself in a situation where a better construct would be appropriate.
Okay,
I fixed that:
$a=@('ab','bc','cd','dc')
$a.Length
foreach ($number in $a) {
$numberofIIS = $a.IndexOf($number)
Write-Host ($numberofIIS,">>>",$number)
}
Bender's answer is great, but I personally avoid for loops if at all possible. They usually require some awkward indexing into arrays and that ugly setup... The whole thing just ends up looking like hieroglyphics.
With a foreach loop it's our job to keep track of the index (which is where this answer differs from yours), but I think in the end it is more readable than a for loop.
$a = @('ab', 'bc', 'cd', 'dc')
# Pipe the items of our array to ForEach-Object
# We use the -Begin block to initialize our index variable ($x)
$a | ForEach-Object -Begin { $x = 1 } -Process {
# Output the expression
"$x" + ' >> ' + $_
# Increment $x for next loop
$x++
}
# -----------------------------------------------------------
# You can also do this with a foreach statement
# We just have to initialize our index variable
# beforehand
$x = 1
foreach ($number in $a){
# Output the expression
"$x >> $number"
# Increment $x for next loop
$x++
}

How to find index of string in array Perl without iterating

I need to find value in array without iterating through whole array.
I get array of strings from file, and I need to get index of some value in this array, I have tried this code, but it doesn't work.
my @array = <$file>;
my $search = "SomeValue";
my $index = first { $array[$_] eq $search } 0 .. $#array;
print "index of $search = $index\n";
Please suggest how can I get index of value, or it is better to get all indexes of line if there are more than one entry.
Thx in advance.
What does "it doesn't work" mean?
The code you have will work fine, except that an element in the array is going to be "SomeValue\n", not "SomeValue". You can remove the newlines with chomp(@array) or include a newline in your $search string.
Your initial question: "I need to find value in array without iterating through whole array."
You can't. It is impossible to check every element of an array, without checking every element of an array. The very best you can do is stop looking once you've found it - but you indicate in your question multiple matches.
There are various options that will do this for you - like List::Util and grep. But they are still doing a loop, they're just hiding it behind the scenes.
The reason first doesn't work for you, is probably because you need to load it from List::Util first. Alternatively - you forgot to chomp, which means your list includes line feeds, where your search pattern doesn't.
Anyway - in the interests of actually giving something that'll do the job:
while ( my $line = <$file> ) {
chomp ( $line );
#could use regular expression based matching for e.g. substrings.
if ( $line eq $search ) { print "Match on line $.\n"; last; }
}
If you want every match, omit the last;
Alternatively - you can match with:
if ( $line =~ m/\Q$search\E/ ) {
Which will substring match (Which in turn means the line feeds are irrelevant).
So you can do this instead:
while ( <$file> ) {
print "Match on line $.\n" if m/\Q$search\E/;
}
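If the lines are already in an array, a sketch along these lines returns either the first index or all of them. The sample data stands in for the file contents and is an assumption; first comes from List::Util, and both first and grep still loop internally.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Stand-in for lines read from a file; each still ends in a newline.
my @array  = ("alpha\n", "SomeValue\n", "beta\n", "SomeValue\n");
my $search = "SomeValue";

chomp(@array);    # strip the trailing newlines so eq can match

# first stops at the first hit; grep scans everything and returns all hits.
my $first_index = first { $array[$_] eq $search } 0 .. $#array;
my @all_indexes = grep  { $array[$_] eq $search } 0 .. $#array;

print "first index of $search = $first_index\n";    # 1
print "all indexes of $search = @all_indexes\n";    # 1 3
```

Note that first returns undef when nothing matches, so real code should check for that before printing.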

Array got flushed after while loop within a filehandle

I got a problem with a Perl script.
Here is the code:
use strict;
use Data::Dumper;
my @data = ('a','b');
if(@data){
for(@data){
&debug_check($_)
}
}
print "@data"; # Nothing is printed, the array gets empty after the while loop.
sub debug_check{
my $ip = shift;
open my $fh, "<", "debug.txt";
while(<$fh>){
print "$_ $ip\n";
}
}
Array data, in this example, has two elements. I need to check if the array has elements. If it has, then for each element I call a subroutine, in this case called "debug_check". Inside the subroutine I need to open a file to read some data. After I read the file using a while loop, the data array gets empty.
Why is the array being flushed, and how do I avoid this strange behavior?
Thanks.
The problem here, I think, will be down to $_. This is a bit of a special case, in that it's an alias to a value. If you modify $_ within a loop, it'll update the array. So when you hand it into the subroutine, and then shift it, it also updates @data.
Try:
my ( $ip ) = @_;
Or instead:
for my $ip ( @array ) {
debug_check($ip);
}
Note - you should also avoid using an & prefix to a sub. It has a special meaning. Usually it'll work, but it's generally redundant at best, and might cause some strange glitches.
while (<$fh>)
is short for
while (defined($_ = <$fh>))
$_ is currently aliased to the element of @data, so your sub is replacing each element of @data with undef. Fix:
while (local $_ = <$fh>)
which is short for
while (defined(local $_ = <$fh>))
or
while (my $line = <$fh>) # And use $line instead of $_ afterwards
which is short for
while (defined(my $line = <$fh>))
Be careful when using global variables. You want to localize them if you modify them.
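The effect can be reproduced without a real file. The sketch below reads from an in-memory filehandle as a stand-in for debug.txt; the sub names clobbering and safe are made up for illustration.

```perl
use strict;
use warnings;

my $text = "line1\nline2\n";    # stand-in for debug.txt, read via an in-memory handle

sub clobbering {
    open my $fh, '<', \$text or die "open failed: $!";
    while (<$fh>) { }    # assigns to the global $_, which the caller has aliased
}

sub safe {
    open my $fh, '<', \$text or die "open failed: $!";
    while (my $line = <$fh>) { }    # lexical variable, $_ is untouched
}

my @clobbered = ('a', 'b');
clobbering($_) for @clobbered;    # each element ends up undef

my @intact = ('a', 'b');
safe($_) for @intact;             # elements survive

print "clobbered: ", join(',', map { defined $_ ? $_ : 'undef' } @clobbered), "\n";
print "intact:    ", join(',', @intact), "\n";
```

The last read at end of file returns undef, and that is the value left in the aliased element, which is why the array looks empty when printed.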

Replacing array elements in Perl

I'm trying to replace an element in my array and my code doesn't seem to work.
my @wholeloop = (split //, $loop);
for my $i (0 .. $#wholeloop ) {
if ( $wholeloop[$i] eq "i" ) {
$wholeloop[$i] =~ htmlinsert($offset);
$offset++
}
}
I've read about the problems with changing an array while iterating through it, so maybe there is a better solution. I'm trying to replace specific occurrences of a character in a string, and an array seemed like a reasonable tool to use.
Typically - when iterating on a loop, you don't need to do it via:
for ( 0..$#array) {
Because
for ( @array ) {
will do the same thing, but with the added advantage of $_ being an alias to each array element.
for my $element ( @wholeloop ) {
if ( $element eq "i" ) {
$element = htmlinsert($offset++);
}
}
$element is an alias, so if you change it, you change the array. ($_ will do the same, but I dislike using it when I don't have to, because I think it makes for less clear code. This is a matter of style and choice rather than a technical one.)
However for searching and replacing an element in a string – like you're doing – then you're probably better off using one of the other things perl does really well – regular expressions and pattern replacement. I can't give an example easily though, without knowing what htmlinsert returns.
Something like this, though:
$loop =~ s/i/newvalue/g;
Will replace all instances of i with a new value.
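If the replacement has to come from a function call, the /e modifier evaluates the replacement side as Perl code. A sketch, assuming a hypothetical htmlinsert that just returns a bracketed counter (the real one is not shown in the question):

```perl
use strict;
use warnings;

# Hypothetical stand-in: the real htmlinsert() is not shown in the question.
sub htmlinsert {
    my ($n) = @_;
    return "[$n]";
}

my $loop   = 'ionic ion';
my $offset = 0;

# /e evaluates the replacement as Perl code, /g repeats it for every match.
$loop =~ s/i/htmlinsert($offset++)/ge;

print "$loop\n";    # [0]on[1]c [2]on
```

This avoids splitting the string into an array at all, and $offset ends up equal to the number of replacements made.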
=~ is Perl's "match regular expression" operator, so unless htmlinsert() returns a regex, it's probably not what you meant to do. You probably want to use =.
A more Perlish way to do this, though, might be to use the map function. map takes a block and an array and runs the block with each element of the array in $_, returning all the values returned by that block. For example:
my @wholeloop = map {
$_ eq "i" ? htmlinsert($offset++) : $_;
} split //, $loop;
(The ? and : perform an "if/else" in a single line; they're borrowed from C. map is borrowed from functional programming languages.)
Perhaps you should use foreach. It is the most suitable for what you are trying to do here
my @array;
foreach ( @array ) {
$_ =~ whatever your replacement is;
}
Now, like Sobrique said, unless htmlinsert returns a RegEx value, that isn't going to work. Also, if you could give us context for "$offset", and what its purpose is, that would be really helpful.

Resources