How to determine if an element exists in a Perl 6 array - arrays

I thought there would have been a simple answer for this somewhere on the internet but it seems like I'm having trouble finding a solution. I'm first of all wondering if there's a simple method or function for this:
e.g. ~~ or array.contains() from Perl 5
It would also be nice to know how many different ways of achieving this result there are in Perl 6 as some might be better than others given the situation of the problem.

my @a = <foo bar buzz>;
say so 'bar' ∈ @a;
say so @a ∋ 'buzz';
# OUTPUT«True␤True␤»
As documented in http://doc.perl6.org/language/setbagmix and defined in https://github.com/rakudo/rakudo/blob/nom/src/core/set_operators.pm .
I believe that Set checks for equivalence, if you need identity you will have to loop over the array and === it.
You could turn the Array into a Set and use subscripts.
say @a.Set{'bar'};
# OUTPUT«True␤»
say @a.Set<bar buzz>;
# OUTPUT«(True True)␤»

Another way to do this is:
my @a = <foo bar buzz>;
if 'bar' eq any(@a) {
    say "yes";
}
# OUTPUT«yes␤»

See the documentation for sub first. sub first returns the matching element or Nil. Nil is a falsy value, which means you can use the result in a Bool context to determine whether the array contains a matching element.
my @a = 'A'..'Z';
say 'We got it' if @a.first('Z');
say 'We didn\'t get it' if !@a.first(1);
There are several adverbs to sub first that change the results. For instance, to return the index instead of the element, you can use the :k adverb. In this example we also topicalize the result for use within the if statement:
my @a = 'A'..'Z';
if @a.first('Q', :k) -> $index {
    say $index;
}

Related

Perl remove spaces from array elements

I have an array containing n elements. Each element might have spaces at the beginning or at the end, and I want to remove them in one shot. Here are two code snippets, one that works and one that does not (the one that does not work trims the end of each element but not the front).
Not Working:
....
use Data::Dumper;
my @a = ("String1", " String2 ", "String3 ");
print Dumper(\@a);
@a = map{ (s/\s*$//)&&$_}@a;
print Dumper(\@a);
...
Working:
...
use Data::Dumper;
my @a = ("String1", " String2 ", "String3 ");
print Dumper(\@a);
my @b = trim_spaces(@a);
print Dumper(\@b);
sub trim_spaces
{
    my @strings = @_;
    s/\s+//g for @strings;
    return @strings;
}
...
I have no idea what the difference between these two is.
If there is a better way, please share it with me!
Your "not working example" only removes spaces from one end of the string.
The expression s/^\s+|\s+$//g will remove spaces from both ends.
You can improve your code by using the /r flag to return a modified copy:
@a = map { s/^\s+|\s+$//gr } @a;
or, if you must:
@a = map { s/^\s+|\s+$//g; $_ } @a;
This block has two problems:
{ (s/\s*$//)&& $_ }
The trivial problem is that it's only removing trailing spaces, not leading, which you said you wanted to remove as well.
The more insidious problem is the misleading use of &&. If the regex in s/// doesn't find a match, it returns a false value (the empty string); on the left side of a &&, that means the right side is never evaluated, and the false value becomes the value of the whole block. Which means any string that the regex doesn't match will be replaced with an empty string in the result list returned by map, which is probably not what you want.
That won't actually happen with your regex as written, because every string matches \s*, and s/// still returns true even if it doesn't actually modify the string. But that's dependent on the regex and a bad assumption to make.
More generally, your approach mixes and matches two incompatible methods for modifying data: mutating in place (s///) versus creating a copy with some changes applied (map).
The map function is designed to create a new array whose elements are based in some way on an input array; ideally, it should not modify the original array in the process. But your code does: even if you weren't assigning the result of map back to @a, the s/// modifies the strings inside @a in place. In fact, you could remove the @a = from your code and get the same result. This is not considered good practice; map should be used for its return value, not its side effects.
If you want to modify the elements of an array in place, your for solution is actually the way to go. It makes it clear what you're doing and side effects are OK.
If you want to keep the original array around and make a new one with the changes applied, you should use the /r flag on the substitutions, which causes them to return the resulting string instead of modifying the original in place:
my @b = map { s/^\s+|\s+$//gr } @a;
That leaves @a alone and creates a new array @b with the trimmed strings.

Solution to Error: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I have a function where I'm calculating two float values with a conditional if statement for the return values shown below:
# The function inputs are 2 lists of floats
def math(list1,list2):
    value1=math(...)
    value2=more_math(...)
    z=value2-value1
    if np.any(z>0):
        return value1
    elif z<0:
        return value2
Initially, I ran into the title error. I have tried using np.any() and np.all() as suggested by the error and questions here with no luck. I am looking for a method to explicitly analyze each element of the boolean array (e.g. [True,False] for list w/ 2 elements) generated from the if statement if z>0, if it is even possible. If I use np.any(), it is consistently returning value1 when that is not the case for the input lists. My problem is similar to The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()? but it went unanswered.
Here's a simple example:
import numpy as np
a = np.array([1,2,3,4]) #for simplicity
b = np.array([0,0,5,5])
c = b.copy()
condition = a>b #returns an array with True and False, same shape as a
c[condition] = a[condition] #copy the values of a into c
NumPy arrays can be indexed by boolean arrays, which also lets you overwrite the values stored at those indices.
Note: b.copy() is important, because otherwise the entries in b will change as well. (It is best to try it once without the copy() and then look at what happens to b.)
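To make the note about copy() concrete, here is a minimal sketch comparing the two cases. The names alias, b2 and b_after_alias are made up for this illustration and are not part of the original answer.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([0, 0, 5, 5])

# Without copy(): alias is just another name for the same array as b,
# so writing through alias also changes b.
alias = b
alias[a > b] = a[a > b]
b_after_alias = b.copy()   # snapshot of b after the write through alias

# With copy(): c owns its own data, so b2 stays untouched.
b2 = np.array([0, 0, 5, 5])
c = b2.copy()
c[a > b2] = a[a > b2]

print(b_after_alias)   # b was modified through alias
print(b2)              # b2 is unchanged
print(c)               # c holds the element-wise result
```

In the first case b ends up holding the merged values; in the second only c does.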
If z is an array:
z=value2-value1
if np.any(z>0):
    return value1
elif z<0:
    return value2
z>0 and z<0 will be boolean arrays. np.any(z>0) reduces that array to one True/False value, which works in the if statement. But z<0 is still multivalued, and will give elif a headache.
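If the intent is to choose a value per element rather than one value for the whole array, np.where is one option. This is only a sketch under assumptions: value1 and value2 are taken to be arrays of the same shape, the sample numbers are invented, and elements where z is exactly 0 fall through to value2 because the question does not say what should happen there.

```python
import numpy as np

# Hypothetical stand-ins for the results of the two calculations.
value1 = np.array([1.0, 5.0, 3.0])
value2 = np.array([2.0, 4.0, 3.0])

z = value2 - value1

# Element-wise choice: value1 where z > 0, otherwise value2.
result = np.where(z > 0, value1, value2)

# Reductions, for when a single True/False really is what is wanted.
any_positive = bool(np.any(z > 0))   # at least one element is positive
all_positive = bool(np.all(z > 0))   # every element is positive

print(result, any_positive, all_positive)
```

np.any and np.all answer a question about the array as a whole, which is why np.any(z>0) alone cannot express a per-element decision.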

How to use an array reference to pairs of elements

I am considering this answer which uses a single array reference of points, where a point is a reference to a two-element array.
The original code in my question (function extract_crossing) uses two separate arrays, $x and $y, which I call like this:
my @x = @{ $_[0] }; my @y = @{ $_[1] };
...
return extract_crossing(\@x, \@y);
The new code below based on the answer takes (x, y) and returns a single datatype, here x intercept points:
use strict; use warnings;
use Math::Geometry::Planar qw(SegmentLineIntersection);
use Test::Exception;
sub x_intercepts {
    my ($points) = @_;
    die 'Must pass at least 2 points' unless @$points >= 2;
    my @intercepts;
    my @x_axis = ( [0, 0], [1, 0] );
    foreach my $i (0 .. $#$points - 1) {
        my $intersect = SegmentLineIntersection([@$points[$i,$i+1],@x_axis]);
        push @intercepts, $intersect if $intersect;
    }
    return \@intercepts;
}
which I try to call like this:
my @x = @{ $_[0] }; my @y = @{ $_[1] };
...
my $masi = x_intercepts(\@x);
return $masi;
However, the code does not make sense.
I am confused about passing the "double array" to the x_intercepts() function.
How can you make the example code clearer to the original setting?
If I am understanding the problem here, @ThisSuitIsBlackNot++ has written a function (x_intercepts, which is available in the thread: Function to extract intersections of crossing lines on axes) that expects its argument to be a reference to a list of array references. The x_intercepts subroutine in turn uses a function from Math::Geometry::Planar which expects the points of a line segment to be passed as a series of array references/anonymous arrays that contain the x,y values for each point.
Again - it is not entirely clear - but it seems your data is in two different arrays: one containing all the x values and one with the corresponding y values. Is this the case? If this is not correct please leave a comment and I will remove this answer.
If that is the source of your problem then you can "munge" or transform your data before you pass it to x_intercepts or - as @ThisSuitIsBlackNot suggests - you can rewrite the function. Here is an example of munging your existing data into an "@input_list" to pass to x_intercepts.
my @xs = qw/-1 1 3/;
my @ys = qw/-1 1 -1 /;
my @input_list ;
foreach my $i ( 0..$#ys ) {
    push @input_list, [ $xs[$i], $ys[$i] ] ;
}
my $intercept_list = x_intercepts(\@input_list) ;
say join ",", @$_ for @$intercept_list ;
Adding the lines above to your script produces:
Output:
0,0
2,0
You have to be very careful doing this kind of thing and using tests to make sure you are passing the correctly transformed data in an expected way is a good idea.
I think a more general difficulty is that until you are familiar with perl it is sometimes tricky to easily see what sorts of values a subroutine is expecting, where they end up after they are passed in, and how to access them.
A solid grasp of perl data structures can help with that - for example I think what you are calling a "double array" or "double element" here is an "array of arrays". There are ways to make it easier to see where default arguments passed to a subroutine (in @_) are going (notice how @ThisSuitIsBlackNot has passed them to a nicely named array reference: "($points)"). Copious re-reading of perldoc perlsub can help things seem more obvious.
References are key to understanding perl subroutines, since to pass an array or hash to a subroutine you need to do so by reference. If the argument passed to x_intercepts is a list of two lists of anonymous arrays, then when it is assigned to ($points), @{$points->[0]} and @{$points->[1]} will be the arrays that contain those lists.
I hope this helps and is not too basic (or incorrect). If @ThisSuitIsBlackNot finds the time to provide an answer you should accept it: some very useful examples have been provided.

Finding values in array using Perl

I have two arrays @input0 and @input1. I would like a for loop that goes through every value in @input1 and if the value exists in @input0, the value is saved in a new array @input.
All arrays contain numbers only. There are a maximum of 10 numbers per array element (see below):
@input0 = {10061 10552 10553 10554 10555 10556 10557 10558 10559 10560, 10561 10562 10563 10564 10565 10566 10567 10573 10574 10575, ...}
@input1 = {20004 20182 ...}
The most concise and idiomatic way to achieve this in Perl is not via using "for" loop but map and grep
my %seen0 = map { ($_ => 1) } @input0;
my @input = grep { $seen0{$_} } @input1;
If you specifically want a for loop, please explain why map/grep approach does not work (unless it's a homework in which case the question should be tagged as one)
Short, sweet and slow:
my @input = grep $_ ~~ @input0, @input1;
Verbose and faster with for loop:
my %input0 = map {$_, 1} @input0;
my @input;
for (@input1) {
    push @input, $_ if $input0{$_};
}
You could also use a hashslice + grep:
my %tmp ;
@tmp{@input0} = undef ; # Fill all elements of @input0 in hash with value undef
my @input = grep { exists $tmp{$_} } @input1 ; # grep for existing hash keys
dgw's answer was nearly there, but contained a couple of things which aren't best practice. I believe this is better:
my %input0_map;
@input0_map{ @input0 } = ();
my @input = grep { exists $input0_map{$_} } @input1;
You should not name a variable 'tmp' unless it's in a very small scope. Since this code snippet isn't wrapped in a brace-block, we don't know how big the scope is.
You should not assign into the hash slice with a single 'undef', because that means the first element is assigned with that literal undef, and the other elements are assigned with implicit undefs. It will work, but it's bad style. Either assign them all with a value, or have them ALL assigned implicitly (as happens if we assign from the empty list).

In Perl, how do I create a hash whose keys come from a given array?

Let's say I have an array, and I know I'm going to be doing a lot of "Does the array contain X?" checks. The efficient way to do this is to turn that array into a hash, where the keys are the array's elements, and then you can just say if($hash{X}) { ... }
Is there an easy way to do this array-to-hash conversion? Ideally, it should be versatile enough to take an anonymous array and return an anonymous hash.
%hash = map { $_ => 1 } @array;
It's not as short as the "@hash{@array} = ..." solutions, but those ones require the hash and array to already be defined somewhere else, whereas this one can take an anonymous array and return an anonymous hash.
What this does is take each element in the array and pair it up with a "1". When this list of (key, 1, key, 1, key, 1) pairs gets assigned to a hash, the odd-numbered ones become the hash's keys, and the even-numbered ones become the respective values.
@hash{@array} = (1) x @array;
It's a hash slice, a list of values from the hash, so it gets the list-y @ in front.
From the docs:
If you're confused about why you use
an '@' there on a hash slice instead
of a '%', think of it like this. The
type of bracket (square or curly)
governs whether it's an array or a
hash being looked at. On the other
hand, the leading symbol ('$' or '@')
on the array or hash indicates whether
you are getting back a singular value
(a scalar) or a plural one (a list).
@hash{@keys} = undef;
The syntax here where you are referring to the hash with an @ is a hash slice. We're basically saying $hash{$keys[0]} AND $hash{$keys[1]} AND $hash{$keys[2]} ... is a list on the left hand side of the =, an lvalue, and we're assigning to that list, which actually goes into the hash and sets the values for all the named keys. In this case, I only specified one value, so that value goes into $hash{$keys[0]}, and the other hash entries all auto-vivify (come to life) with undefined values. [My original suggestion here was set the expression = 1, which would've set that one key to 1 and the others to undef. I changed it for consistency, but as we'll see below, the exact values do not matter.]
When you realize that the lvalue, the expression on the left hand side of the =, is a list built out of the hash, then it'll start to make some sense why we're using that @. [Except I think this will change in Perl 6.]
The idea here is that you are using the hash as a set. What matters is not the value I am assigning; it's just the existence of the keys. So what you want to do is not something like:
if ($hash{$key} == 1) # then key is in the hash
instead:
if (exists $hash{$key}) # then key is in the set
It's actually more efficient to just run an exists check than to bother with the value in the hash, although to me the important thing here is just the concept that you are representing a set just with the keys of the hash. Also, somebody pointed out that by using undef as the value here, we will consume less storage space than we would assigning a value. (And also generate less confusion, as the value does not matter, and my solution would assign a value only to the first element in the hash and leave the others undef, and some other solutions are turning cartwheels to build an array of values to go into the hash; completely wasted effort).
Note that if typing if ( exists $hash{ key } ) isn’t too much work for you (which I prefer to use since the matter of interest is really the presence of a key rather than the truthiness of its value), then you can use the short and sweet
@hash{@key} = ();
I always thought that
foreach my $item (@array) { $hash{$item} = 1 }
was at least nice and readable / maintainable.
There is a presupposition here, that the most efficient way to do a lot of "Does the array contain X?" checks is to convert the array to a hash. Efficiency depends on the scarce resource, often time but sometimes space and sometimes programmer effort. You are at least doubling the memory consumed by keeping a list and a hash of the list around simultaneously. Plus you're writing more original code that you'll need to test, document, etc.
As an alternative, look at the List::MoreUtils module, specifically the functions any(), none(), true() and false(). They all take a block as the conditional and a list as the argument, similar to map() and grep():
print "At least one value undefined" if any { !defined($_) } @list;
I ran a quick test, loading in half of /usr/share/dict/words to an array (25000 words), then looking for eleven words selected from across the whole dictionary (every 5000th word) in the array, using both the array-to-hash method and the any() function from List::MoreUtils.
On Perl 5.8.8 built from source, the array-to-hash method runs almost 1100x faster than the any() method (1300x faster under Ubuntu 6.06's packaged Perl 5.8.7.)
That's not the full story however - the array-to-hash conversion takes about 0.04 seconds which in this case kills the time efficiency of array-to-hash method to 1.5x-2x faster than the any() method. Still good, but not nearly as stellar.
My gut feeling is that the array-to-hash method is going to beat any() in most cases, but I'd feel a whole lot better if I had some more solid metrics (lots of test cases, decent statistical analyses, maybe some big-O algorithmic analysis of each method, etc.) Depending on your needs, List::MoreUtils may be a better solution; it's certainly more flexible and requires less coding. Remember, premature optimization is a sin... :)
In perl 5.10, there's the close-to-magic ~~ operator:
sub invite_in {
my $vampires = [ qw(Angel Darla Spike Drusilla) ];
return ($_[0] ~~ $vampires) ? 0 : 1 ;
}
See here: http://dev.perl.org/perl5/news/2007/perl-5.10.0.html
Also worth noting for completeness, my usual method for doing this with 2 same-length arrays @keys and @vals which you would prefer were a hash...
my %hash = map { $keys[$_] => $vals[$_] } (0..@keys-1);
Raldi's solution can be tightened up to this (the '=>' from the original is not necessary):
my %hash = map { $_,1 } @array;
This technique can also be used for turning text lists into hashes:
my %hash = map { $_,1 } split(",",$line);
Additionally if you have a line of values like this: "foo=1,bar=2,baz=3" you can do this:
my %hash = map { split("=",$_) } split(",",$line);
[EDIT to include]
Another solution offered (which takes two lines) is:
my %hash;
#The values in %hash can only be accessed by doing exists($hash{$key})
#The assignment only works with '= undef;' and will not work properly with '= 1;'
#if you do '= 1;' only the hash key of $array[0] will be set to 1;
@hash{@array} = undef;
You could also use Perl6::Junction.
use Perl6::Junction qw'any';
my @arr = ( 1, 2, 3 );
if( any(@arr) == 1 ){ ... }
If you do a lot of set-theoretic operations, you can also use Set::Scalar or a similar module. Then $s = Set::Scalar->new( @array ) will build the set for you, and you can query it with $s->contains($m).
You can place the code into a subroutine if you don't want to pollute your namespace.
my $hash_ref =
    sub{
        my %hash;
        @hash{ @{[ qw'one two three' ]} } = undef;
        return \%hash;
    }->();
Or even better:
sub keylist(@){
    my %hash;
    @hash{@_} = undef;
    return \%hash;
}
my $hash_ref = keylist qw'one two three';
# or
my @key_list = qw'one two three';
my $hash_ref = keylist @key_list;
If you really wanted to pass an array reference:
sub keylist(\@){
    my %hash;
    @hash{ @{$_[0]} } = undef if @_;
    return \%hash;
}
my @key_list = qw'one two three';
my $hash_ref = keylist @key_list;
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
my @a = qw(5 8 2 5 4 8 9);
my @b = qw(7 6 5 4 3 2 1);
my $h = {};
@{$h}{@a} = @b;
print Dumper($h);
gives (note repeated keys get the value at the greatest position in the array - ie 8->2 and not 6)
$VAR1 = {
'8' => '2',
'4' => '3',
'9' => '1',
'2' => '5',
'5' => '4'
};
You might also want to check out Tie::IxHash, which implements ordered associative arrays. That would allow you to do both types of lookups (hash and index) on one copy of your data.

Resources