Check whether an array contains a value from another array - arrays

I have an array of objects, and an array of acceptable return values for a particular method. How do I reduce the array of objects to only those whose method in question returns a value in my array of acceptable values?
Right now, I have this:
my @allowed = grep {
    my $object = $_;
    my $returned = $object->method;
    grep {
        my $value = $_;
        $value eq $returned;
    } @acceptableValues;
} @objects;
The problem is that this is a compound loop, which I'd like to avoid. This program is meant to scale to arbitrary sizes, and I want to minimize the number of iterations that are run.
What's the best way to do this?

You could transform the acceptable return values into a hash:
my %values = map { $_ => 1 } @acceptableValues;
Then grep on whether the key is present in that hash, instead of using your
original inner grep:
my @allowed = grep $values{ $_->method }, @objects;
Anyway, grep is pretty fast in itself, and this is just one common
approach to checking whether an element is in an array. Try not to
optimize what doesn't need it, since this only pays off for really big
arrays. At that point you could, for example, sort the array of acceptable
values and use a binary search, or cache results if they repeat. But again,
don't worry about this kind of optimisation unless you're dealing with
hundreds of thousands of items or more.
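For reference, here is a minimal, self-contained sketch of the hash-lookup approach. The Widget class, its colour method, and the sample values are hypothetical stand-ins for the objects and the method in the question, so adapt the names to your own code.

```perl
use strict;
use warnings;

# Hypothetical class standing in for the question's objects.
package Widget {
    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub colour { return $_[0]{colour} }
}

my @acceptableValues = qw(red green);
my @objects = map { Widget->new(colour => $_) } qw(red blue green red yellow);

# Build the lookup table once: one pass over the acceptable values.
my %is_acceptable = map { $_ => 1 } @acceptableValues;

# One pass over the objects, with a single hash lookup for each.
my @allowed = grep { $is_acceptable{ $_->colour } } @objects;

print join(',', map { $_->colour } @allowed), "\n";
```

The work is roughly one pass over each array instead of one inner loop per object, which is the point of the transformation.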

The elements in the given arrays seem to be unique, so I would build a hash containing the count of each element across both arrays. If any element has a count greater than 1, it is present in both arrays.
my %values;
my @allowed;
$values{$_}++ for (@acceptableValues, @objects);
for (keys %values) {
    push @allowed, $_ if $values{$_} > 1;
}

Related

Group similar element of array together to use in foreach at once in perl

I have an array that contains elements, some of which are similar under a certain condition: if we delete the "n" and "p" from the array elements, the similar elements can be recognised. I want to use these similar elements together in one iteration of a foreach statement. The array is shown below:
my @array = qw(abc_n abc_p gg_n gg_p munday_n_xy munday_p_xy soc_n soc_p);
The array elements need not always be in this order.
I am editing this question again; sorry if I did not explain it properly. I have to print a string to a file multiple times, filling in the variables from the above array. The code below is only meant to illustrate the question; it is not correct in any sense.
open (FILE, ">" , "test.v");
foreach my $xy (@array){
    print FILE "DUF A1 (.pin1($1), .pin2($2));" ; # $1 and $2 are just placeholders here
} # I want to print abc_n and abc_p in one iteration of the foreach loop, followed by the other pairs in successive iterations
close (FILE);
The result i want to print is as follows:
DUF A1 ( .pin1(abc_n), .pin2(abc_p));
DUF A1 ( .pin1(gg_n), .pin2(gg_p));
DUF A1 ( .pin1(munday_n_xy), .pin2(munday_p_xy));
DUF A1 ( .pin1(soc_n), .pin2(soc_p));
The scripting language used is Perl. Your help is really appreciated.
Thank you!
Partitioning a data set depends entirely on how data are "similiar under certain conditions."
The condition given is that with removal of _n and _p the "similar" elements become equal (I assume that underscore; the OP says n and p). In such a case one can do
use warnings;
use strict;
use feature 'say';
my @data = qw(abc_n abc_p gg_n gg_p munday_n_xy munday_p_xy soc_n soc_p);
my %part;
for my $elem (@data) {
    push @{ $part{ $elem =~ s/_(?:n|p)//r } }, $elem;
}
say "$_ => @{$part{$_}}" for keys %part;
The grouped "similar" strings are printed as a demo since I don't understand the logic of the shown output. Please build your output strings as desired.
If this is it and there'll be no more input to process later in code, nor will there be a need to refer to those common factors, then you may want the groups in an array
my @groups = values %part;
If needed throw in a suitable sorting when writing the array, sort { ... } values %part.
For more fluid and less determined "similarity" try "fuzzy matching;" here is one example.
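If the goal is exactly the output shown in the question, the grouping can be turned into those lines along the following lines. This is only a sketch: it assumes every group has exactly one "_n" and one "_p" member, and it prints to STDOUT rather than to test.v.

```perl
use strict;
use warnings;

my @array = qw(abc_n abc_p gg_n gg_p munday_n_xy munday_p_xy soc_n soc_p);

# Group elements by the name left after removing the first "_n" or "_p".
my %part;
for my $elem (@array) {
    push @{ $part{ $elem =~ s/_[np]//r } }, $elem;
}

# Emit one line per group; sorting the keys keeps the output order stable.
my @lines;
for my $key (sort keys %part) {
    my ($n, $p) = sort @{ $part{$key} };   # "_n" sorts before "_p"
    push @lines, "DUF A1 ( .pin1($n), .pin2($p));";
}
print "$_\n" for @lines;
```

Because the pairing is done through the hash, the input order of the array does not matter.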

calculate the index of a array element reference

Is it possible to calculate the index of a referenced scalar from an array?
in C you can use pointer arithmetic to retrieve the index.
SomeType array[500];
const SomeType* e = &array[42];
// [...]
size_t index = e-array;
Is there some similar way in Perl?
my @array = (1,2,3,4,5,6,7,8,9,0);
my $e = \$array[4];
# [...]
my $index = '???';
The reason:
I have a relatively large array (more than 6 million entries) of equally structured, geometrically related data.
I also have a kind of priority-based queue that contains references into this array. While the queue is processed, new elements are added and the queue has to be re-sorted. Since this queue will also grow quite large, and since the priorities of the elements change and are derived from the array element and its neighbours, I would like to avoid complex entries in the queue (for memory size and allocation performance) and store only the reference, so that I can access the information in the array directly.
But it seems that using indexes in the task list would be the best option.
List::MoreUtils provides routines such as:
first_index
last_index
bsearch_index
indexes
etc. Depending on the circumstances, using one of these may be more efficient than just using plain old grep:
my @i = grep $array[$_] == $v, 0 .. $#array;
$e contains a reference, and you can compare references for equality.
my @array = (0,0,0,0,0,0,0,0,0,0,0);
my $e = \$array[4];
#..
my $index = $#array;
$index-- while ($index >= 0 && $e != \$array[$index]);
print $index;
prints out 4.
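The same reference comparison can be written with first from the core List::Util module. This is a sketch rather than a recommendation: it is still a linear scan over the array, so for millions of entries storing the index in the queue, as the question already suspects, avoids the search entirely.

```perl
use strict;
use warnings;
use List::Util qw(first);

my @array = (0) x 11;      # identical values, so comparing values would not help
my $e     = \$array[4];

# Two references to the same scalar compare equal numerically (same address).
my $index = first { \$array[$_] == $e } 0 .. $#array;

print defined $index ? "index $index\n" : "not found\n";
```

The defined check matters because a match at position 0 returns 0, which is false but still a valid index.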

most efficient way to obtain all the elements between two values in a Perl array

I have a list of integer values; each of these integer values is associated with a real value.
to give you an idea, they might be represented like this:
1 0.48
5 0.56
6 0.12
20 1.65
25 1.50
Not all integers are represented in the list, only those that have a value associated with them.
Given a range of integer values, I have to perform some operation on the real values associated with any integer within the range, endpoints included. For example:
given the range 5-20, I would need to extract the real values associated with the integers 5, 6, and 20 and then perform some operation on them.
Right now the best I could come up with was to use the integer values as keys to a hash, loop over all the sorted integer values, and check whether each value is within the range, like so:
foreach (sort { $a <=> $b } keys %hash)
{
    if ($_ >= $rangemin && $_ <= $rangemax)
    {
        push @somearray, $hash{$_};
    }
    last if $_ >= $rangemax;
}
The real list i'm dealing with, however, is much longer and this implementation takes a lot of time to execute.
is there a faster/more efficient way to obtain a list of values lying between two arbitrary values in an array?
Don't sort, there's no need to.
This may be slightly faster:
@somearray = @hash{ grep $_ >= $rangemin && $_ <= $rangemax, keys %hash };
(building up a list of desired indexes by using grep to filter all the keys, then using a hash slice to get all the values at once).
You would have to benchmark it to know for certain.
The other alternative is to loop from $rangemin to $rangemax:
for ($rangemin..$rangemax) {
    push @somearray, $hash{$_} if exists $hash{$_};
}
or
for ($rangemin..$rangemax) {
    push @somearray, $hash{$_} // ();
}
or
@somearray = @hash{ grep exists $hash{$_}, $rangemin..$rangemax };
Which is fastest will depend greatly on the sparseness of your data, the size of the range, and the percentage of hash values you are including.
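To make the trade-off concrete, here is a small sketch that runs the key-filter variant and the range-walk variant on the sample data from the question. The variable names @by_keys and @by_range are made up for this illustration, and no timings are claimed.

```perl
use strict;
use warnings;

my %hash = (1 => 0.48, 5 => 0.56, 6 => 0.12, 20 => 1.65, 25 => 1.50);
my ($rangemin, $rangemax) = (5, 20);

# Approach 1: filter all keys, then take a hash slice (cost grows with the hash).
my @by_keys = @hash{ grep { $_ >= $rangemin && $_ <= $rangemax } keys %hash };

# Approach 2: walk the range and keep the keys that exist (cost grows with the range).
my @by_range = @hash{ grep { exists $hash{$_} } $rangemin .. $rangemax };

# keys() returns no particular order, so sort before comparing or printing.
print join(' ', sort { $a <=> $b } @by_keys),  "\n";
print join(' ', sort { $a <=> $b } @by_range), "\n";
```

Both select the same three values; only the second returns them in key order without an extra sort.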
Whether it's faster or not probably depends on your data, but you can simply loop over the numbers, and check if they exist:
for my $num ($rangemin .. $rangemax) {
if (defined $hash{$num}) { # number exists
# do stuff
}
}
As a variation on that, you can use grep to get a list of indexes:
my @range = grep defined($hash{$_}), $rangemin .. $rangemax;
You don't need to do a full collection scan; just iterate over 5-20 and fetch the value associated with each key from the collection, if it exists (that is, if it is defined rather than undef).

Finding values in array using Perl

I have two arrays, @input0 and @input1. I would like a for loop that goes through every value in @input1 and, if the value exists in @input0, saves it in a new array @input.
All arrays contain numbers only. There are a maximum of 10 numbers per array element (see below):
@input0 = {10061 10552 10553 10554 10555 10556 10557 10558 10559 10560, 10561 10562 10563 10564 10565 10566 10567 10573 10574 10575, ...}
@input1 = {20004 20182 ...}
The most concise and idiomatic way to achieve this in Perl is not a "for" loop but map and grep:
my %seen0 = map { ($_ => 1) } @input0;
my @input = grep { $seen0{$_} } @input1;
If you specifically want a for loop, please explain why the map/grep approach does not work (unless it's homework, in which case the question should be tagged as such).
Short, sweet and slow:
my @input = grep $_ ~~ @input0, @input1;
Verbose and faster with for loop:
my %input0 = map {$_, 1} @input0;
my @input;
for (@input1) {
    push @input, $_ if $input0{$_};
}
You could also use a hashslice + grep:
my %tmp ;
@tmp{@input0} = undef ; # Fill all elements of @input0 in hash with value undef
my @input = grep { exists $tmp{$_} } @input1 ; # grep for existing hash keys
dgw's answer was nearly there, but contained a couple of things which aren't best practice. I believe this is better:
my %input0_map;
@input0_map{ @input0 } = ();
my @input = grep { exists $input0_map{$_} } @input1;
You should not name a variable 'tmp' unless it's in a very small scope. Since this code snippet isn't wrapped in a brace-block, we don't know how big the scope is.
You should not assign into the hash slice with a single 'undef', because that means the first element is assigned with that literal undef, and the other elements are assigned with implicit undefs. It will work, but it's bad style. Either assign them all with a value, or have them ALL assigned implicitly (as happens if we assign from the empty list).
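Put together as a runnable script, the empty-list slice version looks like this. The numbers are made-up sample data (the question's arrays are only shown in part), and %in_input0 is just an illustrative name.

```perl
use strict;
use warnings;

# Made-up sample data; the question's real arrays are not shown in full.
my @input0 = (10061, 10552, 10553, 20004);
my @input1 = (20004, 20182, 10552);

my %in_input0;
@in_input0{@input0} = ();   # keys only; every value is undef

# Keep the values of @input1 that also occur in @input0, in @input1's order.
my @input = grep { exists $in_input0{$_} } @input1;

print "@input\n";
```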

In Perl, how do I create a hash whose keys come from a given array?

Let's say I have an array, and I know I'm going to be doing a lot of "Does the array contain X?" checks. The efficient way to do this is to turn that array into a hash, where the keys are the array's elements, and then you can just say if($hash{X}) { ... }
Is there an easy way to do this array-to-hash conversion? Ideally, it should be versatile enough to take an anonymous array and return an anonymous hash.
%hash = map { $_ => 1 } @array;
It's not as short as the "@hash{@array} = ..." solutions, but those require the hash and array to already be defined somewhere else, whereas this one can take an anonymous array and return an anonymous hash.
What this does is take each element in the array and pair it up with a "1". When this list of (key, 1, key, 1, key, 1) pairs gets assigned to a hash, the odd-numbered ones become the hash's keys, and the even-numbered ones become the respective values.
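For the anonymous-array-to-anonymous-hash case the question asks about, the same map can sit inside a hash constructor. A minimal sketch, with $array_ref and the fruit names invented for the example:

```perl
use strict;
use warnings;

my $array_ref = [qw(apple banana cherry)];

# The outer braces build an anonymous hash from the (key, 1) pairs map returns.
my $hash_ref = { map { $_ => 1 } @$array_ref };

print "has banana\n" if $hash_ref->{banana};
print "no durian\n"  unless $hash_ref->{durian};
```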
@hash{@array} = (1) x @array;
It's a hash slice, a list of values from the hash, so it gets the list-y @ in front.
From the docs:
If you're confused about why you use
an '@' there on a hash slice instead
of a '%', think of it like this. The
type of bracket (square or curly)
governs whether it's an array or a
hash being looked at. On the other
hand, the leading symbol ('$' or '@')
on the array or hash indicates whether
you are getting back a singular value
(a scalar) or a plural one (a list).
@hash{@keys} = undef;
The syntax here where you are referring to the hash with an @ is a hash slice. We're basically saying $hash{$keys[0]} AND $hash{$keys[1]} AND $hash{$keys[2]} ... is a list on the left hand side of the =, an lvalue, and we're assigning to that list, which actually goes into the hash and sets the values for all the named keys. In this case, I only specified one value, so that value goes into $hash{$keys[0]}, and the other hash entries all auto-vivify (come to life) with undefined values. [My original suggestion here was set the expression = 1, which would've set that one key to 1 and the others to undef. I changed it for consistency, but as we'll see below, the exact values do not matter.]
When you realize that the lvalue, the expression on the left hand side of the =, is a list built out of the hash, then it'll start to make some sense why we're using that @. [Except I think this will change in Perl 6.]
The idea here is that you are using the hash as a set. What matters is not the value I am assigning; it's just the existence of the keys. So what you want to do is not something like:
if ($hash{$key} == 1) # then key is in the hash
instead:
if (exists $hash{$key}) # then key is in the set
It's actually more efficient to just run an exists check than to bother with the value in the hash, although to me the important thing here is just the concept that you are representing a set just with the keys of the hash. Also, somebody pointed out that by using undef as the value here, we will consume less storage space than we would assigning a value. (And also generate less confusion, as the value does not matter, and my solution would assign a value only to the first element in the hash and leave the others undef, and some other solutions are turning cartwheels to build an array of values to go into the hash; completely wasted effort).
Note that if typing if ( exists $hash{ key } ) isn’t too much work for you (which I prefer to use since the matter of interest is really the presence of a key rather than the truthiness of its value), then you can use the short and sweet
@hash{@key} = ();
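A short sketch of why the exists test is the one to use with this form: after assigning the empty list, every key is present but every value is undef, so a truthiness test reports nothing as found. The names %set, $exists_b and so on are invented for the illustration.

```perl
use strict;
use warnings;

my @array = qw(a b c);

my %set;
@set{@array} = ();          # every key present, every value undef

my $exists_b = exists $set{b} ? 1 : 0;   # key test: found
my $true_b   = $set{b}        ? 1 : 0;   # value test: undef is false
my $exists_z = exists $set{z} ? 1 : 0;   # absent key: not found

print "exists b: $exists_b, true b: $true_b, exists z: $exists_z\n";
```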
I always thought that
foreach my $item (@array) { $hash{$item} = 1 }
was at least nice and readable / maintainable.
There is a presupposition here, that the most efficient way to do a lot of "Does the array contain X?" checks is to convert the array to a hash. Efficiency depends on the scarce resource, often time but sometimes space and sometimes programmer effort. You are at least doubling the memory consumed by keeping a list and a hash of the list around simultaneously. Plus you're writing more original code that you'll need to test, document, etc.
As an alternative, look at the List::MoreUtils module, specifically the functions any(), none(), true() and false(). They all take a block as the conditional and a list as the argument, similar to map() and grep():
print "At least one value undefined" if any { !defined($_) } @list;
I ran a quick test, loading in half of /usr/share/dict/words to an array (25000 words), then looking for eleven words selected from across the whole dictionary (every 5000th word) in the array, using both the array-to-hash method and the any() function from List::MoreUtils.
On Perl 5.8.8 built from source, the array-to-hash method runs almost 1100x faster than the any() method (1300x faster under Ubuntu 6.06's packaged Perl 5.8.7.)
That's not the full story however - the array-to-hash conversion takes about 0.04 seconds which in this case kills the time efficiency of array-to-hash method to 1.5x-2x faster than the any() method. Still good, but not nearly as stellar.
My gut feeling is that the array-to-hash method is going to beat any() in most cases, but I'd feel a whole lot better if I had some more solid metrics (lots of test cases, decent statistical analyses, maybe some big-O algorithmic analysis of each method, etc.) Depending on your needs, List::MoreUtils may be a better solution; it's certainly more flexible and requires less coding. Remember, premature optimization is a sin... :)
In perl 5.10, there's the close-to-magic ~~ operator:
sub invite_in {
my $vampires = [ qw(Angel Darla Spike Drusilla) ];
return ($_[0] ~~ $vampires) ? 0 : 1 ;
}
See here: http://dev.perl.org/perl5/news/2007/perl-5.10.0.html
Also worth noting for completeness, my usual method for doing this with 2 same-length arrays @keys and @vals which you would prefer were a hash...
my %hash = map { $keys[$_] => $vals[$_] } (0..@keys-1);
Raldi's solution can be tightened up to this (the '=>' from the original is not necessary):
my %hash = map { $_,1 } @array;
This technique can also be used for turning text lists into hashes:
my %hash = map { $_,1 } split(",",$line);
Additionally if you have a line of values like this: "foo=1,bar=2,baz=3" you can do this:
my %hash = map { split("=",$_) } split(",",$line);
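As a runnable sketch of the key=value case: each field is split into a two-element list, and map flattens those lists into the pairs the hash assignment expects. The limit of 2 on the inner split is my own addition, so that a value containing "=" stays in one piece.

```perl
use strict;
use warnings;

my $line = "foo=1,bar=2,baz=3";

# Outer split yields "foo=1", "bar=2", "baz=3"; inner split yields (key, value).
my %hash = map { split(/=/, $_, 2) } split(/,/, $line);

print "$_ => $hash{$_}\n" for sort keys %hash;
```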
[EDIT to include]
Another solution offered (which takes two lines) is:
my %hash;
#The values in %hash can only be accessed by doing exists($hash{$key})
#The assignment only works with '= undef;' and will not work properly with '= 1;'
#if you do '= 1;' only the hash key of $array[0] will be set to 1;
@hash{@array} = undef;
You could also use Perl6::Junction.
use Perl6::Junction qw'any';
my @arr = ( 1, 2, 3 );
if( any(@arr) == 1 ){ ... }
If you do a lot of set-theoretic operations, you can also use Set::Scalar or a similar module. Then $s = Set::Scalar->new( @array ) will build the set for you, and you can query it with $s->contains($m).
You can place the code into a subroutine if you don't want to pollute your namespace.
my $hash_ref =
sub{
my %hash;
    @hash{ @{[ qw'one two three' ]} } = undef;
return \%hash;
}->();
Or even better:
sub keylist(@){
    my %hash;
    @hash{@_} = undef;
    return \%hash;
}
my $hash_ref = keylist qw'one two three';
# or
my @key_list = qw'one two three';
my $hash_ref = keylist @key_list;
If you really wanted to pass an array reference:
sub keylist(\@){
    my %hash;
    @hash{ @{$_[0]} } = undef if @_;
    return \%hash;
}
my @key_list = qw'one two three';
my $hash_ref = keylist @key_list;
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
my @a = qw(5 8 2 5 4 8 9);
my @b = qw(7 6 5 4 3 2 1);
my $h = {};
@{$h}{@a} = @b;
print Dumper($h);
gives (note repeated keys get the value at the greatest position in the array - ie 8->2 and not 6)
$VAR1 = {
'8' => '2',
'4' => '3',
'9' => '1',
'2' => '5',
'5' => '4'
};
You might also want to check out Tie::IxHash, which implements ordered associative arrays. That would allow you to do both types of lookups (hash and index) on one copy of your data.
