calculate the index of a array element reference - arrays

Is it possible to calculate the index of a referenced scalar from an array?
in C you can use pointer arithmetic to retrieve the index.
SomeType array[500];
const SomeType* e = &array[42];
// [...]
size_t index = e-array;
Is there some similar way in Perl?
my #array = (1,2,3,4,5,6,7,8,9,0);
my $e = \$array[4];
# [...]
my $index = '???';
The reason:
I have a relatively large (> 6Mio entries) Array with equally structured geometrically related data.
I also have some kind of priority based queue that contains references to this array. while processing this queue new elements are added and the queue has to be resorted. Since this queue also will grow quite large. and the priorities of the elements changes and are derived from the array element and its neighbors, i would like to avoid complex entries in the queue (memory size and allocation performance) and have only the reference their to directly access the information from the array.
But it seems that using indexes in the task-list would be the best option.

List::MoreUtils provides routines such as:
first_index
last_index
bsearch_index
indexes
etc. Depending on the circumstances, using one of these may be more efficient than just using plain old grep:
my #i = grep $array[$_] == $v, 0 .. $#array;

$e contains a reference, and you can compare references for equality.
my #array = (0,0,0,0,0,0,0,0,0,0,0);
my $e = \$array[4];
#..
my $index =$#array;
$index-- while ($e ne \$array[$index] && $index >=0);
print $index;
prints out 4.

Related

2D Array Printing as Reference

I have the code similar to below:
my #array1 = (); #2d array to be used
my $string1 = "blank1";
my $string2 = "blank2";
my $string3 = "blank3";
my #temp = ($string1, $string2, $string3);
push (#array1, \#temp);
The reason I am assigning the strings and then putting them into an array is because they are in a loop and the values get updated in the loop (#array1 is not declared in the loop).
When I run my program, it only gives me a reference to an array rather than an actual 2D array. How can I get it to print out the content as a 2D array and not as a reference or flattened out to a 1D array?
I would like an output like [[blank1, blank2, blank3],....] so i can access it like $array1[i][j]
An array can only have scalars for elements. Thus this includes references, to arrays for example, what enables us to build complex data structures. See perldsc, Tom's Perl Data Structure Cookbook.
Elements of those ("second-level") arrays are accessed by dereferencing, so $array1[0]->[1] is the second element of the array whose reference is the first element of the top-level array (#array1). Or, for convenience, a simpler syntax is allowed as well: $array1[0][1].
If we want a list of all elements of a second-level array then dereference it with #, like:
my #l2 = #{ $array1[0] }; # or, using
my #l2 = $array1[0]->#*; # postfix dereferencing
Or, to get just a few elements of the array, but in one scoop -- a slice
my #l2_slice = #{$array1[0]}[1..2]; # or
my #l2_slice = $array1[0]->#[1..2]; # postfix reference slice
what returns the list with the second and third elements of the same second-level array.
The second lines are of a newer syntax called postfix dereferencing, stable as of v5.24. It avails us with the same logic for getting elements as when we drill for a single one, by working left-to-right all the way. So ->#* to get a list of all elements for an arrayref,->%* for a hashref (etc). See for instance a perl.com article and an Effective Perler article.
There is a thing to warn about when it comes to multidimensional structures built with references. There are two distinct ways to create them: by using references to existing, named, variables
my #a1 = 5 .. 7;
my %h1 = ( a => 1, b => 2 );
my #tla1 = (\#a1, \%h1);
or by using anonymous ones, where arrayrefs are constructed by [] and hashrefs by {}
my #tla2 = ( [ 5..7 ], { a => 1, b => 2 } );
The difference is important to keep in mind. In the first case, the references to variables that the array carries can be used to change those variables -- if we change elements of #tla1 then we really change the referred variables
$tla1[0][1] = 100; # now #a1 == 5, 100, 7
Also, changing variables with references in #tla1 is seen via the top-level array as well.
With anonymous arrays and hashes in #tla this isn't the case as elements (references) of #tla access independent data, which cannot be accessed (and changed) in any other way.
Both of these ways to build complex data structures have their uses.

Check whether an array contains a value from another array

I have an array of objects, and an array of acceptable return values for a particular method. How do I reduce the array of objects to only those whose method in question returns a value in my array of acceptable values?
Right now, I have this:
my #allowed = grep {
my $object = $_;
my $returned = $object->method;
grep {
my $value = $_;
$value eq $returned;
} #acceptableValues;
} #objects;
The problem is that this is a compound loop, which I'd like to avoid. This program is meant to scale to arbitrary sizes, and I want to minimize the number of iterations that are run.
What's the best way to do this?
You could transform the accepted return values into a hash
my %values = map { $_ => 1 } #acceptedValues;
And grep with the condition that the key exists instead of your
original grep:
my #allowed = grep $values{ $_->method }, #objects;
Anyway, grep is pretty fast in itself, and this is just an idea of a
common approach to checking if an element is in an array. Try not to
optimize what's not needed, since it would only be worth in really big
arrays. Then you could for example sort the accepted results array and
use a binary search, or cache results if they repeat. But again, don't
worry with this kind of optimisation unless you're dealing with hundreds
of thousands of items — or more.
Elements supposed to be present in given arrays seems unique. So, I will make a hash containing the count of elements from both arrays. If there is any element with count greater than 1, it means its present in both the arrays.
my %values;
my #allowed;
map {$values{$_}++} (#acceptableValues, #objects);
for (keys %values) {
push #allowed, $_ if $values{$_} > 1;
}

How to create two dimensional array with 2 arrays

How to create multiple dimentional array with 2 arrays?
#param=("param1","param2","param3");
#value=("value1_1, value1_2, value1_3", "value2_1, value2_2, value2_3","value3_1, value3_2, value3_3");
Output:
#out=(["param1"]["value1_1", "value1_2", "value1_3"], ["param2"]["value2_1", "value2_2", "value2_3"], ["param3"]["value3_1", "value3_2", "value3_3"])
I've tried this way:
$j=0;
foreach $i(#param1){
push #{$out[$i]}, split(", ", $value[$j]);
$j++;}
It's not exactly clear to me what data structure you want to create.
However, I assume you are trying to create a hash of arrays (a hash table is also known as dictionary or associative array), not an array.
The difference in Perl is that an array always uses integers as keys, whereas a hash always uses strings.
The resulting data structure would then look like
%out = (
'param1' => ['value1_1', 'value1_2', 'value1_3'],
'param2' => ['value2_1', 'value2_2', 'value2_3'],
'param3' => ['value3_1', 'value3_2', 'value3_3'],
);
We can create this data structure like so:
my %out;
for my $i (0 .. $#param) {
$out{$param[$i]} = [split /,\s*/, $value[$i]];
}
Note that $#foo is the highest index in the #foo array. Therefore, 0 .. $#foo would be the range of all indices in #foo. Also note that entries in hashes are accessed with a curly brace subscript $hash{$key}, unlike arrays which use square brackets $array[$index].
You can access multiple elements in a hash or array at the same time by using a slice – #foo{'a', 'b', 'c'} is equivalent to ($foo{a}, $foo{b}, $foo{c}). We can also transform a list of elements by using the map {BLOCK} LIST function. Together, this allows for the following solution:
my %out;
#out{#param} = map { [split /,\s*/, $_] } #value;
Inside the map block, the $_ variable is set to each item in the input list in turn.
To learn more about complex data structures, read (in this order):
perldoc perreftut
perldoc perllol
perldoc perldsc
You can also read the documentation for the map function and for the foreach loop.

most efficient way to obtain all the elements between two values in a Perl array

I have a list of integer values; each of these integer values in associated to a real value.
to give you an idea, they might be represented like this:
1 0.48
5 0.56
6 0.12
20 1.65
25 1.50
not all integers are represented in the list, only those who have a value associated with them.
given a range of integer values i have to perform some operation on the real values associated to any integer between the extremes of the range. for example:
given the range 5-20 i would need to extract the real values associated with the integers 5, 6, and 20 and then perform some operation on them.
right now the best i could come up with was to use the integer values as keys to a hash and loop over all the sorted integer values and check that the each value was between the range; like so:
foreach (sort key %hash)
{
if ($_ >= $rangemin && $_ <= $rangemax)
{
push #somearray, $hash{$_}
}
last if $_ >= $rangemax;
}
The real list i'm dealing with, however, is much longer and this implementation takes a lot of time to execute.
is there a faster/more efficient way to obtain a list of values lying between two arbitrary values in an array?
Don't sort, there's no need to.
This may be slightly faster:
#somearray = #hash{ grep $_ >= $rangemin && $_ <= $rangemax, keys %hash };
(building up a list of desired indexes by using grep to filter all the keys, then using a hash slice to get all the values at once).
You would have to benchmark it to know for certain.
The other alternative is to loop from $rangemin to $rangemax:
for ($rangemin..$rangemax) {
push #somearray, $hash{$_} if exists $hash{$_};
}
or
for ($rangemin..$rangemax) {
push #somearray, $hash{$_} // ();
}
or
#somearray = #hash{ grep exists $hash{$_}, $rangemin..$rangemax };
Which is fastest will depend on greatly on the sparseness of your data and the size of range and the percentage of hash values you are including.
Whether its faster or not probably depends on your data, but you can simply loop over the numbers, and check if they exist:
for my $num ($rangemin .. $rangemax) {
if (defined $hash{$num}) { # number exists
# do stuff
}
}
As a variation on that, you can use grep to get a list of indexes:
my #range = grep defined($hash{$_}), $rangemin .. $rangemax;
You don't need to do a full collection scan, you should just iterate 5-20 and get the value associated with that key from the collection, if it exists (is not undef or defined).

In Perl, how do I create a hash whose keys come from a given array?

Let's say I have an array, and I know I'm going to be doing a lot of "Does the array contain X?" checks. The efficient way to do this is to turn that array into a hash, where the keys are the array's elements, and then you can just say if($hash{X}) { ... }
Is there an easy way to do this array-to-hash conversion? Ideally, it should be versatile enough to take an anonymous array and return an anonymous hash.
%hash = map { $_ => 1 } #array;
It's not as short as the "#hash{#array} = ..." solutions, but those ones require the hash and array to already be defined somewhere else, whereas this one can take an anonymous array and return an anonymous hash.
What this does is take each element in the array and pair it up with a "1". When this list of (key, 1, key, 1, key 1) pairs get assigned to a hash, the odd-numbered ones become the hash's keys, and the even-numbered ones become the respective values.
#hash{#array} = (1) x #array;
It's a hash slice, a list of values from the hash, so it gets the list-y # in front.
From the docs:
If you're confused about why you use
an '#' there on a hash slice instead
of a '%', think of it like this. The
type of bracket (square or curly)
governs whether it's an array or a
hash being looked at. On the other
hand, the leading symbol ('$' or '#')
on the array or hash indicates whether
you are getting back a singular value
(a scalar) or a plural one (a list).
#hash{#keys} = undef;
The syntax here where you are referring to the hash with an # is a hash slice. We're basically saying $hash{$keys[0]} AND $hash{$keys[1]} AND $hash{$keys[2]} ... is a list on the left hand side of the =, an lvalue, and we're assigning to that list, which actually goes into the hash and sets the values for all the named keys. In this case, I only specified one value, so that value goes into $hash{$keys[0]}, and the other hash entries all auto-vivify (come to life) with undefined values. [My original suggestion here was set the expression = 1, which would've set that one key to 1 and the others to undef. I changed it for consistency, but as we'll see below, the exact values do not matter.]
When you realize that the lvalue, the expression on the left hand side of the =, is a list built out of the hash, then it'll start to make some sense why we're using that #. [Except I think this will change in Perl 6.]
The idea here is that you are using the hash as a set. What matters is not the value I am assigning; it's just the existence of the keys. So what you want to do is not something like:
if ($hash{$key} == 1) # then key is in the hash
instead:
if (exists $hash{$key}) # then key is in the set
It's actually more efficient to just run an exists check than to bother with the value in the hash, although to me the important thing here is just the concept that you are representing a set just with the keys of the hash. Also, somebody pointed out that by using undef as the value here, we will consume less storage space than we would assigning a value. (And also generate less confusion, as the value does not matter, and my solution would assign a value only to the first element in the hash and leave the others undef, and some other solutions are turning cartwheels to build an array of values to go into the hash; completely wasted effort).
Note that if typing if ( exists $hash{ key } ) isn’t too much work for you (which I prefer to use since the matter of interest is really the presence of a key rather than the truthiness of its value), then you can use the short and sweet
#hash{#key} = ();
I always thought that
foreach my $item (#array) { $hash{$item} = 1 }
was at least nice and readable / maintainable.
There is a presupposition here, that the most efficient way to do a lot of "Does the array contain X?" checks is to convert the array to a hash. Efficiency depends on the scarce resource, often time but sometimes space and sometimes programmer effort. You are at least doubling the memory consumed by keeping a list and a hash of the list around simultaneously. Plus you're writing more original code that you'll need to test, document, etc.
As an alternative, look at the List::MoreUtils module, specifically the functions any(), none(), true() and false(). They all take a block as the conditional and a list as the argument, similar to map() and grep():
print "At least one value undefined" if any { !defined($_) } #list;
I ran a quick test, loading in half of /usr/share/dict/words to an array (25000 words), then looking for eleven words selected from across the whole dictionary (every 5000th word) in the array, using both the array-to-hash method and the any() function from List::MoreUtils.
On Perl 5.8.8 built from source, the array-to-hash method runs almost 1100x faster than the any() method (1300x faster under Ubuntu 6.06's packaged Perl 5.8.7.)
That's not the full story however - the array-to-hash conversion takes about 0.04 seconds which in this case kills the time efficiency of array-to-hash method to 1.5x-2x faster than the any() method. Still good, but not nearly as stellar.
My gut feeling is that the array-to-hash method is going to beat any() in most cases, but I'd feel a whole lot better if I had some more solid metrics (lots of test cases, decent statistical analyses, maybe some big-O algorithmic analysis of each method, etc.) Depending on your needs, List::MoreUtils may be a better solution; it's certainly more flexible and requires less coding. Remember, premature optimization is a sin... :)
In perl 5.10, there's the close-to-magic ~~ operator:
sub invite_in {
my $vampires = [ qw(Angel Darla Spike Drusilla) ];
return ($_[0] ~~ $vampires) ? 0 : 1 ;
}
See here: http://dev.perl.org/perl5/news/2007/perl-5.10.0.html
Also worth noting for completeness, my usual method for doing this with 2 same-length arrays #keys and #vals which you would prefer were a hash...
my %hash = map { $keys[$_] => $vals[$_] } (0..#keys-1);
Raldi's solution can be tightened up to this (the '=>' from the original is not necessary):
my %hash = map { $_,1 } #array;
This technique can also be used for turning text lists into hashes:
my %hash = map { $_,1 } split(",",$line)
Additionally if you have a line of values like this: "foo=1,bar=2,baz=3" you can do this:
my %hash = map { split("=",$_) } split(",",$line);
[EDIT to include]
Another solution offered (which takes two lines) is:
my %hash;
#The values in %hash can only be accessed by doing exists($hash{$key})
#The assignment only works with '= undef;' and will not work properly with '= 1;'
#if you do '= 1;' only the hash key of $array[0] will be set to 1;
#hash{#array} = undef;
You could also use Perl6::Junction.
use Perl6::Junction qw'any';
my #arr = ( 1, 2, 3 );
if( any(#arr) == 1 ){ ... }
If you do a lot of set theoretic operations - you can also use Set::Scalar or similar module. Then $s = Set::Scalar->new( #array ) will build the Set for you - and you can query it with: $s->contains($m).
You can place the code into a subroutine, if you don't want pollute your namespace.
my $hash_ref =
sub{
my %hash;
#hash{ #{[ qw'one two three' ]} } = undef;
return \%hash;
}->();
Or even better:
sub keylist(#){
my %hash;
#hash{#_} = undef;
return \%hash;
}
my $hash_ref = keylist qw'one two three';
# or
my #key_list = qw'one two three';
my $hash_ref = keylist #key_list;
If you really wanted to pass an array reference:
sub keylist(\#){
my %hash;
#hash{ #{$_[0]} } = undef if #_;
return \%hash;
}
my #key_list = qw'one two three';
my $hash_ref = keylist #key_list;
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
my #a = qw(5 8 2 5 4 8 9);
my #b = qw(7 6 5 4 3 2 1);
my $h = {};
#{$h}{#a} = #b;
print Dumper($h);
gives (note repeated keys get the value at the greatest position in the array - ie 8->2 and not 6)
$VAR1 = {
'8' => '2',
'4' => '3',
'9' => '1',
'2' => '5',
'5' => '4'
};
You might also want to check out Tie::IxHash, which implements ordered associative arrays. That would allow you to do both types of lookups (hash and index) on one copy of your data.

Resources