Better way to sort an array with unusual value - arrays

I have this array:
@raw_stack = (
'900244~dfasdf~ddd3',
'900122~dfasdf~ddd1',
'900244~dfasdf~ddd2',
'900456~dfasdf~ddd4',
'900312~dfasdf~ddd3',
'900456~dfasdf~ddd5',
);
I'd like to sort it by the first '~'-delimited field.
Is there a more elegant way to do this than
looping over the array and splitting each value?

Use a Schwartzian transform:
my @raw_stack = (
'900244~dfasdf~ddd3',
'900122~dfasdf~ddd1',
'900244~dfasdf~ddd2',
'900456~dfasdf~ddd4',
'900312~dfasdf~ddd3',
'900456~dfasdf~ddd5',
);
my @sorted =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, (split /~/)[0]] } @raw_stack;
dump @sorted;    # assumes: use Data::Dump qw(dump);
Benchmark:
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
use Benchmark qw(:all);
my $s = '~dfasdf~ddd3';
my @arr = ();
for (0..20000) {
push @arr, int(rand(100000)) . $s;
}
my $count = -3;
cmpthese($count, {
'ST' => sub {
my @sorted =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, (split /~/)[0]] } @arr;
},
'SORT' => sub {
my #sorted =
sort {
my ($a_0) = split /~/, $a;
my ($b_0) = split /~/, $b;
$a_0 <=> $b_0
} @arr;
},
});
Results:
Array of 200 elements:
       Rate SORT   ST
SORT  267/s   -- -61%
ST    689/s 158%   --
Array of 2000 elements:
       Rate SORT   ST
SORT 18.0/s   -- -71%
ST   61.5/s 242%   --
Array of 20000 elements:
       Rate SORT   ST
SORT 1.35/s   -- -73%
ST   4.96/s 266%   --

Sort and list slices?
sort { ( split( /~/, $a ) )[0] <=> ( split( /~/, $b ) )[0] } @raw_stack;

These might help. They show you how to extract parts of strings to use them to sort the larger strings:
Perlmonks How do I sort an array by (anything)?
perlfaq4 How do I sort an array by (anything)?
Stackoverflow Perl - Sort CSV on a certain column?

Is it always 6 digits? If so, the following would be the simplest and fastest, because fixed-width digit strings sort the same way as strings and as numbers:
my @sorted_stack = sort @raw_stack;
If not,
my @sorted_stack =
sort {
my ($a_0) = split /~/, $a;
my ($b_0) = split /~/, $b;
$a_0 <=> $b_0
} @raw_stack;
A Schwartzian transform might be cleaner if you're used to that, but it's actually slower in this case. [Update: Apparently, it's actually faster than my second solution for larger lists. It's never faster than the first, though.]
my @sorted_stack =
map $_->[0],
sort { $a->[1] <=> $b->[1] }
map [ $_, split /~/ ],
@raw_stack;
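To see why the plain sort is only safe when the leading field has a fixed width, here is a small sketch. The strings in @mixed are made up for illustration and are not from the question.

```perl
use strict;
use warnings;

# Hypothetical data: the field before the first '~' has varying width.
my @mixed = ('900244~x~a', '1000~x~b', '90~x~c');

# Default sort compares strings character by character,
# so "1000..." lands before "90..." even though 1000 > 90.
my @by_string = sort @mixed;

# Comparing the first field numerically gives the numeric order.
my @by_number = sort {
    my ($a_0) = split /~/, $a;
    my ($b_0) = split /~/, $b;
    $a_0 <=> $b_0
} @mixed;

print "string:  @by_string\n";
print "numeric: @by_number\n";
```

With six-digit keys everywhere, both orders coincide, which is what makes the one-line sort usable.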

Related

sort hash of array by value

I am trying to sort by value in a HoA (hash of arrays) where key => [ a, b, c ].
I want to sort alphabetically, and have tried and read a lot with no success. I think it's the commas, but please help! Below is a short snippet. The raw data is exactly as it appears in the Data::Dumper output versus the CLI. I have to use some sort of delimiter, otherwise the CLI output is tedious! Thank you!
use strict;
use warnings;
my ( $lsvm_a,$lsvm_b,%hashA,%hashB );
my $vscincludes = qr/(^0x\w+)\,\w+\,\w+.*/; #/
open (LSMAP_A, "-|", "/usr/ios/cli/ioscli lsmap -vadapter vhost7 -field clientid vtd backing -fmt ," ) or die $!;
while ($lsvm_a = (<LSMAP_A>)) {
chomp($lsvm_a);
next unless $lsvm_a =~ /$vscincludes/;
@{$hashA{$1}} = (split ',', $lsvm_a);
}
open (LSMAP_B, "-|", "/usr/sbin/clcmd -m xxxxxx /usr/ios/cli/ioscli lsmap -vadapter vhost29 -field clientid vtd backing -fmt ," ) or die $!;
while ($lsvm_b = (<LSMAP_B>)) {
chomp($lsvm_b);
next unless $lsvm_b =~ /$vscincludes/;
push @{$hashA{$1}}, (split ',', $lsvm_b);
}
print "\n\nA:";
for my $key ( sort { $hashA{$a} cmp $hashA{$b} } keys %hashA ) {
print "$key => '", join(", ", @{$hashA{$key}}), "'\n";
}
##
print "===\nB:";
foreach my $key ( sort { (@{$hashB{$a}}) cmp (@{$hashB{$b}}) } keys %hashB ) {
print "$key ==> @{$hashB{$key}}\n";
}
print "\n\n__DATA_DUMPER__\n\n";
use Data::Dumper; print Dumper \%hashA; print Dumper \%hashB;
Output
A:
0x00000008 => '0x00000008, atgdb003f_avg01, hdisk10, atgdb003f_ovg01, hdisk96, atgdb003f_pvg01, hdisk68, atgdb003f_rvg01, hdisk8, vtscsi0, atgdb003f_data.5bcd027df10f27bf9a880ce7bc1dd924'
===
B:
0x00000008 => '0x00000008, atgdb003f_avg01, hdisk10, atgdb003f_data, atgdb003f_data.5bcd027df10f27bf9a880ce7bc1dd924, atgdb003f_ovg01, hdisk96, atgdb003f_pvg01, hdisk68, atgdb003f_rvg01, hdisk8'
__DATA_DUMPER__
$VAR1 = {
'0x00000008' => [
'0x00000008',
'atgdb003f_avg01',
'hdisk10',
'atgdb003f_ovg01',
'hdisk96',
'atgdb003f_pvg01',
'hdisk68',
'atgdb003f_rvg01',
'hdisk8',
'vtscsi0',
'atgdb003f_data.5bcd027df10f27bf9a880ce7bc1dd924'
]
};
$VAR1 = {
'0x00000008' => [
'0x00000008',
'atgdb003f_avg01',
'hdisk10',
'atgdb003f_data',
'atgdb003f_data.5bcd027df10f27bf9a880ce7bc1dd924',
'atgdb003f_ovg01',
'hdisk96',
'atgdb003f_pvg01',
'hdisk68',
'atgdb003f_rvg01',
'hdisk8'
]
};
### CLI out ###
###0x00000008,atgdb003f_avg01,hdisk10,atgdb003f_ovg01,hdisk96,atgdb003f_pvg01,hdisk68,atgdb003f_rvg01,hdisk8,vtscsi0,atgdb003f_data.5bcd027df10f27bf9a880ce7bc1dd924
###0x00000008,atgdb003f_avg01,hdisk10,atgdb003f_data,atgdb003f_data.5bcd027df10f27bf9a880ce7bc1dd924,atgdb003f_ovg01,hdisk96,atgdb003f_pvg01,hdisk68,atgdb003f_rvg01,hdisk8
Update: The arrayrefs (hash values) have multiple elements after all, and need to be sorted. Then
for my $key (keys %h) { @{$h{$key}} = sort @{$h{$key}} }
or, more efficiently† (and in the statement-modifier form, with less noise but perhaps less clarity)
$h{$_} = [ sort @{$h{$_}} ] for keys %h;
By default, sort orders lexicographically, as wanted.
The keys are to be sorted numerically, but note that while we can rewrite the arrays to make them sorted, that is not possible with hashes, which are inherently unordered. We can of course print them in sorted order
foreach my $k (sort { $a <=> $b } keys %h) { ... }
This will warn if keys aren't numbers.
† By 56% – 60% in my benchmarks on three different machines, with both v5.16 and v5.30.0
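The keys in the question look like '0x00000008', which <=> does not treat as numbers. One option, and this is my assumption about what is wanted rather than something the question states, is to convert them with the core hex() function before comparing. The sample keys and values below are invented.

```perl
use strict;
use warnings;

# Hypothetical keys in the same "0x..." form as the question's data.
my %h = (
    '0x0000000a' => [ 'hdisk2' ],
    '0x00000008' => [ 'hdisk10' ],
    '0x00000100' => [ 'hdisk7' ],
);

# hex() turns each key into its numeric value, so <=> compares
# real numbers and no "isn't numeric" warning is raised.
my @keys = sort { hex($a) <=> hex($b) } keys %h;

print "$_ => @{ $h{$_} }\n" for @keys;
```

Since all keys here have the same width, a plain string sort would give the same order. hex() only matters once the widths differ.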
Original post
I take it that you need to sort a hash which has an arrayref for a value, where that arrayref has a single element. Then sort on that first element
foreach my $key ( sort { $hashB{$a}->[0] cmp $hashB{$b}->[0] } keys %hashB ) {
print "$key ==> @{$hashB{$key}}\n";
}
See the cmp operator under Equality operators in perlop. It takes scalars, which are compared stringwise (so the sorting attempted in the question with arrays is wrong, since cmp would get the lengths of those arrays to sort by!)
In my understanding your hash to sort is like
$VAR1 = {
'0x00000008' => [ 'atgdb003f_avg01,hdisk10,atgdb003f_ovg01,...' ],
...
}
where each value is an arrayref with exactly one element.
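The point about cmp receiving array lengths can be checked with a small sketch. The hash %h and its contents are hypothetical and chosen so that the two comparisons disagree.

```perl
use strict;
use warnings;

# Hypothetical hash of arrays with different lengths.
my %h = (
    k1 => [ 'zebra' ],
    k2 => [ 'apple', 'mango' ],
);

# cmp puts its operands in scalar context, and an array in scalar
# context yields its element count: this compares "1" with "2".
my $by_length = @{ $h{k1} } cmp @{ $h{k2} };

# Comparing the first elements compares the actual strings.
my $by_first = $h{k1}[0] cmp $h{k2}[0];

print "array cmp array: $by_length\n";
print "first elements:  $by_first\n";
```

The first comparison says k1 sorts before k2 (one element versus two), while the second says the opposite ('zebra' versus 'apple').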

Why Perl Sort function cannot arrange array's element in my expected incremental manner?

Perl Sort function unable to arrange array elements in my expected incremental manner
@array_sort = sort { $a <=> $b } @array
@array = ("BE_10", "BE_110", "BE_111", "BE_23", "BE_34", "BE_220", "BE_335");
@array_sort = sort { $a <=> $b } @array;
print "array_sort = @array_sort\n";
Expected result:
array_sort = BE_10 BE_23 BE_34 BE_110 BE_111 BE_220 BE_335
Actual result:
array_sort = BE_10 BE_110 BE_111 BE_23 BE_34 BE_220 BE_335
Always use use strict; use warnings;. It would have found your problem, which is that all your strings have the numerical value of zero. Since all strings are numerically identical, the sort function you provided always returns zero. Because of this, and because Perl used a stable sort, the order of the strings remained unchanged.
You wish to perform a "natural sort", and there are modules such as Sort::Key::Natural that will do that.
use Sort::Key::Natural qw( natsort );
my @sorted = natsort @unsorted;
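A quick way to confirm the "numerical value of zero" explanation, together with a core-only comparison for when installing a module is not an option. This is only a sketch: it assumes every element contains an underscore followed by digits, as in the question's data.

```perl
use strict;
use warnings;

my @array = ("BE_10", "BE_110", "BE_23");

# A string with no leading digits converts to 0 in numeric context.
# 'no warnings' only silences the "isn't numeric" warning for this demo.
my @values = do { no warnings 'numeric'; map { $_ + 0 } @array };

# Core-only alternative: compare the digits after the underscore.
my @sorted = sort { ($a =~ /_(\d+)/)[0] <=> ($b =~ /_(\d+)/)[0] } @array;

print "numeric values: @values\n";
print "sorted: @sorted\n";
```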
Sounds like a good case for a Schwartzian transform.
If the prefix is always going to be the same and it's just the numbers after the underscore that differ:
my @array = ("BE_10", "BE_110", "BE_111", "BE_23", "BE_34", "BE_220", "BE_335");
my @array_sort = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, (split /_/, $_)[1] ] } @array;
print "array_sort = @array_sort\n";
And if it might be different:
my @array = ("BE_10", "BE_110", "BE_111", "BE_23", "CE_34", "BE_220", "CE_335");
my @array_sort = map { $_->[0] }
sort { $a->[1] cmp $b->[1] || $a->[2] <=> $b->[2] }
map { [ $_, split(/_/, $_) ] } @array;
print "array_sort = @array_sort\n";
The basic idea is that you decompose the original array into a list of array refs holding the original element and the transformed bit(s) you want to sort on, do the sort, and then extract the original elements in the new sorted order.
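The same three stages can be written out one at a time, which may make the chained version easier to read. The intermediate names @decorated and @ordered are mine, added for illustration.

```perl
use strict;
use warnings;

my @array = ("BE_10", "BE_110", "BE_23");

# Step 1: pair each element with its sort key (the number after "_").
my @decorated = map { [ $_, (split /_/, $_)[1] ] } @array;

# Step 2: sort the pairs by the key.
my @ordered = sort { $a->[1] <=> $b->[1] } @decorated;

# Step 3: drop the keys, keeping the original strings in the new order.
my @sorted = map { $_->[0] } @ordered;

print "@sorted\n";
```

The chained form simply feeds each stage into the next without naming the intermediate lists.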

Compare two hash of arrays

I have two arrays, each held in a hash.
Array 1:
my $group = "west";
@{ $my_big_hash{$group} } = (1534,2341,2322,3345,689,3333,4444,5533,3334,5666,6676,3435);
Array 2:
my $element = "Location";
my $group = "west";
@{ $my_tiny_hash{$element}{$group} } = (153,333,667,343);
Now I want to compare
@{ $my_tiny_hash{$element}{$group} }
with
@{ $my_big_hash{$group} }
and check whether all the elements of the tiny hash's array are part of the big hash's array.
As we can see, the tiny hash has only 3-digit elements, and all of them match elements of the big hash if we compare just the first 3 digits.
If the first 3 digits/letters match and all elements are found in the big array, then it's a match; otherwise we have to print the unmatched elements.
It's an array-to-array comparison.
How do we achieve it?
PS: How can this be achieved without Array::Utils?
The solution using Array::Utils is really simple:
my @minus = array_minus( @{ $my_tiny_hash{$element}{$group} } , @{ $my_big_hash{$group} } );
But it compares all the digits, and I just want to match the first 3 digits.
Hope this is clear.
Thanks
This seems to do what you want.
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my (%big_hash, %tiny_hash);
my $group = 'west';
my $element = 'Location';
# Less confusing initialisation!
$big_hash{$group} = [1534,2341,2322,3345,689,3333,4444,5533,3334,5666,6676,3435];
$tiny_hash{$element}{$group} = [153,333,667,343];
# Create a hash where the keys are the first three digits of the numbers
# in the big array. Doesn't matter what the values are.
my %check_hash = map { substr($_, 0, 3) => 1 } @{ $big_hash{$group} };
# grep the small array by checking the elements' existence in %check_hash
my @missing = grep { ! exists $check_hash{$_} } @{ $tiny_hash{$element}{$group} };
say "Missing items: @missing";
Update: Another solution that seems closer to your original code.
my @truncated_big_array = map { substr($_, 0, 3) } @{ $big_hash{$group} };
my @minus = array_minus( @{ $my_tiny_hash{$element}{$group} } , @truncated_big_array );
A quick and bit dirty solution (which extends your existing code).
#!/usr/bin/perl
use strict;
use warnings;
my (%my_big_hash, %my_tiny_hash, @temp_array);
my $group = "west";
@{ $my_big_hash{$group} } = (1534,343,2341,2322,3345,689,3333,4444,5533,3334,5666,6676,3435);
foreach (@{ $my_big_hash{$group} }){
push @temp_array, substr $_, 0,3;
}
my $element = "Location";
my $group2 = "west";
@{ $my_tiny_hash{$element}{$group2} } = (153,333,667,343,698);
#solution below
my %hash = map { $_ => 1 } @temp_array;
foreach my $search (@{$my_tiny_hash{'Location'}->{west}}){
if (exists $hash{$search}){
print "$search exists\n";
}
else{
print "$search does not exist\n";
}
}
Output:
153 exists
333 exists
667 exists
343 exists
698 does not exist
Also see: https://stackoverflow.com/a/39585810/257635
Edit: As per request using Array::Utils.
foreach (@{ $my_big_hash{$group} }){
push @temp_array, substr $_, 0,3;
}
my @minus = array_minus( @{ $my_tiny_hash{$element}{$group} } , @temp_array );
print "@minus";
An alternative, using ordered comparison instead of hashes:
@big = sort (1534,2341,2322,3345,689,3333,4444,5533,3334,5666,6676,3435);
@tiny = sort (153,333,667,343,698);
for(@tiny){
shift @big while @big and ($big[0] cmp $_) <0;
push @{$result{
$_ eq substr($big[0],0,3)
? "found" : "missing" }},
$_;
}
Contents of %result:
{
'found' => [
153,
333,
343,
667
],
'missing' => [
698
]
}

ID tracking while swapping and sorting other two arrays in perl

#! /usr/bin/perl
use strict;
my (@data,$data,@data1,@diff,$diff,$tempS,$tempE, @ID,@Seq,@Start,@End, @data2);
#my $file=<>;
open(FILE, "< ./out.txt");
while (<FILE>){
chomp $_;
#next if ($line =~/Measurement count:/ or $line =~/^\s+/) ;
#push @data, [split ("\t", $line)] ;
my @data = split('\t');
push(@ID, $data[0]);
push(@Seq, $data[1]);
push(@Start, $data[2]);
push(@End, $data[3]);
# push @$data, [split ("\t", $line)] ;
}
close(FILE);
my %hash = map { my $key = "$ID[$_]"; $key => [ $Start[$_], $End[$_] ] } (0..$#ID);
for my $key ( %hash ) {
print "Key: $key contains: ";
for my $value ($hash{$key} ) {
print " $hash{$key}[0] ";
}
print "\n";
}
for (my $j=0; $j <=$#Start ; $j++)
{
if ($Start[$j] > $End[$j])
{
$tempS=$Start[$j];
$Start[$j]=$End[$j];
$End[$j]=$tempS;
}
print"$tempS\t$Start[$j]\t$End[$j]\n";
}
my @sortStart = sort { $a <=> $b } @Start;
my @sortEnd = sort { $a <=> $b } @End;
#open(OUT,">>./trial.txt");
for(my $i=1521;$i>=0;$i--)
{
print "hey";
my $diff = $sortStart[$i] - $sortStart[$i-1];
print "$ID[$i]\t$diff\n";
}
I have three arrays of the same length: ID with IDs (strings), and Start and End with integer values (read from a file).
I want to loop through all these arrays while keeping track of the IDs. First I swap the elements of Start and End wherever Start > End; then I have to sort these two arrays for further processing (I subtract consecutive values, e.g. Start[0]-Start[1], for each item in Start). While sorting, the positions change, and since each ID belongs to one Start and End pair, how can I keep track of my IDs while sorting?
The three arrays in question are ID, Start and End.
Here is a small chunk of my input data:
DQ704383 191990066 191990037
DQ698580 191911184 191911214
DQ724878 191905507 191905532
DQ715191 191822657 191822686
DQ722467 191653368 191653339
DQ707634 191622552 191622581
DQ715636 191539187 191539157
DQ692360 191388765 191388796
DQ722377 191083572 191083599
DQ697520 189463214 189463185
DQ709562 187245165 187245192
DQ540163 182491372 182491400
DQ720940 180753033 180753060
DQ707760 178340696 178340726
DQ725442 178286164 178286134
DQ711885 178250090 178250119
DQ718075 171329314 171329344
DQ705091 171062479 171062503
The columns above are ID, Start and End respectively. Where Start > End, I swapped the values between those two arrays only. After swapping, the descending order may change, but I want the values in descending order together with their corresponding IDs, for the subtraction explained above.
Don't use different arrays, use a hash to keep the related pieces of information together.
#!/usr/bin/perl
use warnings;
use strict;
use enum qw( START END );
my %hash;
while (<>) {
my ($id, $start, $end) = split;
$hash{$id} = [ $start < $end ? ($start, $end)
: ($end, $start) ];
}
my @by_start = sort { $hash{$a}[START] <=> $hash{$b}[START] } keys %hash;
my @by_end = sort { $hash{$a}[END] <=> $hash{$b}[END] } keys %hash;
use Test::More;
is_deeply(\@by_start, \@by_end, 'same');
done_testing();
Moreover, in the data sample you provided, the order of the IDs is the same regardless of which field you sort by.
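If the goal is the difference between consecutive Start values with the ID still attached, one possible sketch is to keep each row as a single record and sort the records. The field names id, start and end are my own choice. The three rows are copied from the question's sample, and descending order is assumed because the question asks for it.

```perl
use strict;
use warnings;

# Three rows from the question's sample: ID, Start, End.
my @rows = (
    [ 'DQ704383', 191990066, 191990037 ],
    [ 'DQ698580', 191911184, 191911214 ],
    [ 'DQ724878', 191905507, 191905532 ],
);

# Normalise each record so that start <= end, keeping the ID attached.
my @records = map {
    my ($id, $s, $e) = @$_;
    ($s, $e) = ($e, $s) if $s > $e;
    +{ id => $id, start => $s, end => $e };
} @rows;

# Sort whole records by start, descending, so IDs move with their values.
my @sorted = sort { $b->{start} <=> $a->{start} } @records;

# Difference between each start and the next one in sorted order.
my @diffs;
for my $i (0 .. $#sorted - 1) {
    my $diff = $sorted[$i]{start} - $sorted[$i + 1]{start};
    push @diffs, "$sorted[$i]{id}\t$diff";
}
print "$_\n" for @diffs;
```

Because each record carries its own ID, no separate bookkeeping is needed when the order changes.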

how to arrange array values in ascending order in perl

I need to arrange array values in ascending order in Perl. I used sort with a comparison on the values below, but it is not working. Kindly help as soon as possible.
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
Note: I want to sort the array values, not the array indices.
Thanks in advance
How about using a Schwartzian transform:
my @unsorted = qw(
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
);
my @sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { (my $t=$_)=~s/\D+//g; [$_, $t] }
@unsorted;
dump @sorted;    # assumes: use Data::Dump qw(dump);
output:
(
"p1.txt",
"p2.txt",
"p3.txt",
"p4.txt",
"p5.txt",
"p6.txt",
"p7.txt",
"p8.txt",
"p9.txt",
"p10.txt",
"p11.txt",
"p12.txt",
"p13.txt",
"p14.txt",
"p15.txt",
"p16.txt",
"p17.txt",
"p18.txt",
"p19.txt",
"p20.txt",
"p21.txt",
"p22.txt",
"p23.txt",
"p24.txt",
)
Consider using Sort::Naturally for this task:
use strict;
use warnings;
use Sort::Naturally qw/nsort/;
chomp( my @data = <DATA> );
print "$_\n" for nsort @data;
__DATA__
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
Partial output:
p1.txt
p2.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
p10.txt
p11.txt
p12.txt
...
p22.txt
p23.txt
p24.txt
Hope this helps!
You need to write your own comparison function and pass that to sort:
sub custom_sort
{
$a =~ /^p(\d+)\.txt$/; #capture the number in $a
my $intA = $1;
$b =~ /^p(\d+)\.txt$/; #capture the number in $b
my $intB = $1;
return ($intA <=> $intB); #compare the numbers and return
}
And call:
@sortedArray = sort custom_sort @array;
See: http://perldoc.perl.org/functions/sort.html and http://perldoc.perl.org/perlop.html#Equality-Operators
Easiest would be to use the nsort_by function from List::UtilsBy; this sorts a list by the numbers returned from its code block. You would then invoke this with a code block to extract the number from the filename:
use List::UtilsBy qw( nsort_by );
my @sorted = nsort_by { /^p(\d+)\.txt$/ and $1 } @array;
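List::UtilsBy is a CPAN module rather than part of core Perl. Where it is not available, roughly the same effect can be sketched with a lookup hash so that each number is extracted only once. The hash name %num is mine, and giving non-matching names a key of 0 is an assumption made for this sketch.

```perl
use strict;
use warnings;

my @array = qw(p1.txt p10.txt p11.txt p2.txt p20.txt p3.txt);

# Extract the number from each name once and remember it.
# Names that do not match the pattern get 0 (an assumption for this sketch).
my %num;
$num{$_} = /^p(\d+)\.txt$/ ? $1 : 0 for @array;

# Sort the names by their remembered numbers.
my @sorted = sort { $num{$a} <=> $num{$b} } @array;

print "$_\n" for @sorted;
```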
