how to arrange array values in ascending order in perl

how to arrange array values in ascending order in perl - arrays

I need to arrange the array values in ascending order in perl, i used sort compare option for below values but not working, kindly help as soon as possible
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
note: i want to sort the array values not array index
Thanks in advance

How about using schwartzian transform, doc here and here:
my #unsorted = qw(
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
);
my #sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { (my $t=$_)=~s/\D+//g; [$_, $t] }
#unsorted;
dump#sorted;
output:
(
"p1.txt",
"p2.txt",
"p3.txt",
"p4.txt",
"p5.txt",
"p6.txt",
"p7.txt",
"p8.txt",
"p9.txt",
"p10.txt",
"p11.txt",
"p12.txt",
"p13.txt",
"p14.txt",
"p15.txt",
"p16.txt",
"p17.txt",
"p18.txt",
"p19.txt",
"p20.txt",
"p21.txt",
"p22.txt",
"p23.txt",
"p24.txt",
)

Consider using Sort::Naturally for this task:
use strict;
use warnings;
use Sort::Naturally qw/nsort/;
chomp( my #data = <DATA> );
print "$_\n" for nsort #data;
__DATA__
p1.txt
p10.txt
p11.txt
p12.txt
p13.txt
p14.txt
p15.txt
p16.txt
p17.txt
p18.txt
p19.txt
p2.txt
p20.txt
p21.txt
p22.txt
p23.txt
p24.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
Partial output:
p1.txt
p2.txt
p3.txt
p4.txt
p5.txt
p6.txt
p7.txt
p8.txt
p9.txt
p10.txt
p11.txt
p12.txt
...
p22.txt
p23.txt
p24.txt
Hope this helps!

You need to create your own sort algorithm and pass that to sort:
sub custom_sort
{
$a =~ /^p(\d+)\.txt$/; #capture the number in $a
my $intA = $1;
$b =~ /^p(\d+)\.txt$/; #capture the number in $b
my $intB = $1;
return ($intA <=> $intB); #compare the numbers and return
}
And call:
#sortedArray = sort custom_sort #array;
See: http://perldoc.perl.org/functions/sort.html and http://perldoc.perl.org/perlop.html#Equality-Operators

Easiest would be to use the nsort_by function from List::UtilsBy; this sorts a list by the numbers returned from its code block. You would then invoke this with a code block to extract the number from the filename:
use List::UtilsBy qw( nsort_by );
my #sorted = nsort_by { /^p(\d+)\.txt$/ and $1 } #array;

Related

Why Perl Sort function cannot arrange array's element in my expected incremental manner?

Perl Sort function unable to arrange array elements in my expected incremental manner
#array_sort = sort { $a <=> $b } #array
#array = ("BE_10", "BE_110", "BE_111", "BE_23", "BE_34", "BE_220", "BE_335");
#array_sort = sort { $a <=> $b } #array;
print "array_sort = #array_sort\n";
Expected result:
array_sort = BE_10 BE_23 BE_34 BE_110 BE_111 BE_220 BE_335
Actual result:
array_sort = BE_10 BE_110 BE_111 BE_23 BE_34 BE_220 BE_335

Always use use strict; use warnings;. It would have found your problem, which is that all your strings have the numerical value of zero. Since all strings are numerically identical, the sort function you provided always returns zero. Because of this, and because Perl used a stable sort, the order of the strings remained unchanged.
You wish to perform a "natural sort", and there are modules such as Sort::Key::Natural that will do that.
use Sort::Key::Natural qw( natsort );
my #sorted = natsort #unsorted;

Sounds like a good case for a Schwartzian transform.
If the prefix is always going to be the same and it's just the numbers after the underscore that differ:
my #array = ("BE_10", "BE_110", "BE_111", "BE_23", "BE_34", "BE_220", "BE_335");
my #array_sort = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [ $_, (split /_/, $_)[1] ] } #array;
print "array_sort = #array_sort\n";
And if it might be different:
my #array = ("BE_10", "BE_110", "BE_111", "BE_23", "CE_34", "BE_220", "CE_335");
my #array_sort = map { $_->[0] }
sort { $a->[1] cmp $b->[1] || $a->[2] <=> $b->[2] }
map { [ $_, split(/_/, $_) ] } #array;
print "array_sort = #array_sort\n";
Basic idea is that you decompose the original array into a list of array refs holding the original element and the transformed bit(s) you want to sort on, do the sort, and then extract the original elements in the new sorted order.

Need help sorting file list based on datestamp in filename

Unsorted data
5CM00225_10_16_2017_10_54_42.xml
5CM10538_10_16_2017_11_04_18.xml
1ZM06004_10_16_2017_11_04_14.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM05391_10_15_2017_08_51_07.xml
5ZM05388_10_17_2017_08_01_06.xml
5ZM00058_10_17_2017_08_00_49.xml
NMC00166_10_15_2017_08_51_06.xml
5CM10538_10_15_2017_08_51_06.xml
Expected results
NMC00166_10_15_2017_08_51_06.xml
5CM10538_10_15_2017_08_51_06.xml
5ZM05391_10_15_2017_08_51_07.xml
5CM00225_10_16_2017_10_54_42.xml
1ZM06004_10_16_2017_11_04_14.xml
5CM10538_10_16_2017_11_04_18.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM00058_10_17_2017_08_00_49.xml
5ZM05388_10_17_2017_08_01_06.xml
I use Net::SFTP to get a directory listing off a remote site and compare to a local file listing. I'd like to sort the list by date in the filename, but I'm running into issues due to there being other information in the string that I need to ignore.
my $sftp = Net::SFTP->new( $host, %args);
my #list = $sftp->ls($path);
open(my $fh, '>', $file); # open a log file to save remote directory listing
my #sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, $_=~/(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})_(\d{2})/] } # unsuccessful sorting attempt
#list;
foreach my $item (#sorted) {
$i = ${item}->{filename};
print $fh "$1\n"; # prints each record to the open log file
}
close $fh;
I have done sorting before and plenty of regex but never at the same time, and I'm clearly bungling it up, because it isn't sorting anything, and not throwing any errors.
I thought about extracting the DD_MM_YYYY_hh_mm_ss out of each string and trying to use it as a reference, but I didn't make any usable headway so I scrapped the idea.

This produces your desired output. It splits each line on underscore or period into a list, then only keeps the "columns" you want, in the order you want them. It keeps the year, followed by the month, day, etc. Then it joins the list elements into a new date string, then sorts lines based on dates.
use warnings;
use strict;
my #list;
while (<DATA>) {
chomp;
push #list, $_;
}
my #sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, join '', (split /[_.]/)[3,1,2,4,5,6] ] }
#list;
__DATA__
5CM00225_10_16_2017_10_54_42.xml
5CM10538_10_16_2017_11_04_18.xml
1ZM06004_10_16_2017_11_04_14.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM05391_10_15_2017_08_51_07.xml
5ZM05388_10_17_2017_08_01_06.xml
5ZM00058_10_17_2017_08_00_49.xml
NMC00166_10_15_2017_08_51_06.xml
5CM10538_10_15_2017_08_51_06.xml
I believe your code fails because it returns the list in the order they appear on the line, namely month, day, etc.

Probably not the prettiest solution but it works:
use strict;
use warnings;
use Data::Dumper;
my #list = (
'5CM00225_10_16_2017_10_54_42.xml',
'5CM10538_10_16_2017_11_04_18.xml',
'1ZM06004_10_16_2017_11_04_14.xml',
'5XM10010_10_17_2017_08_00_47.xml',
'5ZM05391_10_15_2017_08_51_07.xml',
'5ZM05388_10_17_2017_08_01_06.xml',
'5ZM00058_10_17_2017_08_00_49.xml',
'NMC00166_10_15_2017_08_51_06.xml',
'5CM10538_10_15_2017_08_51_06.xml'
);
my #sorted = sort {
my ($mm1,$dd1,$yy1,$hh1,$min1,$ss1) = ($a =~ /_(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})_(\d{2})\.xml$/);
my ($mm2,$dd2,$yy2,$hh2,$min2,$ss2) = ($b =~ /_(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})_(\d{2})\.xml$/);
my $x = $yy1.$mm1.$dd1.$hh1.$min1.$ss1;
my $y = $yy2.$mm2.$dd2.$hh2.$min2.$ss2;
$x <=> $y;
} #list;
print Dumper(\#sorted);

To parse and compare dates it also makes sense using a date-time module, Time::Piece here.
A naive version (see below for a more efficient one)
use warnings;
use strict;
use feature 'say';
use Time::Piece;
my #orig = (
'5CM00225_10_16_2017_10_54_42.xml',
'5CM10538_10_16_2017_11_04_18.xml',
'1ZM06004_10_16_2017_11_04_14.xml',
'5XM10010_10_17_2017_08_00_47.xml',
'5ZM05391_10_15_2017_08_51_07.xml',
'5ZM05388_10_17_2017_08_01_06.xml',
'5ZM00058_10_17_2017_08_00_49.xml',
'NMC00166_10_15_2017_08_51_06.xml',
'5CM10538_10_15_2017_08_51_06.xml',
);
my $dt = Time::Piece->new;
my #sorted = sort {
my $a_dt = $dt->strptime($a =~ /_(.*)\./, '%m_%d_%Y_%H_%M_%S');
my $b_dt = $dt->strptime($b =~ /_(.*)\./, '%m_%d_%Y_%H_%M_%S');
$a_dt <=> $b_dt
} #orig;
say for #sorted;
This runs a regex and strptime for every comparison.
Instead, precompute them all
my #sorted =
map { $_->[1] }
sort { $a->[0] <=> $b->[0] }
map { [ $dt->strptime(/_(.*)\./, '%m_%d_%Y_%H_%M_%S'), $_ ] }
#orig;
This extracts the date-time portion of the string and builds a date-time object from it with strptime, placing it in an arrayref together with the original string. It does this for the whole input using map.
Then that list is passed to sort which sorts it by its first element, where the Time::Piece object's builtin comparison is used. Then the second map pulls the original strings out, for our result.

Timestamp combined with first 9 characters can be used as hash key.
Then it is just a matter to sort hash on key and output data.
use strict;
use warnings;
use feature 'say';
my %hash;
while(<DATA>) {
chomp;
next unless /(.+?)_(.+?)\.xml/;
$hash{"$2_$1"} = $_;
}
say $hash{$_} for sort keys %hash;
__DATA__
5CM00225_10_16_2017_10_54_42.xml
5CM10538_10_16_2017_11_04_18.xml
1ZM06004_10_16_2017_11_04_14.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM05391_10_15_2017_08_51_07.xml
5ZM05388_10_17_2017_08_01_06.xml
5ZM00058_10_17_2017_08_00_49.xml
NMC00166_10_15_2017_08_51_06.xml
5CM10538_10_15_2017_08_51_06.xml
Output
5CM10538_10_15_2017_08_51_06.xml
NMC00166_10_15_2017_08_51_06.xml
5ZM05391_10_15_2017_08_51_07.xml
5CM00225_10_16_2017_10_54_42.xml
1ZM06004_10_16_2017_11_04_14.xml
5CM10538_10_16_2017_11_04_18.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM00058_10_17_2017_08_00_49.xml
5ZM05388_10_17_2017_08_01_06.xml

Check words and synonyms

I have an array with some words, and another array with words and synonyms. I'd like to create a third array when I find a matchin word between first and second array. I tried with grep but I'm not able to write the code in a proper way in order to get what I want.
The problem is that elements in array 1 can be found in array 2 at the beginning but also at the end or in the middle.
Maybe it's easier with an exemple:
#array1 = qw(chose, abstraction);
#array2 = (
"inspirer respirer",
"incapable",
"abstraction",
"abaxial",
"cause,chose,objet",
"ventral",
"chose,objet"
);
The result it should be
#array3 = ("abstraction", "cause,chose,objet", "chose,objet");
Is it right to use "grep"?
I'm not able to write a right syntax to solve the problem..
Thank you

You can construct a regular expression from the array1, then filter the array2 using it:
#!/usr/bin/perl
use warnings;
use strict;
my #array1 = qw(chose, abstraction);
my #array2 = (
"inspirer respirer",
"incapable",
"abstraction",
"abaxial",
"cause,chose,objet",
"ventral",
"chose,objet"
);
my $regex = join '|', map quotemeta $_, #array1; # quotemeta needed for special characters.
$regex = qr/$regex/;
my #array3 = grep /$regex/, #array2;
print "$_\n" for #array3;

I know you have an answer but here is a fun way I thought of.
So, I guess it is like an inverted index.
You take each set of synonyms and make them into an array. Then take each element of that array and put it into a hash as the keys with the value being a reference to the array.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my #array1 = qw(chose abstraction);
my #array2 = ("inspirer respirer",
"incapable",
"abstraction",
"abaxial",
"cause,chose,objet",
"ventral",
"chose,objet"
);
my #array;
push #array, map { /,|\s/ ? [split(/,|\s/, $_)]:[$_] } #array2;
my %construct;
while(my $array_ref = shift(#array)){
for(#{ $array_ref }){
push #{ $construct{$_} }, $array_ref;
}
}
my #array3 = map { s/,//; (#{ $construct{$_} }) } #array1;
print join(', ', #{ $_ }), "\n" for (#array3);
EDIT:
Missed apart of the answer before, this one should be complete.

Get all combinations from N arrays

I have array of arrays.
#a=([1,2,3],['b','r','g'],['L','X']);
And want to have this result:
#b=(
[1,'b','L'],[1,'b','X'],
[1,'r','L'],[1,'r','X'],
[1,'g','L'],[1,'g','X'],
[2,'b','L'],[2,'b','X'],
[2,'r','L'],[2,'r','X'],
[2,'g','L'],[2,'g','X'],
[3,'b','L'],[3,'b','X'],
[3,'r','L'],[3,'r','X'],
[3,'g','L'],[3,'g','X'],
)
My input array #a have from 2 to 6 nested arrays
Dont know how to find this function on cpan.

use Algorithm::Loops qw( NestedLoops );
my #b; NestedLoops(\#a, sub { push #b, [ #_ ] });

Why cpan?
use strict; use warnings;
use Data::Dumper;
my #a=([1,2,3],['b','r','g'],['L','X']);
my #b;
foreach my $i (#{$a[0]}){
foreach my $c (#{$a[1]}){
foreach my $k (#{$a[2]}){
push #b, [$i, $c, $k];
}
}
}
print Dumper(\#b);

Better way to sort an array with unusual value

I have this array:
#raw_stack = (
'900244~dfasdf~ddd3',
'900122~dfasdf~ddd1',
'900244~dfasdf~ddd2',
'900456~dfasdf~ddd4',
'900312~dfasdf~ddd3',
'900456~dfasdf~ddd5',
);
I'd like to sort it by the first '~' element.
Is there a more elegant way to solve this rather than
looping and splitting through each value?

Use Schwartzian transform:
my #raw_stack = (
'900244~dfasdf~ddd3',
'900122~dfasdf~ddd1',
'900244~dfasdf~ddd2',
'900456~dfasdf~ddd4',
'900312~dfasdf~ddd3',
'900456~dfasdf~ddd5',
);
my #sorted =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, (split/~/)[0]] } #raw_stack;
dump#sorted;
Benchmark:
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
use Benchmark qw(:all);
my $s = '~dfasdf~ddd3';
my #arr = ();
for(0..20000) {
push #arr, int(rand(100000)) . $s;
}
my $count = -3;
cmpthese($count, {
'ST' => sub {
my #sorted =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, (split/~/)[0]] } #arr;
},
'SORT' => sub {
my #sorted =
sort {
my ($a_0) = split /~/, $a;
my ($b_0) = split /~/, $b;
$a_0 <=> $b_0
} #arr;
},
});
result:
array of 200 elements:
Rate SORT ST
SORT 267/s -- -61%
ST 689/s 158% --
array of 2000 elements:
Rate SORT ST
SORT 18.0/s -- -71%
ST 61.5/s 242% --
array of 20000 elements:
Rate SORT ST
SORT 1.35/s -- -73%
ST 4.96/s 266% --

Sort and list slices?
sort { ( split( /~/, $a ) )[0] <=> ( split( /~/, $b ) )[0] } #raw_stack;

These might help. They show you how to extract parts of strings to use them to sort the larger strings:
Perlmonks How do I sort an array by (anything)?
perlfaq4 How do I sort an array by (anything)?
Stackoverflow Perl - Sort CSV on a certain column?

Is it always 6 digits? If so, the following would be the simplest and fastest:
my #sorted_stack = sort #raw_stack;
If not,
my #sorted_stack =
sort {
my ($a_0) = split /~/, $a;
my ($b_0) = split /~/, $b;
$a_0 <=> $b_0
} #raw_stack;
A Schwartzian transform might be cleaner if you're used to that, but it's actually slower in this case: [Update: Apparently, it's actually faster than my second solution for larger lists. It's never faster than the first, though ]
my #sorted_stack =
map $_->[0],
sort { $a->[1] <=> $b->[1] }
map [ $_, split /~/ ],
#raw_stack;