grepping command line arguments out of an array in perl - arrays

I have a file that looks like this:
[options42BuySide]
logged-check-times=06:01:00
logged-check-address=192.168.3.4
logged-check-reply=192.168.2.5
logged-check-vac-days=sat,sun
start-time=06:01:00
stop-time=19:00:00
falldown=logwrite after 10000
failtolog=logwrite after 10000
listento=all
global-search-text=Target Down. This message is stored;
[stock42BuySide]
logged-check-times=06:01:00
logged-check-address=192.168.2.13
logged-check-reply=192.168.2.54
logged-check-vac-days=sat,sun
start-time=06:01:00
stop-time=18:00:00
The script grinds the list down to just the name, start and stop time.
sellSide40, start-time=07:05:00, stop-time=17:59:00
SellSide42, start-time=07:06:00, stop-time=17:29:00
SellSide44, start-time=07:31:00, stop-time=16:55:00
42SellSide, start-time=09:01:00, stop-time=16:59:00
The problem is that I would like to filter out specific names from the file with comand line parameters.
I am trying to use the #ARGV array and grep the command line values out of the #nametimes array. Something like :
capser#capser$ ./get_start_stop SorosSellSide42 ETFBuySide42
The script works fine for parsing the file - I just need help on the command line array
#!/usr/bin/perl
use strict ;
use warnings ;
my ($name , $start, $stop, $specific);
my #nametimes;
my $inifile = "/var/log/Alert.ini";
open ( my $FILE, '<', "$inifile") or die ("could not open the file -- $!");
while(<$FILE>) {
chomp ;
if (/\[(\w+)\]/) {
$name = $1;
} elsif (/(start-time=\d+:\d+:\d+)/) {
$start = $1;
} elsif (/(stop-time=\d+:\d+:\d+)/) {
$stop = $1;
push (#nametimes, "$name, $start, $stop");
}
}
for ($a = 0; $a >= $#ARGV ; $a++) {
$specific = (grep /$ARGV[$a]/, #nametimes) ;
print "$specific\n";
}
It is probably pretty easy - however I have worked on it for days, and I am the only guy that uses perl in this shop. I don't have anyone to ask and the googlize is not panning out. I apologize in advance for angering the perl deities who are sure to yell at me for asking such and easy question.

Your construct for looping over #ARGV is a bit unwieldy - the more common way of doing that would be:
for my $name (#ARGV) {
#do something
}
But really, you don't even need to loop over it. You can just join them all directly into a single regular expression:
my $names = join("|", #ARGV);
my #matches = grep { /\b($names)\b/ } #nametimes;
I've used \b in the regex here - that indicates a word boundary, so the argument SellSide4 wouldn't match SellSide42. That may or may not be what you want...

Use an array to store the results from the grep(), not a scalar. Push them, not assign. Otherwise the second iteration of the for loop will overwrite results. Something like:
for my $el ( #ARGV ) {
push #specific, grep { /$el/ } #nametimes);
};
print join "\n", #specific;

The easiest thing to do is to store your INI file as a structure. Then, you can go through your structure and pull out what you want. The simplest structure would be a hash of hashes. Where your heading is the key to the outer hash, and the inner hash is keyed by the parameter:
Here's is creating the basic structure:
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
use constant INI_FILE => "test.file.txt";
open my $ini_fh, "<", INI_FILE;
my %ini_file;
my $heading;
while ( my $line = <$ini_fh> ) {
chomp $line;
if ( $line =~ /\[(.*)\]/ ) { #Headhing name
$heading = $1;
}
elsif ( $line =~ /(.+?)\s*=\s*(.+)/ ) {
my $parameter = $1;
my $value = $2;
$ini_file{$heading}->{$parameter} = $value;
}
else {
say "Invalid line $. - $line";
}
}
After this, the structure will look like this:
$VAR1 = {
'options42BuySide' => {
'stop-time' => '19:00:00',
'listento' => 'all',
'logged-check-reply' => '192.168.2.5',
'logged-check-vac-days' => 'sat,sun',
'falldown' => 'logwrite after 10000',
'start-time' => '06:01:00',
'logged-check-address' => '192.168.3.4',
'logged-check-times' => '06:01:00',
'failtolog' => 'logwrite after 10000',
'global-search-text' => 'Target Down. This message is stored;'
},
'stock42BuySide' => {
'stop-time' => '18:00:00',
'start-time' => '06:01:00',
'logged-check-reply' => '192.168.2.54',
'logged-check-address' => '192.168.2.13',
'logged-check-vac-days' => 'sat,sun',
'logged-check-times' => '06:01:00'
}
};
Now, all you have to do is parse your structure and pull the information you want out of it:
for my $heading ( sort keys %ini_file ) {
say "$heading " . $ini_file{$heading}->{"start-time"} . " " . $ini_file{$heading}->{"stop-time"};
}
You could easily modify this last loop to skip the headings you want, or to print out the exact parameters you want.
I would also recommend using Getopt::Long to parse your command line parameters:
my_file -include SorosSellSide42 -include ETFBuySide42 -param start-time -param stop-time
Getopt::Long could store your parameters in arrays. For example. It would put all the -include parameters in an #includes array and all the -param parameters in an #parameters array:
for my $heading ( #includes ) {
print "$heading ";
for my $parameter ( #parameters ) {
print "$ini_file{$heading}->{$parameter} . " ";
}
print "\n;
}
Of course, there needs to be lots of error checking (does the heading exist? What about the requested parameters?). But, this is the basic structure. Unless your file is extremely long, this is probably the easiest way to process it. If your file is extremely long, you could use the #includes and #parameters in the first loop as you read in the parameters and headings.

Related

How can I make an array of values after duplicate keys in a hash?

I have a question regarding duplicate keys in hashes.
Say my dataset looks something like this:
>Mammals
Cats
>Fish
Clownfish
>Birds
Parrots
>Mammals
Dogs
>Reptiles
Snakes
>Reptiles
Snakes
What I would like to get out of my script is a hash that looks like this:
$VAR1 = {
'Birds' => 'Parrots',
'Mammals' => 'Dogs', 'Cats',
'Fish' => 'Clownfish',
'Reptiles' => 'Snakes'
};
I found a possible answer here (https://www.perlmonks.org/?node_id=1116320). However I am not sure how to identify the values and the duplicates with the format of my dataset.
Here's the code that I have been using:
use Data::Dumper;
open($fh, "<", $file) || die "Could not open file $file $!/n";
while (<$fh>) {
chomp;
if($_ =~ /^>(.+)/){
$group = $1;
$animals{$group} = "";
next;
}
$animals{$group} .= $_;
push #{$group (keys %animals)}, $animals{$group};
}
print Dumper(\%animals);
When I execute it the push function does not seem to work as the output from this command is the same as when the command is absent (in the duplicate "Mammal" group, it will replace the cat with the dog instead of having both as arrays within the same group).
Any suggestions as to what I am doing wrong would be highly appreciated.
Thanks !
You're very close here. We can't get exactly the output you want from Data::Dumper because hashes can only have one value per key. The easiest way to fix that is to assign a reference to an array to the key and add things to it. But since you want to eliminate the duplicates as well, it's easier to build hashes as an intermediate representation then transform them to arrays:
use Data::Dumper;
my $file = "animals.txt";
open($fh, "<", $file) || die "Could not open file $file $!/n";
while (<$fh>) {
chomp;
if(/^>(.+)/){
$group = $1;
next;
}
$animals{$group} = {} unless exists $animals{$group};
$animals{$group}->{$_} = 1;
}
# Transform the hashes to arrays
foreach my $group (keys %animals) {
# Make the hash into an array of its keys
$animals{$group} = [ sort keys %{$animals{$group}} ];
# Throw away the array if we only have one thing
$animals{$group} = $animals{$group}->[0] if #{ $animals{$group} } == 1;
}
print Dumper(\%animals);
Result is
$VAR1 = {
'Reptiles' => 'Snakes',
'Fish' => 'Clownfish',
'Birds' => 'Parrots',
'Mammals' => [
'Cats',
'Dogs'
]
};
which is as close as you can get to what you had as your desired output.
For ease in processing the ingested data, it may actually be easier to not throw away the arrays in the one-element case so that every entry in the hash can be processed the same way (they're all references to arrays, no matter how many things are in them). Otherwise you've added a conditional to strip out the arrays, and you have to add another conditional test in your processing code to check
if (ref $item) {
# This is an anonymous array
} else {
# This is just a single entry
}
and it's easier to just have one path there instead of two, even if the else just wraps the single item into an array again. Leave them as arrays (delete the $animals{$group} = $animals{$group}->[0] line) and you'll be fine.
Given:
__DATA__
>Mammals
Cats
>Fish
Clownfish
>Birds
Parrots
>Mammals
Dogs
>Reptiles
Snakes
>Reptiles
Snakes
(at the end of the source code or a file with that content)
If you are willing to slurp the file, you can do something with a regex and a HoH like this:
use Data::Dumper;
use warnings;
use strict;
my %animals;
my $s;
while(<DATA>){
$s.=$_;
}
while($s=~/^>(.*)\R(.*)/mg){
++$animals{$1}{$2};
}
print Dumper(\%animals);
Prints:
$VAR1 = {
'Mammals' => {
'Cats' => 1,
'Dogs' => 1
},
'Birds' => {
'Parrots' => 1
},
'Fish' => {
'Clownfish' => 1
},
'Reptiles' => {
'Snakes' => 2
}
};
Which you can arrive to your format with this complete Perl program:
$s.=$_ while(<DATA>);
++$animals{$1}{$2} while($s=~/^>(.*)\R(.*)/mg);
while ((my $k, my $v) = each (%animals)) {
print "$k: ". join(", ", keys($v)) . "\n";
}
Prints:
Fish: Clownfish
Birds: Parrots
Mammals: Cats, Dogs
Reptiles: Snakes
(Know that the output order may be different than file order since Perl hashes do not maintain insertion order...)

Perl: Empty/broken AoH

I have subroutine in my module which checks (regular) user password age using regex search on shadow file:
Module.pm
my $pwdsetts_dump = "tmp/shadow_dump.txt";
system("cat /etc/shadow > $pwdsetts_dump");
open (my $fh1, "<", $pwdsetts_dump) or die "Could not open file '$pwdsetts_dump': $!";
sub CollectPWDSettings {
my #pwdsettings;
while (my $array = <$fh1>) {
if ($array =~ /^(\S+)[:][$]\S+[:](1[0-9]{4})/) {
my $pwdchange = "$2";
if ("$2" eq "0") {
$pwdchange = "Next login";
}
my %hash = (
"Username" => $1,
"Last change" => $pwdchange
);
push (#pwdsettings, \%hash);
}
}
my $current_date = int(time()/86400); # epoch
my $ndate = shift #_; # n-days
my $search_date = int($current_date - $ndate);
my #sorted = grep{$_->{'Last change'} > $search_date} #pwdsettings;
return \#sorted;
}
Script is divided in 2 steps:
1. load all password settings
2. search for password which is older than n-days
In my main script I use following script:
my ($user_changed_pwd);
if (grep{$_->{'Username'} eq $users_to_check} #{Module::CollectPWDSettings("100")}) {
$user_changed_pwd = "no";
}
else {
$user_changed_pwd = "yes";
}
Problem occurs in first step, AoH never gets populated. I'm also pretty sure that this subroutine always worked for me and strict and warnings never complained about it, nut now, for some reason it refuses to work.
I've just run your regex against my /etc/shadow and got no matches. If I drop the leading 1 I get a few hits.
E.g.:
$array =~ /^(\S+)[:][$]\S+[:]([0-9]{4})/
But personally - I would suggest not trying to regex, and instead rely on the fact that /etc/shadow is defined as delimited by :.
my #fields = split ( /:/, $array );
$1 contains a bunch of stuff, and I suspect what you actually want is the username - but because \S+ is greedy, you might be accidentally ending up with encrypted passwords.
Which will be $fields[0].
And then the 'last change' field - from man shadow is $fields[2].
I think your regex pattern is the main problem. Don't forget that \S matches any non-space character including colons :, and \S+ will try to match as much as possible so it will happily skip over multiple fields of the file
I think using split to separate each record into colon-delimited fields is a better approach. I also think that, instead of the array of two-element hashes #pwdsettings it would be better to store the data as a hash relating usernames to their password history
Here's how I would write this. It prints a list of all usernames whose password history is greater than 90 days
use strict;
use warnings;
use Time::Seconds 'ONE_DAY';
my #shadow = do {
open my $fh, '<', '/etc/shadow' or die qq{Unable to open "/etc/shadow" for input: $!};
<$fh>;
};
print "$_\n" for #{ collect_pwd_settings(90) };
sub collect_pwd_settings {
my ($ndate) = #_;
my %pwdsettings;
for ( #shadow ) {
my ($user, $pwdchange) = (split /:/)[0,2];
$pwdsettings{$user} = $pwdchange;
}
my $current_date = time / ONE_DAY;
my #filtered = grep { $current_date - $pwdsettings{$_} > $ndate } keys %pwdsettings;
return \#filtered;
}

More clarification about the usage of split in Perl

I have this following input file:
test.csv
done_cfg,,,,
port<0>,clk_in,subcktA,instA,
port<1>,,,,
I want to store the elements of each CSV column into an array, but I always get error when I try to fetch those "null" elements in the csv when I run the script. Here's my code:
# ... assuming file was correctly opened and stored into
# ... a variable named $map_in
my $counter = 0;
while($map_in){
chomp;
#hold_csv = split(',',$_);
$entry1[$counter] = $hold_csv[0];
$entry2[$counter] = $hold_csv[1];
$entry3[$counter] = $hold_csv[2];
$entry4[$counter] = $hold_csv[3];
$counter++;
}
print "$entry1[0]\n$entry2[0]\n$entry3[0]\n$entry3[0]"; #test printing
I always got use of uninitialized value error whenever i fetch empty CSV cells
Can you help me locate the error in my code ('cause I know I have somewhat missed something on my code)?
Thanks.
This looks like CSV. So the tool for the job is really Text::CSV.
I will also suggest - having 4 different arrays with numbered names says to me that you're probably wanting a multi-dimensional data structure in the first place.
So I'd be doing something like:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Text::CSV;
my $csv = Text::CSV->new( { binary => 1 } );
open( my $input, "<", "input.csv" ) or die $!;
my #results;
while ( my $row = $csv->getline($input) ) {
push ( #results, \#$row );
}
print join ( ",", #{$results[0]} ),"\n";
print Dumper \#results;
close($input);
If you really want separate arrays, I'd suggest naming them something different, but you could do it like this:
push ( #array1, $$row[0] ); #note - double $, because we dereference
I will note - there's an error in your code - I doubt:
while($map_in){
is doing what you think it is.
When you're assigning $entryN, define a default value:
$entry1[$counter] = $hold_csv[0] || '';
same for other #entry
I think there is a typo in while($map_in) { it should be while (#map_in) {.

Comparing two strings line by line in Perl

I am looking for code in Perl similar to
my #lines1 = split /\n/, $str1;
my #lines2 = split /\n/, $str2;
for (int $i=0; $i<lines1.length; $i++)
{
if (lines1[$i] ~= lines2[$i])
print "difference in line $i \n";
}
To compare two strings line by line and show the lines at which there is any difference.
I know what I have written is mixture of C/Perl/Pseudo-code. How do I write it in the way that it works on Perl?
What you have written is sort of ok, except you cannot use that notation in Perl lines1.length, int $i, and ~= is not an operator, you mean =~, but that is the wrong tool here. Also if must have a block { } after it.
What you want is simply $i < #lines1 to get the array size, my $i to declare a lexical variable, and eq for string comparison. Along with if ( ... ) { ... }.
Technically you can use the binding operator to perform a string comparison, for example:
"foo" =~ "foobar"
But it is not a good idea when comparing literal strings, because you can get partial matches, and you need to escape meta characters. Therefore it is easier to just use eq.
Using C-style for loops is valid, but the more Perl-ish way is to use this notation:
for my $i (0 .. $#lines1)
Which will iterate over the range 0 to the max index of the array.
Perl allows you to open filehandles on strings by using a reference to the scalar variable that holds the string:
open my $string1_fh, '<', \$string1 or die '...';
open my $string2_fh, '<', \$string2 or die '...';
while( my $line1 = <$string1_fh> ) {
my $line2 = <$string2_fh>;
....
}
But, depending on what you mean by difference (does that include insertion or deletion of lines?), you might want something different.
There are several modules on CPAN that you can inspect for ideas, such as Test::LongString or Algorithm::Diff.
my #lines1 = split(/^/, $str1);
my #lines2 = split(/^/, $str2);
# splits at start of line
# use /\n/ if you want to ignore newline and trailing spaces
for ($i=0; $i < #lines1; $i++) {
print "difference in line $i \n" if (lines1[$i] ne lines2[$i]);
}
Comparing Arrays is a way easier if you create a Hashmap out of it...
#Searching the difference
#isect = ();
#diff = ();
%count = ();
foreach $item ( #array1, #array2 ) { $count{$item}++; }
foreach $item ( keys %count ) {
if ( $count{$item} == 2 ) {
push #isect, $item;
}
else {
push #diff, $item;
}
}
#Output
print "Different= #diff\n\n";
print "\nA Array = #array1\n";
print "\nB Array = #array2\n";
print "\nIntersect Array = #isect\n";
Even after spliting you could compare them as Array.

Checking for Duplicates in array

What's going on:
I've ssh'd onto my localhost, ls the desktop and taken those items and put them into an array.
I hardcoded a short list of items and I am comparing them with a hash to see if anything is missing from the host (See if something from a is NOT in b, and let me know).
So after figuring that out, when I print out the "missing files" I get a bunch of duplicates (see below), not sure if that has to do with how the files are being checked in the loop, but I figured the best thing to do would be to just sort out the data and eliminate dupes.
When I do that, and print out the fixed data, only one file is printing, two are missing.
Any idea why?
#!/usr/bin/perl
my $hostname = $ARGV[0];
my #hostFiles = ("filecheck.pl", "hostscript.pl", "awesomeness.txt");
my #output =`ssh $hostname "cd Desktop; ls -a"`;
my %comparison;
for my $file (#hostFiles) {
$comparison{$file} +=1;
}
for my $file (#output) {
$comparison{$file} +=2
}
for my $file (sort keys %comparison) {
#missing = "$file\n" if $comparison{$file} ==1;
#print "Extra file: $file\n" if $comparison{$file} ==2;
print #missing;
}
my #checkedMissingFiles;
foreach my $var ( #missing ){
if ( ! grep( /$var/, #checkedMissingFiles) ){
push( #checkedMissingFiles, $var );
}
}
print "\n\nThe missing Files without dups:\n #checkedMissingFiles\n";
Password:
awesomeness.txt ##This is what is printing after comparing the two arrays
awesomeness.txt
filecheck.pl
filecheck.pl
filecheck.pl
hostscript.pl
hostscript.pl
The missing Files without dups: ##what prints after weeding out duplicates
hostscript.pl
The perl way of doing this would be:
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
my %hostFiles = qw( filecheck.pl 1 hostscript.pl 1 awesomeness.txt 1);
# ssh + backticks + ls, not the greatest way to do this, but that's another Q
my #files =`ssh $ARGV[0] "ls -a ~/Desktop"`;
# get rid of the newlines
chomp #files;
#grep returns the matching element of #files
my %existing = map { $_ => 1} grep {exists($hostFiles{$_})} #files;
print Dumper([grep { !exists($existing{$_})} keys %hostFiles]);
Data::Dumper is a utility module, I use it for debugging or demonstrative purposes.
If you want print the list you can do something like this:
{
use English;
local $OFS = "\n";
local $ORS = "\n";
print grep { !exists($existing{$_})} keys %hostFiles;
}
$ORS is the output record separator (it's printed after any print) and $OFS is the output field separator which is printed between the print arguments. See perlvar. You can get away with not using "English", but the variable names will look uglier. The block and the local are so you don't have to save and restore the values of the special variables.
If you want to write to a file the result something like this would do:
{
use English;
local $OFS = "\n";
local $ORS = "\n";
open F, ">host_$ARGV[0].log";
print F grep { !exists($existing{$_})} keys %hostFiles;
close F;
}
Of course, you can also do it the "classical" way, loop trough the array and print each element:
open F, ">host_$ARGV[0].log";
for my $missing_file (grep { !exists($existing{$_})} keys %hostFiles) {
use English;
local $ORS = "\n";
print F "File is missing: $missing_file"
}
close F;
This allows you to do more things with the file name, for example, you can SCP it over to the host.
It seems to me that looping over the 'required' list makes more sense - looping over the list of existing files isn't necessary unless you're looking for files that exist but aren't needed.
#!/usr/bin/perl
use strict;
use warnings;
my #hostFiles = ("filecheck.pl", "hostscript.pl", "awesomeness.txt");
my #output =`ssh $ARGV[0] "cd Desktop; ls -a"`;
chomp #output;
my #missingFiles;
foreach (#hostFiles) {
push( #missingFiles, $_ ) unless $_ ~~ #output;
}
print join("\n", "Missing files: ", #missingFiles);
#missing = "$file\n" assigns the array #missing to contain a single element, "$file\n". It does this every loop, leaving it with the last missing file.
What you want is push(#missing, "$file\n").

Resources