perl hash of arrays issue - arrays

I have a few lines in my array @lines. A line whose command is prefixed with * gives the start time of that command (such as sync or fetch), and the line with the same process ID (pid) and the same command without the * gives the end time. The two lines are not always adjacent. I would like to get the start and end date of a particular process ID and command. For example, for usera the command sync with process ID 11859 started at 2015/01/13 13:53:01.491-05:00 and ended at 2015/01/13 13:55:01.492-05:00.
Below is my approach, in which I used a hash of arrays keyed by process ID and split each line. This works only when the start and end lines of a command are adjacent. How can I make it work when they are not?
my %users;
foreach my $line (@lines) {
if ($line =~ m{(\*)+}) {
($stdate, $sttime, $pid, $user, $cmd) = split ' ', $line;
$startdate ="$stdate $sttime";
}
else {
($eddate, $edtime, $pid, $user, $cmd) = split ' ', $line;
$enddate = "$eddate $edtime";
}
$users{$pid} = [ $startdate, $enddate, $user, $cmd ];
}
Content in @lines:
2015/01/13 13:53:01.491-05:00 11859 usera *sync_cmd 7f1f9bfff700 10.101.17.111
2015/01/13 13:57:02.079-05:00 11863 userb *fetch_cmd 7f1f9bfff700 10.101.17.111
2015/01/13 13:59:02.079-05:00 11863 userb fetch_cmd 7f1f9bfff700 10.101.17.111
2015/01/13 13:55:01.492-05:00 11859 usera sync_cmd 7f1f9bfff700 10.101.17.111

I'm looking at your code and wondering why you're using a hash of arrays.
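As far as I'm concerned, the purpose of an array is to hold a set of similar but ordered values. Your fields (dates, user, command) are different kinds of data, so a hash with named keys fits better.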
Could you not instead do:
my %processes;
foreach (@lines) {
my ( $date, $time, $pid, $user, $cmd, @everything_else ) = split;
if ( $cmd =~ m/^\*/ ) {
#if command starts with a * - it started.
if ( defined $processes{$pid} ) {
print "WARNING: $pid reused\n";
}
$processes{$pid}{'start_date'} = $date;
$processes{$pid}{'start_time'} = $time;
$processes{$pid}{'user'} = $user;
$processes{$pid}{'cmd'} = $cmd;
}
else {
#cmd does not start with '*'.
if ( $processes{$pid}{'cmd'} =~ m/\Q$cmd\E/ ) {
#this works, because 'some_command' is a substring of '*some_command'.
$processes{$pid}{'end_date'} = $date;
$processes{$pid}{'end_time'} = $time;
}
else {
print
"WARNING: $pid has a command of $cmd, where it started with $processes{$pid}{'cmd'}\n";
}
}
}
You might want some additional validation, for example in case the log is long enough that pids get reused, or the log doesn't include both the start and the finish of a particular process.
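One way to handle the second case is to scan the finished hash for entries that never received an end record. This is a minimal sketch, assuming a hash shaped like %processes above; the helper name unfinished_pids and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the pids that have a start record but no end record.
sub unfinished_pids {
    my ($processes) = @_;
    return sort { $a <=> $b }
        grep { !defined $processes->{$_}{end_date} } keys %$processes;
}

# Made-up sample in the same shape as the %processes hash built above.
my %processes = (
    11859 => { start_date => '2015/01/13', cmd => '*sync_cmd', end_date => '2015/01/13' },
    11863 => { start_date => '2015/01/13', cmd => '*fetch_cmd' },
);

print "No end record for pid $_\n" for unfinished_pids( \%processes );
```

A check like this could run once after the main loop, so that incomplete pairs are reported instead of silently printed with empty end times.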

When you assign to $users{$pid} you are presuming that the most recent $startdate and $enddate are both relevant. This problem is exacerbated by the fact that the variables holding your field values have a scope larger than the foreach loop, allowing these values to bleed between records.
In the if block, you should assign the values of $startdate, $user and $cmd to the array, individually or as a slice if you like. In the else block you should assign $enddate to its element in the array.
Regex extra credit: You don't seem to really care if there is more than one * in a record, making the + in the regex superfluous. As an added bonus, without it the capturing group is also of no value. m{\*} should do quite nicely.
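The advice above can be sketched as follows. This is only one possible reading of it: the array layout [ start, end, user, cmd ] follows the question, the sample lines are the ones from the question, and stripping the leading * with s/^\*// is my own choice rather than something the answer prescribes.

```perl
use strict;
use warnings;

my @lines = (
    '2015/01/13 13:53:01.491-05:00 11859 usera *sync_cmd 7f1f9bfff700 10.101.17.111',
    '2015/01/13 13:57:02.079-05:00 11863 userb *fetch_cmd 7f1f9bfff700 10.101.17.111',
    '2015/01/13 13:59:02.079-05:00 11863 userb fetch_cmd 7f1f9bfff700 10.101.17.111',
    '2015/01/13 13:55:01.492-05:00 11859 usera sync_cmd 7f1f9bfff700 10.101.17.111',
);

# Array layout per pid: [ start, end, user, cmd ]
my %users;
for my $line (@lines) {
    # Lexical variables, so nothing carries over from the previous line.
    my ( $date, $time, $pid, $user, $cmd ) = split ' ', $line;
    if ( $cmd =~ s/^\*// ) {
        # Start line: fill start, user and cmd with a slice; leave end alone.
        @{ $users{$pid} }[ 0, 2, 3 ] = ( "$date $time", $user, $cmd );
    }
    else {
        # End line: only the end element changes.
        $users{$pid}[1] = "$date $time";
    }
}

for my $pid ( sort keys %users ) {
    my ( $start, $end, $user, $cmd ) = @{ $users{$pid} };
    print "$user $cmd ($pid): $start -> $end\n";
}
```

Because each branch touches only its own elements, the order in which start and end lines arrive no longer matters.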

Related

Perl: Empty/broken AoH

I have a subroutine in my module which checks (regular) user password age using a regex search on the shadow file:
Module.pm
my $pwdsetts_dump = "tmp/shadow_dump.txt";
system("cat /etc/shadow > $pwdsetts_dump");
open (my $fh1, "<", $pwdsetts_dump) or die "Could not open file '$pwdsetts_dump': $!";
sub CollectPWDSettings {
my @pwdsettings;
while (my $array = <$fh1>) {
if ($array =~ /^(\S+)[:][$]\S+[:](1[0-9]{4})/) {
my $pwdchange = "$2";
if ("$2" eq "0") {
$pwdchange = "Next login";
}
my %hash = (
"Username" => $1,
"Last change" => $pwdchange
);
push (@pwdsettings, \%hash);
}
}
my $current_date = int(time()/86400); # epoch
my $ndate = shift @_; # n-days
my $search_date = int($current_date - $ndate);
my @sorted = grep{$_->{'Last change'} > $search_date} @pwdsettings;
return \@sorted;
}
The subroutine works in two steps:
1. load all password settings
2. search for passwords that are older than n days
In my main script I use the following code:
my ($user_changed_pwd);
if (grep{$_->{'Username'} eq $users_to_check} @{Module::CollectPWDSettings("100")}) {
$user_changed_pwd = "no";
}
else {
$user_changed_pwd = "yes";
}
The problem occurs in the first step: the AoH never gets populated. I'm also pretty sure that this subroutine always worked for me, and strict and warnings never complained about it, but now, for some reason, it refuses to work.
I've just run your regex against my /etc/shadow and got no matches. If I drop the leading 1 I get a few hits.
E.g.:
$array =~ /^(\S+)[:][$]\S+[:]([0-9]{4})/
But personally, I would suggest not using a regex at all, and instead relying on the fact that /etc/shadow is defined as delimited by :.
my @fields = split ( /:/, $array );
$1 contains a bunch of stuff, and I suspect what you actually want is the username. But because \S+ is greedy, you might accidentally be ending up with encrypted passwords as well.
With split, the username will be $fields[0].
And the 'last change' field, according to man shadow, is $fields[2].
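As a small illustration of the split approach, here is a sketch on a single record. The record is made up (the password field is not a real hash); only the field positions follow man shadow.

```perl
use strict;
use warnings;

# Made-up record in /etc/shadow format; the password field is not a real hash.
my $record = 'alice:$6$saltsalt$notarealhash:16450:0:99999:7:::';

# Field 0 is the login name, field 2 the day of the last password change
# (counted in days since 1970-01-01).
my ( $user, $last_change ) = ( split /:/, $record )[ 0, 2 ];

print "$user last changed the password on day $last_change\n";
```

The list slice [ 0, 2 ] picks the two fields directly, so the greedy-match problem with \S+ cannot occur.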
I think your regex pattern is the main problem. Don't forget that \S matches any non-space character, including colons, and \S+ will try to match as much as possible, so it will happily skip over multiple fields of the file.
I think using split to separate each record into colon-delimited fields is a better approach. I also think that, instead of the array of two-element hashes @pwdsettings, it would be better to store the data as a hash relating usernames to their password history.
Here's how I would write this. It prints a list of all usernames whose password was last changed more than 90 days ago:
use strict;
use warnings;
use Time::Seconds 'ONE_DAY';
my @shadow = do {
    open my $fh, '<', '/etc/shadow' or die qq{Unable to open "/etc/shadow" for input: $!};
    <$fh>;
};
print "$_\n" for @{ collect_pwd_settings(90) };
sub collect_pwd_settings {
    my ($ndate) = @_;
    my %pwdsettings;
    for ( @shadow ) {
        my ($user, $pwdchange) = (split /:/)[0,2];
        $pwdsettings{$user} = $pwdchange;
    }
    my $current_date = time / ONE_DAY;
    my @filtered = grep { $current_date - $pwdsettings{$_} > $ndate } keys %pwdsettings;
    return \@filtered;
}

Empty array in a perl while loop, should have input

Was working on this script when I came across a weird anomaly. When I go to print @extract after declaring it, it prints correctly the following:
------MMMMMMMMMMMMMMMMMMMMMMMMMM-M-MMMMMMMM
------SSSSSSSSSSSSSSSSSSSSSSSSSS-S-SSSSSDTA
------TIIIIIIIIIIIIITIIIVVIIIIII-I-IIIIITTT
Now the weird part: when I then try to print or return @extract (or $column) inside of the while loop, it comes up empty, thus rendering the rest of the script useless. I've never come across this before, and I haven't been able to find any documentation or people with similar problems. Below is the code. I marked with #<------ where the problems are and are not, to see if anyone has any idea what is going on. Thank you kindly.
P.S. I am utilizing perl version 5.12.2
use strict;
use warnings;
#use diagnostics;
#use feature qw(say);
open (S, "Val nuc align.txt") || die "cannot open FASTA file to read: $!";
open (OUTPUT, ">output.txt");
my @extract;
my $sum = 0;
my @lines = <S>;
my @seq = ();
my $start = 0; #amino acid column start
my $end = 10; #amino acid column end
#Removing of the sequence tag until amino acid sequence composition (from >gi to )).
foreach my $line (@lines) {
    $line =~ s/\n//g;
    if ($line =~ />/g) {
        $line =~ s/>.*\]/>/g;
        push @seq, $line;
    }
    else {
        push @seq, $line;
    }
}
my $seq = join ('', @seq);
my @seq_prot = join "\n", split '>', $seq;
@seq_prot = grep {/[A-Z]/} @seq_prot;
#number of sequences
print OUTPUT "Number of sequences:", scalar (grep {defined} @seq_prot), "\n";
#selection of amino acid sequence. From $start to $end.
my @vertical_array;
while ( my $line = <@seq_prot> ) {
    chomp $line;
    my @split_line = split //, $line;
    for my $index ( $start..$end ) { #AA position, extracts whole columns
        $vertical_array[$index] .= $split_line[$index];
    }
}
# Print out your vertical lines
for my $line ( @vertical_array ) {
    my $extract = say OUTPUT for unpack "(a200)*", $line; #split at end of each column
    @extract = grep {defined} $extract;
}
print OUTPUT @extract; #<--------------- This prints correctly the input
#Count selected amino acids excluding '-'.
my %counter;
while (my $column = @extract) {
    print @extract; #<------------------------ Empty print, no input found
}
Update: Found the main problem to be with the unpack command, I thought I could utilize it to split my columns of my input at X elements (43 in this case). While this works, the minute I change $start to another number that is not 0 (say 200), the code brings up errors. Probably has something to do with the number of column elements does not match the lines. Will keep updated.
Write your last while loop the same way as your previous for loop. The assignment
my $column = @extract
is in scalar context, which does not give you the same result as:
for my $column (@extract)
Instead, it gives you the number of elements in the array. Try the second form and it should work.
However, I still have a concern: if @extract had anything in it, the while loop would never end, because its condition would stay true. Is there any code between your two commented lines that you did not include?
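The difference between the two forms can be seen in a short sketch. The array contents are made up; the variable names follow the question.

```perl
use strict;
use warnings;

my @extract = ( 'MMM', 'SSS', 'TII' );    # made-up sample values

# Scalar assignment: the array is evaluated in scalar context and yields its length.
my $count = @extract;
print "count: $count\n";

# List iteration: each element is visited exactly once, so the loop terminates.
my @seen;
for my $column (@extract) {
    push @seen, $column;
}
print "columns: @seen\n";
```

With an empty array the scalar assignment yields 0, which is false, so a while loop written that way never runs its body at all.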

Perl forkmanager dropping array value

So I am having an issue with the array @cpuAll losing its value after $pm->finish. This is just SSHing to a bunch of servers and bringing back some stats, which works fine. But the array won't print after the last loop is done. I don't want to write everything to files because I get a 90% performance increase from just loading it into the array.
my @cpuAll = ();
my @memAll = ();
$pm->run_on_finish(sub{
    my ($pid,$exit_code,$ident,$exit_signal,$core_dump,$data)=@_;
    push(@data,$data);
});
for(@servers)
{
    next if $_ =~ "10.1.4.52";
    next if $_ =~ "10.1.4.106";
    my $pid = $pm->start and next;
    chomp;
    my @output_cpu = `/usr/bin/ssh $_ \"/root/scripts/punkbuster.cpu|sed 's/ (//g'|sed 's/)//g'|sed s'/ //g'\"`;
    for(@output_cpu)
    {
        chomp;
        my ($server,$username,$cpu,$process)=(split /:/, $_)[0,1,2,3];
        # push(@cpuAll,"$server\,$username\,$cpu\,$process\,$date\,$time\n");
    }
    $pm->finish(0, [$server,$username,$cpu,$process]);
}
print $_ for @data;
print "OK\n";
$pm->wait_all_children;
I have run into similar issues in the past, and I believe you'll find a solution in the Parallel::ForkManager documentation on data structure retrieval. You need to pass the data to finish, like $pm->finish(0, \@cpuAll), and then use a callback in $pm->run_on_finish to loop over your array and print whatever you need. The documentation shows a code example which should make it very clear how to retrieve the data. Let me know if not and I'll add more to my answer.
Use Net::OpenSSH::Parallel!
my $pssh = Net::OpenSSH::Parallel->new;
for my $server (@servers) {
    $pssh->add_host($server);
}
# '*' selects every host added above
$pssh->push('*', cmd => { stdout_file => "%LABEL%.out" },
    "/root/scripts/punkbuster.cpu|sed 's/ (//g'|sed 's/)//g'|sed s'/ //g'");
$pssh->run;
my @cpuAll;
for my $server (@servers) {
    if (open my $fh, '<', "$server.out") {
        # keep the first four colon-separated fields of every output line;
        # $date and $time are assumed to be set elsewhere in the script
        while (my $line = <$fh>) {
            chomp $line;
            push @cpuAll, join ',', (split /:/, $line)[0..3], $date, $time;
        }
    }
    else {
        warn "unable to retrieve data for $server\n";
    }
}
print "$_\n" for @cpuAll;
I would also replace the sed substitutions by some local post-processing done in perl.

grepping command line arguments out of an array in perl

I have a file that looks like this:
[options42BuySide]
logged-check-times=06:01:00
logged-check-address=192.168.3.4
logged-check-reply=192.168.2.5
logged-check-vac-days=sat,sun
start-time=06:01:00
stop-time=19:00:00
falldown=logwrite after 10000
failtolog=logwrite after 10000
listento=all
global-search-text=Target Down. This message is stored;
[stock42BuySide]
logged-check-times=06:01:00
logged-check-address=192.168.2.13
logged-check-reply=192.168.2.54
logged-check-vac-days=sat,sun
start-time=06:01:00
stop-time=18:00:00
The script grinds the list down to just the name, start and stop time.
sellSide40, start-time=07:05:00, stop-time=17:59:00
SellSide42, start-time=07:06:00, stop-time=17:29:00
SellSide44, start-time=07:31:00, stop-time=16:55:00
42SellSide, start-time=09:01:00, stop-time=16:59:00
The problem is that I would like to filter specific names out of the file using command line parameters.
I am trying to use the @ARGV array and grep the command line values out of the @nametimes array. Something like:
capser@capser$ ./get_start_stop SorosSellSide42 ETFBuySide42
The script works fine for parsing the file - I just need help on the command line array
#!/usr/bin/perl
use strict ;
use warnings ;
my ($name , $start, $stop, $specific);
my @nametimes;
my $inifile = "/var/log/Alert.ini";
open ( my $FILE, '<', "$inifile") or die ("could not open the file -- $!");
while(<$FILE>) {
chomp ;
if (/\[(\w+)\]/) {
$name = $1;
} elsif (/(start-time=\d+:\d+:\d+)/) {
$start = $1;
} elsif (/(stop-time=\d+:\d+:\d+)/) {
$stop = $1;
push (@nametimes, "$name, $start, $stop");
}
}
for ($a = 0; $a >= $#ARGV ; $a++) {
$specific = (grep /$ARGV[$a]/, @nametimes) ;
print "$specific\n";
}
It is probably pretty easy. However, I have worked on it for days, and I am the only guy who uses Perl in this shop. I don't have anyone to ask and the googlize is not panning out. I apologize in advance for angering the Perl deities, who are sure to yell at me for asking such an easy question.
Your construct for looping over @ARGV is a bit unwieldy. The more common way of doing that would be:
for my $name (@ARGV) {
    #do something
}
But really, you don't even need to loop over it. You can just join them all directly into a single regular expression:
my $names = join("|", @ARGV);
my @matches = grep { /\b($names)\b/ } @nametimes;
I've used \b in the regex here. That indicates a word boundary, so the argument SellSide4 wouldn't match SellSide42. That may or may not be what you want...
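A slightly more defensive variant of the joined regex is sketched below. It escapes each argument with quotemeta so that characters such as . or + in a name are matched literally; that escaping, the helper name filter_by_names, and the sample entries are my additions, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the entries that mention one of the given names.
sub filter_by_names {
    my ( $names, $entries ) = @_;
    my $alternation = join '|', map { quotemeta($_) } @$names;
    return grep { /\b(?:$alternation)\b/ } @$entries;
}

# Made-up entries in the "name, start, stop" format from the question.
my @nametimes = (
    'sellSide40, start-time=07:05:00, stop-time=17:59:00',
    'SellSide42, start-time=07:06:00, stop-time=17:29:00',
    'SellSide44, start-time=07:31:00, stop-time=16:55:00',
);

print "$_\n" for filter_by_names( [ 'SellSide42', 'SellSide4' ], \@nametimes );
```

In the real script the first argument would be \@ARGV. Because of the word boundaries, the partial name SellSide4 selects nothing here.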
Use an array to store the results from the grep(), not a scalar, and push the results onto it rather than assigning them. Otherwise the second iteration of the for loop will overwrite the results of the first. Something like:
my @specific;
for my $el ( @ARGV ) {
    push @specific, grep { /$el/ } @nametimes;
}
print join "\n", @specific;
The easiest thing to do is to store your INI file as a structure. Then you can go through the structure and pull out what you want. The simplest structure would be a hash of hashes, where the heading is the key to the outer hash and the inner hash is keyed by the parameter.
Here is how to create the basic structure:
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
use constant INI_FILE => "test.file.txt";
open my $ini_fh, "<", INI_FILE;
my %ini_file;
my $heading;
while ( my $line = <$ini_fh> ) {
chomp $line;
if ( $line =~ /\[(.*)\]/ ) { # Heading name
$heading = $1;
}
elsif ( $line =~ /(.+?)\s*=\s*(.+)/ ) {
my $parameter = $1;
my $value = $2;
$ini_file{$heading}->{$parameter} = $value;
}
else {
say "Invalid line $. - $line";
}
}
After this, the structure will look like this:
$VAR1 = {
'options42BuySide' => {
'stop-time' => '19:00:00',
'listento' => 'all',
'logged-check-reply' => '192.168.2.5',
'logged-check-vac-days' => 'sat,sun',
'falldown' => 'logwrite after 10000',
'start-time' => '06:01:00',
'logged-check-address' => '192.168.3.4',
'logged-check-times' => '06:01:00',
'failtolog' => 'logwrite after 10000',
'global-search-text' => 'Target Down. This message is stored;'
},
'stock42BuySide' => {
'stop-time' => '18:00:00',
'start-time' => '06:01:00',
'logged-check-reply' => '192.168.2.54',
'logged-check-address' => '192.168.2.13',
'logged-check-vac-days' => 'sat,sun',
'logged-check-times' => '06:01:00'
}
};
Now, all you have to do is parse your structure and pull the information you want out of it:
for my $heading ( sort keys %ini_file ) {
say "$heading " . $ini_file{$heading}->{"start-time"} . " " . $ini_file{$heading}->{"stop-time"};
}
You could easily modify this last loop to skip the headings you want, or to print out the exact parameters you want.
I would also recommend using Getopt::Long to parse your command line parameters:
my_file -include SorosSellSide42 -include ETFBuySide42 -param start-time -param stop-time
Getopt::Long could store your parameters in arrays. For example, it would put all the -include parameters in an @includes array and all the -param parameters in a @parameters array:
for my $heading ( @includes ) {
    print "$heading ";
    for my $parameter ( @parameters ) {
        print $ini_file{$heading}->{$parameter} . " ";
    }
    print "\n";
}
Of course, there needs to be lots of error checking (does the heading exist? What about the requested parameters?). But this is the basic structure. Unless your file is extremely long, this is probably the easiest way to process it. If your file is extremely long, you could use the @includes and @parameters in the first loop as you read in the parameters and headings.
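For completeness, here is a minimal sketch of how the two arrays could be filled with Getopt::Long, including the heading check mentioned above. The option names follow the command line shown earlier; the helper parse_args, the '(not set)' placeholder and the sample %ini_file contents are assumptions made for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse -include/-param options from a list of arguments.
sub parse_args {
    my @args = @_;
    my ( @includes, @parameters );
    GetOptionsFromArray(
        \@args,
        'include=s' => \@includes,      # repeated options are pushed onto the array
        'param=s'   => \@parameters,
    ) or die "Usage: $0 -include NAME ... -param KEY ...\n";
    return ( \@includes, \@parameters );
}

# In a real script this would be parse_args(@ARGV).
my ( $includes, $parameters ) = parse_args(
    qw(-include SorosSellSide42 -include ETFBuySide42 -param start-time -param stop-time)
);

# Made-up data in the shape of the %ini_file hash built above.
my %ini_file = (
    SorosSellSide42 => { 'start-time' => '07:06:00', 'stop-time' => '17:29:00' },
);

for my $heading (@$includes) {
    if ( !exists $ini_file{$heading} ) {
        warn "No section named $heading\n";
        next;
    }
    my @values = map { $ini_file{$heading}{$_} // '(not set)' } @$parameters;
    print "$heading @values\n";
}
```

GetOptionsFromArray is used instead of GetOptions only so that the sketch does not depend on the real command line.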

Mismatch in array comparison error message ' Argument "" isn't numeric in array element at'

I am very new to perl and am struggling to get this script to work.
I have taken pieces of Perl and gotten them to work as individual sections, but upon trying to blend them together it fails. Even with the error messages that show up, I cannot find where my mistake is.
The script, when working and completed, will read an output file, go through it section by section, and ultimately generate a new output file with not much more than a heading, some additional text, and the number of lines in that section.
My issue is that when it loops over each keyword in the array, it now fails with the error message 'Argument "" isn't numeric in array element at'. Perl directs me to a section in the script, but I cannot see how I am calling the element incorrectly. All the elements in the array are alphabetic, yet the error message refers to a numeric value.
Can anyone see my mistake?
Thank you
Here is the script
#!/usr/bin/perl -w
use strict;
use warnings;
use diagnostics;
# this version reads each variable and loops through the 18 times put only displays on per loop.
my $NODE = `uname -n`;
my $a = "/tmp/";
my $b = $NODE ;
my $c = "_deco.txt";
my $d = "_deco_mini.txt";
chomp $b;
my $STRING = "$a$b$c";
my $STRING_out = "$a$b$d";
my @keyword = ( "Report", "Last", "HP", "sulog", "sudo", "eTrust", "proftp", "process", "active clusters", "pdos", "syslog", "BNY", "syslogmon", "errpt", "ports", "crontab", "NFS", "scripts", "messages");
my $i = 0;
my $keyword="";
my $x=0;
my $y=0;
my $jw="";
my $EOS = "########################################################################";
my $qty_lines=0;
my $skip5=0;
my $skipcnt=0;
my $keeplines=0;
my @HPLOG="";
do {
print "Reading File: [$STRING]\n";
if (-e "$STRING" && open (IN, "$STRING")) {
# ++$x; # proving my loop worked
# print "$x interal loop counter\n"; # proving my loop worked
for ( ++$i) { # working
while ( <IN> ) {
chomp ;
#if ($_ =~ /$keyword/) {
#if ($_ =~ / $i /) {
#if ($_ =~ /$keyword[ $i ]/) {
if ($_ =~ /$keyword $i/) {
print " $i \n";
$skip5=1;
next;
# print "$_\n";# $ not initalized error when tring to use it
}
if ($skip5) {
$skipcnt++;
print "SKIP LINE: $_\n";
print "Header LINE: $_\n";
next if $skipcnt <= 5;
$skip5=0;
$keeplines=1;
}
if ($keeplines) {
# ++$qty_lines; # for final output
last if $_ =~ /$EOS/;
print "KEEP LINE: $_\n";
# print "$qty_lines\n"; # for final output
push @HPLOG, "$_\n";
# push @HPLOG, "$qty_lines\n";# for final output
}
} ## end while ( <IN> )
} ## end for ( ++$i)
} ## end if (-e "$STRING" && open (IN, "$STRING"))
close (IN);
} while ( $i < 19 && ++$y < 18 );
Here is a sample section of the input file.
###############################################################################
Checking for active clusters.
#########
root 11730980 12189848 0 11:24:20 pts/2 0:00 egrep hagsd|harnad|HACMP|haemd
If there are any processes listed you need to remove the server from the cluster.
############################################################################
This is the output from Pdos log
Please review it for anything that looks like a users may be trying to run something.
#########
This server is not on Tamos
############################################################################
This is the output from syslog.conf.
Look for any entries on the right side column that are not the ususal logs or location.
#########
# @(#)34 1.11 src/bos/etc/syslog/syslog.conf, cmdnet, bos610 4/27/04 14:47:53
# IBM_PROLOG_BEGIN_TAG
# This is an automatically generated prolog.
#
# bos610 src/bos/etc/syslog/syslog.conf 1.11
I truncated the rest of the file
Can anyone see my mistake.
I can see quite a lot of mistakes. But I also see some good stuff like use strict and use warnings.
My suggestion for you is to work on your coding style so that it gets easier for you and others to debug any problems.
Naming variables
my $NODE = `uname -n`;
my $a = "/tmp/";
my $b = $NODE ;
my $c = "_deco.txt";
my $d = "_deco_mini.txt";
chomp $b;
my $STRING = "$a$b$c";
my $STRING_out = "$a$b$d";
Why are some of those names all uppercase and others all lower case? If you are building up a filename, why do you call the variable that holds the filename $STRING?
my @keyword = ( "Report", "Last", "HP", "sulog", "sudo", ....
If you have a list of several keywords, wouldn't it be apt not to choose a singular for the variable name? How about @keywords?
Using temporary variables you don't need
my $NODE = `uname -n`;
my $a = "/tmp/";
my $b = $NODE ;
my $c = "_deco.txt";
chomp $b;
my $STRING = "$a$b$c";
Why do you need $a, $b and $c? The (forgive me) stupid names of those vars are a tell-tale sign that you don't need them. How about this instead?
my $node_name = `uname -n`;
chomp $node_name;
my $file_name = sprintf '/tmp/%s_deco.txt', $node_name;
Your biggest problem: you have no idea how to use arrays
You are making several drastic mistakes when it comes to arrays.
my @HPLOG="";
Do you want an array or another string? The @ says array, the "" says string. I guess you wanted a new, empty array, so my @hplog = () would have been much better. But since there is no need to tell perl that you want an empty array, as it will give you an empty one anyway, my @hplog; will do the job just fine.
It took me a while to figure out this next one and I'm still not sure whether I'm guessing your intentions correctly:
my @keyword = ( "Report", "Last", "HP", "sulog", "sudo", "eTrust", "proftp", "process", "active clusters", "pdos", "syslog", "BNY", "syslogmon", "errpt", "ports", "crontab", "NFS", "scripts", "messages");
...
if ($_ =~ /$keyword $i/) {
What I think you are doing here is trying to match your current input line against element number $i in @keywords. If my assumption is correct, you really wanted to say this:
if ( /$keyword[ $i ]/ ) {
Iterating arrays
Perl is not C. It doesn't make you jump through hoops to get a loop.
Just look at all the code you wrote to loop through your keywords:
my $i = 0;
...
for ( ++$i) { # working
...
if ($_ =~ /$keyword $i/) {
...
} while ( $i < 19 && ++$y < 18 );
Apart from the facts that your working comment is just self-deception and that you hard-coded the number of elements in your array, you could have just used a for-each loop:
foreach my $keyword ( @keywords ) {
# more code here
}
I'm sure that when you try to work on the above list, the problem that made you ask here will just go away. Have fun.
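To make the for-each suggestion concrete, here is a small self-contained sketch that counts how many lines mention each keyword. The keyword list is shortened and the input lines are made up to resemble the sample file; the real script would read them from the file instead.

```perl
use strict;
use warnings;

my @keywords = ( 'active clusters', 'Pdos', 'syslog' );    # shortened, illustrative list

# Made-up lines resembling the section headers in the sample input.
my @lines = (
    'Checking for active clusters.',
    'This is the output from Pdos log',
    'This is the output from syslog.conf.',
    'This server is not on Tamos',
);

# Count how many lines mention each keyword; grep in scalar context returns a count.
my %hits;
for my $keyword (@keywords) {
    $hits{$keyword} = grep { /\Q$keyword\E/ } @lines;
}

print "$_: $hits{$_}\n" for @keywords;
```

No index variable and no hard-coded element count are needed, so adding or removing a keyword only means editing the list.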
