I am working on writing a script that identifies login attempts that are 5 seconds or less apart, searching for brute force login attempts. So far I have been able to take the log timestamps and convert them to a readable and workable format, by using the script below:
#!/usr/bin/perl
use warnings;
use strict;
open my $IN, '<', 'test.txt' or die $!; # Open the file.
while (<$IN>) { # Process it line by line.
my $timestamp = (split)[1]; # Get the second column.
$timestamp =~ tr/://d; # Remove colons.
print "$timestamp\n";
}
The output I get looks like
102432
102434
104240
etc.
What I want to do is compare the numbers in the array to see if there is a five-second delay or less between login attempts. Something like:
if ($timestamp + 5 <= 2nd element in array) {
print "ahhh brute force"
}
The same thing all the way down the array elements until the end.
if (2nd element in array + 5 <= 3rd element in array) {
print "ahh brute force"
}
etc.
Could someone please point me in the right direction?
Example of input:
2014-08-10 13:20:30 GET Portal/jsjquery-latest.js 404 - "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
This will do as you ask. It uses Time::Piece, which has been a core module since version 10 of Perl 5, and so shouldn't need installing.
It uses both the date and the time fields from the log file to build Time::Piece objects, which can then be subtracted from one another to calculate the intervals.
The program expects the path to the log file as a parameter on the command line.
use strict;
use warnings;
use 5.010;
use Time::Piece;
my $last_login;
while (<>) {
    my @login = split;
    my $login = Time::Piece->strptime("@login[0,1]", '%Y-%m-%d %H:%M:%S');
    if ($last_login) {
        my $interval = $login - $last_login;
        if ($interval <= 5) {
            printf "%s to %s is %d seconds\n", $last_login, $login, $interval;
        }
    }
    $last_login = $login;
}
Update
As @knarf says in a comment, this can be done using a regular expression together with the Time::Local module's timelocal function.
This is a program that does something similar using that technique.
use strict;
use warnings;
use Time::Local 'timelocal';
my $last_login;
while (<>) {
    next unless my @login = / (\d\d\d\d)-(\d\d)-(\d\d) \s+ (\d\d):(\d\d):(\d\d) /x;
    $login[0] -= 1900;
    $login[1] -= 1;
    my $login = timelocal reverse @login;
    if ($last_login) {
        my $interval = $login - $last_login;
        if ($interval <= 5) {
            printf "%s to %s is %d seconds\n", map(scalar localtime $_, $last_login, $login), $interval;
        }
    }
    $last_login = $login;
}
I have an array which contains a set of unique elements: my_array = [aab, abc, def, fgh]
I have a file containing these elements (possibly repeated).
I want to count how many times each unique element occurs; if an element is not repeated, its count is 1.
example of file :
i want to have aab but no i dont want abc
i want to have aab but no i dont want def
output should be
aab - 2
abc - 1
def - 1
I tried to search first and print it, but it's not working.
use strict;
use warnings;
my @my_array;
@my_array =("abc", "aab", "def");
open (my $file, '<', 'filename.txt') or die;
my $value;
foreach $value (@my_array) {
    while(<$file>) {
        if ($_ =~ /$value/){
            print "found : $value\n";
        }
    }
}
Also tried 2nd method
use strict;
use warnings;
my @my_array;
@my_array =("abc", "aab", "def");
open (my $file, '<', 'filename.txt') or die;
while (<$file>) {
    my $k=0;
    if ($_ =~ /$my_array[$k]/) {
        print "$my_array[$k]";
    }
}
The sample input data does not specify whether lookup words can repeat within a line.
The following demo code assumes that lookup words do not repeat within a line.
If that assumption does not hold, then each line should be split into tokens and each token inspected to get a correct count of lookup words.
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my (%count, @lookup);
@lookup = ('abc', 'aab', 'def');
while( my $line = <DATA> ) {
    for ( @lookup ) {
        $count{$_}++ if $line =~ /\b$_\b/;
    }
}
say Dumper(\%count);
exit 0;
__DATA__
i want to have aab but no i dont want abc
i want to have aab but no i dont want def
Output
$VAR1 = {
'aab' => 2,
'abc' => 1,
'def' => 1
};
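If lookup words can repeat within a line, the token-based counting mentioned above might look roughly like this. It is only a sketch: the count_tokens helper and the in-memory sample lines are made up for illustration, and it assumes tokens are separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: count every occurrence of each lookup word,
# including repeats within a single line.
sub count_tokens {
    my ($lookup, $lines) = @_;
    my %wanted = map { $_ => 1 } @$lookup;   # set of words we care about
    my %count;
    for my $line (@$lines) {
        # Split the line into whitespace-separated tokens and inspect each one.
        for my $token (split ' ', $line) {
            $count{$token}++ if $wanted{$token};
        }
    }
    return \%count;
}

# Made-up sample data; the second line repeats a lookup word.
my @lines = (
    "i want to have aab but no i dont want abc\n",
    "aab aab def\n",
);
my $count = count_tokens([qw(abc aab def)], \@lines);
printf "%s - %d\n", $_, $count->{$_} for sort keys %$count;
```

Because each token is compared as a whole string, a word like "aabc" would not be counted as "aab".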
I'm a fan of the Algorithm::AhoCorasick::XS module for performing efficient searches for multiple strings at once. An example:
#!/usr/bin/env perl
use warnings;
use strict;
use Algorithm::AhoCorasick::XS;
my @words = qw/abc aab def/;
my $aho = Algorithm::AhoCorasick::XS->new(\@words);
my %counts;
while (my $line = <DATA>) {
    $counts{$_}++ for $aho->matches($line);
}
for my $word (@words) {
    printf "%s - %d\n", $word, $counts{$word}//1;
}
__DATA__
i want to have aab but no i dont want abc
i want to have aab but no i dont want def
outputs
abc - 1
aab - 2
def - 1
The $counts{$word}//1 bit in the output will give you a 1 if that word doesn't exist in the hash because it wasn't encountered in the text.
You can build an alternation pattern from the keywords and so match all that are on the line in one regex run, then populate a frequency hash with the matches.
use warnings;
use strict;
use feature 'say';
use Data::Dumper;
my @keywords = qw(aab abc def fgh);
my $re_w = join '|', @keywords;
my %freq;
while (<>) {
    ++$freq{$_} for /($re_w)/g
}
say Dumper \%freq;
The <> operator reads line by line the files with names given on the command line, so the program is used as prog.pl file. (Or open the file "manually" in the program.)
The for loop imposes list context on its expression, so that regex returns the list of matches (captures), as the match operator does in the list context, and the ++$freq{$_} expression works with them one at a time.
The code counts all instances of keywords that repeat on a line. If that's not desired please clarify (can add a call to List::Util::uniq before feeding the list of matches to the for loop).
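A rough sketch of that uniq variant, in case it helps. The count_once_per_line helper and the sample lines are invented for the example, the keywords are passed through quotemeta as an extra precaution, and it assumes a List::Util recent enough to provide uniq (1.45 or later, as far as I know).

```perl
use strict;
use warnings;
use List::Util 'uniq';   # assumes List::Util 1.45 or newer

# Hypothetical helper: count each keyword at most once per line.
sub count_once_per_line {
    my ($keywords, $lines) = @_;
    # Escape the keywords so they are matched literally.
    my $re_w = join '|', map { quotemeta($_) } @$keywords;
    my %freq;
    for my $line (@$lines) {
        # The match returns every hit on the line; uniq drops the repeats.
        ++$freq{$_} for uniq($line =~ /($re_w)/g);
    }
    return \%freq;
}

# Made-up sample data; "aab" appears twice on the first line.
my $freq = count_once_per_line(
    [qw(aab abc def fgh)],
    ["aab aab abc\n", "aab def\n"],
);
printf "%s - %d\n", $_, $freq->{$_} for sort keys %$freq;
```

Here "aab" ends up with a count of 2 (one per line) rather than 3.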
There are a number of other details that may need closer attention.
One example: if there are overlapping keywords, which one takes precedence? For instance, with keywords the and there, once the word there is encountered in the text should it be matched by there or by the? If it is there then keywords in the alternation pattern should be ordered from longest to shortest,
my $re_w = join '|', sort { length $b <=> length $a } @keywords;
Please clarify if there are additional considerations.
I would like to create an array in Perl of strings that I need to search/grep from a tab-delimited text file. For example, I create the array:
#!/usr/bin/perl -w
use strict;
use warnings;
# array of search terms
my @searchArray = ('10060\t', '10841\t', '11164\t');
I want to have a foreach loop to grep a text file with a format like this:
c18 10706 463029 K
c2 10841 91075 G
c36 11164 . B
c19 11257 41553 C
for each of the elements of the above array. In the end, I want to have a NEW text file that would look like this (continuing this example):
c2 10841 91075 G
c36 11164 . B
How do I go about doing this? Also, this needs to be able to work on a text file with ~5 million lines, so memory cannot be wasted (I do have 32GB of memory though).
Thanks for any help/advice in advance! Cheers.
Using a perl one-liner. Just translate your list of numbers into a regex.
perl -ne 'print if /\b(?:10060|10841|11164)\b/' file.txt > newfile.txt
You can search for alternatives by using a regexp like /(10060\t|10841\t|11164\t)/. Since your array could be large, you could create this regexp with something like
$searchRegex = '(' . join('|', @searchArray) . ')';
this is just a simple string, and so it would be better (faster) to compile it to a regexp by
$searchRegex = qr/$searchRegex/;
With only 5 million lines, you could actually pull the entire file into memory (less than a gigabyte if 100 chars/line), but otherwise, line by line you could search with this pattern as in
while (<>) {
print if $_ =~ $searchRegex
}
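Putting those fragments together, a minimal runnable sketch could look like the following. The build_search_regex helper and the in-memory sample rows are assumptions for illustration; a real script would read the file line by line as shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: build one compiled regex from the search terms.
# The terms are used as regex fragments, so '\t' means a tab.
sub build_search_regex {
    my @terms = @_;
    my $alternation = join '|', @terms;
    return qr/(?:$alternation)/;
}

my @searchArray = ('10060\t', '10841\t', '11164\t');
my $searchRegex = build_search_regex(@searchArray);

# Made-up tab-separated rows standing in for the real file.
my @rows = (
    "c18\t10706\t463029\tK\n",
    "c2\t10841\t91075\tG\n",
    "c36\t11164\t.\tB\n",
);
print grep { $_ =~ $searchRegex } @rows;
```

Note that nothing anchors the left side of each term, so a value such as 110841 followed by a tab would also match; adding \b or a leading \t to the pattern would tighten that.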
So I'm not the best coder but this should work.
#!/usr/bin/perl -w
use strict;
use warnings;
# array of search terms
my $searchfile = 'file.txt';
my $outfile = 'outfile.txt';
my @searchArray = ('10060', '10841', '11164');
my @findArray;
open(READ,'<',$searchfile) || die $!;
while (<READ>)
{
    foreach my $searchArray (@searchArray) {
        if (/$searchArray/) {
            chomp ($_);
            push (@findArray, $_) ;
        }
    }
}
close(READ);
### For Console Print
#foreach (@findArray){
#    print $_."\n";
#}
open(WRITE,'>',$outfile) || die $!;
foreach (@findArray){
    print WRITE $_."\n";
}
close(WRITE);
I'm new in Perl, I want to write a simple program which reads an input-file and count the letters of this file, this is my code:
#!/usr/bin/perl
$textfile = "example.txt";
open(FILE, "< $textfile");
@array = split(//,<FILE>);
$counter = 0;
foreach(@array){
    $counter = $counter + 1;
}
print "Letters: $counter";
this code shows me the number of letters, but only for the first paragraph of my Input-File, it doesn't work for more than one paragraph, can anyone help me, i don't know the problem =(
thank you
You only ever read one line.
You count bytes (for which you could use -s), not letters.
Fix:
my $count = 0;
while (<>) {
$count += () = /\pL/g;
}
Your code is a rather over-complicated way of doing this:
#!/usr/bin/perl
# Always use these
use strict;
use warnings;
# Define variables with my
my $textfile = "example.txt";
# Lexical filehandle, three-argument open
# Check return from open, give sensible error
open(my $file, '<', $textfile) or die "Can't open $textfile: $!";
# No need for an array.
my $counter = length <$file>;
print "Letters: $counter";
But, as others have pointed out, you're counting bytes not characters. If your file is in ASCII or an 8-bit encoding, then you should be fine. Otherwise you should look at perluniintro.
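To illustrate the difference between bytes and characters, here is a small sketch. It assumes the data is UTF-8 and uses the core Encode module; the sample string is made up.

```perl
use strict;
use warnings;
use Encode qw(decode);

# "naive" with a diaeresis on the i, written out as raw UTF-8 bytes.
my $bytes = "na\xC3\xAFve";

# Decoding turns the byte string into a string of characters.
my $chars = decode('UTF-8', $bytes);

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
```

The undecoded string is one longer than the decoded one, because the accented letter takes two bytes in UTF-8.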
Here's an alternative approach using a module to do the work ..
# the following two lines enforce 'clean' code
use strict;
use warnings;
# load some help (read_file)
use File::Slurp;
# load the file into the variable $text
my $text = read_file('example.txt');
#get rid of multiple whitespace and linefeed chars # ****
# and replace them with a single space # ****
$text =~ s/\s+/ /g; # ****
# length gives you the length of the 'string' / scalar variable
print length($text);
you might want to comment out the lines marked '****'
and play with the code...
I am new to Perl and am trying to write a script that will only print the even numbered lines of an array. I have tried multiple different methods of finding the size to use as the condition for my while loop, but I always end up getting an infinite loop of the first line without the program terminating. The array being input is a text file, input with the form "program.pl < foo.txt". Have I made a logic or syntax error?
@input = <STDIN>;
$i = $1;
$size = $#input + $1;
while ($size >= $i) {
    print "$input[$i]";
    $i = ($i + $2);
}
Don't call your program with
program.pl < foo.txt
Instead, just pass 'foo.txt' as a parameter:
program.pl foo.txt
Inside your script, rely on default reading from <> and the line number variable $.:
use strict;
use warnings;
while (<>) {
next if $. % 2; # Skip odd numbers.
print;
}
Assuming you already have an array with all of your input, in your example @input, you can get all of the even index entries into another array using an Array Slice like so:
my @input_even_entries_only = @input[grep { $_ % 2 == 0 } 0..$#input];
The expression inside the square brackets evaluates to all of the even numbers between 0 and $#input.
You can then use a regular for/foreach loop to go through the resulting array:
for my $val (@input_even_entries_only) {
    print "$val";
}
If you are trying to print lines of an array indexed at even numbers then, try this:
use strict;
use warnings;
my @input = <DATA>;
for(my $i=0; $i<=$#input; $i+=2) {
    print $input[$i];
}
__DATA__
1
2
3
4
5
6
Output:
1
3
5
I've no idea what you are doing with the $1 and $2 variables. Did you think they were just numbers?
When you use a variable that has not been assigned a value, it is undefined, which will be converted to 0 when used in numerical context. If you do not use use warnings, this is done silently, and will be rather confusing.
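A quick way to see this for yourself, as a sketch. The add_to_undef helper is invented for the example; it catches the warning through a __WARN__ handler instead of letting it go to STDERR.

```perl
use strict;
use warnings;

# Hypothetical helper: add 1 to an undefined value and collect any warnings.
sub add_to_undef {
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    my $undefined;               # never assigned, so its value is undef
    my $sum = $undefined + 1;    # undef is treated as 0 in numeric context
    return ($sum, @warnings);
}

my ($sum, @warnings) = add_to_undef();
print "sum = $sum\n";
print "warning: $_" for @warnings;
```

With warnings enabled you get a "Use of uninitialized value" message; without them the addition quietly yields 1.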
Other than that, your code is not too far off. It should be something like:
use strict;
use warnings;
my @input = <>; # <> is more flexible and does the same thing
my $i = 1;
while ($i <= $#input) {
    print $input[$i];
    $i += 2;
}
Though of course, storing the entire file in an array is not necessary, and most often you should just loop over it instead, as Miller has shown in his answer, which is probably the solution I would suggest. Using a for loop like JS shows is also an excellent way to control the loop.
I have a school program I just got and we are learning hashes and the teacher went over hashes of arrays but not really array of hashes and I feel like an AoH is going to work out better for me in the long run. Right now I get all my data into separate variables and I want store them into a AoH bc I have the same variables the entire time but the values change.
What the program is, is a log analyzer and parses through a gigantic log file and all the data is, is lines that look like this.
IPADDY x x [DATE:TIME -x] "METHOD URL HTTPVERS" STATUSCODE BYTES "REFERER" "USERAGENT"
example line being
27.112.105.20 - - [09/Oct/2011:07:22:51 -0500] "GET / HTTP/1.1" 200 4886 "-" "Python-urllib/2.4"
Now I get all the data fine, I just don't really understand how to populate an Array of Hashes, if anyone can help me out.
Here is updated code that grabs the data and tries storing it into an AoH. The output in my file used to be perfect, just like the print statements I now have commented out. This is all that comes in my output file now: "ARRAY(0x2395df0): HASH(0x23d06e8)". Am I doing something wrong?
#!/usr/bin/perl
use strict;
use warnings;
my $j = 0;
my @arrofhash;
my $ipadd;
my $date;
my $time;
my $method;
my $url;
my $httpvers;
my $statuscode;
my $bytes;
my $referer;
my $useragent;
my $dateANDtime;
my ($dummy1, $dummy2, $dummy3);
open ( MYFILE, '>>dodoherty.report');
if ( @ARGV < 1)
{
printf "\n\tUsage: $0 file word(s)\n\n";
exit 0;
}
for (my $i = 0; $i < @ARGV; ++$i)
{
open( HANDLE, $ARGV[$i]);
while( my $line = <HANDLE> )
{
($ipadd, $dummy1, $dummy2, $dateANDtime, $dummy3, $method, $url, $httpvers, $statuscode, $bytes, $referer, $useragent) = split( /\s/, $line);
$method = substr ($method, 1, length($method));
$httpvers = substr ($httpvers, 0, length($httpvers)-1);
$referer = substr ($referer, 1, length($referer)-2);
$useragent = substr ($useragent, 1, length($useragent)-1);
if ( substr ($useragent, length($useragent)-1, length($useragent)) eq '"')
{
chop $useragent;
}
if ( $dateANDtime =~ /\[(\S*)\:(\d{2}\:\d{2}\:\d{2})/)
{
$date = $1;
$time = $2;
}
$arrofhash[$i] = {ipadd => $ipadd, date => $date, 'time' => $time, method => $method, url => $url, httpvers => $httpvers, statuscode => $statuscode, bytes => $bytes, referer => $referer, useragent => $useragent};
# print MYFILE "IPADDY :$ipadd\n";
# print MYFILE "METHOD :$method\n";
# print MYFILE "URL :$url\n";
# print MYFILE "HTTPOVERS : $httpvers\n";
# print MYFILE "STATUS CODE: $statuscode\n";
# print MYFILE "BYTES : $bytes\n";
# print MYFILE "REFERER : $referer\n";
# print MYFILE "USERAGENT : $useragent\n";
# print MYFILE "DATE : $date\n";
# print MYFILE "TIME : $time\n\n";
}
}
for ( my $j = 0; $j < @arrofhash; ++$j)
{
foreach my $hash (@hashkeys)
{
printf MYFILE "%s: %s\n",$hash, $arrofhash[$j];
}
print MYFILE "\n";
}
close (MYFILE);
A common beginner mistake is to not make use of the lexical scope of variables, and just declare all variables at the top, like you do. Declare them within the scope that you need them, no more, no less.
In your case, it would be beneficial to just store the data directly in a hash, then push that hash reference to an array. I would also advise against using split here, as it is working unreliably IMO, and you are splitting quoted strings, using dummy variables to get rid of unwanted data. Instead use a regex.
This regex won't handle escaped quotes inside quotes, but I get the feeling that you will not have to deal with that, since you were using split before to handle this.
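You will need to add any further processing to the data, like extracting date and time, etc. If you want some added safety, you can add a warning if the regex seems to have failed. Note that the hash slice assignment creates the keys even when the match fails, so check a captured value rather than the hash itself, e.g. unless (defined $f{ipadd}) { warn "Warning: Regex did not match line: '$_'"; next; }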
use strict;
use warnings;
use Data::Dumper;
my @all;
while (<DATA>) {
    my %f; # make a new hash for each line
    # assign the regex captures to a hash slice
    @f{qw(ipadd dateANDtime method statuscode bytes referer useragent)} =
        /^                  # at beginning of line...
        (\S+) [\s-]*        # capture non-whitespace and ignore whitespace or dashes
        \[([^]]+)\]\s*      # capture what's inside brackets
        "([^"]+)"\s*        # capture what's inside quotes
        (\d+)\s*            # capture digits
        (\d+)\s*
        "([^"]+)"\s*
        "([^"]+)"\s*
        $/x;                # ..until end of line, /x for regex readability only
    # split dateANDtime at the first colon into date and time
    @f{qw(date time)} = split /:/, $f{dateANDtime}, 2;
    push @all, \%f; # store hash in array
}
print Dumper \@all; # show the structure you've captured
__DATA__
27.112.105.20 - - [09/Oct/2011:07:22:51 -0500] "GET / HTTP/1.1" 200 4886 "-" "Python-urllib/2.4"
Basically you just declare the top level structure, and then use it:
my @AoH;
$AoH[0]{some_key} = 5;
$AoH[1]{some_other_key} = 10;
#   ^  ^ second level is a hash
#   | first level is an array
Which would create an array with two elements, each a hash, each with one key. This feature is called autovivification, and it causes container structures to spring into existence when they are used.
All of this is documented in the perldsc tutorial.
In your case, it would be something like:
$arrofhash[$i]{key_name} = value;
$arrofhash[$i]{another_key} = another_value;
...
or
$arrofhash[$i] = {key => value, key2 => value2, ...}
to set the whole hash at once.
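As for the "ARRAY(0x...): HASH(0x...)" output in the question, that is what you get when a reference is printed instead of being dereferenced. A minimal sketch of walking an array of hashes follows; the format_records helper and the sample record are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every record of an array of hashes into lines.
sub format_records {
    my @records = @_;
    my @out;
    for my $record (@records) {              # each element is a hash reference
        for my $key (sort keys %$record) {   # dereference to reach the keys
            push @out, sprintf("%s: %s\n", $key, $record->{$key});
        }
        push @out, "\n";                     # blank line between records
    }
    return @out;
}

# Made-up record using a few of the keys from the question.
my @arrofhash = (
    { ipadd => '27.112.105.20', method => 'GET', statuscode => 200 },
);
print format_records(@arrofhash);
```

The inner loop asks each hash for its own keys, so no separate list of key names is needed.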