Reading data from file into an array to manipulate within Perl script - arrays

New to Perl. I need to figure out how to read from a file, separated by (:), into an array. Then I can manipulate the data.
Here is a sample of the file 'serverFile.txt' (Just threw in random #'s)
The fields are Name : CPU Utilization: avgMemory Usage : disk free
Server1:8:6:2225410
Server2:75:68:64392
Server3:95:90:12806
Server4:14:7:1548700
I would like to figure out how to get each field into its appropriate array to then perform functions on. For instance, find the server with the least amount of free disk space.
The way I have it set up now, I do not think will work. So how do I put each element in each line into an array?
#!usr/bin/perl
use warnings;
use diagnostics;
use v5.26.1;
#Opens serverFile.txt or reports and error
open (my $fh, "<", "/root//Perl/serverFile.txt")
or die "System cannot find the file specified. $!";
#Prints out the details of the file format
sub header(){
print "Server ** CPU Util% ** Avg Mem Usage ** Free Disk\n";
print "-------------------------------------------------\n";
}
# Creates our variables
my ($name, $cpuUtil, $avgMemUsage, $diskFree);
my $count = 0;
my $totalMem = 0;
header();
# Loops through the program looking to see if CPU Utilization is greater than 90%
# If it is, it will print out the Server details
while(<$fh>) {
# Puts the file contents into the variables
($name, $cpuUtil, $avgMemUsage, $diskFree) = split(":", $_);
print "$name ** $cpuUtil% ** $avgMemUsage% ** $diskFree% ", "\n\n", if $cpuUtil > 90;
$totalMem = $avgMemUsage + $totalMem;
$count++;
}
print "The average memory usage for all servers is: ", $totalMem / $count. "%\n";
# Closes the file
close $fh;

For this use case, a hash is much better than an array.
#!/usr/bin/perl
use strict;
use feature qw{ say };
use warnings;
use List::Util qw{ min };
my %server;
while (<>) {
chomp;
my ($name, $cpu_utilization, $avg_memory, $disk_free)
= split /:/;
#{ $server{$name} }{qw{ cpu_utilization avg_memory disk_free }}
= ($cpu_utilization, $avg_memory, $disk_free);
}
my $least_disk = min(map $server{$_}{disk_free}, keys %server);
say for grep $server{$_}{disk_free} == $least_disk, keys %server;

choroba's answer
is ideal, but I think your own code could be improved
Don't use v5.26.1 unless you need a specific feature that is available only in the given version of Perl. Note that it also enables use strict, which should be at the top of every Perl program you write
die "System cannot find the file specified. $!" is wrong: there are multiple reasons why an open may fail, beyond that it "cannot be found". Your die string should include the path to the file you're trying to open; the reason for the failure is in $!
Don't use subroutine prototypes: they don't do what you think they do. sub header() { ... } should be just sub header { ... }
There's no point in declaring a subroutine only to call it a few lines later. Put your code for header in line
You have clearly come from another language. Declare your variables with my as late as possible. In this case only $count and $totalMem must be declared outside the while loop
perl will close all open file handles when the program exits. There is rarely a need for an explicit close call, which just makes your code more noisy
$totalMem = $avgMemUsage + $totalMem is commonly written $totalMem += $avgMemUsage
I hope that helps

To your original question about how to store the data in an array...
First, initialize an empty array outside the file read loop:
my #servers = ();
Then, within the loop, after you have your data pieces parsed out, you can store them in your array as sub-arrays (the resulting data structure is a two dimensional array):
$servers[$count] = [ $name, $cpuUtil, $avgMemUsage, $diskFree ];
Note, the square brackets on the right create the sub-array for the server's data pieces and return a reference to this new array. Also, on the left side we just use the current value of $count as an index within the #servers array and as the value increases, the size of the #servers array will grow automatically (this is called autovivification of new elements). Alternatively, you can push new elements onto the #servers array inside the loop, like this:
push #servers, [ $name, $cpuUtil, $avgMemUsage, $diskFree ];
This way, you explicitly ask for a new element to be added to the array and the square brackets still do the same creation of the sub-array.
In any case, the end result is that after you are finished with the file read loop, you now have a 2D array where you can access the first server and its disk free field (the 4-th field at index 3) like this:
my $df = $servers[0][3];
Or inspect all the servers in a loop to find the minimum disk free:
my $min_s = 0;
for ( my $s = 0; $s < #servers; $s++ ) {
$min_s = $s if ( $servers[$s][3] < $servers[$min_s][3] );
}
print "Server $min_s has least disk free: $servers[$min_s][3]\n";
Like #choroba suggested, you can store the server data pieces/fields in hashes, so that your code will be more readable. You can still store your list of servers in an array but the second dimension can be hash:
$servers[$count] = {
name => $name,
cpu_util => $cpuUtil,
avg_mem_usage => $avgMemUsage,
disk_free => $diskFree
};
So, your resulting structure will be an array of hashes. Here, the curly braces on the right create a new hash and return the reference to it. So, you can later refer to:
my $df = $servers[0]{disk_free};

Related

How to loop through file and count specific values in perl?

Let's say I have a file with the lines such as:
*some numbers* :00: *somenumbers*
*somenumbers* :21: *somenumbers*
And for every number between :: I need to count how many times it repeats in the file?
while (<>){
chomp($_);
my ($nebitno,$bitno,$opetnebitno) = split /:/, $_;
$count{$bitno}++;
}
foreach $bitno(sort keys %count){
print $bitno," ",$count{bitno}, "\n";
}
What you produced was not bad code — it did the job for a single file at a time. Adapting the code shown in the question to handle multiple files, resetting the counts after each file:
#!/usr/bin/perl
use strict;
use warnings;
my %count = ();
while (<>) {
my ($nebitno, $bitno, $opetnebitno) = split /:/, $_;
$count{$bitno}++;
}
continue
{
if (eof) {
print "$ARGV:\n";
foreach $bitno (sort keys %count) {
print "$bitno $count{bitno}\n";
}
%count = ();
}
}
The key here is the continue block, and the if (eof) test. You can use close $ARGV in a continue block to reset $. (the line number) when the file changes; it is a common use for it. This sort of per-file summary is another use. The other changes are cosmetic. You don't need to chomp the line (though there's no particular harm done if you do); I print whole strings rather than using comma-separated lists (it works well here and very often). I use a few more spaces. I left it with the 1TBS format for the blocks of code, though I don't use that myself (I use Allman).
My draft solution used practically the same printing code as shown above, but the main while loop was slightly different:
#!/usr/bin/env perl
use strict;
use warnings;
my %counts = ();
while (<>)
{
$counts{$1}++ if (m/.*:(\d+):/);
}
continue
{
if (eof)
{
print "$ARGV:\n";
foreach my $number (sort { $a <=> $b } keys %counts)
{
print ":$number: $counts{$number}\n"
}
%counts = ();
}
}
The only advantage over what you used is that if some line doesn't contain a colon-surrounded number, it ignores the line, whereas yours doesn't consider that possibility. I'm not sure the comparison code in the sort is necessary — it ensures that the comparisons are numeric, though. If the numbers are all the same length and zero-padded on the left when necessary, there's no problem. If they're more generally formatted, the 'forced numeric' comparison might make a difference.
Remember: this is Perl, so TMTOWDTI (There's More Than One Way To Do It). Someone else might come up with a simpler solution.
Desired output can be achieved with following code snippet
look for pattern :\d+: in a line
increment hash %count for the digit
output result to console
use strict;
use warnings;
use feature 'say';
my %count;
/:(\d+):/ && $count{$1}++ for <>;
say "$_ = $count{$_}" for sort keys %count;

Changing an array in Perl

I am trying to use Perl to parse output from a (C-based) program.
Every output line is a (1D) Perl array, which I sometimes want to store (based on certain conditions).
I now wish to (deep) copy an array when its first element has a certain keyword,
and print that same copied array if another keyword matches in a later line-array.
So far, I have attempted the following:
#!/usr/bin/env perl
use strict; # recommended
use Storable qw(dclone);
...
while(1) # loop over the lines
{
# subsequent calls to tbse_line contain
# (references to) arrays of data
my $la = $population->tbse_line();
my #copy;
my $header = shift #$la;
# break out of the loop:
last if ($header eq 'fin');
if($header eq 'keyword')
{
#copy = #{ dclone \#$la };
}
if($header eq 'other_keyword')
{
print "second condition met, print first line:\n"
print "#copy\n";
}
}
However, this prints an empty line to the screen, instead of the contents of the copied array. I don't have a lot of Perl experience, and I can't figure out what I am doing wrong.
Any idea on how to go about this?
my #copy allocates a new Perl array named #copy in the current scope. It looks like you want to set #copy during one iteration of your while loop and print it in a different iteration. In order for your array not to be erased each time a new while loop iteration starts, you should move the my #copy declaration outside of the loop.
my #copy;
while (1) { ... }

Split an array into two. Predetermined sizes, might not always be equal

I need to iterate over many files in a directory and split each file into two parts. I need to keep lines intact (I can't split on bite size). I also can't always assume that the file has an equal number of lines. I could use the "split" function, but am looking for a faster way of going through my files and to avoid the standard output names "xaa" and "xab" it generates.
The easiest would be to make two subsequent substrings of an array in the sizes specified ($number_of_group_one and $number_of_group_two). I can't find out how to do this. Instead I am trying to push the lines into different arrays- filling one up until a certain number of lines and then "spill over" into the other array until there are no more lines left to push. However, this approach yields two output arrays that both have exactly double the number of input lines. Here is my code:
#!/usr/bin/perl
use warnings;
use strict;
my ($directory) = #ARGV;
my $dir = "$directory";
my #arrayoffiles = glob "$dir/*";
my #arrayoflines_one;
my #arrayoflines_two;
my $counter = 0;
foreach my $filename(#arrayoffiles){
my #arrayoflines_one;
my #arrayoflines_two;
my #lines = read_lines($filename);
my $NumberofLines = #lines;
my $number_of_group_one = int($NumberofLines/2);
my $number_of_group_two = ($NumberofLines - $number_of_group_one);
foreach my $line (#lines){
$counter++;
push (#arrayoflines_one, $line, "\n");
if ($counter == $number_of_group_one){
push (#arrayoflines_two, $line, "\n");
}
}
}
sub read_lines {
my ($file) = #_;
open my $in, '<', $file or die $!;
local $/ = undef; #slurps the whole file in as one
my $content = <$in>;
return split /\s/, $content;
close $in;
}
I hope this is clear. Thanks for your help!
This is a good use case for splice:
my #lines = read_lines($filename);
my #lines1 = splice #lines, 0, #lines/2;
will put (about) half of your lines from #lines into #lines1, removing them (and leaving about half of the lines) from #lines.

how to get a particular block in an array get copied in perl

I have details like below in an array. There will be plenty of testbed details in actual case. I want to grep a particular testbed(TESTBED = vApp_eprapot_icr) and an infomation like below should get copied to another array. How can I do it using perl ? End of Testbed info can be understood by a closing flower bracket }.
TESTBED = vApp_eprapot_icr {
DEVICE = vApp_eprapot_icr-ipos1
DEVICE = vApp_eprapot_icr-ipos2
DEVICE = vApp_eprapot_icr-ipos3
DEVICE = vApp_eprapot_icr-ipos5
CARDS=1GIGE,ETHFAST
CARDS=3GIGE,ETHFAST
CARDS=10PGIGE,ETHFAST
CARDS=20PGIGE,ETHFAST
CARDS=40PGIGE,ETHFAST
CARDS=ETHFAST,ETHFAST
CARDS=10GIGE,ETHFAST
CARDS=ETH,ETHFAST
CARDS=10P10GIGE,ETHFAST
CARDS=PPA2GIGE,ETHFAST
CARDS=ETH,ETHFAST,ETHGIGE
}
I will make it simpler, please see the below array
#array("
student=Amit {
Age=20
sex=male
rollno=201
}
student=Akshaya {
Age=24
phone:88665544
sex=female
rollno=407
}
student=Akash {
Age=23
sex=male
rollno=356
address=na
phone=88456789
}
");
Consider an array like this. Where such entries are plenty. I need to grep, for an example student=Akshaya's data. from the opening '{' to closing '}' all info should get copied to another array. This is what I'm looking for.
while (<>) {
print if /TESTBED = vApp_eprapot_icr/../\}/;
}
as a sidenote <> will capture the filename you use on cmdline. So if the data is stored in a file you will run from commandline
perl scriptname.pl filename.txt
Ok. We finally have enough information to come up with an answer. Or, at least, to produce two answers which will work on slightly different versions of your input file.
In a comment you say that you are creating your array like this:
#array = `cat $file`;
That's not a very good idea for a couple of reasons. Firstly, why run an external command like cat when Perl will read the file for you. And secondly, this gives you one element in your array for each line in your input file. Things become far easier if you arrange it so that each of your TESTBED = foo { ... } records is a single array element.
Let's get rid of the cat first. The easiest way to read a single file into an array is to use the file input operator - <>. That will read data from the file whose name is given on the command line. So if you call your program filter_records, you can call it like this:
$ ./filter_records your_input_data.txt
And then read it into an array like this:
#array = <>;
That's good, but we still have each line of the input file in its own array element. How we fix that depends on the exact format of your input file. It's easiest if there's a blank line between each record in the input file, so it looks like this:
student=Amit {
Age=20
sex=male
rollno=201
}
student=Akshaya {
Age=24
phone:88665544
sex=female
rollno=407
}
student=Akash {
Age=23
sex=male
rollno=356
address=na
phone=88456789
}
Perl has a special variable called $/ which controls how it reads records from input files. If we set it to be an empty string then Perl goes into "paragraph" mode and it uses blank lines to delimit records. So we can write code like this:
{
local $/ = '';
#array = <>;
}
Note that it's always a good idea to localise changes to Perl's special variables, which is why I have enclosed the whole thing in a naked block.
If there are no blank lines, then things get slightly harder. We'll read the whole file in and then split it.
Here's our example file with no blank lines:
student=Amit {
Age=20
sex=male
rollno=201
}
student=Akshaya {
Age=24
phone:88665544
sex=female
rollno=407
}
student=Akash {
Age=23
sex=male
rollno=356
address=na
phone=88456789
}
And here's the code we use to read that data into an array.
{
local $/;
$data = <>;
}
#array = split /(?<=^})\n/m, $data;
This time, we've set $/ to undef which means that all of the data has been read from the file. We then split the data wherever we find a newline that is preceded by a } on a line by itself.
Whichever of the two solutions above that we use, we end up with an array which (for our sample data) has three elements - one for each of the records in our data file. It's then simple to use Perl's grep to filter that array in various ways:
# All students whose names start with 'Ak'
#filtered_array = grep { /student=Ak/ } #array;
If you use similar techniques on your original data file, then you can get the records that you are interested in with code like this:
#filtered_array = grep { /TESTBED = vApp_eprapot_icr/ } #array;

Perl - can't use string (...) as an array ref

I'm practicing Perl with a challenge from codeeval.com, and I'm getting an unexpected error. The goal is to iterate through a file line-by-line, in which each line has a string and a character separated by a comma, and to find the right-most occurrence of that character in the string. I was getting wrong answers back, so I altered the code to print out just variable values, when I got the following error:
Can't use string ("Hello world") as an ARRAY ref while "strict refs" in use at char_pos.pl line 20, <FILE> line 1.
My code is below. You can see a sample from the file in the header. You can also see the original output code, which was incorrectly only displaying the right-most character in each string.
#CodeEval challenge: https://www.codeeval.com/open_challenges/31/
#Call with $> char_pos.pl numbers
##Hello world, d
##Hola mundo, H
##Keyboard, b
##Connecticut, n
#!/usr/bin/perl
use strict;
use warnings;
my $path = $ARGV[0];
open FILE, $path or die $!;
my $len;
while(<FILE>)
{
my #args = split(/,/,$_);
$len = length($args[0]) - 1;
print "$len\n";
for(;$len >= 0; $len--)
{
last if $args[0][$len] == $args[1];
}
#if($len > -1)
#{
# print $len, "\n";
#}else
#{
# print "not found\n";
#}
}
EDIT:
Based on the answers below, here's the code that I got to work:
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
open my $fh,"<",shift;
while(my $line = <$fh>)
{
chomp $line;
my #args = split(/,/,$line);
my $index = rindex($args[0],$args[1]);
print $index>-1 ? "$index\n" : "Not found\n";
}
close $fh;
It looks like you need to know a bit about Perl functions. Perl has many functions for strings and scalars and it's not always possible to know them all right off the top of your head.
However, Perl has a great function called rindex that does exactly what you want. You give it a string, a substring (in this case, a single character), and it looks for the first position of that substring from the right side of the string (the index does the same thing from the left hand side.)
Since you're learning Perl, it may be a good idea to get a few books on Modern Perl and standard coding practices. This way, you know newer coding techniques and the standard coding practices.
Modern Perl - Gives you newer programming help.
Learning Perl - An old standard.
Perl Best Practices - The standard coding practices.
Here's a sample program:
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
use feature qw(say);
open my $fh, "<", shift;
while ( my $line = <$fh> ) {
chomp $line;
my ($string, $char) = split /,/, $line, 2;
if ( length $char != 1 or not defined $string ) {
say qq(Invalid line "$line".);
next;
}
my $location = rindex $string, $char;
if ( $location != -1 ) {
say qq(The right most "$char" is at position $location in "$string".);
}
else {
say qq(The character "$char" wasn't found in line "$line".)";
}
close $fh;
A few suggestions:
use autodie allows your program to automatically die on bad open. No need to check.
Three parameter open statement is now considered de rigueur.
Use scalar variables for file handles. They're easier to pass into subroutines.
Use lexically scoped variables for loops. Try to avoid using $_.
Always do a chomp after a read.
And most importantly, error check! I check the format of the line to make sure that's there is only a single comma, and that the character I'm searching for is a character. I also check the exit value of rindex to make sure it found the character. If rindex doesn't find the character, it returns a -1.
Also know that the first character in a line is 0 and not 1. You may need to adjust for this depending what output you're expecting.
Strings in perl are a basic type, not subscriptable arrays. You would use the substr function to get individual characters (which are also just strings) or substrings from them.
Also note that string comparison is done with eq; == is numeric comparison.
while($i=<DATA>){
($string,$char)=split(",",$i);
push(#str,$string);}
#join=split("",$_), print "$join[-1]\n",foreach(#str);
__DATA__
Hello world, d
Hola mundo, H
Keyboard, b
Connecticut, n

Resources