perl read data from textfile - file

I want to read the fruit and its info from an input file in order, starting at line 1 and working down. Instead, the code below prints the fruit and info in a seemingly random order, and the order changes every time I run the script. Any recommendations?
My input file looks something like this:
apple
    text1
    text2
grape
    text3
    text4
jackfruit
    text5
And this is the code I use to print each fruit and its info:
use strict;
use warnings;
my %hash;
open FILE, "config.txt" or die $!;
my $key;
while (my $line = <FILE>) {
    chomp($line);
    if ($line !~ /^\s/) {
        $key = $line;
        #$hash{$key} = [];
    } else {
        $line =~ s/\s//g;
        push (@{ $hash{$key} }, $line);
    }
}
close FILE;
my %final;
foreach my $fruit (keys %hash){
    foreach my $info (values @{$hash{$fruit}}){
        print "Fruit: $fruit\n";
        print "Info for $fruit = $info\n";
    }
}

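keys %hash
gives you a list of the keys in the hash, but in no particular order. You can sort that list with sort.
The whole line would be
foreach my $fruit (sort(keys %hash)){
Note that sort orders the keys alphabetically by default. For the sample file that happens to match the order in the file, but it will not in general. Use perldoc -f sort to get help on the sort function.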
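If the hash lookup is still wanted, one way to keep file order is to record each key in an array the first time it appears. This is a minimal sketch, assuming the indented input format from the question. The sub name read_in_order, the @order array, and the sample lines are illustrative and not part of the original code:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (\@order, \%info), where @order lists the
# fruit names in the order they first appear in the input lines.
sub read_in_order {
    my @lines = @_;
    my (@order, %info, $key);
    for my $line (@lines) {
        chomp $line;
        next unless length $line;
        if ($line !~ /^\s/) {                  # unindented line: a new fruit
            $key = $line;
            push @order, $key unless exists $info{$key};
            $info{$key} //= [];
        }
        elsif (defined $key) {                 # indented line: info for the current fruit
            (my $text = $line) =~ s/^\s+|\s+$//g;
            push @{ $info{$key} }, $text;
        }
    }
    return (\@order, \%info);
}

# Sample input (made up): pear comes first, so sorting would reorder it.
my ($order, $info) = read_in_order("pear\n", "  text1\n", "apple\n", "  text2\n", "  text3\n");
for my $fruit (@$order) {
    print "Fruit: $fruit\n";
    print "Info for $fruit = $_\n" for @{ $info->{$fruit} };
}
```

Iterating over @$order instead of keys %hash gives pear before apple, matching the input order.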

If you want to keep things in order that they are in the file, use an array. Each element of that array can be a hash that organizes the data:
#!perl
use v5.26;
my @items;
while( <DATA> ) {
    chomp;
    if( /\A(\S+)/ ) {
        push @items, { fruit => $1, info => [] }
    }
    elsif( /\A\s+(.+)/ ) {
        push $items[-1]{info}->@*, $1
    }
}
foreach my $item ( @items ) {
    print "Fruit: $item->{fruit}\n";
    foreach my $info ( $item->{info}->@* ) {
        print "\tInfo: $info\n";
    }
}
__END__
apple
    text1
    text2
grape
    text3
    text4
jackfruit
    text5
cranberry
    text6
The output maintains the order in the file:
Fruit: apple
	Info: text1
	Info: text2
Fruit: grape
	Info: text3
	Info: text4
Fruit: jackfruit
	Info: text5
Fruit: cranberry
	Info: text6
However, if you want to keep them in order of the file and merely output them, you don't need a data structure:
while( <DATA> ) {
    chomp;
    if( /\A(\S+)/ ) {
        print "Fruit: $1\n";
    }
    elsif( /\A\s+(.+)/ ) {
        print "\tInfo: $1\n";
    }
}
If you wanted slightly different output where each line needed to know the fruit, store that name in a persistent variable in the while loop:
while( <DATA> ) {
    state $fruit;
    chomp;
    if( /\A(\S+)/ ) {
        $fruit = $1;
        print "Fruit: $fruit\n";
    }
    elsif( /\A\s+(.+)/ ) {
        print "\t$fruit: $1\n";
    }
}

Related

Perl - Capture sentences with occurrence of more than one element of an array

I have a text file and an array that holds a list of words. I need a way to pick out the sentences in which more than one of those words occurs, and I can't work out how to write the code. Here is an example:
Input:
my @strings = (
    "i'm going to find the occurrence of two words if possible",
    "i'm going to find the occurrence of two words if possible",
    "to find a solution to this problem",
    "i will try my best for a way to this problem"
);
my @words = ("find", "two", "way");
Output :
i'm going to find the occurrence of two words if possible
i'm going to find the occurrence of two words if possible
And I do understand it's a simple problem but my mind seems to have hit a road block.
If you want strings with two or more instances of the keywords:
my @keywords = ("find", "two", "way");
my %keywords = map { $_ => 1 } @keywords;
for my $string (@strings) {
    my @words = $string =~ /\w+/g;
    my $count = grep { $keywords{$_} } @words; # Count words that are keywords.
    if ($count >= 2) {
        ...
    }
}
Short-circuiting alternative (i.e. good for extremely long strings):
my @keywords = ("find", "two", "way");
my %keywords = map { $_ => 1 } @keywords;
for my $string (@strings) {
    my $count = 0;
    while ($string =~ /(\w+)/g) {   # capture the word; the match does not set $_
        if ($keywords{$1} && ++$count == 2) {
            ...
            last;
        }
    }
}
If you want strings with instances of two or more keywords:
my @keywords = ("find", "two", "way");
for my $string (@strings) {
    my @words = $string =~ /\w+/g;
    my %seen; ++$seen{$_} for @words;
    my $count = grep { $seen{$_} } @keywords; # Count keywords that were seen.
    if ($count >= 2) {
        ...
    }
}
Alternate:
my @keywords = ("find", "two", "way");
for my $string (@strings) {
    my @words = $string =~ /\w+/g;
    my %seen = map { $_ => -1 } @keywords;
    my $count = grep { ++$seen{$_} == 0 } @words;
    if ($count >= 2) {
        ...
    }
}
Short-circuiting alternative (i.e. good for extremely long strings):
my @keywords = ("find", "two", "way");
for my $string (@strings) {
    my $count = 0;
    my %seen = map { $_ => -1 } @keywords;
    while ($string =~ /(\w+)/g) {   # capture the word; the match does not set $_
        if (++$seen{$1} == 0 && ++$count == 2) {
            ...
            last;
        }
    }
}
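The distinct-keyword variant can be wrapped in a small sub and run against the question's data. This is a sketch only; the sub name count_distinct_keywords is made up for illustration, and the duplicate sentence from the question is listed once:

```perl
use strict;
use warnings;

# Hypothetical helper: number of distinct keywords that occur in $string.
sub count_distinct_keywords {
    my ($string, @keywords) = @_;
    my %seen;
    $seen{$_}++ for $string =~ /\w+/g;         # tally every word in the string
    return scalar(grep { $seen{$_} } @keywords);
}

my @strings = (
    "i'm going to find the occurrence of two words if possible",
    "to find a solution to this problem",
    "i will try my best for a way to this problem",
);
my @keywords = ("find", "two", "way");

# Only the first sentence contains two different keywords ("find" and "two").
for my $string (@strings) {
    print "$string\n" if count_distinct_keywords($string, @keywords) >= 2;
}
```

Matching whole words through a hash means "ways" does not count as "way". Whether that is wanted depends on the data.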

Perl: Dump data from a hash into excel

I have a hash whose values are arrays. I want to dump it to a spreadsheet, but I'm having difficulty arranging the data.
%hash
key1 -> foo bar
key2 -> john adam gill
key3 -> apple banana mango orange
Code:
use strict;
use warnings;
use Excel::Writer::XLSX;
my $pattern = "BEGIN_";
my $format;
my @keys = qw(key1 key2 key3);
foreach my $key(@keys){
    open my $fh, "<","filename.txt" or die $!;
    while ( <$fh> ) {
        if (/$pattern/) {
            push(@matching_lines, $_);
        }
    }
    $hash{$key} = [@matching_lines] ;
    for (@matching_lines) { $_ = undef } ; #Emptying the array contents,to reuse it for for all the other keys
}
my $workbook = Excel::Writer::XLSX->new( 'c:\TEMP\filename.xlsx' );
if (not defined $workbook)
{
    die "Failed to create spreadsheet: $!";
}
my $worksheet = $workbook->add_worksheet();
# Add and define a format
$format = $workbook->add_format();
$format->set_bg_color( 'yellow' );
my $row = 1;
my $col = 0;
foreach my $k (keys %hash)
{
    $worksheet->write($row, $col, $k, $format); # title
    $worksheet->write_col($row+1, $col, $hash{$k}); #value
    $col++;
}
$workbook->close() or die "Error closing file: $!";
Current Output
Desired Output
Edit: Now that you've updated your program to clarify that the problem is how you're reading your data, the below is moot. But it does illustrate an alternative approach.
OK, the core problem here is that what you're trying to do is 'flip' a hash. You're printing row by row, but your hash is organised in columns.
Using comma-separated output as a quick proxy for writing actual Excel:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
#initialise
my %hash = (
    key1 => [qw ( foo bar )],
    key2 => [qw ( john adam gill )],
    key3 => [qw ( apple banana mango orange )],
);
#print for debug
print Dumper \%hash;
#get header row. Sort it, because hashes are unordered.
#could instead:
#my @keys = qw ( key1 key2 key3 );
my @keys = sort keys %hash;
#print header row
print join ",", @keys, "\n";
#iterate until every element of the hash is gone
while ( map { @{ $hash{$_} } } @keys ) {
    #cycle the keys, shifting a value of the top of each array.
    #replace any undefined values with ''.
    print shift( @{ $hash{$_} } ) // '', "," for @keys;
    print "\n";
}
This prints:
key1,key2,key3,
foo,john,apple,
bar,adam,banana,
,gill,mango,
,,orange,
If you load that into Excel as CSV, it should give your desired result. I'm pretty sure you could use a similar 'write row' approach with the module.
So this actually seems to do what you want:
#!/usr/bin/env perl
use strict;
use warnings;
use Excel::Writer::XLSX;
#initialise
my %hash = (
    key1 => [qw ( foo bar )],
    key2 => [qw ( john adam gill )],
    key3 => [qw ( apple banana mango orange )],
);
my $workbook = Excel::Writer::XLSX->new('c:\TEMP\filename.xlsx');
if ( not defined $workbook ) {
    die "Failed to create spreadsheet: $!";
}
my $worksheet = $workbook->add_worksheet();
# Add and define a format
my $format = $workbook->add_format();
$format->set_bg_color('yellow');
my @keys = sort keys %hash;
my $row = 0;
$worksheet->write_row( $row++, 0, \@keys, $format );
while ( map { @{ $hash{$_} } } @keys ) {
    my $col = 0;
    $worksheet->write( $row, $col++, shift( @{ $hash{$_} } ) // '' )
        for @keys;
    $row++;
}
$workbook->close() or die "Error closing file: $!";
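The shift-based loop empties the arrays as it goes. If the hash is still needed afterwards, the rows can be built by index instead. This is a sketch under that assumption; the sub name rows_from_columns is hypothetical, and it prints comma-separated rows instead of writing a workbook:

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: turn a hash of column arrays into a list of rows
# (array references), padding short columns with '' and leaving the
# hash untouched. Every key in @keys is assumed to exist in the hash.
sub rows_from_columns {
    my ($columns, @keys) = @_;
    my $last = max(map { $#{ $columns->{$_} } } @keys);   # highest index in any column
    return () unless defined $last;
    return map {
        my $i = $_;
        [ map { $columns->{$_}[$i] // '' } @keys ];
    } 0 .. $last;
}

my %hash = (
    key1 => [qw(foo bar)],
    key2 => [qw(john adam gill)],
);
my @keys = sort keys %hash;
print join(',', @keys), "\n";
print join(',', @$_), "\n" for rows_from_columns(\%hash, @keys);
```

Each row is an array reference, so it could be passed to write_row in place of the print, with the row counter incremented once per row.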
You're not correctly emptying your @matching_lines array. This line:
for (@matching_lines) { $_ = undef }
Sets the array values to undef, but does not remove them.
For example, if @matching_lines was ('foo', 'bar'), now it becomes (undef, undef). When you add baz and qux to it later, it becomes (undef, undef, 'baz', 'qux'). These undefs become blank cells when you add them to the worksheet.
To correctly empty the array, use:
@matching_lines = ();
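The difference is easy to check by printing the array length after each step. A small sketch with made-up values:

```perl
use strict;
use warnings;

my @matching_lines = ('foo', 'bar');

# Overwriting each element keeps the array the same length.
$_ = undef for @matching_lines;
print scalar(@matching_lines), "\n";    # 2

push @matching_lines, 'baz';
print scalar(@matching_lines), "\n";    # 3: (undef, undef, 'baz')

# Assigning the empty list actually removes the elements.
@matching_lines = ();
push @matching_lines, 'baz';
print scalar(@matching_lines), "\n";    # 1: ('baz')
```

Another option is to declare the array with my inside the loop body, so each key starts with a fresh, empty array.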

First and last regex match

Hi, I have a problem with my program. I wrote the code below and it returns the expected result. However, I am only interested in the first and last occurrences of the matches. How would I go about doing this?
foreach (@array)
{
    $element = $_;
    foreach(@array2)
    {
        if($_ =~ s/($element)//ig)
        {
            print "$_ \n";
        }
    }
}
Currently the loop goes through every element in the array, finds it in the second array, and prints the whole line. It returns the expected result; however, I want only the first match and the last match.
foreach my $elm2 (@array2) {
    my $state = 'start';
    my ($first, $last);   # parentheses are needed to declare both variables
    foreach my $elm1 (@array1) {
        if (($state eq 'start') && ($elm1 =~ m/$elm2/i)) {
            $first = $elm1;
            $state = 'last';
        }
        elsif (($state eq 'last') && ($elm1 =~ m/$elm2/i)) {
            $last = $elm1;
        }
    }
    print "$elm2,$first,$last\n";
}
You could maybe do this:
foreach (@array)
{
    $first = "";
    $last = "";
    $element = $_;
    foreach(@array2)
    {
        if($_ =~ s/($element)//ig)
        {
            if (!length($first)){
                $first = $_;
            }
            else {
                $last = $_;
            }
        }
    }
    if (length($first) && length($last)) {
        print "\n----------\nfirst = '$first'\nlast = '$last'";
    }
}
Totally forgot about grep.
foreach my $elm2 (@array2) {
    my @matches = grep(/$elm2/i, @array1);
    if (@matches > 1) {
        print "$elm2,$matches[0], $matches[-1]\n";
    }
    elsif (@matches) {
        print "$elm2,$matches[0]\n";
    }
    else {
        print "no matches\n";
    }
}
A bit late, but this is what I think you could use.
Find the first match:
if ($_ =~ m/($element)/) { print $1; }
Find the last match (the greedy .* pushes the capture to the last occurrence in the string):
if ($_ =~ m/.*($element)/) { print $1; }
Assuming you want to check which elements of @array2 match any of the patterns in @array and print the first and last of those, it is simplest to build an alternation regex from the contents of @array and filter @array2 using that.
Like this
my $re = join '|', @array; # Build a regex
$re = qr/$re/; # Compile it
my @matches = grep /$re/, @array2;
print "$_\n" for @matches[0,-1];
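One caveat with the [0,-1] slice: if nothing matches it yields undef values, and with a single match it prints that line twice. The sketch below guards both cases. It also assumes the entries in @array are literal strings and not regexes, so it escapes them with quotemeta, and it matches case-insensitively as the question's /i did. The sub name first_and_last and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: first and last elements of @$lines that contain any
# of the literal strings in @$patterns (case-insensitive).
sub first_and_last {
    my ($patterns, $lines) = @_;
    return unless @$patterns;                   # an empty alternation would match every line
    my $alt = join '|', map { quotemeta($_) } @$patterns;
    my $re  = qr/$alt/i;
    my @matches = grep { /$re/ } @$lines;
    return unless @matches;                     # no match: empty list
    return @matches == 1 ? ($matches[0]) : @matches[0, -1];
}

my @array  = ('cat', 'dog');
my @array2 = ('a cat sat', 'nothing here', 'hot dog', 'CAT nap');
print "$_\n" for first_and_last(\@array, \@array2);
```

With this sample data the sub returns 'a cat sat' and 'CAT nap'.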

Iterate through a hash and an array in Perl

I have an array and a hash:
@arraycodons = ("AATG", "AAAA", "TTGC"... etc.)
%hashdictionary = ("AATG" => "A", "AAAA" => "B"... etc.)
I need to translate each element of the array into the corresponding value in %hashdictionary. However, I get a wrong translation.
To see the problem, I printed $codon (each element of the array), but each codon is printed several times, and it shouldn't be.
sub translation() {
    foreach $codon (@arraycodons) {
        foreach $k (keys %hashdictionary) {
            if ($codon == $k) {
                $v = $hashdictionary{$k};
                print $codon;
            }
        }
    }
}
I don't know if I've explained my problem well enough, but I can't go on with my code if this doesn't work...
Many thanks in advance.
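You appear to be looping through the keys of your hash (also known as a "dictionary") to find your desired key. This defeats the purpose of a hash, whose primary advantage is very fast lookup by key. The repeated output comes from ==, which compares numerically: non-numeric strings such as "AATG" and "AAAA" both evaluate to 0, so every codon "matches" every key. Use eq to compare strings.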
Try, instead of
foreach $codon (@arraycodons) {
    foreach $k (keys %hashdictionary) {
        if ($codon == $k) {
            $v = $hashdictionary{$k};
            print $codon;
        }
    }
}
this:
foreach $codon (@arraycodons) {
    my $value = $hashdictionary{$codon};
    print( "$codon => $value\n" );
}
or:
foreach my $key ( keys %hashdictionary ) {
    my $value = $hashdictionary{$key};
    print( "$key => $value\n" );
}
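Direct lookup also makes it easy to handle codons that are missing from the dictionary. This is a sketch with made-up sample data; the sub name translate_codons and the '?' placeholder are assumptions and do not come from the question:

```perl
use strict;
use warnings;

# Hypothetical helper: translate each codon by direct hash lookup,
# substituting '?' for codons that are not in the dictionary.
sub translate_codons {
    my ($dictionary, @codons) = @_;
    return map { exists $dictionary->{$_} ? $dictionary->{$_} : '?' } @codons;
}

my %hashdictionary = ("AATG" => "A", "AAAA" => "B");
my @arraycodons    = ("AATG", "AAAA", "TTGC");

print join('', translate_codons(\%hashdictionary, @arraycodons)), "\n";   # AB?
```

Each codon is looked up exactly once, so nothing is repeated and no comparison operator is involved.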
my @mappedcodons = map {$hashdictionary{$_}}
    grep (defined $hashdictionary{$_},@arraycodons);
or
my @mappedcodons = grep ($_ ne "", map{$hashdictionary{$_} || ""} @arraycodons);
my @words = ("car", "house", "world");
my %dictionary = ("car" => "el coche", "house" => "la casa", "world" => "el mundo");
my @keys = keys %dictionary;
foreach(@words) {
    my $word = $_;
    foreach(@keys) {
        if($_ eq $word) { # eq, not ==
            my $translation = $dictionary{$_};
            print "The Spanish translation of $word is $translation\n";
        }
    }
}

perl hash of arrays

I am trying to access elements of an array which is part of a hash.
for my $idx ( 0 .. $#vss ) {
    push (@{$vsnhash->{$vss[$idx]}}, $vsports[$idx]);
}
print Dumper(\%$vsnhash);
$VAR1 = {
    'name2' => [
        '8001',
        '8002'
    ],
    'name1' => [
        '8000'
    ]
};
I am able to access the keys with a foreach loop:
foreach my $key ( keys %$vsnhash ) {
print "$key\n";
}
How do I access the array of port numbers ('8001' , '8002') within the hash?
Thank you for the help!
while (my ($k, $v) = each %$vsnhash) {
    print "$k: @$v\n";
}
foreach my $key ( keys %$vsnhash ) {
    print "$key\n";
    foreach my $port (@{$vsnhash->{$key}}){
        print "Port $port\n";
    }
}
$vsnhash->{name2}->[0]; #8001
$vsnhash->{name2}->[1]; #8002
$vsnhash->{name1}->[0]; #8000
Code wise:
foreach my $key (sort keys %$vsnhash) {
    foreach my $index (0..$#{ $vsnhash->{$key} }) {
        print "\$vsnhash->{$key}->[$index] = " . $vsnhash->{$key}->[$index] . "\n";
    }
}
The $#{ $vsnhash->{$key} } means the last index of the array @{ $vsnhash->{$key} }. Remember that $vsnhash->{$key} is a reference to an array while @{ $vsnhash->{$key} } is the array itself.
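Putting the pieces together, here is a self-contained sketch that builds the structure from the question and walks it. The contents of @vss and @vsports are assumed sample data chosen to reproduce the Dumper output shown above:

```perl
use strict;
use warnings;

# Sample data (assumed); $vsnhash is a hash reference, as in the question.
my @vss     = ('name1', 'name2', 'name2');
my @vsports = ('8000',  '8001',  '8002');

my $vsnhash = {};
for my $idx (0 .. $#vss) {
    push @{ $vsnhash->{ $vss[$idx] } }, $vsports[$idx];
}

for my $key (sort keys %$vsnhash) {
    my $ports = $vsnhash->{$key};              # array reference for this key
    print "$key has ", scalar(@$ports), " port(s): @$ports\n";
    print "  first port: $ports->[0]\n";
}
```

Copying the array reference into $ports keeps the inner code short: @$ports is the whole list and $ports->[0] is a single element.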

Resources