This question already has answers here:
confusing filehandle in perl
(2 answers)
Closed 7 years ago.
New to Perl (and programming in general), and have gotten as far as opening a filehandle (open FH).
A little confused about the following two scenarios (outputs are different, but I couldn't understand why the second one is "wrong").
Script 1
#!/usr/bin/perl
use warnings;
use strict;
open FILE, "example.txt" or die $!;
while (<FILE>) {
    print $_;
}
Script 2
#!/usr/bin/perl
use warnings;
use strict;
open FILE, "example.txt" or die $!;
my @array = <FILE>;
while (<FILE>) {
    print $_;
}
The example.txt file contains three lines of text. When I assign the filehandle FILE to an array, $_ suddenly becomes empty??
If I replace <FILE> in the while loop with the array (@array), the loop cannot stop and $_ becomes an "uninitialized value".
This is likely a very basic misunderstanding of filehandles and arrays, but can someone please explain them, in plain English? Or point me to references where I could find more info (I have a few references handy and have consulted them, but obviously I still couldn't quite understand why script 2 is wrong). Great many thanks.
In Perl, when you read from a filehandle in list context, it automatically reads to the end of the file. $_ is empty because the entire file has already been read into @array, and the filehandle has been moved to EOF.

Perl's different handling of the same instruction in scalar and list "context" is unusual, but the behavior of the filehandle itself is pretty standard across languages. In general, a filehandle or input stream will let you read a line at a time, a byte at a time, or the whole file at once, and it keeps a pointer to the last place in the file you read. When you inadvertently moved that pointer all the way to the end of the file (EOF) by doing a read in list context, your next call to read a line found that there was no more file left to read.

Scalar versus list context is probably the trickier part of what's going on here, and is worth reading up on (perldata covers it).
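To see both behaviours side by side, here is a self-contained sketch; it writes its own three-line example.txt so the file's contents are known, then shows the list-context read draining the handle:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Create a known three-line example.txt so the demo stands alone.
open my $out, '>', 'example.txt' or die $!;
print $out "line $_\n" for 1 .. 3;
close $out;

open my $fh, '<', 'example.txt' or die $!;

# List context: <$fh> reads every remaining line at once and leaves
# the file pointer at EOF.
my @array = <$fh>;
print scalar(@array), " lines are now in the array\n";

# Scalar context: the handle is already at EOF, so this loop body
# never runs -- exactly the symptom in script 2.
while (my $line = <$fh>) {
    print "never reached: $line";
}

# The fix: loop over the array, not the exhausted handle.
print $_ for @array;
close $fh;
```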
I've searched and searched online, but nothing I do seems to work. I'm aware that this is a stupidly easy question, but I'm really stuck...
I had a directory of files, and saved the names of the directories to a CSV file. It's just made up of one column of data e.g.:
100
101
102
103...
I used the following code to create the file (just in case its relevant):
open (my $fileh, '>', '/Users/Shared/serials.csv') or die "Could not open file!";
print $fileh `ls -1 $path`;
close $fileh;
print "Saved Serials!";
Now all I want is to read in the data in the file into an array, and then loop through each value to complete a task. I can't figure out how to do this...
Currently I am entering the numbers manually in the code into an array, like:
@ser_list = (100,101,102,103,...);
As stated above, I instead want to automatically write the file names (numbers) using the ls command line query and read them back into the script from there.
If there is a way of doing this without having to make a separate file of values that would be great.
The array is called @ser_list in the example, and from there I am reading the next $ser_num from the array and working with that in the loop.
foreach $ser_num (@ser_list) {....}
Thank you all in advance for your help and patience!
Don't use ls in a Perl program. You can use glob instead:
my @files = glob "$path/*";
If you need to work with the paths and filenames, check Path::Tiny.
To read lines with paths into an array, just use
open my $fh, '<', $filename or die $!;
chomp( my @paths = <$fh> );
See open, chomp, and readline for details.
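Putting those pieces together, here is a minimal sketch of the whole task without the intermediate CSV file. The directory path is a stand-in for the $path from the question, and the "task" is just a print:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# $path stands in for the directory from the question.
my $path = '/Users/Shared/serials';

# glob returns full paths; keep just the file names (100, 101, ...),
# sorted the way `ls` would have listed them.
my @ser_list = sort map { (split m{/})[-1] } glob "$path/*";

foreach my $ser_num (@ser_list) {
    print "working on $ser_num\n";
    # ... complete the task for this serial number ...
}
```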
I had read somewhere about the following method to read the whole file into a Perl array at once,
open my $file, '<', $filePath or die "Error: Unable to open file : $!";
my @fileData = <$file>;
close $file;
I suppose the size of the array is only limited by the available system memory. I wanted to know how exactly this works in the background, since there are no loops involved here to read the file line by line and feed them into the array.
Your wish is my command — a comment transferred to an answer, with a mild correction en route.
What is there to say? In list context, as provided by my @fileData = <$file>, the <> operator reads the lines into the array with an implicit loop. It works. Occasionally, it is useful.
Perl has a couple of mottos. One is TMTOWTDI — There's More Than One Way To Do It. Another is DWIM — Do What I Mean; at least, Perl does this more than many languages, provided you know what you're asking for. This is a piece of dwimmery.
readline is the Perl 5 built-in function that implements the <EXPR> operator. It behaves differently in scalar and list context.
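A short demonstration of the two contexts with the same operator; the script writes its own test file so the line counts are predictable:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a known five-line file so the counts below are predictable.
my $filePath = 'context_demo.txt';
open my $out, '>', $filePath or die $!;
print $out "line $_\n" for 1 .. 5;
close $out;

open my $file, '<', $filePath or die "Error: Unable to open file : $!";
my $first    = <$file>;   # scalar context: exactly one line, pointer moves one line
my @fileData = <$file>;   # list context: all four remaining lines at once
close $file;

print "first: $first";
print "rest:  ", scalar(@fileData), " lines\n";
```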
I have a very large text file from which I have to extract some data. I read the file line by line and look for keywords. Since the keywords I am looking for are much closer to the end of the file than to the beginning, I wonder if it is possible to read the file starting at the last line instead of the first. I would then use an additional keyword which indicates "everything beyond this word is not of interest" and stop reading.
Is that possible?
I don't know how performant this would be, but run the file through tac and read from that:
set fh [open "|tac filename"]
# read from last line to first
while {[gets $fh line] != -1} {
    ...
}
Another tactic would be to read the last, say, 5000 bytes of the file (using seek), split on newlines and examine those lines, then seek to position 10000 from the end and read the "next" 5000 bytes, etc.
No it is not possible (in any runtime/language I'm aware of, Tcl included).
So decide on a buffer size and read your file by seeking backwards and trying to read a full buffer each time.
Note that you have to observe certain possibilities:
The file might be smaller than the size of your buffer.
It seems you're dealing with a text file and want to process it line-wise. If so, note that if the code is cross-platform or has to work on Windows, you must handle the case where the chunk you just read starts with LF and the preceding chunk (which you read next) ends with CR; that is, your EOL marker can be split across two buffers.
You might want to take a look at the implementation of Tcl_GetsObj() in the generic/tclIO.c file in the Tcl source code—it deals with split CRLFs on normal ("forward") reading of a textual string from a file.
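If Perl is an option, the CPAN module File::ReadBackwards packages exactly this read-a-buffer-and-seek-backwards bookkeeping, including the split-CRLF case. A sketch, with the file name and stop keyword as placeholders:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# CPAN module, not in core perl: install with `cpan File::ReadBackwards`.
# It implements the seek-backwards buffering described above.
use File::ReadBackwards;

my $bw = File::ReadBackwards->new('filename.txt')
    or die "can't open filename.txt: $!";

# readline returns lines last-to-first, undef at the start of the file.
while (defined(my $line = $bw->readline)) {
    # Stop once we pass the "nothing beyond this is of interest" marker.
    last if $line =~ /STOPWORD/;    # STOPWORD is a placeholder keyword
    # ... look for the keywords in $line ...
}
```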
The simplest way to grab the end of a file for searching, assuming you don't know the size of the records (i.e., the line lengths) is to grab too much and work with that.
set f [open $filename]
# Pick some large value; the more you read, the slower
seek $f -100000 end
# Read to the end, split into lines and *DISCARD FIRST*
set lines [lrange [split [read $f] "\n"] 1 end]
Now you can search with lsearch. (Note that you won't know exactly where in the file your matched line is; if you need that, you have to do quite a lot more work.)
if {[lsearch -glob $lines "*FooBar*"] >= 0} {
...
}
The discarding of the first line from the read section is because you're probably starting to read halfway through a line; dropping the first “line” means that you've only got genuine lines to deal with. (100kB isn't very much for any modern computer system to search through, but you may be able to constrain it further. It depends on the details of the data.)
package require struct::list

set fp [open "filename.txt"]
set lines [split [read -nonewline $fp] "\n"]
foreach line [struct::list reverse $lines] {
    # do something with "$line"
    ...
}
close $fp
To reverse the file, I read it into a variable "list" line by line, prepending the current line to $list. That way list is in reverse order of the file.
while {[gets $in line] > -1} {
    if {[regexp "#" $line]} {
        continue
    }
    # prepend to "list" so it ends up in reverse order
    # (linsert keeps each line intact as one list element)
    set list [linsert $list 0 $line]
}
foreach line $list {
    puts "line: $line"
    # *** process each line as you need ***
}
I am writing my first Perl program and it's a doozy. I'm happy to say that everything has been working for the most part, and searching this website has helped with most of my problems.
I am working with a large file composed of space separated values. I filter the file down to display only lines with a certain value in one of the columns, and output the filtered data to a new file. I then attempt to push all of the lines of that file into an array to use for looping. Here's some code:
my @orig_file_lines = <ORIG_FILE>;
open MAKE_NEW_FILE, '>', 'newfile.dat' or die "Couldn't open newfile.dat!";
&make_new_file(\@orig_file_lines); ##Creates a new, filtered newfile.dat
open NEW, "newfile.dat" or die "Couldn't open newfile.dat!";
my @lines;
while (<NEW>) {
    push(@lines, $_);
}
printf("%s\n", $lines[$#lines]); ##Should print entirety of last line of newfile.dat
The problem is twofold: 1. $#lines = 24500 here when the newly created file (newfile.dat) actually has 24503 lines (so it should be 24502), 2. the printf statement returns a truncated line 24500, cutting off that line prematurely by about two columns.
Every other line, e.g. $lines[0-24499], will successfully print the entire line even when it is wider than $lines[24500], so the length of that particular line (they're all long) is not the problem. But it is almost as if the array has gotten too large somehow, since it cut off part of one line, and then the next two lines. If so, how do I combat this?
It looks like you forgot to close MAKE_NEW_FILE before opening the same file with NEW.
Some other points to look at:
&function syntax is mostly deprecated because it bypasses prototype checking.
I trust that you are using use warnings; and use strict;.
I notice that you have a two-argument open and a three-argument open. Although both are legal, they have different mindsets, which makes using them together confusing. I would stay with the three-argument open because I think it is easier to understand (unless you are playing code golf).
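A sketch of the corrected flow with the missing close added and lexical filehandles throughout. The make_new_file here is a hypothetical stand-in (it keeps lines whose third column is "X") and takes the output handle as an argument, since the original sub is not shown in the question:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the filter in the question: keeps lines
# whose third column is "X".
sub make_new_file {
    my ($out_fh, $lines_ref) = @_;
    for my $line (@$lines_ref) {
        my @cols = split ' ', $line;
        print {$out_fh} $line if defined $cols[2] and $cols[2] eq 'X';
    }
}

# Build a small space-separated input file for the demo.
open my $out, '>', 'origfile.dat' or die $!;
print $out "a b X extra\n", "c d Y\n", "e f X last line\n";
close $out;

open my $orig, '<', 'origfile.dat' or die "Couldn't open origfile.dat: $!";
my @orig_file_lines = <$orig>;
close $orig;

open my $make_new, '>', 'newfile.dat' or die "Couldn't open newfile.dat: $!";
make_new_file($make_new, \@orig_file_lines);
close $make_new or die $!;   # without this close, the last buffered
                             # lines may not be on disk yet when we reopen

open my $new, '<', 'newfile.dat' or die "Couldn't open newfile.dat: $!";
my @lines = <$new>;
close $new;

printf "%s", $lines[$#lines];   # full last line: "e f X last line\n"
```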
I am reading a text file into an array in perl and looping through the array to do stuff on it. Whenever there is a "begin", "end" or a ";" anywhere in the text, I want my array element to end there and whatever comes after any of those keywords to be in the next element to make life easier for me when I try to make sense of the elements later.
To achieve this I thought of reading the entire file into an array, replacing all "begin" with "begin\n", "end" with "end\n" and ";" with ";\n", writing this array back to a file and then reading that file back to an array. Will this work ?
Is there a more elegant way to do this rather than use messy extra writes and reads to file?
Is there a way to short (in the electrical circuits sense if you know what I mean!) a read file handle and a write file handle so that I can escape the whole writing to the text file but still get my job done?
Gururaj
You can use split with parentheses to keep the separator in the result:
open my $FH, '<', 'file.txt' or die $!;
my @array = map { split /(begin|end|;)/ } <$FH>;
I would prefer to use a Perl one-liner and avoid manipulating arrays altogether:
$ perl -pi -e 's#(?<=begin)#\n#g; s#(?<=end)#\n#g; s#(?<=;)#\n#g;' file.txt
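For reference, here is what the capturing split from the first suggestion actually produces on a sample line; the empty leading field and surrounding whitespace are trimmed (the s///r flag needs Perl 5.14+):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A capturing group in split's pattern keeps the separators in the
# result, so "begin", "end" and ";" each get their own element.
my $line  = "begin a = b; end";
my @array = grep { length }               # drop the empty leading field
            map  { s/^\s+|\s+$//gr }      # trim whitespace (perl 5.14+)
            split /(begin|end|;)/, $line;

print "$_\n" for @array;    # begin / a = b / ; / end
```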