I need to use the system calls from windows.h to read a file whose name I get from the command line. I can read the whole file into a buffer using ReadFile() and then cut the buffer at the first \0, but how can I read only one line? I also need to read the last line of the file. Is this possible without reading the whole file into a buffer? The file may be 4 GB or more, so I won't be able to hold it all in memory. Does anyone know how to read it line by line?
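If you have an idea of how long the lines are, then you are in business: make a buffer that is a bit larger than the longest line.
Use ReadFile to read that many bytes and cut the buffer at the first end of line (\n).
For the last line, use SetFilePointerEx to position at the end of the file, move back one maximum line length, read from there and look for the last end of line. The last line starts just after it. (LZSeek only works on handles opened through the LZ compression API, not on the handle you pass to ReadFile.)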
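A minimal sketch of the "seek to the end and look backwards" idea. To keep it portable I am assuming stdio (fseek/ftell/fread) as a stand-in for the Win32 calls; with ReadFile the same steps would use SetFilePointerEx, which takes 64-bit offsets and so also works past 4 GB. The name read_last_line is made up for illustration, and the sketch assumes the last line is shorter than the buffer you pass in.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the last line of fp (opened in binary mode)
   into out. Only the final cap-1 bytes of the file are ever read.
   Returns 0 on success, -1 on error or if the file is empty. */
static int read_last_line(FILE *fp, char *out, size_t cap)
{
    if (cap < 2 || fseek(fp, 0, SEEK_END) != 0) return -1;
    long size = ftell(fp);
    if (size <= 0) return -1;

    long want = (long)cap - 1;
    long start = size > want ? size - want : 0;
    if (fseek(fp, start, SEEK_SET) != 0) return -1;
    size_t n = fread(out, 1, (size_t)(size - start), fp);

    /* Ignore one trailing line terminator so "a\nb\n" yields "b". */
    if (n > 0 && out[n - 1] == '\n') n--;
    if (n > 0 && out[n - 1] == '\r') n--;
    out[n] = '\0';

    /* The last line starts just after the final remaining '\n'.
       If none is found, the line was longer than the window and we
       return only its tail (a limitation of this sketch). */
    size_t i = n;
    while (i > 0 && out[i - 1] != '\n') i--;
    memmove(out, out + i, n - i + 1);
    return 0;
}
```

Note that on Windows a plain long is 32 bits, which is one more reason to use the 64-bit Win32 calls for very large files.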
Don't "cut the buffer at the first \0": ReadFile doesn't return a zero-terminated string. It reads raw bytes. You have to pay attention to the value returned through the lpNumberOfBytesRead argument. It will be equal to the nNumberOfBytesToRead value you pass unless you've reached the end of the file.
Now you know how many valid bytes are in the buffer. Search them for the first '\r' or '\n' byte to find the line terminator. Copy the range of bytes before it to a string buffer supplied by the caller and return. The next time you read a line, start where you left off previously, past the line terminator (both bytes if the file uses "\r\n"). When you don't find a line terminator, you have to copy the bytes that are left in the buffer and call ReadFile() again to read more. That makes the code a bit tricky to get right, but it is an excellent exercise.
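Here is a rough sketch of that bookkeeping. To keep it portable I am assuming fread as a stand-in for ReadFile: its return value plays the role of lpNumberOfBytesRead. LineReader, reader_init and reader_next_line are hypothetical names, and stripping a '\r' before the '\n' is my simplification for CRLF files.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical reader state: one chunk buffer plus the unread range in it. */
typedef struct {
    FILE  *fp;           /* stand-in for the HANDLE given to ReadFile */
    char   buf[4096];
    size_t pos;          /* next unread byte in buf */
    size_t len;          /* number of valid bytes in buf */
} LineReader;

static void reader_init(LineReader *r, FILE *fp)
{
    r->fp = fp;
    r->pos = r->len = 0;
}

/* Copies the next line, without its terminator, into out (cap >= 1),
   truncating to cap-1 bytes. Returns 1 if a line was read, 0 at EOF. */
static int reader_next_line(LineReader *r, char *out, size_t cap)
{
    size_t n = 0;
    int got_any = 0;
    for (;;) {
        if (r->pos == r->len) {              /* buffer used up: read more */
            r->len = fread(r->buf, 1, sizeof r->buf, r->fp);
            r->pos = 0;
            if (r->len == 0) break;          /* end of file */
        }
        got_any = 1;
        char *start = r->buf + r->pos;
        size_t avail = r->len - r->pos;
        char *nl = memchr(start, '\n', avail);
        size_t take = nl ? (size_t)(nl - start) : avail;
        size_t room = cap - 1 - n;
        size_t copy = take < room ? take : room;
        memcpy(out + n, start, copy);
        n += copy;
        r->pos += take + (nl ? 1 : 0);       /* step past the terminator */
        if (nl) break;                       /* otherwise keep reading */
    }
    if (n > 0 && out[n - 1] == '\r') n--;    /* CRLF files */
    out[n] = '\0';
    return got_any;
}
```

A line that straddles two reads is handled by the loop: the partial bytes are copied, the buffer is refilled, and the search continues in the new bytes.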
ReadFile is a particularly poor choice for what you want to do. Are you allowed to use fgets? That would be much easier to use in your case.
Related
I have a file of size 1 GB. I want to find out how many times the word "sosowhat" occurs in the file. I've written code using fgetc(), which reads one character at a time from the file, but that is far too slow for a 1 GB file. So I made a buffer of size 1000 (using malloc) to hold 1000 words at a time from the file, and I used the strstr() function to count the occurrences of the word "sosowhat". The logic is fine. But the problem is that if the "so" part of "sosowhat" is located at the end of the buffer and the "sowhat" part is in the new buffer, the word will not be counted. So I used two buffers, old_buffer and current_buffer. At the beginning of each buffer I want to check the last few characters of the old buffer. Is this possible? How can I go back to the old buffer? Is it possible without memmove()? As a beginner, I will be more than happy for your help.
Yes, it can be done. There is more than one possible approach.
The first one, which is the cleanest, is to keep a second buffer, as suggested, in which you keep the last chunk of the old buffer. To hold just that tail it needs exactly the length of the searched word, because you store wordLen - 1 characters plus the null terminator. The quickest way is then to append to this stored chunk the first wordLen - 1 characters from the new buffer and search for your word there, so in practice the buffer must hold both chunks: the last bytes from the old buffer and the first bytes from the new one, i.e. 2 * (wordLen - 1) characters plus the terminator. Then continue with your search normally.
Another approach (which I don't recommend, but which can turn out to be a bit easier in terms of code) would be to fseek wordLen - 1 bytes backwards in the file before each read. This "moves" the chunk stored in the previous approach into the next buffer. It is a bit dirtier, as you will read some of the contents of the file twice. Although that's not something noticeable in terms of performance, I again recommend against it and suggest using something like the first approach.
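A sketch of a variation on the first approach, under my own assumptions: instead of a separate small buffer, the last wordLen - 1 bytes are kept at the front of the same buffer and the next chunk is read in right after them. The name count_word and the chunk parameter are hypothetical. I use memcmp rather than strstr so that the data does not need a terminator, and memmove for the carried tail, which only ever moves wordLen - 1 bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts occurrences of word in fp, reading chunk
   bytes at a time. Overlapping occurrences are counted separately.
   Returns -1 on bad arguments or allocation failure. */
static long count_word(FILE *fp, const char *word, size_t chunk)
{
    size_t wlen = strlen(word);
    if (wlen == 0 || chunk == 0) return -1;
    size_t keep = wlen - 1;              /* bytes carried between chunks */
    char *buf = malloc(keep + chunk);
    if (buf == NULL) return -1;

    long count = 0;
    size_t carried = 0, n;
    while ((n = fread(buf + carried, 1, chunk, fp)) > 0) {
        size_t total = carried + n;
        /* Any match found here ends in the newly read bytes, because the
           carried tail alone is shorter than the word: no double counting. */
        for (size_t i = 0; i + wlen <= total; i++)
            if (memcmp(buf + i, word, wlen) == 0)
                count++;
        carried = total < keep ? total : keep;
        memmove(buf, buf + total - carried, carried);
    }
    free(buf);
    return count;
}
```

With a realistic chunk size (tens of kilobytes or more) the inner loop dominates and the carry costs next to nothing.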
Use the same algorithm as with fgetc, only read from the buffers you created. It will be just as efficient, since strstr iterates through the string character by character as well.
I would like to read text files line by line in C.
I saw some examples using fgets. But I don't know whether fgets reads characters until the end of the line, or whether it reads the specified number of characters (without stopping at the end of the line).
Best regards.
For future reference, if you're on a Unix-like system, try running man fgets in a terminal (in vim, pressing K with the cursor on the function name opens the same page). It'll give you some basic info on the function and its parameters. You can use this on nearly any standard function that you're unsure about, and it may help to clear some things up (although in my experience it sometimes confused things a bit more, since I'm also a beginner).
fgets reads until it has stored num-1 characters, until it has read a newline character, or until it reaches the end of the file, whichever comes first. It does not stop at a null byte ('\0') in the input.
One of many references located here.
fgets - char * fgets ( char * str, int num, FILE * stream );
Reads characters from stream and stores them as a C string into str until (num-1) characters have been read or either a newline or the end-of-file is reached, whichever happens first.
A newline character makes fgets stop reading, but it is considered a valid character by the function and included in the string copied to str.
Many code examples out there.
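As one such example, here is a small sketch that shows both stopping conditions. The function name read_line and its return codes are made up for illustration; only fgets, strlen and feof are standard.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around fgets. Returns:
   0 at end of file (or on a read error),
   1 if a whole line was read (its newline, if any, is stripped),
   2 if fgets stopped after size-1 characters with the line unfinished. */
static int read_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';          /* the newline was stored: drop it */
        return 1;
    }
    /* No newline: either the file ended or the buffer was too small. */
    return feof(fp) ? 1 : 2;
}
```

When the function returns 2, the next call simply continues with the rest of the same line.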
For example if the file contains:
12345
-3445654
1245646
I want to read the first line into a string using fgets(). Then I want to read the second line to check if there is a '-' in the first position. If there is one, I will read the second line and strcat it to the first line.
Then I want to read the third line using fgets() again. This time, when there is no '-', I just want to make the file go back to the beginning of the third line so that the next time I call fgets() it will read the same third line again.
Is there a way I can do this?
Use fgetc to read the first character on the next line, and if it's not a '-' use ungetc to put it back.
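A minimal sketch of that peek, plus a joining loop built on it. The names next_char_is and read_joined are hypothetical, and I am assuming each line fits in 256 bytes.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the next unread character of fp is ch, 0 otherwise
   (including at end of file). The character is pushed back with ungetc,
   so the next read sees it again. */
static int next_char_is(FILE *fp, int ch)
{
    int c = fgetc(fp);
    if (c == EOF) return 0;
    ungetc(c, fp);
    return c == ch;
}

/* Reads one line into out (cap >= 1), then appends every following line
   that starts with '-'. Newlines are stripped. Returns 0 at end of file. */
static int read_joined(FILE *fp, char *out, size_t cap)
{
    char line[256];
    if (fgets(line, sizeof line, fp) == NULL) return 0;
    line[strcspn(line, "\n")] = '\0';
    snprintf(out, cap, "%s", line);
    while (next_char_is(fp, '-')) {
        if (fgets(line, sizeof line, fp) == NULL) break;
        line[strcspn(line, "\n")] = '\0';
        strncat(out, line, cap - strlen(out) - 1);
    }
    return 1;
}
```

With the sample file from the question, the first call produces "12345-3445654" and the second produces "1245646". One character of pushback is all the C standard guarantees for ungetc, which is exactly what this needs.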
Generally you would just keep the part of the file just read, in the memory until you are sure you don't need it anymore.
Or you could read the entire file into a buffer and then jump around it using pointer as much as you like.
Or if you really must, you can move the current stream position with fseek, and then re-read the parts you need.
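For the fseek route, a small sketch: remember the position with ftell before reading, and seek back to it afterwards. The name peek_line is hypothetical.

```c
#include <stdio.h>

/* Reads the next line into buf but leaves the stream positioned at the
   start of that line, so the next fgets reads it again.
   Returns buf, or NULL at end of file or on error. */
static char *peek_line(char *buf, int size, FILE *fp)
{
    long pos = ftell(fp);              /* where this line starts */
    if (pos < 0) return NULL;
    char *res = fgets(buf, size, fp);
    if (fseek(fp, pos, SEEK_SET) != 0) /* go back; also clears EOF */
        return NULL;
    return res;
}
```

Seeking to a value that ftell returned is allowed in text mode as well as binary mode, so this works for ordinary text files.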
Whenever I want to
1. read a line
2. read a second line
3. maybe do something involving both the first line and the second line
what I usually do is declare a second line-holding variable
char prevline[whateversize];
and then, somewhere between step 1 and step 2
strcpy(prevline, line);
(Naturally you have to be sure that the line and prevline variables are consistently allocated, e.g. as arrays of the same size, so that overflow isn't a problem.)
Hi
My program reads a CSV file.
So I used fgets to read one line at a time.
But now the interface specification says that it is possible to find NULL characters in few of the columns.
So I need to replace fgets with another function to read from the file
Any suggestions?
If your text stream has a NUL (ASCII 0) character, you will need to handle your file as a binary file and use fread to read the file. There are a few approaches to this.
Read the entire file into memory. The length of the file can be obtained by calling fseek(fp, 0, SEEK_END) and then ftell. You can then allocate enough memory for the whole file. Once in memory, parsing the file should be relatively easy. This approach is only really suitable for smallish files (probably less than 50 MB). For bonus marks look at the mmap function.
Read the file byte by byte and add the characters to a buffer until a newline is found.
Read and parse bit by bit. Create a buffer that is bigger than your largest line and fill it with content from your file. You then parse and extract as many lines as you can. Add the remainder to the beginning of a new buffer and read the next bit. Using a bigger buffer will help minimize copying.
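A sketch of the first approach (load everything, then parse with explicit lengths so that NUL bytes are just data). The names read_whole_file and count_lines are hypothetical, and the file is assumed to be opened in binary mode ("rb").

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Loads all of fp into a malloc'd buffer and stores the byte count in *len.
   Returns NULL on failure. The caller frees the buffer. */
static char *read_whole_file(FILE *fp, size_t *len)
{
    if (fseek(fp, 0, SEEK_END) != 0) return NULL;
    long size = ftell(fp);
    if (size < 0) return NULL;
    rewind(fp);
    char *data = malloc((size_t)size + 1);
    if (data == NULL) return NULL;
    *len = fread(data, 1, (size_t)size, fp);
    data[*len] = '\0';     /* convenience only: embedded NULs may come first */
    return data;
}

/* Walks the buffer line by line with memchr, never relying on NUL. */
static size_t count_lines(const char *data, size_t len)
{
    size_t lines = 0;
    const char *p = data, *end = data + len;
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        lines++;
        if (nl == NULL) break;         /* last line has no newline */
        p = nl + 1;
    }
    return lines;
}
```

In a real parser the body of that loop is where each line, from p up to nl, would be split into columns.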
fgets works perfectly well with embedded null bytes. Pre-fill your buffer with \n (using memset), call fgets, and then use memchr(buf, '\n', sizeof buf). If memchr returns NULL, your buffer was too small and you need to enlarge it to read the rest of the line. Otherwise, you can determine whether the newline you found is the end of the line or the padding you pre-filled the buffer with by inspecting the next byte. If the newline you found is at the end of the buffer or has another newline just after it, it's from padding, and the previous byte is the null terminator inserted by fgets (not a null from the file). Otherwise, the newline you found has a null byte after it (the terminator inserted by fgets), and it's the end-of-line newline.
Other approaches will be slow (repeated fgetc) or waste (and risk running out of) resources (loading the whole file into memory).
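A sketch of the pre-fill trick, to make the three cases concrete. The name fgets_len is hypothetical. It relies on fgets leaving the bytes after its terminator untouched, which is how common implementations behave but is an assumption on my part rather than a guarantee I can quote.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line with fgets and returns how many bytes of file data are in
   buf (the terminator fgets appends is not counted), or 0 at end of file.
   *complete is set to 1 when the line's own newline was read.
   size must be at least 2. */
static size_t fgets_len(char *buf, size_t size, FILE *fp, int *complete)
{
    memset(buf, '\n', size);                 /* padding */
    *complete = 0;
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    char *nl = memchr(buf, '\n', size);
    if (nl == NULL)                          /* buffer full, line continues */
        return size - 1;
    if (nl == buf + size - 1 || nl[1] == '\n')
        return (size_t)(nl - buf) - 1;       /* padding: terminator precedes it */
    *complete = 1;                           /* real newline, terminator follows */
    return (size_t)(nl - buf) + 1;
}
```

The returned length includes the newline when there is one, and any null bytes from the file are counted like every other byte.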
use fread and then scan the block for the separator
Check the function int T_fread(FILE *input) at http://www.mrx.net/c/source.html
Hi, I am working in C on a Unix platform. Please tell me how to add one line before the last line of a file in C. I have used fopen in append mode, but I can't add a line before the last line.
I just want to write to the second last line in the file.
You don't need to overwrite the whole file. You just have to:
open your file in "r+" mode (read and write; "rw" is not a valid fopen mode),
read your file to find the last line: store its position (ftell/ftello) in the file and its contents
go back to the beginning of the last line (fseek/fseeko)
write whatever you want before the last line
write the last line.
close your file.
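The steps above can be sketched as follows. The name insert_before_last_line is hypothetical, the file is assumed to be open in "r+b" mode, and the last line is assumed to fit in 4096 bytes. This version finds the last line with a forward scan, which is simple but reads the whole file once.

```c
#include <stdio.h>
#include <string.h>

/* Inserts new_line (which should end in '\n') before the last line of fp.
   Returns 0 on success, -1 on error. */
static int insert_before_last_line(FILE *fp, const char *new_line)
{
    char line[4096], last[4096] = "";
    long last_pos = 0, pos;

    rewind(fp);
    /* Remember where the most recently read line started, and its text. */
    while ((pos = ftell(fp)) >= 0 && fgets(line, sizeof line, fp) != NULL) {
        last_pos = pos;
        strcpy(last, line);
    }
    /* Go back to the start of the last line and overwrite from there. */
    if (fseek(fp, last_pos, SEEK_SET) != 0) return -1;
    if (fputs(new_line, fp) == EOF || fputs(last, fp) == EOF) return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```

Only the tail of the file is rewritten, and because the new tail is longer than the old one the file simply grows; nothing needs to be truncated.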
There is no way of doing this directly in standard C, mostly because few file systems support this operation. The easiest way round this is to read the file into an in memory structure (where you probably have it anyway), insert the line in memory, then write the whole structure out again, overwriting the original file.
Append only appends to the end, not in the middle.
You need to read in the entire file, and then write it out to a new file. You might have luck starting from the back, and finding the byte offset of the second-to-last linefeed. Then you can just block write the entire "prelude", add your new line, and then emit the remaining trailer.
You can find the place where the last line ends, read the last line into memory, seek back to the place, write the new line, and then the last line.
To find the place: Seek to the end, minus a buffer size. Read buffer, look for newline. If not found, seek backwards two buffer sizes, and try again.
You'll need to use the r+ mode for fopen.
Oh, and you'll need to be careful about text and binary modes. You need to use binary mode, since with text mode you can't compute jump positions, you can only jump to locations you've gotten from ftell. You can work around that by reading through the entire file, and calling ftell at the beginning of each line. For large files, that is going to be slow.
Use fseek to jump to end of file, read backwards until you encounter a newline. Then insert your line.
You might want to save the 'last line' you are reading by counting how many chars you are reading backwards then strncpy it to a properly allocated buffer.