Counting numbers in a file, in C

I'm trying to count the number of numbers in a text file. I have the following code:
FILE *f1;
char pathname[4096];
snprintf(pathname, 4095, "%s%d%s\n", "Key_", 2, ".txt");
if( ( f1 = fopen(pathname, "w+") ) == NULL )
perror("fopen");
for(int i = 0; i<20; ++i)
fprintf(f1, "%d\n", i+1);
int sum = 0;
int num;
while( fscanf(f1, "%d", &num) != EOF )
++sum;
printf("number of numbers: %d\n", sum);
This code says the number of numbers in the file is zero. However, if I fclose the file stream and reopen it, the sum will be 20 as expected. Any idea as to why this happens?
Thanks

The file position at which reads occur is shared with writes. Since w+ creates a new file or truncates an existing one, the file starts out empty. As you write to the file, the position moves forward and always sits at the end of the file. When you then read at that position, you hit EOF right away.
After writing, but before reading, seek to the beginning with fseek(f1, 0, SEEK_SET);

You need to use fseek to reset the current position in the file before you read from the beginning again: fseek(f1, 0, SEEK_SET);
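As a minimal sketch of the fix described in these answers, the helper below writes n integers to a file opened in "w+" mode, seeks back to the start, and counts what it can read back. The function name count_after_rewind and its path parameter are hypothetical, chosen only for illustration.

```c
#include <stdio.h>

/* Writes the integers 1..n to `path` (opened in "w+" mode), seeks back to
   the start, and returns how many integers fscanf can read back.
   Returns -1 if the file cannot be opened or the seek fails.
   The name count_after_rewind is made up for this example. */
int count_after_rewind(const char *path, int n)
{
    FILE *f = fopen(path, "w+");
    if (f == NULL)
        return -1;

    for (int i = 0; i < n; ++i)
        fprintf(f, "%d\n", i + 1);

    /* The position is now at end of file; move it back before reading.
       A positioning call (or fflush) is also required when switching
       from writing to reading on an update stream. */
    if (fseek(f, 0, SEEK_SET) != 0) {
        fclose(f);
        return -1;
    }

    int count = 0;
    int num;
    while (fscanf(f, "%d", &num) == 1)  /* stops on EOF or on a non-number */
        ++count;

    fclose(f);
    return count;
}
```

Comparing the fscanf result against 1 rather than EOF also avoids an endless loop if the file contains something that is not a number, because fscanf then returns 0 instead of EOF.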

Related

Why do fseek & fputc not work in my program?

I am learning to program in C. Please explain why my program doesn't work. What is wrong? The program creates a file, writes a number into this file, and increases this number each time I run this program. The program counts how many times I have opened the file.
#include <stdio.h>
int main (void)
{
int n;
int c;
FILE* f = fopen("count_pr.bin", "a+");
if ((c=fgetc(f)) == EOF)
{
n=1;
fputc(n, f);
}
else
{
++n;
fseek(f, 0, SEEK_SET);
fputc(n, f);
}
printf ("The program was opened: %d\n", n);
fclose(f);
}
The documentation for the "a+" mode says:
append/update: Open a file for update (both for input and output) with all output operations writing data at the end of the file. Repositioning operations (fseek, fsetpos, rewind) affects the next input operations, but output operations move the position back to the end of file. The file is created if it does not exist.
Output operations move the position back to the end of the file. So every write you do will be at the end of the file.
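The append behaviour can be checked with a small experiment. This is only a sketch: the function name append_demo and the file contents "AB" and "C" are made up for the example.

```c
#include <stdio.h>

/* Creates `path` containing "AB", reopens it in "a+" mode, seeks to the
   start, writes 'C', then copies the resulting file contents into `out`
   as a NUL-terminated string. Returns 0 on success, -1 on failure.
   The name append_demo is hypothetical. */
int append_demo(const char *path, char *out, size_t outsize)
{
    if (outsize == 0)
        return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fputs("AB", f);
    fclose(f);

    f = fopen(path, "a+");
    if (f == NULL)
        return -1;
    fseek(f, 0, SEEK_SET);  /* only affects where the next read happens */
    fputc('C', f);          /* still written at the end of the file */
    fclose(f);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    size_t len = fread(out, 1, outsize - 1, f);
    out[len] = '\0';
    fclose(f);
    return 0;
}
```

After the call, the buffer holds "ABC" rather than "CB": the fseek did not move the write position, which is why the counter program needs a mode other than "a+".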
In addition, you need to make the change suggested by Klas.
If you want to overwrite the number each time, you should use "r+" mode:
read/update: Open a file for update (both for input and output). The file must exist.
There is still the issue of creating the file if it does not exist, so in that case, if the file cannot be opened in r+ mode, then you can open it in "w" or "w+" mode and only write to it.
I have updated the code below.
#include <stdio.h>
#include <stdlib.h> /* for exit() */
int main (void)
{
int n;
int c;
FILE* f = fopen("count_pr.bin", "r+");
if (f == NULL)
{
f = fopen("count_pr.bin", "w");
if (f != NULL)
{
n = 1;
fputc(n, f);
}
else
{
printf (" File Open Error");
exit(1);
}
}
else
{
c=fgetc(f);
n = c+1;
fseek(f, 0, SEEK_SET);
fputc(n, f);
}
printf ("The program was opened: %d\n", n);
fclose(f);
}
You read the value into c. Then you increase the uninitialized variable n by one and write n to the file. Since n is uninitialized you invoke undefined behaviour.
You need to use the value you read:
else
{
n = c + 1; // Changed line
fseek(f, 0, SEEK_SET);
fputc(n, f);
}

how to copy and write a file from a specific line number

I would like to copy a huge text file and 'shrink' it. This is my code, but it still takes a lot of time to read the file. Is there a way to read from a specific line number to EOF? For instance, the first 1 million lines are not useful to me; how can I start reading from line 1 million? Or is there any way to read backwards from EOF?
#include <stdio.h>
#include <stdlib.h>
void main() {
FILE *fp1, *fp2;
char ch;
int i = 1;
int n = 0;
int k;
fp1 = fopen("co.data", "r"); /* open a file to read*/
fp2 = fopen("Output.txt", "w"); /* open a file to write*/
printf("please enter how many lines do not need to be copied\n");
scanf ("%d", &k);
while (1) {
ch = fgetc(fp1); /* a loop to read/copy the file*/
if (ch == '\n') /* record the number of lines*/
i++;
if (ch == EOF)
break;
else if (i>k)
putc(ch, fp2);
}
printf("File copied Successfully!\n");
printf("number of lines read is %d\n",i-1);
printf("number of lines copied is %d\n",i-1-k);
fclose(fp1);
fclose(fp2);
}
There are two potential answers to your question, depending on whether your file has known line lengths or not.
is there a way to read from a specific line number to EOF
In a file whose line lengths are completely arbitrary (variable), no.
For example, if line 1 is 10 characters, and line 2 is 20 characters, then there is no way to calculate where line 3 is going to start without iterating through lines 1 and 2.
Operating systems aren't magic; if this kind of functionality was supported, they'd have to iterate through the file first as well. Either way, you're going to be looping through the contents.
Now, if the line lengths are guaranteed to be the same, that's a different story.
Say you have a text file like so:
AAAAAAA
BBBBBBB
CCCCCCC
Each line in the above text file is 7 characters. Assuming your line terminator is \n, each line takes up exactly 8 bytes.
In this case, you can safely fread() 8 bytes at a time and know that you're getting exactly one line. In order to jump to a particular byte in a file, you would use fseek().
Since you know the length of the lines in this scenario, you could jump to line N by simply doing
fseek(fp1, S * N, SEEK_SET);
where N is the line number (starting at 0) and S is the length of the line (as mentioned above, 8 bytes in our example file).
Note that the second solution will break if lines that look equally long can differ in byte length, for example with a variable-width encoding such as UTF-8. Keep that in mind.
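The fixed-length case can be sketched as follows, assuming every line occupies exactly line_bytes bytes including its terminator. The function name read_fixed_line and its parameters are hypothetical.

```c
#include <stdio.h>

/* Reads line number `n` (starting at 0) from a file whose lines all occupy
   exactly `line_bytes` bytes, terminator included. `buf` must have room for
   line_bytes + 1 chars. Returns 0 on success, -1 on failure.
   read_fixed_line is a hypothetical name. */
int read_fixed_line(const char *path, long n, long line_bytes, char *buf)
{
    /* Binary mode, so that offsets are plain byte counts on every platform. */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* Jump straight to the first byte of line n. */
    if (fseek(fp, line_bytes * n, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }

    size_t got = fread(buf, 1, (size_t)line_bytes, fp);
    fclose(fp);
    if (got != (size_t)line_bytes)
        return -1;  /* line n does not exist or is truncated */

    buf[got] = '\0';
    return 0;
}
```

With the three-line example file above and \n terminators, read_fixed_line(path, 2, 8, buf) would fill buf with "CCCCCCC\n". On a file with \r\n terminators, line_bytes would have to be 9 instead.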
Using fgets(), I made this program; try it.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    FILE *fp1, *fp2;
    char str[500]; /* line buffer; longer lines are read in several pieces */
    int i = 0;     /* number of lines read so far */
    int l;         /* number of leading lines to skip */
    fp1 = fopen("co.data", "r");
    fp2 = fopen("Output.txt", "w");
    if (fp1 == NULL || fp2 == NULL)
    {
        perror("fopen");
        return EXIT_FAILURE;
    }
    printf("please enter how many lines do not need to be copied\n");
    if (scanf("%d", &l) != 1)
        return EXIT_FAILURE;
    while (fgets(str, sizeof str, fp1) != NULL) /* read one line at a time */
    {
        i++;
        if (i > l) /* past the skipped part: copy the line */
            fputs(str, fp2);
    }
    printf("File copied Successfully!\n");
    printf("number of lines read is %d\n", i);
    printf("number of lines copied is %d\n", i > l ? i - l : 0);
    fclose(fp1);
    fclose(fp2);
    return 0;
}

Overwrite a line in a file

I'm trying to overwrite a line in a file that contains only unsigned long numbers.
The contents of the file look like this:
1
2
3
4
5
I want to replace a specific number with the number 0. The code I wrote looks like this:
FILE *f = fopen("timestamps", "r+");
unsigned long times = 0;
int pos = 0;
while(fscanf(f, "%lu\n", &times) != EOF)
{
if(times == 3)
{
fseek(f, pos, SEEK_SET);
fprintf(f, "%lu\n", 0);
}
times = 0;
pos = ftell(f);
}
fclose(f);
f = fopen("timestamps", "r");
times = 0;
while(fscanf(f, "%lu\n", &times) != EOF)
{
printf("%lu\n", times);
times = 0;
}
fclose(f);
The output of the program looks like this:
1
2
10
5
Interestingly, if I cat the file, it looks like this:
1
2
10
5
Am I making a mistake in my ftell? Also, why didn't the printf show the missing line that the cat showed?
I could reproduce and fix.
The problem is that when a file is opened in r+ mode, you must call fseek (or another file positioning function) each time you switch from reading to writing and from writing to reading.
Here, you correctly call fseek before writing the 0, but not between that write and the following read. The file position is then not correctly set and you get undefined behaviour.
The fix is trivial; simply replace:
if(times == 3)
{
fseek(f, pos, SEEK_SET);
fprintf(f, "%lu\n", 0);
}
with
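if(times == 3)
{
fseek(f, pos, SEEK_SET); /* switch from reading to writing */
fprintf(f, "%lu\n", 0UL); /* 0UL so the argument matches %lu */
pos = ftell(f);
fseek(f, pos, SEEK_SET); /* switch back from writing to reading */
}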
But BEWARE: it works here only because you replace a line with a line of exactly the same length. If you tried to replace a line containing 1000 with a line containing 0, you would get an extra line containing 0 on a Windows system, where the end of line is \r\n, and an extra line containing 00 on a Unix-like system, where the end of line is \n.
Here is what would happen (Windows case):
Before the rewrite:
... 1 0 0 0 \r \n ...
After:
... 0 \r \n 0 \r \n ...
because a sequential file is just a sequence of bytes!
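The Unix-like case can be reproduced with a short experiment. This is a sketch under stated assumptions: the function name overwrite_demo and the sample contents "1000\n5\n" are made up, and the file is opened in binary mode so that \n is a single byte on every platform.

```c
#include <stdio.h>

/* Creates `path` containing "1000\n5\n", overwrites the start of the file
   with "0\n", and copies the resulting bytes into `out` as a
   NUL-terminated string. Returns 0 on success, -1 on failure.
   overwrite_demo is a hypothetical name. */
int overwrite_demo(const char *path, char *out, size_t outsize)
{
    if (outsize == 0)
        return -1;

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    fputs("1000\n5\n", f);
    fclose(f);

    f = fopen(path, "r+b");
    if (f == NULL)
        return -1;
    fputs("0\n", f);  /* replaces only the first two bytes, '1' and '0' */
    fclose(f);

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t len = fread(out, 1, outsize - 1, f);
    out[len] = '\0';
    fclose(f);
    return 0;
}
```

The buffer ends up holding "0\n00\n5\n": the two leftover bytes of the old number form an extra line containing 00, and nothing after them has moved.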
The most comfortable way (in my opinion) to change text files is to create a new temporary file, copy the old one, line by line, with whatever changes you need, delete the old (or rename) and rename the temporary file.
Something like
char line[1000];
FILE *original, *temporar;
original = fopen("original", "r");
temporar = fopen("temporar", "w");
while (fgets(line, sizeof line, original)) {
processline(line);
fprintf(temporar, "%s", line);
}
fclose(temporar);
fclose(original);
unlink("original"); // or rename("original", "original.bak");
rename("temporar", "original");
Of course you need to validate all calls in real code.

Usage of fseek and feof

This code is used for reading a text file in reverse order, and it does so successfully, displaying the original content of the file and then the reversed content.
#include <stdio.h>
#include <stdlib.h>
int main() {
int count = 0, ch = 0;
FILE *fp;
if( (fp = fopen("file.txt", "r")) == NULL ) {
perror("fopen");
exit(EXIT_FAILURE);
}
printf("\tINPUT FILE\n");
printf("\n");
while(!feof(fp)) {
if((ch = getc(fp)) != EOF) {
printf("%c", ch);
count ++;
}
}
feof(fp);
printf("\n");
printf("\tREVERSED INPUT FILE\n");
printf("\n");
while(count) {
fseek(fp, -2, SEEK_CUR);
printf("%c", getc(fp));
count--;
}
printf("\n");
fclose(fp);
}
But when I replaced this piece of code
while(!feof(fp)) {
if((ch = getc(fp)) != EOF) {
printf("%c", ch);
count ++;
}
}
by
fseek (fp, 0, SEEK_END); or feof(fp);
Basically, I just went to the end of the file directly, without printing the original contents, and then tried printing the reversed content.
But then it does not print the reversed content either! It just displays a blank. Why is this happening?
NOTE: I call fseek(fp, -2, SEEK_CUR); (in the other while loop) because getc(fp) moves the position forward by one, so it has to be moved back by two each time; initially the position is at EOF.
What is happening here? Can anyone please explain?
It breaks because the second loop is while (count), and count is zero if you haven't read through the file first while incrementing it. You can use ftell to obtain the equivalent of count in this case.
P. S. feof(fp) only tests whether fp is at end-of-file, it does not make it seek to EOF, so the line feof(fp) basically does nothing since you aren't using the return value.
As @Arkku already showed, when you replace the while loop with fseek(fp, 0, SEEK_END), count is never incremented.
To fix this, you can call ftell after fseek; it returns the current position, which at the end of the file is the file length:
fseek(fp, 0, SEEK_END);
count = ftell(fp);
Now the file will be printed backwards.
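Putting the two answers together, a reversal that relies on ftell instead of a counting loop might look like the sketch below. The function name print_reversed is hypothetical. It seeks to each offset with SEEK_SET rather than stepping back by two with SEEK_CUR, so the last byte of the file is included and no seek goes before the start. It assumes a platform where seeking to SEEK_END on a binary stream is supported, which is the case on common systems.

```c
#include <stdio.h>

/* Writes the bytes of the file `path` to `out` in reverse order.
   Returns the number of bytes written, or -1 on failure.
   print_reversed is a hypothetical name. */
long print_reversed(const char *path, FILE *out)
{
    FILE *fp = fopen(path, "rb");  /* binary mode: ftell is a byte count */
    if (fp == NULL)
        return -1;

    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    long size = ftell(fp);  /* position at the end == file length */
    if (size < 0) {
        fclose(fp);
        return -1;
    }

    long written = 0;
    for (long pos = size - 1; pos >= 0; --pos) {
        if (fseek(fp, pos, SEEK_SET) != 0)  /* go to byte number pos */
            break;
        int ch = getc(fp);
        if (ch == EOF)
            break;
        putc(ch, out);
        ++written;
    }

    fclose(fp);
    return written;
}
```

Calling print_reversed("file.txt", stdout) prints the file backwards. One seek per byte is slow for large files; reading the whole file into memory first would be the usual alternative.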

writing in a line in c

I have a C project and a serious problem: I want to open a file and replace line number nb (nb is an int) with "*". This is my code; could someone help me, please? It shows me the word I want to replace, which means the file position is at the wanted line, but nothing happens.
#include <stdio.h>
int main( void )
{
FILE * f;
char ch[1024];
int i, nb;
i = 0;
scanf( "%d", &nb ) ;
f = fopen( "dict.txt", "r+t" );
while( i < nb )
{
fscanf( f, "%s", ch ) ;
i++;
}
printf( "%s", ch );
fprintf( f, "%s", "****" );
fclose( f );
}
You've opened the file for reading and writing. According to the MSDN man page for fopen (I am assuming from the r+t mode on the file that you are using Visual Studio):
When the "r+", "w+", or "a+" access type is specified, both reading and writing are allowed (the file is said to be open for "update"). However, when you switch from reading to writing, the input operation must encounter an EOF marker. If there is no EOF, you must use an intervening call to a file positioning function. The file positioning functions are fsetpos, fseek, and rewind.
Some other things to keep in mind:
When fscanf reads a string with %s, it reads only one word at a time, not a whole line. It is easier to read whole lines of input with fgets than with fscanf.
A file consists of a stream of bytes. If the line you want to replace is 47 characters long, then fprintf(f, "%s", "****") will only replace the first four bytes in the line.
That means that if you want to replace line #nb, you will need to read in the line, figure out how long it is, then seek back to the beginning of the line and print out the correct number of asterisks.
Try something like this instead:
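#include <stdio.h>
int main(void)
{
    FILE *f;
    char ch[1024];
    int i, nb;
    fpos_t beginning_of_line;
    i = 0;
    if (scanf("%d", &nb) != 1 || nb < 1)
        return 1;
    f = fopen("dict.txt", "r+t");
    if (f == NULL)
    {
        perror("fopen");
        return 1;
    }
    while (i < nb)
    {
        fgetpos(f, &beginning_of_line); /* remember where this line starts */
        if (fgets(ch, sizeof ch, f) == NULL) /* fewer than nb lines */
        {
            fclose(f);
            return 1;
        }
        i++;
    }
    /* Return to the beginning of the line. An fpos_t must be restored with
       fsetpos, not fseek; this call also permits the switch to writing. */
    fsetpos(f, &beginning_of_line);
    for (i = 0; ch[i] != '\n' && ch[i] != '\0'; ++i) {
        fputc('*', f); /* overwrite each character, keeping the newline */
    }
    fclose(f);
    return 0;
}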

Resources