Reading lines ahead in a file (in C)

I have a file that looks like this:
This is the first line in the file
This is the third line in the file
Where I have a blank line in the file (on line 2). I want to read the file line by line (which I do using fgets), but I also want to read ahead to check whether there is a blank line in the file.
However, my while loop around fgets has a break statement in it, because my function is only supposed to read one line of the file per function call.
So if I call the function:
func(file);
It would read the first line, then break.
If I called it again, it would read the second line then break, etc.
Because I have to implement it this way, it's hard to read ahead, is there any way I can accomplish this?
This is my code:
int main(void) {
FILE * file = fopen("test.txt","r");
if(file == NULL){perror("test.txt"); return EXIT_FAILURE;}
readALine(file);
}
void readALine(FILE * file) {
char buffer[1000];
while(fgets(buffer,sizeof(buffer),file) != NULL) {
//Read lines ahead to check if there is a line
//which is blank
break; //only read a line each FUNCTION CALL
}
}
So to clarify, if I WAS reading the entire file at once (Only one function call) it would go like this (Which is easy to implement).
int main(void) {
FILE * file = fopen("test.txt","r");
if(file == NULL){perror("test.txt"); return EXIT_FAILURE;}
readALine(file);
}
void readALine(FILE * file) {
char buffer[1000];
while(fgets(buffer,sizeof(buffer),file) != NULL) {
if(isspace(buffer[0])) {
printf("Blank line found\n");
}
}
}
But since I'm reading the file line by line (one line per function call), the second piece of code above wouldn't work (since I break after each line read, which I can't change).
Is there a way I could use fseek to accomplish this?

A while loop ending in an unconditional break is an if statement, so I don't really see why you are using a while loop. I'm also assuming you are not worried about a single line being longer than 1000 chars.
The continue statement jumps to the next iteration of the loop and checks the condition again.
void readALine(FILE * file) {
char buffer[1000];
while(fgets(buffer,sizeof(buffer),file) != NULL) {
if(isspace((unsigned char)buffer[0])) { //skip this line and read the next one
//isspace is true for ' ', '\t' and '\n' alike, so this also skips lines that merely begin with a space
continue; //run the same loop again
}
break;
}
//buffer contains the next line except for empty ones here...
}

You can "read ahead" by storing your position in the file (with position = ftell(your_file)), then reading the line. If it is a blank line, do whatever you have to do, and finally go back to the position you were at (with fseek(your_file, position, SEEK_SET)).
Hope this helps !
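The ftell/fseek idea above could be sketched roughly as follows. The helper name next_line_is_blank is hypothetical, and the sketch assumes a seekable stream (a regular file, not a pipe) and lines shorter than 1000 bytes.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: reports whether the NEXT line is blank
   without consuming it. Assumes a seekable stream. */
bool next_line_is_blank(FILE *file)
{
    char buffer[1000];
    long position = ftell(file);        /* remember where we are */
    if (position == -1L)
        return false;                   /* stream is not seekable */

    bool blank = false;
    if (fgets(buffer, sizeof buffer, file) != NULL)
        blank = (buffer[0] == '\n');    /* a blank line is just a newline */

    fseek(file, position, SEEK_SET);    /* go back to the saved position */
    return blank;
}
```

readALine could call this right after its own fgets, so each call still consumes exactly one line.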

The while loop in readALine reads lines until the end of the file. So it will skip blank lines, and all other lines.
You can return from within the loop if you've found a non-blank line:
while(fgets(buffer,sizeof(buffer),file) != NULL) {
if (buffer[0] != '\n')
return;
}
If you also want to skip lines that consist of nothing but spaces, you can write a function that does that check:
bool isNothingButWhitespace(char *s) {
while (*s == ' ' || *s == '\n')
s++;
return *s == '\0';
}
This will find the first character that's not whitespace. If it's the string terminator '\0' then it will return true (the string was nothing but whitespace), otherwise false (some non-whitespace character was found).
If the while loop in readALine completes due to it reaching the end of file, you need some way to signal that back to the caller. I recommend setting buffer[0] = '\0'.
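Putting the pieces of this answer together might look like the sketch below. The names read_non_blank_line and only_whitespace are hypothetical, and returning a bool in addition to setting buffer[0] = '\0' is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Same check as isNothingButWhitespace above, repeated so the
   sketch compiles on its own. */
static bool only_whitespace(const char *s)
{
    while (*s == ' ' || *s == '\n')
        s++;
    return *s == '\0';
}

/* Hypothetical variant of readALine: stores the next non-blank line
   in buffer. At end of file it sets buffer[0] = '\0' and returns false. */
bool read_non_blank_line(FILE *file, char *buffer, int size)
{
    while (fgets(buffer, size, file) != NULL) {
        if (!only_whitespace(buffer))
            return true;                /* found a line with content */
    }
    buffer[0] = '\0';                   /* signal end of file to the caller */
    return false;
}
```

The caller can then loop with while (read_non_blank_line(file, buffer, sizeof buffer)) and still get one line per call.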

Related

How to split text from file into words?

I'm trying to get the text from a file and split it into words by removing spaces and other symbols. This is part of my code for handling the file:
void addtext(char wordarray[M][N])
{
FILE *fileinput;
char word[N];
char filename[N];
char *pch;
int i=0;
printf("Input the name of the text file(ex \"test.txt\"):\n");
scanf("%19s", filename);
if ((fileinput = fopen(filename, "rt"))==NULL)
{
printf("cannot open file\n");
exit(EXIT_FAILURE);
}
fflush(stdin);
while (fgets(word, N, fileinput)!= NULL)
{
pch = strtok (word," ,'.`:-?");
while (pch != NULL)
{
strcpy(wordarray[i++], pch);
pch = strtok (NULL, " ,'.`:-?");
}
}
fclose(fileinput);
wordarray[i][0]='\0';
return ;
}
But here is the issue. When the text input from the file is:
Alice was beginning to get very tired of sitting by her sister on the bank.
Then the output when I try to print it is this:
Alice
was
beginning
to
get
very
tired
of
sitting
by
her
s
ister
on
the
bank
As you can see, the word "sister" is split into 2. This happens quite a few times when adding a bigger text file. What am I missing?
If you count the characters you'll see that the s is the 57th character. 57 is 3 times 19, and 19 is the number of characters read in each cycle: with N equal to 20, fgets reads at most N - 1 = 19 characters and uses the last slot of the buffer for the terminating null character.
As you are reading the line in batches of 19 characters, the line is cut at every multiple of 19 characters and the rest is read by the next fgets in the cycle.
The first two times you were lucky enough that the line was cut at a word boundary: character 19 is the end of beginning and character 38 is the space after tired. The third time the cut fell in the middle of sister, so it was split into two words.
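The chunking described above can be reproduced with a small sketch. It assumes N is 20 (inferred from the %19s in the question); split_into_chunks is a hypothetical helper that stores each piece fgets returns.

```c
#include <stdio.h>

#define N 20   /* assumed buffer size, inferred from "%19s" in the question */

/* Hypothetical helper: stores every piece that fgets hands back.
   Returns the number of pieces read (at most max_chunks). */
int split_into_chunks(FILE *in, char chunks[][N], int max_chunks)
{
    int count = 0;
    /* each call reads at most N - 1 = 19 characters */
    while (count < max_chunks && fgets(chunks[count], N, in) != NULL)
        count++;
    return count;
}
```

For the sample line this yields four pieces: "Alice was beginning", " to get very tired ", "of sitting by her s" and "ister on the bank." followed by the newline. The third piece ends in the middle of sister, which is exactly where strtok then sees two separate words.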
Two possible fixes:
Replace:
while (fgets(word, N, fileinput)!= NULL)
With:
while (fscanf(fileinput, "%19s", word) == 1)
This works provided that there are no words longer than 19 characters in the file, which is the case here.
Make word large enough to hold the whole line:
char word[80];
80 should be enough for the sample line.
What am I missing?
You are missing that a single fgets call reads at most N-1 characters from the file. Consequently, the buffer word may contain only the first part of a word. For instance, it seems that in your case the s from the word sister was read by one fgets call and the remaining part, i.e. ister, was read by the next fgets call. As a result, your code detected sister as two words.
So you need to add code that checks whether the end of the buffer is a whole word or only part of a word.
To start with you can increase N, but to make it work in general you must add code that checks the end of the word buffer.
Also notice that long words may require more than two fgets calls.
As a simple alternative to fgets and strtok, consider fread and a simple char-by-char parsing of the input.
Below is a simple, low-performance example of how it can be done.
#include <stdio.h>
// The two parser states used below
enum { LOOK_FOR_WORD, READING_WORD };
int isdelim(char c)
{
if (c == '\n') return 1;
if (c == ' ') return 1;
if (c == '.') return 1;
return 0;
}
void addtext(void)
{
FILE *fileinput;
char *filename = "test.txt";
if ((fileinput = fopen(filename, "rt"))==NULL)
{
printf("cannot open file\n");
return;
}
char c;
int state = LOOK_FOR_WORD;
while (fread(&c, 1, 1, fileinput) == 1)
{
if (state == LOOK_FOR_WORD)
{
if (isdelim(c))
{
// Nothing to do.. keep looking for next word
}
else
{
// A new word starts
putchar(c);
state = READING_WORD;
}
}
else
{
if (isdelim(c))
{
// Current word ended
putchar('\n');
state = LOOK_FOR_WORD;
}
else
{
// Current word continues
putchar(c);
}
}
}
fclose(fileinput);
return ;
}
To keep the code simple it prints the words using putchar instead of saving them in an array but that is quite easy to change.
Further, the code only reads one char at a time from the file. Again, it's quite easy to change the code to read bigger chunks from the file.
Likewise, you can add more delimiters to isdelim as you like (and improve the implementation).
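As a rough illustration of the change mentioned above, the same state-machine idea can store words in an array instead of printing them. This is only a sketch: read_words, is_delim, MAX_WORDS and MAX_LEN are hypothetical names and limits, and getc is swapped in for the one-byte fread.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 100   /* hypothetical limits for this sketch */
#define MAX_LEN   20

/* Nonzero if c separates words; extend the set as needed. */
static int is_delim(int c)
{
    return strchr(" \n\t,'.`:-?", c) != NULL;
}

/* Reads the stream one character at a time and stores each word in
   words[]. Returns the number of words stored. Words longer than
   MAX_LEN - 1 characters are truncated. */
int read_words(FILE *in, char words[][MAX_LEN], int max_words)
{
    int count = 0;   /* words completed so far */
    int len = 0;     /* length of the word being built */
    int c;

    while (count < max_words && (c = getc(in)) != EOF) {
        if (!is_delim(c)) {
            if (len < MAX_LEN - 1)
                words[count][len++] = (char)c;   /* word starts or continues */
        } else if (len > 0) {
            words[count++][len] = '\0';          /* current word ended */
            len = 0;
        }
    }
    if (len > 0 && count < max_words)            /* last word had no trailing delimiter */
        words[count++][len] = '\0';
    return count;
}
```

Because words are assembled character by character, a word is never split by a buffer boundary the way it is with a short fgets buffer.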

Remove empty new line at the end of a file

I have a file that's storing strings from the user's input (stdin)
However there are 2 situations
If I read it normally, my file will have an empty line at its end due to the newline from the last string the user introduced.
If I remove the \n from the input string, the file stores all strings in the same line, which is not wanted.
How can I simply remove that newline from the end of the file?
I can edit and provide some of my code if required.
EDIT: Let's say the last line of a file I already have is "cards"
when the cursor is in front of "cards", if I press the down arrow it doesn't go on to the next line, while in this case it can happen once.
For my code to function perfectly I can't let that happen,
Here's an example of what I have:
f=fopen(somefile, "w");
do
{
fgets(hobby, 50, stdin);
fprintf(f, "%s", hobby);
} while(strcmp(hobby,"\n") != 0);
The newline character[1] at the end of the file is part of the last line. Removing it is possible but makes the last line incomplete, which will break many programs. For example, concatenating another file at the end of this file will cause the last line to be merged with the first line of the concatenated file.
As Yuri Laguardia commented, if you later reopen this file in append mode to write more lines, the first one added would be concatenated to the end of this last incomplete line. That is probably not the intended behavior.
If you do not want the file to contain empty lines, check user input before writing the line into the file:
void input_file_contents(FILE *fp) {
char userinput[80];
printf("enter file contents:\n");
while (fgets(userinput, sizeof userinput, stdin)) {
if (*userinput != '\n') {
fputs(userinput, fp);
}
}
}
EDIT:
Your code does not test for termination at the right place: you write the empty line before the test. Do not use a do / while loop:
f = fopen(somefile, "w");
if (f != NULL) {
/* read lines until end of file or empty line */
while (fgets(hobby, 50, stdin) != NULL && *hobby != '\n') {
fputs(hobby, f);
}
}
[1] The newline character is actually a pair of bytes <CR><LF> on legacy systems.

skip to next line of file ignoring content

Hi, so I have a program where, if there is a # at the beginning of the first line of the text file, that line needs to be ignored. How do you jump to the next line of the file, ignoring everything after the #?
for example:
#1234
5
I want to print 5 and the rest to be ignored.
I only managed to skip the # if there is nothing behind it
while (a == '#' || a == '\r'|| a == '\n') {
fscanf(inp, "%c", &a);
}
As for your previous question, if you want to ignore comment lines with an initial #, it is highly recommended to read the file line by line with fgets() and to handle non-comment lines directly while ignoring comment lines.
It is actually non trivial to do it with fscanf because depending on your format lines, the linefeed may or may not have been consumed.
If you are at the start of a line and want to read the next char while ignoring the comment lines, do this:
int c; // Must be int to accommodate for EOF.
while ((c = getc(inp)) == '#') {
while ((c = getc(inp)) != EOF && c != '\n')
continue;
}
// Here c contains the first char from a non comment line or EOF.
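The line-by-line approach recommended at the start of this answer could look roughly like this. next_data_line is a hypothetical helper, and the sketch assumes every line, including comment lines, fits in the buffer.

```c
#include <stdio.h>

/* Hypothetical helper: copies the next line that does not start
   with '#' into line. Returns NULL at end of file.
   Assumes each line fits in size - 1 characters. */
char *next_data_line(char *line, int size, FILE *inp)
{
    while (fgets(line, size, inp) != NULL) {
        if (line[0] != '#')
            return line;     /* a non-comment line */
        /* comment line: ignore it and read the next one */
    }
    return NULL;
}
```

One caveat: a comment line longer than the buffer is returned in pieces, so its tail would be mistaken for data. Checking for the trailing '\n' before accepting a line would close that gap.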
Instead of
while (a == '#' || a == '\r'|| a == '\n') {
fscanf(inp,"%c",&a);
}
Try (pseudo code):
If FirstChar == '#'
Loop/scan until '\n'
On nextline here
If you want to use fscanf().
If better performance is needed, work on buffers directly.

go to a line and change to uppercase

I'm trying to do this assignment. Basically there's a file with 10 lines...using argv[] the user enters the filename in location 1 and line number is location 2 of the array.
I got everything working so far...checking file...counting line numbers and so on.
What would you guys suggest that I do to change the characters on that line to upper. I'm lost with how to do it. We can only use lseek, open, write, read and close commands.
My logic was ... if the user enters 5 for line number to change....in program I count the line numbers....when the counter hits 4..anything after that is line 5...up until \n.
The counter increments on each \n it comes across.
int line;
int counter = 0;
char c;
do
{
line = read(fd, &c, 1);
if (c == '\n')
{
counter++;
}
if (lnum == counter)
{
}
} while (line != 0);
You are only going to change the line while keeping its size the same, so you can overwrite it in place (no need to rewrite the whole file). Since you have already found a way to read the lines, you know the position where your line starts (in number of bytes, thanks to the read function).
So you read the line you have to change to uppercase, move the file offset back to the beginning of the line (using lseek) and then write the line you read back with the changes you want.
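A minimal sketch of this idea, using only open, read, lseek, write and close. It is a byte-at-a-time variant of the steps above: instead of buffering the whole line, it steps back one byte and overwrites each lowercase letter as it is found. The function names are hypothetical, and the letter conversion assumes ASCII.

```c
#include <fcntl.h>
#include <unistd.h>

/* Uppercases line number lnum (1-based) in place on fd, which must
   be open for reading and writing. Returns 0 on success, -1 if the
   line does not exist or an I/O call fails. Assumes ASCII text. */
int uppercase_line(int fd, int lnum)
{
    char c;
    int current = 1;     /* line the next byte belongs to */
    int touched = 0;     /* set once a byte of the target line is read */

    while (read(fd, &c, 1) == 1) {
        if (current == lnum) {
            touched = 1;
            if (c == '\n')
                return 0;                            /* end of the target line */
            if (c >= 'a' && c <= 'z') {
                char up = (char)(c - 'a' + 'A');
                if (lseek(fd, -1, SEEK_CUR) == (off_t)-1)
                    return -1;                       /* step back over the byte */
                if (write(fd, &up, 1) != 1)
                    return -1;                       /* overwrite it in place */
            }
        } else if (c == '\n') {
            current++;
        }
    }
    return touched ? 0 : -1;
}

/* Convenience wrapper: opens path, converts the line, closes the file. */
int uppercase_line_in_file(const char *path, int lnum)
{
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;
    int result = uppercase_line(fd, lnum);
    close(fd);
    return result;
}
```

Because the line keeps its length, nothing after it in the file has to move.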

How do I extract a specific numbered line from a text file? (C)

I am trying to write a function that prints a specific line from a text file based on the number given. For example, let's say the file contains the following:
1 hello1 one
2 hello2 two
3 hello3 three
If the number given is '3', the function will output "hello3 three". If the number given is '1', the function output will be "hello1 one".
I am very new to C but here is my logic so far.
I imagine first thing is first, I need to find the character 'number' inside the file. Then what? How do I go about writing the line out without including the number? How do I even find the 'number'? I am sure it's very simple but I have no idea how to do this. Here is what I have so far:
void readNumberedLine(char *number)
{
int size = 1024;
char *buffer = malloc(size);
char *line;
FILE *fp;
fp = fopen("xxxxx.txt", "r");
while(fp != NULL && fgets(buffer, sizeof(buffer), fp) != NULL)
{
if(line = strstr(buffer, number))
//here is where I am confused as to what to do.
}
if (fp != NULL)
{
fclose(fp);
}
}
Any help at all would be greatly appreciated.
From what you are saying, you are looking for lines tagged with a number at the beginning of the line. In that case you want a function that reads a line with a given tag prefix:
bool readTaggedLine(char* filename, char* tag, char* result)
{
FILE *f;
f = fopen(filename, "r");
if(f == NULL) return false;
while(fgets(result, 1024, f))
{
if(strncmp(tag, result, strlen(tag))==0)
{
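// memmove rather than strcpy, because source and destination overlap
memmove(result, result+strlen(tag)+1, strlen(result+strlen(tag)+1)+1);
fclose(f);
return true;
}
}
fclose(f);
return false;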
}
then use it like
char result[3000];
if(readTaggedLine("blah.txt", "3", result))
{
printf("%s\r\n", result);
}
else
{
printf("Could not find the desired line\r\n");
}
I would try the following.
Approach 1:
Read and throw away (n - 1) lines
// Consider using readline(), see reference below
line = readline() // one more time
return line
Approach 2:
Read block by block and count newline characters ('\n').
Keep reading and throwing away for the first (n - 1) '\n's
Read characters till next '\n' and accumulate them into line
return line
readline(): Reading one line at a time in C
P.S. Following is a shell solution, it may be used to unit test the C program.
// Display 42nd line of file foo
$ head --lines 42 foo | tail -1
// (head displays lines 1-42, and tail displays the last of them)
You can use an additional variable to record how many lines you have read. Then, in the while loop, compare that counter with your input value; if they are equal, output the buffer.
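The counter idea can be sketched as follows. read_nth_line is a hypothetical helper; it assumes lines are numbered from 1 and that every line fits in the buffer.

```c
#include <stdio.h>

/* Hypothetical helper: copies line number target (1-based) into
   buffer. Returns 1 if the line was found, 0 if the file has fewer
   lines. Assumes every line fits in size - 1 characters. */
int read_nth_line(FILE *fp, int target, char *buffer, int size)
{
    int current = 0;                 /* number of lines read so far */
    while (fgets(buffer, size, fp) != NULL) {
        if (++current == target)
            return 1;                /* buffer now holds the wanted line */
    }
    return 0;
}
```

Note that size here is the real capacity of the buffer. With a malloc'd buffer, sizeof(buffer) gives the size of the pointer, not of the allocation, so the allocated size should be passed instead.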
