How to read blank lines using %[^\n]s? - c

I have a program where
fscanf(fp,"%[^\n]s",line);
is used for reading a line.
If I put in a while loop,
while(!feof(fp))
fscanf(fp,"%[^\n]s",line);
the above code works for the first line, but for all the rest I get
an empty string ( line = "" ).
My file contains many lines, including many blank lines. How can I make the above code work?

First, the conversion specifier would be %[^\n] (no s on the end).
Secondly, you don't want to use the %[ conversion specifier without an explicit size; otherwise you run the risk of a buffer overflow:
char line[132];
...
fscanf(fp, "%131[^\n]", line);
Third, this will leave the newline in the input stream, potentially fouling up the next read.
Finally, you don't want to use feof as your loop condition, since it won't return true until after you try to read past EOF, causing your loop to execute one too many times.
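If you would rather stay with fscanf, the four points above could be combined roughly as follows. This is only a sketch: read_line_scanset is a hypothetical helper name, and the 132-byte buffer size is carried over from the snippet above.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line (without its newline) into buf.
   Returns 1 if a line was read, even an empty one, and 0 at end of file. */
int read_line_scanset(FILE *fp, char buf[132])
{
    int rc = fscanf(fp, "%131[^\n]", buf);
    if (rc == EOF)
        return 0;              /* nothing left to read */
    if (rc == 0)
        buf[0] = '\0';         /* blank line: the scanset matched nothing */

    int c = fgetc(fp);         /* consume the newline that %[^\n] leaves behind */
    if (c != '\n' && c != EOF)
        ungetc(c, fp);         /* line longer than 131 bytes: keep the rest for later */
    return 1;
}
```

The loop then becomes while (read_line_scanset(fp, line)) { ... }, which tests the result of the read itself instead of feof.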
Frankly, I think the better option is to use fgets(); it will read everything up to and including the next newline or one less than the specified size. IOW, if line is sized to hold 20 characters and the input line has 80 characters (including the newline), fgets will read 19 characters and append the 0 terminator into line. If the input line is 10 characters, it will read the whole input line into line (including the newline).
fgets will return NULL on EOF or error, so you should structure your loop as
while (fgets(line, sizeof line, fp))
{
// do something with line
}
if (feof(fp))
{
// hit end of file
}
else
{
// error on read
}
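To show that this loop copes with blank lines, here is a minimal sketch. count_lines is a hypothetical helper, and it assumes every line fits in the 256-byte buffer; a longer line would be counted in several pieces.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the number of lines in fp and stores
   the number of blank ones in *blank. Assumes lines shorter than 256 bytes. */
size_t count_lines(FILE *fp, size_t *blank)
{
    char line[256];
    size_t total = 0;

    *blank = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline, if any */
        if (line[0] == '\0')
            ++*blank;                       /* an empty line is still a line */
        ++total;
    }
    return total;
}
```

A blank line arrives as the string "\n", so unlike the %[^\n] version nothing stalls on it.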

I'm pretty sure that \n is not allowed within a scanset. It would be helpful if you state what you're trying to code there.
If you want to read in entire lines, I'd strongly suggest you go for the fgets() library routine. With fgets() you can state the maximum number of characters to read, so you are able to avoid buffer overflows.

Related

About the mechanism of using fgets() and stdin together

I would like to have a better understanding of using fgets() and stdin.
The following is my code:
int main()
{
char inputBuff[6];
while(fgets(inputBuff, 6, stdin))
{
printf("%s", inputBuff);
}
return 0;
}
Let's say my input is aaaabbbb and I press Enter. By using a loopcount, I understand that actually the loop will run twice (including the one I input aaaabbbb) before my next input.
Loop 1: After I have typed in the characters, aaaabbbb\n will be stored in the buffer of stdin file stream. And fgets() is going to retrieve a specific number of data from the file stream and put them in inputBuff. In this case, it will retrieve 5 (6 - 1) characters at a time. So that when fgets() has already run once, inputBuff will store aaaab, and then be printed.
Loop 2: Then, since bbb\n are left in the file stream, fgets() will execute for the second time so that inputBuff contains bbb\n, and then be printed.
Loop 3: The program will ask for my input (the 2nd time) as the file stream has reached the end (EOF).
Question: It seems that fgets() will only ask for my keyboard input after stdin stream has no data left in buffer (EOF). I am just wondering why couldn't I use keyboard to input anything in loop 2, and fgets() just keep on retrieving 5 characters from stdin stream and left the excess data in the file stream for next time retrieval. Do I have any misunderstanding about stdin or fgets()? Thank you for your time!
The behavior of your program is somewhat more subtle than you expect:
fgets(inputBuff, 6, stdin) reads at most 5 bytes from stdin and stops reading when it gets a newline character, which is stored into the destination array.
Hence, as you correctly diagnose, the first call reads the 5 bytes aaaab and prints them, the second call reads the 4 bytes bbb\n and prints them, and then the third call finds an empty input stream and waits for user input.
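The chunking can be reproduced without a keyboard by reading from any stream. The following is a sketch only; split_chunks is a hypothetical helper that stores each piece fgets() returns when given a 6-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: stores up to max chunks, each read with a 6-byte
   buffer, and returns how many fgets() calls succeeded. */
int split_chunks(FILE *fp, char chunks[][6], int max)
{
    int n = 0;
    while (n < max && fgets(chunks[n], 6, fp) != NULL)
        ++n;
    return n;
}
```

Fed a stream containing aaaabbbb\n, it returns 2, with chunks[0] holding aaaab and chunks[1] holding bbb\n.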
The tricky part is how stdin gets input from the user, also known as console input.
Both console input and stdin are usually line buffered by default, so you can type a complete line of input regardless of the size of the buffer passed to fgets(). If, however, you set stdin to unbuffered and the console to raw (uncooked) mode, the first fgets() would indeed return the first 5 bytes as soon as you type them.
Console input is an intricate subject. Here is an in depth article about its inner workings: https://www.linusakesson.net/programming/tty/
Everything you are asking is covered in the manual page of fgets(); you just need to read it carefully. It says:
char *fgets(char *s, int size, FILE *stream);
fgets() reads in at most one less than size characters
from stream and stores them into the buffer pointed to by s. Reading
stops after an EOF or a newline. If a newline is read, it is
stored into the buffer. A terminating null byte ('\0') is stored
after the last character in the buffer.
If the input is aaaabbbb and the second argument to fgets() is 6, it reads at most 5 characters and appends the terminating \0, so the first time inputBuff holds aaaab. Since neither EOF nor \n has been reached yet, the next call stores bbb\n in inputBuff, newline included.
You should also check the return value of fgets(), and break out of the loop when the chunk just read starts with \n (for instance, an empty line). For example:
char *ptr = NULL;
while( (ptr = fgets(inputBuff, 6, stdin))!= NULL){
if(*ptr == '\n')
break;
printf("%s", inputBuff);
}
fgets() only reads until either '\n' or EOF (or until the buffer is full). Everything after that is left in stdin and will therefore be read when you call fgets() again. You can, however, remove the excess characters from stdin by, for example, calling getc() until you reach '\n' or EOF. You might want to look at the man pages for that.
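Discarding the excess could look something like this. It is a sketch under assumptions: read_truncated is a hypothetical helper that keeps the start of each line and throws the rest away.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the start of one line into buf (size bytes),
   strips the newline, and discards whatever did not fit.
   Returns 1 if something was read and 0 on EOF or error. */
int read_truncated(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;

    char *nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';                 /* the whole line fit: just drop the newline */
    } else {
        int c;                      /* int, not char, so that EOF is representable */
        while ((c = getc(fp)) != '\n' && c != EOF)
            ;                       /* skip the remainder of the over-long line */
    }
    return 1;
}
```

With a 6-byte buffer and the input aaaabbbb\n, the first call yields aaaab and the bbb\n tail is gone before the next call.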

What are options scanf vs gets vs fgets?

I have the following code
while ( a != 5)
scanf("%s", buffer);
This works well, but it does not accept spaces between words; in other words, scanf stops scanning at the first space.
If I use this
while( a != 5)
scanf("%[^\n]", buffer);
It works only once, which is bad.
I never use gets() because I know how nasty it is.
My last option is this
while( a != 5)
fgets(buffer, sizeof(buffer), stdin);
So my questions are
Why the second command is not working inside the loop?
What are the other options I have to scan a string with spaces?
"%[^\n]" will attempt to scan everything until a newline. The next character in the input would be the \n so you should skip over it to get to the next line.
Try: "%[^\n]%*c", the %*c will discard the next character, which is the newline char.
Why the second command is not working inside the loop
Because the first time you scan up to \n, the \n remains in the input buffer. You need to eat up (in other words, discard) the stored newline from the buffer. You can use while (getchar() != '\n'); to get that job done (in real code, stop on EOF as well).
What are the other options I have to scan a string with spaces?
Well, you're almost there. You need to use fgets(). Using this, you can
Be safe from buffer overrun (Overcome limitation of gets())
Input strings with spaces (Overcome limitation of %s)
However, please keep in mind, fgets() reads and stores the trailing newline, so you may want to get rid of it and you have to do that yourself, manually.
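One common way to drop that trailing newline is strcspn. This is a small sketch; strip_newline is a hypothetical helper name.

```c
#include <string.h>

/* Hypothetical helper: cuts s at its first newline, if it has one,
   and returns s for convenience. */
char *strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';   /* strcspn gives the index of '\n', or of the terminator */
    return s;
}
```

Because strcspn returns the string length when no newline is present, the assignment is harmless for input that was truncated or that ended at EOF without a newline.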

fgets() didn't print the first line

fgets() is not printing the first line of the opened file.
This is the code I have written:
#include <stdio.h>
int main(int argc, char* argv[])
{
float num;
char const* const filename=argv[1];
FILE* file=fopen(filename,"r");
char line[256];
int j=0;
if(file!=NULL)
{
while(fgets(line,sizeof(line),file)!=NULL){
for(j=0; j<2;j++)
{
if(j==0)
{
fscanf(file,"%f",&num);
printf("%f \t",num);
}
else if(j==1)
{
fscanf(file,"%f",&num);
printf("%f \n",num);
}
}
}
fclose(file);
}
}
I get the output as,
650.000000 699.000000
99.000000 132.000000
150.000000 272.000000
128.000000 291.000000
302.000000 331.000000
95.000000 199.000000
instead of,
130.000000 186.000000
650.000000 699.000000
99.000000 132.000000
150.000000 272.000000
128.000000 291.000000
302.000000 331.000000
95.0000000 199.000000
My problem is that I don't get the first line. Please help me solve this.
Your fgets reads the first line into line, which you never use. Then you proceed to read numbers directly from the file (not from line) using fscanf. The further calls to fgets basically consume the newlines from the file. Try switching fscanf to sscanf and read from line instead of file, or alternatively get rid of the fgets entirely and exit from the loop based on the return value of fscanf.
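The sscanf variant suggested above might look like this. It is a sketch: read_pairs is a hypothetical helper, and it assumes each line holds two numbers and fits in 256 bytes.

```c
#include <stdio.h>

/* Hypothetical helper: reads one "x y" pair per line into pairs[],
   up to max_pairs of them, and returns how many were stored. */
int read_pairs(FILE *file, float pairs[][2], int max_pairs)
{
    char line[256];
    int n = 0;

    while (n < max_pairs && fgets(line, sizeof line, file) != NULL) {
        /* Parse from the line just read, not from the file. */
        if (sscanf(line, "%f %f", &pairs[n][0], &pairs[n][1]) == 2)
            ++n;                /* lines without two numbers are skipped */
    }
    return n;
}
```

Since every line passes through fgets exactly once and is parsed from memory, the first line is no longer lost.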
fgets() doesn't print anything, by definition. It only reads input.
In any event, your problem is that - each time through the while loop, the actions are one call of fgets() (the while loop condition) followed by two calls of fscanf(). Only the results read using fscanf() are output.
The first time through the loop, the data read by the first call of fgets() is simply discarded, so the first line will not be output at all.
On subsequent iterations, the only reason output isn't being skipped is that you're getting lucky. fscanf() - with your format string - stops reading when it encounters a whitespace character, but leaves that character waiting to be read from the file. If that is a newline character, it will be encountered by the next call of fgets(), which will return immediately (effectively discarding the newline).
The real problem is in how fgets() and fscanf(), when they are interleaved with each other to read the same file, interact with each other (in this case, because they handle newlines in ways that affect each other in unwanted ways). The usual guideline, in order to avoid such interactions, is to avoid mixing styles of input on the same file handle i.e. don't mix line-oriented input (e.g. fgets()) with formatted input (e.g. fscanf()) nor with character-oriented input (e.g. fgetc()) on the same file handle.
Incidentally, with your code as it stands, you will get even stranger behaviour if you change your input file slightly (e.g. merge two lines into one, add blank lines, etc). More data will be discarded, but the patterns of what is discarded will be a bit harder to pick.

Difference between fgets and fscanf?

I have a question concerning fgets and fscanf in C. What exactly is the difference between these two? For example:
char str[10];
while(fgets(str,10,ptr))
{
counter++;
...
and the second example:
char str[10];
while(fscanf(ptr,"%s",str))
{
counter++;
...
when having a text file which contains strings which are separated by an empty space, for example: AB1234 AC5423 AS1433. In the first example the "counter" in the while loop will not give the same output as in the second example. When changing the "10" in the fgets function the counter will always give different results. What is the reason for this?
Can somebody please also explain what the fscanf exactly does, how long is the string in each while loop?
The function fgets reads until a newline (and also stores it). fscanf with the %s specifier reads until any whitespace and doesn't store it...
As a side note, you're not specifying the size of the buffer in scanf and it's unsafe. Try:
fscanf(ptr, "%9s", str)
fgets reads to a newline. fscanf only reads up to whitespace.
In your example, fgets will read up to a maximum of 9 characters from the input stream and save them to str, along with a 0 terminator. It will not skip leading whitespace. It will stop if it sees a newline (which will be saved to str) or EOF before the maximum number of characters.
fscanf with the %s conversion specifier will skip any leading whitespace, then read all non-whitespace characters, saving them to str followed by a 0 terminator. It will stop reading at the next whitespace character or EOF. Without an explicit field width, it will read as many non-whitespace characters as are in the stream, potentially overrunning the target buffer.
So, imagine the input stream looks like this: "\t abcdef\n<EOF>". If you used fgets to read it, str would contain "\t abcdef\n\0". If you used fscanf, str would contain "abcdef\0" (where \0 indicates the 0 terminator).
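The two counting loops from the question can be compared side by side. This is only a sketch: count_fgets_calls and count_words are hypothetical names, and the fscanf loop tests for == 1 because fscanf returns EOF, which is nonzero, at end of input.

```c
#include <stdio.h>

/* Counts the fgets() calls needed to consume fp when each call may store
   at most size bytes, terminator included. Assumes 2 <= size <= 64. */
int count_fgets_calls(FILE *fp, int size)
{
    char str[64];
    int counter = 0;
    while (fgets(str, size, fp) != NULL)
        ++counter;
    return counter;
}

/* Counts successful %9s conversions; a word longer than 9 characters
   would be counted in pieces. */
int count_words(FILE *fp)
{
    char str[10];
    int counter = 0;
    while (fscanf(fp, "%9s", str) == 1)
        ++counter;
    return counter;
}
```

For a file holding AB1234 AC5423 AS1433 and a newline (21 bytes), count_fgets_calls gives 3 with size 10 and 6 with size 5, while count_words gives 3 regardless of where the chunks fall.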
fgets reads the whole line. fscanf with %s reads a string delimited by whitespace (space, \n, \t, etc.).
In any case, you should not use them unless you are sure that the array you read into is big enough to hold the input.
You wrote: When changing the "10" in the fgets function the counter will always give different results. Note that fgets and fscanf don't know how many bytes they may store; you have to tell them. The "10" is the limit you give fgets: it reads at most 9 characters per call, so changing it changes how many calls it takes to consume the same input.

how to read scanf with spaces

I'm having a weird problem.
I'm trying to read a string from the console with scanf()
like this:
scanf("%[^\n]",string1);
but it doesn't read anything; it just skips the entire scanf.
I'm using the gcc compiler.
Trying to use scanf to read strings with spaces can bring unnecessary problems of buffer overflow and stray newlines staying in the input buffer to be read later. gets() is often suggested as a solution to this, however,
From the manpage:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many characters
gets() will read, and because gets()
will continue to store characters past
the end of the buffer, it is extremely
dangerous to use. It has been used to
break computer security. Use fgets()
instead.
So instead of using gets, use fgets with the stdin stream to read strings from the keyboard.
That should work fine, so something else is going wrong. As hobbs suggests, you might have a newline on the input, in which case this won't match anything. It also won't consume a newline, so if you do this in a loop, the first call will get up to the newline and then the next call will get nothing. If you want to read the newline, you need another call, or use a space in the format string to skip whitespace. It's also a good idea to check the return value of scanf to see if it actually matched any format specifiers.
Also, you probably want to specify a maximum length in order to avoid overflowing the buffer. So you want something like:
char buffer[100];
if (scanf(" %99[^\n]", buffer) == 1) {
/* read something into buffer */
}
This will skip (ignore) any blank lines and whitespace on the beginning of a line and read up to 99 characters of input up to and not including a newline. Trailing or embedded whitespace will not be skipped, only leading whitespace.
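The same format can be tried on any stream with fscanf. This is a minimal sketch in which next_nonblank is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: skips blank lines and leading whitespace, then reads
   up to 99 characters of the next line into buf (newline not stored).
   Returns 1 on success and 0 when no further text is available. */
int next_nonblank(FILE *fp, char buf[100])
{
    /* The leading space in the format consumes any run of whitespace,
       including the newline left over from the previous call. */
    return fscanf(fp, " %99[^\n]", buf) == 1;
}
```

Note the trade-off: a line consisting only of spaces is skipped along with the blank ones, so this approach cannot report empty lines to the caller.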
I'll bet your scanf call is inside a loop. I'll bet it works the first time you call it. I'll bet it only fails on the second and later times.
The first time, it will read until it reaches a newline character. The newline character will remain unread. (Odds are that the library internally does read it and calls ungetc to unread it, but that doesn't matter, because from your program's point of view the newline is unread.)
The second time, it will read until it reaches a newline character. That newline character is still waiting at the front of the line and scanf will read all 0 of the characters that are waiting ahead of it.
The third time ... the same.
You probably want this:
if (scanf("%99[^\n]%*c", buffer) == 1) {
Edit: I accidentally copied and pasted from another answer instead of from the question, before inserting the %*c as intended. This resulting line of code will behave strangely if you have a line of input longer than 100 bytes, because the %*c will eat an ordinary byte instead of the newline.
However, notice how dangerous it would be to do this:
scanf("%[^\n]%*c", string1);
because there, if you have a line of input longer than your buffer, the input will walk all over your other variables and stack and everything. This is called buffer overflow (even if the overflow goes onto the stack).
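#include <stdio.h>
#include <stdlib.h>
/* Builds a scanf format such as "%9[^\n]" at run time; the caller frees it. */
char *text(int n);
int main(void)
{
char str[10];
char *fmt = text(9); /* field width 9 leaves room for the terminator in str[10] */
printf("enter username : ");
fflush(stdout); /* make sure the prompt is shown before reading */
if (fmt != NULL && scanf(fmt, str) == 1)
printf("username = %s\n", str);
free(fmt);
return 0;
}
char *text(int n)
{
/* snprintf replaces the nonstandard itoa(); fflush(stdin) is undefined and was dropped */
char *s = malloc(32);
if (s == NULL) return NULL;
// n == -1 no buffer protection (unsafe)
if (n != -1) snprintf(s, 32, "%%%d[^\n]", n);
else snprintf(s, 32, "%%[^\n]");
return s;
}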
