Conditionally skipping entire line with just fscanf - c

I need to skip an entire line (a comment) if its first character is #.
Some solutions in other posts suggested fgets, but in my case fscanf is the preferred option as I need to parse each word one by one later. How can this be done with just fscanf?
Thank you.
File to be read
#This line is a comment <== skip this entire line
BEGIN {
THIS IS A WORD
}
CODE
void read_file(FILE *file_pointer, Program *p)
{
char buffer[FILESIZE];
int count = 0;
while (fscanf(file_pointer, "%s", buffer) != EOF)
{
if (buffer[0] == '#')
{
continue; /* need to skip until the end of the line */
}
else
{
strcpy(p->wds[count++], buffer);
}
}
}

How can this be done with just fscanf?
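fscanf(stdin, "%*[^\n]"); will read and discard all characters up to, but not including, the next new-line character, or until an error or end-of-file occurs. For the code in the question, pass file_pointer instead of stdin.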
* says to suppress assignment of the data being read.
By default, [ starts a list of characters to accept. However, ^ negates that; [^\n] says to accept all characters other than a new-line character.
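Putting this together with the loop from the question, a minimal sketch might look like the following. The function name read_words, the MAX_WORD limit and the words array are assumptions made for illustration; they stand in for the Program struct, whose definition is not shown in the question.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORD 32   /* assumed maximum word length, including the '\0' */

/* Reads whitespace-separated words from fp into words[], skipping the
   rest of the line whenever a word starts with '#'.
   Returns the number of words stored. */
int read_words(FILE *fp, char words[][MAX_WORD], int max_words)
{
    char buffer[MAX_WORD];
    int count = 0;

    while (count < max_words && fscanf(fp, "%31s", buffer) == 1) {
        if (buffer[0] == '#') {
            /* Discard everything up to, but not including, the newline.
               The next %31s skips that newline as leading whitespace. */
            if (fscanf(fp, "%*[^\n]") == EOF)
                break;
        } else {
            strcpy(words[count++], buffer);
        }
    }
    return count;
}
```

Note that this treats any word starting with # as the start of a comment, not only the first word on a line, because %s gives no information about where a line began.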

Some solutions in other posts suggested fgets but in my case fscanf is the preferred option as I need to parse each word 1 by 1 later. How can this be done with just fscanf ?
I recommend that you use fgets to read a whole line and then you can use sscanf instead of fscanf to read that line word by word.
#include <stdio.h>
void read_file( FILE *file_pointer )
{
char line[100];
while ( fgets( line, sizeof line, file_pointer) != NULL )
{
char *p = line;
char word[50];
int chars_read;
//skip line if it starts with "#"
if ( line[0] == '#' )
{
continue;
}
//read all words on the line one by one
while ( sscanf( p, "%49s%n", word, &chars_read ) == 1 )
{
//do something with the word
printf( "Found word: %s\n", word );
//make p point past the end of the word
p += chars_read;
}
}
}
int main( void )
{
//this function can also be called with an opened file,
//however for simplicity, I will simply pass "stdin"
read_file( stdin );
}
With the input
This is a test.
#This line should be ignored.
This is another test.
this program has the following output:
Found word: This
Found word: is
Found word: a
Found word: test.
Found word: This
Found word: is
Found word: another
Found word: test.
As you can see, the line starting with the # was successfully skipped.

Use fgets() after you've read the first word to read the rest of the line.
while (fscanf(file_pointer, "%s", buffer) != EOF)
{
if (buffer[0] == '#')
{
fgets(buffer, sizeof buffer, file_pointer); /* buffer first, then size, then stream */
}
else
{
strcpy(p->wds[count++], buffer);
}
}

Related

How can I do to finish a program with a read of a String?

I need to read pairs of strings and then finish the program. For example:
INPUT
fxy yxf
abc bac
weq qew
abg bga
acd adc
abt bta
poeq eopq
qwte wtqe
I want to finish the program after the word "wtqe".
Here is my code:
int main(){
do {
scanf("%s %s", &str1, &str2);
}while(scanf("%c") != EOF);
return 0;
}
Use fgets to read a line and sscanf to parse the two strings. If a blank line is entered or a line with only one string, the while loop will exit.
#include <stdio.h>
int main (void) {
char line[999] = "";
char str1[100] = "";
char str2[100] = "";
int result = 0;
do {
if ( fgets ( line, sizeof line, stdin)) {
result = sscanf ( line, "%99s%99s", str1, str2);
}
else {//fgets failed
break;
}
} while ( line[0] != '\n' && result == 2);
return 0;
}
The end of the loop is while(scanf("%c") != EOF)
This means that the loop quits when you reach end-of-file.
If you put all of those in a text file, and ran your program with that file as standard input (./prog < file), then it will quit after it finishes reading the file.
If you want to type all of those in, you need to indicate end-of-file to the terminal. On Linux, Mac, and similar systems, you do this with Ctrl+D; on Windows it's Ctrl+Z followed by Enter.
Edit to add: This will give strange results if the very start of the input doesn't match. A better option might be
while (scanf("%s %s", str1, str2) == 2) { // str1 and str2 are char arrays, so no & is needed
// Do stuff
}
As shown by user3121023's answer, this can also cause buffer overflow issues, and you should give an explicit size to %s.
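A minimal sketch of that loop with explicit field widths could look like this. The function name count_pairs and the 100-byte buffers are assumptions; the question does not show how str1 and str2 are declared.

```c
#include <stdio.h>

/* Reads pairs of words from fp until a complete pair cannot be read.
   Returns the number of complete pairs. */
int count_pairs(FILE *fp)
{
    char str1[100], str2[100];   /* sizes are an assumption */
    int pairs = 0;

    /* %99s leaves room for the terminating '\0' in a 100-byte array. */
    while (fscanf(fp, "%99s %99s", str1, str2) == 2) {
        pairs++;   /* "Do stuff" with str1 and str2 here */
    }
    return pairs;
}
```

Calling count_pairs(stdin) ends once end-of-file is reached, so the loop stops after "wtqe" when the input is redirected from a file.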

How to take a line input in C?

I was trying to take a full line input in C. Initially I did,
char line[100]; // assume no line is longer than 100 letters.
scanf("%s", line);
Ignoring security flaws and buffer overflows, I knew this could never take more than a word input. I modified it again,
scanf("%[^\n]", line);
This, of course, couldn't take more than a line of input. The following code, however was running into infinite loop,
while(fscanf(stdin, "%[^\n]", line) != EOF)
{
printf("%s\n", line);
}
This was because, the \n was never consumed, and would repeatedly stop at the same point and had the same value in line. So I rewrote the code as,
while(fscanf(stdin, "%[^\n]\n", line) != EOF)
{
printf("%s\n", line);
}
This code worked impeccably(or so I thought), for input from a file. But for input from stdin, this produced cryptic, weird, inarticulate behavior. Only after second line was input, the first line would print. I'm unable to understand what is really happening.
All I am doing is this. Note down the string until you encounter a \n, store it in line and then consume the \n from the input buffer. Now print this line and get ready for next line from the input. Or am I being misled?
At the time of posting this question however, I found a better alternative,
while(fscanf(stdin, "%[^\n]%*c", line) != EOF)
{
printf("%s\n", line);
}
This works flawlessly for all cases. But my question still remains. How come this code,
while(fscanf(stdin, "%[^\n]\n", line) != EOF)
{
printf("%s\n", line);
}
worked for inputs from file, but is causing issues for input from standard input?
Use fgets(). @FredK
char buf[N];
while (fgets(buf, sizeof buf, stdin)) {
// crop potential \n if desired.
buf[strcspn(buf, "\n")] = '\0';
...
}
There are too many issues with using scanf() for user input that render it prone to misuse or code attacks.
// Leaves trailing \n in stdin
scanf("%[^\n]", line)
// Does nothing if line begins with \n. \n remains in stdin
// As return value not checked, use of line may be UB.
// If some text read, consumes \n and then all following whitespace: ' ' \n \t etc.
// Then does not return until a non-white-space is entered.
// As stdin is usually buffered, this implies 2 lines of user input.
// Fails to limit input.
scanf("%[^\n]\n", line)
// Does nothing if line begins with \n. \n remains in stdin
// Consumes 1 char after `line`, even if next character is not a \n
scanf("%99[^\n]%*c", line)
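The difference between the last two formats can be made visible with sscanf() and %n, which records how many characters have been consumed so far. This is only an illustrative sketch; the two function names are made up for the demonstration.

```c
#include <stdio.h>

/* Returns how many characters "%99[^\n]\n" consumes from input,
   or -1 if the line part did not match. */
int consumed_with_trailing_newline(const char *input)
{
    char line[100];
    int n = -1;
    /* The "\n" in the format matches ANY run of whitespace, not one newline. */
    sscanf(input, "%99[^\n]\n%n", line, &n);
    return n;
}

/* Returns how many characters "%99[^\n]%*c" consumes from input,
   or -1 if the line part did not match. */
int consumed_with_percent_c(const char *input)
{
    char line[100];
    int n = -1;
    /* %*c reads and discards exactly one character. */
    sscanf(input, "%99[^\n]%*c%n", line, &n);
    return n;
}
```

For the input "abc\n\n  def", the first function consumes 7 characters (the text, both newlines and both spaces), while the second consumes 4. On an interactive stdin the first format cannot know the whitespace run has ended until a non-whitespace character arrives, which is why a second line has to be typed.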
Checking against EOF is usually the wrong check. @Weather Vane The following, when \n is first entered, returns 0 as line is not populated. As 0 != EOF, code goes on to use an uninitialized line leading to UB.
while(fscanf(stdin, "%[^\n]%*c", line) != EOF)
Consider entering "1234\n" to the following. Likely infinite loop as first fscanf() read "123", tosses the "4" and the next fscanf() call gets stuck on \n.
while(fscanf(stdin, "%3[^\n]%*c", line) != EOF)
When checking the results of *scanf(), check against what you want, not against one of the values you do not want. (But even the following has other troubles)
while(fscanf(stdin, "%[^\n]%*c", line) == 1)
About the closest scanf() to read a line:
char buf[100];
buf[0] = 0;
int cnt = scanf("%99[^\n]", buf);
if (cnt == EOF) Handle_EndOfFile();
// Consume \n if next stdin char is a \n
scanf("%*1[\n]");
// Use buf;
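Wrapped into a function, the snippet above might be used as follows. This is a sketch under assumptions: the name scan_line is invented here, the stream is passed in so that fscanf() replaces scanf(), and lines longer than 99 characters are returned in pieces.

```c
#include <stdio.h>

/* Reads one line (at most 99 characters of it) from fp into buf.
   Returns 1 if a line, possibly empty, was read and 0 at end-of-file. */
int scan_line(FILE *fp, char buf[100])
{
    buf[0] = '\0';                       /* stays empty if the line is empty */
    int cnt = fscanf(fp, "%99[^\n]", buf);
    if (cnt == EOF)
        return 0;                        /* nothing left to read */
    /* Consume the newline only if the next character is one. */
    fscanf(fp, "%*1[\n]");
    return 1;
}
```

A caller would loop with while (scan_line(stdin, buf)) and use buf inside the loop; an empty input line yields an empty string instead of stalling the loop.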
while(fscanf(stdin, "%[^\n]%*c", line) != EOF)
worked for inputs from file, but is causing issues for input from standard input?
Posting sample code and an input/data file would be useful. With only a modest amount of code posted, here are some potential reasons:
line overrun is UB
Input begins with \n leading to UB
File or stdin not both opened in same mode. \r not translated in one.
Note: The following fails when a line is 100 characters, because there is no room left for the terminating null character. So meeting the assumption can still lead to UB.
char line[100]; // assume no line is longer than 100 letters.
scanf("%s", line);
Personally, I think fgets() is badly designed. When I read a line, I want to read it in whole regardless of its length (except filling up all RAM). fgets() can't do that in one go. If there is a long line, you have to manually run it multiple times until it reaches the newline. getline(), originally a GNU extension and part of POSIX since POSIX.1-2008, is more convenient in this regard. Here is a function that mimics GNU's getline():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
long my_getline(char **buf, long *m_buf, FILE *fp)
{
    long tot = 0; // characters stored so far for the current line
    if (*m_buf == 0) { // empty buffer; allocate
        *m_buf = 16; // initial size; could be larger
        *buf = (char*)malloc(*m_buf);
        if (*buf == NULL) { *m_buf = 0; return EOF; }
    }
    for (;;) {
        char *p = *buf + tot, *tmp; // where the next chunk is stored
        long l;
        if (fgets(p, (int)(*m_buf - tot), fp) == NULL)
            return tot? tot : EOF; // end-of-file (or read error)
        l = (long)strlen(p); // length of the chunk just read
        tot += l;
        if (l > 0 && p[l - 1] == '\n') { // a complete line
            (*buf)[--tot] = 0; // strip the newline
            break;
        }
        // incomplete line; double the buffer and keep reading
        tmp = (char*)realloc(*buf, *m_buf * 2);
        if (tmp == NULL) return EOF; // out of memory; *buf is still valid
        *buf = tmp;
        *m_buf *= 2;
    }
    return tot;
}
int main(int argc, char *argv[])
{
long l, m_buf = 0;
char *buf = 0;
while ((l = my_getline(&buf, &m_buf, stdin)) != EOF)
puts(buf);
free(buf);
return 0;
}
I usually use my own readline() function. I wrote this my_getline() a moment ago. It has not been thoroughly tested. Please use with caution.
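For comparison, where POSIX getline() is available it can be used directly. The sketch below is an illustration only: the function name longest_line is made up, and the feature-test macro is an assumption about what a strict compiler mode needs in order to expose the declaration.

```c
#define _POSIX_C_SOURCE 200809L   /* expose getline() on strict compilers */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>            /* ssize_t */

/* Returns the length of the longest line in fp, not counting the newline. */
size_t longest_line(FILE *fp)
{
    char *buf = NULL;   /* getline() allocates and grows this as needed */
    size_t cap = 0;
    size_t longest = 0;
    ssize_t len;

    while ((len = getline(&buf, &cap, fp)) != -1) {
        if (len > 0 && buf[len - 1] == '\n')
            len--;                      /* do not count the newline */
        if ((size_t)len > longest)
            longest = (size_t)len;
    }
    free(buf);                          /* one free() for the whole loop */
    return longest;
}
```

Unlike my_getline() above, getline() keeps the trailing newline in the buffer and returns -1 instead of EOF, so the caller strips the newline itself.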

Trying to get each line, but only returns first character in the loop

So I have a while loop to read comments off a text file as follows. There could be things after the comments, so that shouldn't be ruled out. Here is my code:
int i = 0;
while(!feof(fd) || i < 100) {
fscanf(fd, "#%s\n", myFile->comments[i]);
printf("#%s\n", myFile->comments[i]);
getch();
i++;
}
Comment format:
# This is a comment
# This is another comment
Any idea why it only returns the first character?
EDIT:
Here is my comments array:
char comments [256][100];
The comments array allows for 256 strings of up to 100 characters each.
The scanset " %99[^\n]" will skip leading whitespace and scan up to 99 (to allow for a terminating '\0') characters or to a newline whichever comes first.
The if condition will print the line and increment i on a commented line.
int i = 0;
char comments[256][100];
while( i < 256 && ( fscanf(fd, " %99[^\n]", comments[i]) == 1)) {
if ( comments[i][0] == '#') {
printf("%s\n", comments[i]);
i++;
}
}
Because scanf with %s stops at the first whitespace character (space, tab or newline) or at end-of-file. Use something like fgets.
"%s" does not save spaces.
It reads and discards leading white-space and then saves non-white-space into myFile->comments[i]
// reads "# This ", saving "This" into myFile->comments[i]
fscanf(fd, "#%s\n", myFile->comments[i]);
// Prints "This"
printf("#%s\n", myFile->comments[i]);
// reads and tosses "i"
getch();
// next fails to read "s a comment" as it does not begin with `#`
// nothing saved in myFile->comments[i]
fscanf(fd, "#%s\n", myFile->comments[i]);
// output undefined.
printf("#%s\n", myFile->comments[i]);
Instead, avoid scanf(). To read a line of input use fgets()
char buf[100];
while (fgets(buf, sizeof buf, stdin)) {
if (buf[0] == '#') printf("Comment %s", &buf[1]);
else printf("Not comment %s", buf);
}
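Once a line is in a buffer, sscanf() with a scanset can still pull out the whole comment text, spaces included. The following is a small sketch; the function name parse_comment and the 100-byte output buffer are assumptions chosen to match the comments array in the question.

```c
#include <stdio.h>

/* If line starts with '#' and has text after it, copies that text
   (without the '#' and any whitespace right after it) into out and
   returns 1; otherwise returns 0. */
int parse_comment(const char *line, char out[100])
{
    /* '#' must match literally; the space skips whitespace after it;
       %99[^\n] then takes the rest of the line, spaces included. */
    return sscanf(line, "# %99[^\n]", out) == 1;
}
```

Inside the fgets() loop this could be called as if (parse_comment(buf, myFile->comments[i])) i++;, which stores "This is a comment" rather than only "This".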

How to read in a specific line from an input file in C?

If I want to read in a specific line without knowing what exactly is in that line how would I do that using fscanf?
one \n
two \n
three \n
i want this line number four \n
five \n
six \n
How would I read in the 5th line in that input text file? Do I need to use a while loop or a for loop?
You can use either kind of loop; both work in more or less the same way.
Here is something you can do:
int ch, newlines = 0;
while ((ch = getc(fp)) != EOF) {
if (ch == '\n') {
newlines++;
if (newlines == 5)
break;
}
}
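Or you can use fgets, because fgets reads up to and including the "\n" (newline) at the end of each line, so each successful call advances by one line as long as the line fits in the buffer
char line[100]; int newline=0;
while ( fgets( line, sizeof line, stdin ) != NULL )
{
    newline++;
    if(newline==5)
    {
        printf("The line is: %s", line); // line already ends with '\n'
        break;
    }
}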
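The same idea can be sketched as a reusable function that also copes with earlier lines that are longer than the buffer. The name read_nth_line and its parameters are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Copies line number n (1-based) of fp into out, without its newline.
   Returns 1 on success, 0 if the file has fewer than n lines.
   A target line longer than size-1 characters is truncated. */
int read_nth_line(FILE *fp, int n, char *out, size_t size)
{
    int current = 1;   /* number of the line the next fgets() reads from */

    while (fgets(out, (int)size, fp) != NULL) {
        if (current == n) {
            out[strcspn(out, "\n")] = '\0';   /* drop the newline, if any */
            return 1;
        }
        /* Only a chunk that ends in '\n' finishes a line; a longer line
           arrives in several chunks and must be counted once. */
        if (strchr(out, '\n') != NULL)
            current++;
    }
    return 0;
}
```

For the sample file, read_nth_line(fp, 4, line, sizeof line) would leave "i want this line number four " in line.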

How to read only the first word from each line?

I've done many simple procedures, but I'm only trying to read the first word into a char word[30], from each line of a text file.
I've tried, but without success. Oh, I have to reuse that char each time I read it. (To put in an ordered list each time I read it).
Can anyone show me a way to read this way from a file, in a simple and "cleany" way?
FILE *fp;
char word[30];
fp = fopen("/myhome/Desktop/tp0_test.txt", "r");
if (fp == NULL) {
printf("Erro ao abrir ficheiro!\n");
} else {
while (!feof(fp)) {
fscanf(fp,"%*[^\n]%s",word);//not working very well...
printf("word read is: %s\n", word);
strcpy(word,""); //is this correct?
}
}
fclose(fp);
For example for a file that contains:
word1 word5
word2 kkk
word3 1322
word4 synsfsdfs
it prints only this:
word read is: word2
word read is: word3
word read is: word4
word read is:
Just swap the conversion specifications in your format string
// fscanf(fp,"%*[^\n]%s",word);//not working very well...
fscanf(fp,"%s%*[^\n]",word);
Read the first word and ignore the rest, rather than ignore the line and read the first word.
Edit: some explanation
%s ignores whitespace, so if the input buffer has " forty two", scanf ignores the first space, copies "forty" to the destination and leaves the buffer positioned at the space before "two"
%*[^\n] ignores everything up to a newline, excluding the newline. So a buffer containing "one \n two" gets positioned at the newline after the scanf (as if it was "\n two")
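As a sketch of how the swapped format behaves over a whole file, the loop can check the return value of fscanf() instead of calling feof(). The function name first_words and the words array are assumptions for illustration; the width 29 matches the char word[30] from the question.

```c
#include <stdio.h>

#define WORD_LEN 30   /* matches char word[30] in the question */

/* Stores the first word of each non-blank line of fp in words[].
   Returns the number of words stored. */
int first_words(FILE *fp, char words[][WORD_LEN], int max_words)
{
    int count = 0;

    /* %29s    : skip leading whitespace (including newlines), read one word
       %*[^\n] : discard the rest of the line, leaving the newline unread */
    while (count < max_words &&
           fscanf(fp, "%29s%*[^\n]", words[count]) == 1)
        count++;
    return count;
}
```

A line that holds only one word still counts: %*[^\n] then matches nothing, but the word has already been assigned, so fscanf() returns 1.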
so ross$ expand < first.c
#include <stdio.h>
int main(void) {
char line[1000], word[1000];
while(fgets(line, sizeof line, stdin) != NULL) {
word[0] = '\0';
sscanf(line, " %s", word);
printf("%s\n", word);
}
return 0;
}
so ross$ ./a.out < first.c
#include
int
char
while(fgets(line,
word[0]
sscanf(line,
printf("%s\n",
}
return
}
Update: Ok, here is one that just uses scanf(). Really, scanf doesn't deal well with discrete lines and you lose the option of avoiding word buffer overflow by setting the word buffer to be the same size as the line buffer, but, for what it's worth...
so ross$ expand < first2.c
#include <stdio.h>
int main(void) {
char word[1000];
for(;;) {
if(feof(stdin) || scanf(" %s%*[^\n]", word) == EOF)
break;
printf("%s\n", word);
}
return 0;
}
so ross$ ./a.out < first2.c
#include
int
char
for(;;)
if(feof(stdin)
break;
printf("%s\n",
}
return
}
Have a look at this: the strtok function is what we need. Its second argument tells the function where to split the string. For example, strtok (singleLine," ,'("); cuts every time it sees a space, a comma, an apostrophe or an opening parenthesis.
Use strtok (singleLine," "); to split only at spaces.
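FILE *fPointer,*fWordCopy;
char singleLine[150];
char * pch;
fPointer= fopen("words.txt","r");
fWordCopy= fopen("wordscopy.txt","a");
if (fPointer != NULL && fWordCopy != NULL)
{
    // loop on fgets() itself; feof() only turns true after a read has failed
    while(fgets(singleLine,sizeof singleLine,fPointer) != NULL)
    {
        pch = strtok (singleLine," ,'(\n");
        if (pch != NULL) // NULL when the line has no word at all
            fprintf(fWordCopy,"%s\n",pch); // never pass file data as the format
    }
}
if (fPointer != NULL) fclose(fPointer);
if (fWordCopy != NULL) fclose(fWordCopy);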
