I have some lines I want to parse from a text file. Some lines start with x and continue with several y:z and others are composed completely of several y:zs, where x,y,z are numbers. I tried following code, but it does not work. The first line also reads in the y in y:z.
...
if (fscanf(stream,"%d ",&x))
else if (fscanf(stream,"%d:%g",&y,&z))
...
Is there a way to tell scanf to only read a character if it is followed by a space?
The *scanf family of functions does not allow you to do that natively. Of course, you can work around the problem by reading in the minimum number of elements that you know will be present per input line, validating the return value of *scanf and then proceeding incrementally, one item at a time, each time checking the return value for success or failure.
if (1 == fscanf(stream, "%d", &x) && (x == desired_value)) {
/* we need to read in some more : separated numbers */
while (2 == fscanf(stream, "%d:%d", &y, &z)) { /* loop till we fail */
printf("read: %d %d\n", y, z);
} /* note we do not handle the case where only one of y and z is present */
}
Your best bet to handle this is to read in a line using fgets and then parse the line yourself using sscanf.
if (NULL != fgets(line, MAX_BUF_LEN, stream)) { /* read line */
int nitems = tokenize(line, tokens); /* parse */
}
...
size_t tokenize(const char *buf, char **tokens) {
size_t idx = 0;
while (buf[ idx ] != '\0') {
/* parse an int */
...
}
}
char line[MAXLEN];
while( fgets(line,MAXLEN,stream) )
{
char *endptr;
strtol(line,&endptr,10);
if( *endptr==':' )
printf("only y:z <%s>",line);
else
printf("beginning x <%s>",line);
}
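To take the line-based idea one step further, here is a minimal sketch that parses a whole line of the form "[x] y:z y:z ..." with strtol and strtod once fgets has read it. The function name parse_line and its parameters (has_x, ys, zs, max_pairs) are assumptions made for illustration; they do not appear in the question.

```c
#include <stdlib.h>

/* Parse one line of the form "[x] y:z y:z ..." (hypothetical helper).
 * Returns the number of y:z pairs stored, or -1 on malformed input. */
int parse_line(const char *line, int *has_x, long *x,
               long ys[], double zs[], int max_pairs)
{
    const char *p = line;
    char *end;
    int n = 0;

    *has_x = 0;
    for (;;) {
        long v = strtol(p, &end, 10);
        if (end == p)                     /* no more integers on this line */
            break;
        if (*end == ':') {                /* v is the y of a y:z pair */
            const char *zstart = end + 1;
            double z = strtod(zstart, &end);
            if (end == zstart || n == max_pairs)
                return -1;                /* missing z, or too many pairs */
            ys[n] = v;
            zs[n] = z;
            n++;
        } else if (n == 0 && !*has_x) {   /* lone leading integer: x */
            *has_x = 1;
            *x = v;
        } else {
            return -1;                    /* lone integer in the wrong place */
        }
        p = end;
    }
    /* anything left other than whitespace is an error */
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
        p++;
    return *p == '\0' ? n : -1;
}
```

The key point is the same as in the snippet above: after strtol, the character at the end pointer tells you whether the number was an x (followed by whitespace) or a y (followed by ':').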
I found a crude way to do what I wanted without having to switch to fgets (which would probably be safer in the long run).
if (fscanf(stream,"%d ",&x)){...}
else if (fscanf(stream,"%d:%g",&y,&z)){...}
else if (fscanf(stream,":%g",&z)){
y=x;
x=0;
}
I'm having some troubles using strtok function.
As an exercise I have to deal with a text file by ruling out white spaces, transforming initials into capital letters and printing no more than 20 characters in a line.
Here is a fragment of my code:
fgets(sentence, SIZE, f1_ptr);
char *tok_ptr = strtok(sentence, " \n"); //tokenizing each line read
tok_ptr[0] = toupper(tok_ptr[0]); //initials to capital letters
int num = 0, i;
while (!feof(f1_ptr)) {
while (tok_ptr != NULL) {
for (i = num; i < strlen(tok_ptr) + num; i++) {
if (i % 20 == 0 && i != 0) //maximum of 20 char per line
fputc('\n', stdout);
fputc(tok_ptr[i - num], stdout);
}
num = i;
tok_ptr = strtok(NULL, " \n");
if (tok_ptr != NULL)
tok_ptr[0] = toupper(tok_ptr[0]);
}
fgets(sentence, SIZE + 1, f1_ptr);
tok_ptr = strtok(sentence, " \n");
if (tok_ptr != NULL)
tok_ptr[0] = toupper(tok_ptr[0]);
}
The text is just a bunch of lines I just show as a reference:
Watch your thoughts ; they become words .
Watch your words ; they become actions .
Watch your actions ; they become habits .
Watch your habits ; they become character .
Watch your character ; it becomes your destiny .
Here is what I obtain in the end:
WatchYourThoughts;Th
eyBecomeWords.WatchY
ourWords;THeyBecomeA
ctions.WatchYourActi
ons;TheyBecomeHabits
.WatchYourHabits;The
yBecomeCharacteR.Wat
chYourCharacter;ItBe
comesYourDEstiny.Lao
-Tze
The final result is mostly correct, but sometimes (for example "they" in they become (and only in that case) or "destiny") words are not correctly tokenized. So for example "they" is split into "t" and "hey" resulting in THey (DEstiny in the other instance) after the manipulations I made.
Is it some bug or am I missing something? Probably my code is not that efficient and some condition may end up being critical...
Thank you for the help, it's not that big of a deal, I just don't understand why such a behaviour is occurring.
You have a large number of errors in your code and you are over-complicating the problem. The most pressing error is the one described in "Why is while ( !feof (file) ) always wrong?". To see why, trace the execution path within your loop: you attempt to read with fgets(), and then you use sentence without knowing whether EOF was reached, calling tok_ptr = strtok(sentence, " \n"); before you ever get around to checking feof(f1_ptr).
What happens when you actually reach EOF? fgets() fails, yet you still tokenize whatever is left in sentence from the previous read. That IS why while ( !feof (file) ) is always wrong. Instead, you always want to control your read-loop with the return of the read function you are using, e.g. while (fgets(sentence, SIZE, f1_ptr) != NULL)
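As a minimal sketch of that pattern (count_lines is a hypothetical helper, not part of your code), the loop below lets fgets() itself decide when to stop, so the buffer is never used after a failed read:

```c
#include <stdio.h>
#include <string.h>

/* Count the lines in fp, letting fgets() itself end the loop
 * (hypothetical helper for illustration). */
size_t count_lines(FILE *fp)
{
    char buf[256];
    size_t n = 0;
    int at_line_start = 1;      /* next chunk begins a new line? */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (at_line_start)
            n++;                /* first chunk of a line */
        /* a chunk without '\n' means the line continues in the next read */
        at_line_start = strchr(buf, '\n') != NULL;
    }
    return n;
}
```

No call to feof() is needed to end the loop; you would only call feof() or ferror() afterwards if you wanted to know why fgets() returned NULL.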
What is it you actually need your code to do?
The larger question is why are you over-complicating the problem with strtok, and arrays (and fgets() for that matter)? Think about what you need to do:
read each character in the file,
if it is whitespace, ignore it, set the in-word flag false,
if a non-whitespace, if 1st char in word, capitalize it, output the char, set the in-word flag true and increment the number of chars output to the current line, and finally
if it is the 20th character output, output a newline and reset the counter to zero.
The bare-minimum tools you need from your C-toolbox are fgetc(), isspace() and toupper() from ctype.h, a counter for the number of characters output, and a flag to know if the character is the first non-whitespace character after a whitespace.
Implementing the logic
That makes the problem very simple. Read a character. Is it whitespace? If so, set your in-word flag false. Otherwise, if your in-word flag is false, capitalize the character; then output it, set your in-word flag true and increment your character count. The last thing you need to do is check whether your character count has reached the limit; if so, output a '\n' and reset your character count to zero. Repeat until you run out of characters.
You can turn that into code with something similar to the following:
#include <stdio.h>
#include <ctype.h>
#define CPL 20 /* chars per-line, if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
int c, in = 0, n = 0; /* char, in-word flag, no. of chars output in line */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while ((c = fgetc(fp)) != EOF) { /* read / validate each char in file */
if (isspace(c)) /* char is whitespace? */
in = 0; /* set in-word flag false */
else { /* otherwise, not whitespace */
putchar (in ? c : toupper(c)); /* output char, capitalize 1st in word */
in = 1; /* set in-word flag true */
n++; /* increment character count */
}
if (n == CPL) { /* CPL limit reached? */
putchar ('\n'); /* output newline */
n = 0; /* reset cpl counter */
}
}
putchar ('\n'); /* tidy up with newline */
if (fp != stdin) /* close file if not stdin */
fclose (fp);
}
Example Use/Output
Given your input file stored on my computer in dat/text220.txt, you can produce the output you are looking for with:
$ ./bin/text220 dat/text220.txt
WatchYourThoughts;Th
eyBecomeWords.WatchY
ourWords;TheyBecomeA
ctions.WatchYourActi
ons;TheyBecomeHabits
.WatchYourHabits;The
yBecomeCharacter.Wat
chYourCharacter;ItBe
comesYourDestiny.
(the executable for the code was compiled to bin/text220; I usually keep separate dat, obj, and bin directories for data, object files and executables to keep my source code directory clean)
note: by reading from stdin by default if no filename is provided as the first argument to the program, you can use your program to read input directly, e.g.
$ echo "my dog has fleas - bummer!" | ./bin/text220
MyDogHasFleas-Bummer
!
No fancy string functions required, just a loop, a character, a flag and a counter -- the rest is just arithmetic. It's always worth trying to boil your programming problems down to basic steps and then looking around your C-toolbox to find the right tool for each basic step.
Using strtok
Don't get me wrong, there is nothing wrong with using strtok and it makes a fairly simple solution in this case -- the point I was making is that for simple character-oriented string-processing, it's often just as simple to loop over the characters in the line. You don't gain any efficiency using fgets() with an array and strtok(); the read from the file is already placed into a buffer of BUFSIZ bytes (see footnote 1).
If you did want to use strtok(), you should control your read-loop with the return from fgets() and then you can tokenize with strtok(), also checking its return at each point: a read-loop with fgets() and a tokenization loop with strtok(). Then you handle first-character capitalization and limit your output to 20 chars per line.
You could do something like the following:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define CPL 20 /* chars per-line, if you need a constant, #define one (or more) */
#define MAXC 1024
#define DELIM " \t\r\n"
void putcharCPL (int c, int *n)
{
if (*n == CPL) { /* if n == limit */
putchar ('\n'); /* output '\n' */
*n = 0; /* reset value at mem address to 0 */
}
putchar (c); /* output character */
(*n)++; /* increment value at mem address */
}
int main (int argc, char **argv) {
char line[MAXC]; /* buffer to hold each line */
int n = 0; /* no. of chars output in line */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (fgets (line, MAXC, fp)) /* read each line and tokenize line */
for (char *tok = strtok (line, DELIM); tok; tok = strtok (NULL, DELIM)) {
putcharCPL (toupper(*tok), &n); /* convert 1st char to upper */
for (int i = 1; tok[i]; i++) /* output rest unchanged */
putcharCPL (tok[i], &n);
}
putchar ('\n'); /* tidy up with newline */
if (fp != stdin) /* close file if not stdin */
fclose (fp);
}
(same output)
The putcharCPL() function is just a helper that checks if 20 characters have been output and if so outputs a '\n' and resets the counter. It then outputs the current character and increments the counter by one. A pointer to the counter is passed so it can be updated within the function making the updated value available back in main().
Look things over and let me know if you have further questions.
footnotes:
1. Depending on your version of glibc, the constant in the source setting the read-buffer size may be _IO_BUFSIZ. _IO_BUFSIZ was changed to BUFSIZ here: glibc commit 9964a14579e5eef9. For Linux BUFSIZ is defined as 8192 (512 on Windows).
This is actually a much more interesting OP from a professional point of view than some of the comments may suggest, despite the 'newcomer' aspect of the question, which may sometimes raise fairly deep, underestimated issues.
The fun thing is that on my platform (W10, MSYS2, gcc v.10.2), your code runs fine with correct results:
WatchYourThoughts;Th
eyBecomeWords.WatchY
ourWords;TheyBecomeA
ctions.WatchYourActi
ons;TheyBecomeHabits
.WatchYourHabits;The
yBecomeCharacter.Wat
chYourCharacter;ItBe
comesYourDestiny.
So first, congratulations, newcomer: your coding is not that bad.
This points to how different compilers may or may not protect against limited inappropriate coding or specification misuse, may or may not protect stacks or heaps.
That said, the comment by @Andrew Henle pointing to an illuminating answer about feof is quite relevant.
If you follow it and keep your feof test, just move it down so that it comes after the read checks, not before (as below). Your code should then yield better results (note: I will just alter your code minimally, deliberately ignoring lesser issues):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <ctype.h>
#define SIZE 100 // add some leeway to avoid off-by-one issues
int main()
{
FILE* f1_ptr = fopen("C:\\Users\\Public\\Dev\\test_strtok", "r");
if (! f1_ptr)
{
perror("Open issue");
exit(EXIT_FAILURE);
}
char sentence[SIZE] = {0};
if (NULL == fgets(sentence, SIZE, f1_ptr))
{
perror("fgets issue"); // implementation-dependent
exit(EXIT_FAILURE);
}
errno = 0;
char *tok_ptr = strtok(sentence, " \n"); //tokenizing each line read
if (tok_ptr == NULL || errno)
{
perror("first strtok parse issue");
exit(EXIT_FAILURE);
}
tok_ptr[0] = toupper(tok_ptr[0]); //initials to capital letters
int num = 0;
size_t i = 0;
while (1) {
while (1) {
for (i = num; i < strlen(tok_ptr) + num; i++) {
if (i % 20 == 0 && i != 0) //maximum of 20 char per line
fputc('\n', stdout);
fputc(tok_ptr[i - num], stdout);
}
num = i;
tok_ptr = strtok(NULL, " \n");
if (tok_ptr == NULL) break;
tok_ptr[0] = toupper(tok_ptr[0]);
}
if (NULL == fgets(sentence, SIZE, f1_ptr)) // let's get away with annoying +1,
// we have enough headroom
{
if (feof(f1_ptr))
{
fprintf(stderr, "\n%s\n", "Found EOF");
break;
}
else
{
perror("Unexpected fgets issue in loop"); // implementation-dependent
exit(EXIT_FAILURE);
}
}
errno = 0;
tok_ptr = strtok(sentence, " \n");
if (tok_ptr == NULL)
{
if (errno)
{
perror("strtok issue in loop");
exit(EXIT_FAILURE);
}
break;
}
tok_ptr[0] = toupper(tok_ptr[0]);
}
return 0;
}
$ ./test
WatchYourThoughts;Th
eyBecomeWords.WatchY
ourWords;TheyBecomeA
ctions.WatchYourActi
ons;TheyBecomeHabits
.WatchYourHabits;The
yBecomeCharacter.Wat
chYourCharacter;ItBe
comesYourDestiny.
Found EOF
So, I'm working on a simple hangman game in C, and I have the function read_guess, shown below.
void read_guess(char *guesses, char *p_current_guess)
{
int valid_guess = 0;
// Repeatedly takes input until guess is valid
while (valid_guess == 0)
{
printf(">>> ");
fgets(p_current_guess, 2, stdin);
if (!isalpha(*p_current_guess)) printf("Guesses must be alphabetic. Please try again.\n\n");
else
{
valid_guess = 1;
// Iterates over array of guesses and checks if letter has already been guessed
for (int i = 0; guesses[i] != '\0'; i++)
{
if (guesses[i] == *p_current_guess)
{
printf("You have already guessed this letter. Please try again.\n\n");
valid_guess = 0;
break;
}
}
}
}
}
I've tried all the standard input functions (including getchar), but with all of them, when an input larger than one character is supplied, instead of taking just the first character and moving on (or asking again), the rest of the input is "pushed back", and the next time input is requested, whether it be because the input contained a non-alphabetic character or the next round begins, the rest of the input is automatically processed. This repeats for each character of the input.
How can I avoid this?
You are using fgets, which is good, but unfortunately not in the right way...
fgets reads up to the end of the line, or at most one fewer than the number of characters requested. And of course the remaining characters are left for the next read operation...
The idiomatic way would be to ensure reading up to the end of line, whatever the length, or at least up to a much larger length.
Simple, but could fail if a line of input does not fit in SIZE characters:
#define SIZE 64
...
void read_guess(char *guesses, char *p_current_guess)
{
char line[SIZE];
int valid_guess = 0;
// Repeatedly takes input until guess is valid
while (valid_guess == 0)
{
printf(">>> ");
fgets(line, SIZE, stdin); // read a line of size at most SIZE-1
p_current_guess[0] = line[0]; // keep first character
p_current_guess[1] = '\0';
...
Robust but slightly more complex
/**
 * Read a line and only keep the first character
 *
 * Syntax: char * fgetfirst(dest, fd);
 *
 * Parameters:
 * dest: points to a buffer of size at least 2 that will receive the
 *       first character followed by a null
 * fd  : FILE* from which to read
 *
 * Return value: dest if one character was successfully read, else NULL
 *
 * Needs <stdio.h> and <string.h>.
 */
char *fgetfirst(char *dest, FILE *fd) {
#define SIZE 256 // may be adapted
    char buf[SIZE];
    char *cr = NULL; // return value initialized to NULL if nothing can be read
    for (;;) {
        if (NULL == fgets(buf, sizeof(buf), fd)) return cr; // read error or end of file
        if (cr == NULL) { // first read:
            if (buf[0] == '\n') return cr; // empty line: nothing to keep
            cr = dest; // prepare to return first char
            dest[0] = buf[0];
            dest[1] = 0;
        }
        if (NULL != strchr(buf, '\n')) return cr; // end of line reached: rest is discarded
    }
}
You can then use it simply in your code:
void read_guess(char *guesses, char *p_current_guess)
{
int valid_guess = 0;
// Repeatedly takes input until guess is valid
while (valid_guess == 0)
{
printf(">>> ");
fgetfirst(p_current_guess, stdin);
You can discard all input until end-of-line, each time you want to ask for input.
void skip_to_eol(FILE* f, int c)
{
while (c != EOF && c != '\n')
c = fgetc(f);
}
...
int c = getchar(); // instead of fgets; int so that EOF can be detected
skip_to_eol(stdin, c);
You can use the getch() function on Windows to get a single character. For the Linux equivalent, see:
What is the equivalent to getch() & getche() in Linux?
I wanted to only count the number of strings in a text file, containing numbers as well. But the code below, counts even the numbers in the file as strings. How do I rectify the problem?
int count;
char *temp;
FILE *fp;
fp = fopen("multiplexyz.txt" ,"r" );
while(fscanf(fp,"%s",temp) != EOF )
{
count++;
}
printf("%d ",count);
return 0;
}
Well, first up, using the temp pointer without having backing storage for it is going to cause you a world of pain.
I'd suggest, as a start, using something like char temp[1000] instead, keeping in mind that's still a bit risky if you have words more than a thousand or so characters long (that's a different issue to the one you're asking about so I'll mention it but not spend too much time on fixing it).
Secondly, it appears you want to count words with numbers (like alpha7 or pi/2). If that's the case, you simply need to check temp after reading the "word" and increment count only if it matches a "non-numeric" pattern.
That could be as simple as just not incrementing if the word consists only of digits, or it could be complicated if you want to handle decimals, exponential formats and so on.
But the bottom line remains the same:
while(fscanf(fp,"%s",temp) != EOF )
{
if (! isANumber(temp))
count++;
}
with a suitable definition of isANumber. For example, for unsigned integers only, something like this would be a good start:
int isANumber (char *str) {
// Empty string is not a number.
if (*str == '\0')
return 0;
// Check every character.
while (*str != '\0') {
// If non-digit, it's not a number.
if (! isdigit (*str))
return 0;
str++;
}
// If all characters were digits, it was a number.
return 1;
}
For more complex checking, you can use the strto* calls in C, giving them the temp buffer and ensuring you use the endptr method to ensure the entire string is scanned. Off the top of my head, so not well tested, that would go something like:
int isANumber (char *str) {
// Empty string is not a number.
if (*str == '\0')
return 0;
// Use strtold to get a long double.
char *endPtr;
long double d = strtold (str, &endPtr);
// Characters unconsumed, not number (things like 42b).
if (*endPtr != '\0')
return 0;
// Was a long double, so number.
return 1;
}
The only thing you need to watch out for there is that certain strings like NaN or +Inf are considered a number by strtold so you may need extra checks for that.
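One way to add that extra check, sketched here under the assumption that NaN and infinities should not count as numbers, is to test the converted value with isfinite from math.h. The name isFiniteNumber is made up for this example.

```c
#include <math.h>
#include <stdlib.h>

/* Returns 1 if str is, in its entirety, a finite number, else 0
 * (hypothetical variant of isANumber; rejects NaN and infinities). */
int isFiniteNumber(const char *str)
{
    char *endPtr;
    long double d;

    /* Empty string is not a number. */
    if (*str == '\0')
        return 0;

    d = strtold(str, &endPtr);

    /* Nothing converted, or characters left unconsumed (things like 42b). */
    if (endPtr == str || *endPtr != '\0')
        return 0;

    /* strtold accepts "NaN" and "Inf"; treat them as non-numbers here. */
    return isfinite(d) ? 1 : 0;
}
```

Note that a value too large for a long double also converts to an infinity, so this sketch rejects it as well; whether that is what you want depends on your data.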
Inside your while loop, loop through the string to check whether any of its characters are digits. Something like:
char *p = temp;
while (*p != '\0') {
if (isdigit((unsigned char)*p))
break; /* found a digit */
p++;
}
[don't copy this code exactly]
I find strpbrk to be one of the most helpful functions for searching for several needles in a haystack. Your set of needles is the numeric characters "0123456789"; any line read from your file that contains one of them is counted. I also prefer POSIX getline for a line count due to its proper handling of files whose last line lacks a POSIX line ending (wc -l does not count the last line if it does not end with '\n'). That said, a small function that searches each line for any of the characters in a string trm passed as a parameter could be written as:
/** open and read each line in 'fn' returning the number of lines
* containing any of the characters in 'trm'.
*/
size_t nlines (char *fn, char *trm)
{
if (!fn) return 0;
size_t lines = 0, n = 0;
char *buf = NULL;
FILE *fp = fopen (fn, "r");
if (!fp) return 0;
while (getline (&buf, &n, fp) != -1)
if (strpbrk (buf, trm))
lines++;
fclose (fp);
free (buf);
return lines;
}
Simply pass the filename of interest and the terms to search for in each line. A short test code with a default term of "0123456789" that takes the filename as the first parameter and the term as the second could be written as follows:
#include <stdio.h> /* printf */
#include <stdlib.h> /* free */
#include <string.h> /* strpbrk */
size_t nlines (char *fn, char *trm);
int main (int argc, char **argv) {
char *fn = argc > 1 ? argv[1] : NULL;
char *srch = argc > 2 ? argv[2] : "0123456789";
if (!fn) return 1;
printf ("%zu %s\n", nlines (fn, srch), fn);
return 0;
}
/** open and read each line in 'fn' returning the number of lines
* containing any of the characters in 'trm'.
*/
size_t nlines (char *fn, char *trm)
{
if (!fn) return 0;
size_t lines = 0, n = 0;
char *buf = NULL;
FILE *fp = fopen (fn, "r");
if (!fp) return 0;
while (getline (&buf, &n, fp) != -1)
if (strpbrk (buf, trm))
lines++;
fclose (fp);
free (buf);
return lines;
}
Give it a try and see if this is what you are expecting, if not, just let me know and I am glad to help further.
Example Input File
$ cat dat/linewno.txt
The quick brown fox
jumps over 3 lazy dogs
who sleep in the sun
with a temp of 101
Example Use/Output
$ ./bin/getline_nlines_nums dat/linewno.txt
2 dat/linewno.txt
$ wc -l dat/linewno.txt
4 dat/linewno.txt
I have a very strange problem, I'm trying to read a .txt file with C, and the data is structured like this:
%s
%s
%d %d
Since I have to read the strings all the way to \n I'm reading it like this:
while(!feof(file)){
fgets(s[i].title,MAX_TITLE,file);
fgets(s[i].artist,MAX_ARTIST,file);
char a[10];
fgets(a,10,file);
sscanf(a,"%d %d",&s[i].time.min,&s[i++].time.sec);
}
However, the very first integer I read in s.time.min shows a random big number.
I'm using the sscanf right now since a few people had a similar issue, but it doesn't help.
Thanks!
EDIT: The integers represent time, they will never exceed 5 characters combined, including the white space between.
Note, I take your post to be reading values from 3 different lines, e.g.:
%s
%s
%d %d
(primarily evidenced by your use of fgets, a line-oriented input function, which reads a line of input (up to and including the '\n') each time it is called.) If that is not the case, then the following does not apply (and can be greatly simplified)
Since you are reading multiple values into a single element in an array of struct, you may find it better (and more robust), to read each value and validate each value using temporary values before you start copying information into your structure members themselves. This allows you to (1) validate the read of all values, and (2) validate the parse, or conversion, of all required values before storing members in your struct and incrementing your array index.
Additionally, you will need to remove the trailing '\n' from both title and artist to prevent having embedded newlines dangling off the end of your strings (which will cause havoc with searching for either a title or artist). For instance, putting it all together, you could do something like:
void rmlf (char *s);
....
char title[MAX_TITLE] = "";
char artist[MAX_ARTIST] = "";
char a[10] = "";
int min, sec;
...
while (fgets (title, MAX_TITLE, file) && /* validate read of values */
fgets (artist, MAX_ARTIST, file) &&
fgets (a, 10, file)) {
if (sscanf (a, "%d %d", &min, &sec) != 2) { /* validate conversion */
fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
continue; /* skip line - tailor to your needs */
}
rmlf (title); /* remove trailing newline */
rmlf (artist);
s[i].time.min = min; /* copy to struct members & increment index */
s[i].time.sec = sec;
strncpy (s[i].title, title, MAX_TITLE);
strncpy (s[i++].artist, artist, MAX_ARTIST);
}
/** remove trailing newline from 's'. */
void rmlf (char *s)
{
if (!s || !*s) return;
for (; *s && *s != '\n'; s++) {}
*s = 0;
}
(note: this will also read all values until an EOF is encountered without using feof (see Related link: Why is “while ( !feof (file) )” always wrong?))
Protecting Against a Short-Read with fgets
Following on from Jonathan's comment, when using fgets you should really check to ensure you have actually read the entire line, and have not experienced a short read where the maximum character count you supply is not sufficient to read the entire line (e.g. a short read because characters in that line remain unread).
If a short read occurs, that will completely destroy your ability to read any further lines from the file, unless you handle the failure correctly. This is because the next attempt to read will NOT start reading on the line you think it is reading and instead attempt to read the remaining characters of the line where the short read occurred.
You can validate a read by fgets by validating the last character read into your buffer is in fact a '\n' character. (if the line is longer than the max you specify, the last character before the nul-terminating character will be an ordinary character instead.) If a short read is encountered, you must then read and discard the remaining characters in the long line before continuing with your next read. (unless you are using a dynamically allocated buffer where you can simply realloc as required to read the remainder of the line, and your data structure)
Your situation complicates the validation by requiring data from 3 lines from the input file for each struct element. You must always maintain your 3-line read in sync, reading all 3 lines as a group during each iteration of your read loop (even if a short read occurs). That means you must validate that all 3 lines were read and that no short read occurred in order to handle any one short read without exiting your input loop. (you can validate each individually if you just want to terminate input on any one short read, but that leads to a very inflexible input routine.)
You can tweak the rmlf function above to a function that validates each read by fgets in addition to removing the trailing newline from the input. I have done that below in a function called, surprisingly, shortread. The tweaks to the original function and read loop could be coded something like this:
int shortread (char *s, FILE *fp);
...
for (idx = 0; idx < MAX_SONGS;) {
int t, a, b;
t = a = b = 0;
/* validate fgets read of complete line */
if (!fgets (title, MAX_TITLE, fp)) break;
t = shortread (title, fp);
if (!fgets (artist, MAX_ARTIST, fp)) break;
a = shortread (artist, fp);
if (!fgets (buf, MAX_MINSEC, fp)) break;
b = shortread (buf, fp);
if (t || a || b) continue; /* if any shortread, skip */
if (sscanf (buf, "%d %d", &min, &sec) != 2) { /* validate conversion */
fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
continue; /* skip line - tailor to your needs */
}
s[idx].time.min = min; /* copy to struct members & increment index */
s[idx].time.sec = sec;
strncpy (s[idx].title, title, MAX_TITLE);
strncpy (s[idx].artist, artist, MAX_ARTIST);
idx++;
}
...
/** validate complete line read, remove trailing newline from 's'.
* returns 1 on shortread, 0 - valid read, -1 invalid/empty string.
* if shortread, read/discard remainder of long line.
*/
int shortread (char *s, FILE *fp)
{
if (!s || !*s) return -1;
for (; *s && *s != '\n'; s++) {}
if (*s != '\n') {
int c;
while ((c = fgetc (fp)) != '\n' && c != EOF) {}
return 1;
}
*s = 0;
return 0;
}
(note: in the example above the result of the shortread check is saved for each of the lines that make up a title, artist, time group, so all three lines are always read before the group is accepted or skipped.)
To validate the approach I put together a short example that will help put it all in context. Look over the example and let me know if you have any further questions.
#include <stdio.h>
#include <string.h>
/* constant definitions */
enum { MAX_MINSEC = 10, MAX_ARTIST = 32, MAX_TITLE = 48, MAX_SONGS = 64 };
typedef struct {
int min;
int sec;
} stime;
typedef struct {
char title[MAX_TITLE];
char artist[MAX_ARTIST];
stime time;
} songs;
int shortread (char *s, FILE *fp);
int main (int argc, char **argv) {
char title[MAX_TITLE] = "";
char artist[MAX_ARTIST] = "";
char buf[MAX_MINSEC] = "";
int i, idx, min, sec;
songs s[MAX_SONGS] = {{ .title = "", .artist = "" }};
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
for (idx = 0; idx < MAX_SONGS;) {
int t, a, b;
t = a = b = 0;
/* validate fgets read of complete line */
if (!fgets (title, MAX_TITLE, fp)) break;
t = shortread (title, fp);
if (!fgets (artist, MAX_ARTIST, fp)) break;
a = shortread (artist, fp);
if (!fgets (buf, MAX_MINSEC, fp)) break;
b = shortread (buf, fp);
if (t || a || b) continue; /* if any shortread, skip */
if (sscanf (buf, "%d %d", &min, &sec) != 2) { /* validate conversion */
fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
continue; /* skip line - tailor to your needs */
}
s[idx].time.min = min; /* copy to struct members & increment index */
s[idx].time.sec = sec;
strncpy (s[idx].title, title, MAX_TITLE);
strncpy (s[idx].artist, artist, MAX_ARTIST);
idx++;
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (i = 0; i < idx; i++)
printf (" %2d:%2d %-32s %s\n", s[i].time.min, s[i].time.sec,
s[i].artist, s[i].title);
return 0;
}
/** validate complete line read, remove trailing newline from 's'.
* returns 1 on shortread, 0 - valid read, -1 invalid/empty string.
* if shortread, read/discard remainder of long line.
*/
int shortread (char *s, FILE *fp)
{
if (!s || !*s) return -1;
for (; *s && *s != '\n'; s++) {}
if (*s != '\n') {
int c;
while ((c = fgetc (fp)) != '\n' && c != EOF) {}
return 1;
}
*s = 0;
return 0;
}
Example Input
$ cat ../dat/titleartist.txt
First Title I Like
First Artist I Like
3 40
Second Title That Is Way Way Too Long To Fit In MAX_TITLE Characters
Second Artist is Fine
12 43
Third Title is Fine
Third Artist is Way Way Too Long To Fit in MAX_ARTIST
3 23
Fourth Title is Good
Fourth Artist is Good
32274 558212 (too long for MAX_MINSEC)
Fifth Title is Good
Fifth Artist is Good
4 27
Example Use/Output
$ ./bin/titleartist <../dat/titleartist.txt
3:40 First Artist I Like First Title I Like
4:27 Fifth Artist is Good Fifth Title is Good
Instead of sscanf(), I would use strtok() and atoi().
Just curious, why only 10 bytes for the two integers? Are you sure they are always that small?
By the way, I apologize for such a short answer. I'm sure there is a way to get sscanf() to work for you, but in my experience sscanf() can be rather finicky so I'm not a big fan. When parsing input with C, I have just found it a lot more efficient (in terms of how long it takes to write and debug the code) to just tokenize the input with strtok() and convert each piece individually with the various ato? functions (atoi, atof, atol, strtod, etc.; see stdlib.h). It keeps things simpler, because each piece of input is handled individually, which makes debugging any problems (should they arise) much easier. In the end I typically spend a lot less time getting such code to work reliably than I did when I used to try to use sscanf().
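For what it's worth, a minimal sketch of that tokenize-then-convert approach for a "min sec" line might look like the following. The names parse_min_sec and token_to_int are invented for this example, and strtol is used in place of atoi because atoi has no way to report a failed conversion. The sketch does not check that the value fits in an int.

```c
#include <stdlib.h>
#include <string.h>

/* Convert one token to int; returns 0 on success, -1 if the token is
 * missing or is not a clean decimal integer (hypothetical helper). */
static int token_to_int(const char *tok, int *out)
{
    char *end;
    long v;

    if (tok == NULL)
        return -1;
    v = strtol(tok, &end, 10);
    if (end == tok || *end != '\0')
        return -1;
    *out = (int)v;
    return 0;
}

/* Split "min sec" on whitespace and convert each piece individually.
 * Note that strtok modifies 'line' in place. */
int parse_min_sec(char *line, int *min, int *sec)
{
    const char *delim = " \t\r\n";
    char *tok_min = strtok(line, delim);
    char *tok_sec = tok_min ? strtok(NULL, delim) : NULL;

    if (token_to_int(tok_min, min) != 0 || token_to_int(tok_sec, sec) != 0)
        return -1;
    return 0;
}
```

Because each token is converted on its own, a failure can be traced to the exact piece of input that caused it.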
Use "%*s %*s %d %d" as your format string, instead...
You seem to be expecting sscanf to automagically skip the two tokens leading up to the decimal digit fields. It doesn't do that unless you explicitly tell it to (hence the pair of %*s).
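As a small sketch of what the assignment-suppressing %*s does (parse_record is a made-up name), the function below skips two whitespace-delimited tokens and converts the two integers that follow. It only applies when the whole record is in one string and the title and artist contain no spaces, which may well not match the file in the question.

```c
#include <stdio.h>

/* Skip two whitespace-delimited tokens, then read two integers.
 * Returns 0 on success, -1 if both integers could not be converted
 * (hypothetical helper for illustration). */
int parse_record(const char *text, int *min, int *sec)
{
    /* %*s matches a token but stores nothing and is not counted
     * in the return value, so success means a return of 2. */
    return sscanf(text, "%*s %*s %d %d", min, sec) == 2 ? 0 : -1;
}
```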
You can't expect the people who designed C to have designed it the same way as you would. You NEED to check the return value, as iharob said.
That's not all. You NEED to read (and understand relatively well) the entire scanf manual (the one written by OpenGroup is okay). That way you know how to use the function (including all of the subtle nuances of format strings) and what to do with the return value.
As a programmer, you need to read. Remember that well.
gcc 4.4.2
I was reading an article about scanf. I personally have never checked the return code of a scanf.
#include <stdio.h>
int main(void)
{
    char buf[64];
    if(1 == scanf("%63s", buf))
    {
        printf("Hello %s\n", buf);
    }
    else
    {
        fprintf(stderr, "Input error.\n");
    }
    return 0;
}
I am just wondering what other techniques experienced programmers use with scanf when they want to get user input. Or do they use another function, or write their own?
Thanks for any suggestions,
EDIT =========
#include <stdio.h>
int main(void)
{
    char input_buf[64] = {0};
    char data[64] = {0};
    printf("Enter something: ");
    /* fgets returns NULL on error or end of input, so parse only on success */
    if( fgets(input_buf, sizeof(input_buf), stdin) != NULL )
    {
        /* parse the input entered; the width keeps %s within data[] */
        sscanf(input_buf, "%63s", data);
    }
    printf("Input [ %s ]\n", data);
    return 0;
}
I think most programmers agree that scanf is bad, and most recommend fgets plus sscanf instead. I can use fgets to read in the input, but if I don't know what the user will enter, how do I know what to parse? For example, what if the user enters their address, which would contain numbers and characters in any order?
Don't use scanf directly. It's surprisingly hard to use. It's better to read an entire line of input and to then parse it (possibly with sscanf).
Read this entry (and the entries it references) from the comp.lang.c FAQ:
http://c-faq.com/stdio/scanfprobs.html
Edit:
Okay, to address your additional question from your own edit: If you allow unstructured input, then you're going to have to attempt to parse the string in multiple ways until you find one that works. If you can't find a valid match, then you should reject the input and prompt the user again, probably explaining what format you want the input to be in.
For anything more complicated, you'd probably be better off using a regular expression library or even using dedicated lexer/parser toolkits (e.g. flex and bison).
I don't use scanf() for interactive user input; I read everything as text using fgets(), then parse the input as necessary, using strtol() and strtod() to convert text to numeric values.
One example of where scanf() falls down is when the user enters a bad numeric value, but the initial part of it is valid, something like the following:
if (scanf("%d", &num) == 1)
{
    // process num
}
else
{
    // handle error
}
If the user types in "12e4", scanf() will successfully convert and assign the "12" to num, leaving "e4" in the input stream to foul up a future read. The entire input should be treated as bogus, but scanf() can't catch that kind of error. OTOH, if I do something like:
if (fgets(buffer, sizeof buffer, stdin))
{
    int val;
    char *chk;
    val = (int) strtol(buffer, &chk, 10);
    /* chk == buffer means no digits were converted at all */
    if (chk == buffer || (!isspace((unsigned char) *chk) && *chk != 0))
    {
        // no number, or a non-numeric character in input; reject it completely
    }
    else
    {
        // process val
    }
}
I can catch the error in the input and reject it before using any part of it. This also does a better job of not leaving garbage in the input stream.
scanf() is a great tool if you can guarantee your input is always well-formed.
scanf() has problems, in that if a user is expected to type an integer, and types a string instead, often the program bombs. This can be overcome by reading all input as a string (use getchar()), and then converting the string to the correct data type.
/* example one, to read a line at a time */
#include <stdio.h>
#include <ctype.h>
#define MAXBUFFERSIZE 80
void cleartoendofline( void ); /* ANSI function prototype */
void cleartoendofline( void )
{
    int ch; /* int, not char, so that EOF can be detected */
    ch = getchar();
    while( (ch != '\n') && (ch != EOF) )
        ch = getchar();
}
int main( void )
{
    int ch; /* handles user input; int so that it can hold EOF */
    char choice; /* Y/N answer read by scanf */
    char buffer[MAXBUFFERSIZE]; /* sufficient to handle one line */
    int char_count; /* number of characters read for this line */
    int exit_flag = 0;
    int valid_choice;
    while( exit_flag == 0 ) {
        printf("Enter a line of text (<80 chars)\n");
        ch = getchar();
        char_count = 0;
        /* stop one short of the buffer size to leave room for the terminator */
        while( (ch != '\n') && (ch != EOF) && (char_count < MAXBUFFERSIZE - 1)) {
            buffer[char_count++] = (char) ch;
            ch = getchar();
        }
        buffer[char_count] = 0x00; /* null terminate buffer */
        if( (ch != '\n') && (ch != EOF) )
            cleartoendofline(); /* discard the rest of an over-long line */
        printf("\nThe line you entered was:\n");
        printf("%s\n", buffer);
        valid_choice = 0;
        while( valid_choice == 0 ) {
            printf("Continue (Y/N)?\n");
            if( scanf(" %c", &choice ) != 1 ) {
                choice = 'N'; /* end of input: stop */
                break;
            }
            choice = (char) toupper( (unsigned char) choice );
            if((choice == 'Y') || (choice == 'N') )
                valid_choice = 1;
            else
                printf("\007Error: Invalid choice\n");
            cleartoendofline();
        }
        if( choice == 'N' ) exit_flag = 1;
    }
    return 0;
}
I call fgets in a loop until the end of the line has been read, and then call sscanf to parse the data. It's a good idea to check whether sscanf reached the end of the input line.
I rarely use scanf. Most of the time, I use fgets() to read data as a string. Then, depending upon the need, I may use sscanf(), or other functions such as the strto* family of functions, str*chr(), etc., to get data from the string.
If I use scanf() or fgets() + sscanf(), I always check the return values of the functions to make sure they did what I wanted them to do. I also don't use strtok() to tokenize strings, because I think the interface of strtok() is broken.