Getting particular strings in scanf - c

I was wondering whether it is possible to read only particular parts of a string using scanf.
For example, since I am reading from a file, I use fscanf.
Suppose I want to read name and number (where number is the 111-2222) when they are in a string such as:
Bob Hardy:sometext:111-2222:sometext:sometext
I use this, but it's not working:
(fscanf(read, "%23[^:] %27[^:] %10[^:] %27[^:] %d\n", name,var1, number, var2, var3))

Your initial format string fails because it does not consume the : delimiters.
If you want scanf() to read a portion of the input, but you don't care what is actually read, then you should use a field descriptor with the assignment-suppression flag (*):
char nl;
fscanf(read, "%23[^:]:%*[^:]:%10[^:]%*[^\n]%c", name, number, &nl);
As a bonus, you don't need to worry about buffer overruns for fields with assignment suppressed.
You should not attempt to match a single newline via a trailing newline character in the format, because a literal newline (or space or tab) in the format will match any run of whitespace. In this particular case, it would consume not just the line terminator but also any leading whitespace on the next line.
The last field is not suppressed, even though it will almost always receive a newline, because that way you can tell from the return value if you've scanned the last line of the file and it is not newline-terminated.
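A minimal sketch of the assignment-suppression idea, applied with sscanf() to a line already held in memory so it can be tried without a file. The helper name parse_name_number and the buffer sizes (24 and 11 bytes, matching the %23 and %10 widths) are assumptions for illustration, not part of the question.

```c
#include <stdio.h>

/* Hypothetical helper: extracts the first and third colon-separated fields.
 * name must hold at least 24 bytes and number at least 11 bytes.
 * Returns 1 when both fields were assigned, 0 otherwise. */
int parse_name_number(const char *line, char *name, char *number)
{
    /* %*[^:] matches the second field but stores nothing, so it needs
     * no buffer and is not counted in the return value. */
    int assigned = sscanf(line, "%23[^:]:%*[^:]:%10[^:]", name, number);
    return assigned == 2;
}
```

Because suppressed fields do not count, the expected return value of sscanf() here is 2, not 3.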

Check fscanf() return value.
fscanf(read, "%23[^:] %27[^:] ... is failing because after scanning the first field with %23[^:], fscanf() encounters a ':'. The space in the format matches zero or more white-space characters, so it consumes nothing here; the next directive, %27[^:], must match at least one character other than ':' and therefore fails, and scanning stops.
Had the code checked the return value of fscanf(), which was certainly 1, the source of the problem might have been self-evident. The scanning needs to consume the ':', so add it to the format: "%23[^:]: %27[^:]: ...
Better to use fgets()
Using fscanf() to read data and detect properly and improperly formatted data is very challenging. It can be done correctly to scan expected input. Yet it rarely works to handle some incorrectly formatted input.
Instead, simply read a line of data and then parse it. Using '%n' is an easy way to detect complete conversion, as it saves the char scan count (if scanning gets that far).
char buffer[200];
if (fgets(buffer, sizeof buffer, read) == NULL) {
return EOF;
}
int n = 0;
sscanf(buffer, " %23[^:]: %27[^:]: %10[^:]: %27[^:]:%d %n",
name, var1, number, var2, &var3, &n);
if (n == 0) {
return FAIL; // scan incomplete
}
if (buffer[n]) {
return FAIL; // Extra data on line
}
// Success!
Note: sample input ended with text, but original format used "%d". Unclear on OP's intent.
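If the last field is in fact text rather than an integer (an assumption based on the sample line, since the intent is unclear), the same sscanf()/%n check could be sketched as follows. The struct record, its field names, and parse_record are hypothetical; the last scan set also excludes '\n' so that the line terminator is not stored.

```c
#include <stdio.h>

/* Hypothetical record layout; sizes match the field widths used below. */
struct record {
    char name[24];
    char var1[28];
    char number[11];
    char var2[28];
    char var3[28];   /* assumed to be text, not an integer */
};

/* Returns 1 if the whole line matched the expected layout, 0 otherwise. */
int parse_record(const char *line, struct record *r)
{
    int n = 0;
    sscanf(line, " %23[^:]: %27[^:]: %10[^:]: %27[^:]: %27[^:\n] %n",
           r->name, r->var1, r->number, r->var2, r->var3, &n);
    /* n stays 0 unless every directive before %n succeeded;
     * line[n] is '\0' only if nothing was left over. */
    return n > 0 && line[n] == '\0';
}
```

The caller would pass the buffer filled by fgets() and treat a return of 0 as a malformed line.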

Related

if my scanf variable is a float and a user inputs a character how can i prompt them to input a number? assuming the scanf is inside a do while loop

I have tried to use k = getchar(), but it doesn't work either.
Here is my code:
#include<stdio.h>
int main()
{
float height;
float k=0;
do
{
printf("please type a value..\n");
scanf("%f",&height);
k=height;
}while(k<0);// i assume letters and non positive numbers are below zero.
//so i want the loop to continue until one types a +ve float.
printf("%f",k);
return 0;
}
I want it so that if a user types letters, negative numbers, or other characters, he/she is prompted to type the value again until he/she types a positive number.
Like Govind Parmar already suggested, it is better/easier to use fgets() to read a full line of input, rather than use scanf() et al. for human-interactive input.
The underlying reason is that the interactive standard input is line-buffered by default (and changing that is nontrivial). So, when the user starts typing their input, it is not immediately provided to your program; only when the user presses Enter.
If we do read each line of input using fgets(), we can then scan and convert it using sscanf(), which works much like scanf()/fscanf() do, except that sscanf() works on string input, rather than an input stream.
Here is a practical example:
#include <stdlib.h>
#include <stdio.h>
#define MAX_LINE_LEN 100
int main(void)
{
char buffer[MAX_LINE_LEN + 1];
char *line, dummy;
double value;
while (1) {
printf("Please type a number, or Q to exit:\n");
fflush(stdout);
line = fgets(buffer, sizeof buffer, stdin);
if (!line) {
printf("No more input; exiting.\n");
break;
}
if (sscanf(line, " %lf %c", &value, &dummy) == 1) {
printf("You typed %.6f\n", value);
continue;
}
if (line[0] == 'q' || line[0] == 'Q') {
printf("Thank you; now quitting.\n");
break;
}
printf("Sorry, I couldn't parse that.\n");
}
return EXIT_SUCCESS;
}
The fflush(stdout); is not necessary, but it does no harm either. It basically ensures that everything we have printf()'d or written to stdout will be flushed to the file or device; in this case, that it will be displayed in the terminal. (It is not necessary here, because standard output is also line buffered by default when it refers to a terminal, so the \n in the printf pattern, printing a newline, also causes the flush.)
I do like to sprinkle those fflush() calls, wherever I need to remember that at this point, it is important for all output to be actually flushed to output, and not cached by the C library. In this case, we definitely want the prompt to be visible to the user before we start waiting for their input!
(But, again, because that printf("...\n"); before it ends with a newline, \n, and we haven't changed the standard output buffering, the fflush(stdout); is not needed there.)
The line = fgets(buffer, sizeof buffer, stdin); line contains several important details:
We defined the macro MAX_LINE_LEN earlier on, because fgets() can only read a line as long as the buffer it is given, and will return the rest of that line in following calls.
(You can check if the line read ended with a newline: if it does not, then either it was the final line in an input file that does not have a newline at the end of the last line, or the line was longer than the buffer you have, so you only received the initial part, with the rest of the line still waiting for you in the input stream.)
The +1 in char buffer[MAX_LINE_LEN + 1]; is because strings in C are terminated by a nul char, '\0', at end. So, if we have a buffer of 19 characters, it can hold a string with at most 18 characters.
Note that NUL, or nul with one ell, is the name of the ASCII character with code 0, '\0', and is the end-of-string marker character.
NULL (or sometimes nil), however, is a null pointer: a pointer value that points to no object. In C it is commonly defined as (void *)0. It is the sentinel and error value we use, when we want to set a pointer to a recognizable error/unused/nothing value, instead of pointing to actual data.
sizeof buffer is the number of chars, total (including the end-of-string nul char), used by the variable buffer.
In this case, we could have used MAX_LINE_LEN + 1 instead (the second parameter to fgets() being the number of characters in the buffer given to it, including the reservation for the end-of-string char).
The reason I used sizeof buffer here, is because it is so useful. (Do remember that if buffer was a pointer and not an array, it would evaluate to the size of a pointer; not the amount of data available where that pointer points to. If you use pointers, you will need to track the amount of memory available there yourself, usually in a separate variable. That is just how C works.)
And also because it is important that sizeof is not a function, but an operator: it does not evaluate its argument (unless the argument is a variable-length array), it only considers the size (of the type) of the argument. This means that if you do something silly like sizeof (i++), you'll find that i is not incremented, and that it yields the exact same value as sizeof i. Again, this is because sizeof is an operator, not a function, and it just returns the size of its argument.
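The points about sizeof can be checked with a small sketch. The three function names are made up for illustration, and the 101-byte array simply mirrors MAX_LINE_LEN + 1 from the example above.

```c
#include <stddef.h>

/* Hypothetical helpers illustrating sizeof on an array versus a pointer. */
size_t size_of_array(void)
{
    char buffer[101];
    return sizeof buffer;          /* the whole array: 101 bytes */
}

size_t size_of_pointer(void)
{
    char buffer[101];
    char *p = buffer;
    return sizeof p;               /* only the size of the pointer itself */
}

int operand_not_evaluated(void)
{
    int i = 5;
    size_t s = sizeof (i++);       /* i++ is not executed */
    (void)s;                       /* silence "unused variable" warnings */
    return i;                      /* still 5 */
}
```

Some compilers warn about the side effect inside sizeof, which is a good hint that the expression does nothing.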
fgets() returns a pointer to the line it stored in the buffer, or NULL if an error occurred or the end of input was reached before any characters were read.
This is also why I named the pointer line, and the storage array buffer. They describe my intent as a programmer. (That is very important when writing comments, by the way: do not describe what the code does, because we can read the code; but do describe your intent as to what the code should do, because only the programmer knows that, but it is important to know that intent if one tries to understand, modify, or fix the code.)
The scanf() family of functions returns the number of successful conversions. To detect input where the proper numeric value was followed by garbage, say 1.0 x, I asked sscanf() to ignore any whitespace after the number (whitespace means tabs, spaces, and newlines; '\t', '\n', '\v', '\f', '\r', and ' ' for the default C locale using ASCII character set), and try to convert a single additional character, dummy.
Now, if the line does contain anything besides whitespace after the number, sscanf() will store the first character of that anything in dummy, and return 2. However, because I only want lines that only contain the number and no dummy characters, I expect a return value of 1.
To detect the q or Q (but only as the first character on the line), we simply examine the first character in line, line[0].
If we included <string.h>, we could use e.g. if (strchr(line, 'q') || strchr(line, 'Q')) to see if there is a q or Q anywhere in the line supplied. The strchr(string, char) returns a pointer to the first occurrence of char in string, or NULL if none; and all pointers but NULL are considered logically true. (That is, we could equivalently write if (strchr(line, 'q') != NULL || strchr(line, 'Q') != NULL).)
Another function we could use declared in <string.h> is strstr(). It works like strchr(), but the second parameter is a string. For example, (strstr(line, "exit")) is only true if line has exit in it somewhere. (It could be brexit or exitology, though; it is just a simple substring search.)
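As a rough sketch of those two <string.h> checks (the helper names has_q_anywhere and mentions_exit are hypothetical):

```c
#include <string.h>

/* Hypothetical helper: true if the line contains a q or Q anywhere. */
int has_q_anywhere(const char *line)
{
    return strchr(line, 'q') != NULL || strchr(line, 'Q') != NULL;
}

/* Hypothetical helper: true if "exit" occurs anywhere in the line.
 * This is a plain, case-sensitive substring search. */
int mentions_exit(const char *line)
{
    return strstr(line, "exit") != NULL;
}
```

Both are deliberately loose tests; matching a whole word or ignoring case would need extra work.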
In a loop, continue skips the rest of the loop body, and starts the next iteration of the loop body from the beginning.
In a loop, break skips the rest of the loop body, and continues execution after the loop.
EXIT_SUCCESS and EXIT_FAILURE are the standard exit status codes <stdlib.h> defines. Most prefer using 0 for EXIT_SUCCESS (because that is what it is in most operating systems), but I think spelling the success/failure out like that makes it easier to read the code.
I wouldn't use scanf-family functions for reading from stdin in general.
fgets is better since it takes input as a string whose length you specify, avoiding buffer overflows, which you can later parse into the desired type (if any). For the case of float values, strtof works.
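A minimal sketch of the strtof step, assuming the line has already been read with fgets(). The helper name parse_positive_float and its exact acceptance rules (only trailing whitespace allowed, value must be greater than zero) are assumptions chosen to fit the question.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts a line to a positive float.
 * Returns 1 on success and stores the value in *out; returns 0 if the line
 * is not a number, has trailing garbage, is out of range, or is not > 0. */
int parse_positive_float(const char *line, float *out)
{
    char *end;
    errno = 0;
    float value = strtof(line, &end);

    if (end == line)                        /* nothing was converted */
        return 0;
    if (errno == ERANGE)                    /* overflow or underflow */
        return 0;
    while (isspace((unsigned char)*end))    /* allow trailing newline/blanks */
        end++;
    if (*end != '\0')                       /* something else followed */
        return 0;
    if (!(value > 0.0f))                    /* rejects zero, negatives, NaN */
        return 0;

    *out = value;
    return 1;
}
```

In a prompt loop, a return of 0 would simply mean "ask again".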
However, if the specification for your deliverable or homework assignment requires the use of scanf with %f as the format specifier, what you can do is check its return value, which will contain a count of the number of format specifiers in the format string that were successfully scanned:
§ 7.21.6.2:
The [scanf] function returns the value of the macro EOF if an input failure occurs
before the first conversion (if any) has completed. Otherwise, the function returns the
number of input items assigned, which can be fewer than provided for, or even zero, in
the event of an early matching failure.
From there, you can diagnose whether the input is valid or not. Also, when scanf fails, stdin is not cleared and subsequent calls to scanf (i.e. in a loop) will continue to see whatever is in there. This question has some information about dealing with that.
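If scanf with %f really is required, one possible shape for the loop is sketched below. It checks the return value and throws away the rest of the line after every attempt, so a rejected token cannot be seen again. The helper names, the FILE * parameter (pass stdin for interactive use), and the optional prompt argument are all assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads and discards the rest of the current line. */
static void discard_line(FILE *in)
{
    int c;
    while ((c = fgetc(in)) != '\n' && c != EOF)
        ;
}

/* Hypothetical helper: keeps reading until a positive float is entered.
 * Returns 1 and stores the value in *out, or 0 on end of input or error. */
int read_positive_float(FILE *in, const char *prompt, float *out)
{
    for (;;) {
        float value;
        int rc;

        if (prompt != NULL) {
            fputs(prompt, stdout);
            fflush(stdout);
        }
        rc = fscanf(in, "%f", &value);
        if (rc == EOF)
            return 0;               /* no more input */
        discard_line(in);           /* drop the bad text or the newline */
        if (rc == 1 && value > 0.0f) {
            *out = value;
            return 1;
        }
        /* rc == 0 (not a number) or a non-positive value: ask again */
    }
}
```

A call such as read_positive_float(stdin, "please type a value..\n", &height) would replace the do-while loop in the question.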

how to scan line in c program not from file

How to scan total line from user input with c program?
I tried scanf("%99[^\n]",st), but it does not work when I scan something before this scan statement. It works if this is the first scan statement.
How to scan total line from user input with c program?
There are many ways to read a line of input, and your usage of the word scan suggests you're already focused on the scanf() function for the job. This is unfortunate, because, although you can (to some extent) achieve what you want with scanf(), it's definitely not the best tool for reading a line.
As already stated in the comments, your scanf() format string will stop at a newline, so the next scanf() will first find that newline and it can't match [^\n] (which means anything except newline). As a newline is just another whitespace character, adding a blank in front of your conversion will silently eat it up ;)
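A small sketch of that leading-blank fix, written against a FILE * so it can be tried on a file as well as on stdin. The helper name read_int_then_line and the choice of an integer as the "something scanned before" are assumptions.

```c
#include <stdio.h>

/* Hypothetical helper: reads an integer, then the text of the next line.
 * line must hold at least 100 bytes. Returns 1 on success, 0 otherwise. */
int read_int_then_line(FILE *in, int *n, char *line)
{
    if (fscanf(in, "%d", n) != 1)
        return 0;
    /* The newline after the number is still unread. The leading blank in
     * the format skips it (and any other whitespace) before %99[^\n]. */
    return fscanf(in, " %99[^\n]", line) == 1;
}
```

Note that the blank also skips empty lines and any leading spaces of the line being read, which may or may not be wanted.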
But now for the better solution: Assuming you only want to use standard C functions, there's already one function for exactly the job of reading a line: fgets(). The following code snippet should explain its usage:
char line[1024];
char *str = fgets(line, 1024, stdin); // read from the standard input
if (!str)
{
// couldn't read input for some reason, handle error here
exit(1); // <- for example
}
// fgets includes the newline character that ends the line, but if the line
// is longer than 1022 characters, it will stop early here (it will never
// write more bytes than the second parameter you pass). Often you don't
// want that newline character, and the following line overwrites it with
// 0 (which is "end of string") **only** if it was there:
line[strcspn(line, "\n")] = 0;
Note that you might want to check for the newline character with strchr() instead, so you actually know whether you have the whole line or maybe your input buffer was too small. In the latter case, you might want to call fgets() again.
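One way the strchr() check could look is sketched here. The helper read_line and its truncated flag are hypothetical, and this version discards the part of the line that did not fit instead of calling fgets() again.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (size bytes) without the
 * trailing newline. Sets *truncated to 1 if part of the line did not fit;
 * the excess is discarded. Returns 1 on success, 0 on end of input/error. */
int read_line(FILE *in, char *buf, size_t size, int *truncated)
{
    char *newline;
    int c;

    *truncated = 0;
    if (fgets(buf, (int)size, in) == NULL)
        return 0;

    newline = strchr(buf, '\n');
    if (newline != NULL) {
        *newline = '\0';            /* complete line: strip the newline */
        return 1;
    }

    /* No newline stored: either the input ended or the line was too long.
     * Consume what is left of the line and note whether anything was lost. */
    while ((c = fgetc(in)) != '\n' && c != EOF)
        *truncated = 1;
    return 1;
}
```

A final line without a terminating newline is reported as complete as long as it fits in the buffer.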
How to scan total line from user input with c program?
scanf("%99[^\n]",st) reads a line, almost.
In the C Standard Library, a line is defined as follows:
A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined. C11dr §7.21.2 2
scanf("%99[^\n]",st) fails to read the end of the line, the '\n'.
That is why on the 2nd call, the '\n' remains in stdin to be read and scanf("%99[^\n]",st) will not read it.
There are ways to use scanf("%99[^\n]",st);, or a variation of it as a step in reading user input, yet they suffer from 1) Not handling a blank line "\n" correctly 2) Missing rare input errors 3) Long line issues and other nuances.
The preferred portable solution is to use fgets(). Loop example:
#define LINE_MAX_LENGTH 200
char buf[LINE_MAX_LENGTH + 1 + 1]; // +1 for long lines detection, +1 for \0
while (fgets(buf, sizeof buf, stdin)) {
size_t eol = strcspn(buf, "\n"); // see the ** note below
buf[eol] = '\0'; // trim potential \n
if (eol >= LINE_MAX_LENGTH) {
// IMO, user input exceeding a sane generous threshold is a potential hack
fprintf(stderr, "Line too long\n");
// TBD : Handle excessive long line
}
// Use buf[]
}
Many platforms support getline() to read a line.
Shortcomings: it is not part of the C standard, and it allows a hacker to overwhelm system resources with insanely long lines.
In C, there is not a great solution. What is best depends on the various coding goals.
** I prefer size_t eol = strcspn(buf, "\n\r"); to read lines in a *nix environment that may end with "\r\n".
scanf() should never be used for user input. The best way to get input from the user is with fgets().
Read more: http://sekrit.de/webdocs/c/beginners-guide-away-from-scanf.html
char str[1024];
char *alline = fgets(str, 1024, stdin);
scanf("%[^'\n']s",alline);
I think the correct solution should be like this. It worked for me.
Hope it helps.

When using fscanf to parse words, how can I check when I skipped a line

I'm working on a program that reads text from a file and parses the text to words and manipulates them. I'm parsing with fscanf like that
while (fscanf (fp, " %32[^ ,.\t\n]%*c", word) == 1)
{
/*manipulate the text word by word */
…
}
I wanna write next to each word that I find in which line I found it.
Is there a way that I can check when I moved down a line
when using the function fscanf?
The soundest advice is to use fgets() or perhaps POSIX getline() to read lines and then consider using sscanf() to parse each line. You will probably need to consider how to use sscanf() in a loop. There are also numerous other options for parsing the line instead of sscanf(), such as strtok_r() or the less desirable strtok() (or, on Windows, strtok_s()); strspn(), strcspn(), strpbrk(); and other functions that are not as standardized.
If you feel you must use fscanf(), then you probably need to capture the trailing context. A simple version of that would be:
char c;
while (fscanf(fp, " %32[^ ,.\t\n]%c", word, &c) == 2)
…
This captures the character after the word, assuming there is one. If your file doesn't end with a newline, it is possible a word will be lost. It's also rather too easy to miss a newline. For example, if the line ends with a full stop (period) before the newline, then c will hold the . and the newline will be skipped by the next iteration of the loop. You could overcome that with:
char s[33];
while (fscanf(fp, " %32[^ ,.\t\n]%32[ ,.\t\n]", word, s) == 2)
…
Note that the length in the format string must be one less than the length in the variable declaration!
After a successful call to fscanf(), the string s could contain multiple newlines and blanks and so on. The fscanf() functions mostly don't care about newlines, and the scan set for s would read multiple newlines in a row if that's what's in the data file.
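To turn the captured separators into line numbers, the newlines in them have to be counted. The sketch below is a variation on the loop above rather than a copy of it: it scans separators and words in separate fscanf() calls, so that blank lines before a word are counted as well. The names count_newlines and word_line_numbers are hypothetical, and words longer than 32 characters are split into pieces.

```c
#include <stdio.h>

/* Hypothetical helper: counts the newlines in a separator string. */
static int count_newlines(const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++)
        if (*s == '\n')
            n++;
    return n;
}

/* Hypothetical helper: prints each word with its line number, records the
 * line numbers in lines[] (at most max entries), and returns the number
 * of words recorded. */
int word_line_numbers(FILE *fp, int lines[], int max)
{
    char word[33], sep[33];
    int line = 1, count = 0;

    while (count < max) {
        /* Separators first. %[ does not skip whitespace by itself, so
         * every newline ends up in sep and can be counted. A return of 0
         * just means the next character starts a word. */
        while (fscanf(fp, "%32[ ,.\t\n]", sep) == 1)
            line += count_newlines(sep);
        if (fscanf(fp, "%32[^ ,.\t\n]", word) != 1)
            break;                          /* end of input */
        printf("%s: line %d\n", word, line);
        lines[count++] = line;
    }
    return count;
}
```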
If you explicitly capture the status from fscanf(), you can be more sensitive to files that end without a newline (or a punctuation character), or that cause other problems:
char s[33];
int rc;
while ((rc = fscanf(fp, " %32[^ ,.\t\n]%32[ ,.\t\n]", word, s)) != EOF)
{
switch (rc)
{
case 2:
…proceed as normal, checking s for newlines.
break;
case 1:
…probably an overlong word or EOF without a newline.
break;
case 0:
…probably means the next character is one of comma or dot.
…spaces, tabs, newlines will be skipped without detection
…by the leading space in the format string.
break;
default:
assert(0);
break;
}
}
If you start to care about !, ?, ;, :, ' or " characters — not to mention ( and ) — life gets more complex still. In fact, at that point, the alternatives to sscanf() start looking much better.
It is very hard to use the scanf() family of functions correctly. They're anything but tools for the novice, at least once you start needing to do anything complex. You could look at A beginner's guide to not using scanf(), which contains much valuable information. I'm not wholly convinced by the last couple of examples which are supposed to be bomb-proof uses of scanf(). (It is a little easier to use sscanf() correctly, but you still need to understand what you're up to in detail.)
Read lines with fgets() and then parse them using sscanf:
char buff[1024];
int lineno = 0;
int offset = 0;
while (fgets(buff, 1024, fp)) {
lineno++;
offset = 0;
while (sscanf(buff + offset, " %32[^ ,.\t\n]%*c", word) == 1)
{
/* manipulate the text word by word */
}
}
In the second loop you must advance the buffer offset after each word in order to parse the line correctly; otherwise sscanf() keeps returning the first word. For this you can use %n, for example, to get the number of bytes read.
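A possible completed version of that loop is sketched below, under a few assumptions: the function name print_words_with_lines is made up, separators are skipped with strspn() (a blank in a scanf format only skips whitespace, not commas or full stops), and %n supplies the number of characters consumed by each word.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every word with the line it was found on
 * and returns the total number of words. Words longer than 32 characters
 * are split. A line longer than the buffer is read in pieces and is
 * therefore counted more than once. */
int print_words_with_lines(FILE *fp)
{
    char buff[1024], word[33];
    int lineno = 0, total = 0;

    while (fgets(buff, sizeof buff, fp) != NULL) {
        size_t offset = 0;
        int used = 0;

        lineno++;
        for (;;) {
            /* Skip any run of separators before the next word. */
            offset += strspn(buff + offset, " ,.\t\n");
            if (sscanf(buff + offset, "%32[^ ,.\t\n]%n", word, &used) != 1)
                break;                      /* no more words on this line */
            printf("%d: %s\n", lineno, word);
            offset += (size_t)used;         /* advance past the word */
            total++;
        }
    }
    return total;
}
```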

Getting a char* with spaces in C from sscanf

I am attempting to read a line written in the format:
someword: .asciiz "want this as a char*"
There is an arbitrary amount of white space between words. I am curious if there is a simple way of getting the internal characters in the quotes into a char* variable using something like sscanf? I am guaranteed the quotes and that there will be no more than 32 characters (including spaces). There will also be a new line character immediately following the quotes.
Most scanf() field descriptors implicitly cause leading whitespace to be skipped and expect the field to be whitespace-terminated. To scan a string that may contain whitespace, however, you can use the %[] field descriptor with an appropriate scan set. Thus, you might scan a sequence of lines following the pattern you describe by looping over calls like this:
char keyword[32], value[32], description[32];
scanf("%s%s%*[ \t]\"%[^\"]\"", keyword, value, description);
That format string:
scans two whitespace-delimited strings into char arrays keyword and value,
scans but does not assign one or more whitespace characters followed by a quotation mark,
scans everything up to but not including the next quotation mark into char array description, and scans and discards a quotation mark.
It relies on the data to be correctly formatted; among other things, this is vulnerable to a buffer overflow if the data are malformed. You can address that by specifying maximum field widths in the format string.
Note, too, that you should check the return value of the function to ensure that all fields were successfully matched. That will allow you to terminate early in the event of malformed input, and even to present valid information about the location of the malformation.
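A sketch of what those two suggestions (field widths plus a return-value check) might look like when combined, reading with fgets() so that the position of a malformed line can be reported. The helper name first_malformed_line, the 256-byte line buffer, and the 33-byte field arrays are assumptions; the extra %n is there because a return value of 3 alone does not prove that the closing quotation mark was present.

```c
#include <stdio.h>

/* Hypothetical helper: checks lines of the form
 *     label: .asciiz "quoted text"
 * Returns 0 if every line matches, else the 1-based number of the first
 * malformed line. */
int first_malformed_line(FILE *in)
{
    char buf[256];
    char keyword[33], value[33], description[33];
    int lineno = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        int end = 0;
        int rc;

        lineno++;
        /* Widths keep each field inside its 33-byte array; %n is reached
         * only if the closing quotation mark was matched as well. */
        rc = sscanf(buf, "%32s%32s \"%32[^\"]\"%n",
                    keyword, value, description, &end);
        if (rc != 3 || end == 0)
            return lineno;
    }
    return 0;
}
```

An empty quoted string ("") counts as malformed here, because %[ must match at least one character.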
You can use scanf ("%s%s%31[^\n]",s1,s2,s3);
Example:
#include <stdio.h>
int main()
{
char s1[32],s2[32],s3[32];
printf ("write something: ");
scanf ("%s%s%31[^\n]",s1,s2,s3);
printf ("%s %s %s",s1,s2,s3);
return 0;
}
s1 and s2 will ignore spaces but s3 won't
Use \"%32[^\"]\" to capture the quoted phrase. Use "%n" to detect success.
char w1[32+1];
char w2[32+1];
char w3[32+1];
int n = 0;
sscanf(buffer, "%32s%32s \"%32[^\"]\" %n", w1, w2, w3, &n);
if (n == 0) return fail; // format mis-match
if (buffer[n]) return fail; // Extra garbage detected
// else good to go.
"%32s" Skip white-space,then read & save up to 32 non-white-space char. Append '\0'.
" " Skip white space.
"\"" Match a '\"'.
"%32[^\"]" Read and save up to 32 non-'\"' char. Append '\0'.
"%n" Save the count of characters scanned.

Reading characters from a stream up to whitespace using isspace in c

I'm trying to write a function that takes a stream as an argument and reads from that stream and reads in the contents of the file up to the first whitespace as defined by the isspace function, and then uses the strtok function to parse the string. I'm not sure how to start it though with which function to read a line and ignore the whitespace. I know fgetc and getc only read one character at a time, and looking up the fscanf reference, will that work? Or does that only find items in your stream according to the format specifiers %s? Thanks!
To read an entire line at a time, you generally should use fgets. Some care is needed in case a line in the stream is longer than your buffer: the leftover will remain in the stream, which might not be what you want. (If you want to ignore the rest of the line, you can use fgets followed by fscanf as described at http://home.datacomm.ch/t_wolf/tw/c/getting_input.html.)
If you want to read in an entire line without worrying about a buffer size, you may wish to look into Chuck Falconer's ggets function which dynamically allocates a buffer for you (this does mean that you are responsible for freeing it).
C Answer
The fscanf s conversion will match a sequence of non-white-space characters. The input string stops at white space (as defined by isspace) or at the maximum field width, whichever occurs first. Note that there must be enough space in the provided buffer or it may be overflowed by long input.
FILE *fp;
char cstr[128];
fp = fopen("test.txt", "r");
while (fscanf(fp, "%127s", cstr) == 1) /* stops at EOF; width fits cstr[128] */
{
...
}
Original C Answer
The fgets function will allow you to read in the file one line at a time, but you will still need to check each character with isspace.
Since isspace may include space, form-feed ('\f'), newline ('\n'), carriage return ('\r'), horizontal tab ('\t'), and vertical tab ('\v') in its check for white-space characters, your best bet may be to read one character at a time in a loop using the fgetc function. Note that if the integer value returned by fgetc() is stored into a variable of type char and then compared against the integer constant EOF, the comparison may never succeed, because sign-extension of a variable of type char on widening to integer is implementation-defined.
FILE *fp;
int c;
fp = fopen("test.txt", "r");
while ((c = fgetc(fp)) != EOF)
{
if (isspace(c))
{
...
}
else
{
...
}
}
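Building on the fgetc() loop above, reading one whitespace-delimited token could be sketched like this. The helper name read_token and its policy of dropping the tail of an overlong token are assumptions.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: skips leading whitespace, then copies characters
 * into buf until whitespace or end of input. At most size - 1 characters
 * are stored; the rest of an overlong token is read and dropped.
 * Returns the number of characters stored (0 means no token was found). */
size_t read_token(FILE *fp, char *buf, size_t size)
{
    size_t len = 0;
    int c;                                  /* int, so EOF is representable */

    while ((c = fgetc(fp)) != EOF && isspace(c))
        ;                                   /* skip leading whitespace */
    while (c != EOF && !isspace(c)) {
        if (len + 1 < size)
            buf[len++] = (char)c;
        c = fgetc(fp);
    }
    if (size > 0)
        buf[len] = '\0';
    return len;
}
```

The whitespace character that ends a token is consumed, so the next call starts right after it.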
Original C++ Answer
The istream::getline method will allow you to read in one line at a time and optionally specify the delimiter (default is '\n').
Since isspace may include space, form-feed ('\f'), newline ('\n'), carriage return ('\r'), horizontal tab ('\t'), and vertical tab ('\v') in its check for white-space characters, your best bet may be to read one character at a time in a loop using the istream::get method.
char c;
string str;
ifstream file("test.txt",ios::in);
while (file.get(c))
{
if (isspace((unsigned char)c))
{
...
}
else
{
str.push_back(c);
}
file.peek();
if (file.eof())
{
break;
}
}
Note: Error checking was omitted from all of the above code for simplicity.
well although getc/fgetc only retrieve 1 character at a time, you can put them into a loop right ?:)

Resources