I have to use this file information
Cigaretas,For Smoking,1,5.500000
Woods,For indor chimneys,2,100.000000
To fill this structure:
typedef struct product{
char name[32];
char about_product[32];
int product_id;
double price;
}sProduct;
Using this function:
void print_prod(){
sProduct ptr;
FILE *fp=fopen("product.txt", "r");
int cnt=1;
do{
fscanf(fp, "%s,%s,%d,%g\n", ptr.name, ptr.about_product, &ptr.product_id, &ptr.price);
cnt++;
}while(!feof(fp));
fclose(fp);
}
But it doesn't bugs because I use comma for a divider in the txt file.
There are two issues here, both revolving around how scanf processes %s directives:
it scans whitespace-delimited strings, so it takes the internal whitespace in some of your input data as delimiters
it scans whitespace-delimited strings, so it does not recognize the comma as a field delimiter
There are several ways you could go about the input task you have set, but to continue to use scanf to scan your particular input directly into your data structure, you want the %[ directive instead of %s.
The %[ directive accepts a "scanset" describing exactly which characters may appear in the field, which can include whitespace. This takes a form similar to a regex or glob character class. The corresponding argument should be a pointer to char, just as for a %s directive. You should also be aware that unlike most directives, including %s, the %[ directive does not skip leading whitespace. For you, its use might look like this:
fscanf(fp, "%[^,],%[^,],%d,%g\n", ptr.name, ptr.about_product, &ptr.product_id, &ptr.price);
The two %[^,] field descriptors there each scan any number of characters other than a comma (,).
Additionally, you would be wise to specify field widths, so as to avoid overrunning your arrays' bounds in the event that too-long fields appear in your data. Since you provide 32-byte arrays, and one byte of each must be reserved for a string terminator, that could be this:
fscanf(fp, "%31[^,],%31[^,],%d,%g\n", ptr.name, ptr.about_product, &ptr.product_id, &ptr.price);
But also, as with substantially all functions that have any failure modes, you should also check, programmatically, whether the function succeeded. You need to do this proactively and consistently, else your programs may fail in subtle and surprising ways. For this use of scanf, and many similar ones, that means checking that the return value is equal to the number of input directives in the format (the return value indicates how many fields were successfully scanned and assigned):
int fields;
fields = fscanf(fp, "%31[^,],%31[^,],%d,%g\n", ptr.name, ptr.about_product,
&ptr.product_id, &ptr.price);
if (fields != 4) {
// handle input error ...
}
Addendum:
Continuing with the point that you should check for failures in functions that have failure modes, I also observe that another such function is fopen(). It is entirely possible for this function to fail, in which case it returns a null pointer. A robust program would definitely check for this case and handle it gracefully, using a pattern similar to the one I described for scanf(). This, however, is not related to the specific misbehavior you asked about. (Credit to #RobertSsupportsMonicaCellio.)
Related
I have this .txt file that contains only:
THN1234 54
How can I take only the number 54, to isolate it from the rest and to use it as an integer variable in my program?
If the input is from standard input, then you could use:
int value;
if (scanf("%*s %d", &value) != 1)
…Oops - incorrectly formatted data…
…use value…
The %*s reads but discards optional leading blanks and a sequence of one or more non-blanks (THN1234); the blank skips more optional blanks; the %d reads the integer, leaving a newline behind in the input buffer. If what follows the blank is not convertible to a number, or if you get EOF, you get to detect it in the if condition and report it in the body of the if.
Hmmm…and I see that BLUEPIXY said basically the same (minus the explanation) in their comment, even down to the choice of integer variable name.
Wow. It's been a long time since I have used C. However, I think the answer is similar for C and C++ in this case. You can use strtok_r to split the string into tokens then take the second token and parse it into an int. See http://www.cplusplus.com/reference/clibrary/cstring/strtok/.
You might also want to look at this question as well.
Today I've looked over some C code that was parsing data from a text file
and I've stumbled upon these lines
fgets(line,MAX,fp);
if(line[strlen(line)-1]=='\n'){
line[strlen(line)-1]='\0');
}else{
printf("Error on line length\n");
exit(1);
}
sscanf((line,"%s",records->bday));
with record being a structure
typedef struct {
char bday[11];
}record;
So my question here regards the fgets-sscanf combination to create a type/length safe stream reader:
Is there any other way to work this out beside having to combine these two readers?
What about the \n checking-removing sequence?
The combination of fgets() with sscanf() is usually good. However, you should probably be using:
if (fgets(line, sizeof(line), fp) != 0)
{
...
}
This checks for I/O errors and EOF. It also assumes that the definition of the array is visible (otherwise sizeof gives you the size of a pointer, not of the array). If the array is not in scope, you should probably pass the size of the array to the function containing this code. All that said, there are worse sins than using MAX in place of sizeof(line).
You have not checked for a zero-length birthday string; you will probably end up doing quite a lot of validation on the string that is entered, though (dates are fickle and hard to process).
Given that MAX is 60, but sizeof(records->bday) == 11, you need to protect yourself from buffer overflows in the sscanf(). One way to do that is:
if (sscanf(line, "%10s", records->bday) != 1)
...handle error...
Note that the 10 is sizeof(records->bday) - 1, but you can't provide the length as an argument to sscanf(); it has to appear in the format string literally. Here, you can probably live with the odd sizing, but if you were dealing with more generic code, you'd probably think about:
sprintf(format, "%%%zus", sizeof(records->bday) - 1);
The first %% maps to %; the %zu formats the size (z is C99 for size_t); the s is for the string conversion when the format is used.
Or you could consider using strcpy() or memcpy() or memmove() to copy the right subsection of the input string to the structure - but note that %10s skips leading blanks which strcpy() et al will not. You have to know how long the string is before you do the copying, of course, and make sure the string is null terminated.
I have been told that scanf should not be used when user inputs a string. Instead, go for gets() by most of the experts and also the users on StackOverflow. I never asked it on StackOverflow why one should not use scanf over gets for strings. This is not the actual question but answer to this question is greatly appreciated.
Now coming to the actual question. I came across this type of code -
scanf("%[^\n]s",a);
This reads a string until user inputs a new line character, considering the white spaces also as string.
Is there any problem if I use
scanf("%[^\n]s",a);
instead of gets?
Is gets more optimized than scanf function as it sounds, gets is purely dedicated to handle strings. Please let me know about this.
Update
This link helped me to understand it better.
gets(3) is dangerous and should be avoided at all costs. I cannot envision a use where gets(3) is not a security flaw.
scanf(3)'s %s is also dangerous -- you must use the "field width" specifier to indicate the size of the buffer you have allocated. Without the field width, this routine is as dangerous as gets(3):
char name[64];
scanf("%63s", name);
The GNU C library provides the a modifier to %s that allocates the buffer for you. This non-portable extension is probably less difficult to use correctly:
The GNU C library supports a nonstandard extension that
causes the library to dynamically allocate a string of
sufficient size for input strings for the %s and %a[range]
conversion specifiers. To make use of this feature, specify
a as a length modifier (thus %as or %a[range]). The caller
must free(3) the returned string, as in the following
example:
char *p;
int n;
errno = 0;
n = scanf("%a[a-z]", &p);
if (n == 1) {
printf("read: %s\n", p);
free(p);
} else if (errno != 0) {
perror("scanf");
} else {
fprintf(stderr, "No matching characters\n"):
}
As shown in the above example, it is only necessary to call
free(3) if the scanf() call successfully read a string.
Firstly, it is not clear what that s is doing in your format string. The %[^\n] part is a self-sufficient format specifier. It is not a modifier for %s format, as you seem to believe. This means that "%[^\n]s" format string will be interpreted by scanf as two independent format specifiers: %[^\n] followed by a lone s. This will direct scanf to read everything until \n is encountered (leaving \n unread), and then require that the next input character is s. This just doesn't make any sense. No input will match such self-contradictory format.
Secondly, what was apparently meant is scanf("%[^\n]", a). This is somewhat close to [no longer available] gets (or fgets), but it is not the same. scanf requires that each format specifiers matches at least one input character. scanf will fail and abort if it cannot match any input characters for the requested format specifier. This means that scanf("%[^\n]",a) is not capable of reading empty input lines, i.e. lines that contain \n character immediately. If you feed such a line into the above scanf, it will return 0 to indicate failure and leave a unchanged. That's very different from how typical line-based input functions work.
(This is a rather surprising and seemingly illogical properly of %[] format. Personally, I'd prefer %[] to be able to match empty sequences and produce empty strings, but that's not how standard scanf works.)
If you want to read the input in line-by-lane fashion, fgets is your best option.
Say If i want an input to be
[Name] [Name]
How would I detect
[Name] [Name] [Name]
and return error?
Here is what I have so far,
char in[20];
char out[20];
scanf(" %s %s", out, in);
scanf returns the number of validly converted arguments. So in your first case, the return value would be 2, in the latter case 3.
To check the right amount of parameters, this might help:
char in[20];
char out[20];
char error[20];
int check;
check = scanf(" %s %s %s", out, in, error);
if(check != 2)
{
// TODO: error handling
}
EDIT: now it should be working, see comments below.
Of course, as stated by other posters: scanf is not considered a quite safe function since buffer overflows can occur, and you should avoid using it. It is better to read the inputs to a buffer with fgets() and the try to parse the arguments you want.
This is homework, so you might be required to work under certain (arbitrary) restrictions. However, the phrase "scanf error handling" is something of an oxymoron in the C programming language.
The best way to do this is to read in a line/other suitable chunk and parse it with C string functions. You can do it in one line of scanf but there are many drawbacks:
You can guard against buffer overflows, but if a buffer isn't large enough you can't recover.
You can specify specific character ranges for your strings, but it starts to look a little regexy, and the behavior of the "%[" format of scanf isn't mandated in the standard.
You can check for a third name, but the code looks unintuitive - it doesn't look like you only want two names, it looks like you want three. scanf also gives you very little control over how you handle whitespace.
EDIT: I initially thought from your question that the names were contained in brackets (a la "[Bruce] [Wayne]") but it now appears that was merely your convention for denoting a placeholder.
Anyway, despite my intense dislike of scanf, it has its uses. The biggest killer (for me) is the inability to distinguish between line endings and simple space separation. To fix that, you can call fgets to read the data into a buffer, then call sscanf on the buffer. This gives you both a) safer reading (scanf messes with the ability of other more straightforward functions to read from a buffer) and b) the benefits of scanf formats.
If you have to use scanf, your format basically be this:
" %s %s %s"
With the third being undesirable. As #Constantinius's answer shows, you'd need to read data into three buffers, and check whether or not the third passed. However, if you're reading multiple consecutive lines of this data, then the first entry of the next line would satisfy the third slot, falsely giving you an error. I highly recommend using fgets and sscanf or ditching the sscanf for more precise manual parsing in this case.
Here's a link to the fgets man page if you missed the one I snuck in earlier. If you decide to ditch sscanf, here are some other functions to look into: strchr (or strspn, or strcspn) to find how long the name is, strcpy or memcpy (but please not strncpy, it's not what you think it is) to copy data into the buffers.
I am beginner for programming.I referred books of C programming,but i am confused.
1.) What's the difference betweent printf and gets?
I believe gets is simpler and doesn't have any formats?
printf
The printf function writes a formatted string to the standard output. A formatted string is the result of replacing placeholders with their values. This sounds a little complicated but it will become very clear with an example:
printf("Hello, my name is %s and I am %d years old.", "Andreas", 22);
Here %s and %d are the placeholders, that are substituted with the first and second argument. You should read on the man page (linked above) the list of placeholders and their options, but the ones you'll run into most often are %d (a number) and %s (a string).
Making sure that the placeholder arguments match their type is extremely important. For example, the following code will result in undefined behavior (meaning that anything can happen: the program may crash, it may work, it may corrupt data, etc):
printf("Hello, I'm %s years old.", 22);
Unfortunately in C there is no way to avoid these relatively common mistakes.
gets
The gets function is used for a completely different purpose: it reads a string from the standard input.
For example:
char name[512];
printf("What's your name? ");
gets(name);
This simple program will ask the user for a name and save what he or she types into name.
However, gets() should NEVER be used. It will open your application and the system it runs on to security vulnerabilities.
Quoting from the man page:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many characters
gets() will read, and because gets()
will continue to store characters past
the end of the buffer, it is extremely
dangerous to use. It has been used to
break computer security. Use fgets()
instead.
Explained in a more simple way the problem is that if the variable you give gets (name in this case) is not big enough to hold what the user types a buffer overflow will occur, which is, gets will write past the end of the variable. This is undefined behavior and on some systems it will allow execution of arbitrary code by the attacker.
Since the variable must have a finite, static size and you can't set a limit of the amount of characters the user can type as the input, gets() is never secure and should never be used. It exists only for historical reasons.
As the manual suggested, you should use fgets instead. It has the same purpose as gets but has a size argument that specifies the size of the variable:
char *fgets(char *s, int size, FILE *stream);
So, the program above would become:
char name[512];
printf("What's your name? ");
fgets(name, sizeof(name) /* 512 */, stdin /* The standard input */);
They fundamentally perform different tasks.
printf: prints out text to a console.
gets: reads in input from the keyboard.
printf: allowing you to format a string from components (ie. taking results from variables), and when output to stdout, it does not append new line character. You have to do this by inserting '\n' in the format string.
puts: only output a string to stdout, but does append new line afterward.
scanf: scan the input fields, one character at a time, and convert them according to the given format.
gets: simply read a string from stdin, with no format consideration, the return character is replaced by string terminator '\0'.
http://en.wikipedia.org/wiki/Printf
http://en.wikipedia.org/wiki/Gets