I am beginner for programming.I referred books of C programming,but i am confused.
1.) What's the difference betweent printf and gets?
I believe gets is simpler and doesn't have any formats?
printf
The printf function writes a formatted string to the standard output. A formatted string is the result of replacing placeholders with their values. This sounds a little complicated but it will become very clear with an example:
printf("Hello, my name is %s and I am %d years old.", "Andreas", 22);
Here %s and %d are the placeholders, that are substituted with the first and second argument. You should read on the man page (linked above) the list of placeholders and their options, but the ones you'll run into most often are %d (a number) and %s (a string).
Making sure that the placeholder arguments match their type is extremely important. For example, the following code will result in undefined behavior (meaning that anything can happen: the program may crash, it may work, it may corrupt data, etc):
printf("Hello, I'm %s years old.", 22);
Unfortunately in C there is no way to avoid these relatively common mistakes.
gets
The gets function is used for a completely different purpose: it reads a string from the standard input.
For example:
char name[512];
printf("What's your name? ");
gets(name);
This simple program will ask the user for a name and save what he or she types into name.
However, gets() should NEVER be used. It will open your application and the system it runs on to security vulnerabilities.
Quoting from the man page:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many characters
gets() will read, and because gets()
will continue to store characters past
the end of the buffer, it is extremely
dangerous to use. It has been used to
break computer security. Use fgets()
instead.
Explained in a more simple way the problem is that if the variable you give gets (name in this case) is not big enough to hold what the user types a buffer overflow will occur, which is, gets will write past the end of the variable. This is undefined behavior and on some systems it will allow execution of arbitrary code by the attacker.
Since the variable must have a finite, static size and you can't set a limit of the amount of characters the user can type as the input, gets() is never secure and should never be used. It exists only for historical reasons.
As the manual suggested, you should use fgets instead. It has the same purpose as gets but has a size argument that specifies the size of the variable:
char *fgets(char *s, int size, FILE *stream);
So, the program above would become:
char name[512];
printf("What's your name? ");
fgets(name, sizeof(name) /* 512 */, stdin /* The standard input */);
They fundamentally perform different tasks.
printf: prints out text to a console.
gets: reads in input from the keyboard.
printf: allowing you to format a string from components (ie. taking results from variables), and when output to stdout, it does not append new line character. You have to do this by inserting '\n' in the format string.
puts: only output a string to stdout, but does append new line afterward.
scanf: scan the input fields, one character at a time, and convert them according to the given format.
gets: simply read a string from stdin, with no format consideration, the return character is replaced by string terminator '\0'.
http://en.wikipedia.org/wiki/Printf
http://en.wikipedia.org/wiki/Gets
Related
Code I have:
int main(){
char readChars[3];
puts("Enter the value of the card please:");
scanf(readChars);
printf(readChars);
printf("done");
}
All I see is:
"done"
after I enter some value to terminal and pressing Enter, why?
Edit:
Isn't the prototype for scanf:
int scanf(const char *format, ...);
So I should be able to use it with just one argument?
The actual problem is that you are passing an uninitialized array as the format to scanf().
Also you are invoking scanf() the wrong way try this
if (scanf("%2s", readChars) == 1)
printf("%s\n", readChars);
scanf() as well as printf() use a format string and that's actually the cause for the f in their name.
And yes you are able to use it with just one argument, scanf() scans input according to the format string, the format string uses special values that are matched against the input, if you don't specify at least one then scanf() will only be useful for input validation.
The following was extracted from C11 draft
7.21.6.2 The fscanf function
The format shall be a multibyte character sequence, beginning and ending in its initial shift state. The format is composed of zero or more directives: one or more white-space characters, an ordinary multibyte character (neither % nor a white-space character), or a conversion specification. Each conversion specification is introduced by the character %. After the %, the following appear in sequence:
An optional assignment-suppressing character *.
An optional decimal integer greater than zero that specifies the maximum field width
(in characters).
An optional length modifier that specifies the size of the receiving object.
A conversion specifier character that specifies the type of conversion to be applied.
as you can read above, you need to pass at least one conversion specifier, and in that case the corresponding argument to store the converted value, if you pass the conversion specifier but you don't give an argument for it, the behavior is undefined.
Yes, it is possible to call scanf with just one parameter, and it may even be useful on occasion. But it wouldn't do what you apparently thought it would. (It would just expect the characters in the argument in the input stream and skip them.) You didn't notice because you failed to do due diligence as a programmer. I'll list what you should do:
RTFM. scanf's first parameter is a format string. Plain characters which are not part of conversion sequences and are not whitespace are expected literally in the input. They are read and discarded. If they do not appear, conversion stops there, and the position in the input stream where the unexpected character occured is the start of subsequent reads. In your case probably no character was ever successfully read from the input, but you don't know for sure, because you didn't initialize the format string (see below).
Another interesting detail is scanf's return value which indicates the number items successfully read. I'll discuss that below together with the importance to check return values.
Initialize locals. C doesn't automatically initialize local data for performance reasons (in today's light one would probably enforce user initialization like other languages do, or make auto initialization a default with an opt-out possibility for the few inner loops where it would hurt). Because you didn't initialize readchars, you don't know what's in it, so you don't know what scanf expected in the input stream. On top it probably is nominally undefined behaviour. (But on your PC it shouldn't do anything unexpected.)
Check return values. scanf probably returned 0 in your example. The manual states that scanf returns the number of items successfully read, here 0, i.e. no input conversion took place. This type of undetected failure can be fatal in long sequences of read operations because the following scanfs may read in one-off indexes from a sequence of tokens, or may stall as well (and not update their pointees at all), etc.
Please bear with me -- I do not always read the manual, check return values or (by error) initialize variables for little test programs. But if it doesn't work, it's part of my investigation. And before I ask anybody, let alone the world, I make damn sure that I have done my best to find out what I did wrong, beforehand.
You're not using scanf correctly:
scanf(formatstring, address_of_destination,...)
is the right way to do it.
EDIT:
Isn't the prototype for scanf:
int scanf(const char *format, ...);
So I should be able to use it with just one argument?
No, you should not. Please read documentation on scanf; format is a string specifying what scanf should read, and the ... are the things that scanf should read into.
The first argument to scanf is the format string. What you need is:
scanf("%2s", readChars);
It Should provided Format specifiers in scanf function
char readChars[3];
puts("Enter the value of the card please:");
scanf("%s",readChars);
printf("%s",readChars);
printf("done");
http://www.cplusplus.com/reference/cstdio/scanf/ more info...
I am having difficulty with a feature of a segment of code that is designed to illustrate the fgets() function for input. Before I proceed, I would like to make sure that my understanding of I/O and streams is correct and that I'm not completely off base:
Input and Output in C has no specific viable function for working with strings. The one function specific for working with strings is the 'gets()' function, which will accept input beyond the limits of the char array to store the input (thus making it effectively illegal for all but backward compatibility), and create buffer overflows.
This brings up the topic of streams, which to the best of my understanding is a model to explain I/O in a program. A stream is considered 'flowing water' on which the data utilized by programs is conveyed. See links: (also as a conveyor belt)
Can you explain the concept of streams?
What is a stream?
In the C language, there are 3 predefined ANSII streams for standard input and output, and 2 additional streams if using windows or DOS which are as follows:
stdin (keyboard)
stdout (screen)
stderr (screen)
stdprn (printer)
stdaux (serial port)
As I understand, to make things manageable it is okay to think of these as rivers that exist in your operating system, and a program uses I/O functions to put data in them, take data out of them, or change the direction of where the streams are flowing (such as reading or writing a file would require). Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system. What you need to be concerned with is where the water takes your data, and that is mediated by use of specific functions (such as printf(), puts(), gets(), fgets(), etc.).
This is where my questions start to take form. Now I am interested in getting a grasp on the fgets() function and how it ties into streams. fgets() uses the 'stdin' stream (naturally) and has the built in fail safe (see below) that will not allow user input to exceed the array used to store the input. Here is the outline of the fgets() function, rather its prototype (which I don't see why one would ever need to declare it?):
char *fgets(char *str , int n , FILE *fp);
Note the three parameters that the fgets function takes:
p1 is the address of where the input is stored (a pointer, which will likely just be the name of the array you use, e.g., 'buffer')
p2 is the maximum length of characters to be input (I think this is where my question is!)
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
Now, the code I have below will allow you to type characters until your heart is content. When you hit return, the input is printed on the screen in rows of the length of the second parameter minus 1 (MAXLEN -1). When you enter a return with no other text, the program terminates.
#include <stdio.h>
#define MAXLEN 10
int main(void)
{
char buffer[MAXLEN];
puts("Enter text a line at a time: enter a blank line to exit");
while(1)
{
fgets(buffer, MAXLEN, stdin); //Read comments below. Note 'buffer' is indeed a pointer: just to array's first element.
if(buffer[0] == '\n')
{
break;
}
puts(buffer);
}
return 0;
}
Now, here are my questions:
1) Does this program allow me to input UNLIMITED characters? I fail to see the mechanism that makes fgets() safer than gets(), because my array that I am storing input in is of a limited size (256 in this case). The only thing that I see happening is my long strings of input being parsed into MAXLEN - 1 slices? What am I not seeing with fgets() that stops buffer overflow that gets() does not? I do not see in the parameters of fgets() where that fail-safe exists.
2) Why does the program print out input in rows of MAXLEN-1 instead of MAXLEN?
3) What is the significance of the second parameter of the fgets() function? When I run the program, I am able to type as many characters as I want. What is MAXLEN doing to guard against buffer overflow? From what I can guess, when the user inputs a big long string, once the user hits return, the MAXLEN chops up the string in to MAXLEN sized bites/bytes (both actually work here lol) and sends them to the array. I'm sure I'm missing something important here.
That was a mouthful, but my lack of grasp on this very important subject is making my code weak.
Question 1
You can actually type as much character as your command line tool will allow you per input. However, you call to fgets() will handle only MAXLEN in your example because you tell him to do so.
Moreover, there is no safe check inside fgets(). The second parameter you gave to fgets is the "safety" argument. Try to give to change your call to fgets to fgets(buffer, MAXLEN + 10, stdin); and then type more than MAXLEN characters. Your program will crash because you are accessing unallocated memory.
Question 2
When you make a call to fgets(), it will read MAXLEN - 1 characters because the last one is reserved to the character code \0 which usually means end of string
The second parameter of fgets() is not the number of character you want to store but the maximum capacity of your buffer. And you always have to think about string termination character \0
Question 3
If you undestood the 2 answer before, you will be able to answer to this one by yourself. Try to play with this value. And use a different value than the one used for you buffer size.
Also, you said
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
You can use fgets to read files stored on your computer. Here is an example :
char buffer[20];
FILE *stream = fopen("myfile.txt", "r"); //Open the file "myfile.txt" in readonly mode
fgets(buffer, 20, stream); //Read the 19 first characters of the file "myfile.txt"
puts(buffer);
When you call fgets(), it lets you type in as much as you want into stdin, so everything stays in stdin. It seems fgets() takes the first 9 characters, attaches a null character, and assigns it to buffer. Then puts() displays buffer then creates a newline.
The key is it's in a while loop -- the code loops again then takes what was remaining in stdin and feeds it into fgets(), which takes the next 9 characters and repeats. Stdin just still had stuff "in queue".
Input and Output in C has no specific viable function for working with strings.
There are several functions for outputting strings, such as printf and puts.
Strings can be input with fgets or scanf; however there is no standard function that both inputs and allocates memory. You need to pre-allocate some memory, and then read some characters into that memory.
Your analogy of a stream as a river is not great. Rivers flow whether or not you are taking items out of them, but streams don't. A better analogy might be a line of people at the gates to a stadium.
C also has the concept of a "line", lines are marked by having a '\n' character at the end. In my analogy let's say the newline character is represented by a short person.
When you do fgets(buf, 20, stdin) it is like "Let the next 19 people in, but if you encounter a short person during this, let him through but not anybody else". Then the fgets function creates a string out of these 0 to 19 characters, by putting the end-of-string marker on the end; and that string is placed in buf.
Note that the second argument to fgets is the buffer size , not the number of characters to read.
When you type in characters, that is like more people joining the queue.
If there were fewer than 19 people and no short people, then fgets waits for more people to arrive. In standard C there's no way to check if people are waiting without blocking to wait for them if they aren't.
By default, C streams are line buffered. In my analogy, this is like there is a "pre-checking" gate earlier on than the main gate, where all people that arrive go into a holding pen until a short person arrives; and then everyone from the holding pen plus that short person get sent onto the main gate. This can be turned off using setvbuf.
Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system.
This is something you do have to worry about. stdin etc. are already begun before you enter main(), but other streams (e.g. if you want to read from a file on your hard drive), you have to begin them.
Streams may end. When a stream is ended, fgets will return NULL. Your program must handle this. In my analogy, the gate is closed.
I have been told that scanf should not be used when user inputs a string. Instead, go for gets() by most of the experts and also the users on StackOverflow. I never asked it on StackOverflow why one should not use scanf over gets for strings. This is not the actual question but answer to this question is greatly appreciated.
Now coming to the actual question. I came across this type of code -
scanf("%[^\n]s",a);
This reads a string until user inputs a new line character, considering the white spaces also as string.
Is there any problem if I use
scanf("%[^\n]s",a);
instead of gets?
Is gets more optimized than scanf function as it sounds, gets is purely dedicated to handle strings. Please let me know about this.
Update
This link helped me to understand it better.
gets(3) is dangerous and should be avoided at all costs. I cannot envision a use where gets(3) is not a security flaw.
scanf(3)'s %s is also dangerous -- you must use the "field width" specifier to indicate the size of the buffer you have allocated. Without the field width, this routine is as dangerous as gets(3):
char name[64];
scanf("%63s", name);
The GNU C library provides the a modifier to %s that allocates the buffer for you. This non-portable extension is probably less difficult to use correctly:
The GNU C library supports a nonstandard extension that
causes the library to dynamically allocate a string of
sufficient size for input strings for the %s and %a[range]
conversion specifiers. To make use of this feature, specify
a as a length modifier (thus %as or %a[range]). The caller
must free(3) the returned string, as in the following
example:
char *p;
int n;
errno = 0;
n = scanf("%a[a-z]", &p);
if (n == 1) {
printf("read: %s\n", p);
free(p);
} else if (errno != 0) {
perror("scanf");
} else {
fprintf(stderr, "No matching characters\n"):
}
As shown in the above example, it is only necessary to call
free(3) if the scanf() call successfully read a string.
Firstly, it is not clear what that s is doing in your format string. The %[^\n] part is a self-sufficient format specifier. It is not a modifier for %s format, as you seem to believe. This means that "%[^\n]s" format string will be interpreted by scanf as two independent format specifiers: %[^\n] followed by a lone s. This will direct scanf to read everything until \n is encountered (leaving \n unread), and then require that the next input character is s. This just doesn't make any sense. No input will match such self-contradictory format.
Secondly, what was apparently meant is scanf("%[^\n]", a). This is somewhat close to [no longer available] gets (or fgets), but it is not the same. scanf requires that each format specifiers matches at least one input character. scanf will fail and abort if it cannot match any input characters for the requested format specifier. This means that scanf("%[^\n]",a) is not capable of reading empty input lines, i.e. lines that contain \n character immediately. If you feed such a line into the above scanf, it will return 0 to indicate failure and leave a unchanged. That's very different from how typical line-based input functions work.
(This is a rather surprising and seemingly illogical properly of %[] format. Personally, I'd prefer %[] to be able to match empty sequences and produce empty strings, but that's not how standard scanf works.)
If you want to read the input in line-by-lane fashion, fgets is your best option.
Say If i want an input to be
[Name] [Name]
How would I detect
[Name] [Name] [Name]
and return error?
Here is what I have so far,
char in[20];
char out[20];
scanf(" %s %s", out, in);
scanf returns the number of validly converted arguments. So in your first case, the return value would be 2, in the latter case 3.
To check the right amount of parameters, this might help:
char in[20];
char out[20];
char error[20];
int check;
check = scanf(" %s %s %s", out, in, error);
if(check != 2)
{
// TODO: error handling
}
EDIT: now it should be working, see comments below.
Of course, as stated by other posters: scanf is not considered a quite safe function since buffer overflows can occur, and you should avoid using it. It is better to read the inputs to a buffer with fgets() and the try to parse the arguments you want.
This is homework, so you might be required to work under certain (arbitrary) restrictions. However, the phrase "scanf error handling" is something of an oxymoron in the C programming language.
The best way to do this is to read in a line/other suitable chunk and parse it with C string functions. You can do it in one line of scanf but there are many drawbacks:
You can guard against buffer overflows, but if a buffer isn't large enough you can't recover.
You can specify specific character ranges for your strings, but it starts to look a little regexy, and the behavior of the "%[" format of scanf isn't mandated in the standard.
You can check for a third name, but the code looks unintuitive - it doesn't look like you only want two names, it looks like you want three. scanf also gives you very little control over how you handle whitespace.
EDIT: I initially thought from your question that the names were contained in brackets (a la "[Bruce] [Wayne]") but it now appears that was merely your convention for denoting a placeholder.
Anyway, despite my intense dislike of scanf, it has its uses. The biggest killer (for me) is the inability to distinguish between line endings and simple space separation. To fix that, you can call fgets to read the data into a buffer, then call sscanf on the buffer. This gives you both a) safer reading (scanf messes with the ability of other more straightforward functions to read from a buffer) and b) the benefits of scanf formats.
If you have to use scanf, your format basically be this:
" %s %s %s"
With the third being undesirable. As #Constantinius's answer shows, you'd need to read data into three buffers, and check whether or not the third passed. However, if you're reading multiple consecutive lines of this data, then the first entry of the next line would satisfy the third slot, falsely giving you an error. I highly recommend using fgets and sscanf or ditching the sscanf for more precise manual parsing in this case.
Here's a link to the fgets man page if you missed the one I snuck in earlier. If you decide to ditch sscanf, here are some other functions to look into: strchr (or strspn, or strcspn) to find how long the name is, strcpy or memcpy (but please not strncpy, it's not what you think it is) to copy data into the buffers.
I want to know the disadvantages of scanf().
In many sites, I have read that using scanf might cause buffer overflows. What is the reason for this? Are there any other drawbacks with scanf?
Most of the answers so far seem to focus on the string buffer overflow issue. In reality, the format specifiers that can be used with scanf functions support explicit field width setting, which limit the maximum size of the input and prevent buffer overflow. This renders the popular accusations of string-buffer overflow dangers present in scanf virtually baseless. Claiming that scanf is somehow analogous to gets in the respect is completely incorrect. There's a major qualitative difference between scanf and gets: scanf does provide the user with string-buffer-overflow-preventing features, while gets doesn't.
One can argue that these scanf features are difficult to use, since the field width has to be embedded into format string (there's no way to pass it through a variadic argument, as it can be done in printf). That is actually true. scanf is indeed rather poorly designed in that regard. But nevertheless any claims that scanf is somehow hopelessly broken with regard to string-buffer-overflow safety are completely bogus and usually made by lazy programmers.
The real problem with scanf has a completely different nature, even though it is also about overflow. When scanf function is used for converting decimal representations of numbers into values of arithmetic types, it provides no protection from arithmetic overflow. If overflow happens, scanf produces undefined behavior. For this reason, the only proper way to perform the conversion in C standard library is functions from strto... family.
So, to summarize the above, the problem with scanf is that it is difficult (albeit possible) to use properly and safely with string buffers. And it is impossible to use safely for arithmetic input. The latter is the real problem. The former is just an inconvenience.
P.S. The above in intended to be about the entire family of scanf functions (including also fscanf and sscanf). With scanf specifically, the obvious issue is that the very idea of using a strictly-formatted function for reading potentially interactive input is rather questionable.
The problems with scanf are (at a minimum):
using %s to get a string from the user, which leads to the possibility that the string may be longer than your buffer, causing overflow.
the possibility of a failed scan leaving your file pointer in an indeterminate location.
I very much prefer using fgets to read whole lines in so that you can limit the amount of data read. If you've got a 1K buffer, and you read a line into it with fgets you can tell if the line was too long by the fact there's no terminating newline character (last line of a file without a newline notwithstanding).
Then you can complain to the user, or allocate more space for the rest of the line (continuously if necessary until you have enough space). In either case, there's no risk of buffer overflow.
Once you've read the line in, you know that you're positioned at the next line so there's no problem there. You can then sscanf your string to your heart's content without having to save and restore the file pointer for re-reading.
Here's a snippet of code which I frequently use to ensure no buffer overflow when asking the user for information.
It could be easily adjusted to use a file other than standard input if necessary and you could also have it allocate its own buffer (and keep increasing it until it's big enough) before giving that back to the caller (although the caller would then be responsible for freeing it, of course).
#include <stdio.h>
#include <string.h>
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
#define SMALL_BUFF 3
static int getLine (char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Size zero or one cannot store enough, so don't even
// try - we need space for at least newline and terminator.
if (sz < 2)
return SMALL_BUFF;
// Output prompt.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
// Get line with buffer overrun protection.
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// Catch possibility of `\0` in the input stream.
size_t len = strlen(buff);
if (len < 1)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[len - 1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[len - 1] = '\0';
return OK;
}
And, a test driver for it:
// Test program for getLine().
int main (void) {
int rc;
char buff[10];
rc = getLine ("Enter string> ", buff, sizeof(buff));
if (rc == NO_INPUT) {
// Extra NL since my system doesn't output that on EOF.
printf ("\nNo input\n");
return 1;
}
if (rc == TOO_LONG) {
printf ("Input too long [%s]\n", buff);
return 1;
}
printf ("OK [%s]\n", buff);
return 0;
}
Finally, a test run to show it in action:
$ printf "\0" | ./tstprg # Singular NUL in input stream.
Enter string>
No input
$ ./tstprg < /dev/null # EOF in input stream.
Enter string>
No input
$ ./tstprg # A one-character string.
Enter string> a
OK [a]
$ ./tstprg # Longer string but still able to fit.
Enter string> hello
OK [hello]
$ ./tstprg # Too long for buffer.
Enter string> hello there
Input too long [hello the]
$ ./tstprg # Test limit of buffer.
Enter string> 123456789
OK [123456789]
$ ./tstprg # Test just over limit.
Enter string> 1234567890
Input too long [123456789]
From the comp.lang.c FAQ: Why does everyone say not to use scanf? What should I use instead?
scanf has a number of problems—see questions 12.17, 12.18a, and 12.19. Also, its %s format has the same problem that gets() has (see question 12.23)—it’s hard to guarantee that the receiving buffer won’t overflow. [footnote]
More generally, scanf is designed for relatively structured, formatted input (its name is in fact derived from “scan formatted”). If you pay attention, it will tell you whether it succeeded or failed, but it can tell you only approximately where it failed, and not at all how or why. You have very little opportunity to do any error recovery.
Yet interactive user input is the least structured input there is. A well-designed user interface will allow for the possibility of the user typing just about anything—not just letters or punctuation when digits were expected, but also more or fewer characters than were expected, or no characters at all (i.e., just the RETURN key), or premature EOF, or anything. It’s nearly impossible to deal gracefully with all of these potential problems when using scanf; it’s far easier to read entire lines (with fgets or the like), then interpret them, either using sscanf or some other techniques. (Functions like strtol, strtok, and atoi are often useful; see also questions 12.16 and 13.6.) If you do use any scanf variant, be sure to check the return value to make sure that the expected number of items were found. Also, if you use %s, be sure to guard against buffer overflow.
Note, by the way, that criticisms of scanf are not necessarily indictments of fscanf and sscanf. scanf reads from stdin, which is usually an interactive keyboard and is therefore the least constrained, leading to the most problems. When a data file has a known format, on the other hand, it may be appropriate to read it with fscanf. It’s perfectly appropriate to parse strings with sscanf (as long as the return value is checked), because it’s so easy to regain control, restart the scan, discard the input if it didn’t match, etc.
Additional links:
longer explanation by Chris Torek
longer explanation by yours truly
References: K&R2 Sec. 7.4 p. 159
It is very hard to get scanf to do the thing you want. Sure, you can, but things like scanf("%s", buf); are as dangerous as gets(buf);, as everyone has said.
As an example, what paxdiablo is doing in his function to read can be done with something like:
scanf("%10[^\n]%*[^\n]", buf));
getchar();
The above will read a line, store the first 10 non-newline characters in buf, and then discard everything till (and including) a newline. So, paxdiablo's function could be written using scanf the following way:
#include <stdio.h>
enum read_status {
OK,
NO_INPUT,
TOO_LONG
};
static int get_line(const char *prompt, char *buf, size_t sz)
{
char fmt[40];
int i;
int nscanned;
printf("%s", prompt);
fflush(stdout);
sprintf(fmt, "%%%zu[^\n]%%*[^\n]%%n", sz-1);
/* read at most sz-1 characters on, discarding the rest */
i = scanf(fmt, buf, &nscanned);
if (i > 0) {
getchar();
if (nscanned >= sz) {
return TOO_LONG;
} else {
return OK;
}
} else {
return NO_INPUT;
}
}
int main(void)
{
char buf[10+1];
int rc;
while ((rc = get_line("Enter string> ", buf, sizeof buf)) != NO_INPUT) {
if (rc == TOO_LONG) {
printf("Input too long: ");
}
printf("->%s<-\n", buf);
}
return 0;
}
One of the other problems with scanf is its behavior in case of overflow. For example, when reading an int:
int i;
scanf("%d", &i);
the above cannot be used safely in case of an overflow. Even for the first case, reading a string is much more simpler to do with fgets rather than with scanf.
Yes, you are right. There is a major security flaw in scanf family(scanf,sscanf, fscanf..etc) esp when reading a string, because they don't take the length of the buffer (into which they are reading) into account.
Example:
char buf[3];
sscanf("abcdef","%s",buf);
clearly the the buffer buf can hold MAX 3 char. But the sscanf will try to put "abcdef" into it causing buffer overflow.
Problems I have with the *scanf() family:
Potential for buffer overflow with %s and %[ conversion specifiers. Yes, you can specify a maximum field width, but unlike with printf(), you can't make it an argument in the scanf() call; it must be hardcoded in the conversion specifier.
Potential for arithmetic overflow with %d, %i, etc.
Limited ability to detect and reject badly formed input. For example, "12w4" is not a valid integer, but scanf("%d", &value); will successfully convert and assign 12 to value, leaving the "w4" stuck in the input stream to foul up a future read. Ideally the entire input string should be rejected, but scanf() doesn't give you an easy mechanism to do that.
If you know your input is always going to be well-formed with fixed-length strings and numerical values that don't flirt with overflow, then scanf() is a great tool. If you're dealing with interactive input or input that isn't guaranteed to be well-formed, then use something else.
Many answers here discuss the potential overflow issues of using scanf("%s", buf), but the latest POSIX specification more-or-less resolves this issue by providing an m assignment-allocation character that can be used in format specifiers for c, s, and [ formats. This will allow scanf to allocate as much memory as necessary with malloc (so it must be freed later with free).
An example of its use:
char *buf;
scanf("%ms", &buf); // with 'm', scanf expects a pointer to pointer to char.
// use buf
free(buf);
See here. Disadvantages to this approach is that it is a relatively recent addition to the POSIX specification and it is not specified in the C specification at all, so it remains rather unportable for now.
The advantage of scanf is once you learn how use the tool, as you should always do in C, it has immensely useful usecases. You can learn how to use scanf and friends by reading and understanding the manual. If you can't get through that manual without serious comprehension issues, this would probably indicate that you don't know C very well.
scanf and friends suffered from unfortunate design choices that rendered it difficult (and occasionally impossible) to use correctly without reading the documentation, as other answers have shown. This occurs throughout C, unfortunately, so if I were to advise against using scanf then I would probably advise against using C.
One of the biggest disadvantages seems to be purely the reputation it's earned amongst the uninitiated; as with many useful features of C we should be well informed before we use it. The key is to realise that as with the rest of C, it seems succinct and idiomatic, but that can be subtly misleading. This is pervasive in C; it's easy for beginners to write code that they think makes sense and might even work for them initially, but doesn't make sense and can fail catastrophically.
For example, the uninitiated commonly expect that the %s delegate would cause a line to be read, and while that might seem intuitive it isn't necessarily true. It's more appropriate to describe the field read as a word. Reading the manual is strongly advised for every function.
What would any response to this question be without mentioning its lack of safety and risk of buffer overflows? As we've already covered, C isn't a safe language, and will allow us to cut corners, possibly to apply an optimisation at the expense of correctness or more likely because we're lazy programmers. Thus, when we know the system will never receive a string larger than a fixed number of bytes, we're given the ability to declare an array that size and forego bounds checking. I don't really see this as a down-fall; it's an option. Again, reading the manual is strongly advised and would reveal this option to us.
Lazy programmers aren't the only ones stung by scanf. It's not uncommon to see people trying to read float or double values using %d, for example. They're usually mistaken in believing that the implementation will perform some kind of conversion behind the scenes, which would make sense because similar conversions happen throughout the rest of the language, but that's not the case here. As I said earlier, scanf and friends (and indeed the rest of C) are deceptive; they seem succinct and idiomatic but they aren't.
Inexperienced programmers aren't forced to consider the success of the operation. Suppose the user enters something entirely non-numeric when we've told scanf to read and convert a sequence of decimal digits using %d. The only way we can intercept such erroneous data is to check the return value, and how often do we bother checking the return value?
Much like fgets, when scanf and friends fail to read what they're told to read, the stream will be left in an unusual state;
In the case of fgets, if there isn't sufficient space to store a complete line, then the remainder of the line left unread might be erroneously treated as though it's a new line when it isn't.
In the case of scanf and friends, a conversion failed as documented above, the erroneous data is left unread on the stream and might be erroneously treated as though it's part of a different field.
It's no easier to use scanf and friends than to use fgets. If we check for success by looking for a '\n' when we're using fgets or by inspecting the return value when we use scanf and friends, and we find that we've read an incomplete line using fgets or failed to read a field using scanf, then we're faced with the same reality: We're likely to discard input (usually up until and including the next newline)! Yuuuuuuck!
Unfortunately, scanf both simultaneously makes it hard (non-intuitive) and easy (fewest keystrokes) to discard input in this way. Faced with this reality of discarding user input, some have tried scanf("%*[^\n]%*c");, not realising that the %*[^\n] delegate will fail when it encounters nothing but a newline, and hence the newline will still be left on the stream.
A slight adaptation, by separating the two format delegates and we see some success here: scanf("%*[^\n]"); getchar();. Try doing that with so few keystrokes using some other tool ;)
There is one big problem with scanf-like functions - the lack of any type safety. That is, you can code this:
int i;
scanf("%10s", &i);
Hell, even this is "fine":
scanf("%10s", i);
It's worse than printf-like functions, because scanf expects a pointer, so crashes are more likely.
Sure, there are some format-specifier checkers out there, but, those are not perfect and well, they are not part of the language or the standard library.