I recently faced an interview question asking for the hidden problem in the following code. I was unable to detect it. Can anyone help?
#include <stdio.h>
#include <string.h> /* needed for memset */
int main(void)
{
    char buff[10];
    memset(buff, 0, sizeof(buff));
    gets(buff);
    printf("\n The buffer entered is [%s]\n", buff);
    return 0;
}
The function gets() reads a string from stdin and does not check the capacity of the buffer. This may result in a buffer overflow. The standard function fgets() can be used here instead.
gets could store far more than 10 characters.
gets is really problematic because you can't tell it to only fill 'buff' up to a length of 10.
Check the BUGS section of the gets manual page, which says:
Never use gets(). Because it is impossible to tell without knowing
the data in advance how many characters gets() will read, and because
gets() will continue to store characters past the end of the buffer,
it is extremely dangerous to use. It has been used to break computer
security. Use fgets() instead.
It is not advisable to mix calls to input functions from the stdio
library with low-level calls to read(2) for the file descriptor
associated with the input stream; the results will be undefined and
very probably not what you want.
It is always recommended to use fgets()/ scanf() over gets().
With gets() you have no way to limit the input to a certain length, which may cause a buffer overflow. That is why you should not use it.
Try to use fgets() instead:
fgets(buff, MAX_LENGTH, stdin);
Good luck!
If the code is
scanf("%s\n",message)
vs
gets(message)
What's the difference? It seems that both of them read input into message.
The basic difference [in reference to your particular scenario]:
scanf() with %s stops reading at the first whitespace character (space, tab or newline) or at EOF.
gets() treats spaces and tabs as part of the input string and stops only at a newline or EOF.
However, to avoid buffer overflows and the security risks that come with them, it's safer to use fgets().
Disambiguation: In the following I consider a function "safe" if it does not lead to trouble when used correctly, and "unsafe" if its lack of safety cannot be worked around.
scanf("%s\n",message)
vs
gets(message)
What's the difference?
In terms of safety there is no difference: both read from standard input and might very well overflow message if the user enters more data than message provides memory for.
However, scanf() can be used safely by specifying the maximum amount of data to be scanned in:
char message[42];
...
scanf("%41s", message); /* Read at most one char fewer than the buffer (message here)
                           provides, as one byte is needed to store the
                           C-"string"'s 0-terminator. */
With gets() it is not possible to specify the maximum number of characters to be read in; that is why it must not be used!
The main difference is that gets reads until EOF or \n, while scanf("%s") reads until any whitespace has been encountered. scanf also provides more formatting options, but at the same time it has worse type safety than gets.
Another big difference is that scanf is a standard C function, while gets has been removed from the language, since it was both superfluous and dangerous: there was no protection against buffer overruns. The very same security flaw exists with scanf however, so neither of those two functions should be used in production code.
You should always use fgets; the C standard itself even recommends this, see C11 K.3.5.4.1:
Recommended practice
6 The fgets function allows properly-written
programs to safely process input lines too long to store in the result
array. In general this requires that callers of fgets pay attention to
the presence or absence of a new-line character in the result array.
Consider using fgets (along with any needed processing based on
new-line characters) instead of gets_s.
(emphasis mine)
There are several. One is that gets() will only get character string data. Another is that gets() will get only one variable at a time. scanf() on the other hand is a much, much more flexible tool. It can read multiple items of different data types.
In the particular example you have picked, there is not much of a difference.
gets - Reads characters from stdin and stores them as a string.
scanf - Reads data from stdin and stores it according to the format specified in the scanf call, such as %d, %f, %s, etc.
gets:->
gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a null byte ('\0').
BUGS:->
Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.
scanf:->
The scanf() function reads input from the standard input stream stdin;
BUG:->
scanf() can run past array bounds when reading strings into arrays unless a field width is given.
With scanf you need to specify a format, unlike with gets. So with gets you can enter characters, strings, numbers and spaces.
With scanf (using %s), your input ends as soon as whitespace is encountered.
But in your example you are using '%s', so neither gets() nor scanf() checks that the strings are valid pointers to arrays of sufficient length to hold the characters you are sending to them. Hence either can easily cause a buffer overflow.
Tip: use fgets(), but that all depends on the use case.
The idea that scanf cannot read whitespace is completely wrong. If you use this code, it will read whitespace as well:
#include <stdio.h>
int main(void)
{
    char name[25] = "";
    printf("Enter your name :\n");
    scanf("%24[^\n]", name); /* scanset: read up to 24 chars, stopping at a newline */
    printf("%s", name);
    return 0;
}
Here only a newline stops the input. That means input stops only when you press Enter.
So, for reading a line there is basically no difference between scanf and gets; it is just a matter of how you write the format.
scanf() is much more flexible tool while gets() only gets one variable at a time.
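gets() is unsafe. For example: char str[1]; gets(str);
If you input more than the buffer can hold, the behavior is undefined; typically the program crashes with SIGSEGV.
If you have no choice but to use gets, give it a large malloc'ed buffer; that only lowers the risk and does not remove it.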
my question is about
gets()
and
puts()
are they a perfect solution for string input and output?
gets is marked as obsolescent in C99 and has been removed in C11 because of security issues with this function. Don't use it; use fgets instead. As a historical note, gets was exploited (in fingerd) by the first massive internet worm, the Morris worm, back in 1988.
puts function is OK if it fits your needs.
gets() is fundamentally insecure in a really horrific way: it will write an unlimited number of characters to its argument, overflowing any buffer it is provided. As such, it should never, ever be used. Many newer compilers will issue an automatic warning if you use it. Instead, use fgets(), which takes a length argument:
char buf[...];
fgets(buf, sizeof(buf), stdin);
On the other hand, puts() is totally fine. It's equivalent to printf("%s\n", x);, and some compilers will in fact convert certain constant printf() calls to puts() as a standard optimization. Go wild.
For gets, see the man page:
BUGS
Never use gets(). Because it is impossible to tell without knowing
the data in advance how many characters gets() will read, and because gets() will continue to store
characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.
puts is fine, if you're just looking to write a string to stdout.
I have wondered from the start why fseek(stdin,0,SEEK_SET) and rewind(stdin) can't be used to flush the input buffer, since it is clearly written in cplusplusreference that calling these two functions flushes the buffer (input or output, irrespective). But since the whole idea seemed new, I put it in a clumsy question yesterday:
Can fseek(stdin,1,SEEK_SET) or rewind(stdin) be used to flush the input buffer instead of non-portable fflush(stdin)?
And I was skeptical about the answers I got, which seemed to suggest I couldn't do it. Frankly, I saw no reason why not. Today I tried it myself and it works! I mean, to deal with the problem of the newline lurking in stdin while using multiple scanf() statements, it seems I can use fseek(stdin,0,SEEK_SET) or rewind(stdin) in place of the non-portable and UB fflush(stdin).
Please tell me if this is a correct approach without any risk. Until now, I had been using the following code to deal with the newline in stdin: while((c = getchar()) != '\n' && c != EOF);. Here's my code below:
#include <stdio.h>
int main ()
{
int a,b;
char c;
printf("Enter 2 integers\n");
scanf("%d%d",&a,&b);
printf("Enter a character\n");
//rewind(stdin); //Works if activated
fseek(stdin,0,SEEK_SET); //Works fine
scanf("%c",&c); //This scanf() is skipped without fseek() or rewind()
printf("%d,%d,%c",a,b,c);
}
In my program, if I don't use either fseek(stdin,0,SEEK_SET) or rewind(stdin), the second scanf() appears to be skipped and the newline is always taken as the character. The problem is solved if I use fseek(stdin,0,SEEK_SET) or rewind(stdin).
I'm not sure where you read on cplusplusreference (whatever that is) that flushing to end of line is the mandated behaviour.
The closest matches I could find, http://www.cplusplus.com/reference/cstdio/fseek/ and http://www.cplusplus.com/reference/cstdio/rewind, don't mention flushing at all, other than in reference to fflush().
In any case, there's nothing in the C standard which mandates this behaviour either. C11 7.21.9.2 fseek and 7.21.9.5 rewind (which is, after all, identical to fseek with zero offset and SEEK_SET, apart from also clearing the error indicator) also make no mention of flushing.
All they state is that the file pointer is moved to the relevant position in the stream.
So, to the extent this works in your environment, all we can say is that this works in your environment. It may not work elsewhere, and it may even stop working in your environment at some indeterminate point in the future.
If you really want robust input, you should be using a two-stage approach, fgets to retrieve a line followed by sscanf to get what you want from that line. Mixing the two paradigms of input (scanf and getchar) is frequently problematic.
A good (robust, error-checking, and clearing to end of line if needed) input function can be found here.
I tested it just now and found that fseek doesn't work on stdin. fseek() usually works on files on disk, so it seems the kernel does not allow seeking on stdin, perhaps for security reasons. Anyway, I was happy to see someone who thought like me. Thanks for the good question.
I am programming in C in Unix,
and I am using gets to read the inputs from keyboard.
I always get this warning and the program stops running:
warning: this program uses gets(), which is unsafe.
Can anybody tell me the reason why this is happening?
gets is unsafe because you give it a buffer, but you don't tell it how big the buffer is. The input may write past the end of the buffer, blowing up your program fairly spectacularly. Using fgets instead is a bit better because you tell it how big the buffer is, like this:
const int bufsize = 4096; /* Or a #define or whatever */
char buffer[bufsize];
fgets(buffer, bufsize, stdin);
...so provided you give it the correct information, it doesn't write past the end of the buffer and blow things up.
Slightly OT, but:
You don't have to use a const int for the buffer size, but I would strongly recommend you don't just put a literal number in both places, because inevitably you'll change one but not the other later. The compiler can help:
char buffer[4096];
fgets(buffer, (sizeof buffer / sizeof buffer[0]), stdin);
That expression gets resolved at compile-time, not runtime. It's a pain to type, so I used to use a macro in my usual set of headers:
#define ARRAYCOUNT(a) (sizeof (a) / sizeof (a)[0])
...but I'm a few years out of date with my pure C, there's probably a better way these days.
As mentioned in the previous answers use fgets instead of gets.
But it is not like gets doesn't work at all, it is just very very unsafe. My guess is that you have a bug in your code that would appear with fgets as well so please post your source.
EDIT
Based on the updated information you gave in your comment I have a few suggestions.
I recommend searching for a good C tutorial in your native language, Google is your friend here. As a book I would recommend The C Programming Language
If you have new information it is a good idea to edit them into your original post, especially if it is code, it will make it easier for people to understand what you mean.
You are trying to read a string, basically an array of characters, into a single character, that will of course fail. What you want to do is something like the following.
char username[256];
char password[256];
scanf("%s%s", username, password);
Feel free to comment/edit, I am very rusty even in basic C.
EDIT 2 As jamesdlin warned, using scanf with a bare %s is as dangerous as gets; give each %s a field width one less than the buffer size, e.g. %255s here.
man gets says:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many characters
gets() will read, and because
gets() will continue to store
characters past the end of the buffer,
it is extremely dangerous to use. It
has been used to break computer
security. Use fgets() instead.
gets() is unsafe. It takes one parameter, a pointer to a char buffer. Ask yourself how big you have to make that buffer and how long a user can type input without hitting the return key.
Basically, there is no way to prevent a buffer overflow with gets() - use fgets().
From man gets:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many
characters gets() will read, and
because gets() will continue to store
characters past the end of the buffer,
it is extremely dangerous to use.
It has been used to break computer
security. Use fgets() instead.
Almost everywhere I see scanf being used in a way that should have the same problem (buffer overflow/buffer overrun): scanf("%s",string). Does the problem exist in this case? Why are there no references to it in the scanf man page? Why does gcc not warn when compiling this with -Wall?
ps: I know that there is a way to specify in the format string the maximum length of the string with scanf:
char str[10];
scanf("%9s",str);
edit: I am not asking whether the preceding code is right or not. My question is: if scanf("%s",string) is always wrong, why are there no warnings, and why is there nothing about it in the man page?
The answer is simply that no-one has written the code in GCC to produce that warning.
As you point out, a warning for the specific case of "%s" (with no field width) is quite appropriate.
However, bear in mind that this applies only to scanf(), vscanf(), fscanf() and vfscanf(). This format specifier can be perfectly safe with sscanf() and vsscanf(), so the warning should not be issued in that case. This means that you cannot simply add it to the existing "scanf-style-format-string" analysis code; you will have to split that into "fscanf-style-format-string" and "sscanf-style-format-string" options.
I'm sure if you produce a patch for the latest version of GCC it stands a good chance of being accepted (and of course, you will need to submit patches for the glibc header files too).
Using gets() is never safe. scanf() can be used safely, as you said in your question. However, determining if you're using it safely is a more difficult problem for the compiler to work out (e.g. if you're calling scanf() in a function where you pass in the buffer and a character count as arguments, it won't be able to tell); in that case, it has to assume that you know what you're doing.
When the compiler looks at the formatting string of scanf, it sees a string! That's assuming the formatting string is not entered at run-time. Some compilers like GCC have some extra functionality to analyze the formatting string if entered at compile time. That extra functionality is not comprehensive, because in some situations a run-time overhead is needed which is a NO NO for languages like C. For example, can you detect an unsafe usage without inserting some extra hidden code in this case:
char* str;
size_t size;
scanf("%zu", &size);
str = malloc(size);
scanf("%9s", str); // how can the compiler determine if this is a safe call?!
Of course, there are ways to write safe code with scanf if you specify the number of characters to read, and that there is enough memory to hold the string. In the case of gets, there is no way to specify the number of characters to read.
I am not sure why the man page for scanf doesn't mention the probability of a buffer overrun, but vanilla scanf is not a secure option. A rather dated link - Link shows this as the case. Also, check this (not gcc but informative nevertheless) - Link
It may be simply that scanf will allocate space on the heap based on how much data is read in. Since it doesn't allocate the buffer and then read until the null character is read, it doesn't risk overwriting the buffer. Instead, it reads into its own buffer until the null character is found, and presumably copies that buffer into another of the correct size at the end of the read.