Are gets() and puts() technically sound to be used for strings? - c

my question is about
gets()
and
puts()
are they a perfect solution for string input and output?

gets is marked as obsolescent in C99 and has been removed in C11 because of security issues with this function. Don't use it, use fgets instead. As an historical note, gets was exploited (in fingerd) by the first massive internet worm: the inet worm back in 1988.
puts function is OK if it fits your needs.

gets() is fundamentally insecure in a really horrific way: it will write an unlimited number of characters to its argument, overflowing any buffer it is provided. As such, it should never, ever be used. Many newer compilers will issue an automatic warning if you use it. Instead, use fgets(), which takes a length argument:
char buf[...];
fgets(buf, sizeof(buf), stdin);
On the other hand, puts() is totally fine. It's equivalent to printf("%s\n", x);, and some compilers will in fact convert certain constant printf() calls to puts() as a standard optimization. Go wild.

For gets, see the man page:
BUGS
Never use gets(). Because it is impossible to tell without knowing
the data in advance how many characters gets() will read, and because gets() will continue to store
characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.
puts is fine, if you're just looking to write a string to stdout.

Related

What's the difference between gets and scanf?

If the code is
scanf("%s\n",message)
vs
gets(message)
what's the difference?It seems that both of them get input to message.
The basic difference [in reference to your particular scenario],
scanf() ends taking input upon encountering a whitespace, newline or EOF
gets() considers a whitespace as a part of the input string and ends the input upon encountering newline or EOF.
However, to avoid buffer overflow errors and to avoid security risks, its safer to use fgets().
Disambiguation: In the following context I'd consider "safe" if not leading to trouble when correctly used. And "unsafe" if the "unsafetyness" cannot be maneuvered around.
scanf("%s\n",message)
vs
gets(message)
What's the difference?
In terms of safety there is no difference, both read in from Standard Input and might very well overflow message, if the user enters more data then messageprovides memory for.
Whereas scanf() allows you to be used safely by specifying the maximum amount of data to be scanned in:
char message[42];
...
scanf("%41s", message); /* Only read in one few then the buffer (messega here)
provides as one byte is necessary to store the
C-"string"'s 0-terminator. */
With gets() it is not possible to specify the maximum number of characters be read in, that's why the latter shall not be used!
The main difference is that gets reads until EOF or \n, while scanf("%s") reads until any whitespace has been encountered. scanf also provides more formatting options, but at the same time it has worse type safety than gets.
Another big difference is that scanf is a standard C function, while gets has been removed from the language, since it was both superfluous and dangerous: there was no protection against buffer overruns. The very same security flaw exists with scanf however, so neither of those two functions should be used in production code.
You should always use fgets, the C standard itself even recommends this, see C11 K.3.5.4.1
Recommended practice
6 The fgets function allows properly-written
programs to safely process input lines too long to store in the result
array. In general this requires that callers of fgets pay attention to
the presence or absence of a new-line character in the result array.
Consider using fgets (along with any needed processing based on
new-line characters) instead of gets_s.
(emphasis mine)
There are several. One is that gets() will only get character string data. Another is that gets() will get only one variable at a time. scanf() on the other hand is a much, much more flexible tool. It can read multiple items of different data types.
In the particular example you have picked, there is not much of a difference.
gets - Reads characters from stdin and stores them as a string.
scanf - Reads data from stdin and stores them according to the format specified int the scanf statement like %d, %f, %s, etc.
gets:->
gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a null byte ('\0').
BUGS:->
Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.
scanf:->
The scanf() function reads input from the standard input stream stdin;
BUG:->
Some times scanf makes boundary problems when deals with array and string concepts.
In case of scanf you need that format mentioned, unlike in gets. So in gets you enter charecters, strings, numbers and spaces.
In case of scanf , you input ends as soon as a white-space is encountered.
But then in your example you are using '%s' so, neither gets() nor scanf() that the strings are valid pointers to arrays of sufficient length to hold the characters you are sending to them. Hence can easily cause an buffer overflow.
Tip: use fgets() , but that all depends on the use case
The concept that scanf does not take white space is completely wrong. If you use this part of code it will take white white space also :
#include<stdio.h>
int main()
{
char name[25];
printf("Enter your name :\n");
scanf("%[^\n]s",name);
printf("%s",name);
return 0;
}
Where the use of new line will only stop taking input. That means if you press enter only then it will stop taking inputs.
So, there is basically no difference between scanf and gets functions. It is just a tricky way of implementation.
scanf() is much more flexible tool while gets() only gets one variable at a time.
gets() is unsafe, for example: char str[1]; gets(str)
if you input more then the length, it will end with SIGSEGV.
if only can use gets, use malloc as the base variable.

why is the following code wrong?

I recently faced an interview question on what's the hidden problem with the following code. I was unable to detect it .Can anyone help?
#include<stdio.h>
int main(void)
{
char buff[10];
memset(buff,0,sizeof(buff));
gets(buff);
printf("\n The buffer entered is [%s]\n",buff);
return 0;
}
The function gets accepts a string from stdin and does not check the capacity of the buffer.This may result in buffer overflow. The standard function fgets() can be used here.
gets could return much more than 10 characters.
gets is really problematic because you can't tell it to only fill 'buff' up to a length of 10.
check the Bugs Section of this manual which says
Never use gets(). Because it is impossible to tell without knowing
the data in advance how many characters gets() will read, and because
gets() will continue to store characters past the end of the buffer,
it is extremely dangerous to use. It has been used to break computer
security. Use fgets() instead.
It is not advisable to mix calls to input functions from the stdio
library with low-level calls to read(2) for the file descriptor
associated with the input stream; the results will be undefined and
very probably not what you want.
It is always recommended to use fgets()/ scanf() over gets().
by using the function gets() you don't have the option to limit the user to a certian text length, which may cause a buffer overflow exception. That is why you should not use it.
Try to use fgets() instead:
fgets(buff, MAX_LENGTH_ stdin);
Good luck!

why scanf is safer than getchar? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
getc Vs getchar Vs Scanf for reading a character from stdin
I know that getchar is a macro that takes character as an input from stdin . where as scanf is a function that takes any data type as an input from stdin.
As getchar is a macro thats why it helps to make program run faster ,but inspite of this it is recommended to use scanf in place of getchar. WHY?
i learned on net that scanf is safe as compared to getchar,so ,what makes scanf safe?
I would argue that both can be as safe or unsafe as you make them, but in the end I would settle for using scanf():
Since it processes more than one characters at each call, scanf() is potentially faster.
Its behavior is well-documented which means that there are well-established methods to avoid security issues, such as including hard-coded length limits in the format string, and of course never using format strings generated from user input.
scanf() is far easier to use - getchar() may be "safer" on its own, but you will have to write a lot of code around it to get some actual functionality out of it. Code that will duplicate functionality provided by scanf() and will then have to be reviewed for security implications. The scanf() implementation is likely to be peer-reviewed extensively, something that your own code will probably never be.
Code using standard functions, such as scanf() is far easier to maintain than anything based on custom libraries, especially in programming teams with significant staff turnover.
scanf is faster because it can read multiple characters at once.
However, getchar is safer because it can only read one character at a time. If input is not checked, it is relatively easy to exploit program that is using scanf.
getChar() is a function as What I know and what I have read, Although it is termed as a macro by some of the websites but it is clearly termed as a function in the official C++ Reference website.
getchar() is used when you need one and only one character to be input from the keyboard where as scanf() is used to get multiple characters. Which one is faster then other depends totally on your requirement and implementation.
As far as the safety and ease of use is concerned i would cast my vote to scanf() as I can input multiple characters and I don't have to use flush() after the use of scanf() which sometimes becomes mandatory when using getchar().

why gets() is not working?

I am programming in C in Unix,
and I am using gets to read the inputs from keyboard.
I always get this warning and the program stop running:
warning: this program uses gets(), which is unsafe.
Can anybody tell me the reason why this is happening?
gets is unsafe because you give it a buffer, but you don't tell it how big the buffer is. The input may write past the end of the buffer, blowing up your program fairly spectacularly. Using fgets instead is a bit better because you tell it how big the buffer is, like this:
const int bufsize = 4096; /* Or a #define or whatever */
char buffer[bufsize];
fgets(buffer, bufsize, stdin);
...so provided you give it the correct information, it doesn't write past the end of the buffer and blow things up.
Slightly OT, but:
You don't have to use a const int for the buffer size, but I would strongly recommend you don't just put a literal number in both places, because inevitably you'll change one but not the other later. The compiler can help:
char buffer[4096];
fgets(buffer, (sizeof buffer / sizeof buffer[0]), stdin);
That expression gets resolved at compile-time, not runtime. It's a pain to type, so I used to use a macro in my usual set of headers:
#define ARRAYCOUNT(a) (sizeof a / sizeof a[0])
...but I'm a few years out of date with my pure C, there's probably a better way these days.
As mentioned in the previous answers use fgets instead of gets.
But it is not like gets doesn't work at all, it is just very very unsafe. My guess is that you have a bug in your code that would appear with fgets as well so please post your source.
EDIT
Based on the updated information you gave in your comment I have a few suggestions.
I recommend searching for a good C tutorial in your native language, Google is your friend here. As a book I would recommend The C Programming Language
If you have new information it is a good idea to edit them into your original post, especially if it is code, it will make it easier for people to understand what you mean.
You are trying to read a string, basically an array of characters, into a single character, that will of course fail. What you want to do is something like the following.
char username[256];
char password[256];
scanf("%s%s", username, password);
Feel free to comment/edit, I am very rusty even in basic C.
EDIT 2 As jamesdlin warned, usage of scanf is as dangerous as gets.
man gets says:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many characters
gets() will read, and because
gets() will continue to store
characters past the end of the buffer,
it is extremely dangerous to use. It
has been used to break computer
security. Use fgets() instead.
gets() is unsafe. It takes one parameter, a pointer to a char buffer. Ask yourself how big you have to make that buffer and how long a user can type input without hitting the return key.
Basically, there is no way to prevent a buffer overflow with gets() - use fgets().

if one complains about gets(), why not do the same with scanf("%s",...)?

From man gets:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many
characters gets() will read, and
because gets() will continue to store
characters past the end of the buffer,
it is extremely dangerous to use.
It has been used to break computer
security. Use fgets() instead.
Almost everywhere I see scanf being used in a way that should have the same problem (buffer overflow/buffer overrun): scanf("%s",string). This problem exists in this case? Why there are no references about it in the scanf man page? Why gcc does not warn when compiling this with -Wall?
ps: I know that there is a way to specify in the format string the maximum length of the string with scanf:
char str[10];
scanf("%9s",str);
edit: I am not asking to determe if the preceding code is right or not. My question is: if scanf("%s",string) is always wrong, why there are no warnings and there is nothing about it in the man page?
The answer is simply that no-one has written the code in GCC to produce that warning.
As you point out, a warning for the specific case of "%s" (with no field width) is quite appropriate.
However, bear in mind that this is only the case for the case of scanf(), vscanf(), fscanf() and vfscanf(). This format specifier can be perfectly safe with sscanf() and vsscanf(), so the warning should not be issued in that case. This means that you cannot simply add it to the existing "scanf-style-format-string" analysis code; you will have to split that into "fscanf-style-format-string" and "sscanf-style-format-string" options.
I'm sure if you produce a patch for the latest version of GCC it stands a good chance of being accepted (and of course, you will need to submit patches for the glibc header files too).
Using gets() is never safe. scanf() can be used safely, as you said in your question. However, determining if you're using it safely is a more difficult problem for the compiler to work out (e.g. if you're calling scanf() in a function where you pass in the buffer and a character count as arguments, it won't be able to tell); in that case, it has to assume that you know what you're doing.
When the compiler looks at the formatting string of scanf, it sees a string! That's assuming the formatting string is not entered at run-time. Some compilers like GCC have some extra functionality to analyze the formatting string if entered at compile time. That extra functionality is not comprehensive, because in some situations a run-time overhead is needed which is a NO NO for languages like C. For example, can you detect an unsafe usage without inserting some extra hidden code in this case:
char* str;
size_t size;
scanf("%z", &size);
str = malloc(size);
scanf("%9s"); // how can the compiler determine if this is a safe call?!
Of course, there are ways to write safe code with scanf if you specify the number of characters to read, and that there is enough memory to hold the string. In the case of gets, there is no way to specify the number of characters to read.
I am not sure why the man page for scanf doesn't mention the probability of a buffer overrun, but vanilla scanf is not a secure option. A rather dated link - Link shows this as the case. Also, check this (not gcc but informative nevertheless) - Link
It may be simply that scanf will allocate space on the heap based on how much data is read in. Since it doesn't allocate the buffer and then read until the null character is read, it doesn't risk overwriting the buffer. Instead, it reads into its own buffer until the null character is found, and presumably copies that buffer into another of the correct size at the end of the read.

Resources