Handling string input in C

If a char array needs to be declared before it is used, how does one declare one so that it can be used to store input?
e.g. The user enters a sentence or series of words. How is this stored so that it can be manipulated?
What is the correct way rather than just declaring an array which is large enough to handle expected input?

If you are talking about console input, you have no choice but to have a FIXED SIZE buffer and use a secure function not allowing more than FIXED_SIZE to be stored on your buffer.
An example would be:
char buff[1024];
fgets(buff, 1024, stdin); // to read from standard input
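You must warn your user that only the first 1023 characters are stored; anything beyond that stays in the input stream for the next read.
If you want to access the last character stored in the buffer (normally the newline, since fgets keeps it):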
printf("%c", buff[strlen(buff)-1]);

I usually use the following function:
#include <stdio.h>
#include <string.h>
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff)-1] = '\0';
return OK;
}
It uses the buffer-overflow-safe fgets with some supporting code to figure out if the line you entered was too long.
You can of course, read partial lines and perform memory re-allocations to store an arbitrary sized input string but usually it's more than adequate to just set a large enough upper boundary and allow for that (say 1K for example). If anyone enters more than that for their name or address, they're probably just being silly :-)
I've actually used that trick (partial reads and reallocs) to do user input before but, to be honest, the need for it was so rare that it didn't make it into my "important source code snippets" repository.
The use of fgets prevents the possibility of buffer overflow which is the big danger to user input.
If you want to test that code, try adding:
int main (void) {
int rc;
char buff[10];
rc = getLine ("Enter string> ", buff, sizeof(buff));
if (rc == NO_INPUT) {
printf ("No input\n");
return 1;
}
if (rc == TOO_LONG) {
printf ("Input too long\n");
return 1;
}
printf ("OK [%s]\n", buff);
return 0;
}
and some sample runs:
pax> ./qq
Enter string> hi bob
OK [hi bob]
pax> ./qq
Enter string>
No input
pax> ./qq
Enter string> hi ho the merry oh
Input too long
(that second one was entering CTRL-D, an immediate end of file).

Read the input through a buffer: the user's text is written into a buffer of some size, and when the buffer is full the program grows the target array with realloc.
(For this you need a char * obtained from malloc rather than a char[] array.)
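The idea above can be sketched as follows. This is only a minimal illustration, not a definitive implementation: the function name read_line_dynamic, the starting size of 16 and the doubling policy are all assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd line without its newline,
   or NULL on end of file or allocation failure. The caller frees it. */
char *read_line_dynamic(FILE *stream)
{
    size_t cap = 16;   /* current buffer size; the starting value is arbitrary */
    size_t len = 0;    /* characters stored so far */
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    while (fgets(buf + len, (int)(cap - len), stream) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';       /* complete line: drop the newline */
            return buf;
        }
        if (len + 1 < cap)
            return buf;                /* end of file hit in mid-line */
        /* Buffer is full: double it and keep reading the same line. */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    if (len == 0) {                    /* nothing was read at all */
        free(buf);
        return NULL;
    }
    return buf;                        /* last line, no trailing newline */
}
```

A caller would loop on read_line_dynamic(stdin) until it returns NULL, freeing each returned pointer once the line has been processed.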

Related

How to take a line input in C?

I was trying to take a full line input in C. Initially I did,
char line[100]; // assume no line is longer than 100 letters.
scanf("%s", line);
Ignoring security flaws and buffer overflows, I knew this could never take more than a word input. I modified it again,
scanf("%[^\n]", line);
This, of course, couldn't take more than a line of input. The following code, however was running into infinite loop,
while(fscanf(stdin, "%[^\n]", line) != EOF)
{
printf("%s\n", line);
}
This was because, the \n was never consumed, and would repeatedly stop at the same point and had the same value in line. So I rewrote the code as,
while(fscanf(stdin, "%[^\n]\n", line) != EOF)
{
printf("%s\n", line);
}
This code worked impeccably (or so I thought) for input from a file. But for input from stdin, it behaved strangely: the first line was printed only after the second line had been entered. I'm unable to understand what is really happening.
All I am doing is this. Note down the string until you encounter a \n, store it in line and then consume the \n from the input buffer. Now print this line and get ready for next line from the input. Or am I being misled?
At the time of posting this question however, I found a better alternative,
while(fscanf(stdin, "%[^\n]%*c", line) != EOF)
{
printf("%s\n", line);
}
This works flawlessly for all cases. But my question still remains. How come this code,
while(fscanf(stdin, "%[^\n]\n", line) != EOF)
{
printf("%s\n", line);
}
worked for inputs from file, but is causing issues for input from standard input?
Use fgets(). @FredK
char buf[N];
while (fgets(buf, sizeof buf, stdin)) {
// crop potential \n if desired.
buf[strcspn(buf, "\n")] = '\0';
...
}
There are too many issues with scanf() for user input; they leave it prone to misuse and to code attacks.
// Leaves trailing \n in stdin
scanf("%[^\n]", line)
// Does nothing if line begins with \n. \n remains in stdin
// As return value not checked, use of line may be UB.
// If some text read, consumes \n and then all following whitespace: ' ' \n \t etc.
// Then does not return until a non-white-space is entered.
// As stdin is usually buffered, this implies 2 lines of user input.
// Fails to limit input.
scanf("%[^\n]\n", line)
// Does nothing if line begins with \n. \n remains in stdin
// Consumes 1 char after `line`, even if next character is not a \n
scanf("%99[^\n]%*c", line)
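The "two lines of user input" effect comes from the trailing \n in the format: any white-space directive matches as much white space as it can, so the call only finishes once a non-white-space character turns up. A small sketch that makes this visible on a stream (the function name read_two_lines is made up for this example, and a width of 99 is added so the reads are bounded):

```c
#include <stdio.h>

/* Hypothetical helper: reads two "lines" with the "%[^\n]\n" style format
   and returns how many of them were converted. */
int read_two_lines(FILE *fp, char first[100], char second[100])
{
    int n = 0;

    /* The trailing "\n" swallows the newline AND any blank lines or
       leading spaces that follow it. */
    if (fscanf(fp, "%99[^\n]\n", first) == 1)
        n++;
    if (fscanf(fp, "%99[^\n]\n", second) == 1)
        n++;
    return n;
}
```

Given the input "one", two blank-ish lines, then "  two", the second call yields "two": the blank lines and the leading spaces were consumed by the first call. On an interactive terminal the same directive has to wait for the next non-blank line before the first call can return.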
Check against EOF is usually the wrong check. @Weather Vane The following, when \n is first entered, returns 0 as line is not populated. As 0 != EOF, code goes on to use an uninitialized line leading to UB.
while(fscanf(stdin, "%[^\n]%*c", line) != EOF)
Consider entering "1234\n" to the following. It is likely an infinite loop, as the first fscanf() reads "123" and tosses the "4", and the next fscanf() call gets stuck on \n.
while(fscanf(stdin, "%3[^\n]%*c", line) != EOF)
When checking the results of *scanf(), check against what you want, not against one of the values you do not want. (But even the following has other troubles)
while(fscanf(stdin, "%[^\n]%*c", line) == 1)
About the closest scanf() to read a line:
char buf[100];
buf[0] = 0;
int cnt = scanf("%99[^\n]", buf);
if (cnt == EOF) Handle_EndOfFile();
// Consume \n if next stdin char is a \n
scanf("%*1[\n]");
// Use buf;
while(fscanf(stdin, "%[^\n]%*c", line) != EOF)
worked for inputs from file, but is causing issues for input from standard input?
Posting sample code and input/data file would be useful. With modest amount of code posted, some potential reasons.
line overrun is UB
Input begins with \n leading to UB
File or stdin not both opened in same mode. \r not translated in one.
Note: The following fails when a line is 100 characters. So meeting the assumption can still lead to UB.
char line[100]; // assume no line is longer than 100 letters.
scanf("%s", line);
Personally, I think fgets() is badly designed. When I read a line, I want to read it whole regardless of its length (short of filling up all RAM). fgets() can't do that in one go. If there is a long line, you have to call it repeatedly until it reaches the newline. getline(), which started as a GNU extension and is now part of POSIX.1-2008, is more convenient in this regard. Here is a function that roughly mimics getline():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
long my_getline(char **buf, long *m_buf, FILE *fp)
{
    long tot = 0; // characters stored so far
    char *tmp;
    if (*buf == NULL || *m_buf <= 0) { // empty buffer; allocate
        *m_buf = 16; // initial size; could be larger
        *buf = malloc(*m_buf);
        if (*buf == NULL) return EOF;
    }
    for (;;) {
        // always append at the current end of the partial line
        if (fgets(*buf + tot, (int)(*m_buf - tot), fp) == NULL)
            return tot? tot : EOF; // reached end-of-file
        tot += (long)strlen(*buf + tot);
        if (tot > 0 && (*buf)[tot - 1] == '\n') { // a complete line
            (*buf)[--tot] = 0;
            break;
        }
        if (tot < *m_buf - 1) break; // end-of-file without a newline
        tmp = realloc(*buf, *m_buf * 2); // incomplete line; double the buffer
        if (tmp == NULL) return EOF;
        *buf = tmp;
        *m_buf *= 2;
    }
    return tot;
}
int main(int argc, char *argv[])
{
    long l, m_buf = 0;
    char *buf = 0;
    while ((l = my_getline(&buf, &m_buf, stdin)) != EOF)
        puts(buf);
    free(buf);
    return 0;
}
I usually use my own readline() function. I wrote this my_getline() a moment ago. It has not been thoroughly tested. Please use with caution.

How to input multi-word string in C

I have this program. I want to input multi-word strings into a 2-D array. But instead of storing the whole string in the first row of the array, the program stores the first three words of my string in the first three rows (three being the number of rows I asked for). Here is the program:
int main()
{
char title[50];
int track;
int question_no;
printf("\nHow many questions?\t");
scanf("%d",&question_no);
track=0;
char question[question_no][100];
while(track<=question_no)
{
printf("Question no %d is:",track+1);
scanf("%s",question[track]);
printf("Q %d.%s",track,question[track]);
track++;
}
}
Here "question_no" is the number of strings I want to store in my 2-D array "question". But when I enter the first string, its first three words end up in three separate rows of the array, and the program never asks me for the second or third string.
A solution to this problem, as I see it, would be a 3-D array, because then each 2-D array inside the outermost array could hold a whole multi-word string (although I think I would still be bound by the length of each string). If the 3-D array idea can solve the problem, is there also a more efficient method, one that is better and faster than the 3-D array approach?
scanf("%s") will scan a string up to the first piece of white space it finds, hence it's unsuitable for multi-word input.
There are ways to use scanf for line-based input but you're generally better off using methods that are easier to protect from buffer overflow, such as an old favorite of mine:
#include <stdio.h>
#include <string.h>
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff)-1] = '\0';
return OK;
}
This is a handy routine which provides line-based input, buffer overflow protection, detection of lines that are too long, cleaning up of those lines so that they don't affect the next input operation and prompting.
A test program can be seen below:
int main (void) {
int rc;
char buff[10] = "";
while ( 1) {
rc = getLine ("\nWhat? ", buff, sizeof(buff));
if (rc == NO_INPUT) {
// Extra NL since my system doesn't output that on EOF.
printf ("\nNo input\n");
return 1;
}
if (rc == TOO_LONG) {
printf ("Input too long [%s]\n", buff);
continue;
}
if ( strcmp (buff, "exit") == 0)
break;
printf ("OK [%s]\n", buff);
}
return 0;
}
And a transcript follows:
pax> ./testprog
What? hello
OK [hello]
What? this is way too big for the input buffer
Input too long [this is w]
What?
OK []
What? exit
pax> _
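Use fgets(): it takes the input as one string including white space, up to and including the first newline, as opposed to scanf("%s"), which stops at the first white space. (gets() reads a line in a similar way but discards the newline; it cannot limit how much is written to the buffer and was removed from the language in C11, so avoid it.)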
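Applied to the program in the question, line-based reading could look roughly like the sketch below. The helper name read_questions and the QUESTION_LEN macro are assumptions for illustration; the row size of 100 is taken from the question.

```c
#include <stdio.h>
#include <string.h>

#define QUESTION_LEN 100  /* same row size as in the question */

/* Hypothetical helper: reads up to count lines from in, one per row.
   Returns the number of rows actually filled. */
int read_questions(FILE *in, int count, char questions[][QUESTION_LEN])
{
    int track = 0;

    /* Note track < count: the original loop's <= would run one row too far. */
    while (track < count && fgets(questions[track], QUESTION_LEN, in) != NULL) {
        /* fgets keeps the newline; cut the string there if present. */
        questions[track][strcspn(questions[track], "\n")] = '\0';
        track++;
    }
    return track;
}
```

If the count itself is still read with scanf("%d", ...), the newline typed after the number stays in the input and would be returned as an empty first question, so it is simpler to read the count with fgets as well and convert it afterwards.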

Is this vulnerable to a stack overflow?

void getinp (char *inp, int siz)
{
puts ("Input value: ");
fgets (inp, siz, stdin);
printf ("buffer3 getinp read %s", inp);
}
From what I've read, fgets is supposed to be used when you want to limit the size of input. So this code shouldn't be vulnerable right?
It is being called like so:
int main (int argc, char *argv[])
{
char buf[16];
getinp (buf, sizeof (buf));
display (buf);
printf ("buffer3 done\n");
}
Thanks for your time.
You won't strike buffer overflow problems if you enter more characters than can be safely stored since fgets restricts the input. It also adds a null terminator (assuming buffer size is greater than 0, of course).
However, you will have problems with information being left in the input buffer the next time you try to read something - this is something that users will find very annoying, entering something like hello again and having it treated as two separate inputs like hello ag and ain. And there's no indication given by fgets that it stopped retrieving input before the end of the line so, as far as your code is aware, everything is fine.
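The split described above is easy to reproduce. The sketch below assumes a 9-byte buffer, chosen only so that the result matches the "hello ag" / "ain" example (the buffer in the question is 16 bytes); the name read_twice is made up.

```c
#include <stdio.h>

/* Hypothetical helper: performs two consecutive fgets reads, as two
   separate prompts would, and returns how many of them succeeded. */
int read_twice(FILE *stream, char first[9], char second[9])
{
    int n = 0;

    first[0] = second[0] = '\0';
    if (fgets(first, 9, stream) != NULL) {      /* stores at most 8 characters */
        n++;
        if (fgets(second, 9, stream) != NULL)   /* picks up whatever was left */
            n++;
    }
    return n;
}
```

With the single input line "hello again", the first buffer receives "hello ag" and the second receives "ain" plus the newline, even though the user typed only once.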
The major things you need to look out for (re buffer overflows on input) are, at a minimum, scanf with an unbounded %s format string and gets, which has no limiting size argument, neither of which are in your code.
If you're looking for a more robust input solution with size limiting, prompting and buffer clearing, check out this code, which provides all those features:
#include <stdio.h>
#include <string.h>
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff)-1] = '\0';
return OK;
}
// Test program for getLine().
int main (void) {
int rc;
char buff[10];
rc = getLine ("Enter string> ", buff, sizeof(buff));
if (rc == NO_INPUT) {
// Extra NL since my system doesn't output that on EOF.
printf ("\nNo input\n");
return 1;
}
if (rc == TOO_LONG) {
printf ("Input too long [%s]\n", buff);
rc = getLine ("Hit ENTER to check remains> ", buff, sizeof(buff));
printf ("Excess [%s]\n", buff);
return 1;
}
printf ("OK [%s]\n", buff);
return 0;
}
And, doing some basic tests:
pax> ./prog
Enter string> [CTRL-D]
No input
pax> ./prog
Enter string> x
OK [x]
pax> ./prog
Enter string> hello
OK [hello]
pax> ./prog
Enter string> hello from earth
Input too long [hello fro]
Hit ENTER to check remains> [ENTER]
Excess []
pax> ./prog
Enter string> i am pax
OK [i am pax]
No, it isn't prone to stack overflow.
Are you confusing stack overflow and buffer overflow by any chance?
http://en.wikipedia.org/wiki/Stack_overflow
fgets will read at most one less than the specified number of bytes, and will make sure that the read string is null-terminated. So as long as you pass the correct size, it should be fine (although the string might not end in a newline).

How do I read a string entered by the user in C?

I want to read the name entered by my user using C programmes.
For this I wrote:
char name[20];
printf("Enter name: ");
gets(name);
But using gets is not good, so what is a better way?
You should never use gets (or scanf with an unbounded string size) since that opens you up to buffer overflows; gets was removed from the language altogether in C11. Use fgets with stdin instead, since it allows you to limit the data that will be placed in your buffer.
Here's a little snippet I use for line input from the user:
#include <stdio.h>
#include <string.h>
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff)-1] = '\0';
return OK;
}
This allows me to set the maximum size, will detect if too much data is entered on the line, and will flush the rest of the line as well so it doesn't affect the next input operation.
You can test it with something like:
// Test program for getLine().
int main (void) {
int rc;
char buff[10];
rc = getLine ("Enter string> ", buff, sizeof(buff));
if (rc == NO_INPUT) {
// Extra NL since my system doesn't output that on EOF.
printf ("\nNo input\n");
return 1;
}
if (rc == TOO_LONG) {
printf ("Input too long [%s]\n", buff);
return 1;
}
printf ("OK [%s]\n", buff);
return 0;
}
I think the best and safest way to read strings entered by the user is using getline()
Here's an example how to do this:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> // ssize_t
int main(int argc, char *argv[])
{
    char *buffer = NULL;
    ssize_t read;
    size_t len = 0; // must be a size_t; getline() stores the buffer size here
    read = getline(&buffer, &len, stdin);
    if (-1 != read)
        puts(buffer);
    else
        printf("No line read...\n");
    printf("Size read: %zd\n Len: %zu\n", read, len);
    free(buffer);
    return 0;
}
On a POSIX system, you probably should use getline if it's available.
You also can use Chuck Falconer's public domain ggets function which provides syntax closer to gets but without the problems. (Chuck Falconer's website is no longer available, although archive.org has a copy, and I've made my own page for ggets.)
I found an easy and nice solution:
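char *string_acquire(char *s, int size, FILE *stream) {
    size_t len;
    int ch;
    if (fgets(s, size, stream) == NULL)
        return NULL; // end of file or read error
    len = strlen(s);
    if (len > 0 && s[len - 1] == '\n')
        s[len - 1] = '\0'; // strip the newline
    else
        while ((ch = fgetc(stream)) != '\n' && ch != EOF)
            ; // discard the rest of the line
    return s;
}
It is based on fgets but strips the '\n' and discards any extra characters left on the line (a replacement for fflush(stdin), which does not work on all systems). That is useful if you have to read more strings afterwards.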
On BSD systems and Android you can also use fgetln:
#include <stdio.h>
char *
fgetln(FILE *stream, size_t *len);
Like so:
size_t line_len;
const char *line = fgetln(stdin, &line_len);
The line is not null terminated and contains \n (or whatever your platform is using) in the end. It becomes invalid after the next I/O operation on stream. You are allowed to modify the returned line buffer.
Using scanf, skipping any blank space before the string and limiting the number of characters read:
#define SIZE 100
....
char str[SIZE];
scanf(" %99[^\n]", str);
/* Or even you can do it like this */
scanf(" %99[a-zA-Z0-9 ]", str);
If you do not limit the number of characters scanf reads, it can be as dangerous as gets.
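To see what the leading space and the width of 99 do, the same format can be tried with sscanf on a fixed string. This is only a sketch; the helper name first_line is made up for the example.

```c
#include <stdio.h>

#define SIZE 100

/* Hypothetical helper: copies the first non-blank line of text into str.
   Returns 1 on success, 0 if there was nothing to read. */
int first_line(const char *text, char str[SIZE])
{
    /* The leading space skips white space; 99 leaves room for the '\0'. */
    return sscanf(text, " %99[^\n]", str) == 1;
}
```

Leading blanks are dropped, the copy stops at the newline, and an over-long line is cut at 99 characters instead of overflowing str.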
ANSI C unknown maximum length solution
Just copy from Johannes Schaub's
https://stackoverflow.com/a/314422/895245
Don't forget to free the returned pointer once you're done with it.
char * getline(void) {
char * line = malloc(100), * linep = line;
size_t lenmax = 100, len = lenmax;
int c;
if(line == NULL)
return NULL;
for(;;) {
c = fgetc(stdin);
if(c == EOF)
break;
if(--len == 0) {
len = lenmax;
char * linen = realloc(linep, lenmax *= 2);
if(linen == NULL) {
free(linep);
return NULL;
}
line = linen + (line - linep);
linep = linen;
}
if((*line++ = c) == '\n')
break;
}
*line = '\0';
return linep;
}
This code uses malloc to allocate 100 chars. Then it fetches char by char from the user. If the user reaches 101 chars, it doubles the buffer with realloc to 200. When 201 is reached, it doubles again to 400 and so on until memory blows.
The reason we double rather than, say, adding 100 extra every time is that increasing the size of a buffer with realloc can lead to a copy of the old buffer, which is a potentially expensive operation.
Arrays must be contiguous in memory because we want to be able to random access them efficiently by memory address. Therefore if we had in RAM:
content     | buffer[0] | buffer[1] | ... | buffer[99] | empty | empty | int i
RAM address | 1000      | 1001      | ... | 1099       | 1100  | 1101  | 1102
we wouldn't be able to just increase the size of buffer, as it would overwrite our int i. So realloc would need to find another location in memory that has 200 free bytes, and then copy the old 100 bytes there and free the 100 old bytes.
By doubling rather than adding, we quickly reach the order of magnitude of the current string size, since exponentials grow really fast, so only a reasonable number of copies is done.
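A quick way to check the claim is to count the growth steps for both strategies. The two functions below are illustrative helpers, not part of the quoted answer; each step stands for one realloc call, which in the worst case copies the whole buffer.

```c
#include <stddef.h>

/* Number of growth steps needed to go from start (must be > 0) to at
   least target bytes when the capacity doubles each time. */
unsigned count_doubling(size_t start, size_t target)
{
    unsigned n = 0;

    while (start < target) {
        start *= 2;
        n++;
    }
    return n;
}

/* Same, but growing by a fixed step (must be > 0) each time. */
unsigned count_adding(size_t start, size_t step, size_t target)
{
    unsigned n = 0;

    while (start < target) {
        start += step;
        n++;
    }
    return n;
}
```

Starting at 100 bytes, reaching one million bytes takes 14 doublings, against 9999 steps when adding 100 bytes at a time.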
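You can use the scanf function to read a string, as long as you limit the field width to the buffer size minus one (19 for a char name[20]):
scanf("%19[^\n]",name);
I don't know of other, better options for reading a string.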

Using scanf to accept user input

gcc 4.4.2
I was reading an article about scanf. I personally have never checked the return code of a scanf.
#include <stdio.h>
int main(void)
{
char buf[64];
if(1 == scanf("%63s", buf))
{
printf("Hello %s\n", buf);
}
else
{
fprintf(stderr, "Input error.\n");
}
return 0;
}
I am just wondering what other techniques experienced programmers do when they use scanf when they want to get user input? Or do they use another function or write their own?
Thanks for any suggestions,
EDIT =========
#include <stdio.h>
int main(void)
{
char input_buf[64] = {0};
char data[64] = {0};
printf("Enter something: ");
if( fgets(input_buf, sizeof(input_buf), stdin) != NULL )
{
/* parse the input entered */
sscanf(input_buf, "%s", data);
}
printf("Input [ %s ]\n", data);
return 0;
}
I think most programmers agree that scanf is bad, and most agree to use fgets and sscanf. I can use fgets to read in the input, but if I don't know what the user will enter, how do I know what to parse? For example, what if the user enters their address, which would contain numbers and characters in any order?
Don't use scanf directly. It's surprisingly hard to use. It's better to read an entire line of input and to then parse it (possibly with sscanf).
Read this entry (and the entries it references) from the comp.lang.c FAQ:
http://c-faq.com/stdio/scanfprobs.html
Edit:
Okay, to address your additional question from your own edit: If you allow unstructured input, then you're going to have to attempt to parse the string in multiple ways until you find one that works. If you can't find a valid match, then you should reject the input and prompt the user again, probably explaining what format you want the input to be in.
For anything more complicated, you'd probably be better off using a regular expression library or even using dedicated lexer/parser toolkits (e.g. flex and bison).
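Trying several formats in turn might look like the sketch below. The two accepted shapes ("number street" and "street only"), the function name parse_address and the 64-byte street buffer are assumptions chosen purely for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: tries the more specific format first.
   Returns 2 for "number street", 1 for "street" only, 0 if nothing matched.
   *number may already have been written even when the result is not 2. */
int parse_address(const char *line, int *number, char street[64])
{
    if (sscanf(line, "%d %63[^\n]", number, street) == 2)
        return 2;
    if (sscanf(line, " %63[^\n]", street) == 1)
        return 1;
    return 0;   /* reject: the caller should prompt the user again */
}
```

Because the line is already in memory, each attempt starts again from the beginning of the same text, which is not possible when scanf reads straight from stdin.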
I don't use scanf() for interactive user input; I read everything as text using fgets(), then parse the input as necessary, using strtol() and strtod() to convert text to numeric values.
One example of where scanf() falls down is when the user enters a bad numeric value, but the initial part of it is valid, something like the following:
if (scanf("%d", &num) == 1)
{
// process num
}
else
{
// handle error
}
If the user types in "12e4", scanf() will successfully convert and assign the "12" to num, leaving "e4" in the input stream to foul up a future read. The entire input should be treated as bogus, but scanf() can't catch that kind of error. OTOH, if I do something like:
if (fgets(buffer, sizeof buffer, stdin))
{
int val;
char *chk;
val = (int) strtol(buffer, &chk, 10);
if (!isspace(*chk) && *chk != 0)
{
// non-numeric character in input; reject it completely
}
else
{
// process val
}
}
I can catch the error in the input and reject it before using any part of it. This also does a better job of not leaving garbage in the input stream.
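A slightly fuller version of the same check is sketched below; it also rejects input with no digits at all and values that do not fit in an int. The function name parse_int is made up for this example.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to an int.
   Returns 1 and stores the value on success, 0 on any kind of bad input. */
int parse_int(const char *text, int *out)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(text, &end, 10);
    if (end == text)
        return 0;                      /* no digits at all, e.g. an empty line */
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
        return 0;                      /* out of range for an int */
    while (isspace((unsigned char)*end))
        end++;                         /* allow trailing white space (the newline) */
    if (*end != '\0')
        return 0;                      /* trailing junk such as the "e4" in "12e4" */
    *out = (int)val;
    return 1;
}
```

It would typically be called on the buffer filled by fgets, prompting the user again whenever it returns 0.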
scanf() is a great tool if you can guarantee your input is always well-formed.
scanf() has problems, in that if a user is expected to type an integer, and types a string instead, often the program bombs. This can be overcome by reading all input as a string (use getchar()), and then converting the string to the correct data type.
/* example one, to read a line at a time */
#include <stdio.h>
#include <ctype.h>
#define MAXBUFFERSIZE 80
void cleartoendofline( void ); /* ANSI function prototype */
void cleartoendofline( void )
{
    int ch; /* int, so that EOF can be told apart from a character */
    ch = getchar();
    while( (ch != '\n') && (ch != EOF) )
        ch = getchar();
}
int main( void )
{
    int ch; /* handles user input; int so it can hold EOF */
    char choice; /* answer to the continue prompt */
    char buffer[MAXBUFFERSIZE]; /* sufficient to handle one line */
    int char_count; /* number of characters read for this line */
    int exit_flag = 0;
    int valid_choice;
    while( exit_flag == 0 ) {
        printf("Enter a line of text (<80 chars)\n");
        ch = getchar();
        char_count = 0;
        /* stop one short of the buffer size to leave room for the null */
        while( (ch != '\n') && (ch != EOF) && (char_count < MAXBUFFERSIZE - 1)) {
            buffer[char_count++] = (char) ch;
            ch = getchar();
        }
        buffer[char_count] = 0x00; /* null terminate buffer */
        if( (ch != '\n') && (ch != EOF) )
            cleartoendofline(); /* discard the part that did not fit */
        printf("\nThe line you entered was:\n");
        printf("%s\n", buffer);
        valid_choice = 0;
        while( valid_choice == 0 ) {
            printf("Continue (Y/N)?\n");
            if( scanf(" %c", &choice ) != 1 )
                return 0; /* end of input */
            choice = (char) toupper( (unsigned char) choice );
            if((choice == 'Y') || (choice == 'N') )
                valid_choice = 1;
            else
                printf("\007Error: Invalid choice\n");
            cleartoendofline();
        }
        if( choice == 'N' ) exit_flag = 1;
    }
    return 0;
}
I make a loop call fgets until the end of the line is read, and then call sscanf to parse the data. It's a good idea to check whether sscanf reaches the end of the input line.
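One way to check that sscanf reached the end of the line is the %n directive, which records how many characters have been consumed so far. The sketch below assumes the expected input is exactly two integers; the function name parse_two_ints is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: accepts a line holding exactly two integers.
   Returns 1 and stores them on success, 0 otherwise. */
int parse_two_ints(const char *line, int *a, int *b)
{
    int consumed = 0;

    /* The space before %n skips trailing white space (including the
       newline); %n then stores the number of characters consumed. */
    if (sscanf(line, "%d %d %n", a, b, &consumed) != 2)
        return 0;
    return line[consumed] == '\0';   /* anything left over is junk */
}
```

%n does not count towards the return value of sscanf, so the comparison is still against 2.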
I rarely use scanf. Most of the times, I use fgets() to read data as a string. Then, depending upon the need, I may use sscanf(), or other functions such as strto* family of functions, str*chr(), etc., to get data from the string.
If I use scanf() or fgets() + sscanf(), I always check the return values of the functions to make sure they did what I wanted them to do. I also don't use strtok() to tokenize strings, because I think the interface of strtok() is broken.
