Read arbitrarily sized string from stdin? [duplicate] - c

I want to read a string from the console.
Using scanf or fgets however, it seems to me that it's only possible to read a string of a fixed maximum size. Even worse, there seems to be no way of checking how many characters were entered in case the user enters too much (in that case I could simply realloc the array in order for the string to fit into the array).
I read that I'm supposed to read one character at a time in the answer to this question, however I don't know how to read one character at a time without having the user press enter after each character.
How can I do it?

The GNU C Library documentation says that:
Standard C has functions to do this, but they aren't very safe: null characters and even (for gets) long lines can confuse them. So the GNU library provides the nonstandard getline function that makes it easy to read lines reliably.
and that
[getline] is a GNU extension, but it is the recommended way to read lines from a stream. The alternative standard functions are unreliable.
So if your C library provides getline (glibc does, and the function has been part of POSIX since POSIX.1-2008), I'd say you should use it. If it doesn't, or if you really prefer to use something from standard C, then you need to read one character at a time.
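For what it's worth, a minimal sketch of the getline route, assuming a POSIX.1-2008 system. The helper name read_line_getline is made up for illustration; only getline itself comes from the library.

```c
#define _POSIX_C_SOURCE 200809L  /* ask <stdio.h> to declare getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: read one line of any length from f with getline().
 * Returns a malloc'd string without the trailing newline, or NULL on
 * EOF or error. The caller frees the result. */
char *read_line_getline(FILE *f)
{
    char *line = NULL;           /* getline() allocates when this is NULL */
    size_t cap = 0;              /* capacity, maintained by getline() */
    ssize_t n = getline(&line, &cap, f);

    if (n < 0) {                 /* EOF with nothing read, or an error */
        free(line);              /* the buffer must be freed even on failure */
        return NULL;
    }
    if (n > 0 && line[n - 1] == '\n')
        line[n - 1] = '\0';      /* strip the newline getline() keeps */
    return line;
}
```

You would call it as char *s = read_line_getline(stdin); and free(s) when done.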
I don't know how to read one character at a time without having the user press enter after each character.
Yes, you do. The user enters a sequence of characters, and then presses Enter. Your first fgetc call will block until the user presses Enter, but after that, subsequent fgetc calls will return immediately, up until you read the newline. (Then it will block again, until the user presses Enter again.) Reading "one character at a time" doesn't mean that you have to read each character before the user types the next one; it just means that, once the user is done typing a line, you read that line one character at a time.
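A rough sketch of that character-at-a-time approach, using only standard C. The name read_line_chars and the starting capacity of 16 are arbitrary choices for illustration, not anything from the question.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length, one character at a
 * time. Returns a malloc'd string without the newline, or NULL on
 * allocation failure or if EOF arrives before any character. */
char *read_line_chars(FILE *f)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    int c;                          /* int, so EOF stays distinguishable */

    if (buf == NULL)
        return NULL;
    while ((c = fgetc(f)) != EOF && c != '\n') {
        if (len + 1 == cap) {       /* keep one slot for the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {     /* nothing was read at all */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Called as read_line_chars(stdin), the first fgetc blocks until the user presses Enter; the remaining calls return immediately until the newline is reached.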

Try running this..
#include <stdio.h>
int main(){
int next; /* int, not char: getchar() returns an int so that EOF can be told apart from any character */
while((next=getchar())!=EOF)
printf("%c\n",next);
}
then check out the man page for getchar() to see what's really at hand.

char c = '\0';
/* stop at the end of the line, or when scanf fails (EOF or read error) */
while (scanf("%c", &c) == 1 && c != '\n')
{
    // do stuffs with c
}

You can use fgets() in a loop and realloc if the last character is not a \n
/* Sketch: allocation failures are only marked, not handled */
char *buffer, *tmp;
size_t bufsiz = 100; /* allocated size */
size_t buflen = 0;   /* characters stored so far */
buffer = malloc(bufsiz);
if (buffer == NULL) /* deal with error */;
buffer[0] = '\0';
for (;;) {
    /* read into the unused tail of the buffer */
    if (fgets(buffer + buflen, (int)(bufsiz - buflen), stdin) == NULL) break; /* EOF or error */
    buflen += strlen(buffer + buflen);
    if (buflen > 0 && buffer[buflen - 1] == '\n') break;
    tmp = realloc(buffer, bufsiz * 2);
    if (tmp == NULL) /* deal with error */;
    buffer = tmp;
    bufsiz *= 2;
}
/* use buffer (and bufsiz and buflen) */
free(buffer);

The accepted answer should note that getchar() returns an int. The char data type is not big enough to hold EOF.
We could read a predetermined amount of text and then discard the rest of the input. That approach has more than its share of critics (how dare we presume to know what to discard). The other option is to use getline or write a custom function. I thought I'd try the latter.
This does not prevent users from filling up memory with cat large_file.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#define MAX 50
#define JUSTDO(a) if (!(a)) {perror(#a);exit(1);}
/** char *get_line FILE *f
* reads an arbitrarily long line of text or until EOF
* caller must free the pointer returned after use
*/
char *get_line (FILE *f) {
int len=MAX;
char buf[MAX],*e=NULL,*ret;JUSTDO (ret=calloc(MAX,1));
while (fgets (buf,MAX,f)) {
if (len-strlen(ret)<MAX) JUSTDO (ret=realloc(ret,len*=2));
strcat (ret,buf) ;if ((e=strrchr (ret,'\n'))) break;
} if (e) *e='\0';
return ret;
}
/* main */
int main (void) {
char *s;
printf ("enter a line of text:\n");
printf ("line was '%s'\n",s=get_line(stdin)) ;free (s);
return 0;
}

Related

C - Buffersize conditional not breaking while loop

UPDATE:
Okay, I'm going about this entirely the wrong way. The reason I'm not getting the result I want is that I'm reading from the terminal and awaiting an Enter keypress before the program continues to execute. What I actually need to do is program a "screen" or X11 window to read real-time inputs. Therefore my question is now redundant. Thanks for everyone's suggestions.
Is there a better way to program this which would allow me to capture
keyPress time? And, why is the BUFFERSIZE conditional in the while loop not
breaking out of the loop?
#define BUFFERSIZE 100
int main(void) {
int terminalInput;
int keyPress = 0;
int inputArrayBuffer[BUFFERSIZE];
int bufferExceptionFlag = 0;
printf("\n\t\t\t->"); /* Prompt */
while((terminalInput = getchar()) != '\n' && keyPress < BUFFERSIZE) {
inputArrayBuffer[keyPress] = terminalInput;
++keyPress;
if (keyPress >= BUFFERSIZE) {
bufferExceptionFlag = 1;
}
}
return 0;
}
A few issues. For one, your inputArrayBuffer should be a char array, not an int array. Secondly, there are standard C libraries that include the functionality of what you want to do.
Question 1: "Is there a better way to program this which would allow me to capture keyPress time?"
Yes. For reading stdin until either a newline is encountered or a max length is encountered, the fgets function from stdio.h works nicely (although alternatives may exist). Something like,
fgets(inputArrayBuffer, BUFFERSIZE, stdin)
I understand you want to know the number of keys the user entered, not including the newline key. This is essentially the string length. An easier way to achieve this is to simply determine the length of the string entered by the user. Something like,
keyPress = strcspn(inputArrayBuffer, "\n"); // counts the characters before the newline that fgets stores; also correct when no newline was stored
If you must capture a single character at a time, then the original code you proposed should work, just be certain to define inputArrayBuffer as char inputArrayBuffer[BUFFERSIZE];
Question 2: "And, why is the BUFFERSIZE conditional in the while loop not breaking out of the loop?"
It definitely should be breaking out of the loop. But don't confuse bufferExceptionFlag equaling the value 1 with signifying that keyPress < BUFFERSIZE didn't cause the loop to break. Clearly, when you ++keyPress inside the loop, if keyPress had the value of 99, it would then become 100, which would cause bufferExceptionFlag to be set. However, on the next loop iteration keyPress < BUFFERSIZE would be false and the loop would break.
Here is a simpler and more appropriate solution, in my opinion.
#include <stdio.h>
#include <string.h>
#define BUFFERSIZE 100
int main(void) {
char inputArrayBuffer[BUFFERSIZE];
unsigned long keyPress = 0;
printf("\n\t\t\t->"); /* Prompt */
if (fgets(inputArrayBuffer, BUFFERSIZE, stdin) == NULL)
return 1; /* EOF or read error: the buffer contents cannot be used */
keyPress = strcspn(inputArrayBuffer, "\n"); // index of the newline fgets stores, or the full length if the line was cut short
printf("User entered: %s\n", inputArrayBuffer);
printf("Input length: %lu\n", keyPress);
return 0;
}
Note that fgets includes the newline character on the string it reads. To remove this character you can do something like inputArrayBuffer[keyPress] = '\0';

How do you prevent buffer overflow using fgets?

So far I have been using if statements to check the size of the user-inputted strings. However, they don't seem to be very useful: no matter the size of the input, the while loop ends and it returns the input to the main function, which then just outputs it.
I don't want the user to enter anything greater than 10, but when they do, the additional characters just overflow and are outputted on a newline. The whole point of these if statements is to stop that from happening, but I haven't been having much luck.
#include <stdio.h>
#include <string.h>
#define SIZE 10
char *readLine(char *buf, size_t sz) {
int true = 1;
while(true == 1) {
printf("> ");
fgets(buf, sz, stdin);
buf[strcspn(buf, "\n")] = 0;
if(strlen(buf) < 2 || strlen(buf) > sz) {
printf("Invalid string size\n");
continue;
}
if(strlen(buf) > 2 && strlen(buf) < sz) {
true = 0;
}
}
return buf;
}
int main(int argc, char **argv) {
char buffer[SIZE];
while(1) {
char *input = readLine(buffer, SIZE);
printf("%s\n", input);
}
}
Any help towards preventing buffer overflow would be much appreciated.
When the user enters a line longer than sz - 1 characters, fgets stores only the first sz - 1 of them. When your program gets back to the fgets call, stdin still holds the rest of the user's first input, so the program grabs up to sz - 1 more characters without waiting, and so on.
The call to strcspn is also deceiving: if the "\n" is not among the characters you grabbed, it returns the length of the string (at most sz - 1), even though there's no newline.
After you've taken input from stdin, you can check whether the last character is a '\n' character. If it's not, the input goes past your allowed size and the rest of the line needs to be discarded from stdin. One way to do that is below. To be clear, do this only when more characters than allowed were entered; otherwise the loop sits waiting for the user to type another line.
int c; /* int, not char, so that EOF can be detected */
while((c = getchar()) != '\n' && c != EOF)
{}
However, trying not to restructure your code too much, we need to know whether your buffer contains the newline before you set it to 0. It will be at the end if it exists, so you can use the following to check.
size_t len = strlen(buf);
int containsNewline = len > 0 && buf[len - 1] == '\n';
Also be careful with your size checks, you currently don't handle the case for a strlen of 2 or sz. I would also never use identifier names like "true", which would be a possible value for a bool variable. It makes things very confusing.
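Putting those pieces together, here is one possible sketch. The function name read_bounded_line and its return convention are my own invention, not part of the question's code; it takes a FILE * so it can be tried on something other than stdin.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (capacity sz, sz >= 2).
 * Returns 1 if the whole line fit, 0 if it was too long (the excess is
 * discarded), and -1 on EOF or read error. */
int read_bounded_line(FILE *f, char *buf, size_t sz)
{
    size_t len;
    int c;                          /* int, so EOF can be detected */

    if (fgets(buf, (int)sz, f) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';        /* complete line: strip the newline */
        return 1;
    }
    /* No newline stored: peek at what follows the stored characters. */
    c = fgetc(f);
    if (c == '\n' || c == EOF)
        return 1;                   /* the line filled the buffer exactly */
    while ((c = fgetc(f)) != '\n' && c != EOF)
        ;                           /* discard the rest of the long line */
    return 0;
}
```

In the question's loop you would print "Invalid string size" and prompt again whenever this returns 0, and stop when it returns -1.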
If the input line is longer than 9 characters, your fgets() reads only the first 9 characters into buf, because it reserves one element for the terminating '\0'. Since these characters don't contain the trailing \n, strcspn(buf, "\n") returns 9, so the assignment to buf[9] only overwrites the existing terminator and stays within buf[] (max index is 9). The real problem is that the rest of the line stays in stdin and is picked up by the next fgets() call.
Additionally, never use true or false as the name of a variable; it makes the code confusing. Use something like 'ok' instead.
Finally: please clarify what output is expected when the input contains a string longer than 10 characters. Should it be truncated?

C - Using fgets until newline/-1 [closed]

So I'm trying to make it so that you can write text into a file until you make a newline or type -1. My problem is that when you write, it just keeps going until it crashes and gives the error "Stack around the variable "inputChoice" was corrupted".
I believe the problem is that the program doesn't stop accepting stdin when you want to stop typing (-1, newline) and that causes the error. I've tried with a simple scanf and it works, but you can only write a word. No spaces and it doesn't support multiple lines either. That's why I have to use fgets
Judging from your comments, I assume that there are some basic concepts in C
that you haven't fully understood, yet.
C-Strings
A C-String is a sequence of bytes. This sequence must end with the value 0.
Every value in the sequence represents a character based on the
ASCII encoding, for example the
character 'a' is 97, 'b' is 98, etc. The character '\0' has
the value 0 and it's the character that determines the end of the string.
That's why you hear a lot that C-Strings are '\0'-terminated.
In C you use an array of chars (char string[], char string[SOME VALUE]) to
save a string. For a string of length n, you need an array of dimension n+1, because
you also need one space for the terminating '\0' character.
When dealing with strings, you always have to think about the proper type,
whether you are using an array or a pointer. A pointer
to char doesn't necessarily mean that you are dealing with a C-String!
Why am I telling you this? Because of:
char inputChoice = 0;
printf("Do you wish to save the Input? (Y/N)\n");
scanf("%s", &inputChoice);
I haven't changed much, got very demotivated after trying for a while.
I changed the %s to an %c at scanf(" %c", &inputChoice) and that
seems to have stopped the program from crashing.
which shows that you haven't understood the difference between %s and %c.
The %c conversion specifier character tells scanf that it must match a single character and it expects a pointer to char.
man scanf
c
Matches a sequence of characters whose length is specified by the maximum field
width (default 1); the next pointer must be a
pointer to char, and there must be enough room for all the characters
(no terminating null byte is added). The usual skip of
leading white space is suppressed. To skip white space first, use an explicit space in the format.
Forget the bit about the length, it's not important right now.
The important part is that the next pointer must be a pointer to char and
that no terminating null byte is added. For the format scanf("%c", the function
expects a pointer to char and it's not going to write the terminating '\0'
character, so the result won't be a C-String. If you want to read one letter and one
letter only:
char c;
scanf("%c", &c);
// also possible, but only the first char
// will have a defined value
char c[10];
scanf("%c", c);
The first one is easy to understand. The second one is more interesting: Here
you have an array of char of dimension 10 (i.e it holds 10 chars). scanf
will match a single letter and write it on c[0]. However the result won't be
a C-String, you cannot pass it to puts nor to other functions that expect
C-Strings (like strcpy).
The %s conversion specifier character tells scanf that it must match a sequence of non-white-space characters
man scanf
s
Matches a sequence of non-white-space characters; the next pointer must be a
pointer to the initial element of a character array that is long enough to
hold the input sequence and the terminating null byte ('\0'), which is added
automatically.
Here the result will be that a C-String is saved. You also have to have enough
space to save the string:
char string[10];
scanf("%s", string);
If the input matches 9 or fewer characters, everything will be fine, because
a string of length 9 requires 10 spaces (never forget the terminating
'\0'). If the input matches more than 9 characters, you won't have enough
space in the buffer and a buffer overflow (accessing beyond the size) occurs.
This is undefined behaviour and anything can happen: your program might
crash, or it might not crash but overwrite another variable and thus
screw up the flow of your program. It could even kill a kitten somewhere; do
you really want to kill kittens?
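One way to keep %s inside the buffer is to give it a maximum field width. A small sketch follows; the helper name read_word9 is made up, and it uses fscanf on a FILE * only so that it can be tried on any stream (scanf("%9s", ...) behaves the same way on stdin).

```c
#include <stdio.h>

/* Hypothetical helper: read one whitespace-delimited word of at most 9
 * characters into out, which must have room for 10 chars.
 * Returns 1 on success, 0 if no word could be read. */
int read_word9(FILE *f, char out[10])
{
    /* The width 9 leaves one element for the terminating '\0'. */
    return fscanf(f, "%9s", out) == 1;
}
```

Note that a longer word is not discarded: the characters beyond the ninth stay in the stream and are returned by the next read.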
So, do you see why your code is wrong?
char inputChoice = 0;
scanf("%s", &inputChoice);
inputChoice is a char variable, it can only hold 1 value.
&inputChoice gives you the address of the inputChoice variable, but the
char after that is out of bounds; if you read/write it, you will have an
overflow, thus you kill a kitten. Even if you enter only 1 character, scanf will
write at least 2 bytes, and because inputChoice only has space for one character, a kitten will die.
So, let's talk about your code.
From the perspective of a user: why would I want to enter lines of text, possibly a lot of lines of text,
and then answer "No, I don't want to save the lines"? It doesn't make sense to
me.
In my opinion you should first ask the user whether he/she wants to save the
input, and only then ask for the input. If the user doesn't want to save
anything, then there is no point in asking the user to enter anything at
all. But that's just my opinion.
If you really want to stick to your plan, then you have to save every line and
when the user ends entering data, you ask and you save the file.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUFFERLEN 1024
void printFile () {
int i;
char openFile[BUFFERLEN];
FILE *file;
printf("What file do you wish to write in?\n");
scanf("%1023s", openFile); // field width = BUFFERLEN - 1, so a long name cannot overflow openFile
getchar();
file = fopen(openFile, "w");
if (file == NULL) {
printf("Could not open file.\n");
return;
}
// we save here all lines to be saved
char **lines = NULL;
int num_of_lines = 0;
char buffer[BUFFERLEN];
printf("Enter an empty line or -1 to end input\n");
// for simplicity, we assume that no line will be
// larger than BUFFERLEN - 1 chars
while(fgets(buffer, sizeof buffer, stdin))
{
// we should check if the last character is \n,
// if not, buffer was not large enough for the line
// or the stream closed. For simplicity, I will ignore
// these cases
int len = strlen(buffer);
if(buffer[len - 1] == '\n')
buffer[len - 1] = '\0';
if(strcmp(buffer, "") == 0 || strcmp(buffer, "-1") == 0)
break; // either an empty line or user entered "-1"
char *line = strdup(buffer);
if(line == NULL)
break; // if no more memory
// process all lines that already have been entered
char **tmp = realloc(lines, (num_of_lines+1) * sizeof *tmp);
if(tmp == NULL)
{
free(line);
break; // same reason as for strdup failing
}
lines = tmp;
lines[num_of_lines++] = line; // save the line and increase num_of_lines
}
char inputChoice = 0;
printf("Do you wish to save the Input? (Y/N)\n");
scanf("%c", &inputChoice);
getchar();
if (inputChoice == 'Y' || inputChoice == 'y') {
for(i = 0; i < num_of_lines; ++i)
fprintf(file, "%s\n", lines[i]); // writing every line
printf("Your file has been saved\n");
printf("Please press any key to continue");
getchar();
}
// closing FILE buffer
fclose(file);
// free memory
if(num_of_lines)
{
for(i = 0; i < num_of_lines; ++i)
free(lines[i]);
free(lines);
}
}
int main(void)
{
printFile();
return 0;
}
Remarks on the code
I used the same code as yours as the base for mine, so that you can spot the
differences much quicker.
I use the macro BUFFERLEN for declaring the length of the buffers. That's
my style.
Look at the fgets line:
fgets(buffer, sizeof buffer, stdin)
I use here sizeof buffer instead of 1024 or BUFFERLEN. Again, that's my
style, but I think doing this is better, because even if you change the size
of the buffer by changing the macro, or by using another explicit size, sizeof buffer
will always return the correct size. Be aware that this only works when
buffer is an array.
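To illustrate that last caveat, here is a tiny sketch; both function names are made up for the demonstration.

```c
#include <stddef.h>

#define BUFFERLEN 1024

/* Hypothetical demo: here buffer is only a pointer, so sizeof yields the
 * size of the pointer itself, not of whatever it points to. */
size_t size_seen_through_pointer(char *buffer)
{
    return sizeof buffer;          /* same as sizeof(char *) */
}

/* Hypothetical demo: here buffer is a true array, so sizeof yields the
 * full array size. */
size_t size_seen_by_owner(void)
{
    char buffer[BUFFERLEN];
    return sizeof buffer;          /* BUFFERLEN bytes */
}
```

So if you pass the buffer to another function, pass its size along as a separate argument instead of using sizeof there.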
The function strdup returns a pointer to a new string that
duplicates the argument. It's used to create a new copy of a string. When
using this function, don't forget that you have to free the memory using
free(). strdup was not part of the standard C library until C23; it conforms
to SVr4, 4.3BSD, POSIX.1-2001. If you use Windows (I don't use Windows,
I'm not familiar with the Windows ecosystem), this function might not be
present. In that case you can write your own:
char *strdup(const char *s)
{
char *str = malloc(strlen(s) + 1);
if(str == NULL)
return NULL;
strcpy(str, s);
return str;
}

Why fgets is not inputting first value?

I am writing a program to write my html files rapidly. And when I came to write the content of my page I got a problem.
#include<stdio.h>
int main()
{
int track;
int question_no;
printf("\nHow many questions?\t");
scanf("%d",&question_no);
char question[question_no][100];
for(track=1;track<=question_no;track++)
{
printf("\n<div class=\"question\">%d. ",track);
printf("\nQuestion number %d.\t",track);
fgets(question[track-1],sizeof(question[track-1]),stdin);
printf("\n\n\tQ%d. %s </div>",track,question[track-1]);
}
}
In this program I am writing some questions and their answers (in html file). When I test run this program I input the value of question_no to 3. But when I enter my first question it doesn't go in question[0] and consequently the first question doesn't output. The rest of the questions input without issue.
I searched some questions on stackoverflow and found that fgets() looks for last \0 character and that \0 stops it.
I also found that I should use buffer to input well through fgets() so I used: setvbuf and setbuf but that also didn't work (I may have coded that wrong). I also used fflush(stdin) after my first and last (as well) scanf statement to remove any \0 character from stdin but that also didn't work.
Is there any way to accept the first input by fgets()?
I am using stdin and stdout for now. I am not accessing, reading or writing any file.
Use fgets for the first prompt too. You should also malloc your array as you don't know how long it is going to be at compile time.
#include <stdlib.h>
#include <stdio.h>
#define BUFSIZE 8
int main()
{
int track, i;
int question_no;
char buffer[BUFSIZE], **question;
printf("\nHow many questions?\t");
fgets(buffer, BUFSIZE, stdin);
question_no = strtol(buffer, NULL, 10);
question = malloc(question_no * sizeof (char*));
if (question == NULL) {
return EXIT_FAILURE;
}
for (i = 0; i < question_no; ++i) {
question[i] = malloc(100 * sizeof (char));
if (question[i] == NULL) {
return EXIT_FAILURE;
}
}
for(track=1;track<=question_no;track++)
{
printf("\n<div class=\"question\">%d. ",track);
printf("\nQuestion number %d.\t",track);
fgets(question[track-1],100,stdin);
printf("\n\n\tQ%d. %s </div>",track,question[track-1]);
}
for (i = 0; i < question_no; ++i) free(question[i]);
free(question);
return EXIT_SUCCESS;
}
2D arrays in C
A 2D array of type can be represented by an array of pointers to type, or equivalently type** (pointer to pointer to type). This requires two steps.
Using char **question as an exemplar:
The first step is to allocate an array of char*. malloc returns a pointer to the start of the memory it has allocated, or NULL if it has failed. So check whether question is NULL.
Second is to make each of these char* point to their own array of char. So the for loop allocates an array the size of 100 chars to each element of question. Again, each of these mallocs could return NULL so you should check for that.
Every malloc deserves a free so you should perform the process in reverse when you have finished using the memory you have allocated.
strtol
long int strtol(const char *str, char **endptr, int base);
strtol returns a long int (which in the code above is implicitly converted to an int). It splits str into three parts:
Any white-space preceding the numerical content of the string
The part it recognises as numerical, which it will try to convert
The rest of the string
If endptr is not NULL, it will point to the 3rd part, so you know where strtol finished. You could use it like this:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char * endptr = NULL, *str = " 123some more stuff";
int number = strtol(str, &endptr, 10);
printf("number interpreted as %d\n"
"rest of string: %s\n", number, endptr);
return EXIT_SUCCESS;
}
output:
number interpreted as 123
rest of string: some more stuff
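If you also want to reject bad input, endptr and errno can be checked together. A sketch under my own assumptions: the name parse_int and the decision to accept a trailing newline (as left by fgets) are choices made for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert text to int, rejecting empty input,
 * trailing characters other than a newline, and out-of-range values.
 * Returns 1 and stores the value on success, 0 otherwise. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text)
        return 0;                  /* no digits were consumed */
    if (*end != '\0' && *end != '\n')
        return 0;                  /* unexpected characters after the number */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                  /* does not fit in an int */
    *out = (int)value;
    return 1;
}
```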
This is because of the newline character left in the input stream by the previous scanf(). Note that fgets() stops if it encounters a newline too.
fgets() reads in at most one less than size characters from stream and
stores them into the buffer pointed to by s. Reading stops after an
EOF or a newline. If a newline is read, it is stored into the
buffer
Don't mix fgets() and scanf(). A trivial solution is to use getchar() right after scanf() in order to consume the newline left in the input stream by scanf().
As per the documentation,
The fgets() function shall read bytes from stream into the array
pointed to by s, until n-1 bytes are read, or a <newline> is read and
transferred to s, or an end-of-file condition is encountered
In case of scanf("%d",&question_no); a newline is left in the buffer and that is read by
fgets(question[track-1],sizeof(question[track-1]),stdin);
and it exits.
In order to flush the buffer, you should do
int c; /* int, so that EOF can be detected */
while((c = getchar()) != '\n' && c != EOF)
/* discard */ ;
to clear the extra characters in the buffer.
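As a rough sketch of how the two calls can be combined safely (both helper names are hypothetical, and a FILE * parameter is used so the code can be exercised on any stream, not just stdin):

```c
#include <stdio.h>

/* Hypothetical helper: discard the rest of the current input line. */
void discard_line(FILE *f)
{
    int c;                         /* int, so EOF can be detected */
    while ((c = fgetc(f)) != '\n' && c != EOF)
        ;
}

/* Hypothetical helper: read a count with fscanf, then one full line with
 * fgets. Returns 1 on success, 0 if either read fails. */
int read_count_then_line(FILE *f, int *count, char *line, int size)
{
    if (fscanf(f, "%d", count) != 1)
        return 0;
    discard_line(f);               /* drop the newline fscanf left behind */
    return fgets(line, size, f) != NULL;
}
```

Without the discard_line call, the fgets would return immediately with just "\n", which is the behaviour described in the question.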

C-Troubleshoot with gets()

When I give the first input, an extra 0 appears before gets() works. But if I remove gets(), then there is no problem. printf() can't be used because it breaks on blank space. Please give any alternative solution or what should I do?
#include <cstdio>
#include <iostream>
#include <stdlib.h>
using namespace std;
int main()
{
long long a,i,t,count;
int op;
char s[10000];
scanf("%lld",&t);
for(i=1;i<=t;i++)
{
gets(s);
a=atoll(&s[7]);
printf("%lld",a);
}
return 0;
}
The scanf() leaves the end-of-line character of the first line in the input stream which is then consumed by gets(). This is a common beginner's error often discussed here.
Recommendations:
Do not mix scanf() routines with gets() routines.
Do not use gets() at all (use fgets() instead): gets() cannot limit how much it reads, so buffer overflows may occur, and it was removed from the C standard in C11.
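You can try adding the '\n' character when you are reading with scanf. Note that whitespace in a scanf format matches any amount of whitespace, so this also skips blank lines and leading spaces at the start of the next line: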
scanf("%lld\n",&t);
for(i=1;i<=t;i++)
{
gets(s);
a=atoll(&s[7]);
printf("%lld",a);
}
Why not:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
long long i;
long long t;
long long a; /* was used below without being declared */
char s[10000];
if (fgets(s, sizeof(s), stdin) == 0)
return 0;
t = atoll(s);
for (i = 1; i <= t; i++)
{
if (fgets(s, sizeof(s), stdin) == 0)
break;
a = atoll(&s[7]);
printf("%lld\n", a);
}
return 0;
}
Amongst other merits, it doesn't:
print stray zeroes,
include C++ code in a purportedly C program,
contain any stray (unused) variables,
use the dangerous gets() function.
It is fair to note a couple of deficiencies:
It would produce bogus output if a data line was not at least 8 characters long; it should check strlen(s) before calling atoll(&s[7])
We'll assume that 10K is longer than any single line it will be given to read so truncated lines won't be a problem, though JSON data sometimes seems to get encoded in files without any newlines and can be humongously long (Firefox bookmark lists or backups, for example, don't even contain a single newline).
I'm not sure what you're trying to do here, or what the problem is. But ...
As Greg Hewgill correctly said: NEVER use "gets()". It's a buffer overflow waiting to happen.
You CAN use "fgets()" - and it could easily solve the problem.
While you're at it, why the "scanf()", followed by "gets()", followed by "atoll()"? Can any of these inputs be merged? Or made more consistent?
Where are you checking for a valid conversion from "atoll()"? Why not just use "sscanf()" (and check the return value)?
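For instance, a sketch of the sscanf route, assuming (as the atoll(&s[7]) call in the question suggests) that the number starts at offset 7 of each line. The helper name parse_after_prefix is made up.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse the number that starts at offset 7 of line.
 * Returns 1 and stores the value on success, 0 if the line is too short
 * or holds no number at that position. */
int parse_after_prefix(const char *line, long long *out)
{
    if (strlen(line) <= 7)
        return 0;                  /* nothing after the 7-character prefix */
    return sscanf(line + 7, "%lld", out) == 1;
}
```

Unlike atoll(), this lets the caller tell a genuine 0 apart from a line that contained no number.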
