Dynamically allocate user inputted string - c

I am trying to write a function that does the following things:
Start an input loop, printing '> ' each iteration.
Take whatever the user enters (unknown length) and read it into a character array, dynamically allocating the size of the array if necessary. The user-entered line will end at a newline character.
Add a null byte, '\0', to the end of the character array.
Loop terminates when the user enters a blank line: '\n'
This is what I've currently written:
void input_loop(){
char *str = NULL;
printf("> ");
while(printf("> ") && scanf("%a[^\n]%*c",&input) == 1){
/*Add null byte to the end of str*/
/*Do stuff to input, including traversing until the null byte is reached*/
free(str);
str = NULL;
}
free(str);
str = NULL;
}
Now, I'm not too sure how to go about adding the null byte to the end of the string. I was thinking something like this:
last_index = strlen(str);
str[last_index] = '\0';
But I'm not too sure if that would work though. I can't test if it would work because I'm encountering this error when I try to compile my code:
warning: ISO C does not support the 'a' scanf flag [-Wformat=]
So what can I do to make my code work?
EDIT: changing scanf("%a[^\n]%*c",&input) == 1 to scanf("%as[^\n]%*c",&input) == 1 gives me the same error.

First of all, scanf format strings are not regular expressions, although %[^\n] is a valid scanset conversion that matches everything up to a newline. As for the warning you get: in ISO C99, %a is a conversion for floating-point numbers. Using a as an allocation modifier (as in %as or %a[) is an older GNU extension, which is why the compiler reports that ISO C does not support it.
But then you have a bigger problem. In standard C, scanf expects you to pass it a previously allocated buffer for it to fill in with the input it reads. It does not malloc the string for you, so initializing str to NULL and freeing it afterwards will not work with standard scanf.
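For completeness, POSIX.1-2008 systems do offer an allocating scanf through the m modifier, which replaces the old GNU a flag. The following is only a sketch under that assumption; read_line_m is a hypothetical helper name, and the code will not work on C libraries that lack the m modifier.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper. Returns a malloc'd line without its newline,
   or NULL on a blank line, end-of-file, or error.
   Assumes a C library that supports the POSIX 'm' modifier. */
char *read_line_m(FILE *f)
{
    char *s = NULL;

    /* 'm' asks fscanf to allocate the buffer; the scanset fails to match
       (returning 0) when the line is empty */
    int matched = fscanf(f, "%m[^\n]", &s);

    /* consume the newline that ended the line, if there is one */
    (void)fgetc(f);

    if (matched != 1) {
        return NULL;
    }
    return s;   /* the caller must free() this */
}
```

Since a blank line makes the conversion fail, the NULL return doubles as the "stop on an empty line" condition the question asks for.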
The simplest thing you can do is to give up on arbitrary-length strings. Create a large buffer and forbid inputs that are longer than that.
You can then use the fgets function to populate your buffer. To check if it managed to read the full line, check if your string ends with a "\n".
char str[256+1];
while(true){
printf("> ");
if(!fgets(str, sizeof str, stdin)){
//error or end of file
break;
}
size_t len = strlen(str);
if(len + 1 == sizeof str && str[len - 1] != '\n'){
//user typed something too long
exit(1);
}
printf("user typed %s", str);
}
Another alternative is you can use a nonstandard library function. For example, in Linux there is the getline function that reads a full line of input using malloc behind the scenes.

No error checking, don't forget to free the pointer when you're done with it. If you use this code to read enormous lines, you deserve all the pain it will bring you.
#include <stdio.h>
#include <stdlib.h>
char *readInfiniteString() {
int l = 256;
char *buf = malloc(l);
int p = 0;
int ch; /* int, not char, so EOF can be told apart from real input */
ch = getchar();
while(ch != '\n' && ch != EOF) {
buf[p++] = ch;
if (p == l) {
l += 256;
buf = realloc(buf, l);
}
ch = getchar();
}
buf[p] = '\0';
return buf;
}
int main(int argc, char *argv[]) {
printf("> ");
char *buf = readInfiniteString();
printf("%s\n", buf);
free(buf);
}

If you are on a POSIX system such as Linux, you should have access to getline. It can be made to behave like fgets, but if you start with a null pointer and a zero length, it will take care of memory allocation for you.
You can use it in a loop like this:
#include <stdlib.h>
#include <stdio.h>
#include <string.h> // for strcmp
int main(void)
{
char *line = NULL;
size_t nline = 0;
for (;;) {
ssize_t n; // getline returns ssize_t, not ptrdiff_t
printf("> ");
// read line, allocating as necessary
n = getline(&line, &nline, stdin);
if (n < 0) break;
// remove trailing newline
if (n && line[n - 1] == '\n') line[n - 1] = '\0';
// do stuff
printf("'%s'\n", line);
if (strcmp("quit", line) == 0) break;
}
free(line);
printf("\nBye\n");
return 0;
}
The passed pointer and the length value must be consistent, so that getline can reallocate memory as required. (That means that you shouldn't change nline or the pointer line in the loop.) If the line fits, the same buffer is used in each pass through the loop, so that you have to free the line string only once, when you're done reading.
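To connect this back to the original question (stop on a blank line rather than on "quit"), here is a minimal sketch. It assumes a POSIX system for getline; input_loop taking two FILE pointers is a hypothetical signature chosen so the loop is easy to test, not something from the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: prompts on out, reads lines from in until a blank
   line or end-of-file, and returns how many non-blank lines were seen. */
size_t input_loop(FILE *in, FILE *out)
{
    char *line = NULL;   /* getline allocates and grows this buffer */
    size_t cap = 0;
    size_t count = 0;

    for (;;) {
        fputs("> ", out);
        ssize_t n = getline(&line, &cap, in);
        if (n < 0) {
            break;                       /* end-of-file or read error */
        }
        if (n > 0 && line[n - 1] == '\n') {
            line[--n] = '\0';            /* strip the trailing newline */
        }
        if (n == 0) {
            break;                       /* blank line ends the loop */
        }
        /* "do stuff" with the null-terminated line here */
        fprintf(out, "read %ld bytes: %s\n", (long)n, line);
        count++;
    }
    free(line);                          /* one free for the whole loop */
    return count;
}
```

Calling input_loop(stdin, stdout) gives the interactive behaviour; the string is already null-terminated by getline, so no manual '\0' is needed beyond replacing the newline.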

Some have mentioned that scanf is probably unsuitable for this purpose. I wouldn't suggest using fgets, either. Though it is slightly more suitable, there are problems that seem difficult to avoid, at least at first. Few C programmers manage to use fgets right the first time without reading the fgets manual in full. The parts most people manage to neglect entirely are:
what happens when the line is too large, and
what happens when EOF or an error is encountered.
The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a <newline> is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.
Upon successful completion, fgets() shall return s. If the stream is at end-of-file, the end-of-file indicator for the stream shall be set and fgets() shall return a null pointer. If a read error occurs, the error indicator for the stream shall be set, fgets() shall return a null pointer...
I don't feel I need to stress the importance of checking the return value too much, so I won't mention it again. Suffice to say, if your program doesn't check the return value your program won't know when EOF or an error occurs; your program will probably be caught in an infinite loop.
When no '\n' is present in the buffer (and end-of-file has not been reached), the remaining bytes of the line have not been read yet. fgets always scans the line once internally; when you add extra logic to check for a '\n', you are scanning the data a second time.
This allows you to realloc the storage and call fgets again if you want to dynamically resize the storage, or discard the remainder of the line (warning the user of the truncation is a good idea), perhaps using something like fscanf(file, "%*[^\n]");.
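The "discard the remainder" option can be sketched as follows. discard_line is a hypothetical helper name; the only library calls are fscanf and fgetc.

```c
#include <stdio.h>

/* Hypothetical helper: skips everything up to and including the next newline. */
void discard_line(FILE *f)
{
    /* %*[^\n] consumes non-newline characters without storing them. It
       returns 0 whether or not it matched anything, and EOF only when
       the stream is already at end-of-file (or a read error occurs). */
    if (fscanf(f, "%*[^\n]") == EOF) {
        return;
    }
    /* the scanset stops in front of the newline, so consume it separately */
    (void)fgetc(f);
}
```

Call it only after fgets has returned a buffer without a '\n'; calling it after a complete line would throw away the next line.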
hugomg mentioned using multiplication in the dynamic resize code to avoid quadratic runtime problems. Along this line, it would be a good idea to avoid parsing the same data over and over each iteration (thus introducing further quadratic runtime problems). This can be achieved by storing the number of bytes you've read (and parsed) somewhere. For example:
char *get_dynamic_line(FILE *f) {
size_t bytes_read = 0;
char *bytes = NULL, *temp;
do {
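size_t alloc_size = bytes_read * 2 + 2; /* + 2 so the first pass has room for one byte plus the terminator; + 1 would never make progress */
temp = realloc(bytes, alloc_size);
if (temp == NULL) {
free(bytes);
return NULL;
}
bytes = temp;
bytes[bytes_read] = '\0'; /* keeps strcspn well defined if fgets reads nothing */
temp = fgets(bytes + bytes_read, alloc_size - bytes_read, f); /* Parsing data the first time */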
bytes_read += strcspn(bytes + bytes_read, "\n"); /* Parsing data the second time */
} while (temp && bytes[bytes_read] != '\n');
bytes[bytes_read] = '\0';
return bytes;
}
Those who do manage to read the manual and come up with something correct (like this) may soon realise the complexity of an fgets solution is at least twice as poor as the same solution using fgetc. We can avoid parsing data the second time by using fgetc, so using fgetc might seem most appropriate. Alas most C programmers also manage to use fgetc incorrectly when neglecting the fgetc manual.
The most important detail is to realise that fgetc returns an int, not a char. It may return typically one of 256 distinct values, between 0 and UCHAR_MAX (inclusive). It may otherwise return EOF, meaning there are typically 257 distinct values that fgetc (or consequently, getchar) may return. Trying to store those values into a char or unsigned char results in loss of information, specifically the error modes. (Of course, this typical value of 257 will change if CHAR_BIT is greater than 8, and consequently UCHAR_MAX is greater than 255)
char *get_dynamic_line(FILE *f) {
size_t bytes_read = 0;
char *bytes = NULL;
do {
if ((bytes_read & (bytes_read + 1)) == 0) {
void *temp = realloc(bytes, bytes_read * 2 + 1);
if (temp == NULL) {
free(bytes);
return NULL;
}
bytes = temp;
}
int c = fgetc(f);
bytes[bytes_read] = c >= 0 && c != '\n'
? c
: '\0';
} while (bytes[bytes_read++]);
return bytes;
}
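One limitation of the function above is that a blank line and end-of-file both come back as an empty string, so a caller cannot tell them apart. A sketch of a variant that returns NULL at end-of-file follows; read_line_or_null and its starting capacity of 16 are hypothetical choices, not part of the answer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant: returns a malloc'd line without its newline,
   or NULL at end-of-file (nothing read) or on allocation failure. */
char *read_line_or_null(FILE *f)
{
    size_t len = 0, cap = 0;
    char *buf = NULL;
    int c;   /* int, so EOF stays distinct from every byte value */

    while ((c = fgetc(f)) != EOF && c != '\n') {
        if (len + 1 >= cap) {            /* keep one spare byte for '\0' */
            size_t new_cap = cap ? cap * 2 : 16;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {
        free(buf);                       /* buf is still NULL here; free(NULL) is a no-op */
        return NULL;
    }
    if (buf == NULL) {                   /* blank line: allocate room for "" */
        buf = malloc(1);
        if (buf == NULL) {
            return NULL;
        }
    }
    buf[len] = '\0';
    return buf;
}
```

With this shape, an input loop can stop on either condition explicitly: NULL means no more input, while a returned "" means the user entered a blank line.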

Related

How do you prevent buffer overflow using fgets?

So far I have been using if statements to check the size of the user-inputted strings. However, they don't seem to be very useful: no matter the size of the input, the while loop ends and it returns the input to the main function, which then just outputs it.
I don't want the user to enter anything greater than 10, but when they do, the additional characters just overflow and are outputted on a newline. The whole point of these if statements is to stop that from happening, but I haven't been having much luck.
#include <stdio.h>
#include <string.h>
#define SIZE 10
char *readLine(char *buf, size_t sz) {
int true = 1;
while(true == 1) {
printf("> ");
fgets(buf, sz, stdin);
buf[strcspn(buf, "\n")] = 0;
if(strlen(buf) < 2 || strlen(buf) > sz) {
printf("Invalid string size\n");
continue;
}
if(strlen(buf) > 2 && strlen(buf) < sz) {
true = 0;
}
}
return buf;
}
int main(int argc, char **argv) {
char buffer[SIZE];
while(1) {
char *input = readLine(buffer, SIZE);
printf("%s\n", input);
}
}
Any help towards preventing buffer overflow would be much appreciated.
When the user enters a line that does not fit in the buffer, your program processes the first sz - 1 characters, but when it gets back to the fgets call, stdin still holds the rest of the characters from the user's first input. Your program then grabs up to another sz - 1 characters to process, and so on.
The call to strcspn is also deceiving, because if the "\n" is not among the characters you grabbed then it just returns the length of the string (at most sz - 1), even though there's no newline.
After you've taken input from stdin, you can do a check to see if the last character is a '\n' character. If it's not, it means that the input goes past your allowed size and the rest of stdin needs to be flushed. One way to do that is below. To be clear, you'd do this only when there's been more characters than allowed entered in, or it could cause an infinite loop.
while((c = getchar()) != '\n' && c != EOF)
{}
However, trying not to restructure your code too much how it is, we'll need to know if your buffer contains the newline before you set it to 0. It will be at the end if it exists, so you can use the following to check.
int containsNewline = buf[strlen(buf)-1] == '\n';
Also be careful with your size checks, you currently don't handle the case for a strlen of 2 or sz. I would also never use identifier names like "true", which would be a possible value for a bool variable. It makes things very confusing.
If the line is longer than 9 characters, your fgets() reads only the first 9 (sz - 1) characters into buf and appends the terminating null byte. Because these characters don't contain the trailing \n, strcspn(buf, "\n") returns 9, so buf[9] = 0 only rewrites the terminator that is already there. It stays within buf[] (max index is 9), but it tells you nothing about whether the line was truncated.
Additionally, never use true or false as the name of a variable; it makes the code confusing. Use something like 'ok' instead.
Finally, please clarify what output is expected when the input contains a line longer than 10 characters. Should it be truncated?
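Putting the advice above together, a bounded read that detects and discards an over-long line might look like the sketch below. read_bounded and its return codes are hypothetical; the logic is just fgets, a check for the stored newline, and the getchar-style drain loop shown earlier.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. Returns 1 for a complete line (newline stripped),
   0 if the line was too long (the excess is discarded),
   and -1 on end-of-file or read error. */
int read_bounded(char *buf, size_t size, FILE *f)
{
    if (size == 0 || fgets(buf, (int)size, f) == NULL) {
        return -1;
    }

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';     /* the whole line fit in the buffer */
        return 1;
    }

    /* No newline was stored: drain the rest of the line so the next
       call starts on a fresh line, and note whether anything was lost. */
    int c;
    int truncated = 0;
    while ((c = fgetc(f)) != '\n' && c != EOF) {
        truncated = 1;
    }
    return truncated ? 0 : 1;
}
```

In readLine, a return of 0 would be the place to print "Invalid string size" and prompt again, and -1 the place to stop looping altogether.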

C: removing new line/null terminate input string

In C I'm using this method from a serial library:
int serialport_read_until(int fd, char* buf, char until, int buf_max, int timeout)
{
char b[1]; // read expects an array, so we give it a 1-byte array
int i=0;
do {
int n = read(fd, b, 1); // read a char at a time
if( n==-1) return -1; // couldn't read
if( n==0 ) {
usleep( 1 * 1000 ); // wait 1 msec try again
timeout--;
if( timeout==0 ) return -2;
continue;
}
#ifdef SERIALPORTDEBUG
printf("serialport_read_until: i=%d, n=%d b='%c'\n",i,n,b[0]); // debug
#endif
buf[i] = b[0];
i++;
} while( b[0] != until && i < buf_max && timeout>0 );
buf[i] = 0; // null terminate the string
return 0;
}
The string that it is going to read is like this:
"111\r\n" (with a carriage + new line behind)
It is being printed out in Arduino using
serial.print("1");
serial.print("1");
serial.println("1");
Using the serialport_read_until method (char until is '\r\n'), I want to ensure that I am reading the entire buffer correctly.
Which of the following below does the char* buf look like in the end?
1) 111\r\n
2) 111\r\n\0
3) 111\0
4) 111
I need to figure out this part before I use sscanf method to convert the string into an integer correctly, but I'm not sure which to use:
sscanf(buf, "%d\r\n", &num); OR sscanf(buf, "%d", &num);
In addition, should I change the 2nd last line: buf[i] = 0; to buf[i-1] = 0; ?
It looks to me like you should expect 111\r\n\0 (assuming until is '\n', since a single char cannot hold both '\r' and '\n'). Note that the condition b[0] != until is checked after incrementing i, so when the newline character is received and the loop exits, i points to the next byte after \n. Then buf[i] = 0 stores a null byte there.
Note that this code appears to have a bug: if the until character is never received, the loop will run until i == buf_max and then store one byte more with the null terminator. So a total of buf_max+1 bytes are stored, meaning the following code would have a buffer overflow:
char mybuf[123];
serialport_read_until(fd, mybuf, 'x', 123, 42);
Unless the documentation says that buf_max should be one less than the size of the buffer, which would be counterintuitive and error-prone, the loop termination condition should probably say i+1 < buf_max or something similar.
Also, since i is checked at the end, even with this fix, the code will still store one byte if you pass in buf_max == 0 (but without the fix it will store two bytes). So that's another bug.
The char b[1]; declaration and accompanying comment is a little weird too. It would be more idiomatic to simply declare char b; and then just pass &b to read().
So if this is your code, there's more work to do on it. If it's someone else's code, I'd be very careful using this library, if this function is any indication of the quality.
Doing buf[i-1]=0 at the end would avoid the overflow, but would also mean that if the until character is not received, the last byte received would be lost. It would also break if you ever call the function with buf_max == 0. So that's not what you want.
If you're using sscanf, the question of whether or not there is trailing whitespace is irrelevant; sscanf("%d") will just ignore it. You should have a careful read through your library's documentation of sscanf. In particular, the way it handles whitespace is not always intuitive.
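As an illustration of the bounds fix discussed above, here is a sketch of a read-until loop that never writes more than buf_max bytes, terminator included. read_until is a hypothetical simplification: it drops the timeout handling of the library function and simply stops at end of input, and it assumes a POSIX system for read().

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical simplification of serialport_read_until (no timeout).
   Returns the number of bytes stored, excluding the terminator,
   or -1 on a read error or a zero-sized buffer. */
long read_until(int fd, char *buf, char until, size_t buf_max)
{
    size_t i = 0;

    if (buf_max == 0) {
        return -1;                  /* no room even for the terminator */
    }
    while (i + 1 < buf_max) {       /* always leave one byte for '\0' */
        char b;
        ssize_t n = read(fd, &b, 1);
        if (n < 0) {
            return -1;              /* couldn't read */
        }
        if (n == 0) {
            break;                  /* nothing more to read */
        }
        buf[i++] = b;
        if (b == until) {
            break;
        }
    }
    buf[i] = '\0';
    return (long)i;
}
```

With "111\r\n" on the wire and until set to '\n', buf ends up as 111\r\n\0, and sscanf(buf, "%d", &num) still yields 111 because %d stops at the first non-digit.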

Why fgets is not inputting first value?

I am writing a program to write my html files rapidly. And when I came to write the content of my page I got a problem.
#include<stdio.h>
int main()
{
int track;
int question_no;
printf("\nHow many questions?\t");
scanf("%d",&question_no);
char question[question_no][100];
for(track=1;track<=question_no;track++)
{
printf("\n<div class=\"question\">%d. ",track);
printf("\nQuestion number %d.\t",track);
fgets(question[track-1],sizeof(question[track-1]),stdin);
printf("\n\n\tQ%d. %s </div>",track,question[track-1]);
}
}
In this program I am writing some questions and their answers (in html file). When I test run this program I input the value of question_no to 3. But when I enter my first question it doesn't go in question[0] and consequently the first question doesn't output. The rest of the questions input without issue.
I searched some questions on stackoverflow and found that fgets() looks for last \0 character and that \0 stops it.
I also found that I should use buffer to input well through fgets() so I used: setvbuf and setbuf but that also didn't work (I may have coded that wrong). I also used fflush(stdin) after my first and last (as well) scanf statement to remove any \0 character from stdin but that also didn't work.
Is there any way to accept the first input by fgets()?
I am using stdin and stdout for now. I am not accessing, reading or writing any file.
Use fgets for the first prompt too. You should also malloc your array as you don't know how long it is going to be at compile time.
#include <stdlib.h>
#include <stdio.h>
#define BUFSIZE 8
int main()
{
int track, i;
int question_no;
char buffer[BUFSIZE], **question;
printf("\nHow many questions?\t");
fgets(buffer, BUFSIZE, stdin);
question_no = strtol(buffer, NULL, 10);
question = malloc(question_no * sizeof (char*));
if (question == NULL) {
return EXIT_FAILURE;
}
for (i = 0; i < question_no; ++i) {
question[i] = malloc(100 * sizeof (char));
if (question[i] == NULL) {
return EXIT_FAILURE;
}
}
for(track=1;track<=question_no;track++)
{
printf("\n<div class=\"question\">%d. ",track);
printf("\nQuestion number %d.\t",track);
fgets(question[track-1],100,stdin);
printf("\n\n\tQ%d. %s </div>",track,question[track-1]);
}
for (i = 0; i < question_no; ++i) free(question[i]);
free(question);
return EXIT_SUCCESS;
}
2D arrays in C
A 2D array of type can be represented by an array of pointers to type, or equivalently type** (pointer to pointer to type). This requires two steps.
Using char **question as an exemplar:
The first step is to allocate an array of char*. malloc returns a pointer to the start of the memory it has allocated, or NULL if it has failed. So check whether question is NULL.
Second is to make each of these char* point to their own array of char. So the for loop allocates an array the size of 100 chars to each element of question. Again, each of these mallocs could return NULL so you should check for that.
Every malloc deserves a free so you should perform the process in reverse when you have finished using the memory you have allocated.
malloc reference
strtol
long int strtol(const char *str, char **endptr, int base);
strtol returns a long int (which in the code above is implicitly converted to an int). It splits str into three parts:
Any white-space preceding the numerical content of the string
The part it recognises as numerical, which it will try to convert
The rest of the string
If endptr is not NULL, it will point to the 3rd part, so you know where strtol finished. You could use it like this:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char * endptr = NULL, *str = " 123some more stuff";
int number = strtol(str, &endptr, 10);
printf("number interpreted as %d\n"
"rest of string: %s\n", number, endptr);
return EXIT_SUCCESS;
}
output:
number interpreted as 123
rest of string: some more stuff
strtol reference
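The endptr mechanism also makes it possible to reject bad input, which the code above does not do. A minimal sketch, where parse_long is a hypothetical helper name and the policy of allowing only trailing whitespace is an assumption:

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out if s holds
   exactly one in-range integer (trailing whitespace allowed), else 0. */
int parse_long(const char *s, long *out)
{
    char *end;

    errno = 0;
    long value = strtol(s, &end, 10);

    if (end == s) {
        return 0;                        /* no digits were converted */
    }
    if (errno == ERANGE) {
        return 0;                        /* value does not fit in a long */
    }
    while (isspace((unsigned char)*end)) {
        end++;                           /* skip the newline left by fgets */
    }
    if (*end != '\0') {
        return 0;                        /* trailing junk such as "12x" */
    }
    *out = value;
    return 1;
}
```

In the program above, this could replace the bare strtol call so that an empty or non-numeric answer to "How many questions?" is caught before malloc is called.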
This is because of the newline character left in the input stream by the earlier scanf(). Note that fgets() stops when it encounters a newline too.
fgets() reads in at most one less than size characters from stream and
stores them into the buffer pointed to by s. Reading stops after an
EOF or a newline. If a newline is read, it is stored into the
buffer
Don't mix fgets() and scanf(). A trivial solution is to use getchar() right after scanf() in order to consume the newline left in the input stream by scanf().
As per the documentation,
The fgets() function shall read bytes from stream into the array
pointed to by s, until n-1 bytes are read, or a <newline> is read and
transferred to s, or an end-of-file condition is encountered
In case of scanf("%d",&question_no); a newline is left in the buffer and that is read by
fgets(question[track-1],sizeof(question[track-1]),stdin);
and it exits.
In order to flush the buffer you should do,
while((c = getchar()) != '\n' && c != EOF)
/* discard */ ;
to clear the extra characters in the buffer

Reading from stdin and storing \n and whitespace

I've been trying to use scanf to get input from stdin but it truncates the string after seeing whitespace or after hitting return.
What I'm trying to get is a way to read keyboard input that stores in the buffer linebreaks as well as whitespace. And ending when ctrl-D is pressed.
Should I try using fgets? I figured that wouldn't be optimal either since fgets returns after reading in a \n
There is no ready-made function to read everything from stdin, but creating your own is fortunately easy. Untested code snippet, with some explanation in comments, which can read an arbitrarily large number of chars from stdin:
size_t size = 0; // how many chars have actually been read
size_t reserved = 10; // how much space is allocated
char *buf = malloc(reserved);
int ch;
if (buf == NULL) exit(1); // out of memory
// read one char at a time from stdin, until EOF.
// let stdio to handle input buffering
while ( (ch = getchar()) != EOF) {
buf[size] = (char)ch;
++size;
// make buffer larger if needed, must have room for '\0' below!
// size is always doubled,
// so reallocation is going to happen limited number of times
if (size == reserved) {
reserved *= 2;
buf = realloc(buf, reserved);
if (buf == NULL) exit(1); // out of memory
}
}
// add terminating NUL character to end the string,
// maybe useless with binary data but won't hurt there either
buf[size] = 0;
// now buf contains size chars, everything from stdin until eof,
// optionally shrink the allocation to contain just input and '\0'
buf = realloc(buf, size+1);
scanf() splits the input at whitespace boundaries, so it's not suitable in your case. Indeed fgets() is the better choice. What you need to do is keep reading after fgets() returns; each call will read a line of input. You can keep reading until fgets() returns NULL, which means that nothing more can be read.
You can also use fgetc() instead if you prefer getting input character by character. It will return EOF when nothing more can be read.
If you want to read all input, regardless of whether it is whitespace or not, try fread.
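A sketch of what an fread-based version could look like follows. read_all, its 4096-byte starting size, and the doubling policy are all hypothetical choices; only fread, malloc, realloc and free are used.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from f until end-of-file into a
   malloc'd, null-terminated buffer. Stores the byte count in *out_len
   when out_len is not NULL. Returns NULL on allocation failure. */
char *read_all(FILE *f, size_t *out_len)
{
    size_t len = 0, cap = 4096;
    char *buf = malloc(cap);

    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        /* leave one byte free for the terminator */
        size_t n = fread(buf + len, 1, cap - len - 1, f);
        len += n;
        if (n == 0) {
            break;                       /* end-of-file or read error */
        }
        if (len + 1 == cap) {            /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    buf[len] = '\0';
    if (out_len != NULL) {
        *out_len = len;
    }
    return buf;
}
```

Calling read_all(stdin, &n) keeps every space, tab and newline exactly as typed, and returns once the user presses ctrl-D on an empty line.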
Read like this
int ch; //getchar() returns an int so that EOF stays distinguishable
char line[20];
int i=0; //define a counter
//check the counter first to avoid overflow (and to avoid consuming a
//character that cannot be stored), then read a character into ch and
//check whether it is end of file.
while(i < 19 && (ch=getchar())!=EOF)
{
line[i]=ch;
i++;
}
line[i]='\0';

C - scanf() vs gets() vs fgets()

I've been doing a fairly easy program of converting a string of Characters (assuming numbers are entered) to an Integer.
After I was done, I noticed some very peculiar "bugs" that I can't answer, mostly because of my limited knowledge of how the scanf(), gets() and fgets() functions work. (I did read a lot of literature though.)
So without writing too much text, here's the code of the program:
#include <stdio.h>
#define MAX 100
int CharToInt(const char *);
int main()
{
char str[MAX];
printf(" Enter some numbers (no spaces): ");
gets(str);
// fgets(str, sizeof(str), stdin);
// scanf("%s", str);
printf(" Entered number is: %d\n", CharToInt(str));
return 0;
}
int CharToInt(const char *s)
{
int i, result, temp;
result = 0;
i = 0;
while(*(s+i) != '\0')
{
temp = *(s+i) & 15;
result = (temp + result) * 10;
i++;
}
return result / 10;
}
So here's the problem I've been having. First, when using gets() function, the program works perfectly.
Second, when using fgets(), the result is slightly wrong because apparently fgets() function reads newline (ASCII value 10) character last which screws up the result.
Third, when using scanf() function, the result is completely wrong because first character apparently has a -52 ASCII value. For this, I have no explanation.
Now I know that gets() is discouraged to use, so I would like to know if I can use fgets() here so it doesn't read (or ignores) newline character.
Also, what's the deal with the scanf() function in this program?
Never use gets. It offers no protections against a buffer overflow vulnerability (that is, you cannot tell it how big the buffer you pass to it is, so it cannot prevent a user from entering a line larger than the buffer and clobbering memory).
Avoid using scanf. If not used carefully, it can have the same buffer overflow problems as gets. Even ignoring that, it has other problems that make it hard to use correctly.
Generally you should use fgets instead, although it's sometimes inconvenient (you have to strip the newline, you must determine a buffer size ahead of time, and then you must figure out what to do with lines that are too long–do you keep the part you read and discard the excess, discard the whole thing, dynamically grow the buffer and try again, etc.). There are some non-standard functions available that do this dynamic allocation for you (e.g. getline on POSIX systems, Chuck Falconer's public domain ggets function). Note that ggets has gets-like semantics in that it strips a trailing newline for you.
Yes, you want to avoid gets. fgets will always read the new-line if the buffer was big enough to hold it (which lets you know when the buffer was too small and there's more of the line waiting to be read). If you want something like fgets that won't read the new-line (losing that indication of a too-small buffer) you can use fscanf with a scan-set conversion like: "%N[^\n]", where the 'N' is replaced by the buffer size - 1.
One easy (if strange) way to remove the trailing new-line from a buffer after reading with fgets is: strtok(buffer, "\n"); This isn't how strtok is intended to be used, but I've used it this way more often than in the intended fashion (which I generally avoid). Be aware that if the line consists of nothing but the new-line, strtok finds no token and leaves the buffer unchanged.
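An alternative that also handles a line containing only the new-line is strcspn, which returns the index of the first '\n' or, if there is none, the index of the terminator. A minimal sketch, with chomp as a hypothetical helper name:

```c
#include <string.h>

/* Hypothetical helper: cuts the string at its first newline, if any. */
void chomp(char *s)
{
    /* strcspn gives the length of the prefix that contains no '\n'; when
       there is no newline this is strlen(s), so the assignment only
       rewrites the existing terminator */
    s[strcspn(s, "\n")] = '\0';
}
```

As with strtok, this discards the information about whether the buffer was too small, so check for the new-line first if you need to detect truncated lines.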
There are numerous problems with this code. We'll fix the badly named variables and functions and investigate the problems:
First, CharToInt() should be renamed to the proper StringToInt() since it operates on a string, not a single character.
The function CharToInt() [sic.] is unsafe. It doesn't check if the user accidentally passes in a NULL pointer.
It doesn't validate input, or more correctly, skip invalid input. If the user enters in a non-digit the result will contain a bogus value. i.e. If you enter in N the code *(s+i) & 15 will produce 14 !?
Next, the nondescript temp in CharToInt() [sic.] should be called digit since that is what it really is.
Also, the kludge return result / 10; is just that -- a bad hack to work around a buggy implementation.
Likewise MAX is badly named since it may appear to conflict with the common macro of the same name, i.e. #define MAX(x,y) (((x)>(y))?(x):(y))
The verbose *(s+i) is not as readable as simply *s. There is no need to use and clutter up the code with yet another temporary index i.
gets()
This is bad because it can overflow the input string buffer. For example, if the buffer size is 2, and you enter in 16 characters, you will overflow str.
scanf()
This is equally bad because it can overflow the input string buffer.
You mention "when using scanf() function, the result is completely wrong because first character apparently has a -52 ASCII value."
That is due to an incorrect usage of scanf(). I was not able to duplicate this bug.
fgets()
This is safe because you can guarantee you never overflow the input string buffer by passing in the buffer size (which includes room for the null terminator).
getline()
A few people have suggested the C POSIX standard getline() as a replacement. Unfortunately this is not a practical portable solution as Microsoft does not implement a C version; only the standard C++ string template function as this SO #27755191 question answers. Microsoft's C++ getline() was available at least far back as Visual Studio 6 but since the OP is strictly asking about C and not C++ this isn't an option.
Misc.
Lastly, this implementation is buggy in that it doesn't detect integer overflow. If the user enters too large a number the number may become negative! i.e. 9876543210 will become -18815698?! Let's fix that too.
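For an unsigned int, the check used here is simple: if the current partial number is less than the previous partial number, the value has wrapped around, so we return the previous partial number. (This comparison does not catch every wraparound of a multiplication by 10; comparing against UINT_MAX / 10 before multiplying is more robust.)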
For a signed int this is a little more work. In assembly we could inspect the carry-flag, but in C there is no standard built-in way to detect overflow with signed int math. Fortunately, since we are multiplying by a constant, * 10, we can easily detect this if we use an equivalent equation:
n = x*10 = x*8 + x*2
If x*8 overflows then logically x*10 will as well. For a 32-bit int overflow will happen when x*8 = 0x100000000 thus all we need to do is detect when x >= 0x20000000. Since we don't want to assume how many bits an int has we only need to test if the top 3 msb's (Most Significant Bits) are set.
Additionally, a second overflow test is needed. If the msb is set (sign bit) after the digit concatenation then we also know the number overflowed.
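A different approach, not the one used in the code below, is to test before multiplying, so that the overflow never actually happens. That matters because signed overflow is undefined behavior in C, and because after-the-fact comparisons can miss cases where the wrapped value is still larger than the previous one (on a 32-bit int, 500000000 * 10 is one such case). A sketch, with string_to_int_checked as a hypothetical name:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: converts the leading decimal digits of s.
   Returns 1 and stores the value in *out on success; returns 0 if s is
   NULL, does not start with a digit, or the value would exceed INT_MAX. */
int string_to_int_checked(const char *s, int *out)
{
    int result = 0;

    if (s == NULL || *s < '0' || *s > '9') {
        return 0;
    }
    for (; *s >= '0' && *s <= '9'; s++) {
        int digit = *s - '0';
        /* result * 10 + digit <= INT_MAX  is equivalent to
           result <= (INT_MAX - digit) / 10, which cannot overflow */
        if (result > (INT_MAX - digit) / 10) {
            return 0;
        }
        result = result * 10 + digit;
    }
    *out = result;
    return 1;
}
```

The same pattern works for unsigned int with UINT_MAX in place of INT_MAX. Reporting failure through the return value, rather than returning the last partial number, is a design choice of this sketch.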
Code
Here is a fixed safe version along with code that you can play with to detect overflow in the unsafe versions. I've also included both a signed and unsigned versions via #define SIGNED 1
#include <stdio.h>
#include <ctype.h> // isdigit()
// 1 fgets
// 2 gets
// 3 scanf
#define INPUT 1
#define SIGNED 1
// re-implementation of atoi()
// Test Case: 2147483647 -- valid 32-bit
// Test Case: 2147483648 -- overflow 32-bit
int StringToInt( const char * s )
{
int result = 0, prev, msb = (sizeof(int)*8)-1, overflow;
if( !s )
return result;
while( *s )
{
if( isdigit( *s ) ) // Alt.: if ((*s >= '0') && (*s <= '9'))
{
prev = result;
overflow = result >> (msb-2); // test if top 3 MSBs will overflow on x*8
result *= 10;
result += *s++ & 0xF;// OPTIMIZATION: *s - '0'
if( (result < prev) || overflow ) // check if would overflow
return prev;
}
else
break; // you decide SKIP or BREAK on invalid digits
}
return result;
}
// Test case: 4294967295 -- valid 32-bit
// Test case: 4294967296 -- overflow 32-bit
unsigned int StringToUnsignedInt( const char * s )
{
unsigned int result = 0, prev;
if( !s )
return result;
while( *s )
{
if( isdigit( *s ) ) // Alt.: if (*s >= '0' && *s <= '9')
{
prev = result;
result *= 10;
result += *s++ & 0xF; // OPTIMIZATION: += (*s - '0')
if( result < prev ) // check if would overflow
return prev;
}
else
break; // you decide SKIP or BREAK on invalid digits
}
return result;
}
int main()
{
int detect_buffer_overrun = 0;
#define BUFFER_SIZE 2 // set to small size to easily test overflow
char str[ BUFFER_SIZE+1 ]; // C idiom is to reserve space for the NULL terminator
printf(" Enter some numbers (no spaces): ");
#if INPUT == 1
fgets(str, sizeof(str), stdin);
#elif INPUT == 2
gets(str); // can overflow
#elif INPUT == 3
scanf("%s", str); // can also overflow
#endif
#if SIGNED
printf(" Entered number is: %d\n", StringToInt(str));
#else
printf(" Entered number is: %u\n", StringToUnsignedInt(str) );
#endif
if( detect_buffer_overrun )
printf( "Input buffer overflow!\n" );
return 0;
}
You're correct that you should never use gets. If you want to use fgets, you can simply overwrite the newline.
char *result = fgets(str, sizeof(str), stdin);
size_t len = (result != NULL) ? strlen(str) : 0;
if(len > 0 && str[len - 1] == '\n')
{
str[len - 1] = '\0';
}
else
{
// handle error
}
This does assume there are no embedded null bytes. Another option is POSIX getline:
char *line = NULL;
size_t len = 0;
ssize_t count = getline(&line, &len, stdin);
if(count >= 1 && line[count - 1] == '\n')
{
line[count - 1] = '\0';
}
else
{
// Handle error
}
The advantage to getline is it does allocation and reallocation for you, it handles possible embedded null bytes, and it returns the count so you don't have to waste time with strlen. Note that you can't use an array with getline. The pointer must be NULL or free-able.
I'm not sure what issue you're having with scanf.
Never use gets(); it can lead to unpredictable overflows. If your string array is of size 1000 and I enter 1001 characters, I can overflow your buffer.
Try using fgets() with this modified version of your CharToInt():
int CharToInt(const char *s)
{
int i, result, temp;
result = 0;
i = 0;
while(*(s+i) != '\0')
{
if (isdigit(*(s+i)))
{
temp = *(s+i) & 15;
result = (temp + result) * 10;
}
i++;
}
return result / 10;
}
It essentially validates the input digits and ignores anything else. This is very crude so modify it and salt to taste.
So I am not much of a programmer, but let me try to answer your question about scanf(). I think scanf is fine for simple input, but the call needs a little care. It should be:
char str[MAX];
printf("Enter some text: ");
scanf("%99s", str); /* MAX is 100: the width leaves room for the terminating '\0' */
No "&" is needed in front of str here: an array name already converts to a pointer to its first element, which tells scanf where to save the scanned value. (For scalar variables such as an int, the "&" is required.)
Avoid fflush(stdin); the C standard does not define its behavior. To discard the rest of the line, read and drop characters up to the newline instead.
And the difference between scanf("%s") and gets/fgets is that scanf("%s") only scans until the first whitespace character, while gets() and fgets() read the whole line (fgets() only up to the buffer size you pass it, which is what protects you from an overflow).

Resources