scanf reads more chars than the destination var can hold - c

The following code reads up to 10 chars from stdin and outputs them.
When I input more than 10 chars, I expect it to crash because msg does not have enough room, but it does NOT! How could that be?
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
char* msg = malloc(sizeof(char)*10);
if (NULL==msg)
exit(EXIT_FAILURE);
scanf("%s", msg);
printf("You said: %s\n", msg);
if (strlen(msg)<10)
free(msg);
return EXIT_SUCCESS;
}

Use fgets instead, scanf is not buffer safe. What you are seeing is Undefined Behavior.
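A minimal sketch of the fgets approach, assuming a hypothetical helper named read_line (not a standard function) and a FILE * parameter so that stdin or any other stream can be passed. fgets never stores more than size-1 characters plus the terminating '\0', so a 10-byte buffer cannot be overrun:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size-1 characters of one line from
 * `in` into `buf`. Returns 1 on success, 0 on EOF or read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';  /* drop the trailing newline, if any */
    return 1;
}
```

With the question's buffer this would be called as read_line(stdin, msg, 10). Any characters that did not fit stay in the stream and are returned by the next read.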

You may allocate a "safe", generously sized buffer when using scanf(). For user input, about two lines' worth (roughly 2x80 chars) should do; for files, somewhat more.
Conclusion: scanf() is quick-and-dirty stuff; don't use it in serious projects.

You can specify max size in scanf() format string
scanf("%9s", msg);
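To illustrate what the width does, here is a small sketch built around a hypothetical helper, read_word9, that uses fscanf so the stream can be chosen by the caller. The width counts characters only, so a 10-byte buffer takes "%9s" (9 characters plus the '\0'):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited word of at most
 * 9 characters from `in` into a 10-byte buffer (9 characters + '\0').
 * Returns 1 if a word was stored, 0 otherwise. */
int read_word9(FILE *in, char buf[10])
{
    assert(in != NULL && buf != NULL);
    return fscanf(in, "%9s", buf) == 1;
}
```

If the word is longer than 9 characters, the rest is not discarded. It stays in the stream and is picked up by the next read, which is worth keeping in mind for interactive input.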

I would imagine that malloc() allocates blocks of memory aligned to word boundaries. On a 32-bit machine, that means whatever you ask for will be rounded up to the nearest multiple of 4. For a 10-byte request that gives 12 bytes, so you might get away with a string of up to 11 characters (plus a '\0' terminator) without suffering any problems.
But don't ever assume this to be the case. Like everyone else is saying, you should always specify a safe maximum length in your format string if you want to avoid problems.
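The rounding arithmetic described in this answer can be sketched as follows. This only illustrates the answer's assumption (round up to a multiple of the word size). round_up is a hypothetical name, and real allocators may use a larger granularity and make no promise about usable extra space:

```c
#include <assert.h>
#include <stddef.h>

/* Rounds `size` up to the next multiple of `align` (align must be > 0). */
size_t round_up(size_t size, size_t align)
{
    assert(align > 0);
    return (size + align - 1) / align * align;
}
```

Under that assumption, round_up(10, 4) is 12, which is where the "11 characters plus terminator" figure comes from.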

It does not crash because C is very lenient, contrary to popular belief. The program is not required to crash or even complain if a buffer overflows. Say you define
union {
    uint8_t a[3];
    uint32_t b;
};
then a[3] is perfectly fine memory, because the union is 4 bytes wide, and there is no reason to crash (but don't ever do this). Even a[5] or a[100] may happen to be accessible memory.
On the other hand I may try to access a[-1] which happens to be memory the OS does not allow you to access, causing a segfault.
As to what you should do to fix this: as others have pointed out, scanf is not safe to use with buffers. Use one of their suggestions.

Related

Malloc array of characters. String

I understand that allocating memory for a string requires n+1 bytes because of the terminating null character. However, the question is what if you allocate 10 chars but enter an 11 char string?
#include <stdio.h>
#include <stdlib.h>
int main(){
int n;
char *str;
printf("How long is your string? ");
scanf("%d", &n);
str = malloc(n+1);
if (str == NULL) printf("Uh oh.\n");
scanf("%s", str);
printf("Your string is: %s\n", str);
}
I tried running the program but the result is still the same as n+1.
If you allocated a char* of 10 characters but wrote 11 characters to it, you're writing to memory you haven't allocated. This has undefined behavior - it may happen to work, it may crash with a segmentation fault, and it may do something completely different. In short - don't rely on it.
If you overrun an area of memory given you by malloc, you corrupt the RAM heap. If you're lucky your program will crash right away, or when you free the memory, or when your program uses the chunk of memory right after the area you overran. When your program crashes you'll notice the bug and have a chance to fix it.
If you're unlucky your code goes into production, and some cybercriminal figures out how to exploit your overrun memory to trick your program into running some malicious code or using some malicious data they fed you. If you're really unlucky, you get featured in Krebs On Security or some other information security news outlet.
Don't do this. If you're not confident of your ability to avoid doing it, don't use C. Instead use a language with a native string data type. Seriously.
"what if you allocate 10 chars but enter an 11 char string?"
scanf("%s", str); then has undefined behavior (UB). Anything may happen, including the program appearing to work, as in "I tried running the program but the result is still the same as n+1."
Instead always use a width with scanf() and "%s" to stop reading once str[] is full. Example:
char str[10+1];
scanf("%10s", str);
Since n is variable here, consider instead using fgets() to read a line of input.
Note that fgets() also reads and saves a trailing '\n'.
Better to use fgets() for user input and drop scanf() call altogether until you understand why scanf() is bad.
str = malloc(n+1);
if (str == NULL) {
    printf("Uh oh.\n");
    return 1; // Do not go on to use a NULL pointer
}
if (fgets(str, n+1, stdin)) {
    str[strcspn(str, "\n")] = 0; // Lop off potential trailing \n
}
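One practical wrinkle when mixing scanf("%d") with fgets: the newline typed after the number is still in the input stream, so the following fgets call would read an empty line. A sketch of one way to handle that, using a hypothetical helper read_sized_string (the name and the FILE * parameter are assumptions for illustration):

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads a length n, then one line of at most n
 * characters. Returns a malloc'd string the caller must free, or NULL. */
char *read_sized_string(FILE *in)
{
    int n, c;
    char *str;

    assert(in != NULL);
    if (fscanf(in, "%d", &n) != 1 || n < 1 || n == INT_MAX)
        return NULL;
    /* Discard the rest of the line that held the number, including '\n'. */
    while ((c = fgetc(in)) != EOF && c != '\n')
        ;
    str = malloc((size_t)n + 1);          /* n characters + '\0' */
    if (str == NULL)
        return NULL;
    if (fgets(str, n + 1, in) == NULL) {  /* stores at most n characters */
        free(str);
        return NULL;
    }
    str[strcspn(str, "\n")] = '\0';       /* lop off a potential trailing '\n' */
    return str;
}
```

Input longer than n characters is cut off at n rather than written past the end of the allocation.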
When you write 11 bytes to a 10-byte buffer, the last byte will be out-of-bounds. Depending on several factors, the program may crash, have unexpected and weird behavior, or may run just fine (i.e., what you are seeing). In other words, the behavior is undefined. You pretty much always want to avoid this, because it is unsafe and unpredictable.
Try writing a bigger string to your 10-byte buffer, such as 20 bytes or 30 bytes. You will likely see problems start to appear, although even that is not guaranteed.

Character Array and Null character

#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
char str[4];
scanf("%s",str);
printf("%s",str);
}
Input: scan
Output: scan
Here I declare an array of 4 characters and read into it with '%s', the conversion used for strings. I don't understand how I can input 4 characters and still get the correct output when one element should be taken up by the null character. The input should only work with up to 3 characters.
scanf() does not check its arguments. You could even enter more than 4 characters and scanf() would happily overwrite the memory area that comes after your array. After that, your program might crash or all kinds of funny things might happen. This is called a buffer overflow and it is a common cause of vulnerabilities in software.
As mentioned, when you enter more than 3 characters, the extra characters and the terminating \0 are written past the end of the array, overwriting memory that doesn't belong to it. That causes undefined behavior.
However, you can use either of these to prevent the buffer overflow:
scanf("%3s",str);
or
fgets(str, sizeof str, stdin);
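Both calls cut long input short without saying so. With fgets, truncation can be detected by checking whether a newline was stored. The sketch below assumes a hypothetical helper, read_line_checked, which also discards the unread remainder of an over-long line:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf.
 * Returns 1 if the whole line fit, 0 if it was truncated (the rest of the
 * line is discarded), and -1 on EOF or error. */
int read_line_checked(FILE *in, char *buf, size_t size)
{
    size_t len;
    int c;

    assert(in != NULL && buf != NULL && size > 1);
    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';       /* complete line: strip the newline */
        return 1;
    }
    /* No newline stored: the buffer filled up or the input ended. */
    c = fgetc(in);
    if (c == EOF || c == '\n')
        return 1;                  /* the line fit exactly */
    while (c != EOF && c != '\n')
        c = fgetc(in);             /* drain the remainder of the long line */
    return 0;
}
```

With char str[4], the input "scan" leaves "sca" in the buffer and the function returns 0, so the caller can tell that the fourth character did not fit.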

Bus error caused while reading in a string

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main() {
char *input = (char *)malloc(sizeof(char));
input = "\0";
while (1){
scanf("%s\n", input);
if (strcmp(input, "0 0 0") == 0) break;
printf("%s\n",input);
}
}
I'm trying to read in a string of integers until "0 0 0" is entered in.
The program spits out bus error as soon as it executes the scanf line, and I have no clue how to fix it.
Below is the error log.
[1] 59443 bus error
You set input to point to the first element of a string literal (while leaking the recently allocated buffer):
input = "\0"; // now the malloc'd buffer is lost
Then you try to modify said literal:
scanf("%s\n", input);
That is undefined behaviour. You can't write to that location. You can fix that problem by removing the first line, input = "\0";.
Next, note that you're only allocating space for one character:
char *input = (char *)malloc(sizeof(char));
Once you fix the memory leak and the undefined behaviour, you can think about allocating more space. How much space you need is for you to say, but you need enough to contain the longest string you want to read in plus an extra character for the null terminator. For example,
char *input = malloc(257);
would allow you to read in strings up to 256 characters long.
The immediate problem, (thanks to another answer) is that you're initializing input wrong, by pointing it at read-only data, then later trying to write to it via scanf. (Yes, even the lowly literal "" is a pointer to a memory area where the empty string is stored.)
The next problem is semantic: there's no point in trying to initialize it when scanf() will soon overwrite whatever you put there. But if you wanted to, a valid way is input[0] = '\0', which would be appropriate for, say, a loop using strcat().
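As an illustration of the strcat() case mentioned here, a sketch assuming a hypothetical helper join_words: strcat() needs its destination to already be a valid string, so the freshly malloc'd buffer is first turned into an empty string with out[0] = '\0'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: joins `count` words with single spaces into a
 * freshly malloc'd string. Returns NULL if allocation fails. */
char *join_words(const char *const *words, size_t count)
{
    size_t total = 1;                    /* room for the terminating '\0' */
    size_t i;
    char *out;

    assert(words != NULL || count == 0);
    for (i = 0; i < count; i++)
        total += strlen(words[i]) + 1;   /* word plus one separator */
    out = malloc(total);
    if (out == NULL)
        return NULL;
    out[0] = '\0';                       /* make the buffer a valid empty string */
    for (i = 0; i < count; i++) {
        if (i > 0)
            strcat(out, " ");
        strcat(out, words[i]);
    }
    return out;
}
```

The pointer returned by malloc is kept and written through, never reassigned to a string literal, so the result can be modified and later passed to free().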
And finally, waiting in the wings to bite you is a deeper issue: You need to understand malloc() and sizeof() better. You're only allocating enough space for one character, then overrunning the 1-char buffer with a string of arbitrary length (up to the maximum that your terminal will allow on a line.)
A rough cut would be to allocate far more, say 256 chars, than you'll ever need, but scanf is an awful function for this reason -- makes buffer overruns painfully easy especially for novices. I'll leave it to others to suggest alternatives.
Interestingly, the type of crash can indicate something about what you did wrong. A Bus error often relates to modifying read-only memory (which is still a mapped page), such as you're trying to do, but a Segmentation Violation often indicates overrunning a buffer of a writable memory range, by hitting an unmapped page.
input = "\0";
is wrong.
'input' is a pointer, not memory.
"\0" is a string, not a char.
You are assigning the pointer a new value, which points to a segment of memory that holds constants, because "\0" is a constant string literal.
Now when you try to modify this constant memory, you get a bus error, which is expected.
In your case I assume you wanted to initialize 'input' with an empty string.
Use
input[0]='\0';
Note the single quotes around \0.
Next problem is malloc:
char *input = (char *)malloc(sizeof(char));
You are allocating memory for 1 character only.
When the user enters "0 0 0", which is 5 characters plus the terminating zero, you will get a buffer overflow and will probably corrupt some innocent variable.
Allocate enough memory up front to store all user input. Usual values are 256 or 8192 bytes; the exact number doesn't matter.
Then,
scanf("%s\n", input);
may still overrun the buffer if the user enters a lot of text. Use fgets(buf, limit, stdin), with a limit such as 8192; that would be safer.
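Putting these suggestions together, a hedged sketch of the loop the question seems to be after. The helper name echo_until_sentinel and the 256-character limit are assumptions chosen for illustration. Note that "%s" stops at whitespace, so scanf could never have stored "0 0 0" as one string; fgets reads the whole line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_MAX_LEN 256   /* assumed upper bound on a line, for illustration */

/* Hypothetical helper: echoes lines from `in` to `out` until a line equal
 * to "0 0 0" (or end of input) is seen. Returns the number of lines echoed. */
int echo_until_sentinel(FILE *in, FILE *out)
{
    char input[LINE_MAX_LEN + 1];   /* +1 for the null terminator */
    int echoed = 0;

    assert(in != NULL && out != NULL);
    while (fgets(input, sizeof input, in) != NULL) {
        input[strcspn(input, "\n")] = '\0';   /* strip the newline */
        if (strcmp(input, "0 0 0") == 0)
            break;
        fprintf(out, "%s\n", input);
        echoed++;
    }
    return echoed;
}
```

In the original program this would be called as echo_until_sentinel(stdin, stdout). Lines longer than the limit are delivered in pieces rather than overflowing the buffer.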

Input too big for array

I have a small question that I was just wondering about.
#include <stdio.h>
int main()
{
char n_string[5];
printf("Please enter your first name: ");
scanf("%s", n_string);
printf("\nYour name is: %s", n_string);
return 0;
}
On the 5th line I declare a string of 4 letters. Now this means I will only be able to hold 4 characters in that string, correct?
If I execute my program and write the name: Alexander, I get the output:
Your name is Alexander.
My question is, how come I could put a string of 9 characters into an array that holds 4?
You are overwriting a part of your program's stack by doing that, which is generally a very bad thing. In this case, you got lucky, but if you write further you will almost certainly get a segfault, when main tries to return.
Malicious actors will use this as a buffer overflow attack, to overwrite a function's return address.
If your question is "Why does C allow me to do this?", the answer is that C does not do bounds checking on arrays. It treats arrays (more or less) as a pointer to an address in memory, and scanf is more than happy to write to the memory location without worrying about what it actually represents.
You allocated 5 bytes, but since your CPU probably requires 16-byte alignment, the compiler probably allocated 16 bytes. Try this:
char n_string[5];
volatile int some_int;
some_int = 0;
scanf("%s", n_string);
printf("%s %d\n", n_string, some_int);
Is some_int still 0? Writing into n_string may have caused a buffer overflow and written bad data to some_int. Of course your compiler probably knows that some_int will stay a zero, so we declare it like volatile int some_int; to stop it from optimizing.
You reserve memory for 4 letters and the terminating zero. You write nine letters and a zero to it. You overstepped your bounds by 5 bytes. Those 5 bytes belonged to someone else, you just trashed his memory.
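The byte count in this answer can be checked with a little arithmetic. A sketch, assuming a hypothetical helper bytes_past_end that only computes the number and never performs the overflowing write:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes that storing `input` as a C string
 * would write beyond a buffer of `capacity` bytes (0 if it fits). */
size_t bytes_past_end(const char *input, size_t capacity)
{
    size_t needed;

    assert(input != NULL);
    needed = strlen(input) + 1;   /* characters plus the terminating '\0' */
    return needed > capacity ? needed - capacity : 0;
}
```

For "Alexander" and a 5-byte array, 9 characters plus the terminator need 10 bytes, so 5 bytes land outside the array.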
The most likely candidate for this is variables that are close. Test this, although not guaranteed, chances are you will see what happens with your remaining bytes: they will damage your i variable:
#include <stdio.h>
int main()
{
char n_string[5];
int i = 17;
printf("Please enter your first name: ");
scanf("%s", n_string);
printf("\nYour name is: %s", n_string);
printf("\nThe variable i is %d", i);
return 0;
}
I think there just happens to be valid memory in your process at the addresses right after your array, so it appears to work. However, you are corrupting whatever else is stored there by overwriting it.
Essentially you have a buffer overflow.
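Because a real overflow is undefined behavior, it cannot be demonstrated reliably. The sketch below instead simulates it inside a single 16-byte object, which keeps the behavior defined: bytes 0 to 4 stand in for n_string[5] and the remaining bytes stand in for whatever happens to live next to it. The function name and the 16-byte size are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Simulates the overflow inside ONE 16-byte object, so nothing is written
 * out of bounds. Returns how many "neighbour" bytes (index 5 and up)
 * changed after copying `name` to the start of the arena. */
size_t simulate_overflow(const char *name)
{
    unsigned char arena[16];
    size_t i, changed = 0;

    assert(name != NULL && strlen(name) < sizeof arena);
    memset(arena, 0xAA, sizeof arena);       /* known fill pattern */
    memcpy(arena, name, strlen(name) + 1);   /* stays within the arena */
    for (i = 5; i < sizeof arena; i++)
        if (arena[i] != 0xAA)
            changed++;
    return changed;
}
```

For "Alexander" the call reports 5 changed neighbour bytes, while "Alex" fits in the first 5 bytes and changes none.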

Need help with malloc in C programming. It is allocating more space than expected

Let me preface this by saying that I am a newbie, and I'm in an entry-level C class at school.
I'm writing a program that requires me to use malloc, and malloc is allocating 8x the space I expect it to in all cases. Even with just malloc(1), it is allocating 8 bytes instead of 1, and I am confused as to why.
Here is the code I tested with. This should only allow one character to be entered plus the escape character. Instead I can enter 8, so it is allocating 8 bytes instead of 1; this is the case even if I just use an integer in malloc(). Please ignore the x variable; it is used in the actual program, but not in this test:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main (int argc ,char* argv[]){
int x = 0;
char *A = NULL;
A=(char*)malloc(sizeof(char)+1);
scanf("%s",A);
printf("%s", A);
free(A);
return 0;
}
A=(char*)malloc(sizeof(char)+1);
is going to allocate at least 2 bytes (sizeof(char) is always 1).
I don't understand how you are determining that it is allocating 8 bytes, however malloc is allowed to allocate more memory than you ask for, just never less.
The fact that you can use scanf to write a longer string to the memory pointed to by A does not mean that you have that memory allocated. It will overwrite whatever is there, which may result in your program crashing or producing unexpected results.
malloc is allocating as much memory as you asked for.
If you can read more than the allocated bytes (using scanf) it's because scanf is reading also over the memory you own: it's a buffer overflow.
You should limit the data scanf can read this way:
scanf( "%10s", ... ); // scanf will read a string no longer than 10
"I'm writing a program that requires me to use malloc, and malloc is allocating 8x the space I expect it to in all cases. Even with just malloc(1), it is allocating 8 bytes instead of 1, and I am confused as to why."
Theoretically speaking, the way you do things in the program does not allocate 8 bytes.
You can still type in 8 bytes (or any number of bytes) because C does not check that you are still writing to a valid place.
What you see is Undefined Behaviour, and the reason is that you write to memory that you shouldn't. Nothing in your code stops the program after the n byte(s) you allocated have been used.
You might get a Seg Fault now, or later, or never. This is Undefined Behaviour. Just because it appears to work does not mean it is right.
Now, your program could indeed allocate 8 bytes instead of 1.
The reason for that is alignment.
The same program might allocate a different size on a different machine and/or a different operating system.
Also, since you are using C, you don't really need to cast the result of malloc.
In your code, there is no limit on how much data you can load in with scanf, leading to a buffer overflow (security flaw/crash). You should use a format string that limits the amount of data read in to the one or two bytes that you allocate. The malloc function will probably allocate some extra space to round the size up, but you should not rely on that.
malloc is allowed to allocate more memory than you ask for. It's only required to provide at least as much as you ask for, or fail if it can't.
Using malloc or creating a buffer on the stack will typically allocate memory in whole words.
On a 32-bit system the word size is 4 bytes, so when you ask for
A=(char*)malloc(sizeof(char)+1);
(which is essentially A=(char*)malloc(2);)
the system will likely give you at least 4 bytes. On a 64-bit machine you should get at least 8 bytes.
The way you use scanf there is dangerous: it will overflow the buffer if the input string is longer than the allocated size, leaving a heap overflow vulnerability in your program. scanf will attempt to stuff a string of any length into that memory, so using it to measure the allocated size will not work.
What system are you running on? If it's 64 bit, it is possible that the system is allocating the smallest possible unit that it can. 64 bits being 8 bytes.
EDIT: Just a note of interest:
char *s = malloc (1);
Causes 16 bytes to be allocated on iOS 4.2 (Xcode 3.2.5).
If you enter 8, it will still allocate just 2 bytes, since sizeof(char) == 1 by definition, and scanf will write your input into that buffer. Then printf will output what you stored there. So if you enter 8, it will display 8 on the command line. That has nothing to do with the number of chars allocated.
Unless of course you looked up in a debugger or somewhere else that it is really allocating 8 bytes.
scanf has no idea how big the target buffer actually is. All it knows is the starting address of the buffer. C does no bounds checking, so if you pass it the address of a buffer sized to hold 2 characters, and you enter a string that's 10 characters long, scanf will write the extra characters and the terminating null to the memory following the end of the buffer. This is called a buffer overrun, and it is commonly exploited by malware. For whatever reason, the bytes immediately following your buffer aren't "important", so you can enter up to 8 characters with no apparent ill effects.
You can limit the number of characters read in a scanf call by including an explicit field width in the conversion specifier:
scanf("%2s", A);
but it's still up to you to make sure that the target buffer is large enough to accommodate that width. Unfortunately, there's no way to specify the field width dynamically as there is with printf:
printf("%*s", fieldWidth, string);
because %*s means something completely different in scanf (basically, skip over the next string).
You could use sprintf to build your format string:
sprintf(format, "%%%ds", max_bytes_in_A);
scanf(format, A);
but you have to make sure the buffer format is wide enough to hold the result, etc., etc., etc.
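A sketch of that format-building approach with the checks spelled out. It assumes a hypothetical helper read_word_bounded and uses snprintf (C99) instead of sprintf, so the format buffer itself cannot be overrun:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one word of at most size-1 characters into
 * buf, building the "%<width>s" format at run time. Returns 1 on success. */
int read_word_bounded(FILE *in, char *buf, size_t size)
{
    char format[32];
    int written;

    assert(in != NULL && buf != NULL && size >= 2);
    /* "%%" emits a literal '%', "%zu" the width, then the 's' conversion. */
    written = snprintf(format, sizeof format, "%%%zus", size - 1);
    if (written < 0 || (size_t)written >= sizeof format)
        return 0;                  /* the format string did not fit */
    return fscanf(in, format, buf) == 1;
}
```

The width is size - 1 because the field width counts characters only and scanf still appends the terminating '\0'.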
This is why I usually recommend fgets() for interactive input.

Resources