Character Array and Null character - c

#include <stdio.h>
#include <stdlib.h>
int main()
{
int i;
char str[4];
scanf("%s",str);
printf("%s",str);
}
Input: scan
Output: scan
Here I declare an array of 4 characters and use '%s', the format for strings. I don't understand how I can input 4 characters and still get the correct output when one element should be reserved for the null character. The input should only work with up to 3 characters.

scanf() does not check its arguments. You could even enter more than 4 characters and scanf() would happily overwrite the memory area that comes after your array. After that, your program might crash or all kinds of funny things might happen. This is called a buffer overflow and it is a common cause of vulnerabilities in software.

As mentioned, when you enter more than 3 characters, the extra characters and the '\0' are written outside the array (just after it), overwriting memory that does not belong to the array. This causes undefined behavior.
You can use either of these to prevent the buffer overflow:
scanf("%3s",str);
or
fgets(str, sizeof str, stdin);
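A minimal sketch of the width-limited conversion, assuming a hypothetical helper named read_word. It uses sscanf on an in-memory string so it runs without keyboard input; the "%3s" conversion follows the same rules in scanf.

```c
#include <stdio.h>

/* Hypothetical helper (not from the question): parse one word from
 * `input` into a 4-byte buffer. Returns 1 if a word was stored, else 0. */
int read_word(const char *input, char out[4])
{
    /* "%3s" stores at most 3 non-whitespace characters plus the '\0',
     * so a 4-byte buffer can never be overrun. */
    return sscanf(input, "%3s", out) == 1;
}
```

With the input "scan" from the question, the buffer ends up holding "sca" and the final 'n' is simply left unread.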

Related

inputting a character string using scanf()

I started learning about inputting character strings in C. In the following source code I get a character array of length 5.
#include<stdio.h>
int main(void)
{
char s1[5];
printf("enter text:\n");
scanf("%s",s1);
printf("\n%s\n",s1);
return 0;
}
when the input is:
1234567891234567, and I've checked it's working fine up to 16 elements(which I don't understand because it is more than 5 elements).
12345678912345678, it's giving me an error segmentation fault: 11 (I gave 17 elements in this case)
123456789123456789, the error is Illegal instruction: 4 (I gave 18 elements in this case)
I don't understand why there are different errors. Is this the behavior of scanf() or character arrays in C? The book that I am reading doesn't have a clear explanation of these things. FYI, I don't know anything about pointers. Any further explanation would be really helpful.
Is this the behavior of scanf() or character arrays in C?
TL;DR - No, you're facing the side-effects of undefined behavior.
To elaborate, in your case, with code like
scanf("%s",s1);
where you have defined
char s1[5];
entering more than 4 characters makes your program write into an invalid memory area (past the end of the array), which in turn invokes undefined behavior.
Once you hit UB, the behavior of the program cannot be predicted or justified in any way. It can do absolutely anything possible (or even impossible).
There is nothing inherent in scanf() that stops it from reading overly long input and overrunning the buffer. You should keep control of the input by using a field width, like
scanf("%4s",s1); //1 saved for terminating null
When reading strings, the scanf function reads up to the next white-space character (e.g. newline, space, tab) or the "end of file". It has no idea about the size of the buffer you provide.
If the string you read is longer than the buffer provided, then it will write out of bounds, and you will have undefined behavior.
The simplest way to stop this is to provide a field length to the scanf format, as in
char s1[5];
scanf("%4s",s1);
Note that I use 4 as field length, as there needs to be space for the string terminator as well.
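You can also use the "secure" scanf_s, for which you need to provide the buffer size as an argument. Note that scanf_s is an optional part of the C standard (C11 Annex K), so it is not available on every implementation: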
char s1[5];
scanf_s("%s", s1, sizeof(s1));
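Another portable option is fgets, which is told the buffer size directly. The sketch below assumes a hypothetical helper named read_line that takes the stream as a parameter; pass stdin for keyboard input.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into `buf`, which has
 * room for `size` bytes. Returns 1 on success, 0 on EOF or read error. */
int read_line(FILE *in, char *buf, int size)
{
    /* fgets stores at most size - 1 characters and always adds '\0'. */
    if (fgets(buf, size, in) == NULL)
        return 0;

    /* Remove the trailing newline if it fit into the buffer. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

Called as read_line(stdin, s1, sizeof s1) with the 16-digit input from the question, s1 receives only "1234"; the remaining characters stay in the input stream for the next read.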

Writing 5 character to char[5] affects int

Easy code down below.
Mac OS X 10.10.5, Xcode 7.2, C-file.
If I input 1, and afterwards qwert, I get 0 and qwert back.
1 and qwer gives 1 and qwer.
1 and e.g. qwerty gives 121 and qwerty.
What have I missed - why can I write more than 4 chars (+null) to a 5 char variable?
Why is the integer affected?
#include <stdio.h>
int main() {
int userInput;
char q[5];
printf("Hello\n");
scanf("%d", &userInput);
printf("%d\nAnd\n", userInput);
scanf("%s", q);
printf("\n");
printf("%d\n%s", userInput, q);
return 0;
}
What have I missed - why can I write more than 4 chars (+null) to a 5 char variable?
There is nothing stopping you from accessing out-of-bounds portions of an array in C. This will compile:
char a[2];
a[10000] = 10;
Why is the integer affected?
What you are causing is undefined behavior, which is likely why your int is affected. You can learn more by reading about C arrays. It happens because you are putting a 5-character string plus a terminating null character (i.e., 6 chars) into a space meant for only 5, so you are writing outside the bounds of your array.
As a further note, scanf("%s", ...) offers no protection against this. If a user enters a string that is too long, then too bad. That is why you should protect your input by using a format string such as "%4s", or use fgets:
fgets(q, sizeof q, stdin);
Both approaches prevent more than 4 characters from being stored.
[Edit] User/code can try to "write more than 4 chars (+null) to a 5 char variable". C does not specify what should happen when code does not prevent such an event. C is coding without the safety net/training wheels.
scanf("%s", q); reads and saves the 5 characters of "qwert" and also appends a null character '\0'. @Weather Vane
Since q[] only has room for 5 characters, undefined behavior (UB) occurs. In OP's case, it appears to have overwritten userInput.
To avoid this, use a width limit on "%s" as below. It will not consume more than 4 non-white-space characters from the user. Unfortunately, extra text will remain in stdin.
char q[5];
scanf("%4s", q);
Or better, review fgets() for reading user input.
The int userInput is affected because you are writing past the end of the char array q. Both are stack variables, and the compiler you're using seems to allocate stack memory for local variables in "reverse order", i.e., they are "pushed" in the order defined, so the first local variable listed is lower on the stack (at a higher address on typical platforms, where the stack grows downward). So in your case, when you write past the end of q, you are writing into the memory allocated for userInput, which is why it is affected.
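The observed values 0 and 121 can be reproduced with a well-defined simulation of that layout. This is only a sketch under stated assumptions: q occupies 5 bytes directly followed by userInput, and int is little-endian (as on the Mac in the question). The function name simulate_overflow and the byte array frame are hypothetical.

```c
#include <string.h>

/* Simulates "5 bytes of q followed directly by userInput" inside one
 * byte array, so nothing is actually written out of bounds.
 * Assumption: little-endian int. Returns userInput after the "overflow". */
int simulate_overflow(const char *typed)
{
    unsigned char frame[5 + sizeof(int)];
    int userInput = 1;                      /* the value entered first */
    size_t n = strlen(typed) + 1;           /* characters plus '\0' */

    memset(frame, 0, sizeof frame);
    memcpy(frame + 5, &userInput, sizeof userInput);

    if (n > sizeof frame)
        n = sizeof frame;                   /* keep the demo in bounds */
    memcpy(frame, typed, n);                /* what scanf("%s", q) writes */

    memcpy(&userInput, frame + 5, sizeof userInput);
    return userInput;
}
```

Under those assumptions, "qwer" leaves userInput at 1, "qwert" zeroes its lowest byte (giving 0), and "qwerty" replaces that byte with 'y', whose ASCII code is 121. The real program is still undefined behavior; a different compiler or optimization level may lay out the variables differently.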

Using getchar() and malloc() together

I found a C Program whose purpose is to input a string while using dynamic memory allocation.
However I am having difficulty understanding the logic behind it.
#include <stdio.h>
#include <stdlib.h>
#define MAX 10
int main(void)
{
char *A;
int max_int=0;
printf("Enter max string length: ");
scanf("%d",&max_int);
while ((getchar())!='\n');
A=(char *)malloc(max_int+1); //room for null char
printf("Enter string: ");
fgets(A,max_int,stdin);
}
What is the purpose of while ((getchar())!='\n'); ? It seems redundant to me, since you're only inputting a number before it gets called.
while ((getchar())!='\n');
The above line is used to flush anything on the line not read by scanf, for example non-digits and spaces, so the next input starts in a new line.
Also:
The scanf call should check the number of assigned matches (0 matches is possible).
The result of malloc should never be cast (that just hides bugs).
The result of malloc should be checked for NULL; using a null pointer is undefined behavior.
fgets expects the full buffer length and guarantees null termination. Passing max_int instead of max_int+1 means you get a string one character shorter.
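Putting those four points together, the program's logic could be sketched as the function below. The name read_sized_string and the FILE * parameter are assumptions made for illustration (pass stdin to match the original program), and the prompts are omitted.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a maximum length, then a string of at most
 * that many characters. Returns a malloc'ed string that the caller must
 * free, or NULL on any failure. */
char *read_sized_string(FILE *in)
{
    int max_len = 0;
    int c;
    char *s;

    /* Check that exactly one item was assigned and that it is usable. */
    if (fscanf(in, "%d", &max_len) != 1 || max_len <= 0 || max_len == INT_MAX)
        return NULL;

    /* Discard the rest of the line; stop at EOF to avoid looping forever. */
    while ((c = getc(in)) != '\n' && c != EOF)
        ;

    s = malloc((size_t)max_len + 1);        /* +1 for the '\0' */
    if (s == NULL)
        return NULL;

    /* Pass the full buffer size; fgets stores at most max_len characters. */
    if (fgets(s, max_len + 1, in) == NULL) {
        free(s);
        return NULL;
    }
    return s;
}
```

If the entered string is shorter than the limit, the result still contains the trailing newline that fgets stored.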

scanf reads more chars than the destination var can hold

The following code reads up to 10 chars from stdin and output the chars.
When I input more than 10 chars, I expect it to crash because msg has not enough room, but it does NOT! How could that be?
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
char* msg = malloc(sizeof(char)*10);
if (NULL==msg)
exit(EXIT_FAILURE);
scanf("%s", msg);
printf("You said: %s\n", msg);
if (strlen(msg)<10)
free(msg);
return EXIT_SUCCESS;
}
Use fgets instead; scanf("%s") is not buffer safe. What you are seeing is undefined behavior.
You may allocate a "safely" large buffer when using scanf(). For user input it should hold about 2 lines (roughly 2x80 chars); for files, somewhat more.
Conclusion: scanf() is quick-and-dirty stuff; don't use it in serious projects.
You can specify max size in scanf() format string
scanf("%9s", msg);
I would imagine that malloc() allocates blocks of memory aligned to word boundaries. On a 32-bit machine, that would mean whatever you ask for is rounded up to the next multiple of 4, so a request for 10 bytes may really get 12. That means you might get away with a string of up to 11 characters (plus a '\0' terminator) without suffering any problems.
But don't ever assume this to be the case. Like everyone else is saying, you should always specify a safe maximum length in your format string if you want to avoid problems.
It does not crash because C is very lenient, contrary to popular belief. The program is not required to crash or even complain if a buffer is overflowed. Say you define
union {
uint8_t a[3];
uint32_t b;
};
then a[3] still lies inside the union's four bytes, so in practice there is no reason for it to crash (but don't ever do this). Even a[5] or a[100] may happen to be accessible memory.
On the other hand, I may try to access a[-1], which may happen to be memory the OS does not allow the program to access, causing a segfault.
As to what you should do to fix this: as others have pointed out, scanf("%s") is not safe to use with fixed-size buffers. Use one of their suggestions.

String declaration length in C

So I'm writing a small program (I'm new to C, coming from C++), and I want to take in a string of maximum length ten.
I declare a character array as
#define SYMBOL_MAX_LEN 10 //Maximum length a symbol can be from the user (NOT including null character)
.
.
.
char symbol[SYMBOL_MAX_LEN + 1]; //Holds the symbol given by the user (+1 for null character)
So why is it when I use:
scanf("%s", symbol); //Take in a symbol given by the user as a string
I am able to type '01234567890', and the program will still store the entire value?
My questions are:
Does scanf not prevent values from being recorded in the adjacent blocks of memory after symbol?
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
You can limit the number of characters scanf() will read as so:
#include <stdio.h>
int main(void) {
char buffer[4];
scanf("%3s", buffer);
printf("%s\n", buffer);
return 0;
}
Sample output:
paul@local:~/src/c/scratch$ ./scanftest
abc
abc
paul@local:~/src/c/scratch$ ./scanftest
abcdefghijlkmnop
abc
paul@local:~/src/c/scratch$
scanf() will add the terminating '\0' for you.
If you don't want to hardcode the length in your format string, you can just construct it dynamically, e.g.:
#include <stdio.h>
#define SYMBOL_MAX_LEN 4
int main(void) {
char buffer[SYMBOL_MAX_LEN];
char fstring[100];
sprintf(fstring, "%%%ds", SYMBOL_MAX_LEN - 1);
scanf(fstring, buffer);
printf("%s\n", buffer);
return 0;
}
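As an alternative to building the format string at run time, the width can be pasted into the format at compile time with the preprocessor's stringification operator. This is a sketch, not part of the answer above: the macros STR and STR_ and the helper read_symbol are hypothetical names, SYMBOL_MAX_LEN follows the question's convention (10, not counting the null character), and sscanf is used so the example runs without keyboard input.

```c
#include <stdio.h>

#define SYMBOL_MAX_LEN 10     /* must be a plain integer literal here */
#define STR_(x) #x
#define STR(x)  STR_(x)       /* expands SYMBOL_MAX_LEN, then stringifies it */

/* Hypothetical helper: parse one symbol of at most SYMBOL_MAX_LEN
 * characters. Returns 1 if a symbol was stored, else 0. */
int read_symbol(const char *input, char symbol[SYMBOL_MAX_LEN + 1])
{
    /* Adjacent string literals are concatenated: "%" "10" "s" is "%10s". */
    return sscanf(input, "%" STR(SYMBOL_MAX_LEN) "s", symbol) == 1;
}
```

With the input '01234567890' from the question, only the first ten characters, "0123456789", are stored. The trick stops working if SYMBOL_MAX_LEN is defined as an expression such as (5 + 5), because the expression text itself would be pasted into the format.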
For the avoidance of doubt, scanf() is generally a terrible function for dealing with input. fgets() is much better for this type of thing.
Does scanf not prevent values from being recorded in the adjacent blocks of memory after symbol?
As far as I know, No.
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
By using buffer safe functions like fgets.
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
It always appends one, but there has to be room for it. For example, if your array has length 10 and you input 10 chars, the null terminator is written past the end of the array, which is undefined behavior.
I am able to type '01234567890', and the program will still store the entire value?
This is because you are unlucky enough to get the result you wanted. It still invokes undefined behavior.
Does scanf not prevent values from being recorded in the adjacent blocks of memory after symbol?
No.
How could I prevent the user from entering a value of greater than length SYMBOL_MAX_LEN?
Use fgets.
Does scanf put the null terminating character into symbol automatically, or is that something I will need to do manually?
Yes
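The answers that recommend fgets do not show how to reject input that is too long. One possible sketch follows; the helper name read_symbol_line, its return codes, and the FILE * parameter (pass stdin for user input) are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

#define SYMBOL_MAX_LEN 10

/* Hypothetical helper: reads one line as a symbol.
 * Returns 1 if the line fits (symbol is filled in), 0 if the line was
 * longer than SYMBOL_MAX_LEN (the rest is discarded), -1 on EOF or error. */
int read_symbol_line(FILE *in, char symbol[SYMBOL_MAX_LEN + 1])
{
    char line[SYMBOL_MAX_LEN + 2];          /* symbol + '\n' + '\0' */
    size_t len;
    int c;

    if (fgets(line, sizeof line, in) == NULL)
        return -1;

    len = strcspn(line, "\n");              /* characters before the newline */
    if (len > SYMBOL_MAX_LEN) {
        /* The buffer filled up before a newline appeared: too long.
         * Discard the remainder so the next read starts on a fresh line. */
        while ((c = getc(in)) != '\n' && c != EOF)
            ;
        return 0;
    }

    line[len] = '\0';                       /* drop the newline */
    strcpy(symbol, line);                   /* fits: len <= SYMBOL_MAX_LEN */
    return 1;
}
```

A caller can loop on a return value of 0 and ask the user to try again, which a width such as "%10s" does not allow because it silently truncates instead.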
