Printing char array in C causes segmentation fault - c

I did a lot of searching around for this, couldn't find any question with the same exact issue.
Here is my code:
void fun(char* name){
printf("%s",name);
}
char name[6];
sscanf(input,"RECTANGLE_SEARCH(%6[A-Za-z0-9])",name);
printf("%s",name);
fun(name);
The name is grabbed from scanf, and it printed out fine at first. Then when fun is called, there is a segmentation fault when it tries to print out name. Why is this?

After looking in my scrying-glass, I have it:
Your sscanf call overflowed the buffer (it wrote more than 6 bytes, counting the terminator), and the ill effect was slightly delayed by circumstance:
Nobody else relied on or re-used the memory corrupted at first, thus the first printf seems to work.
Somewhere after the first and before the second call to printf, the space you overwrote got re-used, so the string you read was no longer terminated before the read ran into unallocated pages.
Thus, a segmentation-fault at last.
Of course, your program was toast the moment it overflowed the buffer, not later when it finally crashed.
Moral: Never write to memory you have not reserved for that purpose.
Looking at your edit, the format %6[A-Za-z0-9] tries to read up to 6 characters excluding the terminator, not including it!

Since you're reading 6 characters, you have to declare name to be 7 characters, so there's room for the terminating null character:
char name[7];
Otherwise, you'll get a buffer overflow, and the consequences are undefined. Once you have undefined consequences, anything can happen, including 2 successful calls to printf() followed by a segfault when you call another function.
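As a minimal sketch of the fix described above, the buffer below has 7 bytes so that the 6 characters allowed by %6[...] plus the terminator always fit. The helper name parse_name and its return convention are assumptions made for illustration; they are not from the question.

```c
#include <stdio.h>

/* Hypothetical helper: extracts up to 6 alphanumeric characters from a
   command of the form RECTANGLE_SEARCH(name).
   `name` must point to at least 7 bytes: 6 characters plus the '\0'.
   Returns 1 if a name was read, 0 otherwise. */
int parse_name(const char *input, char name[7])
{
    /* %6[...] stores at most 6 characters and then appends '\0'. */
    return sscanf(input, "RECTANGLE_SEARCH(%6[A-Za-z0-9])", name) == 1;
}
```

Checking the return value matters here: if the input does not match, name is left unmodified and should not be printed.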

You're probably walking off the end of the array with your printf statement. printf uses the terminating null character '\0' to know where the end of the string is. Try allocating your array like this:
char name[6] = {'\0'};
This will allocate your array with every element initially set to the '\0' character, which means that as long as you don't overwrite the entire array with your scanf, printf will terminate before walking off the end.

Are you sure that name is zero byte terminated? scanf can overflow your buffer depending on how you are calling it.
If that happens then printf will read beyond the end of the array resulting in undefined behavior and probably a segmentation fault.

Related

A little query, String in C

Recently I was programming in my Code Blocks and I did a little program only for hobby in C.
char littleString[1];
fflush( stdin );
scanf( "%s", littleString );
printf( "\n%s", littleString);
If I created a string of one character, why does the CodeBlocks allow me to save 13 characters?
C has no bounds checking: the compiler can't check for writes beyond the bounds of arrays or dynamically allocated memory. Instead, such writes lead to undefined behavior.
To prevent buffer overflow with scanf you can tell it to only read a specific number of characters, and nothing more. So to tell it to read only one character you use the format "%1s".
As a small side-note: Remember that strings in C have an extra character in them, the terminator (character '\0'). So if you have a string that should contain one character, the size actually needs to be two characters.
littleString is not a string. It is a char array of length one. For a char array to hold a string, it must be terminated with a '\0'. You are writing past the memory you have allotted for littleString. This is undefined behavior. scanf just reads user input from the console and stores it in the variable specified, in this case littleString. If you would like to limit the length of the input stored in the variable, I would suggest using scanf_s. Please note that scanf_s is not part of C99; it is an optional (Annex K) interface in C11 and is also provided by Microsoft's C runtime.
Many functions in C are implemented without any checks for correct use. In other words, it is the caller's responsibility to make sure the arguments fulfill the rules set by the function.
Example: For strcpy the Linux man page says
The strcpy() function copies the string pointed to by src,
including the terminating null byte ('\0'), to the buffer
pointed to by dest. The strings may not overlap, and the
destination string dest must be large enough to receive the copy.
If you as a caller break that contract by passing a too small buffer, you'll have undefined behavior and anything can happen.
The program may crash or even do exactly what you expected in 99 out of 100 times and do something strange in 1 out of 100 times.
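One way for a caller to honor that contract is to check the length before copying. The sketch below assumes a hypothetical helper named copy_if_fits, which refuses the copy when the source does not fit; it is an illustration, not a standard function.

```c
#include <string.h>

/* Hypothetical helper: copies src into dest only if it fits,
   including the terminating '\0'.
   Returns 1 on success, 0 if src is too long (dest is left untouched). */
int copy_if_fits(char *dest, size_t dest_size, const char *src)
{
    if (strlen(src) >= dest_size) {
        return 0;           /* no room for the characters plus '\0' */
    }
    strcpy(dest, src);      /* contract satisfied: dest is large enough */
    return 1;
}
```

With a 5-byte destination, "abcd" is accepted and "abcde" is rejected, because the latter needs 6 bytes.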

inputting a character string using scanf()

I started learning about inputting character strings in C. In the following source code I get a character array of length 5.
#include<stdio.h>
int main(void)
{
char s1[5];
printf("enter text:\n");
scanf("%s",s1);
printf("\n%s\n",s1);
return 0;
}
when the input is:
1234567891234567, it works fine; I've checked that it works for up to 16 characters (which I don't understand, because that is more than 5 elements).
12345678912345678, it's giving me an error segmentation fault: 11 (I gave 17 elements in this case)
123456789123456789, the error is Illegal instruction: 4 (I gave 18 elements in this case)
I don't understand why there are different errors. Is this the behavior of scanf() or character arrays in C? The book that I am reading didn't have a clear explanation about these things. FYI I don't know anything about pointers. Any further explanation about this would be really helpful.
Is this the behavior of scanf() or character arrays in C?
TL;DR - No, you're facing the side-effects of undefined behavior.
To elaborate, in your case, against a code like
scanf("%s",s1);
where you have defined
char s1[5];
entering more than 4 characters makes scanf write past the end of the array (beyond the allocated memory), which invokes undefined behavior.
Once you hit UB, the behavior of the program cannot be predicted or justified in any way. It can do absolutely anything possible (or even impossible).
Nothing inherent in scanf() stops it from reading overly long input and overrunning the buffer. You should limit the scan with a field width, like
scanf("%4s",s1); //1 saved for terminating null
When reading strings, the scanf function reads up to the next white-space character (e.g. newline, space, tab) or the end of file. It has no idea about the size of the buffer you provide it.
If the string you read is longer than the buffer provided, then it will write out of bounds, and you will have undefined behavior.
The simplest way to stop this is to provide a field length to the scanf format, as in
char s1[5];
scanf("%4s",s1);
Note that I use 4 as field length, as there needs to be space for the string terminator as well.
You can also use the "secure" scanf_s for which you need to provide the buffer size as an argument:
char s1[5];
scanf_s("%s", s1, sizeof(s1));
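A field width bounds each conversion, but it does not discard the rest of the input: whatever did not fit stays in the stream for the next read. The sketch below uses fscanf on an arbitrary FILE * so the behavior is easy to try; the helper name read_token4 is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited token of at most
   4 characters from `in` into a 5-byte buffer (4 characters + '\0').
   Returns 1 on success, 0 on end of file or error. */
int read_token4(FILE *in, char s1[5])
{
    return fscanf(in, "%4s", s1) == 1;
}
```

Called twice on the input 1234567891234567, this stores "1234" and then "5678", so a program that wants one token per line still has to skip the leftover characters itself.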

Array memory allocation of strings

I have written a simple string program using array allocation. I have allocated a character array of 10 bytes, but when I give input, the program accepts an input string longer than 10 bytes. I get a segmentation fault only when I give an input string of some 21 chars. Why is there no segmentation fault when my input exceeds my allocated array limit?
Program:
#include <stdio.h>
#include <string.h>
void main() {
char str[10];
printf ("\n Enter the string: ");
gets (str);
printf ("\n The value of string=%s",str);
int str_len;
str_len = strlen (str);
printf ("\n Length of String=%d\n",str_len);
}
Output:
Enter the string: n durga prasad
The value of string=n durga prasad
Length of String=14
As you can see, the string length is shown as 14, but I have allocated only 10 bytes. How can the length be more than my allocated size?
Please don't use gets(). It suffers from buffer overflow issues, which in turn invoke undefined behaviour.
Why is there no segmentation fault when my input exceeds my allocated array limit?
Once your input exceeds the allocated array size (i.e., 9 valid characters + 1 null terminator), the very next access past the array is illegal and invokes UB. A segmentation fault is one possible side effect of UB; it is not guaranteed.
Solution: Use fgets() instead.
When you declare an array, like char str[10];, your compiler won't always reserve precisely the number of bytes you asked for. It often reserves more because of alignment padding, usually up to a multiple of 8 on a 64-bit system; for instance, it might be 16 in your case.
So even if you asked for 10 bytes, you may be able to write a few more without crashing. But of course, this is strongly discouraged because, as you said, it can produce segmentation faults.
And, as the answers from Sourav and Gopi say, using fgets instead of gets lets you avoid the overflow in the first place.
When you enter more characters than the array can hold, you have undefined behavior. Your array can hold 9 characters followed by a null terminator, so any input longer than that is UB.
Don't use gets(); use fgets() instead:
char a[10];
fgets(a,sizeof(a),stdin);
By using fgets() you are avoiding buffer overflow issue and avoiding undefined behavior.
PS: fgets() stores the trailing newline character in the buffer when there is room for it.
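A minimal sketch of reading a line with fgets() and dropping the newline it may store. The helper name read_line is hypothetical; strcspn is used because it returns the index of the first '\n', or the index of the terminator when the line was truncated and contains no newline.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size - 1 characters of one line
   into buf and removes the trailing newline, if one was stored.
   Returns buf on success, NULL on end of file, error, or size <= 0. */
char *read_line(char *buf, int size, FILE *in)
{
    if (size <= 0 || fgets(buf, size, in) == NULL) {
        return NULL;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* safe even when no '\n' is present */
    return buf;
}
```

With char a[10] and the input "n durga prasad", the first call stores only "n durga p" (9 characters plus the terminator); the remaining characters are returned by the next call instead of overflowing the array.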
As you already know, your input causes a buffer overflow, so I'm not going to repeat the reason. Instead I would like to answer the particular question,
"Why is there no segmentation fault when my input exceeds my allocated array limit?"
Whether or not there is a segmentation fault comes down to something called undefined behaviour. Once you overrun the allocated memory boundary, a segmentation fault is not guaranteed. Rather, what you'll be facing is UB (as mentioned earlier). Now, quoting the results of UB,
[...] programs invoking undefined behavior may compile and run, and produce correct results, or undetectably incorrect results, or any other behavior.
So, it is not guaranteed that you'll get a segmentation fault immediately on accessing the very next memory location. The program may run perfectly well until it reaches some memory that is actually inaccessible to the process, and then the SIGSEGV signal (11) will be raised.
However, after running into UB, any output from any subsequent statement cannot be validated. So, the output of strlen() is invalid here.

Why output length is coming 6?

I have written a simple program to calculate length of string in this way.
I know that there are other ways too. But I just want to know why this program is giving this output.
#include <stdio.h>
int main()
{
char str[1];
printf( "%d", printf("%s", gets(str)));
return 0;
}
OUTPUT :
(null)6
Unless you always pass empty strings from the standard input, you are invoking undefined behavior, so the output could be pretty much anything, and it could crash as well. str cannot be a well-formed C string of more than zero characters.
char str[1] allocates storage room for one single character, but that character needs to be the NUL character to satisfy C string constraints. You need to create a character array large enough to hold the string that you're writing with gets.
"(null)6" as the output could mean that gets returned NULL because it failed for some reason or that the stack was corrupted in such a way that the return value was overwritten with zeroes (per the undefined behavior explanation). 6 following "(null)" is expected, as the return value of printf is the number of characters that were printed, and "(null)" is six characters long.
There are several issues with your program.
First off, you're defining a char buffer way too short, a 1 char buffer for a string can only hold one string, the empty one. This is because you need a null at the end of the string to terminate it.
Next, you're using the gets function, which is very unsafe (as your compiler almost certainly warned you), as it just blindly takes input and copies it into a buffer. As your buffer is 0+terminator characters long, you're going to be automatically writing past the end of your string into other areas of memory, which could and probably do contain important information, such as the saved return address of your function. This is the classic method of smashing the stack.
Third, you're passing the output of a printf function to another printf. printf isn't designed for formatting strings and returning strings; there are other functions for that. Generally the one you will want to use is sprintf, passing it a string buffer.
Please read the documentation on this sort of thing, and if you're unsure about any specific thing read up on it before just trying to program it in. You seem confused on the basic usage of many important C functions.
It invokes undefined behavior. In this case you may get anything. str should be at least 2 bytes if you are not passing an empty string.
When you declare a variable some space is reserved to store the value.
The reserved space can be a space that was previously used by some other
code and has values. When the variable goes out of scope or is freed
the value is not erased (or it may be, anything goes.) only the programs access
to that variable is revoked.
When you read from an uninitialised location you can get anything.
This is undefined behaviour, and that is what you are doing.
Output on gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 is 0
For the above program, the inner printf prints "(null)", so you get "(null)6". Here "6" is the return value of that printf (the number of characters successfully printed).
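The 6 comes from printf's return value, which is the number of characters it wrote. A small sketch with a well-defined argument shows the same counting; the helper name print_and_count is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: prints s and returns printf's result, which is
   the number of characters written, or a negative value on error. */
int print_and_count(const char *s)
{
    return printf("%s", s);
}
```

Passing the six-character string "(null)" to this helper returns 6, which is the value the outer printf in the question prints with %d.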

Null termination of char array

Consider following case:
#include<stdio.h>
int main()
{
char A[5];
scanf("%s",A);
printf("%s",A);
}
My question is if char A[5] contains only two characters. Say "ab", then A[0]='a', A[1]='b' and A[2]='\0'.
But if the input is say, "abcde" then where is '\0' in that case. Will A[5] contain '\0'?
If yes, why?
sizeof(A) will always return 5 as answer. Then when the array is full, is there an extra byte reserved for '\0' which sizeof() doesn't count?
If you type more than four characters then the extra characters and the null terminator will be written outside the end of the array, overwriting memory not belonging to the array. This is a buffer overflow.
C does not prevent you from clobbering memory you don't own. This results in undefined behavior. Your program could do anything—it could crash, it could silently trash other variables and cause confusing behavior, it could be harmless, or anything else. Notice that there's no guarantee that your program will either work reliably or crash reliably. You can't even depend on it crashing immediately.
This is a great example of why scanf("%s") is dangerous and should never be used. It doesn't know about the size of your array which means there is no way to use it safely. Instead, avoid scanf and use something safer, like fgets():
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
Example:
if (fgets(A, sizeof A, stdin) == NULL) {
/* error reading input */
}
Annoyingly, fgets() will leave a trailing newline character ('\n') at the end of the array. So you may also want code to remove it.
size_t length = strlen(A);
if (length > 0 && A[length - 1] == '\n') {
A[length - 1] = '\0';
}
Ugh. A simple (but broken) scanf("%s") has turned into a 7 line monstrosity. And that's the second lesson of the day: C is not good at I/O and string handling. It can be done, and it can be done safely, but C will kick and scream the whole time.
As already pointed out - you have to define/allocate an array of length N + 1 in order to store N chars correctly. It is possible to limit the amount of characters read by scanf. In your example it would be:
scanf("%4s", A);
in order to read max. 4 chars from stdin.
Character arrays in C are merely blocks of memory (an array decays to a pointer to its first element when you pass it to a function). If you tell the compiler to reserve 5 bytes for characters, it does. If you try to put more than 5 bytes in there, it will just overwrite the memory past the 5 bytes you reserved.
That is why C can have serious security implications. You have to know that you are only going to write 4 characters + a \0. C will let you overwrite memory until the program crashes.
Please don't think of char foo[5] as a string. Think of it as a spot to put 5 bytes. You can store 5 characters in there without a null, but you have to remember you need to do a memcpy(otherCharArray, foo, 5) and not use strcpy. You also have to know that the otherCharArray has enough space for those 5 bytes.
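A short sketch of treating the array as 5 raw bytes rather than as a string. The helper name copy5 is hypothetical; the point is that memcpy copies a fixed byte count and never looks for a terminator.

```c
#include <string.h>

/* Hypothetical helper: copies exactly 5 bytes from src to dst.
   Neither array is treated as a string, so no '\0' is needed or written.
   Both arrays must be at least 5 bytes long and must not overlap. */
void copy5(char *dst, const char *src)
{
    memcpy(dst, src, 5);
}
```

To print such an unterminated array, give printf an explicit precision, as in printf("%.5s", foo), so that it reads at most 5 bytes.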
You'll end up with undefined behaviour.
As you say, the size of A will always be 5, so if you read 5 or more chars, scanf will try to write to memory that it's not supposed to modify.
And no, there's no reserved space/char for the \0 symbol.
Any string greater than 4 characters in length will cause scanf to write beyond the bounds of the array. The resulting behavior is undefined and, if you're lucky, will cause your program to crash.
If you're wondering why scanf doesn't stop writing strings that are too long to be stored in the array A, it's because there's no way for scanf to know sizeof(A) is 5. When you pass an array as the parameter to a C function, the array decays to a pointer pointing to the first element in the array. So, there's no way to query the size of the array within the function.
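The decay described above can be observed directly. In this sketch the function name size_seen_by_callee is hypothetical; inside it, sizeof applies to the pointer parameter, not to the caller's array.

```c
#include <stddef.h>

/* Hypothetical helper: the parameter is a pointer, so sizeof yields the
   size of a char *, whatever the size of the array the caller passed. */
size_t size_seen_by_callee(char *a)
{
    return sizeof a;
}
```

In the caller, sizeof A is 5 for char A[5], while size_seen_by_callee(A) returns sizeof(char *). This is why functions that write into a buffer need the size passed as a separate argument or encoded in the format, as with "%4s".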
In order to limit the number of characters read into the array use
scanf("%4s", A);
There isn't a character that is reserved, so you must be careful not to fill the entire array to the point it can't be null terminated. Char functions rely on the null terminator, and you will get disastrous results from them if you find yourself in the situation you describe.
Much C code that you'll see will use the 'n' derivatives of functions such as strncpy. From that man page you can read:
The strcpy() and strncpy() functions return s1. The stpcpy() and stpncpy() functions return a
pointer to the terminating `\0' character of s1. If stpncpy() does not terminate s1 with a NUL
character, it instead returns a pointer to s1[n] (which does not necessarily refer to a valid
memory location.)
strlen also relies on the null character to determine the length of a character buffer. If and when you're missing that character, you will get incorrect results.
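Note that strncpy itself does not write a terminator when the source is at least n characters long, so code that relies on it usually terminates the destination by hand. A minimal sketch, with the hypothetical helper name copy_truncate:

```c
#include <string.h>

/* Hypothetical helper: copies as much of src as fits into dest and
   always terminates dest. Longer input is silently truncated. */
void copy_truncate(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0) {
        return;                         /* no room even for '\0' */
    }
    strncpy(dest, src, dest_size - 1);  /* may leave dest unterminated */
    dest[dest_size - 1] = '\0';         /* so terminate it explicitly */
}
```

With char A[5], copying "abcdefgh" leaves "abcd" in A, and strlen(A) is then well defined.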
The null character is used to terminate a string stored in an array. It comes right after the last character of the string and marks where the string ends. Functions such as scanf append it automatically after the characters they read, so that printf, strlen, and similar functions can tell where the string ends; the array itself does not reserve a slot for it.
\0 is a terminator character, not an operator.
If the input is shorter than the array, \0 is stored right after the last character read.
If the input fills the array, \0 is written past the end of the array, which is a buffer overflow.

Resources