How to fix 'Segmentation fault (core dumped)' - c

I'm trying to solve my homework.
The question is asking to implement the function count(FILE *fp) that takes as input a text file and returns the number of sentences in the file.
But the output is showing an error 'Segmentation fault (core dumped)'.
int count(FILE *fp)
{
    int count=0;
    char word[256];
    while(fscanf(fp,"%s",word)!=EOF)
    {
        if(word[strlen(word)-1]=='.')
        {
            count+=1;
        }
    }
    return count;
}

(If you're on Linux, compile your program with the -fsanitize=address flag. If your program runs into a segmentation fault, it will tell you in excruciating detail what went wrong.)
If your file contains a "word" (a sequence of non-whitespace characters) of 256 or more characters (maybe the text is in German, or the text of Mary Poppins), fscanf will write all of those characters plus a terminating null byte into word, overflowing it. This can lead to segmentation faults.
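You can prevent that by limiting the number of characters fscanf will try to write. The field width does not include the terminating null byte, so a 256-byte buffer needs a width of 255:
fscanf(fp,"%255s",word);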
This may split a "word" into two or more parts, but only the last part with the dot will be counted (unless the word looks like "twohundred-and-fifty-five-characters.some-more").
Note that fscanf can return zero if no fields were stored, although this appears impossible when %s is used. In this case, you'll be applying strlen to uninitialized memory, which can lead to segmentation faults.
Also, if fscanf gives back an empty string (also appears impossible), strlen will return zero, and you'll try to read word[-1], that is, a buffer underrun. You should check the result of strlen before subtracting from it.
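Putting those points together, here is a minimal sketch of a safer version. The name count_sentences is hypothetical (it is not from the question), and the sketch keeps the question's own definition of a sentence: any whitespace-delimited token that ends in a dot.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts whitespace-delimited tokens that end in '.'. */
int count_sentences(FILE *fp)
{
    char word[256];
    int count = 0;

    /* %255s leaves room for the terminating null byte.
       A return value of 1 means a token was actually stored in word. */
    while (fscanf(fp, "%255s", word) == 1) {
        size_t len = strlen(word);

        /* Check the length before subtracting, so word[-1] is never read. */
        if (len > 0 && word[len - 1] == '.') {
            count += 1;
        }
    }
    return count;
}
```

The caller is assumed to pass a stream that was opened successfully, for example after checking that fopen did not return NULL.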

Here, if strlen(word) would be 256 or more (meaning the file contains a word of 256 or more characters), word overflows and you can get a segmentation fault.
You can find the solution in Bulletmagnet's answer

Related

I am getting a segmentation fault

I am getting a segmentation fault for some reason. I wrote this program that calculates the days between two dates, and I wanted "dd-mm-yyyy" to be represented as a string, and "dd2-mm2-yyyy2" should also be represented as a string. I thought I could solve it this way, but I am getting a segmentation fault. Can someone help me? What am I doing wrong?
This seems incorrect. argv[1] is your "day" string, which is 1 or 2 characters long, and you're indexing characters 3 and 4.
char monstr[3];
monstr[0] = argv[1][3];
monstr[1] = argv[1][4];
monstr[2] = '\0';
This should probably be:
char monstr[3];
monstr[0] = argv[2][0];
monstr[1] = argv[2][1];
monstr[2] = '\0';
Same with some other strings.
But, that said, I'm basing this on how you seem to be parsing input. If you want your input to be dd-mm-yyyy, then you're not getting input right. Instead, you should do something like this:
int dd, mm, yyyy;
sscanf(argv[1], "%d-%d-%d", &dd, &mm, &yyyy);
And same with the other string. And in that case, the previous thing I corrected doesn't need to be corrected.
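As a rough sketch of that sscanf approach with the return value checked, assuming the input really is meant to look like dd-mm-yyyy (the helper name parse_date is made up for illustration and does not validate ranges such as month 13):

```c
#include <stdio.h>

/* Hypothetical helper: parses "dd-mm-yyyy".
   Returns 1 if all three numbers were read, 0 otherwise. */
int parse_date(const char *text, int *dd, int *mm, int *yyyy)
{
    if (text == NULL) {
        return 0;
    }
    /* sscanf returns the number of fields it stored; all three are required. */
    return sscanf(text, "%d-%d-%d", dd, mm, yyyy) == 3;
}
```

Before calling something like parse_date(argv[1], &dd, &mm, &yyyy), it is worth checking argc first, because indexing argv past the arguments that were actually supplied is exactly the kind of access that can end in a segmentation fault.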
As a general piece of advice: segmentation faults happen because you're accessing memory that you aren't allowed to access. A common cause is going outside of array bounds or using invalid pointers. In your case, it seems like one of those two, and it comes from misuse of argv.

Stack Smashing and using malloc

I'm making a program that counts the number of words contained within a file. My code works for certain test cases with files that have fewer than a certain number of words/characters. But when I test it with, let's say, a word like:
"loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong", (this is not random--this is an actual test case I'm required to check), it gives me this error:
*** stack smashing detected ***: ./wcount.out terminated
Abort (core dumped)
I know what the error means and that I have to implement some sort of malloc line of code to be able to allocate the right amount of memory, but I can't figure out where in my function to put it or how to do it:
int NumberOfWords(char* argv[1]) {
    FILE* inFile = NULL;
    char temp_word[20]; <----------------------I think this is the problem
    int num_words_in_file;
    int words_read = 0;
    inFile = fopen(argv[1], "r");
    while (!feof(inFile)) {
        fscanf(inFile, "%s", temp_word);
        words_read++;
    }
    num_words_in_file = words_read;
    printf("There are %d word(s).\n", num_words_in_file - 1);
    fclose(inFile);
    return num_words_in_file;
}
As you've correctly identified by rendering your source code invalid (future tip: /* put your arrows in comments */), the problem is that temp_word only has enough room for 20 characters (one of which must be a terminal null character).
In addition, you should check the return value of fopen. I'll leave that as an exercise for you. I've answered this question in other questions (such as this one), but I don't think just shoving code into your face will help you.
In this case, I think it may pay to better analyse the problem you have, to see if you actually need to store words to count them. As we define a word (the kind read by scanf("%s", ...)) as a sequence of non-whitespace characters followed by a sequence of (zero or more) whitespace characters, we can see that a counting program such as yours needs to follow this procedure:
Read as much whitespace as possible
Read as much non-whitespace as possible
Increment the "word" counter if all was successful
You don't need to store the non-whitespace any more than you do the whitespace, because once you've read it you'll never revisit it. Thus you could write this as two loops embedded into one: one loop which reads as much whitespace as possible, another which reads non-whitespace, followed by your incrementation and then the outer loop repeats the whole lot... until EOF is reached...
This will be best achieved using the %*s directive, which tells scanf-related functions not to try to store the word. For example:
size_t word_count = 0;
/* With assignment suppressed, fscanf returns 0 after skipping a word
   and EOF once no more input is left. */
while (fscanf(inFile, "%*s") != EOF) {
    ++word_count;
}
You are limited by the size of your array. A simple solution would be to increase the size of your array. But you are always susceptible to stack smashing if someone enters a long word.
A word is delimited by spaces.
You can simply store a counter variable initialized to zero, and a variable that records the current char that you are looking at. Every time you read in a character using fgetc(inFile) that is a space, you increment the counter. (fgetc takes only the stream and returns the character as an int, or EOF at the end of input.)
In your current code you simply want to count the words. Therefore you are not interested in the words themselves. You can suppress the assignment with the optional * character:
fscanf(inFile, "%*s");
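For comparison, the character-by-character idea can be sketched as below. This is an illustrative variation rather than any answerer's exact code: the name count_words is hypothetical, and instead of counting spaces it counts each transition from whitespace to non-whitespace, so runs of spaces and a missing trailing newline do not skew the result. No word is ever stored, so word length cannot overflow anything.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts runs of non-whitespace characters in a stream. */
size_t count_words(FILE *fp)
{
    size_t words = 0;
    int in_word = 0;   /* 1 while the previous character was part of a word */
    int ch;            /* int, not char, so EOF can be told apart from data */

    while ((ch = fgetc(fp)) != EOF) {
        if (isspace(ch)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;   /* first character of a new word */
        }
    }
    return words;
}
```

The stream passed in is assumed to have been opened successfully; checking the result of fopen before the call remains the caller's job.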

Array memory allocation of strings

I have written a simple string program using array allocation. I have allocated a character array of 10 bytes, but when I give input, the program accepts an input string longer than 10 bytes. I get a segmentation fault only when I give an input string of some 21 chars. Why is there no segmentation fault when my input exceeds my allocated array limit?
Program:
#include <stdio.h>
#include <string.h>
void main() {
    char str[10];
    printf ("\n Enter the string: ");
    gets (str);
    printf ("\n The value of string=%s",str);
    int str_len;
    str_len = strlen (str);
    printf ("\n Length of String=%d\n",str_len);
}
Output:
Enter the string: n durga prasad
The value of string=n durga prasad
Length of String=14
As you can see, the string length is shown as 14, but I have allocated only 10 bytes. How can the length be more than my allocated size?
Please don't use gets(). It suffers from buffer overflow issues, which in turn invoke undefined behaviour.
Why there is no segmentation fault when my input exceed allocated my array limit?
Once your input exceeds the allocated array size (i.e., 9 valid characters + 1 null terminator), the very next access past the end of the array is illegal and invokes UB. A segmentation fault is one possible side effect of UB; it is not guaranteed.
Solution: Use fgets() instead.
When you declare an array, like char str[10];, your compiler won't always reserve precisely the number of bytes that you asked for. It often reserves more, usually a multiple of 8 on a 64-bit system; for instance, it might be 16 in your case.
So even if you asked for 10 bytes, you may be able to write a few more without crashing. But of course, it's strongly discouraged because, as you said, it can produce segmentation faults.
And, as the other answers from Sourav and Gopi say, using fgets instead of gets helps avoid this undefined behavior.
When you enter more characters than the array can hold, you have undefined behavior. Your array can hold 9 characters followed by a null terminator, so any deviation from this is UB.
Don't use gets(); use fgets() instead:
char a[10];
fgets(a,sizeof(a),stdin);
By using fgets() you avoid the buffer overflow and the undefined behavior that comes with it.
PS: fgets() stores the trailing newline character in the buffer when it fits.
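Since the stored newline is usually unwanted, one common way to drop it is sketched below. The wrapper name read_line is an assumption made for illustration; only fgets and strcspn are standard library calls.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size - 1 characters of one line into buf
   and removes the trailing newline, if one was stored.
   Returns buf on success, NULL on end of input or error. */
char *read_line(char *buf, int size, FILE *in)
{
    if (size <= 0 || fgets(buf, size, in) == NULL) {
        return NULL;
    }
    /* strcspn gives the index of the first '\n', or the string length
       if there is none, so this assignment is always in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

With char str[10]; and read_line(str, (int)sizeof str, stdin), the input "n durga prasad" is cut to its first 9 characters instead of overflowing the array, and the rest of the line stays in the stream for the next read.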
As you already know, your input causes a buffer overflow, so I'm not going to repeat the reason. Instead, I would like to answer the particular question,
"Why there is no segmentation fault when my input exceed allocated my array limit?"
Whether or not you get a segmentation fault comes down to something called undefined behaviour. Once you overrun the allocated memory boundary, a segmentation fault is not guaranteed. Rather, what you'll be facing is UB (as told earlier). Now, quoting the results of UB,
[...] programs invoking undefined behavior may compile and run, and produce correct results, or undetectably incorrect results, or any other behavior.
So, you are not guaranteed to get a segmentation fault immediately on accessing the very next memory location. The program may run perfectly well until it reaches memory that is actually inaccessible to the process, and then the SIGSEGV signal (11) is raised.
However, after running into UB, the output of any subsequent statement cannot be trusted. So, the output of strlen() is invalid here.

Printing char array in C causes segmentation fault

I did a lot of searching around for this, couldn't find any question with the same exact issue.
Here is my code:
void fun(char* name){
    printf("%s",name);
}
char name[6];
sscanf(input,"RECTANGLE_SEARCH(%6[A-Za-z0-9])",name);
printf("%s",name);
fun(name);
The name is grabbed from scanf, and it printed out fine at first. Then when fun is called, there is a segmentation fault when it tries to print out name. Why is this?
After looking in my scrying-glass, I have it:
Your scanf did overflow the buffer (it read more than 6 bytes including the terminator), with the ill effect slightly delayed by circumstance:
Nobody else relied on or re-used the corrupted memory at first, thus the first printf seems to work.
Somewhere after the first and before the second call to printf, the space you overwrote got re-used, so the string you read was no longer terminated before running into unallocated pages.
Thus, a segmentation fault at last.
Of course, your program was toast the moment it overflowed the buffer, not later when it finally crashed.
Moral: Never write to memory you have not dedicated for that.
Looking at your edit, the format %6[A-Za-z0-9] tries to read up to 6 characters excluding the terminator, not including it!
Since you're reading 6 characters, you have to declare name to be 7 characters, so there's room for the terminating null character:
char name[7];
Otherwise, you'll get a buffer overflow, and the consequences are undefined. Once you have undefined consequences, anything can happen, including 2 successful calls to printf() followed by a segfault when you call another function.
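A small sketch of that sizing rule, keeping the buffer one byte larger than the field width and checking what sscanf reports. The macro NAME_MAX_CHARS and the function parse_search_name are hypothetical names introduced only for this example.

```c
#include <stdio.h>

#define NAME_MAX_CHARS 6   /* must match the width in the format string below */

/* Hypothetical helper: extracts the name from "RECTANGLE_SEARCH(name)".
   name must have room for NAME_MAX_CHARS characters plus the terminator.
   Returns 1 if a name was stored, 0 otherwise. */
int parse_search_name(const char *input, char name[NAME_MAX_CHARS + 1])
{
    /* %6[...] stores up to 6 matching characters and then a null byte,
       so it can write 7 bytes in total. */
    return sscanf(input, "RECTANGLE_SEARCH(%6[A-Za-z0-9])", name) == 1;
}
```

A caller would declare char name[NAME_MAX_CHARS + 1]; and only print name when the function returns 1, since on failure the buffer may not have been written at all.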
You're probably walking off the end of the array with your printf statement. Printf uses the terminating null character '\0' to know where the end of the string is. Try allocating your array like this:
char name[6] = {'\0'};
This will allocate your array with every element initially set to the '\0' character, which means that as long as you don't overwrite the entire array with your scanf, printf will terminate before walking off the end.
Are you sure that name is zero byte terminated? scanf can overflow your buffer depending on how you are calling it.
If that happens then printf will read beyond the end of the array resulting in undefined behavior and probably a segmentation fault.

Character array in C(Puts vs printf)

I have some doubts regarding character arrays in C. I have a character array of size 1, and logic says that when I input more than 2 characters, I should get a segmentation fault. However, puts prints out the array properly, whereas printf prints some parts of the array along with garbage values. Why is this happening?
#include<stdio.h>
int main()
{
    int i;
    char A[1];
    printf("%d\n",(int)sizeof(A));
    gets(A);
    puts(A);
    for(i=0;i<8;i++)
    {
        printf("%c\n",A[i]);
    }
}
O/P:
1
abcdefg
abcdefg
a
f
g
To add to this, I have to type in several times the array size in characters to trigger a segmentation fault. Is it because of the SFP in the stack? The size of the SFP is 4 bytes. Please correct me if I'm wrong.
1
abcdefghijklmnop
abcdefghijklmnop
a
f
g
h
Segmentation fault
OK, others have explained it in high-level terms and from experience.
I would like to explain your situation at the assembly level.
Do you know why your first case ran without incident?
Because your buffer overflow did NOT reach memory outside what is mapped for your own process, so the OS didn't signal a segmentation fault to your process.
And why is your stack frame larger than your array's size?
Because of alignment. Many platforms require a stack frame to be aligned to x bytes for efficient addressing.
x is machine-dependent.
E.g., if x is 16 bytes:
char s[1] will round the stack frame up to 16 bytes;
char s[17] will round the stack frame up to 32 bytes.
Actually, even when you type only one character, it's still a buffer overflow, because gets() will also write a null character to the array.
A buffer overflow doesn't necessarily mean a segmentation fault. You can't rely on undefined behavior in any way. Possibly the program just had to write several times before it hit memory that it shouldn't write.
It seems that you already know that gets() is dangerous and should be avoided; I added this just in case.

Resources