My program reads a file specified in the argument and prints out each string and its frequency inside the file.
The program works for this file: http://www.cse.yorku.ca/course/3221/dataset1.txt
but not this file: http://www.cse.yorku.ca/course/3221/dataset2.txt.
It gives Segmentation fault (core dumped) error for the second file.
What could be wrong? Please help!
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct {
char word[101];
int freq;
} WordArray;
int main(int argc, char *argv[])
{
WordArray *array = malloc(sizeof(WordArray));
FILE *file;
int i = 0;
file = fopen(argv[1], "r");
char *str = (char*) malloc (108);
while(fgets(str, 100, file) != NULL)
{
int pos = 0;
char *word = malloc (100);
while (sscanf(str, "%s%n", word, &pos ) == 1)
{
int j;
for (j = 0; j < i; j++)
{
if (strcmp(array[j].word, word) == 0)
{
array[j].freq = array[j].freq + 1;
break;
}
}
if (j==i)
{
array = (WordArray *) realloc (array, sizeof(WordArray) * (i+1));
strcpy(array[i].word, word);
array[i].freq = 1;
i++;
}
str += pos;
}
}
fclose(file);
int k;
for (k=0; k<i; k++)
{
printf("%s %d\n", array[k].word, array[k].freq);
}
return 0;
}
Several problems:
You advance str inside the inner loop (str += pos) and never reset it to the start of the buffer. Each fgets call therefore writes further into the 108-byte allocation and eventually past its end, which is undefined behavior and the likely cause of the segmentation fault.
You never free word. It would be better to allocate it once outside the loop, or simply declare it as an array on the stack. The leak alone won't cause a crash unless your input is huge and you run out of memory.
You don't need to cast the result of malloc in C (the cast was needed before ANSI C, and still is in C++).
You should check the results of fopen, malloc, and realloc for safety.
I assume the first item is your problem.
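A minimal sketch of the first point, assuming a hypothetical helper named count_tokens that is not part of the original program: the buffer pointer stays fixed and a separate cursor walks through the line, so the next fgets call would still write to the start of the buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Counts whitespace-separated tokens in `line` (hypothetical helper).
   A separate cursor advances; the caller's pointer is never modified. */
int count_tokens(const char *line)
{
    char word[101];              /* room for a 100-character token plus '\0' */
    int pos = 0;
    int count = 0;
    const char *cursor = line;

    assert(line != NULL);
    /* %100s caps the copy at 100 chars; %n stores how many chars were consumed. */
    while (sscanf(cursor, "%100s%n", word, &pos) == 1) {
        count++;
        cursor += pos;           /* advance the cursor, not the buffer base */
    }
    return count;
}
```

In the original program the same idea would mean keeping str untouched and advancing a second pointer that is reset to str after every fgets.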
I am getting a segmentation fault for a C program that first reads the characters of a given file, identifies words, indexes words, and prints the first word. I have been troubleshooting for a long time but cannot seem to find what the error is.
#include <stdio.h>
#include <stdlib.h>
#include <cs50.h>
#include <string.h>
int main (int argc, char *argv[])
{
if (argc != 2)
{
printf("Usage: ./test15 text\n");
return 1;
}
char *file = argv[1];
FILE *ptr = fopen(file, "r");
char ch;
int i = 0;
int k = 0;
int j = 0;
char *text = malloc(sizeof(char));
string word[k];
while ((ch = fgetc(ptr)) != EOF)
{
text[i] = ch;
if (ch == ' ')
{
for (int l = j; l < i; l++)
{
strcat(word[k], &text[l]);
}
k++;
j = i;
}
i++;
}
printf("%s\n", word[0]);
return 0;
}
Just like @Zen said, a segmentation fault occurs when you access memory that you are not allowed to access or that was never allocated.
Your program fails right after the first iteration: i becomes 1, and text[1] is out of bounds because text was allocated with room for a single character only:
char *text = malloc(sizeof(char));
I have not checked the rest of your algorithm yet, so this is only an initial observation. If any errors still pop up, feel free to post them in this thread.
Best.
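One way to avoid the fixed one-character buffer is to grow it as characters arrive. The sketch below assumes a hypothetical helper named read_all (not from the original post) and a doubling strategy; it also reads into an int so that EOF can be told apart from a real character.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads the rest of `fp` into a heap buffer that doubles when full
   (hypothetical helper). Returns a NUL-terminated string the caller
   must free, or NULL on allocation failure. */
char *read_all(FILE *fp, size_t *out_len)
{
    size_t cap = 16, len = 0;
    char *text;
    int ch;                          /* int, so EOF differs from every char */

    assert(fp != NULL);
    text = malloc(cap);
    if (text == NULL)
        return NULL;
    while ((ch = fgetc(fp)) != EOF) {
        if (len + 1 >= cap) {        /* keep one byte free for the terminator */
            char *tmp = realloc(text, cap * 2);
            if (tmp == NULL) {
                free(text);
                return NULL;
            }
            text = tmp;
            cap *= 2;
        }
        text[len++] = (char)ch;
    }
    text[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return text;
}
```

Doubling keeps the number of realloc calls logarithmic in the file size; growing by one byte per character would also work but calls realloc far more often.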
I have a really quick question. Why do I get "heap corruption detected" when I try to deallocate the array in the void translateWord() function?
I tried to deallocate line by line in a for loop, but it doesn't seem to work. I thought that if I use that function more than once, and the function allocates memory every time, I should deallocate it at the end of the function. Any idea?
#define _CRT_SECURE_NO_WARNINGS
#include<stdio.h>
#include <conio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
void translateWord(char word[], FILE *point_out, FILE *point_1)
{
rewind(point_1);
char ch;
int gasit = 0;
int lines = count_lines(point_1);
int i = 0;
char *cp;
char *bp;
char line[255];
char **array = (char**)malloc(sizeof(char*)*2*lines);
rewind(point_1);
while (fgets(line, sizeof(line), point_1) != NULL) {
bp = line;
while (1) {
cp = strtok(bp, "=\n");
bp = NULL;
if (cp == NULL)
break;
array[i] = (char*)malloc(sizeof(char)*strlen(cp));
strcpy(array[i++], cp);
}
}
gasit = cuvant_gasit(word, array, lines, point_out, gasit);
if (gasit == 0)
{
fprintf(point_out, "<<%s>>", word);
}
for (int k = 0; k < 2 * lines; k++)
{
free(array[k]);
}
free(array);
}
There is a problem in translateWord:
array[i] = (char*)malloc(sizeof(char)*strlen(cp));
strcpy(array[i++], cp);
The first line must be array[i] = malloc(strlen(cp) + 1); otherwise the buffer is one char too short and strcpy writes the terminating null character past its end.
Note that sizeof(char) is 1 by definition.
Why not just use strdup instead of malloc followed by strcpy? It is available on POSIX systems and in C23 (MSVC spells it _strdup); replace those two lines with array[i++] = strdup(cp);
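For toolchains without strdup, the malloc-plus-copy pattern can be wrapped once. This is a sketch with a hypothetical name, copy_string; the only point it illustrates is the + 1 for the terminator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of `src`, or NULL if allocation fails
   (hypothetical helper). strlen() does not count the terminator,
   hence the + 1. */
char *copy_string(const char *src)
{
    assert(src != NULL);
    size_t size = strlen(src) + 1;
    char *dst = malloc(size);
    if (dst != NULL)
        memcpy(dst, src, size);      /* copies the characters and the '\0' */
    return dst;
}
```

With such a helper, the two lines in translateWord would collapse to array[i++] = copy_string(cp); followed by a NULL check.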
I am trying to input a list of strings. The list may vary in length, so I try to use dynamic allocation. Each string has 20 chars max. The list ends with a single dot. I have been working on it for some time now, but I keep getting a segmentation fault and I am not sure why. I guess the error is in my use of realloc / malloc, but I just cannot see what exactly I am doing wrong. The code block is part of a larger program, but I singled out this block and am trying to make it work. It works fine for a "list" of one word followed by a dot. As soon as I try to read a list of two or more strings, I get the segmentation error.
Any help would be great, thanks!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **resizeDynamicArray(char **arr, int newSize){
char **ptr = realloc(arr, newSize*sizeof(char*));
if (ptr==NULL) {
printf("Error: memory allocation failed.\n");
exit(-1);
}
return ptr;
}
int input (char ***seq){
int len = 0;
char string[21];
*seq=NULL;
do{
scanf("%s", string);
if (string[0] != '.'){
*seq = resizeDynamicArray(*seq, (len+1));
*seq[len] = malloc(sizeof(char[21]));
strcpy((*seq)[len], string);
len++;
}
} while (string[0] != '.');
return len;
}
int main(int argc, char *argv[]) {
int length;
char **words;
length = input(&words);
for (int i=0; i<length; ++i){
printf("%s\n", words[i]);
}
for (int i=0; i<length; ++i){
free(words[i]);
}
free(words);
return 0;
}
Change the following line:
*seq[len] = malloc(sizeof(char[21]));
to:
(*seq)[len] = malloc(sizeof(char[21]));
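The [] operator binds more tightly than unary *, so *seq[len] is parsed as *(seq[len]), which indexes the pointer seq itself instead of the array it points to. With len equal to 0 the two expressions happen to agree, which is why a one-word list works. You have to dereference seq first and only then index into the top-level array.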
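The precedence rule is easier to see in a small function. The sketch below assumes a hypothetical helper named append_word that takes the same char *** out-parameter as input(); every access to the caller's array goes through (*seq)[...].

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Appends a copy of `s` to the array *seq, which currently holds *len
   strings (hypothetical helper). Returns 0 on success, -1 on allocation
   failure, in which case no string is added. */
int append_word(char ***seq, int *len, const char *s)
{
    assert(seq != NULL && len != NULL && s != NULL);
    char **grown = realloc(*seq, (size_t)(*len + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    *seq = grown;
    /* (*seq)[i] indexes the caller's array; *seq[i] would mean *(seq[i]). */
    (*seq)[*len] = malloc(strlen(s) + 1);
    if ((*seq)[*len] == NULL)
        return -1;
    strcpy((*seq)[*len], s);
    (*len)++;
    return 0;
}
```

A caller would start with char **words = NULL; int n = 0; and call append_word(&words, &n, "one"); repeatedly, then free each words[i] and finally words.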
I've been writing an MTF (move-to-front) encoder in C and I keep running into a realloc() error regardless of what I do. I've checked for an error in my logic (and there may be one) by using print statements to see whether I'm overstepping the bounds of my currently malloc'd array (adding a string past my original array size), and that doesn't seem to be the issue. I've used GDB and Valgrind: GDB gives me a cryptic message, while Valgrind runs into a segmentation fault. This is my first time using dynamic memory and I'm pretty confused as to what the problem is. Below are my code and the GDB error:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int count = 0;
move_to_front(int index, char** words){
int i;
char *t = words[index];
for(i = index; i>1; i--){
words[i] = words[i-1];
}
words[1] = t;
}
char** reallocate_words(char** words, int* words_size_pointer){
printf("We're entering here\n");
printf("%d", *words_size_pointer);
int temp = *words_size_pointer;
char** tempor;
temp = temp*2;
printf("%d", temp);
tempor = (char**) realloc(words, temp);
int i = *words_size_pointer;
for(i; i<temp; i++){
tempor[i] = (char*) malloc(120);
}
words_size_pointer = &temp;
return tempor;
}
void encode_word(int* words_size_pointer, FILE *f, char* word, char** words){
if(count == 0){
words[1] = word;
fputs(words[1], f);
count++;
}
int i;
for(i=0; i<=count; i++){
if(strcmp(words[i], word) == 0){
break;
}
}
if(i>=(*words_size_pointer)){
printf("%d\n", i);
words = reallocate_words(words, words_size_pointer);
words[count+1] = word;
count++;
fputc(count+128, f);
fputs(words[count], f);
move_to_front(count, words);
}
if(i>count){
words[count+1] = word;
count++;
fputc(count+128, f);
fputs(words[count], f);
move_to_front(count, words);
}
else{
fputc(i+128, f);
move_to_front(i, words);
}
}
void sep_words(char** words, char *line, int* words_size_pointer, FILE *f){
char* x;
int i = 0;
x = strtok(line, " ");
while(x != NULL){
encode_word(words_size_pointer,f, x, words);
x = strtok(NULL, " ");
}
}
void readline(FILE *f_two, FILE *f, char** words, int* words_size_pointer){
char *line;
size_t len = 0;
ssize_t temp;
int count;
do{
temp = getline(&line,&len,f);
printf("%s", line);
if(temp!= -1){
sep_words(words, line, words_size_pointer, f_two);
}
}while(temp!=-1);
}
int main(int argc, char *argv[]){
int x;
int i;
int j;
x = strlen(argv[1]);
char fi[x];
char mtf[3] = "mtf";
FILE *f;
FILE *f_two;
for(j = 0; j<(x-3); j++){
fi[j] = argv[1][j];
}
strcat(fi, mtf);
f = fopen(argv[1], "r");
f_two = fopen(fi, "w");
fputc(0xFA, f_two);
fputc(0XCE, f_two);
fputc(0XFA, f_two);
fputc(0XDF, f_two);
if(f == NULL){
return 1;
}
char** words;
words = (char **) malloc(20);
for(i = 0; i<20; i++){
words[i] = (char*) malloc(120);
}
int words_size = 20;
int* words_size_pointer = &words_size;
readline(f_two, f, words, words_size_pointer);
return 0;
}
And as for the GDB error:
*** Error in `/file_loc/mtfcoding2': realloc(): invalid next size: 0x0000000000603490 ***
2040 \\This is due to print statements within my function.
Program received signal SIGABRT, Aborted.
0x00007ffff7a4acc9 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
Thank you for your time! :)
malloc and realloc require the number of bytes as argument. However you are writing code like:
char** words;
words = (char **) malloc(20);
for(i = 0; i<20; i++){
words[i] = (char*) malloc(120);
You allocate 20 bytes but then write 20 pointers into that block (80 bytes with 4-byte pointers, 160 bytes with 8-byte pointers). To fix this, compute the number of bytes required to store the 20 pointers. A safe idiom, commonly recommended on SO, is:
words = malloc(20 * sizeof *words);
You have the same problem in your realloc call.
This line has no effect: words_size_pointer = &temp; . Perhaps you meant *words_size_pointer = temp; . Make sure you clearly understand the difference between those two lines.
NB. There may be other errors.
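Both points (byte counts and the out-parameter) can be combined in one sketch. The helper name grow_words is hypothetical, and leaving the new slots as NULL rather than pre-allocating 120-byte buffers is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Doubles the capacity of a pointer array (hypothetical helper).
   *size is an element count, so the byte count passed to realloc is
   count * sizeof(element). Returns the new array, or NULL on failure,
   in which case the old array is still valid. */
char **grow_words(char **words, int *size)
{
    assert(size != NULL && *size > 0);
    int new_size = *size * 2;
    char **tmp = realloc(words, (size_t)new_size * sizeof *tmp);
    if (tmp == NULL)
        return NULL;
    for (int i = *size; i < new_size; i++)
        tmp[i] = NULL;               /* new slots start empty */
    *size = new_size;                /* write through the pointer so the caller sees it */
    return tmp;
}
```

Assigning the result to a temporary first means the original array is not lost if realloc fails.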
Well, for starters, take a close look at move_to_front. As written it saves words[index] in t, moves each entry from words[1] through words[index-1] up by one slot, and then stores t in words[1], so it rotates the array without losing any pointer. If you find that hard to verify, the same rotation can be written with explicit swaps:
for(i = index; i > 1; i--){
char* tmp = words[i];
words[i] = words[i-1];
words[i-1] = tmp;
}
Either way, every pointer stays in the array. The pointers that do get dropped are the 120-byte buffers allocated in main, which encode_word overwrites with words[count+1] = word. Also, you seem to like to start your words array at 1 instead of 0. Those off-by-one errors are gonna hurt ya too.
Also (as stated in my earlier comment), words_size_pointer = &temp; isn't quite right. Do this instead *words_size_pointer = temp;. The first way is only a local pointer re-assignment, but you want the change to be reflected in the caller's scope, so you must dereference the pointer and modify it.
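The rotation that move_to_front performs can be checked in isolation. The sketch below uses a hypothetical three-argument variant (array, front position, index) so that it works for both 0-based and 1-based layouts; it only permutes the array, so no pointer is lost.

```c
#include <assert.h>
#include <stddef.h>

/* Moves words[index] to position `front` and shifts the entries in
   between one slot toward the back (hypothetical variant of the
   asker's function). */
void move_to_front(char **words, size_t front, size_t index)
{
    assert(words != NULL && front <= index);
    char *moved = words[index];      /* save the entry being moved */
    for (size_t i = index; i > front; i--)
        words[i] = words[i - 1];     /* shift neighbours back by one */
    words[front] = moved;
}
```

For an array holding a, b, c, d, calling move_to_front(words, 0, 2) leaves c, a, b, d.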
It seems to be caused by your call to getline in your readline function.
char *line;
size_t len = 0;
...
temp = getline(&line,&len,f);
getline requires line to be NULL (in which case the value of len is ignored) or a pointer returned by malloc, calloc, or realloc. If line is not NULL and len isn't large enough, getline resizes the buffer by calling realloc. This is the crucial point: your line is uninitialized, so it holds some random address that malloc never returned, and getline passes that address to realloc because a length of 0 is too small. Initialize it with char *line = NULL; to fix this.
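A minimal sketch of the usual getline loop, assuming a POSIX system and a hypothetical helper named longest_line: the buffer pointer starts as NULL, getline allocates and grows it, and it is freed once after the loop.

```c
#define _POSIX_C_SOURCE 200809L      /* exposes getline() in <stdio.h> */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Returns the length of the longest line in `fp`, newline included
   (hypothetical helper). */
size_t longest_line(FILE *fp)
{
    char *line = NULL;               /* NULL and 0: getline allocates the buffer */
    size_t cap = 0;
    ssize_t n;
    size_t longest = 0;

    assert(fp != NULL);
    while ((n = getline(&line, &cap, fp)) != -1) {
        if ((size_t)n > longest)
            longest = (size_t)n;
    }
    free(line);                      /* released once, after the loop */
    return longest;
}
```

Because getline reuses the same buffer on every call, any pointer into line (for example a token returned by strtok) becomes stale after the next call; copy the text if it has to outlive the iteration.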
You also have a buffer overflow here:
char mtf[3] = "mtf";
...
for(j = 0; j<(x-3); j++){
fi[j] = argv[1][j];
}
strcat(fi, mtf);
Because you made mtf only 3 bytes in size, it may or may not be followed immediately by a null terminator. When strcat is called, if mtf isn't followed immediately by a 0 byte in memory, you end up with something like sample.mtf\x1b\x01X as the output filename, assuming you don't write too far beyond the end of the fi array to crash the program with a SIGSEGV (segfault). Any of the following will correct it:
char mtf[4] = "mtf";
//OR
char mtf[] = "mtf";
//OR
const char *mtf = "mtf";
However, because fi has only x bytes, the null terminator that strcat adds is written to fi[x]. This is a problem because char fi[x]; only gives you array indices 0 to x - 1. Fix this part by declaring char fi[x + 1]; instead. Also note that the copy loop never terminates fi, so set fi[x - 3] = '\0' before calling strcat; otherwise strcat searches uninitialized memory for the end of the string.
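Putting the sizing rules together, the output name could be built as follows. This is a sketch with a hypothetical helper, make_output_name, which assumes the input name ends in a three-character extension, as the original loop does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of `in` with its last three characters replaced
   by "mtf" (hypothetical helper). Returns NULL if `in` is shorter than
   three characters or allocation fails. The caller frees the result. */
char *make_output_name(const char *in)
{
    assert(in != NULL);
    size_t len = strlen(in);
    if (len < 3)
        return NULL;
    char *out = malloc(len + 1);         /* + 1 for the terminator */
    if (out == NULL)
        return NULL;
    memcpy(out, in, len - 3);            /* everything before the extension */
    memcpy(out + len - 3, "mtf", 4);     /* 'm', 't', 'f' and the '\0' */
    return out;
}
```

Copying four bytes from the literal "mtf" includes its terminator, so the result is always a valid string without a separate strcat.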
The program runs fine except for the last free, which results in the program freezing.
When I comment out the last 'free' it runs fine.
The program gets all substrings from a string and returns it.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char** getPrefixes(char* invoer);
int main()
{
char buffer[100];
char *input;
char **prefixes;
int counter = 0;
puts("Give string.");
fgets(buffer, 99, stdin);
fflush(stdin);
if (buffer[strlen(buffer) - 1] == '\n')
buffer[strlen(buffer) - 1] = '\0';
input= (char*)malloc(strlen(buffer) + 1);
if (input == NULL)
{
puts("Error allocating memory.");
return;
}
strcpy(input, buffer);
prefixes = (char**) getPrefixes(input);
for (counter = strlen(input); counter > 0; counter--)
{
puts(prefixes[counter]);
free(prefixes[counter]);
}
free(input);
free(prefixes);
}
char** getPrefixes(char* input)
{
char** prefixes;
int counter;
prefixes = malloc(strlen(input) * sizeof(char*));
if (prefixes == NULL)
{
puts("ELM.");
return NULL;
}
for (counter= strlen(input); counter> 0; counter--)
{
prefixes[counter] = (char*)malloc(counter + 1);
strcpy(prefixes[counter], input);
input++;
}
return prefixes;
}
Thanks in advance!
The reason for your program freezing is simple: undefined behaviour plus invalid return values. Your main function is declared to return int, yet its error path uses a bare return; with no value. Return a nonzero value there and add return 0; at the end. If you type echo $? in your console after running the compiled binary, you see the program's exit code. Anything other than 0 means trouble, and if main does not return a proper int, that code is meaningless.
Next:
The undefined behaviour occurs in a couple of places, for example right here:
prefixes = malloc(strlen(input) * sizeof(char*));
//allocate strlen(input) pointers, if input is 10 long => valid indexes == 0-9
for (counter= strlen(input); counter> 0; counter--)
{
prefixes[counter] = (char*)malloc(counter + 1);//first iteration writes prefixes[10] ==> out of bounds
strcpy(prefixes[counter], input);//copies the rest of input, terminator included
input++;
}
Then, when freeing the memory, you start at the same out-of-bounds index and never reach the pointer at offset 0, so by the time free(prefixes) runs the heap is already corrupted:
for (counter = strlen(input); counter > 0; counter--)
{//again 10 --> valid offsets are 9 -> 0
puts(prefixes[counter]);
free(prefixes[counter]);
}
free(prefixes);//wrong
Again, valid indexes are 0 and up. Your loop condition (counter > 0) means the loop stops as soon as counter reaches 0, so at no point do you use or free the first pointer in the array, the one at index/offset 0.
Write your loops like everyone would:
size_t len = strlen(input);
for (size_t i = 0; i < len; ++i)
{
    printf("%zu\n", i);//prints 0-9... 10 lines, all valid indexes
}
Change your loops, and make sure you're only using the valid offsets, and you should be good to go. Using strncpy, you can build the prefixes like this:
for (int i=0;i<len;++i)
{
    //sizeof(char) is guaranteed to be 1, so malloc(i+2) works too
    //I tend to use calloc to set all chars to 0 already, and ensure zero-termination
    prefixes[i] = malloc((i+2)*sizeof(*prefixes[i]));
    strncpy(prefixes[i], input, i+1);//copies the first i+1 chars (1 to 10)
    prefixes[i][i+1] = '\0';//strncpy does not terminate when the source is longer
}
If we apply this to your code, and re-write it like so:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char** getPrefixes(char* input);
int main( void )
{
    char *input;
    char **prefixes;
    int counter, i;
    input= calloc(50,1);
    if (input == NULL)
    {
        puts("Error allocating memory.");
        return 1;//main returns int, so report the failure
    }
    strcpy(input, "teststring");
    prefixes = getPrefixes(input);
    counter = strlen(input);
    for (i=0; i<counter;++i)
    {//valid offsets only: 0 to counter-1
        puts(prefixes[i]);
        free(prefixes[i]);
    }
    free(input);
    free(prefixes);
    return 0;
}
char** getPrefixes(char* input)
{
    int i, counter = strlen(input);
    char** prefixes = malloc(counter * sizeof *prefixes);
    if (prefixes == NULL)
    {
        puts("ELM.");
        return NULL;
    }
    for (i=0; i<counter; ++i)
    {
        //calloc zeroes the i+2 bytes, so the i+1 copied chars stay terminated
        prefixes[i] = calloc(i + 2,sizeof *prefixes[i]);
        strncpy(prefixes[i], input, i+1);
    }
    return prefixes;
}
The output we get is:
t
te
tes
test
tests
testst
teststr
teststri
teststrin
teststring
As you can see for yourself on this codepad.
Allocating memory for a pointer to pointer:
char** cArray = (char**)malloc(N*sizeof(char*));
for(i=0;i<N;i++)
    cArray[i] = (char*)malloc(M*sizeof(char));
De-allocating memory, in reverse order:
for(i=0;i<N;i++)
    free(cArray[i]);
free(cArray);
I hope this gives you a little insight on what's wrong.
You are calling strcpy with prefixes[counter] as the destination. However, prefixes holds only strlen(input) pointers (4 or 8 bytes each, depending on sizeof(char*)), so its valid indexes run from 0 to strlen(input) - 1.
On the first iteration counter equals strlen(input), so the pointer returned by malloc is stored one element past the end of the array.
Doing this corrupts the heap, which might explain why the program is freezing.