Issue with scanning in words from a file (in C)

I am currently attempting to write a function which scans words in from a dictionary file. It works perfectly in the debugger but when I compile and run it normally, it crashes after five words scanned. Here is my code:
char** readDictionary(FILE *ifp, int size){
int i;
char** dictionary;
char buffer[21];
dictionary = malloc(sizeof(char*) * size);
if(dictionary == NULL){
printf("dictionary allocation ERROR");
return NULL;
}
for(i=0; i<size; i++){
fscanf(ifp, "%s", buffer);
printf("%s\n", buffer); //debugging statement
dictionary[i] = malloc(sizeof(char) * strlen(buffer));
strcpy(dictionary[i], buffer);
}
return dictionary;
}
In the debugger, all words are scanned in properly. When I run without the debugger, I crash after the fifth word.
here is a list of my first few words (again, it crashes after aardvarks):
aahing
aahs
aals
aardvark
aardvarks
aardwolf
aardwolves
I am not sure why this could be happening. Please help.

You are not allocating space for the terminating null byte. Change
dictionary[i] = malloc(sizeof(char) * strlen(buffer));
strcpy(dictionary[i], buffer);
to
size_t length = strlen(buffer);
dictionary[i] = malloc(length + 1);
if (dictionary[i] != NULL)
memcpy(dictionary[i], buffer, length + 1);
Or even better
dictionary[i] = strdup(buffer);
Also, check that fscanf() didn't fail: on success it will return 1 in your case.
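The points above can be combined into one sketch. The function name read_words, the MAX_WORD limit (chosen to match the 21-byte buffer in the question) and the cleanup-on-failure policy are assumptions made for illustration, not part of the original code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 20  /* assumed longest word, matching the 21-byte buffer */

/* Reads up to `size` words from `ifp` and stores the number actually read
 * in `*count`. Returns NULL if an allocation fails. */
char **read_words(FILE *ifp, size_t size, size_t *count)
{
    char buffer[MAX_WORD + 1];
    char **words = malloc(size * sizeof *words);
    *count = 0;
    if (words == NULL)
        return NULL;

    /* "%20s" caps each read at MAX_WORD characters; == 1 means a word was stored */
    while (*count < size && fscanf(ifp, "%20s", buffer) == 1) {
        size_t length = strlen(buffer);
        char *copy = malloc(length + 1);   /* +1 for the terminating '\0' */
        if (copy == NULL) {
            while (*count > 0)
                free(words[--*count]);
            free(words);
            return NULL;
        }
        memcpy(copy, buffer, length + 1);
        words[(*count)++] = copy;
    }
    return words;
}
```

Because the loop stops as soon as fscanf() reports anything other than 1, a short or malformed file yields fewer words instead of copies of a stale buffer.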

Related

_platform_memmove$VARIANT$Unknown () from /usr/lib/system/libsystem_platform.dylib changing content of character pointer

I am trying to write a program that accepts a user string and then reverses the order of the words in the string and prints it. My code works for most tries, however, it seg faults on certain occasions, for the same input.
On stepping through I found that the content of character pointers words[0] and words[1] are getting changed to garbage values/Null.
I set a watchpoint on the words[1] and words[0] character pointers that are getting corrupted (incorrect address), and can see that the content of these pointers changes at '_platform_memmove$VARIANT$Unknown () from /usr/lib/system/libsystem_platform.dylib'. I can't figure out how this gets invoked or what is causing the content of the pointers to be overwritten.
I have posted my code below and would like any assistance in figuring out where I am going wrong. I am sorry about the indentation issues.
char* reverseWords(char *s) {
char** words = NULL;
int word_count = 0;
/*Create an array of all the words that appear in the string*/
const char *delim = " ";
char *token;
token = strtok(s, delim);
while(token != NULL){
word_count++;
words = realloc(words, word_count * sizeof(char*));
if(words == NULL){
printf("malloc failed\n");
exit(0);
}
words[word_count - 1] = strdup(token);
token = strtok(NULL, delim);
}
/*Traverse the list backwards and check the words*/
int count = word_count;
char *return_string = malloc(strlen(s) + 1);
if(return_string == NULL){
printf("malloc failed\n");
exit(0);
}
int offset = 0;
while(count > 0){
memcpy((char*)return_string + offset, words[count - 1], strlen(words[count - 1]));
free(words[count - 1]);
offset += strlen(words[count - 1]);
if(count != 1){
return_string[offset] = ' ';
offset++;
}
else {
return_string[offset] = '\0';
}
count--;
}
printf("%s\n",return_string);
free(words);
return return_string;
}
int main(){
char *string = malloc(1000);
if(string == NULL){
printf("malloc failed\n");
exit(0);
}
fgets(string, 1000, stdin);
string[strlen(string)] = '\0';
reverseWords(string);
return 0;
}
The problem is that the line
char *return_string = malloc(strlen(s) + 1);
doesn't allocate nearly enough memory to hold the output. For example, if the input string is "Hello world", you would expect strlen(s) to be 11. However, strlen(s) will actually return 5.
Why? Because strtok modifies the input line. Every time you call strtok, it finds the first delimiter and replaces it with a NUL character. So after the first while loop, the input string looks like this
Hello\0world\0
and calling strlen on that string will return 5.
So, return_string is too small, and one or more memcpy calls will write past the end of the allocation, resulting in undefined behavior, e.g. a segmentation fault. The reason for the error message about memmove: the memcpy function internally invokes memmove as needed.
As @WhozCraig pointed out in the comments, you also need to make sure that you don't access memory after a call to free, so you need to swap these two lines
free(words[count - 1]);
offset += strlen(words[count - 1]);
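Both fixes can be seen together in the following sketch. It is not the asker's function: the name reverse_words, the choice to tokenize a private copy, and the decision to treat the newline left by fgets as a delimiter are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string with the space-separated words of `s`
 * in reverse order, or NULL on allocation failure. `s` is left untouched. */
char *reverse_words(const char *s)
{
    size_t len = strlen(s);            /* measured before any tokenizing */
    char *copy = malloc(len + 1);      /* strtok will modify this copy */
    char *out = malloc(len + 1);       /* output is never longer than the input */
    char **words = malloc((len / 2 + 1) * sizeof *words); /* upper bound on word count */
    if (copy == NULL || out == NULL || words == NULL) {
        free(copy);
        free(out);
        free(words);
        return NULL;
    }
    memcpy(copy, s, len + 1);

    size_t count = 0;
    for (char *tok = strtok(copy, " \n"); tok != NULL; tok = strtok(NULL, " \n"))
        words[count++] = tok;          /* pointers into `copy`, no strdup needed */

    size_t offset = 0;
    while (count > 0) {
        size_t n = strlen(words[--count]);
        memcpy(out + offset, words[count], n);
        offset += n;
        if (count > 0)
            out[offset++] = ' ';
    }
    out[offset] = '\0';

    free(words);                       /* released only after the last use */
    free(copy);
    return out;
}
```

Measuring the length before strtok runs is what keeps the output buffer large enough; nothing is read after it has been freed.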

Cannot get realloc() to work

FILE *file;
file = fopen(argv[1], "r");
char *match = argv[2];
if (file == NULL) {
printf("File does not exist\n");
return EXIT_FAILURE;
}
int numWords = 0, memLimit = 20;
char** words = (char**) calloc(memLimit, sizeof(char));
printf("Allocated initial array of 20 character pointers.\n");
char string[20];
while (fscanf(file, "%[a-zA-Z]%*[^a-zA-Z]", string) != EOF) {
words[numWords] = malloc(strlen(string) + 1 * sizeof(char));
strcpy(words[numWords], string);
printf("Words: %s\n", words[numWords]);
numWords++; /*keep track of indexes, to realloc*/
if (numWords == memLimit) {
memLimit = 2 * memLimit;
words = (char**) realloc(words, memLimit * sizeof(char*)); /*Fails here*/
printf("Reallocated array of %d character pointers.\n", memLimit);
}
}
The code should open and read a file containing words, punctuation, spaces and so on, and store each word as a string, but after 20 words it throws an error. I can't seem to get realloc() to work here, which I suspect is the problem. The array is dynamically allocated with room for 20 char pointers; when that limit is reached, it should be reallocated to double the size. How can I get around this?
Two notes. First, you shouldn't ever cast the return value of calloc/malloc/realloc. In C the void* result converts implicitly, and the cast can hide a missing #include <stdlib.h>.
Second, as others have pointed out in comments, the first calloc statement uses sizeof(char) and not sizeof(char*) like it should.
words is a pointer to a pointer. The idea is to allocate an array of pointers.
The below is wrong as it allocates for memLimit characters rather than memLimit pointers.
This is the main issue
char** words = (char**) calloc(memLimit, sizeof(char)); // bad
So use an easy idiom: allocate memLimit groups of whatever words points to. It is easier to write, read and maintain.
char** words = calloc(memLimit, sizeof *words);
Avoid the while (scanf() != EOF) hole. Recall that the scanf() family can return several different results: the count of successfully scanned fields, or EOF. For a single field that is typically one of at least three options (1, 0 or EOF). So do not test for the one result you do not want; test for the one result you do want.
// while (fscanf(file, "%[a-zA-Z]%*[^a-zA-Z]", string) != EOF) {
while (fscanf(file, "%[a-zA-Z]%*[^a-zA-Z]", string) == 1) {
The above example may not ever return 0, but the one below easily could.
int d;
while (fscanf(file, "%d", &d) == 1) {
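To make the difference concrete, here is a small sketch built around that loop. The function name count_ints is hypothetical and exists only for this illustration.

```c
#include <stdio.h>

/* Counts the integers at the start of a stream, stopping at the first
 * token that is not a number (fscanf returns 0) or at end of file (EOF). */
size_t count_ints(FILE *file)
{
    size_t count = 0;
    int d;
    /* With `!= EOF` this loop would never end on input such as "1 2 x 3":
     * fscanf would keep returning 0 without consuming the 'x'. */
    while (fscanf(file, "%d", &d) == 1)
        count++;
    return count;
}
```

On the input "1 2 x 3" the function stops at the 'x' and reports two integers.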
@Enzo Ferber rightly suggests using "%s". I further recommend following the above idiom and restricting the input width to 1 less than the size of the buffer.
char string[20];
while (fscanf(file, "%19s", string) == 1) {
Suggest the habit of checking allocation result.
// better to use `size_t` rather than `int `for array sizes.
size_t newLimit = 2u * memLimit;
char** newptr = realloc(words, newLimit * sizeof *newptr);
if (newptr == NULL) {
puts("Out-of-memory");
// Code still can use old `words` pointer of size `memLimit * sizeof *words`
return -1;
}
memLimit = newLimit;
words = newptr;
}
Errors
Don't cast malloc/calloc returns. There's no need for it.
Your first sizeof is wrong. It should be sizeof(char*).
That scanf() format string is more complicated than it needs to be. %s does the job just fine.
Code
The following code worked for me (printed one word per line):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
FILE *file;
file = fopen(argv[1], "r");
char *match = argv[2];
if (file == NULL) {
printf("File does not exist\n");
return EXIT_FAILURE;
}
int numWords = 0, memLimit = 20;
char **words = calloc(memLimit, sizeof(char*));
printf("Allocated initial array of 20 character pointers.\n");
char string[20];
while (fscanf(file, "%19s", string) == 1) { /* width limit keeps input inside string[20] */
words[numWords] =
malloc(strlen(string) + 1);
strcpy(words[numWords], string);
printf("Words: %s\n", words[numWords]);
numWords++; /*keep track of indexes, to realloc */
if (numWords == memLimit) {
memLimit = 2 * memLimit;
words = realloc(words, memLimit * sizeof(char *));
printf
("Reallocated array of %d character pointers.\n",
memLimit);
}
}
}
Called with ./realloc realloc.c
Hope it helps.
Your first allocation is the problem. You allocate 20 chars and treat them as 20 char pointers. You overrun the allocated buffer and corrupt your memory.
The second allocation fails because the heap is corrupted.
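One way to keep the element size and the growth logic in a single place is a small helper, sketched below. The struct word_list type, the function names and the starting capacity of 20 are assumptions for illustration; they do not appear in the original code.

```c
#include <stdlib.h>
#include <string.h>

/* A growable list of strings; all names here are illustrative. */
struct word_list {
    char **items;
    size_t count;
    size_t capacity;
};

/* Appends a copy of `word`. Returns 0 on success, -1 if out of memory
 * (the stored words are left unchanged and still valid in that case). */
int word_list_push(struct word_list *list, const char *word)
{
    if (list->count == list->capacity) {
        size_t new_capacity = list->capacity ? 2 * list->capacity : 20;
        /* sizeof *list->items is the size of one char*, never of one char */
        char **grown = realloc(list->items, new_capacity * sizeof *list->items);
        if (grown == NULL)
            return -1;
        list->items = grown;
        list->capacity = new_capacity;
    }
    size_t length = strlen(word);
    char *copy = malloc(length + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, word, length + 1);
    list->items[list->count++] = copy;
    return 0;
}

/* Releases every stored string and the pointer array itself. */
void word_list_free(struct word_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

A zero-initialized struct word_list is an empty list, so the reading loop reduces to calling word_list_push() for each word and checking its result.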

Why cannot I free the memory?(Debug Error)

I need to remove punctuation from a given string or a word. Here's my code:
void remove_punc(char* *str)
{
char* ps = *str;
char* nstr;
// should be nstr = malloc(sizeof(char) * (1 + strlen(*str)))
nstr = (char *)malloc(sizeof(char) * strlen(*str));
if (nstr == NULL) {
perror("Memory Error in remove_punc function");
exit(1);
}
// should be memset(nstr, 0, sizeof(char) * (1 + strlen(*str)))
memset(nstr, 0, sizeof(char) * strlen(*str));
while(*ps) {
if(! ispunct(*ps)) {
strncat(nstr, ps, 1);
}
++ps;
}
*str = strdup(nstr);
free(nstr);
}
If my main function is the simple one:
int main(void) {
char* str = "Hello, World!:)";
remove_punc(&str);
printf("%s\n", str);
return 0;
}
It works! The output is Hello World.
Now I want to read in a big file and remove punctuation from the file, then output to another file.
Here's another main function:
int main(void) {
FILE* fp = fopen("book.txt", "r");
FILE* fout = fopen("newbook.txt", "w");
char* str = (char *)malloc(sizeof(char) * 1024);
if (str == NULL) {
perror("Error -- allocating memory");
exit(1);
}
memset(str, 0, sizeof(char) * 1024);
while(1) {
if (fscanf(fp, "%s", str) != 1)
break;
remove_punc(&str);
fprintf(fout, "%s ", str);
}
return 0;
}
When I rerun the program in Visual C++, it reports a
Debug Error! DAMAGE: after Normal Block(#54)0x00550B08,
and the program is aborted.
So I debugged the code. Everything works until the statement free(nstr) is executed.
I am confused. Can anyone help me?
You forgot to malloc space for the null terminator. Change
nstr = (char *)malloc(sizeof(char) * strlen(*str));
to
nstr = malloc( strlen(*str) + 1 );
Note that casting malloc is a bad idea, and if you are going to malloc and then memset to zero, you could use calloc instead which does just that.
There is another bug later in your program. The remove_punc function changes str to point to a freshly-allocated buffer that is just big enough for the string with no punctuation. However you then loop up to fscanf(fp, "%s", str). This is no longer reading into a 1024-byte buffer, it is reading into just the buffer size of the previous punctuation-free string.
So unless your file contains lines all in descending order of length (after punctuation removal), you will cause a buffer overflow here. You'll need to rethink your design of this loop. For example perhaps you could have remove_punc leave the input unchanged, and return a pointer to the freshly-allocated string, which you would free after printing.
If you go with this solution, then use %1023s to avoid a buffer overflow with fscanf (unfortunately there's no simple way to take a variable here instead of hardcoding the length). Using a scanf function with a bare "%s" is just as dangerous as gets.
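A minimal sketch of that redesign follows. The name remove_punc_copy and its exact signature are assumptions; the idea is only that the input stays untouched and the caller owns the returned string.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of `str` with punctuation removed, or
 * NULL on allocation failure. The input is not modified; the caller
 * frees the result. */
char *remove_punc_copy(const char *str)
{
    char *out = malloc(strlen(str) + 1);   /* result is never longer than input */
    if (out == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; str[i] != '\0'; i++) {
        /* cast: ispunct requires a value representable as unsigned char */
        if (!ispunct((unsigned char)str[i]))
            out[j++] = str[i];
    }
    out[j] = '\0';
    return out;
}
```

In the reading loop, fscanf(fp, "%1023s", str) would then keep filling the same 1024-byte buffer, and each returned string would be printed and freed before the next read.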
The answer by @MatMcNabb explains the causes of your problems. I'm going to suggest a couple of ways you can simplify your code and make it less susceptible to memory problems.
If performance is not an issue, read the file character by character and discard the punctuation characters.
int main(void)
{
FILE* fp = fopen("book.txt", "r");
FILE* fout = fopen("newbook.txt", "w");
int c; /* int, not char, so EOF can be told apart from every valid character */
while ( (c = fgetc(fp)) != EOF )
{
if ( !ispunct(c) )
{
fputc(c, fout);
}
}
fclose(fout);
fclose(fp);
return 0;
}
Minimize the number of calls to malloc and free by passing in the input string as well as the output string to remove_punc.
void remove_punc(char* inStr, char* outStr)
{
char* ps = inStr;
int index = 0;
while(*ps)
{
if(! ispunct((unsigned char)*ps)) /* cast avoids undefined behavior for negative char values */
{
outStr[index++] = *ps;
}
++ps;
}
outStr[index] = '\0';
}
and change the way you use remove_punc in main.
int main(void)
{
FILE* fp = fopen("book.txt", "r");
FILE* fout = fopen("newbook.txt", "w");
char inStr[1024];
char outStr[1024];
while (fgets(inStr, 1024, fp) != NULL )
{
remove_punc(inStr, outStr);
fprintf(fout, "%s", outStr);
}
fclose(fout);
fclose(fp);
return 0;
}
In your main you have the following
char* str = (char *)malloc(sizeof(char) * 1024);
...
remove_punc(&str);
...
Your remove_punc() function takes the address of str but when you do this in your remove_punc function
...
*str = strdup(nstr);
...
you are not copying the new string to the previously allocated buffer; you are reassigning str to point to the new, line-sized buffer! This means that when you read from the file and the next string to be read is longer than the previous one, you will run into trouble.
You should leave the original buffer alone and instead return the newly allocated buffer containing the new string (e.g. return nstr) and then free it when you are done with it. Better yet, just copy the original file byte by byte to the new file and exclude any punctuation. That would be far more effective.
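The byte-by-byte approach could look like the sketch below. The function name copy_without_punct and the returned character count are assumptions added for illustration.

```c
#include <ctype.h>
#include <stdio.h>

/* Copies `in` to `out`, skipping punctuation characters. Returns the
 * number of characters written. */
long copy_without_punct(FILE *in, FILE *out)
{
    long written = 0;
    int c;  /* int, not char, so EOF stays distinguishable from every byte value */
    while ((c = fgetc(in)) != EOF) {
        if (!ispunct(c)) {
            fputc(c, out);
            written++;
        }
    }
    return written;
}
```

No heap allocation is involved, so there is no buffer whose size could be exceeded.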

Why is my array of pointers getting overwritten after dynamic allocation?

I'm working on a little C program for a class that reads the lines in from a file and then sorts them using qsort. Long story short, I am dynamically allocating memory for every line of a file, stored as a char*, in an array of char*. The reading in and storing ostensibly works fine based upon the output (see below), but when I print out the lines, they are all duplicates of the last line in the file. Can anyone point out my (most likely painfully obvious) error?
Here is the relevant code to the problem I'm currently running into:
char* trim_white_space(char* str);
char* get_line(FILE* infile, char temp[]);
int main(int argc, char* argv[]) {
FILE* infile;
char* input_file = argv[1];
int cnt = 0;
char temp[MAX_LINE_LENGTH]; //to hold each line as it gets read in
char* tempPointer = temp;
if (argc < 2) {
printf("No input file provided");
return EXIT_FAILURE;
}
//determine the number of lines in the file
infile = fopen(input_file, "r");
int num_lines_in_file = num_lines(infile);
fclose(infile);
//allocate pointers for each line
char** lines = (char**) malloc(num_lines_in_file * sizeof(char*));
//temporarily store each line, and then dynamically allocate exact memory for them
infile = fopen(input_file, "r");
for (cnt = 0; cnt != num_lines_in_file; cnt++) {
tempPointer = get_line(infile, temp);
lines[cnt] = (char*) malloc(strlen(tempPointer) + 1);
lines[cnt] = trim_white_space(tempPointer);
printf("%d: %s\n", cnt, lines[cnt]);
}
fclose(infile);
//print the unsorted lines (for debugging purposes)
printf("Unsorted list:\n");
for (cnt = 0; cnt != num_lines_in_file; cnt++) {
printf("%s\n", lines[cnt]);
}
char* get_line(FILE* infile, char temp[]) {
fgets(temp, MAX_LINE_LENGTH-1, infile);
char* pntr = temp;
return pntr;
}
char *trimwhitespace(char *str)
{
char *end;
// Trim leading space
while(isspace(*str)) str++;
if(*str == 0) // All spaces?
return str;
// Trim trailing space
end = str + strlen(str) - 1;
while(end > str && isspace(*end)) end--;
// Write new null terminator
*(end+1) = 0;
return str;
}
I have this sample input file 5-1input.dat:
Hi guys
x2 My name is
Slim Shady
For real
And here's the output I'm getting:
user@user-VirtualBox ~/Desktop/Low-level/HW5 $ ./homework5-1 5-1input.dat
0: Hi guys
1: x2 My name is
2: Slim Shady
3: For real
Unsorted list:
For real
For real
For real
For real
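The assignment lines[cnt] = trim_white_space(tempPointer); overwrites the pointer you just got from malloc with a pointer into temp, so every element of lines ends up pointing into the same buffer, which holds the last line read (and the allocated memory is leaked). As in the comments, you should copy the string instead and change your loop to: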
for (cnt = 0; cnt != num_lines_in_file; cnt++) {
tempPointer = get_line(infile, temp);
lines[cnt] = (char*) malloc(strlen(tempPointer) + 1);
strncpy(lines[cnt], trim_white_space(tempPointer), strlen(tempPointer)+1);
printf("%d: %s\n", cnt, lines[cnt]);
}
The size in strncpy is based on the size of malloc you've used.
Of course you can optimize this code, e.g. to count strlen only once, etc.
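As one possible tidy-up, trimming and copying can be done in a single helper so that each element of lines owns its own storage. The name dup_trimmed is hypothetical, and unlike the trim function in the question it does not modify its input.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of `line` without leading or trailing
 * whitespace, or NULL on allocation failure. `line` is not modified. */
char *dup_trimmed(const char *line)
{
    while (isspace((unsigned char)*line))
        line++;                              /* skip leading whitespace */

    size_t length = strlen(line);
    while (length > 0 && isspace((unsigned char)line[length - 1]))
        length--;                            /* drop trailing whitespace */

    char *copy = malloc(length + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, line, length);
    copy[length] = '\0';
    return copy;
}
```

With this helper the loop body would become lines[cnt] = dup_trimmed(get_line(infile, temp)); followed by a NULL check, and strlen is evaluated only once per line.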

pass string in array

I am trying to pass strings (lines of text file) into arrays (array for f1 and array2 for f2). When I just print the buffer buffer2, the lines come up just fine. When I try to pass them using strcpy the program crashes with no apparent reason. I have tried the following:
Using a two-dimensional array, to no avail
Working with methods and tried to return char and THEN pass it to the array, so I can avoid this sloppy code, but this will do for now.
I am using windows 7 x64, with DEV-C++.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
char *arrayF1[20] ;
char *arrayF2[20] ;
int i = 0;
int size = 1024, pos;
int c;
int lineCount = 0;
char *buffer = (char *)malloc(size);
char *buffer2 = (char *)malloc(size);
char *array[100];
char *array2[100];
if (argc!=3)
{
printf("\nCommand Usage %s filename.txt filename.txt\n", argv[0]);
}
else
{
FILE *f1 = fopen(argv[1], "r");
FILE *f2 = fopen(argv[2], "r");
if(f1)
{
do { // read all lines in file
pos = 0;
do{ // read one line
c = fgetc(f1);
if(c != EOF) buffer[pos++] = (char)c;
if(pos >= size - 1) { // increase buffer length - leave room for 0
size *=2;
buffer = (char*)realloc(buffer, size);
}
}while(c != EOF && c != '\n');
lineCount++;
buffer[pos] = 0;
// line is now in buffer
strcpy(array[i], buffer);
printf("%s", array[i]);
//printf("%s", buffer);
i++;
} while(c != EOF);
printf("\n");
fclose(f1);
}
printf("%d\n",lineCount);
free(buffer);
lineCount=0;
i=0;
if (f2)
{
do { // read all lines in file
pos = 0;
do{ // read one line
c = fgetc(f2);
if(c != EOF) buffer2[pos++] = (char)c;
if(pos >= size - 1) { // increase buffer length - leave room for 0
size *=2;
buffer2 = (char*)realloc(buffer, size);
}
}while(c != EOF && c != '\n');
lineCount++;
buffer2[pos] = 0;
// line is now in buffer
strcpy(array2[i], buffer);
//printf("%s", buffer2);
printf("%s", array2[i]);
i++;
} while(c != EOF);
printf("\n");
fclose(f2);
}
printf("%d\n",lineCount);
free(buffer2);
}//end first else
return 0;
}
You haven't allocated any memory for the pointers in array. You'll need to do that before you can copy the strings there.
array[i] = malloc(pos + 1);
if (array[i] == NULL) {
// handle error
}
strcpy(array[i], buffer);
printf("%s", array[i]);
To strcpy() to a char*, you need to have already allocated memory for it. You can do this by making static char arrays:
char array[100][50]; //Strings can hold up to 50 chars
or you can use pointers and dynamically allocate them instead.
char *array[100];
for(int i = 0; i < 100; i++)
array[i] = malloc(sizeof(char) * 50); //Up to 50 chars
...
for(int i = 0; i < 100; i++)
free(array[i]); //Delete when you're finished
After allocating it with one of those methods, it's safe to write to it with strcpy().
Looks to me like you allocated the arrays on the stack but failed to ensure that they'd be big enough, since each has size exactly 100. Since you don't know how big they'll be, you can either allocate them dynamically (using @JohnKugelman's solution) or wait to declare them until after you know what their sizes need to be (i.e., how long the strings are that they need to hold).
"the program crashes with no apparent reason"
There is always a reason :)
This line:
char *array[100];
Creates an array of 100 pointers to characters.
Then this line:
strcpy(array[i], buffer);
Tries to copy your buffer to the ith pointer. The problem is that you never allocated any memory to those pointers, so strcpy() crashes. Just this:
array[i] = malloc(strlen(buffer)+1);
strcpy(array[i], buffer);
will resolve that error.
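A sketch that wraps the allocation and the copy in one helper is shown below. The name store_line and the capacity check are assumptions added for illustration; the check guards against files with more lines than the array has slots.

```c
#include <stdlib.h>
#include <string.h>

/* Stores a heap copy of `buffer` in array[*count] and increments *count.
 * Returns 0 on success, -1 if the array is full or memory runs out. */
int store_line(char *array[], size_t capacity, size_t *count, const char *buffer)
{
    if (*count >= capacity)
        return -1;                       /* no free slot left */

    size_t length = strlen(buffer);
    array[*count] = malloc(length + 1);  /* the pointer needs storage before any copy */
    if (array[*count] == NULL)
        return -1;

    memcpy(array[*count], buffer, length + 1);
    (*count)++;
    return 0;
}
```

In the question's loop this would replace the bare strcpy(array[i], buffer); call, with 100 passed as the capacity, and each stored line would be freed once it is no longer needed.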

Resources