Hello and TIA for your help. As I am new to posting questions, I welcome any feedback on how this question has been asked. I have searched SO extensively without finding what I was looking for.
I'm still working on it, and I'm not really good at C.
My purpose is extracting data from certain specific tags from a given XML and writing it to file. My issue arises because as I try to fill up the data struct I created for this purpose, at a certain point the realloc() function gives me a pointer to an address that's out of bounds.
If you look at this example
#include <stdio.h>
int main() {
char **arrayString = NULL;
char *testString;
testString = malloc(sizeof("1234567890123456789012345678901234567890123456789"));
strcpy(testString, "1234567890123456789012345678901234567890123456789");
int numElem = 0;
while (numElem < 50) {
numElem++;
arrayString = realloc(arrayString, numElem * sizeof(char**));
arrayString[numElem-1] = malloc(strlen(testString)+1);
strcpy(arrayString[numElem-1], testString);
}
printf("done\n");
return 0;
}
It does something similar to my code, in simplified form: it tries to fill the char** with C strings, but it segfaults. (Yes, I understand I am using strcpy and not its safer alternatives, but as far as I understand it copies up to and including the '\0', which is automatically appended when you write a string between "", and that's all I need.)
I'll explain in more depth below.
In this code I use libxml2, but you don't need to know it to help me.
I have a custom struct declared this way:
struct List {
char key[24][15];
char **value[15];
int size[15];
};
struct List *list; //i've tried to make this static after reading that it could make a difference but to no avail
It is filled with the necessary key values. list->size[] is initialized with zeros, to keep track of how many values I've inserted in value.
value is declared this way because for each key I need an array of char* to store every value associated with it. (I thought this through, but it could be the wrong approach and I welcome suggestions, although that's not the purpose of the question.)
I loop through the xml file, and for each node I do a strcmp between the name of the node and each of my keys. When there is a match, the index of that key is used as an index in the value matrix. I then try to extend the allocated memory for the c string matrix and then afterwards for the single char*.
The "broken" code follows, where:
read is the index of the key mentioned above.
reader is the xmlNode.
string contained the name of the xmlNode but is then freed, so consider it as if it's a new char*.
list is the struct declared above.
if (xmlTextReaderNodeType(reader) == 3 && read >= 0)
{
/* pull out the node value */
xmlChar *value;
value = xmlTextReaderValue(reader);
if (value != NULL) {
free(string);
string=strdup(value);
/*increment array size */
list->size[read]++;
/* allocate char** */
list->value[read]=realloc(list->value[read],list->size[read] * sizeof(char**));
if (list->value[read] == NULL)
return 16;
/*allocate string (char*) memory */
list->value[read][list->size[read]-1] = realloc(list->value[read][list->size[read]-1], sizeof(char*)*sizeof(string));
if (list->value[read][list->size[read]-1] == NULL)
return 16;
/*write string in list */
strcpy(list->value[read][list->size[read]-1], string);
}
/*free memory*/
xmlFree(value);
}
xmlFree(name);
free(string);
I'd expect this to allocate the char**, and then the char*, but after a few iterations of this code (which is a function called from a while loop) I get a segfault.
Analyzing this with gdb (I'm not an expert with it, I just learned it on the fly) I noticed that the code does seem to work as expected for 15 iterations. At the 16th iteration, after the size is incremented, list->value[read][list->size[read]-1] points to 0x51, which gdb marks as an address out of bounds. The realloc only changes it to 0x3730006c6d782e31, still marked as out of bounds. I would expect it to point at the last allocated value.
Here is an image of that: https://imgur.com/a/FAHoidp
How can I properly allocate the needed memory without going out of bounds?
Your code has quite a few problems:
You are not including all the appropriate headers. How did you get this to compile? If you are using malloc and realloc, you need to #include <stdlib.h>. If you are using strlen and strcpy, you need to #include <string.h>.
Not really a mistake, but unless you are applying sizeof to a type name, you don't need the enclosing parentheses.
Stop using sizeof str to get the length of a string. The correct and safe approach is strlen(str)+1. sizeof only gives the string's size when applied to an array or a string literal; applied to a pointer it gives the size of the pointer, and someday you will run into trouble.
Don't use sizeof(type) as an argument to malloc, calloc or realloc. Instead, use sizeof *ptr. This avoids your incorrect numElem * sizeof(char**) and replaces it with numElem * sizeof *arrayString, which correctly translates to numElem * sizeof(char*). This time you were saved by the fact that sizeof(char**) == sizeof(char*) on typical platforms, but the same mistake with other types would allocate the wrong amount.
If you are dynamically allocating memory, you must also deallocate it manually when you no longer need it. Use free for this purpose: free every arrayString[i], then free(arrayString); and free(testString);.
Not really a mistake, but if you want to cycle through elements, use a for loop, not a while loop. This way your intention is known by every reader.
This code compiles fine on GCC:
#include <stdio.h> //NULL, printf
#include <stdlib.h> //malloc, realloc, free
#include <string.h> //strlen, strcpy
int main()
{
char** arrayString = NULL;
char* testString;
testString = malloc(strlen("1234567890123456789012345678901234567890123456789") + 1);
strcpy(testString, "1234567890123456789012345678901234567890123456789");
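for (int numElem = 1; numElem <= 50; numElem++) // 50 elements, as in the original loop
{
arrayString = realloc(arrayString, numElem * sizeof *arrayString);
arrayString[numElem - 1] = malloc(strlen(testString) + 1);
strcpy(arrayString[numElem - 1], testString);
}
for (int i = 0; i < 50; i++) // release every string before the array that holds them
{
free(arrayString[i]);
}
free(arrayString);
free(testString);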
printf("done\n");
return 0;
}
Edit: solved by kaylum's little comment. Thank you!
good morning,
I am relatively new to C still and I'm trying to make a doubly linked list.
I got my program to run properly with all the functions with this kind of element:
the program crashes after either 2 or 3 inserted elements in the list in the calloc() call of my insertElement() function. I don't get any SIGSEGV or anything, the program just stops with a random negative return.
I'll try to give a minimum code example of the function and the function call:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef struct Element {
char name[30];
}Element;
typedef struct List {
int size;
Element* first;
Element* last;
}List;
Element* insertElement(List* List, char name[30]) {
Element* element;
element = (Element*)calloc(0, sizeof(Element));
strncpy_s(element->name, name, 30);
return element;
}
List globalList;
char name[30];
int main() {
while (true) {
printf("insert the name >>");
if (fgets(name, 30, stdin) != NULL)
name[strcspn(name, "\n")] = 0;
insertElement(&globalList, name);
}
}
is there already something obvious wrong with that basic stuff?
Thank you very much in advance! Any advice would be very much appreciated, have a good day!
element = (Element*)calloc(0, sizeof(Element));
What is the 0 in the first argument?
You are actually asking for zero objects of your type, so no usable memory is allocated!
here is some explanation about dynamic memory allocation:
Dynamic memory allocation is a process of allocating memory at run time. There are four library routines, calloc(), free(), realloc(), and malloc() which can be used to allocate memory and free it up during the program execution. These routines are defined in the header file called stdlib.h.
What is malloc() ?
It is a function used to allocate a block of memory dynamically. It reserves memory space of the specified size and returns a pointer to the start of that memory, or a null pointer if the allocation fails.
The pointer returned is of type void *, which means the result of malloc can be assigned to any object pointer type. The name malloc stands for memory allocation.
What is calloc() ?
The calloc() function is used to allocate multiple blocks of memory. It is a dynamic memory allocation function used to allocate memory for complex data structures such as arrays and structures. If this function fails to allocate the requested space, it returns a null pointer. The name calloc stands for contiguous allocation.
Why use malloc() ?
Here are the reasons for using malloc()
You should use malloc() when you have to allocate memory at runtime.
You should use malloc() when you have to allocate objects which must exist beyond the execution of the current block.
Go for malloc() if you need to allocate more memory than the stack can hold.
It returns a pointer to the first byte of the allocated space.
It enables developers to allocate memory as it is needed, in the exact amount.
This function allocates a memory block of the requested number of bytes from the heap.
Why use calloc() ?
Here are the reasons for using calloc()
When you have to set the allocated memory to zero.
Like malloc(), it returns a pointer that gives access to memory on the heap.
When you need an array whose elements all start out as zero.
To guard against the overflow that is possible when computing n * size by hand for malloc(); most calloc() implementations detect it and return a null pointer.
Use calloc() to let the allocator hand out pages that are already known to be zeroed.
Syntax of malloc()
Here is the syntax of malloc()
ptr = (cast_type *) malloc (byte_size);
In the above syntax, ptr is a pointer of cast_type. The malloc function returns a pointer to the allocated memory of byte_size.
Example of malloc() in C
In the code below, sizeof(*ptr) is used to allocate a memory block of 15 integers. The printf statement prints the value of the 6th integer.
#include<stdlib.h>
#include<stdio.h>
int main(){
int *ptr;
ptr = malloc(15 * sizeof(*ptr));
if (ptr != NULL) {
*(ptr + 5) = 480;
printf("Value of the 6th integer is %d",*(ptr + 5));
}
}
Output:
Value of the 6th integer is 480
Syntax of calloc()
Here is the syntax of calloc()
ptr = (cast_type *) calloc (n, size);
The above syntax is used to allocate n memory blocks of the same size. After the memory space is allocated, all the bytes are initialized to zero. A pointer to the first byte of the allocated memory space is returned.
Example of calloc() in C
The C program below calculates the sum of the first ten terms of a sequence. If the returned pointer is null, the memory was not allocated and the program exits.
A for loop iterates the variable "i", fills the sequence and accumulates the sum. Lastly, free is called to release the memory.
#include <stdio.h>
#include <stdlib.h>
int main() {
int i, * ptr, sum = 0;
ptr = calloc(10, sizeof(int));
if (ptr == NULL) {
printf("Error! memory not allocated.");
exit(EXIT_FAILURE); // report failure to the caller, not success
}
printf("Building and calculating the sequence sum of the first 10 terms \n");
for (i = 0; i < 10; ++i) {
*(ptr + i) = i;
sum += *(ptr + i);
}
printf("Sum = %d", sum);
free(ptr);
return 0;
}
Output:
Building and calculating the sequence sum of the first 10 terms
Sum = 45
I will not expand on the actual problem (specifying 0 as the number of elements requested from calloc()). I will point out several other things found in your code.
The first problem in reading your code is that you fail to include the file <stdbool.h>, necessary to use the constants true and false and the type bool. I have added it in the first line.
#include <stdbool.h>
Next, you use the value 30 in several places as the size of several related objects. If you decide in the future to change that value, it will be difficult to find all the occurrences of the constant 30 and change all of them (with the risk that you have also used 30 for something unrelated and it gets changed by mistake).
I have included a constant with the following line:
#define NAME_LENGTH (30)
and all the definitions:
...
char name[NAME_LENGTH];
in the structure...
Element* insertElement(List* List, char name[NAME_LENGTH]) {
in the prototype of insertElement (you don't actually need it there, as the parameter name is really a char *, not an array of NAME_LENGTH elements)...
On the other hand, you need to include a pointer in each Element to link it to the next element of the list. This is done right after name:
struct Element *next; /* we need to include struct as the type Element is not yet defined */
Next, pass sizeof *element as the second parameter to calloc() and 1 as the first. Better, if you are going to initialize all fields in the Element structure, call malloc() instead (see the final code, posted at the end).
NEVER, NEVER, NEVER cast the value returned by malloc() (and friends). This is a legacy habit that causes a lot of errors that go undetected (and are very difficult to find) because of the cast. When you cast, you tell the compiler: leave it in my hands, as I know what I'm doing. This silences the compiler when it should be complaining. The problem mainly has to do with forgetting to include the header file where malloc() (and friends) are declared (<stdlib.h>), and it will take you a long time to detect it and see why your program has crashed.
For the same reason, don't use the size of the type when you can use the pointed-to expression as the template for the type. If you write the type and later change the type of the pointed-to object, you need to remember that you have put the type here (and change it too). Written with the pointed-to expression, as in sizeof *element, the expression will only go wrong if you change the object into a non-pointer object.
Also, you have requested 0 elements of the specified type, which has already been noted in other answers. In that case calloc() returns either NULL or a pointer that must not be dereferenced, a value you don't check in your code before using it later on. This may crash your program or, worse, silently misbehave, since it is Undefined Behaviour (and a very difficult error to find, so be careful and always check the value returned by malloc()).
Next, don't use strncpy_s(): it belongs to the optional Annex K of the C standard and in practice is available mainly with Microsoft's compilers. strncpy() is a portable substitute:
strncpy(element->name, name, sizeof element->name);
Note that strncpy() does not write a terminator when the source fills the whole destination; here name comes from a buffer of the same size filled by fgets(), so it is always terminated. Also use the sizeof operator, as it protects you if you decide in the future to change the size of the array.
Finally, it is better to use fgets() as the test expression of the while statement in main(). The reason is that the loop then ends when end of file is detected.
In the end, your code ends up as follows (including the linking of Elements in the linked list):
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define NAME_LENGTH (30)
typedef struct Element {
char name[NAME_LENGTH];
struct Element *next;
} Element;
typedef struct List {
int size;
Element* first;
Element* last;
} List;
Element* insertElement(List* List, char name[NAME_LENGTH]) {
Element* element;
/* NEVER, NEVER, NEVER cast the value returned by malloc
* (and friends) This is a legacy that causes a lot of
* errors, that get undetected (and very difficult to find),
* due to the cast. When you cast you tell the compiler:
* leave it in my hands, as I know what I'm doing. And this
* makes the compiler silent, when it should be complaining.
* The problem mainly has to do with forgetting to include
* the header file where malloc (and friends) are declared
* (<stdlib.h>) and you will take long time to detect and
* see why your program has crashed. */
/* for the same reason, don't use the size of the type, when
* you can use the pointed to expression as template of the
* type. This is because if you change the type of the
* pointed to object, you need to remember that here you have
* put the type of the object. This way, this expression
* will only be bad if you change the object into a non
* pointer object. Also, you have requested for 0 elements
* of the specified type. */
element = malloc(sizeof *element);
if (element == NULL)
return NULL; /* always check the value returned by malloc() */
/* don't use strncpy_s as it is not standard. Use the sizeof
* operator again, to protect the expression if you change
* the type of element->name */
strncpy(element->name, name, sizeof element->name);
element->next = NULL;
if (List->last) {
List->last->next = element;
List->last = element;
} else {
List->first = List->last = element;
}
return element;
}
List globalList;
char name[NAME_LENGTH];
int main() {
/* if you put the fgets() call as the test of the while
* statement below, you will process each line until you get
* an end of file condition. Then you can do both things: to
* null the occurence of the \n char, and the call to
* insertElement() I have not corrected because it's a
* question of taste. */
printf("insert the name >> ");
while (fgets(name, sizeof name, stdin) != NULL) {
/* sizeof name is better than the constant, as if you
* change the type definition of object name, you have to
* remember that you are using here its size. sizeof
* does the job for you. */
name[strcspn(name, "\n")] = 0;
insertElement(&globalList, name);
printf("insert the name >> ");
}
Element *p;
char *sep = "\n\n{ ";
for (p = globalList.first; p; p = p->next) {
printf("%s\"%s\"", sep, p->name);
sep = ", ";
}
printf(" };\n");
}
The following code compiled fine yesterday for a while, started giving the abort trap: 6 error at one point, then worked fine again for a while, and again started giving the same error. All the answers I've looked up deal with strings of some fixed specified length. I'm not very experienced in programming so any help as to why this is happening is appreciated. (The code is for computing the Zeckendorf representation.)
If I simply use printf to print the digits one by one instead of using strings the code works fine.
#include <string.h>
// helper function to compute the largest fibonacci number <= n
// this works fine
void maxfib(int n, int *index, int *fib) {
int fib1 = 0;
int fib2 = 1;
int new = fib1 + fib2;
*index = 2;
while (new <= n) {
fib1 = fib2;
fib2 = new;
new = fib1 + fib2;
(*index)++;
if (new == n) {
*fib = new;
}
}
*fib = fib2;
(*index)--;
}
char *zeckendorf(int n) {
int index;
int newindex;
int fib;
char *ans = ""; // I'm guessing the error is coming from here
while (n > 0) {
maxfib(n, &index, &fib);
n -= fib;
maxfib(n, &newindex, &fib);
strcat(ans, "1");
for (int j = index - 1; j > newindex; j--) {
strcat(ans, "0");
}
}
return ans;
}
Your guess is quite correct:
char *ans = ""; // I'm guessing the error is coming from here
That makes ans point to a read-only array of one character, whose only element is the string terminator. Trying to append to this will write out of bounds and give you undefined behavior.
One solution is to dynamically allocate memory for the string, and if you don't know the size beforehand then you need to reallocate to increase the size. If you do this, don't forget to add space for the string terminator, and to free the memory once you're done with it.
Basically, you have two approaches when you want to receive a string from a function in C:
The caller allocates a buffer (either statically or dynamically) and passes it to the callee as a pointer and a size. The callee writes data to the buffer. If it fits, it returns success as a status. If it does not fit, it returns an error. You may decide that in such a case either the buffer is untouched or it contains as much data as fits in the size. Choose whatever suits you better, just document it properly for future users (including your future self).
The callee allocates a buffer dynamically, fills it and returns a pointer to it. The caller must free the memory to avoid a memory leak.
In your case the zeckendorf() function can determine how much memory is needed for the string. The index of the largest Fibonacci number not exceeding the parameter determines the length of the result. Add 1 for the terminating zero and you know how much memory you need to allocate.
So, if you choose the first approach, you need to pass two additional parameters to the zeckendorf() function, char *buffer and int size, and write to the buffer instead of ans. You also need some marker to know whether it's the first iteration of the while() loop. If it is, after maxfib(n, &index, &fib); check the condition index+1<=size. If the condition is true, you can proceed with your function. If not, you can return an error immediately.
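For the second approach, initialize ans as:
char *ans = NULL;
After maxfib(n, &index, &fib); add:
if(ans==NULL) {
ans=malloc(index+1);
if(ans==NULL) return NULL; /* allocation failed */
ans[0]='\0'; /* start with an empty string so strcat() has a terminator to find */
}
and continue as you did. Return ans from the function. Remember to call free() in the caller when the result is no longer needed, to avoid a memory leak.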
In both cases remember to write the terminating \0 to buffer.
There is also a third approach. You can declare ans as:
static char ans[20];
inside zeckendorf(). The function behaves as in the first approach, but the buffer and its size are hardcoded. I recommend that you #define BUFSIZE 20, declare the variable as static char ans[BUFSIZE]; and use BUFSIZE when checking the available size. Please be aware that this works only in a single-threaded environment, and every call to zeckendorf() will overwrite the previous result. Consider the following code.
char *a,*b;
a=zeckendorf(10);
b=zeckendorf(15);
printf("%s\n",a);
printf("%s\n",b);
The zeckendorf() function always returns the same pointer. So a and b would point to the same buffer, where the string for 15 would be stored. So you either need to copy the result somewhere, or do the processing in the proper order:
a=zeckendorf(10);
printf("%s\n",a);
b=zeckendorf(15);
printf("%s\n",b);
As a rule of thumb, the majority (if not all) of the Linux standard C library functions use either the first or the third approach.
we wrote a program that reads comma-separated integer-values into an array and tries processing them with a parallel structure.
By doing so, we found out that there is a fixed limitation for the maximum size of the dynamic array, which usually gets allocated dynamically by doubling the size. Yet for a dataset with more than 5000 values, we can't double it anymore.
I am a bit confused right now, since technically, we did everything the way other posts pointed out we should do (use realloc, don't use stack but heap instead).
Note that it works fine for any file with 5000 values or fewer.
We also tried working with realloc, but to the same result.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
// compile with gcc filename -lpthread -lm -Wall -Wextra -o test
int reader(int ** array, char * name) {
FILE *fp;
int data,row,col,count,inc;
int capacity=10;
char ch;
fp=fopen(name,"r");
row=col=count=0;
while(EOF!=(inc=fscanf(fp,"%d%c", &data, &ch)) && inc == 2){
if(capacity==count)
// this is the alternative with realloc we tried. Still the same issue.
//*array=malloc(sizeof(int)*(capacity*=2));
*array = realloc(*array, sizeof(int)*(capacity*=2));
(*array)[count++] = data;
//printf("%d ", data);
if(ch == '\n'){
break;
} else if(ch != ','){
fprintf(stderr, "format error of different separator(%c) of Row at %d \n", ch, row);
break;
}
}
// close file stream
fclose(fp);
//*array=malloc( sizeof(int)*count);
*array = realloc(*array, sizeof(int)*count);
return count;
}
int main(){
int cores = 8;
pthread_t p[cores];
int *array;
int i = 0;
array=malloc(sizeof(int)*10);
// read the file
int length = reader(&array, "data_2.txt");
// clean up and exit
free(array);
return 0;
}
EDIT: I included the realloc-command we tried and changed the values back to our original testing values (starting at 10). This didn't impact the result though, or rather still does not work. Thanks anyways for pointing out the errors! I also reduced the included code to the relevant part.
I can't really get my head around the fact that it should work this way, but doesn't, so it might just be a minor mistake we overlooked.
Thanks in advance.
New answer after question has been updated
The use of realloc is wrong. Always do realloc into a new pointer and check for NULL before overwriting the old pointer.
Like:
int* tmp = realloc(....);
if (!tmp)
{
// No more memory
// do error handling
....
}
*array = tmp;
Original answer (not fully valid after question has been updated)
You have some serious problems with the current code.
In main you have:
array=malloc(sizeof(int)*10); // This only allocates memory for 10 int
int length = reader(&array, "data_1.txt");
and in reader you have:
int capacity=5001;
So you assume that the array capacity is 5001 even though you only reserved memory for 10 to start with. So you end up writing outside the reserved array (i.e. undefined behavior).
A better approach could be to handle all allocation in the function (i.e. don't do any allocation in main). If you do that you shall initialize capacity to 0 and rewrite the way capacity grows.
Further, in reader you have:
if(capacity==count)
*array=malloc(sizeof(int)*(capacity*=2));
It is wrong to use malloc, as you lose all data already in the array and leak memory as well. Use realloc instead.
Finally, you have:
*array=malloc( sizeof(int)*count);
Again, this is wrong for the same reason as above. If you want to resize to the exact size (aka count), use realloc.
This is really strange... and I can't debug it (tried for about two hours, debugger starts going haywire after a while...). Anyway, I'm trying to do something really simple:
Free an array of strings. The array is in the form:
char **myStrings. The array elements are initialized as:
myString[index] = malloc(strlen(word));
myString[index] = word;
and I'm calling a function like this:
free_memory(myStrings, size); where size is the length of the array (I know this is not the problem, I tested it extensively and everything except this function is working).
free_memory looks like this:
void free_memory(char **list, int size) {
for (int i = 0; i < size; i ++) {
free(list[i]);
}
free(list);
}
Now here comes the weird part. if (size> strlen(list[i])) then the program crashes. For example, imagine that I have a list of strings that looks something like this:
myStrings[0] = "Some";
myStrings[1] = "random";
myStrings[2] = "strings";
And thus the length of this array is 3.
If I pass this to my free_memory function, strlen(myStrings[0]) > 3 (4 > 3), and the program crashes.
However, if I change myStrings[0] to be "So" instead, then strlen(myStrings[0]) < 3 (2 < 3) and the program does not crash.
So it seems to me that free(list[i]) is actually going through the char[] that is at that location and trying to free each character, which I imagine is undefined behavior.
The only reason I say this is because I can play around with the size of the first element of myStrings and make the program crash whenever I feel like it, so I'm assuming that this is the problem area.
Note: I did try to debug this by stepping through the function that calls free_memory, noting any weird values and such, but the moment I step into the free_memory function, the debugger crashes, so I'm not really sure what is going on. Nothing is out of the ordinary until I enter the function, then the world explodes.
Another note: I also posted the shortened version of the source for this program (not too long; Pastebin) here. I am compiling on MinGW with the c99 flag on.
PS - I just thought of this. I am indeed passing numUniqueWords to the free function, and I know that this does not actually free the entire piece of memory that I allocated. I've called it both ways, that's not the issue. And I left it how I did because that is the way that I will be calling it after I get it to work in the first place, I need to revise some of my logic in that function.
Source, as per request (on-site):
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
#include "words.h"
int getNumUniqueWords(char text[], int size);
int main(int argc, char* argv[]) {
setvbuf(stdout, NULL, 4, _IONBF); // For Eclipse... stupid bug. --> does NOT affect the program, just the output to console!
int nbr_words;
char text[] = "Some - \"text, a stdin\". We'll have! also repeat? We'll also have a repeat!";
int length = sizeof(text);
nbr_words = getNumUniqueWords(text, length);
return 0;
}
void free_memory(char **list, int size) {
for (int i = 0; i < size; i ++) {
// You can see that printing the values is fine, as long as free is not called.
// When free is called, the program will crash if (size > strlen(list[i]))
//printf("Wanna free value %d w/len of %d: %s\n", i, strlen(list[i]), list[i]);
free(list[i]);
}
free(list);
}
int getNumUniqueWords(char text[], int length) {
int numTotalWords = 0;
char *word;
printf("Length: %d characters\n", length);
char totalWords[length];
strcpy(totalWords, text);
word = strtok(totalWords, " ,.-!?()\"0123456789");
while (word != NULL) {
numTotalWords ++;
printf("%s\n", word);
word = strtok(NULL, " ,.-!?()\"0123456789");
}
printf("Looks like we counted %d total words\n\n", numTotalWords);
char *uniqueWords[numTotalWords];
char *tempWord;
int wordAlreadyExists = 0;
int numUniqueWords = 0;
char totalWordsCopy[length];
strcpy(totalWordsCopy, text);
for (int i = 0; i < numTotalWords; i++) {
uniqueWords[i] = NULL;
}
// Tokenize until all the text is consumed.
word = strtok(totalWordsCopy, " ,.-!?()\"0123456789");
while (word != NULL) {
// Look through the word list for the current token.
for (int j = 0; j < numTotalWords; j ++) {
// Just for clarity, no real meaning.
tempWord = uniqueWords[j];
// The word list is either empty or the current token is not in the list.
if (tempWord == NULL) {
break;
}
//printf("Comparing (%s) with (%s)\n", tempWord, word);
// If the current token is the same as the current element in the word list, mark and break
if (strcmp(tempWord, word) == 0) {
printf("\nDuplicate: (%s)\n\n", word);
wordAlreadyExists = 1;
break;
}
}
// Word does not exist, add it to the array.
if (!wordAlreadyExists) {
uniqueWords[numUniqueWords] = malloc(strlen(word));
uniqueWords[numUniqueWords] = word;
numUniqueWords ++;
printf("Unique: %s\n", word);
}
// Reset flags and continue.
wordAlreadyExists = 0;
word = strtok(NULL, " ,.-!?()\"0123456789");
}
// Print out the array just for funsies - make sure it's working properly.
for (int x = 0; x <numUniqueWords; x++) {
printf("Unique list %d: %s\n", x, uniqueWords[x]);
}
printf("\nNumber of unique words: %d\n\n", numUniqueWords);
// Right below is where things start to suck.
free_memory(uniqueWords, numUniqueWords);
return numUniqueWords;
}
You've got an answer to this question, so let me instead answer a different question:
I had multiple easy-to-make mistakes -- allocating a wrong-sized buffer and freeing non-malloc'd memory. I debugged it for hours and got nowhere. How could I have spent that time more effectively?
You could have spent those hours writing your own memory allocators that would find the bug automatically.
When I was writing a lot of C and C++ code I made helper methods for my program that turned all mallocs and frees into calls that did more than just allocate memory. (Note that methods like strdup are malloc in disguise.) If the user asked for, say, 32 bytes, then my helper method would add 24 to that and actually allocate 56 bytes. (This was on a system with 4-byte integers and pointers.) I kept a static counter and a static head and tail of a doubly-linked list. I would then fill in the memory I allocated as follows:
Bytes 0-3: the counter
Bytes 4-7: the prev pointer of a doubly-linked list
Bytes 8-11: the next pointer of a doubly-linked list
Bytes 12-15: The size that was actually passed in to the allocator
Bytes 16-19: 01 23 45 67
Bytes 20-51: 33 33 33 33 33 33 ...
Bytes 52-55: 89 AB CD EF
And return a pointer to byte 20.
The free code would take the pointer passed in and subtract four, and verify that bytes 16-19 were still 01 23 45 67. If they were not then either you are freeing a block you did not allocate with this allocator, or you've written before the pointer somehow. Either way, it would assert.
If that check succeeded then it would go back four more and read the size. Now we know where the end of the block is and we can verify that bytes 52 through 55 are still 89 AB CD EF. If they are not then you are writing over the end of a block somewhere. Again, assert.
Now that we know that the block is not corrupt we remove it from the linked list, set ALL the memory of the block to CC CC CC CC ... and free the block. We use CC because that is the "break into the debugger" instruction on x86. If somehow we end up with the instruction pointer pointing into such a block it is nice if it breaks!
If there is a problem then you also know which allocation it was, because you have the allocation count in the block.
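The layout and checks described above can be sketched roughly as follows. This is a minimal sketch, not the allocator the answer's author actually used: the names dbg_malloc, dbg_free, dbg_check and dbg_live_blocks are hypothetical, and the header fields are size_t-wide instead of the fixed 4-byte layout so that it also works on 64-bit systems. It assumes the returned pointer only needs the alignment of a size_t or a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical debug allocator; all names here are illustrative. */
#define FRONT_GUARD ((size_t)0x01234567u)
static const unsigned char BACK_GUARD[4] = { 0x89, 0xAB, 0xCD, 0xEF };

typedef struct DbgHeader {
    size_t serial;            /* which allocation this was (1, 2, 3, ...) */
    struct DbgHeader *prev;   /* doubly-linked list of live blocks */
    struct DbgHeader *next;
    size_t size;              /* size the caller asked for */
    size_t guard;             /* FRONT_GUARD, directly before the user bytes */
} DbgHeader;

static size_t dbg_counter;
static DbgHeader *dbg_head;   /* most recent live allocation */

void *dbg_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(DbgHeader) - sizeof BACK_GUARD)
        return NULL;
    DbgHeader *h = malloc(sizeof *h + size + sizeof BACK_GUARD);
    if (h == NULL)
        return NULL;
    unsigned char *user = (unsigned char *)(h + 1);

    h->serial = ++dbg_counter;
    h->size = size;
    h->guard = FRONT_GUARD;
    h->prev = NULL;
    h->next = dbg_head;
    if (dbg_head != NULL)
        dbg_head->prev = h;
    dbg_head = h;

    memset(user, 0x33, size);                        /* "never written" fill */
    memcpy(user + size, BACK_GUARD, sizeof BACK_GUARD);
    return user;
}

/* Returns 1 if both guards around a live block are intact, 0 otherwise. */
int dbg_check(const void *p)
{
    const DbgHeader *h = (const DbgHeader *)p - 1;
    if (h->guard != FRONT_GUARD)
        return 0;
    return memcmp((const unsigned char *)p + h->size,
                  BACK_GUARD, sizeof BACK_GUARD) == 0;
}

void dbg_free(void *p)
{
    if (p == NULL)
        return;
    assert(dbg_check(p) && "block corrupted or not from dbg_malloc");

    DbgHeader *h = (DbgHeader *)p - 1;
    if (h->prev != NULL) h->prev->next = h->next; else dbg_head = h->next;
    if (h->next != NULL) h->next->prev = h->prev;

    /* Overwrite header, user bytes and back guard before releasing. */
    memset(h, 0xCC, sizeof *h + h->size + sizeof BACK_GUARD);
    free(h);
}

/* Number of blocks still allocated; nonzero at exit suggests a leak. */
size_t dbg_live_blocks(void)
{
    size_t n = 0;
    for (const DbgHeader *h = dbg_head; h != NULL; h = h->next)
        n++;
    return n;
}
```

In a debug build you would route your allocations through dbg_malloc and dbg_free; writing even one byte past the end of a block then trips the assertion at free time instead of silently corrupting the heap.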
Now we have a system that finds your bugs for you. In the release version of your product, simply turn it off so that your allocator just calls malloc normally.
Moreover you can use this system to find other bugs. If for example you believe that you've got a memory leak somewhere all you have to do is look at the linked list; you have a complete list of all the outstanding allocations and can figure out which ones are being kept around unnecessarily. If you think you're allocating too much memory for a given block then you can have your free code check to see if there are a lot of 33 in the block that is about to be freed; that's a sign that you're allocating your blocks too big. And so on.
And finally: this is just a starting point. When I was using this debug allocator professionally I extended it so that it was threadsafe, so that it could tell me what kind of allocator was doing the allocation (malloc, strdup, new, IMalloc, etc.), whether there was a mismatch between the alloc and free functions, what source file contained the allocation, what the call stack was at the time of the allocation, what the average, minimum and maximum block sizes were, what subsystems were responsible for what memory usage...
C requires that you manage your own memory; this definitely has its pros and cons. My opinion is that the cons outweigh the pros; I much prefer to work in automatic storage languages. But the nice thing about having to manage your own storage is that you are free to build a storage management system that meets your needs, and that includes your debugging needs. If you must use a language that requires you to manage storage, use that power to your advantage and build a really powerful subsystem that you can use to solve professional-grade problems.
The problem is not how you're freeing, but how you're creating the array. Consider this:
uniqueWords[numUniqueWords] = malloc(strlen(word));
uniqueWords[numUniqueWords] = word;
...
word = strtok(NULL, " ,.-!?()\"0123456789");
There are several issues here:
word = strtok(): what strtok returns is not something that you can free, because it has not been malloc'ed. That is, it is not a copy; it just points somewhere inside the underlying large string (the one you first called strtok with).
uniqueWords[numUniqueWords] = word: this is not a copy; it just assigns the pointer. The pointer that was there before (the one you malloc'ed) is overwritten, so that memory is leaked.
malloc(strlen(word)): this allocates too little memory; it should be strlen(word)+1 to leave room for the terminating '\0'.
How to fix:
Option A: copy properly
// no malloc
uniqueWords[numUniqueWords] = strdup(word); // what strdup returns can be free'd
Option B: copy properly, slightly more verbose
uniqueWords[numUniqueWords] = malloc(strlen(word)+1);
strcpy(uniqueWords[numUniqueWords], word); // use the malloc'ed memory to copy to
Option C: don't copy, don't free
// no malloc
uniqueWords[numUniqueWords] = word; // not a copy, this still points to the big string
// don't free this, i.e. don't call free(list[i]) in free_memory
EDIT: As others have pointed out, this is also problematic:
char *uniqueWords[numTotalWords];
This is a variable-length array (standard in C99, optional since C11). It has automatic storage, so you cannot (and must not) free it. Try char **uniqueWords = (char**)malloc(sizeof(char*) * numTotalWords) instead. Again, the problem is not the free() but the way you allocate. You are on the right track with the free; you just need to match every free with a malloc, or with something documented as equivalent to a malloc (like strdup).
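Option B can be put together with the strtok loop roughly like this. It is only a sketch: collect_unique_words and free_words are hypothetical names, the caller is assumed to provide the output array, and the delimiter set is the one from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits text in place with strtok and stores a
   malloc'ed copy of each distinct word in out[]. Returns the number of
   words stored (at most max). */
size_t collect_unique_words(char *text, char **out, size_t max)
{
    static const char delims[] = " ,.-!?()\"0123456789";
    size_t count = 0;

    for (char *word = strtok(text, delims);
         word != NULL && count < max;
         word = strtok(NULL, delims)) {
        size_t i;
        for (i = 0; i < count; i++)
            if (strcmp(out[i], word) == 0)
                break;
        if (i < count)
            continue;                           /* already stored */

        out[count] = malloc(strlen(word) + 1);  /* +1 for the '\0' */
        if (out[count] == NULL)
            break;                              /* out of memory: stop early */
        strcpy(out[count], word);               /* copy into our own memory */
        count++;
    }
    return count;
}

/* Frees exactly what collect_unique_words allocated. */
void free_words(char **list, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(list[i]);   /* each entry came from malloc */
    /* list itself is not freed here: the caller owns it */
}
```

Because every stored pointer comes from malloc, free_words can safely free each entry, whether the array of pointers itself lives on the stack or on the heap.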
You are using this code in an attempt to allocate the memory:
uniqueWords[numUniqueWords] = malloc(strlen(word));
uniqueWords[numUniqueWords] = word;
numUniqueWords++;
This is wrong on many levels.
You need to allocate strlen(word)+1 bytes of memory.
You need to strcpy() the string over the allocated memory; at the moment, you simply throw the allocated memory away.
Your array uniqueWords is itself not allocated, and the word values you have stored are from the original string which has been mutilated by strtok().
As it stands, you cannot free any memory because you've already lost the pointers to the memory that was allocated and the memory you are trying to free was never in fact allocated by malloc() et al.
And you should be error checking the memory allocations too. Consider using strdup() to duplicate strings.
You are trying to free char *uniqueWords[numTotalWords];, which is not allowed in C: uniqueWords is allocated on the stack, and you can't call free on stack memory.
Just remove the last free call, like this:
void free_memory(char **list, int size) {
for (int i = 0; i < size; i ++) {
free(list[i]);
}
}
A proper way of allocating and deallocating a two-dimensional char array as one contiguous block:
char **foo = (char **) malloc(row* sizeof(char *));
*foo = malloc(row * col * sizeof(char));
for (int i = 1; i < row; i++) {
foo[i] = *foo + i*col;
}
free(*foo);
free(foo);
Note that you don't need to go through each and every row to deallocate the memory. All rows live in one contiguous block, so two calls suffice: free(*foo) releases the block of characters and free(foo) releases the array of row pointers.
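The same idea with error checking might look like the sketch below. The names grid_alloc and grid_free are hypothetical, and calloc is used here (instead of malloc) so that the size multiplication is overflow-checked and the characters start out zeroed.

```c
#include <stdlib.h>

/* Allocates rows x cols chars as ONE block plus a table of row pointers.
   Returns NULL on failure. Hypothetical helper, for illustration only. */
char **grid_alloc(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0)
        return NULL;

    char **grid = calloc(rows, sizeof *grid);   /* table of row pointers */
    if (grid == NULL)
        return NULL;

    grid[0] = calloc(rows, cols);               /* all characters, contiguous */
    if (grid[0] == NULL) {
        free(grid);
        return NULL;
    }

    for (size_t i = 1; i < rows; i++)
        grid[i] = grid[0] + i * cols;           /* row i starts i*cols bytes in */
    return grid;
}

/* Two allocations were made, so exactly two frees are needed. */
void grid_free(char **grid)
{
    if (grid == NULL)
        return;
    free(grid[0]);
    free(grid);
}
```

Note that this layout only suits rows of a fixed maximum length; strings of unknown length still need one allocation per string.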
I have an array, say, text, that contains strings read in by another function. The length of the strings is unknown and the amount of them is unknown as well. How should I try to allocate memory to an array of strings (and not to the strings themselves, which already exist as separate arrays)?
What I have set up right now seems to read the strings just fine, and seems to do the post-processing I want done correctly (I tried this with a static array). However, when I try to printf the elements of text, I get a segmentation fault. To be more precise, I get a segmentation fault when I try to print out specific elements of text, such as text[3] or text[5]. I assume this means that I'm allocating memory to text incorrectly and all the strings read are not saved to text correctly?
So far I've tried different approaches, such as allocating a set amount of some size_t=k , k*sizeof(char) at first, and then reallocating more memory (with realloc k*sizeof(char)) if cnt == (k-2), where cnt is the index of **text.
I tried to search for this, but the only similar problem I found was with a set amount of strings of unknown length.
I'd like to figure out as much as I can on my own, and didn't post the actual code because of that. However, if none of this makes any sense, I'll post it.
EDIT: Here's the code
int main(void){
char **text;
size_t k=100;
size_t cnt=1;
int ch;
size_t lng;
text=malloc(k*sizeof(char));
printf("Input:\n");
while(1) {
ch = getchar();
if (ch == EOF) {
text[cnt++]='\0';
break;
}
if (cnt == k - 2) {
k *= 2;
text = realloc(text, (k * sizeof(char))); /* I guess at least this is incorrect?*/
}
text[cnt]=readInput(ch); /* read(ch) just reads the line*/
lng=strlen(text[cnt]);
printf("%d,%d\n",lng,cnt);
cnt++;
}
text=realloc(text,cnt*sizeof(char));
print(text); /*prints all the lines*/
return 0;
}
The short answer is you can't directly allocate the memory unless you know how much to allocate.
However, there are various ways of determining how much you need to allocate.
There are two aspects to this. One is knowing how many strings you need to handle. There must be some defined way of knowing; either you're given a count, or there is some specific pointer value (usually NULL) that tells you when you've reached the end. The other is knowing how long each string is.
To allocate the array of pointers to pointers, it is probably simplest to count the number of necessary pointers, and then allocate the space. Assuming a null terminated list:
size_t i;
for (i = 0; list[i] != NULL; i++)
;
char **space = malloc(i * sizeof(*space));
...error check allocation...
For each string, you can use strdup(); you assume that the strings are well-formed and hence null terminated. Or you can write your own analogue of strdup().
for (i = 0; list[i] != NULL; i++)
{
space[i] = strdup(list[i]);
...error check allocation...
}
An alternative approach scans the list of pointers once, but uses malloc() and realloc() multiple times. This is probably slower overall.
If you can't reliably tell when the list of strings ends or when the strings themselves end, you are hosed. Completely and utterly hosed.
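Put together, the count-then-allocate approach could look like this sketch. The names copy_string, copy_string_list and free_string_list are hypothetical; copy_string stands in for strdup(), and one extra slot is allocated so that the copy is NULL-terminated too (an assumption that goes slightly beyond the fragments above).

```c
#include <stdlib.h>
#include <string.h>

/* Own analogue of strdup(): returns a malloc'ed copy of s, or NULL. */
static char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;          /* include the '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Deep-copies a NULL-terminated list of strings. The result is also
   NULL-terminated. Returns NULL if any allocation fails. */
char **copy_string_list(char *const *list)
{
    size_t n = 0;
    while (list[n] != NULL)              /* first pass: count the strings */
        n++;

    char **space = malloc((n + 1) * sizeof *space);
    if (space == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {     /* second pass: copy each string */
        space[i] = copy_string(list[i]);
        if (space[i] == NULL) {          /* undo the partial work */
            while (i > 0)
                free(space[--i]);
            free(space);
            return NULL;
        }
    }
    space[n] = NULL;
    return space;
}

void free_string_list(char **list)
{
    if (list == NULL)
        return;
    for (size_t i = 0; list[i] != NULL; i++)
        free(list[i]);
    free(list);
}
```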
C doesn't have a string type. It just has pointers to (conventionally null-terminated) sequences of characters, and calls them strings.
So just allocate first an array of pointers:
size_t nbelem= 10; /// number of elements
char **arr = calloc(nbelem, sizeof(char*));
You really want calloc because you really want that array to be cleared, so each pointer there is NULL. Of course, you test that calloc succeeded:
if (!arr) perror("calloc failed"), exit(EXIT_FAILURE);
At last, you fill some of the elements of the array:
arr[0] = "hello";
arr[1] = strdup("world");
(Don't forget to free the result of strdup and the result of calloc).
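You could grow your array with realloc, but I don't advise doing that naively: if you assign the result straight back to the same pointer and realloc fails, it returns NULL and you have lost your only pointer to the (still allocated) data. You could simply grow the array by allocating a bigger one, copying the old contents into it, and redefining the pointer, e.g.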
{ size_t newnbelem = 3*nbelem/2+10;
char**oldarr = arr;
char**newarr = calloc(newnbelem, sizeof(char*));
if (!newarr) perror("bigger calloc"), exit(EXIT_FAILURE);
memcpy (newarr, oldarr, sizeof(char*)*nbelem);
free (oldarr);
arr = newarr;
}
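For comparison, realloc can also be used safely, because it leaves the original block untouched when it fails; data is lost only if the returned NULL overwrites the sole pointer. A minimal sketch, assuming a hypothetical helper named grow_pointer_array and the same growth formula as above:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: grows *arr from *cap slots to 3*cap/2+10 slots.
   Returns 0 on success, -1 on failure. On failure *arr and *cap are
   unchanged and the old array is still valid. */
int grow_pointer_array(char ***arr, size_t *cap)
{
    /* conservative guard against overflow in the size arithmetic */
    if (*cap > SIZE_MAX / (4 * sizeof **arr))
        return -1;

    size_t newcap = 3 * *cap / 2 + 10;
    char **tmp = realloc(*arr, newcap * sizeof **arr);
    if (tmp == NULL)
        return -1;                    /* keep using the old array */

    for (size_t i = *cap; i < newcap; i++)
        tmp[i] = NULL;                /* realloc does not clear new slots */

    *arr = tmp;                       /* only now replace the old pointer */
    *cap = newcap;
    return 0;
}
```

The key point is the temporary pointer tmp: the caller's pointer is replaced only after realloc has succeeded.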
Don't forget to compile with gcc -Wall -g on Linux (improve your code till no warnings are given), and learn how to use the gdb debugger and the valgrind memory leak detector.
In C you cannot allocate an array of strings directly. You should use an array of pointers to char as your array of strings. So use
char* strarr[length];
And to maintain each array of characters, you may take an approach somewhat like this:
Allocate a block of memory through a call to malloc()
Keep track of the size of the input
Whenever you need to increase the buffer size, call realloc(ptr, size)
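The steps above can be applied twice: once to each line (a growing buffer of char) and once to the array of lines (a growing buffer of char *). The sketch below assumes input comes from a FILE *; read_line and read_lines are hypothetical names. Note that the array of strings is sized in units of sizeof(char *), not sizeof(char) as in the question's code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one line (without the '\n') from fp into a malloc'ed string.
   Returns NULL at end of input or on allocation failure. */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int ch;
    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (len + 1 == cap) {                 /* keep room for the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    if (ch == EOF && len == 0) {              /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Reads all lines from fp. Stores the number of lines in *count and
   returns a malloc'ed array of malloc'ed strings (NULL if there are none). */
char **read_lines(FILE *fp, size_t *count)
{
    size_t cap = 0, n = 0;
    char **text = NULL;
    char *line;

    while ((line = read_line(fp)) != NULL) {
        if (n == cap) {                       /* array of pointers is full */
            size_t newcap = cap ? cap * 2 : 8;
            char **tmp = realloc(text, newcap * sizeof *tmp);
            if (tmp == NULL) { free(line); break; }
            text = tmp;
            cap = newcap;
        }
        text[n++] = line;
    }
    *count = n;
    return text;
}
```

The caller would free each text[i] and then text itself once the lines are no longer needed.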