C project using pointers to send arrays - c

In this program I have to create a system of pointers that will, after reading in only 80 characters of a larger input, supply a pointer-to-pointer-to-character operation. The result is then sent to a function that determines the number of words in the total input and the average number of letters they contain. My problem is that I don't know how to create the pointer system without generating an EXC_BAD_ACCESS error. Additionally, I cannot find a combination of malloc and free that suits my needs. Any help with this would be greatly appreciated.
inputPtr = (char*)malloc(81 * sizeof(char));
while (fgets(wordy, 81, stdin) != NULL) {
numChar = strlen(wordy);
inputPtr = wordy;
for (i = 0; i < groupRange; ++i) {
sentPtr[i] = &inputPtr;
}
if (numChar == 80) {
groupRange++;
}
free(inputPtr);
printwords(*sentPtr, numChar);
}

Let's take a close look at these three lines from your code:
inputPtr = (char*)malloc(81 * sizeof(char));
...
inputPtr = wordy;
...
free(inputPtr);
The first line allocates memory and assigns the pointer to that memory to the variable inputPtr.
The second line reassigns inputPtr so it no longer points to the memory you have allocated. You will lose that memory and have a memory leak.
Finally, the last line attempts to free whatever inputPtr now points to. I can't tell exactly what that is, but it is probably not memory you allocated with malloc, and freeing such memory leads to undefined behavior.
Exactly how to solve your problem I'm not sure about, but a good start would be to not allocate memory dynamically, and then of course not call free.
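A minimal sketch of that suggestion, assuming a "word" is simply a run of letters (the question does not define it). The names word_stats, count_chunk and count_stream are made up for illustration. One fixed 81-byte buffer is reused for every fgets call, so there is nothing to malloc or free, and a word that is split across two 80-character chunks is still counted once:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Running totals; these names are illustrative, not from the question. */
struct word_stats {
    size_t words;    /* number of words seen so far */
    size_t letters;  /* total letters in those words */
    int in_word;     /* nonzero if the previous chunk ended in the middle of a word */
};

/* Update the totals with one chunk of text (for example, one fgets result). */
void count_chunk(struct word_stats *st, const char *chunk)
{
    assert(st != NULL && chunk != NULL);
    for (const char *p = chunk; *p != '\0'; ++p) {
        if (isalpha((unsigned char)*p)) {
            if (!st->in_word) {      /* first letter of a new word */
                st->words++;
                st->in_word = 1;
            }
            st->letters++;
        } else {
            st->in_word = 0;         /* any non-letter ends the current word */
        }
    }
}

/* Read a stream in pieces of at most 80 characters using one fixed buffer. */
struct word_stats count_stream(FILE *in)
{
    struct word_stats st = {0, 0, 0};
    char wordy[81];                  /* 80 characters plus the terminating '\0' */

    assert(in != NULL);
    while (fgets(wordy, sizeof wordy, in) != NULL) {
        count_chunk(&st, wordy);
    }
    return st;
}
```

Calling count_stream(stdin) and dividing letters by words (after checking that words is not zero) would give the average.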

Related

C: Realloc() after reading second line in file results in garbage values

I'm attempting to read sequences from a FASTA file into a table of structs that I've created, which each contain a character array member called "seq". My code seems to work well for the first loop, but when I realloc() memory for the second sequence, the pointer seems to point to garbage values and then the strcat() method gives me a segfault.
Here's the whole FASTA file I'm trying to read from:
>1
AAAAAAAAAAGWTSGTAAAAAAAAAAA
>2
LLLLLLLLLLGWTSGTLLLLLLLLLLL
>3
CCCCCCCCCCGWTSGTCCCCCCCCCCC
Here's the code (sorry that some of the variable names are in french):
typedef struct _tgSeq { char *titre ; char *seq ; int lg ; } tgSeq ;
#define MAX_SEQ_LN 1000
tgSeq* readFasta(char *nomFile) {
char ligne[MAX_SEQ_LN];
tgSeq *lesSeq = NULL;
int nbSeq=-1;
FILE *pF = fopen(nomFile, "r");
while(fgets(ligne, MAX_SEQ_LN, pF) != NULL) {
if(ligne[0] == '>') {
/*create a new sequence*/
nbSeq++;
//reallocate memory to keep the new sequence in the *lesSeq table
lesSeq = realloc(lesSeq, (nbSeq)*sizeof(tgSeq));
//allocate memory for the title of the new sequence
lesSeq[nbSeq].titre = malloc((strlen(ligne)+1)*sizeof(char));
//lesSeq[nbSeq+1].titre becomes a pointer that points to the same memory as ligne
strcpy(lesSeq[nbSeq].titre, ligne);
//Now we create the new members of the sequence that we can fill with the correct information later
lesSeq[nbSeq].lg = 0;
lesSeq[nbSeq].seq = NULL;
} else {
/*fill the members of the sequence*/
//reallocate memory for the new sequence
lesSeq[nbSeq].seq = realloc(lesSeq[nbSeq].seq, (sizeof(char)*(lesSeq[nbSeq].lg+1+strlen(ligne))));
strcat(lesSeq[nbSeq].seq, ligne);
lesSeq[nbSeq].lg += strlen(ligne);
}
}
// Close the file
fclose(pF);
return lesSeq;
}
For the first line (AAAAAAAAAAGWTSGTAAAAAAAAAAA), lesSeq[nbSeq].seq = realloc(lesSeq[nbSeq].seq, (sizeof(char)*(lesSeq[nbSeq].lg+1+strlen(ligne)))); gives me an empty character array that I can concatenate onto, but for the second line (LLLLLLLLLLGWTSGTLLLLLLLLLLL) the same code gives me garbage characters like "(???". I'm assuming the problem is that the reallocation is pointing towards some sort of garbage memory, but I don't understand why it would be different for the first line versus the second line.
Any help you could provide would be greatly appreciated! Thank you!
The problem here is that the first realloc receives nbSeq as 0, which does not allocate any memory.
Replace
int nbSeq=-1;
with
int nbSeq=0;
and access the current element with lesSeq[nbSeq - 1].
Some programmer dude already pointed out that you do not allocate enough memory.
You also seem to expect some behaviour from realloc that will not happen.
You call realloc with NULL pointers, which makes it behave the same as malloc.
For the first line (AAAAAAAAAAGWTSGTAAAAAAAAAAA), ...= realloc(); gives me an empty character array that I can concatenate onto, but for the second line (LLLLLLLLLLGWTSGTLLLLLLLLLLL) the same code gives me garbage characters like "(???".
You should not expect any specific content in your allocated memory. In particular, the memory is not set to 0. If you want to rely on that, you can use calloc.
Or you can simply assign a 0 to the first memory location.
You do not really concatenate anything: each call writes into freshly allocated memory, so you could simply use strcpy instead of strcat.
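Putting both answers together, here is a hedged sketch of the two allocation steps. The helper names add_record and append_line are invented for this example, and the struct is the one from the question without its tag. The table is grown to count + 1 elements, and new text is copied to a known offset, so the uninitialized contents returned by realloc are never read:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char *titre; char *seq; int lg; } tgSeq;

/* Append a new, empty record. Returns 0 on success, -1 on allocation failure. */
int add_record(tgSeq **table, int *count, const char *title)
{
    assert(table != NULL && count != NULL && title != NULL);
    /* Room for *count + 1 records; a temporary keeps the old table if realloc fails. */
    tgSeq *bigger = realloc(*table, ((size_t)*count + 1) * sizeof **table);
    if (bigger == NULL) return -1;
    *table = bigger;

    tgSeq *rec = &bigger[*count];
    rec->titre = malloc(strlen(title) + 1);
    if (rec->titre == NULL) return -1;
    strcpy(rec->titre, title);
    rec->seq = NULL;                 /* realloc(NULL, n) later acts like malloc(n) */
    rec->lg = 0;
    (*count)++;
    return 0;
}

/* Append one line of sequence text to a record. Returns 0 or -1. */
int append_line(tgSeq *rec, const char *line)
{
    assert(rec != NULL && line != NULL);
    size_t add = strlen(line);
    char *bigger = realloc(rec->seq, (size_t)rec->lg + add + 1);
    if (bigger == NULL) return -1;
    /* Copy to the current end, including the '\0'. This also works when the
       buffer is brand new (lg == 0), unlike strcat on uninitialized memory. */
    memcpy(bigger + rec->lg, line, add + 1);
    rec->seq = bigger;
    rec->lg += (int)add;
    return 0;
}
```

As in the question, a line read by fgets keeps its trailing newline unless the caller strips it first.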

I am trying to free the memory occupied by an element in the structure using free(), but it's not working

I have this struct Exam, and I am using the CleanUp function to allocate and free the memory occupied by title, but it's not freeing it.
typedef struct
{
char* title;
Question* questions[MAX_QUESTIONS];
}Exam;
BOOL CleanUp(Exam * e){
char name[200];
printf("Enter name of the course \n");
gets(name);
fflush(stdout);
e->title = (char*)malloc(sizeof(strlen(name)+1));
strcpy(e->title,name);
free(e->title);
}
sizeof(strlen(name)+1) is not correct: it gives you the size of the type of that expression, i.e. sizeof(size_t), not the length of the string. If the name is longer than that, you have allocated too little memory and are writing past the end of the buffer.
This corrupts the heap and can cause free() to fail.
What you mean to do is:
sizeof(char) * (strlen(name) + 1)
In C, sizeof(char) is guaranteed to be 1, so you don't actually need it here, however I've put it there to illustrate the general way to allocate memory for multiple objects: multiply the size of the object by the number of objects.
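As a small illustration of that rule, here is a sketch of a copy helper. The name copy_string is made up for this example; it is not part of the original code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of s, or NULL if allocation fails. The caller frees it. */
char *copy_string(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;               /* characters plus the terminating '\0' */
    char *copy = malloc(n * sizeof *copy);  /* object size times object count */
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}
```

In CleanUp this would be used as e->title = copy_string(name); followed later by free(e->title);.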
Surely you simply meant:
e->title = strdup(name);
...
free(e->title);
strdup() will measure the string pointed to by 'name', allocate space for a copy (including the null terminator) and copy the data in a sensible, architecture-aligned way (usually).
I think Whilom Chime gave a pretty adequate answer, as did Mr. Zebra. Another way to do it would be like so:
e->title = malloc(strlen(word) + 1);
if(e->title != NULL) strcpy(e->title, word);
However, I've found when working with really large data sets (I had to put ~3M words into a 2-3-4 tree a couple days ago), e->title = strdup(word); is actually faster than strcpy(e->title, word);. I don't know why, and it honestly doesn't make sense to me, seeing as strcpy doesn't have to go through the process of allocating memory for the character pointer. Maybe someone else can give input on this

Strange (Undefined?) Behavior of Free in C

This is really strange... and I can't debug it (tried for about two hours, debugger starts going haywire after a while...). Anyway, I'm trying to do something really simple:
Free an array of strings. The array is in the form:
char **myStrings. The array elements are initialized as:
myString[index] = malloc(strlen(word));
myString[index] = word;
and I'm calling a function like this:
free_memory(myStrings, size); where size is the length of the array (I know this is not the problem, I tested it extensively and everything except this function is working).
free_memory looks like this:
void free_memory(char **list, int size) {
for (int i = 0; i < size; i ++) {
free(list[i]);
}
free(list);
}
Now here comes the weird part. if (size> strlen(list[i])) then the program crashes. For example, imagine that I have a list of strings that looks something like this:
myStrings[0] = "Some";
myStrings[1] = "random";
myStrings[2] = "strings";
And thus the length of this array is 3.
If I pass this to my free_memory function, strlen(myStrings[0]) > 3 (4 > 3), and the program crashes.
However, if I change myStrings[0] to be "So" instead, then strlen(myStrings[0]) < 3 (2 < 3) and the program does not crash.
So it seems to me that free(list[i]) is actually going through the char[] that is at that location and trying to free each character, which I imagine is undefined behavior.
The only reason I say this is because I can play around with the size of the first element of myStrings and make the program crash whenever I feel like it, so I'm assuming that this is the problem area.
Note: I did try to debug this by stepping through the function that calls free_memory, noting any weird values and such, but the moment I step into the free_memory function, the debugger crashes, so I'm not really sure what is going on. Nothing is out of the ordinary until I enter the function, then the world explodes.
Another note: I also posted the shortened version of the source for this program (not too long; Pastebin) here. I am compiling on MinGW with the c99 flag on.
PS - I just thought of this. I am indeed passing numUniqueWords to the free function, and I know that this does not actually free the entire piece of memory that I allocated. I've called it both ways, that's not the issue. And I left it how I did because that is the way that I will be calling it after I get it to work in the first place, I need to revise some of my logic in that function.
Source, as per request (on-site):
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
#include "words.h"
int getNumUniqueWords(char text[], int size);
int main(int argc, char* argv[]) {
setvbuf(stdout, NULL, 4, _IONBF); // For Eclipse... stupid bug. --> does NOT affect the program, just the output to console!
int nbr_words;
char text[] = "Some - \"text, a stdin\". We'll have! also repeat? We'll also have a repeat!";
int length = sizeof(text);
nbr_words = getNumUniqueWords(text, length);
return 0;
}
void free_memory(char **list, int size) {
for (int i = 0; i < size; i ++) {
// You can see that printing the values is fine, as long as free is not called.
// When free is called, the program will crash if (size > strlen(list[i]))
//printf("Wanna free value %d w/len of %d: %s\n", i, strlen(list[i]), list[i]);
free(list[i]);
}
free(list);
}
int getNumUniqueWords(char text[], int length) {
int numTotalWords = 0;
char *word;
printf("Length: %d characters\n", length);
char totalWords[length];
strcpy(totalWords, text);
word = strtok(totalWords, " ,.-!?()\"0123456789");
while (word != NULL) {
numTotalWords ++;
printf("%s\n", word);
word = strtok(NULL, " ,.-!?()\"0123456789");
}
printf("Looks like we counted %d total words\n\n", numTotalWords);
char *uniqueWords[numTotalWords];
char *tempWord;
int wordAlreadyExists = 0;
int numUniqueWords = 0;
char totalWordsCopy[length];
strcpy(totalWordsCopy, text);
for (int i = 0; i < numTotalWords; i++) {
uniqueWords[i] = NULL;
}
// Tokenize until all the text is consumed.
word = strtok(totalWordsCopy, " ,.-!?()\"0123456789");
while (word != NULL) {
// Look through the word list for the current token.
for (int j = 0; j < numTotalWords; j ++) {
// Just for clarity, no real meaning.
tempWord = uniqueWords[j];
// The word list is either empty or the current token is not in the list.
if (tempWord == NULL) {
break;
}
//printf("Comparing (%s) with (%s)\n", tempWord, word);
// If the current token is the same as the current element in the word list, mark and break
if (strcmp(tempWord, word) == 0) {
printf("\nDuplicate: (%s)\n\n", word);
wordAlreadyExists = 1;
break;
}
}
// Word does not exist, add it to the array.
if (!wordAlreadyExists) {
uniqueWords[numUniqueWords] = malloc(strlen(word));
uniqueWords[numUniqueWords] = word;
numUniqueWords ++;
printf("Unique: %s\n", word);
}
// Reset flags and continue.
wordAlreadyExists = 0;
word = strtok(NULL, " ,.-!?()\"0123456789");
}
// Print out the array just for funsies - make sure it's working properly.
for (int x = 0; x <numUniqueWords; x++) {
printf("Unique list %d: %s\n", x, uniqueWords[x]);
}
printf("\nNumber of unique words: %d\n\n", numUniqueWords);
// Right below is where things start to suck.
free_memory(uniqueWords, numUniqueWords);
return numUniqueWords;
}
You've got an answer to this question, so let me instead answer a different question:
I had multiple easy-to-make mistakes -- allocating a wrong-sized buffer and freeing non-malloc'd memory. I debugged it for hours and got nowhere. How could I have spent that time more effectively?
You could have spent those hours writing your own memory allocators that would find the bug automatically.
When I was writing a lot of C and C++ code I made helper methods for my program that turned all mallocs and frees into calls that did more than just allocate memory. (Note that methods like strdup are malloc in disguise.) If the user asked for, say, 32 bytes, then my helper method would add 24 to that and actually allocate 56 bytes. (This was on a system with 4-byte integers and pointers.) I kept a static counter and a static head and tail of a doubly-linked list. I would then fill in the memory I allocated as follows:
Bytes 0-3: the counter
Bytes 4-7: the prev pointer of a doubly-linked list
Bytes 8-11: the next pointer of a doubly-linked list
Bytes 12-15: The size that was actually passed in to the allocator
Bytes 16-19: 01 23 45 67
Bytes 20-51: 33 33 33 33 33 33 ...
Bytes 52-55: 89 AB CD EF
And return a pointer to byte 20.
The free code would take the pointer passed in and subtract four, and verify that bytes 16-19 were still 01 23 45 67. If they were not then either you are freeing a block you did not allocate with this allocator, or you've written before the pointer somehow. Either way, it would assert.
If that check succeeded then it would go back four more and read the size. Now we know where the end of the block is and we can verify that bytes 52 through 55 are still 89 AB CD EF. If they are not then you are writing over the end of a block somewhere. Again, assert.
Now that we know that the block is not corrupt we remove it from the linked list, set ALL the memory of the block to CC CC CC CC ... and free the block. We use CC because that is the "break into the debugger" instruction on x86. If somehow we end up with the instruction pointer pointing into such a block it is nice if it breaks!
If there is a problem then you also know which allocation it was, because you have the allocation count in the block.
Now we have a system that finds your bugs for you. In the release version of your product, simply turn it off so that your allocator just calls malloc normally.
Moreover you can use this system to find other bugs. If for example you believe that you've got a memory leak somewhere all you have to do is look at the linked list; you have a complete list of all the outstanding allocations and can figure out which ones are being kept around unnecessarily. If you think you're allocating too much memory for a given block then you can have your free code check to see if there are a lot of 33 in the block that is about to be freed; that's a sign that you're allocating your blocks too big. And so on.
And finally: this is just a starting point. When I was using this debug allocator professionally I extended it so that it was threadsafe, so that it could tell me what kind of allocator was doing the allocation (malloc, strdup, new, IMalloc, etc.), whether there was a mismatch between the alloc and free functions, what source file contained the allocation, what the call stack was at the time of the allocation, what the average, minimum and maximum block sizes were, what subsystems were responsible for what memory usage...
C requires that you manage your own memory; this definitely has its pros and cons. My opinion is that the cons outweigh the pros; I much prefer to work in automatic storage languages. But the nice thing about having to manage your own storage is that you are free to build a storage management system that meets your needs, and that includes your debugging needs. If you must use a language that requires you to manage storage, use that power to your advantage and build a really powerful subsystem that you can use to solve professional-grade problems.
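To make the guard-byte idea concrete, here is a stripped-down sketch. It is not the allocator described above: the counter and the linked list are left out, the header is sized with C11's max_align_t instead of the fixed 32-bit layout, and the names dbg_malloc, dbg_block_ok and dbg_free are invented for this example. It also does not guard against overflow in the size arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { GUARD_LEN = 4 };
/* A multiple of the strictest alignment, so the user block stays aligned. */
enum { HEADER_LEN = 2 * sizeof(max_align_t) };

static const unsigned char HEAD_GUARD[GUARD_LEN] = {0x01, 0x23, 0x45, 0x67};
static const unsigned char TAIL_GUARD[GUARD_LEN] = {0x89, 0xAB, 0xCD, 0xEF};

_Static_assert(HEADER_LEN >= sizeof(size_t) + GUARD_LEN, "header too small");

/* Layout: [size ... padding ... head guard][user bytes][tail guard] */
void *dbg_malloc(size_t size)
{
    unsigned char *raw = malloc(HEADER_LEN + size + GUARD_LEN);
    if (raw == NULL) return NULL;
    unsigned char *user = raw + HEADER_LEN;
    memcpy(raw, &size, sizeof size);                  /* remember the requested size */
    memcpy(user - GUARD_LEN, HEAD_GUARD, GUARD_LEN);  /* guard just before the block */
    memset(user, 0x33, size);                         /* recognizable fill pattern */
    memcpy(user + size, TAIL_GUARD, GUARD_LEN);       /* guard just after the block */
    return user;
}

/* Nonzero if both guards around a dbg_malloc block are intact. */
int dbg_block_ok(const void *ptr)
{
    const unsigned char *user = ptr;
    size_t size;
    if (memcmp(user - GUARD_LEN, HEAD_GUARD, GUARD_LEN) != 0) return 0;
    memcpy(&size, user - HEADER_LEN, sizeof size);
    return memcmp(user + size, TAIL_GUARD, GUARD_LEN) == 0;
}

/* Verify the guards, overwrite the whole block with 0xCC, then release it. */
void dbg_free(void *ptr)
{
    if (ptr == NULL) return;
    assert(dbg_block_ok(ptr));   /* foreign pointer, underrun or overrun */
    unsigned char *user = ptr;
    size_t size;
    memcpy(&size, user - HEADER_LEN, sizeof size);
    memset(user - HEADER_LEN, 0xCC, HEADER_LEN + size + GUARD_LEN);
    free(user - HEADER_LEN);
}
```

With this in place, the question's malloc(strlen(word)) followed by a strcpy would overwrite the first tail-guard byte with the terminating '\0', and dbg_free would assert at once instead of crashing somewhere unrelated.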
The problem is not how you're freeing, but how you're creating the array. Consider this:
uniqueWords[numUniqueWords] = malloc(strlen(word));
uniqueWords[numUniqueWords] = word;
...
word = strtok(NULL, " ,.-!?()\"0123456789");
There are several issues here:
word = strtok(): what strtok returns is not something that you can free, because it has not been malloc'ed, i.e. it is not a copy; it just points to somewhere inside the underlying large string (the one you first called strtok with).
uniqueWords[numUniqueWords] = word: this is not a copy; it just assigns the pointer. The pointer that was there before (the one you malloc'ed) is overwritten, so that memory leaks.
malloc(strlen(word)): this allocates too little memory; it should be strlen(word)+1.
How to fix:
Option A: copy properly
// no malloc
uniqueWords[numUniqueWords] = strdup(word); // what strdup returns can be free'd
Option B: copy properly, slightly more verbose
uniqueWords[numUniqueWords] = malloc(strlen(word)+1);
strcpy(uniqueWords[numUniqueWords], word); // use the malloc'ed memory to copy to
Option C: don't copy, don't free
// no malloc
uniqueWords[numUniqueWords] = word; // not a copy, this still points to the big string
// don't free this, ie don't free(list[i]) in free_memory
EDIT: As others have pointed out, this is also problematic:
char *uniqueWords[numTotalWords];
This is a variable-length array (standard since C99) with automatic storage, so you cannot (and must not) free it. Try char **uniqueWords = malloc(sizeof(char*) * numTotalWords) instead. Again, the problem is not the free() but the way you allocate. You are on the right track with the free; you just need to match every free with a malloc, or with something documented as equivalent to a malloc (like strdup).
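Here is one way Option B and a heap-allocated pointer array could fit together, as a sketch rather than a drop-in fix. The names unique_words and free_words and the delimiter parameter are assumptions made for this example. Every pointer handed to free comes from malloc:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Free every copied word, then the array of pointers itself. */
void free_words(char **list, size_t count)
{
    if (list == NULL) return;
    for (size_t i = 0; i < count; i++) free(list[i]);
    free(list);
}

/* Split text on delims and return malloc'ed copies of the distinct words.
   The number of words is stored in *count. Returns NULL only if an
   allocation fails. */
char **unique_words(const char *text, const char *delims, size_t *count)
{
    assert(text != NULL && delims != NULL && count != NULL);
    *count = 0;
    size_t len = strlen(text);
    char *scratch = malloc(len + 1);     /* strtok modifies the buffer it scans */
    /* len characters can hold at most len/2 + 1 delimiter-separated words. */
    char **words = malloc((len / 2 + 1) * sizeof *words);
    if (scratch == NULL || words == NULL) {
        free(scratch);
        free(words);
        return NULL;
    }
    memcpy(scratch, text, len + 1);

    for (char *tok = strtok(scratch, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        size_t i = 0;
        while (i < *count && strcmp(words[i], tok) != 0) i++;
        if (i < *count) continue;        /* already in the list */

        char *copy = malloc(strlen(tok) + 1);   /* +1 for the '\0' */
        if (copy == NULL) {
            free_words(words, *count);
            free(scratch);
            *count = 0;
            return NULL;
        }
        strcpy(copy, tok);               /* a real copy, so free(copy) is valid */
        words[(*count)++] = copy;
    }
    free(scratch);                       /* safe: words[] holds copies only */
    return words;
}
```

Because the stored words are copies, the scratch buffer can be released before returning, and free_words can free each element and then the array.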
You are using this code in an attempt to allocate the memory:
uniqueWords[numUniqueWords] = malloc(strlen(word));
uniqueWords[numUniqueWords] = word;
numUniqueWords++;
This is wrong on many levels.
You need to allocate strlen(word)+1 bytes of memory.
You need to strcpy() the string over the allocated memory; at the moment, you simply throw the allocated memory away.
Your array uniqueWords is itself not allocated, and the word values you have stored are from the original string which has been mutilated by strtok().
As it stands, you cannot free any memory because you've already lost the pointers to the memory that was allocated and the memory you are trying to free was never in fact allocated by malloc() et al.
And you should be error checking the memory allocations too. Consider using strdup() to duplicate strings.
You are trying to free char *uniqueWords[numTotalWords];, which is not allowed in C: uniqueWords is allocated on the stack, and you can't call free on stack memory.
Just remove the last free call, like this:
void free_memory(char **list, int size) {
for (int i = 0; i < size; i ++) {
free(list[i]);
}
}
A proper way of allocating and deallocating a two-dimensional char array:
char **foo = (char **) malloc(row* sizeof(char *));
*foo = malloc(row * col * sizeof(char));
for (int i = 1; i < row; i++) {
foo[i] = *foo + i*col;
}
free(*foo);
free(foo);
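Note that you don't need to free each row separately here, because all rows live in the single block allocated for *foo; the pointers foo[1], foo[2], ... only point into that block. One free for the data block and one for the pointer array match the two malloc calls.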

Malloc or calloc

Here is a very small structure used for indexing words of a file. Its members are a string (the word), an array of integers (the lines this word is found at), and an integer representing the index of the first free cell in the lines array.
typedef struct {
wchar_t * word;
int * lines;
int nLine;
} ndex;
ndex * words;
I am trying to allocate (ndex)es nb_words = 128 at a time, and (lines) nb_lines = 8 at a time, using malloc and realloc.
First question, what is the difference between malloc(number * size) and calloc(number, size) when allocating *words and/or *lines? Which should I choose?
Second question, I gdbed this:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400cb0 in add_line (x=43, line=3) at cx17.3.c:124
124 words[x].lines[words[x].nLine] = line;
(gdb) p words[43].nLine
$30 = 0
In other words, it consistently fails at
words[43].lines[0] = 3;
Since I allocate words by 128, and lines by 8, there is no reason the indexing worked for the 42 previous words and fail here, except if my allocating was botched, is there?
Third question: here are my allocations, what is wrong with them?
words = malloc(sizeof(ndex *) * nb_words);
short i;
for (i = 0; i < nb_words; i++) {
words[i].lines = malloc(sizeof(int) * nb_lines);
words[i].nLine = 0;
}
Should I initialize lines in a for(j) loop? I don't see why leaving it uninitialized would prevent writing to it, so long as it as been allocated.
This C is a very mysterious thing to me, thanks in advance for any hints you can provide.
Best regards.
This looks suspicious:
sizeof(ndex *)
You probably don't want the size of a pointer - you want the size of a structure. So remove the star.
Here:
words = malloc(sizeof(ndex *) * nb_words);
You are allocating space for some number of pointers (i.e., sizeof(ndex *) * nb_words bytes, typically 4 or 8 bytes per pointer). What you really need is to allocate some number of ndex structures:
words = malloc(sizeof(ndex) * nb_words);
Also, calloc zero-initializes the returned buffer while malloc does not.
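A short sketch of the corrected allocation, using sizeof *words so the element size always follows the pointer's type. The function names alloc_index and free_index are made up here, and the struct is the one from the question:

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

typedef struct {
    wchar_t *word;
    int *lines;
    int nLine;
} ndex;

/* Allocate nb_words entries, each with room for nb_lines line numbers.
   Returns NULL if any allocation fails. */
ndex *alloc_index(size_t nb_words, size_t nb_lines)
{
    assert(nb_words > 0 && nb_lines > 0);
    /* sizeof *words is the size of one ndex, not the size of a pointer. */
    ndex *words = calloc(nb_words, sizeof *words);
    if (words == NULL) return NULL;

    for (size_t i = 0; i < nb_words; i++) {
        words[i].word = NULL;
        words[i].nLine = 0;
        words[i].lines = malloc(nb_lines * sizeof *words[i].lines);
        if (words[i].lines == NULL) {
            while (i > 0) free(words[--i].lines);  /* undo earlier rows */
            free(words);
            return NULL;
        }
    }
    return words;
}

/* Release everything alloc_index created (the word strings are not owned here). */
void free_index(ndex *words, size_t nb_words)
{
    if (words == NULL) return;
    for (size_t i = 0; i < nb_words; i++) free(words[i].lines);
    free(words);
}
```

The contents of each lines array are left uninitialized on purpose: they are only written, never read, before nLine says they are valid.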
malloc will allocate the requested space only. calloc will allocate the space and initialize to zero.
In your example, the segmentation fault is observed at words[x].lines[words[x].nLine] = line;. There are two possibilities. One is that the allocation is wrong, which I don't feel is the case. The more probable one is that words[x].nLine didn't evaluate to 0. Please print this value and check. I suspect it is some huge number that forces your program to access memory outside your allocated space.
Others have answered this part, so I will skip it.

How to allocate memory for an array of strings of unknown length in C

I have an array, say, text, that contains strings read in by another function. The length of the strings is unknown and the amount of them is unknown as well. How should I try to allocate memory to an array of strings (and not to the strings themselves, which already exist as separate arrays)?
What I have set up right now seems to read the strings just fine, and seems to do the post-processing I want done correctly (I tried this with a static array). However, when I try to printf the elements of text, I get a segmentation fault. To be more precise, I get a segmentation fault when I try to print out specific elements of text, such as text[3] or text[5]. I assume this means that I'm allocating memory to text incorrectly and all the strings read are not saved to text correctly?
So far I've tried different approaches, such as allocating a set amount of some size_t=k , k*sizeof(char) at first, and then reallocating more memory (with realloc k*sizeof(char)) if cnt == (k-2), where cnt is the index of **text.
I tried to search for this, but the only similar problem I found was with a set amount of strings of unknown length.
I'd like to figure out as much as I can on my own, and didn't post the actual code because of that. However, if none of this makes any sense, I'll post it.
EDIT: Here's the code
int main(void){
char **text;
size_t k=100;
size_t cnt=1;
int ch;
size_t lng;
text=malloc(k*sizeof(char));
printf("Input:\n");
while(1) {
ch = getchar();
if (ch == EOF) {
text[cnt++]='\0';
break;
}
if (cnt == k - 2) {
k *= 2;
text = realloc(text, (k * sizeof(char))); /* I guess at least this is incorrect?*/
}
text[cnt]=readInput(ch); /* read(ch) just reads the line*/
lng=strlen(text[cnt]);
printf("%d,%d\n",lng,cnt);
cnt++;
}
text=realloc(text,cnt*sizeof(char));
print(text); /*prints all the lines*/
return 0;
}
The short answer is you can't directly allocate the memory unless you know how much to allocate.
However, there are various ways of determining how much you need to allocate.
There are two aspects to this. One is knowing how many strings you need to handle. There must be some defined way of knowing; either you're given a count, or there is some specific pointer value (usually NULL) that tells you when you've reached the end.
To allocate the array of pointers to pointers, it is probably simplest to count the number of necessary pointers, and then allocate the space. Assuming a null terminated list:
size_t i;
for (i = 0; list[i] != NULL; i++)
;
char **space = malloc(i * sizeof(*space));
...error check allocation...
For each string, you can use strdup(); you assume that the strings are well-formed and hence null terminated. Or you can write your own analogue of strdup().
for (i = 0; list[i] != NULL; i++)
{
space[i] = strdup(list[i]);
...error check allocation...
}
An alternative approach scans the list of pointers once, but uses malloc() and realloc() multiple times. This is probably slower overall.
If you can't reliably tell when the list of strings ends or when the strings themselves end, you are hosed. Completely and utterly hosed.
C doesn't have strings. It just has pointers to (conventionally null-terminated) sequences of characters, and calls them strings.
So just allocate first an array of pointers:
size_t nbelem= 10; /// number of elements
char **arr = calloc(nbelem, sizeof(char*));
You really want calloc because you really want that array to be cleared, so each pointer there is NULL. Of course, you test that calloc succeeded:
if (!arr) perror("calloc failed"), exit(EXIT_FAILURE);
At last, you fill some of the elements of the array:
arr[0] = "hello";
arr[1] = strdup("world");
(Don't forget to free the result of strdup and the result of calloc).
You could grow your array with realloc (but I don't advise doing that, because if you assign the result straight back to the same pointer and realloc fails, you lose the only pointer to your data). You could simply grow it by allocating a bigger array, copying the old contents into it, and redefining the pointer, e.g.
{ size_t newnbelem = 3*nbelem/2+10;
char**oldarr = arr;
char**newarr = calloc(newnbelem, sizeof(char*));
if (!newarr) perror("bigger calloc"), exit(EXIT_FAILURE);
memcpy (newarr, oldarr, sizeof(char*)*nbelem);
free (oldarr);
arr = newarr;
}
Don't forget to compile with gcc -Wall -g on Linux (improve your code till no warnings are given), and learn how to use the gdb debugger and the valgrind memory leak detector.
In C you cannot allocate an array of strings directly. Use an array of pointers to char as your array of strings:
char* strarr[length];
To maintain the character arrays themselves, you can take an approach like this:
Allocate a block of memory with a call to malloc().
Keep track of the size of the input.
Whenever you need to increase the buffer size, call realloc(ptr, size).
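Those steps could look roughly like this for the array of pointers itself. The string_list type and its functions are hypothetical names for this sketch. The element size is sizeof(char *), not sizeof(char), and the realloc result goes through a temporary so the old array survives a failure:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A growable array of heap-allocated strings; start with {NULL, 0, 0}. */
typedef struct {
    char **items;      /* the pointers; each one owns its own string */
    size_t count;      /* how many entries are in use */
    size_t capacity;   /* how many entries are allocated */
} string_list;

/* Append a copy of s. Returns 0 on success, -1 on allocation failure. */
int string_list_push(string_list *list, const char *s)
{
    assert(list != NULL && s != NULL);
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 8;
        /* Size in units of pointers. */
        char **bigger = realloc(list->items, new_cap * sizeof *bigger);
        if (bigger == NULL) return -1;   /* list->items is still valid */
        list->items = bigger;
        list->capacity = new_cap;
    }
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL) return -1;
    strcpy(copy, s);
    list->items[list->count++] = copy;
    return 0;
}

/* Free every string, then the pointer array, and reset the list. */
void string_list_free(string_list *list)
{
    assert(list != NULL);
    for (size_t i = 0; i < list->count; i++) free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

Doubling the capacity keeps the number of realloc calls small even when the final number of lines is unknown in advance.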
