C - list of char*'s - Memory Allocation - c

I am confused about how to allocate memory correctly. I am trying to make a list of char*'s from a text file. Every time I make a char* do I have to allocate memory for it? When and where are the exceptions?
#define BUFF 1000
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char* word = calloc(BUFF, sizeof(char));
char* sentence = calloc(BUFF, sizeof(char));
char** list = calloc(BUFF, sizeof(char*));
int i = 0;
while((fgets(sentence, BUFF, file)) != NULL)
{
word = strtok(sentence, " ,/.");
while(word != NULL)
{
printf("%s\n", word);
strcpy(list[i], word);
i++;
word = strtok(NULL, " ,/.");
}
}
int k;
for(k = 0; k < i; k++)
{
puts("segging here");
printf("%s\n", list[i]);
}

The rule is: you have to allocate any memory that you use.
Your problem comes in:
strcpy(list[i], word);
list[i] is currently not pointing to any allocated storage (it's probably a null pointer). You have to make it point somewhere before you can copy characters into it.
One way would be:
list[i] = strdup(word);
strdup is a POSIX function and was not part of ISO C before C23, but it is equivalent to doing malloc then strcpy. You will need to free the result afterwards.
Also, the i++ loop needs to stop when i == BUFF, and it would be useful to add \n to the list of strtok separators.
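To make the points above concrete, here is a minimal sketch of the tokenizing loop with each token copied into its own allocation and the index bounded. The names dup_string and split_words are hypothetical helpers introduced for illustration (dup_string is a portable stand-in for strdup); they are not part of the original code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: portable stand-in for strdup().
   Returns a heap copy of s, or NULL if allocation fails. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: splits `line` in place and stores a heap copy of
   each token in `list`. Stops after `max` tokens or on allocation
   failure. Returns the number of tokens stored. */
static size_t split_words(char *line, char **list, size_t max)
{
    size_t n = 0;
    char *word = strtok(line, " ,/.\n"); /* points into line, no allocation */

    while (word != NULL && n < max) {
        list[n] = dup_string(word);      /* list[n] now owns its own storage */
        if (list[n] == NULL)
            break;
        n++;
        word = strtok(NULL, " ,/.\n");
    }
    return n;
}
```

The caller is expected to free every list[i] that was stored, and then the list itself if it was allocated dynamically.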

In addition to Matt McNabb's answer, there's also a more subtle problem with your usage of strtok. That function doesn't require an output buffer; it just returns a pointer to somewhere inside the input buffer.
When you call char* word = calloc(BUFF, sizeof(char));, you allocate memory and assign word to point to the allocated memory. Then, when you call word = strtok(sentence, " ,/.");, you overwrite the value of word. That means that no pointer in your control points to the memory you've allocated. That memory is no longer useful to your code, and you can't deallocate it; it has been leaked.
You can fix this issue by writing char* word = strtok(sentence, " ,/."); Then, since you didn't allocate the memory that word points to, remember not to free it either.

Your list is an array of BUFF char* pointers, but what does list[i] point to?
You never allocate memory for it.
You need to allocate memory for each list[i] inside the loop.

Related

C - Saving strings into array elements

I have a notepad file with approximately 150,000 words (representing a dictionary). I'm trying to scan in each word and print it to the console. This setup works fine:
void readDictionary(FILE *ifp, int numWords) {
fscanf(ifp, "%d", &numWords);
printf("%d\n", numWords);
int i;
char* words = (char*)malloc(20 * sizeof(char));
for(i = 0; i < numWords; i++) {
fscanf(ifp, "%s", words);
printf("%s\n", words);
}
}
However, this code obviously overwrites "words" each time it loops. I'm trying to get each word to save to a certain array element. I did the following but it instantly crashes (I changed the memory allocation to 2D because I read around here and it seems that is what I am supposed to do):
void readDictionary(FILE *ifp, int numWords) {
fscanf(ifp, "%d", &numWords);
printf("%d\n", numWords);
int i;
char** words = (char**)malloc(20 * sizeof(char*));
for(i = 0; i < numWords; i++) {
fscanf(ifp, "%s", words[i]);
printf("%s\n", words[i]);
}
}
Any help is appreciated. I've read around on many posts but haven't figured it out.
In your second version, you allocate space for 20 pointers, but you leave those pointers uninitialized and without anything to point to. I'm sure you can imagine how that presents a problem when you then try to read from your dictionary into the memory designated by one of those pointers.
It looks like you want to allocate space for numWords pointers
char** words = malloc(numWords * sizeof(*words));
and then, for each of them, allocate space for a word.
for(i = 0; i < numWords; i++) {
words[i] = malloc(20); // by definition, sizeof(char) == 1
// ...
Additionally, do check the return value of malloc(), which will be NULL in the event of allocation failure.
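As a rough sketch of the two-level allocation described above, with the malloc return values checked, something like the following could work. The function names alloc_word_table and free_word_table are made up for this example, and calloc is used here (instead of malloc) only so that every word starts out as an empty string.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the first `count` words and the table itself. */
static void free_word_table(char **words, size_t count)
{
    if (words == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(words[i]);
    free(words);
}

/* Hypothetical helper: allocates `count` pointers, then `word_size` bytes
   for each of them. Returns NULL (with nothing leaked) on failure. */
static char **alloc_word_table(size_t count, size_t word_size)
{
    char **words = calloc(count, sizeof *words);
    if (words == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        words[i] = calloc(word_size, 1);  /* sizeof(char) is 1 by definition */
        if (words[i] == NULL) {
            free_word_table(words, i);    /* undo the allocations made so far */
            return NULL;
        }
    }
    return words;
}
```

A caller would pair each alloc_word_table(n, size) with free_word_table(words, n) once the words are no longer needed.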
The first problem is you only allocated space for a list of words (ie. character pointers) but you didn't allocate space for the words themselves.
char** words = (char**)malloc(20 * sizeof(char*));
This allocates space for 20 character pointers and assigns it to words. Now words[i] has space for a character pointer but not for the characters.
words[i] contains garbage, because malloc does not initialize memory. When you pass it into fscanf, fscanf tries to use the garbage in words[i] as a memory location to write characters to. That's either going to corrupt some memory in the program, or more likely it tries to write to a memory location it isn't allowed to touch and crashes. Either way, it's not good.
You have to allocate memory for the string, pass that to fscanf, and finally put that string into words[i].
char** words = malloc(numWords * sizeof(char*));
for(i = 0; i < numWords; i++) {
char *word = malloc(40 * sizeof(char));
fscanf(ifp, "%39s", word);
words[i] = word;
printf("%s\n", words[i]);
}
Note that I didn't cast the result of malloc, that's generally considered unnecessary.
Also note that I allocated space for numWords in the list. Your original only allocates space for 20 words; once it goes over that, it'll start writing past the end of the allocated memory and probably crash. As a rule of thumb, avoid constant memory allocations. Get used to dynamic memory allocation as quickly as you can.
Also note that I limited how many characters fscanf is allowed to read to the size of my buffer (minus one because of the null byte at the end of strings). Otherwise if your word list contained "Pneumonoultramicroscopicsilicovolcanoconiosis", 45 characters, it would overrun the word buffer and start scribbling on adjacent elements and that would be bad.
This leads to a new problem that is common to fscanf and scanf: partial reads. When the code above encounters "Pneumonoultramicroscopicsilicovolcanoconiosis", fscanf(ifp, "%39s", word); will read in the first 39 characters, "Pneumonoultramicroscopicsilicovolcanoco", and stop. The next call to fscanf will read "niosis". You'll store and print them as if they were two words. That's no good.
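The split can be reproduced without a file by running the same %39s conversion over a string with sscanf. This is only an illustrative sketch: read_word39 is a hypothetical helper, and %n is used to report how many characters were consumed so the second call can continue where the first stopped.

```c
#include <stdio.h>

/* Hypothetical helper: reads at most 39 non-space characters from `src`
   into `word` (which must hold at least 40 bytes). Returns the number
   of characters consumed from `src`, or -1 if no word was found. */
static int read_word39(const char *src, char word[40])
{
    int used = 0;
    if (sscanf(src, "%39s%n", word, &used) != 1)
        return -1;
    return used;
}
```

Calling it twice on the 45-letter word, the second time starting at the offset returned by the first call, yields the 39-character prefix and then "niosis", which mirrors the behaviour described for fscanf.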
You could solve this by making the word buffer bigger, but now most words will be wasting a lot of memory.
scanf and fscanf have a whole lot of problems and are best avoided. Instead, it's best to read entire lines and parse them with sscanf. In this case you don't need to do any parsing, they're just strings, so getting the line will suffice.
fgets is the usual way to read a line, but that also requires that you try and guess how much memory you'll need to read in the line. To mitigate that, have a large line buffer, and copy the words out of it.
void strip_newline( char* string ) {
size_t len = strlen(string);
/* Guard against an empty string before looking at the last character */
if( len > 0 && string[len-1] == '\n' ) {
string[len-1] = '\0';
}
}
...
int i;
/* The word list */
char** words = malloc(numWords * sizeof(char*));
/* The line buffer */
char *line = malloc(1024 * sizeof(char));
for(i = 0; i < numWords; i++) {
/* Read into the line buffer */
fgets(line, 1024, ifp);
/* Strip the newline off, fgets() doesn't do that */
strip_newline(line);
/* Copy the line into words */
words[i] = strdup(line);
printf("%s\n", words[i]);
}
strdup won't copy all 1024 bytes, just enough for the word. This will result in using only the memory you need.
Making assumptions about files, like that they'll have a certain number of lines, is a recipe for problems. Even if the file says it contains a certain number of lines you should still verify that. Otherwise you'll get bizarre errors as you try to read past the end of the file. In this case, if the file has fewer than numWords lines, it'll try to read garbage and probably crash. Instead, you should read the file until there are no more lines.
Normally this is done by checking the return value of fgets in a while loop.
int i;
for( i = 0; fgets(line, 1024, ifp) != NULL; i++ ) {
strip_newline(line);
words[i] = strdup(line);
printf("%s\n", words[i]);
}
This brings up a new problem, how do we know how big to make words? You don't. This brings us to growing and reallocating memory. This answer is getting very long, so I'll just sketch it.
char **readDictionary(FILE *ifp) {
/* Allocate a decent initial size for the list */
size_t list_size = 256;
char** words = malloc(list_size * sizeof(char*));
char *line = malloc(1024 * sizeof(char));
size_t i;
for( i = 0; fgets(line, 1024, ifp) != NULL; i++ ) {
strip_newline(line);
/* If we're about to overflow the list, double its size.
   One slot is always kept spare for the terminating NULL. */
if( i + 1 >= list_size ) {
list_size *= 2;
words = realloc( words, list_size * sizeof(char*));
}
words[i] = strdup(line);
}
/* Null terminate the list so readers know when to stop */
words[i] = NULL;
/* The line buffer is no longer needed */
free(line);
return words;
}
int main() {
FILE *fp = fopen("/usr/share/dict/words", "r");
char **words = readDictionary(fp);
for( int i = 0; words[i] != NULL; i++ ) {
printf("%s\n", words[i]);
}
}
Now the list will start at size 256 and grow as needed. Doubling grows pretty fast without wasting too much memory. My /usr/share/dict/words has 235886 lines in it. That can be stored in 2^18 or 262144 slots. 256 is 2^8, so it only requires 10 expensive calls to realloc to grow to the necessary size.
I've changed it to return the list, because there isn't much good in building the list if you're just going to use it immediately. This allows me to demonstrate another technique in working with dynamically sized lists, null termination. The last element in the list is set to NULL so anyone reading the list knows when to stop. This is safer and simpler than trying to pass the length around with the list.
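For a NULL-terminated list like this, counting and freeing both come down to walking until the NULL entry. A minimal sketch, assuming every entry was allocated individually (for example with strdup) and that the list itself came from malloc or realloc; word_list_length and free_word_list are hypothetical names:

```c
#include <stdlib.h>

/* Hypothetical helper: counts the entries before the terminating NULL. */
static size_t word_list_length(char *const *words)
{
    size_t n = 0;
    while (words[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: frees every word, then the list itself. */
static void free_word_list(char **words)
{
    if (words == NULL)
        return;
    for (size_t i = 0; words[i] != NULL; i++)
        free(words[i]);
    free(words);
}
```

One trade-off worth keeping in mind is that finding the length costs a full walk over the list each time, so callers that need it repeatedly may prefer to cache it.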
That was a lot, but that's all the basic stuff you need to do when working with files in C. It's good to do it manually, but fortunately there are libraries out there which make doing this sort of thing a lot easier. For example, GLib (from the GNOME project) provides a lot of basic functionality including arrays of pointers that automatically grow as needed.

Access value of a pointer in dynamically allocated memory

My assignment is to read words from a text file and store them in character arrays which are stored in an array of char*. All memory in these arrays needs to be dynamically allocated.
What I am doing is reading in each word with fscanf() and storing it into the variable str. I am then calculating the length of the word in str and dynamically allocating memory to store the value of str in the character array new_word. new_word is then inserted into the array of char* named words. When words runs out of space, I double its size and continue.
My problem lies in the commented code starting on line 62. I'm going to need to read these words later from words, so I'm testing my ability to access the pointers and their values. I can index new_word fine (in the lines above), but when I then store new_word in words and try to read from words, I get the following error:
hw1.c:63:25: error: subscripted value is not an array, pointer, or vector
while (*(words[count])[k] != '\0'){
on lines 63 and 64. I know it has something to do with dereferencing the pointer, but I have tried a bunch of variations with no success. How can I fix this?
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char* argv[]){
if (argc != 3){
fprintf(stderr, "Incorrect number of arguments\n");
exit(1);
}
char* infile = argv[1];
FILE* finp = fopen(infile, "r");
if (finp == NULL){
fprintf(stderr, "Unable to open input file\n");
exit(1);
}
char* prefix = argv[2];
int count = 0;
int size = 20;
char* words = calloc(size, sizeof(char));
printf("Allocated initial array of 20 character pointers.\n");
char* str = malloc(30*sizeof(char));
while (fscanf(finp, "%s", str) == 1){
if (count == size){
words = realloc(words, 2 * size);
size *= 2;
printf("Reallocated array of %d character pointers.\n", size);
}
int i = 0;
while (str[i] != '\0'){
i++;
}
char* new_word = malloc((i+1)*sizeof(char));
int j = 0;
while (str[j] != '\0'){
new_word[j] = str[j];
j++;
}
new_word[j] = '\0';
int k = 0;
while (new_word[k] != '\0'){
printf("%c", new_word[k]);
k++;
}
printf("\n");
words[count] = *new_word;
/*k = 0;
while (*(words[count])[k] != '\0'){
printf("%c", *(words[count])[k]);
k++;
}
printf("\n");*/
count++;
}
}
Ok, dissecting that a bit:
char* words = calloc(size, sizeof(char));
this should probably read:
char **words = calloc(size, sizeof(char *));
Why? What you want here is a pointer to an array of pointers to char ... words points to the first char *, which points to your first "string".
char* str = malloc(30*sizeof(char));
while (fscanf(finp, "%s", str) == 1){
Buffer overflow here. Make sure to read at most 29 characters (leaving room for the terminating null byte) if your buffer only holds 30. Btw, just for convention, call your buffer buffer or buf (not str) and there's really no need to dynamically allocate it. Hint: Use a field width for fscanf() or, even better, some other function like fgets().
if (count == size){
words = realloc(words, 2 * size);
size *= 2;
printf("Reallocated array of %d character pointers.\n", size);
}
The realloc here will not work; it should read
words = realloc(words, 2 * size * sizeof(char *));
You need to multiply by the size of a single element, which, in this case, is a pointer to char.
No guarantee this will be all errors, but probably the most important ones. On a sidenote, strlen() and strncpy() will help you stop writing unnecessary code.
A pointer to "A [dynamically-allocated] array of char*" would need to be recorded in a variable of type char **. That is, a pointer to the first element of the array, which element is of type char *. Thus ...
char **words;
If you want to have sufficient space for size words, then you could allocate it as ...
words = calloc(size, sizeof(char *));
(note the difference from your code), though it's harder to make a mistake with this form:
words = calloc(size, sizeof(*words));
Note in that case that the sizeof operator does not evaluate its operand, so it does not matter that words is not yet allocated.
Most importantly, be aware that the elements of array words are themselves pointers, not the ultimately pointed-to strings. Thus you assign a new word to the array by
words[count] = new_word;
(Again, note the difference from your version.) Other adjustments are needed as well.
The problematic while loop, though, is not fixed even then. Remember that the expression pointer[index] is equivalent to *((pointer) + (index)), so the expression *(words[count])[k] attempts to triply dereference words. Even with the type correction, you want only to doubly dereference it: words[count][k].
But why re-invent the wheel? As Olaf observed with respect to strlen() and some of your earlier code, C already has perfectly good functions in its standard library for dealing with strings. In this case ...
printf("%s", words[count]);
... would be so much simpler than that while loop.
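Putting the corrections from both answers together, the "store a copy and grow when full" step could be sketched as below. This is an illustration under assumptions, not the assignment's required structure: append_word is a hypothetical helper, and the starting capacity of 20 simply mirrors the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: appends a heap copy of `str` to the list.
   *words is a char ** that grows by doubling. Returns 0 on success,
   -1 on allocation failure (the existing list is left untouched). */
static int append_word(char ***words, size_t *count, size_t *capacity,
                       const char *str)
{
    if (*count == *capacity) {
        size_t new_cap = (*capacity == 0) ? 20 : *capacity * 2;
        /* Multiply by the element size: each element is a char * */
        char **grown = realloc(*words, new_cap * sizeof **words);
        if (grown == NULL)
            return -1;
        *words = grown;
        *capacity = new_cap;
    }

    size_t len = strlen(str);
    char *new_word = malloc(len + 1);
    if (new_word == NULL)
        return -1;
    memcpy(new_word, str, len + 1);   /* copies the terminator too */

    (*words)[(*count)++] = new_word;  /* store the pointer, not *new_word */
    return 0;
}
```

After a successful call, words[i][k] reads character k of word i, and printf("%s", words[i]) prints the whole word.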

How to store fgets string results into an char array?

I am currently getting the following error
Process terminated with status -1073741819
and I suspect its my fgets() but I have no idea why this is happening, any help would be much appreciated.
//Gets Dictionary from file
char* GetDictionary() {
int ArraySize;
int i = 0;
FILE * DictionaryFile;
//Gets first line (in this case it is the amount of Lines)
DictionaryFile = fopen("dictionary.txt", "r");
fscanf(DictionaryFile,"%d", &ArraySize);
ArraySize = ArraySize + 1;
printf("%d", ArraySize);
fclose(DictionaryFile);
//Gets the array
char* Dictionary = malloc(sizeof(char)*ArraySize);
char Temp[ArraySize];
char TempArray[ArraySize];
DictionaryFile = fopen("dictionary.txt", "r");
while(fgets(Temp, sizeof Temp, DictionaryFile)!=NULL) {
Dictionary[i] = Temp;
//Check The array
printf("%s", Dictionary[i]);
i++;
}
fclose(DictionaryFile);
return Dictionary;
}
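-1073741819 is 0xC0000005 when viewed as an unsigned 32-bit value. On Windows that is the access-violation status code, meaning the program read or wrote memory it was not allowed to touch. It is a process exit status rather than an errno value, so strerror() will not decode it.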
Code has many issues: Here are some corrected to get you going.
1) Allocate an array of pointers, not an array of char
// char* Dictionary = malloc(sizeof(char)*ArraySize);
char** Dictionary = malloc(ArraySize * sizeof *Dictionary);
2) Form a big buffer to read each line
char Temp[100];
3) After reading each line, get rid of the likely trailing '\n'
size_t len = strlen(Temp);
if (len && Temp[len-1] == '\n') Temp[--len] = 0;
4) Allocate memory for that word and save
Dictionary[i] = malloc(len + 1);
assert(Dictionary[i]);
memcpy(Dictionary[i], Temp, len + 1);
5) Robust code frees its allocations before completion
6) Code reads "amount of Lines" twice as file is opened twice. Just leave file open (and not re-open it). @user3386109
You likely want Dictionary to be an array of char strings. That is, Dictionary is an array, and each element in the array is a char *. That makes Dictionary a char **.
For this example, it may be most straightforward to allocate memory for the Dictionary array itself, then allocate memory for its contents. You'll need to free all this when you're done, of course.
char **Dictionary = malloc(sizeof(char *) * ArraySize);
for (int i = 0; i < ArraySize; i++) {
Dictionary[i] = malloc(ArraySize);
}
There are better ways to do this. For one, you might only allocate memory when you need it, for each fgets() return. You could also use strdup() to allocate only the memory you need. You could also pass in Dictionary from the caller, already allocated, so you don't worry about allocating it here.
Later in your program, as @WhozCraig pointed out, you need to copy the string in Temp, like strcpy(Dictionary[i], Temp), in place of Dictionary[i] = Temp. I too am surprised that's not generating a compiler warning!
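The "allocate only when you need it" idea mentioned above might look like the following sketch: read one line into a fixed buffer, strip the newline, and return a copy sized to fit. The helper name read_line_copy and the 100-byte buffer are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads the next line from `fp` and returns a heap
   copy without the trailing newline. Returns NULL at end of file, on a
   read error, or if allocation fails. The caller frees the result. */
static char *read_line_copy(FILE *fp)
{
    char temp[100];                    /* assumed to be long enough for a word */

    if (fgets(temp, sizeof temp, fp) == NULL)
        return NULL;

    temp[strcspn(temp, "\n")] = '\0';  /* cut at the newline, if present */

    size_t len = strlen(temp);
    char *copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, temp, len + 1);
    return copy;
}
```

Each returned pointer can be stored directly in Dictionary[i], since every call produces a separate allocation rather than a pointer to the shared Temp buffer.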

Segmentation error in C while allocation memory

I am a total beginner at C programming and am trying to write a program that reads the value of the "stat" file in /proc/. It works for the first few entries, but then it returns "Segmentation error (core dumped)".
So far I found out that the error has to do with the allocation of memory, but I can't seem to find a way to fix it.
My code so far is:
char* readFile(char* filename)
{
FILE *fp;
struct stat buf;
fp=fopen(filename,"r");
stat(filename,&buf);
char *string = malloc(buf.st_size);
char *s;
while(!feof(fp))
{
s=malloc(1024);
fgets(s,1024,fp);
s[strlen(s)-1]='\0';
strcat(string,s);
}
return string;
}
char* readStat(char* path, int statNumber)
{
char* str = malloc(sizeof(readFile(path)));
str = readFile(path);
char * pch = malloc(sizeof(str));
char * vals;
pch = strtok (str," ");
int i = 1;
while (pch != NULL)
{
if(i == statNumber)
vals = pch;
pch = strtok(NULL, " ");
i++;
}
return vals;
}
1) The
s=malloc(1024);
should not be inside the while loop; it should be outside it, before the loop starts.
And free it before leaving the function:
free(s);
2) add
string[0] = '\0';
just after
char *string = malloc(buf.st_size);
Otherwise the strcat will not work properly
3) You do not need to allocate memory for the str pointer because the readFile function already did
char* str = malloc(sizeof(readFile(path)));
Just replace it with
char* str;
4) And also replace
char * pch = malloc(sizeof(str));
by
char * pch = str;
To start with, you don't allocate space for the terminator for the string variable. You also need to terminate it before you can use it as a destination for strcat.
To continue, when you do sizeof on a pointer, you get the size of the pointer and not what it points to. You have this problem in readStat.
You also have memory leaks: the block you malloc for str in readStat is lost as soon as you assign the result of readFile to str, and the memory allocated inside readFile is never freed. (The readFile call inside sizeof is not evaluated, so the file is only read once.) Oh, and one of the memory allocations in readFile is not needed at all.
And there's another memory leak in that you allocate memory for pch, but you lose that pointer when you assign the result of the strtok call. strtok returns a pointer into the string passed to it, so there is no need to allocate memory for it (which you didn't attempt to free anyway).
s=malloc(1024); should not be in the loop. Allocate the memory once, before the loop, and reuse the same buffer on every iteration, clearing it (for example with s[0] = '\0') before each use if needed.
Also, you should make a habit of freeing memory once you are done with it.
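As an alternative to sizing the buffer from st_size (files under /proc commonly report a size of 0, so that value may not be usable there), one option is to read the stream into a buffer that grows as needed. This swaps in a different technique from the question's fgets/strcat loop; read_all is a hypothetical helper sketched under that assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything remaining in `fp` into one
   NUL-terminated heap string. Returns NULL on allocation failure.
   The caller frees the result. */
static char *read_all(FILE *fp)
{
    size_t cap = 1024, len = 0, n;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* Always leave one byte free for the terminator */
    while ((n = fread(buf + len, 1, cap - len - 1, fp)) > 0) {
        len += n;
        if (cap - len - 1 == 0) {          /* buffer full: double it */
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) {
                free(buf);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
    }
    buf[len] = '\0';
    return buf;
}
```

Because the result is a single allocation, readStat can tokenize it with strtok directly and free it once it has copied out the field it needs.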

how to put char * into array so that I can use it in qsort, and then move on to the next line

I have lineget function that returns char *(it detects '\n') and NULL on EOF.
In main() I'm trying to recognize particular words from that line.
I used strtok:
int main(int argc, char **argv)
{
char *line, *ptr;
FILE *infile;
FILE *outfile;
char **helper = NULL;
int strtoks = 0;
void *temp;
infile=fopen(argv[1],"r");
outfile=fopen(argv[2],"w");
while(((line=readline(infile))!=NULL))
{
ptr = strtok(line, " ");
temp = realloc(helper, (strtoks)*sizeof(char *));
if(temp == NULL) {
printf("Bad alloc error\n");
free(helper);
return 0;
} else {
helper=temp;
}
while (ptr != NULL) {
strtoks++;
fputs(ptr, outfile);
fputc(' ', outfile);
ptr = strtok(NULL, " ");
helper[strtoks-1] = ptr;
}
/*fputs(line, outfile);*/
free(line);
}
fclose(infile);
fclose(outfile);
return 0;
}
Now I have no idea how to put each of the tokenized words into an array (I created char ** helper for that purpose), so that it can be used in qsort like qsort(helper, strtoks, sizeof(char*), compare_string);.
Ad. 2 Even if it would work - I don't know how to clear that line, and proceed to sorting next one. How to do that?
I even crashed valgrind (with the code presented above) -> "valgrind: the 'impossible' happened:
Killed by fatal signal"
Where is the mistake ?
The most obvious problem (there may be others) is that you're reallocating helper to the value of strtoks at the beginning of the line, but then incrementing strtoks and adding to the array at higher values of strtoks. For instance, on the first line, strtoks is 0, so temp = realloc(helper, (strtoks)*sizeof(char *)); leaves helper as NULL, but then you try to add every word on that line to the helper array.
I'd suggest an entirely different approach which is conceptually simpler:
char buf[1000]; // big enough to be bigger than any word you'll encounter
char **helper = NULL;
char **temp;
int ch = 0, i, numwords = 0;
while(ch != EOF) { // fgetc() returns EOF once the end of the file is reached
for(i = 0; i < (int)sizeof buf - 1 && (ch = fgetc(infile)) != EOF && ch != ' ' && ch != '\n'; i++) {
// get chars one at a time until we hit a space, a newline or end of file
buf[i] = ch; // add char to buffer
}
buf[i] = '\0'; // terminate with null byte
if(i == 0) continue; // nothing collected (e.g. two separators in a row)
temp = realloc(helper, (numwords + 1) * sizeof(char *)); // expand helper to fit one more word
if(temp == NULL) break; // out of memory: keep the words stored so far
helper = temp;
helper[numwords++] = strdup(buf); // copy current contents of buf to the just-created element of helper
}
I haven't tested this so let me know if it's not correct or there's anything you don't understand. I've left out the opening and closing of files and the freeing at the end (remember you have to free every element of helper before you free helper itself).
As you can see in strtok's prototype:
char * strtok ( char * str, const char * delimiters );
...str is not const. What strtok actually does is replace each delimiter it finds in your str with a null byte (\0) and return a pointer to the beginning of the token.
For example:
char in[] = "foo bar baz";
char *toks[3];
toks[0] = strtok(in, " ");
toks[1] = strtok(NULL, " ");
toks[2] = strtok(NULL, " ");
printf("%p %s\n%p %s\n%p %s\n", toks[0], toks[0], toks[1], toks[1],
toks[2], toks[2]);
printf("%p %s\n%p %s\n%p %s\n", &in[0], &in[0], &in[4], &in[4],
&in[8], &in[8]);
Now look at the results:
0x7fffd537e870 foo
0x7fffd537e874 bar
0x7fffd537e878 baz
0x7fffd537e870 foo
0x7fffd537e874 bar
0x7fffd537e878 baz
As you can see, toks[1] and &in[4] point to the same location: the original str has been modified, and in reality all tokens in toks point to somewhere in str.
In your case your problem is that you free line:
free(line);
...invalidating all your pointers in helper. If you (or qsort) try to access helper[0] after freeing line, you end up accessing freed memory.
You should copy the tokens instead, e.g.:
ptr = strtok(NULL, " ");
helper[strtoks-1] = malloc(strlen(ptr) + 1);
strcpy(helper[strtoks-1], ptr);
Obviously, you will need to free each element of helper afterwards (in addition to helper itself).
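Once helper holds its own copies, the qsort call from the question needs a comparator that receives pointers to the array elements, that is, pointers to char *. A possible sketch follows; compare_string matches the name used in the question, while sort_words is a hypothetical wrapper added for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the elements, so each argument is really a
   pointer to a char * and must be dereferenced once before comparing. */
static int compare_string(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Hypothetical wrapper: sorts `count` strings in ascending order. */
static void sort_words(char **words, size_t count)
{
    qsort(words, count, sizeof *words, compare_string);
}
```

To move on to the next line after sorting and writing out the current one, free each helper[i] and reset strtoks to 0 before reading again, so the array only ever describes the current line.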
You should be getting a 'Bad alloc' error because:
char **helper = NULL;
int strtoks = 0;
...
while ((line = readline(infile)) != NULL) /* Fewer, but sufficient, parentheses */
{
ptr = strtok(line, " ");
temp = realloc(helper, (strtoks)*sizeof(char *));
if (temp == NULL) {
printf("Bad alloc error\n");
free(helper);
return 0;
}
This is because the value of strtoks is zero, so you are asking realloc() to free the memory pointed at by helper (which was itself a null pointer). One outside chance is that your library crashes on realloc(0, 0), which it shouldn't but it is a curious edge case that might have been overlooked. The other possibility is that realloc(0, 0) returns a non-null pointer to 0 bytes of data which you are not allowed to dereference. When your code dereferences it, it crashes. Both returning NULL and returning non-NULL are allowed by the C standard; don't write code that crashes regardless of which behaviour realloc() shows. (If your implementation of realloc() does not return a non-NULL pointer for realloc(0, 0), then I'm suspicious that you aren't showing us exactly the code that managed to crash valgrind (which is a fair achievement — congratulations) because you aren't seeing the program terminate under control as it should if realloc(0, 0) returns NULL.)
You should be able to avoid that problem if you use:
temp = realloc(helper, (strtoks+1) * sizeof(char *));
Don't forget to increment strtoks itself at some point.

Resources