Valgrind: REALLOC Uninitialised value was created by a heap allocation - c

Please, after reading and trying to apply solutions found on stackOverflow the problem has not been solved.
Conditional jump or move depends on uninitialised value(s):
Conditional jump or move depends on uninitialised value(s):
Uninitialised value was created by a heap allocation.
Error popped up by Valgrind:
I'm trying to implement file reading line by line and dynamically realloc an array for them.
Error on line: result = realloc(result, currLen * sizeof(char *));
void readFile(char *fileName) {
FILE *fp = NULL;
size_t len = 0;
int currLen = 2;
char **result = calloc(currLen, sizeof(char *));
fp = fopen(fileName, "r");
if (fp == NULL)
exit(EXIT_FAILURE);
if (result == NULL)
exit(EXIT_FAILURE);
int i = 0;
while (getline(&(*(result + i)), &len, fp) != -1) {
if (i >= currLen - 1) {
currLen *= 2;
result = realloc(result, currLen * sizeof(char *));
}
++i;
}
fclose(fp);
for (int j = 0; j < currLen; ++j) {
free(*(result + j));
}
free(result);
result = NULL;
}
int main() {
readFile("");
exit(EXIT_SUCCESS);
}

In the original posted code, result undergoes an initial allocation via calloc, which zero-initializes the content therein, and being pointers, null-initializes. Later on, when expanding the sequence via realloc, no such affordance is taken. In effect if the original array looked like this:
[ NULL, NULL ]
and after adding two elements, looks like this:
[ addr1, addr2 ]
the realloc kicks in and gives you this :
[ addr1, addr2, ????, ???? ]
Adding salt to the wound, getline also requires the length argument being reflective of the allocation size present in the line. But you're carrying over the length from the prior loop iteration, so not only is the pointer wrong after the first expansion, the length is never correct after the first invocation of getline (leading to your actual crash; the rest of the problems are just not something you saw yet).
Solving all of this
Use a separate pointer and length for each iteration,
Ensure they're properly initialized to null,0 before the getline call
If you read the line successfully, then expand the line pointer buffer.
Store the pointer, discard the length, and reset both to null,0 before the next iteration.
In practice, it looks like this:
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
char **readFile(const char *fileName, size_t *lines)
{
FILE *fp = fopen(fileName, "r");
if (fp == NULL)
exit(EXIT_FAILURE);
// initially empty, no size or capacity
char **result = NULL;
size_t size = 0;
size_t capacity = 0;
size_t len = 0;
char *line = NULL;
while (getline(&line, &len, fp) != -1)
{
if (size == capacity)
{
size_t new_capacity = (capacity ? 2 * capacity : 1);
void *tmp = realloc(result, new_capacity * sizeof *result);
if (tmp == NULL)
{
perror("Failed to expand lines buffer");
exit(EXIT_FAILURE);
}
// recoup the expanded buffer and capacity
result = tmp;
capacity = new_capacity;
}
result[size++] = line;
// reset these to NULL,0. they trigger the fresh allocation
// and size storage on the next iteration.
line = NULL;
len = 0;
}
// if getline allocated a buffer on the failure case
// get rid of it (didn't see that coming).
if (line)
free(line);
fclose(fp);
*lines = size;
return result;
}
int main()
{
size_t count = 0;
char **lines = readFile("/usr/share/dict/words", &count);
if (lines)
{
for (size_t i = 0; i < count; ++i)
{
fputs(lines[i], stdout);
free(lines[i]);
}
free(lines);
}
return 0;
}
On a stock Linux/Mac system /usr/share/dict/words contains about a quarter-million words in the English language. On my stock Mac, its 235886 (yours will vary). The callers gets the line pointer and the count, and is responsible for freeing the content therein.
Output
A
a
aa
aal
aalii
aam
Aani
aardvark
aardwolf
Aaron
Aaronic
Aaronical
Aaronite
Aaronitic
Aaru
.... a ton of lines omitted ....
zymotically
zymotize
zymotoxic
zymurgy
Zyrenian
Zyrian
Zyryan
zythem
Zythia
zythum
Zyzomys
Zyzzogeton
Valgrind Summary
==17709==
==17709== HEAP SUMMARY:
==17709== in use at exit: 0 bytes in 0 blocks
==17709== total heap usage: 235,909 allocs, 235,909 frees, 32,506,328 bytes allocated
==17709==
==17709== All heap blocks were freed -- no leaks are possible
==17709==
Alternative: Let getline reuse its buffer
There is no guarantee the buffer getline allocates matches the line length efficiently. In fact, the only guarantee is, on successful execution, the function returns the number of chars including the delimiter (but not the terminator), and the memory holds that data. The actual allocation size could be considerably more than that, and that space is effectively wasted.
To demonstrate this, consider the following. The same code, but we do NOT reset the buffer on each loop, and rather than store its pointer directly, we store a strdup of the line. Note this only works if the line does not contain embedded null chars. This allows getline to reuse its buffer, and only expand if needed, for each read. We're responsible for making the actual copy of the line data (and we do using POSIX strdup). When executed there are still no leaks, but note the valgrind summary, specifically the number of bytes allocated in comparison to the number of bytes from the previous version above.
char **readFile(const char *fileName, size_t *lines)
{
FILE *fp = fopen(fileName, "r");
if (fp == NULL)
exit(EXIT_FAILURE);
// initially empty, no size or capacity
char **result = NULL;
size_t size = 0;
size_t capacity = 0;
size_t len = 0;
char *line = NULL;
while (getline(&line, &len, fp) != -1)
{
if (size == capacity)
{
size_t new_capacity = (capacity ? 2 * capacity : 1);
void *tmp = realloc(result, new_capacity * sizeof *result);
if (tmp == NULL)
{
perror("Failed to expand lines buffer");
exit(EXIT_FAILURE);
}
// recoup the expanded buffer and capacity
result = tmp;
capacity = new_capacity;
}
// make copy here. let getline reuse 'line'
result[size++] = strdup(line);
}
// free whatever was left
if (line)
free(line);
fclose(fp);
*lines = size;
return result;
}
Valgrind Summary
==17898==
==17898== HEAP SUMMARY:
==17898== in use at exit: 0 bytes in 0 blocks
==17898== total heap usage: 235,909 allocs, 235,909 frees, 6,929,003 bytes allocated
==17898==
==17898== All heap blocks were freed -- no leaks are possible
==17898==
The number of allocations is the same (which tells us getline allocated a large enough buffer up front to never need expansion), but the actual total allocated space is considerably more efficient, as now we are storing strings in buffers allocated to match their length; not whatever getline stood up as a read buffer.

Related

Dynamic growing string array memory issues

I'm working on a crosswords program in which a word dictionary is necessary. I'm trying load a jspell dictionary file into an dynamic string array but i keep getting the
error malloc(): mismatching next->prev_size (unsorted)
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "dictionary.h"
void dict_init(Dictionary * dict, char * dict_dir, size_t w_len)
{
printf("dictionary.c (dict_init): initializing dictionary.\n");
/*Adjust this value to control the initial array size*/
size_t init_size = 1000;
/*initialize dictionary file directory*/
dict->dir = malloc(strlen(dict_dir) * sizeof(char) + 1);
strcpy(dict->dir, dict_dir);
/*create memory for words array*/
dict->words = malloc(init_size * sizeof(char *));
/*initialize array size*/
dict->size = init_size;
/*initilize word length*/
dict->w_len = w_len;
/*initialize word counter*/
dict->counter = 0;
/*load words into dictionary*/
dict_load(dict);
printf("dictionary.c (dict_init): dictionary initialized.\n");
}
void dict_add(Dictionary * dict, char * word)
{
char ** dictionary = dict->words;
/*check if word array is full*/
if(dict->counter == dict->size)
{
/*increrase size of dictionary*/
dict->size *= 1.5;
dict->words = realloc(dict->words, dict->size * sizeof(char *));
}
/*add word to dictionary*/
dictionary[dict->counter] = malloc(strlen(word) * sizeof(char) + 1);
strcpy(dictionary[dict->counter], word);
dict->counter++;
free(word);
}
void dict_free(Dictionary * dict)
{
free(dict->words);
}
void dict_load(Dictionary * dict)
{
FILE * fp;
char * line = NULL;
char * word = NULL;
size_t len = 0;
ssize_t read;
fp = fopen(dict->dir, "r");
/*check if file exists*/
if (fp == NULL)
{
perror("ERROR: File not found.");
exit(EXIT_FAILURE);
}
/*discard first line*/
if(strstr(dict->dir, ".dic"))
getline(&line, &len, fp);
/*read file lines*/
while ((read = getline(&line, &len, fp)) != -1)
{
if(((strstr(line, "[CAT=punct") == NULL) && (word = parse_line(line, dict->w_len)) != NULL)) {
dict_add(dict, word);
}
}
fclose(fp);
free(line);
printf("dictionary.c (dict_load): dictionary loaded %ld words.\n", dict->counter);
}
char * parse_line(char * line, size_t w_len)
{
int i;
char s_tmp[101] = "";
char * dlm_slash, * dlm_space, * dlm_tab , *substring;
/*get delimiter pointer*/
dlm_slash = strchr(line, '/');
dlm_space = strchr(line, ' ');
dlm_tab = strchr(line, '\t');
/*check if delimiter exists in line*/
if(dlm_slash != NULL)
i = (int)(dlm_slash - line);
else if(dlm_space != NULL)
i = (int)(dlm_space - line);
else if(dlm_tab != NULL)
i = (int)(dlm_tab - line);
else
{
/*replace '\n' with '\0'*/
line[strcspn(line, "\n")] = '\0';
i = strlen(line);
}
strncpy(s_tmp, line, i);
substring = malloc(sizeof(char) * strlen(s_tmp) + 1);
strncpy(substring, s_tmp, strlen(s_tmp));
/*lowercase word*/
lower_case(substring);
if((is_valid(substring) == 0) && (strlen(substring) <= w_len))
return substring;
free(substring);
return NULL;
}
Here's the basic problem, I think:
void dict_add(Dictionary * dict, char * word) {
char ** dictionary = dict->words; /* **** 1 **** */
/*check if word array is full*/
if(dict->counter == dict->size)
{
/*increrase size of dictionary*/
dict->size *= 1.5; /* **** 2 **** */
dict->words = realloc(dict->words, dict->size * sizeof(char *));
/* **** 3 **** */
}
/*add word to dictionary*/
This one is the problem:
dictionary[dict->counter] = malloc(strlen(word) * sizeof(char) + 1);
strcpy(dictionary[dict->counter], word);
dict->counter++;
free(word); /* **** 4 **** */
}
The problem is that dictionary was saved before you called realloc. realloc might make a brand-new memory allocation, in which case it will automatically free() the old one after copying its contents into the new one. So any copy of the pointer which you made before calling realloc might end up pointing to unallocated memory. Writing to unallocated memory is a big no-no; in this particular case, you're probably overwriting malloc's bookkeeping information about the unallocated block, which is why it detects the problem and complains. Count yourself lucky: lots of memory corruption problems go undetected for quite a while until the factory explodes.
Some other issues which I noticed while writing this, with numbered comments in the source:
There's actually no need for the variable dictionary at all.
dict->size is an integer. Forcing conversion to a floating point number and then truncating back to an integer is not very useful. Prefer dict->size += dict->size/2;. Even better would be to first make sure that dict->size isn't so big that increasing it will cause integer wraparound. (This is not undefined behaviour on unsigned types like size_t, but it's not going to produce correct results.)
Here you could actually use a temporary, because realloc might return NULL indicating a memory allocation failure. If that happens, the original allocation is not automatically freed, and you don't have a way to free it. (Actually you do, since you have a variable confusingly called dictionary, but in point 1 I recommended that you get rid of it.) A more idiomatic call would be:
if(dict->counter == dict->size) {
/*increrase size of dictionary*/
dict->size += dict->size / 2; /* See point 2, above */
char** new_words = realloc(dict->words, dict->size * sizeof(*new_words));
if (new_words == NULL) {
/* Report allocation error and free all the memory you've allocated */
/* Then probably exit(1) but if this were a library function, just
* return some kind of failure indication so that the caller can do
* their own clean-up.
*/
}
dict->words = new_words;
}
dict->words[dict->counter] = word; /* See point 4, below */
You're freeing word here because it was allocated in parse_line(). But if you know you're going to free it anyway, there wasn't much point making a copy of it first. You might as well just use it. (But you need to document the fact that this function takes ownership of the word passed as an argument.)
It might be considered cleaner to do the copy as you do but then not free the argument, leaving it for the caller to do that. That would have the advantage of allowing the caller to provide a word which hadn't been dynamically allocated, or use the word for some other purpose.
(Not indicated in this snippet, but nonetheless important). Every block of allocated memory must be freed. So your program should execute free exactly as many times as it executed malloc. But you don't do that; you just free the array of word pointers, and let the words pointed to in that array leak. You should fix that. (Note that you don't need an extra call to free for a call to realloc, since realloc itself frees the old block if it allocates a new one. You only need to match the initial malloc with a free.)

C Program to Convert a Text File into a CSV File

The question is to convert a text file into a CSV file using C programming. The input text file is formatted as the following:
JACK Maria Stephan Nora
20 34 45 28
London NewYork Toronto Berlin
The output CSV file should look like:
Jack,20,London
Maria,34,NewYork
Stephan,45,Toronto
Nora,28,Berlin
The following code is what I tried so far:
void load_and_convert(const char* filename){
FILE *fp1, *fp2;
char ch;
fp1=fopen(filename,"r");
fp2=fopen("output.csv","w");
for(int i=0;i<1000;i++){
ch=fgetc(fp1);
fprintf(fp2,"%c",ch);
if(ch==' '|| ch=='\n')
fprintf(fp2,"%c,\n",ch);
}
fclose(fp1);
fclose(fp2);
}
The output from my code looks like:
Jack,
Maria,
Stephan,
Nora,
20,
34,
45,
28,
London,
NewYork,
Toronto,
Berlin,
How should I modify my code to make it work correctly?
What's the idea to treat this question?
Since I have some times, here is a working solution for you (tried my best to make the solution as elegant as I can):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_STRING_LENGTH 50
#define MAX_NUMBER_OF_PEOPLE 50
typedef struct
{
char name[MAX_STRING_LENGTH];
int age;
char city[MAX_STRING_LENGTH];
} Person;
void getName(char *src, char *delim, Person *people) {
char *ptr = strtok(src, delim);
int i = 0;
while(ptr != NULL)
{
strncpy(people[i].name, ptr, MAX_STRING_LENGTH);
ptr = strtok(NULL, delim);
i++;
}
}
void getAge(char *src, char *delim, Person *people) {
char *ptr = strtok(src, delim);
int i = 0;
while(ptr != NULL)
{
people[i].age = atoi(ptr);
i++;
ptr = strtok(NULL, delim);
}
}
void getCity(char *src, char *delim, Person *people) {
char *ptr = strtok(src, delim);
int i = 0;
while(ptr != NULL)
{
strncpy(people[i].city, ptr, MAX_STRING_LENGTH);
i++;
ptr = strtok(NULL, delim);
}
}
int main(void)
{
Person somebody[MAX_NUMBER_OF_PEOPLE];
FILE *fp;
char *line = NULL;
size_t len = 0;
ssize_t read;
int ln = 0;
fp = fopen("./test.txt", "r");
if (fp == NULL)
return -1;
// Read every line, support first line is name, second line is age...
while ((read = getline(&line, &len, fp)) != -1) {
// remote trailing newline character
line = strtok(line, "\n");
if (ln == 0) {
getName(line, " ", somebody);
} else if (ln == 1) {
getAge(line, " ", somebody);
} else {
getCity(line, " ", somebody);
}
ln++;
}
for (int j = 0; j < MAX_NUMBER_OF_PEOPLE; j++) {
if (somebody[j].age == 0)
break;
printf("%s, %d, %s\n", somebody[j].name, somebody[j].age, somebody[j].city);
}
fclose(fp);
if (line)
free(line);
return 0;
}
What you are needing to do is non-trivial if you want to approach the problem holding all values in memory as you transform the 3-rows with 4-fields in each row, to a format of 4-rows with 3-fields per-row. So when you have your datafile containing:
Example Input File
$ cat dat/col2csv3x4.txt
JACK Maria Stephan Nora
20 34 45 28
London NewYork Toronto Berlin
You want to read each of the three lines and then transpose the columns into rows for .csv output. Meaning you will then end up with 4-rows of 3-csv fields each, e.g.
Expected Program Output
$ ./bin/transpose2csv < dat/col2csv3x4.txt
JACK,20,London
Maria,34,NewYork
Stephan,45,Toronto
Nora,28,Berlin
There is nothing difficult in doing it, but it takes meticulous attention to handling the memory storage of object and allocating/reallocating to handle the transformation between 3-rows with 4-pieces of data to 4-rows with 3-pieces of data.
One approach is to read all original lines into a typical pointer-to-pointer to char setup. Then transform/transpose the columns into rows. Since conceivably there could be 100-rows with 500-fields next time, you will want to approach the transformation using indexes and counters to track your allocation and reallocation requirement to make your finished code able to handle transposing a generic number of lines and fields into fields-number of lines with as many vales per row as you had original lines.
You can design your code to provide the transformation in two basic functions. The first to read and store the lines (saygetlines`) and the second to then transpose those lines into a new pointer-to-pointer to char so it can be output as comma separated values
One way to approach these two functions would be similar to the following that takes the filename to read as the first-arguments (or will read from stdin by default if no argument is given). The code isn't trivial, but it isn't difficult either. Just keep track of all your allocations, preserving a pointer to the beginning of each, so the memory may be freed when no longer needed, e.g.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NPTR 2
#define NWRD 128
#define MAXC 1024
/** getlines allocates all storage required to read all lines from file.
* the pointers are doubled each time reallocation is needed and then
* realloc'ed a final time to exactly size to the number of lines. all
* lines are stored with the exact memory required.
*/
char **getlines (size_t *n, FILE *fp)
{
size_t nptr = NPTR; /* tracks number of allocated pointers */
char buf[MAXC]; /* tmp buffer sufficient to hold each line */
char **lines = calloc (nptr, sizeof *lines);
if (!lines) { /* validate EVERY allocaiton */
perror ("calloc-lines");
return NULL;
}
*n = 0; /* pointer tracks no. of lines read */
rewind (fp); /* clears stream error state if set */
while (fgets (buf, MAXC, fp)) { /* read each line o finput */
size_t len;
if (*n == nptr) { /* check/realloc ptrs if required */
void *tmp = realloc (lines, 2 * nptr * sizeof *lines);
if (!tmp) { /* validate reallocation */
perror ("realloc-tmp");
break;
}
lines = tmp; /* assign new block, (opt, zero new mem below) */
memset (lines + nptr, 0, nptr * sizeof *lines);
nptr *= 2; /* increment allocated pointer count */
}
buf[(len = strcspn(buf, "\r\n"))] = 0; /* get line, remove '\n' */
lines[*n] = malloc (len + 1); /* allocate for line */
if (!lines[*n]) { /* validate */
perror ("malloc-lines[*n]");
break;
}
memcpy (lines[(*n)++], buf, len + 1); /* copy to line[*n] */
}
if (!*n) { /* if no lines read */
free (lines); /* free pointers */
return NULL;
}
/* optional final realloc to free unused pointers */
void *tmp = realloc (lines, *n * sizeof *lines);
if (!tmp) {
perror ("final-realloc");
return lines;
}
return (lines = tmp); /* return ptr to exact no. of required ptrs */
}
/** free all pointers and n alocated arrays */
void freep2p (void *p2p, size_t n)
{
for (size_t i = 0; i < n; i++)
free (((char **)p2p)[i]);
free (p2p);
}
/** transpose a file of n rows and a varying number of fields to an
* allocated pointer-to-pointer t0 char structure with a fields number
* of rows and n csv values per row.
*/
char **transpose2csv (size_t *n, FILE *fp)
{
char **l = NULL, **t = NULL;
size_t csvl = 0, /* csv line count */
ncsv = 0, /* number of csv lines allocated */
nchr = MAXC, /* initial chars alloc for csv line */
*offset, /* array tracking read offsets in lines */
*used; /* array tracking write offset to csv lines */
if (!(l = getlines (n, fp))) { /* read all lines to l */
fputs ("error: getlines failed.\n", stderr);
return NULL;
}
ncsv = *n;
#ifdef DEBUG
for (size_t i = 0; i < *n; i++)
puts (l[i]);
#endif
if (!(t = malloc (ncsv * sizeof *t))) { /* alloc ncsv ptrs for csv */
perror ("malloc-t");
freep2p (l, *n); /* free everything else on failure */
return NULL;
}
for (size_t i = 0; i < ncsv; i++) /* alloc MAXC chars to csv ptrs */
if (!(t[i] = malloc (nchr * sizeof *t[i]))) {
perror ("malloc-t[i]");
while (i--) /* free everything else on failure */
free (t[i]);
free (t);
freep2p (l, *n);
return NULL;
}
if (!(offset = calloc (*n, sizeof *offset))) { /* alloc offsets array */
perror ("calloc-offsets");
free (t);
freep2p (l, *n);
return NULL;
}
if (!(used = calloc (ncsv, sizeof *used))) { /* alloc used array */
perror ("calloc-used");
free (t);
free (offset);
freep2p (l, *n);
return NULL;
}
for (;;) { /* loop continually transposing cols to csv rows */
for (size_t i = 0; i < *n; i++) { /* read next word from each line */
char word[NWRD]; /* tmp buffer for word */
int off; /* number of characters consumed in read */
if (sscanf (l[i] + offset[i], "%s%n", word, &off) != 1)
goto readdone; /* break nested loops on read failure */
size_t len = strlen (word); /* get word length */
offset[i] += off; /* increment read offset */
if (csvl == ncsv) { /* check/realloc new csv row as required */
size_t newsz = ncsv + 1; /* allocate +1 row over *n */
void *tmp = realloc (t, newsz * sizeof *t); /* realloc ptrs */
if (!tmp) {
perror ("realloc-t");
freep2p (t, ncsv);
goto readdone;
}
t = tmp;
t[ncsv] = NULL; /* set new pointer NULL */
/* allocate nchr chars to new pointer */
if (!(t[ncsv] = malloc (nchr * sizeof *t[ncsv]))) {
perror ("malloc-t[i]");
while (ncsv--) /* free everything else on failure */
free (t[ncsv]);
goto readdone;
}
tmp = realloc (used, newsz * sizeof *used); /* realloc used */
if (!tmp) {
perror ("realloc-used");
freep2p (t, ncsv);
goto readdone;
}
used = tmp;
used[ncsv] = 0;
ncsv++;
}
if (nchr - used[csvl] - 2 < len) { /* check word fits in line */
/* realloc t[i] if required (left for you) */
fputs ("realloc t[i] required.\n", stderr);
}
/* write word to csv line at end */
sprintf (t[csvl] + used[csvl], used[csvl] ? ",%s" : "%s", word);
t[csvl][used[csvl] ? used[csvl] + len + 1 : len] = 0;
used[csvl] += used[csvl] ? len + 1 : len;
}
csvl++;
}
readdone:;
freep2p (l, *n);
free (offset);
free (used);
*n = csvl;
return t;
}
int main (int argc, char **argv) {
char **t;
size_t n = 0;
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
if (!(t = transpose2csv (&n, fp))) {
fputs ("error: transpose2csv failed.\n", stderr);
return 1;
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < n; i++)
if (t[i])
puts (t[i]);
freep2p (t, n);
return 0;
}
Example Use/Output
$ ./bin/transpose2csv < dat/col2csv3x4.txt
JACK,20,London
Maria,34,NewYork
Stephan,45,Toronto
Nora,28,Berlin
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/transpose2csv < dat/col2csv3x4.txt
==18604== Memcheck, a memory error detector
==18604== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==18604== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==18604== Command: ./bin/transpose2csv
==18604==
JACK,20,London
Maria,34,NewYork
Stephan,45,Toronto
Nora,28,Berlin
==18604==
==18604== HEAP SUMMARY:
==18604== in use at exit: 0 bytes in 0 blocks
==18604== total heap usage: 15 allocs, 15 frees, 4,371 bytes allocated
==18604==
==18604== All heap blocks were freed -- no leaks are possible
==18604==
==18604== For counts of detected and suppressed errors, rerun with: -v
==18604== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

How to Delete Duplicate Elements from Dynamically Allocated String Array in C

I have created a program in C that reads in a word file and counts how many words are in that file, along with how many times each word occurs.
When I run it through Valgrind I either get too many bytes lost or a Segmentation Fault.
How can I remove a duplicate element from a dynamically allocated array and free the memory as well?
Gist: wordcount.c
int tokenize(Dictionary **dictionary, char *words, int total_words)
{
char *delim = " .,?!:;/\"\'\n\t";
char **temp = malloc(sizeof(char) * strlen(words) + 1);
char *token = strtok(words, delim);
*dictionary = (Dictionary*)malloc(sizeof(Dictionary) * total_words);
int count = 1, index = 0;
while (token != NULL)
{
temp[index] = (char*)malloc(sizeof(char) * strlen(token) + 1);
strcpy(temp[index], token);
token = strtok(NULL, delim);
index++;
}
for (int i = 0; i < total_words; ++i)
{
for (int j = i + 1; j < total_words; ++j)
{
if (strcmp(temp[i], temp[j]) == 0) // <------ segmentation fault occurs here
{
count++;
for (int k = j; k < total_words; ++k) // <----- loop to remove duplicates
temp[k] = temp[k+1];
total_words--;
j--;
}
}
int length = strlen(temp[i]) + 1;
(*dictionary)[i].word = (char*)malloc(sizeof(char) * length);
strcpy((*dictionary)[i].word, temp[i]);
(*dictionary)[i].count = count;
count = 1;
}
free(temp);
return 0;
}
Thanks in advance.
Without A Minimal, Complete, and Verifiable example, there is no guarantee that additional problems do not originate elsewhere in your code, but the following need careful attention:
char **temp = malloc(sizeof(char) * strlen(words) + 1);
Above you are allocating pointers not words, your allocation is too small by a factor of sizeof (char*) - sizeof (char). To prevent such problems, if you use the sizeof *thepointer, you will always have the correct size, e.g.
char **temp = malloc (sizeof *temp * strlen(words) + 1);
(unless you plan on providing a sentinel NULL as the final pointer, then + 1 is unnecessary. You must also validate the return (see below))
Next:
*dictionary = (Dictionary*)malloc(sizeof(Dictionary) * total_words);
There is no need to cast the return of malloc, it is unnecessary. See: Do I cast the result of malloc?. Further, if *dictionary was previously allocated elsewhere, the allocation above creates a memory leak because you lose the reference to the original pointer. If it has been previously allocated, you need realloc, not malloc. And if wasn't allocate, a better way of writing it would be:
*dictionary = malloc (sizeof **dictionary * total_words);
You must also validation the allocation succeeds before attempting to use the block of memory, e.g.
if (! *dictionary) {
perror ("malloc - *dictionary");
exit (EXIT_FAILURE);
}
In:
temp[index] = (char*)malloc(sizeof(char) * strlen(token) + 1);
sizeof(char) is always 1 and can be omitted. Better written as:
temp[index] = malloc (strlen(token) + 1);
or better, allocate and validate in a single block:
if (!(temp[index] = malloc (strlen(token) + 1))) {
perror ("malloc - temp[index]");
exit (EXIT_FAILURE);
}
then
strcpy(temp[index++], token);
Next, while total_words may be equal to the words in temp, you have only validated that you have index number of words. That combined with your original allocation times sizeof (char) instead of sizeof (char *), makes it no wonder there can be segfaults where you attempt to iterate over your list of pointers in temp. Better:
for (int i = 0; i < index; ++i)
{
for (int j = i + 1; j < index; ++j)
(the same applies to your k loop as well. Additionally, since you have allocated each temp[index], when you shuffle pointers with temp[k] = temp[k+1]; you overwrite the pointer address in temp[k] causing a memory leak with every pointer you overwrite. Each temp[k] that is overwritten should be freed before the assignment is made.
While you are updating total_words--, there still to this point has never been a validation that index == total_words, and in the event they are not, you can have no confidence in total_words or that you won't segfault attempting to iterate over uninitialized pointers as the result.
The rest appears workable, but after changes are made above, you should insure that the are no additional changes needed. Look things over and let me know if you need additional help. (and with a MCVE, I'm happy to help further)
Additional Problems
I apologize for the delay, real-world called -- and this took a lot longer than anticipated, because what you have is an awkward slow-motion logical train-wreck. First and foremost, while there is nothing wrong with reading an entire text-file file into a buffer with fread -- the buffer is NOT nul-terminated and therefore cannot be used with any functions expecting a string. Yes, strtok, strcpy or any string function will read past the end of word_data looking for the nul-terminating character (well out into memory you don't own) resulting in a SegFault.
Your various scattered +1 tacked onto your malloc allocations now make a little more sense, as it appears you were looking for where you needed to add an additional character to make sure you could nul-terminate word_data, but couldn't quite figure out where it went. (don't worry, I straightened that out for you, but it is a big hint that you are probably going about this in the wrong way -- reading with POSIX getline or fgets is probably a better approach than the file-at-once for this type of text processing)
That is literally, just the tip of the iceberg in the problems encountered in your code. As hinted at earlier, in tokenize, you failed to validate that index equals total_words. This ends up being important given your choice of delim which includes the ASCII apostrophe (or single-quote). This causes your index to exceed the word_count any time a plural-possessive or contraction is encountered in the buffer (e.g. "can't" is split is "can" and "t", "Peter's" is split into "Peter" and "s", etc.... You will have to decide how you want to resolve this, I have simply removed the single quote for now.
Your logic in both tokenize and count_words was difficult to follows, and just wrong in some aspects, and your return type (void) for read_file provided absolutely no way to indicate a success (or failure) within. Always choose a return type that provides meaningful information from which you can determine is a critical function has succeeded or failed (reading your data qualifies as critical).
If it provides a return -- use it. This applies to all functions that can fail (including functions like fseek)
Returning 0 from tokenize misses the return of the number of words (allocated struts) in dictionary leaving you unable to properly free the information and leaving you to guess at some number to display (e.g. for (int i = 0; i < 333; ++i) in main()). You need to track the number of dictionary structs and member word that are allocated in tokenize (keep an index, say dindex). Then returning dindex to main() (assigned to hello in your code) provides the information you need to iterate over the structs in main() to output your information, as well as to free each allocated word before freeing the pointers.
If you don't have an accurate count of the number of allocated dictionary structs back in main(), you have failed in the two responsibilities you have regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed. If you don't know how many blocks there are, then you haven't done (1) and can't do (2).
This is a nit about style, and while not an error, the standard coding style for C avoids the use of Initialcaps, camelCase or MixedCase variable names in favor of all lower-case while reserving upper-case names for use with macros and constants. It is a matter of style -- so it is completely up to you, but failing to follow it can lead to the wrong first impression in some circles.
Rather than carry on for another handful of paragraphs, I've reworked your example for you and added a few comments inline. Go though it, I haven't punishingly tested it for all corner-cases, but it should be a sound base to build from. You will note in going though it, your count_words and tokenize have been simplified. Try and understand why what was done, was done, and ask if you have any questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>
typedef struct{
char *word;
int count;
} dictionary_t;
char *read_file (FILE *file, char **words, size_t *length)
{
size_t size = *length = 0;
if (fseek (file, 0, SEEK_END) == -1) {
perror ("fseek SEEK_END");
return NULL;
}
size = (size_t)ftell (file);
if (fseek (file, 0, SEEK_SET) == -1) {
perror ("fseek SEEK_SET");
return NULL;
}
/* +1 needed to nul-terminate buffer to pass to strtok */
if (!(*words = malloc (size + 1))) {
perror ("malloc - size");
return NULL;
}
if (fread (*words, 1, size, file) != size) {
perror ("fread words");
free (*words);
return NULL;
}
*length = size;
(*words)[*length] = 0; /* nul-terminate buffer - critical */
return *words;
}
int tokenize (dictionary_t **dictionary, char *words, int total_words)
{
// char *delim = " .,?!:;/\"\'\n\t"; /* don't split on apostrophies */
char *delim = " .,?!:;/\"\n\t";
char **temp = malloc (sizeof *temp * total_words);
char *token = strtok(words, delim);
int index = 0, dindex = 0;
if (!temp) {
perror ("malloc temp");
return -1;
}
if (!(*dictionary = malloc (sizeof **dictionary * total_words))) {
perror ("malloc - dictionary");
return -1;
}
while (token != NULL)
{
if (!(temp[index] = malloc (strlen (token) + 1))) {
perror ("malloc - temp[index]");
exit (EXIT_FAILURE);
}
strcpy(temp[index++], token);
token = strtok (NULL, delim);
}
if (total_words != index) { /* validate total_words = index */
fprintf (stderr, "error: total_words != index (%d != %d)\n",
total_words, index);
/* handle error */
}
for (int i = 0; i < total_words; i++) {
int found = 0, j = 0;
for (; j < dindex; j++)
if (strcmp((*dictionary)[j].word, temp[i]) == 0) {
found = 1;
break;
}
if (!found) {
if (!((*dictionary)[dindex].word = malloc (strlen (temp[i]) + 1))) {
perror ("malloc (*dictionay)[dindex].word");
exit (EXIT_FAILURE);
}
strcpy ((*dictionary)[dindex].word, temp[i]);
(*dictionary)[dindex++].count = 1;
}
else
(*dictionary)[j].count++;
}
for (int i = 0; i < total_words; i++)
free (temp[i]); /* you must free storage for words */
free (temp); /* before freeing pointers */
return dindex;
}
int count_words (char *words, size_t length)
{
int count = 0;
char previous_char = ' ';
while (length--) {
if (isspace (previous_char) && !isspace (*words))
count++;
previous_char = *words++;
}
return count;
}
int main (int argc, char **argv)
{
char *word_data = NULL;
int word_count, hello;
size_t length = 0;
dictionary_t *dictionary = NULL;
FILE *input = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!input) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
if (!read_file (input, &word_data, &length)) {
fprintf (stderr, "error: file_read failed.\n");
return 1;
}
if (input != stdin) fclose (input); /* close file if not stdin */
word_count = count_words (word_data, length);
printf ("wordct: %d\n", word_count);
/* number of dictionary words returned in hello */
if ((hello = tokenize (&dictionary, word_data, word_count)) <= 0) {
fprintf (stderr, "error: no words or tokenize failed.\n");
return 1;
}
for (int i = 0; i < hello; ++i) {
printf("%-16s : %d\n", dictionary[i].word, dictionary[i].count);
free (dictionary[i].word); /* you must free word storage */
}
free (dictionary); /* free pointers */
free (word_data); /* free buffer */
return 0;
}
Let me know if you have further questions.
There are a few things that you need to do to make your code work:
Fix the memory allocation of temp by replacing sizeof(char) with sizeof(char *) like so:
char **temp = malloc(sizeof(char *) * strlen(words) + 1);
Fix the memory allocation of dictionary by replacing sizeof(Dictionary) with sizeof(Dictionary *):
*dictionary = (Dictionary*)malloc(sizeof(Dictionary *) * (*total_words));
Pass the address of address of word_count when calling tokenize:
int hello = tokenize(&dictionary, word_data, &word_count);
Replace all occurrences of total_words in tokenize function with (*total_words). In the tokenize function signature, you can replace int total_words with int *total_words.
You should also replace the hard-coded value of 333 in your for loop in the main function with word_count.
After you make these changes, your code should work as expected. I was able to run it successfully with these changes.

dynamically allocating my 2d array in c

Any hints on how I would dynamically allocate myArray so I can enter any amount of strings and it would store correctly.
int main()
{
char myArray[1][1]; //how to dynamically allocate the memory?
counter = 0;
char *readLine;
char *word;
char *rest;
printf("\n enter: ");
ssize_t buffSize = 0;
getline(&readLine, &buffSize, stdin);//get user input
//tokenize the strings
while(word = strtok_r(readLine, " \n", &rest )) {
strcpy(myArray[counter], word);
counter++;
readLine= rest;
}
//print the elements user has entered
int i =0;
for(i = 0;i<counter;i++){
printf("%s ",myArray[i]);
}
printf("\n");
}
Use realloc like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
char **myArray = NULL;
char *readLine = NULL;
size_t buffSize = 0;
size_t counter = 0;
char *word, *rest, *p;
printf("\n enter: ");
getline(&readLine, &buffSize, stdin);
p = readLine;
while(word = strtok_r(p, " \n", &rest )) {
myArray = realloc(myArray, (counter + 1) * sizeof(*myArray));//check omitted
myArray[counter++] = strdup(word);
p = NULL;
}
free(readLine);
for(int i = 0; i < counter; i++){
printf("<%s> ", myArray[i]);
free(myArray[i]);
}
printf("\n");
free(myArray);
}
Here is one way you might approach this problem. If you are going to dynamically allocate storage for an unknown number of words of unknown length, you can start with a buffSize that seems reasonable, allocate that much space for the readLine buffer, and grow this memory as needed. Similarly, you can choose a reasonable size for the number of words expected, and grow word storage as needed.
In the program below, myArray is a pointer to pointer to char. arrSize is initialized so that pointers to 100 words may be stored in myArray. First, readLine is filled with an input line. If more space than provided by the initial allocation is required, the memory is realloced to be twice as large. After reading in the line, the memory is again realloced to trim it to the size of the line (including space for the '\0').
strtok_r() breaks the line into tokens. The pointer store is used to hold the address of the memory allocated to hold the word, and then word is copied into this memory using strcpy(). If more space is needed to store words, the memory pointed to by myArray is realloced and doubled in size. After all words have been stored, myArray is realloced a final time to trim it to its minimum size.
When doing this much allocation, it is nice to write functions which allocate memory and check for errors, so that you don't have to do this manually every allocation. xmalloc() takes a size_t argument and an error message string. If an allocation error occurs, the message is printed to stderr and the program exits. Otherwise, a pointer to the allocated memory is returned. Similarly, xrealloc() takes a pointer to the memory to be reallocated, a size_t argument, and an error message string. Note here that realloc() can return a NULL pointer if there is an allocation error, so you need to assign the return value to a temporary pointer to avoid a memory leak. Moving realloc() into a separate function helps protect you from this issue. If you assigned the return value of realloc() directly to readLine, for example, and if there were an allocation error, readLine would no longer point to the previously allocated memory, which would be lost. This function prints the error message and exits if there is an error.
Also, you need to free all of these memory allocations, so this is done before the program exits.
This method is more efficient than reallocing memory for every added character in the line, and for every added pointer to a word in myArray. With generous starting values for buffSize and arrSize, you may only need the initial allocations, which are then trimmed to final size. Of course, there are still the individual allocations for each of the individual words. You could also use strdup() for this part, but you would still need to remember to free those allocations as well.Still, not nearly as many allocations will be needed as when readLine and myArray are grown one char or one pointer at a time.
#define _POSIX_C_SOURCE 1
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void * xmalloc(size_t size, char *msg);
void * xrealloc(void *ptr, size_t size, char *msg);
int main(void)
{
char **myArray;
size_t buffSize = 1000;
size_t arrSize = 100;
size_t charIndex = 0;
size_t wordIndex = 0;
char *readLine;
char *inLine;
char *word;
char *rest;
char *store;
/* Initial allocations */
readLine = xmalloc(buffSize, "Allocation error: readLine");
myArray = xmalloc(sizeof(*myArray) * arrSize,
"Allocation error: myArray\n");
/* Get user input */
printf("\n enter a line of input:\n");
int c;
while ((c = getchar()) != '\n' && c != EOF) {
if (charIndex + 1 >= buffSize) { // keep room for '\0'
buffSize *= 2;
readLine = xrealloc(readLine, buffSize,
"Error in readLine realloc()\n");
}
readLine[charIndex++] = c;
}
readLine[charIndex] = '\0'; // add '\0' terminator
/* If you must, trim the allocation now */
readLine = xrealloc(readLine, strlen(readLine) + 1,
"Error in readLine trim\n");
/* Tokenize readLine */
inLine = readLine;
while((word = strtok_r(inLine, " \n", &rest)) != NULL) {
store = xmalloc(strlen(word) + 1, "Error in word allocation\n");
strcpy(store, word);
if (wordIndex >= arrSize) {
arrSize *= 2;
myArray = xrealloc(myArray, sizeof(*myArray) * arrSize,
"Error in myArray realloc()\n");
}
myArray[wordIndex] = store;
wordIndex++;
inLine = NULL;
}
/* You can trim this allocation, too */
myArray = xrealloc(myArray, sizeof(*myArray) * wordIndex,
"Error in myArray trim\n");
/* Print words */
for(size_t i = 0; i < wordIndex; i++){
printf("%s ",myArray[i]);
}
printf("\n");
/* Free allocated memory */
for (size_t i = 0; i < wordIndex; i++) {
free(myArray[i]);
}
free(myArray);
free(readLine);
return 0;
}
void * xmalloc(size_t size, char *msg)
{
void *temp = malloc(size);
if (temp == NULL) {
fprintf(stderr, "%s\n", msg);
exit(EXIT_FAILURE);
}
return temp;
}
void * xrealloc(void *ptr, size_t size, char *msg)
{
void *temp = realloc(ptr, size);
if (temp == NULL) {
fprintf(stderr, "%s\n", msg);
exit(EXIT_FAILURE);
}
return temp;
}
I suggest you first scan the data and then call malloc() with the appropriate size.
Otherwise, you can use realloc() to reallocate memory as you go through the data.

I'm trying to read a line from a file in c and dynamically allocate memory but the result is always bad

char * readline(FILE *fp, char *buffer) {
char ch;
int i = 0, buff_len = 0;
buffer = malloc(buff_len);
while ((ch = fgetc(fp)) != '\n' && ch != EOF) {
++buff_len;
buffer = realloc(buffer, buff_len);
buffer[i] = ch;
++i;
}
return buffer;
}
I'm trying to allocate the memory directly after reading a character.
There are several subtleties in passing a pointer to a function to allocate, fill and return. The most important to understand is that when passing a pointer to a function, the function receives a copy of that pointer with a new and separate address. If you then allocate space in the function for the pointer, you must return the address of the pointer to the caller (main() here) or the caller will have no way to access the values you have stored in your newly allocated block of memory.
To overcome this problem, and be able to pass a pointer to a function for allocation and filling, without having to utilize the return, you must pass the address of the pointer to the function, i.e.:
char *readline (FILE *fp, char **buffer)
Otherwise, as discussed in the comments and above, there is no reason to pass the pointer to buffer to the function. You can simply declare buffer local to the function, dynamically allocate space for it, and return the starting address to the new block of memory.
There is nothing wrong with doing it either way, it really boils down to what you need, but you want to be clear on what you are passing to your function and why. The following is a short example passing the address of your pointer to readline and allowing the function to allocate/fill each line without needing to utilize the return. Now it never hurts to return a pointer to your line in this case. It provides a way of determining success/failure and the flexibility to assign the return if you desire:
#include <stdio.h>
#include <stdlib.h>
#define NCHAR 64
char *readline (FILE *fp, char **buffer);
int main (int argc, char **argv) {
char *line = NULL;
size_t idx = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) {
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
while (readline (fp, &line)) { /* read each line in 'fp' */
printf (" line[%2zu] : %s\n", idx++, line);
free (line);
line = NULL;
}
if (fp != stdin) fclose (fp);
return 0;
}
/* read line from 'fp' allocate *buffer NCHAR in size
* realloc as necessary. Returns a pointer to *buffer
* on success, NULL otherwise.
*/
char *readline (FILE *fp, char **buffer)
{
int ch;
size_t buflen = 0, nchar = NCHAR;
*buffer = malloc (nchar); /* allocate buffer nchar in length */
if (!*buffer) {
fprintf (stderr, "readline() error: virtual memory exhausted.\n");
return NULL;
}
while ((ch = fgetc(fp)) != '\n' && ch != EOF)
{
(*buffer)[buflen++] = ch;
if (buflen + 1 >= nchar) { /* realloc */
char *tmp = realloc (*buffer, nchar * 2);
if (!tmp) {
fprintf (stderr, "error: realloc failed, "
"returning partial buffer.\n");
(*buffer)[buflen] = 0;
return *buffer;
}
*buffer = tmp;
nchar *= 2;
}
}
(*buffer)[buflen] = 0; /* nul-terminate */
if (buflen == 0 && ch == EOF) { /* return NULL if nothing read */
free (*buffer);
*buffer = NULL;
}
return *buffer;
}
Input
$ cat ../dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Output
$ ./bin/readline ../dat/captnjack.txt
line[ 0] : This is a tale
line[ 1] : Of Captain Jack Sparrow
line[ 2] : A Pirate So Brave
line[ 3] : On the Seven Seas.
Note
You really do not want to allocate for every character. malloc is a relatively expensive operation. It makes far more sense to allocate some reasonably anticipated number of characters for each line and then realloc if you reach that limit than it does to realloc for every character. If you are concerned with only allocating exactly the amount needed to hold the string, then allocate as described, and do one final realloc for strlen(buf) + 1 characters at the end.
Further, allocating in this manner, you can always set #define NCHAR 1 and force the allocation to start and 1-char if you like.
Memory Leak/Error Check
In any code your write that dynamically allocates memory, you have 2 responsibilites regarding any block of memory allocated: (1) always preserves a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed. It is imperative that you use a memory error checking program to insure you haven't written beyond/outside your allocated block of memory and to confirm that you have freed all the memory you have allocated. For Linux valgrind is the normal choice. There are so many subtle ways to misuse a block of memory that can cause real problems, there is no excuse not to do it. There are similar memory checkers for every platform. They are all simple to use. Just run your program through it.
$ valgrind ./bin/readline ../dat/captnjack.txt
==16460== Memcheck, a memory error detector
==16460== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==16460== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==16460== Command: ./bin/readline ../dat/captnjack.txt
==16460==
line[ 0] : This is a tale
line[ 1] : Of Captain Jack Sparrow
line[ 2] : A Pirate So Brave
line[ 3] : On the Seven Seas.
==16460==
==16460== HEAP SUMMARY:
==16460== in use at exit: 0 bytes in 0 blocks
==16460== total heap usage: 6 allocs, 6 frees, 888 bytes allocated
==16460==
==16460== All heap blocks were freed -- no leaks are possible
==16460==
==16460== For counts of detected and suppressed errors, rerun with: -v
==16460== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
You should see All heap blocks were freed -- no leaks are possible and ERROR SUMMARY: 0 errors from 0 contexts every time.
note: updated to reflect comment on return and type for ch and to fix potential memory leak when returning NULL from readline on EOF.
Code never makes the allocated buffer a string by appending a null character nor does it allocate space for it.
char * readline(FILE *fp, char *buffer) {
int ch;
int i = 0;
size_t buff_len = 0;
buffer = malloc(buff_len + 1);
if (!buffer) return NULL; // Out of memory
while ((ch = fgetc(fp)) != '\n' && ch != EOF) {
buff_len++;
void *tmp = realloc(buffer, buff_len + 1);
if (tmp == NULL) {
free(buffer);
return NULL; // Out of memory
}
buffer = tmp;
buffer[i] = (char) ch;
i++;
}
buffer[i] = '\0';
// Detect end
if (ch == EOF && (i == 0 || ferror(fp))) {
free(buffer);
return NULL;
}
return buffer;
}
Note:
Reallocating on each iteration is a bit wasteful.
Input value of buffer is never used. Perhaps redesign OP's function.
ch should be int to properly distinguish char fromEOF`.
As mentioned by chux, suggested alternative. Example code to display a text file.
#include <stdio.h>
#include <stdlib.h>
char * readline(FILE *fp) {
char * buffer = malloc(1024); /* assume longest line < 1023 chars */
int ch;
int i = 0;
while ((ch = fgetc(fp)) != '\n' && ch != EOF)
buffer[i++] = ch;
if(ch == EOF){ /* if eof, free buffer, return */
free(buffer);
return 0;
}
buffer[i++] = 0; /* add 0 terminator */
buffer = realloc(buffer, i);
return buffer;
}
int main(int argc, char *argv[])
{
FILE *fp;
char *pline;
if(argc < 2)
return 0;
fp = fopen(argv[1], "r");
while(1){
pline = readline(fp);
if(pline == 0)
break;
printf("%s\n", pline);
free(pline);
}
fclose(fp);
return 0;
}

Resources