Need help reading a file character by character in C

I have a question about reading a file character by character and counting the characters in C.
Here is my code:
void read_in(char** quotes){
FILE *frp = fopen(IN_FILE, "r");
char c;
size_t tmp_len =0, i=0;
//char* tmp[100];
//char* quotes[MAX_QUOTES];
//char str = fgets(str, sizeof(quotes),frp);
while((c=fgetc(frp)) != EOF){
if(frp == NULL){
printf("File is empty!");
fclose(frp); exit(1);
}
else{
if(c != '\n'){
printf("%c",c);
c=fgetc(frp);
tmp_len++;
}
}
char* tmp = (char*)calloc(tmp_len+1, sizeof(char));
fgets(tmp, sizeof(tmp), frp);
strcpy((char*)quotes[i], tmp);
printf("%s\n", (char*)quotes[i]);
i++;
}
}
It doesn't work but I don't understand why.
Thank you

From your question and through the comments, it is relatively clear you want to read all quotes (lines) in a file into dynamically allocated storage (screen 1) and then sort the lines by length and output the first 5 shortest lines (screen 2) saving the 5 shortest lines to a second output file (this part is left to you). Reading and storing all lines from a file isn't difficult -- but it isn't trivial either. It sounds basic, and it is, but it requires that you use all of the basic tools needed to interface with persistent storage (reading the file from disk/storage media) and your computer's memory subsystem (RAM) -- correctly.
Reading each line from a file isn't difficult, but like anything in C, it requires you to pay attention to the details. You can read from a file using character-oriented input functions (fgetc(), getc(), etc..), you can use formatted-input functions (fscanf()) and you can use line-oriented input functions such as (fgets() or POSIX getline()). Reading lines from a file is generally done with line-oriented functions, but there is nothing wrong with using a character-oriented approach either. In fact you can relatively easily write a function based around fgetc() that will read each line from a file for you.
In the trivial case where you know the maximum number of characters for the longest line in the file, you can use a 2D array of characters to store the entire file. This simplifies the process by eliminating the need to allocate storage dynamically, but has a number of disadvantages like each line in the file requiring the same storage as the longest line in the file, and by limiting the size of the file that can be stored to the size of your program stack. Allocating storage dynamically with (malloc, calloc, or realloc) eliminates these disadvantages and inefficiencies allowing you to store files up to the limit of the memory available on your computer. (there are methods that allow both to handle files of any size by using sliding-window techniques well beyond your needs here)
There is nothing difficult about handling dynamically allocated memory, or in copying or storing data within it on a character-by-character basis. That said, the responsibility for each allocation, tracking the amount of data written to each allocated block, reallocating to resize the block to ensure no data is written outside the bounds of each block and then freeing each allocated block when it is no longer needed -- is yours, the programmer. C gives the programmer the power to use each byte of memory available, and also places on the programmer the responsibility to use the memory correctly.
The basic approach to storing a file is simple. You read each line from the file, allocating/reallocating storage for each character until a '\n' or EOF is encountered. To coordinate all lines, you allocate a block of pointers, and you assign the address for each block of memory holding a line to a pointer, in sequence, reallocating the number of pointers required as needed to hold all lines.
Sometimes a picture really is worth 1000 words. With the basic approach you declare a pointer-to-pointer so you can allocate a block of memory containing pointers, and you assign the address of each allocated line to one of those pointers. For example, you could declare char **lines;. A pointer-to-pointer is a single pointer that points to a block of memory containing pointers. Each pointer in lines then has type char * and points to a block holding one line from the file, e.g.
char **lines;
|
|        allocated
|        pointers    allocated blocks holding each line
lines --> +----+     +-----+
          | p1 | --> | cat |
          +----+     +-----+--------------------------------------+
          | p2 | --> | Four score and seven years ago our fathers |
          +----+     +-------------+------------------------------+
          | p3 | --> | programming |
          +----+     +-------------------+
          | .. |     |       ...         |
          +----+     +-------------------+
          | pn | --> | last line read |
          +----+     +----------------+
You can make lines a bit more flexible to use by allocating 1 additional pointer and initializing that pointer to NULL which allows you to iterate over lines without knowing how many lines there are -- until NULL is encountered, e.g.
          | .. |     |       ...         |
          +----+     +-------------------+
          | pn | --> | last line read |
          +----+     +----------------+
          |pn+1|     | NULL |
          +----+     +------+
While you can put this all together in a single function, to help the learning process (and just for practical reusability), it is often easier to break this up into two functions: one that reads and allocates storage for each line, and a second that calls the first in a loop, allocating pointers and assigning the address of each allocated block holding a line read from the file to the next pointer in turn. When you are done, you have an allocated block of pointers where each of the pointers holds the address of (points to) an allocated block holding a line from the file.
You have indicated you want to read from the file with fgetc() and read a character at a time. There is nothing wrong with that, and there is little penalty to this approach since the underlying I/O subsystem provides a read-buffer that you are actually reading from rather than reading from disk one character at-a-time. (the size varies between compilers, but is generally provided through the BUFSIZ macro, both Linux and Windows compilers provide this)
There are virtually an unlimited number of ways to write a function that allocates storage to hold a line and then reads a line from the file one character at-a-time until a '\n' or EOF is encountered. You can return a pointer to the allocated block holding the line and pass a pointer parameter to be updated with the number of characters contained in the line, or you can have the function return the line length and pass the address-of a pointer as a parameter to be allocated and filled within the function. It is up to you. One way would be:
#define NSHORT 5 /* no. of shortest lines to display */
#define LINSZ 128 /* initial allocation size for each line */
...
/** read line from 'fp' stored in allocated block assigned to '*s' and
* return length of string stored on success, on EOF with no characters
* read, or on failure, return -1. Block of memory sized to accommodate
* exact length of string with nul-terminating char. unless -1 returned,
* *s guaranteed to contain nul-terminated string (empty-string allowed).
* caller responsible for freeing allocated memory.
*/
ssize_t fgetcline (char **s, FILE *fp)
{
int c; /* char read from fp */
size_t n = 0, size = LINSZ; /* no. of chars and allocation size */
void *tmp = realloc (NULL, size); /* tmp pointer for realloc use */
if (!tmp) /* validate every allocation/reallocation */
return -1;
*s = tmp; /* assign reallocated block to pointer */
while ((c = fgetc(fp)) != '\n' && c != EOF) { /* read chars until \n or EOF */
if (n + 1 == size) { /* check if realloc required */
/* realloc using temporary pointer */
if (!(tmp = realloc (*s, size + LINSZ))) {
free (*s); /* on failure, free partial line */
*s = NULL; /* don't leave caller with a dangling pointer */
return -1; /* return -1 */
}
*s = tmp; /* assign reallocated block to pointer */
size += LINSZ; /* update allocated size */
}
(*s)[n++] = c; /* assign char to index, increment */
}
(*s)[n] = 0; /* nul-terminate string */
if (n == 0 && c == EOF) { /* if nothing read and EOF, free mem return -1 */
free (*s);
*s = NULL; /* keeps the sentinel slot NULL in readfile() */
return -1;
}
if ((tmp = realloc (*s, n + 1))) /* final realloc to exact length */
*s = tmp; /* assign reallocated block to pointer */
return (ssize_t)n; /* return length (excluding nul-terminating char) */
}
(note: ssize_t is a POSIX signed integer type, the signed counterpart of size_t, which allows the function to return -1 on failure. It is declared in the sys/types.h header. You can adjust the type as desired.)
The fgetcline() function makes one final call to realloc to shrink the allocation to the exact number of characters needed to hold the line and the nul-terminating character.
The function that reads all lines in the file, allocating and reallocating pointers as required, does essentially the same thing for pointers that fgetcline() above does for characters. It allocates some initial number of pointers and then begins reading lines from the file, doubling the number of pointers each time more are needed. It also keeps one additional pointer set to NULL as a sentinel, which allows iterating over all pointers until NULL is reached (this is optional). The parameter n is updated with the number of lines stored to make that available back in the calling function. This function too can be written in a number of different ways; one would be:
/** read each line from `fp` and store in allocated block returning pointer to
* allocated block of pointers to each stored line with the final pointer
* after the last stored string set to NULL as a sentinel. 'n' is updated to
* the number of allocated and stored lines (excluding the sentinel NULL).
* returns valid pointer on success, NULL otherwise. caller is responsible for
* freeing both allocated lines and pointers.
*/
char **readfile (FILE *fp, size_t *n)
{
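size_t nptrs = LINSZ; /* no. of allocated pointers */
char **lines = malloc (nptrs * sizeof *lines); /* allocated block of pointers */
void *tmp = NULL; /* temp pointer for realloc use */
if (!lines) { /* validate every allocation */
perror ("malloc-lines");
return NULL;
}
*n = 0; /* start the line count at zero */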
/* read each line from 'fp' into allocated block, assign to next pointer */
while (fgetcline (&lines[*n], fp) != -1) {
lines[++(*n)] = NULL; /* set next pointer NULL as sentinel */
if (*n + 1 >= nptrs) { /* check if realloc required */
/* allocate using temporary pointer to prevent memory leak on failure */
if (!(tmp = realloc (lines, 2 * nptrs * sizeof *lines))) {
perror ("realloc-lines");
return lines; /* return original pointer on failure */
}
lines = tmp; /* assign reallocated block to pointer */
nptrs *= 2; /* update no. of pointers allocated */
}
}
/* final realloc sizing exact no. of pointers required */
if (!(tmp = realloc (lines, (*n + 1) * sizeof *lines)))
return lines; /* return original block on failure */
return tmp; /* return updated block of pointers on success */
}
Note above, the function takes an open FILE* parameter for the file rather than taking a filename to open within the function. You generally want to open the file in the calling function and validate that it is open for reading before calling a function to read all the lines. If the file cannot be opened in the caller, there is no reason to make the function call to read the lines from the file to begin with.
With a way to read and store all lines from your file done, you next need to turn to sorting the lines by length so you can output the 5 shortest lines (quotes). Since you will normally want to preserve the lines from your file in-order, the easiest way to sort the lines by length while preserving the original order is just to make a copy of the pointers and sort the copy of pointers by line length. For example, your lines pointer can continue to contain the pointers in original order, while the set of pointers sortedlines can hold the pointers in order sorted by line length, e.g.
int main (int argc, char **argv) {
char **lines = NULL, /* pointer to allocated block of pointers */
**sortedlines = NULL; /* copy of lines pointers to sort by length */
After reading the file and filling the lines pointer, you can copy the pointers to sortedlines (including the sentinel NULL), e.g.
/* allocate storage for copy of lines pointers (plus sentinel NULL) */
if (!(sortedlines = malloc ((n + 1) * sizeof *sortedlines))) {
perror ("malloc-sortedlines");
return 1;
}
/* copy pointers from lines to sorted lines (plus sentinel NULL) */
memcpy (sortedlines, lines, (n + 1) * sizeof *sortedlines);
Then you simply call qsort to sort the pointers in sortedlines by length. Your only job with qsort is to write the compare function. The prototype for the compare function is:
int compare (const void *a, const void *b);
Both a and b will be pointers-to elements being sorted. In your case with char **sortedlines;, the elements will be pointer-to-char, so a and b will both have type pointer-to-pointer to char. You simply write a compare function so it will return less than zero if the length of the line pointed to by a is less than that of b (already in the right order), return zero if the lengths are the same (no action needed) and return greater than zero if the length of a is greater than that of b (a swap is required). Writing the compare as the difference of two conditionals rather than a simple a - b prevents any potential overflow, e.g.
/** compare function for qsort, takes pointer-to-element in a & b */
int complength (const void *a, const void *b)
{
/* a & b are pointer-to-pointer to char */
char *pa = *(char * const *)a, /* pa is pointer to string */
*pb = *(char * const *)b; /* pb is pointer to string */
size_t lena = strlen(pa), /* length of pa */
lenb = strlen(pb); /* length of pb */
/* for numeric types returning result of (a > b) - (a < b) instead
* of result of a - b avoids potential overflow. returns -1, 0, 1.
*/
return (lena > lenb) - (lena < lenb);
}
Now you can simply pass the collection of objects, the number of objects, the size of each object and the function to use to sort the objects to qsort. It doesn't matter what you need to sort -- it works the same way every time. There is no reason you should ever need to "go write" a sort (except for educational purposes) -- that is what qsort is provided for. For example, here with sortedlines, all you need is:
qsort (sortedlines, n, sizeof *sortedlines, complength); /* sort by length */
Now you can display all lines by iterating through lines and display all lines in ascending line length through sortedlines. Obviously to display the first 5 lines, just iterate over the first 5 valid pointers in sortedlines. The same applies to opening another file for writing and writing those 5 lines to a new file. (that is left to you)
That's it. Is any of it difficult -- No. Is it trivial to do -- No. It is a basic part of programming in C that takes work to learn and to understand, but that is no different than anything worth learning. Putting all the pieces together in a working program to read and display all lines in a file and then sort and display the first 5 shortest lines you could do:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#define NSHORT 5 /* no. of shortest lines to display */
#define LINSZ 128 /* initial allocation size for each line */
/** compare function for qsort, takes pointer-to-element in a & b */
int complength (const void *a, const void *b)
{
/* a & b are pointer-to-pointer to char */
char *pa = *(char * const *)a, /* pa is pointer to string */
*pb = *(char * const *)b; /* pb is pointer to string */
size_t lena = strlen(pa), /* length of pa */
lenb = strlen(pb); /* length of pb */
/* for numeric types returning result of (a > b) - (a < b) instead
* of result of a - b avoids potential overflow. returns -1, 0, 1.
*/
return (lena > lenb) - (lena < lenb);
}
/** read line from 'fp' stored in allocated block assigned to '*s' and
* return length of string stored on success, on EOF with no characters
* read, or on failure, return -1. Block of memory sized to accommodate
* exact length of string with nul-terminating char. unless -1 returned,
* *s guaranteed to contain nul-terminated string (empty-string allowed).
* caller responsible for freeing allocated memory.
*/
ssize_t fgetcline (char **s, FILE *fp)
{
int c; /* char read from fp */
size_t n = 0, size = LINSZ; /* no. of chars and allocation size */
void *tmp = realloc (NULL, size); /* tmp pointer for realloc use */
if (!tmp) /* validate every allocation/reallocation */
return -1;
*s = tmp; /* assign reallocated block to pointer */
while ((c = fgetc(fp)) != '\n' && c != EOF) { /* read chars until \n or EOF */
if (n + 1 == size) { /* check if realloc required */
/* realloc using temporary pointer */
if (!(tmp = realloc (*s, size + LINSZ))) {
free (*s); /* on failure, free partial line */
*s = NULL; /* don't leave caller with a dangling pointer */
return -1; /* return -1 */
}
*s = tmp; /* assign reallocated block to pointer */
size += LINSZ; /* update allocated size */
}
(*s)[n++] = c; /* assign char to index, increment */
}
(*s)[n] = 0; /* nul-terminate string */
if (n == 0 && c == EOF) { /* if nothing read and EOF, free mem return -1 */
free (*s);
*s = NULL; /* keeps the sentinel slot NULL in readfile() */
return -1;
}
if ((tmp = realloc (*s, n + 1))) /* final realloc to exact length */
*s = tmp; /* assign reallocated block to pointer */
return (ssize_t)n; /* return length (excluding nul-terminating char) */
}
/** read each line from `fp` and store in allocated block returning pointer to
* allocated block of pointers to each stored line with the final pointer
* after the last stored string set to NULL as a sentinel. 'n' is updated to
* the number of allocated and stored lines (excluding the sentinel NULL).
* returns valid pointer on success, NULL otherwise. caller is responsible for
* freeing both allocated lines and pointers.
*/
char **readfile (FILE *fp, size_t *n)
{
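size_t nptrs = LINSZ; /* no. of allocated pointers */
char **lines = malloc (nptrs * sizeof *lines); /* allocated block of pointers */
void *tmp = NULL; /* temp pointer for realloc use */
if (!lines) { /* validate every allocation */
perror ("malloc-lines");
return NULL;
}
*n = 0; /* start the line count at zero */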
/* read each line from 'fp' into allocated block, assign to next pointer */
while (fgetcline (&lines[*n], fp) != -1) {
lines[++(*n)] = NULL; /* set next pointer NULL as sentinel */
if (*n + 1 >= nptrs) { /* check if realloc required */
/* allocate using temporary pointer to prevent memory leak on failure */
if (!(tmp = realloc (lines, 2 * nptrs * sizeof *lines))) {
perror ("realloc-lines");
return lines; /* return original pointer on failure */
}
lines = tmp; /* assign reallocated block to pointer */
nptrs *= 2; /* update no. of pointers allocated */
}
}
/* final realloc sizing exact no. of pointers required */
if (!(tmp = realloc (lines, (*n + 1) * sizeof *lines)))
return lines; /* return original block on failure */
return tmp; /* return updated block of pointers on success */
}
/** free all allocated memory (both lines and pointers) */
void freelines (char **lines, size_t nlines)
{
for (size_t i = 0; i < nlines; i++) /* loop over each pointer */
free (lines[i]); /* free allocated line */
free (lines); /* free pointers */
}
int main (int argc, char **argv) {
char **lines = NULL, /* pointer to allocated block of pointers */
**sortedlines = NULL; /* copy of lines pointers to sort by length */
size_t n = 0; /* no. of pointers with allocated lines */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
if (!(lines = readfile (fp, &n))) /* read all lines in file, fill lines */
return 1;
if (fp != stdin) /* close file if not stdin */
fclose (fp);
/* allocate storage for copy of lines pointers (plus sentinel NULL) */
if (!(sortedlines = malloc ((n + 1) * sizeof *sortedlines))) {
perror ("malloc-sortedlines");
return 1;
}
/* copy pointers from lines to sorted lines (plus sentinel NULL) */
memcpy (sortedlines, lines, (n + 1) * sizeof *sortedlines);
qsort (sortedlines, n, sizeof *sortedlines, complength); /* sort by length */
/* output all lines from file (first screen) */
puts ("All lines:\n\nline : text");
for (size_t i = 0; i < n; i++)
printf ("%4zu : %s\n", i + 1, lines[i]);
/* output first five shortest lines (second screen) */
puts ("\n5 shortest lines:\n\nline : text");
for (size_t i = 0; i < (n >= NSHORT ? NSHORT : n); i++)
printf ("%4zu : %s\n", i + 1, sortedlines[i]);
freelines (lines, n); /* free all allocated memory for lines */
free (sortedlines); /* free block of pointers */
}
(note: the file reads from the filename passed as the first argument to the program, or from stdin if no argument is given)
Example Input File
$ cat dat/fleascatsdogs.txt
My dog
My fat cat
My snake
My dog has fleas
My cat has none
Lucky cat
My snake has scales
Example Use/Output
$ ./bin/fgetclinesimple dat/fleascatsdogs.txt
All lines:
line : text
1 : My dog
2 : My fat cat
3 : My snake
4 : My dog has fleas
5 : My cat has none
6 : Lucky cat
7 : My snake has scales
5 shortest lines:
line : text
1 : My dog
2 : My snake
3 : Lucky cat
4 : My fat cat
5 : My cat has none
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/fgetclinesimple dat/fleascatsdogs.txt
==5900== Memcheck, a memory error detector
==5900== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==5900== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==5900== Command: ./bin/fgetclinesimple dat/fleascatsdogs.txt
==5900==
All lines:
line : text
1 : My dog
2 : My fat cat
3 : My snake
4 : My dog has fleas
5 : My cat has none
6 : Lucky cat
7 : My snake has scales
5 shortest lines:
line : text
1 : My dog
2 : My snake
3 : Lucky cat
4 : My fat cat
5 : My cat has none
==5900==
==5900== HEAP SUMMARY:
==5900== in use at exit: 0 bytes in 0 blocks
==5900== total heap usage: 21 allocs, 21 frees, 7,938 bytes allocated
==5900==
==5900== All heap blocks were freed -- no leaks are possible
==5900==
==5900== For counts of detected and suppressed errors, rerun with: -v
==5900== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
There is a lot here, and as with any "how do I do X?" question, the devil is always in the details: the proper use of each function and the proper validation of each input and each allocation/reallocation. Each part is just as important as the other to ensure your code does what you need it to do -- in a defined way. Look things over, take your time to digest the parts, and let me know if you have further questions.

If you are using Linux (or another POSIX system) you can use getline() instead of fgetc() and fgets(), because getline() takes care of the memory allocation for you.
Example:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
FILE *fp;
char *line = NULL;
size_t len = 0;
ssize_t read;
if (argc != 2)
{
printf("usage: rf <filename>\n");
exit(EXIT_FAILURE);
}
fp = fopen(argv[1], "r");
if (fp == NULL)
{
perror("fopen");
exit(EXIT_FAILURE);
}
while ((read = getline(&line, &len, fp)) != -1) {
printf("Retrieved line of length %zd :\n", read);
printf("%s", line);
}
free(line);
fclose(fp);
exit(EXIT_SUCCESS);
}

Related

Differences between allocating memory spaces for a string based on the size of its characters vs. the size of the entire string

When we allocate memory for a string, do the following 2 ways give the same result?
char *s = "abc";
char *st1 = (char *)malloc(sizeof(char)*strlen(s));
char *st2 = (char *)malloc(sizeof(s));
In other words, does allocating the memory based on the size of its characters give the same result as allocating based on the size of the whole string?
If I use the latter method, is it still possible for me to add to that memory character by character, such as:
*st = 'a';
st++;
*st = 'b';
or do I have to add a whole string at once now?
Let's see if we can't get you straightened out on your question and on allocating (and reallocating) storage. To begin, when you declare:
char *s = "abc";
You have declared a pointer to char s and you have assigned the starting address for the String Literal "abc" to the pointer s. Whenever you attempt to use sizeof() on a_pointer, you get sizeof(a_pointer) which is typically 8-bytes on x86_64 (or 4-bytes on x86, etc..)
If you take sizeof("abc"); you are taking the size of a character array with size 4 (e.g. {'a', 'b', 'c', '\0'}), because a string literal is an array of char initialized to hold the string "abc" (including the nul-terminating character). Also note, that on virtually all systems, a string literal is created in read-only memory and cannot be modified, it is immutable.
If you want to allocate storage to hold a copy of the string "abc", you must allocate strlen("abc") + 1 characters (the +1 is for the nul-terminating character '\0', which is simply ASCII 0; see ASCII Table & Description).
Whenever you allocate memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed. So if you allocate for char *st = malloc (len + 1); characters, you do not want to iterate with the pointer st (e.g. no st++). Instead, declare a second pointer, char *p = st; and you are free to iterate with p.
Also, in C, there is no need to cast the return of malloc, it is unnecessary. See: Do I cast the result of malloc?.
If you want to add to an allocation, you use realloc(), which will resize the block for you, moving it to a new location and copying the existing contents if needed. When using realloc(), you always reallocate using a temporary pointer (e.g. don't st = realloc (st, new_size);) because when realloc() fails, it returns NULL, and if you assign that to your pointer st, you have just lost the original pointer and created a memory leak. Instead, use a temporary pointer, e.g. void *tmp = realloc (st, new_size); then validate that realloc() succeeded before assigning st = tmp;
Now, reading between the lines, that appears to be where you are going with your example. The following shows how it can be done, keeping track of the amount of memory allocated and the amount of memory used. When used + 1 == allocated, you reallocate more memory (remembering to keep +1 byte available for the nul-terminating character).
A short example would be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define THISMANY 23
int main (void) {
char *s = "abc", *st, *p; /* string literal and pointer st */
size_t len = strlen(s), /* length of s */
allocated = len + 1, /* number of bytes in new block allocated */
used = 0; /* number of bytes in new block used */
st = malloc (allocated); /* allocate storage for copy of s */
p = st; /* pointer to iterate with, preserving st */
if (!st) { /* validate EVERY allocation */
perror ("malloc-st");
return 1;
}
for (int i = 0; s[i]; i++) { /* copy s to new block of memory */
*p++ = s[i]; /* (could use strcpy) */
used++; /* advance counter */
}
*p = 0; /* nul-terminate copy */
for (size_t i = 0; i < THISMANY; i++) { /* loop THISMANY times */
if (used + 1 == allocated) { /* check if realloc needed (remember '\0') */
/* always realloc using temporary pointer */
void *tmp = realloc (st, 2 * allocated); /* realloc 2X current */
if (!tmp) { /* validate EVERY reallocation */
perror ("realloc-st");
break; /* don't exit, original st still valid */
}
st = tmp; /* assign reallocated block to st */
p = st + used; /* realloc may move the block, so re-point p into it */
allocated *= 2; /* update allocated amount */
}
*p++ = 'a' + used++; /* assign new char, increment used */
}
*p = 0; /* nul-terminate */
printf ("result st : %s\n" /* output final string, length, allocated */
"length st : %zu bytes\n"
"final size : %zu bytes\n", st, strlen(st), allocated);
free (st); /* don't forget to free what you have allocated */
}
Example Use/Output
$ ./bin/sizeofs
result st : abcdefghijklmnopqrstuvwxyz
length st : 26 bytes
final size : 32 bytes
Look things over and let me know if this answered your questions, and if not, leave a comment and I'm happy to help further.
If you are still shaky on what a pointer is, and would like more information, here are a few links that provide basic discussions of pointers that may help. Difference between char pp and (char) p? and Pointer to pointer of structs indexing out of bounds(?)... (ignore the titles, the answers discuss pointer basics)

How to read a text file and store in an array in C

The script successfully prints the text file. However, I want to store the contents of the text file in an array. I have looked in a lot of places, but I do not fully understand the information I have come across. Is there any way I can get some guidance?
#include <stdio.h>
#include <stdlib.h>
int main()
{
// OPENS THE FILE
FILE *fp = fopen("/classes/cs3304/cs330432/Programs/StringerTest/people.txt", "r");
size_t len = 1000;
char *word = malloc(sizeof(char) * len);
// CHECKS IF THE FILE EXISTS, IF IT DOESN'T IT WILL PRINT OUT A STATEMENT SAYING SO
if (fp == NULL)
{
printf("file not found");
return 0;
}
while(fgets(word, len, fp) != NULL)
{
printf("%s", word);
}
free(word);
}
the text file has the following in it(just a list of words):
endorse
vertical
glove
legend
scenario
kinship
volunteer
scrap
range
elect
release
sweet
company
solve
elapse
arrest
witch
invasion
disclose
professor
plaintiff
definition
bow
chauvinist
Let's see if we can't get you straightened out. First, you are thinking in the right direction, and you should be commended for using fgets() to read each line into a fixed buffer (character array), and then you need to collect and store all of the lines so that they are available for use by your program -- that appears to be where the wheels fell off.
Basic Outline of Approach
In an overview, when you want to handle an unlimited number of lines, you have two different types of blocks of memory you are going to allocate and manage. The first is a block of memory you allocate that will hold some number of pointers (one for each line you will store). It doesn't matter how many you initially allocate, because you will keep track of the number allocated (number available) and the number used. When (used == available) you will realloc() a bigger block of memory to hold more pointers and keep on going.
The second type of block of memory you will handle is the storage for each line. No mystery there. You will allocate storage for each character (+1 for the nul-terminating character) and you will copy the line from your fixed buffer to the allocated block.
The two blocks of memory work together, because to create your collection, you simply assign the address for the block of memory holding the line of data to the next available pointer.
Let's think through a short example where we declare char **lines; as the pointer to the block of memory holding pointers. Then say we allocate two-pointers initially, we have valid pointers available for lines[0] and lines[1]. We track the number of pointers available with nptrs and the number used with used. So initially nptrs = 2; and used = 0;.
When we read our first line with fgets(), we will trim the '\n' from the end of the string and then get the length of the string (len = strlen(buffer);). We can then allocate storage for the string assigning the address of the allocated block to our first pointer, e.g.
lines[used] = malloc (len + 1);
and then copy the contents of buffer to lines[0], e.g.
memcpy (lines[used], buffer, len + 1);
(note: there is no reason to call strcpy() and have it scan for end-of-string again, we already know how many characters to copy -- including the nul-terminating character)
Finally, all that is needed to keep our counters happy is to increment used by one. We store the next line the same way, and on the 3rd iteration used == nptrs so we realloc() more pointers (generally just doubling the number of pointers each time a realloc() is required). Doubling is a good balance between the number of calls to realloc() and the growth of the number of pointers. You are free to increment the allocation any way you like, but avoid calling realloc() for every line.
So you keep reading lines, checking if realloc() is required, reallocating if needed, and allocating for each line assigning the starting address to each of your pointers in turn. The only additional note is that when you realloc() you always use a temporary pointer so when realloc() fails and returns NULL, you do not overwrite your original pointer with NULL losing the starting address to the block of memory holding pointers -- creating a memory leak.
Implementation
The details were left out of the overview, so let's look at a short example to read an unknown number of lines from a file (each line being 1024 characters or less) and storing each line in a collection using a pointer-to-pointer to char as described above. Don't use Magic-Numbers in your code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
#define NPTRS 2 /* initial no. of pointers to allocate (lines) */
Don't hardcode filenames in your code either, that is what argc and argv are for in int main (int argc, char **argv). Pass the filename to read as the first argument to the program (or read from stdin by default if no argument is given):
int main (int argc, char **argv) {
char buf[MAXC], /* fixed buffer to read each line */
**lines = NULL; /* pointer to pointer to hold collection of lines */
size_t nptrs = NPTRS, /* number of pointers available */
used = 0; /* number of pointers used */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
(note: you should not need to recompile your program just to read from a different filename)
Now allocate and Validate your initial number of pointers
/* allocate/validate block holding initial nptrs pointers */
if ((lines = malloc (nptrs * sizeof *lines)) == NULL) {
perror ("malloc-lines");
exit (EXIT_FAILURE);
}
Read each line, trim the '\n' from the end, and get the number of characters that remain after the '\n' has been removed (you can use strcspn() to do it all at once):
while (fgets (buf, MAXC, fp)) { /* read each line into buf */
size_t len;
buf[(len = strcspn (buf, "\n"))] = 0; /* trim \n, save length */
Next we check if a reallocation is needed and if so reallocate using a temporary pointer:
if (used == nptrs) { /* check if realloc of lines needed */
/* always realloc using temporary pointer (doubling no. of pointers) */
void *tmp = realloc (lines, (2 * nptrs) * sizeof *lines);
if (!tmp) { /* validate reallocation */
perror ("realloc-lines");
break; /* don't exit, lines still good */
}
lines = tmp; /* assign reallocated block to lines */
nptrs *= 2; /* update no. of pointers allocated */
/* (optionally) zero all newly allocated memory here */
}
Now allocate and Validate the storage for the line and copy the line to the new storage, incrementing used when done -- completing your read-loop.
/* allocate/validate storage for line */
if (!(lines[used] = malloc (len + 1))) {
perror ("malloc-lines[used]");
break;
}
memcpy (lines[used], buf, len + 1); /* copy line from buf to lines[used] */
used += 1; /* increment used pointer count */
}
/* (optionally) realloc to 'used' pointers to size no. of pointers exactly here */
if (fp != stdin) /* close file if not stdin */
fclose (fp);
Now you can use the lines stored in lines as needed in your program, remembering to free the memory for each line when done and then finally freeing the block of pointers, e.g.
/* use lines as needed (simply outputting here) */
for (size_t i = 0; i < used; i++) {
printf ("line[%3zu] : %s\n", i, lines[i]);
free (lines[i]); /* free line storage when done */
}
free (lines); /* free pointers when done */
}
That's all that is needed. Now you can go read the 324,000 words in /usr/share/dict/words (or perhaps on your system /var/lib/dict/words depending on distro) and you will not have any problems doing so.
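Since the goal was to sort the stored lines by length and output the five shortest, here is a minimal sketch of the sorting step. It assumes the lines and used variables built by the read loop above. The names cmp_len and sort_by_length are hypothetical helpers of mine, not part of the program above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparison: order two char* elements by string length, shortest first */
int cmp_len (const void *a, const void *b)
{
    const char *sa = *(const char * const *)a,
               *sb = *(const char * const *)b;
    size_t la = strlen (sa), lb = strlen (sb);

    return (la > lb) - (la < lb);   /* -1, 0 or 1 without subtracting sizes */
}

/* sort the first 'used' pointers in 'lines' by ascending line length */
void sort_by_length (char **lines, size_t used)
{
    assert (lines != NULL || used == 0);

    if (used > 1)                   /* nothing to do for 0 or 1 lines */
        qsort (lines, used, sizeof *lines, cmp_len);
}
```

You would call sort_by_length (lines, used); after the read loop and before the output loop. Only the pointers are rearranged, the line storage itself does not move, so the later free() calls are unaffected. The shortest lines are then lines[0] through lines[4] (or fewer if used is less than 5).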
Input File
A short example file:
$ cat dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Example Use/Output
$ ./bin/fgets_lines_dyn_simple dat/captnjack.txt
line[ 0] : This is a tale
line[ 1] : Of Captain Jack Sparrow
line[ 2] : A Pirate So Brave
line[ 3] : On the Seven Seas.
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/fgets_lines_dyn_simple dat/captnjack.txt
==8156== Memcheck, a memory error detector
==8156== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==8156== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==8156== Command: ./bin/fgets_lines_dyn_simple dat/captnjack.txt
==8156==
line[ 0] : This is a tale
line[ 1] : Of Captain Jack Sparrow
line[ 2] : A Pirate So Brave
line[ 3] : On the Seven Seas.
==8156==
==8156== HEAP SUMMARY:
==8156== in use at exit: 0 bytes in 0 blocks
==8156== total heap usage: 9 allocs, 9 frees, 5,796 bytes allocated
==8156==
==8156== All heap blocks were freed -- no leaks are possible
==8156==
==8156== For counts of detected and suppressed errors, rerun with: -v
==8156== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
The Full Code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
#define NPTRS 2 /* initial no. of pointers to allocate (lines) */
int main (int argc, char **argv) {
char buf[MAXC], /* fixed buffer to read each line */
**lines = NULL; /* pointer to pointer to hold collection of lines */
size_t nptrs = NPTRS, /* number of pointers available */
used = 0; /* number of pointers used */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
/* allocate/validate block holding initial nptrs pointers */
if ((lines = malloc (nptrs * sizeof *lines)) == NULL) {
perror ("malloc-lines");
exit (EXIT_FAILURE);
}
while (fgets (buf, MAXC, fp)) { /* read each line into buf */
size_t len;
buf[(len = strcspn (buf, "\n"))] = 0; /* trim \n, save length */
if (used == nptrs) { /* check if realloc of lines needed */
/* always realloc using temporary pointer (doubling no. of pointers) */
void *tmp = realloc (lines, (2 * nptrs) * sizeof *lines);
if (!tmp) { /* validate reallocation */
perror ("realloc-lines");
break; /* don't exit, lines still good */
}
lines = tmp; /* assign reallocated block to lines */
nptrs *= 2; /* update no. of pointers allocated */
/* (optionally) zero all newly allocated memory here */
}
/* allocate/validate storage for line */
if (!(lines[used] = malloc (len + 1))) {
perror ("malloc-lines[used]");
break;
}
memcpy (lines[used], buf, len + 1); /* copy line from buf to lines[used] */
used += 1; /* increment used pointer count */
}
/* (optionally) realloc to 'used' pointers to size no. of pointers exactly here */
if (fp != stdin) /* close file if not stdin */
fclose (fp);
/* use lines as needed (simply outputting here) */
for (size_t i = 0; i < used; i++) {
printf ("line[%3zu] : %s\n", i, lines[i]);
free (lines[i]); /* free line storage when done */
}
free (lines); /* free pointers when done */
}
Look things over and let me know if you have any questions. If you also wanted to read lines of unknown length (millions of characters long), you would simply loop doing the same thing allocating and reallocating for each line until the '\n' character was found (or EOF) marking the end of the line. It is no different in principle than what we have done above for the pointers.
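To make that last point concrete, here is a rough sketch of reading one line of unknown length character by character with fgetc(), doubling the storage whenever it fills. The name read_line and the initial size INITC are my own assumptions, so adjust to taste. Note that c is an int, not a char, so that EOF can be told apart from a valid character.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define INITC 16    /* initial no. of chars to allocate per line (assumed) */

/* read one line of any length from fp into allocated storage, trimming
 * the '\n'. Returns the line (caller frees), or NULL on EOF with nothing
 * read or on allocation failure.
 */
char *read_line (FILE *fp)
{
    size_t avail = INITC, used = 0;
    char *line;
    int c;                          /* int, not char, so EOF is detectable */

    assert (fp != NULL);
    if (!(line = malloc (avail)))
        return NULL;

    while ((c = fgetc (fp)) != EOF && c != '\n') {
        if (used + 1 == avail) {    /* keep 1 char free for the nul */
            void *tmp = realloc (line, 2 * avail);  /* temporary pointer */
            if (!tmp) {
                free (line);
                return NULL;
            }
            line = tmp;
            avail *= 2;
        }
        line[used++] = (char)c;
    }

    if (c == EOF && used == 0) {    /* nothing left to read */
        free (line);
        return NULL;
    }
    line[used] = 0;                 /* nul-terminate */

    return line;
}
```

In the program above you could then replace the fgets() call and the per-line malloc/memcpy with lines[used] = read_line (fp); and stop the loop when it returns NULL.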

How to apply Memory Allocation to a C Program which counts the amount of words in a list? (eg. malloc, calloc, free)

Considering the code provided by @David C. Rankin in this previous answer:
How to count only words that start with a Capital in a list?
How do you optimise this code to include Memory Allocation for much larger text files? With this code below it will complete for small .txt files.
However, what is the best way to set memory allocation to this code so that C (Programming Language) does not run out of memory. Is it best to use linked lists?
/**
* C program to count occurrences of all words in a file.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <limits.h>
#define MAX_WORD 50 /* max word size */
#define MAX_WORDS 512 /* max number of words */
#ifndef PATH_MAX
#define PATH_MAX 2048 /* max path (defined for Linux in limits.h) */
#endif
typedef struct { /* use a struct to hold */
char word[MAX_WORD]; /* lowercase word, and */
int cap, count; /* if it appears capitalized, and its count */
} words_t;
char *strlwr (char *str) /* no need for unsigned char */
{
char *p = str;
while (*p) {
*p = tolower(*p);
p++;
}
return str;
}
int main (void) {
FILE *fptr;
char path[PATH_MAX], word[MAX_WORD];
size_t i, len, index = 0;
/* Array of struct of distinct words, initialized all zero */
words_t words[MAX_WORDS] = {{ .word = "" }};
/* Input file path */
printf ("Enter file path: ");
if (scanf ("%s", path) != 1) { /* validate every input */
fputs ("error: invalid file path or cancellation.\n", stderr);
return 1;
}
fptr = fopen (path, "r"); /* open file */
if (fptr == NULL) { /* validate file open */
fputs ( "Unable to open file.\n"
"Please check you have read privileges.\n", stderr);
exit (EXIT_FAILURE);
}
while (index < MAX_WORDS && /* protect array bounds */
fscanf (fptr, "%s", word) == 1) { /* while valid word read */
int iscap = 0, isunique = 1; /* is capital, is unique flags */
if (isupper (*word)) /* is the word uppercase */
iscap = 1;
/* remove all trailing punctuation characters */
len = strlen (word); /* get length */
while (len && ispunct(word[len - 1])) /* only if len > 0 */
word[--len] = 0;
strlwr (word); /* convert word to lowercase */
/* check if word exists in list of all distinct words */
for (i = 0; i < index; i++) {
if (strcmp(words[i].word, word) == 0) {
isunique = 0; /* set unique flag zero */
if (iscap) /* if capital flag set */
words[i].cap = iscap; /* set capital flag in struct */
words[i].count++; /* increment word count */
break; /* bail - done */
}
}
if (isunique) { /* if unique, add to array, increment index */
memcpy (words[index].word, word, len + 1); /* have len */
if (iscap) /* if cap flag set */
words[index].cap = iscap; /* set capital flag in struct */
words[index++].count++; /* increment count & index */
}
}
fclose (fptr); /* close file */
/*
* Print occurrences of all words in file.
*/
puts ("\nOccurrences of all distinct words with Cap in file:");
for (i = 0; i < index; i++) {
if (words[i].cap) {
strcpy (word, words[i].word);
*word = toupper (*word);
/*
* %-15s prints string in 15 character width.
* - is used to print string left align inside
* 15 character width space.
*/
printf("%-15s %d\n", word, words[i].count);
}
}
return 0;
}
Example Use/Output
Using your posted input
$ ./bin/unique_words_with_cap
Enter file path: dat/girljumped.txt
Occurrences of all distinct words with Cap in file:
Any 7
One 4
Some 10
The 6
A 13
Since you already have an answer using a fixed-size array of struct to hold the information, changing from using the fixed-size array where storage is automatically reserved for you on the stack, to dynamically allocated storage where you can realloc as needed, simply requires initially declaring a pointer-to-type rather than array-of-type, and then allocating storage for each struct.
Where before, with a fixed-size array of 512 elements you would have:
#define MAX_WORDS 512 /* max number of words */
...
/* Array of struct of distinct words, initialized all zero */
words_t words[MAX_WORDS] = {{ .word = "" }};
When dynamically allocating, simply declare a pointer-to-type and provide an initial allocation of some reasonable number of elements, e.g.
#define MAX_WORDS 8 /* initial number of struct to allocate */
...
/* pointer to allocated block of max_words struct initialized zero */
words_t *words = calloc (max_words, sizeof *words);
(note: you can allocate with either malloc, calloc or realloc, but only calloc allocates and also sets all bytes zero. In your case since you want the .cap and .count members initialized zero, calloc is a sensible choice)
It's worth pausing a bit to understand that whether you use a fixed-size array or an allocated block of memory, you are accessing your data through a pointer to the first element. The only real difference is that with a fixed array the compiler reserves storage for you (typically on the stack), while with allocation you are responsible for reserving the storage yourself.
Access to the elements will be exactly the same because on access, an array is converted to a pointer to the first element. See: C11 Standard - 6.3.2.1 Other Operands - Lvalues, arrays, and function designators(p3) Either way you access the memory through a pointer to the first element. When dynamically allocating, you are assigning the address of the first element to your pointer rather than the compiler reserving storage for the array. Whether it is an array with storage reserved for you, or you declare a pointer and assign an allocated block of memory to it -- how you access the elements will be identical. (pause done)
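A tiny sketch may help here. The function below takes a pointer to the first element and neither knows nor cares whether that element lives in an array or in an allocated block. The names sum_counts and make_counts are made up for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* sum n ints reached through a pointer to the first element; the function
 * cannot tell whether the storage is an array or an allocated block.
 */
int sum_counts (const int *counts, size_t n)
{
    int sum = 0;

    assert (counts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += counts[i];           /* same as *(counts + i) */

    return sum;
}

/* build an allocated block holding 1..n so it can be compared with an
 * array holding the same values; returns NULL on allocation failure.
 */
int *make_counts (size_t n)
{
    int *block = malloc (n * sizeof *block);

    if (!block)
        return NULL;
    for (size_t i = 0; i < n; i++)
        block[i] = (int)(i + 1);

    return block;
}
```

Calling sum_counts (fixed, 4) with int fixed[4] = { 1, 2, 3, 4 }; and sum_counts (block, 4) with block = make_counts (4); goes through exactly the same indexing. The one difference is that you must free (block) when done, while the array is released for you.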
When you allocate, it is up to you to validate that the allocation succeeds. So you would follow your allocation with:
if (!words) { /* validate every allocation */
perror ("calloc-words");
exit (EXIT_FAILURE);
}
You are already keeping track of index telling you how many struct you have filled, you simply need to add one more variable to track how many struct are available (size_t max_words = MAX_WORDS; gives you the 2nd variable set to the initial allocation size MAX_WORDS). So your test for "Do I need to realloc now?" is simply when filled == available, or in your case if (index == max_words).
Since you now have the ability to realloc, your read loop no longer has to protect your array bounds and you can simply read each word in the file, e.g.
while (fscanf (fptr, "%49s", word) == 1) { /* read word (max MAX_WORD-1 chars) */
int iscap = 0, isunique = 1; /* is capital, is unique flags */
...
Now all that remains is the index == max_words test before you fill another element. You can either place the test and realloc before the for and if blocks for handling isunique, which is fine, or you can actually place it within the if (isunique) block since technically unless you are adding a unique word, no realloc will be required. The only difference it makes is a corner-case where index == max_words and you call realloc before your for loop where the last word is not-unique, you may make one call to realloc where it wasn't technically required (think through that).
To prevent that one realloc too many, place the test and realloc immediately before the new element will be filled, e.g.
if (isunique) { /* if unique, add to array, increment index */
if (index == max_words) { /* is realloc needed? */
/* always use a temporary pointer with realloc */
void *tmp = realloc (words, 2 * max_words * sizeof *words);
if (!tmp) {
perror ("realloc-words");
break; /* don't exit, original data still valid */
}
words = tmp; /* assign reallocated block to words */
/* (optional) set all new memory to zero */
memset (words + max_words, 0, max_words * sizeof *words);
max_words *= 2; /* update max_words to reflect new limit */
}
memcpy (words[index].word, word, len + 1); /* have len */
if (iscap) /* if cap flag set */
words[index].cap = iscap; /* set capital flag in struct */
words[index++].count++; /* increment count & index */
}
Now let's look closer at the reallocation itself, e.g.
if (index == max_words) { /* is realloc needed? */
/* always use a temporary pointer with realloc */
void *tmp = realloc (words, 2 * max_words * sizeof *words);
if (!tmp) { /* validate every allocation */
perror ("realloc-words");
break; /* don't exit, original data still valid */
}
words = tmp; /* assign reallocated block to words */
/* (optional) set all new memory to zero */
memset (words + max_words, 0, max_words * sizeof *words);
max_words *= 2; /* update max_words to reflect new limit */
}
The realloc call itself is void *tmp = realloc (words, 2 * max_words * sizeof *words);. Why not just words = realloc (words, 2 * max_words * sizeof *words);? Answer: you never realloc using the pointer itself, you always use a temporary pointer. Why? On success realloc returns a block of the new size holding the existing data (extending the block in place where it can, otherwise allocating new storage, copying the data and freeing the old block). When (not if) realloc fails, it returns NULL and does not touch the old block of memory. If you blindly assign that NULL to your existing pointer words, you have just overwritten the address of your old block of memory with NULL, creating a memory leak because you no longer have a reference to the old block and it cannot be freed. So lesson learned: always realloc with a temporary pointer!
If realloc succeeds, what then? Pay close attention to the lines:
words = tmp; /* assign reallocated block to words */
/* (optional) set all new memory to zero */
memset (words + max_words, 0, max_words * sizeof *words);
max_words *= 2; /* update max_words to reflect new limit */
The first simply assigns the address for the new block of memory created and filled by realloc to your words pointer. (words now points to a block of memory with twice as many elements as it had before.)
The second line: recall that realloc and malloc do not initialize the new memory to zero. If you want the memory initialized to zero (which for your .cap and .count members is really helpful), you have to do that yourself with memset. So what needs to be set to zero? All the memory that wasn't in your original block. Where is that? Well, it starts at words + max_words. How many zeros do you have to write? You have to fill all memory from words + max_words to the end of the block. Since you doubled the size, you simply have to zero an amount equal to the original size, starting at words + max_words, which is max_words * sizeof *words bytes of memory. (remember we used 2 * max_words * sizeof *words as the new size, and we have NOT updated max_words yet, so it still holds the original size)
Lastly, now it is time to update max_words. Here just make it match whatever you added to your allocation in realloc above. I simply doubled the size of the current allocation each time realloc is called, so to update max_words to the new allocation size, you simply multiply by 2 with max_words *= 2;. You can add as little or as much memory as you like each time. You could scale by 3/2, or you could add a fixed number of elements (say 10); it is completely up to you, but avoid calling realloc to add 1 element each time. You can do it, but allocation and reallocation are relatively expensive operations, so it is better to add a reasonably sized block each time you realloc, and doubling is a reasonable balance between memory growth and the number of times realloc is called.
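If you find yourself repeating those three steps, they can be wrapped in one helper. What follows is only a sketch under my own assumptions: the name grow_zeroed is invented, and the overflow check on the size computation is an extra not discussed above (2 * max_words * sizeof *words could wrap around for enormous counts).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* double a block of *nelem elements of 'size' bytes each, zeroing the new
 * upper half. Returns the (possibly moved) block and updates *nelem on
 * success. Returns NULL on failure, leaving the old block and *nelem
 * untouched so the original data is still valid.
 */
void *grow_zeroed (void *block, size_t *nelem, size_t size)
{
    size_t oldn;
    void *tmp;

    assert (nelem != NULL && size > 0);
    oldn = *nelem;

    /* refuse if 2 * oldn * size would overflow size_t */
    if (oldn == 0 || oldn > SIZE_MAX / 2 / size)
        return NULL;

    tmp = realloc (block, 2 * oldn * size);
    if (!tmp)
        return NULL;

    /* zero only the memory that was not part of the original block */
    memset ((char *)tmp + oldn * size, 0, oldn * size);
    *nelem = 2 * oldn;

    return tmp;
}
```

The caller still uses a temporary pointer, e.g. void *tmp = grow_zeroed (words, &max_words, sizeof *words); followed by if (!tmp) break; and words = tmp;, so a failed reallocation never overwrites words.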
Putting it all together, you could do:
/**
* C program to count occurrences of all words in a file.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <limits.h>
#define MAX_WORD 50 /* max word size */
#define MAX_WORDS 8 /* initial number of struct to allocate */
#ifndef PATH_MAX
#define PATH_MAX 2048 /* max path (defined for Linux in limits.h) */
#endif
typedef struct { /* use a struct to hold */
char word[MAX_WORD]; /* lowercase word, and */
int cap, count; /* if it appears capitalized, and its count */
} words_t;
char *strlwr (char *str) /* no need for unsigned char */
{
char *p = str;
while (*p) {
*p = tolower(*p);
p++;
}
return str;
}
int main (void) {
FILE *fptr;
char path[PATH_MAX], word[MAX_WORD];
size_t i, len, index = 0, max_words = MAX_WORDS;
/* pointer to allocated block of max_words struct initialized zero */
words_t *words = calloc (max_words, sizeof *words);
if (!words) { /* validate every allocation */
perror ("calloc-words");
exit (EXIT_FAILURE);
}
/* Input file path */
printf ("Enter file path: ");
if (scanf ("%s", path) != 1) { /* validate every input */
fputs ("error: invalid file path or cancellation.\n", stderr);
return 1;
}
fptr = fopen (path, "r"); /* open file */
if (fptr == NULL) { /* validate file open */
fputs ( "Unable to open file.\n"
"Please check you have read privileges.\n", stderr);
exit (EXIT_FAILURE);
}
while (fscanf (fptr, "%49s", word) == 1) { /* read word (max MAX_WORD-1 chars) */
int iscap = 0, isunique = 1; /* is capital, is unique flags */
if (isupper (*word)) /* is the word uppercase */
iscap = 1;
/* remove all trailing punctuation characters */
len = strlen (word); /* get length */
while (len && ispunct(word[len - 1])) /* only if len > 0 */
word[--len] = 0;
strlwr (word); /* convert word to lowercase */
/* check if word exists in list of all distinct words */
for (i = 0; i < index; i++) {
if (strcmp(words[i].word, word) == 0) {
isunique = 0; /* set unique flag zero */
if (iscap) /* if capital flag set */
words[i].cap = iscap; /* set capital flag in struct */
words[i].count++; /* increment word count */
break; /* bail - done */
}
}
if (isunique) { /* if unique, add to array, increment index */
if (index == max_words) { /* is realloc needed? */
/* always use a temporary pointer with realloc */
void *tmp = realloc (words, 2 * max_words * sizeof *words);
if (!tmp) { /* validate every allocation */
perror ("realloc-words");
break; /* don't exit, original data still valid */
}
words = tmp; /* assign reallocated block to words */
/* (optional) set all new memory to zero */
memset (words + max_words, 0, max_words * sizeof *words);
max_words *= 2; /* update max_words to reflect new limit */
}
memcpy (words[index].word, word, len + 1); /* have len */
if (iscap) /* if cap flag set */
words[index].cap = iscap; /* set capital flag in struct */
words[index++].count++; /* increment count & index */
}
}
fclose (fptr); /* close file */
/*
* Print occurrences of all words in file.
*/
puts ("\nOccurrences of all distinct words with Cap in file:");
for (i = 0; i < index; i++) {
if (words[i].cap) {
strcpy (word, words[i].word);
*word = toupper (*word);
/*
* %-15s prints string in 15 character width.
* - is used to print string left align inside
* 15 character width space.
*/
printf("%-15s %d\n", word, words[i].count);
}
}
free (words);
return 0;
}
Example Use/Output
Where with your sample data you would get:
$ ./bin/unique_words_with_cap_dyn
Enter file path: dat/girljumped.txt
Occurrences of all distinct words with Cap in file:
Any 7
One 4
Some 10
The 6
A 13
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/unique_words_with_cap_dyn
==7962== Memcheck, a memory error detector
==7962== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==7962== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==7962== Command: ./bin/unique_words_with_cap_dyn
==7962==
Enter file path: dat/girljumped.txt
Occurrences of all distinct words with Cap in file:
Any 7
One 4
Some 10
The 6
A 13
==7962==
==7962== HEAP SUMMARY:
==7962== in use at exit: 0 bytes in 0 blocks
==7962== total heap usage: 4 allocs, 4 frees, 3,912 bytes allocated
==7962==
==7962== All heap blocks were freed -- no leaks are possible
==7962==
==7962== For counts of detected and suppressed errors, rerun with: -v
Above you can see there were 4 allocations and 4 frees (original allocation of 8, realloc at 8, 16 & 32) and you can see there were 0 errors.
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have any questions.
However, what is the best way to set memory allocation to this code so that C (Programming Language) does not run out of memory.
Notice that most computers, even cheap laptops, have quite a lot of RAM. In practice, you could expect to be able to allocate at least a gigabyte of memory. That is a lot for textual file processing!
A large human-written text file is the Bible. As a rule of thumb, that text takes about 16 megabytes (to a factor of two). For most computers, that is a quite small amount of memory today (my AMD2970WX has more than that in its CPU cache).
Is it best to use linked lists?
The practical consideration is more algorithmic time complexity than memory consumption. For example, searching for something in a linked list takes linear time. And going through a list of a million words does take some time (even if computers are fast).
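To illustrate the time-complexity point, one common alternative to scanning a list (or an array) for every word is a hash table. The sketch below is my own illustration, not something the question's code does: the bucket count NBUCKETS, the djb2-style hash and the names count_word and free_table are all assumptions. MAX_WORD mirrors the constant in the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD   50           /* max word size, as in the question */
#define NBUCKETS 1024           /* assumed fixed bucket count for the sketch */

typedef struct node {
    struct node *next;          /* next entry in the same bucket */
    int count;                  /* occurrences seen so far */
    char word[MAX_WORD];        /* the word itself */
} node_t;

/* djb2-style string hash reduced to a bucket index */
static size_t hash_word (const char *s)
{
    size_t h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;

    return h % NBUCKETS;
}

/* add one occurrence of word to table (NBUCKETS entries, initially NULL).
 * returns the updated count, or 0 if word is too long or allocation fails.
 */
int count_word (node_t **table, const char *word)
{
    size_t h, len;
    node_t *n;

    assert (table != NULL && word != NULL);
    len = strlen (word);
    if (len >= MAX_WORD)
        return 0;

    h = hash_word (word);
    for (n = table[h]; n; n = n->next)      /* scan one short chain only */
        if (strcmp (n->word, word) == 0)
            return ++n->count;

    if (!(n = malloc (sizeof *n)))
        return 0;
    memcpy (n->word, word, len + 1);
    n->count = 1;
    n->next = table[h];                     /* push on front of chain */
    table[h] = n;

    return 1;
}

/* free every node in the table, leaving all buckets NULL */
void free_table (node_t **table)
{
    assert (table != NULL);
    for (size_t i = 0; i < NBUCKETS; i++)
        while (table[i]) {
            node_t *victim = table[i];
            table[i] = victim->next;
            free (victim);
        }
}
```

With a reasonable hash, each lookup touches only the few words that share a bucket instead of every distinct word seen so far. A fixed bucket count is a simplification, since a real table would grow as it fills.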
You may want to read more about:
flexible array members (use that instead in your words_t).
string duplication routines like strdup or asprintf. Even if you don't have them, reprogramming them is a fairly easy task.
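As a rough illustration of both items, here is a sketch of a struct with a flexible array member alongside a hand-written string duplication routine. The names word_fam_t, word_new and dup_string are hypothetical. dup_string is a stand-in for strdup, written out because strdup is a POSIX function rather than part of older C standards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int cap, count;     /* capitalized flag and occurrence count */
    char word[];        /* flexible array member: sized at allocation */
} word_fam_t;

/* allocate a copy of s (a stand-in for POSIX strdup); caller frees */
char *dup_string (const char *s)
{
    size_t len;
    char *copy;

    assert (s != NULL);
    len = strlen (s);
    if ((copy = malloc (len + 1)))
        memcpy (copy, s, len + 1);  /* copy including the nul */

    return copy;
}

/* allocate a struct with exactly enough trailing storage for word;
 * returns NULL on allocation failure, caller frees with a single free().
 */
word_fam_t *word_new (const char *word)
{
    size_t len;
    word_fam_t *w;

    assert (word != NULL);
    len = strlen (word);
    if (!(w = malloc (sizeof *w + len + 1)))
        return NULL;

    w->cap = 0;
    w->count = 1;
    memcpy (w->word, word, len + 1);

    return w;
}
```

Each word then occupies only as much memory as it needs instead of a fixed MAX_WORD bytes. The trade-off is that a struct with a flexible array member cannot be an element of an array, so you would keep an array of pointers to such structs instead.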
But you still want to avoid memory leaks and also, and even more importantly, undefined behavior.
Read How to debug small programs. Tools like valgrind, the clang static analyzer, the gdb debugger, the address sanitizer, etc.. are very useful to learn and use.
At last, read carefully, and in full, Norvig's Teach yourself programming in 10 years. That text is thought provoking, and its appendix at least is surprisingly close to your questions.
PS. I leave you to guess and estimate the total amount of text, in bytes, you are capable of reading during your entire life. That size is surprisingly small and probably fits in any smartphone today. On today's devices, text is really cheap. Photos and videos are not.
NB. "What is the best way" types of question are too broad, off-topic here, a matter of opinion, and related to the P vs NP question, to Rice's theorem and to the halting problem. These questions usually have no clear answer and are supposed to be unsolvable: it is often difficult to prove that a better answer could not be thought of in a dozen years (even if, for some such questions, you could get a proof today: e.g. comparison-based sorting is proved to require on the order of n log n comparisons in the worst case).

A function to find and substitute specific text?

Is there a function I can use that will allow me to replace specific text?
For example:
char *test = "^Hello world^"; would be replaced with char *test = "<s>Hello world</s>";
Another example: char *test2 = "This is ~my house~ bud" would be replaced with char *test2 = "This is <b>my house</b> bud"
Before you can begin to replace substrings within a string, you have to understand what you are dealing with. In your example you want to know whether you can replace characters within a string, and you give as an example:
char *test = "^Hello world^";
Declared and initialized as shown above, test is a pointer to a string literal, which is created in read-only memory (on virtually all systems). Any attempt to modify the characters of a string literal invokes Undefined Behavior (and most likely a Segmentation Fault).
As noted in the comments, test could be declared and initialized as a character array, e.g. char test[] = "^Hello world^"; to ensure that test is modifiable, but that does not address the problem where your replacement strings are longer than the substrings being replaced.
To handle the additional characters, you have two options (1) you can declare test[] to be sufficiently large to accommodate the substitutions, or (2) you can dynamically allocate storage for the replacement string, and realloc additional memory if you reach your original allocation limit.
For instance if you limit the code associated with test to a single function, you could declare test with a sufficient number of characters to handle the replacements, e.g.
#define MAXC 1024 /* define a constant for the maximum number of characters */
...
char test[MAXC] = "^Hello world^";
You would then simply need to keep track of the original string length plus the number of characters added with each replacement and ensure that the total never exceeds MAXC-1 (reserving space for the nul-terminating character).
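That bookkeeping could look something like the sketch below. The helper name append_bounded is hypothetical. It refuses any append that would push the total past MAXC-1 and otherwise updates the tracked length.

```c
#include <assert.h>
#include <string.h>

#define MAXC 1024   /* buffer size, matching the constant above */

/* append src to the string in buf (of size MAXC) whose current length is
 * *len, but only if the result plus the nul-terminating character fits.
 * returns 1 and updates *len on success, 0 (buf unchanged) otherwise.
 */
int append_bounded (char *buf, size_t *len, const char *src)
{
    size_t srclen;

    assert (buf != NULL && len != NULL && src != NULL);
    srclen = strlen (src);

    /* total must never exceed MAXC-1 (checked without overflow) */
    if (*len >= MAXC || srclen > MAXC - 1 - *len)
        return 0;

    memcpy (buf + *len, src, srclen + 1);   /* copy including the nul */
    *len += srclen;

    return 1;
}
```

Building the replacement piece by piece with append_bounded (buf, &len, "<s>");, then the text, then "</s>", lets you stop cleanly as soon as a call returns 0 rather than writing past the end of the array.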
However, if you decide to move the replacement code to a separate function, you now have the problem that you cannot return a pointer to a locally declared array, because the locally declared array is declared within the function stack space, which is destroyed (released for reuse) when the function returns. A locally declared array has automatic storage duration. See: C11 Standard - 6.2.4 Storage durations of objects
To avoid the problem of a locally declared array not surviving the function return, you can simply dynamically allocate storage for your new string which results in the new string having allocated storage duration which is good for the life of the program, or until the memory is freed by calling free(). This allows you to declare and allocate storage for a new string within a function, make your substring replacements, and then return a pointer to the new string for use back in the calling function.
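Here is a minimal sketch of that idea, separate from the replacement function itself. The helper wrap_text is a made-up name. It builds a new string in allocated storage and returns it, which is safe precisely because the block is not tied to the function's stack frame.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* return a newly allocated string holding pre + s + post, or NULL on
 * allocation failure. The block has allocated storage duration, so it
 * remains valid after this function returns; the caller must free() it.
 */
char *wrap_text (const char *s, const char *pre, const char *post)
{
    size_t slen, prelen, postlen;
    char *result;

    assert (s != NULL && pre != NULL && post != NULL);
    slen = strlen (s);
    prelen = strlen (pre);
    postlen = strlen (post);

    if (!(result = malloc (prelen + slen + postlen + 1)))
        return NULL;

    memcpy (result, pre, prelen);
    memcpy (result + prelen, s, slen);
    memcpy (result + prelen + slen, post, postlen + 1); /* includes nul */

    return result;
}
```

Had result been declared as a local array such as char result[64]; instead, returning it would hand the caller a pointer to storage that no longer exists.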
For your circumstance, a simple declaration of a new string within a function and allocating twice the amount of storage as the original string is a reasonable approach to take. (you still must keep track of the number of bytes of memory you use, but you then have the ability to realloc additional memory if you should reach your original allocation limit) This process can continue and accommodate any number of strings and substitutions, up to the available memory on your system.
While there are a number of ways to approach the substitutions, simply searching the original string for each substring, and then copying the text up to the substring to the new string, then copying the replacement substring allows you to "inch-worm" from the beginning to the end of your original string making replacement substitutions as you go. The only challenge you have is keeping track of the number of characters used (so you can reallocate if necessary) and advancing your read position within the original from the beginning to the end as you go.
Your example somewhat complicates the process by needing to alternate between one of two replacement strings as you work your way down the string. This can be handled with a simple toggle flag (a variable you alternate 0,1,0,1,...) which then determines the proper replacement string to use where needed.
The ternary operator (e.g. test ? if_true : if_false;) can help reduce the number of if (test) { if_true; } else { if_false; } blocks you have sprinkled through your code -- it's up to you. If the if (test) {} format is more readable to you -- use that, otherwise, use the ternary.
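To see the toggle flag and the ternary working together in isolation, here is a small sketch that only computes how long the result would be, without allocating or copying anything. The function name replaced_length is my own and it is not part of the code that follows. It walks the string with strstr the same "inch-worm" way.

```c
#include <assert.h>
#include <string.h>

/* length the result would have if each instance of find in s were replaced
 * with r1 and r2, alternating (r1 first). Does not modify or allocate.
 */
size_t replaced_length (const char *s, const char *find,
                        const char *r1, const char *r2)
{
    const char *p, *sp;
    size_t findlen, r1len, r2len, total = 0;
    int toggle = 0;                         /* 0 selects r1, 1 selects r2 */

    assert (s != NULL && find != NULL && r1 != NULL && r2 != NULL);
    p = s;
    findlen = strlen (find);
    r1len = strlen (r1);
    r2len = strlen (r2);

    if (findlen == 0)                       /* nothing to search for */
        return strlen (s);

    while ((sp = strstr (p, find))) {
        total += (size_t)(sp - p);          /* text before the match */
        total += toggle ? r2len : r1len;    /* ternary picks replacement */
        toggle = !toggle;                   /* alternate 0,1,0,1,... */
        p = sp + findlen;                   /* resume after the match */
    }

    return total + strlen (p);              /* remaining tail */
}
```

For "^Hello world^" with find "^", r1 "<s>" and r2 "</s>" this gives 18, the length of "<s>Hello world</s>". Knowing the exact size up front is one alternative to allocating twice the original length and reallocating on demand.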
The following example takes the (1) original string, (2) the find substring, (3) the 1st replacement substring, and (4) the 2nd replacement substring as arguments to the program. It allocates for the new string within the strreplace() function, makes the substitutions requested and returns a pointer to the new string to the calling function. The code is heavily commented to help you follow along, e.g.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* replace all instances of 'find' in 's' with 'r1' and `r2`, alternating.
* allocate memory, as required, to hold string with replacements,
* returns allocated string with replacements on success, NULL otherwise.
*/
char *strreplace (const char *s, const char *find,
const char *r1, const char *r2)
{
const char *p = s, /* pointer to s */
*sp = s; /* 2nd substring pointer */
char *newstr = NULL, /* newstring pointer to allocate/return */
*np = NULL; /* pointer to newstring to fill */
size_t newlen = 0, /* length for newstr */
used = 0, /* amount of allocated space used */
slen = 0, /* length of s */
findlen = 0, /* length of find string */
r1len = 0, /* length of replace string 1 */
r2len = 0; /* length of replace string 2 */
int toggle = 0; /* simple 0/1 toggle flag for r1/r2 */
/* validate BEFORE calling strlen(); strlen (NULL) is undefined behavior,
* and an empty find string would match forever in the loop below */
if (!s || !*s || !find || !*find || !r1 || !r2) {
fputs ("strreplace() error: input NULL or empty\n", stderr);
return NULL;
}
slen = strlen (s);
findlen = strlen (find);
r1len = strlen (r1);
r2len = strlen (r2);
newlen = slen * 2; /* double length of s for newstr */
newstr = calloc (1, newlen); /* allocate twice length of s */
if (newstr == NULL) { /* validate ALL memory allocations */
perror ("calloc-newstr");
return NULL;
}
np = newstr; /* initialize newpointer to newstr */
/* locate each substring using strstr */
while ((sp = strstr (p, find))) { /* find beginning of each substring */
size_t len = sp - p; /* length to substring */
/* check if realloc needed? */
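/* loop, since a long replacement may need more than one doubling */
while (used + len + (toggle ? r2len : r1len) + 1 > newlen) {
void *tmp = realloc (newstr, newlen * 2); /* realloc to temp */
if (!tmp) { /* validate realloc succeeded */
perror ("realloc-newstr");
free (newstr); /* original block is still valid, free it */
return NULL;
}
newstr = tmp; /* assign realloc'ed block to newstr */
np = newstr + used; /* realloc may move the block, re-anchor np */
newlen *= 2; /* update newlen */
}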
strncpy (np, p, len); /* copy from pointer to substring */
np += len; /* advance newstr pointer by len */
*np = 0; /* nul-terminate (already done by calloc) */
strcpy (np, toggle ? r2 : r1); /* copy r2/r1 string to end */
np += toggle ? r2len : r1len; /* advance newstr pointer by r12len */
*np = 0; /* <ditto> */
p += len + findlen; /* advance p by len + findlen */
used += len + (toggle ? r2len : r1len); /* update used characters */
toggle = toggle ? 0 : 1; /* toggle 0,1,0,1,... */
}
/* handle segment of s after last find substring */
slen = strlen (p); /* get remaining length */
if (slen) { /* if not at end */
if (used + slen + 1 > newlen) { /* check if realloc needed? */
void *tmp = realloc (newstr, used + slen + 1); /* realloc */
if (!tmp) { /* validate */
perror ("realloc-newstr");
free (newstr); /* original block is still valid, free it */
return NULL;
}
newstr = tmp; /* assign */
np = newstr + used; /* realloc may move the block, re-anchor np */
newlen = used + slen + 1; /* update (not required here, know why?) */
}
strcpy (np, p); /* add final segment to string */
*(np + slen) = 0; /* nul-terminate */
}
return newstr; /* return newstr */
}
int main (int argc, char **argv) {
const char *s = NULL,
*find = NULL,
*r1 = NULL,
*r2 = NULL;
char *newstr = NULL;
if (argc < 5) { /* validate required no. of arguments given */
fprintf (stderr, "error: insufficient arguments,\n"
"usage: %s <string> <find> <rep1> <rep2>\n", argv[0]);
return 1;
}
s = argv[1]; /* assign arguments to pointers */
find = argv[2];
r1 = argv[3];
r2 = argv[4];
newstr = strreplace (s, find, r1, r2); /* replace substrings in s */
if (newstr) { /* validate return */
printf ("oldstr: %s\nnewstr: %s\n", s, newstr);
free (newstr); /* don't forget to free what you allocate */
}
else { /* handle error */
fputs ("strreplace() returned NULL\n", stderr);
return 1;
}
return 0;
}
(above, the strreplace function uses pointers to walk ("inch-worm") down the original string making replacement, but you can use string indexes and index variables if that makes more sense to you)
(also note the use of calloc for the original allocation. calloc allocates and sets the new memory to all zero, which can help ensure you don't forget to nul-terminate your string, but note that any memory added by realloc will not be zeroed unless you manually zero it with memset or the like. The code above manually terminates the new string after each copy, so you can use either malloc or calloc for the allocation)
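If you do want the memory added by realloc zeroed, a small wrapper can do it. The following is only a sketch: the name realloc_zero is hypothetical, and it assumes you track the old size yourself and pass a non-zero new size.

```c
#include <stdlib.h>
#include <string.h>

/* Resize a block from oldsize to newsize bytes and zero any bytes added.
 * Returns the (possibly moved) block, or NULL on failure, in which case
 * the original block is left untouched. (hypothetical helper) */
void *realloc_zero (void *ptr, size_t oldsize, size_t newsize)
{
    void *tmp = realloc (ptr, newsize);     /* always realloc to a temp */

    if (!tmp)
        return NULL;
    if (newsize > oldsize)                  /* zero only the new region */
        memset ((char *)tmp + oldsize, 0, newsize - oldsize);
    return tmp;
}
```

Assign the return to a temporary pointer in the caller as well, so the original block is not lost if the call fails.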
Example Use/Output
First example:
$ ./bin/str_substr_replace2 "^Hello world^" "^" "<s>" "</s>"
oldstr: ^Hello world^
newstr: <s>Hello world</s>
Second example:
$ ./bin/str_substr_replace2 "This is ~my house~ bud" "~" "<b>" "</b>"
oldstr: This is ~my house~ bud
newstr: This is <b>my house</b> bud
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/str_substr_replace2 "This is ~my house~ bud" "~" "<b>" "</b>"
==8962== Memcheck, a memory error detector
==8962== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==8962== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==8962== Command: ./bin/str_substr_replace2 This\ is\ ~my\ house~\ bud ~ \<b\> \</b\>
==8962==
oldstr: This is ~my house~ bud
newstr: This is <b>my house</b> bud
==8962==
==8962== HEAP SUMMARY:
==8962== in use at exit: 0 bytes in 0 blocks
==8962== total heap usage: 1 allocs, 1 frees, 44 bytes allocated
==8962==
==8962== All heap blocks were freed -- no leaks are possible
==8962==
==8962== For counts of detected and suppressed errors, rerun with: -v
==8962== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have any further questions.

dynamically allocating a string with unknown size

I have to read a known number of names from input as one string, with the names separated by spaces. I have to dynamically allocate memory for an array of strings where each string holds one name:
char** names;
char ch;
names = malloc(N*sizeof(char*); /*N is defined*/
for(i=0; i<N; i++) {
Now I have to allocate for each string without using a defined number:
i=0, j=0;
while ((ch=getchar) != '\n') {
while (ch != ' ') {
names[i][j++] = ch;
}
if (ch == ' ') {
names[i][j] = '\0';
i++}}
if (ch == '\n')
names[i][j] = '\0';
This is the classic question of how to handle dynamic allocation and reallocation to store an unknown number of strings (with the twist of separating each string into individual tokens before saving to the array). It is worth understanding this process in detail as it will serve as the basis for just about any other circumstance where you are reading an unknown number of values (whether they are structs, floats, characters, etc...).
There are a number of different types of data structures you can employ, lists, trees, etc., but the basic approach is by creating an array of pointer-to-pointer-to-type (with type being char in this case) and then allocating space for, filling with data, and assigning the starting address for the new block of memory to each pointer as your data is read. The short-hand for pointer-to-pointer-to-type is simply double-pointer (e.g. char **array;, which is technically a pointer-to-pointer-to-char or pointer-to-char* if you like)
The general, and efficient, approach to allocating memory for an unknown number of lines is to first allocate a reasonably anticipated number of pointers (1 for each anticipated token). This is much more efficient than calling realloc and reallocating the entire collection for every token you add to your array. Here, you simply keep a counter of the number of tokens added to your array, and when you reach your original allocation limit, you simply reallocate twice the number of pointers you currently have. Note, you are free to add any incremental amount you choose. You can simply add a fixed amount each time, or you can use some scaled multiple of the original -- it's up to you. The realloc to twice the current is just one of the standard schemes.
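The counter-and-double scheme can be sketched as a single helper. The name push_ptr and the starting count of 8 are assumptions made for illustration only; pick whatever initial number fits your data.

```c
#include <stdlib.h>

/* Append ptr to *array, doubling the number of pointers when full.
 * *used is the number of pointers stored, *avail the number allocated.
 * Returns 1 on success, 0 on allocation failure (array is unchanged).
 * (hypothetical helper, starting count of 8 chosen arbitrarily) */
int push_ptr (char ***array, size_t *used, size_t *avail, char *ptr)
{
    if (*used == *avail) {                      /* limit reached? */
        size_t newavail = *avail ? *avail * 2 : 8;
        void *tmp = realloc (*array, newavail * sizeof **array);
        if (!tmp)
            return 0;                           /* original still valid */
        *array = tmp;
        *avail = newavail;
    }
    (*array)[(*used)++] = ptr;                  /* store, bump counter */
    return 1;
}
```

Starting from char **array = NULL with both counters at 0, twenty calls trigger only three reallocations (8, 16, 32 pointers) rather than twenty.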
What is "a reasonably anticipated number of pointers?" It's no precise number. You simply want to take an educated guess at the number of tokens you roughly expect and use that as an initial number for allocating pointers. You wouldn't want to allocate 10,000 pointers if you only expect 100. That would be horribly wasteful. Reallocation will take care of any shortfall, so a rough guess is all that is needed. If you truly have no idea, then allocate some reasonable number, say 64 or 128, etc.. You can simply declare the limit as a constant at the beginning of your code, so it is easily adjusted. e.g.:
#define MAXPTR 128
or accomplish the same thing using an anonymous enum
enum { MAXPTR = 128 };
When allocating your pointers originally, and as part of your reallocation, you can benefit by setting each pointer to NULL. This is easily accomplished for the original allocation. Simply use calloc instead of malloc. On reallocation, it requires that you set all new pointers allocated to NULL. The benefit it provides is that the first NULL acts as a sentinel indicating the point at which your valid pointers stop. As long as you ensure you have at least one NULL preserved as a sentinel, you can iterate without knowing the precise number of pointers filled. e.g.:
size_t i = 0;
while (array[i]) {
... do your stuff ...
i++; /* advance to the next pointer, stops at the NULL sentinel */
}
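As a concrete instance of iterating up to the sentinel, here is a minimal sketch that counts the valid pointers. The name count_tokens is hypothetical, and the function assumes the array really does end with at least one NULL.

```c
#include <stddef.h>

/* Count the strings in a NULL-terminated array of pointers.
 * Assumes at least one NULL sentinel follows the last valid pointer.
 * (hypothetical helper) */
size_t count_tokens (char **array)
{
    size_t i = 0;

    while (array[i])    /* stop at the first NULL */
        i++;
    return i;
}
```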
When you are done using the allocated memory, you want to ensure you free the memory. While in a simple piece of code the memory is freed on exit, get in the habit of tracking the memory you allocate and freeing it when it is no longer needed.
As for this particular task, you will want to read a line of an unknown number of characters into memory and then tokenize (separate) the string into tokens. getline will read and allocate memory sufficient to hold any size character string. You can do the same thing with any of the other input functions, you just have to code the repeated checks and reallocations yourself. If getline is available (it is part of POSIX, so Linux and other POSIX systems provide it), use it. Then it is just a matter of separating the input into tokens with strtok or strsep. You will then want to duplicate each token to preserve it in its own block of memory and assign the location to your array of tokens. The following provides a short example.
Included in the example are several helper functions for opening files, allocating and reallocating. All they do is simple error checking, which helps keep the main body of your code clean and readable. Look over the example and let me know if you have any questions.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXL 64 /* initial number of pointers */
/* simple helper/error check functions */
FILE *xfopen (const char *fn, const char *mode);
void *xcalloc (size_t n, size_t s);
void *xrealloc_dp (void *ptr, size_t *n);
int main (int argc, char **argv) {
char **array = NULL;
char *line = NULL;
size_t i, idx = 0, maxl = MAXL, n = 0;
ssize_t nchr = 0;
FILE *fp = argc > 1 ? xfopen (argv[1], "r") : stdin;
array = xcalloc (maxl, sizeof *array); /* allocate maxl pointers */
while ((nchr = getline (&line, &n, fp)) != -1)
{
while (nchr > 0 && (line[nchr-1] == '\r' || line[nchr-1] == '\n'))
line[--nchr] = 0; /* strip carriage return or newline */
char *p = line; /* pointer to use with strtok */
for (p = strtok (line, " \n"); p; p = strtok (NULL, " \n")) {
array[idx] = strdup (p); /* allocate & copy */
if (!array[idx]) { /* validate strdup succeeded */
perror ("strdup");
exit (EXIT_FAILURE);
}
/* check limit reached - reallocate */
if (++idx == maxl) array = xrealloc_dp (array, &maxl);
}
}
free (line); /* free memory allocated by getline */
if (fp != stdin) fclose (fp);
for (i = 0; i < idx; i++) /* print all tokens */
printf (" array[%2zu] : %s\n", i, array[i]);
for (i = 0; i < idx; i++) /* free all memory */
free (array[i]);
free (array);
return 0;
}
/* fopen with error checking */
FILE *xfopen (const char *fn, const char *mode)
{
FILE *fp = fopen (fn, mode);
if (!fp) {
fprintf (stderr, "xfopen() error: file open failed '%s'.\n", fn);
// return NULL;
exit (EXIT_FAILURE);
}
return fp;
}
/* simple calloc with error checking */
void *xcalloc (size_t n, size_t s)
{
void *memptr = calloc (n, s);
if (memptr == 0) {
fprintf (stderr, "xcalloc() error: virtual memory exhausted.\n");
exit (EXIT_FAILURE);
}
return memptr;
}
/* realloc array of pointers ('ptr') to twice current
* number of pointers ('*n'). Note: 'n' is a pointer
* to the current number so that its updated value is preserved.
* no pointer size is required as it is known (simply the size
* of a pointer)
*/
void *xrealloc_dp (void *ptr, size_t *n)
{
void **p = ptr;
void *tmp = realloc (p, 2 * *n * sizeof tmp);
if (!tmp) {
fprintf (stderr, "%s() error: virtual memory exhausted.\n", __func__);
exit (EXIT_FAILURE);
}
p = tmp;
memset (p + *n, 0, *n * sizeof tmp); /* set new pointers NULL */
*n *= 2;
return p;
}
Input File
$ cat dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Output
$ ./bin/getline_strtok <dat/captnjack.txt
array[ 0] : This
array[ 1] : is
array[ 2] : a
array[ 3] : tale
array[ 4] : Of
array[ 5] : Captain
array[ 6] : Jack
array[ 7] : Sparrow
array[ 8] : A
array[ 9] : Pirate
array[10] : So
array[11] : Brave
array[12] : On
array[13] : the
array[14] : Seven
array[15] : Seas.
Memory/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed. It is imperative that you use a memory error checking program to ensure you haven't written beyond/outside your allocated block of memory and to confirm that you have freed all the memory you have allocated. For Linux, valgrind is the normal choice. There are so many subtle ways to misuse a block of memory that can cause real problems that there is no excuse not to do it. There are similar memory checkers for every platform. They are all simple to use. Just run your program through it.
$ valgrind ./bin/getline_strtok <dat/captnjack.txt
==26284== Memcheck, a memory error detector
==26284== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==26284== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==26284== Command: ./bin/getline_strtok
==26284==
array[ 0] : This
array[ 1] : is
<snip>
array[14] : Seven
array[15] : Seas.
==26284==
==26284== HEAP SUMMARY:
==26284== in use at exit: 0 bytes in 0 blocks
==26284== total heap usage: 18 allocs, 18 frees, 708 bytes allocated
==26284==
==26284== All heap blocks were freed -- no leaks are possible
==26284==
==26284== For counts of detected and suppressed errors, rerun with: -v
==26284== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
What you want to confirm each time is "All heap blocks were freed -- no leaks are possible" and "ERROR SUMMARY: 0 errors from 0 contexts".
How about growing the buffer gradually, for example, by doubling the size of buffer when the buffer becomes full?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
char *read_string(void) {
size_t allocated_size = 2;
size_t read_size = 0;
char *buf = malloc(allocated_size); /* allocate initial buffer */
if (buf == NULL) return NULL;
for(;;) {
/* read next character */
int input = getchar();
if (input == EOF || isspace(input)) break;
/* if there isn't enough buffer */
if (read_size >= allocated_size - 1) {
/* allocate new buffer */
char *new_buf = malloc(allocated_size *= 2);
if (new_buf == NULL) {
/* failed to allocate */
free(buf);
return NULL;
}
/* copy data read to new buffer */
memcpy(new_buf, buf, read_size);
/* free old buffer */
free(buf);
/* assign new buffer */
buf = new_buf;
}
buf[read_size++] = input;
}
buf[read_size] = '\0';
return buf;
}
int main(void) {
int N = 5;
int i;
char** names;
names = malloc(N*sizeof(char*));
if(names == NULL) return 1;
for(i=0; i<N; i++) {
names[i] = read_string();
}
for(i = 0; i < N; i++) {
puts(names[i] ? names[i] : "NULL");
free(names[i]);
}
free(names);
return 0;
}
Note: They say you shouldn't cast the result of malloc() in C.
For a known number of strings, you have allocated the char ** correctly:
char** names;
names = (char**) malloc(N*sizeof(char*));
Note, because the cast is not necessary in C, you could write it like this:
names = malloc(N*sizeof(char*));
For allocating memory as you read the file, for strings of unknown length, use the following approach:
1. allocate a buffer using [m][c]alloc of a known starting size (calloc is cleaner)
2. read into the buffer until you run out of space.
3. use realloc to increase the size of the buffer by some increment (e.g. double it)
4. repeat steps 2 and 3 until the file is read
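The steps above can be sketched for a single line of input as follows. This is only an illustration: the name read_line and the starting size of 16 are assumptions, and the same loop works for a whole file if you drop the newline test.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length from fp into a heap buffer, doubling the
 * buffer with realloc when it fills. Returns NULL at EOF with nothing
 * read or on allocation failure; the caller frees the result.
 * (hypothetical helper, 16 is an arbitrary starting size) */
char *read_line (FILE *fp)
{
    size_t size = 16, len = 0;
    char *buf = calloc (size, 1);           /* step 1: initial buffer */
    int c;

    if (!buf)
        return NULL;
    while ((c = fgetc (fp)) != EOF && c != '\n') {  /* step 2: read */
        if (len + 1 == size) {              /* keep 1 byte for the nul */
            char *tmp = realloc (buf, size * 2);    /* step 3: grow */
            if (!tmp) {
                free (buf);
                return NULL;
            }
            buf = tmp;
            size *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {             /* nothing left to read */
        free (buf);
        return NULL;
    }
    buf[len] = '\0';                        /* nul-terminate */
    return buf;
}
```

Note that c is an int, not a char, so that EOF can be told apart from every valid character value.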
Also, when working with buffers of unknown length whose contents you would like pre-set, or zeroed, consider using calloc() over malloc(). It is a cleaner option.
When you say,
char** names;
char ch;
names = malloc(N*sizeof(char*));
You created a variable names, a double pointer with room to store the addresses of N strings.
Ex: if you have 32 strings, then N is 32.
So, 32 * sizeof(char*) bytes are requested,
and sizeof(char*) is 4 bytes on a typical 32-bit system (8 bytes on a typical 64-bit system).
Hence, 128 bytes will be allocated (256 on a 64-bit system).
After that you did this,
names[i][j++] = ch;
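The above expression is the wrong way to use names.
names[i] is a pointer that does not yet point to any storage, so names[i][j] writes a char through an uninitialized address.
You need to allocate a block of memory for each names[i] before storing characters in it,
or you need to assign to each names[i] the address of the corresponding substring within the main string.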
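A minimal sketch of the first option, allocating a block for each names[i] before writing to it, could look like this. The helper name store_name is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate storage for a copy of src and store its address in names[i].
 * Returns 1 on success, 0 if the allocation fails. (hypothetical helper) */
int store_name (char **names, size_t i, const char *src)
{
    size_t len = strlen (src);

    names[i] = malloc (len + 1);        /* +1 for the nul-terminator */
    if (!names[i])
        return 0;
    memcpy (names[i], src, len + 1);    /* copy including the nul */
    return 1;
}
```

After a successful call, names[i][j] refers to real storage, so an expression like names[i][0] = ch; is valid for any j up to the length of the copied string.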
Use readline() or getline() to acquire a pointer to allocated memory that contains the data.
Then use something like sscanf() or strtok() to extract the individual name strings into members of an array.
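For the sscanf() route, a rough sketch is shown below. The name split_names is hypothetical, and the 63-character limit per name is an assumption made only to keep the sketch short; a name longer than that would be split into pieces. The %n conversion reports how many characters sscanf() consumed, which lets you advance through the line.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Split line on whitespace into at most max names, each in its own
 * allocation, and return the number of names stored.
 * (hypothetical helper, 63-char limit per name is an assumption) */
size_t split_names (const char *line, char **names, size_t max)
{
    char word[64];
    size_t count = 0;
    int used = 0;

    while (count < max && sscanf (line, "%63s%n", word, &used) == 1) {
        size_t len = strlen (word);
        names[count] = malloc (len + 1);    /* exact-size storage */
        if (!names[count])
            break;
        memcpy (names[count], word, len + 1);
        count++;
        line += used;                       /* skip what was consumed */
    }
    return count;
}
```

The caller owns each names[i] and frees them when done.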

Resources