I am wondering how to go about storing strings into an array of strings.
char buff[1024]; // to get chars from fgetc
char *line[2024]; // to store strings found in buff
int ch;
int i = 0;
while ((ch = fgetc(file)) != EOF) {
buff[i] = ch;
if (ch == '\n') { // find the end of the current line
int line_index = 0;
char *word;
word = strtok(buff, ":"); // split the buffer into strings
line[line_index] = word;
line_index++;
while (word != NULL) {
word = strtok(NULL, ":");
line[line_index] = word;
line_index++;
}
}
Do I need to dynamically allocate every element of the line array before inserting word into that element?
You are free to use either calloc (or malloc or realloc) or strdup if you have it. All strdup does is to automate the process of allocating length + 1 characters and then copying the given string to the new block of memory which you then assign to your pointer. It does the exact same thing you would do on your own if you didn't have strdup available.
However, note, strdup allocates so you must validate the allocation just as you would if you had called one of the allocation functions directly. Further note, on failure, it sets errno and returns NULL just as any of the allocation functions do.
Before looking at examples, you have several other sources of error you must address. You declare both buff and line with a fixed size. Therefore, as you are adding characters to buff or filling pointers in line, you must be keeping track of the index and checking against the maximum available to protect your array bounds. If you have a file with 1024-character lines or lines with 2025 words, you write past the end of each array invoking Undefined Behavior.
Equally important is your choice of variable names. line isn't a line, it is an array or pointers to tokens or words separated by the delimiters you provide to strtok. The only variable that contains a "line" is buff. If you were going to call anything line, you should change buff to line and change line to word (or token). Now it is possible that your file does contain lines from something else separated by ':' in the input file (not provided), but without more, we will change line to word and line_index to word_index in the example. While your use of word as the word-pointer was fine, let's shorten it to wp to avoid conflict with the renamed word array of pointers to each word. buff is fine, you know you are buffering characters in it.
Your final problem is you have not nul-terminated buff before passing buff to strtok. All string functions require a nul-terminated string as their argument. Failure to provide on invokes Undefined Behavior.
Preliminary
Do not use magic numbers in your code. Instead:
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
char buff[MAXC] = "", /* line buffer */
*word[MAXC * 2] = {NULL}; /* array of pointers to char */
int ch,
i = 0;
...
while ((ch = fgetc(fp)) != EOF) { /* read each char until EOF */
int word_index = 0;
buff[i++] = ch;
if (i == MAXC - 1 || ch == '\n') { /* protect array bounds */
char *wp; /* word pointer */
buff[i++] = 0; /* nul-termiante buf */
for (wp = strtok (buff, " :\n"); /* initial call to strtok */
word_index < MAXC && wp; /* check bounds & wp */
wp = strtok (NULL, " :\n")) { /* increment - next token */
(note: you should actually check if (i == MAXC - 1 || (i > 1 && ch == '\n')) to avoid attempting to tokenize empty-lines -- that is left to you. Also note a for loop provides a convenient means to cover both calls to strtok in a single expression)
Using strdup to Allocate/Copy
As mentioned above, all strdup does is to allocate storage for the word pointed to by wp above (including storage for the nul-terminating characters), copy the word to the newly allocated block of memory, and then return a pointer to the first address in that block allowing you to assign the starting address to your pointer. It is quite convenient, but since it allocates, you must validate, e.g.
/* NOTE: strdup allocates - you must validate */
if (!(word[word_index] = strdup (wp))) {
perror ("strdup(wp)");
exit (EXIT_FAILURE);
}
word_index++; /* increment after allocation validated */
Using strlen + calloc + memcpy To Do The Same Thing
If strdup wasn't available, or you simply wanted to allocate and copy manually, then you would just do the exact same thing. (1) Get the length of the word (or token) pointed to by wp, (2) allocate length + 1 bytes; and (3) copy the string pointed to by wp to your newly allocated block of memory. (the assignment of the starting address of the new block to your pointer occurs at the point of allocation).
An efficiency nit about copying to your new block of memory. Since you have already scanned forward in the string pointed to by wp to find the length, there is no need to scan the string again by using strcpy. You have the length, so just use memcpy to avoid the second scan for end-of-string (it's trivial, but shows an understanding of what has taken place in your code). Using calloc you would do:
/* using strlen + calloc + memcpy */
size_t len = strlen (wp); /* get wp length */
/* allocate/validate */
if (!(word[word_index] = calloc (1, len + 1))) {
perror ("calloc(1,len+1)");
exit (EXIT_FAILURE);
} /* you alread scanned for '\0', use memcpy */
memcpy (word[word_index++], wp, len + 1);
Now if I were to do that for a whole line, when I find a new line in
the file how do I reuse my char *line[2024] array?
Well, it's called word now, but as mentioned in the comment, you have kept track of the number of pointers filled with your line_index (my word_index) variables, so before you can allocate a new block of memory and assign a new address to your pointer (thereby overwriting the old address held by the pointer), you must free the block of memory at the address currently held by the pointer (or your will lose the ability to ever free that memory causing a memory leak). It is good (but optional) to set the pointer to NULL after freeing the memory.
(doing so ensures only valid pointers remain in your array of pointers allowing you to iterate the array, e.g. while (line[x] != NULL) { /* do something */ x++; } - useful if passing or returning a pointer to that array)
To reuse the free the memory, reset the pointers for reuse and reset your character index i = 0, you could do something like the following while outputting the words from the line, e.g.
}
for (int n = 0; n < word_index; n++) { /* loop over each word */
printf ("word[%2d]: %s\n", n, word[n]); /* output */
free (word[n]); /* free memory */
word[n] = NULL; /* set pointer NULL (optional) */
}
putchar ('\n'); /* tidy up with newline */
i = 0; /* reset i zero */
}
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
}
Putting it altogether in an example that lets you choose whether to use strdup or use calloc depending on whether you pass the command line definition -DUSESTRDUP as part of your compiler string, you could do something like the following (note: I use fp instead of file for the FILE* pointer):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
char buff[MAXC] = "", /* line buffer */
*word[MAXC * 2] = {NULL}; /* array of pointers to char */
int ch,
i = 0;
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while ((ch = fgetc(fp)) != EOF) { /* read each char until EOF */
int word_index = 0;
buff[i++] = ch;
if (i == MAXC - 1 || ch == '\n') { /* protect array bounds */
char *wp; /* word pointer */
buff[i++] = 0; /* nul-termiante buf */
for (wp = strtok (buff, " :\n"); /* initial call to strtok */
word_index < MAXC && wp; /* check bounds & wp */
wp = strtok (NULL, " :\n")) { /* increment - next token */
#ifdef USESTRDUP
/* NOTE: strdup allocates - you must validate */
if (!(word[word_index] = strdup (wp))) {
perror ("strdup(wp)");
exit (EXIT_FAILURE);
}
word_index++; /* increment after allocation validated */
#else
/* using strlen + calloc + memcpy */
size_t len = strlen (wp); /* get wp length */
/* allocate/validate */
if (!(word[word_index] = calloc (1, len + 1))) {
perror ("calloc(1,len+1)");
exit (EXIT_FAILURE);
} /* you alread scanned for '\0', use memcpy */
memcpy (word[word_index++], wp, len + 1);
#endif
}
for (int n = 0; n < word_index; n++) { /* loop over each word */
printf ("word[%2d]: %s\n", n, word[n]); /* output */
free (word[n]); /* free memory */
word[n] = NULL; /* set pointer NULL (optional) */
}
putchar ('\n'); /* tidy up with newline */
i = 0; /* reset i zero */
}
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
}
Compile
By default the code will use calloc for allocation, and a simple gcc compile string would be:
gcc -Wall -Wextra -pedantic -std=c11 -O3 -o strtokstrdupcalloc strtokstrdupcalloc.c
For VS (cl.exe) you would use
cl /nologo /W3 /wd4996 /Ox /Festrtokstrdupcalloc /Tc strtokstrdupcalloc.c
(which will create strtokstrdupcalloc.exe in the current directory on windows)
To compile using strdup, just add -DUSESTRDUP to either command line.
Example Input File
$ cat dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Example Use/Output
$ ./bin/strtokstrdupcalloc dat/captnjack.txt
word[ 0]: This
word[ 1]: is
word[ 2]: a
word[ 3]: tale
word[ 0]: Of
word[ 1]: Captain
word[ 2]: Jack
word[ 3]: Sparrow
word[ 0]: A
word[ 1]: Pirate
word[ 2]: So
word[ 3]: Brave
word[ 0]: On
word[ 1]: the
word[ 2]: Seven
word[ 3]: Seas.
(output is the same regardless how you allocate)
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/strtokstrdupcalloc dat/captnjack.txt
==4946== Memcheck, a memory error detector
==4946== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==4946== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==4946== Command: ./bin/strtokstrdupcalloc dat/captnjack.txt
==4946==
word[ 0]: This
word[ 1]: is
word[ 2]: a
word[ 3]: tale
word[ 0]: Of
word[ 1]: Captain
word[ 2]: Jack
word[ 3]: Sparrow
word[ 0]: A
word[ 1]: Pirate
word[ 2]: So
word[ 3]: Brave
word[ 0]: On
word[ 1]: the
word[ 2]: Seven
word[ 3]: Seas.
==4946==
==4946== HEAP SUMMARY:
==4946== in use at exit: 0 bytes in 0 blocks
==4946== total heap usage: 17 allocs, 17 frees, 628 bytes allocated
==4946==
==4946== All heap blocks were freed -- no leaks are possible
==4946==
==4946== For counts of detected and suppressed errors, rerun with: -v
==4946== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.
The fastest, most readable and most portable way to allocate new room for a string is to use malloc+memcpy:
size_t size = strlen(old_str) + 1;
...
char* new_str = malloc (size);
if(new_str == NULL)
{ /* error handling */ }
memcpy(new_str, old_str, size);
There exists no argument against using this method when you know the length in advance.
Notes about inferior methods:
strcpy is needlessly slow when you already know the length.
strncpy -"-. And also dangerous since it is easy to miss null termination.
calloc is needlessly slow as it will do a zero-out of all memory.
strdup is needlessly slow and also non-standard, so it is non-portable and might not compile.
Related
Bascially, Im trying to accept a command line text file so that when I run the program as "program instructions.txt", it stores the values listed in instructions. However I am having trouble testing what i currently have, as it is giving me the error "segmentation fault core dumped".
int main(int argc, char* argv[]) {
setup_screen();
setup();
// File input
char textExtracted[250];
FILE* file_handle;
file_handle = fopen(argv[1], "r");
while(!feof(file_handle)){
fgets(textExtracted, 150, file_handle);
}
fclose(file_handle);
printf("%s", textExtracted[0]);
return 0;
}
Inside the text file is
A 1 2 3 4 5
B 0 0
C 1 1
Im just trying to store each line in an array and then print them.
Some points:
int main(int argc, char* argv[]) {
I suggest you check number of arguments here before proceeding further
setup_screen();
setup();
// File input
char textExtracted[250];
Declaration can be joined but always always check return values from I/O
FILE* file_handle = = fopen(argv[1], "r");
if (NULL == file_handle)
{
perror(argv[1]);
return EXIT_FAILURE;
}
below is not the correct way to read from a file, instead you should
try and read from the file first, then check for error/eof/enuff bytes read
// while(fgets(textExtracted,sizeof(textExtracted), 1, file_handle) > 0) {}
while(!feof(file_handle)){
fgets(textExtracted, 150, file_handle);
}
It looks like you think fgets appends to textExtracted when you call it
multiple times, it doesn't! every line in the file will overwrite the previously read line. Note also that the \n character is included in your buffer.
But since your file appears to be pretty small, you could read the whole
content into your buffer and work with that.
// int size = fread(textExtracted, sizeof(textExtracted), 1, file_handle);
Better is to check the size of the file first and then allocate a buffer with malloc to hold the whole file or read the file character by character and do whatever commands you need to do on the fly. e.g. a switch statement is excellent as a statemachine
switch( myreadchar )
{
case 'A':
break;
case 'B':
break;
...
}
textExtracted[0] is one character, textExtracted is the whole array so instead of
printf("%s", textExtracted[0]);
write
printf("%s", textExtracted);
or even better
fputs(textExtracted, stdout);
return 0;
The problem you present is the classic problem of "How do I read an unknown number of lines of unknown length from a file?" The way you approach the problem in a memory efficient manner is to declare a pointer-to-pointer to char and allocate a reasonable number of pointers to begin with and then allocate for each line assigning the starting address for the block holding each line to your pointers as you read each line and allocate for it.
An efficient way to do that is to read each line from a file into a fixed buffer (of size sufficient to hold your longest line without skimping) with fgets or by using POSIX getline which will allocate as needed to hold the line. You then remove the trailing '\n' from your temporary buffer and obtain the length of the line.
Then allocate a block of memory of length + 1 characters (the +1 for the nul-terminating character) and assign the address for your new block of memory to your next available pointer (keeping track of the number of pointers allocated and the number of pointers used)
When the number of pointers used equals the number allocated, you simply realloc additional pointers (generally by doubling the current number available, or by allocating for some fixed number of additional pointers) and you keep going. Repeat the process as many times as needed until you have read all of the lines in your input file.
There are a number of ways to implement it and arrange the differing tasks, but all basically boil down to the same thing. Start with a temporary buffer of reasonable size to handle your longest line (without skimping, just in case there is some variation in your input data -- a 1K buffer is cheap insurance, adjust as needed). Add your counters to keep track of the number of pointers allocated and then number used (your index). Open and validate the file given on the command line is open for reading (or read from stdin by default if no argument was given on the command line) For example you could do:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
char buf[MAXC] = "", /* temp buffer to hold line read from file */
**lines = NULL; /* pointer-to-pointer to each line */
size_t ndx = 0, alloced = 0; /* current index, no. of ptrs allocated */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
...
With your file open and validated, you are ready to read each line, controlling your read loop with your read function itself, and following the outline above to handle storage for each line, e.g.
while (fgets (buf, MAXC, fp)) { /* read each line */
size_t len; /* for line length */
if (ndx == alloced) { /* check if realloc needed */
void *tmp = realloc (lines, /* alloc 2X current, or 2 1st time */
(alloced ? 2 * alloced : 2) * sizeof *lines);
if (!tmp) { /* validate EVERY allocation */
perror ("realloc-lines");
break; /* if allocation failed, data in lines still good */
}
lines = tmp; /* assign new block of mem to lines */
alloced = alloced ? 2 * alloced : 2; /* update ptrs alloced */
}
Note: above, the first thing that happens in your read loop is to check if you have pointers available, e.g. if (ndx == alloced), if your index (number used) is equal to the number allocated, you reallocate more. The ternary above alloced ? 2 * alloced : 2 simply asks if you have some previously allocated alloced ? then double the number 2 * alloced otherwise (:) just start with 2 pointers and go from there. In that doubling scheme, you allocate 2, 4, 8, 16, ... pointers with each successive reallocation.
Also note: when you call realloc you always use a temporary pointer, e.g. tmp = realloc (lines, ...) and you never realloc using the pointer itself, e.g. lines = realloc (lines, ...). When (not if) realloc fails, it returns NULL, and if you assign that to your original pointer -- you have just created a memory leak because the address for lines has been lost meaning you cannot reach or free() the memory you previously allocated.
Now you have confirmed you have a pointer available to assign the address of a block of memory to hold the line, remove the '\n' from buf and get the length of the line. You can do that conveniently in a single call to strcspn which returns the initial number of characters in the string not containing the delimiter "\n", e.g.
buf[(len = strcspn(buf, "\n"))] = 0; /* trim \n, get length */
(note: above you are simply overwriting the '\n' with the nul-terminating character 0, equivalent to '\0')
Now that you have the length of the line, you simply allocate length + 1 characters and copy from the temporary buffer buf to your new block of memory, e.g.
if (!(lines[ndx] = malloc (len + 1))) { /* allocate for lines[ndx] */
perror ("malloc-lines[ndx]"); /* validate combined above */
break;
}
memcpy (lines[ndx++], buf, len + 1); /* copy buf to lines[ndx] */
} /* increment ndx */
At that point you are done reading and storing all lines and can simply close the file if not reading from stdin. Here, for example, we just output each of the lines, and then free the storage for each line, finally freeing the memory for the allocated pointers as well, e.g.
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < ndx; i++) { /* loop over each storage line */
printf ("lines[%2zu] : %s\n", i, lines[i]); /* output line */
free (lines[i]); /* free storage for strings */
}
free (lines); /* free pointers */
}
That's it. Putting it altogether, you could do:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
char buf[MAXC] = "", /* temp buffer to hold line read from file */
**lines = NULL; /* pointer-to-pointer to each line */
size_t ndx = 0, alloced = 0; /* current index, no. of ptrs allocated */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (fgets (buf, MAXC, fp)) { /* read each line */
size_t len; /* for line length */
if (ndx == alloced) { /* check if realloc needed */
void *tmp = realloc (lines, /* alloc 2X current, or 2 1st time */
(alloced ? 2 * alloced : 2) * sizeof *lines);
if (!tmp) { /* validate EVERY allocation */
perror ("realloc-lines");
break; /* if allocation failed, data in lines still good */
}
lines = tmp; /* assign new block of mem to lines */
alloced = alloced ? 2 * alloced : 2; /* update ptrs alloced */
}
buf[(len = strcspn(buf, "\n"))] = 0; /* trim \n, get length */
if (!(lines[ndx] = malloc (len + 1))) { /* allocate for lines[ndx] */
perror ("malloc-lines[ndx]"); /* validate combined above */
break;
}
memcpy (lines[ndx++], buf, len + 1); /* copy buf to lines[ndx] */
} /* increment ndx */
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < ndx; i++) { /* loop over each storage line */
printf ("lines[%2zu] : %s\n", i, lines[i]); /* output line */
free (lines[i]); /* free storage for strings */
}
free (lines); /* free pointers */
}
Example Use/Output
$ ./bin/fgets_lines_dyn dat/cmdlinefile.txt
lines[ 0] : A 1 2 3 4 5
lines[ 1] : B 0 0
lines[ 2] : C 1 1
Redirecting from stdin instead of opening the file:
$ ./bin/fgets_lines_dyn < dat/cmdlinefile.txt
lines[ 0] : A 1 2 3 4 5
lines[ 1] : B 0 0
lines[ 2] : C 1 1
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/fgets_lines_dyn dat/cmdlinefile.txt
==6852== Memcheck, a memory error detector
==6852== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6852== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==6852== Command: ./bin/fgets_lines_dyn dat/cmdlinefile.txt
==6852==
lines[ 0] : A 1 2 3 4 5
lines[ 1] : B 0 0
lines[ 2] : C 1 1
==6852==
==6852== HEAP SUMMARY:
==6852== in use at exit: 0 bytes in 0 blocks
==6852== total heap usage: 6 allocs, 6 frees, 624 bytes allocated
==6852==
==6852== All heap blocks were freed -- no leaks are possible
==6852==
==6852== For counts of detected and suppressed errors, rerun with: -v
==6852== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
While allocating storage for each pointer and each line may look daunting at first, it is a problem you will face over and over again, whether reading and storing lines from a file, integer or floating point values in some 2D representation of the data, etc... it is worth the time it takes to learn. Your alternative is to declare a fixed size 2D array and hope your line length never exceeds the declared width and the number of lines never exceeds your declared number of rows. (you should learn that as well, but the limitation become quickly apparent)
Look things over and let me know if you have further questions.
I have an input file which contains space separated values
AA BB 4
A B
AB BA
AA CC
CC BB
A B 3
A C
B C
c B
I want content of the first line moved to s1, s2 and n respectively
and next n lines contents 2 space separated strings these will move to
production_left and production_right respectively.
The layout will be repeated for subsequent blocks of lines also. The sample data has two blocks of input.
My code given below
int main()
{
char *s1[20], *t1[20];
char *production_left[20], *production_right[20];
int n;
FILE *fp;
fp = fopen("suffix.txt","r");
int iss=0;
do{
fscanf(fp,"%s %s %d",s1[iss],t1[iss],&n);
int i=0;
while(i<n)
{
fscanf(fp,"%s %s",production_left[i],production_right[i]);
i++;
}
}while(!eof(fp));
}
Every time it is giving segmentation fault.
A lot of C, is just keeping clear in your mind, what data you need available in what scope (block) of your code. In your case, the data you are working with is the first (or header) line for each section of your data file. For that you have s1 and t1, but you have nothing to preserve n so that it is available for reuse with your data. Since n holds the number of indexes to expect for production_left and production_right under each heading, simply create an index array, e.g. int idx[MAXC] = {0}; to store each n associated with each s1 and t1. That way you preserve that value for use in iteration later on. (MAXC is just a defined constant for 20 to prevent using magic numbers in your code)
Next, you need to turn to your understanding of pointer declarations and their use. char *s1[MAXC], *t1[MAXC], *production_left[MAXC], *production_right[MAXC]; declares 4 array of pointers (20 pointers each) for s1, t1, production_left and production_right. The pointers are uninitialized and unallocated. While each can be initialized to a pointer value, there is no storage (allocated memory) associated with any of them that would allow copying data. You can't simply use fscanf and assign the same pointer value to each (they would all end up pointing to the last value -- if it remained in scope)
So you have two choices, (1) either use a 2D array, or (2) allocate storage for each string and copy the string to the new block of memory and assign the pointer to the start of that block to (e.g. s1[x]). The strdup function provides the allocate and copy in a single function call. If you do not have strdup, it is a simple function to write using strlen, malloc and memcopy (use memmove if there is a potential that the strings overlap).
Once you have identified what values you need to preserve for later use in your code, you have insured the variables declared are scoped properly, and you have insured that each is properly initialized and storage allocated, all that remains is writing the logic to make things work as you intend.
Before turning to an example, it is worth noting that you are interested in line-oriented input with your data. The scanf family provides formatted input, but often it is better to use a line-oriented function for the actual input (e.g. fgets) and then a separate parse with, e.g. sscanf. In this case it is largely a wash since your first values are string values and the %s format specifier will skip intervening whitespace, but that is often not the case. For instance, you are effectively reading:
char tmp1[MAXC] = "", tmp2[MAXC] = "";
...
if (fscanf (fp, "%s %s %d", tmp1, tmp2, &n) != 3)
break;
That can easily be replaced with a bit more robust:
char buf[MAXC] = "", tmp1[MAXC] = "", tmp2[MAXC] = "";
...
if (!fgets (buf, sizeof buf, fp) ||
sscanf (buf, "%s %s %d", tmp1, tmp2, &n) != 3)
break;
(which will validate both the read and the conversion separately)
Above, also note the use of temporary buffers tmp1 and tmp2 (and buf for use with fgets) It is very often advantageous to read input into temporary values that can be validated before final storage for later use.
What remains is simply putting the pieces together in the correct order to accomplish your goals. Below is an example that reads data from the filename given as the first argument (or from stdin if no filename is given) and then outputs the data and frees all memory allocated before exiting.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 20
int main (int argc, char **argv) {
char *s1[MAXC], *t1[MAXC],
*production_left[MAXC], *production_right[MAXC];
int idx[MAXC] = {0}, /* storage for 'n' values */
iss = 0, ipp = 0, pidx = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
/* loop until no header row read, protecting array bounds */
for (; ipp < MAXC && iss < MAXC; iss++) {
char buf[MAXC] = "", tmp1[MAXC] = "", tmp2[MAXC] = "";
int n = 0;
if (!fgets (buf, sizeof buf, fp) ||
sscanf (buf, "%s %s %d", tmp1, tmp2, &n) != 3)
break;
idx[iss] = n;
s1[iss] = strdup (tmp1); /* strdup - allocate & copy */
t1[iss] = strdup (tmp2);
if (!s1[iss] || !t1[iss]) { /* if either is NULL, handle error */
fprintf (stderr, "error: s1 or s1 empty/NULL, iss: %d\n", iss);
return 1;
}
/* read 'n' data lines from file, protecting array bounds */
for (int i = 0; i < n && ipp < MAXC; i++, ipp++) {
char ptmp1[MAXC] = "", ptmp2[MAXC] = "";
if (!fgets (buf, sizeof buf, fp) ||
sscanf (buf, "%s %s", ptmp1, ptmp2) != 2) {
fprintf (stderr, "error: read failure, ipp: %d\n", iss);
return 1;
}
production_left[ipp] = strdup (ptmp1);
production_right[ipp] = strdup (ptmp2);
if (!production_left[ipp] || !production_right[ipp]) {
fprintf (stderr, "error: production_left or "
"production_right empty/NULL, iss: %d\n", iss);
return 1;
}
}
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (int i = 0; i < iss; i++) {
printf ("%-8s %-8s %2d\n", s1[i], t1[i], idx[i]);
free (s1[i]); /* free s & t allocations */
free (t1[i]);
for (int j = pidx; j < pidx + idx[i]; j++) {
printf (" %-8s %-8s\n", production_left[j], production_right[j]);
free (production_left[j]); /* free production allocations */
free (production_right[j]);
}
pidx += idx[i]; /* increment previous index value */
}
return 0;
}
Example Input File
$ cat dat/production.txt
AA BB 4
A B
AB BA
AA CC
CC BB
A B 3
A C
B C
c B
Example Use/Output
$ ./bin/production <dat/production.txt
AA BB 4
A B
AB BA
AA CC
CC BB
A B 3
A C
B C
c B
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to write beyond/outside the bounds of your allocated block of memory, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/production <dat/production.txt
==3946== Memcheck, a memory error detector
==3946== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==3946== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==3946== Command: ./bin/production
==3946==
AA BB 4
A B
AB BA
AA CC
CC BB
A B 3
A C
B C
c B
==3946==
==3946== HEAP SUMMARY:
==3946== in use at exit: 0 bytes in 0 blocks
==3946== total heap usage: 18 allocs, 18 frees, 44 bytes allocated
==3946==
==3946== All heap blocks were freed -- no leaks are possible
==3946==
==3946== For counts of detected and suppressed errors, rerun with: -v
==3946== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 5 years ago.
Improve this question
As the title explains the problem. In this function in my program , whenever I allocate a certain size in the array I get huge chunks of error code on my terminal. The array gets copied properly and everything but right after printing it the program crashes. The purpose of the program is to read from file and then store each line in an array index. Your help is much appreciated. Thanks
The array is declared as a pointer in the main and then it is dynamically allocated inside.
void read_file_into_array(char **filearray, FILE * fp)
{
/* use get line to count # of lines */
int numlines = -1;
numlines = linecount(fp);
size_t size = 50;
char *buffer;
buffer = (char *)malloc(size * sizeof(char));
if (buffer == NULL) {
perror("Unable to allocate buffer");
exit(1);
}
if (numlines < 0) {
fprintf(stderr, "error: unable to determine file length.\n");
return;
}
printf(" number of lines counted are ; %d\n", numlines);
/* allocate array of size numlines + 1 */
if (!(*filearray = (char *)malloc(numlines + 1 * sizeof(char)))) {
fprintf(stderr, "error: virtual memory exhausted.\n");
return;
}
fseek(fp, 0, SEEK_SET);
for (int i = 0; i < numlines; i++) {
if (!feof(fp)) {
fgets(buffer, size, fp);
filearray[i] = (char *)malloc(size * sizeof(char *));
strcpy(filearray[i], buffer);
printf("buffer at %d : %s\n", i, buffer);
printf("array at %d : %s\n", i, filearray[i]);
}
}
free(buffer);
}
/* THIS IS MY MAIN BELOW */
int main (int argc, char **argv)
{
FILE *fp = NULL;
char* array;
/* open file for reading (default stdin) */
fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open */
fprintf (stderr, "error: file open failed '%s'\n", argv[1]);
return 1;
}
read_file_into_array (&array, fp);
if (fp != stdout)
if (fclose (fp) == EOF) {
fprintf (stderr, "error: fclose() returned EOF\n");
return 1;
}
return 0;
}
/* MY LNE COUNT FUNCITON */
int linecount(FILE *fp) {
char buff[MAXLINE];
int count = 0;
while(fgets(buff,MAXLINE,fp) != NULL) {
count++;
}
return count;
}
OK, let's get you sorted out. First, to pass a pointer to pointer to char as a parameter to a function, allocate within the function and have the results available back in the calling function (without more), you must pass the address of the pointer to pointer to char. Otherwise, the function receives a copy and any changes made in the function are not available back in the caller (main here) because the copy is lost (the address to the start of the allocated block is lost) when the function returns.
Passing the address of the pointer to pointer to char means you have to become a 3-STAR Programmer (it is not a compliment). A better way is to not pass the pointer at all, simply allocate within the function and return a pointer to the caller (but there are some cases where you will still need to pass it as a parameter, so that is shown below)
The program below will read any text file passed as the first argument (or it will read from stdin if no argument is given). It will dynamically allocate pointers and needed (in MAXL blocks rather than one-at-a-time which is highly inefficient). It will allocate (and reallocate) lines as required to accommodate lines of any length. It uses a fixed buffer of MAXC chars to read continually until a complete line is read saving the offset to the position in the current line to append subsequent reads as required.
Finally after reading the file, it will print the text and associated line number to stdout, and free all memory allocated before exiting. Look it over and let me know if you have any questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum { MAXL = 32, MAXC = 1024 };
char **read_file_into_buf (char ***buf, FILE *fp, int *nlines);
int main (int argc, char **argv) {
char **buf = NULL; /* buffer to hold lines of file */
int n; /* number of lines read */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
if (read_file_into_buf (&buf, fp, &n)) { /* read file into buf */
for (int i = 0; i < n; i++) { /* loop over all lines */
printf ("line[%3d]: %s\n", i, buf[i]); /* output line */
free (buf[i]); /* free line */
}
free (buf); /* free buffer */
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
return 0;
}
/** read text file from 'fp' into 'buf' update 'nlines'.
* Being a 3-STAR Programmer, is NOT a compliment. However,
* to pass a pointer to pointer to char as a parameter and
* allocate within the function, it is required. You must
* pass the address of buf from main otherwise the function
* recieves a copy whose allocation is lost on return. A better
* approach is to simply assign the return in 'main' and not
* pass buf at all.
*/
char **read_file_into_buf (char ***buf, FILE *fp, int *nlines)
{
/* current allocation, current index, and offset (if less
* than a whole line is read into line on fgets call), and
* eol indicates '\n' present.
*/
size_t n = MAXL, idx = 0, offset = 0, eol = 0;
char line[MAXC] = ""; /* temp buffer for MAXC chars */
void *tmp = NULL; /* pointer for realloc */
/* validate address, file ptr & nlines address */
if (!buf || !fp || !nlines) {
fprintf (stderr, "error; invalid parameter.\n");
return NULL;
}
/* allocate initial MAXL pointers, calloc used to avoid valgrind
* warning about basing a conditional jump on uninitialized value.
* calloc allocates and initializes.
*/
if (!(*buf = calloc (sizeof *buf, MAXL))) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return NULL;
}
while (fgets (line, MAXC, fp)) /* read every line in file */
{
/* save offset from prior read (if any), get len */
size_t end = offset, len = strlen (line);
if (len && line[len - 1] == '\n') { /* test for new line */
line[--len] = 0; /* overwrite with nul */
offset = 0; /* zero offset, all read */
eol = 1; /* POSIX eol present */
}
else {
line[len] = 0; /* nul-terminate */
offset += len; /* short read, save offset to last char */
eol = 0; /* no POSIX eol */
}
/* allocate/reallocate for current line + nul-byte */
tmp = realloc ((*buf)[idx], sizeof ***buf * (end + len + 1));
if (!tmp) {
fprintf (stderr, "error: realloc, memory exhausted.\n");
return *buf; /* return current buf */
}
(*buf)[idx] = tmp; /* assign block to current index */
strcpy ((*buf)[idx] + end, line); /* copy line to block */
if (!eol) continue; /* chars remain in line, go read them */
if (++idx == n) { /* check pointer allocation, realloc as needed */
tmp = realloc (*buf, sizeof **buf * (n + MAXL));
if (!tmp) {
fprintf (stderr, "error: realloc buf, memory exhausted.\n");
return *buf;
}
*buf = tmp; /* assign new block to *buf */
memset (*buf + n, 0, sizeof **buf * MAXL); /* zero new memory */
n += MAXL; /* update the current number of ptrs allocated */
}
*nlines = idx; /* update the number of lines read */
}
if (!eol) { /* protect against file with no POSIX ending '\n' */
idx++; /* account for final line */
*nlines = idx; /* update nlines */
}
/* final realloc to size buf to exactly fit number of lines */
tmp = realloc (*buf, sizeof **buf * (idx));
if (!tmp) /* if it fails, return current buf */
return *buf;
*buf = tmp; /* assign reallocated block to buf */
return *buf;
}
Note: zeroing the new pointers allocated with memset (*buf + n, 0, sizeof **buf * MAXL); is optional, but it eliminates valgrind warnings about basing a conditional jump on an uninitialized value, just as with calloc/malloc of the initial pointers. There is no option to zero with realloc, so a separate call to memset is required.
Example Use/Output
$ ./bin/fgets_readfile < dat/damages.txt
line[ 0]: Personal injury damage awards are unliquidated
line[ 1]: and are not capable of certain measurement; thus, the
line[ 2]: jury has broad discretion in assessing the amount of
line[ 3]: damages in a personal injury case. Yet, at the same
line[ 4]: time, a factual sufficiency review insures that the
line[ 5]: evidence supports the jury's award; and, although
line[ 6]: difficult, the law requires appellate courts to conduct
line[ 7]: factual sufficiency reviews on damage awards in
line[ 8]: personal injury cases. Thus, while a jury has latitude in
line[ 9]: assessing intangible damages in personal injury cases,
line[ 10]: a jury's damage award does not escape the scrutiny of
line[ 11]: appellate review.
line[ 12]:
line[ 13]: Because Texas law applies no physical manifestation
line[ 14]: rule to restrict wrongful death recoveries, a
line[ 15]: trial court in a death case is prudent when it chooses
line[ 16]: to submit the issues of mental anguish and loss of
line[ 17]: society and companionship. While there is a
line[ 18]: presumption of mental anguish for the wrongful death
line[ 19]: beneficiary, the Texas Supreme Court has not indicated
line[ 20]: that reviewing courts should presume that the mental
line[ 21]: anguish is sufficient to support a large award. Testimony
line[ 22]: that proves the beneficiary suffered severe mental
line[ 23]: anguish or severe grief should be a significant and
line[ 24]: sometimes determining factor in a factual sufficiency
line[ 25]: analysis of large non-pecuniary damage awards.
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you haven't written beyond/outside your allocated block of memory, attempted to read or base a jump on an uninitialized value and finally to confirm that you have freed all the memory you have allocated.
For Linux valgrind is the normal choice.
$ valgrind ./bin/fgets_readfile < dat/damages.txt
==5863== Memcheck, a memory error detector
==5863== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==5863== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==5863== Command: ./bin/fgets_readfile
==5863==
line[ 0]: Personal injury damage awards are unliquidated
line[ 1]: and are not capable of certain measurement; thus, the
line[ 2]: jury has broad discretion in assessing the amount of
line[ 3]: damages in a personal injury case. Yet, at the same
line[ 4]: time, a factual sufficiency review insures that the
line[ 5]: evidence supports the jury's award; and, although
line[ 6]: difficult, the law requires appellate courts to conduct
line[ 7]: factual sufficiency reviews on damage awards in
line[ 8]: personal injury cases. Thus, while a jury has latitude in
line[ 9]: assessing intangible damages in personal injury cases,
line[ 10]: a jury's damage award does not escape the scrutiny of
line[ 11]: appellate review.
line[ 12]:
line[ 13]: Because Texas law applies no physical manifestation
line[ 14]: rule to restrict wrongful death recoveries, a
line[ 15]: trial court in a death case is prudent when it chooses
line[ 16]: to submit the issues of mental anguish and loss of
line[ 17]: society and companionship. While there is a
line[ 18]: presumption of mental anguish for the wrongful death
line[ 19]: beneficiary, the Texas Supreme Court has not indicated
line[ 20]: that reviewing courts should presume that the mental
line[ 21]: anguish is sufficient to support a large award. Testimony
line[ 22]: that proves the beneficiary suffered severe mental
line[ 23]: anguish or severe grief should be a significant and
line[ 24]: sometimes determining factor in a factual sufficiency
line[ 25]: analysis of large non-pecuniary damage awards.
==5863==
==5863== HEAP SUMMARY:
==5863== in use at exit: 0 bytes in 0 blocks
==5863== total heap usage: 28 allocs, 28 frees, 1,733 bytes allocated
==5863==
==5863== All heap blocks were freed -- no leaks are possible
==5863==
==5863== For counts of detected and suppressed errors, rerun with: -v
==5863== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm All heap blocks were freed -- no leaks are possible and equally important ERROR SUMMARY: 0 errors from 0 contexts. (although note: some OS's do not provide adequate memeory exclusion files (the file that excludes system and OS memory from being reported as in use) which will cause valgrind to report that some memory has not yet been freed (despite the fact you have done your job and freed al blocks you allocoted and under your control.)
One of the main issues why you get compilation warnings (+pay attention to the comments which point out other different bugs in your code) is
if (!(*filearray = (char**)malloc(sizeof(char*) * (numlines + 1))))
*filearray is of type char* not char**; + there is no need to cast the value returned by malloc.
char **filearray is actually &array from the main() which is defined as char *array;
Now the following totally breaks the stability and leads to undefined behavior
filearray[i] = (char *)malloc(size * sizeof(char *));
this is equivalent to (&array)[i] = (char *)malloc(size * sizeof(char *)); from the main, this is totally broken. You are walking through the stack, starting from the location of the array from the main and just override everything with the values of char* that are returned by malloc.
If you want your array be array of strings, defined it as char **array, pass it by pointer &array, update the declaration and the usage of filearray accordingly.
I need to create an array I where column array is a fixed size and the row array is dynamically allocated (using malloc).
I looked at other similar questions and they either make the entire thing fixed or dynamically allocated. How can you do both?
char A[malloc][100];
My guess would be:
char *A[100];
Following on from my comment, you can statically declare an array-of-pointers and then dynamically allocate storage for each line read from a file. Below is a short example showing this approach. You must keep an index or counter of the number of lines read so you can prevent writing beyond the end of your array of pointers.
Note xcalloc below is just a wrapper for calloc that allocates space for each line and does error checking. Allocation below is the long-way (note: you can replace the long-way with strdup as indicated in the comments) The full allocate/validate/nul-terminate approach to allocation is shown below for example/explanation.
You can change the number of pointers declared by changing the #define MAXL value and adjust the maximum characters per-line by changing #define MAXC value. Look over and understand what each part of each line is doing. Note the use of a static buffer buf into which the initial lines are read before memory is allocated and the contents assigned to an index in your array. This is very useful as it allows you to validate what you have read before allocating space for it and copying the sting to new memory. It allows you to skip empty lines, etc is a much simpler manner than having to free/unindex an unwanted line:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 256 /* max chars per-line */
#define MAXL 100 /* initial num pointers */
void *xcalloc (size_t n, size_t s);
int main (int argc, char **argv) {
char *array[MAXL] = {NULL};
char buf[MAXC] = {0};
size_t i, idx = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) {
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
while (fgets (buf, MAXC, fp)) /* read all lines from fp into array */
{
size_t len = strlen (buf);
/* validate complete line read */
if (len + 1 == MAXC && buf[len - 1] != '\n')
fprintf (stderr, "warning: line[%zu] exceeded '%d' chars.\n",
idx, MAXC);
/* strip trailing '\r', '\n' */
while (len && (buf[len-1] == '\n' || buf[len-1] == '\r'))
buf[--len] = 0;
/* allocate & copy buf to array[idx], nul-terminate
* note: this can all be done with array[idx] = strdup (buf);
*/
array[idx] = xcalloc (len + 1, sizeof **array);
strncpy (array[idx], buf, len);
array[idx++][len] = 0;
/* MAXL limit check - if reached, break */
if (idx == MAXL) break;
}
if (fp != stdin) fclose (fp);
printf ("\n lines read from '%s'\n\n", argc > 1 ? argv[1] : "stdin");
for (i = 0; i < idx; i++)
printf (" line[%3zu] %s\n", i, array[i]);
for (i = 0; i < idx; i++)
free (array[i]); /* free each line */
return 0;
}
/* simple calloc with error checking */
void *xcalloc (size_t n, size_t s)
{
void *memptr = calloc (n, s);
if (memptr == 0) {
fprintf (stderr, "xcalloc() error: virtual memory exhausted.\n");
exit (EXIT_FAILURE);
}
return memptr;
}
Compile
gcc -Wall -Wextra -O3 -o bin/fgets_lines_stat_dyn fgets_lines_stat_dyn.c
Use/Output
$ ./bin/fgets_lines_stat_dyn dat/captnjack.txt
lines read from 'dat/captnjack.txt'
line[ 0] This is a tale
line[ 1] Of Captain Jack Sparrow
line[ 2] A Pirate So Brave
line[ 3] On the Seven Seas.
Memory Leak/Error Check
In any code your write that dynamically allocates memory, you have 2 responsibilites regarding any block of memory allocated: (1) always preserves a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed. It is imperative that you use a memory error checking program to insure you haven't written beyond/outside your allocated block of memory and to confirm that you have freed all the memory you have allocated. For Linux valgrind is the normal choice. There are so many subtle ways to misuse a block of memory that can cause real problems, there is no excuse not to do it. There are similar memory checkers for every platform. They are all simple to use. Just run your program through it.
$ valgrind ./bin/fgets_lines_stat_dyn dat/captnjack.txt
==22770== Memcheck, a memory error detector
==22770== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==22770== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==22770== Command: ./bin/fgets_lines_dyn dat/captnjack.txt
==22770==
lines read from 'dat/captnjack.txt'
line[ 0] This is a tale
line[ 1] Of Captain Jack Sparrow
line[ 2] A Pirate So Brave
line[ 3] On the Seven Seas.
==22770==
==22770== HEAP SUMMARY:
==22770== in use at exit: 0 bytes in 0 blocks
==22770== total heap usage: 6 allocs, 6 frees, 1,156 bytes allocated
==22770==
==22770== All heap blocks were freed -- no leaks are possible
==22770==
==22770== For counts of detected and suppressed errors, rerun with: -v
==22770== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
I'm trying to create a code, which reads from textile, and then stores the data into memory, prints out to the screen so the user can read it, but it is still saved into the memory so you can use it for the rest of the program..
Here is the sample of the textile
75
nevermind
nvm
not much
nm
no problem
np
people
ppl
talk to you later
ttyl
because
cuz
i don't know
idk
as soon as possible
asap
yeah
ya
how are you
hru
you
the list goes on, it has a total of 150 words, 151 lines if the first number is included. The 75 serves to tell you how many pairs there are.
anyways, here is the code that i Have written so far, it uses this struct
typedef struct
{
char *English;
char *TextSpeak;
}Pair;
The code i have written so far is:
FILE *infile =fopen("dictionary.txt","r");
int phraseCounter;
fscanf(infile, "%i", &phraseCounter); //Now you have the number of phrase pairs
//Allocate memory
Pair *phrases=malloc(sizeof(Pair) * phraseCounter);
//Run loop
for(int i=0; i < phraseCounter; i++){
//Get the english word
phrases[i].English = malloc(sizeof(char));
fscanf(infile,"%s",phrases[i].English);
//run loop to read next line
for(int a=0; a < phraseCounter; a++){
phrases[i].TextSpeak = malloc(sizeof(char));
fscanf(infile,"%s",phrases[i].TextSpeak);
}
printf("%s - %s\n", phrases[i].English, phrases[i].TextSpeak);
}
fclose(infile);
for(int i=0; i < phraseCounter; i++)
free(phrases[i].English);
free(phrases);
The output i keep getting is:
nevermind - atm
by - definitely
def -
-
-
-
-
-
And it keeps going for 75 lines.
Now I'm not sure whether I should use a 2D array or if this will be acceptable. Any help will be appreciated! Thanks
phrases[i].English = malloc(sizeof(char));
Here the problem lies, you are allocating a single byte and then trying to cram a string into it, which leads to undefined behavior here:
fscanf(infile,"%s", phrases[i].English);
Generally, you should assume a sensible buffer length and read that, and check whether the new line is contained. If that's not the case, read again into another buffer or enlarge the old one (using realloc).
Either that or use a non-standard function like getline that does this already for you under the hood.
Given that your lines are pretty short sentences, a constant buffer size could suffice (let's call it MAX_LINE), which provides us a little easier way to achieve the same:
fscanf(infile, "%*[^\n]s", MAX_LINE, buf);
This reads a string of length MAX_LINE into the buffer buf and terminates before the '\n' is encountered.
When reading strings, you should refrain from using fscanf("%s", buf) and use fgets() or scanf("%*s", MAX_LINE, ...) instead. This ensures that there will be no attempt to write more data than what you specified, avoiding buffer overflows.
Edit: The nested loop shouldn't be there. You are basically overwriting phrases[i].TextSpeak a total of phraseCounter times for no benefit. And in the process of that you are leaking a lot of memory.
There are a number of ways to do this. However, one suggestion I would have would be to make use of line-input tools such as fgets or getline to make reading the file more robust. It is fine to use fscanf for discrete variables (I left it for reading phraseCounter), but for reading string data of unknown length/content, line-input really should be used.
Below is an example of your code with this employed. The code is commented to explain the logic. The real logic here is the fact that you will read 2-lines for each allocated struct. A simple line counter using % (mod) as a toggle can help you keep track of when to allocate a new struct. I also added code to accept the filename as the first argument to the program. (to run, e.g ./progname <filename>). Let me know if you have questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXL 128
typedef struct Pair
{
char *English;
char *TextSpeak;
} Pair;
int main (int argc, char** argv) {
if (argc < 2) {
fprintf (stderr, "error: insufficient input. Usage: %s filename\n", argv[0]);
return 1;
}
Pair **pair = NULL; /* pointer to array of pointers */
char line[MAXL] = {0}; /* variable to hold line read */
FILE* infile = NULL; /* file pointer for infile */
unsigned int phraseCounter = 0; /* count of pairs */
unsigned int index = 0; /* index to pairs read */
size_t nchr = 0; /* length of line read */
size_t lnum = 0; /* line number read */
/* open file and validate */
if (!(infile = fopen ((argv[1]), "r"))) {
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
/* read phraseCounter */
if (!fscanf (infile, "%u%*c", &phraseCounter)) {
fprintf (stderr, "error: failed to read phraseCounter.\n");
return 1;
}
/* allocate phraseCounter number of pointers to Pair */
if (!(pair = calloc (phraseCounter, sizeof *pair))) {
fprintf (stderr, "error: memory allocation failed.\n");
return 1;
}
/* read each line in file */
while (fgets (line, MAXL - 1, infile) != NULL)
{
nchr = strlen (line); /* get the length of line */
if (nchr < 1) /* if blank or short line, skip */
continue;
if (line[nchr-1] == '\n') /* strip newline from end */
line[--nchr] = 0;
if (lnum % 2 == 0) /* even/odd test for pair index */
{
/* allocate space for pair[index] */
if (!(pair[index] = calloc (1, sizeof **pair))) {
fprintf (stderr, "error: memory allocation failed for pair[%u].\n", index);
return 1;
}
pair[index]-> English = strdup (line); /* allocate space/copy to English */
}
else
{
pair[index]-> TextSpeak = strdup (line);/* allocate space/copy to TextSpeak */
index++; /* only update index after TextSpeak read */
}
lnum++; /* increment line number */
}
if (infile) fclose (infile); /* close file pointer after read */
/* print the pairs */
printf ("\n Struct English TextSpeak\n\n");
for (nchr = 0; nchr < index; nchr++)
printf (" pair[%3zu] %-24s %s\n", nchr, pair[nchr]-> English, pair[nchr]-> TextSpeak);
/* free memory allocated to pair */
for (nchr = 0; nchr < index; nchr++)
{
if (pair[nchr]-> English) free (pair[nchr]-> English);
if (pair[nchr]-> TextSpeak) free (pair[nchr]-> TextSpeak);
if (pair[nchr]) free (pair[nchr]);
}
if (pair) free (pair);
return 0;
}
Input
$ cat dat/pairs.txt
10
nevermind
nvm
not much
nm
no problem
np
people
ppl
talk to you later
ttyl
because
cuz
i don't know
idk
as soon as possible
asap
yeah
ya
how are you
hru
Output
$ ./bin/struct_rd_pairs dat/pairs.txt
Struct English TextSpeak
pair[ 0] nevermind nvm
pair[ 1] not much nm
pair[ 2] no problem np
pair[ 3] people ppl
pair[ 4] talk to you later ttyl
pair[ 5] because cuz
pair[ 6] i don't know idk
pair[ 7] as soon as possible asap
pair[ 8] yeah ya
pair[ 9] how are you hru
Verify no memory leaks
$ valgrind ./bin/struct_rd_pairs dat/pairs.txt
==3562== Memcheck, a memory error detector
==3562== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==3562== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==3562== Command: ./bin/struct_rd_pairs dat/pairs.txt
==3562==
<snip>
==3562==
==3562== HEAP SUMMARY:
==3562== in use at exit: 0 bytes in 0 blocks
==3562== total heap usage: 32 allocs, 32 frees, 960 bytes allocated
==3562==
==3562== All heap blocks were freed -- no leaks are possible
==3562==
==3562== For counts of detected and suppressed errors, rerun with: -v
==3562== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)