Reading an unknown number of lines with unknown length from stdin - c

I'm relatively new to programming in C and am trying to read input from stdin using fgets.
To begin with I thought about reading max 50 lines, max 50 characters each, and had something like:
int max_length = 50;
char lines[max_length][max_length];
char current_line[max_length];
int idx = 0;
while(fgets(current_line, max_length, stdin) != NULL) {
strcpy(lines[idx], current_line);
idx++;
}
The snippet above successfully reads the input and stores it into the lines array where I can sort and print it.
My question is how do I deal with an unknown number of lines, with an unknown number of characters on each line? (bearing in mind that I will have to sort the lines and print them out).

While there are a number of different variations of this problem already answered, the considerations of how to go about it could use a paragraph. When faced with this problem, the approach is the same regardless of which combination of library or POSIX functions you use to do it.
Essentially, you dynamically allocate enough characters to hold each line. POSIX getline will do this for you automatically. With fgets, you read a fixed-size buffer full of chars and append them (reallocating storage as necessary) until the '\n' character is read (or EOF is reached).
If you use getline, you must still allocate memory for, and copy, the buffer it fills for every line you want to keep. Otherwise, you will overwrite previous lines with each new line read, and when you attempt to free each line, you will likely SegFault with double-free or corruption as you repeatedly attempt to free the same block of memory.
You can use strdup to simply copy the buffer. However, since strdup allocates storage, you should validate successful allocation before assigning a pointer to the new block of memory to your collection of lines.
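The fgets route can be sketched as follows. This is only a minimal illustration, not code from the answer itself: the function name read_line and the CHUNK size are hypothetical, and it assumes the input contains no embedded nul bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 64    /* hypothetical chunk size, chosen only for illustration */

/* Read one line of any length from fp by appending fgets() chunks.
 * Returns an allocated string without the trailing '\n' (caller frees),
 * or NULL at end-of-file or on allocation failure.
 */
char *read_line (FILE *fp)
{
    char chunk[CHUNK], *line = NULL;
    size_t len = 0;

    while (fgets (chunk, sizeof chunk, fp)) {
        size_t n = strlen (chunk);
        void *tmp = realloc (line, len + n + 1);    /* grow via temp pointer */
        if (!tmp) {
            free (line);
            return NULL;
        }
        line = tmp;
        memcpy (line + len, chunk, n + 1);          /* append chunk and '\0' */
        len += n;
        if (len && line[len - 1] == '\n') {         /* complete line read */
            line[--len] = 0;                        /* trim the '\n' */
            break;
        }
    }
    return line;
}
```

A caller would loop with something like while ((s = read_line (stdin))) and store or free each returned string.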
To access each line, you need a pointer to the beginning of each (the block of memory holding each line). A pointer to pointer to char is generally used. (e.g. char **lines;) Memory allocation is generally handled by allocating some reasonable number of pointers to begin with, keeping track of the number you use, and when you reach the number you have allocated, you realloc and double the number of pointers.
As with each read, you need to validate each memory allocation. (each malloc, calloc, or realloc) You also need to validate the way your program uses the memory you allocate by running the program through a memory error check program (such as valgrind for Linux). They are simple to use, just valgrind yourexename.
Putting those pieces together, you can do something similar to the following. The following code will read all lines from the filename provided as the first argument to the program (or from stdin by default if no argument is provided) and print the line number and line to stdout (keep that in mind if you run it on a 50,000 line file)
#define _POSIX_C_SOURCE 200809L /* request POSIX getline() and strdup() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NPTR 8
int main (int argc, char **argv) {
    size_t ndx = 0,             /* line index */
           nptrs = NPTR,        /* initial number of pointers */
           n = 0;               /* line alloc size (0, getline decides) */
    ssize_t nchr = 0;           /* return (no. of chars read by getline) */
    char *line = NULL,          /* buffer to read each line */
         **lines = NULL;        /* pointer to pointer to each line */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }
    /* allocate/validate initial 'nptrs' pointers */
    if (!(lines = calloc (nptrs, sizeof *lines))) {
        fprintf (stderr, "error: memory exhausted - lines.\n");
        return 1;
    }
    /* read each line with POSIX getline */
    while ((nchr = getline (&line, &n, fp)) != -1) {
        if (nchr && line[nchr - 1] == '\n')     /* check trailing '\n' */
            line[--nchr] = 0;                   /* overwrite with nul-char */
        char *buf = strdup (line);              /* allocate/copy line */
        if (!buf) {             /* strdup allocates, so validate */
            fprintf (stderr, "error: strdup allocation failed.\n");
            break;
        }
        lines[ndx++] = buf;     /* assign start address for buf to lines */
        if (ndx == nptrs) {     /* if pointer limit reached, realloc */
            /* always realloc to temporary pointer, to validate success */
            void *tmp = realloc (lines, sizeof *lines * nptrs * 2);
            if (!tmp) {         /* if realloc fails, bail with lines intact */
                fprintf (stderr, "read_input: memory exhausted - realloc.\n");
                break;
            }
            lines = tmp;        /* assign reallocated block to lines */
            /* zero all new memory (optional) */
            memset (lines + nptrs, 0, nptrs * sizeof *lines);
            nptrs *= 2;         /* increment number of allocated pointers */
        }
    }
    free (line);                /* free memory allocated by getline */
    if (fp != stdin) fclose (fp);   /* close file if not stdin */
    for (size_t i = 0; i < ndx; i++) {
        printf ("line[%3zu] : %s\n", i, lines[i]);
        free (lines[i]);        /* free memory for each line */
    }
    free (lines);               /* free pointers */
    return 0;
}
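The question also asks about sorting the lines before printing. Once the lines are held in a char ** array, the standard qsort() with a small strcmp() wrapper is one way to do it. The sketch below is an assumption about how that could look, and the names cmp_lines and sort_lines are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() hands the comparison function pointers to the array elements,
 * so each argument is really a pointer to a 'char *'.
 */
static int cmp_lines (const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp (*sa, *sb);
}

/* sort the first n pointers in lines into ascending strcmp() order */
void sort_lines (char **lines, size_t n)
{
    qsort (lines, n, sizeof *lines, cmp_lines);
}
```

Only the pointers are rearranged, the strings themselves are never copied, so calling something like sort_lines (lines, ndx) before the output loop would be enough.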
If you don't have getline, or strdup, you can easily implement each. There are multiple examples of each on the site. If you cannot find one, let me know. If you have further questions, let me know as well.
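For instance, a stand-in for strdup can be written in a few lines of standard C. This is a minimal sketch, and the name dupstr is hypothetical (picked so it does not clash with a library-provided strdup).

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a copy of the nul-terminated string s.
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *dupstr (const char *s)
{
    size_t len = strlen (s) + 1;    /* include the nul-terminator */
    char *copy = malloc (len);
    if (copy)                       /* validate before copying */
        memcpy (copy, s, len);
    return copy;
}
```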

Check out the GetString function found here.

Related

Creating a function in C that reads a file but it infinitely loops

I basically want to print out all of the lines in a file I made,
but it just loops over and over, back to the start,
because in the function I call fseek(fp, 0, SEEK_SET);.
I don't know where else I would place it to get through all the other lines;
I am going back to the start every time.
#include<stdio.h>
#include <stdlib.h>
char *freadline(FILE *fp);
int main(){
FILE *fp = fopen("story.txt", "r");
if(fp == NULL){
printf("Error!");
}else{
char *pInput;
while(!feof(fp)){
pInput = freadline(fp);
printf("%s\n", pInput); // outpu
}
}
return 0;
}
char *freadline(FILE *fp){
int i;
for(i = 0; !feof(fp); i++){
getc(fp);
}
fseek(fp, 0, SEEK_SET); // resets to origin (?)
char *pBuffer = (char*)malloc(sizeof(char)*i);
pBuffer = fgets(pBuffer, i, fp);
return pBuffer;
}
this is my work so far
Continuing from my comments, you are thinking along the correct lines, you just haven't put the pieces together in the right order. In main() you are looping calling your function, allocating for a single line, and then outputting a line of text and doing it over and over again. (and without freeing any of the memory you allocate -- creating a memory leak of every line read)
If you are allocating storage to hold the line, you will generally want to read the entire file in your function in a single pass through the file, allocating for and storing all lines, and returning a pointer to your collection of lines for printing in main() (or whatever the calling function is).
You do that by adding one additional level of indirection and having your function return char **. This is a simple two-step process where you:
(1) allocate a block of memory containing pointers (one pointer for each line). Since you will not know the number of lines beforehand, you simply allocate some number of pointers and then realloc() more pointers when you run out;
(2) for each line you read, allocate length + 1 characters of storage and copy the current line to that block of memory, assigning the address for the line to your next available pointer.
(you either keep track of the number of pointers and lines allocated, or provide an additional pointer set to NULL as a Sentinel after the last pointer assigned a line -- up to you, simply keeping track with a counter is likely conceptually easier)
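If you go the Sentinel route instead of a counter, the consumer side could look like the sketch below. The function names are hypothetical, and it assumes the reader always keeps one spare pointer slot so that a NULL can follow the last line.

```c
#include <stdlib.h>

/* count the lines in a NULL-terminated array of pointers */
size_t count_lines_sentinel (char **lines)
{
    size_t n = 0;
    while (lines[n])        /* stop at the NULL Sentinel */
        n++;
    return n;
}

/* free every line, then the block of pointers itself */
void free_lines_sentinel (char **lines)
{
    for (char **p = lines; *p; p++)
        free (*p);
    free (lines);
}
```

The trade-off is that finding the number of lines costs a walk over the array each time, which is one reason a counter tends to be simpler.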
After reading your last line, you simply return the pointer to the collection of pointers which is assigned for use back in the caller. (you can also pass the address of a char ** as a parameter to your function, resulting in the type being char ***, but being a Three-Star Programmer isn't always a compliment). However, there is nothing wrong with doing it that way, and in some cases, it will be required, but if you have an alternative, that is generally the preferred route.
So how would this work in practice?
Simply change your function return type to char ** and pass an additional pointer to a counting variable so you can update the value at that address with the number of lines read before you return from your function. E.g., you could do:
char **readfile (FILE *fp, size_t *n);
Which will take your file pointer to an open file stream and then read each line from it, allocating storage for the line and assigning the address for that allocation to one of your pointers. Within the function, you would use a sufficiently sized character array to hold each line you read with fgets(). Trim the '\n' from the end and get the length and then allocate length + 1 bytes to hold the line. Assign the address for the newly allocated block to a pointer and copy from your array to the newly allocated block.
(strdup() can both allocate and copy, but it is not part of the standard library, it's POSIX -- though most compilers support it as an extension if you provide the proper options)
Below, the readfile() function puts it all together, starting with a single pointer and reallocating twice the current number when you run out. That provides a reasonable trade-off between the number of allocations needed and the number of pointers (after just 20 calls to realloc(), you would have 1M pointers). You can choose any reallocation and growth scheme you like, but you want to avoid calling realloc() for every line -- realloc() is still a relatively expensive call.
#define MAXC 1024 /* if you need a constant, #define one (or more) */
#define NPTR 1 /* initial no. of pointers to allocate */
/* readfile reads all lines from fp, updating the value at the address
* provided by 'n'. On success returns pointer to allocated block of pointers
* with each of *n pointers holding the address of an allocated block of
* memory containing a line from the file. On allocation failure, the number
* of lines successfully read prior to failure is returned. Caller is
* responsible for freeing all memory when done with it.
*/
char **readfile (FILE *fp, size_t *n)
{
char buffer[MAXC], **lines; /* buffer to hold each line, pointer */
size_t allocated = NPTR, used = 0; /* allocated and used pointers */
lines = malloc (allocated * sizeof *lines); /* allocate initial pointer(s) */
if (lines == NULL) { /* validate EVERY allocation */
perror ("malloc-lines");
return NULL;
}
while (fgets (buffer, MAXC, fp)) { /* read each line from file */
size_t len; /* variable to hold line-length */
if (used == allocated) { /* is pointer reallocation needed */
/* always realloc to a temporary pointer to avoid memory leak if
* realloc fails returning NULL.
*/
void *tmp = realloc (lines, 2 * allocated * sizeof *lines);
if (!tmp) { /* validate EVERY reallocation */
perror ("realloc-lines");
break; /* lines before failure still good */
}
lines = tmp; /* assign reallocated block to lines */
allocated *= 2; /* update no. of allocated pointers */
}
buffer[(len = strcspn (buffer, "\n"))] = 0; /* trim \n, save length */
lines[used] = malloc (len + 1); /* allocate storage for line */
if (!lines[used]) { /* validate EVERY allocation */
perror ("malloc-lines[used]");
break;
}
memcpy (lines[used], buffer, len + 1); /* copy buffer to lines[used] */
used++; /* increment used no. of pointers */
}
*n = used; /* update value at address provided by n */
/* can do final realloc() here to resize exactly to used no. of pointers */
return lines; /* return pointer to allocated block of pointers */
}
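The comment near the end of readfile() mentions an optional final realloc() to trim the block to the number of pointers actually used. One possible way to write that step is sketched here. The helper name shrink_lines is hypothetical, and whether the saving is worth the extra call is a judgment call.

```c
#include <stdlib.h>

/* Shrink the pointer block to exactly 'used' pointers.
 * Returns the (possibly moved) block. If realloc() fails, the original
 * block is still valid and is returned unchanged.
 */
char **shrink_lines (char **lines, size_t used)
{
    if (used == 0)              /* avoid a zero-size realloc */
        return lines;
    void *tmp = realloc (lines, used * sizeof *lines);
    return tmp ? tmp : lines;
}
```

Inside readfile() this would amount to something like lines = shrink_lines (lines, used); just before the return.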
In main(), you simply pass your file pointer and the address of a size_t variable and check the return before iterating through the pointers making whatever use of the line you need (they are simply printed below), e.g.
int main (int argc, char **argv) {
char **lines; /* pointer to allocated block of pointers and lines */
size_t n; /* number of lines read */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
lines = readfile (fp, &n);
if (fp != stdin) /* close file if not stdin */
fclose (fp);
if (!lines) { /* validate readfile() return */
fputs ("error: no lines read from file.\n", stderr);
return 1;
}
for (size_t i = 0; i < n; i++) { /* loop outputting all lines read */
puts (lines[i]);
free (lines[i]); /* don't forget to free lines */
}
free (lines); /* and free pointers */
return 0;
}
(note: don't forget to free the memory you allocated when you are done. That becomes critical when you are calling functions that allocate within other functions. In main(), the memory will be automatically released on exit, but build good habits.)
Example Use/Output
$ ./bin/readfile_allocate dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
The program will read any file, no matter if it has 4 lines or 400,000 lines, up to the physical limit of your system memory (increase MAXC if a line, including its '\n', can be longer than 1023 characters; fgets() would otherwise split such a line across several entries).
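If you would rather detect an over-long line than silently split it, you can check whether fgets() filled the buffer without reaching a '\n'. The helper below is a hedged sketch of that check, and the name line_truncated is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf (filled by fgets() with the given size) is completely
 * full and does not end in '\n', meaning the rest of the line, if any,
 * is still unread. Returns 0 otherwise.
 */
int line_truncated (const char *buf, size_t size)
{
    size_t len = strlen (buf);
    return len > 0 && len == size - 1 && buf[len - 1] != '\n';
}
```

In readfile() the call would be line_truncated (buffer, MAXC), made before the '\n' is trimmed. Note that a final line that exactly fills the buffer and has no '\n' is reported the same way, so the caller would still need one more read to tell the two cases apart.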
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/readfile_allocate dat/captnjack.txt
==4801== Memcheck, a memory error detector
==4801== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4801== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4801== Command: ./bin/readfile_allocate dat/captnjack.txt
==4801==
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
==4801==
==4801== HEAP SUMMARY:
==4801== in use at exit: 0 bytes in 0 blocks
==4801== total heap usage: 10 allocs, 10 frees, 5,804 bytes allocated
==4801==
==4801== All heap blocks were freed -- no leaks are possible
==4801==
==4801== For counts of detected and suppressed errors, rerun with: -v
==4801== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
The full code used for the example simply includes the headers, but is included below for completeness:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
#define NPTR 1 /* initial no. of pointers to allocate */
/* readfile reads all lines from fp, updating the value at the address
* provided by 'n'. On success returns pointer to allocated block of pointers
* with each of *n pointers holding the address of an allocated block of
* memory containing a line from the file. On allocation failure, the number
* of lines successfully read prior to failure is returned. Caller is
* responsible for freeing all memory when done with it.
*/
char **readfile (FILE *fp, size_t *n)
{
char buffer[MAXC], **lines; /* buffer to hold each line, pointer */
size_t allocated = NPTR, used = 0; /* allocated and used pointers */
lines = malloc (allocated * sizeof *lines); /* allocate initial pointer(s) */
if (lines == NULL) { /* validate EVERY allocation */
perror ("malloc-lines");
return NULL;
}
while (fgets (buffer, MAXC, fp)) { /* read each line from file */
size_t len; /* variable to hold line-length */
if (used == allocated) { /* is pointer reallocation needed */
/* always realloc to a temporary pointer to avoid memory leak if
* realloc fails returning NULL.
*/
void *tmp = realloc (lines, 2 * allocated * sizeof *lines);
if (!tmp) { /* validate EVERY reallocation */
perror ("realloc-lines");
break; /* lines before failure still good */
}
lines = tmp; /* assign reallocated block to lines */
allocated *= 2; /* update no. of allocated pointers */
}
buffer[(len = strcspn (buffer, "\n"))] = 0; /* trim \n, save length */
lines[used] = malloc (len + 1); /* allocate storage for line */
if (!lines[used]) { /* validate EVERY allocation */
perror ("malloc-lines[used]");
break;
}
memcpy (lines[used], buffer, len + 1); /* copy buffer to lines[used] */
used++; /* increment used no. of pointers */
}
*n = used; /* update value at address provided by n */
/* can do final realloc() here to resize exactly to used no. of pointers */
return lines; /* return pointer to allocated block of pointers */
}
int main (int argc, char **argv) {
char **lines; /* pointer to allocated block of pointers and lines */
size_t n; /* number of lines read */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
lines = readfile (fp, &n);
if (fp != stdin) /* close file if not stdin */
fclose (fp);
if (!lines) { /* validate readfile() return */
fputs ("error: no lines read from file.\n", stderr);
return 1;
}
for (size_t i = 0; i < n; i++) { /* loop outputting all lines read */
puts (lines[i]);
free (lines[i]); /* don't forget to free lines */
}
free (lines); /* and free pointers */
return 0;
}
Let me know if you have further questions.
If you want to read a file through to its last line, you can simply use getc(). It returns EOF when the end of the file is reached or when a read fails. So once getc() has returned EOF, call feof(), which returns a non-zero value if the end of the file was reached and 0 otherwise, to tell the two cases apart.
Example :-
#include <stdio.h>
int main(void)
{
    FILE *fp = fopen("story.txt", "r");
    if (fp == NULL) {   /* validate file open for reading */
        perror("fopen");
        return 1;
    }
    int ch = getc(fp);
    while (ch != EOF)
    {
        /* display contents of file on screen */
        putchar(ch);
        ch = getc(fp);
    }
    if (feof(fp))
        printf("\n End of file reached.");
    else
        printf("\nError while reading!");
    fclose(fp);
    getchar();
    return 0;
}
You can also use fgets(); below is an example :-
#include <stdio.h>
#include <string.h>
#define MAX_LEN 256
int main(void)
{
    FILE *fp = fopen("story.txt", "r");
    if (fp == NULL) {
        perror("Failed: "); //prints a descriptive error message to stderr.
        return 1;
    }
    char buffer[MAX_LEN];
    // fgets() stores at most MAX_LEN - 1 characters plus the nul terminator,
    // so the full buffer size can be passed
    while (fgets(buffer, MAX_LEN, fp))
    {
        // Remove trailing newline
        buffer[strcspn(buffer, "\n")] = 0;
        printf("%s\n", buffer);
    }
    fclose(fp);
    return 0;
}
alternatively,
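#include <stdio.h>
#define MAX_LEN 255
int main(void) {
    char buffer[MAX_LEN];
    FILE *fp = fopen("story.txt", "r");
    if (fp == NULL) {   /* validate file open for reading */
        perror("fopen");
        return 1;
    }
    while (fgets(buffer, MAX_LEN, fp)) {
        /* fgets() keeps the '\n', so do not print another one */
        printf("%s", buffer);
    }
    fclose(fp);
    return 0;
}
For more details, refer to C read file line by line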
Your approach is wrong and it's very likely that it will generate an endless loop. I'll explain why using the original code and inline comments:
char *freadline(FILE *fp){
int i;
// This part attempts to count the number of characters
// in the whole file by reading char-by-char until EOF is set
for(i = 0; !feof(fp); i++){
getc(fp);
}
// Here EOF is set
// This returns to the start and clears EOF
fseek(fp, 0, SEEK_SET);
// Here EOF is cleared
char *pBuffer = (char*)malloc(sizeof(char)*i);
// Here you read a line, i.e. you read characters until (and including) the
// first newline in the file.
pBuffer = fgets(pBuffer, i, fp);
// Here EOF is still cleared as you only read the first line of the file
return pBuffer;
}
So in main when you do
while(!feof(fp)){
...
}
you have an endless loop as feof is false. Your program will print the same line again and again and you have memory leaks as you never call free(pInput).
So you need to redesign your code. Read what fgets does, e.g. here https://man7.org/linux/man-pages/man3/fgets.3p.html
A number of issues to address:
Using fgets does not guarantee that you have read a whole line when the function returns. So if you really want to check whether you've read a complete line, check the length of the returned string, and also check for the presence of a new-line character at the end of the string.
Your use of fseek is interesting here because it tells the stream to go back to the start of the file and start reading from there. This means that every call to freadline after the first one still starts reading at the first byte of the file, so you get the first line each time.
Lastly, your program is hoarding memory like a greedy baby! You never free any of those allocations you did!
With that being said, here is an improved freadline implementation:
char *freadline(FILE *fp) {
    /* initializations */
    char buf[BUFSIZ + 1];
    char *pBuffer;
    size_t size = 0, tmp_size;
    /* fgets returns NULL when it reaches EOF, so our loop is conditional
     * on that
     */
    while (fgets (buf, BUFSIZ + 1, fp) != NULL) {
        tmp_size = strlen (buf);
        size += tmp_size;
        /* stop if the buffer was not filled, or if its last character
         * (index BUFSIZ - 1) is the newline that ends the line
         */
        if (tmp_size != BUFSIZ || buf[BUFSIZ - 1] == '\n')
            break;
    }
    /* after breaking from loop, check that size is not zero.
     * this should only happen if we reach EOF, so return NULL
     */
    if (size == 0)
        return NULL;
    /* Allocate memory for the line plus one extra for the null byte */
    pBuffer = malloc (size + 1);
    if (pBuffer == NULL)    /* validate the allocation */
        return NULL;
    /* reads the contents of the file into pBuffer */
    if (size <= BUFSIZ) {
        /* Optimization: use memcpy rather than reading
         * from disk if the line is small enough
         */
        memcpy (pBuffer, buf, size);
    } else {
        fseek (fp, ftell(fp) - size, SEEK_SET);
        fread (pBuffer, 1, size, fp);
    }
    pBuffer[size] = '\0'; /* set the null terminator byte */
    return pBuffer; /* remember to free () this when you are done! */
}
This way there is no need for the additional call to feof (which is often hit and miss); the code instead relies on what fgets returns to determine if we have reached the end of file.
With these changes, it should be enough to change main to:
int main() {
    FILE *fp = fopen("story.txt", "r");
    if(fp == NULL){
        printf("Error!");
    }else{
        char *pInput;
        /* here we just keep reading until a NULL string is returned */
        for (pInput = freadline(fp); pInput != NULL; pInput = freadline(fp)) {
            printf("%s", pInput); // output
            free (pInput);
        }
        fclose(fp);
    }
    return 0;
}

Dynamically allocated string in dynamic structure array(seg fault)

I want to read the entire file (line by line) into the char pointer "name" in the struct array. (I want to keep the names, which can be of arbitrary length, in a dynamically allocated string.) Then I will divide the string read (name) into chunks (age, name, score) in the struct. I get a seg fault. The file format is:
age name score
25,Rameiro Rodriguez,3
30,Anatoliy Stephanos,0
19,Vahan: Bohuslav,4.2
struct try{
double age;
char *name;
double score;
};
void allocate_struct_array(struct try **parr,int total_line);
int main(){
int count=0,i=0;
char ch;
fileptr = fopen("book.txt", "r");
//total line in the file is calculated
struct try *parr;
allocate_struct_array(&parr,count_lines);
//i got segmentation fault at below.(parsing code is not writed yet just trying to read the file)
while((ch=fgetc(fileptr))!=EOF) {
count++;
if(ch=='\n'){
parr->name=malloc(sizeof(char*)*count+1);
parr[i].name[count+1]='\0';
parr+=1;
count=0;
}
}
fclose(fileptr);
}
void allocate_struct_array(struct try **parr,int total_line){
*parr = malloc(total_line * sizeof(struct try));
}
Continuing from my comment, in allocate_struct_array(struct try **parr,int total_line), you allocate a block of struct try, not a block of pointers (e.g. struct try*). Your allocation parr->name=malloc(sizeof(char*)*count+1); sizes the block in pointers (count * sizeof(char*) + 1 bytes) when what you need is count + 1 characters. Moreover, on each iteration parr+=1 changes the only pointer you hold to the start of the block, creating a memory leak because the original address is lost and the block cannot be freed.
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
A better approach to your problem is to read each line into a simple character array (of sufficient size to hold each line). You can then separate age, name and score and determine the number of characters in name, so you can properly allocate for parr[i].name and then copy the name after you have allocated. If you are careful about it, you can simply locate both ',' in the buffer, allocate for parr[i].name and then use sscanf() with a proper format-string to separate, convert and copy all values to your struct parr[i] in a single call.
Since you have given no way to determine how //total line in the file is calculated, we will just presume a number large enough to accommodate your example file for purposes of discussion. Finding that number is left to you.
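If you do want to compute that number rather than presume it, one common approach is a first pass that counts newline characters and then rewinds. The sketch below is an assumption about how you might do it, not what the original program did: the name count_file_lines is hypothetical, and it only works on a seekable stream (a regular file, not a pipe).

```c
#include <stdio.h>

/* Count the lines in fp, then rewind so the caller can read from the
 * start again. A final line with no trailing '\n' is still counted.
 */
size_t count_file_lines (FILE *fp)
{
    size_t lines = 0;
    int ch, last = '\n';            /* ch must be int to hold EOF */

    while ((ch = fgetc (fp)) != EOF) {
        if (ch == '\n')
            lines++;
        last = ch;
    }
    if (last != '\n')               /* unterminated final line */
        lines++;
    rewind (fp);
    return lines;
}
```

Keep in mind the count includes the header line, so the number of data records is one less for a file in your format.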
To read each line into an array, simply declare a buffer (character array) large enough to hold each line. Take your longest expected line and multiply by 2 or 4, or if on a typical PC, just use a buffer of 1024 or 2048 bytes, which will accommodate all but the obscure file with lines longer than that. (Rule: Don't Skimp On Buffer Size!!) You can do that with, e.g.
#define COUNTLINES 10 /* if you need a constant, #define one (or more) */
#define MAXC 1024
#define NUMSZ 64
...
int main (int argc, char **argv) {
char buf[MAXC]; /* temporary array to hold each line */
...
When reading until '\n' or EOF in a loop, it is easier to loop continually and check for EOF within the loop. That way the final line is handled as a normal part of your read loop and you don't need a special final code block to handle the last line, e.g.
while (nparr < count_lines) { /* protect your allocation bounds */
int ch = fgetc (fileptr); /* ch must be type int */
if (ch != '\n' && ch != EOF) { /* if not \n and not EOF */
...
}
else if (count) { /* only process buf if chars present */
...
}
if (ch == EOF) { /* if EOF, now break */
break;
}
}
(note: for your example we have continued to read with the fgetc() you used, but in normal practice you would simply use fgets() to fill the character array with the line)
To find the first and last ',' in the array, you can simply #include <string.h> and use strchr() to find the first and strrchr() to find the last. Using a pointer and end-pointer set to the first and last ',' the number of characters in name becomes ep - p - 1;. You can find the ','s and find the length of name with:
char *p = buf, *ep; /* pointer & end-pointer */
...
/* locate 1st ',' with p and last ',' with ep */
if ((p = strchr (buf, ',')) && (ep = strrchr (buf, ',')) &&
p != ep) { /* confirm pointers don't point to same ',' */
size_t len = ep - p - 1; /* get length of name */
Once you have found the first ',' and second ',' and determined the number of characters in name, you allocate characters, not pointers, e.g. with len characters in name and nparr as the struct index (instead of your i) you would do:
parr[nparr].name = malloc (len + 1); /* allocate */
if (!parr[nparr].name) { /* validate */
perror ("malloc-parr[nparr].name");
break;
}
(note: you break instead of exit on allocation error as all prior structs allocated for and filled will still contain valid data that you can use)
Now you can craft a sscanf() format string and separate age, name and score in a single call, e.g.
/* separate buf & convert into age, name, score -- validate */
if (sscanf (buf, "%d,%[^,],%lf", &parr[nparr].age,
parr[nparr].name, &parr[nparr].score) != 3) {
fputs ("error: invalid line format.\n", stderr);
...
}
Putting it all together into a short program to read and separate your example file, you could do:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define COUNTLINES 10 /* if you need a constant, #define one (or more) */
#define MAXC 1024
#define NUMSZ 64
typedef struct { /* typedef for convenient use as type */
int age; /* age is generally an integer, not double */
char *name;
double score;
} try;
/* always provide a meaningful return when function can
* succeed or fail. Return result of malloc.
*/
try *allocate_struct_array (try **parr, int total_line)
{
return *parr = malloc (total_line * sizeof **parr);
}
int main (int argc, char **argv) {
char buf[MAXC]; /* temporary array to hold each line */
int count = 0,
nparr = 0,
count_lines = COUNTLINES;
try *parr = NULL;
/* use filename provided as 1st argument (book.txt by default) */
FILE *fileptr = fopen (argc > 1 ? argv[1] : "book.txt", "r");
if (!fileptr) { /* always validate file open for reading */
perror ("fopen-fileptr");
return 1;
}
if (!fgets (buf, MAXC, fileptr)) { /* read/discard header line */
fputs ("file-empty\n", stderr);
return 1;
}
/* validate every allocation */
if (allocate_struct_array (&parr, count_lines) == NULL) {
perror ("malloc-parr");
return 1;
}
while (nparr < count_lines) { /* protect your allocation bounds */
int ch = fgetc (fileptr); /* ch must be type int */
if (ch != '\n' && ch != EOF) { /* if not \n and not EOF */
buf[count++] = ch; /* add char to buf */
if (count + 1 == MAXC) { /* validate buf not full */
fputs ("error: line too long.\n", stderr);
count = 0;
continue;
}
}
else if (count) { /* only process buf if chars present */
char *p = buf, *ep; /* pointer & end-pointer */
buf[count] = 0; /* nul-terminate buf */
/* locate 1st ',' with p and last ',' with ep */
if ((p = strchr (buf, ',')) && (ep = strrchr (buf, ',')) &&
p != ep) { /* confirm pointers don't point to same ',' */
size_t len = ep - p - 1; /* get length of name */
parr[nparr].name = malloc (len + 1); /* allocate */
if (!parr[nparr].name) { /* validate */
perror ("malloc-parr[nparr].name");
break;
}
/* separate buf & convert into age, name, score -- validate */
if (sscanf (buf, "%d,%[^,],%lf", &parr[nparr].age,
parr[nparr].name, &parr[nparr].score) != 3) {
fputs ("error: invalid line format.\n", stderr);
free (parr[nparr].name); /* release name allocated for rejected line */
if (ch == EOF) /* if at EOF on failure */
break; /* break read loop */
else {
count = 0; /* otherwise reset count */
continue; /* start read of next line */
}
}
nparr += 1; /* increment array index only for a stored line */
}
count=0; /* reset count zero */
}
if (ch == EOF) { /* if EOF, now break */
break;
}
}
fclose(fileptr); /* close file */
for (int i = 0; i < nparr; i++) {
printf ("%3d %-20s %5.1lf\n",
parr[i].age, parr[i].name, parr[i].score);
free (parr[i].name); /* free strings when done */
}
free (parr); /* free structs */
}
(note: Never Hardcode Filenames or use Magic-Numbers in your code. If you need a constant, #define ... one. Pass the filename to read as the first argument to your program or take the filename as input. You shouldn't have to recompile your code just to read from a different filename)
Example Use/Output
With your example data in dat/parr_name.txt, you would have:
$ ./bin/parr_name dat/parr_name.txt
25 Rameiro Rodriguez 3.0
30 Anatoliy Stephanos 0.0
19 Vahan: Bohuslav 4.2
Memory Use/Error Check
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/parr_name dat/parr_name.txt
==17385== Memcheck, a memory error detector
==17385== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==17385== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==17385== Command: ./bin/parr_name dat/parr_name.txt
==17385==
25 Rameiro Rodriguez 3.0
30 Anatoliy Stephanos 0.0
19 Vahan: Bohuslav 4.2
==17385==
==17385== HEAP SUMMARY:
==17385== in use at exit: 0 bytes in 0 blocks
==17385== total heap usage: 7 allocs, 7 frees, 5,965 bytes allocated
==17385==
==17385== All heap blocks were freed -- no leaks are possible
==17385==
==17385== For counts of detected and suppressed errors, rerun with: -v
==17385== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Using fgets() To Read Each Line And A Temp Array For name
To not leave you with the wrong impression, this problem can be simplified substantially by reading each line into a character array using fgets() and separating the needed values with sscanf(), saving name into a temporary array of sufficient size. Now all that is needed is to allocate for parr[nparr].name and then copy the temporary name to parr[nparr].name.
By doing it this way you substantially reduce the complexity of reading character-by-character and by using a temporary array for name, you eliminate having to locate the ',' in order to obtain the length of the name.
The only changes needed are to add a new constant for the temporary name array and then you can replace the entire read-loop with:
#define NAMSZ 256
...
/* protect memory bounds, read each line into buf */
while (nparr < count_lines && fgets (buf, MAXC, fileptr)) {
char name[NAMSZ]; /* temporary array for name */
size_t len; /* length of name */
/* separate buf into age, temp name, score & validate */
if (sscanf (buf, "%d,%[^,],%lf", &parr[nparr].age, name,
&parr[nparr].score) != 3) {
fputs ("error: invalid line format.\n", stderr);
continue;
}
len = strlen (name); /* get length of name */
parr[nparr].name = malloc (len + 1); /* allocate for name */
if (!parr[nparr].name) { /* validate allocation */
perror ("malloc-parr[nparr].name");
break;
}
memcpy (parr[nparr].name, name, len + 1);
nparr += 1;
}
fclose(fileptr); /* close file */
...
(same output and same memory check)
Also note you can allocate and copy as a single operation if your compiler provides strdup(). That would reduce the allocation and copy of name to a single call, e.g.
parr[nparr].name = strdup (name);
Since strdup() allocates memory (and can fail), you must validate the allocation just as you would if you were using malloc() and memcpy(). But understand that strdup() was not part of standard C before C23. It is a POSIX function, so it may be unavailable with a compiler or library that does not provide the POSIX functions.
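If strdup() is not available, the same allocate-and-copy step can be wrapped in a small helper built only from standard C. The following is a minimal sketch; the name dupstr is hypothetical and not part of any library:

```c
#include <stdlib.h>
#include <string.h>

/* dupstr is a hypothetical name: a portable stand-in for strdup()
 * built only from malloc() and memcpy().
 * Returns a newly allocated copy of s, or NULL if allocation fails.
 * The caller owns the copy and must free() it. */
char *dupstr (const char *s)
{
    size_t len = strlen (s);            /* length without the nul byte */
    char *copy = malloc (len + 1);      /* +1 for the nul terminator */

    if (!copy)                          /* validate the allocation */
        return NULL;

    memcpy (copy, s, len + 1);          /* copy including the nul byte */
    return copy;
}
```

It would be used exactly like strdup(), e.g. parr[nparr].name = dupstr (name); followed by the same NULL check.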
The other improvement you can make is adding logic to call realloc() when your block of structs (parr) is full. That way you can start with some reasonably anticipated number of structs and then reallocate more whenever you run out. This will eliminate the artificial limit on the number of lines you can store -- and remove the need to know count_lines. (there are numerous examples on this site of how to use realloc(); the implementation is left to you)
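For orientation only, one possible shape for that reallocation logic is sketched below. The struct layout is assumed from the age, name and score members used above, and ensure_room is a hypothetical helper name:

```c
#include <stdlib.h>

/* Assumed layout, mirroring the age/name/score members used above. */
typedef struct {
    int age;
    char *name;
    double score;
} person;

/* ensure_room is a hypothetical helper: it makes sure (*arr)[used] is a
 * valid element, doubling the allocation when the block is full.
 * *cap is the number of elements currently allocated.
 * Returns 1 on success; on failure returns 0 and leaves *arr intact. */
int ensure_room (person **arr, size_t used, size_t *cap)
{
    if (used < *cap)                    /* still room, nothing to do */
        return 1;

    size_t newcap = *cap ? *cap * 2 : 2;            /* start at 2, then double */
    void *tmp = realloc (*arr, newcap * sizeof **arr);

    if (!tmp)                           /* old block is still valid */
        return 0;

    *arr = tmp;
    *cap = newcap;
    return 1;
}
```

With parr initialized to NULL and the capacity to 0, the read loop condition could then become ensure_room (&parr, nparr, &cap) && fgets (buf, MAXC, fileptr) in place of the nparr < count_lines test.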
Look things over and let me know if you have further questions.

Using fgets with realloc()

I'm trying to create a function to read a single line from a file of text using fgets() and store it in a dynamically allocated char* using malloc(), but I am unsure how to use realloc() since I do not know the length of this single line of text and do not want to just guess a magic number for the maximum size that this line could possibly be.
#include "stdio.h"
#include "stdlib.h"
#define INIT_SIZE 50
void get_line (char* filename)
{
char* text;
FILE* file = fopen(filename,"r");
text = malloc(sizeof(char) * INIT_SIZE);
fgets(text, INIT_SIZE, file);
//How do I realloc memory here if the text array is full but fgets
//has not reached an EOF or \n yet.
printf("The text was %s\n", text);
free(text);
}
int main(int argc, char *argv[]) {
get_line(argv[1]);
}
I am planning on doing other things with the line of text but for sake of keeping this simple, I have just printed it and then freed the memory.
Also: The main function is initiated by using the filename as the first command line argument.
The getline function is what you are looking for.
Use it like this:
char *line = NULL;
size_t n = 0;
getline(&line, &n, stdin);
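A slightly fuller sketch of the same call, reading every line of a stream in a loop and releasing the buffer once at the end. The function name longest_line is hypothetical and only serves to give the loop something to compute:

```c
#define _POSIX_C_SOURCE 200809L   /* expose getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* longest_line is a hypothetical example function: it reads every line
 * from fp with getline() and returns the length of the longest one
 * (not counting the trailing '\n'). */
size_t longest_line (FILE *fp)
{
    char *line = NULL;      /* NULL with n = 0 lets getline allocate */
    size_t n = 0;           /* size of the buffer getline manages */
    ssize_t nchr;           /* characters read, -1 on EOF or error */
    size_t longest = 0;

    while ((nchr = getline (&line, &n, fp)) != -1) {
        if (nchr > 0 && line[nchr - 1] == '\n')   /* trim the newline */
            line[--nchr] = '\0';
        if ((size_t)nchr > longest)
            longest = (size_t)nchr;
    }
    free (line);            /* one free for the single reused buffer */
    return longest;
}
```

Because getline reuses and resizes the same buffer on each call, any line that must outlive the loop iteration has to be copied before the next call.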
If you really want to implement this function yourself, you can write something like this:
#include <stdlib.h>
#include <stdio.h>
char *get_line(void)
{
int c;
/* what is the buffer current size? */
size_t size = 5;
/* How much is the buffer filled? */
size_t read_size = 0;
/* first allocation, its result is tested below */
char *line = malloc(size);
if (!line)
{
perror("malloc");
return line;
}
line[0] = '\0';
c = fgetc(stdin);
while (c != EOF && c!= '\n')
{
line[read_size] = c;
++read_size;
if (read_size == size)
{
size += 5;
char *test = realloc(line, size);
if (!test)
{
perror("realloc");
/* buffer is full: terminate it (the last character is dropped) */
line[read_size - 1] = '\0';
return line;
}
line = test;
}
c = fgetc(stdin);
}
line[read_size] = '\0';
return line;
}
One possible solution is to use two buffers: One temporary that you use when calling fgets; And one that you reallocate, and append the temporary buffer to.
Perhaps something like this:
char temp[INIT_SIZE]; // Temporary string for fgets call
char *text = NULL; // The actual and full string
size_t length = 0; // Current length of the full string, needed for reallocation
while (fgets(temp, sizeof temp, file) != NULL)
{
// Reallocate
char *t = realloc(text, length + strlen(temp) + 1); // +1 for terminator
if (t == NULL)
{
// TODO: Handle error
break;
}
if (text == NULL)
{
// First allocation, make sure string is properly terminated for concatenation
t[0] = '\0';
}
text = t;
// Append the newly read string
strcat(text, temp);
// Get current length of the string
length = strlen(text);
// If the last character just read is a newline, we have the whole line
if (length > 0 && text[length - 1] == '\n')
{
break;
}
}
[Disclaimer: The code above is untested and may contain bugs]
With the declaration of void get_line (char* filename), you can never make use of the line you read and store outside of the get_line function because you do not return a pointer to line and do not pass the address of any pointer that could serve to make any allocation and read visible back in the calling function.
A good model (showing return type and useful parameters) for any function to read an unknown number of characters into a single buffer is always POSIX getline. You can implement your own using either fgetc or fgets and a fixed buffer. Efficiency favors the use of fgets only to the extent it would minimize the number of realloc calls needed. (both functions will share the same low-level input buffer size, e.g. see gcc source IO_BUFSIZ constant -- which if I recall is now LIO_BUFSIZE after a recent name change, but basically boils down to an 8192 byte IO buffer on Linux and 512 bytes on Windows)
So long as you dynamically allocate the original buffer (either using malloc, calloc or realloc), you can read continually with a fixed buffer using fgets adding the characters read into the fixed buffer to your allocated line and checking whether the final character is '\n' or EOF to determine when you are done. Simply read a fixed buffer worth of chars with fgets each iteration and realloc your line as you go, appending the new characters to the end.
When reallocating, always realloc using a temporary pointer. That way, if you run out of memory and realloc returns NULL (or fails for any other reason), you won't overwrite the pointer to your currently allocated block with NULL creating a memory leak.
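The temporary-pointer rule, combined with appending a chunk to an allocated line, can be sketched as follows. The helper name append_chunk is hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* append_chunk is a hypothetical helper: it appends the string 'chunk'
 * to the allocated string *line, whose current length is *len.
 * *line may be NULL (with *len == 0) on the first call.
 * Returns 1 on success; on failure returns 0 and *line is unchanged. */
int append_chunk (char **line, size_t *len, const char *chunk)
{
    size_t add = strlen (chunk);
    void *tmp = realloc (*line, *len + add + 1);   /* temporary pointer */

    if (!tmp)               /* *line still points to the old block */
        return 0;

    *line = tmp;            /* only now replace the original pointer */
    memcpy (*line + *len, chunk, add + 1);   /* copy chunk with nul byte */
    *len += add;
    return 1;
}
```

Had the result of realloc been assigned directly to *line, a failure would have replaced the only pointer to the existing block with NULL, making it impossible to free or use the data already read.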
A flexible implementation could look like the following. It sizes the fixed buffer as a VLA, using the size the caller passes in *n or, if the caller passes 0, the defined SZINIT (falling back to 120). Storage for line (passed as a pointer to pointer to char) is allocated and then reallocated as required. It returns the number of characters read on success or -1 on failure (the same as POSIX getline does):
/** fgetline, a getline replacement with fgets, using fixed buffer.
* fgetline reads from 'fp' up to and including a newline (or EOF)
* allocating for 'line' as required, initially allocating 'n' bytes.
* on success, the number of characters in 'line' is returned, -1
* otherwise
*/
ssize_t fgetline (char **line, size_t *n, FILE *fp)
{
if (!line || !n || !fp) return -1;
#ifdef SZINIT
size_t szinit = SZINIT > 0 ? SZINIT : 120;
#else
size_t szinit = 120;
#endif
size_t idx = 0, /* index for *line */
maxc = *n ? *n : szinit, /* fixed buffer size */
eol = 0, /* end-of-line flag */
nc = 0; /* number of characters read */
char buf[maxc]; /* VLA to use a fixed buffer (or allocate ) */
clearerr (fp); /* prepare fp for reading */
while (fgets (buf, maxc, fp)) { /* continually read maxc chunks */
nc = strlen (buf); /* number of characters read */
if (idx && *buf == '\n') /* if index & '\n' 1st char */
break;
if (nc && (buf[nc - 1] == '\n')) { /* test '\n' in buf */
buf[--nc] = 0; /* trim and set eol flag */
eol = 1;
}
/* always realloc with a temporary pointer */
void *tmp = realloc (*line, idx + nc + 1);
if (!tmp) /* on failure previous data remains in *line */
return idx ? (ssize_t)idx : -1;
*line = tmp; /* assign realloced block to *line */
memcpy (*line + idx, buf, nc + 1); /* append buf to line */
idx += nc; /* update index */
if (eol) /* if '\n' (eol flag set) done */
break;
}
/* if eol alone, or stream error, return -1, else length of buf */
return (feof (fp) && !nc) || ferror (fp) ? -1 : (ssize_t)idx;
}
(note: since nc already holds the current number of characters in buf, memcpy can be used to append the contents of buf to *line without scanning for the terminating nul-character again) Look it over and let me know if you have further questions.
Essentially you can use it as a drop-in replacement for POSIX getline (though it will not be quite as efficient -- but it isn't bad either)

C - cannot read and process a list of strings from a text file into an array

This code reads a text file line by line. I need to put those lines in an array, but I wasn't able to do it. Now I am getting an array of numbers somehow. So how do I read the file into a list? I tried using a 2-dimensional list but that doesn't work either.
I am new to C. I am mostly using Python but now I want to check if C is faster or not for a task.
#include <stdio.h>
#include <time.h>
#include <string.h>
void loadlist(char *ptext) {
char filename[] = "Z://list.txt";
char myline[200];
FILE * pfile;
pfile = fopen (filename, "r" );
char larray[100000];
int i = 0;
while (!feof(pfile)) {
fgets(myline,200,pfile);
larray[i]= myline;
//strcpy(larray[i],myline);
i++;
//printf(myline);
}
fclose(pfile);
printf("%s \n %d \n %d \n ","while doneqa",i,strlen(larray));
printf("First larray element is: %d \n",larray[0]);
/* for loop execution */
//for( i = 10; i < 20; i = i + 1 ){
// printf(larray[i]);
//}
}
int main ()
{
time_t stime, etime;
printf("Starting of the program...\n");
time(&stime);
char *ptext = "String";
loadlist(ptext);
time(&etime);
printf("time to load: %f \n", difftime(etime, stime));
return(0);
}
This code reads a text file line by line. But I need to put those lines in an array but I wasn't able to do it. Now I am getting an array of numbers somehow.
There are many ways to do this correctly. To begin with, first sort out what it is you actually need/want to store, then figure out where that information will come from and finally decide how you will provide storage for the information. In your case loadlist is apparently intended to load a list of lines (up to 10000) so that they are accessible through your statically declared array of pointers. (you can also allocate the pointers dynamically, but if you know you won't need more than X of them, statically declaring them is fine, up to the point you cause a stack overflow...)
Once you read the line in loadlist, then you need to provide adequate storage to hold the line (plus the nul-terminating character). Otherwise, you are just counting the number of lines. In your case, since you declare an array of pointers, you cannot simply copy the line you read because each of the pointers in your array does not yet point to any allocated block of memory. (you can't assign the address of the buffer you read the line into with fgets (buffer, size, FILE*) because (1) it is local to your loadlist function and it will go away when the function stack frame is destroyed on function return; and (2) obviously it gets overwritten with each call to fgets anyway.)
So what to do? That's pretty simple too, just allocate storage for each line as it is read using the strlen of each line as @iharob says (+1 for the nul-byte) and then malloc to allocate a block of memory that size. You can then simply copy the read buffer to the block of memory created and assign the pointer to your list (e.g. larray[x] in your code). POSIX provides a strdup function (also available in glibc) that both allocates and copies, but understand that it is not part of the C99 standard, so you can run into portability issues. (also note you would use memmove rather than strcpy if overlapping regions of memory were a concern, but we will ignore that for now since you are reading lines from a file)
What are the rules for allocating memory? Well, you allocate with malloc, calloc or realloc and then you VALIDATE that your call to those functions succeeded before proceeding or you have just entered the realm of undefined behavior by writing to areas of memory that are NOT in fact allocated for your use. What does that look like? If you have your array of pointers p and you want to store a string from your read buffer buf of length len at index idx, you could simply do:
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* handle error condition */
Now you are free to allocate before you test as follows, but it is convenient to make the assignment as part of the test. The long form would be:
p[idx] = malloc (len + 1); /* allocate storage */
if (p[idx] == NULL) /* validate/handle error condition */
return NULL;
strcpy (p[idx], buf); /* copy buf to storage */
How you want to do it is up to you.
Now you also need to protect against reading beyond the end of your pointer array. (you only have a fixed number since you declared the array statically). You can make that check part of your read loop very easily. If you have declared a constant for the number of pointers you have (e.g. PTRMAX), you can do:
int idx = 0; /* index */
while (idx < PTRMAX && fgets (buf, LNMAX, fp)) { /* test idx first so no line is read and then dropped */
...
idx++;
}
By checking the index against the number of pointers available, you ensure you cannot attempt to assign an address to more pointers than you have.
There is also the unaddressed issue of handling the '\n' that will be contained at the end of your read buffer. Recall, fgets reads up to and including the '\n'. You do not want newline characters dangling off the ends of the strings you store, so you simply overwrite the '\n' with a nul-terminating character (e.g. simply decimal 0 or the equivalent nul-character '\0' -- your choice). You can make that a simple test after your strlen call, e.g.
while (idx < PTRMAX && fgets (buf, LNMAX, fp)) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
...
(note: that also brings up the issue of reading a line longer than the 200 characters you allocate for your read buffer. You check for whether a complete line has been read by checking whether fgets included the '\n' at the end, if it didn't, you know your next call to fgets will be reading again from the same line, unless EOF is encountered. In that case you would simply need to realloc your storage and append any additional characters to that same line -- that is left for future discussion)
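The newline test and the "was the whole line read?" check can be folded into one small helper. This is only a sketch using strcspn from the standard library; the name trim_newline is hypothetical:

```c
#include <string.h>

/* trim_newline is a hypothetical helper: it overwrites the first '\n'
 * in buf (if any) with the nul byte and stores the resulting length
 * in *len. Returns 1 if a newline was found (a complete line was read)
 * and 0 if not (line longer than the buffer, or a final line that
 * lacks a '\n'). */
int trim_newline (char *buf, size_t *len)
{
    *len = strcspn (buf, "\n");     /* chars before '\n' or the nul byte */

    if (buf[*len] == '\n') {
        buf[*len] = '\0';
        return 1;
    }
    return 0;
}
```

A return of 0 while more input remains is the signal that the next call to fgets will continue reading the same line. Unlike indexing buf[len-1], this form is also safe when the buffer holds an empty string.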
If you put all the pieces together and choose a return type for loadlist that can indicate success/failure, you could do something similar to the following:
/** read up to PTRMAX lines from 'fp', allocate/save in 'p'.
* storage is allocated for each line read and pointer
* to allocated block is stored at 'p[x]'. (you should
* add handling of lines greater than LNMAX chars)
*/
char **loadlist (char **p, FILE *fp)
{
int idx = 0; /* index */
char buf[LNMAX] = ""; /* read buf */
while (idx < PTRMAX && fgets (buf, LNMAX, fp)) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* indicate error condition in return */
idx++;
}
return p; /* return pointer to list */
}
note: you could just as easily change the return type to int and return the number of lines read, or pass a pointer to int (or better yet size_t) as a parameter to make the number of lines stored available back in the calling function.
However, in this case, we have used the initialization of all pointers in your array of pointers to NULL, so back in the calling function we need only iterate over the pointer array until the first NULL is encountered in order to traverse our list of lines. Putting together a short example program that reads/stores all lines (up to PTRMAX lines) from the filename given as the first argument to the program (or from stdin if no filename is given), you could do something similar to:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
enum { LNMAX = 200, PTRMAX = 10000 };
char **loadlist (char **p, FILE *fp);
int main (int argc, char **argv) {
time_t stime, etime;
char *list[PTRMAX] = { NULL }; /* array of ptrs initialized NULL */
size_t n = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
printf ("Starting of the program...\n");
time (&stime);
if (loadlist (list, fp)) { /* read lines from fp into list */
time (&etime);
printf("time to load: %f\n\n", difftime (etime, stime));
}
else {
fprintf (stderr, "error: loadlist failed.\n");
return 1;
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
while (list[n]) { /* output stored lines and free allocated mem */
printf ("line[%5zu]: %s\n", n, list[n]);
free (list[n++]);
}
return(0);
}
/** read up to PTRMAX lines from 'fp', allocate/save in 'p'.
* storage is allocated for each line read and pointer
* to allocated block is stored at 'p[x]'. (you should
* add handling of lines greater than LNMAX chars)
*/
char **loadlist (char **p, FILE *fp)
{
int idx = 0; /* index */
char buf[LNMAX] = ""; /* read buf */
while (idx < PTRMAX && fgets (buf, LNMAX, fp)) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* indicate error condition in return */
idx++;
}
return p; /* return pointer to list */
}
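As an alternative to walking the array until the first NULL, the number of lines stored can be handed back through a pointer parameter, as mentioned in the note earlier. A minimal sketch of that variant follows; the name loadlines, its parameter list and the BUFSZ constant are assumptions made for illustration:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFSZ 200   /* read-buffer size, same role as LNMAX above */

/* loadlines is a hypothetical variant of loadlist: it stores up to 'max'
 * lines from fp in p and reports the number stored through *n.
 * Returns 1 on success, 0 if an allocation failed (the lines stored so
 * far remain valid and *n still reports how many there are). */
int loadlines (char **p, size_t max, FILE *fp, size_t *n)
{
    char buf[BUFSZ];

    *n = 0;
    while (*n < max && fgets (buf, sizeof buf, fp)) {
        size_t len = strcspn (buf, "\n");   /* length without the '\n' */
        buf[len] = '\0';                    /* trim newline if present */

        if (!(p[*n] = malloc (len + 1)))    /* allocate and validate */
            return 0;
        memcpy (p[*n], buf, len + 1);       /* copy with the nul byte */
        (*n)++;
    }
    return 1;
}
```

The caller then loops from 0 to n-1 to print and free the lines, and the array no longer needs to be initialized to NULL. Like the version above, it does not join lines longer than the read buffer.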
Finally, in any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
Use a memory error checking program to ensure you haven't written beyond/outside your allocated block of memory, attempted to read or base a jump on an uninitialized value and finally to confirm that you have freed all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
Look things over, let me know if you have any further questions.
It's natural that you see numbers because you are printing a single character using the "%d" specifier. In fact, strings in C are pretty much that, arrays of numbers, and those numbers are the ASCII values of the corresponding characters. If you instead use "%c" you will see the character that represents each of those numbers.
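A tiny sketch of the difference, formatting the same char once with "%d" and once with "%c". The helper name show_first is hypothetical, and the numeric value it produces assumes an ASCII character set:

```c
#include <stdio.h>

/* show_first is a hypothetical helper: it formats the first character
 * of s both as a number (%d) and as a character (%c) into out.
 * Returns what snprintf returns (the number of characters needed). */
int show_first (char *out, size_t size, const char *s)
{
    return snprintf (out, size, "%d %c", s[0], s[0]);
}
```

On an ASCII system, calling it with the string "Apple" yields the number 65 followed by the letter A, which is the same byte seen through two different conversion specifiers.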
Your code also calls strlen() on something that is intended as an array of strings. strlen() is used to compute the length of a single string, a string being an array of char items with a non-zero value, ended with a 0. Thus, strlen() is surely causing undefined behavior.
Also, if you want to store each string, you need to copy the data like you tried in the commented line with strcpy() because the array you are using for reading lines is overwritten over and over in each iteration.
Your compiler must be throwing all kinds of warnings. If it's not, then it's your fault: you should let the compiler know that you want it to do some diagnostics to help you find common problems like assigning a pointer to a char.
You should fix multiple problems in your code; here is code that fixes most of them
void
loadlist(const char *const filename) {
char line[100];
FILE *file;
// We can only read 100 lines, of
// max 99 characters each
char array[100][100];
int size;
size = 0;
file = fopen (filename, "r" );
if (file == NULL)
return;
while ((fgets(line, sizeof(line), file) != NULL) && (size < 100)) {
strcpy(array[size++], line);
}
fclose(file);
for (int i = 0 ; i < size ; ++i) {
printf("array[%d] = %s", i + 1, array[i]);
}
}
int
main(void)
{
time_t stime, etime;
printf("Starting of the program...\n");
time(&stime);
loadlist("Z:\\list.txt");
time(&etime);
printf("Time to load: %f\n", difftime(etime, stime));
return 0;
}
Just to prove how complicated it can be in c, check this out
#include <stdio.h>
#include <time.h>
#include <string.h>
#include <stdlib.h>
struct string_list {
char **items;
size_t size;
size_t count;
};
void
string_list_print(struct string_list *list)
{
// Simply iterate through the list and
// print every item
for (size_t i = 0 ; i < list->count ; ++i) {
fprintf(stdout, "item[%zu] = %s\n", i + 1, list->items[i]);
}
}
struct string_list *
string_list_create(size_t size)
{
struct string_list *list;
// Allocate space for the list object
list = malloc(sizeof *list);
if (list == NULL) // ALWAYS check this
return NULL;
// Allocate space for the items
// (starting with `size' items)
list->items = malloc(size * sizeof *list->items);
if (list->items != NULL) {
// Update the list size because the allocation
// succeeded
list->size = size;
} else {
// Be optimistic, maybe realloc will work next time
list->size = 0;
}
// Initialize the count to 0, because
// the list is initially empty
list->count = 0;
return list;
}
int
string_list_append(struct string_list *list, const char *const string)
{
// Check if there is room for the new item
if (list->count + 1 >= list->size) {
char **items;
// Double the capacity, or start with 16 items if the
// initial allocation failed and `size' is still 0
size_t size = list->size > 0 ? 2 * list->size : 16;
// Resize the array, there is no more room
items = realloc(list->items, size * sizeof *list->items);
if (items == NULL)
return -1;
// Now update the list
list->items = items;
list->size = size;
}
// Duplicate the string and validate the allocation
// before storing it and increasing the `count'
char *copy = strdup(string);
if (copy == NULL)
return -1;
list->items[list->count++] = copy;
return 0;
}
void
string_list_destroy(struct string_list *const list)
{
// `free()' does work with a `NULL' argument
// so perhaps as a principle we should too
if (list == NULL)
return;
// If the `list->items' was initialized, attempt
// to free every `strdup()'ed string
if (list->items != NULL) {
for (size_t i = 0 ; i < list->count ; ++i) {
free(list->items[i]);
}
free(list->items);
}
free(list);
}
struct string_list *
loadlist(const char *const filename) {
char line[100]; // A buffer for reading lines from the file
FILE *file;
struct string_list *list;
// Create a new list, initially it has
// room for 100 strings, but it grows
// automatically if needed
list = string_list_create(100);
if (list == NULL)
return NULL;
// Attempt to open the file
file = fopen (filename, "r");
// On failure, we now have the responsibility
// to cleanup the allocated space for the string
// list
if (file == NULL) {
string_list_destroy(list);
return NULL;
}
// Read lines from the file until there are no more
while (fgets(line, sizeof(line), file) != NULL) {
char *newline;
// Remove the trailing '\n'
newline = strchr(line, '\n');
if (newline != NULL)
*newline = '\0';
// Append the string to the list, stop reading on failure
if (string_list_append(list, line) != 0)
break;
}
fclose(file);
return list;
}
int
main(void)
{
time_t stime, etime;
struct string_list *list;
printf("Starting of the program...\n");
time(&stime);
list = loadlist("Z:\\list.txt");
if (list != NULL) {
string_list_print(list);
string_list_destroy(list);
}
time(&etime);
printf("Time to load: %f\n", difftime(etime, stime));
return 0;
}
Now, this will work almost like the Python code you say you wrote but it will certainly be faster, there is absolutely no doubt.
It is possible that an experienced Python programmer can write a Python program that runs faster than that of an inexperienced C programmer. Learning C, however, is really good because you then understand how things really work, and you can then infer how a Python feature is probably implemented, so understanding this can actually be very useful.
Although it's certainly way more complicated than doing the same in Python, note that I wrote this in nearly 10 minutes. So if you really know what you're doing and you really need it to be fast, C is certainly an option, but you need to learn many concepts that are not clear to programmers of higher-level languages.

c reading separate words from text file using fscanf()

I am writing a quiz program. The program should read question, answers and correct answer from csv file.
Then it should store them in array.
void read(char question[][50], char answer1[10][10], char answer2[10][10], char answer3[10][10], char answer4[10][10], int correctAnswer[10], int *size, char fileName[], int noOfQuestion){
FILE *reader;
int count;
char qBuffer[50];
char ansBuffer1[50];
char ansBuffer2[50];
char ansBuffer3[50];
char ansBuffer4[50];
int iBuffer = 0;
*size = 0;
//open file
reader = fopen(fileName, "r");
//checking file is open or not
if (reader == NULL)
{
printf("Unable to open file %s", fileName);
}
else
{
fscanf(reader, "%100[^\t*\?,],%[^,],%[^,],%[^,],%[^,],%d", size);
for (count = 0; feof(reader) == 0 && count<*size && count<noOfQuestion; count++){
//Reading file
fscanf(reader, "%100[^\t*\?,],%[^,],%[^,],%[^,],%[^,],%d", qBuffer, ansBuffer1, ansBuffer2, ansBuffer3, ansBuffer4, iBuffer);
//Storing data
strcpy(question[count], qBuffer);
strcpy(answer1[count], ansBuffer1);
strcpy(answer2[count], ansBuffer2);
strcpy(answer3[count], ansBuffer3);
strcpy(answer4[count], ansBuffer4);
correctAnswer[count] = iBuffer;
// Check Correct Number of Items Read
if( count == noOfQuestion )
{
printf("There are more items in the file than MaxNoItems specifies can be stored in the output arrays.\n\n");
*size = count;
}
else if( count != *size - 1 )
{
printf("File is corrupted. Not as many items in the file as specified at the top.\n\n");
*size = count;
}
//Break if reached end of file.
if (feof(reader))
{ break;}
}
fclose(reader);
}
}
This is the CSV file to read from. Each question and its answers are on one line.
What function do you use to open a file?,fscanf,fclose,fopen,main,3
Which of the following is not a variable type?,int,float,char,string,4
How many bytes is a character?,8,4,2,1,4
What programming language have you been studying this term?,B,A,D,C,4
Which of the following is a comment?,#comment,//comment,$comment,%comment,2
Which of these is in the C Standard Library?,stdio.h,studio.h,iostream,diskio.h,1
What tool do we use to compile?,compiler,builder,linker,wrench,1
What function do you use to close a file?,fscanf,fclose,fopen,main,2
How do you include a file?,#include,//include,$include,%include,1
What are you doing this quiz on?,paper,whiteboard,computer,chalkboard,3
I worked to find a way to solve the issues in your code, however there just isn't a clean way to follow your double-read of each line and make it work in a reasonable way. The structural issue you have is attempting to read the line twice, first to determine the size and next to try and read the actual values. This has many pitfalls.
Instead of trying to read each line in a piecemeal manner, it is far better to read an entire line at a time using the line-oriented input functions provided by C (fgets, getline). It will make your code much more flexible and make life easier on you as well. The basic approach is to read a line at a time into a 'buffer', then using the tools provided, extract what you need from the line, store it in a way that makes sense, and move on to the next line.
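To illustrate the "extract what you need from the line" step, here is a minimal sketch that splits one already-read line into its comma-separated fields in place. The helper name split_fields is hypothetical, and the sketch assumes that no field contains an embedded or quoted comma, which holds for the sample file shown in the question:

```c
#include <string.h>

/* split_fields is a hypothetical helper: it splits 'line' in place at
 * each ',' and stores a pointer to the start of every field in fields[].
 * At most 'max' fields are stored; the number stored is returned.
 * Empty fields are kept (unlike strtok, which skips them). */
size_t split_fields (char *line, char *fields[], size_t max)
{
    size_t n = 0;
    char *p = line;

    while (n < max) {
        fields[n++] = p;                /* field starts here */
        p = strchr (p, ',');            /* find the next separator */
        if (!p)
            break;                      /* last field reached */
        *p++ = '\0';                    /* terminate field, step past ',' */
    }
    return n;
}
```

For a line from the quiz file, the first field would be the question, the following four the answers, and the last the correct answer number, which could then be converted with strtol. The line must already have its trailing newline removed, and the fields point into the line buffer, so they have to be copied before the buffer is reused.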
There is just no way to hardcode a bunch of arrays in your function argument list and have it work in a sane way. The proper way to do it is to pass a pointer to some type of data struct to your function, have your function fill it, allocating memory as needed, and provide a pointer in return. In your case a simple structure makes a lot more sense than one two-dimensional array for each question you expect to read.
It is far better to define an initial size for the expected number of questions (MAXQ 128 below), and allocate storage for that amount. You can do the same for expected answers per question (MAXA 16 below). If you end up reading more than each, you can easily reallocate to handle the data.
Once you have your struct filled (or array of structs), you make that data available to your main code by a simple return. You then have a single pointer to your data that you can easily pass to a print function or wherever else you need the data. Since the storage for your data was allocated dynamically, you are responsible for freeing the memory used when it is no longer needed.
I have provided examples of both a print and free function to illustrate passing the pointer to the data between functions as well as the practical printing and freeing of the memory.
Designing your code in a similar manner will save you a lot of headaches in the long run. There are many ways to do this, the example below is simply one approach. I commented the code to help you follow along. Take a look and let me know if you have questions.
Note: I have replaced the original readQAs function with the version I originally wrote, but had a lingering issue with. When using getline you must preserve the starting address for any buffer allocated by getline or repetitive calls to getline will result in a segfault when getline attempts to reallocate its buffer. Basically, getline needs a way of keeping track of the memory it has used. You are free to chop the buffer allocated by getline up any way you want, as long as you preserve the starting address of the originally allocated buffer. Keeping a pointer to the original is sufficient.
This can be particularly subtle when you pass the buffer to functions that operate on the string such as strtok or strsep. Regardless, the result of failing to preserve the start of the buffer allocated by getline will result in a segfault at whatever loop exhausts the initial 120-byte buffer allocated by getline, receiving __memcpy_sse2 () from /lib64/libc.so.6. If you never exhaust the original 120-byte buffer, you will never experience a segfault. Bottom line, always preserve the starting address for the buffer allocated by getline.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXQ 128
#define MAXA 16
typedef struct {
char *q;
char **ans;
unsigned int nans;
} ques;
ques **readQAs (char *fn);
void prn_ques (ques **exam);
void free_ques (ques **exam);
int main (int argc, char **argv) {
if (argc < 2) {
fprintf (stderr,"\n error: insufficient input. Usage: %s <csvfile>\n\n", argv[0]);
return 1;
}
ques **exam = NULL; /* pointer to pointer to struct */
/* allocate/fill exam structs with questions/answers */
if ( !( exam = readQAs (argv[1]) ) ) {
fprintf (stderr, "\n error: reading questions/answers from '%s'\n\n", argv[1]);
return 1;
}
prn_ques (exam); /* print the questions/answers */
free_ques (exam); /* free all memory allocated */
return 0;
}
/* allocate and fill array of structs with questions/answers */
ques **readQAs (char *fn)
{
FILE *fp = fopen (fn, "r"); /* open file and validate */
if (!fp) {
fprintf (stderr,"\n error: Unable to open file '%s'\n\n", fn);
return NULL;
}
char *line = NULL; /* line buff, if NULL getline allocates */
size_t n = 0; /* buffer size, set/updated by getline */
ssize_t nchr = 0; /* num chars actually read by getline */
char *p = NULL; /* general pointer to parse line */
char *sp = NULL; /* second pointer to parse line */
char *lp = NULL; /* line ptr (preserve line start addr) */
size_t qidx = 0; /* index for questions structs */
size_t aidx = 0; /* index for answers within structs */
ques **q = calloc (MAXQ, sizeof (*q)); /* allocate MAXQ ptrs */
if (!q) { fprintf (stderr,"\n Allocation error.\n\n"); fclose (fp); return NULL; }
/* for each line in file (fn) */
while ((nchr = getline (&line, &n, fp)) != -1)
{
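if (qidx == MAXQ - 1) { /* full: last slot stays NULL as sentinel */
fprintf (stderr, "\n warning: MAXQ reached, remaining lines ignored.\n\n");
break; /* realloc q here to accept more questions */
}
aidx = 0; /* reset ans index each line */
lp = line; /* save line start address */
if (nchr > 0 && line[nchr - 1] == '\n') /* test/strip trailing newline */
line[--nchr] = 0;
q [qidx] = calloc (1, sizeof (**q)); /* allocate struct */
if (!q[qidx]) break; /* validate allocation */
q [qidx]-> ans = calloc (MAXA, sizeof (*(q[qidx]-> ans)));
if (!q[qidx]->ans) { free (q[qidx]); q[qidx] = NULL; break; }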
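/* read question (line needs a first ',' and a later, last ',') */
if (!(p = strchr (line, ',')) || !(sp = strrchr (p + 1, ','))) {
fprintf (stderr, "\n warning: skipping malformed line.\n\n");
free (q[qidx]->ans); free (q[qidx]); q[qidx] = NULL;
continue;
}
*p++ = 0; /* null-terminate question at first ',' */
q [qidx]-> q = strdup (line); /* alloc/copy question */
/* read correct answer number (single digit after last ',') */
*sp = 0; /* null-term answers at last ',' */
q [qidx]-> nans = *(sp + 1) - '0'; /* convert digit char to number */
/* read multi-choice answers, keeping a NULL sentinel in ans */
for (p = strtok (p, ","); p && aidx < MAXA - 1; p = strtok (NULL, ","))
q [qidx]-> ans [aidx++] = strdup (p); /* alloc/copy ans */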
line = lp; /* restore start addr before next getline */
qidx++; /* inc index for next Q */
}
free (line); /* free line memory (free (NULL) is a no-op) */
fclose (fp); /* close file stream */
return q; /* return ptr to array of structs holding Q/A(s) */
}
/* print formatted exam read from file */
void prn_ques (ques **exam)
{
if (!exam) {
fprintf (stderr, "\n %s() error: invalid exam pointer.\n\n", __func__);
return;
}
size_t qidx = 0; /* index for questions structs */
size_t aidx = 0; /* index for answers within structs */
printf ("\nClass Exam\n\n");
while (exam [qidx])
{
printf (" %2zu. %s\n\n", qidx + 1, exam[qidx]-> q);
aidx = 0;
while (exam[qidx]->ans[aidx])
{
if (exam[qidx]-> nans == aidx + 1)
printf ("\t(%c) %-16s (* correct)\n", (int)aidx + 'a', exam[qidx]->ans[aidx]);
else
printf ("\t(%c) %s\n", (int)aidx + 'a', exam[qidx]->ans[aidx]);
aidx++;
}
printf ("\n");
qidx++;
}
printf ("\n");
}
/* free all memory allocated */
void free_ques (ques **exam)
{
if (!exam) {
fprintf (stderr, "\n %s() error: invalid exam pointer.\n\n", __func__);
return;
}
size_t qidx = 0; /* index for questions structs */
size_t aidx = 0; /* index for answers within structs */
while (exam[qidx])
{
if (exam[qidx]->q) free (exam[qidx]->q);
for (aidx = 0; aidx < MAXA; aidx++) {
if (exam[qidx]->ans[aidx]) {
free (exam[qidx]->ans[aidx]);
}
}
free (exam[qidx]->ans);
free (exam[qidx++]);
}
free (exam);
}
output/verification:
$ ./bin/readcsvfile dat/readcsvfile.csv
Class Exam
1. What function do you use to open a file?
(a) fscanf
(b) fclose
(c) fopen (* correct)
(d) main
2. Which of the following is not a variable type?
(a) int
(b) float
(c) char
(d) string (* correct)
3. How many bytes is a character?
(a) 8
(b) 4
(c) 2
(d) 1 (* correct)
4. What programming language have you been studying this term?
(a) B
(b) A
(c) D
(d) C (* correct)
5. Which of the following is a comment?
(a) #comment
(b) //comment (* correct)
(c) $comment
(d) %comment
6. Which of these is in the C Standard Library?
(a) stdio.h (* correct)
(b) studio.h
(c) iostream
(d) diskio.h
7. What tool do we use to compile?
(a) compiler (* correct)
(b) builder
(c) linker
(d) wrench
8. What function do you use to close a file?
(a) fscanf
(b) fclose (* correct)
(c) fopen
(d) main
9. How do you include a file?
(a) #include (* correct)
(b) //include
(c) $include
(d) %include
10. What are you doing this quiz on?
(a) paper
(b) whiteboard
(c) computer (* correct)
(d) chalkboard
valgrind verification:
==16221==
==16221== HEAP SUMMARY:
==16221== in use at exit: 0 bytes in 0 blocks
==16221== total heap usage: 73 allocs, 73 frees, 3,892 bytes allocated
==16221==
==16221== All heap blocks were freed -- no leaks are possible
==16221==
==16221== For counts of detected and suppressed errors, rerun with: -v
==16221== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
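The comment /* test qidx = MAXQ-1, realloc */ in readQAs marks the point where the pointer array would need to grow to handle more than MAXQ questions. One way this could be done is sketched below. The helper name grow_exam and the cap variable are assumptions for illustration and are not part of the program above; the ques struct is repeated only so the sketch compiles on its own.

```c
#include <stdlib.h>

typedef struct {            /* same layout as the ques struct above */
    char *q;
    char **ans;
    unsigned int nans;
} ques;

/* Hypothetical helper: double the number of pointers in exam.
 * *cap is the number of slots currently allocated. On success the new
 * slots are set to NULL (so the NULL sentinel is kept), *cap is updated
 * and the (possibly moved) array is returned. On failure NULL is
 * returned and the original array is left untouched. */
ques **grow_exam (ques **exam, size_t *cap)
{
    size_t newcap = *cap ? *cap * 2 : 2;
    ques **tmp = realloc (exam, newcap * sizeof *tmp);

    if (!tmp)               /* realloc failed, exam is still valid */
        return NULL;

    for (size_t i = *cap; i < newcap; i++)
        tmp[i] = NULL;      /* realloc does not initialize new memory */

    *cap = newcap;
    return tmp;
}
```

In the read loop this would be called when qidx reaches cap - 1, assigning the result to a temporary pointer first so the original array is not lost (and can still be freed) if realloc fails.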

Resources