Reading a stream of values from text file in C - c

I have a text file which may contain one or up to 400 numbers. Each number is separated by a comma and a semicolon is used to indicate end of numbers stream.
At the moment I am reading the text file line by line using the fgets. For this reason I am using a fixed array of 1024 elements (the maximum characters per line for a text file).
This is not ideal, since if the text file contains only one number, an array of 1024 elements will be pointless.
Is there a way to use fgets with the malloc function (or any other method) to increase memory efficiency?

If you intend to use this in production code, I would ask you to follow the suggestions in the comments section.
But if your requirement is more for learning or school, then here is a complex approach.
Pseudo code
1. Find the size of the file in bytes, you can use "stat" for this.
2. Since the file format is known, from the file size, calculate the number of items.
3. Use the number of items to malloc.
Voila! :p
How to find file size
You can use stat as shown below:
#include <sys/stat.h>
#include <stdio.h>
int main(void)
{
struct stat st;
if (stat("file", &st) == 0) {
/* st_size is an off_t, so cast it before printing */
printf("fileSize: %lld No. of Items: %lld\n", (long long)st.st_size, (long long)(st.st_size / 2));
return 0;
}
printf("failed!\n");
return 0;
}
When run, this program prints the file size and the item count:
$> cat file
1;
$> ./a.out
fileSize: 3 No. of Items: 1
$> cat file
1,2,3;
$> ./a.out
fileSize: 7 No. of Items: 3
Disclaimer: Is this approach to minimize the pre-allocated memory an optimal approach? No ways in heaven! :)

Dynamically allocating space for your data is a fundamental tool for working in C. You might as well pay the price to learn. The primary thing to remember is,
"if you allocate memory, you have the responsibility to track its use
and preserve a pointer to the starting address for the block of
memory so you can free it when you are done with it. Otherwise your
code will leak memory like a sieve."
Dynamic allocation is straightforward. You allocate some initial block of memory and keep track of what you add to it. You must test that each allocation succeeds. You must also track how much of the block you have used and reallocate (or stop writing data) when it is full to prevent writing beyond the end of your block of memory. If you fail to test either, you will corrupt the memory associated with your code.
When you reallocate, always reallocate using a temporary pointer. If realloc fails, it returns NULL and leaves the original block untouched, so assigning the result directly to your original pointer overwrites the only reference to that block (you lose access to all previous data and leak the memory). Using a temporary pointer allows you to handle the failure while preserving that block if needed.
Taking that into consideration, below we initially allocate space for 64 long values (you can easily change the code to handle any type, e.g. int, float, double...). The code then reads each line of data (using getline to dynamically allocate the buffer for each line). strtol is used to parse the buffer, assigning values to the array. idx is used as an index to keep track of how many values have been read, and when idx reaches the current nmax, array is reallocated to twice its previous size and nmax is updated to reflect the change. The reading, parsing, checking and reallocating continues for every line of data in the file. When done, the values are printed to stdout, showing the 400 random values read from the test file formatted as 353,394,257,...293,58,135;
To keep the read loop logic clean, I've put the error checking for the strtol conversion into a function xstrtol, but you are free to include that code in main() if you like. The same applies to the realloc_long function. To see when the reallocation takes place, you can compile the code with the -DDEBUG definition. E.g:
gcc -Wall -Wextra -DDEBUG -o progname yoursourcefile.c
The program expects your data filename as the first argument and you can provide an optional conversion base as the second argument (default is 10). E.g.:
./progname datafile.txt [base (default: 10)]
Look over it, test it, and let me know if you have any questions.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <errno.h>
#define NMAX 64
long xstrtol (char *p, char **ep, int base);
long *realloc_long (long *lp, unsigned long *n);
int main (int argc, char **argv)
{
char *ln = NULL; /* NULL forces getline to allocate */
size_t n = 0; /* buffer size (0 - getline allocates) */
ssize_t nchr = 0; /* number of chars actually read */
size_t idx = 0; /* array index counter */
long *array = NULL; /* pointer to long */
unsigned long nmax = NMAX; /* initial reallocation counter */
FILE *fp = NULL; /* input file pointer */
int base = argc > 2 ? atoi (argv[2]) : 10; /* base (default: 10) */
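/* validate that a filename was given, then open the file */
if (argc < 2) {
fprintf (stderr, "usage: %s datafile [base]\n", argv[0]);
return 1;
}
if (!(fp = fopen (argv[1], "r"))) {
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}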
/* allocate array of NMAX long using calloc to initialize to 0 */
if (!(array = calloc (NMAX, sizeof *array))) {
fprintf (stderr, "error: memory allocation failed.");
return 1;
}
/* read each line from file - separate into array */
while ((nchr = getline (&ln, &n, fp)) != -1)
{
char *p = ln; /* pointer to ln read by getline */
char *ep = NULL; /* endpointer for strtol */
errno = 0; /* clear any stale errno before testing it */
while (errno == 0)
{ /* parse/convert each number in line into array */
array[idx++] = xstrtol (p, &ep, base);
if (idx == nmax) /* check NMAX / realloc */
array = realloc_long (array, &nmax);
/* skip delimiters/move pointer to next digit */
while (*ep && *ep != '-' && (*ep < '0' || *ep > '9')) ep++;
if (*ep)
p = ep;
else
break;
}
}
if (ln) free (ln); /* free memory allocated by getline */
if (fp) fclose (fp); /* close open file descriptor */
size_t i = 0; /* same type as idx, avoids signed/unsigned compare */
for (i = 0; i < idx; i++)
printf (" array[%zu] : %ld\n", i, array[i]);
free (array);
return 0;
}
/* reallocate long pointer memory */
long *realloc_long (long *lp, unsigned long *n)
{
long *tmp = realloc (lp, 2 * *n * sizeof *lp);
#ifdef DEBUG
printf ("\n reallocating %lu to %lu\n", *n, *n * 2);
#endif
if (!tmp) {
fprintf (stderr, "%s() error: reallocation failed.\n", __func__);
// return NULL;
exit (EXIT_FAILURE);
}
lp = tmp;
memset (lp + *n, 0, *n * sizeof *lp); /* memset new ptrs 0 */
*n *= 2;
return lp;
}
long xstrtol (char *p, char **ep, int base)
{
errno = 0;
long tmp = strtol (p, ep, base);
/* Check for various possible errors */
if ((errno == ERANGE && (tmp == LONG_MIN || tmp == LONG_MAX)) ||
(errno != 0 && tmp == 0)) {
perror ("strtol");
exit (EXIT_FAILURE);
}
if (*ep == p) {
fprintf (stderr, "No digits were found\n");
exit (EXIT_FAILURE);
}
return tmp;
}
Sample Output (with -DDEBUG to show reallocation)
$ ./bin/read_long_csv dat/randlong.txt
reallocating 64 to 128
reallocating 128 to 256
reallocating 256 to 512
array[0] : 353
array[1] : 394
array[2] : 257
array[3] : 173
array[4] : 389
array[5] : 332
array[6] : 338
array[7] : 293
array[8] : 58
array[9] : 135
<snip>
array[395] : 146
array[396] : 324
array[397] : 424
array[398] : 365
array[399] : 205
Memory Error Check
$ valgrind ./bin/read_long_csv dat/randlong.txt
==26142== Memcheck, a memory error detector
==26142== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==26142== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==26142== Command: ./bin/read_long_csv dat/randlong.txt
==26142==
reallocating 64 to 128
reallocating 128 to 256
reallocating 256 to 512
array[0] : 353
array[1] : 394
array[2] : 257
array[3] : 173
array[4] : 389
array[5] : 332
array[6] : 338
array[7] : 293
array[8] : 58
array[9] : 135
<snip>
array[395] : 146
array[396] : 324
array[397] : 424
array[398] : 365
array[399] : 205
==26142==
==26142== HEAP SUMMARY:
==26142== in use at exit: 0 bytes in 0 blocks
==26142== total heap usage: 7 allocs, 7 frees, 9,886 bytes allocated
==26142==
==26142== All heap blocks were freed -- no leaks are possible
==26142==
==26142== For counts of detected and suppressed errors, rerun with: -v
==26142== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

Related

Reading text file using fgets() and strtok() to separate strings in line yielding unwanted behaviour

I am trying to read a text file with the following format, using fgets() and strtok().
1082018 1200 79 Meeting with President
2012018 1200 79 Meet with John at cinema
2082018 1400 30 games with Alpha
3022018 1200 79 sports
I need to separate the first value from the rest of the line, for example:
key=21122019, val = 1200 79 Meeting with President
To do so I am using strchr() for val and strtok() for key, however, the key value remains unchanged when reading from file. I can't understand why this is happening since I am allocating space for in_key inside the while loop and placing inside an array at a different index each time.
My code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define N 1000 // max number of lines to be read
#define VALLEN 100
#define MAXC 1024
#define ALLOCSIZE 1000 /*size of available space*/
static char allocbuf[ALLOCSIZE]; /* storage for alloc*/
static char *allocp = allocbuf; /* next free position*/
char *alloc(int n) { /* return a pointer to n characters*/
if (allocbuf + ALLOCSIZE - allocp >= n) { /*it fits*/
allocp += n;
return allocp - n; /*old p*/
} else /*not enough room*/
return 0;
}
int main(int argc, char** argv) {
FILE *inp_cal;
inp_cal = fopen("calendar.txt", "r+");
char buf[MAXC];
char *line[1024];
char *p_line;
char *in_val_arr[100];
char *in_key_arr[100];
int count = 0;
char delimiter[] = " ";
if (inp_cal) {
printf("Processing file...\n");
while (fgets(buf, MAXC, inp_cal)) {
p_line = malloc(strlen(buf) + 1); // malloced with size of buffer.
char *in_val;
char *in_key;
strcpy(p_line, buf); //used to create a copy of input buffer
line[count] = p_line;
/* separating the line based on the first space. The words after
* the delimeter will be copied into in_val */
char *copy = strchr(p_line, ' ');
if (copy) {
if ((in_val = alloc(strlen(line[count]) + 1)) == NULL) {
return -1;
} else {
strcpy(in_val, copy + 1);
printf("arr: %s", in_val);
in_val_arr[count] = in_val;
}
} else
printf("Could not find a space\n");
/* We now need to get the first word from the input buffer*/
if ((in_key = alloc(strlen(line[count]) + 1)) == NULL) {
return -1;
}
else {
in_key = strtok(buf, delimiter);
printf("%s\n", in_key);
in_key_arr[count] = in_key; // <-- Printed out well
count++;
}
}
for (int i = 0; i < count; ++i)
printf("key=%s, val = %s", in_key_arr[i], in_val_arr[i]); //<-- in_key_arr[i] contains same values throughout, unlike above
fclose(inp_cal);
}
return 0;
}
while-loop output (correct):
Processing file...
arr: 1200 79 Meeting with President
1082018
arr: 1200 79 Meet with John at cinema
2012018
arr: 1400 30 games with Alpha
2082018
arr: 1200 79 sports
3022018
for-loop output (incorrect):
key=21122019, val = 1200 79 Meeting with President
key=21122019, val = 1200 79 Meet with John
key=21122019, val = 1400 30 games with Alpha
key=21122019, val = 1200 79 sports
Any suggestions on how this can be improved and why this is happening? Thanks
Continuing from the comment: in attempting to use strtok to separate your data into key, val, somenum and the remainder of the line as a string, you are making things harder than they need to be.
If the beginning of your lines are always:
key val somenum rest
you can simply use sscanf to parse key, val and somenum into, e.g. three unsigned values and the rest of the line into a string. To help preserve the relationship between each key, val, somenum and string, storing the values from each line in a struct will greatly ease keeping track of everything. You can even allocate for the string to minimize storage to the exact amount required. For example, you could use something like the following:
typedef struct { /* struct to handle values */
unsigned key, val, n;
char *s;
} keyval_t;
Then within main() you could allocate for some initial number of structs, keep an index as a counter, loop reading each line using a temporary struct and buffer, then allocate for the string (+1 for the nul-terminating character) and copy the values to your struct. When the number of structs filled reaches your allocated amount, simply realloc the number of structs and keep going.
For example, let's say you initially allocate for NSTRUCT structs and read each line into buf, e.g.
...
#define NSTRUCT 8 /* initial struct to allocate */
#define MAXC 1024 /* read buffer size (don't skimp) */
...
size_t ndx = 0, /* used */
max = NSTRUCT; /* allocated */
keyval_t *kv = NULL; /* ptr to struct */
...
/* allocate/validate storage for max struct */
if (!(kv = malloc (max * sizeof *kv))) {
perror ("malloc-kv");
return 1;
}
...
while (fgets (buf, MAXC, fp)) { /* read each line of input */
...
Within your while loop, you simply need to parse the values with sscanf, e.g.
char str[MAXC];
size_t len;
keyval_t tmp = {.key = 0}; /* temporary struct for parsing */
if (sscanf (buf, "%u %u %u %1023[^\n]", &tmp.key, &tmp.val, &tmp.n,
str) != 4) {
fprintf (stderr, "error: invalid format, line '%zu'.\n", ndx);
continue;
}
With the values parsed, you check whether your index has reached the number of struct you have allocated and realloc if required (note the use of a temporary pointer to realloc), e.g.
if (ndx == max) { /* check if realloc needed */
/* always realloc with temporary pointer */
void *kvtmp = realloc (kv, 2 * max * sizeof *kv);
if (!kvtmp) {
perror ("realloc-kv");
break; /* don't exit, kv memory still valid */
}
kv = kvtmp; /* assign new block to pointer */
max *= 2; /* increment max allocated */
}
Now with storage for the struct, simply get the length of the string, copy the unsigned values to your struct, and allocate length + 1 chars for kv[ndx].s and copy str to kv[ndx].s, e.g.
len = strlen(str); /* get length of str */
kv[ndx] = tmp; /* assign tmp values to kv[ndx] */
kv[ndx].s = malloc (len + 1); /* allocate block for str */
if (!kv[ndx].s) { /* validate */
perror ("malloc-kv[ndx].s");
break; /* ditto */
}
memcpy (kv[ndx++].s, str, len + 1); /* copy str to kv[ndx].s */
}
(note: you can use strdup if you have it to replace malloc through memcpy with kv[ndx].s = strdup (str);, but since strdup allocates, don't forget to check kv[ndx].s != NULL before incrementing ndx if you go that route)
That's pretty much the easy and robust way to capture your data. It is now contained in an allocated array of struct which you can use as needed, e.g.
for (size_t i = 0; i < ndx; i++) {
printf ("kv[%2zu] : %8u %4u %2u %s\n", i,
kv[i].key, kv[i].val, kv[i].n, kv[i].s);
free (kv[i].s); /* free string */
}
free (kv); /* free structs */
(don't forget to free the memory you allocate)
Putting it all together, you could do something like the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NSTRUCT 8 /* initial struct to allocate */
#define MAXC 1024 /* read buffer size (don't skimp) */
typedef struct { /* struct to handle values */
unsigned key, val, n;
char *s;
} keyval_t;
int main (int argc, char **argv) {
char buf[MAXC]; /* line buffer */
size_t ndx = 0, /* used */
max = NSTRUCT; /* allocated */
keyval_t *kv = NULL; /* ptr to struct */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("fopen-file");
return 1;
}
/* allocate/validate storage for max struct */
if (!(kv = malloc (max * sizeof *kv))) {
perror ("malloc-kv");
return 1;
}
while (fgets (buf, MAXC, fp)) { /* read each line of input */
char str[MAXC];
size_t len;
keyval_t tmp = {.key = 0}; /* temporary struct for parsing */
if (sscanf (buf, "%u %u %u %1023[^\n]", &tmp.key, &tmp.val, &tmp.n,
str) != 4) {
fprintf (stderr, "error: invalid format, line '%zu'.\n", ndx);
continue;
}
if (ndx == max) { /* check if realloc needed */
/* always realloc with temporary pointer */
void *kvtmp = realloc (kv, 2 * max * sizeof *kv);
if (!kvtmp) {
perror ("realloc-kv");
break; /* don't exit, kv memory still valid */
}
kv = kvtmp; /* assign new block to pointer */
max *= 2; /* increment max allocated */
}
len = strlen(str); /* get length of str */
kv[ndx] = tmp; /* assign tmp values to kv[ndx] */
kv[ndx].s = malloc (len + 1); /* allocate block for str */
if (!kv[ndx].s) { /* validate */
perror ("malloc-kv[ndx].s");
break; /* ditto */
}
memcpy (kv[ndx++].s, str, len + 1); /* copy str to kv[ndx].s */
}
if (fp != stdin) /* close file if not stdin */
fclose (fp);
for (size_t i = 0; i < ndx; i++) {
printf ("kv[%2zu] : %8u %4u %2u %s\n", i,
kv[i].key, kv[i].val, kv[i].n, kv[i].s);
free (kv[i].s); /* free string */
}
free (kv); /* free structs */
}
Example Use/Output
Using your data file as input, you would receive the following:
$ ./bin/fgets_sscanf_keyval <dat/keyval.txt
kv[ 0] : 1082018 1200 79 Meeting with President
kv[ 1] : 2012018 1200 79 Meet with John at cinema
kv[ 2] : 2082018 1400 30 games with Alpha
kv[ 3] : 3022018 1200 79 sports
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/fgets_sscanf_keyval <dat/keyval.txt
==6703== Memcheck, a memory error detector
==6703== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6703== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==6703== Command: ./bin/fgets_sscanf_keyval
==6703==
kv[ 0] : 1082018 1200 79 Meeting with President
kv[ 1] : 2012018 1200 79 Meet with John at cinema
kv[ 2] : 2082018 1400 30 games with Alpha
kv[ 3] : 3022018 1200 79 sports
==6703==
==6703== HEAP SUMMARY:
==6703== in use at exit: 0 bytes in 0 blocks
==6703== total heap usage: 5 allocs, 5 frees, 264 bytes allocated
==6703==
==6703== All heap blocks were freed -- no leaks are possible
==6703==
==6703== For counts of detected and suppressed errors, rerun with: -v
==6703== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have any further questions. If you need to further split kv[i].s, then you can think about using strtok.
You are storing the same pointer in the in_key_arr over and over again.
You roughly need this:
in_key = strtok(buf, delimiter);
printf("%s\n", in_key);
char *newkey = malloc(strlen(in_key) + 1); // <<<< allocate new memory
strcpy(newkey, in_key);
in_key_arr[count] = newkey; // <<<< store newkey
count++;
Disclaimer:
no error checking is done for brevity
the malloced memory needs to be freed once you're done with it.
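You assign in_key an address with the call to alloc and then immediately overwrite it with the pointer returned by strtok, which always points into buf, so every entry in in_key_arr ends up holding the same address. Copy the string returned by strtok into in_key instead. Printing the addresses shows the problem: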
char *copy = strchr(p_line, ' ');
if (copy) {
if ((in_val = alloc(strlen(line[count]) + 1)) == NULL) {
return -1;
} else {
printf("arr: %ul\n", in_val);
strcpy(in_val, copy + 1);
printf("arr: %s", in_val);
in_val_arr[count] = in_val;
}
} else
printf("Could not find a space\n");
/* We now need to get the first word from the input buffer*/
if ((in_key = alloc(strlen(line[count]) + 1)) == NULL) {
return -1;
}
else {
printf("key: %ul\n", in_key);
in_key = strtok(buf, delimiter);
printf("key:\%ul %s\n",in_key, in_key);
in_key_arr[count++] = in_key; // <-- Printed out well
}
output:
allocbuf: 1433760064l
Processing file...
all: 1433760064l
arr: 1433760064l
arr: 1200 79 Meeting with President
all: 1433760104l
key: 1433760104l
key:4294956352l 1082018
This change fixed it:
strcpy(in_key, strtok(buf, delimiter));

Split a string on the occurrence of a particular character [duplicate]

I am trying to split a string on every occurrence of a closing bracket and store each piece in a character array, one record per line, in a while loop.
This is the input I am reading in a char * input
(000,P,ray ),(100,D,ray ),(009,L,art ),(0000,C,max ),(0000,S,ben ),(020,P,kay ),(040,L,photography ),(001,C,max ),(0001,S,ben ),(0001,P,kay )
This is the output I am trying to produce in a char each[30] = {}
(000,P,ray ),
(100,D,ray ),
(009,L,art ),
(000,C,max ),
(000,S,ben ),
(020,P,kay ),
(040,L,photography ),
(001,C,max ),
(201,S,ben ),
(301,P,kay )
I copied the input to a char * temp so that strtok() does not change the input. But I am not understanding how to use strtok() inside the while loop condition. Does anyone know how to do it ?
Thanks,
UPDATE:
Sorry if I have violated the rules.
Here's my code -
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
int main(int argc, char *argv[]){
size_t len = 0;
ssize_t read;
char * line = NULL;
char *eacharray;
FILE *fp = fopen(argv[1], "r");
char * each = NULL;
while ((read = getline(&line, &len, fp)) != -1) {
printf("%s\n", line);
eacharray = strtok(line, ")");
// printf("%s +\n", eacharray);
while(eacharray != NULL){
printf("%s\n", eacharray);
eacharray = strtok(NULL, ")");
}
}
return 0;
}
It produces an output like this -
(000,P,ray
,(100,D,ray
,(009,L,art
,(0000,C,max
,(0000,S,ben
,(020,P,kay
,(040,L,photography
,(001,C,max
,(0001,S,ben
,(0001,P,kay
I would not use strtok, because your simple parser should first detect an opening parenthesis and then search for the closing one. With strtok, you could only split at a closing parenthesis; you could not check whether there was an opening one, and you'd have to skip the characters up to the next opening parenthesis "manually".
BTW: you probably meant each[10][30], not each[30].
See the following code, which looks for opening and closing parentheses and copies the content in between (including the parentheses):
#include <stdio.h>
#include <string.h> /* strchr, memcpy */
#include <stddef.h> /* ptrdiff_t */
int main(int argc, char *argv[]) {
const char* source ="(000,P,ray ),"
"(100,D,ray ),"
"(009,L,art ),"
"(0000,C,max ),"
"(0000,S,ben ),"
"(020,P,kay ),"
"(040,L,photography ),"
"(001,C,max ),"
"(0001,S,ben ),"
"(0001,P,kay )";
char each[10][30] = {{ 0 }};
const char *str = source;
int i;
for (i=0; i<10; i++) {
const char* begin = strchr(str, '(');
if (!begin)
break;
const char* end = strchr(begin,')');
if (!end)
break;
end++;
ptrdiff_t length = end - begin;
if (length >= 30)
break;
memcpy(each[i], begin, length);
str = end;
}
for (int l=0; l<i; l++) {
printf("%s", each[l]);
if (l!=i-1)
printf(",\n");
}
putchar ('\n');
}
Hope it helps.
There are many ways to approach this problem. Stephan has a good approach using the functions available in string.h (and kindly contributed the example source string). Another basic way to approach this problem (or any string parsing problem) is to simply walk-a-pointer down the string, comparing characters as you go and taking the appropriate action.
When doing so with multiple delimiters (e.g. ',' and (...)), it is often helpful to track the "state" of your position within the original string. Here a simple flag in (for inside or outside (...)) will let you control whether you copy characters to your array or skip them.
The rest is just keeping track of your indexes and protecting your array bounds as you loop over each character (more of an accounting problem from a memory standpoint -- which you should do regardless)
Putting the pieces together, and providing additional details in comments in-line below, you could do something like the following:
#include <stdio.h>
#define MAXS 10 /* if you need constants -- declare them */
#define MAXL 30 /* (don't use 'magic numbers' in code) */
int main (void) {
const char* source ="(000,P,ray ),"
"(100,D,ray ),"
"(009,L,art ),"
"(0000,C,max ),"
"(0000,S,ben ),"
"(020,P,kay ),"
"(040,L,photography ),"
"(001,C,max ),"
"(0001,S,ben ),"
"(0001,P,kay )";
char each[MAXS][MAXL] = {{0}},
*p = (char *)source;
int i = 0, in = 0, ndx = 0; /* in - state flag, ndx - row index */
while (ndx < MAXS && *p) { /* loop over all chars filling 'each' */
if (*p == '(') { /* (while protecting your row bounds) */
each[ndx][i++] = *p; /* copy opening '(' */
in = 1; /* set flag 'in'side record */
}
else if (*p == ')') {
each[ndx][i++] = *p; /* copy closing ')' */
each[ndx++][i] = 0; /* nul-terminate */
i = in = 0; /* reset 'i' and 'in' */
}
else if (in) { /* if we are 'in', copy char */
each[ndx][i++] = *p;
}
if (i + 1 == MAXL) { /* protect column bounds */
fprintf (stderr, "record exceeds %d chars.\n", MAXL);
return 1;
}
p++; /* increment pointer */
}
for (i = 0; i < ndx; i++) /* display results */
printf ("each[%2d] : %s\n", i, each[i]);
return 0;
}
(note: above, each row in each will be nul-terminated by default as a result of initializing all characters in each to zero at declaration, but it is still good practice to affirmatively nul-terminate all strings)
Example Use/Output
$ ./bin/testparse
each[ 0] : (000,P,ray )
each[ 1] : (100,D,ray )
each[ 2] : (009,L,art )
each[ 3] : (0000,C,max )
each[ 4] : (0000,S,ben )
each[ 5] : (020,P,kay )
each[ 6] : (040,L,photography )
each[ 7] : (001,C,max )
each[ 8] : (0001,S,ben )
each[ 9] : (0001,P,kay )
Get comfortable using either method. You can experiment whether using if.. else if.. or a switch best fits any parsing problem. The functions in string.h can be the better choice. It all depends on your input. Being comfortable with both approaches helps you better tailor your code to the problem at hand.
Example with getline and realloc of Rows
Since you are using getline to read each line, it will potentially read and allocate storage for an unlimited number of records (e.g. (...)). The way to handle this is to allocate storage for your records (pointers) dynamically, keep track of the number of pointers used, and realloc to allocate more pointers when you reach your record limit. You will need to validate each allocation, and understand you allocate each as a pointer-to-pointer-to-char (e.g. char **each) instead of each being a 2D array (e.g. char each[rows][cols]). (though you will still access and use the string held with each the same way (e.g. each[0], each[1], ...))
The code below will read from the filename given as the first argument (or from stdin if no argument is given). The approach is a standard approach for handling this type problem. each is declared as char **each = NULL; (a pointer-to-pointer-to-char). You then allocate an initial number of pointers (rows) for each with:
each = malloc (rows * sizeof *each); /* allocate rows no. of pointers */
if (!each) { /* validate allocation */
perror ("each - memory exhausted"); /* throw error */
return 1;
}
You then use getline to read each line into a buffer (buf) and pass a pointer to buf to the logic we used above. (NOTE, you must preserve a pointer to buf as buf points to storage dynamically allocated by getline that you must free later.)
The only addition to the normal parsing logic is we now need to allocate storage for each of the records we parse, and assign the address of the block of memory holding each record to each[x]. (we use strcpy for that purpose after allocating the storage for each record).
To simplify parsing, we originally parse each record into a fixed size buffer (rec) since we do not know the length of each record ahead of time. You can dynamically allocate/reallocate for rec as well, but that adds an additional level of complexity -- and I suspect you will struggle with the additions as they stand now. Just understand we parse each record from buf into rec (which we set at 256 chars #define MAXR 256 -- more than ample for the expected 30-31 char record size) Even though we use a fixed length rec, we still check i against MAXR to protect the fixed array bounds.
The storage for each record and copy of parsed records from rec to each[ndx] is handled when a closing ) is encountered as follows:
(note - storage for the nul-character is included in 'i' where you would normally see 'i + 1')
each[ndx] = malloc (i); /* allocate storage for rec */
if (!each[ndx]) { /* validate allocation */
perror ("each[ndx] - memory exhausted");
return 1;
}
strcpy (each[ndx], rec);/* copy rec to each[ndx] */
(note: by approaching allocation in this manner, you allocate the exact amount of storage you need for each record. There is no wasted space. You can handle 1 record or 10,000,000 records (to the extent of the memory on your computer))
Here is your example. Take time to understand what every line does and why. Ask questions if you do not understand. This is the meat-and-potatoes of dynamic allocation and once you get it -- you will have a firm understanding of the basics for handling any of your storage needs.
#include <stdio.h>
#include <stdlib.h> /* for malloc, realloc */
#include <string.h> /* for strcpy */
#define ROWS 10 /* initial number of rows to allocate */
#define MAXR 256 /* maximum record length between (...) */
int main (int argc, char **argv) {
int in = 0; /* in - state flag */
char **each = NULL, /* pointer to pointer to char */
*buf = NULL; /* buffer for getline */
size_t rows = ROWS, /* currently allocated row pointers */
ndx = 0, /* ndx - row index */
n = 0, /* buf size (0 - getline decides) */
i = 0; /* loop counter */
ssize_t nchr = 0; /* num chars read by getline (return) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
each = malloc (rows * sizeof *each); /* allocate rows no. of pointers */
if (!each) { /* validate allocation */
perror ("each - memory exhausted"); /* throw error */
return 1;
}
while ((nchr = getline (&buf, &n, fp)) != -1) { /* read line into buf */
char *p = buf, /* pointer to buf */
rec[MAXR] = ""; /* temp buffer to hold record */
while (*p) { /* loop over all chars filling 'each' */
if (*p == '(') { /* (while protecting your row bounds) */
rec[i++] = *p; /* copy opening '(' */
in = 1; /* set flag 'in'side record */
}
else if (*p == ')') {
rec[i++] = *p; /* copy closing ')' */
rec[i++] = 0; /* nul-terminate */
each[ndx] = malloc (i); /* allocate storage for rec */
if (!each[ndx]) { /* validate allocation */
perror ("each[ndx] - memory exhausted");
return 1;
}
strcpy (each[ndx], rec);/* copy rec to each[ndx] */
i = in = 0; /* reset 'i' and 'in' */
ndx++; /* increment row index */
if (ndx == rows) { /* check if rows limit reached */
/* reallocate 2X number of pointers using tmp pointer */
void *tmp = realloc (each, rows * sizeof *each * 2);
if (!tmp) { /* validate realloc succeeded */
perror ("realloc each - memory exhausted");
goto memfull; /* each still contains original recs */
}
each = tmp; /* assign reallocated block to each */
rows *= 2; /* update rows with current pointers */
}
}
else if (in) { /* if we are 'in', copy char */
rec[i++] = *p;
}
if (i + 1 == MAXR) { /* protect column bounds */
fprintf (stderr, "record exceeds %d chars.\n", MAXR);
return 1;
}
p++; /* increment pointer */
}
}
memfull:; /* label for goto */
free (buf); /* free memory allocated by getline */
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (i = 0; i < ndx; i++) { /* display results */
printf ("each[%2zu] : %s\n", i, each[i]);
free (each[i]); /* free memory for each record */
}
free (each); /* free pointers */
return 0;
}
(note: since nchr isn't used to trim the '\n' from the end of the buffer read by getline, you can eliminate that variable. Just note that there is no need to call strlen on the buffer returned by getline as the number of characters read is the return value)
Example Use/Output
Note: for the input test, I just put your line of records in the file dat/delimrecs.txt and copied it 4 times giving a total of 40 records in 4 lines.
$ ./bin/parse_str_state_getline <dat/delimrecs.txt
each[ 0] : (000,P,ray )
each[ 1] : (100,D,ray )
each[ 2] : (009,L,art )
each[ 3] : (0000,C,max )
each[ 4] : (0000,S,ben )
<snip 5 - 34>
each[35] : (020,P,kay )
each[36] : (040,L,photography )
each[37] : (001,C,max )
each[38] : (0001,S,ben )
each[39] : (0001,P,kay )
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through one.
$ valgrind ./bin/parse_str_state_getline <dat/delimrecs.txt
==13035== Memcheck, a memory error detector
==13035== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==13035== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==13035== Command: ./bin/parse_str_state_getline
==13035==
each[ 0] : (000,P,ray )
each[ 1] : (100,D,ray )
each[ 2] : (009,L,art )
each[ 3] : (0000,C,max )
each[ 4] : (0000,S,ben )
<snip 5 - 34>
each[35] : (020,P,kay )
each[36] : (040,L,photography )
each[37] : (001,C,max )
each[38] : (0001,S,ben )
each[39] : (0001,P,kay )
==13035==
==13035== HEAP SUMMARY:
==13035== in use at exit: 0 bytes in 0 blocks
==13035== total heap usage: 46 allocs, 46 frees, 2,541 bytes allocated
==13035==
==13035== All heap blocks were freed -- no leaks are possible
==13035==
==13035== For counts of detected and suppressed errors, rerun with: -v
==13035== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
This is a lot to take in, but this is a basic minimal example of the framework for handling an unknown number of objects.
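The same framework can be reduced to a small sketch for a plain array of int, which is closer to the original question about a stream of numbers. The function name push_int and the starting capacity of 2 are assumptions made for illustration; the essential pattern is the one used above: realloc through a temporary pointer so the original block is still valid if the call fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Append v to the array at *arr, doubling the capacity when it is full.
 * Returns 0 on success, -1 if realloc fails (in which case *arr is
 * untouched and still owned by the caller).
 * push_int is a hypothetical helper name used for illustration. */
int push_int (int **arr, size_t *used, size_t *cap, int v)
{
    assert (arr != NULL && used != NULL && cap != NULL);

    if (*used == *cap) {                        /* no room left */
        size_t newcap = *cap ? *cap * 2 : 2;    /* start at 2, then 2X */
        void *tmp = realloc (*arr, newcap * sizeof **arr);
        if (!tmp)                               /* validate realloc */
            return -1;
        *arr = tmp;                             /* assign new block */
        *cap = newcap;
    }
    (*arr)[(*used)++] = v;                      /* store the value */

    return 0;
}
```

The caller starts with a NULL pointer and both counters set to 0, calls push_int for each value read, and frees the array once when done.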

How to read from a char array (array of names)?

Guys I have used this to read in data successfully from a file.
void read(FILE *fPtr, int *a, int *b, char **c, int size)
{
char line[100];
int num = 0;
while ( fgets(line, 99, fPtr) && num < size )
{
// This is a names array
c[num] = strtok (line, " ");
// here the name prints out fine
a[num] = atoi(strtok (NULL, " \n"));
b[num] = atoi(strtok (NULL, " \n"));
num++;
}
}
But I am unable to read properly from this char ** array.
main function:
int main()
{
FILE *fp;
int g[30];
int a[30];
char *names[30];
// Open file
fp = fopen("input.txt", "r");
read(
fp, g, a,
names, 30 );
printf("%s\n", names[0]);
printThis(
g, a,
names, 30 );
return 0;
}
print this:
void printThis(int* g, int* a, char** n, int s)
{
for (int i = 0; i < 5; ++i)
printf("%s\n", n[i]);
}
This does not print the names at all! It prints just #, some space characters and #. Why isn't it printing the names? Is this not the proper way to access char arrays?
As has been addressed in the comments, after assigning c[num] = strtok (line, " "); you found that the values in names no longer pointed to anything meaningful. When you declare a variable within a function, unless you dynamically allocate storage for it (e.g. with malloc), the variable only survives for the lifetime of the function call. Why?
When a function is called a separate area of memory is created for the execution of the function called the function stack frame. That memory holds the variables declared within the function. When the function returns, the stack frame is destroyed (actually just released for reuse as required). Any variables declared local to the function are no longer available for access.
In your case, strtok returns a pointer to the beginning of a token within line (or NULL if no token is found), and line is local to read. Upon function return, the pointer returned by strtok points to memory that has been released for reuse and can no longer be validly accessed within the calling function (main here) -- thus your problem.
How do I fix this? Well, there are many ways. The first, easiest solution, is just to dynamically allocate storage for the names, and make a copy of the tokens found by strtok and store them within the memory you have allocated. That memory will survive return and can be validly accessed in the caller. (you must remember to free the memory when it is no longer needed)
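A minimal sketch of that first fix, assuming a hypothetical helper named copy_first_token and an arbitrary 128-byte scratch buffer (neither appears in the question): the token is found in a local buffer, but only a heap copy of it is handed back to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of the first token in line, or NULL if
 * there is no token or allocation fails. The caller must free the result.
 * copy_first_token and the 128-byte scratch size are illustrative
 * assumptions; longer lines are truncated by this sketch. */
char *copy_first_token (const char *line, const char *delim)
{
    assert (line != NULL && delim != NULL);

    char buf[128];                       /* local: gone after return */
    strncpy (buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = 0;             /* guarantee nul-termination */

    char *tok = strtok (buf, delim);     /* points inside buf */
    if (!tok)
        return NULL;

    size_t len = strlen (tok);
    char *copy = malloc (len + 1);       /* storage that survives return */
    if (!copy)
        return NULL;
    memcpy (copy, tok, len + 1);         /* copy token and nul-character */

    return copy;
}
```

Returning tok itself would reproduce the original problem, since tok points into buf; returning copy is valid because the allocated block lives until it is freed.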
You also need to have your read (my readfp below) function return a meaningful value rather than void. It would be nice to know how many records were actually filled. If you change the function type to int, then you can return num and know how many records (e.g. name[x] a[x] & g[x] combination) were filled.
Wait... num was declared local to read, why can I return it? Every function can always return a value of its own type. If you declare int read (...), read can return an int value, and it survives the return because the value itself is copied back to the caller under the calling convention between caller and callee.
Putting those pieces together, and using strtol in place of atoi for int conversions (and making it easy by creating a separate function to handle and validate the conversions), you could do something similar to the following. Also note that main takes arguments which you can make use of to pass a filename to open, rather than hardcoding, e.g. fopen("input.txt", "r").
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <limits.h>
/* define constants for use below */
enum { BASE = 10, MAXP = 30, MAXL = 100 };
int readfp (FILE *fp, int *a, int *b, char **c, int size);
int xstrtol (char *s, int base);
int main (int argc, char **argv) {
int g[MAXP] = {0},
a[MAXP] = {0},
n = 0;
char *names[MAXP] = {NULL};
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
n = readfp (fp, g, a, names, MAXP );
if (fp != stdin) fclose (fp); /* close file if not stdin */
if (n)
for (int i = 0; i < n; i++) {
printf ("%-30s %8d %8d\n", names[i], g[i], a[i]);
free (names[i]); /* free memory allocated by strdup */
}
return 0;
}
int readfp (FILE *fp, int *a, int *b, char **c, int size)
{
char line[MAXL] = "";
int num = 0;
while (num < size && fgets (line, sizeof line, fp))
{
char *delim = " \t\n",
*tmp = strtok (line, delim);
int itmp = 0;
if (tmp)
c[num] = strdup (tmp); /* allocate storage and copy tmp */
else {
fprintf (stderr, "error: strtok failed - c[%d].\n", num);
break;
}
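if ((tmp = strtok (NULL, delim))) {
itmp = xstrtol (tmp, BASE); /* exits on invalid input, so 0 is a valid value */
a[num] = itmp;
}
else {
fprintf (stderr, "error: strtok failed - a[%d].\n", num);
free (c[num]); /* release name copied for incomplete record */
c[num] = NULL;
break;
}
if ((tmp = strtok (NULL, delim))) {
itmp = xstrtol (tmp, BASE); /* exits on invalid input, so 0 is a valid value */
b[num] = itmp;
}
else {
fprintf (stderr, "error: strtok failed - b[%d].\n", num);
free (c[num]); /* release name copied for incomplete record */
c[num] = NULL;
break;
}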
num++;
}
return num;
}
int xstrtol (char *s, int base)
{
errno = 0;
char *endptr = NULL;
long v = strtol (s, &endptr, base);
if (errno) {
fprintf (stderr, "xstrtol() error: over/underflow detected.\n");
exit (EXIT_FAILURE);
}
if (s == endptr && v == 0) {
fprintf (stderr, "xstrtol() error: no digits found.\n");
exit (EXIT_FAILURE);
}
if (v < INT_MIN || INT_MAX < v) {
fprintf (stderr, "xstrtol() error: out of range of integer.\n");
exit (EXIT_FAILURE);
}
return (int)v;
}
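The xstrtol helper above calls exit on any conversion problem. Where the caller would rather skip a bad record and continue, a variant that reports failure through its return value may be more convenient. This is only a sketch; the name parse_int and its 0 / -1 return convention are assumptions, not part of the code above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Convert the leading integer in s, storing it in *out.
 * Returns 0 on success, -1 if no digits were found or the value does
 * not fit in an int. Unlike xstrtol it never terminates the program.
 * parse_int is a hypothetical helper name used for illustration. */
int parse_int (const char *s, int base, int *out)
{
    assert (s != NULL && out != NULL);

    char *endptr = NULL;
    errno = 0;                               /* reset before the call */
    long v = strtol (s, &endptr, base);

    if (endptr == s)                         /* no digits converted */
        return -1;
    if (errno == ERANGE)                     /* over/underflow of long */
        return -1;
    if (v < INT_MIN || v > INT_MAX)          /* out of range of int */
        return -1;

    *out = (int)v;
    return 0;
}
```

Because success is signalled separately from the converted value, an input of 0 is handled like any other number.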
Example Input File
$ cat dat/names.txt
Ryan,Elizabeth 62 325
McIntyre,Osborne 84 326
DuMond,Kristin 18 327
Larson,Lois 42 328
Thorpe,Trinity 15 329
Ruiz,Pedro 35 330
Ali,Mohammed 60 331
Vashti,Indura 20 332
Example Use/Output
$ ./bin/readnames <dat/names.txt
Ryan,Elizabeth 62 325
McIntyre,Osborne 84 326
DuMond,Kristin 18 327
Larson,Lois 42 328
Thorpe,Trinity 15 329
Ruiz,Pedro 35 330
Ali,Mohammed 60 331
Vashti,Indura 20 332
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to write beyond/outside the bounds of your allocated block of memory, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through one.
$ valgrind ./bin/readnames <dat/names.txt
==11303== Memcheck, a memory error detector
==11303== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==11303== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==11303== Command: ./bin/readnames
==11303==
Ryan,Elizabeth 62 325
McIntyre,Osborne 84 326
DuMond,Kristin 18 327
Larson,Lois 42 328
Thorpe,Trinity 15 329
Ruiz,Pedro 35 330
Ali,Mohammed 60 331
Vashti,Indura 20 332
==11303==
==11303== HEAP SUMMARY:
==11303== in use at exit: 0 bytes in 0 blocks
==11303== total heap usage: 8 allocs, 8 frees, 112 bytes allocated
==11303==
==11303== All heap blocks were freed -- no leaks are possible
==11303==
==11303== For counts of detected and suppressed errors, rerun with: -v
==11303== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.
Your code has several issues, as mentioned in the comments, but you should carefully consider the comment by Weather Vane. Right now your names array is char *names[30]; and you store in it pointers into line, a local array in read. You need to copy the data somewhere that outlives the call (line ceases to exist once read returns, and each fgets overwrites it). So I would suggest changing your names array to something like:
char names[30][WORD_LEN];
For copying data you might need something like strcpy. I changed your code a bit and was able to see the desired result:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define WORD_LEN 100
void read(FILE *fPtr, int *a, int *b, char c[][WORD_LEN], int size)
{
char line[100];
int num = 0;
while ( fgets(line, 99, fPtr) && num < size ) {
// This is a names array
//strcpy(c[num], strtok (line, " "));
// here the name prints out fine
// !avoid atoi, use sscanf instead
//a[num] = atoi(strtok (NULL, " "));
//b[num] = atoi(strtok (NULL, " "));
// only count the record when all three fields were converted
if (sscanf(line, "%99s%d%d", c[num], &a[num], &b[num]) == 3)
num++;
}
}
void printThis(int* g, int* a, char n[][WORD_LEN], int s)
{
for (int i = 0; i < 2; ++i)
printf("%s %d %d\n", n[i], a[i], g[i]);
}
int main()
{
FILE *fp;
int g[30];
int a[30];
char names[30][WORD_LEN];
// Open file
fp = fopen("input.txt", "r");
read(fp, g, a, names, 30 );
printThis(g, a, names, 30 );
return 0;
}
I changed the file structure to space-delimited tokens, and I was able to see the names:
~/work : $ cat input.txt
Rohan 120000 4300000
Prashant 12 43
~/work : $ g++ readCharArray.c
~/work : $ ./a.out
Rohan 4300000 120000
Prashant 43 12
~/work : $
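If you prefer the strcpy-style approach mentioned earlier in this answer over sscanf, a bounded copy into one row of the 2D array could look like the sketch below. The helper name store_name is hypothetical; snprintf is used here in place of strcpy so that an over-long name is truncated instead of overflowing the row.

```c
#include <assert.h>
#include <stdio.h>

#define WORD_LEN 100

/* Copy src into row idx of names, truncating if it does not fit.
 * Returns 1 if the whole string fit, 0 if it was truncated.
 * store_name is a hypothetical helper name used for illustration. */
int store_name (char names[][WORD_LEN], size_t idx, const char *src)
{
    assert (names != NULL && src != NULL);

    /* snprintf writes at most WORD_LEN bytes including the nul-character */
    int n = snprintf (names[idx], WORD_LEN, "%s", src);

    return n >= 0 && n < WORD_LEN;
}
```

Since each row is storage owned by the caller's array, the name remains valid after the reading function returns.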

Storing and accessing data in memory using pointers from txt file

So I'm currently working on a project that uses data from a txt file. The user is prompted for the filename, and the first two lines of the txt file are integers that essentially contain the row and column values of the txt file.
There are two things that are confusing me when writing this program the way my instructor is asking. For the criteria, she says:
read in the data and place into an array of data and
your code should access memory locations via pointers and pointer arithmetic, no []'s in the code you submit.
The left-most column is an identifier while the rest of the row should be considered as that row's data (floating point values).
An example of what the file might contain is:
3
4
abc123 8.55 5 0 10
cdef123 83.50 10.5 10 55
hig123 7.30 6 0 1.9
My code:
//Creates array for 100 characters for filename
char fileName[100];
printf("Enter the file name to be read from: ");
scanf("%s", fileName);
FILE *myFile;
myFile = fopen(fileName, "r");
//Checks if file opened correctly
if (myFile == NULL)
{
printf("Error opening file\n"); //full file name must be entered
}
else {
printf("File opened successfully\n");
}
//gets value of records and value per records from file
//This will be the first 2 lines from line
fscanf(myFile, "%d %d", &records, &valuesPerRecords);
//printf("%d %d\n", records, valuesPerRecords); //Check int values from file
int counter = 0;
char *ptr_data;
ptr_data = (char*)malloc(records*(valuesPerRecords));
int totalElements = records*(valuesPerRecords);
/*If malloc can't allocate enough space, print error*/
if (ptr_data == NULL) {
printf("Error\n");
exit(-1);
}
int counter;
for (counter = 0; counter < totalElements; counter++){
fscanf(myFile, "%s", &ptr_data);
}
So I'm wondering whether, so far, I'm on the right track. I can't seem to think of a way to have the first column read in as a string while the rest is read in as numbers. I'll also have to use the stored values later and sort them, but that's a different problem for a later date.
First off, your prof apparently wants you to become familiar with walking a pointer through a collection of both strings (the labels) and numbers (the floating-point values) using pointer arithmetic without using array indexing. A solid pointer familiarity assignment.
To handle the labels you can use a pointer to pointer to type char (a double pointer) as each pointer will point to an array of chars. You can declare and allocate pointers for labels as follows. (this assumes you have already read the rows and cols values from the input file)
char buf[MAXC] = "", /* temporary line buffer */
**labels = NULL, /* collection of labels */
**lp = NULL; /* pointers to walk labels */
...
/* allocate & validate rows char* pointers */
if (!(labels = calloc (rows, sizeof *labels))) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return 1;
}
You can do the same thing for your numeric values, except you only need a pointer to type double, since you are simply allocating storage for a collection of doubles.
double *mtrx = NULL, /* collection of numbers */
*p; /* pointers to walk numbers */
...
nptrs = rows * cols; /* set number of doubles required */
/* allocate & validate nptrs doubles */
if (!(mtrx = calloc (nptrs, sizeof *mtrx))) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return 1;
}
The use of the pointers lp and p is crucial because you cannot increment either labels or mtrx (without saving the original address): doing so loses the pointer to the start of the memory allocated to each, immediately causing a memory leak (you have no way to free the block) and preventing you from ever being able to access the beginning again. Each time you need to walk over labels or mtrx, just assign the start address to the pointer, e.g.
p = mtrx; /* set pointer p to mtrx */
lp = labels; /* set pointer lp to labels */
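As a small sketch of walking a block with a separate pointer and no [] indexing, the function below sums one row of a rows x cols collection stored contiguously, as mtrx is. The name row_sum is hypothetical; the original pointer passed in is never modified, so the start address is preserved for the eventual free.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the values of one row in a contiguous rows x cols block using
 * only pointer arithmetic. mtrx itself is never advanced.
 * row_sum is a hypothetical helper name used for illustration. */
double row_sum (const double *mtrx, size_t cols, size_t row)
{
    assert (mtrx != NULL);

    const double *p = mtrx + row * cols;   /* first value of the row */
    const double *end = p + cols;          /* one past the last value */
    double sum = 0.0;

    while (p < end)                        /* walk the row */
        sum += *p++;

    return sum;
}
```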
Now you are free to read and parse the lines in any manner you choose, but I would strongly recommend using line-oriented input functions to read each line into a temporary line buffer, and then parsing the values you need using sscanf. This has many advantages over reading with fscanf alone. After you read each line, you can parse/validate each value before allocating space for the strings and assigning the values.
(note: I cheat below with a single sscanf call, where you should actually assign a char* pointer to buf, read the label, then loop cols number of times (perhaps using strtok/strtod) checking each value and assigning to mtrx, -- that is left to you)
/* read each remaining line, allocate/fill pointers */
while (ndx < rows && fgets (buf, MAXC, fp)) {
if (*buf == '\n') continue; /* skip empty lines */
char label[MAXC] = ""; /* temp storage for labels */
double val[cols]; /* temp storage for numbers */
if (sscanf (buf, "%s %lf %lf %lf %lf", /* parse line */
label, &val[0], &val[1], &val[2], &val[3]) ==
(int)(cols + 1)) {
*lp++ = strdup (label); /* alloc/copy label */
for (i = 0; i < cols; i++) /* alloc/copy numbers */
*p++ = val[i];
ndx++; /* increment index */
}
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
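The note above mentions looping over the values with strtod instead of a single fixed sscanf call. One possible shape for that loop is sketched below; the name parse_doubles is an assumption, and the caller is assumed to pass a pointer positioned just past the label. The function writes through an advancing pointer, so it also avoids [] indexing.

```c
#include <assert.h>
#include <stdlib.h>

/* Convert up to max whitespace-separated numbers from s, storing them
 * through out. Returns the number of values converted, so the caller
 * can compare it against cols.
 * parse_doubles is a hypothetical helper name used for illustration. */
size_t parse_doubles (const char *s, double *out, size_t max)
{
    assert (s != NULL && out != NULL);

    size_t n = 0;

    while (n < max) {
        char *endptr = NULL;
        double v = strtod (s, &endptr);   /* skips leading whitespace */
        if (endptr == s)                  /* nothing converted: stop */
            break;
        *out++ = v;                       /* store and advance */
        n++;
        s = endptr;                       /* continue after the number */
    }

    return n;
}
```

A line would be accepted only when the returned count equals cols, which removes the hardcoded four conversions from the sscanf format.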
Then it is simply a matter of looping over the values again, using or outputting as needed, and then freeing the memory you allocated. You could do that with something similar to:
p = mtrx; /* reset pointer p to mtrx */
lp = labels; /* reset pointer lp to labels */
for (i = 0; i < ndx; i++) { /* only the ndx rows actually read */
printf (" %-10s", *lp);
free (*lp++);
for (j = 0; j < cols; j++)
printf (" %7.2lf", *p++);
putchar ('\n');
}
free (mtrx); /* free pointers */
free (labels);
That's basically one of many approaches. Putting it all together, you could do:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum { MAXC = 512 }; /* constants for max chars per-line */
int main (int argc, char **argv) {
char buf[MAXC] = "", /* temporary line buffer */
**labels = NULL, /* collection of labels */
**lp = NULL; /* pointers to walk labels */
double *mtrx = NULL, /* collection of numbers */
*p; /* pointers to walk numbers */
size_t i, j, ndx = 0, rows = 0, cols = 0, nptrs = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
while (fgets (buf, MAXC, fp)) /* get rows, ignore blank lines */
if (sscanf (buf, "%zu", &rows) == 1)
break;
while (fgets (buf, MAXC, fp)) /* get cols, ignore blank lines */
if (sscanf (buf, "%zu", &cols) == 1)
break;
if (!rows || !cols) { /* validate rows & cols > 0 */
fprintf (stderr, "error: rows and cols values not found.\n");
return 1;
}
nptrs = rows * cols; /* set number of doubles required */
/* allocate & validate nptrs doubles */
if (!(mtrx = calloc (nptrs, sizeof *mtrx))) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return 1;
}
/* allocate & validate rows char* pointers */
if (!(labels = calloc (rows, sizeof *labels))) {
fprintf (stderr, "error: virtual memory exhausted.\n");
return 1;
}
p = mtrx; /* set pointer p to mtrx */
lp = labels; /* set pointer lp to labels */
/* read each remaining line, allocate/fill pointers */
while (ndx < rows && fgets (buf, MAXC, fp)) {
if (*buf == '\n') continue; /* skip empty lines */
char label[MAXC] = ""; /* temp storage for labels */
double val[cols]; /* temp storage for numbers */
if (sscanf (buf, "%s %lf %lf %lf %lf", /* parse line */
label, &val[0], &val[1], &val[2], &val[3]) ==
(int)(cols + 1)) {
*lp++ = strdup (label); /* alloc/copy label */
for (i = 0; i < cols; i++) /* alloc/copy numbers */
*p++ = val[i];
ndx++; /* increment index */
}
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
p = mtrx; /* reset pointer p to mtrx */
lp = labels; /* reset pointer lp to labels */
for (i = 0; i < ndx; i++) { /* only the ndx rows actually read */
printf (" %-10s", *lp);
free (*lp++);
for (j = 0; j < cols; j++)
printf (" %7.2lf", *p++);
putchar ('\n');
}
free (mtrx); /* free pointers */
free (labels);
return 0;
}
Example Input File Used
$ cat dat/arrinpt.txt
3
4
abc123 8.55 5 0 10
cdef123 83.50 10.5 10 55
hig123 7.30 6 0 1.9
Example Use/Output
$ ./bin/arrayptrs <dat/arrinpt.txt
abc123 8.55 5.00 0.00 10.00
cdef123 83.50 10.50 10.00 55.00
hig123 7.30 6.00 0.00 1.90
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you haven't written beyond/outside your allocated block of memory, attempted to read or base a jump on an uninitialized value, and finally, to confirm that you have freed all the memory you have allocated. For Linux, valgrind is the normal choice.
$ valgrind ./bin/arrayptrs <dat/arrinpt.txt
==22210== Memcheck, a memory error detector
==22210== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==22210== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==22210== Command: ./bin/arrayptrs
==22210==
abc123 8.55 5.00 0.00 10.00
cdef123 83.50 10.50 10.00 55.00
hig123 7.30 6.00 0.00 1.90
==22210==
==22210== HEAP SUMMARY:
==22210== in use at exit: 0 bytes in 0 blocks
==22210== total heap usage: 5 allocs, 5 frees, 142 bytes allocated
==22210==
==22210== All heap blocks were freed -- no leaks are possible
==22210==
==22210== For counts of detected and suppressed errors, rerun with: -v
==22210== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
Always confirm "All heap blocks were freed -- no leaks are possible" and, equally important, "ERROR SUMMARY: 0 errors from 0 contexts". Note: some OSes do not provide adequate leak and error suppression files (the files that exclude system and OS memory from being reported as in use), which will cause valgrind to report that some memory has not yet been freed (despite the fact that you have done your job and freed all blocks you allocated and under your control).
Look things over and let me know if you have any questions.

CSV File Input in C using Structures

I want to print the data from a .csv file line by line, where the fields are separated by a comma delimiter.
This code prints garbage values.
enum gender{ M, F };
struct student{
int stud_no;
enum gender stud_gen;
char stud_name[100];
int stud_marks;
};
void main()
{
struct student s[60];
int i=0,j,roll_no,marks,k,select;
FILE *input;
FILE *output;
struct student temp;
input=fopen("Internal test 1 Marks MCA SEM 1 oct 2014 - CS 101.csv","r");
output=fopen("out.txt","a");
if (input == NULL) {
printf("Error opening file...!!!");
}
while(fscanf(input,"%d,%c,%100[^,],%d", &s[i].stud_no,&s[i].stud_gen,&s[i].stud_name,&s[i].stud_marks)!=EOF)
{
printf("\n%d,%c,%s,%d", s[i].stud_no,s[i].stud_gen,s[i].stud_name,s[i].stud_marks);
i++;
}
}
I also tried the code from "Read .CSV file in C", but it prints only the nth field. I want to display all fields line by line.
Here is my sample input.
1401,F,FERNANDES SUZANNA ,13
1402,M,PARSEKAR VIPUL VILAS,14
1403,M,SEQUEIRA CLAYTON DIOGO,8
1404,M,FERNANDES GLENN ,17
1405,F,CHANDRAVARKAR TANUSHREE ROHIT,15
While there are a number of ways to parse any line into components, one way that can really increase understanding is to use a start and end pointer to work down each line identifying the commas, replacing them with null-terminators (i.e. '\0' or just 0), reading the field, restoring the comma and moving to the next field. This is just a manual application of strtok. The following example does that so you can see what is going on. You can, of course, replace use of the start and end pointers (sp & p, respectively) with strtok.
Read through the code and let me know if you have any questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* maximum number of student to initially allocate */
#define MAXS 256
enum gender { M, F };
typedef struct { /* create typedef to struct */
int stud_no;
enum gender stud_gen;
char *stud_name;
int stud_marks;
} student;
int main (int argc, char *argv[]) {
if (argc < 2) {
printf ("filename.csv please...\n");
return 1;
}
char *line = NULL; /* pointer to use with getline () */
ssize_t read = 0; /* characters read by getline () */
size_t n = 0; /* number of bytes to allocate */
student **students = NULL; /* ptr to array of pointers to struct student */
char *sp = NULL; /* start pointer for parsing line */
char *p = NULL; /* end pointer to use parsing line */
int field = 0; /* counter for field in line */
int cnt = 0; /* counter for number allocated */
int it = 0; /* simple iterator variable */
FILE *fp;
fp = fopen (argv[1], "r"); /* open file , read only */
if (!fp) {
fprintf (stderr, "failed to open file for reading\n");
return 1;
}
students = calloc (MAXS, sizeof (*students)); /* allocate 256 ptrs set to NULL */
/* read each line in input file preserving 1 pointer as sentinel NULL */
while (cnt < MAXS-1 && (read = getline (&line, &n, fp)) != -1) {
sp = p = line; /* set start ptr and ptr to beginning of line */
field = 0; /* set/reset field to 0 */
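students[cnt] = calloc (1, sizeof (**students)); /* alloc each struct zeroed (stud_name starts NULL) */
if (!students[cnt]) { /* validate allocation */
perror ("calloc students[cnt]");
break;
}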
while (*p) /* for each character in line */
{
if (*p == ',') /* if ',' end of field found */
{
*p = 0; /* set as null-term char (temp) */
if (field == 0) students[cnt]->stud_no = atoi (sp);
if (field == 1) {
if (*sp == 'M') {
students[cnt]->stud_gen = 0;
} else {
students[cnt]->stud_gen = 1;
}
}
if (field == 2) students[cnt]->stud_name = strdup (sp); /* strdup allocates for you */
*p = ','; /* replace with original ',' */
sp = p + 1; /* set new start ptr start pos */
field++; /* update field count */
}
p++; /* increment pointer p */
}
students[cnt]->stud_marks = atoi (sp); /* read stud_marks (sp already set to begin) */
cnt++; /* increment students count */
}
fclose (fp); /* close file stream */
if (line) /* free memory allocated by getline */
free (line);
/* iterate over all students and print */
printf ("\nThe students in the class are:\n\n");
while (students[it])
{
printf (" %d %c %-30s %d\n",
students[it]->stud_no, (students[it]->stud_gen) ? 'F' : 'M', students[it]->stud_name, students[it]->stud_marks);
it++;
}
printf ("\n");
/* free memory allocated to struct */
it = 0;
while (students[it])
{
if (students[it]->stud_name)
free (students[it]->stud_name);
free (students[it]);
it++;
}
if (students)
free (students);
return 0;
}
(note: added condition on loop that cnt < MAXS-1 to preserve at least one pointer in students NULL as a sentinel allowing iteration.)
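The start/end pointer technique used above can also be packaged as a small reusable splitter. The sketch below is one way to do it, with split_fields as a hypothetical name; unlike the code above it leaves the separators replaced by nul-characters, so each field stays usable as a string, and unlike strtok it keeps empty fields (two adjacent commas yield an empty string).

```c
#include <assert.h>
#include <stddef.h>

/* Split line in place at each sep, storing a pointer to the start of
 * every field in fields (at most max of them). Separators are
 * overwritten with nul-characters. Returns the number of fields stored.
 * split_fields is a hypothetical helper name used for illustration. */
size_t split_fields (char *line, char sep, char **fields, size_t max)
{
    assert (line != NULL && fields != NULL);

    size_t n = 0;
    char *sp = line;                       /* start of current field */

    for (char *p = line; n < max; p++) {   /* p is the end pointer */
        if (*p == sep || *p == '\0') {
            int last = (*p == '\0');       /* end of line reached? */
            *p = '\0';                     /* terminate the field */
            fields[n++] = sp;              /* record field start */
            sp = p + 1;                    /* next field begins here */
            if (last)
                break;
        }
    }

    return n;
}
```

A line read with getline still ends in '\n', so the last field would carry it unless the newline is trimmed first; atoi, as used above for stud_marks, ignores it.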
input:
$ cat dat/people.dat
1401,F,FERNANDES SUZANNA ,13
1402,M,PARSEKAR VIPUL VILAS,14
1403,M,SEQUEIRA CLAYTON DIOGO,8
1404,M,FERNANDES GLENN ,17
1405,F,CHANDRAVARKAR TANUSHREE ROHIT,15
output:
$ ./bin/stud_struct dat/people.dat
The students in the class are:
1401 F FERNANDES SUZANNA 13
1402 M PARSEKAR VIPUL VILAS 14
1403 M SEQUEIRA CLAYTON DIOGO 8
1404 M FERNANDES GLENN 17
1405 F CHANDRAVARKAR TANUSHREE ROHIT 15
valgrind memcheck:
I have updated the code slightly to ensure all allocated memory is freed, preventing any memory leaks. Simple things like the automatic allocation of memory for line by getline or failing to close a file stream can result in small memory leaks. Below is the valgrind memcheck confirmation.
$ valgrind ./bin/stud_struct dat/people.dat
==11780== Memcheck, a memory error detector
==11780== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==11780== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==11780== Command: ./bin/stud_struct dat/people.dat
==11780==
The students in the class are:
1401 F FERNANDES SUZANNA 13
1402 M PARSEKAR VIPUL VILAS 14
1403 M SEQUEIRA CLAYTON DIOGO 8
1404 M FERNANDES GLENN 17
1405 F CHANDRAVARKAR TANUSHREE ROHIT 15
==11780==
==11780== HEAP SUMMARY:
==11780== in use at exit: 0 bytes in 0 blocks
==11780== total heap usage: 13 allocs, 13 frees, 2,966 bytes allocated
==11780==
==11780== All heap blocks were freed -- no leaks are possible
==11780==
==11780== For counts of detected and suppressed errors, rerun with: -v
==11780== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

Resources