fscanf writing to string array wrong - c

I am reading a file word by word using fscanf and writing them to a char** array.
Printing the current element inside the loop works fine, but printing the array after the loop has finished gives wrong output.
char **stop_words = (char**)malloc(1000*sizeof(char*));
FILE *fp;
fp = fopen("englishstopwords.txt", "r");
int i = 0;
while(!feof(fp)) {
fscanf(fp,"%s\n", &stop_words[i]);
// printf("%s\n", &stop_words[i]); //this works fine
i++;
}
// for (int i = 0; i < 1000; i++) { //this works buggy
// printf("%s\n", &stop_words[i]);
// }
fclose(fp);
Broken print looks like this:
immediatimportanimportanindex
Working print looks like this:
immediately
importance
important
index
What is the difference between them?

The problem
Your memory allocation is fundamentally wrong.
char **stop_words = (char**)malloc(1000*sizeof(char*)); only allocates a block of memory that can store 1000 pointers.
The contents of stop_words[0] to stop_words[999] are indeterminate; they all hold garbage values after malloc() returns.
Writing through stop_words[i] may sometimes look fine, but that is just luck: the garbage happens to be a pointer to mapped memory (still bad though, you probably have memory corruption because of that).
The fix for this is simply to allocate another block of memory for each word you read from your file.
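As a minimal sketch of that idea (not the full fix shown later), the helper below gives every word its own block. The name read_words, the 64-byte staging buffer and the "%63s" width are assumptions made for illustration; they are not from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original post): read up to `max_words`
 * whitespace-separated words from `fp`, giving each word its own block.
 * Returns the number of words stored in `words`.
 */
size_t read_words(FILE *fp, char **words, size_t max_words)
{
    char tmp[64];   /* staging buffer; "%63s" leaves room for the '\0' */
    size_t n = 0;

    assert(fp != NULL && words != NULL);

    /* Test fscanf's return value instead of looping on feof(). */
    while (n < max_words && fscanf(fp, "%63s", tmp) == 1) {
        size_t len = strlen(tmp) + 1;

        words[n] = malloc(len);   /* storage that words[n] will own */
        if (words[n] == NULL)
            break;                /* out of memory: stop early */
        memcpy(words[n], tmp, len);
        n++;
    }
    return n;
}
```

The caller still has to free() every words[i]. Note that with this width limit a word longer than 63 characters is split into several entries, which is one reason the fgets-based version further down is more robust.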
Wrong target buffer
This part
fscanf(fp,"%s\n", &stop_words[i]);
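writes into the array of pointers that you allocated with malloc(), not into a buffer for the word. Consecutive elements of that array are only sizeof(char *) bytes apart (8 on a typical 64-bit system), so each word overwrites everything after the first 8 bytes of the previous one; that is why the broken output shows immediat, importan, importan and index run together. Printing inside the loop works because each word is still intact right after it is written. The type of the expression &stop_words[i] (char **) also doesn't match %s, which expects a char *. You should really enable warning flags; a good compiler should warn you about that by default.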
Potential buffer overflow
Your method of reading a line is dangerous, because fscanf with %s doesn't care how big your buffer is, so your program is vulnerable to buffer overflow.
The fix for this is to use fgets and specify the size of your buffer.
You can then realloc() if a line is longer than the memory allocated for the buffer. To detect this, look at the last character returned. If it is a line feed, you have reached the end of the line; otherwise it is either the end of file or a line with more characters than the buffer can hold (so you can decide to realloc).
Fix for this
englishstopwords.txt (sample file for testing)
i
me
my
myself
we
our
test_long_line_123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123
ours
ourselves
test.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>
#define MAX_WORDS (1000u)
#define INIT_ALLOC (128u)
int main(void)
{
size_t i, total_words;
FILE *fp;
char **stop_words = malloc(MAX_WORDS * sizeof(*stop_words));
/* TODO: Handle `stop_words == NULL` */
fp = fopen("englishstopwords.txt", "r");
/* TODO: Handle `fp == NULL` */
i = 0;
while (true) {
size_t len = 0;
char *ret, *buf = malloc(INIT_ALLOC * sizeof(*buf));
/* TODO: Handle `buf == NULL` */
ret = buf;
re_fgets:
ret = fgets(ret, INIT_ALLOC, fp);
if (ret == NULL) {
/* We've reached the end of file */
if (len == 0) {
/*
* Throw away the buffer, this is unused
*/
free(buf);
} else {
/* Last line buffer. */
stop_words[i++] = buf;
}
break;
}
len = strlen(buf);
if (buf[len - 1] != '\n') {
/*
*
* We don't see an LF, this means this line
* has more than `INIT_ALLOC` characters or
* it may be the EOF.
*
*/
ret = realloc(buf, (len + 1 + INIT_ALLOC) * sizeof(*buf));
/* TODO: Handle `ret == NULL` */
buf = ret;
/*
* Shift the pointer to the right (end of string).
* Because this line has not been fully read.
*
* We put the next `fgets` buffer to the end of this
* string.
*/
ret += len;
goto re_fgets;
}
/* TODO: Trim CR on Windows platform */
/* Trim the LF */
buf[len - 1] = '\0';
stop_words[i++] = buf;
if (i >= MAX_WORDS) {
/*
* TODO: You can do realloc(stop_words, ...) if you
* want to.
*/
break;
}
}
fclose(fp);
total_words = i;
for (i = 0; i < total_words; i++)
printf("%s\n", stop_words[i]);
for (i = 0; i < total_words; i++)
free(stop_words[i]);
free(stop_words);
return 0;
}
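The listing above leaves "Trim CR on Windows platform" as a TODO. One possible way to handle it, sketched under the assumption that a line ends at its first CR or LF, is a small helper built on strcspn. The name trim_eol is hypothetical and not part of the original program.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: cut the string at its first '\r' or '\n'.
 * Returns the length of the trimmed string.
 */
size_t trim_eol(char *s)
{
    size_t n;

    assert(s != NULL);

    n = strcspn(s, "\r\n");   /* index of first CR/LF, or strlen(s) if none */
    s[n] = '\0';
    return n;
}
```

In test.c this could replace the buf[len - 1] = '\0' step; it handles "\n", "\r\n" and a last line with no terminator the same way.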
Compile and Run
ammarfaizi2@integral:/tmp$ cat englishstopwords.txt
i
me
my
myself
we
our
test_long_line_123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123
ours
ourselves
ammarfaizi2@integral:/tmp$ gcc -ggdb3 -Wall -Wextra -pedantic-errors test.c -o test
ammarfaizi2@integral:/tmp$ valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes --track-fds=yes --error-exitcode=99 -s ./test
==503906== Memcheck, a memory error detector
==503906== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==503906== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==503906== Command: ./test
==503906==
i
me
my
myself
we
our
test_long_line_123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123123
ours
ourselves
==503906==
==503906== FILE DESCRIPTORS: 3 open (3 std) at exit.
==503906==
==503906== HEAP SUMMARY:
==503906== in use at exit: 0 bytes in 0 blocks
==503906== total heap usage: 22 allocs, 22 frees, 20,476 bytes allocated
==503906==
==503906== All heap blocks were freed -- no leaks are possible
==503906==
==503906== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
ammarfaizi2@integral:/tmp$

Related

Saving values to 2D array

According to my task, I need to read a file passed as a command-line argument using C and store its content (each character) in a 2D array, so that I can change the array's values later and save the changed content to another file. Never mind the custom functions.
Here is an example of a file I need to read:
#,#,#,#,#,#,.,#,.,.,.$
#,.,#,.,.,#,.,#,#,#,#$
#,.,#,.,.,.,.,.,.,#,#$
#,.,#,.,.,#,#,#,#,#,#$
#,.,.,#,.,.,.,.,.,.,#$
#,.,.,.,#,.,#,#,.,.,#$
#,.,.,.,.,#,.,.,.,.,#$
#,.,.,.,.,#,.,.,.,.,#$
#,.,.,.,.,.,.,.,.,.,#$
#,#,#,#,#,#,#,#,#,.,#$
Here is what I've tried:
int main(int argc, char *argv[]) {
int startX = 3;
int startY = 3;
int endX = 6;
int endY = 6;
int count = 0;
int x = 0;
int y = 0;
int fd = open(argv[1], O_RDONLY);
char ch;
if (fd == -1) {
mx_printerr("map does not exist\n");
exit(-1);
}
int targetFile =
open("path.txt", O_CREAT | O_EXCL | O_WRONLY, S_IWUSR | S_IRUSR);
while (read(fd, &ch, 1)) {
if (ch == '\n') {
x++;
}
if (ch != ',') {
count++;
}
}
fd = open(argv[1], O_RDONLY);
y = (count - x) / x;
char **arr;
arr = malloc(sizeof(char *) * x);
for (int i = 0; i < x; i++) arr[i] = malloc(y);
int tempX = 0, tempY = 0, tempCount = 0;
char tempString[count - x];
// the loop in question >>>>>
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 11; j++) {
while (read(fd, &ch, 1)) {
if (ch != ',') {
arr[i][j] = ch;
// mx_printchar(arr[i][j]);
}
}
}
}
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 11; j++) {
mx_printchar(arr[i][j]);
}
}
for (int i = 0; i < x; i++) free(arr[i]);
free(arr);
close(fd);
close(targetFile);
exit(0);
}
The last while loop should be saving the file's content to an array. However, when I try to print the array's content to console, I get some garbage values:
���pp
����8��
Please help me understand what is wrong here or should I use another approach to save the data to the array.
You have started off well, but then strayed into an awkward way of handling your read and allocations. There are a number of ways you can approach a flexible read of any number of characters and any number of rows into a dynamically allocated pointer-to-pointer-to-char object that you can index like a 2D array (often incorrectly referred to as a "dynamic 2D array"). There is no array involved at all: you have a single pointer to more pointers. You allocate a block of storage for your pointers (rows), then allocate separate blocks of memory to hold each row's worth of data, and assign the beginning address of each such block to one of the pointers in turn.
An easy way to eliminate having to pre-read each row of characters to determine the number is to simply buffer the characters for each row and then allocate storage for, and copy, that number of characters to their final location. This provides the advantage of not having to allocate/reallocate each row starting from some anticipated number of characters (as there is no guarantee that all rows won't have a stray character somewhere).
The other approach, equally efficient but requiring a pre-read of the first row, is to read the first row to determine the number of characters, allocate that number of characters for each row, and then enforce that number of characters on every subsequent row (handling the error if additional characters are found). There are other options if you want to treat each row as a line and then read and create an array of strings, but your requirements appear to simply be a grid of characters. You can store your lines as strings at this point simply by adding a nul-terminating character.
Below we will use a fixed buffer to hold the characters until a '\n' is found marking the end of the row (or you run out of fixed storage) and then dynamically allocate storage for each row and copy the characters from the fixed buffer to your row-storage. This is generally a workable solution as you will know some outer bound of the max number of characters that can occur per-line (don't skimp). A 2K buffer is cheap security even if you think you are reading a max of 100 chars per-line. (If you are on an embedded system with limited memory, then I would reduce the buffer to 2X the anticipated max number of chars.) If you define a constant up top for the fixed buffer size, then if you find you need more, it's a simple change in one location at the top of your file.
How does it work?
Let's start with declaring the counter variables to track the number of pointers available (avail), a row counter (row), a column counter (col), and a fixed number of columns we can use to compare against the number of columns in all subsequent rows (cols). Declare your fixed buffer (buf), your pointer-to-pointer to dynamically allocate, and a FILE* pointer to handle the file, e.g.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NCHARS 2048 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
size_t avail = 2, /* initial no. of available pointers to allocate */
row = 0, /* row counter */
col = 0, /* column counter */
cols = 0; /* fixed no. of columns based on 1st row */
char buf[NCHARS], /* temporary buffer to hold characters */
**arr = NULL; /* pointer-to-pointer-to char to hold grid */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
(note: if no argument is provided, the program will read from stdin by default)
Next we validate the file is open for reading and we allocate an initial avail number of pointers:
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
/* allocate/validate initial avail no. of pointers */
if (!(arr = malloc (avail * sizeof *arr))) {
perror ("malloc-arr");
return 1;
}
Next, rather than looping while ((c = fgetc(fp)) != EOF), just loop continually. That will allow you to treat a '\n' or EOF within the loop and not have to handle the storage of the last line separately after the loop exits. Begin by reading the next character from the file and checking if you have used all your available pointers (indicating you need to realloc() more before proceeding):
while (1) { /* loop continually */
int c = fgetc(fp); /* read each char in file */
if (row == avail) { /* if all pointers used */
/* realloc 2X no. of pointers using temporary pointer */
void *tmp = realloc (arr, 2 * avail * sizeof *arr);
if (!tmp) { /* validate reallocation */
perror ("realloc-arr");
return 1; /* return failure */
}
arr = tmp; /* assign new block to arr */
avail *= 2; /* update available pointers */
}
(note: always realloc() using a temporary pointer. When realloc() fails (not if it fails) it returns NULL and if you reallocate using arr = realloc (arr, ..) you have just overwritten your pointer to your current block of memory with NULL causing the loss of the pointer and inability to free() the prior allocated block resulting in a memory-leak)
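If you prefer to keep that pattern out of the main loop, it can be wrapped in a small function. This is only a sketch: the name grow_rows and its return convention (0 on success, -1 on failure) are assumptions, not part of the program being built here.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: double the number of row pointers in *arr.
 * Returns 0 on success. On failure returns -1 and leaves *arr and
 * *avail untouched, so the caller can still free() the old block.
 */
int grow_rows(char ***arr, size_t *avail)
{
    void *tmp;

    assert(arr != NULL && avail != NULL && *avail > 0);

    /* realloc into a temporary so a NULL return cannot clobber *arr */
    tmp = realloc(*arr, 2 * *avail * sizeof **arr);
    if (tmp == NULL)
        return -1;

    *arr = tmp;       /* only now replace the caller's pointer */
    *avail *= 2;
    return 0;
}
```

The existing pointers are preserved by realloc(), so rows already stored stay valid after the block grows.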
Now check if you have reached the end of line or EOF. In the case of EOF, if your col count is zero, you know you reached EOF after a previous '\n', so you can simply break the loop at that point. Otherwise, if you reach EOF with a non-zero column count, you know your file lacks a POSIX end-of-line (a final '\n') on its last line and you need to store the last line of characters, e.g.
if (c == '\n' || c == EOF) { /* if end of line or EOF*/
if (c == EOF && !col) /* EOF after \n - break */
break;
if (!(arr[row] = malloc (col))) { /* allocate/validate col chars */
perror ("malloc-arr[row]");
return 1;
}
memcpy (arr[row++], buf, col); /* copy buf to arr[row], increment */
if (!cols) /* if cols not set */
cols = col; /* set cols to enforce cols per-row */
if (col != cols) { /* validate cols per-row */
fprintf (stderr, "error: invalid no. of cols - row %zu\n", row);
return 1;
}
if (c == EOF) /* break after non-POSIX eof */
break;
col = 0; /* reset col counter zero */
}
If your character isn't a '\n' or EOF it's just a normal character, so add it to your buffer, check your buffer has room for the next and keep going:
else { /* reading in line */
buf[col++] = c; /* add char to buffer */
if (col == NCHARS) { /* if buffer full, handle error */
fputs ("error: line exceeds maximum.\n", stderr);
return 1;
}
}
}
At this point you have all of your characters stored in a dynamically allocated object you can index as a 2D array. (you also know it is just storage of characters that are not nul-terminated so you cannot treat each line as a string). You are free to add a nul-terminating character if you like, but then you might as well just read each line into buf with fgets() and trim the trailing newline, if present. Depends on your requirements.
The example just closes the file (if not reading from stdin), outputs the stored characters and frees all allocated memory, e.g.
if (fp != stdin) /* close file if not stdin */
fclose (fp);
for (size_t i = 0; i < row; i++) { /* loop over rows */
for (size_t j = 0; j < cols; j++) /* loop over cols */
putchar (arr[i][j]); /* output char */
putchar ('\n'); /* tidy up with newline */
free (arr[i]); /* free row */
}
free (arr); /* free pointers */
}
(that's the whole program, you can just cut/paste the parts together)
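The program stores every character of a row, commas included. If the commas are only separators, as the question's own loop (which skips ',') suggests, a row can be compacted in place before or after it is stored. The helper name strip_commas is hypothetical; this is a sketch of one way to do it, not part of the answer's program.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: drop every ',' from row[0..n) in place.
 * Returns the new number of characters. The row is NOT nul-terminated
 * by this function, matching the grid storage used above.
 */
size_t strip_commas(char *row, size_t n)
{
    size_t out = 0;

    assert(row != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        if (row[i] != ',')
            row[out++] = row[i];   /* keep everything except separators */
    return out;
}
```

If you apply it to buf before the malloc/memcpy step, use the returned length as the column count so the per-row validation still compares like with like.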
Example Input File
$ cat dat/gridofchars.txt
#,#,#,#,#,#,.,#,.,.,.$
#,.,#,.,.,#,.,#,#,#,#$
#,.,#,.,.,.,.,.,.,#,#$
#,.,#,.,.,#,#,#,#,#,#$
#,.,.,#,.,.,.,.,.,.,#$
#,.,.,.,#,.,#,#,.,.,#$
#,.,.,.,.,#,.,.,.,.,#$
#,.,.,.,.,#,.,.,.,.,#$
#,.,.,.,.,.,.,.,.,.,#$
#,#,#,#,#,#,#,#,#,.,#$
Example Use/Output
$ ./bin/read_dyn_grid dat/gridofchars.txt
#,#,#,#,#,#,.,#,.,.,.$
#,.,#,.,.,#,.,#,#,#,#$
#,.,#,.,.,.,.,.,.,#,#$
#,.,#,.,.,#,#,#,#,#,#$
#,.,.,#,.,.,.,.,.,.,#$
#,.,.,.,#,.,#,#,.,.,#$
#,.,.,.,.,#,.,.,.,.,#$
#,.,.,.,.,#,.,.,.,.,#$
#,.,.,.,.,.,.,.,.,.,#$
#,#,#,#,#,#,#,#,#,.,#$
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/read_dyn_grid dat/gridofchars.txt
==29391== Memcheck, a memory error detector
==29391== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==29391== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==29391== Command: ./bin/read_dyn_grid dat/gridofchars.txt
==29391==
#,#,#,#,#,#,.,#,.,.,.$
#,.,#,.,.,#,.,#,#,#,#$
#,.,#,.,.,.,.,.,.,#,#$
#,.,#,.,.,#,#,#,#,#,#$
#,.,.,#,.,.,.,.,.,.,#$
#,.,.,.,#,.,#,#,.,.,#$
#,.,.,.,.,#,.,.,.,.,#$
#,.,.,.,.,#,.,.,.,.,#$
#,.,.,.,.,.,.,.,.,.,#$
#,#,#,#,#,#,#,#,#,.,#$
==29391==
==29391== HEAP SUMMARY:
==29391== in use at exit: 0 bytes in 0 blocks
==29391== total heap usage: 17 allocs, 17 frees, 6,132 bytes allocated
==29391==
==29391== All heap blocks were freed -- no leaks are possible
==29391==
==29391== For counts of detected and suppressed errors, rerun with: -v
==29391== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

Valgrind: Conditional jump or move depends on uninitialised value(s) - opening file

[EDIT]: I added full code.
I have to create a simple version of "grep" command on unix systems in C. Everything is working fine, only Valgrind says Conditional jump or move depends on uninitialised value(s).
I think it might be connected to the file that I am trying to open. Please see my code below.
Please note, that I can't use <string.h> in my code.
I compile the code with clang on Ubuntu:
cc -pedantic -Wall -Werror -g -std=c99 grep.c -o program
This is what Valgrind says:
lukas@lukas-VirtualBox:~/Desktop/shared/Lab04/prg-hw04$ valgrind --track-origins=yes ./program Mem /proc/meminfo
==2588== Memcheck, a memory error detector
==2588== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2588== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==2588== Command: ./program Mem /proc/meminfo
==2588==
==2588== Conditional jump or move depends on uninitialised value(s)
==2588== at 0x4C32D08: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2588== by 0x4EBC9D1: puts (ioputs.c:35)
==2588== by 0x108970: check (grep.c:14)
==2588== by 0x108AA9: read (grep.c:50)
==2588== by 0x108B66: main (grep.c:71)
==2588== Uninitialised value was created by a heap allocation
==2588== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2588== by 0x108A04: read (grep.c:33)
==2588== by 0x108B66: main (grep.c:71)
==2588==
MemTotal: 10461696 kB
MemFree: 7701488 kB
MemAvailable: 8480772 kB
==2588==
==2588== HEAP SUMMARY:
==2588== in use at exit: 0 bytes in 0 blocks
==2588== total heap usage: 4 allocs, 4 frees, 2,700 bytes allocated
==2588==
==2588== All heap blocks were freed -- no leaks are possible
==2588==
==2588== For counts of detected and suppressed errors, rerun with: -v
==2588== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Could you help me with locating the problem?
This is my grep.c file.
#include <stdio.h>
#include <stdlib.h>
#define SIZE 100
int printed = 1; // return value -> 0 for patter found, 1 for pattern not found
char *pattern;
char *dest;
void check(char *line, int length, int size) {
for (int i = 0; i < length; i++) {
if (line[i] == pattern[0]) {
for (int j = 1; j < size && (i+j) < length; j++) {
if (line[i+j] == pattern[j]) {
if (j==size-1) {
printf("%s\n", line); // print line
printed = 0; // pattern found
goto END;
}
} else {
break;
}
}
}
}
END: ;
}
void read(void) { // read lines, then check individual lines
int c;
int lengthPat = 0;
while(pattern[++lengthPat] != '\0'); // check length of pattern - I can't use string.h library
FILE *file = fopen(dest, "r");
size_t size =100;
char *line = (char*)malloc(size * sizeof(char));
if (line == NULL) //succesfully created malloc?
exit(102);
int last = 0;
if (file != NULL) { // file succesfully opened
while ((c = getc(file)) != EOF) {
if (c != '\n') { // read line until \n
if(last ==size) {
char *p_line = realloc(line, 2*size*sizeof(char));
if (p_line == NULL)
free(line);
line = p_line;
size *= 2;
}
line[last++] = (char)c;
}
else { // end of line, check for pattern
check(line, last, lengthPat);
last = 0;
for (int i = 0; i < size; i++) {
line[i] = '\0';
}
}
}
fclose(file);
free(line);
}
else {
fprintf(stderr, "Error: Could not open file!\n");
}
}
/* The main program */
int main(int argc, char *argv[])
{
if (argc == 3) {
pattern = argv[1];
dest = argv[2];
read();
}
return printed;
}
The problem was missing null terminator \0 at the end of the string line.
Thanks all for help.
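For anyone landing here, one way to apply that fix is to keep the buffer terminated every time a character is appended, which also means reserving one byte for the '\0' when deciding to grow. The sketch below avoids <string.h>, as the question requires; the name push_char and its signature are assumptions, not the asker's final code.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: append `c` to the growing buffer *line, whose
 * capacity is *size and current length is *last, keeping it terminated.
 * Returns 0 on success, -1 if realloc fails (buffer left untouched).
 */
int push_char(char **line, size_t *size, size_t *last, char c)
{
    assert(line != NULL && *line != NULL && size != NULL && last != NULL);
    assert(*size > 0);

    /* Need room for `c` AND the terminator that follows it. */
    if (*last + 1 >= *size) {
        char *tmp = realloc(*line, 2 * *size);
        if (tmp == NULL)
            return -1;
        *line = tmp;
        *size *= 2;
    }
    (*line)[(*last)++] = c;
    (*line)[*last] = '\0';   /* always a valid string for "%s" */
    return 0;
}
```

With this in place, resetting for the next line only needs last = 0; there is no need to clear the whole buffer.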
"connected to the file" is the key.
Your input file has lines longer than 100 characters. Replace stack array with dynamically growing heap array.
size_t size =100;
char *c_line = malloc(size);
...
if(last ==size)
line = c_line = realloc(c_line, size<<=1);
On fixing that, the mistake is on this line:
printf("%s\n", line); // print line
line is not null-terminated, so a plain printf with %s reads past the characters you stored. We do this instead:
for (int k = 0; k < length; k++)
putc(line[k], stdout);
putc('\n', stdout);
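printf can also do this without a terminator if you give %s a precision: with "%.*s" no more than the given number of bytes is written. A minimal sketch follows; the helper name print_span and the FILE * parameter are assumptions added so the behaviour can be checked.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write at most `length` bytes of `line`, followed
 * by a newline, to `out`. `line` does not need a '\0' within those bytes.
 * Returns fprintf's result (characters written, or negative on error).
 */
int print_span(FILE *out, const char *line, int length)
{
    assert(out != NULL && line != NULL && length >= 0);

    /* The precision (.*) caps how many bytes "%s" may read and print. */
    return fprintf(out, "%.*s\n", length, line);
}
```

Inside check() the call would be print_span(stdout, line, length). Output still stops early if a '\0' appears before length bytes.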

Error using malloc

I pass char ** input from main() to the processInExp() function, then pass it on from processInExp() to the getInput() function, which allocates it dynamically while reading through the file.
Inside getInput(), input is allocated properly when checked, but using it in processInExp() causes a runtime error. What can be the issue?
Below is my code:
int getInput(char ** input, const char * fileName)
{
int numInput = 0;
int i, j;
char c;
char tempInput[100];
FILE * pFile;
if((pFile = fopen(fileName, "r")) == NULL)
{
printf("Cannot read file %s\n", fileName);
system("PAUSE");
exit(1);
}
while(!feof(pFile))
{
c = fgetc(pFile);
if(c == '\n') ++numInput;
}
/* printf("%d\n", numInput); */
input = (char**)malloc(numInput * sizeof(char*)); /* #2 MALLOC input */
rewind(pFile);
for(i = 0; !feof(pFile); ++i)
{
fscanf(pFile, "%[^\n]%*c", tempInput);
/* printf("%s\n", tempInput); */
input[i] = (char*)malloc((strlen(tempInput) + 1) * sizeof(char)); /* #3 MALLOC input[] */
strcpy(input[i], tempInput);
/* printf("%s\n", input[i]); */ /* #4 PRINT OUT PERFECTLY */
memset(tempInput, 0, sizeof(tempInput));
}
fclose(pFile);
return numInput;
}
void processInExp(char ** input, char ** output, const char * fileName)
{
int numFormula;
int i;
numFormula = getInput(input, fileName); /* #1 PASSING input */
/* printf("%s\n", input[0]); */ /* #5 RUNTIME ERROR */
output = (char**)malloc(numFormula * sizeof(char*));
system("PAUSE");
for(i = 0; i < numFormula; ++i)
{
convertIntoPost(input[i], output[i]);
printf("%d. %s -> %s", (i + 1), input[i], output[i]);
}
}
While others have pointed out the issue with pass by value, there is another point worth learning from. There is no need to pre-read the file to determine the number of characters or lines and then rewind the file to read each line.
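For completeness, the pass-by-value issue comes down to this: assigning to a char ** parameter only changes the function's local copy, so the caller never sees the new block. A sketch of one common remedy, passing the address of the caller's pointer, is below. The name alloc_rows is hypothetical; returning the pointer from the function is an equally valid choice.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate `n` zero-initialized row pointers and
 * hand them back through `rows`, the ADDRESS of the caller's char **.
 * Returns 0 on success, -1 on allocation failure (*rows unchanged).
 */
int alloc_rows(char ***rows, size_t n)
{
    char **tmp;

    assert(rows != NULL && n > 0);

    tmp = calloc(n, sizeof *tmp);
    if (tmp == NULL)
        return -1;

    *rows = tmp;   /* writes to the caller's variable, not a local copy */
    return 0;
}
```

Called as alloc_rows(&input, numInput), the caller's input points at the new block once the function returns.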
Take a look at getline, which returns the number of characters read. All you need to do is keep a sum variable and, after reading all lines, simply return it (or update a pointer you provided as an argument) and you are done. Of course you can do the same with fscanf or fgets by calling strlen after reading the line.
The following is a short example of reading a text file in one pass while determining the number of characters (without the newline) and returning that information to the calling function. Just as you needed to pass a pointer to your array of pointers in getInput, we will use pointers passed as arguments to return the line and character counts to our calling function. If you declare and call the function to read the file as follows:
size_t nline = 0; /* placeholders to be filled by readtxtfile */
size_t nchar = 0; /* containing number of lines/chars in file */
...
char **file = readtxtfile (fn, &nline, &nchar);
By declaring the variables in the calling function, and then passing pointers to the variables as arguments (using the unary &), you can update the values in the function and have those values available for use back in main (or whatever function you called readtxtfile from).
A quick example illustrating these points could be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NMAX 256
char **readtxtfile (char *fn, size_t *idx, size_t *sum);
void prn_chararray (char **ca);
void free_chararray (char **ca);
int main (int argc, char **argv) {
size_t nline = 0; /* placeholders to be filled by readtxtfile */
size_t nchar = 0; /* containing number of lines/chars in file */
char *fn = argc > 1 ? argv[1] : NULL;/* if fn not given, read stdin */
/* read each file into an array of strings,
* number of lines/chars read updated in nline, nchar
*/
char **file = readtxtfile (fn, &nline, &nchar);
/* output number of lines read & chars read and from where */
printf ("\n read '%zu' lines & '%zu' chars from file: %s\n\n",
nline, nchar, fn ? fn : "stdin");
/* simple print function to print all lines */
if (file) prn_chararray (file);
/* simple free memory function */
if (file) free_chararray (file);
return 0;
}
/* simple function using getline to read any text file and return
* the lines read in an array of pointers. user is responsible for
* freeing memory when no longer needed
*/
char **readtxtfile (char *fn, size_t *idx, size_t *sum)
{
char *ln = NULL; /* NULL forces getline to allocate */
size_t n = 0; /* line buf size (0 - use default) */
ssize_t nchr = 0; /* number of chars actually read */
size_t nmax = NMAX; /* check for reallocation */
char **array = NULL; /* array to hold lines read */
FILE *fp = NULL; /* file pointer to open file fn */
/* open / validate file or read stdin */
fp = fn ? fopen (fn, "r") : stdin;
if (!fp) {
fprintf (stderr, "%s() error: file open failed '%s'.", __func__, fn);
return NULL;
}
/* allocate NMAX pointers to char* */
if (!(array = calloc (NMAX, sizeof *array))) {
fprintf (stderr, "%s() error: memory allocation failed.", __func__);
return NULL;
}
/* read each line from fp - dynamically allocated */
while ((nchr = getline (&ln, &n, fp)) != -1)
{
/* strip newline or carriage rtn */
while (nchr > 0 && (ln[nchr-1] == '\n' || ln[nchr-1] == '\r'))
ln[--nchr] = 0;
*sum += nchr; /* add chars in line to sum */
array[*idx] = strdup (ln); /* allocate/copy ln to array */
(*idx)++; /* increment value at index */
if (*idx == nmax) { /* if lines exceed nmax, reallocate */
char **tmp = realloc (array, nmax * 2 * sizeof *array); /* size in bytes */
if (!tmp) {
fprintf (stderr, "%s() error: reallocation failed.\n", __func__);
exit (EXIT_FAILURE); /* or return NULL; */
}
memset (tmp + nmax, 0, nmax * sizeof *tmp); /* zero new pointers (NULL sentinel) */
array = tmp;
nmax *= 2;
}
}
if (ln) free (ln); /* free memory allocated by getline */
if (fp != stdin) fclose (fp); /* close open file descriptor */
return array;
}
/* print an array of character pointers. */
void prn_chararray (char **ca)
{
register size_t n = 0;
while (ca[n])
{
printf (" arr[%3zu] %s\n", n, ca[n]);
n++;
}
}
/* free array of char* */
void free_chararray (char **ca)
{
if (!ca) return;
register size_t n = 0;
while (ca[n])
free (ca[n++]);
free (ca);
}
Use/Output
$ ./bin/getline_ccount <dat/fc-list-fonts.txt
read '187' lines & '7476' chars from file: stdin
arr[ 0] andalemo.ttf: Andale Mono - Regular
arr[ 1] arialbd.ttf: Arial - Bold
arr[ 2] arialbi.ttf: Arial - Bold Italic
arr[ 3] ariali.ttf: Arial - Italic
arr[ 4] arialnbi.ttf: Arial
arr[ 5] arialnb.ttf: Arial
arr[ 6] arialni.ttf: Arial
arr[ 7] arialn.ttf: Arial
arr[ 8] arial.ttf: Arial - Regular
arr[ 9] ARIALUNI.TTF: Arial Unicode MS - Regular
arr[ 10] ariblk.ttf: Arial
arr[ 11] Bailey Script Regular.ttf: Bailey Script - Regular
arr[ 12] Bailey_Script_Regular.ttf: Bailey Script - Regular
arr[ 13] Belwe Gotisch.ttf: Belwe Gotisch - Regular
arr[ 14] Belwe_Gotisch.ttf: Belwe Gotisch - Regular
<snip>
Memory/Leak Check
Whenever you allocate/free memory in your code, don't forget to use a memory checker to ensure there are no memory errors or leaks in your code:
$ valgrind ./bin/getline_ccount <dat/fc-list-fonts.txt
==20259== Memcheck, a memory error detector
==20259== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==20259== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==20259== Command: ./bin/getline_readfile_function
==20259==
read '187' line from file: stdin
arr[ 0] andalemo.ttf: Andale Mono - Regular
arr[ 1] arialbd.ttf: Arial - Bold
arr[ 2] arialbi.ttf: Arial - Bold Italic
arr[ 3] ariali.ttf: Arial - Italic
<snip>
==20259==
==20259== HEAP SUMMARY:
==20259== in use at exit: 0 bytes in 0 blocks
==20259== total heap usage: 189 allocs, 189 frees, 9,831 bytes allocated
==20259==
==20259== All heap blocks were freed -- no leaks are possible
==20259==
==20259== For counts of detected and suppressed errors, rerun with: -v
==20259== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
Follow On From Comment
There are several issues with the code you posted in the comment:
for(i = 0; !feof(pFile); ++i) {
fscanf(pFile, "%[^\n]%*c", tempInput);
/* printf("%s\n", tempInput); */
input[i] = (char*)malloc((strlen(tempInput) + 1) * sizeof(char));
strcpy(input[i], tempInput);
printf("%s\n", input[i]);
memset(tempInput, 0, sizeof(tempInput));
}
for(i = 0; i < numInput; ++i) {
convertIntoPost(input[i], output[i]);
}
First, read the link in the first comment about why feof can cause problems when using it to indicate EOF in a loop. Second, functions have return values, the ability to use them for an advantage tells you whether you are using the correct function for the job.
The difficulty you are having trying to shoehorn reading an entire line with fscanf should be telling you something... The problem you have backed into by your choice of the format specifier "%[^\n]%*c" to read a line containing whitespace is the exact reason fscanf is NOT the proper tool for the job.
Why? The scanf family of functions were created to read discrete values. Their return is based on:
the number of input items successfully matched and assigned
Using your format specifier, the number of items read on success is 1. The %*c reads and discards the newline, but is NOT added to the item count. This causes a BIG problem when trying to read a file that can contain blank lines. What happens then? You experience a matching failure and fscanf returns 0 -- but it is still very much a valid line. When that occurs, nothing is read. You cannot check the return as being >= 0 because when you encounter a blank line you loop forever...
With your format specifier, you cannot check for EOF either. Why? With the scanf family of functions:
The value EOF is returned if the end of input is reached before
either the first successful conversion or a matching failure
occurs.
Once the read stalls on a blank line, that never happens: every call ends in a matching failure (returning 0) without consuming the newline, so the end of input is never reached. Are you starting to see why fscanf may not be the right tool for the job?
Two functions are commonly used for line-oriented input: fgets (standard C) and getline (POSIX). Both read a line of text into the line buffer, including the newline at the end of each line (so a blank line is read as just a newline). So when you use either to read text, it is a good idea to remove the newline by overwriting it with a null-terminating character.
Which to use? With fgets, you can limit the number of characters read by sizing the character buffer appropriately. getline is part of POSIX (since POSIX.1-2008) rather than ISO C, and it provides the added benefit of returning the number of characters actually read (a bonus), but it will read the line no matter how long it is because it dynamically allocates the buffer for you. I prefer it, but just know that you need to check the number of characters it has read.
Since I provided a getline example above, your read loop can be much better written with fgets as follows:
while (fgets (tempInput, MAXL, pFile) != NULL) {
nchr = strlen (tempInput);
while (nchr && (tempInput[nchr-1] == '\n' || tempInput[nchr-1] == '\r'))
tempInput[--nchr] = 0; /* strip newlines & carriage returns */
input[i++] = strdup (tempInput); /* allocates & copies tempInput */
}
numInput = i;
Next, your allocation does not need to be cast to (char *). The return of malloc and calloc is just a pointer to (i.e. the address of) the block of memory allocated. (it is the same no matter what you are allocating memory for) There is no need for sizeof (char). It is always 1. So just write:
input[i] = malloc (strlen(tempInput) + 1);
strcpy (input[i], tempInput);
A more convenient way to both allocate and copy is using strdup. With strdup, the two lines above become simply:
input[i++] = strdup (tempInput); /* allocates & copies */
Next, there is no need for memset.
memset(tempInput, 0, sizeof(tempInput));
If tempInput is declared to hold 100 chars with say: tempInput[100], you can read strings up to 99 chars into the same buffer over and over again without ever having to zero the memory. Why? Strings are null-terminated. You don't care what is in the buffer after the null-terminator...
That's a lot to take in. Putting it all together in a short example, you could do something like:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXL 256
/* dummy function */
void convertIntoPost (char *in, char **out)
{
size_t i = 0, len = strlen (in);
*out = calloc (1, len + 1);
for (i = 0; i < len; i++) {
(*out)[len-i-1] = in[i];
}
}
int main (int argc, char **argv) {
char tempInput[MAXL] = {0};
char **input = NULL, **output = NULL;
size_t i = 0, numInput = 0;
size_t nchr = 0;
FILE *pFile = NULL;
pFile = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!pFile) {
fprintf (stderr, "error: file open failed '%s'.\n",
argv[1] ? argv[1] : "stdin");
return 1;
}
input = calloc (MAXL, sizeof *input); /* allocate MAXL pointers for input & output */
output = calloc (MAXL, sizeof *output); /* calloc allocates and sets memory to 0-NULL */
if (!input || !output) { /* validate allocation */
fprintf (stderr, "error: memory allocation failed.\n");
return 1;
}
while (i < MAXL && fgets (tempInput, MAXL, pFile) != NULL) { /* stop when all pointers are used */
nchr = strlen (tempInput);
while (nchr && (tempInput[nchr-1] == '\n' || tempInput[nchr-1] == '\r'))
tempInput[--nchr] = 0;
input[i++] = strdup (tempInput); /* allocates & copies */
}
numInput = i;
fclose (pFile);
/* call convertIntoPost with input[i] and &output[i] */
for (i = 0; i < numInput; ++i) {
convertIntoPost (input[i], &output[i]);
printf (" input[%2zu]: %-25s output[%2zu]: %s\n",
i, input[i], i, output[i]);
}
/* free all memory */
for (i = 0; i < numInput; ++i) {
free (input[i]), free (output[i]);
}
free (input), free (output);
return 0;
}
Example Output
$ ./bin/feoffix ../dat/captnjack.txt
input[ 0]: This is a tale output[ 0]: elat a si sihT
input[ 1]: Of Captain Jack Sparrow output[ 1]: worrapS kcaJ niatpaC fO
input[ 2]: A Pirate So Brave output[ 2]: evarB oS etariP A
input[ 3]: On the Seven Seas. output[ 3]: .saeS neveS eht nO
Notes On Compiling Your Code
Always compile your code with warnings enabled. That way the compiler can help point out areas where your code may have ambiguity, etc. To enable warnings when you compile, simply add -Wall and -Wextra to your compile string. (If you really want all warnings, add -pedantic (definition: overly concerned with trivial details).) Then take the time to read and understand what the compiler is telling you with the warnings (they are really very good, and you will quickly learn what each means). Then... go fix the problems so your code compiles without any warnings.
There are only very rare and limited circumstances where it is permissible to 'understand and choose to allow' a warning to remain (like when using a library where you have no access to the source code)
So putting it all together, when you compile your code, at a minimum you should be compiling with the following for testing and development:
gcc -Wall -Wextra -o progname progname.c -g
With gcc, the -g option tells the compiler to produce additional debugging information for use with the debugger gdb (learn it).
When you have all the bugs worked out and you are ready for a final compile of your code, you will want to add optimizations such as the optimization level -On (that's capital O, not zero, where 'n' is the level 1, 2, or 3; 0 is the default). -Ofast is essentially -O3 with a few additional optimizations (some of which, like -ffast-math, relax strict standards compliance). You may also want to consider telling the compiler to inline your functions when possible with -finline-functions to eliminate function call overhead. So for a final compile you will want something similar to:
gcc -Wall -Wextra -finline-functions -Ofast -o progname progname.c
The optimizations can cut your program's execution time dramatically (up to 10-fold in some cases; 3-5x is common), though the gain depends heavily on the code. Well worth adding a couple of switches.
C uses pass-by-value for function argument passing. So, from inside the function getInput(), you cannot change the variable input and expect that change to be reflected back in the actual argument, passed to the function. For that, you'll need a pointer-to variable to be passed, like in this case, you need to do
int getInput(char *** input, const char * fileName) { //notice the extra *
and need to call it like
char ** inp = NULL;
getInput(&inp, ..........);
Then, getInput() will be able to allocate memory to *input inside the function which will be reflected into inp.
Otherwise, after returning from the getInput(), the actual argument will still be uninitialized and using that further (in your case, in the for loop in processInExp() function) will lead to undefined behaviour.
That said, two more important things to notice,
Please see why not to cast the return value of malloc() and family in C.
Check Why is while ( !feof (file) ) always wrong?
As Sourav mentioned, C uses pass-by-value for argument passing, so the input variable within the scope of processInExp has the value of the address of the memory previously allocated in main.
This results in a segmentation fault when you print input[0]. This is because printf is trying to print the string located at the address relative to the previously allocated memory instead of memory allocated to input in the getInput function to which you copied the string.
A solution would be to pass a pointer to input, so your function signature would look like this: int getInput(char *** input, const char * fileName). You would then need to change any references to input to *input in order to dereference the pointer, and pass input's address to getInput like this: getInput(&input, fileName).
The C language is pass-by-value without exception.
A function is not able to change the value of actual parameters.

Reading a stream of values from text file in C

I have a text file which may contain one or up to 400 numbers. Each number is separated by a comma and a semicolon is used to indicate end of numbers stream.
At the moment I am reading the text file line by line using the fgets. For this reason I am using a fixed array of 1024 elements (the maximum characters per line for a text file).
This is not the ideal way to implement this, since if only one number is entered in the text file, an array of 1024 elements will be pointless.
Is there a way to use fgets with the malloc function (or any other method) to increase memory efficiency?
If you intend to use this in production code, I would ask you to follow the suggestions in the comments section.
But if your requirement is more for learning or school, then here is a complex approach.
Pseudo code
1. Find the size of the file in bytes, you can use "stat" for this.
2. Since the file format is known, from the file size, calculate the number of items.
3. Use the number of items to malloc.
Voila! :p
How to find file size
You can use stat as shown below:
#include <sys/stat.h>
#include <stdio.h>
int main(void)
{
struct stat st;
if (stat("file", &st) == 0) {
printf("fileSize: %lld No. of Items: %lld\n", (long long)st.st_size, (long long)(st.st_size/2)); /* st_size is an off_t, so cast for printf */
return st.st_size;
}
printf("failed!\n");
return 0;
}
This program, when run, prints the file size and the estimated number of items:
$> cat file
1;
$> ./a.out
fileSize: 3 No. of Items: 1
$> cat file
1,2,3;
$> ./a.out
fileSize: 7 No. of Items: 3
Disclaimer: Is this approach to minimize the pre-allocated memory an optimal approach? No ways in heaven! :)
Dynamically allocating space for your data is a fundamental tool for working in C. You might as well pay the price to learn. The primary thing to remember is,
"if you allocate memory, you have the responsibility to track its use
and preserve a pointer to the starting address for the block of
memory so you can free it when you are done with it. Otherwise your
code will leak memory like a sieve."
Dynamic allocation is straightforward. You allocate some initial block of memory and keep track of what you add to it. You must test that each allocation succeeds. You must test how much of the block of memory you use and reallocate or stop writing data when full to prevent writing beyond the end of your block of memory. If you fail to test either, you will corrupt the memory associated with your code.
When you reallocate, always assign the return of realloc to a temporary pointer. If the reallocation fails, realloc returns NULL and leaves the original block untouched; assigning that NULL directly to your only pointer loses the address of the block (a memory leak, and the loss of access to all previous data in it). Using a temporary pointer lets you handle the failure while preserving the original block.
Taking that into consideration, below we initially allocate space for 64 long values (you can easily change the code to handle any type, e.g. int, float, double...). The code then reads each line of data (using getline to dynamically allocate the buffer for each line). strtol is used to parse the buffer, assigning values to the array. idx is used as an index to keep track of how many values have been read, and when idx reaches the current nmax, array is reallocated twice as large as it previously was and nmax is updated to reflect the change. The reading, parsing, checking and reallocating continues for every line of data in the file. When done, the values are printed to stdout, showing the 400 random values read from the test file formatted as 353,394,257,...293,58,135;
To keep the read loop logic clean, I've put the error checking for the strtol conversion into a function xstrtol, but you are free to include that code in main() if you like. The same applies to the realloc_long function. To see when the reallocation takes place, you can compile the code with the -DDEBUG definition. E.g:
gcc -Wall -Wextra -DDEBUG -o progname yoursourcefile.c
The program expects your data filename as the first argument and you can provide an optional conversion base as the second argument (default is 10). E.g.:
./progname datafile.txt [base (default: 10)]
Look over it, test it, and let me know if you have any questions.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <errno.h>
#define NMAX 64
long xstrtol (char *p, char **ep, int base);
long *realloc_long (long *lp, unsigned long *n);
int main (int argc, char **argv)
{
char *ln = NULL; /* NULL forces getline to allocate */
size_t n = 0; /* max chars to read (0 - no limit) */
ssize_t nchr = 0; /* number of chars actually read */
size_t idx = 0; /* array index counter */
long *array = NULL; /* pointer to long */
unsigned long nmax = NMAX; /* initial reallocation counter */
FILE *fp = NULL; /* input file pointer */
int base = argc > 2 ? atoi (argv[2]) : 10; /* base (default: 10) */
/* open / validate file */
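if (argc < 2) { /* a data file name is required */
fprintf (stderr, "usage: %s datafile [base]\n", argv[0]);
return 1;
}
if (!(fp = fopen (argv[1], "r"))) {
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}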
/* allocate array of NMAX long using calloc to initialize to 0 */
if (!(array = calloc (NMAX, sizeof *array))) {
fprintf (stderr, "error: memory allocation failed.");
return 1;
}
/* read each line from file - separate into array */
while ((nchr = getline (&ln, &n, fp)) != -1)
{
char *p = ln; /* pointer to ln read by getline */
char *ep = NULL; /* endpointer for strtol */
while (errno == 0)
{ /* parse/convert each number in line into array */
array[idx++] = xstrtol (p, &ep, base);
if (idx == nmax) /* check NMAX / realloc */
array = realloc_long (array, &nmax);
/* skip delimiters/move pointer to next digit */
while (*ep && *ep != '-' && (*ep < '0' || *ep > '9')) ep++;
if (*ep)
p = ep;
else
break;
}
}
if (ln) free (ln); /* free memory allocated by getline */
if (fp) fclose (fp); /* close open file descriptor */
size_t i = 0; /* same type as idx, avoids a signed/unsigned comparison */
for (i = 0; i < idx; i++)
printf (" array[%zu] : %ld\n", i, array[i]);
free (array);
return 0;
}
/* reallocate long pointer memory */
long *realloc_long (long *lp, unsigned long *n)
{
long *tmp = realloc (lp, 2 * *n * sizeof *lp);
#ifdef DEBUG
printf ("\n reallocating %lu to %lu\n", *n, *n * 2);
#endif
if (!tmp) {
fprintf (stderr, "%s() error: reallocation failed.\n", __func__);
// return NULL;
exit (EXIT_FAILURE);
}
lp = tmp;
memset (lp + *n, 0, *n * sizeof *lp); /* memset new ptrs 0 */
*n *= 2;
return lp;
}
long xstrtol (char *p, char **ep, int base)
{
errno = 0;
long tmp = strtol (p, ep, base);
/* Check for various possible errors */
if ((errno == ERANGE && (tmp == LONG_MIN || tmp == LONG_MAX)) ||
(errno != 0 && tmp == 0)) {
perror ("strtol");
exit (EXIT_FAILURE);
}
if (*ep == p) {
fprintf (stderr, "No digits were found\n");
exit (EXIT_FAILURE);
}
return tmp;
}
Sample Output (with -DDEBUG to show reallocation)
$ ./bin/read_long_csv dat/randlong.txt
reallocating 64 to 128
reallocating 128 to 256
reallocating 256 to 512
array[0] : 353
array[1] : 394
array[2] : 257
array[3] : 173
array[4] : 389
array[5] : 332
array[6] : 338
array[7] : 293
array[8] : 58
array[9] : 135
<snip>
array[395] : 146
array[396] : 324
array[397] : 424
array[398] : 365
array[399] : 205
Memory Error Check
$ valgrind ./bin/read_long_csv dat/randlong.txt
==26142== Memcheck, a memory error detector
==26142== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==26142== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==26142== Command: ./bin/read_long_csv dat/randlong.txt
==26142==
reallocating 64 to 128
reallocating 128 to 256
reallocating 256 to 512
array[0] : 353
array[1] : 394
array[2] : 257
array[3] : 173
array[4] : 389
array[5] : 332
array[6] : 338
array[7] : 293
array[8] : 58
array[9] : 135
<snip>
array[395] : 146
array[396] : 324
array[397] : 424
array[398] : 365
array[399] : 205
==26142==
==26142== HEAP SUMMARY:
==26142== in use at exit: 0 bytes in 0 blocks
==26142== total heap usage: 7 allocs, 7 frees, 9,886 bytes allocated
==26142==
==26142== All heap blocks were freed -- no leaks are possible
==26142==
==26142== For counts of detected and suppressed errors, rerun with: -v
==26142== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

CSV File Input in C using Structures

I want to print the data from a .csv file line by line, where the fields are separated by a comma delimiter.
This code prints garbage values.
enum gender{ M, F };
struct student{
int stud_no;
enum gender stud_gen;
char stud_name[100];
int stud_marks;
};
void main()
{
struct student s[60];
int i=0,j,roll_no,marks,k,select;
FILE *input;
FILE *output;
struct student temp;
input=fopen("Internal test 1 Marks MCA SEM 1 oct 2014 - CS 101.csv","r");
output=fopen("out.txt","a");
if (input == NULL) {
printf("Error opening file...!!!");
}
while(fscanf(input,"%d,%c,%100[^,],%d", &s[i].stud_no,&s[i].stud_gen,&s[i].stud_name,&s[i].stud_marks)!=EOF)
{
printf("\n%d,%c,%s,%d", s[i].stud_no,s[i].stud_gen,s[i].stud_name,s[i].stud_marks);
i++;
}
}
I also tried the code from: Read .CSV file in C But it prints only the nth field. I want to display all fields line by line.
Here is my sample input.
1401,F,FERNANDES SUZANNA ,13
1402,M,PARSEKAR VIPUL VILAS,14
1403,M,SEQUEIRA CLAYTON DIOGO,8
1404,M,FERNANDES GLENN ,17
1405,F,CHANDRAVARKAR TANUSHREE ROHIT,15
While there are a number of ways to parse any line into components, one way that can really increase understanding is to use a start and end pointer to work down each line identifying the commas, replacing them with null-terminators (i.e. '\0' or just 0), reading the field, restoring the comma and moving to the next field. This is just a manual application of strtok. The following example does that so you can see what is going on. You can, of course, replace use of the start and end pointers (sp & p, respectively) with strtok.
Read through the code and let me know if you have any questions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* maximum number of student to initially allocate */
#define MAXS 256
enum gender { M, F };
typedef struct { /* create typedef to struct */
int stud_no;
enum gender stud_gen;
char *stud_name;
int stud_marks;
} student;
int main (int argc, char *argv[]) {
if (argc < 2) {
printf ("filename.csv please...\n");
return 1;
}
char *line = NULL; /* pointer to use with getline () */
ssize_t read = 0; /* characters read by getline () */
size_t n = 0; /* number of bytes to allocate */
student **students = NULL; /* ptr to array of ptrs to struct student */
char *sp = NULL; /* start pointer for parsing line */
char *p = NULL; /* end pointer to use parsing line */
int field = 0; /* counter for field in line */
int cnt = 0; /* counter for number allocated */
int it = 0; /* simple iterator variable */
FILE *fp;
fp = fopen (argv[1], "r"); /* open file , read only */
if (!fp) {
fprintf (stderr, "failed to open file for reading\n");
return 1;
}
students = calloc (MAXS, sizeof (*students)); /* allocate 256 ptrs set to NULL */
/* read each line in input file preserving 1 pointer as sentinel NULL */
while (cnt < MAXS-1 && (read = getline (&line, &n, fp)) != -1) {
sp = p = line; /* set start ptr and ptr to beginning of line */
field = 0; /* set/reset field to 0 */
students[cnt] = calloc (1, sizeof (**students)); /* alloc each struct zeroed, so stud_name starts as NULL */
while (*p) /* for each character in line */
{
if (*p == ',') /* if ',' end of field found */
{
*p = 0; /* set as null-term char (temp) */
if (field == 0) students[cnt]->stud_no = atoi (sp);
if (field == 1) {
if (*sp == 'M') {
students[cnt]->stud_gen = 0;
} else {
students[cnt]->stud_gen = 1;
}
}
if (field == 2) students[cnt]->stud_name = strdup (sp); /* strdup allocates for you */
*p = ','; /* replace with original ',' */
sp = p + 1; /* set new start ptr start pos */
field++; /* update field count */
}
p++; /* increment pointer p */
}
students[cnt]->stud_marks = atoi (sp); /* read stud_marks (sp already set to begin) */
cnt++; /* increment students count */
}
fclose (fp); /* close file stream */
if (line) /* free memory allocated by getline */
free (line);
/* iterate over all students and print */
printf ("\nThe students in the class are:\n\n");
while (students[it])
{
printf (" %d %c %-30s %d\n",
students[it]->stud_no, (students[it]->stud_gen) ? 'F' : 'M', students[it]->stud_name, students[it]->stud_marks);
it++;
}
printf ("\n");
/* free memory allocated to struct */
it = 0;
while (students[it])
{
if (students[it]->stud_name)
free (students[it]->stud_name);
free (students[it]);
it++;
}
if (students)
free (students);
return 0;
}
(note: added condition on loop that cnt < MAXS-1 to preserve at least one pointer in students NULL as a sentinel allowing iteration.)
input:
$ cat dat/people.dat
1401,F,FERNANDES SUZANNA ,13
1402,M,PARSEKAR VIPUL VILAS,14
1403,M,SEQUEIRA CLAYTON DIOGO,8
1404,M,FERNANDES GLENN ,17
1405,F,CHANDRAVARKAR TANUSHREE ROHIT,15
output:
$./bin/stud_struct dat/people.dat
The students in the class are:
1401 F FERNANDES SUZANNA 13
1402 M PARSEKAR VIPUL VILAS 14
1403 M SEQUEIRA CLAYTON DIOGO 8
1404 M FERNANDES GLENN 17
1405 F CHANDRAVARKAR TANUSHREE ROHIT 15
valgrind memcheck:
I have updated the code slightly to ensure all allocated memory is freed, preventing any memory leaks. Simple things like the automatic allocation of memory for line by getline or failing to close a file stream can result in small memory leaks. Below is the valgrind memcheck confirmation.
valgrind ./bin/stud_struct dat/people.dat
==11780== Memcheck, a memory error detector
==11780== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==11780== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==11780== Command: ./bin/stud_struct dat/people.dat
==11780==
The students in the class are:
1401 F FERNANDES SUZANNA 13
1402 M PARSEKAR VIPUL VILAS 14
1403 M SEQUEIRA CLAYTON DIOGO 8
1404 M FERNANDES GLENN 17
1405 F CHANDRAVARKAR TANUSHREE ROHIT 15
==11780==
==11780== HEAP SUMMARY:
==11780== in use at exit: 0 bytes in 0 blocks
==11780== total heap usage: 13 allocs, 13 frees, 2,966 bytes allocated
==11780==
==11780== All heap blocks were freed -- no leaks are possible
==11780==
==11780== For counts of detected and suppressed errors, rerun with: -v
==11780== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

Resources