saving to a string from a pipe - c

I have code here that runs /bin/ls -l and prints the output to the terminal. What I want to do is save that output into a string for later use. I'm not sure how to go about doing this, but my guess is that it would look something like this:
int main (){
FILE *fp;
int status;
char path[100];
char *s;
fp = popen("/bin/ls -l", "r");
if (fp == NULL){
printf("fp error");
}
while(fgets(path,100,fp)!= NULL){
printf("%s\n", path );
scanf(path, s);
}
status = pclose(fp);
if (status==-1){
printf("pclose error");
}else{
printf("else on pclose\n");
}
return 0;
}
The while loop prints out my directory result no problem but I run into a segmentation fault : 11 at the end of it. What would be the correct way of approaching this problem?

To begin with, while(fgets(path,100,fp)!= NULL){ already stores each line read from the pipe (up to 99 characters plus the terminating nul) in path. There is no need to scanf anything else. Calling scanf(path, s) does not copy anything into s: it uses path as a format string for reading stdin, and s is an uninitialized pointer, so any conversion that writes through it invokes undefined behavior.
100 is a horribly insufficient magic number to use for the maximum path length. Better to use PATH_MAX from limits.h (generally 4096 on Linux, but implementation-defined). Which brings up another point: don't use magic numbers in your code. If you need a constant the system doesn't provide, then #define one, or use a global enum to define it.
When reading with fgets, you must check for, and remove (or otherwise account for), the '\n' that will be included in the buffer filled by fgets (and POSIX getline). That check also validates that you did in fact read a complete line of data. Otherwise, if the length of the line read is PATH_MAX - 1 and there was no '\n' at the end, the line was too long for the buffer and characters from that line remain unread in your input stream. A simple call to strlen (path) followed by if/else if checks provides the validation and allows you to overwrite the trailing '\n' with a nul-terminating character.
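As a minimal sketch of that check, the helper below trims one trailing newline and reports whether a complete line was read. The name trim_newline is hypothetical and is not part of the answer's code.

```c
#include <string.h>

/* Hypothetical helper: strip one trailing '\n' from buf.
 * Returns 1 if a newline was found (a complete line was read),
 * 0 if the buffer held no newline (empty or truncated line). */
int trim_newline (char *buf)
{
    size_t len = strlen (buf);

    if (len && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* overwrite '\n' with nul */
        return 1;
    }
    return 0;
}
```

A caller would typically run this right after each successful fgets and treat a 0 return on a full buffer as "line too long".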
To handle the problem of "How many files do I need to provide for?", you can simply use a pointer-to-pointer-to-char, dynamically allocate a pointer for each line, and realloc additional pointers as required. You assign the address of each block of memory you allocate to store a line to one of the pointers you allocated. Keep a count of the lines (for the pointer count) and realloc more pointers when you reach your limit (this works the same way for reading from text files). Don't forget to free the memory you allocate when you are done with it.
Putting all the pieces together, you could do something like the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#define NFILES 16
int main (){
size_t i, n = 0, /* number of files in listing */
nfiles = NFILES; /* number of allocated pointers */
char path[PATH_MAX] = "", /* buffer (PATH_MAX in limits.h) */
**files = NULL; /* pointer to pointer to file */
FILE *fp = popen("/bin/ls -l", "r");
if (fp == NULL) { /* validate pipe open for reading */
perror ("popen");
return 1;
}
/* allocate/validate initial number of pointers */
if (!(files = malloc (nfiles * sizeof *files))) {
perror ("malloc - files");
return 1;
}
while (fgets (path, PATH_MAX, fp)) { /* read each line */
size_t len = strlen (path); /* get length */
if (len && path[len - 1] == '\n') /* validate '\n' */
path[--len] = 0; /* trim '\n' */
else if (len + 1 == PATH_MAX) { /* check path too long */
fprintf (stderr, "error: path too long.\n");
/* handle remaining chars in fp */
}
/* allocate/validate storage for line */
if (!(files[n] = malloc (len + 1))) {
perror ("malloc - files[n]");
break;
}
strcpy (files[n++], path); /* copy path */
if (n == nfiles) { /* realloc pointers as required */
void *tmp = realloc (files, nfiles * 2 * sizeof *files);
if (!tmp) {
perror ("realloc");
break;
}
files = tmp;
nfiles *= 2; /* increment current allocation */
}
}
if (pclose (fp) == -1) /* validate close */
perror ("pclose");
/* print and free the allocated strings */
for (i = 0; i < n; i++) {
printf ("%s\n", files[i]);
free (files[i]); /* free individual file storage */
}
free (files); /* free pointers */
return 0;
}
Example Use/Output
$ ./bin/popen_ls_files > dat/filelist.txt
$ wc -l dat/filelist.txt
1768 dat/filelist.txt
$ cat dat/filelist.txt
total 9332
-rw-r--r-- 1 david david 376 Sep 23 2014 3darrayaddr.c
-rw-r--r-- 1 david david 466 Sep 30 20:13 3darrayalloc.c
-rw-r--r-- 1 david david 802 Jan 25 02:55 3darrayfill.c
-rw-r--r-- 1 david david 192 Jun 27 2015 BoggleData.txt
-rw-r--r-- 1 david david 3565 Jun 26 2014 DoubleLinkedList-old.c
-rw-r--r-- 1 david david 3699 Jun 26 2014 DoubleLinkedList.c
-rw-r--r-- 1 david david 3041 Jun 26 2014 DoubleLinkedList.diff
<snip>
-rw-r--r-- 1 david david 4946 May 7 2015 workers.c
-rw-r--r-- 1 david david 206 Jul 11 2017 wshadow.c
-rw-r--r-- 1 david david 1283 May 18 2015 wsininput.c
-rw-r--r-- 1 david david 5519 Oct 13 2015 xpathfname.c
-rw-r--r-- 1 david david 785 Sep 30 02:49 xrealloc2_macro.c
-rw-r--r-- 1 david david 2090 Sep 6 02:29 xrealloc_tst.c
-rw-r--r-- 1 david david 1527 Sep 6 03:22 xrealloc_tst_str.c
-rwxr-xr-- 1 david david 153 Aug 5 2014 xsplit.sh
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/popen_ls_files > /dev/null
==7453== Memcheck, a memory error detector
==7453== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==7453== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==7453== Command: ./bin/popen_ls_files
==7453==
==7453==
==7453== HEAP SUMMARY:
==7453== in use at exit: 0 bytes in 0 blocks
==7453== total heap usage: 1,777 allocs, 1,777 frees, 148,929 bytes allocated
==7453==
==7453== All heap blocks were freed -- no leaks are possible
==7453==
==7453== For counts of detected and suppressed errors, rerun with: -v
==7453== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.
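If one single string holding the whole output is wanted instead of one string per line, a sketch along the following lines may help. The helper name read_all and the 4096-byte starting capacity are assumptions for illustration; the function works on any FILE*, including one returned by popen.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything remaining on fp into one
 * heap-allocated, nul-terminated string. The caller frees the result.
 * Returns NULL on allocation failure or read error. */
char *read_all (FILE *fp, size_t *out_len)
{
    size_t cap = 4096, len = 0, got;    /* 4096 is an arbitrary start */
    char *buf = malloc (cap);

    if (!buf)
        return NULL;

    /* always keep one byte in reserve for the terminating '\0' */
    while ((got = fread (buf + len, 1, cap - len - 1, fp)) > 0) {
        len += got;
        if (len + 1 == cap) {           /* buffer full, double it */
            char *tmp = realloc (buf, cap * 2);
            if (!tmp) {
                free (buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror (fp)) {                  /* distinguish error from EOF */
        free (buf);
        return NULL;
    }
    buf[len] = '\0';
    if (out_len)
        *out_len = len;
    return buf;
}
```

With the pipe from the example above, this would be called as char *out = read_all (fp, NULL); before pclose (fp), and out would be freed once it is no longer needed.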

I assume that you want to save the directory entries into an array of strings. In that case the array s[] holds a pointer to each entry, and memory is allocated for every entry read, including room for the string terminator \0.
The starting program may look like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NR_OF_ENTRIES 10000
#define PATH_LEN 256
int main (){
FILE *fp;
int status;
char path[PATH_LEN];
char *s[MAX_NR_OF_ENTRIES];
int i = 0, j = 0; /* entry count and loop index, both start at 0 */
fp = popen("/bin/ls -l", "r");
if (fp == NULL){
printf("fp error\n");
return 1; /* nothing to read from */
}
while(fgets(path,PATH_LEN,fp) != NULL){
printf("%s\n", path );
s[i] = malloc(strlen(path)+1);
strcpy(s[i],path);
i++;
if(i>=MAX_NR_OF_ENTRIES)
{
printf("MAX_NR_OF_ENTRIES reached!\n");
break;
}
}
status = pclose(fp);
if (status==-1){
printf("pclose error");
}else{
printf("pclose was fine!\n");
}
// Print and free the allocated strings
for(j=0; j< i; j++){
printf("%s\n", s[j] );
free (s[j]);
}
return 0;
}
That works, but it puts 10,000 pointers on the stack. As suggested by David C. Rankin, we could allocate the array of pointers s dynamically as well. A working starting point:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NR_OF_ENTRIES 10
#define PATH_LEN 256
int main (){
FILE *fp;
int status;
char path[PATH_LEN];
int i = 0, j = 0; /* entry count and loop index, both start at 0 */
size_t nr_of_elements = MAX_NR_OF_ENTRIES;
char **old_arr;
char **new_arr;
char **s = calloc(nr_of_elements, sizeof(char*));
if (!s){
perror("calloc failed");
return 1;
}
fp = popen("/bin/ls -l", "r");
if (fp == NULL){
printf("fp error\n");
free(s);
return 1; /* nothing to read from */
}
while(fgets(path,PATH_LEN,fp) != NULL){
printf("%s\n", path );
char *str = malloc(strlen(path)+1);
strcpy(str, path);
s[i] = str;
i++;
if(i>=nr_of_elements )
{
printf("resizing\n");
size_t old_size = nr_of_elements;
nr_of_elements = 4*nr_of_elements; // increase size for `s` 4 times (you can invent something else here)
old_arr = s;
// allocating a bigger copy of the array s, copy it inside, and redefine the pointer:
new_arr = calloc(nr_of_elements, sizeof(char*));
if (!new_arr)
{
perror("new calloc failed");
exit(EXIT_FAILURE);
}
memcpy (new_arr, old_arr, sizeof(char*)*old_size); // notice the use of `old_size`
free (old_arr);
s = new_arr;
}
}
status = pclose(fp);
if (status==-1){
printf("pclose error");
}else{
printf("pclose was fine!\n");
}
// Print and free the allocated strings
for(j=0; j< i; j++){
printf("%s\n", s[j] );
free (s[j]);
}
free(s);
return 0;
}
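The calloc, memcpy and free sequence used for resizing above could also be written with realloc, which copies the old contents itself. The following is only a sketch under that assumption; the helper name grow_strings is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: grow an array of 'cap' string pointers to
 * 'new_cap' entries (new_cap > cap) and set the new slots to NULL.
 * Returns the new array, or NULL if realloc fails, in which case
 * the old array is still valid and still owned by the caller. */
char **grow_strings (char **arr, size_t cap, size_t new_cap)
{
    char **tmp = realloc (arr, new_cap * sizeof *tmp);

    if (!tmp)
        return NULL;

    for (size_t i = cap; i < new_cap; i++)  /* clear only the new part */
        tmp[i] = NULL;

    return tmp;
}
```

Assigning the result to a temporary first, and only then to s, keeps the original pointer available for cleanup if the reallocation fails.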

Related

Read print lines from any data function

This is the second part of my assignment. I am new to C and have fallen behind, and this is due today. I think main is supposed to use fscanf to read the file and then output its lines (working with files inside a function is really hard for me to understand). The test file I am using is included below. I know 5 points may seem like a lot of fuss, but I am new to C and have never written the function part of a C program. I want to print each line with a counter i in front of it. I also included the assignment comments in case they help. I don't know how to write the code for fopen and fgets, since this is my first time using them.
For example:
Example input/output:
./a.out testfile
1: Bob
2: Tiffany
3: Steve
4: Jim
5: Lucy
...
/* 5 points */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAXLEN 1000
#define MAX_LINE_LEN 4096
/**
* Complete the function below that takes an array of strings and a integer
* number representing the number of strings in the array and prints them out to
* the screen with a preceding line number (starting with line 1.)
*/
void
printlines (char *a[], int n)
{
}
/**
* Create a main function that opens a file composed of words, one per line
* and saves them to an array of MAXLEN strings that is then printed using
* the above function.
*
* Hints:
* - Use fopen(3) to open a file for reading, then use the returned file handle
* with the fscanf() function to read the words out of it.
* - You can read a word from a file into a temporary word buffer using the
* fscanf(3) function.
* - You can assume that a word will not be longer than MAXLEN characters.
* - Use the strdup(3) function to make a permanent copy of a string that has
* been read into a buffer.
*
* Usage: "Usage: p7 <file>\n"
*
* Example input/output:
* ./p7 testfile
* 1: Bob
* 2: Tiffany
* 3: Steve
* 4: Jim
* 5: Lucy
* ...
*/
int main (int argv, char *argc[])
{ if (argc < 2)
{
printf ("Usage: p7 <file>\n");
}
char buffer[MAX_LINE_LEN];
/* opening file for reading */
FILE *fp = fopen (argv[1], "r");
if (!fp)
{
perror ("fopen failed"); exit (EXIT_FAILURE);
}
int i = 0;
while ((i <= MAX_LINES) && (fgets (buffer, sizeof (buffer), fp)))
{
printf ("%2i: %s", i, buffer); //i is to count from each lines
i++;
}
fclose (fp);
return (0);
}
View for testfile:
Bob
Tiffany
Steve
Jim
Lucy
Fred
George
Jill
Max
Butters
Randy
Dave
Bubbles
You can do it like this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAXLEN 1000
#define MAX_LINE_LEN 4096
void printlines (char *a[], int n)
{
for (int i = 0; i != n; ++i)
printf("%2d: %s\n", i + 1, a[i]); /* line numbers start at 1 */
}
int main (int argc, char *argv[])
{
if (argc < 2)
{
printf ("Usage: p7 <file>\n");
return -1;
}
FILE *fp = fopen (argv[1], "r");
if (!fp)
{
perror ("fopen failed");
exit (EXIT_FAILURE);
}
char* a[MAXLEN];
char buffer[MAX_LINE_LEN];
int i = 0;
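while (i < MAXLEN && fscanf(fp, "%4095s", buffer) == 1) /* width = MAX_LINE_LEN - 1 */
a[i++] = strdup(buffer);
printlines(a, i);
for (int j = 0; j < i; ++j) /* release the strdup copies */
free(a[j]);
fclose (fp);
return (0);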
}

Open multiple files and store them in array of structures

I need some help with my task. I have to open an unknown number of files and store them in a data structure at the start of the program. The first file contains the names of two other files, and so on (this is explained in more detail below the example of the first file).
Each file has the same structure:
[Title of file]
[name of file X]
[name of file Y]
[Text]
For example, the first file looks like this:
File 1
file_8.txt
file_25.txt
Text: "this is some example text, lenght is unknown
so i will have to use malloc and realloc to
dynamicaly store it."
The name of the first file is given by the user on the command line when starting the program
(example: ./task1 page_1.txt)
The first line stores the title of the file. The second and third lines each contain the file name of a next file that I have to read and store. If there are no further file names on the 2nd and 3rd lines, both of those lines will contain "-\n". The text starts at the fourth line (and can span multiple lines, as in the example above).
My struct for now:
#include <stdio.h>
#include <stdlib.h>
typedef struct
{
char title[1000]; // should use malloc and realloc and not this way
char file_x[1000]; // dynamically
char file_y[1000]; // dynamically
char text[10000];
} Story;
My main looks like this:
int main (int argc,char *argv[])
{
char c[100];
char buffer[100];
FILE *input = fopen(argv[1], "r");
Story *temp = (Story*) malloc(sizeof(Story) * 8);
if(input)
{
int flag = 0;
while (fgets(c, sizeof(buffer),input) != NULL)
{
if(flag == 0)
{
sscanf(c, "%s", temp->title);
}
else if(flag == 1)
{
sscanf(c, "%s", temp->file_x);
}
else if(flag == 2)
{
sscanf(c, "%s", temp->file_y);
}
else
{
while(!feof(input))
{
fread(temp->text, sizeof(Story),1,input);
}
}
flag++;
}
printf("%s\n%s\n%s\n", temp->title,
temp->file_x, temp->file_y);
}
else if (input == NULL)
{
printf("ERROR MESSAGE HERE \n");
return 1;
}
free(temp);
fclose(input);
return 0;
}
So far I have managed to open the first file and store it in the structure. I need an idea of how to open and store all the other files, and I also have to implement it using dynamic memory allocation.
Any advice is greatly appreciated.
I suspect your lesson is covering recursion as with each element in your array of story you will need to branch an unknown number of times to read file_x and file_y (each of which can contain additional file_x and file_y within). Your procedural option is to follow down the trail of all file_x and then return to each file_y repeating the process until you reach the final files in each chain where file_x and file_y are empty.
Before determining which approach you will take, you simply need a way to read one file, extract the title, file_x, file_y and allocate and store text. This is a fairly straight-forward process where your primary task is to validate each step so you have confidence that you are processing actual data and are not invoking Undefined Behavior by reading from a file that isn't actually open or attempting to write (or read) beyond the bounds of your storage.
Here is a short example that takes a pointer to the story to fill and the filename to read from. You will note a repetitive process involved: read the string with fgets, get the length, validate that the last char read was '\n' (indicating you read the whole line), and finally trim the '\n', either by overwriting it with a nul-terminating character so you don't have newlines dangling off the end of your stored string, or by overwriting it with a ' ' (space) in the case of text, where you are concatenating lines together.
Note: below, realloc is never called directly on the pointer to text. Instead a tmp pointer is used with realloc to validate that realloc succeeds before assigning the new block to text. (Otherwise you would lose your pointer to text if realloc fails, because it returns NULL.)
/* read values into struct story 's' from 'filename' */
int read_file (story *s, char *filename)
{
size_t len = 0, /* var for strlen */
text_size = 0, /* total text_size */
nul_char = 0; /* flag for +1 on first allocation */
char buf[TITLE_MAX] = ""; /* read buffer for 'text' */
FILE *fp = fopen (filename, "r"); /* file pointer */
if (!fp) /* validate file open for reading */
return 0; /* or return silently indicating no file_x or file_y */
if (fgets (s->title, TITLE_MAX, fp) == 0) { /* read title */
fprintf (stderr, "error: failed to read title from '%s'.\n",
filename);
fclose(fp);
return 0;
}
len = strlen (s->title); /* get title length */
if (len && s->title[len - 1] == '\n') /* check last char is '\n' */
s->title[--len] = 0; /* overwrite with nul-character */
else { /* handle error if line too long */
fprintf (stderr, "error: title too long, filename '%s'.\n",
filename);
fclose(fp);
return 0;
}
if (fgets (s->file_x, PATH_MAX, fp) == 0) { /* same for file_x */
fprintf (stderr, "error: failed to read file_x from '%s'.\n",
filename);
fclose(fp);
return 0;
}
len = strlen (s->file_x);
if (len && s->file_x[len - 1] == '\n')
s->file_x[--len] = 0;
else {
fprintf (stderr, "error: file_x too long, filename '%s'.\n",
filename);
fclose(fp);
return 0;
}
if (fgets (s->file_y, PATH_MAX, fp) == 0) { /* same for file_y */
fprintf (stderr, "error: failed to read file_y from '%s'.\n",
filename);
fclose(fp);
return 0;
}
len = strlen (s->file_y);
if (len && s->file_y[len - 1] == '\n')
s->file_y[--len] = 0;
else {
fprintf (stderr, "error: file_y too long, filename '%s'.\n",
filename);
fclose(fp);
return 0; /* report failure like the other error paths */
}
while (fgets (buf, TITLE_MAX, fp)) { /* read text in TITLE_MAX chunks */
len = strlen (buf);
if (len && buf[len - 1] == '\n') /* check for '\n' */
buf[len - 1] = ' '; /* overwrite with ' ' for concat */
if (text_size == 0)
nul_char = 1; /* account for space for '\0' when empty, and */
else /* use a flag to set new block to empty-string */
nul_char = 0;
void *tmp = realloc (s->text, text_size + len + nul_char); /* allocate */
if (!tmp) { /* validate realloc succeeded */
fprintf (stderr, "error: realloc failed, filename '%s'.\n",
filename);
break;
}
s->text = tmp; /* assign new block to s->text */
if (nul_char) /* if first concatenation */
*(s)->text = 0; /* initialize s->text to empty-string */
strcat (s->text, buf); /* concatenate buf with s->text */
text_size += (len + 1); /* update text_size total */
}
fclose (fp); /* close file */
return 1;
}
With this, you will need to design a way to work through all file_x and file_y filenames. As mentioned above, this likely lends itself to a recursive function, or you can work your way down the file_x tree and circle back and pick up all the file_y additions. Note, you need to account for the new addition of story each time either file_x or file_y are followed.
Below is a short example that follows through all file_x additions and then comes back and follows through only the 1st file_y branch. It is intended to show you how to handle the calling and filling from both a file_x and file_y rather than to write the final code for you. If you add the following above the read_file function, you will have a working example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h> /* for PATH_MAX */
enum { STORY_MAX = 12, TITLE_MAX = 1024 };
typedef struct
{
char title[TITLE_MAX],
file_x[PATH_MAX],
file_y[PATH_MAX],
*text;
} story;
int read_file (story *s, char *filename);
int main (int argc, char **argv) {
int n = 0, storycnt = 0;
story stories[STORY_MAX] = {{ .title = "" }};
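if (argc < 2) { /* validate a filename was given */
fprintf (stderr, "usage: %s <filename>\n", argv[0]);
return 1;
}
char *filename = argv[1];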
/* read all file_x filenames */
while (n < STORY_MAX && read_file (&stories[n], filename)) {
filename = stories[n++].file_x;
}
storycnt = n; /* current story count of all file_x */
for (int i = 0; i < storycnt; i++) /* find all file_y files */
while (n < STORY_MAX && read_file (&stories[n], stories[i].file_y)) {
filename = stories[i++].file_y;
n++;
}
for (int i = 0; i < n; i++) { /* output stories content */
printf ("\ntitle : %s\nfile_x: %s\nfile_y: %s\ntext : %s\n",
stories[i].title, stories[i].file_x,
stories[i].file_y, stories[i].text);
free (stories[i].text); /* don't forget to free memory */
}
return 0;
}
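The recursive option mentioned earlier could look roughly like the sketch below. To keep it self-contained, the files are replaced by a hypothetical in-memory table (pages, find_page) and the function name visit and the depth limit of 32 are assumptions; in a real program the lookup would be a call to read_file. The "-" marker for "no further file" is taken from the question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the story files, so the
 * traversal order can be shown without touching the disk. */
typedef struct {
    const char *name, *file_x, *file_y;
} page;

static const page pages[] = {
    { "file_1.txt",  "file_8.txt", "file_25.txt" },
    { "file_8.txt",  "-",          "-"           },
    { "file_25.txt", "-",          "-"           },
};

/* look up a page by name; NULL if there is no such "file" */
static const page *find_page (const char *name)
{
    for (size_t i = 0; i < sizeof pages / sizeof *pages; i++)
        if (strcmp (pages[i].name, name) == 0)
            return &pages[i];
    return NULL;
}

/* Visit 'name', then its file_x subtree, then its file_y subtree.
 * Returns the number of pages visited. 'depth' guards against
 * runaway recursion if the files ever reference each other. */
int visit (const char *name, int depth)
{
    const page *p;

    if (depth > 32 || strcmp (name, "-") == 0 || !(p = find_page (name)))
        return 0;

    printf ("%*s%s\n", depth * 2, "", p->name);  /* indent by depth */

    return 1 + visit (p->file_x, depth + 1)
             + visit (p->file_y, depth + 1);
}
```

Calling visit ("file_1.txt", 0) walks the first file and both of its branches; each level of recursion corresponds to one story that would be appended to the array.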
Example Input Files
$ cat file_1.txt
File 1
file_8.txt
file_25.txt
Text: "this is some example text, lenght is unknown
so i will have to use malloc and realloc to
dynamicaly store it."
$ cat file_8.txt
file_8
This is the text from file 8. Not much,
just some text.
$ cat file_25.txt
file_25
This is the text from file 25. Not much,
just some text.
Example Use/Output
$ ./bin/rdstories file_1.txt
title : File 1
file_x: file_8.txt
file_y: file_25.txt
text : Text: "this is some example text, lenght is unknown so i will
have to use malloc and realloc to dynamicaly store it."
title : file_8
file_x:
file_y:
text : This is the text from file 8. Not much, just some text.
title : file_25
file_x:
file_y:
text : This is the text from file 25. Not much, just some text.
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/rdstories file_1.txt
==9488== Memcheck, a memory error detector
==9488== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9488== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9488== Command: ./bin/rdstories file_1.txt
==9488==
title : File 1
file_x: file_8.txt
file_y: file_25.txt
text : Text: "this is some example text, lenght is unknown so i will
have to use malloc and realloc to dynamicaly store it."
title : file_8
file_x:
file_y:
text : This is the text from file 8. Not much, just some text.
title : file_25
file_x:
file_y:
text : This is the text from file 25. Not much, just some text.
==9488==
==9488== HEAP SUMMARY:
==9488== in use at exit: 0 bytes in 0 blocks
==9488== total heap usage: 13 allocs, 13 frees, 3,353 bytes allocated
==9488==
==9488== All heap blocks were freed -- no leaks are possible
==9488==
==9488== For counts of detected and suppressed errors, rerun with: -v
==9488== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have questions.

How to read in two text files and count the amount of keywords?

I have tried looking around, but files are the hardest thing for me to understand so far while learning C, especially text files (binary files were a bit easier). Basically I have to read in two text files that both contain words formatted like this: "hard, working,smart, works well, etc.." I am supposed to compare the text files and count the keywords. I would show more code, but honestly I am lost, and everything I have besides this is nonsense.
#include <time.h>
#include <stdlib.h>
#include <stdio.h>
#define SIZE 1000
void resumeRater();
int main()
{
int i;
int counter = 0;
char array[SIZE];
char keyword[SIZE];
FILE *fp1, *fp2;
int ch1, ch2;
errno_t result1 = fopen_s(&fp1, "c:\\myFiles\\resume.txt", "r");
errno_t result2 = fopen_s(&fp2, "c:\\myFiles\\ideal.txt", "r");
if (fp1 == NULL) {
printf("Failed to open");
}
else if (fp2 == NULL) {
printf("Failed to open");
}
else {
result1 = fread(array, sizeof(char), 1, fp1);
result2 = fread(keyword, sizeof(char), 1, fp2);
for (i = 0; i < SIZE; i++)
{
if (array[i] == keyword[i])
{
counter++;
}
}
fclose(fp1);
fclose(fp2);
printf("Character match: %d", counter);
}
system("pause");
}
When you have a situation where you are doing a multiple of something (like reading 2 files), it makes a lot of sense to plan ahead. Rather than muddying the body of main with all the code necessary to read 2 text files, create a function that reads a text file for you and returns an array containing the lines of the file. This really helps you concentrate on the logic of what your code needs to do with the lines, rather than filling space with getting the lines in the first place. There is nothing wrong with cramming it all into one long main, but from a readability, maintenance, and program structure standpoint, it makes everything more difficult.
If you structure the read function well, you can reduce your main to the following. This reads both text files into arrays of strings and provides the number of lines read, in a total of 4 lines (plus the check to make sure you provided two filenames to read):
int main (int argc, char **argv) {
if (argc < 3 ) {
fprintf (stderr, "error: insufficient input, usage: %s <filename1> <filename2>\n", argv[0]);
return 1;
}
size_t file1_size = 0; /* placeholders to be filled by readtxtfile */
size_t file2_size = 0; /* for general use, not needed to iterate */
/* read each file into an array of strings,
number of lines read, returned in file_size */
char **file1 = readtxtfile (argv[1], &file1_size);
char **file2 = readtxtfile (argv[2], &file2_size);
return 0;
}
At that point you have all your data and you can work on your keyword code. Reading from text files is a very simple matter; you just have to get comfortable with the tools available. When reading lines of text, the preferred approach is to use line-oriented input to read an entire line at a time into a buffer. You then parse the buffer to get what you need. The line-input tools are fgets and getline. Once you have read the line, you have tools like strtok, strsep or sscanf to separate what you want from the line. Both fgets and getline read the newline at the end of each line as part of their input, so you may need to remove the newline to meet your needs.
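For the comma-separated format shown in the question ("hard, working,smart, works well"), the parsing step might be sketched with strtok as follows. The helper name split_keywords is hypothetical, and the sketch assumes keywords are separated by commas and may be preceded by blanks.

```c
#include <string.h>

/* Hypothetical helper: split 'line' in place on commas (and the
 * trailing newline), skipping leading blanks in each piece.
 * Stores up to 'max' pointers into 'tokens' and returns the count.
 * The pointers refer to memory inside 'line', so 'line' must
 * outlive them. */
size_t split_keywords (char *line, char *tokens[], size_t max)
{
    size_t n = 0;

    for (char *tok = strtok (line, ",\n"); tok && n < max;
         tok = strtok (NULL, ",\n")) {
        while (*tok == ' ' || *tok == '\t')   /* skip leading blanks */
            tok++;
        if (*tok)                             /* ignore empty pieces */
            tokens[n++] = tok;
    }
    return n;
}
```

Note that strtok modifies the buffer it is given, so this has to run on a writable copy of the line, such as the buffer filled by fgets or getline.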
Storing each line read is generally done by declaring a pointer to pointer to char (e.g. char **file1;). You then allocate memory for some initial number of pointers (NMAX in the example below). You then access the individual lines in the file as file1[n], where n is the line index from 0 to the last line of the file. If you have a large file and exceed the number of pointers you originally allocated, you simply reallocate additional pointers for your array with realloc. (You can set NMAX to 1 to make this happen for every line.)
What you use to allocate memory and how you reallocate can influence how you make use of the arrays in your program. Careful choices of calloc to initially allocate your arrays, and then using memset when you reallocate to set all unused pointers to 0 (null), can really save you time and headache. Why? Because, to iterate over your array, all you need to do is:
n = 0;
while (file1[n]) {
<do something with file1[n]>;
n++;
}
When you reach the first unused pointer (i.e. the first file1[n] that is 0), the loop stops.
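As a small illustration of that sentinel idea, the sketch below counts the lines in such an array without being told its size. The function name count_lines is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: count the strings in an array whose end is
 * marked by a NULL pointer. A NULL array counts as empty. */
size_t count_lines (char **lines)
{
    size_t n = 0;

    while (lines && lines[n])   /* stop at the first unused pointer */
        n++;

    return n;
}
```

This only works if at least one NULL pointer always follows the last used entry, which is why the array is zeroed on allocation and on every reallocation.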
Another very useful function when reading text files is strdup (char *line). strdup will automatically allocate space for line using malloc, copy line to the newly allocated memory, and return a pointer to the new block of memory. This means that all you need to do to allocate space for each pointer and copy the line read by getline to your array is:
file1[n] = strdup (line);
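To make the allocation visible, here is roughly what strdup does on your behalf, written as a hypothetical stand-in named dup_string. It is a sketch for illustration, not the library's actual implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in showing roughly what strdup does:
 * allocate strlen + 1 bytes and copy the string, nul included.
 * Returns NULL if the allocation fails; the caller frees the copy. */
char *dup_string (const char *s)
{
    size_t len = strlen (s) + 1;    /* +1 for the terminating '\0' */
    char *copy = malloc (len);

    if (copy)
        memcpy (copy, s, len);

    return copy;
}
```

Because the copy comes from malloc, every pointer stored this way has to be passed to free when the array is released.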
That's pretty much it. You have read your file, filled your array, and know how to iterate over each line in the array. What is left is cleaning up and freeing the memory allocated when you no longer need it. By making sure that your unused pointers are 0, this too is a snap. You simply iterate over your file1[n] pointers again, freeing them as you go, and then free (file1) at the end. You're done.
This is a lot to take in, and there are a few more things to it. On the initial read of the file, if you noticed, we also declare a file1_size = 0; variable, and pass its address to the read function:
char **file1 = readtxtfile (argv[1], &file1_size);
Within readtxtfile, the value at the address of file1_size is incremented by 1 each time a line is read. When readtxtfile returns, file1_size contains the number of lines read. As shown, this is not needed to iterate over the file1 array, but you often need to know how many lines you have read.
To put this all together, I created a short example of the functions to read two text files, print the lines in both and free the memory associated with the file arrays. This explanation ended up longer than I anticipated. So take time to understand how it works, and you will be a step closer to handling textfiles easily. The code below will take 2 filenames as arguments (e.g. ./progname file1 file2) Compile it with something similar to gcc -Wall -Wextra -o progname srcfilename.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NMAX 256
char **readtxtfile (char *fn, size_t *idx);
char **realloc_char (char **p, size_t *n);
void prn_chararray (char **ca);
void free_chararray (char **ca);
int main (int argc, char **argv) {
if (argc < 3 ) {
fprintf (stderr, "error: insufficient input, usage: %s <filename1> <filename2>\n", argv[0]);
return 1;
}
size_t file1_size = 0; /* placeholders to be filled by readtxtfile */
size_t file2_size = 0; /* for general use, not needed to iterate */
/* read each file into an array of strings,
number of lines read, returned in file_size */
char **file1 = readtxtfile (argv[1], &file1_size);
char **file2 = readtxtfile (argv[2], &file2_size);
/* simple print function */
if (file1) prn_chararray (file1);
if (file2) prn_chararray (file2);
/* simple free memory function */
if (file1) free_chararray (file1);
if (file2) free_chararray (file2);
return 0;
}
char** readtxtfile (char *fn, size_t *idx)
{
if (!fn) return NULL; /* validate filename provided */
char *ln = NULL; /* NULL forces getline to allocate */
size_t n = 0; /* buffer size, set and grown by getline */
ssize_t nchr = 0; /* number of chars actually read */
size_t nmax = NMAX; /* check for reallocation */
char **array = NULL; /* array to hold lines read */
FILE *fp = NULL; /* file pointer to open file fn */
/* open / validate file */
if (!(fp = fopen (fn, "r"))) {
fprintf (stderr, "%s() error: file open failed '%s'.\n", __func__, fn);
return NULL;
}
/* allocate NMAX pointers to char* */
if (!(array = calloc (NMAX, sizeof *array))) {
fprintf (stderr, "%s() error: memory allocation failed.\n", __func__);
fclose (fp); /* don't leak the open stream */
return NULL;
}
/* read each line from fp - dynamicallly allocated */
while ((nchr = getline (&ln, &n, fp)) != -1)
{
/* strip newline or carriage rtn */
while (nchr > 0 && (ln[nchr-1] == '\n' || ln[nchr-1] == '\r'))
ln[--nchr] = 0;
array[*idx] = strdup (ln); /* allocate/copy ln to array */
(*idx)++; /* increment value at index */
if (*idx == nmax) /* if lines exceed nmax, reallocate */
array = realloc_char (array, &nmax);
}
if (ln) free (ln); /* free memory allocated by getline */
if (fp) fclose (fp); /* close open file descriptor */
return array;
}
/* print an array of character pointers. */
void prn_chararray (char **ca)
{
register size_t n = 0;
while (ca[n])
{
printf (" arr[%3zu] %s\n", n, ca[n]);
n++;
}
}
/* free array of char* */
void free_chararray (char **ca)
{
if (!ca) return;
register size_t n = 0;
while (ca[n])
free (ca[n++]);
free (ca);
}
/* realloc an array of pointers to strings setting memory to 0.
* reallocate an array of character arrays setting
* newly allocated memory to 0 to allow iteration
*/
char **realloc_char (char **p, size_t *n)
{
char **tmp = realloc (p, 2 * *n * sizeof *p);
if (!tmp) {
fprintf (stderr, "%s() error: reallocation failure.\n", __func__);
// return NULL;
exit (EXIT_FAILURE);
}
p = tmp;
memset (p + *n, 0, *n * sizeof *p); /* memset new ptrs 0 */
*n *= 2;
return p;
}
valgrind - Don't Forget To Check For Leaks
Lastly, anytime you allocate memory in your code, make sure you use a memory checker such as valgrind to confirm you have no memory errors and to confirm you have no memory leaks (i.e. allocated blocks you have forgotten to free, or that have become unreachable). valgrind is simple to use, just valgrind ./progname [any arguments]. It can provide a wealth of information. For example, on this read example:
$ valgrind ./bin/getline_readfile_fn voidstruct.c wii-u.txt
==14690== Memcheck, a memory error detector
==14690== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==14690== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==14690== Command: ./bin/getline_readfile_fn voidstruct.c wii-u.txt
==14690==
<snip - program output>
==14690==
==14690== HEAP SUMMARY:
==14690== in use at exit: 0 bytes in 0 blocks
==14690== total heap usage: 61 allocs, 61 frees, 6,450 bytes allocated
==14690==
==14690== All heap blocks were freed -- no leaks are possible
==14690==
==14690== For counts of detected and suppressed errors, rerun with: -v
==14690== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
Pay particular attention to the lines:
==14690== All heap blocks were freed -- no leaks are possible
and
==14690== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
You can ignore the (suppressed: 2 from 2), which just indicates that I don't have the development files installed for libc.

Pointers, files and memory management in C

I am new to the world of C programming and at the moment I am exploring a combination of pointers, pointer arithmetic with file IO and memory management, all at once. Please find my code below and here is what I am trying to do.
My program is supposed to allocate 8 bytes of heap memory using malloc, then store the pointer from malloc to a char*, then open a file (text.txt), which contains the following lines of plain text (each 8 bytes long):
chartest
chtest2!
I am then trying to read 8 bytes at a time from text.txt using fread, until the end of file has been reached. The 8 bytes read by fread are stored in the chunk of memory allocated earlier with malloc. I am then using my char* to iterate over the 8 bytes and print each character to stdout using printf. After every 8 bytes (and until EOF) I reset my pointer to the 0th byte of my 8-byte memory chunk and repeat until EOF.
Here is the code:
int main(void)
{
char* array = malloc(8 * sizeof(char));
if (array == NULL)
return 1;
FILE* inptr = fopen("text.txt", "r");
if (inptr == NULL)
return 2;
while (!feof(inptr))
{
fread(array, 8 * sizeof(char), 1, inptr);
for (int i = 0; i < 8; i++)
{
printf("%c", *array);
array++;
}
array -= 8;
}
free(array);
fclose(inptr);
return 0;
}
Please bear in mind that the program has been run through valgrind, which reports no memory leaks. This is the output I get:
chartest
chtest2!
htest2
I don't get where the 3rd line comes from.
Moreover, I don't understand why when I reset my char pointer (array) using
array -= 7;
and run the program through valgrind, it reports:
LEAK SUMMARY:
==8420== definitely lost: 8 bytes in 1 blocks
Logically thinking of the 8 bytes of heap memory as an array of chars we would have to take the pointer back 7 places to reach spot 0, but this approach seems to leak memory (whereas array -= 8 is fine)!
I would be very grateful if someone could analyse this. Thanks!
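As pointed out in the comments, you are using feof incorrectly, which explains the extra line: feof only becomes true after a read has already hit the end of the file, so the loop body runs one more time and prints whatever is still in the buffer. As for subtracting 7 instead of 8: you add 1 to array 8 times, so why would you expect subtracting 7 to get you back to where you started?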
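The pointer arithmetic can be checked with a small sketch. The helper offset_after_walk is hypothetical and only illustrates the counting: it advances a cursor a given number of times, moves it back, and reports the distance from the start.

```c
#include <stddef.h>

/* Hypothetical helper: advance a cursor `steps` times from base, move it
 * back by `back`, and return how far it ends up from base. */
ptrdiff_t offset_after_walk (const char *base, size_t steps, size_t back)
{
    const char *p = base;
    for (size_t i = 0; i < steps; i++)
        p++;                /* same as array++ in the question's loop */
    p -= back;              /* same as array -= back */
    return p - base;        /* 0 only if the cursor is back at the start */
}
```

With 8 steps, going back 8 gives offset 0, while going back 7 leaves the pointer at base + 1. Passing that address to free is undefined behaviour because malloc never returned it, and the original block is never released, which is consistent with the "definitely lost: 8 bytes" report.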
I made some changes to your code and everything is working fine. Here it is:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void)
{
char* array = malloc(9 * sizeof(char)); //changed
if (array == NULL)
return 1;
FILE* inptr = fopen("file", "r");
if (inptr == NULL)
return 2;
while (!feof(inptr))
{
fread(array, 9 * sizeof(char), 1, inptr); //changed
int i=0;
for (i = 0; i < 8 ; i++)
{
if(feof(inptr)) //added
goto next; //added
printf("%c", *array);
array++;
}
printf("\n"); //added
next:array =array - i; //changed
}
free(array);
fclose(inptr);
return 0;
}
You need to take care of the space allocated, the end-of-file condition, and the end-of-line character \n; that is why your program did not work as you were expecting.
Your file is
c h a r t e s t \n c h t e s t 2 ! \n
The first loop reads 8 characters and prints chartest
The second loop reads 8 characters and prints \nchtest2
The third loop reads the last 2 characters and prints !\nhtest2
This is because htest2 was left in the buffer from the previous read when the end of the file was reached.
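Checking the return value from fread() may be helpful,
e.g. make these changes:
size_t n = fread(array, sizeof(char), 8, inptr); /* n = characters actually read */
for (size_t i = 0; i < n; i++)
    printf("%c", *array++); /* print only the characters just read */
array -= n; /* step back by exactly as far as the pointer advanced */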
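Put together, a loop driven by fread's return value instead of feof might look like the sketch below. The helper name copy_chunks, the CHUNK constant, and the use of a stack buffer with fwrite (rather than the question's malloc'd block and printf) are assumptions made to keep the example short.

```c
#include <stdio.h>

#define CHUNK 8   /* chunk size used in the question */

/* Hypothetical helper: copy `in` to `out` CHUNK bytes at a time and
 * return the total number of bytes copied. */
size_t copy_chunks (FILE *in, FILE *out)
{
    char buf[CHUNK];
    size_t n, total = 0;

    /* fread returns the number of elements actually read; 0 ends the loop,
     * so no separate feof test is needed to terminate it */
    while ((n = fread (buf, sizeof buf[0], CHUNK, in)) > 0) {
        fwrite (buf, sizeof buf[0], n, out);   /* only the n fresh bytes */
        total += n;
    }
    return total;
}
```

Because only n bytes are written after each read, the short final read of 2 characters no longer drags the stale htest2 along with it. The buffer is indexed through buf itself, so there is no pointer to rewind.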

C Memory leaks and Valgrind output

I am doing some learning with C, and am having trouble identifying a memory leak situation.
First, some code:
My main function:
#define FILE_NAME "../data/input.txt"
char * testGetLine( FILE * );
int testGetCount(void);
int main(void)
{
int count = 0;
FILE * fptr;
if ((fptr = fopen(FILE_NAME, "r")) != NULL) {
char * line;
while ((line = testGetLine(fptr)) != NULL) {
printf("%s", line);
free(line); count++;
}
free(line); count++;
} else {
printf("%s\n", "Could not read file...");
}
// testing statements
printf("testGetLine was called %d times\n", testGetCount());
printf("free(line) was called %d times\n", count);
fclose(fptr);
return 0;
}
and my getline function:
#define LINE_BUFFER 500
int count = 0;
char * testGetLine(FILE * fptr)
{
extern int count;
char * line;
line = malloc(sizeof(char) * LINE_BUFFER);
count++;
return fgets(line, LINE_BUFFER, fptr);
}
int testGetCount(void) {
extern int count;
return count;
}
My understanding is that I need to call free every time I have called my testGetLine function, which I do. By my count, on a simple text file with four lines I need to call free 5 times. I verify that with my testing statements in the following output:
This is in line 01
Now I am in line 02
line 03 here
and we finish with line 04
testGetLine was called 5 times
free(line) was called 5 times
What I am having trouble with is, valgrind says that I alloc 6 times, and am only calling free 5 times. Here is truncated output from valgrind:
HEAP SUMMARY:
in use at exit: 500 bytes in 1 blocks
total heap usage: 6 allocs, 5 frees, 3,068 bytes allocated
500 bytes in 1 blocks are definitely lost in loss record 1 of 1
at 0x4C2B3F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x4007A5: testGetLine (testGetLine.c:13)
by 0x400728: main (tester.c:16)
LEAK SUMMARY:
definitely lost: 500 bytes in 1 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 0 bytes in 0 blocks
suppressed: 0 bytes in 0 blocks
I feel I am missing something with the memory management. Where is the 6th memory allocation that valgrind says I am using? and how should I free it?
Followup to implement Adrian's answer
testGetLine adjustment:
char * testGetLine(FILE * fptr)
{
extern int count;
char * line;
line = malloc(sizeof(char) * LINE_BUFFER);
count++;
if (fgets(line, LINE_BUFFER, fptr) == NULL) {
line[0] = '\0';
}
return line;
}
main while loop adjustment:
while ((line = testGetLine(fptr))[0] != '\0') {
printf("%s", line);
free(line); count++;
}
free(line); count++;
fgets return description:
On success, the function returns str. If the end-of-file is
encountered while attempting to read a character, the eof indicator is
set (feof). If this happens before any characters could be read, the
pointer returned is a null pointer (and the contents of str remain
unchanged). If a read error occurs, the error indicator (ferror) is
set and a null pointer is also returned (but the contents pointed by
str may have changed).
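When fgets doesn't read anything, it returns NULL rather than the char * that you allocated with malloc.
Therefore, the block allocated in your last call is never freed: the free(line) after your while loop receives NULL, which does nothing. (The sixth allocation that valgrind counts most likely comes from fopen itself and is released by fclose.)
Solution: return line instead of the result of fgets, and mark the end of input in the buffer, since the caller can no longer test for NULL:
char * testGetLine(FILE * fptr)
{
extern int count;
char * line;
line = malloc(sizeof(char) * LINE_BUFFER);
count++;
if (fgets(line, LINE_BUFFER, fptr) == NULL)
line[0] = '\0'; /* empty string tells the caller there is no more input */
return line;
}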
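A different way to close the leak, sketched here as an alternative rather than as the answer's own approach, is to free the buffer inside the function whenever fgets fails and keep returning NULL. The name get_line_or_null is hypothetical, and the counting code from the question is left out.

```c
#include <stdio.h>
#include <stdlib.h>

#define LINE_BUFFER 500   /* same buffer size as in the question */

/* Hypothetical variant of testGetLine: returns a malloc'd line that the
 * caller must free, or NULL (with nothing left to free) at end of file,
 * on a read error, or if malloc fails. */
char *get_line_or_null (FILE *fptr)
{
    char *line = malloc (LINE_BUFFER);
    if (!line)
        return NULL;
    if (!fgets (line, LINE_BUFFER, fptr)) {
        free (line);    /* nothing was read, so release the buffer here */
        return NULL;
    }
    return line;
}
```

With this version the caller's original while ((line = ...) != NULL) loop works unchanged, every successful call is matched by exactly one free inside the loop, and the extra free(line) after the loop is no longer needed.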

Resources